Posts

Showing posts from March, 2018

Painting with … Deep Learning

The catch in the title is that this is about painting (brush strokes, pressure, and whatnot), and not image creation (generating pixels on the screen), i.e., it is more about the painting process that resulted in the image. To put this differently, how do we teach robots to draw? I mean, yeah, we could have them become a high-speed Seurat, but that’s faking it, at least in my view.

The folks at DeepMind have worked on exactly this, and have a pretty good start. They used Reinforcement Learning to train an agent to paint using MyPaint. It starts off just slapping stuff on, with the reinforcement part being used to tell it whether what it’s painting is “good” or “bad”. The results below really aren’t all that bad! Technically speaking, this doesn’t really work like a Generative Adversarial Network (GAN), since the agent isn’t being trained on the pixels, but on how it’s painting, e.g., if the brush stroke is too hard, the stroke is too wide, etc. It’s a

Migrating to a New Database

Come migration time, if you’re (very!) lucky, you can go offline, and Just Take Care Of It. Odds are, however, that it isn’t going to be that simple: systems that can’t go offline, customers that scream really loud, money, whatever. In that case, the general strategy tends to be some variation of the following.

1. Migrate Creates / Updates / Deletes to the New
Every time you insert/update/delete data in your old database, you do the same on the new database. You’re still Reading from the old database however, so in case things go south, you find a bug, etc., you can just totally nuke the new database and start again from scratch. And yeah, there’s a whole bunch of detail here around preloading, metrics, alerts, and so forth that I’m eliding.

2. Serve (some!) Reads from the New
In -1-, all your Reads came from the old environment. Now, you can, and should, start serving Reads from the new environment. Mind you, if at all possible, do this gradually (canary depl
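
The two migration steps in this excerpt (dual writes, then gradually shifting reads) can be sketched roughly as below. This is a minimal illustration, not anything from the original post: `DictStore`, `MigratingStore`, `new_read_percent`, and `reset_new` are all hypothetical names, and an in-memory dict stands in for a real database.

```python
import hashlib


class DictStore:
    """Hypothetical stand-in for a database: an in-memory key/value store."""

    def __init__(self):
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value

    def delete(self, key):
        self.rows.pop(key, None)

    def read(self, key):
        return self.rows.get(key)


class MigratingStore:
    """Writes go to both stores; a configurable share of reads goes to the new one."""

    def __init__(self, old, new, new_read_percent=0):
        self.old = old
        self.new = new
        # 0 means step 1 (all reads from old); raise it gradually for step 2.
        self.new_read_percent = new_read_percent

    def write(self, key, value):
        self.old.write(key, value)  # the old store is still the source of truth
        self.new.write(key, value)

    def delete(self, key):
        self.old.delete(key)
        self.new.delete(key)

    def _bucket(self, key):
        # Stable bucket in 0..99, so a given key is always routed the same way.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % 100

    def read(self, key):
        if self._bucket(key) < self.new_read_percent:
            return self.new.read(key)
        return self.old.read(key)

    def reset_new(self, fresh_store):
        # "Nuke the new database and start again": safe while old is authoritative.
        self.new = fresh_store
        self.new_read_percent = 0
```

Bucketing by key hash (rather than picking randomly per request) is one possible design choice: it keeps routing deterministic, which makes discrepancies between the two stores easier to reproduce. A real setup would also need the preloading, metrics, and alerting that the post mentions and this sketch leaves out.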

Banana Or Toaster? — Deep Learning Edition

TL;DR: This one simple sticker will turn your banana into a toaster (to a #DeepLearning system). (OK, that’s a bit simplistic, but it’s not that far off.) Thing is, we’ve known for a while now that image classification can be hacked (cue scare stories about Stop Signs and School buses, etc.). In fact, there is some evidence that this “hackability” is intrinsic to the process, a kind of uncertainty principle if you will. Most of the work, thus far, has focused on small perturbations, the kind of stuff that would be invisible to the human eye, but would fool image classifiers. Stuff like manipulating pixel shades, quasi-steganographic changes, etc. The large perturbations that people have looked at have also been (quasi)invisible-to-the-human-eye changes, things like glasses that break the classifier. There is new work out by Brown et al. that does things very differently: they create stickers that are pretty huge (10% the size of the actual object, o

Merkle Trees — Not Just For Blockchains

(Apologies, I couldn’t resist the image/pun). Regardless, you know that hash trees have been used to validate data (since, well, 1979 I guess) in all sorts of arenas, from databases like Cassandra, Dynamo, and Riak, to version control systems like git and subversion, to a host of file systems. They weren’t invented just for Bitcoin.

Anyhow, here’s another nifty use case: using merkle trees to do immutable deploys of websites at Netlify. Why immutable deploys? A bunch-a reasons, including
• atomic deploys and instant rollbacks,
• the ability to accurately preview deploys (by decoupling the name from the content, you just point the name at the new hash behind the scenes),
• split testing (transparently routing requests to different versions of the site),
and a whole bunch more. The way they do this is pretty nifty
• They identify and store all files based on the hashes of the content (not the names!)
• A deploy is based on a merkle-tree based on the
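
As a rough illustration of the idea in this excerpt (content-addressed files, with a deploy identified by a hash over its files), here is a minimal sketch. It is not Netlify’s implementation, which the excerpt does not spell out: `store_files`, `deploy_id`, the pairwise tree layout, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex SHA-256 of some bytes (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def store_files(files, blobs):
    """Store each file under the hash of its content; return a path -> hash manifest."""
    manifest = {}
    for path, data in files.items():
        digest = content_hash(data)
        blobs.setdefault(digest, data)  # identical content is stored only once
        manifest[path] = digest
    return manifest


def deploy_id(manifest):
    """Merkle root over the manifest: leaves are hashes of (path, content hash)."""
    level = [
        content_hash(f"{path}\0{manifest[path]}".encode("utf-8"))
        for path in sorted(manifest)  # sorted, so the id ignores insertion order
    ]
    if not level:
        return content_hash(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: pair the last node with itself
        level = [
            content_hash((level[i] + level[i + 1]).encode("ascii"))
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

With something like this, a name such as `live["site"]` can simply hold a deploy id. Publishing or rolling back is then a single pointer change to another id, and unchanged files are shared between deploys because they hash to the same blob.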

Reproducibility and Machine Learning

“I can say with my hand on my heart, that machine learning is by far the worst environment I’ve ever found for collaborating and keeping track of changes.” (Pete Warden)

I’d actually quite agree. Mind you, it’s not because of something fundamentally bad about the world of Deep Learning, it’s more about a collection of things that add up to a lot of pain. To summarize a much longer writeup about this:
1. Size: The source data is large. This is a problem in and of itself, since the state-of-the-art in managing large amounts of data is still, well, sucky. Think “where do you put it”, “how do you get at it”, “how do others get at it”, etc. And yeah, Dropbox and its ilk are where most of this stuff lives.
2. Canonical Data: The data used isn’t canonical (actually it’s worse, it is near-canonical). Your version of the data may be different by just a few records, or you tweaked just a few records. And this may be before you get your hands on the data. Thi

Code Provenance

Me: “Hmmm, that’s a weird bug”
(spend hours debugging, and narrowing it down to one section)
Me: “That’s weird, whereTF did he come up with this way of doing it?”
(Google, find Stack Overflow page with same code, including typo in comment)
(Gaze in awe at realization that the lifted code came from the question, not the answer)

Update 1: To be fair, the code did two things, one correctly (which was needed here originally), and one wrongly (which, because Hyrum’s law rules, means that eventually the wrong stuff ended up getting exercised…)
Update 2: Of course, Geek-and-Poke already said it.

How do *you* review code?

Does it go something like this?
a) Check out the PR
b) Scan through the code, looking for obvious s**t
c) Approve it
That’s about it, right? Mind you, if you’re feeling particularly testy, -b- above might turn into “Scan through the code, and find something to criticize”, but that’s really about it.
/via http://geek-and-poke.com
This isn’t really because you’re a bad person. Odds are, you’re in the middle of one of two things.

Code Reviews Take Time
It’s true. And there really isn’t much that you can do about it. That said, a different way of saying this is There is no short-cut to Quality. You want software that doesn’t suck? Well, that’s going to take time, no way around it. If your tech-lead/PM/ProdMgr doesn’t get it, either help them get it, or find a new job. At some point, s**t is going to break, and, well, you don’t want to be in the blast radius…

#CowboyDeveloper Hates Code Reviews
This is a wee bit trickier. If you’ve got one of those, they kn

Transferring Font “style” via Deep Learning

There’s been considerable work done in transferring the “style” of a picture to another image (like, “make cabins look like they’re in the depths of winter” as seen below). Doing this for pictures, it turns out, is a lot (lot!) simpler than doing it for fonts. Turns out that designing fonts is hard, and, frankly, a bit of a PITA. As a result, when you come up with a specific font for, say, a movie poster, or a logo, or whatever, you usually just design the glyphs for the specific use at hand. So, if your image has the text “PLUGH” in it, then you do the letters “P”, “L”, “U”, “G”, and “H”, put ’em together, tweak spacing, color, etc. and you’re done. You just don’t worry about the rest of the letters of the alphabet, casing, numbers, symbols, whatever. So great, somebody comes along and wants to create an image with the text “XYZZY” on it, in the same font. (Remember, this somebody could be you!) The old school way would be labor-intensive: you just s

Two Excellent ways to deal with Technical Debt

/via http://www.commitstrip.com/en/2018/03/21/it-flew-by/
So yeah, whenever we build anything, we incur tech-debt. That’s just the reality of the industry, and it’s not worth getting into. There have, literally, been volumes written about it. All that said, the two stellar strategies are Post-Its and Kick The Can Down The Road.

Post-Its: Every time you do something that you know you need to fix later, you write it up on a post-it, and put it on the wall next to you
• If your wall gets too crowded, you have to actually start fixing stuff
• If the Post-It falls, and you don’t notice it, well, consider that Breakage. If it’s really important, it’ll show up again. If not, well, it doesn’t matter

Kick The Can Down The Road: Pretty much what it sounds like. Do the bare minimum of whatever you need to do to make this a problem for the future.
• If your startup fails, it doesn’t matter
• If you quit, it’s not your problem
• If you moved to a different job, it’s really