Posts

Showing posts from July, 2018

The Knowns, The Unknowns, and Engineering

In an earlier post, I described a way of making decisions based on the type of problem that you are facing. I won’t repeat it (go read the original post), but the key point is that the way you go about making decisions really should depend on the type of situation you’re in.

/via https://www.thingsinsquares.com/comics/google-maps/

Think about directions to get somewhere. If you’ve never been there before, well, Google Maps to the rescue. But what if data service is sucky (it happens sometimes!)? Do you ask for directions? What if there is nobody to ask? D’you explore? Or go back and print out directions at home? So yeah, the specifics of the situation affect the way you go about resolving it. And to get slightly Rumsfeldian, most situations can be broken out into the following four categories:

1. Known Knowns: You’ve solved this specific problem before, and can totally do it again. Heck, you can automate the solution, or if it is a manual thing, probably write ou

Canary Release, and Experimentation

Canary Release (also called phased, or incremental, deployment) is when you deploy your new release to a subset of servers, check it out there against real live production traffic, and take it from there: rolling it out, or rolling it back, depending on what happened. You’re basically doing the Canary In A Coal Mine thing, using a few servers and live traffic to ensure that, at worst, you only affect a subset of your users.

It’s not a bad approach at all and, depending on how you do this, can be quite efficient resource-wise (you don’t need an entire second environment à la Blue-Green releases). Mind you, the flip side to this is that you need to be really careful about compatibility. You’ve got multiple releases running at the same time, so things like data versioning, persistence formats, process flows, transaction recovery, etc. need to either be forward/backward compatible, or very (!!!) carefully partitioned/rollback-able. The tricky part here though is when you d
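To make the "subset" part concrete, here is a minimal sketch of one way to send a fixed slice of traffic to the canary. It assumes you split by user rather than by server, and that a stable hash of the user id is good enough to pick the slice. The names `bucket`, `pick_release`, and `canary_percent` are made up for illustration and don't come from any particular deployment tool.

```python
import hashlib


def bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user id to a stable bucket number in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then fold into a bucket.
    return int.from_bytes(digest[:8], "big") % buckets


def pick_release(user_id: str, canary_percent: int) -> str:
    """Return 'canary' for roughly canary_percent% of users, else 'stable'."""
    return "canary" if bucket(user_id) < canary_percent else "stable"


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, pick_release(user, canary_percent=10))
```

Because the bucket depends only on the user id, a given user keeps seeing the same release for as long as the percentage stays put, and raising `canary_percent` only ever moves users from stable to canary, never back. Dropping it to 0 is the rollback.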

Not Just vi

/via http://www.commitstrip.com/en/2014/03/05/what-you-would-never-admin/

It is just a way of life with vi. Somebody, somewhere, is always figuring out a new and even more cryptic (yet easy to use!) way of simplifying my development life. Then, of course, there is emacs. Every time I pick it up, I spend the first week or so re-remembering everything I used to know… PyCharm (and the rest of the JetBrains ilk) is just about the same. Mind you, given the amount of time that I’ve spent getting the development environment working Just Right with it, I could probably have re-learnt emacs. Guess I should go pick up SublimeText.

Recursive Pooh

Mind you, if Pooh is tail-recursive, he’s basically getting to eat an ∞ amount of honey. Or, more likely, there is an OutOfHoney error…
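For the curious, a small sketch of the joke in Python. CPython does not eliminate tail calls, so a tail-recursive Pooh runs out of stack well before he runs out of honey. The function names here are invented for illustration, and "OutOfHoney" is just a stand-in for Python's `RecursionError`.

```python
import sys


def eat_honey(pots_eaten: int, pots_left: int) -> int:
    """Tail-recursive: the recursive call is the very last thing that happens."""
    if pots_left == 0:
        return pots_eaten
    return eat_honey(pots_eaten + 1, pots_left - 1)


def eat_honey_loop(pots_left: int) -> int:
    """Roughly what tail-call elimination would turn eat_honey into."""
    pots_eaten = 0
    while pots_left > 0:
        pots_eaten, pots_left = pots_eaten + 1, pots_left - 1
    return pots_eaten


def try_recursive(pots: int) -> str:
    """Run the recursive version and report whether the stack held up."""
    try:
        return f"ate {eat_honey(0, pots)} pots"
    except RecursionError:
        return "OutOfHoney"


if __name__ == "__main__":
    print(try_recursive(10))
    # Ask for more pots than the interpreter allows stack frames.
    print(try_recursive(sys.getrecursionlimit() + 100))
    print(eat_honey_loop(100_000))
```

In a language that does eliminate tail calls, the recursive version would behave like the loop, and Pooh would keep going for as long as the honey (or your patience) lasts.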

Facial Recognition, and Bias

As you’ve probably heard, IBM is about to release a pretty monster dataset (over a million images) along with tools, all with the aim of helping get rid of bias in facial analysis. The cool part for me is actually the announcement of a second dataset (around 36,000 images) that is “equally distributed across skin tones, genders, and ages”. So, why does this matter?

Before answering this, let’s first take a brief diversion. Let’s say you are doing something involving Machine Learning and facial recognition. You’d need a dataset to train your models against, so think about how you would select your dataset. You’d probably take into consideration the specifics of the task (“I need to know if the face is smiling or not”), the details of the algorithm that you’re working on (“Can I still tell it’s a smile if the background changes?”) and such-like. You’d then go to one of the handy-dandy collections of facial-recognition databases, and pick the most appropriate one. e.g.

Interactive Visualization — the Why

/via https://www.slideshare.net/tgwilson/waw-sep2010-datavisualizationwithnotes

Is your Visualization effective? Yes, yes, this is a totally loaded question. After all, WTF does “effective” even mean? Well, if you’re using it to understand data, did it work? OTOH, what if you’re trying to explain it to somebody else? You may have to make it easier to comprehend, or more accessible (you’re not color-blind, but what if they are?), etc. And how do you make sure that in doing so for Bob, you didn’t foreclose the ability to explain it to Carol (who, unlike Bob, really sweats the small stuff)?

The common point underlying the above is: “Why do I, the end-user, give any f**ks about this?” And that tends to be the issue. Static visualizations, by their very nature, represent a single viewport into the data, the one baked into its current representation. And that’s where Interaction comes in. An interactive visualization changes the lens with which the data is viewed, by makin

AWS, Data Transfer Costs, and Options

I’ve written about Data Transfer Costs in the past, and how they’re probably the premier source of unpleasant surprises when the AWS bill shows up. There are many, many sources for these but, in my experience, the vast majority of them come down to

• Traffic to public IPs, and forgetting that any traffic to them, even from your own instances, counts
• Static assets on EC2 which you haven’t moved to Cloudflare. Especially when you’ve got videos (ouch!)
• Multi-AZ deployments, and in particular, multi-AZ RDS, where every damn write involves data-transfer costs.

So fine, you know the above, what’re you going to do about it? You could stay on top of it with Cloudability or CloudCheckr or some such, but this really requires you to seriously stay on top of it. And the issue with this is that relying on always doing the right thing to keep your costs down will end up ‘sploding in your face, and probably at the worst possible moment at that! Alternatively, you cou

How do *you* make decisions?

/via https://marketoonist.com/2006/04/the-decision-matrix.html

“IT DEPENDS”

Seriously, how do you decide what to do? Do you just wing it? Or do you have a detailed “if, then, else” flowchart that you follow? The answer is, most likely, some variation of “it depends”, right? I mean,

• “Should I put my shoes on?” is obvious: “I’m going to the gym, so yes, I should put my shoes on”
• “Which flavor of gelato should I order?” is complicated, with some serious expertise necessary to resolve: “I’m in NYC, not Sicily, but the reviews say this tastes authentic, and they say they use Bronte Pistachios, so yes, I should order Pistacchi”
• “Why did Alice’s grades suddenly go to shit this semester?” is way more complex. There are so many things happening in her life in this, her third semester, and it’s likely that there might actually be more than one actual cause. What you probably do is (delicately!) probe, to try and make sense of what’s going on. “So, honey, yo