Posts

Showing posts from February, 2018

No Man is an Island — Magpie Edition

Ever heard of the Social Intelligence Hypothesis? Put simply, it sez that the more social activity you engage in, the more intelligent you are (ok, not you, it’s your species. Whatever). It’s not a particularly controversial idea, but it’s been kinda hard to actually prove, with numbers and stuff. Part of the reason is that society size isn’t exactly dispositive. I mean, wildebeest are dumb, but lords, do they ever have large societies. Anyhow, Ashton et al. have done stellar work in fleshing this out with real data in a recent paper in Nature. They studied magpies, and conclusively demonstrate that group size amongst magpies is associated with the cognitive abilities of the magpies (•). The larger the group, the better its members are at cognition. Yes, there were exceptions, but no more so than you would find in any normal distribution of intelligence (MIT graduates are not all brilliant). Mind you, we need to be somewhat careful here ...

When you run your own Elasticsearch Instances

So, Meltwater decided to run their own ES setup (on AWS, but still, their own instances). Why? In their own words: “AWS Elasticsearch Service allows us too little control, and Elastic Cloud would cost us 2–3 times more than running directly on EC2”. Go read the entire post; it is well worth it (especially the bit about sharding!), but the following performance tips are worth calling out:
1. Limit searches to relevant data. Obvious, but lords, it’s amazing how much this one gets ignored.
2. Invest in Observability. Do you know where your resources are going? GC stats, io-wait, memory hogs, CPU usage, etc. In particular, get really close and intimate with JVM performance tuning.
3. Memory, not GC. Unless you’re really good, or have a real edge case, odds are that you’re not going to do much with the GC sub-system. Instead, focus on reducing memory allocation.
4. Azul Zing for Memory. Basically, outsourcing your JVM management. It’s an expensive product,
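To make the first tip (limit searches to relevant data) a bit more concrete, here is a minimal sketch of a search body that puts a time-range restriction in the filter clause of a bool query, so only recent documents are considered. This is not Meltwater's code; the field names (`body`, `published_at`), the seven-day window, and the helper name are all made up for illustration.

```python
import json


def build_search_body(text, field="body", time_field="published_at", since="now-7d/d"):
    """Build an Elasticsearch query body that only looks at recent documents.

    All field names and the default window are hypothetical.
    """
    return {
        "query": {
            "bool": {
                # Scored full-text match on the (hypothetical) text field.
                "must": [{"match": {field: text}}],
                # Unscored restriction: only documents newer than `since`.
                "filter": [{"range": {time_field: {"gte": since}}}],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("magpies"), indent=2))
```

The resulting dictionary is what you would hand to whatever client you use as the search request body. The point is simply that the narrowing is stated in the query itself, rather than being left to post-processing.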

Documenting Your Brilliance

/via http://geekandpoke.typepad.com/geekandpoke/2010/03/simply-explained-entropy.html
The two different times when your code made sense to you:
1) When you wrote it.
2) (Later) When you spent hours poring over it, and figured out, again, what it does.
The amazing thing is, it tends to be blindingly obvious each time you figure it out. The even more amazing thing is how you didn’t add any documentation, even after (2), because this time you were sure you had it figured out.

What Time Is It?

You’re ingesting measures (not metrics. Measures aren’t Metrics; know the difference!), and, clearly, you’re time-stamping stuff, right? Well, let’s think about that for a moment. What exactly does “time-stamping” mean? More to the point, what exactly are you time-stamping? Consider the following:
Creation Time: Theoretically, the time at which that data was created. Question is, whose wall-clock are we looking at? Unless this is a single monolith (and even there, you might be screwing up time measurements 😱), you’re looking at a distributed system, and, well, everything is relative at that point. (•)
Ingestion Time: The time you received the data. Again, this is true for a given value of “you”; after all, “you” might be a distributed system too.
Storage Time: The time when you actually stored the data that you ingested. And the reason you might care about this is because you might make, or have made, decisions based on data, and you need to know what data you
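One way to keep the three timestamps above from blurring into a single “timestamp” field is to carry them separately on each record. The following is only a sketch under that assumption: the `Measure` class, the `ingest` and `store` helpers, and the injectable `clock` parameter are hypothetical names, not something from the post.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Callable, Optional


def now_utc() -> datetime:
    """Wall-clock time of *this* process, in UTC."""
    return datetime.now(timezone.utc)


@dataclass(frozen=True)
class Measure:
    name: str
    value: float
    creation_time: datetime                    # producer's clock, not ours
    ingestion_time: Optional[datetime] = None  # set when we receive it
    storage_time: Optional[datetime] = None    # set when we persist it


def ingest(m: Measure, clock: Callable[[], datetime] = now_utc) -> Measure:
    """Return a copy stamped with the receiver's clock."""
    return replace(m, ingestion_time=clock())


def store(m: Measure, clock: Callable[[], datetime] = now_utc) -> Measure:
    """Return a copy stamped with the storage layer's clock."""
    return replace(m, storage_time=clock())
```

Passing the clock in explicitly makes it obvious whose wall-clock produced each value, and it makes the stamping testable with fixed times.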

Transferring Image “Style” via Deep Learning

So, what you want to do is apply a specific style to a photo. Oh, I’m not referring to the Instagram/Filter thing, I mean more in the sense of “I want to make this picture, taken in the summer, look like it is the depths of winter, with snow and clouds and whatnot”, pretty much like the example below. There are existing techniques to do stuff like this (e.g. the “Van Gogh” approach by Gatys et al., or the “Deep Photo” approach by Luan et al.), but they all tend to either be too stylized (so yeah, it looks like “Starry Night”. Yay.), or they have obvious artifacts. What’s more, they are all quite compute intensive, requiring upwards of minutes, on serious horsepower mind you, to do a single image, a far cry from Instagram-level abilities. Li et al. from UC Merced and NVIDIA have a nifty new paper out (and code! and docs!) where they deal, quite successfully, with the above issues. In short, their approach applies the stylization as normal, but then proceeds to app

What is in *your* language toolkit?

The way we solve problems, heck, the way we think about problems, is based upon the tools that we use. Phrased somewhat simply: “If all you have is a hammer, everything looks like a nail”. This gets particularly egregious in the world of software development. Your run-of-the-mill #CowboyDeveloper learns Java (or C++, or whatever), and then spends the rest of their life writing Java in Ruby syntax, or golang syntax, or whatever. Mind you, the really bad part here is not that they haven’t figured out the semantics of the new language, it is that they genuinely believe that there is no difference, that programming languages are just different syntaxes, that notation doesn’t matter. And that, my friends, is a problem. As Kenneth Iverson pointed out in 1979, Notation Matters. If you translate the words from English to Telugu, you might, maybe, make some sense, but only barely. Getting anything beyond the bare minimum of information across will be remarkab

So You’re Doing Hardware…Efficiently?

How do you know? I mean, if you’ve been in the biz for a while, you’ve probably got all sorts of metrics that you use to track this. OTOH, if you’re just getting into the game (either as a new biz, or through an unexpected promotion), then, well, how do you know? There are many, many ways to do this, but there is One Simple Trick (sorry!) that you can use to figure out how things are progressing, and that involves looking at your BOM. What you want to do is keep track of it, and more importantly, keep track of how frequently it changes. (•) To put it simply, the more frequently it changes, a) the more rework you have to do to just get the system back to steady-state, and b) the less opportunity you have to build stuff on top of the hardware. Remember, hardware, in general, is far less tolerant of incessant adjustments (yeah, neither is software, but it gets much worse with hardware). And, all the time you spend in fixing everything that broke with your latest BOM upd
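One way to put a number on “how frequently it changes” is to log the date of every BOM revision and compute a change rate from that. What follows is just a sketch under that assumption; the function name, the 30-day “month”, and the example dates are invented for illustration, and nothing here says what rate counts as too high.

```python
from datetime import date


def bom_changes_per_month(revision_dates, days_per_month=30.0):
    """Average number of BOM changes per (30-day) month.

    `revision_dates` is a list of dates on which the BOM was revised;
    the first entry is treated as the baseline, not as a change.
    """
    if len(revision_dates) < 2:
        return 0.0  # nothing has changed since the baseline
    ordered = sorted(revision_dates)
    span_days = (ordered[-1] - ordered[0]).days
    if span_days == 0:
        return float("inf")  # several revisions on a single day
    changes = len(ordered) - 1
    return changes / (span_days / days_per_month)


if __name__ == "__main__":
    revisions = [date(2018, 1, 1), date(2018, 1, 31), date(2018, 3, 2)]
    print(bom_changes_per_month(revisions))
```

Watching whether this number trends down over time is the useful part; the absolute value will depend on the product and the stage it is at.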

How to (not) do Crypto — Episode #962

/via http://speakgeek.isovitis.com/
Matthew Green has a nice writeup on the whole MedSec / St. Jude entertainment, specifically on the underlying vulnerabilities that were the proximate cause of the whole (admittedly serious!) kerfuffle. It’s worth reading the whole thing, but one particularly egregious part is worth explicating: the role of security through obscurity. (Note: If you ever, ever, EVER think it’s a good idea, go take a nap.) I’ll just copy the relevant part below; it says everything that needs to be said…
Programmer commands are authenticated through the inclusion of a three-byte (24 bit) “authentication tag” that must be present and correct within each command message received by the implantable device. If this tag is not correct, the device will refuse to accept the command. To our surprise, SJM does not appear to use a standard cryptographic function to compute this tag. Instead, they use an unusual and apparently “hom
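For contrast with the three-byte tag in the quoted passage, here is roughly what “a standard cryptographic function” for message authentication looks like, using HMAC-SHA256 from Python’s standard library. To be clear, this is an illustration only: it is not what SJM ships, it is not a fix proposed in the writeup, and the key, the command bytes, and the function names are all made up.

```python
import hashlib
import hmac

# A 3-byte tag has only 2**24 possible values, however it is computed.
TAG_BITS = 24
TAG_SPACE = 2 ** TAG_BITS


def make_tag(key: bytes, command: bytes) -> bytes:
    """Compute a 32-byte HMAC-SHA256 tag over the command bytes."""
    return hmac.new(key, command, hashlib.sha256).digest()


def verify_tag(key: bytes, command: bytes, tag: bytes) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, command), tag)


if __name__ == "__main__":
    key = b"\x01" * 32                # hypothetical shared secret
    command = b"SET_RATE 60"          # hypothetical command message
    tag = make_tag(key, command)
    print(verify_tag(key, command, tag))           # genuine command
    print(verify_tag(key, b"SET_RATE 200", tag))   # tampered command
    print(TAG_SPACE)                               # size of the 24-bit tag space
```

The security here rests on the secrecy of the key and a well-studied construction, not on nobody figuring out how the tag is computed.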

Reducing AWS / RDS Spend

/via https://www.networkcomputing.com/cloud-infrastructure/when-aws-goes-awry/1164227596
Herewith some fairly obvious pointers on what to look at when it comes to reducing your AWS / RDS spend.
Instance Size: Obvious, but is your instance small enough? Or, heaven forfend, big enough?
Storage Type: Obvious, but are you using the right type of storage? Do you really need io1?
Data: Why are you storing everything in RDS? Slap it in S3/CloudFront, and use a pointer!
Usage: aka “Tune your DB”. The correct index might obviate the need for a db.m4.16xlarge.
Archival: Do you really need all that data in RDS? Clobber your old data, once you’re done with it!
Data Transfer: Yeah, here’s where you’ll definitely get nailed. Moving data out of each instance dips into your wallet. That includes backups, multi-AZ, remote transfers, and whatnot.
Of the above, Data Transfer is one of those things you really, really need to think about beforehand. It’s t

Up-Front Design: Not just for Chumps

You’re a “bottom-up” developer, eh? Elaborate please? That was the opening of a recent conversation with a “senior architect” at a company I’d inherited. It turns out that this person’s definition meant “I start coding, and let the architecture evolve as I code”. Yup. Let that sink in for a bit. Oh, to help it sink in, understand that this wasn’t for wee-tiny components either; this was for the whole shebang, the system as a whole! (Hence the scare-quotes around “senior architect”.) The above is something I’ve seen quite a bit of, especially from “big company refugees”, people who’ve done their time at Accenture, IBM, Microsoft, or some such, and have come into start-up land, having absorbed all the wrong ideas from said companies. In a nutshell, “Process is bad”. Look around your company: if you have people like this, be afraid, be very, very afraid. They are going to be (or already are!) living embodiments of the “maintenance load equals available resources” trope (•). Up front de