“Because it works” is a justification of last resort

Reliability is Hard. Oh, there are tools to make it easier, but “easier” is not the same as “easy”. And Reliability is, certainly, not Easy. When you factor in the (geometrical) complexity of distributed components, and the (geometrical) complexities that get layered on top when these components span nodes, and then networks, well, “Reliability is Hard” gets close to being the Understatement of the Year.
And that’s where rigor comes in, rigor in the form of empirical tests, rigor in the form of mathematical certainties, rigor in the form of proofs, models, and battle-testing, in short RIGOR.
Do you, genuinely, understand what your system does?
Can you explain this to others, simply, in a form they can understand (a-la Feynman)?
Most importantly, is there any place in the system where you up with
  • “I hand-tuned this till it worked”
  • “I implemented my own <something in common use, that I didn’t know existed>
  • “It doesn’t matter, it works, and that’s what’s important”
If the above sounds familiar (and it should, I’ve seen this, lo, too many times in my life) you might want to take a step back and examine your priors…

Comments

Popular posts from this blog

Erlang, Binaries, and Garbage Collection (Sigh)

Its time to call Bullshit on "Technical Debt"

Visualizing Prime Numbers