What Price Resilience

So you’re going to revolutionize ice-cream delivery, through the powers of your iOS/Android app.
(Work with me here. Pretend that you’ve actually invented a better mousetrap.)
Your back-end runs on an AWS/EC2 instance, which is fine while you’re developing, but as you start thinking about real live customers, reliability comes to mind, and you split the back-end up so that it’s running on TWO AWS/EC2 instances.
Great, right? If one goes down, the other takes over the load, and everything is hunky-dory (as long as the combined load is less than the capacity of any one of the servers, but let’s ignore that for now).
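That parenthetical is easy to wave away, so here is a rough sketch of the check it implies. It assumes every server has the same capacity and that the surviving servers split the orphaned load perfectly, neither of which is guaranteed in real life. The function name and the numbers are made up for illustration.

```python
def survives_single_failure(loads, capacity_per_server):
    """Return True if the remaining servers can absorb the total load
    when any one server goes down.

    Assumptions (hypothetical, for illustration only): all servers have
    identical capacity, and load rebalances perfectly across survivors.
    """
    n = len(loads)
    if n < 2:
        # With a single server there is nothing to fail over to.
        return False
    total_load = sum(loads)
    # After one failure, n - 1 servers must carry everything.
    return total_load <= (n - 1) * capacity_per_server


# Two servers, each rated for 1000 requests/sec (made-up figures).
print(survives_single_failure([300, 300], 1000))  # combined 600 fits on one box
print(survives_single_failure([600, 600], 1000))  # combined 1200 does not
```

In other words, with two servers you only get real redundancy if each one normally runs at less than half its capacity.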
Except, what if AWS itself goes down? As June 29, 2012 taught a lot of us, the unimaginable does happen every now and then. The solution, of course, is to deploy your instances across multiple availability zones, or maybe even multiple regions, or heck, while you’re at it, across multiple cloud providers.
The thing is, as you go up the reliability food chain, your system architecture necessarily gets affected. You now need to start worrying about fun stuff like load-balancing, network partitions, health checks, failovers, response playbooks, deployments, rollbacks, and a host of other stuff. All of this adds complexity to your system, and complexity always comes at a cost!
The question you need to ask yourself is: “Is the cost worth it?”
Does your service really need to be able to function through that black swan of a once-a-decade, region-wide outage? Mind you, the answer could very well be “YES”, but it isn’t automatically so.
Ask yourself what that outage is actually going to cost you, and compare that to the costs of building out — and maintaining! — all the additional complexity. Also remember that you can even get “cyber insurance” (yes, that’s what they call it) for events like these!
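To make that comparison concrete, here is a back-of-the-envelope sketch. It treats the outage cost as a simple expected value (frequency × duration × cost per hour) and sets it against the yearly cost of the extra resilience. Every name and number below is a hypothetical placeholder; plug in your own estimates, and remember that this ignores fuzzier things like reputational damage.

```python
def expected_annual_outage_cost(outages_per_year, hours_per_outage, cost_per_hour):
    """Expected yearly loss from outages, as a plain expected value."""
    return outages_per_year * hours_per_outage * cost_per_hour


def resilience_is_worth_it(outages_per_year, hours_per_outage,
                           cost_per_hour, annual_resilience_cost):
    """True if the expected outage loss exceeds what the extra
    resilience costs to build and maintain each year."""
    expected_loss = expected_annual_outage_cost(
        outages_per_year, hours_per_outage, cost_per_hour
    )
    return expected_loss > annual_resilience_cost


# Hypothetical figures: a once-a-decade outage (0.1 per year) lasting
# 8 hours, costing $5,000 per hour of downtime, versus $60,000 per year
# to run and maintain a multi-region setup.
loss = expected_annual_outage_cost(0.1, 8, 5_000)
print(f"Expected annual outage cost: ${loss:,.0f}")
print("Worth it?", resilience_is_worth_it(0.1, 8, 5_000, 60_000))
```

With these made-up numbers the expected loss is roughly $4,000 a year, far below the cost of the multi-region build-out. Change the cost per hour to something like $500,000 and the answer flips, which is the whole point: the numbers decide, not the architecture diagram.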
Always remember, your discussions about reliability and resilience need to be conducted within a business context. Unless you’re doing academic research of course, in which case, knock yourself out…
