Monday, April 2, 2012

Eventual Consistency is not something that needs to be Worked Around. It's a feature

You can absolutely go read up on Eventual Consistency, the CAP theorem, etc., and quite a bit of what's out there is good. Those write-ups tend to skimp on something I find quite important though, i.e., the role of White Lies in an Eventually Consistent environment.
Depending on the architectural context, you can end up with a system where there is a requirement for consistency, but only once all the Business Processes are taken into account. Or, to phrase it differently, if you are willing to lie Just a Little, you can relax constraints without getting yourself into trouble.

Consider my company, which, to put it technically, does all sorts of big cloud telephony thingies. These thingies are all done with clumps of servers ("clump" --> another technical term).
At the simplest level, what we have is just a big honking phone system. Y'know, the kind where people call each other and leave voicemails and stuff.
So, consider a fairly typical scenario on a Wednesday morning (birds chirping, bunnies hopping, etc.).
Cue the sound of a phone ringing...
It's Alice calling Bob, who really, really doesn't want to talk to Alice right now. He sees the incoming call on his screen, and hits the Send To Voicemail button. Alice dutifully leaves a voicemail ("If you don't pick up the dry-cleaning on the way home, the wedding is off, you hear me?  OFF"). This voicemail gets left on the Arizona clump (which just happens to be in Arizona).

Shortly after hitting the Send To Voicemail button, Bob has the minor epiphany that the wedding is getting near, Alice is somewhat frazzled, and he should probably find out what she was calling about.  By then Bob's screen has a nice blinky voicemail icon on it, so figuring that discretion might be the better part of valor, Bob calls in to check his voicemail before calling Alice.

Remember the "cloud PBX" thingy?  Tragically, Bob calls into the Zurich clump, which happens to be in, wait for it, Zurich.

Old School
If we designed the Voicemail system Old School, with big honking databases and color glossies with circles and arrows on the back of each one, we would have set things up so that the Voicemail icon on the screen only shows up when the actual voicemail data has been copied to all the clumps. Which implies that we either take quite a bit longer for the icon to show up on the screen (hey, it takes a while for data to get copied from Arizona to Zurich) or have a very high bandwidth connection between clumps.
Alternately, we get even more complicated, and do things like geo-locate calls, track callers based on caller-ID, etc. and, with a judicious amount of luck, have all relevant parties end up on the same clump.

We also give people ponies while we are at it.

Oh, this, by the way, is how we used to do things in a previous iteration of our system (without the ponies).
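
For the code-minded, here is a minimal sketch of that Old School rule: the icon only lights up once every clump has its own copy. It is an illustration, not our actual code. The Clump class, the copy_to helper, and the message ID are all made up, and the "copy" here is an in-memory assignment standing in for a slow trip across the Atlantic.

```python
class Clump:
    """Hypothetical stand-in for a cluster of servers in one location."""

    def __init__(self, name):
        self.name = name
        self.audio = {}  # message_id -> recorded audio bytes


def copy_to(clump, message_id, audio):
    # Assumption: in a real system this is the slow, cross-continent copy.
    clump.audio[message_id] = audio


def leave_voicemail_old_school(clumps, message_id, audio):
    """Store the voicemail everywhere, and only then turn the icon on."""
    for clump in clumps:
        copy_to(clump, message_id, audio)  # blocks on the slowest link
    # By the time we get here, every clump can serve the message.
    return all(message_id in clump.audio for clump in clumps)


arizona, zurich = Clump("arizona"), Clump("zurich")
icon_on = leave_voicemail_old_school([arizona, zurich], "vm-1", b"dry-cleaning!")
```

The catch is right there in the loop: Bob's icon waits on the slowest copy, and if any one link is down, nobody gets an icon at all.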

Eventual Consistency
In an Eventually Consistent world, we realize that sometimes it just isn't all that important for everything to happen atomically, and - this is important - the occasional white lie is actually ridiculously useful.
When Bob calls in, if the voicemail hasn't been copied over to the Zurich clump (yet), we play an audio saying "This voicemail is still being processed.  Please try again, or press 2 to listen to the next message".  Notice what we did:
  • We lied.  Well, we actually didn't: the voicemail is still being processed, in that it is being copied.  But that's OK!
  • Odds are, in the time that it took us to play the above audio, the voicemail will have come across, i.e., we bought ourselves just enough time so that we didn't have to make everything atomic.
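
The white lie above can be sketched in a few lines. This is a toy model under stated assumptions, not our production code: Clump, play_voicemail, replicate, and the "vm-1" message ID are hypothetical names, and replication is simulated by a plain function call rather than anything running in the background.

```python
from dataclasses import dataclass, field

STILL_PROCESSING = (
    "This voicemail is still being processed. "
    "Please try again, or press 2 to listen to the next message."
)


@dataclass
class Clump:
    """Hypothetical stand-in for a cluster of servers in one location."""
    name: str
    audio: dict = field(default_factory=dict)  # message_id -> audio bytes


def play_voicemail(local, message_id):
    """Return what the caller hears on the clump they happened to reach."""
    data = local.audio.get(message_id)
    if data is None:
        # The copy has not landed here yet, so stall with the prompt.
        return ("prompt", STILL_PROCESSING)
    return ("audio", data)


def replicate(source, target, message_id):
    # Assumption: stands in for the asynchronous copy between clumps.
    target.audio[message_id] = source.audio[message_id]


arizona, zurich = Clump("arizona"), Clump("zurich")
arizona.audio["vm-1"] = b"...the wedding is off..."

first_try = play_voicemail(zurich, "vm-1")   # Bob hears the prompt
replicate(arizona, zurich, "vm-1")           # the copy catches up
second_try = play_voicemail(zurich, "vm-1")  # Bob hears Alice
```

Nothing in play_voicemail waits on the network. It answers from whatever the local clump has, and the prompt covers the gap.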
The thing is, People are used to stuff like this!  When you order your Mocha-frappa-cino at Starbucks, it doesn't show up with your receipt - you get to go over to a different area where you hang around and chat with your #HipsterBarista.  In short, people are used to services having built-in delays in processing.  All we are doing is taking advantage of human nature to buy ourselves some time to Get Things Done.

The point behind all this, of course, is that Eventual Consistency is not something that needs to be worked around.  It is actually a feature that buys you stuff elsewhere (go see the CAP theorem for more about this).  In our case, we get significant cost savings (have you priced gigabit links from Arizona to Zurich?), as well as reliability (try doing things old-school when your gigabit link is down).

In short, the next time you're looking at a problem involving Data Consistency and trying to engineer The Correct Solution, you might want to take a step back, and ask yourself if there are Business Rules or Processes that you can take advantage of, which will help ease your life.  Chances are, there are, and they will :-)
