Migrating to a New Database

Come migration time, if you’re (very!) lucky, you can go offline, and Just Take Care Of It. Odds are, however, that it isn’t going to be that simple — systems that can’t go offline, customers that scream really loud, money, whatever.
In that case, the general strategy tends to be some variation of the following:
  1. Migrate Creates / Updates / Deletes to the New
    Every time you insert/update/delete data in your old database, you do the same on the new database. You’re still Reading from the old database, however, so in case things go south, you find a bug, etc., you can just totally nuke the new database and start again from scratch.
    And yeah, there’s a whole bunch of detail here around preloading, metrics, alerts (•), and so forth that I’m eliding.
  2. Serve (some!) Reads from the New
    In -1-, all your Reads came from the old environment. Now, you can, and should, start serving Reads from the new environment. Mind you, if at all possible, do this gradually (canary deployments, whatever). You’re still doing all your Creates/Updates/Deletes to both environments, so if things go south, you can start over.
    I can’t emphasize enough how important the “gradual deployment” bit is: you’d be (not?) surprised how badly you can misjudge things here!
    Also, be sure to set up metrics around missing data! (••) You really, really want to know when one of the Reads against the new environment barfed, so that you can find out why!
  3. Serve (all!) Reads from the New
    If you were doing the canary thing in -2- (and you should!) then this is where you’ve switched all your Reads over to the new environment. As before, you’re still doing all your Creates/Updates/Deletes to both environments, so you’ve got the option of reverting.
    That said, you now have the reverse problem, viz., the likelihood that there is data on the new environment that isn’t on the old. And yeah, it’ll happen: code paths diverge, a bad commit, whatever. So, be sure to set up the same “failed Read” metric for your old environment too!
  4. Retire the Old
    Pretty much what it sounds like. Once you’ve hit whatever your success criteria are (2 weeks without failure, no missing Reads, etc.), you can retire the old environment.
    Mind you, be sure to define your success criteria before you start this whole process, and validate them again before retiring. (•••)
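To make steps -1- through -3- a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the `MigratingStore` class, the `new_read_fraction` knob, and the metric names are my assumptions, and plain dicts stand in for the two databases. It also assumes that when the store you read from is missing a key, you quietly serve it from the other store and count the miss, which is one choice among several.

```python
import random
from collections import Counter


class MigratingStore:
    """Hypothetical wrapper around an old and a new key-value store.

    Plain dicts stand in for the real databases in this sketch.
    """

    def __init__(self, old, new, new_read_fraction=0.0, rng=random.random):
        self.old = old
        self.new = new
        # 0.0 = step 1 (all Reads from old), 1.0 = step 3 (all Reads from new),
        # anything in between = step 2 (canary).
        self.new_read_fraction = new_read_fraction
        self.rng = rng
        self.metrics = Counter()

    def put(self, key, value):
        # Creates and Updates always go to both environments.
        self.old[key] = value
        self.new[key] = value

    def delete(self, key):
        # Deletes always go to both environments too.
        self.old.pop(key, None)
        self.new.pop(key, None)

    def get(self, key):
        # Decide per Read which environment serves it.
        use_new = self.rng() < self.new_read_fraction
        if use_new:
            primary, other, name = self.new, self.old, "new"
        else:
            primary, other, name = self.old, self.new, "old"

        if key in primary:
            return primary[key]
        if key in other:
            # The data exists, just not where we looked: this is the
            # "missing data" case you want a metric (and an alert) on.
            self.metrics[f"missing_in_{name}"] += 1
            return other[key]
        # Genuinely absent from both stores.
        return None
```

Ramping up is then a matter of nudging `new_read_fraction` from 0.0 towards 1.0 while watching `missing_in_new`, and keeping an eye on `missing_in_old` once you are at 1.0. A real version would also need to deal with a write that succeeds on one side and fails on the other, which this sketch ignores completely.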
The devil in all this is, as they say, in the details. Check out this Couchbase-to-Redis writeup for good info on the process, including some testing info, red flags, etc. Also, be sure to google the heck out of this!
(•) You do test your alerts, don’t you? This might be a good time to really validate that they work!
(••) You might want to also consider doing a Migrate On Fail, i.e., if a missing Read shows up, migrate the data over on the spot. (Warning! Here Be Dragons!)
(•••) Yes, you need Requirements for Pulling The Plug™. Why would you think otherwise?
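For the curious, a Migrate On Fail (••) might look roughly like the sketch below. The function name `read_with_repair`, the metric name, and the dict-based stores are all hypothetical. One of the dragons shows up right in the code: between "it's not in the new store" and "copy it over", somebody else may have written or deleted that key, and a naive copy can clobber a fresh write or resurrect a deleted record. The `setdefault` call below only avoids the clobbering in this toy in-memory setting; a real database would need its own conditional-write mechanism.

```python
def read_with_repair(key, old, new, metrics):
    """Read from the new store; on a miss, copy the record over from the old one.

    `old` and `new` are plain dicts standing in for the two databases, and
    `metrics` is a dict of counters. All names here are illustrative.
    """
    if key in new:
        return new[key]
    if key not in old:
        # Absent from both stores: a legitimate miss, nothing to repair.
        return None

    # Missing Read: count it, then migrate the record on the spot.
    metrics["missing_in_new"] = metrics.get("missing_in_new", 0) + 1
    # setdefault leaves the key alone if something already wrote it.
    new.setdefault(key, old[key])
    return new[key]
```

Note that the metric is still bumped on every repair. The repair hides the symptom from your users, not from you, and you still want to find out why the record was missing.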
