NoSQL - what you'll find (for sure)

A question on Stack Overflow about whether to use sharding or clustering with MySQL drew a response that made the following point (emphasis mine):

More interesting differences are when you look at areas other than performance...Sharded MySQL is more 'roll your own'.
Having sharding built into your application gives you maximum scaling potential, but adds complexity and limits your flexibility in terms of cross-shard queries and operations. If your sharding is premature then it may be the root of some problems for you. MySQL Cluster lets you get some of the benefits of sharding without having to constrain your application to be single-shard only.
The key points in italics above are incredibly relevant in the NoSQL world. No matter how 'canned' the package is, you will find out that:
  • You didn't understand your own problem-space as well as you thought you did.
  • You didn't understand the package you are using as well as you thought you did.
  • It will not scale the way you thought it would. Oh, it'll scale all right, just not the way you expected.
  • Your object/document/JSON/whatever model doesn't map onto the store the way you expected it to.
In short, you will be rolling your own, and if you define your platform/architecture prematurely, this will be the root of some (or most!) of your problems.
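To make the 'roll your own' cost concrete, here is a minimal sketch of application-level sharding. Everything in it is made up for illustration: the `ShardedStore` class, its methods, and the in-memory dicts standing in for real database nodes are assumptions, not any particular product's API. Lookups by key go to exactly one shard, but any query on a non-key field has to visit every shard and merge the results in your own code.

```python
import hashlib


class ShardedStore:
    """Hypothetical application-level sharding over in-memory dicts.

    Each dict stands in for a separate database node (an assumption
    made purely for illustration).
    """

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # Stable hash so the same key always routes to the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, doc):
        self._shard_for(key)[key] = doc

    def get(self, key):
        # Single-shard operation: cheap, and it scales with shard count.
        return self._shard_for(key).get(key)

    def find(self, predicate):
        # Cross-shard operation: scatter to every shard, gather in the app.
        results = []
        for shard in self.shards:
            results.extend(doc for doc in shard.values() if predicate(doc))
        return results


store = ShardedStore(shard_count=4)
store.put("user:1", {"name": "ann", "city": "Oslo"})
store.put("user:2", {"name": "bob", "city": "Lima"})
store.put("user:3", {"name": "cy", "city": "Oslo"})

# A query on a non-key field touches all four shards.
in_oslo = sorted(doc["name"] for doc in store.find(lambda d: d["city"] == "Oslo"))
```

Note that the shard key is baked in from the first `put`. If it later turns out you mostly query by `city`, every one of those queries is a full scatter-gather, which is the kind of premature decision the quote warns about.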

Oh, and for bonus points: each of the myriad NoSQL solutions out there has its own benefits and curses. You really need to know what you're getting into. MongoDB and CouchDB, for example, are not interchangeable (OK, they could be. But you get what I mean...)

