MongoDB - the MySQL of the NoSQL movement?

It's a pretty serious question since, as far as I can tell, MongoDB is rapidly becoming the default DB that everybody uses when they have to do something "cloud-y".

Where's the problem with that?

Well, let's think back to SQL databases - by and large, MySQL and PostgreSQL really do tend to be "one size fits all". If you're already thinking relational, then it really doesn't matter which of these two you use (no flames please!). Yes, there are differences in how they replicate, shard, do stored procedures, etc., but they're really the same damn thing (as is Oracle, for what it's worth).

NoSQL databases? Ah well, that's different. They come in all sorts of shapes and sizes (document, column-oriented, key-value, etc.) and each of these has a different sweet spot. I tend to think of them as solution-oriented data stores, i.e., each DB is tuned towards a specific solution domain. Oh yes, they are certainly moving towards each other - Riak has secondary indexes, CouchDB's BigCouch variant shards and scales brilliantly, etc. - but that is evolution for you.
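
To make the "different shapes" point a little more concrete, here is a toy sketch in plain Python. No real database is involved, and the `users` records and every helper name (`kv_get`, `doc_find`, `col_average`) are made up for illustration. It lays out the same data three ways and hints at which question each layout answers cheaply.

```python
# Toy illustration only: plain dicts standing in for three storage shapes.
# The records and every function name here are hypothetical.

users = [
    {"id": "u1", "name": "Ann", "city": "Oslo", "age": 31},
    {"id": "u2", "name": "Bob", "city": "Lima", "age": 45},
    {"id": "u3", "name": "Cy", "city": "Oslo", "age": 27},
]

# Key-value shape: an opaque value behind a key. Fine for "get by id",
# but "who lives in Oslo?" means scanning and decoding every value.
kv_store = {u["id"]: repr(u) for u in users}


def kv_get(key):
    """Return the opaque stored value for key, or None if absent."""
    return kv_store.get(key)


# Document shape: the store understands the fields, so it can filter on them.
doc_store = {u["id"]: dict(u) for u in users}


def doc_find(field, value):
    """Return the sorted ids of documents whose field equals value."""
    return sorted(d["id"] for d in doc_store.values() if d.get(field) == value)


# Column-oriented shape: one list per field, so aggregating a single
# field only touches that field's values.
col_store = {field: [u[field] for u in users] for field in users[0]}


def col_average(field):
    """Average a numeric column without reading any other column."""
    values = col_store[field]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(kv_get("u2"))
    print(doc_find("city", "Oslo"))
    print(col_average("age"))
```

None of this says anything about how a real store behaves under load; it is only meant to show that the "natural" query differs with the shape of the data.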

Of course you can use these (somewhat, and very loosely) interchangeably, but ye gods, wouldn't that be a dumb thing to do. As a simple exercise, imagine the entertainment associated with swapping out CouchDB for Berkeley DB. (And, in case you are wondering, I was actually asked to help with this recently because "They're both DBs, right? What's the difference?")

Which, I believe, brings us back to the original problem - we now have a generation of people who have been raised to believe two things:
  1. Data gets persisted in a database, just use an ORM / SQL to get at it
  2. Which database you use is about religion/money/advertising
Note that the specifics of what you are trying to do have virtually nothing to do with the choice of the database. MySQL vs PostgreSQL flame wars, Oracle is for Big Business, etc., etc. - we've seen these discussions all over the place, but everybody still uses Hibernate and/or whatever ORM they care about to persist stuff into the Chosen Database In The Datacenter. When push comes to shove, they also merrily swap out one for the other, and nobody is the wiser (yes, I've done that too many times to count).

Enter MongoDB. It's document-oriented, has a nice SQL-y syntax, and is easy enough to use that people can Just Do It without too much trouble. And consequently, they are doing it for everything, everything I tell you.
Need a distributed file system? Use GridFS on MongoDB.  (ed: Please don't)
Need a distributed key-value store for your cache? Use MongoDB.
Transactionality? WTF, go ahead and use it.
etc., etc.
And people are actually using it that way. It's the same bass-ackwards approach to persistence that we know from the SQL world:
  1. Install MongoDB
  2. Retrofit every damn thing you need using it
  3. Wonder why performance sucks (or scalability, or why your code suddenly got really complex)
Given all the "entertainment" associated with MongoDB, you might think that I'd be the first person to damn it, but that isn't necessarily true. It's got its time and place, and hey, it may even be exactly what you're looking for. But check and make sure beforehand, otherwise there is going to be a lot of pain and suffering in the future!

Now that I've gotten this far, do note that this problem is endemic to the NoSQL movement. MongoDB happens to have pretty huge mind-share, but hey, it's happening with other DBs too. John Wood (of Signal fame) has a post up about how they moved to CouchDB, and are now moving off. It's well written, carefully put together, and by no means a rant against CouchDB. It's just that CouchDB was not the correct fit for their problem space, and MongoDB might actually be.
Which really should hammer home my point about NoSQL being solution-oriented.  The fact that Signal went into production with CouchDB before discovering their issues is just icing on the cake.

The bottom line? Think about your problem-space first.  You might actually be better off with MongoDB.  Or MySQL. Or Cassandra.  Or Voldemort.  Or whatever. 
Just. Think. First.
Then make your choice.
That is all...
