MongoDB - the MySQL of the NoSQL movement?
It's a pretty serious point since, as far as I can tell, MongoDB is rapidly becoming the default DB that everybody uses when they have to do something "cloud-y".
Where's the problem with that?
Well, let's think back to SQL databases - by and large MySQL and PostgreSQL really do tend to be "one size fits all". If you're already thinking relational, then it really doesn't matter which of these two you use (no flames please!). Yes, there are differences in how they replicate, shard, do stored procedures, etc., but they're really the same damn thing (as is Oracle, for what it's worth).
NoSQL databases? Ah well, that's different. They come in all sorts of shapes and sizes (document, column-oriented, key-value, etc.) and each of these has a different sweet spot. I tend to think of them as solution-oriented data stores, i.e., each DB is tuned towards a specific solution domain. Oh yes, they are certainly moving towards each other - Riak has secondary indexes, CouchDB's BigCouch variant shards and scales brilliantly, etc. - but that is evolution for you.
Of course you can use these (somewhat, and very loosely) interchangeably, but ye gods, wouldn't that be a dumb thing to do. As a simple exercise, imagine the entertainment associated with swapping out CouchDB for Berkeley DB. (And, in case you are wondering, I was actually asked to help with this recently because "They're both DBs, right? What's the difference?")
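A rough sketch of why that swap is not a drop-in replacement, assuming two toy in-memory stores. Both classes, the `city` field, and the sample users are made up for illustration; they are not the real CouchDB or Berkeley DB APIs. The point is only that a key-value store answers "what is under this key?", while a document store can be asked about the contents of the documents themselves.

```python
import json
from typing import Optional


class KeyValueStore:
    """Toy key-value store: values are opaque bytes, lookups are by key only."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def keys(self) -> list:
        return list(self._data)


class DocumentStore:
    """Toy document store: it understands document fields and can filter on them."""

    def __init__(self) -> None:
        self._docs = {}

    def put(self, doc_id: str, doc: dict) -> None:
        self._docs[doc_id] = doc

    def find(self, **criteria) -> list:
        # Return every document whose fields match all the given criteria.
        return [
            doc for doc in self._docs.values()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]


def find_users_by_city_kv(store: KeyValueStore, city: str) -> list:
    """With a key-value store, the application does the scan and the decoding."""
    matches = []
    for key in store.keys():
        doc = json.loads(store.get(key))
        if doc.get("city") == city:
            matches.append(doc)
    return matches


# Hypothetical sample data, loaded into both stores.
users = [
    {"_id": "u1", "name": "Ann", "city": "Oslo"},
    {"_id": "u2", "name": "Bob", "city": "Lima"},
]
kv = KeyValueStore()
docs = DocumentStore()
for user in users:
    kv.put(user["_id"], json.dumps(user).encode("utf-8"))
    docs.put(user["_id"], user)

oslo_from_docs = docs.find(city="Oslo")
oslo_from_kv = find_users_by_city_kv(kv, "Oslo")
```

Both calls return the same user, but in the key-value case the serialization, the full scan, and the filtering all moved into application code. That is the kind of work a "simple" swap quietly hands back to you.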
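Which, I believe, brings us back to the original problem - we now have a generation of people who have been raised to believe two things:
- Data gets persisted in a database, just use an ORM / SQL to get at it
- Which database you use is about religion/money/advertising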
Enter MongoDB. It's document-oriented, has a nice SQL-y syntax, and is easy enough to use that people can Just Do It without too much trouble. And consequently, they are doing it for everything. Everything, I tell you.
Need a distributed file system? Use GridFS on MongoDB. (ed: Please don't)
Need a distributed key-value store for your cache? Use MongoDB.
Transactionality? WTF, go ahead and use it.
etc., etc.
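And people are actually using it that way. It's the bass-ackwards solution to persistence that goes back to the SQL world:
- Install MongoDB
- Retrofit every damn thing you need using it
- Wonder why performance sucks (or scalability, or why your code suddenly got really complex)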
Now that I've gotten this far, do note that this problem is endemic to the NoSQL movement. MongoDB happens to have pretty huge mind-share, but hey, it's happening with other DBs too. John Wood (of Signal fame) has a post up about how they moved to CouchDB, and are now moving off. It's well written, carefully put together, and by no means a rant against CouchDB. It's just that CouchDB was not the correct fit for their problem space, and MongoDB might actually be.
Which really should hammer home my point about NoSQL being solution-oriented. The fact that Signal went into production with CouchDB before discovering their issues is just icing on the cake.
The bottom line? Think about your problem space first. You might actually be better off with MongoDB. Or MySQL. Or Cassandra. Or Voldemort. Or whatever.
Just. Think. First.
Then make your choice.
That is all...