Thinking about NoSQL databases (Classification and Use Cases)
What kind of animal is a NoSQL database?
I have often come across the following NoSQL classification (see Analysis of the NoSQL Landscape):
- Column-oriented stores
- Document databases
- Graph databases
- Key-value stores
While this classification is useful, something is often left out when people talk about NoSQL databases. What is hidden, or forgotten, is the object database.
It almost sounds as if 'object database' had a strong negative connotation; and, looking around, the reality matches: almost nobody uses one!
From my point of view, the NoSQL situation is quite clear: NoSQL databases, and even data grids, are object databases in disguise, with some revamping. The NoSQL movement looks like the hidden revenge of the object databases.
NoSQL Databases are Object Databases in Disguise
Each category of NoSQL product looks like a kind of object database:
- Column-oriented stores: column-oriented storage mode for object databases
- Document databases: normal storage mode of object databases
- Graph databases: distributed object databases
- Key-value stores: BLOB-like storage mode of object databases
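To make the analogy concrete, here is a minimal sketch of one object held in three of these storage modes. The `Customer` class, its fields, and the key format are hypothetical, chosen only for illustration; no real NoSQL product is involved, and plain Python structures stand in for the stores.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Customer:  # hypothetical domain object, for illustration only
    id: int
    name: str
    city: str


customers = [Customer(1, "Ada", "London"), Customer(2, "Linus", "Helsinki")]

# Document mode: each object is stored as a self-describing structure.
documents = [asdict(c) for c in customers]

# Key-value mode: each object is an opaque blob; the store only sees the key.
key_value = {
    f"customer:{c.id}": json.dumps(asdict(c)).encode("utf-8")
    for c in customers
}

# Column-oriented mode: one sequence per attribute, across all objects.
columns = {
    attr: [getattr(c, attr) for c in customers]
    for attr in ("id", "name", "city")
}
```

In all three cases the application still thinks in terms of `Customer` objects; only the physical layout changes, which is the sense in which these look like storage modes of an object database.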
So, in my humble opinion, NoSQL databases seem an awful lot like object databases.
Well, now, take a look at data grids. A data grid provides object storage, triggers, a (select) query mechanism, etc. So, again, I see a data grid as an in-memory distributed object database (usually without a transactional mechanism, although Oracle Coherence, for example, is going to provide one with the coming v3.6 release...).
In fact, a data grid is more than that; I think a (true) data grid is both an in-memory distributed object database and a distributed cache (for advanced caching on the client side), but that's another story.
The NoSQL promoters seem to be reinventing the wheel and distributing it under another name. Regardless, these NoSQL databases do carry some advantages over traditional object databases. These advantages are twofold:
- First, they may be easier to get started with, since developers can select the product that provides only the features and constraints they need.
- Second, developers can relax some properties and trade off the CAP constraints in the way that best fits their needs.
How to think about NoSQL database use cases?
NoSQL databases should be seen as the database for the middle tiers, or at least, as the secondary database in the middle tiers.
The middle tiers are the place for non-durable, transient data that stays close to the object model. This leads to the following main use cases:
- HTTP session storage,
- workflow storage (UI workflow),
- data caching (in order to alleviate the database burden).
For example, SourceForge.net has already chosen MongoDB, instead of Memcached, for data caching purposes.
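The data caching use case can be sketched as a small read-through style cache in front of a slow back-end lookup. This is only an illustration under stated assumptions: `ReadThroughCache`, `load_from_backend`, and the expiry policy are hypothetical names and choices, and an in-process dictionary stands in for the NoSQL store.

```python
import time


class ReadThroughCache:
    """Middle-tier cache: serve transient copies, reload from the back end on a miss."""

    def __init__(self, load_from_backend, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_backend  # hypothetical back-end lookup function
        self._ttl = ttl_seconds         # cached entries are transient by design
        self._clock = clock             # injectable so expiry can be tested
        self._entries = {}              # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]             # hit: the back end is not touched
        value = self._load(key)         # miss or expired: ask the back end
        self._entries[key] = (value, now + self._ttl)
        return value
```

Because the cached data is non-durable, losing the cache only costs extra back-end reads, which is why this kind of storage fits the middle tiers.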
These main use cases are a first step toward introducing NoSQL into an IT architecture. They are all about data storage in the middle tiers, that is, about using a database inside this tier. For these use cases, however, the data model of the middle tiers is seen as secondary to the data model of the back-end tiers (which host a mainstream database). In that view, the back-end database is the one that matters, and the storage choices of the middle tiers are merely optimizations.
Of course, the big challenge becomes promoting NoSQL as a primary database in the middle-to-backend tiers.
When NoSQL databases move to the foreground, will they replace mainstream databases in the back end, or will they just stay in the middle tiers? Closer to the back end, they no longer play a secondary role: they become just as important as the database in the back-end tiers, and the two must be kept synchronized. There are two common strategies for synchronizing the middle tiers with the back-end tiers: write-through (each write is propagated to the back end immediately) and write-behind (writes are queued and propagated later).
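The difference between the two synchronization strategies can be sketched as follows. This is a minimal illustration, not a production design: the class names are hypothetical, and plain dictionaries stand in for both the middle-tier store and the back-end database.

```python
from collections import deque


class WriteThroughStore:
    """Middle-tier store that updates the back end on every write."""

    def __init__(self, backend):
        self.backend = backend  # stands in for the back-end database
        self.local = {}         # stands in for the middle-tier store

    def put(self, key, value):
        self.local[key] = value
        self.backend[key] = value  # synchronous: both tiers agree on return


class WriteBehindStore:
    """Middle-tier store that queues writes and applies them later."""

    def __init__(self, backend):
        self.backend = backend
        self.local = {}
        self.pending = deque()  # writes not yet applied to the back end

    def put(self, key, value):
        self.local[key] = value
        self.pending.append((key, value))  # deferred: back end lags behind

    def flush(self):
        """Apply queued writes to the back end in arrival order."""
        while self.pending:
            key, value = self.pending.popleft()
            self.backend[key] = value
```

With write-through, every `put` pays the back-end latency but the tiers never diverge. With write-behind, `put` is fast, but until `flush` runs the back end is stale and queued writes are lost if the middle tier fails.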
On one hand, it's unfortunate that object database vendors have not taken advantage of the limits of relational databases to push their solutions under the new NoSQL umbrella.
On the other hand, the NoSQL movement puts an end to the Highlander Fallacy (from Jim Waldo): "there can be only one", this "one" being the relational database category. NoSQL brings quite a breath of fresh air! Developers are freer to focus on the constraints they have to respect and the features they need to do so, instead of always having to deal with the limits of the relational model.
The future looks interesting, because it is more open.