What is NoSQL?
NoSQL is a term that describes a variety of non-relational databases that act as alternatives to traditional data storage systems, which organize data into multiple tables and use primary and foreign keys to create relationships between them. In traditional database systems, Structured Query Language (SQL) is used to navigate through primary key and foreign key relationships. With NoSQL databases, there are no traditional foreign key relationships to navigate, hence the moniker NoSQL.
NoSQL solutions were originally created by companies like Google and Amazon to address massive scalability challenges related to their own business models. However, as more organizations have begun dealing with high levels of web traffic, big data, and social media, the need for NoSQL has grown, and the number of options has proliferated. Today, there are many kinds of NoSQL solutions available for different types of data and use cases, with the most popular NoSQL architectures being document, graph, wide-column, and key-value stores.
What makes NoSQL different?
A NoSQL solution doesn't rely on data laid out in an orderly fashion in tables that are all linked closely together in a relational manner, which is the standard SQL-based approach. With tables, users have the ability to carry out very refined and detailed queries to extract data in the form of reports. However, SQL databases can be cumbersome to maintain, especially as the amount of data increases exponentially. And a bigger problem than maintenance is that they simply can't scale linearly beyond a certain point.
Kelly Stirman, Director of Product Marketing at MongoDB, describes why keeping data up-to-date in a traditional database can be a challenge. "The relational data model spreads data across many tables. There might be thousands of tables for one application. If you need to update an object in the data layer, you're coordinating the updating of data across many tables in one operation. You need sophisticated transactions to ensure the integrity of that update across many tables." Stirman says a NoSQL approach using document stores is less complex. "The data model is very different. Instead of trying to map a large, complex schema to objects, you have a direct mapping of documents to objects. Updating an object is as simple as updating a single document."
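Stirman's contrast can be sketched in a few lines of Python. This is only an illustration using in-memory dictionaries, not MongoDB's API or any real database driver; the order data, table names, and function names are all hypothetical. It counts how many separate records have to be written to mark one order as shipped under each layout.

```python
import copy

# Hypothetical relational layout: one order is spread across several tables.
customers = {1: {"name": "Ada"}}
orders = {100: {"customer_id": 1, "status": "open"}}
order_items = {
    1000: {"order_id": 100, "sku": "A-1", "qty": 2},
    1001: {"order_id": 100, "sku": "B-7", "qty": 1},
}


def ship_order_relational(order_id):
    """Update every row that holds part of the order; return rows written."""
    written = 0
    orders[order_id]["status"] = "shipped"
    written += 1
    for item in order_items.values():
        if item["order_id"] == order_id:
            item["shipped"] = True
            written += 1
    return written


# Hypothetical document layout: the same order is one nested document.
order_documents = {
    100: {
        "customer": {"name": "Ada"},
        "status": "open",
        "items": [
            {"sku": "A-1", "qty": 2},
            {"sku": "B-7", "qty": 1},
        ],
    }
}


def ship_order_document(order_id):
    """Build the updated document, then replace the stored one in one write."""
    doc = copy.deepcopy(order_documents[order_id])
    doc["status"] = "shipped"
    for item in doc["items"]:
        item["shipped"] = True
    order_documents[order_id] = doc  # the only write to the store
    return 1
```

In the relational version, the rows written grow with the number of line items and would need a transaction to stay consistent in a real system. In the document version, the object maps to a single stored record.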
Benefits of using NoSQL
Horizontal scalability, the ability to build out quickly and easily at a low cost, is a leading perk of NoSQL. Eric Redmond, coauthor of Seven Databases in Seven Weeks, explains the promise of NoSQL simply and succinctly: "Rather than scaling vertically and spending millions of dollars on an Oracle box, you can buy a federation of cheap hardware and distribute your database. There's only so large you can build one machine. But you can keep gluing on servers or data centers forever."
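One way to picture "distributing your database" is hash-based sharding, where each key is assigned to one of several inexpensive nodes. The sketch below is a simplified assumption about how such placement could work, not a description of any particular product; `ShardedStore`, `node_for_key`, and the node names are hypothetical, and each "node" is just an in-memory dictionary.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick a node by hashing the key (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


class ShardedStore:
    """Toy key-value store that spreads keys across several nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # Each node is modeled as its own dictionary.
        self.data = {node: {} for node in self.nodes}

    def put(self, key, value):
        self.data[node_for_key(key, self.nodes)][key] = value

    def get(self, key):
        return self.data[node_for_key(key, self.nodes)].get(key)


store = ShardedStore(["node-a", "node-b", "node-c"])
for i in range(100):
    store.put(f"user:{i}", {"id": i})
```

Modulo placement is the simplest possible scheme: adding a node changes where most keys live. Production systems generally use techniques such as consistent hashing so that only a fraction of the data moves when servers are added.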
From a business standpoint, the ability to scale in a modular fashion over time is very attractive due to lower costs and greater flexibility. Plus, there are additional benefits of various NoSQL solutions:
Data persistence naturally preserves the historical version of data when a new, updated structure is created.
Schema-less architecture means data does not have to be defined down to the last detail to be stored in the database, and the data can be easily migrated as needed with little or no downtime.
NoSQL is cloud-friendly for IaaS or PaaS deployment using solutions like Amazon Web Services or Rackspace to further reduce capital expenditures.
Non-relational DBs are ideally structured for use with virtual machines and load balancing for high availability of data and effective use of available memory.
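The schema-less point in the list above can be illustrated with a small sketch. It assumes a document collection modeled as a plain Python list; the `insert` and `find` helpers and the sample fields are hypothetical and do not correspond to any specific database's API.

```python
# A "collection" with no predefined schema: documents may differ in shape.
collection = []


def insert(doc):
    """Store a copy of the document as-is, with no schema check."""
    collection.append(dict(doc))


def find(field, value):
    """Return documents whose field equals value; missing fields never match."""
    return [doc for doc in collection if doc.get(field) == value]


insert({"name": "Ada", "email": "ada@example.com"})
# A later record adds a field without altering any existing documents.
insert({"name": "Grace", "email": "grace@example.com", "twitter": "@grace"})
```

Adding the `twitter` field required no table alteration and no change to the first document, which is the flexibility the list item describes. The trade-off is that the application, not the database, has to cope with documents that lack a given field.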
Potential NoSQL pitfalls
Because NoSQL is still an emerging technology, there are a number of mistakes that businesses tend to make. The most common error is using NoSQL when it isn't really necessary. The idea of being able to scale up to incredible levels can be tempting. But if a current SQL database can be scaled to handle expected traffic at a reasonable price, sticking with it may be wise.
New, open source technology can be attractive when it appears cost-effective and innovative up front. However, there can be hidden costs down the road. In particular, lack of standards can be a concern since poor portability can easily lead to vendor lock-in and other challenges.
Dan McCreary, coauthor of Making Sense of NoSQL, says another issue businesses often encounter is a lack of developer familiarity with how to write code for NoSQL DBs that store entire documents. "Teams that have only Java and Hibernate experience may end up using the tools they know, modeling things with UML and generating Java classes that automatically do tens of thousands of inserts where one could be done." In other words, they are using old tools to do new things, adding unnecessary complexity and wasting resources.
Best practices and strategies for NoSQL implementation
How can an organization decide when and how to use NoSQL? Here are a few tips that can help:
Understand the use cases in question before shopping for a NoSQL solution. Stick with SQL unless there are compelling cost-savings and performance enhancements on the table.
If a database requires massive scalability, handles lots of temporary data, stores lots of objects, or needs to run queries that can't be done in SQL, seek out NoSQL alternatives.
Question vendors vigorously about how they are implementing their solution. Consider working with third-party vendors that graft on standards to ensure greater compatibility with other systems and solutions.
Finally, be prepared for sweeping changes. As McCreary points out, "It's not just the database that has to change. It's the development architecture, tools, training, and people."