In today's era of cheap, highly available storage capacity, data keeps getting bigger with no end in sight. This bounty brings with it both benefits and difficulties. Organizations have more opportunities than ever to acquire insightful business intelligence (BI) by taking advantage of unprecedented access to information. Yet vendors and business users face challenges in managing and utilizing data in the age of cloud computing. What's going on in the big data management space, and how are NoSQL technology providers rising to meet the challenge?
Choosing the right database is all about use case
Selecting from the wide range of databases (DBs) available is the first tough decision a business customer faces. Cameron Peron, vice president of marketing at Redis Labs, mentioned three NoSQL databases that are all gaining traction in the enterprise space: Cassandra, MongoDB and Redis. It's good to have choices because, sometimes, no single database can deliver everything customers need for their use case.
"In many cases with enterprise customers, it isn't about which NoSQL database is better or whether to use NoSQL versus a relational database," Peron said. "We see that most of our developers use a pretty wide variety of databases. Redis is used specifically for high-performance use cases like real-time analytics, [Internet of Things] and massive data ingestion and caching where speed is essential. In conjunction with that, we see other databases being used to solve different problems." Managing multiple databases for the same organization increases costs and complexity. But this polyglot approach may be one of the only ways to capture all the value delivered by the diverse data streams ingested by the typical enterprise.
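The caching use case Peron mentions typically follows the cache-aside pattern: check the fast store first, and fall back to the slower primary database only on a miss. A minimal sketch in Python, using a plain dict as a stand-in for a Redis client (in practice this would be `redis.Redis()` from the redis-py package; `slow_fetch` and `CACHE_TTL` are illustrative names, not part of any real API):

```python
import time

# In-memory stand-in for a Redis client; a real deployment would use
# redis.Redis() from redis-py instead of this dict (an assumption here).
cache = {}
CACHE_TTL = 60  # seconds a cached entry stays fresh (illustrative value)

def slow_fetch(key):
    """Simulates an expensive query against the primary database."""
    time.sleep(0.01)  # stand-in for real query latency
    return f"value-for-{key}"

def get_with_cache(key):
    """Cache-aside: try the cache first, fall back to the slow store."""
    entry = cache.get(key)
    if entry is not None:
        value, stored_at = entry
        if time.time() - stored_at < CACHE_TTL:
            return value  # cache hit
    value = slow_fetch(key)            # cache miss: query the primary store
    cache[key] = (value, time.time())  # populate the cache for next time
    return value

first = get_with_cache("user:42")   # miss: goes to the slow store
second = get_with_cache("user:42")  # hit: served from the cache
print(first == second)  # → True
```

The same polyglot idea Peron describes falls out naturally: the cache layer and the system of record are different databases, each doing the job it is fastest at.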
Which database solutions will survive and thrive?
The current capability of the database itself isn't the only thing to consider. According to Peron, the community that supports a technology is essential to its survival. "It's really about the open source community that's behind the DB -- the fact that it works in and of itself independently of the vendors that are working there that contribute code back into the project. The major open source DB offerings are attractive to a wide variety of developers. It's that strong interest that helps these communities to grow," he said.
Vendor lock-in accompanied by price hikes is always a concern. But the specter of relying on a technology that might falter and become outdated simply because it is no longer supported is the real nightmare scenario. In the end, database selection may come down to a popularity contest.
Big data solutions are still in their adolescence
What about the tooling that supports the use of big data? Clayton Coleman, lead engineer of OpenShift at Red Hat, admitted that the accelerated pace of change hasn't left a lot of time for data tools to grow up. "Part of the problem with a lot of data solutions is that they are still in the early phases of maturation," he said. "It took 20 years to bring Linux to the point where someone could download, install and run it on a supported platform in a predictable and consistent way."
The enterprise community can't afford to wait another two decades for tooling to catch up with burgeoning data requirements. That's one reason the OpenShift project exists. According to Coleman, big data must be able to run frameworks that can use the capacity of the data center to do batch jobs, data processing, analytics and warehousing -- and this needs to happen now. The open source community is working with an entire ecosystem from OpenStack to Kubernetes to help bring tools to maturity as quickly as possible. "The fact that it is still difficult to set up and run these very complex and very powerful pieces of software is another angle where OpenShift is striving to make things easier," said Coleman.
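The batch workloads Coleman describes are exactly what the Kubernetes Job resource expresses. A minimal sketch of a Job manifest, assuming a containerized batch task (the image name, job name and command below are placeholders, not a real workload):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-etl              # illustrative name
spec:
  completions: 1
  backoffLimit: 3                # retry a failed pod up to three times
  template:
    spec:
      containers:
      - name: etl
        image: example.com/etl-runner:latest  # placeholder image
        command: ["python", "run_batch.py"]   # placeholder entrypoint
        resources:
          requests:
            cpu: "2"
            memory: 4Gi
      restartPolicy: Never       # Jobs require Never or OnFailure
```

Running the job is a single `kubectl apply -f job.yaml`; the scheduler then places the pod wherever the data center has spare capacity, which is the point Coleman is making.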
Integrating data solutions with BI is the essential next step
As databases fill with information and vendors strive to bring their tools up to speed, the average company is still struggling with the critical question: What's the point of having big data if it doesn't provide a competitive advantage?
Chad King at Ayoka Systems, a Dallas-Fort Worth-based custom software development firm, said this question remains unanswered for too many organizations. "Businesses are good at gathering data, but they're not doing anything meaningful with it," he said. King described one recent client using giant, unwieldy Excel spreadsheets to assess data -- with unsatisfactory results. The introduction of self-service BI tools that cut through the noise made a huge difference on a practical level for this client. End users could easily run a custom report or generate a visual representation -- such as a pie chart or pictograph -- then drill down to look at specific information, including date-specific data.
"When they started investigating their data with better reporting tools, they discovered an interesting fact," King said. "For several years in a row, they had a spike in orders for a particular product each July. With that piece of data in hand, they could start planning production and inventory to take full advantage of the increase in demand. It was their first experience with predictive analytics. Of course, then they started wondering what other patterns they might be missing."
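A seasonal spike like the one King describes can be surfaced with a few lines of aggregation. A minimal sketch in plain Python, assuming order counts have already been rolled up by (year, month); the numbers below are invented for illustration:

```python
from collections import defaultdict

# Hypothetical monthly order counts: (year, month) -> units ordered.
orders = {
    (2021, 6): 100, (2021, 7): 180, (2021, 8): 105,
    (2022, 6): 110, (2022, 7): 195, (2022, 8): 108,
    (2023, 6): 112, (2023, 7): 210, (2023, 8): 115,
}

# Average orders for each calendar month across all years.
totals, counts = defaultdict(int), defaultdict(int)
for (year, month), units in orders.items():
    totals[month] += units
    counts[month] += 1
monthly_avg = {m: totals[m] / counts[m] for m in totals}

# Flag months whose average runs well above the overall mean -- a crude
# stand-in for the recurring-spike pattern the BI tool surfaced.
overall = sum(monthly_avg.values()) / len(monthly_avg)
spikes = [m for m, avg in monthly_avg.items() if avg > 1.3 * overall]
print(spikes)  # → [7], i.e. the July spike
```

A real BI tool does the same grouping behind a drill-down interface; the value King points to is that the pattern repeats year over year, which is what makes it usable for production and inventory planning.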
With big data getting bigger every day, more and more enterprises will automate reporting and make analytics part of the decision-making workflow. But they will also explore the new possibilities that open up as databases and tools mature. For NoSQL technology providers and businesses alike, the underlying fear of missing something valuable in the massive pile of data will continue to drive innovation and improvement for years to come.