Handling the four 'V's of big data: volume, velocity, variety, and veracity
By Jason Tee
Has your enterprise just started to grasp the enormous scope of big data? Do you realize that it's expanding exponentially even as you read this? Does your organization feel overwhelmed? It's not unusual for enterprises to have big data and simply not know what to do with it. The knee-jerk response may be to throw money at the problem. You can certainly hire consultants out the wazoo, outsource your data management, and simply hand over the keys to your data to someone claiming to be an expert. This might seem to make the whole mess vanish. However, sending your data into a black hole doesn't solve your business problems.
Any firm can move your data to a bright, shiny new database. That's not good enough if the data can't be queried quickly and effectively when your organization needs to actually use the data.
Jason Tee, enterprise software architect
Hiring a consultant could make or break your data
A consulting firm with real big data expertise can help position your company for success. One that just talks a good game will charge big money without delivering value from your data. It's easy to get suckered by a pitch full of buzzwords. Your best defense is self-education. Here is a quick primer that may help you determine whether a data consultant is really going to help your enterprise or if they are all trigger and no barrel.
Ask your consultant how they handle the four 'V's
IBM has a nice, simple explanation for the four critical features of big data: volume, velocity, variety, and veracity. Big data is always large in volume. It actually doesn't have to be a certain number of petabytes to qualify. If your store of old data and new incoming data has gotten so large that you are having difficulty handling it, that's big data. Remember that it's going to keep getting bigger. Your consultant needs to recommend a scalable solution that can grow with your data.
Velocity, or speed, refers to how fast the data is coming in, but also to how fast you need to be able to analyze and use it. If you have one or more business processes that require real-time data analysis, you have a velocity challenge. Solving this issue might mean expanding your private cloud using a hybrid model that allows bursting for additional compute power as needed for data analysis. Your consultant may need to offer suggestions for hardware, software, and business process changes to handle today's high-speed data.
Variety refers to the range of sources and formats feeding your databases. That might be embedded sensor data, phone conversations, documents, video uploads or feeds, social media, and much more. Variety in data means variety in databases: you'll almost certainly need to add a non-relational database if you haven't already done so. Can your consultant provide the right set of solutions to store, maintain, and analyze each type of data your business uses? Can they offer new ideas for how your business can use existing data or what new types of data you should collect and analyze?
Veracity is probably the toughest nut to crack. If you can't trust the data itself, the source of the data, or the processes you are using to identify which data points are important, you have a veracity problem. One of the biggest problems with big data is the tendency for errors to snowball. User entry errors, redundancy, and corruption all affect the value of data. Your consulting firm needs to help you clean your existing data and put processes in place to reduce the accumulation of dirty data going forward.
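To make "cleaning dirty data" a little more concrete, here is a minimal Python sketch of the kind of check a consultant might put in place. Everything in it is an assumption made for illustration: the CustomerRecord fields, the validity rules in is_valid, and the choice to keep the first record seen for each ID. A real pipeline would use rules agreed with the business.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    # Hypothetical fields, chosen only for this example.
    customer_id: str
    email: str
    age: int


def is_valid(record):
    """Basic sanity checks; the rules and thresholds are illustrative."""
    return (
        bool(record.customer_id.strip())
        and "@" in record.email
        and 0 < record.age < 120
    )


def clean(records):
    """Split records into kept and rejected lists.

    A record is rejected if it fails is_valid or if its normalized ID
    has already been kept (redundant entry). The first valid record
    for each ID wins.
    """
    seen_ids = set()
    kept, rejected = [], []
    for record in records:
        key = record.customer_id.strip().lower()
        if not is_valid(record) or key in seen_ids:
            rejected.append(record)
            continue
        seen_ids.add(key)
        kept.append(record)
    return kept, rejected
```

Keeping the rejected records, rather than silently dropping them, lets you see which sources produce the most dirty data and whether the error rate falls over time.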
Mix and match to get your big data just right
It's always nice to hire a consultant with experience handling every issue you currently face. But many of big data's problems are new. This means getting a one-to-one match between a consulting firm's previous projects and your enterprise's project isn't always possible. This is a good reason to find a firm that uses a collaborative approach to problem solving. If your data consultant is telling you they have an all-inclusive solution for big data, they probably don't understand the complexity involved. This industry is still in its infancy and is already changing so rapidly that it's impossible for one firm to keep up. Your consultant should be willing and able to bring in support from best-of-breed vendors and partners to customize a solution for your enterprise. Remember that one of the things you are paying for is their industry network of expert contacts.
Ask how they measure success
A consultant who stands behind their work will have no issue setting up milestones for your project. These shouldn't just be tied to basic actions – they should be tied to metrics with business value for your enterprise. For example, any firm can move your data to a bright, shiny new database. That's not good enough if the data can't be queried quickly and effectively when your organization needs to actually use the data. The process of setting milestones will actually help you clarify what you want to achieve with your big data. So, above all, don't rush through this part of negotiations.
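As one illustration of a milestone tied to a metric rather than a basic action, the sketch below times a query repeatedly and checks whether its 95th-percentile latency meets an agreed target. It is a rough Python sketch under stated assumptions: run_query stands in for whatever call actually hits your database, and the function names, the number of runs, and the target value are all hypothetical.

```python
import statistics
import time


def measure_latencies(run_query, runs=20):
    """Call run_query repeatedly and return each call's duration in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - start)
    return latencies


def meets_milestone(latencies, p95_target_seconds):
    """Return (passed, p95) for a list of at least two latencies.

    statistics.quantiles with n=20 returns 19 cut points; the last one
    (index 18) is the 95th percentile.
    """
    p95 = statistics.quantiles(latencies, n=20)[18]
    return p95 <= p95_target_seconds, p95
```

A milestone written this way might read "the month-end sales report query returns in under two seconds at the 95th percentile", which both sides can verify by running the same measurement.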
What key skills do you look for in a big data consultant? Let us know.
01 Aug 2013