In the last few weeks I visited several Cloud and Big Data conferences. The Big Data Innovation in Boston in particular gave me a lot of insight. Some people consider only the technology side of Big Data, such as Hadoop or Cassandra. The real driver, however, is a different one. Business analysts are discovering Big Data technologies as the means to leverage tons of existing data and to ask questions about customer behavior and all sorts of relationships in order to drive business strategy. In doing so, they are pushing their IT departments to run ever bigger Hadoop environments and ever faster real-time systems.
What’s interesting from a technical perspective is that ad-hoc analytics on existing data is allowed to take some time. However, ad-hoc implies people waiting for an answer, meaning we are talking about minutes, not hours. Another interesting insight is that Hadoop environments are never static or standalone. Most companies take in new data on a continuous basis via technologies like Flume. This means Hadoop MapReduce jobs need to keep up with the data flow, either by adding more hardware or by optimizing the jobs.
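To make the "keep up with the data flow" point concrete, here is a rough back-of-the-envelope sketch in Python. All numbers and function names are hypothetical, and it assumes throughput scales linearly with the number of nodes, which real clusters only approximate. It simply compares how many nodes are needed before and after a job has been optimized.

```python
import math


def keeps_up(ingest_gb_per_hour: float, job_gb_per_hour: float) -> bool:
    """Return True if the job processes data at least as fast as it arrives."""
    return job_gb_per_hour >= ingest_gb_per_hour


def nodes_needed(ingest_gb_per_hour: float, per_node_gb_per_hour: float) -> int:
    """Smallest node count whose combined throughput matches the ingest rate.

    Assumes linear scaling across nodes, which is an idealization.
    """
    return math.ceil(ingest_gb_per_hour / per_node_gb_per_hour)


# Hypothetical figures, for illustration only.
ingest = 500.0    # GB per hour arriving, e.g. via Flume
per_node = 40.0   # GB per hour one node can process with the current job

# Option 1: add hardware until the cluster keeps up.
print(nodes_needed(ingest, per_node))        # 13

# Option 2: optimize the job (here: 1.5x faster per node) and need fewer nodes.
print(nodes_needed(ingest, per_node * 1.5))  # 9
```

In this made-up scenario a 1.5x faster job saves four nodes, which is why optimizing jobs and adding hardware are two sides of the same trade-off.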
Big Data has many drivers, but the two most important ones are Analytics and the Technical Need for Speed. Let’s look at these and the resulting takeaways: