Big Data - A New Evolution Every 6 Months

I've never personally lived through a disruptive technology phase that is evolving as quickly as Big Data. The client requirements we heard 12 and even 6 months ago are not the same ones we are hearing today.

I looked back at the Cowen report on Big Data, authored by Peter Goldmacher back in July 2011. At the time I thought it was the most comprehensive overview of the Big Data wave, with a balanced view of the business and technology landscape, and I still do. There are a few points that I want to re-highlight and update:

1) There is an assertion that Big Data evolves the database from a product to a category. While this may sound like fluff, it is an insightful statement. In fact, the way we talk about Big Data has already shifted from individual products to a broader category, which gives the point credence.


2) Peter talks about the diversity of data sets and the related lack of metadata. This remains a primary obstacle to unleashing the value of Big Data. Many players in the market have mastered a few data sets, but cannot support them all (certainly not in an industry-specific context).


3) There is an assertion that emerging solutions are cheaper and that it will therefore be hard for the traditional enterprise players to excel in Big Data. While this was plausible 6 months ago, my experience with clients says it is now false. It is false for 3 reasons:

a) Enterprise clients want the backing of a deep development team with enterprise experience before they will trust a third party with their most valuable data sets. There is too much risk if things go wrong (think data security, data loss, corruption, etc.).

b) While Hadoop is critical to Big Data, it is not the endgame for an enterprise. Analytics and outcomes are the endgame. I believe it's hard to deliver analytics for the enterprise if a company has not lived in that world for some time. This is why large enterprises are actually looking to the more established players for solutions.

c) The big savings that I see for clients in Big Data come on the hardware side, specifically around storage. Therefore, I don't think a comparison of software license pricing is sufficient for the analysis. Let there be no confusion: if a vendor is betting their Big Data play on the notion that clients will leverage expensive storage for the big data sets, they will fail. Take heed, EMC. The report goes on to compare average storage prices: main memory at $40/GB, SSD at $2.60/GB, and hard disk at $1.37/GB. When you bring storage into the TCO discussion, the perceived software license savings become a rounding error. A back-of-the-envelope sketch follows below.
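
To make the rounding-error point concrete, here is a minimal back-of-the-envelope sketch in Python using the report's quoted per-GB prices. The 500 TB data volume and the $250k license-savings figure are my own hypothetical assumptions, purely for illustration.

```python
# Back-of-the-envelope storage TCO sketch using the report's quoted prices.
# The 500 TB footprint and the $250k license savings are hypothetical
# assumptions for illustration only.

PRICE_PER_GB = {
    "main_memory": 40.00,   # $/GB, as quoted in the report
    "ssd": 2.60,            # $/GB
    "hard_disk": 1.37,      # $/GB
}

DATA_VOLUME_GB = 500 * 1024          # assume a 500 TB big data footprint
ASSUMED_LICENSE_SAVINGS = 250_000    # hypothetical software license delta, in $

for tier, price in PRICE_PER_GB.items():
    storage_cost = DATA_VOLUME_GB * price
    ratio = storage_cost / ASSUMED_LICENSE_SAVINGS
    print(f"{tier:12s}: ${storage_cost:,.0f} "
          f"({ratio:.1f}x the assumed license savings)")
```

On these assumptions, parking the data set in an expensive tier costs tens of millions of dollars and dwarfs any license delta, which is exactly why storage belongs at the center of the TCO discussion.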



4) The report asserts that application development is more expensive in the Big Data world than in the traditional SQL world. Peter likens coding in the Big Data world to Cobol/Codasyl, given that it takes roughly 10x the code and effort compared to SQL. While this is accurate enough today, it is exactly the kind of thing that could change in the next 6 months. A rough illustration of the gap is sketched below.
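
To give a feel for the 10x claim, here is a rough sketch (not the report's example) of what a one-line SQL aggregation looks like when rewritten as a Hadoop Streaming job in Python. The table and field names are hypothetical, and the job invocation is abbreviated.

```python
# One line of SQL:
#   SELECT product_id, SUM(amount) FROM sales GROUP BY product_id;
#
# A rough Hadoop Streaming equivalent (hypothetical table/field names).
# Run as the mapper and reducer of a streaming job, e.g.:
#   hadoop jar hadoop-streaming.jar -input ... -output ... \
#     -mapper "python sum_by_product.py map" -reducer "python sum_by_product.py reduce"

import sys

def mapper():
    # Emit "product_id<TAB>amount" for each CSV input line: product_id,amount
    for line in sys.stdin:
        fields = line.strip().split(",")
        if len(fields) < 2:
            continue
        print(f"{fields[0]}\t{fields[1]}")

def reducer():
    # Sum amounts per product_id; streaming delivers mapper output sorted by key.
    current_key, total = None, 0.0
    for line in sys.stdin:
        key, value = line.strip().split("\t")
        if key != current_key:
            if current_key is not None:
                print(f"{current_key}\t{total}")
            current_key, total = key, 0.0
        total += float(value)
    if current_key is not None:
        print(f"{current_key}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[-1] == "map" else reducer()
```

The Python itself is not hard; the point is that a one-line declarative query turns into dozens of lines of imperative plumbing, which is the Cobol/Codasyl comparison Peter is drawing. SQL-on-Hadoop layers such as Hive were already starting to close this gap, which is part of why the economics here could shift quickly.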