Big Data; Are You Creating a Garbage Dump or Mountains of Gold?

You’re not really sure how it happened, but some time between last year and the summer of 2011 you were suddenly facing a big data problem, or you were being told you were facing a big data problem, or more accurately you were being told that you needed a big data solution.

Funny thing was that you hadn’t really done anything drastic over the last couple of years that would seem to indicate a tsunami of data was about to breach your storage floodgates, but then again it wasn’t like you watched yourself going bald either.

On the other hand it is hard to argue with the old adage that knowledge is power and that data leads to information which can lead to knowledge which can hopefully translate into making highly informed decisions.

It is also true that there is a massive amount of data traveling in and out of the business that seems to ghost its way through the system like an angst-ridden Goth teen with a passion for walks in dark alleys – alone at night – while wearing all black and really dark purple.

So, does your data hold some big, magical, gold nuggets of information that will radically transform the business?

Will you single-handedly (using MPP, NoSQL, schema-less architectures) Hadoop your way to greatness, while getting a chance to brush up on your Python scripting skills?

Or is this whole thing going to end up like your company’s previous – highly successful? – ERP, CRM, CMDB, PKI, and <insert hype technology du jour> deployments?

You’re dying to find out, aren’t you?

Before you go leaping into the murky abyss of your company’s data swamp, you may just want to take a deep breath and do a little planning.

  • First off, what question(s) are you hoping and/or needing to answer? If you don’t know what question(s) you are trying to answer why would you exert any effort on trying to find an answer?
  • Do you believe you have the data to actually answer the question? This is a very common theme in analytics and BI: you want an answer, so you spend a ton of money to churn, prep, store, map-reduce, analyze, and visualize data, only to find out that you cannot answer the questions you really need answered because you simply do not collect the data, regardless of the myriad tools, consultants, and sharply dressed enterprise sales weasels that come a-knocking.
  • Assuming you were able to get the answer, could you act on the information? Another common problem is that many companies are not organizationally structured to support acting on business intelligence, especially if it requires iteratively dialoguing with the data.
  • Do you have the skills in-house to understand your data, infrastructure, and analytic requirements? Why purchase and deploy something you cannot actually administer and use?
  • What tools do you currently use for data management, analytics and business intelligence? 
  • What is the gap between your data requirements and the incumbent tools’ capabilities?
  • Do you have the skills in-house to research, analyze, test, and deploy a big data solution?
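
If you want to make the "do you actually have the data?" question concrete before anyone signs a purchase order, a few lines of Python will do. This is only a minimal sketch: the function name, the business questions, and the field names below are all invented for illustration, so swap in your own.

```python
# Hypothetical example: every question and field name here is made up.
def find_data_gaps(questions, collected_fields):
    """Return, per question, the required fields that are not collected."""
    collected = set(collected_fields)
    gaps = {}
    for question, needed in questions.items():
        missing = sorted(set(needed) - collected)
        if missing:
            gaps[question] = missing
    return gaps


# Each business question maps to the data fields needed to answer it.
questions = {
    "Which customers are likely to churn?": [
        "customer_id", "last_login", "support_tickets",
    ],
    "Which products sell together?": ["order_id", "product_id"],
}

# The fields the business actually collects today (also assumed).
collected_fields = ["customer_id", "order_id", "product_id", "last_login"]

for question, missing in find_data_gaps(questions, collected_fields).items():
    print(f"Cannot answer '{question}' yet; missing: {', '.join(missing)}")
```

Any question that shows up in the output is one that no tool, consultant, or cluster will answer until you start collecting the missing fields.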

Assuming you can answer those questions, it might be time to start doing some research on big data solutions, but before you run out and grab the latest shiny object, make sure you know what problem you’re actually trying to solve.

That seems like a lot to think about, but would you rather spend your time up front doing all the research, planning, and expectation setting to increase your probability of success, or just go for it and take the pain in the backside when you watch hundreds of thousands to millions of dollars and 18 months flush down the drain with nothing to show for it except a metric ton of crappy free big data vendor schwag?
