Business at the Speed of Insights
What if you could streamline the way your company does business? What if you could move into adjacent markets, expand new revenue opportunities, and do it from a data-driven perspective? What if you could identify a way to improve your customers’ experience? What if you could get these insights well ahead of your competition?
I would venture to guess that one of the hottest topics at any CIO staff meeting today is analytics: the hope of answering the “what if” questions by taking the valuable data IT has been managing all these years and turning it into actionable information to help propel the business. While this may seem like a new topic, analysis of data has been around for years, as I am sure many of you already know. When I was a young Unix admin, for example, I wrote several scripts to mine data from a variety of flat-file data sources. The scripts wrote the newly captured data to a new text file, in a format that could be fed into another script, which ultimately massaged it into one of the freely available graphing tools to represent the findings in a meaningful and compelling manner. This is what some would call “old school” analytics; others would call it the beginnings of business intelligence. Whatever you call it, it was mining data and logs and converting them into information for the business.
While those scripts worked for a small number of source files, they were very rudimentary and would never work in today’s environment of multiple terabytes and petabytes of data. However, the need to extract information from our data still exists today, probably more than ever. Tools have therefore been created to help achieve these results at massive scale. Hadoop, Spark, Hive, Tableau, NoSQL databases, and others all provide a method and approach for peering into the data to extract information. Just about every CIO I speak to is either exploring or deploying an analytics farm in the data center. The problem is, as one of my peers so aptly put it, “what questions do I ask of my data?”
Today’s digital world has changed so much since my young Unix admin days in the ’80s that it can actually be quite daunting to consider where to begin. Today we are keeping more data than ever, drawing on a huge selection of new data sources, and feeling the ever-present desire to gain new insights to drive our business. In the big data world these are referred to as Volume, Variety, and Velocity.
One of the key drivers of the analytics phenomenon is the fact that we are creating and keeping more data today than ever before. Just a few years ago, IBM research published a rather startling fact: we as a society are creating 2.5 quintillion bytes, or 2.5 exabytes, of data every day. The EMC Digital Universe study, with research and analysis by IDC, projects a doubling of our digital universe every two years to a whopping 44 zettabytes, or 44 trillion gigabytes, by 2020, containing as many digital bits as there are stars in the universe. This may sound completely overwhelming, but this is where the opportunity exists for organizations to uncover new business insights. The data we create, the data we keep, and the data we manage become the informational basis for answering all of the “what if” questions. They also tell us what data we are not capturing and need to capture to be more successful and more competitive in the market today. Answering the “what if” moves an organization from the business intelligence models that tell us
• How we performed last quarter
• How many units we sold
• Which situational problems we ran into
to the data science of analysis, which looks at predictive opportunities, business modeling, and scenario analysis.
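For readers who want to sanity-check the data-volume figures quoted above, here is a minimal Python sketch. It assumes decimal (SI) storage units and the short-scale meaning of “quintillion” and “trillion”; the helper `projected_size` is a hypothetical name introduced only to illustrate what “doubling every two years” implies, and it is not taken from the IBM or IDC material.

```python
# Decimal (SI) byte units, assumed for the figures quoted in the text.
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21

# Short-scale number names (assumption: US usage).
TRILLION = 10**12
QUINTILLION = 10**18

# 2.5 quintillion bytes per day, expressed in exabytes.
daily_bytes = 25 * QUINTILLION // 10
daily_exabytes = daily_bytes / EXABYTE

# 44 zettabytes, expressed in trillions of gigabytes.
universe_bytes = 44 * ZETTABYTE
universe_trillion_gb = universe_bytes // GIGABYTE // TRILLION


def projected_size(start_size, years, doubling_period_years=2):
    """Hypothetical helper: size after `years` if it doubles every period."""
    return start_size * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    print(f"Daily data created: {daily_exabytes} EB")
    print(f"Digital universe: {universe_trillion_gb} trillion GB")
    # Doubling every two years means a fourfold increase over four years.
    print(f"Growth factor over 4 years: {projected_size(1, 4)}x")
```

Both conversions agree with the figures as quoted: 2.5 quintillion bytes is 2.5 exabytes, and 44 zettabytes is 44 trillion gigabytes.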
Beyond the basics
This is the beauty of where our industry is headed, and there are storage platforms specifically designed to help propel the answer to the “what if”, be it PCIe NVMe rack-scale flash storage or a data lake storage solution purpose-built to support multiple protocols, allowing for in-place analytics. I know there is a lot of discussion and “hype” about just using a free distro of Hadoop with a cluster of “pizza boxes” containing both storage and compute. While this may sound appealing because it is low cost, repurposes old or aging equipment, and is free, one really needs to consider whether this is a “go forward” infrastructure to bet the business on. I certainly believe a small startup cluster of “pizza boxes” is quite all right at a very small scale, but where I believe this breaks down is when it begins to grow out of control and into the 200, 300, 500, or 1,000 TB range. Manageability becomes a problem; the “low-cost” and “free” no longer exist, OPEX may spike just to keep the racks of “pizza boxes” running, and then there is the question of scale. In this scenario storage and compute scale together; there is no opportunity to scale compute separately from storage. The reality is you may not be getting the valuable, actionable information you desired from the outset.
My challenge to you, as CIOs of your organizations, is to approach this not with a “budget” in mind, or by making low-level technology decisions, but with a desired business outcome in mind, and to clearly identify the goals you intend to achieve with this new “platform”. IT is changing. We need to pivot and begin to look at our roles as supporting actors in the big picture of the business, and not as “just the IT guy”.