Helping Companies Succeed with Information Technology

As is often the case with early-stage emerging technologies, there are almost as many definitions of “Big Data” as there are authors, analysts, and bloggers who write about it, and technology or consulting companies that sell products or services even remotely associated with it. Although there is no agreed-upon definition of Big Data, most definitions share the three main factors that make Big Data unique:

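  • Volume (extremely large data volumes)
  • Velocity (continuously flowing data streams)
  • Variety (unstructured and semi-structured data)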

Other factors that distinguish Big Data are its Variability (complexity from diverse sources), Veracity (sorting out data from noise), and Value (ability to derive value from large data volumes).


Regardless of the definition, the potential impact and benefits of Big Data technologies are real. Most people know how Google, Amazon, and LinkedIn have leveraged these technologies to reinvent their industries. Many people are also aware that Hadoop, one of the core Big Data technologies, was inspired by papers Google published on its MapReduce and Google File System designs; Hadoop itself began as an open-source project created by Doug Cutting and Mike Cafarella.


According to an article in the December 16, 2014 Wall Street Journal 1:

 

 

The challenges that have prevented companies from being successful with Big Data technologies fall into three categories:


1.  The challenges that companies typically face getting value from any early-stage emerging technology (e.g., adoption, scalability, integration, expertise)

2.  The challenges that companies have always had being successful with Data Technologies, particularly Data Quality and Data Governance

3.  Challenges that are unique to Big Data:



  Big Data

 

INSIGHTS


A lot of attention has been paid to the definitions of “Big Data” and to the new Infrastructure technologies that enable businesses to process it. In reality, a business will realize value from Big Data only when it takes action based on the new Insights gained from the Analyses that Big Data makes possible.

Architecture Layer | Big Data Components | Legacy Data Technologies

Application

  • Prescriptive Analytics
  • Streaming Analytics
  • Descriptive Analytics
  • Predictive Analytics
  • Data Visualization
  • Natural Language Processing

Data

  • Extremely large data volumes
  • Continuously flowing data streams
  • Unstructured data
  • Semi-structured data

Infrastructure

  • Hadoop
  • MapReduce
  • NoSQL
  • In-memory Databases (e.g., SAP HANA, Terracotta)
  • Data Appliances

Enabling Technologies

  • New Scripting Languages (e.g., Hive, Python, Pig)

System Integrity

  • Data Masking
  • Data Access Security
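
To make the MapReduce entry under Infrastructure more concrete, the sketch below imitates the map, shuffle, and reduce phases of a word count in plain Python on a single machine. It is only an illustration of the programming model, not Hadoop's API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample documents are our own assumptions, and a real cluster would distribute each phase across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key before reducing."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a total."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, used only for illustration.
docs = ["big data big value", "data volume"]
counts = word_count(docs)
print(counts)
```

Because each call to map_phase depends only on its own document, and each call to reduce_phase only on its own key, both steps can in principle run in parallel on inexpensive machines, which is the property the Cost and Speed benefits rest on.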

The benefits that can be derived from Big Data technologies fall into three categories:

  

  1. Cost – Open-source software and commodity hardware dramatically lower the cost of storing data and running large, complex analyses.
  2. Speed – Big Data technologies can cut the time it takes to run a complex analytical application from hours or days to minutes or seconds.
  3. Insights – Extremely large volumes of structured and unstructured data can be processed quickly and cost-effectively, yielding analytic insights that would not otherwise be possible.


Unfortunately, many of the companies that invest in Big Data projects will not achieve these benefits, or any benefits at all.

We believe there are three keys to being successful with Big Data:

  1. Companies must have access to experienced Data Scientists who are equal parts Quantitative Analyst, Data Analyst, and Programmer. To be effective, Data Scientists should also have sufficient knowledge of the industry and functional area being analyzed.
  2. Companies should develop a Big Data strategy that defines the questions they would like to answer with Big Data technologies and the potential business value of answering those questions. This should be done before making big investments in acquiring and working with these new technologies.
  3. With Big Data it is even more critical for organizations to master the Data Management capability described in our IT Capability Framework. This includes data governance, metadata and master data management, data quality, and database management.

For companies that already have Big Data projects underway and are not completely comfortable with the results they are getting, we can help by completing a Big Data Program Assessment.


The technologies driving the Big Data revolution are summarized on this page using the IT Renaissance IT Architecture Framework.


1 “The Joys and Hype of Software Called Hadoop,” The Wall Street Journal (online), December 16, 2014