May 16, 2014

How to compete using small data

Source: The New Yorker

The business world continues to use data to gain a competitive advantage. While "big data" has reached a level of media prominence, small data analysis continues to make up the silent majority of business intelligence. Organizations of all shapes and sizes use small data to develop and maintain competitive advantage in a wide variety of industries. In this post we'll define small data, discuss its associated challenges, and map out its place in the data universe.

What is small data?

Small data is, in many ways, an inverted approach to big data. Rather than drawing enormous amounts of data from a few key sources, small data means drawing smaller amounts of data from a more diverse set of sources.

Despite its low profile, the vast majority of organizations that analyze data use this type of approach. And many of those who are most successful at small data pull their information from publicly available or non-proprietary sources. Some commonly used examples of small data include:

  • Historical pricing for goods and services
  • Polling data (raw or cross-tabs)
  • Econometric data
  • Historical weather patterns
  • Financial reporting information

While none of these types of data is terribly exciting, they form the backbone of the analytical frameworks of most groups and organizations looking to turn information into actionable decisions.

Big or small, velocity is always important

Much has been made of the Three V's of Big Data: Volume, Variety and Velocity. At the risk of stating the obvious, the timeliness of data is critical to achieving competitive advantage whether you are using one data point or a billion. Often, small data is pre-processed before you receive it, and as such it can never compete with the currency of data that comes directly from the source. The relative size of small data, however, can reduce the amount of time it takes to analyze and extract actionable insight from the compiled information.

Channel of success

The proverbial "sweet spot" of competitive advantage in data is a very tight corridor between creating original insight and avoiding analysis paralysis. As shown on our original 2x2 matrix, not having enough data points or data sources will surely leave you in the dust, while having too many of both can make it virtually impossible to achieve any kind of insight in a timely fashion. Every industry is different and there are exceptions to every rule (Google), but a general guideline looks something like this:

Small data in action: Nate Silver in 2012

A very public example of small data success is Nate Silver's accurate prediction of the 2012 presidential election. Starting before the 2008 election, he realized that polls are inherently inaccurate because of their built-in biases and statistical margins of error. So, rather than creating his own poll, he simply aggregated the results of many different polls and adjusted for historical accuracy. While this was methodologically sophisticated, it did not require enormous computing power, proprietary software or a room full of servers; it was done using a series of spreadsheets on a laptop computer.

For Nate, the insight came not from the volume of data, but from its diversity. By seeing that no data source is perfect and realizing that even publicly available data can be melded and rearranged to offer competitive insight, he was able to raise his profile and become one of the foremost authorities on data and analysis.
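
To make the aggregation idea concrete, here is a minimal sketch in Python. It is not Nate Silver's actual model, which this post does not describe; it simply assumes that each pollster has a known historical bias and a typical historical error, and weights each bias-corrected poll by the inverse of its squared error. The pollster names, the numbers and the function name aggregate_polls are all made up for illustration.

```python
# Hypothetical track record per pollster: (historical bias, typical error),
# both in percentage points. These values are invented for the example.
TRACK_RECORD = {
    "Pollster A": (1.0, 2.0),   # tends to overstate the candidate by 1 point
    "Pollster B": (0.0, 4.0),   # unbiased on average, but noisy
    "Pollster C": (-1.0, 2.0),  # tends to understate the candidate by 1 point
}


def aggregate_polls(polls, track_record):
    """Combine polls into one estimate.

    polls: list of (pollster name, reported share in percent).
    Each poll is corrected for its pollster's historical bias, then
    weighted by 1 / error**2, so historically accurate pollsters count more.
    """
    if not polls:
        raise ValueError("at least one poll is required")

    weighted_sum = 0.0
    total_weight = 0.0
    for pollster, share in polls:
        bias, error = track_record[pollster]
        weight = 1.0 / (error ** 2)
        weighted_sum += weight * (share - bias)
        total_weight += weight
    return weighted_sum / total_weight


polls = [("Pollster A", 51.0), ("Pollster B", 52.0), ("Pollster C", 49.0)]
estimate = aggregate_polls(polls, TRACK_RECORD)
print(round(estimate, 2))  # prints 50.22
```

Pollsters with a better track record pull the combined estimate toward their own numbers, which is the intuition behind adjusting for historical accuracy. Nothing here needs more than a laptop.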

The hardest part of small data

Despite its value, the hardest part of using a lot of different data sources is getting them into one place where they can be used. With data diversity comes a wide variety of formats, structures, interfaces, storage mechanisms and errors. And even once these sources have been pulled together for the first time, the critical task of keeping the data constantly updated is often enormously time-consuming. Even if data consolidation is streamlined (perhaps using an ETL process), building the infrastructure for storing and accessing this data across organizations can take years and come at a huge cost.

We at Shooju see this as the number one challenge for organizations looking to gain actionable insight from small data. To learn more about how we solve this problem, and some insight into the best technologies for the job, check out our architecture.
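
As a rough illustration of the consolidation problem described in this section, the sketch below pulls two small sources with different formats, field names and date conventions into one common record shape using only the Python standard library. It is not a description of Shooju's architecture; the sample data, the field names (Date, Price, day, reading) and the function names are assumptions made for the example.

```python
import csv
import io
import json
from datetime import datetime

# Two hypothetical sources: pricing as CSV with US-style dates,
# weather as JSON with ISO dates and numbers stored as strings.
PRICES_CSV = "Date,Price\n05/15/2014,101.5\n05/16/2014,102.0\n"
WEATHER_JSON = '[{"day": "2014-05-15", "reading": "18.2"}]'


def from_csv(text, source):
    """Parse the CSV source into the common record shape."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "source": source,
            "date": datetime.strptime(row["Date"], "%m/%d/%Y").date(),
            "value": float(row["Price"]),
        }
        for row in rows
    ]


def from_json(text, source):
    """Parse the JSON source into the common record shape."""
    return [
        {
            "source": source,
            "date": datetime.strptime(item["day"], "%Y-%m-%d").date(),
            "value": float(item["reading"]),
        }
        for item in json.loads(text)
    ]


def consolidate(*batches):
    """Merge all batches and order them by date, then by source name."""
    merged = [record for batch in batches for record in batch]
    return sorted(merged, key=lambda r: (r["date"], r["source"]))


records = consolidate(
    from_csv(PRICES_CSV, "prices"),
    from_json(WEATHER_JSON, "weather"),
)
for record in records:
    print(record["date"], record["source"], record["value"])
```

Each new source needs its own small parser, and each parser has to be kept in step with whatever the source changes next. That upkeep is where the time goes once the number of sources grows.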

It's how you use it

As the saying goes, it's not the size that counts, it's how you use it. This framework lays out one ingredient of analytical success with small data; it doesn't replace the value of true insight. Our belief is that effective and practical use of small data allows groups and organizations to spend more time extracting value from the data and less time managing and collecting it.
