Garbage in, garbage out (GIGO) has long been a familiar term for describing data quality and its usefulness. In our age of “recycling” and “reuse”, perhaps a more contemporary rendering of this phrase might be, “What starts out as a tiny piece of irrelevant information winds up as Big Data”.
The explosive growth in the amount of data created in the world continues to accelerate, and its sheer volume continues to surprise us. The data deluge is happening everywhere and is not restricted to niche sources. It encompasses sensor and machine data, transactional data, metadata, and social network data.
In addition to leaving a vapor trail filled with toxic gases, a Boeing jet engine produces up to 10 terabytes of operational information for every 30 minutes of use. Suppose a four-engine jumbo jet generates 640 terabytes of data per Atlantic crossing, then multiply that by the more than 25,000 flights flown each day, and, well, you do the math!
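For readers who would rather not do the math by hand, here is a rough back-of-the-envelope sketch. It takes the figures quoted above at face value and, purely for illustration, assumes every one of the 25,000 daily flights is a four-engine Atlantic crossing, which overstates the real total. The function names are made up for this example, and decimal units are assumed (1 exabyte = 1,000,000 terabytes).

```python
# Back-of-the-envelope check of the jet-engine figures quoted in the text.
# Assumption (illustrative only): every daily flight is a four-engine
# Atlantic crossing producing the full 640 TB.

TB_PER_ENGINE_PER_HALF_HOUR = 10   # "up to 10 terabytes ... every 30 minutes"
ENGINES = 4                        # four-engine jumbo jet
TB_PER_CROSSING = 640              # quoted per Atlantic crossing
FLIGHTS_PER_DAY = 25_000           # "more than 25,000 flights flown each day"
TB_PER_EB = 1_000_000              # decimal units: 1 EB = 10^6 TB


def implied_crossing_hours(tb_per_crossing, engines, tb_per_half_hour):
    """Flight time implied by the per-crossing and per-engine figures."""
    tb_per_hour_all_engines = engines * tb_per_half_hour * 2
    return tb_per_crossing / tb_per_hour_all_engines


def daily_total_tb(tb_per_flight, flights_per_day):
    """Upper-bound daily volume if every flight produced tb_per_flight."""
    return tb_per_flight * flights_per_day


hours = implied_crossing_hours(
    TB_PER_CROSSING, ENGINES, TB_PER_ENGINE_PER_HALF_HOUR
)
total_tb = daily_total_tb(TB_PER_CROSSING, FLIGHTS_PER_DAY)
total_eb = total_tb / TB_PER_EB

print(f"Implied crossing time: {hours} hours")
print(f"Upper-bound daily volume: {total_tb:,} TB ({total_eb} EB)")
```

Under these assumptions, the 640-terabyte figure corresponds to an eight-hour crossing, and the daily total comes to 16 million terabytes, or 16 exabytes.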
Social network data is another source adding to the “Big Bang Theory” explosion of data. The micro-blogging site Twitter serves more than 200 million users who produce more than 90 million “tweets” per day, or 800 per second. Each of these posts is approximately 200 bytes in size. On an average day, this traffic equals more than 12 gigabytes, and, throughout the Twitter ecosystem, the company produces a total of eight terabytes of data per day.
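The Twitter figures can be sanity-checked the same way. The short sketch below uses only the numbers quoted above and assumes decimal gigabytes (1 GB = 10^9 bytes); the variable names are illustrative. It suggests the quoted per-second rate and daily volume are looser roundings than the 90-million-per-day figure, or perhaps come from a different count.

```python
# Rough consistency check of the Twitter figures quoted in the text.
# Assumption: decimal units, 1 GB = 1e9 bytes.

SECONDS_PER_DAY = 24 * 60 * 60     # 86,400
TWEETS_PER_DAY = 90_000_000        # "more than 90 million tweets per day"
QUOTED_TWEETS_PER_SECOND = 800     # "or 800 per second"
BYTES_PER_TWEET = 200              # "approximately 200 bytes in size"
BYTES_PER_GB = 1e9


def daily_gigabytes(tweets_per_day, bytes_per_tweet):
    """Raw tweet payload per day, in decimal gigabytes."""
    return tweets_per_day * bytes_per_tweet / BYTES_PER_GB


# Rate implied by the daily count.
implied_per_second = TWEETS_PER_DAY / SECONDS_PER_DAY

# Daily volume from the daily count, and from the quoted per-second rate.
gb_from_daily_count = daily_gigabytes(TWEETS_PER_DAY, BYTES_PER_TWEET)
gb_from_quoted_rate = daily_gigabytes(
    QUOTED_TWEETS_PER_SECOND * SECONDS_PER_DAY, BYTES_PER_TWEET
)

print(f"Implied rate: {implied_per_second:.0f} tweets per second")
print(f"Daily volume from 90 million tweets: {gb_from_daily_count:.1f} GB")
print(f"Daily volume at 800 tweets per second: {gb_from_quoted_rate:.1f} GB")
```

Ninety million tweets a day works out to roughly 1,040 per second and 18 gigabytes, while 800 per second gives about 13.8 gigabytes. Either way, the raw text of the tweets is a small fraction of the eight terabytes the wider ecosystem is said to produce each day.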
Earlier this year, Facebook announced it had surpassed the 750 million active-user mark, making the social networking site the largest consumer-driven data source in the world. According to industry sources, Facebook users spend more than 700 billion minutes per month on the service, and the average user creates 90 pieces of content every 30 days. Each month, the community creates more than 30 billion pieces of content, ranging from Web links, news stories, blog posts and notes to videos and photos.
Everywhere you look, the quantity of information in the world is soaring. The term “Big Data” has emerged to describe this monstrous growth in data. “Big Data” refers to data sets characterized by large scale, high throughput, and a wide variety of data structures.
For over 25 years, John Brantley Hooks, IV, has served as a consultant to private equity and hedge fund firms and as an interim executive at several top-line turnarounds. He has held executive positions at Accenture LLP and Andersen Consulting, in their Corporate Strategy and Corporate Venture organizations.