Is ‘Big Data’ Real, or Jargon?

April 11, 2013 5:26 am

The average person doesn’t think much about big data and how it might affect them. Yet anyone who uses the internet benefits from big data, often without knowing it. So what is big data? The video Big Ideas: How Big is Big Data? by EMC Corp does a great job of explaining it.

Big data is a collection of data sets so large and complex that it becomes difficult to process using traditional database management applications. By this definition it appears that big data only concerns those in business and IT, but that is not the case.

Let’s take a look at Ancestry.com: millions of people use it every month to search for their ancestors. I happen to be one of them. According to Ancestry.com, they have more than four billion records. That is a lot of data! The average user is pretty satisfied with this number because it gives the impression that it would be pretty simple to find your relatives. And for the most part, it is, until you find that your great-great-grandmother, Jane Smith, had a hugely common name. To make it worse, all of the most likely ancestors named Jane Smith happen to live in the general vicinity and have close birth dates in the same year. Oh, and all of them have a husband named John. There are differentiators such as siblings and children, but if you go in with few clues regarding who your relative might be, it can get quite frustrating.
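To make the Jane Smith problem concrete, here is a minimal sketch of how siblings and children can act as differentiators when name, spouse, and birth year all match. It is not how Ancestry.com works; the Candidate class, its fields, and the sample records are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """One hypothetical search result; the field names are illustrative."""
    name: str
    spouse: str
    birth_year: int
    relatives: set = field(default_factory=set)  # known siblings and children


def rank_candidates(candidates, known_relatives):
    """Sort candidates by how many known relatives they share, best first."""
    known = {r.lower() for r in known_relatives}

    def overlap(candidate):
        return len(known & {r.lower() for r in candidate.relatives})

    return sorted(candidates, key=overlap, reverse=True)


# Three made-up records that name, spouse, and birth year cannot separate.
candidates = [
    Candidate("Jane Smith", "John", 1850, {"Mary", "Thomas"}),
    Candidate("Jane Smith", "John", 1850, {"Eliza", "Robert", "Anne"}),
    Candidate("Jane Smith", "John", 1850, {"William"}),
]

best = rank_candidates(candidates, ["Robert", "Anne"])[0]
print(sorted(best.relatives))
```

If you go in with no known relatives, every candidate scores zero and the ranking tells you nothing, which is exactly the frustrating case described above.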

Although Ancestry.com has some powerful search options, it is not perfect. The information in the system comes from multiple databases, often assembled by people who are not experts in genealogy: you and me. Every time we enter family data, it is recorded in Ancestry’s database. As long as the information is not marked private (usually information related only to living relatives), it is available to all who search. And we are only as accurate as the relative who passed down the family history.

This blog post is not meant to discredit Ancestry.com. As a matter of fact, I have found this database well worth the membership fee. But when it comes to big data, Ancestry.com provides a familiar, convenient example to illustrate a few points:

  1. Big data is just that: lots of data.
    And its availability is important. I wrote a blog post, Big Data: Not About the Technology – About Solving Biggest Analytical and Data Challenges. Big data is about the technology, and the technology reaches out to everyone who is in search of answers. Whether you are finding your relatives or finding answers to the most critical business questions, the availability of large, complex data sets will continue to drive innovation.
  2. The more data that is available, the bigger the challenge.
    Figuring out which Jane Smith belongs to your clan may take more than just the information available in your database. Data analysis can often be confusing. Data relationships may be obscured and may require more traditional manual research to clarify.
  3. Clean and accurate data makes a huge difference in the outcome of your analysis.
    Unfortunately, inaccuracy is not uncommon. We can only hope that your intuition will alert you if the information doesn’t “ring true.” No one wants false information or a family with fictional characters! It is always prudent to exercise caution with the information provided and avoid making claims you may regret later.
  4. Big data helps business intelligence tell a story.
    Storytelling continues to be one of the benefits of business intelligence. Without big data, the story we create through analysis would be incomplete. Using a database like Ancestry.com has given me the story of my family. Relatives I had never heard of now have names and personalities. I have a vision of their lives, how they struggled and lived through joy and tragedy.
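
Point 3, on clean and accurate data, does not have to rest on intuition alone. Below is a minimal sketch of an automated plausibility check that flags parent and child links whose birth years do not make sense. The record layout, the names, and the age limits are assumptions chosen for illustration, not genealogical standards or anything Ancestry.com is known to use.

```python
def find_implausible_links(people, min_parent_age=12, max_parent_age=70):
    """Return (parent, child) name pairs whose birth years look wrong.

    `people` maps a name to a dict with 'birth_year' and 'children' keys.
    The age limits are illustrative assumptions, not genealogical standards.
    """
    problems = []
    for name, record in people.items():
        for child in record["children"]:
            if child not in people:
                continue  # no record for the child, nothing to compare
            age_at_birth = people[child]["birth_year"] - record["birth_year"]
            if not min_parent_age <= age_at_birth <= max_parent_age:
                problems.append((name, child))
    return problems


# Made-up family tree with one data-entry mistake.
tree = {
    "Jane Smith": {"birth_year": 1850, "children": ["Robert Smith", "Anne Smith"]},
    "Robert Smith": {"birth_year": 1874, "children": []},
    "Anne Smith": {"birth_year": 1847, "children": []},  # born before her mother
}

print(find_implausible_links(tree))
```

A check like this only catches entries that are impossible on their face. A wrong but plausible birth year would still pass, so manual research remains necessary.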

I believe we can all appreciate the importance of big data in our world. It has improved and changed the lives of individuals and organizations alike. Go ahead and call it a “fad,” but big data will not be going away anytime soon.

By Cindy Balon Harder, EPM Contributor, from: http://www.visualdatagroup.com/node/168

Throughout her 20+ year career, Cindy Balon Harder has had extensive experience in Marketing, Wholesale & Distribution, Supply Chain, and in developing Sales and Operations Planning processes. She is particularly familiar with the Consumer Products industry where she has participated in all aspects of the supply chain, from demand planning to warehouse distribution. Cindy is a Principal at Visual Data Group. Her main focus is Marketing, PR and Social Media, and Supply Chain consulting. See Cindy’s articles on EPM Channel here.

1 Comment

  • Tom Deutsch

    Bit of a misleading title for the post, and while worth a read, the author is mashing up some ideas that could use some clarification. The multiple or similar names example provided is an Entity issue, not inherently a big data issue. Now to be fair we use our big data platforms to help resolve/inform Entity issues but it is a distinct class of problem.

    That isn’t an EMC definition btw; it pre-dates their involvement in the space. I’d also hasten to add it is a very narrow and database-centric view of this as it ignores Streaming and other in-memory approaches.

    The author also ignores the variety and velocity aspects, which is a pretty serious omission.
