15th Century Big Data - What Can We Learn From It?

October 19, 2012 5:21 am

“Big Data” is in vogue today. It is the new fashion in business and technology. It is a phenomenon that is difficult to explain, yet it has somehow managed to trek from Technology Street to Wall Street and is now blazing a trail onto Main Street.

Technocrats, data scientists and business executives claim that there has never been anything like this in human history. The sheer volume and rate of growth are indeed unparalleled. However, the term “Big” is relative. A 30 MB hard drive was “big” a couple of decades ago, but today 2 TB drives are common. Similarly, half a GB of RAM on a desktop was a luxury 10 years ago, but it is common to have 4 GB of RAM on a laptop today.

What was big then is almost laughable now.

There was another time in history when we suddenly started creating large volumes of data, in a variety of forms and at an accelerated rate. This was a revolution that affected the social, religious, scientific and economic domains. It happened in the fifteenth century. It was a revolution at the cusp of the medieval period and the Renaissance, a phenomenon that paved the way for the scientific method and a platform that launched the Industrial Revolution.

Before we discuss this fifteenth century Big Data revolution, let’s make sure that we are on the same page about the current one. It is estimated that the data created from the dawn of civilization to 2003 was around 5 exabytes (an exabyte is roughly 2^60 bytes). We now create that much data every two days! In 2011 we created around 1.8 zettabytes (a zettabyte is roughly 2^70 bytes) of data. We are not only creating more data, but we are also creating a greater variety of data, and at an astonishing rate.
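The scale of these estimates can be sanity-checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes decimal (SI) units and a constant rate of data creation throughout 2011, neither of which the estimates themselves state, and the function name is made up for this example.

```python
# Rough check of the scale figures quoted above.
# Assumptions: decimal (SI) units and a constant creation rate during 2011.
EXABYTE = 10**18    # bytes
ZETTABYTE = 10**21  # bytes

data_until_2003 = 5 * EXABYTE     # estimate quoted in the text
data_in_2011 = 1.8 * ZETTABYTE    # estimate quoted in the text


def days_to_create(target_bytes, yearly_bytes, days_per_year=365):
    """Days needed to create target_bytes at a constant yearly rate."""
    daily_bytes = yearly_bytes / days_per_year
    return target_bytes / daily_bytes


days = days_to_create(data_until_2003, data_in_2011)
print(f"At the 2011 rate, 5 EB is created in about {days:.1f} days")
```

Under these assumptions the answer comes out at roughly one day, which is the same order of magnitude as the “every two days” figure.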

You might have heard about the three Vs of Big Data: Volume, Variety and Velocity. We are creating lots of data (Volume) in many different types such as numbers, text, pictures, audio, video, Twitter feeds, mobile data and more (Variety), and at an increasing rate (Velocity). The amount and variety of data have made the traditional storage and access methods obsolete. Relational databases cannot store the amount and variety that Big Data creates. The traditional analytical methods are also facing their own challenges. There is some merit to the statement that the current Big Data revolution is unrivaled in history.

The fifteenth century Big Data revolution that I am talking about started around 1440, when Johannes Gutenberg invented the printing press. Gutenberg’s printing press in the middle of the fifteenth century was a unique phenomenon that had all the hallmarks of the current Big Data revolution. The three Vs (Volume, Variety and Velocity) are all applicable to this new invention. The most amazing part is the impact that it made. The impact was felt in many domains, from social to religious to scientific to economic to art to literature. The impact was profound and absolute. The impact was surprising and unexpected. The impact set Europe on a path of progress that can still be felt. We should be able to learn a lot from it.

So let’s discuss this fifteenth century invention in more detail.

Gutenberg did not invent printing. Around 100 BC, more than 1,500 years before Gutenberg, the Chinese had invented paper made from vegetable pulp. The Chinese started producing block-printed texts around the 6th century AD, and by the 11th century they had also developed movable type. The initial movable type was made of clay; later the Chinese experimented with wooden and ceramic types. Chinese print technology reached Europe by 1400 AD via Muslim trade and the Crusades.

Gutenberg’s modifications and improvements to the existing Chinese invention of printing took it to a new height. Gutenberg integrated movable “metal” type with oil-based ink and the linen press. The combination was so potent that printing took off like wildfire. It created a new industry that included the production of paper, the printing press, editing, and the creation and distribution of printed material in the form of books and other texts.

So let’s look at this innovation in terms of the three Vs of Big Data:

Volume: During the 6th and 7th centuries only 120 books were produced annually in Western Europe. Books were luxury items. Only the rich and the church were able to hire scribes and produce books. Most of the books were religious. Gutenberg introduced the printing press around 1440, and by 1500, in only about 60 years, 10 million texts including 2 million books had been printed. Over the next 400 years the world’s output grew to around a billion books a year. See the chart below:

 

Just like Big Data today, there was an explosion in the volume of texts and books (data) after the introduction of the printing press around 1440 AD.

Variety: The printing press revolution introduced three major types of variety to texts and documents that were previously unknown.

a.      The first variety was the proliferation of secular documents. Before the introduction of the printing press by Gutenberg around 1440, books were mostly religious. A handful of secular books were written, but they were primarily meant for rich people who could afford to have diverse interests and could hire scribes to write or commission a work. This type of work included poems and stories like Dante’s “Divine Comedy” and Chaucer’s “Canterbury Tales”. These were popular books, but they were distributed in select circles and were out of reach of the common man. The printing press made books affordable and helped increase literacy in Europe. This in turn gave rise to a variety of new forms of writing that were not previously possible.
b.      The second variety was in the form of the printed texts. Apart from books, one could print pamphlets like Martin Luther’s 95 Theses, newspapers, magazines, books on science and technology, research papers and more. Cervantes, the Spanish writer of the 17th century, introduced the novel with Don Quixote.
c.      The third variety was the use of the vernacular instead of Latin for writing and printing texts. As writing and reading were mostly reserved for the rich, most texts were written in Latin, the language of the intellectual community. Some ardent followers of this dictum continued the practice through the 17th century, like Isaac Newton, who wrote his masterpiece “Principia” in Latin. After the printing press revolution many new writers started composing their texts in local languages, or vernaculars, such as English, German, Italian and Spanish.

Velocity: Writing a book using scribes was a laborious and time-consuming process. The printing press was a paradigm shift in terms of the rate of production. From a single printing press in Germany in the middle of the 15th century, print activity had spread to 270 cities in Europe by the end of the 15th century. By 1500 the printing presses had printed 10 million texts and 2 million books. In the next century the output rose to 150 to 200 million copies, and over the next 400 years we started printing more than a billion books.

You can see that the Big Data revolution of today and the Big Data revolution of the fifteenth century have a lot in common. The printing press revolution brought certain changes to society and its thought process. These changes were also responsible for a deep impact on Europe and the world at large. Let us first look at the changes:
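To put the change of pace in perspective, the figures quoted above can be turned into average yearly rates. This is only a back-of-the-envelope sketch: it assumes the 2 million books were printed at a constant rate between 1440 and 1500, and it compares against the 6th and 7th century scribal figure, which covers a different era and region of coverage than the press figures.

```python
# Figures quoted in the article; the constant-rate averaging is an assumption.
SCRIBAL_BOOKS_PER_YEAR = 120        # Western Europe, 6th and 7th centuries
PRINTED_BOOKS_BY_1500 = 2_000_000   # books printed between about 1440 and 1500
PRESS_START, PRESS_END = 1440, 1500


def average_per_year(total, start_year, end_year):
    """Average yearly output over the period, assuming a constant rate."""
    return total / (end_year - start_year)


printed_per_year = average_per_year(PRINTED_BOOKS_BY_1500, PRESS_START, PRESS_END)
speedup = printed_per_year / SCRIBAL_BOOKS_PER_YEAR
print(f"{printed_per_year:,.0f} books/year, ~{speedup:,.0f}x the scribal rate")
```

On these assumptions the presses averaged about 33,000 books a year, a few hundred times the scribal output.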

1.      Establishment of the print ecosystem: The print ecosystem evolved fairly quickly without central planning by any state or government. The complex network of print supplies, printing presses and editors, the emergence of writers, print production, distribution and demand management came to life on its own and worked beautifully until recently, when digital media created a new challenge.
Does today’s Big Data revolution reflect this change? The answer is a resounding yes. As soon as we started generating large amounts of data, several new technologies emerged to store and analyze it. Google pioneered the Google File System and the MapReduce programming model, and their ideas were later implemented in a new open source solution called Hadoop. The Hadoop ecosystem was soon crowded with numerous administrative and analytical tools. The tool set for Hadoop is still evolving, but it resembles the emergence of the print ecosystem.
2.      Ease of sharing of ideas: Before the advent of mass transportation and communication channels, sharing ideas was not easy. Two learned persons had to either meet in person or communicate via letters and books produced laboriously by scribes. The printing press, however, made storing ideas on paper much easier and more reliable. Communities of scientists could communicate their findings through scientific journals, communities of writers could bounce new ideas off their colleagues, and communities of religious people across many regions could easily express their opinions and challenge the status quo. This was a precursor to the Reformation movement led by Martin Luther and to the introduction of the scientific method by Francis Bacon and others.
Does today’s Big Data revolution reflect this change? The answer again is a resounding yes. The current Big Data architecture enables us to share everything about our lives and philosophies using multiple tools and techniques. We share our thoughts using blogs, we share our social life using Facebook, we share our business contacts using LinkedIn, we share our pictures and videos, and email is an absolute necessity. On top of this, businesses are tracking everything we do, from our shopping habits to our political thoughts to our suggestions and complaints. New media like Skype are trying to make communication almost face to face. At no time in history has the world been as connected as it is today.
3.      Rise in the production of secular documents: The printing press, with its relative simplicity and affordability, encouraged many new ideas beyond religion, in ethics, society, government, science, technology, art and literature. The period after the collapse of the Roman Empire in the 5th century is often referred to as the “Dark Age”. There were several social and political changes in Europe between 500 and 1000 AD; however, intellectual branches of knowledge like philosophy, science, mathematics and technology lagged behind. This changed with the rise of the universities in the 12th century. The idea of the university spread rapidly in Europe, and by the 1400s dozens of universities had been created, in Paris, Bologna, Oxford and Krakow to name a few. The university and the printing press were a powerful combination that increased the production of secular documents, as opposed to the religious texts of the past.
Does today’s Big Data revolution reflect this change? Yes, but in a slightly different way. The production of secular documents was a fifteenth century problem and is not an issue today. However, most of the data that has traditionally been stored and analyzed is business focused, like customer transactions, supply chain or financial data. The big shift due to today’s Big Data revolution is in the social and personal data that is being generated using social media and mobile technology. We do not know the real impact of this change yet. Every business is trying to make sense of this data and monetize it, but so far there isn’t any silver bullet.
4.      Rise of “mass” literacy in Europe and then the world: The availability of cheap printed documents, the rise of the universities, the decline of church dogma and the use of the vernacular for writing texts together lifted the literacy rate in Europe. This in turn increased the discussion of ideas and encouraged more and more people to become literate. The fact that the scientific revolution did not begin in other advanced civilizations like China is a testament to the power of knowledge and communication.
Does today’s Big Data revolution reflect this change? The answer is again yes. The whole world has become technology savvy. Grandparents who had never used computers before find themselves comfortable with Facebook or Skype. Illiterate and poor populations in third world countries find themselves knowledgeable enough to use smartphones and many intuitive applications. The new generation is puzzled at the mention of LPs, CDs, VCRs and cameras with film! Suddenly the whole world has become digital and connected in a unique way.

The most interesting part of the print revolution is not the changes mentioned above. The most interesting part is the impact that these changes had on the social, political, religious and economic domains. That is its lasting legacy. I want to discuss some of the critical impacts or movements that the printing press was responsible for. The printing press wasn’t a sufficient condition for the impacts mentioned below, but it was certainly a necessary condition. The printing press revolution alone couldn’t have produced these impacts, but it played a major role in making them happen. Following are some of the impacts:

1.      Reformation Movement: In the early part of the 16th century the Roman Catholic Church was engaged in selling “indulgences” (reductions of the punishment due for sins) to raise money to build St. Peter’s Basilica in Rome. Martin Luther, a German monk, protested this practice by formulating 95 theses that asked controversial questions like “Why does not the pope, whose wealth today is greater than the wealth of the richest Crassus, build the basilica of St. Peter with his own money rather than the money of poor believers?” Luther posted a copy of the 95 Theses on the door of the Castle Church in Wittenberg, and within weeks the Theses had spread all over Europe. Print technology played a major role in distributing them. Martin Luther was soon followed by other reformers like John Calvin, and this ended up forming a new branch of Christianity referred to collectively as “Protestant”. The Protestant reformers encouraged owning and reading the Bible, which was readily available thanks to print technology. The Reformation also played a critical role in the development of new ideas in philosophy and science. Before the Reformation the Roman Catholic Church controlled ideas, and anybody challenging the church’s authority was considered a heretic. After the Reformation many Protestant countries accepted freedom of religion and engaged in spirited discussion of ideas. They also printed and distributed critical books across Europe. It is interesting to note that many of the prominent scientists came from Protestant countries.
Is there a similar impact due to today’s Big Data? This is an interesting question, and I don’t think anybody has the right answer, but I have an idea that has some promise. Like the Roman Catholic dogma prior to the Reformation, the traditional Data Warehousing and Analytics concepts need a challenge. Relational databases and enterprise models are becoming a thing of the past. The relational database was never designed with analytics in mind and was best suited for single updates and inserts. The Data Warehouse and Analytics community has adopted it for the last 20 years, but the time has come for a “Reformation”. Hadoop and its ecosystem are challenging the tradition, and we can see two emerging philosophies: one that wants to stick to the tradition and another that wants to embrace change and uncertainty.
I am sure that there are other trends that I have missed. Please share if you have an idea or opinion.
2.      Scientific Revolution: Aristotle had written a work on logic called the “Organon” or “The Instrument”. This work dominated thinking for the next 2000 years. It relied entirely on deductive logic. In 1620, Francis Bacon wrote another book that he called the “Novum Organum” or “New Instrument”. This book was an open call to the intellectual community to adopt the empirical method of enquiry and use the inductive method. He argued that knowledge is not something that we start with and deduce conclusions from, but something that we must arrive at by collecting facts and then drawing conclusions from these facts using inductive reasoning. Francis Bacon was followed by Rene Descartes and others in coming up with the “Scientific Method”. This method was responsible for developments in physics, chemistry, mathematics, astronomy, biology, medicine and other fields that gave rise to the scientific revolution, which started after the Renaissance and continued through the 18th century. This included work by scientists like Copernicus, Kepler, Galileo, Newton, Leibniz and more. It is important to mention again that the printing press was not a sufficient condition for the scientific revolution, but it certainly was a necessary condition. The printing press, the Reformation, the scientific method and the rise in the exchange of ideas were ultimately responsible for the most fertile period of scientific discoveries and inventions.
Is there a similar impact due to today’s Big Data? This is again a difficult question to answer. Hundreds of years passed after the print revolution before something known as the Scientific Revolution came along. The current pace of technology is of course much swifter than in the past; however, predictions are difficult, especially about the future!
My thought in this regard is that within a few years I hope to see new principles and frameworks crystallize, similar to the traditional methods established by the Data Warehousing movement. Again, it will be interesting to hear thoughts from others.
3.      Industrial Revolution: The scientific revolution brought a deeper understanding of natural phenomena. Entrepreneurs and engineers started experimenting with new tools and technologies. The scientific revolution was followed by the industrial revolution, which was essentially the application of the scientific knowledge gathered during the scientific revolution. The industrial revolution brought changes in agriculture, mining, transportation, manufacturing and more. The steam engine is one such example, and it affected the mining, manufacturing and transportation industries.
Is there a similar impact due to today’s Big Data? This is something that we can foresee, not in absolute terms but in concept. The Industrial Revolution was an application of the scientific method that resulted in tools and technologies like the steam engine, the spinning jenny and more. Similarly, the current Big Data revolution will have its own applications and tools that enable effective and efficient storage, administration and retrieval of Big Data. There is already a proliferation of new tools and ideas trying to solve Big Data challenges. Venture capitalists are pretty bullish about investing in startups that are trying to monetize Big Data.
There is also a war brewing between the data scientists and the computer engineers. The data scientists claim that the computer engineers are trying to become data scientists without appropriate analytical skills, and the computer engineers claim that the data scientists are trying to become programmers without adequate computer skills. The strategy consulting firm McKinsey estimates a shortage of data experts in the range of 200,000 by the year 2018. I hope to see the skills roadmap become clear in a couple of years.

This was my slightly different take on Big Data. I wanted to give it a context and not treat it as unique and unprecedented. In reality there is nothing that is truly unique, and if we learn from history we can make several useful inferences. This was an attempt to do just that. Please share your insights and your attempts to predict the future!

By Anil Inamdar, from: http://bigdatathougths.blogspot.com/2012/10/fifteenth-century-big-data-what-can-we.html
