Sunday 22 March 2015

The First 'C' of Mega Data: Capture

Or perhaps it should be 'C' for Create?  There are many estimates of just how many devices will be generating data as part of the massive growth of IoE3 (Internet of Everything, Everywhere, Everyone).  I thought it would be interesting to take a look at just how these devices will be identified across the Internet.

Cisco predicts that there will be 50 billion connected devices by 2020.  The most remarkable observation in the infographic that Cisco produced is that by 2008 there were already more devices connected to the Internet than there were people on Earth.

Another observation is that IPv6 will provide 100 Internet addresses for every atom on the face of the Earth.  That estimate should reassure anyone who is worried that we'll run out of IP addresses!

IPv6 was standardised in the late 1990s and, since the free pool of IPv4 addresses ran dry in 2011, has increasingly been adopted by technology vendors for IP addressing.  Quoting the Internet Society:

"An IP address is basically a postal address for each and every Internet-connected device. Without one, websites would not know where to send the information each time you perform a search or try to access a website. However, the world officially ran out of the 4.3 billion available IPv4 addresses in February 2011.
Yet, hundreds of millions of people are still to come online, many of whom will do so in the next few years. IPv6 is what will allow them to do so, providing enough addresses (2^128 to be exact) for everyone and all of their various devices."
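
To put those numbers side by side, here is a rough sketch in Python that uses the standard library's ipaddress module to count the IPv4 and IPv6 address spaces and compare them with the 50 billion device forecast.  The variable names and the FORECAST_DEVICES constant are just my own labels for illustration, and the per-device figures are simple division rather than a statement about how addresses are actually allocated.

```python
import ipaddress

# The "everything" networks cover each protocol's entire address space.
ipv4_space = ipaddress.ip_network("0.0.0.0/0")
ipv6_space = ipaddress.ip_network("::/0")

ipv4_total = ipv4_space.num_addresses  # 2**32, the "4.3 billion" in the quote
ipv6_total = ipv6_space.num_addresses  # 2**128

# Assumption: the 50 billion devices forecast for 2020, used as a round figure.
FORECAST_DEVICES = 50_000_000_000

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:,}")

# Naive division only; real allocation hands out blocks, not single addresses.
print(f"IPv4 addresses per forecast device: {ipv4_total / FORECAST_DEVICES:.3f}")
print(f"IPv6 addresses per forecast device: {ipv6_total // FORECAST_DEVICES:,}")
```

The IPv4 figure comes out at well under one address per forecast device, which is the shortage the quote describes, while the IPv6 figure is a number with dozens of digits.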

So, there you go: we now have an almost unlimited number of addresses that can be used for identifying devices.  Now, just imagine how much data they'll generate.  That could be the subject of a future blog.

Sunday 15 March 2015

The Four C's of Mega Data

The term Big Data has a hazy genealogy but is generally considered to have come into use in the 1990s.  Broadly speaking, the main attributes used to define Big Data have been Volume, Velocity and Variety.  As vendors have joined the party, the original 3 V's have been extended to include Veracity, Value and Various other V's!


With the expected explosion of data arising from IoE3 (Internet of Everything, Everywhere, Everyone) we are now going beyond Big Data and are heading into the era of Mega Data.

Each of the topics within these subject areas is worthy of an article in its own right.

In future blogs I hope to focus on each of the key areas:

Communicate - Analytics is forecast to become a $9.83 Billion market by 2020.  The power of Data Visualisation continues to grow with many mainstream BI vendors providing toolsets with comprehensive visualisation capabilities.

Calculate - Moving on from traditional statistical models, more and more effort is being applied to the use of Advanced Machine Learning.  Several technology vendors have stepped into this market and there are also courses being promoted by universities.

Curate - This is what replaces the Extract - Transform - Load phase of traditional Data Warehousing.  There will still be a need for some ETL, but with concepts such as Data Federation and Schema on Read the amount of data transferred from source to target may need to change radically.

Capture - The starting point of the data journey.  Estimates vary about how many devices there will be, but forecasts in excess of 50 billion devices, as proposed by Cisco, don't seem unrealistic.

With this exponential growth of devices to capture data, it will be interesting to see how our networks keep pace.