Data Generation is Growing – Start Storing Everything
The world is generating more data than ever before.
In 2013, IBM reported that 90% of the world's data had been generated in the preceding two years, and the trend has continued since. So what is causing this explosion of data?
Several major factors have contributed. Chief among them is the changing way we interact with computers. The first generation of data capture involved rigorously prepared data being wired physically into the machine by engineers. The second generation involved professionally trained computer operators who fed data into the machines at our request; software was designed to enforce constraints so the machine knew how to process the data. The third generation saw everyone entering their own data – Web 2.0, the interactive Web: people were given the freedom to capture whatever they wanted, and the amount of data captured exploded. Now we are in a fourth generation of data capture – the Internet of Things and machine-to-machine communication. Gartner has projected that there will be 26 billion connected devices by 2020, and that the number of sensors will be measured in the trillions.
The price of storing all the data we generate has fallen dramatically. The first commercial hard disk drive, the IBM 350 (shipped as part of IBM's RAMAC 305 system), was launched in 1956 at a cost of around $50,000, or $435,000 in today's dollars – and it stored just 5 megabytes. To put that in context, it would cost roughly $350 billion today to store 4 terabytes of data using that technology, and the drives would occupy a floor area 2.5 times that of Singapore. Today, of course, a 4-terabyte drive can be purchased for little more than a hundred dollars and fits in the palm of a hand.
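As a quick sanity check on that cost figure (using decimal units, so 4 TB = 4,000,000 MB):

$$ \frac{4{,}000{,}000\ \mathrm{MB}}{5\ \mathrm{MB\ per\ drive}} = 800{,}000\ \text{drives}, \qquad 800{,}000 \times \$435{,}000 \approx \$3.5 \times 10^{11} \approx \$350\ \text{billion}. $$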
With storage cost largely a solved problem, the biggest remaining challenge has been data processing and analysis. There are two issues here. Firstly, traditional database systems are designed to run on one machine; a larger database implies scaling up to a bigger machine, but the amount of data now available has outstripped our capacity to process it with traditional methods on a single computer – the computers are simply not powerful enough. Secondly, data has become more structurally complex, and traditional database designs, which rely on the structure of the data being predefined during the design phase, can no longer cope with the flexibility required when people and systems evolve to use data in unpredictable ways.
Until the rise of next-generation database systems like MongoDB, these limitations meant a lot of data was simply thrown away: what is the point of storing something if you cannot make sense of it? MongoDB has helped change all that. It is designed from the ground up to distribute data across many computers, enabling it to handle vastly larger datasets. Furthermore, the structure of the data does not have to be defined in advance – MongoDB can store almost anything, and patterns and sense can still be gleaned regardless of the structure.
For example, in a database of customers we can store information about each customer that is highly specific to them as individuals – their pets, their hobbies and special interests, places they have visited, books they have read, companies where they have worked. We may not yet know what insights this information will yield, but with MongoDB we can store it today and make sense of it in the future, as the sketch below illustrates.
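Here is a minimal sketch of that idea in Python with the PyMongo driver. The database name, collection name, and all field names are hypothetical, chosen simply to mirror the example above:

```python
# A minimal sketch using PyMongo; the "shop" database, "customers"
# collection, and every field name below are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["shop"]

# Two customers with completely different shapes: no schema had to be
# declared in advance, and fields can be arrays or nested documents.
db.customers.insert_many([
    {
        "name": "Alice",
        "pets": ["dog", "parrot"],
        "hobbies": ["climbing"],
        "places_visited": ["Singapore", "Oslo"],
    },
    {
        "name": "Bob",
        "books_read": [{"title": "Moby-Dick", "rating": 4}],
        "employment": [{"company": "Acme", "years": 3}],
    },
])

# We can still query across the varied structures, e.g. find everyone
# who has ever worked at Acme:
for doc in db.customers.find({"employment.company": "Acme"}):
    print(doc["name"])
```

Note that no migration was needed to give Bob fields that Alice never had; the dotted query path simply matches whichever documents contain it.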
Given that tools now exist which can handle very large and loosely structured datasets, and that storage costs have fallen so dramatically, it stands to reason to start storing everything. Businesses of the past were valued on their brand awareness; businesses of the future will increasingly be valued on how well they can use the data at their disposal to understand each customer, improve their efficiency and responsiveness, and make the best decisions.
Storing data today for use tomorrow makes a great deal of business sense. Once data has been collected, questions about how to use it will naturally follow. Without the data, people will not even think of the questions they could be asking – questions like these (the last of which is sketched as a query after the list):
- What would be the optimum price for this product?
- How soon should we follow up with a customer after they have purchased item X?
- What products act as good loss leaders?
- What are the signs that indicate a customer will churn?
- When a customer churns, who are the people most at risk of following them?
- What impact does the weather have on purchasing patterns?
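As an illustration, the weather question might be explored with MongoDB's aggregation framework. This sketch assumes a hypothetical orders collection in which each order document recorded the weather at the time of purchase; the weather and total fields are assumptions made for illustration:

```python
# Hypothetical sketch: average basket value per weather condition.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["shop"]  # hypothetical database name

pipeline = [
    {"$group": {
        "_id": "$weather",                 # e.g. "rain", "sun", "snow"
        "orders": {"$sum": 1},             # number of orders per condition
        "avg_basket": {"$avg": "$total"},  # average order value
    }},
    {"$sort": {"avg_basket": -1}},         # biggest baskets first
]
for row in db.orders.aggregate(pipeline):
    print(row)
```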
MongoDB is an ideal platform for storing this data. It has the flexibility to cope with both unstructured and structured data, can scale to many petabytes with full replication, and offers rich support for analytics through its aggregation framework. Even if there is no current interest in analysing the data, the business case for analytical tools will be much easier to make in the future if there is a large reservoir of data to draw on, rather than just an idea to grow from scratch.
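For a sense of what the replication claim looks like in practice, here is a sketch of connecting to a three-node replica set with PyMongo; the host names and the replica-set name rs0 are placeholders:

```python
# Connecting to a three-node replica set (placeholder hosts and set name).
from pymongo import MongoClient

client = MongoClient(
    "mongodb://db1.example.com,db2.example.com,db3.example.com/"
    "?replicaSet=rs0&readPreference=secondaryPreferred"
)
# Writes always go to the primary; with secondaryPreferred, reads can be
# served by secondaries, keeping analytical load off the operational node.
print(client.admin.command("ping"))
```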
The businesses that win in the future will be those that know how to harvest all the data available to them. The sooner they start storing it and practising how to glean the most from it, the sooner they can learn to anticipate their customers' needs, pre-empt the marketplace, cope with the veritable deluge of data in a highly responsive manner, and leave their less-informed competition behind.