DevOps: the Solution to Disintermediation
Around 1986/87, I approached a motorcycle courier and asked him, “What impact, if any, has the fax machine had on your business?” His reply surprised me: “It has been the best thing that ever happened – people now expect instant delivery, but the fax is not a valid original.”
At first glance this seemed counter-intuitive: I expected the fax machine to kill the couriers, or to have no impact at all; I certainly didn’t expect it to have the positive impact he described. Suddenly people knew they could get their hands on a facsimile of a document instantaneously, and so they came to expect instant gratification, instant delivery. And so they would call on couriers rather than the postal service to ensure they got the documents quickly. The fax machine changed the paradigm.
Similarly, the proliferation of SaaS (Software as a Service) offerings like Salesforce, Yammer, Marketo, Gmail, Clarizen, Zoho and Zuora has changed end users’ expectations of what constitutes an acceptable timeframe for the delivery of new systems or modifications. Marketing, Finance, HR and other teams now expect new functionality to be available in days, weeks or months rather than quarters and years. They are making demands of their IT teams that are, in the eyes of traditional CIOs, unconscionable, impossible, reckless. This leads to frustration and tension, and a strong desire to bypass the IT department, a process that has become known as disintermediation: cutting out the middle man.
This process is so well established that Gartner stated in early 2012 that by 2017 marketing departments would be spending more on IT systems than CIOs. The trouble is that marketers are good at marketing, not IT, and so they will end up creating siloed “crapplications” rather than extensions of a single source of truth. The result will be a plethora of disjoint, standalone copies of customer databases, product catalogues, charts of accounts and the like scattered throughout the enterprise.
CIOs may well look at this and feel frustrated and concerned. How do they convey the risk being taken when architectural design is bypassed in pursuit of instant gratification? How do they get back in control of the systems agenda? How do they learn to respond to the accelerated expectations of business departments that cannot afford to stand by and watch their business eroded by far more responsive newcomers?
The answer, I believe, largely lies in moving to a DevOps model in which the dichotomy between software development and operations infrastructure provisioning disappears, and hardware and any required software are provisioned on demand through scripting and automation. The DevOps approach enables continuous deployment, in some cases up to 75 times per day. This model aligns with agile methods such as Scrum, and supports the disposable “cattle” approach to infrastructure provisioning that has become synonymous with Cloud and Big Data.
Critics may argue that such approaches are risky, even reckless: that new versions must be thoroughly tested before seeing the light of day, and that new systems and changes to hardware environments must go through rigorous testing before being released. Yet the traditional alternative carries its own change-management complexity, and therefore risk: keeping development, staging and production environments in sync is difficult when major changes are bundled together and released in one large batch.
It is perhaps paradoxical, then, that systems developed using DevOps principles are delivered faster and at lower risk. This is due to the combination of:
- Being able to continuously deploy updates using automated deployment systems like Jenkins;
- Provisioning environments from scripts using the likes of Chef, Puppet, SaltStack or Ansible, which helps guarantee that all the environments (development, staging and production) are materially the same (a minimal parity check of this kind is sketched after this list);
- Autonomous failover and recovery systems that know how to self-repair or self-replace parts of the environment in real time;
- Agile software practices that are designed to deal iteratively with intrinsic and extrinsic problems and with scaling needs, and that can respond to changes in direction;
- Systems being designed with testing as a first foundation rather than an afterthought; and
- Systems being designed from the ground up to cope with fallible, unreliable infrastructure.
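As a rough illustration of the second point, a scripted parity check might look something like the sketch below. It uses plain Python rather than a real configuration-management tool, and the environment manifests and package versions are invented for the example:

```python
# Minimal sketch: detect drift between environment manifests so that
# development, staging and production remain materially the same.
# In practice a tool such as Chef, Puppet, SaltStack or Ansible would own
# these definitions; the packages and versions below are invented.

MANIFESTS = {
    "development": {"nginx": "1.24", "python": "3.11", "app": "2.3.0"},
    "staging":     {"nginx": "1.24", "python": "3.11", "app": "2.3.0"},
    "production":  {"nginx": "1.24", "python": "3.10", "app": "2.2.9"},
}

def drift(reference: str, environments: dict) -> dict:
    """Return, per environment, the packages whose versions differ from the reference."""
    baseline = environments[reference]
    report = {}
    for env, packages in environments.items():
        if env == reference:
            continue
        diffs = {pkg: (version, packages.get(pkg))
                 for pkg, version in baseline.items()
                 if packages.get(pkg) != version}
        if diffs:
            report[env] = diffs
    return report

if __name__ == "__main__":
    for env, diffs in drift("development", MANIFESTS).items():
        print(f"{env} differs from development: {diffs}")
    # -> production differs from development:
    #    {'python': ('3.11', '3.10'), 'app': ('2.3.0', '2.2.9')}
```

A check like this is trivially cheap to run on every deployment, which is precisely why scripted environments drift far less than hand-built ones.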
CIOs often want to wait “until Cloud matures” before they move workloads or change practices. Such a wait-and-see approach can doom them to failure, or at least consign them to a much more difficult road into the new world. Marketers and other functional heads are not going to wait, and once systems become disparate and disconnected as a result of those departments going it alone, it will be increasingly difficult for CIOs to introduce DevOps processes and re-establish themselves as the credible go-to resource for any form of IT-related system.
Unfortunately, too many CIOs have been convinced by Cloud vendors who claim that the only true Cloud is a public, multi-tenanted one. I myself felt that way in the early days. The reality is that the location of the system is a secondary matter: Cloud and DevOps principles apply regardless of whether the systems are on premise or off premise. Too many CIOs see their biggest challenge as deciding whether to run IT on premise or in the Cloud, but this is a false dichotomy, a false choice. It is not a question of on-premise OR Cloud; there are two separate choices: one between on-premise and off-premise, and the other between Cloud and not Cloud, DevOps and not DevOps.
Chief Marketing Officers, Financial Officers and HR Officers are not going to wait for their visions to be implemented by IT teams encumbered by old practices, nor should they. They need to be able to rely on their CIOs to deliver what they need when they need it. DevOps and Cloud technologies will empower IT departments to be enablers leading the organisation forward into the new world, rather than roadblocks preventing the business from moving ahead.
Data Generation is Growing – Start Storing Everything
The world is generating more data than ever before.
In 2013 IBM reported that 90% of the world’s data had been generated in the last two years, and this trend is continuing. So what is causing this explosion of data?
Several major factors have contributed. Chief among these is the changing way we interact with computers. The first generation of data capture involved rigorously prepared data being hard-wired physically into the machine by engineers. The second generation involved professionally trained computer operators who would feed data into the machines at our request, with software designed to enforce constraints so the machine knew how to process the data. The third generation saw everyone entering their own data: with Web 2.0, the interactive Web, people were given the freedom to capture whatever they wanted, and the amount of data captured exploded. Now we are in a fourth generation of data capture: the Internet of Things and machine-to-machine communication. Gartner has projected that there will be 26 billion connected devices by 2020, and that the number of sensors will be measured in the trillions.
The price of storing all the data we are generating has fallen dramatically. The first hard disk drive, IBM’s RAMAC 305, was launched in 1956 at a cost of $50,000, or about $435,000 in today’s dollars, and it stored 5 megabytes. To put that in context, storing 4 terabytes of data with that technology would cost roughly $350 billion today, and the drives would occupy a floor area 2.5 times that of Singapore. Today, of course, a 4-terabyte drive can be purchased for little more than a hundred dollars and fits in the palm of a hand.
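A quick back-of-envelope check of that cost figure (a rough sketch in Python; it uses decimal units, so 1 terabyte is taken as 1,000,000 megabytes):

```python
# Back-of-envelope cost of storing 4 TB at RAMAC-era prices.
ramac_capacity_mb = 5          # one drive stored 5 megabytes
ramac_cost_usd = 435_000       # 1956 price expressed in today's dollars
target_mb = 4 * 1_000_000      # 4 terabytes in megabytes (decimal units)

drives_needed = target_mb / ramac_capacity_mb      # 800,000 drives
total_cost_usd = drives_needed * ramac_cost_usd    # ~$348 billion

print(f"{drives_needed:,.0f} drives, ~${total_cost_usd / 1e9:,.0f} billion")
# -> 800,000 drives, ~$348 billion
```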
With storage costs largely addressed, the biggest remaining challenge is data processing and analysis. There are two issues here. Firstly, traditional database systems are designed to run on one machine: larger databases imply scaling up to bigger machines, but the amount of data now available has outstripped our capacity to process it using traditional methods on a single computer; the computers are simply not powerful enough. Secondly, data has become more structurally complex, and traditional database designs, which rely on the structure of the data being predefined during the design phase, no longer cope with the flexibility required when people and systems evolve to use data in unpredictable ways.
Until the rise of next-generation database systems like MongoDB, these limitations resulted in a lot of data being thrown away: what is the point of storing something if you cannot make sense of it? MongoDB has helped change all that. It is inherently designed to work across many computers, enabling it to handle vastly larger amounts of data. Furthermore, the structure of the data does not have to be defined in advance: MongoDB allows for the storage of anything, and yet patterns and sense can be gleaned regardless of the structure.
For example, in a database of customers, we may be able to store information about each customer that is highly specific to them as individuals: their pets, their hobbies and special interests, places they have visited, books they have read, companies where they have worked. We may not yet know what insights we can glean from that information, but with MongoDB we can store it today and make sense of it in the future.
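To make that concrete, here is a minimal sketch using the pymongo driver. It assumes a MongoDB instance running locally, and the database, collection and field names are purely illustrative:

```python
from pymongo import MongoClient  # pip install pymongo

client = MongoClient("mongodb://localhost:27017")  # assumes a local MongoDB instance
customers = client["crm"]["customers"]             # hypothetical database and collection

# Each document carries only the fields we happen to know about that customer;
# no schema has to be declared up front.
customers.insert_many([
    {"name": "Alice", "pets": ["border collie"], "hobbies": ["sailing"]},
    {"name": "Bob", "places_visited": ["Singapore", "Reykjavik"], "employers": ["Acme Corp"]},
])

# We can still query on fields that only some documents contain.
for doc in customers.find({"hobbies": "sailing"}):
    print(doc["name"])
```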
Given that tools now exist to handle very large and less structured datasets, and that storage costs have fallen so dramatically, it stands to reason to start storing everything. Businesses of the past were valued on their brand awareness; businesses of the future will increasingly be valued on how well they can use the data at their disposal to understand each customer, improve their efficiency and responsiveness, and make the best decisions.
Storing data today for use tomorrow makes a great deal of business sense. Once data has been collected, questions about how to use it will naturally ensue. Without the data, people will not even think of the questions they could be asking, such as:
- What would be the optimum price for this product?
- How soon should we follow up a customer after they have purchased item X?
- What products act as good loss leaders?
- What are the signs that indicate a customer will churn?
- When a customer churns, who are the people most at risk of following them?
- What impact does the weather have on purchasing patterns?
MongoDB is an ideal solution for storing this data. It has the flexibility to cope with both unstructured and structured data, can scale to many petabytes with full replication, and has a great deal of support for analytics. Even if there is no current interest in analysing the data, the business case for analytical tools will be much easier to make in the future if there is already a large reservoir of data to draw on, rather than just an idea to grow from scratch.
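To take the last of the questions above as an example, a first cut at relating weather to purchasing patterns could be a simple aggregation. The sketch below assumes a hypothetical orders collection in which each sale records its amount and the weather observed at the time of purchase; the names are illustrative, not prescriptive:

```python
from pymongo import MongoClient  # pip install pymongo

client = MongoClient("mongodb://localhost:27017")  # assumes a local MongoDB instance
orders = client["crm"]["orders"]                   # hypothetical collection of sales records

# Group sales by the weather recorded at the time of purchase and compare
# order counts and revenue under each condition.
pipeline = [
    {"$group": {
        "_id": "$weather",               # e.g. "sunny", "rain", "snow"
        "orders": {"$sum": 1},
        "revenue": {"$sum": "$amount"},
    }},
    {"$sort": {"revenue": -1}},
]

for row in orders.aggregate(pipeline):
    print(row["_id"], row["orders"], round(row["revenue"], 2))
```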
The businesses that win in the future will be those that know how to harvest all the data available to them. The sooner they start storing that data and practising how to glean the most from it, the sooner they can learn to pre-empt their customers’ needs, pre-empt the marketplace, cope with the veritable deluge in a highly responsive manner, and leave the less-informed competition behind.