Let me begin by saying that I’m impressed by the work McKinsey has done on cloud adoption. The article I’m focusing on is quite right to assert that ‘companies tend to fall into the trap of confusing simply moving IT systems to the cloud with the transformational strategy needed to get the full value of the cloud.’
This is something we see all the time at Capacitas: it’s incredibly common to come across large enterprises that have switched from physical servers to the cloud without the mentality shift required to really make the most of the opportunity.
As McKinsey takes the lead on showing how cloud can accelerate IT modernisation, one of the elephants in the room remains cost. So, in this article, I touch on the tools and skills that Capacitas uses to help you follow McKinsey’s advice and also become a more cost-effective organisation.
1. The ‘Self-Healing’ Nature of the Cloud
In the piece, the author dives into the benefits of automating IT processes in the cloud. Two of these benefits are the reduction of IT outages and incidents through autoscaling and the ability to add capacity automatically, for example by allocating more storage to a database.
In principle, there isn’t a problem with this; of course, harnessing automation to prevent reputation-killing IT incidents is never a bad thing. All that being said, it does come with a caveat. If you’re going to use automation in this way, you need to understand exactly why the extra capacity was added, what it’s being used for, and when it’s no longer needed.
Often, new adopters of the cloud simply don’t do this. The result is that costs – for capacity that you potentially don’t need – quickly spiral and you’re left with a hefty bill. On top of that, you won’t have addressed the root cause of the problem, so, the minute you pare back the extra capacity, you’re back at square one.
So, what can you do to avoid this situation?
Well, your developers can keep doing what they do best – building high-performance and reliable solutions that use capacity to scale when needed. But, when it comes to keeping a handle on costs, Capacitas can help with two things.
It’s not what people want to hear, but the best way to protect yourself against IT outages – and deliver real cost savings – is to tackle software inefficiencies during application design.
This should be done in two ways.
First, design and build well-architected applications with end-to-end efficiency in mind. Set cost-efficiency targets for your developers, then watch for any differences between the test and live environments.
Second, you need to introduce the concept of demand modelling. Both necessitate a tighter alignment between business, architecture, product and development teams, to ensure that cloud applications are built to scale and perform in unison with business demand. The bottom line: well-architected, high-performance code reduces reliance on self-healing and, as a result, lowers cloud costs.
In existing environments, cloud optimisation is somewhat trickier as applications are already in production. However, in this scenario, step changes in cost caused by cloud self-healing offer an ideal opportunity to identify where optimisation is possible. Each step change should be sanity-checked using demand modelling techniques: did demand really grow enough to justify the extra capacity? This makes it possible for developers to improve the end-to-end architecture and optimise applications to scale according to demand, with less reliance on self-healing.
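To make the sanity check concrete, here is a minimal sketch in Python. It is not a Capacitas tool: the function name, the 20% threshold and the sample figures are all assumptions made purely for illustration. It flags days on which cost grew noticeably faster than demand, which is the kind of step change worth investigating.

```python
def unexplained_cost_steps(daily_cost, daily_demand, threshold=0.2):
    """Return the indices of days where cost growth exceeded demand growth
    by more than `threshold` (a fraction, so 0.2 means 20 percentage points).

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    flagged = []
    for i in range(1, len(daily_cost)):
        # Skip days where the previous value gives no meaningful growth rate.
        if daily_cost[i - 1] <= 0 or daily_demand[i - 1] <= 0:
            continue
        cost_growth = daily_cost[i] / daily_cost[i - 1] - 1
        demand_growth = daily_demand[i] / daily_demand[i - 1] - 1
        if cost_growth - demand_growth > threshold:
            flagged.append(i)
    return flagged


# Illustrative numbers only, not real customer data.
cost = [100.0, 100.0, 150.0, 150.0]   # daily cloud spend
demand = [1000, 1000, 1020, 1020]     # daily business transactions
print(unexplained_cost_steps(cost, demand))  # → [2]
```

Here cost jumps by 50% on day 2 while demand rises by only 2%, so that day is flagged. A jump that tracks demand would not be.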
2. The $30m Figure for ‘Typical Cloud Spend’
I have no doubt that the $30m figure for typical cloud spend once a large organisation has reached maturity in the cloud was reached after extensive research. What’s more, I acknowledge that McKinsey does state that it’s a ‘rule of thumb’, not a hard and fast figure.
Nevertheless, the idea that there is a ballpark figure that organisations should be aiming for could be taken the wrong way. It sounds obvious, but every business really is different. What’s right for your organisation will depend on your business workloads. Naturally, this could be substantially more – or less – than $30m and is subject to constant flux as your business grows.
Instead of trying to pin your cloud spend to one headline figure, you need to identify a figure that is optimal for your business, based on building an end-to-end system model. Forecasting alone is only ever likely to give you an inaccurate figure; the only variable is just how inaccurate that figure is. By contrast, developing a robust model during the engineering stage that includes business demand, app demand, resource demand, and the costs associated with all three will give you a far more accurate and useful number.
As an example, Ancestry’s board and their investors set the senior management team a challenge to deliver 20% cost savings on their annual AWS spend within a 12-month window. In less than 6 months, using modelling, we helped Ancestry identify a further 20% – double their original headline figure.
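For readers who want a feel for what such a model involves, the sketch below chains business demand to app demand, resource demand and cost. It is a deliberately simplified illustration, not the model used in any real engagement: the `Component` fields, the CPU-only view of resources and every number in it are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """One part of the system (hypothetical structure for illustration)."""
    name: str
    requests_per_transaction: float  # app demand per business transaction
    cpu_seconds_per_request: float   # resource demand per request
    price_per_cpu_hour: float        # illustrative unit price, not a real tariff


def cost_per_transaction(components):
    """Cost of serving one business transaction across all components."""
    return sum(
        c.requests_per_transaction * c.cpu_seconds_per_request / 3600
        * c.price_per_cpu_hour
        for c in components
    )


def monthly_cost(transactions_per_month, components):
    """Business demand multiplied by the unit cost of a transaction."""
    return transactions_per_month * cost_per_transaction(components)


# Made-up figures: 3.6 million transactions a month over two components.
system = [
    Component("web", requests_per_transaction=2, cpu_seconds_per_request=0.5,
              price_per_cpu_hour=0.10),
    Component("database", requests_per_transaction=4, cpu_seconds_per_request=0.9,
              price_per_cpu_hour=0.20),
]
print(round(monthly_cost(3_600_000, system), 2))
```

The value of a model like this is that you can change one input, such as transactions per month or CPU seconds per request, and see how the cost moves, rather than extrapolating last year’s bill.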
3. Engineers and Cost Efficiency
Dealing with the need for up-skilling of development teams when moving to the cloud, the article suggests that engineers need to understand compute, storage, and cloud security. And this is entirely correct. But, it’s only a very narrow subset of what your dev team need to understand in order to take full advantage of the cloud.
My colleague covered this in a recent post, but, to get the most from the switch to cloud, your developers also need to understand costs and what drives them.
Without a more business-oriented understanding of the cloud, there is little incentive for developers to write cost-efficiency into applications – something that’s very hard to fix retroactively and will pose problems down the road. After all, cloud capacity seems limitless, and, if an application uses more than intended, the dev team can always just spin up some more.
To counter this, you need to engage in modelling that goes beyond financial forecasting. Your engineering team need to understand the impact of business, application and resource demand, and the costs associated with each.
4. The Importance of an End-to-end View
This is something the article doesn’t mention at all, but it should be a key consideration for any organisation using cloud technology. Large businesses usually mean complex environments, with multiple teams acting independently of each other.
While this autonomy is a necessary part of business, it can spell trouble when you add the cloud into the equation. It’s all too easy for teams acting independently to lose sight of the overall picture, and this can be particularly problematic when it comes to cloud costs.
Addressing this requires an understanding of all of the components involved in a process and the costs associated with each as demand increases. Take, for example, a transaction on a webpage. The process may look something like this:
Transaction on-page – web server – application server – database – third-party system
Understanding the demand at each stage of these processes, and what happens when it scales, is crucial to accurately modelling and predicting what your costs will look like. The data to build these models is typically spread across multiple data sources in an enterprise environment, such as APM, resource monitoring and business monitoring tools: for example, Datadog, AppDynamics, New Relic and Google Analytics.
The ability to pull this data together and make sense of it is how Capacitas has managed to double the headline savings for customers such as Ancestry and Skype.
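As a rough illustration of what joining these sources can look like, the sketch below combines three hourly exports into one view. The data shapes, function name and unit price are hypothetical; real monitoring tools each have their own export formats and APIs, which are not shown here.

```python
def join_sources(business, apm, resources, cost_per_cpu_hour):
    """Combine hourly figures from three sources into per-hour ratios.

    All inputs are plain dicts keyed by hour (an assumed format):
      business  -> orders per hour (business monitoring)
      apm       -> requests per hour (APM)
      resources -> CPU-hours consumed (resource monitoring)
    """
    rows = []
    # Only hours present in all three sources can be joined.
    for hour in sorted(business.keys() & apm.keys() & resources.keys()):
        orders, requests = business[hour], apm[hour]
        if orders == 0 or requests == 0:
            continue  # ratios are undefined without demand
        cost = resources[hour] * cost_per_cpu_hour
        rows.append({
            "hour": hour,
            "requests_per_order": requests / orders,
            "cpu_hours_per_1k_requests": resources[hour] / requests * 1000,
            "cost_per_order": cost / orders,
        })
    return rows


# Made-up sample data for two hours.
business = {"09:00": 1200, "10:00": 1500}
apm = {"09:00": 6000, "10:00": 9000}
resources = {"09:00": 3.0, "10:00": 6.0}

for row in join_sources(business, apm, resources, cost_per_cpu_hour=0.05):
    print(row)
```

In this made-up data, orders grow by 25% between the two hours while cost per order rises by 60%, which is the sort of signal that only shows up once business, application and resource data sit side by side.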
5. AI and Machine Learning Apps Must be Well-Architected
Finally, the article mentions that many ‘born digital’ companies – having initially owned their own IT infrastructure – are opting for the cloud due to the scalability, flexibility and higher-order functionality it offers. Of these businesses, Spotify is held up as a prime example of a successful cloud move.
While there’s no doubt that the way in which companies like Spotify have approached cloud has yielded early success, there’s an important element which is often ignored. Businesses with high processing demands – like Spotify – use ‘data full’ or streaming apps and are increasingly harnessing AI and machine learning tools to this end.
From a customer experience and technical standpoint, this is excellent. Service users get vastly improved service and the business benefits from massive efficiency gains.
However, what doesn’t seem to come into the conversation is cost. Because of the huge datasets involved and the processing they require, newer applications such as AI and machine learning will dramatically increase compute and storage workloads. This has the potential to lead to significant increases in cloud bills as the demand for more processing is answered with more capacity.
You need to know how much that processing is costing you and how much value it’s giving. The only way to do this is to join the dots through end-to-end modelling. In a recent customer engagement, for example, we found that 90% of processing was being carried out on data that the users were not interested in.
So what can be done? Well, much like the other points in this article, the key is to ensure these applications are well-architected and cost-optimised from day one. This means understanding the key drivers of cloud cost and architecting AI apps for performance, efficiency, and cost optimisation.
In practice, this means accurate modelling and forecasting is needed to validate the business case for any adoption of AI or ML. Modelling what demand and consumption will look like as you scale datasets can go a long way to ensuring you avoid unexpected costs.
For insight on where to begin with modelling for cloud costs, a great place to start is our Seven Pillars of Software Performance framework. Alternatively, to get an idea of how cost-efficient your cloud architecture is, try our cloud diagnostic tool.