Ancestry has long been the leading brand in both family history and consumer genomics. Their network of over 10 million people is twice that of their nearest competitor and they were the first organisation to process 1 million, 5 million, and most recently 10 million DNA samples. They boast more than 3 million subscribers and over 1 million international customers.
Ancestry also has the largest online collection of family history records globally.
Ancestry has experienced rapid global growth in recent years, growing revenues to $1bn in 2017, an increase of 30% over the previous year. To scale their IT systems to meet this growth, Ancestry decided to move away from the limitations of physical data centres to the elastic scale, flexibility and agility of AWS. However, during the migration it became apparent that infrastructure costs were higher, and increasing faster, than had been forecast. To address this, a cost reduction program was started, and the Ancestry team made some early savings by using optimisation tooling (such as Cloudability) to identify and switch off unused capacity, and by purchasing AWS Reserved Instances where appropriate. Even so, costs were still far higher than desired or expected and would inevitably continue to increase as the migration continued, so it was decided that further, more radical action needed to be taken.
The board and their investors set the Ancestry senior management team a challenge to deliver 20% cost savings on their annual AWS spend within a 12-month window.
How would the Ancestry team achieve these cost reductions within that time period without delaying the customer migration program or impacting live customer experiences?
Ancestry’s live environment in AWS is made up of 15,000+ hosts spread over 600+ platforms. The live environment supports 75 million page views per day at peak, and a failure in any one of the 600+ platforms can result in serious customer impact due to widespread, complex and (at times) unknown interdependencies.
“Within 2 months, a significant amount of AWS cost reduction has been delivered in live. Working together with Capacitas we have made our services more stable and are well positioned to deliver further cost optimisation.”
Niraj Nagrani, SVP Products and Platforms, Ancestry
Performance and capacity experts Capacitas were brought in by the Ancestry leadership team, off the back of the success of a previous, similar engagement at Skype. Following their cost optimisation process (see diagram below) and using their unique “shapes based” data analytics tooling, Capacitas quickly identified a range of rightsizing and efficiency opportunities that would deliver considerable cost savings without introducing risk to live services.
Capacitas’s cost optimisation engagements to date show that a combination of rightsizing and efficiency gains typically accounts for over 70% of the total cost reductions in a cloud platform, with the remainder (under 30%) achieved through removing idle capacity and buying Reserved Instances.
In less than 6 months, and after analysing fewer than half of Ancestry’s services, Capacitas had identified opportunities to deliver 40% of the total cost savings target.
Using their unique data analytics tooling and library of efficiency benchmarks to analyse CloudWatch and New Relic (APM) data, Capacitas were able to understand the risk levels of these optimisations and produce a prioritised roadmap of activity. This roadmap categorised optimisations by potential cost saving and by risk level. Approximately 66% of the optimisations were rated as low to medium risk and were fast-tracked into live, while the remaining high-risk or more complex optimisations were put into a backlog to go through testing before they went into live.
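The categorisation described above can be sketched in a few lines of Python. This is a minimal illustration only, not Capacitas's tooling: the `Optimisation` class, the `build_roadmap` function, the candidate names and the saving figures are all hypothetical, and the rule that low and medium risk items are fast-tracked while high risk items go to a test backlog is taken directly from the description above.

```python
from dataclasses import dataclass

# Assumed ordering of the three risk levels named in the case study.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}
# Low and medium risk items are fast-tracked; everything else is tested first.
FAST_TRACK_RISKS = {"low", "medium"}


@dataclass(frozen=True)
class Optimisation:
    name: str                  # illustrative label, not a real Ancestry service
    monthly_saving_usd: float  # estimated saving (made-up figures)
    risk: str                  # "low", "medium" or "high"


def build_roadmap(candidates):
    """Split candidates into a fast-track list and a test backlog.

    Within each list the largest estimated saving comes first; ties are
    broken by putting the lower risk item first.
    """
    for c in candidates:
        if c.risk not in RISK_ORDER:
            raise ValueError(f"unknown risk level: {c.risk!r}")
    ranked = sorted(
        candidates,
        key=lambda c: (-c.monthly_saving_usd, RISK_ORDER[c.risk]),
    )
    fast_track = [c for c in ranked if c.risk in FAST_TRACK_RISKS]
    backlog = [c for c in ranked if c.risk not in FAST_TRACK_RISKS]
    return fast_track, backlog


# Hypothetical candidates for illustration.
candidates = [
    Optimisation("rightsize-search-tier", 12000.0, "low"),
    Optimisation("consolidate-cache-nodes", 30000.0, "high"),
    Optimisation("tune-autoscaling-floor", 8000.0, "medium"),
]
fast_track, backlog = build_roadmap(candidates)
print("Fast track:", [c.name for c in fast_track])
print("Backlog:", [c.name for c in backlog])
```

Ranking by saving within each risk band keeps the highest-value, lowest-risk changes at the top of the fast-track list, which is one plausible way to read "prioritised"; a real roadmap would likely weigh complexity and dependencies as well.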
Within 2 months, a significant amount of AWS cost had been removed through the changes implemented in live. A further set of optimisations had been tried but backed out because they caused complications. This was a vital step for the teams, as they gained valuable information about the weaknesses and unknown dependencies of the platforms they had optimised. The information was used to create a backlog of key technical debt that had to be addressed in order to make the services more stable and to deliver more cost savings.
Capacitas supported the Ancestry teams every step of the way, providing data and expertise to identify and mitigate any risks in the platform before, during and after the optimisations. Ancestry’s engagement with Capacitas continues, and the optimisation program is on target to identify and implement the remainder of the required savings.
As well as the huge cost savings implemented and identified to date, there have been other significant benefits for Ancestry. The teams have uncovered and removed risks that had been hidden in costly, oversized cloud environments, thereby improving the overall quality of their platforms and ensuring they are capable of scaling for continued business growth. This improved understanding of their technical stacks, system interdependencies and how customers use their services has enabled Ancestry to be more agile and more confident in making faster, less risky changes.
If you want to see big boosts to performance, with risk managed and costs controlled, then talk to us now to see how our expertise gets you the most from your IT.