Government web site goes down at a critical time
Just before the deadline for registering to vote, the government web site that handles this process collapsed under the load. It could hardly have come at a more critical time, and before long the blame was flying.
Jeremy Corbyn: “I'm told the register to vote site has crashed, so people can't register to vote for #EUreferendum. If so, the deadline has to be extended”
Tim Farron described the crash as a “shambles” and blamed the government: “With individual voter registration, and a big campaign to encourage young people to register, many of whom have been trying to do so last-minute, this could have major consequences for the result.”
This should not have been a surprise – there was much talk in the press of a last-minute surge being expected. For example, this, again from the Guardian: Surge in voter registrations expected before EU referendum deadline
The exact level of demand in cases like this is clearly difficult to predict. How many of the millions of unregistered voters will try to register at the last minute? What can be done to address these kinds of situations, where a large demand is expected?
What could have been done?
Capacitas has a wealth of experience in helping our customers prepare for peak demand, and there are several techniques that can usefully be deployed.
One thing the site got right is that it was well metered. The people responsible had access to good information to help them understand the performance of the site, and have made that information publicly available.
Testing the service on the actual live systems is a powerful way of identifying how much demand the site can handle. It also shows up any bottlenecks that might result in a lower than expected capacity. Does the service scale well? How much load can it take? Capacitas has considerable experience in safely testing systems in this way, helping our clients deliver maximum sales on the big day.
When a site goes down, no one gets served. Queueing is not ideal, but it is clearly better than an outage: place any excess of customers in a queue, and process the maximum number that can be dealt with, as soon as possible. For retailers a queue would represent some lost revenue, but in cases like this it is clearly better to keep the site serving as many people as possible.
There is a myth that if your service runs in the cloud, you don’t need to worry about capacity. This is not true. While you may be able to scale up your platform fast, you need to be sure that your services will also scale up smoothly (see the section on load testing above).
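The queueing idea above can be sketched in a few lines of Python. This is a minimal illustration only, not a description of how the voter registration site or any Capacitas solution works; the class name `AdmissionQueue` and the `capacity` parameter are hypothetical, and a real site would derive the capacity figure from load testing.

```python
from collections import deque


class AdmissionQueue:
    """Serve at most `capacity` users at once; hold the rest in arrival order."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed safe concurrency, e.g. from a load test
        self.active = set()        # users currently being served
        self.waiting = deque()     # users held in the queue, oldest first

    def arrive(self, user_id):
        """Return True if the user is served now, False if placed in the queue."""
        if len(self.active) < self.capacity:
            self.active.add(user_id)
            return True
        self.waiting.append(user_id)
        return False

    def finish(self, user_id):
        """Free a slot and admit the longest-waiting user, if there is one."""
        self.active.discard(user_id)
        if self.waiting and len(self.active) < self.capacity:
            next_user = self.waiting.popleft()
            self.active.add(next_user)
            return next_user
        return None
```

With a capacity of two, a third arrival is queued rather than served, and is admitted as soon as one of the first two finishes. The site keeps working at its known limit instead of failing for everyone.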
This update added on 9th June 2016.
Examining the data available on the site performance, it is clear that the page load time increased considerably BEFORE the site went down:
This is exactly the sort of warning sign that would be used to trigger defensive measures during live operation, or backing off during a production load test. These stats also show the relative scale of the peak:
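As a rough sketch of such a trigger, the Python below raises an alarm when the median of recent page load times climbs well above a normal baseline. The class name `LatencyAlarm`, the baseline, the multiplier and the window size are all assumptions chosen for illustration; they are not taken from the published site statistics.

```python
from collections import deque
from statistics import median


class LatencyAlarm:
    """Flag sustained slow page loads relative to a known baseline."""

    def __init__(self, baseline_s, factor=3.0, window=5):
        self.baseline_s = baseline_s          # assumed normal page load time, in seconds
        self.factor = factor                  # how many times slower counts as a warning
        self.samples = deque(maxlen=window)   # most recent measurements only

    def record(self, load_time_s):
        """Add a measurement; return True when the alarm should fire."""
        self.samples.append(load_time_s)
        if len(self.samples) < self.samples.maxlen:
            return False                      # not enough data yet
        # The median ignores a single slow outlier, so only a sustained rise fires.
        return median(self.samples) > self.factor * self.baseline_s
```

In live operation, a `True` result might switch on a queue; in a production load test, it would be the signal to back off the load.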
This shows that the ratio of the peak load to the typical load is about 50:1.
While this is a very strong peak, it is not impossible to deal with, and many sites have to deal with surges in demand of a similar scale. For example, such events are almost routine for sites that sell tickets for popular concerts.
Stay tuned for further developments! We will be producing another blog post once we see how the voter registration site holds up at the new deadline.
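To show what a 50:1 peak means for sizing, here is a back-of-the-envelope calculation in Python. Only the 50:1 ratio comes from the statistics discussed above; the function name `required_servers`, the typical request rate, the per-server throughput and the headroom are hypothetical figures used purely for illustration.

```python
import math


def required_servers(typical_rps, peak_ratio, per_server_rps, headroom=0.25):
    """Estimate the servers needed to carry the peak with some spare capacity."""
    peak_rps = typical_rps * peak_ratio      # expected peak demand
    target_rps = peak_rps * (1 + headroom)   # add a safety margin on top
    return math.ceil(target_rps / per_server_rps)


# Hypothetical inputs: 10 requests/s on a typical day, 100 requests/s per server.
servers_at_peak = required_servers(typical_rps=10, peak_ratio=50, per_server_rps=100)
```

An estimate like this assumes the service scales linearly as servers are added, which is exactly the assumption a load test is needed to check.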
If you would like more information on how Capacitas can help your organisation prepare for a peak, and deliver great results, then please click here.
If you would like to learn more about our Prepare for Peak and Performance testing solutions, please click below to see our latest ebook.