Load Testing Lessons to Learn from Black Friday 2014
Black Friday is over for another year and, as always, this critical e-commerce day has left a few casualties in its trail…
Let’s take a look at some of this year’s biggest Black Friday crashes on both sides of the world:
USA: Best Buy’s Black Friday Outages
The world's largest consumer electronics retailer Best Buy went down twice within 24 hours on Black Friday, triggering a flood of complaints on social media, losses in sales, and a drop in the company’s shares from 2.5% to 1.7%.
Best Buy’s spokeswoman Amy Von Walter attributed the outages to unprecedented traffic on the site.
"A concentrated spike in mobile traffic triggered issues that led us to shut down BestBuy.com in order to take proactive measures to restore full performance."
Amy Von Walter, Best Buy Spokeswoman.
The UK: Tesco, Game, Argos, Boots, John Lewis and Currys PC World
Newer to Black Friday website traffic chaos than their American counterparts, scores of the UK’s biggest online stores struggled to keep up with demand.
Annual Black Friday sales, which have become increasingly popular in England over the past four years, brought down the websites of top UK retailers like Tesco, Argos, John Lewis and Currys.
UK supermarket giant Tesco suffered some of the biggest losses when its website went down for the first 12 hours of Black Friday. As with Best Buy, Tesco attributed the problems to a huge traffic spike:
“Last night we saw five times more customers visiting our website for our Black Friday sale than last year. Due to this massive demand the website was temporarily unavailable but it is now available for customers to use.”
What Can We Learn From These Critical Web Outages?
Both Tesco and Best Buy (and probably many other e-commerce sites!) made the same key mistake: they underestimated the load.
Don’t Underestimate the Load
As I outlined in a previous blog post, ‘Will Your Web or App Fail this Black Friday?’, underestimating the traffic likely to come to your site is a mistake that’s easy to make but hard to recover from. On the big day itself, Tesco had five times more customers than the previous year and Best Buy also noted ‘record levels of website traffic’, and neither site was able to cope.
Fortunately, there is a relatively simple solution to this one. When conducting your load testing before Black Friday (or any other day that’s likely to see high traffic loads), deliberately push your system to the point of failure.
How do you do this? Don’t ever be satisfied with a successful test. Run a sequence of tests while continually increasing the load and monitor the hits/s throughout as you do this. Keep on doing it until you hit a scaling problem.
For example, when you chart hits per second against the number of virtual users, you can see where your system reaches its capacity: the point at which adding more users no longer increases the hits per second rate (in this example, after 300 virtual users).
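The same plateau can be found programmatically from the results of a stepped test. The sketch below is a minimal illustration, not a feature of any particular load testing tool: the function name, the 5% threshold and the sample numbers are all assumptions chosen to mirror the 300-user example above.

```python
from typing import Optional, Sequence, Tuple


def find_saturation_point(
    steps: Sequence[Tuple[int, float]],
    min_gain: float = 0.05,
) -> Optional[int]:
    """Return the user count after which throughput stops growing.

    steps: (virtual_users, hits_per_second) pairs, sorted by virtual_users.
    min_gain: smallest relative increase in hits/s that still counts as
              scaling (0.05 = 5%, an arbitrary illustrative threshold).
    Returns None if throughput kept growing at every step, meaning the
    test has not yet reached the system's limit.
    """
    for (prev_users, prev_hits), (_, hits) in zip(steps, steps[1:]):
        if prev_hits <= 0:
            continue  # cannot compute a relative gain from zero throughput
        if (hits - prev_hits) / prev_hits < min_gain:
            return prev_users
    return None


# Illustrative numbers only, not measurements from any real site.
results = [(100, 250.0), (200, 490.0), (300, 720.0), (400, 725.0), (500, 700.0)]
print(find_saturation_point(results))  # 300
```

If the function returns None, the test was "successful" in the sense warned about above: keep increasing the load until a limit appears.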
Of course, underestimating the load was only part of the problem. Tesco’s crash wouldn’t have been so catastrophic if it had been able to recover quickly from the outage - instead it was down for the first 12 hours of the day.
Recover Quickly from Technical Problems
Your crash might not even be due to high traffic loads. Perhaps you’re unlucky enough to have a power outage on the busiest day of the year. Whatever the reason, it’s key to make sure that you’re prepared to recover quickly from anything that gets thrown your way.
We recommend having back-up servers and locations set up and ready so you can recover quickly if there’s a problem. Set up database replication, a database failover cluster or an application failover cluster. If there’s a problem, just switch over to the failover location. You won’t have to wait for your main server to recover, because your backup can serve traffic while you resolve the critical issues.
You can even switch to your failover location manually. You just need to prepare a procedure in advance and make sure your Ops/DevOps team members know exactly what to do if a problem does occur.
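Part of such a procedure can be scripted so the decision does not depend on someone remembering the steps under pressure. The following is a rough sketch, assuming each location exposes an HTTP health-check URL; the URLs, function names and timeout are hypothetical, and the actual switch (DNS change, load balancer update and so on) is left to your own tooling.

```python
import urllib.request
from typing import Callable, Optional, Sequence


def http_health_check(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts and HTTP error responses.
        return False


def pick_active_site(
    sites: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy site in priority order, or None if all are down."""
    for site in sites:
        if is_healthy(site):
            return site
    return None


# Hypothetical locations, listed in order of preference.
SITES = [
    "https://primary.example.com/health",
    "https://backup.example.com/health",
]

# Simulated outage of the primary location, so the sketch runs without a network.
down = {"https://primary.example.com/health"}
print(pick_active_site(SITES, lambda url: url not in down))
# https://backup.example.com/health
```

In a real runbook you would pass http_health_check instead of the simulated check, and treat a None result as the signal to escalate.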
It’s also important to know as accurately as possible how your website will perform on the actual day. For this reason, we recommend you:
Run Load Tests From the Production Environment
If you want to be sure your test will be accurate and you’re stressing every point in the chain of delivery, it’s best practice to run your tests in your production environment at times when you know traffic will be low (e.g. 1am on a Sunday morning).
The most efficient ways to do this include using a load testing tool like JMeter and buying several Virtual Private Servers (VPS) in different geo-locations to test your web or app servers under heavy load. Alternatively, you can use a cloud performance testing tool like BlazeMeter to simulate the load from multiple geo-locations and various devices with just a few clicks in the User Interface (UI).
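To make the idea of a stepped test concrete, here is a small Python sketch of what such tools do at a much larger scale. It is not a substitute for JMeter or BlazeMeter: it runs from a single machine, the function names are made up for illustration, and the request is a stand-in that only sleeps. To try it against a real endpoint you would replace the stand-in with a function that sends an HTTP request, and only ever point it at a system you own.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def measure_hits_per_second(
    send_request: Callable[[], bool],
    virtual_users: int,
    requests_per_user: int,
) -> float:
    """Run one load step and return successful requests per second.

    send_request: sends a single request and returns True on success.
    virtual_users: number of concurrent threads simulating users.
    """
    def user_session() -> int:
        return sum(1 for _ in range(requests_per_user) if send_request())

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        futures = [pool.submit(user_session) for _ in range(virtual_users)]
        successes = sum(future.result() for future in futures)
    elapsed = time.perf_counter() - start
    return successes / elapsed if elapsed > 0 else 0.0


def run_ramp(
    send_request: Callable[[], bool],
    user_steps: Sequence[int],
    requests_per_user: int,
) -> List[Tuple[int, float]]:
    """Run one step per user count and collect (users, hits/s) pairs."""
    return [
        (users, measure_hits_per_second(send_request, users, requests_per_user))
        for users in user_steps
    ]


def fake_request() -> bool:
    """Stand-in for a real HTTP call: waits 1 ms and reports success."""
    time.sleep(0.001)
    return True


for users, hits in run_ramp(fake_request, user_steps=[1, 2, 4], requests_per_user=5):
    print(f"{users} virtual users: {hits:.0f} hits/s")
```

The (users, hits/s) pairs it produces are the raw data you would chart to look for the point where throughput stops increasing.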
Be Prepared for Your Next Big Traffic Spike
Want to find out more? We’ve covered detailed explanations on each of these points and more (such as identifying your critical points, tracking the end user performance, and considering third party integrations) in our whitepaper “How to Ensure Your Website or Mobile App Won’t Fail on Black Friday - The Top Six Performance Testing Mistakes and Their Solutions”.
In addition, sign up for our free upcoming webinar on Sept 28 - Running a Large Scale Load Test Ahead of Black Friday & Cyber Monday.