How I Made Sure My Geo-Redundant Cloud Service Wouldn’t Fail
This is a guest post by Alex Gomes, a leading expert in cloud technology and the author of Highly Scalable Geo Redundant Cloud Services: Using Microsoft Azure.
Today, customers expect a certain level of performance and availability every time they visit a website or launch an application. The trouble begins when a huge number of people try to access it all at once. At any time, a Facebook mention or a favorable tweet could result in an overwhelming amount of traffic. Meanwhile, a hardware failure, incomplete software upgrade, network glitch or natural disaster can take down a server or an entire datacenter. The need to balance performance, scalability and cost has led to the widespread adoption of cloud-based services.
To meet this growing need, we created a highly scalable, geo-redundant cloud service designed to support the deployment of applications, websites and services on the Microsoft Azure cloud platform. The solution features a webrole created using ASP.NET MVC 4. Once it was uploaded as an Azure Cloud Service, we gave it autoscaling capabilities. To ensure redundancy, we duplicated our Traffic Manager service to span multiple locations across the globe and configured it to handle load balancing and failover in the event of an outage.
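As a rough illustration of what an autoscaling rule decides, here is a minimal Python sketch of a threshold-based rule. It is not the Azure autoscale API or the configuration we used: the `ScaleRule` class, the CPU thresholds and the instance limits are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleRule:
    """Hypothetical thresholds; not taken from the actual deployment."""
    scale_out_cpu: float = 75.0  # add an instance above this average CPU %
    scale_in_cpu: float = 30.0   # remove an instance below this average CPU %
    min_instances: int = 2
    max_instances: int = 10


def next_instance_count(current: int, avg_cpu: float,
                        rule: ScaleRule = ScaleRule()) -> int:
    """Return the instance count after one evaluation of the rule."""
    if avg_cpu > rule.scale_out_cpu:
        desired = current + 1
    elif avg_cpu < rule.scale_in_cpu:
        desired = current - 1
    else:
        desired = current
    # Clamp so the service never leaves the configured bounds.
    return max(rule.min_instances, min(rule.max_instances, desired))
```

Changing the count by one instance per evaluation, with a gap between the two thresholds, is one common way to keep a rule like this from oscillating when load hovers near a single threshold.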
Load Testing our Traffic Manager Solution
Before releasing it for public consumption, we needed to fully test this approach. We used BlazeMeter’s hosted solution to conduct a comprehensive load test that compared the performance of our Traffic Manager solution to that of a single webrole instance. We simply logged in, pointed it at the correct URLs, set the test specifications (5,000 users over a 50-minute time span) and then ran both test scripts side by side.
The results showed that the single webrole instance struggled to handle the increased traffic, whereas our service did not degrade under load. Response time for our Traffic Manager solution remained steady throughout the test, whereas server performance for the single webrole instance was extremely erratic: response times spiked and the number of users dropped.
Our Traffic Manager solution also failed less frequently than the single webrole deployment.
While this may seem like an “apples to oranges” comparison, the test does illustrate the advantages of a redundant, autoscalable solution over traditional methods. Our solution easily scales up when the number of users increases and down when it declines. Just as importantly, it will easily reroute traffic when any single datacenter or server fails. As an added benefit, response time is faster because traffic is automatically routed to the datacenter closest to the user’s geographic location.
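For readers who want to make a similar comparison from their own raw samples, the Python sketch below shows one way to reduce a test run to the three figures discussed above: average response time, how much it varies, and how often requests fail. The `summarize` function and the `(response_ms, ok)` sample format are assumptions made for this example, not BlazeMeter's report format, and no figures from our test are reproduced here.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Sequence, Tuple


@dataclass(frozen=True)
class RunSummary:
    mean_ms: float     # average response time
    stdev_ms: float    # spread of response times; higher means more erratic
    error_rate: float  # fraction of requests that failed


def summarize(samples: Sequence[Tuple[float, bool]]) -> RunSummary:
    """Summarize one run given (response_ms, ok) pairs, one per request."""
    if not samples:
        raise ValueError("at least one sample is required")
    times = [response_ms for response_ms, _ in samples]
    failures = sum(1 for _, ok in samples if not ok)
    return RunSummary(mean(times), pstdev(times), failures / len(samples))
```

Running `summarize` once per deployment and comparing the two `RunSummary` values side by side gives a quick read on which deployment was steadier and which failed more often.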
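The routing behavior described above can be sketched in a few lines of Python. This is a simplified model, not Traffic Manager's actual implementation: the `Endpoint` class, its fields and the `pick_endpoint` function are hypothetical names, and it assumes that latency measurements and health probe results are already available.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Endpoint:
    name: str          # hypothetical region label
    latency_ms: float  # measured latency from the user's location
    healthy: bool      # result of the most recent health probe


def pick_endpoint(endpoints: Sequence[Endpoint]) -> Optional[Endpoint]:
    """Return the lowest-latency healthy endpoint, or None if all are down."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    # Unhealthy endpoints were filtered out above, so a failed datacenter
    # is skipped and traffic falls through to the next-closest one.
    return min(healthy, key=lambda e: e.latency_ms)
```

With three endpoints where the closest one is marked unhealthy, `pick_endpoint` returns the next-closest healthy endpoint, which is the failover behavior in miniature.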
Find out more on this subject in the book: Highly Scalable Geo Redundant Cloud Services: Using Microsoft Azure.