Sizing Your AWS Cloud’s Capacity with BlazeMeter and Cloudyn
How do you prepare your service for the next spike in traffic? How do you correctly size your cloud environment for the coming year? Can you forecast your demand? Do you know where your IT environment’s bottlenecks are? While most people think that capacity optimization is solely about increasing capacity, it also works in reverse: underutilized environments require a downsize in cloud operations in order to optimize usage and costs. In this article, we will walk through a case study to guide you through planning, building, and controlling your cloud environment, with the goal of finding the right-sized compute capacity to support online demand in an optimized environment.
Up Size and Set Your Autoscale Policy
Let’s take a WordPress-based online magazine as an example. The magazine’s IT operations team decides to use BlazeMeter to run a load test on a scaled-down sample environment of their system, as shown on the right, in order to understand their environment’s capabilities.
The system runs on the AWS cloud and consists of medium EC2 instances, an RDS master database, S3 for file storage, and an ELB load balancer. The load tests show that the sampled environment delivers a 3-second average load time per page for 20 concurrent visitors, which is ideal for SEO and user experience, as shown below:
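Load-test results like these reduce to simple arithmetic over per-request samples. Here is a minimal sketch of that calculation; the sample timings and the 3-second threshold are illustrative values, not data from the actual test:

```python
def average_load_time(samples_ms):
    """Mean page load time in seconds, from per-request samples in milliseconds."""
    return sum(samples_ms) / len(samples_ms) / 1000.0

def meets_target(samples_ms, max_avg_seconds=3.0):
    """True if the average load time stays within the target threshold."""
    return average_load_time(samples_ms) <= max_avg_seconds

# Hypothetical JMeter-style samples (ms) for 20 concurrent visitors.
samples = [2800, 3100, 2950, 3050, 2900, 3200]
print(average_load_time(samples))  # 3.0 seconds on this sample data
print(meets_target(samples))       # True
```

In practice these numbers come straight out of the BlazeMeter report; the point is that the pass/fail criterion is a plain average against a target, which makes it easy to compare runs across instance sizes.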
There are other factors that could cause the system to fail under heavier load. Keeping in mind that the solution may not be to simply scale up or out, the team may need to perform a stress test and use an APM, such as New Relic’s integration with BlazeMeter, in order to further monitor bottlenecks in their database or network.
For the sake of this example, we’ll say that instance size was, in fact, the bottleneck for the online magazine. The IT operations team knows they want to increase the size of their instances; they just don’t know by how much. With BlazeMeter, the team can easily update the EC2 instances and run a few additional iterations of JMeter tests, enabling them to quickly make better decisions about instance size as well as auto scale configuration parameters.
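Once the load tests establish how many concurrent visitors one instance can serve, the auto scale bounds follow from expected traffic. A rough sketch of that reasoning, assuming illustrative baseline and peak figures (only the 20-visitors-per-instance capacity comes from the test above):

```python
import math

def instances_needed(concurrent_users, users_per_instance):
    """Instances required to serve a given load, from load-test capacity."""
    return math.ceil(concurrent_users / users_per_instance)

def autoscale_bounds(baseline_users, peak_users, users_per_instance, headroom=1.2):
    """Suggested min/max sizes for an Auto Scaling group.

    The minimum covers steady baseline traffic; the maximum covers peak
    traffic plus a headroom factor for unexpected spikes.
    """
    min_size = instances_needed(baseline_users, users_per_instance)
    max_size = instances_needed(peak_users * headroom, users_per_instance)
    return min_size, max_size

# One medium instance handled ~20 concurrent visitors in the load test.
print(autoscale_bounds(baseline_users=60, peak_users=200, users_per_instance=20))
# (3, 12)
```

The actual min/max would be plugged into the Auto Scaling group configuration; re-running the JMeter tests after an instance-size change simply updates `users_per_instance` and shifts both bounds.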
Downsize and Optimize Costs
Next, they turn to their Cloudyn account to get some professional recommendations that help identify the best options to optimize their system. Cloudyn’s tools show CPU and memory metrics along with recommended actions for resizing instances. As illustrated in the screen below, Cloudyn’s recommendations aren’t limited to sizing up or down within a single instance family; they also take newer families and instance generations into consideration.
So, now that we've looked at our performance and the ability to upsize with BlazeMeter, we want to make sure that we're covering all of our bases in terms of budget allocation. Therefore, the first area we're going to look at with Cloudyn is proper RDS and EC2 instance sizing. To do that, we'll turn to its sizing recommendations, which analyze CPU, memory, and I/O metrics.
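The core of such a sizing recommendation is a comparison of average utilization against thresholds. The sketch below shows the general idea; the 20%/80% thresholds are illustrative assumptions, not Cloudyn's actual logic:

```python
def sizing_recommendation(cpu_pct, mem_pct, io_pct, low=20.0, high=80.0):
    """Rough resize hint from average utilization percentages.

    Consistently low utilization across all metrics suggests downsizing;
    any metric running hot suggests upsizing. Thresholds are illustrative.
    """
    metrics = (cpu_pct, mem_pct, io_pct)
    if all(m < low for m in metrics):
        return "downsize"
    if any(m > high for m in metrics):
        return "upsize"
    return "keep"

print(sizing_recommendation(12, 15, 8))   # downsize
print(sizing_recommendation(85, 60, 40))  # upsize
print(sizing_recommendation(50, 45, 30))  # keep
```

A real recommendation engine would also weigh sustained peaks versus averages and candidate instance families, but the utilization-versus-threshold comparison is the starting point.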
To complement that, Cloudyn also looks at the situation from a cost perspective. Once the necessary next steps are understood at the operational performance level, it makes sure those steps also make sense financially. In our case, Cloudyn found the magazine a different instance family with more CPU and memory for less money. Once usage stabilized, Cloudyn could see how many instances of each type were in use, and whether their usage warranted switching from on-demand pricing to reserved.
This can be a very tricky area, because aggregating usage across multiple instances is not a simple task. Determining whether an instance’s 50% usage over the course of a year translates into three or six months of consecutive usage is no simple matter either. However, Cloudyn provides the tools to do just that, identifying when, where, and what types of reservations should be bought.
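The underlying reserved-versus-on-demand question is a break-even calculation: does the on-demand cost of the hours you actually use exceed the cost of the reservation over its term? A minimal sketch, using illustrative rates rather than real AWS prices:

```python
def reserved_breaks_even(hours_used, on_demand_rate, ri_upfront,
                         ri_hourly, term_hours=8760):
    """True if a one-year reservation is cheaper than on-demand.

    Compares on-demand cost for the hours actually used against the
    reservation's upfront fee plus its hourly rate for the full term.
    All rates here are illustrative, not real AWS prices.
    """
    on_demand_cost = hours_used * on_demand_rate
    reserved_cost = ri_upfront + ri_hourly * term_hours
    return reserved_cost < on_demand_cost

# 50% usage over a year is 4380 hours; at these illustrative rates
# the reservation already comes out cheaper.
print(reserved_breaks_even(hours_used=4380, on_demand_rate=0.10,
                           ri_upfront=180.0, ri_hourly=0.028))  # True
```

The hard part, as noted above, is producing a trustworthy `hours_used` figure per instance type from fluctuating, aggregated usage; that is the analysis the tooling automates.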
Due to the flexible nature of the cloud, online services can easily support high demand and maintain a great user experience. However, it is critical to be able to control capacity changes in this highly dynamic environment, while keeping it optimized in order to avoid capacity bottlenecks or over-provisioning.