Lessons Learned From Working 'Under the Hood' With JMeter
The best way to learn is through experience - whether it’s your own or someone else’s.
In this article, we’re going to share the experience of one company’s IT team as they prepared for higher traffic loads driven by the exponential growth of their systems.
Disclaimer: Due to the sensitivity of the company’s cloud operations, we can’t reveal its name - but we can reveal what it learned!
A Little Bit About the Company...
This company supports workflows and services that aid in transferring large amounts of imaging data between systems. Much of this data comes from desktop clients and is uploaded using a Java applet. As the IT team’s systems grew and they acquired more users, they knew that they needed to look for better ways to prepare for higher loads.
Having weighed all the available options, the IT department found that JMeter was the most capable tool for simulating complex clients. They started using JMeter and proxy recordings to find out how much of their API could be simulated. It should be noted that their fairly rich RESTful API moves various files to a server using multipart MIME posts.
Initially, their IT leaders were concerned that the load simulation scripts would be too hard to develop because of the API’s complexity. They needed to make sure the scripts were dynamic enough to be repeatable: once an upload was recorded, many of the recorded values had to be replaced with injected variables. Part of that complexity lies in their proprietary server interface and API, which issues various unique IDs for each upload session. Web clients read these IDs from the returned JSON and must use them in subsequent requests. By using JMeter post-processors to parse the JSON, the company was able to extract these values into variables and effectively simulate an entire end-to-end conversation between client and server.
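In JMeter itself this is typically done with a JSON post-processor that stores a value from the response into a variable such as `${uploadSessionId}`. The sketch below shows the same extract-and-reuse logic in plain Python; the field name `uploadSessionId` and the URL paths are hypothetical stand-ins, since the company’s actual API is not public.

```python
import json

# Hypothetical response from starting an upload session; the real field
# names and endpoints are not disclosed in the article, so these are
# illustrative only.
start_response = json.dumps({"uploadSessionId": "abc-123", "status": "created"})

def extract_session_id(response_body: str) -> str:
    """Mimic a JMeter JSON post-processor: pull the session ID out of the
    returned JSON so it can be injected into later requests."""
    return json.loads(response_body)["uploadSessionId"]

session_id = extract_session_id(start_response)

# The extracted value is then substituted into subsequent request paths,
# the way ${uploadSessionId} would be referenced in a JMeter sampler.
next_request_path = f"/api/uploads/{session_id}/parts"
print(next_request_path)  # /api/uploads/abc-123/parts
```

The key point is that nothing in the conversation is hard-coded: every recorded ID becomes a variable filled in at playback time, which is what makes the script repeatable.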
Another element that added traffic complexity was the multipart MIME posts that clients use to send files up to the server. Since these posts need the entire file contents, the company wasn’t sure how JMeter would cope. They found that JMeter records the source location of each file that is posted; at playback time, the script only needs access to those files and JMeter performs the multipart MIME post for you. They were delighted to find that JMeter could do this out of the box, without any additional extensions or programming.
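To make concrete what JMeter is doing at playback, here is a minimal sketch of a multipart/form-data body being built from a file on disk: the file is re-read from its recorded source path and its raw bytes are wrapped in MIME part headers. The field name, boundary string, and content type are illustrative assumptions, not the company’s actual values.

```python
import os
import tempfile

def encode_multipart(field_name: str, file_path: str,
                     boundary: str = "JMETER-DEMO-BOUNDARY") -> bytes:
    """Build a minimal multipart/form-data body the way a recorded JMeter
    sampler does at playback: read the file from its source path and wrap
    its raw bytes in MIME part headers."""
    with open(file_path, "rb") as f:
        file_bytes = f.read()
    filename = os.path.basename(file_path)
    head = (f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: application/octet-stream\r\n\r\n").encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + file_bytes + tail

# Demo: write a small stand-in for an imaging file, then encode it.
with tempfile.NamedTemporaryFile(suffix=".img", delete=False) as tmp:
    tmp.write(b"\x00" * 1024)  # 1 KB of placeholder image data
    path = tmp.name

body = encode_multipart("file", path)
print(len(body))  # part headers + 1024 file bytes + closing boundary
os.unlink(path)
```

In practice you never write this encoder yourself - the point of the article is that JMeter handles it automatically - but it shows why playback only requires access to the original files.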
Moving Hundreds of MB in Minutes
For their on-premises solution, they wanted to simulate various types of load on the system, so they built a script they call the “Load Suite” - a single script that can generate continuous load on most of their significant services. For the upload service, for example, they took the various footprints of imaging data into consideration. In some cases, end users may need to upload a large number of small files (e.g. one thousand files of ½MB each); with other types of imaging, users may need to upload eight to sixteen files of 40MB each.
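A load suite like this needs test data matching those footprints on disk. The sketch below is one way to generate such file sets - file names, and the scaled-down demo sizes, are assumptions for illustration; the real suite would use the full 500MB-per-user volumes described above.

```python
import os
import tempfile

def make_footprint(directory: str, count: int, size_bytes: int) -> list:
    """Create `count` files of `size_bytes` each, standing in for the
    imaging data sets fed to the upload samplers."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"image_{i:04d}.dat")
        with open(path, "wb") as f:
            f.write(os.urandom(size_bytes))
        paths.append(path)
    return paths

MB = 1024 * 1024
with tempfile.TemporaryDirectory() as d:
    # Scaled-down stand-ins for the two footprints described above:
    # many small files (1,000 x ~0.5MB) vs. a few large ones (8-16 x 40MB).
    small = make_footprint(d, count=10, size_bytes=MB // 2)  # demo: 10 files
    large = make_footprint(d, count=2, size_bytes=4 * MB)    # demo: 2 files
    print(len(small), len(large))  # 10 2
```

Using random bytes rather than zeros avoids accidentally measuring compression effects anywhere along the upload path.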
For this on-premises solution, they targeted three times the number of simultaneous upload users they normally see at peak. When you consider that these users move data over fast, local networks, transferring 500MB in the span of a few minutes, that represents a significant I/O load. As soon as the team started running experiments with JMeter in their lab, they found bottlenecks that were affecting performance. They then expanded the script to put load on other services, with separate thread groups performing different roles in the system: some users upload data, some visualize that data, and some use various parts of the web interface, while others process the data and have it routed off to other systems via a proprietary imaging network protocol. The company was able to drive all of this through JMeter and simulate the conditions of a very busy server to see how things performed.
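In JMeter, each role becomes its own thread group with its own user count, all running against the server concurrently. The Python sketch below mirrors that structure with plain threads; the role names and user counts are taken from the mix described above, while the per-role work is just a counter standing in for real HTTP samplers.

```python
import threading
from collections import Counter

results = Counter()
lock = threading.Lock()

def run_role(role: str, iterations: int) -> None:
    """Stand-in for one virtual user in a JMeter thread group: repeat the
    role's requests for a fixed number of iterations."""
    for _ in range(iterations):
        # Real samplers would issue HTTP requests here; we just count.
        with lock:
            results[role] += 1

# Mirror the mix described above: uploaders, viewers, and processors
# all driving the server at the same time, each with its own user count.
groups = {"upload": 4, "visualize": 2, "process": 2}
threads = [threading.Thread(target=run_role, args=(role, 5))
           for role, users in groups.items() for _ in range(users)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(dict(results))  # upload=20, visualize=10, process=10 (order may vary)
```

Keeping the roles in one script, as the company did with its “Load Suite,” means a single run exercises the whole system the way a busy production day would.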
How JMeter Helped
When the company first started reliably simulating this load in their lab, they were able to analyze their application, profile it, and understand where a user’s time was being spent. They were aware of some inefficiencies from an I/O perspective, and that the application became more sensitive to them as storage performance dropped. The company’s service, depending on how it’s deployed, has varying levels of I/O backing; since that is typically network-attached storage, it doesn’t perform as quickly as a local disk.
With JMeter, they were able to demonstrate that at about half of their target user count the service was already degrading - clear evidence that it wasn’t scaling well enough. In response, they rewrote some of the backend code to use faster storage for portions of the processing and to eliminate disk I/O that could be done in memory. Through several iterations, they improved performance to comfortably handle three times their peak upload users. The slower the backend I/O, the bigger the impact - an important characteristic of their production environments that they were unable to model until they built the lab and staged the system. Using JMeter was part of a larger effort to properly isolate and test their software so that they could turn different knobs and monitor the responses.
They’ve also benefited from some Linux OS-level tuning, as well as some database tuning. Most of the database issues stemmed from inefficiencies in the way their application used the database, and required refactoring and SQL tuning. With JMeter and proper isolation, they were able to find many pain points in the backend and focus tuning efforts on their worst offenders.
The company has similarly been running these load tests on their cloud services, which handle very similar scenarios involving huge amounts of data. They have the same applet capability as their on-premises solution, as well as additional native clients that send data to their cloud. They use BlazeMeter to scale out JMeter scripts and simulate a large number of users coming from different points around the globe. They now perform these tests on a regular basis and will continue to enhance both the JMeter client simulation and their service’s ability to scale ahead of the ever-growing load that builds as their network expands.