In the second part of our series of articles about load testing, we discuss how to find bottlenecks in the system and examine how much load the system can handle without compromising the user experience. At 3bits, we often get this question from our customers before a campaign launch or when they, for some other reason, expect more visitors to the site.
Like any type of test, a load test can never guarantee that the system is flawless and will work in every situation. It can only verify that the test cases or user scenarios that are tested work as designed.
If you want a reasonably complete load test of your e-commerce site, it is not enough to simulate users who browse the site and look at products. You also need to simulate users who register, log in, visit the cart, and place an order. We can, for example, look at two user scenarios: (i) product browsing, and (ii) login and order.
In the first scenario, we just browse the site and look at the products. This kind of scenario is very easy to record with a load testing tool such as Load Impact, but for the result of the load test to be useful, it is crucial that the recorded scenario is comparable to the behaviour of real users, both in which pages are visited and in how long the user stays on each page. Tools such as Google Analytics can be a good place to start, but the problem with load testing before a campaign is that the users' behaviour during a campaign often differs a lot from their usual behaviour. Ideally, you have statistics from previous campaigns that can be used. User scenarios are saved as scripts, which makes it possible to change the waiting times and the pages visited afterwards. The advanced user can also adjust the scripts so that they randomly select which products to visit and how long each page is displayed. This gives higher coverage in the test and more realistic behaviour when it comes to waiting times.
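The randomisation described above can be sketched as follows. This is a minimal illustration in Python, not the scripting language of any particular load testing tool; the product paths, the number of pages per visit, and the mean waiting time are all assumptions made up for the example and should be replaced with figures from your own statistics.

```python
import random

# Hypothetical product URLs; a real test would use paths taken from the site.
PRODUCT_PAGES = [
    "/products/1001",
    "/products/1002",
    "/products/1003",
    "/products/1004",
]


def build_browsing_session(rng, pages_per_visit=3, mean_think_time_s=20.0):
    """Return a list of (url, think_time_seconds) steps for one simulated visitor."""
    # Pick distinct products at random so that different visitors cover different pages.
    pages = rng.sample(PRODUCT_PAGES, k=pages_per_visit)
    # Exponentially distributed waiting times: many short stays and a few long ones.
    return [(url, rng.expovariate(1.0 / mean_think_time_s)) for url in pages]


# A seeded generator makes a test run repeatable.
session = build_browsing_session(random.Random(42))
for url, think_time in session:
    print(f"visit {url}, then wait {think_time:.1f} s")
```

The exponential distribution is only one possible choice. If you have measured how long real users stay on each page, sampling from those measurements is likely to be more realistic.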
The second scenario is also easy to record with the load testing tool. In this case, however, the recorded script always needs to be adjusted afterwards, unless you want exactly the same user to log in and place orders simultaneously from a large number of browser sessions. This can be solved by first creating a number of test users and then letting the script randomly select the login details of one of them.
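A minimal Python sketch of this idea is shown below. The account names, the passwords, and the number of test users are hypothetical; the accounts are assumed to have been created on the site before the test starts.

```python
import random

# Hypothetical test accounts, assumed to exist on the site before the test.
TEST_USERS = [
    {"username": f"loadtest{i:03d}@example.com", "password": f"secret-{i:03d}"}
    for i in range(1, 51)
]


def pick_credentials(rng):
    """Randomly select the login details of one of the test users."""
    return rng.choice(TEST_USERS)


def credentials_for(virtual_user_id):
    """Alternative: map each simulated user to a fixed account by its index."""
    return TEST_USERS[virtual_user_id % len(TEST_USERS)]


print(pick_credentials(random.Random(7))["username"])
print(credentials_for(3)["username"])
```

With random selection, two simulated users can still end up with the same account now and then. If that matters for your site, the index-based variant avoids it as long as there are at least as many test accounts as simultaneous users.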
To perform a load test you need, apart from the user scenarios, a load profile. The load profile describes how many simulated users should perform a certain user scenario at each point in time during the test. Just as for each user scenario, it is important that the mix of user scenarios resembles the real behaviour of the users, and this mix can also differ between a campaign and a regular day. For example, the conversion rate is usually higher during a campaign, so if statistics from a normal day are used, the share of users placing orders should be increased.
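As a small worked example, the split between the two scenarios could be calculated like this. The total number of users and the two order shares (2 % on a normal day, 5 % during a campaign) are invented for illustration and are not measurements.

```python
def scenario_mix(total_users, order_share):
    """Split the simulated users between the two user scenarios."""
    ordering = round(total_users * order_share)
    return {"browse": total_users - ordering, "login_and_order": ordering}


# Assumed figures: 2 % of users place an order on a normal day,
# and 5 % are expected to do so during the campaign.
normal_day = scenario_mix(200, 0.02)
campaign = scenario_mix(200, 0.05)

print(normal_day)  # {'browse': 196, 'login_and_order': 4}
print(campaign)    # {'browse': 190, 'login_and_order': 10}
```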
When it comes to the number of simultaneous users to load the site with, you would normally start at a low level and gradually increase the load up to a given maximum to see how this affects, for example, the load time. The maximum is set by the requirements you have for the site regarding the number of simultaneous users. Sometimes these requirements are not clear, and instead you want to find the limit of how much the system can handle. As we wrote in the first part of the series, this is by definition a stress test; the only difference is that you keep increasing the load until some resource hits the roof and can no longer deliver results at the same rate as the requests come in.
If you want to stress test the site, it is of course best to do so at a time when there are as few real users as possible on the site, for example during the night.
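A stepwise ramp of this kind can be expressed as a list of time and user-count pairs. The sketch below is generic Python, not the configuration format of any specific tool, and the start level, maximum, number of steps, and step length are assumed values.

```python
def ramp_profile(start_users, max_users, steps, step_duration_s):
    """Return (start_time_s, simultaneous_users) pairs for a stepwise ramp-up."""
    if steps < 2:
        raise ValueError("a ramp needs at least two steps")
    increment = (max_users - start_users) / (steps - 1)
    return [
        (i * step_duration_s, round(start_users + i * increment))
        for i in range(steps)
    ]


# Assumed requirement: the site should handle 250 simultaneous users.
profile = ramp_profile(start_users=10, max_users=250, steps=5, step_duration_s=120)
print(profile)  # [(0, 10), (120, 70), (240, 130), (360, 190), (480, 250)]
```

For a stress test, the same function can be used with a maximum well above the expected load, so that the ramp continues until a resource is exhausted.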
During the load test, a number of instances of each user scenario are created according to the load profile, and the load times for every page request are collected while they run. Apart from the load times from a user perspective, it can also be a good idea to monitor the web servers for the site, the database server, and the load balancer during the test, to get an overall picture of what the cause may be if load times go up. Load Impact also offers an agent that can be installed on the server side to collect data such as CPU load and memory usage, which gives a more complete picture directly in the test tool.
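If you export the collected load times from the test tool, they can be summarised per page. The sketch below assumes the samples are available as simple (page, seconds) pairs, which is a made-up format and not the export format of any particular tool. It reports the median and the 95th percentile using the nearest-rank method.

```python
import math
import statistics
from collections import defaultdict


def summarise(samples):
    """Group (page, load_time_s) samples by page and summarise each group."""
    by_page = defaultdict(list)
    for page, seconds in samples:
        by_page[page].append(seconds)

    summary = {}
    for page, times in by_page.items():
        times.sort()
        # Nearest-rank 95th percentile: the value at position ceil(0.95 * n).
        rank = math.ceil(0.95 * len(times))
        summary[page] = {
            "count": len(times),
            "median": statistics.median(times),
            "p95": times[rank - 1],
        }
    return summary


# Invented sample data for illustration.
samples = [("/", 0.4), ("/", 0.6), ("/", 0.5), ("/cart", 1.0)]
result = summarise(samples)
print(result)
```

Looking at a high percentile in addition to the median helps, since a few very slow page loads can hide behind a good-looking typical value.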
The result you can expect from a load test is a constant load time up to a certain load, after which the load time starts increasing in proportion to the load. This breaking point means that some resource is fully used, and the users need to start queuing and waiting to use that resource. As long as the load times do not increase so much that the user experience is affected, this does not have to be alarming. But you still need to investigate what causes the bottleneck that makes the users wait. If a certain resource, for example bandwidth, runs out long before the other resources, the servers are not being used to their full capacity. In these cases the load testing tool can, for example, provide information about how much bandwidth is used for different content types, such as images and JavaScript files. This can later help you decide whether to increase your own bandwidth or simply move some static content to a cloud solution such as Akamai.
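One simple way to locate the breaking point in the measured data is to look for the first load level at which the load time has risen clearly above the level measured at low load. The sketch below is a rough heuristic under stated assumptions: the 25 % tolerance and the sample measurements are invented, and the first measurement is assumed to be taken at a load the system handles comfortably.

```python
def find_breaking_point(measurements, tolerance=0.25):
    """Return the first user count where load time exceeds the baseline by more
    than `tolerance`, or None if that never happens.

    `measurements` is a list of (simultaneous_users, load_time_s) pairs,
    sorted by increasing number of users.
    """
    baseline = measurements[0][1]
    for users, load_time in measurements:
        if load_time > baseline * (1 + tolerance):
            return users
    return None


# Invented measurements: flat up to 150 users, then rising.
measured = [(50, 0.8), (100, 0.8), (150, 0.85), (200, 1.3), (250, 1.9)]
breaking_point = find_breaking_point(measured)
print(breaking_point)  # 200
```

Once you know roughly where the load time starts to rise, the server-side measurements around that load level show which resource is the likely bottleneck.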
A more disturbing result is when the load time increases exponentially instead of proportionally. This means that the system is very close to breaking. The graph below shows the result of a load test where the CPU load (yellow curve) has reached its maximum. The load time (blue curve) then increases faster than the load (green curve). The reason for this is that a server close to 100 % CPU load delivers less than it does at a somewhat lower CPU load. You can also see this from the fact that the delivered bandwidth (red curve) has started to decrease. In this case, you could have got more out of the system by capping the bandwidth and letting the users queue before the CPU load hit the roof.
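The warning sign described here, delivered bandwidth falling while the load keeps rising, can be checked for in the measured data. The sketch below uses invented figures, not the values from the graph, and assumes the measurements are sorted by increasing number of users.

```python
def throughput_drops(measurements):
    """Return the user counts at which delivered bandwidth fell even though
    the number of simultaneous users increased.

    `measurements` is a list of (simultaneous_users, bandwidth_mbit_s) pairs,
    sorted by increasing number of users.
    """
    return [
        cur_users
        for (prev_users, prev_bw), (cur_users, cur_bw) in zip(measurements, measurements[1:])
        if cur_users > prev_users and cur_bw < prev_bw
    ]


# Invented measurements: bandwidth peaks at 300 users and then declines.
measured_bandwidth = [(100, 40.0), (200, 78.0), (300, 95.0), (400, 88.0), (500, 70.0)]
overloaded_at = throughput_drops(measured_bandwidth)
print(overloaded_at)  # [400, 500]
```

In this invented data set, the system delivers the most at around 300 users, which would be a reasonable level at which to start letting additional users queue.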
Load testing gives you both a measure of the system's limitations and useful information about how to make the best use of your resources. In the final part about load tests, we will take a look at how you can use load tests continually as part of your QoS testing.
A study from Akamai has shown that 40 % of users leave an e-commerce site if the page load time is more than 3 s, and that 64 % of these users choose another site the next time they go shopping. There is a lot to gain by ensuring that the site delivers even when the pressure is high.
In the third and final part, we look at how load testing can be used as part of your Quality of Service (QoS) testing. Even if the procedure is mostly the same as for a stress test, the purpose is different, and to get the most out of the load test, this must be reflected in the setup.