9
votes

We have a MySQL driven site that will occasionally get 100K users in the space of 48 hours, all logging into the site and making purchases.

We are attempting to simulate this kind of load using tools like Apache Bench and Siege.

While the key metric seems to me to be the number of concurrent users, and we've got our report results, we still feel like we're in the dark.

What I want to ask is: What kinds of things should we be testing to anticipate this kind of traffic?

50 concurrent users, 1,000 times? 500 concurrent users, 10 times?

We're looking at DB errors, Apache timeouts, and response times. What else should we be looking at?

This is a vague question and I know there is no "right" answer. We're just looking for some general thoughts on how to determine what our infrastructure can realistically handle.

Thanks in advance!

2 Answers

3
votes

Simultaneous users is certainly one of the key factors, especially as it applies to DB connection pools, etc. But you will also want to verify that the page rate (pages/sec) of your tests is in the range you expect. Think time is the amount of time the user spends between page requests: reading the page, filling out a form, etc. If the think time in your test cases is off by much, you can accidentally simulate a much higher (or lower) page rate than your real-world traffic.

Depending on what other information you have on hand, this might help you calculate the number of simultaneous users to simulate: Virtual User Calculators
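
To make the relationship between simultaneous users, think time, and page rate concrete, here is a rough sketch based on Little's Law (users in the system = arrival rate x time each user spends in the system). The function names and every number below are my own assumptions for illustration, not measurements from your site; plug in figures from your own logs.

```python
def concurrent_users(sessions_per_hour, pages_per_session, response_time_s, think_time_s):
    """Estimate simultaneous users via Little's Law: N = arrival rate x session length."""
    session_length_s = pages_per_session * (response_time_s + think_time_s)
    return sessions_per_hour * session_length_s / 3600.0


def page_rate(users, response_time_s, think_time_s):
    """Pages per second generated by a fixed number of looping virtual users."""
    return users / (response_time_s + think_time_s)


# Hypothetical peak hour: 6,000 sessions, 10 pages each,
# 1 s response time and 29 s of think time per page.
users = concurrent_users(6000, 10, 1.0, 29.0)
realistic_rate = page_rate(users, 1.0, 29.0)

# Same number of virtual users, but the test script only pauses 4 s per page.
too_fast_rate = page_rate(users, 1.0, 4.0)

print(f"simultaneous users: {users:.0f}")
print(f"page rate with 29 s think time: {realistic_rate:.1f} pages/sec")
print(f"page rate with 4 s think time:  {too_fast_rate:.1f} pages/sec")
```

With these made-up inputs, shortening the think time multiplies the page rate by six for the same user count, which is the kind of accidental overload (or underload) to watch for.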

The complete page load time seen by the end-user is usually the most important metric to evaluate system performance. You'll also want to look for failure rates on all transactions. You should also be on the lookout for transactions that never complete. Some testing tools do not report these very well, allowing simulated users to hang indefinitely when the server doesn't respond...and not reporting this condition. Look for tools that report the number of users waiting on a given page or transaction and the average amount of time those users are waiting.
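
If your tool lets you export raw per-request results, a small script can report the things mentioned above, including requests that never completed. This is only a sketch: the `(elapsed_seconds, ok)` sample format and the `summarize` helper are hypothetical, with `None` standing in for a request that was still hanging when the test ended.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(samples):
    """samples: list of (elapsed_s, ok); elapsed_s is None if the request never completed."""
    completed = sorted(t for t, _ in samples if t is not None)
    hung = sum(1 for t, _ in samples if t is None)
    failed = sum(1 for t, ok in samples if t is not None and not ok)
    return {
        "total": len(samples),
        "never_completed": hung,
        # Hung requests count as failures so they cannot hide in the report.
        "failure_rate": (failed + hung) / len(samples),
        "median_s": percentile(completed, 50) if completed else None,
        "p95_s": percentile(completed, 95) if completed else None,
    }


# Hypothetical results from one test run.
samples = [(0.2, True), (0.4, True), (0.3, False), (None, False), (1.5, True)]
report = summarize(samples)
print(report)
```

Note that the response-time percentiles here include completed requests that returned an error; you may prefer to report successful and failed requests separately.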

As for the server-side metrics to look for, what other technologies is your app built on? You'll want to look at different things for a .NET app vs. a PHP app.

Lastly, we have found it very valuable to look at how the system responds to increasing load, rather than looking at just a single level of load. This article goes into more detail.
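
One simple way to read a stepped test like that is to look for the load level where response time starts climbing away from the lightly loaded baseline. The sketch below assumes you have one average response time per load step; the `find_knee` helper, the 2x threshold, and the sample numbers are all made up for illustration.

```python
def find_knee(results, factor=2.0):
    """Return the first load level whose response time exceeds factor x the baseline.

    results: list of (concurrent_users, avg_response_s), sorted by users ascending.
    The first entry is treated as the lightly loaded baseline.
    Returns None if no step crosses the threshold.
    """
    baseline = results[0][1]
    for users, response_s in results:
        if response_s > factor * baseline:
            return users
    return None


# Hypothetical averages from a test that doubles the load at each step.
steps = [(50, 0.30), (100, 0.32), (200, 0.35), (400, 0.80), (800, 3.10)]
knee = find_knee(steps)
print(f"response time degrades noticeably at about {knee} users")
```

A single test at 800 users would only tell you the system is slow; the stepped data shows roughly where it stops scaling.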

1
votes

Ideally you want to model your load on real user behavior, but creating simulated concurrent sessions for 100K users is usually not easy to accomplish. The best source is your logs: find the busiest hour and try to figure out a way to model that load level.
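
As a starting point for finding that busiest hour, something like the following can count requests per hour. It is a sketch that assumes your Apache access log uses the default timestamp format (`[10/Oct/2000:13:55:36 -0700]`); the sample lines are invented, and in practice you would pass an open log file instead of a list.

```python
import re
from collections import Counter

# Captures the date and hour, e.g. "10/Oct/2000:13", from a default Apache timestamp.
HOUR_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def busiest_hour(lines):
    """Return (hour, request_count) for the busiest hour, or None if nothing matched."""
    counts = Counter()
    for line in lines:
        match = HOUR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(1)[0] if counts else None


# Invented sample lines for illustration only.
sample_log = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /cart HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:58:01 -0700] "POST /checkout HTTP/1.0" 200 512',
    '127.0.0.1 - - [10/Oct/2000:14:02:10 -0700] "GET / HTTP/1.0" 200 1043',
]
hour, count = busiest_hour(sample_log)
print(f"busiest hour: {hour} with {count} requests ({count / 3600.0:.4f} req/sec average)")
```

The per-second average for that hour gives you a target request rate for the load test, though real traffic will be burstier than the average suggests.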

The database is usually a critical piece of infrastructure, so I would look at recording the number and length of lock waits as well as the number and duration of db statements.
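
For MySQL, one low-effort way to record this is to capture `SHOW GLOBAL STATUS` before and after a test run and compare the counters. The sketch below assumes your tables use InnoDB (the `Innodb_row_lock_waits` and `Innodb_row_lock_time` counters, the latter in milliseconds) and that you have already loaded each snapshot into a dict; the `lock_wait_delta` helper and the numbers are hypothetical.

```python
def lock_wait_delta(before, after):
    """Compare two SHOW GLOBAL STATUS snapshots taken around a test run.

    Both arguments are dicts mapping status variable names to values
    (strings or ints, as returned by most MySQL client libraries).
    """
    def diff(name):
        return int(after[name]) - int(before[name])

    waits = diff("Innodb_row_lock_waits")
    wait_ms = diff("Innodb_row_lock_time")
    return {
        "lock_waits": waits,
        "avg_lock_wait_ms": wait_ms / waits if waits else 0.0,
        "statements": diff("Questions"),
    }


# Hypothetical snapshots for illustration.
before = {"Innodb_row_lock_waits": "10", "Innodb_row_lock_time": "500", "Questions": "1000"}
after = {"Innodb_row_lock_waits": "40", "Innodb_row_lock_time": "2000", "Questions": "51000"}
delta = lock_wait_delta(before, after)
print(delta)
```

For the duration of individual statements, the slow query log is likely a better source than these aggregate counters.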

Another key item to look at is disk queue lengths.

Mostly the process is to look for slow responses, either across the whole site or for specific pages, and then home in on the cause.

The biggest problem for load testing is that it is quite hard to test your network. If you have limited bandwidth through your ISP (as most public sites do), that may create a performance issue that is not reflected in the load tests.
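
Even if you can't test the ISP link directly, a back-of-the-envelope check tells you whether it is likely to be the bottleneck. The page size, page rate, and link capacity below are invented numbers, and `required_mbps` is just an illustrative helper.

```python
def required_mbps(pages_per_s, avg_page_bytes):
    """Outbound bandwidth needed, in megabits per second (1 Mbit = 1,000,000 bits)."""
    return pages_per_s * avg_page_bytes * 8 / 1_000_000


# Hypothetical peak: 100 pages/sec at 500 KB per page (HTML plus assets),
# served over a 100 Mbps uplink.
needed = required_mbps(100, 500_000)
uplink_mbps = 100

print(f"needed: {needed:.0f} Mbps, available: {uplink_mbps} Mbps")
if needed > uplink_mbps:
    print("the uplink would saturate before the servers do")
```

If the estimate lands anywhere near your link capacity, a load test run from inside your own network will look much better than what real users see.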