One of the things that keeps coming up in load testing is the comparison of simulated users to real users.

People tend to get hung up on pure concurrency numbers (the quantity) without assessing things like user profiles and throughput (the quality).  It’s very simple to crank up threads and, say, feed search functionality a collection of values;  sure, you’ll load the servers that way, but that doesn’t necessarily have any real relation to the number of users an application can sustain.

A better approach is to identify the relevant use case(s) and match the expected artifact throughput(s): the sales orders, tickets, and other records that real users would actually create.
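
To make that concrete, here’s a minimal sketch (in Python, with made-up numbers; nothing here comes from an actual engagement) of how you might back out the per-thread pacing needed for a given concurrency to hit a target artifact volume:

```python
# Minimal sketch: derive per-thread pacing from a target artifact throughput.
# All numbers below are hypothetical placeholders, not real client volumes.

def required_pacing(target_per_hour: float, concurrent_users: int) -> float:
    """Seconds each simulated user should take per iteration (think time
    included) so that `concurrent_users` together produce `target_per_hour`
    artifacts (orders, tickets, ...) per hour."""
    per_user_per_hour = target_per_hour / concurrent_users
    return 3600.0 / per_user_per_hour

if __name__ == "__main__":
    # e.g. 1,200 sales orders/hour spread across 200 concurrent users
    pacing = required_pacing(target_per_hour=1200, concurrent_users=200)
    print(f"Each user must complete one order every {pacing:.0f} s")  # -> 600 s
```

The point is that concurrency and pacing only mean something when they’re tied back to the volume of artifacts the business actually expects.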

To illustrate, we recently completed an SAP CRM performance engagement.  The client had just acquired additional assets and was restructuring their sales and CSR infrastructure to accommodate them: everyone was being brought onto the same corporate WAN and integrated into the new CRM.  They needed multiple points of presence (UK, CA, US) for the test itself, they had specific CSR ticket and sales order volumes to validate across varying network architectures, and they needed the resulting records to funnel through an existing multi-tiered series of web services to their end-point customers.

If we had taken a simple concurrency approach, we could have ramped up a few thousand threads against search and login, seen that the servers were being loaded, and called it a day … maybe they’d have had a successful launch and maybe they wouldn’t.  What we did instead was to take a hard look at the sales/CSR volume they’d had in the past (both at the parent and at the new subsidiaries), model test scripts to meet those numbers at their desired concurrency, and then apply that targeted load from the various geographic entry points.  We hit both the concurrency numbers and the sales order / CSR ticket volumes, exercised all the key elements of their architecture, and scaled those numbers up until we found their actual bottlenecks (in this particular case, the limiting factor was the memory configuration on a couple of nodes … easy to find and fix when you can dial real load up and down at will, but very elusive if you have to depend on production troubleshooting).
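
As a rough illustration of what that kind of workload model looks like (the region names mirror the points of presence above, but every user count and volume here is a placeholder invented for the example), you can describe each entry point’s concurrency and artifact targets, then scale the whole mix up in steps until something breaks:

```python
# Hypothetical geo-distributed workload model for the kind of test described
# above.  Region names match the points of presence; all figures are
# illustrative placeholders, not the client's actual volumes.
from dataclasses import dataclass

@dataclass
class RegionProfile:
    name: str
    users: int               # concurrent simulated users from this region
    orders_per_hour: float   # target sales-order throughput
    tickets_per_hour: float  # target CSR-ticket throughput

BASELINE = [
    RegionProfile("UK", users=150, orders_per_hour=400, tickets_per_hour=900),
    RegionProfile("CA", users=100, orders_per_hour=250, tickets_per_hour=600),
    RegionProfile("US", users=350, orders_per_hour=950, tickets_per_hour=2100),
]

def scale(profiles, factor: float):
    """Dial the whole model up or down while keeping the regional mix fixed,
    e.g. factor=1.0 is expected launch volume, 1.5 is 150% of it."""
    return [
        RegionProfile(p.name,
                      users=round(p.users * factor),
                      orders_per_hour=p.orders_per_hour * factor,
                      tickets_per_hour=p.tickets_per_hour * factor)
        for p in profiles
    ]

if __name__ == "__main__":
    for step in (1.0, 1.25, 1.5):  # ramp in steps until a bottleneck appears
        for p in scale(BASELINE, step):
            print(f"{step:>4}x  {p.name}: {p.users} users, "
                  f"{p.orders_per_hour:.0f} orders/h, {p.tickets_per_hour:.0f} tickets/h")
```

However the model is expressed, the key property is the same one that mattered in the engagement: the ability to dial realistic, regionally weighted load up and down at will rather than guessing from production incidents.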

You simply can’t find (and fix) the issues they would have encountered at launch by cranking up threads without tying that thread count to the artifacts created by real users.

Relating thread counts to real user counts is vital to real-world load testing.