Abstract
This article looks at performance and scalability issues on a
dynamic website and how the Transactional Cost Analysis
methodology can be used to see where real bottlenecks and
performance issues lie. This methodology allows you to accurately
model the performance of your website and see how potential changes
and optimizations affect the overall capacity of the site.
A step-by-step guide to using the Transactional Cost Analysis
methodology with the freely available Web Application Stress tool is
given. The metrics that should be monitored and how these relate to
real-life performance are explained in detail.
Myths and misconceptions regarding ASP performance and
scalability are dealt with and example results discussed. The key
differences between performance and scalability are looked at along
with strategies that will work to improve both. In addition, real
world experiences with a large e-commerce site are given as well as
strategies that will help capacity planning and how resources are
best used.
Web stress performance overview
The performance of a web application should be thought about
during planning, technical design, architecture design, development
and testing. It cannot be "bug-fixed" on at the end.
It is still common for new sites to launch and be overloaded, or
even existing sites to come grinding to a halt. As anyone who tried
to place a bet on the English Grand National this year at most of
the online bookmakers will testify, their sites were unusable for
most of the day. The effects of such poor planning and design have a
major impact on company revenue. Given the maturity of the web, such
incidents should be and can be avoided.
All major websites should be assigned a person who is responsible
for capacity planning and ensuring that the website matches
predicted peaks in load. This is an ongoing job; it requires
detailed analysis of past trends and future marketing work. This
position is often overlooked or ignored, assuming that stress
testing during development has ironed out all problems. What is
often forgotten is the fact that different shopper behavior patterns
have a significant effect on site performance. This is where the TCA
methodology is so powerful: it allows us to run already-gathered
performance data against potential shopper behavior models and see
how performance and capacity are affected.
Scalability and performance: the difference
The terms scalability and performance are often used
interchangeably. This is a mistake because a high performance
website is not necessarily a scalable one.
- Performance denotes the speed and efficiency with which the
system performs its tasks
- The ability to scale a website means the ability to add new
servers (horizontal scalability) or upgrade existing hardware
(vertical scalability) to increase capacity. A well written,
scalable application will allow the addition of new resources to
increase capacity without any changes being required to the
application itself and with little impact on system behavior or
performance.
For a large website with growing demands on capacity, the most
critical measure is the ability to scale out effectively by adding
new servers. Windows 2000 Advanced Server comes with built in
load-balancing software that can be complemented by the addition of
Application Center to help manage and monitor the web-farm. Other
worthwhile alternatives include hardware load balancers such as the
F5 BIG-IP or the Cisco LocalDirector.
If a website application scales effectively, i.e. doubling the
number of servers results in a near doubling of capacity, this is an
extremely cost-effective way of scaling a website. A high
specification rack-mounted web server costs in the region of £2000
today, comparable to a couple of days of a consultant programmer's
fees. There is also the cost of hosting, but often a company will
already have space free if it has rented half a rack or a full rack.
In addition, the current state of the IT market means that most
data centers have vast amounts of unused space, so rentals are
coming down.
This emphasizes how critical it is to ensure, during development of
a web application, that care is taken over scalability (by
performance testing against a single server, and then against a web
farm of two or more servers), and that expensive development time is
focused on areas that will significantly and measurably improve
performance and the ability to scale.
Using the Web Application Stress tool
The Microsoft Web Application Stress tool is currently one of the
most used free tools, and with good reason. It has a lot of the
features of more powerful and costly software suites.
To fully cover the WAS would take a full article (there are
others on the http://www.asptoday.com/
site and I've added some links at the end of the article), but here
are a few relevant pointers to get up and running with the TCA
methodology.
- 1. Start up WAS and select "Manual Script".
- 2. You will be presented with the script configuration screen.
- 3. Enter your server name where "localhost" appears (don't prefix
it with http:// because this will cause WAS to error). It is not
recommended to run your tests from the same machine as your web
server: generating the test hits takes a significant amount of CPU
time and will give you false results, so tests should always be run
from other machines.
- 4. Select "Get" as the Verb (this is the HTTP method used to
access the page) and enter the path to a page on your server.
Because the TCA methodology stress tests only one page at a time in
order to build the model, we can ignore the "Group" and "Delay"
options.
- 5. Now expand the new script item in the tree and enter the
"Settings" as follows:
Stress Level and Stress Multiplier can be used to adjust the
stress on the server. These are often misunderstood - fortunately
for the TCA methodology, their actual values are irrelevant as far
as the results are concerned because we are not trying to model a
real web load (100s of concurrent users), just stressing a single
page at a time and applying those results to a mathematical model.
For those who are interested, "Threads" are the number of NT threads
that WAS starts up to send requests from; if you are not using the
TCA methodology and are trying to simulate a "real" load, you will
likely need multiple client machines, in which case the threads are
divided amongst these. Each thread can then contain multiple
"sockets" which are concurrent connections to the web server.
Microsoft's website gives the following formula, which helps explain
the relationship between these values:
Stress level (threads) x Stress multiplier (sockets per thread) =
Total number of sockets = Total concurrent requests
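The relationship is pure arithmetic, so it can be sketched in a few lines (Python here, purely for illustration):

```python
def total_concurrent_requests(stress_level: int, stress_multiplier: int) -> int:
    """Stress level = number of WAS threads; stress multiplier =
    sockets per thread. Each socket is one concurrent connection, so
    the product is the total number of concurrent requests that WAS
    directs at the server."""
    return stress_level * stress_multiplier

# e.g. 2 threads with 1 socket each -> 2 concurrent requests
# e.g. 4 threads with 10 sockets each -> 40 concurrent requests
```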
For those who want to delve deeper into this particular topic, I
have included a link at the end of the article.
The values you have entered are good to start with, but it is
worth taking some time to experiment with the two settings to see
how it affects how much stress you can put on an individual page. To
fine-tune the stress you're applying to the page, you can adjust the
random delay. It will take some time playing with these settings
before you get a feel for the best initial settings to avoid having
to keep re-running tests as you look for the maximum requests per
second that a given page can handle. Generally I suggest starting
off with two threads and one socket and a delay of between 100 and
150ms. If the site is not getting sufficiently stressed, increase
the stress multiplier by 1 and re-examine the results. If the site
is just over capacity and showing 100% CPU usage, you can lower the
stress by increasing the random delay gradually until you reach peak
capacity without over-loading.
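The tuning loop described above is manual in WAS, but the decision logic can be sketched as follows; `run_test` is a hypothetical stand-in for performing one WAS run at a given random delay and reading off the results:

```python
def find_peak_settings(run_test, delays_ms=(100, 125, 150, 200, 300)):
    """run_test(delay_ms) -> (requests_per_sec, avg_cpu_percent) for
    one stress run. Increase the random delay until the server is no
    longer pinned at 100% CPU, then report that run as the usable
    peak for the page."""
    last = None
    for delay in delays_ms:
        rps, cpu = run_test(delay)
        if cpu < 100:  # server no longer saturated: this is peak capacity
            return delay, rps, cpu
        last = (delay, rps, cpu)
    return last  # still overloaded even at the longest delay tried
```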
- For the TCA methodology, you also need to monitor CPU usage
during stress testing. Usually you will find that the processor
usage during peak requests per second is between 80% and 95%
usage. If the CPU usage is 100% for the duration of the test, it
means that the CPU is overloaded and the requests per second will
drop. To add performance counters, click on the "Perf Counters" link
and add the processor usage monitor.
- You are now ready to run the tests. Click on the "New Script"
label to get back to the first screen and press the "Run" button on
the toolbar. The test will initialize and run.
- To view the report that has been generated for that run of the
stress tool, click on the "View" menu and then on "Reports".
Where to run the tests from
It is important to run the tests from somewhere local to your web
servers, otherwise you will be severely restricted in how
effectively you can stress the site and both bandwidth limitations
and latency will skew the results. The best solution is to be
physically next to the machine on a fast LAN. If you don't do this,
and instead try to perform the tests on a live site over the
Internet, you are just wasting your time. If you host your site at a
data centre such as Exodus, you can take a powerful laptop to the
data centre and attach it to the network; alternatively, if you have
a backup or management server hosted there, you can use Terminal
Services or pcAnywhere and run the tests from that.
Monitor CPU usage on the WAS machine
The WAS tool can run across multiple machines to provide enough
hits to properly stress a web server. With the TCA methodology,
where each page is tested individually, this is usually not an issue
because it does not require the same volume of hits. However, it is
still important to monitor the CPU usage on the WAS machine; if the
CPU utilization is consistently above 80%, the test may well be
invalid because the WAS machine is likely incapable of providing
enough hits to the server.
Double-check that everything ran as expected
There are two other values you should look at:
- Result Codes (on the main report screen): you should always check
that you are getting the correct result codes back from your web
pages! The most common cause of erroneous results in WAS is the web
server not returning the page as expected. Ensure that you have
pointed to the right pages (you're not getting 404s) and that the
page is not reporting an error (500s).
- Downloaded Content Length (under "Page Data" on the reporting
screen): check that the downloaded content is approximately what you
would expect for the page you are testing against.
If either of the above results suggests that your tests are not
running as expected, check the IIS log files to see whether the
correct page is being requested, and also what values are being
passed to it.
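The two checks above amount to a simple sanity gate before any stress run; a minimal sketch (in Python, with a hypothetical 20% tolerance on content length):

```python
def sanity_check(status, body_len, expected_bytes, tolerance=0.2):
    """Validate one trial request before stress testing: the status
    code must be 200 (not a 404 or a 500 error page), and the
    downloaded content length must be within tolerance of what we
    expect for this page."""
    if status != 200:
        return False, f"unexpected status code {status}"
    if abs(body_len - expected_bytes) > tolerance * expected_bytes:
        return False, f"content length {body_len} far from expected {expected_bytes}"
    return True, "ok"
```

A short error page returned with status 200 (a "soft" error) would be caught by the length check even though the status code looks healthy, which is exactly the failure mode the article warns about.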
Aims of stress testing
Stress testing (or "load testing" - these terms are often used
interchangeably) is a term used to describe the stressing of a real
application under simulated load provided by virtual users.
Properly conducted stress testing should have a number of
aims:
- Checking performance during development
- Testing before site launch
- Finding page response times
- Finding peak capacity
- Finding how well the site scales
Page Response Times (TTFB and TTLB)
Page response times measure how long it takes to get the
first byte of a page and the last byte of a page under different
loads. Page response times are critical because users will not wait
long before they get impatient and try to visit another web site.
There are two metrics to page response times, Time To First
Byte (TTFB) and Time To Last Byte (TTLB). When performing
stress tests, one measure of how stressed the server is and how
close to peak capacity you are is the time between these two values.
As an example, you may find that up to a point, TTFB and TTLB hold
fairly consistently at 230 and 244 ms respectively. There
will be a point as you increase the load on the server, either by
increasing threads and sockets or reducing the delay between
requests, that TTFB and TTLB start to move apart; they now record
250 and 420 ms. You will also find that hits per second decreases.
It is often useful to agree a maximum page load time (for example, 1
second) and stress the server until this maximum is reached (by
TTLB). That way, you know what the maximum site capacity is while
still keeping acceptable response times.
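Using the gap between TTFB and TTLB as a saturation signal can be expressed as a tiny sketch (Python; the baseline gap and the "several times baseline" factor are assumptions to tune for your own site):

```python
def saturation_signal(ttfb_ms, ttlb_ms, baseline_gap_ms=20, factor=3):
    """Under light load TTFB and TTLB sit close together (e.g. 230 ms
    and 244 ms). When the gap between them grows to several times the
    light-load baseline, the server is approaching peak capacity."""
    return (ttlb_ms - ttfb_ms) > factor * baseline_gap_ms

# light load: 244 - 230 = 14 ms gap -> not saturated
# heavy load: 420 - 250 = 170 ms gap -> saturated
```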
Requests Per Second
Requests Per Second is probably the key metric you should
monitor. Very crudely, it tells you how well the page you are
stressing is performing. You will see significant differences
between pages on your site depending on how they are written and how
they interact with back-end systems.
Testing for Scalability
Testing for the ability of a site to scale out is often
forgotten about. On large sites with multiple servers, it is of
critical importance. To test for scalability, first test against one
server only, then against two servers, then against your full
web-farm: This will show how scalable each page in the site is, i.e.
how adding new servers improves its ability to serve pages. You will
find that plain HTML pages or ASP pages with no back-end calls scale
almost linearly, so going from one server to two servers results in
a doubling of the number of pages served. Conversely, if you have a
poorly thought out site with multiple database calls per page, you
will find that adding new servers results in poor scaling - perhaps
improving performance by 1.1 times rather than doubling as
expected!
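The scaling comparison above reduces to a single ratio; sketched in Python with illustrative numbers:

```python
def scaling_efficiency(rps_one_server, rps_n_servers, n):
    """1.0 means perfectly linear scale-out (n servers serve n times
    the pages of one); values well below 1.0 point to a shared
    bottleneck, typically the database."""
    return (rps_n_servers / rps_one_server) / n

# plain HTML page: 100 rps on one box, 198 rps on two -> 0.99
# database-heavy page: 100 rps on one box, 110 rps on two -> 0.55
```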
The Transactional Cost Analysis Methodology
Overview
The TCA methodology models the CPU load that a single user exerts
on your web server. The methodology works because although it may
seem absurd to express the cost of a single browser session as a
slice of CPU capacity, e.g. 1.5 MHz, when there are hundreds of
shoppers on your site this holds true. The TCA methodology as
applied to websites is a product of the Microsoft Research
Laboratory, and there are some interesting papers on Microsoft's
MSDN site that use the methodology.
Why Use the TCA Methodology?
The Transactional Cost Analysis methodology is a theoretical
method of determining the CPU cost of transactions (think
"sessions") on a website. One of the key benefits of the TCA
methodology is it provides a mathematical model of your website
performance and capacity.
Two examples of the power of the TCA model are given below:
- It is possible to apply a different shopper profile to the
data to see how site capacity changes. During Christmas or
promotional periods, shopper behavior can differ significantly
(the ratio of the number of buyers to the number of shoppers
usually goes up). This means that the site is handling more
transactions, which usually involves more processing power. It is
easy to put the Christmas shopper profile into the TCA model and
see exactly how great this effect is.
- By examining CPU usage per page and how this affects overall
site capacity, it is possible to focus on areas that really affect
site capacity rather than those that the raw throughput values
alone would suggest.
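The profile-swapping idea in the first example can be made concrete with a small self-contained sketch (Python; the transaction names, per-request CPU costs and profile figures are all hypothetical):

```python
# Hypothetical CPU costs (MHz per request) for three transactions, and
# two shopper profiles giving requests per session for each.
COST_MHZ = {"browse": 1.2, "search": 2.5, "checkout": 8.0}

NORMAL    = {"browse": 10, "search": 3, "checkout": 0.2}
CHRISTMAS = {"browse": 10, "search": 3, "checkout": 0.6}  # more buyers

def cost_per_session(profile):
    """Total CPU cost that one average session exerts on the web tier."""
    return sum(COST_MHZ[t] * hits for t, hits in profile.items())

# At the same visitor numbers, each Christmas session costs more CPU
# (more checkouts), so the number of sessions the farm can carry drops.
```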
Performing the Testing
1st Step: Break down the site into
"transactions"
There are some operations on the site that involve more than one
request to an ASP page. This most commonly happens when one ASP page
posts to another or redirects to another. For example, search.asp
may call search_results.asp (2 requests form part of the
transaction). Similarly, product.asp may call add_to_basket.asp and
redirect to basket.asp (3 requests form part of the transaction).
For the most part, however, pages will be distinct units: for
example the home page, category pages, the product page, etc.
2nd Step: Determine Usage Profile
In order to use the TCA methodology, we need a profile of shopper
behaviour. This is the "average" behaviour exhibited by the users of
your site. If you are logging page accesses to a SQL database or
have web site reporting tools, this is simple for a given time
period:
Total Hits To Page / Total Number Of User Sessions
The supplied spreadsheet contains calculations for determining
usage profile given page hits, number of sessions and average
session length.
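The calculation in the formula above, applied per page, can be sketched as follows (Python; the page names and figures are illustrative):

```python
def usage_profile(page_hits, total_sessions):
    """Average requests per session for each page or transaction:
    Total Hits To Page / Total Number Of User Sessions."""
    return {page: hits / total_sessions for page, hits in page_hits.items()}

# e.g. 50,000 home page hits over 10,000 sessions -> 5.0 per session
```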
3rd Step: Perform the Stress Tests
Run the stress tests on a page-by-page basis. Stress each
page (or group of pages if they form part of a "transaction") until
you find the maximum number of requests per second they can handle.
When you find the peak requests per second, note the:
- Requests Per Second
- Average CPU Usage
Those are all that is required for the TCA model. You may also
find it useful to make a record of TTFB and TTLB.
4th Step: Use the TCA Model
Once the results have been gathered, they can be entered into
the TCA model using the supplied spreadsheet (see below).
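The core arithmetic behind the model can be sketched in a few lines (Python, assuming a single-CPU web server and a hypothetical 70% CPU headroom ceiling; all figures are illustrative, not measurements):

```python
# Minimal sketch of the TCA calculation for one web server.
CPU_MHZ = 1000  # clock speed of the (single) web server CPU

def cost_per_request(avg_cpu_pct, peak_rps):
    """CPU cost of one request to a page, in MHz: the CPU consumed at
    peak load divided by the requests served per second."""
    return (avg_cpu_pct / 100.0) * CPU_MHZ / peak_rps

def sessions_supported(profile, costs, session_secs, headroom_pct=70):
    """Concurrent sessions one server can carry while staying below
    the headroom ceiling. profile maps page -> requests per session;
    costs maps page -> MHz per request (from cost_per_request)."""
    mhz_per_session = sum(profile[p] * costs[p] for p in profile)
    load_per_session = mhz_per_session / session_secs  # MHz drawn each second
    return (headroom_pct / 100.0) * CPU_MHZ / load_per_session

# e.g. a page peaking at 90 requests/sec with 90% CPU costs 10 MHz per
# request; if a 10-minute session makes 5 such requests, one server
# held below 70% CPU carries 8,400 concurrent sessions.
```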
The Download
I have provided a fully worked out Transactional Cost Analysis
spreadsheet model as the download to this article. The data in the
spreadsheet is imaginary, but you can use it to easily model the
real data from your own web site. It is worth taking time to examine
the spreadsheet and the example data and graphs that I've
included.
Common Myths about ASP performance
The most common myth regarding ASP performance is that
translating a page to use COM rather than ASP functions or VBScript
classes will result in a significant performance increase. Under
Windows NT this held true, but on Windows 2000 the ASP VBScript
scripting engine is as fast as, if not faster than, compiled code in
most situations. If there are performance issues with a particular
page, look for bottlenecks in the page. Usually these will be to do
with looping or the dynamic creation of HTML (both of which reduce
performance but not the ability to scale) or database calls (which,
more worryingly, affect both performance and the ability to scale
out). It is tempting to think that coding trickery can turn a poorly
performing, poorly scaling page into a high performance, scalable
one. This is rubbish: most performance problems are introduced as
fundamental flaws in the page design or overall site architecture.
As I have already stated, performance cannot be bug-fixed on at the
end of development!
Wherever possible, use cached data rather than making
database calls. This is the single biggest improvement you can make
to site performance and scalability. The overall issue of
performance is huge and encompasses web server, program
architecture, program implementation, logical database design,
physical implementation, hardware, network, connectivity, etc. The
best thing you can do is to test and keep testing.
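As an illustration of the caching principle, here is a sketch in Python; an ASP site would keep the data in Application state rather than a module-level dictionary, and the five-minute TTL is an assumption:

```python
import time

_cache = {}  # product data cached in application memory

def get_product(product_id, fetch_from_db, ttl_secs=300):
    """Serve from the cache when the entry is fresh; hit the database
    only when the entry is missing or older than the TTL. This turns
    a per-request database call into a once-per-TTL call."""
    entry = _cache.get(product_id)
    now = time.monotonic()
    if entry is not None and now - entry[1] < ttl_secs:
        return entry[0]  # cache hit: no database round trip
    value = fetch_from_db(product_id)
    _cache[product_id] = (value, now)
    return value
```

Because the cache removes the shared database from the request path, it improves both raw performance and the page's ability to scale out across servers.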
Capacity planning
Capacity planning is the process of measuring a web site's
ability to serve content to its visitors at an acceptable speed for
foreseeable visitor numbers and usage trends.
The most important aspect of capacity planning is ensuring that
your website and associated systems can deal with peaks in
traffic. The usual assumption is that the 80/20 rule should be
followed - i.e. 80% of the traffic will occur only 20% of the time.
This indicates that a website should be able to handle peaks 4x
greater than the average traffic levels. From personal experience I
would say that this is optimistic. Prominent online promotions or
television features can result in peaks of between 10 and 20 times
average. If you have used the TCA methodology wisely and know both
the capacity of your website and how well it scales with the
addition of new servers, you should be well equipped to deal with
future demands.
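To make the peak arithmetic concrete, a small sketch (Python, with made-up figures) of turning average load, an assumed peak factor, and per-server capacity into a server count:

```python
import math

def servers_needed(avg_sessions, peak_factor, sessions_per_server):
    """Servers required to carry the predicted peak: average
    concurrent sessions scaled by the peak factor (4x under the 80/20
    rule, 10-20x for promotion- or TV-driven spikes), divided by the
    per-server capacity from the TCA model."""
    return math.ceil(avg_sessions * peak_factor / sessions_per_server)

# 2,000 average sessions, 4x rule, 1,500 sessions/server -> 6 servers
# the same site planning for a 10x television spike -> 14 servers
```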
On an ongoing basis, the web servers and database servers should
be monitored for performance. I would advise monitoring regularly
with Performance Monitor during normal browsing periods and peak
periods. I would also advise setting up alerts which will inform you
if certain thresholds are exceeded, for example processor usage
averaging greater than 80%, excessive memory usage, etc. You can
have these generate an email, which can then trigger a pager or an
SMS text message, allowing you to respond to the situation or, at
the very least, monitor it.
Conclusion
This article has covered many aspects of measuring performance
and scalability using the Transactional Cost Analysis Methodology.
The process of capacity planning has also been touched on. The
recent spectacular failure of the UK government's census website (a
link is provided at the end of the article) has again shown how
important capacity planning and stress testing are: They should be a
fundamental part of the development project through requirements
capture, architecture, design and implementation. Even when the
application has been deployed, its performance should be measured on
an ongoing basis and present and predicted usage trends analyzed to
ensure that demand can be met!
Links
The Microsoft Web Application Stress Tool site (the tool is also
available on the Windows 2000 Resource Kit Companion CD):
http://webtool.rte.microsoft.com/
Detailed information on WAS Threads and Sockets
http://webtool.rte.microsoft.com/Threads/WASThreads.htm
Detailed Article on using the WAS Tool
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnduwon/html/d5wast_2.asp
Excellent article on site design for performance. Compares an
all-ASP implementation to COM and COM+ under Windows 2000 and
presents results and conclusions that will be surprising to some
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnnile/html/docu2kbench.asp
"Census website goes offline": BBC news article about the recent
high profile failure of a UK government website to cope with
demand.
http://news.bbc.co.uk/hi/english/uk/newsid_1749000/1749045.stm