
How To: Model the Workload for Web Applications

J. D. Meier, Prashant Bansode, Carlos Farre, Scott Barber

Applies To

Summary

This How To explains how to create a workload model that represents how a web application is expected to be used in production. For performance testing to yield results that are directly applicable to understanding the performance characteristics of an application in production, the tested workloads must represent the real-world production scenario. To create a reasonably accurate representation of reality, you must understand the business context for the use of the application, expected transaction volumes in various situations, expected user path(s) by volume, and other usage factors.

Contents

Objectives

Overview

Workload modeling is the process of identifying one or more composite application usage profiles of interest for use in performance testing. A workload model contains data related to such items as:

While it is certainly true that simulating unrealistic workload models can provide valuable information to a team when conducting performance testing, you can only make accurate predictions about performance in production, or accomplish performance optimizations, when realistic workload models are simulated.

Summary of Steps

Step 1: Identify the Objectives

The objectives of creating a workload model typically center on ensuring the realism of a test scenario, or on designing a test to address a specific requirement, goal, or performance-testing objective. (For more information, see {HowTo:Quantify End User Requirements and HowTo:Determine Performance Testing Objectives}.) When identifying the objectives, you should work with targets that will satisfy the stated business requirements. Consider the following key questions when formulating your objectives:

This information can be gathered from Web server logs, from marketing documentation reflecting business requirements, or from stakeholders. The following are some of the objectives identified during this process:

It is acceptable if these objectives only make sense in the context of the project at this point. The remaining steps will help you fill in the necessary details to achieve the objectives.

Considerations

Consider the following key points when identifying objectives:

Step 2: Identify Key Scenarios

It is typically either impractical or impossible to simulate every possible user task or activity in a performance test. As a result, whether you are identifying user behavior by analyzing server logs, observing usability studies, interpreting marketing materials, or simply starting with your best educated guess, you will probably want to apply some limiting heuristic to the number of activities, or key scenarios, that you identify for performance testing. You may find the following limiting heuristics useful:

The following are some key scenarios identified for an e-commerce application:

Considerations

Consider the following key points when identifying key scenarios:

Step 3: Determine Navigation Paths for Key Scenarios

Human beings are unpredictable and Web sites commonly offer multiple paths to accomplish the same task or activity. Even with a relatively small number of users, it is almost certain that real users will not only use every path you think they will to complete a task, but they also will inevitably invent some that you hadn’t thought of. Each path the user takes to complete an activity will put a different load on the system. That difference may be trivial, or it may be enormous—there is no way to be certain until you test it. There are many methods for determining navigation paths to complete a task or activity, including the following:

After the application is released for unscripted user acceptance testing, for beta testing, or to production, you will be able to determine how the majority of users accomplish activities on the system being tested. It is always a good idea to compare your models against reality and make an informed decision about whether to perform additional testing based on the similarities and differences you find.

Apply the same limiting heuristics to navigation paths as you did when determining activities to decide which paths you want to include in your performance simulation.
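
Once real or trial usage data is available, comparing modeled paths against observed ones can be as simple as counting how often each distinct path occurs. The following is a minimal sketch, not a prescribed method. The page names and the `observed_sessions` data are hypothetical, and it assumes each session has already been reduced to an ordered list of pages.

```python
from collections import Counter


def path_frequencies(sessions):
    """Return each distinct navigation path with its share of all sessions.

    `sessions` is an iterable of ordered page lists (one list per session).
    Paths are returned most common first.
    """
    counts = Counter(tuple(pages) for pages in sessions)
    total = sum(counts.values())
    return [(path, count / total) for path, count in counts.most_common()]


# Hypothetical sessions; in practice these might be derived from server logs.
observed_sessions = [
    ["Home", "Search", "Product", "Checkout"],
    ["Home", "Browse", "Product", "Checkout"],
    ["Home", "Search", "Product", "Checkout"],
    ["Home", "Search", "Product", "Checkout"],
]

for path, share in path_frequencies(observed_sessions):
    print(f"{share:.0%}  {' > '.join(path)}")
```

Sorting paths by share makes it easier to apply a limiting heuristic, for example keeping only the paths that account for most of the observed sessions.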

Considerations

Consider the following key points when determining navigation paths for key scenarios:

Step 4: Identify Unique Data for Navigation Paths and / or Simulated Users

Unfortunately, navigation paths alone do not provide all of the information required to implement a workload simulation. To fully implement the workload model, several more pieces of information are needed, including:

The following table provides an example of unique data identified for an e-commerce application.

Implementation Data

| Scenario | Page/Step | Data Inputs | Data Outputs | Think Time |
|---|---|---|---|---|
| Login | Login page | Username (unique), Password (matched to username) | | 6-9 sec, random |
| Browse (experienced user) | Login page | Username (unique), Password (matched to username) | | 6-9 sec, random |
| | Browse | Catalog Tree/Structure (static), User Type (weighted) | Product Description, Title, Category | 4-60 sec, random |
| Browse (new user) | Login page | Username (unique), Password (matched to username) | | 6-21 sec, random |
| | Browse | Catalog Tree/Structure (static), User Type (weighted) | Product Description, Title, Category | 20-90 sec, random |
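
The think-time ranges in the table can be turned into randomized delays for simulated users. The sketch below assumes a uniform distribution within each range, because the table says only "Random" and does not name a distribution. The `THINK_TIME_RANGES` mapping and the `sample_think_time` helper are illustrative names, not part of any load-generation tool.

```python
import random

# Think-time ranges in seconds, keyed by (scenario, page/step),
# copied from the Implementation Data table.
THINK_TIME_RANGES = {
    ("Login", "Login page"): (6, 9),
    ("Browse (experienced user)", "Login page"): (6, 9),
    ("Browse (experienced user)", "Browse"): (4, 60),
    ("Browse (new user)", "Login page"): (6, 21),
    ("Browse (new user)", "Browse"): (20, 90),
}


def sample_think_time(scenario, step, rng=random):
    """Draw one think time in seconds for the given scenario step.

    Assumption: delays are uniformly distributed within the range.
    """
    low, high = THINK_TIME_RANGES[(scenario, step)]
    return rng.uniform(low, high)


delay = sample_think_time("Browse (new user)", "Browse")
print(f"Simulated user pauses for {delay:.1f} seconds")
```

Passing a seeded `random.Random` instance as `rng` makes a test run repeatable, which helps when comparing results across builds.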

Considerations

Consider the following key points when identifying unique data for navigation paths and/or simulated users:

Step 5: Determine Relative Distribution of Scenarios

Now that you have determined which scenarios to simulate, along with the steps and associated data for each, you need to determine how often each scenario should be simulated relative to the others in order to complete the workload model. Sometimes one workload model is not enough. Research and experience have shown that user activities often vary greatly over time. To ensure test validity, you must evaluate activities according to time of day, day of week, day of month, and time of year. As an example, consider an online bill-payment site. If all bills go out on the 20th of the month, activity on the site immediately before the 20th will be dominated by system administrators updating accounts, importing billing information, and so on, while immediately after the 20th, customers will be viewing and paying their bills until the payment due date of the 5th of the next month.

The most common methods for determining the relative distribution of scenarios are:

The following table provides an example of the distribution of scenarios for an e-commerce application.

Work Distribution

| User Scenarios | % of Work Distribution |
|---|---|
| Browse | 50 |
| Search | 30 |
| Place Order | 20 |
| Total | 100 |
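
A distribution like this is typically applied by having each simulated user pick its next scenario at random, weighted by these percentages. The following sketch shows one way to do that with the Python standard library. The `pick_scenarios` helper is an illustrative name, and most load-generation tools offer an equivalent built-in setting.

```python
import random
from collections import Counter

# Percent of work per scenario, from the Work Distribution table.
SCENARIO_WEIGHTS = {"Browse": 50, "Search": 30, "Place Order": 20}


def pick_scenarios(n, rng=None):
    """Choose n scenarios at random, weighted by the work distribution."""
    rng = rng or random.Random()
    names = list(SCENARIO_WEIGHTS)
    weights = list(SCENARIO_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=n)


# Over many picks, the observed mix should approach 50/30/20.
sample_size = 10_000
mix = Counter(pick_scenarios(sample_size, random.Random(42)))
for name in SCENARIO_WEIGHTS:
    print(f"{name}: {mix[name] / sample_size:.1%}")
```

Because selection is random, short test runs can deviate noticeably from the target mix. Longer runs converge toward it.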

Considerations

Consider the following key points when determining relative distribution of scenarios:

Step 6: Identify Target Load Levels

Although it is frequently the case that each workload model will be executed at a variety of load levels, and that the load level is very easy to change at run time using most load-generation tools, it is still important to identify the expected and peak target load levels for each workload model for the purpose of predicting or comparing with production conditions. The following are the inputs and outputs used for determining target load levels:

Inputs

Volume

The information in the following table can be extracted from Web server logs, marketing documentation, or stakeholders.

| Time Period | Business volume (# sessions to the web site) | Business volume (# sessions, peak values) | Peak load increase | Peak build-up time | Peak duration |
|---|---|---|---|---|---|
| Monthly | 460789 | 1360890 | 2.95 | 1 hour | 2 hours |
| Daily | 15359 | 45363 | 2.95 | 1 hour | 2 hours |
| Hourly (15-hour traffic) | 1023 | 3024 | 2.95 | 1 hour | 2 hours |
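
The peak load increase column appears to be the ratio of peak sessions to normal sessions. That reading is an assumption based on the numbers in the table, and the sketch below checks it against each row. The `peak_load_increase` helper is an illustrative name.

```python
# (normal sessions, peak sessions) per time period, from the volume table.
VOLUMES = {
    "Monthly": (460_789, 1_360_890),
    "Daily": (15_359, 45_363),
    "Hourly (15-hour traffic)": (1_023, 3_024),
}


def peak_load_increase(normal_sessions, peak_sessions):
    """Return the factor by which peak volume exceeds normal volume."""
    return peak_sessions / normal_sessions


for period, (normal, peak) in VOLUMES.items():
    factor = peak_load_increase(normal, peak)
    print(f"{period}: peak is {factor:.3f} times normal volume")
```

Each row works out to roughly 2.95, with small differences that come from the session counts having been rounded to whole numbers.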

Output

By combining the volume information with objectives, key scenarios, user delays, navigation paths, and scenario distributions from the previous steps, you can determine the remaining details necessary to implement the workload model at a particular target load.

Example:
Total hourly sessions: 1023
Total hourly sessions (peak): 3024
Place Order (20%): 205
Place Order (20%, peak): 605
Average session time = 18 minutes
Sessions per user per hour = 60/18 = 3.33
Total users to produce 205 orders = 61 (205/3.33)
Total users to produce 605 orders = 181 (605/3.33)

The following table presents an example of target load levels for an e-commerce application.

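| User Scenarios | % of Work Distribution | Total normal sessions | Total peak sessions | Session duration (minutes) | # concurrent users (normal) | # concurrent users (peak) | % new users |
|---|---|---|---|---|---|---|---|
| Browse | 50 | 512 | 1512 | 15 | 128 | 378 | 10% |
| Search | 30 | 307 | 907 | 7 | 36 | 106 | 10% |
| Place Order | 20 | 205 | 605 | 18 | 61 | 181 | 12% |
| Total | 100 | 1023 | 3024 | | | | 12% |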
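
The concurrent-user columns follow from the hourly session totals, the work distribution, and the session duration: a user whose session lasts d minutes can complete 60/d sessions per hour, so the users needed equal the scenario's sessions per hour divided by 60/d. The sketch below applies that arithmetic to every scenario. It assumes unrounded session counts are carried through and only the final result is rounded, and `concurrent_users` is an illustrative helper name.

```python
# Total hourly sessions at normal and peak load, from the volume table.
HOURLY_SESSIONS = {"normal": 1023, "peak": 3024}

# Scenario name -> (% of work distribution, session duration in minutes).
SCENARIOS = {
    "Browse": (50, 15),
    "Search": (30, 7),
    "Place Order": (20, 18),
}


def concurrent_users(total_sessions, percent, duration_min):
    """Estimate the users needed to produce a scenario's hourly sessions."""
    scenario_sessions = total_sessions * percent / 100
    sessions_per_user_per_hour = 60 / duration_min
    return scenario_sessions / sessions_per_user_per_hour


for name, (percent, duration) in SCENARIOS.items():
    normal = concurrent_users(HOURLY_SESSIONS["normal"], percent, duration)
    peak = concurrent_users(HOURLY_SESSIONS["peak"], percent, duration)
    print(f"{name}: {round(normal)} users normal, {round(peak)} users peak")
```

Changing a session duration or a distribution percentage and rerunning the calculation shows how sensitive the required user count is to each input.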

Considerations

Consider the following key points when identifying target load levels:

Step 7: Prepare to Implement the Model

Preparing to implement the model is tightly tied to the method of implementation, typically a load-generation tool. For more information about implementing a workload model using Visual Studio 2005 Team Suite or Visual Studio 2005 Team Edition for Software Testers, see {HowTo:SomeName}.

Considerations

Consider the following key points when preparing to implement the model:

Additional Resources