
How To: Model the Workload for Web Applications

J. D. Meier, Prashant Bansode, Carlos Farre, Mark Tomlinson, Scott Barber

Applies To

Summary

This How To shows you how to model the workload for Web applications. For performance testing to yield results that apply directly to the application's performance characteristics in production, the tested workloads must represent reality. To create a reasonably accurate representation of reality, you must understand the business context in which the application is used, the expected transaction volumes in various situations, the expected user paths by volume, and other usage factors.

Contents

Objectives

Overview

Workload modeling is the process of identifying one or more composite application usage profiles of interest for use in performance testing. A workload model contains data related to such items as:
Simulating unrealistic workload models can certainly provide valuable information to a team during performance testing, but only when realistic workload models are simulated can predictions about performance in production be made, or performance optimizations for production be accomplished.

Summary of Steps

Step 1: Identify the Objectives

The objectives of creating a workload model typically center on ensuring the realism of a test, or on designing a test to address a specific requirement, goal, or performance testing objective. (For more information, see {HowTo:Quantify End User Requirements and HowTo:Determine Performance Testing Objectives}.) When identifying the objectives, work with targets that will satisfy business requirements. Below is the key input for building the objectives:
This information can be gathered from Web server logs, from marketing (reflecting business requirements), or from stakeholders. Below are some objectives identified during this process:
It is acceptable if these objectives only make sense in the context of the project at this point. The remaining steps will help you fill in the necessary details to achieve the objectives.

Considerations

Step 2: Identify Key Scenarios

It is typically somewhere between impractical and impossible to simulate every possible user task or activity in a performance test. As a result, whether you identify what users do by analyzing server logs, observing usability studies, interpreting marketing material, or starting with your best educated guess, you will probably want to apply some limiting heuristic to the number of activities, or key scenarios, that you identify for performance testing. You may find the following limiting heuristics useful:
Below is an example of key scenarios identified for an eCommerce application:

Considerations

Step 3: Determine Navigation Paths for Key Scenarios

Human beings are unpredictable, and Web sites commonly offer multiple paths to accomplish the same task or activity. Even with a relatively small number of users, it is almost certain that real users will not only use every path you expect them to use to complete a task, but will also invent some that you hadn’t thought of. Each path they take to complete an activity puts a different load on the system. That difference may be trivial or it may be enormous; there is no way to be certain until you test it.

There are many methods to determine navigation paths to complete a task or activity. Some include:

Once the application is released for unscripted user acceptance testing, beta testing, or production, you will be able to determine how the majority of users accomplish activities on the system under test. It is always a good idea to compare your models against reality and make an informed decision about whether to do additional testing based on the similarities and differences found.

Apply the same limiting heuristics to navigation paths as you did when determining activities to decide which paths you want to include in your performance simulation.
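
Once real usage data is available, one way to compare modeled paths against reality is to count how many sessions followed each distinct page sequence. The following is a minimal sketch, not part of any particular tool: it assumes the log has already been reduced to hypothetical (session_id, page) pairs in time order, and the page names and the count_paths helper are illustrative only.

```python
from collections import Counter, defaultdict


def count_paths(records):
    """Count how many sessions followed each distinct page sequence.

    `records` is an iterable of (session_id, page) pairs, assumed to be
    in chronological order (a hypothetical, pre-parsed log format).
    """
    pages_by_session = defaultdict(list)
    for session_id, page in records:
        pages_by_session[session_id].append(page)
    return Counter(tuple(pages) for pages in pages_by_session.values())


# Illustrative data only: three sessions, two of which take the same path.
sample = [
    ("s1", "Home"), ("s2", "Home"), ("s1", "Search"), ("s3", "Home"),
    ("s2", "Browse"), ("s1", "Place Order"), ("s3", "Search"),
    ("s2", "Place Order"), ("s3", "Place Order"),
]
path_counts = count_paths(sample)
for path, sessions in path_counts.most_common():
    print(" > ".join(path), sessions)
```

Ranking the paths by session count makes it straightforward to apply a limiting heuristic, for example keeping only the paths that account for most sessions.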

Considerations

Step 4: Identify Unique Data for Navigation Paths and/or Simulated Users

Unfortunately, navigation paths alone don’t provide all of the information required to implement a workload simulation. To fully implement the workload model, several more pieces of information are needed. This information includes items such as:
Below is an example of unique data identified for an eCommerce application:
Implementation Data
Scenario | Page/Step | Data Inputs | Data Outputs | Think Time
Login | Login page | Username (unique), Password (matched to username) | | 6 – 9 Sec, Random
Browse | Login page | Username (unique), Password (matched to username) | | 6 – 9 Sec, Random
 | Browse | Catalog Tree/Structure (static), User Type (weighted) | Product Description, Title, Category | 4 – 60 Sec, Random
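
As a rough illustration of how this kind of data might be carried into a test script, the sketch below records the think-time bounds for each step and draws a random delay within them. The Step class and its field names are hypothetical, and the uniform distribution is an assumption: the table only says the think time is random.

```python
import random
from dataclasses import dataclass


@dataclass
class Step:
    """One step of a scenario with its think-time bounds in seconds."""
    name: str
    data_inputs: list
    min_think_sec: float
    max_think_sec: float

    def think_time(self, rng):
        # Assumption: delays are uniformly distributed between the bounds.
        return rng.uniform(self.min_think_sec, self.max_think_sec)


login = Step("Login", ["username", "password"], 6, 9)
browse = Step("Browse", ["catalog tree", "user type"], 4, 60)

rng = random.Random(42)  # fixed seed so a test run is repeatable
login_delays = [login.think_time(rng) for _ in range(5)]
browse_delays = [browse.think_time(rng) for _ in range(5)]
```

Passing the random generator in explicitly keeps each simulated user reproducible, which helps when comparing results across test runs.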

Considerations

Step 5: Determine Relative Distribution of Scenarios

Now that you’ve determined which scenarios you want to simulate, and what the steps and associated data are for those scenarios, you need to determine how often each scenario should be simulated relative to the other scenarios to complete the workload model.

Sometimes, one workload model is not enough. Research and experience tell us that user activities often vary greatly over time. To ensure test validity, you must evaluate activities by time of day, day of week, day of month, and time of year. As an example, consider an online bill payment site. If all bills go out on the 20th of the month, the activity on the site immediately before the 20th will be dominated by system administrators updating accounts, importing billing information, and so on, while immediately after the 20th, customers will be viewing and paying their bills until the payment due date of the 5th of the next month.

The most common methods to determine the relative distribution of scenarios are:
Below is an example of the distribution of scenarios for an eCommerce application:
Work Distribution
User Scenarios | % of Work Distribution
Browse | 50
Search | 30
Place Order | 20
Total | 100
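
A distribution like this can be applied by having each simulated user pick its next scenario at random, weighted by the percentages. The sketch below is one possible approach under that assumption; the pick_scenario helper is hypothetical, and many load generation tools provide their own way to set a scenario mix.

```python
import random
from collections import Counter

# Percentages taken from the Work Distribution example.
DISTRIBUTION = {"Browse": 50, "Search": 30, "Place Order": 20}
assert sum(DISTRIBUTION.values()) == 100


def pick_scenario(rng, distribution):
    """Pick one scenario name, weighted by its share of the work."""
    names = list(distribution)
    weights = list(distribution.values())
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(7)  # fixed seed so the tally is repeatable
tally = Counter(pick_scenario(rng, DISTRIBUTION) for _ in range(10000))
shares = {name: tally[name] / 10000 for name in DISTRIBUTION}
```

Over a large number of picks, the observed shares should settle close to 50, 30, and 20 percent.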

Considerations

Step 6: Identify Target Load Levels

Each workload model is frequently executed at a variety of load levels, and the load level is very easy to change at run time in most load generation tools. Even so, it is important to identify the expected and peak target load levels for each workload model so that results can be used to predict, or be compared with, production conditions. Below are the inputs and outputs for determining target load levels:

Inputs

Volume

The information below can be extracted from Web server logs, marketing, or stakeholders:

Time Period | Business volume (# sessions to the Web site) | Business volume (# sessions, peak values) | Peak load increase | Peak build-up time | Peak duration
Monthly | 460789 | 1360890 | 2.95 | 1 hour | 2 hours
Daily | 15359 | 45363 | 2.95 | 1 hour | 2 hours
Hourly (15 hour traffic) | 1023 | 3024 | 2.95 | 1 hour | 2 hours
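
The daily and hourly figures in this table appear to follow from the monthly figures. The sketch below reproduces them under two assumptions: a 30-day month (inferred, not stated in the table) and 15 hours of traffic per day (from the "15 hour traffic" label), with fractions dropped. The function names are illustrative.

```python
def derive_volumes(monthly_sessions, days_per_month=30, traffic_hours_per_day=15):
    """Break a monthly session count down to daily and hourly counts.

    Assumes a 30-day month and 15 traffic hours per day; fractions are
    dropped, which matches the whole numbers shown in the table.
    """
    daily = monthly_sessions // days_per_month
    hourly = daily // traffic_hours_per_day
    return daily, hourly


def peak_load_increase(normal_sessions, peak_sessions):
    """Ratio of peak volume to normal volume, to two decimal places."""
    return round(peak_sessions / normal_sessions, 2)


normal_daily, normal_hourly = derive_volumes(460789)
peak_daily, peak_hourly = derive_volumes(1360890)
monthly_increase = peak_load_increase(460789, 1360890)
print(normal_daily, normal_hourly, peak_daily, peak_hourly, monthly_increase)
```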


Output

By combining the volume information with the objectives, key scenarios, user delays, navigation paths, and scenario distributions from the previous steps, you can determine the remaining details necessary to implement the workload model at a particular target load.

Example:
Total hourly sessions: 1023
Total hourly sessions (peak): 3024
Place Order sessions, 20%: 205
Place Order sessions, 20% (peak): 605
Average session time = 18 minutes
Sessions per user per hour = 60/18 = 3.33
Total users to produce 205 orders = 61 (205/3.33)
Total users to produce 605 orders = 181 (605/3.33)

User Scenarios | % of Work Distribution | Total normal sessions | Total peak sessions | Session duration (minutes) | # concurrent users (normal) | # concurrent users (peak) | % new users
Browse | 50 | 512 | 1512 | 15 | 128 | 378 | 10%
Search | 30 | 307 | 907 | 7 | 36 | 106 | 10%
Place Order | 20 | 205 | 605 | 18 | 61 | 181 | 12%
Total | 100 | 1023 | 3024 | | | | 12%
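
The arithmetic behind this table can be sketched as follows. Sessions per scenario are the hourly total multiplied by the scenario's share, and concurrent users are the hourly sessions multiplied by the session duration and divided by 60 minutes. The function names are illustrative, and the rounding rule for sessions is an assumption chosen to match the table.

```python
def scenario_sessions(total_sessions, percent):
    """Sessions per hour for one scenario, rounded half up (integer math)."""
    return (total_sessions * percent + 50) // 100


def concurrent_users(sessions_per_hour, session_minutes):
    """Average number of users active at once during the hour.

    Each user can complete 60 / session_minutes sessions per hour, so the
    users needed are sessions_per_hour divided by that rate.
    """
    return sessions_per_hour * session_minutes / 60


# (scenario, % of work, session duration in minutes) from the table.
scenarios = [("Browse", 50, 15), ("Search", 30, 7), ("Place Order", 20, 18)]

for name, percent, minutes in scenarios:
    normal = scenario_sessions(1023, percent)
    peak = scenario_sessions(3024, percent)
    print(name, normal, peak,
          concurrent_users(normal, minutes),
          concurrent_users(peak, minutes))
```

The function returns fractional values; the table lists whole numbers. For Place Order, the computed values are 61.5 and 181.5, which the table shows as 61 and 181.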

Considerations

Step 7: Prepare to Implement the Model

Preparing to implement the model is tightly tied to the method used to implement it, typically a load generation tool. For more information about implementing a workload model using VSTS, see {HowTo:SomeName}.

Considerations

Additional Resources