pnp.gif

How To: Identify Functions causing a High User-mode CPU Bottleneck for Server Applications in a Production Environment

Clint Huffman

Applies To

Summary

High CPU performance issues are relatively common. This How-To focuses on how to identify the functions consuming the most user mode processor time on computers that do not have Visual Studio installed such as production level servers. Once the functions have been identified, finding the lines of code causing the problem is much easier.

Contents

Objectives

In this module, you will learn to do the following:

Overview

Microsoft Visual Studio Team System ships with a powerful new profiling tool that allows us to discover what function calls our applications are spending most of their time in. This How-To focuses on how to identify the functions consuming the most user mode processor time on computers that do not have Visual Studio installed such as production level servers.

This How-To article shows you how to identify a high user mode processor bottleneck, how to identify which process or processes are causing the high user mode processor bottleneck, how to install and setup the stand alone Visual Studio Profiler, how to profile the process using the stand alone Visual Studio Profiler, and finally how to compile and interpret the profiler report. The result is statistics on which function calls are consuming the most user mode processor time.

Note: The Visual Studio 2005 Profiler does not work against .NET Framework v1.1 applications. For more information on this subject, see the “Profiling .NET Framework v1.1 Applications Considerations” section below.

Summary of Steps

Step 1. Identify a high user mode processor bottleneck

Before jumping right into profiling a process we must first identify if the server is having a processor bottleneck. Using the Microsoft Performance Monitor tool which is built-in to the operating system, we will look at the “Processor\% Processor Time” and “Processor\% User Time” performance counters to determine if an overall user mode processor bottleneck exists.

In the image below, we see a screenshot of Performance Monitor, and we see a red line and a white line. The red line is “Processor\% Processor Time” which is the total processor time of the server. The white line is “Processor\% User
Time” which is the total user mode processing time.

Note: For more information on the difference between privileged mode versus user mode processing, then see the “Privileged Mode versus User Mode Processing” section in the More Information section of this document.

What we are looking for is to see if the processor is spending most of its time in user mode CPU. Otherwise, it means the processor is spending too much time in privileged mode CPU, which needs to be resolved differently and not part of the scope of this document. In this case, the processor is spending 98% of its time in user mode processing making this computer a good candidate for profiling, which is what this document focuses on.

HighCPU1.GIF

Note: Sharp spikes in CPU usage is normal.

To view “% Processor Time” and “% User Time” in Microsoft Performance Monitor, follow these steps:
  1. Open the Microsoft Performance Monitor: Click Start, Run, then type, perfmon, then click Ok. The Microsoft Performance Monitor will show.
  2. Clear the graph: Clear the existing counters by clicking the New counter set button in the upper left hand corner. This is so we can start with a clean graph.
  3. Add all of the Processor instances of “% Processor Time” and “% User Time”:
    • In Perfmon, click the Add (plus sign) button to add counters
    • Select Processor as the Performance object, select % Processor Time and % User Time as the counters.
    • Select All instances as the instances
    • Click the Add button, and then click Close. This will add all of the instances of % Processor Time and % User Time to the graph with the value of their respective user mode CPU consumption.
HighCPU2.GIF
After adding the counters, you should see a graph similar to this:
HighCPU3.GIF
You can optionally remove the instances of Total. Total adds up the CPU usage of all of the processors on the computer, then it divides it by the number of processors. This can sometimes skew a high CPU problem on one of more of the processors.

Alternatively, we could have made the same conclusion using the Windows Task Manager. It is recommended to use the Performance Monitor method, so you can see the processor usage is on each of the processors versus an average of all of the processors. To use Task Manager to analyze processor usage, simply open the Performance tab of Task Manager, click View and check Show Kernel Times. This will add a red line to the CPU Usage. If the system CPU usage consistently high (typically above 75% on average) and the majority of it is green (meaning high user mode CPU usage), then you have a user mode CPU bottleneck.

HighCPU4.GIF
HighCPU5.GIF
Once we have determined the system has a high user mode CPU bottleneck, then the next step is to identify which process or processes are causing the high CPU consumption.

For more information on how to identify a processor bottleneck, see “Detecting Processor Bottlenecks” at:
http://www.microsoft.com/technet/archive/ntwrkstn/reskit/procsr.mspx?mfr=true

Step 2. Identify which processes are causing a high user mode processor bottleneck

Again, using the Microsoft Performance Monitor (perfmon) tool we will use the “Process\% User Time” performance counter to determine which processes are taking up the most user mode CPU (processor) time.

HighCPU6.GIF
Note: Try not to confuse “Process” and “Processor”.

You should see a graph similar to this:
HighCPU7.GIF HighCPU8.GIF
Now, we see some processes taking up CPU. Double-click on each of these to identify them.
Note: You may see values greater than 100%. These counters are adding up all of the CPU consumption across all of the processors. For example, if you have a computer with 8 processors, then any process has the potential of consuming up to 800% CPU.

In this case, NetDebugDemos.exe is the process causing the most user mode CPU consumption.
HighCPU10.GIF
For more information on how to identify a processor bottleneck, see “Detecting Processor Bottlenecks” at:
http://www.microsoft.com/technet/archive/ntwrkstn/reskit/procsr.mspx?mfr=true

Step 3. Installation of the stand alone profiler

Once a process or set of processes have been identified as high user mode CPU consumers, we need to use a process profiler such as the Visual Studio 2005 Profiler tool to be able to see which functions the application is spending the majority of its time in.

Before we can profile the process or processes, we must first install the command line based stand alone profiler (VSProfiler). The stand alone profiler is best used on systems where it is not practical to have the Visual Studio 2005 IDE installed such as on quality assurance (QA), pre-production, and production level servers.

Important: Usage of the stand alone profiler should be tested in a non-production environment prior to using it in a quality assurance (QA), pre-production, or production level server environment. Usage in a production level environment should be a last resort.

The following steps walk through how to install the stand alone, command line, Visual Studio 2005 Profiler (VSProfiler):

The Visual Studio 2005 Profiler should now be installed successfully. Next, the profiling environment needs to be prepared.

Step 4. Preparing the environment for profiling

In order to profile an application that is consuming high user mode CPU, the CLR profiling environment variables must be in place before the application needing to be profile can be properly profiled. This step walks through how to prepare the environment for sample based profiling.

Warning: This step requires a reboot of the server.

For more information about the difference between sampling based profiling and instrumented based profiling, see the More Information section of this document.

The following steps walk through how to prepare the environment for profiling applications:
If you installed it to the default directory locations, then run the following command where the respective paths are the installation directory and IDE directories:

SET PATH=%PATH%;C:\Program Files\Visual Studio 8\Team Tools\Performance Tools;C:\Program Files\Visual Studio 8\Common7\IDE

In our case, we ran:

SET PATH=%PATH%;D:\Program Files\Visual Studio 8\Team Tools\Performance Tools;D:\Program Files\Visual Studio 8\Common7\IDE

HighCPU11.GIF
Run “set” to ensure the PATH environment variable has been updated. It should look something like this:

Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;D:\Program Files\Visual Studio 8\Team Tools\Performance Tools;D:\Program Files\Visual Studio 8\Common7\IDE
Usage: VSInstr /U filename options
HighCPU12.GIF
This verifies that the VS_Profiler commands are running properly
VSPerfCLREnv /globalsampleon

Enabling VSPerf Global Profiling. Allows to 'attaching' to managed services.
HighCPU13.GIF
Important: Applications running under the CLR profiling environment variables may experience a slight degradation in performance due to the profiling overhead. Therefore, applications on the server that have been restarted after turning on global profiling may experience a slight performance degradation. This degradation is typically trivial, but still worth mentioning. In any case, this is why you should disable global sampling profiling as soon as you are done with it and reboot the server. Again, be sure to test this procedure in a testing environment first. For more information on the “VSPerfCmd” commands see the documentation on the “VSPerfCmd” command at http://msdn2.microsoft.com/en-us/library/ms182403(VS.80).aspx
If you installed it to the default directory locations, then run the following command where the respective paths are the installation directory and IDE directories:

SET PATH=%PATH%;C:\Program Files\Visual Studio 8\Team Tools\Performance Tools;C:\Program Files\Visual Studio 8\Common7\IDE

In our case, we ran:

SET PATH=%PATH%;D:\Program Files\Visual Studio 8\Team Tools\Performance Tools;D:\Program Files\Visual Studio 8\Common7\IDE

HighCPU14.GIF
Run “set” to ensure the PATH environment variable has been updated. It should look something like this:

Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;D:\Program Files\Visual Studio 8\Team Tools\Performance Tools;D:\Program Files\Visual Studio 8\Common7\IDE

Set the “NTSYMBOL_PATH” environment variable to symsrv*symsrv.dll*c:\symcache*http://msdl.microsoft.com/download/symbols by running the following command:
SET _NT_SYMBOL_PATH={{symsrv*symsrv.dll*c:\symcache*http://msdl.microsoft.com/download/symbols}}

Run “set” again to verify the NTSYMBOL_PATH is set.

For more detailed instructions on setting up a symbol server, see “Download Windows Symbol Packages”
http://www.microsoft.com/whdc/devtools/debugging/symbolpkg.mspx

We are now ready to profile a process.

Step 5. Profiling the process

The following steps walk through how to profile an application or service:
In our case, the user mode CPU on our server is at 100% user mode CPU caused by the NetDebugDemos.exe application.
HighCPU15.GIF
For more information on how to identify processes consuming high user mode CPU, go to Step 1 and Step 2 of this document or use “Detecting Processor Bottlenecks” at:
http://www.microsoft.com/technet/archive/ntwrkstn/reskit/procsr.mspx?mfr=true
VSPerfCmd /start:sample /attach:<pid|name> /output:<outputfile>.vsp

In our case, we ran the following command:

VSPerfCmd /start:sample /attach:NetDebugDemos.exe /output:NetDebugDemos.vsp

After running the above command, the command line should return output similar to this where 1212 is the process ID of NetDebugDemos.exe.

Successfully attached to process ID:1212
HighCPU16.GIF
If you receive Warning VSP2309, then this is normal. This just means to be sure to turn off global CLR profiling and reboot later to clear its overhead.

VSPerfCmd /status

Process and Thread Status
============================================================
Output File Name : D:\Program Files\Microsoft Visual Studio 8\Team Tools\Performance Tools\NetDebugDemos.vsp
Collection mode : SAMPLE
Maximum Processes : 64
Maximum Threads : 128
Number of Buffers : 258
Size of Buffers : 65536
============================================================
Maximum Number of Processes : 64
Number of Active Processes : 1
Global Start/Stop Count : 1
Global Suspend/Resume Count : 0
--------------------------------------------------------
Process : NetDebugDemos.exe
Process ID : 3728
Num Threads : 5
Start/Stop Count : 1
Suspend/Resume Count : 0
============================================================
Users with access rights to monitor:
UserName (SID)
BUILTIN\Administrators (S-1-5-32-544)
NT AUTHORITY\SYSTEM (S-1-5-18)

VSPerfCmd /detach:<pid|name>

VSPerfCmd /detach:NetDebugDemos.exe
Successfully detached from process ID:1212
HighCPU17.GIF
VSPerfCmd /shutdown

Shutting down the Profile Monitor
--------------------------------------------------------
HighCPU18.GIF

VSPerfCLREnv /globaloff

Disabling VSPerf Global Profiling.
You need to restart the service to detect the new settings. This may require a reboot of your machine.
The profiling session is finished and the global profiling environment is disabled. Next, we need to compile the report in order to review the results.

For more information on sampled based profiling using the stand alone command line profiler (VSProfiler), see “Walkthrough: Command-Line Profiling Using Sampling”
http://msdn2.microsoft.com/en-gb/library/aa985627(VS.80).aspx

Step 6. Compiling the profiler report

Once the profiling data is captured into a VSP file, we need to compile the report. Compilation has a lot of options that we cannot fully cover in this document, so we will cover a simple compilation that will give us the top function calls being used by the application we profiled. This step walks though how to compile the report using the command line VSProfiler commands.

If you want a more organized profiler report, then consider compiling the report inside of Visual Studio. See the “Compiling the Profile Report Using Visual Studio Considerations” section of this document on how to do this.

Compilation of the report is where symbol matching takes place. Symbols match up memory offsets to function names kind of like computer names resolve to IP addresses. In most cases, applications built on the .NET Framework do not need symbols to tell us information about its functions, but native applications (those not built on the .NET Framework) require symbols. At best, native applications without symbols can only tell us which DLL or EXE we are in.

Note: The SYMBOLPATH option in VSPerfReport may not need to be specified since we already defined it using the NTSYMBOLPATH environment variable. If you did not specify the NTSYMBOLPATH, then you should specify it in the command line using the /SYMBOLPATH parameter.

VSPerfReport <OutputFile>.vsp /SUMMARY:ALL /SYMBOLPATH:<SymbolPath>

VSPerfReport NetDebugDemos.vsp /SUMMARY:ALL /SYMBOLPATH:symsrv*symsrv.dll*c:\symcache*http://msdl.microsoft.com/download/symbols

During compilation, you may be prompted to accept a license agreement to use the Microsoft public symbol server. Read over the agreement and accept to continue. The Microsoft public symbol server is needed to resolve memory offsets to functions for Microsoft DLL’s and EXE’s. An internet connection is needed to contact the public symbol server.

Note: The Microsoft public symbol server has the address of http://msdl.microsoft.com/download/symbols. Even though this is an HTTP address, a web browser cannot be used. The data at this site is not browseable. Instead, it is accessed by the debuggers that ship with Debugging Tools for Windows and profilers.

You should see output similar to this:
A report file NetDebugDemos_Header.CSV is generated.
A report file NetDebugDemos_FunctionSummary.CSV is generated.
A report file NetDebugDemos_ThreadSummary.CSV is generated.
A report file NetDebugDemos_CallTree.CSV is generated.
A report file NetDebugDemos_CallerCallee.CSV is generated.
A report file NetDebugDemos_ModuleSummary.CSV is generated.

Some symbol file resolution errors may occur. This is only a concern if the reports show a DLL or EXE name without a function call next to it. If it is a Microsoft DLL or EXE, then go to the Microsoft Symbol Server web site or open a support case with Microsoft Product Support to resolve the issue. Otherwise, contact the vendor of the DLL or EXE to acquire symbols. The report will need to be compiled again with the new symbols.

Alternatively, if you want a nicer looking report, then you can compile the report inside of Visual Studio. See the “Compiling the Profile Report Using Visual Studio Considerations” section of this document on how to do this.

For more information about how to use the Microsoft Symbol Server go to:
http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx

Step 7. Analyzing the results of the profiler report

Compilation of the report using the VSPerfReport command will generate several comma separated value (CSV) files. Many of these files can help with tracking down where the application is spending most of its time.

Using Microsoft Excel or another CSV viewing application, open the file with the name <ReportName>_FunctionSummary.CSV. You should see output similar to the table below. In the sample output below, we have removed most of the data for brevity. The FunctionSummary.CSV file lists all of the functions called during profiling. The functions are grouped by the module they are in.
<<Insert Table>>
In our output, we show NetDebugDemos.Hang.HighCPU() has the highest Exclusive Samples (meaning the application was exclusively in this function when sampled) and the Exclusive Percent is 100%. This tells us our application is spending all of is user mode CPU in NetDebugDemos.Hang.HighCPU(). In a normal application, these results will not be so clear, so pay particular attention to which functions have the highest inclusive and exclusive percentages.

Exclusive samples is the total number of performance data samples collected for the function excluding performance data for other functions that were called by it.

Inclusive samples is the total number of performance data samples collected for the function including performance data for other functions that were called by it.

Exclusive Percent is the percentage of performance data samples where the application was exclusively in function.

Inclusive Percent is the percentage of performance data samples where the application was had this function in the call stack, but was not exclusively in the function. Let’s say you had an application that started off in the Main() function and the Main() function call HighCPU() function. Your call stack would look like this: Main() => HighCPU(). When sampled in the profiler, Main() would get 1 inclusive sample and HighCPU() would get 1 exclusive sample. In a nutshell, functions with a very high inclusive sample rate or percentage means that the application spent most of its time in other functions called by this function. Functions with a very high exclusive sample rate or percentage means the application spent most of its time in this function.

For more information about the descriptions of each column in this report, then go to: http://msdn2.microsoft.com/en-us/library/ms242735(VS.80).aspx

Here is some brief information about the CSV files produced in the report compilation:

Profiling .NET Framework v1.1 Application Considerations

The profiler that shipped with Visual Studio 2005 is unable to profile applications built on the .NET Framework v1.1. Therefore, if sampling based or instrumented based profiling is needed, then consider third party products. With that said, the .NET Framework v1.1 shipped with tools such as the CLRProfiler, but the CLRProfiler’s primary focus is memory allocations and is a different type of profiler.

The CLR Profiler enables you to look at the managed heap of a process and investigate the behavior of the garbage collector. Using the various views in the tool, you can obtain useful information about the execution, allocation, and memory consumption of your application.

For more information about how to use the CLRProfiler, go to: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/scalenethowto13.asp

Compiling the Profile Report Using Visual Studio Considerations

If you want a more organized profile report, then consider compiling the report using Visual Studio. The report is consolidated into one report and data views can be resorted helping with interpreting the report.

The following steps must be performed on the computer with Visual Studio installed:
  1. Use the same executable path and version: The executable file (*.exe) and its supporting DLL’s that were targeted by the profiling must be in the same absolute path on the Visual Studio computer as it was on the computer where profiling took place. For example, if the path to the executable file was “D:\apps\myapp.exe”, then myapp.exe must be at the same path (“D:\apps\myapp.exe”) on the Visual Studio computer. Furthermore, the executable file must be the same build version used on the computer where it was profiled.
  2. Set the NTSYMBOLPATH: The NTSYMBOLPATH environment variable must set with the appropriate symbol path on the Visual Studio computer. For more information on the Microsoft Symbol Server go to:
http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx
  1. Start/Restart Visual Studio: Once the NTSYMBOL_PATH environment variable is set, then Visual Studio must be started/restarted for it to take effect.
  2. Open or Create a new Visual Studio Project: In Visual Studio open or create a new project.
  3. Open a new Performance Session: In Visual Studio, click “Tools”, “Performance Tools”, “New Performance Session…”. Performance Explorer should show.
  4. Open one of the profiling output files: Expand “Performance1” or any of the performance nodes, right-click on Reports, and select “Add Report…”. The Add Report dialog box should show. Browse to one of the profiling output files (*.vsp), then click the Open button. The output file should show under the Reports node.
  5. Compile the report: Double-click on the VSP file that you just added to compile the report. This process can be time consuming and resource intensive. An internet connection may be required to contact the Microsoft Symbol Server.

After compiling the report in Visual Studio, you should see a report similar to this:
HighCPU19.GIF

More Information

!!!Privileged Mode versus User Mode Processing
The processor can only be in one of two modes: privileged mode or user mode. Privileged mode (also known as kernel mode) is how much time the processor is spending doing things in the kernel such as dealing with input/output (I/O) of the disk subsystem, memory I/O, context switching, device drivers, etc. User mode is how much time the processor is spending executing application code. If % time in privileged mode and % time in user mode were added up, then you have the % total processor time. If we have a server spending all of its time in user mode processing, then it means we need to find out more about what the application is doing “under the hood”. “Under the hood” means profiling it which we will get to later in this document.

!!!Sampling Based Profiling versus Instrumented Based Profiling
The profiler has many other options such as instrumented based profiling and sampling based profiling. While instrumented based profiling session will give you more detail in the final report, it has significantly more overhead than a sampling based profiler session and could degrade an application’s performance to a point where it begins behaving differently. Sampling based profiling periodically samples the application allowing the application to mostly run free with the least amount of profiling overhead. Therefore, it is recommended to do a sampling based profiler session when performance tuning or profiling production level applications. In a nutshell, instrumented based profiling is invasive to an application while sampling based profiling is relatively non-invasive.

!!Additional Resources