Monday, August 18, 2014

Processor Performance Counter

CPU and memory utilization are key metrics captured in any performance test. This post focuses on CPU utilization and on dealing with CPU bottlenecks.

When we say that CPU utilization is 80% on a single-core 4 GHz CPU, it means that roughly 3.2 billion of the 4 billion cycles available each second are being used. In practice the calculation is not that simple once we have multiple cores, hyperthreading, virtualization, shared caches, and other advances in the infrastructure space.
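The single-core arithmetic above can be written out as a small sketch. The function name is hypothetical, and it deliberately ignores the multi-core, hyperthreading, and virtualization effects just mentioned, so treat it as an illustration of the simple case only.

```python
def busy_cycles_per_second(clock_hz: int, utilization_pct: int) -> int:
    """Cycles per second spent on non-idle work, for ONE core only.

    Simplifying assumption: a fixed clock speed and a single core.
    """
    return clock_hz * utilization_pct // 100


# 4 GHz, single core, 80% utilization
print(busy_cycles_per_second(4_000_000_000, 80))  # 3200000000
```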

For any load test run, the following performance counters should typically be monitored:

- % Processor Time (_Total instance) - Percentage of elapsed time the CPU is busy executing non-idle threads (an indicator of processor activity). 85% processor utilization can be taken as a threshold value.

Normally, CPU utilization should increase as load increases. If it does not, we may have a bottleneck that will impact throughput and response time. Underutilization commonly happens on multiprocessor systems running a single JVM. To take advantage of most of the processing power, we might consider running more than one JVM.


Sometimes a load test will show a spike or burst in CPU utilization. Understanding the reason behind the burst requires additional metrics and counters.

- Processor\% User Time : This counter helps us identify user-mode processor bottlenecks.

- Processor\% Privileged Time - Percentage of elapsed time that threads spend executing in privileged (kernel) mode, for example on file or network I/O or memory allocation.

Processor\% Privileged Time consistently over 75 percent indicates a bottleneck.

- System\Processor Queue Length - Number of threads that are ready to run but are waiting for a processor.

A Processor Queue Length consistently greater than 2 indicates a bottleneck. It is worth checking with the dev/infra team about the thread pool size and whether it should be increased.


- System\Context Switches/sec - A context switch occurs when a higher-priority thread preempts a lower-priority thread that is currently running. A high rate can indicate that too many threads are competing for processor time. If processor utilization is low and the context switching rate is also very low, it could indicate that threads are blocked.

As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, then there is a constraint.
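As a rough illustration, the thresholds quoted in this post can be collected into a small checker. This is only a sketch: the dictionary keys, function names, and the "watch" label for the band between 5,000 and 15,000 context switches are assumptions made for the example, and it looks at a single sample, whereas the guidance above is about values that stay high consistently.

```python
# Thresholds quoted in the post
CPU_BUSY_PCT = 85
PRIVILEGED_PCT = 75
QUEUE_LENGTH = 2
CTX_SWITCH_OK = 5_000           # per second, per processor
CTX_SWITCH_CONSTRAINT = 15_000  # per second, per processor


def context_switch_level(switches_per_sec, processors):
    """Classify the per-processor context switch rate."""
    per_processor = switches_per_sec / processors
    if per_processor < CTX_SWITCH_OK:
        return "ok"
    if per_processor > CTX_SWITCH_CONSTRAINT:
        return "constraint"
    return "watch"  # assumed label; the post gives no verdict for this band


def flag_counters(sample, processors=1):
    """Return the names of counters that breach the thresholds above.

    `sample` is a hypothetical dict holding one reading per counter.
    """
    flagged = []
    if sample["processor_time_pct"] >= CPU_BUSY_PCT:
        flagged.append("% Processor Time")
    if sample["privileged_time_pct"] > PRIVILEGED_PCT:
        flagged.append("% Privileged Time")
    if sample["processor_queue_length"] > QUEUE_LENGTH:
        flagged.append("Processor Queue Length")
    if context_switch_level(sample["context_switches_per_sec"], processors) == "constraint":
        flagged.append("Context Switches/sec")
    return flagged


sample = {
    "processor_time_pct": 92.0,
    "privileged_time_pct": 40.0,
    "processor_queue_length": 5,
    "context_switches_per_sec": 48_000,
}
print(flag_counters(sample, processors=4))
# ['% Processor Time', 'Processor Queue Length']
```

In a real run the check would be applied to averages over the test window rather than to one reading.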



Monday, August 11, 2014

PowerShell for Performance counters

APM tools and perfmon are great for monitoring system resource utilization. We can also use simple PowerShell scripts to monitor resource utilization and much more. Needless to say, all Windows Server products come with PowerShell built in.

Below is a simple PowerShell script that can be used to monitor overall CPU utilization and the memory used by a particular process.


    # Number of samples to take and the pause (in seconds) between them
    $loop_count = 3
    $sleep_interval = 5
    # Thresholds: CPU in percent, memory in MB (private memory of the target process)
    $cpu_threshold = 85
    $memory_threshold = 90
    $hitcpu = 0
    $hitmemory = 0
    $target = "firefox"
    $log_file = "c:\users\amah11\Desktop\logs.txt"
    foreach ($turn in 1..$loop_count)
    {
        # Average the load across all processor sockets
        $cpu = (Get-WmiObject -Class Win32_Processor | Measure-Object -Property LoadPercentage -Average).Average
        # A process name can match several instances, so sum their private memory
        $process = Get-Process | Where-Object {$_.ProcessName -eq $target}
        $memory = 0
        if ($process)
        {
            $memory = [Math]::Round(($process | Measure-Object -Property PrivateMemorySize64 -Sum).Sum / 1MB, 2)
        }
        Add-Content $log_file "CPU utilization is currently at $cpu%"
        Add-Content $log_file "Memory utilized by $target is $memory MB"
        if ($cpu -ge $cpu_threshold)
        {
            $hitcpu = $hitcpu + 1
        }
        if ($memory -ge $memory_threshold)
        {
            $hitmemory = $hitmemory + 1
        }
        Start-Sleep -Seconds $sleep_interval
    }
    # Warn only if the threshold was breached in every sample
    if ($hitcpu -eq $loop_count)
    {
        Write-Host "CPU utilization above $cpu_threshold" -foregroundcolor red -backgroundcolor yellow
    }
    if ($hitmemory -eq $loop_count)
    {
        Write-Host "Memory utilization above $memory_threshold" -foregroundcolor red -backgroundcolor yellow
    }

In this script we store the results in a log file and check whether any counter goes above its threshold value. If the threshold is breached in every sample, we echo a warning message.

Through PowerShell we have access to all the available performance counters. Also, to capture memory utilization for a process that is not running before the test starts, we would otherwise have to capture all process instances and then filter the results, which can be avoided by using a PowerShell script.

Wednesday, August 6, 2014

Virtual Users (Vusers) vs Real Users




We all know the Vuser concept in performance testing, but are Vusers and real users the same, and is there always a one-to-one relationship between them? No, that is not the case. It depends on what is being tested and how we have scripted and designed the scenario. We can design the scenario so that a single Vuser represents multiple real users.

To determine the ratio, we need a good understanding of the performance goal and the application usage pattern. Little's law can be used to estimate the number of Vusers.

Little's law is represented by the formula below:
L = λW

where L is the average number of users in the system (the number of Vusers to simulate), W = average response time + think time, and λ is the arrival rate.

So, consider an eCommerce application where users arrive at a rate of 10 users per second and the target average response time is 5 seconds. With no think time, the number of Vusers we would be simulating is 10 × 5 = 50.

The above concept helps in designing scenarios when the tool license restricts the number of Vusers. Using this approach, we can simulate the load of a higher number of users by manipulating the think time.
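The calculation can be sketched in a few lines. The function names are hypothetical, and the 10-second think time in the second call is an assumed figure used only to show how think time changes the Vuser count for the same arrival rate.

```python
import math


def vusers_needed(arrival_rate_per_sec, avg_response_time_sec, think_time_sec=0.0):
    """Little's law: L = lambda * W, with W = response time + think time."""
    w = avg_response_time_sec + think_time_sec
    return math.ceil(arrival_rate_per_sec * w)


def arrival_rate_supported(vusers, avg_response_time_sec, think_time_sec=0.0):
    """Little's law rearranged: lambda = L / W."""
    return vusers / (avg_response_time_sec + think_time_sec)


# The example from the post: 10 users/sec, 5 sec response time, no think time
print(vusers_needed(10, 5))              # 50

# Same arrival rate, but real users who pause 10 sec between requests (assumed)
print(vusers_needed(10, 5, 10))          # 150

# 50 Vusers with zero think time generate the same 10 arrivals/sec
print(arrival_rate_supported(50, 5, 0))  # 10.0
```

Read this way, 50 Vusers running without think time produce the same arrival rate as 150 real users who pause 10 seconds between requests, which is how a license-limited Vuser count can stand in for a larger population.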


Load generator capacity calculation


Resource requirements for load test infrastructure vary from application to application, depending on the technology stack being used and the complexity of the scenarios and scripts. Saying "my load generator will support x users" is a very risky statement unless we have done some analysis and math to back it up. HP suggests the following steps to work out load generator capacity for a given protocol and test script:

  1. Run a single-user test using the Controller. Keep a delay of a few minutes before the script starts. Once script execution starts, observe the decrease in available memory. The amount by which memory decreases is our "First Vuser Memory".
  2. Modify the test to run 5-10 Vusers. Keep a delay of a few minutes before the script starts and between Vusers. Note the decrease in available memory as each new Vuser ramps up. This decrease is our "Each Additional Vuser Memory".
  3. Now, to get the load generator capacity, find out the total RAM available on the load generator. This is "Total RAM".
  4. Subtract 700-750 MB of RAM for OS activities.
  5. Take 75% of the remaining RAM.
  6. Subtract "First Vuser Memory" from the figure in step 5.
  7. Divide the result by "Each Additional Vuser Memory" and add 1 (for the first Vuser) to get the number of Vusers supported by the LG.


So, we arrive at the following formula for load generator capacity based on RAM:

(((Total LG RAM - ~750 MB) × 0.75) - First Vuser Memory) / Each Additional Vuser Memory + 1

This formula gives good results for all protocols except those involving GUI interaction, such as Citrix, TruClient, and RDP, because these protocols use GDI resources, which are not taken into account in the above calculation.

The above steps can be adapted to estimate capacity based on other system resources as well.

The result obtained above can be treated as a conservative figure, but it is good to play safe when you don't want the test infrastructure to affect your test.
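As a worked example, the RAM-based steps can be sketched as a small function. It assumes one reading of the steps: reserve about 750 MB for the OS, keep 75% of what is left, subtract the first Vuser's memory, divide by the per-Vuser memory, and count the first Vuser by adding 1. The function name and the sample figures (an 8 GB load generator, 50 MB for the first Vuser, 5 MB for each additional one) are hypothetical.

```python
import math


def lg_vuser_capacity(total_ram_mb, first_vuser_mb, additional_vuser_mb,
                      os_reserve_mb=750, usable_fraction=0.75):
    """Estimate how many Vusers one load generator can host, based on RAM."""
    if additional_vuser_mb <= 0:
        raise ValueError("additional_vuser_mb must be positive")
    usable_mb = (total_ram_mb - os_reserve_mb) * usable_fraction
    remaining_mb = usable_mb - first_vuser_mb
    if remaining_mb < 0:
        return 0  # not even the first Vuser fits
    # Additional Vusers that fit, plus the first Vuser itself
    return math.floor(remaining_mb / additional_vuser_mb) + 1


# Hypothetical measurements: 8192 MB RAM, 50 MB first Vuser, 5 MB each additional
print(lg_vuser_capacity(8192, 50, 5))  # 1107
```

The memory figures should come from measurements on the actual script and protocol, as described in steps 1 and 2.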