Windows Capacity and Performance Monitoring Overview

Microsoft provides access to a vast array of capacity and performance monitoring counters. These can be monitored through a bundled tool named System Monitor (Sysmon) or through a multitude of third-party tools. The sheer number of counters is often daunting to the uninitiated. This article provides a simple introduction to Windows performance monitoring and a basic set of meaningful performance counters to be monitored on any server.

Sysmon Overview

Sysmon organises performance counters hierarchically. Counters are logically grouped together under performance objects. There are currently 86 performance objects specified on Microsoft TechNet.

There may be multiple instances of a single counter. These instances may be named, e.g. a logical disk volume, or numbered, e.g. a processor number.

The Sysmon syntax for identifying a counter is as follows:

object\counter(instance)

By way of an example, the capacity utilisation of a processor is defined as the percentage of time that the processor is busy (% Processor Time). In Sysmon the counter that describes the % Processor Time for the second processor in a quad-processor server would be:

Processor\% Processor Time(1)

Note that Sysmon numbers processors 0, 1, 2, etc.

Sysmon also provides built-in instances that summarise multiple instances, e.g.:

Processor\% Processor Time(_Total)

Here the _Total instance describes the average of all processor instances. Note that for other counters the _Total instance can represent a sum rather than an average.
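
The notation above is this article's own shorthand. Command-line tools such as typeperf expect the instance to follow the object name instead, as in \Processor(_Total)\% Processor Time. As a minimal sketch in Python, the helpers below split the shorthand into its parts and rebuild it in that path form. The names parse_counter and to_pdh_path are illustrative and are not part of any Windows API.

```python
import re

# Matches this article's shorthand: object\counter(instance).
# The "(instance)" suffix is optional, e.g. System\Processor Queue Length.
_SHORTHAND = re.compile(
    r"^(?P<object>[^\\]+)\\(?P<counter>.+?)(?:\((?P<instance>[^()]*)\))?$"
)


def parse_counter(text):
    """Split 'object\\counter(instance)' into (object, counter, instance).

    The instance is None when the shorthand has no '(instance)' suffix.
    """
    match = _SHORTHAND.match(text.strip())
    if match is None:
        raise ValueError(f"not in object\\counter(instance) form: {text!r}")
    return (
        match.group("object").strip(),
        match.group("counter").strip(),
        match.group("instance"),
    )


def to_pdh_path(text):
    """Rebuild the shorthand as a path of the form \\Object(instance)\\Counter."""
    obj, counter, instance = parse_counter(text)
    if instance is None:
        return f"\\{obj}\\{counter}"
    return f"\\{obj}({instance})\\{counter}"


if __name__ == "__main__":
    print(parse_counter(r"Processor\% Processor Time(1)"))
    print(to_pdh_path(r"Processor\% Processor Time(_Total)"))
```

Keeping the conversion in one place means the rest of a monitoring script can use whichever notation its data source expects.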

Processor Capacity and Performance Monitoring

The table below describes the performance counters specific to processor capacity that should be part of a base monitoring set:

Counter Comments
Processor\% Processor Time(instance)
  • This is the primary counter for measuring processor capacity
  • The _Total instance should be measured in order to view average processor utilisation across a multiprocessor server
  • Individual instances should be monitored in order to understand workloads where an application has affinity to a subset of processors
System\Processor Queue Length
  • Should be used alongside % Processor Time in order to judge whether or not there are capacity and performance issues
  • As a guide for OLTP systems, a capacity bottleneck is likely when % Processor Time is in excess of 70% and the Processor Queue Length exceeds twice the number of installed processors
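
The OLTP guide above can be written as a simple check. This is only a sketch of the rule of thumb as stated: the function name and parameters are hypothetical, and the inputs are assumed to be values already collected from the _Total processor instance and the System object.

```python
def processor_bottleneck_likely(processor_time_pct, processor_queue_length,
                                installed_processors):
    """Apply the article's OLTP guide for a processor capacity bottleneck.

    A bottleneck is flagged only when BOTH conditions hold:
      * % Processor Time (_Total) is above 70%
      * Processor Queue Length is above twice the number of processors
    """
    if installed_processors < 1:
        raise ValueError("installed_processors must be at least 1")
    busy = processor_time_pct > 70.0
    queued = processor_queue_length > 2 * installed_processors
    return busy and queued


if __name__ == "__main__":
    # Quad-processor server: the queue threshold is 2 * 4 = 8.
    print(processor_bottleneck_likely(85.0, 9, 4))   # busy and queued
    print(processor_bottleneck_likely(85.0, 8, 4))   # queue does not exceed 8
    print(processor_bottleneck_likely(60.0, 12, 4))  # utilisation not above 70%
```

Requiring both conditions reflects the point made in the table: neither counter is a reliable indicator on its own.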

Memory Capacity and Performance Monitoring

The table below describes the performance counters specific to memory capacity that should be part of a base monitoring set:

Counter Comments
Memory\Available Bytes
  • A broad indicator of available capacity
  • Cannot be used in isolation as memory usage is dynamic
  • As a guide this counter must always be greater than 4 MB
Memory\Page Reads/sec
  • The rate of hard page faults
  • A hard page fault results in one or more pages being read from the paging file into main memory
  • Cannot be used in isolation as memory usage is dynamic
  • As a guide this counter must account for less than 20% of the I/O throughput capacity of the volume where the paging file resides
Memory\% Committed Bytes In Use
  • The commit limit is based on the sum of main memory and the paging file size
  • This counter describes the percentage of the commit limit that is currently in use
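
The memory guides above can be sketched as follows. Several details are assumptions made for illustration: 4 MB is taken as 4 × 1024 × 1024 bytes, the I/O throughput capacity of the paging volume is taken to be a figure in operations per second that you supply yourself, and all function and parameter names are hypothetical.

```python
MIN_AVAILABLE_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" means 4 MiB
MAX_PAGING_IO_SHARE = 0.20             # page reads should stay below 20%


def committed_bytes_in_use_pct(committed_bytes, main_memory_bytes,
                               paging_file_bytes):
    """Percentage of the commit limit in use.

    Following the article, the commit limit is the sum of main memory
    and the paging file size.
    """
    commit_limit = main_memory_bytes + paging_file_bytes
    return 100.0 * committed_bytes / commit_limit


def memory_warnings(available_bytes, page_reads_per_sec,
                    paging_volume_io_capacity_per_sec):
    """Return a list of guide thresholds that the sample breaches."""
    warnings = []
    if available_bytes <= MIN_AVAILABLE_BYTES:
        warnings.append("Available Bytes is not above 4 MB")
    limit = MAX_PAGING_IO_SHARE * paging_volume_io_capacity_per_sec
    if page_reads_per_sec >= limit:
        warnings.append("Page Reads/sec is not below 20% of volume I/O capacity")
    return warnings


if __name__ == "__main__":
    gib = 1024 ** 3
    print(committed_bytes_in_use_pct(3 * gib, 4 * gib, 4 * gib))
    print(memory_warnings(2 * 1024 * 1024, 50, 200))
```

Because memory usage is dynamic, a single breached sample is a prompt to look at the trend rather than proof of a shortage.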

Disk Capacity and Performance Monitoring

The table below describes the performance counters specific to disk capacity that should be part of a base monitoring set. Apart from the two free space counters, which exist only on the Logical Disk object, these counters are applicable to both the Logical Disk and Physical Disk objects. Note that these counters describe disk capacity and performance from the operating system's perspective.

Counter Comments
\% Idle Time
  • Shows the percentage of elapsed time during the sample interval that the selected disk drive was idle
  • The recommended counter for measuring disk utilisation
\Avg. Disk Queue Length
  • Shows the average number of both read and write requests that were queued for the selected disk during the sample interval
  • As a guide, a disk bottleneck may be identified when the average disk queue length is consistently greater than twice the number of physical disks and % Idle Time is consistently less than 20%
\Avg. Disk sec/Transfer
  • Average response time across the disk subsystem in seconds
  • Includes all subsystem layers, e.g. device driver layer, I/O bus and I/O channel
  • Includes queuing time at these layers
  • Does not pinpoint where delays are occurring
\% Free Space
  • Shows the percentage of the total usable space on the selected disk that is free
  • As a guide for NTFS volumes, usable capacity is exhausted when this counter falls to 15%
\Free Megabytes
  • Shows the unallocated space, in megabytes, on the disk
  • Should be employed with the previous counter in order to assess disk space capacity
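
The disk guides above can be sketched in the same way. The article does not define "consistently", so this example assumes it means every sample in the window you pass in. The function names and the (queue length, idle percentage) sample layout are hypothetical.

```python
def disk_utilisation_pct(idle_time_pct):
    """Disk utilisation derived from % Idle Time."""
    return 100.0 - idle_time_pct


def disk_bottleneck_likely(samples, physical_disks):
    """Apply the article's disk bottleneck guide to a window of samples.

    samples is an iterable of (avg_disk_queue_length, idle_time_pct) pairs.
    Assumption: "consistently" means the condition holds for every sample.
    """
    samples = list(samples)
    if not samples:
        return False
    return all(
        queue > 2 * physical_disks and idle < 20.0
        for queue, idle in samples
    )


def ntfs_space_exhausted(free_space_pct):
    """NTFS guide: usable capacity is exhausted at 15% free space or less."""
    return free_space_pct <= 15.0


if __name__ == "__main__":
    window = [(5.0, 10.0), (6.5, 12.0), (4.8, 15.0)]
    print(disk_utilisation_pct(35.0))
    print(disk_bottleneck_likely(window, physical_disks=2))
    print(ntfs_space_exhausted(12.0))
```

A stricter or looser reading of "consistently", such as a percentage of samples, would only change the all(...) test.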

Network Interface Capacity and Performance Monitoring

The table below describes the performance counters specific to network interface capacity that should be part of a base monitoring set:

Counter Comments
Network Interface\Bytes Sent/sec
  • Report as a percentage utilisation of bandwidth in the send direction using the counter Network Interface\Current Bandwidth, which is expressed in bits per second, so bytes must first be converted to bits
Network Interface\Bytes Received/sec
  • Report as a percentage utilisation of bandwidth in the receive direction using the counter Network Interface\Current Bandwidth, which is expressed in bits per second, so bytes must first be converted to bits
Network Interface\Packets Sent/sec
  • Average rate at which packets are sent during the sample interval
Network Interface\Packets Received/sec
  • Average rate at which packets are received during the sample interval
Network Interface\Output Queue Length
  • Measure of the packet queue length in the send direction
  • As a guide, a capacity bottleneck in the send direction is indicated when bandwidth utilisation in the send direction is greater than 40% and Output Queue Length is consistently greater than 2
Network Interface\Packets Received Discarded
  • Measure of packets discarded
  • Likely to occur when inbound packet queues become full
  • As a guide, a capacity bottleneck in the receive direction is likely when bandwidth utilisation in the receive direction is greater than 40% and Packets Received Discarded represents more than 5% of packet throughput
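
The network guides above can be sketched as shown below. Current Bandwidth is reported in bits per second, so the byte rates are multiplied by 8 before dividing. Two points are assumptions made for illustration: the discard share is calculated as packets discarded divided by packets received over the same interval, and all function and parameter names are hypothetical.

```python
def bandwidth_utilisation_pct(bytes_per_sec, current_bandwidth_bps):
    """Percentage utilisation in one direction.

    bytes_per_sec is Bytes Sent/sec or Bytes Received/sec;
    current_bandwidth_bps is Current Bandwidth, in bits per second.
    """
    return 100.0 * bytes_per_sec * 8 / current_bandwidth_bps


def send_bottleneck_likely(bytes_sent_per_sec, current_bandwidth_bps,
                           send_queue_length):
    """Send guide: utilisation above 40% and a queue length above 2."""
    utilisation = bandwidth_utilisation_pct(bytes_sent_per_sec,
                                            current_bandwidth_bps)
    return utilisation > 40.0 and send_queue_length > 2


def receive_bottleneck_likely(bytes_received_per_sec, current_bandwidth_bps,
                              packets_discarded, packets_received):
    """Receive guide: utilisation above 40% and discards above 5%.

    Assumption: both packet counts cover the same sample interval.
    """
    if packets_received <= 0:
        return False
    utilisation = bandwidth_utilisation_pct(bytes_received_per_sec,
                                            current_bandwidth_bps)
    discard_pct = 100.0 * packets_discarded / packets_received
    return utilisation > 40.0 and discard_pct > 5.0


if __name__ == "__main__":
    link_bps = 100_000_000  # a 100 Mbit/s interface
    print(bandwidth_utilisation_pct(6_250_000, link_bps))
    print(send_bottleneck_likely(6_250_000, link_bps, 3))
    print(receive_bottleneck_likely(6_250_000, link_bps, 60, 1000))
```

Because the discard counter is cumulative, a monitoring script would normally take the difference between two readings to obtain the per-interval count used here.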

Summary

This article introduced a basic set of performance counters for monitoring the capacity and performance of key Windows components. A complete capacity and performance monitoring strategy would also address the following areas, which are not discussed here:

  • Monitoring other Windows components such as the file system cache
  • Performance monitoring to understand how the global workload is broken down by application
  • Monitoring counters required to diagnose performance problems
  • Monitoring at appropriate sample intervals
  • Monitoring counters to enable application response time to be estimated
  • Centrally storing capacity data in an ITIL-compliant capacity database (CDB)

The author is tutor for the public training course, Capacity Planning Windows 2000/2003. Further details are available here.

© Capacitas Ltd 2008