IBM Mainframe Capacity and Performance Management

It is often said that the discipline of Capacity and Performance Management (CPM), along with other traditional service management processes, grew out of the management of the IBM mainframe. Whether or not that is true, this article looks at two things that set the platform apart from others from a CPM viewpoint: instrumentation and workload management.

Instrumentation

“If you can’t measure it, you can’t manage it” is a well-used phrase in the CPM world, and it was clearly taken to heart in the design of the IBM mainframe operating systems (MVS and OS/390 through to z/OS). System Management Facilities, better known as SMF, gives the CPM specialist a very good starting point for understanding what is happening on a system.

SMF collects data about the resource utilisation and performance of a system and writes it to a repository in which each record is categorised by a “type”. RMF (Resource Measurement Facility) shares this repository and writes its own range of record types.

In general, SMF focuses on gathering data about address spaces (the equivalent of processes on other platforms), whilst RMF looks at the system from the hardware or service class point of view (service classes are groups of address spaces that are related in some way). SMF records tend to be event based (but not exclusively), while RMF records tend to be interval based.

The most useful SMF type in CPM is the type 30 record. The type 30 record has six subtypes; the most important being subtypes 2, 4 and 5:

  • Subtype 2 – Interval Termination
  • Subtype 4 – Step Termination
  • Subtype 5 – Job or Task Termination

Batch address spaces are split into one or more sequential job steps, each step potentially running a different program, which makes the subtype 4 an essential record. Subtypes 4 and 5 are event records: they are written when a step or job ends, with the subtype 5 aggregating all the subtype 4 records for the same job. The subtype 2 record, as its description suggests, is an interval record. The interval is user configurable and is typically set at 15 minutes.

All three subtypes carry similar information, for example the amount of CPU used, I/O connect time to different devices, numbers of I/Os (EXCPs), elapsed time, average working set size and various other metrics. The subtype 2 is particularly useful for long-running address spaces, whose accumulated resource use would otherwise not be reported until the job or step finishes.
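The relationship between the three subtypes can be illustrated with a minimal sketch. The record layout is hypothetical: plain dictionaries with made-up field names such as `cpu_seconds` and `excp_count`, whereas real type 30 records are binary and need a proper parser or reporting tool. The sketch also assumes that each interval record carries the usage for that interval only. It sums step-end (subtype 4) records into a per-job total, which is the view a subtype 5 gives, and sums interval (subtype 2) records to show what a still-running address space has consumed so far.

```python
from collections import defaultdict

# Hypothetical, already-decoded type 30 records; field names are illustrative only.
records = [
    {"subtype": 4, "job": "PAYROLL", "step": "EXTRACT", "cpu_seconds": 12.5, "excp_count": 4000},
    {"subtype": 4, "job": "PAYROLL", "step": "SORT", "cpu_seconds": 3.0, "excp_count": 9000},
    {"subtype": 2, "job": "CICSPROD", "step": "CICS", "cpu_seconds": 40.0, "excp_count": 1500},
    {"subtype": 2, "job": "CICSPROD", "step": "CICS", "cpu_seconds": 55.0, "excp_count": 2500},
]


def totals_by_job(records, subtype):
    """Sum CPU seconds and EXCP counts per job for one record subtype."""
    totals = defaultdict(lambda: {"cpu_seconds": 0.0, "excp_count": 0})
    for rec in records:
        if rec["subtype"] != subtype:
            continue
        totals[rec["job"]]["cpu_seconds"] += rec["cpu_seconds"]
        totals[rec["job"]]["excp_count"] += rec["excp_count"]
    return dict(totals)


# Step-end records rolled up to job level (the batch job has finished).
step_totals = totals_by_job(records, subtype=4)
# Interval records rolled up for an address space that has not ended yet.
interval_totals = totals_by_job(records, subtype=2)

print(step_totals)
print(interval_totals)
```

The long-running CICSPROD address space appears only in the interval totals, because no step or job termination record has been written for it yet.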

The most used RMF types in CPM are:

  • Type 70 – Processor Activity
  • Type 71 – Paging Activity
  • Type 72 – Workload Activity and Storage Data
  • Type 73 – Channel Path Activity
  • Type 74 – Resource Activity (I/O devices, Coupling Facility etc)
  • Type 78 – Virtual Storage and I/O activity

Most of these types are hardware or device related; however, type 72 subtype 3 (workload activity) bridges the gap between SMF and RMF. To understand its unique value we first need to understand the second of the IBM mainframe’s advantages in CPM, the Workload Manager (WLM).

Workload Management

Most operating systems have a way of prioritising workloads. Microsoft Windows uses a very simple method, assigning each process one of six priority classes. Some Unix-based operating systems go a bit further by taking into account the resources a process has accumulated over a period of time. However, none of these operating systems has a specific service target that can be explicitly set. Until WLM was introduced in the mid-1990s, around the time MVS/ESA gave way to OS/390, this could also have described the IBM mainframe.

In WLM there is the concept of a service class; each service class can contain one or more address spaces. Address spaces are typically grouped into service classes according to their service requirements or predicted rates of resource utilisation. The purpose of defining service classes is to let WLM dynamically adjust priorities for access to system resources so that all service requirements can be met (assuming there are sufficient resources).

Service requirements are defined to WLM by means of “goals”. These goals can be either response time goals or velocity goals. Response time goals are typically defined as a percentage of transactions that must respond within a set time, e.g. “95% of transactions to respond within 1s”. Velocity goals are used when there are no known service requirements and are defined in terms of useful work, which is time spent using resources (CPU and I/O). Time spent queuing for those resources counts against the velocity, while idle time is left out of the calculation. A velocity of 30% therefore indicates that, of the time a service class was either using or waiting for CPU and I/O resources, 30% was spent using them and the remaining 70% was spent queuing.
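The two goal types can be made concrete with a short sketch. It assumes the usual description of execution velocity as using samples divided by the sum of using and delay samples, with idle time excluded. The function names, the sample counts and the list of response times are all invented for illustration and are not taken from any WLM interface.

```python
def execution_velocity(using_samples, delay_samples):
    """Velocity as a percentage: using / (using + delay) * 100."""
    total = using_samples + delay_samples
    if total == 0:
        # Nothing was sampled as using or delayed; returning 0.0 is a
        # convention chosen for this sketch only.
        return 0.0
    return 100.0 * using_samples / total


def meets_percentile_goal(response_times, goal_seconds, percentile):
    """True if at least `percentile` percent of transactions met the goal."""
    if not response_times:
        # Assumption for this sketch: no transactions means nothing missed.
        return True
    within = sum(1 for t in response_times if t <= goal_seconds)
    return 100.0 * within / len(response_times) >= percentile


# 30 samples using CPU or I/O, 70 samples waiting for them.
velocity = execution_velocity(using_samples=30, delay_samples=70)

# Ten hypothetical transaction response times in seconds.
times = [0.2, 0.4, 0.5, 0.7, 0.8, 0.9, 0.9, 1.0, 1.0, 2.5]
goal_met = meets_percentile_goal(times, goal_seconds=1.0, percentile=95)

print(velocity, goal_met)
```

Here nine of the ten transactions finish within one second, so a “95% within 1s” goal is missed even though the average response time is under a second. This is one reason percentile goals are often preferred to averages.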

WLM adjusts priorities automatically “under the covers”, but we can still see the results by querying the RMF type 72 subtype 3 records. These show resource usage and response time information for each interval. The response time information is shown as a distribution, e.g. the number of transactions responding within 5%, 10%, 50%, 100% etc. of the response time goal.
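A response time distribution of this kind can be sketched as follows. The bucket boundaries (50%, 100%, 150% and 200% of the goal) and the response times are chosen purely for illustration and are not the boundaries RMF itself uses. The function is a hypothetical helper, not part of any RMF tooling.

```python
from bisect import bisect_left


def response_time_distribution(response_times, goal_seconds, bucket_percents):
    """Count transactions per bucket.

    Each bucket is an upper bound expressed as a percentage of the goal;
    the final count holds transactions slower than every bound.
    """
    bounds = [goal_seconds * p / 100.0 for p in bucket_percents]
    counts = [0] * (len(bounds) + 1)
    for t in response_times:
        # Index of the first bound that is >= t, or len(bounds) if none is.
        counts[bisect_left(bounds, t)] += 1
    return counts


times = [0.2, 0.4, 0.5, 0.7, 0.8, 0.9, 0.9, 1.0, 1.0, 2.5]
distribution = response_time_distribution(
    times, goal_seconds=1.0, bucket_percents=[50, 100, 150, 200]
)
print(distribution)
```

Reading the counts from left to right shows how many transactions comfortably beat the goal, how many only just met it, and how many missed it badly. That is more informative than a single average.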

A reader who is new to WLM may think, “Well, that’s all very good, but I’d prefer that information per address space rather than lumped together into a service class”. There are two ways to deal with this: either define each address space as a separate service class (although there are obvious overheads with this) or give the address space its own report class. A report class has its own RMF type 72 records but has no influence on WLM’s priority adjustment. An address space can exist in one service class but any number of report classes.

It is WLM’s success at managing workloads that has earned the IBM mainframe its reputation for being able to run at or near 100% utilisation and still meet service demands. This, combined with the vast array of “out of the box” instrumentation, makes it an ideal place for CPM specialists to learn their trade.

© Capacitas Ltd 2008