Chapter 4. Planning and monitoring tools 137
IBM TotalStorage Productivity Center for Disk report and zSeries
z/OS systems have proven performance monitoring and management tools available for
performance analysis. RMF, a z/OS performance tool, collects performance data and
reports it for the desired interval. It also provides cache reports. The cache reports are similar
to the Disk-to-Cache and Cache-to-Disk reports available in the TotalStorage Productivity
Center for Disk, except that RMF’s cache reports are provided in text format. RMF
provides the rank-level statistics as SMF records. These SMF records are raw data that you
can run your own post processor against to generate reports. RMF is discussed in detail in
Chapter 10, “zSeries servers” on page 357.
4.3.9 IBM TotalStorage Productivity Center for Disk in mixed environment
A benefit of the IBM TotalStorage Productivity Center for Disk is that you can analyze both
open systems fixed block (FB) and S/390 CKD workloads. When the DS6000 subsystems are
attached to multiple hosts running on different platforms, open systems hosts may affect your
S/390 workload and vice versa. If this is the case, looking at RMF reports alone will not be
sufficient. You also need information about the open systems. The IBM TotalStorage
Productivity Center for Disk informs you about the cache and I/O activity.
Before beginning the diagnostic process, you must understand your workload and your
physical configuration. You need to know how your system resources are allocated, as well as
understand your path and channel configuration for all attached servers.
Let us assume that you have an environment with a DS6000 attached to a z/OS host, an AIX
pSeries host, and several Windows hosts. You have noticed that your z/OS online users
experience a performance degradation between 7:30 a.m. and 8:00 a.m. each morning.
In the RMF device activity reports, you may notice 3390 volumes showing high disconnect
times, or high device busy delay times for several volumes. Unlike on UNIX or Windows, on
z/OS you can see the response time and its breakdown into connect, disconnect, pending, and
IOS queuing time.
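The breakdown described above can be sketched in a few lines of Python. This is only an illustration, assuming you have already extracted average per-I/O times for a volume from your own RMF post processing. The `DeviceTimes` class, its field names, and the sample values are hypothetical and are not part of RMF.

```python
from dataclasses import dataclass


@dataclass
class DeviceTimes:
    """Average per-I/O times for one volume, in milliseconds (hypothetical layout)."""
    volser: str
    iosq_ms: float  # IOS queuing time
    pend_ms: float  # pending time
    disc_ms: float  # disconnect time
    conn_ms: float  # connect time

    def response_ms(self) -> float:
        # Response time is the sum of its four components.
        return self.iosq_ms + self.pend_ms + self.disc_ms + self.conn_ms

    def dominant_component(self) -> str:
        # Name of the component that contributes most to response time.
        parts = {
            "IOS queuing": self.iosq_ms,
            "pending": self.pend_ms,
            "disconnect": self.disc_ms,
            "connect": self.conn_ms,
        }
        return max(parts, key=parts.get)


# Made-up sample values, for illustration only.
sample = DeviceTimes("VOL001", iosq_ms=0.5, pend_ms=0.25, disc_ms=4.0, conn_ms=1.25)
print(f"{sample.volser}: {sample.response_ms():.2f} ms, "
      f"dominated by {sample.dominant_component()}")
```

Ranking volumes by their dominant component is one simple way to decide whether to look at cache behavior (disconnect) or at contention (pending and IOS queuing) first.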
Disconnect time is an indication of cache miss activity or destage wait (due to high
persistent memory utilization) for logical disks behind the DS6000s.
Device busy delay is an indication that another system locks up a volume, and an extent
conflict occurs among S/390 hosts or applications in the same host when using Parallel
Access Volumes. The DS6000’s multiple allegiance or Parallel Access Volume capability
allows it to process multiple I/Os against the same volume at the same time. However, if a
read or write request against an extent is pending while another I/O is writing to the extent, or
if a write request against an extent is pending while another I/O is reading or writing data from
the extent, the DS6000 will delay the I/O by queuing. This condition is referred to as extent
conflict.
Queuing time due to extent conflict is accumulated in device busy (DB) delay time.
An extent is a sphere of access whose unit of increment is a track. Usually, I/O drivers or
system routines decide and declare the sphere.
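The extent conflict rule above reduces to a simple predicate: two I/Os conflict when their extents overlap and at least one of them is a write. The following Python sketch only restates that rule. It is not the DS6000's internal logic, and the `ExtentIO` type with its inclusive track range is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtentIO:
    """An I/O request with its declared extent (hypothetical representation)."""
    first_track: int  # first track of the extent
    last_track: int   # last track of the extent, inclusive
    is_write: bool


def extents_overlap(a: ExtentIO, b: ExtentIO) -> bool:
    # Two inclusive track ranges overlap if each starts before the other ends.
    return a.first_track <= b.last_track and b.first_track <= a.last_track


def extent_conflict(pending: ExtentIO, active: ExtentIO) -> bool:
    # Concurrent reads are allowed; any overlap that involves a write is not.
    return extents_overlap(pending, active) and (pending.is_write or active.is_write)


active_write = ExtentIO(first_track=100, last_track=199, is_write=True)
pending_read = ExtentIO(first_track=150, last_track=160, is_write=False)
print(extent_conflict(pending_read, active_write))  # the read must be queued
```

Under this rule, two reads against the same extent never conflict, and neither do two writes whose extents do not share a track.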
To determine the possible cause of high disconnect times, you should check the read cache
hit ratios, read-to-write ratios, and bypass I/Os for those volumes. If you see that the cache hit
ratio is lower than usual even though you have not added workload to your S/390 environment,
I/Os against open systems fixed block volumes might be the cause of the problem. Possibly,
fixed block (FB) volumes defined on the same server had a cache-unfriendly workload, thus
impacting the hit ratio of your S/390 volumes.
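As a rough illustration of the check described above, the following Python sketch computes the read hit ratio and read-to-write ratio from per-volume counters and flags volumes whose hit ratio has fallen noticeably below their usual value. The function names, the input layout, and the 0.10 drop threshold are all assumptions. Substitute the counters and baselines from your own RMF or TotalStorage Productivity Center for Disk data.

```python
def read_hit_ratio(read_hits, reads):
    """Fraction of reads satisfied from cache, or None if there were no reads."""
    return read_hits / reads if reads else None


def read_to_write_ratio(reads, writes):
    """Reads per write, or None if there were no writes."""
    return reads / writes if writes else None


def degraded_volumes(current, baseline, drop=0.10):
    """Return volumes whose read hit ratio fell more than `drop` below baseline.

    current:  {volume: (read_hits, reads)} for the interval under study
    baseline: {volume: usual read hit ratio} from a normal interval
    """
    flagged = []
    for volume, (hits, reads) in current.items():
        now = read_hit_ratio(hits, reads)
        usual = baseline.get(volume)
        if now is not None and usual is not None and usual - now > drop:
            flagged.append(volume)
    return sorted(flagged)


# Made-up counters, for illustration only.
current = {"VOL001": (600, 1000), "VOL002": (940, 1000)}
baseline = {"VOL001": 0.90, "VOL002": 0.95}
print(degraded_volumes(current, baseline))
```

Comparing against each volume's own baseline, rather than a single fixed threshold, reflects the guidance above to look for a hit ratio that is lower than usual.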
To get more information about cache usage, you can check the cache statistics of the
Fixed Block volumes that belong to the same server. You may be able to point out the Fixed
Block volumes that have a low read hit ratio and short cache holding time. If you can move the