44 End-to-End e-business Transaction Management Made Easy
deployed on an application server. The steps might include logging on and
obtaining the main page display.
Playing back the transaction. The Generic Windows component plays back
the recorded transaction and measures response times.
2.3 Reporting and troubleshooting with TMTP WTP
One of the strengths of this release of TMTP is its reporting capabilities. The
following subsections introduce you to the various visual components and
reports that can be gathered from TMTP and the way in which these could be
Troubleshooting transactions with the Topology view
Your organization has installed TMTP V5.2 and it has been configured to send
e-mail to the TMTP Administrator as well as sending an event to the Tivoli
Enterprise Console upon a transaction performance violation. Using the following
steps, the TMTP administrator identifies and analyzes the transaction
performance violation and ultimately identifies the root cause.
After receiving the notification from TMTP, the administrator logs on to
TMTP and accesses the “Big Board” view, shown in Figure 2-3.
Figure 2-3 Big Board View
From the Big Board View, the administrator can see that the J2EE policy called
“quick_listen” had a violation at 16:27. The administrator can also tell that the
policy had a threshold of “goes above 5 seconds”, which was violated, as the
value was 6.03 seconds.
Chapter 2. IBM Tivoli Monitoring for Transaction Performance in brief 45
The administrator can now click on the topology icon for that policy and load the
most recent topology that TMTP has data for (see Figure 2-4).
Figure 2-4 Topology view indicating problem
Since, by default, topologies are filtered to exclude any nodes that are faster
than one second (this is configurable), the default view is to show the latest
aggregated data for slow nodes. In Figure 2-4, you can see that there were only
two slow-performing nodes.
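
The effect of the default filter can be illustrated with a minimal sketch. This
is not TMTP code: the Node class, the slow_nodes function, and all node names
and timings are hypothetical, and the sketch assumes that a node at exactly the
cutoff stays visible.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    seconds: float  # aggregated response time shown on the node


def slow_nodes(nodes, cutoff_seconds=1.0):
    """Keep only nodes at or above the cutoff (assumed inclusive)."""
    return [n for n in nodes if n.seconds >= cutoff_seconds]


# Hypothetical timings, for illustration only.
nodes = [
    Node("FastServlet", 0.2),
    Node("Timer.goGet()", 6.0),
    Node("SlowQuery", 1.4),
]

print([n.name for n in slow_nodes(nodes)])
print([n.name for n in slow_nodes(nodes, cutoff_seconds=2.0)])
```

Raising the cutoff narrows the view further, which mirrors the configurable
filter described above.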
All nodes in the topology have a numeric value on them. If the node is a container
for other nodes (for example, a Servlet node may contain four different Servlets)
the time expressed on the node is the maximum time of what is contained within
the node. This makes it easy to track down where the slow node resides. Once
you have drilled down to the bottom level, the time on the base node indicates
the actual time for that node (average for aggregate data, and specific timings for
instance data). In Figure 2-4, the root node (J2EE/.*) has an icon that indicates
that it has had performance violations for that hour.
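
The roll-up rule for container nodes, and the drill-down it enables, could be
sketched as follows. The TopoNode class, the slowest_path helper, and every
name and timing except the J2EE/.* root label are assumptions made for this
example; they do not describe how TMTP stores its topology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TopoNode:
    name: str
    seconds: float = 0.0  # meaningful on base (leaf) nodes only
    children: List["TopoNode"] = field(default_factory=list)

    def displayed_time(self) -> float:
        """A container shows the maximum time of what it contains."""
        if not self.children:
            return self.seconds
        return max(child.displayed_time() for child in self.children)


def slowest_path(node: TopoNode) -> List[str]:
    """Follow the largest displayed time down to a base node."""
    path = [node.name]
    while node.children:
        node = max(node.children, key=lambda c: c.displayed_time())
        path.append(node.name)
    return path


# Hypothetical topology: a Servlet container holding two Servlets.
root = TopoNode("J2EE/.*", children=[
    TopoNode("Servlet", children=[
        TopoNode("LoginServlet", 0.4),
        TopoNode("ListenServlet", 6.0),
    ]),
    TopoNode("JDBC", 1.2),
])

print(root.displayed_time())
print(slowest_path(root))
```

Because every container carries the maximum of its contents, following the
largest number at each level always ends at the slowest base node.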
The administrator can now select the node that is in violation and click on the
Inspector icon. The Inspector view (Figure 2-5 on page 46) reveals that the
threshold setting of “goes above 5 seconds” was violated nine times out of 11 for
the hour and that the minimum time was 0.075 and the maximum time was 6.03.
The administrator can conclude from these numbers that this node's performance
was fairly erratic.
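
The kind of summary the Inspector presents can be reproduced with a short
sketch. Only the threshold, the minimum (0.075), the maximum (6.03), and the
counts (nine of 11) come from the text; the other sample values are invented
for illustration, and the summarize function is hypothetical.

```python
def summarize(durations, threshold_seconds=5.0):
    """Count samples that go above the threshold and report the range."""
    violations = [d for d in durations if d > threshold_seconds]
    return {
        "samples": len(durations),
        "violations": len(violations),
        "min": min(durations),
        "max": max(durations),
    }


# Eleven samples for the hour; the intermediate values are made up.
samples = [0.075, 0.9, 5.2, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6.0, 6.03]

summary = summarize(samples)
print(summary)
```

A wide gap between the minimum and the maximum, combined with a high violation
count, is what suggests erratic behavior rather than a uniformly slow node.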
Figure 2-6 on page 46 shows nine instances with asterisks indicating that they
violated thresholds and two others with low performance figures indicating they
did not violate. The administrator can now select the first instance that violated
(they are in order of occurrence) and click the Apply button to obtain an instance
topology (Figure 2-7).
Figure 2-7 Instance topology
Again, this topology has the one-second filtering turned on, so any extraneous
nodes are filtered out. Here the administrator can see that, as suspected, the
Timer.goGet() method is taking up the majority of the time, ruling out a problem
with the root transaction.
The Timer.goGet() method has an upside-down orange triangle indicating it has
been deemed the most violated instance. This is determined by comparing the
instance's duration (6.004 seconds in this case) to the average for
the hour (4.303 seconds, as we saw above) while taking into account the number
of times the method was called by that method. Doing this provides an estimate
of the amount of time spent in a node that was above its average. This
calculation provides an indication of abnormal behavior because it is slower than
normal. Other slow-performing nodes will be marked with a yellow upside-down
triangle, indicating a problem against the average for the hour (by default, 5% of
the methods will have a marking).
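
One way to read this calculation is sketched below. The exact formula TMTP uses
is not given here, so the sketch assumes that the excess is the time above the
hourly average multiplied by the call count, and that the top 5% of methods by
excess are marked. The MethodInstance class, the mark function, and all values
other than the Timer.goGet() figures are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class MethodInstance:
    name: str
    duration: float    # seconds in this instance
    hourly_avg: float  # average seconds for the hour
    calls: int = 1     # number of calls in this instance

    def excess(self) -> float:
        """Assumed estimate of time spent above the hourly average."""
        return max(0.0, self.duration - self.hourly_avg) * self.calls


def mark(instances, fraction=0.05):
    """Mark the worst offender orange and the next ones yellow."""
    ranked = sorted(
        (i for i in instances if i.excess() > 0),
        key=lambda i: i.excess(),
        reverse=True,
    )
    # Always allow at least one marking, even for small topologies.
    limit = max(1, math.ceil(len(instances) * fraction))
    return {
        inst.name: ("orange" if rank == 0 else "yellow")
        for rank, inst in enumerate(ranked[:limit])
    }


instances = [
    MethodInstance("Timer.goGet()", 6.004, 4.303),  # figures from the text
    MethodInstance("Servlet.doGet()", 1.2, 1.1),    # hypothetical
    MethodInstance("Dao.query()", 0.3, 0.4),        # hypothetical
]

print(mark(instances))
print(mark(instances, fraction=1.0))
```

With the default fraction only Timer.goGet() is marked; widening the fraction
also marks the hypothetical Servlet.doGet() in yellow, while a method running
faster than its average is never marked.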