Proactive Performance Management in Production
Typical performance management problems in production include the following:
Users call with a response time problem.
The JVM reaches an out-of-memory state and crashes.
Logging space on the filesystem is exhausted.
A database table reaches its maximum number of extents.
The nightly batch runs too long or freezes.
Performance management in production tends to be reactive. A standard pattern of activity looks like this:
A problem occurs and is reported.
The system is analyzed to identify what caused the problem.
Some corrective action is taken to eliminate the problem.
Everything goes back to normal.
Reactive performance management will always be required because it is impossible to anticipate all conditions. However, proactive performance management can minimize situations in which reactive activity is required. Proactive performance management requires monitoring normal activity to identify unusual performance spikes, dips, and trends.
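As a rough illustration of what "identifying unusual spikes and dips" can mean in practice, the sketch below compares recent measurements against a baseline collected during normal activity. The function name, the sample values, and the threshold of three standard deviations are all assumptions made for this example; they are not taken from any particular monitoring product.

```python
from statistics import mean, stdev

def find_outliers(baseline, recent, threshold=3.0):
    """Return (index, value, kind) for recent samples far from the baseline.

    baseline:  measurements taken during normal activity (hypothetical data)
    recent:    new measurements to check
    threshold: how many baseline standard deviations count as "unusual"
               (an assumed default, tune it per metric)
    """
    center = mean(baseline)
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline gives no scale to judge deviations by.
        return []
    outliers = []
    for i, value in enumerate(recent):
        z = (value - center) / spread
        if z > threshold:
            outliers.append((i, value, "spike"))
        elif z < -threshold:
            outliers.append((i, value, "dip"))
    return outliers

# Illustrative page response times in milliseconds.
baseline_ms = [200, 210, 190, 205, 195, 200, 198, 202]
recent_ms = [201, 199, 450, 203, 90]
print(find_outliers(baseline_ms, recent_ms))
```

With this sample data, the 450 ms reading is flagged as a spike and the 90 ms reading as a dip, while the readings near 200 ms pass unnoticed. A dip is worth flagging too, since an unusually fast response can mean that requests are failing early.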
For example, one site with a performance monitoring policy identified a trend toward longer page response times. Analysis of the server indicated that a cache was configured incorrectly. The configuration was fixed and the problem was eliminated. Users noticed nothing. Without the site's proactive policy, the administrators would not have known about the problem until users complained about increasingly slow performance.
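A gradual trend of this kind can be hard to see in individual readings, because each measurement differs only slightly from the one before it. One simple way to surface it, sketched here under the assumption that a daily average response time is recorded, is to fit a least-squares line through the samples and look at its slope. The function name and the figures are hypothetical.

```python
def trend_slope(samples):
    """Least-squares slope of samples against their index.

    The result is in "units per interval", for example milliseconds per day
    when the samples are daily averages in milliseconds.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator

# Illustrative daily average page response times in milliseconds.
daily_avg_ms = [200, 204, 209, 212, 218, 221, 227]
slope = trend_slope(daily_avg_ms)
print(f"response time is changing by about {slope:.1f} ms per day")
```

No single day in this sample stands out, yet the positive slope shows response time creeping upward. A persistent nonzero slope is a reasonable cue to investigate before users notice.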
Similarly, another site with a proactive performance management policy monitored JVM memory size. After ...