8 Event Management and Best Practices
discarded. Other times, a problem does not need to be investigated until it occurs
several times. For example, a high CPU condition may not be a problem if a
single process, such as a backup, uses many cycles for a minute or two.
However, if the condition happens several times within a certain time interval,
there most likely is a problem. In this case, the problem should be addressed
after the necessary number of occurrences. Unless diagnostic data, such as the
raw CPU busy values, is required from subsequent events, they can be dropped.
The process of reporting events after a certain number of occurrences is known
as throttling.
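The count-within-an-interval rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the behavior of any particular event processor: the class name `OccurrenceThrottle`, its `threshold` and `window` parameters, and the choice to restart counting after a report are all hypothetical.

```python
from collections import defaultdict, deque


class OccurrenceThrottle:
    """Report a condition only after `threshold` occurrences within `window` seconds.

    Hypothetical helper for illustration; not part of any product API.
    """

    def __init__(self, threshold, window):
        self.threshold = threshold
        self.window = window
        # (resource, condition) -> timestamps of recent occurrences
        self._seen = defaultdict(deque)

    def observe(self, resource, condition, timestamp):
        """Return True if this occurrence should be reported, False if dropped."""
        times = self._seen[(resource, condition)]
        times.append(timestamp)
        # Forget occurrences that have fallen outside the time interval.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        if len(times) >= self.threshold:
            times.clear()  # assumption: start counting afresh after reporting
            return True
        return False


throttle = OccurrenceThrottle(threshold=3, window=300)
for t in (0, 60, 120):
    reported = throttle.observe("host1", "cpu_high", t)
# Only the third occurrence (t=120) is reported; the first two are dropped.
```

A sliding window of timestamps is used here so that isolated spikes, such as the backup example, age out without ever being reported. If diagnostic data from the dropped events is needed, the sketch would have to keep the events themselves rather than only their timestamps.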
When multiple events are generated as a result of the same initial problem or
provide information about the same system resource, there may be a relationship
between the events. The process of defining this relationship in an event
processor and implementing actions to deal with the related events is known as
event correlation.
Correlated events may reference the same affected resource or different
resources. They may be generated by the same event source or handled by the
same event processor.
Problem and clearing event correlation
This section presents an example of events that are
generated from the same event source and deal with the
same system resource. An agent monitoring a system
detects that a service has failed and sends an event to an
event processor. The event describes an error condition,
so it is called a problem event. When the service is later
restored, the agent sends another event to inform the
event processor that the service is running again and the
error condition has cleared. This event is known as a
clearing event. When an event processor receives a
clearing event, it normally closes the problem event to
show that it is no longer an issue.
The relationship between the problem and clearing event
can be depicted graphically as shown in Figure 1-1. The
correlation sequence is described as follows:
Problem is reported when received (Service Down).
Event is closed when a recovery event is received.
Figure 1-1 Problem
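The problem and clearing sequence above can be sketched as a small Python example. It is an illustration only, under several assumptions: the `Event` fields, the `EventProcessor` class, and matching on the pair (event source, resource) are hypothetical choices, not the data model of any specific product.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str           # agent that sent the event
    resource: str         # affected service
    kind: str             # "problem" or "clearing"
    status: str = "open"


class EventProcessor:
    """Hypothetical processor that closes a problem event when it is cleared."""

    def __init__(self):
        # (source, resource) -> problem event that is still open
        self.open_problems = {}

    def receive(self, event):
        """Handle one event; return the problem event that was closed, if any."""
        key = (event.source, event.resource)
        if event.kind == "problem":
            # Report the problem when received; keep the first one if repeated.
            self.open_problems.setdefault(key, event)
            return None
        if event.kind == "clearing":
            problem = self.open_problems.pop(key, None)
            if problem is not None:
                problem.status = "closed"  # no longer an issue
            return problem
        raise ValueError(f"unknown event kind: {event.kind!r}")


processor = EventProcessor()
down = Event("agent1", "httpd", "problem")    # Service Down
up = Event("agent1", "httpd", "clearing")     # service restored
processor.receive(down)
closed = processor.receive(up)
# `closed` is the original problem event, now with status "closed".
```

Keying open problems by source and resource reflects the case described in this section, where both events come from the same event source and concern the same system resource. A clearing event with no matching open problem is simply ignored in this sketch.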