Chapter 7 Policy-Based Fault Management

This chapter describes the use of policies for managing faults in IT systems. A fault or an error in an IT system disrupts the continued and efficient operation of the system. As the components of an IT system come to depend more heavily on one another, a single fault in one component may cause many of the components connected to it to experience problems of their own. Consequently, even a single fault may generate a large number of alarms, which can easily overwhelm the operator. The task of a fault management system is to take the input alarms raised by various system components, diagnose the underlying root-cause fault for the alarms, and then take corrective ...
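To make the alarm-cascade idea concrete, the following is a minimal sketch of one simple way to narrow many alarms down to a likely root cause. It is not the chapter's method. It assumes a known dependency map between components and treats an alarming component as a root-cause candidate only if none of the components it depends on is also alarming. The component names, the `depends_on` map, and the `root_cause_candidates` function are all hypothetical.

```python
from typing import Dict, Set


def root_cause_candidates(
    depends_on: Dict[str, Set[str]], alarming: Set[str]
) -> Set[str]:
    """Return alarming components that have no alarming dependency.

    Assumption: an alarm on a component whose dependency is also
    alarming is a symptom of that dependency's fault, not a new fault.
    """
    return {
        component
        for component in alarming
        if not (depends_on.get(component, set()) & alarming)
    }


# Hypothetical topology: the web tier depends on the database and the
# cache; the database depends on a disk array.
depends_on = {
    "web": {"db", "cache"},
    "db": {"disk"},
    "cache": set(),
    "disk": set(),
}

# One disk fault has produced three alarms.
alarms = {"web", "db", "disk"}

print(sorted(root_cause_candidates(depends_on, alarms)))  # ['disk']
```

Three alarms collapse to a single candidate, which is the kind of reduction that keeps an operator from being overwhelmed. A real system would also have to handle lost alarms, timing, and simultaneous independent faults, none of which this sketch attempts.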
