June 2003
Intermediate to advanced
464 pages
10h 33m
English
Availability management aims at responding to anomalous events fast enough that the damage is contained and repaired before the SLA criteria are missed. Typically, humans take at least a few seconds to observe, analyze, and react to a situation.[24] With the speed of modern processors, a lot of damage can occur in a few seconds.
[24] As today's systems continue to improve, these anomalous events happen less and less frequently—a good thing. As a result, we tend to forget what the proper procedure is to respond to the event—a bad thing. Typically, there is a written procedure on how to recover. But a site is almost guaranteed to have a lengthy outage if the operator first has to read up on what to do.
Automation refers to a program ...