Disaster Recovery
Disaster recovery is the art of being able to resume normal systems operations when faced with a disaster scenario. What constitutes a disaster depends on your context. In general, I consider a disaster to be an anomalous event that causes the interruption of normal operations. In a traditional data center, for example, the loss of a hard drive is not a disaster scenario, because it is more or less an expected event. A fire in the data center, on the other hand, is an abnormal event likely to cause an interruption of normal operations.
The total and sudden loss of a complete server, which you might consider a disaster in a physical data center, happens, relatively speaking, all of the time in the cloud. Although that frequency makes such events too routine to count as disasters, you still need solid disaster recovery processes to deal with them. As a result, disaster recovery is not simply a good idea that you can keep putting off in favor of other priorities; it is a requirement.
What makes disaster recovery so problematic in a physical environment is the amount of manual labor required to prepare for and execute a disaster recovery plan. Furthermore, fully testing your processes and procedures is often very difficult. Too many organizations have a disaster recovery plan that has never been tested in an environment that replicates real-world conditions closely enough to give them confidence that the plan will work.
Disaster recovery in the cloud can ...