The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2 by Christina J. Hogan, Strata R. Chalup, Thomas A. Limoncelli

Chapter 14. Oncall

Be alert... the world needs more lerts.

—Woody Allen

Oncall is the way we handle exceptional situations. Even though we try to automate all operational tasks, there will always be responsibilities and edge cases that cannot be automated away. These exceptional situations can happen at any time of the day; they do not schedule themselves nicely between the hours of 9 AM and 5 PM.

Exceptional situations are, in brief, outages and anything that, if left unattended, would lead to an outage. More specifically, they are situations where the service is, or will become, in violation of the SLA.
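As a rough illustration of the "is, or will become, in violation of the SLA" test, the sketch below flags a situation as exceptional when the failures seen so far, or the failures projected for the rest of the period, exceed what the SLA allows. The class, field names, and the constant-failure-rate projection are all assumptions made for this example; they are not taken from the book.

```python
from dataclasses import dataclass


@dataclass
class SlaStatus:
    """Hypothetical snapshot of a service measured against its SLA."""
    sla_target: float         # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int       # requests expected over the whole SLA period
    failed_so_far: int        # failures observed so far in the period
    failures_per_hour: float  # current failure rate (assumed to stay constant)
    hours_remaining: float    # time left in the SLA period


def allowed_failures(status: SlaStatus) -> float:
    """Number of failed requests the SLA tolerates over the whole period."""
    return (1.0 - status.sla_target) * status.total_requests


def is_exceptional(status: SlaStatus) -> bool:
    """True if the SLA is already violated, or will be at the current rate."""
    budget = allowed_failures(status)
    in_violation = status.failed_so_far > budget
    # Naive projection: assume the current failure rate continues unchanged.
    projected = status.failed_so_far + status.failures_per_hour * status.hours_remaining
    will_violate = projected > budget
    return in_violation or will_violate
```

A real service would likely use a more careful forecast than a straight-line projection, but the two-part condition mirrors the definition: an outage now, or something that becomes one if left unattended.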

An operations team needs a strategy to ensure that exceptional situations are attended to promptly and receive appropriate action. The strategy ...
