Modern System Administration

by Jennifer Davis
November 2022
Content level: Intermediate to advanced
325 pages
8h 13m
English
O'Reilly Media, Inc.

Chapter 20. Managing Incidents

As we explored in Chapter 19, the purpose of on-call is to be aware of your systems so you can keep them healthy. But as much as you strive to reduce risk, failure will happen—there will be incidents. Incident management begins when you detect a problem during an on-call rotation, but management often extends beyond on-call when other subject matter experts and teams are required for issue resolution. The aim of incident management is to minimize the impact of an incident.

You, as an individual, need the kinds of tools, techniques, and practices that will not only get you through an incident with minimal suffering but will also help you feel prepared ahead of time and able to react effectively when an incident occurs. You need good, clear communication across teams so that the appropriate subject matter experts can share their knowledge and minimize time to resolution. And you need a way to capture and apply what you learned from the incident to improve overall production, reduce future impacts to customers, and reduce the team’s toil.

In this chapter, I share the framework for collaborative and sustainable incident management, from identifying incidents to conducting post-incident reviews and determining the actions required to improve the live environment.

Note

I am assuming your team has incident management and that you’ll have some framework to which you can apply what I’m sharing to improve your experience. If your team doesn’t currently do incident ...

ISBN: 9781492055204