6 Alert fatigue

This chapter covers

  • Using on-call best practices
  • Staffing for on-call rotations
  • Tracking on-call happiness
  • Providing ways to improve the on-call experience

When you launch a system into production, you’re often paranoid and completely ill-equipped to understand all of the ways your system might break. You spend a lot of time creating alarms for all the nightmare scenarios you can think of. The problem is that this generates a lot of noise in your alerting system, and that noise is quickly ignored and treated as part of the normal rhythm of the business. This pattern, called alert fatigue, can lead your team to serious burnout.

This chapter focuses on on-call life for teams and how best to set them up for success. ...

Get Operations Anti-Patterns, DevOps Solutions now with the O’Reilly learning platform.
