Chapter 42. Why I Hate Our Playbooks
Frances Rees
Playbooks (also known as runbooks) in SRE are collections of documentation intended to help an on-caller resolve issues. There are many styles of playbooks, but most that I’ve encountered suffer from the same anti-patterns.
First, they can contain too much detail, making them difficult to maintain and creating large documents, which complicates finding specific information. This is often caused by a fear of missing anything. Examples of such playbooks are those written during a transfer of on-call ownership or as a continuous log by on-callers.
It’s unrealistic to assume that any playbook is absolutely complete, so it’s important to treat it as a tool that cannot fill the entire role of an SRE. The content covered in team onboarding is a useful baseline for the level of detail that can be elided as assumed knowledge. It’s also beneficial to replace details, such as how to locate a job, with links, which speeds up finding data without requiring recall of myriad small facts. A special case to consider is playbooks written for users and customers of an infrastructure service, as opposed to those written for the owners of the service, because the two audiences need a very different level of background knowledge. Caveats and unintuitive implications can easily cause accidental harm to users.
The opposite can happen with too little detail. Well-meaning ...