Chapter 1. Context Versus Control in SRE
David: We’ve had the pleasure of talking about a lot of things in the time we’ve known each other. One of the most interesting things I’ve heard you speak about is a way of doing SRE that focuses on providing context instead of using processes that are centered around control (the more common way SRE is practiced). Can we dig some more into this? Can you explain what you mean by context versus control and what a good example of each would be?
Coburn: I think of context as providing additional, pertinent information, which allows someone to better understand the rationale behind a given request or statement. At the highest level, availability-related context as shared at Netflix with an engineering team would be the trended availability of their microservice[s] and how that relates to the desired goal, including availability of downstream dependencies. With this domain-specific context, an engineering team has the responsibility (and context) to take the necessary steps to improve their availability.
In a control-based model, a team will be aware of the availability goal for their microservice[s], but if they fail to achieve that goal, there might be a punitive action. This action might involve removing their ability to push code to production. At Netflix, we err toward the former model, sharing context on microservice-level availability, then working with teams ...