Chapter 9. Operator Philosophy
We’ve noted the problems Operators aim to solve, and you’ve stepped through detailed examples of how to build Operators with the SDK. You’ve also seen how to distribute Operators in a coherent way with OLM. Let’s try to connect those tactics to the strategic ideas that underpin them to understand an existential question: what are Operators for?
The Operator concept descends from Site Reliability Engineering (SRE). Back in Chapter 1, we talked about Operators as software SREs. Let’s review some key SRE tenets to understand how Operators apply them.
SRE for Every Application
SRE began at Google in response to the challenges of running large systems with ever-increasing numbers of users and features. A key SRE objective is allowing services to grow without forcing the teams that run them to grow in direct proportion. To run systems at dramatic scale without a team of dramatic size, SREs write code to handle deployment, operations, and maintenance tasks. SREs create software that runs other software, keeps it running, and manages it over time. SRE as a whole is a broader set of management and engineering techniques, with automation as a central principle. You may have heard the goal of that automation referred to by different names, like “autonomous” or “self-driving” software. In the Operator Maturity Model we introduced in Figure 4-1, we call it “Auto Pilot.”
Operators and the Operator Framework make it easier to implement this kind of automation for applications that run on Kubernetes. ...