Chapter 7. Conclusion
SRE is an emerging methodology that is integral to maintaining and defending the reliability of an organization’s service. The SLO Adoption and Usage Survey shows that many organizations have integrated basic principles of SRE into their operations (e.g., applying software engineering to ops, implementing capacity planning, and holding blameless postmortems) and are ready to advance their SRE practices by implementing an SLO and an error-budget approach to managing reliability.
SLOs and error budgets are powerful business tools that give executives and product owners, as well as development, operations, and SRE teams, a data-driven framework for measuring the quality of service and balancing two often competing demands: change/feature release velocity and service reliability. Realizing the full benefits of SLOs requires organizations to thoughtfully establish SLIs that inform SLOs and error budgets, and to continually evaluate and improve the quality of their SLOs.
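To make the relationship between an SLI, an SLO, and an error budget concrete, the following is a minimal sketch rather than a prescribed implementation. It assumes a simple request-based SLI (good events divided by total events) over a single measurement window. The function name `error_budget_report`, the field names, and the "release only while budget remains" policy are illustrative assumptions, not something taken from the survey.

```python
def error_budget_report(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Summarize error-budget consumption for one measurement window.

    Assumptions (hypothetical, for illustration only):
    - the SLI is request-based: good events / total events
    - slo_target is a fraction strictly between 0 and 1, e.g. 0.999
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0 or not 0 <= bad_events <= total_events:
        raise ValueError("need total_events > 0 and 0 <= bad_events <= total_events")

    # SLI: the measured fraction of events that were good.
    sli = (total_events - bad_events) / total_events

    # Error budget: the number of bad events the SLO tolerates in this window.
    allowed_bad_events = (1.0 - slo_target) * total_events

    # Fraction of the budget still unspent; negative means the SLO was missed.
    budget_remaining = 1.0 - bad_events / allowed_bad_events

    return {
        "sli": sli,
        "allowed_bad_events": allowed_bad_events,
        "budget_remaining": budget_remaining,
        # Assumed policy: keep shipping changes only while budget remains.
        "can_release": budget_remaining > 0.0,
    }


if __name__ == "__main__":
    # A 99.9% SLO over one million requests tolerates about 1,000 failures.
    report = error_budget_report(slo_target=0.999, total_events=1_000_000, bad_events=250)
    for name, value in report.items():
        print(f"{name}: {value}")
```

With a 99.9% target and 250 failed requests out of one million, roughly three quarters of the budget is still available, so under the assumed policy the team can keep releasing. Once failures exceed the allowed count, the remaining budget goes negative and the same policy would shift attention toward reliability work.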
The real magic of SLOs, SLIs, and error budgets is that they align traditionally divided teams (development and operations) on a common goal. SLOs set precise numerical targets for the service’s performance, which allows SRE and product teams to come together and manage innovation and risk more effectively.
Our survey shows that large organizations are more likely than smaller ones to have a long history of using SLOs. While it makes sense that larger organizations ...