Chapter 3. Selecting SLOs

SLOs are well-defined, concrete targets for system availability. They represent a dividing line between user happiness and unhappiness, and they frame all discussions about whether the system is running reliably as perceived by users. The SLO Adoption and Usage Survey reveals that, despite the importance of SLOs in ensuring the reliability of services, many organizations have not implemented SLOs. The survey also shows that organizations using SLOs fail to regularly update them as their businesses evolve. These are missed opportunities that can keep organizations from gaining all of the benefits SRE offers.

Although the survey did not explore why respondents do not use SLOs and SLIs, we speculate that defining them is difficult and that many teams do not know where to start. In addition, our experience with customers suggests that teams may lack executive support, a critical component in SLO definition, alignment, and success. The remainder of this report provides a step-by-step guide for building SLOs and SLIs and describes how to apply them to error budgets. Your organization can then use this data to make business decisions that balance feature release velocity with the availability appropriate for your business and customer needs.
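To make the link between an SLO and an error budget concrete, here is a minimal sketch of the arithmetic. It assumes an availability SLO expressed as a fraction (for example, 0.999) measured over a rolling 30-day window. The function names, the window length, and the sample figures are illustrative assumptions, not values taken from the survey or from this report.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of unavailability an availability SLO allows over the window.

    slo_target is a fraction such as 0.999; window_days is an assumed
    rolling window, not a value prescribed by this report.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, bad_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - bad_minutes) / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows about 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 10.8 bad minutes, about three quarters of the budget remains.
    print(round(budget_remaining(0.999, bad_minutes=10.8), 2))
```

A team might read the remaining fraction as a rough signal: while budget is left, there is room to ship features; once it approaches zero, reliability work may deserve priority. Where to draw that line is a policy choice for each organization.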

This chapter and Chapter 4 describe how to set SLOs using the following three steps:

  1. Define desired objectives or the services you want to cover with SLOs
