Chapter 44. The Basics of Service-Level Objectives

Kit Merker, Brian Singer, and Alex Nauda

“I want it to work perfectly,” your boss or maybe a product manager tells you. But you, as a cloud engineer, know they aren’t really willing to pay for that level of service, even if it were possible, which it’s not.

How can you give management an easy way to instantly understand the trade-offs between reliability, speed of innovation, and cost? Service-level objectives (SLOs) are the answer. SLOs create clear reliability guidelines that balance the trade-offs between cloud costs, speed of change, and external risks.

What Are SLOs?

SLOs are key performance indicators for cloud services, based on customer happiness. An SLO defines the precise level of service that must be achieved to avoid an unacceptable risk of displeasing the customer.

Let’s use availability as an example. When we talk about how often our infrastructure is available (uptime), we typically speak in terms of nines. If your infrastructure is available at four nines, or 99.99% of the time, it will be unavailable for about 52.6 minutes a year. However, if your infrastructure achieves five nines, then your system is up and working 99.999% ...
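The arithmetic behind the nines can be sketched in a few lines of Python. This is an illustration only: the function name `allowed_downtime_minutes` and the constant `MINUTES_PER_YEAR` are hypothetical, and the sketch assumes an average year of 365.25 days, which is the assumption that reproduces the 52.6-minute figure for four nines.

```python
# Assumption: an average year of 365.25 days (525,960 minutes).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes per year a service may be down at a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The unavailable fraction of the year, converted to minutes.
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.1f} minutes/year")
```

Under that assumption, each additional nine cuts the allowed downtime by a factor of ten: roughly 526 minutes a year at three nines, 52.6 at four, and 5.3 at five.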

