Appendix A. Example SLO Document

This document describes the SLOs for the Example Game Service.

Status: Published
Author: Steven Thurgood
Date: 2018-02-19
Reviewers: David Ferguson
Approvers: Betsy Beyer
Approval Date: 2018-02-20
Revisit Date: 2019-02-01

Service Overview

The Example Game Service allows Android and iPhone users to play a game with each other. The app runs on users’ phones and sends moves to the service through a REST API. The data store contains the states of all current and previous games. A score pipeline reads this data and generates up-to-date league tables for today, this week, and all time. League table results are available in the app, via the API, and on a public HTTP server.

The SLO uses a four-week rolling window.

SLIs and SLOs

Category: API

Availability

SLI: The proportion of successful requests, as measured from the load balancer metrics. Any HTTP status other than 500–599 is considered successful.

    count of "api" http_requests which
    do not have a 5XX status code
    divided by
    count of all "api" http_requests

SLO: 97% success

Latency

SLI: The proportion of sufficiently fast requests, as measured from the load balancer metrics. “Sufficiently fast” is defined as < 400 ms, or < 850 ms.

    count of "api" http_requests with a duration less than or equal to "0.4" seconds
    divided by
    count of all "api" http_requests

    count of "api" http_requests with a duration less than or equal to "0.85" seconds
    divided by
    count of all "api" http_requests

...
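The availability and latency ratios above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the HttpRequest record, its field names, and the sample data are hypothetical and do not come from any real load balancer export. The latency check follows the query wording ("less than or equal to") rather than the strict "<" in the prose definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HttpRequest:
    """One load balancer record (hypothetical shape, for illustration only)."""
    service: str       # request category, e.g. "api"
    status: int        # HTTP status code
    duration_s: float  # request duration in seconds


def availability_sli(requests, service="api"):
    """Fraction of `service` requests whose status is not 5XX."""
    total = [r for r in requests if r.service == service]
    if not total:
        return None  # no traffic, so the ratio is undefined
    good = [r for r in total if not 500 <= r.status <= 599]
    return len(good) / len(total)


def latency_sli(requests, threshold_s, service="api"):
    """Fraction of `service` requests that took at most `threshold_s` seconds."""
    total = [r for r in requests if r.service == service]
    if not total:
        return None
    fast = [r for r in total if r.duration_s <= threshold_s]
    return len(fast) / len(total)


# Made-up sample data; a real calculation would cover the four-week window.
sample = [
    HttpRequest("api", 200, 0.12),
    HttpRequest("api", 404, 0.30),  # 4XX still counts as successful
    HttpRequest("api", 503, 0.90),  # 5XX counts as a failure
    HttpRequest("api", 200, 0.60),
    HttpRequest("web", 500, 0.05),  # not an "api" request, so it is ignored
]

availability = availability_sli(sample)
fast_400 = latency_sli(sample, 0.4)
fast_850 = latency_sli(sample, 0.85)
meets_availability_slo = availability >= 0.97  # 97% success target
```

With this sample, three of the four "api" requests are non-5XX, so the availability SLI is 0.75 and the 97% target is missed. In practice the input would be restricted to requests from the four-week rolling window before the ratios are computed.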
