The Site Reliability Workbook
by Betsy Beyer, Niall Richard Murphy, David K. Rensin, Kent Kawahara, Stephen Thorne
Appendix A. Example SLO Document
This document describes the SLOs for the Example Game Service.
| Status | Published |
|---|---|
| Author | Steven Thurgood |
| Date | 2018-02-19 |
| Reviewers | David Ferguson |
| Approvers | Betsy Beyer |
| Approval Date | 2018-02-20 |
| Revisit Date | 2019-02-01 |
Service Overview
The Example Game Service allows Android and iPhone users to play a game with each other. The app runs on users’ phones, and moves are sent to the backend via a REST API. The data store contains the states of all current and previous games. A score pipeline reads this data store and generates up-to-date league tables for today, this week, and all time. League table results are available in the app, via the API, and also on a public HTTP server.
The SLO uses a four-week rolling window.
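
To make the rolling window concrete: at any moment, an SLI is computed over only the most recent 28 days of data, and older days drop out as new ones arrive. The following is a minimal sketch, not the service's actual tooling. The `RollingWindow` class, the daily granularity, and the choice to treat an empty window as fully compliant are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Tuple

WINDOW_DAYS = 28  # four weeks, per the SLO document


class RollingWindow:
    """Hypothetical helper: keeps the most recent daily (good, total) counts."""

    def __init__(self, days: int = WINDOW_DAYS) -> None:
        # A bounded deque discards the oldest day automatically once it is full.
        self._days: Deque[Tuple[int, int]] = deque(maxlen=days)

    def add_day(self, good: int, total: int) -> None:
        """Record one day's count of good requests and of all requests."""
        if good < 0 or total < good:
            raise ValueError("need 0 <= good <= total")
        self._days.append((good, total))

    def ratio(self) -> float:
        """Good requests divided by all requests across the whole window."""
        good = sum(g for g, _ in self._days)
        total = sum(t for _, t in self._days)
        # Assumption: a window with no traffic counts as meeting the objective.
        return good / total if total else 1.0
```

Summing the counts across the window before dividing weights each request equally, so a quiet day does not count as much as a busy one.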
SLIs and SLOs
| Category | SLI | SLO |
|---|---|---|
| API | | |
| Availability | The proportion of successful requests, as measured from the load balancer metrics. Any HTTP status other than 500–599 is considered successful. Computed as: count of "api" http_requests which do not have a 5XX status code divided by count of all "api" http_requests | 97% success |
| Latency | The proportion of sufficiently fast requests, as measured from the load balancer metrics. “Sufficiently fast” is defined as < 400 ms, or < 850 ms. Computed as: count of "api" http_requests with a duration less than or equal to "0.4" seconds divided by count of all "api" http_requests; count of "api" http_requests with a duration less than or equal to "0.85" seconds divided by count of all "api" http_requests ... | |
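
The two API SLIs above are both ratios of good requests to all requests. The sketch below shows one way they could be computed from individual request records. It is only an illustration: the `Request` record and the function names are hypothetical, and the load balancer's real metric format is not described here. The latency comparison uses "less than or equal to", following the query wording in the table.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one request seen by the load balancer."""
    service: str       # for example "api"
    status: int        # HTTP status code
    duration_s: float  # request duration in seconds


def _for_service(requests: Iterable[Request], service: str) -> List[Request]:
    return [r for r in requests if r.service == service]


def availability_sli(requests: Iterable[Request], service: str = "api") -> float:
    """Requests without a 5XX status divided by all requests for the service."""
    matching = _for_service(requests, service)
    if not matching:
        return 1.0  # assumption: no traffic counts as fully available
    good = sum(1 for r in matching if not 500 <= r.status <= 599)
    return good / len(matching)


def latency_sli(requests: Iterable[Request], threshold_s: float,
                service: str = "api") -> float:
    """Requests at or under threshold_s divided by all requests for the service."""
    matching = _for_service(requests, service)
    if not matching:
        return 1.0  # assumption: no traffic counts as fully fast
    fast = sum(1 for r in matching if r.duration_s <= threshold_s)
    return fast / len(matching)


def meets_objective(sli: float, target: float) -> bool:
    """True when the measured ratio reaches the target, e.g. 0.97 for 97%."""
    return sli >= target
```

For example, `meets_objective(availability_sli(records), 0.97)` checks a batch of records against the 97% availability objective, and `latency_sli(records, 0.4)` and `latency_sli(records, 0.85)` give the two latency ratios.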