Appendix A. Example SLO Document
This document describes the SLOs for the Example Game Service.
Status | Published |
---|---|
Author | Steven Thurgood |
Date | 2018-02-19 |
Reviewers | David Ferguson |
Approvers | Betsy Beyer |
Approval Date | 2018-02-20 |
Revisit Date | 2019-02-01 |
Service Overview
The Example Game Service allows Android and iPhone users to play a game with each other. The app runs on users’ phones, and moves are sent back to the service via a REST API. The data store contains the states of all current and previous games. A score pipeline reads this data and generates up-to-date league tables for today, this week, and all time. League table results are available in the app, via the API, and also on a public HTTP server.
The SLO uses a four-week rolling window.
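As a minimal sketch of what a four-week rolling window means in practice, the following Python snippet decides whether an event falls inside the window ending at a given moment. The function names and the half-open boundary (start excluded, end included) are assumptions made for illustration; the document itself does not specify how the boundaries are treated.

```python
from datetime import datetime, timedelta, timezone

# Four-week rolling window, as stated in the SLO document.
WINDOW = timedelta(weeks=4)


def window_start(now: datetime) -> datetime:
    """Return the earliest instant covered by the window ending at `now`."""
    return now - WINDOW


def in_window(event_time: datetime, now: datetime) -> bool:
    """Return True if the event counts toward the SLO evaluated at `now`.

    Assumption: the window is the half-open interval (now - 4 weeks, now].
    """
    return window_start(now) < event_time <= now


if __name__ == "__main__":
    now = datetime(2018, 2, 19, tzinfo=timezone.utc)
    print(window_start(now).date())                                   # 2018-01-22
    print(in_window(datetime(2018, 2, 1, tzinfo=timezone.utc), now))  # True
```

Because the window rolls, an event drops out of the calculation exactly four weeks after it occurred, rather than at a calendar boundary.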
SLIs and SLOs
Category | SLI | SLO |
---|---|---|
API | | |
Availability | The proportion of successful requests, as measured from the load balancer metrics. Any HTTP status other than 500–599 is considered successful. Calculated as: count of "api" http_requests which do not have a 5XX status code, divided by count of all "api" http_requests. | 97% success |
Latency | The proportion of sufficiently fast requests, as measured from the load balancer metrics. “Sufficiently fast” is defined as < 400 ms, or < 850 ms. Calculated as: count of "api" http_requests with a duration less than or equal to "0.4" seconds, divided by count of all "api" http_requests; and count of "api" http_requests with a duration less than or equal to "0.85" seconds, divided by count of all "api" http_requests. | ... |
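The two API SLIs above are both ratios of "good" requests to all requests, and can be sketched in a few lines of Python. This is an illustrative sketch only: the `Request` record, the function names, and the sample data are hypothetical and do not come from the load balancer's actual metrics format. The latency check uses "less than or equal to", following the calculation text in the table.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one load balancer request."""
    service: str       # e.g. "api"
    status: int        # HTTP status code
    duration_s: float  # request duration in seconds


def availability_sli(requests: Iterable[Request], service: str = "api") -> Optional[float]:
    """Proportion of requests for `service` that did not return a 5XX status."""
    matching = [r for r in requests if r.service == service]
    if not matching:
        return None  # no traffic, so the ratio is undefined
    good = sum(1 for r in matching if not 500 <= r.status <= 599)
    return good / len(matching)


def latency_sli(requests: Iterable[Request], threshold_s: float,
                service: str = "api") -> Optional[float]:
    """Proportion of requests for `service` served within `threshold_s` seconds."""
    matching = [r for r in requests if r.service == service]
    if not matching:
        return None
    fast = sum(1 for r in matching if r.duration_s <= threshold_s)
    return fast / len(matching)


if __name__ == "__main__":
    sample = [
        Request("api", 200, 0.12),
        Request("api", 404, 0.30),  # 4XX still counts as successful
        Request("api", 503, 0.90),
        Request("api", 200, 0.50),
        Request("web", 500, 0.10),  # not an "api" request, so ignored
    ]
    print(availability_sli(sample))   # 0.75
    print(latency_sli(sample, 0.4))   # 0.5
    print(latency_sli(sample, 0.85))  # 0.75
    # Compare against the 97% availability SLO from the table.
    print(availability_sli(sample) >= 0.97)  # False
```

Expressing every SLI as good events divided by total events keeps the values on a common 0 to 1 scale, which makes them directly comparable to a percentage target such as 97%.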