Setting Up Performance Error Budgets
You now have multiple ways to visualize how performant your system is. But how can you use those tools to keep that performance on track? The answer comes from reliability engineering: error budgets.
An error budget defines the acceptable level of system unreliability over a period of time. The premise is that 100 percent uptime is neither realistic nor cost-effective. Instead, organizations define SLOs (service level objectives), and the gap between an SLO and 100 percent is the budget. For example, Twilio maintains a staggering 99.999 percent availability (aka five nines).[110] This means that Twilio can be down for at most about five minutes and 16 seconds per year, which is less than half a minute per month. ...
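To make the arithmetic behind those numbers concrete, here is a minimal sketch that converts an availability SLO into the downtime it permits. The function name `allowed_downtime` and the choice of a 365-day year and a 30-day month are assumptions made for illustration; they are not taken from any particular tool or from Twilio's own accounting.

```python
from datetime import timedelta


def allowed_downtime(slo_percent: float, period: timedelta) -> timedelta:
    """Return the downtime an availability SLO permits over a period.

    Hypothetical helper: the error budget is the share of the period
    that the SLO does not cover, that is (100 - SLO) percent of it.
    """
    if not 0 <= slo_percent <= 100:
        raise ValueError("slo_percent must be between 0 and 100")
    error_budget_fraction = (100 - slo_percent) / 100
    return period * error_budget_fraction


if __name__ == "__main__":
    # Assumed period lengths: a 365-day year and a 30-day month.
    yearly = allowed_downtime(99.999, timedelta(days=365))
    monthly = allowed_downtime(99.999, timedelta(days=30))
    print(f"Five nines, per year:  {yearly.total_seconds():.2f} seconds")
    print(f"Five nines, per month: {monthly.total_seconds():.2f} seconds")
```

With a 365-day year, five nines works out to a little over five and a quarter minutes of downtime, in line with the figure quoted above. Calling the same function with a looser SLO, such as `allowed_downtime(99.9, timedelta(days=30))`, shows how quickly the budget grows as nines are removed.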