Chapter 20. SRE Team Lifecycles

The Preface to this book set a goal to “dispel the idea that SRE is implementable only at ‘Google scale’ or in ‘Google culture.’” This chapter lays out a roadmap for maturing an SRE organization from unstaffed but aspirational, through various stages of maturity, to a robust and (potentially) globally distributed set of SRE teams. Wherever you are in that journey, this chapter will help you identify strategies for evolving your SRE organization.

We discuss the SRE principles that need to be in place at each stage of this journey. While your own journey will vary depending on the size, nature, and geographic distribution of your organization, the path we describe to successfully apply SRE principles and implement SRE practices should be generalizable to many different types of organizations.

SRE Practices Without SREs

Even if you don’t have SREs, you can adopt SRE practices by using SLOs. As discussed in Chapter 2, SLOs are the foundation for SRE practices. As such, they inform our first principle of SRE:

Principle #1

SRE needs SLOs with consequences.

The performance of your service relative to SLOs should guide your business decisions.
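As a rough illustration of what "SLOs with consequences" can look like in practice, the sketch below computes how much of a request-based error budget remains and maps the result to a decision. The function names, the example numbers, and the two-outcome policy are all hypothetical, chosen only for illustration; an actual policy would be agreed on by the people responsible for the service.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent.

    A negative result means the budget is overspent.
    slo_target is a success ratio such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        # A 100% target leaves no budget at all, so it is rejected here.
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


def release_decision(budget_remaining: float) -> str:
    """Map the remaining budget to an illustrative (assumed) policy outcome."""
    if budget_remaining > 0:
        return "continue shipping features"
    return "pause feature releases and prioritize reliability work"


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"{remaining:.0%} of the error budget remains")
    print(release_decision(remaining))
```

The point of the sketch is only that the SLO feeds a decision: when the budget is spent, something changes.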

We believe that the following practices—which you can achieve without even having a single SRE—are the crucial steps toward implementing SRE practices:

  • Acknowledge that you don’t want 100% reliability.

  • Set a reasonable ...
