How SRE relates to DevOps

by Niall Richard Murphy, Liz Fong-Jones, Betsy Beyer

March 2018

O'Reilly Media, Inc.

How SRE Relates to DevOps

class SRE implements interface DevOps

Written by Niall Richard Murphy, Liz Fong-Jones, and Betsy Beyer

Contributions from Todd Underwood, Laura Nolan, and David K. Rensin

Operations, as a discipline, is hard.¹ Not only is there the generally unsolved question of how to run systems well, but the best practices that have been found to work are highly context-dependent and far from widely adopted. There is also the largely unaddressed question of how to run operations teams well. Detailed analysis of these topics is generally thought to originate with Operational Research devoted to improving processes and output in the Allied military during World War II, but in reality, we have been thinking about how to operate things better for millennia.

Yet, despite all this effort and thought, reliable production operations remains elusive—particularly in the domains of information technology and software operability. The enterprise world, for example, often treats operations as a cost center,² which makes meaningful improvements in outcomes difficult if not impossible. The tremendous short-sightedness of this approach is not yet widely understood, but dissatisfaction with it has given rise to a revolution in how to organize what we do in IT.

That revolution stemmed from trying to solve a common set of problems. The newest solutions to these problems are called by two separate names—DevOps and Site Reliability Engineering (SRE). Although we talk about them individually ...