Chapter 16. Database Reliability Engineering
This is an edited excerpt from Database Reliability Engineering by Laine Campbell and Charity Majors (O’Reilly, 2017).
In this chapter, I talk about the craft of database reliability engineering as a subset of SRE. The database tier is the tier with the least tolerance for risk and is thus one of the greatest opportunities for growth through a culture of reliability engineering. Traditionally, DBAs were in the business of crafting silos and snowflakes. Their tools were different, their hardware was different, their languages were different. DBAs were writing SQL, systems engineers were writing Perl, software engineers were writing C++, web developers were writing PHP, and network engineers were crafting their own perfect appliances. Only half of the teams were using version control in any form, and they certainly didn’t talk to one another or step on one another’s turf. How could they? It was like entering a foreign land.
The days in which this model can prove effective and sustainable are numbered. This chapter is a view of reliability engineering as seen through a pair of database engineering glasses. I do not plan to cover everything possible here. Instead, I describe what I see as important, through the lens of the SRE experience. You can then apply this framework to multiple datastores, architectures, and organizations.
Guiding Principles of the Database Reliability Engineer
I have ...