Google SRE工作手册

by Betsy Beyer, Niall Richard Murphy, David K. Rensin, Kent Kawahara, Stephen Thorne
September 2020
Intermediate to advanced
526 pages
8h 23m
Chinese
China Electric Power Press Ltd.
Content preview from Google SRE工作手册
Identifying and Recovering from Operational Overload
try to do work that resolves the common issue behind a series of tickets, or operational work that reduces the number of likely future tickets.
Case Study 2: Perceived Overload After Organizational and Workload Changes
Background
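In this case study, the Google SRE team in question was split between offices in two different locations, with 6 to 7 on-call engineers at each office (for more discussion of team size, see Chapter 11 of the first SRE book). The Sydney team was healthy, while the Zurich team was overloaded.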
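Before the Zurich team became overloaded, the overall situation was stable and everyone was satisfied with their work. The number of services we managed was relatively stable, and each service was diverse and operationally heavy. Although the SLOs of the services we supported did not match the SLOs of their external dependencies, the mismatch had not caused any problems. We were working on a number of projects to improve the services we managed (for example, improving load balancing).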
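Several triggers fired at the same time and pushed the Zurich team into overload: we began onboarding new services that were noisier and less well integrated with Google's general infrastructure, and the technical lead manager and another team member left, leaving the team two people short. The combination of additional workload and lost knowledge led to further problems: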
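Monitoring for the new services and the related migrations had not been tuned, so every on-call shift produced more alerts. The buildup was gradual, so we did not notice the increase.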
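SREs felt relatively helpless with the new services. We did not know enough about them to respond appropriately, so we often had to ask the development team questions. Although an overloaded team can hand services back to the developers, our team had never done so, which left us with the false impression that we could not.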
The smaller 5 on-call ...

Publisher Resources

ISBN: 9787519845858