Google SRE工作手册

by Betsy Beyer, Niall Richard Murphy, David K. Rensin, Kent Kawahara, Stephen Thorne
September 2020
Intermediate to advanced
526 pages
8h 23m
Chinese
China Electric Power Press Ltd.
Content preview from Google SRE工作手册
Monitoring
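If you apply rate limits or quota limits to your users, monitor aggregate counts of the requests rejected because of insufficient quota.
Graphs of this data can help you spot a significant change in error volume during a production change.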
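As a rough illustration of counting quota denials, the sketch below wraps a toy per-user quota check and keeps a denial counter that could be exported as a metric. The class name `QuotaLimiter`, its methods, and the fixed per-user limit are all hypothetical and are not taken from the book; a real system would export the counter through its monitoring library.

```python
from collections import Counter


class QuotaLimiter:
    """Toy per-user quota (hypothetical) that records every denial."""

    def __init__(self, limit_per_user):
        self.limit_per_user = limit_per_user
        self.used = Counter()     # requests admitted per user
        self.denials = Counter()  # requests rejected per user

    def allow(self, user):
        """Admit the request if the user has quota left, else count a denial."""
        if self.used[user] >= self.limit_per_user:
            self.denials[user] += 1
            return False
        self.used[user] += 1
        return True

    def total_denials(self):
        """Aggregate denial count across all users, suitable for graphing."""
        return sum(self.denials.values())


limiter = QuotaLimiter(limit_per_user=2)
results = [limiter.allow("alice") for _ in range(5)]
```

Graphing `total_denials()` over time, next to the timestamps of production changes, is one way to see whether a rollout changed the denial volume.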
Implementing Purposeful Metrics
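Every metric you expose should serve a purpose. Do not export metrics simply because they are easy to obtain. Instead, think about how each metric will be used. How well or poorly a metric is designed has consequences.
Ideally, the value of a metric used for alerting changes noticeably only when the system enters a failure state, and does not change while the system is operating normally. Metrics used for debugging, on the other hand, do not have these requirements: they are meant to provide information about what is happening when an alert fires. Good debugging metrics point to the parts of the system that may be causing the problem. When you write a postmortem, consider which additional metrics would have let you diagnose the issue faster.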
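The distinction between alerting and debugging metrics can be sketched as follows. This is only an illustration under assumptions: the 5% threshold, the function names, and the `(backend, ok)` event shape are hypothetical. An error ratio is used as the alerting signal because it stays flat when traffic grows normally, while a per-backend error breakdown serves as the debugging metric.

```python
ALERT_THRESHOLD = 0.05  # hypothetical: alert when more than 5% of requests fail


def error_ratio(errors, requests):
    """Fraction of requests that failed; 0.0 when there is no traffic."""
    return errors / requests if requests else 0.0


def should_alert(errors, requests):
    """Alerting signal: changes only when the failure fraction rises."""
    return error_ratio(errors, requests) > ALERT_THRESHOLD


def errors_by_backend(events):
    """Debugging metric: error counts broken down by backend name.

    `events` is an iterable of (backend, ok) pairs.
    """
    counts = {}
    for backend, ok in events:
        if not ok:
            counts[backend] = counts.get(backend, 0) + 1
    return counts


breakdown = errors_by_backend([("db", False), ("cache", True), ("db", False)])
```

With ten times the traffic and ten times the errors, `should_alert` gives the same answer, whereas a raw error count would have moved during normal operation.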
Testing Alerting Logic
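Ideally, monitoring and alerting code should be held to the same testing standards as code development. Although the Prometheus developers are discussing unit tests for monitoring, no system has broadly adopted this practice yet.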
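At Google, we test monitoring and alerting with a domain-specific language that lets us create synthetic time series. We then write assertions based on the values in a derived time series, or on the firing status and label presence of specific alerts.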
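The idea of testing alerts against synthetic time series can be approximated in plain Python. The sketch below is not Google's domain-specific language, which the text does not show; the helper names, the per-interval rate derivation, and the "above threshold for N consecutive intervals" rule are all assumptions made for illustration.

```python
def synthetic_series(start, step, count):
    """Build a synthetic counter: `count` samples rising by `step` each interval."""
    return [start + step * i for i in range(count)]


def derive_rate(series):
    """Derived time series: the increase between consecutive samples."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def alert_firing(values, threshold, for_intervals):
    """Hypothetical alert rule: fire when the last `for_intervals` values
    all exceed `threshold`."""
    if len(values) < for_intervals:
        return False
    return all(v > threshold for v in values[-for_intervals:])


# Synthetic input: an error counter growing by 10 per interval.
noisy_rate = derive_rate(synthetic_series(start=0, step=10, count=6))
# Synthetic input: an error counter growing by 1 per interval.
quiet_rate = derive_rate(synthetic_series(start=0, step=1, count=6))

# Assertions on the derived series and on the alert's firing status.
assert noisy_rate == [10, 10, 10, 10, 10]
assert alert_firing(noisy_rate, threshold=5, for_intervals=3)
assert not alert_firing(quiet_rate, threshold=5, for_intervals=3)
```

Each assertion plays the role of one unit test: it feeds known input to the rule and checks either a derived value or whether the alert fires.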
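Monitoring and alerting is usually a multistage process, so it calls for several families of unit tests. Although this area is still largely underdeveloped, if you decide to implement monitoring tests at some point, we recommend a three-tiered approach, as shown in Figure 4-1.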

Publisher Resources

ISBN: 9787519845858