Google SRE工作手册

by Betsy Beyer, Niall Richard Murphy, David K. Rensin, Kent Kawahara, Stephen Thorne
September 2020
Intermediate to advanced
526 pages
8h 23m
Chinese
China Electric Power Press Ltd.
Content preview from Google SRE工作手册
Alerting
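Being able to classify alerts is useful: you can handle each type of alert in a way that matches the situation. The ability to set different severity levels for different alerts is also useful: you might file a ticket to investigate a low error rate that has lasted more than an hour, whereas a 100% error rate is an emergency that requires an immediate response.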
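The severity distinction above can be sketched as a small classification function. This is a minimal illustration, not a recommended policy: the `Severity` enum, the `ErrorRateEvent` type, and both thresholds are hypothetical names and values. Only the two cases (a low error rate lasting more than an hour, and a 100% error rate) come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    NONE = "none"      # no action needed yet
    TICKET = "ticket"  # investigate later; not urgent
    PAGE = "page"      # emergency; notify on-call immediately


@dataclass
class ErrorRateEvent:
    error_rate: float        # fraction of failed requests, 0.0 to 1.0
    duration_minutes: float  # how long this rate has been observed


# Hypothetical thresholds, chosen only for illustration.
LOW_ERROR_RATE = 0.001
TICKET_AFTER_MINUTES = 60


def classify(event: ErrorRateEvent) -> Severity:
    """Map an observed error rate to a response proportional to the situation."""
    if event.error_rate >= 1.0:
        # Everything is failing: an emergency that needs an immediate response.
        return Severity.PAGE
    if (event.error_rate >= LOW_ERROR_RATE
            and event.duration_minutes > TICKET_AFTER_MINUTES):
        # A low error rate that has persisted for over an hour: file a ticket.
        return Severity.TICKET
    # A real policy would likely define intermediate tiers; this sketch
    # covers only the two cases described in the text.
    return Severity.NONE
```

For example, `classify(ErrorRateEvent(error_rate=0.01, duration_minutes=90))` yields `Severity.TICKET`, while any event with an error rate of 1.0 yields `Severity.PAGE` regardless of duration.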
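Alert suppression removes unnecessary noise that would otherwise distract on-call engineers. For example:
When all nodes are experiencing the same high error rate, you can alert just once on the global error rate instead of sending a separate alert for every individual node.
When a service you depend on has a critical alert firing (for example, a slow backend), you do not need to alert on the error rate of your own service.
You also need to make sure that alerts are no longer suppressed once the event is over.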
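The two suppression examples above can be sketched as a filter over the set of currently firing alerts. This is a minimal sketch under stated assumptions: the `Alert` type, its fields, and the alert name `HighErrorRate` are hypothetical and do not reflect any particular monitoring system. The function is stateless and recomputes suppression from the current firing set each time, which is one simple way to ensure that nothing stays suppressed after the triggering event has ended.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Alert:
    name: str                   # hypothetical alert name, e.g. "HighErrorRate"
    source: str                 # "service" (ours) or "dependency"
    node: Optional[str] = None  # None means a global (aggregate) alert


def alerts_to_notify(firing: List[Alert]) -> List[Alert]:
    """Return the subset of firing alerts that should reach on-call."""
    dependency_firing = any(a.source == "dependency" for a in firing)
    global_error_firing = any(
        a.source == "service" and a.name == "HighErrorRate" and a.node is None
        for a in firing
    )

    notify = []
    for alert in firing:
        is_service_error = (
            alert.source == "service" and alert.name == "HighErrorRate"
        )
        # Rule 1: a dependency is already alerting, so our own error-rate
        # alerts add nothing new.
        if is_service_error and dependency_firing:
            continue
        # Rule 2: the global error-rate alert already covers per-node alerts.
        if is_service_error and alert.node is not None and global_error_firing:
            continue
        notify.append(alert)
    return notify
```

Because suppression depends only on what is firing right now, a per-node alert that was hidden behind a global alert is delivered again on the next evaluation once the global alert clears.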
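The level of control you need over your system determines whether you use a third-party monitoring service or deploy and run your own monitoring system. Google developed its own monitoring system in-house, but many open source and commercial monitoring systems are also available.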
Sources of Monitoring Data
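The monitoring system you choose must work with the monitoring data sources you intend to use. This section discusses two common sources of monitoring data: logs and metrics. Other valuable sources of monitoring data exist, such as distributed tracing and runtime introspection, but we do not cover them here.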
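Metrics are numerical measurements that describe attributes and events, typically collected as many data points at regular time intervals. Logs are append-only records of events. This chapter focuses on structured logs, which support rich query and aggregation tools, rather than plain-text logs.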
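To make the distinction concrete, the following sketch contrasts a metric (numeric samples taken at a fixed interval) with a structured log (event records with named fields that can be queried and aggregated). All field names, paths, and values are invented for illustration, and JSON lines are only one possible encoding of structured logs.

```python
import json
from collections import Counter

# A metric: (timestamp_seconds, value) samples of one attribute,
# collected at a regular 60-second interval. Values are made up.
requests_per_minute = [(0, 120), (60, 130), (120, 140)]

# A structured log: an append-only sequence of event records.
# Each line is a JSON object with named fields (hypothetical schema).
log_lines = [
    '{"ts": 3, "path": "/search", "status": 200, "latency_ms": 41}',
    '{"ts": 17, "path": "/search", "status": 200, "latency_ms": 38}',
    '{"ts": 64, "path": "/checkout", "status": 500, "latency_ms": 903}',
    '{"ts": 95, "path": "/home", "status": 200, "latency_ms": 12}',
]


def mean_value(samples):
    """Average a metric over its samples."""
    return sum(value for _, value in samples) / len(samples)


def count_by_status(lines):
    """Aggregate structured log events by their 'status' field."""
    return Counter(json.loads(line)["status"] for line in lines)


def failing_paths(lines):
    """Query structured logs for the paths of events with a 5xx status."""
    events = (json.loads(line) for line in lines)
    return [e["path"] for e in events if e["status"] >= 500]
```

The metric answers aggregate questions cheaply (the mean request rate), while the structured log keeps per-event detail, so a query such as `failing_paths(log_lines)` can identify which requests failed. Doing the same with plain-text logs would require ad hoc parsing of each line.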
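Google's logs-based systems process large volumes of highly granular data. There is some inherent delay between the moment an event occurs and the moment it becomes visible in the logs. For analysis that is not time-sensitive, you can use a batch processing ...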
Publisher Resources

ISBN: 9787519845858