Distributed Systems Observability by Cindy Sridharan

Chapter 2. Monitoring and Observability

No discussion of observability is complete without contrasting it with monitoring. Observability isn’t a substitute for monitoring, nor does it obviate the need for monitoring; the two are complementary. The goals of monitoring and observability, as shown in Figure 2-1, are different.

Figure 2-1. Observability is a superset of both monitoring and testing; it provides information about unpredictable failure modes that couldn’t be monitored for or tested

Observability is a superset of monitoring. It provides not only high-level overviews of the system’s health but also highly granular insights into the implicit failure modes of the system. In addition, an observable system furnishes ample context about its inner workings, unlocking the ability to uncover deeper, systemic issues.

Monitoring, on the other hand, is best suited to reporting the overall health of systems and to deriving alerts.

Alerting Based on Monitoring Data

Alerting is inherently both failure- and human-centric. In the past, it made sense to “monitor” for and alert on symptoms of system failure that:

  • Were predictable in nature

  • Would seriously affect users

  • Required human intervention to be remedied as soon as possible
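
The three criteria above can be read as a simple conjunction: a symptom warrants an alert only when all of them hold. The following is a minimal sketch of that idea, not a description of any particular alerting tool. The `Symptom` type, its field names, the `should_page` function, and the sample symptoms are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symptom:
    """Hypothetical record describing an observed symptom of failure."""
    name: str
    predictable: bool     # a known failure mode that can be monitored for
    affects_users: bool   # would seriously affect users
    needs_human: bool     # requires prompt human intervention to remedy


def should_page(symptom: Symptom) -> bool:
    """Return True only when all three alerting criteria hold."""
    return (
        symptom.predictable
        and symptom.affects_users
        and symptom.needs_human
    )


# Illustrative inputs (assumed, not taken from the source text).
symptoms = [
    Symptom("checkout error rate above threshold", True, True, True),
    Symptom("single replica restarted and recovered", True, False, False),
]

# Names of the symptoms that would page a human under this rule.
pages = [s.name for s in symptoms if should_page(s)]
```

Under these assumptions, only the first symptom would page a human; the second is predictable but neither affects users seriously nor needs intervention, so it would at most be recorded.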

As systems have become more distributed, sophisticated tooling and platforms have emerged that abstract away several of the problems that human- and failure-centric ...
