Chapter 2. Monitoring and Observability

No discussion of observability is complete without contrasting it with monitoring. Observability isn’t a substitute for monitoring, nor does it obviate the need for monitoring; they are complementary. The goals of monitoring and observability, as shown in Figure 2-1, are different.

Figure 2-1. Observability is a superset of both monitoring and testing; it provides information about unpredictable failure modes that couldn’t be monitored for or tested

Observability is a superset of monitoring. It provides not only high-level overviews of the system’s health but also highly granular insights into the implicit failure modes of the system. In addition, an observable system furnishes ample context about its inner workings, unlocking the ability to uncover deeper, systemic issues.

Monitoring, on the other hand, is best suited to report the overall health of systems and to derive alerts.

Alerting Based on Monitoring Data

Alerting is inherently both failure- and human-centric. In the past, it made sense to “monitor” for and alert on symptoms of system failure that:

  • Were of a predictable nature

  • Would seriously affect users

  • Required human intervention to be remedied as soon as possible
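
The three criteria above can be read as a conjunction: a symptom is worth alerting on only when all of them hold. The following is a minimal sketch of that idea, not a prescribed implementation. The `Symptom` type, the `should_alert` and `high_error_rate` helpers, and the 5% threshold are all hypothetical names and values chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symptom:
    """Hypothetical record describing an observed symptom of failure."""
    name: str
    predictable: bool    # a known failure mode that can be checked for in advance
    affects_users: bool  # would seriously affect users
    needs_human: bool    # requires human intervention to be remedied


def should_alert(symptom: Symptom) -> bool:
    """Alert only when all three criteria from the list hold."""
    return symptom.predictable and symptom.affects_users and symptom.needs_human


def high_error_rate(errors: int, requests: int, threshold: float = 0.05) -> bool:
    """Example of a predictable symptom: error ratio strictly above a threshold.

    The 5% default is an assumed value, not a recommendation.
    """
    if requests == 0:
        return False  # no traffic, so there is nothing to judge
    return errors / requests > threshold


if __name__ == "__main__":
    symptom = Symptom(
        name="elevated error rate",
        predictable=True,
        affects_users=high_error_rate(errors=12, requests=100),
        needs_human=True,
    )
    print(should_alert(symptom))
```

A symptom that fails any one of the checks, for example one that an automated restart can remedy, would not trigger an alert under this sketch.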

Systems becoming more distributed has led to the advent of sophisticated tooling and platforms that abstract away several of the problems that human- and failure-centric ...
