Chapter 18. Alerting

Back in “What Is Monitoring?” I stated that alerting was one of the components of monitoring, allowing you to notify a human when there is a problem. Prometheus allows you to define conditions in the form of PromQL expressions that are continuously evaluated, and any resulting time series become alerts. This chapter will show you how to configure alerts in Prometheus.
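As a rough illustration of a condition expressed in PromQL, here is a minimal sketch of an alerting rule file. The group name, alert name, job label, and severity value are assumptions chosen for the example rather than anything prescribed by this chapter.

```yaml
# Hypothetical rule file; all names and label values here are illustrative.
groups:
  - name: example
    rules:
      - alert: InstanceDown
        # Each time series this expression returns becomes an alert.
        expr: up{job="node"} == 0
        # The condition must hold for this long before the alert fires.
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```

If three instances matched the expression, Prometheus would produce three alerts, one per returned time series.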

As you saw from the example in “Alerting”, Prometheus is not responsible for sending out notifications such as emails, chat messages, or pages. That role is handled by the Alertmanager.

Prometheus is where you define the logic that determines what is or isn’t alerting. Once an alert is firing in Prometheus, it is sent to an Alertmanager, which can take in alerts from many Prometheus servers. The Alertmanager then groups alerts together and sends you throttled notifications (Figure 18-1).
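The division of labor described here might look something like the following fragment of a Prometheus configuration file. This is only a sketch: the rule file name is hypothetical, and the target assumes an Alertmanager running locally on its default port, 9093.

```yaml
# Fragment of prometheus.yml; file name and target address are assumptions.
rule_files:
  - alert_rules.yml        # where the alerting logic lives (hypothetical name)

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093   # Alertmanager that receives firing alerts
```

Several Prometheus servers could each carry an `alerting` section pointing at the same Alertmanager, which is what lets it combine alerts from all of them.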

Figure 18-1. Prometheus and Alertmanager architecture

The architecture shown in Figure 18-1 gives you not only flexibility, but also the ability to receive a single notification based on alerts from multiple Prometheus servers. For example, if you had an issue propagating serving data to all of your datacenters, you could configure your alert grouping so that you got a single notification rather than being spammed with one notification per datacenter.
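One way to get that behavior is through the grouping settings in the Alertmanager configuration. The sketch below assumes each Prometheus server attaches a datacenter label to its alerts. The receiver name and webhook URL are hypothetical, and the timing values are illustrative rather than recommendations.

```yaml
# Fragment of alertmanager.yml; receiver name, URL, and timings are assumptions.
route:
  receiver: ops-team
  # Group only by alert name. Because the datacenter label is not listed,
  # the same alert firing in every datacenter lands in one group and
  # produces one notification.
  group_by: [alertname]
  group_wait: 30s        # wait briefly so related alerts can arrive together
  group_interval: 5m     # minimum gap before notifying about new alerts in a group
  repeat_interval: 4h    # how often to resend while the group is still firing

receivers:
  - name: ops-team
    webhook_configs:
      - url: http://localhost:5001/
```

Adding the datacenter label to `group_by` would have the opposite effect, splitting the alerts into one notification per datacenter.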

Alerting Rules

Alerting ...
