Chapter 2. Application Metrics
The complexity of distributed systems composed of many communicating microservices makes it especially important to be able to observe the state of the system. The rate of change is high: new code releases, independent scaling events as load changes, changes to infrastructure (such as cloud provider changes), and dynamic configuration changes propagating through the system. In this chapter, we focus on how to measure and alert on the performance of a distributed system, along with some industry best practices to adopt.
An organization must commit to at least one monitoring solution. There is a wide range of choices, including open source, commercial on-premises, and SaaS offerings, with a broad spectrum of capabilities. The market is mature enough that an organization of any size and complexity can find a solution that fits its requirements.
The choice of monitoring system matters because it determines whether the fixed-cost characteristic of metrics data is preserved. The StatsD protocol, for example, requires the application to emit to a StatsD agent on a per-event basis. Even if this agent runs as a sidecar process on the same host, the application still pays the allocation cost of creating a payload for every event, so the protocol forfeits at least this one advantage of metrics telemetry. This isn’t always (or even commonly) catastrophic, but be aware of the cost.
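To make the per-event cost concrete, here is a minimal Python sketch contrasting the two emission styles. It is an illustration under stated assumptions rather than any particular client library's implementation: the class names `PerEventEmitter` and `AggregatingCounter` and the metric name `http.requests` are hypothetical, and no network traffic is sent (a list stands in for the sends to the agent). The only detail taken from StatsD itself is the counter line format, `<name>:<value>|c`.

```python
from collections import Counter


def statsd_counter_payload(name: str, value: int = 1) -> bytes:
    """Format a counter increment in the StatsD line format: <name>:<value>|c."""
    return f"{name}:{value}|c".encode("ascii")


class PerEventEmitter:
    """Hypothetical emitter: allocates one payload for every recorded event."""

    def __init__(self) -> None:
        # Stands in for datagrams sent to a StatsD agent; nothing is transmitted.
        self.payloads: list[bytes] = []

    def increment(self, name: str) -> None:
        # A new bytes object is created per event, so cost grows with throughput.
        self.payloads.append(statsd_counter_payload(name))


class AggregatingCounter:
    """Hypothetical in-process aggregation: one stored number per metric name."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def increment(self, name: str) -> None:
        # Only an integer is updated; no payload is built per event.
        self.counts[name] += 1

    def snapshot(self) -> dict:
        # What a periodic publish step would read; size depends on the number
        # of distinct metric names, not on the number of events.
        return dict(self.counts)


per_event = PerEventEmitter()
aggregated = AggregatingCounter()
for _ in range(1000):
    per_event.increment("http.requests")
    aggregated.increment("http.requests")

payload_count = len(per_event.payloads)
snapshot = aggregated.snapshot()
```

After 1,000 events, the per-event emitter has built 1,000 payloads, while the aggregating counter holds a single entry whose value is 1,000. The storage and publish cost of the second style stays the same whether throughput is ten events or ten million, which is the fixed-cost property the paragraph above describes.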
Black Box Versus White Box Monitoring
Approaches to metrics collection ...