Chapter 14. Monitoring Theory
Monitoring is the practice of measuring, collecting, storing, exploring, and visualizing data from infrastructure (including hardware, software, and human processes). Monitoring helps you answer the “when” and “why” questions of your work, and it informs business decisions that support humans working sustainably (e.g., hiring so that your sysadmins are not constantly working at full capacity).
In this chapter, I will help you think about monitoring by providing a framework for identifying effective monitoring strategies. I will differentiate monitoring from observability and explain the elements and steps of the monitoring process and how they work together. Understanding these mechanics at a high level will help you prioritize the other desirable outcomes monitoring makes possible, decide how and what you monitor, and increase visibility into your workflow, systems, and teams, regardless of the tools you choose.
Why Monitor?
Monitoring increases visibility into your systems: it draws attention to weakness, fragility, or risk, and it helps you make better decisions. Some reasons to seek that visibility include the following:
- Problem discovery
-
You are identifying problems and understanding how they get resolved. For example, you could monitor the latency of web requests to identify when slow MySQL queries are affecting customers.
- Process improvement
-
You are continuously improving team processes to increase accuracy and speed of task ...