Chapter 38. Service Monitoring

This chapter is about monitoring the services and networks in your environment. Monitoring is the primary way we gain visibility into the systems we run. It is the process of observing information about the state of things for use in decision making. The operational goals of monitoring are to detect the precursors of outages so that they can be fixed before they become actual outages, to collect information that aids future decision making, and, of course, to detect actual outages. The ideal monitoring system makes the operations team omniscient and omnipresent.

There is an axiom in the business world that if you can’t measure it, you can’t manage it. This holds true in system administration, too, and we use monitoring ...

