8.1. What Is Monitoring?

Monitoring provides an end-to-end view of the organization's production environments. It serves two primary purposes. The first is to give the operations team a complete picture of the current state of the live systems, from both a hardware and a software perspective. The conditions reported can range from the amount of available disk space on a server or disk array to a full-scale shutdown of a system or component. The second purpose is to provide information and tools that the operations team can use to respond rapidly to incidents and alerts.
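As an illustration of the low end of that range, the following minimal Python sketch checks the free disk space on a path and classifies it. The function names, the `DiskStatus` type, and the 20% and 10% thresholds are assumptions made for this example; they are not taken from any particular monitoring product.

```python
import shutil
from dataclasses import dataclass


@dataclass
class DiskStatus:
    path: str
    free_pct: float
    severity: str  # "ok", "warning", or "critical"


def classify_free_space(free_pct, warn_below=20.0, crit_below=10.0):
    """Map a free-space percentage to a severity (thresholds are illustrative)."""
    if free_pct < crit_below:
        return "critical"
    if free_pct < warn_below:
        return "warning"
    return "ok"


def check_disk(path="/", warn_below=20.0, crit_below=10.0):
    """Read disk usage for `path` and report how much space is free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_pct = 100.0 * usage.free / usage.total
    severity = classify_free_space(free_pct, warn_below, crit_below)
    return DiskStatus(path, free_pct, severity)


if __name__ == "__main__":
    print(check_disk("/"))
```

A real monitoring tool would run a check like this on a schedule and raise an alert when the severity changes, rather than printing the result once.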

Monitoring software keeps track of what's happening on the numerous servers and devices within the organization, as well as the diverse range of applications and software that run on them. Many monitoring solutions are available, including BMC Patrol and Microsoft Operations Manager (MOM). The software is mainly used to monitor all the systems and applications in the production environment. Some organizations also monitor other key environments, such as the disaster recovery environment and the pre-production environment (if there is one), to ensure that these environments are ready to serve when necessary. However, the pre-production environment may be integrated with development or test versions of the monitoring tools.

Monitoring software is typically implemented using the master/agent paradigm, as discussed in Chapter 5, in which the agents capture and send events and performance data back ...
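The master/agent idea can be sketched in a few lines of Python. This is a simplified, in-process illustration only: the `Event`, `Master`, and `Agent` names are hypothetical, and a direct method call stands in for whatever network transport a real product would use.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Event:
    host: str
    metric: str
    value: float
    timestamp: float = field(default_factory=time.time)


class Master:
    """Collects the events that agents send back (hypothetical sketch)."""

    def __init__(self):
        self.events = []

    def receive(self, event):
        self.events.append(event)

    def latest(self, host, metric):
        # Walk backwards so the most recently received match wins.
        for event in reversed(self.events):
            if event.host == host and event.metric == metric:
                return event
        return None


class Agent:
    """Runs on a monitored host and reports readings to the master."""

    def __init__(self, host, master):
        self.host = host
        self.master = master

    def report(self, metric, value):
        # Assumption: a direct call replaces the real network hop.
        self.master.receive(Event(self.host, metric, value))


master = Master()
Agent("web01", master).report("cpu_pct", 42.0)
Agent("db01", master).report("disk_free_pct", 8.5)
```

After these calls, `master.latest("db01", "disk_free_pct")` returns the event reported by the `db01` agent, which is the kind of consolidated view the operations team would query.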
