As you manage your Microsoft-based infrastructure, your goal is to develop a higher level of operational awareness concerning your unique IT environment. This awareness can be summed up in a simple statement: “I know what is going on in my IT environment right now.” When you can say this with a high degree of confidence, you have arrived at your goal.
You can reach this goal by performing operations management. Operations management is not system administration. System administration, and the skill sets that go along with it, are used in operations management, but system administration is narrower in focus in that it applies to a single system or platform. Operations management has a broader scope. It looks at how multiple systems work together to provide IT services to a company. It involves troubleshooting, managing, and reporting on all those systems as a whole.
You can perform operations management manually by examining Windows event logs, gathering performance monitor data, and depending on your users to tell you that something has gone awry. This can be very time-consuming, even if you have only a small number of machines, and ultimately it leaves you reacting to events rather than preventing them. Microsoft Operations Manager 2005 (MOM 2005 or just MOM) performs many of these tasks for you and generates an alert when it detects a malfunction in a monitored application or a condition that can lead to a malfunction. In the alert, MOM 2005 tells you what is wrong, what the most common causes of the problem are, and the likely initial steps to fix the issue. Whether an issue has a large impact or goes unnoticed by the end user, resolving it quickly pays off. If the impact is large (say an Exchange server mailbox store is down), then quick resolution is the obvious goal, with obvious benefits. If the issue is small, for example a missed Active Directory (AD) replication cycle, then fixing it now often keeps it from snowballing into a major outage. Being able to fix small issues promptly because you know what is going on in your environment lets you be proactive. This is the biggest benefit of successful operations management, and it is why developing your operational awareness is critical to that success.
The goal of this book is to teach you how to use MOM 2005 correctly so that you can raise your level of operational awareness for your environment. Every IT environment has unique monitoring, alerting, and reporting requirements, so there is no cookie-cutter implementation for MOM. The key to using MOM correctly in your environment is to understand how it works and what it can do. Throughout this book, I will show you ways to implement and use MOM in different environments. You can then plan and use MOM to the greatest effect in your own.
Whatever your environment looks like, using MOM involves at least the following tasks.
Plan and design. Essentially, these tasks involve taking inventory of your environment and determining what your business and technical needs are. This has to be the starting point of your MOM deployment, and it is where the uniqueness of your environment is baked into your MOM implementation. Here you create a design that will meet your business needs.
Install MOM 2005. Every installation of MOM 2005 has the same basic components, just as every car has an engine, tires, a seat for the driver, and controls that the driver uses. But the placement and configuration of these parts vary from car to car. In the same way, the location and configuration of MOM components vary from installation to installation, and they have to vary to meet the unique business needs of each environment.
Deploy agents. This is the first task that you will perform once you have MOM 2005 installed. MOM 2005 discovers machines according to rules that you configure, and agents are then deployed to the machines it discovers. The most common rule tells MOM to discover all machines in a domain. Deploying agents onto machines that host applications you want to monitor lets MOM 2005 do one thing that you cannot: be in many places at the same time. Agents monitor applications on servers, as well as the servers themselves, and compare the collected data to sets of customizable health rules defined by the application vendors. When an exception is found, an alert is generated. The deployment and management of agents is covered in detail in Chapter 3.
Monitor your environment and fix what is wrong. MOM 2005 will tell you things about your environment that you are completely unaware of. Some of these things will be informational in nature, and others will require immediate action for resolution.
Tweak MOM. Not every health rule that MOM 2005 provides out of the box will be useful or even appropriate for your environment. If MOM is alerting you to issues that you don’t want to be alerted about, turn the rule off or adjust its thresholds so that MOM gives you relevant, actionable information.
Of course, these are not all the things you can do with MOM, but you will do at least these, no matter what. The last section of this chapter covers additional services of MOM 2005, such as its reporting.
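To make the agent and tuning tasks more concrete, here is a minimal sketch in Python of the idea behind them: collected data is compared to a set of health rules, an alert is raised for each exception, and turning a rule off or adjusting its threshold changes which alerts you see. This is not MOM code and it does not use any MOM interface. The names HealthRule, Alert, and evaluate, the server name EXCH01, and the metric names and values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HealthRule:
    """Hypothetical stand-in for a health rule with a simple upper threshold."""
    name: str
    metric: str
    threshold: float
    enabled: bool = True


@dataclass
class Alert:
    """Hypothetical stand-in for an alert raised when a rule is violated."""
    server: str
    rule: str
    value: float


def evaluate(server: str, samples: Dict[str, float], rules: List[HealthRule]) -> List[Alert]:
    """Compare collected samples to each enabled rule and return the exceptions."""
    alerts = []
    for rule in rules:
        if not rule.enabled:
            continue  # a rule that is turned off never alerts
        value = samples.get(rule.metric)
        if value is not None and value > rule.threshold:
            alerts.append(Alert(server, rule.name, value))
    return alerts


# Illustrative rules and sample data (all values are made up).
rules = [
    HealthRule("High CPU", "cpu_percent", 90.0),
    HealthRule("Long queue", "queue_length", 50.0),
    HealthRule("High memory", "memory_percent", 80.0),
]
samples = {"cpu_percent": 95.0, "queue_length": 60.0, "memory_percent": 85.0}

# Out of the box, every rule fires for this sample data.
alerts_before = evaluate("EXCH01", samples, rules)

# Tuning: turn off a rule you do not care about and raise a threshold
# that is too sensitive for this environment.
rules[1].enabled = False
rules[2].threshold = 90.0
alerts_after = evaluate("EXCH01", samples, rules)

print(len(alerts_before), "alerts before tuning")
print([a.rule for a in alerts_after], "still alerting after tuning")
```

With these made-up values, all three rules alert before tuning and only the CPU rule alerts afterward. The point of the sketch is the shape of the loop, not the numbers: the same data produces more or less noise depending on which rules are enabled and where their thresholds sit.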