Microsoft Operations Manager (MOM) 2005 can deliver a huge amount of value to organizations of any size. It automates burdensome and lengthy diagnostic tasks so that you are notified of an event in your environment almost as soon as it happens. Armed with this information, and a set of integrated tools engineered to help you fix whatever is wrong, a solid MOM implementation helps any IT infrastructure reduce outages and simply run better. And if your machines and applications are healthier, chances are your life will be just a bit easier.
System administrators of any network must perform operations management duties. If your network is small and relatively uncomplicated, you probably handle these tasks without the assistance of a tool like MOM. Instead, you rely on your end users to tell you about an outage or an incident with a system. More than likely, you rely on built-in tools such as event logs, performance counters, Dr. Watson logs, and application-specific logs to provide the data you need to diagnose issues with your systems. In addition, you perform tasks such as pinging the IP address of a device to see if it is up on the network. You run diagnostic tools, such as DCDIAG and REPLMON for domain controller issues, to get more in-depth information. You rely on your experience, your knowledge of the systems involved, and external support from online research or a phone call to a support engineer to determine a course of action to fix issues and restore service.
Sometimes, issues resolve themselves, or they arise intermittently—evaporating before you can capture the data needed to diagnose them. If you are lucky and can determine the root cause of the incident, you will need a good deal of self-discipline to record the facts of the issue and the steps you took to resolve it for future reference. Then, during the budgetary cycle when you have to justify why new server hardware is needed, or why a different backup solution is appropriate, you will probably scramble to find that supporting documentation.
Ultimately, you would probably describe your workday as being interruption driven—you spend most of your time putting out fires. You never have time for systems design and new implementations. You are always in a reactive mode.
An operations management system can help you with many of these duties, not by eliminating them—no product can ever do that—but by capturing system data, intelligently analyzing it, correlating it with data from other systems, and notifying you about an issue with actionable information. The operations management system can then help you perform further diagnostic tests and suggest courses of action based on vendor knowledge. In addition, it can give you a place to capture the resolution steps that are associated with the issue, so that the next time it happens your troubleshooting information is already there for quick resolution. It can help you track system performance and events over a long period of time for reporting purposes. Probably the greatest help that an operations management system can give you is that it, unlike you, can be in many different places doing many different things at the same time. And through the implementation of such a system, your organization can gain operational awareness, which is knowledge about the state of your IT systems at any given point in time. Once an organization has this level of insight into the health of its distributed systems, it can start preventing issues before they become outages. This allows an IT organization to shift from being primarily in a reactive mode, to being primarily in a proactive mode.
MOM 2005 takes a unique approach, developing your operational awareness by providing you with information that focuses on the state of an application or server, based on the condition of its components. The current state of a monitored computer or application is judged against a definition of its health supplied by the vendor that produced it. Health definitions, otherwise known as management packs, are developed by the application product teams and contain their distilled diagnostic and troubleshooting knowledge. Using MOM gives you the benefit of their experience. This, combined with MOM’s ability to monitor all the servers that play a role in delivering an application service (such as email or Active Directory (AD)), can let you know how healthy your whole environment is at a single glance.
When you transition into a new job or to new duties in your current job, you naturally first want to learn only what you must know to do whatever it is you have to do. How you find out what you must know is an experience whose pain lies somewhere between that of a root canal and sore muscles from a hard workout. The goal of this book is to give you what you must know to have a solid foundation for planning, implementing, and administering MOM 2005, which is a pretty good workout.
I wrote this book with the first-time MOM administrator in mind. To get the most from this book, it is helpful to have some background as a Windows system administrator in a multiple-server environment. If you are a complete novice to the Windows OS and are not even sure what event logs or performance counters are, then you are going to struggle a bit with this material until you get some basic Windows OS administration under your belt.