SNMP is good for proactive monitoring (and some reactive monitoring situations when using SNMP traps), but it doesn’t always help with unplanned situations like network emergencies. In these situations, you may need to monitor the network in ways that are not covered by the available SNMP variables.
Here’s a true story that shows how Perl can help in these times. One Saturday evening I casually logged into a machine on my network to read my email. Much to my surprise, I found our mail and web servers near death and fading fast. Attempts to read and send mail or look at web content yielded slow responses, hung connections, and outright connection failures. Our mail queue was starting to reach critical mass.
I looked first at the state of the servers. Interactive response was fine, and the CPU load was high, but not deadly. One sign of trouble was the number of mail processes running. According to the mail logs, there were more processes running than expected because many transactions were not completing. Processes that had started up to handle incoming connections from the outside were hanging, driving up the load. This load, in turn, was preventing any new outgoing connections from being initiated. This strange network behavior led me to examine the current connection table of the server using netstat.
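As a rough illustration of this kind of check, here is a minimal Perl sketch that tallies connections by the state shown in the last column of netstat-style output. The subroutine name `tally_states` and the sample lines are hypothetical, invented for this example (the addresses come from the reserved documentation ranges), and the sketch assumes the state is the final whitespace-separated field, which may not hold on every system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tally TCP connection states from netstat-style lines.
# Assumption: the state is the last whitespace-separated column,
# as in typical `netstat -an` output; formats differ between systems.
sub tally_states {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        next unless $line =~ /^tcp/i;    # skip headers and non-TCP lines
        my @fields = split ' ', $line;
        $count{ $fields[-1] }++;
    }
    return \%count;
}

# Hypothetical sample lines for illustration only, not real data.
my @sample = (
    'tcp  0  0  192.0.2.10.25  198.51.100.7.34512  ESTABLISHED',
    'tcp  0  0  192.0.2.10.25  198.51.100.9.41200  TIME_WAIT',
    'tcp  0  0  192.0.2.10.80  203.0.113.5.50211   ESTABLISHED',
    'udp  0  0  192.0.2.10.53  *.*',
);

my $states = tally_states(@sample);

# Print the states, most frequent first.
printf "%-12s %d\n", $_, $states->{$_}
    for sort { $states->{$b} <=> $states->{$a} || $a cmp $b } keys %$states;
```

To try this against a live machine, one could replace `@sample` with the lines returned by running `netstat -an` in backticks, keeping in mind that the column layout may need adjusting for the local netstat.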
The last column of the netstat output told me that there were indeed many connections in progress on that machine from many different hosts. The big shocker ...