Coming Soon: SNMP Monitoring for Changes in Polled Values

Most SNMP monitoring can be facilitated by comparing the value of a specific retrieved SNMP object to an expected string or threshold, but monitoring for some conditions can only really be accomplished by comparing the current value to a previous value.  

Three examples of this are:

1)      Serial Interface Flapping:  If a serial connection is experiencing problems, the interface may bounce up and down rapidly.  If an SNMP poll on that interface is occurring every 1, 3, or 5 minutes, it may not detect any problems (if the interface is up for the poll), meaning that compromised availability could go undetected for several polling cycles.  These conditions can be detected by comparing the Interface Resets (locIFResets) counter in the Cisco local interfaces table  to previous values.   

2)      Default Gateway Changes on Redundant Routers:  In some redundant WAN router deployments, a default gateway change on the routers is indicative of a redundancy failover.  Because all devices and interfaces may be up and reachable before and after a failover, it may be difficult to detect when the failover has occurred, potentially meaning production traffic is routed over a slower backup link.   This can be detected by monitoring the Default Gateway value (ipRouteNextHop in the ipRouteTable) on the routers and detecting changes when compared to previous polling cycles.

3)      High-Availability State Changes on CheckPoint SPLAT Firewalls:  In an HA configuration on CheckPoint firewalls, the haStatus value will return a string value of “active” or “standby.”  The best way to detect an HA failover is to watch this value for a change.

These are just three examples of many potential scenarios where monitoring of an SNMP object is best served by comparing current values to previously polled values.  Unfortunately, this capability is not a common feature in many of the monitoring tools that I am familiar with.  

Over the next week (or more), I will be posting articles about how I have to implemented just such a monitor for the three described scenarios using the two monitoring products that I currently work with:  SolarWinds ORION and System Center Operations Manager.  In the case of ORION, these monitors can be implemented fairly easily with a bit of SQL work.  In the case of SCOM, it’s a little bit more complicated, but ultimately doable.

Advertisement

About Kristopher Bash
Kris is a Senior Program Manager at Microsoft, working on UNIX and Linux management features in Microsoft System Center. Prior to joining Microsoft, Kris worked in systems management, server administration, and IT operations for nearly 15 years.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: