SCOM: Updates to the Cisco Management Pack (R2) v1.0.2.6

I’m hoping to finish the SP1 version of the Cisco Management Pack soon. In the meantime, I’ve updated the R2 version with several changes. The current version, 1.0.2.6, can be downloaded here.

The changes in this version are:

  • Added three new containment classes:  Cisco Device Chassis, Cisco Device System Components, and Cisco Device Interfaces.   These classes contain monitored objects to add an additional level of hierarchical organization.
  • Added discovery of the IFAlias property for Interfaces
  • Added discovery of the Hostname (OLD-CISCO-MIB) and Chassis description for the Cisco Device class.
  • Updated the properties displayed by default in the Device and Interface views
  • Added a rule to clean up unused XML temporary files once a day.   Several of the monitors utilize temporary XML files written to the %TEMP% path.  In the previous version, old files would be left on the file system if a previously monitored object was removed.  This rule will remove those temporary files.
  • Modified discovery intervals for some objects for more balanced timing.
  • Added four new monitors for switches that implement the CISCO-STACK-MIB.  The monitors are targeted at the Cisco Device Chassis class and include
    • Fan Alarm
    • Temperature Alarm
    • Minor Alarm
    • Major Alarm
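The temporary-file cleanup rule described in the list above can be illustrated with a short sketch. This is not the management pack's actual implementation (which this post does not show); it assumes that a file counts as "unused" once it has not been modified for a full day, and the function name, `pattern`, and `max_age_seconds` parameters are hypothetical.

```python
import time
from pathlib import Path


def remove_stale_xml(temp_dir, pattern="*.xml", max_age_seconds=86400, now=None):
    """Delete files matching `pattern` in `temp_dir` that have not been
    modified within `max_age_seconds`. Returns the sorted names removed.

    Assumption: an actively monitored object rewrites its temp file on
    every poll, so an old modification time means the object is gone.
    """
    now = time.time() if now is None else now
    removed = []
    for path in Path(temp_dir).glob(pattern):
        if not path.is_file():
            continue
        if now - path.stat().st_mtime >= max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Running something like this once a day against the directory that `%TEMP%` resolves to would mirror the daily schedule of the rule. A narrower `pattern` would be safer in practice, since other applications may also write XML files there.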

With the new containment classes, the diagram view looks a lot better:

About Kristopher Bash
Kris is a Senior Program Manager at Microsoft, working on UNIX and Linux management features in Microsoft System Center. Prior to joining Microsoft, Kris worked in systems management, server administration, and IT operations for nearly 15 years.

26 Responses to SCOM: Updates to the Cisco Management Pack (R2) v1.0.2.6

  1. Pingback: SCOM: Advanced SNMP Monitoring Part III: The Completed Cisco Management Pack « Operating-Quadrant

  2. Richard says:

    Kris,

    First off, thanks for the update!

    I do have one concern though. I noticed that the Discovery ID for the Interfaces was changed from:

    CiscoSNMP.Discovery.CiscoInterfaces

    to:

    CiscoSNMP.Discovery.Interface

    Isn’t this going to affect any overrides already in place for Interface discovery?

  3. Richard says:

    Kris,

    Yes, it does affect any overrides already in place.

    I had a little over a hundred overrides for limiting interfaces for devices and they are gone after updating.

    • Kristopher Bash says:

      Ouch. This version did introduce significant changes to the class organization. Do you have a backup of your old MP with the overrides?

      • Richard says:

        Yes, I have backups. I have been going through your MP, disabling most of the discoveries, and then sealing the MP so I can store my overrides in a separate MP. I had them all exported to an Excel spreadsheet as well, so adding them back won’t be that big of a deal. I’m glad I had only done 3 sites so far; we have over 300 in total, and that would have been a lot of work to redo.

  4. Richard says:

    Also, with the new discovery, a new override for “Discover Cisco Interface” is now targeted at “Cisco Device Interfaces”, which shows only an IP address as the path. The hostname/device name is no longer shown, which makes it harder to target the override if you are used to using the hostname/device name instead of the IP address (which I am).

  5. Marnix Wolf says:

    Hi Kris.

    I get many events like this one:
    Event Type: Error
    Event Source: Health Service Modules
    Event Category: None
    Event ID: 11903
    Date: 2-10-2009
    Time: 10:11:57
    User: N/A
    Computer: XXXX
    Description:
    The Microsoft Operations Manager Expression Filter Module could not convert the received value to the requested type.

    Property Expression: Property[@Name='MemoryPctUsed']

    Property Value: 82,33

    Conversion Type: DataItemElementTypeDouble(3)

    Original Error: 0x80FF005A

    One or more workflows were affected by this.

    Workflow name: CiscoSNMP.Monitor.MemoryPoolPctUtil
    Instance name: Processor
    Instance ID: {AC2BBD47-FA33-0B10-62AA-F4F9DCBA2CCA}
    Management group: XXXX

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

    Also this event pops up, many times:

    Event Type: Error
    Event Source: Health Service Modules
    Event Category: None
    Event ID: 11001
    Date: 2-10-2009
    Time: 10:08:12
    User: N/A
    Computer: XXXX
    Description:
    Error sending an SNMP GET message to IP Address 10.40.1.252, Community String:=XXXX, Status 0x6c.

    One or more workflows were affected by this.

    Workflow name: CiscoSNMP.Rule.CollectIFCollisions
    Instance name: 59
    Instance ID: {20030910-0988-FC87-B40E-04FD9E0FD9B3}
    Management group: XXX

    I have already disabled the Cisco Status Interface Monitor for two ports, since these ports aren’t used and the alerts raised for them aren’t needed.

    • Kristopher Bash says:

      I’ll look into these errors.

      • Marnix Wolf says:

        Thanks Kris.

        Don’t get me wrong. I couldn’t write such an MP myself. I have investigated further, and the SNMP device is being monitored properly in OpsMgr.

        The EventID 11001 happens many times per second. These workflows are reporting the above mentioned issue:
        CiscoSNMP.Rule.CollectIFCollisions
        CiscoSNMP.Rule.CollectInterfaceInPctUtil
        CiscoSNMP.Monitor.InterfaceCollisions
        CiscoSNMP.Rule.CollectInterfaceOutPctUtil
        CiscoSNMP.Monitor.InterfacePctUtil

        If there is anything else I can do, let me know.

        Best regards,
        Marnix

  6. Kristopher Bash says:

    Marnix,

    Thanks for bringing this to my attention. If you’re up for it, I may ask you to collect some more information so that we can get to the root of the issue. How many Cisco devices are you monitoring, and what is the highest number of interfaces per device you are monitoring? Are you able to tell whether these errors are occurring on just one device or on multiple devices? Off the top of my head, I suspect one of three things is happening:
    1) Too many SNMP requests are being sent to the device at the same time and some are timing out
    2) Too many SNMP requests are being sent from the proxy agent at the same time and some are timing out
    3) A malformed OID is being passed to the monitor

    To explain item 3, all of the interface monitors use a variable replacement to form the OID. So the OID is defined in the monitor as .1.3.1.x.x.x.x.x.$Config/Index$. I have used expression filters to try to prevent any SNMP requests from being sent if the $Config/Index$ value is null, but I suppose there could be a problem with this that I am unaware of.

    If the SNMP request is successfully sent and no data value is returned, it should not generate an 11001 event, the return value would just be empty.
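The variable replacement and null check described above can be sketched as follows. This is an illustration only, not the management pack's expression filter: the function name is hypothetical, and the base OID used in the usage note is the standard IF-MIB `ifInOctets` column, chosen purely as an example.

```python
def build_interface_oid(base_oid, index):
    """Append an interface index to a base OID.

    Returns None when the index is missing or not a plain non-negative
    integer, so the caller can skip the SNMP request instead of sending
    a malformed OID.
    """
    if index is None:
        return None
    text = str(index).strip()
    if not (text.isascii() and text.isdigit()):
        return None
    return f"{base_oid.rstrip('.')}.{text}"
```

For example, `build_interface_oid(".1.3.6.1.2.1.2.2.1.10", "59")` yields `.1.3.6.1.2.1.2.2.1.10.59`, while an empty or null index yields `None` and no request would be sent.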

    As for the memory pool utilization error, I believe that is a problem due to regional language settings (or rather, the MP not properly anticipating regional setting differences). I haven’t proven this out, but my hunch is that the performance data mapper may not like the value being returned as 82,33 instead of 82.33. I think this is probably the result of using the VBScript FormatNumber function to trim the decimal places. I’ll try to see if I can prove this out in a lab setting.

    Thanks again!
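The suspected decimal-separator problem (82,33 versus 82.33) can be illustrated with a small sketch. It is not the fix used in the management pack; it assumes the values are plain percentages with no thousands separators, and both function names are hypothetical.

```python
def parse_percentage(raw):
    """Parse a percentage string that may use ',' or '.' as the decimal
    separator. Assumes no thousands separators are present."""
    return float(raw.strip().replace(",", "."))


def format_percentage(value, places=2):
    """Format a number with a '.' decimal separator regardless of the
    regional settings of the machine running the script."""
    return f"{value:.{places}f}"
```

The design point is to keep the value locale-neutral between the script and the consumer: either normalize on the way in (`parse_percentage`) or never emit a locale-dependent string in the first place (`format_percentage`).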

    • Marnix Wolf says:

      Hi Kris.

      I am collecting information. (I am at a different site now.) When I have the information, I’ll let you know.

      I am sure it is too many SNMP requests, since Event ID 11001 popped up tens of times per second.

      At the moment only one Cisco device was being monitored. I have removed the Cisco MP for now, since it flooded the OpsMgr event log of the Management Server (which monitors this Cisco device as a network device) too much.

      Thanks again for your efforts.

      By the way, can we communicate more directly? (If you are interested, that is.)

      Best regards,
      Marnix Wolf

  7. Marnix Wolf says:

    Hi Kris.

    Information about the device: Cisco 6500 with 13 slots and, at this moment, 91 ports.

    Best regards,
    Marnix Wolf

  8. Yury Krylov says:

    Hi Kris,
    Thanks for your great job. The only thing I want to point out is that if SCOM is installed on Windows Server 2008, the file RegCiscoMibs.cmd needs to be modified:
    in all command lines, change %SYSTEMROOT%\system32\wbem\snmp\smi2smir.exe to %SYSTEMROOT%\system32\wbem\smi2smir.exe. There is no %SYSTEMROOT%\system32\wbem\snmp directory in WS 2008, if I’m not mistaken.

    -Yury

  9. Jeroen Mazereel says:

    Hi,

    First of all, great job with the management pack!

    I get a lot of warning alerts like this one:

    Cisco Temperature Sensor Status Monitor

    The temperature sensor (chassis) on the device xxxxxxx is in a warning or error state. The sensor state is: 5.

    Legend:
    normal(1),
    warning(2),
    critical(3),
    shutdown(4),
    notPresent(5),
    notFunctioning(6)

    It probably shouldn’t alert on this state. That or the condition on which the alert triggers should be overridable.

    Again, great job so far with the management pack.

    Best Regards,
    Jeroen Mazereel
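The suggestion above, that notPresent(5) should not alert or that the alerting condition should be overridable, can be sketched like this. The state legend is taken from the alert text quoted in the comment; the default set of alerting states and the function name are assumptions for illustration, not the monitor's actual configuration.

```python
# Legend as quoted in the alert description.
SENSOR_STATES = {
    1: "normal",
    2: "warning",
    3: "critical",
    4: "shutdown",
    5: "notPresent",
    6: "notFunctioning",
}

# Assumed default: everything except normal(1) and notPresent(5) alerts.
DEFAULT_ALERT_STATES = frozenset({2, 3, 4, 6})


def should_alert(state, alert_states=DEFAULT_ALERT_STATES):
    """Return True when the sensor state is in the overridable alert set."""
    return state in alert_states
```

Passing a different `alert_states` set plays the role of an override, so an environment that does care about missing sensors could add state 5 back without changing the monitor logic.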

  10. Chris Taylor says:

    First, nice work. =) This is extremely useful.

    A couple things that I’ve noticed though (I’m testing monitoring with an ASA5505).

    1) Interface DMZ is administratively down, with no cable plugged in. It is healthy (since its admin status is 2). If I change this to up (1), it shows up in a critical state (since the line protocol is down). If I shut the interface back down, it doesn’t automatically return to healthy. If I reset the health, it does show up correctly.

    2) Something similar seems to happen with devices: if the managed device is offline longer than the Discover Cisco Device discovery interval, the device drops out of the Cisco Devices group.

  11. Alex Fischer says:

    Hi Kristopher,

    I’m working with the 1.0.2.7 build (SCOM 2007 R2 on W2K8). What could be the reason that Cisco 6500 (Supervisor 720) devices are visible and show the integrity of the chassis, but do not show the status of the devices?

    Alex

  12. Pierre-Emmanuel says:

    Hi Kris,

    This is a great job you have done so far. I am currently inspecting the nooks and crannies of the MP, and I have a question right off the bat:

    In the data source module for discovering Cisco devices, you have an OID Filter module which, from what I understand, filters out any network devices whose OID does not contain 1.3.6.1.4.1.9. Then, in the final mapper module, you use the FilteredClassSnapshotDataMapper, which contains an expression that filters again on the OID 1.3.6.1.4.1.9. Is this really necessary? Wouldn’t the ClassSnapshotDataMapper module be sufficient, since the OID has already been filtered? The way I see it, if an OID doesn’t match the first OID filter, the workflow stops there, and therefore a further filter isn’t necessary.

    Best regards!

    Pierre-Emmanuel

  13. Hi Kristopher,

    That is some great work! Really appreciated.

    I have just one remark: the Cisco switch I’m monitoring has some 10 Gbit ports, and these are precisely the ones I want to monitor. The Indexfilter parameter works fine. The problem I see now is that the ifInOctets value is limited to 4294967295, as it is a Counter32. With 10 Gbit bandwidth the counter is reset almost every 5 minutes, which makes the inpct calculation incorrect. I found at http://www.cisco.com/en/US/tech/tk648/tk362/technologies_q_and_a_item09186a00800b69ac.shtml that there are 64-bit counters that provide these values. So I have two questions: Is it possible to query such 64-bit SNMP counters in SCOM?

    • Kristopher Bash says:

      Thanks for your comment. The 32-bit counter rollover is a real problem with 10GbE interfaces and can be a problem with 1Gb interfaces too. The challenge in a monitoring scenario is detecting which interfaces support the 64-bit counters in the IF-MIB, as some vendors/versions support 64-bit octet counters on 1Gb interfaces and others don’t. I am happy to say that this issue is fully addressed in the xSNMP management pack, which I will post for public beta testing this week. In this MP, I configured support for both 32-bit and 64-bit IF octet counters by discovering whether each interface supports 64-bit counters, and then passing the correct OID (32-bit or 64-bit octet counter) as an overridable parameter to the utilization monitors/rules.
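The counter rollover discussed in this exchange comes down to simple arithmetic, sketched below. This is not code from either management pack; the function names are hypothetical, and the wrap-aware delta is only correct if the counter wrapped at most once between two polls.

```python
def counter_delta(previous, current, bits=32):
    """Difference between two samples of a wrapping SNMP counter.
    Valid only if the counter wrapped at most once between samples."""
    return (current - previous) % (1 << bits)


def seconds_to_wrap(link_bps, bits=32):
    """Time for an octet counter to wrap at sustained line rate."""
    return (1 << bits) * 8 / link_bps


def percent_utilization(previous, current, interval_seconds, link_bps, bits=32):
    """Utilization over one polling interval, from two octet samples."""
    octets = counter_delta(previous, current, bits)
    return octets * 8 * 100 / (interval_seconds * link_bps)
```

At a sustained 10 Gbit/s, `seconds_to_wrap(10e9)` is about 3.4 seconds, so a 32-bit octet counter can wrap many times within a typical polling interval and the computed utilization becomes meaningless. A 64-bit counter (`bits=64`) removes the problem in practice.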
