HP provides great SCOM management packs for monitoring ProLiant servers, but these management packs support only Windows agents. If you’re running ESX on ProLiant servers, monitoring hardware status takes a bit more effort. Fortunately, HP also offers Management Agents for ESX, so all that is needed to monitor HP ESX server hardware is a set of custom monitors that poll the SNMP data exposed by the management agent. An overview of the process follows:
Installing the HP Management Agent for ESX
- Configure SNMP on the ESX servers and set options: http://thwack.com/blogs/geekspeak/archive/2008/10/30/how-to-enable-snmp-on-a-vmware-esx-server.aspx
- Download the HP ESX agent (make sure your server model is supported by the agent) and copy the .tgz file to a temporary location on the ESX server
- Extract the file hpmgmt-8.x.x-vmware3x.tgz with the command: tar -zxvf hpmgmt-8.x.x-vmware3x.tgz
- In the extracted directory, run the install script. Later versions of the agent include a preinstall_setup.sh script, which must be run manually first and requires a reboot.
- Among other configuration prompts, you will be asked whether to use an existing snmpd.conf. If you choose “no,” the installer creates a new snmpd.conf, which you then have to configure with your SNMP settings.
- If you use an existing snmpd.conf, you will have to add one line to it: cd to /etc/snmp/, edit snmpd.conf, and add the following line: dlmod cmaX /usr/lib/libcmaX.so. This extends the SNMP agent to include the HP objects as a module.
- Restart snmp with: service snmpd restart
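If you manage several ESX hosts, the snmpd.conf edit described above can be scripted. The following is a minimal sketch, assuming Python is available wherever you prepare the file; the function name `ensure_dlmod` is hypothetical, and only the dlmod line itself comes from the steps above.

```python
# The module line from the install steps; everything else here is illustrative.
DLMOD_LINE = "dlmod cmaX /usr/lib/libcmaX.so"


def ensure_dlmod(conf_text: str) -> str:
    """Return snmpd.conf text that contains the HP dlmod line exactly once.

    The text is returned unchanged if an active (uncommented) copy of the
    line is already present, so the function is safe to run repeatedly.
    """
    wanted = DLMOD_LINE.split()
    for line in conf_text.splitlines():
        # Compare token by token so differences in spacing do not matter.
        # A commented-out line starts with "#" and therefore never matches.
        if line.split() == wanted:
            return conf_text
    # Make sure the appended directive starts on its own line.
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + DLMOD_LINE + "\n"
```

To apply it, read /etc/snmp/snmpd.conf, pass the contents through `ensure_dlmod`, write the result back, and then restart snmpd as in the step above. Keep a backup copy of the original file before overwriting it.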
Testing
The HP agents implement the Compaq MIBs under the OID 1.3.6.1.4.1.232. To test, you can use an SNMP browser to connect remotely and walk this OID, or, from the ESX server itself, you can run snmpwalk: snmpwalk -v 2c -c <read-only community name> localhost 1.3.6.1.4.1.232
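The same check can be automated. This is a rough sketch, assuming the net-snmp `snmpwalk` binary is on the PATH and that its `-On` option (numeric OIDs) is used so the output is easy to match; the helper names `parse_snmpwalk` and `walk_hp_subtree` are hypothetical.

```python
import re
import subprocess

HP_ENTERPRISE_OID = "1.3.6.1.4.1.232"

# Matches net-snmp varbind lines of the form "<oid> = <TYPE>: <value>".
_VARBIND = re.compile(r"^(\S+)\s+=\s+([A-Za-z0-9-]+):\s*(.*)$")


def parse_snmpwalk(output):
    """Turn snmpwalk text output into {oid: (type, value)}.

    Lines that do not look like a varbind (errors, continuation lines)
    are skipped rather than raising.
    """
    results = {}
    for line in output.splitlines():
        match = _VARBIND.match(line.strip())
        if match:
            oid, value_type, value = match.groups()
            results[oid] = (value_type, value)
    return results


def walk_hp_subtree(community, host="localhost"):
    """Walk the Compaq/HP enterprise subtree and return the parsed varbinds."""
    cmd = ["snmpwalk", "-v", "2c", "-c", community, "-On",
           host, HP_ENTERPRISE_OID]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_snmpwalk(result.stdout)
```

An empty result from `walk_hp_subtree` suggests that the HP module is not loaded or that the community string has no view of the subtree, which is the same conclusion you would draw from an empty manual walk.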
Monitoring with SCOM
- Discover the ESX servers as Network Devices
- Create a group for HP ESX servers (optionally in a new management pack). You can use dynamic inclusion logic by setting a filter on the Device Description (Contains vmnix)
- Create your SNMP monitors and rules, targeting the SNMP Network Device class. Configure the monitors and rules to be disabled, and then use an override to enable them for the HP ESX server group
- Create any required views or console tasks
What to Monitor?
When HP purchased Compaq, they made a smart decision to adopt the Compaq SNMP MIBs for all HP servers, as this is one of the better vendor SNMP implementations out there. It has remained very consistent over the years and, most importantly, it tends to implement a single status value for each group of subcomponents represented in SNMP tables, so you don’t have to walk the table to get the overall status. Thus, instead of checking the status of each disk drive, which will vary in number (and identifier in the table), you can just poll cpqDaMibCondition (1.3.6.1.4.1.232.3.1.3) from the CPQIDA MIB to get the overall Intelligent Drive Array health. The agent’s System Management web console can be used to drill into specific problems, so from a monitoring perspective, it is really only necessary to know when there is a problem and what its general nature is.
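To illustrate how a single status value can drive a monitor, here is a small sketch. It assumes the usual Compaq condition enumeration (other = 1, ok = 2, degraded = 3, failed = 4), which is what objects such as cpqDaMibCondition use; objects whose names do not end in "Condition" may define different values, so check each one in its MIB. The mapping to Healthy, Warning and Critical is my own assumption about how you might want the states to surface in SCOM.

```python
from enum import IntEnum


class CpqCondition(IntEnum):
    """Condition values assumed for Compaq '...Condition' objects."""
    OTHER = 1
    OK = 2
    DEGRADED = 3
    FAILED = 4


# Assumed mapping from SNMP condition to a three-state monitor.
_HEALTH_STATE = {
    CpqCondition.OK: "Healthy",
    CpqCondition.DEGRADED: "Warning",
    CpqCondition.FAILED: "Critical",
    CpqCondition.OTHER: "Unknown",
}


def to_health_state(raw_value: int) -> str:
    """Translate a polled condition integer into a health state label."""
    try:
        condition = CpqCondition(raw_value)
    except ValueError:
        # Value outside the assumed enumeration: do not guess.
        return "Unknown"
    return _HEALTH_STATE[condition]
```

In a three-state SNMP probe monitor, the same idea is expressed as expressions on the returned value: equal to 2 for healthy, equal to 3 for warning, equal to 4 for critical.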
These are the SNMP objects that I like to alert on for HP servers running UNIX:
| Object | Name | OID |
| --- | --- | --- |
| CPU Fans | cpqHeThermalCpuFanStatus | 1.3.6.1.4.1.232.6.2.6.5.0 |
| Drive Array Health | cpqDaMibCondition | 1.3.6.1.4.1.232.3.1.3.0 |
| Drive Array Controller (1) | cpqDaCntlrCondition | 1.3.6.1.4.1.232.3.2.2.1.1.6.1 |
| Power Supplies | cpqHeFltTolPwrSupplyCondition | 1.3.6.1.4.1.232.6.2.9.1.0 |
| System Fans | cpqHeThermalSystemFanStatus | 1.3.6.1.4.1.232.6.2.6.4.0 |
| Temperature (Status) | cpqHeThermalTempStatus | 1.3.6.1.4.1.232.6.2.6.3.0 |
| Thermal Conditions | cpqHeThermalCondition | 1.3.6.1.4.1.232.6.2.6.1.0 |
| Integrated Management Log | cpqHeEventLogCondition | 1.3.6.1.4.1.232.6.2.11.2.0 |
| Critical Errors | cpqHeCritLogCondition | 1.3.6.1.4.1.232.6.2.2.2.0 |
| Correctable Memory Errors | cpqHeCorrMemLogStatus | 1.3.6.1.4.1.232.6.2.3.1.0 |
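For a quick check outside of SCOM, the objects listed above can be polled in one pass. This is only a sketch: it assumes the net-snmp `snmpget` binary is installed and uses its `-Oqve` output options to print bare numeric values. The function names and the injectable `fetch` parameter are hypothetical, and the code returns raw integers without interpreting them, since the meaning of each value should be confirmed in the relevant MIB.

```python
import subprocess

# Labels and OIDs taken from the list of objects above.
HP_HEALTH_OIDS = {
    "CPU Fans": "1.3.6.1.4.1.232.6.2.6.5.0",
    "Drive Array Health": "1.3.6.1.4.1.232.3.1.3.0",
    "Drive Array Controller (1)": "1.3.6.1.4.1.232.3.2.2.1.1.6.1",
    "Power Supplies": "1.3.6.1.4.1.232.6.2.9.1.0",
    "System Fans": "1.3.6.1.4.1.232.6.2.6.4.0",
    "Temperature (Status)": "1.3.6.1.4.1.232.6.2.6.3.0",
    "Thermal Conditions": "1.3.6.1.4.1.232.6.2.6.1.0",
    "Integrated Management Log": "1.3.6.1.4.1.232.6.2.11.2.0",
    "Critical Errors": "1.3.6.1.4.1.232.6.2.2.2.0",
    "Correctable Memory Errors": "1.3.6.1.4.1.232.6.2.3.1.0",
}


def snmp_get_int(host, community, oid):
    """Fetch one integer value with snmpget (value only, numeric enums)."""
    cmd = ["snmpget", "-v", "2c", "-c", community, "-Oqve", host, oid]
    result = subprocess.run(cmd, capture_output=True, text=True,
                            check=True, timeout=10)
    return int(result.stdout.strip())


def poll_health(host, community, fetch=snmp_get_int):
    """Return {label: raw integer value} for every object in HP_HEALTH_OIDS.

    `fetch` can be replaced with a stub to test without a live agent.
    """
    return {label: fetch(host, community, oid)
            for label, oid in HP_HEALTH_OIDS.items()}
```

Passing a different `fetch` callable keeps the polling loop independent of the SNMP tooling, which also makes it easy to try the logic against canned values before pointing it at a real host.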
For reference on SNMP MIBs, ByteSphere provides a great online MIB database. The primary Compaq MIBs to look for are: CPQHLTH, CPQIDA, CPQSTSYS, CPQHOST, CPQNIC, CPQTHRSH.