Operations Manager – Extending UNIX/Linux Monitoring with MP Authoring – Part III

Introduction

In Part II of this series, I walked through creation of data sources, a discovery, a monitor type, and a monitor for customized “Process Count” monitoring for discovered instances of a “Service” class. In this post, I will continue to build on this example MP to implement dynamic log file discovery and monitoring.

Dynamic Log File Discovery and Monitoring

Log file monitoring of a single known log file can be easily implemented with the Microsoft.Unix.SCXLog modules, but in some cases, the full path to a log file isn’t static. For example, if an application maintains multiple log files in a directory, the file name portion of the log file path may not be known ahead of time. To handle this scenario, we can discover the log files dynamically with a shell command execution, and then pass the full path of each discovered log file to the SCXLog module for standard log file monitoring. This requires a new class, a discovery data source, a discovery rule, and a rule that actually implements the log file monitoring.

Defining the Log File Class and Hosting Relationship

First, a new custom class is required to represent the log file objects. Instances of this class will be discovered by the discovery rule.

Definition 

  • ID:  MyApp.Monitoring.LogFile
  • Base Class:  Microsoft.Unix.ApplicationComponent
  • Name:  MyApp Log file

Properties

  • Name (String)
  • Path (String) – Key

The properties for the log file class represent the file name and full path.   The full path is assured to be unique, so I have specified that as the key property of the class.

The log file class needs to be hosted by the MyApp class, to maintain the relationship between the log files and the application.  

Discovery Data Source:  MyApp.Monitoring.DataSource.DiscoverLogFiles

This data source will use the MyApp.Monitoring.DataSource.ShellCommandDiscovery probe action to find files in a given directory that match a pattern.   The output from this command execution will then be passed to a Microsoft.Windows.PowerShellDiscoveryProbe.   The reason that I am using a PowerShellDiscoveryProbe is that the listing of matched files will be returned as a single data item, the StdOut from the command.   Using a PowerShellDiscoveryProbe provides an easy way to split each line from the output and discover an instance per line. 

Configuration Parameters:

  • Interval (integer):  Scheduler interval in seconds – overridable
  • TargetSystem (string):  UNIX/Linux agent computer to execute the discovery
  • Appname (string):   The name of the application object (which is the key property for the hosting class instance)
  • LogFileNamePattern (string): The pattern that will be used in the grep operation to identify log files to discover
  • LogFilepath (string): The directory to search for log files (via an ls command)

Member Modules:

The first member module is a MyApp.Monitoring.DataSource.ShellCommandDiscovery probe action that executes the following command:

ls $Config/LogFilepath$ | grep $Config/LogFileNamePattern$

This simply enumerates the contents of the specified directory path and pipes the results to grep to match a specified pattern, which could be a string match or a regular expression.

Module Configuration:

<Interval>$Config/Interval$</Interval>
<TargetSystem>$Config/TargetSystem$</TargetSystem>
<ShellCommand>ls $Config/LogFilepath$ | grep $Config/LogFileNamePattern$</ShellCommand>
<Timeout>120</Timeout>
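To see what the command returns as StdOut, the same pipeline can be tried in a local shell. The sketch below is only an illustration: the temporary directory, the file names, and the plain string pattern are made up, and they stand in for the values OpsMgr substitutes for $Config/LogFilepath$ and $Config/LogFileNamePattern$.

```shell
#!/bin/sh
# Hypothetical directory and file names; in the MP these values come from
# $Config/LogFilepath$ and $Config/LogFileNamePattern$.
logdir=$(mktemp -d)
touch "$logdir/log1" "$logdir/log2" "$logdir/notes.txt"

# Same shape as the MP command: ls <path> | grep <pattern>
stdout=$(ls "$logdir" | grep 'log')

# ls writes one name per line when its output is a pipe, so the result is
# a newline-separated list of bare file names (no directory part).
echo "$stdout"
count=$(printf '%s\n' "$stdout" | wc -l | tr -d ' ')
echo "matched files: $count"

rm -r "$logdir"
```

Because the names come back without the directory, whatever consumes this output has to prepend the search path to build the full path of each file.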

The output of this shell command then needs to be parsed so that each valid line in the output is discovered as an instance of a log file object.   This is most easily done with a PowerShellDiscoveryProbe:

param ([string]$CmdOutput, [string]$AppName, [string]$LogFilePath, [string]$TargetSystem, [string]$SourceID, [string]$ManagedEntityID)

# Create the discovery data object for this discovery workflow
$api = New-Object -ComObject 'Mom.ScriptAPI'
$discoveryData = $api.CreateDiscoveryData(0, $SourceID, $ManagedEntityID)

if ($CmdOutput -ne $null) {
    # Turn line breaks (CRLF or LF) into spaces, then split into one entry per file name
    $CmdOutput = $CmdOutput.Replace("`r`n", " ").Replace("`n", " ")
    [array]$arList = $CmdOutput.Split(" ")
    $arList | ForEach-Object {
        [string]$sFile = $_
        # Skip empty entries left over from the split
        if ($sFile.Length -ge 1) {
            $sFilePath = $LogFilePath + "/" + $sFile
            $oInst = $discoveryData.CreateClassInstance("$MPElement[Name='MyApp.Monitoring.LogFile']$")
            $oInst.AddProperty("$MPElement[Name='MyApp.Monitoring.LogFile']/Name$", $sFile)
            $oInst.AddProperty("$MPElement[Name='System!System.Entity']/DisplayName$", $sFile)
            $oInst.AddProperty("$MPElement[Name='MyApp.Monitoring.LogFile']/Path$", $sFilePath)
            # Key properties of the hosting application and computer
            $oInst.AddProperty("$MPElement[Name='MyApp.Monitoring.MyApp']/Name$", $AppName)
            $oInst.AddProperty("$MPElement[Name='MicrosoftUnixLibrary!Microsoft.Unix.Computer']/PrincipalName$", $TargetSystem)

            $discoveryData.AddInstance($oInst)
        }
    }
    # Return the discovery data to the workflow
    $discoveryData
}

Remove-Variable api
Remove-Variable discoveryData

The PowerShell script loads the Mom.ScriptAPI, creates a Discovery Data instance, and then walks through each line of the output. If the file name is a valid string (not empty), a class instance is created for the MyApp.Monitoring.LogFile class, and the path and file name properties are set. The PrincipalName property of the Microsoft.Unix.Computer object and the Name property of the MyApp.Monitoring.MyApp class are included in the DiscoveryData, so that the discovery mapping process can map the hosting relationships.

Parameters are passed from the module configuration to the script using the Parameters XML fragment in the module configuration:

<Parameters>
  <Parameter>
    <Name>TargetSystem</Name>
    <Value>$Config/TargetSystem$</Value>
  </Parameter>
  <Parameter>
    <Name>AppName</Name>
    <Value>$Config/Appname$</Value>
  </Parameter>
  <Parameter>
    <Name>LogFilePath</Name>
    <Value>$Config/LogFilepath$</Value>
  </Parameter>
  <Parameter>
    <Name>CmdOutput</Name>
    <Value>$Data///*[local-name()="StdOut"]$</Value>
  </Parameter>
  <Parameter>
    <Name>ManagedEntityID</Name>
    <Value>$Target/Id$</Value>
  </Parameter>
  <Parameter>
    <Name>SourceID</Name>
    <Value>$MPElement$</Value>
  </Parameter>
</Parameters>

This data source can then be used to discover log files matching a pattern, in a specified directory.  

Discovery Rule:  MyApp.Monitoring.Discovery.LogFile

This discovery will discover dynamically-named log files in a specified path, using a regular expression to filter by file name. It discovers instances of the MyApp.Monitoring.LogFile class, and uses the MyApp.Monitoring.DataSource.DiscoverLogFiles data source. The discovery targets MyApp.Monitoring.MyApp.

Data Source Configuration:

  • <Interval>14400</Interval>
  • <TargetSystem>$Target/Host/Property[Type="MicrosoftUnixLibrary!Microsoft.Unix.Computer"]/PrincipalName$</TargetSystem>
  • <Appname>$Target/Property[Type="MyApp.Monitoring.MyApp"]/Name$</Appname>
  • <LogFileNamePattern>'^log[0-9][0-9]*'</LogFileNamePattern>
  • <LogFilepath>$Target/Property[Type="MyApp.Monitoring.MyApp"]/InstallPath$/logs</LogFilepath>

The two parameters to note are the LogFilepath (which is defined as the application path discovered for the MyApp application, appended with “/logs”) and the LogFileNamePattern (which is a regular expression, '^log[0-9][0-9]*', that will match log files named logxxx, where xxx is a number). The repetition is written as [0-9][0-9]* rather than [0-9]+ because grep, when run without options, uses basic regular expressions, in which + is not a repetition operator.
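A candidate pattern can be checked against a few sample names before it goes into the discovery. The sketch below uses made-up file names and runs the "log followed by one or more digits" pattern in two spellings: ^log[0-9]+ under grep -E, and ^log[0-9][0-9]* under plain grep, where + has no special meaning.

```shell
#!/bin/sh
# Made-up file names used only to exercise the pattern.
names=$(printf '%s\n' log1 log042 log logfile mylog7 log9.old)

# Extended syntax: + means "one or more" only with grep -E.
ere=$(printf '%s\n' "$names" | grep -E '^log[0-9]+')

# Basic syntax for plain grep: spell "one or more digits" as [0-9][0-9]*.
bre=$(printf '%s\n' "$names" | grep '^log[0-9][0-9]*')

echo "$ere"
# Neither spelling is anchored at the end, so log9.old is selected too.
```

Both spellings select the same names here. If rotated or archived copies such as log9.old should be excluded, the pattern would need a trailing $ anchor.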

Monitoring the Discovered Log Files

Log File Monitoring Rule:   MyApp.Monitoring.Rule.AlertOnLogError

Now that the dynamically-named log files will be discovered, we need a rule to alert when an error is found in one of the logs.   The rule will target all instances of the MyApp.Monitoring.LogFile class, so that when a new log file instance is discovered, it is automatically monitored.  The rule uses the MicrosoftUnixLibrary!Microsoft.Unix.SCXLog.Privileged.Datasource (assuming the log files require privileged credentials to access).

Data source configuration:

  • <Host>$Target/Host/Host/Property[Type="MicrosoftUnixLibrary!Microsoft.Unix.Computer"]/NetworkName$</Host>
  • <LogFile>$Target/Property[Type="MyApp.Monitoring.LogFile"]/Path$</LogFile>
  • <RegExpFilter>^.*(e|E)rror.*$</RegExpFilter>

The discovered path of the log file instance is input as the LogFile parameter value, and a regular expression (^.*(e|E)rror.*$) is defined to match any log entries containing the string error or Error in the message.
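The filter can be checked against sample lines before the rule is deployed. The following is a sketch with invented log lines, using grep -E as a stand-in for the agent-side matcher; that substitution is an assumption, and the agent's regular expression engine may differ in details.

```shell
#!/bin/sh
# Invented log lines; grep -E stands in for the agent-side matcher.
sample=$(printf '%s\n' \
  'Jan 01 10:00:01 myapp: started' \
  'Jan 01 10:00:02 myapp: Error opening socket' \
  'Jan 01 10:00:03 myapp: retry after error 13' \
  'Jan 01 10:00:04 myapp: FATAL ERROR')

matched=$(printf '%s\n' "$sample" | grep -E '^.*(e|E)rror.*$')
echo "$matched"
# The second and third lines match. The all-caps ERROR line does not,
# because only the first letter of the word is allowed to vary in case.
```

If the application also logs messages in all capitals, the expression would have to be widened accordingly.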

Condition Detection configuration:

A System!System.Event.GenericDataMapper condition detection is then configured to map the data to EventData, for consumption by OpsMgr.  The configuration of this module is:

  • <EventOriginId>$MPElement$</EventOriginId>
  • <PublisherId>$MPElement$</PublisherId>
  • <PublisherName>MyApp</PublisherName>
  • <Channel>Application</Channel>
  • <LoggingComputer>$Target/Host/Host/Property[Type="MicrosoftUnixLibrary!Microsoft.Unix.Computer"]/NetworkName$</LoggingComputer>
  • <EventNumber>8001</EventNumber>
  • <EventCategory>0</EventCategory>
  • <EventLevel>1</EventLevel>
  • <UserName/>
  •  <Params/>
  •  </ConditionDetection>

Write Actions:

In this rule, I have configured two write actions: one to collect the event and one to generate an alert. The CollectEvent (SC!Microsoft.SystemCenter.CollectEvent) module requires no additional configuration, and the alert can be configured to provide details about the logged error message.

 

Stay tuned for more in this series…


Operations Manager – Extending UNIX/Linux Monitoring with MP Authoring – Part II

Introduction

In Part I of this series, I walked through creation of a custom Management Pack for monitoring an application hosted on a UNIX or Linux server, as well as the creation of some base data sources and application discovery.   In this post, I will build on this MP to implement custom process monitoring – monitoring the count of instances of a running daemon/process to check that the count is within a range.   While the standard process monitoring provider (SCX_UnixProcess) is the best source for process information in OpsMgr UNIX and Linux monitoring, it does not support this level of customized monitoring.

Advanced Service Monitoring

Continuing this custom application monitoring scenario, our hypothetical app has a single daemon associated with the app, but we will build the classes and data sources so that they could easily be extended to add more services/daemons to monitor.    In this example, we can suppose that we want to monitor a daemon that may have multiple instances running, and drive an alert if too many or too few instances of that process are running.   This monitoring will be implemented by using the ps command in a WSMan Invoke module.   To implement monitoring of a daemon for a discovered, custom application, there are two approaches that are viable:
 
  1. Define a custom service class, discover an instance of this class for each service to monitor, and configure monitor types and monitors targeting this class
  2. Create a monitor for each service to monitor, targeting the custom application class

Both methods are completely viable, and in most cases, it is appropriate to take the simpler approach and target the custom monitors to the application, providing static inputs into the monitor. There are some cases where discovering a class instance for the service makes sense, though. Facilitating dynamic discovery of services or thresholds (read from a config file), using the service class in a Distributed Application model in OpsMgr, or maintaining logical separation (in terms of monitoring) between the application and its subsystems are all scenarios that would benefit from discovering the monitored services as class instances. For the purpose of illustration, I will discover the daemon to monitor in this example Management Pack as a class instance.
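As a rough illustration of the count check described at the start of this section, a shell command can count processes by name and compare the result with a minimum and a maximum. This is a sketch under assumptions, not the command used in the MP: the daemon name myappd, the thresholds 1 and 5, and both helper functions are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the names and thresholds are illustrative only.

# Count running processes whose command name is exactly $1.
# ps -e -o comm= prints one command name per line with no header;
# what the comm column contains can vary between platforms.
count_procs() {
    ps -e -o comm= | grep -c -x "$1"
}

# Print OK if count $1 lies within [$2, $3], otherwise ERROR.
check_range() {
    if [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]; then
        echo "OK"
    else
        echo "ERROR"
    fi
}

running=$(count_procs myappd)
echo "myappd instances: $running ($(check_range "$running" 1 5))"
```

In the MP, the minimum and maximum would come from the MinRunning and MaxRunning values rather than from literals.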

Class Definition

Class:  MyApp.Monitoring.Service

Definition

  • ID:  MyApp.Monitoring.Service
  • Base Class:  Microsoft.Unix.ApplicationComponent
  • Name:  MyApp Service

Properties

  • Name (String) – Key
  • MinRunning (Integer)
  • MaxRunning (Integer)

Discovery

Then we can define the data source to discover a service.   In this case, we know the name of the service and the value of the properties, so we don’t need to actually poll the agent to return data.   We can simply combine a Discovery Scheduler with a Discovery Data Mapper module to implement the data source.  However, we want to be able to override the values of MinRunning and MaxRunning, so these will need to be exposed as overridable configuration parameters.

Therefore, I’ve chosen to implement this data source in two parts. The first data source will simply combine a System.Discovery.Scheduler module and a System.Discovery.ClassSnapshotDataMapper module. This data source will accept Interval, ClassId and InstanceSettings parameters as inputs. The second data source will reference the first data source, but implement parameters for Service Name, MinRunning, and MaxRunning. By breaking this into two data sources, the first data source can be used for other simple discoveries.

Discovery Data Source:  MyApp.Monitoring.DataSource.DiscoverObject

This is the data source that simply combines a scheduler and a discovery data mapper.  It requires that the MapperSchema be added to the Configuration:

<Configuration>
  <IncludeSchemaTypes>
    <SchemaType>System!System.Discovery.MapperSchema</SchemaType>
  </IncludeSchemaTypes>
…

Operations Manager – Extending UNIX/Linux Monitoring with MP Authoring – Part I

Introduction

The OpsMgr UNIX and Linux monitoring implementation can be extended through MP authoring to implement robust system and application monitoring for UNIX/Linux servers.   The most direct mechanism of extension comes in the form of the script provider, accessed with WSMan Invoke modules.   The WSMan Invoke modules support three methods of invoking actions:

  • ExecuteCommand – execute a command (e.g. a script already on the file system) and return the results
  • ExecuteShellCommand – execute a command through sh (with pipeline support) and return the results
  • ExecuteScript  – download and execute an embedded script and return the results

Of these three methods, I prefer to use ExecuteShellCommand in most cases, as it allows for the use of complex one-liner shell commands, embedded in the MP.

In a series of posts, I will describe the creation of an example Management Pack for monitoring an application, featuring dynamic application discovery, discovery of multiple log files, and advanced monitoring implementations.

Example Application Details

The example MP described in these blog posts implements monitoring for a hypothetical application (MyApp).  The application involves a daemon, a set of log files, and application performance counters where the metrics are accessible as the contents of files.

Part I – Discovering an Application

Setting up the MP

I am a big fan of the R2 Authoring Console and will be using it to create this example MP.   The first step then is to create a new MP in the Authoring Console (ID:  MyApp.Monitoring).    Once the MP is created and saved, references are needed.   References I am adding are:

  • Microsoft.Unix.Library – contains UNIX/Linux classes and modules
  • Microsoft.SystemCenter.DataWarehouse.Library – required for publishing performance data to the DW
  • System.Image.Library – contains icon images referenced in class definition

Configuring the Base Composite Modules

xSNMP Reports 1.1.1 Now Available!

The xSNMP Reports version 1.1.1 package is now available at manage-x.net. This suite of management packs adds value to the xSNMP suite by implementing OpsMgr reports for data collected by performance rules in the xSNMP management packs. Reporting management packs included are:

  • xSNMP for APC Reports
  • xSNMP for APC NetBotz Reports
  • xSNMP for Brocade Reports
  • xSNMP for Check Point Secure Platform Reports
  • xSNMP for Cisco Reports
  • xSNMP for Data Domain Reports
  • xSNMP for Dell PowerEdge Reports
  • xSNMP for HP ProCurve Reports
  • xSNMP for IBM AIX Reports
  • xSNMP for Juniper Networks Reports
  • xSNMP for Juniper-NetScreen Reports
  • xSNMP for NetApp Reports
  • xSNMP for Net-SNMP Reports
  • xSNMP for SonicWALL Reports

Requirements are: OpsMgr 2007 R2, the xSNMP suite, and an OpsMgr reporting implementation. Like the xSNMP suite, the reports MPs are licensed with the GNU-GPL and unsealed versions are provided.

xSNMP Version 1.1.1 (Alpha) Available Now!

I have posted the 1.1.1 (Alpha) version of the xSNMP suite at manage-x.net. While the updates in this release are relatively minor in nature, I would still consider this to be an early test copy of this version and recommend deploying it in test environments before production. The abbreviated change log for this release is:

New Management Packs:

  • xSNMP for APC NetBotz Management Pack
  • xSNMP for IBM AIX
  • xSNMP for Juniper Networks Management Pack
  • xSNMP for SonicWALL Management Pack

New Features:

  • Added support for monitoring of Net-SNMP Extend objects (xSNMP for Net-SNMP Management Pack)
  • Added a three state monitor for Net-SNMP Exec objects(xSNMP for Net-SNMP Management Pack)
  • New Cisco Firewall Subsystem monitoring for PIX and ASA firewalls
  • Added new data sources to the xSNMP MP to decrease data source redundancy
  • Individual inbound/outbound speeds can now be set through the Speed Override discovery in the xSNMP MP
  • All views are now publicly accessible

Issues Resolved:

  • Fixed an uncommon issue with network interface utilization calculations that could result in invalid values being calculated if a previous poll returned null data
  • Updated the list of Device OIDs for APC UPS devices to include missing UPS models
  • Fixed an issue with an incorrect OID specified for HP Proliant SCSI (IDA) storage health monitoring
  • Fixed an issue with the Cisco Default Gateway Changed alert-generating rule

Using PowerShell to Check for Syntax Errors in System.ExpressionFilter Modules (OpsMgr Authoring)

With its wide range of usefulness in implementing conditional logic, expression evaluation, and error filtering, the System.ExpressionFilter module is likely to be a frequently used module in most OpsMgr Management Pack authoring scenarios.   However, the default configuration for the System.ExpressionFilter module may lead to a potential syntax error that is quite easy to miss, in my opinion.  This potential syntax error relates to the default configuration of the ValueExpression elements of the ExpressionFilter, in that the ValueExpression defaults to an XPathQuery value type.   If this default is not changed when evaluating a non-XPathQuery value, the workflow (as well as all workflows cooked down with the faulted workflow) will fail.

To illustrate an oversight resulting in this error, the following ExpressionFilter configuration would result in a workflow failure:

<SimpleExpression>
   <ValueExpression>
        <XPathQuery>ErrorCode</XPathQuery>
   </ValueExpression>
   <Operator>Equal</Operator>
   <ValueExpression>
       <XPathQuery>1</XPathQuery>
    </ValueExpression>
</SimpleExpression>

The correct implementation of this Expression Filter module would be:

<SimpleExpression>
   <ValueExpression>
        <XPathQuery>ErrorCode</XPathQuery>
   </ValueExpression>
   <Operator>Equal</Operator>
   <ValueExpression>
       <Value>1</Value>
    </ValueExpression>
</SimpleExpression>

Unfortunately, this type of error is not caught by the MPBPA, most likely due to the difficulty in differentiating between a valid XPathQuery value and string value.  To perform some basic error checking to identify these errors when authoring, I have written a simple PowerShell script that analyzes the XML of an unsealed Management Pack and reports errors and potential errors with ValueExpression configuration in ExpressionFilter modules.  

The logic of the script assumes the following:

  • Any ValueExpression configured as an XPathQuery expression and having a value that starts with a dollar sign ($) can be assumed to be a configuration error
  • Any ValueExpression configured as an XPathQuery expression and having a value that does not start with a forward slash or the strings:  Property or ErrorCode might be misconfigured
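The first of these heuristics is simple enough to approximate with standard text tools. The following is not the PowerShell script described in this post, only a grep-based sketch of the same idea run against a made-up XML fragment; it assumes each XPathQuery element sits on a single line, and the class name My.Class is hypothetical.

```shell
#!/bin/sh
# Made-up fragment: one valid XPathQuery and one holding a $...$ variable.
mpfile=$(mktemp)
cat > "$mpfile" <<'EOF'
<SimpleExpression>
  <ValueExpression><XPathQuery>ErrorCode</XPathQuery></ValueExpression>
  <Operator>Equal</Operator>
  <ValueExpression><XPathQuery>$Target/Property[Type="My.Class"]/Name$</XPathQuery></ValueExpression>
</SimpleExpression>
EOF

# An XPathQuery whose value begins with a dollar sign is treated as an error.
errors=$(grep -c '<XPathQuery>[[:space:]]*\$' "$mpfile")
grep -n '<XPathQuery>[[:space:]]*\$' "$mpfile"
echo "Errors found: $errors"

rm "$mpfile"
```

A line-based check like this covers only the error case. The PowerShell script described next also reports the warning case and evaluates RegExExpression and DayTimeExpression nodes.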

The script accepts the path to the Management Pack XML file as the single input parameter and then searches the XML for RegExExpression, SimpleExpression, and DayTimeExpression XML nodes.  These nodes are evaluated for mismatches on XPathQuery expressions and mismatches are reported to the console.

The output of the script looks like:

PS D:\Development\SCOM> ./checkexpressionfilters.ps1 -MPFIle:"d:\development\scom\xsnmp\1.1.Dev\xSNMP.AIX.xml"
Evaluating d:\development\scom\xsnmp\1.1.Dev\xSNMP.AIX.xml 

Error – Found mismatched XPathQuery value with XML:
<XPathQuery>$Target/Property[Type="xSNMP.AIX.Processor"]/Name$</XPathQuery>

Error – Found mismatched XPathQuery value with XML:
<XPathQuery>$Target/Property[Type="xSNMP.AIX.Processor"]/Name$</XPathQuery>

Errors found: 2
Warnings found: 0

The script can be downloaded here.


Development Updates

While this blog has been a bit quiet lately, it has not been for lack of effort. In the next two weeks, a minor update to the xSNMP suite should be ready with a few bug fixes and feature improvements. This update will also include three new management packs:

  • xSNMP  for Juniper Networks
  • xSNMP for SonicWALL
  • xSNMP for APC NetBotz

We’re still finishing up testing and fine-tuning on the Oracle Unix/Linux MP as well.

These updates will be described in more detail here and available for download on manage-X.net soon.