OpsMgr 2012 UNIX/Linux Authoring Templates: Shell Command

Many of the OpsMgr authoring examples for UNIX/Linux monitoring that I have described on this blog are based on the use of the WSMan Invoke modules to execute shell commands. This is a really powerful mechanism to extend the capabilities of Operations Manager monitoring, and the 2012 version of Operations Manager includes a new set of templates allowing the creation of rules, monitors, and tasks using UNIX/Linux shell commands directly from the Authoring pane of the console.

The new templates are:

Monitors

  • UNIX/Linux Shell Command Three State Monitor
  • UNIX/Linux Shell Command Two State Monitor

Rules

  • UNIX/Linux Shell Command (Alert)
  • UNIX/Linux Shell Command (Performance)

Tasks

  • Run a UNIX/Linux Shell Command

Note: For the OpsMgr 2012 Release Candidate, the Shell Command Template MP needs to be downloaded and imported.  In the final release, it will be imported by default.

Underneath the covers, all of these templates use the ExecuteShellCommand method of the agent’s script provider with the WSMan Invoke module. This method executes the command and outputs StdOut, StdErr, and ReturnCode. The command can be a path to a simple command, a command or script existing on the managed computer, or a “one-liner” script (a shell script condensed to one line with pipes and semi-colons).  The templates also allow you to select whether to run with the nonprivileged action account, or the privileged account (which also supports sudo elevation).
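
To make the "one-liner" idea concrete, here is a minimal sketch of such a command and the three values it produces. The run_shell_command helper is hypothetical: it only imitates, in a local shell, the StdOut, StdErr, and ReturnCode values described above, and it is not how the agent's script provider is implemented.

```shell
# Hypothetical helper (an assumption for illustration, not agent code):
# runs a one-liner through sh and captures its output, errors, and exit code.
run_shell_command() {
    err_file=$(mktemp)
    STDOUT=$(sh -c "$1" 2>"$err_file") && RETURN_CODE=0 || RETURN_CODE=$?
    STDERR=$(cat "$err_file")
    rm -f "$err_file"
}

# A one-liner: a pipeline plus extra statements separated by semicolons.
run_shell_command 'printf "up\nup\ndown\n" | grep -c up; echo "1 node down" >&2; exit 1'
echo "StdOut=$STDOUT StdErr=$STDERR ReturnCode=$RETURN_CODE"
```

The same shape applies to any command you would hand to the templates: whatever is written to standard output is the value you evaluate, and the exit code tells you whether the command itself succeeded.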

If you’ve done this kind of UNIX/Linux authoring in 2007 R2, you will quickly see how much easier and faster this can be done in 2012.

To show the use of these templates, I have put together an MP authoring walkthrough for monitoring BIND DNS servers on Linux. This entire MP will be created in the Operations Console, with no XML editing!

Walkthrough: Monitoring BIND on Linux



OpsMgr 2012 UNIX/Linux Authoring Templates: Process Monitoring

In Operations Manager, custom rules and monitors can be used to extensively build on the out-of-the-box Management Pack contents. Unfortunately, this kind of custom authoring for UNIX/Linux monitoring carried a steep learning curve with OpsMgr 2007 R2. However, the 2012 release of Operations Manager has some new features to enable many common UNIX/Linux authoring scenarios using templates, directly from the console.  The first of these new templates I wanted to cover is the new process monitoring template.

UNIX/Linux Process Monitoring Template

Operations Manager 2007 R2 included the Unix Service Monitoring template for custom monitoring of daemons on UNIX and Linux agents.   This template has been replaced in the System Center 2012 release of Operations Manager with the far more capable UNIX/Linux Process Monitoring template.   The new UNIX/Linux Process Monitoring template allows more flexibility in process/daemon monitoring, including the ability to monitor for minimum and maximum process count thresholds, and the ability to filter processes on arguments in addition to the process name. For this example, I will walk through the use of the UNIX/Linux Process Monitoring template to monitor a Tomcat daemon.
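
To illustrate why filtering on arguments matters, here is a rough shell sketch of the kind of check the template performs. It is not the template's implementation: the count_by_name_and_args helper, the sample process listing, and the thresholds are all made up for illustration. A Tomcat daemon typically runs as a process named java, so the process name alone cannot tell it apart from other Java processes.

```shell
# Hypothetical helper: reads "name arguments..." lines on stdin and counts
# the processes whose name matches exactly and whose line matches a pattern.
count_by_name_and_args() {
    awk -v name="$1" -v pattern="$2" \
        '$1 == name && $0 ~ pattern { n++ } END { print n + 0 }'
}

# Made-up process listing standing in for real ps output.
sample_listing='java -Dcatalina.base=/opt/tomcat org.apache.catalina.startup.Bootstrap start
java -jar /opt/other/app.jar
sshd -D'

count=$(printf '%s\n' "$sample_listing" | count_by_name_and_args java 'catalina')

# Assumed thresholds: exactly one Tomcat instance is expected.
min=1; max=1
if [ "$count" -lt "$min" ] || [ "$count" -gt "$max" ]; then
    echo "Tomcat process count $count is outside $min..$max"
else
    echo "Tomcat process count $count is within $min..$max"
fi
```

On a live Linux system the listing could come from something like ps -e -o comm= -o args= instead of the canned sample.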

The UNIX/Linux Process Monitoring template is accessible in the Authoring Pane of the Operations Console.   It can be launched with the “Add Monitoring Wizard” task under the Management Pack Templates view.


OpsMgr: UNIX/Linux Heartbeat Failures After Applying KB2585542

The OpsMgr UNIX/Linux monitoring team at Microsoft is currently investigating an issue that results in heartbeat failures on Operations Manager UNIX/Linux agents after the security update KB2585542 is applied to a Management Server or Gateway.  This update fixes a vulnerability in SSL/TLS1.0, but appears to cause WS-Management connections to UNIX/Linux agents to fail. 

The vulnerability is described in bulletin MS12-006, and more information can be found in the KB article.  While we continue to investigate options for resolving this issue, there are two viable workarounds (which must be applied to all Management Servers and Gateways that manage UNIX/Linux agents):

  1. Uninstall the update KB2585542 
  2. Make a registry modification to disable the SecureChannel changes implemented in the update

Note: the registry modification described here and in the KB article effectively disables the security fix that the update implements, so the modified system is subject to the same vulnerability as an unpatched system.

Modifying the registry to disable the SecureChannel changes:

  • A “FixIt” package is available in the KB article under the Known Issues section that can be used to disable the security update
  • Alternatively, you can add the 32-bit DWORD value SendExtraRecord = 2 under the registry key:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL

These changes take effect immediately and do not require a reboot.

Operations Manager UNIX/Linux Agent Certificates (and using a PKI)

Introduction

UNIX and Linux agent monitoring in Operations Manager requires certificates to secure the SSL communication channel between the Management Servers and agents.  In this post, I will provide some background information on this communication and the certificates, as well as describe an optional approach to replace the default Operations Manager certificate infrastructure with your organization’s Public Key Infrastructure.

The Protocols

The Operations Manager UNIX/Linux agent is a very lightweight agent implementation, comprising a CIM Object Manager (OpenPegasus) and CIM Providers.   Unlike Operations Manager Windows agents, the UNIX/Linux agent doesn’t have a health service, and doesn’t run workflows locally.  Rather, the Management Server (or Gateway) that manages the agent runs the workflows and remotely connects to the UNIX/Linux agent to retrieve current data.  

There are two protocols involved in the communication between the Management Server and the UNIX/Linux agent:  ssh and WS-Management.   

Ssh is used purely for agent maintenance activities, and is not used for any monitoring.   Operations like agent installation, uninstallation, upgrade, or agent daemon restart (through a recovery task) are executed over ssh.    Ssh facilitates the transfer of files and execution of remote commands for these operations when the agent daemon is unavailable.  

WS-Management (or WSMan) is the core protocol used in UNIX/Linux monitoring.   WSMan is a SOAP-based protocol for cross-platform management.   All monitoring operations (e.g. enumerating CIM providers for data on file systems, memory, etc., execution of commands/scripts for monitoring, executing log file reads for monitoring) are implemented over WSMan.   As WSMan is a web service protocol, the OpenPegasus-based CIMOM functions as a secure web server (user credentials are authenticated through PAM).  This is where the agent certificate comes into play.

The Certificate

The UNIX/Linux agent certificate is quite simply used to secure the WSMan connection using SSL and provide authentication for the remote agent host.   The requirements for this certificate are:

  • The certificate is a server authentication certificate (Enhanced Key Usage: 1.3.6.1.5.5.7.3.1)
  • The CN of the certificate matches the FQDN that the Management Server uses to connect to the agent
  • The certificate is signed by a trusted authority (and can be checked for revocation)

When the Operations Manager UNIX/Linux agent is installed, it generates a certificate (using openssl) in the directory /etc/opt/microsoft/scx/ssl.  The file name of the certificate is scx-host-<hostname>.pem and the corresponding private key is named scx-key.pem.   The agent actually looks for the certificate at /etc/opt/microsoft/scx/ssl/scx.pem, which is initially configured as a symbolic link pointing to scx-host-<hostname>.pem.

Upon initial agent installation, the certificate is not signed, and is not usable for securing the WSMan SSL communication.

Note:  when initially creating the certificate, the agent attempts to determine the agent hostname for use as the CN value of the certificate.   In cases where the DNS name known to the local host does not match the FQDN that OpsMgr will use to communicate with the agent, additional steps are required to establish a valid certificate.  More information can be found here: http://technet.microsoft.com/en-us/library/dd891009.aspx
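
A quick way to spot a CN mismatch is to compare the certificate subject with the FQDN that the Management Server will use. The sketch below is only an illustration: the subject_cn and cn_matches_fqdn helpers and the host names are made up, and the parsing assumes a subject line in one of the forms that openssl x509 -noout -subject prints.

```shell
# Hypothetical helper: print the last CN found in a "subject=" line on stdin.
# Handles both "CN = host" and "/CN=host" styles of subject output.
subject_cn() {
    sed -n 's|.*CN *= *\([^,/]*\).*|\1|p'
}

# Compare the CN with the expected FQDN, ignoring case.
cn_matches_fqdn() {  # $1 = subject line, $2 = FQDN used by the Management Server
    cn=$(printf '%s\n' "$1" | subject_cn | tr '[:upper:]' '[:lower:]')
    fqdn=$(printf '%s\n' "$2" | tr '[:upper:]' '[:lower:]')
    [ -n "$cn" ] && [ "$cn" = "$fqdn" ]
}

# Made-up subject line; on an agent it could come from:
#   openssl x509 -noout -subject -in /etc/opt/microsoft/scx/ssl/scx.pem
subject='subject=DC = example, CN = web01, CN = web01.example.com'
if cn_matches_fqdn "$subject" 'web01.example.com'; then
    echo "CN matches the FQDN"
else
    echo "CN mismatch: the certificate needs to be regenerated with the right name"
fi
```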

Certificates and Management Servers

When a Management Server discovers a UNIX or Linux agent, the server uses its certificate to sign the agent certificate, acting like a standalone Certificate Authority.  In the discovery process, this actually involves securely transferring the certificate from the agent to the Management Server, signing it, copying it back to the agent, and restarting the agent daemon.  

In order to move an agent between Management Servers, the new Management Server must trust the certificate that was used to sign the agent’s certificate.  This becomes particularly important in the 2012 version of Operations Manager, where agents will move automatically between the Management Servers that are members of the Resource Pool managing the agent.  For more information on the procedure to trust a server’s certificate from another server, review this document: http://technet.microsoft.com/en-us/library/hh287152.aspx.

Using a PKI Instead of Management Servers for Signing

Because the certificates used for securing the agent SSL channel are not proprietary, a separate Public Key Infrastructure can be used to manage the agent certificates, if the PKI option is appealing for your organization.  While this requires some additional resources in the environment (a Certificate Authority) and customization, there are a few benefits to using a PKI: 

  • Certificate policies are controlled by the PKI and customizable to meet your organization’s security requirements
  • Migrations of agents between Management Servers (within or between Resource Pools) can be done without exporting/importing Management Server certificates – simplifying the provisioning of Management Servers.
  • More options exist for automation of agent deployment and certificate signing

The procedure to use a PKI instead of Management Server signed certificates varies with different requirements and environments, but I will describe the steps required for one example approach.  This example assumes that the Certificate Authority is a Windows 2008 Certificate Authority. 

Prerequisites:

  1. Configure the certificate template on the Certificate Authority – you can use the “Web Server” template or a copy of it – configure options and permissions, publish the template.
  2. Import the CA certificate from the signing CA to the trusted authorities list on every management server that will manage the UNIX/Linux agents:
    1. certutil -f -config "<CAHostname>\<CAName>" -ca.cert <CACertFile>
    2. certutil -addstore Root <CACertFile>

Per-Agent steps:

  1. Install the agent – this can be done through the OpsMgr Discovery Wizard, manually, or with another package distribution tool.  If you use the OpsMgr Discovery Wizard to install the agent, the agent will generate a certificate that is signed by the management server, but this can be replaced with your PKI CA signed certificate.
  2. Generate a certificate signing request (CSR): either create a new private key with OpenSSL or use the private key generated during the agent install
    1. Command to generate a CSR using the key generated during agent install:
      openssl req -new -key /etc/opt/microsoft/scx/ssl/scx-key.pem -subj /CN=<FQDN of agent host> -text -out <OutputPath>
  3. Copy the CSR back to a Windows machine
  4. Submit the CSR to the CA (this command assumes auto-enrollment is enabled and authorized):
    1. certreq.exe -submit -config <CAHostName>\<CAName> -attrib "CertificateTemplate:<TemplateName>" <CSR FileName> <OutputCertName>
  5. Copy the signed cert back to the UNIX/Linux agent using a secure copy method.  If auto-enrollment was used in step 4, the value for <OutputCertName> specifies the file name of the signed certificate to copy to the agent.
  6. Update the symbolic link /etc/opt/microsoft/scx/ssl/scx.pem to point to your new signed certificate
  7. Restart the agent:  /opt/microsoft/scx/bin/tools/scxadmin -restart
  8. Discover the agent using the Operations Console or PowerShell Cmdlet
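
The symbolic link update in the per-agent steps is the part most easily gotten wrong, so here is a minimal sketch of it. The repoint_agent_cert helper and the certificate file names are hypothetical, and the example deliberately runs against a scratch directory rather than a live agent's /etc/opt/microsoft/scx/ssl directory.

```shell
# Hypothetical helper: point scx.pem at a newly signed certificate.
# $1 = ssl directory, $2 = file name of the signed certificate inside it.
repoint_agent_cert() {
    ssl_dir=$1
    signed_cert=$2
    [ -f "$ssl_dir/$signed_cert" ] || { echo "missing $ssl_dir/$signed_cert" >&2; return 1; }
    # Replace the existing link; a relative target is assumed here.
    ( cd "$ssl_dir" && ln -sf "$signed_cert" scx.pem )
}

# Dry run in a scratch directory that mimics the agent's layout.
demo_dir=$(mktemp -d)
touch "$demo_dir/scx-host-myhost.pem" "$demo_dir/scx-pki-signed.pem"
( cd "$demo_dir" && ln -s scx-host-myhost.pem scx.pem )

repoint_agent_cert "$demo_dir" scx-pki-signed.pem
ls -l "$demo_dir/scx.pem"
rm -rf "$demo_dir"
```

On a real agent, the restart with scxadmin -restart would follow, as in the steps above.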

Automation and Customization Opportunities

All of the per-agent steps described above can be executed from a command line, meaning that this procedure can be automated through scripting.  Using a script on a Windows server, the UNIX/Linux commands and file copying actions can be executed with SSH utilities like PuTTY’s plink and pscp.  For really robust automation capabilities, all of the steps can be implemented in a PowerShell script – I like the plink.exe integration example described on this blog: http://www.christowles.com/2011/06/how-to-ssh-from-powershell-using.html.

Aside from the primary benefits of automating these steps in terms of reducing manual interactions, other customization opportunities are exposed with using this scripting approach.  For example, if your DNS infrastructure and UNIX/Linux agent hostnames don’t neatly correlate, you could modify step 2 of the per-agent steps to also generate a new certificate with openssl using the desired FQDN as the certificate’s CN (http://technet.microsoft.com/en-us/library/dd891009.aspx).  Alternatively, if you are using Operations Manager 2007 R2 and want to implement agent deployment and certificate signing using sudo elevation instead of root credentials, the UNIX/Linux host commands in the per-agent steps could be prepended with the sudo command (this functionality is built into the 2012 version of Operations Manager).

 

 

Operations Manager Releases

In case you missed it…

Operations Manager 2012 Beta

http://www.microsoft.com/systemcenter/en/us/om-vnext-beta.aspx

Operations Manager 2012 Beta is available.   Exciting new features for UNIX/Linux monitoring include:

  • Discovery Wizard:  All new Discovery Wizard, making it easier to deploy and discover UNIX and Linux agents
  • Sudo support:  Privileged operations (monitoring and agent maintenance) can now be performed without root privileges by using a non-privileged credential and sudo elevation.
  • SSH Key support:  Agent maintenance operations (via SSH) can be authenticated with an SSH key instead of password
  • High Availability for UNIX/Linux agent monitoring:   Resource Pools implement highly-available UNIX/Linux monitoring with automatic failover and load distribution.
  • RHEL 6 and AIX 7.1 support

Operations Manager 2007 R2 CU5

http://support.microsoft.com/kb/2495674

Operations Manager 2007 R2 Cumulative Update 5 is also now available.   Fixes for UNIX/Linux monitoring include:

  • Performance data for LVM managed volumes not available
  • Process monitoring does not keep name if run by using symbolic link
  • AIX with large number of running processes crashes with bad alloc

CU5 also implements RHEL 6 support for Operations Manager 2007 R2.

 

Operations Manager – Extending UNIX/Linux Monitoring with MP Authoring – Part IV

Introduction

In Part III of this series, I walked through creation of data sources, a discovery, and a rule for discovering dynamically-named log files and implementing an alert-generating rule for log file monitoring.  In this post, I will continue to expand this Management Pack to implement performance collection rules, using WSMan Invoke methods to collect numerical performance data from a shell command. 

Using Shell Commands to Collect Performance Data

Whether it is system performance data from the /proc or /sys file systems, or application performance metrics in other locations, performance data for UNIX and Linux systems can often be found in flat files.   In this example Management Pack, I wanted to demonstrate using a WSMan Invoke module with the script provider to gather a numeric value from a file and publish the data as performance data.   In many cases, this would be slightly more complex than is represented in this example (e.g. if the performance metric value should be the delta between data points in the file over time), but this example should provide the framework for using the contents of a file to drive performance collection rules.   The root of these workflows is a shell command that uses cat to read the file; its output can be piped to grep, awk, and sed to filter for specific lines and columns.

Additionally, if the performance data (e.g. hardware temperature or fan speed, current application user or connection count) that you are looking for is not stored in a file, but available in the output of a utility command, the same method could be used by using the utility command instead of cat.
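
As a rough sketch of this pattern, the helper below filters a stream with grep, awk, and sed to produce a single numeric value. Everything here is an assumption for illustration: the extract_metric helper, the "name value" file layout, and the metric names are made up and are not the format MyApp uses.

```shell
# Hypothetical helper: reads "name value" lines on stdin and prints the
# numeric value for one metric, stripping any unit suffix.
extract_metric() {
    grep "^$1 " | awk '{ print $2 }' | sed 's/[^0-9.]//g'
}

# Stand-in for an application metrics file (made-up names and values).
perf_file=$(mktemp)
printf 'mem_used_kb 2048\nconnections 17\nfan_speed 1200rpm\n' > "$perf_file"

# Same shape as the rules described here: cat the file, then filter.
cat "$perf_file" | extract_metric connections
cat "$perf_file" | extract_metric fan_speed
rm -f "$perf_file"
```

If the metric comes from a utility rather than a file, only the left side of the pipe changes; the filtering stays the same.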

Collecting Performance Data from a File

In this example, the MyApp application stores three performance metrics in flat files in the subdirectory ./perf.   I have built three rules that cat these files, and map the values to performance data.  The three rules are functionally identical, so I will only describe one of them.

Performance Collection Rule:  MyApp.Monitoring.Rule.CollectMyAppMem


Operations Manager – Extending UNIX/Linux Monitoring with MP Authoring – Part III

Introduction

In Part II of this series, I walked through creation of data sources, a discovery, a monitor type, and a monitor for customized “Process Count” monitoring for discovered instances of a “Service” class. In this post, I will continue to build on this example MP to implement dynamic log file discovery and monitoring.

Dynamic Log File Discovery and Monitoring

Log file monitoring of a single known log file can be easily implemented with the Microsoft.Unix.SCXLog modules, but in some cases, the full path to a log file isn’t static.   For example, if an application maintains multiple log files in a directory, the file name portion of the log file path may not be known ahead of time.    To handle this monitoring scenario, we can implement dynamic log file discovery – using a shell command execution, and then pass the full path of the log file to the SCXLog module for standard log file monitoring. This requires a new class instance, a discovery data source, a discovery rule, and a rule that actually implements the log file monitoring.

Defining the Log File Class and Hosting Relationship

Firstly, a new custom class is required to represent the log file objects.   Instances of this class will be discovered by the discovery rule.  

Definition 

  • ID:  MyApp.Monitoring.LogFile
  • Base Class:  Microsoft.Unix.ApplicationComponent
  • Name:  MyApp Log file

Properties

  • Name (String)
  • Path (String) – Key

The properties for the log file class represent the file name and full path.   The full path is assured to be unique, so I have specified that as the key property of the class.

The log file class needs to be hosted by the MyApp class, to maintain the relationship between the log files and the application.  

Discovery Data Source:  MyApp.Monitoring.DataSource.DiscoverLogFiles

This data source will use the MyApp.Monitoring.DataSource.ShellCommandDiscovery probe action to find files in a given directory that match a pattern.   The output from this command execution will then be passed to a Microsoft.Windows.PowerShellDiscoveryProbe.   The reason that I am using a PowerShellDiscoveryProbe is that the listing of matched files will be returned as a single data item, the StdOut from the command.   Using a PowerShellDiscoveryProbe provides an easy way to split each line from the output and discover an instance per line. 

Configuration Parameters:

  • Interval (integer):  Scheduler interval in seconds – overridable
  • TargetSystem (string):  UNIX/Linux agent computer to execute the discovery
  • Appname (string):   The name of the application object (which is the key property for the hosting class instance)
  • LogFileNamePattern (string): The pattern that will be used in the grep operation to identify log files to discover
  • LogFilepath (string):  The path to search for log files at (via an ls command)

Member Modules:

The first member module is a MyApp.Monitoring.DataSource.ShellCommandDiscovery probe action, which executes the following command:
ls $Config/LogFilepath$ |grep $Config/LogFileNamePattern$
This simply enumerates the contents of the specified directory path and pipes the results to grep to match a specified pattern, which could be a string match or regular expression.

Module Configuration:

<Interval>$Config/Interval$</Interval>
<TargetSystem>$Config/TargetSystem$</TargetSystem>
<ShellCommand>ls $Config/LogFilepath$ |grep $Config/LogFileNamePattern$</ShellCommand>
<Timeout>120</Timeout>
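
To see what this command returns before wiring it into a module, it can be tried in a local shell. The sketch below is illustrative only: the discover_log_files helper and the file names are made up, and the per-line handling is done in shell here purely to show the idea of one discovered instance per output line. One detail worth knowing is that plain grep uses basic regular expressions, where + is a literal character, so the sketch writes "one or more digits" as [0-9][0-9]*.

```shell
# Hypothetical helper: list files in a directory whose names match a basic
# regular expression, printing one full path per line.
discover_log_files() {  # $1 = directory, $2 = pattern passed to grep
    ls "$1" | grep "$2" | while IFS= read -r name; do
        if [ -n "$name" ]; then
            printf '%s/%s\n' "$1" "$name"
        fi
    done
}

# Scratch directory with made-up file names; only log1 and log22 should match.
log_dir=$(mktemp -d)
touch "$log_dir/log1" "$log_dir/log22" "$log_dir/app.log" "$log_dir/catalog7"
discover_log_files "$log_dir" '^log[0-9][0-9]*'
rm -rf "$log_dir"
```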

The output of this shell command then needs to be parsed so that each valid line in the output is discovered as an instance of a log file object.   This is most easily done with a PowerShellDiscoveryProbe:

param ([string]$CmdOutput, [string]$AppName, [string]$LogFilePath, [string]$TargetSystem, [string]$SourceID, [string]$ManagedEntityID)

# Load the scripting API and create the discovery data container
$api = New-Object -comObject 'Mom.ScriptAPI'
$discoveryData = $api.CreateDiscoveryData(0, $SourceID, $ManagedEntityID)

if ($CmdOutput -ne $null){
    # Flatten the output (one file name per line) into a space-delimited list.
    # Output from a UNIX/Linux host may use LF line endings, so handle those as well.
    $CmdOutput = $CmdOutput.Replace([Environment]::NewLine, " ").Replace("`n", " ")
    [array]$arList = $CmdOutput.Split(" ")
    $arList | ForEach-Object{
        [string]$sFile = $_
        # Skip empty entries
        if([int]$sFile.Length -ge [int]1){
            $sFilePath = $LogFilePath + "/" + $sFile
            $oInst = $discoveryData.CreateClassInstance("$MPElement[Name='MyApp.Monitoring.Logfile']$")
            $oInst.AddProperty("$MPElement[Name='MyApp.Monitoring.Logfile']/Name$", $sFile)
            $oInst.AddProperty("$MPElement[Name='System!System.Entity']/DisplayName$", $sFile)
            $oInst.AddProperty("$MPElement[Name='MyApp.Monitoring.Logfile']/Path$", $sFilePath)
            # Key properties of the hosting application and computer
            $oInst.AddProperty("$MPElement[Name='MyApp.Monitoring.MyApp']/Name$", $AppName)
            $oInst.AddProperty("$MPElement[Name='MicrosoftUnixLibrary!Microsoft.Unix.Computer']/PrincipalName$", $TargetSystem)

            $discoveryData.AddInstance($oInst)
        }
    }
    # Return the discovery data to the workflow
    $discoveryData
}

Remove-variable api
Remove-variable discoveryData

The PowerShell script loads the Mom.ScriptAPI, creates a Discovery Data instance, and then walks through each line of the output.   If the file name is a valid string (not empty), a class instance is created for the MyApp.Monitoring.Logfile class, and the path and file name properties are set.   The PrincipalName property of the Microsoft.Unix.Computer object and the AppName property of the MyApp.Monitoring.MyApp class are included in the DiscoveryData, so that the discovery mapping process can map the hosting relationships.

Parameters are passed from the module configuration to the script using the Parameters XML fragment in the module configuration:

<Parameters>
  <Parameter>
    <Name>TargetSystem</Name>
    <Value>$Config/TargetSystem$</Value>
  </Parameter>
  <Parameter>
    <Name>AppName</Name>
    <Value>$Config/Appname$</Value>
  </Parameter>
  <Parameter>
    <Name>LogFilePath</Name>
    <Value>$Config/LogFilepath$</Value>
  </Parameter>
  <Parameter>
    <Name>CmdOutput</Name>
    <Value>$Data///*[local-name()="StdOut"]$</Value>
  </Parameter>
  <Parameter>
    <Name>ManagedEntityID</Name>
    <Value>$Target/Id$</Value>
  </Parameter>
  <Parameter>
    <Name>SourceID</Name>
    <Value>$MPElement$</Value>
  </Parameter>
</Parameters>

This data source can then be used to discover log files matching a pattern, in a specified directory.  

Discovery Rule:  MyApp.Monitoring.Discovery.LogFile

This discovery finds dynamically-named log files in a specified path, using a regular expression to filter by file name.   It discovers instances of the MyApp.Monitoring.LogFile class, and uses the MyApp.Monitoring.DataSource.DiscoverLogFiles data source.  The discovery targets MyApp.Monitoring.MyApp.

Data Source Configuration:

  • <Interval>14400</Interval>
  • <TargetSystem>$Target/Host/Property[Type="MicrosoftUnixLibrary!Microsoft.Unix.Computer"]/PrincipalName$</TargetSystem>
  • <Appname>$Target/Property[Type="MyApp.Monitoring.MyApp"]/Name$</Appname>
  • <LogFileNamePattern>'^log[0-9][0-9]*'</LogFileNamePattern>
  • <LogFilepath>$Target/Property[Type="MyApp.Monitoring.MyApp"]/InstallPath$/logs</LogFilepath>

The two parameters to note are the LogFilepath (which is defined as the application path discovered for the MyApp application, appended with "/logs") and the LogFileNamePattern (which is a regular expression, '^log[0-9][0-9]*', that will match log files named logxxx, where xxx is a number).  Because the pattern is passed to plain grep, which uses basic regular expressions and treats + as a literal character, "one or more digits" is written as [0-9][0-9]* rather than [0-9]+.

Monitoring the Discovered Log Files

Log File Monitoring Rule:   MyApp.Monitoring.Rule.AlertOnLogError

Now that the dynamically-named log files will be discovered, we need a rule to alert when an error is found in one of the logs.   The rule will target all instances of the MyApp.Monitoring.LogFile class, so that when a new log file instance is discovered, it is automatically monitored.  The rule uses the MicrosoftUnixLibrary!Microsoft.Unix.SCXLog.Privileged.Datasource (assuming the log files require privileged credentials to access).

Data source configuration:

  • <Host>$Target/Host/Host/Property[Type="MicrosoftUnixLibrary!Microsoft.Unix.Computer"]/NetworkName$</Host>
  • <LogFile>$Target/Property[Type="MyApp.Monitoring.Logfile"]/Path$</LogFile>
  • <RegExpFilter>^.*(e|E)rror.*$</RegExpFilter>

The discovered path to the logfile instance is input as the LogFile parameter value, and a regular expression (^.*(e|E)rror.*$) is defined to match any log entries with the string error or Error in the message.
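
It can be useful to try a filter like this against a few sample lines before deploying the rule. The sketch below assumes that grep -E behaves closely enough to the matching done by the SCXLog module, which I have not verified; the filter_errors helper and the sample log lines are made up.

```shell
# Hypothetical helper: apply the rule's RegExpFilter to lines on stdin.
# grep -E is an assumption standing in for the module's regex matching.
filter_errors() {
    grep -E '^.*(e|E)rror.*$' || true
}

# Made-up log lines: only the first two should pass the filter.
printf '%s\n' \
    'Connection error while contacting database' \
    'Error: disk quota exceeded' \
    'ERROR: all-uppercase is not matched by this filter' \
    'Startup complete' | filter_errors
```

The third sample line shows the main limitation of the pattern: it is case-sensitive apart from the first letter, so an all-uppercase ERROR would need its own alternative in the expression.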

Condition Detection configuration:

A System!System.Event.GenericDataMapper condition detection is then configured to map the data to EventData, for consumption by OpsMgr.  The configuration of this module is:

  • <EventOriginId>$MPElement$</EventOriginId>
  • <PublisherId>$MPElement$</PublisherId>
  • <PublisherName>MyApp</PublisherName>
  • <Channel>Application</Channel>
  • <LoggingComputer>$Target/Host/Host/Property[Type="MicrosoftUnixLibrary!Microsoft.Unix.Computer"]/NetworkName$</LoggingComputer>
  • <EventNumber>8001</EventNumber>
  • <EventCategory>0</EventCategory>
  • <EventLevel>1</EventLevel>
  • <UserName/>
  •  <Params/>
  •  </ConditionDetection>

Write Actions:

In this rule, I have configured two write actions, for collecting the event, and generating an alert.  The CollectEvent (SC!Microsoft.SystemCenter.CollectEvent) module requires no additional configuration, and the alert can be configured to provide details about the logged error message.

 

Stay tuned for more in this series…

Operations Manager – Extending UNIX/Linux Monitoring with MP Authoring – Part II

Introduction

In Part I of this series, I walked through creation of a custom Management Pack for monitoring an application hosted on a UNIX or Linux server, as well as the creation of some base data sources and application discovery.   In this post, I will build on this MP to implement custom process monitoring – monitoring the count of instances of a running daemon/process to check that the count is within a range.   While the standard process monitoring provider (SCX_UnixProcess) is the best source for process information in OpsMgr UNIX and Linux monitoring, it does not support this level of customized monitoring.

Advanced Service Monitoring

Continuing this custom application monitoring scenario, our hypothetical app has a single daemon associated with the app, but we will build the classes and data sources so that they could easily be extended to add more services/daemons to monitor.    In this example, we can suppose that we want to monitor a daemon that may have multiple instances running, and drive an alert if too many or too few instances of that process are running.   This monitoring will be implemented by using the ps command in a WSMan Invoke module.   To implement monitoring of a daemon for a discovered, custom application, there are two approaches that are viable:
 
  1. Define a custom service class, and discover an instance of this class for each service to monitor, configure monitor types and monitors targeting this class
  2. Create a monitor for each service to monitor, targeting the custom application class

Both methods are completely viable, and in most cases, it is appropriate to take the simpler approach and target the custom monitors to the application, providing static inputs into the monitor.   There are some cases where discovering a class instance for the service makes sense though.  Facilitating dynamic discovery of services or thresholds (read from a config file), using the service class in a Distributed Application model in OpsMgr, or maintaining logical separation (in terms of monitoring) between the application and its subsystems are all scenarios that would benefit from discovering the monitored services as class instances.   For the purpose of illustration, I will discover the daemon to monitor in this example Management Pack as a class instance.

Class Definition

Class:  MyApp.Monitoring.Service

Definition

  • ID:  MyApp.Monitoring.Service
  • Base Class:  Microsoft.Unix.ApplicationComponent
  • Name:  MyApp Service

Properties

  • Name (String) – Key
  • MinRunning (Integer)
  • MaxRunning (Integer)
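
The MinRunning and MaxRunning properties bound the acceptable number of running instances. As a minimal sketch of the check they drive, the shell below counts instances and maps the count to a state. The count_instances and evaluate_count helpers, the state names, and the daemon name are all hypothetical; the monitor type in the MP does this evaluation with its own modules.

```shell
# Hypothetical helper: count lines on stdin that exactly equal the daemon name.
count_instances() {
    grep -c -x -- "$1" || true
}

# Hypothetical helper: map a count to a state using Min/Max bounds.
evaluate_count() {  # $1 = count, $2 = MinRunning, $3 = MaxRunning
    if [ "$1" -lt "$2" ]; then
        echo "TooFew"
    elif [ "$1" -gt "$3" ]; then
        echo "TooMany"
    else
        echo "Healthy"
    fi
}

# Canned process names stand in for the output of a ps command.
count=$(printf 'myappd\nmyappd\nsshd\nmyappd\n' | count_instances myappd)
evaluate_count "$count" 1 2
```

With three instances counted against bounds of 1 and 2, the sketch reports the "too many" state; lowering the count below MinRunning would report the "too few" state instead.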

Discovery

Then we can define the data source to discover a service.   In this case, we know the name of the service and the value of the properties, so we don’t need to actually poll the agent to return data.   We can simply combine a Discovery Scheduler with a Discovery Data Mapper module to implement the data source.  However, we want to be able to override the values of MinRunning and MaxRunning, so these will need to be exposed as overridable configuration parameters.

Therefore, I’ve chosen to implement this data source in two parts.   The first data source will simply combine a System.Discovery.Scheduler module and a System.Discovery.ClassSnapshotDataMapper module.   This data source will accept Interval, ClassId and InstanceSettings parameters as inputs.  The second data source will reference the first data source, but implement parameters for Service Name, MinRunning, and MaxRunning.    By breaking this into two data sources, the first data source can be used for other simple discoveries.

Discovery Data Source:  MyApp.Monitoring.DataSource.DiscoverObject

This is the data source that simply combines a scheduler and a discovery data mapper.  It requires that the MapperSchema be added to the Configuration:

<Configuration>
<IncludeSchemaTypes>
<SchemaType>
 System!System.Discovery.MapperSchema
</SchemaType>
</IncludeSchemaTypes>
…

Operations Manager – Extending UNIX/Linux Monitoring with MP Authoring – Part I

Introduction

The OpsMgr UNIX and Linux monitoring implementation can be extended through MP authoring to implement robust system and application monitoring for UNIX/Linux servers.   The most direct mechanism of extension comes in the form of the script provider, accessed with WSMan Invoke modules.   The WSMan Invoke modules support three methods of invoking actions:

  • ExecuteCommand – execute a command (e.g. a script already on the file system ) and return the results
  • ExecuteShellCommand – execute a command through sh (with pipeline support) and return the results
  • ExecuteScript  – download and execute an embedded script and return the results

Of these three methods, I prefer to use ExecuteShellCommand in most cases, as it allows for the use of complex one-liner shell commands, embedded in the MP.

In a series of posts, I will describe the creation of an example Management Pack for monitoring an application, featuring dynamic application discovery, discovery of multiple log files, and advanced monitoring implementations.

Example Application Details

The example MP described in these blog posts implements monitoring for a hypothetical application (MyApp).  The application involves a daemon, a set of log files, and application performance counters where the metrics are accessible as the contents of files.

Part I – Discovering an Application

Setting up the MP

I am a big fan of the R2 Authoring Console and will be using it to create this example MP.   The first step then is to create a new MP in the Authoring Console (ID:  MyApp.Monitoring).    Once the MP is created and saved, references are needed.   References I am adding are:

  • Microsoft.Unix.Library – contains UNIX/Linux classes and modules
  • Microsoft.SystemCenter.DataWarehouse.Library – required for publishing performance data to the DW
  • System.Image.Library – contains icon images referenced in class definition

Configuring the Base Composite Modules

Announcements!

In October 2010, I started employment with Microsoft, where I am working as a Program Manager on the System Center Operations Manager Cross-Platform team.  This is an opportunity that I am very excited about and I am working with a team that I am really proud to be a part of.  I will continue to blog at this site, and I will focus primarily on OpsMgr Cross-Platform topics. 

Additionally, I am excited to announce some new developments regarding the xSNMP and Manage-X for Oracle Management Pack projects.   After developing these Management Pack projects as a solo effort for some time, these projects are now being converted to community-driven projects.  Pete Zerger, of SystemCenterCentral.com, has assembled a fantastic team of MP authoring experts, with a remarkable depth of OpsMgr knowledge,  to drive continued development of these management packs.   

The xSNMP and Manage-X for Oracle MP projects are now homed at:  http://xsnmp.codeplex.com.   General download, source code, documentation, and discussion can be found at this CodePlex site.