SCOM: WSH Vs. PowerShell Modules in Composite Workflows – Resource Utilization in SNMP Data Manipulation

One of the realities of working with SNMP monitoring is that more often than not, the monitoring data are presented in a raw form that requires some kind of manipulation in order to render meaningful output.  For example, required data manipulation may be a simple arithmetic operation on two values to calculate a percentage, or in the case of Counter data, mathematical operations based on the delta between values recorded in multiple polling cycles.  In Operations Manager, these manipulations require exiting the realm of managed code and utilizing script-based modules to perform the operations or facilitate temporary storage of values from previous polling cycles.  Two sets of modules are available for the Operations Manager –supported scripting engines: WSH and PowerShell.  To date, I had been opting to use VB scripts when authoring Management Packs for two reasons: 1) WSH is universally deployed in Windows environments whereas PowerShell is not necessarily so – by using VB scripts, there is no requirement to install Power Shell on proxy agents 2) I had assumed that the resource utilization impact of PowerShell was equal or greater than that of WSH.   I had assumed that PowerShell would carry a heavier impact based on the simple notion that if one were to watch process resource utilization when simply launching powershell.exe and cscript.exe, powershell.exe consumes more memory and CPU time (assuming WSH 5.7 is installed).  

The resource utilization of these script providers becomes a major concern particularly when implementing script-based modules in SNMP monitoring scenarios.   To illustrate this point, if a proxy agent were configured to proxy SNMP requests for 10 Cisco switches, with each of these switches having an average of 20 interfaces discovered, and each interface monitored with two monitors that utilize a script probe action to manipulate the raw SNMP data (e.g. collisions and octets), 400 scripts would be executed in a single polling cycle for just the interface monitors for this small scale monitoring scenario.  This poses a threat to the scalability of SNMP monitoring and could severely limit the number of devices/objects a single proxy agent can handle effectively.  

In the course of trying to find a way to address this scalability issue, I was fortunate enough to communicate with someone possessing a great deal of insight into Operations Manager who helpfully suggested that the PowerShell modules should be more efficient than WSH-based modules in composite workflows.  I rewrote all of the scripts in the Cisco MP to convert them from VB Script to PowerShell and began some testing.  I was familiar with the tighter integration of PowerShell in R2 modules (PS scripts no longer have to be launched as external commands), but to be honest, I was expecting to see a large number of powershell.exe processes spawned as the monitors fired.   However, this is not the case.  Rather, it looks to me that the modules are executing the PowerShell script through the .NET framework within the context of the monitoringhost.exe process.   This does appear to be more efficient overall, as the overhead associated with spawning new processes is effectively eliminated, and my impressions thus far are that CPU utilization overall is reduced.

However, switching from WSH scripts to PowerShell scripts in R2 workflows is a little bit of jumping from the frying pan and into the fire in that, instead of spawning a large number of processes each consuming relatively small amounts of processor and memory resources, the PowerShell script modules drive a single process (monitroinghost.exe) to consume a large quantity of resources, particularly CPU cycles.   Overall, memory utilization looks a lot better with the PowerShell modules, and although CPU utilization does seem to be better, it is still a concern for scalability. 

Thus far, I have been doing this performance testing in a development environment, with OpsMgr running on some virtual machines on both on workstation and older server class hardware, neither of which provide a good indication of real-world scalability (particularly given the fact that I have these VM’s running SQL, all OpsMgr duties, and SNMP simulations to boot).  On one of these woefully over-utilized VM’s, something around 130-150 interfaces on 10 monitored Cisco devices seemed to be the breaking point, but a more realistic OpsMgr deployment scenario (segregated database, RMS, and MS duties) on physical hardware should be able to handle far more than that.   I will report an update once I get a chance to do some broader scalability testing with the PowerShell version of the MP on more appropriate hardware. 

In summary, both the WSH and PowerShell probe and write action modules introduce a relatively heavy CPU load when utilized for data manipulation – relative to the very simple operations required to manipulate SNMP data, and a managed code module would be far more desirable, if available.  However, at present, these two providers are the only supported mechanisms for handling data that require processing before returning to a rule or monitor.   My testing thus far appears to support the assertion that R2 implements the PowerShell modules more efficiently than the WSH-based modules, which is welcome news, given the relative ease and impressive flexibility of scripting with PowerShell.  I’ve seen a bit of talk that PowerShell V2 is supposed to bring significant performance improvements over V1, and I hope to do some testing with the CTP version of V2 on an OpsMgr proxy agent in the very near future to see if it helps address any of the scalability challenges in SNMP monitoring with OpsMgr.  As for the best approach for the present, it looks like PowerShell is the way to go, and the overall impact on the MS/proxy agents can be mitigated by spreading monitored objects across multiple proxy agents, focusing discovery to only those objects which are required to be monitored (i.e. interfaces), and avoiding overly-aggressive scheduling of monitors.

Advertisements

SCOM: Monitoring the WSH Version on Windows Agents

Given the severe performance issues that can be caused by SCOM monitoring on hosts without the Windows Script Host 5.7, and the possibility that WSH 5.7 binaries can be replaced with older versions by Windows File Protection (http://support.microsoft.com/default.aspx/kb/955360), I’ve found it useful to use a SCOM unit monitor to monitor managed agents for the expected WSH version (5.7 or later).  

The script that I’ve written for this purpose first checks the OS Caption with WMI (to exclude 64bit hosts from the check) and then checks the version of cscript.exe using a WSH FileSystemObject. 

set oFSO = CreateObject(“Scripting.FileSystemObject”)
 sFileVersion=oFSO.GetFileVersion(sWinDir & “\system32\cscript.exe”)

To deploy this as a unit monitor, create a two-state timed script monitor.   Set the unhealthy state expression to:  Property[@Name=’status’] equals Error and the healthy state expression to Property[@Name=’status’] equals OK.

A message with a description of the problem and the current cscript.exe version can be added to the alert with the  $Data/Context/Property[@Name=”Message”]$  Xpath string.