November 23, 2011
UNIX and Linux agent monitoring in Operations Manager requires certificates to secure the SSL communication channel between the Management Servers and agents. In this post, I will provide some background information on this communication and the certificates, as well as describe an optional approach to replace the default Operations Manager certificate infrastructure with your organization’s Public Key Infrastructure.
The Operations Manager UNIX/Linux agent is a very lightweight agent implementation, comprising a CIM Object Manager (OpenPegasus) and CIM Providers. Unlike Operations Manager Windows agents, the UNIX/Linux agent doesn’t have a health service, and doesn’t run workflows locally. Rather, the Management Server (or Gateway) that manages the agent runs the workflows and remotely connects to the UNIX/Linux agent to retrieve current data.
There are two protocols involved in the communication between the Management Server and the UNIX/Linux agent: SSH and WS-Management.
SSH is used purely for agent maintenance activities, and is not used for any monitoring. Operations like agent installation, uninstallation, upgrade, or agent daemon restart (through a recovery task) are executed over SSH. SSH handles the transfer of files and the execution of remote commands for these operations, including when the agent daemon is unavailable.
WS-Management (or WSMan) is the core protocol used in UNIX/Linux monitoring. WSMan is a SOAP-based protocol for cross-platform management. All monitoring operations (e.g. enumerating CIM providers for data on file systems, memory, etc., executing commands/scripts for monitoring, and reading log files for monitoring) are implemented over WSMan. As WSMan is a web service protocol, the OpenPegasus-based CIMOM functions as a secure web server (user credentials are authenticated through PAM). This is where the agent certificate comes into play.
The UNIX/Linux agent certificate is quite simply used to secure the WSMan connection using SSL and provide authentication for the remote agent host. The requirements for this certificate are:
- The certificate is a server authentication certificate (Enhanced Key Usage: 1.3.6.1.5.5.7.3.1)
- The CN of the certificate matches the FQDN that the Management Server uses to connect to the agent
- The certificate is signed by a trusted authority (and can be checked for revocation)
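As a rough illustration of the second requirement, the sketch below compares a certificate's CN with the FQDN used to reach the agent. It assumes the certificate has already been decoded into the dictionary shape that Python's `ssl.SSLSocket.getpeercert()` returns after a verified handshake; the function names and the sample host name are hypothetical. It does not check the Enhanced Key Usage or the trust chain.

```python
from typing import Optional


def common_name(cert: dict) -> Optional[str]:
    """Return the first commonName found in the certificate subject, if any."""
    # getpeercert() represents the subject as a tuple of RDNs,
    # each RDN being a tuple of (attribute, value) pairs.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def cn_matches_fqdn(cert: dict, fqdn: str) -> bool:
    """Case-insensitive comparison of the certificate CN with the agent FQDN."""
    cn = common_name(cert)
    return cn is not None and cn.lower() == fqdn.rstrip(".").lower()


# Hypothetical example data, not taken from a real agent.
sample = {"subject": ((("commonName", "linux01.contoso.com"),),)}
print(cn_matches_fqdn(sample, "LINUX01.contoso.com"))  # True
```

A short host name such as `linux01` would not match here, which mirrors the situation where the name known to the agent host differs from the FQDN the Management Server uses.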
When the Operations Manager UNIX/Linux agent is installed, it generates a certificate (using openssl) in the directory /etc/opt/microsoft/scx/ssl. The file name of the certificate is scx-host-<hostname>.pem and the corresponding private key is named scx-key.pem. The agent actually looks for the certificate at /etc/opt/microsoft/scx/ssl/scx.pem, which is initially configured as a symbolic link pointing to scx-host-<hostname>.pem.
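To see which certificate file the agent is actually using, the symbolic link can be resolved on the agent host. This is a minimal sketch using only the Python standard library; the helper names are hypothetical and the default path is the scx.pem location mentioned above.

```python
import os

# Path the agent reads, as described in the text above.
SCX_CERT_LINK = "/etc/opt/microsoft/scx/ssl/scx.pem"


def active_certificate(link_path: str = SCX_CERT_LINK) -> str:
    """Return the real file that the agent's certificate path resolves to."""
    return os.path.realpath(link_path)


def points_to(link_path: str, expected_target: str) -> bool:
    """True if link_path is a symlink that resolves to expected_target."""
    return (
        os.path.islink(link_path)
        and os.path.realpath(link_path) == os.path.realpath(expected_target)
    )
```

Running `active_certificate()` on an agent host should return the scx-host-<hostname>.pem file until the link is pointed somewhere else.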
Upon initial agent installation, the certificate is not signed, and is not usable for securing the WSMan SSL communication.
Note: when initially creating the certificate, the agent attempts to determine the agent hostname for use as the CN value of the certificate. In cases where the DNS name known to the local host does not match the FQDN that OpsMgr will use to communicate with the agent, additional steps are required to establish a valid certificate. More information can be found here: http://technet.microsoft.com/en-us/library/dd891009.aspx
Certificates and Management Servers
When a Management Server discovers a UNIX or Linux agent, the server uses its certificate to sign the agent certificate, acting like a standalone Certificate Authority. In the discovery process, this actually involves securely transferring the certificate from the agent to the Management Server, signing it, copying it back to the agent, and restarting the agent daemon.
In order to move an agent between Management Servers, the new Management Server must trust the certificate that was used to sign the agent’s certificate. This becomes particularly important in the 2012 version of Operations Manager, where agents will move automatically between the Management Servers that are members of the Resource Pool managing the agent. For more information on the procedure to trust a server’s certificate from another server, review this document: http://technet.microsoft.com/en-us/library/hh287152.aspx.
Using a PKI Instead of Management Servers for Signing
Because the certificates used for securing the agent SSL channel are not proprietary, a separate Public Key Infrastructure can be used to manage the agent certificates, if the PKI option is appealing for your organization. While this requires some additional resources in the environment (a Certificate Authority) and customization, there are a few benefits to using a PKI:
- Certificate policies are controlled by the PKI and customizable to meet your organization’s security requirements
- Migrations of agents between Management Servers (within or between Resource Pools) can be done without exporting/importing Management Server certificates – simplifying the provisioning of Management Servers.
- More options exist for automation of agent deployment and certificate signing
The procedure to use a PKI instead of Management Server signed certificates varies with different requirements and environments, but I will describe the steps required for one example approach. This example assumes that the Certificate Authority is a Windows 2008 Certificate Authority.
- Configure the certificate template on the Certificate Authority – you can use the “Web Server” template or a copy of it – configure options and permissions, publish the template.
- Import the CA certificate from the signing CA to the trusted authorities list on every Management Server that will manage the UNIX/Linux agents:
  - certutil -f -config "<CAHostname>\<CAName>" -ca.cert <CACertFile>
  - certutil -addstore Root <CACertFile>
Per-agent steps:
1. Install the agent. This can be done through the OpsMgr Discovery Wizard, manually, or with another package distribution tool. If you use the OpsMgr Discovery Wizard to install the agent, the agent's certificate will be signed by the Management Server, but this can be replaced with your PKI CA signed certificate.
2. Generate a certificate signing request (CSR). Either create a new private key with OpenSSL or use the private key generated during the agent install.
   a. Command to generate a CSR using the key generated during agent install:
      openssl req -new -key /etc/opt/microsoft/scx/ssl/scx-key.pem -subj "/CN=<FQDN of agent host>" -text -out <OutputPath>
3. Copy the CSR back to a Windows machine.
4. Submit the CSR to the CA. This command assumes auto-enrollment is enabled and authorized:
   certreq.exe -submit -config "<CAHostName>\<CAName>" -attrib "CertificateTemplate:<TemplateName>" <CSR FileName> <OutputCertName>
5. Copy the signed certificate back to the UNIX/Linux agent using a secure copy method. If auto-enrollment was used in step 4, the value for <OutputCertName> specifies the file name of the signed certificate to copy to the agent.
6. Update the symbolic link /etc/opt/microsoft/scx/ssl/scx.pem to point to your new signed certificate.
7. Restart the agent: /opt/microsoft/scx/bin/tools/scxadmin -restart
8. Discover the agent using the Operations Console or PowerShell cmdlet.
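The per-agent steps above can be sketched in Python as small helpers that build the openssl and certreq command lines and repoint the scx.pem link. This is an illustration under assumptions, not a supported tool: the function names, host names, and file names are hypothetical, and the command arguments simply mirror the ones shown in the steps.

```python
import os

SCX_SSL_DIR = "/etc/opt/microsoft/scx/ssl"


def csr_command(fqdn, out_path, key_path=SCX_SSL_DIR + "/scx-key.pem"):
    """Build the openssl argv that creates a CSR from the agent's existing key."""
    return ["openssl", "req", "-new", "-key", key_path,
            "-subj", f"/CN={fqdn}", "-text", "-out", out_path]


def certreq_command(ca_host, ca_name, template, csr_file, cert_file):
    """Build the certreq argv that submits the CSR to a Windows CA."""
    return ["certreq.exe", "-submit",
            "-config", f"{ca_host}\\{ca_name}",
            "-attrib", f"CertificateTemplate:{template}",
            csr_file, cert_file]


def repoint_symlink(link_path, new_target):
    """Point link_path at new_target by creating a temporary link and renaming it.

    Renaming over the old link means the agent's certificate path
    never disappears while the change is made.
    """
    tmp = link_path + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_target, tmp)
    os.replace(tmp, link_path)


# Hypothetical values for illustration only.
print(" ".join(csr_command("linux01.contoso.com", "/tmp/linux01.csr")))
```

The argv lists can be passed to `subprocess.run(..., check=True)` on the appropriate machine: the openssl command and the link update run on the agent host, while certreq runs on a Windows machine that can reach the CA. The agent still needs to be restarted afterwards, as in step 7.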
Automation and Customization Opportunities
All of the per-agent steps described above can be executed from a command line, meaning that this procedure can be automated through scripting. Using a script on a Windows server, the UNIX/Linux commands and file copying actions can be executed with SSH utilities like PuTTY’s plink and pscp. For really robust automation capabilities, all of the steps can be implemented in a PowerShell script – I like the plink.exe integration example described on this blog: http://www.christowles.com/2011/06/how-to-ssh-from-powershell-using.html.
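As one possible shape for such a script, here is a minimal Python sketch that builds plink and pscp command lines and runs them with `subprocess`. It assumes plink.exe and pscp.exe are on the PATH and that the agent's SSH host key is already cached, since `-batch` disables interactive prompts. The helper names, user, host, and paths are all hypothetical.

```python
import subprocess


def plink_command(user, host, remote_command, key_file=None):
    """Build a plink argv that runs one command on the UNIX/Linux host."""
    argv = ["plink.exe", "-ssh", "-batch"]
    if key_file:
        argv += ["-i", key_file]  # PuTTY-format private key
    return argv + [f"{user}@{host}", remote_command]


def pscp_download(user, host, remote_path, local_path, key_file=None):
    """Build a pscp argv that copies a file from the agent to Windows."""
    argv = ["pscp.exe", "-batch"]
    if key_file:
        argv += ["-i", key_file]
    return argv + [f"{user}@{host}:{remote_path}", local_path]


def pscp_upload(user, host, local_path, remote_path, key_file=None):
    """Build a pscp argv that copies a file from Windows to the agent."""
    argv = ["pscp.exe", "-batch"]
    if key_file:
        argv += ["-i", key_file]
    return argv + [local_path, f"{user}@{host}:{remote_path}"]


def run(argv):
    """Run a command, raise on a non-zero exit code, and return its stdout."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `run(pscp_download("root", "linux01.contoso.com", "/tmp/linux01.csr", "linux01.csr"))` would fetch a CSR created on the agent, and `pscp_upload` would send the signed certificate back.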
Aside from the primary benefit of reducing manual interaction, this scripting approach opens up other customization opportunities. For example, if your DNS infrastructure and UNIX/Linux agent hostnames don’t neatly correlate, you could modify step 2 of the per-agent steps to also generate a new certificate with openssl using the desired FQDN as the certificate’s CN (http://technet.microsoft.com/en-us/library/dd891009.aspx). Alternatively, if you are using Operations Manager 2007 R2 and want to implement agent deployment and certificate signing using sudo elevation instead of root credentials, the UNIX/Linux host commands in the per-agent steps could be prepended with the sudo command (this functionality is built into the 2012 version of Operations Manager).
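The sudo variation could look something like the sketch below, which wraps a remote command string before it is handed to an SSH utility. This is an assumption-laden illustration: the `with_sudo` helper is hypothetical, it wraps the command in `sh -c` so that compound commands are elevated as a whole, and `sudo -n` only works if the account is allowed to run sudo without a password prompt.

```python
import shlex


def with_sudo(remote_command: str, elevate: bool = True) -> str:
    """Return the command unchanged, or wrapped so it runs under sudo."""
    if not elevate:
        return remote_command
    # -n makes sudo fail instead of prompting, which suits unattended scripts.
    return "sudo -n sh -c " + shlex.quote(remote_command)


print(with_sudo("/opt/microsoft/scx/bin/tools/scxadmin -restart"))
```

Prefixing the plain command with `sudo` would also work for simple cases; the `sh -c` wrapper is just one way to keep redirections and chained commands inside the elevated shell.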