UNIX/Linux MP Authoring – Discovering and Monitoring Failover Clusters
July 30, 2013
In my last post, I walked through creation of an MP with dynamic discovery for a UNIX/Linux application. In this post, I’ll continue to demonstrate the use of the UNIX/Linux Authoring Library examples for MP authoring, but I will take the demonstration quite a bit deeper into authoring territory – by creating an MP for monitoring of Linux failover clusters. While the base UNIX and Linux operating system Management Packs don’t have built-in detection/monitoring of failover clusters, MP Authoring can be used to build a robust cluster monitoring solution.
In this post, I will walk through authoring a basic MP for monitoring a Linux failover cluster. I have two goals for this post:
- Demonstrate the use of the UNIX/Linux Authoring Library for MP Authoring scenarios
- Demonstrate the creation of a basic cluster monitoring MP that can be modified to work with other cluster technologies and monitoring requirements
This is fairly involved MP Authoring, and is intended for the author with a bit of experience.
Note: the example MP described in this blog post (and the VSAE project) can be found in the \ExampleMPs folder of the UNIX/Linux Authoring Library .zip file.
Background
The MP I am building is intended to perform discovery and monitoring of Linux failover clusters, though it could certainly be adapted to work for other cluster technologies. Prior to starting on the MP implementation, I think it is useful to conceptually model the implementation.
Regardless of the specific technology, failover clusters tend to share the same general concepts. The entities that I want to represent are:
- Cluster nodes – hosts that participate in the failover cluster
  - Monitor for requisite daemons
  - Monitor for quorum state
- Cluster – a “group,” containing the member nodes as well as clustered resources
  - Roll-up monitors describing the total state of the cluster
- Service – a clustered service, such as a virtual IP or Web server
  - Monitor each service for availability
These conceptual elements will need to be described in Management Pack ClassTypes and corresponding RelationshipTypes. A basic diagram of my intended implementation looks like:
Tools and Commands
For both dynamic discovery of the cluster nodes, as well as monitoring of the cluster resource status, I leveraged the clustat utility.
As an example, the clustat output in my test environment, with two nodes and a single virtual IP address as a service, looks like:
    [monuser@clnode1 ~]$ sudo clustat
    Cluster Status for hacluster @ Tue Jul 23 19:24:47 2013
    Member Status: Quorate

     Member Name          ID   Status
     ------ ----          ---- ------
     clnode1                 1 Online, Local, rgmanager
     clnode2                 2 Online, rgmanager

     Service Name         Owner (Last)         State
     ------- ----         ----- ------         -----
     service:IP           clnode1              started
As you can see, this output can be parsed to discover the cluster, its nodes, and its services, and it also provides health information about each of them. Depending on the version, clustat may require elevated privileges (for example, via sudo). It has the advantage of being a status reporting tool that is not used for cluster configuration, which makes it a useful command line tool for monitoring purposes.
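To make the parsing idea concrete, here is a minimal Python sketch that splits clustat output like the sample above into cluster, member, and service records. It only illustrates the approach and is not how the example MP itself is implemented. The function name `parse_clustat`, the `SAMPLE` text (re-spaced from the output above), and the dictionary layout are all assumptions made for this sketch, and the column layout of clustat may differ between versions.

```python
import re

# Sample text based on the clustat output shown in this post; the exact
# column spacing is an assumption and may vary between clustat versions.
SAMPLE = """\
Cluster Status for hacluster @ Tue Jul 23 19:24:47 2013
Member Status: Quorate

 Member Name          ID   Status
 ------ ----          ---- ------
 clnode1                 1 Online, Local, rgmanager
 clnode2                 2 Online, rgmanager

 Service Name         Owner (Last)         State
 ------- ----         ----- ------         -----
 service:IP           clnode1              started
"""


def parse_clustat(text):
    """Parse clustat-style text into a dict (hypothetical helper)."""
    result = {"cluster": None, "quorate": False, "members": [], "services": []}
    section = None  # tracks which table we are currently reading
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the dashed separator rows.
        if not line or line.startswith("---"):
            continue
        match = re.match(r"Cluster Status for (\S+)", line)
        if match:
            result["cluster"] = match.group(1)
        elif line.startswith("Member Status:"):
            result["quorate"] = line.split(":", 1)[1].strip() == "Quorate"
        elif line.startswith("Member Name"):
            section = "members"
        elif line.startswith("Service Name"):
            section = "services"
        elif section == "members":
            parts = line.split(None, 2)  # name, id, comma-separated flags
            if len(parts) == 3 and parts[1].isdigit():
                flags = [flag.strip() for flag in parts[2].split(",")]
                result["members"].append(
                    {"name": parts[0], "id": int(parts[1]), "flags": flags}
                )
        elif section == "services":
            parts = line.split(None, 2)  # name, owner, state
            if len(parts) == 3:
                result["services"].append(
                    {"name": parts[0], "owner": parts[1], "state": parts[2]}
                )
    return result


status = parse_clustat(SAMPLE)
print(status["cluster"], status["quorate"])
for member in status["members"]:
    print(member["name"], "Online" in member["flags"])
```

The same record structure maps naturally onto the entities described earlier: the cluster name and quorum state, one record per node, and one record per service with its current state.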
Creating the MP
Getting Started
Just like with my previous example, starting off on this MP involves creating a new Management Pack project with the VSAE: