Finishing the Oracle SCX Management Pack for OpsMgr Cross-Platform Agents
May 9, 2010 17 Comments
I am in the process of putting the finishing touches on the Oracle SCX Management Pack that I have described in several recent posts (part 1, part 2, part 3, part 4) and will soon be moving on to finishing the documentation for the management pack. This post is intended to provide a preview of the finished MP and highlight the monitoring implemented by this MP. I intend to make this management pack freely available in the near future, but I would like to get some more testing done before posting it for download. In the meantime, if anyone is interested in testing this management pack, please leave a comment or contact me and I will be glad to provide an early beta version.
While completing this management pack has taken me longer than I had anticipated, this authoring process has been a great learning experience for me in an area a bit outside of my comfort zone. I also like to think that this management pack is a good representation of just how flexible the OpsMgr SCX extensions are in that I was able to implement all of the discovery and monitoring in the management pack using the basic SCX agent CIM providers without any additional scripts or compiled code on the agents.
Screenshots
Alert View
(Looks like it’s time to grow some table spaces)
Rules and Monitors
At present, the management pack implements 51 performance collection rules, 6 alert generating rules, and 26 monitors. The current inventory of rules and monitors in the management pack is listed below.
Alert Generating Rules
- Alert on Oracle SCX Archive Hung Error
Alert generating rule that alerts when an ORA-00257 or ORA-16038 error is logged in the Oracle Alert Logs, indicating an Archive Hung condition
- Alert on Oracle SCX Data Block Corruption Error
Alert generating rule that alerts when an ORA-01157, ORA-01578, or ORA-27048 error is logged in the Oracle Alert Logs, indicating Data Block corruption
- Alert on Oracle SCX Deadlock Error
Alert generating rule that alerts when an ORA-00060 deadlock error is logged in the Oracle Alert Logs
- Alert on Oracle SCX Media Failure Errors
Alert generating rule that alerts when an ORA-15130, ORA-15049, ORA-15050, or ORA-15051 error is logged in the Oracle Alert Logs, indicating a Media Failure error
- Alert on Oracle SCX ORA 600 Error
Alert generating rule that alerts when an ORA-00600 Internal Error is logged
- Alert on Oracle SCX Auto Execute Job Failure
Alert generating rule that alerts when an ORA-12012 error is logged in the Oracle Alert Logs, indicating an error on the auto execution of a job
Event Collection Rules
- Collect Oracle SCX Instance Alert Log Events
Collects Oracle Alert Log events that match the Regular Expression Filter (default filter: .*(ORA-|error).*)
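To make the filter and the alert rules a little more concrete, here is a minimal Python sketch of how an alert log line could be tested against the default filter and then mapped to the ORA codes listed above. This is only an illustration and not code from the management pack: the function names and category labels are made up for the example, and the matching is case-sensitive because that is how Python treats the pattern as written, which may differ from how the OpsMgr module evaluates it.

```python
import re

# Default filter quoted in the event collection rule description.
DEFAULT_FILTER = re.compile(r".*(ORA-|error).*")

# ORA codes grouped the way the alert generating rules describe them.
# The category labels are shorthand for this sketch, not names from the MP.
ALERT_CATEGORIES = {
    "archive_hung": {"ORA-00257", "ORA-16038"},
    "data_block_corruption": {"ORA-01157", "ORA-01578", "ORA-27048"},
    "deadlock": {"ORA-00060"},  # ORA-00060 is Oracle's deadlock error code
    "media_failure": {"ORA-15130", "ORA-15049", "ORA-15050", "ORA-15051"},
    "internal_error": {"ORA-00600"},
    "job_failure": {"ORA-12012"},
}

# Oracle error codes are written as ORA- followed by five digits.
ORA_CODE = re.compile(r"ORA-\d{5}")


def is_collected(line):
    """Return True if the line matches the default collection filter."""
    return DEFAULT_FILTER.match(line) is not None


def classify(line):
    """Return the sorted alert categories whose ORA codes appear in the line."""
    codes = set(ORA_CODE.findall(line))
    return sorted(
        name for name, members in ALERT_CATEGORIES.items() if codes & members
    )
```

For example, classify("ORA-00257: archiver error") returns the archive_hung category, while a line with no ORA code returns an empty list even if the collection filter still picks it up because it contains the word "error".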
Performance Collection Rules
- Collect Oracle SCX Database Blocking Lock Count
Performance collection rule to collect the current count of Blocking Locks in an Oracle Database
- Collect Oracle SCX Database Free Space MB
Performance collection rule to collect the Free Space of an Oracle Database in Megabytes
- Collect Oracle SCX Database Size MB
Performance collection rule to collect the size of an Oracle Database in Megabytes
- Collect Oracle SCX Data File Average IO Time
Performance collection rule to collect the Average IO Time (in hundredths of a second) for an Oracle Data File
- Collect Oracle SCX Data File Size MB
Performance collection rule to collect the size in Megabytes of an Oracle Data File object
- Collect Oracle SCX Active Transaction Count
Performance collection rule to collect the count of Active Transactions for an Oracle Instance
- Collect Oracle SCX Buffer Busy Waits
Performance collection rule to collect the current count of Buffer Busy Waits for an Oracle Instance
- Collect Oracle SCX Buffer Cache Hit Ratio
Performance collection rule to collect the Buffer Cache Hit Ratio value for an Oracle Instance
- Collect Oracle SCX Dictionary Cache Hit Ratio
Performance collection rule to collect the Dictionary Cache Hit Ratio value for an Oracle Instance
- Collect Oracle SCX Disk Sort Ratio
Performance collection rule to collect the ratio of sorts executed using temporary disk segments to sorts in memory for an Oracle Instance
- Collect Oracle SCX Dispatcher Wait Time
Performance collection rule to collect the current Dispatcher Wait Time, in hundredths of a second, for the dispatcher queue with the highest wait time
- Collect Oracle SCX Dispatcher Workload Percent Busy
Performance collection rule to collect the value of the Dispatcher Workload Percent Busy (Dispatcher Busy/Total * 100) for an Oracle Instance
- Collect Oracle SCX Failed Login Count 1 Hour
Performance collection rule for the count of Oracle failed logins in the past 1 hour
- Collect Oracle SCX Latch Hit Ratio
Performance collection rule to collect the Latch Hit Ratio (%) value for an Oracle Instance
- Collect Oracle SCX Library Cache Hit Ratio
Performance collection rule to collect the Library Cache Hit Ratio (%) value for an Oracle Instance
- Collect Oracle SCX Open Cursor Count
Performance collection rule to collect the current count of open cursors for an Oracle Instance
- Collect Oracle SCX Percent CPU Used by Parsing
Performance collection rule to collect the Percentage of Oracle CPU usage used by Parsing for an Oracle Instance
- Collect Oracle SCX Percent CPU Used by Sessions
Performance collection rule to collect the Percentage of Oracle CPU usage used by Sessions for an Oracle Instance
- Collect Oracle SCX Percent Locks Resource Limit Used
Performance collection rule for the current used percentage of the DML_Locks resource limit for an Oracle instance
- Collect Oracle SCX Percent Processes Resource Limit Used
Performance collection rule for the current used percentage of the Processes resource limit for an Oracle instance
- Collect Oracle SCX Percent Sessions Resource Limit Used
Performance collection rule for the current used percentage of the Sessions resource limit for an Oracle instance
- Collect Oracle SCX Percent Waits – Commit
Performance collection rule to collect the Percent Waits value for Commit waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX Percent Waits – Network
Performance collection rule to collect the Percent Waits value for Network waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX Percent Waits – System I/O
Performance collection rule to collect the Percent Waits value for System I/O waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX Percent Waits – User I/O
Performance collection rule to collect the Percent Waits value for User I/O waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX Percent Wait Time – Commit
Performance collection rule to collect the Percent Wait Time value for Commit waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX Percent Wait Time – Network
Performance collection rule to collect the Percent Wait Time value for Network waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX Percent Wait Time – System I/O
Performance collection rule to collect the Percent Wait Time value for System I/O waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX Percent Wait Time – User I/O
Performance collection rule to collect the Percent Wait Time value for User I/O waits for an Oracle Instance. Valid for Oracle versions 10g and later.
- Collect Oracle SCX PGA Cache Hit Percentage
Performance collection rule to collect the PGA Cache Hit Percentage for an Oracle Instance
- Collect Oracle SCX PGA In Use MB
Performance collection rule to collect the PGA In Use Megabytes for an Oracle Instance
- Collect Oracle SCX PGA Percent Used
Performance collection rule to collect the PGA Percent Used value for an Oracle Instance
- Collect Oracle SCX PGA Process Count
Performance collection rule to collect the PGA Process Count for an Oracle Instance
- Collect Oracle SCX PGA Total Allocated MB
Performance collection rule to collect the PGA Total Allocated Megabytes for an Oracle Instance
- Collect Oracle SCX Physical Read KB
Performance collection rule to collect the Kilobytes Physically Read (per polling cycle) for an Oracle Instance
- Collect Oracle SCX Physical Write KB
Performance collection rule to collect the Kilobytes Physically Written (per polling cycle) for an Oracle Instance
- Collect Oracle SCX Redo Log Space Request Ratio
Performance collection rule to collect the ratio of Redo Log Space Requests to Redo Log Entries, as a percentage, for an Oracle Instance
- Collect Oracle SCX Redo Log Buffer Retry Ratio
Performance collection rule to collect the Redo Log Buffer Retry Ratio (redo buffer allocation retries/redo entries) as a percentage for an Oracle Instance
- Collect Oracle SCX Redo Log Space Requests
Performance collection rule to collect the count of Redo Log Space Requests (per polling cycle) for an Oracle Instance
- Collect Oracle SCX Rollback Segment Waits Ratio
Performance collection rule to collect the Rollback Segment Waits Ratio (Rollback Segment Waits:Rollback Segment Gets, as a percentage) value for an Oracle Instance
- Collect Oracle SCX Session Count
Performance collection rule to collect the current count of Sessions for an Oracle Instance
- Collect Oracle SCX SGA Large Pool Free MB
Performance collection rule to collect the Free Memory of the SGA Large Pool in Megabytes
- Collect Oracle SCX SGA Large Pool Size MB
Performance collection rule to collect the size of the SGA Large Pool in Megabytes
- Collect Oracle SCX SGA Shared Pool Free MB
Performance collection rule to collect the Free Memory of the SGA Shared Pool in Megabytes
- Collect Oracle SCX SGA Shared Pool Size MB
Performance collection rule to collect the size of the SGA Shared Pool in Megabytes
- Collect Oracle SCX Sorts – Disk
Performance collection rule for the number of Sorts in Disk (per polling cycle)
- Collect Oracle SCX Sorts – Memory
Performance collection rule for the number of Sorts in Memory (per polling cycle)
- Collect Oracle SCX Process Percent Used Memory KB
Performance collection rule to collect the memory used by an Oracle Process in KB. Process identifiers are dynamically discovered every four hours. If an Oracle process stops and is restarted, data collection may miss intervals until the new PID is discovered.
- Collect Oracle SCX Process Percent User CPU Utilization
Performance collection rule to collect the percentage of User CPU time used by an Oracle Process. Process identifiers are dynamically discovered every four hours. If an Oracle process stops and is restarted, data collection may miss intervals until the new PID is discovered.
- Collect Oracle SCX Table Space Percent Free Space
Performance collection rule to collect the Percent Free Space value for an Oracle Table Space object
- Collect Oracle SCX Table Space Size MB
Performance collection rule to collect the current size of an Oracle Table Space object in Megabytes
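Several of the rules above report a ratio as a percentage or a count "per polling cycle". As a rough illustration of the arithmetic, the following Python sketch shows one way those values could be derived from raw counters. The function names, the zero-denominator handling, and the counter-reset handling are all assumptions made for this example; the management pack itself does this work through the SCX agent providers rather than with a script like this.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage (0.0 if denominator is 0)."""
    if denominator == 0:
        return 0.0
    return 100.0 * numerator / denominator


def dispatcher_percent_busy(busy, total):
    """Dispatcher Workload Percent Busy: Dispatcher Busy / Total * 100."""
    return percent(busy, total)


def redo_log_space_request_ratio(space_requests, redo_entries):
    """Redo Log Space Requests relative to Redo Log Entries, as a percentage."""
    return percent(space_requests, redo_entries)


def redo_log_buffer_retry_ratio(allocation_retries, redo_entries):
    """Redo buffer allocation retries relative to redo entries, as a percentage."""
    return percent(allocation_retries, redo_entries)


def rollback_segment_waits_ratio(waits, gets):
    """Rollback Segment Waits relative to Rollback Segment Gets, as a percentage."""
    return percent(waits, gets)


def per_cycle_delta(previous, current):
    """Change in a cumulative counter between two polling cycles.

    Assumption for this sketch: if the counter went backwards (for example
    after an instance restart), the current value is reported as the delta.
    """
    if current >= previous:
        return current - previous
    return current
```

With these helpers, a dispatcher that was busy for 30 out of 120 time units reports 25.0, and a cumulative read counter that moves from 100 to 160 between polls reports 60 for that cycle.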
Monitors
Target: Oracle SCX Database
- Oracle SCX Archive Log Destination Error
Monitors the count of defined archive log destinations that are in an ERROR state, and generates an alert if any are found
- Oracle SCX Archive Log Destination Full
Monitors the count of defined archive log destinations that are in a FULL state, and generates an alert if any are found
- Oracle SCX Archive Log Destination Quota Utilization
Monitors the count of valid Oracle archive log destinations that are currently using > 90% of their defined size quota, and generates an alert if any are found
- Oracle SCX Flash Recovery Area Quota Utilization
Monitors the percent utilization of the Flash Recovery Area quota and generates an alert if the percent quota utilization is high
- Oracle SCX Segments Nearing Max Extents
Monitors the count of segments with a count of Extents within 3 of the Segment Max Extent value and generates an alert if any are found
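The count-based database monitors above all follow the same pattern: count the objects that meet a condition and alert if the count is greater than zero. A minimal Python sketch of the "Segments Nearing Max Extents" check is shown below. The Segment type and function names are hypothetical and exist only for this example; the margin of 3 comes from the monitor description.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical record for this sketch; not a type from the management pack.
    name: str
    extents: int
    max_extents: int


def segments_nearing_max_extents(segments, margin=3):
    """Return names of segments whose extent count is within `margin` of the max."""
    return [s.name for s in segments if s.max_extents - s.extents <= margin]


def should_alert(segments, margin=3):
    """Alert if any segment is nearing its max extents value."""
    return len(segments_nearing_max_extents(segments, margin)) > 0
```

The archive log destination monitors could be sketched the same way, with the condition swapped for an ERROR state, a FULL state, or quota utilization above 90%.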
Target: Oracle SCX File System
- Oracle SCX File System Percent Free Space
Monitors the percentage of free space for a File System that Oracle depends on (e.g. data file, archive log destination, etc.), and generates an alert if the free space is lower than the threshold
Target: Oracle SCX Instance
- Oracle SCX Job Queue Broken Jobs Count Increase
Monitor that checks an Oracle Instance job queue for ‘broken’ jobs and generates an alert if the count of broken jobs increases
- Oracle SCX Instance TNS Ping Status
Monitors the status of a tnsping test against an Oracle instance and generates an alert if the response is not ‘OK’
- Oracle SCX Percent Locks Resource Limit Used
Monitor that checks the current percent utilization of the configured DML_Locks resource limit for an Oracle Instance and generates an alert if the warning or critical thresholds are exceeded
- Oracle SCX Percent Processes Resource Limit
Monitor that checks the current percent utilization of the configured Processes resource limit for an Oracle Instance and generates an alert if the warning or critical thresholds are exceeded
- Oracle SCX Percent Sessions Resource Limit Used
Monitor that checks the current percent utilization of the configured Sessions resource limit for an Oracle Instance and generates an alert if the warning or critical thresholds are exceeded
- Oracle SCX High CPU Use by Parsing
Monitors the percentage of CPU used by SQL parsing and generates an alert if the threshold is exceeded
- Oracle SCX High Redo Log Space Request Ratio
Monitors the ratio of Redo Log Space Requests to Redo Log Entries and generates an alert if the threshold (percentage) is exceeded
- Oracle SCX Long Running Blocking Lock
Monitor that checks an Oracle instance for blocking locks that have been running for longer than the defined threshold, in seconds
- Oracle SCX Open Cursors High
Monitors the count of open cursors for an Oracle instance and generates an alert if the threshold is exceeded
- Oracle SCX Redo Log Buffer Retry Ratio High
Monitor that checks the ratio of redo log buffer retries to redo log buffer entries and generates an alert if the ratio is above the threshold
- Oracle SCX Recent Failed Login Count
Monitor that calculates the count of failed logins in the past hour and generates an alert if the threshold is exceeded
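Two patterns show up repeatedly in the instance monitors above: a three-state check against warning and critical thresholds (the resource limit monitors), and a check for an increase between polling cycles (the broken jobs monitor). The Python sketch below illustrates both. The function names and state labels are invented for this example, and the threshold values used in the usage note are placeholders rather than the defaults that ship in the management pack.

```python
def threshold_state(percent_used, warning, critical):
    """Map a percent-used value to a health state.

    A threshold counts as exceeded only when the value is strictly greater
    than it, which is how this sketch reads "thresholds are exceeded".
    """
    if percent_used > critical:
        return "critical"
    if percent_used > warning:
        return "warning"
    return "healthy"


def broken_jobs_increased(previous_count, current_count):
    """Return True if the broken job count grew since the previous poll."""
    return current_count > previous_count
```

With hypothetical thresholds of 80 (warning) and 90 (critical), a Sessions resource limit at 85% used evaluates to "warning". A broken job count that stays at 2 between polls does not alert, while a change from 2 to 3 does.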
Target: Oracle SCX Listener
- Oracle SCX Listener Status
Monitors the status of a discovered Oracle listener with ‘lsnrctl status’ and generates an alert if the status is not OK
Target: Oracle SCX Processes
- Oracle SCX Instance ORA_CKPT Process Monitor
Monitor that checks for the existence of a running ORA_CKPT process for the Oracle Instance
- Oracle SCX Instance ORA_LGWR Process Monitor
Monitor that checks for the existence of a running ORA_LGWR process for the Oracle Instance
- Oracle SCX Instance ORA_MMAN Process Monitor
Monitor that checks for the existence of a running ORA_MMAN process for the Oracle Instance
- Oracle SCX Instance ORA_PMON Process Monitor
Monitor that checks for the existence of a running ORA_PMON process for the Oracle Instance
- Oracle SCX Instance ORA_SMON Process Monitor
Monitor that checks for the existence of a running ORA_SMON process for the Oracle Instance
Target: Oracle SCX Table Space
- Oracle SCX Table Space Percent Space Used
Monitors the percentage of used space of an Oracle Database Table Space and generates an alert if the percent space used is high
- Oracle SCX Table Space Free Space Deficit
Monitors for segments that will be unable to fit their next extent in the Oracle Database Table Space due to limited free space
- Oracle SCX Table Space Rapid Space Utilization Change
Monitors the percent space utilization of an Oracle Database Table Space over multiple polling cycles and generates an alert if a rapid decline in free space is detected
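The "Rapid Space Utilization Change" monitor compares samples across several polling cycles rather than testing a single value. One way to picture that is a sliding window over recent percent-free samples, as in the Python sketch below. The class name, the window size of 3 cycles, and the 10-point drop threshold are all assumptions for illustration; the actual monitor's logic and defaults may differ.

```python
from collections import deque


class RapidDeclineDetector:
    """Flag a rapid drop in percent free space across recent polling cycles."""

    def __init__(self, window=3, max_drop=10.0):
        # window and max_drop are hypothetical defaults for this sketch.
        self.samples = deque(maxlen=window)
        self.max_drop = max_drop

    def add(self, percent_free):
        """Record a sample; return True if free space fell too fast.

        No decision is made until the window is full. After that, the oldest
        sample in the window is compared with the newest one.
        """
        self.samples.append(percent_free)
        if len(self.samples) < self.samples.maxlen:
            return False
        return (self.samples[0] - self.samples[-1]) > self.max_drop
```

Feeding the detector 40, 38, and then 25 percent free trips it on the third sample, because free space fell by 15 points within the window. A slow decline such as 40, 39, 38 does not.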
Hi Kris,
I’m very glad to see you creating a management pack to monitor Oracle on UNIX/Linux, as I have been looking for this kind of MP for the last few months. We have a few Oracle servers running on Linux, Red Hat Enterprise V4.
I would be very grateful if you could let me do some beta testing with this MP in my environment. Please let me know where I can download this MP for testing.
Thanks
Chama
Hi Chama, have you tried this already?
Great work.
Can I have a beta release of this MP?
Thanks
Thanks for the comments, and I will take both of you up on the offer to beta test. I am hard at work adding RAC and ASM support to the MP and hope to be finished with that soon. I don’t want to release the beta until that is done.
Thanks again,
Kris
Hi Kris,
did you ever finish that management pack with RAC and ASM support?
I’m very interested.
Regards,
Peter
Hi Kris,
Can I also get this MP for testing?
Thank you.
I appreciate all of your hard work.
Hi Kris!
Thanks for your great work, we appreciate that very (!) much!
Just one question: when do you think this MP will be RTM?
Thanks again,
Patrick
Hi Kris,
Man, that’s a really, really nice job!!
I really need to take a look at this MP. Can you please email me the download link?
thanks
Rob
robson_infra@hotmail.com
Looking forward to testing this MP. Keep up the good work.
Hi,
I appreciate your hard work. Can we have a test version of the MP? Please send me the download link and any recent news on the Oracle management pack.
Nice work! I would love to beta test this as soon as you’re ready.
-Ian
Hi Kris!
Any chance I can also test this MP and review the guide you are creating for it?
Thanks from your Dutch friends at BICTT!
Bob
Hi! Does this pack connect directly to Oracle database instances or is it a connector for Grid Control?
Thanks!
Hi Kris,
Any news on the Oracle MP? Did you ever finish it? If not, could you share what you have done so far?
I am looking into implementing Oracle monitoring and checked different options. I tried Quest (ex EXC) solution but don’t care for it much.
Thank you
Hi Mate,
Do you mind sharing the beta version of this MP with us? I’m trying to find a good Oracle MP that supports AIX and would like to evaluate yours as well.
thank you very much,
best regards.
Hello,
You make references to SCX in lots of places. We run Oracle on Redhat. Will your MP work for us? Where can I get it?
Thanks!
Hi Kristopher,
Is there a newer version available than the version (1.0.4) we already have?
Best Regards