Appendix C. Probes

Contents

C.1. Probe Guidelines
C.2. Apache 1.3.x and 2.0.x
C.3. BEA WebLogic 6.x and higher
C.4. General
C.5. Linux
C.6. LogAgent
C.7. MySQL 3.23 - 3.33
C.8. Network Services
C.9. Oracle 8i, 9i, 10g, and 11g
C.10. SUSE Manager

As described in Section 3.10, “Monitoring — [Mon]”, monitoring-entitled systems can have probes applied to them that constantly confirm their health and full operability. This appendix lists the available probes broken down by command group, such as Apache.

Many probes that monitor internal system aspects (such as the Linux::Disk Usage probe) rather than external aspects (such as the Network Services::SSH probe) require the installation of the SUSE Manager monitoring daemon (rhnmd). This requirement is noted within the individual probe reference.

Each probe has its own reference in this appendix that identifies required fields (marked with *), default values, and the thresholds that may be set to trigger alerts. Similarly, the beginning of each command group's section contains information applicable to all probes in that group. Section C.1, “Probe Guidelines” covers general guidelines; the remaining sections examine individual probes.

[Note]

Nearly all of the probes use Transmission Control Protocol (TCP) as their transport protocol. Exceptions to this are noted within the individual probe references.

C.1. Probe Guidelines

The following general guidelines outline the meaning of each probe state, and provide guidance in setting thresholds for your probes.

The following list provides a brief description of the meaning of each probe state:

Unknown

The probes that cannot collect the metrics needed to determine probe state. Most (though not all) probes enter this state when exceeding their timeout period. Probes in this state may be configured incorrectly, as well.

Pending

The probes whose data has not been received by SUSE Manager. It is normal for new probes to be in this state. However, if all probes move into this state, your monitoring infrastructure may be failing.

OK

The probes that have run successfully without error. This is the desired state for all probes.

Warning

The probes that have crossed their WARNING thresholds.

Critical

The probes that have crossed their CRITICAL thresholds or reached a critical status by some other means. (Some probes become critical when exceeding their timeout period.)

While adding probes, select meaningful thresholds that, when crossed, notify you and your administrators of problems within your infrastructure. Timeout periods are entered in seconds unless otherwise indicated. Exceptions to these rules are noted within the individual probe references.
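The relationship between a collected metric and the probe states above can be sketched in Python. This is an illustration only, not the monitoring system's own logic: the function name probe_state and its parameters are hypothetical, and "crossed" is assumed to mean strictly above a maximum or strictly below a minimum.

```python
def probe_state(value, crit_max=None, warn_max=None, warn_min=None, crit_min=None):
    """Map a collected metric onto UNKNOWN, OK, WARNING, or CRITICAL.

    A threshold left as None is treated as unset. "Crossed" is taken here to
    mean strictly above a maximum or strictly below a minimum (an assumption).
    """
    if value is None:
        return "UNKNOWN"  # the metric could not be collected
    # Critical thresholds take precedence over warning thresholds.
    if crit_max is not None and value > crit_max:
        return "CRITICAL"
    if crit_min is not None and value < crit_min:
        return "CRITICAL"
    if warn_max is not None and value > warn_max:
        return "WARNING"
    if warn_min is not None and value < warn_min:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Hypothetical CPU readings checked against example thresholds.
    for reading in (50, 80, 95):
        print(reading, probe_state(reading, crit_max=90, warn_max=75))
```

Checking the critical thresholds first means a value that crosses both levels is reported once, at the more severe level.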

[Important]

Some probes have thresholds based on time. In order for such CRITICAL and WARNING thresholds to work as intended, their values cannot exceed the amount of time allotted to the timeout period. Otherwise, an UNKNOWN status is returned in all instances of extended latency, thereby nullifying the thresholds. For this reason, it is strongly recommended to ensure that timeout periods exceed all timed thresholds.
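The rule above lends itself to a quick sanity check before a probe is saved. The following Python sketch is a hypothetical helper, not part of the product; it assumes all values are given in seconds and treats a threshold equal to the timeout as unusable, in line with the recommendation that timeouts exceed all timed thresholds.

```python
def validate_timed_thresholds(timeout, warning=None, critical=None):
    """Return a list of problems with a probe's time-based settings.

    All values are in seconds. A threshold left as None is treated as unset.
    """
    problems = []
    for name, value in (("WARNING", warning), ("CRITICAL", critical)):
        # A threshold at or beyond the timeout is never reached, because the
        # probe gives up first.
        if value is not None and value >= timeout:
            problems.append(
                f"{name} threshold of {value}s can never fire: "
                f"the probe times out after {timeout}s"
            )
    return problems


if __name__ == "__main__":
    for line in validate_timed_thresholds(timeout=15, warning=10, critical=20):
        print(line)
```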

Run your probes without notifications for a time to establish baseline performance for each of your systems. Although the default values provided for probes may suit your needs, every organization has a different environment that may require altering thresholds.

C.2. Apache 1.3.x and 2.0.x

The probes in this section may be applied to instances of the Apache Web server. Although the default values presume you will apply these probes using standard HTTP, you may also use them over secure connections by changing the application protocol to https and the port to 443.

C.2.1. Apache::Processes

The Apache::Processes probe monitors the processes executed on an Apache Web server and collects the following metrics:

  • Data Transferred Per Child — Records data transfer information only on individual children. A child process is one that is created from the parent process or another process.

  • Data Transferred Per Slot — The cumulative amount of data transferred by a child process that restarts. The number of slots is configured in the httpd.conf file using the MaxRequestsPerChild setting.

The ExtendedStatus directive in the httpd.conf file of the Web server must be set to On for this probe to function properly.
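One way to confirm by hand that ExtendedStatus is taking effect is to inspect the machine-readable form of the status page (the /server-status path with ?auto appended). The Python sketch below parses such output. The sample text and helper names are illustrative, and the sketch assumes that the Total Accesses and Total kBytes counters appear only when ExtendedStatus is On. It does not describe how the probe itself retrieves its data.

```python
def parse_status_auto(text):
    """Parse 'Key: value' lines from a machine-readable status page."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def has_extended_status(fields):
    """Assumption: traffic counters are present only with ExtendedStatus On."""
    return "Total Accesses" in fields and "Total kBytes" in fields


# Illustrative sample, not captured from a real server.
SAMPLE = """\
Total Accesses: 1204
Total kBytes: 8812
Uptime: 3600
BusyWorkers: 3
IdleWorkers: 7
"""

if __name__ == "__main__":
    fields = parse_status_auto(SAMPLE)
    print("ExtendedStatus looks enabled:", has_extended_status(fields))
    print("Average kB per request:",
          float(fields["Total kBytes"]) / float(fields["Total Accesses"]))
```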

Table C.1. Apache::Processes settings

Field                  Value
Application Protocol*  http
Port*                  80
Pathname*              /server-status
UserAgent*             NOCpulse-ApacheUptime/1.0
Username
Password
Timeout*               15
Critical Maximum Megabytes Transferred Per Child
Warning Maximum Megabytes Transferred Per Child
Critical Maximum Megabytes Transferred Per Slot
Warning Maximum Megabytes Transferred Per Slot

C.2.2. Apache::Traffic

The Apache::Traffic probe monitors the requests on an Apache Web server and collects the following metrics:

  • Current Requests — The number of requests being processed by the server at probe runtime.

  • Request Rate — The accesses to the server per second since the probe last ran.

  • Traffic — The kilobytes per second of traffic the server has processed since the probe last ran.

The ExtendedStatus directive in the httpd.conf file of the Web server must be set to On for this probe to function properly.

Table C.2. Apache::Traffic settings

Field                  Value
Application Protocol*  http
Port*                  80
Pathname*              /server-status
UserAgent*             NOCpulse-ApacheUptime/1.0
Username
Password
Timeout*               15
Critical Maximum Current Requests (number)
Warning Maximum Current Requests (number)
Critical Maximum Request Rate (events per second)
Warning Maximum Request Rate (events per second)
Critical Maximum Traffic (kilobytes per second)
Warning Maximum Traffic (kilobytes per second)

C.2.3. Apache::Uptime

The Apache::Uptime probe stores the cumulative time since the Web server was last started. No metrics are collected by this probe, which is designed to help track service level agreements (SLAs).

Table C.3. Apache::Uptime settings

Field                  Value
Application Protocol*  http
Port*                  80
Pathname*              /server-status
UserAgent*             NOCpulse-ApacheUptime/1.0
Username
Password
Timeout*               15

C.3. BEA WebLogic 6.x and higher

The probes in this section (with the exception of JDBC Connection Pool) can be configured to monitor the properties of any BEA WebLogic 6.x and higher server (administration or managed) running on a given host, even in a clustered environment. Monitoring of a cluster is achieved by sending all SNMP queries to the administration server of the domain and then querying its Managed Servers for individual data.

In order to obtain this higher level of granularity, the BEA Domain Admin Server parameter must be used to differentiate between the administration server receiving SNMP queries and the Managed Server undergoing the specified probe. If the host to be probed is the Administration Server, then the BEA Domain Admin Server parameter can be left blank, and both the SNMP queries and the probe will be sent to it only.

If the host to be probed is a managed server, then the IP address of the administration server should be provided in the BEA Domain Admin Server parameter, and the managed server name should be included in the BEA Server Name parameter and appended to the end of the SNMP Community String field. This causes the SNMP queries to be sent to the administration server host, as is required, but redirects the specific probe to the managed server host.

The community string for probes run against managed server hosts must take the form community_prefix@managed_server_name so that the SNMP query returns results for the desired managed server. Finally, SNMP must be enabled on each monitored system. SNMP support can be enabled and configured through the WebLogic console.

Please see the documentation that came with your BEA server or information on the BEA website for more details about BEA's community string naming conventions: http://e-docs.bea.com/wls/docs70/snmpman/snmpagent.html
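The rules above for administration and managed servers can be summarized in a short Python sketch. The function name, the example server name, and the example IP address are hypothetical; only the field names and the community_prefix@managed_server_name form come from this section.

```python
def weblogic_snmp_settings(community_prefix, server_name, admin_server_ip=None):
    """Build probe field values for an administration or managed server.

    Leave admin_server_ip as None when the probed host is the
    administration server itself.
    """
    if admin_server_ip is None:
        # Administration server: queries and the probe go to the same host.
        return {
            "SNMP Community String": community_prefix,
            "BEA Domain Admin Server": "",
            "BEA Server Name": server_name,
        }
    # Managed server: queries go to the administration server, and the
    # managed server name is appended to the community string.
    return {
        "SNMP Community String": f"{community_prefix}@{server_name}",
        "BEA Domain Admin Server": admin_server_ip,
        "BEA Server Name": server_name,
    }


if __name__ == "__main__":
    print(weblogic_snmp_settings("public", "myserver"))
    print(weblogic_snmp_settings("public", "managed1", "192.0.2.10"))
```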

C.3.1. BEA WebLogic::Execute Queue

The BEA WebLogic::Execute Queue probe monitors the WebLogic execute queue and provides the following metrics:

  • Idle Execute Threads — The number of execution threads in an idle state.

  • Queue Length — The number of requests in the queue.

  • Request Rate — The number of requests per second.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.4. BEA WebLogic::Execute Queue settings

Field                   Value
SNMP Community String*  public
SNMP Port*              161
SNMP Version*           1
BEA Domain Admin Server
BEA Server Name*        myserver
Queue Name*             default
Critical Maximum Idle Execute Threads
Warning Maximum Idle Execute Threads
Critical Maximum Queue Length
Warning Maximum Queue Length
Critical Maximum Request Rate
Warning Maximum Request Rate

C.3.2. BEA WebLogic::Heap Free

The BEA WebLogic::Heap Free probe collects the following metric:

  • Heap Free — The percentage of free heap space.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.5. BEA WebLogic::Heap Free settings

Field                   Value
SNMP Community String*  public
SNMP Port*              161
SNMP Version*           1
BEA Domain Admin Server
BEA Server Name*        myserver
Critical Maximum Heap Free
Warning Maximum Heap Free
Warning Minimum Heap Free
Critical Minimum Heap Free

C.3.3. BEA WebLogic::JDBC Connection Pool

The BEA WebLogic::JDBC Connection Pool probe monitors the Java Database Connectivity (JDBC) connection pool on a domain admin server only (no managed servers) and collects the following metrics:

  • Connections — The number of connections to the JDBC.

  • Connections Rate — The speed at which connections are made to the JDBC, measured in connections per second.

  • Waiters — The number of sessions waiting to connect to the JDBC.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.6. BEA WebLogic::JDBC Connection Pool settings

Field                   Value
SNMP Community String*  public
SNMP Port*              161
SNMP Version*           1
BEA Domain Admin Server
BEA Server Name*        myserver
JDBC Pool Name*         MyJDBC Connection Pool
Critical Maximum Connections
Warning Maximum Connections
Critical Maximum Connection Rate
Warning Maximum Connection Rate
Critical Maximum Waiters
Warning Maximum Waiters

C.3.4. BEA WebLogic::Server State

The BEA WebLogic::Server State probe monitors the current state of a BEA WebLogic Web server. If the probe is unable to make a connection to the server, a CRITICAL status results.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.7. BEA WebLogic::Server State settings

Field                   Value
SNMP Community String*  public
SNMP Port*              161
SNMP Version*           1
BEA Domain Admin Server
BEA Server Name*

C.3.5. BEA WebLogic::Servlet

The BEA WebLogic::Servlet probe monitors the performance of a particular servlet deployed on a WebLogic server and collects the following metrics:

  • High Execution Time — The highest amount of time in milliseconds that the servlet takes to execute since the system was started.

  • Low Execution Time — The lowest amount of time in milliseconds that the servlet takes to execute since the system was started.

  • Execution Time Moving Average — A moving average of the execution time.

  • Execution Time Average — A standard average of the execution time.

  • Reload Rate — The number of times the specified servlet is reloaded per minute.

  • Invocation Rate — The number of times the specified servlet is invoked per minute.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.8. BEA WebLogic::Servlet settings

Field                   Value
SNMP Community String*  public
SNMP Port*              161
SNMP Version*           1
BEA Domain Admin Server
BEA Server Name*        myserver
Servlet Name*
Critical Maximum High Execution Time
Warning Maximum High Execution Time
Critical Maximum Execution Time Moving Average
Warning Maximum Execution Time Moving Average

C.4. General

The probes in this section are designed to monitor basic aspects of your systems. When applying them, ensure their timed thresholds do not exceed the amount of time allotted to the timeout period. Otherwise, the probe returns an UNKNOWN status in all instances of extended latency, thereby nullifying the thresholds.

C.4.1. General::Remote Program

The General::Remote Program probe allows you to run any command or script on your system and obtain a status string. Note that the resulting message will be limited to 1024 bytes.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.9. General::Remote Program settings

Field                  Value
Command*
OK Exit Status*        0
Warning Exit Status*   1
Critical Exit Status*  2
Timeout                15

C.4.2. General::Remote Program with Data

The General::Remote Program with Data probe allows you to run any command or script on your system and obtain a value, as well as a status string. To use this probe, you must include XML code in the body of your script. This probe supports the following XML tags:

  • <perldata> </perldata>

  • <hash> </hash>

  • <item key="..."> </item>

The remote program must write some variation of the following code to STDOUT:

<perldata>
  <hash>
    <item key="data">10</item>
    <item key="status_message">status message here</item>
  </hash>
</perldata>

The required value for data is the data point to be inserted in the database for time-series trending. The status_message is optional and can be whatever text string is desired with a maximum length of 1024 bytes. Remote programs that do not include a status_message still report the value and status returned.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe. XML is case-sensitive. The data item key name cannot be changed and it must collect a number as its value.
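As an illustration, a remote program that satisfies these requirements could look like the following Python sketch. The measured value (queue_depth) and its cut-off values are hypothetical placeholders; the XML layout, the data and status_message keys, the 1024-byte limit, and the default exit statuses 0, 1, and 2 come from this section. Treat it as a starting point under those assumptions rather than a reference implementation.

```python
import sys
from xml.sax.saxutils import escape

# Default exit statuses from the probe's settings table.
OK, WARNING, CRITICAL = 0, 1, 2


def render_perldata(data, status_message=None):
    """Build the XML document the probe reads from STDOUT."""
    lines = ["<perldata>", "  <hash>", f'    <item key="data">{data}</item>']
    if status_message is not None:
        # Trim the raw message to the documented 1024-byte limit.
        trimmed = status_message.encode("utf-8")[:1024].decode("utf-8", "ignore")
        lines.append(f'    <item key="status_message">{escape(trimmed)}</item>')
    lines += ["  </hash>", "</perldata>"]
    return "\n".join(lines)


def main():
    queue_depth = 10  # hypothetical measurement; replace with a real check
    if queue_depth >= 100:
        status = CRITICAL
    elif queue_depth >= 50:
        status = WARNING
    else:
        status = OK
    print(render_perldata(queue_depth, f"queue depth is {queue_depth}"))
    return status


if __name__ == "__main__":
    sys.exit(main())
```

The exit status carries the probe state, while the data item carries the number stored for trending, so the two can be chosen independently.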

Table C.10. General::Remote Program with Data settings

Field                  Value
Command*
OK Exit Status*        0
Warning Exit Status*   1
Critical Exit Status*  2
Timeout                15

C.4.3. General::SNMP Check

The General::SNMP Check probe tests your SNMP server by specifying a single object identifier (OID) in dotted notation (such as 1.3.6.1.2.1.1.1.0) and a threshold associated with the return value. It collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the SNMP server to answer a connection request.

Requirements — SNMP must be running on the monitored system to perform this probe. Only integers can be used for the threshold values.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.11. General::SNMP Check settings

Field                   Value
SNMP OID*
SNMP Community String*  public
SNMP Port*              161
SNMP Version*           2
Timeout*                15
Critical Maximum Value
Warning Maximum Value
Warning Minimum Value
Critical Minimum Value

C.4.4. General::TCP Check

The General::TCP Check probe tests your TCP server by verifying that it can connect to a system via the specified port number. It collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the TCP server to answer a connection request.

The probe passes the string specified in the Send field upon making a connection. The probe anticipates a response from the system, which should include the substring specified in the Expect field. If the expected string is not found, the probe returns a CRITICAL status.
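The send and expect behavior described above can be approximated with a few lines of Python. This is a sketch for experimenting with Send and Expect values, not the probe's implementation: the function name is hypothetical, the reply is read with a single recv call, and the mapping of timeouts to UNKNOWN and refused connections to CRITICAL is an assumption.

```python
import socket
import time


def tcp_check(host, port, send=b"", expect=b"", timeout=10.0):
    """Connect, optionally send bytes, and look for an expected substring.

    Returns (state, latency_seconds), where latency is the time taken to
    establish the connection.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            latency = time.monotonic() - start
            if send:
                sock.sendall(send)
            if expect:
                # Simplification: a single read; real replies may arrive in pieces.
                response = sock.recv(4096)
                if expect not in response:
                    return "CRITICAL", latency
            return "OK", latency
    except socket.timeout:
        # Assumption: a timeout maps to UNKNOWN, as for most probes.
        return "UNKNOWN", time.monotonic() - start
    except OSError:
        # Assumption: a refused or unreachable connection is CRITICAL.
        return "CRITICAL", time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical target: an SSH server on the local machine.
    print(tcp_check("127.0.0.1", 22, expect=b"SSH", timeout=10))
```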

Table C.12. General::TCP Check settings

Field     Value
Send
Expect
Port*     1
Timeout*  10
Critical Maximum Latency
Warning Maximum Latency

C.4.5. General::UDP Check

The General::UDP Check probe tests your UDP server by verifying that it can connect to a system via the specified port number. It collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the UDP server to answer a connection request.

The probe passes the string specified in the Send field upon making a connection. The probe anticipates a response from the system, which should include the substring specified in the Expect field. If the expected string is not found, the probe returns a CRITICAL status.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.13. General::UDP Check settings

Field     Value
Port*     1
Send
Expect
Timeout*  10
Critical Maximum Latency
Warning Maximum Latency

C.4.6. General::Uptime (SNMP)

The General::Uptime (SNMP) probe records the time since the device was last started. It uses the SNMP object identifier (OID) to obtain this value. The only error status it will return is UNKNOWN.

Requirements — SNMP must be running on the monitored system and access to the OID must be enabled to perform this probe.

This probe's transport protocol is User Datagram Protocol (UDP).

Table C.14. General::Uptime (SNMP) settings

Field                   Value
SNMP Community String*  public
SNMP Port*              161
SNMP Version*           2
Timeout*                15

C.5. Linux

The probes in this section monitor essential aspects of your Linux systems, from CPU usage to virtual memory. Apply them to mission-critical systems to obtain warnings prior to failure.

Unlike other probe groups, which may or may not require the SUSE Manager monitoring daemon, every Linux probe requires that the rhnmd daemon be running on the monitored system.

C.5.1. Linux::CPU Usage

The Linux::CPU Usage probe monitors the CPU utilization on a system and collects the following metric:

  • CPU Percent Used — The five-second average of the percent of CPU usage at probe execution.

Requirements — The SUSE Manager Monitoring Daemon must be running on the monitored system to run this probe.

Table C.15. Linux::CPU Usage settings

Field     Value
Timeout*  15
Critical Maximum CPU Percent Used
Warning Maximum CPU Percent Used

C.5.2. Linux::Disk IO Throughput

The Linux::Disk IO Throughput probe monitors a given disk and collects the following metrics:

  • Read Rate — The amount of data that is read in kilobytes per second.

  • Write Rate — The amount of data that is written in kilobytes per second.

To obtain the value for the required Disk number or disk name field, run iostat on the system to be monitored and see what name has been assigned to the disk you desire. The default value of 0 usually provides statistics from the first hard drive connected directly to the system.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe. Also, the Disk number or disk name parameter must match the format visible when the iostat command is run. If the format is not identical, the configured probe enters an UNKNOWN state.

Table C.16. Linux::Disk IO Throughput settings

Field                      Value
Disk number or disk name*  0
Timeout*                   15
Critical Maximum KB read/second
Warning Maximum KB read/second
Warning Minimum KB read/second
Critical Minimum KB read/second
Critical Maximum KB written/second
Warning Maximum KB written/second
Warning Minimum KB written/second
Critical Minimum KB written/second

C.5.3. Linux::Disk Usage

The Linux::Disk Usage probe monitors the disk space on a specific file system and collects the following metrics:

  • File System Used — The percentage of the file system currently in use.

  • Space Used — The amount of the file system in megabytes currently in use.

  • Space Available — The amount of the file system in megabytes currently available.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.17. Linux::Disk Usage settings

Field         Value
File system*  /dev/hda1
Timeout*      15
Critical Maximum File System Percent Used
Warning Maximum File System Percent Used
Critical Maximum Space Used
Warning Maximum Space Used
Warning Minimum Space Available
Critical Minimum Space Available

C.5.4. Linux::Inodes

The Linux::Inodes probe monitors the specified file system and collects the following metric:

  • Inodes — The percentage of inodes currently in use.

An inode is a data structure that holds information about files in a Linux file system. There is an inode for each file, and a file is uniquely identified by the file system on which it resides and its inode number on that system.
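To see the kind of figure this metric represents, the percentage of inodes in use can be computed with Python's os.statvfs call. This is a minimal sketch for checking a value by hand; the function name is hypothetical, and the probe may derive its number differently.

```python
import os


def inode_percent_used(path="/"):
    """Return the percentage of inodes in use on the file system at path.

    Returns None when the file system does not report a total inode count.
    """
    st = os.statvfs(path)
    if st.f_files == 0:
        return None
    used = st.f_files - st.f_ffree  # total inodes minus free inodes
    return 100.0 * used / st.f_files


if __name__ == "__main__":
    print("Inodes used on /:", inode_percent_used("/"))
```

A file system can run out of inodes while disk space remains, which is why this value is tracked separately from disk usage.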

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.18. Linux::Inodes settings

Field         Value
File system*  /
Timeout*      15
Critical Maximum Inodes Percent Used
Warning Maximum Inodes Percent Used

C.5.5. Linux::Interface Traffic

The Linux::Interface Traffic probe measures the amount of traffic into and out of the specified interface (such as eth0) and collects the following metrics:

  • Input Rate — The traffic in bytes per second going into the specified interface.

  • Output Rate — The traffic in bytes per second going out of the specified interface.

Requirements — The SUSE Manager monitoring daemon must be running on the monitored system to execute this probe.

Table C.19. Linux::Interface Traffic settings

Field     Value
Interface*
Timeout*  30
Critical Maximum Input Rate
Warning Maximum Input Rate
Warning Minimum Input Rate
Critical Minimum Input Rate
Critical Maximum Output Rate
Warning Maximum Output Rate
Warning Minimum Output Rate
Critical Minimum Output Rate

C.5.6. Linux::Load

The Linux::Load probe monitors the CPU of a system and collects the following metric:

  • Load — The average load on the system CPU over various periods.

Requirements — The SUSE Manager monitoring daemon must be running on the monitored system to execute this probe.
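When choosing thresholds for the 1-minute, 5-minute, and 15-minute averages, it can help to compare candidate values against a system's current load. The Python sketch below does this with os.getloadavg. The function name and the example threshold values are hypothetical, and a threshold is assumed to be crossed when the load is strictly greater than it.

```python
import os


def load_state(loads, warning=(None, None, None), critical=(None, None, None)):
    """Compare 1-, 5-, and 15-minute load averages with per-period thresholds.

    A threshold of None is treated as unset.
    """
    state = "OK"
    for load, warn, crit in zip(loads, warning, critical):
        if crit is not None and load > crit:
            return "CRITICAL"  # any critical breach decides the result
        if warn is not None and load > warn:
            state = "WARNING"
    return state


if __name__ == "__main__":
    current = os.getloadavg()  # (1-minute, 5-minute, 15-minute)
    print(current, load_state(current, warning=(4, 3, 2), critical=(8, 6, 4)))
```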

Table C.20. Linux::Load settings

Field     Value
Timeout*  15
Critical CPU Load 1-minute average
Warning CPU Load 1-minute average
Critical CPU Load 5-minute average
Warning CPU Load 5-minute average
Critical CPU Load 15-minute average
Warning CPU Load 15-minute average

C.5.7. Linux::Memory Usage

The Linux::Memory Usage probe monitors the memory on a system and collects the following metric:

  • RAM Free — The amount of free random access memory (RAM) in megabytes on a system.

To include reclaimable memory in this metric, enter yes in the Include reclaimable memory field; enter no (the default) to exclude it.

Requirements — The SUSE Manager Monitoring Daemon must be running on the monitored system to execute this probe.

Table C.21. Linux::Memory Usage settings

Field                       Value
Include reclaimable memory  no
Timeout*                    15
Warning Maximum RAM Free
Critical Maximum RAM Free

C.5.8. Linux::Process Counts by State

The Linux::Process Counts by State probe identifies the number of processes in the following states:

  • Blocked — A process that has been switched to the waiting queue and whose state has been switched to waiting.

  • Defunct — A process that has terminated (either because it has been killed by a signal or because it has called exit()) and whose parent process has not yet received notification of its termination by executing some form of the wait() system call.

  • Stopped — A process that has been stopped before its execution could be completed.

  • Sleeping — A process that is in the Interruptible sleep state and that can later be reintroduced into memory, resuming execution where it left off.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.22. Linux::Process Counts by State settings

Field     Value
Timeout*  15
Critical Maximum Blocked Processes
Warning Maximum Blocked Processes
Critical Maximum Defunct Processes
Warning Maximum Defunct Processes
Critical Maximum Stopped Processes
Warning Maximum Stopped Processes
Critical Maximum Sleeping Processes
Warning Maximum Sleeping Processes
Critical Maximum Child Processes
Warning Maximum Child Processes

C.5.9. Linux::Process Count Total

The Linux::Process Count Total probe monitors a system and collects the following metric:

  • Process Count — The total number of processes currently running on the system.

Requirements — The SUSE Manager monitoring daemon must be running on the monitored system to execute this probe.

Table C.23. Linux::Process Count Total settings

Field     Value
Timeout*  15
Critical Maximum Process Count
Warning Maximum Process Count

C.5.10. Linux::Process Health

The Linux::Process Health probe monitors user-specified processes and collects the following metrics:

  • CPU Usage — The CPU usage rate for a given process in milliseconds per second. This metric reports the time column of ps output, which is the cumulative CPU time used by the process. This makes the metric independent of probe interval, allows sane thresholds to be set, and generates usable graphs (i.e. a sudden spike in CPU usage shows up as a spike in the graph).

  • Child Process Groups — The number of child processes spawned from the specified parent process. A child process inherits most of its attributes, such as open files, from its parent.

  • Threads — The number of running threads for a given process. A thread is the basic unit of CPU utilization, and consists of a program counter, a register set, and a stack space. A thread is also called a lightweight process.

  • Physical Memory Used — The amount of physical memory (or RAM) in kilobytes used by the specified process.

  • Virtual Memory Used — The amount of virtual memory in kilobytes used by the specified process, or the size of the process in real memory plus swap.

Specify the process by its command name or process ID (PID). Entering a PID overrides the entry of a command name. If no command name or PID is entered, the error Command not found is displayed and the probe enters a CRITICAL state.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.24. Linux::Process Health settings

Field     Value
Command Name
Process ID (PID) file
Timeout*  15
Critical Maximum CPU Usage
Warning Maximum CPU Usage
Critical Maximum Child Process Groups
Warning Maximum Child Process Groups
Critical Maximum Threads
Warning Maximum Threads
Critical Maximum Physical Memory Used
Warning Maximum Physical Memory Used
Critical Maximum Virtual Memory Used
Warning Maximum Virtual Memory Used

C.5.11. Linux::Process Running

The Linux::Process Running probe verifies that the specified process is functioning properly. It counts either processes or process groups, depending on whether the Count process groups checkbox is selected.

By default, the checkbox is selected, thereby indicating that the probe should count the number of process group leaders independent of the number of children. This allows you, for example, to verify that two instances of the Apache Web server are running regardless of the (dynamic) number of child processes. If it is not selected, the probe conducts a straightforward count of the number of processes (children and leaders) matching the specified process.

Specify the process by its command name or process ID (PID). Entering a PID overrides the entry of a command name. If no command name or PID is entered, the error Command not found is displayed and the probe enters a CRITICAL state.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.25. Linux::Process Running settings

Field                 Value
Command name
PID file
Count process groups  (checked)
Timeout*              15
Critical Maximum Number Running
Critical Minimum Number Running

C.5.12. Linux::Swap Usage

The Linux::Swap Usage probe monitors the swap partitions running on a system and reports the following metric:

  • Swap Free — The percent of swap memory currently free.

Requirements — The SUSE Manager Monitoring Daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.26. Linux::Swap Usage settings

Field     Value
Timeout*  15
Warning Minimum Swap Free
Critical Minimum Swap Free

C.5.13. Linux::TCP Connections by State

The Linux::TCP Connections by State probe identifies the total number of TCP connections, as well as the quantity of each in the following states:

  • TIME_WAIT — The socket is waiting after close for remote shutdown transmission so it may handle packets still in the network.

  • CLOSE_WAIT — The remote side has been shut down and is now waiting for the socket to close.

  • FIN_WAIT — The socket is closed, and the connection is now shutting down.

  • ESTABLISHED — The socket has a connection established.

  • SYN_RCVD — The connection request has been received from the network.

This probe can be helpful in finding and isolating network traffic to specific IP addresses or examining network connections into the monitored system.

The filter parameters for the probe let you narrow the probe's scope. This probe uses the netstat -ant command to retrieve data. The Local IP address and Local port parameters use values in the Local Address column of the output; the Remote IP address and Remote port parameters use values in the Foreign Address column of the output for reporting.
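To preview what a given filter would select, the output of netstat -ant can be tallied by state with a short Python sketch. The sample output and function name below are illustrative. The filters here are exact matches on the local port and remote IP address, which is a simplification of the probe's pattern lists, and states are counted under the names netstat prints without mapping them to the probe's metric names.

```python
from collections import Counter

# Illustrative sample of `netstat -ant` output (addresses are examples).
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:22           198.51.100.7:51514      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.8:40022      TIME_WAIT
tcp        0      0 192.0.2.10:80           198.51.100.9:40100      ESTABLISHED
"""


def count_tcp_states(netstat_output, local_port=None, remote_ip=None):
    """Count TCP connections per state, optionally filtered."""
    counts = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        if len(parts) < 6 or not parts[0].startswith("tcp"):
            continue  # skip headers and non-TCP lines
        local, foreign, state = parts[3], parts[4], parts[5]
        # Split on the last colon so the port is separated from the address.
        _, _, l_port = local.rpartition(":")
        r_ip, _, _ = foreign.rpartition(":")
        if local_port is not None and l_port != str(local_port):
            continue
        if remote_ip is not None and r_ip != remote_ip:
            continue
        counts[state] += 1
    return counts


if __name__ == "__main__":
    print(dict(count_tcp_states(SAMPLE)))
    print(dict(count_tcp_states(SAMPLE, local_port=80)))
```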

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.27. Linux::TCP Connections by State settings

Field     Value
Local IP address filter pattern list
Local port number filter
Remote IP address filter pattern list
Remote port number filter
Timeout*  15
Critical Maximum Total Connections
Warning Maximum Total Connections
Critical Maximum TIME_WAIT Connections
Warning Maximum TIME_WAIT Connections
Critical Maximum CLOSE_WAIT Connections
Warning Maximum CLOSE_WAIT Connections
Critical Maximum FIN_WAIT Connections
Warning Maximum FIN_WAIT Connections
Critical Maximum ESTABLISHED Connections
Warning Maximum ESTABLISHED Connections
Critical Maximum SYN_RCVD Connections
Warning Maximum SYN_RCVD Connections

C.5.14. Linux::Users

The Linux::Users probe monitors the users of a system and reports the following metric:

  • Users — The number of users currently logged in.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.28. Linux::Users settings

Field     Value
Timeout*  15
Critical Maximum Users
Warning Maximum Users

C.5.15. Linux::Virtual Memory

The Linux::Virtual Memory probe monitors the total system memory and collects the following metric:

  • Virtual Memory — The percent of total system memory - random access memory (RAM) plus swap - that is free.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.29. Linux::Virtual Memory settings

Field     Value
Timeout*  15
Warning Minimum Virtual Memory Free
Critical Minimum Virtual Memory Free

C.6. LogAgent

The probes in this section monitor the log files on your systems. You can use them to query logs for certain expressions and track the sizes of files. For LogAgent probes to run, the nocpulse user must be granted read access to your log files.

Note that data from the first run of these probes is not measured against the thresholds to prevent spurious notifications caused by incomplete metric data. Measurements will begin on the second run.

C.6.1. LogAgent::Log Pattern Match

The LogAgent::Log Pattern Match probe uses regular expressions to match text located within the monitored log file and collects the following metrics:

  • Regular Expression Matches — The number of matches that have occurred since the probe last ran.

  • Regular Expression Match Rate — The number of matches per minute since the probe last ran.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe. For this probe to run, the nocpulse user must be granted read access to your log files.

In addition to the name and location of the log file to be monitored, you must provide a regular expression to be matched against. The expression must be formatted for egrep, which is equivalent to grep -E and supports extended regular expressions. This is the regular expression set for egrep:

^    beginning of line
$    end of line
.    match any one char
*    match zero or more of preceding chars
[]   match one character set, e.g. [Ff]oo
[^]  match not in set, e.g. [^A-F]oo
+    match one or more of preceding chars
?    match zero or one of preceding chars
|    or, e.g. a|b
()   groups chars, e.g. (foo|bar) or (foo)+

[Warning]

Do not include single quotation marks (') within the expression. Doing so causes egrep to fail silently and the probe to time out.
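
As an illustration of the two metrics above, the following minimal sketch counts matching lines and derives a per-minute rate. It uses Python's re module as a stand-in for egrep, which is an assumption: the operators listed above behave the same way in both for simple patterns, but the probe itself runs egrep. The function name count_matches, the sample log lines, and the five-minute interval are hypothetical.

```python
import re

def count_matches(lines, pattern, minutes_since_last_run):
    """Return (matches, matches_per_minute) for lines added since the last run."""
    if "'" in pattern:
        # The probe documentation warns against single quotes in the expression.
        raise ValueError("pattern must not contain single quotation marks")
    regex = re.compile(pattern)
    # Assumption: one match is counted per matching line, as grep -c would do.
    matches = sum(1 for line in lines if regex.search(line))
    return matches, matches / minutes_since_last_run

# Hypothetical log lines written since the previous probe run.
new_lines = [
    "Jan 10 12:00:01 host sshd[411]: Failed password for root",
    "Jan 10 12:00:07 host sshd[411]: failed password for admin",
    "Jan 10 12:01:30 host cron[52]: job started",
]
matches, rate = count_matches(new_lines, r"[Ff]ailed (password|login)", 5)
print(matches, rate)
```

The two values correspond to the Regular Expression Matches and Regular Expression Match Rate metrics, which are then compared against the thresholds in the table below.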

Table C.30. LogAgent::Log Pattern Match settings

Field                      Value
Log file*                  /var/log/messages
Basic regular expression*
Timeout*                   45
Critical Maximum Matches 
Warning Maximum Matches 
Warning Minimum Matches 
Critical Minimum Matches 
Critical Maximum Match Rate 
Warning Maximum Match Rate 
Warning Minimum Match Rate 
Critical Minimum Match Rate

C.6.2. LogAgent::Log Size

The LogAgent::Log Size probe monitors log file growth and collects the following metrics:

  • Size — The number of bytes the log file has grown since the probe last ran.

  • Output Rate — The number of bytes per minute the log file has grown since the probe last ran.

  • Lines — The number of lines written to the log file since the probe last ran.

  • Line Rate — The number of lines written per minute to the log file since the probe last ran.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe. For this probe to run, the nocpulse user must be granted read access to your log files.
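
The four metrics are all differences between two consecutive observations of the log file. The sketch below shows that arithmetic under stated assumptions: the function name log_growth and the sample figures are hypothetical, and log rotation (where the file shrinks between runs) is not handled.

```python
def log_growth(prev_bytes, prev_lines, cur_bytes, cur_lines, minutes):
    """Compute the Log Size metrics from two snapshots taken `minutes` apart."""
    size = cur_bytes - prev_bytes      # Size: bytes grown since the last run
    lines = cur_lines - prev_lines     # Lines: lines written since the last run
    return {
        "size": size,
        "output_rate": size / minutes,  # bytes per minute
        "lines": lines,
        "line_rate": lines / minutes,   # lines per minute
    }

# Hypothetical snapshots taken five minutes apart.
growth = log_growth(prev_bytes=10_000, prev_lines=100,
                    cur_bytes=16_000, cur_lines=160, minutes=5)
print(growth)
```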

Table C.31. LogAgent::Log Size settings

Field      Value
Log file*  /var/log/messages
Timeout*   20
Critical Maximum Size 
Warning Maximum Size 
Warning Minimum Size 
Critical Minimum Size 
Critical Maximum Output Rate 
Warning Maximum Output Rate 
Warning Minimum Output Rate 
Critical Minimum Output Rate 
Critical Maximum Lines 
Warning Maximum Lines 
Warning Minimum Lines 
Critical Minimum Lines 
Critical Maximum Line Rate 
Warning Maximum Line Rate 
Warning Minimum Line Rate 
Critical Minimum Line Rate 

C.7. MySQL 3.23 - 3.33

The probes in this section monitor aspects of the MySQL database using the mysqladmin binary. No specific user privileges are needed for these probes.

Note that the mysql-server package must be installed on the system conducting the monitoring for these probes to complete. Refer to the MySQL Installation section of the SUSE Manager Installation Guide for instructions.

C.7.1. MySQL::Database Accessibility

The MySQL::Database Accessibility probe tests connectivity through a database account that has no database privileges. If no connection is made, a CRITICAL status results.

Table C.32. MySQL::Database Accessibility settings

Field       Value
Username*
Password
MySQL Port  3306
Database*   mysql
Timeout     15

C.7.2. MySQL::Opened Tables

The MySQL::Opened Tables probe monitors the MySQL server and collects the following metric:

  • Opened Tables — The number of tables that have been opened since the server was started.

Table C.33. MySQL::Opened Tables settings

Field        Value
Username
Password
MySQL Port*  3306
Timeout      15
Critical Maximum Opened Objects 
Warning Maximum Opened Objects 
Warning Minimum Opened Objects 
Critical Minimum Opened Objects 

C.7.3. MySQL::Open Tables

The MySQL::Open Tables probe monitors the MySQL server and collects the following metric:

  • Open Tables — The number of tables open when the probe runs.

Table C.34. MySQL::Open Tables settings

Field        Value
Username
Password
MySQL Port*  3306
Timeout      15
Critical Maximum Open Objects 
Warning Maximum Open Objects 
Warning Minimum Open Objects 
Critical Minimum Open Objects 

C.7.4. MySQL::Query Rate

The MySQL::Query Rate probe monitors the MySQL server and collects the following metric:

  • Query Rate — The average number of queries per second per database server.

Table C.35. MySQL::Query Rate settings

Field        Value
Username
Password
MySQL Port*  3306
Timeout      15
Critical Maximum Query Rate 
Warning Maximum Query Rate 
Warning Minimum Query Rate 
Critical Minimum Query Rate 

C.7.5. MySQL::Threads Running

The MySQL::Threads Running probe monitors the MySQL server and collects the following metric:

  • Threads Running — The total number of running threads within the database.

Table C.36. MySQL::Threads Running settings

Field        Value
Username
Password
MySQL Port*  3306
Timeout      15
Critical Maximum Threads Running 
Warning Maximum Threads Running 
Warning Minimum Threads Running 
Critical Minimum Threads Running 

C.8. Network Services

The probes in this section monitor various services integral to a functioning network. When applying them, ensure that their timed thresholds do not exceed the amount of time allotted to the timeout period. Otherwise, an UNKNOWN status is returned in all instances of extended latency, thereby nullifying the thresholds.
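
The rule above can be checked mechanically before a probe is saved. The following is a minimal sketch, assuming that the timeout and all timed thresholds are expressed in the same unit (seconds here); the function name thresholds_exceeding_timeout is hypothetical and not part of SUSE Manager.

```python
def thresholds_exceeding_timeout(timeout, **thresholds):
    """Return the names of timed thresholds that exceed the timeout.

    Thresholds left unset (None) are ignored.
    """
    return sorted(
        name for name, value in thresholds.items()
        if value is not None and value > timeout
    )

# A critical latency threshold of 12 s can never fire with a 10 s timeout,
# because the probe reports UNKNOWN first.
problems = thresholds_exceeding_timeout(10, warning=5, critical=12)
print(problems)
```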

C.8.1. Network Services::DNS Lookup

The Network Services::DNS Lookup probe uses the dig command to see if it can resolve the system or domain name specified in the Host or Address to look up field. It collects the following metric:

  • Query Time — The time in milliseconds required to execute the dig request.

This is useful in monitoring the status of your DNS servers. To monitor one of your DNS servers, supply a well-known host/domain name, such as a large search engine or corporate Web site.

Table C.37. Network Services::DNS Lookup settings

Field                       Value
Host or Address to look up
Timeout*                    10
Critical Maximum Query Time 
Warning Maximum Query Time 

C.8.2. Network Services::FTP

The Network Services::FTP probe uses network sockets to test FTP port availability. It collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the FTP server to answer a connection request.

This probe supports authentication. Provide a username and password in the appropriate fields to use this feature. The optional Expect value is the string to be matched against after a successful connection is made to the FTP server. If the expected string is not found, the probe returns a CRITICAL state.

Table C.38. Network Services::FTP settings

Field      Value
Expect     FTP
Username
Password
FTP Port*  21
Timeout*   10
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.8.3. Network Services::IMAP Mail

The Network Services::IMAP Mail probe determines if it can connect to the IMAP 4 service on the system. Specifying an optional port will override the default port 143. It collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the IMAP server to answer a connection request.

The required Expect value is the string to be matched against after a successful connection is made to the IMAP server. If the expected string is not found, the probe returns a CRITICAL state.

Table C.39. Network Services::IMAP Mail settings

Field       Value
IMAP Port*  143
Expect*     OK
Timeout*    5
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.8.4. Network Services::Mail Transfer (SMTP)

The Network Services::Mail Transfer (SMTP) probe determines if it can connect to the SMTP port on the system. Specifying an optional port number overrides the default port 25. It collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the SMTP server to answer a connection request.

Table C.40. Network Services::Mail Transfer (SMTP) settings

Field       Value
SMTP Port*  25
Timeout*    10
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.8.5. Network Services::Ping

The Network Services::Ping probe determines if the SUSE Manager server can ping the monitored system or a specified IP address. It also checks the packet loss and compares the round trip average against the Warning and Critical threshold levels. The required Packets to send value allows you to control how many ICMP ECHO packets are sent to the system. This probe collects the following metrics:

  • Round-Trip Average — The time it takes in milliseconds for the ICMP ECHO packet to travel to and from the monitored system.

  • Packet Loss — The percent of data lost in transit.

Although optional, the IP Address field can be instrumental in collecting metrics for systems that have multiple IP addresses. For instance, if the system is configured with multiple virtual IP addresses or uses Network Address Translation (NAT) to support internal and external IP addresses, this option may be used to check a secondary IP address rather than the primary address associated with the hostname.

Note that this probe conducts the ping from a SUSE Manager server and not the monitored system. Populating the IP address field does not test connectivity between the system and the specified IP address but between the SUSE Manager server and the IP address. Therefore, entering the same IP address for Ping probes on different systems accomplishes precisely the same task. To conduct a ping from a monitored system to an individual IP address, use the Remote Ping probe instead. Refer to Section C.8.7, “Network Services::Remote Ping”.
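
To make the two metrics concrete, the sketch below derives them from the number of packets sent and the round-trip times of the replies that came back. It is only an illustration of the arithmetic; the function name ping_metrics and the sample timings are hypothetical, and packet loss is computed as the share of packets without a reply.

```python
def ping_metrics(packets_sent, round_trip_ms):
    """Return (round_trip_average_ms, packet_loss_percent).

    round_trip_ms holds one entry per reply received.
    """
    received = len(round_trip_ms)
    loss = 100.0 * (packets_sent - received) / packets_sent
    # With no replies at all there is no average to report.
    average = sum(round_trip_ms) / received if received else None
    return average, loss

# Hypothetical run: 20 packets sent (the default), 18 replies received.
replies = [12.0] * 9 + [14.0] * 9
average, loss = ping_metrics(20, replies)
print(average, loss)
```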

Table C.41. Network Services::Ping settings

Field                               Value
IP Address (defaults to system IP)
Packets to send*                    20
Timeout*                            10
Critical Maximum Round-Trip Average 
Warning Maximum Round-Trip Average 
Critical Maximum Packet Loss 
Warning Maximum Packet Loss 

C.8.6. Network Services::POP Mail

The Network Services::POP Mail probe determines if it can connect to the POP3 port on the system. A port number must be specified; specifying another port number overrides the default port 110. This probe collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the POP server to answer a connection request.

The required Expect value is the string to be matched against after a successful connection is made to the POP server. The probe looks for the string in the first line of the response from the system. The default is +OK. If the expected string is not found, the probe returns a CRITICAL state.
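
The connect-and-expect behavior described above can be sketched with a plain TCP socket. This is an illustration rather than the probe's implementation: the function name check_banner is hypothetical, and the states returned for timeouts and refused connections are assumptions based on the probe guidelines, not confirmed behavior of this probe.

```python
import socket
import time

def check_banner(host, port, expect, timeout):
    """Connect, read the first response line, and compare it with `expect`.

    Returns (state, latency_in_seconds); latency is None when no reply arrived.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            data = conn.recv(1024)
    except socket.timeout:
        # Assumption: a timeout maps to UNKNOWN, as the probe guidelines describe.
        return "UNKNOWN", None
    except OSError:
        # Assumption: any other connection failure is reported as CRITICAL.
        return "CRITICAL", None
    latency = time.monotonic() - start
    # Only the first line of the response is inspected.
    first_line = data.split(b"\n", 1)[0].decode("ascii", "replace")
    state = "OK" if expect in first_line else "CRITICAL"
    return state, latency
```

For a POP server this might be called as check_banner("mail.example.com", 110, "+OK", 10), where the host name is a placeholder.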

Table C.42. Network Services::POP Mail settings

Field     Value
Port*     110
Expect*   +OK
Timeout*  10
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.8.7. Network Services::Remote Ping

The Network Services::Remote Ping probe determines if the monitored system can ping a specified IP address. It also monitors the packet loss and compares the round trip average against the Warning and Critical threshold levels. The required Packets to send value allows you to control how many ICMP ECHO packets are sent to the address. This probe collects the following metrics:

  • Round-Trip Average — The time it takes in milliseconds for the ICMP ECHO packet to travel to and from the IP address.

  • Packet Loss — The percent of data lost in transit.

The IP Address field identifies the precise address to be pinged. Unlike the similar, optional field in the standard ping probe, this field is required. The monitored system directs the ping to a third address, rather than to the SUSE Manager server. Since the remote ping probe tests connectivity from the monitored system, another IP address must be specified. To conduct pings from the SUSE Manager server to a system or IP address, use the standard Ping probe instead. Refer to Section C.8.5, “Network Services::Ping”.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe.

Table C.43. Network Services::Remote Ping settings

Field             Value
IP Address*
Packets to send*  20
Timeout*          10
Critical Maximum Round-Trip Average 
Warning Maximum Round-Trip Average 
Critical Maximum Packet Loss 
Warning Maximum Packet Loss 

C.8.8. Network Services::RPCService

The Network Services::RPCService probe tests the availability of remote procedure call (RPC) programs on a given IP address. It collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the RPC server to answer a connection request.

RPC server programs, which provide function calls via the RPC network, register themselves with the RPC network by declaring a program ID and a program name. NFS is an example of a service that works via the RPC mechanism.

Client programs that wish to use the resources of RPC server programs do so by asking the machine on which the server program resides to provide access to RPC functions within the RPC program number or program name. These conversations can occur over either TCP or UDP (but are almost always UDP).

This probe allows you to test simple program availability. You must specify the program name or number, the protocol over which the conversation occurs, and the usual timeout period.

Table C.44. Network Services::RPCService settings

Field               Value
Protocol (TCP/UDP)  udp
Service Name*       nfs
Timeout*            10
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.8.9. Network Services::Secure Web Server (HTTPS)

The Network Services::Secure Web Server (HTTPS) probe determines the availability of the secure Web server and collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the HTTPS server to answer a connection request.

This probe confirms that it can connect to the HTTPS port on the specified host and retrieve the specified URL. If no URL is specified, the probe fetches the root document. The probe looks for the string HTTP/1 in the response header from the system unless you alter the Expect Header value. Specifying another port number overrides the default port of 443.

This probe supports authentication. Provide a username and password in the appropriate fields to use this feature. Unlike most other probes, this probe returns a CRITICAL status if it cannot contact the system within the timeout period.

Table C.45. Network Services::Secure Web Server (HTTPS) settings

Field           Value
URL Path        /
Expect Header   HTTP/1
Expect Content
UserAgent*      NOCpulse-check_http/1.0
Username
Password
Timeout*        10
HTTPS Port*     443
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.8.10. Network Services::SSH

The Network Services::SSH probe determines the availability of SSH on the specified port and collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the SSH server to answer a connection request.

Upon successfully contacting the SSH server and receiving a valid response, the probe displays the protocol and server version information. If the probe receives an invalid response, it displays the message returned from the server and generates a WARNING state.

Table C.46. Network Services::SSH settings

Field      Value
SSH Port*  22
Timeout*   5
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.8.11. Network Services::Web Server (HTTP)

The Network Services::Web Server (HTTP) probe determines the availability of the Web server and collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the HTTP server to answer a connection request.

This probe confirms that it can connect to the HTTP port on the specified host and retrieve the specified URL. If no URL is specified, the probe fetches the root document. The probe looks for the string HTTP/1 in the response header from the system unless you alter the Expect Header value. Specifying another port number overrides the default port of 80. Unlike most other probes, this probe returns a CRITICAL status if it cannot contact the system within the timeout period.

This probe supports authentication. Provide a username and password in the appropriate fields to use this feature. Also, the optional Virtual Host field can be used to monitor a separate documentation set located on the same physical machine presented as a standalone server. If your Web server is not configured to use virtual hosts (which is typically the case), you should leave this field blank. If you do have virtual hosts configured, enter the domain name of the first host here. Add as many probes as necessary to monitor all virtual hosts on the machine.

Table C.47. Network Services::Web Server (HTTP) settings

Field           Value
URL Path        /
Virtual Host
Expect Header   HTTP/1
Expect Content
UserAgent*      NOCpulse-check_http/1.0
Username
Password
Timeout*        10
HTTP Port*      80
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.9. Oracle 8i, 9i, 10g, and 11g

The probes in this section may be applied to instances of the Oracle database matching the versions supported. Oracle probes require the configuration of the database and associations made by running the following command:

$ORACLE_HOME/rdbms/admin/catalog.sql

In addition, for these probes to function properly, the Oracle user configured in the probe must have minimum privileges of CONNECT and SELECT_CATALOG_ROLE.

Some Oracle probes are specifically aimed at tuning devices for long-term performance gains, rather than avoiding outages. Therefore, it is recommended to schedule them to run less frequently, between every hour and every two days. This provides a better statistical representation, de-emphasizing anomalies that can occur at shorter time intervals. This applies to the following probes: Buffer Cache, Data Dictionary Cache, Disk Sort Ratio, Library Cache, and Redo Log.

For CRITICAL and WARNING thresholds based upon time to work as intended, their values cannot exceed the amount of time allotted to the timeout period. Otherwise, an UNKNOWN status is returned in all cases of extended latency, thereby nullifying the thresholds. For this reason, it is strongly recommended to ensure that timeout periods exceed all timed thresholds. In this section, this refers specifically to the probe TNS Ping.

Finally, customers using these Oracle probes against a database using Oracle's Multi-Threaded Server (MTS) must contact Novell support to have entries added to the SUSE Manager Server's /etc/hosts file to ensure that the DNS name is resolved correctly.

C.9.1. Oracle::Active Sessions

The Oracle::Active Sessions probe monitors an Oracle instance and collects the following metrics:

  • Active Sessions — The number of active sessions based on the value of V$PARAMETER.PROCESSES.

  • Available Sessions — The percentage of active sessions that are available based on the value of V$PARAMETER.PROCESSES.

Table C.48. Oracle::Active Sessions settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Timeout*          30
Critical Maximum Active Sessions 
Warning Maximum Active Sessions 
Critical Maximum Available Sessions Used 
Warning Maximum Available Sessions Used 

C.9.2. Oracle::Availability

The Oracle::Availability probe determines the availability of the database from SUSE Manager.

Table C.49. Oracle::Availability settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Timeout*          30

C.9.3. Oracle::Blocking Sessions

The Oracle::Blocking Sessions probe monitors an Oracle instance and collects the following metric:

  • Blocking Sessions — The number of sessions preventing other sessions from committing changes to the Oracle database, as determined by the required Time Blocking value you provide. Only those sessions that have been blocking for this duration, which is measured in seconds, are counted as blocking sessions.

Table C.50. Oracle::Blocking Sessions settings

Field                     Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*              1521
Time Blocking (seconds)*  20
Timeout*                  30
Critical Maximum Blocking Sessions 
Warning Maximum Blocking Sessions 

C.9.4. Oracle::Buffer Cache

The Oracle::Buffer Cache probe computes the Buffer Cache Hit Ratio so as to optimize the system global area (SGA) Database Buffer Cache size. It collects the following metrics:

  • Db Block Gets — The number of blocks accessed via single block gets (not through the consistent get mechanism).

  • Consistent Gets — The number of accesses made to the block buffer to retrieve data in a consistent mode.

  • Physical Reads — The cumulative number of blocks read from disk.

  • Buffer Cache Hit Ratio — The rate at which the database goes to the buffer instead of the hard disk to retrieve data. A low ratio suggests more RAM should be added to the system.
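
The ratio is conventionally derived from the three counters above: logical reads are the sum of db block gets and consistent gets, and every physical read is a logical read that missed the cache. The sketch below applies that conventional formula; whether the probe computes it in exactly this way, or reports it as a percentage, is an assumption, and the function name and sample counters are hypothetical.

```python
def buffer_cache_hit_ratio(db_block_gets, consistent_gets, physical_reads):
    """Return the buffer cache hit ratio as a percentage, or None if idle."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return None  # no reads yet, so the ratio is undefined
    # Share of logical reads that were served without going to disk.
    return 100.0 * (logical_reads - physical_reads) / logical_reads

# Hypothetical counters: 100,000 logical reads, 5,000 of which went to disk.
ratio = buffer_cache_hit_ratio(40_000, 60_000, 5_000)
print(ratio)
```

A value that stays below the Warning Minimum or Critical Minimum threshold over a long interval is the signal this probe is designed to raise.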

Table C.51. Oracle::Buffer Cache settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port       1521
Timeout*          30
Warning Minimum Buffer Cache Hit Ratio 
Critical Minimum Buffer Cache Hit Ratio 

C.9.5. Oracle::Client Connectivity

The Oracle::Client Connectivity probe determines if the database is up and capable of receiving connections from the monitored system. This probe opens an rhnmd connection to the system and issues a sqlplus connect command on the monitored system.

The Expected DB Name parameter is the expected value of V$DATABASE.NAME. This value is case-insensitive. A CRITICAL status is returned if this value is not found.

Requirements — The SUSE Manager monitoring daemon (rhnmd) must be running on the monitored system to execute this probe. For this probe to run, the nocpulse user must be granted read access to your log files.

Table C.52. Oracle::Client Connectivity settings

Field                           Value
Oracle Hostname or IP address*
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*                    1521
ORACLE_HOME*                    /opt/oracle
Expected DB Name*
Timeout*                        30

C.9.6. Oracle::Data Dictionary Cache

The Oracle::Data Dictionary Cache probe computes the Data Dictionary Cache Hit Ratio so as to optimize the SHARED_POOL_SIZE in init.ora. It collects the following metrics:

  • Data Dictionary Hit Ratio — The ratio of cache hits to cache lookup attempts in the data dictionary cache. In other words, the rate at which the database goes to the dictionary instead of the hard disk to retrieve data. A low ratio suggests more RAM should be added to the system.

  • Gets — The number of requests for information about data dictionary objects.

  • Cache Misses — The number of those requests that were not satisfied from the cache.

Table C.53. Oracle::Data Dictionary Cache settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Timeout*          30
Warning Minimum Data Dictionary Hit Ratio 
Critical Minimum Data Dictionary Hit Ratio 

C.9.7. Oracle::Disk Sort Ratio

The Oracle::Disk Sort Ratio probe monitors an Oracle database instance and collects the following metric:

  • Disk Sort Ratio — The rate of Oracle sorts that were too large to be completed in memory and were instead sorted using a temporary segment.

Table C.54. Oracle::Disk Sort Ratio settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Timeout*          30
Critical Maximum Disk Sort Ratio 
Warning Maximum Disk Sort Ratio 

C.9.8. Oracle::Idle Sessions

The Oracle::Idle Sessions probe monitors an Oracle instance and collects the following metric:

  • Idle Sessions — The number of Oracle sessions that are idle, as determined by the required Time Idle value you provide. Only those sessions that have been idle for this duration, which is measured in seconds, are counted as idle sessions.

Table C.55. Oracle::Idle Sessions settings

Field                 Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*          1521
Time Idle (seconds)*  20
Timeout*              30
Critical Maximum Idle Sessions 
Warning Maximum Idle Sessions 

C.9.9. Oracle::Index Extents

The Oracle::Index Extents probe monitors an Oracle instance and collects the following metrics:

  • Allocated Extents — The number of allocated extents for any index.

  • Available Extents — The percentage of available extents for any index.

The required Index Owner and Index Name fields contain a default value of % that matches any index owner or name.

Table C.56. Oracle::Index Extents settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Index Owner*      %
Index Name*       %
Timeout*          30
Critical Maximum of Allocated Extents 
Warning Maximum of Allocated Extents 
Critical Maximum of Available Extents 
Warning Maximum of Available Extents 

C.9.10. Oracle::Library Cache

The Oracle::Library Cache probe computes the Library Cache Miss Ratio so as to optimize the SHARED_POOL_SIZE in init.ora. It collects the following metrics:

  • Library Cache Miss Ratio — The rate at which a library cache pin miss occurs. This happens when a session executes a statement that it has already parsed but finds that the statement is no longer in the shared pool.

  • Executions — The number of times a pin was requested for objects of this namespace.

  • Cache Misses — The number of pins of objects with previous pins since the object handle was created that must now retrieve the object from disk.

Table C.57. Oracle::Library Cache settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Timeout*          30
Critical Maximum Library Cache Miss Ratio 
Warning Maximum Library Cache Miss Ratio 

C.9.11. Oracle::Locks

The Oracle::Locks probe monitors an Oracle database instance and collects the following metric:

  • Active Locks — The current number of active locks as determined by the value in the v$locks table. Database administrators should be aware of high numbers of locks present in a database instance.

Locks are used so that multiple users or processes updating the same data in the database do not conflict. This probe is useful for alerting database administrators when a high number of locks are present in a given instance.

Table C.58. Oracle::Locks settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Timeout*          30
Critical Maximum Active Locks 
Warning Maximum Active Locks 

C.9.12. Oracle::Redo Log

The Oracle::Redo Log probe monitors an Oracle database instance and collects the following metrics:

  • Redo Log Space Request Rate — The average number of redo log space requests per minute since the server has been started.

  • Redo Buffer Allocation Retry Rate — The average number of buffer allocation retries per minute since the server was started.

The metrics returned and the thresholds they are measured against are numbers representing the rate of change in events per minute. The rate of change for these metrics should be monitored because fast growth can indicate problems requiring investigation.

Table C.59. Oracle::Redo Log settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Timeout*          30
Critical Maximum Redo Log Space Request Rate 
Warning Maximum Redo Log Space Request Rate 
Critical Maximum Redo Buffer Allocation Retry Rate 
Warning Maximum Redo Buffer Allocation Retry Rate 

C.9.13. Oracle::Table Extents

The Oracle::Table Extents probe monitors an Oracle database instance and collects the following metrics:

  • Allocated Extents-Any Table — The total number of extents for any table.

  • Available Extents-Any Table — The percentage of available extents for any table.

In Oracle, table extents allow a table to grow. When a table is full, it is extended by an amount of space configured when the table is created. Extents are configured on a per-table basis, with an extent size and a maximum number of extents.

For example, a table that starts with 10 MB of space and that is configured with an extent size of 1 MB and max extents of 10 can grow to a maximum of 20 MB (by being extended by 1 MB ten times). This probe can be configured to alert on (1) the number of allocated extents (e.g. "go critical when the table has been extended 5 or more times"), or (2) the percentage of max extents used (e.g. "go critical when the table has exhausted 80% or more of its max extents").
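
The arithmetic in this example can be written out as a short sketch. It follows the simplified model used in the paragraph above (an initial allocation plus a fixed number of equally sized extensions); the function names are hypothetical and real Oracle storage parameters may behave differently.

```python
def max_table_size_mb(initial_mb, extent_mb, max_extents):
    """Largest size the table can reach once every extent has been allocated."""
    return initial_mb + extent_mb * max_extents

def extents_used_percent(allocated_extents, max_extents):
    """Share of the permitted extents that the table has already consumed."""
    return 100.0 * allocated_extents / max_extents

# The example from the text: 10 MB initially, 1 MB extents, at most 10 extents.
print(max_table_size_mb(10, 1, 10))   # maximum size in MB
print(extents_used_percent(8, 10))    # percentage after 8 extensions
```

The first alert style corresponds to a threshold on the allocated extent count, the second to a threshold on the percentage.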

The required Table Owner and Table Name fields contain a default value of % that matches any table owner or name.

Table C.60. Oracle::Table Extents settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Table Owner*      %
Table Name*       %
Timeout*          30
Critical Maximum Allocated Extents 
Warning Maximum Allocated Extents 
Critical Maximum Available Extents 
Warning Maximum Available Extents 

C.9.14. Oracle::Tablespace Usage

The Oracle::Tablespace Usage probe monitors an Oracle database instance and collects the following metric:

  • Available Space Used — The percentage of available space in each tablespace that has been used.

Tablespace is the shared pool of space in which a set of tables live. This probe alerts the user when the total amount of available space falls below the threshold. Tablespace is measured in bytes, so extents do not factor into it directly (though each extension removes available space from the shared pool).

The required Tablespace Name field is case insensitive and contains a default value of % that matches any tablespace name.

Table C.61. Oracle::Tablespace Usage settings

Field             Value
Oracle SID*
Oracle Username*
Oracle Password*
Oracle Port*      1521
Tablespace Name*  %
Timeout*          30
Critical Maximum Available Space Used 
Warning Maximum Available Space Used 

C.9.15. Oracle::TNS Ping

The Oracle::TNS Ping probe determines if an Oracle listener is alive and collects the following metric:

  • Remote Service Latency — The time it takes in seconds for the Oracle server to answer a connection request.

Table C.62. Oracle::TNS Ping settings

Field               Value
TNS Listener Port*  1521
Timeout*            15
Critical Maximum Remote Service Latency 
Warning Maximum Remote Service Latency 

C.10. SUSE Manager

The probes in this section may be applied to the SUSE Manager itself to monitor its health and performance. Since these probes run locally, no specific application or transport protocols are required.

C.10.1. SUSE Manager::Disk Space

The SUSE Manager::Disk Space probe monitors the free disk space on a SUSE Manager and collects the following metrics:

  • File System Used — The percent of the current file system now in use.

  • Space Used — The file size used by the current file system.

  • Space Available — The file size available to the current file system.

Table C.63. SUSE Manager::Disk Space settings

Field             Value
Device Pathname*  /dev/hda1
Critical Maximum File System Used 
Warning Maximum File System Used 
Critical Maximum Space Used 
Warning Maximum Space Used 
Critical Maximum Space Available 
Warning Maximum Space Available 

C.10.2. SUSE Manager::Execution Time

The SUSE Manager::Execution Time probe monitors the execution time for probes run from a SUSE Manager and collects the following metric:

  • Probe Execution Time Average — The seconds required to fully execute a probe.

Table C.64. SUSE Manager::Execution Time settings

Field  Value
Critical Maximum Probe Execution Time Average 
Warning Maximum Probe Execution Time Average 

C.10.3. SUSE Manager::Interface Traffic

The SUSE Manager::Interface Traffic probe monitors the interface traffic on a SUSE Manager and collects the following metrics:

  • Input Rate — The amount of traffic in bytes per second the device receives.

  • Output Rate — The amount of traffic in bytes per second the device sends.

Table C.65. SUSE Manager::Interface Traffic settings

Field               Value
Interface*          eth0
Timeout (seconds)*  30
Critical Maximum Input Rate 
Critical Maximum Output Rate 

C.10.4. SUSE Manager::Latency

The SUSE Manager::Latency probe monitors the latency of probes on SUSE Manager and collects the following metric:

  • Probe Latency Average — The lag in seconds between the time a probe becomes ready to run and the time it is actually run. Under normal conditions, this is generally less than a second. When SUSE Manager is overloaded (because it has too many probes with respect to their average execution time), the number goes up.

Table C.66. SUSE Manager::Latency settings

Field  Value
Critical Maximum Probe Latency Average 
Warning Maximum Probe Latency Average 

C.10.5. SUSE Manager::Load

The SUSE Manager::Load probe monitors the CPU load on a SUSE Manager and collects the following metric:

  • Load — The load average on the CPU for a 1-, 5-, and 15-minute period.

Table C.67. SUSE Manager::Load settings

Field  Value
Critical Maximum 1-minute Average 
Warning Maximum 1-minute Average 
Critical Maximum 5-minute Average 
Warning Maximum 5-minute Average 
Critical Maximum 15-minute Average 
Warning Maximum 15-minute Average 

C.10.6. SUSE Manager::Probe Count

The SUSE Manager::Probe Count probe monitors the number of probes on SUSE Manager and collects the following metric:

  • Probes — The number of individual probes running on SUSE Manager.

Table C.68. SUSE Manager::Probe Count settings

Field                           Value
Critical Maximum Probe Count 
Warning Maximum Probe Count 

C.10.7. SUSE Manager::Process Counts

The SUSE Manager::Process Counts probe monitors the number of processes on SUSE Manager and collects the following metrics:

  • Blocked — The number of processes that have been moved to the waiting queue and placed in the waiting state.

  • Child — The number of processes spawned by another process already running on the machine.

  • Defunct — The number of processes that have terminated (either because they have been killed by a signal or have called exit()) and whose parent processes have not yet received notification of their termination by executing some form of the wait() system call.

  • Stopped — The number of processes that have stopped before their executions could be completed.

  • Sleeping — The number of processes in the interruptible sleep state that can later be reintroduced into memory, resuming execution where they left off.

Table C.69. SUSE Manager::Process Counts settings

Field                                  Value
Critical Maximum Blocked Processes 
Warning Maximum Blocked Processes 
Critical Maximum Child Processes 
Warning Maximum Child Processes 
Critical Maximum Defunct Processes 
Warning Maximum Defunct Processes 
Critical Maximum Stopped Processes 
Warning Maximum Stopped Processes 
Critical Maximum Sleeping Processes 
Warning Maximum Sleeping Processes 
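
As a rough illustration of how per-state counts like these can be tallied, the sketch below groups process state codes of the kind printed in the STAT column of `ps`. The mapping from state letters to the metric names above is an assumption, not something this reference specifies, and the Child metric is left out because it depends on parent-child relationships rather than on process state. The function name and sample data are hypothetical.

```python
from collections import Counter

# Assumed mapping from the first letter of a ps STAT code to a metric name.
STATE_TO_METRIC = {
    "D": "Blocked",   # uninterruptible sleep, usually waiting on I/O
    "Z": "Defunct",   # terminated but not yet reaped by the parent
    "T": "Stopped",   # stopped, for example by a job-control signal
    "S": "Sleeping",  # interruptible sleep
}


def count_process_states(stat_codes):
    """Count processes per metric from an iterable of ps-style state codes."""
    counts = Counter()
    for code in stat_codes:
        metric = STATE_TO_METRIC.get(code[:1])
        if metric:
            counts[metric] += 1
    return counts


# Hypothetical state codes for seven processes.
sample = ["Ss", "S", "R+", "D", "Z", "T", "S<"]
counts = count_process_states(sample)
print(counts["Sleeping"], counts["Blocked"], counts["Defunct"], counts["Stopped"])
# 3 1 1 1
```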

C.10.8. SUSE Manager::Processes

The SUSE Manager::Processes probe monitors the number of processes on SUSE Manager and collects the following metric:

  • Processes — The number of processes running simultaneously on the machine.

Table C.70. SUSE Manager::Processes settings

Field                         Value
Critical Maximum Processes 
Warning Maximum Processes 

C.10.9. SUSE Manager::Process Health

The SUSE Manager::Process Health probe monitors customer-specified processes and collects the following metrics:

  • CPU Usage — The CPU usage percent for a given process.

  • Child Process Groups — The number of child processes spawned from the specified parent process. A child process inherits most of its attributes, such as open files, from its parent.

  • Threads — The number of running threads for a given process. A thread is the basic unit of CPU utilization, and consists of a program counter, a register set, and a stack space. A thread is also called a lightweight process.

  • Physical Memory Used — The amount of physical memory in kilobytes being used by the specified process.

  • Virtual Memory Used — The amount of virtual memory in kilobytes being used by the specified process, or the size of the process in real memory plus swap.

Specify the process by its command name or process ID (PID). Entering a PID overrides the entry of a command name. If no command name or PID is entered, the error Command not found is displayed and the probe is set to a CRITICAL state.
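
The precedence rule described above can be sketched as follows. This is only an illustration of the selection logic; the function name, its return format, and the use of an exception to stand in for the CRITICAL state are assumptions.

```python
def resolve_target(command_name=None, pid=None):
    """Pick the process selector, letting a PID override a command name."""
    if pid is not None:
        return ("pid", pid)
    if command_name:
        return ("command", command_name)
    # Neither field was entered; the probe would report CRITICAL here.
    raise LookupError("Command not found")


print(resolve_target(command_name="httpd", pid=1234))  # ('pid', 1234)
print(resolve_target(command_name="httpd"))            # ('command', 'httpd')
```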

Table C.71. SUSE Manager::Process Health settings

Field                                    Value
Command Name
Process ID (PID) file
Timeout*                                 15
Critical Maximum CPU Usage 
Warning Maximum CPU Usage 
Critical Maximum Child Process Groups 
Warning Maximum Child Process Groups 
Critical Maximum Threads 
Warning Maximum Threads 
Critical Maximum Physical Memory Used 
Warning Maximum Physical Memory Used 
Critical Maximum Virtual Memory Used 
Warning Maximum Virtual Memory Used 

C.10.10. SUSE Manager::Process Running

The SUSE Manager::Process Running probe verifies that the specified process is running. Specify the process by its command name or process ID (PID). Entering a PID overrides the entry of a command name. A CRITICAL status results if the probe cannot verify the command or PID.

Table C.72. SUSE Manager::Process Running settings

Field                              Value
Command Name 
Process ID (PID) file 
Critical Number Running Maximum 
Critical Number Running Minimum 

C.10.11. SUSE Manager::Swap

The SUSE Manager::Swap probe monitors the percent of free swap space available on SUSE Manager. A CRITICAL status results if the value falls below the Critical threshold. A WARNING status results if the value falls below the Warning threshold.

Table C.73. SUSE Manager::Swap settings

Field                                 Value
Critical Minimum Swap Percent Free 
Warning Minimum Swap Percent Free 
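
Unlike most thresholds in this group, the swap thresholds are minimums: the state worsens as the value falls. The sketch below shows that comparison together with one way to compute the percentage, by parsing the SwapTotal and SwapFree fields of Linux /proc/meminfo output. Whether the probe reads this file is not stated in this reference, so treat the data source, the function names, the OK state name, and the threshold values as assumptions.

```python
def swap_percent_free(meminfo_text):
    """Return free swap as a percentage, parsed from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            values[key] = int(rest.split()[0])  # value in kB
    total = values["SwapTotal"]
    if total == 0:
        return None  # no swap configured
    return 100.0 * values["SwapFree"] / total


def swap_state(percent_free, warning_min, critical_min):
    """Classify free swap against minimum thresholds (lower is worse)."""
    if percent_free < critical_min:
        return "CRITICAL"
    if percent_free < warning_min:
        return "WARNING"
    return "OK"


# Hypothetical /proc/meminfo excerpt and thresholds.
sample = "MemTotal: 8000000 kB\nSwapTotal: 2000000 kB\nSwapFree: 500000 kB\n"
percent = swap_percent_free(sample)
print(percent, swap_state(percent, warning_min=40, critical_min=20))
# 25.0 WARNING
```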

C.10.12. SUSE Manager::Users

The SUSE Manager::Users probe monitors the number of users currently logged into SUSE Manager. A CRITICAL status results if the value exceeds the Critical threshold. A WARNING status results if the value exceeds the Warning threshold.

Table C.74. SUSE Manager::Users settings

Field                     Value
Critical Maximum Users 
Warning Maximum Users