Chapter 4. Monitoring

Contents

4.1. Prerequisites
4.2. SUSE Manager Monitoring Daemon (rhnmd)
4.3. mysql package
4.4. Notifications
4.5. Probes
4.6. Troubleshooting

The Monitoring entitlement allows you to perform a range of actions designed to keep your systems running properly and efficiently. With it, you can keep close watch on system resources, network services, databases, and both standard and custom applications.

Monitoring provides both real-time and historical state-change information, as well as specific metric data. You are not only notified of failures immediately and warned of performance degradation before it becomes critical, but you are also given the information necessary to conduct capacity planning and event correlation. For instance, the results of a probe recording CPU usage across systems would prove invaluable in balancing loads on those systems.

Monitoring entails establishing notification methods, installing probes on systems, regularly reviewing the status of all probes, and generating reports displaying historical data for a system or service. This chapter seeks to identify common tasks associated with the Monitoring entitlement. Remember, virtually all changes affecting your Monitoring infrastructure must be finalized by updating your configuration, through the Scout Config Push page.

4.1. Prerequisites

Before attempting to implement monitoring within your infrastructure, ensure you have all of the necessary tools in place. At a minimum, you need:

  • Monitoring entitlements — These entitlements are required for all systems that are to be monitored. Monitoring is supported only on SUSE Linux Enterprise systems.

  • SUSE Manager with monitoring — Monitoring systems must be connected to SUSE Manager with a base operating system of SUSE Linux Enterprise 11. Refer to the SUSE Manager Installation Guide within Help for installation instructions.

  • Monitoring administrator — This role must be granted to users installing probes, creating notification methods, or altering the monitoring infrastructure in any way. (Remember, the SUSE Manager administrator automatically inherits the abilities of all other roles within an organization and can therefore conduct these tasks.) Assign this role through the User Details page for the user.

  • SUSE Manager monitoring daemon — This daemon, along with the SSH key for the scout, is required on systems that are monitored in order for the internal process monitors to be executed. You may, however, be able to run these probes using the systems' existing SSH daemon (sshd). Refer to Section 4.2, “SUSE Manager Monitoring Daemon (rhnmd)” for installation instructions and a quick list of probes requiring this secure connection. Refer to Appendix C, Probes for the complete list of available probes.

4.2. SUSE Manager Monitoring Daemon (rhnmd)

To get the most out of your monitoring entitlement, SUSE suggests installing the SUSE Manager monitoring daemon on your client systems. Based upon OpenSSH, rhnmd enables SUSE Manager to communicate securely with the client system to access internal processes and retrieve probe status.

Please note that the SUSE Manager monitoring daemon requires that monitored systems allow connections on port 4545. You may avoid opening this port and installing the daemon altogether by using sshd instead. Refer to Section 4.2.3, “Configuring SSH” for details.

4.2.1. Probes requiring the daemon

An encrypted connection, either through the SUSE Manager monitoring daemon or sshd, is required on client systems for the following probes to run:

  • Linux::CPU Usage

  • Linux::Disk IO Throughput

  • Linux::Disk Usage

  • Linux::Inodes

  • Linux::Interface Traffic

  • Linux::Load

  • Linux::Memory Usage

  • Linux::Process Counts by State

  • Linux::Process Count Total

  • Linux::Process Health

  • Linux::Process Running

  • Linux::Swap Usage

  • Linux::TCP Connections by State

  • Linux::Users

  • Linux::Virtual Memory

  • LogAgent::Log Pattern Match

  • LogAgent::Log Size

  • Network Services::Remote Ping

  • Oracle::Client Connectivity

  • General::Remote Program

  • General::Remote Program with Data

Note that all probes in the Linux group have this requirement.

4.2.2. Installing the SUSE Manager Monitoring Daemon

Install the SUSE Manager Monitoring Daemon to prepare systems for monitoring with the probes identified in Section 4.2.1, “Probes requiring the daemon”. Note that the steps in this section are optional if you intend to use sshd to allow secure connections between the monitoring infrastructure and the monitored systems. Refer to Section 4.2.3, “Configuring SSH” for instructions.

The rhnmd package can be found in the client tools channel for all SUSE Linux Enterprise distributions. To install it:

  1. Subscribe the systems to be monitored to the client tools channel associated with the system. This can be done individually through the System Details+Channels+Software subtab or for multiple systems at once through the Channel Details+Target Systems tab.

  2. Once subscribed, open the Channel Details+Packages tab and find the rhnmd package (under 'R').

  3. Click the package name to open the Package Details page. Go to the Target Systems tab, select the desired systems, and click Install Packages.

  4. Install the SSH public key on all client systems to be monitored, as described in Section 4.2.4, “Installing the SSH key”.

  5. Start the SUSE Manager monitoring daemon on all client systems using the command:

    rcrhnmd start
  6. When adding probes requiring the daemon, accept the default values for RHNMD User and RHNMD Port: nocpulse and 4545, respectively.

4.2.3. Configuring SSH

If you wish to avoid installing the SUSE Manager monitoring daemon and opening port 4545 on client systems, you may configure sshd to provide the encrypted connection required between the systems and SUSE Manager. This may be especially desirable if you already have sshd running. To configure the daemon for monitoring use:

  1. Ensure the SSH package is installed on the systems to be monitored:

    rpm -qi openssh
  2. Identify the user to be associated with the daemon. This can be any user available on the system, as long as the required SSH key can be put in the user's ~/.ssh/authorized_keys file.

  3. Identify the port used by the daemon, as identified in its /etc/ssh/sshd_config configuration file. The default is port 22.

  4. Install the SSH public key on all client systems to be monitored, as described in Section 4.2.4, “Installing the SSH key”.

  5. Start sshd on all client systems using the command:

    service sshd start
  6. When adding probes requiring the daemon, insert the values derived from steps 2 and 3 in the RHNMD User and RHNMD Port fields.
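
Step 3 above can be scripted. The following is a minimal sketch, not part of SUSE Manager: the function name sshd_ports is hypothetical, and it assumes the plain whitespace-separated "Port <number>" form of the directive. It does not follow Include files or handle the "Port=<number>" spelling.

```shell
#!/bin/sh
# Print the port(s) sshd is configured to listen on, one per line.
# Falls back to the default port 22 when no Port directive is set.
# $1 (optional): path to the sshd configuration file.
sshd_ports() {
    config="${1:-/etc/ssh/sshd_config}"
    awk '
        # Keywords are case-insensitive; commented lines start with "#"
        # and therefore never match here.
        tolower($1) == "port" { print $2; found = 1 }
        END { if (!found) print 22 }
    ' "$config"
}
```

Calling sshd_ports without arguments on a client system prints the value to enter in the RHNMD Port field.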

4.2.4. Installing the SSH key

Whether you use rhnmd or sshd, you must install the SUSE Manager monitoring daemon public SSH key on the systems to be monitored to complete the secure connection. To install it:

  1. Navigate to the Monitoring+Scout Config Push page on the SUSE Manager interface and click the name of the Scout that will monitor the client system. The SSH id_dsa.pub key is visible on the resulting page.

  2. Copy the character string (beginning with ssh-dss and ending with the hostname of the SUSE Manager server).

  3. Select the systems that should receive the key: click the Systems tab, select Systems from the left menu, check the box next to each system that should receive the SSH key, and click the Manage button at the top.

  4. From the System Set Manager, click Run remote commands, then in the Script text box, type the following line:

    #!/bin/sh
    cat <<EOF >> ~nocpulse/.ssh/authorized_keys

    Then press Enter, paste the SSH key, and finish with a line containing only EOF. The result should look similar to the following:

    #!/bin/sh
    cat <<EOF >> ~nocpulse/.ssh/authorized_keys
    ssh-dss AABBAB3NzaC3kc3MABCCBAJ4cmyf5jt/ihdtFbNE1YHsT0np0SYJz7xk
    hzoKUUWnZmOUqJ7eXoTbGEcZjZLppOZgzAepw1vUHXfa/L9XiXvsV8K5Qmcu70h0
    1gohBIder/1I1QbHMCgfDVFPtfV5eedau4AAACAc99dHbWhk/dMPiWXgHxdI0vT2
    SnuozIox2klmfbTeO4Ajn/Ecfxqgs5diat/NIaeoItuGUYepXFoVv8DVL3wpp45E
    02hjmp4j2MYNpc6Pc3nPOVntu6YBv+whB0VrsVzeqX89u23FFjTLGbfYrmMQflNi
    j8yynGRePIMFhI= root@example.com
    EOF
  5. Set the date and time you want for the action to take place, then click Schedule Remote Command.

Once the key is in place and accessible, all probes that require it should allow ssh connections between the Monitoring infrastructure and the monitored system. You may then schedule probes requiring the monitoring daemon to run against the newly configured systems.
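
The remote command in step 4 appends the key each time it runs, so scheduling it twice leaves duplicate entries. As a sketch only, the variant below appends the key only when it is missing. The function name install_key and its arguments are hypothetical. It also assumes the script runs as a user allowed to write the target file, and it does not change file ownership, which must remain with the user the probes connect as.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already present.
# $1: the complete public key line
# $2: path to the authorized_keys file
install_key() {
    key="$1"
    auth_file="$2"
    ssh_dir=$(dirname "$auth_file")
    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    # sshd ignores key files with overly permissive modes in its default setup.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

In a remote command this could be called as install_key "<pasted key>" ~nocpulse/.ssh/authorized_keys, with the key taken from the Scout Config Push page.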

4.3. mysql package

If your SUSE Manager will serve monitoring-entitled client systems against which you wish to run MySQL probes, you must configure the mysql package on the SUSE Manager. Refer to Appendix C, Probes for a listing of all available probes.

Subscribe the SUSE Manager to the SUSE Linux Enterprise base channel and install the mysql package using either zypper or YaST.

Once finished, your SUSE Manager may be used to schedule MySQL probes.

4.4. Notifications

In addition to viewing probe status within the SUSE Manager interface, you may be notified whenever a probe changes state. This is especially important when monitoring mission-critical production systems.

To enable probe notifications within SUSE Manager, you must have identified a mail exchange server and mail domain during installation of your SUSE Manager and configured sendmail to properly handle incoming mail. Refer to the Installation chapter of the SUSE Manager Installation Guide for details.

4.4.1. Creating Notification Methods

Notifications are sent via a notification method, an email or pager address associated with a specific SUSE Manager user. Although the address is tied to a particular user account, it may serve multiple administrators through an alias or mailing list. Each user account can contain multiple notification methods. To create a notification method:

  1. Log into the SUSE Manager website as either a SUSE Manager administrator or monitoring administrator.

  2. Navigate to the User Details+Notification Methods tab and click create new method.

  3. Enter an intuitive, descriptive label for the method name, such as DBA day email, and provide the correct email or pager address. Remember, the labels for all notification methods are available in a single list during probe creation, so they should be unique to your organization.

  4. Select the checkbox if you desire abbreviated messages to be sent to the pager. This shorter format contains only the probe state, system hostname, probe name, time of message, and Send ID. The standard, longer format displays additional message headers, system and probe details, and instructions for response.

  5. When finished, click Create Method. The new method shows up in the User Details+Notification Methods tab and the Notification page under the top Monitoring category. Click its name to edit or delete it.

  6. While adding probes, select the Probe Notifications checkbox and select the new notification method from the resulting dropdown menu. Notification methods assigned to probes cannot be deleted until they are disassociated from the probe.

4.4.2. Receiving Notifications

If you create notification methods and associate them with probes, you must be prepared to receive them. These notifications come in the form of brief text messages sent to either email or pager addresses. Here is an example of an email notification:

As you can see, the longer email notifications contain virtually everything you would need to know about the associated probe. In addition to the probe command, run time, system monitored, and state, the message contains the Send ID, which is a unique character string representing the precise message and probe. In the above message, the Send ID is 01dc8hqw.

Pager notifications, by necessity, contain only the most important details, namely the subject of the email message (containing state, system, probe, and time) and the Send ID. Here is an example pager notification:

4.4.3. Redirecting Notifications

Upon receiving a notification, you may redirect it by including advanced notification rules within an acknowledgment email. Just reply to the notification and include the desired option. These are the possible redirect options, or filter types:

  • ACK METOO — Sends the notification to the redirect destination(s) in addition to the default destination.

  • ACK SUSPEND — Suspends the notification method for a specified time period.

  • ACK AUTOACK — Does not change the destination of the notification, but automatically acknowledges matching alerts as soon as they are sent.

  • ACK REDIR — Sends the notification to the redirect destination(s) instead of the default destination.

The format of the rule should be filter_type probe_type duration email_address where filter_type indicates one of the previous advanced commands, probe_type indicates probe or system, duration indicates the length of time for the redirect, and email_address indicates the intended recipient. For example:

ACK METOO system 1h boss@domain.com

Capitalization is not required. Duration can be listed in minutes (m), hours (h), or days (d). Email addresses are needed only for redirects (REDIR) and supplemental (METOO) notifications.
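
To illustrate the duration syntax, the following sketch converts a duration token into minutes. It is not how SUSE Manager itself parses replies. The function name duration_to_minutes is hypothetical, and the sketch assumes the duration is a single token made of digits followed by one unit letter, as in the example above.

```shell
#!/bin/sh
# Convert a redirect duration such as 30m, 1h or 2d into minutes.
# Prints the result, or returns 1 if the argument is malformed.
duration_to_minutes() {
    number="${1%[mhdMHD]}"       # everything except a trailing unit letter
    unit="${1#"$number"}"        # the unit letter, or empty if none was given
    case "$number" in
        ''|*[!0-9]*) return 1 ;; # require at least one digit, digits only
    esac
    # Drop leading zeros so the shell does not read the number as octal.
    while [ "${#number}" -gt 1 ] && [ "${number#0}" != "$number" ]; do
        number="${number#0}"
    done
    case "$unit" in
        m|M) echo "$number" ;;
        h|H) echo $((number * 60)) ;;
        d|D) echo $((number * 1440)) ;;
        *)   return 1 ;;
    esac
}
```

For the rule shown above, duration_to_minutes 1h prints 60, meaning the supplemental notification stays in effect for one hour.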

The description of the action contained in the resulting email defaults to the command entered by the user. The reason listed is a summary of the action, such as email ack redirect by user@domain.com where user equals the sender of the email.

[Note]

You can halt or redirect almost all probe notifications by replying to a notification email with a variation of the command ack suspend host. However, you cannot halt SUSE Manager probe notifications by responding to a probe notification with ack suspend host or other redirect responses. These probes require you to change the notifications within the SUSE Manager web interface.

4.4.4. Filtering Notifications

Since notifications can be generated whenever a probe changes state, simple changes in your network can result in a flood of notifications. The creation, cancellation, and application of notification filters is discussed in detail in Section 3.10.3.1, “Notification+Filters”.

4.4.5. Deleting Notification Methods

Theoretically, removing notification methods should be as easy as creating them. After all, no fields need to be filled in to conduct the deletion and a button exists for this explicit purpose. However, existing relationships between methods and probes can complicate this process. Follow these steps to remove a notification method:

  1. Log into the SUSE Manager website as a SUSE Manager administrator or monitoring administrator.

  2. Navigate to the Monitoring+Notifications page and click the name of the method to be removed.

  3. On the User Details+Notification Methods tab, click delete method. If the method is not associated with any probes, you are presented with a confirmation page. Click Confirm Deletion. The notification method is removed.

    [Note]

    Since both the notification method name and address can be edited, consider updating the method rather than deleting it. This redirects notifications from all probes using the method without having to edit each probe and create a new notification method.

  4. If the method is associated with one or more probes, you are presented with a list of the probes using the method and the systems to which the probes are attached instead of a confirmation page. Click the probe name to go directly to the System Details+Probes tab.

  5. On the System Details+Probes tab, select another notification method and click Update Probe.

  6. You may now return to the Monitoring+Notifications page and delete the notification method.

4.5. Probes

Now that the SUSE Manager monitoring daemon has been installed and notification methods have been created, you may begin installing probes on your monitoring-entitled systems. If a system is entitled to monitoring, a Probes tab appears within its System Details page. This is where you will conduct most probe-related work.

4.5.1. Managing Probes

To add a probe to a system, the system must be entitled to monitoring. Further, you must have access to the system itself, either as the system's root user, through the system group administrator role, or as the SUSE Manager administrator. Then:

  1. Log into the SUSE Manager website as either a SUSE Manager administrator or the system group administrator for the system.

  2. Navigate to the System Details+Probes tab and click create new probe.

  3. On the System Probe Creation page, complete all required fields. First, select the probe command group. This alters the list of available probes and other fields and requirements. Refer to Appendix C, Probes for the complete list of probes by command group. Remember that some probes require the SUSE Manager network monitoring daemon to be installed on the client system.

  4. Select the desired probe command and the monitoring scout. Enter a brief but unique description for the probe.

  5. Select the Probe Notifications checkbox to receive notifications when the probe changes state. Use the Probe Check Interval dropdown menu to determine how often notifications should be sent. Selecting 1 minute (and the Probe Notifications checkbox) means you will receive notifications every minute the probe surpasses its CRITICAL or WARNING thresholds. Refer to Section 4.4, “Notifications” to find out how to create notification methods and acknowledge their messages.

  6. Use the RHNMD User and RHNMD Port fields, if they appear, to force the probe to communicate via sshd, rather than the SUSE Manager monitoring daemon. Refer to Section 4.2.3, “Configuring SSH” for details. Otherwise, accept the default values of nocpulse and 4545, respectively.

  7. If the Timeout field appears, review the default value and adjust to meet your needs. Most but not all timeouts result in an UNKNOWN state. If the probe's metrics are time-based, ensure the timeout is not less than the time allotted to thresholds. Otherwise, the metrics serve no purpose, as the probe will time out before any thresholds are crossed.

  8. Use the remaining fields to establish the probe's alert thresholds, if applicable. These CRITICAL and WARNING values determine at what point the probe has changed state. Refer to Section 4.5.2, “Establishing Thresholds” for best practices regarding these thresholds.

  9. When finished, click Create Probe. Remember, you must commit your monitoring configuration change on the Scout Config Push page for this to take effect.

To delete a probe, navigate to its Current State page (by clicking the name of the probe from the System Details+Probes tab), and click delete probe. Finally, confirm the deletion.

4.5.2. Establishing Thresholds

Many of the probes offered by SUSE Manager contain alert thresholds that, when crossed, indicate a change in state for the probe. For instance, the Linux::CPU Usage probe allows you to set CRITICAL and WARNING thresholds for the percent of CPU used. If the monitored system reports 75 percent of its CPU used, and the WARNING threshold is set to 70 percent, the probe will go into a WARNING state. Some probes offer a multitude of such thresholds.
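
The threshold logic can be sketched as a small function. This is an illustration only, not the probe implementation. The name probe_state is hypothetical, values are assumed to be whole numbers, and a threshold is assumed to be crossed only when the value is strictly greater than it.

```shell
#!/bin/sh
# Classify a metric value against WARNING and CRITICAL thresholds.
# $1: measured value, $2: WARNING threshold, $3: CRITICAL threshold
probe_state() {
    value="$1"; warn="$2"; crit="$3"
    # Check the more severe threshold first so it takes precedence.
    if [ "$value" -gt "$crit" ]; then
        echo CRITICAL
    elif [ "$value" -gt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}
```

With the figures from the example above, probe_state 75 70 90 prints WARNING. The CRITICAL threshold of 90 is an illustrative value.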

In order to get the most out of your monitoring entitlement and avoid false notifications, it is recommended to run your probes without notifications for a time to establish baseline performance for each of your systems. Although the default values provided for probes may suit you, every organization has a different environment that may require altering thresholds.
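
One way to summarize a baseline period is to compute the minimum, average, and maximum of the recorded values. The sketch below assumes the samples have already been exported as one number per line. How they are exported is not covered here, and the function name baseline_stats is hypothetical.

```shell
#!/bin/sh
# Summarize metric samples read from standard input, one number per line.
# Prints a single line of the form: min=<n> avg=<n.n> max=<n>
baseline_stats() {
    awk '
        NF == 0 { next }                   # ignore blank lines
        {
            v = $1 + 0                     # force numeric interpretation
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            sum += v; n++
        }
        END { if (n) printf "min=%s avg=%.1f max=%s\n", min, sum / n, max }'
}
```

A WARNING threshold set somewhat above the observed maximum is a reasonable starting point. The right margin depends on your environment.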

4.5.3. Monitoring the SUSE Manager Server

In addition to monitoring all of your client systems, you may also use SUSE Manager to monitor your SUSE Manager server. To monitor your SUSE Manager server, find a system monitored by the server, and go to that system's System Details+Probes tab.

Click create new probe and select the SUSE Manager probe command group. Next, complete the remaining fields as you would for any other probe. Refer to Section 4.5.1, “Managing Probes” for instructions.

Although the SUSE Manager server appears to be monitored by the client system, the probe is actually run from the server on itself. Thresholds and notifications work normally.

[Note]

Any probes that require SUSE Manager network monitoring daemon connections cannot be used against a SUSE Manager or a SUSE Manager proxy server on which monitoring software is running. This includes most probes in the Linux command group as well as the log agent probes and the remote program probes. Use the SUSE Manager command group probes to monitor SUSE Manager and SUSE Manager proxy servers. In the case of proxy scouts, the probes are listed under the system for which they are reporting data.

4.6. Troubleshooting

Though all monitoring-related activities are conducted through the SUSE Manager website, SUSE provides access to some command line diagnostic tools that may help you determine the cause of errors. To use these tools, you must be able to become the nocpulse user on the SUSE Manager server conducting the monitoring.

First log into the SUSE Manager server as root. Then switch to the nocpulse user with the following command:

su - nocpulse

You may now use the diagnostic tools described within the rest of this section.

4.6.1. Examining Probes with rhn-catalog

To thoroughly troubleshoot a probe, you must first obtain its probe ID. You may obtain this information by running rhn-catalog on the SUSE Manager server as the nocpulse user. The output will resemble:

2 ServiceProbe on exa1.example.com (199.168.36.245): test 2
3 ServiceProbe on exa2.example.com (199.168.36.173): rhel2.1 test
4 ServiceProbe on exa3.example.com (199.168.36.174): SSH
5 ServiceProbe on exa4.example.com (199.168.36.175): HTTP

The probe ID is the first number, while the probe name (as entered in the SUSE Manager website) is the final entry on the line. In the above example, the 5 probe ID corresponds to the probe named HTTP.
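
When the catalog is long, the ID can be looked up by probe name. The filter below is a sketch that assumes the output format shown above, where the name follows the first "): " on each line. The function name probe_id_by_name is hypothetical.

```shell
#!/bin/sh
# Read rhn-catalog style lines on standard input and print the ID of every
# probe whose name equals $1.
probe_id_by_name() {
    awk -v name="$1" '
        {
            pos = index($0, "): ")   # end of the "(address)" part
            if (pos && substr($0, pos + 3) == name) print $1
        }'
}
```

As the nocpulse user, rhn-catalog | probe_id_by_name HTTP would print 5 for the listing above.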

Further, you may pass the --commandline (-c) and --dump (-d) options along with a probe ID to rhn-catalog to obtain additional details about the probe, like so:

rhn-catalog --commandline --dump 5

The --commandline option yields the command parameters set for the probe, while --dump retrieves everything else, including alert thresholds and notification intervals and methods.

The command above will result in output similar to:

5 ServiceProbe on exa4.example.com (199.168.36.175):
linux:cpu usage
      Run as: Unix::CPU.pm --critical=90 --sshhost=199.168.36.175
--warn=70 --timeout=15 --sshuser=nocpulse
--shell=SSHRemoteCommandShell --sshport=4545

Now that you have the ID, use it with rhn-runprobe to examine the probe's output. Refer to Section 4.6.2, “Viewing the output of rhn-runprobe” for instructions.

4.6.2. Viewing the output of rhn-runprobe

Now that you have obtained the probe ID with rhn-catalog, use it in conjunction with rhn-runprobe to examine the complete output of the probe. Note that by default, rhn-runprobe works in test mode, meaning no results are entered in the database. Here are its options:

Table 4.1. rhn-runprobe Options

Option                 Description
--help                 List the available options and exit.
--probe=PROBE_ID       Run the probe with this ID.
--prob_arg=PARAMETER   Override any probe parameters from the database.
--module=PERL_MODULE   Package name of alternate code to run.
--log=all=LEVEL        Set log level for a package or package prefix.
--debug=LEVEL          Set numeric debugging level.
--live                 Execute the probe, enqueue data and send out notifications (if needed).

At a minimum, you should include the --probe option, the --log option, and values for each. The --probe option takes the probe ID as its value, and the --log option takes the value "all" (for all run levels) followed by a numeric verbosity level. Here is an example:

rhn-runprobe --probe=5 --log=all=4

The above command requests the probe output for probe ID 5, for all run levels, with a high level of verbosity.

More specifically, you may provide the command parameters derived from rhn-catalog, like so:

rhn-runprobe --probe=5 --log=all=4 --sshuser=nocpulse --sshport=4545

This yields verbose output depicting the probe's attempted execution. Errors are clearly identified.
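
To check several probes in one pass, the catalog listing can be turned into one rhn-runprobe command per probe. The following sketch only prints the commands, so they can be reviewed before anything is run. The function name runprobe_commands is hypothetical, and the sketch assumes each catalog line starts with the numeric probe ID.

```shell
#!/bin/sh
# Read rhn-catalog style lines on standard input and print one rhn-runprobe
# command per probe ID.
# $1 (optional): numeric verbosity level for --log=all=, default 4.
runprobe_commands() {
    level="${1:-4}"
    awk -v level="$level" '
        # Only lines that start with a numeric probe ID produce a command.
        $1 ~ /^[0-9]+$/ {
            printf "rhn-runprobe --probe=%s --log=all=%s\n", $1, level
        }'
}
```

As the nocpulse user, rhn-catalog | runprobe_commands 4 lists the commands. Because rhn-runprobe works in test mode unless --live is given, running them does not enter results in the database.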