NAME
managelogs - Piped logging program to rotate/purge Apache logs
SYNOPSIS
managelogs
[ options1 ] <base-path1> [[ options2 ] <base-path2> ...]
DESCRIPTION
managelogs is a log processing program primarily intended to be used in
conjunction with Apache's
piped logfile feature. It automatically rotates and purges log files based
on a set of user-defined options.
Features include :
-
- Rotation on a file size limit or time delay
- Rotation on an external signal
- Can run an external command in background each time a rotation occurs
- Purge log files based on a combination of global size limit, number of log
files, and time delay
- Integrated on-the-fly compression,
- Maintains symbolic links on log files (active and/or backup),
- Can change its uid/gid to run as a given (non-root) user/group,
- Ensures that rotations occur on line boundaries,
- Maintains its state across stop/start/restart operations
- Supports reading its data from a named pipe (fifo)
A managelogs process continuously reads data (by default, from its standard
input) and sends it in parallel to its
log manager(s).
A log manager should be considered as an output channel. Each log
manager is defined
on the command line by a set of options followed by a path named
base path
, as this path is used as the base other file names are computed
from. Each set of '[ options ] <base-path>' arguments defines a
log manager. The direcotry part of the
base path
is the
base directory
There is no limit to the number of log managers a
managelogs process can handle. They all receive the same data and each of them
processes it according to its configuration.
Note: In most cases, and especially
during your first steps with managelogs, you will define only one log
manager per managelogs process, making no difference between the log manager
and the managelogs process it belongs to.
A log manager manages an
active
log file (the file it is currently writing into) and a set of
backup
log files.
Backup
log files are previous active log files which have been closed during a
rotation. The default behavior is to create the active and backup log files
in the base directory, but an option allows log files to reside in another
location.
Every time the global limits
are exceeded, a
purge
operation takes place and the oldest backup file is deleted.
Each log manager also maintains a
status file
which, among other information, contains the paths of the active and backup log
files. Thanks to this file, a managelogs process can be stopped and restarted
without loosing its state. When a managelogs process is restarted, each log
manager recovers the state it was in when the process was previously stopped.
Note : unlike other log management programs like
rotatelogs, (re)starting the Apache server does not cause its managelogs
processes to switch to a new log file. The only way to force a file rotation
from the 'outside' is to send a SIGUSR1 signal to the managelogs process
(see SIGNALS).
Another difference with other log rotation programs is that different
processes cannot write to the same log file. A given
base path
must be privately owned and managed by only one log manager. If another program
writes to one of the files maintained by this log manager, the result is unpredictable.
The only exception to this rule is that backup log files can be deleted
(but not written to) at any time without disturbing the logic of the purge
engine.
OPTIONS
---- Global options (these options apply to the whole managelogs process) :
- -h|--help
-
Display the list of available options and exit
- -V|--version
-
Display current version and exit
- -u|--user <id>
-
run with this user/group ID (usable only when the program is started
by root)
<id> = <uid>[:<gid>], where <uid> and <gid> are
user/group names or numeric ids.
- -I|--stats
-
Display internal statistics before exiting (used for troubleshooting, debugging,
or performance tests)
- -R|--refresh-only
-
Just refresh/purge log files and exit
- -i|--input <path>
-
Read data from <path> instead of standard input.
See 'READING FROM A NAMED PIPE' below for more info on named pipe support.
---- Log manager options (apply to the next <base-path> only) :
- -v|--verbose
-
Increment debug level (can be set more than once).
- -d|--debug <path>
-
Write debug messages to <path>.
Special values : 'stdout' and 'stderr' respectively correspond to
the process' standard output and standard error streams.
Default : debug messages go to stderr.
- -c|--compress <comp>[:<level>]
-
Activate compression and append the corresponding suffix to the log file names.
<comp> may be gz or bz2 (this is also the value of the
suffix appended to the log file names).
<level> is one of {1|2|3|4|5|6|7|8|9|best|fast}
Default compression level: best
- -s|--size <size>
-
Sets the log file size at which rotation occurs.
<size> is a numeric value
optionnally followed by a unit : K (Kilobytes), M (Megabytes), or
G (Gigabytes).
Default: no limit
Note that the size we set here is the size the file takes on disk. If compression
is turned on, the limit is checked against the compressed size.
- -r|--rotate-delay <delay>
-
Sets the maximum delay between the creation of a new log file and the next
rotation.
<delay> is a suite of patterns in the form '[0-9]+[dhms]' (d=days, h=hours,
m=minutes, s=seconds)
Example : 1d12h = 1 day and 12 hours (same as '36h')
- -p|--purge-delay<delay>
-
Removes backup log files older than <delay>
Argument: same format as for '--rotate-delay'
- -S|--global-size <size>
-
Sets the maximum size that the managed log files (active + backup) can take on
disk. As soon as this size is exceeded, a purge occurs (the oldest backup file
is removed).
Argument: same format as for '--size'. If this option is set and the '--size'
option is not, the individual file limit is implicitely set to 1/2 of
the global limit (so that the directory always contains at least one backup
file).
Default : no global limit
- -m|--mode <mode>
-
File permissions to apply to newly-created log files.
<mode> is a numeric octal Unix-style file permission (see chmod(1) for more).
Default mode: 644
- -k|--keep <n>
-
Keep only <n> log files (the active one + <n-1> backups). This
option is an alternative to the '--global-size' option, but can also be
used in conjunction, especially if you send signals to trigger rotations
before the size limit is reached.
- -P|--log-path <path>
-
Use <path> instead of the base path when creating a new log
file. <path> can be a relative or absolute path. If it is a relative
path, it is relative to the base directory.
- -l|--link
-
Maintain a link file (named <base-path>) to the active log file.
- -L|--backup-links
-
Also maintain links to the backup log files (backup link are named
<base-path>.{B.001[.gz|.bz2],B.002[.gz|.bz2],...}, most recent first)
- -H|--hardlink
-
Create hard links instead of symbolic links.
Note: when using hard links with a separate log path, the base and log paths
must reside in the same file system.
- -e|--ignore-eol
-
By default, managelogs ensures that log file rotation occurs on line boundaries,
so that every log files contain entire lines. This option disables this
buffering mechanism.
- -C|--rotate-cmd <path>
-
Execute <path> in background each time a rotation occurs.
See 'ROTATE COMMAND' below for more info on this option
- -x|--ignore-enospc
-
Ignore 'file system full' errors. This option causes managelogs to silently
discard data when there is not enough free space to write it on disk.
Warning : This behavior should be activated only if you really understand the
consequences, especially concerning possible log data corruption. If you are not
sure, avoid this option.
FILES
Each log manager maintains its own set of files. The files are named after the
log manager's base and log paths. The directory part of these paths must exist
before managelogs is started. They must also be writable by the user managelogs
is running as. By default, the log path is the same as the base path.
Here are the files that a log manager creates and maintains :
- <base-path>.pid
-
This file is present when a process is currently managing this base path. It
contains
the pid of the managelogs process. This is the file to read to know who to send
signals to. When the process exits, the pid file is removed.
- <base-path>.status
-
The status file. As described above, this file allows a log manager to recover
its previous state at start time. This way, the memory of active and backup
files is kept.
- <log-path>._<xxxxxxxx>[.gz|.bz2]
-
A log file. The <xxxxxxxx> part of the name is a unique identifier
computed
by the log manager when the file is created. When several log files are present,
their alphabetical order corresponds to their creation time chronological
order. So, when you list a directory in
alphabetical order, the oldest backup
log file comes first, and the active log
file comes last, so that the 'cat <base-path>._*' command displays the
whole log data in chronological order.
When compression is turned on, the log manager automatically appends the
compression type to the file name.
- <base-path>
-
If the '--link' option is set, the log manager maintains a link
from <base-path> to the active log file. By default, it is a symbolic link,
but the '--hardlink' option allows to use hard links instead.
- <base-path>.B.{001[.gz|.bz2],002[.gz|.bz2],...}
-
These are also links, but to the backup log files. They are created and
maintained only if the '--backup-links' option was set. The files are numbered
in reverse chronological order : <base-path>.B.001[.gz|.bz2] is the most recent backup,
<base-path>.B.002[.gz|.bz2] is the previous one...
SIGNALS
- SIGUSR1
-
This signal triggers an immediate rotation on every log managers attached to
the process. Note that the rotation can cause the global conditions
to be exceeded. In this case, a purge will follow.
- SIGUSR2
-
This signal causes every log managers to flush to disk the data they may
have in memory. This is especially useful for compressed streams, as compressed
files
cannot be read before such a flush operation is executed. This is due to the
fact that a compressed file must contain a trailer block to be valid. As long
as the compression engine processes the data, this trailer block is not
written and, if you try to read the compressed data from the file, it is
considered as invalid. When you send a SIGUSR2 signal to the process, the compression
engine flushes the data it currently has in memory, writes the corresponding
trailer data to the file, and starts a new block. Then, you can uncompress
the data from the compressed file. Note that this flush operation adds about
16 bytes to the log file, so it shouldn't be done too often.
- SIGTERM
-
Causes the managelogs process to exit cleanly (flush data to disk, update status
file, and exit). You will need this signal when reading from a named
pipe as, in this case, this is the only way to stop the managelogs process.
SIGKILL should never be sent as it cannot be trapped
and will create inconsistencies in the status file.
ROTATE COMMAND
Every time managelogs decides to switch to a new log file, whatever reason it
may have for this, an external command can be executed. This is what we call the
rotate command.
This command is set via the --rotate-cmd option on the managelogs command line.
The option value is the path of an executable file (binary or script).
This executable file is run in background and its return code is ignored.
Actually, once launched, the subprocess is totally forgotten by the managelogs
process. So,
there is no limit to the time it may take, as it does not suspend managelogs
execution.
The subprocess receives several environment variables from managelogs :
- LOGMANAGER_FILE_PATH
-
The path to the log file managelogs just closed. In a statistics gathering
scenario, for instance, the data to integrate will be read from this file.
- LOGMANAGER_BASE_PATH
-
This is the
base path
associated with this log manager.
- LOGMANAGER_BASE_DIR
-
This is the directory part of the
base path
- LOGMANAGER_LOG_PATH
-
This is the
log path
as the base path.
- LOGMANAGER_LOG_DIR
-
This is the directory part of the
log path
- LOGMANAGER_COMPRESSION
-
This is the compression type used to write to the log file. If compression
is off, contains an empty string.
- LOGMANAGER_VERSION
-
The version of the log manager library.
- LOGMANAGER_TIME
-
The current time in Unix numeric format (number of seconds since 01/Jan/1970).
All the paths transmitted in these variables are absolute paths, even if
relative paths were provided on the command line.
Note : During its execution, the rotate command is allowed to delete the
file pointed to by $LOGMANAGER_FILE_PATH. You may do it, for instance, if you just
want some statistics without keeping the detailed logs, or if you use the rotate
command to transfer the log file to another location/server.
Also note that you shouldn't assume anything about the default directory the
command is executed in. Either you explicitely set a default directory at
the beginning of your script ('cd $LOGMANAGER_BASE_DIR' for instance), or you
must use absolute paths only. Relative
paths are supported on the managelogs command line because they are
internally converted to absolute paths before being used.
READING FROM A NAMED PIPE
Although managelogs was primarily intended to be used with Apache, it can be
used as a general purpose log managing program for a lot of other software.
As these software generally don't support a piped logfile feature similar to
Apache, an alternative is to connect them with managelogs through a named pipe
(aka fifo).
In order to connect to managelogs through a named pipe :
-
- The pipe file must exist before both processes are started (mkfifo),
- If the writer process is started before managelogs, its write operations to
the pipe can block after a given amount of data. This is why it is generally
recommended to start the reader process (managelogs) before the writer.
Actually, it is more natural, as, in such an architecture, the writer process
can stop and restart while the reader process is supposed to remain untouched.
The same when stopping the process: the correct procedure is to stop the writer
process before the reader, to make sure that any remaining buffered data won't
be lost.
- The '-i|--input' option must be used on the managelogs command line, followed
with the path of the named pipe (if you redirect the standard input from
the named pipe, managelogs cannot detect that its input is coming from a pipe).
- managelogs then automatically detects that the file it is reading from is a named
pipe, and adapts its behavior (see below).
Note that managelogs explicitely checks the input file type. In other words,
using the '--input' option does not automatically imply the 'named pipe behavior'.
If the option is followed with the path of a regular file, managelogs will
behave as if this file had been redirected to its standard input.
When managelogs is reading from a named pipe, it remains connected indefinitely,
even after the process writing to the pipe exits. This way,
both processes are independant : one or several writer processes can connect to and
disconnect from the pipe (in turn) without disturbing the managelogs process.
Although technically possible, you should probably avoid having several
processes write to the same pipe in parallel, as the data coming
from the different processes will be mixed in a way you cannot control.
The only way to stop a managelogs process connected to a named pipe is to kill it
with a SIGTERM signal.
EXAMPLES
Say we want to keep the last 3 Mbytes of access_log data in /var/log/httpd,
each log file will take at most 1 Mbyte, and we want to maintain symbolic
links to the active and backup log files.
The corresponding configuration line looks like :
CustomLog "| /usr/bin/managelogs --size 1M --global-size 3M --link --backup-links /var/log/httpd/access_log" combined
Here is a typical list of files present in the /var/log/httpd directory with
such a configuration :
# ls -l $apache_dir/logs/access_log*
...
lrwxrwxrwx 1 root root 20 Mar 17 15:16 access_log -> access_log._49BFB0A2
lrwxrwxrwx 1 root root 20 Mar 17 15:16 access_log.B.001 -> access_log._49BF8366
lrwxrwxrwx 1 root root 20 Mar 17 15:16 access_log.B.002 -> access_log._49BF2522
-rw-r--r-- 1 root root 1048564 Mar 5 12:34 access_log._49BF2522
-rw-r--r-- 1 root root 1048543 Mar 17 15:16 access_log._49BF8366
-rw-r--r-- 1 root root 483328 Mar 19 07:05 access_log._49BFB0A2
-rw-r--r-- 1 root root 6 Feb 22 08:30 access_log.pid
-rw-r--r-- 1 root root 321 Mar 17 15:16 access_log.status
- In this list you can see (in alphabetical order) :
-
- The symbolic link to the active log file
- The 2 symbolic links to the backup log files
- The 2 backup log files (in chronological order)
- The active log file
- The pid file
- The status file
Now, something more complex : we want to keep 3 Mbytes of uncompressed log
data
to be used by our 1st-level support team, as in the previous example, and we
also need to archive a bigger amount of data for 2nd-level analysis,
security, compliance, or any other need. This archived data will be compressed,
as it allows to save a lot of space (usually more than 95 %).
The corresponding directive looks like :
CustomLog "| /usr/bin/managelogs --size 1M --global-size 3M --link --backup-links /var/log/httpd/access_log --size 100M --global-size 1G --compression bz2:best /archives/logs/access_log" combined
With such a configuration, the files in the /var/log/httpd directory will
be the same as in the previous example, but managelogs will also maintain the
most recent 1 Gbytes of compressed access log data in /archives/logs (in
chunks of 100 Mbytes). This way, we have two levels of access to the log
data : the most recent data is easily accessible and, when we need to examine
something older, it is less easy, but the retention size is much larger.
Now, if we want to force an immediate rotation of these log files, whatever
reason we may have for this, the command to use is :
kill -USR1 `cat /var/log/httpd/access_log.pid`
Note that we could also have used '/archives/logs/access_log.pid', as both pid
files contain the same. The signal triggers a rotation in both directories.
Here is a typical example of using a rotate command :
First, we create an executable text file with the following content :
!/bin/sh
perl <awstat-dir>/awstats.pl -config=<mysite> -update -LogFile=$LOGMANAGER_FILE_PATH
This script integrates the rotated log file ($LOGMANAGER_FILE_PATH) in an
AWStats database. To have managelogs execute it each time a rotation
occurs, we add the '-C /path/to/the/script' option on the managelogs command line
(remember to use only absolute paths with managelogs).
Note : This option applies only to a single log manager. If you are using
several log managers (as in the example above), you can define different
rotate commands for the different log managers.
As a complement, if we want the statistics to be integrated at least once
per day, we can add '--rotate-delay 1d' on the managelogs command line.
Another way would be to create a cron job, executed every night :
0 0 * * * kill -USR1 `cat /var/log/httpd/access_log.pid`
Last, an example with a separate log path
SEE ALSO
The managelogs web site : http://managelogs.tekwire.net
AUTHOR
Francois Laupretre <francois@tekwire.net>
LICENSE
Apache license, Version 2.0 <http://www.apache.org/licenses/>
BUGS
Please send bug reports to <managelogs-bugs@tekwire.net>