Check Aggregation: check_many
Thomas Guyot-Sionnest, June 9, 2009
Overview
This proposal is for a simple plugin wrapper allowing aggregation and serialization of multiple checks.
Problem
There is no easy way to configure a single Nagios service that aggregates
multiple results. Take, for example, a standard check_nagios between
servers: how can such a check be extended to cover additional components?
Usually this means writing a custom shell wrapper around the checks, or
configuring all of them separately and using check_cluster to aggregate
them. There ought to be a better way…
Proposal
Written in C, check_many would be a fairly simple and fast solution to this
issue. The idea is a plugin that reads check commands from STDIN, one
command per line. It would run them and aggregate the results according to
the processing preferences given in the plugin arguments.
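
The flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed implementation: every command is passed to system() (the equivalent of --shell=always), blank lines are skipped, "worst" is taken to mean the numerically highest of the standard Nagios states, and exit codes outside 0-3 are mapped to UNKNOWN. The names run_check and run_checks are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

/* Standard Nagios plugin return codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Run one command through the shell and map its exit code to a state.
 * Assumption: anything that is not a normal exit in 0..3 is UNKNOWN. */
int run_check(const char *cmd)
{
    int status = system(cmd);
    if (status == -1 || !WIFEXITED(status))
        return STATE_UNKNOWN;
    int code = WEXITSTATUS(status);
    return (code >= STATE_OK && code <= STATE_UNKNOWN) ? code : STATE_UNKNOWN;
}

/* Read one command per line from `in`, run each in order, and return the
 * highest state seen. Lines longer than the buffer are not handled here. */
int run_checks(FILE *in)
{
    char line[4096];
    int worst = STATE_OK;

    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
        if (line[0] == '\0')
            continue;                       /* assumption: skip blank lines */
        int state = run_check(line);
        if (state > worst)
            worst = state;
    }
    return worst;
}
```

Calling run_checks(stdin) from main() and using the result as the process exit code would give the default "all" behaviour, without any of the output handling.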
Check Options
The following options can be used to control plugin processing (grouped by category):
Command Parsing
-s, --shell=<always|never|auto>
Specify when a shell should be invoked for executing commands. "always"
invokes the shell for every command, "never" forces commands to be executed
directly, and "auto" (default) invokes the shell only if shell
metacharacters are present in the check command. Unless -d (--delimiter) is
specified, any whitespace is used for separating arguments.
-d, --delimiter=CHARACTER
Delimiter to use for separating command arguments when shell is not used.
Implies --shell=never and is mutually exclusive with any other shell option.
Standard backslash escapes are allowed, except "\n".
Note: Should we allow strings as delimiters?
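
The parsing rules above might look something like the sketch below. It is a hedged illustration only: the metacharacter set used for "auto" is an assumption (the proposal does not enumerate one), the set of accepted backslash escapes is a guess at what "standard" means, empty fields are preserved, and the function names needs_shell, parse_delimiter and split_args are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumed metacharacter set for --shell=auto. */
static const char SHELL_META[] = "|&;<>()$`\\\"'*?[]{}#~!";

/* Return 1 if the command contains a character that needs a shell. */
int needs_shell(const char *cmd)
{
    return strpbrk(cmd, SHELL_META) != NULL;
}

/* Decode the -d argument: one literal character or one backslash escape.
 * Returns 0 on success, -1 if the argument is rejected ("\n" included). */
int parse_delimiter(const char *spec, char *out)
{
    if (spec[0] == '\0')
        return -1;
    if (spec[0] != '\\') {
        if (spec[1] != '\0' || spec[0] == '\n')
            return -1;
        *out = spec[0];
        return 0;
    }
    if (spec[1] == '\0' || spec[2] != '\0')
        return -1;
    switch (spec[1]) {
    case 't':  *out = '\t'; break;
    case 'r':  *out = '\r'; break;
    case 'f':  *out = '\f'; break;
    case 'v':  *out = '\v'; break;
    case 'a':  *out = '\a'; break;
    case 'b':  *out = '\b'; break;
    case '\\': *out = '\\'; break;
    default:   return -1;               /* unknown escape, or "\n" */
    }
    return 0;
}

/* Split `line` in place on `delim`. Stores at most max - 1 fields in argv,
 * followed by a NULL terminator; further fields are ignored.
 * Returns the number of fields stored. */
int split_args(char *line, char delim, char **argv, int max)
{
    int argc = 0;
    char *p = line;

    while (argc < max - 1) {
        argv[argc++] = p;
        p = strchr(p, delim);
        if (p == NULL)
            break;
        *p++ = '\0';                    /* terminate this field */
    }
    argv[argc] = NULL;
    return argc;
}
```

The resulting argv array is in the shape expected by the exec family of functions, which is why the shell can be bypassed entirely when a delimiter is given.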
Processing Options
-P, --process=<all|first-fail|first-ok>
By default, all commands are processed and the worst state is returned
("all"). "first-fail" stops at the first non-ok check and returns it, while
"first-ok" stops at the first successful check and returns it. The latter
two override --status and --output and return that plugin's own status and
output instead.
-f, --file=FILE
Read checks from FILE instead of STDIN.
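
The three --process modes could be expressed as in the sketch below. It takes an array of already-known states purely for illustration; a real plugin would run the checks lazily and stop executing at the chosen one. Two points are assumptions because the proposal does not cover them: "worst" is the numerically highest state, and when "first-fail" or "first-ok" never finds a matching check the worst state seen is returned. The names process_mode and aggregate are hypothetical.

```c
#include <stddef.h>

/* Standard Nagios plugin return codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

enum process_mode { PROCESS_ALL, PROCESS_FIRST_FAIL, PROCESS_FIRST_OK };

/* Walk the check states in order and return the final state for `mode`.
 * *stopped_at receives the index of the check that ended processing,
 * or n if every check was processed. */
int aggregate(const int *states, size_t n, enum process_mode mode,
              size_t *stopped_at)
{
    int worst = STATE_OK;

    for (size_t i = 0; i < n; i++) {
        int ok = (states[i] == STATE_OK);

        if ((mode == PROCESS_FIRST_FAIL && !ok) ||
            (mode == PROCESS_FIRST_OK && ok)) {
            *stopped_at = i;            /* this check's result is returned */
            return states[i];
        }
        if (states[i] > worst)
            worst = states[i];
    }
    *stopped_at = n;                    /* assumption: fall back to worst */
    return worst;
}
```

With "first-ok", a list of checks behaves like a set of alternatives (any one succeeding is enough), while "first-fail" behaves like a chain of prerequisites.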
Output Options
--output=<normal|oneline|status>
"normal" outputs a Nagios v3+ multi-line result whose first line is a
summary of the checks performed; "oneline" squeezes everything into a single
line; and "status" returns only a status line. This option has no effect
with --process=first-fail|first-ok.
Note: How about allowing nth result?
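
The three output modes might be assembled as in the sketch below. The proposal does not define the summary wording or the separator used by "oneline", so both ("STATE - N checks run" and "; ") are placeholders, as are the names output_mode and format_output. Text that does not fit in the buffer is silently truncated.

```c
#include <stdio.h>
#include <string.h>

enum output_mode { OUTPUT_NORMAL, OUTPUT_ONELINE, OUTPUT_STATUS };

/* Map a Nagios return code to its conventional label. */
static const char *state_name(int state)
{
    switch (state) {
    case 0:  return "OK";
    case 1:  return "WARNING";
    case 2:  return "CRITICAL";
    default: return "UNKNOWN";
    }
}

/* Append `text` at offset `used`, truncating if needed; returns new length. */
static size_t append(char *buf, size_t size, size_t used, const char *text)
{
    if (used >= size)
        return used;
    size_t room = size - used - 1;      /* keep one byte for the NUL */
    size_t len = strlen(text);
    if (len > room)
        len = room;
    memcpy(buf + used, text, len);
    buf[used + len] = '\0';
    return used + len;
}

/* Build the plugin output for `n` check result lines in the given mode.
 * Returns the number of characters written, excluding the NUL. */
size_t format_output(char *buf, size_t size, enum output_mode mode, int state,
                     const char *const *lines, size_t n)
{
    char summary[64];
    snprintf(summary, sizeof summary, "%s - %zu checks run",
             state_name(state), n);     /* hypothetical summary format */

    size_t used = append(buf, size, 0, summary);
    if (mode == OUTPUT_STATUS)
        return used;                    /* status line only */

    const char *sep = (mode == OUTPUT_ONELINE) ? "; " : "\n";
    for (size_t i = 0; i < n; i++) {
        used = append(buf, size, used, sep);
        used = append(buf, size, used, lines[i]);
    }
    return used;
}
```

In "normal" mode the first line is what Nagios shows as the short status, and the remaining lines become the long output.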
Examples
Aggregate multiple checks together:
$ echo '/path/check_http -H www.example.com
/path/check_http -H www.example.com -p 443' | /path/check_many
Get list of checks from a file:
$ /path/check_many <~nagios/multiple_checks.txt
$ /path/check_many -f /home/nagios/multiple_checks.txt
Using a delimiter:
$ echo '/path/check_foo:-H:example.com
/path/check_bar:-H:example.com:-s:$string with special chars;' \
| /path/check_many -d: