[Nagiosplug-devel] RFC: Nagios 3 and Embedded Perl Plugins
Andreas Ericsson
ae at op5.se
Thu Jan 4 11:06:52 CET 2007
Florian Gleixner wrote:
> Andreas Ericsson wrote:
>> Florian Gleixner wrote:
>>> True, leaks and crashes could make nagios more unstable. dl-plugins
>>> should be used with care. "Worker threads" could isolate some of the risk.
>>>
>>> The performance gain is simply the time a C plugin needs to create a
>>> process. You could say that this is not much time, but some nagios
>>> setups run thousands of checks per minute. Here is a very simple test:
>>> bash has the echo command built in. On most Linux systems you will
>>> also find a /bin/echo program with the same functionality. So compare:
>>>
>>> time for ((i=0 ; i< 10000 ; i++)) ; do echo bla ; done
>>> real 0m1.536s
>>> user 0m0.172s
>>> sys 0m0.020s
>>>
>>> time for ((i=0 ; i< 10000 ; i++)) ; do /bin/echo bla ; done
>>> real 0m34.047s
>>> user 0m8.761s
>>> sys 0m15.365s
>>>
>>> I think some default plugins like ping or tcp-check could be made into
>>> dl modules, while the more complicated plugins, or those that are usually
>>> executed on the monitored nodes, should remain "normal" plugins.
>>>
>>> I never had a look at the nagios code, it was just an idea popping up.
>>>
>> A lower hanging apple is to make Nagios use fork() / execve() instead of
>> using popen(), which does a double fork() / exec() thing.
>>
>
> or use the popen() call from popen.{h,c} from the nagios plugins.
That doesn't leave room for passing the environment though, which will
break a very valuable feature in Nagios atm. Btw, popen.[hc] have been
replaced by runcmd.[hc]. How old a version are you running?
> The nagios plugins also call external programs via this call. So at the
> moment one plugin check usually creates a shell process and the plugin
> executable process, and if the plugin itself creates a process we have
> three processes created for one simple ping.
No, there is the fork()/execve() in nagios (done through popen(3)) which
spawns a shell. Then there's the fork()/execve() in the shell, and
finally the plugin is run, so it's always three processes per plugin
invocation. If the plugin spawns e.g. /bin/ps or /bin/df we have four
processes for one plugin.
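To make the direct fork()/execve() route concrete, here is a minimal
sketch. It is written in Python only because its os module is a thin
wrapper around the same system calls; it is not Nagios code. The helper
name run_plugin and the DEMO_HOST variable are made up for illustration.
The point is that the plugin is started with one fork() and one execve(),
no shell in between, and the environment is handed over explicitly.

```python
import os


def run_plugin(argv, env):
    """Run argv[0] with one fork() and one execve(), without a shell.

    Hypothetical helper for illustration. Returns (exit_status, stdout_text);
    exit_status is -1 if the child did not exit normally.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: point stdout at the pipe, then become the plugin.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            os.execve(argv[0], argv, env)  # environment passed explicitly
        finally:
            os._exit(127)  # reached only if something above failed
    # Parent: collect the plugin's output, then reap the child.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    exit_status = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return exit_status, b"".join(chunks).decode()


if __name__ == "__main__":
    # /bin/sh stands in for a plugin binary here, purely so that the
    # example can show the environment arriving in the child.
    print(run_plugin(["/bin/sh", "-c", 'echo "$DEMO_HOST"'],
                     {"DEMO_HOST": "web1"}))
```

Compared with going through popen(3), the shell process disappears and
the caller decides exactly which environment the plugin sees.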
> Ideally a dynamically loaded plugin that does not call external
> programs, but has the code of, for example, "ping" compiled in, does not
> create a single process.
>
This is a Bad Idea because the core program can't block on read() calls,
which means all plugins that work over the network will have their
timing values skewed unless you run each check in a separate thread or
fork() a new nagios daemon for each check to run dynamically, in which
case you've already lost 90% of the gain and ended up with a wicked
burden of maintainability. That's without considering the initial cost
(in developer time) to rewrite all plugins to never use signals[1] (or
alarm(3)), which will be huge.
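The signal problem can be shown in a few lines. This is only a sketch,
again in Python because its signal module maps directly onto signal
handlers and alarm(); the two "checks" and the function name
demo_shared_sigalrm are invented for the example. It assumes it runs in
the main thread of a process on a POSIX system.

```python
import os
import signal
import time


def demo_shared_sigalrm():
    """Show that two in-process checks cannot both own SIGALRM."""
    fired = []

    def timeout_check_a(signum, frame):
        fired.append("check_a")

    def timeout_check_b(signum, frame):
        fired.append("check_b")

    original = signal.signal(signal.SIGALRM, timeout_check_a)
    try:
        # Check B installs its handler; signal() returns the one it displaced.
        displaced = signal.signal(signal.SIGALRM, timeout_check_b)

        signal.alarm(10)              # check A asks for a 10 second timeout
        cancelled = signal.alarm(30)  # check B re-arms; A's alarm is cancelled
        signal.alarm(0)               # disarm so the demo does not wait

        os.kill(os.getpid(), signal.SIGALRM)  # simulate the timer expiring
        time.sleep(0.05)              # let the interpreter run the handler
    finally:
        signal.signal(signal.SIGALRM, original)
    return displaced is timeout_check_a, cancelled, fired


if __name__ == "__main__":
    print(demo_shared_sigalrm())
```

Only check B's handler ever runs, and check A's pending alarm is silently
replaced, since a process has one SIGALRM disposition and one alarm()
timer no matter how many modules are loaded into it.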
Also, for PING checks you're opening a new can of worms, since
implementing the ICMP protocol generally requires access to raw sockets,
which is, on almost all systems, restricted to the super-user. It's
possible to work around this by obtaining one[2] raw socket prior to
dropping the root privileges at startup, but then you'd be in for a
fairly complex ping program that needs to keep track of all the hosts
that currently have echo requests pending and assign each response to the
right check.
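A rough idea of the bookkeeping that single shared socket would need,
sketched in Python. Everything here is an assumption made for
illustration: the class name PendingPings, the check names, and the
choice to key pending requests on the ICMP sequence number. It handles
only the 8-byte ICMP echo header (type, code, checksum, identifier,
sequence) and does no socket I/O, checksum verification or timeout
expiry, all of which a real implementation would have to add.

```python
import struct
import time

ICMP_ECHO_REPLY = 0  # ICMP type of an echo reply


class PendingPings:
    """Match echo replies arriving on one shared socket to their checks."""

    def __init__(self, ident):
        self.ident = ident & 0xFFFF
        self.next_seq = 0
        self.pending = {}  # seq -> (host, check_name, sent_at)

    def register(self, host, check_name, now=None):
        """Record an outgoing echo request; return the sequence number used."""
        now = time.monotonic() if now is None else now
        seq = self.next_seq
        self.next_seq = (self.next_seq + 1) & 0xFFFF
        self.pending[seq] = (host, check_name, now)
        return seq

    def dispatch(self, source_host, icmp_bytes, now=None):
        """Return (check_name, rtt_seconds) for a matching reply, else None.

        icmp_bytes is the ICMP portion of the packet, IP header stripped.
        """
        now = time.monotonic() if now is None else now
        if len(icmp_bytes) < 8:
            return None
        icmp_type, _code, _checksum, ident, seq = struct.unpack(
            "!BBHHH", icmp_bytes[:8])
        if icmp_type != ICMP_ECHO_REPLY or ident != self.ident:
            return None  # not a reply to one of our requests
        entry = self.pending.get(seq)
        if entry is None or entry[0] != source_host:
            return None  # unknown, duplicate, or from the wrong host
        del self.pending[seq]
        _host, check_name, sent_at = entry
        return check_name, now - sent_at
```

Even this toy version has to deal with sequence wrap-around, duplicates
and replies from unexpected hosts, which gives a feel for the complexity
mentioned above.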
[1] All module-based checks would want to catch the same signals, so the
signal-handlers would be overwritten. alarm(3) is sometimes implemented
through signals, so that's not usable.
[2] Obtaining one socket per ping-check at start-up and keeping them is
not feasible, since most systems normally only allow 1024
file-descriptors / process.
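For reference, the per-process descriptor limit from [2] can be queried
at run time. A small sketch in Python follows; the function name and the
figure of 64 descriptors reserved for everything else are arbitrary
assumptions, not numbers taken from Nagios.

```python
import resource


def ping_sockets_available(reserved_fds=64):
    """Estimate how many sockets fit under the soft RLIMIT_NOFILE limit.

    Returns None if no limit is configured. reserved_fds is an arbitrary
    allowance for log files, pipes and other descriptors.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return max(soft - reserved_fds, 0)


if __name__ == "__main__":
    print(ping_sockets_available())
```

With a soft limit of 1024 that leaves fewer than a thousand sockets,
which is well below the number of hosts a large installation pings.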
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231