[Nagiosplug-devel] Patch for persistent data on plugins
Jose Luis Martinez
jlmartinez-lists-nagplug-devel at capside.com
Thu Sep 17 18:07:07 CEST 2009
Ton Voon wrote:
>> IMHO, the instance id should not be selected by the developer, as he is
>> not capable of previewing all these situations. There should be a way of
>> automatically giving it to the plugin.
>
> I think I'd rather the developer consider these options, than make it
> difficult for the nagios administrator to configure.
The Nagios administrator wouldn't have to be bothered if the plugin
received a unique id that it would then use as an instance ID
automatically. Both plugin developer and user proof :)
In the last discussion, I proposed a technique that could be used (based on
Thomas' perfdata technique):
> It's based on generating a token if one hasn't been passed,
> and then getting it back via the last check (maybe
> via $SERVICEOUTPUT$, maybe via perfdata?)
>
>if ($SERVICEOUTPUT$ contains tk=UUID){
> read /tmp/UUID;
> do_stuff();
> write /tmp/UUID;
> output "$output tk=UUID";
> exit $status;
>} else {
> UUID = gen_UUID();
> write /tmp/UUID;
> output "UNKNOWN tk=UUID";
> exit UNKNOWN;
>}
Note that '/tmp/UUID' is pseudocode for "storage for UUID". I don't
like this technique because it requires the admin to configure
check --id="$SERVICEOUTPUT$".
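A minimal Python sketch of the token round trip above, for illustration only. The names run_check, do_stuff, STORE_DIR and TOKEN_RE are hypothetical, the state is kept as JSON in a temp directory purely as a stand-in for "storage for UUID", and 3 is the standard plugin exit code for UNKNOWN.

```python
import json
import re
import tempfile
import uuid
from pathlib import Path

# Hypothetical store; stands in for "/tmp/UUID" in the pseudocode.
STORE_DIR = Path(tempfile.gettempdir()) / "plugin-state"
# Matches the textual form produced by str(uuid.uuid4()).
TOKEN_RE = re.compile(r"tk=([0-9a-f-]{36})")
UNKNOWN = 3


def run_check(last_output, do_stuff):
    """Return (exit_code, output_line) for one plugin run.

    last_output is what the admin passed via --id="$SERVICEOUTPUT$".
    do_stuff(state) is the real check: it returns (exit_code, text, new_state).
    """
    match = TOKEN_RE.search(last_output or "")
    if match is None:
        # No token yet: generate one, create empty state, report UNKNOWN.
        token = str(uuid.uuid4())
        STORE_DIR.mkdir(parents=True, exist_ok=True)
        (STORE_DIR / token).write_text(json.dumps({}))
        return UNKNOWN, f"UNKNOWN tk={token}"

    # Token came back: load state, run the check, save state, echo the token.
    token = match.group(1)
    path = STORE_DIR / token
    state = json.loads(path.read_text()) if path.exists() else {}
    code, text, new_state = do_stuff(state)
    STORE_DIR.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(new_state))
    return code, f"{text} tk={token}"
```

The first call with an empty last_output yields UNKNOWN plus a fresh token; feeding that output line back in on the next run lets do_stuff see whatever state the previous run saved.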
I suspect that Nagios already has an internal identifier just waiting
to be exposed via a macro so that it can be used as:
check --id=$SERVICEUUID$
An important property of the macro would be that a Nagios reload
wouldn't change it. Is this possible?
It's backward compatible too! Admins on a pre-SERVICEUUID Nagios just have to
assign an unused id... It would work via NRPE too!
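On the plugin side, handling --id could look roughly like the sketch below. It assumes the id arrives as an opaque string (from the proposed $SERVICEUUID$ macro, which does not exist today, or hand-assigned by the admin). The names check_example, parse_args and state_path are hypothetical, and hashing the id is only one way to get a filesystem-safe name.

```python
import argparse
import hashlib
import tempfile
from pathlib import Path


def parse_args(argv):
    """Parse the command line; --id carries the instance id."""
    parser = argparse.ArgumentParser(prog="check_example")
    parser.add_argument(
        "--id",
        dest="instance_id",
        required=True,
        help="unique instance id (macro-supplied or chosen by the admin)",
    )
    return parser.parse_args(argv)


def state_path(instance_id, plugin="check_example"):
    """Map an instance id to a per-instance state file.

    The id is hashed so that characters such as '/' or spaces
    cannot end up in the file name.
    """
    digest = hashlib.sha256(instance_id.encode("utf-8")).hexdigest()[:16]
    return Path(tempfile.gettempdir()) / f"{plugin}-{digest}.state"
```

Since the same id always maps to the same file, the state survives as long as the id does, which is why the macro must not change across a Nagios reload.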
It could all be done automatically via an ENV var so as not to pester the
administrators, but that, I suspect, would need changes to Nagios, remote
agents, and possibly protocols... And the pre-SERVICEUUID admins would
have to wrap the plugin in a wrapper that sets that ENV var, or the
check would still have to provide an id parameter to set its value...
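A small sketch of that fallback order, under the assumption that an explicit id parameter wins over the environment. The variable name NAGIOS_SERVICEUUID is invented for illustration; nothing in Nagios sets it today.

```python
import os

# Hypothetical variable name, not something Nagios currently exports.
ENV_VAR = "NAGIOS_SERVICEUUID"


def resolve_instance_id(cli_value=None, environ=None):
    """Return the instance id, or None if neither source provides one.

    An explicit id parameter takes precedence; otherwise the
    environment variable (set by Nagios or by a wrapper) is used.
    """
    if environ is None:
        environ = os.environ
    if cli_value:
        return cli_value
    return environ.get(ENV_VAR) or None
```

A plugin would typically exit UNKNOWN with a helpful message when this returns None.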
> I think this actually shows that Nagios should interpret the perfdata.
> If nagios stored the previous values, then it can work out things like
> rate changes.
But the plugin needs the info to get back to it so it can decide the
status. One problem I see is that if the plugin receives the performance
data it last output, that data may not be useful. For example:
check_disk reads a kernel counter that records how many blocks
have been read from a disk since the last reboot.
"check_disk -w 100 -c 500" first execution reads 1000 blocks.
Stores the "1000" in the data store
Status: UNKNOWN
Perfdata: blocks=
"check_disk -w 100 -c 500" executes again reading 3000 blocks.
Reads the stored 1000. 10 seconds have passed.
(3000 - 1000) / 10 = 200 blocks/second.
Stores the 3000
Status: WARNING ( w < 200 < c )
Perfdata: blocks=200
"check_disk -w 100 -c 500" executes again reading 10000 blocks.
Reads the stored 3000. 10 seconds have passed.
(10000 - 3000) / 10 = 700 blocks/second.
Stores the 10000
Status: CRITICAL ( 700 > c )
Perfdata: blocks=700
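The three runs above can be sketched as a pure function over the stored state. This is a sketch, not check_disk itself: check_rate and the state layout are hypothetical, the example replays the readings 1000, 3000 and 10000 taken 10 seconds apart, and returning UNKNOWN when the counter goes backwards (for instance after a reboot) is my own assumption.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_rate(state, counter, now, warn, crit):
    """Return (status, perfdata, new_state) for one run.

    state holds the previous raw counter and timestamp (empty on first run).
    """
    new_state = {"counter": counter, "time": now}
    # No usable previous sample: first run, clock problem, or counter reset.
    if (
        "counter" not in state
        or now <= state["time"]
        or counter < state["counter"]
    ):
        return UNKNOWN, "blocks=", new_state

    rate = (counter - state["counter"]) / (now - state["time"])
    if rate > crit:
        status = CRITICAL
    elif rate > warn:
        status = WARNING
    else:
        status = OK
    return status, f"blocks={rate:g}", new_state


# Replaying the example: the raw counter lives in the store, the rate in perfdata.
state = {}
for counter, now in [(1000, 0), (3000, 10), (10000, 20)]:
    status, perfdata, state = check_rate(state, counter, now, warn=100, crit=500)
    print(status, perfdata)
```

Note that the stored value (the raw counter) and the perfdata value (the rate) are different numbers, which is exactly why getting the perfdata back is not enough.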
If I get this perfdata back, I can't do the next step's calculation. (I could
always output the raw counter value and base the calculation
on that, but that would change the way the graphing software has to
treat those values.)
Take check_cpu as an example. The Linux kernel reports how many time slices
it has spent in user, system, etc. But the output you want is
user=50 system=30 (percent). The raw values for the system and user
times would just get in the way.
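To make the check_cpu case concrete, here is a sketch of turning two snapshots of raw tick counters into percentages. The function name and the dictionary layout are hypothetical, and the numbers are made up; a real plugin would fill the snapshots from the kernel's counters and keep the previous one in its data store.

```python
def cpu_percentages(previous, current):
    """Convert two snapshots of cumulative tick counters into percentages.

    Each snapshot maps a category (user, system, idle, ...) to the raw
    cumulative tick count at the time of the check.
    """
    deltas = {name: current[name] - previous[name] for name in current}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed (or counters reset): nothing meaningful to report.
        return {}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Made-up snapshots: raw counters are what gets stored between runs.
previous = {"user": 1000, "system": 500, "idle": 8500}
current = {"user": 1500, "system": 800, "idle": 8700}
usage = cpu_percentages(previous, current)
print(" ".join(f"{name}={value:g}" for name, value in usage.items()))
```

The plugin has to persist the raw counters, while the graphing software only ever sees the percentages.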
Another limitation is that it only works for numerical values.
And another one: the plugin will only get back one set of data... How
would you implement a plugin that checks every five minutes but
calculates the status over a 1-hour window of results?
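With a real data store, the windowed case is straightforward. A rough sketch, assuming the store can hold a list of [timestamp, value] pairs between runs; update_window and window_average are hypothetical names, and averaging is just one possible way to reduce the window to a status input.

```python
def update_window(samples, now, value, window=3600):
    """Drop samples older than `window` seconds and append the new one.

    samples is a list of [timestamp, value] pairs (lists, so the
    structure can be stored as JSON between runs).
    """
    kept = [[t, v] for t, v in samples if now - t < window]
    kept.append([now, value])
    return kept


def window_average(samples):
    """Average of the values currently inside the window."""
    return sum(v for _, v in samples) / len(samples)


# Checks every five minutes; the status would be derived from the hour's average.
samples = []
for run in range(14):
    samples = update_window(samples, now=run * 300, value=run)
```

After more than an hour of five-minute checks the list stays at twelve entries, which is more than a single set of returned perfdata could ever carry.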
Just my 2c,
Jose Luis Martinez
jlmartinez at capside.com