[Nagiosplug-devel] Patch for persistent data on plugins

Jose Luis Martinez jlmartinez-lists-nagplug-devel at capside.com
Thu Sep 17 18:07:07 CEST 2009


Ton Voon wrote:
>> IMHO, the instance id should not be selected by the developer, as he
>> is not capable of foreseeing all these situations. There should be a
>> way of automatically giving it to the plugin.
> 
> I think I'd rather the developer consider these options, than make it  
> difficult for the nagios administrator to configure.

The Nagios administrator wouldn't have to be bothered if the plugin 
received a unique id, which the plugin would then use as an instance ID 
automatically. Both plugin developer and user proof :)

In the last discussion, I proposed a technique that can be used (based 
on Thomas' perfdata technique):

 > It's based on generating a token if one hasn't been passed,
 > and then getting it back via the last check (maybe
 > via $SERVICEOUTPUT$, maybe via perfdata?)
 >
 > if ($SERVICEOUTPUT$ contains tk=UUID) {
 >     read /tmp/UUID;
 >     do_stuff();
 >     write /tmp/UUID;
 >     output "$output tk=UUID";
 >     exit $status;
 > } else {
 >     UUID = gen_UUID();
 >     write /tmp/UUID;
 >     output "UNKNOWN tk=UUID";
 >     exit UNKNOWN;
 > }

Note that '/tmp/UUID' is pseudocode for "storage for UUID". I don't 
like this technique because it requires the admin to configure

check --id="$SERVICEOUTPUT$".
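
A minimal Python sketch of the token round-trip above, for illustration
only. The names run_check and state_path, the JSON files in the temp
directory, and the do_stuff(state) callback signature are all
assumptions made for the example; none of them comes from an existing
plugin library. Unlike the pseudocode, it also treats a token whose
state file has gone missing as a first run.

```python
import json
import os
import re
import tempfile
import uuid

UNKNOWN = 3  # standard Nagios plugin exit code for UNKNOWN
TOKEN_RE = re.compile(r"\btk=([0-9a-f-]{36})\b")


def state_path(token, state_dir):
    # Hypothetical storage layout: one JSON file per token.
    return os.path.join(state_dir, "plugin-state-" + token + ".json")


def run_check(previous_output, do_stuff, state_dir=None):
    """Return (output, exit_code) for one plugin run.

    previous_output is the text passed in via --id="$SERVICEOUTPUT$".
    do_stuff(state) must return (exit_code, text, new_state).
    """
    state_dir = state_dir or tempfile.gettempdir()
    match = TOKEN_RE.search(previous_output or "")
    if match and os.path.exists(state_path(match.group(1), state_dir)):
        # Token came back from the last check: load state and do the work.
        token = match.group(1)
        with open(state_path(token, state_dir)) as fh:
            state = json.load(fh)
        status, text, new_state = do_stuff(state)
    else:
        # No usable token yet: generate one and report UNKNOWN this time.
        token = str(uuid.uuid4())
        status, text, new_state = UNKNOWN, "UNKNOWN", {}
    with open(state_path(token, state_dir), "w") as fh:
        json.dump(new_state, fh)
    return "%s tk=%s" % (text, token), status
```

The first run always returns UNKNOWN, because the token only exists
once Nagios has seen it in the plugin output and handed it back.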

I suspect that Nagios already has an internal identifier just waiting 
to be exposed via a macro so that it can be used as:

check --id=$SERVICEUUID$

An important property of the macro would be that a Nagios reload 
wouldn't change it. Is this possible?

It's backward compatible too! Admins on a pre-SERVICEUUID Nagios just 
have to assign an unused id... It would work via NRPE too!

It could all be done automatically via an ENV var so as not to pester 
the administrators, but that, I suspect, will need changes to Nagios, 
remote agents, and possibly protocols... And the pre-SERVICEUUID admins 
would have to wrap the plugin in a wrapper that sets that ENV var, or 
the check would still have to provide an id parameter to set its value...
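
On the plugin side, supporting both routes could look roughly like the
Python sketch below. The environment variable name
NAGIOS_PLUGIN_INSTANCE_ID and the function resolve_instance_id are
hypothetical, invented for this example; Nagios does not set such a
variable today.

```python
import argparse
import os

# Hypothetical variable name; nothing in Nagios exports this today.
ENV_VAR = "NAGIOS_PLUGIN_INSTANCE_ID"


def resolve_instance_id(argv, environ):
    """Prefer an explicit --id, else fall back to the environment."""
    parser = argparse.ArgumentParser(prog="check")
    parser.add_argument("--id", dest="instance_id", default=None)
    # parse_known_args leaves the plugin's other options (-w, -c, ...) alone.
    args, _rest = parser.parse_known_args(argv)
    if args.instance_id:
        return args.instance_id
    return environ.get(ENV_VAR)
```

A plugin would call resolve_instance_id(sys.argv[1:], os.environ) and
refuse to keep state (or exit UNKNOWN) when the result is None.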

> I think this actually shows that Nagios should interpret the perfdata.  
> If nagios stored the previous values, then it can work out things like  
> rate changes.

But the plugin needs the info to get back to it so it can decide the 
status. One problem I see is that if the plugin receives the performance 
data it last output, that data may not be useful. For example:

check_disk reads a counter in the kernel that records how many blocks 
have been read from a disk since the last reboot.

"check_disk -w 100 -c 500" first execution reads 1000 blocks.
Stores the "1000" in the data store.
Status: UNKNOWN
Perfdata: blocks=

"check_disk -w 100 -c 500" executes again, reading 3000 blocks.
Reads the stored 1000. 10 seconds have passed.
(3000 - 1000) / 10 = 200 blocks/second.
Stores the 3000.
Status: WARNING ( w < 200 < c )
Perfdata: blocks=200

"check_disk -w 100 -c 500" executes again, reading 10000 blocks.
Reads the stored 3000. 10 seconds have passed.
(10000 - 3000) / 10 = 700 blocks/second.
Stores the 10000.
Status: CRITICAL ( 700 > c )
Perfdata: blocks=700

If I get this perfdata back, I can't do the next step's calculation. (I 
could always output the raw counter value and do the calculation based 
on that, but that would change the way the graphing software has to 
treat those values).
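
The point is that the plugin has to store the raw counter, not the rate
it printed. A rough Python sketch of that logic, assuming a stored dict
with "counter" and "time" keys (the function name check_rate and the
dict layout are made up for this example):

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes


def check_rate(stored, counter, now, warn, crit):
    """Return (status, perfdata, new_stored) for one run.

    stored is the dict saved by the previous run, or None on the first.
    """
    new_stored = {"counter": counter, "time": now}
    if not stored or now <= stored["time"] or counter < stored["counter"]:
        # First run, clock problem, or counter reset (e.g. after a reboot).
        return UNKNOWN, "blocks=", new_stored
    rate = (counter - stored["counter"]) / (now - stored["time"])
    if rate > crit:
        status = CRITICAL
    elif rate > warn:
        status = WARNING
    else:
        status = OK
    return status, "blocks=%g" % rate, new_stored
```

With -w 100 -c 500 and raw readings of 1000, 3000 and 10000 taken 10
seconds apart, this gives UNKNOWN, then WARNING with blocks=200, then
CRITICAL with blocks=700. None of the later steps could be computed
from "blocks=200" alone.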

Take check_cpu as a case. The Linux kernel returns how many slices of 
time it has spent in user, system, etc. But the output you want is 
user=50 system=30 (percent). Having to output the raw values for the 
system and user times would be a nuisance.
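
To make the check_cpu case concrete, here is a small Python sketch that
turns two samples of raw time counters into the percentages you would
want in perfdata. The dict keys and the function names are illustrative
assumptions; the sketch does not read the real kernel counters.

```python
def cpu_percentages(previous, current):
    """Percent of time per state between two raw counter samples."""
    deltas = {name: current[name] - previous[name] for name in current}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed or counters went backwards: nothing to report.
        return None
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def format_perfdata(percentages):
    """Render e.g. 'user=50 system=30 idle=20'."""
    return " ".join("%s=%.0f" % item for item in percentages.items())
```

The previous sample is exactly the kind of data that has to live in the
plugin's own store, since the printed percentages cannot be turned back
into raw counters.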

Another limitation is that it only works for numerical values.

And another one: the plugin will only get back one set of data... How 
would you implement a plugin that checks every five minutes, but 
calculates the status over a 1 hour window of results?
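
With a plugin-side store, the 1 hour window is just a list of samples
that gets trimmed on every run. A minimal Python sketch, assuming
samples are kept as (timestamp, value) pairs; the function names are
invented for this example:

```python
def update_window(samples, now, value, window_seconds=3600):
    """Append (now, value) and drop samples older than the window."""
    kept = [(t, v) for t, v in samples if now - t < window_seconds]
    kept.append((now, value))
    return kept


def window_average(samples):
    """Average of the values currently inside the window."""
    return sum(v for _t, v in samples) / len(samples)
```

Running every 300 seconds, the list settles at 12 samples, and the
status can be decided from window_average (or any other function of the
window) instead of from the last value alone.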

Just my 2c,

Jose Luis Martinez
jlmartinez at capside.com






