[Nagiosplug-devel] Suggested alterations to the Performance Protocol

Ben Clewett Ben at clewett.org.uk
Wed Sep 8 08:35:01 CEST 2004


Ton, thanks for replying to my email.  I'll answer inline, including 
comments from Yves' later posting:

>> 1.
>>
>> Patch a loophole in the document.  The Warn and Crit values should
>> be of the same UOM as the value.  Second, all numbers (value, max,
>> min, warn and crit) should be directly comparable.
>>
>> Although this is common sense, I have seen some plugins (eg,
>> check_disk) which certainly used to use a different number type for
>> the warn & crit than for the value.  In this case, 'Disk Free' and
>> 'Disk Used', which are not comparable.
> 
> 
> Good point and a no-brainer. I'll update tonight (the sf.net web page  
> will not update immediately - it takes a few days for the proxy cache  
> to expire).
> 
> I think check_disk used to be broken in its reporting of perf data, but  
> I believe the current CVS version is correct.

I am glad this was a valid point.  Good to hear that check_disk is 
working as well.  My own plugins are quite old, so I'll take a look.
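
As an illustration of point 1, here is a minimal Python sketch of a 
formatter for one perf data item in the documented layout 
'label'=value[UOM];[warn];[crit];[min];[max].  The function name and 
parameters are my own invention, not part of any plugin.  The point is 
that a single UOM is attached to the value, and warn, crit, min and max 
are assumed to be plain numbers of that same quantity and unit.

```python
def format_perfdata(label, value, uom="", warn=None, crit=None,
                    minimum=None, maximum=None):
    """Build one perf data item: 'label'=value[UOM];[warn];[crit];[min];[max].

    Hypothetical helper: warn, crit, minimum and maximum are assumed to be
    expressed in the same quantity and UOM as value (eg, all 'used' in MB).
    """
    def num(x):
        # Empty field when a number is not supplied.
        return "" if x is None else str(x)

    # Single quotes are only needed when the label contains spaces.
    quoted = f"'{label}'" if " " in label else label
    fields = [f"{quoted}={num(value)}{uom}",
              num(warn), num(crit), num(minimum), num(maximum)]
    # Drop trailing empty fields so 'users=10;;;;' becomes 'users=10'.
    return ";".join(fields).rstrip(";")


print(format_perfdata("/home used", 4200, "MB",
                      warn=8000, crit=9000, minimum=0, maximum=10000))
```

Reporting 'used' for the value while giving warn and crit as 'free' is 
exactly the mix-up this layout cannot express, which is why the 
document should rule it out.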

>> 2.
>>
>> Suggested by Yves Mettier:  The addition of a special reserved  
>> variable, 'check_time' which records the time at which the plugin  
>> completed the check.
> 
> 
> Firstly, why is this performance data?

I'll leave Yves to answer this one, as he did.

This did bring up a question about what 'time' means.  Is it the 
time at which the plugin started, the time it ended, or the time it took 
to run?  If these are to be reserved variables for future use, it might 
be worth reserving both the start and the stop, maybe 'start_time' and 
'end_time'?
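
To make the three meanings concrete, here is a rough Python sketch that 
emits all of them.  The labels 'start_time' and 'end_time' are only the 
names suggested above, not anything agreed, and timed_check is a 
made-up wrapper.

```python
import time


def timed_check(check):
    """Run check() and return (result, perfdata).

    Hypothetical wrapper: start_time and end_time are Unix timestamps,
    time is the duration in seconds.
    """
    start = time.time()
    result = check()
    end = time.time()
    perfdata = (f"start_time={start:.3f} "
                f"end_time={end:.3f} "
                f"time={end - start:.3f}s")
    return result, perfdata


result, perfdata = timed_check(lambda: "OK")
print(f"{result} | {perfdata}")
```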

> However, I like the idea of "special reserved variables" - I think it  
> is worthwhile to add a table with a list of common labels, such as  
> "time". Any comments?

Great!  I can't think of any offhand except the time-related ones.


>> 3.
>>
>> The addition of macros to define special numbers.  Some mentioned are
>> NULL to indicate no value or an invalid value, and INF and -INF to
>> indicate an infinite value.  Possibly NAN to represent Not a Number,
>> as with division by zero.  Not often used, but they do have a place.
> 
> 
> This is already covered in
> http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT
> but is not specifically mentioned for the perf data output. This
> should be clearer.

I think I might have confused things here.  I was thinking of these 
macros for the value, not the threshold range.

The one of real interest now is the idea of a NULL value, for instance 
the latency of check_icmp when there is no reply.  NULL would show a 
gap in the graph, rather than joining the graph together around the 
missing value or using some odd value to imply NULL.

Eg:

  | latency=NULLms

I was then thinking that as well as NULL there are INF, -INF, NAN etc., 
which might be included as well if they have a use.  Yves has suggested 
using '~' for +INF and -INF when in the right place.


> I like the idea of macros. I had proposed using some arcane characters  
> (such as ~ for negative infinity), but I think your macro idea is far  
> clearer. Any comments?

Maybe use both?  As long as it's clear and developers such as myself 
can parse them exactly, this would be good.
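
For what it's worth, parsing the macros in the value position could be 
as simple as the Python sketch below.  This assumes the macro spellings 
discussed here (NULL, INF, -INF, NAN), none of which are in the 
guidelines yet, and it does not attempt the '~' notation.  The name 
parse_value is made up.

```python
import math

# Assumed macro spellings; only proposals at this point.
SPECIAL_VALUES = {
    "NULL": None,        # no value: leave a gap in the graph
    "INF": math.inf,
    "+INF": math.inf,
    "-INF": -math.inf,
    "NAN": math.nan,
}


def parse_value(text, uom=""):
    """Parse the value part of a perf data item, eg 'NULLms' or '12.5ms'."""
    if uom and text.endswith(uom):
        text = text[:-len(uom)]
    if text in SPECIAL_VALUES:
        return SPECIAL_VALUES[text]
    return float(text)


print(parse_value("NULLms", "ms"))
print(parse_value("12.5ms", "ms"))
```

A grapher would then skip the point when it gets None back, instead of 
drawing a line through the missing sample.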



>> 4.
>>
>> To allow any UOM unit.  For instance, 'degc' for temperature, 'users'  
>> for a user count etc.
> 
> 
> I think degc makes sense (is there a formal SI unit for degrees  
> centigrade?), but users doesn't - users is already covered in point 10a  
> at http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN185.  
> For example, "active_users=10" would be sufficient without a UOM, but  
> "cabinet_temperature=20" could be in degrees centigrade or degrees  
> Fahrenheit.
> 
> The idea was that the label was free text to describe the thing being  
> measured, while the UOM gives the graphing program enough data on how  
> to graph (eg, RRD has a concept of graphing the difference between two  
> values for counters type data). Thus having an exhaustive list of UOM  
> units would make it extra coding. But there does seem to be confusion  
> as things like B (bytes) and s (seconds) are UOMs whereas it wouldn't  
> matter to the graphing program. Maybe we should be more like SI units?

I would strongly support SI units.  I think this would be an excellent 
idea.  Some use Greek characters, but I am sure there is a 
Latin character notation for these.  Anybody know?  If not, there can't 
be too much harm in making them up.

You did talk about automatically comparing variables in a graph drawing 
package.  It might be worth writing down some of the expected 
conversions.  Eg 1B = 8b, 1MB = 1024KB etc...
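
Written down as a table, the conversions might look like this Python 
sketch.  Only 1B = 8b and 1MB = 1024KB come from the line above; the KB 
and GB entries just extend the same pattern and are my assumption, as 
are the names TO_BITS and convert.

```python
# Each UOM expressed in bits.  Only 1B = 8b and 1MB = 1024KB are taken
# from the discussion; the other factors are assumed by analogy.
TO_BITS = {
    "b": 1,
    "B": 8,
    "KB": 8 * 1024,
    "MB": 8 * 1024 ** 2,
    "GB": 8 * 1024 ** 3,
}


def convert(value, from_uom, to_uom):
    """Convert value between two UOMs listed in TO_BITS."""
    return value * TO_BITS[from_uom] / TO_BITS[to_uom]


print(convert(1, "B", "b"))
print(convert(1, "MB", "KB"))
```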


>> 5.
>>
>> There is no way of representing a date.  There may be some plugins,  
>> eg, recording user information, which do want to record a date.
>>
>> I have suggested UNIX time above.  However another suggestion is to  
>> use the popular SQL syntax: '%Y-%m-%q %d:%M:%S.ms', eg, '2004-09-07  
>> 16:10:15.123'.  Or a component of 'date', 'data time', 'time.ms'.  It  
>> works for SQL :)
> 
> 
> I would prefer to use Unix time, only because of brevity. As long as it  
> gets translated later (and there are lots of common functions for it),  
> then the graphing would be okay.
> 
> Would Unix time with a .ms make sense for more granularity? This would  
> presumably need a UOM defined too.

Ok.  Translating an SQL-style date (thanks Yves for the correction to my 
format :) would be a pain anyway.  UNIX time is great.  As long as it's 
written down so we can follow it, I'm very happy to use whatever people 
think is best.
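
If a plugin already holds an SQL-style timestamp, turning it into Unix 
time with a fractional part is not much work.  A small Python sketch, 
assuming the timestamp is in UTC (a real plugin would need to know its 
own time zone) and using a made-up function name:

```python
from datetime import datetime, timezone


def to_unix_time(stamp):
    """Convert 'YYYY-MM-DD HH:MM:SS.mmm' to Unix time with a fraction.

    Assumption: the timestamp is in UTC.
    """
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


print(f"{to_unix_time('2004-09-07 16:10:15.123'):.3f}")
```

Whether the fraction is written as part of the number or given its own 
UOM is the open question from the quoted paragraph above.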


> My personal schedules dictate the amount of time I can afford on this,  
> but I hope it is always a friendly reception... :)

Thanks for the response.  Looks like most issues are already sorted.

Regards, Ben.





