[Nagiosplug-devel] RFC: Performance data guidelines
Voon, Ton
Ton.Voon at egg.com
Tue Jul 8 03:59:15 CEST 2003
Hi!
One of the features required for 1.4 is performance data. I would like to
write up the guidelines for this, but first want confirmation that this is
the right way to go, so any comments would be appreciated.
I think perf data should have/be:
- short labels
- generic and common labels across plugins if possible
- comma-separated, no spaces. Regex format: [a-z0-9]+=[0-9]*\.?[0-9]+
- redundant data removed (e.g. if check_disk returns pct and number (free),
the used bytes can be calculated)
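
To make the format rule concrete, here is a minimal sketch of a validator and parser for such a string. The names ITEM, PERFDATA and parse_perfdata are hypothetical and not part of any plugin. The per-item pattern allows any number of digits before the decimal point, which is an assumption about the intent of the regex above.

```python
import re

# Assumption: one item is label=value, and any number of digits may
# precede the optional decimal point.
ITEM = r"[a-z0-9]+=[0-9]*\.?[0-9]+"

# The whole string is one or more items joined by commas, with no spaces.
PERFDATA = re.compile(rf"{ITEM}(?:,{ITEM})*")


def parse_perfdata(text):
    """Return {label: float} for a perf data string, or raise ValueError."""
    if not PERFDATA.fullmatch(text):
        raise ValueError(f"malformed perf data: {text!r}")
    result = {}
    for item in text.split(","):
        label, value = item.split("=")
        result[label] = float(value)
    return result
```

With this sketch, parse_perfdata("time=0.123,pct=0.000") gives a dict with time and pct, while a string containing a space after the comma is rejected.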
My suggestions for labels are:
Name ; Units ; printf format ; Details
time ; seconds ; %.3f ; time taken to do a specific check (e.g. DNS query,
    HTTP request, ping RTA)
pct ; percent ; %.3f ; percentage (free rather than used if applicable)
    (e.g. total disk, total swap, ping percent loss)
number ; must be bytes if applicable ; %d ; a given number of things (free
    rather than used if applicable) (e.g. processes, users, bytes used such
    as total disk or total swap)
numberf ; float ; %.3f ; a given number of things that may be fractional
    (e.g. load average, average bytes transmitted)
counter ; a continuous counter (must be bytes if applicable) ; %d ; a
    continuous counter (e.g. bytes transmitted on an interface)
load1 ; load ; %.2f ; load average over 1 min
load5 ; load ; %.2f ; load average over 5 min
load15 ; load ; %.2f ; load average over 15 min
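
As a rough illustration of how the printf formats in the table could be applied, the sketch below renders (label, value) pairs into a comma-separated string. FORMATS and format_perfdata are hypothetical names chosen for this example; only the labels and format strings come from the table.

```python
# Label -> printf format, copied from the table above.
FORMATS = {
    "time": "%.3f",
    "pct": "%.3f",
    "number": "%d",
    "numberf": "%.3f",
    "counter": "%d",
    "load1": "%.2f",
    "load5": "%.2f",
    "load15": "%.2f",
}


def format_perfdata(pairs):
    """Render (label, value) pairs as label=value items joined by commas."""
    # An unknown label raises KeyError, which keeps plugins on the common set.
    return ",".join(f"{label}=" + FORMATS[label] % value for label, value in pairs)
```

For example, format_perfdata([("time", 0.0421), ("pct", 0)]) yields time=0.042,pct=0.000.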
Contentious points:
- loadx. I am not really keen on these, but they don't seem to fit into any
other label, unless we only return load5 and use numberf
- taking free values rather than used. This is consistent with the output
of check_disk and check_swap. Looking at graphs, I guess you want to see the
value approach zero, which is your definite limit, rather than increase
continuously
- maybe numberf is not required, and we say instead that number can be
fractional. I think this may be better, as RRD doesn't care whether values
are integers or not
- too reductionist? Would you prefer labels that describe the measure? I
think the labels should be generic and the plugin should describe the context
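
The redundancy guideline and the free-rather-than-used choice fit together: given a free count and the free percentage, the total and used values can be recovered by whatever consumes the data. A minimal sketch, with used_from_free as a hypothetical helper name:

```python
def used_from_free(free, pct_free):
    """Derive (total, used) from a free count and the free percentage."""
    # A free percentage of zero carries no information about the total.
    if not 0 < pct_free <= 100:
        raise ValueError("pct_free must be in (0, 100] to recover the total")
    total = free / (pct_free / 100.0)
    return total, total - free
```

Since pct would be printed with %.3f, the recovered figures are only as precise as that rounding allows.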
As an example, the patches submitted on SF for check_ping had perf labels of
rta and loss, but I think these should be time and pct respectively. I think
this makes it easier for something like RRD to work out what type of value
it is to draw the graphs. Why the returned values are bad is then up to
interpretation (and that is the key to any performance analysis!).
Ton