[Nagiosplug-devel] Working on testcases
Ethan Galstad
nagios at nagios.org
Sun Nov 13 23:12:06 CET 2005
I agree with what Sean is suggesting for the return codes. I would
recommend staying away from using the UNKNOWN state for much of
anything - save internal plugin errors and command line boo-boos.
Connection failures and timeouts should probably be CRITICAL errors,
as they represent a real (serious) problem with the service being
checked.
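
To make that policy concrete, here is a minimal sketch in Python of the mapping described above. The numeric values are the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the `Failure` categories and the `state_for` helper are hypothetical names made up for illustration, not anything in the plugins source.

```python
import enum


class State(enum.IntEnum):
    """Standard Nagios plugin exit codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


class Failure(enum.Enum):
    """Hypothetical failure categories, named only for this sketch."""
    BAD_ARGUMENTS = "bad_arguments"          # command line boo-boos
    INTERNAL_ERROR = "internal_error"        # bug or resource failure inside the plugin
    CONNECTION_FAILED = "connection_failed"  # e.g. connection refused
    TIMEOUT = "timeout"                      # service did not answer in time


# UNKNOWN is reserved for problems in the plugin or its invocation;
# anything that says the service itself is unreachable is CRITICAL.
_STATE_FOR_FAILURE = {
    Failure.BAD_ARGUMENTS: State.UNKNOWN,
    Failure.INTERNAL_ERROR: State.UNKNOWN,
    Failure.CONNECTION_FAILED: State.CRITICAL,
    Failure.TIMEOUT: State.CRITICAL,
}


def state_for(failure: Failure) -> State:
    """Return the plugin state to report for a given failure category."""
    return _STATE_FOR_FAILURE[failure]
```

A plugin following this sketch would finish with something like `sys.exit(state_for(Failure.TIMEOUT))`, so a timeout surfaces as a real problem rather than as UNKNOWN.
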
On 13 Nov 2005 at 14:57, sean finney wrote:
> hi,
>
> just to throw another $0.02 into the bucket...
>
> On Fri, Nov 11, 2005 at 12:51:48AM +0000, Ton Voon wrote:
> > "UNKNOWN is for invalid command args or any other failure before the
> > requested check can be performed - with the only exception being
> > hostname lookups which should return CRITICAL."
>
> given the example you listed below, i don't think this is a good idea.
> rather, i think something like:
>
> "UNKNOWN is for invalid command args or other failures preventing
> the plugin from performing the specified operation."
>
> about dns: i think there are two specific and very different kinds
> of failure. there is "general resolution failure", and there is a
> "host does not exist failure". i would say that the former ought
> to remain as an UNKNOWN, as it parallels similar failures in other
> system calls such as malloc. however, if the plugin gets a "no such
> host" response, then it definitely should be CRITICAL--as you could
> implicitly divine that the hostname is supposed to resolve. similarly,
> i feel that remote service check connection failures should remain
> CRITICAL.
>
>
>
> > (2) check_http -H webserver -w 2
> >
> > This returns OK if it can connect to webserver and returns data within 2
> > seconds. If it cannot connect, then this returns UNKNOWN because it
> > is not the metric that is being requested to check against (currently
> > returns CRITICAL).
>
> i'd say it should still return CRITICAL.
>
> > (3) check_http -H webserver -r 'string_to_find'
> >
> > This returns OK if it can find the server and return data with the
> > string. If it cannot connect to the server (currently CRITICAL), or
> > gets a 302 redirection (currently OK (?) ), this should be an UNKNOWN.
>
> again, i think things such as "connection refused" should still result
> in states indicative of a problem. the big difference in my
> view is that some problems prevent the plugin from doing its job,
> while other problems signify that there really is a problem.
>
> wrt the 302 redirections, i haven't even looked at what we're currently
> doing but feel we ought to follow the redirection (or provide
> a cmdline toggle) if we want to be good user-agents :)
>
> for example, malloc or name resolution failing means that the plugin
> could not tell you the service status regardless of what it was,
> whereas a "host does not exist" or "connection refused" mean that
> something is in fact wrong (and that other people would probably
> be having the same problem).
>
>
> sean
>
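
The DNS distinction Sean draws above could be sketched roughly as follows, again in Python. This assumes the resolver reports "no such host" as `EAI_NONAME` and lumps every other lookup error in with general resolution failure; real resolvers do not always separate the two cases this cleanly. The function names and the injectable `resolver` parameter are hypothetical, added so the logic can be exercised without touching the network.

```python
import socket

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def state_for_lookup_error(err: socket.gaierror) -> int:
    """Map a name-resolution error to a plugin state.

    Assumption: EAI_NONAME means "host does not exist"; any other
    getaddrinfo error is treated as a general resolution failure.
    """
    if err.errno == socket.EAI_NONAME:
        # The name was expected to resolve and does not: a real problem.
        return CRITICAL
    # The plugin could not find out either way.
    return UNKNOWN


def check_resolves(hostname: str, resolver=socket.getaddrinfo) -> int:
    """Return a plugin state for whether `hostname` resolves.

    `resolver` is a hypothetical hook so a fake lookup can be swapped in.
    """
    try:
        resolver(hostname, None)
    except socket.gaierror as err:
        return state_for_lookup_error(err)
    except MemoryError:
        # Parallels the malloc case: the plugin itself could not do its job.
        return UNKNOWN
    return OK
```

Connection failures after a successful lookup are not covered here; under the policy discussed in this thread they would stay CRITICAL.
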
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org