[Nagiosplug-devel] Check_ping question
Andreas Ericsson
ae at op5.se
Wed Nov 10 16:42:01 CET 2004
Robert Nelson wrote:
>>Nonsensical. The timeout for iterative plugins should always be
>>calculated based on the sum of per-iteration timeouts (in this case,
>>5 seconds * 3 packets).
>
>
> Yes, but what I'm really looking for is a real "check-host-alive". I
> don't want a status report, I just want a PASS/FAIL result. Since Nagios
> seems configured to use check_ping for it, I am not looking for it to be
> an iterative plugin in this case.
>
I'm working on writing a check_host_alive plugin, which will be able to
do just this.
>
>>>I end up with an *effective* timeout
>>>value of (5000.0 / 1000.0 * 3 + 3 =) 18 seconds. This seems "broken" to
>>>me.
>>>
>>
>>It isn't. If you specify a per-packet timeout value of 5 seconds and
>>send 3 packets that means the complete timeout must be at least equal to
>>15 seconds (I don't know where those extra three come from), otherwise
>>the per-packet timeout wouldn't make sense (should you count packets as
>>lost if they're not even sent?).
>
>
> I guess I am also confused on this one, as /bin/ping on most OSes will
> send packet 2 at 1000 ms and packet 3 at 2000 ms; with a 5000 ms timeout,
> that's 7.0000000001 seconds total. Why are we going all the way up to
> 18, or even 15 seconds then?
>
It's just the maximum. check_ping doesn't have a backoff factor (which
is what you describe), since it forks the system ping to do its dirty work.
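
To put numbers on the figures in this thread, here is a rough sketch of
the arithmetic. It is not code from check_ping.c: the function names,
the one-second send interval and the treatment of the extra 3 seconds as
a flat slack value are assumptions made for illustration only.

```python
# Illustrative arithmetic only; not taken from check_ping.c.

def worst_case_timeout(per_packet_ms, packets, slack_s=0.0):
    """Upper bound if every packet is allowed its full timeout in turn."""
    return per_packet_ms / 1000.0 * packets + slack_s


def interval_based_timeout(per_packet_ms, packets, interval_ms=1000.0):
    """Time until the last packet times out when packets are sent at a
    fixed interval (assumed 1000 ms, as described for /bin/ping)."""
    return ((packets - 1) * interval_ms + per_packet_ms) / 1000.0


if __name__ == "__main__":
    print(worst_case_timeout(5000.0, 3))               # 15.0
    print(worst_case_timeout(5000.0, 3, slack_s=3.0))  # 18.0
    print(interval_based_timeout(5000.0, 3))           # 7.0
```

The first two results are the 15 and 18 second maximums discussed here;
the last is the roughly 7 seconds you get when packets go out one second
apart.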
>
>
>>>I ended up commenting out the last three lines quoted from check_ping.c
>>>and recompiling it. I'm just curious whether this is behavior by design
>>>or by error, and whether I need to make notes about it for when
>>>plugins-1.3.2 comes out. Thanks!
>>>
>>
>>It's obviously by design, and 1.3.2 won't come out. They're at 1.4.0
>>now. What would be good would be to remove the timeout value, but that
>>would make a LOT of configurations out there return UNKNOWN instead.
>
>
> I'll note that. I don't mind the way it calculates how long it *should*
> take. To me it appears that a specified "timeout" value should not be
> overridden, though.
>
It isn't overridden. It's just prioritized lower than the timeout value
specified in the threshold values, for programmatic reasons.
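
Roughly, the precedence looks like the sketch below. It assumes the
plugin simply keeps the larger of the two values, which is consistent
with the 18 seconds reported in this thread. The names, the 10 and 30
second example timeouts and the flat 3 second extra are made up for the
example and are not the identifiers or defaults used in check_ping.c.

```python
# Hypothetical sketch of the precedence; not the check_ping.c source.

def effective_timeout(user_timeout_s, per_packet_ms, packets, extra_s=3.0):
    """Return the timeout that actually applies, in seconds.

    The value implied by the thresholds wins whenever it is larger than
    the timeout the user asked for (assumed behaviour).
    """
    implied_s = per_packet_ms / 1000.0 * packets + extra_s
    return max(float(user_timeout_s), implied_s)


if __name__ == "__main__":
    print(effective_timeout(10, 5000.0, 3))  # 18.0: threshold value wins
    print(effective_timeout(30, 5000.0, 3))  # 30.0: user value is kept
```

So a user-supplied timeout is still honoured whenever it is at least as
long as what the thresholds require.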
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Lead Developer