[Nagiosplug-devel] [Nagiosplug-help] check_ntp

Andreas Ericsson ae at op5.se
Mon Nov 12 10:21:09 CET 2007


Thomas Guyot-Sionnest wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 11/11/07 12:33 PM, Andreas Ericsson wrote:
>> Thomas Guyot-Sionnest wrote:
>>> Isn't that as bad? A plugin doesn't do what it is advertised to do, so
>>> change the description and get out with it? What about people that
>>> actually relied on this description? Shouldn't they get what they think
>>> they're getting?
>>>
>> Well, the plugin *does* check the selected ntp server. The description is
>> just ambiguous, not actually wrong.
> 
> If you add jitter or stratum check. The base check alert only on clock
> diff between the local and remote host using NTP, which doesn't have
> much to do about the remote NTP server apart that it's responding.
> 

90% of all the plugins that claim to check a network service only check
that it's responding.

>>> I didn't reviewed my code yet; I also planed some more thoughtful tests
>>> and even putting check_ntpd in a full production system before release.
>>>
>>> If you found any bug I'll be glad to fix them.
>>>
>> Here's one.
>>
>> diff --git a/plugins/check_ntpd.c b/plugins/check_ntpd.c
>> [...}
> 
> I don't get this. It isn't a patch since the b/ code it the current one.
> So I guess it's one of my commit. I moved the status variable from an
> argument-passed *int to a local int, so there's no bug here. I also
> planed reviewing variables in this function anyways.
> 

Arggh, you're right. I got fooled by the git-diff format, which is clearly
broken for this case. It shows the function declaration with each hunk-header,
but since you changed the declaration in the same commit there is no bug.
Sorry for the noise.

> 
>>> , use check_ntpd.
>>> if you want to check the time diff between a server and a remote clock,
>> Ambiguous again. "If you want to check if a server is in sync with a
>> particular ntp server, use check_ntp_time" (check_time_ntp and check_ntpd
>> wouldn't sort next to each other in directory listings, so check_ntp_time
>> would be better, imo).
> 
> Noted, and yes I hesitated between both; I'll go with check_ntp_time if
> you prefer :)
> 

It'll sort better if nothing else.

>>> Yes and I mentioned creating two checks and not touching check_ntp two
>>> posts ago, so I don't understand why it's still going on in the thread.
>>>
>> Because you want to deprecate it, which means "get ready to remove it in
>> the future", while you're saying "oh, but lookee here you can use this
>> new plugin to do the exact same thing".
> 
> The final removal can be for the next major release. And before
> upgrading a major release people should be prepared for non-backward
> compatible changes (We'll make sure to document them, though).
> 
>>> I mean it checks if the remote server has a sync.peer (noted with a star
>>> in "ntpq -p" output. This might not be working well yet, but as I said
>>> it's still work in progress...
>>>
>> So changing from check_ntp to check_ntpd would mean one gets no warning
>> if the nagios server starts drifting then? That's a clear change of
>> semantics right there, and that's why a LOT of people will think the
>> new plugin is just plain broken.
> 
> Not it they monitor their local NTP daemon. In my setups the Nagios gets
> the same basic check as all other servers, regardless if it's using NRPE
> or not.
> 
> On a side note we just got a new CMS for nagiosplugins.org. There's
> place for more documentation on Nagios plugins and I'll do my best to
> fill this gap in the coming months. I can't watch everyone's
> implementation of Nagios to see if they do everything right, but I can
> give them tools to build it the best way possible.
> 

I think the whole reason for the confusion in the first place is that the
plugin was poorly named. It's a shame it got released in its original
incantation, but breaking things for existing setups now would be worse
than living with a poorly named plugin for a year or so.

I'd suggest doing this:
Keep check_ntp for now, but introduce check_ntp_peers (the current
check_ntpd).

Introduce check_ntp_local_offset, doing what check_time_ntp does (or is
intended to do anyways). 

These two break nothing, so it can be done any time, really.

1.5:
Make check_ntp_local_offset install as check_ntp_local_offset and
check_ntp, while adding -j and -k to check_ntp_local_offset, but
silently ignoring them.

1.6:
Make check_ntp_local_offset give a warning when it gets -j or -k,
and deprecate check_ntp (with a warning if it's called with that
name).

1.7:
Remove check_ntp.


Alternatively, release 2.0 and do all of the above in one go.
It's traditional to have "all bets are off" when releasing a
major version, so people will just have to cope with it. Put
sed oneliners in some document somewhere for people to run on
their config files to make use of the new plugins over the
old ones.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231




More information about the Devel mailing list