[Nagiosplug-devel] check_http: Socket timeout after 10 seconds (but webserver is up! )

Christoph Stotz stotz at logo.de
Sun Jan 26 14:06:13 CET 2003


Hello,

I am seeing strange behaviour with check_http. I am trying to
check multiple virtual webs hosted on the same machine. The
first web (I call it first because it has the lowest IP, even though that
should not really matter) is checked without any problems, but every
following one times out.

<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here--->
spaehfix:/usr/local/nagios # ./libexec/check_http -H A.B.C.D
HTTP ok: HTTP/1.1 200 OK -   0.212 second response time |time=  0.212
spaehfix:/usr/local/nagios # ./libexec/check_http -H E.F.G.H
Socket timeout after 10 seconds
spaehfix:/usr/local/nagios # ./libexec/check_http -H I.J.K.L
Socket timeout after 10 seconds
<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here--->

and so on. All webs are on the very same server within the same
IIS instance (if "use Apache" pops up in your mind: I have the same problem
with an Apache web as well :o). The difference seems to be the way a request
like the following one is served:

<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here--->
spaehfix:/usr/local/nagios # telnet A.B.C.D 80
Trying A.B.C.D...
Connected to A.B.C.D.
Escape character is '^]'.
GET / HTTP/1.0
[CR+LF][CR+LF]
<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here--->

On the first web, this request returns the whole header and the
content of the "/" page. On all the others, the same request returns
nothing within 10 (or more) seconds. I believe this is caused by sending a
"mixed-old-style" HTTP request. In fact, on the Apache I get a proper error
message logged, saying that my request does not conform to RFC2616 section
14.23. Note that this section talks about specifying the host within the request.
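
To make the difference concrete, here is a minimal Python sketch (not taken
from check_http; the helper name build_request and the host name
www.example.com are made up for this illustration). It builds the bare
request typed into telnet above and the same request with the Host header
that RFC2616 section 14.23 describes. Whether sending the header actually
cures the timeout on these servers is an assumption, not something verified
here.

```python
from typing import Optional


def build_request(path: str, host: Optional[str] = None, version: str = "1.0") -> bytes:
    """Build a minimal GET request; add a Host header only if a host is given."""
    lines = [f"GET {path} HTTP/{version}"]
    if host is not None:
        # RFC2616 section 14.23: the Host header names the virtual host.
        lines.append(f"Host: {host}")
    # An empty line terminates the header block.
    lines.append("")
    lines.append("")
    return "\r\n".join(lines).encode("ascii")


# The request as typed into telnet: no Host header at all.
bare = build_request("/")

# Hypothetical host name, used for illustration only.
with_host = build_request("/", host="www.example.com", version="1.1")

print(bare)
print(with_host)
```

The only difference between the two requests is the single "Host:" line,
which is what a name-based virtual host setup would need in order to pick
the right web.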

Error Message in Apache (so other people may find this article):

<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here--->
[Sun Jan 26 20:45:51 2003] [error] [client A.B.C.D] client sent HTTP/1.1
request without hostname (see RFC2616 section 14.23): /
<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here--->

In IIS I only see:

<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here---><---cut here--->
2003-01-26 21:34:21 A.B.C.D - E.F.G.H GET /Default.htm - 200 0 18 206718 80
HTTP/1.0 - - -
<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here---><---cut here--->
(Even though this looks as if check_http succeeded, it did not. It timed
out.)

So this whole situation makes me believe that something in check_http is
either wrong or unsupported. The fact that the very first web on the
IIS works fine and all the others do not makes me believe that this comes
down to the handling of virtual hosts in the HTTP request (FYI: each of
those webs has its own IP address).

I also tried running check_http with the "--verbose" option. This
does not make any difference:

<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here---><---cut here--->
spaehfix:/usr/local/nagios/libexec # ./check_http --verbose -H E.F.G.H
Socket timeout after 10 seconds
<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here---><---cut here--->

And if you think that this might be a network problem, have a look at
tcpdump's output:


<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here---><---cut here--->
spaehfix:~ # tcpdump src E.F.G.H or dst E.F.G.H -i eth0.3128
tcpdump: listening on eth0.3128
22:47:26.527226 E.F.G.H.42125 > A.B.C.D.http: S 4032238030:4032238030(0) win
5840 <mss 1460,sackOK,timestamp 9874648 0,nop,wscale0> (DF)
22:47:26.527399 A.B.C.D.http > E.F.G.H.42125: S 702361681:702361681(0) ack
4032238031 win 8760 <mss 1460> (DF)
22:47:26.527432 E.F.G.H.42125 > A.B.C.D.http: . ack 1 win 5840 (DF)
22:47:26.527654 E.F.G.H.42125 > A.B.C.D.http: P 1:17(16) ack 1 win 5840 (DF)
22:47:26.703234 A.B.C.D.http > E.F.G.H.42125: . ack 17 win 8744 (DF)
22:47:26.703270 E.F.G.H.42125 > A.B.C.D.http: P 17:102(85) ack 1 win 5840
(DF)
22:47:27.240061 E.F.G.H.42125 > A.B.C.D.http: P 17:102(85) ack 1 win 5840
(DF)
22:47:27.241273 A.B.C.D.http > E.F.G.H.42125: . ack 102 win 8659 (DF)
22:47:36.520323 E.F.G.H.42125 > A.B.C.D.http: F 102:102(0) ack 1 win 5840
(DF)
22:47:36.520495 A.B.C.D.http > E.F.G.H.42125: . ack 103 win 8659 (DF)
<---cut here---><---cut here---><---cut here---><---cut here---><---cut
here---><---cut here--->

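I believe this means that there is no problem in allocating the network
socket: the TCP handshake completes and the server acknowledges the request
data, it just never sends a response before the connection is closed.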
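
As a small cross-check of the trace, here is a Python sketch (the helper
name elapsed_seconds is made up for this illustration) that computes the
time between the first SYN and the FIN, using the two timestamps copied
from the tcpdump output above. It also prints the length of a request line
of the form "GET / HTTP/1.0", which happens to match the 16 bytes of the
first data segment; that this is what check_http actually sent in that
segment is an assumption.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> float:
    """Return the seconds between two tcpdump timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the first (SYN) and the FIN line of the trace.
syn_time = "22:47:26.527226"
fin_time = "22:47:36.520323"

# Just under 10 seconds, in line with "Socket timeout after 10 seconds".
print(round(elapsed_seconds(syn_time, fin_time), 3))

# Assumed request line; its length equals the 16 bytes in "P 1:17(16)".
request_line = b"GET / HTTP/1.0\r\n"
print(len(request_line))
```

So the FIN comes from the checking side right at the timeout, and not from
the web server.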

So, is anybody else experiencing the very same problem, and does anyone
have an idea how to solve it (other than by using only check_tcp)?


Thanks a lot for your help!


Christoph Stotz
logo: GmbH



