Hi Krzysztof and Willy,
I have compiled haproxy 1.3.15.1 for x86_64
I found with dmesg some of this errors:
ip_conntrack: table full, dropping packet.
So I tougth I had found the problem but I was wrong. I doubled the ip_conntrack_max value and the dmesg error dissapeared but I still get the same problem. Around 5 reloads ok and then a wrong reload wich takes all my websites offline for EXACTLY 1 MIN.
connecting to port 81 I see every backend up. connecting on port 80 I get 503 error on every backend. exactly when the stats web arrives to "uptime 1:01min" I get every backend up on port 80. I am sure the 1min is not random. I have tried it 2 times with the same result.
Other strange thing is that I am getting this on logs on every reload. on both good reload and wrong reloads
May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy trac started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy trac started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gen21 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gen21 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy txema started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy txema started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy ibm3 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy ibm3 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gen29_backend started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gen29_backend started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy phylemon_backend started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy phylemon_backend started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy vmware-web started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy vmware-web started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy tomcat-gen26 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy tomcat-gen26 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gen20 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gen20 started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy babelomics_backend started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy babelomics_backend started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gepas_backend started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy gepas_backend started. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy trac. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy trac. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy trac failed to enter pausemode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gen21. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gen21. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen21 failed to enter pausemode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy txema. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy txema. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy txema failed to enter pausemode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy ibm3. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy ibm3. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy ibm3 failed to enter pausemode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gen29_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gen29_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen29_backend failed to enterpause mode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy phylemon_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy phylemon_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy phylemon_backend failed toenter pause mode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy vmware-web. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy vmware-web. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy vmware-web failed to enterpause mode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy tomcat-gen26. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy tomcat-gen26. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy tomcat-gen26 failed to enterpause mode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gen20. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gen20. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen20 failed to enter pausemode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy babelomics_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy babelomics_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy babelomics_backend failed toenter pause mode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gepas_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy gepas_backend. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gepas_backend failed to enterpause mode.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy http-in. May 30 00:47:58 127.0.0.1 haproxy[27763]: Pausing proxy http-in. May 30 00:47:58 127.0.0.1 haproxy[27763]: Some proxies refused to pause,performing soft stop now.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy trac in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy trac in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy gen21 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy gen21 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy txema in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy txema in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy ibm3 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy ibm3 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy gen29_backend in 0ms.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy vmware-web in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy vmware-web in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy tomcat-gen26 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy tomcat-gen26 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy gen20 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy gen20 in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy babelomics_backend in0 ms.
May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy http-in in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Stopping proxy http-in in 0 ms. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy trac stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy trac stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen21 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen21 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy txema stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy txema stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy ibm3 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy ibm3 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen29_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen29_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy phylemon_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy phylemon_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy vmware-web stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy vmware-web stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy tomcat-gen26 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy tomcat-gen26 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen20 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gen20 stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy babelomics_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy babelomics_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gepas_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy gepas_backend stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy http-in stopped. May 30 00:47:58 127.0.0.1 haproxy[27763]: Proxy http-in stopped. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy http-in started. May 30 00:47:58 127.0.0.1 haproxy[27776]: Proxy http-in started.
other strange thing is that in the wrong reload I got this on the log:
May 30 00:47:42 127.0.0.1 haproxy[27763]: 127.0.0.1:35292 [30/May/2008:00:47:42.500] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34106 - - PR-- 0/0/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:42 127.0.0.1 haproxy[27763]: 127.0.0.1:35297 [30/May/2008:00:47:42.990] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34084 - - PR-- 1/1/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:43 127.0.0.1 haproxy[27763]: 127.0.0.1:35310 [30/May/2008:00:47:43.507] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34053 - - PR-- 1/1/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:43 127.0.0.1 haproxy[27763]: 127.0.0.1:35315 [30/May/2008:00:47:43.622] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34055 - - PR-- 2/2/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:44 127.0.0.1 haproxy[27763]: 127.0.0.1:35328 [30/May/2008:00:47:44.109] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34056 - - PR-- 1/1/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:44 127.0.0.1 haproxy[27763]: 127.0.0.1:35335 [30/May/2008:00:47:44.555] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34056 - - PR-- 2/2/0/0/0 0/0 "GET /stats HTTP/1.1"
30/May/2008:00:47:44
---------- HERE STOPS LOGIN FOR AROUND ONE MINUTE ------------------------------- THEN CONTINUES LOGGING BUT TIMESTAMP IS 1SEC LATER ----------- May 30 00:47:45
May 30 00:47:45 127.0.0.1 haproxy[27763]: 127.0.0.1:35338 [30/May/2008:00:47:45.035] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34056 - - PR-- 0/0/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:45 127.0.0.1 haproxy[27763]: 127.0.0.1:35355 [30/May/2008:00:47:45.559] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34059 - - PR-- 2/2/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:46 127.0.0.1 haproxy[27763]: 127.0.0.1:35371 [30/May/2008:00:47:46.043] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34060 - - PR-- 3/3/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:46 127.0.0.1 haproxy[27763]: 127.0.0.1:35373 [30/May/2008:00:47:46.283] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34060 - - PR-- 2/2/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:46 127.0.0.1 haproxy[27763]: 127.0.0.1:35376 [30/May/2008:00:47:46.545] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34060 - - PR-- 1/1/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:47 127.0.0.1 haproxy[27763]: 127.0.0.1:35389 [30/May/2008:00:47:47.217] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34060 - - PR-- 5/5/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:47 127.0.0.1 haproxy[27763]: 127.0.0.1:35406 [30/May/2008:00:47:47.915] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34061 - - PR-- 2/2/0/0/0 0/0 "GET /stats HTTP/1.1" May 30 00:47:48 127.0.0.1 haproxy[27763]: 127.0.0.1:35415 [30/May/2008:00:47:48.371] http-in http-in/<STATS> 0/-1/-1/-1/0 200 34096 - - PR-- 3/3/0/0/0 0/0 "GET /stats HTTP/1.1"
¿¿maybe can affect that I am using syslog-ng?? I dont think so but.....
The output of ethtool is this:
Settings for eth0:
Supported ports: [ TP ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full Supports auto-negotiation: Yes Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full Advertised auto-negotiation: Yes Speed: 100Mb/s Duplex: Full Port: Twisted Pair PHYAD: 1 Transceiver: internal Auto-negotiation: on Supports Wake-on: g Wake-on: d Link detected: yes Settings for eth1: Supported ports: [ TP ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full Supports auto-negotiation: Yes Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full Advertised auto-negotiation: Yes Speed: 100Mb/s Duplex: Full Port: Twisted Pair PHYAD: 1 Transceiver: internal Auto-negotiation: on Supports Wake-on: g Wake-on: d Link detected: yes
Not sure if is possible what you mention " duplex mismatch between the haproxy server and the switch". I will try to explore it deeper tomorrow at the lab.
The 503 error logs that I sent on my last mail I think was ok. I mean, the backend was really down and the 503 error was right. I just added the "trac" backend today and testing it I turned it off. The strange thing is that those are the only 503 errors on my log. On the 1 minute downtime I dont get any 503 error on my haproxy log (I can see the 503 just on my browser).
I thik tomorrow I will try to compile latests haproxy version for x86 and do some testing....
many thanks again for all your help.
I go to bed now. enough for one day :)
best wishes,
Pablo.
El Thursday 29 May 2008 22:40:59 Willy Tarreau escribió:
> On Thu, May 29, 2008 at 06:00:18PM +0200, Pablo Escobar wrote:
> > Hi,
> >
> > Many thanks for your answer Willy, really helpful as always :)
> >
> > After your indications I tried to increase the check interval and reduce
> > the rise parameter and I have been testing during this week. For three
> > days it hasn´t failed but today suddenly the 503 error came back.
> >
> > I havent found that the check column is growing in noany of the backends.
> >
> > I have a quite strange behaviour. If I directly connect to the haproxy
> > listening on port 81 I can see that all my backeds are up but I get the
> > 503 error for around 10 -15 seconds after the reload. This happens
> > randomly.
> >
> > Also if a backend is down when I do the reload I get the 503 on all my
> > backend, not just the down backend. ¿is this the normal behaviour?
>
> No, but it sounds to me like a network problem on the proxy's network
> interface. Having several backends down at start time means that some
> health-checks have failed at least once. The fact that you get them
> more often when you reload may simply be because during a reload you
> already have traffic on the interface, while if you start from scratch
> you don't have any traffic yet so the health-checks have more chances
> to pass.
>
> This would also explain why you get them all down if you reload during
> a problem : if you see one backend down, it means the problem is happening
> again. Then you reload everything and get all checks to fail on startup.
> This could be caused by a duplex mismatch between the haproxy server and
> the switch (most often a 100 Mbps switch forced to 100full with a server
> set to autoneg).
>
> Please check "ethtool -i eth0" on the server. I would not be surprized
> to see "100 Half" here. Also, "dmesg" might be showing links going UP
> and DOWN.
>
> You may as well be using spanning tree on your local switches, with a
> failing link causing occasionnal STP recalculations during which no
> traffic can pass.
>
> > Right now I have a reverse proxy using apache´s mod_proxy which sends all
> > my inbound http traffic to the haproxy listening on the same machine on
> > port 81. ¿maybe this can be affecting?
>
> no, I see no reason for that to cause any problem.
>
> > I am doing it in this way because having a
> > apache´s vhost which processes all my http traffic let me apply
> > mod_rewrite, mod_security and mod_cband to all my traffic.
> >
> > If you are interested I can send you my vhost config and my haproxy
> > config.
> >
> > many thanks in advance for any help.
> >
> > Pablo.
> >
> > p.d. I havent forgot about the docs to get snmpd working with perl
> > support. Sorry for the delay, late weeks I have been overloaded with
> > work. I promise to send it this week.
>
> No emergency, keep calm :-)
> We're all used to disappear for weeks and re-appear here on the list.
> It seems we all have a ton of work :-)
>
> Regards,
> Willy
-- Pablo Escobar Lopez Head of Infrastructure & IT Support Bioinformatics Department Centro de Investigación Príncipe Felipe (CIPF) Tfn: (34) 96 328 96 80 ext: 1004 http://bioinfo.cipf.esReceived on 2008/05/30 01:22
This archive was generated by hypermail 2.2.0 : 2008/05/30 01:31 CEST