Why would haproxy send a request to a server that is down?

From: Malcolm Handley <malcolm#asana.com>
Date: Fri, 14 May 2010 12:07:53 -0700


Hi, everyone.

I'm having some trouble with the routing of requests to servers within a backend.

Firstly, although I have "retries 3" in the defaults section of my config file I'm not seeing any evidence of retries. If a server is down but has not been detected as down by haproxy then a request may still get sent to it and a failure returned to the client. (This is the first bold line in the log below.)

Second, occasionally haproxy seems to route a request to a server that it knows is down. (This is the second bolded section below.)

I could understand both of these if I were using cookies for routing and had not enabled redispatching. But I'm using "balance leastconn" with no mention of cookies in the config file. What else might I be doing that would force haproxy to use a downed backend and not retry requests?

May 14 04:03:03 prod_lb0 haproxy[28398]: 67.112.125.46:61758
[14/May/2010:04:02:44.517]

ws_in ws_in/<NOSRV> -1/-1/-1/-1/18609 400 187 - - CR-- 7/7/0/0/0 0/0 "<BADREQ>"
May 14 04:03:17 prod_lb0 haproxy[28398]: *127.0.0.1:58921
[14/May/2010:04:03:08.486]

ws_in lists_ws/ws_2 51/0/0/-1/9049 502 204 - - SH-- 7/7/2/1/0 0/0 "GET /-/ping HTTP/1.1"*
May 14 04:03:17 prod_lb0 haproxy[28074]: 67.112.125.46:59661
[14/May/2010:03:04:22.220]

ws_in lists_ws/ws_2 55/0/0/31/3535558 101 27943453 - - ---- 2/2/2/2/0 0/0 "GET /app/etherlist/socket?session_id=9680378170&profiler=1 HTTP/1.1" May 14 04:03:17 prod_lb0 haproxy[28398]: 99.66.213.198:58624
[14/May/2010:03:52:18.455]

ws_in lists_ws/ws_2 455/0/0/2/659334 101 3348145 - - ---- 6/6/1/0/0 0/0 "GET /app/etherlist/socket?session_id=9385884222&profiler=1 HTTP/1.1" May 14 04:03:17 prod_lb0 haproxy[28074]: 98.210.108.197:43537
[14/May/2010:03:04:22.477]

ws_in lists_ws/ws_2 165/0/0/2/3535454 101 8809800 - - ---- 1/1/1/1/0 0/0 "GET /-/socket?session_id=9646753365 HTTP/1.1" May 14 04:03:18 prod_lb0 haproxy[28074]: 70.36.139.123:56872
[14/May/2010:03:04:22.511]

ws_in lists_ws/ws_2 88/0/0/2/3535503 101 25165709 - - ---- 0/0/0/0/0 0/0 "GET /-/socket?session_id=9669242145 HTTP/1.1" May 14 04:03:21 prod_lb0 haproxy[28398]: *Server lists_ws/ws_2 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms.*
May 14 04:03:23 prod_lb0 haproxy[28398]: *127.0.0.1:58965
[14/May/2010:04:03:20.475]

ws_in lists_ws/ws_2 10/0/-1/-1/3032 503 212 - - SC-- 8/8/2/0/3 0/0 "GET /-/ping HTTP/1.1"*
May 14 04:03:29 prod_lb0 haproxy[28398]: *Server lists_ws/ws_2 is UP, reason: Layer4 check passed, check duration: 0ms.* May 14 04:03:34 prod_lb0 haproxy[28398]: 99.66.213.198:59005
[14/May/2010:04:02:43.444]

ws_in ws/ws_1 15/0/1/30/50569 304 296 - - cD-- 9/9/4/2/0 0/0 "GET /-/static/luna/browser/images/loading.gif HTTP/1.1" Received on 2010/05/14 21:07

This archive was generated by hypermail 2.2.0 : 2010/05/14 21:15 CEST