Hello Willy,
thanks for the detailed reply.
On 29/11/08 20:32, Willy Tarreau wrote:
> Hello Kai,
>
> On Fri, Nov 28, 2008 at 11:55:51AM +0000, Kai Krueger wrote:
>
>> Hello list,
>>
>> we are trying to set up haproxy as a load balancer between several
>> webservers handling long-running database queries (on the order of
>> minutes). Altogether it is working quite nicely; however, there are two
>> somewhat related issues causing many requests to fail with a 503 error.
>> As the backend webservers have some load that is outside the control of
>> haproxy, during some periods of time requests fail immediately with a
>> 503 overload error from the backends. In order to circumvent this, we
>> use the httpchk option to monitor the servers and mark them as down when
>> they return 503 errors. However, there seem to be cases where requests
>> get routed to backends even though they are correctly recognized as down
>> (according to the haproxy stats page), causing these requests to fail too.
>> Worse still is that due to the use of least connection scheduling the
>> entire queue gets immediately drained to this server once this happens,
>> causing all of the requests in the queue to fail with 503.
>> I haven't identified under what circumstances this exactly happens, as
>> most of the time it works correctly. One guess would be that the issue
>> may have something to do with the fact that there are still several
>> connections open to the server when it gets marked down, which happily
>> continue to run to completion.
>>
>
> From your description, it looks like this is what is happening. However,
> this must be independent of the LB algo. What might be happening though,
> is that the last server to be identified as failing gets all the requests.
>
Is there a way to find out whether this is the case? I think there were
still other backends up at those times.
Below is a log which I think captures the effect. (Similar config to the
one posted previously, but with some maxconn and check inter settings
tuned to make this easier to trigger, and it was run on a different machine.)
Nov 30 00:21:31 aiputerlx haproxy[10400]: Proxy ROMA started.
Nov 30 00:21:45 aiputerlx haproxy[10400]: 127.0.0.1:51993 [30/Nov/2008:00:21:39.125] ROMA ROMA/mat 0/0/0/2339/6830 200 8909913 - ----- 3/3/3/1/0 0/0 "GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"
Nov 30 00:21:47 aiputerlx haproxy[10400]: Server ROMA/mat is DOWN. 2 active and 0 backup servers left. 2 sessions active, 0 requeued, 0 remaining in queue.
According to the log, Server ROMA/mat is DOWN at 00:21:47, yet the
4th request, sent at 00:21:48, was still queued to ROMA/mat, which at
that time was still down, as it does not come back up until 00:22:26. It
might have received a valid health check response and been in the process
of coming back up. Am I perhaps misreading the logs here?
>
>> Is there a way to prevent this from happening?
>>
>
> Not really since the connections are already established, it's too late.
> You can shorten your srvtimeout though. It will abort the requests earlier.
> Yours are really huge (20mn for HTTP is really huge). But anyway, once
> connected to the server, the request is immediately sent, and even if you
> close, you only close one-way so that server will still process the request.
>
The backend servers have to query a database to generate the response,
and some of these replies can occasionally take 10-20 minutes to
produce. This is why we chose the very long request timeout, so that even
these can be handled.
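For reference, that timeout is set along these lines in the config (the
exact value here is illustrative; srvtimeout takes milliseconds):

    # allow the backend up to 20 minutes to produce a response
    # before haproxy gives up on the request
    srvtimeout 1200000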
>
>> The second question is regarding requeuing. As the load checks fluctuate
>> quite rapidly, periodically querying the backends to see if they are
>> overloaded seems somewhat too slow, leaving a window open between when
>> the backends start rejecting requests and when haproxy notices this and
>> takes down that server. It would be better for haproxy to recognize the
>> error and automatically requeue the request to a different backend.
>>
>
> In fact, what I wanted to add in the past, was the ability to either
> modulate weights depending on the error ratios, or speed-up health
> checks when an error has been detected in a response, so that a failing
> server can be identified faster.
That would probably be useful for detecting these problems and taking the
backend server offline faster. I had a similar idea and tried hacking
the source code a bit. I ended up adding a call to set_server_down() in
process_srv() of proto_http.c when the reply is a 500 error. It has seemed
to work so far, but I haven't understood the code well enough to know
whether this is even remotely valid, or whether it has nasty race
conditions or other problems. Is this a safe thing to do?
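To give an idea of the shape of the hack (a rough sketch from memory, not
the actual diff; it assumes set_server_down() from checks.c has been made
non-static so it can be called from proto_http.c, and the field names may
not match 1.3.15.5 exactly):

    /* in process_srv() in proto_http.c, once the response status line
     * has been parsed: t is the session, t->srv the server that replied */
    if (t->srv && t->txn.status == 500) {
        /* treat a 500 reply as a failed server and mark it down right
         * away instead of waiting for the health checks to notice */
        set_server_down(t->srv);
    }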
> Also, is it on purpose that your "inter"
> parameter is set to 10 seconds ? You need at least 20 seconds in your
> config to detect a faulty server. Depending on the load of your site,
> this might impact a huge number of requests. Isn't it possible to set
> shorter intervals ? I commonly use 1s, and sometimes even 100ms on some
> servers (with a higher fall parameter). Of course it depends on the work
> performed on the server for each health-check.
>
It could probably be reduced somewhat, although if there is a queue,
even a short interval may still see a lot of failed requests, as the
entire queue can drain away to the failed server in the meantime.
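Something along these lines, adapted from my current server lines, is
probably what you have in mind (the fall value here is just an example):

    # check every second, and only take the server down after
    # 5 consecutive failed checks instead of 1
    server Quiky 68.49.216.76:80 maxconn 1 weight 10 check inter 1s rise 3 fall 5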
>
>> At
>> the moment haproxy seems to pass through all errors directly to the
>> client. Is there a way to configure haproxy to requeue on errors?
>>
>
> clearly, no. It can as long as the connection has not been established
> to the server. Once established, the request begins to flow towards the
> server, so it's too late.
>
>
>> I
>> think I have read somewhere that haproxy doesn't requeue because it does
>> not know if it is safe, however these databases are completely read only
>> and thus we know that it is safe to requeue, as the requests have no
>> side effects.
>>
>
> There are two problems to replay a request. The first one is that the
> request is not in the buffer anymore once the connection is established
> to the server. We could imagine mechanisms to block it under some
> circumstances. The second problem is that as you say, haproxy cannot
> know which requests are safe to replay. HTTP defines idempotent methods
> such as GET, PUT, DELETE, ... which are normally safe to replay.
> Obviously now GET is not safe anymore, and with cookies it's even
> more complicated because a backend server can identify a session
> for a request and make that session's state progress for each request.
> Also imagine what can happen if the user presses Stop, clicks a different
> link, and by this time, haproxy finally gives up on former request and
> replays it on another server. You can seriously affect the usability of
> a site or even its security (you don't want a login page to be replayed
> past the logout link for instance).
>
> So there are a lot of complex cases where replaying is very dangerous.
>
It is true that in many cases it is dangerous to requeue requests, but
it would be nice if there were a configuration parameter with which one
could tell haproxy that, in this particular case, one knows for sure that
requeueing is safe.
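Purely as an illustration of what I mean, such a knob might look something
like this (this directive does not exist in haproxy; the name and syntax
are entirely made up):

    # hypothetical, inside the ROMA proxy section: if a server replies
    # with a 5xx before any data has been sent back to the client,
    # requeue the request onto another server instead of passing the
    # error through
    option requeue-on-server-error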
> I'd be more tempted by adding an option to not return anything upon
> certain types of errors, so that the browser can decide itself whether
> to replay or not. Oh BTW you can already do that using "errorfile".
> Simply return an empty file for 502, 503 and 504, and your clients
> will decide whether they should replay or not when a server fails to
> respond.
>
Well, I had hoped to use haproxy to mask these errors and provide higher
availability, rather than shifting the burden of retrying onto the client.
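For reference, I take it the errorfile suggestion would look something
like this (the file paths are just examples; an empty file means haproxy
returns nothing at all for those errors):

    # return an empty response for gateway errors so the client
    # can decide for itself whether to retry
    errorfile 502 /usr/local/etc/haproxy/errors/empty.http
    errorfile 503 /usr/local/etc/haproxy/errors/empty.http
    errorfile 504 /usr/local/etc/haproxy/errors/empty.http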
>
>> P.S. In case it helps, I have attached the configuration we use with
>> haproxy (version 1.3.15.5, running on freeBSD)
>>
>
> Thanks, that's a good reflex, people often forget that important part ;-)
>
:-)
Here are the lines that changed in the config that generated the log
file above (this time it was run on a Linux box):
server mat localhost:80 maxconn 6 weight 10 check inter 1s rise 30 fall 1
server Quiky 68.49.216.76:80 maxconn 1 weight 10 check inter 10s rise 3 fall 1
server mat 79.143.240.199:80 maxconn 1 weight 10 check inter 10s rise 3 fall 1
> Regards,
> Willy
>
Kai