Hi,
On Mon, Aug 02, 2010 at 09:05:02AM -0700, Rich Rauenzahn wrote:
> I'm using haproxy ("balance uri") inside an intranet to direct traffic
> to 4 squid servers in order to cache content normally served directly
> by our web server. This web server serves large files (ranging from
> 10's of MB to several GB)
>
> I'm worried that our haproxy server could be a network bottleneck (the
> NIC, not the software)
Yes, it could become a bottleneck, but everything depends on your traffic. A large site I know runs at 10 Gbps 24x7 on 3 haproxy machines. When one of them is in maintenance, that means 5 Gbps per haproxy, and it does not even saturate one core. So the NIC is used at 1/3 to 1/2 of its potential and the CPU even less. I don't know if you have higher traffic than that, but there are several ways to scale, and the easiest one is to stack layer 4 + layer 7 LBs, with a layer 4 LB in front of several layer 7 haproxy instances.
The advantage is that the first layer does maximum randomization and ensures a very smooth distribution of the load across the second layer. The second layer uses URI hashing to find the best cache. That way you get the best of L4 and L7: total scalability + URL-awareness.
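As a rough sketch only, the two tiers described above could look like the following. The first tier can be any layer 4 balancer; it is shown here as haproxy in TCP mode purely for illustration. All section names, server names and addresses are hypothetical, and the global/defaults sections (timeouts and so on) are omitted.

```
# Layer 4 tier (hypothetical addresses): plain TCP, no HTTP parsing,
# just spreads connections evenly over the layer 7 machines.
listen l4_front
    bind :80
    mode tcp
    balance roundrobin
    server l7a 10.0.0.11:80 check
    server l7b 10.0.0.12:80 check

# Layer 7 tier: the same section runs on each of l7a and l7b.
# "balance uri" hashes the request URI so that a given URL always
# lands on the same squid cache.
listen l7_cache
    bind :80
    mode http
    balance uri
    server squid1 10.0.1.1:3128 check
    server squid2 10.0.1.2:3128 check
    server squid3 10.0.1.3:3128 check
    server squid4 10.0.1.4:3128 check
```

Since every layer 7 instance uses the same server list and the same hash, it does not matter which of them the first tier picks: a given URI should still end up on the same cache.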
> and am wondering if there is a way to use an
> http redirect instead of passthrough -- then the actual traffic could
> come directly (and only) from the back end squid server and not have
> to also pass through the haproxy NIC.
Yes, you can do that by specifying "redir" on the "server" lines. Haproxy will then send a 302 to the client with the IP:port of the server and the same URI for GET/HEAD requests. POSTs still pass through haproxy. This is particularly useful for multi-site LB, because it only sends redirects for servers that are known to be up, and it applies the correct LB algorithm. However, your servers have to be able to receive direct requests from clients.
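A minimal sketch of such a setup, assuming four squid servers that clients can reach directly. The section name, server names, addresses and ports are made up for the example, and the global/defaults sections are omitted.

```
listen squid_redir
    bind :80
    mode http
    balance uri
    # "redir <prefix>": GET and HEAD requests get a 302 to <prefix>
    # followed by the original URI; other methods are proxied as usual.
    # The prefix must not end with a slash.
    server squid1 10.0.1.1:3128 redir http://10.0.1.1:3128 check
    server squid2 10.0.1.2:3128 redir http://10.0.1.2:3128 check
    server squid3 10.0.1.3:3128 redir http://10.0.1.3:3128 check
    server squid4 10.0.1.4:3128 redir http://10.0.1.4:3128 check
```

With this, a GET for /files/big.iso hashed to squid2 would be answered with a 302 whose Location is http://10.0.1.2:3128/files/big.iso. The "check" keyword keeps health checks enabled, so no redirect is sent to a server that is marked down.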
> I have a feeling from browsing the docs that haproxy just isn't
> intended to be used in this kind of model.
it is :-)
> Is it possible to do this? Should I be using a different load
> balancer? Or does this kind of redirection have a nasty side effect I
> haven't thought of yet?
The only thing I can think of is that when performing such a redirect, the client will put the server's IP (or name) in the "Host" header, which means that the site name must be deduced from something else. For many setups this is not a problem, but this is still something to keep in mind.
Regards,
Willy
Received on 2010/08/03 07:32