So it seems that setting the source address in the proxy definition is
the culprit. At least, I've gone about 3 hours now without a
connection failure since removing the source address definition.
The explanation I've come up with makes sense to me; however, I'm not
a coder and I don't know C, so I'm presenting this as my understanding
of what's going on, not as a statement of fact. Please correct me if
I'm wrong.
Setting the source address for the server or the proxy causes haproxy
to call bind() to set the IP we've requested on the fd. haproxy also
sets the SO_REUSEADDR socket option on that fd. The bind() succeeds,
but periodically the subsequent connect() fails with EADDRINUSE.
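
For readers who want to see that call sequence spelled out, here is a minimal Python sketch of it. This is not haproxy's code. The function name connect_with_source is made up, and binding to port 0 (so the kernel picks the local port at bind() time, before it knows the destination) is an assumption about what happens when only a source IP is configured.

```python
import socket


def connect_with_source(source_ip, dest):
    """Open a TCP connection to dest from a chosen local IP.

    Mirrors the sequence described above: socket(), SO_REUSEADDR,
    bind() to the source address, then connect().
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Allow the local address to be reused by later bind() calls.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Port 0 (assumption): the kernel chooses the local port here.
        s.bind((source_ip, 0))
        # The full local/remote pair is only known at this point, so
        # this is where a tuple collision would surface as an OSError.
        s.connect(dest)
    except OSError:
        s.close()
        raise
    return s
```

A caller would catch OSError and inspect its errno attribute to tell an address-in-use failure apart from an ordinary refused connection.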
The reading I've done suggests that SO_REUSEADDR allows you to reuse
local addresses, but the 5-tuple (protocol, local address, local port,
remote address, remote port) must still be unique. So if haproxy
reuses a local address and port that still has a connection to web1:80
in TIME_WAIT to connect() to web1:80 again, it will fail with
EADDRINUSE.
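
The tuple-uniqueness point can be tried out with a small experiment. The sketch below is only an illustration under stated assumptions: a loopback listener stands in for web1:80, the helper names are made up, and the first connection is simply kept open rather than left in TIME_WAIT, so it shows the collision on an occupied local/remote pair, not the TIME_WAIT case itself. On FreeBSD the failure should be EADDRINUSE, matching the haproxy log message; Linux typically reports the same condition as EADDRNOTAVAIL.

```python
import errno
import socket


def connect_from(local, remote):
    """bind() to a local (ip, port) with SO_REUSEADDR, then connect()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(local)
        s.connect(remote)
    except OSError:
        s.close()
        raise
    return s


def duplicate_tuple_errno():
    """Try to open two connections with identical endpoints.

    Returns the errno raised by the second attempt, or None if the
    platform unexpectedly allowed it.
    """
    # Loopback listener standing in for the web server (assumption).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    remote = listener.getsockname()

    # First connection: the kernel picks the local port.
    first = connect_from(("127.0.0.1", 0), remote)
    local = first.getsockname()
    try:
        # Second connection: same local ip:port, same remote ip:port.
        second = connect_from(local, remote)
    except OSError as exc:
        return exc.errno
    else:
        second.close()
        return None
    finally:
        first.close()
        listener.close()
```

Calling duplicate_tuple_errno() and passing the result to errno.errorcode.get() gives the symbolic name of whatever error the local kernel uses for the collision.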
James.
On Sep 20, 2008, at 3:20 PM, James Satterfield wrote:
I've run into some problems with haproxy in my production environment.
We're getting a lot of these errors:
kernel: Sep 20 14:18:38 localhost haproxy[81006]: Connect() failed for
server corp-www/web1: local address already in use.
We only get these errors on connections to the two web servers. We
never see these errors on connections to our application hosts.
I've looked through the source and this appears to be triggered by
connect() returning EADDRINUSE. This does look to be a FreeBSD problem
and not an haproxy problem, but I'm hoping to find someone on this
list who has found a solution.
Platform is two FreeBSD 6.3 machines running pf/pfsync/carp/haproxy
1.2.17.
I've attached my sanitized config.
The internal interface config for load balancer A:
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
        inet 10.1.11.3 netmask 0xffffff00 broadcast 10.1.11.255
        ether 00:30:48:63:b2:70
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
carp1: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
        inet 10.1.11.10 netmask 0xffffff00
        inet 10.1.11.2 netmask 0xffffff00
        carp: MASTER vhid 11 advbase 1 advskew 0
The load balancer generally has about 1000 connections to the web servers in FIN_WAIT_2 and another ~500 in TIME_WAIT with application hosts 1 - 4.
Thanks,
James.
<haproxy.conf> Received on 2008/09/21 10:26