Hi,
We have recently migrated our game servers from Linux to FreeBSD. We have 8 web servers running in jails, with HAProxy as the load balancer. We also have CARP configured for network failover.
CARP runs as master on the first server (webm01) and as backup on the second server (webm02). HAProxy is running on both servers, but only one instance handles traffic at a time: the one on whichever server currently holds the CARP master role. Both servers run pf as well.
We are running FreeBSD 8.2-RELEASE, haproxy-1.4.15 and apache-2.2.19, and the game is written in PHP.
Our network architecture is as follows. There is also a backend database running in a jail on a different server, which I have left out of the diagram (I hope the ASCII diagram displays properly in the mail):
                                            +------ wj01
              (webm01)                      |------ wj02
user -------- carp -------- haproxy ------+ |------ wj03
               |                          | +------ wj04
               |                          |-|
               |                          | +------ wj05
               |                          | |------ wj06
              carp -------- haproxy ------+ |------ wj07
              (webm02)                      +------ wj08
Our main problem at the moment is that a lot of users (more than a hundred) have complained that they are getting a "504 Gateway Timeout" error. This normally happens at night (CEST), when most players start playing the game. However, the load on our servers is consistently low at all times.
At the moment there is no obvious pattern as to when this error occurs.
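One way to look for a pattern would be to tally the 504s per backend server and per termination state from the HAProxy log. The following is only a minimal sketch in Python, not something we are running: it assumes the log lines follow the default "option httplog" layout of haproxy 1.4, and the function name tally_504 and the sample log lines are invented for illustration.

```python
import re
from collections import Counter

# Matches the part of an "option httplog" line that follows "haproxy[pid]: ".
# Assumed field order (haproxy 1.4 default HTTP log format):
#   client:port [accept_date] frontend backend/server Tq/Tw/Tc/Tr/Tt
#   status bytes req_cookie resp_cookie termination_state ...
HTTPLOG_RE = re.compile(
    r"haproxy\[\d+\]: "
    r"(?P<client>\S+) \[(?P<date>[^\]]+)\] "
    r"(?P<frontend>\S+) (?P<backend>[^/\s]+)/(?P<server>\S+) "
    r"(?P<timers>[-\d/]+) (?P<status>\d{3}|-1) (?P<bytes>\S+) "
    r"\S+ \S+ (?P<term>\S{4}) "
)


def tally_504(lines):
    """Count 504 responses keyed by (server name, termination state)."""
    counts = Counter()
    for line in lines:
        m = HTTPLOG_RE.search(line)
        if m and m.group("status") == "504":
            counts[(m.group("server"), m.group("term"))] += 1
    return counts


# Hypothetical sample lines, made up for this example (not real log data).
SAMPLE = [
    'Jul  6 21:03:11 webm01 haproxy[1234]: 203.0.113.5:40211 '
    '[06/Jul/2011:21:02:21.120] webjailfarm webjailfarm/wj03 '
    '0/0/1/-1/50002 504 194 - - sH-- 120/120/118/15/0 0/0 '
    '"GET /index.php HTTP/1.1"',
    'Jul  6 21:03:12 webm01 haproxy[1234]: 203.0.113.9:51002 '
    '[06/Jul/2011:21:03:12.001] webjailfarm webjailfarm/wj01 '
    '2/0/0/35/37 200 5120 - - --NI 80/80/78/9/0 0/0 '
    '"GET / HTTP/1.1"',
    'Jul  6 21:04:40 webm01 haproxy[1234]: 203.0.113.7:40500 '
    '[06/Jul/2011:21:03:50.300] webjailfarm webjailfarm/wj03 '
    '1/0/0/-1/50001 504 194 - - sH-- 130/130/125/20/0 0/0 '
    '"POST /game.php HTTP/1.1"',
    'Jul  6 21:05:02 webm01 haproxy[1234]: 203.0.113.8:40777 '
    '[06/Jul/2011:21:04:12.000] webjailfarm webjailfarm/wj07 '
    '0/0/2/-1/50003 504 194 - - sH-- 131/131/126/18/0 0/0 '
    '"GET /map.php HTTP/1.1"',
]

if __name__ == "__main__":
    for (server, term), n in tally_504(SAMPLE).most_common():
        print(server, term, n)
```

In real use the lines would be read from wherever syslog writes the local0 facility. A termination state starting with "sH" would indicate that "timeout server" expired before the server returned its response headers.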
Here is our haproxy.conf:
global
    log /var/run/log local0 notice
    maxconn 4096
    daemon
    chroot /var/run/haproxy
    user haproxy
    group haproxy
    stats socket /var/run/haproxy/haproxy.sock uid 1005 gid 1005

defaults
    log global
    mode http
    option httpclose
    option forwardfor
    option httplog
    option tcplog
    option dontlognull
    option tcpka
    retries 3
    option redispatch
    maxconn 2000
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen webjailfarm 78.xx.xx.xx:80
    mode http
    cookie SERVERID insert nocache indirect
    balance roundrobin
    option httpclose
    option forwardfor
    option httpchk HEAD / HTTP/1.0
    stats uri /haproxy-status
    stats enable
    stats auth admin:password
    server wj01 192.168.30.10:80 cookie A weight 10 check inter 2000 rise 2 fall 2
    server wj02 192.168.30.20:80 cookie B weight 10 check inter 2000 rise 2 fall 2
    server wj03 192.168.30.30:80 cookie C weight 10 check inter 2000 rise 2 fall 2
    server wj04 192.168.30.40:80 cookie D weight 10 check inter 2000 rise 2 fall 2
    server wj05 192.168.30.50:80 cookie E weight 10 check inter 2000 rise 2 fall 2
    server wj06 192.168.30.60:80 cookie F weight 10 check inter 2000 rise 2 fall 2
    server wj07 192.168.30.70:80 cookie G weight 10 check inter 2000 rise 2 fall 2
    server wj08 192.168.30.80:80 cookie H weight 10 check inter 2000 rise 2 fall 2
##################################################################
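The stats socket declared in the global section could also be polled to watch per-server queue and session counts while the errors occur. Again this is only a minimal sketch in Python under stated assumptions: that the "show stat" command returns CSV whose header line starts with "# pxname,svname,", that the socket is reachable at the configured path, and that only a few of the real columns matter here. The helper names query_socket, parse_show_stat and busy_servers, and the sample CSV, are hypothetical.

```python
import csv
import socket


def query_socket(path, command="show stat"):
    """Send one command to the HAProxy stats socket and return the reply text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        s.sendall((command + "\n").encode("ascii"))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:  # the peer closed the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", "replace")


def parse_show_stat(text):
    """Turn 'show stat' CSV output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # The header looks like "# pxname,svname,qcur,...," (trailing comma).
    header = lines[0].lstrip("# ").rstrip(",").split(",")
    return [dict(zip(header, row)) for row in csv.reader(lines[1:])]


def busy_servers(rows, proxy="webjailfarm"):
    """Return (server, queued requests, current sessions) for real servers."""
    out = []
    for r in rows:
        if r.get("pxname") != proxy:
            continue
        if r.get("svname") in ("FRONTEND", "BACKEND"):
            continue  # skip the aggregate rows
        out.append((r["svname"], int(r["qcur"] or 0), int(r["scur"] or 0)))
    return out


# Hypothetical, shortened sample output (the real CSV has many more columns).
SAMPLE = (
    "# pxname,svname,qcur,qmax,scur,smax,slim,stot,\n"
    "webjailfarm,FRONTEND,,,120,400,2000,90000,\n"
    "webjailfarm,wj01,0,2,15,60,,11000,\n"
    "webjailfarm,wj03,4,9,40,75,,11200,\n"
    "webjailfarm,BACKEND,4,9,118,390,2000,89000,\n"
)

if __name__ == "__main__":
    # Against a live instance one would call, for example:
    #   text = query_socket("/var/run/haproxy/haproxy.sock")
    for name, queued, sessions in busy_servers(parse_show_stat(SAMPLE)):
        print(name, "queued:", queued, "sessions:", sessions)
```

Parsing is kept separate from the socket I/O so that it can be tried on captured output first. A non-zero queue on one jail while the others sit idle would point at that jail rather than at HAProxy or pf.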
And here is our pf.conf (the same pf.conf is used on webm02, with only the IPs changed accordingly):
### macros
webm01 = 78.xx.xx.xx
db = 10.10.10.101
carp_dev = "carp0"
ext_if = "igb0"
jail_if = "igb0:0"
trusted = "{ 192.168.30.0/24, 10.10.10.0/24, 78.xx.xx.xx/xx, 85.xx.xx.xx/xx }"
tcp_services = "{ xxxxx, 4949 }"
ssh_ports = "{ xxxxx, xxxxx, xxxxx, xxxxx }"
icmp_types = "{ echoreq, unreach }"
# jails
wj01 = 192.168.30.10
wj02 = 192.168.30.20
wj03 = 192.168.30.30
wj04 = 192.168.30.40
### normalization
scrub in all
### translation
nat on $ext_if from $jails to !10.10.10.0/24 -> ($jail_if)
rdr pass on $ext_if inet proto tcp from any to $webm01 port xxxxx -> $wj01
### ssh redirect
rdr pass on $ext_if inet proto tcp from any to $webm01 port xxxxx -> $wj02
rdr pass on $ext_if inet proto tcp from any to $webm01 port xxxxx -> $wj03
rdr pass on $ext_if inet proto tcp from any to $webm01 port xxxxx -> $wj04
rdr pass on $ext_if inet proto tcp from any to ($carp_dev) port 80 -> $webm01
### filtering - drop incoming everything
block in all
block return
### keep state of outgoing connections
pass out keep state
### skip loopback interface
set skip on { lo0 }
### spoofing protection for all interfaces
block in quick from urpf-failed
antispoof log for $ext_if
### allow outgoing
pass out on $ext_if proto tcp to any port $tcp_services
pass out quick on $ext_if proto udp from $webm01 to any port = 123 keep state
pass quick on $ext_if proto carp keep state (no-sync)
pass out on $carp_dev proto tcp to any port 80
### allow incoming services from within internal network to ssh ports
pass in on $ext_if proto tcp from $trusted to $wj01 port xxxxx flags S/SA synproxy state
pass in on $ext_if proto tcp from $trusted to $wj02 port xxxxx flags S/SA synproxy state
pass in on $ext_if proto tcp from $trusted to $wj03 port xxxxx flags S/SA synproxy state
pass in on $ext_if proto tcp from $trusted to $wj04 port xxxxx flags S/SA synproxy state
### allow incoming services
pass in on $ext_if proto tcp from any to $jails port 80 flags S/SA synproxy state
pass in on $ext_if proto tcp from any to $webm01 port $tcp_services flags S/SA synproxy state
pass inet proto icmp all icmp-type $icmp_types keep state
### for munin
pass in on $ext_if proto tcp from $trusted to $jails port 4949 flags S/SA synproxy state
If more information is needed, please let me know. I would appreciate any advice.
Thanks.
Received on 2011/07/06 11:44