Hi!
2011/7/18 Willy Tarreau <w#1wt.eu>
> Hi,
>
> On Mon, Jul 18, 2011 at 06:42:54AM +0400, Dmitriy Samsonov wrote:
> > My test setup is three Dell r410 servers (dual Intel(R) Xeon(R) CPU X5650
> @
> > 2.67GHz - 24 threads total, 128Gb RAM) all connected to 1Gbps network.
> >
> > One server is haproxy, configured to block all requests with
> > 'Accept-Encoding: none':
> >
> > global
> > daemon
> > maxconn 80000
> > option forwardfor
> > retries 10
>
> Something is missing here above, probably "defaults" and maybe a few other
> options before "option forwardfor", such as "mode http" and even a few
> timeouts.
>
Yes, the defaults section is the following:
defaults
    mode http
    maxconn 79500
    timeout client 20s
    timeout server 15s
    timeout queue 60s
    timeout connect 4s
    timeout http-request 5s
    timeout http-keep-alive 250
    #option httpclose
    option abortonclose
    balance roundrobin
    option forwardfor
    retries 10
Also, the conntrack module was loaded - a friend of mine was playing with iptables and did not remove it. Now there is no iptables at all:
dex9 ipv4 # sysctl -a | grep conntrack | wc -l
0
dex9 ipv4 # lsmod | grep xt_ | wc -l
0
dex9 ipv4 # lsmod | grep nf_ | wc -l
0
I followed your recommendation and set affinity for the processes:
dex9 ipv4 # schedtool 16638   # haproxy's process
PID 16638: PRIO 0, POLICY N: SCHED_NORMAL, NICE 0, AFFINITY 0x2
dex9 ipv4 # schedtool 3 # ksoftirqd/0
PID 3: PRIO 0, POLICY N: SCHED_NORMAL, NICE 0, AFFINITY 0x1
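(For reference, a minimal sketch of how this pinning can be done, assuming schedtool/taskset are available; the IRQ number 24 below is only an example and must be taken from /proc/interrupts:)
dex9 ipv4 # grep eth0 /proc/interrupts           # find the NIC's IRQ number first
dex9 ipv4 # echo 1 > /proc/irq/24/smp_affinity   # pin the NIC IRQ to core 0 (mask 0x1)
dex9 ipv4 # schedtool -a 0x2 16638               # pin haproxy to core 1 (mask 0x2)
dex9 ipv4 # taskset -p 0x2 16638                 # equivalent using taskset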
Now in top it looks like this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16638 root 20 0 100m 36m 520 R 93 0.0 19:33.18 haproxy
3 root 20 0 0 0 0 R 38 0.0 7:11.76 ksoftirqd/0
So 93% for haproxy and 38% for ksoftirqd/0.
I was lucky enough to reach a session rate of 66903 (max), and the average 'Cur' value is around 40-42k.
Typical output from one of the two running ab2 instances is:
Server Software:
Server Hostname:        nohost
Server Port:            80

Document Path:          /
Document Length:        0 bytes

Concurrency Level:      1000
Time taken for tests:   470.484 seconds
Complete requests:      10000000
Failed requests:        0
Write errors:           0
Total transferred:      0 bytes
HTML transferred:       0 bytes
Requests per second:    21254.72 [#/sec] (mean)
Time per request:       47.048 [ms] (mean)
Time per request:       0.047 [ms] (mean, across all concurrent requests)
Transfer rate:          0.00 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   34  275.9     11  21086
Processing:     0   13   17.8     11    784
Waiting:        0    0    0.0      0      0
Total:          2   47  276.9     22  21305

Percentage of the requests served within a certain time (ms)
  50%     22
  66%     26
  75%     28
  80%     30
  90%     37
  95%     41
  98%     47
Typical output of vmstat is:
dex9 ipv4 # vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd      free   buff  cache   si   so    bi    bo    in    cs us sy id wa
 1  0      0 131771328  46260  64016    0    0     2     1   865   503  1  6 94  0
 1  0      0 131770688  46260  64024    0    0     0     0 40496  6323  1  9 90  0
 1  0      0 131771920  46260  64024    0    0     0     0 47046  7212  1  8 91  0
 2  0      0 131771704  46260  64024    0    0     0     0 40864  6143  1  8 91  0
 1  0      0 131771688  46268  64024    0    0     0    12 36547  5268  1  8 91  0
 2  0      0 131771696  46268  64032    0    0     0     0 53189  8979  0  5 95  0
 3  0      0 131771584  46268  64024    0    0     0     0 31633  4025  0  4 96  0
 2  0      0 131771840  46268  64020    0    0     0     0 49723  9290  1  9 91  0
 2  0      0 131772448  46268  64028    0    0     0     0 44484  7008  1  8 91  0
 2  0      0 131771688  46276  64032    0    0     0    20 40132  4531  1  8 92  0
 2  0      0 131771456  46276  64028    0    0     0     0 36006  4445  0  8 91  0
 2  0      0 131772208  46276  64024    0    0     0     0 41325  5902  1  8 91  0
 2  0      0 131771208  46276  64032    0    0     0     0 44262  7427  1  5 94  0
 1  0      0 131771456  46276  64028    0    0     0     0 42403  5422  1  8 91  0
 2  0      0 131771944  46284  64020    0    0     0    12 46907  7419  1  7 93  0
 1  0      0 131772960  46284  64020    0    0     0     0 42772  6663  1  8 91  0
 2  0      0 131772832  46284  64024    0    0     0     0 45298  6695  1  8 91  0
 2  0      0 131772712  46284  64028    0    0     0     0 44604  5361  0  6 94  0
 1  0      0 131772464  46284  64028    0    0     0     0 39798  5105  0  4 96  0
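(In case it is useful, a quick way to see on which cores these ~40-50k interrupts/s actually land - assuming the sysstat package is installed for mpstat:)
dex9 ipv4 # grep eth0 /proc/interrupts   # per-CPU count of NIC interrupts
dex9 ipv4 # mpstat -P ALL 1              # per-core %irq / %soft breakdown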
Also, I've checked the NIC's firmware version:
dex9 ipv4 # ethtool -i eth0
driver: bnx2
version: 2.0.21
firmware-version: 6.2.12 bc 5.2.3
bus-info: 0000:01:00.0
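(One knob that is often looked at on bnx2 in this situation is interrupt coalescing; the values below are purely illustrative, not a recommendation:)
dex9 ipv4 # ethtool -c eth0                             # show current coalescing settings
dex9 ipv4 # ethtool -C eth0 rx-usecs 100 rx-frames 25   # example values only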
Moreover, I've tried launching two ab2 instances locally:
dex9 ipv4 # ab2 -c 1000 -H 'Accept-Encoding: None' -n 10000000 http://localweb/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking inbet.cc (be patient)
Completed 1000000 requests
Completed 2000000 requests
^C
Server Software:
Server Hostname:        localweb
Server Port:            80

Document Path:          /
Document Length:        0 bytes

Concurrency Level:      1000
Time taken for tests:   104.583 seconds
Complete requests:      2141673
Failed requests:        0
Write errors:           0
Total transferred:      0 bytes
HTML transferred:       0 bytes
Requests per second:    20478.13 [#/sec] (mean)
Time per request:       48.833 [ms] (mean)
Time per request:       0.049 [ms] (mean, across all concurrent requests)
Transfer rate:          0.00 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   38  352.7      6  21073
Processing:     0   10   49.7      7  14919
Waiting:        0    0    0.0      0      0
Total:          1   48  365.8     13  21078

Percentage of the requests served within a certain time (ms)
  50%     13
  66%     19
  75%     26
  80%     35
  90%     36
  95%     37
  98%     39
  99%     67
Two such ab2 processes both run at 100% CPU and saturate haproxy at 100%. The 'Cur' session rate is also around 40-44k/s.
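(To rule out the load generators themselves being the bottleneck, each ab2 could be pinned to its own core, away from haproxy and ksoftirqd - the core numbers below are just examples:)
dex9 ipv4 # taskset -c 4 ab2 -c 1000 -H 'Accept-Encoding: None' -n 10000000 http://localweb/
dex9 ipv4 # taskset -c 5 ab2 -c 1000 -H 'Accept-Encoding: None' -n 10000000 http://localweb/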
Should I get rid of the Dell R410 and replace it with a Core i5? :)) Seriously, are there any other tips or tricks I can try to reach that amazing 100k/s session rate?
> > frontend public
> > bind 192.168.0.1:80
> > default_backend nginx
> > acl accepts_none hdr(Accept-Encoding) -i none
> > errorfile 403 /raid/emptypage.txt
> > block if accepts_none
> >
> > backend nginx
> > server srv 127.0.0.1:80 maxconn 8192
> >
> > File /raid/emptypage.txt is an empty file made with 'touch
> > /raid/emptypage.txt'.
> >
> > I'm doing ab2 -c 1000 -H 'Accept-Encoding: None' -n 1000000
> > http://192.168.0.1/ on two other servers and get following:
> >
> > When nbproc = 1 haproxy saturates 100% of cpu core it runs at, but server
> is
> > running nice, I'm able to get reply from nginx behind by using curl on my
> > machine: curl http://192.168.0.1/, ab reports 16833 requests/second each
> and
> > longest request is around 14seconds.
>
> 16k req/s is quite low for such a server, especially since no request
> should
> be forwarded. You should check how the user/system CPU usage is spread, I
> suspect there could be an important part spent in the system. Such a test
> should better report numbers in the 100k range.
>
> Also, 14 seconds to respond is simply unacceptable. Something is not
> working at all. 14 milliseconds would even be high for this type of
> processing but could be explained by packet queuing and interrupt
> rate depending on the NIC tuning.
>
> > When I change nbproc to higher values (maximum is 24 as there 24 threads
> > total) I can see ksoftirq/0 process saturating cpu core, network becomes
> > slow on server, ab reports same 16k-17k requests/second for each client,
> but
> > longest request is always around 20-30 seconds.
>
> At such loads, you should not see ksoftirq trigger. 16kreq/s, assuming
> there is no keep-alive will basically result in 7 packets per request,
> (4 in, 3 out) so you're running around 112 kpps, or 64k IN and 48k out.
>
> Could you check with "vmstat 1" your interrupt rate ? Wouldn't your NIC
> be a BNX2 with an old firmware such as 1.9.6 which is known for causing
> a high number of packet losses ? (check with ethtool -i eth0).
>
> > I've seen such things with ksoftirq/0 running at 100% and network is
> almost
> > down during DDoS attacks in case of too many iptables rules but what is
> > happening now?
>
> Indeed, if you have many iptables rules, then your machine will be very
> sensitive to SYN floods because for each SYN it will be necessary to find
> the matching rule. Do you have ip_conntrack / nf_conntrack loaded ? I
> assume that it's the case. Did you tune it ? Default settings are fine
> for a desktop usage but not for a server. It is possible that all the
> time is wasted in the conntrack hash table lookup.
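
[For illustration only: when conntrack is loaded on a server, tuning typically means raising the connection table and hash sizes, e.g. (the values are examples, not recommendations):
net.netfilter.nf_conntrack_max = 1048576                      # sysctl, connection table size
echo 262144 > /sys/module/nf_conntrack/parameters/hashsize    # hash buckets, module parameter
In this case the modules are not loaded at all, so none of this applies.]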
>
> > And what number to use at nbproc? Is it ok to have haproxy
> > running at 100%?
>
> It can be under extreme situations (eg: ddos) but it should not be the
> norm, otherwise your scaling abilities are seriously limited. Nbproc
> should normally be 1. With proper tuning this will give you much more
> raw performance than what you can get in multi-process and poor tuning.
> To give you an idea, at Exceliance, we can get 300kconn/s out of a single
> core of a dual-core i5 3 GHz. You're at 16k.
>
> What is very important too is to bind network interrupts and processes
> to different cores sharing the same L2 cache. If you force all your NIC
> interrupts to core 0 and bind haproxy to core 1, chances are that they
> will be on the same cache (check in /proc/cpuinfo). This is how I get
> the most performance, with IRQs taking between 50 and 80% of one core,
> and haproxy 100% of another core.
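
[A minimal sketch of how to check which cores share a cache before choosing the binding (cpu1 is just an example):
dex9 ipv4 # grep -E 'processor|physical id|core id' /proc/cpuinfo
dex9 ipv4 # cat /sys/devices/system/cpu/cpu1/cache/index*/shared_cpu_list   # CPUs sharing each cache level with cpu1]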
>
> > It looks like I can have 30k requests per second in my
> > setup, is there any way to make it higher? I've done some basic tuning
> like
> > tcp_max_tw_buckets = 1024*1024, tcp_tw_reuse = 1, tcp_max_syn_backlog =
> > 30000. Am I running out of options?
>
> There are a number of other options, such as net.core.somaxconn which
> also limits the backlog (128 by default!). But in your case, the apparent
> lack of NIC/process binding can be one cause, and if iptables is there,
> it surely is another one.
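
[As a sketch, the corresponding sysctl settings would look like this - the values are examples only:
net.core.somaxconn = 10000
net.ipv4.tcp_max_syn_backlog = 30000
haproxy's own per-bind backlog can also be raised with the "backlog" keyword if needed.]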
>
> It is easy to verify if iptables is likely involved : for this you'll
> enable keep-alive. Add this to your haproxy config :
>
> option http-server-close
>
> And replace :
>
> block if accepts_none
>
> with :
>
> redirect location / code 302 if accepts_none
>
>
> Then use "ab -k" instead of "ab" to run your tests. It will try to maintain
> connections open as long as possible. The only difference is that it will
> see status 302 instead of 403. But that's not a problem for a test.
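
[Putting the two suggested changes together, the test frontend would look roughly like this, with ab2 started with -k:
frontend public
    bind 192.168.0.1:80
    option http-server-close
    acl accepts_none hdr(Accept-Encoding) -i none
    redirect location / code 302 if accepts_none
    default_backend nginx

ab2 -k -c 1000 -H 'Accept-Encoding: None' -n 1000000 http://192.168.0.1/]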
>
> If you see that the numbers increase by a factor of approximately two, it
> will probably mean that you're bound by the packet count and/or by the CPU.
> If your numbers increase a lot more (eg: 5-10 times) then it will be very
> likely due to the fact that you don't stress conntrack anymore and don't
> spend your CPU cycles doing connection lookups.
>
> I really encourage you to try this as it's a simple test. Hint: on my
> 6 y-o laptop, I'm already getting your numbers above, and testing with
> keep-alive doubles the performance.
>
> Regards,
> Willy
>
>
Received on 2011/07/18 20:01