Thanks again for your help, Willy.
It looks like you were right on the keepalive issue. When I tried this, requests per second on my tiny file doubled to about 35,000 on the cluster. Requests per second on the 100K file were basically unchanged, however.
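For the record, the keep-alive test amounted to roughly this (hostname, file name and request counts are just placeholders for what I used):

```shell
# Enable keep-alive in apachebench; without -k, ab closes the
# connection after every request.
ab -k -n 500000 -c 1000 http://loadbalancer/tiny.html

# And in haproxy.cfg, leave "option httpclose" commented out so
# haproxy passes the keep-alive requests through unmodified:
#
#     # option httpclose
```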
I tried copying a 512MB file between two of the servers involved, and the throughput I got was about 45 MB/s. I understand that in theory one should be able to achieve 125 MB/s over GigE, but I'm not sure what one could expect in a real-world scenario. I suppose I should investigate that further.
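As a rough sanity check on those numbers: the usable payload rate on GigE sits a bit below the 125 MB/s line rate once Ethernet, IP, and TCP framing overhead is counted (a sketch, assuming a standard 1500-byte MTU and no jumbo frames):

```python
# Back-of-the-envelope check of what "wire speed" means on GigE
# for TCP payload, assuming a standard 1500-byte MTU.

LINK_BPS = 1_000_000_000   # raw gigabit line rate
FRAME_ON_WIRE = 1538       # 1500 MTU + 14 eth hdr + 4 FCS + 8 preamble + 12 IFG
TCP_PAYLOAD = 1460         # 1500 - 20 (IP header) - 20 (TCP header)

raw_mb_s = LINK_BPS / 8 / 1e6                          # 125.0 MB/s on the wire
payload_mb_s = raw_mb_s * TCP_PAYLOAD / FRAME_ON_WIRE  # ~118.7 MB/s of payload

print(f"theoretical payload throughput: {payload_mb_s:.1f} MB/s")
print(f"measured 45 MB/s is {45 / payload_mb_s:.0%} of that")
```

So even before real-world losses, ~119 MB/s is the ceiling, and 45 MB/s is well under half of it.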
Thanks!
Martin
On 12/12/07, Willy Tarreau <w#1wt.eu> wrote:
>
> On Tue, Dec 11, 2007 at 08:15:51PM -0500, Martin Goldman wrote:
> > Thanks Willy. I got excited there for a second, but I tried making the
> > suggested changes, and would you believe me if I said it didn't seem to
> > help?
> >
> > As you suggested, I updated the conf file with maxconn set to 100000, just
> > so I could be sure that wouldn't be a bottleneck. I recompiled for linux26,
> > and got the appropriate output this time:
> >
> > martin#kramer:~/haproxy-1.3.13.1$ sudo /usr/sbin/haproxy -f /etc/haproxy.cfg -V
> > Available polling systems :
> > sepoll : pref=400, test result OK
> > epoll : pref=300, test result OK
> > poll : pref=200, test result OK
> > select : pref=150, test result OK
> > Total: 4 (4 usable), will use sepoll.
> > Using sepoll() as the polling mechanism.
> >
> > I re-ran the apachebench, and the requests per second achieved were still
> > lower than that of each of the web servers individually. I did two tests:
> >
> > - 500-byte file, 1000 concurrent requests, 500,000 total requests:
> >   Individual node = 14,000 requests/second; Cluster = 13,100 requests/second
> > - 100 KB file, 100 concurrent requests, 50,000 total requests:
> >   Individual node = 780 requests/second; Cluster = 400 requests/second
>
> This is really not expected, because I see three anomalies here:
>
> - req/s cluster < individual node : one of the possible reasons could be
>   that the load balancer's sysctls have not been tuned, or a poor network
>   driver (eg: forcedeth) leading to some packet loss, but I expect that
>   on a dual xeon you would have something decent such as a tg3 or e1000.
>
> - 100kB files: this normally puts very low stress on haproxy, since it has
>   almost nothing to do except copy data between two sockets (it's the
>   system which is doing the painful job here). 400 req/s for 100 kB files
>   is around one third of what you should get: this is only 40 MB/s, or
>   around 320 Mbps. On a setup like yours, gigabit should be achieved with
>   5-10 kB files.
>
> - 100kB files on individual nodes: only 780 req/s. You do not saturate
> the gigabit either, so there is definitely something wrong somewhere.
>
> Among the possibilities I see, a poor network chip or driver would best
> explain these symptoms. For instance, if you just have a plain PCI NIC, it
> is limited to around 800 Mbps in+out, which would explain why you don't
> saturate the gig with your apache alone, and why it is halved through
> haproxy. But as I said, I expect such machines to have decent chips.
>
> In fact, it is essential that you first manage to reach gigabit speed on
> your individual nodes. As long as we don't know why that is not possible,
> we won't find a solution.
>
> > I don't know if there's anything else you can think of, but I certainly
> > appreciate all the ideas thus far.
>
> I also have another idea for an additional test. apache-bench can send
> keep-alive requests (with the -k option), but haproxy transforms them into
> close requests. Since a keep-alive request needs fewer packets than a
> close request, and since we don't know yet what's happening at the network
> level, it is possible that this difference explains the performance drop.
>
> You could try running apache-bench with keep-alive enabled (-k), and
> comment out the "option httpclose" line in your haproxy config so that it
> does not transform the requests. This will not be good for logs and load
> balancing, but it will show whether this is what lowers your performance.
>
> Last, are you sure that both of your nodes respond correctly for the
> 500-byte file? If only one of them responds quickly, the performance of
> the whole cluster will be dragged down.
>
> Best regards,
> Willy
>
>
Received on 2007/12/12 14:20
This archive was generated by hypermail 2.2.0 : 2007/12/12 14:30 CET