You do have a bunch of services in http mode that don't seem to have any
type of http close option. For some others I don't understand why they are
not in http mode; they probably should be.
Just a note: you may be able to greatly simplify (and possibly speed up) your config using the new capabilities for tables of IPs added in 1.4.6.
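Assuming "tables of IPs" means the ACL "-f" flag (match values loaded from a file, one per line), a minimal sketch could look like the fragment below. The section name, ACL name, file path, and backend names are all made up for illustration and are not from your config, and the backends themselves are not shown.

```
# Hypothetical file /etc/haproxy/internal_nets.lst, one address or network per line:
#   10.0.0.0/8
#   192.168.1.0/24
frontend www
    bind :80
    mode http
    option httpclose
    # load the source addresses to match from the file instead of listing them inline
    acl internal_nets src -f /etc/haproxy/internal_nets.lst
    use_backend internal_servers if internal_nets
    default_backend public_servers
```

One file per group of addresses can replace a long run of near-identical acl lines, which is where the simplification would come from.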
solr should probably be in http mode, and anywhere else that you have http mode you probably want an http close option turned on.
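As a rough sketch of what that could look like for solr (the addresses are documentation placeholders, and port 8983 is only my assumption since it is the usual solr default, not something taken from your config):

```
listen solr
    bind 192.0.2.10:8983
    mode http
    # close the connection after each request/response so every request is analysed;
    # forceclose or http-server-close are the alternatives, depending on version
    option httpclose
    server solr1 192.0.2.20:8983 check
```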
I am not sure why they chose dispatch for the prod glassfish server. My guess is they are running apache with mod_jk or something similar and then forwarding the requests to different glassfish servers. Is there really more than one prod glassfish server? I am wondering if the previous admin set up more than one copy of haproxy, and that is why several services are redirected to the same machine. For glassfish prod, for example, there is no other reference to port 4850 in this config, so what is running on port 4850? haproxy, apache, or (heaven forbid) glassfish itself? You can check with:

    netstat -antope | fgrep LIST | fgrep 4850
I think one of the problems is the "inter_server" section: it doesn't have http mode set, so if more than one request comes in on an open connection, your request parsing rules are not run on any request except the first one (as Willy keeps reminding people). That might work OK for most things, since you are mostly breaking things up by service (liferay goes to the liferay servers, etc.). The problem comes in if you have a portal that people sign into, with a menu/navbar from which they can choose different services that should be going to different frontends/backends.
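To illustrate the portal case, here is a hedged sketch of per-request switching. I am assuming "inter_server" can be written as a frontend; the bind address, URL prefixes, and backend names are hypothetical and the backends are not shown.

```
frontend inter_server
    bind 192.0.2.10:80
    mode http
    # without a close option, only the first request on a connection is analysed,
    # so later clicks in the navbar would stick to the first backend chosen
    option httpclose
    acl is_liferay path_beg /liferay
    acl is_cas     path_beg /cas
    use_backend liferay_servers if is_liferay
    use_backend cas_servers     if is_cas
    default_backend glassfish_servers
```

With the close option in place, each click opens a new connection and the acl rules are evaluated again, so a user moving from one service to another lands on the right backend.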
On 5/18/10 3:49 PM, Chih Yin wrote:
>
>
> On Mon, May 17, 2010 at 11:11 PM, Hank A. Paulson
> <hap#spamproof.nospammail.net <mailto:hap#spamproof.nospammail.net>> wrote:
>
> On 5/17/10 10:24 PM, Willy Tarreau wrote:
>
> On Mon, May 17, 2010 at 07:42:03PM -0700, Hank A. Paulson wrote:
>
> I have some sites running a similar set up - Xen domU,
> keepalived,
> fedora not RHEL and they get 50+ million hits per day with
> pretty
> fast response. you might want to use the "log separate
> errors" (sp?)
> option and review those 50X errors carefully, you might see
> a pattern
> - do you have http-close* in all you configs? That got me
> weird, slow
> results when I missed it once.
>
>
> Indeed, that *could* be a possibility if combined with a server
> maxconn
> because connections would be kept for a long time on the server
> (waiting
> for either the client or the server to close) and during that
> time nobody
> else could connect. The typical problem with keep-alive to the
> servers in
> fact. The 503 could be caused by requests waiting too long in
> the queue
> then.
>
>
> My example was just to assure Chin Yin that haproxy on xen should be
> able to handle his current load depending, of course, on the
> glassfish servers.
>
> I meant some kind of httpclose option
> (httpclose/forceclose/http-server-close/etc) turned on regardless of
> keep-alive status - you know, like you are always reminding people :)
>
> I noticed when I forgot it on a section (that was not keepalive
> related) it caused wacky results - hanging browsers,
> images/icons/css not showing up, etc. Obviously it should not affect
> single requests like you would assume Akamai would be sending, it
> was a pure guess.
>
>
> Thank you everyone for your feedback. I really appreciate your help.
>
> Sorry for taking so long to respond. I had to get permission from my
> director to post some of the log data and our haproxy configuration
> file. I also had to hide a bit more of the configuration than was
> suggested because of concerns about making the issues we're encountering
> too public. I hope you understand.
>
> From my research on HAProxy and high availability websites in general,
> it seemed to me that compared to other websites, our traffic volume is
> actually rather light. In addition to how we have configured HAProxy
> for our infrastructure, I'm definitely also taking a look at our
> application servers and our content as well.
>
> I started looking at the log files and the HAProxy configuration file
> more closely today.
> I attached the (poorly) cleaned HAProxy configuration file. Looking at
> it, I can already see that the httpclose option isn't consistently
> included in all the sections, both the frontend and the backend. I will
> make sure this option is in all sections. Should I also add this to the
> global settings for HAProxy? Is it okay if this option is listed more
> than once in a section (I noticed that this happened a couple of times)?
>
>
> Chin Yin, Xani was right, please take a look at your logs. Also,
> sending
> us your config would help a lot. Replace IP addresses and
> passwords with
> "XXX" if you want, we'll comment on the rest. BTW you should
> tell your
> admin that 1.3.21 has an annoying bug which makes it crash when
> connecting
> to the stats socket. Thus, this reduces your possibilities of
> debugging it.
> When you have some time, you should upgrade it to 1.3.22 or
> later (1.3.24)
> which fix a small number of remaining bugs.
>
> example stats page screenshot attached.
>
>
> Nice stats Hank :-)
>
>
> That is just the page frames (mostly) not including images, css, js,
> static icons or any other "stuff" but neither is it just for one
> day, it is longer.
>
>
> I have already reported to my director to let him know that we really
> need to upgrade to 1.3.22 or later.
>
> As for the logs, it seems that I'll need to look at the configuration
> for HAProxy a bit more to make some adjustments first. A few months
> back, I know I saw messages indicating the status of server (e.g. 3
> active, 2 backup). I also see messages when the HAProxy configuration
> was reloaded or when HAProxy was restarted. I no longer see these
> status messages in the log files.
That is a good reason to turn on the log-separate-errors option: the errors go into both log files, but it is easier to review the error log without all the normal accesses. It doesn't really add any load, it just makes life easier.
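A minimal sketch of the haproxy side, assuming a local syslog daemon listening on 127.0.0.1 and the local0 facility (both are placeholders, use whatever your config already logs to):

```
global
    log 127.0.0.1 local0

defaults
    log global
    option httplog
    # log errors and abnormal sessions at level "err" instead of "info",
    # so syslog can also write them to a separate, much smaller file
    option log-separate-errors
```

The split into two files happens in the syslog configuration, by routing level err of that facility to its own file in addition to the normal log.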
> I recall that the system
> administrator who initially configured HAProxy mentioned that he removed
> the logging of some inter-server traffic to make the log file sizes
> smaller. I'm wondering if he also removed these status messages as well.
>
Maybe. That would be surprising, since those messages should be infrequent and are somewhat important. It is more probable that they adjusted the apache logging (for example on the cas servers) to not log the hits to /security/check.txt, given that you are hitting it on all the cas servers every 7 seconds, so those start to add up if your real traffic is low.
option httpchk HEAD /security/check.txt HTTP/1.0
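For reference, the check interval is set per server with "inter", in milliseconds. A sketch of how the 7 second check could be written, with a hypothetical backend name, server names, and placeholder addresses:

```
backend cas_servers
    mode http
    option httpchk HEAD /security/check.txt HTTP/1.0
    # inter 7000 = one health check per server every 7 seconds
    server cas1 192.0.2.21:80 check inter 7000
    server cas2 192.0.2.22:80 check inter 7000
```

Raising inter would cut the number of check hits in the apache logs, at the cost of detecting a dead server more slowly.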
> Again, thank you all for your help and suggestions.
> C.Y.
>
>
> Cheers,
> Willy
>
>
>
Received on 2010/05/19 04:45