From: Rupert Fiasco <rufiasco#gmail.com>
Date: Wed, 8 Oct 2008 16:00:19 -0700

Duh, I should have thought of that.

I guess in a way I am not surprised that haproxy doesn't just factor the "reader" out of the stats, since it's just a given... but whatever, now that I know, I can adjust for it.
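
For anyone else parsing the CSV, here is a rough sketch of the adjustment I have in mind. It assumes the 'cur' column on the web page is the `scur` field in the CSV and that the header line starts with "# pxname,svname,...". The function names, the `read_via_proxy` parameter and the sample rows are made up for illustration (the sample is trimmed to a few columns), so treat it as a starting point rather than a tested tool.

```python
import csv
import io

# Illustrative, trimmed-down sample (hypothetical values, not real output).
SAMPLE = """# pxname,svname,qcur,qmax,scur,smax,
aaaa-mongrels,FRONTEND,,,11,30,
aaaa-mongrels,server-0,0,0,1,1,
aaaa-mongrels,BACKEND,0,0,10,20,
bbbb-mongrels,FRONTEND,,,3,10,
bbbb-mongrels,BACKEND,0,0,3,10,
"""


def parse_stats(csv_text):
    """Index the CSV rows by (pxname, svname)."""
    text = csv_text.lstrip()
    # Drop the leading "# " so DictReader sees clean column names.
    if text.startswith("#"):
        text = text[1:].lstrip()
    reader = csv.DictReader(io.StringIO(text))
    return {(row["pxname"], row["svname"]): row for row in reader}


def current_sessions(csv_text, proxy, read_via_proxy):
    """Return (frontend, backend) current sessions for one proxy.

    read_via_proxy is the name of the listen block whose port the stats
    were fetched from. The stats request itself counts as one frontend
    session on that block, so it is subtracted there (assumption based
    on the explanation in this thread).
    """
    rows = parse_stats(csv_text)
    frontend = int(rows[(proxy, "FRONTEND")]["scur"])
    backend = int(rows[(proxy, "BACKEND")]["scur"])
    if proxy == read_via_proxy:
        frontend -= 1  # factor out the stats reader's own connection
    return frontend, backend
```

Calling `current_sessions(SAMPLE, "aaaa-mongrels", "aaaa-mongrels")` corrects the frontend count for the block the stats were read through, while any other block is returned unchanged.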

Thanks!

-Rupert

On Wed, Oct 8, 2008 at 3:55 PM, Konstantin Svist <fry.kun#gmail.com> wrote:
> One of those 6 is the connection to haproxy itself - the stats page :)
>
> Rupert Fiasco wrote:
>> As they say, a pic is worth 1000 words.
>>
>> http://brockwine.com/images/haproxy_stats.gif
>>
>> So this haproxy instance has 20 mongrels defined, with 6 requests in
>> the queue, but only 5 being served. You would think that all 6 would
>> be handed off to the backend, since there are plenty of mongrels that
>> are available...
>>
>> -Rupert
>>
>> On Wed, Oct 8, 2008 at 3:05 PM, Rupert Fiasco <rufiasco#gmail.com> wrote:
>>
>>> So far haproxy has been awesome, especially with that maxconn fix from
>>> last month...
>>>
>>> We are gathering statistics from all of our haproxy instances, by
>>> virtue of reading the CSV. We have a couple of different 'listen'
>>> blocks, one for each set of mongrels that are hosting a different
>>> application (on the same server).
>>>
>>> I have noticed that when I am parsing the CSV and pulling out data
>>> such as "current queue size" and "requests in process", there is an
>>> off-by-one error in the CSV output.
>>>
>>> An example, assume this config:
>>>
>>> listen aaaa-mongrels *:8999
>>> balance roundrobin
>>> server server-0 127.0.0.1:4000 maxconn 1 check inter 30s slowstart 10s
>>> server server-1 127.0.0.1:4001 maxconn 1 check inter 30s slowstart 10s
>>> ......more hosts defined.....
>>> stats auth xxx:yyy
>>> stats refresh 5s
>>> stats uri /haproxy?stats
>>>
>>>
>>> listen bbbb-mongrels *:8998
>>> balance roundrobin
>>> server server-0 127.0.0.1:5000 maxconn 1 check inter 30s slowstart 10s
>>> server server-1 127.0.0.1:5001 maxconn 1 check inter 30s slowstart 10s
>>> ......more hosts defined.....
>>> stats auth xxx:yyy
>>> stats refresh 5s
>>> stats uri /haproxy?stats
>>>
>>>
>>> Now if I read the CSV from this host on port 8999 (implying that I am
>>> interested in the aaaa-mongrels data) and parse out the FRONTEND row
>>> for aaaa-mongrels and compare it to the BACKEND row aaaa-mongrels on
>>> the 'cur' column, then the frontend value will always be 1 greater
>>> than the backend value. But only when not all mongrels are in process.
>>> What I mean is, let's say I have 20 mongrels defined in haproxy,
>>> but if only 10 requests are coming in (and all mongrels are up!) then
>>> the frontend 'cur' value will be 11 while the backend 'cur' value will
>>> be 10. The actual number of requests being processed is the backend
>>> value. But if the number of requests that are queued up by haproxy is
>>> greater than the number of mongrels available (e.g. 20 are available,
>>> but I have 30 requests coming in) then the frontend will report 30
>>> while backend will report 20, which is correct.
>>>
>>> So this output disparity is only present when there are backend hosts
>>> available and the number of requests is less than the number of
>>> backends available.
>>>
>>> To top it off, if I read the CSV output on the *other* listen port of
>>> 8998 but parse out the aaaa-mongrels, then both numbers are always
>>> correct.
>>>
>>> In other words, the numbers are always correct if I read the CSV from
>>> a listen port that is not "owned" by that listen block?
>>>
>>> (I find it strange that the stats for all haproxy pools can be read
>>> from any listen port. You would think that if you just hit the stats
>>> for a given listen port, then you would get a data set that is only
>>> limited to the listen block that is pertinent for that block.)
>>>
>>> All of this is apparent in just looking at either the web stats or the
>>> CSV stats. It has nothing to do with my parsing code. I just first
>>> noticed it when I was writing my parsing code.
>>>
>>> This is not a showstopper bug. Since it is reproducible every single
>>> time, I can just subtract 1. Just thought it would be nice to give you
>>> a heads up.
>>>
>>> -Rupert
Received on 2008/10/09 01:00

Re: stats: off by one