Hi Krzysztof,
I've merged your spread-checks work, with a few minor adaptations:
- I renamed the parameter from "spread-check" to "spread-checks".
I realized that this is what I intuitively called it everywhere, so
it should be better that way.
- I added a lower bound for the minimal interval. I had a case where
two backends shared 5 servers with a 1000 ms interval, and one backend
had one server with a 100 ms interval. All 11 servers would be started
within 100 ms, i.e. about 9 ms apart, which meant only about 45 ms
between two consecutive checks of the same server in each backend. Now,
servers with inter < 1000 no longer affect the "mininter" variable, and
have their start date computed relative to their own inter instead of
mininter. This gave noticeably better results for the case above, as
identical servers are then checked about 450 ms apart.
- I found that the randomization was only applied in the code path where
a health check had failed. This may explain why it took 45 minutes to
stabilize on your server. Here, with that fixed, it already stabilizes
after 2 or 3 checks.
- I added some minimal doc on the new variable, and suggested 2..5 as
good values since those are the ones that gave me good results.
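
For the record, the arithmetic in the second point can be sketched as
follows. This is only a Python illustration of the description above, not
the actual haproxy code: the function name, the "floor_ms" argument and the
ordering of the servers (first backend, second backend, then the lone
100 ms server) are assumptions made for the example.

```python
def start_offsets(inters, floor_ms=None):
    """Return the first-check offset (in ms) of each server, in order.

    inters   -- check interval of each server, in ms
    floor_ms -- hypothetical lower bound: intervals below it do not
                lower mininter, and those servers are spread over
                their own interval instead
    """
    if not inters:
        return []
    n = len(inters)
    if floor_ms is None:
        counted = inters
    else:
        counted = [i for i in inters if i >= floor_ms]
    # If every interval is below the floor, mininter is never used.
    mininter = min(counted) if counted else floor_ms
    offsets = []
    for pos, inter in enumerate(inters):
        below_floor = floor_ms is not None and inter < floor_ms
        base = inter if below_floor else mininter
        # Spread the servers evenly over the chosen base interval.
        offsets.append(base * pos / n)
    return offsets


# Two backends with the same 5 servers at 1000 ms, plus one server at 100 ms.
servers = [1000] * 5 + [1000] * 5 + [100]

old = start_offsets(servers)                 # no lower bound: mininter = 100
new = start_offsets(servers, floor_ms=1000)  # lower bound:    mininter = 1000

# Gap between the first server of backend 1 and the same server in backend 2.
print(round(old[5] - old[0]), round(new[5] - new[0]))  # prints "45 455"
```

Without the lower bound, the same server is checked by both backends only
about 45 ms apart; with it, the gap grows to roughly 450 ms.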
All in all, I'm satisfied with this feature, and I've merged it for 1.3.12.3.
I'll send another mail here on the ML with a preview of the changes.
Cheers,
Willy
Received on 2007/10/15 00:11