HAProxy - Load balancer
Willy Tarreau 7867cebf31 BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue
Commit 82cd5c13a ("OPTIM: backend: skip LB when we know the backend is
full") has uncovered a long-buried bug in the dequeuing code: when a
server releases a connection, it picks a new one from the proxy's queue
or its own. Technically speaking it only picks a pendconn, which is a
link between a position in the queue and a stream. It then sets this
pendconn's target to itself, and wakes up the stream's task so that
it can try to connect again.

The stream then goes through the regular connection setup phases,
calls back_try_conn_req() which calls pendconn_dequeue(), which
sets the stream's target to the pendconn's and releases the pendconn.
It then reaches assign_server() which sees no SF_ASSIGNED and calls
assign_server_and_queue() to perform load balancing or queuing. This
one first destroys the stream's target and gets ready to perform load
balancing. At this point we're load-balancing for no reason since we
already knew what server was available. And this is where the commit
above comes into play: the check for the backend's queue above may
detect other connections that arrived in between, and will immediately
return FULL, forcing this request back into the queue. If the server
had a very low maxconn (e.g. 1 due to a long slowstart), it's possible
that this evicted connection was the last one on the server and that
no other one will ever be present to process the queue. Usually a
regularly processed request will still have its own srv_conn that will
be used during stream_free() to dequeue other connections. But if the
server had a down-up cycle, then a call to pendconn_grab_from_px()
may start to dequeue entries which had no srv_conn and which will have
no server slot to offer when they expire, thus maintaining the situation
above forever. Worse, as new requests arrive, there are always some
requests in the queue and the situation feeds on itself.

The correct fix here is to properly set SF_ASSIGNED in pendconn_dequeue()
when the stream's target is assigned (as it's what this flag means), so
as to avoid a load-balancing pass when dequeuing.
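
The effect of the flag can be pictured with a minimal sketch. Everything in it
is an assumption made for illustration: the toy_* functions, the struct layouts
and the SF_ASSIGNED value are simplified stand-ins, not HAProxy's real code. It
only shows that a target set without the flag is thrown away by the next
assignment step, while a target set together with the flag survives.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for illustration only;
 * HAProxy's real structures and functions are far richer. */
#define SF_ASSIGNED 0x1u   /* flag value assumed for this sketch */

struct server   { const char *id; };
struct stream   { unsigned int flags; struct server *target; };
struct pendconn { struct server *target; struct stream *strm; };

/* Toy dequeue: copy the pendconn's server into the stream. When
 * with_fix is non-zero, also mark the stream as assigned. */
static void toy_pendconn_dequeue(struct pendconn *p, int with_fix)
{
	p->strm->target = p->target;
	if (with_fix)
		p->strm->flags |= SF_ASSIGNED;
}

/* Toy assignment step: returns 1 if a load-balancing pass ran (which
 * first drops the stream's target), 0 if the pass was skipped. */
static int toy_assign_server(struct stream *s)
{
	if (s->flags & SF_ASSIGNED)
		return 0;
	s->target = NULL;
	return 1;
}

/* Runs dequeue then assignment on a fresh stream; returns 1 if the
 * stream still points to the server that dequeued it. */
static int toy_keeps_server(int with_fix)
{
	struct server srv = { "srv1" };
	struct stream strm = { 0, NULL };
	struct pendconn p = { &srv, &strm };

	toy_pendconn_dequeue(&p, with_fix);
	toy_assign_server(&strm);
	return strm.target == &srv;
}
```

In this toy model, toy_keeps_server(0) returns 0 because the target is lost to
a needless load-balancing pass, and toy_keeps_server(1) returns 1 because the
pass is skipped.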

Many thanks to Pierre Cheynier for the numerous detailed traces he
provided that helped narrow this problem down.

This could be backported to all stable versions, but in practice only
2.3 and above are really affected, since they are the ones that contain
the commit above. Given how tricky this code is, it's better to limit
the backport to those versions that really need it.
2021-06-16 09:05:35 +02:00
.github CI: github actions: enable alpine/musl builds 2021-06-12 18:23:22 +02:00
addons BUG/MEDIUM: opentracing: initialization before establishing daemon and/or chroot mode 2021-06-10 06:45:39 +02:00
admin MAJOR: config: remove parsing of the global "nbproc" directive 2021-06-11 17:02:13 +02:00
dev CLEANUP: dev/flags: remove useless test in the stdin number parser 2021-04-03 15:29:10 +02:00
doc DOC: update references to process numbers in cpu-map and bind-process 2021-06-15 16:52:42 +02:00
examples MAJOR: config: remove parsing of the global "nbproc" directive 2021-06-11 17:02:13 +02:00
include CLEANUP: shctx: remove the different inter-process locking techniques 2021-06-15 16:52:42 +02:00
reg-tests REGTESTS: Remove REQUIRE_VERSION=1.7 from all tests 2021-06-11 19:21:28 +02:00
scripts SCRIPTS: opentracing: enable parallel builds in build-ot.sh 2021-06-10 07:35:15 +02:00
src BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue 2021-06-16 09:05:35 +02:00
tests MINOR: config: reject long-deprecated "option forceclose" 2021-06-11 16:57:34 +02:00
.cirrus.yml CI: introduce scripts/build-vtest.sh for installing VTest 2021-05-18 10:48:30 +02:00
.gitattributes MINOR: Configure the cpp userdiff driver for *.[ch] in .gitattributes 2021-02-22 18:17:57 +01:00
.gitignore ADDONS: make addons/ discoverable by git via .gitignore 2021-05-07 16:48:14 +02:00
.travis.yml CI: introduce scripts/build-vtest.sh for installing VTest 2021-05-18 10:48:30 +02:00
BRANCHES DOC: fix some spelling issues over multiple files 2021-01-08 14:53:47 +01:00
CHANGELOG [RELEASE] Released version 2.5-dev0 2021-05-14 09:36:37 +02:00
CONTRIBUTING CLEANUP: contrib: remove the last references to the now dead contrib/ directory 2021-04-21 15:13:58 +02:00
INSTALL CLEANUP: shctx: remove the different inter-process locking techniques 2021-06-15 16:52:42 +02:00
LICENSE LICENSE: add licence exception for OpenSSL 2012-09-07 13:52:26 +02:00
MAINTAINERS CONTRIB: move spoa_example out of the tree 2021-04-21 09:39:06 +02:00
Makefile CLEANUP: shctx: remove the different inter-process locking techniques 2021-06-15 16:52:42 +02:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
ROADMAP DOC: update the outdated ROADMAP file 2019-06-15 21:59:54 +02:00
SUBVERS BUILD: use format tags in VERDATE and SUBVERS files 2013-12-10 11:22:49 +01:00
VERDATE [RELEASE] Released version 2.4.0 2021-05-14 09:03:30 +02:00
VERSION [RELEASE] Released version 2.5-dev0 2021-05-14 09:36:37 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located in the doc/ directory :

  - doc/intro.txt for a quick introduction to HAProxy
  - doc/configuration.txt for the configuration reference manual
  - doc/lua.txt for the Lua reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)