haproxy

mirror of https://github.com/haproxy/haproxy.git synced 2026-03-29 05:43:58 -04:00

Author	SHA1	Message	Date
Olivier Houchard	0087651128	MINOR: counters: Introduce COUNTERS_UPDATE_MAX() Introduce COUNTERS_UPDATE_MAX(), and use it instead of using HA_ATOMIC_UPDATE_MAX() directly. For now it just calls HA_ATOMIC_UPDATE_MAX(), but will later be modified so that we can disable max calculation. This can be backported up to 2.8 if the usage of COUNTERS_UPDATE_MAX() generates too many conflicts.	2026-03-05 15:39:42 +01:00
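The wrapper idea described in this commit can be sketched as follows. This is only an illustration built on C11 atomics: `sk_atomic_update_max()` and `SK_COUNTERS_UPDATE_MAX()` are hypothetical stand-ins for `HA_ATOMIC_UPDATE_MAX()` and `COUNTERS_UPDATE_MAX()`, and the `SK_NO_MAX` switch is an assumption about how max calculation might later be disabled.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for HA_ATOMIC_UPDATE_MAX(): raise *val to new_max
 * when new_max is larger, using a compare-and-swap loop. Returns the
 * maximum known once the operation is done.
 */
static inline unsigned int sk_atomic_update_max(_Atomic unsigned int *val,
                                                unsigned int new_max)
{
	unsigned int old = atomic_load(val);

	while (old < new_max &&
	       !atomic_compare_exchange_weak(val, &old, new_max))
		; /* a failed CAS reloaded 'old', so simply retry */

	return old < new_max ? new_max : old;
}

/* Hypothetical wrapper: a single place where max tracking can be compiled
 * out later without touching the callers.
 */
#ifndef SK_NO_MAX
#define SK_COUNTERS_UPDATE_MAX(counter, count) \
	((void)sk_atomic_update_max((counter), (count)))
#else
#define SK_COUNTERS_UPDATE_MAX(counter, count) ((void)0)
#endif
```

Callers only ever use the wrapper, so switching the behaviour later is a one-line change in the macro definition.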
Amaury Denoyelle	c71ef2969b	OPTIM: backend: reduce contention when checking MUX init with ALPN In connect_server(), MUX initialization must be delayed if ALPN negotiation is configured, unless ALPN can already be retrieved via the server cache. A readlock is used to consult the server cache. Prior to this patch, it was always taken even if no ALPN is configured. The lock was thus used for every new backend connection instantiation. Rewrite the check so that now the lock is only used if ALPN is configured. Thus, no lock access is done if SSL is not used or if ALPN is not defined. In practice, there will be no performance gain, as the read lock should never block if ALPN is not configured. However, the code is cleaner as it better reflects that only access to server nego_alpn requires the path_params lock protection.	2026-02-19 11:27:49 +01:00
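A minimal sketch of the rewritten check, under stated assumptions: the `sk_server` structure, its `alpn_str` field and the `rdlock_taken` counter are hypothetical (the counter merely stands in for taking the path_params read lock), and only `nego_alpn` is a name taken from the commit message.

```c
#include <stddef.h>

struct sk_server {
	const char *alpn_str;      /* configured ALPN, NULL if none (hypothetical) */
	const char *nego_alpn;     /* cached negotiated ALPN, NULL if unknown */
	unsigned int rdlock_taken; /* mock: counts read-lock acquisitions */
};

/* Returns 1 if the MUX may be started immediately, 0 if it must wait for
 * the handshake. The lock is only touched when ALPN is configured.
 */
static int sk_may_start_mux_now(struct sk_server *srv)
{
	int may_start_mux_now = 1;

	if (srv->alpn_str) {
		srv->rdlock_taken++;           /* stands in for taking the read lock */
		if (!srv->nego_alpn)
			may_start_mux_now = 0; /* ALPN not known yet: delay the MUX */
		/* the read lock would be released here */
	}
	return may_start_mux_now;
}
```

A server without ALPN never reaches the locked section, which is the point of the patch.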
Amaury Denoyelle	55e9c67381	BUG/MINOR: backend: check delay MUX before conn_prepare() In connect_server(), when a new connection must be instantiated, MUX initialization is delayed if an ALPN setting is present on the server line configuration, as negotiation must be performed to select the correct MUX. However, this is not the case if the ALPN can already be retrieved from the server cache. This check is performed too late, however, and may cause issues with the QUIC stack. The problem can happen when the server ALPN is not yet set. In the normal case, the quic_conn layer is instantiated and MUX init is delayed until the handshake completion. When the MUX is finally instantiated, it reuses without any issue app_ops from its quic_conn, which is derived from the negotiated ALPN. However, there is a race condition if another QUIC connection populates the server ALPN cache. If this happens after the first quic_conn init but prior to the MUX delay check, the MUX will thus immediately start in connect_server(). When app_ops is retrieved from its quic_conn, a crash occurs in qcc_install_app_ops() as the QUIC handshake is not yet finalized: #0 0x000055e242a66df4 in qcc_install_app_ops (qcc=0x7f127c39da90, app_ops=0x0) at src/mux_quic.c:1697 1697 if (app_ops->init && !app_ops->init(qcc)) { [Current thread is 1 (Thread 0x7f12810f06c0 (LWP 25758))] To fix this, the MUX delay check is moved up in connect_server(). It is now performed prior to conn_prepare(), which is responsible for the quic_conn layer instantiation. Thus, it ensures consistency for the QUIC stack: MUX init is always delayed if the quic_conn does not itself reuse the SSL session and ALPN server cache (no quic_reuse_srv_params()). This must be backported up to 3.3.	2026-02-19 11:22:55 +01:00
Nenad Merdanovic	5a079d1811	MEDIUM: Add connect/queue/tarpit timeouts to set-timeout Add the ability to set connect, queue and tarpit timeouts from the set-timeout action. This is especially useful when using set-dst to dynamically connect to servers. This patch also adds the relevant fe_/be_/cur_ sample fetches for these timeouts.	2026-02-19 08:20:37 +01:00
Aurelien DARRAGON	747ff09818	MEDIUM: backend: make "balance random" consider tg local req rate when loads are equal This is a follow up to `b6bdb2553` ("MEDIUM: backend: make "balance random" consider req rate when loads are equal") In the above patch, we used the global sess_per_sec metric to choose which server we should be using. But the original intent was to use the per thread group statistic. No backport needed, the previous patch already improved the situation in 3.3, so let's not take the risk of breaking that.	2026-02-17 09:51:46 +01:00
Amaury Denoyelle	817003aa31	MINOR: backend: add function to check support for dynamic servers Move backend compatibility checks performed during 'add server' in a dedicated function be_supports_dynamic_srv(). This should simplify addition of future restriction. This function will be reused when implementing backend creation at runtime.	2026-02-06 14:35:19 +01:00
Willy Tarreau	b6bdb2553b	MEDIUM: backend: make "balance random" consider req rate when loads are equal As reported by Damien Claisse and Cédric Paillet, the "random" LB algorithm can become particularly unfair with large numbers of servers having few connections. It's indeed fairly common to see many servers with zero connection in a thousand-server large farm, and in this case the P2C algo consisting in checking the servers' loads doesn't help at all and is basically similar to random(1). In this case, we only rely on the distribution of server IDs in the random space to pick the best server, but it's possible to observe huge discrepancies. An attempt to model the problem clearly shows that with 1600 servers with weight 10, for 1 million requests, the lowest loaded ones will take 300 req while the most loaded ones will get 780, with most of the values between 520 and 700. In addition, only the first 28 lower bits of server IDs are used for the key calculation, which means that node keys are more deterministic. Setting random keys in the lowest 28 bits only better packs values with min around 530 and max around 710, with values mostly between 550 and 680. This can only be compensated by increasing weights and draws without being a perfect fix either. At 4 draws, the min is around 560 and the max around 670, with most values between 590 and 650. This patch takes another approach to this problem: when servers are on tie regarding their loads, instead of arbitrarily taking the second one, we now compare their current request rates, which is updated all the time and smoothed over one second, and we pick the server with the lowest request rate. Now with 2 draws, the curve is mostly flat, with the min at 580 and the max at 628, and almost all values between 611 and 625. And 4 draws exclusively gives values from 614 to 624.
Other points will need to be addressed separately (bits of server ID, maybe refine the hash algorithm), but these ones would affect how caches are selected, and cannot be changed without an extra option. For random however we can perform a change without impacting anyone. This should be backported, probably only to 3.3 since it's where the "random" algo became the default.	2026-02-04 14:54:16 +01:00
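The tie-break described in this commit can be sketched as a comparison between two drawn candidates. The structure and function names below are hypothetical and the draws themselves are left out; this only illustrates "lowest load wins, and on equal loads the lowest request rate wins".

```c
#include <stddef.h>

/* Hypothetical, reduced view of a server for the purpose of the sketch. */
struct sk_srv {
	unsigned int load;     /* current load (e.g. active connections) */
	unsigned int req_rate; /* request rate smoothed over one second */
};

/* Pick the better of two drawn servers: lowest load first, and when loads
 * are equal, the lowest request rate instead of an arbitrary choice.
 */
static const struct sk_srv *sk_pick_between(const struct sk_srv *a,
                                            const struct sk_srv *b)
{
	if (a->load != b->load)
		return a->load < b->load ? a : b;
	return b->req_rate < a->req_rate ? b : a;
}
```

With more than two draws, the same comparison would be folded over all drawn candidates.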
Olivier Houchard	7f4b053b26	MEDIUM: counters: mostly revert `da813ae4d7` Contrary to what was previously believed, there are corner cases where the counters may not be allocated, and we may want to make them optional at a later date, so we have to check if those counters are there. However, just checking that shared.tg is non-NULL is enough, we can then assume that shared.tg[tgid - 1] has properly been allocated too. Also modify the various COUNTER_SHARED_* macros to make sure they check for that too.	2026-01-14 12:39:14 +01:00
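A guarded update in the spirit of this commit might look like the sketch below. The types and the `SK_COUNTER_SHARED_INC()` macro are hypothetical and do not reproduce the real COUNTER_SHARED_* macros; the only property taken from the commit message is that testing `shared.tg` for NULL is considered sufficient before dereferencing `shared.tg[tgid - 1]`.

```c
#include <stddef.h>

/* Hypothetical per-thread-group counters and their shared container. */
struct sk_tg_counters {
	unsigned long long cum_req;
};

struct sk_shared {
	struct sk_tg_counters **tg; /* NULL when counters were never allocated */
};

/* Increment a field only when the per-thread-group array exists.
 * tgid is 1-based, hence the "- 1".
 */
#define SK_COUNTER_SHARED_INC(shared, tgid, field)          \
	do {                                                \
		if ((shared)->tg)                           \
			(shared)->tg[(tgid) - 1]->field++;  \
	} while (0)
```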
Olivier Houchard	da813ae4d7	MEDIUM: counters: Remove some extra tests Before updating counters, a few tests are made to check if the counters exist, but those counters should always exist at this point, so just remove them. This commit should have no impact, but can easily be reverted with no functional impact if various crashes appear.	2026-01-13 11:12:34 +01:00
Olivier Houchard	5495c88441	MEDIUM: counters: Dynamically allocate per-thread group counters Instead of statically allocating the per-thread group counters, based on the max number of thread groups available, allocate them dynamically, based on the number of thread groups actually used. That way we can increase the maximum number of thread groups without using an unreasonable amount of memory.	2026-01-13 11:12:34 +01:00
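The allocation strategy described here, sizing the array from the number of thread groups actually in use rather than from a compile-time maximum, can be sketched as follows. All names are hypothetical and the layout is deliberately simplified.

```c
#include <stdlib.h>

struct dyn_tg_counters {
	unsigned long long cum_conn;
};

struct dyn_shared {
	struct dyn_tg_counters **tg; /* one entry per thread group in use */
	int nb;                      /* number of allocated entries */
};

/* Release everything that was allocated so far. */
static void dyn_shared_free(struct dyn_shared *sh)
{
	int i;

	for (i = 0; sh->tg && i < sh->nb; i++)
		free(sh->tg[i]);
	free(sh->tg);
	sh->tg = NULL;
	sh->nb = 0;
}

/* Allocate counters for <nbtgroups> thread groups only.
 * Returns 1 on success, 0 on allocation failure (nothing is leaked).
 */
static int dyn_shared_alloc(struct dyn_shared *sh, int nbtgroups)
{
	sh->nb = 0;
	sh->tg = calloc(nbtgroups, sizeof(*sh->tg));
	if (!sh->tg)
		return 0;

	for (; sh->nb < nbtgroups; sh->nb++) {
		sh->tg[sh->nb] = calloc(1, sizeof(**sh->tg));
		if (!sh->tg[sh->nb]) {
			dyn_shared_free(sh);
			return 0;
		}
	}
	return 1;
}
```

Memory use then scales with the configured number of thread groups, so raising the maximum costs nothing for setups that use only a few.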
Willy Tarreau	933cb76461	BUG/MINOR: backend: inspect request not response buffer to check for TFO In 2.6, do_connect_server() was introduced by commit `0a4dcb65f` ("MINOR: stream-int/backend: Move si_connect() in the backend scope") and changed the approach to work with a stream instead of a stream-interface. However si_oc(si) was wrongly turned to &s->res instead of &s->req, which breaks TFO by always inspecting the response channel to figure whether there are data pending. This fix can be backported to all versions till 2.6.	2025-12-31 13:03:53 +01:00
Willy Tarreau	799653d536	BUG/MINOR: backend: fix the conn_retries check for TFO In 2.6, the retries counter on a stream was changed from retries left to retries done via commit `731c8e6cf` ("MINOR: stream: Simplify retries counter calculation"). However, one comparison fell through the cracks in order to detect whether or not we can use TFO (only first attempt), resulting in TFO never working anymore. This may be backported to all versions till 2.6.	2025-12-31 13:03:53 +01:00
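The change of semantics behind this bug can be illustrated with two tiny predicates. Both functions are hypothetical and do not reproduce the actual HAProxy code; the "retries left" form in particular is an assumption about how a first attempt would be detected under the old semantics.

```c
/* "Retries left" semantics: the first attempt is the one where
 * no retry has been consumed yet (assumed form).
 */
static int sk_first_attempt_left(int retries_left, int max_retries)
{
	return retries_left == max_retries;
}

/* "Retries done" semantics: the first attempt is the one where no retry
 * was performed yet. TFO may only be used in that case.
 */
static int sk_first_attempt_done(int retries_done)
{
	return retries_done == 0;
}
```

Using the first comparison on a counter that follows the second semantics would make the test fail on every attempt, which matches the symptom described in the commit message (TFO never working anymore).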
Olivier Houchard	40d16af7a6	BUG/MEDIUM: backend: Do not remove CO_FL_SESS_IDLE in assign_server() Back in the mists of time, commit `e91a526c8f` decided that if we were trying to stay on the same server as the previous request, and if there were a connection available in the session, we'd remove its CO_FL_SESS_IDLE. The reason for doing that has been long lost, probably it fixed a bug at some point, but it was most probably not the right place to do that. And starting with 3.3, this triggers a BUG_ON() because that flag is expected later on. So just revert the commit, if the ancient bug shows up again, it will be fixed another way. This should be backported to 3.3. There is little reason to backport it to previous versions, unless other patches depend on it.	2025-12-18 16:09:34 +01:00
Christopher Faulet	5c5914c32e	CLEANUP: backend: Remove useless test on server's xprt The server's xprt is always defined and cannot be NULL. So there is no reason to test it. It could lead to wrong assumptions later in the code. This patch should fix a Coverity report from #3213.	2025-12-15 07:56:53 +01:00
Christopher Faulet	7e9d921141	MEDIUM: tcpcheck/backend: Get the connection SNI before initializing SSL ctx The SNI of a new connection is now retrieved earlier, before the initialization of the SSL context. So, concretely, it is now performed before calling conn_prepare(). The SNI is then set just after.	2025-12-08 15:22:01 +01:00
Christopher Faulet	28654f3c9b	MINOR: connection/ssl: Store the SNI hash value in the connection itself When a SNI is set on a new connection, its hash is now saved in the connection itself. To do so, a dedicated field was added into the connection structure, called sni_hash. For now, this value is only used when the TLS session is cached.	2025-12-08 15:22:01 +01:00
Christopher Faulet	7d9cc28f92	Revert "BUG/MEDIUM: server/ssl: Unset the SNI for new server connections if none is set" This reverts commit `de29000e60`. The fix was in fact invalid. First, WolfSSL does not support calling SSL_set_tlsext_host_name with a NULL hostname. Then, it is not specified as supported by other SSL libraries. But, by reviewing the root cause of this bug, it appears there is an issue with the reuse of TLS sessions. It must not be performed if the SNI does not match. A TLS session created with a SNI must not be reused with another SNI. The side effects are not clear but functionally speaking, it is invalid. So, for now, the commit above was reverted because it is invalid and it crashes with WolfSSL. Then the init of the SSL connection must be reworked to get the SNI earlier, to be able to reuse or not an existing TLS session.	2025-11-26 12:05:43 +01:00
Christopher Faulet	de29000e60	BUG/MEDIUM: server/ssl: Unset the SNI for new server connections if none is set When a new SSL server connection is created, if no SNI is set, it is possible to inherit from the one of the reused TLS session. The bug was introduced by the commit `95ac5fe4a` ("MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid"). The mixup is possible between regular connections but also with health-checks connections. To fix the issue, when no SNI is set, for regular server connections and for health-check connections, the SNI must explicitly be disabled by calling ssl_sock_set_servername() with the hostname set to NULL. Many thanks to Lukas for his detailed bug report. This patch should fix the issue #3195. It must be backported as far as 3.0.	2025-11-25 16:32:46 +01:00
Olivier Houchard	e9d34f991e	BUG/MEDIUM: queues: Don't forget to unlock the queue before exiting In assign_server_and_queue(), there's a rare case when the server was full, so we created a pendconn, another server was considered but in the meanwhile the pendconn was unqueued already, so we just left the function. We did so, however, while still holding the queue lock, which will ultimately lead to a deadlock, and ultimately the watchdog would kill the process. To fix that, just unlock the queue before leaving. This should be backported to 3.2.	2025-11-20 13:57:06 +01:00
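The class of bug fixed here, an early return taken while a lock is still held, can be sketched with a mock lock. The structure and function below are hypothetical and much simpler than assign_server_and_queue(); the `locked` field only records whether the lock would be held.

```c
/* Hypothetical, reduced queue state. */
struct sk_queue {
	int locked;      /* mock lock state: 1 while the queue lock is held */
	int pend_queued; /* 1 while our pendconn is still in the queue */
};

/* Try to move our pendconn to another server.
 * Returns 1 if it was still queued and got dequeued, 0 if somebody else
 * had already unqueued it. In both cases the lock is released on exit.
 */
static int sk_requeue_elsewhere(struct sk_queue *q)
{
	q->locked = 1;             /* stands in for taking the queue lock */

	if (!q->pend_queued) {
		q->locked = 0;     /* the fix: unlock before the early return */
		return 0;
	}

	q->pend_queued = 0;        /* dequeue; requeueing elsewhere is omitted */
	q->locked = 0;
	return 1;
}
```

Without the unlock on the early-return path, the next attempt to take the lock would block forever, which is the deadlock described above.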
Amaury Denoyelle	d79295d89b	Revert "BUG/MEDIUM: connections: permit to permanently remove an idle conn" The target patch fixes a rare race condition which happens when a MUX IO handler is working on a connection already moved into the purge list. In this case, the handler will incorrectly move the connection back into the idle list. To fix this, conn_delete_from_tree() was extended to remove flags along with the connection from the idle list. This was performed when the connection is moved into the purge list. However, it introduces another issue related to the idle server connection accounting. Thus it is necessary to revert it prior to the incoming newer fix. This patch must be backported to every version where the original commit is.	2025-11-14 16:06:34 +01:00
Amaury Denoyelle	8415254cea	MINOR: check: clarify check-reuse-pool interaction with reuse policy check-reuse-pool can only perform as expected if reuse policy on the backend is set to aggressive or higher. Update the documentation to reflect this and implement a server diag warning.	2025-11-14 10:44:05 +01:00
Olivier Houchard	25559e7055	MEDIUM: backend: Defer conn_xprt_start() after mux creation In connect_server(), defer the call to conn_xprt_start() until after we had a chance to create the mux. The xprt can behave differently depending on if a mux is or is not available at this point, as if it is, it may want to wait until some data comes from the mux. This does not need to be backported.	2025-11-07 11:40:52 +01:00
Willy Tarreau	096999ee20	BUG/MEDIUM: connections: permit to permanently remove an idle conn There's currently a function conn_delete_from_tree() which is used to detach an idle connection from the tree it's currently attached to so that it is no longer found. This function is used in three circumstances: (1) when picking a new connection that no longer has any avail stream; (2) when temporarily working on the connection from an I/O handler, in which case it's re-added at the end; (3) when killing a connection. The 2nd case above is quite specific, as it requires to preserve the CO_FL_LIST_MASK flags so that the connection can be re-inserted into the proper tree when leaving the handler. However, there's a catch. When killing a connection, we want to be certain it will not be reinserted into the tree. The flags preservation is causing a tiny race if an I/O happens while the connection is in the kill list, because in this case the I/O handler will note the connection flags, do its work, then reinsert the connection where it believed it was, then the connection gets purged, and another user can find it in the tree. The issue is very difficult to reproduce. On a 128-thread machine it happens in H2 around 500k req/s after around 50M requests. In H1 it happens after around 1 billion requests. The fix here consists in passing an extra argument to the function to indicate if the removal is permanent or not. When it's permanent, the function will clear the associated flags. The callers were adjusted so that all those dequeuing a connection in order to kill it do it permanently and all other ones do it only temporarily. A slightly different approach could have worked: the function could always remove all flags, and the callers would need to restore them. But this would require trickier modifications of the various call places, compared to only passing 0/1 to indicate the permanent status. This will need to be backported to all stable versions.
The issue was at least reproduced since 3.1 (not tested before). The patch will need to be adjusted for 3.2 and older, because a 2nd argument "thr" was added in 3.3, so the patch will not apply to older versions as-is.	2025-11-05 11:08:25 +01:00
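The "permanent" argument described in this commit can be sketched as below. The flag values, the `sk_conn` structure and both functions are hypothetical; the real conn_delete_from_tree() works on actual trees and also takes a thread argument, both of which are ignored here.

```c
/* Hypothetical list-membership flags, standing in for CO_FL_LIST_MASK. */
#define SK_FL_IDLE_LIST 0x1u
#define SK_FL_SAFE_LIST 0x2u
#define SK_FL_LIST_MASK (SK_FL_IDLE_LIST | SK_FL_SAFE_LIST)

struct sk_conn {
	unsigned int flags; /* which list the connection belongs to */
	int in_tree;        /* 1 while attached to an idle tree */
};

/* Detach the connection. When <permanent> is set, also clear the list
 * flags so that nobody can reinsert it later.
 */
static void sk_conn_delete_from_tree(struct sk_conn *conn, int permanent)
{
	conn->in_tree = 0;
	if (permanent)
		conn->flags &= ~SK_FL_LIST_MASK;
}

/* What an I/O handler would do when leaving: put the connection back
 * where its flags say it was, if anywhere.
 */
static void sk_conn_reinsert_if_listed(struct sk_conn *conn)
{
	if (conn->flags & SK_FL_LIST_MASK)
		conn->in_tree = 1;
}
```

A temporary removal (0) keeps the flags so that the handler can reinsert the connection, while a permanent one (1) makes the later reinsertion a no-op.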
Olivier Houchard	7d4aa7b22b	BUG/MEDIUM: server: Add a rwlock to path parameter Add a rwlock to control the server's path_parameter, to make sure multiple threads don't set it at the same time, and it can't be seen in an inconsistent state. Also don't set the parameter every time, only set them if they have changed, to prevent needless writes. This does not need to be backported.	2025-11-04 18:47:34 +01:00
Amaury Denoyelle	6bfabfdc77	OPTIM: backend: skip conn reuse for incompatible proxies When trying to reuse a backend connection, a connection hash is calculated to match an entry with similar parameters. Previously, this operation was skipped if the stream content wasn't based on HTTP, as it would have been incompatible with http-reuse. With the introduction of SPOP backends, this condition was removed, so that it can also benefit from connection reuse. However, this means that now the hash calculation is always performed when connecting to a server, even for TCP or log backends. This is unnecessary as these proxies cannot perform connection reuse. Note also that reuse mode is reset on postparsing for incompatible backends. This at least guarantees that no tree lookup will be performed via be_reuse_connection(). However, connection lookup is still performed in the session via session_get_conn() which is another unnecessary operation. Thus, this patch restores the condition so that reuse operations are now entirely skipped if a backend mode is incompatible. This is implemented via a new utility function named be_supports_conn_reuse(). This could be backported up to 3.1, as this commit could be considered as a performance regression for tcp/log backend modes.	2025-11-03 10:43:50 +01:00
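A predicate of this kind could be sketched as follows. The enum, the structures and the exact set of modes accepted are assumptions made for illustration (the commit message only states that TCP and log backends cannot reuse connections, while HTTP and SPOP can benefit from it); this is not the real be_supports_conn_reuse().

```c
/* Hypothetical backend modes. */
enum sk_mode { SK_MODE_TCP, SK_MODE_HTTP, SK_MODE_LOG, SK_MODE_SPOP };

struct sk_backend {
	enum sk_mode mode;
};

/* Returns 1 if connection reuse makes sense for this backend mode. */
static int sk_be_supports_conn_reuse(const struct sk_backend *be)
{
	return be->mode == SK_MODE_HTTP || be->mode == SK_MODE_SPOP;
}

/* Connection setup path: the hash computation and the lookups are skipped
 * entirely for incompatible backends. <hash_computed> counts how many
 * times the (omitted) hash calculation would have run.
 */
static int sk_try_reuse(const struct sk_backend *be, unsigned int *hash_computed)
{
	if (!sk_be_supports_conn_reuse(be))
		return 0;

	(*hash_computed)++;
	return 1;
}
```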
Willy Tarreau	fe47e8dfc5	MINOR: proxy: only check abortonclose through a dedicated function In order to prepare for changing the way abortonclose works, let's replace the direct flag check with a similarly named function (proxy_abrt_close) which returns the on/off status of the directive for the proxy. For now it simply reflects the flag's state.	2025-10-08 10:29:41 +02:00
Olivier Houchard	b01a00acb1	BUG/MEDIUM: connections: Only avoid creating a mux if we have one In connect_server(), only avoid creating a mux when we're reusing a connection, if that connection already has one. We can reuse a connection with no mux, if we made a first attempt at connecting to the server and it failed before we could create the mux (or during the mux creation). The connection will then be reused when trying again. This fixes a bug where a stream could stall if the first connection attempt failed before the mux creation. It is easy to reproduce by creating random memory allocation failure with -dmFail. This was introduced by commit `4aaf0bfbce`, and thus does not need any backport as long as that commit is not backported.	2025-10-03 13:13:10 +02:00
Chris Staite	54f53bc875	MINOR: backend: srv_is_up converter There is currently an srv_queue converter which is capable of taking the output of a dynamic name and determining the queue length for a given server. In addition there is a sample fetcher for whether a server is currently up. This simply combines the two such that srv_is_up can be used as a converter too. Future work might extend this to other sample fetchers for servers, but this is probably the most useful for acl routing.	2025-09-26 10:46:48 +02:00
Chris Staite	faba98c85f	MINOR: backend: srv_queue helper In preparation of providing further server converters, split the code for finding the server from the sample out. Additionally, update the documentation for srv_queue converter to note security concerns.	2025-09-26 10:46:48 +02:00
Aurelien DARRAGON	5c299dee5a	MEDIUM: stats: consider that shared stats pointers may be NULL This patch looks huge, but it has a very simple goal: protect all accesses to shared stats pointers (either reads or writes), because we now consider that these pointers may be NULL. The reason behind this is that despite all precautions taken to ensure the pointers shouldn't be NULL when not expected, there are still corner cases (i.e. frontend stats used on a backend with no FE cap and vice versa) where we could try to access a memory area which is not allocated. Willy stumbled on such cases while playing with the rings servers upon connection error, which eventually led to process crashes (since 3.3 when shared stats were implemented). Also, we may decide later that shared stats are optional and should be disabled on the proxy to save memory and CPU, and this patch is a step further towards that goal. So in essence, this patch ensures shared stats pointers are always initialized (including NULL), and adds necessary guards before shared stats pointers are de-referenced. Since we already had some checks for backends and listeners stats, and the pointer address retrieval should stay in cpu cache, let's hope that this patch doesn't impact stats performance much.	2025-09-18 16:49:51 +02:00
Willy Tarreau	2d6b5c7a60	MEDIUM: connection: reintegrate conn_hash_node into connection Previously the conn_hash_node was placed outside the connection due to the big size of the eb64_node that could have negatively impacted frontend connections. But having it outside also means that one extra allocation is needed for each backend connection, and that one memory indirection is needed for each lookup. With the compact trees, the tree node is smaller (16 bytes vs 40) so the overhead is much lower. By integrating it into the connection, We're also eliminating one pointer from the connection to the hash node and one pointer from the hash node to the connection (in addition to the extra object bookkeeping). This results in saving at least 24 bytes per total backend connection, and only inflates connections by 16 bytes (from 240 to 256), which is a reasonable compromise. Tests on a 64-core EPYC show a 2.4% increase in the request rate (from 2.08 to 2.13 Mrps).	2025-09-16 09:23:46 +02:00
Willy Tarreau	ceaf8c1220	MEDIUM: connection: move idle connection trees to ceb64 Idle connection trees currently require a 56-byte conn_hash_node per connection, which can be reduced to 32 bytes by moving to ceb64. While ceb64 is theoretically slower, in practice here we're essentially dealing with trees that almost always contain a single key and many duplicates. In this case, ceb64 insert and lookup functions become faster than eb64 ones because all duplicates are a list accessed in O(1) while it's a subtree for eb64. In tests it is impossible to tell the difference between the two, so it's worth reducing the memory usage. This commit brings the following memory savings to conn_hash_node (one per backend connection), and to srv_per_thread (one per thread and per server), given as struct: before, after, delta: conn_hash_node: 56, 32, -24; srv_per_thread: 96, 72, -24. The delicate part is conn_delete_from_tree(), because we need to know the tree root the connection is attached to. But thanks to recent cleanups, it's now clear enough (i.e. idle/safe/avail vs session are easy to distinguish).	2025-09-16 09:23:46 +02:00
Willy Tarreau	95b8adff67	MINOR: connection: pass the thread number to conn_delete_from_tree() We'll soon need to choose the server's root based on the connection's flags, and for this we'll need the thread it's attached to, which is not always the current one. This patch simply passes the thread number from all callers. They know it because they just set the idle_conns lock on it prior to calling the function.	2025-09-16 09:23:46 +02:00
Willy Tarreau	3d18a0d4c2	CLEANUP: backend: factor the connection lookup loop The connection lookup loop is made of two identical blocks, one looking in the idle or safe lists and the other one looking into the safe list only. The second one is skipped if a connection was found or if the request looks for a safe one (since already done). Also the two are slightly different due to leftovers from earlier versions in that the second one checks for safe connections and not the first one, and the second one sets is_safe which is not used later. Let's just rationalize all this by placing them in a loop which checks first from the idle conns and second from the safe ones, or skips the first step if the request wants a safe connection. This reduces the code and shortens the time spent under the lock.	2025-09-16 09:23:46 +02:00
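The factored lookup can be sketched as a two-step loop. The list representation and all names are hypothetical (the real code walks trees under a lock); only the order of the steps and the skipping of the first step for requests wanting a safe connection come from the commit message.

```c
#include <stddef.h>

/* Hypothetical idle connection and list types. */
struct sk_idle_conn {
	int id;
	int usable; /* 1 if this connection can serve the request */
};

struct sk_list {
	const struct sk_idle_conn *conns;
	size_t nb;
};

/* Look first into the idle list, then into the safe list. When the request
 * wants a safe connection, the idle step is skipped.
 */
static const struct sk_idle_conn *sk_lookup(const struct sk_list *idle,
                                            const struct sk_list *safe,
                                            int want_safe)
{
	const struct sk_list *lists[2] = { idle, safe };
	int step;
	size_t i;

	for (step = want_safe ? 1 : 0; step < 2; step++) {
		for (i = 0; i < lists[step]->nb; i++) {
			if (lists[step]->conns[i].usable)
				return &lists[step]->conns[i];
		}
	}
	return NULL;
}
```

A single loop body replaces the two near-identical blocks, so any later change to the matching logic only has to be made once.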
Olivier Houchard	d4c51a4f57	MEDIUM: server: Make use of the stored ALPN stored in the server Now that the ALPN negotiated for a given server is stored, use that to decide if we can create the mux right away in connect_server(), and use it in conn_install_mux_be(). That way, we may create the mux soon enough for early data to be sent, before the handshake has been completed. This commit depends on several previous commits, and it has not been deemed important enough to backport.	2025-09-09 19:01:24 +02:00
Willy Tarreau	6a2b3269f9	CLEANUP: backend: clarify the cases where we want to use early data The conditions to use early data on output are super tricky and detected later, so that it's difficult to figure how this works. This patch splits the condition in two parts, the one that can be performed early that is based on config/client/etc. It is used to clear a variable that allows early data to be used in case any condition is not satisfied. It was purposely split into multiple independent and reviewable tests. The second part remains where it was at the end, and is used to temporarily clear the handshake flags to let the data layer use early data. This one being tricky, a large comment explaining the principle was added. The logic was not changed at all, only the code was made more readable.	2025-09-09 19:01:24 +02:00
Willy Tarreau	9b9d0720e1	CLEANUP: backend: simplify the complex ifdef related to 0RTT in connect_server() Since 3.0 we have HAVE_SSL_0RTT precisely to avoid checking horribly complicated and unmaintainable conditions to detect support for 0RTT. Let's just drop the complex condition and use the macro instead.	2025-09-09 19:01:24 +02:00
Willy Tarreau	4aaf0bfbce	CLEANUP: backend: invert the condition to start the mux in connect_server() Instead of trying to switch from delayed start to instant start based on a single condition, let's do the opposite and preset the condition to instant start and detect what could cause it to be delayed, thus falling back to the slow mode. The condition remains exactly the inverted one and better matches the comment about ALPN being the only cause of such a delay.	2025-09-09 19:01:24 +02:00
Willy Tarreau	7b4a7f92b5	CLEANUP: backend: clarify the role of the init_mux variable in connect_server() The init_mux variable is currently used in a way that's not super easy to grasp. It's set a bit too late and requires to know a lot of info at once. Let's first rename it to "may_start_mux_now" to clarify its role, as the purpose is not to force the mux to be initialized now but to permit it to do it.	2025-09-09 19:01:24 +02:00
Christopher Faulet	52866349a1	OPTIM: backend: Don't set SNI for non-ssl connections There is no reason to set the SNI for non-ssl connections. It is not really an issue because ssl_sock_set_servername() function will do nothing. But there is no reason to uselessly evaluate an expression. No backport needed, because there is no bug.	2025-09-05 15:56:42 +02:00
Willy Tarreau	93cc18ac42	MAJOR: backend: switch the default balancing algo to "random" For many years, an unset load balancing algorithm would use "roundrobin". It was shown several times that "random" with at least 2 draws (the default) generally provides better performance and fairness in that it will automatically adapt to the server's load and capacity. This was further described with numbers in this discussion: https://www.mail-archive.com/haproxy@formilux.org/msg46011.html https://github.com/orgs/haproxy/discussions/3042 BTW there were no objection and only support for the change. The goal of this patch is to change the default algo when none is specified, from "roundrobin" to "random". This way, users who don't care and don't set the load balancing algorithm will benefit from a better one in most cases, while those who have good reasons to prefer roundrobin (for session affinity or for reproducible sequences like used in regtests) can continue to specify it. The vast majority of users should not notice a difference.	2025-09-04 08:30:35 +02:00
Amaury Denoyelle	21f7974e05	OPTIM: backend: set release on takeover for strict maxconn When strict maxconn is enforced on a server, it may be necessary to kill an idle connection to never exceed the limit. To be able to delete a connection from any thread, takeover is first used to migrate it on the current thread prior to its deletion. As takeover is performed to delete a connection instead of reusing it, <release> argument can be set to true. This removes unnecessary allocations of resources prior to connection deletion. As such, this patch is a small optimization for strict maxconn implementation. Note that this patch depends on the previous one which removes any assumption in takeover implementation that thread isolation is active if <release> is true.	2025-08-28 16:11:32 +02:00
Amaury Denoyelle	ec1ab8d171	MINOR: session: remove redundant target argument from session_add_conn() session_add_conn() takes three arguments: connection and session instances, plus a void pointer labelled as target. Typically, it represents the server, but can also be a backend instance (for example on dispatch). In fact, this argument is redundant as <target> is already a member of the connection. This commit simplifies session_add_conn() by removing it. A BUG_ON() on target is extended to ensure it is never NULL.	2025-07-30 11:39:57 +02:00
Aurelien DARRAGON	c24de077bd	OPTIM: stats: store fast sharded counters pointers at session and stream level Following commit `75e480d10` ("MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct"), in order to minimize the impact of the recent sharded counters work, we try to push things a bit further in this patch by storing and using "fast" pointers at the session and stream levels when available to avoid costly indirections and systematic "tgid" resolution (which can not be cached by the CPU due to its THREAD-local nature). Indeed, we know that a session/stream is tied to a given CPU, thanks to this we know that the tgid for a given session/stream will never change. Given that, we are able to store sharded frontend and listener counters pointers at the session level (namely sess->fe_tgcounters and sess->li_tgcounters), and once the backend and the server are selected, we are also able to store backend and server sharded counters pointers at the stream level (namely s->be_tgcounters and s->sv_tgcounters). Everywhere we rely on these counters and the stream or session context is available, we use the fast pointers instead of the indirect pointers path to make the pointer resolution a bit faster. This optimization proved to bring a few percent back, and together with the previous `75e480d10` commit we have now fixed the performance regression (we are back on par with 3.2 stats performance)	2025-07-25 18:24:23 +02:00
Aurelien DARRAGON	75e480d107	MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct Between 3.2 and 3.3-dev we observed a noticeable performance regression due to stats handling. After bisecting, Willy found out that recent work to split stats computing across multiple thread groups (stats sharding) was responsible for that performance regression. We're looking at roughly 20% performance loss. More precisely, it is the added indirections, multiplied by the number of statistics that are updated for each request, which in the end causes a significant amount of time being spent resolving pointers. We noticed that the fe_counters_shared and be_counters_shared structures which are currently allocated in dedicated memory since `a0dcab5c` ("MAJOR: counters: add shared counters base infrastructure") are no longer huge since `16eb0fab31` ("MAJOR: counters: dispatch counters over thread groups") because they now essentially hold flags plus the per-thread group id pointer mapping, not the counters themselves. As such we decided to try merging fe_counters_shared and be_counters_shared in their parent structures. The cost is a slight memory overhead for the parent structure, but it gets rid of one pointer indirection. This patch alone yields visible performance gains and almost restores 3.2 stats performance. counters_fe_shared_get() was renamed to counters_fe_shared_prepare() and now returns either failure or success instead of a pointer because we don't need to retrieve a shared pointer anymore, the function takes care of initializing the existing pointer.	2025-07-25 16:46:10 +02:00
Willy Tarreau	6ad9285796	CLEANUP: server: rename server_find_by_name() to server_find() This function doesn't just look at the name but also the ID when the argument starts with a '#'. So the name is not correct and explains why this function is not always used when the name only is needed, and why the list-based findserver() is used instead. So let's just call the function "server_find()", and rename its generation-id based cousin "server_find_unique()".	2025-07-15 10:30:28 +02:00
Aurelien DARRAGON	4fcc9b5572	MINOR: counters: rename last_change counter to last_state_change Since proxy and server struct already have an internal last_change variable and we cannot merge it with the shared counter one, let's rename the last_change counter to be more specific and prevent the mixup between the two. last_change counter is renamed to last_state_change, and unlike the internal last_change, this one is a shared counter so it is expected to be updated by other processes behind our back. However, when updating last_state_change counter, we use the value of the server/proxy last_change as reference value.	2025-06-30 16:26:38 +02:00
Aurelien DARRAGON	5b1480c9d4	MEDIUM: proxy: add and use a separate last_change variable for internal use Same motivation as previous commit, proxy last_change is "abused" because it is used for 2 different purposes, one for stats, and the other one for process-local internal use. Let's add a separate proxy-only last_change variable for internal use, and leave the last_change shared (and thread-grouped) counter for statistics.	2025-06-30 16:26:31 +02:00
Amaury Denoyelle	a0db93f3d8	MEDIUM: backend: delay MUX init with ALPN even if proto is forced On backend side, multiplexer layer is initialized during connect_server(). However, this step is not performed if ALPN is used, as the negotiated protocol may be unknown. Multiplexer initialization is delayed until after TLS handshake completion. There are still exceptions though that force the MUX to be initialized even if ALPN is used. One of them was if <mux_proto> server field was already set at this stage, which is the case when an explicit proto is selected on the server line configuration. Remove this condition so that now MUX init is delayed with ALPN even if proto is forced. The scope of this change should be minimal. In fact, the only impact concerns server config with both proto and ALPN set, which is pretty unlikely as it is contradictory. The main objective of this patch is to prepare QUIC support on the backend side. Indeed, QUIC proto will be forced on the server if a QUIC address is used, similarly to bind configuration. However, we still want to delay MUX initialization until after QUIC handshake completion. This is mandatory to know the selected application protocol, required during QUIC MUX init.	2025-06-12 11:21:32 +02:00
Frederic Lecaille	7c76252d8a	MINOR: quic-be: Correct the QUIC protocol lookup From connect_server(), QUIC protocol could not be retrieved by protocol_lookup() because of the PROTO_TYPE_STREAM default passed as argument. To support QUIC, srv->addr_type.proto_type may safely be passed instead.	2025-06-11 18:37:34 +02:00

833 commits