Commit graph

20694 commits

Author SHA1 Message Date
Frederic Lecaille
0a02acecf3 BUG/MEDIUM: qpack: correctly deal with too large decoded numbers
Same fix as this one for hpack:

	7315428615 ("BUG/MEDIUM: hpack: correctly deal with too large decoded numbers")

Indeed, the encoding of integers for QPACK is the same as for HPACK but for 64-bit
integers.

Must be backported as far as 2.6.
2026-03-05 15:02:02 +01:00
Frederic Lecaille
cdcdc016cc BUG/MINOR: quic: fix OOB read in preferred_address transport parameter
This bug impacts only the QUIC backend. A QUIC server does not receive
a server preferred address transport parameter.

In quic_transport_param_dec_pref_addr(), the boundary check for the
connection ID was inverted and incorrect. This could lead to an
out-of-bounds read during the following memcpy.

This patch fixes the comparison to ensure the buffer has enough input data
for both the CID and the mandatory Stateless Reset Token.
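
The kind of boundary check described above can be sketched as follows. This is only an illustration under stated assumptions, not the code of quic_transport_param_dec_pref_addr(): the structure, the function name and the constants are hypothetical, and the field layout follows the preferred_address definition of RFC 9000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PREF_ADDR_FIXED_LEN (4 + 2 + 16 + 2) /* IPv4 addr/port, IPv6 addr/port */
#define CID_MAX_LEN         20
#define RESET_TOKEN_LEN     16

/* Hypothetical container, not HAProxy's actual structure. */
struct pref_addr {
    uint8_t addrs[PREF_ADDR_FIXED_LEN];
    uint8_t cid_len;
    uint8_t cid[CID_MAX_LEN];
    uint8_t reset_token[RESET_TOKEN_LEN];
};

/* Returns 1 on success, 0 if the input is truncated or malformed. */
static int pref_addr_decode(struct pref_addr *out, const uint8_t *buf, size_t len)
{
    const uint8_t *p = buf;
    const uint8_t *end = buf + len;

    assert(out && buf);

    /* Fixed-size part plus the one-byte CID length. */
    if ((size_t)(end - p) < PREF_ADDR_FIXED_LEN + 1)
        return 0;
    memcpy(out->addrs, p, PREF_ADDR_FIXED_LEN);
    p += PREF_ADDR_FIXED_LEN;
    out->cid_len = *p++;

    /* The remaining input must hold the CID *and* the reset token. */
    if (out->cid_len > CID_MAX_LEN ||
        (size_t)(end - p) < (size_t)out->cid_len + RESET_TOKEN_LEN)
        return 0;

    memcpy(out->cid, p, out->cid_len);
    p += out->cid_len;
    memcpy(out->reset_token, p, RESET_TOKEN_LEN);
    return 1;
}
```

The point of the sketch is that the length comparison is done once, before any memcpy(), and covers both variable and trailing fields.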

Thank you to Kamil Frankowicz for having reported this.

Must be backported to 3.3.
2026-03-05 15:02:02 +01:00
Frederic Lecaille
54b614d2b5 BUG/MINOR: qpack: fix 1-byte OOB read in qpack_decode_fs_pfx()
In qpack_decode_fs_pfx(), if the first qpack_get_varint() call
consumes the entire buffer, the code would perform a 1-byte
out-of-bounds read when accessing the sign bit via **raw.

This patch adds an explicit length check at the beginning of
qpack_get_varint(), which systematically secures all other callers
against empty inputs. It also adds a necessary check before the
second varint call in qpack_decode_fs_pfx() to ensure data is still
available before dereferencing the pointer to extract the sign bit,
returning QPACK_RET_TRUNCATED if the buffer is exhausted.
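
A minimal sketch of the two checks, assuming simplified stand-ins for the real functions: get_varint(), decode_fs_pfx() and the PFX_* codes below are hypothetical names, and the varint helper conservatively rejects anything longer than nine continuation bytes instead of reproducing the actual decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PFX_OK = 0, PFX_TRUNCATED = -1, PFX_TOO_LARGE = -2 };

/* Hypothetical stand-in for qpack_get_varint(): decodes an integer with a
 * <b>-bit prefix and advances <*raw>/<*len> on success.
 */
static int get_varint(const uint8_t **raw, size_t *len, unsigned b, uint64_t *out)
{
    const uint8_t *p = *raw;
    size_t left = *len;
    uint64_t mask, ret;
    unsigned shift = 0;
    uint8_t byte;

    assert(b >= 1 && b <= 8);
    if (!left)                      /* explicit check protecting every caller */
        return PFX_TRUNCATED;

    mask = (1u << b) - 1;
    ret = *p++ & mask;
    left--;
    if (ret == mask) {
        do {
            if (!left)
                return PFX_TRUNCATED;
            if (shift > 56)         /* conservative: at most 9 extra bytes */
                return PFX_TOO_LARGE;
            byte = *p++;
            left--;
            ret += (uint64_t)(byte & 0x7f) << shift;
            shift += 7;
        } while (byte & 0x80);
    }
    *raw = p;
    *len = left;
    *out = ret;
    return PFX_OK;
}

/* Decodes the two integers of a field section prefix. */
static int decode_fs_pfx(const uint8_t **raw, size_t *len,
                         uint64_t *enc_ric, uint64_t *delta_base, int *sign)
{
    int ret = get_varint(raw, len, 8, enc_ric);

    if (ret != PFX_OK)
        return ret;
    /* The first varint may have consumed the whole buffer. */
    if (!*len)
        return PFX_TRUNCATED;
    *sign = **raw >> 7;
    return get_varint(raw, len, 7, delta_base);
}
```

With the two-byte input { 0x02, 0x85 } this sketch yields a first integer of 2, a sign bit of 1 and a second integer of 5, while the one-byte input { 0x02 } is reported as truncated instead of being read past its end.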

Thank you to Kamil Frankowicz for having reported this.

Must be backported as far as 2.6.
2026-03-05 15:02:02 +01:00
Frederic Lecaille
e38b86e72c BUG/MAJOR: qpack: unchecked length passed to huffman decoder
A call to huffman decoder function (huff_dec()) is made from qpack_decode_fs()
without checking the buffer length passed to this function, leading to OOB read
which can crash the process.

Thank you to Kamil Frankowicz for having reported this.

Must be backported as far as 2.6.
2026-03-05 15:02:02 +01:00
Willy Tarreau
7315428615 BUG/MEDIUM: hpack: correctly deal with too large decoded numbers
The varint hpack decoder supports unbounded numbers but returns 32-bit
results. This means that possible truncation may happen on some field
lengths or indexes that would be emitted as quantities that do not fit
in a 32-bit number. The final value will also depend on how the left
shift operation behaves on the target architecture (e.g. whether bits
are lost or used modulo 31). This could lead to a desynchronization of
the HPACK stream decoding compared to what an external observer would
see (e.g. from a network traffic capture). However, there isn't any
impact between streams, HPACK is performed at the connection level,
not at the stream level, so no stream may try to leverage this
limitation to have any effect on another one.

For the fix, instead of adding checks everywhere in the loop and for
the final stage, let's rewrite the decoder to compare the read value
to a max value that is shifted by 7 bits for every 7 bits read. This
allows a sender to continue to emit zeroes for higher bits without
being blocked, while detecting that a received value would overflow.
The loop is now simpler as it deals both with values with the higher
bit set and the final ones, and stops once the final value was recorded.

A test on non-zero before performing the shift was added to please
ubsan, though in practice zero shifted by any quantity remains zero.
But the test is cheap so that's OK.
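
The approach described above could look roughly like the sketch below. It is one possible reading of the description and not the actual HAProxy decoder: varint32_decode() is a hypothetical name, and the explicit addition check is an assumption added here so that the example is safe on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder, not HAProxy's actual function. Decodes an HPACK
 * integer with a <b>-bit prefix (RFC 7541, section 5.1) from <*raw>.
 * Returns 0 and advances <*raw>/<*len> on success, or -1 if the input is
 * truncated or the value does not fit in 32 bits.
 */
static int varint32_decode(const uint8_t **raw, size_t *len, unsigned b, uint32_t *out)
{
    const uint8_t *p = *raw;
    const uint8_t *end = p + *len;
    uint32_t max = UINT32_MAX; /* largest 7-bit group still accepted */
    uint32_t mask, ret, v;
    unsigned shift = 0;
    uint8_t byte;

    assert(b >= 1 && b <= 8);
    if (p == end)
        return -1;

    mask = (1u << b) - 1;
    ret = *p++ & mask;
    if (ret == mask) {
        do {
            if (p == end)
                return -1;
            byte = *p++;
            v = byte & 0x7f;
            if (v > max)            /* shifting would lose bits */
                return -1;
            if (v) {                /* non-zero test keeps the shift below 32 */
                v <<= shift;
                if (v > UINT32_MAX - ret)
                    return -1;
                ret += v;
            }
            max >>= 7;              /* 7 fewer bits left for the next group */
            if (max)
                shift += 7;
        } while (byte & 0x80);
    }

    *len -= (size_t)(p - *raw);
    *raw = p;
    *out = ret;
    return 0;
}
```

Once max reaches zero, only zero groups are accepted, so a sender may keep padding with zeroes while any value that would overflow is rejected. The RFC 7541 example bytes 0x1f 0x9a 0x0a with a 5-bit prefix decode to 1337 with this sketch.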

Thanks to Guillaume Meunier, Head of Vulnerability Operations Center
France at Orange Cyberdefense, for reporting this bug.

This should be backported to all stable versions.
2026-03-05 14:33:21 +01:00
Amaury Denoyelle
b1441c6440 MINOR: quic: use server cache for ALPN on BE side
On the backend side, QUIC MUX may be started preemptively before the
ALPN negotiation. This is useful notably for 0-RTT implementation.

However, this was a source of crashes. ALPN was expected to be retrieved
from the server cache, however QUIC MUX still used the ALPN from the
transport layer. This could cause a crash, especially when several
connections run in parallel as the server cache is shared among
threads.

Thanks to the previous patch which reworks QUIC MUX init, this issue
can now be fixed. Indeed, if conn_get_alpn() is not successful, MUX can
look at the server cache again to use the expected value.

Note that this could still prevent the MUX from working as expected if the
server cache is reset between connect_server() and MUX init. Thus,
the ultimate solution would be to copy the cached ALPN into the
connection. This problem is not specific to QUIC though, and must be
fixed in a separate patch.
2026-03-03 16:23:03 +01:00
Amaury Denoyelle
940e1820f6 MEDIUM: quic/mux-quic: adjust app-ops install
This patch reworks the installation of app-ops layer by QUIC MUX.
Previously, app_ops field was stored directly into the quic_conn
structure. Then the MUX reused it directly during its qmux_init().

This patch removes app_ops field from quic_conn and replaces it with a
copy of the negotiated ALPN. By using quic_alpn_to_app_ops(), it ensures
it remains compatible with a known application layer.

On the MUX layer, qcc_install_app_ops() now uses the standard
conn_get_alpn() to retrieve the ALPN from the transport layer. This is
done via the newly defined <get_alpn> QUIC xprt callback.

This new architecture should be cleaner as it better highlights the
responsibility of each layer in the ALPN/app negotiation.
2026-03-03 16:22:57 +01:00
Amaury Denoyelle
9c7cf1c684 MINOR: mux-quic: add function for ALPN to app-ops conversion
Extract the conversion from ALPN to qcc_app_ops type from quic_conn
source file into QUIC MUX. The newly created function is named
quic_alpn_to_app_ops(). This will serve as a central point to identify
which ALPNs are currently supported in our QUIC stack.
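
As an illustration of such a central conversion point, here is a minimal sketch. The structure and function below are hypothetical, reduced stand-ins (not qcc_app_ops or quic_alpn_to_app_ops() themselves), and the "h3" and "hq-interop" ALPN strings are assumed from the H3/hq-interop layers mentioned in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the application-layer callback table. */
struct app_ops {
    const char *name;
};

static const struct app_ops h3_ops = { .name = "h3" };
static const struct app_ops hq_interop_ops = { .name = "hq-interop" };

/* Maps an ALPN of <alpn_len> bytes (not necessarily NUL-terminated) to its
 * ops table, or returns NULL if the protocol is not supported.
 */
static const struct app_ops *alpn_to_app_ops(const char *alpn, size_t alpn_len)
{
    static const struct app_ops *const known[] = { &h3_ops, &hq_interop_ops };
    size_t i;

    assert(alpn || !alpn_len);
    for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
        if (strlen(known[i]->name) == alpn_len &&
            memcmp(known[i]->name, alpn, alpn_len) == 0)
            return known[i];
    }
    return NULL;
}
```

Keeping the list of known protocols in a single table means that supporting a new ALPN only requires one new entry there.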

This patch is purely a small refactoring. It will be useful for the next
one which reworks MUX app-ops layer init. The current cleanup notably
allows removing H3/hq-interop headers from quic_conn source file.
2026-03-03 16:20:16 +01:00
Amaury Denoyelle
4120faf289 MINOR: quic/h3: reorganize stream reject after MUX closure
The QUIC MUX layer is closed after its transport counterpart. It may
then be necessary to reject any new streams opened by the remote peer.
This operation however depends on the application protocol.

Previously, a function qc_h3_request_reject() was directly implemented
in quic_conn source file for use when HTTP/3 was previously negotiated.
However, this solution was not extensible and broke layering.

This patch introduces a new proper separation with a <strm_reject>
callback defined in quic_conn structure. When set, it will be used to
preemptively close any new stream. QUIC MUX is responsible for setting it
just before its closure.

No functional change. This patch is purely a refactoring with a better
architecture design. Especially, H3 specific code from transport layer
is now completely removed.
2026-03-03 16:19:13 +01:00
Amaury Denoyelle
58830990d0 MINOR: quic: use signed char type for ALPN manipulation
In most of haproxy code, ALPN is used as a signed char pointer. In QUIC
code instead, it is manipulated as unsigned.

Unify this by using the signed type in QUIC code. This allows removing a
bunch of unnecessary casts.
2026-03-03 16:11:58 +01:00
Amaury Denoyelle
f41e684e9a BUG/MINOR: hlua: fix return with push nil on proxy check
hlua_check_proxy() may now return NULL if the target proxy instance has
been flagged for deletion. Thus, proxy methods have been adjusted and
may push nil to report such a case.

This patch fixes these error paths. When nil is pushed, 1 must be
returned instead of 0. This represents the count of pushed values on the
stack which can be retrieved by the caller.

No need to backport.
2026-03-03 08:45:27 +01:00
Amaury Denoyelle
712055f2f8 MEDIUM: proxy: implement backend deletion
This patch finalizes "del backend" handler by implementing the proper
proxy deletion.

After ensuring backend deletion can be performed, several steps are
executed. First, any watcher elements are updated to point on the next
proxy instance. The backend is then removed from ID and name global
trees and is finally detached from proxies_list.

Once the backend instance is removed from proxies_list, the backend
cannot be found by new elements. Thread isolation is lifted and
proxy_drop() is called, which will purge the proxy if its refcount is
null. Thanks to recently introduced PROXIES_DEL_LOCK, proxy_drop() is
thread safe.
2026-03-02 14:14:05 +01:00
Amaury Denoyelle
6145f52d9c MINOR: proxy: use atomic ops for default proxy refcount
Default proxy refcount <def_ref> is used to count the references held on a
default proxy instance by standard proxies. Currently, this is necessary
when a default proxy defines TCP/HTTP rules or a tcpcheck ruleset.

Transform every access on <def_ref> so that atomic operations are now
used. Currently, this is not strictly needed as default proxies
references are only manipulated at init or deinit in single thread mode.
However, once dynamic backend deletion is implemented, <def_ref>
will also be decremented at runtime.
2026-03-02 14:14:05 +01:00
Amaury Denoyelle
f64aa036d8 MEDIUM: proxy: add lock for global accesses during default free
This patch is similar to the previous one, but this time it deals with
functions related to defaults proxies instances. Lock PROXIES_DEL_LOCK
is used to protect accesses on global collections.

This patch will be necessary to implement dynamic backend deletion, even
if defaults won't be used as the direct target of a "del backend" CLI command.
However, a backend may have a reference on a default instance. When the
backend is freed, this reference is released, which can in turn cause
the freeing of the default proxy instance. All of this will occur at
runtime, outside of thread isolation.
2026-03-02 14:10:46 +01:00
Amaury Denoyelle
f58b2698ce MEDIUM: proxy: add lock for global accesses during proxy free
Define a new lock with label PROXIES_DEL_LOCK. Its purpose is to protect
operations performed on global lists or trees while a proxy is freed.

Currently, this lock is unneeded as proxies are only freed on
single-thread init or deinit. However, with the upcoming dynamic backend
deletion, this operation will also be performed at runtime, outside of
thread isolation.
2026-03-02 14:09:25 +01:00
Amaury Denoyelle
a7d1c59a92 MINOR: proxy: add comment for defaults_px_ref/unref_all()
Write documentation for functions related to default proxies instances.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
98c8c5e16e MINOR: cli: implement wait on be-removable
Implement be-removable argument to CLI wait. This is implemented via
be_check_for_deletion() invocation, also used by "del backend" handler.

The objective is to test whether a backend instance can be removed. If
this is not the case, the command may return immediately if the target
proxy is incompatible with dynamic removal or if a user action is
required. Else, the command will wait until the temporary restriction is
lifted.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
5ddfbd4b03 MINOR: server: mark backend removal as forbidden if QUIC was used
Currently, quic_conn on the backend side may access their parent proxy
instance during their lifetime. In particular, this is the case for
counters update, with <prx_counters> field directly referencing a proxy
memory zone.

As such, this prevents safe backend removal. One solution would be to
check if the upper connection instance is still alive, as a proxy cannot
be removed if connections are still active. However, this would
completely prevent proxy counters update via
quic_conn_prx_cntrs_update(), as this is performed on quic_conn release.

Another solution would be to use the refcount, or a dedicated counter
which accounts for QUIC connections on a backend instance. However,
refcount is currently only used by short-term references, and it could
also have a negative impact on performance.

Thus, the simplest solution for now is to disable a backend removal if a
QUIC server is/was used in it. This is considered acceptable for now as
QUIC on the backend side is experimental.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
053887cc98 MINOR: proxy: prevent backend deletion if server still exists in it
Ensure a backend instance cannot be removed if there is still a server in
it. This is checked via be_check_for_deletion() to ensure "del backend"
cannot be executed. The only solution is to use "del server" to remove
the server instances.

This check only covers servers not yet targeted via "del server". For
deleted servers not yet purged (due to their refcount), the proxy
refcount is incremented but this does not block "del backend"
invocation.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
7f725f0754 MINOR: proxy: prevent deletion of backend referenced by config elements
Define a new proxy flag PR_FL_NON_PURGEABLE. This is used to mark every
proxy instance explicitly referenced in the config. Such instances
cannot be deleted at runtime.

Static use_backend/default_backend rules are handled in
proxy_finalize(). Also, sample expression proxy references are protected
via smp_resolve_args().

Note that this last case also incidentally protects any proxies
referenced via a CLI "set var" expression. This should not be the case
as in this case variable value is instantly resolved so the proxy
reference is not needed anymore. This also affects dynamic servers.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
7bf3020952 MINOR: proxy: prevent backend removal when unsupported
Prevent removal of a backend which relies on features not compatible
with dynamic backends. This is the case if either dispatch or
transparent option is used, or if a stick-table is declared.

These limitations are similar to the "add backend" ones.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
ad1e00b2ac MINOR: lua: handle proxy refcount
Implement proxy refcount for Lua proxy class. This is similar to the
server class.

In summary, proxy_take() is used to increment refcount when a Lua proxy
is instantiated. proxy_drop() is called via Lua garbage collector. To
ensure a deleted backend is released asap, hlua_check_proxy() now
returns NULL if PR_FL_DELETED is set.

This approach is directly dependent on Lua GC execution. As such, it
probably suffers from the same limitations as the ones already described
in the previous commit. With the current patch, "del backend" is not
directly impacted though. However, the final proxy deinit may happen
after a long period of time, which could cause memory pressure increase.

One final observation regarding deinit: it is necessary to delay a
BUG_ON() which checks that defaults proxies list is empty. Now this must
be executed after Lua deinit (called via post_deinit_list). This should
guarantee that all proxies and their defaults refcount are null.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
f521c2ce2d MINOR: server: take proxy refcount when deleting a server
When a server is deleted via "del server", increment refcount of its
parent backend. This is necessary as the server is not referenced
anymore in the backend, but can still access it via its own <proxy>
member. Thus, backend removal must not happen until the complete purge
of the server.

The proxy refcount is released in srv_drop() if the flag SRV_F_DELETED
is set, which indicates that "del server" was used. This operation is
performed after the complete release of the server instance to ensure no
access will be performed on the proxy via itself. The refcount must not
be decremented if a server is freed without "del server" invocation.

Another solution could be for servers to always increment the refcount.
However, for now in haproxy refcount usage is limited, so the current
approach is preferred. It should also ensure that if the refcount is
still incremented, it may indicate that some servers are not completely
purged themselves.

Note that this patch may cause issues if "del backend" is used in
parallel with Lua scripts referencing servers. Currently, any server
referenced by Lua must be released by its garbage collector to ensure it
can be finally freed. However, it appears that in some cases the GC does
not run for several minutes. At least this has been observed with Lua
version 5.4.8. In the end, this will result in indefinite blocking of
"del backend" commands.
2026-03-02 14:08:30 +01:00
Amaury Denoyelle
ee1f0527c6 MINOR: proxy: rename default refcount to avoid confusion
Rename proxy conf <refcount> to <def_ref>. This field only serves for
defaults proxy instances. The objective is to avoid confusion with the
newly introduced <refcount> field used for dynamic backends.

As an optimization, it could be possible to remove <def_ref> and only
use <refcount> also for defaults proxies usage. However for now the
simplest solution is implemented.

This patch does not bring any functional change.
2026-03-02 14:07:40 +01:00
Amaury Denoyelle
f3127df74d MINOR: proxy: add refcount to proxies
Implement refcount notion into proxy structure. The objective is to be
able to increment refcount on proxy to prevent its deletion temporarily.
This is similar to the server refcount : "del backend" is not blocked
and will remove the targeted instance from the global proxies_list.
However, the final free operation is delayed until the refcount is null.

As stated above, the API is similar to servers. Proxies are initialized
with a refcount of 1. Refcount can be incremented via proxy_take(). When
no longer useful, refcount is decremented via proxy_drop() which
replaces the older free_proxy(). Deinit is only performed once refcount
is null.

This commit also defines flag PR_FL_DELETED. It is set when a proxy
instance has been removed via a "del backend" CLI command. This should
serve as indication to modules which may still have a refcount on the
target proxy so that they can release it as soon as possible.

Note that this new refcount is completely ignored for a default proxy
instance. For them, proxy_take() is pure noop. Free is immediately
performed on first proxy_drop() invocation.
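
The take/drop lifecycle described above can be sketched as follows. This is a simplified model using C11 atomics with hypothetical names (struct px, px_new(), px_take(), px_drop(), PX_FL_DELETED, px_freed); it does not reproduce the real proxy structure and leaves out the special case of default proxies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define PX_FL_DELETED 0x1   /* hypothetical flag: removed via the CLI */

/* Hypothetical, reduced stand-in for a proxy. */
struct px {
    atomic_uint refcount;
    unsigned flags;
};

static int px_freed;        /* demo-only counter of completed frees */

/* Objects start with a refcount of 1, owned by the global list. */
static struct px *px_new(void)
{
    struct px *px = calloc(1, sizeof(*px));

    if (px)
        atomic_init(&px->refcount, 1);
    return px;
}

/* Takes an extra reference to delay the final free. */
static void px_take(struct px *px)
{
    assert(px);
    atomic_fetch_add(&px->refcount, 1);
}

/* Releases one reference; the last holder performs the free. */
static void px_drop(struct px *px)
{
    if (!px)
        return;
    if (atomic_fetch_sub(&px->refcount, 1) == 1) {
        free(px);
        px_freed++;
    }
}
```

In this model, a deletion request sets the flag and drops the initial reference, and any module still holding a reference sees the flag and drops its own as soon as it can, at which point the memory is released.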
2026-03-02 10:44:59 +01:00
Amaury Denoyelle
ebbdfc5915 MINOR: lua: use watcher for proxies iterator
Ensures proxies iteration via lua functions is safe via a new watcher
member. The principle is similar to the one already used for servers
iteration.
2026-03-02 10:36:21 +01:00
Amaury Denoyelle
20376c54e2 MINOR: stats: protect proxy iteration via watcher
Define a new <px_watch> watcher member in stats applet context. It is
used to register the applet on a proxy when iterating over the proxies
list. <obj1> is automatically updated via the watcher interaction.
Watcher is first initialized prior to stats_dump_proxies() invocation.

This guarantees that stats dump is safe even if applet yields and a
backend is removed in parallel.
2026-02-27 10:28:24 +01:00
Amaury Denoyelle
4bcfc09acf MINOR: proxy: define proxy watcher member
Define a new member watcher_list in proxy. It will be used to register
modules which iterate over the proxies list. This will ensure that the
operation is safe even if a backend is removed in parallel.
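
To illustrate the idea, here is a minimal sketch of a watcher list on a singly linked list. All names (struct node, struct watcher, watcher_attach(), watcher_next(), node_remove()) are hypothetical and much simpler than the real API; the only point shown is that removing an element moves its registered watchers to the next one.

```c
#include <assert.h>
#include <stddef.h>

struct node;

/* Hypothetical iterator handle registered on the element it points to. */
struct watcher {
    struct node *cur;       /* watched element, NULL once past the end */
    struct watcher *next;   /* next watcher registered on the same element */
};

struct node {
    int id;
    struct node *next;
    struct watcher *watchers;
};

/* Registers <w> on <n> (which may be NULL to mark the end of the list). */
static void watcher_attach(struct watcher *w, struct node *n)
{
    assert(w);
    w->cur = n;
    w->next = NULL;
    if (n) {
        w->next = n->watchers;
        n->watchers = w;
    }
}

/* Unregisters <w> from the element it currently watches, if any. */
static void watcher_detach(struct watcher *w)
{
    struct watcher **pp;

    if (!w->cur)
        return;
    for (pp = &w->cur->watchers; *pp; pp = &(*pp)->next) {
        if (*pp == w) {
            *pp = w->next;
            break;
        }
    }
    w->cur = NULL;
    w->next = NULL;
}

/* Moves <w> to the following element and returns it. */
static struct node *watcher_next(struct watcher *w)
{
    struct node *n = w->cur ? w->cur->next : NULL;

    watcher_detach(w);
    watcher_attach(w, n);
    return n;
}

/* Unlinks <n> from the list and hands its watchers over to its successor. */
static void node_remove(struct node **head, struct node *n)
{
    struct node **pp;
    struct watcher *w, *wnext;

    for (pp = head; *pp && *pp != n; pp = &(*pp)->next)
        ;
    if (!*pp)
        return;
    *pp = n->next;

    for (w = n->watchers; w; w = wnext) {
        wnext = w->next;
        watcher_attach(w, n->next);
    }
    n->watchers = NULL;
    n->next = NULL;
}
```

An iterator that yields while registered on an element therefore resumes on a valid element even if the one it was watching has been removed in the meantime.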
2026-02-27 10:28:24 +01:00
Amaury Denoyelle
08623228a1 MINOR: proxy: define a basic "del backend" CLI
Add "del backend" handler which is restricted to admin level. Along with
it, a new function be_check_for_deletion() is used to test if the
backend is removable.
2026-02-27 10:28:24 +01:00
Amaury Denoyelle
78549c66c5 BUG/MINOR: proxy: add dynamic backend into ID tree
Add missing proxy_index_id() call in "add backend" handler. This step is
responsible for storing the newly created proxy instance in the
used_proxy_id global tree.

No need to backport.
2026-02-26 18:24:36 +01:00
Willy Tarreau
9db62d408a BUG/MINOR: call EXTRA_COUNTERS_FREE() before srv_free_params() in srv_drop()
As seen with the last changes to counters allocation, the move of the
counters storage to the thread group as operated in commit 04a9f86a85
("MEDIUM: counters: add a dedicated storage for extra_counters in various
structs") causes some random errors when using ASAN, because the extra
counters are freed in srv_drop() after calling srv_free_params(), which
is responsible for freeing the per-thread group storage.

For the proxies however it's OK because free calls are made before the
call to deinit_proxy() which frees the per_tgrp area.

No backport is needed, this is purely 3.4-dev.
2026-02-26 17:24:59 +01:00
Willy Tarreau
9019a5db93 MEDIUM: counters: return aggregate extra counters in ->fill_stats()
Now thanks to new macro EXTRA_COUNTERS_AGGR() we can iterate over all
thread groups storages when returning the data for a given metric. This
remains convenient and mostly transparent. The caller continues to pass
the pointer to the metric in the first group, and offsets are calculated
for all other groups and data summed. For now all groups except the
first one contain only zeroes but reported values are nevertheless
correct.
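
The aggregation principle can be sketched as below. This is not the EXTRA_COUNTERS_AGGR() macro itself: metric_aggr(), struct demo_counters and the parameter names are hypothetical, and the storage is assumed to be a set of identically laid out areas separated by a fixed byte step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread-group counters area. */
struct demo_counters {
    uint64_t reqs;
    uint64_t errs;
};

/* Sums one metric over all thread groups. <first> points to the metric in
 * the first group's area and <tgrp_step> is the distance in bytes between
 * two consecutive areas. A zero step means the storage is shared, so only
 * one area exists.
 */
static uint64_t metric_aggr(const uint64_t *first, size_t tgrp_step, unsigned nb_tgrp)
{
    const char *p = (const char *)first;
    uint64_t sum = 0;
    unsigned tg;

    assert(first && nb_tgrp >= 1);
    if (!tgrp_step)
        nb_tgrp = 1;
    for (tg = 0; tg < nb_tgrp; tg++)
        sum += *(const uint64_t *)(p + tg * tgrp_step);
    return sum;
}
```

The caller keeps passing a pointer into the first group's area, for example metric_aggr(&areas[0].errs, sizeof(areas[0]), 3) for an array of three areas, and the other groups are reached purely by offset.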
2026-02-26 17:03:53 +01:00
Willy Tarreau
de0eddf512 MINOR: counters: add EXTRA_COUNTERS_BASE() to retrieve extra_counters base storage
The goal is to always retrieve the storage address of the first thread
group for the given module. This will be used to iterate over all thread
groups. For now it returns the same value as EXTRA_COUNTERS_GET().
2026-02-26 17:03:53 +01:00
Willy Tarreau
a60e1fcf7f MEDIUM: counters: store the number of thread groups accessing extra_counters
In order to be able to properly allocate all storage and retrieve data
from there, we'll need to know how many thread groups are supposed to
access it. Let's store the number of thread groups at init time. If the
tgrp_step is zero, there's always only one tg though.

Now EXTRA_COUNTERS_ALLOC() takes this number of thread groups in argument
and stores it in the structure. It also allocates as many areas as needed,
incrementing the datap pointer by the step for each of them.

EXTRA_COUNTERS_FREE() uses this info to free all allocated areas.

EXTRA_COUNTERS_INIT() initializes all allocated areas, this is used
elsewhere to clear/preset counters, e.g. in proxy_stats_clear_counters().
It involves a memcpy() call for each array, which is normally preset to
something empty but might also be used to preset certain non-scalar
fields such as an instance name.
2026-02-26 17:03:53 +01:00
Willy Tarreau
7ac47910a2 MINOR: counters: store a tgroup step for extra_counters to access multiple tgroups
We'll need to permit any user to update its own tgroup's extra counters
instead of the global ones. For this we now store the per-tgroup step
between two consecutive data storages, for when they're stored in a
tgroup array. When shared (e.g. resolvers or listeners), we just store
zero to indicate that it doesn't scale with tgroups. For now only the
registration was handled, it's not used yet.
2026-02-26 17:03:53 +01:00
Willy Tarreau
04a9f86a85 MEDIUM: counters: add a dedicated storage for extra_counters in various structs
Servers, proxies, listeners and resolvers all use extra_counters. We'll
need to move the storage to per-tgroup for those where it matters. Now
we're relying on an external storage, and the data member of the struct
was replaced with a pointer to that pointer to data called datap. When
the counters are registered, these datap are set to point to relevant
locations. In the case of proxies and servers, it points to the first
tgrp's storage. For listeners and resolvers, it points to a local
storage. The rationale here is that listeners are limited to a single
group anyway, and that resolvers have a low enough load so that we do
not care about contention there.

Nothing should change for the user at this point.
2026-02-26 17:03:47 +01:00
Willy Tarreau
8dd22a62a4 CLEANUP: counters: only retrieve zeroes for unallocated extra_counters
Since version 2.4 with commit 7f8f6cb926 ("BUG/MEDIUM: stats: prevent
crash if counters not alloc with dummy one") we can afford to always
update extra_counters because we know they're always either allocated
or linked to a dedicated trash. However, the ->fill_stats() callbacks
continue to access such values, making it technically possible to
retrieve random counters from this trash, which is not really clean.
Let's implement an explicit test in the ->fill_stats() functions to
only return 0 for the metric when not allocated like this. It's much
cleaner because it guarantees that we're returning an empty counter
in this case rather than random values.

The situation currently happens for dummy servers like the ones used
in Lua proxies as well as those used by rings (e.g. used for logging
or traces). Normally, none of the objects retrieved via stats or
Prometheus is concerned by this unallocated extra_counters situation,
so this is more about a cleanup than a real fix.
2026-02-26 08:24:03 +01:00
Willy Tarreau
95a9f472d2 MEDIUM: counters: change the fill_stats() API to pass the module and extra_counters
We'll soon need to iterate over thread groups in the fill_stats() functions,
so let's first pass the extra_counters and stats_module pointers to the
fill_stats functions. They now call EXTRA_COUNTERS_GET() themselves with
these elements in order to retrieve the required pointer. Nothing else
changed, and it's getting even a bit more transparent for callers.

This doesn't change anything visible however.
2026-02-26 08:24:03 +01:00
Willy Tarreau
56fc12d6fa CLEANUP: stats: drop stats.h / stats-t.h where not needed
A number of C files include stats.h or stats-t.h, many of which were
just to access the counters. Now those which really need counters rely
on counters.h or counters-t.h, which already reduces the amount of
preprocessed code to be built (~3000 lines or about 0.05%).
2026-02-26 08:24:03 +01:00
Willy Tarreau
9910af6117 CLEANUP: quic-stats: include counters from quic_stats
There's something a bit awkward in the way stats counters are inherited
through the QUIC modules: quic_conn-t includes quic_stats-t.h, which
declares quic_stats_module as extern from a type that's not known from
this file. And anyway externs should not be exported from type definitions
since they're not part of the ABI itself.

This commit moves the declaration to quic_stats.h which now takes care
to include stats-t.h to get the definition of struct stats_module. The
few users who used to learn it through quic_conn-t.h now include it
explicitly. As a bonus this reduces the number of preprocessed lines
by 5000 (~0.1%).

By the way, it looks like struct stats_module could benefit from being
moved off stats-t.h since it's only used at places where the rest of
the stats is not needed. Maybe something to consider for a future
cleanup.
2026-02-26 08:24:03 +01:00
Willy Tarreau
fb5e280e0d CLEANUP: tree-wide: drop a few useless null-checks before free()
We only support platforms where free(NULL) is a NOP so that
null checks are useless before free(). Let's drop them to keep
the code clean. There were a few in cfgparse-global, flt_trace,
ssl_sock and stats.
2026-02-26 08:24:03 +01:00
Willy Tarreau
709c3be845 BUG/MINOR: server: adjust initialization order for dynamic servers
It appears that in cli_parse_add_server(), we're calling srv_alloc_lb()
and stats_allocate_proxy_counters_internal() before srv_preinit() which
allocates the thread groups. LB algos can make use of the per_tgrp part
which is initialized by srv_preinit(). Fortunately for now no algo uses
both tgrp and ->server_init() so this explains why this remained
unnoticed to date. Also, extra counters will soon require per_tgrp to
already be initialized. So let's move these between srv_preinit() and
srv_postinit(). It's possible that other parts will have to be moved
in between.

This could be backported to recent versions for the sake of safety but
it looks like the current code cannot tell the difference.
2026-02-26 08:24:03 +01:00
Willy Tarreau
44932b6c41 BUG/MEDIUM: mux-h2: make sure to always report pending errors to the stream
Some stream parsing errors that do not affect the connection result in
the parsed block not being transferred from the rx buffer to the channel
and not being reported upstream in rcv_buf(), causing the stconn to time
out. Let's detect this condition, and propagate term flags anyway since
no more progress will be made otherwise.

This should be backported at least till 3.2, probably even 2.8.
2026-02-26 00:30:42 +01:00
Willy Tarreau
e67e36c9eb MINOR: mux-h2: add a new setting, "tune.h2.log-errors" to tweak error logging
The H2 mux currently logs whenever some decoding fails. Most of the errors
happen at the connection level, but some are even at the stream level,
meaning that multiple logs can be emitted for a given connection, which
can quickly use some resources for little value. This new setting allows
tweaking this and deciding to only log errors that affect the connection,
or even none at all.

This should be backported at least as far as 3.2.
2026-02-25 22:43:40 +01:00
Willy Tarreau
cad6e0b3da MINOR: mux-h2: also count glitches on invalid trailers
Two cases were not causing glitches to be incremented:
  - invalid trailers
  - trailers on closed streams

This patch addresses this. It could be backported, at least to 3.2.
2026-02-25 22:03:16 +01:00
Frederic Lecaille
5af42fa342 CLEANUP: ssl: remove outdated comments
ssl_sock_srv_try_reuse_sess() was modified by this commit to no longer
fail (it now returns void), but the related comments remained:

  BUG/MINOR: quic: missing app ops init during backend 0-RTT sessions

This patch cleans them up.
2026-02-25 11:25:05 +01:00
Frederic Lecaille
89c75b0777 BUG/MINOR: quic: missing app ops init during backend 0-RTT sessions
The QUIC mux requires "application operations" (app ops), which are a list
of callbacks associated with the application level (i.e., h3, h0.9) and
derived from the ALPN. For 0-RTT, when the session cache cannot be reused
before activation, the current code fails to reach the initialization of
these app ops, causing the mux to crash during its initialization.

To fix this, this patch restores the behavior of
ssl_sock_srv_try_reuse_sess(), whose purpose was to reuse sessions stored
in the session cache regardless of whether 0-RTT was enabled, prior to
this commit:

  MEDIUM: quic-be: modify ssl_sock_srv_try_reuse_sess() to reuse backend
  sessions (0-RTT)

With this patch, this function now does only one thing: attempt to reuse a
session, and that's it!

This patch allows ignoring whether a session was successfully reused from
the cache or not. This directly fixes the issue where app ops
initialization was skipped upon a session cache reuse failure. From a
functional standpoint, starting a mux without reusing the session cache
has no negative impact; the mux will start, but with no early data to
send.

Finally, there is the case where the ALPN is reset when the backend is
stopped. It is critical to continue locking read access to the ALPN to
secure shared access, which this patch does. It is indeed possible for the
server to be stopped between the call to connect_server() and
quic_reuse_srv_params(). But this cannot prevent the mux from starting
without app ops. This is why a 'TODO' section was added, as a reminder that a
race condition regarding the ALPN reset still needs to be fixed.

Must be backported to 3.3.
2026-02-25 11:13:52 +01:00
Olivier Houchard
84837b6e70 BUG/MEDIUM: cpu-topo: Distribute CPUs fairly across groups
Make sure CPUs are distributed fairly across groups, in case the number
of groups to generate is not a divisor of the number of CPUs, otherwise
we may end up with a few groups that will have no CPU bound to them.
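
One common way to spread items fairly, shown here only as an illustration and not as the algorithm used by the patch, is to give every group the integer quotient and hand the remainder out one CPU at a time to the first groups. The function name cpus_in_group() is hypothetical.

```c
#include <assert.h>

/* Number of CPUs assigned to group <grp> (0-based) when <nb_cpus> CPUs are
 * spread over <nb_grps> groups: every group gets the quotient, and the
 * first (nb_cpus % nb_grps) groups get one extra CPU.
 */
static unsigned cpus_in_group(unsigned nb_cpus, unsigned nb_grps, unsigned grp)
{
    assert(nb_grps > 0 && grp < nb_grps);
    return nb_cpus / nb_grps + (grp < nb_cpus % nb_grps ? 1 : 0);
}
```

With 9 CPUs and 4 groups this gives 3, 2, 2 and 2 CPUs, so group sizes never differ by more than one and no group is left empty as long as there are at least as many CPUs as groups.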

This was introduced in 3.4-dev2 with commit 56fd0c1a5c ("MEDIUM: cpu-topo:
Add an optional directive for per-group affinity"). No backport is
needed unless this commit is backported.
2026-02-24 08:17:16 +01:00
Frederic Lecaille
ca5332a9c3 BUG/MINOR: haterm: cannot reset default "haterm" mode
When "mode haterm" was set in a "defaults" section, it could not be
overridden in subsequent sections using the "mode" keyword. This is because
the proxy stream instantiation callback was not being reset to the
default stream_new() value.

This could break the stats URI with a configuration such as:

    defaults
        mode haterm
        # ...

    frontend stats
        bind :8181
        mode http
        stats uri /

This patch ensures the ->stream_new_from_sc() proxy callback is reset
to stream_new() when the "mode" keyword is parsed for any mode other
than "haterm".

No need to backport.
2026-02-23 17:57:19 +01:00
Maxime Henrion
a9dc8e2587 MINOR: quic: add a new metric for ncbuf failures
This counts the number of times we failed to add data to the ncbuf
buffer because of the gap size limit.
2026-02-23 17:47:45 +01:00