When both USE_LINUX_SPLICE and USE_KTLS are enabled, it's worth
enabling kTLS on the bind line as it significantly increases the
local bit rate as well as through TLS accelerators (up to x2/x3).
The -dT option remains available to disable it. It was verified to
gracefully downgrade when not supported (e.g. OpenSSL 3.0.1 does
this).
When built without USE_OPENSSL, the binding errors are dirty, speaking
about crt-store and stuff like this. Better just indicate that SSL
support was not built in and explain how to enable it.
Commit 92581043fb ("MINOR: haterm: add long options for QUIC and TCP
"bind" settings") added --tcp-bind-opts. The doc (and commit) says that
it applies to TCP bind lines but it only applied to the TCP/SSL ones,
not the clear ones. Let's fix it. No backport needed, this is only 3.4.
When building without QUIC support (e.g. an SSL library not supporting
it), we'll get errors when trying to bind to the SSL port that QUIC is
not supported because the quic binding was unconditional. Let's only
place it when QUIC is supported. No backport needed, this is only 3.4.
The use of the unshare() mechanism to get the ability to chroot as an
unprivileged user produced a warning on some configurations where the
haproxy process has the CAP_SYS_CHROOT capability. We now only attempt
to use it when a previous chroot() call failed because of insufficient
privileges.
This should fix GitHub issue #3395. No backport needed.
The recent update to the makefile in commit bfbca23dc2 ("BUILD: makefile:
search for Lua 5.5 as well") to enable searching for Lua 5.5 revealed a
problem by which we were using the fallback versions before the main one
(e.g. /usr/include/lua-5.4/lua.h before /usr/include/lua.h). However, the
libs often contain the version in their name so that we can end up linking
with 5.5 while 5.4 was used in the include.
This was detected only when enabling lua 5.5 because in Lua 5.4
"luaL_openlibs()" was a symbol and became an inline in 5.5, preventing
from using a mix of the two versions.
The current change is minimal in that it skips all fallbacks when lua.h
is present in /usr/include, and includes it in the test to make sure that
the directory found contains valid C. LUA_LIB checks for lua before the
variants so as to remain consistent with the system provided version.
Thanks to @gene-git for reporting this problem in GH issue #3404.
This may have to be backported after a period of observation if users
face build issues for older releases on newer distros. In this case,
backporting 1c0f781994 ("MINOR: hlua: Add support for lua 5.5") would
equally be needed. However this will result in the system's version
being used first, which may or may not be desired.
The ACT_OPT_FINAL flag was not properly handled in the pause action. When
this flag is set, because of an abort or an unexpected error, an action must
no longer yield. However, in the pause action, this flag was never tested.
In case of client abort for instance, this could trigger an internal error
instead of a client error.
This patch should fix the issue #3403. It must be backported as far as 3.2.
The default sni-auto that aims at not upsetting certain servers doing
excessive checks of SNI vs host has some drawbacks (lower reuse ratio)
that are particularly hard to diagnose, so let's explain how connections
are reused/purged when dealing with many hosts, and how to cheat as well.
Let's also mention the expression used by "sni-auto" since it was only
mentioned in the code.
Three functions are provided here:
fd_dump: lists all FDs
fd_dump_conn: lists all FDs holding a connection
fd_dump_listener: lists all FDs holding a listener
They take no argument, and dump some of the known info. E.g. for
a connection, ctrl, xprt, flags, mux, sessions, frontend's name
and session's age are reported. Example:
(gdb) fd_dump_conn
fd 31: rm=0 tm=0x2 um=0 st=0x21 refc=0x1 tkov=0 gen=0 conn=0x7fffe803b600: flg=0x300 err=0 ctrl=0xdf51c0 xprt=0xdf5c80 mux=0xbaeee0 sess=0x7ffff003b570: fe=0x1e45b00 id=foo age=0ms
They are particularly slow because they iterate over all possible FDs,
so better limit them to the desired types.
The thread_dump function dumps the list of known threads and a few info
on them (pointer, current run queue, flags etc). This should help more
easily spot a particular one and find stuck ones.
E.g:
(gdb) thread_dump
Tid 0: pth=0x7ffff7e797c0 mono=2222322327950732 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 1: pth=0x7ffff78d8640 mono=2222322327928085 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 2: pth=0x7ffff6b7e640 mono=2222322327927150 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 3: pth=0x7ffff637d640 mono=2222322327924878 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 4: pth=0x7ffff5b7c640 mono=2222322327925676 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 5: pth=0x7ffff537b640 mono=2222322327929524 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 6: pth=0x7ffff4b7a640 mono=2222322327926817 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 7: pth=0x7fffdffff640 mono=2222322327947960 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
New functions task_dump_wq and task_dump_rq can be used to dump tasks
in a wait queue or in a run queue respectively. For the wait queue (the
most common usage), one needs to pass either the thread-local's timers,
or the thread group ones for shared tasks:
task_dump_wq &ha_tgroup_ctx[0].timers
task_dump_wq &ha_thread_ctx[0].timers
For the run queue, task_dump_rq will take the thread's rqueue:
task_dump_rq &ha_thread_ctx[0].rqueue
The output is the task pointer and a dump of the task* struct per line,
then a total count at the end.
The ebtree descent functions currently use $arg0 as is and it's up to
the user to manually type the required casts that are never obvious
(particularly when coming from a pointer). Let's put the eb_root* cast
in the function to be more user-friendly.
Support for Lua 5.5 was brought in 3.4-dev2 with commit 1c0f781994
("MINOR: hlua: Add support for lua 5.5") but the Makefile doesn't look
for it, which can be quite confusing on recent distros which start to
ship with it. Let's add it to the looked up names.
In __task_wakeup(), for a niced task, we don't always want to increase
the niced_task counter of the running thread's thread group, if we are
waking up the task of another thread, who belongs to a different thread
group, then we want to increment that thread group's counter instead, as
that's the one that will get decremented later.
So just increase the counter for the target thread'd thread group,
instead of using tg_ctx.
The impact is probably pretty minor, niced task shared amongst thread
are not very common, and the impact would mostly mean we'd run more/less
tasks in one run of process_runnable_tasks() than expected.
This should be backported as far as 2.8.
When we fail to allocate a new tree element, we're still holding the
write lock, so we should do an write unlock, not a read unlock, or the
lock will get corrupted and most likely this will end in a deadlock.
This should be backported up to 3.2.
In spop_get_varint(), -1 is returned if there is not enough data in the
buffer to decode the variable integer. However a strict comparison agasint
b_data() was performed, which is wrong. A failure must be reported if the
index is greater or equal to b_data().
This patch must be backported as far as 3.2.
After sending HTX data to an applet, htx_to_buf() must be called on the
applet buffer to commit changes (and possibly to reset the buffer if it is
empty). This was performed on the output buffer while it should in fact be
performed on the input buffer. So let's fix it.
This patch must be backported as far as 3.0.
When the check for server hash was introduced to make sure we're using
the right alpn, the logic to store the new alpn was flawed. We should
always check that the new alpn length is small enough to fit in the
buffer, no matter if the server hash is not the same or not. So always
check the length first, and only check if the alpn or the server changed
after.
This should be backported whenever commit
de3f245df0 has been backported.
This adds "-dA[file]" on the command line, which dumps an archive of all
dependencies detected at runtime into the designated file in tar format.
This is equivalent to "set-dumpable libs", but instead of keeping the libs
in memory, it dumps them into a file. This may be used after a core dump,
in order to provide all necessary libraries to developers to permit them
to exploit the core. This may not be available on all operating systems.
When shared libs were loaded via "set-dumpable libs", better release
them upon deinit, it will make valgrind happier. For this we now have
a new function free_collected_libs() in tools.c and call it in deinit().
In htx_xfer() function, when headers are partially copied, depending on the
flags, a rollback may be performed to remove all copied headers from the
destination message. However, there was an issue in the loop performing the
rollback. Instead of decrementing the returned value using the size of the
HTX block from the destination message, the one from the source message was
used. So the wrong value was be returned and in worst case, it could
overflow.
In addition, the BUG_ON() in the loop was removed because test condition was
wrong.
It is a 3.4-specific issue. No backport needed.
When message headers are formatted, the connection and upgrade header values
are parsed to be sanitized and to fill H1M flags. The values are modified in
place without changing the HTX message information accordingly (the block
info and the HTX info). It could be an issue if the output buffer is full
and the header cannot be formatted. Because the formatting can be stopped
with a HTX message in hazardous state.
It should be quite difficult to trigger this issue. But now, a copy of the
value is performed before parsing it. So only the copy will be altered,
leaving the HTX message in a safe state.
This patch must be backported to all stable versions.
During maxage parsing, the size of the value was not properly computed when
it was copied into the trash chunk. The name (max-age or s-maxage) must be
skipped with the '=' character. But instead of doing a subtraction, and
addition was performed, adding 2 extra bytes to the value used for the
convertion to integer.
In addition, the "chunk_memcat(chk, "", 1)" operation to add a trailing
NULL-byte was replaced by "*(b_tail(chk)) = '\0'". It a bit easier to
understand.
This patch should be backported to all stable versions.
MAX_PUSH_ID frames are emitted by the client only on the control stream.
These conditions are checked via h3_check_frame_valid() since the
following patch.
e4a5a64198
BUG/MINOR: h3: reject server MAX_PUSH_ID frame
However control stream test was inverted by mistake. This patch fixes
it.
Due to this bug, H3 connections were improperly closed on error by
haproxy for clients which send MAX_PUSH_ID frames. This has been
detected on the QUIC interop with aioquic and neqo clients.
This must be backported up to 3.3.
In qcc_qmux_recv(), when calling qmux_parse_frm(), also treat negative
values as an error, and close the connexion. qmux_parse_frm() will
return -1 if the frame is of an invalid type, and we don't want to
process any further, or we will crash.
Move the security contact out of intro.txt into a dedicated, easily
searchable doc/security.txt that points reporters at the threat model
first, and reference it from intro.txt's contacts section and the
documentation index.
Add doc/internals/threat-model.txt describing what does and does not
qualify as a security vulnerability in HAProxy so that reporters and
developers have a common understanding of the threat model, and make it
clear that anything non-critical should be handled in the open and
not hidden behind embargoes.
The document lists assets to protect, what constitutes an attack, what
are the mitigations in place, and the severity ordering of various
risks. This may in the long term also help developers make better
choices of default settings and option names, and may also justify
changing default settings over time when modern operating systems
bring new possibilities.
A section also lists some invariants and defaults in an attempt to
limit the risk of reporting theoretical issues that are technically
impossible to happen in the field.
This is an initial version meant to be refined as cases arise. It
was incrementally designed and cross-checked with the help of three
independent LLMs (Qwen, Gemini and Claude) until each correctly
classified a set of sample reports against it. In the current state
they do not raise any residual ambiguities anymore.
After testing against a few LLMs, it appeared that several entries in
the core principles document were ambiguous or imprecise and could be
misread (size_t, pools, trash, dwcas, comparison, ncbuf). No more
complaint after this rewording so this will be sufficient for now.
Found via cppcheck --force --enable=all --output-file=haproxy.log :
admin/halog/halog.c:1805:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1806:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1809:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1810:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1814:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
Found via cppcheck --force --enable=all --output-file=haproxy.log :
src/ncbmbuf.c:192:9: warning: If memory allocation fails, then there is a possible null pointer dereference: area [nullPointerOutOfMemory]
src/ncbmbuf.c:373:9: warning: If memory allocation fails, then there is a possible null pointer dereference: data [nullPointerOutOfMemory]
src/ncbmbuf.c:546:9: warning: If memory allocation fails, then there is a possible null pointer dereference: data [nullPointerOutOfMemory]
Found via cppcheck --force --enable=all --output-file=haproxy.log :
addons/51degrees/51d.c:130:3: warning: If memory allocation fails, then
there is a possible null pointer dereference: name [nullPointerOutOfMemory]
addons/51degrees/51d.c:922:4: warning: If memory allocation fails, then
there is a possible null pointer dereference: _51d_property_list [nullPointerOutOfMemory]
We can't call call the prepare_srv() method too early, because it needs
global.nbthreads to be properly set, which won't be true at post_parse
time. So instead, make it so that code runs later, as a post_check
function, when it will be safe to do so.
This should be backported up to 2.8.
This should fix github issue #3402
Ever since the introduction of multiple cache trees, the "show cache"
CLI command was not properly showing the contents of each tree, but was
only showing the first one.
Fix that by properly resetting next_key when we switch to the next tree.
Should be backported up to 3.0.
In in46un_to_addr(), when copying a struct sockaddr_in6, copy the
sin6_flowinfo and sin6_scope_id, as they are part of the structure too.
They are unlikely to be of any use for us, but this is more correct
anyway.
Very similarly to what was fixed with commit
63f853957a, we cast a sockaddr_in46 in
quic_dgram_parse() to sockaddr_storage while providing source and
destination addresses to qc_handle_conn_migration(), which will then
copy the whole sockaddr_storage, thus reading memory past what was
provided.
While this most likely won't have any impact, let's do the right thing,
and use in46un_to_addr() to generate a real sockaddr_storage.
This does not need to be backported.
EXTRA_MAKE allows to source an external makefile to bring new options
that will result in including add-ons etc. It must be part of the
construction of .build_opts that decides whether or not existing .o
are reusable or need to be rebuilt, otherwise we can end up with a mix
of .o built with some options and others with different options.
No backport is needed, as this appeared in 3.4.
Some setups where the number of threads is forced without any binding
(no cpu-map), are quite suspicious if they result in less threads than
available CPUs, and not even predictably bound, so we want to notify
the user that this might be an oversight.
Similarly, when thread-groups is forced and not nbthread (and no cpu-map),
and the final number of threads is lower than the hard-limit or the number
of CPUs we also indicate the impact and how to remedy it. This can happen
for example when starting on a machine with more than 64 CPUs and
thread-groups forced to 1, or on more than 128 CPUs and thread-groups
forced to 2 (e.g. when moving an older config to a new platform).
It is possible that some of these conditions might need to be readjusted
in the future to catch other traps or to relax certain commonly used,
valid cases, so for now it is preferable not to backport this patch.
The cpu-policy directive is ignored when nbthreads, thread-groups, or
cpu-map are set. In addition, first-usable-node is ignored when the
process was externally restricted (e.g. taskset). This is difficult to
debug when it happens because multiple parameters come into the mix and
it's easy to forget to unset one. Let's emit a notice when this happens
and the policy was forced. This way, it remains silent with the default
policy, but if it was forced, the incompatibility is reported.
It's worth noting that ll the cpu-policy functions take a char **err
but none uses it. It could have been useful here instead of calling
ha_notice() all along, but one needs to determine who the consumers
are and who will be responsible for freeing the message, so let's go
with ha_notice() given that were were already some diag_warnings in
these functions.
It could be helpful to backport this to 3.2.
Since it's easy to get caught by some parameters being ignored, let's
detect when mtpg was explicitly set and report a notice if it is ignored
due to thread-groups being set. For this we need to avoid presetting
the value in the global section and only set it when entering function
thread_detect_count(), which is OK since the value cannot be used before.
As documented, max-threads-per-group is the default number of threads
to arrange in a group before creating another group, and is only meant
to be used when thread-groups is not set.
However it was always enforced, so configs like:
global
thready-groups 2
which were sufficient in 3.2 and above to start with 64-128 threads
are now suddenly limited to 32 threads! Let's relax the limit when
thread-groups is set!
No backport is needed since this is only 3.4.
When starting, say, 128 threads with max-threads-per-group set to 2
and MAX_TGROUPS set to the default 32, instead of setting the resulting
number of groups to 32 and threads to 64, they're set to 1 and 32
respectively because the condition to raise grp_min is not satisfied.
Let's cut the condition in two parts to also permit to raise it at
least to grp_max.
This should be backported to 3.2.
It was made from the split of the original one into the SSL and the QUIC
variant. However there's a catch: both use the same certificates which
includes the OCSP URL 127.0.0.1:12345, and both need to start a server
on that port. Depending on the number of parallel process and their
speed, they might very well work, or totally fail due to a binding
conflict and the fact that the test runs for a few seconds.
Let's disable the QUIC variant for now, since the whole point of the
test is to verify all the sequencing, the SSL one is greatly sufficient.
Maybe a better approach can be found later.
Commit 1c59c39171 deferred hlua_init() to be called lazily from the
config keyword handlers (lua-load, lua-load-per-thread,
lua-prepend-path, tune.lua.openlibs), with a call inside
hlua_post_init() as a safety net for the case where no Lua directive
appears in the configuration at all.
The problem is hlua_init() is a function that allocates internal
servers (socket_proxy, socket_tcp, socket_ssl) that must exist before
haproxy initialize the configuration. But hlua_post_init() is done too
far after this initialization, so the safety net does not work
correctly.
This would results in a crash in the deinit() if no lua
configuration was loaded in haproxy.
Core was generated by `./haproxy -W -f /dev/null'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005671c72b1047 in _ceb_first (root=0x30, kofs=16, key_type=CEB_KT_U64, key_len=0,
is_dup_ptr=0x7ffc13197a14) at include/import/cebtree-prv.h:1160
1160 if (!*root)
(gdb) bt
#0 0x00005671c72b1047 in _ceb_first (root=0x30, kofs=16, key_type=CEB_KT_U64, key_len=0,
is_dup_ptr=0x7ffc13197a14) at include/import/cebtree-prv.h:1160
#1 _ceb64_first (root=0x30, kofs=16) at src/_ceb_int.c:73
#2 ceb64_ofs_first (root=0x30, kofs=16) at src/_ceb_int.c:66
#3 0x00005671c6be5e6e in srv_close_idle_conns (srv=0x5671fd592a80) at src/server.c:7676
#4 0x00005671c6d3be17 in deinit_proxy (p=0x5671fd5d7780) at src/proxy.c:393
#5 0x00005671c6d3c536 in proxy_drop (p=0x5671fd5d7780) at src/proxy.c:479
#6 0x00005671c6aed998 in hlua_deinit () at src/hlua.c:14934
#7 0x00005671c6db2e41 in deinit () at src/haproxy.c:2846
#8 0x00005671c6db3d98 in deinit_and_exit (status=0) at src/haproxy.c:2966
#9 0x00005671c6db6111 in main (argc=4, argv=0x7ffc131983c8) at src/haproxy.c:3997
The fix is to do the initialization earlier, in a pre-check callback.
Thanks to Amaury for reporting this issue.
No backport needed.