We'll need to permit any user to update its own tgroup's extra counters
instead of the global ones. For this we now store the per-tgroup step
between two consecutive data storages, for when they're stored in a
tgroup array. When shared (e.g. resolvers or listeners), we just store
zero to indicate that it doesn't scale with tgroups. For now only the
registration was handled, it's not used yet.
Servers, proxies, listeners and resolvers all use extra_counters. We'll
need to move the storage to per-tgroup for those where it matters. Now
we're relying on an external storage, and the data member of the struct
was replaced with a pointer to that pointer to data called datap. When
the counters are registered, these datap are set to point to relevant
locations. In the case of proxies and servers, it points to the first
tgrp's storage. For listeners and resolvers, it points to a local
storage. The rationale here is that listeners are limited to a single
group anyway, and that resolvers have a low enough load so that we do
not care about contention there.
Nothing should change for the user at this point.
Since version 2.4 with commit 7f8f6cb926 ("BUG/MEDIUM: stats: prevent
crash if counters not alloc with dummy one") we can afford to always
update extra_counters because we know they're always either allocated
or linked to a dedicated trash. However, the ->fill_stats() callbacks
continue to access such values, making it technically possible to
retrieve random counters from this trash, which is not really clean.
Let's implement an explicit test in the ->fill_stats() functions to
only return 0 for the metric when not allocated like this. It's much
cleaner because it guarantees that we're returning an empty counter
in this case rather than random values.
The situation currently happens for dummy servers like the ones used
in Lua proxies as well as those used by rings (e.g. used for logging
or traces). Normally, none of the objects retrieved via stats or
Prometheus is concerned by this unallocated extra_counters situation,
so this is more about a cleanup than a real fix.
We'll soon need to iterate over thread groups in the fill_stats() functions,
so let's first pass the extra_counters and stats_module pointers to the
fill_stats functions. They now call EXTRA_COUNTERS_GET() themselves with
these elements in order to retrieve the required pointer. Nothing else
changed, and it's getting even a bit more transparent for callers.
This doesn't change anything visible however.
A number of C files include stats.h or stats-t.h, many of which were
just to access the counters. Now those which really need counters rely
on counters.h or counters-t.h, which already reduces the amount of
preprocessed code to be built (~3000 lines or about 0.05%).
There's something a bit awkward in the way stats counters are inherited
through the QUIC modules: quic_conn-t includes quic_stats-t.h, which
declares quic_stats_module as extern from a type that's not known from
this file. And anyway externs should not be exported from type defintions
since they're not part of the ABI itself.
This commit moves the declaration to quic_stats.h which now takes care
to include stats-t.h to get the definition of struct stats_module. The
few users who used to learn it through quic_conn-t.h now include it
explicitly. As a bonus this reduces the number of preprocessed lines
by 5000 (~0.1%).
By the way, it looks like struct stats_module could benefit from being
moved off stats-t.h since it's only used at places where the rest of
the stats is not needed. Maybe something to consider for a future
cleanup.
We only support platforms where free(NULL) is a NOP so that
null checks are useless before free(). Let's drop them to keep
the code clean. There were a few in cfgparse-global, flt_trace,
ssl_sock and stats.
It appears that in cli_parse_add_server(), we're calling srv_alloc_lb()
and stats_allocate_proxy_counters_internal() before srv_preinit() which
allocates the thread groups. LB algos can make use of the per_tgrp part
which is initialized by srv_preinit(). Fortunately for now no algo uses
both tgrp and ->server_init() so this explains why this remained
unnoticed to date. Also, extra counters will soon require per_tgrp to
already be initialized. So let's move these between srv_preinit() and
srv_postinit(). It's possible that other parts will have to be moved
in between.
This could be backported to recent versions for the sake of safety but
it looks like the current code cannot tell the difference.
Some stream parsing errors that do not affect the connection result in
the parsed block not being transferred from the rx buffer to the channel
and not being reported upstream in rcv_buf(), causing the stconn to time
out. Let's detect this condition, and propagate term flags anyway since
no more progress will be made otherwise.
This should be backported at least till 3.2, probably even 2.8.
The H2 mux currently logs whenever some decoding fails. Most of the errors
happen at the connection level, but some are even at the stream level,
meaning that multiple logs can be emitted for a given connection, which
can quickly use some resource for little value. This new setting allows
to tweak this and decide to only log errors that affect the connection,
or even none at all.
This should be backported at least as far as 3.2.
Two cases were not causing glitches to be incremented:
- invalid trailers
- trailers on closed streams
This patch addresses this. It could be backported, at least to 3.2.
ssl_sock_srv_try_reuse_sess() was modified by this commit to no longer
fail (it now returns void), but the related comments remained:
BUG/MINOR: quic: missing app ops init during backend 0-RTT sessions
This patch cleans them up.
The QUIC mux requires "application operations" (app ops), which are a list
of callbacks associated with the application level (i.e., h3, h0.9) and
derived from the ALPN. For 0-RTT, when the session cache cannot be reused
before activation, the current code fails to reach the initialization of
these app ops, causing the mux to crash during its initialization.
To fix this, this patch restores the behavior of
ssl_sock_srv_try_reuse_sess(), whose purpose was to reuse sessions stored
in the session cache regardless of whether 0-RTT was enabled, prior to
this commit:
MEDIUM: quic-be: modify ssl_sock_srv_try_reuse_sess() to reuse backend
sessions (0-RTT)
With this patch, this function now does only one thing: attempt to reuse a
session, and that's it!
This patch allows ignoring whether a session was successfully reused from
the cache or not. This directly fixes the issue where app ops
initialization was skipped upon a session cache reuse failure. From a
functional standpoint, starting a mux without reusing the session cache
has no negative impact; the mux will start, but with no early data to
send.
Finally, there is the case where the ALPN is reset when the backend is
stopped. It is critical to continue locking read access to the ALPN to
secure shared access, which this patch does. It is indeed possible for the
server to be stopped between the call to connect_server() and
quic_reuse_srv_params(). But this cannot prevent the mux to start
without app ops. This is why a 'TODO' section was added, as a reminder that a
race condition regarding the ALPN reset still needs to be fixed.
Must be backported to 3.3
Make sure CPUs are distributed fairly across groups, in case the number
of groups to generate is not a divider of the number of CPUs, otherwise
we may end up with a few groups that will have no CPU bound to them.
This was introduced in 3.4-dev2 with commit 56fd0c1a5c ("MEDIUM: cpu-topo:
Add an optional directive for per-group affinity"). No backport is
needed unless this commit is backported.
When "mode haterm" was set in a "defaults" section, it could not be
overridden in subsequent sections using the "mode" keyword. This is because
the proxy stream instantiation callback was not being reset to the
default stream_new() value.
This could break the stats URI with a configuration such as:
defaults
mode haterm
# ...
frontend stats
bind :8181
mode http
stats uri /
This patch ensures the ->stream_new_from_sc() proxy callback is reset
to stream_new() when the "mode" keyword is parsed for any mode other
than "haterm".
No need to backport.
During proxy_finalize(), a lookup is performed over the servers by name
tree to detect any collision. Only the first conflict for each server
instance is reported to avoid a combinatory explosion with too many
alerts shown.
Previously, this was written using a for loop without any iteration.
Replace this by a simple if statement as this is cleaner.
This should fix github issue #3276.
itbmap_next() advances an iterator over a ncbmbuf buffer storage. When
reaching the end of the buffer, <b> field is set to NULL, and the caller
is expected to stop working with the iterator.
Complete this part to ensure that itbmap type is fully initialized in
case null iterator value is returned. This is not strictly required
given the above description, but this is better to avoid any possible
future mistake.
This should fix coverity issue from github #3273.
This could be backported up to 2.8.
Changes brought to support large buffers revealed a bug in the SPOE applet
when a frame is copied in the SPOE context buffer. A b_xfer() was performed
without allocating the SPOE context buffer. It is not expected. As stated in
the function documentation, the caller is responsible for ensuring there is
enough space in the destination buffer. So first of all, it must ensure this
buffer was allocated.
With recent changes, we are able to hit a BUG_ON() because the swap is no
longer possible if source and destination buffers size are not the same.
This patch should fix the issue #3286. It could be backported as far as 3.1.
Some options do not support "no" nor "defaults" and they're placed after
the check for their absence. However, "accept-invalid-http-request" and
"accept-invalid-http-response" still used to check for the flags that
come with these prefixes, but Coverity noticed this was dead code in
github issue #3272. Let's just drop the test.
No backport needed as it's just dead code.
This function was recently created by moving code from acme_gen_tmp_x509()
(in acme.c) to ssl_gencrt.c (ssl_gen_x509()). The <ctmp> variable was
initialized and then freed without ever being used. This was already the
case in the original acme_gen_tmp_x509() function.
This patch removes these useless statements.
Reported in GH #3284
Avoid such a warnings from coverity:
CID 1645121: (#1 of 1): Calling risky function (DC.WEAK_CRYPTO)
dont_call: random should not be used for security-related applications,
because linear congruential algorithms are too easy to break.
Reported in GH #3283 and #3285
This patch changes the registration of the following keywords to be
unconditional:
- ssl-dh-param-file
- ssl-engine
- ssl-propquery, ssl-provider, ssl-provider-path
- ssl-default-bind-curves, ssl-default-server-curves
- ssl-default-bind-sigalgs, ssl-default-server-sigalgs
- ssl-default-bind-client-sigalgs, ssl-default-server-client-sigalgs
Instead of excluding them at compile time via #ifdef guards in the keyword
registration table, their parsing functions now check feature availability
at runtime and return a descriptive error when the feature is missing.
For features controlled by the SSL library (providers, curves, sigalgs,
DH), the error message includes the actual OpenSSL version string via
OpenSSL_version(OPENSSL_VERSION), so users can immediately identify which
library they are running rather than seeing cryptic internal macro names.
For ssl-dh-param-file, the message also includes "(no DH support)" as a
hint, since OPENSSL_NO_DH can be set either by an OpenSSL build or by
HAProxy itself in certain configurations.
For ssl-engine, which depends on a HAProxy build-time flag (USE_ENGINE),
the message retains the flag name as it is more actionable for the user.
This addresses issue https://github.com/haproxy/haproxy/issues/3246.
In acme_req_finalize(), acme_req_challenge(), acme_req_neworder(),
acme_req_account(), and acme_post_as_get(), the success path always
calls unconditionally memprintf(errmsg, ...).
This may result in a leak of errmsg.
Additionally, acme_res_chkorder(), acme_res_finalize(), acme_res_auth(),
and acme_res_neworder() had unused 'out:' labels that were removed.
Must be backported as far as 3.2.
365a696 ("MINOR: acme: emit a log for DNS-01 challenge response")
introduces the auth->dns member which is istdup(). But this member is
never free, instead auth->token was freed twice by mistake.
Must be backported to 3.2.
QUIC is now implemented on the backend side. Complete definitions for
QUIC/H3 stats module to add STATS_PX_CAP_BE capability.
This change is necessary to display QUIC/H3 counters on backend lines
for HTML stats page.
This should be backported up to 3.3.
half_open_conn is a proxy counter used to account for quic_conn in
half-open state : this represents a connection whose address is not yet
validated (handshake successful, or via token validation).
This counter only has sense for the frontend side. Currently, code is
safe as access is only performed if quic_conn is not yet flagged with
QUIC_FL_CONN_PEER_VALIDATED_ADDR, which is always set for backend
connections.
To better reflect this, add a BUG_ON() when half_open_conn is
incremented/decremented to ensure this never occurs for backend
connections.
quic_conn is initialized with a pointer to its proxy counters. These
counters are then updated during the connection lifetime.
Counters pointer was incorrect for backend quic_conn, as it always
referenced frontend counters. For pure backend, no stats would be
updated. For listen instances, this resulted in incorrect stats
reporting.
Fix this by correctly set proxy counters based on the connection side.
This must be backported up to 3.3.
This is a very minor bug with a very low probability of occurring.
However, it could be flagged by a static analyzer or result in a small
contribution, which is always time-consuming for very little gain.
Add the --quic-bind-opts and --tcp-bind-opts long options to append
settings to all QUIC and TCP bind lines. This requires modifying the argv
parser to first process these new options, ensuring they are available
during the second argv pass to be added to each relevant "bind" line.
Add -b and -c options to the haterm argv parser. Use -b to specify the RSA
private key size (in bits) and -c to define the ECDSA certificate curves.
These self-signed certificates are required for haterm SSL bindings.
Allows server keyword "no-check-sni-auto" for dynamic servers. This may
be necessary to users who do not want to benefit from auto SNI for
checks.
Keyword "check-sni-auto" is still deactivated for dynamic servers, for
the same reason as "sni-auto" (cf the previous patch for a complete
explanation).
This must be backported up to 3.3.
Auto SNI configuration is configured during check config validity.
However, nothing was implemented for dynamic servers.
Fix this by implementing auto SNI configuration during "add server" CLI
handler. Auto SNI configuration code is moved in a dedicated function
srv_configure_auto_sni() called both for static and dynamic servers.
Along with this, allows the keyword "no-sni-auto" on dynamic servers, so
that this process can be deactivated if wanted. Note that "sni-auto"
remains unavailable as it only makes sense with default-servers which
are never used for dynamic server creation.
This must be backported up to 3.3.
There was no check on the result of strdup() used to setup auto SNI on a
server instance during check config validity. In case of failure, the
error would be silently ignored as the following server_parse_exprs()
does nothing when <sni_expr> server field is NULL. Hence, no SNI would
be used on the server, without any error nor warning reported.
Fix this by adding a check on strdup() return value. On error, ERR_ABORT
is reported along with an alert, parsing should be interrupted as soon
as possible.
This must be backported up to 3.3. Note that the related code in this
case is present in cfgparse.c source file.
When leveraging shm-stats-file, global_now_ms and global_now_ns are stored
(and thus shared) inside the shared map, so that all co-processes share
the same clock source.
Since the global_now_{ns,ms} clocks are derived from now_ns, and given
that now_ns is a monotonic clock (hence inconsistent from one host to
another or reset after reboot) special care must be taken to detect
situations where the clock stored in the shared map is inconsistent
with the one from the local process during startup, and cannot be
relied upon anymore. A common situation where the current implementation
fails is resuming from a shared file after reboot: the global_now_ns stored
in the shm-stats-file will be greater than the local now_ns after reboot,
and applying the shared offset doesn't help since it was only relevant to
processes prior to rebooting. Haproxy's clock code doesn't expect that
(once the now offset is applied) global_now_ns > now_ns, and it creates
ambiguous situation where the clock computations (both haproxy oriented
and shm-stats-file oriented) are broken.
To fix the issue, when we detect that the clock stored in the shm is
off by more than SHM_STATS_FILE_HEARTBEAT_TIMEOUT (60s) from the
local now_ns, since this situation is not supposed to happen in normal
environment on the host, we assume that the shm file was previously used
on a different system (or that the current host rebooted).
In this case, we perform a manually adjustment of the now offset so that
the monotonic clock from the current host is consistent again with the
global_now_ns stored in the file. Doing so we can ensure that clock-
dependent objects (such as freq_counters) stored within the map will keep
working as if we just (re)started where we left off when the last process
stopped updating the map.
Normally it is not expected that we update the now offset stored in the
map once the map was already created (because of concurrent accesses to
the file when multiple processes are attached to it), but in this specific
case, we know we are the first process on this host to start working
(again) on the file, thus we update the offset as if we created the
file ourself, while keeping existing content.
It should be backported in 3.3
shm-stats-file heartbeat is derived from now_ms with an extra time added
to it, thus it should be handled using the same time as now_ms is.
Until now, we used to handle heartbeat using signed integer. This was not
found to cause severe harm but it could result in improper handling due
to early wrapping because of signedness for instance, so let's better fix
that before it becomes a real issue.
It should be backported in 3.3
Amaury reported that when the following warning is reported by haproxy:
[WARNING] (296347) : config: failed to get shm stats file slot for 'haproxy.stats', all slots are occupied
haproxy would segfault right after during clock update operation.
The reason for the warning being emitted is not the object of this commit
(all shm-stats-file slots occupied by simultaneous co-processes) but since
it was is intended that haproxy is able to keep working despite that
warning (ignoring the use of shm-stats-file), we should fix the crash.
The crash is caused by the fact that we detach from the shared memory while
the global_now_ns and global_now_ms clock pointers still point to the shared
memory. Instead we should revert to using our local clock instead before
detaching from the map.
It should be backported in 3.3
QUIC uses many objects and the default pool size causes a lot of
thrashing at the current request rate, taking ~12% CPU in pools.
Let's increase it to 3MB, which allows us to reach around 11M
req/s on a 80-core machine.
haterm_init.c is added to implement haproxy_init_args() which overloads
the one defined by haproxy.c. This way, haterm program uses its own argv[]
parsing function. It generates its own configuration in memory that is
parsed during boot and executed by the common code.
Contrary to haproxy, httpterm does not support all the HTTP protocols.
Furthermore, it has become easier to handle inbound/outbound
connections / streams since the rework done at conn_stream level.
This patch implements httpterm HTTP server services into haproxy. To do
so, it proceeds the same way as for the TCP checks which use only one
stream connector, but on frontend side.
The makefile is modified to handle haterm.c in additions to all the C
files for haproxy to build new haterm program into haproxy, the haterm
server also instantiates a haterm stream (hstream struct) attached to a
stream connector for each incoming connection without backend stream
connector. This is the role of sc_new_from_endp() called by the muxes to
instantiate streams/hstreams.
As for stream_new(), hstream_new() instantiates a task named
process_hstream() (see haterm.c) which has the same role as
process_stream() but for haterm streams.
haterm into haproxy takes advantage of the HTTP muxes and HTX API to
support all the HTTP protocols supported by haproxy.
Add a pointer to function to proxies as ->stream_new_from_sc proxy
struct member to instantiate stream from connection as this is done by
all the muxes when they call sc_new_from_endp(). The default value for
this pointer is obviously stream_new() which is exported by this patch.
This patches allows the argv[] parsing function to be redefined from
others C modules. This is done extracting the function which really
parse the argv[] array to implement haproxy_init_args(). This function
is declared as a weak symbol which may be overloaded by others C module.
Same thing for copy_argv() which checks/cleanup/modifies the argv array.
One may want this function to be redefined. This is the case when other
C modules do not handle the same command line option. Copying such
argv[] would lead to conflicts with the original haproxy argv[] during
the copy.
This patch provides the possibility to initialize haproxy without
configuration file. This may be identified by the new global and exported
<fileless_mode> and <fileless_cfg> variables which may be used to
provide a struct cfgfile to haproxy by others means than a physical
file (built in memory).
When enabled, this fileless mode skips all the configuration files
parsing.
Add definitions for haterm stream as arguments to be used by the TRACE API.
This will be used by the haterm module to come which will have to handle
hstream struct objects (in place of stream struct objects).
Add "generate-dummy" on/off type keyword to "load" directive to
automatically generate dummy certificates as this is done for ACME from
ckch_conf_load_pem_or_generate() function which is called if a "crt"
keyword is also provide for this directive.
Also implement "keytype" to specify the key type used for these
certificates. Only "RSA" or "ECDSA" is accepted. This patch also
implements "bits" keyword for the "load" directive to specify the
private key size used for RSA. For ECDSA, a new "curves" keyword is also
provided by this patch to specify the curves to be used for the EDCSA
private keys generation.
ckch_conf_load_pem_or_generate() is modified to use these parameters
provided by "keytype", "bits" and "curves" to generate the private key
with ssl_gen_EVP_PKEY() before generating the X509 certificate calling
ssl_gen_x509().
Move acme_EVP_PKEY_gen() implementation to ssl_gencrt.c and rename it to
ssl_EVP_PKEY_gen(). Also extract from acme_gen_tmp_x509() the generic
part to implement ssl_gen_x509() into ssl_gencrt.c.
To generate a self-signed expired certificate ssl_EVP_PKEY_gen() must be
used to generate the private key. Then, ssl_gen_x509() must be called
with the private key as argument. acme_gen_tmp_x509() is also modified
to called these two functions to generate a temporary certificate has
done before modifying this part.
Such an expired self-signed certificate should not be use on the field
but only during testing and development steps.
In connect_server(), MUX initialization must be delayed if ALPN
negotiation is configured, unless ALPN can already be retrieved via the
server cache.
A readlock is used to consult the server cache. Prior to this patch, it
was always taken even if no ALPN is configured. The lock was thus used
for every new backend connection instantiation.
Rewrite the check so that now the lock is only used if ALPN is
configured. Thus, no lock access is done if SSL is not used or if ALPN
is not defined.
In practice, there will be no performance gain, as the read lock should
never block if ALPN is not configured. However, the code is cleaner as
it better reflect that only access to server nego_alpn requires the
path_params lock protection.