Thanks to this patch, it is now possible to specify an healthcheck section
on the server line. In that case, the server will use the tcpcheck as
defined in the correspoding healthcheck section instead of the proxy's one.
tcpcheck_ruleset struct was extended to host a config part that will be used
for healthcheck sections. This config part is mainly used to store element
for the server's tcpcheck part.
When a healthcheck section is parsed, a ruleset is created with its name
(which must be unique). "*healthcheck-{NAME}" is used for these ruleset. So
it is not possible to mix them with regular rulesets.
For now, in a healthcheck section, the type must be defined, based on the
options name (tcp-check, httpchk, redis-check...). In addition, several
"tcp-check" or "http-check" rules can be specified, depending on the
healthcheck type.
Functions used to parse directives related to tcpchecks were split to have a
first step testing the proxy and creating the tcpcheck ruleset if necessary,
and a second step filling the ruleset. The aim of this patch is to preapre
the parsing of healthcheck sections. In this context, only the second steip
will be used.
When log-format stirngs were parsed in context of a tcpcheck, ARGC_SRV
context was used instead of ARGC_TCK. This context is used to report
accurrate errors.
This patch could be backported to all stable versions.
The proxy flag PR_O_TCPCHK_SSL is replaced by a flag on the tcpcheck
itself. When TCPCHK_FL_USE_SSL flag is set, it means the healthcheck will
use an SSL connection and the SSL xprt must be prepared for the server.
In tcpchecks context, when HTTP sample expressions are parsed, there is no
reason to set the proxy's http_needed value to 1. This value is only used
for streams to allocate an HTTP txn.
This patch could be backported to all stable versions.
disable-on-404 and send-state options, configured on an HTTP healtcheck,
were handled as proxy options. Now, these options are handled in the
tcp-check itself. So the corresponding PR_O and PR_02 flags are removed.
The tcpcheck_rules structure is replaced by the tcpcheck structure. The main
difference is that the ruleset is now referenced in the tcpcheck structure,
instead of the rules list. The flags about the ruleset type are moved into
the ruleset structure and flags to track unused rules remains on the
tcpcheck structure. So it should be easier to track unused rulesets. But it
should be possible to configure a set of tcpcheck rules outside of the proxy
scope.
The main idea of these changes is to prepare the parsing of a new
healthcheck section. So this patch is quite huge, but it is mainly about
renaming some fields.
When parsing httpchck option, a wrong flag (TCPCHK_SND_HTTP_FROM_OPT) was
set on the rules, while it is in fact a flag for a send rule. Let's remove
it. There is no issue here because there is no corresponding flag for
tcpcheck rules.
This patch must be backported to all stable versions.
The CLI commands (get|add|del|clear|commit|set) | (acl|map) does not
contain a permission check on admin level.
Must be backported to 3.3. This can be a breaking change for some users.
Initially reported by Cameron Brown.
'set ssl ocsp-response', 'update ssl ocsp-response', 'show ssl
ocsp-response', 'show ssl ocsp-updates' are lacking permissions checks
on admin level.
Must be backported in 3.3. This can be a breaking change for some users.
Initially reported by Cameron Brown.
Both 'set ssl tls-key' and 'show tls-keys' command are missing the
permission checks so the commands can be used only in admin mode.
Must be backported to 3.3. This can be a breaking change for some users.
Initially reported by Cameron Brown.
This commit adds an ha_warning() when map/acl commands are accessed
without admin level. This is to warn users that these commands will be
restricted to admin only in HAProxy 3.3.
Must be backported in every stable branches.
Initially reported by Cameron Brown.
This commit adds an ha_warning() when OCSP commands are accessed without
admin level. This is to warn users that these commands will be
restricted to admin only in HAProxy 3.3.
Must be backported in every stable branches.
Initially reported by Cameron Brown.
This commit adds an ha_warning() when 'show tls-keys' or 'set ssl
tls-key' are accessed without admin level. This is to warn users that
these commands will be restricted to admin only in HAProxy 3.3.
Must be backported in every stable branches.
Initially reported by Cameron Brown.
The new "-L" option is convenient for quick backport sessions, but it
doesn't list the commit subjects nor the review command. Let's just add
these to ease backport sessions. However we don't do it in quiet mode
(-q) because the output is sometimes parsed by automatic backport
scripts.
Add documentation for two new directives in the acme section:
- challenge-ready: configures the conditions that must be satisfied
before notifying the ACME server that a dns-01 challenge is ready.
Accepted values are cli, dns and none. cli waits for an operator
to signal readiness via the "acme challenge_ready" CLI command. dns
performs a DNS pre-check against the "default" resolvers section,
not the authoritative name servers. When both are combined, HAProxy
waits for the CLI confirmation before triggering the DNS check.
- dns-delay: configures the delay before the first DNS resolution
attempt and between retries when challenge-ready includes dns.
Default is 300 seconds.
The previous patch implemented the 'dns-check' option. This one replaces
it by a more generic 'challenge-ready' option, which allows the user to
chose the condition to validate the readiness of a challenge. It could
be 'cli', 'dns' or both.
When in dns-01 mode it's by default to 'cli' so the external tool used to
configure the TXT record can validate itself. If the tool does not
validate the TXT record, you can use 'cli,dns' so a DNS check would be
done after the CLI validated with 'challenge_ready'.
For an automated validation of the challenge, it should be set to 'dns',
this would check that the TXT record is right by itself.
When using the dns-01 challenge type, TXT record propagation across
DNS servers can take time. If the ACME server verifies the challenge
before the record is visible, the challenge fails and it's not possible
to trigger it again.
This patch introduces an optional DNS pre-check mechanism controlled
by two new configuration directives in the "acme" section:
- "dns-check on|off": enable DNS propagation verification before
notifying the ACME server (default: off)
- "dns-delay <time>": delay before querying DNS (default: 300s)
When enabled, three new states are inserted in the state machine
between AUTH and CHALLENGE:
- ACME_RSLV_WAIT: waits dns-delay seconds before starting
- ACME_RSLV_TRIGGER: starts an async TXT resolution for each
pending authorization using HAProxy's resolver infrastructure
- ACME_RSLV_READY: compares the resolved TXT record against the
expected token; retries from ACME_RSLV_WAIT if any record is
missing or does not match
The "acme_rslv" structure is implemented in acme_resolvers.c, it holds
the resolution for each domain. The "auth" structure which contains each
challenge to resolve contains an "acme_rslv" structure. Once
ACME_RSLV_TRIGGER leaves, the DNS tasks run on the same thread, and the
last DNS task which finishes will wake up acme_process().
Note that the resolution goes through the configured resolvers, not
through the authoritative name servers of the domain. The result may
therefore still be affected by DNS caching at the resolver level.
This patch adds support for TXT records. It allows to get the first
string of a TXT-record which is limited to 255 characters.
The rest of the record is ignored.
Latest commit a336c467a0 ("BUG/MINOR: net_helper: fix length controls
on ip.fp tcp options parsing") was malformed and broke the build. This
should be backported wherever the fix above is backported.
If opt len is truncated by tcplen we may read 1 Byte after the
tcp header.
There is also missing controls parsing MSS and WS we may compute
invalid values on fingerprint reading after the tcp header in
case of truncated options.
This patch should be backported on versions including ip.fp
We leverage the SE_FL_APP_STARTED flag to detect whether the application
layer had a chance to run or not when an RST_STREAM is received. This
allows us to triage RST_STREAM between regular ones and harmful ones,
and to count glitches for them. It reveals extremely effective at
detecting fast HEADERS+RST pairs.
It could be useful to backport it to 3.2, though it depends on these
two previous patches to be backported first (the first one was already
planned and the second one is harmless, though will require to drop
the haterm changes):
BUG/MINOR: stconn: Always declare the SC created from healthchecks as a back SC
MINOR: stconn: flag the stream endpoint descriptor when the app has started
In order to improve our ability to distinguish operations that had
already started from others under high loads, it would be nice to know
if an application layer (stream) has started to work with an endpoint
or not. The use case typically is a frontend mux instantiating a stream
to instantly cancel it. Currently this info will take some time to be
detected and processed if the applcation's task takes time to wake up.
By flagging the sedesc with SE_FL_APP_STARTED the first time a the app
layer starts, the lower layers can know whether they're cancelling a
stream that has started to work or not, and act accordingly. For now
this is done unconditionally on the backend, and performed early in the
only two app layers that can be reached by a frontend: process_stream()
and process_hstream() (for haterm).
The SC created from a healthcheck is always a back SC. But SC_FL_ISBACK
flags was missing. Instead of passing it when sc_new_from_check() is called,
the function was simplified to set SC_FL_ISBACK flag systematically when a
SC is created from a healthcheck.
This patch should be backported as far as 2.6.
RFC 9000 lists each supported frames and the type of packets in which it
can be present.
Prior to this patch, a packet with an incompatible frame is dropped.
However, QUIC specification mandates that the connection is immediately
closed with PROTOCOL_VIOLATION error code. This patch completes
qc_parse_frm() to add such connection closure.
This must be backported up to 2.6.
The GitHub API silently caps per_page at 100, so passing per_page=200
was silently returning at most 100 tags. AWS-LC-FIPS tags appear late
in the list, causing version detection to fail.
Replace the single-page fetch in get_all_github_tags() with a loop that
iterates all pages.
Could be backported in previous branches.
When an htx DATA block is partially transfer, we must take care to remove
exactly the copied size. To do so, we must save the size of the last block
value copied and not rely on the last data block after the copy. Indeed,
data can be merged with an existing DATA block, so the last block size can
be larger than the last part copied.
Because of this issue, it is possible to remove more data than
expected. Worse, this could lead to a crash by performing an integer
overflow on the block size.
No backport needed.
Fix a leak of the task object in acme_start_task() when one of the
condition in the function failed.
Fix issue #3308.
Must be backported to 3.2 and later.
There are two settings to control idle connection sharing across
threads.
tune.idle-pool.shared, that enables or disables it, and then
tune.takeover-other-tg-connections, which lets you or not get idle
connections from other thread groups.
Add a new keyword for tune.idle-pool.shared, "full", that lets you get
connections from other thread groups (equivalent to "full" keyword for
tune.takeover-other-tg-connections). The "on" keyword now will be
equivalent to the "restrict" one, which allowed getting connection from
other thread groups only when not doing it would result in a connection
failure (when reverse-http or when strict-macxonn are used).
tune.takeover-other-tg-connections will be deprecated.
If server returns an auth with status valid it seems that client
needs to always skip it, CA can recycle authorizations, without
this change haproxy fails to obtain certificates in that case.
It is also something that is explicitly allowed and stated
in the dns-persist-01 draft RFC.
Note that it would be better to change how haproxy does status polling,
and implements the state machine, but that will take some thought
and time, this patch is a quick fix of the problem.
See:
https://github.com/letsencrypt/boulder/issues/2125https://github.com/letsencrypt/pebble/issues/133
This must be backported to 3.2 and later.
When abortonclose option is enabled (by default since 3.3), the HTTP rules
can no longer yield if the client aborts. However, stream aborts were also
considered. So it was possible to interrupt yielding rules, especially on
the response processing, while the client was still waiting for the
response.
So now, when abortonclose option is enabled, we now take care to only
consider client aborts to prevent HTTP rules to yield.
Many thanks to @DirkyJerky for his detailed analysis.
This patch should fix the issue #3306. It should be backported as far as
2.8.
warnif_misplaced_* functions return 1 when a warning is reported and 0
otherwise. So the caller must properly handle the return value.
When parsing a proxy, ERR_WARN code must be added to the error code instead
of the return value. When a warning was reported, ERR_RETRYABLE (1) was
added instead of ERR_WARN.
And when tcp rules were parsed, warnings were ignored. Message were emitted
but the return values were ignored.
This patch should be backported to all stable versions.
When warnif_cond_conflicts() is called, we must take care to emit a warning
only when a conflict is reported. We cannot rely on the err_code variable
because some warnings may have been already reported. We now rely on the
errmsg variable. If it contains something, a warning is emitted. It is good
enough becasue warnif_cond_conflicts() only reports warnings.
This patch should fix the issue #3305. It is a 3.4-dev specific issue. No
backport needed.
When picking a mux, pay attention to its MX_FL_FRAMED. If it is set,
then it means we explicitely want QUIC, so don't use that mux for any
protocol that is not QUIC.
When parsing the check address, store the associated proto too.
That way we can use the notation like quic4@address, and the right
protocol will be used. It is possible for checks to use a different
protocol than the server, ie we can have a QUIC server but want to run
TCP checks, so we can't just reuse whatever the server uses.
WIP: store the protocol in checks
Don't assume the check will reuse the server's xprt. It may not be true
if some settings such as the ALPN has been set, and it differs from the
server's one. If the server is QUIC, and we want to use TCP for checks,
we certainly don't want to reuse its XPRT.
Permission checks on the CLI for ACME are missing.
This patch adds a check on the ACME commands
so they can only be run in admin mode.
ACME is stil a feature in experimental-mode.
Initial report by Cameron Brown.
Must be backported to 3.2 and later.
Permission checks on the CLI for ECH are missing.
This patch adds a check for "(add|set|del|show) ssl ech" commands
so they can only be run in admin mode.
ECH is stil a feature in experimental-mode and is not compiled by
default.
Initial report by Cameron Brown.
Must be backported to 3.3.
This patch fixes a warning that can be reproduced with gcc-8.5 on RHEL8
(gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-28)).
This should fix issue #3303.
Must be backported everywhere 917e82f283 ("MINOR: debug: copy debug
symbols from /usr/lib/debug when present") was backported, which is
to branch 3.2 for now.
Fix the check or arguments of the 'acme challenge_ready' command which
was checking if all arguments are NULL instead of one of the argument.
Must be backported to 3.2 and later.
Replace atol() by _strl2uic() in cases the input are ISTs when parsing
the retry-after header. There's no risk of an error since it will stop
at the first non-digit.
Must be backported to 3.2 and later.