Commit graph

14488 commits

Author SHA1 Message Date
Michał Kępień
64367010f2
Fix "rndc flushname" for longer name server names
dns_adb_flushname() calls dns_name_hash() to determine the ADB bucket
number to search for the given name.  Meanwhile, all other functions in
lib/dns/adb.c call dns_name_fullhash() for determining the bucket number
instead.  This discrepancy causes dns_adb_flushname() to have virtually
no chances of actually removing the given name from the ADB if the
name is longer than 16 bytes (since dns_name_hash() only hashes the
first 16 bytes of the name provided to it) - more specifically, the
probability of success for names longer than 16 bytes is inversely
proportional to the number of ADB buckets in use, i.e. 1:1021 at best.

Fix by using dns_name_fullhash() instead of dns_name_hash() in
dns_adb_flushname(), so that the logic for determining the bucket number
that a given name belongs to is consistent throughout lib/dns/adb.c.
2025-01-30 07:44:18 +01:00
Andoni Duarte Pintado
73997c8161 Merge tag 'v9.18.33' into bind-9.18 2025-01-29 17:23:11 +01:00
Ondřej Surý
b14df7d459
Stop the timer when shuttingdown the fetch context
When canceling the last fetch, we also need to stop the fctx_expired
timer from possibly firing between the fctx_shutdown() call and the
fetch being actually destroyed along with the timer.  As there are
multiple places where fctx_shutdown() is being called without stopping
the timer, move the fctx_stoptimer() to fctx_shutdown() and cleanup the
explicit usage.
2025-01-23 17:46:37 +01:00
Mark Andrews
8790d5cd22 Terminate yaml string after negative comment
(cherry picked from commit 89afc11389)
2025-01-22 23:58:54 +00:00
Ondřej Surý
239f4104da
Remove memory limit on ADB finds and fetches
Address Database (ADB) shares the memory for the short lived ADB
objects (finds, fetches, addrinfo) and the long lived ADB
objects (names, entries, namehooks).  This could lead to a situation
where the resolver-heavy load would force evict ADB objects from the
database to point where ADB is completely empty, leading to even more
resolver-heavy load.

Make the short lived ADB objects use the other memory context that we
already created for the hashmaps.  This makes the ADB overmem condition
to not be triggered by the ongoing resolver fetches.

(cherry picked from commit 05faff6d53)
2025-01-22 15:29:27 +01:00
Ondřej Surý
4cc1160e4d
Replace linked lists with the hashtables to hold fetch contexts
When the recursive-clients value is too large, the linked lists holding
the fetch contexts can also grow large and since the algorithm to merge
outgoing queries is quadratic, named can get slow.

Replace the linked list with hashtable for faster lookups.  This also
allows us to reduce the number of tasks (buckets) in the resolver.
2025-01-22 15:06:04 +01:00
JINMEI Tatuya
065ffb2eb8
Optimize database decref by avoiding locking with refs > 1
Previously, this function always acquires a node write lock if it
might need node cleanup in case the reference decrements to 0.  In
fact, the lock is unnecessary if the reference is larger than 1 and it
can be optimized as an "easy" case. This optimization could even be
"necessary". In some extreme cases, many worker threads could repeat
acquring and releasing the reference on the same node, resulting in
severe lock contention for nothing (as the ref wouldn't decrement to 0
in most cases). This change would prevent noticeable performance
drop like query timeout for such cases.

Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com>
Co-authored-by: Ondřej Surý <ondrej@isc.org>

(cherry picked from commit 7f4471594d)
2025-01-22 14:31:09 +01:00
Ondřej Surý
8bf311c769
Shutdown the fetch context after canceling the last fetch
Currently, the fetch context will continue running even when the last
fetch (response) has been removed from the context, so named can process
and cache the answer.  This can lead to a situation where the number of
outgoing recursing clients exceeds the the configured number for
recursive-clients.

Be more stringent about the recursive-clients limit and shutdown the
fetch context immediately after the last fetch has been canceled from
that particular fetch context.
2025-01-22 14:21:51 +01:00
Ondřej Surý
1b9d949534
Remove --with-tuning=small/large configuration option
The last remaining tuning value was RESOLVER_NTASKS and instead of
having variable number of the tasks per-cpu and in named and in
dns_client, set the number of the resolver tasks to 523 (number taken
from dns_client unit) to accomodate most of the recursive-clients
values.
2025-01-22 14:16:40 +01:00
Ondřej Surý
d8206a939c
Reduce struct isc__nm_uvreq size from 1560 to 560 bytes
The uv_req union member of struct isc__nm_uvreq contained libuv request
types that we don't use.  Turns out that uv_getnameinfo_t is 1000 bytes
big and unnecessarily enlarged the whole structure.  Remove all the
unused members from the uv_req union.
2025-01-22 14:12:38 +01:00
Ondřej Surý
a7630c2c62
Reduce sizeof isc_sockaddr from 152 to 48 bytes
After removing sockaddr_unix from isc_sockaddr, we can also remove
sockaddr_storage and reduce the isc_sockaddr size from 152 bytes to just
48 bytes needed to hold IPv6 addresses.

(cherry picked from commit 2367b6a2e1)
2025-01-22 14:12:38 +01:00
Artem Boldariev
550b692343 DoH: reduce excessive bad request logging
We started using isc_nm_bad_request() more actively throughout
codebase. In the case of HTTP/2 it can lead to a large count of
useless "Bad Request" messages in the BIND log, as often we attempt to
send such request over effectively finished HTTP/2 sessions.

This commit fixes that.

(cherry picked from commit 937b5f8349)
2025-01-15 16:50:13 +01:00
Artem Boldariev
796708775d DoH: introduce manual read timer control
This commit introduces manual read timer control as used by StreamDNS
and its underlying transports. Before that, DoH code would rely on the
timer control provided by TCP, which would reset the timer any time
some data arrived. Now, the timer is restarted only when a full DNS
message is processed in line with other DNS transports.

That change is required because we should not stop the timer when
reading from the network is paused due to throttling. We need a way to
drop timed-out clients, particularly those who refuse to read the data
we send.

(cherry picked from commit 609a41517b)
2025-01-15 16:49:32 +01:00
Artem Boldariev
ee42514be2 DoH: floodding clients detection
This commit adds logic to make code better protected against clients
that send valid HTTP/2 data that is useless from a DNS server
perspective.

Firstly, it adds logic that protects against clients who send too
little useful (=DNS) data. We achieve that by adding a check that
eventually detects such clients with a nonfavorable useful to
processed data ratio after the initial grace period. The grace period
is limited to processing 128 KiB of data, which should be enough for
sending the largest possible DNS message in a GET request and then
some. This is the main safety belt that would detect even flooding
clients that initially behave well in order to fool the checks server.

Secondly, in addition to the above, we introduce additional checks to
detect outright misbehaving clients earlier:

The code will treat clients that open too many streams (50) without
sending any data for processing as flooding ones; The clients that
managed to send 1.5 KiB of data without opening a single stream or
submitting at least some DNS data will be treated as flooding ones.
Of course, the behaviour described above is nothing else but
heuristical checks, so they can never be perfect. At the same time,
they should be reasonable enough not to drop any valid clients,
realatively easy to implement, and have negligible computational
overhead.

(cherry picked from commit 3425e4b1d0)
2025-01-15 16:49:23 +01:00
Artem Boldariev
11a2956dce DoH: process data chunk by chunk instead of all at once
Initially, our DNS-over-HTTP(S) implementation would try to process as
much incoming data from the network as possible. However, that might
be undesirable as we might create too many streams (each effectively
backed by a ns_client_t object). That is too forgiving as it might
overwhelm the server and trash its memory allocator, causing high CPU
and memory usage.

Instead of doing that, we resort to processing incoming data using a
chunk-by-chunk processing strategy. That is, we split data into small
chunks (currently 256 bytes) and process each of them
asynchronously. However, we can process more than one chunk at
once (up to 4 currently), given that the number of HTTP/2 streams has
not increased while processing a chunk.

That alone is not enough, though. In addition to the above, we should
limit the number of active streams: these streams for which we have
received a request and started processing it (the ones for which a
read callback was called), as it is perfectly fine to have more opened
streams than active ones. In the case we have reached or surpassed the
limit of active streams, we stop reading AND processing the data from
the remote peer. The number of active streams is effectively decreased
only when responses associated with the active streams are sent to the
remote peer.

Overall, this strategy is very similar to the one used for other
stream-based DNS transports like TCP and TLS.

(cherry picked from commit 9846f395ad)
2025-01-15 16:47:21 +01:00
Artem Boldariev
125bfd71d3 Add isc__nm_async_run()
This commit adds isc__nm_async_run() which is very similar to
isc_async_run() in newer versions of BIND: it allows calling a
callback asynchronously.

Potentially, it can be used to replace some other async operations in
other networking code, in particular the delayed I/O calls in TLS a
TCP DNS transports to name a few and remove quiet a lot of code, but
it we are unlikely to do that for the strictly maintenance only
branch, so it is protected with DoH-related #ifdefs.

It is implemented in a "universal" way mainly because doing it in the
specific code requires the same amount of code and is not simpler.
2025-01-15 16:43:47 +01:00
Artem Boldariev
13d521fa5f Implement TLS manual read timer control functionality
This commit adds a manual TLS read timer control mode which is
supposed to override automatic resetting of the timer when any data is
received.

It both depends and complements similar functionality in TCP.
2025-01-15 15:34:43 +00:00
Artem Boldariev
a67b325542 Implement TCP manual read timer control functionality
This commit adds a manual TCP read timer control mode which is
supposed to override automatic resetting of the timer when any data is
received. That can be accomplished by
`isc__nmhandle_set_manual_timer()`.

This functionality is supposed to be used by multilevel networking
transports which require finer grained control over the read
timer (TLS Stream, DoH).

The commit is essentially an implementation of the functionality from
newer versions of BIND.
2025-01-15 15:34:43 +00:00
Ondřej Surý
fa7b7973e3 Limit the additional processing for large RDATA sets
When answering queries, don't add data to the additional section if
the answer has more than 13 names in the RDATA.  This limits the
number of lookups into the database(s) during a single client query,
reducing query processing load.

Also, don't append any additional data to type=ANY queries. The
answer to ANY is already big enough.

(cherry picked from commit a1982cf1bb)
2025-01-15 14:13:45 +01:00
Aram Sargsyan
73b6d9e9e5 Fix a bug in isc_rwlock_trylock()
When isc_rwlock_trylock() fails to get a read lock because another
writer was faster, it should wake up other waiting writers in case
there are no other readers, but the current code forgets about
the currently active writer when evaluating 'cntflag'.

Unset the WRITER_ACTIVE bit in 'cntflag' before checking to see if
there are other readers, otherwise the waiting writers, if they exist,
might not wake up.
2025-01-07 13:30:26 +00:00
Mark Andrews
841269601c Fix parsing of unknown directives in resolv.conf
Only call eatline() to skip to the next line if we're not
already at the end of a line when parsing an unknown directive.
We were accidentally skipping the next line when there was only
a single unknown directive on the current line.

(cherry picked from commit eb78ad2080)
2024-12-10 00:49:11 +00:00
Matthijs Mekking
b2516e1e0c Use query counters in validator code
Commit af7db89513 as part of #4141 was
supposed to apply the 'max-recursion-queries' quota to validator
queries, but the counter was never actually passed on to
dns_resolver_createfetch(). This has been fixed, and the global query
counter ('max-query-count', per client request) is now also added.

(cherry picked from commit 5b1ae4a948)
2024-12-09 11:44:24 +01:00
Ondřej Surý
43f7642e5d Update picohttpparser.{c,h} with upstream repository
Upstream code doesn't do regular releases, so we need to regularly
sync the code from the upstream repository.  This is synchronization up
to the commit f8d0513 from Jan 29, 2024.

(cherry picked from commit d14a76e115)
2024-12-08 12:30:11 +00:00
Matthijs Mekking
413ba531f5 Remove unused maxquerycount
While implementing the global limit 'max-query-count', initially I
thought adding the variable to the resolver structure. But the limit
is per client request so it was moved to the view structure (and
counter in ns_query structure). However, I forgot to remove the
variable from the resolver structure again. This commit fixes that.

(cherry picked from commit 397ca34e34)
2024-12-06 15:19:01 +00:00
Matthijs Mekking
a0ce89bc15 Implement global limit for outgoing queries
This global limit is not reset on query restarts and is a hard limit
for any client request.

Note: This commit has been significantly modified because of many
merge conflicts due to the dns_resolver_createfetch api changes.

(cherry picked from commit 16b3bd1cc7)
2024-12-06 15:17:53 +00:00
Matthijs Mekking
3d0559621b Implement getter function for counter limit
(cherry picked from commit ca7d487357)
2024-12-06 15:17:53 +00:00
Matthijs Mekking
5a806910a8 Implement 'max-query-count'
Add another option to configure how many outgoing queries per
client request is allowed. The existing 'max-recursion-queries' is
per restart, this one is a global limit.

(cherry picked from commit bbc16cc8e6)
2024-12-06 15:17:53 +00:00
Matthijs Mekking
90fbe91997 Fix nsupdate hang when processing a large update
The root cause is the fix for CVE-2024-0760 (part 3), which resets
the TCP connection on a failed send. Specifically commit
4b7c6138 stops reading on the socket
because the TCP connection is throttling.

When the tcpdns_send_cb callback thinks about restarting reading
on the socket, this fails because the socket is a client socket.
And nsupdate is a client and is using the same netmgr code.

This commit removes the requirement that the socket must be a server
socket, allowing reading on the socket again after being throttled.

(manually picked from commit aa24b77d8b)
2024-12-06 09:26:40 +00:00
Mark Andrews
01f5ad3b1d Keep a local copy of the update rules to prevent UAF
Previously, the update policy rules check was moved earlier in the
sequence, and the keep rule match pointers were kept to maintain the
ability to verify maximum records by type.

However, these pointers can become invalid if server reloading
or reconfiguration occurs before update completion. To prevent
this issue, extract the maximum records by type value immediately
during processing and only keep the copy of the values instead of the
full ssurule.

(cherry picked from commit 44a54a29d8)
2024-12-05 15:45:34 +11:00
Ondřej Surý
a4e3d25652
Use attach()/detach() functions instead of touching .references
In rbtdb.c, there were two places where the code touched .references
directly instead of using the helper functions.  Use the helper
functions instead.
2024-11-27 21:17:22 +01:00
JINMEI Tatuya
7d1de99656 use more generic log module name for 'logtoomanyrecords'
DNS_LOGMODULE_RBTDB was simply inappropriate, and this
log message is actually dependent on db implementation
details, so DNS_LOGMODULE_DB would be the best choice.

(cherry picked from commit b0309ee631)
2024-11-27 12:34:11 +11:00
JINMEI Tatuya
a129206f37 emit more helpful log for exceeding max-records-per-type
The new log message is emitted when adding or updating an RRset
fails due to exceeding the max-records-per-type limit. The log includes
the owner name and type, corresponding zone name, and the limit value.
It will be emitted on loading a zone file, inbound zone transfer
(both AXFR and IXFR), handling a DDNS update, or updating a cache DB.
It's especially helpful in the case of zone transfer, since the
secondary side doesn't have direct access to the offending zone data.

It could also be used for max-types-per-name, but this change
doesn't implement it yet as it's much less likely to happen
in practice.

(cherry picked from commit 4156995431)
2024-11-27 12:34:11 +11:00
Ondřej Surý
4fbdad515c
Move contributed DLZ modules into a separate repository
The DLZ modules are poorly maintained as we only ensure they can still
be compiled, the DLZ interface is blocking, so anything that blocks the
query to the database blocks the whole server and they should not be
used except in testing.  The DLZ interface itself should be scheduled
for removal.

(cherry picked from commit a6cce753e2)
2024-11-26 16:24:35 +01:00
Aram Sargsyan
b91b7093f2 Fix error path bugs in the "recursing-clients" list management
In two places, after linking the client to the manager's
"recursing-clients" list using the check_recursionquota()
function, the query.c module fails to unlink it on error
paths. Fix the bugs by unlinking the client from the list.

Also make sure that unlinking happens before detaching the
client's handle, as it is the logically correct order, e.g.
in case if it's the last handle and ns__client_reset_cb()
can be called because of the detachment.

(cherry picked from commit 36c4808903)
2024-11-26 12:40:04 +00:00
Mark Andrews
2d55935c6e Parse the URI template and check for a dns variable
The 'dns' variable in dohpath can be in various forms ({?dns},
{dns}, {&dns} etc.).  To check for a valid dohpath it ends up
being simpler to just parse the URI template rather than looking
for all the various forms if substring.

(cherry picked from commit af54ef9f5d)
2024-11-26 03:41:51 +00:00
Remi Gacogne
e12e91b90d '{&dns}' is as valid as '{?dns}' in a SVCB's dohpath
See for example section 1.2. "Levels and Expression Types" of rfc6570.

(cherry picked from commit e74052ea71)
2024-11-26 03:41:51 +00:00
Mark Andrews
6fc76a1e87 Provide more visibility into configuration errors
by logging SSL_CTX_use_certificate_chain_file and
SSL_CTX_use_PrivateKey_file errors

(cherry picked from commit 9006839ed7)
2024-11-26 12:24:41 +11:00
Mark Andrews
2f26d2fde7 Re-split format strings
Re-split format strings that had been poorly split by multiple
clang-format runs using different versions of clang-format.

(cherry picked from commit a24d6e1654)
2024-11-21 04:22:15 +00:00
Ondřej Surý
b3d8f2796a
Remove redundant semicolons after the closing braces of functions
(cherry picked from commit 1a19ce39db)
2024-11-19 16:06:49 +01:00
Ondřej Surý
c5bac96fd0
Remove redundant parentheses from the return statement
(cherry picked from commit 0258850f20)
2024-11-19 16:06:16 +01:00
Matthijs Mekking
ff53fd3951 Revert "Use a binary search to find the NSEC3 closest encloser"
This reverts commit 94f6655915.
2024-11-15 13:14:30 +00:00
Matthijs Mekking
e5c711fd43 Add inline-signing warning for upgrading to 9.20
For dynamic zones that do not set inline-signing explicitly, add a
warning that the default value for inline-signing has changed. Dynamic
zones that want to be able to reuse the zone (and not trigger a full
resign) should explicitly configure "inline-signing no;".
2024-10-23 10:34:49 +00:00
Petr Menšík
e5ffa52c6d Remove unused <openssl/{hmac,engine}.h> headers from OpenSSL shims
The <openssl/{hmac,engine}.h> headers were unused and including the
<openssl/engine.h> header might cause build failure when OpenSSL
doesn't have Engines support enabled.

See https://fedoraproject.org/wiki/Changes/OpensslDeprecateEngine

(cherry picked from commit 75a50925f7)
2024-10-18 01:29:27 +00:00
Mark Andrews
94f6655915 Use a binary search to find the NSEC3 closest encloser
maxlabels is the suffix length that corresponds to the latest
NXDOMAIN response.  minlabels is the suffix length that corresponds
to longest found existing name.

(cherry picked from commit 67f31c5046)
2024-10-14 23:55:13 +00:00
Matthijs Mekking
fdeb456341 Small keymgr improvement
When a key is to be purged, don't run the key state machinery for it.

(cherry picked from commit af54e3dadc)
2024-10-14 13:54:09 +00:00
Matthijs Mekking
4091177181 Verify new key files before running keymgr
Prior to running the keymgr, first make sure that existing keys
are present in the new keylist. If not, treat this as an operational
error where the keys are made offline (temporarily), possibly unwanted.

(cherry picked from commit 5fdad05a8a)
2024-10-14 13:54:09 +00:00
Matthijs Mekking
60bd3bc051 Revert "fix: chg: Improve performance when looking for the closest encloser"
The 9.18 code does not have the rbtdb refactoring. Rather than
backporting from MR !9611, this reverts directly from commit
5d81a258e3.
2024-10-10 14:26:13 +02:00
Ondřej Surý
7ad2d6e986
Don't enable SO_REUSEADDR on outgoing UDP sockets
Currently, the outgoing UDP sockets have enabled
SO_REUSEADDR (SO_REUSEPORT on BSDs) which allows multiple UDP sockets to
bind to the same address+port.  There's one caveat though - only a
single (the last one) socket is going to receive all the incoming
traffic.  This in turn could lead to incoming DNS message matching to
invalid dns_dispatch and getting dropped.

Disable setting the SO_REUSEADDR on the outgoing UDP sockets.  This
needs to be done explicitly because `uv_udp_open()` silently enables the
option on the socket.

(cherry picked from commit eec30c33c2)
2024-10-02 15:20:28 +02:00
Ondřej Surý
5bac885ace
Use release memory ordering when incrementing reference counter
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value for both calls
could be same when both calls are executed at the same time.

(cherry picked from commit 88227ea665)
2024-10-02 09:09:03 +02:00
Mark Andrews
b1cf7997a7 Store static-stub addresses seperately in the adb
Static-stub address and addresses from other sources where being
mixed together resulting in static-stub queries going to addresses
not specified in the configuration or alternatively static-stub
addresses being used instead of the real addresses.

(cherry picked from commit b3a2c790f3)
2024-10-01 15:30:17 +10:00