Commit graph

16094 commits

Author SHA1 Message Date
Aram Sargsyan
14915b0241 Redesign the unreachable primaries cache
The cache for unreachable primaries was added to BIND 9 in 2006 via
1372e172d0. It features a 10-slot LRU
array with 600 seconds (10 minutes) fixed delay. During this time, any
primary with a hiccup would be blocked for the whole block duration
(unless overwritten by a different entry).

As this design is not very flexible (i.e. the fixed delay and the fixed
amount of the slots), redesign it based on the badcache.c module, which
was implemented earlier for a similar mechanism.

The differences between the new code and the badcache module were large
enough to create a new module instead of trying to make the badcache
module universal, which could complicate the implementation.

The new design implements an exponential backoff for entries which are
added again soon after expiring, i.e. the next expiration happens in
double the amount of time of the previous expiration, but in no more
time than the defined maximum value.

The initial and the maximum expiration values are hard-coded, but, if
required, it should be trivial to implement configurable knobs.
2025-06-04 09:16:35 +00:00
Evan Hunt
b8f325ae01 Add support for zone templates
A "template" statement can contain the same configuration clauses
as a "zone" statement.  A "zone" statement can now reference a
template, and all the clauses in that template will be used as
default values for the zone. For example:

    template primary {
        type primary;
        file "$name.db";
        initial-file "primary.db";
    };

    zone example.com {
        template primary;
        file "different-name.db"; // overrides the template
    };
2025-06-03 12:03:07 -07:00
Evan Hunt
598ae3f63c Allow zone names to be generated parametrically
Special tokens can now be specified in a zone "file" option
in order to generate the filename parametrically. The first
instead of "$name" in the "file" option is replaced with the
zone origin, the first instance of "$type" is replaced with the
zone type (i.e., primary, secondary, etc), and the first instance
of "$view" is replaced with the view name..

This simplifies the creation of zones using initial-file templates.
For example:

   $ rndc addzone <zonename> \
     { type primary; file "$name.db"; initial-file "template.db"
2025-06-03 12:03:07 -07:00
Evan Hunt
60b129da25 Add zone "initial-file" option
When loading a primary zone for the first time, if the zonefile
does not exist but an "initial-file" option has been set, then a
new file will be copied into place from the path specified by
"initial-file".

This can be used to simplify the process of adding new zones. For
instance, a template zonefile could be used by running:

    $ rndc addzone example.com \
        '{ type primary; file "example.db"; initial-file "template.db"; };'
2025-06-03 12:03:07 -07:00
Evan Hunt
0b8f943a6a normalize syntax checks between named and libisccfg
there were some duplicated syntax checks in named_zone_configure()
that are no longer needed, now that we perform those same checks
using isccfg_check_zoneconf().

there were also some syntax checks that were *only* in
named_zone_configure(), which have now been moved to
isccfg_check_zoneconf(). test cases for them have been
added to the checkconf system test.
2025-06-03 11:15:54 -07:00
Evan Hunt
2d57c1e737 call zone syntax checks when running rndc addzone/modzone
the function that checks zone syntax in libisccfg was previously
only called when loading named.conf, not when parsing an an
"rndc addzone" or "rndc modzone" command. this has been corrected.

note that some checks are still skipped: those that check for
duplication of filenames, key directories, etc.  to fix this, we'd need
to export the symbol tables that are set up when loading named.conf and
preserve them so they could be reused later.
2025-06-03 11:15:40 -07:00
Miltos Allamanis
c04b840260
Fix builds for the OSS-Fuzz project
Since 70b1777d8a was commited the OSS-Fuzz
build was broken because the `chunk_get_raw()` was not updated in the
`FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION`-enabled area.  Add the `size`
argument to the fuzzing version of the `chunk_get_raw()` function.
2025-06-03 18:41:45 +02:00
Michal Nowak
b5e7d96f0a Allow commandline.c to compile on Solaris
commandline.c failed to compile on Solaris because NAME_MAX was
undefined.  Include 'isc/dir.h' which defines NAME_MAX for platforms
that don't define it.

    In file included from commandline.c:54:
    ./include/isc/commandline.h:31:38: error: 'NAME_MAX' undeclared here (not in a function)
       31 | extern char isc_commandline_progname[NAME_MAX];
          |                                      ^~~~~~~~
2025-06-02 09:00:48 +00:00
Aydın Mercan
408190fd3b
use proper flexible arrays in rrl
The single-element array hack can trip newer sanitizers or fortification
levels.
2025-05-30 10:44:49 +00:00
Aydın Mercan
23d70bde6c
add attribute macro for counted_by
Using C23 attributes for `counted_by` is broken with clang.

`__has_attribute` is used since `__has_c_attribute` only works with C23
attributes, (`gnu::counted_by`/`clang::counted_by`)
2025-05-30 08:04:49 +00:00
Mark Andrews
521bf1d50f Allow keystore.c to compile on Solaris
keystore.c failed to compile on Solaris because NAME_MAX was
undefined.  Include 'isc/dir.h' which defines NAME_MAX for platforms
that don't define it.
2025-05-30 15:14:38 +10:00
Mark Andrews
72cd6e8591 Silence tainted scalar in client.c
Coverity detected that 'optlen' was not being checked in 'process_opt'.
This is actually already done when the OPT record was initially
parsed.  Add an INSIST to silence Coverity as is done in message.c.
2025-05-29 06:26:37 +00:00
Ondřej Surý
15ddacbf17
Remove spurious zconf.h include
The #include <zconf.h> got spuriously included into isc_commandline
unit.  The #include <limits.h> needs to be used instead.
2025-05-29 06:34:08 +02:00
Ondřej Surý
a676551395
Unify handling of the program name in all the utilities
There were several methods how we used 'argv[0]'.  Some programs had a
static value, some programs did use isc_file_progname(), some programs
stripped 'lt-' from the beginning of the name.  And some used argv[0]
directly.

Unify the handling and all the variables into isc_commandline_progname
that gets populated by the new isc_commandline_init(argc, argv) call.
2025-05-29 06:17:32 +02:00
Ondřej Surý
7f498cc60d
Give every memory pool a name
Instead of giving the memory pools names with an explicit call to
isc_mempool_setname(), add the name to isc_mempool_create() call to have
all the memory pools an unconditional name.
2025-05-29 05:46:46 +02:00
Ondřej Surý
4e79e9baae
Give every memory context a name
Instead of giving the memory context names with an explicit call to
isc_mem_setname(), add the name to isc_mem_create() call to have all the
memory contexts an unconditional name.
2025-05-29 05:46:46 +02:00
Evan Hunt
dd9a685f4a simplify code around isc_mem_put() and isc_mem_free()
it isn't necessary to set a pointer to NULL after calling
isc_mem_put() or isc_mem_free(), because those macros take
care of it automatically.
2025-05-28 17:22:32 -07:00
Ondřej Surý
5d264b3329
Set name for all the isc_mem context
The memory context for isc_managers and dst_api units had no name and
that was causing trouble with the statistics channel output.  Set the
name for the two memory context that were missing a proper name.
2025-05-28 21:27:13 +02:00
Aram Sargsyan
874ca5ca2f Prepare a zone for shutting down when deleting it from a view
After b171cacf4f, a zone object can
remain in the memory for a while, until garbage collection is run.
Setting the DNS_ZONEFLG_EXITING flag should prevent the zone
maintenance function from running while it's in that state.
Otherwise, a secondary zone could initiate a zone transfer after
it had been deleted.
2025-05-28 16:59:05 +00:00
Aram Sargsyan
f4cd307c6b Emit a ISC_R_CANCELED result instead of ISC_R_SHUTTINGDOWN
When request manager shuts down, it also shuts down all its ongoing
requests. Currently it calls their callback functions with a
ISC_R_SHUTTINGDOWN result code for the request. Since a request
manager can shutdown not only during named shutdown but also during
named reconfiguration, instead of sending ISC_R_SHUTTINGDOWN result
code send a ISC_R_CANCELED code to avoid confusion and errors with
the expectation that a ISC_R_SHUTTINGDOWN result code can only be
received during actual shutdown of named.

All the callback functions which are passed to either the
dns_request_create() or the dns_request_createraw() functions have
been analyzed to confirm that they can process both the
ISC_R_SHUTTINGDOWN and ISC_R_CANCELED result codes. Changes were
made where it was necessary.
2025-05-28 16:20:13 +00:00
Aram Sargsyan
b07ec4f0b3 Add a debug log in zone.c:refresh_callback()
The new debug message logs the request result in the SOA request
callback function.
2025-05-28 16:20:13 +00:00
Aram Sargsyan
228e441328 Fix a zone refresh bug in zone.c:refresh_callback()
When the zone.c:refresh_callback() callback function is called during
a SOA request before a zone transfer, it can receive a
ISC_R_SHUTTINGDOWN result for the sent request when named is shutting
down, and in that case it just destroys the request and finishes the
ongoing transfer, without clearing the DNS_ZONEFLG_REFRESH flag of the
zone. This is alright when named is going to shutdown, but currently
the callback can get a ISC_R_SHUTTINGDOWN result also when named is
reconfigured during the ongoibg SOA request. In that case, leaving the
DNS_ZONEFLG_REFRESH flag set results in the zone never being able
to refresh again, because any new attempts will be caneled while
the flag is set. Clear the DNS_ZONEFLG_REFRESH flag on the 'exiting'
error path of the callback function.
2025-05-28 16:20:13 +00:00
Evan Hunt
8d065fd3e1 add DNS_DBITERATOR_FOREACH and DNS_RDATASETITER_FOREACH
when iterating databases, use DNS_DBITERATOR_FOREACH and
DNS_DNSRDATASETITER_FOREACH macros where possible.
2025-05-27 21:08:09 -07:00
Evan Hunt
24d077afb0 add CFG_LIST_FOREACH macro
replace the pattern `for (elt = cfg_list_first(x); elt != NULL;
elt = cfg_list_next(elt))` with a new `CFG_LIST_FOREACH` macro.
2025-05-27 21:08:09 -07:00
Evan Hunt
f10f5572ac add DNS_RDATASET_FOREACH macro
replace the pattern `for (result = dns_rdataset_first(x); result ==
ISC_R_SUCCES; result = dns_rdataset_next(x)` with a new
`DNS_RDATASET_FOREACH` macro throughout BIND.
2025-05-27 21:08:09 -07:00
Evan Hunt
6a6ef103dd import_rdataset() can't fail
the import_rdataset() function can't return any value other
than ISC_R_SUCCESS, so it's been changed to void and its callers
don't rely on its return value any longer.
2025-05-27 20:56:54 -07:00
Evan Hunt
c437da59ee correct the DbC assertions in message.c
the comments for some calls in the dns_message API specified
requirements which were not actually enforced in the functions.

in most cases, this has now been corrected by adding the missing
REQUIREs. in one case, the comment was incorrect and has been
revised.
2025-05-27 23:11:04 +00:00
Evan Hunt
8487e43ad9 make all ISC_LIST_FOREACH calls safe
previously, ISC_LIST_FOREACH and ISC_LIST_FOREACH_SAFE were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. ISC_LIST_FOREACH is now also
safe, and the separate _SAFE macro has been removed.

similarly, the ISC_LIST_FOREACH_REV macro is now safe, and
ISC_LIST_FOREACH_REV_SAFE has also been removed.
2025-05-23 13:09:10 -07:00
Alessio Podda
d7064c9b88 Tune min and max chunk size
Before implementing adaptive chunk sizing, it was necessary to ensure
that a chunk could hold up to 48 twigs, but the new logic will size-up
new chunks to ensure that the current allocation can succeed.

We exploit the new logic in two ways:
 - We make the minimum chunk size smaller than the old limit of 2^6,
   reducing memory consumption.
 - We make the maximum chunk size larger, as it has been observed that
   it improves resolver performance.
2025-05-22 15:19:48 -07:00
alessio
70b1777d8a Adaptive memory allocation strategy for qp-tries
qp-tries allocate their nodes (twigs) in chunks to reduce allocator
pressure and improve memory locality. The choice of chunk size presents
a tradeoff: larger chunks benefit qp-tries with many values (as seen
in large zones and resolvers) but waste memory in smaller use cases.

Previously, our fixed chunk size of 2^10 twigs meant that even an
empty qp-trie would consume 12KB of memory, while reducing this size
would negatively impact resolver performance.

This commit implements an adaptive chunking strategy that:
 - Tracks the size of the most recently allocated chunk.
 - Doubles the chunk size for each new allocation until reaching a
   predefined maximum.

This approach effectively balances memory efficiency for small tries
while maintaining the performance benefits of larger chunk sizes for
bigger data structures.

This commit also splits the callback freeing qpmultis into two
phases, one that frees the underlying qptree, and one that reclaims
the qpmulti memory. In order to prevent races between the qpmulti
destructor and chunk garbage collection jobs, the second phase is
protected by reference counting.
2025-05-22 15:19:27 -07:00
Michał Kępień
c9bf5df999 Merge tag 'v9.21.8' 2025-05-21 21:23:09 +02:00
Ondřej Surý
8171bf01ed
Deprecate max-rsa-exponent-size, always use 4096 instead
The `max-rsa-exponent-size` could limit the exponents of the RSA
public keys during the DNSSEC verification.  Instead of providing
a cryptic (not cryptographic) knob, hardcode the max exponent to
be 4096 (the theoretical maximum for DNSSEC).
2025-05-21 00:50:08 +02:00
Ondřej Surý
841b25fb62
Cleanup the DST cryptographic API
The DST API has been cleaned up, duplicate functions has been squashed
into single call (verify and verify2 functions), and couple of unused
functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).
2025-05-20 09:52:35 +02:00
Aram Sargsyan
e42d6b4810 Implement a new 'notify-defer' configuration option
This new option sets the delay, in seconds, to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration.
2025-05-15 12:24:13 +00:00
Aram Sargsyan
d79b14ff5d Update the dns_zone_setnotifydelay() function's documentation
Add a note that the delay is in seconds.
2025-05-15 12:21:30 +00:00
Aram Sargsyan
62f66c0be0 Delete the unused dns_zone_getnotifydelete() function
The function is unused, delete it.
2025-05-15 12:21:30 +00:00
Evan Hunt
a1e9b885d2
Prevent assertion when processing TSIG algorithm
In a previous change, the "algorithm" value passed to
dns_tsigkey_create() was changed from a DNS name to an integer;
the name was then chosen from a table of known algorithms. A
side effect of this change was that a query using an unknown TSIG
algorithm was no longer handled correctly, and could trigger an
assertion failure.  This has been corrected.

The dns_tsigkey struct now stores the signing algorithm
as dst_algorithm_t value 'alg' instead of as a dns_name,
but retains an 'algname' field, which is used only when the
algorithm is DST_ALG_UNKNOWN.  This allows the name of the
unrecognized algorithm name to be returned in a BADKEY
response.
2025-05-08 22:45:48 +02:00
Mark Andrews
9bb93520f1
Wrong NSEC3 chosen for NO QNAME proof
When we optimised the closest encloser NSEC3 discovery the maxlabels
variable was used in the binary search. The updated value was later
used to add the NO QNAME NSEC3 but that block of code needed the
original value. This resulted in the wrong NSEC3 sometimes being
chosen to perform this role.
2025-05-08 21:48:11 +02:00
Mark Andrews
c0fcb9fd0e Fix the error handling of put_yamlstr calls
The return value was sometimes being ignored when it shouldn't
have been.
2025-04-30 15:39:52 +10:00
Your Name
59086c33e2 Call rcu_barrier earlier in the destructor
If a call_rcu thread is running, there is a possible race condition
where the destructors run before all call_rcu callbacks have finished
running. This can happen, for example, if the call_rcu callback tries to
log something after the logging context has been torn down.

In !10394, we tried to counter this by explicitely creating a call_rcu
thread an shutting it down before running the destructors, but it is
possible for things to "slip" and end up on the default call_rcu thread.

As a quickfix, this commit moves an rcu_barrier() that was in the mem
context destructor earlier, so that it "protects" all libisc
destructors.
2025-04-25 13:13:44 +02:00
Aram Sargsyan
74a8acdc8d Separate the single setter/getter functions for TCP timeouts
Previously all kinds of TCP timeouts had a single getter and setter
functions. Separate each timeout to its own getter/setter functions,
because in majority of cases only one is required at a time, and it's
not optimal expanding those functions every time a new timeout value
is implemented.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
b9e9b98d55 Use the configured TCP connect timeout in checkds_send_toaddr()
The checkds_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, however, with TCP named has configurable
timeout values. Slightly refactor the timeouts calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
daede6876b Use the configured TCP connect timeout in notify_send_toaddr()
The notify_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, however, with TCP named has configurable
timeout values. Slightly refactor the timeouts calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
70ad94257d Implement tcp-primaries-timeout
The new 'tcp-primaries-timeout' configuration option works the same way
as the existing 'tcp-initial-timeout' option, but applies only to the
TCP connections made to the primary servers, so that the timeout value
can be set separately for them. The default is 15 seconds.

Also, while accommodating zone.c's code to support the new option, make
a light refactoring with the way UDP timeouts are calculated by using
definitions instead of hardcoded values.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
e1a415b412 Fix a date race in qpcache_addrdataset()
The 'qpnode->nsec' structure member isn't protected by a lock and
there's a data race between the reading and writing parts in the
qpcache_addrdataset() function. Use a node read lock for accessing
'qpnode->nsec' in qpcache_addrdataset(). Add an additional
'qpnode->nsec != DNS_DB_NSEC_HAS_NSEC' check under a write lock
to be sure that no other competing thread changed it in the time
when the read lock is unlocked and a write lock is not acquired
yet.
2025-04-23 13:02:43 +00:00
Aram Sargsyan
412aa881f2 Fix a serve-stale issue with a delegated zone
When 'stale-answer-client-timeout' is 0, named is allowed to return
a stale answer immediately, while also initiating a new query to get
the real answer. This mode is activated in ns__query_start() by setting
the 'qctx->options.stalefirst' optoin to 'true' before calling the
query_lookup() function, but not when the zone is known to be
authoritative to the server. When the zone is authoritative, and
query_looup() finds out that the requested name is a delegation,
then before proceeding with the query, named tries to look it up
in the cache first. Here comes the issue that it doesn't consider
enabling 'qctx->options.stalefirst' in this case, and so the
'stale-answer-client-timeout 0' setting doesn't work for those
delegated zones - instead of immediately returning the stale answer
(if it exists), named tries to resolve it.

Fix this issue by enabling 'qctx->options.stalefirst' in the
query_zone_delegation() function just before named looks up the name
in the cache using a new query_lookup() call. Also, if nothing was
found in the cache, don't initiate another query_lookup() from inside
query_notfound(), and let query_notfound() do its work, i.e. it will
call query_delegation() for further processing.
2025-04-23 11:46:16 +00:00
Mark Andrews
5eeb31f0b9 Split EDNS COOKIE YAML into separate parts
Split the YAML display of the EDNS COOKIE option into CLIENT and SERVER
parts.  The STATUS of the EDNS COOKIE in the reply is now a YAML element
rather than a comment.
2025-04-22 09:24:18 +10:00
Mark Andrews
07c28652a3 Fix EDNS TCP-KEEPALIVE option YAML output
There was missing white space between the option name and its value.
2025-04-22 09:24:18 +10:00
Mark Andrews
81334113c3 Fix EDNS LLQ option YAML output
The EDNS LLQ option was not being emitted as valid YAML. Correct
the output to be valid YAML with each field of the LLQ being
individually selectable.
2025-04-22 09:24:18 +10:00
Mark Andrews
27e8732c17 Change the EDNS KEY-TAG YAML output format
When using YAML, print the EDNS KEY-TAG as an array of integers
for easier machine parsing. Check the validity of the YAML output.
2025-04-22 09:24:18 +10:00
Mark Andrews
378bc7cfa6 Use YAML comments for durations rather than parentheses
This will allow the values to be parsed using standard yaml processing
tools, and still provide the value in a human friendly form.
2025-04-22 09:24:18 +10:00
Mark Andrews
68cdc4774c Change the name and YAML format of EDNS UL
The offical EDNS option name for "UL" is "UPDATE-LEASE".  We now
emit "UPDATE-LEASE" instead of "UL", when printing messages, but
"UL" has been retained as an alias on the command line.

Update leases consist of 1 or 2 values, LEASE and KEY-LEASE.  These
components are now emitted separately so they can be easily extracted
from YAML output.  Tests have been added to check YAML correctness.
2025-04-22 09:24:18 +10:00
Mark Andrews
280e9b7cf4 Add YAML escaping where needed
When rendering text, such as domain names or the EXTRA-TEXT
field of the EDE option, backslashes and quotation marks must
be escaped to ensure that the emitted message is valid YAML.
2025-04-22 09:24:18 +10:00
Mark Andrews
e7ef4e41eb Collapse common switch cases when emitting EDNS options
The CHAIN and REPORT-CHANNEL EDNS options are both domain names, so they
can be combined.  THE CLIENT-TAG and SERVER-TAG EDNS options are both 16
bit integers, so they can be combined.
2025-04-22 09:23:53 +10:00
Ondřej Surý
bf1b8824ac
Disable own memory context for libxml2 on macOS 15.4 Sequoia
The custom allocation API for libxml2 is deprecated starting in macOS
Sequoia 15.4, iOS 18.4, tvOS 18.4, visionOS 2.4, and tvOS 18.4.

Disable the memory function override for libxml2 when
LIBXML_HAS_DEPRECATED_MEMORY_ALLOCATION_FUNCTIONS is defined as Apple
broke the system-wide libxml2 starting with macOS Sequoia 15.4.
2025-04-18 20:16:13 +02:00
Nicki Křížek
c5707cb75a Merge tag 'v9.21.7' 2025-04-16 15:23:14 +02:00
Ondřej Surý
30d4939382
Move the call_rcu_thread explicit create and shutdown to isc_loop
When isc__thread_initialize() is called from a library constructor, it
could be called before we fork the main process.  This happens with
named, and then we have the call_rcu_thread attached to the pre-fork
process and not the post-fork process, which means that the initial
process will never shutdown, because there's noone to tell it so.

Move the isc__thread_initialize() and isc__thread_shutdown() to the
isc_loop unit where we call it before creating the extra thread and
after joining all the extra threads respectively.
2025-04-16 12:30:14 +02:00
Ondřej Surý
6ed821beb4
Reduce QPDB_VIRTUAL to 10 seconds
The *DB_VIRTUAL value was introduced to allow the clients (presumably
ns_clients) that has been running for some time to access the cached
data that was valid at the time of its inception.  The default value
of 5 minutes is way longer than longevity of the ns_client object as
the resolver will give up after 2 minutes.

Reduce the value to 10 seconds to accomodate to honour the original
more closely, but still allow some leeway for clients that started some
time in the past.

Our measurements show that even setting this value to 0 has no
statistically significant effect, thus the value of 10 seconds should be
on the safe side.
2025-04-16 11:21:38 +02:00
Mark Andrews
0d9cab1555 Process NSID and DNS COOKIE options when returning BADVERS
This will help identify the broken server if we happen to break
EDNS version negotiation.  It will also help protect the client
from spoofed BADVERSION responses.
2025-04-15 02:38:37 +00:00
Mark Andrews
ca7355b7d0 Fix OID check for PRIVATEOID keys and signatures
We were failing to account for the length byte before the OID.
See RFC 4034.

   Algorithm number 254 is reserved for private use and will never be
   assigned to a specific algorithm.  The public key area in the DNSKEY
   RR and the signature area in the RRSIG RR begin with an unsigned
   length byte followed by a BER encoded Object Identifier (ISO OID) of
   that length.  The OID indicates the private algorithm in use, and the
   remainder of the area is whatever is required by that algorithm.
   Entities should only use OIDs they control to designate their private
   algorithms.
2025-04-03 23:00:16 +11:00
Mark Andrews
90b2f94d9b Don't cache lack of EDNS based on received responses
Caching prevents server upgrades being detected in a timely manner
and it can also prevent DNSSEC responses being requested.
2025-04-03 10:53:35 +02:00
Ondřej Surý
2988ebae21
Don't copy EDE codes if source is same as destination
If the nested DNS validator ends up in the same fetch because of the
loops, the code could be copying the EDE codes from the same source EDE
context as the destination EDE context.  Skip copying the EDE codes if
the source and the destination is the same.
2025-04-02 18:06:52 +02:00
Ondřej Surý
fe48290140
Don't pass edectx from fetch_and_forget
Pass NULL as edectx for the fetch_and_forget() fetches as nobody
is reading the EDE contexts and it can mess the main client buffer.
2025-04-02 17:38:31 +02:00
Ondřej Surý
d7593196a1
Add static ede context into each validator layer
Instead of passing the edectx from the fetchctx into all subvalidators,
make the ede context ownership explict for dns_resolver_createfetch()
callers, and copy the ede result codes from the children validators to
the parent when finishing the validation process.
2025-04-02 17:32:50 +02:00
alessio
4017a40b1d Remove zero initialization of large buffers
Profiles show that an high amount of CPU time spent in memset.
By removing zero initalization of certain large buffers we improve
performance in certain authoritative workloads.
2025-04-02 16:24:31 +02:00
Evan Hunt
ad7f744115 use ISC_LIST_FOREACH in more places
use the ISC_LIST_FOREACH pattern in places where lists had
been iterated using a different pattern from the typical
`for` loop: for example, `while (!ISC_LIST_EMPTY(...))` or
`while ((e = ISC_LIST_HEAD(...)) != NULL)`.
2025-03-31 13:45:14 -07:00
Evan Hunt
522ca7bb54 switch to ISC_LIST_FOREACH everywhere
the pattern `for (x = ISC_LIST_HEAD(...); x != NULL; ISC_LIST_NEXT(...)`
has been changed to `ISC_LIST_FOREACH` throughout BIND, except in a few
cases where the change would be excessively complex.

in most cases this was a straightforward change. in some places,
however, the list element variable was referenced after the loop
ended, and the code was refactored to avoid this necessity.

also, because `ISC_LIST_FOREACH` uses typeof(list.head) to declare
the list elements, compilation failures can occur if the list object
has a `const` qualifier.  some `const` qualifiers have been removed
from function parameters to avoid this problem, and where that was not
possible, `UNCONST` was used.
2025-03-31 13:45:10 -07:00
Evan Hunt
5cff8f9017 implicitly declare list elements in ISC_LIST_FOREACH macros
ISC_LIST_FOREACH and related macros now use 'typeof(list.head)' to
declare the list elements automatically; the caller no longer needs
to do so.

ISC_LIST_FOREACH_SAFE also now implicitly declares its own 'next'
pointer, so it only needs three parameters instead of four.
2025-03-31 13:37:47 -07:00
Mark Andrews
31968a7534 Remove dead code in dns_message_sectiontotext
Following the merge of !10302 this code to reset the result code
on ISC_R_NOMORE is no longer needed.
2025-03-31 14:37:03 +00:00
Ondřej Surý
c27fce26e6
Drop readline alternatives in favor of libedit
The libedit is now ubiquitous and has a licences compatible with
MPL 2.0.  Drop readline (GPL 3.0) and editline (obsolete) support
in favor of libedit.
2025-03-31 15:20:40 +02:00
Artem Boldariev
2592e309c7 Dispatch: carefully check if the server name for SNI is a hostname
Previously the code would not check if the string intended to be used
for SNI is a hostname.
2025-03-31 14:23:19 +03:00
Artem Boldariev
1f199ee606 Add isc_tls_valid_sni_hostname()
Add a function that checks if a 'hostname' is not a valid IPv4 or IPv6
address. Returns 'true' if the hostname is likely a domain name, and
'false' if it represents an IP address.
2025-03-31 14:23:19 +03:00
Colin Vidal
4eb2cd364a copy __FILE__ when allocating memory
When allocating memory under -m trace|record, the __FILE__ pointer is
stored, so it can be printed out later in order to figure out in which
file an allocation leaked. (among others, like the line number).

However named crashes when called with -m record and using a plugin
leaking memory. The reason is that plugins are unloaded earlier than
when the leaked allocations are dumped (obviously, as it's done as late
as possible). In such circumstances, __FILE__ is dangling because the
dynamically loaded library (the plugin) is not in memory anymore.

Fix the crash by systematically copying the __FILE__ string
instead of copying the pointer. Of course, this make each allocation to
consume a bit more memory (and longer, as it needs to calculate the
length of __FILE__) but this occurs only under -m trace|record debugging
flags.

In term of unit test, because grepping in C is not fun, and because the
whole "syntax" of the dump output is tested in other tests, this simply
search for a substring in the whole buffer to make sure the expected
allocations are found.
2025-03-27 10:44:17 +01:00
alessio
2f27d66450 Refactor to use list-like macro for message sections
In the code base it is very common to iterate over all names in a message
section and all rdatasets for each name, but various idioms are used for
iteration.

This commit standardizes them as much as possible to a single idiom,
using the macro MSG_SECTION_FOREACH, similar to the existing
ISC_LIST_FOREACH.
2025-03-27 03:09:46 +01:00
Evan Hunt
3188b1c055 move application of dns64 to a separate function
the code in query_dns64() that applies the dns64 prefixes to
an A rdataset has been moved into the dns_dns64 module, and
dns_dns64_destroy() now unlinks the dns64 object from its
containing list. with these changes, we no longer need the
list-manipulation API calls dns_dns64_next() and
dns_dns64_unlink().
2025-03-26 23:30:38 +00:00
Evan Hunt
db8c11ea0b dns_message_gettemp*() resets objects
callers of dns_message_gettemprdata() and dns_message_getrdatalist()
initialize the objects after retrieving them. this is no longer
necessary.
2025-03-26 23:30:38 +00:00
Ondřej Surý
1233dc8a61 Add isc_sieve unit implementing SIEVE-LRU algorithm
This is the core implementation of the SIEVE algorithm described in the
following paper:

  Zhang, Yazhuo, Juncheng Yang, Yao Yue, Ymir Vigfusson, and K V
  Rashmi. “SIEVE Is Simpler than LRU: An Efficient Turn-Key Eviction
  Algorithm for Web Caches,” n.d.. available online from
  https://junchengyang.com/publication/nsdi24-SIEVE.pdf
2025-03-26 15:36:33 -07:00
Ondřej Surý
e8a1949566
Remove lock upgrading from the hot path in the cache
In QPcache, there were two places that tried to upgrade the lock.  In
clean_stale_header(), the code would try to upgrade the lock and cleanup
the header, and in qpzonode_release(), the tree lock would be optionally
upgraded, so we can cleanup the node directly if empty.  These
optimizations are not needed and they have no effect on the performance.
2025-03-25 10:57:19 +01:00
Ondřej Surý
3ef9b09620
Fix invalid cache-line padding for qpcache buckets
The isc_queue_t was missing in the calculation of the required
padding size inside the qpcache bucket structure.
2025-03-25 10:56:21 +01:00
Aram Sargsyan
fb16080280 Don't call dst_key_free(keyp) on an invalid 'keyp'
After a refactoring in 2e6107008d the
dst_key_free() call is invalid and can cause an assertion. Remove the
dst_key_free() call.
2025-03-25 08:19:45 +00:00
Evan Hunt
5c21576f82 Don't check DNS_KEYFLAG_NOAUTH
All DNSKEY keys are able to authenticate. The DNS_KEYTYPE_NOAUTH
(and DNS_KEYTYPE_NOCONF) flags were defined for the KEY rdata type,
and are not applicable to DNSKEY.

Previously, because the DNSKEY implementation was built on top of
KEY, the NOAUTH flag prevented authentication in DNSKEYs as well.
This has been corrected.
2025-03-25 06:38:25 +00:00
Evan Hunt
fee1ba40df Tidy up keyvalue.h definitions
Use enums for DNS_KEYFLAG_, DNS_KEYTYPE_, DNS_KEYOWNER_, DNS_KEYALG_,
and DNS_KEYPROTO_ values.

Remove values that are never used.

Eliminate the obsolete DNS_KEYFLAG_SIGNATORYMASK. Instead, add three
more RESERVED bits for the key flag values that it covered but which
were never used.
2025-03-25 06:38:25 +00:00
Matthijs Mekking
2c52aea3dc Remove dns_qpmulti_lockedread declaration
This function was removed in 6217e434b5
but not from the header file.
2025-03-25 05:58:31 +00:00
Evan Hunt
36cf1c6a5b when forwarding, try with CD=0 first
when sending a query to a forwarder for a name within a secure domain,
the first query is now sent with CD=0. when the forwarder itself
is validating, this will give it a chance to detect bogus data and
replace it with valid data before answering. this reduces our chances
of being stuck with data that can't be validated.

if the forwarder returns SERVFAIL to the initial query, the query
will be repeated with CD=1, to allow for the possibility that the
forwarder's validator is faulty or that the bogus answer is covered
by an NTA.

note: previously, CD=1 was only sent when the query name was in a
secure domain. today, validating servers have a trust anchor at the
root by default, so virtually all queries are in a secure domain.
therefore, the code has been simplified.  as long as validation is
enabled, any forward query that receives a SERVFAIL response will be
retried with CD=1.
2025-03-24 17:33:11 -07:00
Mark Andrews
78de8afd47 Return raw zone serial for inline zones 2025-03-24 22:16:09 +00:00
Mark Andrews
9428e32b13 Add an option to disable ZONEVERSION responses
The option provide-zoneversion controls whether ZONEVERSION is
returned.  This applies to primary, secondary and mirror zones.
2025-03-24 22:16:09 +00:00
Mark Andrews
a4f5c1d5f3 Add option request-zoneversion
This can be set at the option, view and server levels and causes
named to add an EDNS ZONEVERSION option to requests.  Replies are
logged to the 'zoneversion' category.
2025-03-24 22:16:09 +00:00
Mark Andrews
e9a87f0389 Return EDNS ZONEVERSION if requested
If there was an EDNS ZONEVERSION option in the DNS request and the
answer was from a zone, return the zone's serial and number of
labels excluding the root label with the type set to 0 (ZONE-SERIAL).
2025-03-24 22:16:09 +00:00
Mark Andrews
ebe751f88c Add dns_zone_getzoneversion
Returns the EDNS ZONEVERSION for the zone.  Return the database
specific version otherwise return a type 0 version (serial).
2025-03-24 22:16:09 +00:00
Mark Andrews
5c126a7e66 Add dns_db_getzoneversion
Provides a database method to return a database specific EDNS
ZONEVERSION option.  The default EDNS ZONEVERSION is serial.
2025-03-24 22:16:09 +00:00
Mark Andrews
998b93acca Add EDNS ZONEVERSION option counter 2025-03-24 22:16:09 +00:00
Mark Andrews
ec3cbc5468 Extend message code to display ZONEVERSION 2025-03-24 22:16:09 +00:00
Mark Andrews
b2c2b6755e Check EDNS ZONEVERSION when parsing OPT record 2025-03-24 22:16:09 +00:00
Mark Andrews
8e7229f641 Fix gaining adbname reference
Call dns_adbname_ref before calling dns_resolver_createfetch to
ensure adbname->name remains stable for the life of the fetch.
2025-03-20 23:25:29 +00:00
Evan Hunt
2e6107008d optimize key ID check when searching for matching keys
when searching a DNSKEY or KEY rrset for the key that matches
a particular algorithm and ID, it's a waste of time to convert
every key into a dst_key object; it's faster to compute the key
ID by checksumming the region, and then only do the full key
conversion once we know we've found the correct key.

this optimization was already in use in the validator, but it's
been refactored for code clarity, and is now also used in query.c
and message.c.
2025-03-20 18:22:58 +00:00
Evan Hunt
341b962665 move dns_zonekey_iszonekey() to dns_dnssec module
dns_zonekey_iszonekey() was the only function defined in the
dns_zonekey module, and was only called from one place. it
makes more sense to group this with dns_dnssec functions.
2025-03-20 18:22:58 +00:00
alessio
e1e10adc3a Switch symtab to use fxhash hashing
This merge request resolves some performance regressions introduced
with the change from isc_symtab_t to isc_hashmap_t.

The key improvements are:

1. Using a faster hash function than both isc_hashmap_t and
   isc_symtab_t. The previous implementation used SipHash, but the
   hashflood resistance properties of SipHash are unneeded for config
   parsing.
2. Shrinking the initial size of the isc_hashmap_t used inside
   isc_symtab_t. Symtab is mainly used for config parsing, and the
   when used that way it will have between 1 and ~50 keys, but the
   previous implementation initialized a map with 128 slots.
   By initializing a smaller map, we speed up mallocs and optimize for
   the typical case of few config keys.
3. Slight optimization of the string matching in the hashmap, so that
   the tail is handled in a single load + comparison, instead of byte
   by byte.
   Of the three improvements, this is the least important.
2025-03-20 11:26:09 +01:00
Matthijs Mekking
3e836a87e6 Update Retired and Removed if we update lifetime
If we are updating the lifetime, and it was not set before, also
set/update the Retired and Removed timing metadata.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
6c6b8796d3 Fix keymgr bug wrt setting the next time
Only set the next time the keymgr should run if the value is non zero.
Otherwise we default back to one hour. This may happen if there is one
or more key with an unlimited lifetime.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
8c9d2eb2bf keymgr: also set DeleteCDS when setting PublishCDS
The keymgr never set the expected timing metadata when CDS/CDNSKEY
records for the corresponding key will be removed from the zone. This
is not troublesome, as key states dictate when this happens, but with
the new pytest we use the timing metadata to determine if the CDS and/or
CDNSKEY for the given key needs to be published.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
63edc4435f Fix wrong usage of safety intervals in keymgr
There are a couple of cases where the safety intervals are added
inappropriately:

1. When setting the PublishCDS/SyncPublish timing metadata, we don't
   need to add the publish-safety value if we are calculating the time
   when the zone is completely signed for the first time. This value
   is for when the DNSKEY has been published and we add a safety
   interval before considering the DNSKEY omnipresent.

2. The retire-safety value should only be added to ZSK rollovers if
   there is an actual rollover happening, similar to adding the sign
   delay.

3. The retire-safety value should only be added to KSK rollovers if
   there is an actual rollover happening. We consider the new DS
   omnipresent a bit later, so that we are forced to keep the old DS
   a bit longer.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
ef671919d5 Fix a small keymgr bug
While converting the kasp system test to pytest, I encountered a small
bug in the keymgr code. We retire keys when there is more than one
key matching a 'keys' line from the dnssec-policy. But if there are
multiple identical 'keys' lines, as is the case for the test zone
'checkds-doubleksk.kasp', we retire one of the two keys that have the
same properties.

Fix this by checking if there are double matches. This is not fool proof
because there may be many keys for a few identical 'keys' lines, but it
is good enough for now. In practice it makes no sense to have a policy
that dictates multiple keys with identical properties.
2025-03-20 10:12:16 +00:00
Mark Andrews
bfbaacc9a0 Use reference counted counters for nfail and nvalidations
The fetch context that held these values could be freed while there
were still active pointers to the memory.  Using a reference counted
pointer avoids this.
2025-03-20 09:12:49 +11:00
Aram Sargsyan
830e548111 Fix the resolvers RTT-ranged responses statistics counters
When a response times out the fctx_cancelquery() function
incorrectly calculates it in the 'dns_resstatscounter_queryrtt5'
counter (i.e. >=1600 ms). To avoid this, the rctx_timedout()
function should make sure that 'rctx->finish' is NULL. And in order
to adjust the RTT values for the timed out server, 'rctx->no_response'
should be true. Update the rctx_timedout() function to make those
changes.
2025-03-18 16:20:59 +00:00
Aram Sargsyan
12e7dfa397 Fix resolver responses statistics counter
The resquery_response() function increases the response counter without
checking if the response was successful. Increase the counter only when
the result indicates success.
2025-03-18 16:20:59 +00:00
Ondřej Surý
0d9f58b745 Remove a kludge to process non-authoritative CNAME response
A BIND 8 server could return a non-authoritative answer when a CNAME is
followed.  This is no longer handled as a valid answer.
2025-03-17 23:23:24 +00:00
Ondřej Surý
05d6542e6d Remove the kludges for records in the bad sections
There were kludges to help process responses from authoritative servers
giving RRs in wrong sections (mentioning BIND 8).  These should just go
away and such responses should not be processed.
2025-03-17 23:23:24 +00:00
Ondřej Surý
ff73d37f69 Small cleanup in dns_adb unit 2025-03-17 23:23:24 +00:00
Aram Sargsyan
807ef8545d Implement -T cookiealwaysvalid
When -T cookiealwaysvalid is passed to named, DNS cookie checks for
the incoming queries always pass, given they are structurally correct.
2025-03-17 10:42:47 +00:00
Mark Andrews
d0a59277fb Add missing locks when returning addresses
Add missing locks in dns_zone_getxfrsource4 et al. Addresses CID
468706, 468708, 468741, 468742, 468785 and 468778.

Cleanup dns_zone_setxfrsource4 et al to now return void.

Remove double copies with dns_zone_getprimaryaddr and dns_zone_getsourceaddr.
2025-03-15 04:51:59 +00:00
Evan Hunt
606d30796e use new dns_rdatatype classification functions
modify code to use dns_rdatatype_ismulti(), dns_rdatatype_issig(),
dns_rdatatype_isaddr(), and dns_rdatatype_isalias() where applicable.
2025-03-15 00:27:54 +00:00
Evan Hunt
37ff0aa9c0 convert rdatatype classification routines to inline
turn the dns_rdatatype_is*() functions into static inline
functions in rdata.h.
2025-03-15 00:27:54 +00:00
Evan Hunt
1c51d44d82 add new functions to classify rdata types
- dns_rdatatype_ismulti() returns true if a given type can have
  multiple answers: ANY, RRSIG, or SIG.
- dns_rdatatype_issig() returns true for a signature: RRSIG or SIG.
- dns_rdatatype_isaddr() returns true for an address: A or AAAA.
- dns_rdatatype_isalias() returns true for an alias: CNAME or DNAME.
2025-03-15 00:27:54 +00:00
Evan Hunt
24eaff7adc qpzone.c:step() could ignore rollbacks
the step() function (used for stepping to the prececessor or
successor of a database node) could overlook a node because
there was an rdataset marked IGNORE because it had been rolled
back, covering an active rdataset under it.
2025-03-14 23:19:17 +00:00
Evan Hunt
9cfe9f5eb7 fix handling of revoked keys
when a key is revoked its key ID changes, due to the inclusion
of the "revoke" flag. a collision between this changed key ID and
that of an unrelated public-only key could cause a crash in
dnssec-signzone.
2025-03-14 22:25:44 +00:00
Mark Andrews
de519cd1c9 Don't leak the original QTYPE to parent zone
When performing QNAME minimization, named now sends an NS
query for the original QNAME, to prevent the parent zone from
receiving the QTYPE.

For example, when looking up example.com/A, we now send NS queries
for both com and example.com before sending the A query to the
servers for example.com.  Previously, an A query for example.com
would have been sent to the servers for com.

Several system tests needed to be adjusted for the new query pattern:

- Some queries in the serve-stale test were sent to the wrong server.
- The synthfromdnssec test could fail due to timing issues; this
  has been addressed by adding a 1-second delay.
- The cookie test could fail due to the a change in the count of
  TSIG records received in the "check that missing COOKIE with a
  valid TSIG signed response does not trigger TCP fallback" test case.
- The GL #4652 regression test case in the chain system test depends
  on a particular query order, which no longer occurs when QNAME
  minimization is active. We now disable qname-minimization
  for that test.
2025-03-14 01:01:26 +00:00
Mark Andrews
496f7963cd Fix handling of ISC_R_TIMEOUT in resume_qmin()
If a timeout occurs when sending a QMIN query, QNAME
minimization should be disabled. This now causes a hard
failure in strict mode, or a fallback to non-minimized queries
in relaxed mode.
2025-03-14 01:01:26 +00:00
Mark Andrews
98fc14dc75 Exempt QNAME minimization queries from fetches-per-zone
The calling fetch has already called fcount_incr() for this zone;
calling it again for a QMIN query results in double counting.

When resuming after a QMIN query is answered, however, we do now
ensure before continuing that the fetches-per-zone limit has not
been exceeded.
2025-03-14 01:01:26 +00:00
Colin Vidal
24ffbdcfea add support for EDE 20 (Not Authoritative)
Extended DNS Error message EDE 20 (Not Authoritative) is now sent when
client request recursion (RD) but the server has recursion disabled.

RFC 8914 mention EDE 20 should also be returned if the client doesn't
have the RD bit set (and recursion is needed) but it doesn't apply for
BIND as BIND would try to resolve from the "deepest" referral in
AUTHORITY section. For example, if the client asks for "www.isc.org/A"
but the server only knows the root domain, it will returns NOERROR but
no answer for "www.isc.og/A", just the list of other servers to ask.
2025-03-13 11:16:01 +01:00
Colin Vidal
334ea1269f add support for EDE 7 and 8
Extended DNS Error messages EDE 7 (expired key) and EDE 8 (validity
period of the key not yet started) are now sent in case of such DNSSEC
validation failures.

Refactor the existing validator extended error APIs in order to make it
easy to have a consisdent extra info (with domain/type) in the various
use case (i.e. when the EDE depends on validator state,
validate_extendederror or when the EDE doesn't depend of any state but
can be called directly in a specific flow).
2025-03-13 09:57:09 +01:00
Ondřej Surý
1e4695510a
Revert "fix: dev: Delete dead nodes when committing a new version"
This reverts commit 67255da4b3, reversing
changes made to 74c9ff384e.
2025-03-05 17:46:54 +01:00
Aram Sargsyan
6cd9e4f67c Fix a bug in get_request_transport_type()
When dns_remote_done() is true, calling dns_remote_curraddr() asserts.
Add a dns_remote_curraddr() check before calling dns_remote_curraddr().
2025-03-05 12:18:11 +00:00
Ondřej Surý
1fae6ccea1
Add the call function tracking to isc_mem API
As we already track __func__, __FILE__, __LINE__ triplet in most places,
add the function tracking to the isc_mem tracking API.
2025-03-05 11:17:17 +01:00
Ondřej Surý
eab9fc22e7
Replace attach/detach in isc_mem with refcount implementation
The isc_mem API is one of the most commonly used APIs that didn't
used ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros.  Replace the
implementation of isc_mem_attach(), isc_mem_detach() and
isc_mem_destroy() with the respective macros.

This also removes the legacy isc_mem_destroy() functionality that would
check whether all references had been detached from the memory context
as it doesn't work reliably when using the call_rcu() API.  Instead of
doing this individually, call isc_mem_checkdestroyed(stderr) from the
isc_mem_destroy() macro to keep the extra check that all contexts were
freed when the program is exiting.
2025-03-05 11:17:17 +01:00
Ondřej Surý
552cf64a70
Replace isc_mem_destroy() with isc_mem_detach()
Remove legacy isc_mem_destroy() and just use isc_mem_detach() as
isc_mem_destroy() doesn't play well with call_rcu API.
2025-03-05 11:17:17 +01:00
Mark Andrews
006c5990ce Implement digest_sig and digest_rrsig for ZONEMD
ZONEMD needs to be able to digest SIG and RRSIG records.  The signer
field can be compressed in SIG so we need to call dns_name_digest().
While for RRSIG the records the signer field is not compressed the
canonical form has the signer field downcased (RFC 4034, 6.2).  This
also implies that compare_rrsig needs to downcase the signer field
during comparison.
2025-03-05 18:05:12 +11:00
Ondřej Surý
303c20caf8
Fix the foundname vs dcname madness in qpcache_findzonecut()
The qpcache_findzonecut() accepts two "foundnames": 'foundname' and
'dcname' could be NULL.  Originally, when 'dcname' would be NULL, the
'dcname' would be set to 'foundname'.  Then code like this was present:

    result = find_deepest_zonecut(&search, node, nodep, foundname,
                                  rdataset,
                                  sigrdataset DNS__DB_FLARG_PASS);
    dns_name_copy(foundname, dcname);

Which basically means that we are copying the .ndata over itself for no
apparent reason.  Cleanup the dcname vs foundname usage.

Co-authored-by: Evan Hunt <each@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
2025-03-05 07:49:46 +01:00
alessio
87776a51ae Cleanup dns_opcode_t
Make dns_opcode_t refer directly to the underlying enum, and use
attributes to ensure the underlying enum is the same size as uint16_t.
2025-03-04 18:35:14 +01:00
Mark Andrews
988dc57c8c Call isc__iterated_hash_initialize
The iterated hash implementation needs to be initialised
on the worker thread.  Also clean it up after we are done.
2025-03-04 12:54:39 +00:00
Artem Boldariev
eaad0aefe6 DoH: Bump the active streams processing limit
This commit bumps the total number of active streams (= the opened
streams for which a request is received, but response is not ready) to
60% of the total streams limit.

The previous limit turned out to be too tight as revealed by
longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.
2025-03-03 11:32:29 +02:00
Artem Boldariev
217a1ebd79 DoH: remove obsolete INSIST() check
The check, while not active by default, is not valid since the commit
8b8f4d500d.

See 'if (total == 0) { ...' below branch to understand why.
2025-03-03 11:32:11 +02:00
Artem Boldariev
c5f7968856 DoH: Flush HTTP write buffer on an outgoing DNS message
Previously, the code would try to avoid sending any data regardless of
what it is unless:

a) The flush limit is reached;
b) There are no sends in flight.

This strategy is used to avoid too numerous send requests with little
amount of data. However, it has been proven to be too aggressive and,
in fact, harms performance in some cases (e.g., on longer (≥1h) runs
of "stress:long:rpz:doh+udp:linux:*").

Now, additionally to the listed cases, we also:

c) Flush the buffer and perform a send operation when there is an
outgoing DNS message passed to the code (which is indicated by the
presence of a send callback).

That helps improve performance for "stress:long:rpz:doh+udp:linux:*"
tests.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0e1b02868a DoH: Limit the number of delayed IO processing requests
Previously, a function for continuing IO processing on the next UV
tick was introduced (http_do_bio_async()). The intention behind this
function was to ensure that http_do_bio() is eventually called at
least once in the future. However, the current implementation allows
queueing multiple such delayed requests needlessly. There is currently
no need for these excessive requests as http_do_bio() can requeue them
if needed. At the same time, each such request can lead to a memory
allocation, particularly in BIND 9.18.

This commit ensures that the number of enqueued delayed IO processing
requests never exceeds one in order to avoid potentially bombarding IO
threads with the delayed requests needlessly.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0956fb9b9e DoH: Simplify http_do_bio()
This commit significantly simplifies the code flow in the
http_do_bio() function, which is responsible for processing incoming
and outgoing HTTP/2 data. It seems that the way it was structured
before was indirectly caused by the presence of the missing callback
calls bug, fixed in 8b8f4d500d.

The change introduced by this commit is known to remove a bottleneck
and allows reproducible and measurable performance improvement for
long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests.

Additionally, it fixes a similar issue with potentially missing send
callback calls processing and hardens the code against use-after-free
errors related to the session object (they can potentially occur).
2025-03-03 11:32:11 +02:00
Ondřej Surý
ce7879c924
Remove STATIC_ASSERT variants in favor of the C11 variant
Previously, a gcc < 4.6 shim for _Static_assert() was included.  Such an
old compiler is not supported now anyway, so the macro variant has been
removed in favor of a single definition using _Static_assert().
2025-03-01 07:33:53 +01:00
Ondřej Surý
534069e048
Move locking macros into individual headers
Previously, the LOCK()/UNLOCK() and friends macros were defined in the
isc/util.h header.  Those macros were moved to their respective headers
as those would have to be included anyway if that particular lock was in
use.
2025-03-01 07:33:51 +01:00
Ondřej Surý
901637c25c
Remove superflous header includes from isc/util.h header
Formerly, isc/util.h would pull a few extra headers (isc/list.h,
isc/attributes.h, isc/result.h and errno.h).  These includes were
removed in favor of including them directly when used.
2025-03-01 07:33:40 +01:00
Ondřej Surý
c5075a9a61
Remove convenience list macros from isc/util.h
The short convenience list macros were used very sparingly and
inconsistenly in the code base.  As the consistency is prefered over
the convenience, all shortened list macro were removed in favor of
their ISC_LIST API targets.
2025-03-01 07:33:40 +01:00
Ondřej Surý
2aa70fff76
Remove unused isc_mutexblock and isc_condition units
The isc_mutexblock and isc_condition units were no longer in use and
were removed.
2025-03-01 07:33:09 +01:00
Aram Sargsyan
7293cb0612 Fix a bug in dns_zone_getprimaryaddr()
When all the addresses were already iterated over, the
dns_remote_curraddr() function asserts. So before calling it,
dns_zone_getprimaryaddr() now checks the address list using the
dns_remote_done() function. This also means that instead of
returning 'isc_sockaddr_t' it now returns 'isc_result_t' and
writes the primary's address into the provided pointer only when
returning success.
2025-02-28 15:33:37 +00:00
Evan Hunt
9ebeb60174 fix the fetchresponse result for CNAME/DNAME
the fix in commit 1edbbc32b4 was incomplete; the wrong
event result could also be set in cache_name() and validated().
2025-02-27 19:00:27 +00:00
Aydın Mercan
f4ab4f07e3
unify fips handling to isc_crypto and make the toggle one way
Since algorithm fetching is handled purely in libisc, FIPS mode toggling
can be purely done in within the library instead of provider fetching in
the binary for OpenSSL >=3.0.

Disabling FIPS mode isn't a realistic requirement and isn't done
anywhere in the codebase. Make the FIPS mode toggle enable-only to
reflect the situation.
2025-02-27 17:37:43 +03:00
Aram Sargsyan
5633dc90d3 Fix TTL issue with ANY queries processed through RPZ "passthru"
Answers to an "ANY" query which are processed by the RPZ "passthru"
policy have the response-policy's 'max-policy-ttl' value unexpectedly
applied. Do not change the records' TTL when RPZ uses a policy which
does not alter the answer.
2025-02-27 08:36:49 +00:00
Evan Hunt
1edbbc32b4 set eresult based on the type in ncache_adderesult()
when the caching of a negative record failed because of the
presence of a positive one, ncache_adderesult() could override
this to ISC_R_SUCCESS. this could cause CNAME and DNAME responses
to be handled incorrectly.  ncache_adderesult() now sets the result
code correctly in such cases.
2025-02-25 21:29:19 -08:00
Evan Hunt
6c2af2ae3b remove 'target' from dns_adb
the target name parameter to dns_adb_createfind() was always passed as
NULL, so we can safely remove it.

relatedly, the 'target' field in the dns_adbname structure was never
referenced after being set.  the 'expire_target' field was used, but
only as a way to check whether an ADB name represents a CNAME or DNAME,
and that information can be stored as a single flag.
2025-02-26 00:43:21 +00:00
Mark Andrews
f98a8331aa Fix dual-stack-servers
Named was stopping nameserver address resolution attempts too soon
when dual stack servers are configured.  Dual stack servers are
used when there are *not* addresses for the server in a particular
address family so find->status == DNS_ADB_NOMOREADDRESSES is not a
sufficient stopping condition when dual stack servers are available.
Call fctx_try to see if the alternate servers can be used.
2025-02-25 23:47:46 +00:00
Mark Andrews
b048190e23 Relax private DNSKEY and RRSIG constraints
DNSKEY, KEY, RRSIG and SIG constraints have been relaxed to allow
empty key and signature material after the algorithm identifier for
PRIVATEOID and PRIVATEDNS. It is arguable whether this falls within
the expected use of these types as no key material is shared and
the signatures are ineffective but these are private algorithms and
they can be totally insecure.
2025-02-25 22:59:46 +00:00
Evan Hunt
c2e4358267 prevent a reference leak from the ns_query_done hooks
if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is
used in a plugin and returns NS_HOOK_RETURN, some of the
cleanup in ns_query_done() can be skipped over, leading
to reference leaks that can cause named to hang on shut
down.

this has been addressed by adding more housekeeping
code after the cleanup: tag in ns_query_done().
2025-02-25 22:40:48 +00:00
Evan Hunt
2f7e6eb019 allow NULL compression context in dns_name_towire()
passing NULL as the compression context to dns_name_towire()
copies the uncompressed name data directly into the target buffer.
2025-02-25 12:53:25 -08:00
Evan Hunt
afb424c9b6 simplify dns_name_fromtext() interface
previously, dns_name_fromtext() took both a target name and an
optional target buffer parameter, which could override the name's
dedicated buffer. this interface is unnecessarily complex.

we now have two functions, dns_name_fromtext() to convert text
into a dns_name that has a dedicated buffer, and dns_name_wirefromtext()
to convert text into uncompressed DNS wire format and append it to a
target buffer.

in cases where it really is necessary to have both, we can use
dns_name_fromtext() to load the dns_name, then dns_name_towire()
to append the wire format to the target buffer.
2025-02-25 12:53:25 -08:00
Evan Hunt
a6986f6837 remove 'target' parameter from dns_name_concatenate()
the target buffer passed to dns_name_concatenate() was never
used (except for one place in dig, where it wasn't actually
needed, and has already been removed in a prior commit).
we can safely remove the parameter.
2025-02-25 12:53:25 -08:00
Evan Hunt
2edefbad4a remove the 'name_coff' parameter in dns_name_towire()
this parameter was added as a (minor) optimization for
cases where dns_name_towire() is run repeatedly with the
same compression context, as when rendering all of the rdatas
in an rdataset. it is currently only used in one place.

we now simplify the interface by removing the extra parameter.
the compression offset value is now part of the compression
context, and can be activated when needed by calling
dns_compress_setmultiuse(). multiuse mode is automatically
deactivated by any subsequent call to dns_compress_permitted().
2025-02-25 12:53:25 -08:00
Evan Hunt
94a96a7a0e save time when creating a slab from another slab
the dns_rdataslab_fromrdataset() function creates a slab
from an rdataset. if the source rdataset already uses a slab,
then no processing is necessary; we can just copy the existing
slab to a new location.
2025-02-25 18:37:35 +00:00
Ondřej Surý
1e4fb53c61
Destroy the hashmap iterator inside the rwlock
Previously, the hashmap iterator for fetches-per-zone was destroy
outside the rwlock.  This could lead to an assertion failure due to a
timing race with the internal rehashing of the hashmap table as the
rehashing process requires no iterators to be running when rehashing the
hashmap table.  This has been fixed by moving the destruction of the
iterator inside the read locked section.
2025-02-25 13:36:37 +01:00
Ondřej Surý
67e1df1a07
Squash set_offsets() and dns_name_offsets() into single function
The third argument to set_offsets() was only used in
dns_name_fromregion() and not really needed.  We can remove the third
argument and then manually check whether the last label is root label.
2025-02-25 12:17:34 +01:00
Ondřej Surý
79c3871a7b
Remove target buffer from dns_name_downcase()
There was just a single use of passing an extra buffer to
dns_name_downcase() which have been replaced by simple call to
isc_ascii_lowercase() and the 'target' argument from dns_name_downcase()
function has been removed.
2025-02-25 12:17:34 +01:00
Ondřej Surý
3bb47bc6cd
Remove MAKE_EMPTY() macro from dns_name unit
The MAKE_EMPTY() macro was clearing up the output variable in case of
the failure.  However, this was breaking the usual design pattern that
the output variables are left in indeterminate state or we don't touch
them at all when a failure occurs.  Remove the macro and change the
dns_name_downcase() to not touch the name contents until success.
2025-02-25 12:17:34 +01:00
Ondřej Surý
259600c837
Cleanup the usage of dns_offsets_t vs unsigned char * pointers
There was a back-and-forth between static arrays and the pointers to the
offsets.  Since we are now only using the static arrays, we can cleanup
the usage of the pointers that would previously point either to the
static array or name->offsets if available.
2025-02-25 12:17:34 +01:00
Ondřej Surý
1c22ab2ef7
Simplify name initializers
We no longer need to pass labels to DNS_NAME_INITABSOLUTE
and DNS_NAME_INITNONABSOLUTE.
2025-02-25 12:17:34 +01:00
Ondřej Surý
04c2c2cbc8
Simplify dns_name_init()
Remove the now-unused offsets parameter from dns_name_init().
2025-02-25 12:17:34 +01:00
Ondřej Surý
08e966df82
Remove offsets from the dns_name and dns_fixedname structures
The offsets were meant to speed-up the repeated dns_name operations, but
it was experimentally proven that there's actually no real-world
benefit.  Remove the offsets and labels fields from the dns_name and the
static offsets fields to save 128 bytes from the fixedname in favor of
calculating labels and offsets only when needed.
2025-02-25 12:17:34 +01:00
alessio
45132df850 Remove unused symtab implementation
The old symtab implementation should have been removed in !9921 , but
it wasn't. This commit addresses that.
2025-02-25 11:29:58 +01:00
alessio
887502e37d Drop malformed notify messages early instead of decompressing them
The DNS header shows if a message has multiple questions or invalid
NOTIFY sections. We can drop these messages early, right after parsing
the question. This matches RFC 9619 for multi-question messages and
Unbound's handling of NOTIFY.
To further add further robustness, we include an additional check for
unknown opcodes, and also drop those messages early.

Add early_sanity_check() function to check for these conditions:
- Messages with more than one question, as required by RFC 9619
- NOTIFY query messages containing answer sections (like Unbound)
- NOTIFY messages containing authority sections (like Unbound)
- Unknown opcodes.
2025-02-25 10:40:38 +01:00
Evan Hunt
d0fd9cbe3b Fix a logic error in cache_name()
A change in 6aba56ae8 (checking whether a rejected RRset was identical
to the data it would have replaced, so that we could still cache a
signature) inadvertently introduced cases where processing of a
response would continue when previously it would have been skipped.
2025-02-24 15:04:14 -08:00
Ondřej Surý
d1ef6a93c1
Acquire the database reference before possibly last node release
Acquire the database refernce in the detachnode() to prevent the last
reference to be release while the NODE_LOCK being locked.  The NODE_LOCK
is locked/unlocked inside the RCU critical section, thus it is most
probably this should not pose a problem as the database uses call_rcu
memory reclamation, but this it is still safer to acquire the reference
before releasing the node.
2025-02-24 20:05:56 +01:00
Ondřej Surý
4917ffa61b
Explicitly create and shutdown the call_rcu_thread
As the default_call_rcu_thread can't be forced to flush all the work
during the executable shutdown, create one call_rcu_thread explicitly
and assign it to the all created threads.

This allows this explicit call_rcu_thread to be unassociated from the
main thread and freed before the executable destructor exits.
2025-02-22 16:19:01 +01:00
Ondřej Surý
f5c204ac3e
Move the library init and shutdown to executables
Instead of relying on unreliable order of execution of the library
constructors and destructors, move them to individual binaries.  The
advantage is that the execution time and order will remain constant and
will not depend on the dynamic load dependency solver.

This requires more work, but that was mitigated by a simple requirement,
any executable using libisc and libdns, must include <isc/lib.h> and
<dns/lib.h> respectively (in this particular order).  In turn, these two
headers must not be included from within any library as they contain
inlined functions marked with constructor/destructor attributes.
2025-02-22 16:19:00 +01:00
Ondřej Surý
c6b0368b21
Dump the fetches from dns_resolver_dumpfetches()
Previously, the dns_resolver_dumpfetches() would go over the fetch
counters.  Alas, because of the earlier optimization, the fetch counters
would be increased only when fetches-per-zone was not 0, otherwise the
whole counting was skipped for performance reasons.

Instead of using the auxiliary fetch counters hash table, use the real
hash table that stores the fetch contexts to dump the ongoing fetches to
the recursing file.

Additionally print more information about the fetch context like start
and expiry times, number of fetch responses, number of queries and count
of allowed and dropped fetches.
2025-02-21 22:25:43 +01:00
Ondřej Surý
cf078fadeb
Fix the fetch context hash table lock ordering
The order of the fetch context hash table rwlock and the individual
fetch context was reversed when calling the release_fctx() function.
This was causing a problem when iterating the hash table, and thus the
ordering has been corrected in a way that the hash table rwlock is now
always locked on the outside and the fctx lock is the interior lock.
2025-02-21 22:05:43 +01:00
Ondřej Surý
b9e3cd5d2a
Add isc_timer_running() function to check status of timer
In the next commit, we need to know whether the timer has been started
or stopped.  Add isc_timer_running() function that returns true if the
timer has been started.
2025-02-21 22:05:43 +01:00
Aram Sargsyan
3ea2fbc238 Fix RPZ bug when resuming a query during a reconfiguration
After a reconfiguration the old view can be left without a valid
'rpzs' member, because when the RPZ is not changed during the named
reconfiguration 'rpzs' "migrate" from the old view into the new
view, so when a query resumes it can find that 'qctx->view->rpzs'
is NULL which query_resume() currently doesn't expect to happen if
it's recursing and 'qctx->rpz_st' is not NULL.

Fix the issue by adding a NULL-check. In order to not split the log
message to two different log messages depending on whether
'qctx->view->rpzs' is NULL or not, change the message to not log
the RPZ policy's "version" which is just a runtime counter and is
most likely not very useful for the users.
2025-02-21 11:10:15 +00:00
Ondřej Surý
77ec2a6c22 Cleanup the isc_counter unit
The isc_counter_create() doesn't need the return value (it was always
ISC_R_SUCCESS), use the macros to implement the reference counting,
little style cleanup, and expand the unit test.
2025-02-21 09:51:42 +00:00
Mark Andrews
83159d0a54 Remove check for missing RRSIG records from getsection
Checking whether the authority section is properly signed should
be left to the validator.  Checking in getsection (dns_message_parse)
was way too early and resulted in resolution failures of lookups
that should have otherwise succeeded.
2025-02-20 20:31:07 +00:00
Aram Sargsyan
716b936045 Implement sig0key-checks-limit and sig0message-checks-limit
Previously a hard-coded limitation of maximum two key or message
verification checks were introduced when checking the message's
SIG(0) signature. It was done in order to protect against possible
DoS attacks. The logic behind choosing the number two was that more
than one key should only be required only during key rotations, and
in that case two keys are enough. But later it became apparent that
there are other use cases too where even more keys are required, see
issue number #5050 in GitLab.

This change introduces two new configuration options for the views,
sig0key-checks-limit and sig0message-checks-limit, which define how
many keys are allowed to be checked to find a matching key, and how
many message verifications are allowed to take place once a matching
key has been found. The latter protects against expensive cryptographic
operations when there are keys with colliding tags and algorithm
numbers, with default being 2, and the former protects against a bit
less expensive key parsing operations and defaults to 16.
2025-02-20 13:35:14 +00:00
Aram Sargsyan
c6529891bb Fix isc_quota bug
Running jobs which were entered into the isc_quota queue is the
responsibility of the isc_quota_release() function, which, when
releasing a previously acquired quota, checks whether the queue
is empty, and if it's not, it runs a job from the queue without touching
the 'quota->used' counter. This mechanism is susceptible to a possible
hangup of a newly queued job in case when between the time a decision
has been made to queue it (because used >= max) and the time it was
actually queued, the last quota was released. Since there is no more
quotas to be released (unless arriving in the future), the newly
entered job will be stuck in the queue.

Fix the wrong memory ordering for 'quota->used', as the relaxed
ordering doesn't ensure that data modifications made by one thread
are visible in other threads.

Add checks in both isc_quota_release() and isc_quota_acquire_cb()
to make sure that the described hangup does not happen. Also see
code comments.
2025-02-20 10:56:00 +00:00
Aram Sargsyan
c701b590e4 Expose the incoming transfers' rates in the statistics channel
Expose the average transfer rate (in bytes-per-second) during the
last full 'min-transfer-rate-in <bytes> <minutes>' minutes interval.
If no such interval has passed yet, then the overall average rate is
reported instead.
2025-02-20 09:32:55 +00:00
Aram Sargsyan
91ea156203 Implement the min-transfer-rate-in configuration option
This new option sets a minimum amount of transfer rate for
an incoming zone transfer that will abort a transfer, which
for some network related reasons run very slowly.
2025-02-20 09:32:55 +00:00
Evan Hunt
6aba56ae89 Check whether a rejected rrset is different
Add a new dns_rdataset_equals() function to check whether two
rdatasets are equal in DNSSEC terms.

When an rdataset being cached is rejected because its trust
level is lower than the existing rdataset, we now check to see
whether the rejected data was identical to the existing data.
This allows us to cache a potentially useful RRSIG when handling
CD=1 queries, while still rejecting RRSIGs that would definitely
have resulted in a validation failure.
2025-02-19 17:25:20 -08:00
Ondřej Surý
2fc32c105d Remove the "raw" version of the dns_slabheader API
The "raw" version of the header was used for the noqname and the closest
proofs to save around 152 bytes of the dns_slabheader_t while bringing
an additional complexity.  Remove the raw version of the dns_slabheader
API at the slight expense of having unused dns_slabheader_t data sitting
in front of the proofs.
2025-02-19 15:00:15 -08:00
Evan Hunt
c2e19771ac refactor dns_rdataslab_subtract() for efficiency
reduce the number of rdata comparisons needed by walking
through the original slab once to determine whether the rdata
in it is duplicated in the slab to be subtracted, and then
write out the rdatas that aren't. previously, this was
done twice: once when determining the size of the target buffer
and then again when copying data into it.
2025-02-19 15:00:15 -08:00
Evan Hunt
1d5fe36136 refactor dns_rdataslab_merge() for efficiency
when merging two rdata slabs, we now check once to see
whether an item in the new slab has a duplicate in the
old. previously this was done twice; once to determine the
size of the target buffer required, and then again when
copying the data into it.

we also minimize the number of rdata comparisons necessary,
by remembering which items in the old slab have already been
found to be duplicates.
2025-02-19 15:00:15 -08:00
Evan Hunt
ed83455c81 dns_slabheader_fromrdataset() -> dns_rdataset_getheader()
The function name dns_slabheader_fromrdataset() was too similar
to dns_rdataslab_fromrdataset(). Instead, we now have an rdataset
method 'getheader' which is implemented for slab-type rdatasets.

A new NOHEADER rdataset attribute is set for rdatasets using
raw slabs (i.e., noqname and closest encloser proofs); when
called on rdatasets with that flag set, dns_rdataset_getheader()
returns NULL.
2025-02-19 14:58:32 -08:00
Evan Hunt
82edec67a5 initialize header in dns_rdataslab_fromrdataset()
when dns_rdataslab_fromrdataset() is run, in addition to
allocating space for a slab header, it also partially
initializes it, setting the type match rdataset->type and
rdataset->covers, the trust to rdataset->trust, and the TTL to
rdataset->ttl.
2025-02-19 14:58:32 -08:00
Evan Hunt
b4bde9bef4 clarify dns_rdataslab_fromrdataset()
there are now two functions for creating an rdataslab from an
rdataset: dns_rdataslab_fromrdataset() creates a full slab (including
space for a slab header), and dns_rdataslab_raw_fromrdataset() creates
a raw slab.
2025-02-19 14:58:32 -08:00
Evan Hunt
f1ab7f199b refactor dns_rdataslab_merge() and _subtract()
these two functions have been refactored for clarity
and readability, with a more logical flow, added comments,
and less code duplication.
2025-02-19 14:58:32 -08:00
Evan Hunt
6908d1f9be more rdataslab refactoring
- there are now two functions for getting rdataslab size:
  dns_rdataslab_size() is for full slabs and dns_rdataslab_sizeraw()
  for raw slabs. there is no longer a need for a reservelen parameter.
- dns_rdataslab_count() also no longer takes a reservelen parameter.
  (currently it's never used for raw slabs, so there is no _countraw()
  function.)
- dns_rdataslab_rdatasize() has been removed, because
  dns_rdataslab_sizeraw() can do the same thing.
- dns_rdataslab_merge() and dns_rdataslab_subtract() both take
  slabheader parameters instead of character buffers, and the
  reservelen parameter has been removed.
2025-02-19 14:58:32 -08:00
Evan Hunt
4601d4299a fix and simplify dns_rdataset_equal() and _equalx()
if both rdataslabs being compared have zero length, return true.

also, since these functions are only ever called on slabheaders
with sizeof(dns_slabheader_t) as the reserve length, we can
simplify the API: remove the reservelen argument, and pass the
slabs as type dns_slabheader_t * instead of unsigned char *.
2025-02-19 14:58:32 -08:00
Ondřej Surý
15fe68e50d Add .up pointer to slabheader
The dns_slabheader object uses the 'next' pointer for two purposes.
In the first header for any given type, 'next' points to the first
header for the next type. But 'down' points to the next header of
the same type, and in that record, 'next' points back up.

This design made the code confusing to read.  We now use a union
so that the 'next' pointer can also be called 'up'.
2025-02-19 14:58:32 -08:00
Artem Boldariev
2adabe835a DoH: http_send_outgoing() return value is not used
The value returned by http_send_outgoing() is not used anywhere, so we
make it not return anything (void). Probably it is an omission from
older times.
2025-02-19 17:52:36 +02:00
Artem Boldariev
8b8f4d500d DoH: Fix missing send callback calls
When handling outgoing data, there were a couple of rarely executed
code paths that would not take into account that the callback MUST be
called.

It could lead to potential memory leaks and consequent shutdown hangs.
2025-02-19 17:52:36 +02:00
Artem Boldariev
a22bc2d7d4 DoH: change how the active streams number is calculated
This commit changes the way how the number of active HTTP streams is
calculated and allows it to scale with the values of the maximum
amount of streams per connection, instead of effectively capping at
STREAM_CLIENTS_PER_CONN.

The original limit, which is intended to define the pipelining limit
for TCP/DoT. However, it appeared to be too restrictive for DoH, as it
works quite differently and implements pipelining at protocol level by
the means of multiplexing multiple streams. That renders each stream
to be effectively a separate connection from the point of view of the
rest of the codebase.
2025-02-19 17:52:36 +02:00
Artem Boldariev
05e8a50818 DoH: Track the amount of in flight outgoing data
Previously we would limit the amount of incoming data to process based
solely on the presence of not completed send requests. That worked,
however, it was found to severely degrade performance in certain
cases, as was revealed during extended testing.

Now we switch to keeping track of how much data is in flight (or ready
to be in flight) and limit the amount of processed incoming data when
the amount of in flight data surpasses the given threshold, similarly
to like we do in other transports.
2025-02-19 17:52:36 +02:00
Evan Hunt
e58ce19cf2 when committing a new qpzone version, delete dead nodes
if all data has been deleted from a node in the qpzone
database, delete the node too.
2025-02-18 14:22:38 -08:00
Ondřej Surý
ce9f6e68c3 Unify how we handle database version in the cache
Database versions are not used in cache databases. Some places in
qpcache.c required the version argument to be NULL; others marked it
as UNUSED. Unify all cases to require version to be NULL.
2025-02-18 20:15:00 +00:00
Ondřej Surý
2d53796e28 Clean up 'now' usage in the cache
Unify the way we handle the 'now' argument in the cache: when it's
set to zero by the caller, it is replaced with isc_stdtime_now().
2025-02-18 20:15:00 +00:00
Ondřej Surý
3b2fe808c4 Clean up the search part in qpcache_find()
Slightly refactor the header search in qpcache_find(), so the scope
level is reduced and the cname parts are logically grouped together.
2025-02-18 20:15:00 +00:00
Ondřej Surý
bfb219ac2d Refactor the search in qpcache_findrdataset()
Add new related_headers() function that simplifies the code
flow in qpcache_findrdataset().  Also use check_stale_header() function
to remove code duplication.
2025-02-18 20:15:00 +00:00
Ondřej Surý
cf66ba02a4 Refactor simple slabheader matching
Add a helper function both_headers() that unifies the slabheader
matching for simple type: it returns true when both the type and
the matching RRSIG have been found.
2025-02-18 20:15:00 +00:00
Ondřej Surý
4cd1dd8dd7 Add new helper maybe_update_headers() function
The new maybe_update_headers() function unifies the LRU updates to the
slabheaders that was scattered all over the place.  More calls to update
headers after bindrdatasets() were also added for completeness.
2025-02-18 20:15:00 +00:00
Ondřej Surý
4448f1adb2 Add bindrdatasets() function that binds both rdatasets
This removes code duplication between the dual bindrdataset() calls.  It
also unifies the handling as there were small differences between the
calls: one variant was checking for !NEGATIVE(found) condition and one
wasn't, and it is technically ok to do the check for all variants.
2025-02-18 20:15:00 +00:00