Commit graph

15876 commits

Author SHA1 Message Date
Colin Vidal
24ffbdcfea add support for EDE 20 (Not Authoritative)
Extended DNS Error message EDE 20 (Not Authoritative) is now sent when
client request recursion (RD) but the server has recursion disabled.

RFC 8914 mention EDE 20 should also be returned if the client doesn't
have the RD bit set (and recursion is needed) but it doesn't apply for
BIND as BIND would try to resolve from the "deepest" referral in
AUTHORITY section. For example, if the client asks for "www.isc.org/A"
but the server only knows the root domain, it will returns NOERROR but
no answer for "www.isc.og/A", just the list of other servers to ask.
2025-03-13 11:16:01 +01:00
Colin Vidal
334ea1269f add support for EDE 7 and 8
Extended DNS Error messages EDE 7 (expired key) and EDE 8 (validity
period of the key not yet started) are now sent in case of such DNSSEC
validation failures.

Refactor the existing validator extended error APIs in order to make it
easy to have a consisdent extra info (with domain/type) in the various
use case (i.e. when the EDE depends on validator state,
validate_extendederror or when the EDE doesn't depend of any state but
can be called directly in a specific flow).
2025-03-13 09:57:09 +01:00
Ondřej Surý
1e4695510a
Revert "fix: dev: Delete dead nodes when committing a new version"
This reverts commit 67255da4b3, reversing
changes made to 74c9ff384e.
2025-03-05 17:46:54 +01:00
Aram Sargsyan
6cd9e4f67c Fix a bug in get_request_transport_type()
When dns_remote_done() is true, calling dns_remote_curraddr() asserts.
Add a dns_remote_curraddr() check before calling dns_remote_curraddr().
2025-03-05 12:18:11 +00:00
Ondřej Surý
1fae6ccea1
Add the call function tracking to isc_mem API
As we already track __func__, __FILE__, __LINE__ triplet in most places,
add the function tracking to the isc_mem tracking API.
2025-03-05 11:17:17 +01:00
Ondřej Surý
eab9fc22e7
Replace attach/detach in isc_mem with refcount implementation
The isc_mem API is one of the most commonly used APIs that didn't
used ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros.  Replace the
implementation of isc_mem_attach(), isc_mem_detach() and
isc_mem_destroy() with the respective macros.

This also removes the legacy isc_mem_destroy() functionality that would
check whether all references had been detached from the memory context
as it doesn't work reliably when using the call_rcu() API.  Instead of
doing this individually, call isc_mem_checkdestroyed(stderr) from the
isc_mem_destroy() macro to keep the extra check that all contexts were
freed when the program is exiting.
2025-03-05 11:17:17 +01:00
Ondřej Surý
552cf64a70
Replace isc_mem_destroy() with isc_mem_detach()
Remove legacy isc_mem_destroy() and just use isc_mem_detach() as
isc_mem_destroy() doesn't play well with call_rcu API.
2025-03-05 11:17:17 +01:00
Mark Andrews
006c5990ce Implement digest_sig and digest_rrsig for ZONEMD
ZONEMD needs to be able to digest SIG and RRSIG records.  The signer
field can be compressed in SIG so we need to call dns_name_digest().
While for RRSIG the records the signer field is not compressed the
canonical form has the signer field downcased (RFC 4034, 6.2).  This
also implies that compare_rrsig needs to downcase the signer field
during comparison.
2025-03-05 18:05:12 +11:00
Ondřej Surý
303c20caf8
Fix the foundname vs dcname madness in qpcache_findzonecut()
The qpcache_findzonecut() accepts two "foundnames": 'foundname' and
'dcname' could be NULL.  Originally, when 'dcname' would be NULL, the
'dcname' would be set to 'foundname'.  Then code like this was present:

    result = find_deepest_zonecut(&search, node, nodep, foundname,
                                  rdataset,
                                  sigrdataset DNS__DB_FLARG_PASS);
    dns_name_copy(foundname, dcname);

Which basically means that we are copying the .ndata over itself for no
apparent reason.  Cleanup the dcname vs foundname usage.

Co-authored-by: Evan Hunt <each@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
2025-03-05 07:49:46 +01:00
alessio
87776a51ae Cleanup dns_opcode_t
Make dns_opcode_t refer directly to the underlying enum, and use
attributes to ensure the underlying enum is the same size as uint16_t.
2025-03-04 18:35:14 +01:00
Mark Andrews
988dc57c8c Call isc__iterated_hash_initialize
The iterated hash implementation needs to be initialised
on the worker thread.  Also clean it up after we are done.
2025-03-04 12:54:39 +00:00
Artem Boldariev
eaad0aefe6 DoH: Bump the active streams processing limit
This commit bumps the total number of active streams (= the opened
streams for which a request is received, but response is not ready) to
60% of the total streams limit.

The previous limit turned out to be too tight as revealed by
longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.
2025-03-03 11:32:29 +02:00
Artem Boldariev
217a1ebd79 DoH: remove obsolete INSIST() check
The check, while not active by default, is not valid since the commit
8b8f4d500d.

See 'if (total == 0) { ...' below branch to understand why.
2025-03-03 11:32:11 +02:00
Artem Boldariev
c5f7968856 DoH: Flush HTTP write buffer on an outgoing DNS message
Previously, the code would try to avoid sending any data regardless of
what it is unless:

a) The flush limit is reached;
b) There are no sends in flight.

This strategy is used to avoid too numerous send requests with little
amount of data. However, it has been proven to be too aggressive and,
in fact, harms performance in some cases (e.g., on longer (≥1h) runs
of "stress:long:rpz:doh+udp:linux:*").

Now, additionally to the listed cases, we also:

c) Flush the buffer and perform a send operation when there is an
outgoing DNS message passed to the code (which is indicated by the
presence of a send callback).

That helps improve performance for "stress:long:rpz:doh+udp:linux:*"
tests.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0e1b02868a DoH: Limit the number of delayed IO processing requests
Previously, a function for continuing IO processing on the next UV
tick was introduced (http_do_bio_async()). The intention behind this
function was to ensure that http_do_bio() is eventually called at
least once in the future. However, the current implementation allows
queueing multiple such delayed requests needlessly. There is currently
no need for these excessive requests as http_do_bio() can requeue them
if needed. At the same time, each such request can lead to a memory
allocation, particularly in BIND 9.18.

This commit ensures that the number of enqueued delayed IO processing
requests never exceeds one in order to avoid potentially bombarding IO
threads with the delayed requests needlessly.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0956fb9b9e DoH: Simplify http_do_bio()
This commit significantly simplifies the code flow in the
http_do_bio() function, which is responsible for processing incoming
and outgoing HTTP/2 data. It seems that the way it was structured
before was indirectly caused by the presence of the missing callback
calls bug, fixed in 8b8f4d500d.

The change introduced by this commit is known to remove a bottleneck
and allows reproducible and measurable performance improvement for
long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests.

Additionally, it fixes a similar issue with potentially missing send
callback calls processing and hardens the code against use-after-free
errors related to the session object (they can potentially occur).
2025-03-03 11:32:11 +02:00
Ondřej Surý
ce7879c924
Remove STATIC_ASSERT variants in favor of the C11 variant
Previously, a gcc < 4.6 shim for _Static_assert() was included.  Such an
old compiler is not supported now anyway, so the macro variant has been
removed in favor of a single definition using _Static_assert().
2025-03-01 07:33:53 +01:00
Ondřej Surý
534069e048
Move locking macros into individual headers
Previously, the LOCK()/UNLOCK() and friends macros were defined in the
isc/util.h header.  Those macros were moved to their respective headers
as those would have to be included anyway if that particular lock was in
use.
2025-03-01 07:33:51 +01:00
Ondřej Surý
901637c25c
Remove superflous header includes from isc/util.h header
Formerly, isc/util.h would pull a few extra headers (isc/list.h,
isc/attributes.h, isc/result.h and errno.h).  These includes were
removed in favor of including them directly when used.
2025-03-01 07:33:40 +01:00
Ondřej Surý
c5075a9a61
Remove convenience list macros from isc/util.h
The short convenience list macros were used very sparingly and
inconsistenly in the code base.  As the consistency is prefered over
the convenience, all shortened list macro were removed in favor of
their ISC_LIST API targets.
2025-03-01 07:33:40 +01:00
Ondřej Surý
2aa70fff76
Remove unused isc_mutexblock and isc_condition units
The isc_mutexblock and isc_condition units were no longer in use and
were removed.
2025-03-01 07:33:09 +01:00
Aram Sargsyan
7293cb0612 Fix a bug in dns_zone_getprimaryaddr()
When all the addresses were already iterated over, the
dns_remote_curraddr() function asserts. So before calling it,
dns_zone_getprimaryaddr() now checks the address list using the
dns_remote_done() function. This also means that instead of
returning 'isc_sockaddr_t' it now returns 'isc_result_t' and
writes the primary's address into the provided pointer only when
returning success.
2025-02-28 15:33:37 +00:00
Evan Hunt
9ebeb60174 fix the fetchresponse result for CNAME/DNAME
the fix in commit 1edbbc32b4 was incomplete; the wrong
event result could also be set in cache_name() and validated().
2025-02-27 19:00:27 +00:00
Aydın Mercan
f4ab4f07e3
unify fips handling to isc_crypto and make the toggle one way
Since algorithm fetching is handled purely in libisc, FIPS mode toggling
can be purely done in within the library instead of provider fetching in
the binary for OpenSSL >=3.0.

Disabling FIPS mode isn't a realistic requirement and isn't done
anywhere in the codebase. Make the FIPS mode toggle enable-only to
reflect the situation.
2025-02-27 17:37:43 +03:00
Aram Sargsyan
5633dc90d3 Fix TTL issue with ANY queries processed through RPZ "passthru"
Answers to an "ANY" query which are processed by the RPZ "passthru"
policy have the response-policy's 'max-policy-ttl' value unexpectedly
applied. Do not change the records' TTL when RPZ uses a policy which
does not alter the answer.
2025-02-27 08:36:49 +00:00
Evan Hunt
1edbbc32b4 set eresult based on the type in ncache_adderesult()
when the caching of a negative record failed because of the
presence of a positive one, ncache_adderesult() could override
this to ISC_R_SUCCESS. this could cause CNAME and DNAME responses
to be handled incorrectly.  ncache_adderesult() now sets the result
code correctly in such cases.
2025-02-25 21:29:19 -08:00
Evan Hunt
6c2af2ae3b remove 'target' from dns_adb
the target name parameter to dns_adb_createfind() was always passed as
NULL, so we can safely remove it.

relatedly, the 'target' field in the dns_adbname structure was never
referenced after being set.  the 'expire_target' field was used, but
only as a way to check whether an ADB name represents a CNAME or DNAME,
and that information can be stored as a single flag.
2025-02-26 00:43:21 +00:00
Mark Andrews
f98a8331aa Fix dual-stack-servers
Named was stopping nameserver address resolution attempts too soon
when dual stack servers are configured.  Dual stack servers are
used when there are *not* addresses for the server in a particular
address family so find->status == DNS_ADB_NOMOREADDRESSES is not a
sufficient stopping condition when dual stack servers are available.
Call fctx_try to see if the alternate servers can be used.
2025-02-25 23:47:46 +00:00
Mark Andrews
b048190e23 Relax private DNSKEY and RRSIG constraints
DNSKEY, KEY, RRSIG and SIG constraints have been relaxed to allow
empty key and signature material after the algorithm identifier for
PRIVATEOID and PRIVATEDNS. It is arguable whether this falls within
the expected use of these types as no key material is shared and
the signatures are ineffective but these are private algorithms and
they can be totally insecure.
2025-02-25 22:59:46 +00:00
Evan Hunt
c2e4358267 prevent a reference leak from the ns_query_done hooks
if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is
used in a plugin and returns NS_HOOK_RETURN, some of the
cleanup in ns_query_done() can be skipped over, leading
to reference leaks that can cause named to hang on shut
down.

this has been addressed by adding more housekeeping
code after the cleanup: tag in ns_query_done().
2025-02-25 22:40:48 +00:00
Evan Hunt
2f7e6eb019 allow NULL compression context in dns_name_towire()
passing NULL as the compression context to dns_name_towire()
copies the uncompressed name data directly into the target buffer.
2025-02-25 12:53:25 -08:00
Evan Hunt
afb424c9b6 simplify dns_name_fromtext() interface
previously, dns_name_fromtext() took both a target name and an
optional target buffer parameter, which could override the name's
dedicated buffer. this interface is unnecessarily complex.

we now have two functions, dns_name_fromtext() to convert text
into a dns_name that has a dedicated buffer, and dns_name_wirefromtext()
to convert text into uncompressed DNS wire format and append it to a
target buffer.

in cases where it really is necessary to have both, we can use
dns_name_fromtext() to load the dns_name, then dns_name_towire()
to append the wire format to the target buffer.
2025-02-25 12:53:25 -08:00
Evan Hunt
a6986f6837 remove 'target' parameter from dns_name_concatenate()
the target buffer passed to dns_name_concatenate() was never
used (except for one place in dig, where it wasn't actually
needed, and has already been removed in a prior commit).
we can safely remove the parameter.
2025-02-25 12:53:25 -08:00
Evan Hunt
2edefbad4a remove the 'name_coff' parameter in dns_name_towire()
this parameter was added as a (minor) optimization for
cases where dns_name_towire() is run repeatedly with the
same compression context, as when rendering all of the rdatas
in an rdataset. it is currently only used in one place.

we now simplify the interface by removing the extra parameter.
the compression offset value is now part of the compression
context, and can be activated when needed by calling
dns_compress_setmultiuse(). multiuse mode is automatically
deactivated by any subsequent call to dns_compress_permitted().
2025-02-25 12:53:25 -08:00
Evan Hunt
94a96a7a0e save time when creating a slab from another slab
the dns_rdataslab_fromrdataset() function creates a slab
from an rdataset. if the source rdataset already uses a slab,
then no processing is necessary; we can just copy the existing
slab to a new location.
2025-02-25 18:37:35 +00:00
Ondřej Surý
1e4fb53c61
Destroy the hashmap iterator inside the rwlock
Previously, the hashmap iterator for fetches-per-zone was destroy
outside the rwlock.  This could lead to an assertion failure due to a
timing race with the internal rehashing of the hashmap table as the
rehashing process requires no iterators to be running when rehashing the
hashmap table.  This has been fixed by moving the destruction of the
iterator inside the read locked section.
2025-02-25 13:36:37 +01:00
Ondřej Surý
67e1df1a07
Squash set_offsets() and dns_name_offsets() into single function
The third argument to set_offsets() was only used in
dns_name_fromregion() and not really needed.  We can remove the third
argument and then manually check whether the last label is root label.
2025-02-25 12:17:34 +01:00
Ondřej Surý
79c3871a7b
Remove target buffer from dns_name_downcase()
There was just a single use of passing an extra buffer to
dns_name_downcase() which have been replaced by simple call to
isc_ascii_lowercase() and the 'target' argument from dns_name_downcase()
function has been removed.
2025-02-25 12:17:34 +01:00
Ondřej Surý
3bb47bc6cd
Remove MAKE_EMPTY() macro from dns_name unit
The MAKE_EMPTY() macro was clearing up the output variable in case of
the failure.  However, this was breaking the usual design pattern that
the output variables are left in indeterminate state or we don't touch
them at all when a failure occurs.  Remove the macro and change the
dns_name_downcase() to not touch the name contents until success.
2025-02-25 12:17:34 +01:00
Ondřej Surý
259600c837
Cleanup the usage of dns_offsets_t vs unsigned char * pointers
There was a back-and-forth between static arrays and the pointers to the
offsets.  Since we are now only using the static arrays, we can cleanup
the usage of the pointers that would previously point either to the
static array or name->offsets if available.
2025-02-25 12:17:34 +01:00
Ondřej Surý
1c22ab2ef7
Simplify name initializers
We no longer need to pass labels to DNS_NAME_INITABSOLUTE
and DNS_NAME_INITNONABSOLUTE.
2025-02-25 12:17:34 +01:00
Ondřej Surý
04c2c2cbc8
Simplify dns_name_init()
Remove the now-unused offsets parameter from dns_name_init().
2025-02-25 12:17:34 +01:00
Ondřej Surý
08e966df82
Remove offsets from the dns_name and dns_fixedname structures
The offsets were meant to speed-up the repeated dns_name operations, but
it was experimentally proven that there's actually no real-world
benefit.  Remove the offsets and labels fields from the dns_name and the
static offsets fields to save 128 bytes from the fixedname in favor of
calculating labels and offsets only when needed.
2025-02-25 12:17:34 +01:00
alessio
45132df850 Remove unused symtab implementation
The old symtab implementation should have been removed in !9921 , but
it wasn't. This commit addresses that.
2025-02-25 11:29:58 +01:00
alessio
887502e37d Drop malformed notify messages early instead of decompressing them
The DNS header shows if a message has multiple questions or invalid
NOTIFY sections. We can drop these messages early, right after parsing
the question. This matches RFC 9619 for multi-question messages and
Unbound's handling of NOTIFY.
To further add further robustness, we include an additional check for
unknown opcodes, and also drop those messages early.

Add early_sanity_check() function to check for these conditions:
- Messages with more than one question, as required by RFC 9619
- NOTIFY query messages containing answer sections (like Unbound)
- NOTIFY messages containing authority sections (like Unbound)
- Unknown opcodes.
2025-02-25 10:40:38 +01:00
Evan Hunt
d0fd9cbe3b Fix a logic error in cache_name()
A change in 6aba56ae8 (checking whether a rejected RRset was identical
to the data it would have replaced, so that we could still cache a
signature) inadvertently introduced cases where processing of a
response would continue when previously it would have been skipped.
2025-02-24 15:04:14 -08:00
Ondřej Surý
d1ef6a93c1
Acquire the database reference before possibly last node release
Acquire the database refernce in the detachnode() to prevent the last
reference to be release while the NODE_LOCK being locked.  The NODE_LOCK
is locked/unlocked inside the RCU critical section, thus it is most
probably this should not pose a problem as the database uses call_rcu
memory reclamation, but this it is still safer to acquire the reference
before releasing the node.
2025-02-24 20:05:56 +01:00
Ondřej Surý
4917ffa61b
Explicitly create and shutdown the call_rcu_thread
As the default_call_rcu_thread can't be forced to flush all the work
during the executable shutdown, create one call_rcu_thread explicitly
and assign it to the all created threads.

This allows this explicit call_rcu_thread to be unassociated from the
main thread and freed before the executable destructor exits.
2025-02-22 16:19:01 +01:00
Ondřej Surý
f5c204ac3e
Move the library init and shutdown to executables
Instead of relying on unreliable order of execution of the library
constructors and destructors, move them to individual binaries.  The
advantage is that the execution time and order will remain constant and
will not depend on the dynamic load dependency solver.

This requires more work, but that was mitigated by a simple requirement,
any executable using libisc and libdns, must include <isc/lib.h> and
<dns/lib.h> respectively (in this particular order).  In turn, these two
headers must not be included from within any library as they contain
inlined functions marked with constructor/destructor attributes.
2025-02-22 16:19:00 +01:00
Ondřej Surý
c6b0368b21
Dump the fetches from dns_resolver_dumpfetches()
Previously, the dns_resolver_dumpfetches() would go over the fetch
counters.  Alas, because of the earlier optimization, the fetch counters
would be increased only when fetches-per-zone was not 0, otherwise the
whole counting was skipped for performance reasons.

Instead of using the auxiliary fetch counters hash table, use the real
hash table that stores the fetch contexts to dump the ongoing fetches to
the recursing file.

Additionally print more information about the fetch context like start
and expiry times, number of fetch responses, number of queries and count
of allowed and dropped fetches.
2025-02-21 22:25:43 +01:00
Ondřej Surý
cf078fadeb
Fix the fetch context hash table lock ordering
The order of the fetch context hash table rwlock and the individual
fetch context was reversed when calling the release_fctx() function.
This was causing a problem when iterating the hash table, and thus the
ordering has been corrected in a way that the hash table rwlock is now
always locked on the outside and the fctx lock is the interior lock.
2025-02-21 22:05:43 +01:00
Ondřej Surý
b9e3cd5d2a
Add isc_timer_running() function to check status of timer
In the next commit, we need to know whether the timer has been started
or stopped.  Add isc_timer_running() function that returns true if the
timer has been started.
2025-02-21 22:05:43 +01:00
Aram Sargsyan
3ea2fbc238 Fix RPZ bug when resuming a query during a reconfiguration
After a reconfiguration the old view can be left without a valid
'rpzs' member, because when the RPZ is not changed during the named
reconfiguration 'rpzs' "migrate" from the old view into the new
view, so when a query resumes it can find that 'qctx->view->rpzs'
is NULL which query_resume() currently doesn't expect to happen if
it's recursing and 'qctx->rpz_st' is not NULL.

Fix the issue by adding a NULL-check. In order to not split the log
message to two different log messages depending on whether
'qctx->view->rpzs' is NULL or not, change the message to not log
the RPZ policy's "version" which is just a runtime counter and is
most likely not very useful for the users.
2025-02-21 11:10:15 +00:00
Ondřej Surý
77ec2a6c22 Cleanup the isc_counter unit
The isc_counter_create() doesn't need the return value (it was always
ISC_R_SUCCESS), use the macros to implement the reference counting,
little style cleanup, and expand the unit test.
2025-02-21 09:51:42 +00:00
Mark Andrews
83159d0a54 Remove check for missing RRSIG records from getsection
Checking whether the authority section is properly signed should
be left to the validator.  Checking in getsection (dns_message_parse)
was way too early and resulted in resolution failures of lookups
that should have otherwise succeeded.
2025-02-20 20:31:07 +00:00
Aram Sargsyan
716b936045 Implement sig0key-checks-limit and sig0message-checks-limit
Previously a hard-coded limitation of maximum two key or message
verification checks were introduced when checking the message's
SIG(0) signature. It was done in order to protect against possible
DoS attacks. The logic behind choosing the number two was that more
than one key should only be required only during key rotations, and
in that case two keys are enough. But later it became apparent that
there are other use cases too where even more keys are required, see
issue number #5050 in GitLab.

This change introduces two new configuration options for the views,
sig0key-checks-limit and sig0message-checks-limit, which define how
many keys are allowed to be checked to find a matching key, and how
many message verifications are allowed to take place once a matching
key has been found. The latter protects against expensive cryptographic
operations when there are keys with colliding tags and algorithm
numbers, with default being 2, and the former protects against a bit
less expensive key parsing operations and defaults to 16.
2025-02-20 13:35:14 +00:00
Aram Sargsyan
c6529891bb Fix isc_quota bug
Running jobs which were entered into the isc_quota queue is the
responsibility of the isc_quota_release() function, which, when
releasing a previously acquired quota, checks whether the queue
is empty, and if it's not, it runs a job from the queue without touching
the 'quota->used' counter. This mechanism is susceptible to a possible
hangup of a newly queued job in case when between the time a decision
has been made to queue it (because used >= max) and the time it was
actually queued, the last quota was released. Since there is no more
quotas to be released (unless arriving in the future), the newly
entered job will be stuck in the queue.

Fix the wrong memory ordering for 'quota->used', as the relaxed
ordering doesn't ensure that data modifications made by one thread
are visible in other threads.

Add checks in both isc_quota_release() and isc_quota_acquire_cb()
to make sure that the described hangup does not happen. Also see
code comments.
2025-02-20 10:56:00 +00:00
Aram Sargsyan
c701b590e4 Expose the incoming transfers' rates in the statistics channel
Expose the average transfer rate (in bytes-per-second) during the
last full 'min-transfer-rate-in <bytes> <minutes>' minutes interval.
If no such interval has passed yet, then the overall average rate is
reported instead.
2025-02-20 09:32:55 +00:00
Aram Sargsyan
91ea156203 Implement the min-transfer-rate-in configuration option
This new option sets a minimum amount of transfer rate for
an incoming zone transfer that will abort a transfer, which
for some network related reasons run very slowly.
2025-02-20 09:32:55 +00:00
Evan Hunt
6aba56ae89 Check whether a rejected rrset is different
Add a new dns_rdataset_equals() function to check whether two
rdatasets are equal in DNSSEC terms.

When an rdataset being cached is rejected because its trust
level is lower than the existing rdataset, we now check to see
whether the rejected data was identical to the existing data.
This allows us to cache a potentially useful RRSIG when handling
CD=1 queries, while still rejecting RRSIGs that would definitely
have resulted in a validation failure.
2025-02-19 17:25:20 -08:00
Ondřej Surý
2fc32c105d Remove the "raw" version of the dns_slabheader API
The "raw" version of the header was used for the noqname and the closest
proofs to save around 152 bytes of the dns_slabheader_t while bringing
an additional complexity.  Remove the raw version of the dns_slabheader
API at the slight expense of having unused dns_slabheader_t data sitting
in front of the proofs.
2025-02-19 15:00:15 -08:00
Evan Hunt
c2e19771ac refactor dns_rdataslab_subtract() for efficiency
reduce the number of rdata comparisons needed by walking
through the original slab once to determine whether the rdata
in it is duplicated in the slab to be subtracted, and then
write out the rdatas that aren't. previously, this was
done twice: once when determining the size of the target buffer
and then again when copying data into it.
2025-02-19 15:00:15 -08:00
Evan Hunt
1d5fe36136 refactor dns_rdataslab_merge() for efficiency
when merging two rdata slabs, we now check once to see
whether an item in the new slab has a duplicate in the
old. previously this was done twice; once to determine the
size of the target buffer required, and then again when
copying the data into it.

we also minimize the number of rdata comparisons necessary,
by remembering which items in the old slab have already been
found to be duplicates.
2025-02-19 15:00:15 -08:00
Evan Hunt
ed83455c81 dns_slabheader_fromrdataset() -> dns_rdataset_getheader()
The function name dns_slabheader_fromrdataset() was too similar
to dns_rdataslab_fromrdataset(). Instead, we now have an rdataset
method 'getheader' which is implemented for slab-type rdatasets.

A new NOHEADER rdataset attribute is set for rdatasets using
raw slabs (i.e., noqname and closest encloser proofs); when
called on rdatasets with that flag set, dns_rdataset_getheader()
returns NULL.
2025-02-19 14:58:32 -08:00
Evan Hunt
82edec67a5 initialize header in dns_rdataslab_fromrdataset()
when dns_rdataslab_fromrdataset() is run, in addition to
allocating space for a slab header, it also partially
initializes it, setting the type match rdataset->type and
rdataset->covers, the trust to rdataset->trust, and the TTL to
rdataset->ttl.
2025-02-19 14:58:32 -08:00
Evan Hunt
b4bde9bef4 clarify dns_rdataslab_fromrdataset()
there are now two functions for creating an rdataslab from an
rdataset: dns_rdataslab_fromrdataset() creates a full slab (including
space for a slab header), and dns_rdataslab_raw_fromrdataset() creates
a raw slab.
2025-02-19 14:58:32 -08:00
Evan Hunt
f1ab7f199b refactor dns_rdataslab_merge() and _subtract()
these two functions have been refactored for clarity
and readability, with a more logical flow, added comments,
and less code duplication.
2025-02-19 14:58:32 -08:00
Evan Hunt
6908d1f9be more rdataslab refactoring
- there are now two functions for getting rdataslab size:
  dns_rdataslab_size() is for full slabs and dns_rdataslab_sizeraw()
  for raw slabs. there is no longer a need for a reservelen parameter.
- dns_rdataslab_count() also no longer takes a reservelen parameter.
  (currently it's never used for raw slabs, so there is no _countraw()
  function.)
- dns_rdataslab_rdatasize() has been removed, because
  dns_rdataslab_sizeraw() can do the same thing.
- dns_rdataslab_merge() and dns_rdataslab_subtract() both take
  slabheader parameters instead of character buffers, and the
  reservelen parameter has been removed.
2025-02-19 14:58:32 -08:00
Evan Hunt
4601d4299a fix and simplify dns_rdataset_equal() and _equalx()
if both rdataslabs being compared have zero length, return true.

also, since these functions are only ever called on slabheaders
with sizeof(dns_slabheader_t) as the reserve length, we can
simplify the API: remove the reservelen argument, and pass the
slabs as type dns_slabheader_t * instead of unsigned char *.
2025-02-19 14:58:32 -08:00
Ondřej Surý
15fe68e50d Add .up pointer to slabheader
The dns_slabheader object uses the 'next' pointer for two purposes.
In the first header for any given type, 'next' points to the first
header for the next type. But 'down' points to the next header of
the same type, and in that record, 'next' points back up.

This design made the code confusing to read.  We now use a union
so that the 'next' pointer can also be called 'up'.
2025-02-19 14:58:32 -08:00
Artem Boldariev
2adabe835a DoH: http_send_outgoing() return value is not used
The value returned by http_send_outgoing() is not used anywhere, so we
make it not return anything (void). Probably it is an omission from
older times.
2025-02-19 17:52:36 +02:00
Artem Boldariev
8b8f4d500d DoH: Fix missing send callback calls
When handling outgoing data, there were a couple of rarely executed
code paths that would not take into account that the callback MUST be
called.

It could lead to potential memory leaks and consequent shutdown hangs.
2025-02-19 17:52:36 +02:00
Artem Boldariev
a22bc2d7d4 DoH: change how the active streams number is calculated
This commit changes the way how the number of active HTTP streams is
calculated and allows it to scale with the values of the maximum
amount of streams per connection, instead of effectively capping at
STREAM_CLIENTS_PER_CONN.

The original limit, which is intended to define the pipelining limit
for TCP/DoT. However, it appeared to be too restrictive for DoH, as it
works quite differently and implements pipelining at protocol level by
the means of multiplexing multiple streams. That renders each stream
to be effectively a separate connection from the point of view of the
rest of the codebase.
2025-02-19 17:52:36 +02:00
Artem Boldariev
05e8a50818 DoH: Track the amount of in flight outgoing data
Previously we would limit the amount of incoming data to process based
solely on the presence of not completed send requests. That worked,
however, it was found to severely degrade performance in certain
cases, as was revealed during extended testing.

Now we switch to keeping track of how much data is in flight (or ready
to be in flight) and limit the amount of processed incoming data when
the amount of in flight data surpasses the given threshold, similarly
to like we do in other transports.
2025-02-19 17:52:36 +02:00
Evan Hunt
e58ce19cf2 when committing a new qpzone version, delete dead nodes
if all data has been deleted from a node in the qpzone
database, delete the node too.
2025-02-18 14:22:38 -08:00
Ondřej Surý
ce9f6e68c3 Unify how we handle database version in the cache
Database versions are not used in cache databases. Some places in
qpcache.c required the version argument to be NULL; others marked it
as UNUSED. Unify all cases to require version to be NULL.
2025-02-18 20:15:00 +00:00
Ondřej Surý
2d53796e28 Clean up 'now' usage in the cache
Unify the way we handle the 'now' argument in the cache: when it's
set to zero by the caller, it is replaced with isc_stdtime_now().
2025-02-18 20:15:00 +00:00
Ondřej Surý
3b2fe808c4 Clean up the search part in qpcache_find()
Slightly refactor the header search in qpcache_find(), so the scope
level is reduced and the cname parts are logically grouped together.
2025-02-18 20:15:00 +00:00
Ondřej Surý
bfb219ac2d Refactor the search in qpcache_findrdataset()
Add new related_headers() function that simplifies the code
flow in qpcache_findrdataset().  Also use check_stale_header() function
to remove code duplication.
2025-02-18 20:15:00 +00:00
Ondřej Surý
cf66ba02a4 Refactor simple slabheader matching
Add a helper function both_headers() that unifies the slabheader
matching for simple type: it returns true when both the type and
the matching RRSIG have been found.
2025-02-18 20:15:00 +00:00
Ondřej Surý
4cd1dd8dd7 Add new helper maybe_update_headers() function
The new maybe_update_headers() function unifies the LRU updates to the
slabheaders that was scattered all over the place.  More calls to update
headers after bindrdatasets() were also added for completeness.
2025-02-18 20:15:00 +00:00
Ondřej Surý
4448f1adb2 Add bindrdatasets() function that binds both rdatasets
This removes code duplication between the dual bindrdataset() calls.  It
also unifies the handling as there were small differences between the
calls: one variant was checking for !NEGATIVE(found) condition and one
wasn't, and it is technically ok to do the check for all variants.
2025-02-18 20:15:00 +00:00
Ondřej Surý
53d9ef5bd0 Refactor check_stale_header() function
The check_stale_header() function now updates header_prev directly
so it doesn't have to be handled in the outer loop; it's always
set to the correct value of the previous header in the chain.
2025-02-18 20:15:00 +00:00
Evan Hunt
5281c708d3 clean up unnecessary code in qpcache
some code was left in the cache database implementation after
it was separated from the zone database, and can be cleaned up
and refactored now:

- the DNS_SLABHEADERATTR_IGNORE flag is never set in the cache
- support for loading the cache from was removed, but the add()
  function still had a 'loading' flag that's always false
- two different macros were used for checking the
  DNS_SLABHEADERATTR_NONEXISTENT flag - EXISTS() and NONEXISTENT().
  it's clearer to just use EXISTS().
- the cache doesn't support versions, so it isn't necessary to
  walk down the 'down' pointer chain when iterating through the
  cache or looking for a header to update.  'down' now only points
  to records that are deleted from the cache but have not yet been
  purged from memory. this allows us to simplify both the iterator
  and the add() function.
2025-02-18 20:15:00 +00:00
Artem Boldariev
fd3beaba2e Fix wrong logging severity in do_nsfetch()
ISC_LOG_WARNING was used while ISC_LOG_DEBUG(3) was implied.
2025-02-18 10:28:23 +02:00
Evan Hunt
fffa150df3 fix dns_qp_insert() checks in qpzone
in some places there were checks for failures of dns_qp_insert()
after dns_qp_getname(). such failures could only happen if another
thread inserted a node between the two calls, and that can't happen
because the calls are serialized with dns_qpmulti_write(). we can
simplify the code and just add an INSIST.
2025-02-17 12:21:50 -08:00
Aram Sargsyan
d5d63d6253 Fix a bug in generic_totext_in_svcb()
The 'sbpr_dohpath' case was missing from the switch-case. Add the
'sbpr_dohpath' case, which should work similarly as the 'sbpr_text'
case.
2025-02-17 17:33:43 +00:00
Aram Sargsyan
c6e3695478 Use named Service Parameter Keys (SvcParamKeys) by default
When converting SVCB records to text representation use named
SvcParamKeys values unless backward-compatible mode is activated,
in which case the values which were not defined initially in
RFC9460 and were added later (see [1]) are converted to opaque
"keyN" syntax, like, for example, "key7" instead of "dohpath".

[1] https://www.iana.org/assignments/dns-svcb/dns-svcb.xhtml

Co-authored-by: sdomi <ja@sdomi.pl>
2025-02-17 17:33:43 +00:00
alessio
53991ecc14 Refactor and simplify isc_symtab
This commit does several changes to isc_symtab:

1. Rewrite the isc_symtab to internally use isc_hashmap instead of
   hand-stiched hashtable.

2. Create a new isc_symtab_define_and_return() api, which returns
   the already defined symvalue on ISC_R_EXISTS; this allows users
   of the API to skip the isc_symtab_lookup()+isc_symtab_define()
   calls and directly call isc_symtab_define_and_return().

3. Merge isccc_symtab into isc_symtab - the only missing function
   was isccc_symtab_foreach() that was merged into isc_symtab API.

4. Add full set of unit tests for the isc_symtab API.
2025-02-17 11:43:19 +01:00
Mark Andrews
04b1484ed8 Re-fetch pending records that failed validation
If a deferred validation on data that was originally queried with
CD=1 fails, we now repeat the query, since the zone data may have
changed in the meantime.
2025-02-17 08:57:58 +11:00
Mark Andrews
8b900d1808 Complete the deferred validation if there are no RRSIGs
When a query is made with CD=1, we store the result in the
cache marked pending so that it can be validated later, at
which time it will either be accepted as an answer or removed
from the cache as invalid.  Deferred validation was not
attempted when there were no cached RRSIGs for DNSKEY and
DS.  We now complete the deferred validation in this scenario.
2025-02-17 08:57:58 +11:00
Mark Andrews
5e49a9e4ae Fix "CNAME and other data" detection
prio_type was being used in the wrong place to optimize cname_and_other.
We have to first exclude and accepted types and we also have to
determine that the record exists before we can check if we are at
a point where a later CNAME cannot appear.
2025-02-14 01:51:38 +00:00
Ondřej Surý
732fc338a9
Switch the locknum generation for qpznode to random
Instead of using on hash of the name modulo number of the buckets,
assign the locknum randomly with isc_random_uniform().  This makes
the locknum assignment aligned with qpcache and allows the bucket
number to be non-prime in the future.
2025-02-04 22:50:49 +01:00
Ondřej Surý
1fa5219fdf
Rely on call_rcu() to destroy the qpzone outside of locks
Reduce the number of qpzone_ref() and qpzone_unref() calls in
qpzone_detachnode() by relying on the call_rcu to delay
the destruction of the lock buckets.
2025-02-04 21:37:46 +01:00
Ondřej Surý
6dcc398726
Reduce false sharing in dns_qpzone
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpzone_bucket_t that is cacheline aligned and have
a single array of those.
2025-02-04 21:37:46 +01:00
Ondřej Surý
c602d76c1f
Reduce false sharing in dns_qpcache
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpcache_bucket_t struct that is cacheline aligned
and have a single array of those.

Additionaly, make both the head and the tail of isc_queue_t padded, not
just the head, to prevent false sharing of the lock-free structure with
the lock that follows it.
2025-02-04 21:37:46 +01:00
Aram Sargsyan
19843f6c9d Include destination address port number in query logging
When query logging is enabled, named will now include the destination
address port in the logged message.

Example messages for before and after this change:

    before: client @0x7608b2026000 10.53.0.1#52136 (example.test): query: example.test IN A +E(0)K (10.53.0.1)
    after:  client @0x729bf5c26000 10.53.0.1#35976 (example.test): query: example.test IN A +E(0)K (10.53.0.1#53)
2025-02-04 10:49:26 +00:00
Ondřej Surý
355fc48472
Print the expiration time of the stale records (not ancient)
In #1870, the expiration time of ANCIENT records were printed, but
actually the ancient records are very short lived, and the information
carries a little value.

Instead of printing the expiration of ANCIENT records, print the
expiration time of STALE records.
2025-02-03 15:47:06 +01:00
Ondřej Surý
36a3ceb19f
Restore the .ttl field for slabheader in dns_qpzone
The original .ttl field was actually used as TTL in the dns_qpzone unit.
Restore the field by adding it to union with the .expire struct member
and cleanup all the code that added or subtracted 'now' from the ttl
field as that was misleading as 'now' would be always 0 for qpzone
database.
2025-02-03 14:39:06 +01:00
Ondřej Surý
60f6b88c63
Remove duplicate 'now' argument from find_coveringnsec()
The find_coveringnsec() was getting the 'now' from two sources -
search->now and separate now argument.  Things like this are ticking
bombs, remove the extra 'now' argument and use single source of 'now'.
2025-02-03 14:39:06 +01:00