Commit graph

16614 commits

Author SHA1 Message Date
Alessio Podda
ffbab8ce23 Clean up comments
A comment was still referencing the rdataset attribute, now it has been
fixed.
2026-05-11 10:28:20 +02:00
Alessio Podda
52edb98e3c Delay binding glue to rdataset
The dns_glue struct currently contains four dns_rdataset structs to hold
the glue. These structs are over 100 bytes each because they need to be
able to hold data for multiple types of databases.

Since the dns_glue_t type is only used by qpzone, we can instead hold
pointers to the vecheaders directly, and only bind the vecheaders to
the rdatasets when adding the glue to the message.
2026-05-11 10:28:20 +02:00
Alessio Podda
d7a5a2e86b Make dns_glue_t private to qpzone
The dns_glue_t, dns_gluelist_t and dns_glue_additionaldata_ctx types are
only used in qpzone.c. This commits moves them to the private header
qpzone_p.h.

This is done in preparation of a followup commit that will refactor them
to use types that are private to qpzone.
2026-05-11 10:22:25 +02:00
Evan Hunt
2c60870527 check for val->name == NULL when adding EDE text
When a validator is being shut down, the associated name
`val->name` is set to NULL.  This could cause a crash if a worker
thread subsequently added an EDE code to the response containing
val->name in the extra text.

`validator_addede()` now checks whether the name is NULL before
trying to add it to the extra text.
2026-05-06 20:47:43 +00:00
Aram Sargsyan
67e0090371 Fix a bug in catz_process_apl()
The allow-transfer/allow-query catalog zone custom properties support
only APL RRtypes. All other types are correctly rejected by the
catz_process_apl() function. However, when an APL RRtype is processed
by that function, and another (non-APL) RRtype is then attempted to be
processed, there is an assertion failure happening in the prologue
of the function because `*aclbp != NULL` (i.e. an APL has been already
processed). Move the code to do type checking before the affected
REQUIRE assertion.
2026-05-06 19:35:23 +00:00
Aram Sargsyan
4576a67a93 Fix a memory leak issue in catz_process_primaries()
Free the old version of the keyname (if it exists) before setting
the new one.
2026-05-06 17:30:51 +00:00
Evan Hunt
7213b038f0 Clear dns64_aaaaok immediately after use
The DNS64 state information stored in client->query.dns64_aaaaok
could cause an assertion failure in query_respond() if the server
was configured in such a way as to trigger a new recursion before
the query had been reset - for example, by using the filter-aaaa
plugin, which may need to recurse to find out whether an A record
exists.

This has been addressed by clearing DNS64 state information
immediately after the call to query_filter64().
2026-05-06 06:46:32 +02:00
Evan Hunt
b26a860ec8 Fix a stack use-after-free in qpzone
In previous_closest_nsec(), a new qpreader was opened to search the NSEC
tree. It was possible for that to be used to update a QP iterator object
owned by the caller, and then be destroyed when the function returned.

This qpreader object isn't necessary anymore; since namespaces were
added to the QP trie in commit 15653c54a0, we can now just reuse the
existing reader for the main tree.
2026-05-05 23:19:30 +00:00
Evan Hunt
26c895cc92 Hold a reference to the NTA table for the lifetime of each NTA
Each dns__nta_t now references its parent ntatable in nta_create() and
releases it in dns__nta_destroy().  This avoids a use-after-free in
fetch_done() and other callbacks that dereference nta->ntatable: the
ntatable could otherwise be released by view destruction while an
in-flight resolver fetch still holds a reference to the NTA.
2026-05-05 22:27:39 +02:00
Ondřej Surý
f9d24b1b85
Reject oversized RRsets at slab/vec construction
makeslab(), makevec(), dns_rdatavec_merge() and dns_rdatavec_subtract()
summed per-record storage into an unsigned int with no upper-bound
check.  An RRset whose total encoded size exceeds DNS_RDATA_MAXLENGTH
cannot fit in a DNS message and is unservable; building its in-memory
representation only burns memory on data that will fail at response
time, and at the upper bound the running sum could in theory wrap.

Cap the running total at DNS_RDATA_MAXLENGTH and return ISC_R_NOSPACE
when exceeded.  Update the qpdb cache memory-purge test to use a
record size that fits within the new limit.

Assisted-by: Claude:claude-opus-4-7
2026-05-05 18:14:40 +02:00
Mark Andrews
cd96894bcd
Remove remaining RFC 3445 KEY flags
RFC 3445 also eliminated the DNS_KEYTYPE_NOAUTH, DNS_KEYTYPE_NOCONF,
and DNS_KEYOWNER_ENTITY flags. With NOAUTH and NOCONF gone, the
concept of NOKEY can no longer be expressed in KEY records.

DNS_KEYOWNER_ENTITY was already unused as of 22d688f656 but still
defined; that is now also removed.
2026-05-05 10:17:31 +02:00
Evan Hunt
9c06f0a41d
Remove DNS_KEYFLAG_EXTENDED
The DNS_KEYFLAG_EXTENDED flag was only legitimate for type KEY
and was eliminated by RFC 3445. Dropping the extended-flags
handling in pub_compare() also fixes a possible crash when
signing a zone whose journal contains a crafted DNSKEY: a
6-byte record with the EXTENDED bit set produced a memmove()
length that underflowed and ran off a stack buffer.
2026-05-05 10:16:02 +02:00
Ondřej Surý
a6b7ce29c4
Use a keyed hash for the RRL bucket table
The previous hash_key() was a deterministic, unkeyed (<<1) + add over the
key words.  An off-path attacker could invert it offline and submit
queries whose source /24, qname hash, and qtype map to a single bucket;
under chaining this turns every lookup into an O(N) walk under
rrl->lock and starves legitimate query processing on the very feature
deployed to mitigate DoS.

Replace it with isc_hash32(), which is HalfSipHash-2-4 keyed by a
per-process random seed, so collision sets cannot be precomputed.

Assisted-by: Claude:claude-opus-4-7
2026-05-04 13:39:01 +02:00
Ondřej Surý
141e8110f7 Guard parent-NS walk against running off the root
Once the walk reaches the root, splitting one more label off would
trip an internal assertion and abort named.  Stop cleanly with
ISC_R_NOTFOUND so the dispatcher cancels the fetch.  Only reachable
through misconfiguration (root configured as a primary with parental
agents, or a parent zone that NODATAs its own NS).

Assisted-by: Claude:claude-opus-4-7
2026-05-01 08:18:36 +02:00
Ondřej Surý
2d468cb21f Assert adb find loop-affinity invariant at lifetime entry points
The dns_adbfind_t lifetime model has no reference counting; storage
liveness is held together by find->lock and the FIND_EVENT_SENT
idempotency flag, plus an unwritten cross-module rule that all
non-trivial operations on a find run on find->loop. If a caller
violates that rule, the unlock-relock window in dns_adb_cancelfind
(and similar paths) becomes a use-after-free and we crash later
inside libpthread on a corrupted mutex.

Add REQUIREs at dns_adb_cancelfind, dns_adb_destroyfind and
find_sendevent so a violation aborts at the offending call site
rather than silently freeing storage another loop is still touching.
Also poison find->magic with ~DNS_ADBFIND_MAGIC in free_adbfind so
DNS_ADBFIND_VALID catches reuse-after-free at the next public entry
point instead of letting the dangling pointer reach the mutex code.

Assisted-by: Claude:claude-opus-4-7
2026-05-01 07:50:29 +02:00
Ondřej Surý
045d5d0455 Reject RSA DNSKEYs with degenerate modulus at parse time
The wire-format RSA DNSKEY parser used the residual rdata length after
the exponent as the modulus length, with no positive lower bound.  A
crafted DNSKEY whose declared exponent length consumed the whole buffer
produced n = 0; the BN_bin2bn(_, 0, _) returned a non-NULL BIGNUM, the
NULL-check passed, and dnssec-importkey -f wrote out a "valid" key with
no key material.  RSASHA1 also bypassed the algorithm-specific lower
bound in opensslrsa_createctx (which only checks an upper bound for the
SHA1 algorithms), so the degenerate key reached the verify path with
whatever behaviour the linked OpenSSL exhibits for n = 0.

Add OPENSSLRSA_MIN_MODULUS_BITS = 512 (the lowest legitimate modulus
across the RSA DNSSEC algorithms per RFC 5702) and reject smaller
moduli at parse time in opensslrsa_fromdns, opensslrsa_parse, and
opensslrsa_fromlabel — the same three load paths where the existing
exponent upper-bound check lives.

Assisted-by: Claude:claude-opus-4-7
2026-04-30 15:50:32 +02:00
Ondřej Surý
ab8c1a77e0 Reject RSA DNSKEYs with oversize public exponents at parse time
The wire-format RSA DNSKEY parser was the only key path with no upper
bound on the public exponent — opensslrsa_parse and opensslrsa_fromlabel
already cap at RSA_MAX_PUBEXP_BITS.  An attacker-controlled DNSKEY could
therefore force a validator to compute s^e mod n with e up to ~|n| bits,
amplifying every verify by ~120x for typical 2048-bit moduli (OpenSSL
itself only caps the exponent for moduli above 3072 bits).  Apply the
same bit-count cap to wire-format keys.

Assisted-by: Claude:claude-opus-4-7
2026-04-30 10:55:42 +02:00
Ondřej Surý
4d465f4fa5 Dispatch ratelimiter events under the lock
isc__ratelimiter_tick() and isc_ratelimiter_shutdown() each pulled
events out of rl->pending into a function-local list, dropped the
mutex, and then iterated.  ISC_LIST_APPEND leaves the link in the
LINKED state, so a concurrent isc_ratelimiter_dequeue() saw an
event as still queued, called ISC_LIST_UNLINK against rl->pending —
which patched the prev/next of the local list — and freed the
event before dispatch finished, producing either an INSIST in the
unlink macro or a use-after-free in the dispatch loop.

isc_async_run() is a non-blocking wfcq enqueue, so there is no
benefit to dropping the mutex around it.  Unlink each event and
hand it to isc_async_run() while still holding rl->lock; the
existing ISC_LINK_LINKED check in dequeue then correctly
distinguishes "still queued and cancellable" from "already taken".

Assisted-by: Claude:claude-opus-4-7
2026-04-30 10:16:32 +02:00
Ondřej Surý
c62f24f7ee Fix swapped arguments in redirect2() single-label branch
For a query whose qname is the root, the labels==1 branch in
redirect2() called dns_name_copy(redirectname, view->redirectzone)
with arguments reversed, overwriting the view-global
nxdomain-redirect target with the empty redirectname rather than
copying the configured target into the per-query lookup name.  After
the corruption, view->redirectzone names the root, so
dns_name_issubdomain() makes redirect2() short-circuit for every
subsequent query and the nxdomain-redirect feature stops working
until named is restarted.

Triggering this needs the resolver to receive an NXDOMAIN for the
root from upstream, which does not happen in normal DNS operation.

Swap the arguments to match the dns_name_copy(source, dest)
signature.  Add a system test that issues a root query through the
nxdomain-redirect resolver and verifies the redirect feature still
works for a normal NXDOMAIN-producing query afterwards.

Assisted-by: Claude:claude-opus-4-7
2026-04-29 21:46:27 +02:00
Ondřej Surý
46f6bb6364
Size HMAC key generation buffers to the maximum block size
hmac_generate() declared its on-stack nonce buffer as
unsigned char data[ISC_MAX_MD_SIZE], i.e. 64 bytes. That is the maximum
digest size, but the buffer is filled up to the algorithm's HMAC block
size, which is 128 bytes for SHA-384 and SHA-512. Asking rndc-confgen
for an HMAC-SHA-384 or HMAC-SHA-512 key with -b > 512 (the documented
range allows up to 1024) wrote past the end of the stack buffer; on
hardened builds this aborted with a stack-smash detector firing
instead of producing a key.

Use the existing ISC_MAX_BLOCK_SIZE (128) for the buffer so the full
1..1024 range advertised by -A hmac-sha{384,512} works as documented.
The matching key_rawsecret[64] in confgen's generate_key() is enlarged
the same way so the generated key fits when dumped to the buffer.

Add a system test that exercises rndc-confgen across the previously
overflowing keysizes; with -Db_sanitize=address it caught the abort
before the fix.

Assisted-by: Claude:claude-opus-4-7
2026-04-29 19:21:20 +02:00
Ondřej Surý
6082274450 Stop isc_file_safecreate from following symlinks
The function existence-checked the target with stat() and then opened
the same path without O_NOFOLLOW, so a symlink at the target path
passed the regular-file test against the link's destination and the
open() that followed truncated and wrote through the link.
rndc-confgen -a is typically run as root and writes the keyfile under
a directory that service accounts may have write access to, so a stray
symlink there would silently redirect the truncate, fchown, and
overwrite to whatever file the link pointed at.

Switch the existence check to lstat() and use S_ISREG() so a symlink's
S_IFLNK mode is detected directly (a plain bitmask of S_IFREG matches
both, since S_IFLNK shares its high bit). Add O_NOFOLLOW to both
open() flag sets to close the lstat/open TOCTOU window. Hardening
against unexpected symlinks on intermediate path components is out of
scope.

Assisted-by: Claude:claude-opus-4-7
2026-04-29 16:56:25 +02:00
Ondřej Surý
746fb28369
Drop unused DNS_MASTER_NOINCLUDE and warn about untrusted zone text
DNS_MASTER_NOINCLUDE was defined to suppress $INCLUDE processing, but
no caller ever set it, so the guarded code path was dead and the flag
gave the false impression that named-checkzone could be hardened
against untrusted input. The zone-file parser cannot safely read text
from a less-trusted source than the user running the tool: $INCLUDE
opens any local file readable by that user, and fragments of its
contents leak through tokenizer error messages.

Rather than wire up an opt-in flag that suggests this is a supported
mode, remove the dead flag and the dead guard, and document in the
named-checkzone and named-compilezone manual pages that these tools
must not be run on zone text from an untrusted source.

Assisted-by: Claude:claude-opus-4-7
2026-04-29 15:08:20 +02:00
Colin Vidal
aeee4c1c1d Do not add glues from different parent in delegdb
When processing a referral, the `cache_delegns()` function was accepting
glues from a different parent. For instance:

```
AUTHORITY
test.example.		NS	ns.test.example.
test.example.		NS	ns.foo.example.
test.example.		NS	ns.bar.

ADDITIONAL
ns.bar.			A	1.2.3.4
ns.foo.example.		A	5.6.7.8
ns.test.example.	A	9.8.7.6
```

In such situation, only the glues for `ns.foo.example.` and
`ns.test.example.` should be used, and the glue from `ns.bar.` should be
ignored as this is not either a sub-domain or a sibling domain, the
parent is different (`bar.` instead of `example.`). This is now fixed.

Sibling glue and cyclic sibling glues are defined in RFC 9471 section
2.2 and section 2.3.
2026-04-28 19:17:39 +01:00
Aram Sargsyan
4ede6edc54 Remove OpenSSL memory tracking support from the ossl3.c module
OPENSSL_cleanup() in OpenSSL 4 doesn't free the memory, and that is
not compatible with BIND 9's memory leak detection code. Don't use
custom allocation/deallocation functions for OpenSSL's internal memory
management in the ossl3.c module.

See https://github.com/openssl/openssl/pull/29721
2026-04-28 14:42:40 +00:00
Aydın Mercan
48a77a4bfc don't set named curves explicitly in pre-3.0 libcrypto
The function `EC_KEY_set_asn1_flag` is deprecated in AWS-LC. Fortunately
calling it to make sure we use named curve keys is entirely unnecessary.

More information for pre-3.0 libcrypto and significant forks are as
following:

OpenSSL: Named curves were the default between 1.1.0 and 3.6.1 [1],[2]
AWS-LC: Library only supports named curves in the first place [3]
BoringSSL: Likewise with AWS-LC [4]
LibreSSL: `EC_GROUP`s are named by default [5]

[1] 86f300d385
[2] 9db6af922c
[3] a605df416b/include/openssl/ec_key.h (L442-L445)
[4] 514abb73bb/include/openssl/ec_key.h (L279-L280)
[5] c933874518/src/lib/libcrypto/ec/ec_lib.c (L94)
2026-04-28 09:28:18 +03:00
Alessio Podda
0fe1d091f7 Fix race condition in getsigningtime()
Compute qpzone_get_lock(elem->node) into a local variable while the
heap lock is still held, rather than dereferencing the stale elem
pointer after releasing the lock. A concurrent thread running
setsigningtime() (e.g. via IXFR apply on a worker thread) could free
the top-of-heap element between the heap lock release and the
dereference, causing a use-after-free.
2026-04-27 18:09:47 +02:00
Evan Hunt
7e3561a477 remove unneeded options in dns_zonefetch
In the dns_zonefetch mechanism, some option flags for
dns_resolver_createfetch() were used for all fetches, but
were actually only needed by the DNSKEY refresh fetches.

(Specifially, these options were DNS_FETCHOPT_UNSHARED
and DNS_FETCHOPT_NOCACHED, which were used along with
DNS_FETCHOPT_NOVALIDATE to ensure we get a new copy of
the DNSKEY as it is currently published by the authority,
without prior validation.  Those conditions are needed
for RFC 5011 trust anchor maintenace, but not when looking
up parent-NS or DSYNC RRsets.)
2026-04-22 10:58:43 +00:00
Ondřej Surý
592f3cc671
Add DTRACE probes to dns_delegdb
Instrument the delegation cache (introduced to back both NS-based and
DELEG-based delegations) with 11 USDT probes in the libdns provider so
that hit rate, eviction pressure, and lookup latency can be measured
without recompiling or enabling logging.

The probes are:

- delegdb_lookup_start / delegdb_lookup_done wrap dns_delegdb_lookup()
  and pass the query name plus the result code.

- delegdb_insert_start / delegdb_insert_done wrap dns_delegset_insert().
  The early SHUTTINGDOWN return is funneled through the cleanup label
  so the done probe fires on every path.

- delegdb_cleanup_start / delegdb_cleanup_done bracket the SIEVE-based
  eviction triggered when the cache goes overmem, reporting the number
  of bytes requested and actually reclaimed.  An additional per-node
  delegdb_evict probe (guarded by _ENABLED() because it fires inside
  the loop) exposes which zones are being evicted.

- delegdb_create, delegdb_reuse, and delegdb_shutdown trace the per-view
  lifecycle across server reloads.

- delegdb_delete traces rndc flush-delegation paths, reporting whether
  a subtree or single name was removed.

Name arguments are stringified with dns_name_format() behind
LIBDNS_*_ENABLED() guards so that the hot lookup and insert paths remain
zero-cost when no consumer is attached.
2026-04-20 13:14:19 +02:00
Ondřej Surý
3a44a13232 Refuse SIG and NXT records in dynamic updates
SIG (24) and NXT (30) are obsolete DNSSEC record types, superseded by
RRSIG and NSEC in RFC 3755.  Allowing them through dynamic update
exposes two distinct bugs that the surrounding GL#5818 work already
fixes as defense-in-depth:

  - dns__db_findrdataset() used to REQUIRE that (covers == 0 ||
    type == RRSIG), which aborts named when a SIG update reaches the
    prescan foreach_rr() call.  Fixed to accept dns_rdatatype_issig().
  - diff.c rdata_covers() used to test only RRSIG, dropping the
    covered-type field for SIG rdatas; the zone DB then filed every
    SIG rdataset under typepair (SIG, 0) instead of
    (SIG, covered_type) and follow-up adds collided at that bucket.
    Fixed to use dns_rdatatype_issig().

Both underlying bugs are still reachable via inbound zone transfer
(diff.c rdata_covers() runs from both dns_diff_apply on the IXFR path
and dns_diff_load on the AXFR path), so the type-helper fixes above
remain necessary.  For the dynamic-update path, the simplest and
safest posture is to refuse SIG and NXT outright at the front door in
ns/update.c, alongside the existing NSEC/NSEC3/non-apex-RRSIG
refusals.  KEY remains permitted because it is still used to carry
public keys for SIG(0) transaction authentication.

The existing tcp-self SIG regression test is repointed to assert
REFUSED on the SIG add, a symmetric NXT test is added, and the
SIG-via-dyn-update covers-bucket test is removed because it is no
longer reachable through this entry point; AXFR-based coverage of
diff.c rdata_covers() follows in a separate commit.
2026-04-17 16:09:39 +02:00
Ondřej Surý
0a5ba57116 Fix dropped covers field for SIG records in dns_diff_apply
rdata_covers() in lib/dns/diff.c discriminated only on
dns_rdatatype_rrsig (46) and returned 0 for the legacy SIG (24), so
the covered-type field was silently discarded on the dynamic-update
and IXFR paths.  Every SIG rdataset was then filed in the zone DB
under typepair (SIG, 0) instead of (SIG, covered_type); a second SIG
add with a different covers but a different TTL collided at that
bucket, tripped DNS_DBADD_EXACTTTL in qpzone, returned
DNS_R_NOTEXACT, and came back to the client as SERVFAIL.

Use dns_rdatatype_issig() here so both SIG and RRSIG carry their
covers through the diff, matching the helper pattern already used in
lib/dns/master.c, lib/ns/xfrout.c, lib/dns/qpcache.c, and the
dns__db_findrdataset() REQUIRE that the surrounding merge request
just relaxed.
2026-04-17 16:09:39 +02:00
Mark Andrews
03edeccaa1 Fix assertion failure in dns_db_findrdataset() for SIG records
dns__db_findrdataset() had a REQUIRE() that only accepted
dns_rdatatype_rrsig when the covers parameter was set.  A dynamic
update containing a SIG record (type 24) would trigger this
assertion, crashing named.  Use dns_rdatatype_issig() to accept
both SIG and RRSIG.
2026-04-17 16:09:39 +02:00
Alessio Podda
c7a167c739 Fix strict weak ordering violation in resign_sooner()
resign_sooner_values() only checked whether rhs was SOA-typed when
resign times were equal, but did not check lhs. When both entries were
SOA-typed with equal resign times, the comparison returned true in both
directions, violating irreflexivity and corrupting heap invariants.

Add lhs_typepair parameter and require lhs to be non-SOA for the
tie-breaking logic to apply.
2026-04-17 14:31:15 +02:00
Alessio Podda
bcfa2adaa3 Add missing parenthesis to fxhash
The fxhash implementation had a missing parenthesis that caused it to
diverge from Rust's reference implementation. This commit fixes this.
2026-04-16 16:03:40 +02:00
Ondřej Surý
0c007d8659
Rename view->hints to view->rootdb and rearm priming
With the parent-centric resolver, dns_view_bestzonecut() consults the
delegation DB (view->deleg) rather than the main cache for the closest
zonecut.  Root is never the target of a referral, so it never lands in
delegdb; bestzonecut therefore falls through to the hints lookup on
every query whose closest ancestor is root.  prime_done() only called
dns_root_checkhints(), which logs discrepancies but does not update
any store bestzonecut looks at, so the fresh root NS records obtained
by priming were never used and priming kept re-firing.

Rename view->hints to view->rootdb and refresh it when a priming
fetch completes: the '.' NS rdataset is replaced with the fetched
one, and for each listed nameserver the matching A/AAAA glue is
copied from the response's ADDITIONAL section.  Only glue for names
that actually appear as NS targets is accepted, so a hostile response
cannot inject unrelated records.  Glue the response did not carry is
left untouched, so the hints-file records loaded at startup remain as
a fallback.

Each view gets its own rootdb: the previous shared
named_g_server->in_roothints is gone, and configure_view() calls
dns_rootns_create() per view when the class-IN defaults are needed.
That keeps the priming writer one-per-DB, so concurrent priming in
different views cannot race on the same zone-DB version.

The rootdb refresh runs synchronously from the resolver response path,
so records go straight from the wire into rootdb with no cache round
trip and no dependency on DNSSEC validation state.  A new
DNS_FETCHOPT_PRIMING option marks the priming fetch; prime_done()
itself is now pure cleanup.

Track the rootdb freshness window in view->rootdb_expires and trigger
re-priming lazily from dns_view_find() and bestzonecut_rootdb() only
when the window has elapsed.  Stale records are still served while the
fresh priming fetch is in flight.

Drop dns_root_checkhints() and its helpers; the rootdb is now the
authoritative source the resolver consults.
2026-04-16 13:39:18 +02:00
Aram Sargsyan
e30a535760 Treat '%' and '$' as special characters for catalog member zone names
The filename of the catalog member zones are generated dynamically
based on the zone's name. If the zone's name is too long or if it
contains special characters the name's digest is used instead.

Since '%' and '$' are now treated as special characters in the zone
names (see !10779), add these characters to the list of the special
characters.
2026-04-16 11:37:02 +00:00
Aram Sargsyan
d2f7b969fe Fix case-sensitivity bug in zone filename token-parsing
The setfilename() function uses case-insensitive strcasestr() when
matching the possible tokens, but then one of the token parsers
uses case-sensitive INSIST checks which can assert when, for example,
matching '%X' and INSIST only accepts '%x'.

The case-insensitivity is documented, which means it's the parser
that needs to be fixed, not the matcher.

Convert the character to lowercase before checking the token's
validity.
2026-04-16 11:37:02 +00:00
Colin Vidal
1d10e4513f rename DNS_DBFIND_NOEXACT to DNS_DBFIND_ABOVE
The `DNS_DBFIND_NOEXACT` flag name is ambiguous, as it does not clearly
indicate the lookup behavior (e.g., sibling, child, or parent).

Rename it to `DNS_DBFIND_ABOVE` to better reflect that the lookup
targets a closer ancestor name.
2026-04-16 11:28:13 +02:00
Ondřej Surý
ec024735df Replace FIXME with rationale for not cleaning expired delegdb nodes
Expired delegation nodes are naturally replaced when the resolver
fetches fresh data, and any remaining stale nodes are reclaimed by
SIEVE eviction under memory pressure.
2026-04-16 11:28:13 +02:00
Colin Vidal
193e01ab20 Remove hiwater/lowater fields from delegdb
The delegdb does not directly use the hiwater and lowater values during
the cleaning flow, so these fields are no longer necessary.
2026-04-16 11:28:13 +02:00
Ondřej Surý
4d772cda3c Reclaim only what the new delegation needs
delegdb_cleanup() was overwriting the caller-supplied 'requested'
value with (hiwater - lowater), so every overmem cleanup tried to
free the full watermark band regardless of how much memory the new
delegation actually needed.  Drop the override so the caller's size
is used: we now walk the SIEVE only until we have reclaimed enough
room for the new node, leaving unrelated entries in place.
2026-04-16 11:28:13 +02:00
Ondřej Surý
876a896f0f Account transient delegsets against the caller's memory context
dns_delegset_fromnsrdataset() used isc_g_mctx for the transient
delegset it builds from a DNS NS rdataset.  That hides delegation
data in the global default context instead of accounting it against
the subsystem that owns it: a resolver fctx, a view, or a query
context.

Take an explicit mctx parameter so callers can direct the allocation
to the right place, and update the three call sites:
- lib/dns/view.c:1189 (dns_view_bestzonecut fallback) uses view->mctx
- lib/dns/resolver.c:7071 (resume_dslookup) uses fctx->mctx
- lib/ns/query.c:8672 (query_delegation_recurse) uses the client
  manager's mctx

Also tighten delegdb cleanup to run inside the same write transaction
as the insert: delegdb_node_prepare() now returns the size of the new
node, and delegdb_cleanup() takes the caller's open qp so that the
overmem reclamation and the insert share one commit instead of doing
two nested write transactions.
2026-04-16 11:28:13 +02:00
Ondřej Surý
9191dc7acb Fix delegation database NOEXACT lookup for top-level names
dns__deleg_lookup() with DNS_DBFIND_NOEXACT is supposed to return
the deepest proper ancestor of the lookup name.  It called
getparentnode() to step up from an exact match, but getparentnode()
only iterated while the chain length was >= 2.  When the chain
contained a single entry (the exact match itself with no ancestor
stored in the trie), the loop did not execute and left the caller
looking at the exact match.  The subsequent isactive() check then
returned success and the function reported the exact match as the
"deepest ancestor", violating NOEXACT semantics.

This was observable as the resolver picking the child-side
delegation for an at-parent type (e.g. a DS query for a TLD), then
sending the query to the child's own nameservers and recovering via
the "chase DS servers" path.

Have getparentnode() set '*node' to NULL when it cannot find an
active proper ancestor, and make dns__deleg_lookup() NULL-check
before returning, matching the canonical NOEXACT implementation in
dns_zt_find().  Update the deleg unit test to expect NOTFOUND for
the top-level-no-parent case.
2026-04-16 11:28:13 +02:00
Ondřej Surý
764625ee5b Use the delegation database in get_dsset()
When the validator needs a DS RRset and the cache does not have it,
get_dsset() falls back to creating a fresh fetch.  Without a hint, the
resolver picks the closest known zone cut for the DS query, and in the
parent-centric resolver that can land on a delegation at the DS owner
name itself (the child side). This can happens when the parent
delegation is expired, or if the zonecut of the parent doesn't match the
labels in the name.

Querying the child for its own DS records yields NODATA from the apex of
the zone, which sends the resolver into the "chase DS servers" recovery
path and costs two extra round trips for a parent delegation we already
had cached in the delegation database.

Look up the parent zone in the delegation database before kicking
off the fetch, and pass any usable delegation to the resolver as a
hint.  When the hint is present, the resolver sends the DS query
straight to the parent's nameservers and the chase path is avoided
entirely.

To support this, create_fetch() now takes optional 'domain' and
'delegset' parameters that are forwarded to dns_resolver_createfetch().
All other call sites pass NULL.
2026-04-16 11:28:13 +02:00
Mark Andrews
b4ba11f151 Fix a bug with template filename reuse
When a zone filename is defined in named.conf which will be
written to by the server - i.e., secondary or dynamically updated
zones - there is a test at configuration time to ensure that the
filename is non-unique.

This test is run before the zone is actually created, so a zone
configured using a template may not have had its filename expanded
yet.  This can cause a configuration to fail because, for example,
multiple zones appear to using the filename "$name.db".

This has been fixed by calling dns_zone_expandzonefile() from
isccfg_check_zoneconf(), to expand the names when checking for
uniqueness.
2026-04-14 21:50:31 -07:00
Mark Andrews
20f8e9eb56 Make zone filename expansion accessible from outside dns_zone
This adds a new API call dns_zone_expandzonefie(), which will enable
named-checkconf to expand filenames the same way the server does in
dns_zone_setfile().
2026-04-14 21:49:59 -07:00
Mark Andrews
9f411c93c4 Remove unnecessary dns_name_free call
When processing a catalog zone member's primaries definition and
there is a TXT record containing an invalid name TSIG key name,
dns_name_free was incorrectly called triggering an assertion.
This has been fixed.
2026-04-15 09:00:26 +10:00
Ondřej Surý
4654796683
Include disptype and transport in dispatch hash key
Move disptype and transport into dispatch_hash() and dispatch_match()
so that the match function is the single source of truth for whether
two TCP dispatches are interchangeable.  This replaces the post-loop
disptype filter in dispatch_gettcp() and makes the disptype field in
struct dispatch_key actually used.
2026-04-14 17:48:24 +02:00
Ondřej Surý
6e78094ebd
Do not reuse shared TCP dispatches for zone transfers
Zone transfers (XFRIN) need a dedicated TCP connection because they
are long-lived and stream the entire zone.
2026-04-14 17:48:23 +02:00
Ondřej Surý
3e364aec2b
Use sequential per-dispatch message IDs for TCP
TCP dispentries no longer use the global QID hash table at all.
Responses are matched by scanning disp->active, and sequential
per-dispatch IDs (bounded by the pipelining limit) are unique
within a single dispatch by construction.  Since TCP delivers
only data we asked for on a specific connection, the per-peer
uniqueness that the global table enforced was never actually
needed for TCP.

DNS_DISPATCHOPT_FIXEDID is plumbed through dns_request_createraw
-> get_dispatch -> dns_dispatch_createtcp so FIXEDID TCP requests
always get a fresh isolated dispatch — the caller-supplied ID
then cannot collide with any other in-flight query either.
2026-04-14 17:48:21 +02:00
Ondřej Surý
385ceabe8f
Limit TCP pipelining per shared dispatch
Cap the number of in-flight queries on a single shared TCP dispatch.
When the limit is reached, the dispatch is removed from the hash
table so subsequent queries get a fresh connection.  The existing
dispatch continues serving its queries until they complete.

This bounds the blast radius of a connection drop: at most N queries
fail simultaneously instead of all queries to that server.

The default limit is 256.  It can be overridden for testing via
'named -T tcppipelining=N'.
2026-04-14 17:48:16 +02:00