Commit graph

15859 commits

Author SHA1 Message Date
Mark Andrews
9428e32b13 Add an option to disable ZONEVERSION responses
The option provide-zoneversion controls whether ZONEVERSION is
returned.  This applies to primary, secondary and mirror zones.
2025-03-24 22:16:09 +00:00
Mark Andrews
a4f5c1d5f3 Add option request-zoneversion
This can be set at the option, view and server levels and causes
named to add an EDNS ZONEVERSION option to requests.  Replies are
logged to the 'zoneversion' category.
2025-03-24 22:16:09 +00:00
Mark Andrews
e9a87f0389 Return EDNS ZONEVERSION if requested
If there was an EDNS ZONEVERSION option in the DNS request and the
answer was from a zone, return the zone's serial and number of
labels excluding the root label with the type set to 0 (ZONE-SERIAL).
2025-03-24 22:16:09 +00:00
Mark Andrews
ebe751f88c Add dns_zone_getzoneversion
Returns the EDNS ZONEVERSION for the zone.  Return the database
specific version otherwise return a type 0 version (serial).
2025-03-24 22:16:09 +00:00
Mark Andrews
5c126a7e66 Add dns_db_getzoneversion
Provides a database method to return a database specific EDNS
ZONEVERSION option.  The default EDNS ZONEVERSION is serial.
2025-03-24 22:16:09 +00:00
Mark Andrews
998b93acca Add EDNS ZONEVERSION option counter 2025-03-24 22:16:09 +00:00
Mark Andrews
ec3cbc5468 Extend message code to display ZONEVERSION 2025-03-24 22:16:09 +00:00
Mark Andrews
b2c2b6755e Check EDNS ZONEVERSION when parsing OPT record 2025-03-24 22:16:09 +00:00
Mark Andrews
8e7229f641 Fix gaining adbname reference
Call dns_adbname_ref before calling dns_resolver_createfetch to
ensure adbname->name remains stable for the life of the fetch.
2025-03-20 23:25:29 +00:00
Evan Hunt
2e6107008d optimize key ID check when searching for matching keys
when searching a DNSKEY or KEY rrset for the key that matches
a particular algorithm and ID, it's a waste of time to convert
every key into a dst_key object; it's faster to compute the key
ID by checksumming the region, and then only do the full key
conversion once we know we've found the correct key.

this optimization was already in use in the validator, but it's
been refactored for code clarity, and is now also used in query.c
and message.c.
2025-03-20 18:22:58 +00:00
Evan Hunt
341b962665 move dns_zonekey_iszonekey() to dns_dnssec module
dns_zonekey_iszonekey() was the only function defined in the
dns_zonekey module, and was only called from one place. it
makes more sense to group this with dns_dnssec functions.
2025-03-20 18:22:58 +00:00
alessio
e1e10adc3a Switch symtab to use fxhash hashing
This merge request resolves some performance regressions introduced
with the change from isc_symtab_t to isc_hashmap_t.

The key improvements are:

1. Using a faster hash function than both isc_hashmap_t and
   isc_symtab_t. The previous implementation used SipHash, but the
   hashflood resistance properties of SipHash are unneeded for config
   parsing.
2. Shrinking the initial size of the isc_hashmap_t used inside
   isc_symtab_t. Symtab is mainly used for config parsing, and the
   when used that way it will have between 1 and ~50 keys, but the
   previous implementation initialized a map with 128 slots.
   By initializing a smaller map, we speed up mallocs and optimize for
   the typical case of few config keys.
3. Slight optimization of the string matching in the hashmap, so that
   the tail is handled in a single load + comparison, instead of byte
   by byte.
   Of the three improvements, this is the least important.
2025-03-20 11:26:09 +01:00
Matthijs Mekking
3e836a87e6 Update Retired and Removed if we update lifetime
If we are updating the lifetime, and it was not set before, also
set/update the Retired and Removed timing metadata.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
6c6b8796d3 Fix keymgr bug wrt setting the next time
Only set the next time the keymgr should run if the value is non zero.
Otherwise we default back to one hour. This may happen if there is one
or more key with an unlimited lifetime.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
8c9d2eb2bf keymgr: also set DeleteCDS when setting PublishCDS
The keymgr never set the expected timing metadata when CDS/CDNSKEY
records for the corresponding key will be removed from the zone. This
is not troublesome, as key states dictate when this happens, but with
the new pytest we use the timing metadata to determine if the CDS and/or
CDNSKEY for the given key needs to be published.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
63edc4435f Fix wrong usage of safety intervals in keymgr
There are a couple of cases where the safety intervals are added
inappropriately:

1. When setting the PublishCDS/SyncPublish timing metadata, we don't
   need to add the publish-safety value if we are calculating the time
   when the zone is completely signed for the first time. This value
   is for when the DNSKEY has been published and we add a safety
   interval before considering the DNSKEY omnipresent.

2. The retire-safety value should only be added to ZSK rollovers if
   there is an actual rollover happening, similar to adding the sign
   delay.

3. The retire-safety value should only be added to KSK rollovers if
   there is an actual rollover happening. We consider the new DS
   omnipresent a bit later, so that we are forced to keep the old DS
   a bit longer.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
ef671919d5 Fix a small keymgr bug
While converting the kasp system test to pytest, I encountered a small
bug in the keymgr code. We retire keys when there is more than one
key matching a 'keys' line from the dnssec-policy. But if there are
multiple identical 'keys' lines, as is the case for the test zone
'checkds-doubleksk.kasp', we retire one of the two keys that have the
same properties.

Fix this by checking if there are double matches. This is not fool proof
because there may be many keys for a few identical 'keys' lines, but it
is good enough for now. In practice it makes no sense to have a policy
that dictates multiple keys with identical properties.
2025-03-20 10:12:16 +00:00
Mark Andrews
bfbaacc9a0 Use reference counted counters for nfail and nvalidations
The fetch context that held these values could be freed while there
were still active pointers to the memory.  Using a reference counted
pointer avoids this.
2025-03-20 09:12:49 +11:00
Aram Sargsyan
830e548111 Fix the resolvers RTT-ranged responses statistics counters
When a response times out the fctx_cancelquery() function
incorrectly calculates it in the 'dns_resstatscounter_queryrtt5'
counter (i.e. >=1600 ms). To avoid this, the rctx_timedout()
function should make sure that 'rctx->finish' is NULL. And in order
to adjust the RTT values for the timed out server, 'rctx->no_response'
should be true. Update the rctx_timedout() function to make those
changes.
2025-03-18 16:20:59 +00:00
Aram Sargsyan
12e7dfa397 Fix resolver responses statistics counter
The resquery_response() function increases the response counter without
checking if the response was successful. Increase the counter only when
the result indicates success.
2025-03-18 16:20:59 +00:00
Ondřej Surý
0d9f58b745 Remove a kludge to process non-authoritative CNAME response
A BIND 8 server could return a non-authoritative answer when a CNAME is
followed.  This is no longer handled as a valid answer.
2025-03-17 23:23:24 +00:00
Ondřej Surý
05d6542e6d Remove the kludges for records in the bad sections
There were kludges to help process responses from authoritative servers
giving RRs in wrong sections (mentioning BIND 8).  These should just go
away and such responses should not be processed.
2025-03-17 23:23:24 +00:00
Ondřej Surý
ff73d37f69 Small cleanup in dns_adb unit 2025-03-17 23:23:24 +00:00
Aram Sargsyan
807ef8545d Implement -T cookiealwaysvalid
When -T cookiealwaysvalid is passed to named, DNS cookie checks for
the incoming queries always pass, given they are structurally correct.
2025-03-17 10:42:47 +00:00
Mark Andrews
d0a59277fb Add missing locks when returning addresses
Add missing locks in dns_zone_getxfrsource4 et al. Addresses CID
468706, 468708, 468741, 468742, 468785 and 468778.

Cleanup dns_zone_setxfrsource4 et al to now return void.

Remove double copies with dns_zone_getprimaryaddr and dns_zone_getsourceaddr.
2025-03-15 04:51:59 +00:00
Evan Hunt
606d30796e use new dns_rdatatype classification functions
modify code to use dns_rdatatype_ismulti(), dns_rdatatype_issig(),
dns_rdatatype_isaddr(), and dns_rdatatype_isalias() where applicable.
2025-03-15 00:27:54 +00:00
Evan Hunt
37ff0aa9c0 convert rdatatype classification routines to inline
turn the dns_rdatatype_is*() functions into static inline
functions in rdata.h.
2025-03-15 00:27:54 +00:00
Evan Hunt
1c51d44d82 add new functions to classify rdata types
- dns_rdatatype_ismulti() returns true if a given type can have
  multiple answers: ANY, RRSIG, or SIG.
- dns_rdatatype_issig() returns true for a signature: RRSIG or SIG.
- dns_rdatatype_isaddr() returns true for an address: A or AAAA.
- dns_rdatatype_isalias() returns true for an alias: CNAME or DNAME.
2025-03-15 00:27:54 +00:00
Evan Hunt
24eaff7adc qpzone.c:step() could ignore rollbacks
the step() function (used for stepping to the prececessor or
successor of a database node) could overlook a node because
there was an rdataset marked IGNORE because it had been rolled
back, covering an active rdataset under it.
2025-03-14 23:19:17 +00:00
Evan Hunt
9cfe9f5eb7 fix handling of revoked keys
when a key is revoked its key ID changes, due to the inclusion
of the "revoke" flag. a collision between this changed key ID and
that of an unrelated public-only key could cause a crash in
dnssec-signzone.
2025-03-14 22:25:44 +00:00
Mark Andrews
de519cd1c9 Don't leak the original QTYPE to parent zone
When performing QNAME minimization, named now sends an NS
query for the original QNAME, to prevent the parent zone from
receiving the QTYPE.

For example, when looking up example.com/A, we now send NS queries
for both com and example.com before sending the A query to the
servers for example.com.  Previously, an A query for example.com
would have been sent to the servers for com.

Several system tests needed to be adjusted for the new query pattern:

- Some queries in the serve-stale test were sent to the wrong server.
- The synthfromdnssec test could fail due to timing issues; this
  has been addressed by adding a 1-second delay.
- The cookie test could fail due to the a change in the count of
  TSIG records received in the "check that missing COOKIE with a
  valid TSIG signed response does not trigger TCP fallback" test case.
- The GL #4652 regression test case in the chain system test depends
  on a particular query order, which no longer occurs when QNAME
  minimization is active. We now disable qname-minimization
  for that test.
2025-03-14 01:01:26 +00:00
Mark Andrews
496f7963cd Fix handling of ISC_R_TIMEOUT in resume_qmin()
If a timeout occurs when sending a QMIN query, QNAME
minimization should be disabled. This now causes a hard
failure in strict mode, or a fallback to non-minimized queries
in relaxed mode.
2025-03-14 01:01:26 +00:00
Mark Andrews
98fc14dc75 Exempt QNAME minimization queries from fetches-per-zone
The calling fetch has already called fcount_incr() for this zone;
calling it again for a QMIN query results in double counting.

When resuming after a QMIN query is answered, however, we do now
ensure before continuing that the fetches-per-zone limit has not
been exceeded.
2025-03-14 01:01:26 +00:00
Colin Vidal
24ffbdcfea add support for EDE 20 (Not Authoritative)
Extended DNS Error message EDE 20 (Not Authoritative) is now sent when
client request recursion (RD) but the server has recursion disabled.

RFC 8914 mention EDE 20 should also be returned if the client doesn't
have the RD bit set (and recursion is needed) but it doesn't apply for
BIND as BIND would try to resolve from the "deepest" referral in
AUTHORITY section. For example, if the client asks for "www.isc.org/A"
but the server only knows the root domain, it will returns NOERROR but
no answer for "www.isc.og/A", just the list of other servers to ask.
2025-03-13 11:16:01 +01:00
Colin Vidal
334ea1269f add support for EDE 7 and 8
Extended DNS Error messages EDE 7 (expired key) and EDE 8 (validity
period of the key not yet started) are now sent in case of such DNSSEC
validation failures.

Refactor the existing validator extended error APIs in order to make it
easy to have a consisdent extra info (with domain/type) in the various
use case (i.e. when the EDE depends on validator state,
validate_extendederror or when the EDE doesn't depend of any state but
can be called directly in a specific flow).
2025-03-13 09:57:09 +01:00
Ondřej Surý
1e4695510a
Revert "fix: dev: Delete dead nodes when committing a new version"
This reverts commit 67255da4b3, reversing
changes made to 74c9ff384e.
2025-03-05 17:46:54 +01:00
Aram Sargsyan
6cd9e4f67c Fix a bug in get_request_transport_type()
When dns_remote_done() is true, calling dns_remote_curraddr() asserts.
Add a dns_remote_curraddr() check before calling dns_remote_curraddr().
2025-03-05 12:18:11 +00:00
Ondřej Surý
1fae6ccea1
Add the call function tracking to isc_mem API
As we already track __func__, __FILE__, __LINE__ triplet in most places,
add the function tracking to the isc_mem tracking API.
2025-03-05 11:17:17 +01:00
Ondřej Surý
eab9fc22e7
Replace attach/detach in isc_mem with refcount implementation
The isc_mem API is one of the most commonly used APIs that didn't
used ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros.  Replace the
implementation of isc_mem_attach(), isc_mem_detach() and
isc_mem_destroy() with the respective macros.

This also removes the legacy isc_mem_destroy() functionality that would
check whether all references had been detached from the memory context
as it doesn't work reliably when using the call_rcu() API.  Instead of
doing this individually, call isc_mem_checkdestroyed(stderr) from the
isc_mem_destroy() macro to keep the extra check that all contexts were
freed when the program is exiting.
2025-03-05 11:17:17 +01:00
Ondřej Surý
552cf64a70
Replace isc_mem_destroy() with isc_mem_detach()
Remove legacy isc_mem_destroy() and just use isc_mem_detach() as
isc_mem_destroy() doesn't play well with call_rcu API.
2025-03-05 11:17:17 +01:00
Mark Andrews
006c5990ce Implement digest_sig and digest_rrsig for ZONEMD
ZONEMD needs to be able to digest SIG and RRSIG records.  The signer
field can be compressed in SIG so we need to call dns_name_digest().
While for RRSIG the records the signer field is not compressed the
canonical form has the signer field downcased (RFC 4034, 6.2).  This
also implies that compare_rrsig needs to downcase the signer field
during comparison.
2025-03-05 18:05:12 +11:00
Ondřej Surý
303c20caf8
Fix the foundname vs dcname madness in qpcache_findzonecut()
The qpcache_findzonecut() accepts two "foundnames": 'foundname' and
'dcname' could be NULL.  Originally, when 'dcname' would be NULL, the
'dcname' would be set to 'foundname'.  Then code like this was present:

    result = find_deepest_zonecut(&search, node, nodep, foundname,
                                  rdataset,
                                  sigrdataset DNS__DB_FLARG_PASS);
    dns_name_copy(foundname, dcname);

Which basically means that we are copying the .ndata over itself for no
apparent reason.  Cleanup the dcname vs foundname usage.

Co-authored-by: Evan Hunt <each@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
2025-03-05 07:49:46 +01:00
alessio
87776a51ae Cleanup dns_opcode_t
Make dns_opcode_t refer directly to the underlying enum, and use
attributes to ensure the underlying enum is the same size as uint16_t.
2025-03-04 18:35:14 +01:00
Mark Andrews
988dc57c8c Call isc__iterated_hash_initialize
The iterated hash implementation needs to be initialised
on the worker thread.  Also clean it up after we are done.
2025-03-04 12:54:39 +00:00
Artem Boldariev
eaad0aefe6 DoH: Bump the active streams processing limit
This commit bumps the total number of active streams (= the opened
streams for which a request is received, but response is not ready) to
60% of the total streams limit.

The previous limit turned out to be too tight as revealed by
longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.
2025-03-03 11:32:29 +02:00
Artem Boldariev
217a1ebd79 DoH: remove obsolete INSIST() check
The check, while not active by default, is not valid since the commit
8b8f4d500d.

See 'if (total == 0) { ...' below branch to understand why.
2025-03-03 11:32:11 +02:00
Artem Boldariev
c5f7968856 DoH: Flush HTTP write buffer on an outgoing DNS message
Previously, the code would try to avoid sending any data regardless of
what it is unless:

a) The flush limit is reached;
b) There are no sends in flight.

This strategy is used to avoid too numerous send requests with little
amount of data. However, it has been proven to be too aggressive and,
in fact, harms performance in some cases (e.g., on longer (≥1h) runs
of "stress:long:rpz:doh+udp:linux:*").

Now, additionally to the listed cases, we also:

c) Flush the buffer and perform a send operation when there is an
outgoing DNS message passed to the code (which is indicated by the
presence of a send callback).

That helps improve performance for "stress:long:rpz:doh+udp:linux:*"
tests.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0e1b02868a DoH: Limit the number of delayed IO processing requests
Previously, a function for continuing IO processing on the next UV
tick was introduced (http_do_bio_async()). The intention behind this
function was to ensure that http_do_bio() is eventually called at
least once in the future. However, the current implementation allows
queueing multiple such delayed requests needlessly. There is currently
no need for these excessive requests as http_do_bio() can requeue them
if needed. At the same time, each such request can lead to a memory
allocation, particularly in BIND 9.18.

This commit ensures that the number of enqueued delayed IO processing
requests never exceeds one in order to avoid potentially bombarding IO
threads with the delayed requests needlessly.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0956fb9b9e DoH: Simplify http_do_bio()
This commit significantly simplifies the code flow in the
http_do_bio() function, which is responsible for processing incoming
and outgoing HTTP/2 data. It seems that the way it was structured
before was indirectly caused by the presence of the missing callback
calls bug, fixed in 8b8f4d500d.

The change introduced by this commit is known to remove a bottleneck
and allows reproducible and measurable performance improvement for
long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests.

Additionally, it fixes a similar issue with potentially missing send
callback calls processing and hardens the code against use-after-free
errors related to the session object (they can potentially occur).
2025-03-03 11:32:11 +02:00
Ondřej Surý
ce7879c924
Remove STATIC_ASSERT variants in favor of the C11 variant
Previously, a gcc < 4.6 shim for _Static_assert() was included.  Such an
old compiler is not supported now anyway, so the macro variant has been
removed in favor of a single definition using _Static_assert().
2025-03-01 07:33:53 +01:00