Commit graph

15907 commits

Author SHA1 Message Date
Mark Andrews
9bb93520f1
Wrong NSEC3 chosen for NO QNAME proof
When we optimised the closest encloser NSEC3 discovery the maxlabels
variable was used in the binary search. The updated value was later
used to add the NO QNAME NSEC3 but that block of code needed the
original value. This resulted in the wrong NSEC3 sometimes being
chosen to perform this role.
2025-05-08 21:48:11 +02:00
Mark Andrews
c0fcb9fd0e Fix the error handling of put_yamlstr calls
The return value was sometimes being ignored when it shouldn't
have been.
2025-04-30 15:39:52 +10:00
Your Name
59086c33e2 Call rcu_barrier earlier in the destructor
If a call_rcu thread is running, there is a possible race condition
where the destructors run before all call_rcu callbacks have finished
running. This can happen, for example, if the call_rcu callback tries to
log something after the logging context has been torn down.

In !10394, we tried to counter this by explicitly creating a call_rcu
thread and shutting it down before running the destructors, but it is
possible for things to "slip" and end up on the default call_rcu thread.

As a quickfix, this commit moves an rcu_barrier() that was in the mem
context destructor earlier, so that it "protects" all libisc
destructors.
2025-04-25 13:13:44 +02:00
Aram Sargsyan
74a8acdc8d Separate the single setter/getter functions for TCP timeouts
Previously all kinds of TCP timeouts shared a single getter and a single
setter function. Give each timeout its own getter/setter functions,
because in the majority of cases only one is required at a time, and it's
not optimal to expand those functions every time a new timeout value
is implemented.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
b9e9b98d55 Use the configured TCP connect timeout in checkds_send_toaddr()
The checkds_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP; however, for TCP, named has configurable
timeout values. Slightly refactor the timeout calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
daede6876b Use the configured TCP connect timeout in notify_send_toaddr()
The notify_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP; however, for TCP, named has configurable
timeout values. Slightly refactor the timeout calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
70ad94257d Implement tcp-primaries-timeout
The new 'tcp-primaries-timeout' configuration option works the same way
as the existing 'tcp-initial-timeout' option, but applies only to the
TCP connections made to the primary servers, so that the timeout value
can be set separately for them. The default is 15 seconds.

Also, while adapting zone.c's code to support the new option, lightly
refactor the way UDP timeouts are calculated by using definitions
instead of hardcoded values.
2025-04-23 17:03:05 +00:00
Aram Sargsyan
e1a415b412 Fix a data race in qpcache_addrdataset()
The 'qpnode->nsec' structure member isn't protected by a lock and
there's a data race between the reading and writing parts in the
qpcache_addrdataset() function. Use a node read lock for accessing
'qpnode->nsec' in qpcache_addrdataset(). Add an additional
'qpnode->nsec != DNS_DB_NSEC_HAS_NSEC' check under a write lock
to be sure that no other competing thread changed it in the time
when the read lock is unlocked and a write lock is not acquired
yet.
2025-04-23 13:02:43 +00:00
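The recheck described in e1a415b412 can be illustrated with a minimal sketch. It assumes POSIX read-write locks and a hypothetical `node_t` with a boolean flag standing in for the `qpnode->nsec` state; none of the names below are BIND's own. The point it shows is that a decision made under the read lock has to be confirmed again once the write lock is held, because another thread may have made the change in between.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cache node; not BIND's qpcache node type. */
typedef struct {
	pthread_rwlock_t lock;
	bool has_nsec; /* stands in for 'nsec == DNS_DB_NSEC_HAS_NSEC' */
	int updates;   /* counts how many times the flag was actually set */
} node_t;

void
node_init(node_t *node) {
	int r = pthread_rwlock_init(&node->lock, NULL);
	assert(r == 0);
	(void)r;
	node->has_nsec = false;
	node->updates = 0;
}

/* Returns true only if this call performed the update. */
bool
node_mark_has_nsec(node_t *node) {
	bool needed;

	/* First look: a read lock is enough to inspect the flag. */
	pthread_rwlock_rdlock(&node->lock);
	needed = !node->has_nsec;
	pthread_rwlock_unlock(&node->lock);
	if (!needed) {
		return false;
	}

	/*
	 * Between the unlock above and the write lock below, a competing
	 * thread may already have set the flag, so check it again.
	 */
	pthread_rwlock_wrlock(&node->lock);
	needed = !node->has_nsec;
	if (needed) {
		node->has_nsec = true;
		node->updates++;
	}
	pthread_rwlock_unlock(&node->lock);
	return needed;
}
```

Calling `node_mark_has_nsec()` twice on the same node performs the update once; the second call returns early under the read lock.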
Aram Sargsyan
412aa881f2 Fix a serve-stale issue with a delegated zone
When 'stale-answer-client-timeout' is 0, named is allowed to return
a stale answer immediately, while also initiating a new query to get
the real answer. This mode is activated in ns__query_start() by setting
the 'qctx->options.stalefirst' option to 'true' before calling the
query_lookup() function, but not when the zone is known to be
authoritative to the server. When the zone is authoritative, and
query_lookup() finds out that the requested name is a delegation,
then before proceeding with the query, named tries to look it up
in the cache first. The problem is that named doesn't consider
enabling 'qctx->options.stalefirst' in this case, and so the
'stale-answer-client-timeout 0' setting doesn't work for those
delegated zones: instead of immediately returning the stale answer
(if it exists), named tries to resolve it.

Fix this issue by enabling 'qctx->options.stalefirst' in the
query_zone_delegation() function just before named looks up the name
in the cache using a new query_lookup() call. Also, if nothing was
found in the cache, don't initiate another query_lookup() from inside
query_notfound(), and let query_notfound() do its work, i.e. it will
call query_delegation() for further processing.
2025-04-23 11:46:16 +00:00
Mark Andrews
5eeb31f0b9 Split EDNS COOKIE YAML into separate parts
Split the YAML display of the EDNS COOKIE option into CLIENT and SERVER
parts.  The STATUS of the EDNS COOKIE in the reply is now a YAML element
rather than a comment.
2025-04-22 09:24:18 +10:00
Mark Andrews
07c28652a3 Fix EDNS TCP-KEEPALIVE option YAML output
There was missing white space between the option name and its value.
2025-04-22 09:24:18 +10:00
Mark Andrews
81334113c3 Fix EDNS LLQ option YAML output
The EDNS LLQ option was not being emitted as valid YAML. Correct
the output to be valid YAML with each field of the LLQ being
individually selectable.
2025-04-22 09:24:18 +10:00
Mark Andrews
27e8732c17 Change the EDNS KEY-TAG YAML output format
When using YAML, print the EDNS KEY-TAG as an array of integers
for easier machine parsing. Check the validity of the YAML output.
2025-04-22 09:24:18 +10:00
Mark Andrews
378bc7cfa6 Use YAML comments for durations rather than parentheses
This will allow the values to be parsed using standard YAML processing
tools, and still provide the value in a human friendly form.
2025-04-22 09:24:18 +10:00
Mark Andrews
68cdc4774c Change the name and YAML format of EDNS UL
The official EDNS option name for "UL" is "UPDATE-LEASE".  We now
emit "UPDATE-LEASE" instead of "UL", when printing messages, but
"UL" has been retained as an alias on the command line.

Update leases consist of 1 or 2 values, LEASE and KEY-LEASE.  These
components are now emitted separately so they can be easily extracted
from YAML output.  Tests have been added to check YAML correctness.
2025-04-22 09:24:18 +10:00
Mark Andrews
280e9b7cf4 Add YAML escaping where needed
When rendering text, such as domain names or the EXTRA-TEXT
field of the EDE option, backslashes and quotation marks must
be escaped to ensure that the emitted message is valid YAML.
2025-04-22 09:24:18 +10:00
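The escaping rule from 280e9b7cf4 can be sketched as a small helper. This is an illustration only: `yaml_escape()` is a hypothetical name, it assumes the text ends up inside a double-quoted YAML scalar, and it handles just the two characters the commit message names (backslash and quotation mark), not control characters.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy 'src' into 'dst', prefixing every backslash and double quote
 * with a backslash. Returns the number of bytes written (excluding the
 * terminating NUL), or -1 if 'dst' is too small.
 *
 * Hypothetical helper for illustration; not the function used in BIND.
 */
int
yaml_escape(const char *src, char *dst, size_t dstlen) {
	size_t used = 0;

	assert(src != NULL && dst != NULL);

	for (; *src != '\0'; src++) {
		size_t need = (*src == '\\' || *src == '"') ? 2 : 1;

		/* Always keep one byte in reserve for the NUL. */
		if (used + need + 1 > dstlen) {
			return -1;
		}
		if (need == 2) {
			dst[used++] = '\\';
		}
		dst[used++] = *src;
	}
	if (used + 1 > dstlen) {
		return -1;
	}
	dst[used] = '\0';
	return (int)used;
}
```

For example, the five-character input `a"b\c` is written out as the seven characters `a\"b\\c`.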
Mark Andrews
e7ef4e41eb Collapse common switch cases when emitting EDNS options
The CHAIN and REPORT-CHANNEL EDNS options are both domain names, so they
can be combined.  The CLIENT-TAG and SERVER-TAG EDNS options are both 16
bit integers, so they can be combined.
2025-04-22 09:23:53 +10:00
Ondřej Surý
bf1b8824ac
Disable own memory context for libxml2 on macOS 15.4 Sequoia
The custom allocation API for libxml2 is deprecated starting in macOS
Sequoia 15.4, iOS 18.4, tvOS 18.4, and visionOS 2.4.

Disable the memory function override for libxml2 when
LIBXML_HAS_DEPRECATED_MEMORY_ALLOCATION_FUNCTIONS is defined as Apple
broke the system-wide libxml2 starting with macOS Sequoia 15.4.
2025-04-18 20:16:13 +02:00
Nicki Křížek
c5707cb75a Merge tag 'v9.21.7' 2025-04-16 15:23:14 +02:00
Ondřej Surý
30d4939382
Move the call_rcu_thread explicit create and shutdown to isc_loop
When isc__thread_initialize() is called from a library constructor, it
could be called before we fork the main process.  This happens with
named, and then we have the call_rcu_thread attached to the pre-fork
process and not the post-fork process, which means that the initial
process will never shut down, because there's no one to tell it to.

Move the isc__thread_initialize() and isc__thread_shutdown() to the
isc_loop unit where we call it before creating the extra thread and
after joining all the extra threads respectively.
2025-04-16 12:30:14 +02:00
Ondřej Surý
6ed821beb4
Reduce QPDB_VIRTUAL to 10 seconds
The *DB_VIRTUAL value was introduced to allow the clients (presumably
ns_clients) that have been running for some time to access the cached
data that was valid at the time of their inception.  The default value
of 5 minutes is way longer than the longevity of the ns_client object, as
the resolver will give up after 2 minutes.

Reduce the value to 10 seconds to honour the original intent more
closely, but still allow some leeway for clients that started some
time in the past.

Our measurements show that even setting this value to 0 has no
statistically significant effect, thus the value of 10 seconds should be
on the safe side.
2025-04-16 11:21:38 +02:00
Mark Andrews
0d9cab1555 Process NSID and DNS COOKIE options when returning BADVERS
This will help identify the broken server if we happen to break
EDNS version negotiation.  It will also help protect the client
from spoofed BADVERSION responses.
2025-04-15 02:38:37 +00:00
Mark Andrews
ca7355b7d0 Fix OID check for PRIVATEOID keys and signatures
We were failing to account for the length byte before the OID.
See RFC 4034.

   Algorithm number 254 is reserved for private use and will never be
   assigned to a specific algorithm.  The public key area in the DNSKEY
   RR and the signature area in the RRSIG RR begin with an unsigned
   length byte followed by a BER encoded Object Identifier (ISO OID) of
   that length.  The OID indicates the private algorithm in use, and the
   remainder of the area is whatever is required by that algorithm.
   Entities should only use OIDs they control to designate their private
   algorithms.
2025-04-03 23:00:16 +11:00
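The layout quoted from RFC 4034 in ca7355b7d0 (a length byte, then the OID, then algorithm-specific data) can be checked with a sketch like the following. `privateoid_matches()` is a hypothetical helper, and the expected OID is treated as an opaque byte string supplied by the caller; this is not the check as implemented in BIND.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * 'data' is the public key area of a DNSKEY or the signature area of
 * an RRSIG using algorithm 254. 'oid' holds the encoded OID the caller
 * expects. Returns true if 'data' starts with a length byte equal to
 * 'oidlen' followed by exactly those OID bytes.
 *
 * Hypothetical helper for illustration only.
 */
bool
privateoid_matches(const unsigned char *data, size_t datalen,
		   const unsigned char *oid, size_t oidlen) {
	size_t len;

	assert(data != NULL && oid != NULL);

	if (datalen < 1) {
		return false; /* no room for the length byte */
	}
	len = data[0];
	if (len != oidlen || datalen - 1 < len) {
		return false; /* wrong OID length or truncated data */
	}
	/* The OID starts at offset 1, after the length byte. */
	return memcmp(data + 1, oid, len) == 0;
}
```

Comparing from offset 0 instead of offset 1 is the kind of mistake the commit message describes: the length byte would be compared against the first OID byte.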
Mark Andrews
90b2f94d9b Don't cache lack of EDNS based on received responses
Caching prevents server upgrades being detected in a timely manner
and it can also prevent DNSSEC responses being requested.
2025-04-03 10:53:35 +02:00
Ondřej Surý
2988ebae21
Don't copy EDE codes if source is same as destination
If the nested DNS validator ends up in the same fetch because of the
loops, the code could be copying the EDE codes from the same source EDE
context as the destination EDE context.  Skip copying the EDE codes if
the source and the destination are the same.
2025-04-02 18:06:52 +02:00
Ondřej Surý
fe48290140
Don't pass edectx from fetch_and_forget
Pass NULL as edectx for the fetch_and_forget() fetches as nobody
is reading the EDE contexts and it can mess up the main client buffer.
2025-04-02 17:38:31 +02:00
Ondřej Surý
d7593196a1
Add static ede context into each validator layer
Instead of passing the edectx from the fetchctx into all subvalidators,
make the ede context ownership explicit for dns_resolver_createfetch()
callers, and copy the ede result codes from the children validators to
the parent when finishing the validation process.
2025-04-02 17:32:50 +02:00
alessio
4017a40b1d Remove zero initialization of large buffers
Profiles show that a high amount of CPU time is spent in memset.
By removing zero initialization of certain large buffers we improve
performance in certain authoritative workloads.
2025-04-02 16:24:31 +02:00
Evan Hunt
ad7f744115 use ISC_LIST_FOREACH in more places
use the ISC_LIST_FOREACH pattern in places where lists had
been iterated using a different pattern from the typical
`for` loop: for example, `while (!ISC_LIST_EMPTY(...))` or
`while ((e = ISC_LIST_HEAD(...)) != NULL)`.
2025-03-31 13:45:14 -07:00
Evan Hunt
522ca7bb54 switch to ISC_LIST_FOREACH everywhere
the pattern `for (x = ISC_LIST_HEAD(...); x != NULL; x = ISC_LIST_NEXT(...))`
has been changed to `ISC_LIST_FOREACH` throughout BIND, except in a few
cases where the change would be excessively complex.

in most cases this was a straightforward change. in some places,
however, the list element variable was referenced after the loop
ended, and the code was refactored to avoid this necessity.

also, because `ISC_LIST_FOREACH` uses typeof(list.head) to declare
the list elements, compilation failures can occur if the list object
has a `const` qualifier.  some `const` qualifiers have been removed
from function parameters to avoid this problem, and where that was not
possible, `UNCONST` was used.
2025-03-31 13:45:10 -07:00
Evan Hunt
5cff8f9017 implicitly declare list elements in ISC_LIST_FOREACH macros
ISC_LIST_FOREACH and related macros now use 'typeof(list.head)' to
declare the list elements automatically; the caller no longer needs
to do so.

ISC_LIST_FOREACH_SAFE also now implicitly declares its own 'next'
pointer, so it only needs three parameters instead of four.
2025-03-31 13:37:47 -07:00
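The idea in 5cff8f9017 of letting the loop macro declare its own element can be sketched as follows. The macros here are hypothetical simplifications, not the definitions in BIND's list header, and they use the `__typeof__` spelling accepted by GCC and Clang. The "safe" variant declares its own `next` pointer, so it takes three parameters and tolerates unlinking the current element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified list macros for illustration only. */
#define LIST(type) struct { type *head, *tail; }
#define LINK(type) struct { type *prev, *next; }

/* The element variable is declared by the macro itself. */
#define LIST_FOREACH(list, elt, link)                               \
	for (__typeof__((list).head) elt = (list).head; elt != NULL; \
	     elt = elt->link.next)

/* Also declares 'elt_next', fetched before the loop body runs. */
#define LIST_FOREACH_SAFE(list, elt, link)                           \
	for (__typeof__((list).head) elt = (list).head,               \
	     elt##_next = (elt != NULL) ? elt->link.next : NULL;      \
	     elt != NULL;                                             \
	     elt = elt##_next,                                        \
	     elt##_next = (elt != NULL) ? elt->link.next : NULL)

typedef struct item item_t;
struct item {
	int value;
	LINK(item_t) link;
};
typedef LIST(item_t) itemlist_t;

void
list_append(itemlist_t *list, item_t *it) {
	it->link.prev = list->tail;
	it->link.next = NULL;
	if (list->tail != NULL) {
		list->tail->link.next = it;
	} else {
		list->head = it;
	}
	list->tail = it;
}

void
list_unlink(itemlist_t *list, item_t *it) {
	if (it->link.prev != NULL) {
		it->link.prev->link.next = it->link.next;
	} else {
		list->head = it->link.next;
	}
	if (it->link.next != NULL) {
		it->link.next->link.prev = it->link.prev;
	} else {
		list->tail = it->link.prev;
	}
	it->link.prev = it->link.next = NULL;
}

int
list_sum(itemlist_t *list) {
	int sum = 0;
	LIST_FOREACH(*list, it, link) { /* 'it' needs no declaration */
		sum += it->value;
	}
	return sum;
}

/* Unlinks every even-valued item while iterating. */
int
list_drop_even(itemlist_t *list) {
	int dropped = 0;
	LIST_FOREACH_SAFE(*list, it, link) {
		if (it->value % 2 == 0) {
			list_unlink(list, it);
			dropped++;
		}
	}
	return dropped;
}
```

Because the element type is taken from `(list).head`, a `const`-qualified list object would make the loop variable `const` too, so it could not be advanced. That matches the compilation failures with `const` qualifiers mentioned in 522ca7bb54.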
Mark Andrews
31968a7534 Remove dead code in dns_message_sectiontotext
Following the merge of !10302 this code to reset the result code
on ISC_R_NOMORE is no longer needed.
2025-03-31 14:37:03 +00:00
Ondřej Surý
c27fce26e6
Drop readline alternatives in favor of libedit
libedit is now ubiquitous and has a license compatible with
MPL 2.0.  Drop readline (GPL 3.0) and editline (obsolete) support
in favor of libedit.
2025-03-31 15:20:40 +02:00
Artem Boldariev
2592e309c7 Dispatch: carefully check if the server name for SNI is a hostname
Previously the code would not check if the string intended to be used
for SNI is a hostname.
2025-03-31 14:23:19 +03:00
Artem Boldariev
1f199ee606 Add isc_tls_valid_sni_hostname()
Add a function that checks if a 'hostname' is not a valid IPv4 or IPv6
address. Returns 'true' if the hostname is likely a domain name, and
'false' if it represents an IP address.
2025-03-31 14:23:19 +03:00
Colin Vidal
4eb2cd364a copy __FILE__ when allocating memory
When allocating memory under -m trace|record, the __FILE__ pointer is
stored (along with other details, such as the line number), so it can be
printed out later in order to figure out in which file an allocation leaked.

However named crashes when called with -m record and using a plugin
leaking memory. The reason is that plugins are unloaded earlier than
when the leaked allocations are dumped (obviously, as it's done as late
as possible). In such circumstances, __FILE__ is dangling because the
dynamically loaded library (the plugin) is not in memory anymore.

Fix the crash by systematically copying the __FILE__ string
instead of copying the pointer. Of course, this makes each allocation
consume a bit more memory (and take longer, as it needs to calculate the
length of __FILE__) but this occurs only under the -m trace|record
debugging flags.

In terms of the unit test, because grepping in C is not fun, and because
the whole "syntax" of the dump output is tested in other tests, it simply
searches for a substring in the whole buffer to make sure the expected
allocations are found.
2025-03-27 10:44:17 +01:00
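The difference between storing the __FILE__ pointer and copying the string, as described in 4eb2cd364a, can be sketched like this. The record type and function names are hypothetical; the sketch only shows that an owned copy stays valid after the original string's storage goes away (as happens when a plugin is unloaded).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocation-tracking record; not BIND's debug structure. */
typedef struct {
	char *file; /* owned copy of the file name, not a borrowed pointer */
	unsigned int line;
	size_t size;
} alloc_record_t;

/* Typically called as record_new(__FILE__, __LINE__, size). */
alloc_record_t *
record_new(const char *file, unsigned int line, size_t size) {
	alloc_record_t *rec;
	size_t len;

	assert(file != NULL);

	rec = malloc(sizeof(*rec));
	if (rec == NULL) {
		return NULL;
	}
	/* This strlen() + copy is the extra cost the commit mentions. */
	len = strlen(file) + 1;
	rec->file = malloc(len);
	if (rec->file == NULL) {
		free(rec);
		return NULL;
	}
	memcpy(rec->file, file, len);
	rec->line = line;
	rec->size = size;
	return rec;
}

void
record_free(alloc_record_t *rec) {
	if (rec != NULL) {
		free(rec->file);
		free(rec);
	}
}
```

Overwriting or releasing the caller's string after `record_new()` returns does not affect `rec->file`, which is what makes a late leak dump safe.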
alessio
2f27d66450 Refactor to use list-like macro for message sections
In the code base it is very common to iterate over all names in a message
section and all rdatasets for each name, but various idioms are used for
iteration.

This commit standardizes them as much as possible to a single idiom,
using the macro MSG_SECTION_FOREACH, similar to the existing
ISC_LIST_FOREACH.
2025-03-27 03:09:46 +01:00
Evan Hunt
3188b1c055 move application of dns64 to a separate function
the code in query_dns64() that applies the dns64 prefixes to
an A rdataset has been moved into the dns_dns64 module, and
dns_dns64_destroy() now unlinks the dns64 object from its
containing list. with these changes, we no longer need the
list-manipulation API calls dns_dns64_next() and
dns_dns64_unlink().
2025-03-26 23:30:38 +00:00
Evan Hunt
db8c11ea0b dns_message_gettemp*() resets objects
callers of dns_message_gettemprdata() and dns_message_getrdatalist()
initialize the objects after retrieving them. this is no longer
necessary.
2025-03-26 23:30:38 +00:00
Ondřej Surý
1233dc8a61 Add isc_sieve unit implementing SIEVE-LRU algorithm
This is the core implementation of the SIEVE algorithm described in the
following paper:

  Zhang, Yazhuo, Juncheng Yang, Yao Yue, Ymir Vigfusson, and K V
  Rashmi. “SIEVE Is Simpler than LRU: An Efficient Turn-Key Eviction
  Algorithm for Web Caches,” n.d.. available online from
  https://junchengyang.com/publication/nsdi24-SIEVE.pdf
2025-03-26 15:36:33 -07:00
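For readers unfamiliar with the algorithm named in 1233dc8a61, the following is a minimal sketch of SIEVE eviction as the cited paper describes it: new entries go to the head of a FIFO queue, a hit only sets a "visited" bit, and on eviction a hand moves from the tail toward the head, clearing visited bits until it finds an unvisited entry to remove. The fixed capacity, integer keys, linear lookup and all names are assumptions made for brevity; this is not the isc_sieve API.

```c
#include <assert.h>
#include <stdbool.h>

#define SIEVE_CAP 3
#define NONE	  (-1)

typedef struct {
	int key;
	bool visited;
	int newer; /* neighbour toward the head, or NONE */
	int older; /* neighbour toward the tail, or NONE */
} entry_t;

typedef struct {
	entry_t e[SIEVE_CAP];
	int head, tail, hand, count;
} sieve_t;

void
sieve_init(sieve_t *s) {
	s->head = s->tail = s->hand = NONE;
	s->count = 0;
}

static int
sieve_find(const sieve_t *s, int key) {
	for (int i = s->head; i != NONE; i = s->e[i].older) {
		if (s->e[i].key == key) {
			return i;
		}
	}
	return NONE;
}

/* Pick a victim, unlink it, and return its (now free) slot. */
static int
sieve_evict(sieve_t *s) {
	int i = (s->hand != NONE) ? s->hand : s->tail;

	/* Visited entries get a second chance: clear the bit, move on. */
	while (s->e[i].visited) {
		s->e[i].visited = false;
		i = (s->e[i].newer != NONE) ? s->e[i].newer : s->tail;
	}
	s->hand = s->e[i].newer; /* resume here on the next eviction */

	if (s->e[i].newer != NONE) {
		s->e[s->e[i].newer].older = s->e[i].older;
	} else {
		s->head = s->e[i].older;
	}
	if (s->e[i].older != NONE) {
		s->e[s->e[i].older].newer = s->e[i].newer;
	} else {
		s->tail = s->e[i].newer;
	}
	s->count--;
	return i;
}

/* Returns true on a hit; on a miss the key is inserted at the head. */
bool
sieve_access(sieve_t *s, int key) {
	int i = sieve_find(s, key);

	if (i != NONE) {
		s->e[i].visited = true; /* no list movement on a hit */
		return true;
	}
	i = (s->count == SIEVE_CAP) ? sieve_evict(s) : s->count;
	s->e[i] = (entry_t){ .key = key, .newer = NONE, .older = s->head };
	if (s->head != NONE) {
		s->e[s->head].newer = i;
	} else {
		s->tail = i;
	}
	s->head = i;
	s->count++;
	return false;
}

bool
sieve_contains(const sieve_t *s, int key) {
	return sieve_find(s, key) != NONE;
}
```

With a capacity of 3, accessing keys 1, 2, 3, then 1 again, then 4 evicts key 2: key 1 is the oldest entry but was visited, so the hand clears its bit and removes the next unvisited entry instead.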
Ondřej Surý
e8a1949566
Remove lock upgrading from the hot path in the cache
In QPcache, there were two places that tried to upgrade the lock.  In
clean_stale_header(), the code would try to upgrade the lock and clean up
the header, and in qpzonode_release(), the tree lock would be optionally
upgraded, so we can clean up the node directly if empty.  These
optimizations are not needed and they have no effect on the performance.
2025-03-25 10:57:19 +01:00
Ondřej Surý
3ef9b09620
Fix invalid cache-line padding for qpcache buckets
The isc_queue_t was missing in the calculation of the required
padding size inside the qpcache bucket structure.
2025-03-25 10:56:21 +01:00
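The kind of padding calculation that 3ef9b09620 corrects can be sketched as below. The member types are hypothetical stand-ins (pointer-sized so that no hidden alignment gaps appear), and a 64-byte cache line is assumed. The point is that every member must appear in the size sum; leaving one out, as happened with the queue member, makes the structure spill past a cache-line multiple.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_SIZE 64 /* assumed cache-line size */

/* Hypothetical stand-ins for the lock and queue types. */
typedef struct {
	void *owner;
} lock_t;

typedef struct {
	void *head;
	void *tail;
} queue_t;

typedef struct {
	lock_t lock;
	queue_t deadnodes;
	/*
	 * Pad up to the next cache-line multiple. Both members are
	 * included in the sum; omitting 'queue_t' here would make the
	 * struct larger than one cache line. Note that this formula
	 * adds a full extra line if the members already fill a line
	 * exactly.
	 */
	unsigned char pad[CACHELINE_SIZE -
			  (sizeof(lock_t) + sizeof(queue_t)) %
				  CACHELINE_SIZE];
} bucket_t;

/* Fails at compile time if the padding calculation is wrong. */
static_assert(sizeof(bucket_t) % CACHELINE_SIZE == 0,
	      "bucket_t must be a multiple of the cache-line size");
```

A compile-time assertion like the one above is a cheap guard against a member being added later without updating the padding expression.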
Aram Sargsyan
fb16080280 Don't call dst_key_free(keyp) on an invalid 'keyp'
After a refactoring in 2e6107008d the
dst_key_free() call is invalid and can cause an assertion. Remove the
dst_key_free() call.
2025-03-25 08:19:45 +00:00
Evan Hunt
5c21576f82 Don't check DNS_KEYFLAG_NOAUTH
All DNSKEY keys are able to authenticate. The DNS_KEYTYPE_NOAUTH
(and DNS_KEYTYPE_NOCONF) flags were defined for the KEY rdata type,
and are not applicable to DNSKEY.

Previously, because the DNSKEY implementation was built on top of
KEY, the NOAUTH flag prevented authentication in DNSKEYs as well.
This has been corrected.
2025-03-25 06:38:25 +00:00
Evan Hunt
fee1ba40df Tidy up keyvalue.h definitions
Use enums for DNS_KEYFLAG_, DNS_KEYTYPE_, DNS_KEYOWNER_, DNS_KEYALG_,
and DNS_KEYPROTO_ values.

Remove values that are never used.

Eliminate the obsolete DNS_KEYFLAG_SIGNATORYMASK. Instead, add three
more RESERVED bits for the key flag values that it covered but which
were never used.
2025-03-25 06:38:25 +00:00
Matthijs Mekking
2c52aea3dc Remove dns_qpmulti_lockedread declaration
This function was removed in 6217e434b5
but not from the header file.
2025-03-25 05:58:31 +00:00
Evan Hunt
36cf1c6a5b when forwarding, try with CD=0 first
when sending a query to a forwarder for a name within a secure domain,
the first query is now sent with CD=0. when the forwarder itself
is validating, this will give it a chance to detect bogus data and
replace it with valid data before answering. this reduces our chances
of being stuck with data that can't be validated.

if the forwarder returns SERVFAIL to the initial query, the query
will be repeated with CD=1, to allow for the possibility that the
forwarder's validator is faulty or that the bogus answer is covered
by an NTA.

note: previously, CD=1 was only sent when the query name was in a
secure domain. today, validating servers have a trust anchor at the
root by default, so virtually all queries are in a secure domain.
therefore, the code has been simplified.  as long as validation is
enabled, any forward query that receives a SERVFAIL response will be
retried with CD=1.
2025-03-24 17:33:11 -07:00
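The retry order described in 36cf1c6a5b can be summarised in a short sketch. The callback type, the stub forwarder and all names are hypothetical, and the real resolver logic involves far more state. The sketch only shows the sequence: send with CD=0 first, and if validation is enabled and the forwarder answers SERVFAIL, repeat the query with CD=1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { RCODE_NOERROR = 0, RCODE_SERVFAIL = 2 } rcode_t;

/* Hypothetical transport callback: sends one query, returns the rcode. */
typedef rcode_t (*send_fn)(const char *qname, bool cd, void *arg);

rcode_t
forward_query(const char *qname, bool validation_enabled, send_fn send,
	      void *arg) {
	rcode_t rcode;

	assert(qname != NULL && send != NULL);

	/* First attempt: CD=0 lets a validating forwarder do its job. */
	rcode = send(qname, false, arg);

	/*
	 * SERVFAIL may mean the forwarder's validator is faulty or the
	 * bogus answer is covered by an NTA, so retry with CD=1.
	 */
	if (rcode == RCODE_SERVFAIL && validation_enabled) {
		rcode = send(qname, true, arg);
	}
	return rcode;
}

/* Demonstration stub: a forwarder that fails validation unless CD=1. */
typedef struct {
	int calls;
	bool last_cd;
} stub_state_t;

rcode_t
stub_forwarder(const char *qname, bool cd, void *arg) {
	stub_state_t *st = arg;

	(void)qname;
	st->calls++;
	st->last_cd = cd;
	return cd ? RCODE_NOERROR : RCODE_SERVFAIL;
}
```

Against the stub, a query with validation enabled results in two sends (CD=0, then CD=1), while a query with validation disabled is sent once and the SERVFAIL is returned as-is.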
Mark Andrews
78de8afd47 Return raw zone serial for inline zones 2025-03-24 22:16:09 +00:00
Mark Andrews
9428e32b13 Add an option to disable ZONEVERSION responses
The option provide-zoneversion controls whether ZONEVERSION is
returned.  This applies to primary, secondary and mirror zones.
2025-03-24 22:16:09 +00:00
Mark Andrews
a4f5c1d5f3 Add option request-zoneversion
This can be set at the option, view and server levels and causes
named to add an EDNS ZONEVERSION option to requests.  Replies are
logged to the 'zoneversion' category.
2025-03-24 22:16:09 +00:00