Commit graph

16094 commits

Author SHA1 Message Date
Ondřej Surý
6e2ca5e0d7
Remove the negative type logic from qpcache
Previously, when a negative header was stored in the cache, it would be
stored in the dns_typepair_t as .type = 0, .covers = <negative type>.
When searching the cache internally, we would have to look for both
positive and negative typepair and the slabheader .down list could be a
mix of positive and negative types.

Remove the extra representation of the negative type and simply use the
negative attribute on the slabheader.  Other units (namely dns_ncache)
can still insert the (0, type) negative rdatasets into the cache, but
internally, those will be converted into (type, 0) slabheaders, and vice
versa - when binding the rdatasets, the negative (type, 0) slabheader
will be converted to (0, type) rdataset.  Simple DNS_TYPEPAIR() helper
macro was added to simplify converting single rdatatype to typepair
value.

As a side-effect, the search logic in all places can exit early if
there's a negative header for the type we are looking for, f.e. when
searching for the zone cut, we don't have to walk through all the
slabheaders, if there's a stored negative slabheader.
2025-08-15 07:22:52 +02:00
Ondřej Surý
3445362918
Add dns_rdatatype_isnsec() helper function
Replace the checks for both NSEC and NSEC3 with a single helper
function.
2025-08-15 07:22:52 +02:00
Ondřej Surý
59d1326175
Use dns_rdatatype_none more consistently
Use dns_rdatatype_none instead of plain '0' for dns_rdatatype_t and
dns_typepair_t manipulation.  While plain '0' is technically ok, it
doesn't carry the required semantic meaning, and using the named
dns_rdatatype_none constant makes the code more readable.
2025-08-15 07:22:52 +02:00
Ondřej Surý
8837491697
Add strict checks on typepair values in the developer's mode
When in developer's mode, make the DNS_TYPEPAIR_* macros be more
strict on the contents of the 'base' and 'covers', so we can catch
invalid use of the API.
2025-08-15 07:22:52 +02:00
Ondřej Surý
76c027e949
Disallow TYPE0 to be queried or inserted into the database
The RR type 0 is a reserved type for SIG[1] resource record.  It should
not be ever inserted into the database nor queried.  Add a special
handling to bail out quickly with DNS_R_DISALLOWED when inserting and
ISC_R_NOTFOUND when looking up TYPE0.  This is also prerequisite for
stricter checks in the follow-up commit.

1. https://www.rfc-editor.org/rfc/rfc2535#section-4.1.8.1
2025-08-15 07:22:52 +02:00
Ondřej Surý
101b1e5a57
Unify the dns_typepair_t variable naming and usage
The dns_typepair_t and dns_rdatatype_t variables were both named 'type'
in multiple places.  Rename all dns_typepair_t variables to include word
'pair' in the variable name to make sure that the distinction between
the two types is more clear.
2025-08-15 07:22:51 +02:00
Ondřej Surý
c22f156404
Simplify the DNS_R_UNCHANGED handling in dns_resolver unit
Instead of catching the DNS_R_UNCHANGED from dns_db_addrdataset() (via
cache_rrset() and dns_ncache_add()) individually, mask it properly as
soon as possible, by moving the sigrdataset caching logic inside
cache_rrset() and returning ISC_R_SUCCESS from cache_rrset() and
dns_ncache_add() when the database was unchanged.
2025-08-15 06:28:01 +02:00
Ondřej Surý
2b269fd0a4 Always delete the cached results on broken chain
The logic to delete records from the cache was relying on the contents
of the validation answer.  Change the logic to always delete the
contents of the cache on the broken chain result.
2025-08-14 16:08:56 -07:00
Evan Hunt
70e99bb27a result could be set incorrectly in validated()
during a recent refactoring of validated(), a line was
removed, causing 'result' to be left unchanged. this
wasted time continuing to try to validate when a
non-recoverable error had occured, and caused the wrong
reason to be logged in add_bad().
2025-08-14 16:07:54 -07:00
Mark Andrews
841d1647d4 Use DNS_RDATACOMMON_INIT to hide branch differences
Initialization of the common members of rdata type structures varies
across branches.  Standardize it by using the DNS_RDATACOMMON_INIT
macro for all types, so that new types are more likely to use it,
and hence backport more cleanly.
2025-08-15 07:30:30 +10:00
Alessio Podda
a05db4196f Remove unused dns_slabheader_reset argument
As a part of the previous refactor, the db argument of
dns_slabheader_reset is now unused, and can be removed.
2025-08-07 11:39:38 -07:00
Alessio Podda
ae6a34cbda Decouple database and node lifetimes by adding node-specific vtables
All databases in the codebase follow the same structure: a database is
an associative container from DNS names to nodes, and each node is an
associative container from RR types to RR data.

Each database implementation (qpzone, qpcache, sdlz, builtin, dyndb) has
its own corresponding node type (qpznode, qpcnode, etc). However, some
code needs to work with nodes generically regardless of their specific
type - for example, to acquire locks, manage references, or
register/unregister slabs from the heap.

Currently, these generic node operations are implemented as methods in
the database vtable, which creates problematic coupling between database
and node lifetimes. If a node outlives its parent database, the node
destructor will destroy all RR data, and each RR data destructor will
try to unregister from heaps by calling a virtual function from the
database vtable. Since the database was already freed, this causes a
crash.

This commit breaks the coupling by standardizing the layout of all
database nodes, adding a dedicated vtable for node operations, and
moving node-specific methods from the database vtable to the node
vtable.
2025-08-07 11:39:38 -07:00
Alessio Podda
4a8f77e483 Refactor sdlz to use name instead of pointer to name
Right now dns_sdlzlookup has a slight difference from other dbnode
implementations in that it stores a pointer to a dns name instead of
the dns name itself.

This commit harmonizes dns_sdlzlookup with other dbnode
implementations, facilitating further refactoring.
2025-08-07 11:44:18 +02:00
Evan Hunt
5a2938b452
refactor validated()
- there was special-case code in validated() to handle the results
  of a validator started by a CD=1 query. since that never happens,
  the code has been removed.
- the section of code that handles opportunistic caching of
  validated SOA, NS and NSEC data has been split out to a separate
  function.
- the number of goto statements has been reduced considerably.
2025-08-05 12:16:36 +02:00
Evan Hunt
9f674c43cf
split out helper functions
- fctx_setresult() sets the event result in a fetch response
  according to the rdataset being returned - DNS_R_NCACHENXDOMAIN or
  DNS_R_NXRRSET for negative responses, ISC_R_SUCCESS, DNS_R_CNAME,
  or DNS_R_DNAME for positive ones.
- cache_rrset() looks up a node and adds an rdataset.
- delete_rrset() looks up a node and removes rdatasets of a specified
  type and, optionally, the associated signatures.
- gettrust() returns the trust level of an rdataset, or dns_trust_none
  if the rdataset is NULL or not associated.
- getrrsig() scans the rdatasets associated with a name for the
  RRSIG covering a given type.
2025-08-05 12:16:36 +02:00
Evan Hunt
723d167f26
further subdivide caching functions
rctx_cacherdataset() has been split into two functions:
- rctx_cache_secure() starts validation for rdatasets
  that need it; they are then cached by the validator
  completion callback validated()
- rctx_cache_insecure() caches rdatasets immediately; it
  is called when validation is disabled or the data
  to be cached is glue.
2025-08-05 12:16:36 +02:00
Evan Hunt
ed56a91d7d
rename and refactor cache_name() and related functions
- renamed cache_message() to rctx_cachemessage()
- renamed cache_name() to rctx_cachename()
- merged ncache_message() into rctx_ncache()
- split out a new function, rctx_cacherdataset(), which is
  called by rctx_cachename() in a loop to process each of
  the rdatasets associated with the name.
2025-08-05 12:16:36 +02:00
Evan Hunt
83980d76b2
reduce code duplication around findnoqname()
every call to findnoqname() was followed by a call to
dns_rdataset_addnoqname(). we can move that call into
findnoqname() itself, and simplify the calling functions
a bit.
2025-08-05 12:16:36 +02:00
Evan Hunt
b940d40635
set ANSWERSIG flag when processing ANY responses
previously, rctx_answer_any() set the ANSWER flag for all
rdatasets in the answer section; it now sets ANSWERSIG for
RRSIG/SIG rdatasets and ANSWER for everything else.  this
error didn't cause any harm in the current code, but it
could have led to unexpected behavior in the future.
2025-08-05 12:16:36 +02:00
Evan Hunt
c23cc105a1
split out some functionality in cache_name()
there are now separate functions to check the cacheability of
an rdataset or to normalize TTLs, and the code to determine
whether validation is necessary has been simplified.
2025-08-05 12:16:36 +02:00
Evan Hunt
7841de08af
add functions to match rdataset types
- dns_rdataset_issigtype() returns true if the rdataset is
  of type RRSIG and covers a specified type
- dns_rdataset_matchestype() returns true if the rdataset
  is of the specified type *or* the RRSIG covering it.
2025-08-05 12:16:36 +02:00
Evan Hunt
51a4e00d1d
reduce steps for negative caching
whenever ncache_adderesult() was called, some preparatory code
was run first; this has now been moved into a single function
negcache() to reduce code duplication.
2025-08-05 12:16:36 +02:00
Evan Hunt
7371c4882a
change issecuredomain() functions to bool
dns_keytable_issecuredomain() and dns_view_issecuredomain()
previously returned a result code to inform the caller of
unexpected database failures when looking up names in the
keytable and/or NTA table. such failures are not actually
possible. both functions now return a simple bool.

also, dns_view_issecuredomain() now returns false if
view->enablevalidation is false, so the caller no longer
has to check for that.
2025-08-05 12:16:36 +02:00
Evan Hunt
5d56df23f2
split out cookie checks from resquery_response_continue()
split the code section that handles cookie issues into a
separate function for better readablity.
2025-08-05 12:16:36 +02:00
Evan Hunt
5e1df53d05
simplify dns_ncache_add()
there's no longer any reason to have both dns_ncache_add() and
dns_ncache_addoptout().
2025-08-05 12:16:36 +02:00
Ondřej Surý
f23bdc29ef
Document the current default stack sizes on different systems
The default stack sizes varies between operating systems and between
different system libc libraries from 128kB (Alpine Linux with MUSL) to
8M (Linux with glibc).  Document the different values used to justify
the value of THREAD_MINSTACKSIZE (currently set to 1MB).
2025-08-05 10:46:09 +02:00
Ondřej Surý
96dad96ae5
Add support for setting thread stack size
When running the isc_quota unit test with less than usual amount of
RAM (e.g. in a CI for architectures with 32 bits of address space),
the pthread_create() function fails with the "Resource temporarily
unavailable (11):" error code.

Add functions to get and set the thread stack size (if requested),
and use these to set the thread stack size to smaller value in the
isc_quota unit test.
2025-08-05 10:46:09 +02:00
Mark Andrews
c47615094e Add support for parsing and displaying DSYNC rdata type 2025-08-05 17:27:44 +10:00
Mark Andrews
6e1311c624 Add support for parsing DSYNC scheme mnemonics
Adds dns_dsyncscheme_fromtext, dns_dsyncscheme_totext and
dns_dsyncscheme_format.  Adds type dns_dsyncscheme_t.
2025-08-05 17:27:44 +10:00
Matthijs Mekking
2f70a0ef12 Add ede for zone with rpz cname override policy
When the zone is configured with a CNAME override policy, also add the
configured EDE code.

When the zone is contains a wildcard CNAME, also add the configured
EDE code.
2025-08-05 08:35:51 +02:00
Ondřej Surý
3a06c24962
Silence "may be truncated" warnings
Use memccpy() instead of strncpy() for safe string manipulation.
2025-08-04 15:38:17 +02:00
Ondřej Surý
f2e107508a
Add rcu_barrier() to isc__log_shutdown()
There is a data race when QP is reclaiming chunks on the call_rcu
threads and it tries to log the number of reclaimed chunks while the
server is shuttingdown.  Workaround this by adding rcu_barrier() before
shuttingdown the global log context.
2025-08-04 11:29:54 +02:00
Ondřej Surý
f7e5c1db38
Change the 'isc_g_mctx' to be always available
This required couple of internal changes to the isc_mem_debugging.

The isc_mem_debugging is now internal to isc_mem unit and there are
three new functions:

1. isc_mem_setdebugging() can change the debugging setting for an
   individual memory context.  This is need for the memory contexts used
   for OpenSSL, libxml and libuv accounting as recording and tracing
   memory is broken there.

2. isc_mem_debugon() / isc_mem_debugoff() can be used to change default
   memory debugging flags as well as debugging flags for isc_g_mctx.

Additionally, the memory debugging is inconsistent across the code-base.
For now, we are keeping the existing flags, but three new environment
variables have been added 'ISC_MEM_DEBUGRECORD', 'ISC_MEM_DEBUGTRACE'
and 'ISC_MEM_DEBUGUSAGE' to set the global debugging flags at any
program using the memory contexts.
2025-08-04 11:29:50 +02:00
Ondřej Surý
74726b3313
Add and use global memory context called isc_g_mctx
Instead of having individual memory contexts scattered across different
files and called different names, add a single memory context called
isc_g_mctx that replaces named_g_mctx and various other global memory
contexts in various utilities and tests.
2025-08-04 11:29:26 +02:00
Mark Andrews
8aa130f253 validator.c:check_signer now clones val->sigrdataset
Spurious validation failures were traced back to check_signer looping
over val->sigrdataset directly.  Cloning val->sigrdataset prevents
check_signer from interacting with callers that are also looping
over val->sigrdataset.
2025-07-31 19:21:32 -07:00
Colin Vidal
7747ac8aed plugin expand path automatically adds extension
If a plugin is configured without the extension,
`ns_plugin_expandpath()` automatically take cares of appending the
suffix to the path. The way it works is by checking if a file exists at
the expanded path. If it doesn't, it assumes the plugin path (or name)
doesn't have the extension and append the extension (which is
platform-specific) to the actual path.
2025-07-28 23:08:04 +02:00
Ondřej Surý
f6aed602f0
Refactor the network manager to be a singleton
There is only a single network manager running on top of the loop
manager (except for tests).  Refactor the network manager to be a
singleton (a single instance) and change the unit tests, so that the
shorter read timeouts apply only to a specific handle, not the whole
extra 'connect_nm' network manager instance.
2025-07-23 22:45:38 +02:00
Ondřej Surý
b8d00e2e18
Change the loopmgr to be singleton
All the applications built on top of the loop manager were required to
create just a single instance of the loop manager.  Refactor the loop
manager to not expose this instance to the callers and keep the loop
manager object internal to the isc_loop compilation unit.

This significantly simplifies a number of data structures and calls to
the isc_loop API.
2025-07-23 22:44:16 +02:00
Ondřej Surý
933dcc18ee Reword the 'shut down hung fetch while resolving' message
The log message 'shut down hung fetch while resolving' may be confusing
because no detection of hung fetches actually takes place, but rather
the timer on the fetch context expires and the resolver gives up.

Change the log message to actually say that instead of the original
cryptic message about hung fetch.
2025-07-23 22:37:56 +02:00
Matthijs Mekking
7774f16ed5 Special case refresh stale ncache data
When refreshing stale ncache data, the qctx->rdataset is NULL and
requires special processing.
2025-07-23 07:18:48 +00:00
Matthijs Mekking
a66b04c8d4 Make serve-stale refresh behave as prefetch
A serve-stale refresh is similar to a prefetch, the only difference
is when it triggers. Where a prefetch is done when an RRset is about
to expire, a serve-stale refresh is done when the RRset is already
stale.

This means that the check for the stale-refresh window needs to
move into query_stale_refresh(). We need to clear the
DNS_DBFIND_STALEENABLED option at the same places as where we clear
DNS_DBFIND_STALETIMEOUT.

Now that serve-stale refresh acts the same as prefetch, there is no
worry that the same rdataset is added to the message twice. This makes
some code obsolete, specifically where we need to clear rdatasets from
the message.
2025-07-23 07:18:48 +00:00
Ondřej Surý
855960ce46
Rename 'free' variable to 'nfree' to not clash with free()
The beauty and horrors of the C - the compiler properly detects variable
shadowing, but you can freely shadow a standard function 'free()' with
variable called 'free'.  And if you reference 'free()' just as 'free'
you get the function pointer which means you can do also pointer
arithmetics, so 'free > 0' is always valid even when you delete the
local variable.

Replace the local variables 'free' with a name that doesn't shadow the
'free()' function to prevent future hard to detect bugs.
2025-07-22 09:32:56 +02:00
Mark Andrews
7de4207cb6 Fix find_coveringnsec in qpcache.c
dns_qp_lookup was returning ISC_R_NOTFOUND rather than DNS_R_PARTIALMATCH
when there wasn't a parent with a NSEC record in the cache.  This was
causing find_coveringnsec to fail rather than returing the covering NSEC.
2025-07-21 17:05:50 +02:00
Alessio Podda
fdbcdcfc06 Remove unused link field from rdatacommon
The field link in rdatacommon is unused. This change should save 16
bytes for each rdata we create.
2025-07-17 12:57:51 +02:00
Michał Kępień
a951ab1872 Update broken reference to dlz_minimal.h
Commit a6cce753e2 missed a spot in
lib/dns/include/dns/clientinfo.h.  Replace the outdated file reference
with the URL used in all similar cases.
2025-07-17 07:17:12 +02:00
Andoni Duarte Pintado
ffee986ae0 Merge tag 'v9.21.10' 2025-07-16 17:16:27 +02:00
Mark Andrews
125a232bfb Digest type GOST is also deprecated 2025-07-15 23:53:57 +10:00
Mark Andrews
cb6903c55e Warn about deprecated DNSKEY and DS algorithms / digest types
DNSKEY algorithms RSASHA1 and RSASHA-NSEC3-SHA1 and DS digest type
SHA1 are deprecated.  Log when these are present in primary zone
files and when generating new DNSKEYs, DS and CDS records.
2025-07-15 23:53:57 +10:00
Matthijs Mekking
c5a14f263f Add namespace to new_qp(c|z)node
Is there a time when new_qp(c|z)node() would not be followed by
assignment of the namespace? No, so let's add the assignment to the
function that creates the node.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
df6763fd2a Rename DNS_DB_NSEC_ constants to DNS_DBNAMESPACE_
Naming is hard exercise.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
a7021a3a51 Rename dns_qp_lookup2 back to dns_qp_lookup
Now that we have to code working, rename 'dns_qp_lookup2' back to
'dns_qp_lookup' and adjust all remaining 'dns_qp_lookup' occurrences
to take a space=0 parameter.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
e052e14b40 Change denial type to enum
For now we only allow DNS_DB_NSEC_* values so it makes sense to change
the type to an enum.

Rename 'denial' to the more intuitive 'space', indicating the namespace
of the keyvalue pair.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
61f8886fc3 Fix the dbiterator to assume only one qp-trie
The dbiterator can take three modes: full, nsec3only and nonsec3.
Previously, in full mode the dbiterator requires special logic to jump
from one qp-trie to the other. Now everything is in one trie, other
special logic is needed.

The qp-trie is now sorted in such a way that all the normal nodes come
first, followed by NSEC nodes, and finally the NSEC3 nodes. NSEC nodes
are empty nodes and need to be skipped when iterating.

We add an additional auxiliary node to the trie, an NSEC origin, so
we can easily find the point in the trie where we need to continue
iterating.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
16a1c5a623 Prepend qpkey with denial byte
In preparation to merge the three qp tries (tree, nsec, nsec3) into
one, add the piece of information into the qpkey. This is the most
significant bit of information, so prepend the denial type to the qpkey.

This means we need to pass on the denial type when constructing the
qpkey from a name, or doing a lookup.

Reuse the the DNS_DB_NSEC_* values. Most qp tries in the code we just
pass on 0 (nta, rpz, zt, etc.), because there is no need for denial of
existence, but for qpzone and qpcache we must pass the right value.

Change the code, so that node->nsec no longer can have the value
DNS_DB_NSEC_HAS_NSEC, instead track this in a new attribute 'havensec'.

Since we use node->nsec to convert names to keys, the value MUST be set
before inserting the node into the qp-trie.

Update the fuzzing and unit tests accordingly. This only adds a few
extra test cases, more are needed.

In the qp_test.c we can remove test code for empty keys as this is
no longer possible.
2025-07-10 13:52:59 +00:00
Petr Špaček
0a5a25729c Remove unused DNS_RDATASET_COUNT
Albeit technically not unused, it was always defined as 0 and thus did
nothing.

Related: #4666
2025-07-10 11:17:19 +02:00
Petr Špaček
ba861f23f2 Remove unused DNS_RDATASET_ORDER
Related: #4666
2025-07-10 11:17:19 +02:00
Petr Špaček
ae600b0a95 Remove unused DNS_RDATASET_FIXED
There was no way to define this in the build system.

Related: #4666
2025-07-10 11:17:19 +02:00
Petr Špaček
750d8a61b6 Convert DNS_RDATASETATTR_ bitfield manipulation to struct of bools
RRset ordering is now an enum inside struct rdataset attributes. This
was done to keep size to of the structure to its original value before
this MR.

I expect zero performance impact but it should be easier to deal with
attributes in debuggers and language servers.
2025-07-10 11:17:19 +02:00
Aram Sargsyan
27e7961479 Log dropped or slipped responses in the query-errors category
As mentioned in the comments block before the changed code block,
the dropped or slipped responses should be logged in the query
category (or rather query-errors category as done in lib/ns/client.c),
so that requests are not silently lost.

Also fix a couple of errors/typos in the code comments.
2025-07-10 08:20:17 +00:00
Alessio Podda
e84704bd55 Improve efficiency of ns_client_t reset
The ns_client_t struct is reset and zero-ed out on every query,
but some fields (query, message, manager) are preserved.

We observe two things:
 - The sendbuf field is going to be overwritten anyway, there's
   no need to zero it out.
 - The fields are copied out when the struct is zero-ed out, and
   then copied back in. For the query field (which is 896 bytes)
   this is very inefficient.

This commit makes the reset more efficient avoiding to unnecessary
zero-ing and copy.
2025-07-10 07:19:47 +02:00
Ondřej Surý
cdeb8d1c14
Use cds_lfht for lock-free hashtables in dns_adb
Replace the read-write locked isc_hashmap with lock-free cds_lfht
hashtable and replace the singular LRU tables for ADB names and entries
with a per-thread LRU tables.  These changes allowed to remove all the
read-write locking on the names and entries tables.
2025-07-09 21:22:48 +02:00
Ondřej Surý
cca4b26d31
Use regular reference counting macro for isc_nm_t structure
Instead of having hand crafted attach/detach/destroy functions, replace
them with the standard ISC_REFCOUNT macro.  This also have advantage
that delayed netmgr detach (from dns_dispatch) now doesn't cause
assertion failure.  This can happen with delayed (call_rcu) shutdown of
dns_adb.
2025-07-09 21:22:48 +02:00
Ondřej Surý
51d7efbfb4
Print the memory context when printing overmem limits
When printing the memory context going into or out of the overmem
condition, also print the memory context name for easier debugging.
2025-07-09 21:22:48 +02:00
Ondřej Surý
7682bc21a9
Rewrite dns_adb LRU to SIEVE
The dns_adb cleaning is little bit muddled as it mixes the "TTL"
based cleaning (.expire_v4 and .expire_v6 for adbname, .expires for
adbentry) with overmem cleaning.

Rewrite the LRU based cleaning to use SIEVE algorithm and to be overmem
cleaning only with a requirement to always cleanup at least 2-times the
size of the newly added entry.
2025-07-09 21:22:47 +02:00
Alessio Podda
25daa047d4 Replace per-zone lock buckets with global buckets
Qpzone employs a locking strategy where rwlocks are grouped into
buckets, and each zone gets 17 buckets.
This strategy is suboptimal in two ways:
 - If named is serving a single zone or a zone is the majority of the
   traffic, this strategy pretty much guarantees contention when using
   more than a dozen threads.
 - If named is serving many small zones, it causes substantial memory
   usage.

This commit switches the locking to a global table initialized at start
time. This should have three effects:
 - Performance should improve in the single zone case, since now we are
   selecting from a bigger pool of locks.
 - Memory consumption should go down significantly in the many zone
   cases.
 - Performance should not degrade substantially in the many zone cases.
   The reason for this is that, while we could have substantially more
   zones than locks, we can query/edit only O(num threads) at the same
   time. So by making the global table much bigger than the expected
   number of threads, we can limit contention.
2025-07-09 15:27:38 +02:00
Alessio Podda
0b1785ec10 Extract the resigning heap into a separate struct
In the current implementation, the resigning heap is part of the zone
database. This leads to a cycle, as the database has a reference to its
nodes, but each node needs a reference to the database.

This MR splits the resigning heap into its own separate struct, in order
to help breaking the cycle.
2025-07-09 12:33:18 +02:00
Alessio Podda
c2a84bb17a Abstract bucket lock selection logic
Recovering the node lock from a pointer to the header and a pointer to
the db is a common operation. This commit abstracts it away into a
function, so that the node lock selection logic may be modified more
easily.
2025-07-09 12:33:18 +02:00
Petr Menšík
d2c6966232 Add few extra WANT_QUERYTRACE logs into resume_qmin
Print optionally a bit more details not passed to event in case
dns_view_findzonecut returns unexpected result. Result would be
visible later in foundevent, but found fname would be lost. Print it
into the log.
2025-07-09 10:13:29 +10:00
Petr Mensik
2fd3da54f9 Handle CNAME and DNAME in resume_min in a special way
When authoritative zone is loaded when query minimization query for the
same zone is already pending, it might receive unexpected result codes.

Normally DNS_R_CNAME would follow to query_cname after processing sent
events, but dns_view_findzonecut does not fill CNAME target into
event->foundevent. Usual lookup via query_lookup would always have that
filled.

Ideally we would restart the query with unmodified search name, if
unexpected change from recursing to local zone cut were detected. Until
dns_view_findzonecut is modified to export zone/cache source of the cut,
at least fail queries which went into unexpected state.
2025-07-09 10:13:29 +10:00
Ondřej Surý
eb0ffa0d5f
When overmem, clean enough memory when adding new ADB names/entries
The purge_stale_names()/purge_stale_entries() is opportunistic even when
we are under memory pressure (overmem).  Split the opportunistic LRU
cleaning and overmem cleaning.  This makes the stale purging much
simpler as we don't have to try that hard and makes the overmem cleaning
always cleanup double the amount of the newly allocated ADB name/entry.
2025-07-08 05:56:19 +02:00
Mark Andrews
9158e63218 Separate out adbname flags that are hashed
There are three adbname flags that are used to identify different
types of adbname lookups when hashing rather than using multiple
hash tables.  Separate these to their own structure element as these
need to be able to be read without locking the adbname structure.
2025-07-06 22:33:27 +10:00
Aram Sargsyan
3d8bd8bbf1 Reset DNS_DBFIND_STALETIMEOUT in query_lookup()
If ns__query_start() is called because of a chained query (e.g.
after encountering a CNAME), a previously set DNS_DBFIND_STALETIMEOUT
flag on the query's 'dboptions' field can cause an assertion
failure if the new query's 'stalefirst' value is not true (e.g. if the
target qname is an authoritative zone for the server). Reset the
DNS_DBFIND_STALETIMEOUT flag in the query_lookup() function before
evaluating the 'stalefirst' value, and make sure to assign a fresh
value to the `stalefirst' flag instead of conditionally assigning it
only if the value is 'true'.
2025-07-03 11:03:34 +02:00
Ondřej Surý
5eec9a2ebb
Change the .inuse member of isc_mem to be per-thread/per-loop
The .inuse member was causing a lot of contention between threads using
the same memory context.  Scather the .inuse and .overmem members of
isc_mem_t structure to be an per-tid array of variables to reduce the
contention as the writes are now independent of each other.

The array uses one tad bit nasty trick, as ISC_TID_UNKNOWN is now -1,
the array has been sized to fit the unknown tid with [-1] index into the
array accomplished with `ctx->stat = &ctx->stat_s[1];`.  It will not win
a beauty contest, but it works seamlessly by just passing `isc_tid()` as
an index into the array.

The caveat here is that gathering the real inuse value requires walking
the whole array for all registered tid values (isc_tid_count()).  The
gather part happens only when statistics are being gathered or when
isc_mem_isovermem() is called.  As the isc_mem_isovermem() call happens
only when new data is being added to cache or ADB, it doesn't happen on
the hottest (read-only) path and according to the measurements, it
doesn't slow down neither the cold cache nor the hot cache latency.
2025-06-30 13:23:17 +02:00
Ondřej Surý
f689dc2297
Don't use ssize_t for storing difference between sizes
As POSIX guarantees only that the type ssize_t shall be capable of
storing values at least in the range [-1, {SSIZE_MAX}], it can't be used
to calculate the difference between two memory sizes.  Change the logic
for junk filling to test whether the new size is larger than old size
and then use size_t as the result will be always positive.
2025-06-30 13:22:39 +02:00
Ondřej Surý
560047307d
Remove .hi_called member of isc_mem_t structure
The .hi_called member was dead structure member and it hasn't been used
since the overmem callback has been removed in commit
14bdd21e0a.
2025-06-30 13:22:39 +02:00
Ondřej Surý
d1427e9fa8
Add and use MALLOCX_ZERO_GET() macro to jemalloc_shim.h
Pull MALLOCX_ZERO_GET() macro to align the usage with the jemalloc
jemalloc/internal/jemalloc_internal_types.h header.
2025-06-30 13:22:39 +02:00
Ondřej Surý
c6828bcf8f
Delete jemalloc arena support from isc_mem
The jemalloc arena in isc_mem was added to solve runaway memory problem
for outgoing TCP connections.  In the end, this was a red herring and
the jemalloc arena code is now unused (via e28266bf).  Remove the
support for jemalloc memory arenas as we can restore this at any time if
we need it ever again, but right now it's just a dead code.
2025-06-30 13:22:39 +02:00
Ondřej Surý
74e5f5c6cf
Fix implicit headers when using isc/overflow.h header
In jemalloc_shim.h, we relied on including <isc/overflow.h> implicitly
instead of explicitly and same was happening inside isc/overflow.h - the
stdbool.h (for bool type) was being included implicitly instead of
explicitly.
2025-06-30 13:22:38 +02:00
Ondřej Surý
dd37fd6a49 Add ISC_TID_MAX with default being 512 threads
The ISC_TID_MAX variable allows other units to declare static arrays
with this as size for per-thread/per-loop variables.
2025-06-28 13:32:12 +02:00
Ondřej Surý
1032681af0 Convert the isc/tid.h to use own signed integer isc_tid_t type
Change the internal type used for isc_tid unit to isc_tid_t to hide the
specific integer type being used for the 'tid'.  Internally, the signed
integer type is being used.  This allows us to have negatively indexed
arrays that works both for threads with assigned tid and the threads
with unassigned tid.  This should be used only in specific situations.
2025-06-28 13:32:12 +02:00
Alessio Podda
ef95806e05 Change QP and qpcache logging from DEBUG(1) to DEBUG(3)
Per pspacek, currently qp and qpcache logs are too verbose and enabled at a
level too low compared to how often the logging is useful.

This commit increases the logging level, while keeping it configurable
via a define.
2025-06-25 14:37:01 +02:00
Alessio Podda
19818aebf7 Use RCU for rad name
The RAD/agent domain is a functionality from RFC 9567 that provides
a suffix for reporting error messages. On every query context reset,
we need to check if a RAD is configured and, if so, copy it.

Since we allow the RAD to be changed by reconfiguring the zone,
this access is currently protected by a mutex, which causes contention.

This commit replaces the mutex with RCU to reduce contention. The
change results in a 3% performance improvement in the 1M delegation
test.
2025-06-25 09:55:02 +02:00
Mark Andrews
422b9118e8 Use clang-format-20 to update formatting 2025-06-25 12:44:22 +10:00
Mark Andrews
3620db5ea6 Preserve brackets in DNS_SLABHEADER_GETATTR macro
We need to turn off clang-format to preserve the brackets as
'attribute' can be an expression and we need it to be evaluated
first.

Similarly we need the entire result to be evaluated independent of
the adjoining code.
2025-06-25 12:44:22 +10:00
Matthijs Mekking
d494698852 Fix spurious missing key files log messages
This happens because old key is purged by one zone view, then the other
is freaking out about it.

Keys that are unused or being purged should not be taken into account
when verifying key files are available.

The keyring is maintained per zone. So in one zone, a key in the
keyring is being purged. The corresponding key file is removed.

The key maintenance is done for the other zone view. The key in that
keyring is not yet set to purge, but its corresponding key file is
removed. This leads to "some keys are missing" log errors.

We should not check the purge variable at this point, but the
current time and purge-keys duration.

This commit fixes this erroneous logic.
2025-06-19 08:13:07 +02:00
Mark Andrews
92393f3c97 Add example PRIVATEDNS algorithm identifiers to DS 2025-06-19 07:15:20 +10:00
Mark Andrews
e687710dc7 Add PRIVATEOIDs for RSASHA256 and RSASHA512
Use the existing RSASHA256 and RSASHA512 implementation to provide
working PRIVATEOID example implementations.  We are using the OID
values normally associated with RSASHA256 (1.2.840.113549.1.1.11)
and RSASHA512 (1.2.840.113549.1.1.13).
2025-06-19 07:15:20 +10:00
Mark Andrews
10d094a289 Future: DS private algorithm support
Add support for proposed DS digest types that encode the private
algorithm identifier at the start of the DS digest as is done for
DNSKEY and RRSIG.  This allows a DS record to identify the specific
DNSSEC algorithm, rather than a set of algorithms, when the algorithm
field is set to PRIVATEDNS or PRIVATEOID.
2025-06-19 07:15:20 +10:00
Mark Andrews
c428af5e7a Support PRIVATEOID/PRIVATEDNS in zone.c
- dns_zone_cdscheck() has been extended to extract the key algorithms
  from DNSKEY data when the CDS algorithm is PRIVATEOID or PRIVATEDNS.

- dns_zone_signwithkey() has been extended to support signing with
  PRIVATEDNS and PRIVATEOID algorithms.  The signing record (type 65534)
  added at the zone apex to indicate the current state of automatic zone
  signing can now contain an additional two-byte field for the DST
  algorithm value, when the DNS secalg value isn't enough information.
2025-06-19 07:15:20 +10:00
Mark Andrews
05c5f79d58 Support PRIVATEOID/PRIVATEDNS in the validator
DS records need to checked against the DNSKEY RRset to find
the private algorithm they correspond to.
2025-06-19 07:00:53 +10:00
Mark Andrews
eb184b864c Support PRIVATEOID/PRIVATEDNS in the resolver
dns_resolver_algorithm_supported() has been extended so in addition to
an algorithm number, it can also take a pointer to an RRSIG signature
field in which key information is encoded.
2025-06-19 07:00:53 +10:00
Mark Andrews
71801ab123 Use DST algorithm values instead of dns_secalg where needed
DST algorithm and DNSSEC algorithm values are not necessarily the same
anymore: if the DNSSEC algorithm value is PRIVATEOID or PRIVATEDNS, then
the DST algorithm will be mapped to something else. The conversion is
now done correctly where necessary.
2025-06-19 07:00:53 +10:00
Mark Andrews
6fe09d85ab Support for DST_ALG_PRIVATEDNS and DST_ALG_PRIVATEOID
The algorithm values PRIVATEDNS and PRIVATEOID are placeholders,
signifying that the actual algorithm identifier is encoded into the
key data. Keys using this mechanism are now supported.

- The algorithm values PRIVATEDNS and PRIVATEOID cannot be used to
  build a key file name; dst_key_buildfilename() will assert if
  they are used.

- The DST key values for private algorithms are higher than 255.
  Since DST_ALG_MAXALG now exceeds 256, algorithm arrays that were
  previously hardcoded to size 256 have been resized.

- New mnemonic/text conversion functions have been added.
  dst_algorithm_{fromtext,totext,format} can handle algorithm
  identifiers encoded in PRIVATEDNS and PRIVATEOID keys, as well
  as the traditional algorithm identifiers. (Note: The existing
  dns_secalg_{fromtext,totext,format} functions are similar, but
  do *not* support PRIVATEDNS and PRIVATEOID. In most cases, the
  new functions have taken the place of the old ones, but in a few
  cases the old version is still appropriate.)

- dns_private{oid,dns}_{fromtext,totext,format} converts between
  DST algorithm values and the mnemonic strings for algorithms
  implemented using PRIVATEDNS or PRIVATEOID. (E.g., "RSASHA256OID").

- dst_algorithm_tosecalg() returns the DNSSEC algorithm identifier
  that applies for a given DST algorithm.  For PRIVATEDNS- or
  PRIVATEOID- based algorithms, the result will be PRIVATEDNS or
  PRIVATEOID, respectively.

- dst_algorithm_fromprivatedns() and dst_algorithm_fromprivateoid()
  return the DST algorithm identifier for an encoded algorithm in
  wire format, represented as in DNS name or an object identifier,
  respectively.

- dst_algorithm_fromdata() is a front-end for the above; it extracts
  the private algorithm identifier encoded at the begining of a
  block of key or signature data, and returns the matching DST
  algorithm number.

- dst_key_fromdns() and dst_key_frombuffer() now work with keys
  that have PRIVATEDNS and PRIVATEOID algorithm identifiers at the
  beginning.
2025-06-19 07:00:53 +10:00
Mark Andrews
9ab4160be6 Add DS digest type code points SM3 and GOST-2012
Provide mapping between mnemonic and value.
2025-06-19 07:00:53 +10:00
Mark Andrews
cf968a1a58 Add rdata type header files to dns_header_depfiles macro
The header file dns/rdatastruct.h was not being rebuilt when the
rdata type header files where modified.

Removed proforma.c from the list.  It is a starting point for new
types.
2025-06-13 12:49:36 +00:00
Mark Andrews
6c28411c55 Add CO support to dig
Dig now support setting the EDNS CO as flag using "+coflag" /
"+nocoflag" rather than as part of +ednsflags.
2025-06-13 07:50:16 +00:00
Evan Hunt
d586c29069 Remove zone keyopts field
The "keyopts" field of the dns_zone object was added to support
"auto-dnssec"; at that time the "options" field already had most of
its 32 bits in use by other flags, so it made sense to add a new
field.

Since then, "options" has been widened to 64 bits, and "auto-dnssec"
has been obsoleted and removed. Most of the DNS_ZONEKEY flags are no
longer needed. The one that still seems useful (_FULLSIGN) has been
moved into DNS_ZONEOPT and the rest have been removed, along with
"keyopts" and its setter/getter functions.
2025-06-12 18:29:29 -07:00
Evan Hunt
1a24dfcddf Clean up CFG_ZONE_DELEGATION
"type delegation-only" has been obsolete for some time
(see #3953) but the zone type flag for it was still defined
in libisccfg. It has now been removed.
2025-06-12 17:46:14 -07:00
Aydın Mercan
5cd6c173ff
replace the build system with meson
Meson is a modern build system that has seen a rise in adoption and some
version of it is available in almost every platform supported.

Compared to automake, meson has the following advantages:

* Meson provides a significant boost to the build and configuration time
  by better exploiting parallelism.

* Meson is subjectively considered to be better in readability.

These merits alone justify experimenting with meson as a way of
improving development time and ergonomics. However, there are some
compromises to ensure the transition goes relatively smooth:

* The system tests currently rely on various files within the source
  directory. Changing this requirement is a non-trivial task that can't
  be currently justified. Currently the last compiled build directory
  writes into the source tree which is in turn used by pytest.

* The minimum version supported has been fixed at 0.61. Increasing this
  value will require choosing a baseline of distributions that can
  package with meson. On the contrary, there will likely be an attempt
  to decrease this value to ensure almost universal support for building
  BIND 9 with meson.
2025-06-11 10:30:12 +03:00
Alessio Podda
4a6d7eb4f3 Try to skip lock on fully lower names
If the name is fully lowercase, we don't need to access the case bitmap
in order to set the case. Therefore, we can check for the FULLYLOWERCASE
flag using only atomic operations, and skip a lock in the hot path,
provided we clear the FULLYLOWERCASE flag before changing the case
bitmap.
2025-06-04 10:48:08 +00:00