Commit graph

43937 commits

Author SHA1 Message Date
Colin Vidal
cb0807be2b chg: dev: refactor view creation/configuration loops in dedicated functions
Refactor a bit of `apply_configuration` by extracting (into respective dedicated function) the logic to build the keystores list, the KASP list as well as creating the view/zones and configuring those. This is the next step of MR !10895 and !10901

While the code is extracted, some global variables has been changed into a function parameters which enable to have a clear view of the dependency of the function, typically, to know if it depends on local configuration object or runtime "production" object. The end goal (not in this MR, but later on) is to move as much as possible initialization logic outside of the exclusive mode. 

As a first step, latest commits move the keystores list, KASP list and view/zones creation outside of the exclusive mode. (The view/zone configuration remain in exclusive mode for now, because of a dependency to the runtime "cachelist". This is the target of a next MR.

For the record; while moving the keystores list, KASP list and view/zone creation doesn't have a significant impact on the time the exclusive mode is taken (from my experiment on a 1M small zones instance); moving `configure_views` did have a _massive_ impact (basically, the time spend in the exclusive mode is then non calculable). Configuring views outside the exclusive mode needs more work, which will be done in future MRs.

See #4673

Merge branch 'colin/refactor-applyconfig' into 'main'

See merge request isc-projects/bind9!10910
2025-09-24 11:46:38 +02:00
Colin Vidal
17a2cbcbc5 comment about ifs scan twice the first time
Add comment message about why we're scanning interfaces twice during the
initial configuration (FreeBSD compatibility). See #3583
2025-09-24 10:54:50 +02:00
Colin Vidal
3fe239e5cf apply_configuration: log subroutines for tests
In order to have a (minimal) test ensuring we don't move back
`apply_configuration` subroutines which can be done before the exclusive
lock is taken, `APPLY_CONFIGURATION_SUBROUTINE_LOG` macro is added and
used for the few subroutines already extracted from the exclusive mode.
Those expected logs are added in `configloading` system test checks.
2025-09-24 10:54:50 +02:00
Colin Vidal
c225ba17c2 creation of client TLS ctx before exclusive mode
When the server is configured (inside `apply_configuration`) a client
TLS context cache is created and attached to the global server object.
It is then used by `configure_view` flow (and also during runtime though
the zone manager).

It is now created before the exclusive mode, and the swap of the
previous TLS cache ctx is done at the end of the exclusive mode, if
everything went well.

This allows us (among other follow-up changes) to move the
`configure_views` function outside of the exclusive mode.
2025-09-24 10:54:50 +02:00
Colin Vidal
e1be2be4ef move creation of keystores, kasp list and view outside of exclusive mode
The keystores initialization, the KASP list initialization as well as
the initialization of the view no longer depends of any data shared by
running "production" objects during re-configuration of the server. This
allows us to move those outside (before) the exclusive mode is taken.
2025-09-24 10:54:50 +02:00
Colin Vidal
201f62d9ef cfg_aclconfctx_t object is part of named_server
`named_g_actconfctx` is a global variable holding the ACL configuration
context alive (in particular, to dynamically load zones). However, this
object is build once per configuration (early) and is used only inside
server.c `apply_configuration` flow. (Two exceptions: the shutdown flow,
still in server.c and plugin check flow, which doesn't need it, so it's
NULL in such case).

Instead of leaving this global publicly exposed, it is now part of the
`named_server_t` object. This allows us to clearly see that, when
reconfigureing the server, the new instance of the ACL context is known
only by the newly built object and not currently used by "production"
object; and will help to move move logic before the exclusive mode is
taken.

The other advantage is that the ACL configuration context can now be
built before the exclusive lock as well.
2025-09-24 10:54:50 +02:00
Colin Vidal
4523852ded apply_configuration: bump config map before exclusive mode
Moving the config map building outside of the exclusive mode, and this
is local data only and no runtime object uses it.
2025-09-24 10:54:50 +02:00
Colin Vidal
de11150e47 apply_configuation: add configure_keystores
The keystores list build logic was inlined in apply_configuration, this
commit extracts it into its own function.
2025-09-24 10:54:50 +02:00
Colin Vidal
c97be6a7f5 apply_configuration: add configure_kasplist
The kasplist (dnssec-policy defined in the builtin and global
configuration options) was built inside apply_configuration. This
commit extracts this logic into its separate function.

In order to make the view configuration independent of the global
`server` object, the newly built kasplist is now passed as parameter.
(This eventually will help to be able to configure the views outside of
the exclusive mode by limiting its dependency to the global
`server`/`named_g_server`).
2025-09-24 10:54:50 +02:00
Colin Vidal
0fb6c9ae74 apply_configuration: remove builtin_viewlist
When creating/configuring the view, the user-defined views are built and
set into the viewlist, then builtin-view inside the builtin_viewlist.
But there is no seperate logic applied to those two lists, and they are
immediately merged into viewlist right after. This commit removes this
intermediate list and add builtin-views directly into the main viewlist
instead.
2025-09-24 10:54:50 +02:00
Colin Vidal
36c74c58c1 refactor view creation/config in apply_configuration
In order to help splitting apply_configuration, the inline loops and bit
of logic around it for views creation and configuration, each of those
are now in a dedicatated function.
2025-09-24 10:54:50 +02:00
Ondřej Surý
0ac744ee4d chg: dev: Use lock-free hashtable for storing resolver fetch contexts
Replace the locked hashmap with the lock-free hashtable from the RCU
library and protect the fetch contexts against reuse by replacing the
libisc reference counting with urcu_ref that can soft-fail in situation
where the reference count is already zero.  This allows us to easily
skip re-using the fetch context if it is already in process of being
destroyed.

Merge branch 'ondrej/use-urcu-lfht-for-resolver-tables' into 'main'

See merge request isc-projects/bind9!10653
2025-09-24 00:08:45 +02:00
Ondřej Surý
0dcf56d5e3
fixup! Use lock-free hashtable for storing resolver fetch contexts 2025-09-24 00:08:21 +02:00
Ondřej Surý
6011fb5484
Use lock-free hashtable for storing resolver fetch contexts
Previously, the fetch contexts were stored inside rwlocked hashmap
table.  This was one of the most contended places for the resolver,
especially in the cold cache situation.

Replace the locked hashmap with the lock-free hashtable from the RCU
library and protect the fetch contexts against reuse by replacing the
libisc reference counting with urcu_ref that can soft-fail in situation
where the reference count is already zero.  This allows us to easily
skip re-using the fetch context if it is already in process of being
destroyed.
2025-09-24 00:08:21 +02:00
Ondřej Surý
a20c8fe74b chg: dev: Add a circular reference between slabtops for type and RRSIG(type)
Previously, the slabtops for "type" and its signature was only loosely
coupled and the headers could expire at different time (both TTL and LRU
based expiry).  Add a .related member to the slabtop that allows us to
expire the headers in both related headers and also optimize the lookups
because now both slabtops are looked up at the same time.

Closes #3396

Merge branch '3396-bind-rrsigs-to-records' into 'main'

See merge request isc-projects/bind9!10985
2025-09-24 00:07:32 +02:00
Ondřej Surý
3e05958a42
Refactor find headers to make use of related
Change the code of finding headers to make use of the related circular
reference.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:08 +02:00
Ondřej Surý
0f13d7f2fa
Expire related headers at the same time
Previously, the slabtops for "type" and its signature was only loosely
coupled and the headers could expire at different time (both TTL and LRU
based expiry).  This commit expires the headers in both related
headers.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
0b317abe4e
Add a circular reference between slabtops for type and RRSIG(type)
Previously, the slabtops for "type" and its signature was only loosely
coupled.  Add a .related member to the slabtop that allows us to
optimize the lookups because now both slabtops are looked up at the
same time.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
270f78194e
Refactor find headers
Another recurring code pattern that can be moved into a separate
function.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
6b0e6cb058
Refactor check header
There was a pattern where first the header was checked for NULL
and then for being stale. In both cases the code path is the same
so it makes sense to put them in a separate function.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
133d76c05e
Move the size of the expired data into expireheader
Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
7443ff330c chg: dev: Convert slabtop and slabheader to use the cds list
This is the first MR in series that aims to reduce the node locking
by replacing the single-linked list of slabtop(s) and slabheader(s)
with CDS linked list.  This commit doesn't do anything else beyond
replacing .next and .down links with the cds_list_head.  The RCU
semantics will be added later.

Merge branch 'ondrej/use-rcu-list-for-slabtop' into 'main'

See merge request isc-projects/bind9!10944
2025-09-24 00:06:35 +02:00
Ondřej Surý
28fff0045d
Convert slabheader to use the cds_list
This is the second commit in series that aims to reduce the node locking
by replacing the single-linked list of slabheader(s) with CDS linked list.
This commit doesn't do anything else beyond replacing .next link with
the cds_list_head.  RCU semantics is going to be added in the subsequent
commits.
2025-09-23 23:18:44 +02:00
Ondřej Surý
63389b8ce6
Convert slabtop to use the cds_list
This is the first commit in series that aims to reduce the node locking
by replacing the single-linked list of slabtop(s) with CDS linked list.
This commit doesn't do anything else beyond replacing .next link with
the cds_list_head.  RCU semantics is going to be added in the subsequent
commits.
2025-09-23 11:21:47 +02:00
Ondřej Surý
2924f59cb3 fix: dev: Fix datarace between unlocking fctx lock and shuttingdown fctx
There was a data race where new fetch response could be added to the
fetch context after we unlock the fetch context and before we shut it
down.  This could cause assertion failure when fctx__done() was called
with ISC_R_SUCCESS because there was originally no fetch response, but
new fetch response without associated dataset was added before we had a
chance to shutdown the fetch context.  This manifested in the
validated() callback, where cache_rrset() now returns ISC_R_SUCCESS
instead of DNS_R_UNCHANGED when cache was not changed.  However the data
race was wrong on a general level.

Add new argument to fctx__done() that allows to call it with fctx->lock
already acquired to prevent these data races.

Closes #5507

Merge branch '5507-dont-release-fctx-lock-on-done' into 'main'

See merge request isc-projects/bind9!10961
2025-09-23 11:17:53 +02:00
Ondřej Surý
09762bdc44
Fix datarace between unlocking fctx lock and shuttingdown fctx
There was a data race where new fetch response could be added to the
fetch context after we unlock the fetch context and before we shut it
down.  This could cause assertion failure when fctx__done() was called
with ISC_R_SUCCESS because there was originally no fetch response, but
new fetch response without associated dataset was added before we had a
chance to shutdown the fetch context.  This manifested in the
validated() callback, where cache_rrset() now returns ISC_R_SUCCESS
instead of DNS_R_UNCHANGED when cache was not changed.  However the data
race was wrong on a general level.

When the fctx__done() is called with ISC_R_SUCCESS as result is expects
the fctx->lock to be already acquired to prevent these data races.
2025-09-23 10:29:27 +02:00
Ondřej Surý
1aa9cd3484
Split the fctx_done() into success and failure variants
The split will allow us to call fctx__done() with fctx->lock acquired
when it is called with ISC_R_SUCESS to prevent data races when finishing
the fetch context.
2025-09-19 10:53:29 +02:00
Michal Nowak
bed22f3acb chg: ci: Do not mangle named path in respdiff
Merge branch 'mnowak/respdiff-do-not-mangle-named-path' into 'main'

See merge request isc-projects/bind9!10855
2025-09-18 15:51:14 +02:00
Michal Nowak
287c06b8f3
Do not mangle named path in respdiff 2025-09-18 15:30:43 +02:00
Nicki Křížek
e5a417a8a6 chg: ci: Only run relevant CI jobs based on the changes
Trigger selected CI jobs on MR automatically only if there are related
code changes. Otherwise, offer an option to run the jobs manually in
MRs. For other sources, like schedules, tags etc., execute the jobs as
usual.

Merge branch 'nicki/ci-restrict-rules-changes' into 'main'

See merge request isc-projects/bind9!10987
2025-09-18 15:27:02 +02:00
Nicki Křížek
96974330d5 Run shfmt on util/check-make-install.sh.in 2025-09-18 13:55:00 +02:00
Nicki Křížek
02c58d9baa Only run relevant CI jobs based on the changes
Trigger selected CI jobs on MR automatically only if there are related
code changes. Otherwise, offer an option to run the jobs manually in
MRs. For other sources, like schedules, tags etc., execute the jobs as
usual.
2025-09-18 13:50:33 +02:00
Nicki Křížek
2d690499dd Add .sh extension to shell scripts
Use .sh(.in) file extension consistently for shell scripts
to allow more reliable detection of shell scripts based on their file
extension.
2025-09-18 13:50:33 +02:00
Colin Vidal
33bcff46d3 fix: usr: preserve cache when reload fails and reload the server again
Fixes an issue where failing to reconfigure/reload the server would prevent to preserved the views caches on the subsequent server reconfiguration/reload.

Closes #5523

Merge branch 'colin/fix-cache-revert' into 'main'

See merge request isc-projects/bind9!10984
2025-09-17 17:38:54 +02:00
Colin Vidal
a1703fa35b preserve cache when reload fails
If the server is reloaded, new views are created and preexisting cache
is attached to those _but_ something goes wrong later, the previous
views are restored but the previous cache list is destroyed. This makes
the subsequent reload to drop the existing cache. This fixes it by
avoiding a mutation of the old cache list.
2025-09-17 16:45:51 +02:00
Colin Vidal
714693742e test that cache is preserved on reconfing failure
A named bug scrap the cache on a second reload after an initial reload
failure. Adds a test checking that the cache is preserved between server
reconfiguration/reloads even if it fails at some point (after attempting
to re-use the cache) and the server is re-loaded later.
2025-09-17 16:45:51 +02:00
Ondřej Surý
22803b93e3 chg: dev: Squash the qpcache tree and nsec tries
The dns_qpcache already had all the namespace changes needed to put the
normal data and auxiliary NSEC data into a single tree.  Remove the
extra nsec QP trie and use the single QP trie for all the cache data.

Merge branch 'ondrej/use-qp-namespace-in-cache' into 'main'

See merge request isc-projects/bind9!10975
2025-09-17 15:59:13 +02:00
Ondřej Surý
cdc6950d04
Add more unit tests for dns_qp unit
Add basic unit tests and add missing DbC checks for mandatory
dns_qp_create() arguments.
2025-09-17 15:58:44 +02:00
Ondřej Surý
9e2d5d94bd
Remove dns_dbtree_t and its usage
As we removed the ability to count nodes in the auxiliary trees (because
there are no auxiliary trees), we can also cleanup the API and
associated enum type (dns_dbtree_t).
2025-09-17 15:58:44 +02:00
Ondřej Surý
a3e96f2d49
Squash the qpcache tree and nsec trees
The dns_qpcache already had all the namespace changes needed to put the
normal data and auxiliary NSEC data into a single tree.  Remove the
extra nsec QP trie and use the single QP trie for all the cache data.
2025-09-17 15:58:44 +02:00
Ondřej Surý
ec0a5f3a9d
Remove the dbiterator_{last,prev} from the qpcache
The dbiterator_{last,prev} functions are not used in the cache, and the
implementation would get quite complicated when we squash the main and
nsec trees together.  It's easier to just not implement these.
2025-09-17 15:58:44 +02:00
Ondřej Surý
70c8054b84
Remove CacheNSECNodes statistics counter
There is no auxiliary NSEC tree, so we can't count the NSEC nodes
separately.  Remove the CacheNSECNodes statistics counter as it would be
always zero.
2025-09-17 15:58:44 +02:00
Petr Špaček
f3032ecd85 chg: test: Improve root zone loading into AsyncServer
Merge branch 'pspacek/test-server-root' into 'main'

See merge request isc-projects/bind9!10981
2025-09-17 13:22:15 +00:00
Petr Špaček
339e5162d6 Add ability to load root zone into AsyncServer
We would prefer if explicit $ORIGIN is used only for root zone and
nothing else, solely to avoid zone files named "..db". For all other
zones the file name should match zone name.
2025-09-17 15:20:22 +02:00
Ondřej Surý
3b6fe27455 fix: nil: Small qpcache and qpzone related fixes
Merge branch 'ondrej/little-fixups-in-qp' into 'main'

See merge request isc-projects/bind9!10970
2025-09-16 18:49:30 +02:00
Ondřej Surý
136fddf538
Use result of first_*_header() calls instead of direct value
Fix places where we got the header by calling first_*_header() function,
but then worked with top->header instead of the result.
2025-09-16 18:47:53 +02:00
Ondřej Surý
fa12ba28ce
Fix up the descriptions in rdataslab.h
There were a lot of outdated comments. They've been updated or removed.
2025-09-16 18:47:53 +02:00
Ondřej Surý
63ed3cd0bd fix: nil: Fix dns_qpmulti_memusage() on empty dns_qpmulti_t instance
The dns_qpmulti_memusage() causes assertion failure when called on
freshly created qpmulti instance because the qp->usage hasn't been
allocated yet.

Merge branch 'ondrej/fix-qpmulti_memusage' into 'main'

See merge request isc-projects/bind9!10977
2025-09-16 16:43:46 +02:00
Ondřej Surý
b2f653b332
Fix dns_qpmulti_memusage() on empty dns_qpmulti_t instance
The dns_qpmulti_memusage() causes assertion failure when called on
freshly created qpmulti instance because the qp->usage hasn't been
allocated yet.
2025-09-16 16:30:15 +02:00
Colin Vidal
722ce92f10 chg: dev: simplify nchildren count in isc_nm_listenudp
Slight simplification of the logic to define .nchildren listening UDP
socket.

Merge branch 'colin/simplify-socket-nchildren-count' into 'main'

See merge request isc-projects/bind9!10978
2025-09-16 15:49:49 +02:00