Commit graph

43965 commits

Author SHA1 Message Date
Colin Vidal
d7dbfbb011 apply_configuration: leave exclusive mode after viewlist cleanup
When a re-configuration fails, the `apply_configuration` flow jumps to a
cleanup label and, at some point, leaves the exclusive mode and cleans up
the viewlist. This looks fine, as the viewlist is at this point only
locally known (if this is a configuration failure, this is the new view
list; if this is a success, this is the old list which has been swapped
out from the production list during the exclusive mode).

However, the view and zone initialization code enqueues job callbacks,
for instance from `dns_zone_setsigninginterval` (but there are other
cases), which will be called for the new views and zones after the
exclusive mode is over.

Depending on where the configuration fails, those views and zones can be
half-configured; for instance, a view might have an unfrozen resolver.
Hence, leaving the exclusive mode before cleaning up those views and
zones lets the previously enqueued callbacks run immediately, which leads
to this reconfiguration-failure crash stack:

```
isc_assertion_failed
dns_resolver_createfetch
do_keyfetch
isc__async_cb
...
uv_run
loop_thread
thread_body
thread_run
start_thread
...
```

To avoid the problem, the views are now cleaned up before leaving the
exclusive mode (which also cleans up the zones and enqueued callbacks).
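
The ordering issue can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `view_t`, `job_t`, `enqueue_job`, `cleanup_view`, `leave_exclusive` and `fail_reconfig` are hypothetical names invented for the illustration and do not exist in BIND 9. The model counts callbacks that would run against a half-configured view, where the real server hits an assertion failure instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in BIND 9. */
typedef struct view {
	bool frozen;   /* true once the resolver is fully configured */
	bool detached; /* true once the failure path has cleaned it up */
} view_t;

typedef struct job {
	view_t *view; /* view the deferred callback will touch */
} job_t;

enum { MAXJOBS = 8 };
static job_t queue[MAXJOBS];
static size_t njobs = 0;

/* View/zone initialization defers work until exclusive mode is over. */
static void
enqueue_job(view_t *view) {
	assert(njobs < MAXJOBS);
	queue[njobs++].view = view;
}

/* Cleaning up a view also drops the callbacks queued for it. */
static void
cleanup_view(view_t *view) {
	size_t kept = 0;
	for (size_t i = 0; i < njobs; i++) {
		if (queue[i].view != view) {
			queue[kept++] = queue[i];
		}
	}
	njobs = kept;
	view->detached = true;
}

/*
 * Leaving exclusive mode lets every pending callback run. Returns how
 * many of them ran against a view whose resolver was never frozen.
 */
static size_t
leave_exclusive(void) {
	size_t unsafe = 0;
	for (size_t i = 0; i < njobs; i++) {
		if (!queue[i].view->frozen) {
			unsafe++;
		}
	}
	njobs = 0;
	return unsafe;
}

/* Model a failed reconfiguration with either cleanup ordering. */
size_t
fail_reconfig(bool cleanup_first) {
	view_t view = { .frozen = false, .detached = false };
	size_t unsafe;

	njobs = 0;
	enqueue_job(&view); /* configuration failed after this point */

	if (cleanup_first) {
		cleanup_view(&view); /* fixed order */
		unsafe = leave_exclusive();
	} else {
		unsafe = leave_exclusive(); /* buggy order */
		cleanup_view(&view);
	}
	return unsafe;
}
```

In this model, `fail_reconfig(true)` leaves no callback to run against the half-configured view, while `fail_reconfig(false)` lets one through.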

As context, the bug was introduced by !10910, which moved the creation
(not configuration) of the views outside of the exclusive mode. This is
a safe move (as at this point, the new views are only known locally by
`apply_configuration`), but the re-ordering was wrong regarding the point
where the exclusive mode was ended (before the change, the exclusive
mode was always ended before the new views were detached).
2025-09-26 14:55:01 +02:00
Matthijs Mekking
6246f9d7cb fix: usr: rndc sign during ZSK rollover will now replace signatures
When performing a ZSK rollover, if the new DNSKEY is omnipresent, the :option:`rndc sign` command now signs the zone completely with the successor key, replacing all zone signatures from the predecessor key with new ones.

Closes #5483

Merge branch '5483-smooth-operator-bug' into 'main'

See merge request isc-projects/bind9!10867
2025-09-26 12:03:46 +00:00
Matthijs Mekking
489752eb1f Update the retire interval after full sign
After a full sign we no longer need to take the sign delay into
account. Update the timing checks in keymgr_transition_time to determine
the start of the interval: either the last change or, if set,
SigPublish/SigDelete. The latter case indicates a full sign was done and
so we no longer have to take the sign delay into account.
2025-09-26 13:24:46 +02:00
Matthijs Mekking
acbf110b18 Test the next key event after full sign
After a full sign we no longer need to take the sign delay into
account.
2025-09-26 12:49:23 +02:00
Matthijs Mekking
844bde0c70 Force full sign to generate new signatures
When introducing the kasp logic, a full sign of the zone did not
generate new signatures for the new active keys during a ZSK rollover.

The introduced kasp logic ensured that the rollover is performed
smoothly, as in the signatures are only replaced if the old signature
is close to expiring (depending on the signatures-refresh option).

Fix by maintaining a fullsign boolean value in the signing structure,
that will ensure the RRsets are signed with the correct key, rather
than a similar good key.

In case of a fullsign, we can also remove signatures from inactive
keys.

Remove the unused dns_zone_signwithkey function.
2025-09-26 12:49:23 +02:00
Matthijs Mekking
008d3d2a9c Test rndc sign updates the signatures
Add a check to the ZSK rollover test case that ensures the zone is
signed with the successor key only, after a 'rndc sign' is commanded.
2025-09-26 12:49:23 +02:00
Mark Andrews
7e0318df85 fix: usr: Use signer name when disabling DNSSEC algorithms
``disable-algorithms`` could cause DNSSEC validation failures when the parent zone was
signed with the algorithms that were being disabled for the child zone.
This has been fixed; ``disable-algorithms`` now works
on a whole-of-zone basis.

If the zone's name is at or below the ``disable-algorithms`` name, the algorithm
is disabled for that zone, using deepest match when there are multiple
``disable-algorithms`` clauses.
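
The matching rule can be sketched as follows. This is an illustration only, not the code from this merge request: `clause_t`, `at_or_below` and `alg_disabled` are hypothetical names, names are handled as absolute text-form strings with a trailing dot, and the sketch assumes that only the deepest matching clause decides which algorithms are disabled.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAXALGS = 4 };

/* Hypothetical model of one disable-algorithms clause. */
typedef struct clause {
	const char *name;            /* absolute name, e.g. "example." */
	unsigned char algs[MAXALGS]; /* algorithm numbers, 0-terminated */
} clause_t;

/*
 * True if 'name' equals 'ancestor' or is a subdomain of it. Both are
 * absolute names in text form; comparison is case-insensitive.
 */
static bool
at_or_below(const char *name, const char *ancestor) {
	size_t nlen = strlen(name), alen = strlen(ancestor);

	if (alen > nlen) {
		return false;
	}
	const char *tail = name + (nlen - alen);
	for (size_t i = 0; i < alen; i++) {
		if (tolower((unsigned char)tail[i]) !=
		    tolower((unsigned char)ancestor[i]))
		{
			return false;
		}
	}
	/*
	 * The suffix must start on a label boundary. A one-character
	 * absolute name is the root ".", which is above every name.
	 */
	return nlen == alen || alen == 1 || tail[-1] == '.';
}

/*
 * Pick the deepest clause at or above 'zone' and report whether it
 * lists 'alg'. All candidates are suffixes of the same zone name, so
 * the longest candidate is also the deepest one.
 */
bool
alg_disabled(const clause_t *clauses, size_t nclauses, const char *zone,
	     unsigned char alg) {
	const clause_t *best = NULL;

	for (size_t i = 0; i < nclauses; i++) {
		if (at_or_below(zone, clauses[i].name) &&
		    (best == NULL ||
		     strlen(clauses[i].name) > strlen(best->name)))
		{
			best = &clauses[i];
		}
	}
	if (best == NULL) {
		return false;
	}
	for (size_t i = 0; i < MAXALGS && best->algs[i] != 0; i++) {
		if (best->algs[i] == alg) {
			return true;
		}
	}
	return false;
}
```

The lookup takes the zone (signer) name rather than the owner name of an individual record, which is what makes the setting apply to a zone as a whole.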

Closes #5165

Merge branch '5165-use-signer-name-when-disabling-dnssec-algorithms' into 'main'

See merge request isc-projects/bind9!10837
2025-09-26 00:13:38 +10:00
Matthijs Mekking
81d3a29e4e Check disable-algorithms with non-zone names
Test that if disable-algorithms is configured on a name that is below
the zonecut, it still validates (z.secure.example).

Test that if disable-algorithms is configured on a name that is above
the zonecut, it is treated as insecure (zonecut.ent.secure.example).
2025-09-25 11:14:27 +10:00
Mark Andrews
28848ab578 Make it clearer that disable-algorithms applies to zone names
2025-09-25 11:14:27 +10:00
Mark Andrews
21934102d3 Check that badalg.secure.example resolves
Previously, badalg.secure.example would return SERVFAIL because the DS
records (from the parent) could not be validated.
2025-09-25 11:14:27 +10:00
Mark Andrews
a0945f6337 Use signer name when disabling DNSSEC algorithms
When disabling algorithms, use the signer name to determine if the
algorithm is disabled or not.  This allows for algorithms to be
cleanly disabled on a zone-level basis.  Previously, by using just the
record's owner name, "disable-algorithms" could impact resolution of
names that were not disabled.  This does now mean that
"disable-algorithms" cannot be used to disable part of a zone anymore.
2025-09-25 11:14:27 +10:00
Colin Vidal
0411142f82 chg: dev: rename cfg_aclconfctx_t variables to aclctx
ACL configuration context variables are inconsistently named as `actx`,
`ac`, or `aclconfctx`, which caused confusion during code reviews. This
commit renames all `cfg_aclconfctx_t` variables to `aclctx`, which is
short, consistent, and unambiguous.

Closes #5530

Merge branch '5530-rename-actx' into 'main'

See merge request isc-projects/bind9!11003
2025-09-24 20:57:38 +02:00
Colin Vidal
36a05c81b4 rename cfg_aclconfctx_t variables to aclctx
ACL configuration context variables are inconsistently named as `actx`,
`ac`, or `aclconfctx`, which caused confusion during code reviews. This
commit renames all `cfg_aclconfctx_t` variables to `aclctx`, which is
short, consistent, and unambiguous.
2025-09-24 20:14:49 +02:00
Matthijs Mekking
23a79b42ea new: usr: Add dnssec-policy keys configuration check to named-checkconf
A new option `-k` is added to `named-checkconf` that allows checking the `dnssec-policy` `keys` configuration against the configured key stores. If the found key files are not in sync with the given `dnssec-policy`, the check will fail.

This is useful to run before migrating to `dnssec-policy`.

Closes #5486

Merge branch '5486-named-checkconf-dnssec-policy-key-directory' into 'main'

See merge request isc-projects/bind9!10907
2025-09-24 15:44:08 +00:00
Matthijs Mekking
b4e1527b56 Cleanup unused constant
The DST_ALGORITHM_FORMATSIZE constant is unused. It could be used in
dst_kasp_key_format, but instead we will use DNS_NAME_FORMATSIZE
because it is used in other places too. Clean up the unused constant.
2025-09-24 17:03:06 +02:00
Matthijs Mekking
dcd49f2ead Change checkconf to include built-in dnssec-policy
The configuration should also take into account the built-in
DNSSEC policies when verifying the keys in the key-directory match the
given policy. Update the code accordingly and add some good and
failure test cases.
2025-09-24 17:03:06 +02:00
Matthijs Mekking
3918a8ca4c Test named-checkconf -k
Test named-checkconf -k option, that checks the dnssec-policy against
the configured keystores.
2025-09-24 17:03:06 +02:00
Matthijs Mekking
9fe520ece9 Implement named-checkconf -k (check keys)
With named-checkconf -k you can check your configuration including
checking the dnssec-policy keys against the configured keystores. If
there is a mismatch in the key files versus the policy, named-checkconf
will fail. This is useful for running before migrating to dnssec-policy.

For logging purposes, introduce a function that writes the identifying
information about a policy key into a string.

Allow a dnssec key to be initialized outside the keymgr code.

Add 'log_errors' to 'cfg_kasp_fromconfig' to avoid duplicate error
logs.
2025-09-24 17:03:06 +02:00
Nicki Křížek
72a640d8b1 chg: ci: Temporarily disable shotgun jobs
There's currently an issue with the shotgun workflow that's being
investigated. Until it's resolved, there's no point in creating the
shotgun jobs as they'll just fail.

Merge branch 'nicki/ci-temporarily-disable-shotgun-jobs' into 'main'

See merge request isc-projects/bind9!11005
2025-09-24 14:28:40 +02:00
Nicki Křížek
2669463b43 Temporarily disable shotgun jobs
There's currently an issue with the shotgun workflow that's being
investigated. Until it's resolved, there's no point in creating the
shotgun jobs as they'll just fail.
2025-09-24 14:28:09 +02:00
Petr Špaček
f0c3e18c59 fix: nil: Reformat strings broken by successive clang-format runs
Merge branch 'marka-re-format-strings' into 'main'

See merge request isc-projects/bind9!11002
2025-09-24 12:14:50 +00:00
Mark Andrews
ccc41c7044 re-split STATIC_ASSERT message
2025-09-24 12:14:28 +00:00
Mark Andrews
a64c350523 re-split log message text
2025-09-24 12:14:28 +00:00
Nicki Křížek
c2161609c6 chg: test: Re-enable delv tests with TSAN
With the loopmgr rewrite in 9.20, the delv issue should no longer happen,
thus the delv tests can be executed under TSAN as well.

Related #4119

Merge branch 'nicki/delv-reenable-under-tsan' into 'main'

See merge request isc-projects/bind9!10996
2025-09-24 13:34:26 +02:00
Nicki Křížek
7e118fdb06 Re-enable delv tests with TSAN
With the loopmgr rewrite in 9.20, the delv issue should no longer happen,
thus the delv tests can be executed under TSAN as well.
2025-09-24 13:34:16 +02:00
Ondřej Surý
b6971fb724 chg: dev: Add option to compile named with static linking and LTO
Statically linking lib{isc,dns,ns,cfg,isccc} and enabling LTO shows over 10% improvement on almost all measurements in perflab. That said, we can't use Meson's option for LTO since it would result in every binary being compiled with LTO and a great increase in compile time.

To work around it, we add a configuration option that enables LTO and static linking only for the `named` binary.

Merge branch 'alessio/meson-lto-v2' into 'main'

See merge request isc-projects/bind9!10761
2025-09-24 13:23:21 +02:00
Alessio Podda
d45a392086 Add named-lto option to meson to build named with LTO
Enabling LTO yields substantial performance gains on both authoritative
and resolver benchmarks.
But since LTO defers many optimization passes to link time, enabling LTO
across the board would cause an increase in compilation time, as passes
that would be run only once would need to be run for each executable.

As a compromise, this commit adds a named-lto build option that
compiles the individual object files with the -ffat-lto-objects option
and then enables LTO only for the named executable. Object files are
reused between lib*.so and the named executable.
2025-09-24 13:19:37 +02:00
Alessio Podda
6e7aec2cb7 Use unique names for probes.d files
Enabling LTO in the subsequent commit requires the file names to be
unique, and having the same probes.d in each of the libraries breaks this
requirement.  Rename probes.d to probes-{isc,dns,ns}.d files and adjust
the includes.
2025-09-24 13:18:13 +02:00
Colin Vidal
cb0807be2b chg: dev: refactor view creation/configuration loops in dedicated functions
Refactor a bit of `apply_configuration` by extracting (into dedicated functions) the logic to build the keystores list and the KASP list, as well as to create the views/zones and configure them. This is the next step after MR !10895 and !10901.

While the code is extracted, some global variables have been changed into function parameters, which gives a clear view of each function's dependencies, typically to know whether it depends on a local configuration object or on a runtime "production" object. The end goal (not in this MR, but later on) is to move as much initialization logic as possible outside of the exclusive mode.

As a first step, the latest commits move the keystores list, KASP list and view/zone creation outside of the exclusive mode. (The view/zone configuration remains in exclusive mode for now, because of a dependency on the runtime "cachelist". This is the target of a future MR.)

For the record: while moving the keystores list, KASP list and view/zone creation doesn't have a significant impact on the time the exclusive mode is held (from my experiment on an instance with 1M small zones), moving `configure_views` did have a _massive_ impact (basically, the time spent in the exclusive mode is then too small to measure). Configuring views outside the exclusive mode needs more work, which will be done in future MRs.

See #4673

Merge branch 'colin/refactor-applyconfig' into 'main'

See merge request isc-projects/bind9!10910
2025-09-24 11:46:38 +02:00
Colin Vidal
17a2cbcbc5 comment about ifs scan twice the first time
Add comment message about why we're scanning interfaces twice during the
initial configuration (FreeBSD compatibility). See #3583
2025-09-24 10:54:50 +02:00
Colin Vidal
3fe239e5cf apply_configuration: log subroutines for tests
In order to have a (minimal) test ensuring that `apply_configuration`
subroutines which can run before the exclusive lock is taken are not
moved back into it, the `APPLY_CONFIGURATION_SUBROUTINE_LOG` macro is added
and used for the few subroutines already extracted from the exclusive mode.
Those expected logs are added in `configloading` system test checks.
2025-09-24 10:54:50 +02:00
Colin Vidal
c225ba17c2 creation of client TLS ctx before exclusive mode
When the server is configured (inside `apply_configuration`), a client
TLS context cache is created and attached to the global server object.
It is then used by the `configure_view` flow (and also during runtime
through the zone manager).

It is now created before the exclusive mode, and the swap of the
previous TLS cache ctx is done at the end of the exclusive mode, if
everything went well.

This allows us (among other follow-up changes) to move the
`configure_views` function outside of the exclusive mode.
2025-09-24 10:54:50 +02:00
Colin Vidal
e1be2be4ef move creation of keystores, kasp list and view outside of exclusive mode
The keystores initialization, the KASP list initialization, as well as
the initialization of the views no longer depend on any data shared by
running "production" objects during re-configuration of the server. This
allows us to move them before the point where the exclusive mode is taken.
2025-09-24 10:54:50 +02:00
Colin Vidal
201f62d9ef cfg_aclconfctx_t object is part of named_server
`named_g_aclconfctx` is a global variable keeping the ACL configuration
context alive (in particular, to dynamically load zones). However, this
object is built once per configuration (early) and is used only inside
the server.c `apply_configuration` flow. (Two exceptions: the shutdown flow,
still in server.c, and the plugin check flow, which doesn't need it, so it's
NULL in that case.)

Instead of leaving this global publicly exposed, it is now part of the
`named_server_t` object. This allows us to clearly see that, when
reconfiguring the server, the new instance of the ACL context is known
only by the newly built object and not currently used by "production"
objects; it will also help to move logic before the exclusive mode is
taken.

The other advantage is that the ACL configuration context can now be
built before the exclusive lock as well.
2025-09-24 10:54:50 +02:00
Colin Vidal
4523852ded apply_configuration: bump config map before exclusive mode
Move the config map building outside of the exclusive mode, as this
is local data only and no runtime object uses it.
2025-09-24 10:54:50 +02:00
Colin Vidal
de11150e47 apply_configuration: add configure_keystores
The keystores list build logic was inlined in apply_configuration; this
commit extracts it into its own function.
2025-09-24 10:54:50 +02:00
Colin Vidal
c97be6a7f5 apply_configuration: add configure_kasplist
The kasplist (dnssec-policy defined in the builtin and global
configuration options) was built inside apply_configuration. This
commit extracts this logic into its own function.

In order to make the view configuration independent of the global
`server` object, the newly built kasplist is now passed as parameter.
(This eventually will help to be able to configure the views outside of
the exclusive mode by limiting its dependency to the global
`server`/`named_g_server`).
2025-09-24 10:54:50 +02:00
Colin Vidal
0fb6c9ae74 apply_configuration: remove builtin_viewlist
When creating/configuring the views, the user-defined views are built and
set into the viewlist, then the builtin views into the builtin_viewlist.
But there is no separate logic applied to those two lists, and they are
immediately merged into viewlist right after. This commit removes this
intermediate list and adds builtin views directly into the main viewlist
instead.
2025-09-24 10:54:50 +02:00
Colin Vidal
36c74c58c1 refactor view creation/config in apply_configuration
In order to help split apply_configuration, the inline loops for view
creation and view configuration, and the bits of logic around them, are
each moved into a dedicated function.
2025-09-24 10:54:50 +02:00
Ondřej Surý
0ac744ee4d chg: dev: Use lock-free hashtable for storing resolver fetch contexts
Replace the locked hashmap with the lock-free hashtable from the RCU
library and protect the fetch contexts against reuse by replacing the
libisc reference counting with urcu_ref, which can soft-fail in situations
where the reference count is already zero.  This allows us to easily
skip re-using the fetch context if it is already in the process of being
destroyed.

Merge branch 'ondrej/use-urcu-lfht-for-resolver-tables' into 'main'

See merge request isc-projects/bind9!10653
2025-09-24 00:08:45 +02:00
Ondřej Surý
0dcf56d5e3 fixup! Use lock-free hashtable for storing resolver fetch contexts
2025-09-24 00:08:21 +02:00
Ondřej Surý
6011fb5484 Use lock-free hashtable for storing resolver fetch contexts
Previously, the fetch contexts were stored inside an rwlocked hashmap
table.  This was one of the most contended places for the resolver,
especially in the cold cache situation.

Replace the locked hashmap with the lock-free hashtable from the RCU
library and protect the fetch contexts against reuse by replacing the
libisc reference counting with urcu_ref, which can soft-fail in situations
where the reference count is already zero.  This allows us to easily
skip re-using the fetch context if it is already in the process of being
destroyed.
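
The "soft-fail" acquisition can be sketched with plain C11 atomics. This is a minimal illustration of the general get-unless-zero technique, not the code from this commit and not liburcu's `urcu_ref` implementation; `fctx_t`, `fctx_tryref` and `fctx_unref` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical fetch context reduced to its reference counter. */
typedef struct fctx {
	atomic_ulong refs;
} fctx_t;

/*
 * Try to take a reference. Fails (returns false) instead of asserting
 * when the count already dropped to zero, i.e. when the object is in
 * the process of being destroyed and must not be re-used.
 */
bool
fctx_tryref(fctx_t *fctx) {
	unsigned long old =
		atomic_load_explicit(&fctx->refs, memory_order_relaxed);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit(
			    &fctx->refs, &old, old + 1,
			    memory_order_acquire, memory_order_relaxed))
		{
			return true;
		}
	}
	return false;
}

/* Drop a reference; returns true when the caller released the last one. */
bool
fctx_unref(fctx_t *fctx) {
	unsigned long old = atomic_fetch_sub_explicit(&fctx->refs, 1,
						      memory_order_acq_rel);
	assert(old > 0);
	return old == 1;
}
```

A lookup in a lock-free table can still find an entry whose last reference was just dropped; with a failing `fctx_tryref` the caller can treat that entry as absent and create a fresh one instead.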
2025-09-24 00:08:21 +02:00
Ondřej Surý
a20c8fe74b chg: dev: Add a circular reference between slabtops for type and RRSIG(type)
Previously, the slabtops for "type" and its signature were only loosely
coupled and the headers could expire at different times (both TTL and LRU
based expiry).  Add a .related member to the slabtop that allows us to
expire the headers in both related slabtops and also optimize the lookups
because now both slabtops are looked up at the same time.

Closes #3396

Merge branch '3396-bind-rrsigs-to-records' into 'main'

See merge request isc-projects/bind9!10985
2025-09-24 00:07:32 +02:00
Ondřej Surý
3e05958a42 Refactor find headers to make use of related
Change the code of finding headers to make use of the related circular
reference.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:08 +02:00
Ondřej Surý
0f13d7f2fa Expire related headers at the same time
Previously, the slabtops for "type" and its signature were only loosely
coupled and the headers could expire at different times (both TTL and LRU
based expiry).  This commit expires the headers in both related
slabtops.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
0b317abe4e Add a circular reference between slabtops for type and RRSIG(type)
Previously, the slabtops for "type" and its signature were only loosely
coupled.  Add a .related member to the slabtop that allows us to
optimize the lookups because now both slabtops are looked up at the
same time.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
270f78194e Refactor find headers
Another recurring code pattern that can be moved into a separate
function.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
6b0e6cb058 Refactor check header
There was a pattern where first the header was checked for NULL
and then for being stale. In both cases the code path is the same
so it makes sense to put them in a separate function.

Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
133d76c05e Move the size of the expired data into expireheader
Co-authored-by: Matthijs Mekking <matthijs@isc.org>
2025-09-24 00:07:07 +02:00
Ondřej Surý
7443ff330c chg: dev: Convert slabtop and slabheader to use the cds list
This is the first MR in a series that aims to reduce the node locking
by replacing the singly-linked list of slabtop(s) and slabheader(s)
with a CDS linked list.  This commit doesn't do anything else beyond
replacing the .next and .down links with the cds_list_head.  The RCU
semantics will be added later.

Merge branch 'ondrej/use-rcu-list-for-slabtop' into 'main'

See merge request isc-projects/bind9!10944
2025-09-24 00:06:35 +02:00