Commit graph

14588 commits

Author SHA1 Message Date
Matthijs Mekking
6c76a99c24 Require to be dereferenced arguments are non-NULL
The function 'dns_dnssec_syncupdate()' dereferences the arguments
'keys' and 'rmkeys'. There should be a REQUIRE that those are not
null pointers.
2023-02-28 09:38:31 +01:00
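The precondition pattern described in this commit can be sketched as follows. This is a minimal illustration, not the BIND source: `keylist_t` and `total_keys()` are hypothetical names, and the standard `assert()` stands in for a REQUIRE-style check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a key list; not a BIND type. */
typedef struct keylist {
	size_t count;
} keylist_t;

/*
 * Hypothetical function that dereferences both arguments, so it
 * states that requirement up front instead of crashing later on.
 * assert() stands in for a REQUIRE-style precondition check.
 */
static size_t
total_keys(const keylist_t *keys, const keylist_t *rmkeys) {
	assert(keys != NULL);
	assert(rmkeys != NULL);

	return keys->count + rmkeys->count;
}
```

Checking at function entry turns a later null-pointer crash into an immediate, clearly attributed failure at the call boundary.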
Matthijs Mekking
ea4130d6bd Update syncupdate() function to disable CDNSKEY
Add a new function argument so you can choose whether the CDNSKEY
record should be published or not.
2023-02-28 09:38:17 +01:00
Mark Andrews
59cd228216 Fix dns_kasp_attach / dns_kasp_detach usage
The kasp pointers in dns_zone_t should consistently be changed by
dns_kasp_attach and dns_kasp_detach so the usage is balanced.
2023-02-28 09:38:17 +01:00
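A minimal sketch of balanced attach/detach usage, assuming a simple non-atomic counter. The names `policy_t`, `policy_attach()`, `policy_detach()`, and `owner_setpolicy()` are hypothetical and are not the `dns_kasp` API; the point is only that a stored pointer changes exclusively through the attach/detach pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted policy object; not dns_kasp_t. */
typedef struct policy {
	unsigned int references; /* plain counter: single-threaded sketch */
} policy_t;

static policy_t *
policy_new(void) {
	policy_t *p = malloc(sizeof(*p));
	if (p != NULL) {
		p->references = 1;
	}
	return p;
}

/* Take a new reference and store it in *targetp, which must be empty. */
static void
policy_attach(policy_t *source, policy_t **targetp) {
	assert(source != NULL);
	assert(targetp != NULL && *targetp == NULL);

	source->references++;
	*targetp = source;
}

/* Drop the reference held in *pp and clear it; free on the last one. */
static void
policy_detach(policy_t **pp) {
	assert(pp != NULL && *pp != NULL);

	policy_t *p = *pp;
	*pp = NULL;
	if (--p->references == 0) {
		free(p);
	}
}

/* Replace the policy held in 'slot' without unbalancing the counts. */
static void
owner_setpolicy(policy_t **slot, policy_t *newpolicy) {
	policy_t *incoming = NULL;

	/* Attach first, so this is safe even if *slot == newpolicy. */
	policy_attach(newpolicy, &incoming);
	if (*slot != NULL) {
		policy_detach(slot);
	}
	*slot = incoming;
}
```

Because the slot is never assigned directly, every stored pointer corresponds to exactly one counted reference.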
Matthijs Mekking
28cde5cac2 Suppress duplicate digest types
When adding CDS digest types to the kasp structure, check for
duplicates.
2023-02-28 09:38:17 +01:00
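The duplicate check can be sketched as below. This is an illustration under stated assumptions, not the kasp code: `digest_set_t`, `digest_set_add()`, and the fixed capacity are hypothetical. The values 2 and 4 used in the usage note are the DS digest type numbers for SHA-256 and SHA-384.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DIGESTS 8 /* arbitrary capacity for the sketch */

/* Hypothetical container for configured CDS digest types. */
typedef struct digest_set {
	uint8_t types[MAX_DIGESTS];
	size_t count;
} digest_set_t;

/*
 * Add 'type' unless it is already present. Returns true when the
 * set was changed, false for a duplicate or a full set.
 */
static bool
digest_set_add(digest_set_t *set, uint8_t type) {
	assert(set != NULL);

	for (size_t i = 0; i < set->count; i++) {
		if (set->types[i] == type) {
			return false; /* duplicate: nothing to do */
		}
	}
	if (set->count == MAX_DIGESTS) {
		return false;
	}
	set->types[set->count++] = type;
	return true;
}
```

Adding 2, then 4, then 2 again leaves two entries in the set, so a repeated digest type in the configuration would not produce a second identical CDS record.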
Matthijs Mekking
f2c0c4ed4e Add 'dns_dnssec_syncupdate' argument descriptions 2023-02-28 09:38:17 +01:00
Matthijs Mekking
c0b606885e Make cds-digest-type plural
Allow for configuring multiple CDS records with different digest
types (currently only SHA-256 and SHA-384 are allowed).
2023-02-28 09:38:17 +01:00
Matthijs Mekking
8d20c4e95f Update code to publish CDS with other digest type
Now that we can configure a different digest type, update the code
to honor the configuration. Update 'dns_dnssec_syncupdate' so that
the correct CDS record is published, and also when deleting CDS records,
ensure that all possible CDS records are removed from the zone.
2023-02-28 09:36:50 +01:00
Matthijs Mekking
2742fe656f Add configuration cds-digest-type
Add the 'cds-digest-type' configuration option to 'dnssec-policy'.
2023-02-28 09:36:49 +01:00
Matthijs Mekking
32114afc46 Add functions to set CDS digest-type
BIND dnssec-policy currently only supports CDS digest-type 2. Add
API functions to allow other digest-types.
2023-02-28 09:36:39 +01:00
Matthijs Mekking
dc7818ebcb Fix CDS/CDNSKEY publication logging
The CDS and CDNSKEY "is published" logs were mixed up (CDNSKEY was
logged when CDS was published and vice versa).
2023-02-28 09:36:39 +01:00
Michał Kępień
df3062ed52 Fix DNSRPS code after struct dns_db refactoring
Commits ffa4757c79 and 77e7eac54c inadvertently broke DNSRPS-enabled
builds:

  - the new member of struct dns_db that holds a reference count for the
    database is called 'references', not 'refcount',

  - a syntax error was introduced in the designated initializer for
    'rpsdb_rdataset_methods',

  - rpsdb_destroy() no longer takes a 'dbp' argument.

Address all of the above issues to make DNSRPS-enabled builds work
again.
2023-02-28 09:16:05 +01:00
Tony Finch
0858514ae8 Improve qp-trie compaction in write transactions
In general, it's better to do one thorough compaction when a batch of
work is complete, which is the way that `update` transactions work.
Conversely, `write` transactions are designed so that lots of little
transactions are not too inefficient, but they need explicit
compaction. This changes `dns_qp_compact()` so that it is easier to
compact any time that makes sense, if there isn't a better way to
schedule compaction. And `dns_qpmulti_commit()` only recycles garbage
when there is enough to make it worthwhile.
2023-02-27 13:47:57 +00:00
Tony Finch
a8b29f0365 Improve qp-trie refcount debugging
Add some qp-trie tracing macros which can be enabled by a
developer. These print a message when a leaf is attached or
detached, indicating which part of the qp-trie implementation
did so. The refcount methods must now return the refcount value
so it can be printed by the trace macros.
2023-02-27 13:47:57 +00:00
Tony Finch
7dcde5d2fc Make the qp-trie stats logging quieter
Only log when useful work was done
2023-02-27 13:47:57 +00:00
Tony Finch
4b5ec07bb7 Refactor qp-trie to use QSBR
The first working multi-threaded qp-trie was stuck with an unpleasant
trade-off:

  * Use `isc_rwlock`, which has acceptable write performance, but
    terrible read scalability because the qp-trie made all accesses
    through a single lock.

  * Use `liburcu`, which has great read scalability, but terrible
    write performance, because I was relying on `rcu_synchronize()`
    which is rather slow. And `liburcu` is LGPL.

To get the best of both worlds, we need our own scalable read side,
which we now have with `isc_qsbr`. And we need to modify the write
side so that it is not blocked by readers.

Better write performance requires an async cleanup function like
`call_rcu()`, instead of the blocking `rcu_synchronize()`. (There
is no blocking cleanup in `isc_qsbr`, because I have concluded
that it would be an attractive nuisance.)

Until now, all my multithreading qp-trie designs have been based
around two versions, read-only and mutable. This is too few to
work with asynchronous cleanup. The bare minimum (as in epoch
based reclamation) is three, but it makes more sense to support an
arbitrary number. Doing multi-version support "properly" makes
fewer assumptions about how safe memory reclamation works, and it
makes snapshots and rollbacks simpler.

To avoid making the memory management even more complicated, I
have introduced a new kind of "packed reader node" to anchor the
root of a version of the trie. This is simpler because it re-uses
the existing chunk lifetime logic - see the discussion under
"packed reader nodes" in `qp_p.h`.

I have also made the chunk lifetime logic simpler. The idea of a
"generation" is gone; instead, chunks are either mutable or
immutable. And the QSBR phase number is used to indicate when a
chunk can be reclaimed.

Instead of the `shared_base` flag (which was basically a one-bit
reference count, with a two version limit) the base array now has a
refcount, which replaces the confusing ad-hoc lifetime logic with
something more familiar and systematic.
2023-02-27 13:47:55 +00:00
Tony Finch
549854f63b Some minor qp-trie improvements
Adjust the dns_qp_memusage() and dns_qp_compact() functions
to be more informative and flexible about handling fragmentation.

Avoid wasting space in runt chunks.

Switch from twigs_mutable() to cells_immutable() because that is the
sense we usually want.

Drop the redundant evacuate() function and rename evacuate_twigs() to
evacuate(). Move some chunk test functions closer to their point of
use.

Clarify compact_recursive(). Some small cleanups to comments.

Use isc_time_monotonic() for qp-trie timing stats.

Use #define constants to control debug logging.

Set up DNS name label offsets in dns_qpkey_fromname() so it is easier
to use in cases where the name is not fully hydrated.
2023-02-27 13:47:25 +00:00
Tony Finch
4b09c9a6ae qp-trie naming improvements
Adjust to typename_operation style
	s/VALID_QP/QP_VALID/g
	s/QP_VALIDMULTI/QPMULTI_VALID/g

Improved greppability
	s/\bctx\b/uctx/g

Less cluttered logging
	s/QP_TRACE/TRACE/g
	s/QP_LOG_STATS/LOG_STATS/g
2023-02-27 13:47:25 +00:00
Tony Finch
df6747ee70 Fix qp-trie refcounting mistake
The error occurred when:

  * The bump chunk was re-used across multiple write transactions.
    In this situation the bump chunk is marked immutable, but the
    immutable flag is disregarded for cells after the fender, which
    were allocated in the current transaction.

  * The bump chunk fills up during an insert operation, so that the
    enlarged twigs vector is allocated from a new bump chunk.

  * Before this happened, we should have (but didn't) made the twigs
    vector mutable. This would have adjusted its refcounts as necessary.

  * However, moving to a new bump chunk has a side effect: twigs that
    were previously considered mutable because they are after the
    fender become immutable.

  * Because of this, the old twigs vector was not destroyed as expected.

  * So leaves were duplicated without their refcounts being increased.

The effect is that the refcounts were lower than they should have
been, and underflowed. The tests failed to check for refcount
underflow, so this mistake was detected much later than it ideally
could have been.

After the fix, it is now correct not to ensure the twigs are mutable,
because they are about to be copied to a larger vector. Instead, we
need to find out whether `squash_twigs()` destroyed the old twigs, and
adjust the refcounts accordingly.
2023-02-27 13:47:25 +00:00
Tony Finch
6b9ddbd1ce Add a qp-trie data structure
A qp-trie is a kind of radix tree that is particularly well-suited to
DNS servers. I invented the qp-trie in 2015, based on Dan Bernstein's
crit-bit trees and Phil Bagwell's HAMT. https://dotat.at/prog/qp/

This code incorporates some new ideas that I prototyped using
NLnet Labs NSD in 2020 (optimizations for DNS names as keys)
and 2021 (custom allocator and garbage collector).
https://dotat.at/cgi/git/nsd.git

The BIND version of my qp-trie code has a number of improvements
compared to the prototype developed for NSD.

  * The main omission in the prototype was the very sketchy outline of
    how locking might work. Now the locking has been implemented,
    using a reader/writer lock and a mutex. However, it is designed to
    benefit from liburcu if that is available.

  * The prototype was designed for two-version concurrency, one
    version for readers and one for the writer. The new code supports
    multiversion concurrency, to provide a basis for BIND's dbversion
    machinery, so that updates are not blocked by long-running zone
    transfers.

  * There are now two kinds of transaction that modify the trie: an
    `update` aims to support many very small zones without wasting
    memory; a `write` avoids unnecessary allocation to help the
    performance of many small changes to the cache.

  * There is also a single-threaded interface for situations where
    concurrent access is not necessary.

  * The API makes better use of types to make it more clear which
    operations are permitted when.

  * The lookup table used to convert a DNS name to a qp-trie key is
    now initialized by a run-time constructor instead of a programmer
    using copy-and-paste. Key conversion is more flexible, so the
    qp-trie can be used with keys other than DNS names.

  * There has been much refactoring and re-arranging things to improve
    the terminology and order of presentation in the code, and the
    internal documentation has been moved from a comment into a file
    of its own.

Some of the required functionality has been stripped out, to be
brought back later after the basics are known to work.

  * Garbage collector performance statistics are missing.

  * Fancy searches are missing, such as longest match and
    nearest match.

  * Iteration is missing.

  * Search for update is missing, for cases where the caller needs to
    know if the value object is mutable or not.
2023-02-27 13:47:25 +00:00
Tony Finch
c6bf51492d Define DNS_NAME_MAXLABELS and DNS_NAME_LABELLEN
Some qp-trie operations will need to know the maximum number of labels
in a name, so I wanted a standard macro definition with the right
value.

Replace DNS_MAX_LABELS from <dns/resolver.h> with DNS_NAME_MAXLABELS in
<dns/name.h>, and add its counterpart DNS_NAME_LABELLEN.

Use these macros in `name.c` and `resolver.c`.

Fix an off-by-one error in an assertion in `dns_name_countlabels()`.
2023-02-27 11:27:12 +00:00
Aram Sargsyan
cf79692a66 catz: unregister the db update-notify callback before detaching from db
When detaching from the previous version of the database, make sure
that the update-notify callback is unregistered, otherwise there is
an INSIST check which can generate an assertion failure in free_rbtdb(),
which checks that there are no outstanding update listeners in the list.

Similar code is already in place for RPZ.
2023-02-27 10:06:32 +00:00
Aram Sargsyan
0ef0c86632 Searching catzs->zones requires a read lock
Lock the catzs->lock mutex before searching in the catzs->zones
hash table.
2023-02-27 10:06:32 +00:00
Aram Sargsyan
ed268b46f1 Process db callbacks in zone_loaddone() after zone_postload()
The zone_postload() function can fail and unregister the callbacks.

Call dns_db_endload() only after calling zone_postload() to make
sure that the registered update-notify callbacks are not called
when the zone loading has failed during zone_postload().

Also, don't ignore the return value of zone_postload().
2023-02-27 10:06:32 +00:00
Mark Andrews
cf5f133679 Fix memory leak in isc_hmac_init
If EVP_DigestSignInit failed 'pkey' was not freed.
2023-02-26 22:56:07 +00:00
Aram Sargsyan
030ffbf475 Make sure catz->catzs isn't destroyed before catz
Call dns_catz_unref_catzs() only after detaching 'catz'.
2023-02-24 19:40:34 +00:00
Ondřej Surý
4e7187601f Pause the catz dbiterator while processing the zone
The dbiterator read-locks the whole zone, and it stayed locked for the
whole time the catz was being read.  Pause the iterator so that
updates to the catz zone are not blocked while the catz update is
being processed.
2023-02-24 17:06:18 +01:00
Ondřej Surý
b1cd4a066a Unlock catzs during dns__catz_update_cb()
Instead of holding the catzs->lock the whole time we process the catz
update, only hold it for the hash table lookup and then release it.  This
should unblock any other threads that might be processing updates to
catzs triggered by an additional incoming transfer.
Aram Sargsyan
0b96c9234f Offload catalog zone updates
Offload catalog zone processing so that the network manager threads
are not interrupted by a large catalog zone update.

Introduce a new 'updaterunning' state alongside 'updatepending',
as is done in the RPZ module.

Note that the dns__catz_update_cb() function currently holds the
catzs->lock during the whole process, which is far from being optimal,
but the issue is going to be addressed separately.
2023-02-24 15:18:02 +01:00
Aram Sargsyan
246b7084d6 Add shutdown signaling for catalog zones
This change should make sure that catalog zone update processing
doesn't happen when the catalog zone is being shut down. This
should help avoid races when offloading the catalog zone updates
in the follow-up commit.
2023-02-24 15:06:54 +01:00
Aram Sargsyan
53f0c5a9ac Add reference count tracing for dns_catz_zone_t and dns_catz_zones_t
Tracing can be activated by defining DNS_RPZ_TRACE in catz.h.
2023-02-24 15:00:26 +01:00
Aram Sargsyan
8cb79fec9d Light refactoring of catz.c
* Change 'dns_catz_new_zones()' function's prototype (the order of the
  arguments) to synchronize it with the similar function in rpz.c.
* Rename 'refs' to 'references' in preparation of ISC_REFCOUNT_*
  macros usage for reference tracking.
* Unify dns_catz_zone_t naming to catz, and dns_catz_zones_t naming to
  catzs, following the logic of similar changes in rpz.c.
* Use C compound literals for structure initialization.
* Synchronize the "new zone version came too soon" log message with the
  one in rpz.c.
* Use more of 'sizeof(*ptr)' style instead of the 'sizeof(type_t)' style
  expressions when allocating or freeing memory for 'ptr'.
2023-02-24 15:00:26 +01:00
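Two of the style points in this list, compound literals for initialization and `sizeof(*ptr)` for allocation, can be sketched as follows. The `entry_t` structure and `entry_new()` are hypothetical and only illustrate the idiom; they are not taken from catz.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure; the fields are invented for the sketch. */
typedef struct entry {
	unsigned int references;
	int version;
	const char *name;
} entry_t;

static entry_t *
entry_new(const char *name) {
	assert(name != NULL);

	/* sizeof(*e) stays correct even if the type of 'e' changes. */
	entry_t *e = malloc(sizeof(*e));
	if (e == NULL) {
		return NULL;
	}

	/*
	 * A compound literal assigns every member in one statement;
	 * members not named here (version) are zero-initialized.
	 */
	*e = (entry_t){
		.references = 1,
		.name = name,
	};
	return e;
}
```

The compound literal avoids a separate `memset()` followed by member-by-member assignment, and no field can be left uninitialized by accident.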
Tony Finch
330ff06d4a Move irs_resconf into libdns and remove libirs
`libirs` used to be a reference implementation of `getaddrinfo` and
related modern resolver APIs. It was stripped down in BIND 9.18
leaving only the `irs_resconf` module, which parses
`/etc/resolv.conf`. I have kept its include path and namespace prefix,
so it remains a little fragment of libirs now embedded in libdns.
2023-02-24 09:38:59 +00:00
Evan Hunt
4e93d44c74 fix a bug in dns_dispatch_getnext()
when a message arrives over a TCP connection matching an expected
QID, the dispatch is updated so it no longer expects that QID,
but continues reading. subsequent messages with the same QID are
ignored, unless the dispatch entry has called dns_dispatch_getnext()
or dns_dispatch_resume().

however, a coding error caused those functions to have no effect
when the dispatch was reading, so streams of messages with the same
QID could not be received over a single TCP connection, breaking *XFR.

this has been corrected by changing the order of operations in
tcp_dispatch_getnext() so that disp->reading isn't checked until
after the dispatch entry has been reactivated.
2023-02-24 08:30:33 +00:00
Evan Hunt
f0c766abec refactor dns_xfrin to use dns_dispatch
the dns_xfrin module was still using the network manager directly to
manage TCP connections and send and receive messages.  this commit
changes it to use the dispatch manager instead.
2023-02-24 08:30:33 +00:00
Evan Hunt
a4c8decc6a implement refcount tracing in xfrin.c
use ISC_REFCOUNT_IMPL for dns_xfrin_ctx_t (which has been renamed
to dns_xfrin_t to keep the function names dns_xfrin_attach() and
dns_xfrin_detach() unchanged).
2023-02-24 08:30:33 +00:00
Evan Hunt
d72419d1f5 minor cleanups in dispatch.c
- simplified tcp_startrecv()
- removed a short function that was only called once
- removed an unnecessary if statement
2023-02-24 08:30:33 +00:00
Evan Hunt
1dd42a80d6 log the xfrin pointer address in xfrin_log()
to make it easier to trace xfrin events in the log, include
the address of the dns_xfrin_t object in all xfrin log messages.
2023-02-24 08:30:33 +00:00
Evan Hunt
9d37621012 remove dead code in dns_request
the 'connected' variable in 'dns_request_create()' was always false.
2023-02-24 08:30:33 +00:00
Evan Hunt
ae5ba54fbe move dispatchmgr from resolver to view
the 'dispatchmgr' member of the resolver object is used by both
the dns_resolver and dns_request modules, and may in the future
be used by others such as dns_xfrin. it doesn't make sense for it
to live in the resolver object; this commit moves it into dns_view.
2023-02-24 08:30:33 +00:00
Tony Finch
9b7aa536ba QSBR: safe memory reclamation for lock-free data structures
This "quiescent state based reclamation" module provides support for
the qp-trie module in dns/qp. It is a replacement for liburcu, written
without reference to the urcu source code, and in fact it works in a
significantly different way.

A few specifics of BIND make this variant of QSBR somewhat simpler:

  * We can require that wait-free access to a qp-trie only happens in
    an isc_loop callback. The loop provides a natural quiescent state,
    after the callbacks are done, when no qp-trie access occurs.

  * We can dispense with any API like rcu_synchronize(). In practice,
    it takes far too long to wait for a grace period to elapse for each
    write to a data structure.

  * We use the idea of "phases" (aka epochs or eras) from EBR to
    reduce the amount of bookkeeping needed to track memory that is no
    longer needed, knowing that the qp-trie does most of that work
    already.

I considered hazard pointers for safe memory reclamation. They have
more read-side overhead (updating the hazard pointers) and it wasn't
clear to me how to nicely schedule the cleanup work. Another
alternative, epoch-based reclamation, is designed for fine-grained
lock-free updates, so it needs some rethinking to work well with the
heavily read-biased design of the qp-trie. QSBR has the fastest read
side of the basic SMR algorithms (with no barriers), and fits well
into a libuv loop. More recent hybrid SMR algorithms do not appear to
have enough benefits to justify the extra complexity.
2023-02-23 15:57:53 +00:00
Tony Finch
63cd73d43e Include thread ID in refcount trace output 2023-02-23 14:28:27 +00:00
Aram Sargsyan
18d67fa916 Remove catzs->loop
The 'loop' member of the dns_catz_zones structure is not used.
2023-02-23 08:56:37 +00:00
Evan Hunt
dc27552c30 remove isc_glob
the isc_glob module was originally needed to support posix-style glob
processing on Windows, but is now just an unnecessary wrapper around
glob(3). this commit removes it.
2023-02-22 17:35:29 +00:00
Evan Hunt
4dfc3f056d fix a crash from using an empty string for "include"
the parser could crash when "include" specified an empty string in place
of the filename. this has been fixed by returning ISC_R_FILENOTFOUND
when the string length is 0.
2023-02-22 17:35:29 +00:00
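The shape of this fix can be sketched as a length check before the file is opened. The result codes and `check_include_name()` below are hypothetical stand-ins; the actual parser uses its own result type and returns ISC_R_FILENOTFOUND.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes; BIND uses its own isc_result_t values. */
typedef enum {
	RESULT_SUCCESS,
	RESULT_FILENOTFOUND,
} result_t;

/*
 * Validate the argument of an "include" statement before trying
 * to open it. An empty name can never refer to a file, so it is
 * rejected up front with the same error a missing file would give.
 */
static result_t
check_include_name(const char *filename) {
	assert(filename != NULL);

	if (strlen(filename) == 0) {
		return RESULT_FILENOTFOUND;
	}
	return RESULT_SUCCESS;
}
```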
Ondřej Surý
6eb1340d1b Use atomic stack for async job queue
Previously, the async job queue used a locked list (ISC_LIST).
With the introduction of the atomic stack (which has to be drained all
at once), we can use it to remove some contention between the threads
and simplify the async queue.

Fortunately, the reverse order still works for us - instead of append
and tail/prev operation on the list, we are now using prepend and
head/next operation on the atomic stack.
2023-02-22 16:13:37 +00:00
Tony Finch
36e56923ce Simple lock-free stack in <isc/stack.h>
Add a singly-linked stack that supports lock-free prepend and drain (to
empty the list and clean up its elements).  Intended for use with QSBR
to collect objects that need safe memory reclamation, or any other user
that works with adding objects to the stack and then draining them in
one go like various work queues.

In <isc/atomic.h>, add an `atomic_ptr()` macro to make type
declarations a little less abominable, and clean up a duplicate
definition of `atomic_compare_exchange_strong_acq_rel()`
2023-02-22 16:13:37 +00:00
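A minimal sketch of a stack with lock-free prepend and drain, written with C11 atomics. It is not the `<isc/stack.h>` API: `job_t`, `job_stack_t`, `job_stack_push()`, and `job_stack_drain()` are hypothetical names, and the memory orderings are one reasonable choice rather than the ones BIND uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node and stack types; not the <isc/stack.h> API. */
typedef struct job {
	struct job *next;
	int id;
} job_t;

typedef struct job_stack {
	_Atomic(job_t *) top;
} job_stack_t;

/* Lock-free prepend: retry the compare-and-swap until it succeeds. */
static void
job_stack_push(job_stack_t *stack, job_t *job) {
	assert(stack != NULL && job != NULL);

	job_t *top = atomic_load_explicit(&stack->top, memory_order_relaxed);
	do {
		/* On CAS failure 'top' is refreshed, so re-link and retry. */
		job->next = top;
	} while (!atomic_compare_exchange_weak_explicit(
		&stack->top, &top, job, memory_order_release,
		memory_order_relaxed));
}

/*
 * Drain: detach the whole list in one atomic exchange. The caller
 * then owns every node and can walk the list without further
 * synchronization. Nodes come back newest first.
 */
static job_t *
job_stack_drain(job_stack_t *stack) {
	assert(stack != NULL);

	return atomic_exchange_explicit(&stack->top, (job_t *)NULL,
					memory_order_acquire);
}
```

Because consumers only ever take the entire list, there is no single-element pop and therefore no ABA hazard on the head pointer. The drained list is in reverse insertion order, which is why a user that cares about ordering has to walk it from head to next accordingly.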
Evan Hunt
b058f99cb8 remove references to obsolete isc_task/timer functions
removed references in code comments, doc/dev documentation, etc, to
isc_task, isc_timer_reset(), and isc_timertype_inactive. also removed a
coccinelle patch related to isc_timer_reset() that was no longer needed.
2023-02-22 08:13:30 +00:00
Evan Hunt
603cdb6332 move the dns_sdb API
move all dns_sdb code into bin/named/builtin.c, which is the
only place from which it's called.

(note this is temporary: later we'll refactor builtin so that it's a
standalone dns_db implementation on its own instead of using SDB
as a wrapper.)
2023-02-21 10:13:10 -08:00
Evan Hunt
77e7eac54c enable detailed db tracing
move database attach/detach functions to db.c, instead of
requiring them to be implemented for every database type.
instead, they must implement a 'destroy' function that is
called when references go to zero.

this enables us to use ISC_REFCOUNT_IMPL for databases,
with detailed tracing enabled by setting DNS_DB_TRACE to 1.
2023-02-21 10:13:10 -08:00
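The arrangement described here, generic attach/detach with a per-type 'destroy' hook, might look roughly like the sketch below. All names (`db_t`, `db_methods_t`, `memdb_*`) are hypothetical, the counter is a plain integer rather than an atomic refcount, and no tracing is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct db db_t;

/* The only lifetime hook a hypothetical database type must provide. */
typedef struct db_methods {
	void (*destroy)(db_t *db);
} db_methods_t;

/* Common header shared by every database type in this sketch. */
struct db {
	const db_methods_t *methods;
	unsigned int references; /* plain counter: single-threaded sketch */
};

/* Generic attach and detach, written once for all database types. */
static void
db_attach(db_t *source, db_t **targetp) {
	assert(source != NULL && targetp != NULL && *targetp == NULL);
	source->references++;
	*targetp = source;
}

static void
db_detach(db_t **dbp) {
	assert(dbp != NULL && *dbp != NULL);
	db_t *db = *dbp;
	*dbp = NULL;
	if (--db->references == 0) {
		db->methods->destroy(db); /* type-specific cleanup */
	}
}

/* One example type: it only implements 'destroy'. */
static int memdb_destroyed = 0;

static void
memdb_destroy(db_t *db) {
	memdb_destroyed++;
	free(db);
}

static const db_methods_t memdb_methods = { .destroy = memdb_destroy };

static db_t *
memdb_create(void) {
	db_t *db = malloc(sizeof(*db));
	if (db != NULL) {
		*db = (db_t){ .methods = &memdb_methods, .references = 1 };
	}
	return db;
}
```

Keeping the counting logic in one place means a single hook point is enough to add tracing for every database type at once.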
Evan Hunt
8da43bb7f5 simplify dns_sdb API
SDB is currently (and foreseeably) only used by the named
builtin databases, so it only needs as much of its API as
those databases use.

- removed three flags defined for the SDB API that were always
  set the same by builtin databases.

- there were two different types of lookup functions defined for
  SDB, using slightly different function signatures. since backward
  compatibility is no longer a concern, we can eliminate the 'lookup'
  entry point and rename 'lookup2' to 'lookup'.

- removed the 'allnodes' entry point and all database iterator
  implementation code

- removed dns_sdb_putnamedrr() and dns_sdb_putnamedrdata() since
  they were never used.
2023-02-21 10:13:10 -08:00