When generating a new key, dnssec-keygen checks for possible
key ID collisions with existing keys. The dnssec.c:findmatchingkeys()
function, which is supposed to get the list of the existing keys,
fails to do that for the existing KEY rrtype keys (i.e. generated
using 'dnssec-keygen -T KEY') because it doesn't pass down to the
dst_key_fromnamedfile() -> dst_key_read_public() functions the type
of the keys it's interested in. Fix the issue by introducing a new
function parameter which tells in which type of keys the caller is
currently interested in.
Add an API to parse and extract either an IPv4 or IPv6 address from
a name using the reverse format. It takes care of family detection,
and returns a generic error in case of syntax error.
Simplify the implementation around the database ownercase. Remove the
dns_rdataset_setownercase() implementation for the slabheaders and only
allow setting ownercase on rdatalists and rdatasets. The ownercase in
the database can now be set only with dns_db_addrdataset() by passing
rdataset with correctly set ownercase.
Mimic the Unbound behaviour where the cyclic offset is taken from query
ID, and remove recording of the current state. As the incoming query ID
should have random distribution, the cyclic ordering should also have
uniform distribution of the starting record.
Extend the `$name`, `$view` and `$type` tokens (expanding into the zone
name, zone's view name and type); the new following tokens are now also
accepted:
- $name or %s is replaced with the zone name in lower case;
- $type or %t is replaced with the zone type -- i.e., primary,
secondary, etc);
- $view or %v is replaced with the view name;
- $char1 or %1 is replaced with the first character of the zone name;
- $char2 or %2 is replaced with the second character of the zone name
(or a dot if there is no second character);
- $char3 or %3 is replaced with the third character of the zone name (or
a dot if there is no third character);
- $label1 or %z is replaced with the toplevel domain of the zone (or a
dot if it is the root zone);
- $label2 or %y is replaced with the next label under the toplevel
domain (or a dot if there is no next label);
- $label3 or %x is replaced with the next-next label under the toplevel
domain (or a dot if there is no next-next label).
> Put a space before opening parentheses only after control statement
> keywords (for/if/while...) except this option doesn’t apply to ForEach
> and If macros. This is useful in projects where ForEach/If macros are
> treated as function calls instead of control statements.
The slabheader.c, qpzone.c and qpcache.c had couple of shared macros
that were copied and paste between the units. Move these common
attributes access macros into private header, so these can be shared
among the three compilation units.
The RR type 0 is a reserved type for SIG[1] resource record. It should
not be ever inserted into the database nor queried. Add a special
handling to bail out quickly with DNS_R_DISALLOWED when inserting and
ISC_R_NOTFOUND when looking up TYPE0. This is also prerequisite for
stricter checks in the follow-up commit.
1. https://www.rfc-editor.org/rfc/rfc2535#section-4.1.8.1
All databases in the codebase follow the same structure: a database is
an associative container from DNS names to nodes, and each node is an
associative container from RR types to RR data.
Each database implementation (qpzone, qpcache, sdlz, builtin, dyndb) has
its own corresponding node type (qpznode, qpcnode, etc). However, some
code needs to work with nodes generically regardless of their specific
type - for example, to acquire locks, manage references, or
register/unregister slabs from the heap.
Currently, these generic node operations are implemented as methods in
the database vtable, which creates problematic coupling between database
and node lifetimes. If a node outlives its parent database, the node
destructor will destroy all RR data, and each RR data destructor will
try to unregister from heaps by calling a virtual function from the
database vtable. Since the database was already freed, this causes a
crash.
This commit breaks the coupling by standardizing the layout of all
database nodes, adding a dedicated vtable for node operations, and
moving node-specific methods from the database vtable to the node
vtable.
dns_keytable_issecuredomain() and dns_view_issecuredomain()
previously returned a result code to inform the caller of
unexpected database failures when looking up names in the
keytable and/or NTA table. such failures are not actually
possible. both functions now return a simple bool.
also, dns_view_issecuredomain() now returns false if
view->enablevalidation is false, so the caller no longer
has to check for that.
Instead of having individual memory contexts scattered across different
files and called different names, add a single memory context called
isc_g_mctx that replaces named_g_mctx and various other global memory
contexts in various utilities and tests.
There is only a single network manager running on top of the loop
manager (except for tests). Refactor the network manager to be a
singleton (a single instance) and change the unit tests, so that the
shorter read timeouts apply only to a specific handle, not the whole
extra 'connect_nm' network manager instance.
All the applications built on top of the loop manager were required to
create just a single instance of the loop manager. Refactor the loop
manager to not expose this instance to the callers and keep the loop
manager object internal to the isc_loop compilation unit.
This significantly simplifies a number of data structures and calls to
the isc_loop API.
Now that we have to code working, rename 'dns_qp_lookup2' back to
'dns_qp_lookup' and adjust all remaining 'dns_qp_lookup' occurrences
to take a space=0 parameter.
For now we only allow DNS_DB_NSEC_* values so it makes sense to change
the type to an enum.
Rename 'denial' to the more intuitive 'space', indicating the namespace
of the keyvalue pair.
If zone and denial data are going to be stored in the same qp storage,
the unit tests need to be updated to reflect this change. The code
changes mainly affect name to qpkey conversion, lookups, and
predecessors.
A note on predecessors: since the denial and zone data are now in the
same qp storage, the predecessor of the first name in the zone data will
consequently be the last name in the denial data.
In preparation to merge the three qp tries (tree, nsec, nsec3) into
one, add the piece of information into the qpkey. This is the most
significant bit of information, so prepend the denial type to the qpkey.
This means we need to pass on the denial type when constructing the
qpkey from a name, or doing a lookup.
Reuse the the DNS_DB_NSEC_* values. Most qp tries in the code we just
pass on 0 (nta, rpz, zt, etc.), because there is no need for denial of
existence, but for qpzone and qpcache we must pass the right value.
Change the code, so that node->nsec no longer can have the value
DNS_DB_NSEC_HAS_NSEC, instead track this in a new attribute 'havensec'.
Since we use node->nsec to convert names to keys, the value MUST be set
before inserting the node into the qp-trie.
Update the fuzzing and unit tests accordingly. This only adds a few
extra test cases, more are needed.
In the qp_test.c we can remove test code for empty keys as this is
no longer possible.
RRset ordering is now an enum inside struct rdataset attributes. This
was done to keep size to of the structure to its original value before
this MR.
I expect zero performance impact but it should be easier to deal with
attributes in debuggers and language servers.
Instead of having hand crafted attach/detach/destroy functions, replace
them with the standard ISC_REFCOUNT macro. This also have advantage
that delayed netmgr detach (from dns_dispatch) now doesn't cause
assertion failure. This can happen with delayed (call_rcu) shutdown of
dns_adb.
Qpzone employs a locking strategy where rwlocks are grouped into
buckets, and each zone gets 17 buckets.
This strategy is suboptimal in two ways:
- If named is serving a single zone or a zone is the majority of the
traffic, this strategy pretty much guarantees contention when using
more than a dozen threads.
- If named is serving many small zones, it causes substantial memory
usage.
This commit switches the locking to a global table initialized at start
time. This should have three effects:
- Performance should improve in the single zone case, since now we are
selecting from a bigger pool of locks.
- Memory consumption should go down significantly in the many zone
cases.
- Performance should not degrade substantially in the many zone cases.
The reason for this is that, while we could have substantially more
zones than locks, we can query/edit only O(num threads) at the same
time. So by making the global table much bigger than the expected
number of threads, we can limit contention.
In the current implementation, the resigning heap is part of the zone
database. This leads to a cycle, as the database has a reference to its
nodes, but each node needs a reference to the database.
This MR splits the resigning heap into its own separate struct, in order
to help breaking the cycle.
Recovering the node lock from a pointer to the header and a pointer to
the db is a common operation. This commit abstracts it away into a
function, so that the node lock selection logic may be modified more
easily.
Change the internal type used for isc_tid unit to isc_tid_t to hide the
specific integer type being used for the 'tid'. Internally, the signed
integer type is being used. This allows us to have negatively indexed
arrays that works both for threads with assigned tid and the threads
with unassigned tid. This should be used only in specific situations.
We need disable clang-format here to preserve the brackets around
the string concatenation to prevent -Wstring-concatenation -Werror
breaking the build.
Add support for proposed DS digest types that encode the private
algorithm identifier at the start of the DS digest as is done for
DNSKEY and RRSIG. This allows a DS record to identify the specific
DNSSEC algorithm, rather than a set of algorithms, when the algorithm
field is set to PRIVATEDNS or PRIVATEOID.
Meson is a modern build system that has seen a rise in adoption and some
version of it is available in almost every platform supported.
Compared to automake, meson has the following advantages:
* Meson provides a significant boost to the build and configuration time
by better exploiting parallelism.
* Meson is subjectively considered to be better in readability.
These merits alone justify experimenting with meson as a way of
improving development time and ergonomics. However, there are some
compromises to ensure the transition goes relatively smooth:
* The system tests currently rely on various files within the source
directory. Changing this requirement is a non-trivial task that can't
be currently justified. Currently the last compiled build directory
writes into the source tree which is in turn used by pytest.
* The minimum version supported has been fixed at 0.61. Increasing this
value will require choosing a baseline of distributions that can
package with meson. On the contrary, there will likely be an attempt
to decrease this value to ensure almost universal support for building
BIND 9 with meson.
The cache for unreachable primaries was added to BIND 9 in 2006 via
1372e172d0. It features a 10-slot LRU
array with 600 seconds (10 minutes) fixed delay. During this time, any
primary with a hiccup would be blocked for the whole block duration
(unless overwritten by a different entry).
As this design is not very flexible (i.e. the fixed delay and the fixed
amount of the slots), redesign it based on the badcache.c module, which
was implemented earlier for a similar mechanism.
The differences between the new code and the badcache module were large
enough to create a new module instead of trying to make the badcache
module universal, which could complicate the implementation.
The new design implements an exponential backoff for entries which are
added again soon after expiring, i.e. the next expiration happens in
double the amount of time of the previous expiration, but in no more
time than the defined maximum value.
The initial and the maximum expiration values are hard-coded, but, if
required, it should be trivial to implement configurable knobs.
Special tokens can now be specified in a zone "file" option
in order to generate the filename parametrically. The first
instead of "$name" in the "file" option is replaced with the
zone origin, the first instance of "$type" is replaced with the
zone type (i.e., primary, secondary, etc), and the first instance
of "$view" is replaced with the view name..
This simplifies the creation of zones using initial-file templates.
For example:
$ rndc addzone <zonename> \
{ type primary; file "$name.db"; initial-file "template.db"
When loading a primary zone for the first time, if the zonefile
does not exist but an "initial-file" option has been set, then a
new file will be copied into place from the path specified by
"initial-file".
This can be used to simplify the process of adding new zones. For
instance, a template zonefile could be used by running:
$ rndc addzone example.com \
'{ type primary; file "example.db"; initial-file "template.db"; };'
Instead of giving the memory context names with an explicit call to
isc_mem_setname(), add the name to isc_mem_create() call to have all the
memory contexts an unconditional name.
previously, ISC_LIST_FOREACH and ISC_LIST_FOREACH_SAFE were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. ISC_LIST_FOREACH is now also
safe, and the separate _SAFE macro has been removed.
similarly, the ISC_LIST_FOREACH_REV macro is now safe, and
ISC_LIST_FOREACH_REV_SAFE has also been removed.
The `max-rsa-exponent-size` could limit the exponents of the RSA
public keys during the DNSSEC verification. Instead of providing
a cryptic (not cryptographic) knob, hardcode the max exponent to
be 4096 (the theoretical maximum for DNSSEC).
The DST API has been cleaned up, duplicate functions has been squashed
into single call (verify and verify2 functions), and couple of unused
functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).
In a previous change, the "algorithm" value passed to
dns_tsigkey_create() was changed from a DNS name to an integer;
the name was then chosen from a table of known algorithms. A
side effect of this change was that a query using an unknown TSIG
algorithm was no longer handled correctly, and could trigger an
assertion failure. This has been corrected.
The dns_tsigkey struct now stores the signing algorithm
as dst_algorithm_t value 'alg' instead of as a dns_name,
but retains an 'algname' field, which is used only when the
algorithm is DST_ALG_UNKNOWN. This allows the name of the
unrecognized algorithm name to be returned in a BADKEY
response.
Previously all kinds of TCP timeouts had a single getter and setter
functions. Separate each timeout to its own getter/setter functions,
because in majority of cases only one is required at a time, and it's
not optimal expanding those functions every time a new timeout value
is implemented.
The new 'tcp-primaries-timeout' configuration option works the same way
as the existing 'tcp-initial-timeout' option, but applies only to the
TCP connections made to the primary servers, so that the timeout value
can be set separately for them. The default is 15 seconds.
Also, while accommodating zone.c's code to support the new option, make
a light refactoring with the way UDP timeouts are calculated by using
definitions instead of hardcoded values.
We were failing to account for the length byte before the OID.
See RFC 4034.
Algorithm number 254 is reserved for private use and will never be
assigned to a specific algorithm. The public key area in the DNSKEY
RR and the signature area in the RRSIG RR begin with an unsigned
length byte followed by a BER encoded Object Identifier (ISO OID) of
that length. The OID indicates the private algorithm in use, and the
remainder of the area is whatever is required by that algorithm.
Entities should only use OIDs they control to designate their private
algorithms.
the pattern `for (x = ISC_LIST_HEAD(...); x != NULL; ISC_LIST_NEXT(...)`
has been changed to `ISC_LIST_FOREACH` throughout BIND, except in a few
cases where the change would be excessively complex.
in most cases this was a straightforward change. in some places,
however, the list element variable was referenced after the loop
ended, and the code was refactored to avoid this necessity.
also, because `ISC_LIST_FOREACH` uses typeof(list.head) to declare
the list elements, compilation failures can occur if the list object
has a `const` qualifier. some `const` qualifiers have been removed
from function parameters to avoid this problem, and where that was not
possible, `UNCONST` was used.
This is the core implementation of the SIEVE algorithm described in the
following paper:
Zhang, Yazhuo, Juncheng Yang, Yao Yue, Ymir Vigfusson, and K V
Rashmi. “SIEVE Is Simpler than LRU: An Efficient Turn-Key Eviction
Algorithm for Web Caches,” n.d.. available online from
https://junchengyang.com/publication/nsdi24-SIEVE.pdf