Hold a weak reference to the view so that it can't go away while
nta is performing its lookups. Cancel nta timers once all external
references to the view have gone to prevent them triggering new work.
(cherry picked from commit 0b2555e8cf)
In the rare case that you have multiple keys acting as KSK and that
have the same keytag, you can now set the algorithm when calling
'-checkds'.
(cherry picked from commit 46fcd927e7)
Add a new 'rndc' command 'dnssec -checkds' that allows the user to
signal named that a new DS record has been seen published in the
parent, or that an existing DS record has been withdrawn from the
parent.
Upon the 'checkds' request, 'named' will write out the new state for
the key, updating the 'DSPublish' or 'DSRemoved' timing metadata.
This replaces the "parent-registration-delay" configuration option,
this was unreliable because it was purely time based (if the user
did not actually submit the new DS to the parent for example, this
could result in an invalid DNSSEC state).
Because we cannot rely on the parent registration delay for state
transition, we need to replace it with a different guard. Instead,
if a key wants its DS state to be moved to RUMOURED, the "DSPublish"
time must be set and must not be in the future. If a key wants its
DS state to be moved to UNRETENTIVE, the "DSRemoved" time must be set
and must not be in the future.
By default, with '-checkds' you set the time that the DS has been
published or withdrawn to now, but you can set a different time with
'-when'. If there is only one KSK for the zone, that key has its
DS state moved to RUMOURED. If there are multiple keys for the zone,
specify the right key with '-key'.
(cherry picked from commit 04d8fc0143)
There were several problems with rbt hashtable implementation:
1. Our internal hashing function returns uint64_t value, but it was
silently truncated to unsigned int in dns_name_hash() and
dns_name_fullhash() functions. As the SipHash 2-4 higher bits are
more random, we need to use the upper half of the return value.
2. The hashtable implementation in rbt.c was using modulo to pick the
slot number for the hash table. This has several problems because
modulo is: a) slow, b) oblivious to patterns in the input data. This
could lead to very uneven distribution of the hashed data in the
hashtable. Combined with the single-linked lists we use, it could
really hog-down the lookup and removal of the nodes from the rbt
tree[a]. The Fibonacci Hashing is much better fit for the hashtable
function here. For longer description, read "Fibonacci Hashing: The
Optimization that the World Forgot"[b] or just look at the Linux
kernel. Also this will make Diego very happy :).
3. The hashtable would rehash every time the number of nodes in the rbt
tree would exceed 3 * (hashtable size). The overcommit will make the
uneven distribution in the hashtable even worse, but the main problem
lies in the rehashing - every time the database grows beyond the
limit, each subsequent rehashing will be much slower. The mitigation
here is letting the rbt know how big the cache can grown and
pre-allocate the hashtable to be big enough to actually never need to
rehash. This will consume more memory at the start, but since the
size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
will only consume maximum of 32GB of memory for hashtable in the
worst case (and max-cache-size would need to be set to more than
4TB). Calling the dns_db_adjusthashsize() will also cap the maximum
size of the hashtable to the pre-computed number of bits, so it won't
try to consume more gigabytes of memory than available for the
database.
FIXME: What is the average size of the rbt node that gets hashed? I
chose the pagesize (4k) as initial value to precompute the size of
the hashtable, but the value is based on feeling and not any real
data.
For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.
Notes:
a. A doubly linked list should be used here to speedup the removal of
the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
(cherry picked from commit e24bc324b4)
When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views. Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:
1. mdb_env_open(envA) is called when the view is first created.
2. "rndc reconfig" is called.
3. mdb_env_open(envB) is called for the new instance of the view.
4. mdb_env_close(envA) is called for the old instance of the view.
This seems to have worked so far. However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).
Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.
To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead. Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps). Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.
[1] 2fd44e3251
(cherry picked from commit 53120279b5)
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.
Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.
(cherry picked from commit 19ce9ec1d4)
- clone keynode->dsset rather than return a pointer so that thread
use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
while in use.
- create a new keynode when removing DS records so that dangling
pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
when DS records are added.
(cherry picked from commit e5b2eca1d3)
DS records only belong at delegation points and if present
at the zone apex are invariably the result of administrative
errors. Additionally they can't be queried for with modern
resolvers as the parent servers will be queried.
(cherry picked from commit 35a58d30c9)
If there are more that 5 NS record for a zone only perform a
maximum of 4 address lookups for all the name servers. This
limits the amount of remote lookup performed for server
addresses at each level for a given query.
Originally, every library and binaries got linked to everything, which
creates unnecessary overlinking. This wasn't as straightforward as it
should be as we still support configuration without libtool for 9.16.
Couple of smaller issues related to include headers and an issue where
sanitizer overload dlopen and dlclose symbols, so we were getting false
negatives in the autoconf test.
The first attempt to add DNSSEC sign statistics was naive: for each
zone we allocated 64K counters, twice. In reality each zone has at
most four keys, so the new approach only has room for four keys per
zone. If after a rollover more keys have signed the zone, existing
keys are rotated out.
The DNSSEC sign statistics has three counters per key, so twelve
counters per zone. First counter is actually a key id, so it is
clear what key contributed to the metrics. The second counter
tracks the number of generated signatures, and the third tracks
how many of those are refreshes.
This means that in the zone structure we no longer need two separate
references to DNSSEC sign metrics: both the resign and refresh stats
are kept in a single dns_stats structure.
Incrementing dnssecsignstats:
Whenever a dnssecsignstat is incremented, we look up the key id
to see if we already are counting metrics for this key. If so,
we update the corresponding operation counter (resign or
refresh).
If the key is new, store the value in a new counter and increment
corresponding counter.
If all slots are full, we rotate the keys and overwrite the last
slot with the new key.
Dumping dnssecsignstats:
Dumping dnssecsignstats is no longer a simple wrapper around
isc_stats_dump, but uses the same principle. The difference is that
rather than dumping the index (key tag) and counter, we have to look
up the corresponding counter.
(cherry picked from commit 705810d577)
"max-journal-size" is set by default to twice the size of the zone
database. however, the calculation of zone database size was flawed.
- change the size calculations in dns_db_getsize() to more accurately
represent the space needed for a journal file or *XFR message to
contain the data in the database. previously we returned the sizes
of all rdataslabs, including header overhead and offset tables,
which resulted in the database size being reported as much larger
than the equivalent journal transactions would have been.
- map files caused a particular problem here: the full name can't be
determined from the node while a file is being deserialized, because
the uppernode pointers aren't set yet. so we store "full name length"
in the dns_rbtnode structure while serializing, and clear it after
deserialization is complete.
Start enforcing the clang-format rules on changed files
Closes#46
See merge request isc-projects/bind9!3063
(cherry picked from commit a04cdde45d)
d2b5853b Start enforcing the clang-format rules on changed files
618947c6 Switch AlwaysBreakAfterReturnType from TopLevelDefinitions to All
654927c8 Add separate .clang-format files for headers
5777c44a Reformat using the new rules
60d29f69 Don't enforce copyrights on .clang-format
adjust clang-format options to get closer to ISC style
See merge request isc-projects/bind9!3061
(cherry picked from commit d3b49b6675)
0255a974 revise .clang-format and add a C formatting script in util
e851ed0b apply the modified style
Add curly braces using uncrustify and then reformat with clang-format back
Closes#46
See merge request isc-projects/bind9!3057
(cherry picked from commit 67b68e06ad)
36c6105e Use coccinelle to add braces to nested single line statement
d14bb713 Add copy of run-clang-tidy that can fixup the filepaths
056e133c Use clang-tidy to add curly braces around one-line statements
Reformat source code with clang-format
Closes#46
See merge request isc-projects/bind9!2156
(cherry picked from commit 7099e79a9b)
4c3b063e Import Linux kernel .clang-format with small modifications
f50b1e06 Use clang-format to reformat the source files
11341c76 Update the definition files for Windows
df6c1f76 Remove tkey_test (which is no-op anyway)
When you do a restart or reconfig of named, or rndc loadkeys, this
triggers the key manager to run. The key manager will check if new
keys need to be created. If there is an active key, and key rollover
is scheduled far enough away, no new key needs to be created.
However, there was a bug that when you just start to sign your zone,
it takes a while before the KSK becomes an active key. An active KSK
has its DS submitted or published, but before the key manager allows
that, the DNSKEY needs to be omnipresent. If you restart named
or rndc loadkeys in quick succession when you just started to sign
your zone, new keys will be created because the KSK is not yet
considered active.
Fix is to check for introducing as well as active keys. These keys
all have in common that their goal is to become omnipresent.
- Add quotes before and after zone name when generating "addzone"
input so avoid "unexpected token" errors.
- Use a hex digest for zone filenames when the zone or view name
contains a slash.
- Test with a domain name containing a slash.
- Incidentally added 'catzhash.py' to contrib/scripts to generate
hash labels for catalog zones, as it was needed to write the test.
it now removes matching trust anchors from from the dslist while leaving
the other trust anchors in place.
also cleaned up the API to remove functions that were never being used.
NOTE: the keytable test is still failing because dns_keytable_deletekey()
is looking for exact matches in keynodes containing dst_key objects,
which no keynode has anymore.
the internal keytable structure has not yet been changed, but
insertion of DS anchors is the only method now available.
NOTE: the keytable unit test is currently failing because of tests
that expect individual keynode objects to contain single DST key
objects.
as initial-key and static-key trust anchors will now be stored as a
DS rrset, code referencing keynodes storing DNSKEY trust anchors will
no longer be reached.
this function is used by dns_view_untrust() to handle revoked keys, so
it will still be needed after the keytable/validator refactoring is
complete, even though the keytable will be storing DS trust anchors
instead of keys. to simplify the way it's called, it now takes a DNSKEY
rdata struct instead of a DST key.
Previously, the dns_geoip API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables, and
creating the local memory context was moved to named and stored in the
named_g_geoip global context.
Previously, the dns_dt API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables.
Previously, the dns_name API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables.
The indentation for dumping the master zone was driven by two
global variables dns_master_indent and dns_master_indentstr. In
threaded mode, this becomes prone to data access races, so this commit
converts the global variables into a local per-context tuple that
consist of count and string.
note: this also needs further refactoring.
- when initializing RFC 5011 for a name, we populate the managed-keys
zone with KEYDATA records derived from the initial-key trust anchors.
however, with initial-ds trust anchors, there is no key. but the
managed-keys zone still must have a KEYDATA record for the name,
otherwise zone_refreshkeys() won't refresh that key. so, for
initial-ds trust anchors, we now add an empty KEYDATA record and set
the key refresh timer so that the real keys will be looked up as soon
as possible.
- when a key refresh query is done, we verify it against the
trust anchor; this is done in two ways, one with the DS RRset
set up during configuration if present, or with the keys linked
from each keynode in the list if not. because there are two different
verification methods, the loop structure is overly complex and should
be simplified.
- the keyfetch_done() and sync_keyzone() functions are both too long
and should be broken into smaller functions.
note: this is a frankensteinian kluge which needs further refactoring.
the keytable started as an RBT where the node->data points to a list of
dns_keynode structures, each of which points to a single dst_key.
later it was modified so that the list could instead point to a single
"null" keynode structure, which does not reference a key; this means
a trust anchor has been configured but the RFC 5011 refresh failed.
in this branch it is further updated to allow the first keynode in
the list to point to an rdatalist of DS-style trust anchors. these will
be used by the validator to populate 'val->dsset' when validating a zone
key.
a DS style trust anchor can be updated as a result of RFC 5011
processing to contain DST keys instead; this results in the DS list
being freed. the reverse is not possible; attempting to add a DS-style
trust anchor if a key-style trust anchor is already in place results
in an error.
later, this should be refactored to use rdatalists for both DS-style
and key-style trust anchors, but we're keeping the existing code for
old-style trust anchors for now.