The 'all_spilled' local variable in resolver.c:fctx_getaddresses()
is 'true' by default, and only becomes false when there is at least
one successfully found NS address. However, when a 'forward only;'
configuration is used, the code jumps over the part where it looks
for NS addresses and doesn't reset the 'all_spilled' to false, which
results in incorretly increased 'serverquota' statistics variable,
and also in invalid return error code from the function. The result
code error didn't make any differences, because all codes other than
'ISC_R_SUCCESS' or 'DNS_R_WAIT' were treated in the same way, and
the result code was never logged anywhere.
Set the default value of 'all_spilled' to 'false', and only make it
'true' before actually starting to look up NS addresses.
Currently, the isc_work API is overloaded. It runs both the
CPU-intensive operations like DNSSEC validations and long-term tasks
like RPZ processing, CATZ processing, zone file loading/dumping and few
others.
Under specific circumstances, when many large zones are being loaded, or
RPZ zones processed, this stops the CPU-intensive tasks and the DNSSEC
validation is practically stopped until the long-running tasks are
finished.
As this is undesireable, this commit moves the CPU-intensive operations
from the isc_work API to the isc_helper API that only runs fast memory
cleanups now.
When verifying a message in an offloaded thread there is a race with
the worker thread which writes to the same buffer. Clone the message
buffer before offloading.
This change allows fallback from an IXFR failure to AXFR when the
reason is DNS_R_TOOMANYRECORDS. This is because this error condition
could be temporary only in an intermediate version of IXFR
transactions and it's possible that the latest version of the zone
doesn't have that condition. In such a case, the secondary would never
be able to update the zone (even if it could) without this fallback.
This fallback behavior is particularly useful with the recently
introduced max-records-per-type and max-types-per-name options:
the primary may not have these limitations and may temporarily
introduce "too many" records, breaking IXFR. If the primary side
subsequently deletes these records, this fallback will help recover
the zone transfer failure automatically; without it, the secondary
side would first need to increase the limit, which requires more
operational overhead and has its own adverse effect.
This change also fixes a minor glitch that DNS_R_TOOMANYRECORDS wasn't
logged in xfrin_fail.
The rcu_xchg_pointer() function can be used outside of a critical
section, and usually must be followed by a synchronize_rcu() or
call_rcu() call to detach from the resource, unless if there are
some guarantees in place because of our own reference counting.
If the ZSK has lifetime unlimited, the timing metadata "Inactive" and
"Delete" cannot be found and is treated as an error. Fix by allowing
these metadata to not exist.
Return partial match from dns_db_find/dns_db_find when requested
to short circuit the closest encloser discover process. Most of the
time this will be the actual closest encloser but may not be when
there yet to be committed / cleaned up versions of the zone with
names below the actual closest encloser.
fctx->state should be read with the lock held.
1559 /*
1560 * Caller must be holding the fctx lock.
1561 */
CID 468796: (#1 of 1): Data race condition (MISSING_LOCK)
1. missing_lock: Accessing fctx->state without holding lock fetchctx.lock.
Elsewhere, fetchctx.state is written to with fetchctx.lock held 2 out of 2 times.
1562 REQUIRE(fctx->state == fetchstate_done);
1563
1564 FCTXTRACE("sendevents");
1565
1566 LOCK(&fctx->lock);
1567
Give prefetches a free pass through the quota so that the cache
entries for popular zones could be updated successfully even if the
quota for is already reached.
Give prefetches a free pass through the quota so that the cache entry
for a popular zone could be updated successfully even if the quota for
it is already reached.
This limits the maximum number of received incremental zone
transfer differences for a secondary server. Upon reaching the
confgiured limit, the secondary aborts IXFR and initiates a full
zone transfer (AXFR).
If there is an algorithm rollover and two keys of different algorithm
share the same keytags, then there is a possibility that if we check
that a key matches a specific state, we are checking against the wrong
key.
Fix this by not only checking for matching key id but also key
algorithm.
Some things we no longer want to do when we are in offline-ksk mode.
1. Don't check for inactive and private keys if the key is a KSK.
2. Don't update the TTL of DNSKEY, CDS and CDNSKEY RRset, these come
from the SKR.
With offline-ksk enabled, we don't run the keymgr because the key
timings are determined by the SKR. We do update the key states but
we derive them from the timing metadata.
Then, we can skip a other tasks in offline-ksk mode, like DS checking
at the parent and CDS synchronization, because the CDS and CDNSKEY
RRsets also come from the SKR.
This added source code stores SKR data. It is loosely based on:
https://www.iana.org/dnssec/archive/files/draft-icann-dnssec-keymgmt-01.txt
A SKR contains a list of signed DNSKEY RRsets. Each change in data
should be stored in a separate bundle. So if the RRSIG is refreshed that
means it is stored in the next bundle. Likewise, if there is a new ZSK
pre-published, it is in the next bundle.
In addition (not mentioned in the draft), each bundle may contain
signed CDS and CDNSKEY RRsets.
Each bundle has an inception time. These will determine when we need
to re-sign or re-key the zone.
Add a new configuration option to enable Offline KSK key management.
Offline KSK cannot work with CSK because it splits how keys with the
KSK and ZSK role operate. Therefore, one key cannot have both roles.
Add a configuration check to ensure this.
There are few places where we attach/detach from the dns_xfrin object
while running on a different thread than the zone's assigned thread -
xfrin_xmlrender() in the statschannel and dns_zone_stopxfr() to name the
two places where it happens now. In the rare case, when the incoming
transfer completes (or shuts down) in the brief period between the other
thread attaches and detaches from the dns_xfrin, the isc_timer_destroy()
calls would be called by the last thread calling the xfrin_detach().
In the worst case, it would be this other thread causing assertion
failure. Move the isc_timer_destroy() call to xfrin_end() function
which is always called on the right thread and to match this move
isc_timer_create() to xfrin_start() - although this other change makes
no difference.
Remove the complicated mechanism that could be (in theory) used by
external libraries to register new categories and modules with
statically defined lists in <isc/log.h>. This is similar to what we
have done for <isc/result.h> result codes. All the libraries are now
internal to BIND 9, so we don't need to provide a mechanism to register
extra categories and modules.
Add isc_logconfig_get() function to get the current logconfig and use
the getter to replace most of the little dancing around setting up
logging in the tools. Thus:
isc_log_create(mctx, &lctx, &logconfig);
isc_log_setcontext(lctx);
dns_log_setcontext(lctx);
...
...use lcfg...
...
isc_log_destroy();
is now only:
logconfig = isc_logconfig_get(lctx);
...use lcfg...
For thread-safety, isc_logconfig_get() should be surrounded by RCU read
lock, but since we never use isc_logconfig_get() in threaded context,
the only place where it is actually used (but not really needed) is
named_log_init().
OpenSSL has added support for deterministic ECDSA (RFC 6979) with
version 3.2.
Use it by default as derandomization doesn't pose a risk for DNS
usecases and is allowed by FIPS 186-5.
The fcount_incr() was not increasing counter->count when force was set
to true, but fcount_decr() would try to decrease the counter leading to
underflow and assertion failure. Swap the order of the arguments in the
condition, so the !force is evaluated after incrementing the .count.
Since the enable_fips_mode() now resides inside the isc_tls unit, BIND 9
would fail to compile when FIPS mode was enabled as the DST subsystem
logging functions were missing.
Move the crypto library logging functions from the openssl_link unit to
isc_tls unit and enhance it, so it can now be used from both places
keeping the old dst__openssl_toresult* macros alive.
MAX_RESTARTS is no longer hard-coded; ns_server_setmaxrestarts()
and dns_client_setmaxrestarts() can now be used to modify the
max-restarts value at runtime. in both cases, the default is 11.
the number of steps that can be followed in a CNAME chain
before terminating the lookup has been reduced from 16 to 11.
(this is a hard-coded value, but will be made configurable later.)
there were cases in resolver.c when queries for NS records were
started without passing a pointer to the parent fetch's query counter;
as a result, the max-recursion-queries quota for those queries started
counting from zero, instead of sharing the limit for the parent fetch,
making the quota ineffective in some cases.
Instead of calling dst_lib_init() and dst_lib_destroy() explicitly by
all the programs, create a separate memory context for the DST subsystem
and use the library constructor and destructor to initialize the DST
internals.
Since the support for OpenSSL Engines has been removed, we can now also
remove the checks for OPENSSL_API_LEVEL; The OpenSSL 3.x APIs will be
used when compiling with OpenSSL 3.x, and OpenSSL 1.1.xx APIs will be
used only when OpenSSL 1.1.x is used.
The OpenSSL 1.x Engines support has been deprecated in the OpenSSL 3.x
and is going to be removed. Remove the OpenSSL Engine support in favor
of OpenSSL Providers.
When adding glue to the header, we add header to the wait-free stack to
be cleaned up later which sets wfc_node->next to non-NULL value. When
the actual cleaning happens we would only cleanup the .glue_list, but
since the database isn't locked for the time being, the headers could be
reused while cleaning the existing glue entries, which creates a data
race between database versions.
Revert the code back to use per-database-version hashtable where keys
are the node pointers. This allows each database version to have
independent glue cache table that doesn't affect nodes or headers that
could already "belong" to the future database version.
when searching the cache for a node so that we can delete an
rdataset, it is not necessary to set the 'create' flag. if the
node doesn't exist yet, we then we won't be able to delete
anything from it anyway.