Previously, dumping the zones to the files were quantized, so it doesn't
slow down network IO processing. With the introduction of network
manager asynchronous threadpools, we can move the IO intensive work to
use that API and we don't have to quantize the work anymore as it the
file IO won't block anything except other zone dumping processes.
(cherry picked from commit 8a5c62de83)
We should also lock kasp when reading key files, because at the same
time the zone in another view may be updating the key file.
(cherry picked from commit 252a1ae0a1)
this rolls up numerous changes that have been applied to the
main branch, including moving isc_task operations into the
netmgr event loops, and other general stabilization.
When looking for key files, we could use isdigit rather than checking
if the character is within the range [0-9].
Use (unsigned char) cast to ensure the value is representable in the
unsigned char type (as suggested by the isdigit manpage).
Change " & 0xff" occurrences to the recommended (unsigned char) type
cast.
(cherry picked from commit 1998ad6c776a9c17c27788b17765dee90d9e25df)
The rndc command 'dnssec -status' only considered keys from
'dns_dnssec_findmatchingkeys' which only includes keys with accessible
private keys. Change it so that offline keys are also listed in the
status.
(cherry picked from commit b3a5859a9b)
When the feature was backported, we should have leave it disabled by
default, it turns out the default `100%` is producing some unexpected
results (under investigation), so for the time being, we are going to to
disable the max-ixfr-ratio.
Add a new built-in policy "insecure", to be used to gracefully unsign
a zone. Previously you could just remove the 'dnssec-policy'
configuration from your zone statement, or remove it.
The built-in policy "none" (or not configured) now actually means
no DNSSEC maintenance for the corresponding zone. So if you
immediately reconfigure your zone from whatever policy to "none",
your zone will temporarily be seen as bogus by validating resolvers.
This means we can remove the functions 'dns_zone_use_kasp()' and
'dns_zone_secure_to_insecure()' again. We also no longer have to
check for the existence of key state files to figure out if a zone
is transitioning to insecure.
(cherry picked from commit 2710d9a11d)
While working on the serve-stale backports, I noticed the following
oddities:
1. In the serve-stale system test, in one case we keep track of the
time how long it took for dig to complete. In commit
aaed7f9d8c, the code removed the
exception to check for result == ISC_R_SUCCESS on stale found
answers, and adjusted the test accordingly. This failed to update
the time tracking accordingly. Move the t1/t2 time track variables
back around the two dig commands to ensure the lookups resolved
faster than the resolver-query-timeout.
2. We can remove the setting of NS_QUERYATTR_STALEOK and
DNS_RDATASETATTR_STALE_ADDED on the "else if (stale_timeout)"
code path, because they are added later when we know we have
actually found a stale answer on a stale timeout lookup.
3. We should clear the NS_QUERYATTR_STALEOK flag from the client
query attributes instead of DNS_RDATASETATTR_STALE_ADDED (that
flag is set on the rdataset attributes).
4. In 'bin/named/config.c' we should set the configuration options
in alpabetical order.
5. In the ARM, in the backports we have added "(stale)" between
"cached" and "RRset" to make more clear a stale RRset may be
returned in this scenario.
(cherry picked from commit 104b676235)
It follows a description of the steps that were leading to the deadlock:
1. `do_addzone` calls `isc_task_beginexclusive`.
2. `isc_task_beginexclusive` waits for (N_WORKERS - 1) halted tasks,
this blocks waiting for those (no. workers -1) workers to halt.
...
isc_task_beginexclusive(isc_task_t *task0) {
...
while (manager->halted + 1 < manager->workers) {
wake_all_queues(manager);
WAIT(&manager->halt_cond, &manager->halt_lock);
}
```
3. It is possible that in `task.c / dispatch()` a worker is running a
task event, if that event blocks it will not allow this worker to
halt.
4. `do_addzone` acquires `LOCK(&view->new_zone_lock);`,
5. `rmzone` event is called from some worker's `dispatch()`, `rmzone`
blocks waiting for the same lock.
6. `do_addzone` calls `isc_task_beginexclusive`.
7. Deadlock triggered, since:
- `rmzone` is wating for the lock.
- `isc_task_beginexclusive` is waiting for (no. workers - 1) to
be halted
- since `rmzone` event is blocked it won't allow the worker to halt.
To fix this, we updated do_addzone code to call isc_task_beginexclusive
before the lock is acquired, we postpone locking to the nearest required
place, same for isc_task_beginexclusive.
The same could happen with rndc modzone, so that was addressed as well.
Using "stale-answer-client-timeout" turns out to have unforeseen
negative consequences, and thus it is better to disable the feature
by default for the time being.
(cherry picked from commit e443279bbf)
Call 'dns_zone_rekey' after a 'rndc dnssec -checkds' or 'rndc dnssec
-rollover' command is received, because such a command may influence
the next key event. Updating the keys immediately avoids unnecessary
rollover delays.
The kasp system test no longer needs to call 'rndc loadkeys' after
a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command.
(cherry picked from commit 82f72ae249)
The RFC7828 specifies the keepalive interval to be 16-bit, specified in
units of 100 milliseconds and the configuration options tcp-*-timeouts
are following the suit. The units of 100 milliseconds are very
unintuitive and while we can't change the configuration and presentation
format, we should not follow this weird unit in the API.
This commit changes the isc_nm_(get|set)timeouts() functions to work
with milliseconds and convert the values to milliseconds before passing
them to the function, not just internally.
Add a new option 'purge-keys' to 'dnssec-policy' that will purge key
files for deleted keys. The option determines how long key files
should be retained prior to removing the corresponding files from
disk.
If set to 0, the option is disabled and 'named' will not remove key
files from disk.
(cherry picked from commit 313de3a7e2)
The lmdb.h header doesn't have to be included from the dns/lmdb.h
header as it can be separately included where used. This stops
exposing the inclusion of lmdb.h from the libdns headers.
This commit fix a leak which was happening every time an inline-signed
zone was added to the configuration, followed by a rndc reconfig.
During the reconfig process, the secure version of every inline-signed
zone was "moved" to a new view upon a reconfig and it "took the raw
version along", but only once the secure version was freed (at shutdown)
was prev_view for the raw version detached from, causing the old view to
be released as well.
This caused dangling references to be kept for the previous view, thus
keeping all resources used by that view in memory.
This commit allows to specify "disabled" or "off" in
stale-answer-client-timeout statement. The logic to support this
behavior will be added in the subsequent commits.
This commit also ensures an upper bound to stale-answer-client-timeout
which equals to one second less than 'resolver-query-timeout'.
(cherry picked from commit 0ad6f594f6)
The general logic behind the addition of this new feature works as
folows:
When a client query arrives, the basic path (query.c / ns_query_recurse)
was to create a fetch, waiting for completion in fetch_callback.
With the introduction of stale-answer-client-timeout, a new event of
type DNS_EVENT_TRYSTALE may invoke fetch_callback, whenever stale
answers are enabled and the fetch took longer than
stale-answer-client-timeout to complete.
When an event of type DNS_EVENT_TRYSTALE triggers fetch_callback, we
must ensure that the folowing happens:
1. Setup a new query context with the sole purpose of looking up for
stale RRset only data, for that matters a new flag was added
'DNS_DBFIND_STALEONLY' used in database lookups.
. If a stale RRset is found, mark the original client query as
answered (with a new query attribute named NS_QUERYATTR_ANSWERED),
so when the fetch completion event is received later, we avoid
answering the client twice.
. If a stale RRset is not found, cleanup and wait for the normal
fetch completion event.
2. In ns_query_done, we must change this part:
/*
* If we're recursing then just return; the query will
* resume when recursion ends.
*/
if (RECURSING(qctx->client)) {
return (qctx->result);
}
To this:
if (RECURSING(qctx->client) && !QUERY_STALEONLY(qctx->client)) {
return (qctx->result);
}
Otherwise we would not proceed to answer the client if it happened
that a stale answer was found when looking up for stale only data.
When an event of type DNS_EVENT_FETCHDONE triggers fetch_callback, we
proceed as before, resuming query, updating stats, etc, but a few
exceptions had to be added, most important of which are two:
1. Before answering the client (ns_client_send), check if the query
wasn't already answered before.
2. Before detaching a client, e.g.
isc_nmhandle_detach(&client->reqhandle), ensure that this is the
fetch completion event, and not the one triggered due to
stale-answer-client-timeout, so a correct call would be:
if (!QUERY_STALEONLY(client)) {
isc_nmhandle_detach(&client->reqhandle);
}
Other than these notes, comments were added in code in attempt to make
these updates easier to follow.
(cherry picked from commit 171a5b7542)
The -E option does not default to pkcs11 if --with-pkcs11 is set,
but always needs to be set explicitly.
(cherry picked from commit 0536375d4cf61c9b570a32e808dde78a7ef859bf)
Coverity Scan identified the following issue in bin/named/zoneconf.c:
*** CID 314969: Control flow issues (DEADCODE)
/bin/named/zoneconf.c: 2212 in named_zone_inlinesigning()
if (!inline_signing && !zone_is_dynamic &&
cfg_map_get(zoptions, "dnssec-policy", &signing) == ISC_R_SUCCESS &&
signing != NULL)
{
if (strcmp(cfg_obj_asstring(signing), "none") != 0) {
inline_signing = true;
>>> CID 314969: Control flow issues (DEADCODE)
>>> Execution cannot reach the expression ""no"" inside this statement: "dns_zone_log(zone, 1, "inli...".
dns_zone_log(
zone, ISC_LOG_DEBUG(1), "inline-signing: %s",
inline_signing
? "implicitly through dnssec-policy"
: "no");
} else {
...
}
}
This is because we first set 'inline_signing = true' and then check
its value in 'dns_zone_log'.
(cherry picked from commit 8df629d0b2)
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.
(cherry picked from commit f619708bbf)
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.
added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
(cherry picked from commit 16e14353b1)
Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig'
api so that the 'name' will determine what policy needs to be
configured.
When transitioning a zone from secure to insecure, there will be
cases when a zone with no DNSSEC policy (dnssec-policy none) should
be using KASP. When there are key state files available, this is an
indication that the zone once was DNSSEC signed but is reconfigured
to become insecure.
If we would not run the keymgr, named would abruptly remove the
DNSSEC records from the zone, making the zone bogus. Therefore,
change the code such that a zone will use kasp if there is a valid
dnssec-policy configured, or if there are state files available.
(cherry picked from commit cf420b2af0)
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested. It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.
NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.
The changes that are included in this commit are listed here
(extensively, but not exclusively):
* The netmgr_test unit test was split into individual tests (udp_test,
tcp_test, tcpdns_test and newly added tcp_quota_test)
* The udp_test and tcp_test has been extended to allow programatic
failures from the libuv API. Unfortunately, we can't use cmocka
mock() and will_return(), so we emulate the behaviour with #define and
including the netmgr/{udp,tcp}.c source file directly.
* The netievents that we put on the nm queue have variable number of
members, out of these the isc_nmsocket_t and isc_nmhandle_t always
needs to be attached before enqueueing the netievent_<foo> and
detached after we have called the isc_nm_async_<foo> to ensure that
the socket (handle) doesn't disappear between scheduling the event and
actually executing the event.
* Cancelling the in-flight TCP connection using libuv requires to call
uv_close() on the original uv_tcp_t handle which just breaks too many
assumptions we have in the netmgr code. Instead of using uv_timer for
TCP connection timeouts, we use platform specific socket option.
* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}
When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
waiting for socket to either end up with error (that path was fine) or
to be listening or connected using condition variable and mutex.
Several things could happen:
0. everything is ok
1. the waiting thread would miss the SIGNAL() - because the enqueued
event would be processed faster than we could start WAIT()ing.
In case the operation would end up with error, it would be ok, as
the error variable would be unchanged.
2. the waiting thread miss the sock->{connected,listening} = `true`
would be set to `false` in the tcp_{listen,connect}close_cb() as
the connection would be so short lived that the socket would be
closed before we could even start WAIT()ing
* The tcpdns has been converted to using libuv directly. Previously,
the tcpdns protocol used tcp protocol from netmgr, this proved to be
very complicated to understand, fix and make changes to. The new
tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
Closes: #2194, #2283, #2318, #2266, #2034, #1920
* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
pass accepted TCP sockets between netthreads, but instead (similar to
UDP) uses per netthread uv_loop listener. This greatly reduces the
complexity as the socket is always run in the associated nm and uv
loops, and we are also not touching the libuv internals.
There's an unfortunate side effect though, the new code requires
support for load-balanced sockets from the operating system for both
UDP and TCP (see #2137). If the operating system doesn't support the
load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
on FreeBSD 12+), the number of netthreads is limited to 1.
* The netmgr has now two debugging #ifdefs:
1. Already existing NETMGR_TRACE prints any dangling nmsockets and
nmhandles before triggering assertion failure. This options would
reduce performance when enabled, but in theory, it could be enabled
on low-performance systems.
2. New NETMGR_TRACE_VERBOSE option has been added that enables
extensive netmgr logging that allows the software engineer to
precisely track any attach/detach operations on the nmsockets and
nmhandles. This is not suitable for any kind of production
machine, only for debugging.
* The tlsdns netmgr protocol has been split from the tcpdns and it still
uses the old method of stacking the netmgr boxes on top of each other.
We will have to refactor the tlsdns netmgr protocol to use the same
approach - build the stack using only libuv and openssl.
* Limit but not assert the tcp buffer size in tcp_alloc_cb
Closes: #2061
(cherry picked from commit 634bdfb16d)
The DNS Flag Day 2020 reduced all the EDNS buffer sizes to 1232. In
this commit, we revert the default value for nocookie-udp-size back to
4096 because the option is too obscure and most people don't realize
that they also need to change this configuration option in addition to
max-udp-size.
(cherry picked from commit 79c196fc77)
Since the queries sent towards root and TLD servers are now included in
the count (as a result of the fix for CVE-2020-8616),
"max-recursion-queries" has a higher chance of being exceeded by
non-attack queries. Increase its default value from 75 to 100.
(cherry picked from commit ab0bf49203)
When generating a new salt, compare it with the previous NSEC3
paremeters to ensure the new parameters are different from the
previous ones.
This moves the salt generation call from 'bin/named/*.s' to
'lib/dns/zone.c'. When setting new NSEC3 parameters, you can set a new
function parameter 'resalt' to enforce a new salt to be generated. A
new salt will also be generated if 'salt' is set to NULL.
Logging salt with zone context can now be done with 'dnssec_log',
removing the need for 'dns_nsec3_log_salt'.
(cherry picked from commit 6b5d7357df)
Upon request from Mark, change the configuration of salt to salt
length.
Introduce a new function 'dns_zone_checknsec3aram' that can be used
upon reconfiguration to check if the existing NSEC3 parameters are
in sync with the configuration. If a salt is used that matches the
configured salt length, don't change the NSEC3 parameters.
(cherry picked from commit 6f97bb6b1f)
The 'rndc signing' command allows you to manipulate the private
records that are used to store signing state. Don't use these with
'dnssec-policy' as such manipulations may violate the policy (if you
want to change the NSEC3 parameters, change the policy and reconfig).
(cherry picked from commit eae9a6d297)
When doing 'rndc reconfig', named may complain about a zone not being
reusable because it has a raw version of the zone, and the new
configuration has not set 'inline-signing'. However, 'inline-signing'
may be implicitly true if a 'dnssec-policy' is used for the zone, and
the zone is not dynamic.
Improve the check in 'named_zone_reusable'. Create a new function for
checking 'inline-signing' configuration that matches existing code in
'bin/named/server.c'.
(cherry picked from commit ba8128ea00)