When we are recursing, RPZ processing is not allowed. But when we are
performing a lookup due to "stale-answer-client-timeout", we are still
recursing. This effectively means that RPZ processing is disabled on
such a lookup.
In this case, bail the "stale-answer-client-timeout" lookup and wait
for recursion to complete, as we we can't perform the RPZ rewrite
rules reliably.
(cherry picked from commit 3d3a6415f7)
The dboption DNS_DBFIND_STALEONLY caused confusion because it implies
we are looking for stale data **only** and ignore any active RRsets in
the cache. Rename it to DNS_DBFIND_STALETIMEOUT as it is more clear
the option is related to a lookup due to "stale-answer-client-timeout".
Rename other usages of "staleonly", instead use "lookup due to...".
Also rename related function and variable names.
(cherry picked from commit 839df94190)
When doing a staleonly lookup we don't want to fallback to recursion.
After all, there are obviously problems with recursion, otherwise we
wouldn't do a staleonly lookup.
When resuming from recursion however, we should restore the
RECURSIONOK flag, allowing future required lookups for this client
to recurse.
(cherry picked from commit 3f81d79ffb)
When implementing "stale-answer-client-timeout", we decided that
we should only return positive answers prematurely to clients. A
negative response is not useful, and in that case it is better to
wait for the recursion to complete.
To do so, we check the result and if it is not ISC_R_SUCCESS, we
decide that it is not good enough. However, there are more return
codes that could lead to a positive answer (e.g. CNAME chains).
This commit removes the exception and now uses the same logic that
other stale lookups use to determine if we found a useful stale
answer (stale_found == true).
This means we can simplify two test cases in the serve-stale system
test: nodata.example is no longer treated differently than data.example.
(cherry picked from commit aaed7f9d8c)
The NS_QUERYATTR_ANSWERED attribute is to prevent sending a response
twice. Without the attribute, this may happen if a staleonly lookup
found a useful answer and sends a response to the client, and later
recursion ends and also tries to send a response.
The attribute was also used to mask adding a duplicate RRset. This is
considered harmful. When we created a response to the client with a
stale only lookup (regardless if we actually have send the response),
we should clear the rdatasets that were added during that lookup.
Mark such rdatasets with the a new attribute,
DNS_RDATASETATTR_STALE_ADDED. Set a query attribute
NS_QUERYATTR_STALEOK if we may have added rdatasets during a stale
only lookup. Before creating a response on a normal lookup, check if
we can expect rdatasets to have been added during a staleonly lookup.
If so, clear the rdatasets from the message with the attribute
DNS_RDATASETATTR_STALE_ADDED set.
(cherry picked from commit 3d5429f61f)
With stale-answer-client-timeout, we may send a response to the client,
but we may want to hold on to the network manager handle, because
recursion is going on in the background, or we need to refresh a
stale RRset.
Simplify the setting of 'nodetach':
* During a staleonly lookup we should not detach the nmhandle, so just
set it prior to 'query_lookup()'.
* During a staleonly "stalefirst" lookup set the 'nodetach' to true
if we are going to refresh the RRset.
Now there is no longer the need to clear the 'nodetach' if we go
through the "dbfind_stale", "stale_refresh_window", or "stale_only"
paths.
(cherry picked from commit 48b0dc159b)
When doing a staleonly lookup, ignore active RRsets from cache. If we
don't, we may add a duplicate RRset to the message, and hit an
assertion failure in query.c because adding the duplicate RRset to the
ANSWER section failed.
This can happen on a race condition. When a client query is received,
the recursion is started. When 'stale-answer-client-timeout' triggers
around the same time the recursion completes, the following sequence
of events may happen:
1. Queue the "try stale" fetch_callback() event to the client task.
2. Add the RRsets from the authoritative response to the cache.
3. Queue the "fetch complete" fetch_callback() event to the client task.
4. Execute the "try stale" fetch_callback(), which retrieves the
just-inserted RRset from the database.
5. In "ns_query_done()" we are still recursing, but the "staleonly"
query attribute has already been cleared. In other words, the
query will resume when recursion ends (it already has ended but is
still on the task queue).
6. Execute the "fetch complete" fetch_callback(). It finds the answer
from recursion in the cache again and tries to add the duplicate to
the answer section.
This commit changes the logic for finding stale answers in the cache,
such that on "stale_only" lookups actually only stale RRsets are
considered. It refactors the code so that code paths for "dbfind_stale",
"stale_refresh_window", and "stale_only" are more clear.
First we call some generic code that applies in all three cases,
formatting the domain name for logging purposes, increment the
trystale stats, and check if we actually found stale data that we can
use.
The "dbfind_stale" lookup will return SERVFAIL if we didn't found a
usable answer, otherwise we will continue with the lookup
(query_gotanswer()). This is no different as before the introduction of
"stale-answer-client-timeout" and "stale-refresh-time".
The "stale_refresh_window" lookup is similar to the "dbfind_stale"
lookup: return SERVFAIL if we didn't found a usable answer, otherwise
continue with the lookup (query_gotanswer()).
Finally the "stale_only" lookup.
If the "stale_only" lookup was triggered because of an actual client
timeout (stale-answer-client-timeout > 0), and if database lookup
returned a stale usable RRset, trigger a response to the client.
Otherwise return and wait until the recursion completes (or the
resolver query times out).
If the "stale_only" lookup is a "stale-anwer-client-timeout 0" lookup,
preferring stale data over a lookup. In this case if there was no stale
data, or the data was not a positive answer, retry the lookup with the
stale options cleared, a.k.a. a normal lookup. Otherwise, continue
with the lookup (query_gotanswer()) and refresh the stale RRset. This
will trigger a response to the client, but will not detach the handle
because a fetch will be created to refresh the RRset.
(cherry picked from commit 92f7a67892)
The stale-answer-client-timeout feature introduced a dependancy on
when a client may be detached from the handle. The dboption
DNS_DBFIND_STALEONLY was reused to track this attribute. This overloads
the meaning of this database option, and actually introduced a bug
because the option was checked in other places. In particular, in
'ns_query_done()' there is a check for 'RECURSING(qctx->client) &&
(!QUERY_STALEONLY(&qctx->client->query) || ...' and the condition is
satisfied because recursion has not completed yet and
DNS_DBFIND_STALEONLY is already cleared by that time (in
query_lookup()), because we found a useful answer and we should detach
the client from the handle after sending the response.
Add a new boolean to the client structure to keep track of client
detach from handle is allowed or not. It is only disallowed if we are
in a staleonly lookup and we didn't found a useful answer.
(cherry picked from commit fee164243f)
Previously, every function had it's own #ifdef GSSAPI #else #endif block
that defined shim function in case GSSAPI was not being used. Now the
dummy shim functions have be split out into a single #else #endif block
at the end of the file.
This makes the gssapictx.c similar to 9.17.x code, making the backports
and reviews easier.
The Heimdal Kerberos library handles the OID sets in a different manner.
Unify the handling of the OID sets between MIT and Heimdal
implementations by dynamically creating the OID sets instead of using
static predefined set. This is how upstream recommends to handle the
OID sets.
The custom ISC SPNEGO mechanism implementation is no longer needed on
the basis that all major Kerberos 5/GSSAPI (mit-krb5, heimdal and
Windows) implementations support SPNEGO mechanism since 2006.
This commit removes the custom ISC SPNEGO implementation, and removes
the option from both autoconf and win32 Configure script. Unknown
options are being ignored, so this doesn't require any special handling.
Do not require config.h to use isc/util.h
See merge request isc-projects/bind9!4840
(cherry picked from commit 19b69e9a3b)
81eb3396 Do not require config.h to use isc/util.h
CDS/CDNSKEY DELETE records are only useful if they are signed,
otherwise the parent cannot verify these RRsets anyway. So once the DS
has been removed (and signaled to BIND), we can remove the DNSKEY and
RRSIG records, and at this point we can also remove the CDS/CDNSKEY
records.
(cherry picked from commit 6f31f62d69)
While not useful, having a CDS/CDNSKEY DELETE record in an unsigned
zone is not an error and "named-checkzone" should not complain.
(cherry picked from commit f211c7c2a1)
The 'keymgr_key_init()' function initializes key states if they have
not been set previously. It looks at the key timing metadata and
determines using the given times whether a state should be set to
RUMOURED or OMNIPRESENT.
However, the DNSKEY and ZRRSIG states were mixed up: When looking
at the Activate timing metadata we should set the ZRRSIG state, and
when looking at the Published timing metadata we should set the
DNSKEY state.
(cherry picked from commit 27e7d5f698)
The current isc_time_now uses CLOCK_REALTIME_COARSE which only updates
on a timer tick. This clock is generally fine for millisecond accuracy,
but on servers with 100hz clocks, this clock is nowhere near accurate
enough for microsecond accuracy.
This commit adds a new isc_time_now_hires function that uses
CLOCK_REALTIME, which gives the current time, though it is somewhat
expensive to call. When microsecond accuracy is required, it may be
required to use extra resources for higher accuracy.
(cherry picked from commit ebced74b19)
The tlsdns API is not yet used in the 9.16 branch and the tlsdns_test
fails too often. Temporarily disable running the test until it is
actually needed.
The RFC7828 specifies the keepalive interval to be 16-bit, specified in
units of 100 milliseconds and the configuration options tcp-*-timeouts
are following the suit. The units of 100 milliseconds are very
unintuitive and while we can't change the configuration and presentation
format, we should not follow this weird unit in the API.
This commit changes the isc_nm_(get|set)timeouts() functions to work
with milliseconds and convert the values to milliseconds before passing
them to the function, not just internally.
The udp, tcpdns and tlsdns contained lot of cut&paste code or code that
was very similar making the stack harder to maintain as any change to
one would have to be copied to the the other protocols.
In this commit, we merge the common parts into the common functions
under isc__nm_<foo> namespace and just keep the little differences based
on the socket type.
After the TCPDNS refactoring the initial and idle timers were broken and
only the tcp-initial-timeout was always applied on the whole TCP
connection.
This broke any TCP connection that took longer than tcp-initial-timeout,
most often this would affect large zone AXFRs.
This commit changes the timeout logic in this way:
* On TCP connection accept the tcp-initial-timeout is applied
and the timer is started
* When we are processing and/or sending any DNS message the timer is
stopped
* When we stop processing all DNS messages, the tcp-idle-timeout
is applied and the timer is started again
When thawing a zone, we don't know what changes have been made. If we
do DNSSEC maintenance on this zone, schedule a full sign.
(cherry picked from commit b90846f222)
Dynamic zones with dnssec-policy could not be thawed because KASP
zones were considered always dynamic. But a dynamic KASP zone should
also check whether updates are disabled.
(cherry picked from commit b518ed9f46)
When we query the resolver for a domain name that is in the same zone
for which is already one or more fetches outstanding, we could
potentially hit the fetch limits. If so, recursion fails immediately
for the incoming query and if serve-stale is enabled, we may try to
return a stale answer.
If the resolver is also is authoritative for the parent zone (for
example the root zone), first a delegation is found, but we first
check the cache for a better response.
Nothing is found in the cache, so we try to recurse to find the
answer to the query.
Because of fetch-limits 'dns_resolver_createfetch()' returns an error,
which 'ns_query_recurse()' propagates to the caller,
'query_delegation_recurse()'.
Because serve-stale is enabled, 'query_usestale()' is called,
setting 'qctx->db' to the cache db, but leaving 'qctx->version'
untouched. Now 'query_lookup()' is called to search for stale data
in the cache database with a non-NULL 'qctx->version'
(which is set to a zone db version), and thus we hit an assertion
in rbtdb.
This crash was introduced in 'v9_16' by commit
2afaff75ed.
(cherry picked from commit 87591de6f7)
- use a value less than 2^32 for DNS_ZONEFLG_FIXJOURNAL; a larger value
could cause problems in some build environments. the zone flag
DNS_ZONEFLG_DIFFONRELOAD, which was no longer in use, has now been
deleted and its value reused for _FIXJOURNAL.
(cherry picked from commit 990dd9dbff)
*** CID 329157: Null pointer dereferences (REVERSE_INULL)
/lib/dns/journal.c: 754 in journal_open()
748 j->header.index_size * sizeof(journal_rawpos_t));
749 }
750 if (j->index != NULL) {
751 isc_mem_put(j->mctx, j->index,
752 j->header.index_size * sizeof(journal_pos_t));
753 }
CID 329157: Null pointer dereferences (REVERSE_INULL)
Null-checking "j->filename" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
754 if (j->filename != NULL) {
755 isc_mem_free(j->mctx, j->filename);
756 }
757 if (j->fp != NULL) {
758 (void)isc_stdio_close(j->fp);
759 }
(cherry picked from commit 4054405909)
'named-journalprint -x' now prints the journal's index table and
the offset of each transaction in the journal, so that index consistency
can be confirmed.
(cherry picked from commit a4972324a6)
named-journalprint can now upgrade or downgrade a journal file
in place; the '-u' option upgrades and the '-d' option downgrades.
(cherry picked from commit fb2d0e2897)
when the 'max-ixfr-ratio' option was added, journal transaction
headers were revised to include a count of RR's in each transaction.
this made it impossible to read old journal files after an upgrade.
this branch restores the ability to read version 1 transaction
headers. when rolling forward, printing journal contents, if
the wrong transaction header format is found, we can switch.
when dns_journal_rollforward() detects a version 1 transaction
header, it returns DNS_R_RECOVERABLE. this triggers zone_postload()
to force a rewrite of the journal file in the new format, and
also to schedule a dump of the zone database with minimal delay.
journal repair is done by dns_journal_compact(), which rewrites
the entire journal, ignoring 'max-journal-size'. journal size is
corrected later.
newly created journal files now have "BIND LOG V9.2" in their headers
instead of "BIND LOG V9". files with the new version string cannot be
read using the old transaction header format. note that this means
newly created journal files will be rejected by older versions of named.
named-journalprint now takes a "-x" option, causing it to print
transaction header information before each delta, including its
format version.
(cherry picked from commit ee19966326)
The strlcat() call was wrong.
*** CID 316608: Memory - corruptions (OVERRUN)
/lib/dns/resolver.c: 5017 in fctx_create()
5011 * Make fctx->info point to a copy of a formatted string
5012 * "name/type".
5013 */
5014 dns_name_format(name, buf, sizeof(buf));
5015 dns_rdatatype_format(type, typebuf, sizeof(typebuf));
5016 p = strlcat(buf, "/", sizeof(buf));
>>> CID 316608: Memory - corruptions (OVERRUN)
>>> Calling "strlcat" with "buf + p" and "1036UL" is suspicious because "buf" points into a buffer of 1036 bytes and the function call may access "(char *)(buf + p) + 1035UL". [Note: The source code implementation of the function has been overridden by a builtin model.]
5017 strlcat(buf + p, typebuf, sizeof(buf));
5018 fctx->info = isc_mem_strdup(mctx, buf);
5019
5020 FCTXTRACE("create");
5021 dns_name_init(&fctx->name, NULL);
5022 dns_name_dup(name, mctx, &fctx->name);
(cherry picked from commit 59bf6e71e2)
Call the libisc isc__initialize() constructor and isc__shutdown()
destructor from DllMain instead of having duplicate code between
those and DllMain() code.
(cherry picked from commit a50f5d0cf5)
Under normal situation, the linker throws out all symbols from
compilation unit when no symbols are used in the final binary, which is
the case for lib/isc/lib.c. This commit adds empty function to lib.c
that's being called from different CU (mem.c in this case) and that
makes the linker to include all the symbols including the normally
unreferenced isc__initialize() and isc__shutdown() in the final binary.
The pthread_self(), thrd_current() or GetCurrentThreadId() could
actually be a pointer, so we should rather convert the value into
uintptr_t instead of unsigned long.
(cherry picked from commit a0181056a8)
Convert the isc_hp API to use the globally available isc_tid_v instead
of locally defined tid_v. This should solve most of the problems on
machines with many number of cores / CPUs.
(cherry picked from commit bea333f7c9)
The current isc_hp API uses internal tid_v variable that gets
incremented for each new thread using hazard pointers. This tid_v
variable is then used as a index to global shared table with hazard
pointers state. Since the tid_v is only incremented and never
decremented the table could overflow very quickly if we create set of
threads for short period of time, they finish the work and cease to
exist. Then we create identical set of threads and so on and so on.
This is not a problem for a normal `named` operation as the set of
threads is stable, but the problematic place are the unit tests where we
test network manager or other APIs (task, timer) that create threads.
This commits adds a thin wrapper around any function called from
isc_thread_create() that adds unique-but-reusable small digit thread id
that can be used as index to f.e. hazard pointer tables. The trampoline
wrapper ensures that the thread ids will be reused, so the highest
thread_id number doesn't grow indefinitely when threads are created and
destroyed and then created again. This fixes the hazard pointer table
overflow on machines with many cores. [GL #2396]
(cherry picked from commit cbbecfcc82)
Disable the internal memory allocator when AddressSanitizer is in use.
The basic blocks in the internal memory allocator prevents
AddressSanitizer from properly tracking the allocations and
deallocations, so we need to ensure it has been disabled for any build
that has AddressSanitizer enabled.
When AddressSanitizer is in use, disable the internal mempool
implementation and redirect the isc_mempool_get to isc_mem_get
(and similarly for isc_mempool_put). This is the method recommended
by the AddressSanitizer authors for tracking allocations and
deallocations instead of custom poison/unpoison code (see
https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning).
The BIND 9 libraries on Windows define DllMain() optional entry point
into a dynamic-link library (DLL). When the system starts or terminates
a process or thread, it calls the entry-point function for each loaded
DLL using the first thread of the process.
When the DLL is being loaded into the virtual address space of the
current process as a result of the process starting up, we make a call
to DisableThreadLibraryCalls() which should disable the
DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for the specified
dynamic-link library (DLL).
This seems not be the case because we never check the return value of
the DisableThreadLibraryCalls() call, and it could in fact fail. The
DisableThreadLibraryCalls() function fails if the DLL specified by
hModule has active static thread local storage, or if hModule is an
invalid module handle.
In this commit, we remove the safe-guard assertion put in place for the
DLL_THREAD_ATTACH and DLL_THREAD_DETACH events and we just ignore them.
BIND 9 doesn't create/destroy enough threads for it actually to make any
difference, and in fact we do use static thread local storage in the
code.
Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use
gcc/clang attributes on POSIX and DLLMain on Windows to initialize and
shutdown OpenSSL library.
This resolves the issue when isc_nm_create() / isc_nm_destroy() was
called multiple times and it would call OpenSSL library destructors from
isc_nm_destroy().
At the same time, since we now have introduced the ctor/dtor for libisc,
this commit moves the isc_mem API initialization (the list of the
contexts) and changes the isc_mem_checkdestroyed() to schedule the
checking of memory context on library unload instead of executing the
code immediately.
Disables the DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for
the specified dynamic-link library (DLL). This can reduce the size of
the working set for some applications.
Although harmless, the memmove() in tlsdns and tcpdns was guarded by a
current message length variable that was always bigger than 0 instead of
correct current buffer length remainder variable.
Since we now require both libcrypto and libssl to be initialized for
netmgr, we move all the OpenSSL initialization code except the engine
initialization to isc_tls API.
The isc_tls_initialize() and isc_tls_destroy() has been made idempotent,
so they could be called multiple time. However when isc_tls_destroy()
has been called, the isc_tls_initialize() could not be called again.
* Following the example set in 634bdfb16d, the tlsdns netmgr
module now uses libuv and SSL primitives directly, rather than
opening a TLS socket which opens a TCP socket, as the previous
model was difficult to debug. Closes#2335.
* Remove the netmgr tls layer (we will have to re-add it for DoH)
* Add isc_tls API to wrap the OpenSSL SSL_CTX object into libisc
library; move the OpenSSL initialization/deinitialization from dstapi
needed for OpenSSL 1.0.x to the isc_tls_{initialize,destroy}()
* Add couple of new shims needed for OpenSSL 1.0.x
* When LibreSSL is used, require at least version 2.7.0 that
has the best OpenSSL 1.1.x compatibility and auto init/deinit
* Enforce OpenSSL 1.1.x usage on Windows
(cherry picked from commit e493e04c0f)
When a staleonly lookup doesn't find a satisfying answer, it should
not try to respond to the client.
This is not true when the initial lookup is staleonly (that is when
'stale-answer-client-timeout' is set to 0), because no resolver fetch
has been created at this point. In this case continue with the lookup
normally.
(cherry picked from commit f8b7b597e9)
Fix a crash that can happen in the following scenario:
A client request is received. There is no data for it in the cache,
(not even stale data). A resolver fetch is created as part of
recursion.
Some time later, the fetch still hasn't completed, and
stale-answer-client-timeout is triggered. A staleonly lookup is
started. It will also find no data in the cache.
So 'query_lookup()' will call 'query_gotanswer()' with ISC_R_NOTFOUND,
so this will call 'query_notfound()' and this will start recursion.
We will eventually end up in 'ns_query_recurse()' and that requires
the client query fetch to be NULL:
REQUIRE(client->query.fetch == NULL);
If the previously started fetch is still running this assertion
fails.
The crash is easily prevented by not requiring recursion for
staleonly lookups.
Also remove a redundant setting of the staleonly flag at the end of
'query_lookup_staleonly()' before destroying the query context.
Add a system test to catch this case.
(cherry picked from commit 9e061faaae)
When applying dnssec-policy on a dynamic zone (e.g. that allows Dynamic
Updates), the NSEC3 parameters were put on the queue, but they were
not being processed (until a reload of the zone or reconfiguration).
Process the NSEC3PARAM queue on zone postload when handling a
dynamic zone.
(cherry picked from commit 4b176c850b)
The 'checknames' field wasn't initialized in dns_view_create(), but it
should otherwise AddressSanitizer identifies the following runtime error
in query_test.c.
runtime error: load of value 190, which is not a valid value for type '_Bool'
(cherry picked from commit 0c6fa16477)
On each keymgr run, we now also check if key files can be removed.
The 'purge-keys' interval determines how long keys should be retained
after they have become completely hidden.
Key files should not be removed if it has a state that is set to
something else then HIDDEN, if purge-keys is 0 (disabled), if
the key goal is set to OMNIPRESENT, or if the key is unused (a key is
unused if no timing metadata set, and no states are set or if set,
they are set to HIDDEN).
If the last changed timing metadata plus the purge-keys interval is
in the past, the key files may be removed.
Add a dst_key_t variable 'purge' to signal that the key file should
not be written to file again.
(cherry picked from commit 8c526cb67f)
Add a new option 'purge-keys' to 'dnssec-policy' that will purge key
files for deleted keys. The option determines how long key files
should be retained prior to removing the corresponding files from
disk.
If set to 0, the option is disabled and 'named' will not remove key
files from disk.
(cherry picked from commit 313de3a7e2)
dns_dt_open() is not currently called with mode dns_dtmode_unix.
*** CID 281489: Resource leaks (RESOURCE_LEAK)
/lib/dns/dnstap.c: 983 in dns_dt_open()
977
978 if (!dnstap_file(handle->reader)) {
979 CHECK(DNS_R_BADDNSTAP);
980 }
981 break;
982 case dns_dtmode_unix:
CID 281489: Resource leaks (RESOURCE_LEAK)
Variable "handle" going out of scope leaks the storage it points to.
983 return (ISC_R_NOTIMPLEMENTED);
984 default:
985 INSIST(0);
986 ISC_UNREACHABLE();
987 }
988
(cherry picked from commit 003dd8cc70)
The ISC SPNEGO implementation is based on mod_auth_kerb code. When
CVE-2006-5989 was disclosed, the relevant fix was not applied to the
BIND 9 codebase, making the latter vulnerable to the aforementioned flaw
when "tkey-gssapi-keytab" or "tkey-gssapi-credential" is set in
named.conf.
The original description of CVE-2006-5989 was:
Off-by-one error in the der_get_oid function in mod_auth_kerb 5.0
allows remote attackers to cause a denial of service (crash) via a
crafted Kerberos message that triggers a heap-based buffer overflow
in the component array.
Later research revealed that this flaw also theoretically enables remote
code execution, though achieving the latter in real-world conditions is
currently deemed very difficult.
This vulnerability was responsibly reported as ZDI-CAN-12302 ("ISC BIND
TKEY Query Heap-based Buffer Overflow Remote Code Execution
Vulnerability") by Trend Micro Zero Day Initiative.
In DNS Flag Day 2020, the development branch started setting the
IP_DONTFRAG option on the UDP sockets. It turned out, that this
code was incomplete leading to dropping the outgoing UDP packets.
Henceforth this commit rolls back this setting until we have a
proper fix that would send back empty response with TC flag set.
(cherry picked from commit 66eefac78c)
The USE_OPENSSL constraint in dh_test does not seems to be necessary
anymore, the test runs with PKCS#11 as well.
(cherry picked from commit c341e7f740)
Make sure lib/dns/tests/dnstap_test returns an exit code that indicates
a skipped test when dnstap is not enabled.
(cherry picked from commit c286341703)
BIND 9.17 changed exit code of skipped test to meet Automake
expectations in fa505bfb0e. BIND 9.16 was
not rewritten to Automake, but for consistency reasons, the same
SKIPPED_TEST_EXIT_CODE preprocessor macro is used (though the actual
exit code differs from the one in BIND 9.17).
(cherry picked from commit fa505bfb0e)
The only reason for including the gssapi.h from the dst/gssapi.h header
was to get the typedefs of gss_cred_id_t and gss_ctx_id_t. Instead of
using those types directly this commit introduces dns_gss_cred_id_t and
dns_gss_ctx_id_t types that are being used in the public API and
privately retyped to their counterparts when we actually call the gss
api.
This also conceals the gssapi headers, so users of the libdns library
doesn't have to add GSSAPI_CFLAGS to the Makefile when including libdns
dst API.
The lmdb.h header doesn't have to be included from the dns/lmdb.h
header as it can be separately included where used. This stops
exposing the inclusion of lmdb.h from the libdns headers.
This commit fix a leak which was happening every time an inline-signed
zone was added to the configuration, followed by a rndc reconfig.
During the reconfig process, the secure version of every inline-signed
zone was "moved" to a new view upon a reconfig and it "took the raw
version along", but only once the secure version was freed (at shutdown)
was prev_view for the raw version detached from, causing the old view to
be released as well.
This caused dangling references to be kept for the previous view, thus
keeping all resources used by that view in memory.
Coverity assumes that the memory holding any value read using byte
swapping is tainted. As we store the NSEC3PARAM records in wire
form and iterations is byte swapped the memory holding the record
is marked as tainted. nsec3->salt_length is marked as tainted
transitively. To remove the taint the value need to be range checked.
For a correctly formatted record region.length should match
nsec3->salt_length and provides a convenient value to check the field
against.
*** CID 316507: Insecure data handling (TAINTED_SCALAR)
/lib/dns/rdata/generic/nsec3param_51.c: 241 in tostruct_nsec3param()
235 region.length = rdata->length;
236 nsec3param->hash = uint8_consume_fromregion(®ion);
237 nsec3param->flags = uint8_consume_fromregion(®ion);
238 nsec3param->iterations = uint16_consume_fromregion(®ion);
239
240 nsec3param->salt_length = uint8_consume_fromregion(®ion);
>>> CID 316507: Insecure data handling (TAINTED_SCALAR)
>>> Passing tainted expression "nsec3param->salt_length" to "mem_maybedup", which uses it as an offset.
241 nsec3param->salt = mem_maybedup(mctx, region.base,
242 nsec3param->salt_length);
243 if (nsec3param->salt == NULL) {
244 return (ISC_R_NOMEMORY);
245 }
246 isc_region_consume(®ion, nsec3param->salt_length);
(cherry picked from commit c40133d840)
Coverity assumes that the memory holding any value read using byte
swapping is tainted. As we store the NSEC3 records in wire form
and iterations is byte swapped the memory holding the record is
marked as tainted. nsec3->salt_length and nsec3->next_length are
marked as tainted transitively. To remove the taint the values need
to be range checked. Valid values for these should never exceed
region.length so that is becomes a reasonable value to check against.
*** CID 316509: (TAINTED_SCALAR)
/lib/dns/rdata/generic/nsec3_50.c: 312 in tostruct_nsec3()
306 if (nsec3->salt == NULL) {
307 return (ISC_R_NOMEMORY);
308 }
309 isc_region_consume(®ion, nsec3->salt_length);
310
311 nsec3->next_length = uint8_consume_fromregion(®ion);
>>> CID 316509: (TAINTED_SCALAR)
>>> Passing tainted expression "nsec3->next_length" to "mem_maybedup", which uses it as an offset.
312 nsec3->next = mem_maybedup(mctx, region.base, nsec3->next_length);
313 if (nsec3->next == NULL) {
314 goto cleanup;
315 }
316 isc_region_consume(®ion, nsec3->next_length);
317
/lib/dns/rdata/generic/nsec3_50.c: 305 in tostruct_nsec3()
299 region.length = rdata->length;
300 nsec3->hash = uint8_consume_fromregion(®ion);
301 nsec3->flags = uint8_consume_fromregion(®ion);
302 nsec3->iterations = uint16_consume_fromregion(®ion);
303
304 nsec3->salt_length = uint8_consume_fromregion(®ion);
>>> CID 316509: (TAINTED_SCALAR)
>>> Passing tainted expression "nsec3->salt_length" to "mem_maybedup", which uses it as an offset.
305 nsec3->salt = mem_maybedup(mctx, region.base, nsec3->salt_length);
306 if (nsec3->salt == NULL) {
307 return (ISC_R_NOMEMORY);
308 }
309 isc_region_consume(®ion, nsec3->salt_length);
310
(cherry picked from commit fd8d1337a5)
The AddressSanitizer constraint in some libns unit tests does not seem
to be necessary anymore, these tests run fine under AddressSanitizer.
(cherry picked from commit 613be8706e)
The 'query_usestale()' function was only called when in
'query_gotanswer()' and an unexpected error occurred. This may have
been "quota reached", and thus we were in some cases returning
stale data on fetch-limits (and if serve-stale enabled of course).
But we can also hit fetch-limits when recursing because we are
following a referral (in 'query_notfound()' and
'query_delegation_recurse()'). Here we should also check for using
stale data in case an error occurred.
Specifically don't check for using stale data when refetching a
zero TTL RRset from cache.
Move the setting of DNS_DBFIND_STALESTART into the 'query_usestale()'
function to avoid code duplication.
(cherry picked from commit 8bcd7fe69e)
If we did not attempt a fetch due to fetch-limits, we should not start
the stale-refresh-time window.
Introduce a new flag DNS_DBFIND_STALESTART to differentiate between
a resolver failure and unexpected error. If we are resuming, this
indicates a resolver failure, then start the stale-refresh-time window,
otherwise don't start the stale-refresh-time window, but still fall
back to using stale data.
(This commit also wraps some docstrings to 80 characters width)
(cherry picked from commit aabdedeae3)
Before this change, BIND will only fallback to using stale data if
there was an actual attempt to resolve the query. Then on a timeout,
the stale data from cache becomes eligible.
This commit changes this so that on any unexpected error stale data
becomes eligble (you would still have to have 'stale-answer-enable'
enabled of course).
If there is no stale data, this may return in an error again, so don't
loop on stale data lookup attempts. If the DNS_DBFIND_STALEOK flag is
set, this means we already tried to lookup stale data, so if that is
the case, don't use stale again.
(cherry picked from commit c6fd02aed5)
*** CID 318094: Null pointer dereferences (REVERSE_INULL)
/lib/dns/rbtdb.c: 1389 in newversion()
1383 version->xfrsize = rbtdb->current_version->xfrsize;
1384 RWUNLOCK(&rbtdb->current_version->rwlock, isc_rwlocktype_read);
1385 rbtdb->next_serial++;
1386 rbtdb->future_version = version;
1387 RBTDB_UNLOCK(&rbtdb->lock, isc_rwlocktype_write);
1388
CID 318094: Null pointer dereferences (REVERSE_INULL)
Null-checking "version" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
1389 if (version == NULL) {
1390 return (result);
1391 }
1392
1393 *versionp = version;
1394
(cherry picked from commit 456d53d1fb)
Three small cleanups:
1. Remove an unused keystr/dst_key_format.
2. Initialize a dst_key_state_t state with NA.
3. Update false comment about local policy (local policy only adds
barrier on transitions to the RUMOURED state, not the UNRETENTIVE
state).
(cherry picked from commit 189d9a2d21)
There was a bug in function 'keymgr_ds_hidden_or_chained()'.
The funcion 'keymgr_ds_hidden_or_chained()' implements (3e) of rule2
as defined in the "Flexible and Robust Key Rollover" paper. The rules
says: All DS records need to be in the HIDDEN state, or if it is not
there must be a key with its DNSKEY and KRRSIG in OMNIPRESENT, and
its DS in the same state as the key in question. In human langauge,
if all keys have their DS in HIDDEN state you can do what you want,
but if a DS record is available to some validators, there must be
a chain of trust for it.
Note that the barriers on transitions first check if the current
state is valid, and then if the next state is valid too. But
here we falsely updated the 'dnskey_omnipresent' (now 'dnskey_chained')
with the next state. The next state applies to 'key' not to the state
to be checked. Updating the state here leads to (true) always, because
the key that will move its state will match the falsely updated
expected state. This could lead to the assumption that Key 2 would be
a valid chain of trust for Key 1, while clearly the presence of any
DS is uncertain.
The fix here is to check if the DNSKEY and KRRSIG are in OMNIPRESENT
state for the key that does not have its DS in the HIDDEN state, and
only if that is not the case, ensure that there is a key with the same
algorithm, that provides a valid chain of trust, that is, has its
DNSKEY, KRRSIG, and DS in OMNIPRESENT state.
The changes in 'keymgr_dnskey_hidden_or_chained()' are only cosmetical,
renaming 'rrsig_omnipresent' to 'rrsig_chained' and removing the
redundant initialization of the DST_KEY_DNSKEY expected state to NA.
(cherry picked from commit 291bcc3721)
The previous commit changed the function definition of
'keymgr_key_is_successor()', this commit updates the code where
this function is called.
In 'keymgr_key_exists_with_state()' the logic is also updated slightly
to become more readable. First handle the easy cases:
- If the key does not match the state, continue with the next key.
- If we found a key with matching state, and there is no need to
check the successor relationship, return (true).
- Otherwise check the successor relationship.
In 'keymgr_key_has_successor()' it is enough to check if a key has
a direct successor, so instead of calling 'keymgr_key_is_successor()',
we can just check 'keymgr_direct_dep()'.
In 'dns_keymgr_run()', we want to make sure that there is no
dependency on the keys before retiring excess keys, so replace
'keymgr_key_is_successor()' with 'keymgr_dep()'.
(cherry picked from commit 600915d1b2)
So far the key manager could only deal with two keys in a rollover,
because it used a simplified version of the successor relationship
equation from "Flexible and Robust Key Rollover" paper. The simplified
version assumes only two keys take part in the key rollover and it
for that it is enough to check the direct relationship between two
keys (is key x the direct predecessor of key z and is key z the direct
successor of key x?).
But when a third key (or more keys) comes into the equation, the key
manager would assume that one key (or more) is redundant and removed
it from the zone prematurely.
Fix by implementing Equation(2) correctly, where we check for
dependencies on keys:
z ->T x: Dep(x, T) = ∅ ∧
(x ∈ Dep(z, T) ∨
∃ y ∈ Dep(z, T)(y != z ∧ y ->T x ∧ DyKyRySy = DzKzRzSz))
This says: key z is a successor of key x if:
- key x depends on key z if z is a direct successor of x,
- or if there is another key y that depends on key z that has identical
key states as key z and key y is a successor of key x.
- Also, key x may not have any other keys depending on it.
This is still a simplified version of Equation(2) (but at least much
better), because the paper allows for a set of keys to depend on a
key. This is defined as the set Dep(x, T). Keys in the set Dep(x, T)
have a dependency on key x for record type T. The BIND implementation
can only have one key in the set Dep(x, T). The function
'keymgr_dep()' stores this key in 'uint32_t *dep' if there is a
dependency.
There are two scenarios where multiple keys can depend on a single key:
1. Rolling keys is faster than the time required to finish the
rollover procedure. This scenario is covered by the recursive
implementation, and checking for a chain of direct dependencies
will suffice.
2. Changing the policy, when a zone is requested to be signed with
a different key length for example. BIND 9 will not mark successor
relationships in this case, but tries to move towards the new
policy. Since there is no successor relationship, the rules are
even more strict, and the DNSSEC reconfiguration is actually slower
than required.
Note: this commit breaks the build, because the function definition
of 'keymgr_key_is_successor' changed. This will be fixed in the
following commit.
(cherry picked from commit cc38527b63)
Since keys now have their goals initialized in 'keymgr_key_init()',
remove this redundant piece of code in 'keymgr_key_run()'.
(cherry picked from commit 82632fa6d9)
The 'key_init()' function is used to initialize a state file for keys
that don't have one yet. This can happen if you are migrating from a
'auto-dnssec' or 'inline-signing' to a 'dnssec-policy' configuration.
It did not look at the "Inactive" and "Delete" timing metadata and so
old keys left behind in the key directory would also be considered as
a possible active key. This commit fixes this and now explicitly sets
the key goal to OMNIPRESENT for keys that have their "Active/Publish"
timing metadata in the past, but their "Inactive/Delete" timing
metadata in the future. If the "Inactive/Delete" timing metadata is
also in the past, the key goal is set to HIDDEN.
If the "Inactive/Delete" timing metadata is in the past, also the
key states are adjusted to either UNRETENTIVE or HIDDEN, depending on
how far in the past the metadata is set.
(cherry picked from commit 76cf72e65a)
This commit fix a race that could happen when two or more threads have
failed to refresh the same RRset, the threads could simultaneously
attempt to update the header->last_refresh_fail_ts field in
check_stale_header, a field used to implement stale-refresh-time.
By making this field atomic we avoid such race.
(cherry picked from commit c75575e350)
First of all, there was a flaw in the code related to the
'stale-refresh-time' option. If stale answers are enabled, and we
returned stale data, then it was assumed that it was because we were
in the 'stale-refresh-time' window. But now we could also have returned
stale data because of a 'stale-answer-client-timeout'. To fix this,
introduce a rdataset attribute DNS_RDATASETATTR_STALE_WINDOW to
indicate whether the stale cache entry was returned because the
'stale-refresh-time' window is active.
Second, remove the special case handling when the result is
DNS_R_NCACHENXRRSET. This can be done more generic in the code block
when dealing with stale data.
Putting all stale case handling in the code block when dealing with
stale data makes the code more easy to follow.
Update documentation to be more verbose and to match then new code
flow.
(cherry picked from commit fa0c9280d2)
Both functions employed the same code lines to allocate query context
buffers, which are used to store query results, so this shared portion
of code was extracted out to a new function, qctx_prepare_buffers.
Also, this commit uses qctx_init to initialize the query context whitin
query_refresh_rrset function.
(cherry picked from commit 966060c03b)
This commit makes the code in query_usestale easier to follow, it also
doesn't attach/detach to the database if stale answers are not enabled.
(cherry picked from commit f89ac07b28)
This commit allows stale RRset to be used (if available) for responding
a query, before an attempt to refresh an expired, or otherwise resolve
an unavailable RRset in cache is made.
For that to work, a value of zero must be specified for
stale-answer-client-timeout statement.
To better understand the logic implemented, there are three flags being
used during database lookup and other parts of code that must be
understood:
. DNS_DBFIND_STALEOK: This flag is set when BIND fails to refresh a
RRset due to timeout (resolver-query-timeout), its intent is to
try to look for stale data in cache as a fallback, but only if
stale answers are enabled in configuration.
This flag is also used to activate stale-refresh-time window, since it
is the only way the database knows that a resolution has failed.
. DNS_DBFIND_STALEENABLED: This flag is used as a hint to the database
that it may use stale data. It is always set during query lookup if
stale answers are enabled, but only effectively used during
stale-refresh-time window. Also during this window, the resolver will
not try to resolve the query, in other words no attempt to refresh the
data in cache is made when the stale-refresh-time window is active.
. DNS_DBFIND_STALEONLY: This new introduced flag is used when we want
stale data from the database, but not due to a failure in resolution,
it also doesn't require stale-refresh-time window timer to be active.
As long as there is a stale RRset available, it should be returned.
It is mainly used in two situations:
1. When stale-answer-client-timeout timer is triggered: in that case
we want to know if there is stale data available to answer the
client.
2. When stale-answer-client-timeout value is set to zero: in that
case, we also want to know if there is some stale RRset available
to promptly answer the client.
We must also discern between three situations that may happen when
resolving a query after the addition of stale-answer-client-timeout
statement, and how to handle them:
1. Are we running query_lookup() due to stale-answer-client-timeout
timer being triggered?
In this case, we look for stale data, making use of
DNS_DBFIND_STALEONLY flag. If a stale RRset is available then
respond the client with the data found, mark this query as
answered (query attribute NS_QUERYATTR_ANSWERED), so when the
fetch completes the client won't be answered twice.
We must also take care of not detaching from the client, as a
fetch will still be running in background, this is handled by the
following snippet:
if (!QUERY_STALEONLY(&client->query)) {
isc_nmhandle_detach(&client->reqhandle);
}
Which basically tests if DNS_DBFIND_STALEONLY flag is set, which
means we are here due to a stale-answer-client-timeout timer
expiration.
2. Are we running query_lookup() due to resolver-query-timeout being
triggered?
In this case, DNS_DBFIND_STALEOK flag will be set and an attempt
to look for stale data will be made.
As already explained, this flag is algo used to activate
stale-refresh-time window, as it means that we failed to refresh
a RRset due to timeout.
It is ok in this situation to detach from the client, as the
fetch is already completed.
3. Are we running query_lookup() during the first time, looking for
a RRset in cache and stale-answer-client-timeout value is set to
zero?
In this case, if stale answers are enabled (probably), we must do
an initial database lookup with DNS_DBFIND_STALEONLY flag set, to
indicate to the database that we want stale data.
If we find an active RRset, proceed as normal, answer the client
and the query is done.
If we find a stale RRset we respond to the client and mark the
query as answered, but don't detach from the client yet as an
attempt in refreshing the RRset will still be made by means of
the new introduced function 'query_resolve'.
If no active or stale RRset is available, begin resolution as
usual.
(cherry picked from commit e219422575)
This commit allows to specify "disabled" or "off" in
stale-answer-client-timeout statement. The logic to support this
behavior will be added in the subsequent commits.
This commit also ensures an upper bound to stale-answer-client-timeout
which equals to one second less than 'resolver-query-timeout'.
(cherry picked from commit 0ad6f594f6)
The general logic behind the addition of this new feature works as
folows:
When a client query arrives, the basic path (query.c / ns_query_recurse)
was to create a fetch, waiting for completion in fetch_callback.
With the introduction of stale-answer-client-timeout, a new event of
type DNS_EVENT_TRYSTALE may invoke fetch_callback, whenever stale
answers are enabled and the fetch took longer than
stale-answer-client-timeout to complete.
When an event of type DNS_EVENT_TRYSTALE triggers fetch_callback, we
must ensure that the folowing happens:
1. Setup a new query context with the sole purpose of looking up for
stale RRset only data, for that matters a new flag was added
'DNS_DBFIND_STALEONLY' used in database lookups.
. If a stale RRset is found, mark the original client query as
answered (with a new query attribute named NS_QUERYATTR_ANSWERED),
so when the fetch completion event is received later, we avoid
answering the client twice.
. If a stale RRset is not found, cleanup and wait for the normal
fetch completion event.
2. In ns_query_done, we must change this part:
/*
* If we're recursing then just return; the query will
* resume when recursion ends.
*/
if (RECURSING(qctx->client)) {
return (qctx->result);
}
To this:
if (RECURSING(qctx->client) && !QUERY_STALEONLY(qctx->client)) {
return (qctx->result);
}
Otherwise we would not proceed to answer the client if it happened
that a stale answer was found when looking up for stale only data.
When an event of type DNS_EVENT_FETCHDONE triggers fetch_callback, we
proceed as before, resuming query, updating stats, etc, but a few
exceptions had to be added, most important of which are two:
1. Before answering the client (ns_client_send), check if the query
wasn't already answered before.
2. Before detaching a client, e.g.
isc_nmhandle_detach(&client->reqhandle), ensure that this is the
fetch completion event, and not the one triggered due to
stale-answer-client-timeout, so a correct call would be:
if (!QUERY_STALEONLY(client)) {
isc_nmhandle_detach(&client->reqhandle);
}
Other than these notes, comments were added in code in attempt to make
these updates easier to follow.
(cherry picked from commit 171a5b7542)
Since it takes a couple lines of code to check whether stale answers
are enabled for a given view, code was extracted out to a proper
function.
(cherry picked from commit 74840ec50b)
This is a minor performance improvement, we store the result of the
first call to strlcat to use as an offset in the next call when
constructing fctx->info string.
(cherry picked from commit 49c40827f6)
When 'opensslecdsa_parse()' encounters a label tag in the private key
file, load the private key with 'opensslecdsa_fromlabel()'. Otherwise
load it from the private structure.
This was attempted before with 'load_privkey()' and 'uses_engine()',
but had the same flaw as 'opensslecdsa_fromlabel()' had previously,
that is getting the private and public key separately, juggling with
pointers between EC_KEY and EVP_PKEY, did not create a valid
cryptographic key that could be used for signing.
(cherry picked from commit 57ac70ad46)
The 'opensslecdsa_fromlabel()' function does not need to get the
OpenSSL engine twice to load the private and public key. Also no need
to call 'dst_key_to_eckey()' as the EC_KEY can be derived from the
loaded EVP_PKEY's.
Add some extra checks to ensure the key has the same base id and curve
(group nid) as the dst key.
Since we already have the EVP_PKEY, no need to call 'finalize_eckey()',
instead just set the right values in the key structure.
(cherry picked from commit 393052d6ff)
The 'ecdsa_check()' function tries to correctly set the public key
on the eckey, but this should be skipped if the public key is
retrieved via the private key.
(cherry picked from commit 06b9724152)
The 'opensslecdsa_tofile()' function tags the label as an RSA label,
that is a copy paste error and should be of course an ECDSA label.
(cherry picked from commit 46afeca8bf)
The functions 'load_pubkey_from_engine()' and
'load_privkey_from_engine()' did not correctly store the pointers.
Update both functions to add 'EC_KEY_set_public_key()' and
'EC_KEY_set_private_key()' respectively, so that the pointers to
the public and private keys survive the "load from engine" functions.
(cherry picked from commit 01239691a1)
The 'function load_pubkey_from_engine()' made a call to the libssl
function 'ENGINE_load_private_key'. This is a copy paste error and
should be 'ENGINE_load_public_key'.
(cherry picked from commit 370285a62d)
- change name of 'bytes' to 'xfrsize' in dns_db_getsize() parameter list
and related variables; this is a more accurate representation of what
the function is doing
- change the size calculations in dns_db_getsize() to more accurately
represent the space needed for a *XFR message or journal file to contain
the data in the database. previously we returned the sizes of all
rdataslabs, including header overhead and offset tables, which
resulted in the database size being reported as much larger than the
equivalent *XFR or journal.
- map files caused a particular problem here: the fullname can't be
determined from the node while a file is being deserialized, because
the uppernode pointers aren't set yet. so we store "full name length"
in the dns_rbtnode structure while serializing, and clear it after
deserialization is complete.
the call initailizing a journal iterator can now optionally return
to the caller the size in bytes of an IXFR message (not including
DNS header overhead, signatures etc) containing the differences from
the beginning to the ending serial number.
this is calculated by scanning the journal transaction headers to
calculate the transfer size. since journal file records contain a length
field that is not included in IXFR messages, we subtract out the length
of those fields from the overall transaction length.
this necessitated adding an "RR count" field to the journal transaction
header, so we know how many length fields to subract. NOTE: this will
make existing journal files stop working!
The BIND 9 libraries are considered to be internal only and hence the
API and ABI changes a lot. Keeping track of the API/ABI changes takes
time and it's a complicated matter as the safest way to make everything
stable would be to bump any library in the dependency chain as in theory
if libns links with libdns, and a binary links with both, and we bump
the libdns SOVERSION, but not the libns SOVERSION, the old libns might
be loaded by binary pulling old libdns together with new libdns loaded
by the binary. The situation gets even more complicated with loading
the plugins that have been compiled with few versions old BIND 9
libraries and then dynamically loaded into the named.
We are picking the safest option possible and usable for internal
libraries - instead of using -version-info that has only a weak link to
BIND 9 version number, we are using -release libtool option that will
embed the corresponding BIND 9 version number into the library name.
That means that instead of libisc.so.1608 (as an example) the library
will now be named libisc-9.16.10.so.
(cherry picked from commit c605d75ea5)
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.
(cherry picked from commit f619708bbf)
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.
added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
(cherry picked from commit 16e14353b1)
It is possible to have two threads destroying an rbtdb at the same
time when detachnode() executes and removes the last reference to
a node between exiting being set to true for the node and testing
if the references are zero in maybe_free_rbtdb(). Move NODE_UNLOCK()
to after checking if references is zero to prevent detachnode()
changing the reference count too early.
(cherry picked from commit 859d2fdad6)
While fixing #2359, 'report()' was changed so that it would print the
newline.
Newlines were missing from the output of 'dnssec-signzone'
and 'dnssec-verify' because change
664b8f04f5 moved the printing from
newlines to the library.
This had to be reverted because this also would print redundant
newlines in logfiles.
While doing the revert, some newlines in 'lib/dns/zoneverify.c'
were left in place, now making 'dnssec-signzone' and 'dnssec-verify'
print too many newlines.
This commit removes those newlines, so that the output looks nice
again.
(cherry picked from commit 18c62a077e)
The keymgr prevented zones from going to insecure mode. If we
have a policy with an empty key list this is a signal that the zone
wants to go back to insecure mode. In this case allow one extra state
transition to be valid when checking for DNSSEC safety.
(cherry picked from commit 9134100069)
Check if zone is transitioning from secure to insecure. If so,
delete the CDS/CDNSKEY records, otherwise make sure they are not
part of the RRset.
(cherry picked from commit 68d715a229)
Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig'
api so that the 'name' will determine what policy needs to be
configured.
When transitioning a zone from secure to insecure, there will be
cases when a zone with no DNSSEC policy (dnssec-policy none) should
be using KASP. When there are key state files available, this is an
indication that the zone once was DNSSEC signed but is reconfigured
to become insecure.
If we would not run the keymgr, named would abruptly remove the
DNSSEC records from the zone, making the zone bogus. Therefore,
change the code such that a zone will use kasp if there is a valid
dnssec-policy configured, or if there are state files available.
(cherry picked from commit cf420b2af0)
For purposes of zones transitioning back to insecure mode, it is
practical to see if related keys have a state file associated.
(cherry picked from commit 8f2c5e45da)
When using the `unixtime` or `date` method to update the SOA serial,
`named` and `dnssec-signzone` would silently fallback to `increment`
method to prevent the new serial number to be smaller than the old
serial number (using the serial number arithmetics). Add a warning
message when such fallback happens.
(cherry picked from commit ef685bab5c)
A typo in macro definition caused the load-balanced sockets to be
disabled even on platforms with existing support for load-balanced
sockets.
(cherry picked from commit 5caf33feda)
On Windows, we were limiting the number of listening children to just 1,
but we were then iterating on mgr->nworkers. That lead to scheduling
more async_*listen() than actually allocated and out-of-bound read-write
operation on the heap.
(cherry picked from commit 87c5867202)
When we were in nmthread, the isc__nm_async_<proto>connect() function
executes in the same thread as the isc__nm_<proto>connect() and on a
failure, it would block indefinitely because the failure branch was
setting sock->active to false before the condition around the wait had a
chance to skip the WAIT().
This also fixes the zero system test being stuck on FreeBSD 11, so we
re-enable the test in the commit.
On FreeBSD, the option to configure connection timeout is called
TCP_KEEPINIT, use it to configure the connection timeout there.
This also fixes the dangling socket problems in the unit test, so
re-enable them.
On platforms without load-balancing socket all the queries would be
handle by a single thread. Currently, the support for load-balanced
sockets is present in Linux with SO_REUSEPORT and FreeBSD 12 with
SO_REUSEPORT_LB.
This commit adds workaround for such platforms that:
1. setups single shared listening socket for all listening nmthreads for
UDP, TCP and TCPDNS netmgr transports
2. Calls uv_udp_bind/uv_tcp_bind on the underlying socket just once and
for rest of the nmthreads only copy the internal libuv flags (should
be just UV_HANDLE_BOUND and optionally UV_HANDLE_IPV6).
3. start reading on UDP socket or listening on TCP socket
The load distribution among the nmthreads is uneven, but it's still
better than utilizing just one thread for processing all the incoming
queries
On FreeBSD, the stack is destroyed more aggressively than on Linux and
that revealed a bug where we were allocating the 16-bit len for the
TCPDNS message on the stack and the buffer got garbled before the
uv_write() sendback was executed. Now, the len is part of the uvreq, so
we can safely pass it to the uv_write() as the req gets destroyed after
the sendcb is executed.
(cherry picked from commit 94afea9325)
On Windows, WSAStartup() needs to be called to initialize Winsock before
any sockets are created or else socket() calls will return error code
10093 (WSANOTINITIALISED). Since BIND's Network Manager is intended to
work as a reusable networking library, it should take care of calling
WSAStartup() - and its cleanup counterpart, WSACleanup() - itself rather
than relying on external code to do it. Add the necessary WSAStartup()
and WSACleanup() calls to isc_nm_start() and isc_nm_destroy(),
respectively.
(cherry picked from commit 88f96faba8)
Make sure the error code is included in the message logged for
unexpected socket creation errors in order to facilitate troubleshooting
on Windows.
(cherry picked from commit dc2e1dea86)
Due to the added attach/detach tracing in the netmgr-v2 code, the
libns tests needs to be adjusted as the real function names have
changed from isc_nmhandle_* to isc__nmhandle_*.
The isc/util.h header redefine the DbC checks (REQUIRE, INSIST, ...) to
be cmocka "fake" assertions. However that means that cmocka.h needs to
be included after UNIT_TESTING is defined but before isc/util.h is
included. Because isc/util.h is included in most of the project headers
this means that the sequence MUST be:
#define UNIT_TESTING
#include <cmocka.h>
#include <isc/_anything_.h>
See !2204 for other header requirements for including cmocka.h.
(cherry picked from commit 0ba697fe8c)
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
After turning the users callbacks to be asynchronous, there was a
visible performance drop. This commit prevents the unnecessary
allocations while keeping the code paths same for both asynchronous and
synchronous calls.
The same change was done to the isc__nm_udp_{read,send} as those two
functions are in the hot path.
(cherry picked from commit d6d2fbe0e9)
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested. It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.
NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.
The changes that are included in this commit are listed here
(extensively, but not exclusively):
* The netmgr_test unit test was split into individual tests (udp_test,
tcp_test, tcpdns_test and newly added tcp_quota_test)
* The udp_test and tcp_test has been extended to allow programatic
failures from the libuv API. Unfortunately, we can't use cmocka
mock() and will_return(), so we emulate the behaviour with #define and
including the netmgr/{udp,tcp}.c source file directly.
* The netievents that we put on the nm queue have variable number of
members, out of these the isc_nmsocket_t and isc_nmhandle_t always
needs to be attached before enqueueing the netievent_<foo> and
detached after we have called the isc_nm_async_<foo> to ensure that
the socket (handle) doesn't disappear between scheduling the event and
actually executing the event.
* Cancelling the in-flight TCP connection using libuv requires to call
uv_close() on the original uv_tcp_t handle which just breaks too many
assumptions we have in the netmgr code. Instead of using uv_timer for
TCP connection timeouts, we use platform specific socket option.
* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}
When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
waiting for socket to either end up with error (that path was fine) or
to be listening or connected using condition variable and mutex.
Several things could happen:
0. everything is ok
1. the waiting thread would miss the SIGNAL() - because the enqueued
event would be processed faster than we could start WAIT()ing.
In case the operation would end up with error, it would be ok, as
the error variable would be unchanged.
2. the waiting thread miss the sock->{connected,listening} = `true`
would be set to `false` in the tcp_{listen,connect}close_cb() as
the connection would be so short lived that the socket would be
closed before we could even start WAIT()ing
* The tcpdns has been converted to using libuv directly. Previously,
the tcpdns protocol used tcp protocol from netmgr, this proved to be
very complicated to understand, fix and make changes to. The new
tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
Closes: #2194, #2283, #2318, #2266, #2034, #1920
* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
pass accepted TCP sockets between netthreads, but instead (similar to
UDP) uses per netthread uv_loop listener. This greatly reduces the
complexity as the socket is always run in the associated nm and uv
loops, and we are also not touching the libuv internals.
There's an unfortunate side effect though, the new code requires
support for load-balanced sockets from the operating system for both
UDP and TCP (see #2137). If the operating system doesn't support the
load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
on FreeBSD 12+), the number of netthreads is limited to 1.
* The netmgr has now two debugging #ifdefs:
1. Already existing NETMGR_TRACE prints any dangling nmsockets and
nmhandles before triggering assertion failure. This options would
reduce performance when enabled, but in theory, it could be enabled
on low-performance systems.
2. New NETMGR_TRACE_VERBOSE option has been added that enables
extensive netmgr logging that allows the software engineer to
precisely track any attach/detach operations on the nmsockets and
nmhandles. This is not suitable for any kind of production
machine, only for debugging.
* The tlsdns netmgr protocol has been split from the tcpdns and it still
uses the old method of stacking the netmgr boxes on top of each other.
We will have to refactor the tlsdns netmgr protocol to use the same
approach - build the stack using only libuv and openssl.
* Limit but not assert the tcp buffer size in tcp_alloc_cb
Closes: #2061
(cherry picked from commit 634bdfb16d)
When calling the high level netmgr functions, the callback would be
sometimes called synchronously if we catch the failure directly, or
asynchronously if it happens later. The synchronous call to the
callback could create deadlocks as the caller would not expect the
failed callback to be executed directly.
(cherry picked from commit a49d88568f)
This commit adds couple of additional safeguards against running
sends/reads on inactive sockets. The changes was modeled after the
changes we made to netmgr/tcpdns.c
(cherry picked from commit fa424225af)
Parse the configuration of tls objects into SSL_CTX* objects. Listen on
DoT if 'tls' option is setup in listen-on directive. Use DoT/DoH ports
for DoT/DoH.
(cherry picked from commit 38b78f59a0)
- peer address was not being reported correctly by "dig +tls"
- the protocol used is now reported in the dig output: UDP, TCP, or TLS.
(cherry picked from commit 8886569e9d)
Add server-side TLS support to netmgr - that includes moving some of the
isc_nm_ functions from tcp.c to a wrapper in netmgr.c calling a proper
tcp or tls function, and a new isc_nm_listentls() function.
Add DoT support to tcpdns - isc_nm_listentlsdns().
(cherry picked from commit b2ee0e9dc3)
there were two failures during observed in testing, both occurring
when 'rndc halt' was run rather than 'rndc stop' - the latter dumps
zone contents to disk and presumably introduced enough delay to
prevent the races:
- a failure when the zone was shut down and called dns_xfrin_detach()
before the xfrin had finished connecting; the connect timeout
terminated without detaching its handle
- a failure when the tcpdns socket timer fired after the outerhandle
had already been cleared.
this commit incidentally addresses a failure observed in mutexatomic
due to a variable having been initialized incorrectly.
socket() call can return an error - e.g. EMFILE, so we need to handle
this nicely and not crash.
Additionally wrap the socket() call inside a platform independent helper
function as the Socket data type on Windows is unsigned integer:
> This means, for example, that checking for errors when the socket and
> accept functions return should not be done by comparing the return
> value with –1, or seeing if the value is negative (both common and
> legal approaches in UNIX). Instead, an application should use the
> manifest constant INVALID_SOCKET as defined in the Winsock2.h header
> file.
(cherry picked from commit 8af7f81d6c)
Because we use result earlier for setting the loadbalancing on the
socket, we could be left with a ISC_R_NOTIMPLEMENTED value stored in the
variable and when the UDP connection would succeed, we would
errorneously return this value instead of ISC_R_SUCCESS.
(cherry picked from commit 050258bda4)
use isc_nmhandle_settimeout() to set read/recv timeouts, and get rid
of connect_timeout() and related functions in dighost.c.
(cherry picked from commit ea2b04c361)
this function sets the read timeout for the socket associated
with a netmgr handle and, if the timer is running, resets it.
for TCPDNS sockets it also sets the read timeout and resets the
timer on the outer TCP socket.
(cherry picked from commit 4be63c5b00)