Commit graph

12623 commits

Author SHA1 Message Date
Ondřej Surý
f9711481ad Expire the 0 TTL RRSet quickly rather using them for serve-stale
When a received RRSet has TTL 0, they would be preserved for
serve-stale (default `max-stale-cache` is 12 hours) rather than expiring
them quickly from the cache database.

This commit makes sure the RRSet didn't have TTL 0 before marking the
entry in the database as "stale".

(cherry picked from commit 6ffa2ddae0)
2020-08-05 09:09:16 +02:00
Ondřej Surý
b48e9ab201 Add stale-cache-enable option and disable serve-stable by default
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).

This allows DNS operators to turn the serve-stale answers in an event of
large authoritative DNS outage.  The caching of the stale answers needs
to be enabled before the outage happens or the feature would be
otherwise useless.

The negative consequence of the default setting is the inevitable
cache-bloat that happens for every and each DNS operator running named.

In this MR, a new configuration option `stale-cache-enable` is
introduced that allows the operators to selectively enable or disable
the serve-stale feature of BIND 9 based on their decision.

The newly introduced option has been disabled by default,
e.g. serve-stale is disabled in the default configuration and has to be
enabled if required.

(cherry picked from commit ce53db34d6)
2020-08-05 09:09:16 +02:00
Mark Andrews
2dc26ebdb6 Map DNS_R_BADTSIG to FORMERR
Now that the log message has been printed set the result code to
DNS_R_FORMERR.  We don't do this via dns_result_torcode() as we
don't want upstream errors to produce FORMERR if that processing
end with DNS_R_BADTSIG.

(cherry picked from commit 20488d6ad3)
2020-08-04 23:04:34 +10:00
Diego Fronza
fca1000ee9 Fix ns_statscounter_recursclients underflow
The basic scenario for the problem was that in the process of
resolving a query, if any rrset was eligible for prefetching, then it
would trigger a call to query_prefetch(), this call would run in
parallel to the normal query processing.

The problem arises due to the fact that both query_prefetch(), and,
in the original thread, a call to ns_query_recurse(), try to attach
to the recursionquota, but recursing client stats counter is only
incremented if ns_query_recurse() attachs to it first.

Conversely, if fetch_callback() is called before prefetch_done(),
it would not only detach from recursionquota, but also decrement
the stats counter, if query_prefetch() attached to te quota first
that would result in a decrement not matched by an increment, as
expected.

To solve this issue an atomic bool was added, it is set once in
ns_query_recurse(), allowing fetch_callback() to check for it
and decrement stats accordingly.

For a more compreensive explanation check the thread comment below:
https://gitlab.isc.org/isc-projects/bind9/-/issues/1719#note_145857
2020-08-03 19:18:04 -03:00
Witold Kręcicki
a12076cc52 netmgr: retry binding with IP_FREEBIND when EADDRNOTAVAIL is returned.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.

(cherry picked from commit a0f7d28967)
2020-07-31 13:33:06 +02:00
Mark Andrews
14fe6e77a7 Always check the return from isc_refcount_decrement.
Created isc_refcount_decrement_expect macro to test conditionally
the return value to ensure it is in expected range.  Converted
unchecked isc_refcount_decrement to use isc_refcount_decrement_expect.
Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect.

(cherry picked from commit bde5c7632a)
2020-07-31 12:54:47 +10:00
Mark Andrews
1981fb1327 Refactor the code that counts the last log version to keep
When silencing the Coverity warning in remove_old_tsversions(), the code
was refactored to reduce the indentation levels and break down the long
code into individual functions.  This improve fix for [GL #1989].

(cherry picked from commit aca18b8b5b)
2020-07-31 10:01:36 +10:00
Ondřej Surý
0fff3008ac Change the dns_name hashing to use 32-bit values
Change the dns_hash_name() and dns_hash_fullname() functions to use
isc_hash32() as the maximum hashtable size in rbt is 0..UINT32_MAX
large.

(cherry picked from commit a9182c89a6)
2020-07-30 11:57:24 +02:00
Ondřej Surý
ebb2b055cc Add isc_hash32() and rename isc_hash_function() to isc_hash64()
As the names suggest the original isc_hash64 function returns 64-bit
long hash values and the isc_hash32() returns 32-bit values.

(cherry picked from commit f59fd49fd8)
2020-07-30 11:57:24 +02:00
Ondřej Surý
1e5df7f3bf Add HalfSipHash 2-4 reference implementation
The HalfSipHash implementation has 32-bit keys and returns 32-bit
value.

(cherry picked from commit 344d66aaff)
2020-07-30 11:57:24 +02:00
Ondřej Surý
d89eb403f3 Remove OpenSSL based SipHash 2-4 implementation
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, so until we fix the code
to reuse the OpenSSL contexts and keys we'll use our own implementation of
siphash instead of trying to integrate with OpenSSL.

(cherry picked from commit 21d751dfc7)
2020-07-30 11:57:24 +02:00
Ondřej Surý
aa72c31422 Fix the rbt hashtable and grow it when setting max-cache-size
There were several problems with rbt hashtable implementation:

1. Our internal hashing function returns uint64_t value, but it was
   silently truncated to unsigned int in dns_name_hash() and
   dns_name_fullhash() functions.  As the SipHash 2-4 higher bits are
   more random, we need to use the upper half of the return value.

2. The hashtable implementation in rbt.c was using modulo to pick the
   slot number for the hash table.  This has several problems because
   modulo is: a) slow, b) oblivious to patterns in the input data.  This
   could lead to very uneven distribution of the hashed data in the
   hashtable.  Combined with the single-linked lists we use, it could
   really hog-down the lookup and removal of the nodes from the rbt
   tree[a].  The Fibonacci Hashing is much better fit for the hashtable
   function here.  For longer description, read "Fibonacci Hashing: The
   Optimization that the World Forgot"[b] or just look at the Linux
   kernel.  Also this will make Diego very happy :).

3. The hashtable would rehash every time the number of nodes in the rbt
   tree would exceed 3 * (hashtable size).  The overcommit will make the
   uneven distribution in the hashtable even worse, but the main problem
   lies in the rehashing - every time the database grows beyond the
   limit, each subsequent rehashing will be much slower.  The mitigation
   here is letting the rbt know how big the cache can grown and
   pre-allocate the hashtable to be big enough to actually never need to
   rehash.  This will consume more memory at the start, but since the
   size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
   will only consume maximum of 32GB of memory for hashtable in the
   worst case (and max-cache-size would need to be set to more than
   4TB).  Calling the dns_db_adjusthashsize() will also cap the maximum
   size of the hashtable to the pre-computed number of bits, so it won't
   try to consume more gigabytes of memory than available for the
   database.

   FIXME: What is the average size of the rbt node that gets hashed?  I
   chose the pagesize (4k) as initial value to precompute the size of
   the hashtable, but the value is based on feeling and not any real
   data.

For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.

Notes:
a. A doubly linked list should be used here to speedup the removal of
   the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/

(cherry picked from commit e24bc324b4)
2020-07-30 11:57:24 +02:00
Michał Kępień
b6c33087b0 Fix idle timeout for connected TCP sockets
When named acting as a resolver connects to an authoritative server over
TCP, it sets the idle timeout for that connection to 20 seconds.  This
fixed timeout was picked back when the default processing timeout for
each client query was hardcoded to 30 seconds.  Commit
000a8970f8 made this processing timeout
configurable through "resolver-query-timeout" and decreased its default
value to 10 seconds, but the idle TCP timeout was not adjusted to
reflect that change.  As a result, with the current defaults in effect,
a single hung TCP connection will consistently cause the resolution
process for a given query to time out.

Set the idle timeout for connected TCP sockets to half of the client
query processing timeout configured for a resolver.  This allows named
to handle hung TCP connections more robustly and prevents the timeout
mismatch issue from resurfacing in the future if the default is ever
changed again.

(cherry picked from commit 953d704bd2)
2020-07-30 11:16:09 +02:00
Diego Fronza
a8ce7b461c Fix rpz wildcard name matching
Whenever an exact match is found by dns_rbt_findnode(),
the highest level node in the chain will not be put into
chain->levels[] array, but instead the chain->end
pointer will be adjusted to point to that node.

Suppose we have the following entries in a rpz zone:
example.com     CNAME rpz-passthru.
*.example.com   CNAME rpz-passthru.

A query for www.example.com would result in the
following chain object returned by dns_rbt_findnode():

chain->level_count = 2
chain->level_matches = 2
chain->levels[0] = .
chain->levels[1] = example.com
chain->levels[2] = NULL
chain->end = www

Since exact matches only care for testing rpz set bits,
we need to test for rpz wild bits through iterating the nodechain, and
that includes testing the rpz wild bits in the highest level node found.

In the case of an exact match, chain->levels[chain->level_matches]
will be NULL, to address that we must use chain->end as the start point,
then iterate over the remaining levels in the chain.
2020-07-27 17:02:16 -03:00
Mark Andrews
b0942c2442 Check walking the hip rendezvous servers.
Also fixes extraneous white space at end of record when
there are no rendezvous servers.

(cherry picked from commit 78db46d746)
2020-07-24 15:24:49 +10:00
Petr Menšík
ac79d68765 Remove few lines in unix socket handling
Reuse the same checks two times, make difference minimal.

(cherry picked from commit 72d81c4768)
2020-07-24 13:47:26 +10:00
Tinderbox User
b03a635f68 prep 9.16.5 2020-07-15 23:10:55 +02:00
Matthijs Mekking
4dabb688db Check return value of dst_key_getbool()
Fix Coverity CHECKED_RETURN reports for dst_key_getbool().  In most
cases we do not really care about its return value, but it is prudent
to check it.

In one case, where a dst_key_getbool() error should be treated
identically as success, cast the return value to void and add a relevant
comment.

(cherry picked from commit e645d2ef1e)
2020-07-14 17:48:21 +02:00
Mark Andrews
d47c42a0ab Mark 'addr' as unused if HAVE_IF_NAMETOINDEX is not defined
Also 'zone' should be initialised to zero.

(cherry picked from commit e7662c4c63)
2020-07-14 10:53:06 +10:00
Mark Andrews
b955da48aa Handle namespace clash over 'SEC' on illumos.
(cherry picked from commit 18eef20241)
2020-07-14 09:06:46 +10:00
Mark Andrews
f771d75c9b Address potential double unlock in process_fd
(cherry picked from commit cc0089c66b)
2020-07-14 07:35:17 +10:00
Mark Andrews
94288631a9 Add changes for [GL #1989]
(cherry picked from commit 42b2290c3a)
2020-07-13 14:04:53 +10:00
Mark Andrews
67f85d648f Address overrun in remove_old_tsversions
If too many versions of log / dnstap files to be saved where requests
the memory after to_keep could be overwritten.  Force the number of
versions to be saved to a save level.  Additionally the memmove length
was incorrect.

(cherry picked from commit 6ca78bc57d)
2020-07-13 14:04:04 +10:00
Mark Andrews
e67b7a62d0 Assert tsigout is non-NULL
(cherry picked from commit 827746e89b)
2020-07-13 13:21:12 +10:00
Mark Andrews
12fac1ce70 check returns from inet_pton()
(cherry picked from commit 9499adeb5e)
2020-07-13 11:44:58 +10:00
Michał Kępień
0bc4d6cc7a Fix locking for LMDB 0.9.26
When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views.  Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:

 1. mdb_env_open(envA) is called when the view is first created.
 2. "rndc reconfig" is called.
 3. mdb_env_open(envB) is called for the new instance of the view.
 4. mdb_env_close(envA) is called for the old instance of the view.

This seems to have worked so far.  However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).

Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.

To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead.  Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps).  Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.

[1] 2fd44e3251

(cherry picked from commit 53120279b5)
2020-07-10 11:30:31 +02:00
Mark Andrews
86681ca6f1 Adjust range limit of unknown meta types
(cherry picked from commit 092a159dcd)
2020-07-08 13:44:47 +10:00
Ondřej Surý
0279cc76a7 Update STALE and ANCIENT header attributes atomically
The ThreadSanitizer found a data race when updating the stale header.
Instead of trying to acquire the write lock and failing occasionally
which would skew the statistics, the dns_rdatasetheader_t.attributes
field has been promoted to use stdatomics.  Updating the attributes in
the mark_header_ancient() and mark_header_stale() now uses the cmpxchg
to update the attributes forfeiting the need to hold the write lock on
the tree.  Please note that mark_header_ancient() still needs to hold
the lock because .dirty is being updated in the same go.

(cherry picked from commit 81d4230e60)
2020-07-08 12:01:46 +10:00
Mark Andrews
dd32fb9284 Make the stdatomic shim and mutexatomic type complete
The stdatomic shims for non-C11 compilers (Windows, old gcc, ...) and
mutexatomic implemented only and minimal subset of the atomic types.
This commit adds 16-bit operations for Windows and all atomic types as
defined in standard.

(cherry picked from commit bccea5862d)
2020-07-08 10:29:59 +10:00
Mark Andrews
244ebdfb8c remove redundant rctx != NULL check
(cherry picked from commit 2fa2dbd5fb)
2020-07-06 10:30:25 +10:00
Witold Kręcicki
000c7d1340 rbtdb: cleanup_dead_nodes should ignore alive nodes on the deadlist
(cherry picked from commit c8f2d55acf)
2020-07-01 15:35:21 +02:00
Witold Kręcicki
03e583ffa8 Fix assertion failure during startup when the server is under load.
When we're coming back from recursion fetch_callback does not accept
DNS_R_NXDOMAIN as an rcode - query_gotanswer calls query_nxdomain in
which an assertion fails on qctx->is_zone. Yet, under some
circumstances, qname minimization will return an DNS_R_NXDOMAIN - when
root zone mirror is not yet loaded. The fix changes the DNS_R_NXDOMAIN
answer to DNS_R_SERVFAIL.
2020-07-01 12:55:12 +02:00
Matthijs Mekking
9f5a43808f Fix linking problem for #1612
When a library is examined, an object file within it can be left out
of the link if it does not provide symbols that the symbol table
needs.  Introducing `isc_stdtime_tostring` caused a build failure for
`update_test` because it now requires `libisc.a(stdtime.o)` and that
also exports the `isc_stdtime_get` symbol, meaning we have a
multiple definition error.

Add a local version of `isc_stdtime_tostring`, so that the linker
will not search for it in available object files.
2020-07-01 10:55:30 +02:00
Matthijs Mekking
f1b3686cd2 Output rndc dnssec -status
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.

Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.

(cherry picked from commit 19ce9ec1d4)
2020-07-01 09:57:44 +02:00
Matthijs Mekking
7915327aac Move dst key printtime in separate function
I'd like to use the same functionality (pretty print the datetime
of keytime metadata) in the 'rndc dnssec -status' command.  So it is
better that this logic is done in a separate function.

Since the stdtime.c code have differernt files for unix and win32,
I think the "#ifdef WIN32" define can be dropped.

(cherry picked from commit 9e03f8e8fe)
2020-07-01 09:57:44 +02:00
Evan Hunt
952461b6af restore "blackhole" functionality
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.

in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().

(cherry picked from commit 23c7373d68)
2020-06-30 21:10:31 -07:00
Tony Finch
b7f7b8128e Fix rndc dnstap -roll N
The `rndc` argument was always overridden by the static configuration,
because the logic for handling the number of dnstap files to retain
was both backwards and a bit redundant.

(cherry picked from commit 7c07129a51)
2020-06-29 22:30:01 +00:00
Michał Kępień
be35b872fd Address compilation warnings on FreeBSD 11.4
With Clang 10.0.0 on FreeBSD 11.4, compiling lib/dns/spnego.c triggers
the following warnings:

    spnego.c:361:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
                    return (GSS_S_DEFECTIVE_TOKEN);
                            ^
    /usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
    #define GSS_S_DEFECTIVE_TOKEN      (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
                                            ^
    spnego.c:366:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
                    return (GSS_S_DEFECTIVE_TOKEN);
                            ^
    /usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
    #define GSS_S_DEFECTIVE_TOKEN      (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
                                            ^
    spnego.c:371:12: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
                            return (GSS_S_DEFECTIVE_TOKEN);
                                    ^
    /usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
    #define GSS_S_DEFECTIVE_TOKEN      (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
                                            ^
    spnego.c:376:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
                    return (GSS_S_DEFECTIVE_TOKEN);
                            ^
    /usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
    #define GSS_S_DEFECTIVE_TOKEN      (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
                                            ^
    spnego.c:380:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
                    return (GSS_S_DEFECTIVE_TOKEN);
                            ^
    /usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
    #define GSS_S_DEFECTIVE_TOKEN      (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
                                            ^
    5 errors generated.

Address by replacing all instances of the GSS_S_DEFECTIVE_TOKEN constant
with a boolean value.  Invert the values returned by cmp_gss_type() so
that its only call site reads more naturally in the context of the
comment preceding it.
2020-06-29 12:03:01 +02:00
Matthijs Mekking
7eed00502f kasp tests: fix wait for reconfig done
The wait until zones are signed after rndc reconfig is broken
because the zones are already signed before the reconfig.  Fix
by having a different way to ensure the signing of the zone is
complete.  This does require a call to the "wait_for_done_signing"
function after each "check_keys" call after the ns6 reconfig.

The "wait_for_done_signing" looks for a (newly added) debug log
message that named will output if it is done signing with a certain
key.

(cherry picked from commit a47192ed5b)
2020-06-29 08:09:40 +02:00
Witold Kręcicki
4582ef3bb2 Fix a shutdown race in netmgr udp.
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process - otherwise we might destroy the
callback argument before actually stopping listening, and call
the callback on a bad memory.
2020-06-26 01:44:03 -07:00
Witold Kręcicki
97e44fa3df Make netmgr tcpdns send calls asynchronous.
isc__nm_tcpdns_send() was not asynchronous and accessed socket
internal fields in an unsafe manner, which could lead to a race
condition and subsequent crash. Fix it by moving the whole tcpdns
processing to a proper netmgr thread.
2020-06-26 01:18:27 -07:00
Evan Hunt
f171017570 append "0" to IPv6 addresses ending in "::" when printing YAML
such addresses broke some YAML parsers.

(cherry picked from commit a8baf79e33)
2020-06-25 18:57:06 -07:00
Mark Andrews
3612f662da The validator could fail when select_signing_key/get_dst_key failed
to select the signing key because the algorithm was not supported
and the loop was prematurely aborted.

(cherry picked from commit d475f3aeed)
2020-06-25 22:42:43 +10:00
Mark Andrews
3f48a1e06e Add INSIST's to silence cppcheck warnings
(cherry picked from commit 0cf25d7f38)
2020-06-25 21:13:17 +10:00
Mark Andrews
b43641a55d Address potential thread issues:
Assign and then check node for NULL to address another thread
changing radix->head in the meantime.

Move 'node != NULL' check into while loop test to silence cppcheck
false positive.

Fix pointer != NULL style.

(cherry picked from commit 51f08d2095)
2020-06-25 21:11:27 +10:00
Evan Hunt
dca3658720 "check-names primary" and "check-names secondary" were ignored
these keywords were added to the parser as synonyms for "master"
and "slave" but were never hooked in to the configuration of named,
so they were ignored. this has been fixed and the option is now
checked for correctness.

(cherry picked from commit ba31b189b4)
2020-06-22 14:30:14 +02:00
Mark Andrews
34a5ad82d6 Address race between zone_maintenance and dns_zone_setview_helper
There was a possible NULL dereference due to data race between accessing
zone->view and zone->view->adb.

(cherry picked from commit 67c8f7329d)
2020-06-22 12:27:11 +02:00
Mark Andrews
41e38c216d Add missing #pragma once to <dns/lmdb.h> 2020-06-19 12:12:45 +10:00
Tinderbox User
adab85b815 prep 9.16.4 2020-06-18 10:25:50 +02:00
Mark Andrews
6964a21fa6 Remove INSIST from from new_reference
RBTDB node can now appear on the deadnodes lists following the changes
to decrement_reference in 176b23b6cd to
defer checking of node->down when the tree write lock is not held.  The
node should be unlinked instead.

(cherry picked from commit 569cc155b8680d8ed12db1fabbe20947db24a0f9)
2020-06-18 10:18:42 +02:00