Commit graph

13071 commits

Author SHA1 Message Date
Matthijs Mekking
9e2ea5efb1 Don't set pubkey if eckey already has public key
The 'ecdsa_check()' function tries to correctly set the public key
on the eckey, but this should be skipped if the public key is
retrieved via the private key.

(cherry picked from commit 06b9724152)
2021-01-26 15:04:21 +01:00
Matthijs Mekking
e3acfb44d5 ECDSA code should not use RSA label
The 'opensslecdsa_tofile()' function tags the label as an RSA label,
that is a copy paste error and should be of course an ECDSA label.

(cherry picked from commit 46afeca8bf)
2021-01-26 15:04:11 +01:00
Matthijs Mekking
8b25d3ab57 Correctly update pointers to pubkey and privkey
The functions 'load_pubkey_from_engine()' and
'load_privkey_from_engine()' did not correctly store the pointers.

Update both functions to add 'EC_KEY_set_public_key()' and
'EC_KEY_set_private_key()' respectively, so that the pointers to
the public and private keys survive the "load from engine" functions.

(cherry picked from commit 01239691a1)
2021-01-26 15:04:03 +01:00
Matthijs Mekking
f66df9f1b7 load_pubkey_from_engine() should load public key
The 'function load_pubkey_from_engine()' made a call to the libssl
function 'ENGINE_load_private_key'.  This is a copy paste error and
should be 'ENGINE_load_public_key'.

(cherry picked from commit 370285a62d)
2021-01-26 15:03:43 +01:00
Evan Hunt
077e2c2a74 add serial number to "transfer ended" log messages 2021-01-26 12:38:32 +01:00
Evan Hunt
2df6ffc051 check size ratio when responding to IXFR requests 2021-01-26 12:38:32 +01:00
Evan Hunt
9950247c78 improve calculation of database transfer size
- change name of 'bytes' to 'xfrsize' in dns_db_getsize() parameter list
  and related variables; this is a more accurate representation of what
  the function is doing
- change the size calculations in dns_db_getsize() to more accurately
  represent the space needed for a *XFR message or journal file to contain
  the data in the database. previously we returned the sizes of all
  rdataslabs, including header overhead and offset tables, which
  resulted in the database size being reported as much larger than the
  equivalent *XFR or journal.
- map files caused a particular problem here: the fullname can't be
  determined from the node while a file is being deserialized, because
  the uppernode pointers aren't set yet. so we store "full name length"
  in the dns_rbtnode structure while serializing, and clear it after
  deserialization is complete.
2021-01-26 12:38:32 +01:00
Evan Hunt
70df95e9f5 dns_journal_iter_init() can now return the size of the delta
the call initailizing a journal iterator can now optionally return
to the caller the size in bytes of an IXFR message (not including
DNS header overhead, signatures etc) containing the differences from
the beginning to the ending serial number.

this is calculated by scanning the journal transaction headers to
calculate the transfer size. since journal file records contain a length
field that is not included in IXFR messages, we subtract out the length
of those fields from the overall transaction length.

this necessitated adding an "RR count" field to the journal transaction
header, so we know how many length fields to subract. NOTE: this will
make existing journal files stop working!
2021-01-26 12:38:32 +01:00
Evan Hunt
57aadd6cea add syntax and setter/getter functions to configure max-ixfr-ratio 2021-01-26 12:38:32 +01:00
Ondřej Surý
0e25af628c Use -release instead of -version-info for internal library SONAMEs
The BIND 9 libraries are considered to be internal only and hence the
API and ABI changes a lot.  Keeping track of the API/ABI changes takes
time and it's a complicated matter as the safest way to make everything
stable would be to bump any library in the dependency chain as in theory
if libns links with libdns, and a binary links with both, and we bump
the libdns SOVERSION, but not the libns SOVERSION, the old libns might
be loaded by binary pulling old libdns together with new libdns loaded
by the binary.  The situation gets even more complicated with loading
the plugins that have been compiled with few versions old BIND 9
libraries and then dynamically loaded into the named.

We are picking the safest option possible and usable for internal
libraries - instead of using -version-info that has only a weak link to
BIND 9 version number, we are using -release libtool option that will
embed the corresponding BIND 9 version number into the library name.

That means that instead of libisc.so.1608 (as an example) the library
will now be named libisc-9.16.10.so.

(cherry picked from commit c605d75ea5)
2021-01-25 15:28:09 +01:00
Tinderbox User
536bc1163a prep 9.16.11 2021-01-21 09:11:54 +01:00
Evan Hunt
1a32a4d001 prevent "primaries" lists from having duplicate names
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.

(cherry picked from commit f619708bbf)
2021-01-12 15:21:14 +01:00
Evan Hunt
746aa2581c add "primary-only" as a synonym for "master-only"
update the "notify" option to use RFC 8499 terminology as well.

(cherry picked from commit 424a3cf3cc)
2021-01-12 15:21:14 +01:00
Evan Hunt
04b9cdb53c add "primaries" as a synonym for "masters" in named.conf
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.

added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.

(cherry picked from commit 16e14353b1)
2021-01-12 15:21:14 +01:00
Matthijs Mekking
c4520620dc Fix signatures-validity config option
KASP was using 'signatures-validity-dnskey' instead of
'signatures-validity'.

(cherry picked from commit ad63e9e4f8)
2021-01-12 13:13:05 +01:00
Mark Andrews
07e899f616 Inactive incorrectly incremented
It is possible to have two threads destroying an rbtdb at the same
time when detachnode() executes and removes the last reference to
a node between exiting being set to true for the node and testing
if the references are zero in maybe_free_rbtdb().  Move NODE_UNLOCK()
to after checking if references is zero to prevent detachnode()
changing the reference count too early.

(cherry picked from commit 859d2fdad6)
2021-01-06 16:33:32 +11:00
Matthijs Mekking
63e58f09a5 Fix dnssec-signzone and -verify logging (again)
While fixing #2359, 'report()' was changed so that it would print the
newline.

Newlines were missing from the output of 'dnssec-signzone'
and 'dnssec-verify' because change
664b8f04f5 moved the printing from
newlines to the library.

This had to be reverted because this also would print redundant
newlines in logfiles.

While doing the revert, some newlines in 'lib/dns/zoneverify.c'
were left in place, now making 'dnssec-signzone' and 'dnssec-verify'
print too many newlines.

This commit removes those newlines, so that the output looks nice
again.

(cherry picked from commit 18c62a077e)
2021-01-05 13:41:49 +01:00
Matthijs Mekking
d564ad5f52 Update keymgr to allow transition to insecure mode
The keymgr prevented zones from going to insecure mode. If we
have a policy with an empty key list this is a signal that the zone
wants to go back to insecure mode. In this case allow one extra state
transition to be valid when checking for DNSSEC safety.

(cherry picked from commit 9134100069)
2020-12-23 11:56:54 +01:00
Matthijs Mekking
6da379d844 Publish CDS/CDNSKEY Delete Records
Check if zone is transitioning from secure to insecure. If so,
delete the CDS/CDNSKEY records, otherwise make sure they are not
part of the RRset.

(cherry picked from commit 68d715a229)
2020-12-23 11:56:44 +01:00
Matthijs Mekking
cf0439cd5f Treat dnssec-policy "none" as a builtin zone
Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig'
api so that the 'name' will determine what policy needs to be
configured.

When transitioning a zone from secure to insecure, there will be
cases when a zone with no DNSSEC policy (dnssec-policy none) should
be using KASP. When there are key state files available, this is an
indication that the zone once was DNSSEC signed but is reconfigured
to become insecure.

If we would not run the keymgr, named would abruptly remove the
DNSSEC records from the zone, making the zone bogus. Therefore,
change the code such that a zone will use kasp if there is a valid
dnssec-policy configured, or if there are state files available.

(cherry picked from commit cf420b2af0)
2020-12-23 11:56:33 +01:00
Matthijs Mekking
6ff69ee8ba Add function to see if dst key uses kasp
For purposes of zones transitioning back to insecure mode, it is
practical to see if related keys have a state file associated.

(cherry picked from commit 8f2c5e45da)
2020-12-23 11:56:25 +01:00
Mark Andrews
4d003dd0f8 Only pick CPUs that are part of the existing CPU affinity set when
assigning a thread to a CPU.

(cherry picked from commit 698d9285d4)
2020-12-23 09:21:29 +11:00
Ondřej Surý
04f9f45c54 Print warning when falling back to increment soa serial method
When using the `unixtime` or `date` method to update the SOA serial,
`named` and `dnssec-signzone` would silently fallback to `increment`
method to prevent the new serial number to be smaller than the old
serial number (using the serial number arithmetics).  Add a warning
message when such fallback happens.

(cherry picked from commit ef685bab5c)
2020-12-12 07:55:29 +01:00
Ondřej Surý
2c04299eb1 Fix HAVE_SO_REUSEPORT_LB macro name definition
A typo in macro definition caused the load-balanced sockets to be
disabled even on platforms with existing support for load-balanced
sockets.

(cherry picked from commit 5caf33feda)
2020-12-09 10:46:16 +01:00
Ondřej Surý
90979a79e2 Sync the func() -> func(void) in netmgr 2020-12-09 10:46:16 +01:00
Ondřej Surý
bb9b55dfba Use sock->nchildren instead of mgr->nworkers when initializing NM
On Windows, we were limiting the number of listening children to just 1,
but we were then iterating on mgr->nworkers.  That lead to scheduling
more async_*listen() than actually allocated and out-of-bound read-write
operation on the heap.

(cherry picked from commit 87c5867202)
2020-12-09 10:46:16 +01:00
Ondřej Surý
857704b879 Explicitly link the netmgr tests with -luv 2020-12-09 10:46:16 +01:00
Ondřej Surý
7ec4ec3a81 Fix datarace when UDP/TCP connect fails and we are in nmthread
When we were in nmthread, the isc__nm_async_<proto>connect() function
executes in the same thread as the isc__nm_<proto>connect() and on a
failure, it would block indefinitely because the failure branch was
setting sock->active to false before the condition around the wait had a
chance to skip the WAIT().

This also fixes the zero system test being stuck on FreeBSD 11, so we
re-enable the test in the commit.
2020-12-09 10:46:16 +01:00
Ondřej Surý
90a9b0611a Add FreeBSD connection timeout socket option
On FreeBSD, the option to configure connection timeout is called
TCP_KEEPINIT, use it to configure the connection timeout there.

This also fixes the dangling socket problems in the unit test, so
re-enable them.
2020-12-09 10:46:16 +01:00
Ondřej Surý
0ee8672692 Distribute queries among threads even on platforms without lb sockets
On platforms without load-balancing socket all the queries would be
handle by a single thread.  Currently, the support for load-balanced
sockets is present in Linux with SO_REUSEPORT and FreeBSD 12 with
SO_REUSEPORT_LB.

This commit adds workaround for such platforms that:

1. setups single shared listening socket for all listening nmthreads for
   UDP, TCP and TCPDNS netmgr transports

2. Calls uv_udp_bind/uv_tcp_bind on the underlying socket just once and
   for rest of the nmthreads only copy the internal libuv flags (should
   be just UV_HANDLE_BOUND and optionally UV_HANDLE_IPV6).

3. start reading on UDP socket or listening on TCP socket

The load distribution among the nmthreads is uneven, but it's still
better than utilizing just one thread for processing all the incoming
queries
2020-12-09 10:46:16 +01:00
Ondřej Surý
4c70100ce0 Don't use stack allocated buffer for uv_write()
On FreeBSD, the stack is destroyed more aggressively than on Linux and
that revealed a bug where we were allocating the 16-bit len for the
TCPDNS message on the stack and the buffer got garbled before the
uv_write() sendback was executed.  Now, the len is part of the uvreq, so
we can safely pass it to the uv_write() as the req gets destroyed after
the sendcb is executed.

(cherry picked from commit 94afea9325)
2020-12-09 10:46:16 +01:00
Michał Kępień
12fa8a7aed Make netmgr initialize and cleanup Winsock itself
On Windows, WSAStartup() needs to be called to initialize Winsock before
any sockets are created or else socket() calls will return error code
10093 (WSANOTINITIALISED).  Since BIND's Network Manager is intended to
work as a reusable networking library, it should take care of calling
WSAStartup() - and its cleanup counterpart, WSACleanup() - itself rather
than relying on external code to do it.  Add the necessary WSAStartup()
and WSACleanup() calls to isc_nm_start() and isc_nm_destroy(),
respectively.

(cherry picked from commit 88f96faba8)
2020-12-09 10:46:16 +01:00
Michał Kępień
216fc34490 Extend log message for unexpected socket() errors
Make sure the error code is included in the message logged for
unexpected socket creation errors in order to facilitate troubleshooting
on Windows.

(cherry picked from commit dc2e1dea86)
2020-12-09 10:46:16 +01:00
Ondřej Surý
e8e8ed7fb9 Adjust the nstests for isc_nmhandle_{attach,detach} name change
Due to the added attach/detach tracing in the netmgr-v2 code, the
libns tests needs to be adjusted as the real function names have
changed from isc_nmhandle_* to isc__nmhandle_*.
2020-12-09 10:46:16 +01:00
Ondřej Surý
9b2184893d The cmocka.h header MUST be included before isc/util.h gets included
The isc/util.h header redefine the DbC checks (REQUIRE, INSIST, ...)  to
be cmocka "fake" assertions.  However that means that cmocka.h needs to
be included after UNIT_TESTING is defined but before isc/util.h is
included.  Because isc/util.h is included in most of the project headers
this means that the sequence MUST be:

    #define UNIT_TESTING
    #include <cmocka.h>

    #include <isc/_anything_.h>

See !2204 for other header requirements for including cmocka.h.

(cherry picked from commit 0ba697fe8c)
2020-12-09 10:46:16 +01:00
Ondřej Surý
7fc62f829d Add libssl libraries to Windows build
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
2020-12-09 10:46:16 +01:00
Ondřej Surý
48759bd047 Fix the data race in accessing the isc_nm_t timers
The following TSAN report about accessing the mgr timers (mgr->init,
mgr->idle, mgr->keepalive and mgr->advertised) has been fixed in this
commit:

    ==================
    WARNING: ThreadSanitizer: data race (pid=2746)
    Read of size 4 at 0x7b440008a948 by thread T18:
    #0 isc__nm_tcpdns_read /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 (libisc.so.1706+0x2ba0f)
    #1 isc_nm_read /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1679:3 (libisc.so.1706+0x22258)
    #2 tcpdns_connect_connect_cb /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:363:2 (tcpdns_test+0x4bc5fb)
    #3 isc__nm_async_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1816:2 (libisc.so.1706+0x228c9)
    #4 isc__nm_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1791:3 (libisc.so.1706+0x22713)
    #5 tcpdns_connect_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:343:2 (libisc.so.1706+0x2d89d)
    #6 uv__stream_connect /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1381:5 (libuv.so.1+0x27c18)
    #7 uv__stream_io /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1298:5 (libuv.so.1+0x25977)
    #8 uv__io_poll /home/ondrej/Projects/tsan/libuv/src/unix/linux-core.c:462:11 (libuv.so.1+0x2e795)
    #9 uv_run /home/ondrej/Projects/tsan/libuv/src/unix/core.c:385:5 (libuv.so.1+0x158ec)
    #10 nm_thread /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:530:11 (libisc.so.1706+0x1c94a)

    Previous write of size 4 at 0x7b440008a948 by main thread:
    #0 isc_nm_settimeouts /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:490:12 (libisc.so.1706+0x1dda5)
    #1 tcpdns_recv_two /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:601:2 (tcpdns_test+0x4bad0e)
    #2 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70be)
    #3 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Location is heap block of size 281 at 0x7b440008a840 allocated by main thread:
    #0 malloc <null> (tcpdns_test+0x42864b)
    #1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:713:8 (libisc.so.1706+0x6d261)
    #2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:622:8 (libisc.so.1706+0x69b9c)
    #3 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1044:9 (libisc.so.1706+0x6d379)
    #4 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2432:10 (libisc.so.1706+0x6889e)
    #5 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:203:8 (libisc.so.1706+0x1c219)
    #6 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #7 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #8 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Thread T18 'isc-net-0000' (tid=3513, running) created by main thread at:
    #0 pthread_create <null> (tcpdns_test+0x429e7b)
    #1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:73:8 (libisc.so.1706+0x8476a)
    #2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:271:3 (libisc.so.1706+0x1c66a)
    #3 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #4 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #5 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 in isc__nm_tcpdns_read
    ==================
    ThreadSanitizer: reported 1 warnings

(cherry picked from commit 2e1dd56d0b)
2020-12-09 10:46:16 +01:00
Ondřej Surý
a61b7294c2 Avoid netievent allocations when the callbacks can be called directly
After turning the users callbacks to be asynchronous, there was a
visible performance drop.  This commit prevents the unnecessary
allocations while keeping the code paths same for both asynchronous and
synchronous calls.

The same change was done to the isc__nm_udp_{read,send} as those two
functions are in the hot path.

(cherry picked from commit d6d2fbe0e9)
2020-12-09 10:46:16 +01:00
Ondřej Surý
7b9c8b9781 Refactor netmgr and add more unit tests
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested.  It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.

NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.

The changes that are included in this commit are listed here
(extensively, but not exclusively):

* The netmgr_test unit test was split into individual tests (udp_test,
  tcp_test, tcpdns_test and newly added tcp_quota_test)

* The udp_test and tcp_test has been extended to allow programatic
  failures from the libuv API.  Unfortunately, we can't use cmocka
  mock() and will_return(), so we emulate the behaviour with #define and
  including the netmgr/{udp,tcp}.c source file directly.

* The netievents that we put on the nm queue have variable number of
  members, out of these the isc_nmsocket_t and isc_nmhandle_t always
  needs to be attached before enqueueing the netievent_<foo> and
  detached after we have called the isc_nm_async_<foo> to ensure that
  the socket (handle) doesn't disappear between scheduling the event and
  actually executing the event.

* Cancelling the in-flight TCP connection using libuv requires to call
  uv_close() on the original uv_tcp_t handle which just breaks too many
  assumptions we have in the netmgr code.  Instead of using uv_timer for
  TCP connection timeouts, we use platform specific socket option.

* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}

  When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
  waiting for socket to either end up with error (that path was fine) or
  to be listening or connected using condition variable and mutex.

  Several things could happen:

    0. everything is ok

    1. the waiting thread would miss the SIGNAL() - because the enqueued
       event would be processed faster than we could start WAIT()ing.
       In case the operation would end up with error, it would be ok, as
       the error variable would be unchanged.

    2. the waiting thread miss the sock->{connected,listening} = `true`
       would be set to `false` in the tcp_{listen,connect}close_cb() as
       the connection would be so short lived that the socket would be
       closed before we could even start WAIT()ing

* The tcpdns has been converted to using libuv directly.  Previously,
  the tcpdns protocol used tcp protocol from netmgr, this proved to be
  very complicated to understand, fix and make changes to.  The new
  tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
  Closes: #2194, #2283, #2318, #2266, #2034, #1920

* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
  pass accepted TCP sockets between netthreads, but instead (similar to
  UDP) uses per netthread uv_loop listener.  This greatly reduces the
  complexity as the socket is always run in the associated nm and uv
  loops, and we are also not touching the libuv internals.

  There's an unfortunate side effect though, the new code requires
  support for load-balanced sockets from the operating system for both
  UDP and TCP (see #2137).  If the operating system doesn't support the
  load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
  on FreeBSD 12+), the number of netthreads is limited to 1.

* The netmgr has now two debugging #ifdefs:

  1. Already existing NETMGR_TRACE prints any dangling nmsockets and
     nmhandles before triggering assertion failure.  This options would
     reduce performance when enabled, but in theory, it could be enabled
     on low-performance systems.

  2. New NETMGR_TRACE_VERBOSE option has been added that enables
     extensive netmgr logging that allows the software engineer to
     precisely track any attach/detach operations on the nmsockets and
     nmhandles.  This is not suitable for any kind of production
     machine, only for debugging.

* The tlsdns netmgr protocol has been split from the tcpdns and it still
  uses the old method of stacking the netmgr boxes on top of each other.
  We will have to refactor the tlsdns netmgr protocol to use the same
  approach - build the stack using only libuv and openssl.

* Limit but not assert the tcp buffer size in tcp_alloc_cb
  Closes: #2061

(cherry picked from commit 634bdfb16d)
2020-12-09 10:46:16 +01:00
Ondřej Surý
fa9ca83862 Turn all the callback to be always asynchronous
When calling the high level netmgr functions, the callback would be
sometimes called synchronously if we catch the failure directly, or
asynchronously if it happens later.  The synchronous call to the
callback could create deadlocks as the caller would not expect the
failed callback to be executed directly.

(cherry picked from commit a49d88568f)
2020-12-09 10:46:16 +01:00
Ondřej Surý
bcc9ad98ea netmgr: Add additional safeguards to netmgr/tls.c
This commit adds couple of additional safeguards against running
sends/reads on inactive sockets.  The changes was modeled after the
changes we made to netmgr/tcpdns.c

(cherry picked from commit fa424225af)
2020-12-09 10:46:16 +01:00
Witold Kręcicki
b83dff0585 isc_nm_tls_create_server_ctx can create ephemeral certs
In-memory ephemeral certs creation for easy DoT/DoH deployment.

(cherry picked from commit 3c00fb71db)
2020-12-09 10:46:16 +01:00
Witold Kręcicki
d7fa046a69 Add DoT support to bind
Parse the configuration of tls objects into SSL_CTX* objects.  Listen on
DoT if 'tls' option is setup in listen-on directive.  Use DoT/DoH ports
for DoT/DoH.

(cherry picked from commit 38b78f59a0)
2020-12-09 10:46:16 +01:00
Evan Hunt
0f5fff5c1e report peer address in TLS mode, and specify protocol
- peer address was not being reported correctly by "dig +tls"
- the protocol used is now reported in the dig output: UDP, TCP, or TLS.

(cherry picked from commit 8886569e9d)
2020-12-09 10:46:16 +01:00
Witold Kręcicki
4a854da141 netmgr: server-side TLS support
Add server-side TLS support to netmgr - that includes moving some of the
isc_nm_ functions from tcp.c to a wrapper in netmgr.c calling a proper
tcp or tls function, and a new isc_nm_listentls() function.

Add DoT support to tcpdns - isc_nm_listentlsdns().

(cherry picked from commit b2ee0e9dc3)
2020-12-09 10:46:16 +01:00
Evan Hunt
6f6f0e26ab address some possible shutdown races in xfrin
there were two failures during observed in testing, both occurring
when 'rndc halt' was run rather than 'rndc stop' - the latter dumps
zone contents to disk and presumably introduced enough delay to
prevent the races:

- a failure when the zone was shut down and called dns_xfrin_detach()
  before the xfrin had finished connecting; the connect timeout
  terminated without detaching its handle
- a failure when the tcpdns socket timer fired after the outerhandle
  had already been cleared.

this commit incidentally addresses a failure observed in mutexatomic
due to a variable having been initialized incorrectly.
2020-12-09 10:46:16 +01:00
Ondřej Surý
c4dcedd2dc netmgr: Don't crash if socket() returns an error in udpconnect
socket() call can return an error - e.g. EMFILE, so we need to handle
this nicely and not crash.

Additionally wrap the socket() call inside a platform independent helper
function as the Socket data type on Windows is unsigned integer:

> This means, for example, that checking for errors when the socket and
> accept functions return should not be done by comparing the return
> value with –1, or seeing if the value is negative (both common and
> legal approaches in UNIX). Instead, an application should use the
> manifest constant INVALID_SOCKET as defined in the Winsock2.h header
> file.

(cherry picked from commit 8af7f81d6c)
2020-12-09 10:46:16 +01:00
Ondřej Surý
21daa258a2 netmgr: Always load the result from async socket
Because we use result earlier for setting the loadbalancing on the
socket, we could be left with a ISC_R_NOTIMPLEMENTED value stored in the
variable and when the UDP connection would succeed, we would
errorneously return this value instead of ISC_R_SUCCESS.

(cherry picked from commit 050258bda4)
2020-12-09 10:46:16 +01:00
Evan Hunt
70e08cab6b dig: use new netmgr timeout mechanism
use isc_nmhandle_settimeout() to set read/recv timeouts, and get rid
of connect_timeout() and related functions in dighost.c.

(cherry picked from commit ea2b04c361)
2020-12-09 10:46:16 +01:00
Evan Hunt
4598d7b30d add isc_nmhandle_settimeout() function
this function sets the read timeout for the socket associated
with a netmgr handle and, if the timer is running, resets it.
for TCPDNS sockets it also sets the read timeout and resets the
timer on the outer TCP socket.

(cherry picked from commit 4be63c5b00)
2020-12-09 10:46:16 +01:00
Ondřej Surý
5877befb51 fix nmhandle attach/detach errors in tcpdnsconnect_cb()
we need to attach to the statichandle when connecting TCPDNS sockets,
same as with UDP.

(cherry picked from commit 2191d2bf44)
2020-12-09 10:46:16 +01:00
Mark Andrews
574e0d9f6e Incorrect result code passed to failed_connect_cb
*** CID 312970:  Incorrect expression  (COPY_PASTE_ERROR) /lib/isc/netmgr/tcp.c: 282 in tcp_connect_cb()
    276     	}
    277
    278     	isc__nm_incstats(sock->mgr, sock->statsindex[STATID_CONNECT]);
    279     	r = uv_tcp_getpeername(&sock->uv_handle.tcp, (struct sockaddr *)&ss,
    280     			       &(int){ sizeof(ss) });
    281     	if (r != 0) {
    >>>     CID 312970:  Incorrect expression  (COPY_PASTE_ERROR)
    >>>     "status" in "isc___nm_uverr2result(status, true, "netmgr/tcp.c", 282U)" looks like a copy-paste error.
    282     		failed_connect_cb(sock, req, isc__nm_uverr2result(status));
    283     		return;
    284     	}
    285
    286     	atomic_store(&sock->connecting, false);
    287

(cherry picked from commit 0073cb7356)
2020-12-09 10:46:16 +01:00
Ondřej Surý
268e111546 Put up additional safe guards to not use inactive/closed tcpdns socket
When we are operating on the tcpdns socket, we need to double check
whether the socket or its outerhandle or its listener or its mgr is
still active and when not, bail out early.

(cherry picked from commit c14c1fdd2c)
2020-12-09 10:46:16 +01:00
Witold Kręcicki
fb19091a32 Fix improper closed connection handling in tcpdns.
If dnslisten_readcb gets a read callback it needs to verify that the
outer socket wasn't closed in the meantime, and issue a CANCELED callback
if it was.

(cherry picked from commit 3ab3d90de0)
2020-12-09 10:46:16 +01:00
Evan Hunt
80de62645c check return value from uv_tcp_getpeername() when connecting
if we can't determine the peer, the connect should fail.

(cherry picked from commit 8fcad58ea6)
2020-12-09 10:46:16 +01:00
Evan Hunt
12b1ae64ff set REUSEPORT and REUSEADDR on TCP sockets if needed
When binding a TCP socket, if bind() fails with EADDRINUSE,
try again with REUSEPORT/REUSEADDR (or the equivalent options).

(cherry picked from commit 26a3a22895)
2020-12-09 10:46:16 +01:00
Ondřej Surý
e35b8db249 Fix more races between connect and shutdown
There were more races that could happen while connecting to a
socket while closing or shutting down the same socket.  This
commit introduces a .closing flag to guard the socket from
being closed twice.

(cherry picked from commit ed3ab63f74)
2020-12-09 10:46:16 +01:00
Ondřej Surý
d8c3e48970 Fix a race between isc__nm_async_shutdown() and new sends/reads
There was a data race where a new event could be scheduled after
isc__nm_async_shutdown() had cleaned up all the dangling UDP/TCP
sockets from the loop.

(cherry picked from commit 6cfadf9db0)
2020-12-09 10:46:16 +01:00
Ondřej Surý
c4816ce34f Refactor udp_recv_cb()
- more logical code flow.
- propagate errors back to the caller.
- add a 'reading' flag and call the callback from failed_read_cb()
  only when it the socket was actively reading.

(cherry picked from commit 5fcd52209a)
2020-12-09 10:46:16 +01:00
Ondřej Surý
7945fb0c90 Fix netmgr read/connect timeout issues
- don't bother closing sockets that are already closing.
- UDP read timeout timer was not stopped after reading.
- improve handling of TCP connection failures.

(cherry picked from commit cdccac4993)
2020-12-09 10:46:16 +01:00
Ondřej Surý
e9354e7bfe Add isc__nm_udp_shutdown() function
This function will be called during isc_nm_closedown() to ensure
that all UDP sockets are closed and detached.

(cherry picked from commit 7a6056bc8f)
2020-12-09 10:46:16 +01:00
Evan Hunt
c919a3338f add netmgr functions to support outgoing DNS queries
- isc_nm_tcpdnsconnect() sets up up an outgoing TCP DNS connection.
- isc_nm_tcpconnect(), _udpconnect() and _tcpdnsconnect() now take a
  timeout argument to ensure connections time out and are correctly
  cleaned up on failure.
- isc_nm_read() now supports UDP; it reads a single datagram and then
  stops until the next time it's called.
- isc_nm_cancelread() now runs asynchronously to prevent assertion
  failure if reading is interrupted by a non-network thread (e.g.
  a timeout).
- isc_nm_cancelread() can now apply to UDP sockets.
- added shim code to support UDP connection in versions of libuv
  prior to 1.27, when uv_udp_connect() was added

all these functions will be used to support outgoing queries in dig,
xfrin, dispatch, etc.

(cherry picked from commit 5dcdc00b93)
2020-12-09 10:46:16 +01:00
Tinderbox User
7406ea925a prep 9.16.10 2020-12-09 10:46:16 +01:00
Ondřej Surý
a35a666a7c Reformat sources using clang-format-11
(cherry picked from commit 7ba18870dc)
2020-12-08 19:34:05 +01:00
Mark Andrews
5c10b5a4e8 Adjust default value of "max-recursion-queries"
Since the queries sent towards root and TLD servers are now included in
the count (as a result of the fix for CVE-2020-8616),
"max-recursion-queries" has a higher chance of being exceeded by
non-attack queries.  Increase its default value from 75 to 100.

(cherry picked from commit ab0bf49203)
2020-12-02 00:53:49 +11:00
Mark Andrews
4926888306 Fix misplaced declaration
(cherry picked from commit 49b9219bb3)
2020-12-01 23:19:20 +11:00
Mark Andrews
7e85b2cd22 Add comment about cookie sizes
(cherry picked from commit 304df53991)
2020-11-27 08:44:00 +11:00
Mark Andrews
df5f076a02 Tighten DNS COOKIE response handling
Fallback to TCP when we have already seen a DNS COOKIE response
from the given address and don't have one in this UDP response. This
could be a server that has turned off DNS COOKIE support, a
misconfigured anycast server with partial DNS COOKIE support, or a
spoofed response. Falling back to TCP is the correct behaviour in
all 3 cases.

(cherry picked from commit 0e3b1f5a25)
2020-11-27 08:15:11 +11:00
Diego Fronza
5c28451949 Silence coverity warnings in query.c
Return value of dns_db_getservestalerefresh() and
dns_db_getservestalettl() functions were previously unhandled.

This commit purposefully ignore those return values since there is
no side effect if those results are != ISC_R_SUCCESS, it also supress
Coverity warnings.
2020-11-26 14:56:22 +00:00
Matthijs Mekking
2f0b924ce6 Add NSEC3PARAM unit test, refactor zone.c
Add unit test to ensure the right NSEC3PARAM event is scheduled in
'dns_zone_setnsec3param()'.  To avoid scheduling and managing actual
tasks, split up the 'dns_zone_setnsec3param()' function in two parts:

1. 'dns__zone_lookup_nsec3param()' that will check if the requested
   NSEC3 parameters already exist, and if a new salt needs to be
   generated.

2. The actual scheduling of the new NSEC3PARAM event (if needed).

(cherry picked from commit 64db30942d)
2020-11-26 14:15:05 +00:00
Matthijs Mekking
6db879160f Detect NSEC3 salt collisions
When generating a new salt, compare it with the previous NSEC3
paremeters to ensure the new parameters are different from the
previous ones.

This moves the salt generation call from 'bin/named/*.s' to
'lib/dns/zone.c'. When setting new NSEC3 parameters, you can set a new
function parameter 'resalt' to enforce a new salt to be generated. A
new salt will also be generated if 'salt' is set to NULL.

Logging salt with zone context can now be done with 'dnssec_log',
removing the need for 'dns_nsec3_log_salt'.

(cherry picked from commit 6b5d7357df)
2020-11-26 14:15:05 +00:00
Matthijs Mekking
93f9d3b812 Move logging of salt in separate function
There may be a desire to log the salt without losing the context
of log module, level, and category.

(cherry picked from commit 7878f300ff)
2020-11-26 14:15:04 +00:00
Matthijs Mekking
52d3bf5f31 Change nsec3param salt config to saltlen
Upon request from Mark, change the configuration of salt to salt
length.

Introduce a new function 'dns_zone_checknsec3aram' that can be used
upon reconfiguration to check if the existing NSEC3 parameters are
in sync with the configuration. If a salt is used that matches the
configured salt length, don't change the NSEC3 parameters.

(cherry picked from commit 6f97bb6b1f)
2020-11-26 14:15:04 +00:00
Matthijs Mekking
d35dab3db8 Add check for NSEC3 and key algorithms
NSEC3 is not backwards compatible with key algorithms that existed
before the RFC 5155 specification was published.

(cherry picked from commit 00c5dabea3)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
2a1793a2be Check nsec3param configuration values
Check 'nsec3param' configuration for the number of iterations.  The
maximum number of iterations that are allowed are based on the key
size (see https://tools.ietf.org/html/rfc5155#section-10.3).

Check 'nsec3param' configuration for correct salt. If the string is
not "-" or hex-based, this is a bad salt.

(cherry picked from commit 7039c5f805)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
008e84e965 Support for NSEC3 in dnssec-policy
Implement support for NSEC3 in dnssec-policy.  Store the configuration
in kasp objects. When configuring a zone, call 'dns_zone_setnsec3param'
to queue an nsec3param event. This will ensure that any previous
chains will be removed and a chain according to the dnssec-policy is
created.

Add tests for dnssec-policy zones that uses the new 'nsec3param'
option, as well as changing to new values, changing to NSEC, and
changing from NSEC.

(cherry picked from commit 114af58ee2)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
5dfd3b2d7b Add kasp nsec3param configuration
Add configuration and documentation on how to enable NSEC3 when
using dnssec-policy for signing your zones.

(cherry picked from commit f7ca96c805)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
9b9ac92fd0 Move generate_salt function to lib/dns/nsec3
We will be using this function also on reconfig, so it should have
a wider availability than just bin/named/server.

(cherry picked from commit 84a4273074)
2020-11-26 14:14:56 +00:00
Michal Nowak
8885f4a6f7
Fix typo in ISC_PLAFORM_HAVESYSUNH 2020-11-26 14:17:17 +01:00
Michał Kępień
a6f2e36ee6 Use proper cmocka macros for pointer checks
Make sure pointer checks in unit tests use cmocka assertion macros
dedicated for use with pointers instead of those dedicated for use with
integers or booleans.

(cherry picked from commit f440600126)
2020-11-26 13:13:21 +01:00
Tinderbox User
14620951cc prep 9.16.9 2020-11-26 12:25:53 +01:00
Mark Andrews
328e7474d2 Remove now redundant check for state != NULL
(cherry picked from commit ee135d8946)
2020-11-25 13:21:58 +01:00
Michał Kępień
a452798af4 Convert add_quota() to a function
cppcheck 2.2 reports the following false positive:

    lib/isc/tests/quota_test.c:71:21: error: Array 'quotas[101]' accessed at index 110, which is out of bounds. [arrayIndexOutOfBounds]
     isc_quota_t *quotas[110];
                        ^

The above is not even an array access, so this report is obviously
caused by a cppcheck bug.  Yet, it seems to be triggered by the presence
of the add_quota() macro, which should really be a function.  Convert
the add_quota() macro to a function in order to make the code cleaner
and to prevent the above cppcheck 2.2 false positive from being
triggered.

(cherry picked from commit ea54a932d2)
2020-11-25 13:21:58 +01:00
Michał Kępień
3158a2aead Silence cppcheck 2.2 false positive in udp_recv()
cppcheck 2.2 reports the following false positive:

    lib/dns/dispatch.c:1241:14: warning: Either the condition 'resp==NULL' is redundant or there is possible null pointer dereference: resp. [nullPointerRedundantCheck]
     if (disp != resp->disp) {
                 ^
    lib/dns/dispatch.c:1212:11: note: Assuming that condition 'resp==NULL' is not redundant
     if (resp == NULL) {
              ^
    lib/dns/dispatch.c:1241:14: note: Null pointer dereference
     if (disp != resp->disp) {
                 ^

Apparently this version of cppcheck gets confused about conditional
"goto" statements because line 1241 can never be reached if 'resp' is
NULL.

Move a code block to prevent the above false positive from being
reported without affecting the processing logic.

(cherry picked from commit 0b6216d1c7)
2020-11-25 13:21:58 +01:00
Mark Andrews
b3d259107f Fix DNAME when QTYPE is CNAME or ANY
The synthesised CNAME is not supposed to be followed when the
QTYPE is CNAME or ANY as the lookup is satisfied by the CNAME
record.

(cherry picked from commit e980affba0)
2020-11-19 10:52:29 +11:00
Diego Fronza
73c199dec7 Check 'stale-refresh-time' when sharing cache between views
This commit ensures that, along with previous restrictions, a cache is
shareable between views only if their 'stale-refresh-time' value are
equal.
2020-11-11 16:06:23 -03:00
Diego Fronza
24ec021e50 Warn if 'stale-refresh-time' < 30 (default)
RFC 8767 recommends that attempts to refresh to be done no more
frequently than every 30 seconds.

Added check into named-checkconf, which will warn if values below the
default are found in configuration.

BIND will also log the warning during loading of configuration in the
same fashion.
2020-11-11 16:00:22 -03:00
Diego Fronza
8cc5abff23 Add stale-refresh-time option
Before this update, BIND would attempt to do a full recursive resolution
process for each query received if the requested rrset had its ttl
expired. If the resolution fails for any reason, only then BIND would
check for stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' is on).

The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed in the next seconds.

A better approach to improve performance in those cases, is to mark the
moment in which a resolution failed, and if new queries arrive for that
same rrset, try to respond directly from the stale cache, and do that
for a window of time configured via 'stale-refresh-time'.

Only when this interval expires we then try to do a normal refresh of
the rrset.

The logic behind this commit is as following:

- In query.c / query_gotanswer(), if the test of 'result' variable falls
  to the default case, an error is assumed to have happened, and a call
  to 'query_usestale()' is made to check if serving of stale rrset is
  enabled in configuration.

- If serving of stale answers is enabled, a flag will be turned on in
  the query context to look for stale records:
  query.c:6839
  qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;

- A call to query_lookup() will be made again, inside it a call to
  'dns_db_findext()' is made, which in turn will invoke rbdb.c /
  cache_find().

- In rbtdb.c / cache_find() the important bits of this change is the
  call to 'check_stale_header()', which is a function that yields true
  if we should skip the stale entry, or false if we should consider it.

- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
  is set, if that is the case we know that this new search for stale
  records was made due to a failure in a normal resolution, so we keep
  track of the time in which the failured occured in rbtdb.c:4559:
  header->last_refresh_fail_ts = search->now;

- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
  know this is a normal lookup, if the record is stale and the query
  time is between last failure time + stale-refresh-time window, then
  we return false so cache_find() knows it can consider this stale
  rrset entry to return as a response.

The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh

Those were added so rbtdb can be aware of the value set in configuration
option, since in that level we have no access to the view object.
2020-11-11 15:59:56 -03:00
Mark Andrews
30c96198e8 Address TSAN error between dns_rbt_findnode() and subtractrdataset().
Having dns_rbt_findnode() in previous_closest_nsec() check of
node->data is a optimisation that triggers a TSAN error with
subtractrdataset().  find_closest_nsec() still needs to check if
the NSEC record are active or not and look for a earlier NSEC records
if it isn't.  Set DNS_RBTFIND_EMPTYDATA so node->data isn't referenced
without the node lock being held.

    WARNING: ThreadSanitizer: data race
    Read of size 8 at 0x000000000001 by thread T1 (mutexes: read M1, read M2):
    #0 dns_rbt_findnode lib/dns/rbt.c:1708
    #1 previous_closest_nsec lib/dns/rbtdb.c:3760
    #2 find_closest_nsec lib/dns/rbtdb.c:3942
    #3 zone_find lib/dns/rbtdb.c:4091
    #4 dns_db_findext lib/dns/db.c:536
    #5 query_lookup lib/ns/query.c:5582
    #6 ns__query_start lib/ns/query.c:5505
    #7 query_setup lib/ns/query.c:5229
    #8 ns_query_start lib/ns/query.c:11380
    #9 ns__client_request lib/ns/client.c:2166
    #10 processbuffer netmgr/tcpdns.c:230
    #11 dnslisten_readcb netmgr/tcpdns.c:309
    #12 read_cb netmgr/tcp.c:832
    #13 <null> <null>
    #14 <null> <null>

    Previous write of size 8 at 0x000000000001 by thread T2 (mutexes: write M3):
    #0 subtractrdataset lib/dns/rbtdb.c:7133
    #1 dns_db_subtractrdataset lib/dns/db.c:742
    #2 diff_apply lib/dns/diff.c:368
    #3 dns_diff_apply lib/dns/diff.c:459
    #4 do_one_tuple lib/dns/update.c:247
    #5 update_one_rr lib/dns/update.c:275
    #6 delete_if_action lib/dns/update.c:689
    #7 foreach_rr lib/dns/update.c:471
    #8 delete_if lib/dns/update.c:716
    #9 dns_update_signaturesinc lib/dns/update.c:1948
    #10 receive_secure_serial lib/dns/zone.c:15637
    #11 dispatch lib/isc/task.c:1152
    #12 run lib/isc/task.c:1344
    #13 <null> <null>

    Location is heap block of size 130 at 0x000000000028 allocated by thread T3:
    #0 malloc <null>
    #1 default_memalloc lib/isc/mem.c:713
    #2 mem_get lib/isc/mem.c:622
    #3 mem_allocateunlocked lib/isc/mem.c:1268
    #4 isc___mem_allocate lib/isc/mem.c:1288
    #5 isc__mem_allocate lib/isc/mem.c:2453
    #6 isc___mem_get lib/isc/mem.c:1037
    #7 isc__mem_get lib/isc/mem.c:2432
    #8 create_node lib/dns/rbt.c:2239
    #9 dns_rbt_addnode lib/dns/rbt.c:1202
    #10 dns_rbtdb_create lib/dns/rbtdb.c:8668
    #11 dns_db_create lib/dns/db.c:118
    #12 receive_secure_db lib/dns/zone.c:16154
    #13 dispatch lib/isc/task.c:1152
    #14 run lib/isc/task.c:1344
    #15 <null> <null>

    Mutex M1 (0x000000000040) created at:
    #0 pthread_rwlock_init <null>
    #1 isc_rwlock_init lib/isc/rwlock.c:39
    #2 dns_rbtdb_create lib/dns/rbtdb.c:8527
    #3 dns_db_create lib/dns/db.c:118
    #4 receive_secure_db lib/dns/zone.c:16154
    #5 dispatch lib/isc/task.c:1152
    #6 run lib/isc/task.c:1344
    #7 <null> <null>

    Mutex M2 (0x000000000044) created at:
    #0 pthread_rwlock_init <null>
    #1 isc_rwlock_init lib/isc/rwlock.c:39
    #2 dns_rbtdb_create lib/dns/rbtdb.c:8600
    #3 dns_db_create lib/dns/db.c:118
    #4 receive_secure_db lib/dns/zone.c:16154
    #5 dispatch lib/isc/task.c:1152
    #6 run lib/isc/task.c:1344
    #7 <null> <null>

    Mutex M3 (0x000000000046) created at:
    #0 pthread_rwlock_init <null>
    #1 isc_rwlock_init lib/isc/rwlock.c:39
    #2 dns_rbtdb_create lib/dns/rbtdb.c:8600
    #3 dns_db_create lib/dns/db.c:118
    #4 receive_secure_db lib/dns/zone.c:16154
    #5 dispatch lib/isc/task.c:1152
    #6 run lib/isc/task.c:1344
    #7 <null> <null>

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create pthreads/thread.c:73
    #2 isc_nm_start netmgr/netmgr.c:232
    #3 create_managers bin/named/main.c:909
    #4 setup bin/named/main.c:1223
    #5 main bin/named/main.c:1523

    Thread T2 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create pthreads/thread.c:73
    #2 isc_taskmgr_create lib/isc/task.c:1434
    #3 create_managers bin/named/main.c:915
    #4 setup bin/named/main.c:1223
    #5 main bin/named/main.c:1523

    Thread T3 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create pthreads/thread.c:73
    #2 isc_taskmgr_create lib/isc/task.c:1434
    #3 create_managers bin/named/main.c:915
    #4 setup bin/named/main.c:1223
    #5 main bin/named/main.c:1523

    SUMMARY: ThreadSanitizer: data race lib/dns/rbt.c:1708 in dns_rbt_findnode

(cherry picked from commit 244f84a84b)
2020-11-11 08:21:39 +11:00
Matthijs Mekking
a6755ce7f8 Cleanup duplicate definitions in query.h
(cherry picked from commit 31692744cc47eef7ad6b41aeb53f5566ca6e7efe)
2020-11-10 15:50:20 +01:00
Mark Andrews
14fe29b76d Implement DNSTAP support in ns_client_sendraw()
ns_client_sendraw() is currently only used to relay UPDATE
responses back to the client.  dns_dt_send() is called with
this assumption.

(cherry picked from commit b09727a765)
2020-11-10 17:59:04 +11:00
Mark Andrews
5340176c36 Lock zone before calling zone_namerd_tostr()
WARNING: ThreadSanitizer: data race
    Read of size 8 at 0x000000000001 by thread T1:
    #0 inline_raw lib/dns/zone.c:1375
    #1 zone_namerd_tostr lib/dns/zone.c:15316
    #2 dns_zone_name lib/dns/zone.c:15391
    #3 xfrin_log lib/dns/xfrin.c:1605
    #4 xfrin_destroy lib/dns/xfrin.c:1477
    #5 dns_xfrin_detach lib/dns/xfrin.c:739
    #6 xfrin_connect_done lib/dns/xfrin.c:970
    #7 tcpdnsconnect_cb netmgr/tcpdns.c:786
    #8 tcp_connect_cb netmgr/tcp.c:292
    #9 <null> <null>
    #10 <null> <null>

    Previous write of size 8 at 0x000000000001 by thread T2 (mutexes: write M1):
    #0 zone_shutdown lib/dns/zone.c:14462
    #1 dispatch lib/isc/task.c:1152
    #2 run lib/isc/task.c:1344
    #3 <null> <null>

    Location is heap block of size 2769 at 0x000000000013 allocated by thread T3:
    #0 malloc <null>
    #1 default_memalloc lib/isc/mem.c:713
    #2 mem_get lib/isc/mem.c:622
    #3 mem_allocateunlocked lib/isc/mem.c:1268
    #4 isc___mem_allocate lib/isc/mem.c:1288
    #5 isc__mem_allocate lib/isc/mem.c:2453
    #6 isc___mem_get lib/isc/mem.c:1037
    #7 isc__mem_get lib/isc/mem.c:2432
    #8 dns_zone_create lib/dns/zone.c:984
    #9 configure_zone bin/named/server.c:6502
    #10 do_addzone bin/named/server.c:13391
    #11 named_server_changezone bin/named/server.c:13788
    #12 named_control_docommand bin/named/control.c:207
    #13 control_command bin/named/controlconf.c:392
    #14 dispatch lib/isc/task.c:1152
    #15 run lib/isc/task.c:1344
    #16 <null> <null>

(cherry picked from commit 84f43903da)
2020-11-10 17:16:31 +11:00
Mark Andrews
e554daa76c fctx->id was not initalised 2020-11-09 21:48:22 +00:00
Artem Boldariev
e8106afe43 Fix build with DEBUG defined (-DDEBUG)
The problem was introduced by commit 98b55eb4.
2020-11-06 12:58:19 +02:00
Mark Andrews
b0f477df87 Call nta_detach() before dns_view_weakdetach() so view is available.
(cherry picked from commit ea956976d1)
2020-11-03 23:49:24 +11:00
Michał Kępień
923c443389 Fix getrbp()
The following compiler warning is emitted for the BACKTRACE_X86STACK
part of lib/isc/backtrace.c:

    backtrace.c: In function ‘getrbp’:
    backtrace.c:142:1: warning: no return statement in function returning non-void [-Wreturn-type]

While getrbp() stores the value of the RBP register in the RAX register
and thus does attempt to return a value, this is not enough for an
optimizing compiler to always produce the expected result.  With -O2,
the following machine code may be generated in isc_backtrace_gettrace():

    0x00007ffff7b0ff7a <+10>:	mov    %rbp,%rax
    0x00007ffff7b0ff7d <+13>:	mov    $0x17,%eax
    0x00007ffff7b0ff82 <+18>:	retq

The above is equivalent to:

    sp = (void **)getrbp();
    return (ISC_R_NOTFOUND);

and results in the backtrace never getting printed.

Fix by using an intermediate variable.  With this change in place, the
machine code generated with -O2 becomes something like:

    0x00007ffff7af5638 <+24>:	mov    $0x17,%eax
    0x00007ffff7af563d <+29>:	mov    %rbp,%rdx
    0x00007ffff7af5640 <+32>:	test   %rdx,%rdx
    0x00007ffff7af5643 <+35>:	je     0x7ffff7af56bd <isc_backtrace_gettrace+157>
    ...
    0x00007ffff7af56bd <+157>:	retq

(Note that this method of grabbing a stack trace is finicky anyway
because in order for RBP to be relied upon, -fno-omit-stack-frame must
be present among CFLAGS.)
2020-10-30 09:12:50 +01:00
Michał Kępień
10d7055791 Check for _Unwind_Backtrace() support
Some operating systems (e.g. Linux, FreeBSD) provide the
_Unwind_Backtrace() function in libgcc_s.so, which is automatically
linked into any binary using the functions provided by that library.  On
OpenBSD, though, _Unwind_Backtrace() is provided by libc++abi.so, which
is not automatically linked into binaries produced by the stock system C
compiler.

Meanwhile, lib/isc/backtrace.c assumes that any GNU-compatible toolchain
allows _Unwind_Backtrace() to be used without any extra provisions in
the build system.  This causes build failures on OpenBSD (and possibly
other systems).

Instead of making assumptions, actually check for _Unwind_Backtrace()
support in the toolchain if the backtrace() function is unavailable.
2020-10-30 09:12:50 +01:00
Mark Andrews
903c1136ef Handle DNS_R_NCACHENXRRSET in fetch_callback_{dnskey,validator}()
DNS_R_NCACHENXRRSET can be return when zones are in transition state
from being unsigned to signed and signed to unsigned.  The validation
should be resumed and should result in a insecure answer.

(cherry picked from commit 718e597def)
2020-10-30 08:21:43 +11:00
Witold Kręcicki
e1c75d00b7 Properly handle outer TCP connection closed in TCPDNS.
If the connection is closed while we're processing the request
we might access TCPDNS outerhandle which is already reset. Check
for this condition and call the callback with ISC_R_CANCELED result.

(cherry picked from commit c41ce8e0c9)
2020-10-29 13:21:55 +01:00
Mark Andrews
2a5d2c55aa Hold qid->lock when calling deref_portentry() as
socket_search() need portentry to be unchanging.

    WARNING: ThreadSanitizer: data race
    Write of size 8 at 0x000000000001 by thread T1 (mutexes: write M1):
    #0 deref_portentry lib/dns/dispatch.c:630
    #1 deactivate_dispsocket lib/dns/dispatch.c:861
    #2 udp_recv lib/dns/dispatch.c:1105
    #3 udp_exrecv lib/dns/dispatch.c:1028
    #4 dispatch lib/isc/task.c:1152
    #5 run lib/isc/task.c:1344
    #6 <null> <null>

    Previous read of size 8 at 0x000000000001 by thread T2 (mutexes: write M1, write M2):
    #0 socket_search lib/dns/dispatch.c:661
    #1 get_dispsocket lib/dns/dispatch.c:744
    #2 dns_dispatch_addresponse lib/dns/dispatch.c:3120
    #3 resquery_send lib/dns/resolver.c:2467
    #4 fctx_query lib/dns/resolver.c:2217
    #5 fctx_try lib/dns/resolver.c:4245
    #6 fctx_timeout lib/dns/resolver.c:4570
    #7 dispatch lib/isc/task.c:1152
    #8 run lib/isc/task.c:1344
    #9 <null> <null>

(cherry picked from commit 5c253c416d)
2020-10-24 07:14:47 +11:00
Mark Andrews
2e264a4ae2 DNS_ZONEFLAG_NOIXFR should be DNS_ZONEFLG_NOIXFR
(cherry picked from commit 3a044444bd)
2020-10-24 00:26:25 +11:00
Ondřej Surý
bca8604bf3 Fix the data race when read-writing sock->active by using cmpxchg
(cherry picked from commit 8797e5efd5)
2020-10-22 15:00:07 -07:00
Ondřej Surý
74378ea4f4 Ignore and don't log ISC_R_NOTCONNECTED from uv_accept()
When client disconnects before the connection can be accepted, the named
would log a spurious log message:

    error: Accepting TCP connection failed: socket is not connected

We now ignore the ISC_R_NOTCONNECTED result code and log only other
errors

(cherry picked from commit 5ef71c420f)
2020-10-22 15:00:07 -07:00
Ondřej Surý
301e4145de Fix the isc_nm_closedown() to actually close the pending connections
1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking
   whether the socket was still alive and scheduling reads/sends on
   closed socket.

2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
   changed to always return the error conditions via the callbacks, so
   they always succeed.  This applies to all protocols (UDP, TCP and
   TCPDNS).

(cherry picked from commit f7c82e406e)
2020-10-22 15:00:00 -07:00
Ondřej Surý
5547657bce Fix the way tcp_send_direct() is used
There were two problems how tcp_send_direct() was used:

1. The tcp_send_direct() can return ISC_R_CANCELED (or translated error
   from uv_tcp_send()), but the isc__nm_async_tcpsend() wasn't checking
   the error code and not releasing the uvreq in case of an error.

2. In isc__nm_tcp_send(), when the TCP send is already in the right
   netthread, it uses tcp_send_direct() to send the TCP packet right
   away.  When that happened the uvreq was not freed, and the error code
   was returned to the caller.  We need to return ISC_R_SUCCESS and
   rather use the callback to report an error in such case.

(cherry picked from commit 6af08d1ca6)
2020-10-22 14:59:01 -07:00
Ondřej Surý
e0ebd02b9c Detach the sock->server in uv_close() callback, not before
(cherry picked from commit d72bc3eb52)
2020-10-22 14:59:01 -07:00
Ondřej Surý
e18f3fd003 Explicitly stop reading before closing the nmtcpsocket
When closing the socket that is actively reading from the stream, the
read_cb() could be called between uv_close() and close callback when the
server socket has been already detached hence using sock->statichandle
after it has been already freed.

(cherry picked from commit 97b33e5bde)
2020-10-22 14:59:01 -07:00
Witold Kręcicki
63e923364f Proper handling of socket references in case of TCP conn failure.
(cherry picked from commit ff0a336d52)
2020-10-22 14:59:00 -07:00
Witold Kręcicki
b4e27a075a Don't crash if isc_uv_export returns an error in accept_connection.
isc_uv_export can return an error - e.g. EMFILE (from dup), handle this
nicely.

(cherry picked from commit ae9a6befa8)
2020-10-22 14:59:00 -07:00
Ondřej Surý
81085bbeca Fix the way udp_send_direct() is used
There were two problems how udp_send_direct() was used:

1. The udp_send_direct() can return ISC_R_CANCELED (or translated error
   from uv_udp_send()), but the isc__nm_async_udpsend() wasn't checking
   the error code and not releasing the uvreq in case of an error.

2. In isc__nm_udp_send(), when the UDP send is already in the right
   netthread, it uses udp_send_direct() to send the UDP packet right
   away.  When that happened the uvreq was not freed, and the error code
   was returned to the caller.  We need to return ISC_R_SUCCESS and
   rather use the callback to report an error in such case.

(cherry picked from commit afca2e3b21)
2020-10-22 14:59:00 -07:00
Tinderbox User
44e91206a4 prep 9.16.8 2020-10-22 09:09:07 +02:00
Diego Fronza
d5355b8105 Always return address records in additional section for NS queries 2020-10-21 12:12:22 -03:00
Diego Fronza
7a3dbbc395 Fix transfer of glue records in stub zones if master has minimal-responses set
Stub zones don't make use of AXFR/IXFR for the transfering of zone
data, instead, a single query is issued to the master asking for
their nameserver records (NS).

That works fine unless master is configured with 'minimal-responses'
set to yes, in which case glue records are not provided by master
in the answer with nameservers authoritative for the zone, leaving
stub zones with incomplete databases.

This commit fix this problem in a simple way, when the answer with
the authoritative nameservers is received from master (stub_callback),
for each nameserver listed (save_nsrrset), a A and AAAA records for
the name is verified in the additional section, and if not present
a query is created to resolve the corresponsing missing glue.

A struct 'stub_cb_args' was added to keep relevant information for
performing a query, like TSIG key, udp size, dscp value, etc, this
information is borrowed from, and created within function 'ns_query',
where the resolving of nameserver from master starts.

A new field was added to the struct 'dns_stub', an atomic integer,
namely pending_requests, which is used to keep how many queries are
created when resolving nameserver addresses that were missing in
the glue.

When the value of pending_requests is zero we know we can release
resources, adjust zone timers, dump to zone file, etc.
2020-10-21 12:11:31 -03:00
Matthijs Mekking
5c0b5b64e5 Don't increment network error stats on UV_EOF
When networking statistics was added to the netmgr (in commit
5234a8e00a), two lines were added that
increment the 'STATID_RECVFAIL' statistic: One if 'uv_read_start'
fails and one at the end of the 'read_cb'.  The latter happens
if 'nread < 0'.

According to the libuv documentation, I/O read callbacks (such as for
files and sockets) are passed a parameter 'nread'. If 'nread' is less
than 0, there was an error and 'UV_EOF' is the end of file error, which
you may want to handle differently.

In other words, we should not treat EOF as a RECVFAIL error.

(cherry picked from commit 6c5ff94218)
2020-10-20 14:05:09 +00:00
Mark Andrews
da0a7a34ec Complete the isc_nmhandle_detach() in the worker thread.
isc_nmhandle_detach() needs to complete in the same thread
as shutdown_walk_cb() to avoid a race.  Clear the caller's
pointer then pass control to the worker if necessary.

    WARNING: ThreadSanitizer: data race
    Write of size 8 at 0x000000000001 by thread T1:
    #0 isc_nmhandle_detach lib/isc/netmgr/netmgr.c:1258:15
    #1 control_command bin/named/controlconf.c:388:3
    #2 dispatch lib/isc/task.c:1152:7
    #3 run lib/isc/task.c:1344:2

    Previous read of size 8 at 0x000000000001 by thread T2:
    #0 isc_nm_pauseread lib/isc/netmgr/netmgr.c:1449:33
    #1 recv_data lib/isccc/ccmsg.c:109:2
    #2 isc__nm_tcp_shutdown lib/isc/netmgr/tcp.c:1157:4
    #3 shutdown_walk_cb lib/isc/netmgr/netmgr.c:1515:3
    #4 uv_walk <null>
    #5 process_queue lib/isc/netmgr/netmgr.c:659:4
    #6 process_normal_queue lib/isc/netmgr/netmgr.c:582:10
    #7 process_queues lib/isc/netmgr/netmgr.c:590:8
    #8 async_cb lib/isc/netmgr/netmgr.c:548:2
    #9 <null> <null>

(cherry picked from commit f95ba8aa20)
2020-10-15 11:03:47 +11:00
Ondřej Surý
dbf2d0b15f Clean the last remnant of ISC_PLATFORM_HAVEIPV6 macro
In set_sndbuf() we were using ISC_PLATFORM_HAVEIPV6 macro that doesn't
exist anymore, because we assume that IPv6 support is always available.

(cherry picked from commit 96ac91a18a)
2020-10-08 09:03:08 +02:00
Ondřej Surý
53d6a11a0e Clone the csock in accept_connection(), not in callback
If we clone the csock (children socket) in TCP accept_connection()
instead of passing the ssock (server socket) to the call back and
cloning it there we unbreak the assumption that every socket is handled
inside it's own worker thread and therefore we can get rid of (at least)
callback locking.

(cherry picked from commit e8b56acb49)
2020-10-08 08:16:54 +02:00
Ondřej Surý
69e6c8467c Change the isc__nm_tcpdns_stoplistening() to be asynchronous event
The isc__nm_tcpdns_stoplistening() would call isc__nmsocket_clearcb()
that would clear the .accept_cb from non-netmgr thread.  Change the
tcpdns_stoplistening to enqueue ievent that would get processed in the
right netmgr thread to avoid locking.

(cherry picked from commit d86a74d8a4)
2020-10-08 08:16:53 +02:00
Mark Andrews
84922b2dc7 Restore the dns_message_reset() call before the dns_dispatch_getnext()
This was accidentally lost in the process of moving rmessage from fctx
to query.  Without this dns_message_setclass() will fail.

(cherry picked from commit 1f63bb15b3)
2020-10-08 16:27:10 +11:00
Mark Andrews
33d7b5b56f Silence Coverity REVERSE_INULL report
message does not need to be tested to NULL

(cherry picked from commit f0a66cb5aa)
2020-10-06 23:37:13 +11:00
Ondřej Surý
58a518adca Change the default ENDS buffer size to 1232 for DNS Flag Day 2020
The DNS Flag Day 2020 aims to remove the IP fragmentation problem from
the UDP DNS communication.  In this commit, we implement the minimal
required changes by changing the defaults for `edns-udp-size`,
`max-udp-size` and `nocookie-udp-size` to `1232` (the value picked by
DNS Flag Day 2020).

(cherry picked from commit bb990030d3)
2020-10-06 09:35:20 +02:00
Ondřej Surý
ccd2902a02 Split reusing the addr/port and load-balancing socket options
The SO_REUSEADDR, SO_REUSEPORT and SO_REUSEPORT_LB has different meaning
on different platform. In this commit, we split the function to set the
reuse of address/port and setting the load-balancing into separate
functions.

The libuv library already have multiplatform support for setting
SO_REUSEADDR and SO_REUSEPORT that allows binding to the same address
and port, but unfortunately, when used after the load-balancing socket
options have been already set, it overrides the previous setting, so we
need our own helper function to enable the SO_REUSEADDR/SO_REUSEPORT
first and then enable the load-balancing socket option.

(cherry picked from commit fd975a551d)
2020-10-05 16:19:23 +02:00
Ondřej Surý
5cff119533 Use uv_os_sock_t instead of uv_os_fd_t for sockets
On POSIX based systems both uv_os_sock_t and uv_os_fd_t are both typedef
to int.  That's not true on Windows, where uv_os_sock_t is SOCKET and
uv_os_fd_t is HANDLE and they differ in level of indirection.

(cherry picked from commit acb6ad9e3c)
2020-10-05 16:19:23 +02:00
Ondřej Surý
601fc37efe Refactor isc__nm_socket_freebind() to take fd and sa_family as args
The isc__nm_socket_freebind() has been refactored to match other
isc__nm_socket_...() helper functions and take uv_os_fd_t and
sa_family_t as function arguments.

(cherry picked from commit 9dc01a636b)
2020-10-05 16:19:23 +02:00
Ondřej Surý
c3d721b13b Add helper function to enable DF (don't fragment) flag on UDP sockets
This commits add isc__nm_socket_dontfrag() helper functions.

(cherry picked from commit d685bbc822)
2020-10-05 16:19:23 +02:00
Ondřej Surý
72b85a4100 Add SO_REUSEPORT and SO_INCOMING_CPU helper functions
The setting of SO_REUSE**** and SO_INCOMING_CPU have been moved into a
separate helper functions.

(cherry picked from commit 5daaca7146)
2020-10-05 16:19:23 +02:00
Matthijs Mekking
63652ca58f Use explicit result codes for 'rndc dnssec' cmd
It is better to add new result codes than to overload existing codes.

(cherry picked from commit 70d1ec432f)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
6bbb2a8581 Various rndc dnssec -checkds fixes
While working on 'rndc dnssec -rollover' I noticed the following
(small) issues:

- The key files where updated with hints set to "-when" and that
  should always be "now.
- The kasp system test did not properly update the test number when
  calling 'rndc dnssec -checkds' (and ensuring that works).
- There was a missing ']' in the rndc.c help output.

(cherry picked from commit edc53fc416)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
5bbecc5116 Test rndc rollover inactive key
When users (accidentally) try to roll an inactive key, throw an error.

(cherry picked from commit fcd34abb9e)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
ad48f07c9a Add manual key rollover logic
Add to the keymgr a function that will schedule a rollover. This
basically means setting the time when the key needs to retire,
and updating the key lifetime, then update the state file. The next
time that named runs the keymgr the new lifetime will be taken into
account.

(cherry picked from commit df8276aef0)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
79f9a5ddd5 Change condition for rndc dumpdb -expired
After backporting #1870 to 9.11-S I saw that the condition check there
is different than in the main branch. In 9.11-S "stale" can mean
stale and serve-stale, or not active (awaiting cleanup). In 9.16 and
later versions, "stale" is stale and serve-stale, and "ancient" means
not active (awaiting cleanup). An "ancient" RRset is one that is not
active (TTL expired) and is not eligble for serve-stale.

Update the condition for rndc dumpdb -expired to closer match what is
in 9.11-S.

(cherry picked from commit 5614454c3b)
2020-10-05 10:46:14 +02:00
Matthijs Mekking
456925d6ec Fix kasp min key size bug
The minimal size for RSASHA1, RSASHA256 is 512, but due to bad
assignment it was set to 1024.

(cherry picked from commit 7c555254fe)
2020-10-02 10:18:59 +02:00
Matthijs Mekking
a63dad13da Fix Ed25519 and Ed448 in dnssec-policy keymgr
The kasp code had bad implicit size values for the cryptographic
algorithms Ed25519 and Ed448. When creating keys they would never
match the dnssec-policy, leading to new attempts to create keys.

These algorithms were previously not yet added to the system tests,
due to lack of availability on some systems.

(cherry picked from commit 0e207392ec)
2020-10-02 10:18:25 +02:00
Michał Kępień
9e62c206c6 Allow "order none" in "rrset-order" rules
named-checkconf treats the following configuration as valid:

    options {
        rrset-order {
            order none;
        };
    };

Yet, the above configuration causes named to crash on startup with:

    order.c:74: REQUIRE(mode == 0x00000800 || mode == 0x00000400 || mode == 0x00800000) failed, back trace

Add DNS_RDATASETATTR_NONE to the list of RRset ordering modes accepted
by dns_order_add() to allow "order none" to be used in "rrset-order"
rules.  This both prevents the aforementioned crashes and addresses the
discrepancy between named-checkconf and named.

(cherry picked from commit dbcf683c1a)
2020-10-02 08:50:51 +02:00
Ondřej Surý
50db10b7ca Fix the clang 12 warnings with multi-line strings in string arrays
The clang 12 has a new warning that warns when using multi-line strings
in the string arrays, f.e.:

    { "aa",
      "b"
      "b",
      "cc" }

would generate warning like this:

    private_test.c:162:7: error: suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma? [-Werror,-Wstring-concatenation]
                                      "33333/RSASHA1" };
                                      ^
    private_test.c:161:7: note: place parentheses around the string literal to silence warning
                                      "Done removing signatures for key "
                                      ^
    private_test.c:197:7: error: suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma? [-Werror,-Wstring-concatenation]
                                      "NSEC chain",
                                      ^
    private_test.c:196:7: note: place parentheses around the string literal to silence warning
                                      "Removing NSEC3 chain 1 0 30 DEAF / creating "
                                      ^
    2 errors generated.

(cherry picked from commit 7b07f22969)
2020-10-01 18:42:11 +02:00
Ondřej Surý
7a90ad1fe2 Add separate prefetch nmhandle to ns_client_t
As the query_prefetch() or query_rpzfetch() could be called during
"regular" fetch, we need to introduce separate storage for attaching
the nmhandle during prefetching the records.  The query_prefetch()
and query_rpzfetch() are guarded for re-entrance by .query.prefetch
member of ns_client_t, so we can reuse the same .prefetchhandle for
both.

(cherry picked from commit d4976e0ebe)
2020-10-01 18:09:35 +02:00
Ondřej Surý
1126fe3b5b Refactor the pausing/unpausing and finishing the nm_thread
The isc_nm_pause(), isc_nm_resume() and finishing the nm_thread() from
nm_destroy() has been refactored, so all use the netievents instead of
directly touching the worker structure members.  This allows us to
remove most of the locking as the .paused and .finished members are
always accessed from the matching nm_thread.

When shutting down the nm_thread(), instead of issuing uv_stop(), we
just shutdown the .async handler, so all uv_loop_t events are properly
finished first and uv_run() ends gracefully with no outstanding active
handles in the loop.

(cherry picked from commit e5ab137ba3)
2020-10-01 18:09:35 +02:00
Witold Kręcicki
4a7dfd69ac tracing of active sockets and handles
If NETMGR_TRACE is defined, we now maintain a list of active sockets
in the netmgr object and a list of active handles in each socket
object; by walking the list and printing `backtrace` in a debugger
we can see where they were created, to assist in in debugging of
reference counting errors.

On shutdown, if netmgr finds there are still active sockets after
waiting, isc__nm_dump_active() will be called to log the list of
active sockets and their underlying handles, along with some details
about them.

(cherry picked from commit 00e04a86c8)
2020-10-01 18:09:35 +02:00
Evan Hunt
686b73ae25 limit the time we wait for netmgr to be destroyed
if more than 10 seconds pass while we wait for netmgr events to
finish running on shutdown, something is almost certainly wrong
and we should assert and crash.

(cherry picked from commit 2f2d60a989)
2020-10-01 18:09:35 +02:00
Ondřej Surý
5a92958fba properly lock the setting/unsetting of callbacks in isc_nmsocket_t
changes to socket callback functions were not thread safe.

(cherry picked from commit 89c534d3b9)
2020-10-01 18:09:35 +02:00
Evan Hunt
ba2e9dfb99 change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.

A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:

        - reqhandle:    held while waiting for a request callback (query,
                        notify, update)
        - sendhandle:   held while waiting for a send callback
        - fetchhandle:  held while waiting for a recursive fetch to
                        complete
        - updatehandle: held while waiting for an update-forwarding
                        task to complete

(cherry picked from commit 57b4dde974)
2020-10-01 18:09:35 +02:00
Witold Kręcicki
0202b289c2 assorted small netmgr-related changes
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
  indicate whether it is to be used as a client or server socket

(cherry picked from commit 7eb4564895)
2020-10-01 16:44:43 +02:00
Evan Hunt
7a4e97ef50 Use different allocators for UDP and TCP
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.

This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.

(cherry picked from commit 38264b6a4d)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
f0b089d922 netmgr: retry binding with IP_FREEBIND when EADDRNOTAVAIL is returned.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.

(cherry picked from commit a0f7d28967)
2020-10-01 16:44:43 +02:00
Evan Hunt
bc5ea9d65e use handles for isc_nm_pauseread() and isc_nm_resumeread()
by having these functions act on netmgr handles instead of socket
objects, they can be used in callback functions outside the netgmr.

(cherry picked from commit 55896df79d)
2020-10-01 16:44:43 +02:00
Evan Hunt
6b77bd309a Don't destroy a non-closed socket, wait for all the callbacks.
We erroneously tried to destroy a socket after issuing
isc__nm_tcp{,dns}_close. Under some (race) circumstances we could get
nm_socket_cleanup to be called twice for the same socket, causing an
access to a dead memory.

(cherry picked from commit 233f134a4f)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
5a0fbc41ec Fix possible race in isc__nm_tcpconnect.
There's a possibility of race in isc__nm_tcpconnect if the asynchronous
connect operation finishes with all the callbacks before we exit the
isc__nm_tcpconnect itself we might access an already freed memory.
Fix it by creating an additional reference to the socket freed at the
end of isc__nm_tcpconnect.

(cherry picked from commit 896db0f419)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
ebb4d506d6 Add missing libisc.def definitions, netmgr version of isc_sockettype_t.
(cherry picked from commit 25f84ffc68)
2020-10-01 16:44:43 +02:00
Evan Hunt
b14cb9e2f1 restore "blackhole" functionality
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.

in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().

(cherry picked from commit 23c7373d68)
2020-10-01 16:44:43 +02:00
Evan Hunt
80569bf977 Make netmgr tcpdns send calls asynchronous
isc__nm_tcpdns_send() was not asynchronous and accessed socket
internal fields in an unsafe manner, which could lead to a race
condition and subsequent crash. Fix it by moving tcpdns processing
to a proper netmgr thread.

(cherry picked from commit 591b79b597)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
3942b226b8 Fix a shutdown race in netmgr udp
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process; otherwise we might destroy the
callback argument before we actually stop listening, and call
the callback on bad memory.

(cherry picked from commit 1cf65cd882)
2020-10-01 16:44:43 +02:00
Evan Hunt
ca39572e5d clean up outerhandle when a tcpdns socket is disconnected
this prevents a crash when some non-netmgr thread, such as a
recursive lookup, times out after the TCP socket is already
disconnected.

(cherry picked from commit 3704c4fff2)
2020-10-01 16:44:43 +02:00
Evan Hunt
f64a881a30 change the signature of recv callbacks to include a result code
this will allow recv event handlers to distinguish between cases
in which the region is NULL because of error, shutdown, or cancelation.

(cherry picked from commit 75c985c07f)
2020-10-01 16:44:43 +02:00
Evan Hunt
d9d482e9e2 implement isc_nm_cancelread()
The isc_nm_cancelread() function cancels reading on a connected
socket and calls its read callback function with a 'result'
parameter of ISC_R_CANCELED.

(cherry picked from commit 5191ec8f86)
2020-10-01 16:44:43 +02:00
Evan Hunt
e1ebbaacea shorten the sleep in isc_nm_destroy()
when isc_nm_destroy() is called, there's a loop that waits for
other references to be detached, pausing and unpausing the netmgr
to ensure that all the workers' events are run, followed by a
1-second sleep. this caused a delay on shutdown which will be
noticeable when netmgr is used in tools other than named itself,
so the delay has now been reduced to a hundredth of a second.

(cherry picked from commit 870204fe47)
2020-10-01 16:44:43 +02:00
Evan Hunt
a9061ea123 implement isc_nm_tcpconnect()
the isc_nm_tcpconnect() function establishes a client connection via
TCP.  once the connection is esablished, a callback function will be
called with a newly created network manager handle.

(cherry picked from commit abbb79f9d1)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
8db2ef9f8e allow tcpdns sockets to self-reference while connected
A TCPDNS socket creates a handle for each complete DNS message.

Previously, when all the handles were disconnected, the socket
would be closed, but the wrapped TCP socket might still have
more to read.

Now, when a connection is established, the TCPDNS socket creates
a reference to itself by attaching itself to sock->self. This
reference isn't cleared until the connection is closed via
EOF, timeout, or server shutdown. This allows the socket to remain
open even when there are no active handles for it.

(cherry picked from commit cd79b49538)
2020-10-01 16:44:43 +02:00
Evan Hunt
4209f051e9 modify reference counting within netmgr
- isc__nmhandle_get() now attaches to the sock in the nmhandle object.
  the caller is responsible for dereferencing the original socket
  pointer when necessary.
- tcpdns listener sockets attach sock->outer to the outer tcp listener
  socket. tcpdns connected sockets attach sock->outerhandle to the handle
  for the tcp connected socket.
- only listener sockets need to be attached/detached directly. connected
  sockets should only be accessed and reference-counted via their
  associated handles.

(cherry picked from commit 5ea26ee1f1)
2020-10-01 16:44:43 +02:00
Evan Hunt
573bcdf932 make isc_nmsocket_{attach,detach}{} functions private
there is no need for a caller to reference-count socket objects.
they need tto be able tto close listener sockets (i.e., those
returned by isc_nm_listen{udp,tcp,tcpdns}), and an isc_nmsocket_close()
function has been added for that. other sockets are only accessed via
handles.

(cherry picked from commit 9e740cad21)
2020-10-01 16:44:43 +02:00
Ondřej Surý
826ddb246e Revert the tree to allow cherry-picking netmgr changes from main
The following reverted changes will be picked again as part of the
netmgr sync with main branch.

Revert "Merge branch '1996-confidential-issue-v9_16' into 'security-v9_16'"

This reverts commit e160b1509f, reversing
changes made to c01e643715.

Revert "Merge branch '2038-use-freebind-when-bind-fails-v9_16' into 'v9_16'"

This reverts commit 5f8ecfb918, reversing
changes made to 23021385d5.

Revert "Merge branch '1936-blackhole-fix-v9_16' into 'v9_16'"

This reverts commit f20bc90a72, reversing
changes made to 490016ebf1.

Revert "Merge branch '1938-fix-udp-race' into 'v9_16'"

This reverts commit 0a6c7ab2a9, reversing
changes made to 4ea84740e6.

Revert "Merge branch '1947-fix-tcpdns-race' into 'v9_16'"

This reverts commit 4ea84740e6, reversing
changes made to d761cd576b.
2020-10-01 16:44:43 +02:00
Mark Andrews
2b4f4cbbd0 Add the ability select individual tests to rdata_test
(cherry picked from commit 6293682020)
2020-10-01 22:57:47 +10:00
Mark Andrews
119630ec4b Add the ability to print out the list of test names (-l)
(cherry picked from commit a9c3374717)
2020-10-01 22:57:46 +10:00
Mark Andrews
6583a9437f Add the ability to select tests to run
task_test [-t <test_name>]

(cherry picked from commit 76837484e7)
2020-10-01 22:57:43 +10:00
Mark Andrews
8746e496c7 Alphabetise tests
(cherry picked from commit 96febe6b38)
2020-10-01 22:56:16 +10:00
Mark Andrews
fc3cab22a4 Add missing rwlock calls when access keynode.initial and keynode.managed
WARNING: ThreadSanitizer: data race
    Write of size 1 at 0x000000000001 by thread T1 (mutexes: write M1):
    #0 dns_keynode_trust lib/dns/keytable.c:836
    #1 keyfetch_done lib/dns/zone.c:10187
    #2 dispatch lib/isc/task.c:1152
    #3 run lib/isc/task.c:1344
    #4 <null> <null>

    Previous read of size 1 at 0x000000000001 by thread T2 (mutexes: read M2):
    #0 keynode_dslist_totext lib/dns/keytable.c:682
    #1 dns_keytable_totext lib/dns/keytable.c:732
    #2 named_server_dumpsecroots bin/named/server.c:11357
    #3 named_control_docommand bin/named/control.c:264
    #4 control_command bin/named/controlconf.c:390
    #5 dispatch lib/isc/task.c:1152
    #6 run lib/isc/task.c:1344
    #7 <null> <null>

    Location is heap block of size 241 at 0x000000000010 allocated by thread T3:
    #0 malloc <null>
    #1 default_memalloc lib/isc/mem.c:713
    #2 mem_get lib/isc/mem.c:622
    #3 mem_allocateunlocked lib/isc/mem.c:1268
    #4 isc___mem_allocate lib/isc/mem.c:1288
    #5 isc__mem_allocate lib/isc/mem.c:2453
    #6 isc___mem_get lib/isc/mem.c:1037
    #7 isc__mem_get lib/isc/mem.c:2432
    #8 new_keynode lib/dns/keytable.c:346
    #9 insert lib/dns/keytable.c:393
    #10 dns_keytable_add lib/dns/keytable.c:421
    #11 process_key bin/named/server.c:955
    #12 load_view_keys bin/named/server.c:983
    #13 configure_view_dnsseckeys bin/named/server.c:1140
    #14 configure_view bin/named/server.c:5371
    #15 load_configuration bin/named/server.c:9110
    #16 loadconfig bin/named/server.c:10310
    #17 named_server_reconfigcommand bin/named/server.c:10693
    #18 named_control_docommand bin/named/control.c:250
    #19 control_command bin/named/controlconf.c:390
    #20 dispatch lib/isc/task.c:1152
    #21 run lib/isc/task.c:1344
    #22 <null> <null>

    Mutex M1 is already destroyed.

    Mutex M2 is already destroyed.

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create pthreads/thread.c:73
    #2 isc_taskmgr_create lib/isc/task.c:1434
    #3 create_managers bin/named/main.c:915
    #4 setup bin/named/main.c:1223
    #5 main bin/named/main.c:1523

    Thread T2 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create pthreads/thread.c:73
    #2 isc_taskmgr_create lib/isc/task.c:1434
    #3 create_managers bin/named/main.c:915
    #4 setup bin/named/main.c:1223
    #5 main bin/named/main.c:1523

    Thread T3 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create pthreads/thread.c:73
    #2 isc_taskmgr_create lib/isc/task.c:1434
    #3 create_managers bin/named/main.c:915
    #4 setup bin/named/main.c:1223
    #5 main bin/named/main.c:1523

    SUMMARY: ThreadSanitizer: data race lib/dns/keytable.c:836 in dns_keynode_trust

(cherry picked from commit 840cf7adb3)
2020-10-01 18:14:37 +10:00
Mark Andrews
ebf16118df Add ISO time stamps to the microsecond
(cherry picked from commit 519b070618)
2020-10-01 00:14:45 +10:00
Ondřej Surý
f0989bdf03 The dns_message_create() cannot fail, change the return to void
The dns_message_create() function cannot soft fail (as all memory
allocations either succeed or cause abort), so we change the function to
return void and cleanup the calls.

(cherry picked from commit 33eefe9f85)
2020-09-30 14:26:26 +02:00
Diego Fronza
f557681472 Properly handling dns_message_t shared references
This commit fix the problems that arose when moving the dns_message_t
object from fetchctx_t to the query structure.

Since the lifetime of query objects are different than that of a
fetchctx and the dns_message_t object held by the query may be being
used by some external module, e.g. validator, even after the query
may have been destroyed, propery handling of the references to the
message were added in this commit to avoid accessing an already
destroyed object.

Specifically, in rctx_done(), a reference to the message is attached
at the beginning of the function and detached at the end, since a
possible call to fctx_cancelquery() would release the dns_message_t
object, and in the next lines of code a call to rctx_nextserver()
or rctx_chaseds() would require a valid pointer to the same object.

In valcreate() a new reference is attached to the message object,
this ensures that if the corresponding query object is destroyed
before the validator attempts to access it, no invalid pointer
access occurs.

In validated() we have to attach a new reference to the message,
since we destroy the validator object at the beginning of the
function, and we need access to the message in the next lines of
the same function.

rctx_nextserver() and rctx_chaseds() functions were adapted to
receive a new parameter of dns_message_t* type, this was so they
could receive a valid reference to a dns_message_t since using the
response context respctx_t to access the message through
rctx->query->rmessage could lead to an already released reference
due to the query being canceled.

(cherry picked from commit cde6227a68)
2020-09-30 11:35:11 +10:00
Diego Fronza
dfa2b7a247 Fix invalid dns message state in resolver's logic
The assertion failure REQUIRE(msg->state == DNS_SECTION_ANY), caused
by calling dns_message_setclass within function resquery_response()
in resolver.c, was happening due to wrong management of dns message_t
objects used to process responses to the queries issued by the
resolver.

Before the fix, a resolver's fetch context (fetchctx_t) would hold
a pointer to the message, this same reference would then be used
over all the attempts to resolve the query, trying next server,
etc... for this to work the message object would have it's state
reset between each iteration, marking it as ready for a new processing.

The problem arose in a scenario with many different forwarders
configured, managing the state of the dns_message_t object was
lacking better synchronization, which have led it to a invalid
dns_message_t state in resquery_response().

Instead of adding unnecessarily complex code to synchronize the
object, the dns_message_t object was moved from fetchctx_t structure
to the query structure, where it better belongs to, since each query
will produce a response, this way whenever a new query is created
an associated dns_messate_t is also created.

This commit deals mainly with moving the dns_message_t object from
fetchctx_t to the query structure.

(cherry picked from commit 02f9e125c1)
2020-09-30 11:34:57 +10:00
Diego Fronza
da84f8d1fd Refactored dns_message_t for using attach/detach semantics
This commit will be used as a base for the next code updates in
order to have a better control of dns_message_t objects' lifetime.

(cherry picked from commit 12d6d13100)
2020-09-30 11:34:42 +10:00
Mark Andrews
0c5191f27a Update comments to have binary notation
(cherry picked from commit 6727e23a47)
2020-09-29 10:40:56 +10:00
Michał Kępień
e05e5d7c12 Clean up use of function wrapping
Currently, building BIND using "--without-dlopen" universally breaks
building unit tests which employ the --wrap linker option (because the
replacement functions are put in a shared library and building shared
objects requires "--with-dlopen").  Fix by moving the overridden symbol,
isc_nmhandle_unref(), to lib/ns/tests/nstest.c and dropping
lib/ns/tests/wrap.c altogether.  This makes lib/ns/tests/Makefile.in
simpler and prevents --without-dlopen from messing with the process of
building unit tests.

Remove parts of configure.ac which are made redundant by the above
changes.

Put the replacement definition of isc_nmhandle_unref() inside an #ifdef
block, so that the build does not break for non-libtool builds (see
below).

These changes allow the broadest possible set of build variants to work
while also simplifying the build process:

  - for libtool builds, overriding isc_nmhandle_unref() is done by
    placing that symbol directly in lib/ns/tests/nstest.c and relying on
    the dynamic linker to perform symbol resolution in the expected way
    when the test binary is run,

  - for non-libtool builds, overriding isc_nmhandle_unref() is done
    using the --wrap linker option (the libtool approach cannot be used
    in this case as multiple strong symbols with the same name cannot
    coexist in the same binary),

  - the "--without-dlopen" option no longer affects building unit tests.
2020-09-28 09:16:48 +02:00
Evan Hunt
50cc4d6a3e Purge memory pool upon plugin destruction
The typical sequence of events for AAAA queries which trigger recursion
for an A RRset at the same name is as follows:

 1. Original query context is created.
 2. An AAAA RRset is found in cache.
 3. Client-specific data is allocated from the filter-aaaa memory pool.
 4. Recursion is triggered for an A RRset.
 5. Original query context is torn down.

 6. Recursion for an A RRset completes.
 7. A second query context is created.
 8. Client-specific data is retrieved from the filter-aaaa memory pool.
 9. The response to be sent is processed according to configuration.
10. The response is sent.
11. Client-specific data is returned to the filter-aaaa memory pool.
12. The second query context is torn down.

However, steps 6-12 are not executed if recursion for an A RRset is
canceled.  Thus, if named is in the process of recursing for A RRsets
when a shutdown is requested, the filter-aaaa memory pool will have
outstanding allocations which will never get released.  This in turn
leads to a crash since every memory pool must not have any outstanding
allocations by the time isc_mempool_destroy() is called.

Fix by creating a stub query context whenever fetch_callback() is called,
including cancellation events. When the qctx is destroyed, it will ensure
the client is detached and the plugin memory is freed.

(cherry picked from commit 86eddebc83)
2020-09-25 14:04:54 -07:00
Matthijs Mekking
f521948b2b rndc dumpdb -expired: print when RRsets expired
When calling 'rndc dumpdb -expired', also print when the RRset expired.

(cherry picked from commit d14c2d0d73)
2020-09-25 08:21:24 +02:00
Matthijs Mekking
02b53d38af Handle ancient rrsets in bind_rdataset
An ancient RRset is one still in the cache but expired, and awaiting
cleanup.

(cherry picked from commit 388cc666e5)
2020-09-25 08:21:02 +02:00
Matthijs Mekking
c139f1c23b Include expired rdatasets in iteration functions
By changing the check in 'rdatasetiter_first' and 'rdatasetiter_next'
from "now > header->rdh_ttl" to "now - RBDTB_VIRTUAL > header->rdh_ttl"
we include expired rdataset entries so that they can be used for
"rndc dumpdb -expired".

(cherry picked from commit 17d5bd4493)
2020-09-25 08:20:46 +02:00
Matthijs Mekking
d77283ff63 Add -expired flag to rndc dumpdb command
This flag is the same as -cache, but will use a different style format
that will also print expired entries (awaiting cleanup) from the cache.

(cherry picked from commit 8beda7d2ea)
2020-09-25 08:20:02 +02:00
Mark Andrews
c4edcaf140 It appears that you can't change what you are polling for while connecting.
WARNING: ThreadSanitizer: data race
    Read of size 8 at 0x000000000001 by thread T1 (mutexes: write M1):
    #0 epoll_ctl <null>
    #1 watch_fd lib/isc/unix/socket.c:704:8
    #2 wakeup_socket lib/isc/unix/socket.c:897:11
    #3 process_ctlfd lib/isc/unix/socket.c:3362:3
    #4 process_fds lib/isc/unix/socket.c:3275:10
    #5 netthread lib/isc/unix/socket.c:3516:10

    Previous write of size 8 at 0x000000000001 by thread T2 (mutexes: write M2):
    #0 connect <null>
    #1 isc_socket_connect lib/isc/unix/socket.c:4737:7
    #2 resquery_send lib/dns/resolver.c:2892:13
    #3 fctx_query lib/dns/resolver.c:2202:12
    #4 fctx_try lib/dns/resolver.c:4300:11
    #5 resquery_connected lib/dns/resolver.c:3130:4
    #6 dispatch lib/isc/task.c:1152:7
    #7 run lib/isc/task.c:1344:2

    Location is file descriptor 513 created by thread T2 at:
    #0 connect <null>
    #1 isc_socket_connect lib/isc/unix/socket.c:4737:7
    #2 resquery_send lib/dns/resolver.c:2892:13
    #3 fctx_query lib/dns/resolver.c:2202:12
    #4 fctx_try lib/dns/resolver.c:4300:11
    #5 resquery_connected lib/dns/resolver.c:3130:4
    #6 dispatch lib/isc/task.c:1152:7
    #7 run lib/isc/task.c:1344:2

    Mutex M1 (0x000000000016) created at:
    #0 pthread_mutex_init <null>
    #1 isc__mutex_init lib/isc/pthreads/mutex.c:288:8
    #2 setup_thread lib/isc/unix/socket.c:3584:3
    #3 isc_socketmgr_create2 lib/isc/unix/socket.c:3825:3
    #4 create_managers bin/named/main.c:932:11
    #5 setup bin/named/main.c:1223:11
    #6 main bin/named/main.c:1523:2

    Mutex M2 is already destroyed.

    Thread T1 'isc-socket-1' (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_socketmgr_create2 lib/isc/unix/socket.c:3826:3
    #3 create_managers bin/named/main.c:932:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    Thread T2 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: data race in epoll_ctl

(cherry picked from commit c37b251eb9)
2020-09-23 14:22:15 +10:00
Mark Andrews
9bd58a1c7a Address lock order inversions.
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000000) => M2 (0x000000000000) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_mutex_lock <null>
    #1 dns_view_findzonecut lib/dns/view.c:1310:2
    #2 fctx_create lib/dns/resolver.c:5070:13
    #3 dns_resolver_createfetch lib/dns/resolver.c:10813:12
    #4 dns_resolver_prime lib/dns/resolver.c:10442:12
    #5 dns_view_find lib/dns/view.c:1176:4
    #6 dbfind_name lib/dns/adb.c:3833:11
    #7 dns_adb_createfind lib/dns/adb.c:3155:12
    #8 findname lib/dns/resolver.c:3497:11
    #9 fctx_getaddresses lib/dns/resolver.c:3808:3
    #10 fctx_try lib/dns/resolver.c:4197:12
    #11 fctx_start lib/dns/resolver.c:4824:4
    #12 dispatch lib/isc/task.c:1152:7
    #13 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null>
    #1 dns_resolver_createfetch lib/dns/resolver.c:10767:2
    #2 dns_resolver_prime lib/dns/resolver.c:10442:12
    #3 dns_view_find lib/dns/view.c:1176:4
    #4 dbfind_name lib/dns/adb.c:3833:11
    #5 dns_adb_createfind lib/dns/adb.c:3155:12
    #6 findname lib/dns/resolver.c:3497:11
    #7 fctx_getaddresses lib/dns/resolver.c:3808:3
    #8 fctx_try lib/dns/resolver.c:4197:12
    #9 fctx_start lib/dns/resolver.c:4824:4
    #10 dispatch lib/isc/task.c:1152:7
    #11 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M2 in thread T1:
    #0 pthread_mutex_lock <null>
    #1 dns_resolver_shutdown lib/dns/resolver.c:10530:4
    #2 view_flushanddetach lib/dns/view.c:632:4
    #3 dns_view_detach lib/dns/view.c:689:2
    #4 qctx_destroy lib/ns/query.c:5152:2
    #5 fetch_callback lib/ns/query.c:5749:3
    #6 dispatch lib/isc/task.c:1152:7
    #7 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null>
    #1 view_flushanddetach lib/dns/view.c:630:3
    #2 dns_view_detach lib/dns/view.c:689:2
    #3 qctx_destroy lib/ns/query.c:5152:2
    #4 fetch_callback lib/ns/query.c:5749:3
    #5 dispatch lib/isc/task.c:1152:7
    #6 run lib/isc/task.c:1344:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) in pthread_mutex_lock

(cherry picked from commit a669c919c8)
2020-09-23 01:49:56 +00:00
Mark Andrews
9e79a7d7ce Clone the saved / query message buffers
The message buffer passed to ns__client_request is only valid for
the life of the the ns__client_request call.  Save a copy of it
when we recurse or process a update as ns__client_request will
return before those operations complete.

(cherry picked from commit f0d9bf7c30)
2020-09-23 11:17:23 +10:00
Mark Andrews
0b861934b4 Address lock-order-inversion
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000001) => M2 (0x000000000002) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_rwlock_wrlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:52:4
    #2 zone_postload lib/dns/zone.c:5101:2
    #3 receive_secure_db lib/dns/zone.c:16206:11
    #4 dispatch lib/isc/task.c:1152:7
    #5 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null>
    #1 receive_secure_db lib/dns/zone.c:16204:2
    #2 dispatch lib/isc/task.c:1152:7
    #3 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M2 in thread T1:
    #0 pthread_mutex_lock <null>
    #1 get_raw_serial lib/dns/zone.c:2518:2
    #2 zone_gotwritehandle lib/dns/zone.c:2559:4
    #3 dispatch lib/isc/task.c:1152:7
    #4 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 zone_gotwritehandle lib/dns/zone.c:2552:2
    #3 dispatch lib/isc/task.c:1152:7
    #4 run lib/isc/task.c:1344:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) in pthread_rwlock_wrlock

(cherry picked from commit 1090876693)
2020-09-22 22:26:53 +10:00
Mark Andrews
d04d43c777 Remove the memmove call on dns_rbtnode_t structure that contains atomics
Calling the plain memmove on the structure that contains atomic members
triggers following TSAN warning (even when we don't really use the
atomic members in the code):

    WARNING: ThreadSanitizer: data race
      Read of size 8 at 0x000000000001 by thread T1 (mutexes: write M1, write M2):
	#0 memmove <null>
	#1 memmove /usr/include/x86_64-linux-gnu/bits/string_fortified.h:40:10
	#2 deletefromlevel lib/dns/rbt.c:2675:3
	#3 dns_rbt_deletenode lib/dns/rbt.c:2143:2
	#4 delete_node lib/dns/rbtdb.c
	#5 decrement_reference lib/dns/rbtdb.c:2202:4
	#6 prune_tree lib/dns/rbtdb.c:2259:3
	#7 dispatch lib/isc/task.c:1152:7
	#8 run lib/isc/task.c:1344:2

      Previous atomic write of size 8 at 0x000000000001 by thread T2 (mutexes: read M3):
	#0 __tsan_atomic64_fetch_sub <null>
	#1 decrement_reference lib/dns/rbtdb.c:2103:7
	#2 detachnode lib/dns/rbtdb.c:5440:6
	#3 dns_db_detachnode lib/dns/db.c:588:2
	#4 qctx_clean lib/ns/query.c:5104:3
	#5 ns_query_done lib/ns/query.c:10868:2
	#6 query_sign_nodata lib/ns/query.c
	#7 query_nodata lib/ns/query.c:8438:11
	#8 query_gotanswer lib/ns/query.c
	#9 query_lookup lib/ns/query.c:5624:10
	#10 ns__query_start lib/ns/query.c:5500:10
	#11 query_setup lib/ns/query.c:5224:11
	#12 ns_query_start lib/ns/query.c:11357:8
	#13 ns__client_request lib/ns/client.c:2166:3
	#14 udp_recv_cb lib/isc/netmgr/udp.c:414:2
	#15 uv__udp_recvmsg /home/ondrej/Projects/tsan/libuv/src/unix/udp.c
	#16 uv__udp_io /home/ondrej/Projects/tsan/libuv/src/unix/udp.c:180:5
	#17 uv__io_poll /home/ondrej/Projects/tsan/libuv/src/unix/linux-core.c:461:11
	#18 uv_run /home/ondrej/Projects/tsan/libuv/src/unix/core.c:385:5
	#19 nm_thread lib/isc/netmgr/netmgr.c:500:11

      Location is heap block of size 132 at 0x000000000030 allocated by thread T3:
	#0 malloc <null>
	#1 default_memalloc lib/isc/mem.c:713:8
	#2 mem_get lib/isc/mem.c:622:8
	#3 mem_allocateunlocked lib/isc/mem.c:1268:8
	#4 isc___mem_allocate lib/isc/mem.c:1288:7
	#5 isc__mem_allocate lib/isc/mem.c:2453:10
	#6 isc___mem_get lib/isc/mem.c:1037:11
	#7 isc__mem_get lib/isc/mem.c:2432:10
	#8 create_node lib/dns/rbt.c:2239:9
	#9 dns_rbt_addnode lib/dns/rbt.c:1435:12
	#10 findnodeintree lib/dns/rbtdb.c:2895:12
	#11 findnode lib/dns/rbtdb.c:2941:10
	#12 dns_db_findnode lib/dns/db.c:439:11
	#13 diff_apply lib/dns/diff.c:306:5
	#14 dns_diff_apply lib/dns/diff.c:459:10
	#15 do_one_tuple lib/ns/update.c:444:11
	#16 update_one_rr lib/ns/update.c:495:10
	#17 update_action lib/ns/update.c:3123:6
	#18 dispatch lib/isc/task.c:1152:7
	#19 run lib/isc/task.c:1344:2

      Mutex M1 is already destroyed.

      Mutex M2 is already destroyed.

      Mutex M3 is already destroyed.

      Thread T1 (running) created by main thread at:
	#0 pthread_create <null>
	#1 isc_thread_create lib/isc/pthreads/thread.c:73:8
	#2 isc_taskmgr_create lib/isc/task.c:1434:3
	#3 create_managers bin/named/main.c:915:11
	#4 setup bin/named/main.c:1223:11
	#5 main bin/named/main.c:1523:2

      Thread T2 (running) created by main thread at:
	#0 pthread_create <null>
	#1 isc_thread_create lib/isc/pthreads/thread.c:73:8
	#2 isc_nm_start lib/isc/netmgr/netmgr.c:223:3
	#3 create_managers bin/named/main.c:909:15
	#4 setup bin/named/main.c:1223:11
	#5 main bin/named/main.c:1523:2

      Thread T3 (running) created by main thread at:
	#0 pthread_create <null>
	#1 isc_thread_create lib/isc/pthreads/thread.c:73:8
	#2 isc_taskmgr_create lib/isc/task.c:1434:3
	#3 create_managers bin/named/main.c:915:11
	#4 setup bin/named/main.c:1223:11
	#5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: data race in memmove

(cherry picked from commit 48d54368d5)
2020-09-21 19:21:28 +10:00
Ondřej Surý
0ad77036d9 Handle the errors from sysconf() call in isc_meminfo_totalphys()
isc_meminfo_totalphys() would return invalid memory size when sysconf()
call would fail, because ((size_t)-1 * -1) is very large number.

(cherry picked from commit 79ca724d46)
2020-09-21 10:58:37 +02:00
Michał Kępień
170b869294 Fix updating summary RPZ DB for mixed-case RPZs
Each dns_rpz_zone_t structure keeps a hash table of the names this RPZ
database contains.  Here is what happens when an RPZ is updated:

  - a new hash table is prepared for the new version of the RPZ by
    iterating over it; each name found is added to the summary RPZ
    database,

  - every name added to the new hash table is searched for in the old
    hash table; if found, it is removed from the old hash table,

  - the old hash table is iterated over; all names found in it are
    removed from the summary RPZ database (because at that point the old
    hash table should only contain names which are not present in the
    new version of the RPZ),

  - the new hash table replaces the old hash table.

When the new version of the RPZ is iterated over, if a given name is
spelled using a different letter case than in the old version of the
RPZ, the new variant will hash to a different value than the old
variant, which means it will not be removed from the old hash table.
When the old hash table is subsequently iterated over to remove
seemingly deleted names, the old variant of the name will still be
there, causing the name to be deleted from the summary RPZ database
(which effectively causes a given rule to be ignored).

The issue can be triggered not just by altering the case of existing
names in an RPZ, but also by adding sibling names spelled with a
different letter case.  This is because RBT code preserves case when
node splitting occurs.  The end result is that when the RPZ is iterated
over, a given name may be using a different case than in the zone file
(or XFR contents).

Fix by downcasing all names found in the RPZ database before adding them
to the summary RPZ database.

(cherry picked from commit dc8a7791bd)
2020-09-21 09:32:21 +02:00
Ondřej Surý
8b1e4a5373 Exclude isc_mem_isovermem from ThreadSanitizer
The .is_overmem member of isc_mem_t structure is intentionally accessed
unlocked as 100% accuracy isn't necessary here.

Without the attribute, following TSAN warning would show up:

    WARNING: ThreadSanitizer: data race
      Write of size 1 at 0x000000000001 by thread T1 (mutexes: write M1, write M2):
	#0 isc___mem_put lib/isc/mem.c:1119:19
	#1 isc__mem_put lib/isc/mem.c:2439:2
	#2 dns_rdataslab_fromrdataset lib/dns/rdataslab.c:327:2
	#3 addrdataset lib/dns/rbtdb.c:6761:11
	#4 dns_db_addrdataset lib/dns/db.c:719:10
	#5 cache_name lib/dns/resolver.c:6538:13
	#6 cache_message lib/dns/resolver.c:6628:14
	#7 resquery_response lib/dns/resolver.c:7883:13
	#8 dispatch lib/isc/task.c:1152:7
	#9 run lib/isc/task.c:1344:2

      Previous read of size 1 at 0x000000000001 by thread T2 (mutexes: write M3):
	#0 isc_mem_isovermem lib/isc/mem.c:1553:15
	#1 addrdataset lib/dns/rbtdb.c:6866:25
	#2 dns_db_addrdataset lib/dns/db.c:719:10
	#3 addoptout lib/dns/ncache.c:281:10
	#4 dns_ncache_add lib/dns/ncache.c:101:10
	#5 ncache_adderesult lib/dns/resolver.c:6668:12
	#6 ncache_message lib/dns/resolver.c:6845:11
	#7 rctx_ncache lib/dns/resolver.c:9174:11
	#8 resquery_response lib/dns/resolver.c:7894:2
	#9 dispatch lib/isc/task.c:1152:7
	#10 run lib/isc/task.c:1344:2

      Location is heap block of size 328 at 0x000000000020 allocated by thread T3:
	#0 malloc <null>
	#1 default_memalloc lib/isc/mem.c:713:8
	#2 mem_create lib/isc/mem.c:763:8
	#3 isc_mem_create lib/isc/mem.c:2425:2
	#4 configure_view bin/named/server.c:4494:4
	#5 load_configuration bin/named/server.c:9062:3
	#6 run_server bin/named/server.c:9771:2
	#7 dispatch lib/isc/task.c:1152:7
	#8 run lib/isc/task.c:1344:2

    [...]

    SUMMARY: ThreadSanitizer: data race lib/isc/mem.c:1119:19 in isc___mem_put

(cherry picked from commit 0110d1ab17)
2020-09-17 17:35:58 +02:00
Mark Andrews
b7b0a4d71f Pause dbiterator ealier to prevent lock-order-inversion
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000000) => M2 (0x000000000000) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 findnodeintree lib/dns/rbtdb.c:2877:2
    #3 findnode lib/dns/rbtdb.c:2941:10
    #4 dns_db_findnode lib/dns/db.c:439:11
    #5 resume_addnsec3chain lib/dns/zone.c:3776:11
    #6 rss_post lib/dns/zone.c:20659:3
    #7 setnsec3param lib/dns/zone.c:20471:3
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null>
    #1 rss_post lib/dns/zone.c:20658:3
    #2 setnsec3param lib/dns/zone.c:20471:3
    #3 dispatch lib/isc/task.c:1152:7
    #4 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M2 in thread T2:
    #0 pthread_mutex_lock <null>
    #1 zone_nsec3chain lib/dns/zone.c:8666:5
    #2 zone_maintenance lib/dns/zone.c:11063:4
    #3 zone_timer lib/dns/zone.c:14098:2
    #4 dispatch lib/isc/task.c:1152:7
    #5 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 resume_iteration lib/dns/rbtdb.c:9357:2
    #3 dbiterator_next lib/dns/rbtdb.c:9647:3
    #4 dns_dbiterator_next lib/dns/dbiterator.c:87:10
    #5 zone_nsec3chain lib/dns/zone.c:8656:13
    #6 zone_maintenance lib/dns/zone.c:11063:4
    #7 zone_timer lib/dns/zone.c:14098:2
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

(cherry picked from commit 9e584a4511)
2020-09-17 18:24:07 +10:00
Mark Andrews
6edd349af5 Pause the database iterator to release rwlock
(cherry picked from commit 2e63de94aa)
2020-09-17 18:24:07 +10:00
Mark Andrews
5cdc4671ec Pause dbiterator to release rwlock to prevent lock-order-inversion.
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000000) => M2 (0x000000000001) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 getsigningtime lib/dns/rbtdb.c:8198:2
    #3 dns_db_getsigningtime lib/dns/db.c:979:11
    #4 set_resigntime lib/dns/zone.c:3887:11
    #5 dns_zone_markdirty lib/dns/zone.c:11119:4
    #6 update_action lib/ns/update.c:3376:3
    #7 dispatch lib/isc/task.c:1152:7
    #8 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null>
    #1 dns_zone_markdirty lib/dns/zone.c:11089:2
    #2 update_action lib/ns/update.c:3376:3
    #3 dispatch lib/isc/task.c:1152:7
    #4 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M2 in thread T1:
    #0 pthread_mutex_lock <null>
    #1 zone_nsec3chain lib/dns/zone.c:8502:3
    #2 zone_maintenance lib/dns/zone.c:11056:4
    #3 zone_timer lib/dns/zone.c:14091:2
    #4 dispatch lib/isc/task.c:1152:7
    #5 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 resume_iteration lib/dns/rbtdb.c:9357:2
    #3 dbiterator_current lib/dns/rbtdb.c:9695:3
    #4 dns_dbiterator_current lib/dns/dbiterator.c:101:10
    #5 zone_nsec3chain lib/dns/zone.c:8539:3
    #6 zone_maintenance lib/dns/zone.c:11056:4
    #7 zone_timer lib/dns/zone.c:14091:2
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) in pthread_rwlock_rdlock

(cherry picked from commit fbed962204)
2020-09-17 18:24:07 +10:00
Mark Andrews
02f09ac566 Pause dbiterator to release rwlock to prevent lock-order-inversion.
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000001) => M2 (0x000000000000) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 zone_sign lib/dns/zone.c:9247:3
    #3 zone_maintenance lib/dns/zone.c:11047:4
    #4 zone_timer lib/dns/zone.c:14090:2
    #5 dispatch lib/isc/task.c:1152:7
    #6 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 resume_iteration lib/dns/rbtdb.c:9357:2
    #3 dbiterator_next lib/dns/rbtdb.c:9647:3
    #4 dns_dbiterator_next lib/dns/dbiterator.c:87:10
    #5 zone_sign lib/dns/zone.c:9488:13
    #6 zone_maintenance lib/dns/zone.c:11047:4
    #7 zone_timer lib/dns/zone.c:14090:2
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M2 in thread T2:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 findnodeintree lib/dns/rbtdb.c:2877:2
    #3 findnode lib/dns/rbtdb.c:2941:10
    #4 dns_db_findnode lib/dns/db.c:439:11
    #5 dns_db_getsoaserial lib/dns/db.c:780:11
    #6 dump_done lib/dns/zone.c:11428:15
    #7 dump_quantum lib/dns/masterdump.c:1487:2
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 dump_done lib/dns/zone.c:11426:4
    #3 dump_quantum lib/dns/masterdump.c:1487:2
    #4 dispatch lib/isc/task.c:1152:7
    #5 run lib/isc/task.c:1344:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    Thread T2 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) in pthread_rwlock_rdlock

(cherry picked from commit c9dbad97b2)
2020-09-17 18:24:07 +10:00
Mark Andrews
d36b4ed8ed Pause dbiterator to release rwlock to prevent lock-order-inversion.
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000000) => M2 (0x000000000000) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 getsigningtime lib/dns/rbtdb.c:8198:2
    #3 dns_db_getsigningtime lib/dns/db.c:979:11
    #4 set_resigntime lib/dns/zone.c:3887:11
    #5 dns_zone_markdirty lib/dns/zone.c:11115:4
    #6 update_action lib/ns/update.c:3376:3
    #7 dispatch lib/isc/task.c:1152:7
    #8 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null>
    #1 dns_zone_markdirty lib/dns/zone.c:11085:2
    #2 update_action lib/ns/update.c:3376:3
    #3 dispatch lib/isc/task.c:1152:7
    #4 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M2 in thread T2:
    #0 pthread_mutex_lock <null>
    #1 zone_nsec3chain lib/dns/zone.c:8274:3
    #2 zone_maintenance lib/dns/zone.c:11052:4
    #3 zone_timer lib/dns/zone.c:14087:2
    #4 dispatch lib/isc/task.c:1152:7
    #5 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 resume_iteration lib/dns/rbtdb.c:9357:2
    #3 dbiterator_next lib/dns/rbtdb.c:9647:3
    #4 dns_dbiterator_next lib/dns/dbiterator.c:87:10
    #5 zone_nsec3chain lib/dns/zone.c:8412:13
    #6 zone_maintenance lib/dns/zone.c:11052:4
    #7 zone_timer lib/dns/zone.c:14087:2
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    Thread T2 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) in pthread_rwlock_rdlock

(cherry picked from commit 98025e15d0)
2020-09-17 18:24:07 +10:00
Mark Andrews
6a1cd20473 Pause dbiterator to release rwlock to prevent lock-order-inversion.
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000001) => M2 (0x000000000002) => M3 (0x000000000000) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 findnodeintree lib/dns/rbtdb.c:2877:2
    #3 findnode lib/dns/rbtdb.c:2941:10
    #4 dns_db_findnode lib/dns/db.c:439:11
    #5 copy_non_dnssec_records lib/dns/zone.c:16031:11
    #6 receive_secure_db lib/dns/zone.c:16163:12
    #7 dispatch lib/isc/task.c:1152:7
    #8 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 resume_iteration lib/dns/rbtdb.c:9357:2
    #3 dbiterator_first lib/dns/rbtdb.c:9407:3
    #4 dns_dbiterator_first lib/dns/dbiterator.c:43:10
    #5 receive_secure_db lib/dns/zone.c:16160:16
    #6 dispatch lib/isc/task.c:1152:7
    #7 run lib/isc/task.c:1344:2

    Mutex M3 acquired here while holding mutex M2 in thread T2:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 zone_sign lib/dns/zone.c:9244:3
    #3 zone_maintenance lib/dns/zone.c:11044:4
    #4 zone_timer lib/dns/zone.c:14087:2
    #5 dispatch lib/isc/task.c:1152:7
    #6 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 resume_iteration lib/dns/rbtdb.c:9357:2
    #3 dbiterator_next lib/dns/rbtdb.c:9647:3
    #4 dns_dbiterator_next lib/dns/dbiterator.c:87:10
    #5 zone_sign lib/dns/zone.c:9485:13
    #6 zone_maintenance lib/dns/zone.c:11044:4
    #7 zone_timer lib/dns/zone.c:14087:2
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M3 in thread T3:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 findnodeintree lib/dns/rbtdb.c:2877:2
    #3 findnode lib/dns/rbtdb.c:2941:10
    #4 dns_db_findnode lib/dns/db.c:439:11
    #5 zone_get_from_db lib/dns/zone.c:5602:11
    #6 get_raw_serial lib/dns/zone.c:2520:12
    #7 zone_gotwritehandle lib/dns/zone.c:2559:4
    #8 dispatch lib/isc/task.c:1152:7
    #9 run lib/isc/task.c:1344:2

    Mutex M3 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 zone_gotwritehandle lib/dns/zone.c:2552:2
    #3 dispatch lib/isc/task.c:1152:7
    #4 run lib/isc/task.c:1344:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    Thread T2 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    Thread T3 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) in pthread_rwlock_rdlock

(cherry picked from commit e185e37137)
2020-09-17 18:24:06 +10:00
Mark Andrews
f5a8d9055f Address lock-order-inversion between the keytable and the db locks.
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
    Cycle in lock order graph: M1 (0x000000000000) => M2 (0x000000000000) => M1

    Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 dns_keytable_find lib/dns/keytable.c:522:2
    #3 sync_keyzone lib/dns/zone.c:4560:12
    #4 dns_zone_synckeyzone lib/dns/zone.c:4635:11
    #5 mkey_refresh bin/named/server.c:15423:2
    #6 named_server_mkeys bin/named/server.c:15727:4
    #7 named_control_docommand bin/named/control.c:236:12
    #8 control_command bin/named/controlconf.c:365:17
    #9 dispatch lib/isc/task.c:1152:7
    #10 run lib/isc/task.c:1344:2

    Mutex M1 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 resume_iteration lib/dns/rbtdb.c:9357:2
    #3 dbiterator_first lib/dns/rbtdb.c:9407:3
    #4 dns_dbiterator_first lib/dns/dbiterator.c:43:10
    #5 dns_rriterator_first lib/dns/rriterator.c:71:15
    #6 sync_keyzone lib/dns/zone.c:4543:16
    #7 dns_zone_synckeyzone lib/dns/zone.c:4635:11
    #8 mkey_refresh bin/named/server.c:15423:2
    #9 named_server_mkeys bin/named/server.c:15727:4
    #10 named_control_docommand bin/named/control.c:236:12
    #11 control_command bin/named/controlconf.c:365:17
    #12 dispatch lib/isc/task.c:1152:7
    #13 run lib/isc/task.c:1344:2

    Mutex M1 acquired here while holding mutex M2 in thread T1:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 zone_find lib/dns/rbtdb.c:4029:2
    #3 dns_db_find lib/dns/db.c:500:11
    #4 addifmissing lib/dns/zone.c:4481:11
    #5 dns_keytable_forall lib/dns/keytable.c:786:4
    #6 sync_keyzone lib/dns/zone.c:4586:2
    #7 dns_zone_synckeyzone lib/dns/zone.c:4635:11
    #8 mkey_refresh bin/named/server.c:15423:2
    #9 named_server_mkeys bin/named/server.c:15727:4
    #10 named_control_docommand bin/named/control.c:236:12
    #11 control_command bin/named/controlconf.c:365:17
    #12 dispatch lib/isc/task.c:1152:7
    #13 run lib/isc/task.c:1344:2

    Mutex M2 previously acquired by the same thread here:
    #0 pthread_rwlock_rdlock <null>
    #1 isc_rwlock_lock lib/isc/rwlock.c:48:3
    #2 dns_keytable_forall lib/dns/keytable.c:770:2
    #3 sync_keyzone lib/dns/zone.c:4586:2
    #4 dns_zone_synckeyzone lib/dns/zone.c:4635:11
    #5 mkey_refresh bin/named/server.c:15423:2
    #6 named_server_mkeys bin/named/server.c:15727:4
    #7 named_control_docommand bin/named/control.c:236:12
    #8 control_command bin/named/controlconf.c:365:17
    #9 dispatch lib/isc/task.c:1152:7
    #10 run lib/isc/task.c:1344:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) in pthread_rwlock_rdlock

(cherry picked from commit 9e5f83c499)
2020-09-17 18:24:06 +10:00
Tinderbox User
a4f73cfe8a prep 9.16.7 2020-09-16 22:50:38 +02:00
Evan Hunt
df698d73f4 update all copyright headers to eliminate the typo 2020-09-14 16:50:58 -07:00
Mark Andrews
2c1b6b2168 Silence REVERSE_INULL warning (CID 306652)
(cherry picked from commit 584dbffab1)
2020-09-10 07:54:19 +10:00
Mark Andrews
c359fa0933 Turn off TSAN for isc_log_wouldlog
(cherry picked from commit 7b3c7f52c2)
2020-09-09 16:22:39 +10:00
Mark Andrews
947bc2594b Only test node->data if we care about whether data is present or not.
WARNING: ThreadSanitizer: data race (pid=28788)
  Write of size 8 at 0x7b200002e060 by thread T1 (mutexes: write M2947):
    #0 add32 /builds/isc-projects/bind9/lib/dns/rbtdb.c:6638:18 (libdns.so.1110+0xe7843)
    #1 addrdataset /builds/isc-projects/bind9/lib/dns/rbtdb.c:6975:12 (libdns.so.1110+0xe4185)
    #2 dns_db_addrdataset /builds/isc-projects/bind9/lib/dns/db.c:783:10 (libdns.so.1110+0x650ee)
    #3 validated /builds/isc-projects/bind9/lib/dns/resolver.c:5140:11 (libdns.so.1110+0x1909f7)
    #4 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507f5)
    #5 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d749)

  Previous read of size 8 at 0x7b200002e060 by thread T5 (mutexes: write M521146194917735760):
    #0 dns_rbt_findnode /builds/isc-projects/bind9/lib/dns/rbt.c:1708:9 (libdns.so.1110+0xd910d)
    #1 cache_find /builds/isc-projects/bind9/lib/dns/rbtdb.c:5098:11 (libdns.so.1110+0xe188e)
    #2 dns_db_find /builds/isc-projects/bind9/lib/dns/db.c:554:11 (libdns.so.1110+0x642bb)
    #3 dns_view_find2 /builds/isc-projects/bind9/lib/dns/view.c:1068:11 (libdns.so.1110+0x1cc2c4)
    #4 dbfind_name /builds/isc-projects/bind9/lib/dns/adb.c:3714:11 (libdns.so.1110+0x46a4b)
    #5 dns_adb_createfind2 /builds/isc-projects/bind9/lib/dns/adb.c:3133:12 (libdns.so.1110+0x45278)
    #6 findname /builds/isc-projects/bind9/lib/dns/resolver.c:3166:11 (libdns.so.1110+0x1827f0)
    #7 fctx_getaddresses /builds/isc-projects/bind9/lib/dns/resolver.c:3462:3 (libdns.so.1110+0x18032d)
    #8 fctx_try /builds/isc-projects/bind9/lib/dns/resolver.c:3819:12 (libdns.so.1110+0x17e174)
    #9 fctx_start /builds/isc-projects/bind9/lib/dns/resolver.c:4219:4 (libdns.so.1110+0x1787a3)
    #10 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507f5)
    #11 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d749)

(cherry picked from commit 71ef3a8038)
2020-09-09 16:22:39 +10:00
Mark Andrews
f6ba3ec731 Address lock-order-inversion
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=12714)
  Cycle in lock order graph: M100252 (0x7b7c00010a08) => M1171 (0x7b7400000dc8) => M100252

  Mutex M1171 acquired here while holding mutex M100252 in thread T1:
    #0 pthread_mutex_lock <null> (delv+0x4483a6)
    #1 dns_resolver_createfetch3 /builds/isc-projects/bind9/lib/dns/resolver.c:9585:2 (libdns.so.1110+0x1769fd)
    #2 dns_resolver_createfetch /builds/isc-projects/bind9/lib/dns/resolver.c:9504:10 (libdns.so.1110+0x174e17)
    #3 create_fetch /builds/isc-projects/bind9/lib/dns/validator.c:1156:10 (libdns.so.1110+0x1c1e5f)
    #4 validatezonekey /builds/isc-projects/bind9/lib/dns/validator.c:2124:13 (libdns.so.1110+0x1c3b6d)
    #5 start_positive_validation /builds/isc-projects/bind9/lib/dns/validator.c:2301:10 (libdns.so.1110+0x1bfde9)
    #6 validator_start /builds/isc-projects/bind9/lib/dns/validator.c:3647:12 (libdns.so.1110+0x1bef62)
    #7 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #8 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Mutex M100252 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null> (delv+0x4483a6)
    #1 validator_start /builds/isc-projects/bind9/lib/dns/validator.c:3628:2 (libdns.so.1110+0x1bee31)
    #2 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #3 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Mutex M100252 acquired here while holding mutex M1171 in thread T1:
    #0 pthread_mutex_lock <null> (delv+0x4483a6)
    #1 dns_validator_destroy /builds/isc-projects/bind9/lib/dns/validator.c:3912:2 (libdns.so.1110+0x1bf788)
    #2 validated /builds/isc-projects/bind9/lib/dns/resolver.c:4916:2 (libdns.so.1110+0x18fdfd)
    #3 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #4 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Mutex M1171 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null> (delv+0x4483a6)
    #1 validated /builds/isc-projects/bind9/lib/dns/resolver.c:4907:2 (libdns.so.1110+0x18fc3d)
    #2 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #3 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Thread T1 'isc-worker0000' (tid=12729, running) created by main thread at:
    #0 pthread_create <null> (delv+0x42afdb)
    #1 isc_thread_create /builds/isc-projects/bind9/lib/isc/pthreads/thread.c:60:8 (libisc.so.1107+0x726d8)
    #2 isc__taskmgr_create /builds/isc-projects/bind9/lib/isc/task.c:1468:7 (libisc.so.1107+0x4d635)
    #3 isc_taskmgr_createinctx /builds/isc-projects/bind9/lib/isc/task.c:2091:11 (libisc.so.1107+0x4f4ac)
    #4 main /builds/isc-projects/bind9/bin/delv/delv.c:1639:2 (delv+0x4b7f96)

SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) (/builds/isc-projects/bind9/bin/delv/.libs/delv+0x4483a6) in pthread_mutex_lock
(cherry picked from commit 992a79a14b)
2020-09-09 16:22:39 +10:00
Mark Andrews
5d469f2498 Address lock-order-inversion
Obtain references to view->redirect and view->managed_keys then
release view->lock so dns_zone_setviewcommit and dns_zone_setviewrevert
can obtain the view->lock while holding zone->lock.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=9132)
  Cycle in lock order graph: M987831431424375936 (0x000000000000) => M1012319771577875480 (0x000000000000) => M987831431424375936

  Mutex M1012319771577875480 acquired here while holding mutex M987831431424375936 in thread T2:
    #0 pthread_mutex_lock <null> (named+0x4642a6)
    #1 dns_zone_setviewcommit /builds/isc-projects/bind9/lib/dns/zone.c:1571:2 (libdns.so.1110+0x1d74eb)
    #2 dns_view_setviewcommit /builds/isc-projects/bind9/lib/dns/view.c:2388:3 (libdns.so.1110+0x1cfe29)
    #3 load_configuration /builds/isc-projects/bind9/bin/named/./server.c:8188:3 (named+0x51eadd)
    #4 loadconfig /builds/isc-projects/bind9/bin/named/./server.c:9438:11 (named+0x510c66)
    #5 ns_server_reconfigcommand /builds/isc-projects/bind9/bin/named/./server.c:9773:2 (named+0x510b41)
    #6 ns_control_docommand /builds/isc-projects/bind9/bin/named/control.c:243:12 (named+0x4e451a)
    #7 control_recvmessage /builds/isc-projects/bind9/bin/named/controlconf.c:465:13 (named+0x4e9056)
    #8 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #9 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Mutex M987831431424375936 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null> (named+0x4642a6)
    #1 dns_view_setviewcommit /builds/isc-projects/bind9/lib/dns/view.c:2382:2 (libdns.so.1110+0x1cfde7)
    #2 load_configuration /builds/isc-projects/bind9/bin/named/./server.c:8188:3 (named+0x51eadd)
    #3 loadconfig /builds/isc-projects/bind9/bin/named/./server.c:9438:11 (named+0x510c66)
    #4 ns_server_reconfigcommand /builds/isc-projects/bind9/bin/named/./server.c:9773:2 (named+0x510b41)
    #5 ns_control_docommand /builds/isc-projects/bind9/bin/named/control.c:243:12 (named+0x4e451a)
    #6 control_recvmessage /builds/isc-projects/bind9/bin/named/controlconf.c:465:13 (named+0x4e9056)
    #7 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #8 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Mutex M987831431424375936 acquired here while holding mutex M1012319771577875480 in thread T7:
    #0 pthread_mutex_lock <null> (named+0x4642a6)
    #1 dns_view_findzonecut2 /builds/isc-projects/bind9/lib/dns/view.c:1300:2 (libdns.so.1110+0x1cc93a)
    #2 dns_view_findzonecut /builds/isc-projects/bind9/lib/dns/view.c:1261:9 (libdns.so.1110+0x1cc864)
    #3 fctx_create /builds/isc-projects/bind9/lib/dns/resolver.c:4459:13 (libdns.so.1110+0x1779d3)
    #4 dns_resolver_createfetch3 /builds/isc-projects/bind9/lib/dns/resolver.c:9628:12 (libdns.so.1110+0x176cb6)
    #5 dns_resolver_createfetch /builds/isc-projects/bind9/lib/dns/resolver.c:9504:10 (libdns.so.1110+0x174e17)
    #6 zone_refreshkeys /builds/isc-projects/bind9/lib/dns/zone.c:10061:12 (libdns.so.1110+0x2055a5)
    #7 zone_maintenance /builds/isc-projects/bind9/lib/dns/zone.c:10274:5 (libdns.so.1110+0x203a78)
    #8 zone_timer /builds/isc-projects/bind9/lib/dns/zone.c:13106:2 (libdns.so.1110+0x1e815a)
    #9 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #10 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Mutex M1012319771577875480 previously acquired by the same thread here:
    #0 pthread_mutex_lock <null> (named+0x4642a6)
    #1 zone_refreshkeys /builds/isc-projects/bind9/lib/dns/zone.c:9951:2 (libdns.so.1110+0x204dc3)
    #2 zone_maintenance /builds/isc-projects/bind9/lib/dns/zone.c:10274:5 (libdns.so.1110+0x203a78)
    #3 zone_timer /builds/isc-projects/bind9/lib/dns/zone.c:13106:2 (libdns.so.1110+0x1e815a)
    #4 dispatch /builds/isc-projects/bind9/lib/isc/task.c:1157:7 (libisc.so.1107+0x507d5)
    #5 run /builds/isc-projects/bind9/lib/isc/task.c:1331:2 (libisc.so.1107+0x4d729)

  Thread T2 'isc-worker0001' (tid=9163, running) created by main thread at:
    #0 pthread_create <null> (named+0x446edb)
    #1 isc_thread_create /builds/isc-projects/bind9/lib/isc/pthreads/thread.c:60:8 (libisc.so.1107+0x726d8)
    #2 isc__taskmgr_create /builds/isc-projects/bind9/lib/isc/task.c:1468:7 (libisc.so.1107+0x4d635)
    #3 isc_taskmgr_create /builds/isc-projects/bind9/lib/isc/task.c:2109:11 (libisc.so.1107+0x4f587)
    #4 create_managers /builds/isc-projects/bind9/bin/named/./main.c:886:11 (named+0x4f1a97)
    #5 setup /builds/isc-projects/bind9/bin/named/./main.c:1305:11 (named+0x4f05ee)
    #6 main /builds/isc-projects/bind9/bin/named/./main.c:1556:2 (named+0x4ef12d)

  Thread T7 'isc-worker0006' (tid=9168, running) created by main thread at:
    #0 pthread_create <null> (named+0x446edb)
    #1 isc_thread_create /builds/isc-projects/bind9/lib/isc/pthreads/thread.c:60:8 (libisc.so.1107+0x726d8)
    #2 isc__taskmgr_create /builds/isc-projects/bind9/lib/isc/task.c:1468:7 (libisc.so.1107+0x4d635)
    #3 isc_taskmgr_create /builds/isc-projects/bind9/lib/isc/task.c:2109:11 (libisc.so.1107+0x4f587)
    #4 create_managers /builds/isc-projects/bind9/bin/named/./main.c:886:11 (named+0x4f1a97)
    #5 setup /builds/isc-projects/bind9/bin/named/./main.c:1305:11 (named+0x4f05ee)
    #6 main /builds/isc-projects/bind9/bin/named/./main.c:1556:2 (named+0x4ef12d)

SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) (/builds/isc-projects/bind9/bin/named/.libs/named+0x4642a6) in pthread_mutex_lock
(cherry picked from commit cdcfde9e65)
2020-09-09 16:22:39 +10:00
Mark Andrews
5b425046dd isc_mutex_init_errcheck prototype should not be under ISC_MUTEX_PROFILE
(cherry picked from commit 552e0b852e)
2020-09-09 16:22:38 +10:00
Michał Kępień
6e2a35df2d Include BIND documentation in Windows zips
As generated documentation files are no longer stored in the BIND Git
repository, put a copy of the PDF version of the BIND ARM generated by
the "docs" GitLab CI job into the Windows zips to make it easily
available to the end users on that platform.

Make sure Windows zips also contain certain documentation files included
in source tarballs to make the contents of each release more consistent
across different platforms.

(cherry picked from commit 549ddca256)
2020-09-03 12:02:19 +02:00
Mark Andrews
e6332e4a67 watch_fd also requires thread->fdlock[lockid] to be held
(cherry picked from commit 22f499cdc4)
2020-09-03 07:14:45 +10:00
Mark Andrews
eadfe4b673 remove dead code
(cherry picked from commit e923e62f6c)
2020-09-03 07:14:45 +10:00
Ondřej Surý
56d2cf6f1e Print diagnostics on dns_name_issubdomain() failure in fctx_create()
Log diagnostic message when dns_name_issubdomain() in the fctx_create()
when the resolver is qname minimizing and forwarding at the same time.

(cherry picked from commit 0a22024c27)
2020-09-02 18:29:01 +02:00
Diego Fronza
eb9d8e9e10 Fix resolution of unusual ip6.arpa names
Before this commit, BIND was unable to resolve ip6.arpa names like
the one reported in issue #1847 when using query minimization.

As reported in the issue, an attempt to resolve a name like
'rec-test-dom-158937817846788.test123.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.3.4.3.5.4.0.8.2.6.0.1.0.0.2.ip6.arpa'
using default settings would fail.

The reason was that query minimization algorithm in 'fctx_minimize_qname'
would divide any ip6.arpa names in increasing number of labels,
7,11, ... up to 35, thus limiting the destination name (minimized) to a number
of 35 labels.

In case the last query minimization attempt (with 35 labels) would fail with
NXDOMAIN, BIND would attempt the query mininimization again with the exact
same QNAME, limited on the 35 labels, and that in turn would fail again.

This fix avoids this fail loop by considering the extra labels that may appear
in the leftmost part of an ip6.arpa name, those after the IPv6 part.

(cherry picked from commit 230d79c191)
2020-09-02 16:52:39 +02:00
Matthijs Mekking
4a7f87aa89 Log when CDS/CDNSKEY is published in zone.
Log when named decides to add a CDS/CDNSKEY record to the zone. Now
you understand how the bug was found that was fixed in the previous
commits.

(cherry picked from commit f9ef5120c1)
2020-09-02 14:59:20 +02:00
Matthijs Mekking
6405b04477 Fix CDS (non-)publication
The CDS/CDNSKEY record will be published when the DS is in the
rumoured state. However, with the introduction of the rndc '-checkds'
command, the logic in the keymgr was changed to prevent the DS
state to go in RUMOURED unless the specific command was given. Hence,
the CDS was never published before it was seen in the parent.

Initially I thought this was a policy approval rule, however it is
actually a DNSSEC timing rule. Remove the restriction from
'keymgr_policy_approval' and update the 'keymgr_transition_time'
function. When looking to move the DS state to OMNIPRESENT it will
no longer calculate the state from its last change, but from when
the DS was seen in the parent, "DS Publish". If the time was not set,
default to next key event of an hour.

Similarly for moving the DS state to HIDDEN, the time to wait will
be derived from the "DS Delete" time, not from when the DS state
last changed.

(cherry picked from commit c8205bfa0e)
2020-09-02 14:59:20 +02:00
Mark Andrews
e460d83dbb isc_ratelimiter needs to hold a reference to its task
to prevent the task subsystem shutting down before the
ratelimiter is freed.

(cherry picked from commit b8e4b6d303)
2020-09-02 11:39:36 +10:00
Mark Andrews
489b99b65c remove unused variable sock
(cherry picked from commit b1c424ddf3)
2020-09-02 08:41:11 +10:00
Mark Andrews
1af9cf78bd Use memory_order_acq_rel in isc_refcount_decrement.
While

if (isc_refcount_decrement() == 1) {	// memory_order_release
	isc_refcount_destroy();		// memory_order_acquire
	...
}

is theoretically the most efficent in practice, using
memory_order_acq_rel produces the same code on x86_64 and doesn't
trigger tsan data races (which use a idealistic model) if
isc_refcount_destroy() is not called immediately.  In fact
isc_refcount_destroy() could be removed if we didn't want
to check for the count being 0 when isc_refcount_destroy() is
called.

https://stackoverflow.com/questions/49112732/memory-order-in-shared-pointer-destructor
(cherry picked from commit 6278899a38)
2020-09-01 22:24:52 +10:00
Ondřej Surý
9b9fee13fa Handle EPROTO errno from recvmsg
It was discovered, that some systems might set EPROTO instead of EACCESS
on recvmsg() call causing spurious syslog messages from the socket
code.  This commit returns soft handling of EPROTO errno code to the
socket code. [GL #1928]

(cherry picked from commit e0380d437d)
2020-08-28 20:49:01 +02:00
Ondřej Surý
2b08ff879a Fix off-by-one error when calculating new hashtable size
When calculating the new hashtable bitsize, there was an off-by-one
error that would allow the new bitsize to be larger than maximum allowed
causing assertion failure in the rehash() function.

(cherry picked from commit 78543ad5a7)
2020-08-28 20:43:38 +02:00
Mark Andrews
c2ee9eea3a Refactor totext_loc
(cherry picked from commit 2ca4d35037)
2020-08-26 16:44:01 +02:00
Mark Andrews
baf93342d0 Correctly encode LOC records with non integer negative altitudes.
(cherry picked from commit 337cc878fa)
2020-08-26 16:44:01 +02:00
Mark Andrews
06b76b2b16 Check LOC's altitude field is properly parsed and encoded.
(cherry picked from commit 888dfd78c7)
2020-08-26 16:44:00 +02:00
Mark Andrews
7eb5d61703 Tighten LOC parsing to reject period and/or m as a value.
(cherry picked from commit 9225c67835)
2020-08-26 16:44:00 +02:00
Ondřej Surý
5674f76590 Use the Fibonacci Hashing for the RBTDB glue table
The rbtdb version glue_table has been refactored similarly to rbt.c hash
table, so it does use 32-bit hash function return values and apply
Fibonacci Hashing to lookup the index to the hash table instead of
modulo.  For more details, see the lib/dns/rbt.c commit log.

(cherry picked from commit 01684cc219)
2020-08-26 21:49:59 +10:00
Mark Andrews
511747307f rbtversion->glue_table_size must be read when holding a lock
(cherry picked from commit 33d0e8d168)
2020-08-26 21:49:59 +10:00
Mark Andrews
3fce53b0e3 Cast the original rcode to (dns_ttl_t) when setting extended rcode
Shifting (signed) integer left could trigger undefined behaviour when
the shifted value would overflow into the sign bit (e.g. 2048).

The issue was found when using AFL++ and UBSAN:

    message.c:2274:33: runtime error: left shift of 2048 by 20 places cannot be represented in type 'int'
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior message.c:2274:33 in

(cherry picked from commit a347641782)
2020-08-25 16:40:42 +02:00
Michal Nowak
79e8f1076a
Fix warnings in when build with --enable-buffer-useinline
sockaddr.c:147:49: error: pointer targets in passing argument 2 of ‘isc__buffer_putmem’ differ in signedness
    rdata.c:1780:30: error: pointer targets in passing argument 2 of ‘isc__buffer_putmem’ differ in signedness

(cherry picked from commit dd425254a7)
2020-08-25 16:08:44 +02:00
Mark Andrews
da4f189ea8 Add missing isc_mutex_init to manytasks subtest.
(cherry picked from commit 2eb5c29c83)
2020-08-25 09:58:30 +10:00
Evan Hunt
1c7e3c8515 BIND 9.16.6
-----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEENKwGS3ftSQfs1TU17QVz/8hFYQUFAl8xFCMPHG1pY2hhbEBp
 c2Mub3JnAAoJEO0Fc//IRWEFM/IP/AyKCPJsh+grYskFAws5UqhvDM0XBvQWPZBP
 DM0lKA2BK8vaOl80iI2hlf9SrLMmBiD0f3WHigbS8i0MbnJAz0T7mFDuDmcJQSP4
 skwgwX3obpDwdzl/Tgu2V5bTbwe0WT8wdUKIT8oZnPBNuhh57hjQh3D+DTZ+YPye
 RSPq1lYtQq80QukHkzQ9JnpMzW8JYZTbwzG4swZNl4upbI+Z3Jx93LYnNBCKZuw1
 jlrrFAQZavFdQJ4fxnLicHYsnmfWOX3Lhg/wSHemwMeYgxlrWYXgKCDO+0veB1Sq
 rGVnrfmkN7fNDY9gPJOe7TBPkKLtlSs5zxPNDvfYiDUGhdNTIV/hItF+u81eKetu
 QLp0RNa+uPuCkUGe9bBcqEJ0DIVL7yfzrtxrvtceLKG6A9XIF69nfPl81dv2GjHn
 hR81v/VQC8H2fzzbfypJpTwXeAJ3HKeBahoQttwMH7hux3iatOFdKb1pulkviK0d
 fzX5TSWHK2JLyCH0ed1SPzJFYG9irGl0lYuQIO1cbmb7IZOAMWJODXpafJiJwBpR
 YgHenf+XS1bluadl6kItA2QhLsMnly+LfYO9XXhGMmIqE8Xf1RrHLCIts/hQjY0/
 B+lRvWAXvzLMb+y+W+wxe8BsNSI/RYhHxXsJRavrlCPeFeSg5CMOu4VXTkdnUKcV
 tDQuUJBh
 =p//M
 -----END PGP SIGNATURE-----

Merge tag 'v9_16_6' into v9_16

BIND 9.16.6
2020-08-20 12:08:57 -07:00
Mark Andrews
d8a57d32b1 A6: return FORMERR in fromwire if bits are non zero.
oss_fuzz: Issue 24864: bind9:dns_rdata_fromwire_text_fuzzer: Overwrites-const-input in dns_rdata_fromwire_text_fuzzer

(cherry picked from commit 8452404bd7)
2020-08-18 11:11:40 +02:00
Mark Andrews
6b1675a62c RRSIG: reject records with empty SIG section
(cherry picked from commit f6d7b8c20d)
2020-08-18 11:11:40 +02:00
Mark Andrews
9675d83b96 X25: Check that record is all ASCII digits
(cherry picked from commit 7e49689746)
2020-08-14 00:38:45 +10:00
Mark Andrews
ee10a93cbc WKS: reject records with zero octets at the end of the bitmap
(cherry picked from commit 9d446142d8)
2020-08-14 00:38:45 +10:00
Mark Andrews
e44da35092 TLSA: fix fromwire length checks
(cherry picked from commit 3429c35f52)
2020-08-14 00:38:45 +10:00
Mark Andrews
b4a66cffa8 SIG: reject records with a zero length signature
(cherry picked from commit 9b93e5d684)
2020-08-14 00:38:45 +10:00
Mark Andrews
10e8ad22c5 NXT: fix fromwire bitmap checks
(cherry picked from commit 73dd849655)
2020-08-14 00:38:45 +10:00
Mark Andrews
c712394b34 NSEC3PARAM: check that saltlen is consistent with the rdata length
(cherry picked from commit 7dc8e720ff)
2020-08-14 00:38:45 +10:00
Mark Andrews
26be6c817b NSEC3: reject records with a zero length hash field
(cherry picked from commit 031ee9e279)
2020-08-14 00:38:45 +10:00
Mark Andrews
ebd8033a96 IPSECKEY: require non-zero length public keys
(cherry picked from commit d7f7014803)
2020-08-14 00:38:45 +10:00
Mark Andrews
baf7d114af CERT: reject records with a empty certificate field
(cherry picked from commit a238f37239)
2020-08-14 00:38:45 +10:00
Mark Andrews
f09691feff Get rid of type 'RESERVED0'.
(cherry picked from commit 3c492b3ef1)
2020-08-14 00:38:45 +10:00
Mark Andrews
5806d856cf base32_decode*() could incorrectly decode a input.
base32_decode_char() added a extra zero octet to the output
if the fifth character was a pad character.  The length
of octets to copy to the output was set to 3 instead of 2.

(cherry picked from commit 6c7e50c267)
2020-08-14 00:11:06 +10:00
Mark Andrews
f3b25f1ffb Address use after free between view, resolver and nta.
Hold a weak reference to the view so that it can't go away while
nta is performing its lookups.  Cancel nta timers once all external
references to the view have gone to prevent them triggering new work.

(cherry picked from commit 0b2555e8cf)
2020-08-11 11:55:44 +10:00
Ondřej Surý
25846cfec4 Reduce the default RBT hash table size to 16 entries (4 bits)
The hash table rework MRs (!3865, !3871) increased the default RBT hash
table size from 64 to 65,536 entries (for 64-bit architectures, that is
512 bytes before vs. 524,288 bytes after).  This works fine for RBTs
used for cache databases, but since three separate RBT databases are
created for every zone loaded (RRs, NSEC, NSEC3), memory usage would
skyrocket when BIND 9 is used as an authoritative DNS server with many
zones.

The default RBT hash table size before the rework was 64 entries, this
commit reduces it to 16 entries because our educated guess is that most
zones are just couple of entries (SOA, NS, A, AAAA, MX) and rehashing
small hash tables is actually cheap.  The rework we did in the previous
MRs tries to avoid growing the hash tables for big-to-huge caches where
growing the hash table comes at a price because the whole cache needs to
be locked.

(cherry picked from commit 1e043a011b)
(cherry picked from commit f0ccc17f30)
2020-08-10 11:31:13 +02:00
Ondřej Surý
f0ccc17f30 Reduce the default RBT hash table size to 16 entries (4 bits)
The hash table rework MRs (!3865, !3871) increased the default RBT hash
table size from 64 to 65,536 entries (for 64-bit architectures, that is
512 bytes before vs. 524,288 bytes after).  This works fine for RBTs
used for cache databases, but since three separate RBT databases are
created for every zone loaded (RRs, NSEC, NSEC3), memory usage would
skyrocket when BIND 9 is used as an authoritative DNS server with many
zones.

The default RBT hash table size before the rework was 64 entries, this
commit reduces it to 16 entries because our educated guess is that most
zones are just couple of entries (SOA, NS, A, AAAA, MX) and rehashing
small hash tables is actually cheap.  The rework we did in the previous
MRs tries to avoid growing the hash tables for big-to-huge caches where
growing the hash table comes at a price because the whole cache needs to
be locked.

(cherry picked from commit 1e043a011b)
2020-08-10 10:32:26 +02:00
Mark Andrews
8a4dd25562 Silence 'may be used uninitialized' 2020-08-08 16:12:12 +10:00
Matthijs Mekking
624f1b9531 rndc dnssec -checkds set algorithm
In the rare case that you have multiple keys acting as KSK and that
have the same keytag, you can now set the algorithm when calling
'-checkds'.

(cherry picked from commit 46fcd927e7)
2020-08-07 13:34:10 +02:00
Matthijs Mekking
4892006a92 Make 'parent-registration-delay' obsolete
With the introduction of 'checkds', the 'parent-registration-delay'
option becomes obsolete.

(cherry picked from commit a25f49f153)
2020-08-07 13:30:50 +02:00
Matthijs Mekking
5dcf56f216 Fix time printing in key files
Don't strip off the final character when printing times in key files.

With the introduction of 'rndc dnssec -status' we introduced
'isc_stdtime_tostring()'. This changed in behavior such that it was no
longer needed to strip of the final '\n' of the string format
datetime. However, in 'printtime()' it still stripped the final
character.

(cherry picked from commit e3eb55fd1c)
2020-08-07 13:30:30 +02:00
Matthijs Mekking
81d0c63ecb Implement 'rndc dnssec -checkds'
Add a new 'rndc' command 'dnssec -checkds' that allows the user to
signal named that a new DS record has been seen published in the
parent, or that an existing DS record has been withdrawn from the
parent.

Upon the 'checkds' request, 'named' will write out the new state for
the key, updating the 'DSPublish' or 'DSRemoved' timing metadata.

This replaces the "parent-registration-delay" configuration option,
this was unreliable because it was purely time based (if the user
did not actually submit the new DS to the parent for example, this
could result in an invalid DNSSEC state).

Because we cannot rely on the parent registration delay for state
transition, we need to replace it with a different guard. Instead,
if a key wants its DS state to be moved to RUMOURED, the "DSPublish"
time must be set and must not be in the future. If a key wants its
DS state to be moved to UNRETENTIVE, the "DSRemoved" time must be set
and must not be in the future.

By default, with '-checkds' you set the time that the DS has been
published or withdrawn to now, but you can set a different time with
'-when'. If there is only one KSK for the zone, that key has its
DS state moved to RUMOURED. If there are multiple keys for the zone,
specify the right key with '-key'.

(cherry picked from commit 04d8fc0143)
2020-08-07 13:30:19 +02:00
Tinderbox User
a195123ad0 prep 9.16.6 2020-08-06 08:14:40 +00:00
Ondřej Surý
ac3862a5da Fix crash in pk11_numbits() when native-pkcs11 is used
When pk11_numbits() is passed a user provided input that contains all
zeroes (via crafted DNS message), it would crash with assertion
failure.  Fix that by properly handling such input.
2020-08-05 15:51:40 +02:00
Mark Andrews
0eec632d6a Always keep a copy of the message
this allows it to be available even when dns_message_parse()
returns a error.
2020-08-05 15:47:25 +02:00
Evan Hunt
81514ff925 permanently disable QNAME minimization in a fetch when forwarding
QNAME minimization is normally disabled when forwarding. if, in the
course of processing a fetch, we switch back to normal recursion at
some point, we can't safely start minimizing because we may have
been left in an inconsistent state.
2020-08-05 15:44:18 +02:00
Evan Hunt
9a372f2bce Use different allocators for UDP and TCP
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.

This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
2020-08-05 12:57:58 +02:00
Ondřej Surý
f9711481ad Expire the 0 TTL RRSet quickly rather using them for serve-stale
When a received RRSet has TTL 0, they would be preserved for
serve-stale (default `max-stale-cache` is 12 hours) rather than expiring
them quickly from the cache database.

This commit makes sure the RRSet didn't have TTL 0 before marking the
entry in the database as "stale".

(cherry picked from commit 6ffa2ddae0)
2020-08-05 09:09:16 +02:00
Ondřej Surý
b48e9ab201 Add stale-cache-enable option and disable serve-stable by default
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).

This allows DNS operators to turn the serve-stale answers in an event of
large authoritative DNS outage.  The caching of the stale answers needs
to be enabled before the outage happens or the feature would be
otherwise useless.

The negative consequence of the default setting is the inevitable
cache-bloat that happens for every and each DNS operator running named.

In this MR, a new configuration option `stale-cache-enable` is
introduced that allows the operators to selectively enable or disable
the serve-stale feature of BIND 9 based on their decision.

The newly introduced option has been disabled by default,
e.g. serve-stale is disabled in the default configuration and has to be
enabled if required.

(cherry picked from commit ce53db34d6)
2020-08-05 09:09:16 +02:00