Commit graph

189 commits

Author SHA1 Message Date
Ondřej Surý
4d027ab945 Remove TLSDNS, TLS and HTTP protocols from netmgr
For further stabilization of the 9.16 branch, we are removing the unused
protocols from the netmgr.
2021-05-14 12:52:48 +02:00
Evan Hunt
ef1d909fa9 backport of netmgr/taskmgr to 9.16
this rolls up numerous changes that have been applied to the
main branch, including moving isc_task operations into the
netmgr event loops, and other general stabilization.
2021-05-14 12:52:48 +02:00
Ondřej Surý
45c55b1e7e Add isc_trampoline API to have simple accounting around threads
The current isc_hp API uses internal tid_v variable that gets
incremented for each new thread using hazard pointers.  This tid_v
variable is then used as a index to global shared table with hazard
pointers state.  Since the tid_v is only incremented and never
decremented the table could overflow very quickly if we create set of
threads for short period of time, they finish the work and cease to
exist.  Then we create identical set of threads and so on and so on.
This is not a problem for a normal `named` operation as the set of
threads is stable, but the problematic place are the unit tests where we
test network manager or other APIs (task, timer) that create threads.

This commits adds a thin wrapper around any function called from
isc_thread_create() that adds unique-but-reusable small digit thread id
that can be used as index to f.e. hazard pointer tables.  The trampoline
wrapper ensures that the thread ids will be reused, so the highest
thread_id number doesn't grow indefinitely when threads are created and
destroyed and then created again.  This fixes the hazard pointer table
overflow on machines with many cores. [GL #2396]

(cherry picked from commit cbbecfcc82)
2021-02-26 21:14:17 +01:00
Ondřej Surý
effe3ee595 Refactor TLSDNS module to work with libuv/ssl directly
* Following the example set in 634bdfb16d, the tlsdns netmgr
  module now uses libuv and SSL primitives directly, rather than
  opening a TLS socket which opens a TCP socket, as the previous
  model was difficult to debug.  Closes #2335.

* Remove the netmgr tls layer (we will have to re-add it for DoH)

* Add isc_tls API to wrap the OpenSSL SSL_CTX object into libisc
  library; move the OpenSSL initialization/deinitialization from dstapi
  needed for OpenSSL 1.0.x to the isc_tls_{initialize,destroy}()

* Add couple of new shims needed for OpenSSL 1.0.x

* When LibreSSL is used, require at least version 2.7.0 that
  has the best OpenSSL 1.1.x compatibility and auto init/deinit

* Enforce OpenSSL 1.1.x usage on Windows

(cherry picked from commit e493e04c0f)
2021-02-26 16:14:50 +01:00
Ondřej Surý
0e25af628c Use -release instead of -version-info for internal library SONAMEs
The BIND 9 libraries are considered to be internal only and hence the
API and ABI changes a lot.  Keeping track of the API/ABI changes takes
time and it's a complicated matter as the safest way to make everything
stable would be to bump any library in the dependency chain as in theory
if libns links with libdns, and a binary links with both, and we bump
the libdns SOVERSION, but not the libns SOVERSION, the old libns might
be loaded by binary pulling old libdns together with new libdns loaded
by the binary.  The situation gets even more complicated with loading
the plugins that have been compiled with few versions old BIND 9
libraries and then dynamically loaded into the named.

We are picking the safest option possible and usable for internal
libraries - instead of using -version-info that has only a weak link to
BIND 9 version number, we are using -release libtool option that will
embed the corresponding BIND 9 version number into the library name.

That means that instead of libisc.so.1608 (as an example) the library
will now be named libisc-9.16.10.so.

(cherry picked from commit c605d75ea5)
2021-01-25 15:28:09 +01:00
Ondřej Surý
48759bd047 Fix the data race in accessing the isc_nm_t timers
The following TSAN report about accessing the mgr timers (mgr->init,
mgr->idle, mgr->keepalive and mgr->advertised) has been fixed in this
commit:

    ==================
    WARNING: ThreadSanitizer: data race (pid=2746)
    Read of size 4 at 0x7b440008a948 by thread T18:
    #0 isc__nm_tcpdns_read /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 (libisc.so.1706+0x2ba0f)
    #1 isc_nm_read /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1679:3 (libisc.so.1706+0x22258)
    #2 tcpdns_connect_connect_cb /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:363:2 (tcpdns_test+0x4bc5fb)
    #3 isc__nm_async_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1816:2 (libisc.so.1706+0x228c9)
    #4 isc__nm_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1791:3 (libisc.so.1706+0x22713)
    #5 tcpdns_connect_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:343:2 (libisc.so.1706+0x2d89d)
    #6 uv__stream_connect /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1381:5 (libuv.so.1+0x27c18)
    #7 uv__stream_io /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1298:5 (libuv.so.1+0x25977)
    #8 uv__io_poll /home/ondrej/Projects/tsan/libuv/src/unix/linux-core.c:462:11 (libuv.so.1+0x2e795)
    #9 uv_run /home/ondrej/Projects/tsan/libuv/src/unix/core.c:385:5 (libuv.so.1+0x158ec)
    #10 nm_thread /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:530:11 (libisc.so.1706+0x1c94a)

    Previous write of size 4 at 0x7b440008a948 by main thread:
    #0 isc_nm_settimeouts /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:490:12 (libisc.so.1706+0x1dda5)
    #1 tcpdns_recv_two /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:601:2 (tcpdns_test+0x4bad0e)
    #2 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70be)
    #3 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Location is heap block of size 281 at 0x7b440008a840 allocated by main thread:
    #0 malloc <null> (tcpdns_test+0x42864b)
    #1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:713:8 (libisc.so.1706+0x6d261)
    #2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:622:8 (libisc.so.1706+0x69b9c)
    #3 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1044:9 (libisc.so.1706+0x6d379)
    #4 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2432:10 (libisc.so.1706+0x6889e)
    #5 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:203:8 (libisc.so.1706+0x1c219)
    #6 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #7 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #8 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Thread T18 'isc-net-0000' (tid=3513, running) created by main thread at:
    #0 pthread_create <null> (tcpdns_test+0x429e7b)
    #1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:73:8 (libisc.so.1706+0x8476a)
    #2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:271:3 (libisc.so.1706+0x1c66a)
    #3 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #4 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #5 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 in isc__nm_tcpdns_read
    ==================
    ThreadSanitizer: reported 1 warnings

(cherry picked from commit 2e1dd56d0b)
2020-12-09 10:46:16 +01:00
Ondřej Surý
7b9c8b9781 Refactor netmgr and add more unit tests
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested.  It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.

NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.

The changes that are included in this commit are listed here
(extensively, but not exclusively):

* The netmgr_test unit test was split into individual tests (udp_test,
  tcp_test, tcpdns_test and newly added tcp_quota_test)

* The udp_test and tcp_test has been extended to allow programatic
  failures from the libuv API.  Unfortunately, we can't use cmocka
  mock() and will_return(), so we emulate the behaviour with #define and
  including the netmgr/{udp,tcp}.c source file directly.

* The netievents that we put on the nm queue have variable number of
  members, out of these the isc_nmsocket_t and isc_nmhandle_t always
  needs to be attached before enqueueing the netievent_<foo> and
  detached after we have called the isc_nm_async_<foo> to ensure that
  the socket (handle) doesn't disappear between scheduling the event and
  actually executing the event.

* Cancelling the in-flight TCP connection using libuv requires to call
  uv_close() on the original uv_tcp_t handle which just breaks too many
  assumptions we have in the netmgr code.  Instead of using uv_timer for
  TCP connection timeouts, we use platform specific socket option.

* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}

  When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
  waiting for socket to either end up with error (that path was fine) or
  to be listening or connected using condition variable and mutex.

  Several things could happen:

    0. everything is ok

    1. the waiting thread would miss the SIGNAL() - because the enqueued
       event would be processed faster than we could start WAIT()ing.
       In case the operation would end up with error, it would be ok, as
       the error variable would be unchanged.

    2. the waiting thread miss the sock->{connected,listening} = `true`
       would be set to `false` in the tcp_{listen,connect}close_cb() as
       the connection would be so short lived that the socket would be
       closed before we could even start WAIT()ing

* The tcpdns has been converted to using libuv directly.  Previously,
  the tcpdns protocol used tcp protocol from netmgr, this proved to be
  very complicated to understand, fix and make changes to.  The new
  tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
  Closes: #2194, #2283, #2318, #2266, #2034, #1920

* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
  pass accepted TCP sockets between netthreads, but instead (similar to
  UDP) uses per netthread uv_loop listener.  This greatly reduces the
  complexity as the socket is always run in the associated nm and uv
  loops, and we are also not touching the libuv internals.

  There's an unfortunate side effect though, the new code requires
  support for load-balanced sockets from the operating system for both
  UDP and TCP (see #2137).  If the operating system doesn't support the
  load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
  on FreeBSD 12+), the number of netthreads is limited to 1.

* The netmgr has now two debugging #ifdefs:

  1. Already existing NETMGR_TRACE prints any dangling nmsockets and
     nmhandles before triggering assertion failure.  This options would
     reduce performance when enabled, but in theory, it could be enabled
     on low-performance systems.

  2. New NETMGR_TRACE_VERBOSE option has been added that enables
     extensive netmgr logging that allows the software engineer to
     precisely track any attach/detach operations on the nmsockets and
     nmhandles.  This is not suitable for any kind of production
     machine, only for debugging.

* The tlsdns netmgr protocol has been split from the tcpdns and it still
  uses the old method of stacking the netmgr boxes on top of each other.
  We will have to refactor the tlsdns netmgr protocol to use the same
  approach - build the stack using only libuv and openssl.

* Limit but not assert the tcp buffer size in tcp_alloc_cb
  Closes: #2061

(cherry picked from commit 634bdfb16d)
2020-12-09 10:46:16 +01:00
Mark Andrews
eed4fab37b Report Extended DNS Error codes
(cherry picked from commit b144ae1bb0)
2020-05-13 10:26:39 +10:00
Ondřej Surý
af1b56240f Resolve the overlinking of the system libraries
Originally, every library and binaries got linked to everything, which
creates unnecessary overlinking.  This wasn't as straightforward as it
should be as we still support configuration without libtool for 9.16.

Couple of smaller issues related to include headers and an issue where
sanitizer overload dlopen and dlclose symbols, so we were getting false
negatives in the autoconf test.
2020-05-11 09:49:54 +02:00
Ondřej Surý
5948a29463 Stop leaking OpenSSL types and defines in the isc/safe.h
The two "functions" that isc/safe.h declared before were actually simple
defines to matching OpenSSL functions.  The downside of the approach was
enforcing all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto.  By hiding the specific
implementation into the private namespace changing the defines into
simple functions, we no longer enforce this.  In the long run, this
might also allow us to switch cryptographic library implementation
without affecting the downstream users.

(cherry picked from commit ab827ab5bf)
2020-04-28 16:27:39 +02:00
Petr Menšík
ad79e7c080 Link all required libraries to libisc
It would fail to link -lisc without additional libraries, which should
not be required.

(cherry picked from commit 4cc7d2412f)
2020-04-08 17:03:27 +02:00
Witold Kręcicki
c6c0a9fdba Add isc_uv_export()/isc_uv_import() functions to libuv compatibility layer.
These functions can be used to pass a uv handle between threads in a
safe manner. The other option is to use uv_pipe and pass the uv_handle
via IPC, which is way more complex.  uv_export() and uv_import() functions
existed in libuv at some point but were removed later. This code is
based on the original removed code.

The Windows version of the code uses two functions internal to libuv;
a patch for libuv is attached for exporting these functions.
2020-01-13 10:52:07 -08:00
Witold Kręcicki
70397f9d92 netmgr: libuv-based network manager
This is a replacement for the existing isc_socket and isc_socketmgr
implementation. It uses libuv for asynchronous network communication;
"networker" objects will be distributed across worker threads reading
incoming packets and sending them for processing.

UDP listener sockets automatically create an array of "child" sockets
so each worker can listen separately.

TCP sockets are shared amongst worker threads.

A TCPDNS socket is a wrapper around a TCP socket, which handles the
the two-byte length field at the beginning of DNS messages over TCP.

(Other wrapper socket types can be implemented in the future to handle
DNS over TLS, DNS over HTTPS, etc.)
2019-11-07 11:55:37 -08:00
Evan Hunt
a8c814cb2f implement fixed-size array stack data structure 2019-11-07 11:55:37 -08:00
Witold Kręcicki
402969bf95 implement fetch-and-add array queue data structure
this is a lockless queue based on hazard pointers.
2019-11-07 11:55:37 -08:00
Witold Kręcicki
aa57fa7090 implement hazard pointer data structure
this is a mechanism to allow safe lock-free data structures.
2019-11-07 11:55:37 -08:00
Ondřej Surý
2b632a232f Convert the configure.ac rules for zlib library to use pkg-config 2019-07-31 14:54:40 +02:00
Ondřej Surý
5d1e7be582 Rename OPENSSL_INCLUDES to OPENSSL_CFLAGS in AX_CHECK_OPENSSL() macro
The ax_check_openssl m4 macro used OPENSSL_INCLUDES.  Rename the
subst variable to OPENSSL_CFLAGS and wrap AX_CHECK_OPENSSL() in
action-if-not-found part of PKG_CHECK_MODULE check for libcrypto.
2019-06-25 12:36:01 +02:00
Ondřej Surý
e3e6888946 Make the usage of json-c objects opaque to the caller
The json-c have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header.  This MR fixes the usage making the caller object opaque.
2019-06-25 12:04:20 +02:00
Ondřej Surý
0771dd3be8 Make the usage of libxml2 opaque to the caller
The libxml2 have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header.  This MR fixes the usage making the caller object opaque.
2019-06-25 12:01:32 +02:00
Ondřej Surý
5098c95452 Merge unix/app.c and win32/app.c
The differences between two files are very minimal and most of the
code is common.  Merge those two files and use #ifdef WIN32 to include
the right bits on Windows.
2019-06-20 18:52:27 +02:00
Ondřej Surý
4d2d3b49ce Cleanup the way we detect json-c library to use only pkg-config 2019-05-29 15:08:52 +02:00
Ondřej Surý
a197df137a Add reference SipHash 2-4 implementation 2019-05-20 19:01:31 +02:00
Ondřej Surý
e2cdf066ea Remove message catalogs 2019-01-09 23:44:26 +01:00
Ondřej Surý
b98ac2593c Add generic hashed message authentication code API (isc_hmac) to replace specific HMAC functions hmacmd5/hmacsha1/hmacsha2... 2018-10-25 08:15:42 +02:00
Ondřej Surý
7fd3dc63de Add generic message digest API (isc_md) to replace specific MD functions md5/sha1/sha256 2018-10-25 08:15:42 +02:00
Ondřej Surý
fecbc7923a Remove isc_keyboard family of functions as they were not used anywhere 2018-08-28 14:37:30 +02:00
Ondřej Surý
0a7535ac81 isc_refcount_init() now doesn't return isc_result_t and asserts on failed initialization 2018-08-28 12:15:39 +02:00
Ondřej Surý
e119de4169 Replace arch specific atomic.h with global atomic.h header using either stdatomic, __atomic or __sync primitives 2018-08-28 12:15:39 +02:00
Ondřej Surý
1672935717 Use strerror_r from POSIX.1-2001 (strerror_s on Windows) instead of custom isc__strerror() 2018-08-28 10:31:48 +02:00
Ondřej Surý
efd613e874 memmove, strtoul, and strcasestr functions are part of ISO C90, remove the compatibility shim 2018-08-28 10:31:48 +02:00
Ondřej Surý
388d6db5a1 Remove support for legacy systems without inet_{ntop,pton} w/ IPv6 support 2018-08-28 10:31:48 +02:00
Ondřej Surý
7b21bbb7c1 Require IPv6 support from the OS 2018-08-28 10:31:47 +02:00
Ondřej Surý
c692da2182 Improve autoconf pthread detection 2018-08-16 17:18:52 +02:00
Ondřej Surý
71877806e8 Fix ax_check_openssl to accept yes and improve it to modern autotools standard 2018-07-23 22:10:52 +02:00
Ondřej Surý
66ba2fdad5 Replace isc_safe routines with their OpenSSL counter parts 2018-07-20 00:34:26 -04:00
Ondřej Surý
c3b8130fe8 Make OpenSSL mandatory 2018-07-19 12:47:03 -04:00
Ondřej Surý
302c6cbe7f Add thin openssl shim for OpenSSL 1.1.x and LibreSSL compatibility functions 2018-06-13 14:19:07 +02:00
Ondřej Surý
99ba29bc52 Change isc_random() to be just PRNG, and add isc_nonce_buf() that uses CSPRNG
This commit reverts the previous change to use system provided
entropy, as (SYS_)getrandom is very slow on Linux because it is
a syscall.

The change introduced in this commit adds a new call isc_nonce_buf
that uses CSPRNG from cryptographic library provider to generate
secure data that can be and must be used for generating nonces.
Example usage would be DNS cookies.

The isc_random() API has been changed to use fast PRNG that is not
cryptographically secure, but runs entirely in user space.  Two
contestants have been considered xoroshiro family of the functions
by Villa&Blackman and PCG by O'Neill.  After a consideration the
xoshiro128starstar function has been used as uint32_t random number
provider because it is very fast and has good enough properties
for our usage pattern.

The other change introduced in the commit is the more extensive usage
of isc_random_uniform in places where the usage pattern was
isc_random() % n to prevent modulo bias.  For usage patterns where
only 16 or 8 bits are needed (DNS Message ID), the isc_random()
functions has been renamed to isc_random32(), and isc_random16() and
isc_random8() functions have been introduced by &-ing the
isc_random32() output with 0xffff and 0xff.  Please note that the
functions that uses stripped down bit count doesn't pass our
NIST SP 800-22 based random test.
2018-05-29 22:58:21 +02:00
Ondřej Surý
7ee8a7e69f address win32 build issues
- Replace external -DOPENSSL/-DPKCS11CRYPTO with properly AC_DEFINEd
  HAVE_OPENSSL/HAVE_PKCS11
- Don't enforce the crypto provider from platform.h, just from dst_api.c
  and configure scripts
2018-05-22 16:32:21 -07:00
Ondřej Surý
3a4f820d62 Replace all random functions with isc_random, isc_random_buf and isc_random_uniform API.
The three functions has been modeled after the arc4random family of
functions, and they will always return random bytes.

The isc_random family of functions internally use these CSPRNG (if available):

1. getrandom() libc call (might be available on Linux and Solaris)
2. SYS_getrandom syscall (might be available on Linux, detected at runtime)
3. arc4random(), arc4random_buf() and arc4random_uniform() (available on BSDs and Mac OS X)
4. crypto library function:
4a. RAND_bytes in case OpenSSL
4b. pkcs_C_GenerateRandom() in case PKCS#11 library
2018-05-16 09:54:35 +02:00
Ondřej Surý
a11e23b5ed Replace all usage of inet_aton() with inet_pton() 2018-02-23 13:57:10 +01:00
Ondřej Surý
843d389661 Update license headers to not include years in copyright in all applicable files 2018-02-23 10:12:02 +01:00
Evan Hunt
883a9485e9 [master] copyrights 2018-02-15 11:56:13 -08:00
Ondřej Surý
4ff2d36adc Remove whole unused ondestroy callback mechanism 2018-02-12 14:49:32 +01:00
Mark Andrews
124712666e 4653. [bug] Reorder includes to move @DST_OPENSSL_INC@ and
@ISC_OPENSSL_INC@ after shipped include directories.
                        [RT #45581]
2017-07-20 11:52:03 +10:00
Tinderbox User
18b7760b29 update copyright notice / whitespace 2017-04-24 23:45:33 +00:00
Evan Hunt
67e1f8fa4e [master] allow parralel make
4609.	[cleanup]	Rearrange makefiles to enable parallel execution
			(i.e. "make -j"). [RT #45078]
2017-04-23 23:04:25 -07:00
Evan Hunt
6087f87afb [master] make uninstall
4503.	[cleanup]	"make uninstall" now removes file installed by
			BIND. (This currently excludes Python files
			due to lack of support in setup.py.) [RT #42912]
2016-11-01 19:17:07 -07:00
Evan Hunt
3390d74e33 [master] fix dyndb issues; isc_errno_toresult()
4445.	[cleanup]	isc_errno_toresult() can now be used to call the
			formerly private function isc__errno2result().
			[RT #43050]

4444.	[bug]		Fixed some issues related to dyndb: A bug caused
			braces to be omitted when passing configuration text
			from named.conf to a dyndb driver, and there was a
			use-after-free in the sample dyndb driver. [RT #43050]

Patch for dyndb driver submitted by Petr Spacek at Red Hat.
2016-08-17 11:37:57 -07:00