bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-06-21 03:08:52 -04:00

Author	SHA1	Message	Date
Ondřej Surý	cedfc97974	Improve reporting for pthread_once errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/once.h with PTHREADS_RUNTIME_CHECK(), in order to improve error reporting for any once-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-10-14 16:39:21 +02:00
Ondřej Surý	beecde7120	Rewrite isc_httpd using picohttpparser and isc_url_parse Rewrite the isc_httpd to be more robust. 1. Replace the hand-crafted HTTP request parser with picohttpparser for parsing the whole HTTP/1.0 and HTTP/1.1 requests. Limit the number of allowed headers to 10 (arbitrary number). 2. Replace the hand-crafted URL parser with isc_url_parse for parsing the URL from the HTTP request. 3. Increase the receive buffer to match the isc_netmgr buffers, so we can at least receive two full isc_nm_read()s. This makes the truncation processing much simpler. 4. Process the received buffer from single isc_nm_read() in a single loop and schedule the sends to be independent of each other. The first two changes make the code simpler and rely on libraries that we already had (isc_url based on nodejs) or that are used elsewhere (picohttpparser). The last two changes remove the artificial "truncation" limit on parsing multiple requests. Now only a request that has too many headers (currently 10) or is too big (so, the receive buffer fills up without reaching end of the request) will end the connection. We can be benevolent here with the limits, because the statistics channel is by definition private and access must be allowed only to administrators of the server. There are no timers, no rate-limiting, no upper limit on the number of requests that can be served, etc.	2022-10-14 11:26:54 +02:00
Ondřej Surý	3a8884f024	Add picohttpparser.{c,h} from https://github.com/h2o/picohttpparser PicoHTTPParser is a tiny, primitive, fast HTTP request/response parser. Unlike most parsers, it is stateless and does not allocate memory by itself. All it does is accept a pointer to the buffer and the output structure, and set up the pointers in the latter to point at the necessary portions of the buffer.	2022-10-14 11:26:54 +02:00
Ondřej Surý	b6b7a6886a	Don't set load-balancing socket option on the UDP connect sockets The isc_nm_udpconnect() erroneously set the reuse port with load-balancing on the outgoing connected UDP sockets. This socket option only makes sense for the listening sockets. Don't set the load-balancing reuse port option on the outgoing UDP sockets.	2022-10-12 15:36:25 +02:00
Artem Boldariev	eaebb92f3e	TLS DNS: fix certificate verification error message reporting This commit fixes TLS DNS verification error message reporting which we probably broke during one of the recent networking code refactorings. This prevented e.g. dig from producing useful error messages related to TLS certificate verification.	2022-10-12 16:24:04 +03:00
Artem Boldariev	6789b88d25	TLS: clear error queue before doing IO or calling SSL_get_error() Ensure that the TLS error queue is empty before calling SSL_get_error() or doing SSL I/O so that the result will not get affected by prior error statuses. In particular, the improper error handling led to intermittent unit test failure and, thus, could be responsible for some of the system test failures and other intermittent TLS-related issues. See here for more details: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html In particular, it mentions the following: > The current thread's error queue must be empty before the TLS/SSL > I/O operation is attempted, or SSL_get_error() will not work > reliably. As we use the result of SSL_get_error() to decide on I/O operations, we need to ensure that it works reliably by cleaning the error queue. TLS DNS: empty error queue before attempting I/O	2022-10-12 16:24:04 +03:00
Aram Sargsyan	be95ba0119	Remove a superfluous check of sock->fd against -1 The check is left from when tcp_connect_direct() called isc__nm_socket() and it was uncertain whether it had succeeded, but now isc__nm_socket() is called before tcp_connect_direct(), so sock->fd cannot be -1. *** CID 357292: (REVERSE_NEGATIVE) /lib/isc/netmgr/tcp.c: 309 in isc_nm_tcpconnect() 303 304 atomic_store(&sock->active, true); 305 306 result = tcp_connect_direct(sock, req); 307 if (result != ISC_R_SUCCESS) { 308 atomic_store(&sock->active, false); >>> CID 357292: (REVERSE_NEGATIVE) >>> You might be using variable "sock->fd" before verifying that it is >= 0. 309 if (sock->fd != (uv_os_sock_t)(-1)) { 310 isc__nm_tcp_close(sock); 311 } 312 isc__nm_connectcb(sock, req, result, true); 313 } 314	2022-10-12 08:21:35 +00:00
Tony Finch	138908b211	Avoid dead code warning when using a constant boolean The value of `sign_bit` is platform-dependent but constant at compile time. Use a cast to convert the boolean `sign_bit` to 0 or 1 instead of ternary `?:` because one branch of the conditional is dead code. (We could leave out the cast to `size_t` but our style prefers to handle booleans more explicitly, hence the `?:` that caused the issue.) *** CID 358310: Possible Control flow issues (DEADCODE) /lib/isc/resource.c: 118 in isc_resource_setlimit() 112 * rlim_t, and whether rlim_t has a sign bit. 113 */ 114 isc_resourcevalue_t rlim_max = UINT64_MAX; 115 size_t wider = sizeof(rlim_max) - sizeof(rlim_t); 116 bool sign_bit = (double)(rlim_t)-1 < 0; 117 >>> CID 358310: Possible Control flow issues (DEADCODE) >>> Execution cannot reach the expression "1" inside this statement: "rlim_max >>= 8UL * wider + ...". 118 rlim_max >>= CHAR_BIT * wider + (sign_bit ? 1 : 0); 119 rlim_value = ISC_MIN(value, rlim_max); 120 } 121 122 rl.rlim_cur = rl.rlim_max = rlim_value; 123 unixresult = setrlimit(unixresource, &rl);	2022-10-05 15:51:05 +00:00
Ondřej Surý	c0598d404c	Use designated initializers instead of memset()/MEM_ZERO for structs In several places, the structures were cleaned with memset(...) and thus the semantic patch converted the isc_mem_get(...) to isc_mem_getx(..., ISC_MEM_ZERO). Use the designated initializer to initialize the structures instead of zeroing the memory with ISC_MEM_ZERO flag as this better matches the intended purpose.	2022-10-05 16:44:05 +02:00
Ondřej Surý	c1d26b53eb	Add and use semantic patch to replace isc_mem_get/allocate+memset Add new semantic patch to replace the straightforward uses of: ptr = isc_mem_{get,allocate}(..., size); memset(ptr, 0, size); with the new API call: ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);	2022-10-05 16:44:05 +02:00
Ondřej Surý	dbf5672f32	Replace isc_mem_*_aligned(..., alignment) with isc_mem_*x(..., flags) Previously, the isc_mem_get_aligned() and friends took alignment size as one of the arguments. Replace the specific function with more generic extended variant that now accepts ISC_MEM_ALIGN(alignment) for aligned allocations and ISC_MEM_ZERO for allocations that zero the (re-)allocated memory before returning the pointer to the caller.	2022-10-05 16:44:05 +02:00
Ondřej Surý	c14a4ac763	Add a case-insensitive option directly to siphash 2-4 implementation Formerly, the isc_hash32() would have to change the key in a local copy to make it case insensitive. Change the isc_siphash24() and isc_halfsiphash24() functions to lowercase the input directly when reading it from the memory and converting the uint8_t * array to 64-bit (respectively 32-bit numbers).	2022-10-04 10:32:40 +02:00
Mark Andrews	5f07fe8cbb	Use strnstr implementation from FreeBSD if not provided by OS	2022-10-04 14:21:41 +11:00
Tony Finch	4e37a6f77a	Avoid signed integer overflow in isc_resource_setlimit() On systems with signed rlim_t the old code calculated its maximum value by shifting 1 into the sign bit, which is undefined behaviour. Avoid the bug by using an unsigned shift.	2022-10-03 11:37:17 +00:00
Ondřej Surý	477eb22c12	Refactor isc_ratelimiter API Because the dns_zonemgr_create() was run before the loopmgr was started, the isc_ratelimiter API was more complicated than it had to be. Move the dns_zonemgr_create() to run_server() task which is run on the main loop, and simplify the isc_ratelimiter API implementation. The isc_timer is now created in the isc_ratelimiter_create() and starting the timer is now separate async task as is destroying the timer in case it's not launched from the loop it was created on. The ratelimiter tick now doesn't have to create and destroy timer logic and just stops the timer when there's no more work to do. This should also solve all the races that were causing the isc_ratelimiter to be left dangling because the timer was stopped before the last reference would be detached.	2022-09-30 10:36:30 +02:00
Ondřej Surý	09b50d2237	Fix small problems in the isc_ratelimiter	2022-09-30 09:50:17 +02:00
Ondřej Surý	1e2ededb07	Add missing DbC check for name##_detach in ISC_REFCOUNT_IMPL macro The detach function in the ISC_REFCOUNT_IMPL macro was missing DbC checks, add them.	2022-09-30 09:50:17 +02:00
Tony Finch	a4930e1969	Improve DBC in isc_mem_free Unlike standard free(), isc_mem_free() is not a no-op when passed a NULL pointer. For size accounting purposes it calls sallocx(), which crashes when passed a NULL pointer. To get more helpful diagnostics, REQUIRE() that the pointer is not NULL so that when the programmer makes a mistake they get a backtrace that shows what went wrong.	2022-09-29 10:07:34 +00:00
Ondřej Surý	173c352452	Call the isc__nm_udp_send() callbacks asynchronously on shutdown The isc__nm_udp_send() callback would be called synchronously when shutting down or when the socket has been closed. This could lead to double locking in the calling code and thus those callbacks need to be called asynchronously.	2022-09-29 11:06:58 +02:00
Ondřej Surý	3b31f7f563	Add autoconf option to enable memory leak detection in libraries There's a known memory leak in the engine_pkcs11 at the time of writing this and it interferes with named's ability to check for memory leaks in the OpenSSL memory context by default. Add an autoconf option to explicitly enable the memory leak detection, and use it in the CI except for pkcs11 enabled builds. When this gets fixed in the engine_pkcs11, the option can be enabled by default.	2022-09-27 17:53:04 +02:00
Ondřej Surý	e537fea861	Use custom isc_mem based allocator for libxml2 The libxml2 library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, strdup and free). Create a memory context specifically for libxml2 to allow tracking the memory usage that has originated from within libxml2. This will provide a separate memory context for libxml2 to track the allocations and when shutting down the application it will check that all libxml2 allocations were returned to the allocator. Additionally, move the xmlInitParser() and xmlCleanupParser() calls from bin/named/main.c to library constructor/destructor in libisc library.	2022-09-27 17:10:42 +02:00
Ondřej Surý	236d4b7739	Use custom isc_mem based allocator for OpenSSL The OpenSSL library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, and free). Create a memory context specifically for OpenSSL to allow tracking the memory usage that has originated from within OpenSSL. This will provide a separate memory context for OpenSSL to track the allocations and when shutting down the application it will check that all OpenSSL allocations were returned to the allocator.	2022-09-27 17:10:42 +02:00
Ondřej Surý	a32d06dd42	Use custom isc_mem based allocator for libuv The libuv library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, calloc and free). Create a memory context specifically for libuv to allow tracking the memory usage that has originated from within libuv. This requires libuv >= 1.38.0 which provides uv_library_shutdown() function that assures no more allocations will be made.	2022-09-27 17:10:42 +02:00
Ondřej Surý	a30e75db86	Check for working __builtin_mul_overflow() implementation Instead of using generic HAVE_BUILTIN_OVERFLOW, we need to check whether the overflow functions actually work as there was a bug in GCC that it would not detect mul overflow when compiled with `-m32` option without optimizations and the bug was fixed only for GCC 6.5+ and 7.3+/8+. For further details see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82274	2022-09-27 17:10:42 +02:00
Ondřej Surý	2d2022a509	Make the debugging flags local to the memory context Previously, the isc_mem_debugging was a single global variable that affected the behavior of the memory context whenever it was changed, which could be after some allocations were already done. Change the memory debugging options to be local to the memory context and immutable, so all allocations within the same memory context are treated the same.	2022-09-27 17:10:41 +02:00
Ondřej Surý	0086ebf3fc	Bump the libuv requirement to libuv >= 1.34.0 By bumping the minimum libuv version to 1.34.0, it allows us to remove all libuv shims we ever had and makes the code much cleaner. The up-to-date libuv is available in all distributions supported by BIND 9.19+ either natively or as a backport.	2022-09-27 17:09:10 +02:00
Evan Hunt	1926ddc987	change ISC__BUFFER macros to inline functions previously, when ISC_BUFFER_USEINLINE was defined, macros were used to implement isc_buffer primitives (isc_buffer_init(), isc_buffer_region(), etc). these macros were missing the DbC assertions for those primitives, which made it possible for coding errors to go undetected. adding the assertions to the macros caused compiler warnings on some platforms. therefore, this commit converts the ISC__BUFFER macros to static inline functions instead, with assertions included, and eliminates the non-inline implementation from buffer.c. the --enable-buffer-useinline configure option has been removed.	2022-09-26 23:49:27 -07:00
Ondřej Surý	1baed21688	Switch the CSPRNG function from RAND_bytes() to uv_random() The RAND_bytes() implementation differs between the OpenSSL versions and uses the system entropy only for seeding its internal CSPRNG. The uv_random() on the other hand uses the system provided CSPRNG. Switch from RAND_bytes() to uv_random() to use system provided CSPRNG.	2022-09-26 15:13:11 +02:00
Ondřej Surý	fffd444440	Cleanup the asynchronous code in the stream implementations After the loopmgr work has been merged, we can now cleanup the TCP and TLS protocols a little bit, because there are stronger guarantees that the sockets will be kept on the respective loops/threads. We only need asynchronous call for listening sockets (start, stop) and reading from the TCP (because the isc_nm_read() might be called from read callback again). This commit does the following changes (they are intertwined together): 1. Cleanup most of the asynchronous events in the TCP code, and add comments for the events that need to be kept asynchronous. 2. Remove isc_nm_resumeread() from the netmgr API, and replace isc_nm_resumeread() calls with existing isc_nm_read() calls. 3. Remove isc_nm_pauseread() from the netmgr API, and replace isc_nm_pauseread() calls with a new isc_nm_read_stop() call. 4. Disable the isc_nm_cancelread() for the streaming protocols, only the datagram-like protocols can use isc_nm_cancelread(). 5. Add isc_nmhandle_close() that can be used to shutdown the socket earlier than after the last detach. Formerly, the socket would be closed only after all reading and sending would be finished and the last reference would be detached. The new isc_nmhandle_close() can be used to close the underlying socket earlier, so all the other asynchronous calls would call their respective callbacks immediately. Co-authored-by: Ondřej Surý <ondrej@isc.org> Co-authored-by: Artem Boldariev <artem@isc.org>	2022-09-22 14:51:15 +02:00
Ondřej Surý	5319d4f6c5	Require isc_timer to be manipulated on the timer loop Each isc_timer needs to be created, started and destroyed on the current loop. The isc_timer_stop() can be run on any loop, but when run from different loop than the one associated with the timer, the request to stop the timer will be recorded in atomic variable and the underlying uv_timer_t will be stopped on next uv_timer_t callback call. This allows any thread to stop the timer.	2022-09-21 14:25:33 -07:00
Ondřej Surý	869c6d77a2	Convert isc_ratelimiter API to use on-loop timers In preparation for the on-loop timers, the isc_ratelimiter API was converted to use the timer on main loop and start and stop the timer asynchronously on the main loop.	2022-09-21 14:25:33 -07:00
Ondřej Surý	27d1e498b8	Add isc_timer_async_destroy() helper function As it sometimes happens that the object using isc_timer_t is destroyed via detaching all the references with no guarantee that the last thread will be matching thread, add a helper isc_timer_async_destroy() function that stops the timer and runs the destroy function via isc_async_run() on the matching thread.	2022-09-21 14:25:33 -07:00
Evan Hunt	4b7248545e	additional code cleanups in httpd.c - use isc_buffer functions when appropriate, rather than converting to and from isc_region unnecessarily - use the zlib total_out value instead of calculating it - use c99 struct initialization	2022-09-21 11:45:12 -07:00
Tony Finch	4b9af22830	Ensure the first random number is non-zero when fuzzing In fuzzing mode, `isc_random` uses a fixed seed for reproducibility. The particular seed chosen happened to produce zero as its first number, however commit `bd251de0` introduced an initialization check in `random_test` that required it to be non-zero. This change adjusts the seed to avoid spurious test failures. Also, remove the temporary variable that was used for initialization because it did not match the type of the thread-local seed array.	2022-09-21 12:47:26 +01:00
Michał Kępień	2ee16067c5	BIND 9.19.5 Merge tag 'v9_19_5' BIND 9.19.5	2022-09-21 13:04:58 +02:00
Tony Finch	bd251de035	Move random number re-seeding out of the hot path Instead of checking if we need to re-seed for every isc_random call, seed the random number generator in the libisc global initializer and the per-thread initializer.	2022-09-19 16:27:12 +02:00
Ondřej Surý	f6e4f620b3	Use the semantic patch to do the unsigned -> unsigned int change Apply the semantic patch on the whole code base to get rid of 'unsigned' usage in favor of explicit 'unsigned int'.	2022-09-19 15:56:02 +02:00
Ondřej Surý	b1026dd4c1	Add missing isc_refcount_destroy() for isc__nmsocket_t The destructor for the isc__nmsocket_t was missing call to the isc_refcount_destroy() on the reference counter, which might lead to spurious ThreadSanitizer data race warnings if we ever change the acquire-release memory order in the isc_refcount_decrement().	2022-09-19 14:38:56 +02:00
Ondřej Surý	9b8d432403	Reorder the uv_close() calls to close the socket immediately Simplify the closing code - during the loopmgr implementation, it was discovered that the various lists used by the uv_loop_t aren't FIFO, but LIFO. See doc/dev/libuv.md for more details. With this knowledge, we can close the protocol handles (uv_udp_t and uv_tcp_t) and uv_timer_t at the same time by reordering the uv_close() calls, and thus making sure that after calling the isc__nm_stoplistening(), the code will not issue any additional callback calls (accept, read) on the socket that stopped listening. This might help with the TLS and DoH shutting down sequence as described in the [GL #3509] as we now stop the reading, stop the timer and call the uv_close() as early as possible.	2022-09-19 14:38:56 +02:00
Ondřej Surý	eac8bc5c1a	Prevent unexpected UDP client read callbacks The network manager UDP code was misinterpreting when the libuv called the udp_recv_cb with nrecv == 0 and addr == NULL -> this doesn't really mean that the "stream" has ended, but the libuv indicates that the receive buffer can be freed. This could lead to assertion failure in the code that calls isc_nm_read() from the network manager read callback due to the extra spurious callbacks. Properly handle the extra callback calls from the libuv in the client read callback, and refactor the UDP isc_nm_read() implementation to be synchronous, so no datagram is lost between the time that we stop the reading from the UDP socket and we restart it again in the asynchronous udpread event. Add a unit test that tests the isc_nm_read() call from the read callback to receive two datagrams.	2022-09-19 12:20:41 +02:00
Ondřej Surý	6562227cc8	Handle canceled read during sending data over stats channel An assertion failure would be triggered when the TCP connection is canceled during sending the data back to the client. Don't require the state to be `RECV` on unsuccessful read to gracefully handle canceled TCP connection during the SEND state of the HTTPD channel.	2022-09-15 10:29:37 +02:00
Tony Finch	21a383a8fd	General-purpose unrolled ASCII tolower() loops When converting a string to lower case, the compiler is able to autovectorize nicely, so a nice simple implementation is also very fast, comparable to memcpy(). Comparisons are more difficult for the compiler, so we convert eight bytes at a time using "SIMD within a register" tricks. Experiments indicate it's best to stick to simple loops for shorter strings and the remainder of long strings.	2022-09-12 12:18:57 +01:00
Tony Finch	27a561273e	Consolidate some ASCII tables in `isc/ascii` and `isc/hex` There were a number of places that had copies of various ASCII tables (case conversion, hex and decimal conversion) that are intended to be faster than the ctype.h macros, or avoid locale pollution. Move them into libisc, and wrap the lookup tables with macros that avoid the ctype.h gotchas.	2022-09-12 12:18:57 +01:00
Michał Kępień	3b1c80fd0f	Fix error reporting for POSIX Threads functions Commit 3608abc8fa6a33046e1d34a0789cf7c9547f09ad inadvertently carried over a mistake in logging pthread_cond_init() errors to the ERRNO_CHECK() preprocessor macro: instead of passing the value returned by a given pthread_*() function to strerror_r(), ERRNO_CHECK() passes the errno variable to strerror_r(). This causes bogus error reports because POSIX Threads API functions do not set the errno variable. Fix by passing the value returned by a given pthread_*() function instead of the errno variable to strerror_r(). Since this change makes the name of the affected macro (ERRNO_CHECK()) confusing, rename the latter to PTHREADS_RUNTIME_CHECK(). Also log the integer error value returned by a given pthread_*() function verbatim to rule out any further confusion in runtime error reporting.	2022-09-09 20:25:47 +02:00
Evan Hunt	47e9fa981e	compression buffer was not reused correctly when the compression buffer was reused for multiple statistics requests, responses could grow beyond the correct size. this was because the buffer was not cleared before reuse; compressed data was still written to the beginning of the buffer, but then the size of used region was increased by the amount written, rather than set to the amount written. this caused responses to grow larger and larger, potentially reading past the end of the allocated buffer.	2022-09-08 11:15:52 +02:00
Michał Kępień	4c49068531	Fix building with --disable-doh Commit `b69e783164` inadvertently caused builds using the --disable-doh switch to fail, by putting the declaration of the isc__nm_async_settlsctx() function inside an #ifdef block that is only evaluated when DNS-over-HTTPS support is enabled. This results in the following compilation errors being triggered: netmgr/netmgr.c:2657:1: error: no previous prototype for 'isc__nm_async_settlsctx' [-Werror=missing-prototypes] 2657 | isc__nm_async_settlsctx(isc__networker_t *worker, isc__netievent_t *ev0) { | ^~~~~~~~~~~~~~~~~~~~~~~ Fix by making the declaration of the isc__nm_async_settlsctx() function in lib/isc/netmgr/netmgr-int.h visible regardless of whether DNS-over-HTTPS support is enabled or not.	2022-09-07 12:50:08 +02:00
Aram Sargsyan	2f11e48f0d	Fix isc_nm_listentlsdns() error path bug The isc_nm_listentlsdns() function erroneously calls isc__nm_tcpdns_stoplistening() instead of isc__nm_tlsdns_stoplistening() when something goes wrong, which can cause an assertion failure.	2022-09-05 14:58:52 +00:00
Aram Sargsyan	e97c3eea95	Add mctx attach/detach when creating/destroying a memory pool This should make sure that the memory context is not destroyed before the memory pool, which is using the context.	2022-09-02 08:16:17 +00:00
Ondřej Surý	718e92c31a	Clear the callbacks when isc_nm_stoplistening() is called When we are closing the listening sockets, there's a time window in which the TCP connection could be accepted although the respective stoplistening function has already returned control to the caller. Clear the accept callback function early, so it doesn't get called when we are not interested in the incoming connections anymore.	2022-08-26 09:09:25 +02:00
Ondřej Surý	4d07768a09	Remove the isc_app API The isc_app API is no longer used and has been removed.	2022-08-26 09:09:25 +02:00
Ondřej Surý	b69e783164	Update netmgr, tasks, and applications to use isc_loopmgr Previously: * applications were using isc_app as the base unit for running the application and signal handling. * networking was handled in the netmgr layer, which would start a number of threads, each with a uv_loop event loop. * task/event handling was done in the isc_task unit, which used netmgr event loops to run the isc_event calls. In this refactoring: * the network manager now uses isc_loop instead of maintaining its own worker threads and event loops. * the taskmgr that manages isc_task instances now also uses isc_loopmgr, and every isc_task runs on a specific isc_loop bound to the specific thread. * applications have been updated as necessary to use the new API. * new ISC_LOOP_TEST macros have been added to enable unit tests to run isc_loop event loops. unit tests have been updated to use this where needed.	2022-08-26 09:09:24 +02:00
Ondřej Surý	49b149f5fd	Update isc_timer to use isc_loopmgr * isc_timer was rewritten using the uv_timer, and isc_timermgr_t was completely removed; isc_timer objects are now directly created on the isc_loop event loops. * the isc_timer API has been simplified. the "inactive" timer type has been removed; timers are now stopped by calling isc_timer_stop() instead of resetting to inactive. * isc_manager now creates a loop manager rather than a timer manager. * modules and applications using isc_timer have been updated to use the new API.	2022-08-25 17:17:07 +02:00
Ondřej Surý	84c90e223f	New event loop handling API This commit introduces new APIs for applications and signal handling, intended to replace isc_app for applications built on top of libisc. * isc_app will be replaced with isc_loopmgr, which handles the starting and stopping of applications. In isc_loopmgr, the main thread is not blocked, but is part of the working thread set. The loop manager will start a number of threads, each with a uv_loop event loop running. Setup and teardown functions can be assigned which will run when the loop starts and stops, and jobs can be scheduled to run in the meantime. When isc_loopmgr_shutdown() is run from any of the loops, all loops will shut down and the application can terminate. * signal handling will now be handled with a separate isc_signal unit. isc_loopmgr only handles SIGTERM and SIGINT for application termination, but the application may install additional signal handlers, such as SIGHUP as a signal to reload configuration. * new job running primitives, isc_job and isc_async, have been added. Both units schedule callbacks (specifying a callback function and argument) on an event loop. The difference is that isc_job unit is unlocked and not thread-safe, so it can be used to efficiently run jobs in the same thread, while isc_async is thread-safe and uses locking, so it can be used to pass jobs from one thread to another. * isc_tid will be used to track the thread ID in isc_loop worker threads. * unit tests have been added for the new APIs.	2022-08-25 12:24:29 +02:00
Ondřej Surý	a26862e653	Simplify the isc_event API The ev_tag field was never used, and has now been removed.	2022-08-25 12:24:25 +02:00
Aram Sargsyan	8c4cdd9b21	Fix statistics channel multiple request processing with non-empty bodies When the HTTP request has a body part after the HTTP headers, it is not getting processed and is being prepended to the next request's data, which results in an error when trying to parse it. Improve the httpd.c:process_request() function with the following additions: 1. Require that HTTP POST requests must have Content-Length header. 2. When Content-Length header is set, extract its value, and make sure that it is valid and that the whole request's body is received before processing the request. 3. Discard the request's body by consuming Content-Length worth of data in the buffer.	2022-08-19 08:10:54 +00:00
Aram Sargsyan	86b8e62106	Enhance the have_header() function to find the HTTP header's value Add a new `const char **fvalue` parameter to the httpd.c:have_header() function which, when set, will point to the found header's value.	2022-08-19 08:10:54 +00:00
Evan Hunt	9d9bd3ace2	fix overflow error in mem_putstats() an integer overflow could cause an assertion failure when freeing memory.	2022-08-09 10:59:43 -07:00
Artem Boldariev	32565d0d65	TLS: do not ignore readpaused flag in certain circumstances In some circumstances generic TLS code could have resumed data reading unexpectedly on the TCP layer code. Due to this, the behaviour of isc_nm_pauseread() and isc_nm_resumeread() might have been unexpected. This commit fixes that. The bug does not seem to have real consequences in the existing code due to the way the code is used. However, the bug could have led to unexpected behaviour and, at any rate, makes the TLS code behave differently from the TCP code, with which it attempts to be as compatible as possible.	2022-08-02 14:02:01 +03:00
Artem Boldariev	c52c691b18	TLS: fix double resumption in isc__nm_tls_resumeread() This commit fixes an obvious error in isc__nm_tls_resumeread() so that read cannot be resumed twice.	2022-07-26 14:25:59 +03:00
Artem Boldariev	5d450cd0ba	TLS: clear 'errno' when handling SSL status Sometimes tls_do_bio() might be called when there is no new data to process (most notably, when resuming reads), in such a case internal TLS session state will remain untouched and old value in 'errno' will alter the result of SSL_get_error() call, possibly making it return SSL_ERROR_SYSCALL. This value will be treated as an error, and will lead to closing the connection, which is not what is expected.	2022-07-26 14:25:59 +03:00
Ondřej Surý	3e10d3b45f	Cleanup the STATID_CONNECT and STATID_CONNECTFAIL stat counters The STATID_CONNECT and STATID_CONNECTFAIL statistics were used incorrectly. The STATID_CONNECT was incremented twice (once in the *_connect_direct() and once in the callback) and STATID_CONNECTFAIL would not be incremented at all if the failure happened in the callback. Closes: #3452	2022-07-14 14:34:53 +02:00
Ondřej Surý	a280855f7b	Handle the transient TCP connect() failures on FreeBSD On FreeBSD (and perhaps other *BSD) systems, the TCP connect() call (via uv_tcp_connect()) can fail with a transient UV_EADDRINUSE error. The UDP code already handles this by trying three times (is a charm) before giving up. Add code for the TCP, TCPDNS and TLSDNS layers to also try three times before giving up by calling uv_tcp_connect() from the callback two more times on a UV_EADDRINUSE error. Additionally, stop the timer only if we succeed or on hard error via isc__nm_failed_connect_cb().	2022-07-14 14:20:10 +02:00
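The "try three times" policy described in the commit above can be illustrated with a small, synchronous sketch. This is only an approximation under stated assumptions: the real code retries asynchronously from the connect callback, and every name below (sketch_connect_retry, sketch_flaky_connect, SKETCH_CONNECT_TRIES) is hypothetical. A negated errno value stands in for the UV_EADDRINUSE error.

```c
#include <errno.h>

/* Hypothetical limit mirroring the "try three times" policy. */
#define SKETCH_CONNECT_TRIES 3

/* Hypothetical connect attempt: returns 0 on success, negative on error. */
typedef int (*sketch_connect_fn)(void *arg);

/*
 * Retry only while the failure is the transient "address in use" error;
 * any other result (success or hard error) ends the loop immediately.
 */
static int
sketch_connect_retry(sketch_connect_fn try_connect, void *arg) {
	int result = 0;

	for (int tries = 0; tries < SKETCH_CONNECT_TRIES; tries++) {
		result = try_connect(arg);
		if (result != -EADDRINUSE) {
			break;
		}
	}
	return result;
}

/* Fake connect used for illustration: fails *arg times, then succeeds. */
static int
sketch_flaky_connect(void *arg) {
	int *failures_left = arg;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -EADDRINUSE;
	}
	return 0;
}
```

With two transient failures the third attempt succeeds; with three, the caller sees the error after the third attempt and can treat it as a hard failure.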
Michał Kępień	b67ff4728f	Improve reporting for barrier errors uv_barrier_init() errors are currently ignored. Use UV_RUNTIME_CHECK() to catch them and to improve error reporting for any uv_barrier_init() run-time failures (by augmenting error messages with file/line information and the error string corresponding to the value returned).	2022-07-13 13:19:32 +02:00
Michał Kępień	7009f9d270	Improve reporting for read-write lock errors Replace direct uses of implementation-specific rwlock functions in lib/isc/include/isc/rwlock.h with preprocessor macros that use ERRNO_CHECK(), in order to augment rwlock-related error messages with file/line/caller information and the error string corresponding to errno. Adjust the implementation-specific functions for pthreads-based rwlocks so that they return any errors encountered to the caller instead of aborting execution immediately using RUNTIME_CHECK(). To keep code modifications simple, make the non-pthreads-based implementation-specific rwlock functions always return 0; these functions continue to handle errors using less verbose run-time assertions as they do not set errno anyway.	2022-07-13 13:19:32 +02:00
Michał Kępień	badeeff0ac	Improve reporting for condition variable errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/condition.h with ERRNO_CHECK(), in order to improve error reporting for any condition-variable-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-07-13 13:19:32 +02:00
Michał Kępień	f352a834a7	Improve reporting for mutex errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/mutex.h with ERRNO_CHECK(), in order to improve error reporting for any mutex-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-07-13 13:19:32 +02:00
Michał Kępień	77aead5ab6	Enable tracking of pthreads barriers Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_barrier_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_barrier_destroy() or else the memory allocated for the barrier will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying barriers that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_barrier_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set, allocate isc_barrier_t structures on the heap in isc_barrier_init() and free them in isc_barrier_destroy(). Reuse existing barrier macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
Ondřej Surý	e4606da2c6	Enable tracking of pthreads rwlocks Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_rwlock_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_rwlock_destroy() or else the memory allocated for the rwlock will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying rwlocks that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_rwlock_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set (and --enable-pthread-rwlock is used), allocate isc_rwlock_t structures on the heap in isc_rwlock_init() and free them in isc_rwlock_destroy(). Reuse existing functions defined in lib/isc/rwlock.c for other operations, but rename them first, so that they contain triple underscores (to indicate that these functions are implementation-specific, unlike their mutex and condition variable counterparts, which always use the pthreads implementation). Define the isc__rwlock_init() macro so that it is a logical counterpart of isc__mutex_init() and isc__condition_init(); adjust isc___rwlock_init() accordingly. Remove a redundant function prototype for isc__rwlock_lock() and rename that (static) function to rwlock_lock() in order to avoid having to use quadruple underscores.	2022-07-13 13:19:32 +02:00
Ondřej Surý	8dfdb95a20	Enable tracking of pthreads condition variables Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_cond_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_cond_destroy() or else the memory allocated for the condition variable will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying condition variables that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_cond_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set, allocate isc_condition_t structures on the heap in isc_condition_init() and free them in isc_condition_destroy(). Reuse existing condition variable macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
Ondřej Surý	ebcfb16576	Enable tracking of pthreads mutexes Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_mutex_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_mutex_destroy() or else the memory allocated for the mutex will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying mutexes that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_mutex_destroy() calls on any platform on which it works reliably. Introduce a new ISC_TRACK_PTHREADS_OBJECTS preprocessor macro, which causes isc_mutex_t structures to be allocated on the heap by isc_mutex_init() and freed by isc_mutex_destroy(). Reuse existing mutex macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
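The heap-allocation trick shared by the four "Enable tracking of pthreads ..." commits above can be sketched for mutexes as follows. This is a minimal illustration, not the libisc code: SKETCH_TRACK_PTHREADS_OBJECTS and the sketch_mutex_* names are hypothetical stand-ins, and error handling is reduced to abort().

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical switch standing in for ISC_TRACK_PTHREADS_OBJECTS. */
#define SKETCH_TRACK_PTHREADS_OBJECTS 1

#if SKETCH_TRACK_PTHREADS_OBJECTS

/* Tracking build: the handle is a pointer to a heap-allocated mutex. */
typedef pthread_mutex_t *sketch_mutex_t;

static void
sketch_mutex_init(sketch_mutex_t *mp) {
	*mp = malloc(sizeof(**mp));
	if (*mp == NULL || pthread_mutex_init(*mp, NULL) != 0) {
		abort();
	}
}

static void
sketch_mutex_destroy(sketch_mutex_t *mp) {
	if (pthread_mutex_destroy(*mp) != 0) {
		abort();
	}
	/* A missing destroy call leaves this allocation visible to a leak checker. */
	free(*mp);
	*mp = NULL;
}

#define sketch_mutex_lock(mp)	pthread_mutex_lock(*(mp))
#define sketch_mutex_unlock(mp) pthread_mutex_unlock(*(mp))

#else /* SKETCH_TRACK_PTHREADS_OBJECTS */

/* Regular build: the handle is the mutex itself, no extra allocation. */
typedef pthread_mutex_t sketch_mutex_t;

static void
sketch_mutex_init(sketch_mutex_t *mp) {
	if (pthread_mutex_init(mp, NULL) != 0) {
		abort();
	}
}

static void
sketch_mutex_destroy(sketch_mutex_t *mp) {
	if (pthread_mutex_destroy(mp) != 0) {
		abort();
	}
}

#define sketch_mutex_lock(mp)	pthread_mutex_lock(mp)
#define sketch_mutex_unlock(mp) pthread_mutex_unlock(mp)

#endif /* SKETCH_TRACK_PTHREADS_OBJECTS */
```

Callers use the same four operations in both builds; only the tracking build makes every forgotten destroy call show up as an unreleased malloc() allocation.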
Ondřej Surý	deae974366	Directly cause assertion failure on pthreads primitives failure Instead of returning error values from isc_rwlock_*(), isc_mutex_*(), and isc_condition_*() macros/functions and subsequently carrying out runtime assertion checks on the return values in the calling code, trigger assertion failures directly in those macros/functions whenever any pthread function returns an error, as there is no point in continuing execution in such a case anyway.	2022-07-13 13:19:32 +02:00
Ondřej Surý	8e5e0fa522	Use library constructor to create default mutex attr once Instead of using isc_once_do() on every isc_mutex_init() call, use the global library constructor to initialize the default mutex attr object (optionally with PTHREAD_MUTEX_ADAPTIVE_NP if supported) just once when the library is loaded.	2022-07-13 13:19:32 +02:00
Michał Kępień	5759ace07f	Handle pthread_*_init() failures consistently isc_rwlock_init() currently detects pthread_rwlock_init() failures using a REQUIRE() assertion. Use the ERRNO_CHECK() macro for that purpose instead, so that read-write lock initialization failures are handled identically as condition variable (pthread_cond_init()) and mutex (pthread_mutex_init()) initialization failures.	2022-07-13 13:19:32 +02:00
Michał Kępień	365b47caee	Add an ERRNO_CHECK() preprocessor macro In a number of situations in pthreads-related code, a common sequence of steps is taken: if the value returned by a library function is not 0, pass errno to strerror_r(), log the string returned by the latter, and immediately abort execution. Add an ERRNO_CHECK() preprocessor macro which takes those exact steps and use it wherever (conveniently) possible. Notes: 1. The "log the return value of strerror_r() and abort" pattern is used in a number of other places that this commit does not touch; only "!= 0" checks followed by isc_error_fatal() calls with non-customized error messages are replaced here. 2. This change temporarily breaks file name & line number reporting for isc__mutex_init() errors, to prevent breaking the build. This issue will be rectified in a subsequent change.	2022-07-13 13:19:32 +02:00
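The "check the return value, describe errno, abort" sequence from the commit above might look roughly like the sketch below. It is not the actual ERRNO_CHECK() definition: the SKETCH_ERRNO_CHECK and sketch_* names are hypothetical, and strerror() is swapped in for strerror_r() to keep the example portable (strerror() is not thread-safe, which is why real code prefers strerror_r()).

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format "file:line: func(): call(): error string" into buf. */
static void
sketch_describe_failure(char *buf, size_t len, const char *file, int line,
			const char *func, const char *call, int errnum) {
	snprintf(buf, len, "%s:%d: %s(): %s(): %s", file, line, func, call,
		 strerror(errnum));
}

/*
 * Hypothetical macro: if 'ret' is non-zero, log where the failure
 * happened together with the string for errno, then abort.
 */
#define SKETCH_ERRNO_CHECK(call, ret)                                     \
	do {                                                              \
		if ((ret) != 0) {                                         \
			char msg_[256];                                   \
			sketch_describe_failure(msg_, sizeof(msg_),       \
						__FILE__, __LINE__,       \
						__func__, #call, errno);  \
			fprintf(stderr, "%s\n", msg_);                    \
			abort();                                          \
		}                                                         \
	} while (0)

/* Example use: fclose() returns non-zero and sets errno on failure. */
static int
sketch_checked_close(FILE *fp) {
	int ret = fclose(fp);

	SKETCH_ERRNO_CHECK(fclose, ret);
	return ret;
}
```

Keeping the message formatting in a separate function makes the macro body short and lets the formatting be exercised without triggering abort().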
Artem Boldariev	ffcb54211e	TLS: do not ignore accept callback result Before this change the TLS code would ignore the accept callback result, and would not try to gracefully close the connection. This had not been noticed, as it is not really required for DoH. Now the code tries to shut down the TLS connection gracefully when accepting it is not successful.	2022-07-12 14:40:22 +03:00
Artem Boldariev	8585b92f98	TLSDNS: try to pass incoming data to OpenSSL if there are any Otherwise the code path will lead to a call to SSL_get_error() returning SSL_ERROR_SSL, which in turn might lead to closing the connection too early in an unexpected way, as it is clearly not what is intended. The issue was found when working on the loopmgr branch and appears to be timing related as well. Might be responsible for some unexpected transmission failures e.g. on zone transfers.	2022-07-12 14:40:22 +03:00
Artem Boldariev	fc74b15e67	TLS: bail out earlier when NM is stopping In some operations - most prominently when establishing connection - it might be beneficial to bail out earlier when the network manager is stopping. The issue is backported from loopmgr branch, where such a change is not only beneficial, but required.	2022-07-12 14:40:22 +03:00
Artem Boldariev	ac4fb34f18	TLS: sometimes TCP conn. handle might be NULL when connecting In some cases - in particular, in case of errors - NULL might be passed to a connection callback instead of a handle, which could have led to an abort. This commit ensures that such a situation will not occur. The issue was found when working on the loopmgr branch.	2022-07-12 14:40:22 +03:00
Artem Boldariev	88524e26ec	TLS: try to close sockets whenever there are no pending operations This commit ensures that the underlying TCP socket of a TLS connection gets closed earlier whenever there are no pending operations on it. In the loop-manager branch, in some circumstances the connection could have remained open for far too long for no reason. This commit ensures that will not happen.	2022-07-12 14:40:22 +03:00
Artem Boldariev	237ce05b89	TLS: Implement isc_nmhandle_setwritetimeout() This commit adds a proper implementation of isc_nmhandle_setwritetimeout() for TLS connections. Now it passes the value to the underlying TCP handle.	2022-07-12 14:40:22 +03:00
Evan Hunt	a499794984	REQUIRE should not have side effects it's a style violation to have REQUIRE or INSIST contain code that must run for the server to work. this was being done with some atomic_compare_exchange calls. these have been cleaned up. uses of atomic_compare_exchange in assertions have been replaced with a new macro atomic_compare_exchange_enforced, which uses RUNTIME_CHECK to ensure that the exchange was successful.	2022-07-05 12:22:55 -07:00
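The difference between a compare-exchange hidden inside an assertion and an "enforced" one can be sketched as below. This is an illustration only, not the atomic_compare_exchange_enforced definition from the source tree: SKETCH_CAS_ENFORCED and sketch_mark_closing are hypothetical names. The analogy is with the standard assert(), whose argument is not evaluated at all when NDEBUG is defined.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Problematic pattern (shown as a comment only): the exchange is a side
 * effect of the assertion, so it disappears if assertions are compiled out.
 *
 *   assert(atomic_compare_exchange_strong(&flag, &expected, true));
 */

/*
 * Hypothetical enforced variant: the exchange always runs, and a failed
 * exchange is treated as a fatal run-time error.
 */
#define SKETCH_CAS_ENFORCED(obj, expected, desired)                       \
	do {                                                              \
		if (!atomic_compare_exchange_strong((obj), (expected),    \
						    (desired))) {         \
			fprintf(stderr,                                   \
				"%s:%d: enforced compare-exchange "      \
				"failed\n",                               \
				__FILE__, __LINE__);                      \
			abort();                                          \
		}                                                         \
	} while (0)

/* Flip a flag from false to true exactly once; abort if it was already set. */
static void
sketch_mark_closing(atomic_bool *flag) {
	bool expected = false;

	SKETCH_CAS_ENFORCED(flag, &expected, true);
}
```

The state change is now an ordinary statement, and the check on its result is the only part that behaves like an assertion.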
Artem Boldariev	d2e13ddf22	Update the set of HTTP endpoints on reconfiguration This commit ensures that on reconfiguration the set of HTTP endpoints (=paths) is being updated within HTTP listeners.	2022-06-28 15:42:38 +03:00
Artem Boldariev	e72962d5f1	Update max concurrent streams limit in HTTP listeners on reconfig This commit ensures that HTTP listeners concurrent streams limit gets updated properly on reconfiguration.	2022-06-28 15:42:38 +03:00
Michal Nowak	1c45a9885a	Update clang to version 14	2022-06-16 17:21:11 +02:00
Artem Boldariev	e616d7f240	TLS DNS: do not call accept callback twice Before the changes from this commit were introduced, the accept callback function would get called twice when accepting a connection, once at each of these two stages: * when accepting the TCP connection; * when the handshake has completed. That is clearly an error, as it should have been called only once. As far as I understand it, the mistake is a result of TLS DNS transport being essentially a fork of TCP transport, where calling the accept callback immediately after accepting the TCP connection makes sense. This commit fixes this mistake. It did not have any very serious consequences because in BIND the accept callback only checks an ACL and updates stats.	2022-06-15 14:21:11 +03:00
Ondřej Surý	b432d5d3bc	Gracefully handle uv_read_start() failures Under specific rare timing circumstances the uv_read_start() could fail with UV_EINVAL when the connection is reset between the connect (or accept) and the uv_read_start() call on the nmworker loop. Handle such situation gracefully by propagating the errors from uv_read_start() into upper layers, so the socket can be internally closed().	2022-06-14 11:33:02 +02:00
Ondřej Surý	2c3b2dabe9	Move all the unit tests to /tests/<libname>/ The unit tests are now using a common base, which means that lib/dns/tests/ code now has to include lib/isc/include/isc/test.h and link with lib/isc/test.c and lib/ns/tests has to include both libisc and libdns parts. Instead of cross-linking code between the directories, move the /lib/<foo>/test.c to /tests/<foo>.c and /lib/<foo>/include/<foo>test.h to /tests/include/tests/<foo>.h and create a single libtest.la convenience library in /tests/. At the same time, move the /lib/<foo>/tests/ to /tests/<foo>/ (but keep it symlinked to the old location) and adjust paths accordingly. In few places, we are now using absolute paths instead of relative paths, because the directory level has changed. By moving the directories under the /tests/ directory, the test-related code is kept in a single place and we can avoid referencing files between libns->libdns->libisc which is unhealthy because they live in a separate Makefile-space. In the future, the /bin/tests/ should be merged to /tests/ and symlink kept, and the /fuzz/ directory moved to /tests/fuzz/.	2022-05-28 14:53:02 -07:00
Ondřej Surý	63fe9312ff	Give the unit tests a big overhaul The unit tests contain a lot of duplicated code and here's an attempt to reduce code duplication. This commit does several things: 1. Remove #ifdef HAVE_CMOCKA - we already solve this with automake conditionals. 2. Create a set of ISC_TEST_* and ISC_*_TEST_ macros to wrap the test implementations, test lists, and the main test routine, so we don't have to repeat this all over again. The macros were modeled after libuv test suite but adapted to cmocka as the test driver. A simple example of a unit test would be: ISC_RUN_TEST_IMPL(test1) { assert_true(true); } ISC_TEST_LIST_START ISC_TEST_ENTRY(test1) ISC_TEST_LIST_END ISC_TEST_MAIN (Discussion: Should this be ISC_TEST_RUN ?) More complicated examples include group setup and teardown functions, and per-test setup and teardown functions. 3. The macros prefix the test functions and cmocka entries, so the name of the test can now match the tested function name, and we don't have to append `_test` because `run_test_` is automatically prepended to the main test function, and `setup_test_` and `teardown_test_` are prepended to the setup and teardown functions. 4. Update all the unit tests to use the new syntax and fix a few bits here and there. 5. In the future, we can separate the test declarations and test implementations, which is going to greatly help with uncluttering the bigger unit tests like doh_test and netmgr_test, because the test implementations are not declared static (see `ISC_RUN_TEST_DECLARE` and `ISC_RUN_TEST_IMPL` for more details). NOTE: This heavily relies on preprocessor macros, but the result greatly outweighs all the negatives of using the macros. There's less duplicated code, the tests are more uniform and the implementation can be more flexible.	2022-05-28 14:52:56 -07:00
Ondřej Surý	1fe391fd40	Make all tasks to be bound to a thread Previously, tasks could be created either unbound or bound to a specific thread (worker loop). The unbound tasks would be assigned to a random thread every time isc_task_send() was called. Because there's no logic that would assign the task to the least busy worker, this just creates unpredictability. Instead of random assignment, bind all the previously unbound tasks to worker 0, which is guaranteed to exist.	2022-05-25 16:04:51 +02:00
Artem Boldariev	98f758ed4f	CID 352848: split xfrin_start() and remove dead code This commit moves the TLS context creation code out of xfrin_start(), which has become too large and hard to follow, into a new function (similarly to how it is done in dighost.c). The dead code has been removed from the cleanup section of the TLS creation code: * there is no way 'tlsctx' can equal 'found'; * there is no way 'sess_cache' can be non-NULL in the cleanup section. Also, it fixes a bug in the older version of the code, where TLS client session context fetched from the cache would not get passed to isc_nm_tlsdnsconnect().	2022-05-25 12:38:38 +03:00
Petr Menšík	057438cb45	Fix failures in isc netmgr_test on big endian machines Type conversion from the libuv structure to isc_region_t is not possible, because their sizes differ on 64-bit architectures. Little endian machines seem to be lucky and the test still passes, but big endian machines such as s390x fail the test reliably. Fix by directly creating the buffer as isc_region_t and skipping the type conversion. This is more readable and more correct.	2022-05-24 19:51:30 +02:00
Artem Boldariev	40be3c9263	Do not provide a shim for SSL_SESSION_is_resumable() The recently added TLS client session cache used SSL_SESSION_is_resumable() to avoid polluting the cache with non-resumable sessions. However, it turned out that we cannot provide a shim for this function across the whole range of OpenSSL versions due to the fact that OpenSSL 1.1.0 uses opaque pointers for SSL_SESSION objects. The commit replaces the shim for SSL_SESSION_is_resumable() with a non-public approximation of it on systems shipped with OpenSSL 1.1.0. It is not turned into a proper shim because it does not fully emulate the behaviour of SSL_SESSION_is_resumable(), but in our case it is good enough, as it still helps to protect the cache from pollution. For systems shipped with OpenSSL 1.0.X and derivatives (e.g. older versions of LibreSSL), the provided replacement perfectly mimics the function it is intended to replace.	2022-05-23 18:25:18 +03:00
Artem Boldariev	9abb00bb5f	Fix an abort in DoH (client-side) when writing on closing sock The commit fixes a corner case in client-side DoH code, when a write attempt is done on a closing socket (session). The change ensures that the write call-back will be called with a proper error code (see failed_send_cb() call in client_httpsend()).	2022-05-20 20:18:40 +03:00
Artem Boldariev	245f7cec2e	Avoid aborting when uv_timer_start() is used on a closing socket In such a case it will return UV_EINVAL (-EINVAL), leading to aborting, as the code expects the function to succeed.	2022-05-20 20:18:40 +03:00
Artem Boldariev	35338b4105	Add SSL_SESSION_is_resumable() implementation shim This commit adds SSL_SESSION_is_resumable() implementation if it is missing.	2022-05-20 20:17:48 +03:00
Artem Boldariev	86465c1dac	DoT: implement TLS client session resumption This commit extends DoT code with TLS client session resumption support implemented on top of the TLS client session cache.	2022-05-20 20:17:48 +03:00
Artem Boldariev	90bc13a5d5	TLS stream/DoH: implement TLS client session resumption This commit extends TLS stream code and DoH code with TLS client session resumption support implemented on top of the TLS client session cache.	2022-05-20 20:17:45 +03:00
Artem Boldariev	987892d113	Extend TLS context cache with TLS client session cache This commit extends TLS context cache with TLS client session cache so that an associated session cache can be stored alongside the TLS context within the context cache.	2022-05-20 20:13:20 +03:00
Artem Boldariev	4ef40988f3	Add TLS client session cache implementation This commit adds an implementation of a client TLS session cache. TLS client session cache is an object which allows efficient storing and retrieval of previously saved TLS sessions so that they can be resumed. This object is supposed to be a foundation for implementing TLS session resumption - a standard technique to reduce the cost of re-establishing a connection to the remote server endpoint. OpenSSL does server-side TLS session caching transparently by default. However, on the client-side, a TLS session to resume must be manually specified when establishing the TLS connection. The TLS client session cache is precisely the foundation for that.	2022-05-20 20:13:20 +03:00
Ondřej Surý	61117840c1	Move setting the sock->write_timeout to the async_*send Setting the sock->write_timeout from the TCP, TCPDNS, and TLSDNS send functions could lead to a (harmless) data race when setting the value for the first time if the isc_nm_send() function was called from a thread not matching the socket we are sending to. Move the setting of sock->write_timeout to the matching async function, which is always called from the matching thread.	2022-05-19 22:36:47 +02:00
Ondřej Surý	14c8d43863	Use C2x [[fallthrough]] when supported by LLVM/clang Clang added support for the gcc-style fallthrough attribute (i.e. __attribute__((fallthrough))) in version 10. However, __has_attribute(fallthrough) will return 1 in C mode in older versions, even though they only support the C++11 fallthrough attribute. At best, the unsupported attribute is simply ignored; at worst, it causes errors. The C2x fallthrough attribute has the advantages of being supported in the broadest range of clang versions (added in version 9) and being easy to check for support. Use C2x [[fallthrough]] attribute if possible, and fall back to not using an attribute for clang versions that don't have it. Courtesy of Joshua Root	2022-05-19 21:40:24 +02:00
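The detection order described in the commit above (prefer the C2x attribute, otherwise use nothing on clang) could be expressed along these lines. This is a sketch under assumptions, not the macro from the source tree: SKETCH_FALLTHROUGH and sketch_remaining_steps are hypothetical names, and the GCC 7 cutoff for `__attribute__((fallthrough))` is an assumption of this example.

```c
/* Prefer the C2x [[fallthrough]] attribute when the compiler reports it. */
#if defined(__has_c_attribute)
#if __has_c_attribute(fallthrough)
#define SKETCH_FALLTHROUGH [[fallthrough]]
#endif
#endif

/* Otherwise use the GNU-style attribute, but only on GCC itself. */
#if !defined(SKETCH_FALLTHROUGH) && defined(__GNUC__) && \
	!defined(__clang__) && __GNUC__ >= 7
#define SKETCH_FALLTHROUGH __attribute__((fallthrough))
#endif

/* Last resort: no attribute at all, just an empty statement. */
#if !defined(SKETCH_FALLTHROUGH)
#define SKETCH_FALLTHROUGH \
	do {               \
	} while (0)
#endif

/* Count how many stages are still to run, falling through on purpose. */
static int
sketch_remaining_steps(int stage) {
	int steps = 0;

	switch (stage) {
	case 0:
		steps++;
		SKETCH_FALLTHROUGH;
	case 1:
		steps++;
		SKETCH_FALLTHROUGH;
	case 2:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

The `__has_c_attribute` test is nested rather than combined with `&&` so that preprocessors lacking that operator never have to parse it.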
Michal Nowak	c9aca34b1e	Merge tag 'v9_19_1' BIND 9.19.1	2022-05-19 10:55:42 +02:00
Evan Hunt	6936db2f59	Always use the number of CPUS for resolver->ntasks Since the fctx hash table is now self-resizing, and resolver tasks are selected to match the thread that created the fetch context, there shouldn't be any significant advantage to having multiple tasks per CPU; a single task per thread should be sufficient. Additionally, the fetch context is always pinned to the calling netmgr thread to minimize the contention just to coalesced fetches - if two threads start the same fetch, it will be pinned to the first one to get the bucket.	2022-05-19 09:27:33 +02:00
Ondřej Surý	933162ae14	Lock the trampoline when attaching When attaching to the trampoline, the isc__trampoline_max was accessed unlocked. This would not manifest under normal circumstances because we initialize 65 trampolines by default and that's enough for most commodity hardware, but there are ARM machines with 128+ cores where this would be reported by ThreadSanitizer. Add locking around the code in isc__trampoline_attach(). This also requires the lock to leak on exit (along with memory that we already leak) because a new thread might be attaching to the trampoline while we are running the library destructor at the same time.	2022-05-13 10:07:20 +02:00
Ondřej Surý	0582478c96	Remove isc_task_destroy() and isc_task_shutdown() After removing the isc_task_onshutdown(), the isc_task_shutdown() and isc_task_destroy() became obsolete. Remove calls to isc_task_shutdown() and replace the calls to isc_task_destroy() with isc_task_detach(). Simplify the internal logic to destroy the task when the last reference is removed.	2022-05-12 14:55:49 +02:00
Ondřej Surý	2235edabcf	Remove isc_task_onshutdown() The isc_task_onshutdown() was used to post an event that should be run when the task is being shut down. This could happen explicitly in the isc_test_shutdown() call or implicitly when we detach the last reference to the task and there are no more events posted on the task. This whole task onshutdown mechanism just makes things more complicated, and it's easier to post the "shutdown" events when we are shutting down explicitly, as the existing code already always knows when it should shut down the task that's being used to execute the onshutdown events. Replace the isc_task_onshutdown() calls with explicit calls to execute the shutdown tasks.	2022-05-12 13:45:34 +02:00
Artem Boldariev	a696be6a2d	Fix a crash by avoiding destroying TLS stream socket too early This commit fixes a crash in generic TLS stream code, which could be reproduced during some runs of the 'sslyze' tool. The intention of this commit is twofold. Firstly, it ensures that the TLS socket object cannot be destroyed too early. Now it is being deleted alongside the underlying TCP socket object. Secondly, it ensures that the TLS socket object cannot be destroyed as a result of calling 'tls_do_bio()' (the primary function which performs encryption/decryption during the IO) as the code did not expect that. This code path is fixed now.	2022-05-04 19:38:16 +02:00
Ondřej Surý	a0a102cc50	Restore the implementation of uv_os_getenv() shim Somewhere in the move from netmgr/uv-compat.h to uv.c, the uv_os_getenv() implementation was lost. Restore the implementation, so we can support Debian stretch for a couple more months.	2022-05-04 12:31:46 +02:00
Ondřej Surý	b43812692d	Move netmgr/uv-compat.h to <isc/uv.h> As we are going to use libuv outside of the netmgr, we need the shims to be readily available for the rest of the codebase. Move the "netmgr/uv-compat.h" to <isc/uv.h> and netmgr/uv-compat.c to uv.c, and as a rule of thumb, the users of libuv should include <isc/uv.h> instead of <uv.h> directly. Additionally, merge netmgr/uverr2result.c into uv.c and rename the single function from isc__nm_uverr2result() to isc_uverr2result().	2022-05-03 10:02:19 +02:00
Ondřej Surý	24c3879675	Move socket related functions to netmgr/socket.c Move the netmgr socket related functions from netmgr/netmgr.c and netmgr/uv-compat.c to netmgr/socket.c, so they are all present in the same place. Adjust the names of a couple of internal functions accordingly.	2022-05-03 09:52:49 +02:00
Tony Finch	66b3cb9732	Remove several superfluous newlines in log messages	2022-05-02 23:49:38 +01:00
Artem Boldariev	978f97dcdd	TLSDNS: call send callbacks only after the data was sent This commit ensures that write callbacks are getting called only after the data has been sent via the network. Without this fix, a situation could arise in which a write callback got called before the actual encrypted data had been sent to the network. Instead, it would get called right after the data had been passed to OpenSSL (i.e. encrypted). Most likely, the issue does not reveal itself often because the callback call was asynchronous, so in most cases it should have been called after the data has been sent, but that was not guaranteed by the code logic. Also, this commit removes one memory allocation (netievent) from a hot path, as there is no need to call this callback asynchronously anymore.	2022-04-27 17:44:23 +03:00
Ondřej Surý	407b37c3f2	Set IP(V6)_RECVERR on connect UDP sockets (via libuv) The connect()ed UDP socket provides feedback on a variety of ICMP errors (eg port unreachable) which bind can then use to decide what to do with errors (report them to the client, try again with a different nameserver etc). However, Linux's implementation does not report what it considers "transient" conditions, which is defined as Destination host Unreachable, Destination network unreachable, Source Route Failed and Message Too Big. Explicitly enable IP_RECVERR / IPV6_RECVERR (via libuv uv_udp_bind() flag) to learn about ICMP destination network/host unreachable.	2022-04-26 12:22:18 +02:00
Ondřej Surý	eb8f2974b1	Abort when libuv at runtime mismatches libuv at compile time When we compile with a libuv that has some capabilities available via flags passed to e.g. uv_udp_listen() or uv_udp_bind(), a call with such flags would fail with invalid arguments when an older libuv version that doesn't understand the flag is linked at runtime. Enforce a minimal libuv version when flags were available at compile time but are not available at runtime. This check is less strict than enforcing the runtime libuv version to be the same as or higher than the compile-time libuv version.	2022-04-26 11:40:40 +02:00
Tony Finch	b2950c96de	Revert "Move random number re-seeding out of the hot path" This reverts commit `b1bb41603e`.	2022-04-25 15:18:58 +01:00
Tony Finch	b1bb41603e	Move random number re-seeding out of the hot path Instead of checking if we need to re-seed for every isc_random call, seed the random number generator in the libisc global initializer and the per-thread initializer.	2022-04-22 16:40:37 +01:00
Tony Finch	254d2abafb	Clean up isc_random Remove redundant comments and avoid implicit casts.	2022-04-22 16:40:37 +01:00
Tony Finch	d20ea4a703	Make isc_random_uniform() nearly divisionless It used to require two 32-bit integer divisions to get a random number less than some limit. Now we use Daniel Lemire's "nearly-divisionless" algorithm for unbiased bounded random numbers, which requires one 64-bit integer multiply in the usual case, and one 32-bit integer division in rare slow cases. Even the slow cases are faster than before; there are also fewer branches. I think this algorithm is exceptionally beautiful. It also has more clever tricks than lines of code, so I have done my best to explain how it works.	2022-04-22 16:40:37 +01:00
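The bounded-random technique named in the commit above can be sketched as follows. This is a minimal illustration of Lemire's published algorithm, not the code from this commit: `sketch_random32()` is a hypothetical xorshift stand-in for the real generator, and `sketch_random_uniform()` is an illustrative name.

```c
#include <stdint.h>

/* Hypothetical stand-in generator (xorshift32); not BIND's RNG. */
static uint32_t sketch_state = 2463534242u;

static uint32_t
sketch_random32(void) {
	uint32_t x = sketch_state;
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	sketch_state = x;
	return x;
}

/* Return an unbiased value in [0, limit); limit must be non-zero. */
static uint32_t
sketch_random_uniform(uint32_t limit) {
	/* Usual case: one 64-bit multiply, no division. */
	uint64_t m = (uint64_t)sketch_random32() * limit;
	uint32_t low = (uint32_t)m;

	if (low < limit) {
		/* Rare slow path: (0 - limit) % limit equals 2^32 mod limit. */
		uint32_t threshold = (0u - limit) % limit;
		while (low < threshold) {
			/* Reject values that would bias the result. */
			m = (uint64_t)sketch_random32() * limit;
			low = (uint32_t)m;
		}
	}
	/* The high 32 bits of the product are the bounded result. */
	return (uint32_t)(m >> 32);
}
```

The division only runs when the low half of the product falls below `limit`, which is why the common case needs just the multiply.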
Michał Kępień	7aa7b6474b	Prevent memory bloat caused by a jemalloc quirk Since version 5.0.0, decay-based purging is the only available dirty page cleanup mechanism in jemalloc. It relies on so-called tickers, which are simple data structures used for ensuring that certain actions are taken "once every N times". Ticker data (state) is stored in a thread-specific data structure called tsd in jemalloc parlance. Ticks are triggered when extents are allocated and deallocated. Once every 1000 ticks, jemalloc attempts to release some of the dirty pages hanging around (if any). This allows memory use to be kept in check over time. This dirty page cleanup mechanism has a quirk. If the first allocator-related action for a given thread is a free(), a minimally-initialized tsd is set up which does not include ticker data. When that thread subsequently calls *alloc(), the tsd transitions to its nominal state, but due to a certain flag being set during minimal tsd initialization, ticker data remains unallocated. This prevents decay-based dirty page purging from working, effectively enabling memory exhaustion over time. [1] The quirk described above has been addressed (by moving ticker state to a different structure) in jemalloc's development branch [2], but not in any numbered jemalloc version released to date (the latest one being 5.2.1 as of this writing). Work around the problem by ensuring that every thread spawned by isc_thread_create() starts with a malloc() call. Avoid immediately calling free() for the dummy allocation to prevent an optimizing compiler from stripping away the malloc() + free() pair altogether. An alternative implementation of this workaround was considered that used a pair of isc_mem_create() + isc_mem_destroy() calls instead of malloc() + free(), enabling the change to be fully contained within isc__trampoline_run() (i.e. to not touch struct isc__trampoline), as the compiler is not allowed to strip away arbitrary function calls. 
However, that solution was eventually dismissed as it triggered ThreadSanitizer reports when tools like dig, nsupdate, or rndc exited abruptly without waiting for all worker threads to finish their work. [1] https://github.com/jemalloc/jemalloc/issues/2251 [2] `c259323ab3`	2022-04-21 14:19:39 +02:00
Ondřej Surý	d1d88a2895	Add detailed tracing when TASKMGR_TRACE is defined When TASKMGR_TRACE=1 is defined, the task and event objects have detailed tracing information about function, file, line, and backtrace (to the extent tracked by gcc) where it was created. At exit, when there are unfinished tasks, they will be printed along with the detailed information.	2022-04-19 14:25:23 +02:00
Ondřej Surý	f0feaa3305	Remove isc_task_sendto(anddetach) functions The only place where isc_task_sendto() was used was in dns_resolver unit, where the "sendto" part was actually no-op, because dns_resolver uses bound tasks. Remove the isc_task_sendto() and isc_task_sendtoanddetach() functions in favor of using bound tasks created with isc_task_create_bound(). Additionally, cache the number of running netmgr threads (nworkers) locally to reduce the number of function calls.	2022-04-19 14:24:36 +02:00
Ondřej Surý	1eeb4c1121	Remove isc_event_constallocate() The isc_event_constallocate() function was not used anywhere, thus remove the isc_event_constallocate() macro, declaration and definition.	2022-04-19 13:46:26 +02:00
Ondřej Surý	f55a4d3e55	Allow listening on less than nworkers threads For some applications, it's useful to not listen on full battery of threads. Add workers argument to all isc_nm_listen*() functions and convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.	2022-04-19 11:08:13 +02:00
Artem Boldariev	df317184eb	Add isc_nmsocket_set_tlsctx() This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function that replaces the TLS context within a given TLS-enabled listener socket object. It is based on the newly added reference counting functionality. The intention of adding this function is to add functionality to replace a TLS context without recreating the whole socket object, including the underlying TCP listener socket, as a BIND process might not have enough permissions to re-create it fully on reconfiguration.	2022-04-06 18:45:57 +03:00
Artem Boldariev	25609156a5	Maintain a per-thread TLS ctx reference in TLS stream code This commit changes the generic TLS stream code to maintain a per-worker thread TLS context reference.	2022-04-06 18:45:57 +03:00
Artem Boldariev	9256026d18	Use isc_tlsctx_attach() in TLS DNS code This commit adds proper reference counting for TLS contexts into generic TLS DNS (DoT) code.	2022-04-06 18:45:57 +03:00
Artem Boldariev	b52d46612f	Use isc_tlsctx_attach() in TLS stream code This commit adds proper reference counting for TLS contexts into generic TLS stream code.	2022-04-06 18:45:57 +03:00
Artem Boldariev	a7a482c1b1	Add isc_tlsctx_attach() The implementation is done on top of the reference counting functionality found in OpenSSL/LibreSSL, which allows for avoiding wrapping the object. Adding this function allows using reference counting for TLS contexts in BIND 9's codebase.	2022-04-06 18:45:57 +03:00
Mark Andrews	98718b3b4b	Unlink the timer event before trying to purge it as far as I can determine the order of operations is not important. *** CID 351372: Concurrent data access violations (ATOMICITY) /lib/isc/timer.c: 227 in timer_purge() 221 LOCK(&timer->lock); 222 if (!purged) { 223 /* 224 * The event has already been executed, but not 225 * yet destroyed. 226 */ >>> CID 351372: Concurrent data access violations (ATOMICITY) >>> Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect. 227 timerevent_unlink(timer, event); 228 } 229 } 230 } 231 232 void	2022-04-06 07:33:41 +00:00
Artem Boldariev	f0ac4c47b0	Change X509_STORE_up_ref() shim return value X509_STORE_up_ref() must return 1 on success, while the previous implementation would return the references count. This commit fixes that.	2022-04-05 15:03:27 +03:00
Ondřej Surý	7868d8145b	Rename shutdown() to test_shutdown() in timer_test.c The shutdown() is part of standard library (POSIX-1), don't use such name in the timer_test.c, but rather rename it to test_shutdown().	2022-04-05 01:49:04 +02:00
Ondřej Surý	142c63dda8	Enable the load-balance-sockets configuration Previously, HAVE_SO_REUSEPORT_LB has been defined only in the private netmgr-int.h header file, making the configuration of load balanced sockets inoperable. Move the missing HAVE_SO_REUSEPORT_LB define to isc/netmgr.h and add the missing isc_nm_getloadbalancesockets() implementation.	2022-04-05 01:30:58 +02:00
Ondřej Surý	85c6e797aa	Add option to configure load balance sockets Previously, the option to enable kernel load balancing of the sockets was always enabled when supported by the operating system (SO_REUSEPORT on Linux and SO_REUSEPORT_LB on FreeBSD). It was reported that in scenarios where the networking threads are also responsible for processing long-running tasks (like RPZ processing, CATZ processing or large zone transfers), this could lead to intermittent brownouts for some clients, because the thread assigned by the operating system might be busy. In such scenarios, the overall performance would be better served by threads competing over the sockets because the idle threads can pick up the incoming traffic. Add new configuration option (`load-balance-sockets`) to allow enabling or disabling the load balancing of the sockets.	2022-04-04 23:10:04 +02:00
Ondřej Surý	f106d0ed2b	Run the RPZ update as offloaded work Previously, the RPZ updates ran quantized on the main nm_worker loops. As the quantum was set to 1024, this might lead to service interruptions when large RPZ update was processed. Change the RPZ update process to run as the offloaded work. The update and cleanup loops were refactored to do as little locking of the maintenance lock as possible for the shortest periods of time and the db iterator is being paused for every iteration, so we don't hold the rbtdb tree lock for prolonged periods of time.	2022-04-04 21:20:05 +02:00
Ondřej Surý	ae01ec2823	Don't use reference counting in isc_timer unit The reference counting and isc_timer_attach()/isc_timer_detach() semantics are actually misleading because they cannot be used under normal conditions. In typical use, the object that owns the timer is also passed as the argument to the timer itself. This means that when the caller is using `isc_timer_detach()` it needs the timer to stop and the isc_timer_detach() does that only if this would be the last reference. Unfortunately, this also means that if the timer is attached elsewhere and the timer is fired it will most likely be use-after-free, because the object used in the timer no longer exists. Remove the reference counting from the isc_timer unit, remove isc_timer_attach() function and rename isc_timer_detach() to isc_timer_destroy() to better reflect how the API needs to be used. The only caveat is that the already executed event must be destroyed before the isc_timer_destroy() is called because the timer is no longer attached to .ev_destroy_arg.	2022-04-02 01:23:15 +02:00
Ondřej Surý	30e0fd942b	Remove task privileged mode Previously, the task privileged mode has been used only when the named was starting up and loading the zones from the disk as the "first" thing to do. The privileged task was setup with quantum == 2, which made the taskmgr/netmgr spin around the privileged queue processing two events at the time. The same effect can be achieved by setting the quantum to UINT_MAX (e.g. practically unlimited) for the loadzone task, hence the privileged task mode was removed in favor of just processing all the events on the loadzone task in a single task_run().	2022-04-01 23:55:26 +02:00
Ondřej Surý	62a72211aa	Remove isc_pool API Since the last user of the isc_pool API is gone, remove the whole isc_pool API.	2022-04-01 23:50:34 +02:00
Ondřej Surý	2bc7303af2	Use isc_nm_getnworkers to manage zone resources Instead of passing the number of workers to the dns_zonemgr manually, get the number of nm threads using the new isc_nm_getnworkers() call. Additionally, remove the isc_pool API and manage the arrays of memory contexts, zonetasks and loadtasks directly in the zonemgr.	2022-04-01 23:50:34 +02:00
Ondřej Surý	2707d0eeb7	Set hard thread affinity for each zone After switching to per-thread resources in the zonemgr, the performance was decreased because the memory context, zonetask and loadtask was picked from the pool at random. Pin the zone to single threadid (.tid) and align the memory context, zonetask and loadtask to be the same, this sets the hard affinity of the zone to the netmgr thread.	2022-04-01 23:50:34 +02:00
Ondřej Surý	a94678ff77	Create per-thread task and memory context for zonemgr Previously, the zonemgr created 1 task per 100 zones and 1 memory context per 1000 zones (with minimum 10 tasks and 2 memory contexts) to reduce the contention between threads. Instead of reducing the contention by having many resources, create a per-nm_thread memory context, loadtask and zonetask and spread the zones between just per-thread resources. Note: this commit alone does decrease performance when loading the zone by couple seconds (in case of 1M zone) and thus there's more work in this whole MR fixing the performance.	2022-04-01 23:50:34 +02:00
Ondřej Surý	15ea6f002f	Add isc_task_setquantum() and use it for post-init zone loading Add isc_task_setquantum() function that modifies quantum for the future isc_task_run() invocations. NOTE: The current isc_task_run() caches the task->quantum into a local variable and therefore the current event loop is not affected by any quantum change.	2022-04-01 23:45:23 +02:00
Ondřej Surý	c17eee034b	Remove isc_task_purge() and isc_task_purgerange() The isc_task_purge() and isc_task_purgerange() were now unused, so sweep the task.c file. Additionally remove unused ISC_EVENTATTR_NOPURGE event attribute.	2022-04-01 23:45:23 +02:00
Ondřej Surý	48b2a5df97	Keep the list of scheduled events on the timer Instead of searching for the events to purge, keep the list of scheduled events on the timer list and purge the events that we have scheduled.	2022-04-01 23:45:23 +02:00
Ondřej Surý	17aed2f895	Repair isc_task_purgeevent(), clean isc_task_unsend{,range}() The isc_task_purgerange() was walking through all events on the task to find a matching task. Instead use the ISC_LINK_LINKED to find whether the event is active. Cleanup the related isc_task_unsend() and isc_task_unsendrange() functions that were not used anywhere.	2022-04-01 23:45:23 +02:00
Ondřej Surý	b84c9b2608	Turn isc_hash_bits32() into static inline function Adding extra val & 0xffff in the isc_hash_bits32() macros in the hotpath has significantly reduced the performance. Turn the macro into static inline function matching the previous hash_32() function used to compute hashval matching the hashtable->bits.	2022-04-01 23:04:24 +02:00
Artem Boldariev	3edf7a9fe7	Implement shim for SSL_CTX_set1_cert_store() (affects Debian 9) This commit implements a shim for SSL_CTX_set1_cert_store() for OpenSSL/LibreSSL versions where it is not available.	2022-04-01 16:33:43 +03:00
Ondřej Surý	b05a991ad0	Make isc_ht optionally case insensitive Previously, the isc_ht API would always take the key as a literal input to the hashing function. Change the isc_ht_init() function to take an 'options' argument, in which ISC_HT_CASE_SENSITIVE or _INSENSITIVE can be specified, to determine whether to use case-sensitive hashing in isc_hash32() when hashing the key.	2022-03-28 15:02:18 -07:00
Evan Hunt	e9ef3defa4	consolidate fibonacci hashing in one place Fibonacci hashing was implemented in four separate places (rbt.c, rbtdb.c, resolver.c, zone.c). This commit combines them into a single implementation. The hash_32() function is now replaced with isc_hash_bits32().	2022-03-28 14:44:21 -07:00
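For readers unfamiliar with the technique, Fibonacci hashing multiplies the value by a golden-ratio-derived constant and keeps the top bits. The sketch below is illustrative only: `sketch_hash_bits32()` is a hypothetical name, and the multiplier is the value the Linux kernel calls GOLDEN_RATIO_32, assumed here rather than taken from this commit.

```c
#include <stdint.h>

/* Golden-ratio-derived 32-bit multiplier (assumed for illustration). */
#define SKETCH_GOLDEN_RATIO_32 0x61C88647u

/*
 * Map a 32-bit value to a bucket index that is 'bits' wide.
 * Requires 1 <= bits <= 32 so the shift count stays in range.
 */
static uint32_t
sketch_hash_bits32(uint32_t val, unsigned int bits) {
	/* The multiplication wraps modulo 2^32; the top bits are kept. */
	return (val * SKETCH_GOLDEN_RATIO_32) >> (32 - bits);
}
```

Taking the high bits rather than the low bits is the point of the method: the multiplication mixes every input bit into the top of the product.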
Ondřej Surý	4dceab142d	Consistently use UNREACHABLE() instead of ISC_UNREACHABLE() In a couple of places, we have missed INSIST(0) or ISC_UNREACHABLE() replacement on some branches with UNREACHABLE(). Replace all ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().	2022-03-28 23:26:08 +02:00
Artem Boldariev	783663db80	Add ISC_R_TLSBADPEERCERT error code to the TLS related code This commit adds support for ISC_R_TLSBADPEERCERT error code, which is supposed to be used to signal for TLS peer certificates verification in dig and other code. The support for this error code is added to our TLS and TLS DNS implementations. This commit also adds isc_nm_verify_tls_peer_result_string() function which is supposed to be used to get a textual description of the reason for getting a ISC_R_TLSBADPEERCERT error.	2022-03-28 15:32:30 +03:00
Artem Boldariev	71cf8fa5ac	Extend TLS context cache with CA certificates store This commit adds support for keeping CA certificates stores associated with TLS contexts. The intention is to keep one reusable store per a set of related TLS contexts.	2022-03-28 15:31:22 +03:00
Artem Boldariev	c49a81e27d	Add foundational functions to implement Strict/Mutual TLS This commit adds a set of functions that can be used to implement Strict and Mutual TLS: * isc_tlsctx_load_client_ca_names(); * isc_tlsctx_load_certificate(); * isc_tls_verify_peer_result_string(); * isc_tlsctx_enable_peer_verification().	2022-03-28 15:31:22 +03:00
Artem Boldariev	32783d36c2	Add utility functions to manipulate X509 certificate stores This commit adds a set of high-level utility functions to manipulate the certificate stores. The stores are needed to implement TLS certificates verification efficiently.	2022-03-28 15:31:22 +03:00
Ondřej Surý	9de10cd153	Remove extrahandle size from netmgr Previously, it was possible to assign a bit of memory space in the nmhandle to store the client data. This was complicated and prevents further refactoring of isc_nmhandle_t caching (future work). Instead of caching the data in the nmhandle, allocate the hot-path ns_client_t objects from per-thread clientmgr memory context and just assign it to the isc_nmhandle_t via isc_nmhandle_set().	2022-03-25 10:38:35 +01:00
Ondřej Surý	20f0936cf2	Remove use of the inline keyword used as suggestion to compiler Historically, the inline keyword was a strong suggestion to the compiler that it should inline the function marked inline. As compilers became better at optimising, this functionality has receded, and using inline as a suggestion to inline a function is obsolete. The compiler will happily ignore it and inline something else entirely if it finds that's a better optimisation. Therefore, remove all the occurrences of the inline keyword with static functions inside single compilation unit and leave the decision whether to inline a function or not entirely to the compiler. NOTE: We keep the usage of the inline keyword when the purpose is to change the linkage behaviour.	2022-03-25 08:33:43 +01:00
Ondřej Surý	04d0b70ba2	Replace ISC_NORETURN with C11's noreturn C11 has builtin support for _Noreturn function specifier with convenience noreturn macro defined in <stdnoreturn.h> header. Replace ISC_NORETURN macro by C11 noreturn with fallback to __attribute__((noreturn)) if the C11 support is not complete.	2022-03-25 08:33:43 +01:00
Ondřej Surý	584f0d7a7e	Simplify way we tag unreachable code with only ISC_UNREACHABLE() Previously, the unreachable code paths would have to be tagged with: INSIST(0); ISC_UNREACHABLE(); There was also older parts of the code that used comment annotation: /* NOTREACHED */ Unify the handling of unreachable code paths to just use: UNREACHABLE(); The UNREACHABLE() macro now asserts when reached and also uses __builtin_unreachable(); when such builtin is available in the compiler.	2022-03-25 08:33:43 +01:00
Ondřej Surý	fe7ce629f4	Add FALLTHROUGH macro for __attribute__((fallthrough)) Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which is explicit version of the /* FALLTHROUGH */ comment we are currently using. Add and apply FALLTHROUGH macro that uses the attribute if available, but does nothing on older compilers. In one case (lib/dns/zone.c), using the macro revealed that we were using the /* FALLTHROUGH */ comment in wrong place, remove that comment.	2022-03-25 08:33:43 +01:00
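A macro of the kind described in the commit above might look like the following sketch. `SKETCH_FALLTHROUGH` and `sketch_count_levels()` are hypothetical names chosen for illustration; the actual macro definition in the tree may differ.

```c
/* Use the attribute when the compiler reports support for it. */
#if defined(__has_attribute)
#if __has_attribute(fallthrough)
#define SKETCH_FALLTHROUGH __attribute__((fallthrough))
#endif
#endif

/* Older compilers: expand to a harmless no-op expression. */
#ifndef SKETCH_FALLTHROUGH
#define SKETCH_FALLTHROUGH ((void)0)
#endif

/* Count how many switch arms run for a given level. */
static int
sketch_count_levels(int level) {
	int n = 0;

	switch (level) {
	case 2:
		n++;
		SKETCH_FALLTHROUGH;
	case 1:
		n++;
		SKETCH_FALLTHROUGH;
	default:
		n++;
		break;
	}
	return n;
}
```

Unlike a comment, the attribute is checked by the compiler, so a misplaced annotation produces a diagnostic instead of being silently ignored.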
Ondřej Surý	d70daa29f7	Make netmgr the authority on number of threads running Instead of passing the "workers" variable back and forth along with passing the single isc_nm_t instance, add isc_nm_getnworkers() function that returns the number of netmgr threads that are running. Change the ns_interfacemgr and ns_taskmgr to utilize the newly acquired knowledge.	2022-03-18 21:53:28 +01:00
Ondřej Surý	ff22498849	Add couple missing braces around single-line statements The clang-format-15 has new option InsertBraces that could add missing braces around single line statements. Use that to our advantage without switching to not-yet-released LLVM version to add missing braces in couple of places.	2022-03-17 18:27:45 +01:00
Ondřej Surý	cd52953f8a	Update the isc_ht unit test to also test rehashing As incremental rehashing has been added to isc_ht implementation, we need to test whether the rehashing works. Update the isc_ht unit test to test: * preinitialized hash table large enough to hold all the elements * smallest hash table that fully grows to hold all the elements * partially preinitialized hash table that grows * iterating while rehashing is in progress	2022-03-17 08:16:24 +01:00
Ondřej Surý	e42cb1f198	Implement incremental hash table resizing in isc_ht Previously, an incremental hash table resizing was implemented for the dns_rbt_t hash table implementation. Using that as a base, also implement the incremental hash table resizing also for isc_ht API hashtables: 1. During the resize, allocate the new hash table, but keep the old table unchanged. 2. In each lookup, delete, or iterator operation, check both tables. 3. Perform insertion operations only in the new table. 4. At each insertion also move <r> elements from the old table to the new table. 5. When all elements are removed from the old table, deallocate it. To ensure that the old table is completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (<r> + 1)/<r> during resizing. In our implementation <r> is equal to 1. The downside of this approach is that the old table and the new table could stay in memory for longer when there are no new insertions into the hash table for prolonged periods of time as the incremental rehashing happens only during the insertions.	2022-03-17 08:16:24 +01:00
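The five steps in the commit above can be sketched with a small chained hash table. This is a simplified illustration under stated assumptions, not the isc_ht code: all `sketch_` names are hypothetical, keys are assumed unique, the hash is a plain modulo, and one whole old bucket (rather than one element) is migrated per insertion.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct sketch_node {
	uint32_t key;
	int value;
	struct sketch_node *next;
} sketch_node_t;

typedef struct {
	sketch_node_t **table[2]; /* table[cur] is new, the other is old */
	size_t size[2];           /* bucket counts */
	size_t count;             /* stored elements */
	size_t hiter;             /* next old bucket to migrate */
	int cur;
} sketch_ht_t;

static sketch_node_t **
sketch_buckets(size_t n) {
	sketch_node_t **b = calloc(n, sizeof(*b));
	if (b == NULL) {
		abort();
	}
	return b;
}

static void
sketch_ht_init(sketch_ht_t *ht, size_t size) {
	ht->table[0] = sketch_buckets(size);
	ht->table[1] = NULL;
	ht->size[0] = size;
	ht->size[1] = 0;
	ht->count = 0;
	ht->hiter = 0;
	ht->cur = 0;
}

/* Step 4: move one non-empty old bucket; step 5: free the old table. */
static void
sketch_migrate_one(sketch_ht_t *ht) {
	int old = 1 - ht->cur;

	while (ht->hiter < ht->size[old] && ht->table[old][ht->hiter] == NULL) {
		ht->hiter++;
	}
	if (ht->hiter < ht->size[old]) {
		sketch_node_t *n = ht->table[old][ht->hiter];
		while (n != NULL) {
			sketch_node_t *next = n->next;
			size_t b = n->key % ht->size[ht->cur];
			n->next = ht->table[ht->cur][b];
			ht->table[ht->cur][b] = n;
			n = next;
		}
		ht->table[old][ht->hiter++] = NULL;
	}
	if (ht->hiter == ht->size[old]) {
		free(ht->table[old]);
		ht->table[old] = NULL;
		ht->size[old] = 0;
	}
}

static void
sketch_ht_add(sketch_ht_t *ht, uint32_t key, int value) {
	if (ht->table[1 - ht->cur] != NULL) {
		sketch_migrate_one(ht);
	} else if (ht->count >= ht->size[ht->cur]) {
		/* Step 1: allocate a table twice as large, keep the old one. */
		int fresh = 1 - ht->cur;
		ht->size[fresh] = ht->size[ht->cur] * 2;
		ht->table[fresh] = sketch_buckets(ht->size[fresh]);
		ht->cur = fresh;
		ht->hiter = 0;
	}
	/* Step 3: insert only into the new table. */
	sketch_node_t *n = malloc(sizeof(*n));
	if (n == NULL) {
		abort();
	}
	size_t b = key % ht->size[ht->cur];
	n->key = key;
	n->value = value;
	n->next = ht->table[ht->cur][b];
	ht->table[ht->cur][b] = n;
	ht->count++;
}

/* Step 2: lookups check both tables while a resize is in progress. */
static sketch_node_t *
sketch_ht_find(const sketch_ht_t *ht, uint32_t key) {
	for (int i = 0; i < 2; i++) {
		if (ht->table[i] == NULL) {
			continue;
		}
		sketch_node_t *n = ht->table[i][key % ht->size[i]];
		for (; n != NULL; n = n->next) {
			if (n->key == key) {
				return n;
			}
		}
	}
	return NULL;
}
```

In this sketch a new resize never starts while the previous one is still migrating; chains simply grow a little until the old table has been emptied.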
Ondřej Surý	bfa4b9c141	Run .closehandle_cb asynchronously in nmhandle_detach_cb() When sock->closehandle_cb is set, we need to run nmhandle_detach_cb() asynchronously to ensure correct order of multiple packets processing in the isc__nm_process_sock_buffer(). When not run asynchronously, it would cause: a) out-of-order processing of the return codes from processbuffer(); b) stack growth because the next TCP DNS message read callback will be called from within the current TCP DNS message read callback. The sock->closehandle_cb is set to isc__nm_resume_processing() for TCP sockets which calls isc__nm_process_sock_buffer(). If the read callback (called from isc__nm_process_sock_buffer()->processbuffer()) doesn't attach to the nmhandle (f.e. because it wants to drop the processing or we send the response directly via uv_try_write()), the isc__nm_resume_processing() (via .closehandle_cb) would call isc__nm_process_sock_buffer() recursively. The below shortened code path shows how the stack can grow: 1: ns__client_request(handle, ...); 2: isc_nm_tcpdns_sequential(handle); 3: ns_query_start(client, handle); 4: query_lookup(qctx); 5: query_send(qctx->client); 6: isc__nmhandle_detach(&client->reqhandle); 7: nmhandle_detach_cb(&handle); 8: sock->closehandle_cb(sock); // isc__nm_resume_processing 9: isc__nm_process_sock_buffer(sock); 10: processbuffer(sock); // isc__nm_tcpdns_processbuffer 11: isc_nmhandle_attach(req->handle, &handle); 12: isc__nm_readcb(sock, req, ISC_R_SUCCESS); 13: isc__nm_async_readcb(NULL, ...); 14: uvreq->cb.recv(...); // ns__client_request Instead, if 'sock->closehandle_cb' is set, we need to detach the handle asynchronously in 'isc__nmhandle_detach', so that line 8 in the code flow above does not start this recursion. This ensures the correct order when processing multiple packets in the function 'isc__nm_process_sock_buffer()' and prevents the stack growth.
When not run asynchronously, the out-of-order processing leaves the first TCP socket open until all requests on the stream have been processed. If the pipelining is disabled on the TCP via `keep-response-order` configuration option, named would keep the first socket in lingering CLOSE_WAIT state when the client sends an incomplete packet and then closes the connection from the client side.	2022-03-16 22:11:49 +01:00
Ondřej Surý	79b5ccbf34	Implement isc_interval_t on top of isc_time_t Change the isc_interval_t implementation from separate data type and separate implementation to be a shim implementation on top of isc_time_t. The distinction between isc_interval_t and isc_time_t has been kept because they are semantically different - isc_interval_t is relative and isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to be freely interchangeable, f.e. this: isc_time_t t1; isc_interval_t interval; isc_time_t t2; isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2)); isc_time_subtract(t1, interval, t2); isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2)); to just: isc_time_t t1; isc_interval_t interval; isc_time_t t2; isc_time_subtract(t1, t2, interval); without introducing a whole set of new functions.	2022-03-14 13:00:05 -07:00
Ondřej Surý	e6ca2a651f	Refactor isc_timer_reset() use with semantic patch Add and apply semantic patch to remove expires argument from the isc_timer_reset() calls through the codebase.	2022-03-14 13:00:05 -07:00
Ondřej Surý	6437bcc488	Remove expires argument from isc_timer API The isc_timer_reset() now works only with intervals for once timers. This makes the API almost 1:1 compatible with the libuv timers making the further refactoring possible.	2022-03-14 13:00:05 -07:00
Ondřej Surý	27850a5ad2	Change isc_timer_reset() usage to never use expires argument There were two places where expires argument (absolute isc_time_t value) was being used. Both places have been converted to use relative interval argument in preparation of simplification and refactoring of isc_timer API.	2022-03-14 13:00:05 -07:00
Ondřej Surý	c259cecc90	Refactor isc_timer_create() to just create timer The isc_timer_create() function was a bit conflated. It could have been used to create a timer and start it at the same time. As there was a single place where this was done before (see the previous commit for nta.c), this was cleaned up and the isc_timer_create() function was changed to only create new timer.	2022-03-14 13:00:05 -07:00
Ondřej Surý	8fbb42c49c	Remove "a temporary hack, 'rndc timerpoke'" In 2002, "a temporary hack, 'rndc timerpoke'" was added. It's time for it to go, so it was removed.	2022-03-14 13:00:05 -07:00
Ondřej Surý	f4751a91f7	Remove unused isc_timer_touch() function The isc_timer_touch() was unused, just remove it.	2022-03-14 13:00:05 -07:00
Ondřej Surý	bbe1c06a8b	Remove isc_timertype_limited from isc_timer API The isc_timertype_limited timer type was never used (not even in tests). Remove isc_timertype_limited timer type before planned refactoring.	2022-03-14 13:00:05 -07:00
Ondřej Surý	49c804f8b7	Cleanup the nmhandle attach/detach in httpd.c In httpd.c, the send callback can directly call read callback without calling isc_nm_resumeread(). When per-send timeout was added, this could lead to use-after-free when shutting down the named. Cleanup the way how we attach to .readhandle and .sendhandle, so there's assurance that .readhandle will be always non-NULL when reading and .sendhandle will be always non-NULL when sending. Additionally, it was found that the implementation ignored the "Connection: close" header and it worked only accidentally by closing the connection after the first read from the TCP socket. This has been also fixed.	2022-03-11 09:57:10 +01:00
Ondřej Surý	6ddac2d56d	On shutdown, reset the established TCP connections Previously, the established TCP connections (both client and server) would be gracefully closed waiting for the write timeout. Don't wait for TCP connections to gracefully shutdown, but directly reset them for faster shutdown.	2022-03-11 09:56:57 +01:00
Ondřej Surý	a761aa59e3	Change single write timer to per-send timers Previously, there was a single per-socket write timer that would get restarted for every new write. This turned out to be insufficient because the other side could keep resetting the timer, and never reading back the responses. Change the single write timer to per-send timer which would in turn reset the TCP connection on the first send timeout.	2022-03-11 09:56:57 +01:00
Ondřej Surý	f251d69eba	Remove usage of deprecated ATOMIC_VAR_INIT() macro The C17 standard deprecated ATOMIC_VAR_INIT() macro (see [1]). Follow suit and remove the ATOMIC_VAR_INIT() usage in favor of simple assignment of the value as this is what all supported stdatomic.h implementations do anyway: * MacOSX.platform: #define ATOMIC_VAR_INIT(__v) {__v} * Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE) (VALUE) 1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf	2022-03-08 23:55:10 +01:00
Ondřej Surý	8fa27365ec	Make isc_ht_init() and isc_ht_iter_create() return void Previously, the function(s) in the commit subject could fail for various reasons - mostly allocation failures, or other functions returning different return code than ISC_R_SUCCESS. Now, the aforementioned function(s) cannot ever fail and they would always return ISC_R_SUCCESS. Change the function(s) to return void and remove the extra checks in the code that uses them.	2022-03-08 14:51:55 +01:00
Ondřej Surý	bbb4cdb92d	Make isc_heap_create() and isc_heap_insert() return void Previously, the function(s) in the commit subject could fail for various reasons - mostly allocation failures, or other functions returning different return code than ISC_R_SUCCESS. Now, the aforementioned function(s) cannot ever fail and they would always return ISC_R_SUCCESS. Change the function(s) to return void and remove the extra checks in the code that uses them.	2022-03-08 11:19:34 +01:00
Ondřej Surý	8098a58581	Set TCP maximum segment size to minimum size of 1220 Previously the socket code would set the TCPv6 maximum segment size to minimum value to prevent IP fragmentation for TCP. This was not yet implemented for the network manager. Implement network manager functions to set and use minimum MTU socket option and set the TCP_MAXSEG socket option for both IPv4 and IPv6 and use those to clamp the TCP maximum segment size for TCP, TCPDNS and TLSDNS layers in the network manager to 1220 bytes, that is 1280 (IPv6 minimum link MTU) minus 40 (IPv6 fixed header) minus 20 (TCP fixed header). We already rely on a similar value for UDP to prevent IP fragmentation and it makes sense to use the same value for IPv4 and IPv6 because the modern networks are required to support IPv6 packet sizes. If there's need for small TCP segment values, the MTU on the interfaces needs to be properly configured.	2022-03-08 10:27:05 +01:00
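The arithmetic in the commit above, and the socket option it refers to, can be sketched with plain POSIX calls. This is an illustration only: the `SKETCH_` constants and `sketch_clamp_mss()` are hypothetical names, and the real change goes through the network manager rather than calling setsockopt() like this.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

enum {
	SKETCH_IPV6_MIN_MTU = 1280, /* IPv6 minimum link MTU */
	SKETCH_IPV6_HEADER = 40,    /* IPv6 fixed header */
	SKETCH_TCP_HEADER = 20,     /* TCP fixed header */
	SKETCH_TCP_MSS = SKETCH_IPV6_MIN_MTU - SKETCH_IPV6_HEADER -
			 SKETCH_TCP_HEADER /* 1220 bytes */
};

/*
 * Clamp the maximum segment size on a TCP socket.
 * Returns 0 on success and -1 on failure (errno is set by setsockopt()).
 */
static int
sketch_clamp_mss(int fd) {
	int mss = SKETCH_TCP_MSS;

	return setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));
}
```

A caller would typically apply this right after creating the socket and treat a failure as non-fatal, since the option is a tuning hint rather than a correctness requirement.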
Ondřej Surý	5d34a14f22	Set minimum MTU (1280) on IPv6 sockets The IPV6_USE_MIN_MTU socket option directs the IP layer to limit the IPv6 packet size to the minimum required supported MTU from the base IPv6 specification, i.e. 1280 bytes. Many implementations of TCP running over IPv6 neglect to check the IPV6_USE_MIN_MTU value when performing MSS negotiation and when constructing a TCP segment despite MSS being defined to be the MTU less the IP and TCP header sizes (60 bytes for IPv6). This leads to oversized IPv6 packets being sent resulting in unintended Path Maximum Transmission Unit Discovery (PMTUD) being performed and to fragmented IPv6 packets being sent. Add and use a function to set socket option to limit the MTU on IPv6 sockets to the minimum MTU (1280) both for UDP and TCP.	2022-03-08 10:27:05 +01:00
Ondřej Surý	6bd025942c	Replace netievent lock-free queue with simple locked queue The current implementation of isc_queue uses Michael-Scott lock-free queue that in turn uses hazard pointers. It was discovered that the way we use the isc_queue, such complicated mechanism isn't really needed, because most of the time, we either execute the work directly when on nmthread (in case of UDP) or schedule the work from the matching nmthreads. Replace the current implementation of the isc_queue with a simple locked ISC_LIST. There's a slight improvement - since copying the whole list is very lightweight - we move the queue into a new list before we start the processing and locking just for moving the queue and not for every single item on the list. NOTE: There's a room for future improvements - since we don't guarantee the order in which the netievents are processed, we could have two lists - one unlocked that would be used when scheduling the work from the matching thread and one locked that would be used from non-matching thread.	2022-03-04 13:49:51 +01:00
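The "move the queue into a new list before processing" idea from the commit above can be sketched as follows. This is a minimal illustration using a pthread mutex and a singly linked list; all `sketch_` names are hypothetical and the real code uses ISC_LIST and the netmgr types.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct sketch_item {
	int value;
	struct sketch_item *next;
} sketch_item_t;

typedef struct {
	pthread_mutex_t lock;
	sketch_item_t *head;
	sketch_item_t *tail;
} sketch_queue_t;

static void
sketch_queue_init(sketch_queue_t *q) {
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->tail = NULL;
}

/* Append one item; the lock is held only for the pointer updates. */
static void
sketch_queue_enqueue(sketch_queue_t *q, sketch_item_t *item) {
	item->next = NULL;
	pthread_mutex_lock(&q->lock);
	if (q->tail != NULL) {
		q->tail->next = item;
	} else {
		q->head = item;
	}
	q->tail = item;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Detach the whole list in one locked step and return its head.
 * The caller then walks the returned list without holding the lock.
 */
static sketch_item_t *
sketch_queue_take_all(sketch_queue_t *q) {
	sketch_item_t *list;

	pthread_mutex_lock(&q->lock);
	list = q->head;
	q->head = NULL;
	q->tail = NULL;
	pthread_mutex_unlock(&q->lock);
	return list;
}
```

Because detaching is a constant-time pointer swap, producers are blocked only briefly even when the consumer has many items to process.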
Aram Sargsyan	ef0d7177b6	Remove EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() shims LibreSSL 3.5.0 fails to compile with these shims. We could have just removed the LibreSSL check from the pre-processor condition, but it seems that these shims are no longer needed because all the supported versions of OpenSSL and LibreSSL have those functions. According to EVP_ENCRYPTINIT(3) manual page in LibreSSL, EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() first appeared in OpenSSL 0.9.8b, and have been available since OpenBSD 4.5.	2022-03-02 10:48:09 +00:00
Mark Andrews	4c356d2770	Grow the lex token buffer in one more place when parsing key pairs, if the '=' character fell at max_token a protective INSIST preventing buffer overrun could be triggered. Attempt to grow the buffer immediately before the INSIST. Also removed an unnecessary INSIST on the opening double quote of key buffer pair.	2022-03-01 16:05:39 -08:00
Ondřej Surý	b220fb32bd	Handle TCP sockets in isc__nmsocket_reset() The isc__nmsocket_reset() was missing a case for raw TCP sockets (used by RNDC and DoH) which would cause an assertion failure when the write timeout was triggered. TCP sockets are now also properly handled in isc__nmsocket_reset().	2022-02-28 02:06:03 -08:00
Ondřej Surý	ecf042991c	Fix typo __SANITIZE_ADDRESS -> __SANITIZE_ADDRESS__ When checking for Address Sanitizer to disable the inactivehandles caching, there was a typo in the macro.	2022-02-24 00:15:16 +01:00
Ondřej Surý	be339b3c83	Disable inactive uvreqs caching when compiled with sanitizers When isc__nm_uvreq_t gets deactivated, it could be just put onto array stack to be reused later to save some initialization time. Unfortunately, this might hide some use-after-free errors. Disable the inactive uvreqs caching when compiled with Address or Thread Sanitizer.	2022-02-24 00:15:16 +01:00
Ondřej Surý	92cce1da65	Disable inactive handles caching when compiled with sanitizers When isc_nmhandle_t gets deactivated, it could be just put onto array stack to be reused later to save some initialization time. Unfortunately, this might hide some use-after-free errors. Disable the inactive handles caching when compiled with Address or Thread Sanitizer.	2022-02-23 23:21:29 +01:00
Ondřej Surý	e2555a306f	Remove active handles tracking from isc__nmsocket_t The isc__nmsocket_t has a locked array of isc_nmhandle_t that's not used for anything. The isc__nmhandle_get() adds the isc_nmhandle_t to the locked array (resizing it if necessary) and the handle is removed when isc_nmhandle_put() finally destroys it. That's all it does, so it serves no useful purpose. Remove the .ah_handles, .ah_size, and .ah_frees members of the isc__nmsocket_t and .ah_pos member of the isc_nmhandle_t struct.	2022-02-23 22:54:47 +01:00
Ondřej Surý	3268627916	Delay isc__nm_uvreq_t deallocation to connection callback When the TCP, TCPDNS or TLSDNS connection times out, the isc__nm_uvreq_t would be pushed into sock->inactivereqs before the uv_tcp_connect() callback finishes. Because the isc__nmsocket_t keeps the list of inactive isc__nm_uvreq_t, this would cause use-after-free only when the sock->inactivereqs is full (which could never happen because the failure happens in connection timeout callback) or when the sock->inactivereqs mechanism is completely removed (f.e. when running under Address or Thread Sanitizer). Delay isc__nm_uvreq_t deallocation to the connection callback and only signal the connection callback should be called by shutting down the libuv socket from the connection timeout callback.	2022-02-23 22:54:47 +01:00
Ondřej Surý	88418c3372	Properly free up enqueued netievents in nm_destroy() When the isc_netmgr is being destroyed, the normal and priority queues should be dequeued and netievents properly freed. This wasn't the case.	2022-02-23 22:51:12 +01:00
Ondřej Surý	d01562f22b	Remove the keep-response-order ACL map The keep-response-order option has been obsoleted, and in this commit, remove the keep-response-order ACL map rendering the option no-op, the call to isc_nm_sequential() and the now unused isc_nm_sequential() function itself.	2022-02-18 09:16:03 +01:00
Ondřej Surý	4f5b4662b6	Remove the limit on the number of simultaneous TCP queries There was an artificial limit of 23 on the number of simultaneous pipelined queries in the single TCP connection. The new network manager is capable of handling "unlimited" (limited only by the TCP read buffer size) queries similar to "unlimited" handling of the DNS queries received over UDP. Don't limit the number of TCP queries that we can process within a single TCP read callback.	2022-02-17 16:19:12 -08:00
Ondřej Surý	3c7b04d015	Add network manager based timer API This commit adds an API that allows creating arbitrary timers associated with the network manager handles.	2022-02-17 21:38:17 +01:00
Ondřej Surý	4716c56ebb	Reset the TCP connection when garbage is received When an invalid DNS message is received, there was a handling mechanism for DoH that would be called to return a proper HTTP response. Reuse this mechanism and reset the TCP connection when the client is blackholed, the DNS message is completely bogus or the ns_client receives a response instead of a query.	2022-02-17 20:39:55 +01:00
Ondřej Surý	ee359d6ffa	Update writetimeout to be T_IDLE in netmgr_test.c Use the isc_nmhandle_setwritetimeout() function in the netmgr unit test to allow more time for writing and reading the responses because some of the intervals that are used in the unit tests are really small, leaving little room for any delays.	2022-02-17 09:06:58 +01:00
Ondřej Surý	a89d9e0fa6	Add isc_nmhandle_setwritetimeout() function In some situations (unit test and forthcoming XFR timeouts MR), we need to modify the write timeout independently of the read timeout. Add an isc_nmhandle_setwritetimeout() function that could be called before isc_nm_send() to specify a custom write timeout interval.	2022-02-17 09:06:58 +01:00
Ondřej Surý	408b362169	Add TCP, TCPDNS and TLSDNS write timer When the outgoing TCP write buffers are full because the other party is not reading the data, the uv_write() could wait indefinitely on the uv_loop and never call the callback. Add a new write timer that uses the `tcp-idle-timeout` value to interrupt the TCP connection when we are not able to send data for a defined period of time.	2022-02-17 09:06:58 +01:00
Ondřej Surý	cd3b58622c	Add uv_tcp_close_reset compat The uv_tcp_close_reset() function was added in libuv 1.32.0 and since we support older libuv releases, we have to add a shim uv_tcp_close_reset() implementation loosely based on libuv.	2022-02-17 09:06:58 +01:00
Ondřej Surý	45a73c113f	Rename sock->timer to sock->read_timer Before adding the write timer, we have to rename the generic sock->timer to sock->read_timer. We don't touch the function names to limit the impact of the refactoring.	2022-02-17 09:06:58 +01:00
Ondřej Surý	8715be1e4b	Use UV_RUNTIME_CHECK() as appropriate Replace the RUNTIME_CHECK() calls for libuv API calls with UV_RUNTIME_CHECK() to get more detailed error message when something fails and should not.	2022-02-16 11:16:57 +01:00
Ondřej Surý	62e15bb06d	Add UV_RUNTIME_CHECK() macro to print uv_strerror() When libuv functions fail, they return an error code that could be useful for more detailed debugging. Currently, we usually just check whether the return value is 0 and invoke an assertion error if it isn't, throwing away the details of why the call has failed. Unfortunately, this often happens on more exotic platforms. Add a UV_RUNTIME_CHECK() macro that can be used to print a more detailed error message (via uv_strerror()) before ending the execution of the program abruptly with the assertion.	2022-02-16 11:16:57 +01:00
Ondřej Surý	b9cb29076f	Log when starting and ending task exclusive mode The task exclusive mode stops all processing (tasks and networking IO) except the designated exclusive task events. This has impact on the operation of the server. Add log messages indicating when we start the exclusive mode, and when we end exclusive task mode.	2022-02-10 21:09:06 +01:00
Ondřej Surý	0893b5fb79	Assert if statistics counter underflows in the developer mode There are reported occurrences where the statistics counters underflow and start reporting nonsense. Add a check for the underflow, when ``named`` is compiled in the developer mode.	2022-02-10 17:18:09 +01:00
Ondřej Surý	0500345513	Remove unused functions from isc_thread API The isc_thread_setaffinity call was removed in !5265 and we are not going to restore it because it was proven that the performance is better without it. Additionally, remove the already disabled cpu system test. The isc_thread_setconcurrency function is unused and also calling pthread_setconcurrency() on Linux has no meaning, formerly it was added because of Solaris in 2001 and it was removed when taskmgr was refactored to run on top of netmgr in !4918.	2022-02-09 17:22:06 +01:00
Ondřej Surý	2ae84702ad	Add log message when hard quota is reached in TCP accept When isc_quota_attach_cb() API returns ISC_R_QUOTA (meaning hard quota was reached) the accept_connection() would return without logging a message about quota reached. Change the connection callback to log the quota reached message.	2022-02-01 21:00:05 +01:00
Evan Hunt	d3fed6f400	update dlz_minimal.h the addition of support for ECS client information in DLZ modules omitted some necessary changes to build modules in contrib.	2022-01-27 15:48:50 -08:00
Petr Menšík	f00f521e9c	Use detected cache line size IBM power architecture has L1 cache line size equal to 128. Take advantage of that on that architecture, do not force more common value of 64. When it is possible to detect higher value, use that value instead. Keep the default to be 64.	2022-01-27 13:02:23 +01:00
Aram Sargsyan	81d3584116	Set the ephemeral certificate's "not before" a short time in the past TLS clients can have their clock a short time in the past which will result in not being able to validate the certificate. Setting the "not before" property 5 minutes in the past will accommodate some possible clock skew across systems.	2022-01-25 09:09:35 +00:00
Ondřej Surý	b28327354d	Ignore the invalid L1 cache line size returned by sysconf() On some systems, the glibc can return 0 instead of cache-line size to indicate the cache line sizes cannot be determined. This is the comment from the glibc source code: /* In general we cannot determine these values. Therefore we return zero which indicates that no information is available. */ As the goal of the check is to determine whether the L1 cache line size is still 64 and we would use this value in case the sysconf() call is not available, we can also ignore the invalid values returned by the sysconf() call.	2022-01-22 16:59:50 +01:00
Ondřej Surý	b5e086257d	Explicitly enable IPV6_V6ONLY on the netmgr sockets Some operating systems (OpenBSD and DragonFly BSD) don't restrict the IPv6 sockets to sending and receiving IPv6 packets only. Explicitly enable the IPV6_V6ONLY socket option on the IPv6 sockets to prevent failures from using the IPv4-mapped IPv6 address.	2022-01-17 22:16:27 +01:00
Evan Hunt	be0bc24c7f	add UV_ENOTSUP to isc___nm_uverr2result() This error code is now mapped to ISC_R_FAMILYNOSUPPORT.	2022-01-17 11:45:10 +01:00
Artem Boldariev	ca9fe3559a	DoH: ensure that server_send_error_response() is used properly The server_send_error_response() function is supposed to be used only in case of failures and never in case of legitimate requests. Ensure that ISC_HTTP_ERROR_SUCCESS is never passed there by mistake.	2022-01-14 16:00:42 +02:00
Artem Boldariev	a38b4945c1	DoH: add bad HTTP/2 requests logging Add some error logging when facing bad requests over HTTP/2. Log the address and the error description.	2022-01-14 16:00:42 +02:00
Ondřej Surý	0a4e91ee47	Revert "Always enqueue isc__nm_tcp_resumeread()" The commit itself is harmless, but at the same time it is also useless, so we are reverting it. This reverts commit `11c869a3d5`.	2022-01-13 19:06:39 +01:00
Ondřej Surý	7370725008	Fix the UDP recvmmsg support Previously, the netmgr/udp.c tried to detect recvmmsg support in libuv with #ifdef UV_UDP_<foo> preprocessor macros. However, because the UV_UDP_<foo> are not preprocessor macros, but enum members, the detection didn't work. Because the detection didn't work, the code didn't have access to the information when we received the final chunk of the recvmmsg and tried to free the uvbuf every time. Fortunately, the isc__nm_free_uvbuf() had a kludge that detected an attempt to free in the middle of the receive buffer, so the code worked. However, libuv 1.37.0 changed the way the recvmmsg was enabled from implicit to explicit, and we checked for yet another enum member presence with a preprocessor macro, so in fact libuv recvmmsg support was never enabled with libuv >= 1.37.0. This commit changes the preprocessor macros to autoconf checks for declaration, so the detection now works again. On top of that, it's now possible to clean up the alloc_cb and free_uvbuf functions because now, the information whether we can or cannot free the buffer is available to us.	2022-01-13 19:06:39 +01:00
Aram Sargsyan	6f457c5121	Generate a random serial number for 'tls ephemeral' certificates Clients can cache the TLS certificates and refuse to accept another one with the same serial number from the same issuer. Generate a random serial number for the self-signed certificates instead of using a fixed value.	2022-01-13 11:03:07 +00:00
Aram Sargsyan	0a19b5cd62	Use uncompressed point conversion form for 'tls ephemeral' ECC keys GnuTLS, NSS, and possibly other TLS libraries currently fail to work with compressed point conversion form supported by OpenSSL. Use uncompressed point conversion form for better compatibility.	2022-01-13 11:03:06 +00:00
Ondřej Surý	58bd26b6cf	Update the copyright information in all files in the repository This commit converts the license handling to adhere to the REUSE specification. It specifically: 1. Adds used licenses to LICENSES/ directory 2. Adds "isc" template for adding the copyright boilerplate 3. Changes all source files to include copyright and SPDX license header, this includes all the C sources, documentation, zone files, configuration files. There are notes in the doc/dev/copyrights file on how to add correct headers to the new files. 4. Handles the rest that can't be modified via .reuse/dep5 file. The binary (or otherwise unmodifiable) files could have licenses placed next to them in <foo>.license file, but this would lead to cluttered repository and most of the files handled in the .reuse/dep5 file are system test files.	2022-01-11 09:05:02 +01:00
Ondřej Surý	11c869a3d5	Always enqueue isc__nm_tcp_resumeread() The isc__nm_tcp_resumeread() was using maybe_enqueue function to enqueue netmgr event which could cause the read callback to be executed immediately if there was enough data waiting in the TCP queue. If such a thing happened, the read callback would be called before the previous read callback was finished and the worker receive buffer would be still marked "in use", causing an assertion failure. This would affect only raw TCP channels, e.g. rndc and http statistics.	2022-01-06 10:34:04 -08:00
Ondřej Surý	d026ddde82	Add unit test of aligned isc_mem functions Add unit test that checks whether all the aligned functions work and that allocators return memory aligned at the specified boundary.	2022-01-05 17:17:39 +01:00
Ondřej Surý	6269fce0fe	Use isc_mem_get_aligned() for isc_queue and cleanup max_threads The isc_queue_new() was using dirty tricks to allocate the head and tail members of the struct aligned to the cacheline. We can now use isc_mem_get_aligned() to allocate the structure to the cacheline directly. Use ISC_OS_CACHELINE_SIZE (64) instead of arbitrary ALIGNMENT (128), one cacheline size is enough to prevent false sharing. Cleanup the unused max_threads variable - there was actually no limit on the maximum number of threads. This was changed a while ago.	2022-01-05 17:10:58 +01:00
Ondřej Surý	c84eb55049	Reduce the memory used by hazard pointers The hazard pointers implementation was a bit frivolous with memory usage, allocating memory based on maximum constants rather than on the usage. Make the retired list use exactly the memory needed for the specified number of hazard pointers. This reduced the memory used by hazard pointers to one quarter in our specific case because we only use a single HP in the queue implementation (as opposed to allocating memory for HP_MAX_HPS = 4). Previously, the alignment to prevent false sharing was double the cacheline size. This was copied from the ConcurrencyFreaks implementation, but one cacheline size is enough to prevent false sharing, so we are using this now to save a few bits of memory. The top level hazard pointers and retired list arrays are now not aligned to the cacheline size - they are read-only for the whole life-time of the isc_hp object. Only hp (hazard pointer) and rl (retired list) array members are allocated aligned to the cacheline size to avoid false sharing between threads. Clean up HP_MAX_HPS and HP_THRESHOLD_R constants from the paper, because we don't use them in the code. HP_THRESHOLD_R was 0, so the check whether the retired list size was smaller than the value was basically dead code.	2022-01-05 17:10:58 +01:00
Ondřej Surý	c917a2ca88	Add isc_mem_*_aligned() function that works with aligned memory There are some situations where having aligned allocations would be useful, so we don't have to play tricks with padding the data to the cacheline sizes. Add isc_mem_{get,put,reget,putanddetach}_aligned() functions that have alignment and size as last argument mimicking the POSIX posix_memalign() functions on systems with jemalloc (see the documentation on MALLOCX_ALIGN() for more details). On systems without jemalloc, those functions are same as non-aligned variants.	2022-01-05 17:10:56 +01:00
Ondřej Surý	4f78f9d72a	Add #define ISC_OS_CACHELINE_SIZE 64 Add library ctor and dtor for isc_os compilation unit which initializes the numbers of the CPUs and also checks whether L1 cacheline size is really 64 if the sysconf() call is available.	2022-01-05 17:07:35 +01:00
Ondřej Surý	e705f213ca	Remove taskmgr->excl_lock, fix the locking for taskmgr->exiting While doing code review, it was found that the taskmgr->exiting is set under taskmgr->lock, but accessed under taskmgr->excl_lock in the isc_task_beginexclusive(). Additionally, before the change that moved running the tasks to the netmgr, the task_ready() subroutine of isc_task_detach() would lock mgr->lock, requiring the mgr->excl to be protected by mgr->excl_lock to prevent deadlock in the code. After !4918 has been merged, this is no longer true, and we can remove taskmgr->excl_lock and use taskmgr->lock in its stead. Solve both issues by removing the taskmgr->excl_lock and exclusively use taskmgr->lock to protect both taskmgr->excl and taskmgr->exiting which now doesn't need to be atomic_bool, because it's always accessed from within the locked section.	2022-01-05 16:44:57 +01:00
Ondřej Surý	f9d90159b8	On shutdown, return ISC_R_SHUTTINGDOWN from isc_taskmgr_excltask() The isc_taskmgr_excltask() would return ISC_R_NOTFOUND either when the exclusive task was not set (yet) or when the taskmgr is shutting down and the exclusive task has been already cleared. Distinguish between the two states and return ISC_R_SHUTTINGDOWN when the taskmgr is being shut down instead of ISC_R_NOTFOUND.	2022-01-05 13:41:12 +01:00
Evan Hunt	61c160c4a5	Clean up isc_tlsctx_cache_detach() For consistency with similar functions, rename `pcache` to `cachep`, call a separate destroy function when references reach 0, and add a missing call to isc_refcount_destroy().	2022-01-04 23:07:12 -08:00
Evan Hunt	f5074c0c8e	Ensure that cache pointer is set to NULL by isc_tlsctx_cache_detach() If the reference count was higher than 1, detaching a tlsctx cache didn't clear the pointer, which could trigger an assertion later.	2022-01-04 11:48:25 -08:00
Artem Boldariev	5b7d4341fe	Use the TLS context cache for server-side contexts Using the TLS context cache for server-side contexts could reduce the number of contexts to initialise in the configurations when e.g. the same 'tls' entry is used in multiple 'listen-on' statements for the same DNS transport, binding to multiple IP addresses. In such a case, only one TLS context will be created, instead of a context per IP address, which could reduce the initialisation time, as initialising even a non-ephemeral TLS context introduces some delay, which can be visually noticeable by log activity. Also, this change lays down a foundation for Mutual TLS (when the server validates a client certificate, additionally to a client validating the server), as the TLS context cache can be extended to store additional data required for validation (like intermediates CA chain). Additionally to the above, the change ensures that the contexts are not being changed after initialisation, as such a practice is frowned upon. Previously we would set the supported ALPN tags within isc_nm_listenhttp() and isc_nm_listentlsdns(). We do not do that for client-side contexts, so that appears to be an oversight. Now we set the supported ALPN tags right after server-side contexts creation, similarly to how we do for client-side ones.	2021-12-29 10:25:14 +02:00
Artem Boldariev	eb37d967c2	Add TLS context cache This commit adds a TLS context object cache implementation. The intention of having this object is manifold: - In the case of client-side contexts: allow reusing the previously created contexts to employ the context-specific TLS session resumption cache. That will enable XoT connection to be reestablished faster and with fewer resources by not going through the full TLS handshake procedure. - In the case of server-side contexts: reduce the number of contexts created on startup. That could reduce startup time in a case when there are many "listen-on" statements referring to a smaller number of `tls` statements, especially when "ephemeral" certificates are involved. - The long-term goal is to provide in-memory storage for additional data associated with the certificates, like runtime representation (X509_STORE) of intermediate CA-certificates bundle for Strict TLS/Mutual TLS ("ca-file").	2021-12-29 10:25:11 +02:00
Michał Kępień	ea89ab80ae	Fix error codes passed to connection callbacks Commit `9ee60e7a17` erroneously introduced duplicate conditions to several existing conditional statements responsible for determining error codes passed to connection callbacks upon failure. Fix the affected expressions to ensure connection callbacks are invoked with: - the ISC_R_SHUTTINGDOWN error code when a global netmgr shutdown is in progress, - the ISC_R_CANCELED error code when a specific operation has been canceled. This does not fix any known bugs, it only adjusts the changes introduced by commit `9ee60e7a17` so that they match its original intent.	2021-12-28 15:09:50 +01:00
Michał Kępień	7983d5fa7c	Check for SSL_CTX_set_keylog_callback() support The SSL_CTX_set_keylog_callback() function is a fairly recent OpenSSL addition, having first appeared in version 1.1.1. Add a configure.ac check for the availability of that function to prevent build errors on older platforms. Sort similar checks alphabetically. This makes the SSLKEYLOGFILE mechanism a silent no-op on unsupported platforms, which is considered acceptable for a debugging feature.	2021-12-22 18:17:26 +01:00
Michał Kępień	060fed3097	Log TLS pre-master secrets when requested Generate log messages containing TLS pre-master secrets when the SSLKEYLOGFILE environment variable is set. This only ensures such messages are prepared using the right logging category and passed to libisc for further processing. The TLS pre-master secret logging callback needs to be set on a per-context basis, so ensure it happens for both client-side and server-side TLS contexts.	2021-12-22 18:17:26 +01:00
Michał Kępień	3081bda798	Add a logging category for TLS pre-master secrets TLS pre-master secrets will be dumped to disk using the logging framework provided by libisc. Add a new logging category for this type of debugging data in order to enable exporting it to a dedicated channel. Derive the name of the new category from the name of the relevant environment variable, SSLKEYLOGFILE.	2021-12-22 18:17:26 +01:00
Aram Sargsyan	5d87725fdc	Use ECDSA P-256 instead of 4096-bit RSA for 'tls ephemeral' ECDSA P-256 performs considerably better than the previously used 4096-bit RSA (can be observed using `openssl speed`), and, according to RFC 6605, provides a security level comparable to 3072-bit RSA.	2021-12-20 10:09:05 +00:00
Ondřej Surý	ee1f8b60c5	Simplify Address Sanitizer tweaks in mem.c Previously, whole isc_mempool_get() and isc_mempool_put() would be replaced by simpler versions when run with address sanitizer. Change the code to limit the fillcount to 1 and freemax to 0. This change will make isc_mempool_get() always allocate and use a single new item and isc_mempool_put() will always return the item to the allocator.	2021-12-17 14:43:05 +01:00
Mark Andrews	a23507c4fa	Pass the digest buffer length to EVP_DigestSignFinal OpenSSL 3.0.1 does not accept 0 as a digest buffer length when calling EVP_DigestSignFinal as it now checks that the digest buffer length is large enough for the digest. Pass the digest buffer length instead.	2021-12-17 20:28:01 +11:00
Michal Nowak	9c013f37d0	Drop cppcheck workarounds As cppcheck was removed from the CI, associated workarounds and suppressions are not required anymore.	2021-12-14 15:03:56 +01:00
Petr Menšík	929bbe192d	Improve error message when directory name is given A surprising IO error is returned when a directory name is given instead of a named.conf file. It can be passed to named-checkconf or an include statement. Make a simple change to return Invalid file instead. Still not precise, but a much better error message is returned. Fix of rhbz#490837.	2021-12-10 10:50:21 +01:00
Michał Kępień	eb4713c8e5	Remove mutex debugging code Mutex debugging code (used when the ISC_MUTEX_DEBUG preprocessor macro is set to 1 and PTHREAD_MUTEX_ERRORCHECK is defined) has been broken for the past 3 years (since commit `2f3eee5a4f`) and nobody complained, which is a strong indication that this code is not being used these days any more. External tools for detecting locking issues are already wired into various GitLab CI checks. Drop all code depending on the ISC_MUTEX_DEBUG preprocessor macro being set.	2021-12-09 14:02:36 +01:00
Michał Kępień	0964a94ad5	Remove mutex profiling code Mutex profiling code (used when the ISC_MUTEX_PROFILE preprocessor macro is set to 1) has been broken for the past 3 years (since commit `0bed9bfc28`) and nobody complained, which is a strong indication that this code is not being used these days any more. External tools for both measuring performance and detecting locking issues are already wired into various GitLab CI checks. Drop all code depending on the ISC_MUTEX_PROFILE preprocessor macro being set.	2021-12-09 12:25:21 +01:00
Ondřej Surý	57d0fabadd	Stop leaking mutex in nmworker and cond in nm socket On FreeBSD, the pthread primitives are not solely allocated on stack, but part of the object lives on the heap. Missing pthread_*_destroy causes the heap memory to grow and in case of fast lived object it's possible to run out-of-memory. Properly destroy the leaking mutex (worker->lock) and the leaking condition (sock->cond).	2021-12-08 17:58:53 +01:00
Ondřej Surý	c6f3e12fe7	Reduce the number of hazard pointers Previously, we set the number of the hazard pointers to be 4 times the number of workers because the dispatch ran on the old socket code. Since the old socket code was removed there's a smaller number of threads, namely: - 1 main thread - 1 timer thread - <n> netmgr threads - <n> threadpool threads Set the number of hazard pointers to 2 + 2 * workers.	2021-12-07 21:12:53 +01:00
Ondřej Surý	15ce1737fa	Fix the isc_hp initialization and memory usage Previously, the isc_hp_init() could not lower the value of isc__hp_max_threads, but because of a mistake the isc__hp_max_threads would be set to HP_MAX_THREADS (e.g. 128 threads) thus it would be always set to 128. This would result in increased memory usage even when small number of workers were in use. Change the default value of isc__hp_max_threads to be 1. Additionally, enforce the max_hps value in isc_hp_new() to be smaller or equal to HP_MAX_HPS. The only user is isc_queue which uses just 1 hazard pointer, so it's only theoretical issue.	2021-12-07 20:41:46 +01:00
Ondřej Surý	20ac73eb22	Improve the logging on failed TCP accept Previously, when TCP accept failed, we logged a message with ISC_LOG_ERROR level. One common case of how this could happen is that the client hits the TCP client quota and is put on hold and when resumed, the client has already given up and closed the TCP connection. In such a case, named would log: TCP connection failed: socket is not connected This message was quite confusing because it doesn't actually say that it's related to accepting the TCP connection and also it logs everything on the ISC_LOG_ERROR level. Change the log message to "Accepting TCP connection failed" and for specific error states lower the severity of the log message to ISC_LOG_INFO.	2021-12-02 13:50:00 +01:00
Artem Boldariev	5f859d8a98	TLS context handling code: Fix an abort on ancient OpenSSL version There was a logical bug when setting a list of enabled TLS protocols, which may lead to a crash (an abort()) on systems with ancient OpenSSL versions. The problem was due to the fact that we were INSIST()ing on supporting all of the TLS versions, while checking only for those mentioned in the configuration was intended.	2021-12-01 12:00:30 +02:00
Artem Boldariev	f0e18f3927	Add isc_nm_has_encryption() This commit adds an isc_nm_has_encryption() function intended to check if a given handle is backed by a connection which uses encryption.	2021-11-30 12:20:22 +02:00
Artem Boldariev	07cf827b0b	Add isc_nm_socket_type() This commit adds an isc_nm_socket_type() function which can be used to obtain a handle's socket type. This change obsoletes isc_nm_is_tlsdns_handle() and isc_nm_is_http_handle(). However, it was decided to keep the latter as we eventually might end up supporting multiple HTTP versions.	2021-11-30 12:20:22 +02:00
Artem Boldariev	b211fff4cb	TLS stream: disable TLS I/O debug log message by default This commit makes the TLS stream code not issue a mostly useless debug log message on error during TLS I/O. This message was cluttering logs a lot, as it can be generated on (almost) any non-clean TLS connection termination, even in the cases when the actual query completed successfully. Nor does it provide much value for end-users, yet it can occasionally be seen when using dig and quite often when running BIND over a publicly available network interface.	2021-11-26 10:23:17 +02:00
Artem Boldariev	0b0c29dd51	DoH: Remove unneeded isc__nmsocket_prep_destroy() call This commit removes unneeded isc__nmsocket_prep_destroy() call on ALPN negotiation failure, which was eventually causing the TLS handle to leak. This call is not needed, as not attaching to the transport (TLS) handle should be enough. At this point it seems like a kludge from earlier days of the TLS code.	2021-11-26 10:23:17 +02:00
Matthijs Mekking	89f4f8f0c8	Add OPENSSL_cleanup to tls_shutdown function This prevents a direct leak in OPENSSL_init_crypto (called from OPENSSL_init_ssl). Add shim version of OPENSSL_cleanup because it is missing in LibreSSL on OpenBSD.	2021-11-26 08:20:10 +01:00
Mark Andrews	1092d8e25a	use .s_addr to handle potential union in struct in_addr	2021-11-25 12:33:04 +00:00
Artem Boldariev	6c8a97c78f	Fix a crash on unexpected incoming DNS message during XoT xfer This commit fixes a peculiar corner case in the client-side DoT code because of which a crash could occur during a zone transfer. A junk DNS message should be sent at the end of a zone transfer via TLS to trigger the crash (abort). This commit, hopefully, fixes that. Also, this commit adds similar changes to the TCP DNS code, as it shares the same origin and most of the logic.	2021-11-24 11:18:36 +02:00
Evan Hunt	7f63ee3bae	address '--disable-doh' failures Change 5756 (GL #2854) introduced build errors when using 'configure --disable-doh'. To fix this, isc_nm_is_http_handle() is now defined in all builds, not just builds that have DoH enabled. Missing code comments were added both for that function and for isc_nm_is_tlsdns_handle().	2021-11-17 13:48:43 -08:00
Artem Boldariev	80482f8d3e	DoH: Add isc_nm_set_min_answer_ttl() This commit adds an isc_nm_set_min_answer_ttl() function which is intended to be used to give a hint to the underlying transport regarding the answer TTL. The interface is intentionally kept generic because over time more transports might benefit from this functionality, but currently it is intended for DoH to set the "max-age" value within the "Cache-Control" HTTP header (as recommended in RFC8484, section 5.1 "Cache Interaction"). It is a no-op for other DNS transports for the time being.	2021-11-05 14:14:59 +02:00
Mark Andrews	0b83f1495d	Handle truncating the request stream in isc_httpd If we have had to truncate the request stream, don't resume reading from it.	2021-11-04 17:06:36 -07:00
Mark Andrews	49531e4582	Handle HTTP/1.1 pipelined requests Check to see whether there are outstanding requests in the httpd receive buffer after sending the response, and if so, process them. Test that pipelined requests are handled by sending multiple minimal HTTP/1.1 requests using netcat (nc) and checking that we get back the same number of responses.	2021-11-04 17:05:29 -07:00
Mark Andrews	e46c64bf42	Consume the HTTP headers after processing a request Remember the amount of space consumed by the HTTP headers, then move any trailing data to the start of the httpd->recvbuf once we have finished processing the request.	2021-11-04 17:00:18 -07:00
Evan Hunt	cbf8c2e019	statschannel doesn't handle multiple reads correctly if an incoming HTTP request is incomplete, but nothing else is clearly wrong with it, the stats channel continues reading to see if there's more coming. the buffer length was not being processed correctly in this case. also, the server state was not reset correctly when the request was complete, so that subsequent requests could be appended to the first buffer instead of being treated as new. in addition to fixing the above problems, this commit also increases the size of the httpd request buffer from 1024 to 4096, because some browsers send a lot of headers.	2021-11-04 15:52:58 +11:00
Mark Andrews	60535fc5f7	The OpenSSL engine API is deprecated in OpenSSL 3.0.0 don't use the engine API unless the OpenSSL API is less than 3.0.0 (OPENSSL_API_LEVEL < 30000)	2021-10-28 07:39:37 +00:00
Aram Sargsyan	965bdd9894	Use OpenSSL version macro instead of function check Unless being configured with the `no-deprecated` option, OpenSSL 3.0.0 still has the deprecated APIs present and will throw warnings during compilation, when using them. Make sure that the old APIs are being used only with the older versions of OpenSSL.	2021-10-28 07:39:37 +00:00
Mark Andrews	ebea7ee97b	Use EVP_RSA_gen() if available BN and other low-level functions are deprecated in OpenSSL 3.0.0; this is one of the replacement methods for generating RSA keys.	2021-10-28 07:38:56 +00:00
Aram Sargsyan	15cb706f22	Refactor the OpenSSL HMAC usage to use newer APIs OpenSSL 3 deprecates the HMAC* family and associated APIs. Rewrite portions of OpenSSL library usage code to use a newer set of HMAC APIs.	2021-10-28 07:38:56 +00:00
Aram Sargsyan	2a6febd5d2	Use thinner shims for OpenSSL's EVP_MD_CTX_new() and EVP_MD_CTX_free() The EVP_MD_CTX_new() and EVP_MD_CTX_free() functions are renamed APIs which were previously available as EVP_MD_CTX_create() and EVP_MD_CTX_destroy() respectively, which means that we can use them instead of providing our own shim functions.	2021-10-28 07:38:56 +00:00
Aram Sargsyan	c45d853f44	Use EVP_MD_CTX_get0_md() instead of deprecated EVP_MD_CTX_md() OpenSSL 3.0.0 deprecates the EVP_MD_CTX_md() function. Use EVP_MD_CTX_get0_md() instead of EVP_MD_CTX_md() and create a shim to use the old variant for the older OpenSSL versions which don't have the newer EVP_MD_CTX_get0_md().	2021-10-28 07:38:56 +00:00
Ondřej Surý	04511736a0	Add isc_time_add and isc_time_subtract unit test isc_time_add() and isc_time_subtract() didn't have a unit test; add one with a couple of edge-case vectors to check whether overflow and underflow are correctly handled.	2021-10-21 09:31:01 +02:00
Ondřej Surý	2b147ac358	Use __builtin_*_overflow for isc_time_{add,subtract}() Use the __builtin_uadd_overflow() and __builtin_usub_overflow() for overflow checks in isc_time_add() and isc_time_subtract(). This generates more efficient and safe code.	2021-10-21 09:31:01 +02:00
Ondřej Surý	8c05f12bc8	Fix isc_time_add() overflow The isc_time_add() could overflow when t.seconds + i.seconds == UINT_MAX and t.nanoseconds + i.nanoseconds >= NS_PER_S. Fix the overflow in isc_time_add(), and simplify the ISC_R_RANGE checks both in isc_time_add() and isc_time_subtract() functions.	2021-10-21 09:31:01 +02:00
Evan Hunt	32b50407bf	check statichandle before attaching it is possible for udp_recv_cb() to fire after the socket is already shutting down and statichandle is NULL; we need to create a temporary handle in this case.	2021-10-18 14:21:04 -07:00
Evan Hunt	a55589f881	remove all references to isc_socket and related types Removed socket.c, socket.h, and all references to isc_socket_t, isc_socketmgr_t, isc_sockevent_t, etc.	2021-10-15 01:01:25 -07:00
Evan Hunt	075139f60e	netmgr: refactor isc__nm_incstats() and isc__nm_decstats() route/netlink sockets don't have stats counters associated with them, so it's now necessary to check whether socket stats exist before incrementing or decrementing them. rather than relying on the caller for this, we now just pass the socket and an index, and the correct stats counter will be updated if it exists.	2021-10-15 00:57:02 -07:00
Evan Hunt	8c51a32e5c	netmgr: add isc_nm_routeconnect() isc_nm_routeconnect() opens a route/netlink socket, then calls a connect callback, much like isc_nm_udpconnect(), with a handle that can then be monitored for network changes. Internally the socket is treated as a UDP socket, since route/netlink sockets follow the datagram contract.	2021-10-15 00:56:58 -07:00
Evan Hunt	8d6bf826c6	netmgr: refactor isc__nm_incstats() and isc__nm_decstats() After support for route/netlink sockets is merged, not all sockets will have stats counters associated with them, so it's now necessary to check whether socket stats exist before incrementing or decrementing them. rather than relying on the caller for this, we now just pass the socket and an index, and the correct stats counter will be updated if it exists.	2021-10-15 00:40:37 -07:00
Ondřej Surý	e603983ec9	Stop providing branch prediction information The __builtin_expect() can be used to provide the compiler with branch prediction information. The Gcc manual says[1] on the subject: In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. Stop using __builtin_expect() and ISC_LIKELY() and ISC_UNLIKELY() macros to provide the branch prediction information as the performance testing shows that named performs better when the __builtin_expect() is not being used. 1. https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fexpect	2021-10-14 10:33:24 +02:00
Evan Hunt	9a9e906306	fixed a bug in rolling timestamp logfiles due to comparing logfile suffixes as 32 bit rather than 64 bit integers, logfiles with timestamp suffixes that should have been removed when rolling could be left in place. this has been fixed.	2021-10-13 08:31:47 -07:00
Ondřej Surý	f3635bcc14	Use #pragma once as header guards Unify the header guard style and replace the inconsistent include guards with #pragma once. The #pragma once is widely and very well supported in all compilers that BIND 9 supports, and #pragma once was already in use in several new or refactored headers. Using a simpler method will also allow us to automate header guard checks, as it is simpler to check programmatically. For reference, here are the reasons for the change taken from Wikipedia[1]: > In the C and C++ programming languages, #pragma once is a non-standard > but widely supported preprocessor directive designed to cause the > current source file to be included only once in a single compilation. > > Thus, #pragma once serves the same purpose as include guards, but with > several advantages, including: less code, avoidance of name clashes, > and sometimes improvement in compilation speed. On the other hand, > #pragma once is not necessarily available in all compilers and its > implementation is tricky and might not always be reliable. 1. https://en.wikipedia.org/wiki/Pragma_once	2021-10-13 00:49:15 -07:00
Matthijs Mekking	2af05beafa	Replace "master/slave" terms in code Replace some "master/slave" terminology in the code with the preferred "primary/secondary" keywords. This also changes user output such as log messages, and fixes a typo ("seconary") in cfg_test.c. There are still some references to "master" and "slave" for various reasons: - The old syntax can still be used as a synonym. - The master syntax is kept when it refers to master files and formats. - This commit replaces mainly keywords that are local. If "master" or "slave" is used in for example a structure that is all over the place, it is considered out of scope for the moment.	2021-10-12 13:11:13 -07:00
Ondřej Surý	ed95f9fba3	Update the source code formatting using clang-format-13 clang-format-13 fixed some of the formatting that clang-format-12 got wrong. Update the formatting.	2021-10-12 11:14:40 +02:00
Michał Kępień	5178ba4cf2	Properly handle JEMALLOC_* Autoconf variables The AX_CHECK_JEMALLOC() m4 macro sets the JEMALLOC_CFLAGS variable, not JEMALLOC_CPPFLAGS. Furthermore, the JEMALLOC_CFLAGS and JEMALLOC_LIBS variables should only be included in the build flags if jemalloc was successfully configured. Tweak lib/isc/Makefile.am accordingly.	2021-10-12 10:44:30 +02:00
Ondřej Surý	2e3a2eecfe	Make isc_result a static enum Remove the dynamic registration of result codes. Convert isc_result_t from unsigned + #defines into 32-bit enum type in grand unified <isc/result.h> header. Keep the existing values of the result codes even at the expense of the description and identifier tables being unnecessarily large. Additionally, add a couple of: switch (result) { [...] default: break; } statements where the compiler now complains about missing enum values in the switch statement.	2021-10-06 11:22:20 +02:00
Ondřej Surý	804ec1bcaa	Improve STATIC_ASSERT macro for older compilers Previously, when using a compiler without support for static assertions, the STATIC_ASSERT() macro would be replaced with a runtime assertion. Change the STATIC_ASSERT() macro to a version that is a compile-time assertion even when using pre-C11 compilers. Courtesy of Joseph Quinsey: https://godbolt.org/z/K9RvWS	2021-10-05 22:13:29 +02:00
Artem Boldariev	abecfdc298	DoT: do not attempt to call read callback if it is not available This commit fixes a crash in DoT code when it was attempting to call a read callback in the later stages of the connection when it is not available. It also fixes [GL #2884] (back-trace provided in the bug report is exactly the same as was seen when fixing this problem).	2021-10-05 11:26:14 +03:00
Artem Boldariev	610bd2726e	Add ALPN negotiation tests to TLS DNS test set This commit adds a set of unit tests to ensure that ALPN happens over the connections and that the result of the negotiation can be checked.	2021-10-05 11:23:47 +03:00
Artem Boldariev	25b2c6ad96	Require "dot" ALPN token for zone transfer requests over DoT (XoT) This commit makes BIND verify that zone transfers are allowed to be done over the underlying connection. Currently, it makes sense only for DoT, but the code is deliberately made to be protocol-agnostic.	2021-10-05 11:23:47 +03:00
Artem Boldariev	eba3278e52	Add isc_nm_xfr_allowed() function The intention of having this function is to have a predicate to check if a zone transfer could be performed over the given handle. In most cases we can assume that we can do zone transfers over any stream transport except DoH, but this assumption will not work for zone transfers over DoT (XoT), as the RFC9103 requires ALPN to happen, which might not be the case for all deployments of DoT.	2021-10-05 11:23:47 +03:00
Artem Boldariev	56b3f5d832	Low level code to support ALPN in DoT This commit adds low-level code necessary to support ALPN in DoT as XoT requires "dot" ALPN token to be negotiated on a connection for zone transfers.	2021-10-05 11:23:47 +03:00
Evan Hunt	8b532d2e64	dispatch: Refactor to eliminate dns_dispatchevent - Responses received by the dispatch are no longer sent to the caller via a task event, but via a netmgr-style recv callback. the 'action' parameter to dns_dispatch_addresponse() is now called 'response' and is called directly from udp_recv() or tcp_recv() when a valid response has been received. - All references to isc_task and isc_taskmgr have been removed from dispatch functions. - All references to dns_dispatchevent_t have been removed and the type has been deleted. - Added a task to the resolver response context, to be used for fctx events. - When the caller cancels an operation, the response handler will be called with ISC_R_CANCELED; it can abort immediately since the caller will presumably have taken care of cleanup already. - Cleaned up attach/detach in resquery and request.	2021-10-02 11:39:56 -07:00
Evan Hunt	08ce69a0ea	Rewrite dns_resolver and dns_request to use netmgr timeouts - The `timeout_action` parameter to dns_dispatch_addresponse() has been replaced with a netmgr callback that is called when a dispatch read times out. this callback may optionally reset the read timer and resume reading. - Added a function to convert isc_interval to milliseconds; this is used to translate fctx->interval into a value that can be passed to dns_dispatch_addresponse() as the timeout. - Note that netmgr timeouts are accurate to the millisecond, so code to check whether a timeout has been reached cannot rely on microsecond accuracy. - If serve-stale is configured, then a timeout received by the resolver may trigger it to return stale data, and then resume waiting for the read timeout. this is no longer based on a separate stale timer. - The code for canceling requests in request.c has been altered so that it can run asynchronously. - TCP timeout events apply to the dispatch, which may be shared by multiple queries. since in the event of a timeout we have no query ID to use to identify the resp we wanted, we now just send the timeout to the oldest query that was pending. - There was some additional refactoring in the resolver: combining fctx_join() and fctx_try_events() into one function to reduce code duplication, and using fixednames in fetchctx and fetchevent. - Incidental fix: new_adbaddrinfo() can't return NULL anymore, so the code can be simplified.	2021-10-02 11:39:56 -07:00
Evan Hunt	308bc46a59	Convert dispatch to netmgr The flow of operations in dispatch is changing and will now be similar for both UDP and TCP queries: 1) Call dns_dispatch_addresponse() to assign a query ID and register that we'll be listening for a response with that ID soon. the parameters for this function include callback functions to inform the caller when the socket is connected and when the message has been sent, as well as a task action that will be sent when the response arrives. (later this could become a netmgr callback, but at this stage to minimize disruption to the calling code, we continue to use isc_task for the response event.) on successful completion of this function, a dispatch entry object will be instantiated. 2) Call dns_dispatch_connect() on the dispatch entry. this runs isc_nm_udpconnect() or isc_nm_tcpdnsconnect(), as needed, and begins listening for responses. the caller is informed via a callback function when the connection is established. 3) Call dns_dispatch_send() on the dispatch entry. this runs isc_nm_send() to send a request. 4) Call dns_dispatch_removeresponse() to terminate listening and close the connection. Implementation comments below: - As we will be using netmgr buffers now. code to send the length in TCP queries has also been removed as that is handled by the netmgr. - TCP dispatches can be used by multiple simultaneous queries, so dns_dispatch_connect() now checks whether the dispatch is already connected before calling isc_nm_tcpdnsconnect() again. - Running dns_dispatch_getnext() from a non-network thread caused a crash due to assertions in the netmgr read functions that appear to be unnecessary now. the assertions have been removed. - fctx->nqueries was formerly incremented when the connection was successful, but is now incremented when the query is started and decremented if the connection fails. - It's no longer necessary for each dispatch to have a pool of tasks, so there's now a single task per dispatch. 
- Dispatch code to avoid UDP ports already in use has been removed. - dns_resolver and dns_request have been modified to use netmgr callback functions instead of task events. some additional changes were needed to handle shutdown processing correctly. - Timeout processing is not yet fully converted to use netmgr timeouts. - Fixed a lock order cycle reported by TSAN (view -> zone -> adb -> view) by calling dns_zt functions without holding the view lock.	2021-10-02 11:39:56 -07:00
Ondřej Surý	9ee60e7a17	netmgr fixes needed for dispatch - The read timer must always be stopped when reading stops. - Read callbacks can now call isc_nm_read() again in TCP, TCPDNS and TLSDNS; previously this caused an assertion. - The wrong failure code could be sent after a UDP recv failure because the if statements were in the wrong order. the check for a NULL address needs to be after the check for an error code, otherwise the result will always be set to ISC_R_EOF. - When aborting a read or connect because the netmgr is shutting down, use ISC_R_SHUTTINGDOWN. (ISC_R_CANCELED is now reserved for when the read has been canceled by the caller.) - A new function isc_nmhandle_timer_running() has been added enabling a callback to check whether the timer has been reset after processing a timeout. - Incidental netmgr fix: always use isc__nm_closing() instead of referencing sock->mgr->closing directly - Corrected a few comments that used outdated function names.	2021-10-02 11:39:56 -07:00
Evan Hunt	d9e1ad9e37	Remove reference count REQUIRE in isc_nm_read() Previously isc_nm_read() required references on the handle to be at least 2, under the assumption that it would only ever be called from a connect or accept callback. however, it can also be called from a read callback, in which case the reference count might be only 1.	2021-10-02 11:39:56 -07:00
Evan Hunt	f439eb5d99	Dispatch API simplification - Many dispatch attributes can be set implicitly instead of being passed in. we can infer whether to set DNS_DISPATCHATTR_TCP or _UDP from whether we're calling dns_dispatch_createtcp() or _createudp(). we can also infer DNS_DISPATCHATTR_IPV4 or _IPV6 from the addresses or the socket that were passed in. - We no longer use dup'd sockets in UDP dispatches, so the 'dup_socket' parameter has been removed from dns_dispatch_createudp(), along with the code implementing it. also removed isc_socket_dup() since it no longer has any callers. - The 'buffersize' parameter was ignored and has now been removed; buffersize is now fixed at 4096. - Maxbuffers and maxrequests don't need to be passed in on every call to dns_dispatch_createtcp() and _createudp(). In all current uses, the value for mgr->maxbuffers will either be raised once from its default of 20000 to 32768, or else left alone. (passing in a value lower than 20000 does not lower it.) there isn't enough difference between these values for there to be any need to configure this. The value for disp->maxrequests controls both the quota of concurrent requests for a dispatch and also the size of the dispatch socket memory pool. it's not clear that this quota is necessary at all. the memory pool size currently starts at 32768, but is sometimes lowered to 4096, which is definitely unnecessary. This commit sets both values permanently to 32768. - Previously TCP dispatches allocated their own separate QID table, which didn't incorporate a port table. this commit removes per-dispatch QID tables and shares the same table between all dispatches. since dispatches are created for each TCP socket, this may speed up the dispatch allocation process. 
there may be a slight increase in lock contention since all dispatches are sharing a single QID table, but since TCP sockets are used less often than UDP sockets (which were already sharing a QID table), it should not be a substantial change. - The dispatch port table was being used to determine whether a port was already in use; if so, then a UDP socket would be bound with REUSEADDR. this commit removes the port table, and always binds UDP sockets that way.	2021-10-02 10:21:49 +02:00
Artem Boldariev	c759f25c7b	Add "session-tickets" options to the "tls" clause This commit adds the ability to enable or disable stateless TLS session resumption tickets (see RFC5077). This ability is useful for two reasons. Firstly, these tickets are encrypted by the server, and the algorithm might be weaker than the algorithm negotiated during the TLS session establishment (it is in general the case for TLSv1.2, but the generic principle applies to TLSv1.3 as well, despite it having better ciphers for session tickets). Thus, they might compromise Perfect Forward Secrecy. Secondly, disabling them might be necessary if the same TLS key/cert pair is supposed to be used by multiple servers to achieve, e.g., load balancing, because the session ticket key is by default generated at runtime, while successful session resumption in this case would require a shared key. The proper alternative to having the ability to disable stateless TLS session resumption tickets is to implement a proper session tickets key rollover mechanism so that key rotation might be performed often (e.g. once an hour) to not compromise forward secrecy while retaining the associated performance benefits. That is much more work, though. On the other hand, having the ability to disable session tickets allows having a deployable configuration right now in the cases when either forward secrecy is wanted or sharing the TLS key/cert pair between multiple servers is needed (or both).	2021-10-01 15:50:43 +03:00
Artem Boldariev	16c6e2be06	Add "prefer-server-ciphers" options to the "tls" clause This commit adds support for enforcing the preference of server ciphers over the client ones. This way, the server attains control over the cipher priority and, thus, can choose stronger ciphers when a client prioritises weaker ciphers over stronger ones, which is beneficial when trying to achieve Perfect Forward Secrecy.	2021-10-01 15:50:43 +03:00
Artem Boldariev	3b88d783a2	Add "ciphers" options to the "tls" clause This commit adds support for setting TLS cipher list string in the format specified in the OpenSSL documentation (https://www.openssl.org/docs/man1.1.1/man1/ciphers.html). The syntax of the cipher list is verified so that specifying the wrong string will prevent the configuration from being loaded.	2021-10-01 15:50:43 +03:00
Artem Boldariev	f2ae4c8480	DH-parameters loading support This commit adds support for loading DH-parameters (Diffie-Hellman parameters) via the new "dhparam-file" option within "tls" clause. In particular, Diffie-Hellman parameters are needed to enable the range of forward-secrecy enabled ciphers for TLSv1.2, which are otherwise silently disabled.	2021-10-01 15:50:43 +03:00
Artem Boldariev	992f815770	Add "protocols" options to the "tls" clause This commit adds the ability to specify allowed TLS protocols versions within the "tls" clause. If an unsupported TLS protocol version is specified in a file, the configuration file will not pass verification. Also, this commit adds strict checks for "tls" clauses verification, in particular: - it ensures that loading configuration files containing duplicated "tls" clauses is not allowed; - it ensures that loading configuration files containing "tls" clauses missing "cert-file" or "key-file" is not allowed; - it ensures that loading configuration files containing "tls" clauses named as "ephemeral" or "none" is not allowed.	2021-10-01 15:50:43 +03:00
Artem Boldariev	9e039986cd	TLS: set some common options both for client and server contexts This commit makes the TLS context manipulation code set some of the common protocol versions regardless of the OpenSSL version in use.	2021-10-01 15:50:42 +03:00
Ondřej Surý	c3250a9b81	Use assertions to check for failed allocations It was discovered that named could crash due to a segmentation fault when jemalloc was in use and memory allocation failed. This was not intended to happen as jemalloc's "xmalloc" option was set to "true" in the "malloc_conf" configuration variable. However, that variable was only set after jemalloc was already done with parsing it, which effectively caused setting that variable to have no effect. While investigating this issue, it was also discovered that enabling the "xmalloc" option makes jemalloc use a slow processing path, decreasing its performance by about 25%. [1] Additionally, further testing (carried out after fixing the way "malloc_conf" was set) revealed that the non-default configuration options do not have any measurable effect on either authoritative or recursive DNS server performance. Replace code setting various jemalloc options to non-default values with assertion checks of mallocx()/rallocx() return values. [1] https://github.com/jemalloc/jemalloc/pull/523	2021-09-30 13:54:55 +02:00
Mark Andrews	8fc9bb8e8e	Address use before NULL check warning of ievent->sock Reorder REQUIRE checks to ensure ievent->sock is checked earlier	2021-09-28 11:57:47 +10:00
Mark Andrews	7079829b84	Address use before NULL check warning of uvreq move dereference of uvreq until the after NULL check.	2021-09-28 11:57:47 +10:00
Ondřej Surý	8248da3b83	Preserve the contents of socket buffer on realloc On TCPDNS/TLSDNS read callback, the socket buffer could be reallocated if the received contents would be larger than the buffer. The existing code would not preserve the contents of the existing buffer, which led to the loss of the already received data. This commit replaces the isc_mem_put()+isc_mem_get() with isc_mem_reget() to preserve the existing contents of the socket buffer.	2021-09-23 22:36:01 +02:00
Ondřej Surý	8edbd0929f	Use isc_mem_reget() to handle the internal active handle cache The netmgr has an internal cache for freed active handles. This cache was allocated using the isc_mem_allocate()/isc_mem_free() API because it was simpler to reallocate the cache when we needed to grow it. The new isc_mem_reget() function can be used here, reducing the need to use the isc_mem_allocate() API, which is a bit slower than the isc_mem_get() API.	2021-09-23 22:17:15 +02:00
Ondřej Surý	15d6249260	Use isc_mem_reget() when growing buffer dynamically Previously, we could not use isc_mem_reallocate() for growing the buffer dynamically, because the memory was allocated using the isc_mem_get()/isc_mem_put() API. With the introduction of the isc_mem_reget() function, we can grow/shrink the memory directly without always moving the memory around, as the allocator might have reserved some extra space after the initial allocation.	2021-09-23 22:17:15 +02:00
Ondřej Surý	4cdb3abf27	Return non-NULL pointer on zero-sized allocations and reallocations Previously, the zero-sized allocations would return a NULL pointer and the caller had to make sure not to dereference such a pointer. The C standard defines the zero-sized calls to malloc() as implementation specific and jemalloc mallocx() with zero size would be undefined behaviour. This complicated the code as it had to handle such cases in a special manner in all allocator and deallocator functions. Now, for realloc(), the situation is even more complicated. In the C standard up to C11, the behaviour would be implementation defined, and actually some implementations would free the original pointer and some would not. C17 (via DR400) deprecated such usage, and since C23 the behaviour is undefined. This commit changes the helper mem_get(), mem_put() and mem_realloc() functions to grow the zero-allocation from 0 to sizeof(void *). This way we get a predictable behaviour where all allocations always return a valid pointer.	2021-09-23 22:17:15 +02:00
Ondřej Surý	aeb3d1cab3	Add isc_mem_reget() function to realloc isc_mem_get allocations The isc_mem_get() and isc_mem_put() functions leave the memory allocation size tracking to the users of the API, while isc_mem_allocate() and isc_mem_free() track the sizes internally. This allowed isc_mem_reallocate() to manipulate the memory allocations made by the latter set, but not the former set, of functions. This commit introduces the isc_mem_reget(ctx, old_ptr, old_size, new_size) function, which operates on memory allocations with external size tracking, completing the API.	2021-09-23 11:18:07 -07:00
Ondřej Surý	edee9440d0	Remove the masterfile-format map option As previously announced, this commit removes the masterfile-format format 'map' from named, all the tools, the documentation and the system tests.	2021-09-17 07:09:50 +02:00
Ondřej Surý	8cb2ba5dd3	Remove native PKCS#11 support The native PKCS#11 support has been removed in favour of the better maintained, more performant and easier to use OpenSSL PKCS#11 engine from the OpenSC project.	2021-09-09 15:35:39 +02:00
Aram Sargsyan	74f50cd29f	Remove dead code Remove dead code from the USE_DEVPOLL branch in libisc's socket.c	2021-09-08 10:12:03 +00:00
Ondřej Surý	45726fc01f	Synchronize the isc_trampoline API with changes needed in v9_16 This commit synchronizes the isc_trampoline API to match the changes needed to fix Windows service in v9_16.	2021-09-01 10:44:21 +02:00
Artem Boldariev	db1ba15ff2	Replace multiple /dns-query constants with a global one This commit replaces the constants defining /dns-query, the default DoH endpoint, with a global definition.	2021-08-30 10:32:17 +03:00
Artem Boldariev	530133c10f	Unify DoH URI making throughout the codebase This commit adds new function isc_nm_http_makeuri() which is supposed to unify DoH URI construction throughout the codebase. It handles IPv6 addresses, hostnames, and IPv6 addresses given as hostnames properly, and replaces similar ad-hoc code in the codebase.	2021-08-30 10:21:58 +03:00
Ondřej Surý	cdf9a1fd20	Remove support for external applications to register libisc The previous versions of BIND 9 exported its internal libraries so that they can be used by third-party applications more easily. Certain library functions were altered from specific BIND-only behavior to more generic behavior when used by other applications. This commit removes the function isc_lib_register() that was used by external applications to enable the functionality.	2021-08-30 08:47:39 +02:00
Evan Hunt	fc6f751fbe	replace per-protocol keepalive functions with a common one this commit removes isc__nm_tcpdns_keepalive() and isc__nm_tlsdns_keepalive(); keepalive for these protocols and for TCP will now be set directly from isc_nmhandle_keepalive(). protocols that have an underlying TCP socket (i.e., TLS stream and HTTP), now have protocol-specific routines, called by isc_nmhandle_keepalive(), to set the keepalive value on the underlying socket.	2021-08-27 10:02:10 -07:00
Evan Hunt	7867b8b57d	enable keepalive when the keepalive EDNS option is seen previously, receiving a keepalive option had no effect on how long named would keep the connection open; there was a place to configure the keepalive timeout but it was never used. this commit corrects that. this also fixes an error in isc__nm_{tcp,tls}dns_keepalive() in which the sense of a REQUIRE test was reversed; previously this error had not been noticed because the functions were not being used.	2021-08-27 09:56:51 -07:00
Evan Hunt	19e24e22f5	cleanup netmgr-int.h - fix some duplicated and out-of-order prototypes declared in netmgr-int.h - rename isc_nm_tcpdns_keepalive to isc__nm_tcpdns_keepalive as it's for internal use	2021-08-27 09:56:51 -07:00
Artem Boldariev	8a655320c8	Fix a crash (in dig) when closing HTTP socket with unused session This commit fixes a crash (caused by an assert) when closing an HTTP/2 socket with unused HTTP/2 session.	2021-08-27 12:14:48 +03:00
Artem Boldariev	32cd4367a3	Make no assumptions regarding HTTP headers processing order This commit changes the DoH code in such a way that it makes no assumptions regarding which headers are expected to be processed first. In particular, the code expected the :method: pseudo-header to be processed early, which might not be true.	2021-08-25 10:32:56 +03:00
Matthijs Mekking	9acce8a82a	Add a function isc_stats_resize Add a new function to resize the number of counters in a statistics counter structure. This will be needed when we keep track of DNSSEC sign statistics and new keys are introduced due to a rollover.	2021-08-24 09:07:15 +02:00
Matthijs Mekking	0bac9c7c5c	Add stats unit test Add a simple stats unit test that tests the existing library functions isc_stats_ncounters, isc_stats_increment, isc_stats_decrement, isc_stats_set, and isc_stats_update_if_greater.	2021-08-24 09:07:15 +02:00
Michal Nowak	d3d32683c0	Fix typos in lib/isc/trampoline_p.h	2021-08-19 07:12:33 +02:00
Ondřej Surý	87d5c8ab7c	Disable the Path MTU Discovery on UDP Sockets Instead of disabling the fragmentation on the UDP sockets, we now disable the Path MTU Discovery by setting IP(V6)_MTU_DISCOVER socket option to IP_PMTUDISC_OMIT on Linux and disabling IP(V6)_DONTFRAG socket option on FreeBSD. This option sets DF=0 in the IP header and also ignores the Path MTU Discovery. As additional mitigation on Linux, we recommend setting net.ipv4.ip_no_pmtu_disc to Mode 3: Mode 3 is a hardened pmtu discover mode. The kernel will only accept fragmentation-needed errors if the underlying protocol can verify them besides a plain socket lookup. Current protocols for which pmtu events will be honored are TCP, SCTP and DCCP as they verify e.g. the sequence number or the association. This mode should not be enabled globally but is only intended to secure e.g. name servers in namespaces where TCP path mtu must still work but path MTU information of other protocols should be discarded. If enabled globally this mode could break other protocols.	2021-08-19 07:12:33 +02:00
Mark Andrews	89fe8e920c	Use %d for enum values	2021-08-19 10:19:32 +10:00
Mark Andrews	26b22a1445	add tests for string and qstring	2021-08-18 13:49:48 +10:00
Mark Andrews	a6357d8b5c	Add unit test for keypair	2021-08-18 13:49:48 +10:00
Mark Andrews	42c22670b3	Add support for parsing <tag>[=<value>] where <value> may be a quoted string. Previously quoted string only supported opening quotes at the start of the string.	2021-08-18 13:49:48 +10:00
Artem Boldariev	d72b1fa5cd	Fix the doh_recv_send() logic in the doh_test The commit fixes doh_recv_send(), which would occasionally fail because it did not wait for all responses to be sent, making the check for the ssends value not pass.	2021-08-12 14:28:17 +03:00
Artem Boldariev	e639957b58	Optimise TLS stream for small write size (<= 512 bytes) This commit changes TLS stream behaviour in such a way that it is now optimised for small writes. When there is a need to write less than or equal to 512 bytes, we can avoid calling the memory allocator at the expense of a possible slight increase in memory usage. In the case of larger writes, the behaviour remains unchanged.	2021-08-12 14:28:17 +03:00
Artem Boldariev	e301e1e3b8	Avoid memory copying during send in TLS stream At least at this point doing memory copying is not required. Probably it was a workaround for some problem in the earlier days of DoH, at this point it appears to be a waste of CPU cycles.	2021-08-12 14:28:17 +03:00
Artem Boldariev	bd69c7c57c	Simplify buffering code logic in http_send_outgoing() This commit significantly simplifies the code in http_send_outgoing() as it was unnecessarily complicated, because it was dealing with multiple statically and dynamically allocated buffers, making it extremely hard to follow, as well as making it do unnecessary memory copying in some situations. This commit fixes these issues, while retaining the high level buffering logic.	2021-08-12 14:28:17 +03:00
Artem Boldariev	a32faa20b4	DoH: replace a custom buffer code for POST data with isc_buffer_t This commit replaces the custom buffer code in client-side DoH code intended to keep track of POST data, with isc_buffer_t.	2021-08-12 14:28:17 +03:00
Artem Boldariev	5b52a7e37e	When terminating a client session, mark it as closing When an HTTP/2 client terminates a session it means that it is about to close the underlying connection. However, we were not doing that. As a result, with the latest changes to the test suite, which made it to limit amount of requests per a transport connection, the tests using quota would hang for quite a while. This commit fixes that.	2021-08-12 14:28:17 +03:00
Artem Boldariev	dbca22877a	Limit the number of requests sent per connection in DoH tests This commit ensures that only a limited number of requests is going to be sent over a single HTTP/2 connection. Before that change was introduced, it was possible to complete all of the planned sends via only one transport connection, which undermines the purpose of the tests using the quota facility.	2021-08-12 14:28:16 +03:00
Artem Boldariev	a05728beb0	Do not call http_do_bio() in isc__nm_http_request() The function should not be called here because it is, in general, supposed to be called at the end of the transport level callbacks to perform I/O, and thus, calling it here is clearly a mistake because it breaks other code expectations. As a result of the call to http_do_bio() from within isc__nm_http_request() the unit tests were running slower than expected in some situations. In this particular situation http_do_bio() is going to be called at the end of the transport_connect_cb() (initially), or http_readcb(), sending all of the scheduled requests at once. This change affects only the test suite because it is the only place in the codebase where isc__nm_http_request() is used in order to ensure that the server is able to handle multiple HTTP/2 streams at once.	2021-08-12 14:28:16 +03:00
Artem Boldariev	849d38b57b	Fix a crash by attaching to the transport socket as early as possible This commit fixes a crash in DoH caused by the transport handle being detached too early when sending outgoing data. We need to attach to the session->handle earlier because as an indirect result of the nghttp2_session_mem_send() the session might get closed and the handle detached. However, there still might be some outgoing data to handle. Besides, even when the underlying socket was closed via the handle, we should still attempt to send outgoing data via isc_nm_send() to let it call the write callback passed to http_send_outgoing().	2021-08-12 14:28:16 +03:00
Artem Boldariev	e0704f2e5d	Use isc_buffer_t to keep track of outgoing response This commit gets rid of custom code taking care of response buffering by replacing the custom code with isc_buffer_t. Also, it gets rid of an unnecessary memory copying when sending a response.	2021-08-12 14:28:16 +03:00
Artem Boldariev	6fe4ab39b9	Use isc_buffer_t to keep track of incoming POST data This commit replaces the ad-hoc 64K buffer for incoming POST data with isc_buffer_t backed by dynamically allocated buffer sized accordingly to the value in the "Content-Length" header.	2021-08-12 14:28:16 +03:00
Artem Boldariev	0ca790d9bf	DoH: isc__buffer_usedregion->isc_buffer_usedregion in client_send() This commit replaces wrong usage of isc__buffer_usedregion() instead of implied isc_buffer_usedregion().	2021-08-12 14:28:16 +03:00
Artem Boldariev	2733cca3ac	Replace ad-hoc DNS message buffer in client code with isc_buffer_t The commit replaces an ad-hoc incoming DNS-message buffer in the client-side DoH code with isc_buffer_t. The commit also fixes a timing issue in the unit tests revealed by the change.	2021-08-12 14:28:16 +03:00
Artem Boldariev	c819caa3a1	Replace the HTTP/2 session's ad-hoc buffer with isc_buffer_t This commit replaces a static ad-hoc HTTP/2 session's temporary buffer with a realloc-able isc_buffer_t object, which is being allocated on as needed basis, lowering the memory consumption somewhat. The buffer is needed in very rare cases, so allocating it prematurely is not wise. Also, it fixes a bug in http_readcb() where the ad-hoc buffer appeared to be improperly used, leading to a situation when the processed data from the receiving regions can be processed twice, while unprocessed data will never be processed.	2021-08-12 14:28:16 +03:00
Artem Boldariev	170cc41d5c	Get rid of some HTTP/2 related types when NGHTTP2 is not available This commit removes definitions of some DoH-related types when libnghttp2 is not available.	2021-08-04 10:32:27 +03:00
Artem Boldariev	f388b71378	Get rid of RW locks in the DoH code This commit gets rid of RW locks in a hot path of the DoH code. In the original design, it was implied that we add new endpoints after the HTTP listener was created. Such a design implies some locking. We do not need such flexibility, though. Instead, we could build a set of endpoints before the HTTP listener gets created. Such a design does not need RW locks at all.	2021-08-04 10:32:25 +03:00
Ondřej Surý	22db2705cd	Use static storage for isc_mem water_t On the isc_mem water change the old water_t structure could be used after free. Instead of introducing reference counting on the hot-path we are going to introduce additional constraints on the isc_mem_setwater. Once it's set for the first time, the additional calls have to be made with the same water and water_arg arguments.	2021-07-22 11:51:46 +02:00
Artem Boldariev	590e8e0b86	Make max number of HTTP/2 streams configurable This commit makes the number of concurrent HTTP/2 streams per connection configurable as a means to fight DDoS attacks. As soon as the limit is reached, BIND terminates the whole session. The commit adds a global configuration option (http-streams-per-connection) which can be overridden in an http <name> {...} statement as follows: http local-http-server { ... streams-per-connection 100; ... }; For now the default value is 100, which should be enough (e.g. NGINX uses 128, but it is a full-featured WEB-server). When using lower numbers (e.g. ~70), it is possible to hit the limit with e.g. flamethrower.	2021-07-16 11:50:22 +03:00
Artem Boldariev	03a557a9bb	Add (http-)listener-clients option (DoH quota mechanism) This commit adds support for the http-listener-clients global option as well as the ability to override the default in an HTTP server description, like: http local-http-server { ... listener-clients 100; ... }; This way we have the ability to specify a per-listener active connections quota globally and then override it when required. This is exactly what AT&T requested from us: they wanted functionality to specify the quota globally and then override it for specific IPs. This change makes such a configuration possible. It makes sense: for example, one could have different quotas for internal and external clients. Or, for example, one could use BIND's internal ability to serve encrypted DoH with some sane quota value for internal clients, while having an un-encrypted DoH listener without quota to put BIND behind a load balancer doing TLS offloading for external clients. Moreover, the code no longer shares the quota with TCP, which makes little sense anyway (see tcp-clients option), because of the nature of interaction of DoH clients: they tend to keep idle connections open for longer periods of time, preventing the TCP and TLS clients from being served. Thus, the need to have a separate, generally larger, quota for them. Also, the change makes any option within the "http <name> { ... };" statement optional, making it easier to override only required default options. By default, the DoH connections are limited to 300 per listener. I hope that it is a good initial guesstimate.	2021-07-16 11:50:20 +03:00
Artem Boldariev	954240467d	Verify HTTP paths both in incoming requests and in config file This commit adds the code (and some tests) which allows verifying validity of HTTP paths both in incoming HTTP requests and in BIND's configuration file.	2021-07-16 10:28:08 +03:00
Evan Hunt	4f6e2317e9	document isc__trampoline Added some header file documentation to the isc__trampoline implementation in trampoline_p.h.	2021-07-14 10:55:12 -07:00
Artem Boldariev	64cd7e8a7f	Fix crash in DoH on empty query string in GET requests An unhandled code path left GET query string data uninitialised (equal to NULL) and led to a crash during the requests' base64 data decoding. This commit fixes that.	2021-07-13 16:53:51 +03:00
Ondřej Surý	a9e6a7ae57	Disable setting the thread affinity It was discovered that setting the thread affinity on both the netmgr and netthread threads led to inconsistent recursive performance because sometimes the netmgr and netthread threads would compete over a single resource and sometimes not. Removing setting the affinity causes a slight dip in the authoritative performance around 5% (the measured range was from 3.8% to 7.8%), but the recursive performance is now consistently good.	2021-07-13 14:48:29 +02:00
Ondrej Sury	6eca4b402e	Use max_align_t for memory sizeinfo alignment on OpenBSD On OpenBSD and more generally on platforms without either jemalloc or malloc_(usable_)size, we need to increase the alignment for the memory to sizeof(max_align_t) as with plain sizeof(void *), the compiled code would be crashing when accessing the returned memory.	2021-07-13 13:48:33 +02:00
Ondrej Sury	23751fe252	Cache the isc_os_ncpus() result It was discovered that on some platforms (e.g. Alpine Linux with MUSL) the result of the isc_os_ncpus() call differs when called before and after we drop privileges. This commit changes the isc_os_ncpus() call to cache the result from the first call and thus always return the same value during the runtime of named. The first call to isc_os_ncpus() is made as soon as possible during the library initialization.	2021-07-13 09:12:04 +02:00
Ondřej Surý	ce03015d48	Remove nonnull attribute from isc_mem_{get,allocate,reallocate} The isc_mem_get(), isc_mem_allocate() and isc_mem_reallocate() can return a NULL pointer in the case where the allocation size is zero. Remove the nonnull attribute from the functions' declarations. This stems from the following definition in the C11 standard: > If the size of the space requested is zero, the behavior is > implementation-defined: either a null pointer is returned, or the > behavior is as if the size were some nonzero value, except that the > returned pointer shall not be used to access an object. In this case, we return NULL as it's easier to detect errors when accessing a pointer from a zero-sized allocation, which should obviously never happen.	2021-07-12 10:02:18 +02:00
Ondřej Surý	d1a9e549b1	Fix the real allocation size in OpenBSD rallocx shim In the rallocx() shim for OpenBSD (that's the only platform that doesn't have malloc_size() or malloc_usable_size() equivalent), the newly allocated size was missing the extra size_t member for storing the allocation size leading to size_t sized overflow at the end of the reallocated memory chunk.	2021-07-12 08:43:14 +02:00
Mark Andrews	3945c289bb	Reset errcnt at the start of each subtest	2021-07-12 03:47:11 +00:00
Mark Andrews	ce5207699d	Fix unchecked return of isc_rwlock_lock and isc_rwlock_unlock (cherry picked from commit `bcaf23dd27`)	2021-07-12 13:26:29 +10:00
Ondřej Surý	29a285a67d	Revert the allocate/free -> get/put change from jemalloc change In the jemalloc merge request, we missed the fact that ah_frees and ah_handles are reallocated which is not compatible with using isc_mem_get() for allocation and isc_mem_put() for deallocation. This commit reverts that part and restores use of isc_mem_allocate() and isc_mem_free().	2021-07-09 18:19:57 +02:00
Artem Boldariev	3673abc53c	Use restrict and const in isc_mempool_t This commit adds restrict and const modifiers to some variables to help the compiler do its optimizations.	2021-07-09 15:58:02 +02:00
Artem Boldariev	c11a401add	Do not use atomic variables in isc_mempool_t As now mempool objects intended to be used in a thread-local manner, there is no point in using atomic here.	2021-07-09 15:58:02 +02:00
Ondřej Surý	63b06571b9	Use isc_mem_get() and isc_mem_put() in isc_mem_total test Previously, the isc_mem_allocate() and isc_mem_free() would be used for isc_mem_total test, but since we now use the real allocation size (sallocx, malloc_size, malloc_usable_size) to track the allocation size, it's impossible to get the test value right. Changing the test to use isc_mem_get() and isc_mem_put() will use the exact size provided, so the test would work again on all the platforms even when jemalloc is not being used.	2021-07-09 15:58:02 +02:00
Ondřej Surý	6f162e8aa4	Rewrite isc_mem water to use single atomic exchange operation This commit refactors the water mechanism in the isc_mem API to use single pointer to a water_t structure that can be swapped with atomic_exchange operation instead of having four different values (water, water_arg, hi_water, lo_water) in the flat namespace. This reduces the need for locking and prevents a race when water and water_arg could be desynchronized.	2021-07-09 15:58:02 +02:00
Ondřej Surý	798333d456	Allow size == 0 in isc_mem_{get,allocate,reallocate} Calls to the jemalloc extended API with size == 0 end up in undefined behaviour. This commit makes the isc_mem_get() and friends calls more POSIX aligned: If size is 0, either a null pointer or a unique pointer that can be successfully passed to free() shall be returned. We picked the easier route (which has already been supported in the old code) and return NULL on calls to the API where size == 0.	2021-07-09 15:58:02 +02:00
Ondřej Surý	e20cc41e56	Use system allocator when jemalloc is unavailable This commit adds support for systems where the jemalloc library is not available as a package, here's the quick summary: * On Linux - the jemalloc is usually available as a package, if configured --without-jemalloc, the shim would be used around malloc(), free(), realloc() and malloc_usable_size() * On macOS - the jemalloc is available from homebrew or macports, if configured --without-jemalloc, the shim would be used around malloc(), free(), realloc() and malloc_size() * On FreeBSD - the jemalloc is the system allocator, we just need to check for <malloc_np.h> header to get access to non-standard API * On NetBSD - the jemalloc is the system allocator, we just need to check for <jemalloc/jemalloc.h> header to get access to non-standard API * On a system hostile to users and developers (read OpenBSD) - the jemalloc API is emulated by using ((size_t *)ptr)[-1] field to hold the size information. The OpenBSD developers care only for themselves, so why should we care about speed on OpenBSD?	2021-07-09 15:58:02 +02:00
Ondřej Surý	e754360170	Remove atomic thread synchronization from the memory hot-path This commit refactors the hi/lo-water related code to remove contention on the hot path in the memory allocator.	2021-07-09 15:58:02 +02:00
Ondřej Surý	efb385ecdc	Clean up isc_mempool API - isc_mempool_get() can no longer fail; when there are no more objects in the pool, more are always allocated. checking for NULL return is no longer necessary. - the isc_mempool_setmaxalloc() and isc_mempool_getmaxalloc() functions are no longer used and have been removed.	2021-07-09 15:58:02 +02:00
Ondřej Surý	f487c6948b	Replace locked mempools with memory contexts Current mempools are kind of hybrid structures - they serve two purposes: 1. mempool with a lock is basically static sized allocator with pre-allocated free items 2. mempool without a lock is a doubly-linked list of preallocated items The first kind of usage could be easily replaced with jemalloc small sized arena objects and thread-local caches. The second usage not-so-much and we need to keep this (in libdns:message.c) for performance reasons.	2021-07-09 15:58:02 +02:00
Ondřej Surý	fd3ceec475	Add debug tracing capability to isc_mempool_create/destroy Previously, we only had capability to trace the mempool gets and puts, but for debugging, it's sometimes also important to keep track how many and where do the memory pools get created and destroyed. This commit adds such tracking capability.	2021-07-09 15:58:02 +02:00
Ondřej Surý	5ab05d1696	Replace isc_mem_allocate() usage with isc_mem_get() in netmgr.c The isc_mem_allocate() comes with additional cost because of the memory tracking. In this commit, we replace the usage with isc_mem_get() because we track the allocated sizes anyway, so it's possible to also replace isc_mem_free() with isc_mem_put().	2021-07-09 15:58:02 +02:00
Ondřej Surý	fcc6814776	Replace internal memory calls with non-standard jemalloc API The jemalloc non-standard API fits nicely with our memory contexts, so just rewrite the memory context internals to use the non-public API. There's just one caveat - since we no longer track the size of the allocation for isc_mem_allocate/isc_mem_free combination, we need to use sallocx() to get real allocation size in both allocator and deallocator because otherwise the sizes would not match.	2021-07-09 15:58:02 +02:00
Ondřej Surý	4b3d0c6600	Remove ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGRECORD The ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGCTX did sanity checks on matching size and memory context on the memory returned to the allocator. Those will no longer be needed when most of the allocator is replaced with jemalloc.	2021-07-09 15:58:02 +02:00
Ondřej Surý	692fd2a216	Remove default_memalloc and default_memfree Now that we have xmalloc:true enabled, we can remove our xmalloc-like wrappers around malloc and free.	2021-07-09 15:58:02 +02:00
Ondřej Surý	5184384efd	Add recommended jemalloc configuration for our load There's a global variable called `malloc_conf` that can be used to configure jemalloc behaviour at the program startup. We use the following configuration: * xmalloc:true - abort-on-out-of-memory enabled. * background_thread:true - Enable internal background worker threads to handle purging asynchronously. * metadata_thp:auto - allow jemalloc to use transparent huge pages (THP) for internal metadata; "auto" uses no THP initially, but may begin to do so when metadata usage reaches a certain level. * dirty_decay_ms:30000 - Approximate time in milliseconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused. * muzzy_decay_ms:30000 - Approximate time in milliseconds from the creation of a set of unused muzzy pages until an equivalent set of unused muzzy pages is purged and/or reused. More information about the specific meaning can be found in the jemalloc manpage or online at http://jemalloc.net/jemalloc.3.html	2021-07-09 15:58:02 +02:00
Ondřej Surý	7f1c525625	Compile with jemalloc to reduce memory allocator contention The jemalloc allocator is scalable high performance allocator, this is the first in the series of commits that will add jemalloc as a memory allocator for BIND 9. This commit adds configure.ac check and Makefile modifications to use jemalloc as BIND 9 allocator.	2021-07-09 15:58:02 +02:00
Ondřej Surý	63924968d1	Add debug tracing capability to isc_mem_create/isc_mem_destroy Previously, we only had capability to trace the memory gets and puts, but for debugging, it's sometimes also important to keep track how many and where do the memory contexts get created and destroyed. This commit adds such tracking capability.	2021-07-09 15:58:02 +02:00
Artem Boldariev	c6d0e3d3a7	Return HTTP status code for small/malformed requests This commit makes BIND return HTTP status codes for malformed or too small requests. DNS request processing code would ignore such requests. Such an approach works well for other DNS transport but does not make much sense for HTTP, not allowing it to complete the request/response sequence. Suppose execution has reached the point where DNS message handling code has been called. In that case, it means that the HTTP request has been successfully processed, and, thus, we are expected to respond to it either with a message containing some DNS payload or at least to return an error status code. This commit ensures that BIND behaves this way.	2021-07-09 16:37:08 +03:00
Artem Boldariev	fedff2cd6c	Return "Bad Request" (400) in a case of Base64 decoding error This error code fits better than the more generic "Internal Server Error" (500) which implies that the problem is on the server. Also, do not end the whole HTTP/2 session on a bad request.	2021-07-09 16:26:46 +03:00
Artem Boldariev	1792740075	Ignore an "Accept" HTTP header value We were too strict regarding the value and presence of "Accept" HTTP header, slightly breaking compatibility with the specification. According to RFC8484 client SHOULD add "Accept" header to the requests but MUST be able to handle "application/dns-message" media type regardless of the value of the header. That basically suggests we ignore its value. Besides, verifying the value of the "Accept" header is a bit tricky because it could contain multiple media types, thus requiring proper parsing. That is doable but does not provide us with any benefits. Among other things, not verifying the value also fixes compatibility with clients, which could advertise multiple media types as supported, which we should accept. For example, it is possible for a perfectly valid request to contain "application/dns-message", "application/*", and "*/*" in the "Accept" header value. Still, we would treat such a request as invalid.	2021-07-09 16:26:46 +03:00
Artem Boldariev	7b6945fb60	Fix BIND hanging when browsers end HTTP/2 streams prematurely The commit fixes BIND hanging when browsers end HTTP/2 streams prematurely (for example, by sending RST_STREAM). It ensures that isc__nmsocket_prep_destroy() will be called for an HTTP/2 stream, allowing it to be properly disposed. The problem was impossible to reproduce using dig or DoH benchmarking software (e.g. flamethrower) because these do not tend to end HTTP/2 streams prematurely.	2021-07-09 15:42:44 +03:00
Artem Boldariev	094fcc10e7	Move the code which calls server read callback into a separate func This commit moves the code which calls server read callback into a separate function to avoid code repetition.	2021-07-09 15:42:44 +03:00
Ondřej Surý	2bb454182b	Make the DNS over HTTPS support optional This commit adds two new autoconf options `--enable-doh` (enabled by default) and `--with-libnghttp2` (mandatory when DoH is enabled). When DoH support is disabled the library is not linked-in and support for http(s) protocol is disabled in the netmgr, named and dig.	2021-07-07 09:50:53 +02:00
Ondřej Surý	29c2e52484	The isc/platform.h header has been completely removed The isc/platform.h header was left empty, with its contents already moved either to config.h or to the appropriate headers. This is just the final cleanup commit.	2021-07-06 05:33:48 +00:00
Ondřej Surý	bf4a0e26dc	Move NAME_MAX and PATH_MAX from isc/platform.h to isc/dir.h The last remaining defines needed for platforms without NAME_MAX and PATH_MAX (I'm looking at you, GNU Hurd) were moved to isc/dir.h where it's prevalently used.	2021-07-06 05:33:48 +00:00
Ondřej Surý	4da0c49e80	Move ISC_STRERRORSIZE to isc/strerr.h header The ISC_STRERRORSIZE was defined in isc/platform.h header as the value was different between Windows and POSIX platforms. Now that Windows is gone, move the define to where it belongs.	2021-07-06 05:33:48 +00:00
Ondřej Surý	d881e30b0a	Remove LIB<>_EXTERNAL_DATA defines After Windows has been removed, the LIB<>_EXTERNAL_DATA defines were just dummy leftovers. Remove them.	2021-07-06 05:33:48 +00:00
Ondřej Surý	e59a359929	Move the include Makefile.tests to the bottom of Makefile.am(s) The Makefile.tests was modifying global AM_CFLAGS and LDADD and could accidentally pull /usr/include to be listed before the internal libraries, which is known to cause problems if the headers from the previous version of BIND 9 has been installed on the build machine.	2021-06-24 15:33:52 +02:00
Ondřej Surý	b941411072	Disable IP fragmentation on the UDP sockets In DNS Flag Day 2020, we started setting the DF (Don't Fragment) socket option on the UDP sockets. It turned out that this code was incomplete, leading to dropping the outgoing UDP packets. This has now been remedied, so it is possible to disable the fragmentation on the UDP sockets again as the sending error is now handled by sending back an empty response with the TC (truncated) bit set. This reverts commit `66eefac78c`.	2021-06-23 17:41:34 +02:00
Evan Hunt	a3ba95116e	Handle UDP send errors when sending DNS message larger than MTU When the fragmentation is disabled on UDP sockets, the uv_udp_send() call can fail with UV_EMSGSIZE for messages larger than path MTU. Previously, this error would end with just discarding the response. In this commit, a proper handling of such case is added and on such error, a new DNS response with truncated bit set is generated and sent to the client. This change allows us to disable the fragmentation on the UDP sockets again.	2021-06-23 17:41:34 +02:00
Ondřej Surý	ec86759401	Replace netmgr per-protocol sequential function with a common one Previously, each protocol (TCPDNS, TLSDNS) specified its own function to disable pipelining on the connection. An oversight would lead to an assertion failure when the opcode is not query over a non-TCPDNS protocol because the isc_nm_tcpdns_sequential() function would be called over a non-TCPDNS socket. This commit removes the per-protocol functions and refactors the code to have and use a common isc_nm_sequential() function that either disables the pipelining on the socket or handles the request in a protocol-specific manner. Currently it ignores the call for HTTP sockets and causes an assertion failure for protocols where it doesn't make sense to call the function at all.	2021-06-22 17:21:44 +03:00
Ondřej Surý	54c389dbc0	Drop support for clang atomic and gcc __sync builtins BIND 9.17+ now requires C11 support from the compiler, so we can safely drop most of the stdatomic.h shims from lib/isc/unix/include/stdatomic.h. This commit removes support for clang atomic builtins (clang >= 3.6.0 includes stdatomic.h header) and for Gcc __sync builtins. The only compatibility shim that remains is support for __atomic builtins for Gcc >= 4.7.0 since CentOS 7 still includes only Gcc 4.8.1 and the proper stdatomic.h header was only introduced in Gcc >= 4.9.	2021-06-17 09:51:04 +02:00
Ondřej Surý	4677bb28d1	Remove atomics emulated by a mutex-locked variable Mutex atomics were intended to be used as a debugging tool only and it has already served its purpose and it's not needed anymore.	2021-06-17 09:51:04 +02:00
Artem Boldariev	dc356bb196	Fix ASAN error in DoH (passing NULL to memmove()) The warning was produced by an ASAN build: runtime error: null pointer passed as argument 2, which is declared to never be null This commit fixes it by checking if nghttp2_session_mem_send() has actually returned anything.	2021-06-16 17:46:10 +03:00
Mark Andrews	234ad2d075	Lock access to task->threadid	2021-06-15 00:01:58 +00:00
Artem Boldariev	ccd2267b1c	Set sock->iface and sock->peer properly for layered connection types This change sets the mentioned fields properly and gets rid of kludges added in the times when we were keeping pointers to isc_sockaddr_t instead of copies. Among other things it helps to avoid a situation when garbage instead of an address appears in dig output.	2021-06-14 11:37:36 +03:00
Artem Boldariev	b84fa122ce	Make BIND refuse to serve XFRs over DoH We cannot use DoH for zone transfers. According to RFC8484 a DoH request contains exactly one DNS message (see Section 6: Definition of the "application/dns-message" Media Type, https://datatracker.ietf.org/doc/html/rfc8484#section-6). This makes DoH unsuitable for zone transfers as often (and usually!) these need more than one DNS message, especially for larger zones. As zone transfers over DoH are not (yet) standardised, nor discussed in RFC8484, the best thing we can do is to return "not implemented." Technically DoH can be used to transfer small zones which fit in one message, but that is not enough for the generic case. Also, this commit makes the server-side DoH code ensure that no multiple responses could be attempted to be sent over one HTTP/2 stream. In HTTP/2 one stream is mapped to one request/response transaction. Now the write callback will be called with failure error code in such a case.	2021-06-14 11:37:36 +03:00
Artem Boldariev	009752cab0	Pass an HTTP handle to the read callback when finishing a stream This commit fixes a leftover from an earlier version of the client-side DoH code when the underlying transport handle was used directly.	2021-06-14 11:37:36 +03:00
Artem Boldariev	d5d20cebb2	Fix a crash in the client-side DoH code (header processing callback) Support a situation in header processing callback when client side code could receive a belated response or part of it. That could happen when the HTTP/2 session was already closed, but there were some response data from server in flight. Other client-side nghttp2 callbacks code already handled this case. The bug became apparent after HTTP/2 write buffering was supported, leading to rare unit test failures.	2021-06-14 11:37:33 +03:00
Artem Boldariev	2dfc0d9afc	Nullify connect.cstream in time and keep track of all client streams This commit ensures that sock->h2.connect.cstream gets nullified when the object in question is deleted. This fixes a nasty crash in dig, exposed when receiving large responses, that led to a double free(). It also refactors how the client-side code keeps track of client streams, (hopefully) preventing similar errors from appearing in the future.	2021-06-14 11:37:29 +03:00
Artem Boldariev	5b507c1136	Fix BIND to serve large HTTP responses This commit makes the NM code report HTTP as a stream protocol. This makes it possible to handle large responses properly, for example: dig +https @127.0.0.1 A cmts1-dhcp.longlines.com	2021-06-14 11:37:17 +03:00
Ondřej Surý	b3de93e54c	Update the source code formatting using clang-format-12 clang-format now tries to keep the type-cast on the same line as the variable. Update the formatting.	2021-06-13 08:46:28 +02:00
Ondřej Surý	440fb3d225	Completely remove BIND 9 Windows support The Windows support has been completely removed from the source tree and BIND 9 now no longer supports native compilation on Windows. We might consider reviewing mingw-w64 port if contributed by external party, but no development efforts will be put into making BIND 9 compile and run on Windows again.	2021-06-09 14:35:14 +02:00
Mark Andrews	66d1df57cb	Report which assertion failed when calling set_global_error	2021-06-03 11:55:31 +10:00
Ondřej Surý	f14d870d15	Fix copy&paste error in setsockopt_off Because of copy&paste error the setsockopt_off macro would enable the socket option instead of disabling it.	2021-06-02 17:47:14 +02:00
Ondřej Surý	67afea6cfc	Cleanup the remaining of HAVE_UV_<func> macros While cleaning up the usage of HAVE_UV_<func> macros, we forgot to cleanup the HAVE_UV_UDP_CONNECT in the actual code and HAVE_UV_TRANSLATE_SYS_ERROR and this was causing Windows build to fail on uv_udp_send() because the socket was already connected and we were falsely assuming that it was not. The platforms with autoconf support were not affected, because we were still checking for the functions from the configure.	2021-06-02 11:23:36 +02:00
Artem Boldariev	35d0027f36	HTTP/2 write buffering This commit adds the ability to consolidate HTTP/2 write requests if there is already one in flight. If that is the case, the code will consolidate multiple subsequent write requests into a larger one, which allows the network to be utilised more efficiently by creating larger TCP packets as well as by reducing TLS record overhead (by creating large TLS records instead of multiple small ones). This optimisation is especially efficient for clients creating many concurrent HTTP/2 streams over a transport connection at once. This way, the code might create a small number of multi-kilobyte requests instead of many 50-120 byte ones. In fact, it turned out to work so well that I had to add a work-around to the code to ensure compatibility with the flamethrower, which, at the time of writing, does not support TLS records larger than two kilobytes. Now the code tries to flush the write buffer after 1.5 kilobytes, which is still pretty adequate for our use case. Essentially, this commit implements a recommendation given by the nghttp2 library: https://nghttp2.org/documentation/nghttp2_session_mem_send.html	2021-06-01 21:07:45 +03:00
Ondřej Surý	7670f98377	Add isc_task_getnetmgr() function Add a function to pull the attached netmgr from inside the executed task. This is needed for any task that needs to call the netmgr API.	2021-05-31 14:52:05 +02:00
Ondřej Surý	87fe97ed91	Add asynchronous work API to the network manager libuv has support for running long-running tasks in dedicated threadpools, so they don't affect networking IO. This commit adds an isc_nm_work_enqueue() wrapper that wraps around the libuv API and runs it on top of the associated worker loop. The only limitation is that the function must be called from inside a network manager thread, so the call to the function should be wrapped inside a (bound) task.	2021-05-31 14:52:05 +02:00
Ondřej Surý	211bfefbaa	Use UV_VERSION_HEX to decide whether we need libuv shim functions Instead of having a configure check for every missing function that has been added in later version of libuv, we now use UV_VERSION_HEX to decide whether we need the shim or not.	2021-05-31 14:52:05 +02:00
Ondřej Surý	7477d1b2ed	Add uv_os_getenv() and uv_os_setenv() compatibility shims The uv_os_getenv() and uv_os_setenv() functions were introduced in the libuv >= 1.12.0. Add simple compatibility shims for older versions.	2021-05-31 14:52:05 +02:00
Ondřej Surý	f752840db3	Add uv_req_get_data() and uv_req_set_data() compatibility shims The uv_req_get_data() and uv_req_set_data() functions were introduced in libuv >= 1.19.0, so we need to add compatibility shims with older libuv versions.	2021-05-31 14:52:05 +02:00
Mark Andrews	d68b009cfe	Remove priority from attribute constructor/destructor On some platforms, the __attribute__ constructor and destructor won't take priorities and the compilation fails. One such platform is macOS. For this reason, the constructor/destructor in libisc was reworked to not use priorities, but to have a single constructor and destructor that call the appropriate routines in the correct order. This commit removes the extra priority because it's no longer needed and it also breaks compilation on macOS with GCC 10.	2021-05-27 08:02:21 +02:00
Mark Andrews	715a2c7fc1	Add missing initialisations configuring with --enable-mutex-atomics flagged these incorrectly initialised variables on systems where pthread_mutex_init doesn't just zero out the structure.	2021-05-26 08:15:08 +00:00
Ondřej Surý	a227562f13	Cleanup the struct isc_nmiface In previous MR, I forgot to remove the `struct isc_nmiface`, this commit rectifies that.	2021-05-26 09:55:10 +02:00
Ondřej Surý	50270de8a0	Refactor the interface handling in the netmgr The isc_nmiface_t type was holding just a single isc_sockaddr_t, so we got rid of the datatype and use plain isc_sockaddr_t in place where isc_nmiface_t was used before. This means less type-casting and shorter path to access isc_sockaddr_t members. At the same time, instead of keeping the reference to the isc_sockaddr_t that was passed to us when we start listening, we will keep a local copy. This prevents the data race on destruction of the ns_interface_t objects where pending nmsockets could reference the sockaddr of already destroyed ns_interface_t object.	2021-05-26 09:43:12 +02:00
Ondřej Surý	28b65d8256	Reduce the number of clientmgr objects created Previously, as a way of reducing the contention between threads, a clientmgr object would be created for each interface/IP address. With tasks being more strictly bound to netmgr workers, this is no longer needed and we can just create one clientmgr object per worker queue (ncpus). Each clientmgr object then has a single task and a single memory context.	2021-05-24 20:44:54 +02:00
Ondřej Surý	4db5e30177	Run shutdown events with the task's existing threadid Previously, task->threadid was reassigned to 0 while shutting down, which caused an assertion.	2021-05-24 20:02:20 +02:00
Ondřej Surý	0be7ea78be	Reduce the number of client tasks and bind them to netmgr queues Since a client object is bound to a netmgr handle, each client will always be processed by the same netmgr worker, so we can simplify the code by binding client->task to the same thread as the client. Since ns__client_request() now runs in the same event loop as client->task events, it is no longer necessary to pause the task manager before launching them. Also removed some functions in isc_task that were not used.	2021-05-24 20:02:20 +02:00
Artem Boldariev	67c50abe5a	Add DoH quota tests This commit adds unit tests which ensure that DoH code is compatible with quota functionality.	2021-05-19 10:28:47 +03:00
Mark Andrews	7e83c6df94	initialise worker->cond_prio	2021-05-18 07:47:42 +00:00
Ondřej Surý	9e3cb396b2	Replace netmgr quantum with loop-preventing barrier Instead of using fixed quantum, this commit adds atomic counter for number of items on each queue and uses the number of netievents scheduled to run as the limit of maximum number of netievents for a single process_queue() run. This prevents the endless loops when the netievent would schedule more netievents onto the same loop, but we don't have to pick "magic" number for the quantum.	2021-05-17 11:59:19 +02:00
Ondřej Surý	4509089419	Add configuration option to set send/recv buffers on the nm sockets This commit adds a new configuration option to set the receive and send buffer sizes on the TCP and UDP netmgr sockets. The default is `0`, which doesn't set any value and just uses the value set by the operating system. There's no magic value here: set it too small and performance will drop; set it too large and the buffers can fill up with queries that have already timed out on the client side, so nobody is interested in the answer, and this would just make the server clog up even more by making it produce useless work. `netstat -su` can be used on POSIX systems to monitor the receive and send buffer errors.	2021-05-17 08:47:09 +02:00
Ondřej Surý	cd413234f7	Fix the outgoing UDP socket selection on Windows The outgoing UDP socket selection would pick an uninitialized child socket on Windows, because we have more netmgr workers than we have listening sockets. This commit fixes the selection by keeping the outgoing socket the same, so it's always run on an existing socket.	2021-05-13 15:04:48 +02:00
Artem Boldariev	bab9309231	Fix DoH unit tests logic This commit fixes logic bugs in DoH test suite revealed by making DoH not to call nghttp2_session_terminate_session() in server-side code.	2021-05-13 10:42:25 +03:00
Artem Boldariev	6816a741ca	Fix crash in TLS caused by improper handling of shutdown messages The problem was found when flamethrower was accidentally run in DoT mode against DoH port.	2021-05-13 10:42:25 +03:00
Artem Boldariev	1947f6372d	Limit the number of active concurrent HTTP/2 streams The initial intent was to limit the number of concurrent streams by the value of 100 but due to the error when reading the documentation it was set to the maximum possible number of streams per session. This could lead to security issues, e.g. a remote attacker could have taken down the BIND instance by creating lots of sessions via low number of transport connections. This commit fixes that.	2021-05-13 10:42:25 +03:00
Artem Boldariev	d80d1b0dd9	Do not allow empty DoH endpoints to be added It was possible to specify empty DoH endpoint in BIND's configuration file: that was an error, we should not allow doing so.	2021-05-13 10:42:25 +03:00
Artem Boldariev	9155a87528	Do not call nghttp2_session_terminate_session() in server-side code We should not call nghttp2_session_terminate_session() in server-side code after all of the active HTTP/2 streams are processed. The underlying transport connection is expected to remain opened at least for some time in this case for new HTTP/2 requests to arrive. That is what flamethrower was expecting and it makes perfect sense from the HTTP/2 perspective.	2021-05-13 10:42:25 +03:00
Mark Andrews	0f6ae9000a	initialise sock->cond	2021-05-11 14:06:26 +02:00
Ondřej Surý	3713a38689	Bump the netmgr quantum to 1024 During the stress testing, it was discovered that the default netmgr quantum of 128 is not enough and there was a performance drop for TCP on FreeBSD. Bumping the default quantum to 1024 solves the performance issue and is still enough to prevent the endless loops.	2021-05-10 21:32:31 +02:00
Ondřej Surý	e623c12757	Destroy reference to taskmgr after all tasks are done We were clearing the pointer to taskmgr as soon as isc_taskmgr_destroy() would be called and before all tasks were finished. Unfortunately, some tasks would use global named_g_taskmgr objects from inside the events and this would cause either a data race or NULL pointer dereference. This commit fixes the data race by moving the destruction of the referenced pointer to the time after all tasks are finished.	2021-05-10 12:13:27 -07:00
Ondřej Surý	6c57a6cc3d	Add isc_taskmgr_detach when task is created while shutting down When the taskmgr is shutting down, creating a task would attach to the taskmgr, but would not detach on the error condition.	2021-05-10 11:39:51 +02:00
Ondřej Surý	0133096c88	improvements to socket_test - be more strict, but patient, waiting for event completion. - use an atomic pointer for the socket to silence TSAN warnings.	2021-05-07 14:28:33 -07:00
Ondřej Surý	365c6a9851	ensure interlocked netmgr events run on worker[0] Network manager events that require interlock (pause, resume, listen) are now always executed in the same worker thread, mgr->workers[0], to prevent races. "stoplistening" events no longer require interlock.	2021-05-07 14:28:32 -07:00
Evan Hunt	c44423127d	fix shutdown deadlocks - ensure isc_nm_pause() and isc_nm_resume() work the same whether run from inside or outside of the netmgr. - promote 'stop' events to the priority event level so they can run while the netmgr is pausing or paused. - when pausing, drain the priority queue before acquiring an interlock; this prevents a deadlock when another thread is waiting for us to complete a task. - release interlock after pausing, reacquire it when resuming, so that stop events can happen. some incidental changes: - use a function to enqueue pause and resume events (this was part of a different change attempt that didn't work out; I kept it because I thought it was more readable). - make mgr->nworkers a signed int to remove some annoying integer casts.	2021-05-07 14:28:32 -07:00
Ondřej Surý	4c8f6ebeb1	Use barriers for netmgr synchronization The netmgr listening, stoplistening, pausing and resuming functions now use barriers for synchronization, which makes the code much simpler. isc/barrier.h defines isc_barrier macros as a front-end for uv_barrier on platforms where that works, and pthread_barrier where it doesn't (including TSAN builds).	2021-05-07 14:28:32 -07:00
Ondřej Surý	2eae7813b6	Run isc__nm_http_stoplistening() synchronously in netmgr When isc__nm_http_stoplistening() is run from inside the netmgr, we need to make sure it's run synchronously. This commit is just a band-aid though, as the desired behavior for isc_nm_stoplistening() is not always the same: 1. When run by an outside user of the interface, the call must be synchronous, e.g. the calling code expects the call to really stop listening on the interfaces. 2. But if there's a call from listen<proto> when listening fails, that needs to be scheduled to run asynchronously, because isc_nm_listen<proto> is being run in a paused (interlocked) netmgr thread and we could get stuck. The proper solution would be to make isc_nm_stoplistening() behave like uv_close(), i.e., to have a proper callback.	2021-05-07 14:28:32 -07:00
Evan Hunt	5c08f97791	only run tasks as privileged if taskmgr is in privileged mode all zone loading tasks have the privileged flag, but we only want them to run as privileged tasks when the server is being initialized; if we privilege them the rest of the time, the server may hang for a long time after a reload/reconfig. so now we call isc_taskmgr_setmode() to turn privileged execution mode on or off in the task manager. isc_task_privileged() returns true if the task's privilege flag is set and the taskmgr is in privileged execution mode. this is used to determine in which netmgr event queue the task should be run.	2021-05-07 14:28:30 -07:00
Ondřej Surý	29a208aaf7	Fix crash when allocating UDP socket fails on OpenBSD When socket() call fails, the UDP connect code would call the connectcb with empty req->handle. This has been fixed.	2021-05-07 14:28:30 -07:00
Ondřej Surý	dacf586e18	Make the netmgr queue processing quantized There was a theoretical possibility of clogging up the queue processing with an endless loop where the netievent currently being processed would schedule a new netievent that would get processed immediately. This wasn't such a problem when only netmgr netievents were processed, but with the addition of the tasks, there are at least two situations where this could happen: 1. In lib/dns/zone.c:setnsec3param() the task would get re-enqueued when the zone was not yet fully loaded. 2. Tasks have an internal quantum for the maximum number of isc_events to be processed; when the task quantum is reached, the task would get rescheduled and then immediately processed by the netmgr queue processing. As the isc_queue doesn't have a mechanism to atomically move the queue, this commit adds a mechanism to quantize the queue, so enqueueing new netievents will never stop processing other uv_loop_t events. The default quantum size is 128. Since the queue used in the network manager allows items to be enqueued more than once, tasks are now reference-counted around task_ready() and task_run(). task_ready() now has a public API wrapper, isc_task_ready(), that the netmgr can use to reschedule processing of a task if the quantum has been reached. Incidental changes: Cleaned up some unused fields left in isc_task_t and isc_taskmgr_t after the last refactoring, and changed atomic flags to atomic_bools for easier manipulation.	2021-05-07 14:28:30 -07:00
Ondřej Surý	b5bf58b419	Destroy netmgr before destroying taskmgr With taskmgr running on top of netmgr, the ordering of how the tasks and netmgr shutdown interacts was wrong as previously isc_taskmgr_destroy() was waiting until all tasks were properly shutdown and detached. This responsibility was moved to netmgr, so we now need to do the following: 1. shutdown all the tasks - this schedules all shutdown events onto the netmgr queue 2. shutdown the netmgr - this also makes sure all the tasks and events are properly executed 3. Shutdown the taskmgr - this now waits for all the tasks to finish running before returning 4. Shutdown the netmgr - this call waits for all the netmgr netievents to finish before returning This solves the race when the taskmgr object would be destroyed before all the tasks were finished running in the netmgr loops.	2021-05-07 14:28:30 -07:00
Ondřej Surý	a011d42211	Add new isc_managers API to simplify <*>mgr create/destroy Previously, netmgr, taskmgr, timermgr and socketmgr all had their own isc_<*>mgr_create() and isc_<*>mgr_destroy() functions. The new isc_managers_create() and isc_managers_destroy() fold all four into a single function and make sure the objects are created and destroyed in the correct order. Especially now, when taskmgr runs on top of netmgr, the correct order is important, and when the code was duplicated in many places it was easy to make a mistake. The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were made private and a single call to isc_managers_create() and isc_managers_destroy() is required at program startup / shutdown.	2021-05-07 10:19:05 -07:00
Artem Boldariev	8c0ea01f34	DoH: close active server streams when finishing session Under some circumstances a situation might occur when server-side session gets finished while there are still active HTTP/2 streams. This would lead to isc_nm_httpsocket object leaks. This commit fixes this behaviour as well as refactors failed_read_cb() to allow better code reuse.	2021-05-07 15:47:24 +03:00
Artem Boldariev	a9e97f28b7	Fix crash in client side DoH code This commit fixes a situation when a cstream object could get unlinked from the list as a result of a cstream->read_cb call. Thus, unlinking it after the call could crash the program.	2021-05-07 15:47:24 +03:00
Artem Boldariev	cd178043d9	Make some TLS tests actually use quota A directive to check quota was missing from some of the TLS tests which were supposed to test TLS code with quotas.	2021-05-07 15:47:24 +03:00
Artem Boldariev	22376fc69a	TLS: cancel reading on the underlying TCP socket after (see below) ... the last handle has been detached after calling write callback. That makes it possible to detach from the underlying socket and not to keep the socket object alive for too long. This issue was causing TLS tests with quota to fail because quota might not have been detached on time (because it was still referenced by the underlying TCP socket). One could say that this commit is an ideological continuation of: `513cdb52ec`.	2021-05-07 15:47:24 +03:00
Artem Boldariev	3bf331c453	Fix crashes in TLS when handling TLS shutdown messages This commit fixes some situations which could appear in TLS code when dealing with shutdown messages and lead to crashes.	2021-05-07 15:47:24 +03:00
Artem Boldariev	0d3f503dc9	Avoid creating connect netievents during low level failures in HTTP This way we create fewer netievent objects, not bombarding NM with messages in case of numerous low-level errors (like too many open files) in e.g. unit tests.	2021-05-07 15:47:24 +03:00
Artem Boldariev	0e8ac61d6e	Avoid creating httpclose netievents in case of low level failures This way we put less load on NM workers by avoiding netievent creation.	2021-05-07 15:47:24 +03:00
Artem Boldariev	8510c5cd59	Always call TCP connect callback from within a worker context This change ensures that a TCP connect callback is called from within the context of a worker thread in case of a low-level error when descriptors cannot be created (e.g. when there are too many open file descriptors).	2021-05-07 15:47:24 +03:00
Artem Boldariev	1349142333	Got rid of tlsconnect event and corresponding code We do not need it since we decided to not return values from connect functions.	2021-05-07 15:47:24 +03:00
Artem Boldariev	39448c1581	Finish HTTP session on write failure Not doing so caused client-side code to not free file descriptors as soon as possible, that was causing unit tests to fail.	2021-05-07 15:47:24 +03:00
Artem Boldariev	4c5b36780b	Fix flawed DoH unit tests logic This commit fixes some logical mistakes in DoH unit tests logic, causing them either to fail or not to do what they are intended to do.	2021-05-07 15:47:24 +03:00
Matthijs Mekking	66f2cd228d	Use isdigit instead of checking character range When looking for key files, we could use isdigit rather than checking if the character is within the range [0-9]. Use (unsigned char) cast to ensure the value is representable in the unsigned char type (as suggested by the isdigit manpage). Change " & 0xff" occurrences to the recommended (unsigned char) type cast.	2021-05-05 19:15:33 +02:00
Ondřej Surý	dfd56b84f5	Add support for generating backtraces on Windows This commit adds support for generating backtraces on Windows and refactors the isc_backtrace API to match the Linux/BSD API (without the isc_ prefix) * isc_backtrace_gettrace() was renamed to isc_backtrace(), the third argument was removed and the return type was changed to int * isc_backtrace_symbols() was added * isc_backtrace_symbols_fd() was added and used as appropriate	2021-05-03 20:31:52 +02:00
Ondřej Surý	37c0d196e3	Use uv_sleep in the netmgr code libuv added uv_sleep(unsigned int msec) to the API since 1.34.0. Use that in the netmgr code and define usleep based shim for libuv << 1.34.0.	2021-05-03 20:22:54 +02:00
Ondřej Surý	c37ff5d188	Add nanosleep and usleep Windows shims This commit adds POSIX nanosleep() and usleep() shim implementation for Windows to help implementors use less #ifdef _WIN32 in the code.	2021-05-03 20:22:54 +02:00
Ondřej Surý	cd54bbbd9a	Add trampoline around iocompletionport_createthreads() On Windows, iocompletionport_createthreads() didn't use isc_thread_create() to create new threads for processing IO, but just a simple CreateThread() call that completely circumvents the isc_trampoline mechanism used to initialize the global isc_tid_v. This led to a segmentation fault in the isc_hp API because '-1' isn't a valid index into the hazard pointer array. This commit changes iocompletionport_createthreads() to use isc_thread_create() instead of CreateThread() to properly initialize isc_tid_v.	2021-05-03 20:21:15 +02:00
Diego Fronza	7729844150	Address comparison of integers with different signedness	2021-05-03 06:54:30 +00:00
Diego Fronza	54aa60eef8	Add malloc attribute to memory allocation functions The malloc attribute allows the compiler to do some optimizations on functions that behave like malloc/calloc, such as assuming that the returned pointer does not alias other pointers.	2021-04-26 11:32:17 -03:00
Diego Fronza	efb9c540cd	Removed unnecessary check (mpctx->items == NULL) There is no possibility for mpctx->items to be NULL at the point where the code was removed, since we enforce that fillcount > 0, if mpctx->items == NULL when isc_mempool_get is called, then we will allocate fillcount more items and add to the mpctx->items list.	2021-04-26 11:32:17 -03:00
Artem Boldariev	62033110b9	Use a constant for timeouts in soft-timeout tests It makes it easier to change the value should the need arise.	2021-04-23 10:01:42 -07:00
Evan Hunt	7f367b0c7f	use the correct handle when calling the read callback when calling isc_nm_read() on an HTTP socket, the read callback was being run with the incorrect handle. this has been corrected.	2021-04-23 10:01:42 -07:00
Evan Hunt	f0d75ee7c3	fix DOH timeout recovery as with TLS, the destruction of a client stream on failed read needs to be conditional: if we reached failed_read_cb() as a result of a timeout on a timer which has subsequently been reset, the stream must not be closed.	2021-04-23 10:01:42 -07:00
Evan Hunt	b258df8562	add HTTP timeout recovery test NOTE: this test currently fails	2021-04-22 12:40:04 -07:00
Evan Hunt	23ec011298	fix TLS timeout recovery the destruction of the socket in tls_failed_read_cb() needs to be conditional; if reached due to a timeout on a timer that has subsequently been reset, the socket must not be destroyed.	2021-04-22 12:08:04 -07:00
Evan Hunt	c90da99180	fix TCP timeout recovery removed an unnecessary assert in the failed_read_cb() function. also renamed to isc__nm_tcp_failed_read_cb() to match the practice in other modules.	2021-04-22 12:08:04 -07:00
Evan Hunt	25ef0547a9	add TCP and TLS timeout recovery tests NOTE: currently these tests fail	2021-04-22 12:08:04 -07:00
Evan Hunt	52f256f9ae	add TCPDNS and TLSDNS timeout recovery tests this is similar in structure to the UDP timeout recovery test. this commit adds a new mechanism to the netmgr test allowing the listen socket to accept incoming TCP connections but never send a response. this forces the client to time out on read.	2021-04-22 12:08:04 -07:00
Evan Hunt	bcf5b2a675	run read callbacks synchronously on timeout when running read callbacks, if the event result is not ISC_R_SUCCESS, the callback is always run asynchronously. this is a problem on timeout, because there's no chance to reset the timer before the socket has already been destroyed. this commit allows read callbacks to run synchronously for both ISC_R_SUCCESS and ISC_R_TIMEDOUT result codes.	2021-04-22 12:08:04 -07:00
Evan Hunt	609975ad20	add a UDP timeout recovery test this test sets up a server socket that listens for UDP connections but never responds. the client will always time out; it should retry five times before giving up.	2021-04-22 12:08:04 -07:00
Evan Hunt	1f41d59a5e	allow client read callback to be assignable allow netmgr client tests to choose the function that will be used as a read callback, without having to write a different connect callback handler.	2021-04-22 12:08:04 -07:00
Ondřej Surý	b540722bc3	Refactor taskmgr to run on top of netmgr This commit changes the taskmgr to run the individual tasks on the netmgr internal workers. While an effort has been put into keeping the taskmgr interface intact, a couple of changes have been made: * The taskmgr has no concept of universal privileged mode - rather the tasks are either privileged or unprivileged (normal). The privileged tasks are run as a first thing when the netmgr is unpaused. There are now four different queues in the netmgr: 1. priority queue - netievents on the priority queue are run even when the taskmgr enters exclusive mode and the netmgr is paused. This is needed to properly start listening on the interfaces, free resources and resume. 2. privileged task queue - only privileged tasks are queued here and this is the first queue that gets processed when the network manager is unpaused using isc_nm_resume(). All netmgr workers need to clean the privileged task queue before they all proceed to normal operation. Both task queues are processed when the workers are finished. 3. task queue - only (traditional) tasks are scheduled here and this queue, along with the privileged task queue, is processed when the netmgr workers are finishing. This is needed to process the task shutdown events. 4. normal queue - this is the queue with netmgr events, e.g. reading, sending, callbacks; pretty much everything is processed here. * isc_taskmgr_create() now requires an initialized netmgr (isc_nm_t) object. * The isc_nm_destroy() function now waits for an indefinite time, but it will print out the active objects when in tracing mode (-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1); the netmgr has been made a little bit more asynchronous and it might take longer to shut down all the active networking connections. * Previously, isc_nm_stoplistening() was a synchronous operation. This has been changed and isc_nm_stoplistening() now just schedules the child sockets to stop listening and exits. This was needed to prevent a deadlock as the (traditional) tasks are now executed on the netmgr threads. * The socket selection logic in isc__nm_udp_send() was flawed, but fortunately, it was broken, so we never hit the problem where we created uvreq_t on a socket from nmhandle_t, but then a different socket could be picked up and then we were trying to run the send callback on a socket that had a different threadid than the one currently running.	2021-04-20 23:22:28 +02:00
Ondřej Surý	16fe0d1f41	Cleanup the public vs private ISCAPI remnants Since all the libraries are internal now, just cleanup the ISCAPI remnants in isc_socket, isc_task and isc_timer APIs. This means, there's one less layer as following changes have been done: * struct isc_socket and struct isc_socketmgr have been removed * struct isc__socket and struct isc__socketmgr have been renamed to struct isc_socket and struct isc_socketmgr * struct isc_task and struct isc_taskmgr have been removed * struct isc__task and struct isc__taskmgr have been renamed to struct isc_task and struct isc_taskmgr * struct isc_timer and struct isc_timermgr have been removed * struct isc__timer and struct isc__timermgr have been renamed to struct isc_timer and struct isc_timermgr * All the associated code that dealt with typing isc_<foo> to isc__<foo> and back has been removed.	2021-04-19 13:18:24 +02:00
Ondřej Surý	3388ef36b3	Cleanup the isc_<*>mgr_createinc() constructors Previously, the taskmgr, timermgr and socketmgr had a constructor variant that would create the mgr on top of an existing appctx. This was no longer true and isc_<*>mgr was just calling isc_<*>mgr_create() directly without any extra code. This commit just cleans up the extra function.	2021-04-19 10:22:56 +02:00
Artem Boldariev	66432dcd65	Handle a situation when SSL shutdown messages were sent and received It fixes a corner case which was causing dig to print annoying messages like: 14-Apr-2021 18:48:37.099 SSL error in BIO: 1 TLS error (errno: 0). Arguments: received_data: (nil), send_data: (nil), finish: false even when all the data was properly processed.	2021-04-15 15:49:36 +03:00
Artem Boldariev	513cdb52ec	TLS: try to close TCP socket descriptor earlier when possible Before this fix underlying TCP sockets could remain opened for longer than it is actually required, causing unit tests to fail with lots of ISC_R_TOOMANYOPENFILES errors. The change also enables graceful SSL shutdown (before that it would happen only in the case when isc_nm_cancelread() were called).	2021-04-15 15:49:36 +03:00
Ondřej Surý	202b1d372d	Merge the tls_test.c into netmgr_test.c and extend the tests suite This commit merges TLS tests into the common Network Manager unit tests suite and extends the unit test framework to include support for additional "ping-pong" style tests where all data could be sent via lesser number of connections (the behaviour of the old test suite). The tests for TCP and TLS were extended to make use of the new mode, as this mode better translates to how the code is used in DoH. Both TLS and TCP tests now share most of the unit tests' code, as they are expected to function similarly from a user's perspective anyway. Additionally to the above, the TLS test suite was extended to include TLS tests using the connections quota facility.	2021-04-15 15:49:36 +03:00
Artem Boldariev	8da12738f1	Use T_CONNECT timeout constant for TCP tests (instead of 1 ms) The netmgr_test would be failing on heavily loaded systems because the connection timeout was set to 1 ms. Use the global constant instead.	2021-04-07 15:37:10 +02:00
Ondřej Surý	72ef5f465d	Refactor async callbacks and fix the double tlsdnsconnect callback The isc_nm_tlsdnsconnect() call could end up with two connect callbacks called when the timeout fired and the TCP connection was aborted, but the TLS handshake was not complete yet. isc__nm_connecttimeout_cb() forgot to clean up sock->tls.pending_req when the connect callback was called with ISC_R_TIMEDOUT, leading to a second callback running later. A new argument has been added to the isc__nm__failed_connect_cb and isc__nm__failed_read_cb functions, to indicate whether the callback needs to run asynchronously or not.	2021-04-07 15:36:59 +02:00
Ondřej Surý	58e75e3ce5	Skip long tls_tests in the CI We already skip most of the recv_send tests in CI because they are too timing-related to be run in overloaded environment. This commit adds a similar change to tls_test before we merge tls_test into netmgr_test.	2021-04-07 15:36:59 +02:00
Artem Boldariev	340235c855	Prevent short TLS tests from hanging in case of errors The tests in tls_test.c could hang in the event of a connect error. This commit allows the tests to bail out when such an error occurs.	2021-04-07 15:36:59 +02:00
Evan Hunt	426c40c96d	rearrange nm_teardown() to check correctness after shutting down if a test failed at the beginning of nm_teardown(), the function would abort before isc_nm_destroy() or isc_tlsctx_free() were reached; we would then abort when nm_setup() was run for the next test case. rearranging the teardown function prevents this problem.	2021-04-07 15:36:59 +02:00
Ondřej Surý	86f4872dd6	isc_nm_connect() always return via callback The isc_nm_connect() functions were refactored to always return the connection status via the connect callback instead of sometimes returning the hard failure directly (for example, when the socket could not be created, or when the network manager was shutting down). This commit changes the connect functions in all the network manager modules, and also makes the necessary refactoring changes in places where the connect functions are called.	2021-04-07 15:36:59 +02:00
Evan Hunt	a70cd026df	move UDP connect retries from dig into isc_nm_udpconnect() dig previously ran isc_nm_udpconnect() three times before giving up, to work around a freebsd bug that caused connect() to return a spurious transient EADDRINUSE. this commit moves the retry code into the network manager itself, so that isc_nm_udpconnect() no longer needs to return a result code.	2021-04-07 15:36:59 +02:00
Ondřej Surý	ca12e25bb0	Use generic functions for reading and timers in TCP The TCP module has been updated to use the generic functions from netmgr.c instead of its own local copies. This brings the module mostly up to par with the TCPDNS and TLSDNS modules.	2021-04-07 15:36:59 +02:00
Ondřej Surý	7df8c7061c	Fix and clean up handling of connect callbacks Several problems were discovered and fixed after the change in the connection timeout in the previous commits: * In TLSDNS, the connection callback was not called at all under some circumstances when the TCP connection had been established, but the TLS handshake hadn't been completed yet. Additional checks have been put in place so that tls_cycle() will end early when the nmsocket is invalidated by the isc__nm_tlsdns_shutdown() call. * In TCP, TCPDNS and TLSDNS, new connections would be established even when the network manager was shutting down. The new call isc__nm_closing() has been added and is used to bail out early even before uv_tcp_connect() is attempted.	2021-04-07 15:36:59 +02:00
Ondřej Surý	5a87c7372c	Make it possible to recover from connect timeouts Similarly to the read timeout, it's now possible to recover from ISC_R_TIMEDOUT event by restarting the timer from the connect callback. The change here also fixes platforms that are missing the socket() options to set the TCP connection timeout, by moving the timeout code into user space. On platforms that support setting the connect timeout via a socket option, the timeout has been hardcoded to 2 minutes (the maximum value of tcp-initial-timeout).	2021-04-07 15:36:58 +02:00
Ondřej Surý	33c00c281f	Make it possible to recover from read timeouts Previously, when the client timed out on read, the client socket would be automatically closed and destroyed when the nmhandle was detached. This commit changes the logic so that it's possible for the callback to recover from the ISC_R_TIMEDOUT event by restarting the timer. This is done by calling isc_nmhandle_settimeout(), which prevents the timeout handling code from destroying the socket; instead, it continues to wait for data. One specific use case for multiple timeouts is serve-stale - the client socket could be created with shorter timeout (as specified with stale-answer-client-timeout), so we can serve the requestor with stale answer, but keep the original query running for a longer time.	2021-04-07 15:36:58 +02:00
Ondřej Surý	0aad979175	Disable netmgr tests only when running under CI The full netmgr test suite is unstable when run in CI due to various timing issues. Previously, we enabled the full test suite only when CI_ENABLE_ALL_TESTS environment variable was set, but that went against original intent of running the full suite when an individual developer would run it locally. This change disables the full test suite only when running in the CI and the CI_ENABLE_ALL_TESTS is not set.	2021-04-07 15:36:58 +02:00
Artem Boldariev	ee10948e2d	Remove dead code which was supposed to handle TLS shutdowns nicely Fixes Coverity issue CID 330954 (See #2612).	2021-04-07 11:21:08 +03:00
Artem Boldariev	e6062210c7	Handle buggy situations with SSL_ERROR_SYSCALL See "BUGS" section at: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html It is mentioned there that when TLS status equals SSL_ERROR_SYSCALL AND errno == 0 it means that underlying transport layer returned EOF prematurely. However, we are managing the transport ourselves, so we should just resume reading from the TCP socket. It seems that this case has been handled properly on modern versions of OpenSSL. That being said, the situation goes in line with the manual: it is briefly mentioned there that SSL_ERROR_SYSCALL might be returned not only in a case of low-level errors (like system call failures).	2021-04-07 11:21:08 +03:00
Artem Boldariev	fa062162a7	Fix crash (regression) in DIG when handling non-DoH responses This commit fixes crash in dig when it encounters non-expected header value. The bug was introduced at some point late in the last DoH development cycle. Also, refactors the relevant code a little bit to ensure better incoming data validation for client-side DoH connections.	2021-04-01 17:31:29 +03:00
Artem Boldariev	11ed7aac5d	TLS code refactoring, fixes and unit-tests This commit fixes numerous stability issues with TLS transport code as well as adds unit tests for it.	2021-04-01 17:31:29 +03:00
Petr Mensik	81eb3396bf	Do not require config.h to use isc/util.h util.h requires ISC_CONSTRUCTOR definition, which depends on config.h inclusion. It does not include it from isc/util.h (or any other header). Using isc/util.h fails hard when isc/util.h is used without including bind's config.h. Move the check to c file, where ISC_CONSTRUCTOR is used. Ensure config.h is included there.	2021-03-26 11:41:22 +01:00
Patrick McLean	ebced74b19	Add isc_time_now_hires function to get current time with high resolution The current isc_time_now uses CLOCK_REALTIME_COARSE which only updates on a timer tick. This clock is generally fine for millisecond accuracy, but on servers with 100hz clocks, this clock is nowhere near accurate enough for microsecond accuracy. This commit adds a new isc_time_now_hires function that uses CLOCK_REALTIME, which gives the current time, though it is somewhat expensive to call. When microsecond accuracy is required, it may be required to use extra resources for higher accuracy.	2021-03-20 11:25:55 -07:00
Ondřej Surý	d016ea745f	Fix compilation with NETMGR_TRACE(_VERBOSE) enabled on non-Linux When NETMGR_TRACE(_VERBOSE) is enabled, the build would fail on some non-Linux non-glibc platforms because: * Use <stdint.h> print macros because uint_fast32_t is not always unsigned long * The header <execinfo.h> is not available on non-glibc, thus commit adds dummy backtrace() and backtrace_symbols_fd() functions for platforms without HAVE_BACKTRACE	2021-03-19 16:25:28 +01:00
Ondřej Surý	42e4e3b843	Improve reliability of the netmgr unit tests The netmgr unit tests were designed to push the system limits to maximum by sending as many queries as possible in the busy loop from multiple threads. This mostly works with UDP, but in the stateful protocol where establishing the connection takes more time, it failed quite often in the CI. On FreeBSD, this happened more often, because the socket() call would fail spuriously making the problem even worse. This commit does several things to improve reliability: * return value of isc_nm_<proto>connect() is always checked and retried when scheduling the connection fails * The busy while loop has been slowed down with usleep(1000); so the netmgr threads could schedule the work and get executed. * The isc_thread_yield() was replaced with usleep(1000); also to allow the other threads to do any work. * Instead of waiting on just one variable, we wait for multiple variables to reach the final value * We are wrapping the netmgr operations (connects, reads, writes, accepts) with reference counting and waiting for all the callbacks to be accounted for. This has two effects: a) the isc_nm_t is always clean of active sockets and handles when destroyed, so it will prevent the spurious INSIST(references == 1) from isc_nm_destroy() b) the unit test now ensures that all the callbacks are always called when they should be called, so any stuck test means that there was a missing callback call and it is always a real bug These changes allow us to remove the workaround that would not run certain tests on systems without port load-balancing.	2021-03-19 16:25:28 +01:00
Ondřej Surý	e4e0e9e3c1	Call isc__nm_tlsdns_failed_read on tls_error to cleanup the socket In tls_error(), we now call isc__nm_tlsdns_failed_read() instead of just stopping timer and reading from the socket. This allows us to properly cleanup any pending operation on the socket.	2021-03-19 15:28:52 +01:00
Ondřej Surý	e4b0730387	Call the isc__nm_failed_connect_cb() early when shutting down When shutting down, calling the isc__nm_failed_connect_cb() was delayed until the connect callback would be called. It turned out that the connect callback might not get called at all when the socket is being shut down. Call the failed_connect_cb() directly in the tlsdns_shutdown() instead of waiting for the connect callback to call it.	2021-03-18 14:31:15 -07:00
Ondřej Surý	73c574e553	Fix typo in processbuffer() - tcpdns vs tlsdns The processbuffer() would call isc__nm_tcpdns_processbuffer() instead of isc__nm_tlsdns_processbuffer() for the isc_nm_tlsdnssocket type of socket.	2021-03-18 21:35:13 +01:00
Ondřej Surý	1d64d4cde8	Fix memory accounting bug in TLSDNS After a partial write the tls.senddata buffer would be rearranged to contain only the data that wasn't sent and the len part would be made shorter, which would lead to attempt to free only part of a socket's tls.senddata buffer.	2021-03-18 18:14:38 +01:00
Ondřej Surý	5cc406a920	Fix dangling uvreq when data is sent from tlsdns_cycle() The tlsdns_cycle() might call uv_write() to write data to the socket, when this happens and the socket is shutdown before the callback completes, the uvreq structure was not freed because the callback would be called with non-zero status code.	2021-03-18 17:58:56 +01:00
Ondřej Surý	36ddefacb4	Change the isc_nm_(get|set)timeouts() to work with milliseconds The RFC7828 specifies the keepalive interval to be 16-bit, specified in units of 100 milliseconds and the configuration options tcp-*-timeouts are following suit. The units of 100 milliseconds are very unintuitive and while we can't change the configuration and presentation format, we should not follow this weird unit in the API. This commit changes the isc_nm_(get|set)timeouts() functions to work with milliseconds and convert the values to milliseconds before passing them to the function, not just internally.	2021-03-18 16:37:57 +01:00
Ondřej Surý	1ef232f93d	Merge the common parts between udp, tcpdns and tlsdns protocol The udp, tcpdns and tlsdns contained a lot of cut&paste code or code that was very similar making the stack harder to maintain as any change to one would have to be copied to the other protocols. In this commit, we merge the common parts into the common functions under isc__nm_<foo> namespace and just keep the little differences based on the socket type.	2021-03-18 16:37:57 +01:00
Ondřej Surý	caa5b6548a	Fix TCPDNS and TLSDNS timers After the TCPDNS refactoring the initial and idle timers were broken and only the tcp-initial-timeout was always applied on the whole TCP connection. This broke any TCP connection that took longer than tcp-initial-timeout, most often this would affect large zone AXFRs. This commit changes the timeout logic in this way: * On TCP connection accept the tcp-initial-timeout is applied and the timer is started * When we are processing and/or sending any DNS message the timer is stopped * When we stop processing all DNS messages, the tcp-idle-timeout is applied and the timer is started again	2021-03-18 16:37:57 +01:00
Mark Andrews	a9f883cbc2	Stop using deprecated calls in lib/isc/tls.c from Rosen Penev @neheb	2021-03-17 20:05:47 +00:00
Artem Boldariev	75363dcb7c	Load full certificate chain from a certificate chain file This commit fixes loading the certificate chain files so that the full chain could be sent to the clients which require that for verification. Before that fix only the topmost certificate would be loaded from the chain and sent to clients, preventing some of them from performing certificate validation (e.g. Windows 10 DoH client).	2021-03-16 11:49:04 +02:00
Mark Andrews	99bd0c346f	cast (char) to (unsigned char) when calling is*()	2021-03-15 14:18:03 +11:00
Artem Boldariev	7a59fb8207	Disable Nagle's algorithm for HTTP/2 connections It is advisable to disable Nagle's algorithm for HTTP/2 connections because multiple HTTP/2 streams could be multiplexed over one transport connection. Thus, delays when delivering small packets could bring down performance for the whole session. HTTP/2 is meant to be used this way.	2021-03-05 18:09:42 +02:00
Artem Boldariev	66d20cf28b	Fix deadlock in isc_nm_tlsconnect() when called from within the context of a network thread, isc_nm_tlsconnect() hangs. it is waiting for the socket's result code to be updated, but that update is supposed to happen asynchronously in the network thread, and if we're already blocking in the network thread, it can never occur. we can kluge around this by setting the socket result code early; this works for most clients (including "dig"), but it causes inconsistent behaviors that manifest as test failures in the DoH unit test. so we kluged around it even more by setting the socket result code early only when running in the network thread. we need a better solution for this problem, but this will do for now.	2021-03-05 18:09:22 +02:00
Artem Boldariev	ca9a15e3bc	DoH: call send callbacks after data was actually sent	2021-03-05 13:29:32 +02:00
Artem Boldariev	71668437d4	Put sane limitations in place to handle bad requests gracefully This commit makes the server-side code polite. It fixes the error handling code on the server side and fixes returning error code in responses (there was a nasty bug which could potentially crash the server). Also, in this commit we limit max size POST request data to 96K, max processed data size in headers to 128K (should be enough to handle any GET requests). If these limits are surpassed, server will terminate the request with RST_STREAM without responding with error code. Otherwise it politely responds with error code. This commit also limits number of concurrent HTTP/2 streams per transport connection on server to 100 (as nghttp2 advises by default). Ideally, these parameters should be configurable both globally and per every HTTP endpoint description in the configuration file, but for now putting sane limits should be enough.	2021-03-05 13:29:32 +02:00
Evan Hunt	88752b1121	refactor outgoing HTTP connection support - style, cleanup, and removal of unnecessary code. - combined isc_nm_http_add_endpoint() and isc_nm_http_add_doh_endpoint() into one function, renamed isc_http_endpoint(). - moved isc_nm_http_connect_send_request() into doh_test.c as a helper function; remove it from the public API. - renamed isc_http2 and isc_nm_http2 types and functions to just isc_http and isc_nm_http, for consistency with other existing names. - shortened a number of long names. - the caller is now responsible for determining the peer address in isc_nm_httpconnect(); this eliminates the need to parse the URI and the dependency on an external resolver. - the caller is also now responsible for creating the SSL client context, for consistency with isc_nm_tlsdnsconnect(). - added setter functions for HTTP/2 ALPN. instead of setting up ALPN in isc_tlsctx_createclient(), we now have a function isc_tlsctx_enable_http2client_alpn() that can be run from isc_nm_httpconnect(). - refactored isc_nm_httprequest() into separate read and send functions. when isc_nm_send() or isc_nm_read() is called on an http socket, it will be stored until a corresponding isc_nm_read() or _send() arrives; when we have both halves of the pair the HTTP request will be initiated. - isc_nm_httprequest() is renamed isc__nm_http_request() for use as an internal helper function by the DoH unit test. (eventually doh_test should be rewritten to use read and send, and this function should be removed.) - added implementations of isc__nm_tls_settimeout() and isc__nm_http_settimeout(). - increased NGHTTP2 header block length for client connections to 128K. - use isc_mem_t for internal memory allocations inside nghttp2, to help track memory leaks. - send "Cache-Control" header in requests and responses. (note: currently we try to bypass HTTP caching proxies, but ideally we should interact with them: https://tools.ietf.org/html/rfc8484#section-5.1)	2021-03-05 13:29:26 +02:00
Ondřej Surý	a55bdb28f9	Assigning uint64_t from buffer might be misaligned in netmgr tests Resolve possible 8-byte unaligned access when assigning the magic value from the received buffer.	2021-03-04 15:02:24 +01:00
Ondřej Surý	d3bb3ae64f	Fix comparison between signed and unsigned integer expressions Simple typecast to size_t should be enough to silence the warning on ARMv7, even though the code is in fact correct, because the readlen is checked for being < 0 in the block before the warning.	2021-03-04 11:21:43 +01:00
Ondřej Surý	a50f5d0cf5	Call isc__initialize()/isc__shutdown() from win32 DllMain Call the libisc isc__initialize() constructor and isc__shutdown() destructor from DllMain instead of having duplicate code between those and DllMain() code.	2021-03-01 14:24:57 +01:00
Ondřej Surý	888bdfc1ff	Add mempool get/put tracking with AddressSanitizer When AddressSanitizer is in use, disable the internal mempool implementation and redirect the isc_mempool_get to isc_mem_get (and similarly for isc_mempool_put). This is the method recommended by the AddressSanitizer authors for tracking allocations and deallocations instead of custom poison/unpoison code (see https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning).	2021-02-26 10:05:42 -08:00
Ondřej Surý	a0181056a8	Change the isc_thread_self() return type to uintptr_t The pthread_self(), thrd_current() or GetCurrentThreadId() could actually be a pointer, so we should rather convert the value into uintptr_t instead of unsigned long.	2021-02-25 16:21:10 +01:00
Ondřej Surý	bea333f7c9	Use globally assigned thread_id in the isc_hp API Convert the isc_hp API to use the globally available isc_tid_v instead of locally defined tid_v. This should solve most of the problems on machines with a large number of cores / CPUs.	2021-02-25 16:21:10 +01:00
Ondřej Surý	cbbecfcc82	Add isc_trampoline API to have simple accounting around threads The current isc_hp API uses internal tid_v variable that gets incremented for each new thread using hazard pointers. This tid_v variable is then used as an index to global shared table with hazard pointers state. Since the tid_v is only incremented and never decremented the table could overflow very quickly if we create set of threads for short period of time, they finish the work and cease to exist. Then we create identical set of threads and so on and so on. This is not a problem for a normal `named` operation as the set of threads is stable, but the problematic place are the unit tests where we test network manager or other APIs (task, timer) that create threads. This commit adds a thin wrapper around any function called from isc_thread_create() that adds unique-but-reusable small digit thread id that can be used as index to f.e. hazard pointer tables. The trampoline wrapper ensures that the thread ids will be reused, so the highest thread_id number doesn't grow indefinitely when threads are created and destroyed and then created again. This fixes the hazard pointer table overflow on machines with many cores. [GL #2396]	2021-02-25 16:21:10 +01:00
Mark Andrews	3ac53daa06	Address unbalanced lock/unlock Also address race between reading and testing mpctx->allocated and incrementing mpctx->allocated.	2021-02-25 13:08:07 +11:00
Ondřej Surý	c5887c4312	Disable safe-guard assertion in DLL_THREAD_ATTACH/DLL_THREAD_DETACH The BIND 9 libraries on Windows define DllMain() optional entry point into a dynamic-link library (DLL). When the system starts or terminates a process or thread, it calls the entry-point function for each loaded DLL using the first thread of the process. When the DLL is being loaded into the virtual address space of the current process as a result of the process starting up, we make a call to DisableThreadLibraryCalls() which should disable the DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for the specified dynamic-link library (DLL). This seems not to be the case because we never check the return value of the DisableThreadLibraryCalls() call, and it could in fact fail. The DisableThreadLibraryCalls() function fails if the DLL specified by hModule has active static thread local storage, or if hModule is an invalid module handle. In this commit, we remove the safe-guard assertion put in place for the DLL_THREAD_ATTACH and DLL_THREAD_DETACH events and we just ignore them. BIND 9 doesn't create/destroy enough threads for it actually to make any difference, and in fact we do use static thread local storage in the code.	2021-02-24 08:31:42 +01:00
Ondřej Surý	f53e7ed12c	Include lib/isc/tls_p.h in release tarballs The addition of lib/isc/tls_p.h to the source tree was not accounted for in the relevant variable in lib/isc/Makefile.am and thus the former file is not being included in release tarballs prepared using "make dist". Fix by tweaking the libisc_la_SOURCES list in lib/isc/Makefile.am accordingly.	2021-02-19 13:25:18 +01:00
Ondřej Surý	494d0da522	Use library constructor/destructor to initialize OpenSSL Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use gcc/clang attributes on POSIX and DllMain on Windows to initialize and shutdown OpenSSL library. This resolves the issue when isc_nm_create() / isc_nm_destroy() was called multiple times and it would call OpenSSL library destructors from isc_nm_destroy(). At the same time, since we now have introduced the ctor/dtor for libisc, this commit moves the isc_mem API initialization (the list of the contexts) and changes the isc_mem_checkdestroyed() to schedule the checking of memory context on library unload instead of executing the code immediately.	2021-02-18 19:33:54 +01:00
Ondřej Surý	4bde4f050b	Disable calling DllMain() on thread creation/destruction Disables the DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for the specified dynamic-link library (DLL). This can reduce the size of the working set for some applications.	2021-02-18 19:33:54 +01:00
Ondřej Surý	f225462055	Fix the invalid condition variable Although harmless, the memmove() in tlsdns and tcpdns was guarded by a current message length variable that was always bigger than 0 instead of correct current buffer length remainder variable.	2021-02-18 19:33:54 +01:00
Ondřej Surý	4775e9f256	Move most of the OpenSSL initialization to isc_tls Since we now require both libcrypto and libssl to be initialized for netmgr, we move all the OpenSSL initialization code except the engine initialization to isc_tls API. The isc_tls_initialize() and isc_tls_destroy() have been made idempotent, so they can be called multiple times. However, once isc_tls_destroy() has been called, isc_tls_initialize() cannot be called again.	2021-02-18 19:33:54 +01:00
Ondřej Surý	ff47b47f1a	Remove overrun checking code from memory allocator The ISC_MEM_CHECKOVERRUN would add canary byte at the end of every allocation and check whether the canary byte hasn't been changed at the free time. The AddressSanitizer and valgrind memory checks surpass simple checks like this, so there's no need to actually keep the code inside the allocator.	2021-02-18 19:33:54 +01:00
Ondřej Surý	549e5b693a	Modify the way we benchmark mem_{get,put} Previously, the mem_{get,put} benchmark would pass the allocation size as thread_create argument. This has been now changed, so the allocation size is stored and decremented (divided) in atomic variable and the thread create routine is given a memory context. This will allow to write tests where each thread is given different memory context and do the same for mempool benchmarking.	2021-02-18 19:33:54 +01:00
Ondřej Surý	f34f943b16	Disable memory debugging features in non-developer build The two memory debugging features: ISC_MEM_DEFAULTFILL (ISC_MEMFLAG_FILL) and ISC_MEM_TRACKLINES were always enabled in all builds and the former was only disabled in `named`. This commit disables those two features in non-developer build to make the memory allocator significantly faster.	2021-02-18 19:33:54 +01:00
Ondřej Surý	c9fe12443f	Make the mempool names unconditional The named memory pools were default and always compiled-in. Remove the extra complexity by removing the #define and #ifdefs around the code.	2021-02-18 19:33:54 +01:00
Ondřej Surý	b09106e93a	Make the memory and mempool counters to be stdatomic types This is yet another step into unlocking some parts of the memory contexts. All the regularly updated variables have been turned into atomic types, so we can later remove the locks when updating various counters. Also unlock as much code as possible without breaking anything.	2021-02-18 19:33:51 +01:00
Ondřej Surý	0f44139145	Bump the maximum number of hazard pointers in tests On 24-core machine, the tests would crash because we would run out of the hazard pointers. We now adjust the number of hazard pointers to be in the <128,256> interval based on the number of available cores. Note: This is just a band-aid and needs a proper fix.	2021-02-18 19:32:55 +01:00
Ondřej Surý	7de846977b	Remove the extra level of indirection via isc_memmethods_t Previously, the applications using libisc would be able to override the internal memory methods with own implementation. This was no longer possible, but the extra level of indirection was not removed. This commit removes the extra level of indirection for the memory methods and the default_memalloc() and default_memfree().	2021-02-18 19:32:55 +01:00
Ondřej Surý	55ace5d3aa	Remove the internal memory allocator The internal memory allocator had an extra code to keep a list of blocks for small size allocation. This would help to reduce the interactions with the system malloc as the memory would be already allocated from the system, but there's an extra cost associated with that - all the allocations/deallocations must be locked, effectively eliminating any optimizations in the system allocator targeted at multi-threaded applications. While the isc_mem API is still using locks pretty heavily, this is a first step into reducing the memory allocation/deallocation contention.	2021-02-18 19:32:02 +01:00
Ondřej Surý	66eefac78c	Rollback setting IP_DONTFRAG option on the UDP sockets In DNS Flag Day 2020, the development branch started setting the IP_DONTFRAG option on the UDP sockets. It turned out, that this code was incomplete leading to dropping the outgoing UDP packets. Henceforth this commit rolls back this setting until we have a proper fix that would send back empty response with TC flag set.	2021-02-17 08:09:56 +01:00
Michal Nowak	c286341703	Use SKIPPED_TEST_EXIT_CODE consistently Commit `fa505bfb0e` omitted two unit tests while introducing the SKIPPED_TEST_EXIT_CODE preprocessor macro. Fix the outliers to make use of SKIPPED_TEST_EXIT_CODE consistent across all unit tests. Also make sure lib/dns/tests/dnstap_test returns an exit code that indicates a skipped test when dnstap is not enabled.	2021-02-16 13:41:50 +01:00
Ondřej Surý	d1448a4c2a	Move the <isc/readline.h> header to bin/dig/readline.h The <isc/readline.h> header provided a compatibility shim to use when other non-GNU readline libraries are in use. The two places where readline library is being used are nslookup and nsupdate, so the header file has been moved to bin/dig directory and it's directly included from bin/nsupdate. This also conceals any readline headers exposed from the libisc headers.	2021-02-16 01:04:46 +00:00
Michal Nowak	fa505bfb0e	Record skipped unit test as skipped in Automake framework	2021-02-15 11:18:03 +01:00
Ondřej Surý	1cc24a2c8b	Unit-test fixes and manual page updates for DoH configuration This commit contains fixes to unit tests to make them work well on various platforms (in particular ones shipping old versions of OpenSSL) and for different configurations. It also updates the generated manpage to include DoH configuration options.	2021-02-03 12:06:17 +01:00
Artem Boldariev	08da09bc76	Initial support for DNS-over-HTTP(S) This commit completes the support for DNS-over-HTTP(S) built on top of nghttp2 and plugs it into the BIND. Support for both GET and POST requests is present, as required by RFC8484. Both encrypted (via TLS) and unencrypted HTTP/2 connections are supported. The latter are mostly there for debugging/troubleshooting purposes and for the means of encryption offloading to third-party software (as might be desirable in some environments to simplify TLS certificates management).	2021-02-03 12:06:17 +01:00
Witold Kręcicki	7a96081360	nghttp2-based HTTP layer in netmgr This commit includes work-in-progress implementation of DNS-over-HTTP(S). Server-side code remains mostly untested, and there is only support for POST requests.	2021-02-03 12:06:17 +01:00
Witold Kręcicki	cdf9d21731	Add isc_mem_strndup() function This commit adds an implementation of strndup() function which allocates memory from the supplied isc_mem_t memory context.	2021-02-03 12:06:17 +01:00
Artem Boldariev	6b9a31989c	Resurrect old TLS code This commit resurrects the old TLS code from `8f73c70d23`. It also includes numerous stability fixes and support for isc_nm_cancelread() for the TLS layer. The code was resurrected to be used for DoH.	2021-02-03 12:06:17 +01:00
Mark Andrews	3b11bacbb7	Cleanup redundant isc_rwlock_init() result checks	2021-02-03 12:22:33 +11:00
Ondřej Surý	c605d75ea5	Use -release instead of -version-info for internal library SONAMEs The BIND 9 libraries are considered to be internal only and hence the API and ABI changes a lot. Keeping track of the API/ABI changes takes time and it's a complicated matter as the safest way to make everything stable would be to bump any library in the dependency chain as in theory if libns links with libdns, and a binary links with both, and we bump the libdns SOVERSION, but not the libns SOVERSION, the old libns might be loaded by binary pulling old libdns together with new libdns loaded by the binary. The situation gets even more complicated with loading the plugins that have been compiled with few versions old BIND 9 libraries and then dynamically loaded into the named. We are picking the safest option possible and usable for internal libraries - instead of using -version-info that has only a weak link to BIND 9 version number, we are using -release libtool option that will embed the corresponding BIND 9 version number into the library name. That means that instead of libisc.so.1701 (as an example) the library will now be named libisc-9.17.10.so.	2021-01-25 14:19:53 +01:00
Ondřej Surý	e493e04c0f	Refactor TLSDNS module to work with libuv/ssl directly * Following the example set in `634bdfb16d`, the tlsdns netmgr module now uses libuv and SSL primitives directly, rather than opening a TLS socket which opens a TCP socket, as the previous model was difficult to debug. Closes #2335. * Remove the netmgr tls layer (we will have to re-add it for DoH) * Add isc_tls API to wrap the OpenSSL SSL_CTX object into the libisc library; move the OpenSSL initialization/deinitialization needed for OpenSSL 1.0.x from dstapi to isc_tls_{initialize,destroy}() * Add a couple of new shims needed for OpenSSL 1.0.x * When LibreSSL is used, require at least version 2.7.0, which has the best OpenSSL 1.1.x compatibility and auto init/deinit * Enforce OpenSSL 1.1.x usage on Windows * Added a TLSDNS unit test and implemented a simple TLSDNS echo server and client.	2021-01-25 09:19:22 +01:00
Michał Kępień	347d666b0f	Update library API versions	2021-01-21 08:57:22 +01:00
Mark Andrews	698d9285d4	Only pick CPUs that are part of the existing CPU affinity set when assigning a thread to a CPU.	2020-12-21 15:09:57 +01:00
Michał Kępień	2c44266a5a	Update library API versions	2020-12-16 22:05:50 +01:00
Ondřej Surý	7ba18870dc	Reformat sources using clang-format-11	2020-12-08 18:36:23 +01:00
Ondřej Surý	5caf33feda	Fix HAVE_SO_REUSEPORT_LB macro name definition A typo in macro definition caused the load-balanced sockets to be disabled even on platforms with existing support for load-balanced sockets.	2020-12-04 14:45:22 +01:00
Ondřej Surý	87c5867202	Use sock->nchildren instead of mgr->nworkers when initializing NM On Windows, we were limiting the number of listening children to just 1, but we were then iterating on mgr->nworkers. That led to scheduling more async_*listen() calls than were actually allocated and to out-of-bounds read-write operations on the heap.	2020-12-03 18:03:25 +01:00
Ondřej Surý	151852f428	Fix datarace when UDP/TCP connect fails and we are in nmthread When we were in nmthread, the isc__nm_async_<proto>connect() function executes in the same thread as the isc__nm_<proto>connect() and on a failure, it would block indefinitely because the failure branch was setting sock->active to false before the condition around the wait had a chance to skip the WAIT(). This also fixes the zero system test being stuck on FreeBSD 11, so we re-enable the test in the commit.	2020-12-03 13:56:34 +01:00
Ondřej Surý	4adeaab73d	Add FreeBSD connection timeout socket option On FreeBSD, the option to configure connection timeout is called TCP_KEEPINIT, use it to configure the connection timeout there. This also fixes the dangling socket problems in the unit test, so re-enable them.	2020-12-03 09:23:24 +01:00
Ondřej Surý	1d066e4bc5	Distribute queries among threads even on platforms without lb sockets On platforms without load-balancing sockets, all the queries would be handled by a single thread. Currently, the support for load-balanced sockets is present in Linux with SO_REUSEPORT and FreeBSD 12 with SO_REUSEPORT_LB. This commit adds a workaround for such platforms that: 1. sets up a single shared listening socket for all listening nmthreads for the UDP, TCP and TCPDNS netmgr transports 2. calls uv_udp_bind/uv_tcp_bind on the underlying socket just once and, for the rest of the nmthreads, only copies the internal libuv flags (should be just UV_HANDLE_BOUND and optionally UV_HANDLE_IPV6) 3. starts reading on the UDP socket or listening on the TCP socket. The load distribution among the nmthreads is uneven, but it's still better than utilizing just one thread for processing all the incoming queries	2020-12-03 09:20:33 +01:00
Ondřej Surý	94afea9325	Don't use stack allocated buffer for uv_write() On FreeBSD, the stack is destroyed more aggressively than on Linux and that revealed a bug where we were allocating the 16-bit len for the TCPDNS message on the stack and the buffer got garbled before the uv_write() send callback was executed. Now, the len is part of the uvreq, so we can safely pass it to the uv_write() as the req gets destroyed after the sendcb is executed.	2020-12-03 08:58:16 +01:00
Michał Kępień	88f96faba8	Make netmgr initialize and cleanup Winsock itself On Windows, WSAStartup() needs to be called to initialize Winsock before any sockets are created or else socket() calls will return error code 10093 (WSANOTINITIALISED). Since BIND's Network Manager is intended to work as a reusable networking library, it should take care of calling WSAStartup() - and its cleanup counterpart, WSACleanup() - itself rather than relying on external code to do it. Add the necessary WSAStartup() and WSACleanup() calls to isc_nm_start() and isc_nm_destroy(), respectively.	2020-12-02 22:36:23 +01:00
Michał Kępień	dc2e1dea86	Extend log message for unexpected socket() errors Make sure the error code is included in the message logged for unexpected socket creation errors in order to facilitate troubleshooting on Windows.	2020-12-02 22:36:23 +01:00
Michal Nowak	8499825525	Add uv_wrap.h to libisctest_la_SOURCES uv_wrap.h is included in tcp_test.c and udp_test.c and therefore should be listed in lib/isc/tests/Makefile.am, otherwise unit test run from distribution tarball fails to compile: tcp_test.c:37:10: fatal error: uv_wrap.h: No such file or directory #include "uv_wrap.h" ^~~~~~~~~~~ udp_test.c:37:10: fatal error: uv_wrap.h: No such file or directory #include "uv_wrap.h" ^~~~~~~~~~~	2020-12-02 16:08:18 +01:00
Ondřej Surý	2e1dd56d0b	Fix the data race in accessing the isc_nm_t timers The following TSAN report about accessing the mgr timers (mgr->init, mgr->idle, mgr->keepalive and mgr->advertised) has been fixed in this commit: ================== WARNING: ThreadSanitizer: data race (pid=2746) Read of size 4 at 0x7b440008a948 by thread T18: #0 isc__nm_tcpdns_read /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 (libisc.so.1706+0x2ba0f) #1 isc_nm_read /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1679:3 (libisc.so.1706+0x22258) #2 tcpdns_connect_connect_cb /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:363:2 (tcpdns_test+0x4bc5fb) #3 isc__nm_async_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1816:2 (libisc.so.1706+0x228c9) #4 isc__nm_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1791:3 (libisc.so.1706+0x22713) #5 tcpdns_connect_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:343:2 (libisc.so.1706+0x2d89d) #6 uv__stream_connect /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1381:5 (libuv.so.1+0x27c18) #7 uv__stream_io /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1298:5 (libuv.so.1+0x25977) #8 uv__io_poll /home/ondrej/Projects/tsan/libuv/src/unix/linux-core.c:462:11 (libuv.so.1+0x2e795) #9 uv_run /home/ondrej/Projects/tsan/libuv/src/unix/core.c:385:5 (libuv.so.1+0x158ec) #10 nm_thread /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:530:11 (libisc.so.1706+0x1c94a) Previous write of size 4 at 0x7b440008a948 by main thread: #0 isc_nm_settimeouts /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:490:12 (libisc.so.1706+0x1dda5) #1 tcpdns_recv_two /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:601:2 (tcpdns_test+0x4bad0e) #2 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70be) #3 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a) Location is heap block of size 281 at 0x7b440008a840 allocated by main thread: #0 malloc <null> 
(tcpdns_test+0x42864b) #1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:713:8 (libisc.so.1706+0x6d261) #2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:622:8 (libisc.so.1706+0x69b9c) #3 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1044:9 (libisc.so.1706+0x6d379) #4 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2432:10 (libisc.so.1706+0x6889e) #5 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:203:8 (libisc.so.1706+0x1c219) #6 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4) #7 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd) #8 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a) Thread T18 'isc-net-0000' (tid=3513, running) created by main thread at: #0 pthread_create <null> (tcpdns_test+0x429e7b) #1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:73:8 (libisc.so.1706+0x8476a) #2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:271:3 (libisc.so.1706+0x1c66a) #3 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4) #4 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd) #5 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a) SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 in isc__nm_tcpdns_read ================== ThreadSanitizer: reported 1 warnings	2020-12-02 10:14:31 +01:00
Ondřej Surý	d6d2fbe0e9	Avoid netievent allocations when the callbacks can be called directly After turning the users callbacks to be asynchronous, there was a visible performance drop. This commit prevents the unnecessary allocations while keeping the code paths same for both asynchronous and synchronous calls. The same change was done to the isc__nm_udp_{read,send} as those two functions are in the hot path.	2020-12-02 09:45:05 +01:00
Ondřej Surý	3e5ee16eb6	Disable the new netmgr tests on non-Linux platforms The new netmgr tests are not-yet fine-tuned for non-Linux platforms. Disable them now, so we can move forward and fix the tests of *BSD in the next iteration. This commit will get reverted when we add support for netmgr multi-threading.	2020-12-01 17:24:15 +01:00
Ondřej Surý	0ba697fe8c	The cmocka.h header MUST be included before isc/util.h gets included The isc/util.h header redefines the DbC checks (REQUIRE, INSIST, ...) to be cmocka "fake" assertions. However, that means that cmocka.h needs to be included after UNIT_TESTING is defined but before isc/util.h is included. Because isc/util.h is included in most of the project headers, this means that the sequence MUST be: #define UNIT_TESTING #include <cmocka.h> #include <isc/_anything_.h> See !2204 for other header requirements for including cmocka.h.	2020-12-01 16:47:25 +01:00
Ondřej Surý	634bdfb16d	Refactor netmgr and add more unit tests This is part of the work that intends to make the netmgr stable, testable, maintainable and tested. It contains numerous changes to the netmgr code and unfortunately, it was not possible to split this into smaller chunks as the work here needs to be committed as a complete work. NOTE: There's quite a lot of duplicated code between udp.c, tcp.c and tcpdns.c and it should be a subject to refactoring in the future. The changes that are included in this commit are listed here (extensively, but not exclusively): * The netmgr_test unit test was split into individual tests (udp_test, tcp_test, tcpdns_test and newly added tcp_quota_test) * The udp_test and tcp_test have been extended to allow programmatic failures from the libuv API. Unfortunately, we can't use cmocka mock() and will_return(), so we emulate the behaviour with #define and including the netmgr/{udp,tcp}.c source file directly. * The netievents that we put on the nm queue have a variable number of members; out of these, the isc_nmsocket_t and isc_nmhandle_t always need to be attached before enqueueing the netievent_<foo> and detached after we have called the isc_nm_async_<foo> to ensure that the socket (handle) doesn't disappear between scheduling the event and actually executing the event. * Cancelling the in-flight TCP connection using libuv requires calling uv_close() on the original uv_tcp_t handle, which just breaks too many assumptions we have in the netmgr code. Instead of using uv_timer for TCP connection timeouts, we use a platform-specific socket option. * Fix the synchronization between {nm,async}_{listentcp,tcpconnect} When isc_nm_listentcp() or isc_nm_tcpconnect() was called, it was waiting for the socket to either end up with an error (that path was fine) or to be listening or connected, using a condition variable and mutex. Several things could happen: 0. everything is ok 1. 
the waiting thread would miss the SIGNAL() - because the enqueued event would be processed faster than we could start WAIT()ing. In case the operation would end up with error, it would be ok, as the error variable would be unchanged. 2. the waiting thread would miss sock->{connected,listening} being `true`, as it would be set back to `false` in the tcp_{listen,connect}close_cb() because the connection would be so short-lived that the socket would be closed before we could even start WAIT()ing * The tcpdns has been converted to using libuv directly. Previously, the tcpdns protocol used the tcp protocol from netmgr; this proved to be very complicated to understand, fix and make changes to. The new tcpdns protocol is modeled in a similar way to the tcp netmgr protocol. Closes: #2194, #2283, #2318, #2266, #2034, #1920 * The tcp and tcpdns are no longer using isc_uv_import/isc_uv_export to pass accepted TCP sockets between netthreads, but instead (similar to UDP) use a per-netthread uv_loop listener. This greatly reduces the complexity as the socket is always run in the associated nm and uv loops, and we are also not touching the libuv internals. There's an unfortunate side effect though: the new code requires support for load-balanced sockets from the operating system for both UDP and TCP (see #2137). If the operating system doesn't support the load-balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB on FreeBSD 12+), the number of netthreads is limited to 1. * The netmgr now has two debugging #ifdefs: 1. The already existing NETMGR_TRACE prints any dangling nmsockets and nmhandles before triggering an assertion failure. This option would reduce performance when enabled, but in theory, it could be enabled on low-performance systems. 2. A new NETMGR_TRACE_VERBOSE option has been added that enables extensive netmgr logging that allows the software engineer to precisely track any attach/detach operations on the nmsockets and nmhandles. 
This is not suitable for any kind of production machine, only for debugging. * The tlsdns netmgr protocol has been split from the tcpdns and it still uses the old method of stacking the netmgr boxes on top of each other. We will have to refactor the tlsdns netmgr protocol to use the same approach - build the stack using only libuv and openssl. * Limit but not assert the tcp buffer size in tcp_alloc_cb Closes: #2061	2020-12-01 16:47:07 +01:00
Michał Kępień	f440600126	Use proper cmocka macros for pointer checks Make sure pointer checks in unit tests use cmocka assertion macros dedicated for use with pointers instead of those dedicated for use with integers or booleans.	2020-11-26 13:10:40 +01:00
Michał Kępień	2bb0a5dcdb	Update library API versions	2020-11-26 12:12:17 +01:00
Michał Kępień	ea54a932d2	Convert add_quota() to a function cppcheck 2.2 reports the following false positive: lib/isc/tests/quota_test.c:71:21: error: Array 'quotas[101]' accessed at index 110, which is out of bounds. [arrayIndexOutOfBounds] isc_quota_t *quotas[110]; ^ The above is not even an array access, so this report is obviously caused by a cppcheck bug. Yet, it seems to be triggered by the presence of the add_quota() macro, which should really be a function. Convert the add_quota() macro to a function in order to make the code cleaner and to prevent the above cppcheck 2.2 false positive from being triggered.	2020-11-25 12:45:47 +01:00
Mark Andrews	38d6f68de4	add dns_dns64_findprefix	2020-11-25 08:25:29 +11:00
Ondřej Surý	a49d88568f	Turn all the callbacks to be always asynchronous When calling the high-level netmgr functions, the callback would sometimes be called synchronously if we caught the failure directly, or asynchronously if it happened later. The synchronous call to the callback could create deadlocks as the caller would not expect the failed callback to be executed directly.	2020-11-11 22:15:40 +01:00
Michal Nowak	9088052225	Drop unused headers	2020-11-11 10:08:12 +01:00
Ondřej Surý	fa424225af	netmgr: Add additional safeguards to netmgr/tls.c This commit adds a couple of additional safeguards against running sends/reads on inactive sockets. The changes were modeled after the changes we made to netmgr/tcpdns.c	2020-11-10 14:17:20 +01:00
Witold Kręcicki	3c00fb71db	isc_nm_tls_create_server_ctx can create ephemeral certs In-memory ephemeral certs creation for easy DoT/DoH deployment.	2020-11-10 14:17:04 +01:00
Witold Kręcicki	38b78f59a0	Add DoT support to bind Parse the configuration of tls objects into SSL_CTX* objects. Listen on DoT if the 'tls' option is set in the listen-on directive. Use DoT/DoH ports for DoT/DoH.	2020-11-10 14:16:55 +01:00
Evan Hunt	8886569e9d	report peer address in TLS mode, and specify protocol - peer address was not being reported correctly by "dig +tls" - the protocol used is now reported in the dig output: UDP, TCP, or TLS.	2020-11-10 14:16:41 +01:00
Witold Kręcicki	b2ee0e9dc3	netmgr: server-side TLS support Add server-side TLS support to netmgr - that includes moving some of the isc_nm_ functions from tcp.c to a wrapper in netmgr.c calling a proper tcp or tls function, and a new isc_nm_listentls() function. Add DoT support to tcpdns - isc_nm_listentlsdns().	2020-11-10 14:16:27 +01:00
Evan Hunt	e011521ef1	address some possible shutdown races in xfrin there were two failures observed during testing, both occurring when 'rndc halt' was run rather than 'rndc stop' - the latter dumps zone contents to disk and presumably introduced enough delay to prevent the races: - a failure when the zone was shut down and called dns_xfrin_detach() before the xfrin had finished connecting; the connect timeout terminated without detaching its handle - a failure when the tcpdns socket timer fired after the outerhandle had already been cleared. this commit incidentally addresses a failure observed in mutexatomic due to a variable having been initialized incorrectly.	2020-11-09 12:33:37 -08:00
Ondřej Surý	127ba7e930	Add libssl libraries to Windows build This commit extends the perl Configure script to also check for libssl in addition to libcrypto and change the vcxproj source files to link with both libcrypto and libssl.	2020-11-09 16:00:28 +01:00
Ondřej Surý	8af7f81d6c	netmgr: Don't crash if socket() returns an error in udpconnect socket() call can return an error - e.g. EMFILE, so we need to handle this nicely and not crash. Additionally wrap the socket() call inside a platform independent helper function as the Socket data type on Windows is unsigned integer: > This means, for example, that checking for errors when the socket and > accept functions return should not be done by comparing the return > value with –1, or seeing if the value is negative (both common and > legal approaches in UNIX). Instead, an application should use the > manifest constant INVALID_SOCKET as defined in the Winsock2.h header > file.	2020-11-08 13:36:12 -08:00
Ondřej Surý	050258bda4	netmgr: Always load the result from async socket Because we use result earlier for setting the load balancing on the socket, we could be left with an ISC_R_NOTIMPLEMENTED value stored in the variable, and when the UDP connection succeeded, we would erroneously return this value instead of ISC_R_SUCCESS.	2020-11-07 21:12:08 +01:00
Evan Hunt	ea2b04c361	dig: use new netmgr timeout mechanism use isc_nmhandle_settimeout() to set read/recv timeouts, and get rid of connect_timeout() and related functions in dighost.c.	2020-11-07 20:49:53 +01:00
Evan Hunt	4be63c5b00	add isc_nmhandle_settimeout() function this function sets the read timeout for the socket associated with a netmgr handle and, if the timer is running, resets it. for TCPDNS sockets it also sets the read timeout and resets the timer on the outer TCP socket.	2020-11-07 20:49:53 +01:00
Ondřej Surý	2191d2bf44	fix nmhandle attach/detach errors in tcpdnsconnect_cb() we need to attach to the statichandle when connecting TCPDNS sockets, same as with UDP.	2020-11-07 20:49:53 +01:00
Mark Andrews	0073cb7356	Incorrect result code passed to failed_connect_cb *** CID 312970: Incorrect expression (COPY_PASTE_ERROR) /lib/isc/netmgr/tcp.c: 282 in tcp_connect_cb() 276 } 277 278 isc__nm_incstats(sock->mgr, sock->statsindex[STATID_CONNECT]); 279 r = uv_tcp_getpeername(&sock->uv_handle.tcp, (struct sockaddr *)&ss, 280 &(int){ sizeof(ss) }); 281 if (r != 0) { >>> CID 312970: Incorrect expression (COPY_PASTE_ERROR) >>> "status" in "isc___nm_uverr2result(status, true, "netmgr/tcp.c", 282U)" looks like a copy-paste error. 282 failed_connect_cb(sock, req, isc__nm_uverr2result(status)); 283 return; 284 } 285 286 atomic_store(&sock->connecting, false); 287	2020-11-04 21:58:05 +00:00
Ondřej Surý	c14c1fdd2c	Put up additional safe guards to not use inactive/closed tcpdns socket When we are operating on the tcpdns socket, we need to double check whether the socket or its outerhandle or its listener or its mgr is still active and when not, bail out early.	2020-11-02 20:58:00 +01:00
Witold Kręcicki	3ab3d90de0	Fix improper closed connection handling in tcpdns. If dnslisten_readcb gets a read callback it needs to verify that the outer socket wasn't closed in the meantime, and issue a CANCELED callback if it was.	2020-11-02 15:10:28 +01:00
Evan Hunt	8fcad58ea6	check return value from uv_tcp_getpeername() when connecting if we can't determine the peer, the connect should fail.	2020-10-30 11:11:54 +01:00
Ondřej Surý	14f54d13dc	add a netmgr unit test tests of UDP and TCP cases including: - sending and receiving - closure of sockets without reading or sending - closure of sockets at various points while sending and receiving - since the test is multithreaded, cmocka now aborts tests on the first failure, so that failures in subthreads are caught and reported correctly.	2020-10-30 11:11:54 +01:00
Evan Hunt	26a3a22895	set REUSEPORT and REUSEADDR on TCP sockets if needed When binding a TCP socket, if bind() fails with EADDRINUSE, try again with REUSEPORT/REUSEADDR (or the equivalent options).	2020-10-30 11:11:54 +01:00
Ondřej Surý	ed3ab63f74	Fix more races between connect and shutdown There were more races that could happen while connecting to a socket while closing or shutting down the same socket. This commit introduces a .closing flag to guard the socket from being closed twice.	2020-10-30 11:11:54 +01:00
Ondřej Surý	6cfadf9db0	Fix a race between isc__nm_async_shutdown() and new sends/reads There was a data race where a new event could be scheduled after isc__nm_async_shutdown() had cleaned up all the dangling UDP/TCP sockets from the loop.	2020-10-30 11:11:54 +01:00
Ondřej Surý	5fcd52209a	Refactor udp_recv_cb() - more logical code flow. - propagate errors back to the caller. - add a 'reading' flag and call the callback from failed_read_cb() only when the socket was actively reading.	2020-10-30 11:11:54 +01:00
Ondřej Surý	cdccac4993	Fix netmgr read/connect timeout issues - don't bother closing sockets that are already closing. - UDP read timeout timer was not stopped after reading. - improve handling of TCP connection failures.	2020-10-30 11:11:54 +01:00
Ondřej Surý	7a6056bc8f	Add isc__nm_udp_shutdown() function This function will be called during isc_nm_closedown() to ensure that all UDP sockets are closed and detached.	2020-10-30 11:11:54 +01:00
Evan Hunt	5dcdc00b93	add netmgr functions to support outgoing DNS queries - isc_nm_tcpdnsconnect() sets up an outgoing TCP DNS connection. - isc_nm_tcpconnect(), _udpconnect() and _tcpdnsconnect() now take a timeout argument to ensure connections time out and are correctly cleaned up on failure. - isc_nm_read() now supports UDP; it reads a single datagram and then stops until the next time it's called. - isc_nm_cancelread() now runs asynchronously to prevent assertion failure if reading is interrupted by a non-network thread (e.g. a timeout). - isc_nm_cancelread() can now apply to UDP sockets. - added shim code to support UDP connection in versions of libuv prior to 1.27, when uv_udp_connect() was added. all these functions will be used to support outgoing queries in dig, xfrin, dispatch, etc.	2020-10-30 11:11:54 +01:00
Witold Kręcicki	c41ce8e0c9	Properly handle outer TCP connection closed in TCPDNS. If the connection is closed while we're processing the request we might access TCPDNS outerhandle which is already reset. Check for this condition and call the callback with ISC_R_CANCELED result.	2020-10-29 12:32:25 +01:00
Ondřej Surý	37b9511ce1	Use libuv's shared library handling capabilities While libltdl is a feature-rich library, BIND 9 code only uses its basic capabilities, which are also provided by libuv and which BIND 9 already uses for other purposes. As libuv's cross-platform shared library handling interface is modeled after the POSIX dlopen() interface, converting code using the latter to the former is simple. Replace libltdl function calls with their libuv counterparts, refactoring the code as necessary. Remove all use of libltdl from the BIND 9 source tree.	2020-10-28 15:48:58 +01:00
Ondřej Surý	8797e5efd5	Fix the data race when read-writing sock->active by using cmpxchg	2020-10-22 11:46:58 -07:00
Ondřej Surý	5ef71c420f	Ignore and don't log ISC_R_NOTCONNECTED from uv_accept() When client disconnects before the connection can be accepted, the named would log a spurious log message: error: Accepting TCP connection failed: socket is not connected We now ignore the ISC_R_NOTCONNECTED result code and log only other errors	2020-10-22 11:37:16 -07:00
Ondřej Surý	f7c82e406e	Fix the isc_nm_closedown() to actually close the pending connections 1. The isc__nm_tcp_send() and isc__nm_tcp_read() were not checking whether the socket was still alive and were scheduling reads/sends on a closed socket. 2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been changed to always return the error conditions via the callbacks, so they always succeed. This applies to all protocols (UDP, TCP and TCPDNS).	2020-10-22 11:37:16 -07:00
Ondřej Surý	6af08d1ca6	Fix the way tcp_send_direct() is used There were two problems with how tcp_send_direct() was used: 1. The tcp_send_direct() can return ISC_R_CANCELED (or translated error from uv_tcp_send()), but the isc__nm_async_tcpsend() wasn't checking the error code and was not releasing the uvreq in case of an error. 2. In isc__nm_tcp_send(), when the TCP send is already in the right netthread, it uses tcp_send_direct() to send the TCP packet right away. When that happened, the uvreq was not freed, and the error code was returned to the caller. We need to return ISC_R_SUCCESS and rather use the callback to report an error in such a case.	2020-10-22 11:37:16 -07:00
Ondřej Surý	d72bc3eb52	Detach the sock->server in uv_close() callback, not before	2020-10-22 11:37:16 -07:00
Ondřej Surý	97b33e5bde	Explicitly stop reading before closing the nmtcpsocket When closing the socket that is actively reading from the stream, the read_cb() could be called between uv_close() and close callback when the server socket has been already detached hence using sock->statichandle after it has been already freed.	2020-10-22 11:37:16 -07:00

5087 commits