bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-06-20 19:48:53 -04:00

Author	SHA1	Message	Date
Tony Finch	92fcb7457c	Use isc_mem_callocate() in http_calloc() Closes #4120	2023-06-27 12:38:09 +02:00
Tony Finch	81d73600c1	Add isc_mem_callocate() for safer array allocation As well as clearing the fresh memory, `calloc()`-like functions must ensure that the count and size do not overflow when multiplied. Use `isc_mem_callocate()` in `isc__uv_calloc()`.	2023-06-27 12:38:09 +02:00
Tony Finch	7474cad4ad	Add <isc/overflow.h> for checked mul, add, and sub The `ISC_OVERFLOW_XXX()` macros are usually wrappers around `__builtin_xxx_overflow()`, with alternative implementations for compilers that lack the builtins. Replace the overflow checks in `isc/time.c` with the new macros.	2023-06-27 12:38:09 +02:00
Ondřej Surý	5bd9343c4e	Remove the explicit call_rcu thread creating and destruction The free_all_cpu_call_rcu_data() call can consume hundreds of milliseconds on shutdown. Don't try to be smart and let the RCU library handle this internally.	2023-06-27 07:59:00 +02:00
Tony Finch	e18ca83a3b	Improve statschannel HTTP Connection: header protocol conformance In HTTP/1.0 and HTTP/1.1, RFC 9112 section 9.6 says the last response in a connection should include a `Connection: close` header, but the statschannel server omitted it. In an HTTP/1.0 response, the statschannel server can sometimes send a `Connection: keep-alive` header when it is about to close the connection. There are two ways: If the first request on a connection is keep-alive and the second request is not, then _both_ responses have `Connection: keep-alive` but the connection is (correctly) closed after the second response. If a single request contains Connection: close Connection: keep-alive then RFC 9112 section 9.3 says the keep-alive header is ignored, but the statschannel sends a spurious keep-alive in its response, though it correctly closes the connection. To fix these bugs, make it more clear that the `httpd->flags` are part of the per-request-response state. The Connection: flags are now described in terms of the effect they have instead of what causes them to be set.	2023-06-15 17:03:09 +01:00
Ondřej Surý	a8e6c3b8f7	Make isc_result tables smaller The isc_result_t enum was too sparse when each library code would skip to next << 16 as a base. Remove the huge holes in the isc_result_t enum to make the isc_result tables more compact. This change required a rewrite of how we map dns_rcode_t to isc_result_t and back, so we never return either an isc_result_t value or a dns_rcode_t out of the defined range.	2023-06-15 15:32:04 +02:00
Midnight Veil	dd6acc1cac	Translate POSIX errorcode EROFS to ISC_R_NOPERM Report "permission denied" instead of "unexpected error" when trying to update a zone file on a read-only file system.	2023-06-14 13:12:45 +01:00
Mark Andrews	e6e4ac05b8	Fix typo in synchronize_rcu macro (add h) synchronize_rcu has not been used until now in BIND9 and there was a typo in the define (a 'h' was missing).	2023-06-06 08:10:09 +10:00
Ondřej Surý	f760ee3f8c	Disable URCU inlining if inlined rcu_dereference() fails to compile In some cases, the inlined version of rcu_dereference() would not compile when working on a pointer to an opaque struct (namely on Ubuntu Jammy). Detect such a condition in autoconf and disable the inlining of the small functions if it breaks the build.	2023-06-01 16:51:38 +02:00
Mark Andrews	ac2e0bc3ff	Move isc_mem_put to after node is checked for equality isc_mem_put NULL's the pointer to the memory being freed. The equality test 'parent->r == node' was accidentally being turned into a test against NULL.	2023-05-29 01:40:57 +00:00
Evan Hunt	512e5e786b	don't set SHUTTINGDOWN until after calling the request callbacks if we set ISC_HTTPDMGR_SHUTTINGDOWN in the http manager before calling the pending request callbacks, it can trigger an assertion.	2023-05-27 00:41:37 +00:00
Michal Nowak	1fe5c008d6	Ensure "wrap" variable is non-NULL RUNTIME_CHECK on the "wrap" variable avoids possible NULL dereference: thread.c: In function 'thread_wrap': thread.c:60:15: error: dereference of possibly-NULL 'wrap' [CWE-690] [-Werror=analyzer-possible-null-dereference] 60 | *wrap = (struct thread_wrap){ The RUNTIME_CHECK was there before `7d1ceaf35d`.	2023-05-19 11:02:59 +02:00
Michał Kępień	6029010dd2	Remove <isc/cmocka.h> The last use of the cmocka_add_test_byname() helper macro was removed in commit `63fe9312ff`. Remove the <isc/cmocka.h> header that defines it.	2023-05-18 15:12:23 +02:00
Tony Finch	c319ccd4c9	Fixes for liburcu-qsbr Move registration and deregistration of the main thread from `isc_loopmgr_run()` into `isc__initialize()` / `isc__shutdown()`: liburcu-qsbr fails an assertion if we try to use it from an unregistered thread, and we need to be able to use it when the event loops are not running. Use `rcu_assign_pointer()` and `rcu_dereference()` in qp-trie transactions so that they properly mark threads as online. The RCU-protected pointer is no longer declared atomic because liburcu does not (yet) use standard C atomics. Fix the definition of `isc_qsbr_rcu_dereference()` to return the referenced value, and to call the right function inside liburcu. Change the thread sanitizer suppressions to match any variant of `rcu_*_barrier()`	2023-05-15 20:49:42 +00:00
Tony Finch	afae41aa40	Check the return value from uv_async_send() An omission pointed out by the following report from Coverity: /lib/isc/loop.c: 483 in isc_loopmgr_pause() >>> CID 455002: Error handling issues (CHECKED_RETURN) >>> Calling "uv_async_send" without checking return value (as is done elsewhere 5 out of 6 times). 483 uv_async_send(&loop->pause_trigger);	2023-05-15 18:52:04 +01:00
Evan Hunt	b4ac7faee9	allow streamdns read to resume after timeout when reading on a streamdns socket failed due to timeout, but the dispatch was still waiting for other responses, it would resume reading by calling isc_nm_read() again. this caused an assertion because the socket was already reading. we now check that either the socket is reading, or that it was already reading on the same handle.	2023-05-13 23:31:45 -07:00
Tony Finch	fc770a8bd0	Remove the now-unused ISC_STACK We are using the liburcu concurrent data structures instead.	2023-05-12 20:49:43 +01:00
Tony Finch	f11cc83142	Use per-CPU RCU helper threads Create and free per-CPU helper threads from the main thread and tell thread sanitizer to suppress leaking threads. (We are not leaking threads ourselves and we can safely ignore the Userspace-RCU thread leaks.)	2023-05-12 20:48:31 +01:00
Tony Finch	c377e0a9e3	Help thread sanitizer to cope with liburcu All the places the qp-trie code was using `call_rcu()` needed `__tsan_release()` and `__tsan_acquire()` annotations, so add a couple of wrappers to encapsulate this pattern. With these wrappers, the tests run almost clean under thread sanitizer. The remaining problems are due to `rcu_barrier()` which can be suppressed using `.tsan-suppress`. It does not suppress the whole of `liburcu`, because we would like thread sanitizer to detect problems in `call_rcu()` callbacks, which are called from `liburcu`. The CI jobs have been updated to use `.tsan-suppress` by default, except for a special-case job that needs the additional suppressions in `.tsan-suppress-extra`. We might be able to get rid of some of this after liburcu gains support for thread sanitizer. Note: the `rcu_barrier()` suppression is not entirely effective: tsan sometimes reports races that originate inside `rcu_barrier()` but tsan has discarded the stack so it does not have the information required to suppress the report. These "races" can be made much easier to reproduce by adding `atexit_sleep_ms=1000` to `TSAN_OPTIONS`. The problem with tsan's short memory can be addressed by increasing `history_size`: when it is large enough (6 or 7) the `rcu_barrier()` stack usually survives long enough for suppression to work.	2023-05-12 20:48:31 +01:00
Tony Finch	05ca11e122	Remove isc_qsbr (we are using liburcu instead) This commit breaks the qp-trie code.	2023-05-12 20:48:31 +01:00
Tony Finch	cd0795beea	Slightly more sanitary thread dispatch Tell thread sanitizer that the thread wrapper is released before passing it to a new thread.	2023-05-12 20:48:31 +01:00
Tony Finch	2e0c954806	Wait for RCU to finish before destroying a memory context Memory reclamation by `call_rcu()` is asynchronous, so during shutdown it can lose a race with the destruction of its memory context. When we defer memory reclamation, we need to attach to the memory context to indicate that it is still in use, but that is not enough to delay its destruction. So, call `rcu_barrier()` in `isc_mem_destroy()` to wait for pending RCU work to finish before proceeding to destroy the memory context.	2023-05-12 20:48:31 +01:00
Tony Finch	4f97a679f0	A macro for the size of a struct with a flexible array member It can be fairly long-winded to allocate space for a struct with a flexible array member: in general we need the size of the struct, the size of the member, and the number of elements. Wrap them all up in a STRUCT_FLEX_SIZE() macro, and use the new macro for the flexible arrays in isc_ht and dns_qp.	2023-05-12 20:48:31 +01:00
Ondřej Surý	fd3522c37b	Add Userspace-RCU to global CFLAGS and LIBS The Userspace-RCU headers are now needed for more parts of the libisc and libdns, thus we need to add it globally to prevent compilation failures on systems with non-standard Userspace-RCU installation path.	2023-05-12 14:16:25 +02:00
Ondřej Surý	00f1823366	Change the isc_quota API to use cds_wfcqueue internally The isc_quota API was using locked list of isc_job_t objects to keep the waiting TCP accepts. Change the isc_quota implementation to use cds_wfcqueue internally - the enqueue is wait-free and only dequeue needs to be locked.	2023-05-12 14:16:25 +02:00
Ondřej Surý	7b1d985de2	Change the isc_async API to use cds_wfcqueue internally The isc_async API was using lock-free stack (where enqueue operation was not wait-free). Change the isc_async to use cds_wfcqueue internally - enqueue and splice (move the queue members from one list to another) is nonblocking and wait-free.	2023-05-12 14:16:25 +02:00
Ondřej Surý	7220851f67	Replace glue_cache hashtable with direct link in rdatasetheader Instead of having a global hashtable with a global rwlock for the GLUE cache, move the glue_list directly into rdatasetheader and use Userspace-RCU to update the pointer when the glue_list is empty. Additionally, the cached glue_lists need to be stored in the RBTDB version for early cleaning, otherwise the circular dependencies between nodes and glue_lists will prevent nodes from ever being cleaned up.	2023-05-12 13:25:39 +02:00
Michal Nowak	31935a3537	Disable ASAN in nsupdate for fatal cases Clang 16 LeakSanitizer reports a memory leak when dns_request_create() returned a TLS error in the nsupdate system test. While technically a memory leak on error handling, it's not a problem because the program is immediately terminated; nsupdate is not expected to run for a prolonged time.	2023-05-11 13:39:51 +02:00
Mark Andrews	9fcd42c672	Re-write remove_old_tsversions and greatest_version Stop deliberately breaking const rules by copying file->name into dirbuf and truncating it there. Handle files located in the root directory properly. Use unlinkat() from POSIX 200809.	2023-05-03 09:12:34 +02:00
Matthijs Mekking	70629d73da	Fix purging old log files with absolute file path Removing old timestamp or increment versions of log backup files did not work when the file is an absolute path: only the entry name was provided to the file remove function. The dirname was also bogus, since the file separator was put back too soon. Fix these issues to make log file rotation work when the file is configured to be an absolute path.	2023-05-03 09:12:11 +02:00
Tony Finch	7d1ceaf35d	Move per-thread RCU setup into isc_thread All the per-loop `libuv` setup remains in `isc_loop`, but the per-thread RCU setup is moved to `isc_thread` alongside the other per-thread setup. This avoids repeating the per-thread setup for `call_rcu()` helpers, and explains a little better why some parts of the per-thread setup are missing for `call_rcu()` helpers. This also removes the per-loop `call_rcu()` helpers as we refactored the isc__random_initialize() in the previous commit.	2023-04-27 12:38:53 +02:00
Ondřej Surý	65021dbf52	Move the isc_random API initialization to the thread_local variable Instead of writing complicated wrappers for every thread, move the initialization back to isc_random unit and check whether the random seed was initialized with a thread_local variable. Ensure that isc_entropy_get() returns a non-zero seed. This avoids problems with thread sanitizer tests getting stuck in an infinite loop.	2023-04-27 12:38:53 +02:00
Tony Finch	e0248bf60f	Simplify isc_thread a little Remove the `isc_threadarg_t` and `isc_threadresult_t` typedefs which were unhelpful disguises for `void *`, and free the dummy jemalloc allocation sooner.	2023-04-27 12:38:53 +02:00
Tony Finch	06f534fa69	Avoid spurious compilation failures in liburcu headers When liburcu is not installed from a system package, its headers are not treated as system headers by the compiler, so BIND's -Werror and other warning options take effect. The liburcu headers have a lot of inline functions, some of which do not use all their arguments, which BIND's build treats as an error.	2023-04-27 12:38:53 +02:00
Ondřej Surý	c2c907d728	Improve the Userspace RCU integration This commit allows BIND 9 to be compiled with different flavours of Userspace RCU, and improves the integration between Userspace RCU and our event loop: - In the RCU QSBR, the thread is put offline when polling and online when rcu_dereference, rcu_assign_pointer (or friends) are called. - In other RCU modes, we check that we are not reading when reaching the quiescent callback in the event loop. - We register the thread before uv_work_run() callback is called and after it has finished. The rcu_(un)register_thread() has a large overhead, but that's fine in this case.	2023-04-27 12:38:53 +02:00
Ondřej Surý	58663574b9	Use server socket to log TCP accept failures The accept_connection() could detach from the child socket on a failure, so we need to keep and use the server socket for logging the accept failures.	2023-04-27 11:07:57 +02:00
Ondřej Surý	27ad3a65f9	Fix potential UAF when shutting down isc_httpd Use the ISC_LIST_FOREACH_SAFE() macro to safely walk the running httpds and shut them down in a manner safe from deletion.	2023-04-25 08:16:46 +02:00
Ondřej Surý	ae997d9e21	Add ISC_LIST_FOREACH(_SAFE) macros There's a recurring pattern walking the ISC_LISTs that just repeats over and over. Add two macros: * ISC_LIST_FOREACH(list, elt, link) - walk the static list * ISC_LIST_FOREACH_SAFE(list, elt, link, next) - walk the list in a manner that's safe against list member deletions	2023-04-25 08:16:46 +02:00
Evan Hunt	0393b54afb	add a result code for ENOPROTOOPT, EPROTONOSUPPORT there was no isc_result_t value for invalid protocol errors that could be returned from libuv.	2023-04-21 12:42:10 +02:00
Ondřej Surý	b497e90179	Add isc_spinlock unit with shim pthread_spin implementation The spinlock is a small (atomic_uint_fast32_t at most), lightweight synchronization primitive and should only be used for short-lived critical sections; most of the time an isc_mutex should be used. Add an isc_spinlock unit which is either (most of the time) a thin wrapper around the pthread_spin API or an efficient shim implementation of the simple spinlock.	2023-04-21 12:10:02 +02:00
Ondřej Surý	3b10814569	Fix the streaming read callback shutdown logic When shutting down TCP sockets, the read callback calling logic was flawed, it would call either one less callback or one extra. Fix the logic in the following way: 1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on the handle, the read callback will be called with ISC_R_CANCELED to cancel active reading from the socket/handle. 2. When isc_nm_read() has been called and isc_nm_read_stop() has been called on the handle, the read callback will be called with ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket is being shut down. 3. The .reading and .recv_read flags are a little bit tricky. The .reading flag indicates if the outer layer is reading the data (that would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream), the .recv_read flag indicates whether somebody is interested in the data read from the socket. Usually, you would expect that the .reading should be false when .recv_read is false, but it gets even more tricky with TLSStream as the TLS protocol might need to read from the socket even when sending data. Fix the usage of the .recv_read and .reading flags in the TLSStream to their true meaning - which mostly consists of using .recv_read everywhere and then wrapping isc_nm_read() and isc_nm_read_stop() with the .reading flag. 4. The TLS failed read helper has been modified to resemble the TCP code as much as possible, clearing and re-setting the .recv_read flag in the TCP timeout code has been fixed and .recv_read is now cleared when isc_nm_read_stop() has been called on the streaming socket. 5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and isc_httpd units has been greatly simplified due to the improved design. 6. More unit tests for TCP and TLS testing the shutdown conditions have been added. Co-authored-by: Ondřej Surý <ondrej@isc.org> Co-authored-by: Artem Boldariev <artem@isc.org>	2023-04-20 12:58:32 +02:00
Ondřej Surý	f677cf6b73	Remove unused netmgr->worker->sendbuf By inspecting the code, it was discovered that .sendbuf member of the isc__nm_networker_t was unused and just consuming ~64k per worker. Remove the member and the associated allocation/deallocation.	2023-04-14 16:20:14 +02:00
Ondřej Surý	1715cad685	Refactor the isc_quota code and fix the quota in TCP accept code In `e185412872`, the TCP accept quota code became broken in a subtle way - the quota would get initialized on the first accept for the server socket and then deleted from the server socket, so it would never get applied again. Properly fixing this required a bigger refactoring of the isc_quota API code to make it much simpler. The new code decouples the ownership of the quota and acquiring/releasing the quota limit. After (during) the refactoring it became more clear that we need to use the callback from the child side of the accepted connection, and not the server side.	2023-04-12 14:10:37 +02:00
Ondřej Surý	1768522045	Convert tls_send() callback to use isc_job_run() The tls_send() was already using uvreq; convert this to use more direct isc_job_run() - the on-loop no-allocation method.	2023-04-12 14:10:37 +02:00
Ondřej Surý	1302345c93	Convert isc__nm_http_send() from isc_async_run() to isc_job_run() The isc__nm_http_send() was already using uvreq; convert this to use more direct isc_job_run() - the on-loop no-allocation method.	2023-04-12 14:10:37 +02:00
Ondřej Surý	3adba8ce23	Use isc_job_run() for reading from StreamDNS socket Change the reading in the StreamDNS code to use isc_job_run() instead of using isc_async_run() for fewer allocations and more streamlined execution.	2023-04-12 14:10:37 +02:00
Ondřej Surý	74cbf523b3	Run closehandle_cb on run queue instead of async queue Instead of using isc_async_run() when closing StreamDNS handle, add isc_job_t member to the isc_nmhandle_t structure and use isc_job_run() to avoid allocation/deallocation on the StreamDNS hot-path.	2023-04-12 14:10:37 +02:00
Ondřej Surý	d27f6f2d68	Accept overquota TCP connection on local thread if possible If the quota callback is called on a thread matching the socket, call the TCP accept function directly instead of using isc_async_run() which allocates-deallocates memory.	2023-04-12 14:10:37 +02:00
Ondřej Surý	0a468e7c9e	Make isc_tid() a header-only function The isc_tid() function is often called on the hot-path and its only function is to return a thread_local variable; make isc_tid() a header-only function to save several function calls during query-response processing.	2023-04-12 14:10:37 +02:00
Tony Finch	3405b43fe9	Fix a division by zero bug in isc_histo This can occur when calculating the standard deviation of an empty histogram.	2023-04-05 23:29:21 +02:00
Mark Andrews	bf58c10dce	Silence NULL pointer dereference false positive Only attempt to digest 'in' if it is non NULL. This will prevent false positives about NULL pointer dereferences against 'in' and should also speed up the processing.	2023-04-03 13:32:40 +00:00
Artem Boldariev	2b3a3c21dc	Stream DNS: avoid memory copying/buffer resizing when reading data This commit optimises isc_dnsstream_assembler_t in such a way that memory copying and reallocation are avoided when receiving one or more complete DNS messages at once. We try to handle the data from the messages directly, without storing them in an intermediate memory buffer.	2023-04-03 13:31:46 +00:00
Tony Finch	cd0e7f853a	Simplify histogram quantiles The `isc_histosummary_t` functions were written in the early days of `hg64` and carried over when I brought `hg64` into BIND. They were intended to be useful for graphing cumulative frequency distributions and the like, but in practice whatever draws charts is better off with a raw histogram export. Especially because of the poor performance of the old functions. The replacement `isc_histo_quantiles()` function is intended for providing a few quantile values in BIND's stats channel, when the user does not want the full histogram. Unlike the old functions, the caller provides all the query fractions up-front, so that the values can be found in a single scan instead of a scan per value. The scan is from larger values to smaller, since larger quantiles are usually more interesting, so the scan can bail out early.	2023-04-03 12:08:05 +01:00
Tony Finch	bc2389b828	Add per-thread sharded histograms for heavy loads Although an `isc_histo_t` is thread-safe, it can suffer from cache contention under heavy load. To avoid this, an `isc_histomulti_t` contains a histogram per thread, so updates are local and low-contention.	2023-04-03 12:08:05 +01:00
Tony Finch	82213a48cf	Add isc_histo for histogram statistics This is an adaptation of my `hg64` experiments for use in BIND. As well as renaming everything according to ISC style, I have written some more extensive tests that ensure the edge cases are correct and the fenceposts are in the right places. I have added utility functions for working with precision in terms of decimal significant figures as well as this code's native binary.	2023-04-03 12:08:05 +01:00
Ondřej Surý	3a6a0fa867	Replace DE_CONST(k, v) with v = UNCONST(k) macro Replace the complicated DE_CONST macro that required a union with the much simpler reference-dereference trick in the UNCONST() macro.	2023-04-03 10:25:56 +00:00
Ondřej Surý	4ec9c4a1db	Cleanup the last Windows / MSC ifdefs and comments Cleanup the remnants of MS Compiler bits from <isc/refcount.h>, printing the information in named/main.c, and cleanup some comments about Windows that no longer apply. The bits in picohttpparser.{h,c} were left out, because it's not our code.	2023-04-03 09:06:20 +00:00
Mark Andrews	2abd6c7ab4	Handle MD5 not being supported by lib crypto When initialising the message digests in lib/isc/md.c no longer assume that the initialisation cannot fail.	2023-04-03 12:44:27 +10:00
Mark Andrews	a3172c8f9c	Don't check for OPENSSL_cleanup failures by default OPENSSL_cleanup is supposed to free all remaining memory in use provided the application has cleaned up properly. This is not the case on some operating systems. Silently ignore memory that is freed after OPENSSL_cleanup has been called.	2023-04-03 12:44:27 +10:00
Mark Andrews	e029803704	Handle fatal and FIPS provider interactions When fatal is called we may be holding memory allocated by OpenSSL. This may result in the reference count for the FIPS provider not going to zero and the shared library not being unloaded during OPENSSL_cleanup. When the shared library is ultimately unloaded, when all remaining dynamically loaded libraries are freed, we have already destroyed the memory context we were using to track memory leaks / late frees resulting in INSIST being called. Disable triggering the INSIST when fatal has been called.	2023-04-03 12:44:27 +10:00
Mark Andrews	5a2e82557e	Define isc_fips_mode() and isc_fips_set_mode() isc_fips_mode() determines if the process is running in FIPS mode; isc_fips_set_mode() sets the process into FIPS mode.	2023-04-03 12:05:28 +10:00
Tony Finch	555690a3c9	Simplify thread spawning The `isc_trampoline` module had a lot of machinery to support stable thread IDs for use by hazard pointers. But the hazard pointer code is gone, and the `isc_loop` module now has its own per-loop thread IDs. The trampoline machinery seems over-complicated for its remaining tasks, so move the per-thread initialization into `isc/thread.c`, and delete the rest.	2023-03-31 17:21:52 +01:00
Ondřej Surý	a5f5f68502	Refactor isc_time_now() to return time, and not result The isc_time_now() and isc_time_now_hires() were used inconsistently through the code - either with status check, or without status check, or via TIME_NOW() macro with RUNTIME_CHECK() on failure. Refactor the isc_time_now() and isc_time_now_hires() to always fail when getting current time has failed, and return the isc_time_t value as return value instead of passing the pointer to result in the argument.	2023-03-31 15:02:06 +02:00
Ondřej Surý	263d232c79	Replace isc_fsaccess API with more secure file creation The isc_fsaccess API was created to hide the implementation details between POSIX and Windows APIs. As we are not supporting the Windows APIs anymore, it's better to drop this API used in the DST part. Moreover, the isc_fsaccess was setting the permissions in an insecure manner - it operated on the filename, and not on the file descriptor, which can lead to all kinds of attacks if an unprivileged user has read (or even worse write) access to the key directory. Replace the code that operates on the private keys with code that uses mkstemp(), fchmod() and atomic rename() at the end, so at no time do the private key files have insecure permissions.	2023-03-31 12:52:59 +00:00
Ondřej Surý	aca7dd3961	Add isc_os_umask() function to get current umask As it's impossible to get the current umask without modifying it at the same time, initialize the current umask at the program start and keep the loaded value internally. Add isc_os_umask() function to access the starttime umask.	2023-03-31 12:52:59 +00:00
Ondřej Surý	4bd6096d4b	Remove isc_stdtime_get() macro Now that isc_stdtime_get() macro is unused, remove it from the header file.	2023-03-31 13:33:16 +02:00
Ondřej Surý	46f06c1d6e	Apply the semantic patch to remove isc_stdtime_get() This is a simple replacement using the semantic patch from the previous commit and, as an added bonus, one removal of a previously undetected unused variable in named/server.c.	2023-03-31 13:32:56 +02:00
Ondřej Surý	c11af0448a	Provide isc_stdtime_now(void) that returns value As isc_stdtime_get() cannot fail, the API seems to be too complicated, add new isc_stdtime_now() that returns the unixtime as a return value.	2023-03-31 13:16:28 +02:00
Tony Finch	194621a74e	Fix a crash when dig or host receive a signal When the loopmanager is shutting down following a signal, `dig` and `host` should stop cleanly. Before this commit they were oblivious to ISC_R_SHUTTINGDOWN. The `isc_signal` callbacks now report this kind of mistake with a stack backtrace.	2023-03-31 09:52:54 +00:00
Ondřej Surý	2c0a9575d7	Replace __attribute__((unused)) with ISC_ATTR_UNUSED attribute macro Instead of marking the unused entities with UNUSED(x) macro in the function body, use an `ISC_ATTR_UNUSED` attribute macro that expands to C23 [[maybe_unused]] or __attribute__((__unused__)) as fallback.	2023-03-30 23:29:25 +02:00
Ondřej Surý	1176bf0552	Use C23 attributes if available, add ISC_ATTR_UNUSED Use C23 attribute styles if available: * Add new ISC_ATTR_UNUSED attribute macro that either expands to C23's [[maybe_unused]] or __attribute__((__unused__)); * Add default expansion of the `noreturn` to [[noreturn]] if available; * Move the FALLTHROUGH from <isc/util.h> to <isc/attributes.h>	2023-03-30 22:43:39 +02:00
Artem Boldariev	43e21d653f	TLS Stream: remove incorrect/obsolete INSIST()s from tls_do_bio() With the changes to tls_try_handshake() made in `2846888c57` there are some incorrect INSIST()s related to handshake handling which are better removed.	2023-03-30 18:21:50 +03:00
Ondřej Surý	2846888c57	Attach the accept "client" socket to .listener member of the socket When accepting a TCP connection in the higher layers (tlsstream, streamdns, and http) attach to the socket the connection was accepted on, and use this socket instead of the parent listening socket. This has an advantage - accessing the sock->listener now doesn't break the thread boundaries, so we can properly check whether the socket is being closed without requiring .closing member to be atomic_bool.	2023-03-30 16:10:08 +02:00
Ondřej Surý	45365adb32	Convert sock->active to non-atomic variable, cleanup rchildren The last atomic_bool variable sock->active was converted to non-atomic bool by properly handling the listening socket case where we were checking parent socket instead of children sockets. This is no longer necessary as we properly set the .active to false on the children sockets. Additionally, cleanup the .rchildren - the atomic variable was used for mutex+condition to block until all children were listening, but that's now being handled by a barrier. Finally, just remove dead .self and .active_child_connections members of the netmgr socket.	2023-03-30 16:10:08 +02:00
Ondřej Surý	e1a4572fd6	Refactor the use of atomics in netmgr Now that everything runs on their own loop and we don't cross the thread boundaries (with few exceptions), most of the atomic_bool variables used to track the socket state have been unatomicized because they are always accessed from the matching thread. The remaining few have been relaxed: a) the sock->active is now using acquire/release memory ordering; b) the various global limits are now using relaxed memory ordering - we don't really care about the synchronization for those.	2023-03-30 16:10:08 +02:00
Ondřej Surý	f5fc224af3	Add isc_async_current() macro to run job on current loop Previously, isc_job_run() could have been used to run the job on the current loop and the isc_job_run() would take care of allocating and deallocating the job. After the change in this MR, the isc_job_run() is more complicated to use, so we introduce the isc_async_current() macro to supplement isc_async_run() when we need to run the job on the current loop.	2023-03-30 16:07:41 +02:00
Ondřej Surý	1844590ad9	Refactor isc_job_run to not-make any allocations Change the isc_job_run() to not-make any allocations. The caller must make sure that it allocates isc_job_t - usually as part of the argument passed to the callback. For simple jobs, using isc_async_run() is advised as it allocates its own separate isc_job_t.	2023-03-30 16:00:52 +02:00
Ondřej Surý	639d5065a3	Refactor the isc__nm_uvreq_t to have idle callback Change the isc__nm_uvreq_t to have the idle callback as a separate member as we always need to use it to properly close the uvreq. Slightly refactor uvreq_put and uvreq_get to remove the unneeded arguments - in uvreq_get(), we always use sock->worker, and in uvreq_put, we always use req->sock, so there's no reason to pass those extra arguments.	2023-03-29 21:16:44 +02:00
Ondřej Surý	476198f26c	Use uv_idle API for calling asynchronous connect/read/send callback Instead of using isc_job_run() that's quite heavy as it allocates memory for every new job, add uv_idle_t to uvreq union, and use uv_idle API directly to execute the connect/read/send callback without any additional allocations.	2023-03-29 21:16:44 +02:00
Ondřej Surý	670df3da74	Re-add the comment to streamdns_readmore() Put the comment back, so it's more obvious that we are only restarting timer when there's a last handle attached to the socket; there has to be always at least one.	2023-03-29 21:16:44 +02:00
Tony Finch	295e7c80e8	Ad-hoc backtrace logging with isc_backtrace_log() It's sometimes helpful to get a quick idea of the call stack when debugging. This change factors out the backtrace logging from named's fatal error handler so that it's easy to use in other places too.	2023-03-29 10:47:53 +00:00
Ondřej Surý	665f8bb78d	Fix isc_nm_httpconnect to check for shutting down condition The isc_nm_httpconnect() would succeed even if the netmgr would be already shutting down. This has been fixed and the unit test has been updated to cope with the fact that the handle would be NULL when isc_nm_httpconnect() returns with an error.	2023-03-29 05:49:57 +00:00
Evan Hunt	fe7ed2ba24	update stream sockets with bound address/port when isc_nm_listenstreamdns() is called with a local port of 0, a random port is chosen. call uv_getsockname() to determine what the port is as soon as the socket is bound, and add a function isc_nmsocket_getaddr() to retrieve it, so that the caller can connect to the listening socket. this will be used in cases where the same process is acting as both client and server.	2023-03-28 12:38:28 -07:00
Evan Hunt	4ad95e0567	add ns_interface_create() add a public function ns_interface_create() allowing the caller to set up a listening interface directly without having to set up listen-on and scan network interfaces.	2023-03-28 12:38:28 -07:00
Ondřej Surý	a2e4a6883f	Remove the netievent remnants After removing all functional netievents, remove what has been left from the netievents. This also includes leftovers from previous refactorings.	2023-03-24 07:58:53 +01:00
Ondřej Surý	6b107c3fbc	Convert stopping generic socket children to isc_async callback Simplify the stopping of the generic socket children by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:53 +01:00
Ondřej Surý	744e93b70d	Convert setting of the TLS contexts to isc_async callback Simplify the setting of the TLS contexts by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:53 +01:00
Ondřej Surý	7ddc49d66a	Convert canceling StreamDNS socket to isc_async callback Simplify the canceling of the StreamDNS socket by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:53 +01:00
Ondřej Surý	2185dc75f0	Convert reading from StreamDNS socket to isc_async callback Simplify the reading from the StreamDNS socket by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	4a4bd68777	Convert setting of the DoH endpoints to isc_async callback Simplify the setting of the DoH endpoints by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	115160de73	Convert sending on the DoH socket to isc_async callback Simplify the sending on the DoH socket by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	a321d3f419	Convert closing the DoH socket to isc_async callback Simplify the closing of the DoH socket by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	8c48c51f71	Convert doing the TLS IO to isc_async callback Simplify doing the TLS IO by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	3d4d099ac8	Cleanup already defunct tlsconnect netievent The netievent used for TLS connect was already defunct, just cleanup the cruft.	2023-03-24 07:58:52 +01:00
Ondřej Surý	35b4ef0a08	Convert sending on the TLS socket to isc_async callback Simplify the sending on the TLS socket by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	4f27b14cd1	Convert closing the TLS socket to isc_async callback Simplify the closing of the TLS socket by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	e185412872	Convert accepting new TCP connection to isc_async callback Simplify accepting the new TCP connection by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	1baffb6ff5	Convert canceling UDP socket to isc_async callback Simplify the canceling of the UDP socket by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	4419848efd	Convert stopping TCP children to isc_async callback Simplify the stopping of the TCP children by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	e1524f2b4e	Convert starting TCP children to isc_async callback Simplify the starting of the TCP children by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	8cb4cfd9db	Convert stopping UDP children to isc_async callback Simplify the stopping of the UDP children by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	b25dd5eaf5	Convert starting UDP children to isc_async callback Simplify the starting of the UDP children by using the isc_async API from the loopmgr instead of using the asynchronous netievent mechanism in the netmgr.	2023-03-24 07:58:52 +01:00
Ondřej Surý	5a43be0775	Simplify netmgr active handles accounting The active handles accounting was both using atomic counter and ISC_LIST to keep track of active handles. Remove the atomic counter that was in use before the ISC_LIST was added for better tracking of the handles attached to the socket.	2023-03-24 07:58:52 +01:00
Ondřej Surý	96cff4fc51	Convert netmgr handle detach to synchronous callback Instead of having isc__nmhandle_detach() call nmhandle_detach_cb() asynchronously when there's closehandle_cb initialized, convert the closehandle_cb to use isc_job, and make isc__nmhandle_detach() fully synchronous.	2023-03-24 07:58:52 +01:00
Ondřej Surý	237f4af152	Convert netmgr connect, read and send callbacks to isc_job The netmgr connect, read and send callbacks can now only be executed on the same loop, convert it from asynchronous netievent queue event to more direct isc_job.	2023-03-23 22:33:40 -07:00
Artem Boldariev	719343348e	Delete old TLS DNS and TCP DNS dead code This commit removes old, unused TLS DNS and TCP DNS definitions from the code. They should have been deleted earlier, but that was missed.	2023-03-15 18:40:58 +02:00
Tony Finch	7e565a87a7	Apply adjusted clang-format The headers were slightly reordered when liburcu was added.	2023-03-10 17:31:28 +01:00
Ondřej Surý	2532b558b4	Build with liburcu, Userspace RCU BIND needs a collection of standard lock-free data structures, which we can find in liburcu, along with its RCU safe memory reclamation machinery. We will use liburcu's QSBR variant instead of the home-grown isc_qsbr.	2023-03-10 17:31:28 +01:00
Aram Sargsyan	fce68da460	Fix ISC_REFCOUNT_TRACE_IMPL usage ISC_REFCOUNT_TRACE_IMPL uses isc_tid(), but the corresponding header file is not included, which breaks, for example, compiling BIND with DNS_CATZ_TRACE defined in lib/dns/include/dns/catz.h. Add '#include <isc/tid.h>' in lib/isc/include/isc/refcount.h.	2023-03-09 21:38:04 +00:00
Mark Andrews	0045b24500	Silence uninitialized value false positives In base32_decode_char the GCC 12 static analyser fails to determine that ctx->val[1], ctx->val[3], ctx->val[4] and ctx->val[6] are assigned values by the previous call to base32_decode_char. Initialise ctx->val to zeros when initialising the rest of ctx to silence the false positive.	2023-03-08 22:40:03 +00:00
Tony Finch	c43668f031	Remove some lingering references to libbind9 Clean up the `.clang-format` #include priority list and the `\file` declaration in `isc/getaddresses.h`.	2023-03-08 10:06:22 +00:00
Mark Andrews	cf5f133679	Fix memory leak in isc_hmac_init If EVP_DigestSignInit failed 'pkey' was not freed.	2023-02-26 22:56:07 +00:00
Tony Finch	9b7aa536ba	QSBR: safe memory reclamation for lock-free data structures This "quiescent state based reclamation" module provides support for the qp-trie module in dns/qp. It is a replacement for liburcu, written without reference to the urcu source code, and in fact it works in a significantly different way. A few specifics of BIND make this variant of QSBR somewhat simpler: * We can require that wait-free access to a qp-trie only happens in an isc_loop callback. The loop provides a natural quiescent state, after the callbacks are done, when no qp-trie access occurs. * We can dispense with any API like rcu_synchronize(). In practice, it takes far too long to wait for a grace period to elapse for each write to a data structure. * We use the idea of "phases" (aka epochs or eras) from EBR to reduce the amount of bookkeeping needed to track memory that is no longer needed, knowing that the qp-trie does most of that work already. I considered hazard pointers for safe memory reclamation. They have more read-side overhead (updating the hazard pointers) and it wasn't clear to me how to nicely schedule the cleanup work. Another alternative, epoch-based reclamation, is designed for fine-grained lock-free updates, so it needs some rethinking to work well with the heavily read-biased design of the qp-trie. QSBR has the fastest read side of the basic SMR algorithms (with no barriers), and fits well into a libuv loop. More recent hybrid SMR algorithms do not appear to have enough benefits to justify the extra complexity.	2023-02-23 15:57:53 +00:00
Tony Finch	63cd73d43e	Include thread ID in refcount trace output	2023-02-23 14:28:27 +00:00
Evan Hunt	dc27552c30	remove isc_glob the isc_glob module was originally needed to support posix-style glob processing on Windows, but is now just an unnecessary wrapper around glob(3). this commit removes it.	2023-02-22 17:35:29 +00:00
Ondřej Surý	6eb1340d1b	Use atomic stack for async job queue Previously, the async job queue would use a locked-list (ISC_LIST). With introduction of atomic stack (that has to be drained at once), we could use it to remove some contention between the threads and simplify the async queue. Fortunately, the reverse order still works for us - instead of append and tail/prev operation on the list, we are now using prepend and head/next operation on the atomic stack.	2023-02-22 16:13:37 +00:00
Tony Finch	36e56923ce	Simple lock-free stack in <isc/stack.h> Add a singly-linked stack that supports lock-free prepend and drain (to empty the list and clean up its elements). Intended for use with QSBR to collect objects that need safe memory reclamation, or any other user that works with adding objects to the stack and then draining them in one go like various work queues. In <isc/atomic.h>, add an `atomic_ptr()` macro to make type declarations a little less abominable, and clean up a duplicate definition of `atomic_compare_exchange_strong_acq_rel()`	2023-02-22 16:13:37 +00:00
Evan Hunt	b058f99cb8	remove references to obsolete isc_task/timer functions removed references in code comments, doc/dev documentation, etc, to isc_task, isc_timer_reset(), and isc_timertype_inactive. also removed a coccinelle patch related to isc_timer_reset() that was no longer needed.	2023-02-22 08:13:30 +00:00
Tony Finch	3fef7c626a	Move bind9_getaddresses() to isc_getaddresses() No need to have a whole library for one function.	2023-02-21 13:12:26 +00:00
Evan Hunt	a52b17d39b	remove isc_task completely as there is no further use of isc_task in BIND, this commit removes it, along with isc_taskmgr, isc_event, and all other related types. functions that accepted taskmgr as a parameter have been cleaned up. as a result of this change, some functions can no longer fail, so they've been changed to type void, and their callers have been updated accordingly. the tasks table has been removed from the statistics channel and the stats version has been updated. dns_dyndbctx has been changed to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION has been updated as well.	2023-02-16 18:35:32 +01:00
Evan Hunt	f58e7c28cd	switch to using isc_loopmgr_pause() instead of task exclusive change functions using isc_taskmgr_beginexclusive() to use isc_loopmgr_pause() instead. also, removed an unnecessary use of exclusive mode in named_server_tcptimeouts(). most functions that were implemented as task events because they needed to be running in a task to use exclusive mode have now been changed into loop callbacks instead. (the exception is catz, which is being changed in a separate commit because it's a particularly complex change.)	2023-02-16 17:51:55 +01:00
Tony Finch	f9c725d7d4	Remove do-nothing header <isc/stat.h> Use <sys/stat.h> instead	2023-02-15 16:44:47 +00:00
Tony Finch	6927a30926	Remove do-nothing header <isc/print.h> This one really truly did nothing. No lines added!	2023-02-15 16:44:47 +00:00
Tony Finch	c7615bc28d	Remove do-nothing header <isc/offset.h> And replace all uses of isc_offset_t with standard off_t	2023-02-15 16:44:47 +00:00
Tony Finch	bed09c1676	Remove do-nothing header <isc/netdb.h> Not needed since we dropped Windows support	2023-02-15 16:44:47 +00:00
Tony Finch	b0893ae09a	Explain <isc/strerr.h> a little more The purpose of the `strerror_r()` wrapper was not obvious.	2023-02-15 16:44:09 +00:00
Tony Finch	75f7a85a39	Deprecate <isc/deprecated.h> We refactor more freely these days.	2023-02-15 15:36:20 +00:00
Ondřej Surý	6ffda5920e	Add the reader-writer synchronization with modified C-RW-WP This changes the internal isc_rwlock implementation to: Irina Calciu, Dave Dice, Yossi Lev, Victor Luchangco, Virendra J. Marathe, and Nir Shavit. 2013. NUMA-aware reader-writer locks. SIGPLAN Not. 48, 8 (August 2013), 157–166. DOI:https://doi.org/10.1145/2517327.24425 (The full article available from: http://mcg.cs.tau.ac.il/papers/ppopp2013-rwlocks.pdf) The implementation is based on the Writer-Preference Lock (C-RW-WP) variant (see the 3.4 section of the paper for the rationale). The implemented algorithm has been modified for simplicity and for usage patterns in rbtdb.c. The changes compared to the original algorithm: * We haven't implemented the cohort locks because that would require a knowledge of NUMA nodes, instead a simple atomic_bool is used as synchronization point for writer lock. * The per-thread reader counters are not being used - this would require the internal thread id (isc_tid_v) to be always initialized, even in the utilities; the change has a slight performance penalty, so we might revisit this change in the future. However, this change also saves a lot of memory, because cache-line aligned counters were used, so on 32-core machine, the rwlock would be 4096+ bytes big. * The readers use a writer_barrier that will raise after a while when readers lock can't be acquired to prevent readers starvation. * Separate ingress and egress readers counters queues to reduce both inter and intra-thread contention.	2023-02-15 09:30:04 +01:00
Ondřej Surý	28fe8104ee	Add isc_hashmap_find() DbC check for valuep This adds DbC check, so we don't pass non-NULL memory for a valued to the isc_hashmap_find() function.	2023-02-15 09:30:04 +01:00
Tony Finch	436b76bb17	Improve the spinloop pause / yield hint Unfortunately, C still lacks a standard function for pause (x86, sparc) or yield (arm) instructions, for use in spin lock or CAS loops. BIND has its own based on vendor intrinsics or inline asm. Previously, it was buried in the `isc_rwlock` implementation. This commit renames `isc_rwlock_pause()` to `isc_pause()` and moves it into <isc/pause.h>. This commit also fixes the configure script so that it detects ARM yield support on systems that identify as `aarch` instead of `arm`. On 64-bit ARM systems we now use the ISB (instruction synchronization barrier) instruction in preference to yield. The ISB instruction pauses the CPU for longer, several nanoseconds, which is more like the x86 pause instruction. There are more details in a Rust pull request, which also refers to MySQL making the same change: https://github.com/rust-lang/rust/pull/84725	2023-02-14 17:13:24 +00:00
Evan Hunt	3a1bb8dac8	remove some unused functions removed some functions that are no longer used and unlikely to be resurrected, and also some that were only used to support Windows and can now be replaced with generic versions.	2023-02-13 11:50:59 -08:00
Evan Hunt	935879ed11	remove isc_bind9 variable isc_bind9 was a global bool used to indicate whether the library was being used internally by BIND or by an external caller. external use is no longer supported, but the variable was retained for use by dyndb, which needed it only when being built without libtool. building without libtool is also no longer supported, so the variable can go away.	2023-02-09 18:00:13 +00:00
Ondřej Surý	d4d57f16c3	Sync compile-time & run-time libuv requirements Bump the minimum libuv version required at runtime so that it matches the compile-time requirements.	2023-02-09 15:04:52 +01:00
Ondřej Surý	735d09bffe	Enforce version drift limits for libuv libuv support for receiving multiple UDP messages in a single system call (recvmmsg()) has been tweaked several times between libuv versions 1.35.0 and 1.40.0. Mixing and matching libuv versions within that span may lead to assertion failures and is therefore considered harmful, so try to limit potential damage by preventing users from mixing libuv versions with distinct sets of recvmmsg()-related flags.	2023-02-09 15:04:52 +01:00
Ondřej Surý	251f411fc3	Avoid libuv 1.35 and 1.36 that have broken recvmmsg implementation The implementation of UDP recvmmsg in libuv 1.35 and 1.36 is incomplete and could cause assertion failure under certain circumstances. Modify the configure and runtime checks to report a fatal error when trying to compile or run with the affected versions.	2023-02-09 15:04:52 +01:00
Ondřej Surý	baced007af	Require C11 Atomic Operations via <stdatomic.h> Make the C11 Atomic Operations mandatory and drop the Gcc __atomic builtin shims.	2023-02-08 21:33:23 +01:00
Ondřej Surý	1c456c0284	Require C11 thread_local keyword and <threads.h> header Change the autoconf check to require C11 <threads.h> header and thread_local keyword.	2023-02-08 21:33:23 +01:00
Tony Finch	ff63b53ff4	Add isc_time_monotonic() This is to simplify measurements of how long things take.	2023-02-06 12:14:51 +00:00
Tony Finch	b8e71f9580	Fix ISC_MEM_ZERO on allocators with malloc_usable_size() ISC_MEM_ZERO requires great care to use when the space returned by the allocator is larger than the requested space, and when memory is reallocated. You must ensure that _every_ call to allocate or reallocate a particular block of memory uses ISC_MEM_ZERO, to ensure that the extra space is zeroed as expected. (When ISC_MEMFLAG_FILL is set, the extra space will definitely be non-zero.) When BIND is built without jemalloc, ISC_MEM_ZERO is implemented in `jemalloc_shim.h`. This had a bug on systems that have malloc_size() or malloc_usable_size(): memory was only zeroed up to the requested size, not the allocated size. When an oversized allocation was returned, and subsequently reallocated larger, memory between the original requested size and the original allocated size could contain unexpected nonzero junk. The realloc call does not know the original requested size and only zeroes from the original allocated size onwards. After this change, `jemalloc_shim.h` always zeroes up to the allocated size, not the requested size.	2023-02-06 11:21:12 +00:00
Evan Hunt	7fd78344e0	refactor isc_ratelimiter to use loop callbacks the rate limiter now uses loop callbacks rather than task events. the API for isc_ratelimiter_enqueue() has been changed; we now pass in a loop, a callback function and a callback argument, and receive back a rate limiter event object (isc_rlevent_t). it is no longer necessary for the caller to allocate the event. the callback argument needs to include a pointer to the rlevent object so that it can be freed using isc_rlevent_free(), or by dequeueing.	2023-01-31 21:41:19 -08:00
Ondřej Surý	3d674ccc1d	Restore Malloced memory counter as InUse alias + little cleanups This restores the Malloced memory counter and it's now always equal to InUse counter. This is only for backwards compatibility reason and there is no separate counter. The commit also cleans up little things like structure with a single item (summary.inuse), and shuts up a wrong cppcheck warning (the notorious NULL check after assignment).	2023-01-24 17:57:16 +00:00
Ondřej Surý	474279e5f1	Remove ContextSize memory counter Again, this was an internal allocator counter, now it's useless.	2023-01-24 17:57:16 +00:00
Ondřej Surý	863b2b8bf3	Make all the inuse memory counter atomic operations relaxed Instead of enforcing stronger synchronization between threads, make all the atomic operations relaxed. We are not really interested in exact numbers at all times - the single place where we need the exact number is when the memory context is being destroyed. Even when there's an overmem counter, we don't care about exact ordering or exact number.	2023-01-24 17:57:16 +00:00
Ondřej Surý	a08e2d37ed	Cleanup the ptr argument from mem_putstats() The ptr argument was unneeded and unused.	2023-01-24 17:57:16 +00:00
Ondřej Surý	699736b7bb	Remove the Lost memory counter The Lost memory counter would count the memory "lost" by external libraries. There's really no such thing as `named` requires the memory contexts to be clean on destroy.	2023-01-24 17:57:16 +00:00
Ondřej Surý	7588cd5cb1	Remove stats buckets memory counters The stats buckets were again more useful for internal allocator, because we would see the individual "block" caches where the allocations would fall into. Remove the stats buckets, and if needed, we can pull more detailed statistics out of the jemalloc.	2023-01-24 17:57:16 +00:00
Ondřej Surý	1ea8894626	Remove the 'totalgets' memory counter The totalgets falls into the same category as other "total" and "max" numbers - it's just a big number with no meaning to end user.	2023-01-24 17:57:16 +00:00
Ondřej Surý	3d4e41d076	Remove the total memory counter The total memory counter had again little or no meaning when we removed the internal memory allocator. It was just a monotonic counter that would add up the allocation sizes but never subtract anything, so it would be just a "big number".	2023-01-24 17:57:16 +00:00
Ondřej Surý	91e349433f	Remove maxinuse memory counter The maxinuse memory counter indicated the highest amount of memory allocated in the past. Checking and updating this high-water mark value every time memory was allocated had an impact on server performance, so it has been removed. Memory size can be monitored more efficiently via an external tool logging RSS.	2023-01-24 17:57:16 +00:00
Ondřej Surý	971df0b4ed	Remove malloced and maxmalloced memory counter The malloced and maxmalloced memory counters were mostly useless since we removed the internal allocator blocks - it would only differ from inuse by the memory context size itself.	2023-01-24 17:57:16 +00:00
Ondřej Surý	7d8aa63026	Make {increment,decrement}_malloced() return void The return value was only used in a single place and only for decrement_malloced() and we can easily replace that with atomic_load().	2023-01-24 17:57:16 +00:00
Evan Hunt	a2d773fb98	Refactor dnssec-signzone to use loop callbacks Use isc_job_run() instead of isc_task_send() for dnssec-signzone worker threads. Also fix the issue where the additional assignwork() would be run only from the main thread effectively serializing all the signing.	2023-01-21 23:39:09 -08:00
Evan Hunt	301f8b23e1	complete change of NETMGR_TRACE to ISC_NETMGR_TRACE some references to the old ifdef were still in place.	2023-01-20 12:46:34 -08:00
Mark Andrews	b74dd2e8c2	Use INSIST rather than REQUIRE to meet DBC usage rules	2023-01-20 11:05:24 +11:00
Mark Andrews	08c39736a9	isc_nm_listentcp: treat socket failures gracefully The old code didn't handle race conditions and errors on systems with non load balancing sockets gracefully. Look for an error on any child socket and if found close all the child sockets and return an error.	2023-01-20 11:05:24 +11:00
Mark Andrews	624f5a0dae	isc_nm_listenudp: treat socket failures gracefully The old code didn't handle race conditions and errors on systems with non load balancing sockets gracefully. Look for an error on any child socket and if found close all the child sockets and return an error.	2023-01-20 11:05:24 +11:00
Artem Boldariev	942569a1bb	Fix building BIND on DragonFly BSD (on both older and newer versions) This commit ensures that BIND and supplementary tools still can be built on newer versions of DragonFly BSD. It used to be the case, but somewhere between versions 6.2 and 6.4 the OS developers rearranged headers and moved some function definitions around. Before that the fact that it worked was more like a coincidence, this time we, at least, looked at the related man pages included with the OS. No in depth testing has been done on this OS as we do not really support this platform - so it is more like a goodwill act. We can, however, use this platform for testing purposes, too. Also, we know that the OS users do use BIND, as it is included in its ports directory. Building with './configure' and './configure --without-jemalloc' have been fixed and are known to work at the time the commit is made.	2023-01-20 00:19:12 +02:00
Aram Sargsyan	41dc48bfd7	Refactor isc_nm_xfr_allowed() Return 'isc_result_t' type value instead of 'bool' to indicate the actual failure. Rename the function to something not suggesting a boolean type result. Make changes in the places where the API function is being used to check for the result code instead of a boolean value.	2023-01-19 10:24:08 +00:00
Ondřej Surý	5abbcdadaf	Use thread_local EVP_MD in isc_iterated_hash() Cherry-pick small fixup commit from 9.18/9.16 branches needed for thread-safety. This fixup commit is not needed for 9.19+ because of reworked application setup, but it decouples isc_iterated_hash and isc_md units and keeps all the branches in sync.	2023-01-18 23:33:43 +01:00
Ondřej Surý	f3753d591f	Use thread_local EVP_MD_CTX in isc_iterated_hash() As this code is on hot path (NSEC3) this introduces an additional optimization of the EVP_MD API - instead of calling EVP_MD_CTX_new() on every call to isc_iterated_hash(), we create two thread_local objects for each thread - a basectx and mdctx, initialize basectx once and then use EVP_MD_CTX_copy_ex() to flip the initialized state into mdctx. This saves us a couple more valuable microseconds from the isc_iterated_hash() call.	2023-01-18 19:36:21 +01:00
Ondřej Surý	25db8d0103	Use OpenSSL 1.x SHA_CTX API in isc_iterated_hash() If the OpenSSL SHA1_{Init,Update,Final} API is still available, use it. The API has been deprecated in OpenSSL 3.0, but it is significantly faster than EVP_MD API, so make an exception here and keep using it until we can't.	2023-01-18 19:36:17 +01:00
Ondřej Surý	36654df732	Use OpenSSL EVP_MD API directly in isc_iterated_hash() Instead of going through another layer, use OpenSSL EVP_MD API directly in the isc_iterated_hash() implementation. This shaves off a couple of microseconds in the microbenchmark.	2023-01-18 18:32:57 +01:00
Ondřej Surý	e6bfb8e456	Avoid implicit algorithm fetch for OpenSSL EVP_MD family The implicit algorithm fetch causes a lock contention and significant slowdown for small input buffers. For more details, see: https://github.com/openssl/openssl/issues/19612 Instead of using EVP_DigestInit_ex() initialize empty MD_CTX objects for each algorithm and use EVP_MD_CTX_copy_ex() to initialize MD_CTX from a static copy. Additionally avoid implicit algorithm fetching by using EVP_MD_fetch() for OpenSSL 3.0.	2023-01-18 18:32:57 +01:00
Tony Finch	290899661d	Fix a typo in the NS_PER_ macros Milliseconds and microseconds were swapped.	2023-01-16 20:33:57 +00:00
Ondřej Surý	d07c4a98da	Prefer the pthread_barrier implementation over uv_barrier Prefer the pthread_barrier implementation on platforms where it is available over uv_barrier implementation. This also solves the problem with thread sanitizer builds on macOS that doesn't have pthread barrier.	2023-01-11 09:51:02 +01:00
Ondřej Surý	d06602f036	Get rid of locking during UDP and TCP listen We already have a synchronization mechanism when starting the UDP and TCP listener children - barriers. Change how we start the first-born child (tid == 0), so we don't have to race for sock->parent->result and sock->parent->fd.	2023-01-11 07:17:46 +01:00
Ondřej Surý	10f884a5b8	Remove unused isc_astack unit The isc_astack unit is now unused, so just remove it.	2023-01-10 20:31:24 +01:00
Ondřej Surý	359faf2ff7	Convert isc_astack usage in netmgr to mempool and ISC_LIST Change the per-socket inactive uvreq cache (implemented as isc_astack) to per-worker memory pool. Change the per-socket inactive nmhandle cache (implemented as isc_astack) to unlocked per-socket ISC_LIST.	2023-01-10 20:31:24 +01:00
Ondřej Surý	5bbba0d1a1	Simplify tracing the reference counting in isc_netmgr Always track the per-worker sockets in the .active_sockets field in the isc__networker_t struct and always track the per-socket handles in the .active_handles field in the isc_nmsocket_t struct.	2023-01-10 19:57:39 +01:00
Mark Andrews	349c23dbb7	Accept 'in=NULL' with 'inlen=0' in isc_{half}siphash24 Arithmetic on NULL pointers is undefined. Avoid arithmetic operations when 'in' is NULL and require 'in' to be non-NULL if 'inlen' is not zero.	2023-01-10 17:52:56 +11:00
Evan Hunt	916ea26ead	remove nonfunctional DSCP implementation DSCP has not been fully working since the network manager was introduced in 9.16, and has been completely broken since 9.18. This seems to have caused very few difficulties for anyone, so we have now marked it as obsolete and removed the implementation. To ensure that old config files don't fail, the code to parse dscp key-value pairs is still present, but a warning is logged that the feature is obsolete and should not be used. Nothing is done with configured values, and there is no longer any range checking.	2023-01-09 12:15:21 -08:00
Evan Hunt	9c577e10c3	use separate barriers for "stop" and "listen" operations On some platforms, when a synchronizing barrier is cleared, one thread can progress while other threads are still in the process of releasing the barrier. If a barrier is reused by the progressing thread during this window, it can cause a deadlock. This can occur if, for example, we stop listening immediately after we start, because the stop and listen functions both use socket->barrier. This has been addressed by using separate barrier objects for stop and listen.	2023-01-07 16:30:21 -08:00
Ondřej Surý	6613f89c62	Enhance the isc_loop unit to allow reference count tracking Use ISC_REFCOUNT_TRACE_{IMPL,DECL} to allow better isc_loop reference tracking - use `#define ISC_LOOP_TRACE 1` in <isc/loop.h> to enable.	2023-01-05 12:33:15 +00:00
Ondřej Surý	6553927d27	Enforce strong thread-affinity on StreamDNS sockets Add a check that the isc__nm_streamdns_read(), isc__nm_streamdns_send(), and isc__nm_streamdns_close() are being called from the matching thread.	2023-01-05 09:43:09 +01:00
Mark Andrews	096b280b1c	Do not pass NULL pointer to memmove - undefined behaviour Check if 'old_base' is NULL and if so skip calling memmove.	2023-01-03 14:40:30 +11:00
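The guard described in the commit above can be sketched as follows. This is a minimal illustration, not BIND's code: the `grow_buffer()` helper and its parameters are hypothetical, and only the idea of skipping `memmove()` when the old base pointer is NULL comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not BIND's actual code): grow a buffer from
 * 'old_len' to 'new_len' bytes, preserving the old contents.
 * Passing NULL to memmove() is undefined behaviour even when the
 * length is zero, so the copy is skipped when there is no old buffer.
 */
static unsigned char *
grow_buffer(unsigned char *old_base, size_t old_len, size_t new_len) {
	assert(new_len >= old_len && new_len > 0);
	assert(old_base != NULL || old_len == 0);

	unsigned char *new_base = malloc(new_len);
	if (new_base == NULL) {
		return NULL;
	}
	if (old_base != NULL) {
		/* Only touch the old buffer when it actually exists. */
		memmove(new_base, old_base, old_len);
		free(old_base);
	}
	return new_base;
}
```

The first call for a fresh buffer passes `old_base == NULL` with `old_len == 0`, which is exactly the case the check protects.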
Artem Boldariev	fbf1546fb8	TLS: use isc_buffer_t for send requests This commit replaces ad-hoc code for send request buffer management within TLS with the one based on isc_buffer_t. The previous version of the code was trying to use pre-allocated small buffers to avoid extra allocations. The code would allocate a larger dynamic buffer when needed. There is no need to have ad-hoc code for this, as isc_buffer_t now provides this functionality internally. In addition, the old version of the code lacked any logic to reuse the dynamically allocated buffers. Now, as we do not manage memory buffers, but isc_buffer_t objects, we can implement this strategy. It can be particularly helpful for longer lasting connections, as in this case the buffer will adjust itself to the size of the messages being transferred. That is, it is particularly useful for XoT, as Stream DNS happens to order send requests in such a way that the send request will get reused.	2022-12-30 19:56:25 +02:00
Artem Boldariev	7962e7f575	tlsctx_client_session_cache_new() -> tlsctx_client_session_create() Additionally to renaming, it changes the function definition so that it accepts a pointer to pointer instead of returning a pointer to the new object. It is mostly done to make it in line with other functions in the module.	2022-12-23 11:10:11 +02:00
Artem Boldariev	f102df96b8	Rename isc_tlsctx_cache_new() -> isc_tlsctx_cache_create() Additionally to renaming, it changes the function definition so that it accepts a pointer to pointer instead of returning a pointer to the new object. It is mostly done to make it in line with other functions in the module.	2022-12-23 11:10:11 +02:00
Ondřej Surý	6cb6373b5a	Convert Stream DNS to use isc_buffer API Drop the whole isc_dnsbuffer API and use new improved isc_buffer API that provides same functionality as the isc_dnsbuffer unit now.	2022-12-20 22:13:53 +02:00
Artem Boldariev	0a7e83feea	StreamDNS: Use isc__nm_senddns() to send DNS messages This commit modifies the Stream DNS message so that it uses the optimised code path (isc__nm_senddns()) for sending DNS messages over the underlying transport. This way we avoid allocating any intermediate memory buffers needed to render a DNS message with its length pre-pended ahead of the contents (TCP DNS message format).	2022-12-20 22:13:53 +02:00
Artem Boldariev	cb6f3dc3c8	TLS: isc__nm_senddns() support This commit adds support for isc_nm_senddns() to the generic TLS code.	2022-12-20 22:13:53 +02:00
Artem Boldariev	ad876a65af	Add isc__nm_senddns() The new internal function works in the same way as isc_nm_send() except that it sends a DNS message size ahead of the DNS message data (the format used in DNS over TCP). The intention is to provide a fast path for sending DNS messages over stream protocols - that is, without allocating any intermediate memory buffers.	2022-12-20 22:13:53 +02:00
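The framing described in the commit above (a message size sent ahead of the message data) can be sketched with vectored I/O. This is an assumption-laden illustration rather than the netmgr implementation: `dns_frame_iov()` is a hypothetical helper, and POSIX `struct iovec` stands in for whatever buffer type the real code passes to the transport.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical sketch (not the netmgr implementation): describe a DNS
 * message in TCP framing as two I/O vectors, a 2-byte big-endian length
 * followed by the message itself. A vectored write such as writev() can
 * then send both without copying them into one intermediate buffer.
 */
static void
dns_frame_iov(uint8_t lenbuf[2], const uint8_t *msg, size_t msglen,
	      struct iovec iov[2]) {
	assert(lenbuf != NULL && iov != NULL);
	assert(msg != NULL && msglen > 0 && msglen <= UINT16_MAX);

	/* DNS over TCP prefixes each message with its length. */
	lenbuf[0] = (uint8_t)(msglen >> 8);
	lenbuf[1] = (uint8_t)(msglen & 0xff);

	iov[0].iov_base = lenbuf;
	iov[0].iov_len = 2;
	iov[1].iov_base = (void *)msg; /* writev() only reads from it */
	iov[1].iov_len = msglen;
}
```

A caller would pass the filled `iov` array to `writev(fd, iov, 2)`; the length prefix lives in a two-byte scratch array, so no buffer of `msglen + 2` bytes is ever allocated.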
Artem Boldariev	56732ac2a0	TLS: try to avoid allocating send request objects This commit optimises TLS send request object allocation to enable send request object reuse, somewhat reducing pressure on the memory manager. It is especially helpful in the case when Stream DNS uses the TLS implementation as the transport.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4277eeeb9c	Remove TLS DNS transport (and parts common with TCP DNS) This commit removes TLS DNS transport superseded by Stream DNS.	2022-12-20 22:13:53 +02:00
Artem Boldariev	e5649710d3	Remove TCP DNS transport This commit removes TCP DNS transport superseded by Stream DNS.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4524bf4083	Make isc_nm_tlssocket non-optional This commit unties generic TLS code (isc_nm_tlssocket) from DoH, so that it will be available regardless of the fact if BIND was built with DNS over HTTP support or not.	2022-12-20 22:13:53 +02:00
Artem Boldariev	efe4267044	DoH: use isc_nmhandle_set_tcp_nodelay() This commit replaces ad-hoc code for disabling Nagle's algorithm with a call to isc_nmhandle_set_tcp_nodelay().	2022-12-20 22:13:53 +02:00
Artem Boldariev	e89575ddce	StreamDNS: opportunistically disable Nagle's algorithm This commit ensures that Stream DNS code attempts to disable Nagle's algorithm regardless of underlying stream transport (TCP or TLS), as we are not interested in trading latency for throughput when dealing with DNS messages.	2022-12-20 22:13:53 +02:00
Artem Boldariev	05cfb27b80	Disable Nagle's algorithm for TLS connections by default This commit ensures that Nagle's algorithm is disabled by default for TLS connections on best effort basis, just like other networking software (e.g. NGINX) does, as, in the case of TLS, we are not interested in trading latency for throughput, rather vice versa. We attempt to disable it as early as we can, right after TCP connections establishment, as an attempt to speed up handshake handling.	2022-12-20 22:13:53 +02:00
Artem Boldariev	371b02f37a	TCP: make it possible to set Nagle's algorithm state via handle This commit adds the ability to turn Nagle's algorithm on or off via a connection handle. It adds the isc_nmhandle_set_tcp_nodelay() function as the public interface for this functionality.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4606384345	Extend isc__nm_socket_tcp_nodelay() to accept value This makes it possible to both enable and disable Nagle's algorithm for a TCP socket descriptor, before the change it was possible only to disable it.	2022-12-20 22:13:53 +02:00
Artem Boldariev	f395cd4b3e	Add isc_nm_streamdnssocket (aka Stream DNS) This commit adds an initial implementation of isc_nm_streamdnssocket transport: a unified transport for DNS over stream protocols messages, which is capable of replacing both TCP DNS and TLS DNS transports. Currently, the interface it provides is a unified set of interfaces provided by both of the transports it attempts to replace. The transport is built around "isc_dnsbuffer_t" and "isc_dnsstream_assembler_t" objects and attempts to minimise both the number of memory allocations during network transfers as well as memory usage.	2022-12-20 22:13:51 +02:00
Artem Boldariev	338cf3e467	Add isc_dnsstream_assembler_t implementation This commit adds the implementation for an "isc_dnsstream_assembler_t" object. The object is built on top of "isc_dnsbuffer_t" and is intended to encapsulate the state machine used for handling DNS messages received in the format used for messages transmitted over TCP. The idea is that the object accepts the input data received from a socket, tries to assemble DNS messages from the incoming data and calls the callback which contains the status of the incoming data as well as a pointer to the memory region referencing the data of the assembled message. It is capable of assembling DNS messages no matter how torn apart they are when sent over network. The following statuses might be passed to the callback: * ISC_R_SUCCESS - a message has been successfully assembled; * ISC_R_NOMORE - not enough data has been processed to assemble a message; * ISC_R_RANGE - there was an attempt to process a zero-sized DNS message (someone attempts to send us junk data). One could say that the object replaces the implementation of "isc__nm__processbuffer()" functions used by the old TCP DNS and TLS DNS transports with a better defined state machine completely decoupled from the networking code itself. Such a design makes it trivial to write unit tests for it, leading to better verification of its correctness. Another important difference is directly related to the fact that it is built on top of "isc_dnsbuffer_t", which tries to manage memory in a smart way. In particular: * It tries to use a static buffer for smaller messages, reducing pressure on the memory manager (hot path); * When allocating dynamic memory for larger messages, it tries to allocate memory conservatively (generic path). These characteristics are a significant upgrade over the older logic where a 64KB(+2 bytes) buffer was allocated from dynamic memory regardless of whether we need a buffer this large or not. 
That is, lesser memory usage is expected in a generic case for DNS transports built on top of "isc_dnsstream_assembler_t."	2022-12-20 21:24:44 +02:00
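The state machine described in the commit above can be sketched as follows. This is a simplified illustration, not the `isc_dnsstream_assembler_t` code: all names are hypothetical, the `ASM_*` codes merely stand in for `ISC_R_SUCCESS`, `ISC_R_NOMORE` and `ISC_R_RANGE`, and a fixed 64 KB array replaces the memory-conscious `isc_dnsbuffer_t`. Two behaviours are choices of this sketch only: it reports `ASM_NOMORE` whenever the input runs out, and it drops the rest of the input after a zero-sized message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes standing in for the ISC_R_* results. */
typedef enum { ASM_SUCCESS, ASM_NOMORE, ASM_RANGE } asm_status_t;

typedef void (*asm_cb_t)(asm_status_t status, const uint8_t *msg,
			 size_t len, void *arg);

typedef struct {
	uint8_t buf[2 + UINT16_MAX]; /* length prefix + largest message */
	size_t used;		     /* bytes collected for current message */
	asm_cb_t cb;
	void *arg;
} assembler_t;

/* Feed raw stream bytes; the callback fires once per assembled message. */
static void
assembler_feed(assembler_t *a, const uint8_t *data, size_t len) {
	assert(a != NULL && a->cb != NULL);
	assert(data != NULL || len == 0);

	for (;;) {
		size_t want = 2; /* until the length prefix is complete */
		if (a->used >= 2) {
			size_t msglen = ((size_t)a->buf[0] << 8) | a->buf[1];
			if (msglen == 0) {
				/* Zero-sized message: junk input. */
				a->used = 0;
				a->cb(ASM_RANGE, NULL, 0, a->arg);
				return;
			}
			want = 2 + msglen;
		}
		if (a->used == want) {
			/* A whole message is buffered: hand it over. */
			a->cb(ASM_SUCCESS, a->buf + 2, want - 2, a->arg);
			a->used = 0;
			continue;
		}
		if (len == 0) {
			a->cb(ASM_NOMORE, NULL, 0, a->arg);
			return;
		}
		size_t take = want - a->used;
		if (take > len) {
			take = len;
		}
		memcpy(a->buf + a->used, data, take);
		a->used += take;
		data += take;
		len -= take;
	}
}

/* Example callback: counts messages and remembers the last status. */
typedef struct {
	unsigned int messages;
	size_t last_len;
	asm_status_t last_status;
} stats_t;

static void
count_cb(asm_status_t status, const uint8_t *msg, size_t len, void *arg) {
	stats_t *st = arg;
	(void)msg;
	st->last_status = status;
	if (status == ASM_SUCCESS) {
		st->messages++;
		st->last_len = len;
	}
}
```

Because the assembler only sees bytes and a callback, it can be exercised without any sockets, which is the testability benefit the commit message points out.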
Artem Boldariev	cbb758abd4	Add isc_dnsbuffer_t implementation This commit adds "isc_dnsbuffer_t" object implementation, a thin wrapper on top of "isc_buffer_t" which has the following characteristics: * provides interface specifically attuned for handling/generating DNS messages, especially in the format used for DNS messages over TCP; * avoids allocating dynamic memory when handling small DNS messages, while transparently switching to using dynamic memory when handling larger messages. This approach significantly reduces pressure on the memory allocator, as most of the DNS messages are small.	2022-12-20 21:24:44 +02:00
Artem Boldariev	c0c59b55ab	TLS: add an internal function isc__nmhandle_get_selected_alpn() The added function provides the interface for getting an ALPN tag negotiated during TLS connection establishment. The new function can be used by higher level transports.	2022-12-20 21:24:44 +02:00
Artem Boldariev	15e626f1ca	TLS: add manual read timer control mode This commit adds manual read timer control mode, similarly to TCP. This way the read timer can be controlled manually using: * isc__nmsocket_timer_start(); * isc__nmsocket_timer_stop(); * isc__nmsocket_timer_restart(). The change is required to make it possible to implement more sophisticated read timer control policies in DNS transports, built on top of TLS.	2022-12-20 21:24:44 +02:00
Artem Boldariev	9aabd55725	TCP: add manual read timer control mode This commit adds a manual read timer control mode to the TCP code (adding isc__nmhandle_set_manual_timer() as the interface to it). Manual read timer control mode suppresses restarting the read timer when receiving any amount of data. This way the read timer can be controlled manually using: * isc__nmsocket_timer_start(); * isc__nmsocket_timer_stop(); * isc__nmsocket_timer_restart(). The change is required to make it possible to implement more sophisticated read timer control policies in DNS transports, built on top of TCP.	2022-12-20 21:24:44 +02:00
Artem Boldariev	f4760358f8	TLS: expose the ability to (re)start and stop underlying read timer This commit adds implementation of isc__nmsocket_timer_restart() and isc__nmsocket_timer_stop() for generic TLS code in order to make its interface more compatible with that of TCP.	2022-12-20 21:24:44 +02:00
Artem Boldariev	f18a9b3743	TLS: add isc__nmsocket_timer_running() support This commit adds isc__nmsocket_timer_running() support to the generic TLS code in order to make it more compatible with TCP.	2022-12-20 21:24:44 +02:00
Artem Boldariev	c0808532e1	TLS: isc_nm_bad_request() and isc__nmsocket_reset() support This commit adds implementations of isc_nm_bad_request() and isc__nmsocket_reset() to the generic TLS stream code in order to make it more compatible with TCP code.	2022-12-20 21:24:44 +02:00
Artem Boldariev	94e650ce89	Use 'restrict' and 'const' for 'isc_buffer_t' The purpose of this commit is to aid the compiler in generating better code when working with `isc_buffer_t` objects by using restricted pointers (and, to a lesser extent, the 'const' modifier for read-only arguments). This way we, basically, instruct the compiler that the members of structures passed by pointer into the functions can be treated as local variables in the scope of a function. That should reduce the number of load/store operations emitted by compilers when accessing objects (e.g. 'isc_buffer_t') via pointers.	2022-12-20 21:01:27 +02:00
Ondřej Surý	460afcda18	Add isc_buffer_trycompact() function needed for StreamDNS Add isc_buffer_trycompact() that's an optimization; it will compact the buffer only when the remaining length is smaller than used length.	2022-12-20 19:13:48 +01:00
Ondřej Surý	e6062ee3ae	Add isc_buffer_setmctx() and isc_buffer_clearmctx() function Add two extra functions needed by StreamDNS: 1. isc_buffer_setmctx() sets the buffer internal memory context, so we can use isc_buffer_reserve() on the buffer. For this, we also need to track whether the .base was dynamically allocated or not. This needs to be called after isc_buffer_init() and before first isc_buffer_reserve() call. 2. isc_buffer_clearmctx() clears the buffer internal memory context, and frees any dynamically allocated buffer. This needs to be called after the last isc_buffer_reserve() call and before calling the isc_buffer_invalidate()	2022-12-20 19:13:48 +01:00
Ondřej Surý	8e3a86f6dd	Make the isc_buffer unit header-only The isc_buffer is often used in the hot-path, so make it header-only implementation.	2022-12-20 19:13:48 +01:00
Ondřej Surý	2ddea1e41c	Add a static pre-allocated buffer to isc_buffer_t When the buffer is allocated via isc_buffer_allocate() and the size is smaller than or equal to ISC_BUFFER_STATIC_SIZE (currently 512 bytes), the buffer will be allocated as a flexible array member in the buffer structure itself instead of allocating it on the heap. This should help when the buffer is used on the hot-path with small allocations.	2022-12-20 19:13:48 +01:00
Ondřej Surý	6bd2b34180	Enable auto-reallocation for all isc_buffer_allocate() buffers When isc_buffer_t buffer is created with isc_buffer_allocate() assume that we want it to always auto-reallocate instead of having an extra call to enable auto-reallocation.	2022-12-20 19:13:48 +01:00
Ondřej Surý	135ec7a0f0	Remove single use isc_buffer_putdecint() function The isc_buffer_putdecint() could be easily replaced with isc_buffer_printf() with just a small overhead of calling vsnprintf() twice instead of once. This is not on a hot-path (dns_catz unit), so we can ignore the overhead and instead have less single-use code in favor of using a reusable, more generic function.	2022-12-20 19:13:48 +01:00
Ondřej Surý	2a94123d5b	Refactor the isc_buffer_{get,put}uintN, add isc_buffer_peekuintN The Stream DNS implementation needs peek methods that read the value from the buffer without advancing the current position. Add isc_buffer_peekuintN methods, refactor the isc_buffer_{get,put}uintN methods to modern integer types, and move the isc_buffer_getuintN to the header as static inline functions.	2022-12-20 19:13:48 +01:00
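The difference between a peek and a get, as described in the commit above, can be sketched like this. The `buf_t` type and both functions are hypothetical and much simpler than the real `isc_buffer_t` API; only the idea of reading a value without advancing the current position is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified buffer: NOT the real isc_buffer_t layout. */
typedef struct {
	const uint8_t *base;
	size_t used;	/* bytes of valid data */
	size_t current; /* read position */
} buf_t;

/* Read a big-endian 16-bit value without moving the read position. */
static int
buf_peekuint16(const buf_t *b, uint16_t *valp) {
	assert(b != NULL && valp != NULL);
	assert(b->current <= b->used);

	if (b->used - b->current < 2) {
		return -1; /* not enough data yet */
	}
	const uint8_t *cp = b->base + b->current;
	*valp = (uint16_t)((cp[0] << 8) | cp[1]);
	return 0;
}

/* Same read, but consume the two bytes on success. */
static int
buf_getuint16(buf_t *b, uint16_t *valp) {
	int result = buf_peekuint16(b, valp);
	if (result == 0) {
		b->current += 2;
	}
	return result;
}
```

A stream parser can peek at the 2-byte length prefix, check whether the whole message has arrived, and only then consume the prefix together with the message.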
Ondřej Surý	a1d45685e6	Move and extend the uint8_t low-endian to uint{32,64}t to endian.h Move the U8TO{32,64}_LE and U{32,64}TO8_LE macros to endian.h and extend the macros for 16-bit and Big-Endian variants. Use the macros both in isc_siphash (LE) and isc_buffer (BE) units.	2022-12-20 19:13:48 +01:00
Ondřej Surý	aea251f3bc	Change the isc_buffer_reserve() to take just buffer pointer The isc_buffer_reserve() would be passed a reference to the buffer pointer, which was unnecessary as the pointer would never be changed in the current implementation. Remove the extra dereference.	2022-12-20 19:13:48 +01:00
Ondřej Surý	52307f8116	Add internal logging functions to the netmgr Add internal logging functions isc__netmgr_log, isc__nmsocket_log(), and isc__nmhandle_log() that can be used to add logging messages to the netmgr, and change all direct use of isc_log_write() to use those logging functions to properly prefix them with netmgr, nmsocket and nmsocket+nmhandle.	2022-12-14 19:34:48 +01:00
Ondřej Surý	7cefcb6184	Allow zero length keys in isc_hashmap In case, we are trying to hash the empty key into the hashmap, the key is going to have zero length. This might happen in the unit test. Allow this and add a unit test to ensure the empty zero-length key doesn't hash to slot 0 as SipHash 2-4 (our hash function of choice) has no problem with zero-length inputs.	2022-12-14 17:59:07 +01:00
Artem Boldariev	837fef78b1	Fix TLS session resumption via IDs when Mutual TLS is used This commit fixes TLS session resumption via session IDs when client certificates are used. To do so it makes sure that session ID contexts are set within server TLS contexts. See OpenSSL documentation for 'SSL_CTX_set_session_id_context()', the "Warnings" section.	2022-12-14 18:06:20 +02:00
Ondřej Surý	e2262c2112	Remove isc_resource API and set limits directly in named_os unit The only function left in the isc_resource API was setting the file limit. Replace the whole unit with a simple getrlimit to check the maximum value of RLIMIT_NOFILE and set the maximum back to rlimit_cur. This is more compatible than trying to set RLIMIT_UNLIMITED on the RLIMIT_NOFILE as it doesn't work on Linux (see man 5 proc on /proc/sys/fs/nr_open), nor does it on the Darwin kernel (see man 2 getrlimit). The only place where the maximum value could be raised under a privileged user would be the BSDs, but `named_os_adjustnofile()` was not called there before. We would apply the increased limits only on Linux and Sun platforms.	2022-12-07 19:40:00 +01:00
Artem Boldariev	bed5e2bb08	TLS: check for sock->recv_cb when handling received data This commit adds a check if 'sock->recv_cb' might have been nullified during the call to 'sock->recv_cb'. That could happen, e.g. by an indirect call to 'isc_nmhandle_close()' from within the callback when wrapping up. In this case, let's close the TLS connection.	2022-12-02 13:20:37 +02:00
Artem Boldariev	8b7e123528	DoH: Avoid accessing non-atomic listener socket flags when accepting This commit ensures that the non-atomic flags inside a DoH listener socket object (and associated worker) are accessed when doing accept for a connection only from within the context of the dedicated thread, but not other worker threads. The purpose of this commit is to avoid TSAN errors during isc__nmsocket_closing() calls. It is a continuation of `4b5559cd8f`.	2022-12-02 12:16:12 +02:00
Artem Boldariev	4d0c226375	TLS: Avoid accessing non-atomic listener socket flags during HS This commit ensures that the non-atomic flags inside a TLS listener socket object (and associated worker) are accessed when doing handshake for a connection only from within the context of the dedicated thread, but not other worker threads. The purpose of this commit is to avoid TSAN errors during isc__nmsocket_closing() calls. It is a continuation of `4b5559cd8f`.	2022-12-02 12:16:12 +02:00
Artem Boldariev	4b5559cd8f	TLS: Avoid accessing listener socket flags from other threads This commit ensures that the flags inside a TLS listener socket object (and associated worker) are accessed when accepting a connection only from within the context of the dedicated thread, but not other worker threads.	2022-12-01 21:07:49 +02:00
Ondřej Surý	e3c628d562	Honour single read per client isc_nm_read() call in the TLSDNS The TLSDNS transport was not honouring the single read callback for TLSDNS client. It would call the read callbacks repeatedly in case the single TLS read would result in multiple DNS messages in the decoded buffer.	2022-12-01 18:31:05 +01:00
Artem Boldariev	2bfc079946	TLS stream: always handle send callbacks asynchronously This commit ensures that send callbacks are always called from within the context of its worker thread even in the case of a shutting down/inactive socket, just like the TCP transport does, with which TLS attempts to be as compatible as possible.	2022-11-30 18:09:52 +02:00
Artem Boldariev	ef659365ce	TLS Stream: use ISC_R_CANCELLED error when shutting down This commit changes ISC_R_NOTCONNECTED error code to ISC_R_CANCELLED when attempting to start reading data on the shutting down socket in order to make its behaviour compatible with that of TCP and not break the common code in the unit tests.	2022-11-30 18:09:52 +02:00
Artem Boldariev	fb9955a372	TLS Stream: fix isc_nm_read_stop() and reading flags handling It turned out that after the latest Network Manager refactoring 'sock->reading' flag was not processed correctly. Due to this isc_nm_read_stop() might not work as expected because reading from the underlying TCP socket could have been resumed in 'tls_do_bio()' regardless of the 'sock->reading' value. This bug did not seem to cause problems with DoH, so it was not noticed, but Stream DNS has stricter expectations regarding the underlying transport. In addition, the 'sock->recv_read' flag was completely ignored and the corresponding logic was completely unimplemented. That made it impossible to implement one fine detail compared to TCP: once reading is started, it could be satisfied by one datum reading. This commit fixes the issues above.	2022-11-30 18:09:52 +02:00
Ondřej Surý	50f357cb36	Refactor the dns_adb unit The dns_adb unit has been refactored to be much simpler. The following changes have been made: 1. Simplify the ADB to always allow GLUE and hints There were only two places where dns_adb_createfind() was used - in the dns_resolver unit where hints and GLUE addresses were ok, and in the dns_zone where dns_adb_createfind() would be called without DNS_ADBFIND_HINTOK and DNS_ADBFIND_GLUEOK set. Simplify the logic by allowing hint and GLUE addresses when looking up the nameserver addresses to notify. The difference is negligible and would cause a difference in the notified addresses only when there's mismatch between the parent and child addresses and we haven't cached the child addresses yet. 2. Drop the namebuckets and entrybuckets Formerly, the namebuckets and entrybuckets were used to reduce the lock contention when accessing the double-linked lists stored in each bucket. In the previous refactoring, the custom hashtable for the buckets has been replaced with isc_ht/isc_hashmap, so only a single item (mostly, see below) would end up in each bucket. Removing the entrybuckets has been straightforward, the only matching was done on the isc_sockaddr_t member of the dns_adbentry. Removing the zonebuckets required GLUEOK and HINTOK bits to be removed because the find could match entries with-or-without the bits set, and creating a custom key that stores the DNS_ADBFIND_STARTATZONE in the first byte of the key, so we can do a straightforward lookup into the hashtable without traversing a list that contains items with different flags. 3. Remove unassociated entries from ADB database Previously, the adbentries could live in the ADB database even after unlinking them from dns_adbnames. Such entries would show up as "Unassociated entries" in the ADB dump. 
The benefit of keeping such entries is little - the chance that we link such entry to an adbname is small, and it's simpler to evict unlinked entries from the ADB cache (and the hashtable) than create a second LRU cleaning mechanism. Unlinked ADB entries are now directly deleted from the hash table (hashmap) upon destruction. 4. Cleanup expired entries from the hash table When buckets were still in place, the code would keep the buckets always allocated and never shrink the hash table (hashmap). With proper reference counting in place, we can delete the adbnames from the hash table and the LRU list. 5. Stop purging the names early when we hit the time limit Because the LRU list is now time ordered, we can stop purging the names when we find a first entry that doesn't fulfil our time-based eviction criteria because no further entry on the LRU list will meet the criteria. Future work: 1. Lock contention In this commit, the focus was on correctness of the data structure, but in the future, the lock contention in the ADB database needs to be addressed. Currently, we use a simple mutex to lock the hash tables, because we almost always need to use a write lock for properly purging the hashtables. The ADB database needs to be sharded (similar to the effect that buckets had in the past). Each shard would contain its own hashmap and its own LRU list. 2. Time-based purging The ADB names and entries stay intact when there are no lookups. When we add separate shards, a timer needs to be added for time-based cleaning in case there's no traffic hashing to the inactive shard. 3. Revisit the 30 minutes limit The ADB cache is capped at 30 minutes. This needs to be revisited, and at least the limit should be configurable (in both directions).	2022-11-30 10:03:24 +01:00
Ondřej Surý	118ae66976	Add extra set of ISC_REFCOUNT_TRACE_{IMPL,DECL} macros The new ISC_REFCOUNT_TRACE_{IMPL,DECL} macros can be used to add a reference tracing capability to any unit using the reference counting. It requires a little bit of extra work in each header as you can't have a define from inside a define (see rpz.h), but it's fairly easy to add tracing to any struct using reference counting with these macros.	2022-11-29 23:57:40 -08:00
Artem Boldariev	9b1c8c03fd	TCP: use uv_try_write() to optimise sends This commit makes the TCP code use uv_try_write() on a best effort basis, just like the TCP DNS and TLS DNS code does. This optimisation was added in 'caa5b6548a11da6ca772d6f7e10db3a164a18f8d', but a similar change was mistakenly omitted for the generic TCP code. This commit fixes that.	2022-11-29 13:41:10 +02:00
Michal Nowak	afdb41a5aa	Update sources to Clang 15 formatting	2022-11-29 08:54:34 +01:00
Ondřej Surý	d8df29e37d	Be more resilient when destroying the httpd requests Don't restart reading in the send callback after the httpdmgr has been shut down, and call httpd_request(..., ISC_R_SHUTDOWN, ...) when shutting down the httpdmgr to reduce code duplication.	2022-11-25 16:20:34 +01:00
Ondřej Surý	f3004da3a5	Make the netmgr send callback to be asynchronous only when needed Previously, the send callback would be synchronous only on success. Add an option (similar to what other callbacks have) to decide whether we need the asynchronous send callback on a higher level. On a general level, we need the asynchronous callbacks to happen only when we are invoking the callback from the public API. If the path to the callback went through the libuv callback or netmgr callback, we are already on asynchronous path, and there's no need to make the call to the callback asynchronous again. For the send callback, this means we need the asynchronous path for failure paths inside the isc_nm_send() (which calls isc__nm_udp_send(), isc__nm_tcp_send(), etc...) - all other invocations of the send callback could be synchronous, because those are called from the respective libuv send callbacks.	2022-11-25 15:46:25 +01:00
Ondřej Surý	5ca49942a3	Make the netmgr read callback to be asynchronous only when needed Previously, the read callback would be synchronous only on success or timeout. Add an option (similar to what other callbacks have) to decide whether we need the asynchronous read callback on a higher level. On a general level, we need the asynchronous callbacks to happen only when we are invoking the callback from the public API. If the path to the callback went through the libuv callback or netmgr callback, we are already on asynchronous path, and there's no need to make the call to the callback asynchronous again. For the read callback, this means we need the asynchronous path for failure paths inside the isc_nm_read() (which calls isc__nm_udp_read(), isc__nm_tcp_read(), etc...) - all other invocations of the read callback could be synchronous, because those are called from the respective libuv or netmgr read callbacks.	2022-11-25 15:46:15 +01:00
Tony Finch	00307fe318	Deduplicate time unit conversion factors The various factors like NS_PER_MS are now defined in a single place and the names are no longer inconsistent. I chose the _PER_SEC names rather than _PER_S because it is slightly more clear in isolation; but the smaller units are always NS, US, and MS.	2022-11-25 13:23:36 +00:00
Ondřej Surý	e4654d1a6a	Bump the allowed HTTP headers in statschannel to 100 Firefox 90+ apparently sends more than 10 headers, so we need to bump the number to some higher number. Bump it to 100 just to be on the safe side, this is for internal use only anyway.	2022-11-10 16:34:26 +01:00
Ondřej Surý	f46ce447a6	Add isc_hashmap API that implements Robin Hood hashing Add new isc_hashmap API that differs from the current isc_ht API in several aspects: 1. It implements Robin Hood Hashing which is an open-addressing hash table algorithm (i.e. no linked-lists) 2. No memory allocations - the array to store the nodes is made of isc_hashmap_node_t structures instead of just pointers, so there's only allocation on resize. 3. The key is not copied into the hashmap node and must be also stored externally, either as part of the stored value or in any other location that's valid as long as the value is stored in the hashmap. This makes the isc_hashmap_t a little less universal because of the key storage requirements, but the inserts and deletes are faster because they don't require memory allocation on isc_hashmap_add() and memory deallocation on isc_hashmap_delete().	2022-11-10 15:07:19 +01:00
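The three properties listed in the commit above (open addressing, nodes stored inline, keys kept outside the table) can be illustrated with a small sketch. This is not the `isc_hashmap` code: every name below is hypothetical, the table has a fixed size and never resizes, deletion is omitted, and FNV-1a stands in for the SipHash function that the real code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 16 /* power of two; this sketch never resizes */

typedef struct {
	const char *key; /* not copied: the caller keeps it alive */
	void *value;
	uint32_t hash;
	uint32_t psl; /* probe sequence length: distance from ideal slot */
} node_t;

typedef struct {
	node_t nodes[TABLE_SIZE]; /* nodes stored inline, not as pointers */
	size_t count;
} table_t;

/* FNV-1a, used here only as a simple stand-in hash function. */
static uint32_t
hash_key(const char *key) {
	uint32_t h = 2166136261u;
	for (; *key != '\0'; key++) {
		h = (h ^ (uint8_t)*key) * 16777619u;
	}
	return h;
}

/* Insert without allocating; returns false if full or key exists. */
static bool
table_add(table_t *t, const char *key, void *value) {
	assert(t != NULL && key != NULL);
	if (t->count == TABLE_SIZE) {
		return false;
	}
	node_t cur = { .key = key, .value = value, .hash = hash_key(key) };
	size_t pos = cur.hash & (TABLE_SIZE - 1);
	for (;;) {
		node_t *slot = &t->nodes[pos];
		if (slot->key == NULL) {
			*slot = cur;
			t->count++;
			return true;
		}
		if (slot->hash == cur.hash && strcmp(slot->key, cur.key) == 0) {
			return false;
		}
		if (slot->psl < cur.psl) {
			/* Robin Hood step: take the slot from the "richer"
			 * resident and keep probing on its behalf. */
			node_t tmp = *slot;
			*slot = cur;
			cur = tmp;
		}
		cur.psl++;
		pos = (pos + 1) & (TABLE_SIZE - 1);
	}
}

static void *
table_find(const table_t *t, const char *key) {
	assert(t != NULL && key != NULL);
	uint32_t hash = hash_key(key);
	size_t pos = hash & (TABLE_SIZE - 1);
	for (uint32_t psl = 0; psl < TABLE_SIZE; psl++) {
		const node_t *slot = &t->nodes[pos];
		if (slot->key == NULL || slot->psl < psl) {
			return NULL; /* the key would have been placed here */
		}
		if (slot->hash == hash && strcmp(slot->key, key) == 0) {
			return slot->value;
		}
		pos = (pos + 1) & (TABLE_SIZE - 1);
	}
	return NULL;
}
```

Because each node stores only a pointer to the key, the strings passed to `table_add()` must outlive their entries, which mirrors the key storage requirement noted in the commit message.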
Ondřej Surý	9d2f22e666	Properly name the loop->mctx The per-loop memory contexts were unnamed, properly name them as 'loop<tid>'.	2022-11-08 13:32:13 +01:00
Ondřej Surý	0492bbf590	Make the pthread_rwlock implementation header-only macros [2/2] While using mutrace, the pthread-rwlock based isc_rwlock implementation would be all tracked in the rwlock.c unit losing all useful information as all rwlocks would be traced in a single place. Rewrite the pthread_rwlock based implementation to be header-only macros, so we can use mutrace to properly track the rwlock contention without heavily patching mutrace to understand the libisc synchronization primitives.	2022-11-02 10:34:10 +01:00
Ondřej Surý	6bd201ccec	Remove one level of indirection from isc_rwlock [1/2] Instead of checking the PTHREAD_RUNTIME_CHECK from the header, move it to the pthread_rwlock implementation functions. The internal isc_rwlock actually cannot fail, so the checks in the header were useless anyway.	2022-11-02 10:27:09 +01:00
Ondřej Surý	98b7a93772	Remove isc_rwlock_downgrade() from isc_rwlock The isc_rwlock_downgrade() is not used anywhere, so we can remove it and make the pthread_rwlock implementation simpler.	2022-11-02 09:05:37 +01:00
Evan Hunt	dc878e3098	isc_async_run() runs events in reverse order when more than one event was scheduled in the isc_async queue, they were executed in reverse order. we need to pull events off the back of the queue instead of the front, so that uv_loop will run them in the right order. note that isc_job_run() has the same behavior, because it calls uv_idle_start() directly. in that case we just document it so it'll be less surprising in the future.	2022-10-31 05:43:45 -07:00
Mark Andrews	3881afeb15	Add dns_rdata_checksvcb dns_rdata_checksvcb performs data entry checks on SVCB records. In particular, it checks that _dns SVCB records have an 'alpn' parameter and, if that 'alpn' parameter indicates that HTTP is in use, that 'dohpath' is present.	2022-10-29 00:22:54 +11:00
Ondřej Surý	6ba0a22627	Change the return type of isc_lex_create() to void The isc_lex_create() cannot fail, so cleanup the return type from isc_result_t to void.	2022-10-26 12:55:06 +02:00
Evan Hunt	67c0128ebb	Fix an error when building with --disable-doh The netievent handler for isc_nmsocket_set_tlsctx() was inadvertently ifdef'd out when BIND was built with --disable-doh, resulting in an assertion failure on startup when DoT was configured.	2022-10-24 13:54:39 -07:00
Ondřej Surý	13959781cb	Serialize the HTTP/1.1 statschannel requests The statschannel truncated test still terminates abruptly sometimes and it doesn't return the answer for the first query. This might happen when the second process_request() discovers there's not enough space before the sending is complete and the connection is terminated before the client gets the data. Change the isc_http, so it pauses the reading when it receives the data and resumes it only after the sending has completed or there's an incomplete request waiting for more data. This makes the request processing slightly less efficient, but also less taxing for the server, because previously all requests that had been received via a single TCP read would be processed in the loop and the sends would be queued after the read callback had processed a full buffer.	2022-10-19 14:45:36 +02:00
Ondřej Surý	dfaae53b9a	Fix the non-developer build with OpenSSL 1.0.2 In non-developer build, a wrong condition prevented the isc__tls_malloc_ex, isc__tls_realloc_ex and isc__tls_free_ex to be defined. This was causing FTBFS on platforms with OpenSSL 1.0.2.	2022-10-19 14:41:10 +02:00
Artem Boldariev	09dcc914b4	TLS Stream: handle successful TLS handshake after listener shutdown It was possible for the accept callback to be called after listener shutdown. In such a case the callback pointer equals NULL, leading to a segmentation fault. This commit fixes that.	2022-10-18 18:30:24 +03:00
Ondřej Surý	5e20c2ccfb	Replace (void *)-1 with ISC_LINK_TOMBSTONE Instead of having "arbitrary" (void *)-1 to define non-linked, add a ISC_LINK_TOMBSTONE(type) macro that replaces the "magic" value with a define.	2022-10-18 11:36:15 +02:00
Ondřej Surý	cb3c36b8bf	Add ISC_{LIST,LINK}_INITIALIZER for designated initializers Since we are using designated initializers, we were missing initializers for ISC_LIST and ISC_LINK, add them, so you can do foo = (foo_t){ .list = ISC_LIST_INITIALIZER }; Instead of: foo = (foo_t){ 0 }; ISC_LIST_INIT(foo->list);	2022-10-18 11:36:15 +02:00
Artem Boldariev	5ab2c0ebb3	Synchronise stop listening operation for multi-layer transports This commit introduces a primitive isc__nmsocket_stop() which performs shutting down on a multilayered socket ensuring the proper order of the operations. The shared data within the socket object can be destroyed after the call completed, as it is guaranteed to not be used from within the context of other worker threads.	2022-10-18 12:06:00 +03:00
Tony Finch	26ed03a61e	Include the function name when reporting unexpected errors I.e. print the name of the function in BIND that called the system function that returned an error. Since it was useful for pthreads code, it seems worthwhile doing so everywhere.	2022-10-17 13:43:59 +01:00
Tony Finch	a34a2784b1	De-duplicate some calls to strerror_r() Specifically, when reporting an unexpected or fatal error.	2022-10-17 11:58:26 +01:00
Tony Finch	ec50c58f52	De-duplicate __FILE__, __LINE__ Mostly generated automatically with the following semantic patch, except where coccinelle was confused by #ifdef in lib/isc/net.c @@ expression list args; @@ - UNEXPECTED_ERROR(__FILE__, __LINE__, args) + UNEXPECTED_ERROR(args) @@ expression list args; @@ - FATAL_ERROR(__FILE__, __LINE__, args) + FATAL_ERROR(args)	2022-10-17 11:58:26 +01:00
Artem Boldariev	d62eb206f7	Fix isc_nmsocket_set_tlsctx() During loop manager refactoring isc_nmsocket_set_tlsctx() was not properly adapted. The function is expected to broadcast the new TLS context for every worker, but this behaviour was accidentally broken.	2022-10-14 23:06:31 +03:00
Ondřej Surý	cedfc97974	Improve reporting for pthread_once errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/once.h with PTHREADS_RUNTIME_CHECK(), in order to improve error reporting for any once-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-10-14 16:39:21 +02:00
Ondřej Surý	beecde7120	Rewrite isc_httpd using picohttpparser and isc_url_parse Rewrite the isc_httpd to be more robust. 1. Replace the hand-crafted HTTP request parser with picohttpparser for parsing the whole HTTP/1.0 and HTTP/1.1 requests. Limit the number of allowed headers to 10 (arbitrary number). 2. Replace the hand-crafted URL parser with isc_url_parse for parsing the URL from the HTTP request. 3. Increase the receive buffer to match the isc_netmgr buffers, so we can at least receive two full isc_nm_read()s. This makes the truncation processing much simpler. 4. Process the received buffer from single isc_nm_read() in a single loop and schedule the sends to be independent of each other. The first two changes make the code simpler and rely on already existing libraries that we already had (isc_url based on nodejs) or are used elsewhere (picohttpparser). The second two changes remove the artificial "truncation" limit on parsing multiple requests. Now only a request that has too many headers (currently 10) or is too big (so, the receive buffer fills up without reaching end of the request) will end the connection. We can be benevolent here with the limits, because the statschannel channel is by definition private and access must be allowed only to administrators of the server. There are no timers, no rate-limiting, no upper limit on the number of requests that can be served, etc.	2022-10-14 11:26:54 +02:00
Ondřej Surý	3a8884f024	Add picohttpparser.{c,h} from https://github.com/h2o/picohttpparser PicoHTTPParser is a tiny, primitive, fast HTTP request/response parser. Unlike most parsers, it is stateless and does not allocate memory by itself. All it does is accept a pointer to a buffer and the output structure, and set up the pointers in the latter to point at the necessary portions of the buffer.	2022-10-14 11:26:54 +02:00
Ondřej Surý	b6b7a6886a	Don't set load-balancing socket option on the UDP connect sockets The isc_nm_udpconnect() erroneously set the reuse port with load-balancing on the outgoing connected UDP sockets. This socket option only makes sense for the listening sockets. Don't set the load-balancing reuse port option on the outgoing UDP sockets.	2022-10-12 15:36:25 +02:00
Artem Boldariev	eaebb92f3e	TLS DNS: fix certificate verification error message reporting This commit fixes TLS DNS verification error message reporting which we probably broke during one of the recent networking code refactorings. This prevents e.g. dig from producing useful error messages related to TLS certificate verification.	2022-10-12 16:24:04 +03:00
Artem Boldariev	6789b88d25	TLS: clear error queue before doing IO or calling SSL_get_error() Ensure that the TLS error queue is empty before calling SSL_get_error() or doing SSL I/O so that the result will not get affected by prior error statuses. In particular, the improper error handling led to intermittent unit test failure and, thus, could be responsible for some of the system test failures and other intermittent TLS-related issues. See here for more details: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html In particular, it mentions the following: > The current thread's error queue must be empty before the TLS/SSL > I/O operation is attempted, or SSL_get_error() will not work > reliably. As we use the result of SSL_get_error() to decide on I/O operations, we need to ensure that it works reliably by cleaning the error queue. TLS DNS: empty error queue before attempting I/O	2022-10-12 16:24:04 +03:00
Aram Sargsyan	be95ba0119	Remove a superfluous check of sock->fd against -1 The check is left from when tcp_connect_direct() called isc__nm_socket() and it was uncertain whether it had succeeded, but now isc__nm_socket() is called before tcp_connect_direct(), so sock->fd cannot be -1. *** CID 357292: (REVERSE_NEGATIVE) /lib/isc/netmgr/tcp.c: 309 in isc_nm_tcpconnect() 303 304 atomic_store(&sock->active, true); 305 306 result = tcp_connect_direct(sock, req); 307 if (result != ISC_R_SUCCESS) { 308 atomic_store(&sock->active, false); >>> CID 357292: (REVERSE_NEGATIVE) >>> You might be using variable "sock->fd" before verifying that it is >= 0. 309 if (sock->fd != (uv_os_sock_t)(-1)) { 310 isc__nm_tcp_close(sock); 311 } 312 isc__nm_connectcb(sock, req, result, true); 313 } 314	2022-10-12 08:21:35 +00:00
Tony Finch	138908b211	Avoid dead code warning when using a constant boolean The value of `sign_bit` is platform-dependent but constant at compile time. Use a cast to convert the boolean `sign_bit` to 0 or 1 instead of ternary `?:` because one branch of the conditional is dead code. (We could leave out the cast to `size_t` but our style prefers to handle booleans more explicitly, hence the `?:` that caused the issue.) *** CID 358310: Possible Control flow issues (DEADCODE) /lib/isc/resource.c: 118 in isc_resource_setlimit() 112 * rlim_t, and whether rlim_t has a sign bit. 113 */ 114 isc_resourcevalue_t rlim_max = UINT64_MAX; 115 size_t wider = sizeof(rlim_max) - sizeof(rlim_t); 116 bool sign_bit = (double)(rlim_t)-1 < 0; 117 >>> CID 358310: Possible Control flow issues (DEADCODE) >>> Execution cannot reach the expression "1" inside this statement: "rlim_max >>= 8UL * wider + ...". 118 rlim_max >>= CHAR_BIT * wider + (sign_bit ? 1 : 0); 119 rlim_value = ISC_MIN(value, rlim_max); 120 } 121 122 rl.rlim_cur = rl.rlim_max = rlim_value; 123 unixresult = setrlimit(unixresource, &rl);	2022-10-05 15:51:05 +00:00
Ondřej Surý	c0598d404c	Use designated initializers instead of memset()/MEM_ZERO for structs In several places, the structures were cleaned with memset(...) and thus the semantic patch converted the isc_mem_get(...) to isc_mem_getx(..., ISC_MEM_ZERO). Use the designated initializer to initialize the structures instead of zeroing the memory with ISC_MEM_ZERO flag as this better matches the intended purpose.	2022-10-05 16:44:05 +02:00
Ondřej Surý	c1d26b53eb	Add and use semantic patch to replace isc_mem_get/allocate+memset Add new semantic patch to replace the straightforward uses of: ptr = isc_mem_{get,allocate}(..., size); memset(ptr, 0, size); with the new API call: ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);	2022-10-05 16:44:05 +02:00
Ondřej Surý	dbf5672f32	Replace isc_mem_*_aligned(..., alignment) with isc_mem_*x(..., flags) Previously, the isc_mem_get_aligned() and friends took alignment size as one of the arguments. Replace the specific function with more generic extended variant that now accepts ISC_MEM_ALIGN(alignment) for aligned allocations and ISC_MEM_ZERO for allocations that zero the (re-)allocated memory before returning the pointer to the caller.	2022-10-05 16:44:05 +02:00
Ondřej Surý	c14a4ac763	Add a case-insensitive option directly to siphash 2-4 implementation Formerly, the isc_hash32() would have to change the key in a local copy to make it case insensitive. Change the isc_siphash24() and isc_halfsiphash24() functions to lowercase the input directly when reading it from the memory and converting the uint8_t * array to 64-bit (respectively 32-bit numbers).	2022-10-04 10:32:40 +02:00
Mark Andrews	5f07fe8cbb	Use strnstr implementation from FreeBSD if not provided by OS	2022-10-04 14:21:41 +11:00
Tony Finch	4e37a6f77a	Avoid signed integer overflow in isc_resource_setlimit() On systems with signed rlim_t the old code calculated its maximum value by shifting 1 into the sign bit, which is undefined behaviour. Avoid the bug by using an unsigned shift.	2022-10-03 11:37:17 +00:00
Ondřej Surý	477eb22c12	Refactor isc_ratelimiter API Because the dns_zonemgr_create() was run before the loopmgr was started, the isc_ratelimiter API was more complicated than it had to be. Move the dns_zonemgr_create() to run_server() task which is run on the main loop, and simplify the isc_ratelimiter API implementation. The isc_timer is now created in the isc_ratelimiter_create() and starting the timer is now separate async task as is destroying the timer in case it's not launched from the loop it was created on. The ratelimiter tick now doesn't have to create and destroy timer logic and just stops the timer when there's no more work to do. This should also solve all the races that were causing the isc_ratelimiter to be left dangling because the timer was stopped before the last reference would be detached.	2022-09-30 10:36:30 +02:00
Ondřej Surý	09b50d2237	Fix small problems in the isc_ratelimiter	2022-09-30 09:50:17 +02:00
Ondřej Surý	1e2ededb07	Add missing DbC check for name##_detach in ISC_REFCOUNT_IMPL macro The detach function in the ISC_REFCOUNT_IMPL macro was missing DbC checks, add them.	2022-09-30 09:50:17 +02:00
Tony Finch	a4930e1969	Improve DBC in isc_mem_free Unlike standard free(), isc_mem_free() is not a no-op when passed a NULL pointer. For size accounting purposes it calls sallocx(), which crashes when passed a NULL pointer. To get more helpful diagnostics, REQUIRE() that the pointer is not NULL so that when the programmer makes a mistake they get a backtrace that shows what went wrong.	2022-09-29 10:07:34 +00:00
Ondřej Surý	173c352452	Call the isc__nm_udp_send() callbacks asynchronously on shutdown The isc__nm_udp_send() callback would be called synchronously when shutting down or when the socket has been closed. This could lead to double locking in the calling code and thus those callbacks need to be called asynchronously.	2022-09-29 11:06:58 +02:00
Ondřej Surý	3b31f7f563	Add autoconf option to enable memory leak detection in libraries There's a known memory leak in the engine_pkcs11 at the time of writing this and it interferes with the named ability to check for memory leaks in the OpenSSL memory context by default. Add an autoconf option to explicitly enable the memory leak detection, and use it in the CI except for pkcs11 enabled builds. When this gets fixed in the engine_pkcs11, the option can be enabled by default.	2022-09-27 17:53:04 +02:00
Ondřej Surý	e537fea861	Use custom isc_mem based allocator for libxml2 The libxml2 library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, strdup and free). Create a memory context specifically for libxml2 to allow tracking the memory usage that has originated from within libxml2. This will provide a separate memory context for libxml2 to track the allocations and when shutting down the application it will check that all libxml2 allocations were returned to the allocator. Additionally, move the xmlInitParser() and xmlCleanupParser() calls from bin/named/main.c to library constructor/destructor in libisc library.	2022-09-27 17:10:42 +02:00
Ondřej Surý	236d4b7739	Use custom isc_mem based allocator for OpenSSL The OpenSSL library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, and free). Create a memory context specifically for OpenSSL to allow tracking the memory usage that has originated from within OpenSSL. This will provide a separate memory context for OpenSSL to track the allocations and when shutting down the application it will check that all OpenSSL allocations were returned to the allocator.	2022-09-27 17:10:42 +02:00
Ondřej Surý	a32d06dd42	Use custom isc_mem based allocator for libuv The libuv library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, calloc and free). Create a memory context specifically for libuv to allow tracking the memory usage that has originated from within libuv. This requires libuv >= 1.38.0 which provides uv_library_shutdown() function that assures no more allocations will be made.	2022-09-27 17:10:42 +02:00
Ondřej Surý	a30e75db86	Check for working __builtin_mul_overflow() implementation Instead of using generic HAVE_BUILTIN_OVERFLOW, we need to check whether the overflow functions actually work as there was a bug in GCC that it would not detect mul overflow when compiled with `-m32` option without optimizations and the bug was fixed only for GCC 6.5+ and 7.3+/8+. For further details see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82274	2022-09-27 17:10:42 +02:00
Ondřej Surý	2d2022a509	Make the debugging flags local to the memory context Previously, the isc_mem_debugging would be single global variable that would affect the behavior of the memory context whenever it would be changed which could be after some allocation were already done. Change the memory debugging options to be local to the memory context and immutable, so all allocations within the same memory context are treated the same.	2022-09-27 17:10:41 +02:00
Ondřej Surý	0086ebf3fc	Bump the libuv requirement to libuv >= 1.34.0 By bumping the minimum libuv version to 1.34.0, it allows us to remove all libuv shims we ever had and makes the code much cleaner. The up-to-date libuv is available in all distributions supported by BIND 9.19+ either natively or as a backport.	2022-09-27 17:09:10 +02:00
Evan Hunt	1926ddc987	change ISC__BUFFER macros to inline functions previously, when ISC_BUFFER_USEINLINE was defined, macros were used to implement isc_buffer primitives (isc_buffer_init(), isc_buffer_region(), etc). these macros were missing the DbC assertions for those primitives, which made it possible for coding errors to go undetected. adding the assertions to the macros caused compiler warnings on some platforms. therefore, this commit converts the ISC__BUFFER macros to static inline functions instead, with assertions included, and eliminates the non-inline implementation from buffer.c. the --enable-buffer-useinline configure option has been removed.	2022-09-26 23:49:27 -07:00
Ondřej Surý	1baed21688	Switch the CSPRNG function from RAND_bytes() to uv_random() The RAND_bytes() implementation differs between the OpenSSL versions and uses the system entropy only for seeding its internal CSPRNG. The uv_random() on the other hand uses the system provided CSPRNG. Switch from RAND_bytes() to uv_random() to use system provided CSPRNG.	2022-09-26 15:13:11 +02:00
Ondřej Surý	fffd444440	Cleanup the asynchronous code in the stream implementations After the loopmgr work has been merged, we can now cleanup the TCP and TLS protocols a little bit, because there are stronger guarantees that the sockets will be kept on the respective loops/threads. We only need asynchronous calls for listening sockets (start, stop) and reading from the TCP (because the isc_nm_read() might be called from the read callback again). This commit does the following changes (they are intertwined together): 1. Cleanup most of the asynchronous events in the TCP code, and add comments for the events that need to be kept asynchronous. 2. Remove isc_nm_resumeread() from the netmgr API, and replace isc_nm_resumeread() calls with existing isc_nm_read() calls. 3. Remove isc_nm_pauseread() from the netmgr API, and replace isc_nm_pauseread() calls with a new isc_nm_read_stop() call. 4. Disable the isc_nm_cancelread() for the streaming protocols, only the datagram-like protocols can use isc_nm_cancelread(). 5. Add isc_nmhandle_close() that can be used to shutdown the socket earlier than after the last detach. Formerly, the socket would be closed only after all reading and sending would be finished and the last reference would be detached. The new isc_nmhandle_close() can be used to close the underlying socket earlier, so all the other asynchronous calls would call their respective callbacks immediately. Co-authored-by: Ondřej Surý <ondrej@isc.org> Co-authored-by: Artem Boldariev <artem@isc.org>	2022-09-22 14:51:15 +02:00
Ondřej Surý	5319d4f6c5	Require isc_timer to be manipulated on the timer loop Each isc_timer needs to be created, started and destroyed on the current loop. The isc_timer_stop() can be run on any loop, but when run from different loop than the one associated with the timer, the request to stop the timer will be recorded in atomic variable and the underlying uv_timer_t will be stopped on next uv_timer_t callback call. This allows any thread to stop the timer.	2022-09-21 14:25:33 -07:00
Ondřej Surý	869c6d77a2	Convert isc_ratelimiter API to use on-loop timers In preparation for the on-loop timers, the isc_ratelimiter API was converted to use the timer on main loop and start and stop the timer asynchronously on the main loop.	2022-09-21 14:25:33 -07:00
Ondřej Surý	27d1e498b8	Add isc_timer_async_destroy() helper function As it sometimes happens that the object using isc_timer_t is destroyed via detaching all the references with no guarantee that the last thread will be matching thread, add a helper isc_timer_async_destroy() function that stops the timer and runs the destroy function via isc_async_run() on the matching thread.	2022-09-21 14:25:33 -07:00
Evan Hunt	4b7248545e	additional code cleanups in httpd.c - use isc_buffer functions when appropriate, rather than converting to and from isc_region unnecessarily - use the zlib total_out value instead of calculating it - use c99 struct initialization	2022-09-21 11:45:12 -07:00
Tony Finch	4b9af22830	Ensure the first random number is non-zero when fuzzing In fuzzing mode, `isc_random` uses a fixed seed for reproducibility. The particular seed chosen happened to produce zero as its first number, however commit `bd251de0` introduced an initialization check in `random_test` that required it to be non-zero. This change adjusts the seed to avoid spurious test failures. Also, remove the temporary variable that was used for initialization because it did not match the type of the thread-local seed array.	2022-09-21 12:47:26 +01:00
Michał Kępień	2ee16067c5	BIND 9.19.5 Merge tag 'v9_19_5' BIND 9.19.5	2022-09-21 13:04:58 +02:00
Tony Finch	bd251de035	Move random number re-seeding out of the hot path Instead of checking if we need to re-seed for every isc_random call, seed the random number generator in the libisc global initializer and the per-thread initializer.	2022-09-19 16:27:12 +02:00
Ondřej Surý	f6e4f620b3	Use the semantic patch to do the unsigned -> unsigned int change Apply the semantic patch on the whole code base to get rid of 'unsigned' usage in favor of explicit 'unsigned int'.	2022-09-19 15:56:02 +02:00
Ondřej Surý	b1026dd4c1	Add missing isc_refcount_destroy() for isc__nmsocket_t The destructor for the isc__nmsocket_t was missing call to the isc_refcount_destroy() on the reference counter, which might lead to spurious ThreadSanitizer data race warnings if we ever change the acquire-release memory order in the isc_refcount_decrement().	2022-09-19 14:38:56 +02:00
Ondřej Surý	9b8d432403	Reorder the uv_close() calls to close the socket immediately Simplify the closing code - during the loopmgr implementation, it was discovered that the various lists used by the uv_loop_t aren't FIFO, but LIFO. See doc/dev/libuv.md for more details. With this knowledge, we can close the protocol handles (uv_udp_t and uv_tcp_t) and uv_timer_t at the same time by reordering the uv_close() calls, and thus making sure that after calling the isc__nm_stoplistening(), the code will not issue any additional callback calls (accept, read) on the socket that stopped listening. This might help with the TLS and DoH shutting down sequence as described in the [GL #3509] as we now stop the reading, stop the timer and call the uv_close() as early as possible.	2022-09-19 14:38:56 +02:00
Ondřej Surý	eac8bc5c1a	Prevent unexpected UDP client read callbacks The network manager UDP code was misinterpreting when the libuv called the udp_recv_cb with nrecv == 0 and addr == NULL -> this doesn't really mean that the "stream" has ended, but the libuv indicates that the receive buffer can be freed. This could lead to assertion failure in the code that calls isc_nm_read() from the network manager read callback due to the extra spurious callbacks. Properly handle the extra callback calls from the libuv in the client read callback, and refactor the UDP isc_nm_read() implementation to be synchronous, so no datagram is lost between the time that we stop the reading from the UDP socket and we restart it again in the asynchronous udpread event. Add a unit test that tests the isc_nm_read() call from the read callback to receive two datagrams.	2022-09-19 12:20:41 +02:00
Ondřej Surý	6562227cc8	Handle canceled read during sending data over stats channel An assertion failure would be triggered when the TCP connection is canceled during sending the data back to the client. Don't require the state to be `RECV` on non successful read to gracefully handle canceled TCP connection during the SEND state of the HTTPD channel.	2022-09-15 10:29:37 +02:00
Tony Finch	21a383a8fd	General-purpose unrolled ASCII tolower() loops When converting a string to lower case, the compiler is able to autovectorize nicely, so a nice simple implementation is also very fast, comparable to memcpy(). Comparisons are more difficult for the compiler, so we convert eight bytes at a time using "SIMD within a register" tricks. Experiments indicate it's best to stick to simple loops for shorter strings and the remainder of long strings.	2022-09-12 12:18:57 +01:00
Tony Finch	27a561273e	Consolidate some ASCII tables in `isc/ascii` and `isc/hex` There were a number of places that had copies of various ASCII tables (case conversion, hex and decimal conversion) that are intended to be faster than the ctype.h macros, or avoid locale pollution. Move them into libisc, and wrap the lookup tables with macros that avoid the ctype.h gotchas.	2022-09-12 12:18:57 +01:00
Michał Kępień	3b1c80fd0f	Fix error reporting for POSIX Threads functions Commit 3608abc8fa6a33046e1d34a0789cf7c9547f09ad inadvertently carried over a mistake in logging pthread_cond_init() errors to the ERRNO_CHECK() preprocessor macro: instead of passing the value returned by a given pthread_*() function to strerror_r(), ERRNO_CHECK() passes the errno variable to strerror_r(). This causes bogus error reports because POSIX Threads API functions do not set the errno variable. Fix by passing the value returned by a given pthread_*() function instead of the errno variable to strerror_r(). Since this change makes the name of the affected macro (ERRNO_CHECK()) confusing, rename the latter to PTHREADS_RUNTIME_CHECK(). Also log the integer error value returned by a given pthread_*() function verbatim to rule out any further confusion in runtime error reporting.	2022-09-09 20:25:47 +02:00
Evan Hunt	47e9fa981e	compression buffer was not reused correctly when the compression buffer was reused for multiple statistics requests, responses could grow beyond the correct size. this was because the buffer was not cleared before reuse; compressed data was still written to the beginning of the buffer, but then the size of used region was increased by the amount written, rather than set to the amount written. this caused responses to grow larger and larger, potentially reading past the end of the allocated buffer.	2022-09-08 11:15:52 +02:00
Michał Kępień	4c49068531	Fix building with --disable-doh Commit `b69e783164` inadvertently caused builds using the --disable-doh switch to fail, by putting the declaration of the isc__nm_async_settlsctx() function inside an #ifdef block that is only evaluated when DNS-over-HTTPS support is enabled. This results in the following compilation errors being triggered: netmgr/netmgr.c:2657:1: error: no previous prototype for 'isc__nm_async_settlsctx' [-Werror=missing-prototypes] 2657 | isc__nm_async_settlsctx(isc__networker_t *worker, isc__netievent_t *ev0) { | ^~~~~~~~~~~~~~~~~~~~~~~ Fix by making the declaration of the isc__nm_async_settlsctx() function in lib/isc/netmgr/netmgr-int.h visible regardless of whether DNS-over-HTTPS support is enabled or not.	2022-09-07 12:50:08 +02:00
Aram Sargsyan	2f11e48f0d	Fix isc_nm_listentlsdns() error path bug The isc_nm_listentlsdns() function erroneously calls isc__nm_tcpdns_stoplistening() instead of isc__nm_tlsdns_stoplistening() when something goes wrong, which can cause an assertion failure.	2022-09-05 14:58:52 +00:00
Aram Sargsyan	e97c3eea95	Add mctx attach/detach when creating/destroying a memory pool This should make sure that the memory context is not destroyed before the memory pool, which is using the context.	2022-09-02 08:16:17 +00:00
Ondřej Surý	718e92c31a	Clear the callbacks when isc_nm_stoplistening() is called When we are closing the listening sockets, there's a time window in which the TCP connection could be accepted although the respective stoplistening function has already returned control to the caller. Clear the accept callback function early, so it doesn't get called when we are not interested in the incoming connections anymore.	2022-08-26 09:09:25 +02:00
Ondřej Surý	4d07768a09	Remove the isc_app API The isc_app API is no longer used and has been removed.	2022-08-26 09:09:25 +02:00
Ondřej Surý	b69e783164	Update netmgr, tasks, and applications to use isc_loopmgr Previously: * applications were using isc_app as the base unit for running the application and signal handling. * networking was handled in the netmgr layer, which would start a number of threads, each with a uv_loop event loop. * task/event handling was done in the isc_task unit, which used netmgr event loops to run the isc_event calls. In this refactoring: * the network manager now uses isc_loop instead of maintaining its own worker threads and event loops. * the taskmgr that manages isc_task instances now also uses isc_loopmgr, and every isc_task runs on a specific isc_loop bound to the specific thread. * applications have been updated as necessary to use the new API. * new ISC_LOOP_TEST macros have been added to enable unit tests to run isc_loop event loops. unit tests have been updated to use this where needed.	2022-08-26 09:09:24 +02:00
Ondřej Surý	49b149f5fd	Update isc_timer to use isc_loopmgr * isc_timer was rewritten using the uv_timer, and isc_timermgr_t was completely removed; isc_timer objects are now directly created on the isc_loop event loops. * the isc_timer API has been simplified. the "inactive" timer type has been removed; timers are now stopped by calling isc_timer_stop() instead of resetting to inactive. * isc_manager now creates a loop manager rather than a timer manager. * modules and applications using isc_timer have been updated to use the new API.	2022-08-25 17:17:07 +02:00
Ondřej Surý	84c90e223f	New event loop handling API This commit introduces new APIs for applications and signal handling, intended to replace isc_app for applications built on top of libisc. * isc_app will be replaced with isc_loopmgr, which handles the starting and stopping of applications. In isc_loopmgr, the main thread is not blocked, but is part of the working thread set. The loop manager will start a number of threads, each with a uv_loop event loop running. Setup and teardown functions can be assigned which will run when the loop starts and stops, and jobs can be scheduled to run in the meantime. When isc_loopmgr_shutdown() is run from any of the loops, all loops will shut down and the application can terminate. * signal handling will now be handled with a separate isc_signal unit. isc_loopmgr only handles SIGTERM and SIGINT for application termination, but the application may install additional signal handlers, such as SIGHUP as a signal to reload configuration. * new job running primitives, isc_job and isc_async, have been added. Both units schedule callbacks (specifying a callback function and argument) on an event loop. The difference is that isc_job unit is unlocked and not thread-safe, so it can be used to efficiently run jobs in the same thread, while isc_async is thread-safe and uses locking, so it can be used to pass jobs from one thread to another. * isc_tid will be used to track the thread ID in isc_loop worker threads. * unit tests have been added for the new APIs.	2022-08-25 12:24:29 +02:00
Ondřej Surý	a26862e653	Simplify the isc_event API The ev_tag field was never used, and has now been removed.	2022-08-25 12:24:25 +02:00
Aram Sargsyan	8c4cdd9b21	Fix statistics channel multiple request processing with non-empty bodies When the HTTP request has a body part after the HTTP headers, it is not getting processed and is being prepended to the next request's data, which results in an error when trying to parse it. Improve the httpd.c:process_request() function with the following additions: 1. Require that HTTP POST requests must have Content-Length header. 2. When Content-Length header is set, extract its value, and make sure that it is valid and that the whole request's body is received before processing the request. 3. Discard the request's body by consuming Content-Length worth of data in the buffer.	2022-08-19 08:10:54 +00:00
Aram Sargsyan	86b8e62106	Enhance the have_header() function to find the HTTP header's value Add a new `const char **fvalue` parameter to the httpd.c:have_header() function which, when set, will point to the found header's value.	2022-08-19 08:10:54 +00:00
Evan Hunt	9d9bd3ace2	fix overflow error in mem_putstats() an integer overflow could cause an assertion failure when freeing memory.	2022-08-09 10:59:43 -07:00
Artem Boldariev	32565d0d65	TLS: do not ignore readpaused flag in certain circumstances In some circumstances generic TLS code could have resumed data reading unexpectedly on the TCP layer code. Due to this, the behaviour of isc_nm_pauseread() and isc_nm_resumeread() might have been unexpected. This commit fixes that. The bug does not seem to have real consequences in the existing code due to the way the code is used. However, the bug could have led to unexpected behaviour and, at any rate, makes the TLS code behave differently from the TCP code, with which it attempts to be as compatible as possible.	2022-08-02 14:02:01 +03:00
Artem Boldariev	c52c691b18	TLS: fix double resumption in isc__nm_tls_resumeread() This commit fixes an obvious error in isc__nm_tls_resumeread() so that read cannot be resumed twice.	2022-07-26 14:25:59 +03:00
Artem Boldariev	5d450cd0ba	TLS: clear 'errno' when handling SSL status Sometimes tls_do_bio() might be called when there is no new data to process (most notably, when resuming reads). In such a case the internal TLS session state will remain untouched and the old value in 'errno' will alter the result of the SSL_get_error() call, possibly making it return SSL_ERROR_SYSCALL. This value will be treated as an error and will lead to closing the connection, which is not what is expected.	2022-07-26 14:25:59 +03:00
Ondřej Surý	3e10d3b45f	Cleanup the STATID_CONNECT and STATID_CONNECTFAIL stat counters The STATID_CONNECT and STATID_CONNECTFAIL statistics were used incorrectly. The STATID_CONNECT was incremented twice (once in the *_connect_direct() and once in the callback) and STATID_CONNECTFAIL would not be incremented at all if the failure happened in the callback. Closes: #3452	2022-07-14 14:34:53 +02:00
Ondřej Surý	a280855f7b	Handle the transient TCP connect() failures on FreeBSD On FreeBSD (and perhaps other *BSD) systems, the TCP connect() call (via uv_tcp_connect()) can fail with a transient UV_EADDRINUSE error. The UDP code already handles this by trying three times (is a charm) before giving up. Add code for the TCP, TCPDNS and TLSDNS layers to also try three times before giving up, by calling uv_tcp_connect() from the callback two more times on a UV_EADDRINUSE error. Additionally, stop the timer only if we succeed or on a hard error via isc__nm_failed_connect_cb().	2022-07-14 14:20:10 +02:00
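The retry policy described in the commit above can be illustrated with a small synchronous sketch. The real change retries asynchronously from the libuv connect callback; here `connect_with_retry()`, `connect_fn`, and the `flaky_connect()` stub are hypothetical names, and `-EADDRINUSE` stands in for UV_EADDRINUSE (libuv error codes are negated errno values on Unix).

```c
#include <errno.h>

#define MAX_CONNECT_TRIES 3

/* Hypothetical connect attempt: returns 0 or a negated errno value. */
typedef int (*connect_fn)(void *arg);

/*
 * Try to connect up to MAX_CONNECT_TRIES times. Only the transient
 * "address in use" error is retried; success or any other (hard)
 * error is returned to the caller immediately.
 */
static int
connect_with_retry(connect_fn try_connect, void *arg) {
	int r = 0;

	for (int tries = 0; tries < MAX_CONNECT_TRIES; tries++) {
		r = try_connect(arg);
		if (r != -EADDRINUSE) {
			break;
		}
	}
	return r;
}

/*
 * Stub for illustration only: fails with -EADDRINUSE until the
 * counter pointed to by arg reaches zero, then succeeds.
 */
static int
flaky_connect(void *arg) {
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return -EADDRINUSE;
	}
	return 0;
}
```

With two transient failures the third attempt succeeds; with three, the caller sees the error and can treat it as a hard failure.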
Michał Kępień	b67ff4728f	Improve reporting for barrier errors uv_barrier_init() errors are currently ignored. Use UV_RUNTIME_CHECK() to catch them and to improve error reporting for any uv_barrier_init() run-time failures (by augmenting error messages with file/line information and the error string corresponding to the value returned).	2022-07-13 13:19:32 +02:00
Michał Kępień	7009f9d270	Improve reporting for read-write lock errors Replace direct uses of implementation-specific rwlock functions in lib/isc/include/isc/rwlock.h with preprocessor macros that use ERRNO_CHECK(), in order to augment rwlock-related error messages with file/line/caller information and the error string corresponding to errno. Adjust the implementation-specific functions for pthreads-based rwlocks so that they return any errors encountered to the caller instead of aborting execution immediately using RUNTIME_CHECK(). To keep code modifications simple, make the non-pthreads-based implementation-specific rwlock functions always return 0; these functions continue to handle errors using less verbose run-time assertions as they do not set errno anyway.	2022-07-13 13:19:32 +02:00
Michał Kępień	badeeff0ac	Improve reporting for condition variable errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/condition.h with ERRNO_CHECK(), in order to improve error reporting for any condition-variable-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-07-13 13:19:32 +02:00
Michał Kępień	f352a834a7	Improve reporting for mutex errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/mutex.h with ERRNO_CHECK(), in order to improve error reporting for any mutex-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-07-13 13:19:32 +02:00
Michał Kępień	77aead5ab6	Enable tracking of pthreads barriers Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_barrier_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_barrier_destroy() or else the memory allocated for the barrier will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying barriers that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_barrier_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set, allocate isc_barrier_t structures on the heap in isc_barrier_init() and free them in isc_barrier_destroy(). Reuse existing barrier macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
Ondřej Surý	e4606da2c6	Enable tracking of pthreads rwlocks Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_rwlock_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_rwlock_destroy() or else the memory allocated for the rwlock will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying rwlocks that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_rwlock_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set (and --enable-pthread-rwlock is used), allocate isc_rwlock_t structures on the heap in isc_rwlock_init() and free them in isc_rwlock_destroy(). Reuse existing functions defined in lib/isc/rwlock.c for other operations, but rename them first, so that they contain triple underscores (to indicate that these functions are implementation-specific, unlike their mutex and condition variable counterparts, which always use the pthreads implementation). Define the isc__rwlock_init() macro so that it is a logical counterpart of isc__mutex_init() and isc__condition_init(); adjust isc___rwlock_init() accordingly. Remove a redundant function prototype for isc__rwlock_lock() and rename that (static) function to rwlock_lock() in order to avoid having to use quadruple underscores.	2022-07-13 13:19:32 +02:00
Ondřej Surý	8dfdb95a20	Enable tracking of pthreads condition variables Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_cond_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_cond_destroy() or else the memory allocated for the condition variable will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying condition variables that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_cond_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set, allocate isc_condition_t structures on the heap in isc_condition_init() and free them in isc_condition_destroy(). Reuse existing condition variable macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
Ondřej Surý	ebcfb16576	Enable tracking of pthreads mutexes Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_mutex_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_mutex_destroy() or else the memory allocated for the mutex will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying mutexes that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_mutex_destroy() calls on any platform on which it works reliably. Introduce a new ISC_TRACK_PTHREADS_OBJECTS preprocessor macro, which causes isc_mutex_t structures to be allocated on the heap by isc_mutex_init() and freed by isc_mutex_destroy(). Reuse existing mutex macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
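The heap-allocation idea behind ISC_TRACK_PTHREADS_OBJECTS can be sketched as below. This is an assumption-laden illustration, not the libisc implementation: `tracked_mutex_t`, `tracked_mutex_init()`, and `tracked_mutex_destroy()` are hypothetical names, and the sketch returns error codes where libisc would abort.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical tracked mutex type: a pointer to a heap-allocated mutex. */
typedef pthread_mutex_t *tracked_mutex_t;

/*
 * Allocate the mutex with malloc() so that a leak detector sees an
 * unreleased allocation whenever the matching destroy call is missing.
 */
static int
tracked_mutex_init(tracked_mutex_t *mp) {
	pthread_mutex_t *m = malloc(sizeof(*m));
	int r;

	if (m == NULL) {
		return ENOMEM;
	}
	r = pthread_mutex_init(m, NULL);
	if (r != 0) {
		free(m);
		return r;
	}
	*mp = m;
	return 0;
}

/* Destroy the mutex and release the tracking allocation. */
static int
tracked_mutex_destroy(tracked_mutex_t *mp) {
	int r = pthread_mutex_destroy(*mp);

	free(*mp);
	*mp = NULL;
	return r;
}
```

Lock and unlock operations simply dereference the pointer, which is why the existing macros could be reused after renaming.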
Ondřej Surý	deae974366	Directly cause assertion failure on pthreads primitives failure Instead of returning error values from isc_rwlock_*(), isc_mutex_*(), and isc_condition_*() macros/functions and subsequently carrying out runtime assertion checks on the return values in the calling code, trigger assertion failures directly in those macros/functions whenever any pthread function returns an error, as there is no point in continuing execution in such a case anyway.	2022-07-13 13:19:32 +02:00
Ondřej Surý	8e5e0fa522	Use library constructor to create default mutex attr once Instead of using isc_once_do() on every isc_mutex_init() call, use the global library constructor to initialize the default mutex attr object (optionally with PTHREAD_MUTEX_ADAPTIVE_NP if supported) just once when the library is loaded.	2022-07-13 13:19:32 +02:00
Michał Kępień	5759ace07f	Handle pthread_*_init() failures consistently isc_rwlock_init() currently detects pthread_rwlock_init() failures using a REQUIRE() assertion. Use the ERRNO_CHECK() macro for that purpose instead, so that read-write lock initialization failures are handled identically as condition variable (pthread_cond_init()) and mutex (pthread_mutex_init()) initialization failures.	2022-07-13 13:19:32 +02:00
Michał Kępień	365b47caee	Add an ERRNO_CHECK() preprocessor macro In a number of situations in pthreads-related code, a common sequence of steps is taken: if the value returned by a library function is not 0, pass errno to strerror_r(), log the string returned by the latter, and immediately abort execution. Add an ERRNO_CHECK() preprocessor macro which takes those exact steps and use it wherever (conveniently) possible. Notes: 1. The "log the return value of strerror_r() and abort" pattern is used in a number of other places that this commit does not touch; only "!= 0" checks followed by isc_error_fatal() calls with non-customized error messages are replaced here. 2. This change temporarily breaks file name & line number reporting for isc__mutex_init() errors, to prevent breaking the build. This issue will be rectified in a subsequent change.	2022-07-13 13:19:32 +02:00
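A macro of the shape described in the commit above might look like the following sketch. It is not the libisc definition: `ERRNO_CHECK_SKETCH` and `set_positive()` are hypothetical, `strerror()` is used in place of `strerror_r()` to sidestep the GNU/XSI signature difference, and `fprintf()` plus `abort()` stand in for isc_error_fatal().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for ERRNO_CHECK(): if the call returned
 * non-zero, report the call site and the errno string, then abort.
 */
#define ERRNO_CHECK_SKETCH(func, ret)                                     \
	do {                                                              \
		if ((ret) != 0) {                                         \
			fprintf(stderr, "%s:%d: %s(): %s (%d)\n",         \
				__FILE__, __LINE__, #func,                \
				strerror(errno), errno);                  \
			abort();                                          \
		}                                                         \
	} while (0)

/*
 * Hypothetical function following the "return non-zero and set
 * errno" convention, used only to exercise the macro.
 */
static int
set_positive(int *dst, int value) {
	if (value <= 0) {
		errno = EINVAL;
		return -1;
	}
	*dst = value;
	return 0;
}
```

A call site then shrinks to a single line such as `ERRNO_CHECK_SKETCH(set_positive, set_positive(&x, 5));`.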
Artem Boldariev	ffcb54211e	TLS: do not ignore accept callback result Before this change the TLS code would ignore the accept callback result, and would not try to gracefully close the connection. This had not been noticed, as it is not really required for DoH. Now the code tries to shut down the TLS connection gracefully when accepting it is not successful.	2022-07-12 14:40:22 +03:00
Artem Boldariev	8585b92f98	TLSDNS: try to pass incoming data to OpenSSL if there are any Otherwise the code path will lead to a call to SSL_get_error() returning SSL_ERROR_SSL, which in turn might lead to closing the connection too early in an unexpected way, as it is clearly not what is intended. The issue was found when working on the loopmgr branch and appears to be timing related as well. Might be responsible for some unexpected transmission failures e.g. on zone transfers.	2022-07-12 14:40:22 +03:00
Artem Boldariev	fc74b15e67	TLS: bail out earlier when NM is stopping In some operations - most prominently when establishing connection - it might be beneficial to bail out earlier when the network manager is stopping. The issue is backported from loopmgr branch, where such a change is not only beneficial, but required.	2022-07-12 14:40:22 +03:00
Artem Boldariev	ac4fb34f18	TLS: sometimes TCP conn. handle might be NULL when connecting In some cases - in particular, in case of errors - NULL might be passed to a connection callback instead of a handle, which could have led to an abort. This commit ensures that such a situation will not occur. The issue was found when working on the loopmgr branch.	2022-07-12 14:40:22 +03:00
Artem Boldariev	88524e26ec	TLS: try to close sockets whenever there are no pending operations This commit ensures that the underlying TCP socket of a TLS connection gets closed earlier whenever there are no pending operations on it. In the loop-manager branch, in some circumstances the connection could have remained open for far too long for no reason. This commit ensures that will not happen.	2022-07-12 14:40:22 +03:00
Artem Boldariev	237ce05b89	TLS: Implement isc_nmhandle_setwritetimeout() This commit adds a proper implementation of isc_nmhandle_setwritetimeout() for TLS connections. Now it passes the value to the underlying TCP handle.	2022-07-12 14:40:22 +03:00
Evan Hunt	a499794984	REQUIRE should not have side effects it's a style violation to have REQUIRE or INSIST contain code that must run for the server to work. this was being done with some atomic_compare_exchange calls. these have been cleaned up. uses of atomic_compare_exchange in assertions have been replaced with a new macro atomic_compare_exchange_enforced, which uses RUNTIME_CHECK to ensure that the exchange was successful.	2022-07-05 12:22:55 -07:00
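The pattern from the commit above can be sketched with C11 atomics. The macro names here (`RUNTIME_CHECK_SKETCH`, `cas_enforced`) and the `mark_closing()` function are hypothetical stand-ins, not the libisc definitions. The point is that the compare-exchange runs inside a check that is always evaluated, rather than inside an assertion that should be free of side effects.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for RUNTIME_CHECK(): always evaluated, aborts on failure. */
#define RUNTIME_CHECK_SKETCH(cond)                                        \
	((cond) ? (void)0                                                 \
		: (fprintf(stderr, "%s:%d: RUNTIME_CHECK(%s) failed\n",   \
			   __FILE__, __LINE__, #cond),                    \
		   abort()))

/*
 * Hypothetical equivalent of atomic_compare_exchange_enforced():
 * perform the exchange and insist that it succeeded.
 */
#define cas_enforced(obj, expected, desired)                              \
	RUNTIME_CHECK_SKETCH(                                             \
		atomic_compare_exchange_strong((obj), (expected), (desired)))

/* Flip a flag from false to true exactly once. */
static void
mark_closing(atomic_bool *closing) {
	bool expected = false;

	cas_enforced(closing, &expected, true);
}
```

Calling `mark_closing()` a second time on the same flag would abort, which is the behaviour the original assertions were meant to enforce.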
Artem Boldariev	d2e13ddf22	Update the set of HTTP endpoints on reconfiguration This commit ensures that on reconfiguration the set of HTTP endpoints (=paths) is being updated within HTTP listeners.	2022-06-28 15:42:38 +03:00
Artem Boldariev	e72962d5f1	Update max concurrent streams limit in HTTP listeners on reconfig This commit ensures that HTTP listeners concurrent streams limit gets updated properly on reconfiguration.	2022-06-28 15:42:38 +03:00
Michal Nowak	1c45a9885a	Update clang to version 14	2022-06-16 17:21:11 +02:00
Artem Boldariev	e616d7f240	TLS DNS: do not call accept callback twice Before the changes from this commit were introduced, the accept callback function will get called twice when accepting connection during two of these stages: * when accepting the TCP connection; * when handshake has completed. That is clearly an error, as it should have been called only once. As far as I understand it the mistake is a result of TLS DNS transport being essentially a fork of TCP transport, where calling the accept callback immediately after accepting TCP connection makes sense. This commit fixes this mistake. It did not have any very serious consequences because in BIND the accept callback only checks an ACL and updates stats.	2022-06-15 14:21:11 +03:00
Ondřej Surý	b432d5d3bc	Gracefully handle uv_read_start() failures Under specific rare timing circumstances the uv_read_start() could fail with UV_EINVAL when the connection is reset between the connect (or accept) and the uv_read_start() call on the nmworker loop. Handle such a situation gracefully by propagating the errors from uv_read_start() into upper layers, so the socket can be closed internally.	2022-06-14 11:33:02 +02:00
Ondřej Surý	2c3b2dabe9	Move all the unit tests to /tests/<libname>/ The unit tests are now using a common base, which means that lib/dns/tests/ code now has to include lib/isc/include/isc/test.h and link with lib/isc/test.c and lib/ns/tests has to include both libisc and libdns parts. Instead of cross-linking code between the directories, move the /lib/<foo>/test.c to /tests/<foo>.c and /lib/<foo>/include/<foo>test.h to /tests/include/tests/<foo>.h and create a single libtest.la convenience library in /tests/. At the same time, move the /lib/<foo>/tests/ to /tests/<foo>/ (but keep it symlinked to the old location) and adjust paths accordingly. In a few places, we are now using absolute paths instead of relative paths, because the directory level has changed. By moving the directories under the /tests/ directory, the test-related code is kept in a single place and we can avoid referencing files between libns->libdns->libisc which is unhealthy because they live in a separate Makefile-space. In the future, the /bin/tests/ should be merged to /tests/ and symlink kept, and the /fuzz/ directory moved to /tests/fuzz/.	2022-05-28 14:53:02 -07:00
Ondřej Surý	63fe9312ff	Give the unit tests a big overhaul The unit tests contain a lot of duplicated code and here's an attempt to reduce code duplication. This commit does several things: 1. Remove #ifdef HAVE_CMOCKA - we already solve this with automake conditionals. 2. Create a set of ISC_TEST_* and ISC_*_TEST_ macros to wrap the test implementations, test lists, and the main test routine, so we don't have to repeat this all over again. The macros were modeled after libuv test suite but adapted to cmocka as the test driver. A simple example of a unit test would be: ISC_RUN_TEST_IMPL(test1) { assert_true(true); } ISC_TEST_LIST_START ISC_TEST_ENTRY(test1) ISC_TEST_LIST_END ISC_TEST_MAIN (Discussion: Should this be ISC_TEST_RUN ?) For more complicated examples including group setup and teardown functions, and per-test setup and teardown functions. 3. The macros prefix the test functions and cmocka entries, so the name of the test can now match the tested function name, and we don't have to append `_test` because `run_test_` is automatically prepended to the main test function, and `setup_test_` and `teardown_test_` are prepended to the setup and teardown functions. 4. Update all the unit tests to use the new syntax and fix a few bits here and there. 5. In the future, we can separate the test declarations and test implementations which are going to greatly help with uncluttering the bigger unit tests like doh_test and netmgr_test, because the test implementations are not declared static (see `ISC_RUN_TEST_DECLARE` and `ISC_RUN_TEST_IMPL` for more details). NOTE: This heavily relies on preprocessor macros, but the result greatly outweighs all the negatives of using the macros. There's less duplicated code, the tests are more uniform and the implementation can be more flexible.	2022-05-28 14:52:56 -07:00
Ondřej Surý	1fe391fd40	Make all tasks to be bound to a thread Previously, tasks could be created either unbound or bound to a specific thread (worker loop). The unbound tasks would be assigned to a random thread every time isc_task_send() was called. Because there's no logic that would assign the task to the least busy worker, this just creates unpredictability. Instead of random assignment, bind all the previously unbound tasks to worker 0, which is guaranteed to exist.	2022-05-25 16:04:51 +02:00
Artem Boldariev	98f758ed4f	CID 352848: split xfrin_start() and remove dead code This commit moves the TLS context creation code out of xfrin_start(), which has become too large and hard to follow, into a new function (similarly to how it is done in dighost.c). The dead code has been removed from the cleanup section of the TLS creation code: * there is no way 'tlsctx' can equal 'found'; * there is no way 'sess_cache' can be non-NULL in the cleanup section. Also, it fixes a bug in the older version of the code, where TLS client session context fetched from the cache would not get passed to isc_nm_tlsdnsconnect().	2022-05-25 12:38:38 +03:00
Petr Menšík	057438cb45	Fix failures in isc netmgr_test on big endian machines Casting from a libuv structure to isc_region_t is not possible, because their sizes differ on 64-bit architectures. Little-endian machines seem to be lucky and the test still passes, but a big-endian machine such as s390x fails the test reliably. Fix by directly creating the buffer as isc_region_t and skipping the type conversion, which is more readable and more correct.	2022-05-24 19:51:30 +02:00
Artem Boldariev	40be3c9263	Do not provide a shim for SSL_SESSION_is_resumable() The recently added TLS client session cache used SSL_SESSION_is_resumable() to avoid polluting the cache with non-resumable sessions. However, it turned out that we cannot provide a shim for this function across the whole range of OpenSSL versions due to the fact that OpenSSL 1.1.0 uses opaque pointers for SSL_SESSION objects. The commit replaces the shim for SSL_SESSION_is_resumable() with a non-public approximation of it on systems shipped with OpenSSL 1.1.0. It is not turned into a proper shim because it does not fully emulate the behaviour of SSL_SESSION_is_resumable(), but in our case it is good enough, as it still helps to protect the cache from pollution. For systems shipped with OpenSSL 1.0.X and derivatives (e.g. older versions of LibreSSL), the provided replacement perfectly mimics the function it is intended to replace.	2022-05-23 18:25:18 +03:00
Artem Boldariev	9abb00bb5f	Fix an abort in DoH (client-side) when writing on closing sock The commit fixes a corner case in client-side DoH code, when a write attempt is done on a closing socket (session). The change ensures that the write call-back will be called with a proper error code (see failed_send_cb() call in client_httpsend()).	2022-05-20 20:18:40 +03:00
Artem Boldariev	245f7cec2e	Avoid aborting when uv_timer_start() is used on a closing socket In such a case it will return UV_EINVAL (-EINVAL), leading to aborting, as the code expects the function to succeed.	2022-05-20 20:18:40 +03:00
Artem Boldariev	35338b4105	Add SSL_SESSION_is_resumable() implementation shim This commit adds SSL_SESSION_is_resumable() implementation if it is missing.	2022-05-20 20:17:48 +03:00
Artem Boldariev	86465c1dac	DoT: implement TLS client session resumption This commit extends DoT code with TLS client session resumption support implemented on top of the TLS client session cache.	2022-05-20 20:17:48 +03:00
Artem Boldariev	90bc13a5d5	TLS stream/DoH: implement TLS client session resumption This commit extends TLS stream code and DoH code with TLS client session resumption support implemented on top of the TLS client session cache.	2022-05-20 20:17:45 +03:00
Artem Boldariev	987892d113	Extend TLS context cache with TLS client session cache This commit extends TLS context cache with TLS client session cache so that an associated session cache can be stored alongside the TLS context within the context cache.	2022-05-20 20:13:20 +03:00
Artem Boldariev	4ef40988f3	Add TLS client session cache implementation This commit adds an implementation of a client TLS session cache. TLS client session cache is an object which allows efficient storing and retrieval of previously saved TLS sessions so that they can be resumed. This object is supposed to be a foundation for implementing TLS session resumption - a standard technique to reduce the cost of re-establishing a connection to the remote server endpoint. OpenSSL does server-side TLS session caching transparently by default. However, on the client-side, a TLS session to resume must be manually specified when establishing the TLS connection. The TLS client session cache is precisely the foundation for that.	2022-05-20 20:13:20 +03:00
Ondřej Surý	61117840c1	Move setting the sock->write_timeout to the async_*send Setting the sock->write_timeout from the TCP, TCPDNS, and TLSDNS send functions could lead to a (harmless) data race when setting the value for the first time, if the isc_nm_send() function is called from a thread that does not match the socket we are sending to. Move the setting of sock->write_timeout to the matching async function, which is always called from the matching thread.	2022-05-19 22:36:47 +02:00

5087 commits