bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-03-11 02:30:44 -04:00

Author	SHA1	Message	Date
Ondřej Surý	91e349433f	Remove maxinuse memory counter The maxinuse memory counter indicated the highest amount of memory allocated in the past. Checking and updating this high-water mark value every time memory was allocated had an impact on server performance, so it has been removed. Memory size can be monitored more efficiently via an external tool logging RSS.	2023-01-24 17:57:16 +00:00
Ondřej Surý	971df0b4ed	Remove malloced and maxmalloced memory counter The malloced and maxmalloced memory counters were mostly useless since we removed the internal allocator blocks - it would only differ from inuse by the memory context size itself.	2023-01-24 17:57:16 +00:00
Evan Hunt	301f8b23e1	complete change of NETMGR_TRACE to ISC_NETMGR_TRACE some references to the old ifdef were still in place.	2023-01-20 12:46:34 -08:00
Aram Sargsyan	41dc48bfd7	Refactor isc_nm_xfr_allowed() Return 'isc_result_t' type value instead of 'bool' to indicate the actual failure. Rename the function to something not suggesting a boolean type result. Make changes in the places where the API function is being used to check for the result code instead of a boolean value.	2023-01-19 10:24:08 +00:00
Ondřej Surý	f3753d591f	Use thread_local EVP_MD_CTX in isc_iterated_hash() As this code is on the hot path (NSEC3), this introduces an additional optimization of the EVP_MD API - instead of calling EVP_MD_CTX_new() on every call to isc_iterated_hash(), we create two thread_local objects for each thread - a basectx and mdctx, initialize basectx once and then use EVP_MD_CTX_copy_ex() to copy the initialized state into mdctx. This saves us a couple more valuable microseconds in each isc_iterated_hash() call.	2023-01-18 19:36:21 +01:00
Ondřej Surý	e6bfb8e456	Avoid implicit algorithm fetch for OpenSSL EVP_MD family The implicit algorithm fetch causes a lock contention and significant slowdown for small input buffers. For more details, see: https://github.com/openssl/openssl/issues/19612 Instead of using EVP_DigestInit_ex() initialize empty MD_CTX objects for each algorithm and use EVP_MD_CTX_copy_ex() to initialize MD_CTX from a static copy. Additionally avoid implicit algorithm fetching by using EVP_MD_fetch() for OpenSSL 3.0.	2023-01-18 18:32:57 +01:00
Tony Finch	290899661d	Fix a typo in the NS_PER_ macros Milliseconds and microseconds were swapped.	2023-01-16 20:33:57 +00:00
Ondřej Surý	d07c4a98da	Prefer the pthread_barrier implementation over uv_barrier Prefer the pthread_barrier implementation on platforms where it is available over uv_barrier implementation. This also solves the problem with thread sanitizer builds on macOS that doesn't have pthread barrier.	2023-01-11 09:51:02 +01:00
Ondřej Surý	10f884a5b8	Remove unused isc_astack unit The isc_astack unit is now unused, so just remove it.	2023-01-10 20:31:24 +01:00
Ondřej Surý	5bbba0d1a1	Simplify tracing the reference counting in isc_netmgr Always track the per-worker sockets in the .active_sockets field in the isc__networker_t struct and always track the per-socket handles in the .active_handles field in the isc_nmsocket_t struct.	2023-01-10 19:57:39 +01:00
Evan Hunt	916ea26ead	remove nonfunctional DSCP implementation DSCP has not been fully working since the network manager was introduced in 9.16, and has been completely broken since 9.18. This seems to have caused very few difficulties for anyone, so we have now marked it as obsolete and removed the implementation. To ensure that old config files don't fail, the code to parse dscp key-value pairs is still present, but a warning is logged that the feature is obsolete and should not be used. Nothing is done with configured values, and there is no longer any range checking.	2023-01-09 12:15:21 -08:00
Ondřej Surý	6613f89c62	Enhance the isc_loop unit to allow reference count tracking Use ISC_REFCOUNT_TRACE_{IMPL,DECL} to allow better isc_loop reference tracking - use `#define ISC_LOOP_TRACE 1` in <isc/loop.h> to enable.	2023-01-05 12:33:15 +00:00
Mark Andrews	096b280b1c	Do not pass NULL pointer to memmove - undefined behaviour Check if 'old_base' is NULL and if so skip calling memmove.	2023-01-03 14:40:30 +11:00
Artem Boldariev	7962e7f575	tlsctx_client_session_cache_new() -> tlsctx_client_session_create() Additionally to renaming, it changes the function definition so that it accepts a pointer to pointer instead of returning a pointer to the new object. It is mostly done to make it in line with other functions in the module.	2022-12-23 11:10:11 +02:00
Artem Boldariev	f102df96b8	Rename isc_tlsctx_cache_new() -> isc_tlsctx_cache_create() Additionally to renaming, it changes the function definition so that it accepts a pointer to pointer instead of returning a pointer to the new object. It is mostly done to make it in line with other functions in the module.	2022-12-23 11:10:11 +02:00
Ondřej Surý	6cb6373b5a	Convert Stream DNS to use isc_buffer API Drop the whole isc_dnsbuffer API and use new improved isc_buffer API that provides same functionality as the isc_dnsbuffer unit now.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4277eeeb9c	Remove TLS DNS transport (and parts common with TCP DNS) This commit removes TLS DNS transport superseded by Stream DNS.	2022-12-20 22:13:53 +02:00
Artem Boldariev	e5649710d3	Remove TCP DNS transport This commit removes TCP DNS transport superseded by Stream DNS.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4524bf4083	Make isc_nm_tlssocket non-optional This commit unties generic TLS code (isc_nm_tlssocket) from DoH, so that it will be available regardless of the fact if BIND was built with DNS over HTTP support or not.	2022-12-20 22:13:53 +02:00
Artem Boldariev	371b02f37a	TCP: make it possible to set Nagle's algorithm state via handle This commit adds the ability to turn Nagle's algorithm on or off via a connection's handle. It adds the isc_nmhandle_set_tcp_nodelay() function as the public interface for this functionality.	2022-12-20 22:13:53 +02:00
Artem Boldariev	f395cd4b3e	Add isc_nm_streamdnssocket (aka Stream DNS) This commit adds an initial implementation of isc_nm_streamdnssocket transport: a unified transport for DNS over stream protocols messages, which is capable of replacing both TCP DNS and TLS DNS transports. Currently, the interface it provides is a unified set of interfaces provided by both of the transports it attempts to replace. The transport is built around "isc_dnsbuffer_t" and "isc_dnsstream_assembler_t" objects and attempts to minimise both the number of memory allocations during network transfers as well as memory usage.	2022-12-20 22:13:51 +02:00
Artem Boldariev	338cf3e467	Add isc_dnsstream_assembler_t implementation This commit adds the implementation for an "isc_dnsstream_assembler_t" object. The object is built on top of "isc_dnsbuffer_t" and is intended to encapsulate the state machine used for handling DNS messages received in the format used for messages transmitted over TCP. The idea is that the object accepts the input data received from a socket, tries to assemble DNS messages from the incoming data and calls a callback, passing it the status of the incoming data as well as a pointer to the memory region holding the data of the assembled message. It is capable of assembling DNS messages no matter how torn apart they are when sent over the network. The following statuses might be passed to the callback: * ISC_R_SUCCESS - a message has been successfully assembled; * ISC_R_NOMORE - not enough data has been processed to assemble a message; * ISC_R_RANGE - there was an attempt to process a zero-sized DNS message (someone attempts to send us junk data). One could say that the object replaces the implementation of "isc__nm__processbuffer()" functions used by the old TCP DNS and TLS DNS transports with a better defined state machine completely decoupled from the networking code itself. Such a design makes it trivial to write unit tests for it, leading to better verification of its correctness. Another important difference is directly related to the fact that it is built on top of "isc_dnsbuffer_t", which tries to manage memory in a smart way. In particular: * It tries to use a static buffer for smaller messages, reducing pressure on the memory manager (hot path); * When allocating dynamic memory for larger messages, it tries to allocate memory conservatively (generic path). These characteristics are a significant upgrade over the older logic where a 64KB(+2 bytes) buffer was allocated from dynamic memory regardless of whether a buffer this large was needed.
That is, lower memory usage is expected in the generic case for DNS transports built on top of "isc_dnsstream_assembler_t".	2022-12-20 21:24:44 +02:00
Artem Boldariev	cbb758abd4	Add isc_dnsbuffer_t implementation This commit adds "isc_dnsbuffer_t" object implementation, a thin wrapper on top of "isc_buffer_t" which has the following characteristics: * provides an interface specifically attuned for handling/generating DNS messages, especially in the format used for DNS messages over TCP; * avoids allocating dynamic memory when handling small DNS messages, while transparently switching to using dynamic memory when handling larger messages. This approach significantly reduces pressure on the memory allocator, as most of the DNS messages are small.	2022-12-20 21:24:44 +02:00
Artem Boldariev	94e650ce89	Use 'restrict' and 'const' for 'isc_buffer_t' The purpose of this commit is to aid the compiler in generating better code when working with `isc_buffer_t` objects by using restricted pointers (and, to a lesser extent, the 'const' modifier for read-only arguments). This way we, basically, instruct the compiler that the members of structures passed by pointer into the functions can be treated as local variables in the scope of a function. That should reduce the number of load/store operations emitted by compilers when accessing objects (e.g. 'isc_buffer_t') via pointers.	2022-12-20 21:01:27 +02:00
Ondřej Surý	460afcda18	Add isc_buffer_trycompact() function needed for StreamDNS Add isc_buffer_trycompact() that's an optimization; it will compact the buffer only when the remaining length is smaller than used length.	2022-12-20 19:13:48 +01:00
Ondřej Surý	e6062ee3ae	Add isc_buffer_setmctx() and isc_buffer_clearmctx() function Add two extra functions needed by StreamDNS: 1. isc_buffer_setmctx() sets the buffer internal memory context, so we can use isc_buffer_reserve() on the buffer. For this, we also need to track whether the .base was dynamically allocated or not. This needs to be called after isc_buffer_init() and before first isc_buffer_reserve() call. 2. isc_buffer_clearmctx() clears the buffer internal memory context, and frees any dynamically allocated buffer. This needs to be called after the last isc_buffer_reserve() call and before calling the isc_buffer_invalidate()	2022-12-20 19:13:48 +01:00
Ondřej Surý	8e3a86f6dd	Make the isc_buffer unit header-only The isc_buffer is often used in the hot-path, so make it header-only implementation.	2022-12-20 19:13:48 +01:00
Ondřej Surý	2ddea1e41c	Add a static pre-allocated buffer to isc_buffer_t When the buffer is allocated via isc_buffer_allocate() and the size is smaller than or equal to ISC_BUFFER_STATIC_SIZE (currently 512 bytes), the buffer will be allocated as a flexible array member in the buffer structure itself instead of allocating it on the heap. This should help when the buffer is used on the hot-path with small allocations.	2022-12-20 19:13:48 +01:00
Ondřej Surý	6bd2b34180	Enable auto-reallocation for all isc_buffer_allocate() buffers When isc_buffer_t buffer is created with isc_buffer_allocate() assume that we want it to always auto-reallocate instead of having an extra call to enable auto-reallocation.	2022-12-20 19:13:48 +01:00
Ondřej Surý	135ec7a0f0	Remove single use isc_buffer_putdecint() function The isc_buffer_putdecint() could be easily replaced with isc_buffer_printf() with just a small overhead of calling vsnprintf() twice instead of once. This is not on a hot-path (dns_catz unit), so we can ignore the overhead and instead have less single-use code in favor of using a reusable, more generic function.	2022-12-20 19:13:48 +01:00
Ondřej Surý	2a94123d5b	Refactor the isc_buffer_{get,put}uintN, add isc_buffer_peekuintN The Stream DNS implementation needs peek methods that read the value from the buffer without advancing the current position. Add isc_buffer_peekuintN methods, refactor the isc_buffer_{get,put}uintN methods to modern integer types, and move the isc_buffer_getuintN to the header as static inline functions.	2022-12-20 19:13:48 +01:00
Ondřej Surý	a1d45685e6	Move and extend the uint8_t little-endian to uint{32,64}_t macros to endian.h Move the U8TO{32,64}_LE and U{32,64}TO8_LE macros to endian.h and extend the macros for 16-bit and Big-Endian variants. Use the macros both in isc_siphash (LE) and isc_buffer (BE) units.	2022-12-20 19:13:48 +01:00
Ondřej Surý	aea251f3bc	Change the isc_buffer_reserve() to take just buffer pointer The isc_buffer_reserve() would be passed a reference to the buffer pointer, which was unnecessary as the pointer would never be changed in the current implementation. Remove the extra dereference.	2022-12-20 19:13:48 +01:00
Artem Boldariev	837fef78b1	Fix TLS session resumption via IDs when Mutual TLS is used This commit fixes TLS session resumption via session IDs when client certificates are used. To do so it makes sure that session ID contexts are set within server TLS contexts. See OpenSSL documentation for 'SSL_CTX_set_session_id_context()', the "Warnings" section.	2022-12-14 18:06:20 +02:00
Ondřej Surý	e2262c2112	Remove isc_resource API and set limits directly in named_os unit The only function left in the isc_resource API was setting the file limit. Replace the whole unit with a simple getrlimit to check the maximum value of RLIMIT_NOFILE and set the maximum back to rlimit_cur. This is more compatible than trying to set RLIMIT_UNLIMITED on the RLIMIT_NOFILE as it doesn't work on Linux (see man 5 proc on /proc/sys/fs/nr_open), nor does it work on the Darwin kernel (see man 2 getrlimit). The only place where the maximum value could be raised under a privileged user would be BSDs, but `named_os_adjustnofile()` was not called there before. We would apply the increased limits only on Linux and Sun platforms.	2022-12-07 19:40:00 +01:00
Ondřej Surý	50f357cb36	Refactor the dns_adb unit The dns_adb unit has been refactored to be much simpler. Following changes have been made: 1. Simplify the ADB to always allow GLUE and hints There were only two places where dns_adb_createfind() was used - in the dns_resolver unit where hints and GLUE addresses were ok, and in the dns_zone where dns_adb_createfind() would be called without DNS_ADBFIND_HINTOK and DNS_ADBFIND_GLUEOK set. Simplify the logic by allowing hint and GLUE addresses when looking up the nameserver addresses to notify. The difference is negligible and would cause a difference in the notified addresses only when there's a mismatch between the parent and child addresses and we haven't cached the child addresses yet. 2. Drop the namebuckets and entrybuckets Formerly, the namebuckets and entrybuckets were used to reduce the lock contention when accessing the doubly-linked lists stored in each bucket. In the previous refactoring, the custom hashtable for the buckets has been replaced with isc_ht/isc_hashmap, so only a single item (mostly, see below) would end up in each bucket. Removing the entrybuckets has been straightforward, the only matching was done on the isc_sockaddr_t member of the dns_adbentry. Removing the zonebuckets required GLUEOK and HINTOK bits to be removed because the find could match entries with-or-without the bits set, and creating a custom key that stores the DNS_ADBFIND_STARTATZONE in the first byte of the key, so we can do a straightforward lookup into the hashtable without traversing a list that contains items with different flags. 3. Remove unassociated entries from ADB database Previously, the adbentries could live in the ADB database even after unlinking them from dns_adbnames. Such entries would show up as "Unassociated entries" in the ADB dump. 
The benefit of keeping such entries is small - the chance that we link such an entry to an adbname is small, and it's simpler to evict unlinked entries from the ADB cache (and the hashtable) than to create a second LRU cleaning mechanism. Unlinked ADB entries are now directly deleted from the hash table (hashmap) upon destruction. 4. Cleanup expired entries from the hash table When buckets were still in place, the code would keep the buckets always allocated and never shrink the hash table (hashmap). With proper reference counting in place, we can delete the adbnames from the hash table and the LRU list. 5. Stop purging the names early when we hit the time limit Because the LRU list is now time ordered, we can stop purging the names when we find the first entry that doesn't fulfill our time-based eviction criteria because no further entry on the LRU list will meet the criteria. Future work: 1. Lock contention In this commit, the focus was on correctness of the data structure, but in the future, the lock contention in the ADB database needs to be addressed. Currently, we use a simple mutex to lock the hash tables, because we almost always need to use a write lock for properly purging the hashtables. The ADB database needs to be sharded (similar to the effect that buckets had in the past). Each shard would contain its own hashmap and its own LRU list. 2. Time-based purging The ADB names and entries stay intact when there are no lookups. When we add separate shards, a timer needs to be added for time-based cleaning in case there's no traffic hashing to the inactive shard. 3. Revisit the 30 minutes limit The ADB cache is capped at 30 minutes. This needs to be revisited, and at least the limit should be configurable (in both directions).	2022-11-30 10:03:24 +01:00
Ondřej Surý	118ae66976	Add extra set of ISC_REFCOUNT_TRACE_{IMPL,DECL} macros The new ISC_REFCOUNT_TRACE_{IMPL,DECL} macros can be used to add a reference tracing capability to any unit using the reference counting. It requires a little bit of extra work in each header as you can't have a define from inside a define (see rpz.h), but it's fairly easy to add tracing to any struct using reference counting with these macros.	2022-11-29 23:57:40 -08:00
Tony Finch	00307fe318	Deduplicate time unit conversion factors The various factors like NS_PER_MS are now defined in a single place and the names are no longer inconsistent. I chose the _PER_SEC names rather than _PER_S because it is slightly more clear in isolation; but the smaller units are always NS, US, and MS.	2022-11-25 13:23:36 +00:00
Ondřej Surý	f46ce447a6	Add isc_hashmap API that implements Robin Hood hashing Add new isc_hashmap API that differs from the current isc_ht API in several aspects: 1. It implements Robin Hood Hashing which is open-addressing hash table algorithm (e.g. no linked-lists) 2. No memory allocations - the array to store the nodes is made of isc_hashmap_node_t structures instead of just pointers, so there's only allocation on resize. 3. The key is not copied into the hashmap node and must be also stored externally, either as part of the stored value or in any other location that's valid as long the value is stored in the hashmap. This makes the isc_hashmap_t a little less universal because of the key storage requirements, but the inserts and deletes are faster because they don't require memory allocation on isc_hashmap_add() and memory deallocation on isc_hashmap_delete().	2022-11-10 15:07:19 +01:00
Ondřej Surý	0492bbf590	Make the pthread_rwlock implementation header-only macros [2/2] While using mutrace, the pthread_rwlock based isc_rwlock implementation would be all tracked in the rwlock.c unit, losing all useful information as all rwlocks would be traced in a single place. Rewrite the pthread_rwlock based implementation to be header-only macros, so we can use mutrace to properly track the rwlock contention without heavily patching mutrace to understand the libisc synchronization primitives.	2022-11-02 10:34:10 +01:00
Ondřej Surý	6bd201ccec	Remove one level of indirection from isc_rwlock [1/2] Instead of checking the PTHREAD_RUNTIME_CHECK from the header, move it to the pthread_rwlock implementation functions. The internal isc_rwlock actually cannot fail, so the checks in the header were useless anyway.	2022-11-02 10:27:09 +01:00
Ondřej Surý	98b7a93772	Remove isc_rwlock_downgrade() from isc_rwlock The isc_rwlock_downgrade() is not used anywhere, so we can remove it and make the pthread_rwlock implementation simpler.	2022-11-02 09:05:37 +01:00
Evan Hunt	dc878e3098	isc_async_run() runs events in reverse order when more than one event was scheduled in the isc_async queue, they were executed in reverse order. we need to pull events off the back of the queue instead of the front, so that uv_loop will run them in the right order. note that isc_job_run() has the same behavior, because it calls uv_idle_start() directly. in that case we just document it so it'll be less surprising in the future.	2022-10-31 05:43:45 -07:00
Mark Andrews	3881afeb15	Add dns_rdata_checksvcb dns_rdata_checksvcb performs data entry checks on SVCB records. In particular that _dns SVCB records have an 'alpn' and if that 'alpn' parameter indicates HTTP is in use that 'dohpath' is present.	2022-10-29 00:22:54 +11:00
Ondřej Surý	6ba0a22627	Change the return type of isc_lex_create() to void The isc_lex_create() cannot fail, so cleanup the return type from isc_result_t to void.	2022-10-26 12:55:06 +02:00
Ondřej Surý	5e20c2ccfb	Replace (void *)-1 with ISC_LINK_TOMBSTONE Instead of having "arbitrary" (void *)-1 to define non-linked, add an ISC_LINK_TOMBSTONE(type) macro that replaces the "magic" value with a define.	2022-10-18 11:36:15 +02:00
Ondřej Surý	cb3c36b8bf	Add ISC_{LIST,LINK}_INITIALIZER for designated initializers Since we are using designated initializers, we were missing initializers for ISC_LIST and ISC_LINK, add them, so you can do foo = (foo_t){ .list = ISC_LIST_INITIALIZER }; Instead of: foo = (foo_t){ 0 }; ISC_LIST_INIT(foo->list);	2022-10-18 11:36:15 +02:00
Tony Finch	26ed03a61e	Include the function name when reporting unexpected errors I.e. print the name of the function in BIND that called the system function that returned an error. Since it was useful for pthreads code, it seems worthwhile doing so everywhere.	2022-10-17 13:43:59 +01:00
Tony Finch	a34a2784b1	De-duplicate some calls to strerror_r() Specifically, when reporting an unexpected or fatal error.	2022-10-17 11:58:26 +01:00
Tony Finch	ec50c58f52	De-duplicate __FILE__, __LINE__ Mostly generated automatically with the following semantic patch, except where coccinelle was confused by #ifdef in lib/isc/net.c @@ expression list args; @@ - UNEXPECTED_ERROR(__FILE__, __LINE__, args) + UNEXPECTED_ERROR(args) @@ expression list args; @@ - FATAL_ERROR(__FILE__, __LINE__, args) + FATAL_ERROR(args)	2022-10-17 11:58:26 +01:00
Ondřej Surý	cedfc97974	Improve reporting for pthread_once errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/once.h with PTHREADS_RUNTIME_CHECK(), in order to improve error reporting for any once-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-10-14 16:39:21 +02:00
Ondřej Surý	beecde7120	Rewrite isc_httpd using picohttpparser and isc_url_parse Rewrite the isc_httpd to be more robust. 1. Replace the hand-crafted HTTP request parser with picohttpparser for parsing the whole HTTP/1.0 and HTTP/1.1 requests. Limit the number of allowed headers to 10 (arbitrary number). 2. Replace the hand-crafted URL parser with isc_url_parse for parsing the URL from the HTTP request. 3. Increase the receive buffer to match the isc_netmgr buffers, so we can receive at least two full isc_nm_read()s. This makes the truncation processing much simpler. 4. Process the received buffer from a single isc_nm_read() in a single loop and schedule the sends to be independent of each other. The first two changes make the code simpler and rely on libraries that we already had (isc_url based on nodejs) or that are used elsewhere (picohttpparser). The second two changes remove the artificial "truncation" limit on parsing multiple requests. Now only a request that has too many headers (currently 10) or is too big (so, the receive buffer fills up without reaching the end of the request) will end the connection. We can be benevolent here with the limits, because the statschannel channel is by definition private and access must be allowed only to administrators of the server. There are no timers, no rate-limiting, no upper limit on the number of requests that can be served, etc.	2022-10-14 11:26:54 +02:00
Ondřej Surý	dbf5672f32	Replace isc_mem_*_aligned(..., alignment) with isc_mem_*x(..., flags) Previously, the isc_mem_get_aligned() and friends took alignment size as one of the arguments. Replace the specific function with a more generic extended variant that now accepts ISC_MEM_ALIGN(alignment) for aligned allocations and ISC_MEM_ZERO for allocations that zero the (re-)allocated memory before returning the pointer to the caller.	2022-10-05 16:44:05 +02:00
Ondřej Surý	c14a4ac763	Add a case-insensitive option directly to siphash 2-4 implementation Formerly, the isc_hash32() would have to change the key in a local copy to make it case insensitive. Change the isc_siphash24() and isc_halfsiphash24() functions to lowercase the input directly when reading it from the memory and converting the uint8_t * array to 64-bit (respectively 32-bit numbers).	2022-10-04 10:32:40 +02:00
Mark Andrews	5f07fe8cbb	Use strnstr implementation from FreeBSD if not provided by OS	2022-10-04 14:21:41 +11:00
Ondřej Surý	477eb22c12	Refactor isc_ratelimiter API Because the dns_zonemgr_create() was run before the loopmgr was started, the isc_ratelimiter API was more complicated than it had to be. Move the dns_zonemgr_create() to the run_server() task which is run on the main loop, and simplify the isc_ratelimiter API implementation. The isc_timer is now created in the isc_ratelimiter_create() and starting the timer is now a separate async task, as is destroying the timer in case it's not launched from the loop it was created on. The ratelimiter tick now doesn't have to create and destroy timer logic and just stops the timer when there's no more work to do. This should also solve all the races that were causing the isc_ratelimiter to be left dangling because the timer was stopped before the last reference would be detached.	2022-09-30 10:36:30 +02:00
Ondřej Surý	1e2ededb07	Add missing DbC check for name##_detach in ISC_REFCOUNT_IMPL macro The detach function in the ISC_REFCOUNT_IMPL macro was missing DbC checks, add them.	2022-09-30 09:50:17 +02:00
Ondřej Surý	e537fea861	Use custom isc_mem based allocator for libxml2 The libxml2 library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, strdup and free). Create a memory context specifically for libxml2 to allow tracking the memory usage that has originated from within libxml2. This will provide a separate memory context for libxml2 to track the allocations and when shutting down the application it will check that all libxml2 allocations were returned to the allocator. Additionally, move the xmlInitParser() and xmlCleanupParser() calls from bin/named/main.c to library constructor/destructor in libisc library.	2022-09-27 17:10:42 +02:00
Ondřej Surý	236d4b7739	Use custom isc_mem based allocator for OpenSSL The OpenSSL library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, and free). Create a memory context specifically for OpenSSL to allow tracking the memory usage that has originated from within OpenSSL. This will provide a separate memory context for OpenSSL to track the allocations and when shutting down the application it will check that all OpenSSL allocations were returned to the allocator.	2022-09-27 17:10:42 +02:00
Ondřej Surý	a32d06dd42	Use custom isc_mem based allocator for libuv The libuv library provides a way to replace the default allocator with user supplied allocator (malloc, realloc, calloc and free). Create a memory context specifically for libuv to allow tracking the memory usage that has originated from within libuv. This requires libuv >= 1.38.0 which provides uv_library_shutdown() function that assures no more allocations will be made.	2022-09-27 17:10:42 +02:00
Ondřej Surý	0086ebf3fc	Bump the libuv requirement to libuv >= 1.34.0 By bumping the minimum libuv version to 1.34.0, it allows us to remove all libuv shims we ever had and makes the code much cleaner. The up-to-date libuv is available in all distributions supported by BIND 9.19+ either natively or as a backport.	2022-09-27 17:09:10 +02:00
Evan Hunt	1926ddc987	change ISC__BUFFER macros to inline functions previously, when ISC_BUFFER_USEINLINE was defined, macros were used to implement isc_buffer primitives (isc_buffer_init(), isc_buffer_region(), etc). these macros were missing the DbC assertions for those primitives, which made it possible for coding errors to go undetected. adding the assertions to the macros caused compiler warnings on some platforms. therefore, this commit converts the ISC__BUFFER macros to static inline functions instead, with assertions included, and eliminates the non-inline implementation from buffer.c. the --enable-buffer-useinline configure option has been removed.	2022-09-26 23:49:27 -07:00
Ondřej Surý	1baed21688	Switch the CSPRNG function from RAND_bytes() to uv_random() The RAND_bytes() implementation differs between the OpenSSL versions and uses the system entropy only for seeding its internal CSPRNG. The uv_random() on the other hand uses the system provided CSPRNG. Switch from RAND_bytes() to uv_random() to use system provided CSPRNG.	2022-09-26 15:13:11 +02:00
Ondřej Surý	fffd444440	Cleanup the asynchronous code in the stream implementations After the loopmgr work has been merged, we can now cleanup the TCP and TLS protocols a little bit, because there are stronger guarantees that the sockets will be kept on the respective loops/threads. We only need asynchronous calls for listening sockets (start, stop) and reading from the TCP (because the isc_nm_read() might be called from the read callback again). This commit does the following changes (they are intertwined together): 1. Cleanup most of the asynchronous events in the TCP code, and add comments for the events that need to be kept asynchronous. 2. Remove isc_nm_resumeread() from the netmgr API, and replace isc_nm_resumeread() calls with existing isc_nm_read() calls. 3. Remove isc_nm_pauseread() from the netmgr API, and replace isc_nm_pauseread() calls with a new isc_nm_read_stop() call. 4. Disable the isc_nm_cancelread() for the streaming protocols, only the datagram-like protocols can use isc_nm_cancelread(). 5. Add isc_nmhandle_close() that can be used to shutdown the socket earlier than after the last detach. Formerly, the socket would be closed only after all reading and sending would be finished and the last reference would be detached. The new isc_nmhandle_close() can be used to close the underlying socket earlier, so all the other asynchronous calls would call their respective callbacks immediately. Co-authored-by: Ondřej Surý <ondrej@isc.org> Co-authored-by: Artem Boldariev <artem@isc.org>	2022-09-22 14:51:15 +02:00
Ondřej Surý	869c6d77a2	Convert isc_ratelimiter API to use on-loop timers In preparation for the on-loop timers, the isc_ratelimiter API was converted to use the timer on main loop and start and stop the timer asynchronously on the main loop.	2022-09-21 14:25:33 -07:00
Ondřej Surý	27d1e498b8	Add isc_timer_async_destroy() helper function As it sometimes happens that the object using isc_timer_t is destroyed via detaching all the references with no guarantee that the last thread will be matching thread, add a helper isc_timer_async_destroy() function that stops the timer and runs the destroy function via isc_async_run() on the matching thread.	2022-09-21 14:25:33 -07:00
Ondřej Surý	f6e4f620b3	Use the semantic patch to do the unsigned -> unsigned int change Apply the semantic patch on the whole code base to get rid of 'unsigned' usage in favor of explicit 'unsigned int'.	2022-09-19 15:56:02 +02:00
Tony Finch	21a383a8fd	General-purpose unrolled ASCII tolower() loops When converting a string to lower case, the compiler is able to autovectorize nicely, so a nice simple implementation is also very fast, comparable to memcpy(). Comparisons are more difficult for the compiler, so we convert eight bytes at a time using "SIMD within a register" tricks. Experiments indicate it's best to stick to simple loops for shorter strings and the remainder of long strings.	2022-09-12 12:18:57 +01:00
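The "SIMD within a register" idea described in this commit message can be sketched as follows. This is a minimal illustration and not the code from the commit: the names `tolower8()`, `tolower1()` and `lower_equal()` are hypothetical. Eight bytes are loaded into a 64-bit word, a per-byte mask marks the bytes in the range 'A'..'Z', and that mask is shifted to set the 0x20 bit of exactly those bytes. The remainder of the string is handled by a simple byte loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lower-case eight ASCII bytes packed in one word. */
static inline uint64_t
tolower8(uint64_t octets) {
	const uint64_t all_bytes = 0x0101010101010101ULL;
	/* Clear the top bit of every byte so the additions cannot carry
	 * from one byte into the next. */
	uint64_t heptets = octets & (0x7F * all_bytes);
	/* Top bit of each byte is set if that byte is > 'Z'. */
	uint64_t is_gt_Z = heptets + (0x7F - 'Z') * all_bytes;
	/* Top bit of each byte is set if that byte is >= 'A'. */
	uint64_t is_ge_A = heptets + (0x80 - 'A') * all_bytes;
	/* Top bit of each byte is set if the original byte was 7-bit ASCII. */
	uint64_t is_ascii = ~octets & (0x80 * all_bytes);
	/* 0x80 in every byte that holds an upper-case letter, 0 elsewhere. */
	uint64_t is_upper = is_ascii & (is_ge_A ^ is_gt_Z);
	/* Move the 0x80 marker down to 0x20, the ASCII case bit. */
	return octets | (is_upper >> 2);
}

/* Hypothetical helper: lower-case a single byte. */
static inline uint8_t
tolower1(uint8_t c) {
	return (c >= 'A' && c <= 'Z') ? (uint8_t)(c + ('a' - 'A')) : c;
}

/* Hypothetical helper: case-insensitive equality of two buffers. */
static bool
lower_equal(const uint8_t *a, const uint8_t *b, size_t len) {
	/* Compare eight bytes at a time while enough input remains. */
	while (len >= 8) {
		uint64_t wa, wb;
		memcpy(&wa, a, 8); /* memcpy avoids unaligned access */
		memcpy(&wb, b, 8);
		if (tolower8(wa) != tolower8(wb)) {
			return false;
		}
		a += 8;
		b += 8;
		len -= 8;
	}
	/* Simple loop for short strings and the tail of long ones. */
	for (size_t i = 0; i < len; i++) {
		if (tolower1(a[i]) != tolower1(b[i])) {
			return false;
		}
	}
	return true;
}
```

Because every operation works on each byte independently, the sketch does not depend on the byte order of the machine.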
Tony Finch	27a561273e	Consolidate some ASCII tables in `isc/ascii` and `isc/hex` There were a number of places that had copies of various ASCII tables (case conversion, hex and decimal conversion) that are intended to be faster than the ctype.h macros, or avoid locale pollution. Move them into libisc, and wrap the lookup tables with macros that avoid the ctype.h gotchas.	2022-09-12 12:18:57 +01:00
Michał Kępień	3b1c80fd0f	Fix error reporting for POSIX Threads functions Commit 3608abc8fa6a33046e1d34a0789cf7c9547f09ad inadvertently carried over a mistake in logging pthread_cond_init() errors to the ERRNO_CHECK() preprocessor macro: instead of passing the value returned by a given pthread_*() function to strerror_r(), ERRNO_CHECK() passes the errno variable to strerror_r(). This causes bogus error reports because POSIX Threads API functions do not set the errno variable. Fix by passing the value returned by a given pthread_*() function instead of the errno variable to strerror_r(). Since this change makes the name of the affected macro (ERRNO_CHECK()) confusing, rename the latter to PTHREADS_RUNTIME_CHECK(). Also log the integer error value returned by a given pthread_*() function verbatim to rule out any further confusion in runtime error reporting.	2022-09-09 20:25:47 +02:00
Ondřej Surý	4d07768a09	Remove the isc_app API The isc_app API is no longer used and has been removed.	2022-08-26 09:09:25 +02:00
Ondřej Surý	b69e783164	Update netmgr, tasks, and applications to use isc_loopmgr Previously: * applications were using isc_app as the base unit for running the application and signal handling. * networking was handled in the netmgr layer, which would start a number of threads, each with a uv_loop event loop. * task/event handling was done in the isc_task unit, which used netmgr event loops to run the isc_event calls. In this refactoring: * the network manager now uses isc_loop instead of maintaining its own worker threads and event loops. * the taskmgr that manages isc_task instances now also uses isc_loopmgr, and every isc_task runs on a specific isc_loop bound to the specific thread. * applications have been updated as necessary to use the new API. * new ISC_LOOP_TEST macros have been added to enable unit tests to run isc_loop event loops. unit tests have been updated to use this where needed.	2022-08-26 09:09:24 +02:00
Ondřej Surý	49b149f5fd	Update isc_timer to use isc_loopmgr * isc_timer was rewritten using the uv_timer, and isc_timermgr_t was completely removed; isc_timer objects are now directly created on the isc_loop event loops. * the isc_timer API has been simplified. the "inactive" timer type has been removed; timers are now stopped by calling isc_timer_stop() instead of resetting to inactive. * isc_manager now creates a loop manager rather than a timer manager. * modules and applications using isc_timer have been updated to use the new API.	2022-08-25 17:17:07 +02:00
Ondřej Surý	84c90e223f	New event loop handling API This commit introduces new APIs for applications and signal handling, intended to replace isc_app for applications built on top of libisc. * isc_app will be replaced with isc_loopmgr, which handles the starting and stopping of applications. In isc_loopmgr, the main thread is not blocked, but is part of the working thread set. The loop manager will start a number of threads, each with a uv_loop event loop running. Setup and teardown functions can be assigned which will run when the loop starts and stops, and jobs can be scheduled to run in the meantime. When isc_loopmgr_shutdown() is run from any of the loops, all loops will shut down and the application can terminate. * signal handling will now be handled with a separate isc_signal unit. isc_loopmgr only handles SIGTERM and SIGINT for application termination, but the application may install additional signal handlers, such as SIGHUP as a signal to reload configuration. * new job running primitives, isc_job and isc_async, have been added. Both units schedule callbacks (specifying a callback function and argument) on an event loop. The difference is that the isc_job unit is unlocked and not thread-safe, so it can be used to efficiently run jobs in the same thread, while isc_async is thread-safe and uses locking, so it can be used to pass jobs from one thread to another. * isc_tid will be used to track the thread ID in isc_loop worker threads. * unit tests have been added for the new APIs.	2022-08-25 12:24:29 +02:00
Ondřej Surý	a26862e653	Simplify the isc_event API The ev_tag field was never used, and has now been removed.	2022-08-25 12:24:25 +02:00
Michał Kępień	b67ff4728f	Improve reporting for barrier errors uv_barrier_init() errors are currently ignored. Use UV_RUNTIME_CHECK() to catch them and to improve error reporting for any uv_barrier_init() run-time failures (by augmenting error messages with file/line information and the error string corresponding to the value returned).	2022-07-13 13:19:32 +02:00
Michał Kępień	7009f9d270	Improve reporting for read-write lock errors Replace direct uses of implementation-specific rwlock functions in lib/isc/include/isc/rwlock.h with preprocessor macros that use ERRNO_CHECK(), in order to augment rwlock-related error messages with file/line/caller information and the error string corresponding to errno. Adjust the implementation-specific functions for pthreads-based rwlocks so that they return any errors encountered to the caller instead of aborting execution immediately using RUNTIME_CHECK(). To keep code modifications simple, make the non-pthreads-based implementation-specific rwlock functions always return 0; these functions continue to handle errors using less verbose run-time assertions as they do not set errno anyway.	2022-07-13 13:19:32 +02:00
Michał Kępień	badeeff0ac	Improve reporting for condition variable errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/condition.h with ERRNO_CHECK(), in order to improve error reporting for any condition-variable-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-07-13 13:19:32 +02:00
Michał Kępień	f352a834a7	Improve reporting for mutex errors Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/mutex.h with ERRNO_CHECK(), in order to improve error reporting for any mutex-related run-time failures (by augmenting error messages with file/line/caller information and the error string corresponding to errno).	2022-07-13 13:19:32 +02:00
Michał Kępień	77aead5ab6	Enable tracking of pthreads barriers Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_barrier_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_barrier_destroy() or else the memory allocated for the barrier will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying barriers that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_barrier_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set, allocate isc_barrier_t structures on the heap in isc_barrier_init() and free them in isc_barrier_destroy(). Reuse existing barrier macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
Ondřej Surý	e4606da2c6	Enable tracking of pthreads rwlocks Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_rwlock_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_rwlock_destroy() or else the memory allocated for the rwlock will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying rwlocks that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_rwlock_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set (and --enable-pthread-rwlock is used), allocate isc_rwlock_t structures on the heap in isc_rwlock_init() and free them in isc_rwlock_destroy(). Reuse existing functions defined in lib/isc/rwlock.c for other operations, but rename them first, so that they contain triple underscores (to indicate that these functions are implementation-specific, unlike their mutex and condition variable counterparts, which always use the pthreads implementation). Define the isc__rwlock_init() macro so that it is a logical counterpart of isc__mutex_init() and isc__condition_init(); adjust isc___rwlock_init() accordingly. Remove a redundant function prototype for isc__rwlock_lock() and rename that (static) function to rwlock_lock() in order to avoid having to use quadruple underscores.	2022-07-13 13:19:32 +02:00
Ondřej Surý	8dfdb95a20	Enable tracking of pthreads condition variables Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_cond_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_cond_destroy() or else the memory allocated for the condition variable will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying condition variables that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_cond_destroy() calls on any platform on which it works reliably. When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro is set, allocate isc_condition_t structures on the heap in isc_condition_init() and free them in isc_condition_destroy(). Reuse existing condition variable macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
Ondřej Surý	ebcfb16576	Enable tracking of pthreads mutexes Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate memory on the heap when pthread_mutex_init() is called. Every call to that function must be accompanied by a corresponding call to pthread_mutex_destroy() or else the memory allocated for the mutex will leak. jemalloc can be used for detecting memory allocations which are not released by a process when it exits. Unfortunately, since jemalloc is also the system allocator on FreeBSD and a special (profiling-enabled) build of jemalloc is required for memory leak detection, this method cannot be used for detecting leaked memory allocated by libthr on a stock FreeBSD installation. However, libthr's behavior can be emulated on any platform by implementing alternative versions of libisc functions for creating and destroying mutexes that allocate memory using malloc() and release it using free(). This enables using jemalloc for detecting missing pthread_mutex_destroy() calls on any platform on which it works reliably. Introduce a new ISC_TRACK_PTHREADS_OBJECTS preprocessor macro, which causes isc_mutex_t structures to be allocated on the heap by isc_mutex_init() and freed by isc_mutex_destroy(). Reuse existing mutex macros (after renaming them appropriately) for other operations.	2022-07-13 13:19:32 +02:00
Ondřej Surý	deae974366	Directly cause assertion failure on pthreads primitives failure Instead of returning error values from isc_rwlock_*(), isc_mutex_*(), and isc_condition_*() macros/functions and subsequently carrying out runtime assertion checks on the return values in the calling code, trigger assertion failures directly in those macros/functions whenever any pthread function returns an error, as there is no point in continuing execution in such a case anyway.	2022-07-13 13:19:32 +02:00
Michał Kępień	365b47caee	Add an ERRNO_CHECK() preprocessor macro In a number of situations in pthreads-related code, a common sequence of steps is taken: if the value returned by a library function is not 0, pass errno to strerror_r(), log the string returned by the latter, and immediately abort execution. Add an ERRNO_CHECK() preprocessor macro which takes those exact steps and use it wherever (conveniently) possible. Notes: 1. The "log the return value of strerror_r() and abort" pattern is used in a number of other places that this commit does not touch; only "!= 0" checks followed by isc_error_fatal() calls with non-customized error messages are replaced here. 2. This change temporarily breaks file name & line number reporting for isc__mutex_init() errors, to prevent breaking the build. This issue will be rectified in a subsequent change.	2022-07-13 13:19:32 +02:00
Evan Hunt	a499794984	REQUIRE should not have side effects it's a style violation to have REQUIRE or INSIST contain code that must run for the server to work. this was being done with some atomic_compare_exchange calls. these have been cleaned up. uses of atomic_compare_exchange in assertions have been replaced with a new macro atomic_compare_exchange_enforced, which uses RUNTIME_CHECK to ensure that the exchange was successful.	2022-07-05 12:22:55 -07:00
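The pattern this commit message describes can be sketched with C11 atomics. This is an illustration under assumptions, not the macro from the commit: `CAS_ENFORCED()` and `mark_closing()` are hypothetical names, and the failure path here calls `abort()` where the commit uses RUNTIME_CHECK. The point is that the compare-exchange, which must always run, lives in its own statement, and the assertion only inspects state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical macro: perform the exchange unconditionally and abort if it
 * fails.  Unlike assert(atomic_compare_exchange_strong(...)), the exchange
 * still happens when assertions are compiled out with NDEBUG.
 */
#define CAS_ENFORCED(obj, expected, desired)                              \
	do {                                                              \
		if (!atomic_compare_exchange_strong((obj), (expected),    \
						    (desired))) {         \
			fprintf(stderr, "%s:%d: compare-exchange failed\n", \
				__FILE__, __LINE__);                      \
			abort();                                          \
		}                                                         \
	} while (0)

/* Hypothetical caller: flip a flag from false to true exactly once. */
static void
mark_closing(atomic_bool *closing) {
	bool expected = false;

	/* A side-effect-free check is fine inside an assertion. */
	assert(closing != NULL);

	/* The state change itself is kept outside of any assertion. */
	CAS_ENFORCED(closing, &expected, true);
}
```

Calling `mark_closing()` twice on the same flag would abort on the second call, because the flag no longer holds the expected value `false`.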
Artem Boldariev	d2e13ddf22	Update the set of HTTP endpoints on reconfiguration This commit ensures that on reconfiguration the set of HTTP endpoints (=paths) is being updated within HTTP listeners.	2022-06-28 15:42:38 +03:00
Artem Boldariev	e72962d5f1	Update max concurrent streams limit in HTTP listeners on reconfig This commit ensures that HTTP listeners concurrent streams limit gets updated properly on reconfiguration.	2022-06-28 15:42:38 +03:00
Michal Nowak	1c45a9885a	Update clang to version 14	2022-06-16 17:21:11 +02:00
Ondřej Surý	1fe391fd40	Make all tasks to be bound to a thread Previously, tasks could be created either unbound or bound to a specific thread (worker loop). The unbound tasks would be assigned to a random thread every time isc_task_send() was called. Because there's no logic that would assign the task to the least busy worker, this just creates unpredictability. Instead of random assignment, bind all the previously unbound tasks to worker 0, which is guaranteed to exist.	2022-05-25 16:04:51 +02:00
Artem Boldariev	98f758ed4f	CID 352848: split xfrin_start() and remove dead code This commit separates TLS context creation code from xfrin_start() as it has become too large and hard to follow into a new function (similarly how it is done in dighost.c) The dead code has been removed from the cleanup section of the TLS creation code: * there is no way 'tlsctx' can equal 'found'; * there is no way 'sess_cache' can be non-NULL in the cleanup section. Also, it fixes a bug in the older version of the code, where TLS client session context fetched from the cache would not get passed to isc_nm_tlsdnsconnect().	2022-05-25 12:38:38 +03:00
Artem Boldariev	86465c1dac	DoT: implement TLS client session resumption This commit extends DoT code with TLS client session resumption support implemented on top of the TLS client session cache.	2022-05-20 20:17:48 +03:00
Artem Boldariev	90bc13a5d5	TLS stream/DoH: implement TLS client session resumption This commit extends TLS stream code and DoH code with TLS client session resumption support implemented on top of the TLS client session cache.	2022-05-20 20:17:45 +03:00
Artem Boldariev	987892d113	Extend TLS context cache with TLS client session cache This commit extends TLS context cache with TLS client session cache so that an associated session cache can be stored alongside the TLS context within the context cache.	2022-05-20 20:13:20 +03:00
Artem Boldariev	4ef40988f3	Add TLS client session cache implementation This commit adds an implementation of a client TLS session cache. TLS client session cache is an object which allows efficient storing and retrieval of previously saved TLS sessions so that they can be resumed. This object is supposed to be a foundation for implementing TLS session resumption - a standard technique to reduce the cost of re-establishing a connection to the remote server endpoint. OpenSSL does server-side TLS session caching transparently by default. However, on the client-side, a TLS session to resume must be manually specified when establishing the TLS connection. The TLS client session cache is precisely the foundation for that.	2022-05-20 20:13:20 +03:00
Ondřej Surý	14c8d43863	Use C2x [[fallthrough]] when supported by LLVM/clang Clang added support for the gcc-style fallthrough attribute (i.e. __attribute__((fallthrough))) in version 10. However, __has_attribute(fallthrough) will return 1 in C mode in older versions, even though they only support the C++11 fallthrough attribute. At best, the unsupported attribute is simply ignored; at worst, it causes errors. The C2x fallthrough attribute has the advantages of being supported in the broadest range of clang versions (added in version 9) and being easy to check for support. Use C2x [[fallthrough]] attribute if possible, and fall back to not using an attribute for clang versions that don't have it. Courtesy of Joshua Root	2022-05-19 21:40:24 +02:00
Evan Hunt	6936db2f59	Always use the number of CPUs for resolver->ntasks Since the fctx hash table is now self-resizing, and resolver tasks are selected to match the thread that created the fetch context, there shouldn't be any significant advantage to having multiple tasks per CPU; a single task per thread should be sufficient. Additionally, the fetch context is always pinned to the calling netmgr thread to limit contention to coalesced fetches only - if two threads start the same fetch, it will be pinned to the first one to get the bucket.	2022-05-19 09:27:33 +02:00
Ondřej Surý	0582478c96	Remove isc_task_destroy() and isc_task_shutdown() After removing the isc_task_onshutdown(), the isc_task_shutdown() and isc_task_destroy() became obsolete. Remove calls to isc_task_shutdown() and replace the calls to isc_task_destroy() with isc_task_detach(). Simplify the internal logic to destroy the task when the last reference is removed.	2022-05-12 14:55:49 +02:00
Ondřej Surý	2235edabcf	Remove isc_task_onshutdown() The isc_task_onshutdown() was used to post an event that should be run when the task is being shut down. This could happen explicitly in the isc_task_shutdown() call or implicitly when we detach the last reference to the task and there are no more events posted on the task. This whole task onshutdown mechanism just makes things more complicated, and it's easier to post the "shutdown" events when we are shutting down explicitly, since the existing code already always knows when it should shut down the task that's being used to execute the onshutdown events. Replace the isc_task_onshutdown() calls with explicit calls to execute the shutdown tasks.	2022-05-12 13:45:34 +02:00
Ondřej Surý	b43812692d	Move netmgr/uv-compat.h to <isc/uv.h> As we are going to use libuv outside of the netmgr, we need the shims to be readily available for the rest of the codebase. Move the "netmgr/uv-compat.h" to <isc/uv.h> and netmgr/uv-compat.c to uv.c, and as a rule of thumb, the users of libuv should include <isc/uv.h> instead of <uv.h> directly. Additionally, merge netmgr/uverr2result.c into uv.c and rename the single function from isc__nm_uverr2result() to isc_uverr2result().	2022-05-03 10:02:19 +02:00
Tony Finch	d20ea4a703	Make isc_random_uniform() nearly divisionless It used to require two 32-bit integer divisions to get a random number less than some limit. Now we use Daniel Lemire's "nearly-divisionless" algorithm for unbiased bounded random numbers, which requires one 64-bit integer multiply in the usual case, and one 32-bit integer division in rare slow cases. Even the slow cases are faster than before; there are also fewer branches. I think this algorithm is exceptionally beautiful. It also has more clever tricks than lines of code, so I have done my best to explain how it works.	2022-04-22 16:40:37 +01:00
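The algorithm named in this commit message, Daniel Lemire's nearly-divisionless method for unbiased bounded random numbers, can be sketched as below. This is a minimal illustration, not the code from the commit: the function names are hypothetical, and `xorshift32()` is only a stand-in generator so that the example is self-contained (it is not suitable for security-sensitive use).

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in 32-bit generator (assumption for this sketch only). */
static uint32_t xorshift_state = 2463534242u;

static uint32_t
xorshift32(void) {
	uint32_t x = xorshift_state;
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	xorshift_state = x;
	return x;
}

/*
 * Hypothetical helper: return an unbiased random number in [0, limit).
 * The usual case costs one 64-bit multiply; the division only happens
 * when the low half of the product falls below 'limit'.
 */
static uint32_t
random_uniform(uint32_t limit, uint32_t (*rand32)(void)) {
	assert(limit > 0);

	/* The high 32 bits of the product are the candidate result. */
	uint64_t num = (uint64_t)rand32() * (uint64_t)limit;

	if ((uint32_t)num < limit) {
		/* Rare slow path: (2^32 - limit) % limit == 2^32 % limit. */
		uint32_t residue = (uint32_t)(-limit) % limit;

		/* Reject the values that would bias the result. */
		while ((uint32_t)num < residue) {
			num = (uint64_t)rand32() * (uint64_t)limit;
		}
	}
	return (uint32_t)(num >> 32);
}
```

A call such as `random_uniform(10, xorshift32)` yields a value from 0 to 9. Passing the generator as a function pointer is a choice made for this sketch so that it can be swapped for a stronger source.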
Ondřej Surý	d1d88a2895	Add detailed tracing when TASKMGR_TRACE is defined When TASKMGR_TRACE=1 is defined, the task and event objects have detailed tracing information about function, file, line, and backtrace (to the extent tracked by gcc) where it was created. At exit, when there are unfinished tasks, they will be printed along with the detailed information.	2022-04-19 14:25:23 +02:00
Ondřej Surý	f0feaa3305	Remove isc_task_sendto(anddetach) functions The only place where isc_task_sendto() was used was in dns_resolver unit, where the "sendto" part was actually no-op, because dns_resolver uses bound tasks. Remove the isc_task_sendto() and isc_task_sendtoanddetach() functions in favor of using bound tasks created with isc_task_create_bound(). Additionally, cache the number of running netmgr threads (nworkers) locally to reduce the number of function calls.	2022-04-19 14:24:36 +02:00
Ondřej Surý	1eeb4c1121	Remove isc_event_constallocate() The isc_event_constallocate() function was not used anywhere, thus remove the isc_event_constallocate() macro, declaration and definition.	2022-04-19 13:46:26 +02:00
Ondřej Surý	f55a4d3e55	Allow listening on less than nworkers threads For some applications, it's useful to not listen on full battery of threads. Add workers argument to all isc_nm_listen*() functions and convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.	2022-04-19 11:08:13 +02:00
Artem Boldariev	df317184eb	Add isc_nmsocket_set_tlsctx() This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function that replaces the TLS context within a given TLS-enabled listener socket object. It is based on the newly added reference counting functionality. The intention of adding this function is to add functionality to replace a TLS context without recreating the whole socket object, including the underlying TCP listener socket, as a BIND process might not have enough permissions to re-create it fully on reconfiguration.	2022-04-06 18:45:57 +03:00
Artem Boldariev	a7a482c1b1	Add isc_tlsctx_attach() The implementation is done on top of the reference counting functionality found in OpenSSL/LibreSSL, which allows for avoiding wrapping the object. Adding this function allows using reference counting for TLS contexts in BIND 9's codebase.	2022-04-06 18:45:57 +03:00
Ondřej Surý	142c63dda8	Enable the load-balance-sockets configuration Previously, HAVE_SO_REUSEPORT_LB has been defined only in the private netmgr-int.h header file, making the configuration of load balanced sockets inoperable. Move the missing HAVE_SO_REUSEPORT_LB define to isc/netmgr.h and add the missing isc_nm_getloadbalancesockets() implementation.	2022-04-05 01:30:58 +02:00
Ondřej Surý	85c6e797aa	Add option to configure load balance sockets Previously, the option to enable kernel load balancing of the sockets was always enabled when supported by the operating system (SO_REUSEPORT on Linux and SO_REUSEPORT_LB on FreeBSD). It was reported that in scenarios where the networking threads are also responsible for processing long-running tasks (like RPZ processing, CATZ processing or large zone transfers), this could lead to intermittent brownouts for some clients, because the thread assigned by the operating system might be busy. In such scenarios, the overall performance would be better served by threads competing over the sockets because the idle threads can pick up the incoming traffic. Add new configuration option (`load-balance-sockets`) to allow enabling or disabling the load balancing of the sockets.	2022-04-04 23:10:04 +02:00
Ondřej Surý	f106d0ed2b	Run the RPZ update as offloaded work Previously, the RPZ updates ran quantized on the main nm_worker loops. As the quantum was set to 1024, this might lead to service interruptions when large RPZ update was processed. Change the RPZ update process to run as the offloaded work. The update and cleanup loops were refactored to do as little locking of the maintenance lock as possible for the shortest periods of time and the db iterator is being paused for every iteration, so we don't hold the rbtdb tree lock for prolonged periods of time.	2022-04-04 21:20:05 +02:00
Ondřej Surý	ae01ec2823	Don't use reference counting in isc_timer unit The reference counting and isc_timer_attach()/isc_timer_detach() semantics are actually misleading because they cannot be used under normal conditions. A timer is usually created with the object that uses it passed as the argument to the timer itself. This means that when the caller uses `isc_timer_detach()` it needs the timer to stop, and isc_timer_detach() does that only if this is the last reference. Unfortunately, this also means that if the timer is attached elsewhere and the timer fires, it will most likely cause a use-after-free, because the object used in the timer no longer exists. Remove the reference counting from the isc_timer unit, remove isc_timer_attach() function and rename isc_timer_detach() to isc_timer_destroy() to better reflect how the API needs to be used. The only caveat is that the already executed event must be destroyed before isc_timer_destroy() is called because the timer is no longer attached to .ev_destroy_arg.	2022-04-02 01:23:15 +02:00
Ondřej Surý	30e0fd942b	Remove task privileged mode Previously, the task privileged mode has been used only when named was starting up and loading the zones from the disk as the "first" thing to do. The privileged task was set up with quantum == 2, which made the taskmgr/netmgr spin around the privileged queue processing two events at a time. The same effect can be achieved by setting the quantum to UINT_MAX (i.e. practically unlimited) for the loadzone task, hence the privileged task mode was removed in favor of just processing all the events on the loadzone task in a single task_run().	2022-04-01 23:55:26 +02:00
Ondřej Surý	62a72211aa	Remove isc_pool API Since the last user of the isc_pool API is gone, remove the whole isc_pool API.	2022-04-01 23:50:34 +02:00
Ondřej Surý	2707d0eeb7	Set hard thread affinity for each zone After switching to per-thread resources in the zonemgr, the performance was decreased because the memory context, zonetask and loadtask was picked from the pool at random. Pin the zone to single threadid (.tid) and align the memory context, zonetask and loadtask to be the same, this sets the hard affinity of the zone to the netmgr thread.	2022-04-01 23:50:34 +02:00
Ondřej Surý	a94678ff77	Create per-thread task and memory context for zonemgr Previously, the zonemgr created 1 task per 100 zones and 1 memory context per 1000 zones (with minimum 10 tasks and 2 memory contexts) to reduce the contention between threads. Instead of reducing the contention by having many resources, create a per-nm_thread memory context, loadtask and zonetask and spread the zones between just per-thread resources. Note: this commit alone does decrease performance when loading the zone by couple seconds (in case of 1M zone) and thus there's more work in this whole MR fixing the performance.	2022-04-01 23:50:34 +02:00
Ondřej Surý	15ea6f002f	Add isc_task_setquantum() and use it for post-init zone loading Add isc_task_setquantum() function that modifies quantum for the future isc_task_run() invocations. NOTE: The current isc_task_run() caches the task->quantum into a local variable and therefore the current event loop is not affected by any quantum change.	2022-04-01 23:45:23 +02:00
Ondřej Surý	c17eee034b	Remove isc_task_purge() and isc_task_purgerange() The isc_task_purge() and isc_task_purgerange() were now unused, so sweep the task.c file. Additionally remove unused ISC_EVENTATTR_NOPURGE event attribute.	2022-04-01 23:45:23 +02:00
Ondřej Surý	48b2a5df97	Keep the list of scheduled events on the timer Instead of searching for the events to purge, keep the list of scheduled events on the timer list and purge the events that we have scheduled.	2022-04-01 23:45:23 +02:00
Ondřej Surý	17aed2f895	Repair isc_task_purgeevent(), clean isc_task_unsend{,range}() The isc_task_purgerange() was walking through all events on the task to find a matching task. Instead use the ISC_LINK_LINKED to find whether the event is active. Cleanup the related isc_task_unsend() and isc_task_unsendrange() functions that were not used anywhere.	2022-04-01 23:45:23 +02:00
Ondřej Surý	b84c9b2608	Turn isc_hash_bits32() into static inline function Adding extra val & 0xffff in the isc_hash_bits32() macros in the hotpath has significantly reduced the performance. Turn the macro into static inline function matching the previous hash_32() function used to compute hashval matching the hashtable->bits.	2022-04-01 23:04:24 +02:00
Ondřej Surý	b05a991ad0	Make isc_ht optionally case insensitive Previously, the isc_ht API would always take the key as a literal input to the hashing function. Change the isc_ht_init() function to take an 'options' argument, in which ISC_HT_CASE_SENSITIVE or _INSENSITIVE can be specified, to determine whether to use case-sensitive hashing in isc_hash32() when hashing the key.	2022-03-28 15:02:18 -07:00
Evan Hunt	e9ef3defa4	consolidate fibonacci hashing in one place Fibonacci hashing was implemented in four separate places (rbt.c, rbtdb.c, resolver.c, zone.c). This commit combines them into a single implementation. The hash_32() function is now replaced with isc_hash_bits32().	2022-03-28 14:44:21 -07:00
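Fibonacci hashing, as mentioned in this commit message, reduces a 32-bit value to a table index by multiplying it with a constant derived from the golden ratio and keeping the top bits of the product. The sketch below is a generic illustration and not the code from the commit: `hash_bits32()` is a hypothetical name, and the constant 0x9E3779B9 (approximately 2^32 divided by the golden ratio) is an assumption about a commonly used value rather than something stated in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constant: 2^32 divided by the golden ratio, rounded down. */
#define GOLDEN_RATIO_32 UINT32_C(0x9E3779B9)

/*
 * Hypothetical helper: map 'val' to an index that fits in 'bits' bits.
 * The multiplication wraps modulo 2^32; the high bits of the product are
 * the best mixed, so they are the ones that are kept.
 */
static inline uint32_t
hash_bits32(uint32_t val, unsigned int bits) {
	assert(bits >= 1 && bits <= 32);
	return (val * GOLDEN_RATIO_32) >> (32 - bits);
}
```

For a table with 2^8 buckets, `hash_bits32(key, 8)` gives an index from 0 to 255. Consecutive keys land far apart, which is the property that makes this scheme attractive for hash tables sized as a power of two.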
Artem Boldariev	783663db80	Add ISC_R_TLSBADPEERCERT error code to the TLS related code This commit adds support for ISC_R_TLSBADPEERCERT error code, which is supposed to be used to signal for TLS peer certificates verification in dig and other code. The support for this error code is added to our TLS and TLS DNS implementations. This commit also adds isc_nm_verify_tls_peer_result_string() function which is supposed to be used to get a textual description of the reason for getting a ISC_R_TLSBADPEERCERT error.	2022-03-28 15:32:30 +03:00
Artem Boldariev	71cf8fa5ac	Extend TLS context cache with CA certificates store This commit adds support for keeping CA certificates stores associated with TLS contexts. The intention is to keep one reusable store per a set of related TLS contexts.	2022-03-28 15:31:22 +03:00
Artem Boldariev	c49a81e27d	Add foundational functions to implement Strict/Mutual TLS This commit adds a set of functions that can be used to implement Strict and Mutual TLS: * isc_tlsctx_load_client_ca_names(); * isc_tlsctx_load_certificate(); * isc_tls_verify_peer_result_string(); * isc_tlsctx_enable_peer_verification().	2022-03-28 15:31:22 +03:00
Artem Boldariev	32783d36c2	Add utility functions to manipulate X509 certificate stores This commit adds a set of high-level utility functions to manipulate the certificate stores. The stores are needed to implement TLS certificates verification efficiently.	2022-03-28 15:31:22 +03:00
Ondřej Surý	9de10cd153	Remove extrahandle size from netmgr Previously, it was possible to assign a bit of memory space in the nmhandle to store the client data. This was complicated and prevents further refactoring of isc_nmhandle_t caching (future work). Instead of caching the data in the nmhandle, allocate the hot-path ns_client_t objects from per-thread clientmgr memory context and just assign it to the isc_nmhandle_t via isc_nmhandle_set().	2022-03-25 10:38:35 +01:00
Ondřej Surý	04d0b70ba2	Replace ISC_NORETURN with C11's noreturn C11 has builtin support for _Noreturn function specifier with convenience noreturn macro defined in <stdnoreturn.h> header. Replace ISC_NORETURN macro by C11 noreturn with fallback to __attribute__((noreturn)) if the C11 support is not complete.	2022-03-25 08:33:43 +01:00
Ondřej Surý	584f0d7a7e	Simplify way we tag unreachable code with only ISC_UNREACHABLE() Previously, the unreachable code paths would have to be tagged with: INSIST(0); ISC_UNREACHABLE(); There were also older parts of the code that used comment annotation: /* NOTREACHED */ Unify the handling of unreachable code paths to just use: UNREACHABLE(); The UNREACHABLE() macro now asserts when reached and also uses __builtin_unreachable(); when such builtin is available in the compiler.	2022-03-25 08:33:43 +01:00
Ondřej Surý	fe7ce629f4	Add FALLTHROUGH macro for __attribute__((fallthrough)) Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which is explicit version of the /* FALLTHROUGH */ comment we are currently using. Add and apply FALLTHROUGH macro that uses the attribute if available, but does nothing on older compilers. In one case (lib/dns/zone.c), using the macro revealed that we were using the /* FALLTHROUGH */ comment in wrong place, remove that comment.	2022-03-25 08:33:43 +01:00
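A minimal sketch of how such a macro can be defined, assuming the compiler's `__has_attribute` feature test is used for detection (the detection logic in BIND 9 itself may differ). The `steps_remaining()` function is a hypothetical usage example.

```c
#include <assert.h>

/* Use the attribute when the compiler reports support for it... */
#if defined(__has_attribute)
#if __has_attribute(fallthrough)
#define FALLTHROUGH __attribute__((fallthrough))
#endif
#endif

/* ...and fall back to a no-op statement on older compilers. */
#ifndef FALLTHROUGH
#define FALLTHROUGH \
	do {        \
	} while (0)
#endif

/* Hypothetical example: count the stages still to run from 'stage'. */
static int
steps_remaining(int stage) {
	int steps = 0;

	switch (stage) {
	case 0:
		steps++;
		FALLTHROUGH;
	case 1:
		steps++;
		FALLTHROUGH;
	case 2:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

Unlike a comment, the attribute is checked by the compiler, which is how a misplaced annotation such as the one in lib/dns/zone.c can surface.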
Ondřej Surý	d70daa29f7	Make netmgr the authority on number of threads running Instead of passing the "workers" variable back and forth along with passing the single isc_nm_t instance, add isc_nm_getnworkers() function that returns the number of netmgr threads that are running. Change the ns_interfacemgr and ns_taskmgr to utilize the newly acquired knowledge.	2022-03-18 21:53:28 +01:00
Ondřej Surý	e42cb1f198	Implement incremental hash table resizing in isc_ht Previously, an incremental hash table resizing was implemented for the dns_rbt_t hash table implementation. Using that as a base, implement incremental hash table resizing for isc_ht API hashtables as well: 1. During the resize, allocate the new hash table, but keep the old table unchanged. 2. In each lookup, delete, or iterator operation, check both tables. 3. Perform insertion operations only in the new table. 4. At each insertion also move <r> elements from the old table to the new table. 5. When all elements are removed from the old table, deallocate it. To ensure that the old table is completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (<r> + 1)/<r> during resizing. In our implementation <r> is equal to 1. The downside of this approach is that the old table and the new table could stay in memory for longer when there are no new insertions into the hash table for prolonged periods of time as the incremental rehashing happens only during the insertions.	2022-03-17 08:16:24 +01:00
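The five steps listed in the entry above can be sketched as follows, with `<r>` equal to 1 and the table doubling on resize. This is an illustration only, not the isc_ht code: every name (`iht_t`, `iht_add()`, `migrate_one()` and so on) is hypothetical, keys are assumed to be unique, deletion is omitted, and allocation failures are not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct node {
	uint32_t key;
	int value;
	struct node *next;
} node_t;

typedef struct {
	node_t **buckets[2]; /* [cur] is the new table, [!cur] the old one */
	size_t size[2];
	size_t count;
	size_t hiter; /* next old-table bucket to migrate from */
	int cur;
} iht_t;

static size_t
slot(uint32_t key, size_t size) {
	return (uint32_t)(key * 0x9E3779B9u) % size;
}

static void
iht_init(iht_t *ht, size_t size) {
	*ht = (iht_t){ .size = { size, 0 } };
	ht->buckets[0] = calloc(size, sizeof(node_t *));
}

static bool
rehashing(const iht_t *ht) {
	return ht->buckets[!ht->cur] != NULL;
}

/* Step 1: allocate the new table, keep the old one untouched. */
static void
start_resize(iht_t *ht) {
	int newidx = !ht->cur;
	ht->size[newidx] = ht->size[ht->cur] * 2;
	ht->buckets[newidx] = calloc(ht->size[newidx], sizeof(node_t *));
	ht->cur = newidx;
	ht->hiter = 0;
}

/* Steps 4 and 5: move one element, or free the old table once empty. */
static void
migrate_one(iht_t *ht) {
	int old = !ht->cur;
	if (!rehashing(ht)) {
		return;
	}
	while (ht->hiter < ht->size[old] && ht->buckets[old][ht->hiter] == NULL) {
		ht->hiter++;
	}
	if (ht->hiter < ht->size[old]) {
		node_t *n = ht->buckets[old][ht->hiter];
		size_t s = slot(n->key, ht->size[ht->cur]);
		ht->buckets[old][ht->hiter] = n->next;
		n->next = ht->buckets[ht->cur][s];
		ht->buckets[ht->cur][s] = n;
		return;
	}
	free(ht->buckets[old]);
	ht->buckets[old] = NULL;
	ht->size[old] = 0;
}

/* Step 2: lookups consult both tables while a resize is in progress. */
static node_t *
iht_find(const iht_t *ht, uint32_t key) {
	for (int t = 0; t < 2; t++) {
		if (ht->buckets[t] == NULL) {
			continue;
		}
		for (node_t *n = ht->buckets[t][slot(key, ht->size[t])];
		     n != NULL; n = n->next)
		{
			if (n->key == key) {
				return n;
			}
		}
	}
	return NULL;
}

/* Step 3: insertions go only into the new table. */
static void
iht_add(iht_t *ht, uint32_t key, int value) {
	if (!rehashing(ht) && ht->count >= ht->size[ht->cur]) {
		start_resize(ht);
	}
	migrate_one(ht);
	node_t *n = malloc(sizeof(*n));
	size_t s = slot(key, ht->size[ht->cur]);
	*n = (node_t){ .key = key, .value = value,
		       .next = ht->buckets[ht->cur][s] };
	ht->buckets[ht->cur][s] = n;
	ht->count++;
}
```

Because migration happens only inside `iht_add()`, both tables stay allocated for as long as no insertions arrive, which is the downside the commit message describes.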
Ondřej Surý	79b5ccbf34	Implement isc_interval_t on top of isc_time_t Change the isc_interval_t implementation from separate data type and separate implementation to be shim implementation on top of isc_time_t. The distinction between isc_interval_t and isc_time_t has been kept because they are semantically different - isc_interval_t is relative and isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to be freely interchangeable, e.g. this: isc_time_t t1; isc_interval_t interval; isc_time_t t2; isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2)); isc_time_subtract(t1, interval, t2); isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2)); to just: isc_time_t t1; isc_interval_t interval; isc_time_t t2; isc_time_subtract(t1, t2, interval); without introducing a whole set of new functions.	2022-03-14 13:00:05 -07:00
Ondřej Surý	e6ca2a651f	Refactor isc_timer_reset() use with semantic patch Add and apply semantic patch to remove expires argument from the isc_timer_reset() calls through the codebase.	2022-03-14 13:00:05 -07:00
Ondřej Surý	6437bcc488	Remove expires argument from isc_timer API The isc_timer_reset() now works only with intervals for once timers. This makes the API almost 1:1 compatible with the libuv timers making the further refactoring possible.	2022-03-14 13:00:05 -07:00
Ondřej Surý	c259cecc90	Refactor isc_timer_create() to just create timer The isc_timer_create() function was a bit conflated. It could have been used to create a timer and start it at the same time. As there was a single place where this was done before (see the previous commit for nta.c), this was cleaned up and the isc_timer_create() function was changed to only create new timer.	2022-03-14 13:00:05 -07:00
Ondřej Surý	8fbb42c49c	Remove "a temporary hack, 'rndc timerpoke'" In 2002, "a temporary hack, 'rndc timerpoke'" was added. It's time for it to go, so it was removed.	2022-03-14 13:00:05 -07:00
Ondřej Surý	f4751a91f7	Remove unused isc_timer_touch() function The isc_timer_touch() was unused, just remove it.	2022-03-14 13:00:05 -07:00
Ondřej Surý	bbe1c06a8b	Remove isc_timertype_limited from isc_timer API The isc_timertype_limited timer type was never used (not even in tests). Remove isc_timertype_limited timer type before planned refactoring.	2022-03-14 13:00:05 -07:00
Ondřej Surý	f251d69eba	Remove usage of deprecated ATOMIC_VAR_INIT() macro The C17 standard deprecated ATOMIC_VAR_INIT() macro (see [1]). Follow suit and remove the ATOMIC_VAR_INIT() usage in favor of simple assignment of the value as this is what all supported stdatomic.h implementations do anyway: * MacOSX.platform: #define ATOMIC_VAR_INIT(__v) {__v} * Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE) (VALUE) 1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf	2022-03-08 23:55:10 +01:00
Ondřej Surý	8fa27365ec	Make isc_ht_init() and isc_ht_iter_create() return void Previously, the function(s) in the commit subject could fail for various reasons - mostly allocation failures, or other functions returning different return code than ISC_R_SUCCESS. Now, the aforementioned function(s) cannot ever fail and they would always return ISC_R_SUCCESS. Change the function(s) to return void and remove the extra checks in the code that uses them.	2022-03-08 14:51:55 +01:00
Ondřej Surý	bbb4cdb92d	Make isc_heap_create() and isc_heap_insert() return void Previously, the function(s) in the commit subject could fail for various reasons - mostly allocation failures, or other functions returning different return code than ISC_R_SUCCESS. Now, the aforementioned function(s) cannot ever fail and they would always return ISC_R_SUCCESS. Change the function(s) to return void and remove the extra checks in the code that uses them.	2022-03-08 11:19:34 +01:00
Ondřej Surý	6bd025942c	Replace netievent lock-free queue with simple locked queue The current implementation of isc_queue uses Michael-Scott lock-free queue that in turn uses hazard pointers. It was discovered that the way we use the isc_queue, such complicated mechanism isn't really needed, because most of the time, we either execute the work directly when on nmthread (in case of UDP) or schedule the work from the matching nmthreads. Replace the current implementation of the isc_queue with a simple locked ISC_LIST. There's a slight improvement - since copying the whole list is very lightweight - we move the queue into a new list before we start the processing and locking just for moving the queue and not for every single item on the list. NOTE: There's a room for future improvements - since we don't guarantee the order in which the netievents are processed, we could have two lists - one unlocked that would be used when scheduling the work from the matching thread and one locked that would be used from non-matching thread.	2022-03-04 13:49:51 +01:00
Ondřej Surý	d01562f22b	Remove the keep-response-order ACL map The keep-response-order option has been obsoleted, and in this commit, remove the keep-response-order ACL map rendering the option no-op, the call to isc_nm_sequential() and the now unused isc_nm_sequential() function itself.	2022-02-18 09:16:03 +01:00
Ondřej Surý	3c7b04d015	Add network manager based timer API This commit adds an API that allows creating arbitrary timers associated with the network manager handles.	2022-02-17 21:38:17 +01:00
Ondřej Surý	a89d9e0fa6	Add isc_nmhandle_setwritetimeout() function In some situations (unit test and forthcoming XFR timeouts MR), we need to modify the write timeout independently of the read timeout. Add a isc_nmhandle_setwritetimeout() function that could be called before isc_nm_send() to specify a custom write timeout interval.	2022-02-17 09:06:58 +01:00
Ondřej Surý	0500345513	Remove unused functions from isc_thread API The isc_thread_setaffinity call was removed in !5265 and we are not going to restore it because it was proven that the performance is better without it. Additionally, remove the already disabled cpu system test. The isc_thread_setconcurrency function is unused and also calling pthread_setconcurrency() on Linux has no meaning, formerly it was added because of Solaris in 2001 and it was removed when taskmgr was refactored to run on top of netmgr in !4918.	2022-02-09 17:22:06 +01:00
Evan Hunt	d3fed6f400	update dlz_minimal.h the addition of support for ECS client information in DLZ modules omitted some necessary changes to build modules in contrib.	2022-01-27 15:48:50 -08:00
Petr Menšík	f00f521e9c	Use detected cache line size IBM power architecture has L1 cache line size equal to 128. Take advantage of that on that architecture, do not force more common value of 64. When it is possible to detect higher value, use that value instead. Keep the default to be 64.	2022-01-27 13:02:23 +01:00
Ondřej Surý	58bd26b6cf	Update the copyright information in all files in the repository This commit converts the license handling to adhere to the REUSE specification. It specifically: 1. Adds used licenses to LICENSES/ directory 2. Adds "isc" template for adding the copyright boilerplate 3. Changes all source files to include copyright and SPDX license header, this includes all the C sources, documentation, zone files, configuration files. There are notes in the doc/dev/copyrights file on how to add correct headers to the new files. 4. Handles the rest that can't be modified via .reuse/dep5 file. The binary (or otherwise unmodifiable) files could have license placed next to them in <foo>.license file, but this would lead to cluttered repository and most of the files handled in the .reuse/dep5 file are system test files.	2022-01-11 09:05:02 +01:00
Ondřej Surý	6269fce0fe	Use isc_mem_get_aligned() for isc_queue and cleanup max_threads The isc_queue_new() was using dirty tricks to allocate the head and tail members of the struct aligned to the cacheline. We can now use isc_mem_get_aligned() to allocate the structure to the cacheline directly. Use ISC_OS_CACHELINE_SIZE (64) instead of arbitrary ALIGNMENT (128), one cacheline size is enough to prevent false sharing. Cleanup the unused max_threads variable - there was actually no limit on the maximum number of threads. This was changed a while ago.	2022-01-05 17:10:58 +01:00
Ondřej Surý	c917a2ca88	Add isc_mem_*_aligned() function that works with aligned memory There are some situations where having aligned allocations would be useful, so we don't have to play tricks with padding the data to the cacheline sizes. Add isc_mem_{get,put,reget,putanddetach}_aligned() functions that have alignment and size as last argument mimicking the POSIX posix_memalign() functions on systems with jemalloc (see the documentation on MALLOCX_ALIGN() for more details). On systems without jemalloc, those functions are same as non-aligned variants.	2022-01-05 17:10:56 +01:00
Ondřej Surý	4f78f9d72a	Add #define ISC_OS_CACHELINE_SIZE 64 Add library ctor and dtor for isc_os compilation unit which initializes the numbers of the CPUs and also checks whether L1 cacheline size is really 64 if the sysconf() call is available.	2022-01-05 17:07:35 +01:00
Evan Hunt	61c160c4a5	Clean up isc_tlsctx_cache_detach() For consistency with similar functions, rename `pcache` to `cachep`, call a separate destroy function when references reach 0, and add a missing call to isc_refcount_destroy().	2022-01-04 23:07:12 -08:00
Artem Boldariev	eb37d967c2	Add TLS context cache This commit adds a TLS context object cache implementation. The intention of having this object is manyfold: - In the case of client-side contexts: allow reusing the previously created contexts to employ the context-specific TLS session resumption cache. That will enable XoT connection to be reestablished faster and with fewer resources by not going through the full TLS handshake procedure. - In the case of server-side contexts: reduce the number of contexts created on startup. That could reduce startup time in a case when there are many "listen-on" statements referring to a smaller amount of `tls` statements, especially when "ephemeral" certificates are involved. - The long-term goal is to provide in-memory storage for additional data associated with the certificates, like runtime representation (X509_STORE) of intermediate CA-certificates bundle for Strict TLS/Mutual TLS ("ca-file").	2021-12-29 10:25:11 +02:00
Michał Kępień	3081bda798	Add a logging category for TLS pre-master secrets TLS pre-master secrets will be dumped to disk using the logging framework provided by libisc. Add a new logging category for this type of debugging data in order to enable exporting it to a dedicated channel. Derive the name of the new category from the name of the relevant environment variable, SSLKEYLOGFILE.	2021-12-22 18:17:26 +01:00
Mark Andrews	a23507c4fa	Pass the digest buffer length to EVP_DigestSignFinal OpenSSL 3.0.1 does not accept 0 as a digest buffer length when calling EVP_DigestSignFinal as it now checks that the digest buffer length is large enough for the digest. Pass the digest buffer length instead.	2021-12-17 20:28:01 +11:00
Michal Nowak	9c013f37d0	Drop cppcheck workarounds As cppcheck was removed from the CI, associated workarounds and suppressions are not required anymore.	2021-12-14 15:03:56 +01:00
Michał Kępień	eb4713c8e5	Remove mutex debugging code Mutex debugging code (used when the ISC_MUTEX_DEBUG preprocessor macro is set to 1 and PTHREAD_MUTEX_ERRORCHECK is defined) has been broken for the past 3 years (since commit `2f3eee5a4f`) and nobody complained, which is a strong indication that this code is not being used these days any more. External tools for detecting locking issues are already wired into various GitLab CI checks. Drop all code depending on the ISC_MUTEX_DEBUG preprocessor macro being set.	2021-12-09 14:02:36 +01:00
Michał Kępień	0964a94ad5	Remove mutex profiling code Mutex profiling code (used when the ISC_MUTEX_PROFILE preprocessor macro is set to 1) has been broken for the past 3 years (since commit `0bed9bfc28`) and nobody complained, which is a strong indication that this code is not being used these days any more. External tools for both measuring performance and detecting locking issues are already wired into various GitLab CI checks. Drop all code depending on the ISC_MUTEX_PROFILE preprocessor macro being set.	2021-12-09 12:25:21 +01:00
Artem Boldariev	f0e18f3927	Add isc_nm_has_encryption() This commit adds an isc_nm_has_encryption() function intended to check if a given handle is backed by a connection which uses encryption.	2021-11-30 12:20:22 +02:00
Artem Boldariev	07cf827b0b	Add isc_nm_socket_type() This commit adds an isc_nm_socket_type() function which can be used to obtain a handle's socket type. This change obsoletes isc_nm_is_tlsdns_handle() and isc_nm_is_http_handle(). However, it was decided to keep the latter as we eventually might end up supporting multiple HTTP versions.	2021-11-30 12:20:22 +02:00
Evan Hunt	7f63ee3bae	address '--disable-doh' failures Change 5756 (GL #2854) introduced build errors when using 'configure --disable-doh'. To fix this, isc_nm_is_http_handle() is now defined in all builds, not just builds that have DoH enabled. Missing code comments were added both for that function and for isc_nm_is_tlsdns_handle().	2021-11-17 13:48:43 -08:00
Artem Boldariev	80482f8d3e	DoH: Add isc_nm_set_min_answer_ttl() This commit adds an isc_nm_set_min_answer_ttl() function which is intended to be used to give a hint to the underlying transport regarding the answer TTL. The interface is intentionally kept generic because over time more transports might benefit from this functionality, but currently it is intended for DoH to set "max-age" value within "Cache-Control" HTTP header (as recommended in the RFC8484, section 5.1 "Cache Interaction"). It is no-op for other DNS transports for the time being.	2021-11-05 14:14:59 +02:00
Aram Sargsyan	15cb706f22	Refactor the OpenSSL HMAC usage to use newer APIs OpenSSL 3 deprecates the HMAC* family and associated APIs. Rewrite portions of OpenSSL library usage code to use a newer set of HMAC APIs.	2021-10-28 07:38:56 +00:00
Evan Hunt	a55589f881	remove all references to isc_socket and related types Removed socket.c, socket.h, and all references to isc_socket_t, isc_socketmgr_t, isc_sockevent_t, etc.	2021-10-15 01:01:25 -07:00
Evan Hunt	8c51a32e5c	netmgr: add isc_nm_routeconnect() isc_nm_routeconnect() opens a route/netlink socket, then calls a connect callback, much like isc_nm_udpconnect(), with a handle that can then be monitored for network changes. Internally the socket is treated as a UDP socket, since route/netlink sockets follow the datagram contract.	2021-10-15 00:56:58 -07:00
Ondřej Surý	e603983ec9	Stop providing branch prediction information The __builtin_expect() can be used to provide the compiler with branch prediction information. The Gcc manual says[1] on the subject: In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. Stop using __builtin_expect() and ISC_LIKELY() and ISC_UNLIKELY() macros to provide the branch prediction information as the performance testing shows that named performs better when the __builtin_expect() is not being used. 1. https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fexpect	2021-10-14 10:33:24 +02:00
Ondřej Surý	f3635bcc14	Use #pragma once as header guards Unify the header guard style and replace the inconsistent include guards with #pragma once. The #pragma once is widely and very well supported in all compilers that BIND 9 supports, and #pragma once was already in use in several new or refactored headers. Using simpler method will also allow us to automate header guard checks as this is simpler to programmatically check. For reference, here are the reasons for the change taken from Wikipedia[1]: > In the C and C++ programming languages, #pragma once is a non-standard > but widely supported preprocessor directive designed to cause the > current source file to be included only once in a single compilation. > > Thus, #pragma once serves the same purpose as include guards, but with > several advantages, including: less code, avoidance of name clashes, > and sometimes improvement in compilation speed. On the other hand, > #pragma once is not necessarily available in all compilers and its > implementation is tricky and might not always be reliable. 1. https://en.wikipedia.org/wiki/Pragma_once	2021-10-13 00:49:15 -07:00
Matthijs Mekking	2af05beafa	Replace "master/slave" terms in code Replace some "master/slave" terminology in the code with the preferred "primary/secondary" keywords. This also changes user output such as log messages, and fixes a typo ("seconary") in cfg_test.c. There are still some references to "master" and "slave" for various reasons: - The old syntax can still be used as a synonym. - The master syntax is kept when it refers to master files and formats. - This commit replaces mainly keywords that are local. If "master" or "slave" is used in for example a structure that is all over the place, it is considered out of scope for the moment.	2021-10-12 13:11:13 -07:00
Ondřej Surý	ed95f9fba3	Update the source code formatting using clang-format-13 clang-format-13 fixed some of the formatting that clang-format-12 got wrong. Update the formatting.	2021-10-12 11:14:40 +02:00
Ondřej Surý	2e3a2eecfe	Make isc_result a static enum Remove the dynamic registration of result codes. Convert isc_result_t from unsigned + #defines into 32-bit enum type in grand unified <isc/result.h> header. Keep the existing values of the result codes even at the expense of the description and identifier tables being unnecessary large. Additionally, add couple of: switch (result) { [...] default: break; } statements where compiler now complains about missing enum values in the switch statement.	2021-10-06 11:22:20 +02:00
Ondřej Surý	804ec1bcaa	Improve STATIC_ASSERT macro for older compilers Previously, when using compiler without support for static assertions, the STATIC_ASSERT() macro would be replaced with runtime assertion. Change the STATIC_ASSERT() macro to a version that's compile time assertion even when using pre-C11 compilers. Courtesy of Joseph Quinsey: https://godbolt.org/z/K9RvWS	2021-10-05 22:13:29 +02:00
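One common way to get a compile-time assertion without C11 `_Static_assert` is to declare an array type whose size is negative when the condition is false. The sketch below shows that technique; it is not necessarily the variant adopted in the commit above, and the names `COMPILE_TIME_ASSERT`, `SA_CONCAT`, `header_t` and `header_size()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SA_CONCAT_(a, b) a##b
#define SA_CONCAT(a, b)	 SA_CONCAT_(a, b)

/*
 * A false condition yields 'char name[-1]', which no conforming
 * compiler accepts, so the build fails instead of the program
 * aborting at run time. __LINE__ keeps the typedef names distinct.
 */
#define COMPILE_TIME_ASSERT(cond) \
	typedef char SA_CONCAT(compile_time_assert_line_, __LINE__)[(cond) ? 1 : -1]

/* Hypothetical structure whose layout the code depends on. */
typedef struct {
	uint16_t id;
	uint16_t flags;
} header_t;

COMPILE_TIME_ASSERT(sizeof(header_t) == 4);

static size_t
header_size(void) {
	return sizeof(header_t);
}
```

The check costs nothing at run time because the typedef generates no code.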
Artem Boldariev	25b2c6ad96	Require "dot" ALPN token for zone transfer requests over DoT (XoT) This commit makes BIND verify that zone transfers are allowed to be done over the underlying connection. Currently, it makes sense only for DoT, but the code is deliberately made to be protocol-agnostic.	2021-10-05 11:23:47 +03:00
Artem Boldariev	eba3278e52	Add isc_nm_xfr_allowed() function The intention of having this function is to have a predicate to check if a zone transfer could be performed over the given handle. In most cases we can assume that we can do zone transfers over any stream transport except DoH, but this assumption will not work for zone transfers over DoT (XoT), as the RFC9103 requires ALPN to happen, which might not be the case for all deployments of DoT.	2021-10-05 11:23:47 +03:00
Artem Boldariev	56b3f5d832	Low level code to support ALPN in DoT This commit adds low-level code necessary to support ALPN in DoT as XoT requires "dot" ALPN token to be negotiated on a connection for zone transfers.	2021-10-05 11:23:47 +03:00
Evan Hunt	8b532d2e64	dispatch: Refactor to eliminate dns_dispatchevent - Responses received by the dispatch are no longer sent to the caller via a task event, but via a netmgr-style recv callback. the 'action' parameter to dns_dispatch_addresponse() is now called 'response' and is called directly from udp_recv() or tcp_recv() when a valid response has been received. - All references to isc_task and isc_taskmgr have been removed from dispatch functions. - All references to dns_dispatchevent_t have been removed and the type has been deleted. - Added a task to the resolver response context, to be used for fctx events. - When the caller cancels an operation, the response handler will be called with ISC_R_CANCELED; it can abort immediately since the caller will presumably have taken care of cleanup already. - Cleaned up attach/detach in resquery and request.	2021-10-02 11:39:56 -07:00
Evan Hunt	08ce69a0ea	Rewrite dns_resolver and dns_request to use netmgr timeouts - The `timeout_action` parameter to dns_dispatch_addresponse() has been replaced with a netmgr callback that is called when a dispatch read times out. this callback may optionally reset the read timer and resume reading. - Added a function to convert isc_interval to milliseconds; this is used to translate fctx->interval into a value that can be passed to dns_dispatch_addresponse() as the timeout. - Note that netmgr timeouts are accurate to the millisecond, so code to check whether a timeout has been reached cannot rely on microsecond accuracy. - If serve-stale is configured, then a timeout received by the resolver may trigger it to return stale data, and then resume waiting for the read timeout. this is no longer based on a separate stale timer. - The code for canceling requests in request.c has been altered so that it can run asynchronously. - TCP timeout events apply to the dispatch, which may be shared by multiple queries. since in the event of a timeout we have no query ID to use to identify the resp we wanted, we now just send the timeout to the oldest query that was pending. - There was some additional refactoring in the resolver: combining fctx_join() and fctx_try_events() into one function to reduce code duplication, and using fixednames in fetchctx and fetchevent. - Incidental fix: new_adbaddrinfo() can't return NULL anymore, so the code can be simplified.	2021-10-02 11:39:56 -07:00
Evan Hunt	308bc46a59	Convert dispatch to netmgr The flow of operations in dispatch is changing and will now be similar for both UDP and TCP queries: 1) Call dns_dispatch_addresponse() to assign a query ID and register that we'll be listening for a response with that ID soon. the parameters for this function include callback functions to inform the caller when the socket is connected and when the message has been sent, as well as a task action that will be sent when the response arrives. (later this could become a netmgr callback, but at this stage to minimize disruption to the calling code, we continue to use isc_task for the response event.) on successful completion of this function, a dispatch entry object will be instantiated. 2) Call dns_dispatch_connect() on the dispatch entry. this runs isc_nm_udpconnect() or isc_nm_tcpdnsconnect(), as needed, and begins listening for responses. the caller is informed via a callback function when the connection is established. 3) Call dns_dispatch_send() on the dispatch entry. this runs isc_nm_send() to send a request. 4) Call dns_dispatch_removeresponse() to terminate listening and close the connection. Implementation comments below: - As we will be using netmgr buffers now. code to send the length in TCP queries has also been removed as that is handled by the netmgr. - TCP dispatches can be used by multiple simultaneous queries, so dns_dispatch_connect() now checks whether the dispatch is already connected before calling isc_nm_tcpdnsconnect() again. - Running dns_dispatch_getnext() from a non-network thread caused a crash due to assertions in the netmgr read functions that appear to be unnecessary now. the assertions have been removed. - fctx->nqueries was formerly incremented when the connection was successful, but is now incremented when the query is started and decremented if the connection fails. - It's no longer necessary for each dispatch to have a pool of tasks, so there's now a single task per dispatch. 
- Dispatch code to avoid UDP ports already in use has been removed. - dns_resolver and dns_request have been modified to use netmgr callback functions instead of task events. some additional changes were needed to handle shutdown processing correctly. - Timeout processing is not yet fully converted to use netmgr timeouts. - Fixed a lock order cycle reported by TSAN (view -> zone-> adb -> view) by calling dns_zt functions without holding the view lock.	2021-10-02 11:39:56 -07:00
Evan Hunt	f439eb5d99	Dispatch API simplification - Many dispatch attributes can be set implicitly instead of being passed in. we can infer whether to set DNS_DISPATCHATTR_TCP or _UDP from whether we're calling dns_dispatch_createtcp() or _createudp(). we can also infer DNS_DISPATCHATTR_IPV4 or _IPV6 from the addresses or the socket that were passed in. - We no longer use dup'd sockets in UDP dispatches, so the 'dup_socket' parameter has been removed from dns_dispatch_createudp(), along with the code implementing it. also removed isc_socket_dup() since it no longer has any callers. - The 'buffersize' parameter was ignored and has now been removed; buffersize is now fixed at 4096. - Maxbuffers and maxrequests don't need to be passed in on every call to dns_dispatch_createtcp() and _createudp(). In all current uses, the value for mgr->maxbuffers will either be raised once from its default of 20000 to 32768, or else left alone. (passing in a value lower than 20000 does not lower it.) there isn't enough difference between these values for there to be any need to configure this. The value for disp->maxrequests controls both the quota of concurrent requests for a dispatch and also the size of the dispatch socket memory pool. it's not clear that this quota is necessary at all. the memory pool size currently starts at 32768, but is sometimes lowered to 4096, which is definitely unnecessary. This commit sets both values permanently to 32768. - Previously TCP dispatches allocated their own separate QID table, which didn't incorporate a port table. this commit removes per-dispatch QID tables and shares the same table between all dispatches. since dispatches are created for each TCP socket, this may speed up the dispatch allocation process. 
there may be a slight increase in lock contention since all dispatches are sharing a single QID table, but since TCP sockets are used less often than UDP sockets (which were already sharing a QID table), it should not be a substantial change. - The dispatch port table was being used to determine whether a port was already in use; if so, then a UDP socket would be bound with REUSEADDR. this commit removes the port table, and always binds UDP sockets that way.	2021-10-02 10:21:49 +02:00
Artem Boldariev	c759f25c7b	Add "session-tickets" options to the "tls" clause This commit adds the ability to enable or disable stateless TLS session resumption tickets (see RFC5077). Having this ability is twofold. Firstly, these tickets are encrypted by the server, and the algorithm might be weaker than the algorithm negotiated during the TLS session establishment (it is in general the case for TLSv1.2, but the generic principle applies to TLSv1.3 as well, despite it having better ciphers for session tickets). Thus, they might compromise Perfect Forward Secrecy. Secondly, disabling it might be necessary if the same TLS key/cert pair is supposed to be used by multiple servers to achieve, e.g., load balancing because the session ticket by default gets generated in runtime, while to achieve successful session resumption ability, in this case, would have required using a shared key. The proper alternative to having the ability to disable stateless TLS session resumption tickets is to implement a proper session tickets key rollover mechanism so that key rotation might be performed often (e.g. once an hour) to not compromise forward secrecy while retaining the associated performance benefits. That is much more work, though. On the other hand, having the ability to disable session tickets allows having a deployable configuration right now in the cases when either forward secrecy is wanted or sharing the TLS key/cert pair between multiple servers is needed (or both).	2021-10-01 15:50:43 +03:00
Artem Boldariev	16c6e2be06	Add "prefer-server-ciphers" options to the "tls" clause This commit adds support for enforcing the preference of server ciphers over the client ones. This way, the server gains control over cipher priority and can thus choose stronger ciphers when a client prioritises weaker ciphers over stronger ones, which is beneficial when trying to achieve Perfect Forward Secrecy.	2021-10-01 15:50:43 +03:00
Artem Boldariev	3b88d783a2	Add "ciphers" options to the "tls" clause This commit adds support for setting TLS cipher list string in the format specified in the OpenSSL documentation (https://www.openssl.org/docs/man1.1.1/man1/ciphers.html). The syntax of the cipher list is verified so that specifying the wrong string will prevent the configuration from being loaded.	2021-10-01 15:50:43 +03:00
Artem Boldariev	f2ae4c8480	DH-parameters loading support This commit adds support for loading DH-parameters (Diffie-Hellman parameters) via the new "dhparam-file" option within the "tls" clause. In particular, Diffie-Hellman parameters are needed to enable the range of forward-secrecy-enabled ciphers for TLSv1.2, which are silently disabled otherwise.	2021-10-01 15:50:43 +03:00
Artem Boldariev	992f815770	Add "protocols" options to the "tls" clause This commit adds the ability to specify allowed TLS protocols versions within the "tls" clause. If an unsupported TLS protocol version is specified in a file, the configuration file will not pass verification. Also, this commit adds strict checks for "tls" clauses verification, in particular: - it ensures that loading configuration files containing duplicated "tls" clauses is not allowed; - it ensures that loading configuration files containing "tls" clauses missing "cert-file" or "key-file" is not allowed; - it ensures that loading configuration files containing "tls" clauses named as "ephemeral" or "none" is not allowed.	2021-10-01 15:50:43 +03:00
Ondřej Surý	aeb3d1cab3	Add isc_mem_reget() function to realloc isc_mem_get allocations The isc_mem_get() and isc_mem_put() functions leave the memory allocation size tracking to the users of the API, while isc_mem_allocate() and isc_mem_free() track the sizes internally. This allowed isc_mem_reallocate() to manipulate memory allocations made by the latter set of functions, but not the former. This commit introduces the isc_mem_reget(ctx, old_ptr, old_size, new_size) function, which operates on memory allocations with external size tracking, completing the API.	2021-09-23 11:18:07 -07:00
Ondřej Surý	edee9440d0	Remove the masterfile-format map option As previously announced, this commit removes the 'map' masterfile-format from named, all the tools, the documentation and the system tests.	2021-09-17 07:09:50 +02:00
Ondřej Surý	8cb2ba5dd3	Remove native PKCS#11 support The native PKCS#11 support has been removed in favour of the better maintained, more performant and easier to use OpenSSL PKCS#11 engine from the OpenSC project.	2021-09-09 15:35:39 +02:00
Artem Boldariev	db1ba15ff2	Replace multiple /dns-query constants with a global one This commit replaces the constants defining /dns-query, the default DoH endpoint, with a global definition.	2021-08-30 10:32:17 +03:00
Artem Boldariev	530133c10f	Unify DoH URI making throughout the codebase This commit adds new function isc_nm_http_makeuri() which is supposed to unify DoH URI construction throughout the codebase. It handles IPv6 addresses, hostnames, and IPv6 addresses given as hostnames properly, and replaces similar ad-hoc code in the codebase.	2021-08-30 10:21:58 +03:00
Ondřej Surý	cdf9a1fd20	Remove support for external applications to register libisc The previous versions of BIND 9 exported its internal libraries so that they can be used by third-party applications more easily. Certain library functions were altered from specific BIND-only behavior to more generic behavior when used by other applications. This commit removes the function isc_lib_register() that was used by external applications to enable the functionality.	2021-08-30 08:47:39 +02:00
Evan Hunt	7867b8b57d	enable keepalive when the keepalive EDNS option is seen previously, receiving a keepalive option had no effect on how long named would keep the connection open; there was a place to configure the keepalive timeout but it was never used. this commit corrects that. this also fixes an error in isc__nm_{tcp,tls}dns_keepalive() in which the sense of a REQUIRE test was reversed; previously this error had not been noticed because the functions were not being used.	2021-08-27 09:56:51 -07:00
Evan Hunt	19e24e22f5	cleanup netmgr-int.h - fix some duplicated and out-of-order prototypes declared in netmgr-int.h - rename isc_nm_tcpdns_keepalive to isc__nm_tcpdns_keepalive as it's for internal use	2021-08-27 09:56:51 -07:00
Matthijs Mekking	9acce8a82a	Add a function isc_stats_resize Add a new function to resize the number of counters in a statistics counter structure. This will be needed when we keep track of DNSSEC sign statistics and new keys are introduced due to a rollover.	2021-08-24 09:07:15 +02:00
Mark Andrews	42c22670b3	Add support for parsing <tag>[=<value>] where <value> may be a quoted string. Previously quoted string only supported opening quotes at the start of the string.	2021-08-18 13:49:48 +10:00
Artem Boldariev	f388b71378	Get rid of RW locks in the DoH code This commit gets rid of RW locks in a hot path of the DoH code. In the original design, it was implied that we add new endpoints after the HTTP listener was created. Such a design implies some locking. We do not need such flexibility, though. Instead, we could build a set of endpoints before the HTTP listener gets created. Such a design does not need RW locks at all.	2021-08-04 10:32:25 +03:00
Ondřej Surý	22db2705cd	Use static storage for isc_mem water_t On the isc_mem water change the old water_t structure could be used after free. Instead of introducing reference counting on the hot-path we are going to introduce additional constraints on the isc_mem_setwater. Once it's set for the first time, the additional calls have to be made with the same water and water_arg arguments.	2021-07-22 11:51:46 +02:00
Artem Boldariev	590e8e0b86	Make max number of HTTP/2 streams configurable This commit makes the number of concurrent HTTP/2 streams per connection configurable as a means to fight DDoS attacks. As soon as the limit is reached, BIND terminates the whole session. The commit adds a global configuration option (http-streams-per-connection) which can be overridden in an http <name> {...} statement as follows: http local-http-server { ... streams-per-connection 100; ... }; For now the default value is 100, which should be enough (e.g. NGINX uses 128, but it is a full-featured web server). When using lower numbers (e.g. ~70), it is possible to hit the limit with e.g. flamethrower.	2021-07-16 11:50:22 +03:00
Artem Boldariev	03a557a9bb	Add (http-)listener-clients option (DoH quota mechanism) This commit adds support for the http-listener-clients global option as well as the ability to override the default in an HTTP server description, like: http local-http-server { ... listener-clients 100; ... }; This way we can specify a per-listener active connections quota globally and then override it when required. This is exactly what AT&T requested from us: they wanted the ability to specify a quota globally and then override it for specific IPs. This change makes such a configuration possible. It makes sense: for example, one could have different quotas for internal and external clients. Or, for example, one could use BIND's internal ability to serve encrypted DoH with some sane quota value for internal clients, while having an unencrypted DoH listener without a quota to put BIND behind a load balancer doing TLS offloading for external clients. Moreover, the code no longer shares the quota with TCP (see the tcp-clients option); sharing made little sense because of the way DoH clients behave: they tend to keep idle connections open for longer periods of time, preventing TCP and TLS clients from being served. Thus the need for a separate, generally larger, quota for them. Also, the change makes any option within the "http <name> { ... };" statement optional, making it easier to override only the required default options. By default, DoH connections are limited to 300 per listener. I hope that it is a good initial guesstimate.	2021-07-16 11:50:20 +03:00
Artem Boldariev	954240467d	Verify HTTP paths both in incoming requests and in config file This commit adds the code (and some tests) which allows verifying validity of HTTP paths both in incoming HTTP requests and in BIND's configuration file.	2021-07-16 10:28:08 +03:00
Ondřej Surý	ce03015d48	Remove nonnull attribute from isc_mem_{get,allocate,reallocate} The isc_mem_get(), isc_mem_allocate() and isc_mem_reallocate() functions can return a NULL pointer when the allocation size is zero. Remove the nonnull attribute from the functions' declarations. This stems from the following definition in the C11 standard: > If the size of the space requested is zero, the behavior is > implementation-defined: either a null pointer is returned, or the > behavior is as if the size were some nonzero value, except that the > returned pointer shall not be used to access an object. In this case, we return NULL, as it makes it easier to detect errors when accessing a pointer from a zero-sized allocation, which should obviously never happen.	2021-07-12 10:02:18 +02:00
Artem Boldariev	3673abc53c	Use restrict and const in isc_mempool_t This commit adds restrict and const modifiers to some variables to help the compiler optimize.	2021-07-09 15:58:02 +02:00
Ondřej Surý	efb385ecdc	Clean up isc_mempool API - isc_mempool_get() can no longer fail; when there are no more objects in the pool, more are always allocated. checking for NULL return is no longer necessary. - the isc_mempool_setmaxalloc() and isc_mempool_getmaxalloc() functions are no longer used and have been removed.	2021-07-09 15:58:02 +02:00
Ondřej Surý	f487c6948b	Replace locked mempools with memory contexts Current mempools are kind of hybrid structures - they serve two purposes: 1. mempool with a lock is basically static sized allocator with pre-allocated free items 2. mempool without a lock is a doubly-linked list of preallocated items The first kind of usage could be easily replaced with jemalloc small sized arena objects and thread-local caches. The second usage not-so-much and we need to keep this (in libdns:message.c) for performance reasons.	2021-07-09 15:58:02 +02:00
Ondřej Surý	fd3ceec475	Add debug tracing capability to isc_mempool_create/destroy Previously, we only had capability to trace the mempool gets and puts, but for debugging, it's sometimes also important to keep track how many and where do the memory pools get created and destroyed. This commit adds such tracking capability.	2021-07-09 15:58:02 +02:00
Ondřej Surý	4b3d0c6600	Remove ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGCTX The ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGCTX flags enabled sanity checks on matching size and memory context for memory returned to the allocator. Those will no longer be needed when most of the allocator is replaced with jemalloc.	2021-07-09 15:58:02 +02:00
Ondřej Surý	63924968d1	Add debug tracing capability to isc_mem_create/isc_mem_destroy Previously, we only had capability to trace the memory gets and puts, but for debugging, it's sometimes also important to keep track how many and where do the memory contexts get created and destroyed. This commit adds such tracking capability.	2021-07-09 15:58:02 +02:00
Artem Boldariev	c6d0e3d3a7	Return HTTP status code for small/malformed requests This commit makes BIND return HTTP status codes for malformed or too small requests. DNS request processing code would ignore such requests. Such an approach works well for other DNS transport but does not make much sense for HTTP, not allowing it to complete the request/response sequence. Suppose execution has reached the point where DNS message handling code has been called. In that case, it means that the HTTP request has been successfully processed, and, thus, we are expected to respond to it either with a message containing some DNS payload or at least to return an error status code. This commit ensures that BIND behaves this way.	2021-07-09 16:37:08 +03:00
Ondřej Surý	2bb454182b	Make the DNS over HTTPS support optional This commit adds two new autoconf options `--enable-doh` (enabled by default) and `--with-libnghttp2` (mandatory when DoH is enabled). When DoH support is disabled the library is not linked-in and support for http(s) protocol is disabled in the netmgr, named and dig.	2021-07-07 09:50:53 +02:00
Ondřej Surý	29c2e52484	The isc/platform.h header has been completely removed The isc/platform.h header was left empty, as its contents had already been moved either to config.h or to the appropriate headers. This is just the final cleanup commit.	2021-07-06 05:33:48 +00:00
Ondřej Surý	bf4a0e26dc	Move NAME_MAX and PATH_MAX from isc/platform.h to isc/dir.h The last remaining defines needed for platforms without NAME_MAX and PATH_MAX (I'm looking at you, GNU Hurd) were moved to isc/dir.h where it's prevalently used.	2021-07-06 05:33:48 +00:00
Ondřej Surý	4da0c49e80	Move ISC_STRERRORSIZE to isc/strerr.h header The ISC_STRERRORSIZE was defined in isc/platform.h header as the value was different between Windows and POSIX platforms. Now that Windows is gone, move the define to where it belongs.	2021-07-06 05:33:48 +00:00
Ondřej Surý	d881e30b0a	Remove LIB<>_EXTERNAL_DATA defines After Windows has been removed, the LIB<>_EXTERNAL_DATA defines were just dummy leftovers. Remove them.	2021-07-06 05:33:48 +00:00
Ondřej Surý	ec86759401	Replace netmgr per-protocol sequential function with a common one Previously, each protocol (TCPDNS, TLSDNS) specified its own function to disable pipelining on the connection. An oversight would lead to an assertion failure when the opcode is not query over a non-TCPDNS protocol, because the isc_nm_tcpdns_sequential() function would be called on a non-TCPDNS socket. This commit removes the per-protocol functions and refactors the code to use a common isc_nm_sequential() function that either disables pipelining on the socket or handles the request in a protocol-specific manner. Currently it ignores the call for HTTP sockets and causes an assertion failure for protocols where it doesn't make sense to call the function at all.	2021-06-22 17:21:44 +03:00
Ondřej Surý	54c389dbc0	Drop support for clang atomic and gcc __sync builtins BIND 9.17+ now requires C11 support from the compiler, so we can safely drop most of the stdatomic.h shims from lib/isc/unix/include/stdatomic.h. This commit removes support for clang atomic builtins (clang >= 3.6.0 includes the stdatomic.h header) and for GCC __sync builtins. The only compatibility shim that remains is support for __atomic builtins for GCC >= 4.7.0, since CentOS 7 still includes only GCC 4.8.1 and the proper stdatomic.h header was only introduced in GCC >= 4.9.	2021-06-17 09:51:04 +02:00
Ondřej Surý	4677bb28d1	Remove atomics emulated by a mutex-locked variable Mutex atomics were intended to be used as a debugging tool only and it has already served its purpose and it's not needed anymore.	2021-06-17 09:51:04 +02:00
Artem Boldariev	b84fa122ce	Make BIND refuse to serve XFRs over DoH We cannot use DoH for zone transfers. According to RFC8484 a DoH request contains exactly one DNS message (see Section 6: Definition of the "application/dns-message" Media Type, https://datatracker.ietf.org/doc/html/rfc8484#section-6). This makes DoH unsuitable for zone transfers as often (and usually!) these need more than one DNS message, especially for larger zones. As zone transfers over DoH are not (yet) standardised, nor discussed in RFC8484, the best thing we can do is to return "not implemented." Technically DoH can be used to transfer small zones which fit in one message, but that is not enough for the generic case. Also, this commit makes the server-side DoH code ensure that no multiple responses could be attempted to be sent over one HTTP/2 stream. In HTTP/2 one stream is mapped to one request/response transaction. Now the write callback will be called with failure error code in such a case.	2021-06-14 11:37:36 +03:00
Ondřej Surý	b3de93e54c	Update the source code formatting using clang-format-12 clang-format now tries to keep the type-cast on the same line as the variable. Update the formatting.	2021-06-13 08:46:28 +02:00
Ondřej Surý	440fb3d225	Completely remove BIND 9 Windows support The Windows support has been completely removed from the source tree and BIND 9 now no longer supports native compilation on Windows. We might consider reviewing mingw-w64 port if contributed by external party, but no development efforts will be put into making BIND 9 compile and run on Windows again.	2021-06-09 14:35:14 +02:00
Ondřej Surý	7670f98377	Add isc_task_getnetmgr() function Add a function to pull the attached netmgr from inside the executed task. This is needed for any task that needs to call the netmgr API.	2021-05-31 14:52:05 +02:00
Ondřej Surý	87fe97ed91	Add asynchronous work API to the network manager libuv has support for running long-running tasks in dedicated threadpools, so they don't affect networking IO. This commit adds an isc_nm_work_enqueue() wrapper that wraps the libuv API and runs it on top of the associated worker loop. The only limitation is that the function must be called from inside a network manager thread, so the call to the function should be wrapped inside a (bound) task.	2021-05-31 14:52:05 +02:00
Mark Andrews	d68b009cfe	Remove priority from attribute constructor/destructor On some platforms, the __attribute__ constructor and destructor won't take priorities and the compilation fails. One such platform is macOS. For this reason, the constructor/destructor in libisc was reworked to not use priorities, but to have a single constructor and destructor that call the appropriate routines in the correct order. This commit removes the extra priority because it's no longer needed and it also breaks compilation on macOS with GCC 10.	2021-05-27 08:02:21 +02:00
Ondřej Surý	50270de8a0	Refactor the interface handling in the netmgr The isc_nmiface_t type was holding just a single isc_sockaddr_t, so we got rid of the datatype and use plain isc_sockaddr_t in place where isc_nmiface_t was used before. This means less type-casting and shorter path to access isc_sockaddr_t members. At the same time, instead of keeping the reference to the isc_sockaddr_t that was passed to us when we start listening, we will keep a local copy. This prevents the data race on destruction of the ns_interface_t objects where pending nmsockets could reference the sockaddr of already destroyed ns_interface_t object.	2021-05-26 09:43:12 +02:00
Ondřej Surý	28b65d8256	Reduce the number of clientmgr objects created Previously, as a way of reducing the contention between threads, a clientmgr object would be created for each interface/IP address. With tasks being more strictly bound to netmgr workers, this is no longer needed and we can just create one clientmgr object per worker queue (ncpus). Each clientmgr object then has a single task and a single memory context.	2021-05-24 20:44:54 +02:00
Ondřej Surý	0be7ea78be	Reduce the number of client tasks and bind them to netmgr queues Since a client object is bound to a netmgr handle, each client will always be processed by the same netmgr worker, so we can simplify the code by binding client->task to the same thread as the client. Since ns__client_request() now runs in the same event loop as client->task events, it is no longer necessary to pause the task manager before launching them. Also removed some functions in isc_task that were not used.	2021-05-24 20:02:20 +02:00
Ondřej Surý	9e3cb396b2	Replace netmgr quantum with loop-preventing barrier Instead of using fixed quantum, this commit adds atomic counter for number of items on each queue and uses the number of netievents scheduled to run as the limit of maximum number of netievents for a single process_queue() run. This prevents the endless loops when the netievent would schedule more netievents onto the same loop, but we don't have to pick "magic" number for the quantum.	2021-05-17 11:59:19 +02:00
Ondřej Surý	4509089419	Add configuration option to set send/recv buffers on the nm sockets This commit adds a new configuration option to set the receive and send buffer sizes on the TCP and UDP netmgr sockets. The default is `0`, which doesn't set any value and just uses the value set by the operating system. There's no magic value here: set it too small and performance will drop; set it too large and the buffers can fill up with queries that have already timed out on the client side, so nobody is interested in the answer, which just clogs up the server even more by making it do useless work. `netstat -su` can be used on POSIX systems to monitor the receive and send buffer errors.	2021-05-17 08:47:09 +02:00
Ondřej Surý	6c57a6cc3d	Add isc_taskmgr_detach when task is created while shutting down When the taskmgr is shutting down, creating a task would attach to the taskmgr but not detach on the error condition.	2021-05-10 11:39:51 +02:00
Ondřej Surý	4c8f6ebeb1	Use barriers for netmgr synchronization The netmgr listening, stoplistening, pausing and resuming functions now use barriers for synchronization, which makes the code much simpler. isc/barrier.h defines isc_barrier macros as a front-end for uv_barrier on platforms where that works, and pthread_barrier where it doesn't (including TSAN builds).	2021-05-07 14:28:32 -07:00
Evan Hunt	5c08f97791	only run tasks as privileged if taskmgr is in privileged mode all zone loading tasks have the privileged flag, but we only want them to run as privileged tasks when the server is being initialized; if we privilege them the rest of the time, the server may hang for a long time after a reload/reconfig. so now we call isc_taskmgr_setmode() to turn privileged execution mode on or off in the task manager. isc_task_privileged() returns true if the task's privilege flag is set and the taskmgr is in privileged execution mode. this is used to determine in which netmgr event queue the task should be run.	2021-05-07 14:28:30 -07:00
Ondřej Surý	dacf586e18	Make the netmgr queue processing quantized There was a theoretical possibility of clogging up the queue processing with an endless loop where the netievent currently being processed would schedule a new netievent that would get processed immediately. This wasn't such a problem when only netmgr netievents were processed, but with the addition of the tasks, there are at least two situations where this could happen: 1. In lib/dns/zone.c:setnsec3param() the task would get re-enqueued when the zone was not yet fully loaded. 2. Tasks have an internal quantum for the maximum number of isc_events to be processed; when the task quantum is reached, the task would get rescheduled and then immediately processed by the netmgr queue processing. As the isc_queue doesn't have a mechanism to atomically move the queue, this commit adds a mechanism to quantize the queue, so enqueueing new netievents will never stop processing other uv_loop_t events. The default quantum size is 128. Since the queue used in the network manager allows items to be enqueued more than once, tasks are now reference-counted around task_ready() and task_run(). task_ready() now has a public API wrapper, isc_task_ready(), that the netmgr can use to reschedule processing of a task if the quantum has been reached. Incidental changes: Cleaned up some unused fields left in isc_task_t and isc_taskmgr_t after the last refactoring, and changed atomic flags to atomic_bools for easier manipulation.	2021-05-07 14:28:30 -07:00
Ondřej Surý	b5bf58b419	Destroy netmgr before destroying taskmgr With taskmgr running on top of netmgr, the ordering of how the tasks and netmgr shutdown interacts was wrong as previously isc_taskmgr_destroy() was waiting until all tasks were properly shutdown and detached. This responsibility was moved to netmgr, so we now need to do the following: 1. shutdown all the tasks - this schedules all shutdown events onto the netmgr queue 2. shutdown the netmgr - this also makes sure all the tasks and events are properly executed 3. Shutdown the taskmgr - this now waits for all the tasks to finish running before returning 4. Shutdown the netmgr - this call waits for all the netmgr netievents to finish before returning This solves the race when the taskmgr object would be destroyed before all the tasks were finished running in the netmgr loops.	2021-05-07 14:28:30 -07:00
Ondřej Surý	a011d42211	Add new isc_managers API to simplify <*>mgr create/destroy Previously, netmgr, taskmgr, timermgr and socketmgr all had their own isc_<*>mgr_create() and isc_<*>mgr_destroy() functions. The new isc_managers_create() and isc_managers_destroy() fold all four into a single function and make sure the objects are created and destroyed in the correct order. Especially now, when taskmgr runs on top of netmgr, the correct order is important, and when the code was duplicated in many places it was easy to make a mistake. The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were made private, and a single call to isc_managers_create() and isc_managers_destroy() is required at program startup / shutdown.	2021-05-07 10:19:05 -07:00
Ondřej Surý	dfd56b84f5	Add support for generating backtraces on Windows This commit adds support for generating backtraces on Windows and refactors the isc_backtrace API to match the Linux/BSD API (without the isc_ prefix) * isc_backtrace_gettrace() was renamed to isc_backtrace(), the third argument was removed and the return type was changed to int * isc_backtrace_symbols() was added * isc_backtrace_symbols_fd() was added and used as appropriate	2021-05-03 20:31:52 +02:00
Diego Fronza	54aa60eef8	Add malloc attribute to memory allocation functions The malloc attribute allows the compiler to do some optimizations on functions that behave like malloc/calloc, such as assuming that the returned pointer does not alias other pointers.	2021-04-26 11:32:17 -03:00
Ondřej Surý	b540722bc3	Refactor taskmgr to run on top of netmgr This commit changes the taskmgr to run the individual tasks on the netmgr internal workers. While an effort has been put into keeping the taskmgr interface intact, a couple of changes have been made: * The taskmgr has no concept of universal privileged mode - rather the tasks are either privileged or unprivileged (normal). The privileged tasks are run as a first thing when the netmgr is unpaused. There are now four different queues in the netmgr: 1. priority queue - netievents on the priority queue are run even when the taskmgr enters exclusive mode and the netmgr is paused. This is needed to properly start listening on the interfaces, free resources and resume. 2. privileged task queue - only privileged tasks are queued here and this is the first queue that gets processed when the network manager is unpaused using isc_nm_resume(). All netmgr workers need to clean the privileged task queue before they all proceed to normal operation. Both task queues are processed when the workers are finished. 3. task queue - only (traditional) tasks are scheduled here and this queue along with the privileged task queue is processed when the netmgr workers are finishing. This is needed to process the task shutdown events. 4. normal queue - this is the queue with netmgr events, e.g. reading, sending, callbacks and pretty much everything is processed here. * The isc_taskmgr_create() now requires an initialized netmgr (isc_nm_t) object. * The isc_nm_destroy() function now waits for an indefinite time, but it will print out the active objects when in tracing mode (-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1); the netmgr has been made a little bit more asynchronous and it might take a longer time to shut down all the active networking connections. * Previously, the isc_nm_stoplistening() was a synchronous operation. This has been changed and the isc_nm_stoplistening() just schedules the child sockets to stop listening and exits. 
This was needed to prevent a deadlock as the (traditional) tasks are now executed on the netmgr threads. * The socket selection logic in isc__nm_udp_send() was flawed, but fortunately, it was broken, so we never hit the problem where we created uvreq_t on a socket from nmhandle_t, but then a different socket could be picked up and then we were trying to run the send callback on a socket that had a different threadid than the one currently running.	2021-04-20 23:22:28 +02:00
Ondřej Surý	16fe0d1f41	Cleanup the public vs private ISCAPI remnants Since all the libraries are internal now, just cleanup the ISCAPI remnants in isc_socket, isc_task and isc_timer APIs. This means, there's one less layer as following changes have been done: * struct isc_socket and struct isc_socketmgr have been removed * struct isc__socket and struct isc__socketmgr have been renamed to struct isc_socket and struct isc_socketmgr * struct isc_task and struct isc_taskmgr have been removed * struct isc__task and struct isc__taskmgr have been renamed to struct isc_task and struct isc_taskmgr * struct isc_timer and struct isc_timermgr have been removed * struct isc__timer and struct isc__timermgr have been renamed to struct isc_timer and struct isc_timermgr * All the associated code that dealt with typing isc_<foo> to isc__<foo> and back has been removed.	2021-04-19 13:18:24 +02:00
Ondřej Surý	3388ef36b3	Cleanup the isc_<*>mgr_createinc() constructors Previously, the taskmgr, timermgr and socketmgr had a constructor variant that would create the mgr on top of an existing appctx. This was no longer true and isc_<*>mgr was just calling isc_<*>mgr_create() directly without any extra code. This commit just cleans up the extra function.	2021-04-19 10:22:56 +02:00
Ondřej Surý	86f4872dd6	isc_nm_connect() always return via callback The isc_nm_connect() functions were refactored to always return the connection status via the connect callback instead of sometimes returning the hard failure directly (for example, when the socket could not be created, or when the network manager was shutting down). This commit changes the connect functions in all the network manager modules, and also makes the necessary refactoring changes in places where the connect functions are called.	2021-04-07 15:36:59 +02:00
Petr Mensik	81eb3396bf	Do not require config.h to use isc/util.h util.h requires ISC_CONSTRUCTOR definition, which depends on config.h inclusion. It does not include it from isc/util.h (or any other header). Using isc/util.h fails hard when isc/util.h is used without including bind's config.h. Move the check to c file, where ISC_CONSTRUCTOR is used. Ensure config.h is included there.	2021-03-26 11:41:22 +01:00
Patrick McLean	ebced74b19	Add isc_time_now_hires function to get current time with high resolution The current isc_time_now uses CLOCK_REALTIME_COARSE which only updates on a timer tick. This clock is generally fine for millisecond accuracy, but on servers with 100hz clocks, this clock is nowhere near accurate enough for microsecond accuracy. This commit adds a new isc_time_now_hires function that uses CLOCK_REALTIME, which gives the current time, though it is somewhat expensive to call. When microsecond accuracy is required, it may be required to use extra resources for higher accuracy.	2021-03-20 11:25:55 -07:00
Ondřej Surý	36ddefacb4	Change the isc_nm_(get\|set)timeouts() to work with milliseconds RFC7828 specifies the keepalive interval to be 16-bit, in units of 100 milliseconds, and the tcp-*-timeouts configuration options follow suit. Units of 100 milliseconds are very unintuitive, and while we can't change the configuration and presentation format, we should not follow this weird unit in the API. This commit changes the isc_nm_(get\|set)timeouts() functions to work with milliseconds and converts the values to milliseconds before passing them to the function, not just internally.	2021-03-18 16:37:57 +01:00
Ondřej Surý	caa5b6548a	Fix TCPDNS and TLSDNS timers After the TCPDNS refactoring the initial and idle timers were broken and only the tcp-initial-timeout was always applied on the whole TCP connection. This broke any TCP connection that took longer than tcp-initial-timeout, most often this would affect large zone AXFRs. This commit changes the timeout logic in this way: * On TCP connection accept the tcp-initial-timeout is applied and the timer is started * When we are processing and/or sending any DNS message the timer is stopped * When we stop processing all DNS messages, the tcp-idle-timeout is applied and the timer is started again	2021-03-18 16:37:57 +01:00
Evan Hunt	88752b1121	refactor outgoing HTTP connection support - style, cleanup, and removal of unnecessary code. - combined isc_nm_http_add_endpoint() and isc_nm_http_add_doh_endpoint() into one function, renamed isc_http_endpoint(). - moved isc_nm_http_connect_send_request() into doh_test.c as a helper function; remove it from the public API. - renamed isc_http2 and isc_nm_http2 types and functions to just isc_http and isc_nm_http, for consistency with other existing names. - shortened a number of long names. - the caller is now responsible for determining the peer address in isc_nm_httpconnect(); this eliminates the need to parse the URI and the dependency on an external resolver. - the caller is also now responsible for creating the SSL client context, for consistency with isc_nm_tlsdnsconnect(). - added setter functions for HTTP/2 ALPN. instead of setting up ALPN in isc_tlsctx_createclient(), we now have a function isc_tlsctx_enable_http2client_alpn() that can be run from isc_nm_httpconnect(). - refactored isc_nm_httprequest() into separate read and send functions. when isc_nm_send() or isc_nm_read() is called on an http socket, it will be stored until a corresponding isc_nm_read() or _send() arrives; when we have both halves of the pair the HTTP request will be initiated. - isc_nm_httprequest() is renamed isc__nm_http_request() for use as an internal helper function by the DoH unit test. (eventually doh_test should be rewritten to use read and send, and this function should be removed.) - added implementations of isc__nm_tls_settimeout() and isc__nm_http_settimeout(). - increased NGHTTP2 header block length for client connections to 128K. - use isc_mem_t for internal memory allocations inside nghttp2, to help track memory leaks. - send "Cache-Control" header in requests and responses. (note: currently we try to bypass HTTP caching proxies, but ideally we should interact with them: https://tools.ietf.org/html/rfc8484#section-5.1)	2021-03-05 13:29:26 +02:00
Ondřej Surý	494d0da522	Use library constructor/destructor to initialize OpenSSL Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use gcc/clang attributes on POSIX and DLLMain on Windows to initialize and shutdown OpenSSL library. This resolves the issue when isc_nm_create() / isc_nm_destroy() was called multiple times and it would call OpenSSL library destructors from isc_nm_destroy(). At the same time, since we now have introduced the ctor/dtor for libisc, this commit moves the isc_mem API initialization (the list of the contexts) and changes the isc_mem_checkdestroyed() to schedule the checking of memory context on library unload instead of executing the code immediately.	2021-02-18 19:33:54 +01:00
Ondřej Surý	f34f943b16	Disable memory debugging features in non-developer build The two memory debugging features: ISC_MEM_DEFAULTFILL (ISC_MEMFLAG_FILL) and ISC_MEM_TRACKLINES were always enabled in all builds and the former was only disabled in `named`. This commit disables those two features in non-developer builds to make the memory allocator significantly faster.	2021-02-18 19:33:54 +01:00
Ondřej Surý	c9fe12443f	Make the mempool names unconditional The named memory pools were default and always compiled-in. Remove the extra complexity by removing the #define and #ifdefs around the code.	2021-02-18 19:33:54 +01:00
Ondřej Surý	b09106e93a	Make the memory and mempool counters to be stdatomic types This is yet another step into unlocking some parts of the memory contexts. All the regularly updated variables have been turned into atomic types, so we can later remove the locks when updating various counters. Also unlock as much code as possible without breaking anything.	2021-02-18 19:33:51 +01:00
Ondřej Surý	0f44139145	Bump the maximum number of hazard pointers in tests On a 24-core machine, the tests would crash because we would run out of hazard pointers. We now adjust the number of hazard pointers to be in the <128,256> interval based on the number of available cores. Note: This is just a band-aid and needs a proper fix.	2021-02-18 19:32:55 +01:00
Ondřej Surý	7de846977b	Remove the extra level of indirection via isc_memmethods_t Previously, the applications using libisc would be able to override the internal memory methods with their own implementation. This was no longer possible, but the extra level of indirection was not removed. This commit removes the extra level of indirection for the memory methods and the default_memalloc() and default_memfree().	2021-02-18 19:32:55 +01:00

1843 commits