bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-06-04 23:22:03 -04:00

Author	SHA1	Message	Date
Artem Boldariev	84d71c8e2c	TLS DNS: take into account partial writes by SSL_write_ex() This commit changes TLS DNS so that partial writes by the SSL_write_ex() function are taken into account properly. Now, before doing encryption, we are flushing the buffers for outgoing encrypted data. The problem is fairly complicated and originates from the fact that it is somewhat hard to understand by reading the documentation if and when partial writes are supported/enabled or not, and one can get a false impression that they are not supported or enabled by default (https://www.openssl.org/docs/man3.1/man3/SSL_write_ex.html). I have added a lengthy comment about that into the code because it will be more useful there. The documentation on this topic is vague and hard to follow. The main point is that when SSL_write_ex() fails with SSL_ERROR_WANT_WRITE, the OpenSSL code tells us that we need to flush the outgoing buffers and then call SSL_write_ex() again with exactly the same arguments in order to continue as partial write could have happened on the previous call to SSL_write_ex() (that is not hard to verify by calling BIO_pending(sock->tls.app_rbio) before and after the call to SSL_write_ex() and comparing the returned values). This aspect was not taken into account in the code. Now, one can wonder how that could have led to the behaviour that we saw in the #4255 bug report. In particular, how could we lose one message and duplicate one twice? That is where things get interesting. One needs to keep two things in mind (that is important): Firstly, the possibility that two (or more) subsequent SSL_write_ex() calls will be done with exactly the same arguments is very high (the code does not guarantee that in any way, but in practice, that happens a lot). Secondly, the dnsperf (the software that helped us to trigger the bug) bombed the test server with messages that contained exactly the same data. 
The only difference in the responses is message IDs, which can be found closer to the start of a message. So, that is what was going on in the older version of the code: 1. During one of the isc_nm_send() calls, the SSL_write_ex() call fails with SSL_ERROR_WANT_WRITE. Partial writing has happened, though, and we wrote a part of the message with the message ID (e.g. 2014). Nevertheless, we have rescheduled the complete send operation asynchronously by a call to tlsdns_send_enqueue(). 2. While the asynchronous request has not been completed, we try to send the message (e.g. with ID 2015). The next isc_nm_send() or re-queued send happens with a call to SSL_write_ex() with EXACTLY the same arguments as in the case of the previous call. That is, we are acting as if we want to complete the previously failed SSL_write_ex() attempt (according to the OpenSSL documentation: https://www.openssl.org/docs/man3.1/man3/SSL_write_ex.html, the "Warnings" section). This way, we already have a start of the message containing the previous ID (2014 in our case) but complete the write request with the rest of the data given in the current write attempt. However, as responses differ only in message ID, we end up sending a valid (properly structured) DNS message but with the ID of the previous one. This way, we send a message with ID from the previous isc_nm_send() attempt. The message with the ID from the send request from this attempt will never be sent, as the code thinks that it is sending it now (that is how we send the message with ID 2014 instead of 2015, as in our example, thus making the message with ID 2015 never to be sent). 3. At some point later, the asynchronous send request (the rescheduled on the first step) completes without an error, sending a second message with the same ID (2014). It took exhausting SSL write buffers (so that a data encryption attempt cannot be completed in one operation) via long DoT streams in order to exhibit the behaviour described above. 
The exhaustion happened because we have not been trying to flush the buffers often enough (especially in the case of multiple subsequent writes). In my opinion, the origin of the problem can be described as follows: It happened due to making wrong guesses caused by poorly written documentation.	2023-09-05 18:03:44 +02:00
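The retry rule described in the commit above can be illustrated without OpenSSL. The following is a minimal sketch under stated assumptions: `fake_tls_t`, `fake_write()`, `fake_flush()` and `send_message()` are hypothetical stand-ins, not BIND or OpenSSL functions. `fake_write()` imitates a write that may consume only part of the message and then expects a retry with exactly the same arguments; the sender flushes the outgoing buffer before each attempt and never starts a different message in between.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a TLS object with a tiny outgoing buffer. */
enum { OUT_CAP = 8, WIRE_CAP = 256 };

typedef struct {
	char out[OUT_CAP];   /* pending "encrypted" bytes, not yet flushed */
	size_t out_len;
	size_t consumed;     /* bytes of the in-progress write already taken */
	char wire[WIRE_CAP]; /* everything the peer has received so far */
	size_t wire_len;
} fake_tls_t;

typedef enum { W_OK, W_WANT_WRITE } wresult_t;

/* Drain the pending bytes to the wire (the "flush the buffers" step). */
static void
fake_flush(fake_tls_t *t) {
	assert(t->wire_len + t->out_len <= WIRE_CAP);
	memcpy(t->wire + t->wire_len, t->out, t->out_len);
	t->wire_len += t->out_len;
	t->out_len = 0;
}

/*
 * A write that may be partial: it copies what fits, remembers how far it
 * got, and expects the caller to retry with the same (buf, len).
 */
static wresult_t
fake_write(fake_tls_t *t, const char *buf, size_t len) {
	size_t room, left, n;

	assert(t->consumed <= len);
	room = OUT_CAP - t->out_len;
	left = len - t->consumed;
	n = left < room ? left : room;
	memcpy(t->out + t->out_len, buf + t->consumed, n);
	t->out_len += n;
	t->consumed += n;
	if (t->consumed < len) {
		return W_WANT_WRITE; /* partial write happened */
	}
	t->consumed = 0;
	return W_OK;
}

/* Flush first, then retry the SAME message until it is fully written. */
static void
send_message(fake_tls_t *t, const char *msg, size_t len) {
	fake_flush(t);
	while (fake_write(t, msg, len) == W_WANT_WRITE) {
		fake_flush(t);
	}
	fake_flush(t);
}
```

Sending two messages that differ only in their leading ID through `send_message()` delivers both intact and in order. Handing a different buffer to `fake_write()` while `consumed` is non-zero would instead splice the new message's tail onto the old message's head, which is the mix-up the commit message describes.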
Artem Boldariev	f5cb14265f	Add ability to set per jemalloc arena dirty and muzzy decay values This commit adds a couple of functions to change "dirty_decay_ms" and "muzzy_decay_ms" settings on arenas associated with memory contexts. (cherry picked from commit `6e98b58d15`)	2023-09-05 15:02:30 +02:00
Artem Boldariev	16a45837ca	Make it possible to create memory contexts backed by jemalloc arenas This commit extends the internal memory management middleware code in BIND so that memory contexts backed by dedicated jemalloc arenas can be created. A new function (isc_mem_create_arena()) is added for that. Moreover, it extends the existing code so that specialised memory contexts can be created easily, should we need that functionality for other future purposes. We have achieved that by passing the flags to the underlying jemalloc-related calls. See the above isc_mem_create_arena(), which can serve as an example of this. Having this opens up possibilities for creating memory contexts tuned for specific needs. (cherry picked from commit `8550c52588`)	2023-09-05 15:02:30 +02:00
Artem Boldariev	d53ecb7720	Fix building BIND on DragonFly BSD (on both older and newer versions) This commit ensures that BIND and supplementary tools can still be built on newer versions of DragonFly BSD. It used to be the case, but somewhere between versions 6.2 and 6.4 the OS developers rearranged headers and moved some function definitions around. Before that, the fact that it worked was more of a coincidence; this time we at least looked at the related man pages included with the OS. No in-depth testing has been done on this OS as we do not really support this platform, so it is more of a goodwill act. We can, however, use this platform for testing purposes, too. Also, we know that the OS users do use BIND, as it is included in its ports directory. Building with './configure' and './configure --without-jemalloc' has been fixed and is known to work at the time the commit is made. (cherry picked from commit `942569a1bb`)	2023-09-05 10:33:51 +02:00
Mark Andrews	f77ffa7953	Take ownership of pointer before freeing (cherry picked from commit `9e2288208d`)	2023-09-01 14:03:49 +10:00
Mark Andrews	894b0970e6	Clear OpenSSL errors on TLS error paths (cherry picked from commit `4f790b6c58`)	2023-09-01 13:45:34 +10:00
Mark Andrews	900efd613f	Clear OpenSSL errors on SHA failures (cherry picked from commit `247422c69f`)	2023-09-01 13:45:34 +10:00
Mark Andrews	aca6f3e82d	Clear OpenSSL errors on EVP failures (cherry picked from commit `4ea926934a`)	2023-09-01 13:40:32 +10:00
Mark Andrews	b5b13771f2	Clear OpenSSL errors on EVP_PKEY_new failures (cherry picked from commit `6df53cdb87`)	2023-09-01 13:37:02 +10:00
Tony Finch	525afc666a	Parse statschannel Content-Length: more carefully A negative or excessively large Content-Length could cause a crash by making `INSIST(httpd->consume != 0)` fail. (cherry picked from commit `26e10e8fb5`)	2023-08-23 15:44:11 +02:00
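A minimal sketch of the kind of careful `Content-Length` parsing the commit above calls for. This is not the statschannel code: `parse_content_length()` and the `MAX_BODY_LEN` cap are hypothetical names and values chosen for illustration. It rejects signed, non-numeric, overflowing and oversized values before the result is used.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical upper bound on an accepted request body. */
#define MAX_BODY_LEN ((unsigned long long)1024 * 1024)

/*
 * Parse a Content-Length header value. Returns true and stores the
 * length only for a plain decimal number within the allowed range.
 */
static bool
parse_content_length(const char *value, size_t *lenp) {
	char *end = NULL;
	unsigned long long n;

	assert(value != NULL && lenp != NULL);

	/* Skip optional leading whitespace. */
	while (*value == ' ' || *value == '\t') {
		value++;
	}
	/* strtoull() accepts a sign, so insist on a leading digit. */
	if (!isdigit((unsigned char)*value)) {
		return false;
	}
	errno = 0;
	n = strtoull(value, &end, 10);
	if (errno == ERANGE) {
		return false; /* does not fit in unsigned long long */
	}
	/* Only trailing whitespace may follow the number. */
	while (*end == ' ' || *end == '\t') {
		end++;
	}
	if (*end != '\0' || n > MAX_BODY_LEN) {
		return false;
	}
	*lenp = (size_t)n;
	return true;
}
```

The explicit digit check matters because `strtoull()` silently accepts a leading minus sign and wraps the value around, so a header such as `Content-Length: -1` would otherwise turn into a huge positive length.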
Ondřej Surý	701eb26f97	Workaround faulty stdatomic.h header detection on Oracle Linux 7 Oracle Linux 7 sets __STDC_VERSION__ to 201112L, but doesn't define __STDC_NO_ATOMICS__, so we try to include <stdatomic.h> without the header present in the system. Since we are already detecting the header in the autoconf, use the HAVE_STDATOMIC_H for more reliable detecting whether <stdatomic.h> header is present.	2023-08-22 14:23:05 +02:00
Tony Finch	57069556eb	Fix a stack buffer overflow in the statistics channel A long timestamp in an If-Modified-Since header could overflow a fixed-size buffer. (cherry picked from commit `b22c87ca61`)	2023-08-14 13:07:47 +02:00
Ondřej Surý	c2c2ec0c96	Don't process detach and close as priority netmgr events The detach (and possibly close) netmgr events can cause additional callbacks to be called when under exclusive mode. The detach can trigger next queued TCP query to be processed and close will call configured close callback. Move the detach and close netmgr events from the priority queue to the normal queue as the detaching and closing the sockets can wait for the exclusive mode to be over.	2023-07-20 18:37:48 +02:00
Tony Finch	1ddf2b87f5	Improve statschannel HTTP Connection: header protocol conformance In HTTP/1.0 and HTTP/1.1, RFC 9112 section 9.6 says the last response in a connection should include a `Connection: close` header, but the statschannel server omitted it. In an HTTP/1.0 response, the statschannel server can sometimes send a `Connection: keep-alive` header when it is about to close the connection. There are two ways: If the first request on a connection is keep-alive and the second request is not, then _both_ responses have `Connection: keep-alive` but the connection is (correctly) closed after the second response. If a single request contains Connection: close Connection: keep-alive then RFC 9112 section 9.3 says the keep-alive header is ignored, but the statschannel sends a spurious keep-alive in its response, though it correctly closes the connection. To fix these bugs, make it more clear that the `httpd->flags` are part of the per-request-response state. The Connection: flags are now described in terms of the effect they have instead of what causes them to be set. (manually picked from commit `e18ca83a3b`)	2023-07-04 14:53:08 +02:00
Aram Sargsyan	0c751ce72e	Update the event loop's time after executing a task Tasks can block for a long time, especially when used by tools in interactive mode. Update the event loop's time to avoid unexpected errors when processing later events during the same callback. For example, newly started timers can fire too early, because the current time was stale. See the note about uv_update_time() in the https://docs.libuv.org/en/v1.x/timer.html#c.uv_timer_start page.	2023-06-20 10:21:54 +00:00
Ondřej Surý	be0f38553e	Make isc_result tables smaller The isc_result_t enum was too sparse when each library's codes would skip to the next << 16 as a base. Remove the huge holes in the isc_result_t enum to make the isc_result tables more compact. This change required rewriting how we map dns_rcode_t to isc_result_t and back, so that we never return an isc_result_t or dns_rcode_t value outside the defined range. (cherry picked from commit `a8e6c3b8f7`)	2023-06-15 16:27:17 +02:00
Midnight Veil	5172f4c32a	Translate POSIX errorcode EROFS to ISC_R_NOPERM Report "permission denied" instead of "unexpected error" when trying to update a zone file on a read-only file system. (cherry picked from commit `dd6acc1cac`)	2023-06-14 13:48:25 +01:00
Mark Andrews	27eb8ed20f	Move isc_mem_put to after node is checked for equality isc_mem_put NULL's the pointer to the memory being freed. The equality test 'parent->r == node' was accidentally being turned into a test against NULL. (cherry picked from commit `ac2e0bc3ff`)	2023-05-29 13:27:51 +10:00
Artem Boldariev	9ab6c3a5b1	Make sockstop netievent a high-priority one Seemingly by omission, sockstop netievent used by multi-layer sockets was not a high priority event, like it should be (similarly to other socket types). In particular, that could make BIND stuck on reconfiguration after a DoH-listener is removed from the configuration. This commit fixes that.	2023-05-17 13:06:41 +03:00
Artem Boldariev	18d662f4d2	Pass the right worker into isc__nm_async_sockstop() The intention behind 'isc__nmsocket_stop()' was that the function sends notifications to every worker thread, making them synchronise on the barrier, and then the initiating thread waits on it, too. This way we ensure that no other operation will start while we are shutting down the listener. However, it seems that, by mistake, we have been passing the wrong worker pointer into isc__nm_async_sockstop() from within the context of the worker thread that initiated the shutdown. While we have effectively not been using the pointer in this case, it could cause maintenance issues later. This commit fixes that.	2023-05-17 12:56:25 +03:00
Mark Andrews	fa35d059da	Re-write remove_old_tsversions and greatest_version Stop deliberately breaking const rules by copying file->name into dirbuf and truncating it there. Handle files located in the root directory properly. Use unlinkat() from POSIX 200809. (cherry picked from commit `9fcd42c672`)	2023-05-03 10:39:46 +02:00
Matthijs Mekking	33ad117166	Fix purging old log files with absolute file path Removing old timestamp or increment versions of log backup files did not work when the file is an absolute path: only the entry name was provided to the file remove function. The dirname was also bogus, since the file separator was put back too soon. Fix these issues to make log file rotation work when the file is configured to be an absolute path. (cherry picked from commit `70629d73da`)	2023-05-03 10:12:51 +02:00
Evan Hunt	2a714c25f8	add a result code for ENOPROTOOPT, EPROTONOSUPPORT there was no isc_result_t value for invalid protocol errors that could be returned from libuv. (cherry picked from commit `0393b54afb`)	2023-04-21 12:47:07 +02:00
Ondřej Surý	f7bdab0591	Revert "Kill unit tests that run more than 1200 seconds" This reverts commit `6cdeb5b046`, which added a wrapper around all the unit tests that would run each unit test in a forked process. This makes any debugging of the unit tests too hard. Future attempts to fix #3980 should add a custom automake test harness (log driver) that would kill the unit test after a configured timeout.	2023-04-14 06:21:03 +02:00
Mark Andrews	6cdeb5b046	Kill unit tests that run more than 1200 seconds The CI doesn't provide useful forensics when a system test locks up. Fork the process and kill it with ABRT if it is still running after 20 minutes. Pass the exit status to the caller. (cherry picked from commit `3d5c7cd46c`)	2023-04-03 11:11:26 +10:00
Ondřej Surý	718893ece4	Replace isc_fsaccess API with more secure file creation The isc_fsaccess API was created to hide the implementation details between POSIX and Windows APIs. As we no longer support the Windows APIs, it's better to drop this API used in the DST part. Moreover, isc_fsaccess was setting the permissions in an insecure manner: it operated on the filename and not on the file descriptor, which can lead to all kinds of attacks if an unprivileged user has read (or even worse, write) access to the key directory. Replace the code that operates on the private keys with code that uses mkstemp(), fchmod() and atomic rename() at the end, so that at no time do the private key files have insecure permissions. (cherry picked from commit `263d232c79`)	2023-03-31 16:47:15 +02:00
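The mkstemp(), fchmod() and rename() sequence named in the commit above can be sketched as follows. This is an illustration under assumptions, not the DST code: `write_private_file()` and its fixed-size template buffer are hypothetical. The point is that permissions are set through the file descriptor, and the final name only ever refers to a complete file with the intended mode.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write 'len' bytes to 'path' so that the file is
 * never visible under its final name with loose permissions.
 * Returns 0 on success, -1 on failure.
 */
static int
write_private_file(const char *path, const char *data, size_t len) {
	char tmpl[1024];
	int fd, n;

	assert(path != NULL && data != NULL);

	/* Temporary file in the same directory, so rename() is atomic. */
	n = snprintf(tmpl, sizeof(tmpl), "%s.XXXXXX", path);
	if (n < 0 || (size_t)n >= sizeof(tmpl)) {
		return -1;
	}
	fd = mkstemp(tmpl);
	if (fd < 0) {
		return -1;
	}
	/* Set the mode on the descriptor, not on the file name. */
	if (fchmod(fd, S_IRUSR | S_IWUSR) != 0 ||
	    write(fd, data, len) != (ssize_t)len || fsync(fd) != 0)
	{
		(void)close(fd);
		(void)unlink(tmpl);
		return -1;
	}
	if (close(fd) != 0 || rename(tmpl, path) != 0) {
		(void)unlink(tmpl);
		return -1;
	}
	return 0;
}
```

For brevity, a short write is treated as an error here rather than retried.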
Ondřej Surý	dcea09a327	Add isc_os_umask() function to get current umask As it's impossible to get the current umask without modifying it at the same time, initialize the current umask at the program start and keep the loaded value internally. Add isc_os_umask() function to access the starttime umask. (cherry picked from commit `aca7dd3961`)	2023-03-31 16:47:15 +02:00
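Because `umask()` can only report the current mask by replacing it, reading it takes a set-and-restore pair. A minimal sketch of caching the value once at start-up, assuming hypothetical names `os_umask_init()` and `os_umask()` (the real isc_os_umask() implementation may differ):

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Umask captured once at program start (hypothetical storage). */
static mode_t saved_umask;

/*
 * umask() returns the previous mask while installing a new one, so
 * read it by setting a temporary mask and restoring it immediately.
 * Call this early, before other threads create files.
 */
static void
os_umask_init(void) {
	saved_umask = umask(0);
	(void)umask(saved_umask);
}

/* Return the cached start-time umask without touching the process state. */
static mode_t
os_umask(void) {
	return saved_umask;
}
```

The brief window in which the mask is 0 is why the value is captured once during single-threaded start-up rather than queried on demand.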
Artem Boldariev	d1d4f6e362	TLS Stream: backport connect callback handling behaviour from main This commit contains the backport of the behaviour for handling TLS connect callbacks when wrapping up. The current behaviour has not caused us any problems yet, but we are changing it to stay on the safe side.	2023-03-30 18:37:21 +03:00
Artem Boldariev	034b5febb1	DoT: remove TLS-related kludge in isc__nmsocket_connecttimeout_cb() This commit ensures that 'sock->tls.pending_req' is not getting nullified during TLS connection timeout callback as it prevents the connection callback being called when connecting was not successful. We expect 'isc__nm_failed_connect_cb()' to be called from 'isc__nm_tlsdns_shutdown()' when establishing connections was successful, but with 'sock->tls.pending_req' nullified that will not happen. The removed code was most likely required in older iterations of the NM, but to me it seems that now it does only harm. One well-known, pronounced effect is irrecoverable zone transfer hangs over TLS.	2023-03-14 18:49:29 +02:00
Mark Andrews	749c13cf04	Fix memory leak in isc_hmac_init If EVP_DigestSignInit failed 'pkey' was not freed. (cherry picked from commit `cf5f133679`)	2023-02-27 10:27:32 +11:00
Ondřej Surý	6873cc1c79	Run the RPZ update as offloaded work Previously, the RPZ updates ran quantized on the main nm_worker loops. As the quantum was set to 1024, this might lead to service interruptions when large RPZ update was processed. Change the RPZ update process to run as the offloaded work. The update and cleanup loops were refactored to do as little locking of the maintenance lock as possible for the shortest periods of time and the db iterator is being paused for every iteration, so we don't hold the rbtdb tree lock for prolonged periods of time. (cherry picked from commit `f106d0ed2b`)	2023-02-13 11:41:52 +00:00
Ondřej Surý	8d103f7bbc	Enforce version drift limits for libuv libuv support for receiving multiple UDP messages in a single system call (recvmmsg()) has been tweaked several times between libuv versions 1.35.0 and 1.40.0. Mixing and matching libuv versions within that span may lead to assertion failures and is therefore considered harmful, so try to limit potential damage by preventing users from mixing libuv versions with distinct sets of recvmmsg()-related flags. (cherry picked from commit `735d09bffe`)	2023-02-09 22:10:46 +01:00
Ondřej Surý	3368e5f231	Avoid libuv 1.35 and 1.36 that have broken recvmmsg implementation The implementation of UDP recvmmsg in libuv 1.35 and 1.36 is incomplete and could cause assertion failure under certain circumstances. Modify the configure and runtime checks to report a fatal error when trying to compile or run with the affected versions. (cherry picked from commit `251f411fc3`)	2023-02-09 22:10:46 +01:00
Evan Hunt	342286ecdb	remove isc_bind9 variable isc_bind9 was a global bool used to indicate whether the library was being used internally by BIND or by an external caller. external use is no longer supported, but the variable was retained for use by dyndb, which needed it only when being built without libtool. building without libtool is also no longer supported, so the variable can go away. (cherry picked from commit `935879ed11`)	2023-02-09 10:07:39 -08:00
Artem Boldariev	cb9f8c08d5	Fix building BIND on DragonFly BSD (on both older and newer versions) This commit ensures that BIND and supplementary tools can still be built on newer versions of DragonFly BSD. It used to be the case, but somewhere between versions 6.2 and 6.4 the OS developers rearranged headers and moved some function definitions around. Before that, the fact that it worked was more of a coincidence; this time we at least looked at the related man pages included with the OS. No in-depth testing has been done on this OS as we do not really support this platform, so it is more of a goodwill act. We can, however, use this platform for testing purposes, too. Also, we know that the OS users do use BIND, as it is included in its ports directory. Building with './configure' and './configure --without-jemalloc' has been fixed and is known to work at the time the commit is made. (cherry picked from commit `942569a1bb`)	2023-01-20 00:56:16 +02:00
Aram Sargsyan	8f209c7dcf	Refactor isc_nm_xfr_allowed() Return 'isc_result_t' type value instead of 'bool' to indicate the actual failure. Rename the function to something not suggesting a boolean type result. Make changes in the places where the API function is being used to check for the result code instead of a boolean value. (cherry picked from commit `41dc48bfd7`)	2023-01-19 12:20:10 +00:00
Ondřej Surý	bf1a29e9e1	Use OpenSSL 1.x SHA_CTX API in isc_iterated_hash() If the OpenSSL SHA1_{Init,Update,Final} API is still available, use it. The API has been deprecated in OpenSSL 3.0, but it is significantly faster than EVP_MD API, so make an exception here and keep using it until we can't. (cherry picked from commit `25db8d0103`)	2023-01-19 00:33:37 +01:00
Ondřej Surý	a1dcbcab8d	Use OpenSSL EVP_MD API directly in isc_iterated_hash() Instead of going through another layer, use OpenSSL EVP_MD API directly in the isc_iterated_hash() implementation. This shaves off a couple of microseconds in the microbenchmark. (cherry picked from commit `36654df732`)	2023-01-19 00:32:51 +01:00
Mark Andrews	80a052aaf6	Unlink the timer event before trying to purge it as far as I can determine the order of operations is not important. *** CID 351372: Concurrent data access violations (ATOMICITY) /lib/isc/timer.c: 227 in timer_purge() 221 LOCK(&timer->lock); 222 if (!purged) { 223 /* 224 * The event has already been executed, but not 225 * yet destroyed. 226 */ >>> CID 351372: Concurrent data access violations (ATOMICITY) >>> Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect. 227 timerevent_unlink(timer, event); 228 } 229 } 230 } 231 232 void (cherry picked from commit `98718b3b4b`)	2023-01-18 22:39:26 +01:00
Ondřej Surý	e26aa4cbb1	Don't use reference counting in isc_timer unit The reference counting and isc_timer_attach()/isc_timer_detach() semantics are actually misleading because they cannot be used under normal conditions. Usually, the object in which the timer is used is also passed as the argument to the timer itself. This means that when the caller uses `isc_timer_detach()` it needs the timer to stop, and isc_timer_detach() does that only if this would be the last reference. Unfortunately, this also means that if the timer is attached elsewhere and the timer fires, it will most likely be a use-after-free, because the object used in the timer no longer exists. Remove the reference counting from the isc_timer unit, remove the isc_timer_attach() function and rename isc_timer_detach() to isc_timer_destroy() to better reflect how the API needs to be used. The only caveat is that the already executed event must be destroyed before isc_timer_destroy() is called because the timer is no longer attached to .ev_destroy_arg. (cherry picked from commit `ae01ec2823`)	2023-01-18 22:39:26 +01:00
Ondřej Surý	7197cf2b7e	Remove isc_task_purge() and isc_task_purgerange() The isc_task_purge() and isc_task_purgerange() were now unused, so sweep the task.c file. Additionally remove unused ISC_EVENTATTR_NOPURGE event attribute. (cherry picked from commit `c17eee034b`)	2023-01-18 22:06:24 +01:00
Ondřej Surý	68abe3fa06	Add isc_task_setquantum() and use it for post-init zone loading Add isc_task_setquantum() function that modifies quantum for the future isc_task_run() invocations. NOTE: The current isc_task_run() caches the task->quantum into a local variable and therefore the current event loop is not affected by any quantum change. (cherry picked from commit `15ea6f002f`)	2023-01-18 18:04:41 +01:00
Ondřej Surý	5f141e2c7f	Keep the list of scheduled events on the timer Instead of searching for the events to purge, keep the list of scheduled events on the timer list and purge the events that we have scheduled. (cherry picked from commit 3f8024b4a2f12fcd28a9dd813b6f1f3f11d506f2)	2023-01-18 18:04:41 +01:00
Ondřej Surý	be99507488	Repair isc_task_purgeevent(), clean isc_task_unsend{,range}() The isc_task_purgerange() was walking through all events on the task to find a matching task. Instead use the ISC_LINK_LINKED to find whether the event is active. Cleanup the related isc_task_unsend() and isc_task_unsendrange() functions that were not used anywhere. (cherry picked from commit `17aed2f895`)	2023-01-18 18:04:41 +01:00
Ondřej Surý	8c31a939c9	Implement incremental hash table resizing in isc_ht Previously, incremental hash table resizing was implemented for the dns_rbt_t hash table implementation. Using that as a base, implement incremental hash table resizing for isc_ht API hashtables as well: 1. During the resize, allocate the new hash table, but keep the old table unchanged. 2. In each lookup, delete, or iterator operation, check both tables. 3. Perform insertion operations only in the new table. 4. At each insertion also move <r> elements from the old table to the new table. 5. When all elements are removed from the old table, deallocate it. To ensure that the old table is completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (<r> + 1)/<r> during resizing. In our implementation <r> is equal to 1. The downside of this approach is that the old table and the new table could stay in memory for longer when there are no new insertions into the hash table for prolonged periods of time, as the incremental rehashing happens only during the insertions. (cherry picked from commit `e42cb1f198`)	2023-01-11 17:15:33 +01:00
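The five steps listed in the commit above can be sketched with a small chained hash set of integers. This is an illustration only, not the isc_ht code: `ht_t`, `ht_add()`, `ht_has()` and the helpers are hypothetical, the hash is a plain modulo, deletion and cleanup are omitted, and `<r>` is fixed at 1 with the table doubling on each resize.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
	unsigned key;
	struct node *next;
} node_t;

typedef struct {
	node_t **b; /* buckets; NULL when the table does not exist */
	size_t size, count;
} table_t;

typedef struct {
	table_t cur, old; /* 'old' exists only while a resize is running */
	size_t scan;      /* next bucket of 'old' to migrate from */
} ht_t;

static void
table_init(table_t *t, size_t size) {
	t->b = calloc(size, sizeof(node_t *));
	assert(t->b != NULL);
	t->size = size;
	t->count = 0;
}

static void
table_push(table_t *t, node_t *n) {
	size_t i = n->key % t->size;
	n->next = t->b[i];
	t->b[i] = n;
	t->count++;
}

static bool
table_has(const table_t *t, unsigned key) {
	if (t->b == NULL) {
		return false;
	}
	for (node_t *n = t->b[key % t->size]; n != NULL; n = n->next) {
		if (n->key == key) {
			return true;
		}
	}
	return false;
}

/* Steps 4 and 5: move one element over, free the old table when empty. */
static void
migrate_one(ht_t *h) {
	if (h->old.b == NULL) {
		return;
	}
	while (h->old.count > 0 && h->old.b[h->scan] == NULL) {
		h->scan++;
	}
	if (h->old.count > 0) {
		node_t *n = h->old.b[h->scan];
		h->old.b[h->scan] = n->next;
		h->old.count--;
		table_push(&h->cur, n);
	}
	if (h->old.count == 0) {
		free(h->old.b);
		h->old.b = NULL;
	}
}

static void
ht_init(ht_t *h) {
	table_init(&h->cur, 4);
	h->old = (table_t){ .b = NULL };
	h->scan = 0;
}

/* Step 2: lookups consult both tables. */
static bool
ht_has(const ht_t *h, unsigned key) {
	return table_has(&h->cur, key) || table_has(&h->old, key);
}

static void
ht_add(ht_t *h, unsigned key) {
	node_t *n;

	if (ht_has(h, key)) {
		return;
	}
	/* Step 1: start a resize, keeping the old table untouched. */
	if (h->old.b == NULL && h->cur.count >= h->cur.size) {
		h->old = h->cur;
		h->scan = 0;
		table_init(&h->cur, h->old.size * 2);
	}
	n = malloc(sizeof(*n));
	assert(n != NULL);
	n->key = key;
	table_push(&h->cur, n); /* step 3: insert only into the new table */
	migrate_one(h);
}
```

With a doubling factor and one migrated element per insertion, the old table of S elements is emptied after S insertions, exactly when the new table of size 2S fills up, so a second resize never starts while the first is still in progress.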
Ondřej Surý	6906b42cdd	Prefer the pthread_barrier implementation over uv_barrier Prefer the pthread_barrier implementation on platforms where it is available over uv_barrier implementation. This also solves the problem with thread sanitizer builds on macOS that doesn't have pthread barrier. (cherry picked from commit `d07c4a98da`)	2023-01-11 10:21:39 +00:00
Ondřej Surý	d0d9e7dfb2	Don't honour single read per client isc_nm_read() call in the TLSDNS This reverts commit `f17f5e831b` that made following change: > The TLSDNS transport was not honouring the single read callback for > TLSDNS client. It would call the read callbacks repeatedly in case the > single TLS read would result in multiple DNS messages in the decoded > buffer. Turns out that this change broke XoT, so we are reverting the change until we figure out a proper fix that will keep the design promise and not break XoT at the same time.	2023-01-11 10:17:55 +01:00
Mark Andrews	f99593a9ca	Accept 'in=NULL' with 'inlen=0' in isc_{half}siphash24 Arithmetic on NULL pointers is undefined. Avoid arithmetic operations when 'in' is NULL and require 'in' to be non-NULL if 'inlen' is not zero. (cherry picked from commit `349c23dbb7`)	2023-01-10 18:36:08 +11:00
Evan Hunt	5fd93c66aa	remove nonfunctional DSCP implementation DSCP has not been fully working since the network manager was introduced in 9.16, and has been completely broken since 9.18. This seems to have caused very few difficulties for anyone, so we have now marked it as obsolete and removed the implementation. To ensure that old config files don't fail, the code to parse dscp key-value pairs is still present, but a warning is logged that the feature is obsolete and should not be used. Nothing is done with configured values, and there is no longer any range checking. (cherry picked from commit `916ea26ead`)	2023-01-09 14:23:26 -08:00
Artem Boldariev	bccbf28249	tlsctx_client_session_cache_new() -> tlsctx_client_session_create() Additionally to renaming, it changes the function definition so that it accepts a pointer to pointer instead of returning a pointer to the new object. It is mostly done to make it in line with other functions in the module. (cherry picked from commit `7962e7f575`)	2022-12-23 13:58:14 +02:00

4508 commits