Commit graph

4129 commits

Author SHA1 Message Date
Ondřej Surý
f493a83941
Fix case insensitive matching in isc_ht hash table implementation
The case insensitive matching in isc_ht was basically completely broken
as only the hashvalue computation was case insensitive, but the key
comparison was always case sensitive.

(cherry picked from commit 175655b771)
2024-02-11 11:57:58 +01:00
Ondřej Surý
c12608ca93
Split fast and slow task queues
Change the taskmgr (and thus netmgr) in a way that it supports fast and
slow task queues.  The fast queue is used for incoming DNS traffic and
it will pass the processing to the slow queue for sending outgoing DNS
messages and processing resolver messages.

In the future, more tasks might get moved to the slow queues, so the
cached and authoritative DNS traffic can be handled without being slowed
down by operations that take longer time to process.

(cherry picked from commit 1b3b0cef22)
2024-02-01 21:51:07 +01:00
Ondřej Surý
a4baf32415
Backport isc_ht API changes from BIND 9.18
To prevent allocating large hashtable in dns_message, we need to
backport the improvements to isc_ht API from BIND 9.18+ that includes
support for case insensitive keys and incremental rehashing of the
hashtables.
2024-01-05 11:52:05 +01:00
Ondřej Surý
35630c9210
Reformat sources with up-to-date clang-format-17 2023-11-13 17:15:55 +01:00
Mark Andrews
a846180a01 Add parentheses around macro arguement 'msec'
The is needed to ensure that the multiplication is correctly done.
This was reported by Jinmei Tatuya.

(cherry picked from commit ebfbad29c1)
2023-10-20 11:26:04 +11:00
Michal Nowak
531c96b8ed
Update the source code formatting using clang-format-17 2023-10-17 17:56:31 +02:00
Ondřej Surý
b354e10793
Explicitly cast chars to unsigned chars for <ctype.h> functions
Apply the semantic patch to catch all the places where we pass 'char' to
the <ctype.h> family of functions (isalpha() and friends, toupper(),
tolower()).

(cherry picked from commit 29caa6d1f0)
2023-09-22 17:10:25 +02:00
Ondřej Surý
36aba0db8f
Don't process detach and close as priority netmgr events
The detach (and possibly close) netmgr events can cause additional
callbacks to be called when under exclusive mode.  The detach can
trigger next queued TCP query to be processed and close will call
configured close callback.

Move the detach and close netmgr events from the priority queue to the
normal queue as the detaching and closing the sockets can wait for the
exclusive mode to be over.

(cherry picked from commit c2c2ec0c96)
2023-07-20 19:21:44 +02:00
Mark Andrews
300a2fb4ba Check the pointer alignments when deserialising
deserialize_corrupt_test may corrupt the pointers such that they
is no longer properly aligned.  Check that the alignment is consistent
with memory returned from isc_mem before checking the magic value.
2023-05-05 07:04:31 +00:00
Evan Hunt
6e422ae3ae fixed a bug in rolling timestamp logfiles
due to comparing logfile suffixes as 32 bit rather than 64 bit
integers, logfiles with timestamp suffixes that should have been
removed when rolling could be left in place. this has been fixed.

(cherry picked from commit 9a9e906306)
2023-03-28 10:03:33 +00:00
Ondřej Surý
100c20b470
Don't check for maximal version on Windows
The Windows doesn't have support for recvmmsg(), so we don't need to
check for maximal version on Windows (only for a minimal version).

Remove the MAXIMAL_VERSION when compiling on Windows.
2023-02-15 11:02:57 +01:00
Ondřej Surý
b163ca9f97 Enforce version drift limits for libuv
libuv support for receiving multiple UDP messages in a single system
call (recvmmsg()) has been tweaked several times between libuv versions
1.35.0 and 1.40.0.  Mixing and matching libuv versions within that span
may lead to assertion failures and is therefore considered harmful, so
try to limit potential damage be preventing users from mixing libuv
versions with distinct sets of recvmmsg()-related flags.

(cherry picked from commit 735d09bffe)
2023-02-10 06:50:32 +01:00
Ondřej Surý
9309589ad0 Avoid libuv 1.35 and 1.36 that have broken recvmmsg implementation
The implementation of UDP recvmmsg in libuv 1.35 and 1.36 is
incomplete and could cause assertion failure under certain
circumstances.

Modify the configure and runtime checks to report a fatal error when
trying to compile or run with the affected versions.

(cherry picked from commit 251f411fc3)
2023-02-10 06:50:32 +01:00
Mark Andrews
363b40b1da Unlink the timer event before trying to purge it
as far as I can determine the order of operations is not important.

    *** CID 351372:  Concurrent data access violations  (ATOMICITY)
    /lib/isc/timer.c: 227 in timer_purge()
    221     		LOCK(&timer->lock);
    222     		if (!purged) {
    223     			/*
    224     			 * The event has already been executed, but not
    225     			 * yet destroyed.
    226     			 */
    >>>     CID 351372:  Concurrent data access violations  (ATOMICITY)
    >>>     Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect.
    227     			timerevent_unlink(timer, event);
    228     		}
    229     	}
    230     }
    231
    232     void

(cherry picked from commit 98718b3b4b)
2023-01-19 11:28:10 +01:00
Ondřej Surý
e241a3f4db Don't use reference counting in isc_timer unit
The reference counting and isc_timer_attach()/isc_timer_detach()
semantic are actually misleading because it cannot be used under normal
conditions.  The usual conditions under which is timer used uses the
object where timer is used as argument to the "timer" itself.  This
means that when the caller is using `isc_timer_detach()` it needs the
timer to stop and the isc_timer_detach() does that only if this would be
the last reference.  Unfortunately, this also means that if the timer is
attached elsewhere and the timer is fired it will most likely be
use-after-free, because the object used in the timer no longer exists.

Remove the reference counting from the isc_timer unit, remove
isc_timer_attach() function and rename isc_timer_detach() to
isc_timer_destroy() to better reflect how the API needs to be used.

The only caveat is that the already executed event must be destroyed
before the isc_timer_destroy() is called because the timer is no longet
attached to .ev_destroy_arg.

(cherry picked from commit ae01ec2823)
2023-01-19 11:28:10 +01:00
Ondřej Surý
7fef8e77d6 Add isc_task_setquantum() and use it for post-init zone loading
Add isc_task_setquantum() function that modifies quantum for the future
isc_task_run() invocations.

NOTE: The current isc_task_run() caches the task->quantum into a local
variable and therefore the current event loop is not affected by any
quantum change.

(cherry picked from commit 15ea6f002f)
2023-01-19 11:28:10 +01:00
Ondřej Surý
617186d514 Keep the list of scheduled events on the timer
Instead of searching for the events to purge, keep the list of scheduled
events on the timer list and purge the events that we have scheduled.

(cherry picked from commit 3f8024b4a2f12fcd28a9dd813b6f1f3f11d506f2)
2023-01-19 11:28:10 +01:00
Ondřej Surý
76859344fe Repair isc_task_purgeevent()
The isc_task_purgerange() was walking through all events on the task to
find a matching task.  Instead use the ISC_LINK_LINKED to find whether
the event is active.

(cherry picked from commit 17aed2f895)
2023-01-19 11:28:10 +01:00
Ondřej Surý
49af3a23b9 Use OpenSSL 1.x SHA_CTX API in isc_iterated_hash()
Instead of going through another layer, use OpenSSL SHA1 API directly
in the isc_iterated_hash() implementation.

(cherry picked from commit 25db8d0103)
2023-01-18 23:26:40 +01:00
Mark Andrews
f1c08fe93b Accept 'in=NULL' with 'inlen=0' in isc_{half}siphash24
Arthimetic on NULL pointers is undefined.  Avoid arithmetic operations
when 'in' is NULL and require 'in' to be non-NULL if 'inlen' is not zero.

(cherry picked from commit 349c23dbb7)
2023-01-10 18:36:27 +11:00
Michal Nowak
771fed4a14
Update sources to Clang 15 formatting 2022-11-29 10:30:34 +01:00
Ondřej Surý
7b4cf67261
Replace (void *)-1 with ISC_LINK_TOMBSTONE
Instead of having "arbitrary" (void *)-1 to define non-linked, add a
ISC_LINK_TOMBSTONE(type) macro that replaces the "magic" value with a
define.

(cherry picked from commit 5e20c2ccfb)
2022-10-18 14:30:49 +02:00
Ondřej Surý
a317b2ea1c
Add ISC_{LIST,LINK}_INITIALIZER for designated initializers
Since we are using designated initializers, we were missing initializers
for ISC_LIST and ISC_LINK, add them, so you can do

    *foo = (foo_t){ .list = ISC_LIST_INITIALIZER };

Instead of:

    *foo = (foo_t){ 0 };
    ISC_LIST_INIT(foo->list);

(cherry picked from commit cb3c36b8bf)
2022-10-18 14:30:49 +02:00
Evan Hunt
7b6140e756 compression buffer was not cleared properly
clear the compression buffer before use. this eliminates the
possibility of a latent bug that, when combined with other changes,
allowed an overread in a later version of BIND.
2022-10-04 10:12:24 -07:00
Mark Andrews
9bb5157adf Use strnstr implementation from FreeBSD if not provided by OS
(cherry picked from commit 5f07fe8cbb)
2022-10-04 17:05:18 +11:00
Evan Hunt
2fdaa100c1 add assertions to isc_buffer macros
if ISC_BUFFER_USEINLINE is defined, then macros are used to implement
isc_buffer primitives (isc_buffer_init(), isc_buffer_region(), etc).
otherwise, functions are used. previously, only the functions had
DbC assertions, which made it possible for coding errors to go
undetected. this commit makes the macro versions enforce the same
requirements.
2022-09-26 23:48:21 -07:00
Aram Sargsyan
32779aba8a Add mctx attach/detach when creating/destroying a memory pool
This should make sure that the memory context is not destroyed
before the memory pool, which is using the context.

(cherry picked from commit e97c3eea95)
2022-09-02 09:05:03 +00:00
Mark Andrews
5a66ca501b Call isc_mutex_destroy(&lasttime_mx);
(cherry picked from commit 8109f495c8b5d7c7f88d581f7905650add0c184e)
2022-08-24 17:05:00 +10:00
Michał Kępień
4d1986ebcb Handle ISC_MEM_DEFAULTFILL consistently
Contrary to what the documentation states, memory filling is only
enabled by --enable-developer (or by setting -DISC_MEM_DEFAULTFILL=1) if
the internal memory allocator is used.  However, the internal memory
allocator is disabled by default, so just using the --enable-developer
build-time option does not enable memory filling (passing "-M fill" on
the named command line is necessary to actually enable it).  As memory
filling is a useful tool for troubleshooting certain types of bugs, it
should also be enabled by --enable-developer when the system allocator
is used.

Furthermore, memory-related preprocessor macros are handled in two
distinct locations: lib/isc/include/isc/mem.h and bin/named/main.c.
This makes the logic hard to follow.

Move all code handling the ISC_MEM_DEFAULTFILL preprocessor macro to
lib/isc/include/isc/mem.h, ensuring memory filling is enabled by the
--enable-developer build-time switch, no matter which memory allocator
is used.
2022-07-15 10:45:34 +02:00
Michał Kępień
7df6070c02 Fix mempool stats bug in the internal allocator
Commit c96b6eb5ec changed the way mempool
code handles freed allocations that cannot be retained for later use as
"free list" items: it no longer uses different logic depending on
whether the internal allocator is used or the system one.  However, that
commit did not update a relevant piece of code in isc_mempool_destroy(),
causing memory context statistics to always be off upon shutdown when
BIND 9 is built with -DISC_MEM_USE_INTERNAL_MALLOC=1.  This causes
assertion failures.  Update isc_mempool_destroy() accordingly in order
to prevent this issue from being triggered.
2022-07-15 10:45:34 +02:00
Michal Nowak
a584a8f88f
Update clang to version 14
(cherry picked from commit 1c45a9885a)
2022-06-16 18:11:03 +02:00
Ondřej Surý
6cfab7e4f7 Gracefully handle uv_read_start() failures
Under specific rare timing circumstances the uv_read_start() could
fail with UV_EINVAL when the connection is reset between the connect (or
accept) and the uv_read_start() call on the nmworker loop.  Handle such
situation gracefully by propagating the errors from uv_read_start() into
upper layers, so the socket can be internally closed().

(cherry picked from commit b432d5d3bc)
2022-06-14 11:55:03 +02:00
Mark Andrews
3292a54fed Add LIBUV_CFLAGS to CLINCLUDE in lib/isc/Makefile.in 2022-05-31 16:43:48 +10:00
Ondřej Surý
32a3970b13 Replace netievent lock-free queue with simple locked queue
The current implementation of isc_queue uses Michael-Scott lock-free
queue that in turn uses hazard pointers.  It was discovered that the way
we use the isc_queue, such complicated mechanism isn't really needed,
because most of the time, we either execute the work directly when on
nmthread (in case of UDP) or schedule the work from the matching
nmthreads.

Replace the current implementation of the isc_queue with a simple locked
ISC_LIST.  There's a slight improvement - since copying the whole list
is very lightweight - we move the queue into a new list before we start
the processing and locking just for moving the queue and not for every
single item on the list.

NOTE: There's a room for future improvements - since we don't guarantee
the order in which the netievents are processed, we could have two lists
- one unlocked that would be used when scheduling the work from the
matching thread and one locked that would be used from non-matching
thread.

(cherry picked from commit 6bd025942c)
2022-05-25 16:01:58 +02:00
Petr Menšík
1feb389f80 Fix failures in isc netmgr_test on big endian machines
Typing from libuv structure to isc_region_t is not possible, because
their sizes differ on 64 bit architectures. Little endian machines seems
to be lucky and still result in test passed. But big endian machine such
as s390x fails the test reliably.

Fix by directly creating the buffer as isc_region_t and skipping the
type conversion. More readable and still more correct.

(cherry picked from commit 057438cb45)
2022-05-24 20:23:04 +02:00
Ondřej Surý
ed4eda5ebc Move setting the sock->write_timeout to the async_*send
Setting the sock->write_timeout from the TCP, TCPDNS, and TLSDNS send
functions could lead to (harmless) data race when setting the value for
the first time when the isc_nm_send() function would be called from
thread not-matching the socket we are sending to.  Move the setting the
sock->write_timeout to the matching async function which is always
called from the matching thread.

(cherry picked from commit 61117840c1)
2022-05-19 22:38:47 +02:00
Ondřej Surý
4657b0f0c4 Use C2x [[fallthrough]] when supported by LLVM/clang
Clang added support for the gcc-style fallthrough
attribute (i.e. __attribute__((fallthrough))) in version 10.  However,
__has_attribute(fallthrough) will return 1 in C mode in older versions,
even though they only support the C++11 fallthrough attribute. At best,
the unsupported attribute is simply ignored; at worst, it causes errors.

The C2x fallthrough attribute has the advantages of being supported in
the broadest range of clang versions (added in version 9) and being easy
to check for support. Use C2x [[fallthrough]] attribute if possible, and
fall back to not using an attribute for clang versions that don't have
it.

Courtesy of Joshua Root

(cherry picked from commit 14c8d43863)
2022-05-19 22:02:07 +02:00
Evan Hunt
adeddfa8ff dont run isc__trampoline_initialize() in dlopen library
when built without libtool, the sample driver in the dyndb
system test runs library intializers that have already been
run, causing the value for isc__trampoline_min to be reset.
wrap the trampoline initialize and shutdown routines under
isc_once_do() to ensure that they are only run once.
2022-05-15 00:25:32 -07:00
Ondřej Surý
be7f672fcc Lock the trampoline when attaching
When attaching to the trampoline, the isc__trampoline_max was access
unlocked.  This would not manifest under normal circumstances because we
initialize 65 trampolines by default and that's enough for most
commodity hardware, but there are ARM machines with 128+ cores where
this would be reported by ThreadSanitizer.

Add locking around the code in isc__trampoline_attach().  This also
requires the lock to leak on exit (along with memory that we already)
because a new thread might be attaching to the trampoline while we are
running the library destructor at the same time.

(cherry picked from commit 933162ae14)
2022-05-13 13:42:23 +02:00
Mark Andrews
2a9ab8a732 Don't try to set IPV6_V6ONLY on OpenBSD
OpenBSD IPv6 sockets are always IPv6-only, so the socket option is read-only (not modifiable)
2022-05-02 14:09:31 +10:00
Ondřej Surý
4f30b16d96 Abort when libuv at runtime mismatches libuv at compile time
When we compile with libuv that has some capabilities via flags passed
to f.e. uv_udp_listen() or uv_udp_bind(), the call with such flags would
fail with invalid arguments when older libuv version is linked at the
runtime that doesn't understand the flag that was available at the
compile time.

Enforce minimal libuv version when flags have been available at the
compile time, but are not available at the runtime.  This check is less
strict than enforcing the runtime libuv version to be same or higher
than compile time libuv version.
2022-04-26 11:52:02 +02:00
Michał Kępień
e850946557 Prevent memory bloat caused by a jemalloc quirk
Since version 5.0.0, decay-based purging is the only available dirty
page cleanup mechanism in jemalloc.  It relies on so-called tickers,
which are simple data structures used for ensuring that certain actions
are taken "once every N times".  Ticker data (state) is stored in a
thread-specific data structure called tsd in jemalloc parlance.  Ticks
are triggered when extents are allocated and deallocated.  Once every
1000 ticks, jemalloc attempts to release some of the dirty pages hanging
around (if any).  This allows memory use to be kept in check over time.

This dirty page cleanup mechanism has a quirk.  If the first
allocator-related action for a given thread is a free(), a
minimally-initialized tsd is set up which does not include ticker data.
When that thread subsequently calls *alloc(), the tsd transitions to its
nominal state, but due to a certain flag being set during minimal tsd
initialization, ticker data remains unallocated.  This prevents
decay-based dirty page purging from working, effectively enabling memory
exhaustion over time. [1]

The quirk described above has been addressed (by moving ticker state to
a different structure) in jemalloc's development branch [2], but not in
any numbered jemalloc version released to date (the latest one being
5.2.1 as of this writing).

Work around the problem by ensuring that every thread spawned by
isc_thread_create() starts with a malloc() call.  Avoid immediately
calling free() for the dummy allocation to prevent an optimizing
compiler from stripping away the malloc() + free() pair altogether.

An alternative implementation of this workaround was considered that
used a pair of isc_mem_create() + isc_mem_destroy() calls instead of
malloc() + free(), enabling the change to be fully contained within
isc__trampoline_run() (i.e. to not touch struct isc__trampoline), as the
compiler is not allowed to strip away arbitrary function calls.
However, that solution was eventually dismissed as it triggered
ThreadSanitizer reports when tools like dig, nsupdate, or rndc exited
abruptly without waiting for all worker threads to finish their work.

[1] https://github.com/jemalloc/jemalloc/issues/2251
[2] c259323ab3

(cherry picked from commit 7aa7b6474b)
2022-04-21 14:23:59 +02:00
Ondřej Surý
d836f23f79 Fix the Windows paths modified for load balanced sockets
When backporting the load balanced sockets to BIND 9.16, the Windows
specific paths were missed.  Add the #if(n)def _WIN32 back into the
appropriate places.
2022-04-05 11:53:18 +02:00
Ondřej Surý
5f27873d01 Rename shutdown() to test_shutdown() in timer_test.c
The shutdown() is part of standard library (POSIX-1), don't use such
name in the timer_test.c, but rather rename it to test_shutdown().
2022-04-05 02:17:47 +02:00
Ondřej Surý
9159837315 Enable the load-balance-sockets configuration
Previously, HAVE_SO_REUSEPORT_LB has been defined only in the private
netmgr-int.h header file, making the configuration of load balanced
sockets inoperable.

Move the missing HAVE_SO_REUSEPORT_LB define the isc/netmgr.h and add
missing isc_nm_getloadbalancesockets() implementation.

(cherry picked from commit 142c63dda8)
2022-04-05 02:17:47 +02:00
Ondřej Surý
8993ebc01a Add option to configure load balance sockets
Previously, the option to enable kernel load balancing of the sockets
was always enabled when supported by the operating system (SO_REUSEPORT
on Linux and SO_REUSEPORT_LB on FreeBSD).

It was reported that in scenarios where the networking threads are also
responsible for processing long-running tasks (like RPZ processing, CATZ
processing or large zone transfers), this could lead to intermitten
brownouts for some clients, because the thread assigned by the operating
system might be busy.  In such scenarious, the overall performance would
be better served by threads competing over the sockets because the idle
threads can pick up the incoming traffic.

Add new configuration option (`load-balance-sockets`) to allow enabling
or disabling the load balancing of the sockets.

(cherry picked from commit 85c6e797aa)
2022-04-05 01:21:50 +02:00
Ondřej Surý
79b7804ce8 Consistenly use UNREACHABLE() instead of ISC_UNREACHABLE()
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE().  Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
2022-03-28 23:28:05 +02:00
Ondřej Surý
4d1d91d709 Add win32 __builtin_unreachable() shim
The backport of using modern compiler features broken Windows debug
build because there's no __builtin_unreachable() in MSVC.

Define __builtin_unreachable() shim on MSVC using __assume(0).
2022-03-28 12:57:42 +02:00
Ondřej Surý
b624be2544 Remove use of the inline keyword used as suggestion to compiler
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline.  As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete.  The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.

Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler

NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.

(cherry picked from commit 20f0936cf2)
2022-03-25 09:37:18 +01:00
Ondřej Surý
75f9dd8e82 Simplify way we tag unreachable code with only ISC_UNREACHABLE()
Previously, the unreachable code paths would have to be tagged with:

    INSIST(0);
    ISC_UNREACHABLE();

There was also older parts of the code that used comment annotation:

    /* NOTREACHED */

Unify the handling of unreachable code paths to just use:

    UNREACHABLE();

The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.

(cherry picked from commit 584f0d7a7e)
2022-03-25 09:33:51 +01:00