Commit graph

13302 commits

Author SHA1 Message Date
Aram Sargsyan
61d77affdd Remove resolver.c:maybe_destroy()
After refactoring of `validated()`, the `maybe_destroy()` function is
no longer expected to actually destroy the fetch context when it is
being called, so effectively it only ensures that the validators are
canceled when the context has no more queries and pending events, but
that is redundant, because `maybe_destroy()` `REQUIRE`s that the context
should be in the shutting down state, and the function which sets that
state is already canceling the validators in its own turn.

As a failsafe, to make sure that no validators will be created after
`fctx_doshutdown()` is called, add an early return from `valcreate()` if
the context is in the shutting down state.
2022-06-30 19:12:17 +00:00
Aram Sargsyan
058a2e7d44 Fix a race between resolver query timeout and validation
The `resolver.c:validated()` function unlinks the current validator from
the fetch's validators list, which can leave it empty, then unlocks
the bucket lock. If, by chance, the fetch was timed out just before
the `validated()` call, the final timeout callback running in parallel
with `validated()` can find the fetch context with no active fetches
and with an empty validators list and destroy it, which is unexpected
for the `validated()` function and can lead to a crash.

Increase the fetch context's reference count in the beginning of
`validated()` and decrease it when it finishes its work to avoid the
unexpected destruction of the fetch context.
2022-06-30 18:58:58 +00:00
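
The reference-count guard described in the commit above can be pictured with a single-threaded sketch. Everything here is hypothetical: `fctx_sketch_t`, `fctx_ref()`, `fctx_unref()` and `timeout_cb()` are illustrative stand-ins, not BIND's real resolver types or functions, and the bucket lock is omitted.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a fetch context; not BIND's real structure. */
typedef struct {
	unsigned int references;
	unsigned int validators;
	bool destroyed;
} fctx_sketch_t;

static void
fctx_ref(fctx_sketch_t *fctx) {
	fctx->references++;
}

/* Drop one reference; the context is destroyed only when none remain. */
static void
fctx_unref(fctx_sketch_t *fctx) {
	if (--fctx->references == 0) {
		fctx->destroyed = true;
	}
}

/* Stand-in for the final timeout callback: it drops the fetch's own reference. */
static void
timeout_cb(fctx_sketch_t *fctx) {
	fctx_unref(fctx);
}

/* Returns true if the context was still alive while validated() was using it. */
static bool
validated_sketch(fctx_sketch_t *fctx) {
	bool alive;

	fctx_ref(fctx);          /* hold a reference for the duration of the call */
	fctx->validators--;      /* unlink the current validator */
	timeout_cb(fctx);        /* simulate the timeout firing in the middle */
	alive = !fctx->destroyed;
	fctx_unref(fctx);        /* release; destruction may happen only now */
	return alive;
}
```

With a context that starts at one reference and one validator, the simulated timeout no longer destroys the context mid-call; destruction is deferred to the final `fctx_unref()`.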
Michał Kępień
cbfb93e1c7 Fix destination port extraction for client queries
The current logic for determining the address of the socket to which a
client sent its query is:

 1. Get the address:port tuple from the netmgr handle using
    isc_nmhandle_localaddr() or from the ns_interface_t structure.

 2. Convert the address:port tuple from step 1 into an isc_netaddr_t
    using isc_netaddr_fromsockaddr().

 3. Convert the address from step 2 back into a socket address with the
    port set to 0 using isc_sockaddr_fromnetaddr().

Note that the port number (readily available in the netmgr handle or in
the ns_interface_t structure) is needlessly lost in the process,
preventing it from being recorded in dnstap captures of client traffic
produced by named.

Fix by first storing the address:port tuple in client->destsockaddr and
then creating an isc_netaddr_t from that structure.  This allows the
port number to be retained in client->destsockaddr, which is what
subsequently gets passed to dns_dt_send().

Remove an outdated code comment.

(cherry picked from commit 2f945703f2)
2022-06-22 13:52:08 +02:00
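
A minimal sketch of why the round trip in steps 1 to 3 loses the port, and how keeping the original tuple avoids it. The types and helpers below (`sockaddr_sketch_t`, `netaddr_sketch_t`, `dest_old()`, `dest_new()`) are simplified, hypothetical stand-ins for the isc_sockaddr_t / isc_netaddr_t conversions named in the commit message, not the real ISC API.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for isc_sockaddr_t and isc_netaddr_t. */
typedef struct {
	uint32_t addr;
	uint16_t port;
} sockaddr_sketch_t;

typedef struct {
	uint32_t addr; /* an address only: there is no room for a port */
} netaddr_sketch_t;

static netaddr_sketch_t
netaddr_fromsockaddr(const sockaddr_sketch_t *sa) {
	netaddr_sketch_t na = { sa->addr };
	return na;
}

static sockaddr_sketch_t
sockaddr_fromnetaddr(const netaddr_sketch_t *na, uint16_t port) {
	sockaddr_sketch_t sa = { na->addr, port };
	return sa;
}

/* Old flow: the round trip rebuilds the socket address with port 0. */
static sockaddr_sketch_t
dest_old(const sockaddr_sketch_t *local) {
	netaddr_sketch_t na = netaddr_fromsockaddr(local);
	return sockaddr_fromnetaddr(&na, 0);
}

/* New flow: store the address:port tuple first, then derive the netaddr. */
static sockaddr_sketch_t
dest_new(const sockaddr_sketch_t *local, netaddr_sketch_t *destaddr) {
	sockaddr_sketch_t destsockaddr = *local;
	*destaddr = netaddr_fromsockaddr(&destsockaddr);
	return destsockaddr;
}
```

For a local address with port 53, `dest_old()` yields port 0 while `dest_new()` keeps port 53 and still produces the address-only value.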
Michal Nowak
a584a8f88f Update clang to version 14
(cherry picked from commit 1c45a9885a)
2022-06-16 18:11:03 +02:00
Ondřej Surý
6cfab7e4f7 Gracefully handle uv_read_start() failures
Under specific rare timing circumstances the uv_read_start() could
fail with UV_EINVAL when the connection is reset between the connect (or
accept) and the uv_read_start() call on the nmworker loop.  Handle such a
situation gracefully by propagating the errors from uv_read_start() into
upper layers, so the socket can be internally closed().

(cherry picked from commit b432d5d3bc)
2022-06-14 11:55:03 +02:00
JINMEI Tatuya
673211492c make the fix more complete
(cherry picked from commit a58647df6a)
2022-06-14 12:07:39 +10:00
JINMEI Tatuya
66cfaf0fb0 corrected the opcode param to opcode_totext
(cherry picked from commit 2b81a69659)
2022-06-14 12:07:39 +10:00
Aram Sargsyan
87b3ced5fe Do not cancel processing record datasets in catalog zone after an error
When there are multiple record datasets in a database node of a catalog
zone, and BIND encounters a soft error during processing of a dataset,
it breaks from the loop and doesn't process the other datasets in the
node.

There are cases when this is not desired. For example, the catalog zones
draft version 5 states that there must be a TXT RRset named
`version.$CATZ` with exactly one RR, but it doesn't set a limitation
on possible non-TXT RRsets named `version.$CATZ` existing alongside
the TXT one. If one exists, we will get a processing
error and will not continue the loop to process the TXT RRset coming
next.

Remove the "break" statement to continue processing all record datasets.

(cherry picked from commit 0b2d5490cd)
2022-06-07 09:59:32 +00:00
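
The effect of removing the "break" can be illustrated with a small loop. This is only a sketch: `dataset_sketch_t`, `process_dataset()` and `process_node()` are hypothetical, and the rule "only TXT datasets process successfully" is an assumption chosen to mirror the `version.$CATZ` example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dataset: in this sketch only TXT datasets process cleanly. */
typedef struct {
	bool is_txt;
} dataset_sketch_t;

static bool
process_dataset(const dataset_sketch_t *ds) {
	return ds->is_txt;
}

/* Returns how many datasets of the node were processed successfully. */
static size_t
process_node(const dataset_sketch_t *sets, size_t n, bool stop_on_error) {
	size_t processed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!process_dataset(&sets[i])) {
			if (stop_on_error) {
				break; /* old behavior: later datasets are skipped */
			}
			continue; /* new behavior: soft error, keep going */
		}
		processed++;
	}
	return processed;
}
```

With a non-TXT dataset ahead of the TXT one, the old behavior processes nothing, while the new behavior still reaches the TXT dataset.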
Aram Sargsyan
1dc7288708 Don't process DNSSEC-related and ZONEMD records in catz
When processing a catalog zone update, skip processing records with
DNSSEC-related and ZONEMD types, because we are not interested in them
in the context of a catalog zone, and processing them will fail and
produce an unnecessary warning message.

(cherry picked from commit 73d6643137)
2022-06-02 10:33:03 +00:00
Mark Andrews
b318db2b7f Add missing INDENT call for UPDATE messages
Reported by Peter <pmc@citylink.dinoex.sub.org> on bind-users.

(cherry picked from commit 03132c93ca)
2022-06-02 08:29:28 +10:00
Mark Andrews
3292a54fed Add LIBUV_CFLAGS to CLINCLUDE in lib/isc/Makefile.in
2022-05-31 16:43:48 +10:00
Ondřej Surý
32a3970b13 Replace netievent lock-free queue with simple locked queue
The current implementation of isc_queue uses Michael-Scott lock-free
queue that in turn uses hazard pointers.  It was discovered that the way
we use the isc_queue, such complicated mechanism isn't really needed,
because most of the time, we either execute the work directly when on
nmthread (in case of UDP) or schedule the work from the matching
nmthreads.

Replace the current implementation of the isc_queue with a simple locked
ISC_LIST.  There's a slight improvement - since copying the whole list
is very lightweight - we move the queue into a new list before we start
the processing and locking just for moving the queue and not for every
single item on the list.

NOTE: There's room for future improvements - since we don't guarantee
the order in which the netievents are processed, we could have two lists
- one unlocked that would be used when scheduling the work from the
matching thread and one locked that would be used from non-matching
thread.

(cherry picked from commit 6bd025942c)
2022-05-25 16:01:58 +02:00
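
The "move the queue into a new list, then process without the lock" idea from the commit above might look roughly like this. It is a sketch using a POSIX mutex and a hand-rolled singly linked list; `queue_sketch_t`, `event_sketch_t` and the three functions are hypothetical names, not the isc_queue or ISC_LIST API.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical event and queue types for illustration only. */
typedef struct event_sketch {
	struct event_sketch *next;
	int value;
} event_sketch_t;

typedef struct {
	pthread_mutex_t lock;
	event_sketch_t *head;
	event_sketch_t *tail;
} queue_sketch_t;

static void
queue_init(queue_sketch_t *q) {
	pthread_mutex_init(&q->lock, NULL);
	q->head = q->tail = NULL;
}

static void
queue_enqueue(queue_sketch_t *q, event_sketch_t *ev) {
	ev->next = NULL;
	pthread_mutex_lock(&q->lock);
	if (q->tail != NULL) {
		q->tail->next = ev;
	} else {
		q->head = ev;
	}
	q->tail = ev;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Take the lock once, detach the whole list, then process the detached
 * items without holding the lock. Returns the sum of the processed values.
 */
static int
queue_process(queue_sketch_t *q) {
	event_sketch_t *list;
	int sum = 0;

	pthread_mutex_lock(&q->lock);
	list = q->head;
	q->head = q->tail = NULL;
	pthread_mutex_unlock(&q->lock);

	for (event_sketch_t *ev = list; ev != NULL; ev = ev->next) {
		sum += ev->value;
	}
	return sum;
}
```

The lock is held only for a couple of pointer assignments, regardless of how many events are queued.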
Petr Menšík
1feb389f80 Fix failures in isc netmgr_test on big endian machines
Casting from the libuv structure to isc_region_t is not possible, because
their sizes differ on 64 bit architectures. Little endian machines seem
to be lucky and still pass the test, but a big endian machine such
as s390x fails the test reliably.

Fix by directly creating the buffer as isc_region_t and skipping the
type conversion. More readable and still more correct.

(cherry picked from commit 057438cb45)
2022-05-24 20:23:04 +02:00
Matthijs Mekking
d3147417c5 Require valid key for dst_key functions
Make sure that the key structure is valid when calling the following
functions:
- dst_key_setexternal
- dst_key_isexternal
- dst_key_setmodified
- dst_key_ismodified

This commit is adapted because 9.16 has a different approach
of deconsting the variable.

(cherry picked from commit 888ec4e0d4)
2022-05-23 12:31:23 +02:00
Matthijs Mekking
7c42c04f3f Fix CID 352776: Concurrent data access violations
*** CID 352776:  Concurrent data access violations  (MISSING_LOCK)
/lib/dns/dst_api.c: 474 in dst_key_setmodified()
468     dst_key_isexternal(dst_key_t *key) {
469		return (key->external);
470     }
471
472     void
473     dst_key_setmodified(dst_key_t *key, bool value) {
>>>     CID 352776:  Concurrent data access violations  (MISSING_LOCK)
>>>     Accessing "key->modified" without holding lock
>>>	"dst_key.mdlock". Elsewhere, "dst_key.modified" is accessed with
>>>	"dst_key.mdlock" held 8 out of 11 times (8 of these accesses
>>>	strongly imply that it is necessary).
474		key->modified = value;
475     }
476
477     bool
478     dst_key_ismodified(dst_key_t *key) {
479		return (key->modified);

(cherry picked from commit 1fa24d0afb)
2022-05-23 12:03:56 +02:00
Ondřej Surý
ed4eda5ebc Move setting the sock->write_timeout to the async_*send
Setting the sock->write_timeout from the TCP, TCPDNS, and TLSDNS send
functions could lead to (harmless) data race when setting the value for
the first time when the isc_nm_send() function would be called from
thread not-matching the socket we are sending to.  Move the setting the
sock->write_timeout to the matching async function which is always
called from the matching thread.

(cherry picked from commit 61117840c1)
2022-05-19 22:38:47 +02:00
Ondřej Surý
4657b0f0c4 Use C2x [[fallthrough]] when supported by LLVM/clang
Clang added support for the gcc-style fallthrough
attribute (i.e. __attribute__((fallthrough))) in version 10.  However,
__has_attribute(fallthrough) will return 1 in C mode in older versions,
even though they only support the C++11 fallthrough attribute. At best,
the unsupported attribute is simply ignored; at worst, it causes errors.

The C2x fallthrough attribute has the advantages of being supported in
the broadest range of clang versions (added in version 9) and being easy
to check for support. Use C2x [[fallthrough]] attribute if possible, and
fall back to not using an attribute for clang versions that don't have
it.

Courtesy of Joshua Root

(cherry picked from commit 14c8d43863)
2022-05-19 22:02:07 +02:00
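
A sketch of the detection logic described in the commit above: use `[[fallthrough]]` when `__has_c_attribute` reports it, otherwise use no attribute. The macro name `FALLTHROUGH_SKETCH` and the function `classify()` are hypothetical; BIND's actual macro may be defined differently.

```c
/* Prefer the C2x attribute when the compiler reports support for it. */
#if defined(__has_c_attribute)
#if __has_c_attribute(fallthrough)
#define FALLTHROUGH_SKETCH [[fallthrough]]
#endif
#endif

/* Otherwise fall back to a plain no-op statement with no attribute. */
#if !defined(FALLTHROUGH_SKETCH)
#define FALLTHROUGH_SKETCH \
	do {               \
	} while (0)
#endif

static int
classify(int n) {
	int score = 0;

	switch (n) {
	case 2:
		score += 10;
		FALLTHROUGH_SKETCH;
	case 1:
		score += 1;
		break;
	default:
		break;
	}
	return score;
}
```

Here `classify(2)` accumulates both cases because of the deliberate fall-through, while `classify(1)` only runs the second case.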
Matthijs Mekking
296cb390b6 Add new functions to lib/dns/win32/libdns.def.in
Missing from lib/dns/win32/libdns.def.in:
dst_key_setmodified
dst_key_ismodified
2022-05-16 18:31:55 +02:00
Matthijs Mekking
c2e8c72298 Check if key metadata is modified before writing
Add a new parameter to the dst_key structure, mark a key modified if
dst_key_(un)set[bool,num,state,time] is called. Only write out key
files during a keymgr run if the metadata has changed.

(cherry picked from commit 1da91b3ab4)
2022-05-16 10:35:33 +02:00
Evan Hunt
adeddfa8ff don't run isc__trampoline_initialize() in dlopen library
when built without libtool, the sample driver in the dyndb
system test runs library initializers that have already been
run, causing the value for isc__trampoline_min to be reset.
wrap the trampoline initialize and shutdown routines under
isc_once_do() to ensure that they are only run once.
2022-05-15 00:25:32 -07:00
Evan Hunt
82c197d93b Cleanup: always count ns_statscounter_recursclients
The ns_statscounter_recursclients counter was previously only
incremented or decremented if client->recursionquota was non-NULL.
This was harmless, because that value should always be non-NULL if
recursion is enabled, but it made the code slightly confusing.

(cherry picked from commit 0201eab655)
2022-05-14 00:58:26 -07:00
Evan Hunt
8516efa4fd Fix the fetches-per-server quota calculation
Since commit bad5a523c2, when the fetches-per-server quota
was increased or decreased, instead of the value being set to
the newly calculated quota, it was set to the *minimum* of
the new quota or 1, which effectively meant it was always set to 1.
It should instead have been the maximum, to prevent the value from
ever dropping to zero.

(cherry picked from commit 694bc50273)
2022-05-14 00:52:22 -07:00
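
The difference between the two clamps is easy to see in isolation. In this sketch, `quota_buggy()` and `quota_fixed()` are hypothetical helpers that only reproduce the minimum-versus-maximum choice described above; they are not the resolver's actual code.

```c
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))
#define MAX_SKETCH(a, b) ((a) > (b) ? (a) : (b))

/* Buggy clamp: the result can never exceed 1. */
static unsigned int
quota_buggy(unsigned int newquota) {
	return MIN_SKETCH(newquota, 1U);
}

/* Fixed clamp: the result keeps the new quota but never drops below 1. */
static unsigned int
quota_fixed(unsigned int newquota) {
	return MAX_SKETCH(newquota, 1U);
}
```

For a newly calculated quota of 40, the buggy version returns 1 and the fixed version returns 40; for a calculated quota of 0, the fixed version returns 1.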
Evan Hunt
b6670787d2 prevent a possible buffer overflow in configuration check
corrected code that could have allowed a buffer overflow while
parsing named.conf.

(cherry picked from commit 921043b541)
2022-05-13 20:30:41 -07:00
Ondřej Surý
be7f672fcc Lock the trampoline when attaching
When attaching to the trampoline, the isc__trampoline_max was accessed
unlocked.  This would not manifest under normal circumstances because we
initialize 65 trampolines by default and that's enough for most
commodity hardware, but there are ARM machines with 128+ cores where
this would be reported by ThreadSanitizer.

Add locking around the code in isc__trampoline_attach().  This also
requires the lock to leak on exit (along with memory that we already leak)
because a new thread might be attaching to the trampoline while we are
running the library destructor at the same time.

(cherry picked from commit 933162ae14)
2022-05-13 13:42:23 +02:00
Mark Andrews
36612dadff Allow DNS_RPZ_POLICY_ERROR to be converted to a string
(cherry picked from commit f498d2db0d)
2022-05-04 23:53:21 +10:00
Mark Andrews
8f23d56fba Check the cache as well when glue NS are returned processing RPZ
(cherry picked from commit 8fb72012e3)
2022-05-04 23:53:21 +10:00
Mark Andrews
8c2ede6edc Process learned records as well as glue
(cherry picked from commit 07c828531c)
2022-05-04 23:53:21 +10:00
Mark Andrews
13129872eb Process the delegating NS RRset when checking rpz rules
(cherry picked from commit cf97c61f48)
2022-05-04 23:53:21 +10:00
Mark Andrews
2a9ab8a732 Don't try to set IPV6_V6ONLY on OpenBSD
OpenBSD IPv6 sockets are always IPv6-only, so the socket option is read-only (not modifiable)
2022-05-02 14:09:31 +10:00
Petr Menšík
c1b3862c4a Additional safety check for negative array index
inet_ntop result should always protect against empty string accepted
without an error. Make additional check to satisfy coverity scans.

(cherry picked from commit 656a0f076f)
2022-04-29 11:46:33 +10:00
Petr Menšík
1bc7552203 Ensure diff variable is not read uninitialized
Coverity detected issues:
- var_decl: Declaring variable "diff" without initializer.
- uninit_use_in_call: Using uninitialized value "diff.tuples.head" when
  calling "dns_diff_clear".

(cherry picked from commit 67e773c93c)
2022-04-29 11:46:33 +10:00
Ondřej Surý
4f30b16d96 Abort when libuv at runtime mismatches libuv at compile time
When we compile with libuv that has some capabilities via flags passed
to e.g. uv_udp_listen() or uv_udp_bind(), the call with such flags would
fail with invalid arguments when older libuv version is linked at the
runtime that doesn't understand the flag that was available at the
compile time.

Enforce minimal libuv version when flags have been available at the
compile time, but are not available at the runtime.  This check is less
strict than enforcing the runtime libuv version to be same or higher
than compile time libuv version.
2022-04-26 11:52:02 +02:00
Michał Kępień
e850946557 Prevent memory bloat caused by a jemalloc quirk
Since version 5.0.0, decay-based purging is the only available dirty
page cleanup mechanism in jemalloc.  It relies on so-called tickers,
which are simple data structures used for ensuring that certain actions
are taken "once every N times".  Ticker data (state) is stored in a
thread-specific data structure called tsd in jemalloc parlance.  Ticks
are triggered when extents are allocated and deallocated.  Once every
1000 ticks, jemalloc attempts to release some of the dirty pages hanging
around (if any).  This allows memory use to be kept in check over time.

This dirty page cleanup mechanism has a quirk.  If the first
allocator-related action for a given thread is a free(), a
minimally-initialized tsd is set up which does not include ticker data.
When that thread subsequently calls *alloc(), the tsd transitions to its
nominal state, but due to a certain flag being set during minimal tsd
initialization, ticker data remains unallocated.  This prevents
decay-based dirty page purging from working, effectively enabling memory
exhaustion over time. [1]

The quirk described above has been addressed (by moving ticker state to
a different structure) in jemalloc's development branch [2], but not in
any numbered jemalloc version released to date (the latest one being
5.2.1 as of this writing).

Work around the problem by ensuring that every thread spawned by
isc_thread_create() starts with a malloc() call.  Avoid immediately
calling free() for the dummy allocation to prevent an optimizing
compiler from stripping away the malloc() + free() pair altogether.

An alternative implementation of this workaround was considered that
used a pair of isc_mem_create() + isc_mem_destroy() calls instead of
malloc() + free(), enabling the change to be fully contained within
isc__trampoline_run() (i.e. to not touch struct isc__trampoline), as the
compiler is not allowed to strip away arbitrary function calls.
However, that solution was eventually dismissed as it triggered
ThreadSanitizer reports when tools like dig, nsupdate, or rndc exited
abruptly without waiting for all worker threads to finish their work.

[1] https://github.com/jemalloc/jemalloc/issues/2251
[2] c259323ab3

(cherry picked from commit 7aa7b6474b)
2022-04-21 14:23:59 +02:00
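
The workaround in the commit above can be sketched as a thread entry wrapper that allocates before doing anything else and frees only once the thread's work is done. The names `trampoline_sketch_t`, `dummy_alloc`, `trampoline_run_sketch()` and `echo()` are hypothetical, and the point at which the real code frees the allocation is an assumption here.

```c
#include <stdlib.h>

typedef void *(*threadfunc_sketch_t)(void *);

/* Hypothetical trampoline: carries the dummy allocation alongside the job. */
typedef struct {
	threadfunc_sketch_t start;
	void *arg;
	void *dummy_alloc;
} trampoline_sketch_t;

/* Intended to be the first function a new thread executes. */
static void *
trampoline_run_sketch(trampoline_sketch_t *t) {
	void *result;

	/* Make malloc(), not free(), the thread's first allocator call. */
	t->dummy_alloc = malloc(8);

	result = t->start(t->arg);

	/*
	 * Free only after the thread's work, so the malloc()/free() pair is
	 * not adjacent and the pointer is stored in a visible structure.
	 */
	free(t->dummy_alloc);
	t->dummy_alloc = NULL;
	return result;
}

/* Trivial thread body used to exercise the wrapper. */
static void *
echo(void *arg) {
	return arg;
}
```

Storing the pointer in the trampoline structure is what keeps the allocation observable; a local `malloc()` immediately followed by `free()` could be removed by an optimizing compiler.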
Mark Andrews
cb3c29cf8e Update the rdataset->trust field in ncache.c:rdataset_settrust
Both the trust recorded in the slab structure and the trust on
rdataset need to be updated.

(cherry picked from commit d043a41499)
2022-04-19 09:45:16 +10:00
Matthijs Mekking
42f43cebdd Update dns_dnssec_syncdelete() function
Update the function that synchronizes the CDS and CDNSKEY DELETE
records. It now allows for the possibility that the CDS DELETE record
is published and the CDNSKEY DELETE record is not, and vice versa.

Also update the code in zone.c how 'dns_dnssec_syncdelete()' is called.

With KASP, we still maintain the DELETE records ourselves. Otherwise,
we publish the CDS and CDNSKEY DELETE record only if they are added
to the zone. We do still check if these records can be signed by a KSK.

This change will allow users to add a CDS and/or CDNSKEY DELETE record
manually, without BIND removing them on the next zone sign.

Note that this commit removes the check whether the key is a KSK, this
check is redundant because this check is also made in
'dst_key_is_signing()' when the role is set to DST_BOOL_KSK.

(cherry picked from commit 3d05c99abb)
2022-04-13 15:13:12 +02:00
Ondřej Surý
a7f893e836 Rename the configuration option to load balance sockets to reuseport
After some back and forth, it was decided to match the configuration
option with unbound ("so-reuseport"), PowerDNS ("reuseport") and/or
nginx ("reuseport").

(cherry picked from commit 7e71c4d0cc)
2022-04-06 17:51:12 +02:00
Ondřej Surý
daa7d6d6db Revert "General cleanup of dns_rpz implementation"
This reverts commit 84e62cece5.
2022-04-06 10:41:49 +02:00
Ondřej Surý
f5fbe2c26f Revert "Refactor the dns_rpz_add/delete to use local rpz copy"
This reverts commit 635147d01a.
2022-04-06 10:41:39 +02:00
Ondřej Surý
b68ccdc48e Revert "Run the RPZ update as offloaded work"
This reverts commit 73a0bb8588.
2022-04-06 10:31:23 +02:00
Ondřej Surý
d836f23f79 Fix the Windows paths modified for load balanced sockets
When backporting the load balanced sockets to BIND 9.16, the Windows
specific paths were missed.  Add the #if(n)def _WIN32 back into the
appropriate places.
2022-04-05 11:53:18 +02:00
Ondřej Surý
5f27873d01 Rename shutdown() to test_shutdown() in timer_test.c
The shutdown() is part of standard library (POSIX-1), don't use such
name in the timer_test.c, but rather rename it to test_shutdown().
2022-04-05 02:17:47 +02:00
Ondřej Surý
9159837315 Enable the load-balance-sockets configuration
Previously, HAVE_SO_REUSEPORT_LB has been defined only in the private
netmgr-int.h header file, making the configuration of load balanced
sockets inoperable.

Move the missing HAVE_SO_REUSEPORT_LB define to isc/netmgr.h and add
missing isc_nm_getloadbalancesockets() implementation.

(cherry picked from commit 142c63dda8)
2022-04-05 02:17:47 +02:00
Ondřej Surý
8993ebc01a Add option to configure load balance sockets
Previously, the option to enable kernel load balancing of the sockets
was always enabled when supported by the operating system (SO_REUSEPORT
on Linux and SO_REUSEPORT_LB on FreeBSD).

It was reported that in scenarios where the networking threads are also
responsible for processing long-running tasks (like RPZ processing, CATZ
processing or large zone transfers), this could lead to intermittent
brownouts for some clients, because the thread assigned by the operating
system might be busy.  In such scenarios, the overall performance would
be better served by threads competing over the sockets because the idle
threads can pick up the incoming traffic.

Add new configuration option (`load-balance-sockets`) to allow enabling
or disabling the load balancing of the sockets.

(cherry picked from commit 85c6e797aa)
2022-04-05 01:21:50 +02:00
Ondřej Surý
73a0bb8588 Run the RPZ update as offloaded work
Previously, the RPZ updates ran quantized on the main nm_worker loops.
As the quantum was set to 1024, this might lead to service
interruptions when large RPZ update was processed.

Change the RPZ update process to run as the offloaded work.  The update
and cleanup loops were refactored to do as little locking of the
maintenance lock as possible for the shortest periods of time and the db
iterator is being paused for every iteration, so we don't hold the rbtdb
tree lock for prolonged periods of time.

(cherry picked from commit f106d0ed2b)
(cherry picked from commit e128b6a951)
2022-04-05 00:30:39 +02:00
Ondřej Surý
635147d01a Refactor the dns_rpz_add/delete to use local rpz copy
Previously dns_rpz_add() was passed dns_rpz_zones_t and index to .zones
array.  Because we actually attach to dns_rpz_zone_t, we should be using
the local pointer instead of passing the index and "finding" the
dns_rpz_zone_t again.

Additionally, dns_rpz_add() and dns_rpz_delete() were used only inside
rpz.c, so make them static.

(cherry picked from commit b6e885c97f)
(cherry picked from commit f4cba0784e)
2022-04-05 00:30:39 +02:00
Ondřej Surý
84e62cece5 General cleanup of dns_rpz implementation
Do a general cleanup of lib/dns/rpz.c style:

 * Removed deprecated and unused functions
 * Unified dns_rpz_zone_t naming to rpz
 * Unified dns_rpz_zones_t naming to rpzs
 * Add and use rpz_attach() and rpz_attach_rpzs() functions
 * Shuffled variables to be more local (cppcheck cleanup)

(cherry picked from commit 840179a247)
(cherry picked from commit bfee462403)
2022-04-05 00:02:35 +02:00
Mark Andrews
c284112bec Prevent arithmetic overflow of 'i' in master.c:generate
the value of 'i' in generate could overflow when adding 'step' to
it in the 'for' loop.  Use an unsigned int for 'i' which will give
an additional bit and prevent the overflow.  The inputs are both
less than 2^31 and the result will be less than 2^32-1.

(cherry picked from commit 5abdee9004)
2022-04-01 21:42:53 +11:00
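
A sketch of the overflow argument: when the start and the step are both below 2^31, their sum still fits in a 32-bit unsigned int, so an unsigned loop variable cannot wrap. `count_steps()` is a hypothetical helper, not the code in master.c, and it assumes non-negative bounds, a positive step and a 32-bit `unsigned int`.

```c
/*
 * Count the iterations of "for (i = start; i <= stop; i += step)".
 * Assumes 0 <= start, 0 <= stop, 0 < step, all below 2^31.
 */
static unsigned int
count_steps(int start, int stop, int step) {
	unsigned int count = 0;

	/*
	 * With a signed int, "i += step" could exceed INT_MAX, which is
	 * undefined behavior. As unsigned, the largest possible value is
	 * (2^31 - 1) + (2^31 - 1) = 2^32 - 2, which is still representable.
	 */
	for (unsigned int i = (unsigned int)start; i <= (unsigned int)stop;
	     i += (unsigned int)step)
	{
		count++;
	}
	return count;
}
```

Near the top of the range, for example from 2147483646 to 2147483647 in steps of 2, the loop runs once and terminates instead of overflowing.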
Tony Finch
a5d65815bc Log "not authoritative for update zone" more clearly
Ensure the update zone name is mentioned in the NOTAUTH error message
in the server log, so that it is easier to track down problematic
update clients. There are two cases: either the update zone is
unrelated to any of the server's zones (previously no zone was
mentioned); or the update zone is a subdomain of one or more of the
server's zones (previously the name of the irrelevant parent zone was
misleadingly logged).

Closes #3209

(cherry picked from commit 84c4eb02e7)
2022-03-30 13:24:56 +01:00
Ondřej Surý
79b7804ce8 Consistently use UNREACHABLE() instead of ISC_UNREACHABLE()
In a couple of places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE().  Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
2022-03-28 23:28:05 +02:00
Ondřej Surý
4d1d91d709 Add win32 __builtin_unreachable() shim
The backport of using modern compiler features broke the Windows debug
build because there's no __builtin_unreachable() in MSVC.

Define __builtin_unreachable() shim on MSVC using __assume(0).
2022-03-28 12:57:42 +02:00
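
The shim described in the commit above amounts to a one-line macro. The sketch below shows it with a small user; the guard condition and the `color_sketch_t` / `color_code()` example are assumptions for illustration, and the actual BIND header may differ.

```c
/* MSVC has no __builtin_unreachable(); map it to __assume(0) there. */
#if defined(_MSC_VER) && !defined(__clang__)
#define __builtin_unreachable() __assume(0)
#endif

/* Hypothetical enum used only to exercise the shim. */
typedef enum {
	COLOR_RED_SKETCH,
	COLOR_GREEN_SKETCH
} color_sketch_t;

static int
color_code(color_sketch_t c) {
	switch (c) {
	case COLOR_RED_SKETCH:
		return 1;
	case COLOR_GREEN_SKETCH:
		return 2;
	}
	/* Every enumerator returns above; tell the compiler this is dead code. */
	__builtin_unreachable();
}
```

Callers must pass only valid enumerators, since reaching the marked point is undefined behavior on both compiler families.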