13446 commits

Author SHA1 Message Date
Ondřej Surý
c7d64009c2
Add test for dns_rbtdb overmem purging
Add a unit test to check if the overmem purging in the RBTDB is
effective when mixed size RR data is inserted into the database.

Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Jinmei Tatuya <jtatuya@infoblox.com>

(manually picked from 269c03831f)
2023-07-26 15:20:53 +02:00
Michał Kępień
d3b0df51cf
Revert GL !8123
This reverts commit 302d0d36f7 (7e9e96ba01 and bd912b7bed), reversing
changes made to fc6992b3fb.
2023-07-24 11:02:37 +02:00
Mark Andrews
7e9e96ba01 Mark a primary as unreachable on timed out in xfin
When a primary server is not responding, mark it as temporarily
unreachable.  This will prevent too many zones queuing up on an
unreachable server and allow the refresh process to move onto
the next primary sooner once it has been so marked.
2023-07-22 09:06:42 +10:00
Ondřej Surý
36aba0db8f
Don't process detach and close as priority netmgr events
The detach (and possibly close) netmgr events can cause additional
callbacks to be called when under exclusive mode.  The detach can
trigger next queued TCP query to be processed and close will call
configured close callback.

Move the detach and close netmgr events from the priority queue to the
normal queue, as detaching and closing the sockets can wait for the
exclusive mode to be over.

(cherry picked from commit c2c2ec0c96)
2023-07-20 19:21:44 +02:00
Matthijs Mekking
c003c5bc3c
Fix serve-stale hang at shutdown
The 'refresh_rrset' variable is used to determine if we can detach from
the client. This can cause a hang on shutdown. To fix this, move setting
of the 'nodetach' variable up to where 'refresh_rrset' is set (in
query_lookup(), and thus not in ns_query_done()), and set it to false
when actually refreshing the RRset, so that when this lookup is
completed, the client will be detached.
2023-06-09 15:53:10 +02:00
Evan Hunt
0101e28f91
Stale answer lookups could loop when over recursion quota
When a query was aborted because of the recursion quota being exceeded,
but triggered a stale answer response and a stale data refresh query,
it could cause named to loop back where we are iterating and following
a delegation. Having no good answer in cache, we would fall back to
using serve-stale again, use the stale data, try to refresh the RRset,
and loop back again, without ever terminating until crashing due to
stack overflow.

This happens because in the functions 'query_notfound()' and
'query_delegation_recurse()', we check whether we can fall back to
serving stale data. We shouldn't do so if we are already refreshing
an RRset due to having prioritized stale data in cache.

In other words, we need to add an extra check to 'query_usestale()' to
disallow serving stale data if we are currently refreshing a stale
RRset.

As an additional mitigation to prevent looping, we now use the result
code ISC_R_ALREADYRUNNING rather than ISC_R_FAILURE when a recursion
loop is encountered, and we check for that condition in
'query_usestale()' as well.
2023-06-09 15:52:51 +02:00
Ondřej Surý
f1d9e9ee38
Improve RBT overmem cache cleaning
When cache memory usage is over the configured cache size (overmem) and
we are cleaning unused entries, it might not be enough to clean just two
entries if the entries to be expired are smaller than the newly added
rdata.  This could be abused by an attacker to cause a remote Denial of
Service by possibly running out of the operating system memory.

Currently, the addrdataset() tries to do a single TTL-based cleaning
considering the serve-stale TTL and then optionally moves to overmem
cleaning if we are in that condition.  Then the overmem_purge() tries to
do another single TTL based cleaning from the TTL heap and then continue
with LRU-based cleaning up to 2 entries cleaned.

Squash the TTL-cleaning mechanism into single call from addrdataset(),
but ignore the serve-stale TTL if we are currently overmem.

Then instead of having a fixed number of entries to clean, pass the size
of newly added rdatasetheader to the overmem_purge() function and
clean up at least the size of the newly added data.  This prevents the
cache going over the configured memory limit (`max-cache-size`).

Additionally, refactor the overmem_purge() function to reduce for-loop
nesting for readability.
2023-06-06 14:23:16 +02:00
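The size-based purging described in f1d9e9ee38 can be sketched roughly as
follows.  This is only an illustration, not the actual overmem_purge()
code: the entry_t type and the purge_lru() function are hypothetical, and
the real function also performs TTL-based cleaning and frees memory.

```c
#include <stddef.h>

/* Hypothetical cache entry; the real rdatasetheader is far richer. */
typedef struct entry {
	size_t size;        /* bytes held by this entry */
	struct entry *next; /* next entry, toward the most recently used end */
} entry_t;

/*
 * Unlink entries from the least recently used end (the list head) until
 * at least 'purgesize' bytes have been reclaimed or the list is empty.
 * Returns the number of bytes reclaimed; '*head' is updated in place.
 */
static size_t
purge_lru(entry_t **head, size_t purgesize) {
	size_t purged = 0;

	while (*head != NULL && purged < purgesize) {
		entry_t *victim = *head;

		*head = victim->next;
		victim->next = NULL;
		purged += victim->size;
	}

	return purged;
}
```

With entries of 40, 50 and 300 bytes and 100 bytes of newly added data,
a fixed limit of two entries would reclaim only 90 bytes, while the
size-based loop keeps going until the third entry is also removed.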
Matthijs Mekking
2cce83e0d7 Fix serve-stale bug when cache has no data
We recently fixed a bug where in some cases (when following an
expired CNAME for example), named could return SERVFAIL if the target
record is still valid (see isc-projects/bind9#3678, and
isc-projects/bind9!7096). We fixed this by considering non-stale
RRsets as well during the stale lookup.

However, this triggered a new bug because despite the answer from
cache not being stale, the lookup may be triggered by serve-stale.
If the answer from database is not stale, the fix in
isc-projects/bind9!7096 erroneously skips the serve-stale logic.

Add 'answer_found' checks to the serve-stale logic to fix this issue.

(cherry picked from commit bbd163acf6)
2023-05-30 15:32:24 +02:00
Mark Andrews
a01c0e175a Properly process extra nameserver lines in resolv.conf
The whole line needs to be read rather than just the token "nameserver"
otherwise the next line in resolv.conf is not properly processed.

(cherry picked from commit 864cd08052)
2023-05-18 08:52:17 +10:00
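The idea behind a01c0e175a, consuming the whole line even when a
nameserver entry is ignored, can be sketched like this.  It is a
simplified stand-in and not the parser used by BIND: counts_t,
parse_resolv() and the MAX_NAMESERVERS limit of 3 are assumptions made
for the example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAMESERVERS 3 /* assumed limit, for illustration only */

typedef struct {
	int nameservers; /* accepted "nameserver" lines */
	int ignored;     /* extra "nameserver" lines beyond the limit */
	int others;      /* any other non-empty line */
} counts_t;

/*
 * Walk resolv.conf-style text one full line at a time.  Extra
 * "nameserver" lines are ignored, but the rest of the line is still
 * skipped so that the following line is parsed from its beginning.
 */
static counts_t
parse_resolv(const char *text) {
	counts_t c = { 0, 0, 0 };

	while (*text != '\0') {
		size_t len = strcspn(text, "\n"); /* length of this line */

		if (len > 10 && strncmp(text, "nameserver", 10) == 0 &&
		    (text[10] == ' ' || text[10] == '\t'))
		{
			if (c.nameservers < MAX_NAMESERVERS) {
				c.nameservers++;
			} else {
				c.ignored++;
			}
		} else if (len > 0) {
			c.others++;
		}

		text += len; /* always advance past the whole line */
		if (*text == '\n') {
			text++;
		}
	}

	return c;
}
```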
Aram Sargsyan
537c2d2c68 Check whether zone->db is a valid pointer before attaching
The zone_resigninc() function does not check the validity of
'zone->db', which can crash named if the zone was unloaded earlier,
for example with "rndc delete".

Check that 'zone->db' is not 'NULL' before attaching to it, like
it is done in zone_sign() and zone_nsec3chain() functions, which
can similarly be called by zone maintenance.

(cherry picked from commit fae0930eb8)
2023-05-15 12:05:11 +00:00
Michal Nowak
ab9d43f814
Update sources to Clang 16 formatting
2023-05-11 14:26:14 +02:00
Mark Andrews
300a2fb4ba Check the pointer alignments when deserialising
deserialize_corrupt_test may corrupt the pointers such that they
are no longer properly aligned.  Check that the alignment is consistent
with memory returned from isc_mem before checking the magic value.
2023-05-05 07:04:31 +00:00
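The order of checks described in 300a2fb4ba (alignment first, magic
value second) might look roughly like the sketch below.  The header_t
type, the HEADER_MAGIC value and the use of alignof(max_align_t) as the
expected alignment are assumptions for illustration; the alignment that
isc_mem actually guarantees may differ.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_MAGIC 0x48445221U /* hypothetical magic value */

typedef struct {
	uint32_t magic;
} header_t;

/* True if 'ptr' is non-NULL and a multiple of 'alignment'. */
static bool
is_aligned(const void *ptr, size_t alignment) {
	return ptr != NULL && ((uintptr_t)ptr % alignment) == 0;
}

/*
 * Validate a possibly corrupted pointer.  The alignment is checked
 * before the magic value so that a misaligned pointer is never
 * dereferenced.
 */
static bool
header_valid(const void *ptr) {
	if (!is_aligned(ptr, alignof(max_align_t))) {
		return false;
	}
	return ((const header_t *)ptr)->magic == HEADER_MAGIC;
}
```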
Mark Andrews
3d8a223256 Cleanup orphaned empty-non-terminal NSEC3
When OPTOUT was in use we didn't ensure that NSEC3 records
for orphaned empty-non-terminals were removed.  Check if
there are orphaned empty-non-terminal NSEC3 even if there
wasn't an NSEC3 RRset to be removed in dns_nsec3_delnsec3.

(cherry picked from commit 27160c137f)
2023-04-25 06:46:17 +01:00
Petr Špaček
dc6a888ea3
Export dns_view_istrusted() on Windows
2023-04-03 18:18:43 +02:00
Mark Andrews
489cba33bb
dns_view_untrust modifies dnskey->flags when it shouldn't
Copy the structure and declare dnskey as const.

(cherry picked from commit 21d828241b)
2023-04-03 17:48:31 +02:00
Mark Andrews
f708172d87
Handle dns_rdata_fromstruct failure in dns_keytable_deletekey
dns_rdata_fromstruct in dns_keytable_deletekey can potentially
fail with ISC_R_NOSPACE.  Handle the error condition.

(cherry picked from commit b5df9b8591)
2023-04-03 17:48:31 +02:00
Mark Andrews
3cb366b1e0
Reduce the number of verifications required
In selfsigned_dnskey only call dns_dnssec_verify if the signature's
key id matches a revoked key, the trust is pending and the key
matches a trust anchor.  Previously named was calling dns_dnssec_verify
unconditionally, which resulted in busy work.

(cherry picked from commit e68fecbdaa)
2023-04-03 17:48:31 +02:00
Mark Andrews
19f8033840
Add new view method dns_view_istrusted
dns_view_istrusted determines if the given key is treated as
being trusted by the view.

(cherry picked from commit 7278fff579)
2023-04-03 17:48:31 +02:00
Matthijs Mekking
89c000f356 Fix scan-build issue: initialized value never read
Value stored to 'source' during its initialization is never read.

(cherry picked from commit 4c33277446)
2023-03-29 15:08:36 +00:00
Evan Hunt
6e422ae3ae fixed a bug in rolling timestamp logfiles
due to comparing logfile suffixes as 32 bit rather than 64 bit
integers, logfiles with timestamp suffixes that should have been
removed when rolling could be left in place. this has been fixed.

(cherry picked from commit 9a9e906306)
2023-03-28 10:03:33 +00:00
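The 32-bit versus 64-bit problem fixed in 6e422ae3ae can be illustrated
with a small sketch.  The parse_suffix() helper and the sample suffix
are hypothetical; the point is only that a timestamp-like suffix with
many digits does not fit into a 32-bit integer and has to be parsed and
compared as a 64-bit value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the numeric suffix of a rolled logfile name into a 64-bit
 * value.  Returns false if the suffix is empty, not purely numeric,
 * negative, or out of range.
 */
static bool
parse_suffix(const char *suffix, int64_t *valuep) {
	char *end = NULL;
	long long v;

	errno = 0;
	v = strtoll(suffix, &end, 10);
	if (end == suffix || *end != '\0' || errno == ERANGE || v < 0) {
		return false;
	}

	*valuep = (int64_t)v;
	return true;
}
```

A suffix such as "20230328100333123" is far larger than INT32_MAX, so
storing it in a 32-bit variable would lose information and make the
comparison against the cutoff unreliable.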
Mark Andrews
772cdf453d When signing with a new algorithm preserve NSEC/NSEC3 chains
If the zone already has existing NSEC/NSEC3 chains then zone_sign
needs to continue to use them.  If there are no chains then use
kasp setting otherwise generate an NSEC chain.

(cherry picked from commit 4b55201459)
2023-03-15 00:30:22 +11:00
Aram Sargsyan
ddb67b01b2 Check if catz is active in dns_catz_update_from_db()
A reconfiguration can deactivate the catalog zone, while the
offloaded update process was preparing to run.

(cherry picked from commit 6980e3b354)
2023-03-02 19:42:16 +00:00
Aram Sargsyan
641627838b Use catzs->lock in dns_catz_prereconfig()
There can be an update running in another thread, so use a lock,
like it's done in dns_catz_postreconfig().

(cherry picked from commit 3973724d67)
2023-03-02 19:36:26 +00:00
Aram Sargsyan
2a719e9df2 Revert "Process db callbacks in zone_loaddone() after zone_postload()"
This reverts commit 1254f37584.

The commit introduced a data race, because dns_db_endload() is called
after unfreezing the zone.

(not cherry picked from commit 593dea871a)
2023-03-02 19:19:55 +00:00
Aram Sargsyan
fff49a2ffb catz: unregister the db update-notify callback before detaching from db
When detaching from the previous version of the database, make sure
that the update-notify callback is unregistered, otherwise there is
an INSIST check which can generate an assertion failure in free_rbtdb(),
which checks that there are no outstanding update listeners in the list.

There is similar code already in place for RPZ.

(cherry picked from commit cf79692a66)
2023-02-28 14:40:17 +00:00
Aram Sargsyan
79ee7353ad Searching catzs->zones requires a read lock
Lock the catzs->lock mutex before searching in the catzs->zones
hash table.

(cherry picked from commit 0ef0c86632)
2023-02-28 14:40:17 +00:00
Aram Sargsyan
1254f37584 Process db callbacks in zone_loaddone() after zone_postload()
The zone_postload() function can fail and unregister the callbacks.

Call dns_db_endload() only after calling zone_postload() to make
sure that the registered update-notify callbacks are not called
when the zone loading has failed during zone_postload().

Also, don't ignore the return value of zone_postload().

(cherry picked from commit ed268b46f1)
2023-02-28 14:40:17 +00:00
Aram Sargsyan
466a05eaf0 Fix a cleanup bug when isc_task_create() fails in dns_catz_new_zones()
Use isc_mem_putanddetach() instead of isc_mem_put() to detach from the
memory context.

(cherry picked from commit 9050481d1f)
2023-02-27 13:55:05 +00:00
Mark Andrews
b49a3a56c9 Fix dns_kasp_attach / dns_kasp_detach usage
The kasp pointers in dns_zone_t should consistently be changed by
dns_kasp_attach and dns_kasp_detach so the usage is balanced.

(cherry picked from commit b41882cc75)
2023-02-21 16:58:42 +01:00
Mark Andrews
d0c92a31a9 In hmac_createctx free ctx on isc_hmac_init failure
(cherry picked from commit d22257a370)
2023-02-18 10:27:11 +11:00
Ondřej Surý
100c20b470
Don't check for maximal version on Windows
Windows doesn't have support for recvmmsg(), so we don't need to
check for maximal version on Windows (only for a minimal version).

Remove the MAXIMAL_VERSION when compiling on Windows.
2023-02-15 11:02:57 +01:00
Mark Andrews
4df6019c16 Report the key name that failed in retry_keyfetch
When there are multiple managed trust anchors we need to know the
name of the trust anchor that is failing.  Extend the error message
to include the trust anchor name.

(cherry picked from commit fb7b7ac495)
2023-02-14 22:09:52 +00:00
Evan Hunt
aca10608b6 delay trust anchor management until zones are loaded
it was possible for a managed trust anchor needing to send a key
refresh query to be unable to do so because an authoritative zone
was not yet loaded. this has been corrected by delaying the
synchronization of managed-keys zones until after all zones are
loaded.

(cherry picked from commit bafbbd2465)
2023-02-14 10:23:28 -08:00
Ondřej Surý
b163ca9f97 Enforce version drift limits for libuv
libuv support for receiving multiple UDP messages in a single system
call (recvmmsg()) has been tweaked several times between libuv versions
1.35.0 and 1.40.0.  Mixing and matching libuv versions within that span
may lead to assertion failures and is therefore considered harmful, so
try to limit potential damage by preventing users from mixing libuv
versions with distinct sets of recvmmsg()-related flags.

(cherry picked from commit 735d09bffe)
2023-02-10 06:50:32 +01:00
Ondřej Surý
9309589ad0 Avoid libuv 1.35 and 1.36 that have broken recvmmsg implementation
The implementation of UDP recvmmsg in libuv 1.35 and 1.36 is
incomplete and could cause assertion failure under certain
circumstances.

Modify the configure and runtime checks to report a fatal error when
trying to compile or run with the affected versions.

(cherry picked from commit 251f411fc3)
2023-02-10 06:50:32 +01:00
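A version check along the lines of 9309589ad0 could be sketched as
below.  The VERSION_HEX() macro and recvmmsg_broken() are hypothetical
names; the packing (major << 16 | minor << 8 | patch) mirrors libuv's
UV_VERSION_HEX, which is also the format returned by uv_version().  The
sketch does not link against libuv and does not reproduce the actual
configure or runtime checks in BIND.

```c
#include <stdbool.h>

/* Pack a version number the same way libuv's UV_VERSION_HEX does. */
#define VERSION_HEX(major, minor, patch)                           \
	(((unsigned int)(major) << 16) | ((unsigned int)(minor) << 8) | \
	 (unsigned int)(patch))

/*
 * True for any 1.35.x or 1.36.x release, the versions the commit
 * message describes as having an incomplete recvmmsg implementation.
 */
static bool
recvmmsg_broken(unsigned int version) {
	return version >= VERSION_HEX(1, 35, 0) &&
	       version < VERSION_HEX(1, 37, 0);
}
```

In a real program the argument would come from UV_VERSION_HEX at compile
time and from uv_version() at run time, so that both the headers and the
loaded library are checked.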
Michał Kępień
f7dc0a4708 Handle iterator options in rpsdb_allrdatasets()
Commit 4f3327cd41 added a new parameter,
'options', to the prototype of the 'allrdatasets' function pointer in
struct dns_dbmethods.  Handle this new parameter accordingly in
rpsdb_allrdatasets().

(cherry picked from commit f3def4e4ed)
2023-02-01 12:07:11 +01:00
Matthijs Mekking
3ffb63e9bb Force set DS state after 'rndc dnssec -checkds'
Set the DS state after issuing 'rndc dnssec -checkds'. If the DS
was published, it should go in RUMOURED state, regardless of whether it
is already safe to do so according to the state machine.

Leaving it in HIDDEN (or if it was magically already in OMNIPRESENT or
UNRETENTIVE) would allow for easy shoot in the foot situations.

Similarly, if the DS was withdrawn, the state should be set to
UNRETENTIVE. Leaving it in OMNIPRESENT (or RUMOURED/HIDDEN)
would also allow for easy shoot in the foot situations.

(cherry picked from commit ee42f66fbe)
2023-01-27 16:09:06 +01:00
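The forced transition described in 3ffb63e9bb can be summarised in a few
lines.  This is a sketch only: ds_state_t, checkds_t and
force_ds_state() are hypothetical names that merely mirror the states
named in the commit message, not the kasp key state code itself.

```c
/* DS states as named in the commit message. */
typedef enum { HIDDEN, RUMOURED, OMNIPRESENT, UNRETENTIVE } ds_state_t;

/* What the operator reported through 'rndc dnssec -checkds'. */
typedef enum { DS_PUBLISHED, DS_WITHDRAWN } checkds_t;

/*
 * Force the DS state from the operator's report.  The current state is
 * deliberately ignored: published always yields RUMOURED and withdrawn
 * always yields UNRETENTIVE.
 */
static ds_state_t
force_ds_state(ds_state_t current, checkds_t reported) {
	(void)current;
	return (reported == DS_PUBLISHED) ? RUMOURED : UNRETENTIVE;
}
```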
Michał Kępień
7b0e57095a BIND 9.16.37
-----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEENKwGS3ftSQfs1TU17QVz/8hFYQUFAmPAjacPHG1pY2hhbEBp
 c2Mub3JnAAoJEO0Fc//IRWEFK/8QAIwzV2AXifS4FpuPG+AUDYDkWISambW62ZYx
 KVMxJquSRv3ZeBc7JZ0OFcqP6RcZKlj8X55aJlehusmEBTCOS3pXJUyBIJ8O//4P
 lqGUPaNrQd1Y0YsBfKLP0Eoljfopj9aplUGZMBz35LejkDSwbj4E6oO0R29ZtLKZ
 8qA2V5TgY1X28fkPlzZyEKtg2+MDZ8WebSjn/J3usJmTlmfyPT/II0aWSS/bhSz2
 M3IZdECPS0n11M4a/9pbgsGUHVfLMTrVkxVyNlJK6yRs2SWWlb0ylrRjZbxY4jyK
 hp8bOnQuLwQ/dtMW90Od7oLxbGzhW7fUmpRxA/UbycaDUiXHTFL6lgG1ZIcVBMHy
 pbg6B+RSGgAIb3SUyCcff83ya8HKyX456AxhfdbitHlioGi4sZUehVV8NUZOzrgE
 9xLbSWIkvVLzZGT42O81kHL225CkteZwc2NaIIrGCIXS+s58MqGIP/tNAmTbty5s
 40xyoYjaPc2g8DHw+Lw2ykJqA5O14vkJR+ERFdc6N5rgBIQWbuMG6AV4mH9cgixU
 ANyVytQF792O9Y2HHFmLcGTDHOjyUfpxVxWc7hy9jJ8ejgzaGQbY3UXeVRuJ6ZAW
 lsdfP0Nh9b371sGcgxkmQZaNDj1wUp5eYXDkCuk27t9iOHsqvWzM38iTvRZqpDZQ
 WE+64/IG
 =8rdy
 -----END PGP SIGNATURE-----

Merge tag 'v9_16_37' into v9_16

BIND 9.16.37
2023-01-25 21:34:55 +01:00
Mark Andrews
b548ee5815 Add missing node lock when setting node->wild in rbtdb.c
The write node lock needs to be held when setting node->wild in
add_wildcard_magic except when being called from loading_addrdataset
which is used to load the zone without locking during its initial
load.

(cherry picked from commit 81c24b8da2)
2023-01-20 00:38:43 +11:00
Mark Andrews
363b40b1da Unlink the timer event before trying to purge it
as far as I can determine the order of operations is not important.

    *** CID 351372:  Concurrent data access violations  (ATOMICITY)
    /lib/isc/timer.c: 227 in timer_purge()
    221     		LOCK(&timer->lock);
    222     		if (!purged) {
    223     			/*
    224     			 * The event has already been executed, but not
    225     			 * yet destroyed.
    226     			 */
    >>>     CID 351372:  Concurrent data access violations  (ATOMICITY)
    >>>     Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect.
    227     			timerevent_unlink(timer, event);
    228     		}
    229     	}
    230     }
    231
    232     void

(cherry picked from commit 98718b3b4b)
2023-01-19 11:28:10 +01:00
Ondřej Surý
e241a3f4db Don't use reference counting in isc_timer unit
The reference counting and the isc_timer_attach()/isc_timer_detach()
semantics are actually misleading because they cannot be used under
normal conditions.  Usually, the object that owns the timer is also
passed as the argument to the timer itself.  This means that when the
caller uses `isc_timer_detach()`, it needs the timer to stop, but
isc_timer_detach() does that only if this is the last reference.
Unfortunately, this also means that if the timer is attached elsewhere
and the timer fires, it will most likely cause a use-after-free,
because the object used in the timer no longer exists.

Remove the reference counting from the isc_timer unit, remove
isc_timer_attach() function and rename isc_timer_detach() to
isc_timer_destroy() to better reflect how the API needs to be used.

The only caveat is that the already executed event must be destroyed
before the isc_timer_destroy() is called because the timer is no longer
attached to .ev_destroy_arg.

(cherry picked from commit ae01ec2823)
2023-01-19 11:28:10 +01:00
Ondřej Surý
3cf7055286 Set quantum to infinity for the zone loading task
When we are loading the zones, set the quantum to UINT_MAX, which makes
task_run process all tasks at once.  After the zone loading is finished
the quantum will be dropped to 1 to not block server when we are loading
new zones after reconfiguration.

(cherry picked from commit 87c4c24cde)
2023-01-19 11:28:10 +01:00
Ondřej Surý
7fef8e77d6 Add isc_task_setquantum() and use it for post-init zone loading
Add isc_task_setquantum() function that modifies quantum for the future
isc_task_run() invocations.

NOTE: The current isc_task_run() caches the task->quantum into a local
variable and therefore the current event loop is not affected by any
quantum change.

(cherry picked from commit 15ea6f002f)
2023-01-19 11:28:10 +01:00
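The quantum behaviour described in 7fef8e77d6 and 3cf7055286 can be
sketched with a toy task type.  Everything here (task_t,
task_setquantum(), task_run(), ZONELOAD_QUANTUM, and the fallback to a
quantum of 1) is an assumption made for illustration and is not the
isc_task implementation.

```c
#include <limits.h>

#define ZONELOAD_QUANTUM UINT_MAX /* "infinite" quantum while loading */

/* Toy task: a quantum and a count of queued events. */
typedef struct {
	unsigned int quantum;
	unsigned int pending;
} task_t;

/* Change the quantum used by future task_run() invocations. */
static void
task_setquantum(task_t *task, unsigned int quantum) {
	task->quantum = (quantum > 0) ? quantum : 1;
}

/*
 * Dispatch up to 'quantum' events.  The quantum is cached in a local
 * variable, so changing task->quantum while this loop is running does
 * not affect the current invocation.
 */
static unsigned int
task_run(task_t *task) {
	unsigned int quantum = task->quantum; /* cached copy */
	unsigned int dispatched = 0;

	while (task->pending > 0 && dispatched < quantum) {
		task->pending--;
		dispatched++;
	}

	return dispatched;
}
```

With a quantum of 1, each task_run() call handles a single event; after
task_setquantum(&task, ZONELOAD_QUANTUM), the next call drains every
pending event in one go.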
Ondřej Surý
617186d514 Keep the list of scheduled events on the timer
Instead of searching for the events to purge, keep the list of scheduled
events on the timer list and purge the events that we have scheduled.

(cherry picked from commit 3f8024b4a2f12fcd28a9dd813b6f1f3f11d506f2)
2023-01-19 11:28:10 +01:00
Ondřej Surý
76859344fe Repair isc_task_purgeevent()
The isc_task_purgerange() was walking through all events on the task to
find a matching task.  Instead use the ISC_LINK_LINKED to find whether
the event is active.

(cherry picked from commit 17aed2f895)
2023-01-19 11:28:10 +01:00
Ondřej Surý
4b222f154b
Detach the zone views outside of the zone lock
Detaching the views in the zone_shutdown() could lead to
lock-order-inversion between adb->namelocks[bucket], adb->lock,
view->lock and zone->lock.  Detach the views outside of the section that
is zone-locked.

(cherry picked from commit 978a0ef84c)
2023-01-19 10:21:27 +01:00
Ondřej Surý
49af3a23b9 Use OpenSSL 1.x SHA_CTX API in isc_iterated_hash()
Instead of going through another layer, use OpenSSL SHA1 API directly
in the isc_iterated_hash() implementation.

(cherry picked from commit 25db8d0103)
2023-01-18 23:26:40 +01:00
Ondřej Surý
cb083876c1
Detach the views in zone_shutdown(), not in zone_free()
The .view (and possibly .prev_view) would be kept attached to the
removed zone until the zone is fully removed from the memory in
zone_free().  If this process is delayed because the server is busy with
something else, such as doing constant `rndc reconfig`, it could take
seconds to detach the view, possibly keeping multiple dead views in
memory.  This could quickly lead to a massive memory bloat.

Release the views early in the zone_shutdown() call, and don't wait
until the zone is freed.

(cherry picked from commit 13bb821280)
2023-01-17 22:48:37 +01:00
Ondřej Surý
606fc6d4aa Merge branch 'feature/main/zt-rwlock.h' into 'main'
Include isc_rwlocktype_t type definition in zt.h

See merge request isc-projects/bind9!7376

(cherry picked from commit d7bcdf8bd6)

395d6fca Include isc_rwlocktype_t type definition in zt.h
2023-01-16 11:08:43 +00:00
Aram Sargsyan
6bebcedb80 Cancel all fetch events in dns_resolver_cancelfetch()
Although a 'dns_fetch_t' fetch can have two associated events, one for
each of the 'DNS_EVENT_FETCHDONE' and 'DNS_EVENT_TRYSTALE' types, the
dns_resolver_cancelfetch() function expects only one existing event,
which it must cancel.  When 'stale-answer-client-timeout' is enabled and
there are two events, only one of them is canceled, which results in an
assertion failure in dns_resolver_destroyfetch() when it finds a
dangling event.

Change the logic of dns_resolver_cancelfetch() function so that it
cancels both the events (if they exist), and in the right order.

(cherry picked from commit ec2098ca35)
2023-01-12 13:00:03 +01:00