Commit graph

9226 commits

Author SHA1 Message Date
Evan Hunt
60a33ae6bb fix another dns_qp_lookup() iterator bug
there was another edge case in which an iterator could be positioned at
the wrong node after dns_qp_lookup().  when searching for a key, it's
possible to reach a dead-end branch that doesn't match, because the
branch offset point is *after* the point where the search key differs
from the branch's contents.

for example, if searching for the key "mop", we could reach a branch
containing "moon" and "moor". the branch offset point - i.e., the
point after which the branch's leaves differ from each other - is the
fourth character ("n" or "r"). however, both leaves differ from the
search key at position *three* ("o" or "p"). the old code failed to
detect this condition, and would have incorrectly left the iterator
pointing at some lower value and not at "moor".

this has been fixed and the unit test now includes this scenario.
2023-12-06 11:03:30 -08:00
Evan Hunt
8612902476 fix dns_qp_lookup() iterator bug
in some cases it was possible for the iterator to be positioned in the
wrong place by dns_qp_lookup(). previously, when a leaf node was found
which matched the search key at its parent branch's offset point, but
did not match after that point, the code incorrectly assumed the leaf
it had found was a successor to the searched-for name, and stepped the
iterator back to find a predecessor.  however, it was possible for the
non-matching leaf to be the predecessor, in which case stepping the
iterator back was wrong.

(for example: a branch contains "aba" and "abcd", and we are searching
for "abcde". we step down to the twig matching the letter "c" in
position 3. "abcd" is the predecessor of "abcde", so the iterator is
already correctly positioned, but because the twig was an exact match,
we would have moved it back one step to "aba".)

this previously went unnoticed due to a mistake in the qp_test unit
test, which had the wrong expected result for the test case that should
have detected the error. both the code and the test have been fixed.
2023-12-06 11:03:30 -08:00
Evan Hunt
947bc0a432 add an iterator argument to dns_qp_lookup()
the 'predecessor' argument to dns_qp_lookup() turns out not to
be sufficient for our needs: the predecessor node in a QP database
could have become "empty" (for the current version) because of an
update or because cache data expired, and in that case the caller
would have to iterate more than one step back to find the predecessor
node that it needs.

it may also be necessary for a caller to iterate forward, in
order to determine whether a node has any children.

for both of these reasons, we now replace the 'predecessor'
argument with an 'iter' argument. if set, this points to memory
with enough space for a dns_qpiter object.

when an exact match is found by the lookup, the iterator will be
pointing to the matching node. if not, it will be pointing to the
lexical predecessor of the nae that was searched for.

a dns_qpiter_current() method has been added for examining
the current value of the iterator without moving it in either
direction.
2023-12-06 11:03:30 -08:00
Artem Boldariev
b109fa9192 Fix TLS certs store deletion on concurrent access
During initialisation or reconfiguration, it is possible that multiple
threads are trying to create a TLS context and associated data (like
TLS certs store) concurrently. In some cases, a thread might be too
late to add newly created data to the TLS contexts cache, in which
case it needs to be discarded. In the code that handles that case, it
was not taken into account that, in some cases, the TLS certs store
could not have been created or should not be deleted, as it is being
managed by the TLS contexts cache already. Deleting the store in such
cases might lead to crashes.

This commit fixes the issue.
2023-12-06 16:01:20 +02:00
Artem Boldariev
5ed3a76f9d BIND: Add 'allow-proxy' and 'allow-proxy-on' options
The main intention of PROXY protocol is to pass endpoints information
to a back-end server (in our case - BIND). That means that it is a
valid way to spoof endpoints information, as the addresses and ports
extracted from PROXYv2 headers, from the point of view of BIND, are
used instead of the real connection addresses.

Of course, an ability to easily spoof endpoints information can be
considered a security issue when used uncontrollably. To resolve that,
we introduce 'allow-proxy' and 'allow-proxy-on' ACL options. These are
the only ACL options in BIND that work with real PROXY connections
addresses, allowing a DNS server operator to specify from what clients
and on which interfaces he or she is willing to accept PROXY
headers. By default, for security reasons we do not allow to accept
them.
2023-12-06 15:15:25 +02:00
Artem Boldariev
eb52015db1 Stream DNS: add PROXY over TLS support
This commit extends Stream DNS with PROXY over TLS support.
2023-12-06 15:15:25 +02:00
Artem Boldariev
e97903ca14 Add PROXY support to Stream DNS
This commit makes it possible to use Stream DNS on top of PROXY Stream
either directly or indirectly (in the case when TLS is involved).
2023-12-06 15:15:24 +02:00
Matthijs Mekking
ff4201e388 Lower the maximum allowed NSEC3 iterations to 50
BIND 9 will now treat the response as insecure when processing NSEC3
records with iterations larger than 50.

Earlier, we limited the number of iterations to 150 (in #2445).

RFC 9276 says: Because there has been a large growth of open (public)
DNSSEC validating resolvers that are subject to compute resource
constraints when handling requests from anonymous clients, this
document recommends that validating resolvers reduce their iteration
count limits over time. Specifically, validating resolver operators and
validating resolver software implementers are encouraged to continue
evaluating NSEC3 iteration count deployment trends and lower their
acceptable iteration limits over time.

After evaluation, we decided that the next major BIND release should
lower the maximum allowed NSEC3 iterations to 50, which should be
fine for 99,87% of the domain names.
2023-12-05 14:58:58 +00:00
Mark Andrews
7ee20d7d10 Destroy the message before detaching the view
With shared name memory pools (f5af981831)
the message needs to be destroyed before the view is detached which
in turn detaches the resolver which checks that all resources have
been returned.
2023-12-04 22:00:25 +00:00
Ondřej Surý
14bdd21e0a
Refactor the handling of isc_mem overmem condition
Previously, there were two methods of working with the overmem
condition:

1. hi/lo water callback - when the overmem condition was reached
   for the first time, the water callback was called with HIWATER
   mark and .is_overmem boolean was set internally.  Similarly,
   when the used memory went below the lo water mark, the water
   callback would be called with LOWATER mark and .is_overmem
   was reset.  This check would be called **every** time memory
   was allocated or freed.

2. isc_mem_isovermem() - a simple getter for the internal
   .is_overmem flag

This commit refactors removes the first method and move the hi/lo water
checks to the isc_mem_isovermem() function, thus we now have only a
single method of checking overmem condition and the check for hi/lo
water is removed from the hot path for memory contexts that doesn't use
overmem checks.
2023-11-29 14:16:20 +01:00
Mark Andrews
decc17d3b0 Ineffective DbC protections
Dereference before NULL checks.  Thanks to Eric Sesterhenn from X41
D-Sec GmbH for reporting this.
2023-11-21 14:48:43 +11:00
Matthijs Mekking
71f023a1c3 Recognize escapes when reading the public key
Escapes are valid in DNS names, and should be recognized when reading
the public key from disk.
2023-11-20 08:31:39 +01:00
Evan Hunt
9643281453 set loadtime during initial transfer of a secondary zone
when transferring in a non-inline-signing secondary for the first time,
we previously never set the value of zone->loadtime, so it remained
zero. this caused a test failure in the statschannel system test,
and that test case was temporarily disabled.  the value is now set
correctly and the test case has been reinstated.
2023-11-15 17:23:25 -08:00
Mark Andrews
a069513234 Check that buffer length in dns_message_renderbegin
The maximum DNS message size is 65535 octets. Check that the buffer
being passed to dns_message_renderbegin does not exceed this as the
compression code assumes that all offsets are no bigger than this.
2023-11-16 11:15:49 +11:00
Aram Sargsyan
c584899b1a Fix catz db update callback registration logic error (take two)
Please see the 998765fea5 commit for
the description of the original issue. The commit had fixed the
logic error, but it was reintroduced again later with the
a1afa31a5a commit, where the check of
the 'db_registered' flag was removed in dns__catz_update_cb(). The
check was removed, because the registration function was made
idempotent, so double registration is not an issue, but the check
also prevented from unneeded registration, on which the original
fix relied.

This commit just removes the update callback registration code from
the dns__catz_update_cb() function instead of bringing back the check,
because after code flow analysis, it is now clear that it's not required
at all. The "call onupdate() artificially" comment (which was mentioned
by the removed code) is speaking about the dns_catz_dbupdate_callback()
function, which is called by server.c on (re)configuration, and that
function already takes care of update callback's registration since the
998765fea5 commit was applied, so there
is no need to do that here again.
2023-11-14 08:59:48 +00:00
Ondřej Surý
79d9360011
Reformat sources with up-to-date clang-format-17 2023-11-13 16:52:35 +01:00
Aram Sargsyan
6687de854f Use a read lock when iterating over a hashmap
The 'dns_tsigkeyring_t' structure has a read/write lock to protect
its 'keys' member, which is a 'isc_hashmap_t' pointer and needs to
be protected.

The dns_tsigkeyring_dump() function, however, doesn't use the lock,
which can introduce a race with another thread, if the other thread
tries to modify the hashmap.

Add a read lock around the code, which iterates over the hashmap.
2023-11-13 12:06:26 +00:00
Evan Hunt
461b9a0442 if GLUEOK is set, and glue is found in a zone DB, don't check the cache
EXPERIMENT: when DNS_DB_GLUEOK is set, dns_view_find() will now return
glue if it is found it a local zone database, without checking to see
if a better answer has been cached previously.
2023-11-01 16:49:08 +01:00
Mark Andrews
9227b82e71 Also look for additional records in dns_adb_find
If a child zone is served by the same servers as a parent zone and
a NS query is made for the zone name then the addresses of the
nameservers are returned in the additional section are tagged as
trust additional.
2023-11-01 16:49:08 +01:00
Mark Andrews
578da93581 Turn on QNAME minimisation when fetching nameserver addresses 2023-11-01 16:49:08 +01:00
Evan Hunt
b12f709f05 restore isc_mem_setwater() call in the cache
Commit 4db150437e incorrectly removed the
call to isc_mem_setwater() from dns_cache_setcachesize().  The water()
function is a no-op, but we still need to set high- and low-water marks
in the memory context, otherwise overmem conditions will not be
detected.
2023-11-01 15:18:02 +00:00
Ondřej Surý
c855ed6a0b
Bump the mempool sizes in dns_message
Increasing the initial and freemax sizes for dns_message memory pools
restores the root zone performance.  The former sizes were suited for
per-dns_message memory pools and we need to bump the sizes up for
per-thread memory pools.
2023-10-27 15:28:27 +02:00
Evan Hunt
03183baa6d Prevent a possible race in dns_qpmulti_query() and _snapshot()
The `.reader` member of dns_qpmulti_t was accessed without RCU
protection; reader_open() calls rcu_dereference() on it, and this
call needs to be inside an RCU critical section.

A similar problem was identified in the dns_qpmulti_snapshot() - the
RCU critical section was completely missing.

These are relicts of the isc_qsbr - in the QSBR mode the rcu_read_lock()
and rcu_read_unlock() are no-ops and whole event loop is a critical section.
2023-10-26 00:32:22 -07:00
Ondřej Surý
6bb42939cf
Refactor dns_message using ISC_LIST_FOREACH macros
Do a light refactoring and cleanups that replaces common list walking
patterns with ISC_LIST_FOREACH macros and split some nested loops into
separate static functions to reduce the nesting depth.
2023-10-25 12:36:37 +02:00
Ondřej Surý
fd732a7fb5
Add dns__message_putassociatedrdataset() to deduplicate code
There was a lot of internal code looking like this:

    INSIST(dns_rdataset_isassociated(rdataset));
    dns_rdataset_disassociated(rdataset)
    isc_mempool_put(msg->rdspool, rdataset);

Deduplicate the code into local dns__message_puttemprdataset() routine,
and drop the INSIST() which is checked in dns_rdataset_disassociate().
2023-10-25 12:36:08 +02:00
Ondřej Surý
5fca0fb519
Remove unused dns_message_movename() method
Since dns_message_movename() was unused, it could be removed from the
code based to declutter the API.
2023-10-25 11:43:10 +02:00
Ondřej Surý
f213f644ed
Add option to mark TCP dispatch as unshared
The current dispatch code could reuse the TCP connection when
dns_dispatch_gettcp() would be used first.  This is problematic as the
dns_resolver doesn't use TCP connection sharing, but dns_request could
get the TCP stream that was created outside of the dns_request.

Add new DNS_DISPATCHOPT_UNSHARED option to dns_dispatch_createtcp() that
would prevent the TCP stream to be reused.  Use that option in the
dns_resolver call to dns_dispatch_createtcp() to prevent dns_request
from reusing the TCP connections created by dns_resolver.

Additionally, the dns_xfrin unit added TCP connection sharing for
incoming transfers.  While interleaving *xfr streams on a TCP connection
should work this should be a deliberate change and be property of the
server that can be controlled.  Additionally some level of parallel TCP
streams is desirable.  Revert to the old behaviour by removing the
dns_dispatch_gettcp() calls from dns_xfrin and use the new option to
prevent from sharing the transfer streams with dns_request.
2023-10-24 13:07:03 +02:00
Ondřej Surý
e0e089f106
Don't set the offloaded work result from main thread
The xfrin_recv_done() was accessing xfr->result where we stored the
result of the offloaded work from a thread that could receive data while
processing the transfer on the offloaded thread.

Completely remove the offloaded result from the dns_xfrin_t structure
and keep it local for *xfr_apply() and *xfr_apply_done() as the failure
is already recorded in .shutdown_result and we now that the processing
has failed because .shuttingdown has been already set.
2023-10-24 11:14:54 +02:00
Aram Sargsyan
4eb4fa288c Fix shutdown races in catzs
The dns__catz_update_cb() does not expect that 'catzs->zones'
can become NULL during shutdown.

Add similar checks in the dns__catz_update_cb() and dns_catz_zone_get()
functions to protect from such a case. Also add an INSIST in the
dns_catz_zone_add() function to explicitly state that such a case
is not expected there, because that function is called only during a
reconfiguration.
2023-10-23 08:21:39 +00:00
Evan Hunt
aacea440c3 handle pre-existing disp/dispentry when retrying
when xfrin_start() is called to retry a transfer, close the existing
dispatch entry and reuse the existing dispatch.
2023-10-20 18:16:25 +11:00
Mark Andrews
b69100b747 Suppress reporting upcoming changes in root hints
To reduce the amount of log spam when root servers change their
addresses keep a table of upcoming changes by expected date and time
and suppress reporting differences for them until then.

Add initial entry for B.ROOT-SERVERS.NET, Nov 27, 2023.
2023-10-20 14:05:56 +11:00
Mark Andrews
2ca2f7e985 Update b.root-servers.net IP addresses
This covers both root hints and the default primaries for the root
zone mirror.  The official change date is Nov 27, 2023.
2023-10-20 14:05:56 +11:00
Ondřej Surý
3737ea592b
Offload AXFR and IXFR processing
Instead of processing received data synchronously, store the incoming
differences in the list and process them asynchronously when we need to
commit the data into the database and/or journal.
2023-10-19 14:57:25 +02:00
Ondřej Surý
e5c79261c0
Remove all locking from XFR
Instead of locking the struct dns_xfrin members that get accessed from
the statistics, convert those into atomic types and use atomic accesses
to prevent ThreadSanitizer from blowing up.

In fact, even the atomic operations are not really needed here, because
all writes are done from a single thread and we don't really require
consistency from the statistics.  It's easier to use atomics here, but
it is slightly confusing as it suggests there might be multithreaded
accesses to those variables while in fact, the only off-thread access
happens when collecting the statistics.
2023-10-19 14:57:25 +02:00
Ondřej Surý
109dc883e7
Cleanup wrong whitespace in dns/diff.h 2023-10-19 14:57:25 +02:00
Ondřej Surý
e3892805d6
Remove the logic that applies differences when over limit
The ixfr_putdata() and axfr_putdata() had a logic to apply dns_diff when
the number of pending tuples went over 100.  Since we are going to
offload the XFR data processing, we don't need to do that anymore.
2023-10-19 14:57:25 +02:00
Ondřej Surý
8a590d1605
Cleanup the FAIL() macro in the dns_xfrin
The FAIL() macro was just setting the result and jumping to failure,
unobfuscate the code by removing the macro.
2023-10-19 14:57:25 +02:00
Mark Andrews
29f399797d Adjust UDP timeouts used in zone maintenance
Drop timeout before resending a UDP request from 15 seconds to 5
seconds and add 1 second to the total time to allow for the reply
to the third request to arrive.  This will speed up the time it
takes for named to recover from a lost packet when refreshing a
zone and for it to determine that a primary is down.
2023-10-18 13:06:28 +11:00
Michal Nowak
dd234c60fe
Update the source code formatting using clang-format-17 2023-10-17 17:47:46 +02:00
Matthijs Mekking
741ce2d07a Don't resign raw version of the zone
Update the function 'set_resigntime()' so that raw versions of
inline-signing zones are not scheduled to be resigned.

Also update the check in the same function for zone is dynamic, there
exists a function 'dns_zone_isdynamic()' that does a similar thing
and is more complete.

Also in 'zone_postload()' check whether the zone is not the raw
version of an inline-signing zone, preventing calculating the next
resign time.
2023-10-16 09:26:56 +02:00
Ondřej Surý
96bbf95b83 Convert rwlock in dns_acl to RCU
The dns_aclenv_t contains two dns_acl_t - localhost and localnets that
can be swapped with a different ACLs as we configure BIND 9.  Instead of
protecting those two pointers with heavyweight read-write lock, use RCU
mechanism to dereference and swap the pointers.
2023-10-13 14:44:40 +02:00
Ondřej Surý
546c327349 Convert manual dns_{acl,aclenv}_{attach,detach} to ISC_REFCOUNT_IMPL
Instead of having a manual set of functions, use ISC_REFCOUNT_IMPL macro
to implement the attach, detach, ref and unref functions.
2023-10-13 14:44:40 +02:00
Ondřej Surý
b3a8f0048f Refactor dns_{acl,aclenv}_create to return void
The dns_{acl,aclenv}_create() can't fail, so change it to return void.
2023-10-13 14:44:40 +02:00
Ondřej Surý
f5b0bd9b1b Convert manual dns_iptable_{attach,detach} to ISC_REFCOUNT_IMPL
Instead of having a manual set of functions, use ISC_REFCOUNT_IMPL macro
to implement the attach, detach, ref and unref functions.
2023-10-13 14:44:40 +02:00
Ondřej Surý
613ada72b6 Refactor dns_iptable_create() to return void
The dns_iptable_create() cannot fail now, so change it to return void.
2023-10-13 14:44:40 +02:00
Ondřej Surý
d46d51be78 Refactor isc_radix_create to return void
The isc_radix_create() can't fail, so change it to return void.
2023-10-13 14:44:40 +02:00
Aram Sargsyan
20fdab8667 Fix undefined behaviour occurrences
The undefined behaviour was detected by LLVM 17. Fix the affected
functions definitions to match the expected function type.
2023-10-13 09:57:28 +00:00
Ondřej Surý
6afa961534
Don't undef <unit>_TRACE, instead add comment how to enable it
In units that support detailed reference tracing via ISC_REFCOUNT
macros, we were doing:

    /* Define to 1 for detailed reference tracing */
    #undef <unit>_TRACE

This would prevent using -D<unit>_TRACE=1 in the CFLAGS.

Convert the above mentioned snippet with just a comment how to enable
the detailed reference tracing:

    /* Add -D<unit>_TRACE=1 to CFLAGS for detailed reference tracing */
2023-10-13 11:40:16 +02:00
Evan Hunt
3a206da456 check chain length is nonzero before examining last entry
It was possible to reach add_link() without visiting an
intermediate node first, and the check for a duplicate entry
could then cause a crash.

Credit to OSS-Fuzz for discovering this error.
2023-10-12 11:31:32 -07:00
Ondřej Surý
91f3b0edee
Use mul and div instead of bitshifts to calculate srtt
There was a microoptimization for smoothing srtt with bitshifts.  Revert
the code to use * 98 / 100, it doesn't really make that difference on
modern CPUs, for comparison here:

    muldiv:
	    imul    eax, edi, 98
	    imul    rax, rax, 1374389535
	    shr     rax, 37
	    ret
    shift:
	    mov     eax, edi
	    sal     eax, 9
	    sub     eax, edi
	    shr     eax, 9
	    ret
2023-10-12 12:35:00 +02:00