Commit graph

191 commits

Author SHA1 Message Date
Ondřej Surý
1c0564d715
Remove ns_query_init() cannot fail, remove the error paths
As ns_query_init() cannot fail now, remove the error paths, especially
in ns__client_setup() where we now don't have to care what to do with
the connection if setting up the client could fail.  It couldn't fail
even before, but now it's formal.
2024-07-03 09:05:51 +02:00
Aram Sargsyan
ad489c44df
Remove sig0checks-quota-maxwait-ms support
Waiting for a quota to appear complicates things and wastes
rosources on timer management. Just answer with REFUSE if
there is no quota.
2024-06-10 17:33:11 +02:00
Aram Sargsyan
f0cde05e06
Implement asynchronous view matching for SIG(0)-signed queries
View matching on an incoming query checks the query's signature,
which can be a CPU-heavy task for a SIG(0)-signed message. Implement
an asynchronous mode of the view matching function which uses the
offloaded signature checking facilities, and use it for the incoming
queries.
2024-06-10 17:33:10 +02:00
Aram Sargsyan
c7f79a0353
Add a quota for SIG(0) signature checks
In order to protect from a malicious DNS client that sends many
queries with a SIG(0)-signed message, add a quota of simultaneously
running SIG(0) checks.

This protection can only help when named is using more than one worker
threads. For example, if named is running with the '-n 4' option, and
'sig0checks-quota 2;' is used, then named will make sure to not use
more than 2 workers for the SIG(0) signature checks in parallel, thus
leaving the other workers to serve the remaining clients which do not
use SIG(0)-signed messages.

That limitation is going to change when SIG(0) signature checks are
offloaded to "slow" threads in a future commit.

The 'sig0checks-quota-exempt' ACL option can be used to exempt certain
clients from the quota requirements using their IP or network addresses.

The 'sig0checks-quota-maxwait-ms' option is used to define a maximum
amount of time for named to wait for a quota to appear. If during that
time no new quota becomes available, named will answer to the client
with DNS_R_REFUSED.
2024-06-10 17:33:08 +02:00
Ondřej Surý
e28266bfbc
Remove the extra memory context with own arena for sending
The changes in this MR prevent the memory used for sending the outgoing
TCP requests to spike so much.  That strictly remove the extra need for
own memory context, and thus since we generally prefer simplicity,
remove the extra memory context with own jemalloc arenas just for the
outgoing send buffers.
2024-06-10 16:48:54 +02:00
Ondřej Surý
452a2e6348
Replace the tcp_buffers memory pool with static per-loop buffer
As a single thread can process only one TCP send at the time, we don't
really need a memory pool for the TCP buffers, but it's enough to have
a single per-loop (client manager) static buffer that's being used to
assemble the DNS message and then it gets copied into own sending
buffer.

In the future, this should get optimized by exposing the uv_try API
from the network manager, and first try to send the message directly
and allocate the sending buffer only if we need to send the data
asynchronously.
2024-06-10 16:48:53 +02:00
Aram Sargsyan
982eab7de0
ns_client: reuse TCP send buffers
Constantly allocating, reallocating and deallocating 64K TCP send
buffers by 'ns_client' instances takes too much CPU time.

There is an existing mechanism to reuse the ns_clent_t structure
associated with the handle using 'isc_nmhandle_getdata/_setdata'
(see ns_client_request()), but it doesn't work with TCP, because
every time ns_client_request() is called it gets a new handle even
for the same TCP connection, see the comments in
streamdns_on_complete_dnsmessage().

To solve the problem, we introduce an array of available (unused)
TCP buffers stored in ns_clientmgr_t structure so that a 'client'
working via TCP can have a chance to reuse one (if there is one)
instead of allocating a new one every time.
2024-06-10 16:48:53 +02:00
Aydın Mercan
e037520b92
Keep track of the recursive clients highwater
The high-water allows administrators to better tune the recursive
clients limit without having to to poll the statistics channel in high
rates to get this number.
2024-05-10 12:08:52 +03:00
Aydın Mercan
09e4fb2ffa
Return the old counter value in isc_stats_increment
Returning the value allows for better high-water tracking without
running into edge cases like the following:

0. The counter is at value X
1. Increment the value (X+1)
2. The value is decreased multiple times in another threads (X+1-Y)
3. Get the value (X+1-Y)
4. Update-if-greater misses the X+1 value which should have been the
   high-water
2024-05-10 12:08:52 +03:00
Aram Sargsyan
bd7463914f Disallow stale-answer-client-timeout non-zero values
Remove all the code and tests which support non-zero
stale-answer-client-timeout values, and adjust the
documentation.
2024-02-16 08:41:52 +00:00
Artem Boldariev
d59cf5e0ce Recreate listeners on DNS transport change
This commit ensures that listeners are recreated on reconfiguration in
the case when their type changes (or when PROXY protocol type changes,
too).

Previously, if a "listen-on" statement was modified to represent a
different transport, BIND would not pick-up the change on
reconfiguration if listener type changes (e.g. DoH -> DoT) for a given
interface address and port combination. This commit fixes that by
recreating the listener.

Initially, that worked for most of the new transports as we would
recreate listeners on each reconfiguration for DoH and DoT. But at
some point we changed that in such a way that listeners were not
recreated to avoid rebinding a port as on some platforms only root can
do that for port numbers <1000, making some ports binding possible
only on start-up. We chose to asynchronously update listener socket
settings (like TLS contexts, HTTP settings) instead.

Now, we both avoid recreating the sockets if unnecessary and recreate
listeners when listener type changes.
2024-01-12 14:55:12 +02:00
Artem Boldariev
eb924e460b Integrate TLS cipher suites support into BIND
This commit makes BIND use the new 'cipher-suites' option from the
'tls' statement.
2024-01-12 13:27:59 +02:00
Mark Andrews
7ab4e1537a Obtain a client->handle reference when calling async_restart
otherwise client may be freed before async_restart is called.
2023-12-20 02:50:48 +11:00
Artem Boldariev
f650d3eb63 Add 'proxy' option to 'listen-on' statement
This commit extends "listen-on" statement with "proxy" options that
allows one to enable PROXYv2 support on a dedicated listener. It can
have the following values:

- "plain" to send PROXYv2 headers without encryption, even in the case
of encrypted transports.
- "encrypted" to send PROXYv2 headers encrypted right after the TLS
handshake.
2023-12-06 15:15:25 +02:00
Ondřej Surý
17da9fed58
Remove AES algorithm for DNS cookies
The AES algorithm for DNS cookies was being kept for legacy reasons, and
it can be safely removed in the next major release.  Remove both the AES
usage for DNS cookies and the AES implementation itself.
2023-11-15 10:31:16 +01:00
Ondřej Surý
79d9360011
Reformat sources with up-to-date clang-format-17 2023-11-13 16:52:35 +01:00
Michal Nowak
dd234c60fe
Update the source code formatting using clang-format-17 2023-10-17 17:47:46 +02:00
Ondřej Surý
f5af981831
Change dns_message_create() function to accept memory pools
Instead of creating new memory pools for each new dns_message, change
dns_message_create() method to optionally accept externally created
dns_fixedname_t and dns_rdataset_t memory pools.  This allows us to
preallocate the memory pools in ns_client and dns_resolver units for the
lifetime of dns_resolver_t and ns_clientmgr_t.
2023-09-24 18:07:40 +02:00
Ondřej Surý
6fd06c461b
Make dns_dispatch bound to threads
Instead of high number of dispatches (4 * named_g_udpdisp)[1], make the
dispatches bound to threads and make dns_dispatchset_t create a dispatch
for each thread (event loop).

This required couple of other changes:

1. The dns_dispatch_createudp() must be called on loop, so the isc_tid()
   is already initialized - changes to nsupdate and mdig were required.

2. The dns_requestmgr had only a single dispatch per v4 and v6.  Instead
   of using single dispatch, use dns_dispatchset_t for each protocol -
   this is same as dns_resolver.
2023-09-16 07:32:17 +02:00
Artem Boldariev
01cc7edcca
Allocate DNS send buffers using dedicated per-worker memory arenas
This commit ensures that memory allocations related to DNS send
buffers are routed through dedicated per-worker memory arenas in order
to decrease memory usage on high load caused by TCP-based DNS
transports.

We do that by following jemalloc developers suggestions:

https://github.com/jemalloc/jemalloc/issues/2483#issuecomment-1639019699
https://github.com/jemalloc/jemalloc/issues/2483#issuecomment-1698173849
2023-09-05 09:39:41 +02:00
Ondřej Surý
bf44554889 Refactor ns_server_create() to return void
After isc_stats_create() change, the ns_server_create() cannot fail, so
refactor the function to return void and fix all its uses.
2023-07-27 11:37:44 +02:00
Ondřej Surý
5321c474ea Refactor isc_stats_create() and its downstream users to return void
The isc_stats_create() can no longer return anything else than
ISC_R_SUCCESS.  Refactor isc_stats_create() and its variants in libdns,
libns and named to just return void.
2023-07-27 11:37:44 +02:00
Mark Andrews
3969e2c5f7 Return BADCOOKIE on validly formed bad SERVER COOKIES
The server was previously tolerant of out-of-date or otherwise bad
DNS SERVER COOKIES that where well formed unless require-cookie was
set.  BADCOOKIE is now return for these conditions.
2023-07-13 01:58:53 +00:00
Artem Boldariev
d8a5feb556
Use appropriately sized send buffers for DNS messages over TCP
This commit changes send buffers allocation strategy for stream based
transports. Before that change we would allocate a dynamic buffers
sized at 64Kb even when we do not need that much. That could lead to
high memory usage on server. Now we resize the send buffer to match
the size of the actual data, freeing the memory at the end of the
buffer for being reused later.
2023-06-06 13:40:42 +02:00
Aram Sargsyan
dfaecfd752
Implement new -T options for xfer system tests
'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.

'-T transferslowly' makes named to sleep for one second for every
xfrout message.

'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
2023-04-21 12:53:02 +02:00
Ondřej Surý
1715cad685
Refactor the isc_quota code and fix the quota in TCP accept code
In e185412872, the TCP accept quota code
became broken in a subtle way - the quota would get initialized on the
first accept for the server socket and then deleted from the server
socket, so it would never get applied again.

Properly fixing this required a bigger refactoring of the isc_quota API
code to make it much simpler.  The new code decouples the ownership of
the quota and acquiring/releasing the quota limit.

After (during) the refactoring it became more clear that we need to use
the callback from the child side of the accepted connection, and not the
server side.
2023-04-12 14:10:37 +02:00
Tony Finch
0d353704fb Use isc_histo for the message size statistics
This should have no functional effects.

The message size stats are specified by RSSAC002 so it's best not
to mess around with how they appear in the statschannel. But it's
worth changing the implementation to use general-purpose histograms,
to reduce code size and benefit from sharded counters.
2023-04-03 12:08:05 +01:00
Evan Hunt
d91097e0c7 change ns__client_request() to ns_client_request()
in the future we'll want to call this function from outside named,
so change the name to one suitable for external access.
2023-03-28 12:38:28 -07:00
Evan Hunt
4ad95e0567 add ns_interface_create()
add a public function ns_interface_create() allowing the caller
to set up a listening interface directly without having to set
up listen-on and scan network interfaces.
2023-03-28 12:38:28 -07:00
Evan Hunt
197334464e remove named_os_gethostname()
this function was just a front-end for gethostname(). it was
needed when we supported windows, which has a different function
for looking up the hostname; it's not needed any longer.
2023-02-18 20:23:41 +00:00
Evan Hunt
a52b17d39b
remove isc_task completely
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.

functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.

the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
2023-02-16 18:35:32 +01:00
Evan Hunt
0312789129
refactor dns_resolver to use loop callbacks
callback events from dns_resolver_createfetch() are now posted
using isc_async_run.

other modules which called the resolver and maintained task/taskmgr
objects for this purpose have been cleaned up.
2023-02-16 17:27:59 +01:00
Evan Hunt
b061c7e27f
refactor plugin hook resumption to use loop callbacks
plugins supporting asynchronous operation now use a loop callback
to resume operation in query_hookresume() rather than a task.
2023-02-16 17:16:41 +01:00
Evan Hunt
327b95566d
refactor update processing to use loop callbacks
update processing now uses loop callbacks instead of task events.
2023-02-16 16:34:20 +01:00
Evan Hunt
3a1bb8dac8 remove some unused functions
removed some functions that are no longer used and unlikely to
be resurrected, and also some that were only used to support Windows
and can now be replaced with generic versions.
2023-02-13 11:50:59 -08:00
Evan Hunt
7c47254a14 add an update quota
limit the number of simultaneous DNS UPDATE events that can be
processed by adding a quota for update and update forwarding.
this quota currently, arbitrarily, defaults to 100.

also add a statistics counter to record when the update quota
has been exceeded.
2023-01-12 11:52:48 +01:00
Evan Hunt
916ea26ead remove nonfunctional DSCP implementation
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.

To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
2023-01-09 12:15:21 -08:00
Ondřej Surý
5111258e7a
Propagate the shutdown event to the recursing ns_client(s)
Send the ns_query_cancel() on the recursing clients when we initiate the
named shutdown for faster shutdown.

When we are shutting down the resolver, we cancel all the outstanding
fetches, and the ISC_R_CANCEL events doesn't propagate to the ns_client
callback.

In the future, the better solution how to fix this would be to look at
the shutdown paths and let them all propagate from bottom (loopmgr) to
top (f.e. ns_client).
2022-12-07 18:05:36 +01:00
Evan Hunt
18606f5276 remove unused 'nupdates' field from client
the 'nupdates' field was originally used to track whether a client
was ready to shut down, along with other similar counters nreads,
nrecvs, naccepts and nsends. this is now tracked differently, but
nupdates was overlooked when the other counters were removed.
2022-11-23 23:44:10 +00:00
Matthijs Mekking
5fb8e555bc Add new recursion type for refreshing stale RRset
Refreshing a stale RRset is similar to a prefetch query, so we can
refactor this code to use the new recursion types introduced in !5883.
2022-10-05 08:20:48 +02:00
Matthijs Mekking
d939d2ecde Only refresh RRset once
Don't attempt to resolve DNS responses for intermediate results. This
may create multiple refreshes and can cause a crash.

One scenario is where for the query there is a CNAME and canonical
answer in cache that are both stale. This will trigger a refresh of
the RRsets because we encountered stale data and we prioritized it over
the lookup. It will trigger a refresh of both RRsets. When we start
recursing, it will detect a recursion loop because the recursion
parameters will eventually be the same. In 'dns_resolver_destroyfetch'
the sanity check fails, one of the callers did not get its event back
before trying to destroy the fetch.

Move the call to 'query_refresh_rrset' to 'ns_query_done', so that it
is only called once per client request.

Another scenario is where for the query there is a stale CNAME in the
cache that points to a record that is also in cache but not stale. This
will trigger a refresh of the RRset (because we encountered stale data
and we prioritized it over the lookup).

We mark RRsets that we add to the message with
DNS_RDATASETATTR_STALE_ADDED to prevent adding a duplicate RRset when
a stale lookup and a normal lookup conflict with each other. However,
the other non-stale RRset when following a CNAME chain will be added to
the message without setting that attribute, because it is not stale.

This is a variant of the bug in #2594. The fix covered the same crash
but for stale-answer-client-timeout > 0.

Fix this by clearing all RRsets from the message before refreshing.
This requires the refresh to happen after the query is send back to
the client.
2022-09-08 11:24:37 +02:00
Ondřej Surý
b69e783164
Update netmgr, tasks, and applications to use isc_loopmgr
Previously:

* applications were using isc_app as the base unit for running the
  application and signal handling.

* networking was handled in the netmgr layer, which would start a
  number of threads, each with a uv_loop event loop.

* task/event handling was done in the isc_task unit, which used
  netmgr event loops to run the isc_event calls.

In this refactoring:

* the network manager now uses isc_loop instead of maintaining its
  own worker threads and event loops.

* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
  and every isc_task runs on a specific isc_loop bound to the specific
  thread.

* applications have been updated as necessary to use the new API.

* new ISC_LOOP_TEST macros have been added to enable unit tests to
  run isc_loop event loops. unit tests have been updated to use this
  where needed.
2022-08-26 09:09:24 +02:00
Ondřej Surý
49b149f5fd
Update isc_timer to use isc_loopmgr
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
  completely removed; isc_timer objects are now directly created on the
  isc_loop event loops.

* the isc_timer API has been simplified. the "inactive" timer type has
  been removed; timers are now stopped by calling isc_timer_stop()
  instead of resetting to inactive.

* isc_manager now creates a loop manager rather than a timer manager.

* modules and applications using isc_timer have been updated to use the
  new API.
2022-08-25 17:17:07 +02:00
Artem Boldariev
3f0b310772 Store HTTP quota size inside a listenlist instead of the quota
This way only quota size is passed to the interface/listener
management code instead of a quota object. Thus, we can implement
updating the quota object size instead of recreating the object.
2022-06-28 15:42:38 +03:00
Michal Nowak
1c45a9885a
Update clang to version 14 2022-06-16 17:21:11 +02:00
Michał Kępień
172e15f7ad Attach to separate recursion quota pointers
Similarly to how different code paths reused common client handle
pointers and fetch references despite being logically unrelated, they
also reuse client->recursionquota, the field in which a reference to the
recursion quota is stored.  This unnecessarily forces all code using
that field to be aware of the fact that it is overloaded by different
features.

Overloading client->recursionquota also causes inconsistent behavior.
For example, if prefetch code triggers recursion and then delegation
handling code also triggers recursion, only one of these code paths will
be able to attach to the recursion quota, but both recursions will be
started anyway.  In other words, each code path only checks whether the
recursion quota has not been exceeded if the quota has not yet been
attached to by another code path.  This behavior theoretically allows
the configured recursion quota to be slightly exceeded; while it is not
expected to be a real-world operational issue, it is still confusing and
should therefore be fixed.

Extend the structures comprising the 'recursions' array with a new field
holding a pointer to the recursion quota that a given recursion process
attached to.  Update all code paths using client->recursionquota so that
they use the appropriate slot in the 'recursions' array.  Drop the
'recursionquota' field from ns_client_t.
2022-06-14 13:13:32 +02:00
Michał Kępień
9e187b893d Drop the 'fetchhandle' and 'fetch' fields
Drop the 'fetchhandle' field from ns_client_t as all code using it has
been migrated to use the recursion-type-specific HANDLE_RECTYPE_*()
macros.

Drop the 'fetch' field from ns_query_t as all code using it has been
migrated to use the recursion-type-specific FETCH_RECTYPE_*() macros.
2022-06-14 13:13:32 +02:00
Michał Kępień
e0be643f50 Make async hooks code use the 'recursions' array
Async hooks are the last feature using the client->fetchhandle and
client->query.fetch pointers.  Update ns_query_hookasync() and
query_hookresume() so that they use a dedicated slot in the 'recursions'
array.  Note that async hooks are still not expected to initiate
recursion if one was already started by a prior ns_query_recurse() call,
so the REQUIRE assertion in ns_query_hookasync() needs to check the
RECTYPE_NORMAL slot rather than the RECTYPE_HOOK one.
2022-06-14 13:13:32 +02:00
Michał Kępień
af6fcf5641 Make resolver glue code use the 'recursions' array
With prefetch and RPZ code updated to use separate slots in the
'recursions' array, the code responsible for starting recursion in
ns_query_recurse() and resuming query handling in fetch_callback()
should follow suit, so that it does not need to explicitly cooperate
with other code paths that may initiate recursion.

Replace:

  - client->fetchhandle with HANDLE_RECTYPE_NORMAL(client)
  - client->query.fetch with FETCH_RECTYPE_NORMAL(client)

Also update other functions using client->fetchhandle and
client->query.fetch (ns_query_cancel(), query_usestale()) so that those
two fields can shortly be dropped altogether.
2022-06-14 13:13:32 +02:00
Michał Kępień
30ace0663d Make prefetch code use the 'recursions' array
Replace:

  - client->prefetchhandle with HANDLE_RECTYPE_PREFETCH(client)
  - client->query.prefetch with FETCH_RECTYPE_PREFETCH(client)

This is preparatory work for separating prefetch code from RPZ code.
2022-06-14 13:13:32 +02:00