when transferring in a non-inline-signing secondary for the first time,
we previously never set the value of zone->loadtime, so it remained
zero. this caused a test failure in the statschannel system test,
and that test case was temporarily disabled. the value is now set
correctly and the test case has been reinstated.
(cherry picked from commit 9643281453)
The maximum DNS message size is 65535 octets. Check that the buffer
being passed to dns_message_renderbegin does not exceed this as the
compression code assumes that all offsets are no bigger than this.
(cherry picked from commit a069513234)
To reduce the amount of log spam when root servers change their
addresses keep a table of upcoming changes by expected date and time
and suppress reporting differences for them until then.
Add initial entry for B.ROOT-SERVERS.NET, Nov 27, 2023.
(cherry picked from commit b69100b747)
This covers both root hints and the default primaries for the root
zone mirror. The official change date is Nov 27, 2023.
(cherry picked from commit 2ca2f7e985)
The dns__catz_update_cb() does not expect that 'catzs->zones'
can become NULL during shutdown.
Add similar checks in the dns__catz_update_cb() and dns_catz_zone_get()
functions to protect from such a case. Also add an INSIST in the
dns_catz_zone_add() function to explicitly state that such a case
is not expected there, because that function is called only during a
reconfiguration.
(cherry picked from commit 4eb4fa288c)
Drop timeout before resending a UDP request from 15 seconds to 5
seconds and add 1 second to the total time to allow for the reply
to the third request to arrive. This will speed up the time it
takes for named to recover from a lost packet when refreshing a
zone and for it to determine that a primary is down.
(cherry picked from commit 29f399797d)
Update the function 'set_resigntime()' so that raw versions of
inline-signing zones are not scheduled to be resigned.
Also update the check in the same function for zone is dynamic, there
exists a function 'dns_zone_isdynamic()' that does a similar thing
and is more complete.
Also in 'zone_postload()' check whether the zone is not the raw
version of an inline-signing zone, preventing calculating the next
resign time.
(cherry picked from commit 741ce2d07a)
In units that support detailed reference tracing via ISC_REFCOUNT
macros, we were doing:
/* Define to 1 for detailed reference tracing */
#undef <unit>_TRACE
This would prevent using -D<unit>_TRACE=1 in the CFLAGS.
Convert the above mentioned snippet with just a comment how to enable
the detailed reference tracing:
/* Add -D<unit>_TRACE=1 to CFLAGS for detailed reference tracing */
(cherry picked from commit 6afa961534)
Move the block on the error path, where the link is checked, to a place
where it makes sense, to avoid accessing an unitialized link when
jumping to the 'cleanup_query' label from 4 different places. The link
is initialized only after those jumps happen.
In addition, initilize the link when creating the object, to avoid
similar errors.
(cherry picked from commit fb7bbbd1be)
Apply the semantic patch to catch all the places where we pass 'char' to
the <ctype.h> family of functions (isalpha() and friends, toupper(),
tolower()).
(cherry picked from commit 29caa6d1f0)
This raises the log level of messages treated as FORMERR to NOTICE
when transfering in a zone. This also adds a missing log message
for TYPE0 and meta types received during a zone transfer.
(cherry picked from commit 6c3414739d)
This refactors the code flow in got_transfer_quota() to not use the
CHECK() macro as it really obfuscates the code flow logic here.
(cherry picked from commit 00cb151f8e)
The 'result' variable should be reset to ISC_R_NOTFOUND again,
because otherwise a log message could be logged about not being
able to get the TLS configuration based on on the 'result' value
from the previous calls to get the TSIG key.
(cherry picked from commit 6cab7fc627)
When flushing the cache, we create a new cache database. The serve-stale
settings need to be restored after doing this. We already did this
for max-stale-ttl, but forgot to do this for stale-refresh-time.
(cherry picked from commit 3ae721db6c)
Fixes
>>> CID 464851: Possible Control flow issues (DEADCODE)
>>> Execution cannot reach this statement: "BN_free(ex);".
Makes conditionals between declaring and use constistent. BN_free is
not needed as BIGNUM's returned by RSA_get0_key are not to be freed.
Commit dc6dafdad1 allows larger TTL values
in zones that go insecure, and ignores the maximum zone TTL.
This means that if you use TTL values larger than 1 day in your zone,
your zone runs the risk of going bogus before it moves safely to
insecure.
Most resolvers by default cap the maximum TTL that they cache RRsets,
at one day (Unbound, Knot, PowerDNS) so that is fine. However, BIND 9's
default is one week.
Change the default TTLsig to one week, so that also for BIND 9
resolvers in the default cases responses for zones that are going
insecure will not be evaluated as bogus.
This change does mean that when unsigning your zone, it will take six
days longer to safely go insecure, regardless of what TTL values you
use in the zone.
(cherry picked from commit 32686beabc)
Allow larger TTL values in zones that go insecure. This is necessary
because otherwise the zone will not be loaded due to the max-zone-ttl
of P1D that is part of the current insecure policy.
In the keymgr.c code, default back to P1D if the max-zone-ttl is set
to zero.
(cherry picked from commit dc6dafdad1)
When a primary server is not responding, mark it as temporarialy
unreachable. This will prevent too many zones queuing up on a
unreachable server and allow the refresh process to move onto
the next primary sooner once it has been so marked.
(cherry picked from commit 621c117101)
When a zone database update callback is called, the 'catzs' object,
extracted from the callback argument, might be already shutting down,
in which case the 'catzs->zones' can be NULL and cause an assertion
failure when calling isc_ht_find().
Add an early return from the callback if 'catzs->shuttingdown' is true.
Also check the validity of 'catzs->zones' after locking 'catzs' in
case there is a race with dns_catz_shutdown_catzs() running in another
thread.
(cherry picked from commit 28bb419edc)
All the heavy RPZ and CATZ work is already running with offloaded
threads, and running the remaining small functions in exclusive mode
offers more synchronization guaranties.
Move the update notify registration code from the offloaded
dns__catz_update_cb() function into dns__catz_done_cb().
After this change, it should be safe to remove the lock/unlock code
from the dns_catz_dbupdate_register() and dns_catz_dbupdate_unregister()
functions, as they were causing a benign TSAN lock-order-inversion
report.
The dns_zone_catz_enable_db() and dns_zone_catz_disable_db()
functions can race with similar operations in the catz module
because there is no synchronization between the threads.
Add catz functions which use the view's catalog zones' lock
when registering/unregistering the database update notify callback,
and use those functions in the dns_zone module, instead of doing it
directly.
(cherry picked from commit 6f1f5fc307)
in the past there was overlap between the fields used
as resolver fetch options and ADB addrinfo flags. this has
mostly been eliminated; now we can clean up the rest of
it and remove some confusing comments.
(cherry picked from commit 0955cf1af5)
The ability to read legacy HMAC-MD5 K* keyfile pairs using algorithm
number 157 was accidentally lost when the algorithm numbers were
consolidated into a single block, in commit
09f7e0607a.
The assumption was that these algorithm numbers were only known
internally, but they were also used in key files. But since HMAC-MD5
got renumbered from 157 to 160, legacy HMAC-MD5 key files no longer
work.
Move HMAC-MD5 back to 157 and GSSAPI back to 160. Add exception for
GSSAPI to list_hmac_algorithms.
(cherry picked from commit 3f93d3f757)
When DNS_FETCHOPT_NOFOLLOW is set DNS_R_DELEGATION needs to be
returned to restart the resolution process rather than converting
it to ISC_R_SUCCESS.
(cherry picked from commit ea11650376)
If we know that the NS RRset for an intermediate label doesn't exist
on cache contents don't query using that name when looking for a
referral.
(cherry picked from commit 80bc0ee075)
There is no harm in aquiring an additional reference to the resolver
after it has started shutting down. All the REQUIRE was doing was
introducing a point of failure when shutting down the server.
If the resolver received a FORMERR response to a request with
an DNS COOKIE option present that echoes the option back, resend
the request without an DNS COOKIE option present.
(cherry picked from commit f3b24ba789)
The mapping functions between isc_result_t and dns_rcode_t could return
both isc_result_t values not defined in the header and dns_rcode_t
values not defined in the header because it blindly maps anything
withing full 12-bits defined for RCODEs to isc_result_t and back.
Refactor the dns_result_{from,to}rcode() functions to always return
valid isc_result_t and dns_rcode_t values by explicitly mapping the
values to each other and returning DNS_R_SERVFAIL (dns_rcode_servfail)
when encountering value out of the defined range.
(cherry picked from commit b53d1d7069)
When a catalog zone is updated using AXFR, the zone database is changed,
so it is required to unregister the update notification callback from
the old database, and register it for the new one.
Currently, here is the order of the steps happening in such scenario:
1. The zone.c:zone_startload() function registers the notify callback
on the new database using dns_zone_catz_enable_db()
2. The callback, when called, notices that the new 'db' is different
than 'catz->db', and unregisters the old callback for 'catz->db',
marks that it's unregistered by setting 'catz->db_registered' to
false, then it schedules an update if it isn't already scheduled.
3. The offloaded update process, after completing its job, notices that
'catz->db_registered' is false, and (re)registers the update callback
for the current database it is working on. There is no harm here even
if it was registered also on step 1, and we can't skip it, because
this function can also be called "artificially" during a
reconfiguration, and in that case the registration step is required
here.
A problem arises when before step 1 an update process was already
in a running state, operating on the old database, and finishing its
work only after step 2. As described in step 3, dns__catz_update_cb()
notices that 'catz->db_registered' is false and registers the callback
on the current database it is working on, which, at that state, is
already obsolete and unused by the zone. When it detaches the database,
the function which is responsible for its cleanup (e.g. free_rbtdb())
asserts because there is a registered update notify callback there.
To fix the problem, instead of delaying the (re)registration to step 3,
make sure that the new callback is registered and 'catz->db_registered'
is accordingly marked on step 2.
(cherry picked from commit 998765fea5)
When cache memory usage is over the configured cache size (overmem) and
we are cleaning unused entries, it might not be enough to clean just two
entries if the entries to be expired are smaller than the newly added
rdata. This could be abused by an attacker to cause a remote Denial of
Service by possibly running out of the operating system memory.
Currently, the addrdataset() tries to do a single TTL-based cleaning
considering the serve-stale TTL and then optionally moves to overmem
cleaning if we are in that condition. Then the overmem_purge() tries to
do another single TTL based cleaning from the TTL heap and then continue
with LRU-based cleaning up to 2 entries cleaned.
Squash the TTL-cleaning mechanism into single call from addrdataset(),
but ignore the serve-stale TTL if we are currently overmem.
Then instead of having a fixed number of entries to clean, pass the size
of newly added rdatasetheader to the overmem_purge() function and
cleanup at least the size of the newly added data. This prevents the
cache going over the configured memory limit (`max-cache-size`).
Additionally, refactor the overmem_purge() function to reduce for-loop
nesting for readability.
The number of clients per query is calculated using the pending
fetch responses in the list. The dns_resolver_createfetch() function
includes every item in the list when deciding whether the limit is
reached (i.e. fctx->spilled is true). Then, when the limit is reached,
there is another calculation in fctx_sendevents(), when deciding
whether it is needed to increase the limit, but this time the TRYSTALE
responses are not included in the calculation (because of early break
from the loop), and because of that the limit is never increased.
A single client can have more than one associated response/event in the
list (currently max. two), and calculating them as separate "clients"
is unexpected. E.g. if 'stale-answer-enable' is enabled and
'stale-answer-client-timeout' is enabled and is larger than 0, then
each client will have two events, which will effectively halve the
clients-per-query limit.
Fix the dns_resolver_createfetch() function to calculate only the
regular FETCHDONE responses/events.
Change the fctx_sendevents() function to also calculate only FETCHDONE
responses/events. Currently, this second change doesn't have any impact,
because the TRYSTALE events were already skipped, but having the same
condition in both places will help prevent similar bugs in the future
if a new type of response/event is ever added.
(cherry picked from commit 2ae5c4a674)