Previously, an AXFR request would be issued every second while waiting
for the zone to be signed. This might've been the cause of issues in CI
where many tests are running in parallel and any extra load may increase
test instability.
Instead, check for the last NSEC record to have a signature before
commencing the AXFR request to check the zone has been fully signed.
Also increase the time for the zone signing to a total of 60+10 seconds
up from the previous 30.
(cherry picked from commit 3291c891f6)
There's no point in continuing the dupsigs test if a failure is
detected. End the test early to avoid wasting time and resources.
(cherry picked from commit ad647dca13)
Ensure messages from dupsigs system test end up in its log rather than
stdout. Previously, the output was hard to debug when running the tests
in parallel and messages wouldn't end up in the dupsigs.log.
(cherry picked from commit cbe2559f37)
If the zone already has existing NSEC/NSEC3 chains then zone_sign
needs to continue to use them. If there are no chains then use
kasp setting otherwise generate an NSEC chain.
(cherry picked from commit 4b55201459)
The dnstap system test fails intermittently, and it appears to be
a timing issue - adding a short delay after running 'fstrm_capture',
and before running 'dnstap -reopen' improves the situation from
50% failures (5 out of 10 times) to 0% failures (0 out of 20 times),
tested locally.
The reason is that 'fstrm_capture' is executed in the background,
and due to OS scheduling and other factors, the listener socket
may not be ready when the following command runs and tells 'named'
to (re)open it.
(cherry picked from commit fa686fcea5)
Dumping of the freshly transferred zone file can take some time.
Retry 5 times before failing.
The log excerpt below shows such a case, when dumping lasted more than
two seconds.
06-Mar-2023 09:32:09.973 zone example6/IN: Transfer started.
06-Mar-2023 09:32:10.301 zone example6/IN: zone transfer finished: success
06-Mar-2023 09:32:10.301 zone_dump: zone example6/IN: enter
06-Mar-2023 09:32:11.789 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): AXFR request
06-Mar-2023 09:32:11.801 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): transfer of 'example6/IN': AXFR ended: 5 messages, 2676 records, 55815 bytes, 0.011 secs (5074090 bytes/sec) (serial 1397051952)
06-Mar-2023 09:32:12.409 zone_gotwritehandle: zone example6/IN: enter
06-Mar-2023 09:32:12.421 dump_done: zone example6/IN: enter
06-Mar-2023 09:32:12.421 zone_journal_compact: zone example6/IN: target journal size 53044
(cherry picked from commit 5d5d4b523b)
There can be comments in dig output for a zone transfer only in case
of an error, so we should print those errors not when wait_for_tls_xfer
succeeds, but when it fails.
Also, there is no point in printing those comments when a failure was
indeed expected.
(cherry picked from commit 9672b6be57)
If wait_for_tls_xfer succeeds, while a failure was being expected,
set ret=1 to fail without further checking if the zone file exists.
(cherry picked from commit 2fdf01573c)
The serve-stale system test was intermittently failing due to a timing
issue:
I:serve-stale:check stale data.example TXT was refreshed...
I:serve-stale:failed
The RRset is refreshed, however, it first checks for an expected log
line, prior checking that the stale data.example TXT was refreshed
(using dig). This log line is there to ensure the record is actually
refreshed before we start querying again. Alternatively we could just
retry_quiet 10 <wait for dig output matches expectations>. It would
lower the chances for intermittent test failures, since there is no
longer a "check for log line, sleep one second if check fails, check
for log line, ...", prior to the check.
(cherry picked from commit 0bf36da305)
Named now logs both compile time and run time UV versions when
starting up. This is useful information to have when debugging
network issues involving named.
(cherry picked from commit 5fd2cd8018)
During reconfiguration, the configure_view() function reverts the
configured zones to the previous view in case if there is an error.
It uses the 'zones_configured' boolean variable to decide whether
it is required to revert the zones, i.e. the error happened after
all the zones were successfully configured.
The problem is that it does not account for the case when an error
happens during the configuration of one of the zones (not the first),
in which case there are zones that are already configured for the
new view (and they need to be reverted), and there are zones that
are not (starting from the failed one).
Since 'zones_configured' remains 'false', the configured zones are
not reverted.
Replace the 'zones_configured' variable with a pointer to the latest
successfully configured zone configuration element, and when reverting,
revert up to and including that zone.
(cherry picked from commit 84c235a4b0)
The trick is to configure a duplicate zone, which comes after the
catalog zone, where the duplicate zone is an existing member zone.
In that scenario, all the zones which come before the "faulty" zone
in the configuration file will fail to be reverted to the previous
version of the view after a reconfiguration error, and in this
particular case that will result in an assertion failure when the
catalog zone update is initiated, because it will be still tied to
the new version of the view, which was dismissed.
(cherry picked from commit 93c4f382f4)
Add the 'ixfr-from-differences yes;' option to trigger a failed
zone postload operation when a zone is updated but the serial
number is not updated, then issue two successive 'rndc reload'
commands to trigger the bug, which causes an assertion failure.
(cherry picked from commit a73b67456e)
This commit increases server start timeout from 60 to 90 seconds in
order to avoid system test failures on some platforms due to inability
to initialise TLS contexts in time.
(cherry picked from commit 705f0d1ed1)
The test was setting a minimum count for recursive clients which
was not always being met (e.g. 91 instead of 100) producing a false
positive. Lower the lower bound on recursive clients for this
test to 1.
(cherry picked from commit af47090d99)
Building the bin/tests/system/rpz/dnsrps helper binary is currently not
possible at all as the necessary compiler and linker flag definitions
are missing from bin/tests/system/Makefile.am. Add these as a basis for
addressing the problem.
Unfortunately, this is where the "mostly" bit mentioned in this commit's
subject line comes into play. The dlopen() parts of DNSRPS code have
not yet been reworked to use libuv's dlopen() API (uv_dlopen() etc.)
(See commit 37b9511ce1 for prior work in
this area.) While it is certainly possible to do that, implementing
such a change without testing it in practice against a usable librpz.so
(i.e. a DNSRPS provider library) is bound to cause more trouble and
confusion than keeping the code the way it is right now. However,
making that code buildable as-is requires linking against a C standard
library that exports the dlopen(), dlsym(), and dlclose() symbols used
by the DNSRPS dynamic loading code. glibc 2.34+ satisfies that
requirement, but older glibc versions do not (these come with a separate
libdl shared library that would need to be linked in as well). (Other
C standard library implementations have not been examined.) Since the
long-term plan is to rely on libuv's dlopen() API exclusively and
detecting the shared object containing dlopen() & friends would only
pull in build system complexity for no good reason, assume for now that
the target system provides the dlopen() API in its C standard library.
This change enables the system test suite to be run for a BIND 9 build
prepared using --enable-dnsrps --enable-dnsrps-dl (on systems satisfying
the requirement explained above). However, it is important to note that
this change by itself does NOT enable actual testing of the DNSRPS
feature as doing that requires a DNSRPS provider library to be present
on the test host.
(cherry picked from commit b396f55586)
This change should make sure that catalog zone update processing
doesn't happen when the catalog zone is being shut down. This
should help avoid races when offloading the catalog zone updates
in the follow-up commit.
(cherry picked from commit 246b7084d6)
* Change 'dns_catz_new_zones()' function's prototype (the order of the
arguments) to synchronize it with the similar function in rpz.c.
* Rename 'refs' to 'references' in preparation of ISC_REFCOUNT_*
macros usage for reference tracking.
* Unify dns_catz_zone_t naming to catz, and dns_catz_zones_t naming to
catzs, following the logic of similar changes in rpz.c.
* Use C compound literals for structure initialization.
* Synchronize the "new zone version came too soon" log message with the
one in rpz.c.
* Use more of 'sizeof(*ptr)' style instead of the 'sizeof(type_t)' style
expressions when allocating or freeing memory for 'ptr'.
(cherry picked from commit 8cb79fec9d)
Reproduce the assertion by configuring a 'named' resolver with
'recursive-clients 10;' configuration option and running 20
queries is parallel.
Also tweak the 'ans2/ans.pl' to simulate a 50ms network latency
when qname starts with "latency". This makes sure that queries
running in parallel don't get served immediately, thus allowing
the configured recursive clients quota limitation to be activated.
(cherry picked from commit 4b52b0b4a9)
The kasp pointers in dns_zone_t should consistently be changed by
dns_kasp_attach and dns_kasp_detach so the usage is balanced.
(cherry picked from commit b41882cc75)
When switching to a new view during a reconfiguration (or reverting
to the old view), detach the 'rpzs' and 'catzs' from the previuos view.
The 'catzs' case was earlier solved slightly differently, by detaching
from the new view when reverting to the old view, but we can not solve
this the same way for 'rpzs', because now in BIND 9.19 and BIND 9.18
a dns_rpz_shutdown_rpzs() call was added in view's destroy() function
before detaching the 'rpzs', so we can not leave the 'rpzs' attached to
the previous view and let it be shut down when we intend to continue
using it with the new view.
Instead, "re-fix" the issue for the 'catzs' pointer the same way as
for 'rpzs' for consistency, and also because a similar shutdown call
is likely to be implemented for 'catzs' in the near future.
(cherry picked from commit 121a095a22)
The faulty "DLZ" configuration triggers a reconfiguration failure
in such a place where view reverting code is covered.
(cherry picked from commit 95f4bac002)
this function was just a front-end for gethostname(). it was
needed when we supported windows, which has a different function
for looking up the hostname; it's not needed any longer.
(cherry picked from commit 197334464e)
bin/tests/system/get_algorithms.py:225:4: R1720: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it (no-else-raise)
(cherry picked from commit 8064ac6bec)
Free/detach tsigkey and sig0key when exiting and then call
dst_lib_destroy if we have previously called dst_lib_init. This will,
in theory, allow OPENSSL_cleanup to free all memory.
(cherry picked from commit 4c2525c418)
Include MD5 feature detection in featuretest tool and use it in some
places. When RHEL distribution or Fedora ELN is in FIPS mode, then MD5
algorithm is unavailable completely and even hmac-md5 algorithm usage
will always fail. Work that around by checking MD5 works and if not,
skipping its usage.
Those changes were dragged as downstream patch bind-9.11-fips-tests.patch
in Fedora and RHEL.
(cherry picked from commit 6ad794a8cd)
Tests using diff to compare outputs of dig +short shall ignore lines
starting with ";". In dig +short output, such lines should only be
present for errors such as network issues. Since we utilize dig's
default timeout/retry mechanisms, these transitory issues should be
ignored and only the final output should be considered during the diff
comparison.
(cherry picked from commit bd1ef66f83)
The dns_rpz_zones structure was using .refs and .irefs for strong and
weak reference counting. Rewrite the unit to use just a single
reference counting + shutdown sequence (dns_rpz_destroy_rpzs) that must
be called by the creator of the dns_rpz_zones_t object. Remove the
reference counting from the dns_rpz_zone structure as it is not needed
because the zone objects are fully embedded into the dns_rpz_zones
structure and dns_rpz_zones_t object must never be destroyed before all
dns_rpz_zone_t objects.
The dns_rps_zones_t reference counting uses the new ISC_REFCOUNT_TRACE
capability - enable by defining DNS_RPZ_TRACE in the dns/rpz.h header.
Additionally, add magic numbers to the dns_rpz_zone and dns_rpz_zones
structures.
(cherry picked from commit 77659e7392)
This adds an island of trust that is reachable from the root
where the trust anchors are added to island.conf.
This add an island of trust that is not reachable from the root
where the trust anchors are added to private.conf.
(cherry picked from commit 41bdb5b9fe)
Occasionally, the allotted 10 seconds for the "running" line to appear
in log after named is started proved insufficient in CI, especially
during increased load. Give named up to 60 seconds to start up to
mitigate this issue.
(cherry picked from commit b8bb4233e8)
isc_bind9 was a global bool used to indicate whether the library
was being used internally by BIND or by an external caller. external
use is no longer supported, but the variable was retained for use
by dyndb, which needed it only when being built without libtool.
building without libtool is *also* no longer supported, so the variable
can go away.
(cherry picked from commit 935879ed11)
Send the test message from ns3 to ns2 instead of ns2 to ns3 as ns2
is started first and therefore the test doesn't have to wait on the
resend of the the NOTIFY message to be successful.
(cherry picked from commit e7e1f59a3a)
Move and give unique names to the dns_db_t, dns_dbnode_t and
dns_dbversion_t pointers, so they have global scope and therefore
are visible to cleanup. Unique names are not strictly necessary,
as none of the functions involved call each other.
Change free_db to handle NULL pointers and also an optional
(dns_dbversion_t **).
In match_keyset_dsset and free_keytable, ki to be handled
differently to prevent a false positive NULL pointer dereference
warning from scan.
In formatset moved dns_master_styledestroy earlier and freed
buf before calling check_result to prevent memory leak.
In append_new_ds_set freed ds on the default path before
calling check_result to prevent memory leak.
(cherry picked from commit 13f9d29954)
dnssec-cds failed to cleanup on non error paths which meant that
the OpenSSL libraries could not cleanup properly.
(cherry picked from commit 81bde388e4)
the nsupdate system test was intermittently failing due to the update
quota not being exceeded when it should have been. this is most likely
a timing issue: the client is sending updates too slowly, or the server
is processing them too quickly, for the quota to fill. this commit
attempts to make that the failure less likely by increasing the number
of update transactions from 10 to 20.
(cherry picked from commit 06b1faf068)
Following deleting the root trust anchor and reconfiguring the
server it takes some time to for trust anchor to appear in 'rndc
managed-keys status' output. Retry several times.
(cherry picked from commit 71dbd09796)
check in the log files of receiving servers that the originating
ports for notify and SOA query messages were set correctly from
configured notify-source and transfer-source options.
(cherry picked from commit 9cffd5c431)
It is trivial to fully cleanup memory on all the error paths in
named-rrchecker, many of which are triggered by bad user input.
This involves freeing lex and mctx if they exist when fatal is
called.
(cherry picked from commit dbe82813e6)
it was possible for a managed trust anchor needing to send a key
refresh query to be unable to do so because an authoritative zone
was not yet loaded. this has been corrected by delaying the
synchronization of managed-keys zones until after all zones are
loaded.
(cherry-picked from commit bafbbd2465)
Add 'port' token to deprecated.conf. Also add options
'use-v4-udp-ports', 'use-v6-udp-ports', 'avoid-v4-udp-ports',
and 'avoid-v6-udp-ports'.
All of these should trigger warnings (except when deprecation warnings
are being ignored).
(cherry picked from commit 531914e660)
Deprecate the use of "port" when configuring query-source(-v6),
transfer-source(-v6), notify-source(-v6), parental-source(-v6),
etc. Also deprecate use-{v4,v6}-udp-ports and avoid-{v4,v6}udp-ports.
(cherry picked from commit 470ccbc8ed)
If the address lookup of the primary server fails just abort
the current update request rather than calling exit. This allows
nsupdate to cleanup gracefully.
(cherry picked from commit f1387514c6)