Add more tests to the dnstap system test to roll with different values.
Touch some files to make sure the number of existing files exceed the
number that we want to keep.
Add a test to the logfileconfig system test for the increment suffix.
(cherry picked from commit 9fb9670ebc)
Empty-non-terminal NSEC records where not always removed when the
delegations generating them where removed via update. Check that
they now are.
(cherry picked from commit ad91a70d15)
'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.
'-T transferslowly' makes named to sleep for one second for every
xfrout message.
'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
(cherry picked from commit dfaecfd752)
There is no 'ret' in this test, and it is obvious that 'ret=1'
should be 'tmp=1' for the check to work correctly, if the string
is not found in the log file.
(cherry picked from commit 613a9fc659)
Add a test case to cover #3679 where a user migrates from a KSK/ZSK
split using auto-dnssec maintain, to the default dnssec-policy (CSK).
The test actually does not use the default dnssec-policy, but it does
use one that has the same keys clause. For testing convenience, we use
the same propagation time values as other test cases that migrate to
dnssec-policy with mismatching existing key set.
(cherry picked from commit c42ec8a56e)
At the time of test number (19), there were 10 "sending packet to
10.53.0.7" lines in the "legacy/ns1/named.run" file; usually, only seven
are present:
I:legacy:checking recursive lookup to edns 512 + no tcp server does not cause query loops (19)
I:legacy:ns1 sent 10 queries to ns7, expected less than 10
I:legacy:failed
Those three can be attributed to tests "8", "10", and "18", where the
dig of "resolution_fails()" retried after a timeout to succeed with
"status: SERVFAIL" subsequently, as seen in each of
dig.out.test{8,10,18} files.
;; communications error to 10.53.0.1#13093: timed out
; <<>> DiG 9.19.12-dev <<>> -p 13093 +tcp @10.53.0.1 edns512-notcp. TXT
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5368
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
This retry is unnecessary because "resolution_fails()" considers timeout
a positive result.
(cherry picked from commit e05460c813)
Manual page were updated to indicate it, but rndc -h still displays it
as required parameter. Make it look like optional.
(cherry picked from commit 0627214568)
'inst' is guarenteed to be non NULL at this point.
358 *instp = inst;
359
360cleanup:
CID 281450 (#2 of 2): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking inst suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
361 if (result != ISC_R_SUCCESS && inst != NULL) {
362 plugin_destroy((void **)&inst);
363 }
364
365 return (result);
(cherry picked from commit 8c5539e905)
The dnspython.Resolve.resolve() requires at least dnspython >= 2.0.0,
this wasn't enforced in the shutdown system test leading to infinite
loop waiting for the server start due to failing resolve() call.
We don't need a separate module/file for every test. Both the rpz tests
could live in the same file.
The setup/teardown of servers if performed separately for each module --
unless there is a need to do that, it's better to avoid it.
(cherry picked from commit 1734d4a33e)
All the answers are expected to have exactly one resource record. Check
it directly instead of iterating over all (possible) records.
(cherry picked from commit 2ed26609b8)
This adds rudimentary test for response-policy zones in multiple
views. Different combinations are tested:
- two views with response-policy inherited from options {};
- two views view explicit response-policy using same RPZ zone name
- two views view explicit response-policy using secondary RPZ zone
(cherry picked from commit 1649c768e9)
The default values are currently set to 30 seconds, use nsupdate
default (or overriden using the -t option) timeout value instead.
(cherry picked from commit 98c8135692)
The 'nsupdate' tool, when sending SOA queries, uses a hard-coded value
3 UDP retries and of 5 seconds of timeout for UDP queries, and 100
seconds of timeout for TCP queries.
Use the timeout and retry values which can be configured using the
-t, -u, -r command line options, and which are already used for
sending the update query.
(cherry picked from commit 3ef2a30c75)
The manual page of nsupdate's '-u udptimeout' option states that, quote:
> If zero, the interval is computed from the timeout interval and number
> of UDP retries.
However, nsupdate sets the UDP timeout value to UINT_MAX when it is 0,
thus, not behaving as documented.
Let dns_request_create() calculate the UDP timeout, if it was set to 0.
(cherry picked from commit 0ef11c0ccb)
* nsupdate should take 12 seconds (one try and three retries with
3 second timeout for each), UDP mode
* nsupdate -u 4 -r 1 should take 8 seconds (one try and one retry with
4 second timeout for each), UDP mode
* nsupdate -u 0 -t 8 -r 1 should also take 8 seconds, UDP mode
* nsupdate -u 4 -t 30 -r 1 should also take 8 seconds, as -u takes
precedence over -t, UDP mode
* nsupdate -t 8 -v should also take 8 seconds, TCP mode
(cherry picked from commit 5ce2ed0688)
These options and zone type were created to address the
SiteFinder controversy, in which certain TLD's redirected queries
rather than returning NXDOMAIN. since TLD's are now DNSSEC-signed,
this is no longer likely to be a problem.
The deprecation message for 'type delegation-only' is issued from
the configuration checker rather than the parser. therefore,
isccfg_check_namedconf() has been modified to take a 'nodeprecate'
parameter to suppress the warning when named-checkconf is used with
the command-line option to ignore warnings on deprecated options (-i).
(cherry picked from commit 2399556bee)
Previously, an AXFR request would be issued every second while waiting
for the zone to be signed. This might've been the cause of issues in CI
where many tests are running in parallel and any extra load may increase
test instability.
Instead, check for the last NSEC record to have a signature before
commencing the AXFR request to check the zone has been fully signed.
Also increase the time for the zone signing to a total of 60+10 seconds
up from the previous 30.
(cherry picked from commit 3291c891f6)
There's no point in continuing the dupsigs test if a failure is
detected. End the test early to avoid wasting time and resources.
(cherry picked from commit ad647dca13)
Ensure messages from dupsigs system test end up in its log rather than
stdout. Previously, the output was hard to debug when running the tests
in parallel and messages wouldn't end up in the dupsigs.log.
(cherry picked from commit cbe2559f37)
If the zone already has existing NSEC/NSEC3 chains then zone_sign
needs to continue to use them. If there are no chains then use
kasp setting otherwise generate an NSEC chain.
(cherry picked from commit 4b55201459)
The dnstap system test fails intermittently, and it appears to be
a timing issue - adding a short delay after running 'fstrm_capture',
and before running 'dnstap -reopen' improves the situation from
50% failures (5 out of 10 times) to 0% failures (0 out of 20 times),
tested locally.
The reason is that 'fstrm_capture' is executed in the background,
and due to OS scheduling and other factors, the listener socket
may not be ready when the following command runs and tells 'named'
to (re)open it.
(cherry picked from commit fa686fcea5)
Dumping of the freshly transferred zone file can take some time.
Retry 5 times before failing.
The log excerpt below shows such a case, when dumping lasted more than
two seconds.
06-Mar-2023 09:32:09.973 zone example6/IN: Transfer started.
06-Mar-2023 09:32:10.301 zone example6/IN: zone transfer finished: success
06-Mar-2023 09:32:10.301 zone_dump: zone example6/IN: enter
06-Mar-2023 09:32:11.789 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): AXFR request
06-Mar-2023 09:32:11.801 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): transfer of 'example6/IN': AXFR ended: 5 messages, 2676 records, 55815 bytes, 0.011 secs (5074090 bytes/sec) (serial 1397051952)
06-Mar-2023 09:32:12.409 zone_gotwritehandle: zone example6/IN: enter
06-Mar-2023 09:32:12.421 dump_done: zone example6/IN: enter
06-Mar-2023 09:32:12.421 zone_journal_compact: zone example6/IN: target journal size 53044
(cherry picked from commit 5d5d4b523b)
There can be comments in dig output for a zone transfer only in case
of an error, so we should print those errors not when wait_for_tls_xfer
succeeds, but when it fails.
Also, there is no point in printing those comments when a failure was
indeed expected.
(cherry picked from commit 9672b6be57)
If wait_for_tls_xfer succeeds, while a failure was being expected,
set ret=1 to fail without further checking if the zone file exists.
(cherry picked from commit 2fdf01573c)
The serve-stale system test was intermittently failing due to a timing
issue:
I:serve-stale:check stale data.example TXT was refreshed...
I:serve-stale:failed
The RRset is refreshed, however, it first checks for an expected log
line, prior checking that the stale data.example TXT was refreshed
(using dig). This log line is there to ensure the record is actually
refreshed before we start querying again. Alternatively we could just
retry_quiet 10 <wait for dig output matches expectations>. It would
lower the chances for intermittent test failures, since there is no
longer a "check for log line, sleep one second if check fails, check
for log line, ...", prior to the check.
(cherry picked from commit 0bf36da305)
Named now logs both compile time and run time UV versions when
starting up. This is useful information to have when debugging
network issues involving named.
(cherry picked from commit 5fd2cd8018)
During reconfiguration, the configure_view() function reverts the
configured zones to the previous view in case if there is an error.
It uses the 'zones_configured' boolean variable to decide whether
it is required to revert the zones, i.e. the error happened after
all the zones were successfully configured.
The problem is that it does not account for the case when an error
happens during the configuration of one of the zones (not the first),
in which case there are zones that are already configured for the
new view (and they need to be reverted), and there are zones that
are not (starting from the failed one).
Since 'zones_configured' remains 'false', the configured zones are
not reverted.
Replace the 'zones_configured' variable with a pointer to the latest
successfully configured zone configuration element, and when reverting,
revert up to and including that zone.
(cherry picked from commit 84c235a4b0)
The trick is to configure a duplicate zone, which comes after the
catalog zone, where the duplicate zone is an existing member zone.
In that scenario, all the zones which come before the "faulty" zone
in the configuration file will fail to be reverted to the previous
version of the view after a reconfiguration error, and in this
particular case that will result in an assertion failure when the
catalog zone update is initiated, because it will be still tied to
the new version of the view, which was dismissed.
(cherry picked from commit 93c4f382f4)
Add the 'ixfr-from-differences yes;' option to trigger a failed
zone postload operation when a zone is updated but the serial
number is not updated, then issue two successive 'rndc reload'
commands to trigger the bug, which causes an assertion failure.
(cherry picked from commit a73b67456e)
This commit increases server start timeout from 60 to 90 seconds in
order to avoid system test failures on some platforms due to inability
to initialise TLS contexts in time.
(cherry picked from commit 705f0d1ed1)
The test was setting a minimum count for recursive clients which
was not always being met (e.g. 91 instead of 100) producing a false
positive. Lower the lower bound on recursive clients for this
test to 1.
(cherry picked from commit af47090d99)
Building the bin/tests/system/rpz/dnsrps helper binary is currently not
possible at all as the necessary compiler and linker flag definitions
are missing from bin/tests/system/Makefile.am. Add these as a basis for
addressing the problem.
Unfortunately, this is where the "mostly" bit mentioned in this commit's
subject line comes into play. The dlopen() parts of DNSRPS code have
not yet been reworked to use libuv's dlopen() API (uv_dlopen() etc.)
(See commit 37b9511ce1 for prior work in
this area.) While it is certainly possible to do that, implementing
such a change without testing it in practice against a usable librpz.so
(i.e. a DNSRPS provider library) is bound to cause more trouble and
confusion than keeping the code the way it is right now. However,
making that code buildable as-is requires linking against a C standard
library that exports the dlopen(), dlsym(), and dlclose() symbols used
by the DNSRPS dynamic loading code. glibc 2.34+ satisfies that
requirement, but older glibc versions do not (these come with a separate
libdl shared library that would need to be linked in as well). (Other
C standard library implementations have not been examined.) Since the
long-term plan is to rely on libuv's dlopen() API exclusively and
detecting the shared object containing dlopen() & friends would only
pull in build system complexity for no good reason, assume for now that
the target system provides the dlopen() API in its C standard library.
This change enables the system test suite to be run for a BIND 9 build
prepared using --enable-dnsrps --enable-dnsrps-dl (on systems satisfying
the requirement explained above). However, it is important to note that
this change by itself does NOT enable actual testing of the DNSRPS
feature as doing that requires a DNSRPS provider library to be present
on the test host.
(cherry picked from commit b396f55586)
This change should make sure that catalog zone update processing
doesn't happen when the catalog zone is being shut down. This
should help avoid races when offloading the catalog zone updates
in the follow-up commit.
(cherry picked from commit 246b7084d6)
* Change 'dns_catz_new_zones()' function's prototype (the order of the
arguments) to synchronize it with the similar function in rpz.c.
* Rename 'refs' to 'references' in preparation of ISC_REFCOUNT_*
macros usage for reference tracking.
* Unify dns_catz_zone_t naming to catz, and dns_catz_zones_t naming to
catzs, following the logic of similar changes in rpz.c.
* Use C compound literals for structure initialization.
* Synchronize the "new zone version came too soon" log message with the
one in rpz.c.
* Use more of 'sizeof(*ptr)' style instead of the 'sizeof(type_t)' style
expressions when allocating or freeing memory for 'ptr'.
(cherry picked from commit 8cb79fec9d)
Reproduce the assertion by configuring a 'named' resolver with
'recursive-clients 10;' configuration option and running 20
queries is parallel.
Also tweak the 'ans2/ans.pl' to simulate a 50ms network latency
when qname starts with "latency". This makes sure that queries
running in parallel don't get served immediately, thus allowing
the configured recursive clients quota limitation to be activated.
(cherry picked from commit 4b52b0b4a9)
The kasp pointers in dns_zone_t should consistently be changed by
dns_kasp_attach and dns_kasp_detach so the usage is balanced.
(cherry picked from commit b41882cc75)
When switching to a new view during a reconfiguration (or reverting
to the old view), detach the 'rpzs' and 'catzs' from the previuos view.
The 'catzs' case was earlier solved slightly differently, by detaching
from the new view when reverting to the old view, but we can not solve
this the same way for 'rpzs', because now in BIND 9.19 and BIND 9.18
a dns_rpz_shutdown_rpzs() call was added in view's destroy() function
before detaching the 'rpzs', so we can not leave the 'rpzs' attached to
the previous view and let it be shut down when we intend to continue
using it with the new view.
Instead, "re-fix" the issue for the 'catzs' pointer the same way as
for 'rpzs' for consistency, and also because a similar shutdown call
is likely to be implemented for 'catzs' in the near future.
(cherry picked from commit 121a095a22)
The faulty "DLZ" configuration triggers a reconfiguration failure
in such a place where view reverting code is covered.
(cherry picked from commit 95f4bac002)
this function was just a front-end for gethostname(). it was
needed when we supported windows, which has a different function
for looking up the hostname; it's not needed any longer.
(cherry picked from commit 197334464e)