This removes a leftover check which should've been removed in a prior
change (see #5244). The softhsm2 failures when attempting to delete the
token should be ignored.
Previously, the one-second sleep was unreliable, as it didn't properly
indicate that the rndc reconfig has been processed. The "test 'rndc
reconfig' with a broken config" check would sometimes fail under TSAN
in CI, because the previous rndc reconfig was still ongoing, and the
subsequent rndc reconfig was ignored.
These tests have been unstable under TSAN in the past, but it appears
that the same failure mode can happen outside of TSAN tests as well.
These tests have produced 12 failures combined in the past three weeks
in nightlies.
The fetchlimit test has failed 8 times in the nightly CI over the past
three weeks. That makes the overall failure rate somewhere around 1 %,
which isn't a lot, but is still annoying when lots of testing is going
on.
Rndc test "test 'rndc reconfig' with a broken config" was failing
intermittently.
Wait for 'running' to be logged rather than just using 'sleep 1' before
calling 'rndc reconfig' a second time to get the expected error message
rather than 'reconfig request ignored: already running'.
There are many system tests where we set `dnssec-validation yes;` only
to also set `trust-anchors { };` which effectively disables the
validation.
This commit replaces this convoluted setup with just
`dnssec-validation no;`.
Root trust anchors are automatically updated as described in RFC5011.
Add a system test which ensures the root DNSKEYs are always queried by
named during startup.
Because this test uses real internet DNS root servers, it is enabled
only when `CI_ENABLE_LIVE_INTERNET_TESTS` is set.
This test doesn't require artifact checking but when bundled in the same
directory with the shell based tests, the `system:clang:tsan` job was
failing non-deterministically.
The extra messages are typically traceback from assertion failures.
Previously, they'd be printed only after all individual test case
results have been printed. That made it difficult to pair the traceback
to the failing test in some cases, as the node information (aka test
name) might not always be present.
Instead, log any extra messages related to a particular test failure
directly after reporting its result, making the failure details more
readily available and easy to connect with a particular test case.
Make sure the queries and responses are logged at the DEBUG level, which
may provide useful information in case of failing tests.
This doesn't seem to significantly increase the overall artifacts size.
Previously, pytest.log.txt files from all system tests would take around
3 MB, with this change, it's around 8 MB).
In some cases, it's useful to log the sent and received DNS messages.
Add options to enable this on demand. Query is only logged the first
time it's sent, since it doesn't change. If response logging is turned
on, then each response is logged, since it might be different every
time.
When multiline message is logged, indent all but the first line (which
will be preceeded by the LOG_FORMAT). This improves the clarity of logs,
as it's immediately clear which lines are regular log output, and which
ones are multiline debug output.
Adjust the isctest.run.cmd() stdout/stderr logging to this new format.
The messages obtained from test results may contain stuff like detailed
failure/error information, tracebacks etc. In many cases, the message
will be empty, in which case it doesn't need to be logged.
For an example, run test with many test cases, e.g.
verify/test_verify.py, and inspect the tail of the pytest.log.txt before
and after this commit.
Use isctest.log logging facility for consistent and predictable logging
output rather than using print(). Remove writes of stderr, as that
output will be logged in the debug log in case the commands called with
isctest.run.cmd() fails.
We skip those by default as:
a) we don't want to stress the upstream servers in every CI pipeline
b) system tests need to be runnable in a isolated environment by default
Previous CPU test relied on either missing default named.conf or the
missing permissions to write into its default directory. In short that
default configuration would be unusable with current user. It would hang
indefinitely at cpu test if the named user could write into directory
specified in default configuration.
Change it instead to explicitly try non-existent configuration file.
It will still fail immediately, but will not rely on running user or
presence of file at default configuration file path.
There is an ongoing debate about the usefulness of the extra artifacts
check. While it might be useful to detect unexpected behaviour in some
tests, it feels extraneous in many cases. This change provides a middle
ground by making the artifact checking optional. This might be
especially useful for writing new tests, since the author gets to decide
whether the check is useful -- and can utilize it, or can skip it for
sake of brevity.
The emptyzones system test ran two consecutive "rndc reload" commands
without waiting for the first one to complete. It used to work because
the commands were serialized, but now an rndc reconfig/reload command is
ignored if another one is already running, so the emptyzones test is
more likely to fail.
Fix this problem by waiting for the log message indicating that all the
zones are loaded before attempting the next reload.
Add a new system test which checks named output when starting,
reconfiguring and reloading the server. It checks that the steps where
configuration is loaded, when named enters exclusive mode, and when the
configuration is applied are all logged, and that they occur in the
correct order. This adds a guard/warning to keep the parsing of the
named.conf outside of the exclusive mode.
In some rare cases, the softhsm2 utility reports failure to delete the
token directory, despite the token being found. Subsequent attempts to
delete the token again indicate that the token was deleted.
Ignore this cleanup error, as it doesn't prevent our tests from working
properly. There is also an attempt to delete the token before the test
starts which ensures a clean state before the test is executed, in case
there's actually a leftover token.
For duration measurements, i.e. deadlines and timeouts, it's more
suitable to use monotonic time as it's guaranteed to only go forward,
unlike time.time() which can be affected by local clock settings.
Allow use of exception (and by extension, assert statements) in the
called function in order to extract essential debug information about
the type of failure that was encountered.
In case the called function fails to succeed on the last retry and
raised an exception, log it as error and set it as the assert message to
propagate it through the pytest framework.
Create a test scenario where a signed zone is in multiple views and
then a key may be purged. This is a bug case where the key files are
removed by one view and then the other view starts complaining.
Add a zone using DS records that embed the private algorithm
identifier in the digest field. There are 2 DS record for an
unsupported DNSSEC algorithm one of which that doesn't have a
matching DNSKEY. This zone should validate as insecure as the
validator can establish that both DS records are for unsupported
DNSSEC algorithms.
There are 4 tests:
1) a zone using a known private OID. Validations should succeed
and return AD=1.
2) a zone using an unknown private OID. Validation should succeed
and return AD=0 as the DS to DNSKEY has provably unsupported
algorithm.
3) a zone using a known private OID and an extra DS record. Validation
should succeed as there is DS to DNSKEY with a known algorithm
linkage.
4) a zone using an unknown private OID and an extra DS record.
Validation should fail as only one of the DS records can be matched
to a provable unknown algorithm. The algorithm of the second DS
is indeterminate.
Use the existing RSASHA256 and RSASHA512 implementation to provide
working PRIVATEOID example implementations. We are using the OID
values normally associated with RSASHA256 (1.2.840.113549.1.1.11)
and RSASHA512 (1.2.840.113549.1.1.13).
Add support for proposed DS digest types that encode the private
algorithm identifier at the start of the DS digest as is done for
DNSKEY and RRSIG. This allows a DS record to identify the specific
DNSSEC algorithm, rather than a set of algorithms, when the algorithm
field is set to PRIVATEDNS or PRIVATEOID.
- dns_zone_cdscheck() has been extended to extract the key algorithms
from DNSKEY data when the CDS algorithm is PRIVATEOID or PRIVATEDNS.
- dns_zone_signwithkey() has been extended to support signing with
PRIVATEDNS and PRIVATEOID algorithms. The signing record (type 65534)
added at the zone apex to indicate the current state of automatic zone
signing can now contain an additional two-byte field for the DST
algorithm value, when the DNS secalg value isn't enough information.
- When the algorithm value for a DNSSEC key is set to PRIVATEOID
or PRIVATEDNS, that's a placeholder value indicating that the
real algorithm identifier is encoded into the key or signature
data. That means the DNSKEY algorithm value and the DST algorithm
value may not be identical, so we must now add environment variables
DEFAULT_ALGORITHM_DST_NUMBER, ALTERNATIVE_ALGORITHM_DST_NUMBER
and DISABLED_ALGORITHM_DST_NUMBER to the test suite, with support
for mapping from DST algorithm value to PRIVATEDNS or PRIVATEOID.
- Some test cases use RRSIGs that have been modified to force
validation to fail. When making those modifications, we now
preserve the first part of the signature, so that PRIVATEDNS and
PRIVATEOID algorithm identifier values will still work. (This
assumes that the identifiers are short and fit into the first
base64 block.)
When going insecure, we publish CDS and CDNSKEY DELETE records. Update
the check_apex function to test this.
Also, skip some tests in the 'check_rollover_step()' function. If
we change the DNSSEC Policy, keys that no longer match the policy will
be retired. When this exactly happens is hard to determine, as it
happens on the reconfigure. So for these tests, we skip the key timing
metadata checks.
Also, the zone becomes unsigned, so don't call 'check_zone_is_signed'
in those cases.
These test cases involve a reconfiguration. The first one is a zone
that changes from dynamic to inline-signing. The others are tests that
key lifetimes are updated correctly after changing them.
move the "makejournal" tool from bin/tests/system to bin/tools
and rename it to "named-makejournal". add a man page. update
tests to use the new file location.
check that `delv +ns` sends iterative queries over both address
families when -4 and -6 are not used, and suppresses queries
appropriately when they are.
Meson is a modern build system that has seen a rise in adoption and some
version of it is available in almost every platform supported.
Compared to automake, meson has the following advantages:
* Meson provides a significant boost to the build and configuration time
by better exploiting parallelism.
* Meson is subjectively considered to be better in readability.
These merits alone justify experimenting with meson as a way of
improving development time and ergonomics. However, there are some
compromises to ensure the transition goes relatively smooth:
* The system tests currently rely on various files within the source
directory. Changing this requirement is a non-trivial task that can't
be currently justified. Currently the last compiled build directory
writes into the source tree which is in turn used by pytest.
* The minimum version supported has been fixed at 0.61. Increasing this
value will require choosing a baseline of distributions that can
package with meson. On the contrary, there will likely be an attempt
to decrease this value to ensure almost universal support for building
BIND 9 with meson.
A "template" statement can contain the same configuration clauses
as a "zone" statement. A "zone" statement can now reference a
template, and all the clauses in that template will be used as
default values for the zone. For example:
template primary {
type primary;
file "$name.db";
initial-file "primary.db";
};
zone example.com {
template primary;
file "different-name.db"; // overrides the template
};