the default value of dnssec-validation is 'auto', which causes
a server to send a key refresh query to the root zone when starting
up. this is undesirable behavior in system tests, so this commit
sets dnssec-validation to either 'yes' or 'no' in all tests where
it had not previously been set.
this change had the mostly-harmless side effect of changing the cached
trust level of unvalidated answer data from 'answer' to 'authanswer',
which caused a few test cases in which dumped cache data was examined in
the serve-stale system test to fail. those test cases have now been
updated to expect 'authanswer'.
Previously, the first check silently failed, as 454 is apparently (in my
local setup) the minimum output size for the dnstap output, rather than
470 which the test was expecting. Effectively, the check served as a 5
second sleep rather than waiting for the proper file size.
Additionally, check the expected file sizes and fail if expectations
aren't met.
The log message is supposed to contain the zone name which was
erroneously omitted, but didn't pop up during tests, since return code
was silently ignored.
Now it actually waits for the proper log message rather than being an
equivalent of 3 second sleep (which was also sufficient to make the test
pass, thus we detected no failure).
In HTTP/1.0 and HTTP/1.1, RFC 9112 section 9.6 says the last response
in a connection should include a `Connection: close` header, but the
statschannel server omitted it.
In an HTTP/1.0 response, the statschannel server can sometimes send a
`Connection: keep-alive` header when it is about to close the
connection. There are two ways:
If the first request on a connection is keep-alive and the second
request is not, then _both_ responses have `Connection: keep-alive`
but the connection is (correctly) closed after the second response.
If a single request contains
Connection: close
Connection: keep-alive
then RFC 9112 section 9.3 says the keep-alive header is ignored, but
the statschannel sends a spurious keep-alive in its response, though
it correctly closes the connection.
To fix these bugs, make it more clear that the `httpd->flags` are part
of the per-request-response state. The Connection: flags are now
described in terms of the effect they have instead of what causes them
to be set.
The "dns_dnssec_findzonekeys2" log message is a leftover from when that
was the name of the function. Rename to match the current name of the
function.
When we add DNSKEY records via dynamic update, this should no longer
trigger signing the zone with these keys. This currently happens when
'find_zone_keys()' looks up the keys by inspecting the DNSKEY RRset,
then attempting to read the corresponding key files.
Add checks that inspect the logs whether an attempt to read the key
files for the newly added keys was done (and failed because these files
are not available).
The purpose of the check is to verify the server has survived the
previous barrage of queries. This is done by sending a query and
checking we get a NOERROR response back.
Previously, that query could've been affected by a servfail cache - the
server would return a SERVFAIL answer, thus failing the check, despite
being up and running. Use version.bind txt ch query to avoid the
interference of servfail cache.
Prepare the fetchlimit system test for adding a clients-per-query
check. Change some functions and commands to accept a destination
NS IP address instead of using the hardcoded 10.53.0.3.
The get_core_dumps.sh script couldn't find and process core files of
out-of-tree configurations because it looked for them in the source
instead of the build directory.
Add a test case where when priming the cache with a slow authoritative
resolver, the stale-answer-client-timeout option should not return
a delegation to the client (it should wait until an applicable answer
is found, if no entry is found in the cache).
The check for 'would limit' log message is triggered by sending at least
three messages within one second. However, in extremely slow conditions
(currently when running with clang+TSAN in CI), the individual queries
might take too much time to send enough of them within one second.
Since this is a pretty rare condition, let's just silently skip this
test in environments where a single query takes more than 500 ms, since
there's no way to perform the check under such conditions.
Closes#4082
The 'update-nsec3.example' requires to be DNSSEC maintained via
dynamic update. Commit 03b22983cd20cec51ad8b9f25f2e7d0e472dc79c adds
checks to make sure the raw zone is not signed. So the test case neesd
to be updated to allow for DNSSEC maintenance.
A zone in multisigner model 2 should also be possible to remove
previously added DNSKEY, CDS and CDNSKEY records from the zone operated
by the other provider.
Add a test case where updates are being made against a hidden primary
and two bump in the wire signers (the providers in the multisigner
model) serve the zone.
The test covers the same cases as for two primary providers that is:
- Add DNSKEY
- Remove (previously added) DNSKEY
- Add CDNSKEY
- Remove (previously added) CDNSKEY
- Add CDS
- Remove (previously added) CDS
A zone in multisigner model 2 should also be possible to publish the
CDS and CDNSKEY records from their KSK into the zone operated by the
other provider.
Add a new system test to test multisigner model use cases. This
initial test just tests a small part of the model 2, and uses two
providers for the same zone, ns3 and ns4, each with their own unique
key set. This commit tests that each provider can import their ZSK
of the other provider into their DNSKEY RRset, using dynamic update.
Both providers use dnssec-policy, ns3 applies the DNSSEC records
directly, while ns4 uses inline-signing.
The check which attempts to forward dynamic update to a dead primary may
trigger a timing issue #4080. For some reason, this has manifested under
the pytest runner, while the test still passes with the legacy runner.
Move the dead primary check closer to the end of the test to avoid
hitting this issue before we have a proper fix.
The module-level logger has a handler that writes into a temporary
directory. Ensure the logging output is flushed and the handler is
closed before attempting to remove this temporary directory.
Previously, run.sh tried to use pytest's -k option for test selection.
The downside was that this filter expression matched any test case with
the given substring, rather than executing a system test suite with the
given name.
The run.sh has been rewritten to invoke pytest from a system test
directory instead. This behaves more consistently with the run.sh from
legacy system test framework.
run.sh is now also a shell script to avoid confusion regarding its
file extension.
EL8+ systems declare "which" function using environment variables in the
/etc/profile.d/which2.sh file. Because of our suboptimal environment
variable detection, which is required in order to support the legacy
runner, these variables are picked up by the pytest runner.
If subprocesses are spawned with these environment variables set, it
will cause the following issue when they spawn yet another subprocess:
/bin/sh: which: line 1: syntax error: unexpected end of file
/bin/sh: error importing function definition for `which'
Instantiate a new logger that is used during pytest initialization /
configuration. This logging isn't handled by pytest itself, since it
happens outside of any tests or fixtures.
Root logger can't be reused for this purpose, because that would
duplicate the logs. Instead, create a conftest-specific logger for this
purpose.
Unfortunately, this introduces another log file,
pytest.conftest.log.txt, which contains only the logging from pytest
initialization. However, unless one is debugging the runner /
environment, there should be no need to investigate this file.
In order to take the most advantage of parallel execution of tests,
ensure certain long running tests are scheduled first.
The list of tests considered long-running was created empirically. In
addition to the test run time, its position in the default
(alphabetical) ordering was also taken into account.
The logger fixture is provided as a test-level logging facility which
can be easily passed to tests to enable capturing and/or displaying
messages from tests written in Python.
While this works optimally with the pytest runner, messages on INFO
level or above will also be visible when using the legacy runner.
The test_zone_timers_secondary_json() and
test_zone_timers_secondary_xml() tests are affected by issue #3983. Due
to the way tests are run, they are only affected when executing them
with the pytest runner.
Strict mode is set for pytest runner, as it always fails there. The
strict mode ensures we'll catch the change when the it starts passing
once the underlying issue is fixed. It can't be set for the legacy
runner, since the test (incorrectly) passes there.
Related #3983
If a test fails with an assertion failure or exception, its content
along with traceback is displayed in pytest output. This information
should be preserved in the test-specific logger for a given system test
to make it easier to debug test failures.
In order to avoid issues with decoding/encoding env variables due to
different encodings on different systems, deal with the environment
variables directly as bytes.
The loadscope setting is required for parallel execution of our system
tests using pytest. The option ensure that all tests within a single
(module) scope will be assigned to the same worker.
This is neccessary because the worker sets up the nameservers for all
the tests within a module scope. If tests from the same module would be
assigned to different workers, then the setup could happen multiple
times, causing a race condition. This happens because each module uses
deterministic port numbers for the nameservers.
Utilize developers' muscle memory to incentivize using the pytest runner
instead of the legacy one. The script also serves as basic examples of
how to run the pyest command to achieve the same results as the legacy
runner.
Invoking pytest directly should be the end goal, since it offers many
potentially useful options (refer to pytest --help).
In order to run the shell system tests, the pytest runner has to pick
them up somehow. Adding an extra python file with a single function
for the shell tests for each system test proved to be the most
compatible way of running the shell tests across older pytest/xdist
versions.
Modify the legacy run.sh script to ignore these pytest-runner specific
glue files when executing tests written in pytest.
Special care needs to be taken to support older pytest / xdist versions.
The target versions are what is available in EL8, since that seems to
have the oldest versions that can be reasonably supported.
When an issue occurs inside a fixture (e.g. servers fail to start/stop),
the test result won't be detected as failed, but rather an error will be
thrown.
To ensure the tempdir is kept even if the test itself passes but the
system_test() fixture throws an error, a different mechanism is needed.
At the start of the critical test setup section, note that the fixture
hasn't finished yet. When this is detected in the system_test_dir()
fixture, it is recognized as error in test setup/teardown and the temp
directory is kept.
This may seem cumbersome, because it is. It's basically a workaround for
the way pytest handles fixtures and test errors in general.