An obscured DNSKEY RRset at a delegation was incorrectly added to
the NSEC/NSEC3 type bit map leading to zone verification failures.
This adds such a RRset to the test zone.
To avoid any escaping issues or messing with a language-specific format
when the variable has to be parsed, create a dedicated file for each
variable that is obtained from autoconf.
Make sure all initialization takes place in isctest.vars.__init__ and
export the initial env vars there. Remove the no longer needed env
fixture and use os.environ instead.
If OPENSSL_CONF is exported as an empty string, it will cause issues on
rhel9fips. Allow the environment variables to be set and exported, but
make sure to only export them if they have been set by the user.
The openssl config needs to be parsed for some tests that use SoftHSM2.
Rewrite the parsing to python and ensure the required variables are
properly set test-wide.
Provide a single point of access to all the variables used by tests. Use
a custom dict-like structure to access the underlying data without
making a copy. This allows the individual modules to update the contents
at runtime, which is used for some variables.
While this isn't required for pytest operation and execution of the
system test suite, it can be handy to allow test script development and
debugging. Especially setup scripts often source conf.sh and expect
environment variables to be loaded. If these scripts are executed
stand-alone, the environment variables need to be loaded from the python
package.
Remove conf.sh.in and move the environment variables into isctest/vars
python package. This enabled the removal of an ugly pytest hack which
loaded and parsed these variables from the environment.
qp and rbtdb produce stylistically different backup files. This
was causing the xferquota system test to fail. This has been
addressed by making the test independent of the stylistic differences.
This test is outdated because it tested the 'sig-validity-interval'
option that has been replaced by dnssec-policy's 'signatures-validity',
'signatures-refresh', and 'signatures-jitter' options.
Nevertheless, it tests if the jitter is spread correctly.
Update the test to make use of 'signatures-jitter', set the value
to 1 day (meaning resign in 499 days since 'signatures-validity' is
set to 500 days).
Note that this previously changed erroneously the refresh value to
449 days (should have been 499 days, but that is not allowed by
checkconf, since it is above 90% of 'signatures-validity').
After we have changed the maximum allowed iterations to 51 for signing,
the NSEC3 chain has changed and requires one more NSEC to be returned
in the answer (plus corresponding RRSIG). So the expected number or
records in the authority section is now 8.
If the key is offline and the keymgr runs, it will treat it as a missing key,
and generate a new key (according to the policy). Fix the test by putting
back the KSK temporarily when we run 'rndc loadkeys'.
1. When generating keys, don't set timing metadata. Otherwise keys
are considered to be in use and won't be selected when dnssec-policy
starts a new key rollover.
2. Add an extra check to make sure the new ZSK (zsk2) is prepublished.
Also add a check to make sure it has become active.
3. When using dnssec-settime, add -s to also write to key state files.
The config was recently modified to ensure ns4 won't leak any queries to
root servers. However, the test wasn't executed and it turns out the way
this was handled actually broke the test case. Add our custom root hint
to both of the views to ensure the test can still pass without leaking
any queries.
An RPZ response's SOA record TTL is set to 1 instead of the SOA TTL,
a boolean value is passed on to query_addsoa, which is supposed to be
a TTL value. I don't see what value is appropriate to be used for
overriding, so we will pass UINT32_MAX.
Creating the KSR happens on the "ZSK side". The KSK is offline and while
the public key and state file may be present, draft-icann-dnssec-keymgmt-01.txt
suggest that the KSR only contains ZSKs.
This is also what knot dns does, so it would also be in the spirit of
interoperability.
The final line in a KSR ";; KeySigningRequest generated at ..." was
missing the version number, that has now been fixed.
Thanks Libor Peltan for reporting.
the previous commit introduced a possible race in getsigningtime()
where the rdataset header could change between being found on the
heap and being bound.
getsigningtime() now looks at the first element of the heap, gathers the
locknum, locks the respective lock, and retrieves the header from the
heap again. If the locknum has changed, it will rinse and repeat.
Theoretically, this could spin forever, but practically, it almost never
will as the heap changes on the zone are very rare.
we simplify matters further by changing the dns_db_getsigningtime()
API call. instead of passing back a bound rdataset, we pass back the
information the caller actually needed: the resigning time, owner name
and type of the rdataset that was first on the heap.
Add test cases for the 'request' command. Reuse the earlier
pregenerated ZSKs. We also need to set up some KSK files, that can
be done with 'dnssec-keygen -k <policy> -fK' now.
The 'check_keys()' function is adjusted such that the expected active
time of the successor key is set to the inactive time of the
predecessor. Some additional information is saved to make 'request'
testing easier.
Add a system test for testing dnssec-ksr, initally for the keygen
command. This should be able to create or select key files given a
DNSSEC policy and a time window.
Introduce a new DNSSEC tool, dnssec-ksr, for creating signed key
response (SKR) files, given one or more key signing requests (KSRs).
For now it is just a dummy tool, but the future purpose of this utility
is to pregenerate ZSKs and signed RRsets for DNSKEY, CDNSKEY, and CDS
for a given period that a KSK is to be offline.
Check that RFC 1918 leak detection does not trigger an assertion
when nxdomain redirection is enabled in the server but not for the
RFC 1918 reverse namespace.
The condition in prereq.sh which attempts to match two string uses
integer equality operation. This results in an error, causing the
enginepkcs11 test to always be skipped. Use = operator for the string
comparison instead.
The autosign test uses sleep in many cases to wait for something to
happen. This inevitably leads to an instability that manifests in our
CI. Allow an automatic rerun of the test to improve its stability.
Variable assignment when calling subroutines might not be portable.
Notably, it doesn't work with FreeBSD shell, where the value of HOME
would be ignored in this case.
Since the commands are already executed in a subshell, export the HOME
variable to ensure it is properly handled in all shells.
Initializing the conftest logging upon importing the isctest package
isn't practical when there are standalone pieces which can be used
outside of the testing framework, such as the asyncdnsserver module.
Implement a new Python class, AsyncDnsServer, which can be used by
ans.py scripts placed in ansX/ system test subdirectories. This enables
conveniently starting a feature-limited, non-standards-compliant, custom
DNS server instance. It can read and serve zone files, but it is also
able to evaluate any user-provided query-processing logic, allowing
query responses to be changed, delayed, or dropped altogether. These
are all actions commonly taken by custom DNS servers written in Python
that are used in BIND 9 system tests. Having a single "base"
implementation of such a custom DNS server reduces code duplication,
improving test maintainability.
Co-authored-by: Tom Krizek <tkrizek@isc.org>
Be stricter in durations that are accepted. Basically we accept ISO 8601
formats, but fail to detect garbage after the integers in such strings.
For example, 'P7.5D' will be treated as 7 days. Pass 'endptr' to
'strtoll' and check if the endptr is at the correct suffix.
Add a regression test case for the scenario where a secure chain of
trust includes an inactive KSK, that is a KSK that is not signing the
DNSKEY RRset.
now that "qpzone" databases are available for use in zones, we no
longer need to retain the zone semantics in the "qp" database.
all zone-specific code has been removed from QPDB, and "configure
--with-zonedb" once again takes two values, rbt and qp.
some database API methods that are never used with a cache have
been removed from qpdb.c and qp-cachedb.c; these include newversion,
closeversion, subtractrdataset, and nodefullname.
add database API methods needed for loading rdatasets into memory
(currentversion, beginload, endload), plus the methods used by
zone_postload() for zone consistency checks (getoriginnode, find,
findnode, findrdataset, attachnode, detachnode, deletedata).
the QP trie doesn't support the find callback mechanism available
in dns_rbt_findnode() which allows examination of intermediate nodes
while searching, so the detection of wildcard and delegation nodes
is now done by scanning QP chains after calling dns_qp_lookup().
Note that the lookup in previous_closest_nsec() cannot return
ISC_R_NOTFOUND. In RBTDB, we checked for this return value and
ovewrote the result with ISC_R_NOMORE if it occurred. In the
qpzone implementation, we insist that this return value cannot happen.
dns_qp_lookup() would only return ISC_R_NOTFOUND if we asked for a
name outside the zone's authoritative domain, and we never do that
when looking up a predecessor NSEC record.
named-checkzone is now able to load a zone and check it for errors,
but cannot dump it.
by default, QPDB is the database used by named and all tools and
unit tests. the old default of RBTDB can now be restored by using
"configure --with-zonedb=rbt --with-cachedb=rbt".
some tests have been fixed so they will work correctly with either
database.
CHANGES and release notes have been updated to reflect this change.
The change from RBT to QP has changed the contents of generated zone
files slightly: node names are now always absolute, so instead of using
$ORIGIN and relative names, generated zone files use full names for all
records.
This caused a failure in the xferquota system test, which was looking
for a relative name in secondary zone files. Replace the string
matching with a regular expression to fix the test.
the dyndb test requires a mechanism to retrieve the name associated
with a database node, and since the database no longer uses RBT for
its underlying storage, dns_rbt_fullnamefromnode() doesn't work.
addressed this by adding dns_db_nodefullname() to the database API.
the change from RBT to QP has changed the contents of generated zone
files slightly: node names are now always absolute, so instead of using
$ORIGIN and relative names, generated zone files use full names for all
records.
this caused a failure in the stub system test, which was grepping for a
relative name in a dumped zone file. using "masterfile-style full" makes
the test pass regardless of the database being used.
as a side effect of the switch from RBT to QBDB, NSEC3 records
are no longer created for empty non-terminal nodes when the
node only contains insecure delegations in an opt-out range.
such NSEC3 records are optional according to RFC 5155 (and,
for example, they are not created by dnssec-signzone), but they were
previously created by named, as a harmless side effect of the RBT
structure, which contains empty internal nodes that can be reached
by a DB iterator. these nodes are not present in the QPDB, so
NSEC3 records are not created unless they're actually required.
the autosign system test contained a test case (added in commit
ad91a70d as part of GL #4027) that checked whether ENT NSEC3
records were deleted when the delegations under the ENT removed.
this test no longer passes, because the NSEC3's are not created
in the first place, and therefore cannot be removed.
rather than "fix" the QPDB to add unnecessary NSEC3 records, this
commit instead revises the test to check for removal of ENT NSEC3
records when *not* using opt-out.
replace the string "rbt" throughout BIND with "qp" so that
qpdb databases will be used by default instead of rbtdb.
rbtdb databases can still be used by specifying "database rbt;"
in a zone statement.
This is a regresssion test for GL #4621 where the NODATA responses
are SOA records that match the QNAME rather than the zone name. In
particular for NS queries.
the RRL test included a test case that tried to start named with
a broken configuration. the same error could be found with
named-checkconf, so it should have been tested in the checkconf
system test.
When the first parametrized test takes a bit longer than usual, the zone
transfer in ns3 may succeed before the second parametrized test is even
started, and then watch_log_from_here() won't find the "Transfer status:
success" message in the named log. Using watch_log_from_start() instead
makes sure the test is more stable.
This was mostly an artifact to tell which log lines belong to which test
from the time when the test output could be all mingled together. Now
this info is reduntant, because the pytest logger already includes both
the system test name, and the specific test.
Unify the different loggers (conftest, module, test) into a single
interface. Remove the need to select the proper logger by automatically
selecting the most-specific logger currently available.
This also removes the need to use the logger/mlogger fixtures manually
and pass these around. This was especially annoying and unwieldy when
splitting the test cases into functions, because logger had to always be
passed around. Instead, it is now possible to use the
isctest.log.(debug,info,warning,error) functions.
Preparation for further logging improvements - keep the watchlog
contents in a separate module inside isctest.log. Export the names in
the log package so the imports don't change for the users of these
classes.
This code has probably been accidentally added during some rebase. The
actual RNDCExecutor and related classes are in isctest/rndc.py. Remove
the duplicated and unused code from isctest/log.py, as it doesn't belong
there.
We were missing a test where a single owner name would have multiple
types with a different case. The generated RRSIGs and NSEC records will
then have different case than the signed records and message parser have
to cope with that and treat everything as the same owner.
When running `make check` on a platform which has older (but still
supported) pytest, e.g. 3.4.2 on EL8, the junit to trs conversion would
fail because the junit format has different structure. Make the junit
XML processing more lenient to support both the older and newer junit
XML formats.
The ditch.pl script is used to generate burst traffic without waiting
for the responses. When running other tests in parallel, this can result
in a ephemeral port clash, since the ditch.pl process closes the socket
immediately. In rare occasions when the message ID also clashes with
other tests' queries, it might result in an UnexpectedSource error from
dnspython.
Use a dedicated port EXTRAPORT8 which is reserved for each test as a
source port for the burst traffic.
Stop the cname_and_other_data processing if we already know that the
result is true. Also, we know that CNAME will be placed in the priority
headers, so we can stop looking for CNAME if we haven't found CNAME and
we are past the priority headers.
Currently we test the incoming zone transfers data in the statistics
channel by retransfering the zones in slow mode and capturing the XML
and JSON outputs in the meantime to check their validity. Add a new
transfer to the test, and check that the XML and JSON files correctly
indicate that we have 3 retransfers and 1 new (first time) transfer.
Explicitly use an empty 'trust-anchors' statement in the system
tests where it was used implicitly before.
In resolver/ns5/named.conf.in use the trust anchor in 'trusted.conf',
which was supposed to be used there.
Add checks into the 'checkconf' system test to make sure that the
'dnssec-validation yes' option fails without configured trusted
anchors, and succeeds with configured non-empty, as well as empty
trusted anchors.
the 'low', 'high' and 'discount' parameters to 'fetch-quota-param'
are meant to be ratios with values between zero and one, but higher
values can be assigned. this could potentially lead to an assertion
in maybe_adjust_quota().
In the second test we are looking for key files and extract the key
id numbers. Because keys can be in different directories, we needed
to change the maxdepth when searching for keys.
For the second kasp system test, check that 'dnssec-keygen -k' (default
policy) creates valid files, the 'get_keyids' returned more than one
keytag, namely the ones that are inside the keys/ directory, that were
created for the predecessor test, check that 'dnssec-keygen -k'
(configuredd policy) creates valid files.
This caused the system test to spew out errors that key files were
missing (we were looking for key files in the current directory, but
when looking for key id numbers we included the keys/ directory). It
could also cause the next test to fail, check that 'dnssec-settime' by
default does not edit key state file, because the STATE_FILE environment
variable was overwritten with the key file path of one of the keys that
were created with the configured policy.
We fix this by adjusting the maxdepth for the test in question. Other
tests don't need adjusting because they use unique zone names.
The name "uri" was considered to be too generic and could potentially
clash with a future URI configuration option. Renamed to "pkcs11-uri".
Note that this option name was also preferred over "pkcs11uri", the
dash is considered to be the more clearer form.
Add test cases for zones in different views that are using PKCS#11
tokens to store its keys.
If it is using the same DNSSEC policy, only one PKCS#11 token should be
created and the same key should be used for the zone in both views.
If it is using a different DNSSEC policy, multiple PKCS#11 token should
be created and each view should use their respective key.
This log may still occur if there is a DNSKEY in the unsigned zone.
This may happen in a multi-signer setup for example.
Ideally this should not log a warning, but that requires looking up
keys a different way (by searching for key files only). However, that
requires adapting a bunch of system tests, and is out of scope for now.
- Shell function body should be in between curly braces.
- Some erroneous '|| return 1' are replaced with '|| ret=1'.
- Fix a variable name (was 'ret', should be '_ret').
- Clean up when setting up a new test.
The bullseye and bookworm images are not set up with pkcs11-provider,
so we need to add an additional prerequisite for running the
pkcs11engine test. Check the path of OPENSSL_CONF.
Move dns_dnssec_findzonekeys from the dnssec.{c,h} source code to
zone.{c,h} (the header file already commented that this should be done
inside dns_zone_t).
Alter the function in such a way, that keys are searched for in the
key stores if a 'dnssec-policy' (kasp) is attached to the zone,
otherwise keep using the zone's key-directory.
When using dnssec-policy with dnssec-keygen in combination with setting
the key-directory on the command line, the commandline argument takes
priority over the key-directory from the default named.conf.
Add a test case where dnssec-policy uses key stores with a directory
other than the zone's key-directory.
This requires changing the kasp shell script to take into account that
keys can be in different directories. When looking for keys, the
'find' command now takes a maxdepth of 3 to also look for keys in
subdirectories. Note this maxdepth value is arbitrary, the added
'keystore.kasp' test only requires a maxdepth of 2.
Because of this change, the dnssec-keygen tests no longer work because
they are for the same zone (although different directories). Change
the test to use a different zone ('kasp2' instead of 'kasp').
Add cases for each algorithm to test the interaction between
dnssec-policy and engine_pkcs11. Ensure that named creates keys on
startup.
Also test dnssec-keygen when using a dnssec-policy with a PKCS#11
based key-store.
The check for printing zone list failed because of these additional
lines in the output:
good.conf:22: dnssec-policy: key algorithm 13 has predefined length; \
ignoring length value 256
I am not sure why this failure hasn't happened before already.
Similar to key-directory, check for zones in different views and
different key and signing policies. Zones must be using different key
directories to store key files on disk.
Now that a key directory can be linked with a dnssec-policy key, the
'keydirexist' checking needs to be reshuffled.
Add tests for bad configuration examples, named-checkconf should catch
those. Also add test cases for a mix of key-directory and key-store
directory.
Similar to key-directory, check if the key-store directory exists and
if it is an actual directory.
This commit fixes an accidental test bug in checkconf where if
the "warn key-dir" test failed, the result was ignored.
Add checkconf check to ensure that the used key-store in the keys
section exists. Error if that is not the case. We also don't allow
the special keyword 'key-directory' as that is internally used to
signal that the zone's key-directory should be used.
Add new configuration for setting key stores. The new 'key-store'
statement allows users to configure key store backends. These can be
of type 'file' (that works the same as 'key-directory') or of type
'pkcs11'. In the latter case, keys should be stored in a HSM that is
accessible through a PKCS#11 interface.
Keys configured within 'dnssec-policy' can now also use the 'key-store'
option to set a specific key store.
Update the checkconf test to accomodate for the new configuration.
These tests have ns1 configured as a mock root server. Make sure it is
used in all config files of those tests, otherwise some queries could
leak to root nameservers.
Some tests don't have a mock root server configured, because they don't
need one. However, these tests might still leak queries to actual name
servers. Add a shared root hints file which can serve as a blackhole for
these queries.
As dnsrps and native test cases have been properly split up, the
ckdnsrps.sh script is no longer used anywhere, as the logic for
selecting these test cases is handled by pytest.
Previously, dnsrps test was executed as an optional part of the rpz and
rpzrecurse system tests. This was conceptually problematic, as the test
took the responsibility of running parts of the test framework -
cleaning files and setting up servers again.
Instead, allow these tests to execute either the native variant, or the
dnsrps one. To ensure the same test coverage, trigger both of these
variants as separate test cases from pytest.
We need to skip some portions the system test in FIPS mode as some of
the algorithms used in the test are not available when using the FIPS
mode (e.g. TLS_CHACHA20_POLY1305_SHA256)
The older version of the code was reporting that listeners are going
to be of the same type after reconfiguration when switching from DoT
to HTTPS listener, making BIND abort its executions.
That was happening due to the flaw in logic due to which the code
could consider a current listener and a configuration for the new one
to be of the same type (DoT) even when the new listener entry is
explicitly marked as HTTP.
The checks for PROXY in between the configuration were masking that
behaviour, but when porting it to 9.18 (when there is no PROXY
support), the behaviour was exposed.
Now the code mirrors the logic in 'interface_setup()' closely (as it
was meant to).
This commit adds a system test that helps to verify that changing a
listener transport by editing "listen-on" statements before
reconfiguration works as expected.
This commit adds a new system test which verifies that using the
'cipher-suites' option actually works as expected (as well as adds
first TLSv1.3 specific tests).
The "exceeded time limit waiting for literal 'too many DNS UPDATEs
queued' in ns1/named.run" is prone to fail due to a timing issue.
Despite out efforts to stabilize it, the check still often fails on
FreeBSD in our CI. Allow the test to be re-run on this platform.
This file was missing explicit dnssec-validation. Seems like it was
missed in our previous efforts, probably because of the different
filename / extension. Rename it to end with *.in to reflect that it is a
template file used by copy_setports.
If the dnskey-ttl in the dnssec-policy doesn't match the DNSKEY's
ttl then the DNSKEY, CDNSKEY and CDS rrset should be updated by
named to reflect the expressed policy. Check that named does this
by creating a zone with a TTL that does not match the policy's TTL
and check that it is correctly updated.
In Net::DNS 1.42 $ns->main_loop no longer loops. Use current methods
for starting the server, wait for SIGTERM then cleanup child processes
using $ns->stop_server(), then remove the pid file.
Reconfiguring named using RNDC is a common action in BIND 9 system
tests. It involves sending the "reconfig" RNDC command to a named
instance and waiting until it is fully processed. Add a reconfigure()
method to the NamedInstance class in order to simplify and standardize
named reconfiguration using RNDC in Python-based system tests.
TODO:
- full reconfiguration support (w/templating *.in files)
- add an "rndc null" before every reconfiguration to show which file
is used (NamedInstance.add_mark_to_log() as it may be generically
useful?)
The "checkds" system test contains a lot of duplicated code despite
carrying out the same set of actions for every tested scenario
(zone_check() → wait for logs to appear → keystate_check()). Extract
the parts of the code shared between all tests into a new function,
test_checkds(), and use pytest's test parametrization capabilities to
pass distinct sets of test parameters to this new function, in an
attempt to cleanly separate the fixed parts of this system test from the
variable ones. Replace format() calls with f-strings.
The "checkds" system test only uses dns.resolver.Resolver objects to
access their 'nameservers' and 'port' attributes. Instances of the
NamedInstance class also expose that information via their attributes,
so only pass NamedInstance objects around instead of needlessly
depending on dns.resolver.Resolver.
Make log file watching in Python-based system tests consistent by
employing the helper Python classes designed for that purpose. Drop the
custom code currently used.
Waiting for a specific log line to appear in a named.run file is a
common action in BIND 9 system tests. Implement a set of Python classes
which intend to simplify and standardize this task in Python-based
system tests.
Co-authored-by: Štěpán Balážik <stepan@isc.org>
The "addzone" and "shutdown" system tests currently invoke rndc using
test-specific helper code. Rework the relevant bits of those tests so
that they use the helper classes from bin/tests/system/isctest.py.
Controlling named instances using RNDC is a common action in BIND 9
system tests. However, there is currently no standardized way of doing
that from Python-based system tests, which leads to code duplication.
Add a set of Python classes and pytest fixtures which intend to simplify
and standardize use of RNDC in Python-based system tests.
For now, RNDC commands are sent to servers by invoking the rndc binary.
However, a switch to a native Python module able to send RNDC commands
without executing external binaries is expected to happen soon. Even
when that happens, though, having the capability to invoke the rndc
binary (in order to test it) will remain useful. Define a common Python
interface that such "RNDC executors" should implement (RNDCExecutor), in
order to make switching between them convenient.
Co-authored-by: Štěpán Balážik <stepan@isc.org>
When changing the NSEC3 chain, the new NSEC3 chain must be built before
the old NSEC3PARAM is removed. Check each delta in the conversion to
ensure this ordering is met.
When transitioning from NSEC3 to NSEC the NSEC3 must be built before
the NSEC3PARAM is removed. Check each delta in the conversion to
ensure this ordering is met.
When transitioning from NSEC3 to NSEC the added records where not
being signed because the wrong time was being used to determine if
a key should be used or not. Check that these records are actually
signed.
The check_loaded() function compares the zone's loadtime value and
an expected loadtime value, which is based on the zone file's mtime
extracted from the filesystem.
For the secondary zones there may be cases, when the zone file isn't
ready yet before the zone transfer is complete and the zone file is
dumped to the disk, so a so zero value mtime is retrieved.
In such cases wait one second and retry until timeout. Also modify
the affected check to allow a possible difference of the same amount
of seconds as the chosen timeout value.
certain dig options which were deprecated and became nonoperational
several releases ago still had documentation in the dig man page and
warnings printed when they were used: these included +mapped,
+sigchase, +topdown, +unexpected, +trusted-key, and the -i and -n
options. these are now all fatal errors.
another option was described as deprecated in the man page, but
the code to print a warning was never added. it has been added now.
This commit extends the 'doth' system tests with additional secondary
NS instance that reuses the same 'tls' entry for connecting the the
primary to download zones. This configurations were known to crash
secondaries in some cases.
This commit adds a system test suite for PROXYv2. The idea on which it
is based is simple:
1. Firstly we check that 'allow-proxy' and 'allow-proxy-on' (whatever
is using the new 'isc_nmhandle_real_localaddr/peeraddr()') do what
they intended to do.
2. Anything else that needs an interface or peer address (ACL
functionality, for example) is using the old
'isc_nmhandle_localaddr/peeraddr()' - which are now returning
addresses received via PROXY (if any) instead of the real connection
addresses. The beauty of it that we DO NOT need to verify every bit of
the code relying on these functions: whatever works in one place will
work everywhere else, as these were the only functions that allowed
any higher level code to get peer and interface addresses.
This way it is relatively easy to see if PROXYv2 works as intended.
The system tests need to be updated because non-zero iterations are no
longer accepted.
The autosign system test changes its iterations from 1 to 0 in one
test case. This requires the hash to be updated.
The checkconf system test needs to change the iterations in the good
configuration files to 0, and in the bad ones to 1 (any non-zero value
would suffice, but we test the corner case here). Also, the expected
failure message is change, so needs to be adjusted.
The nsec3 system test also needs iteration configuration adjustments.
In addition, the test script no longer needs the ITERATIONS environment
variable.
In the process of updating the system tests, I noticed an error
in the dnssec-policy "nsec3-other", where the salt length in one
configuration file is different than in the other (they need to be
the same). Furthermore, the 'rndc signing -nsec3param' test case
is operated on the zone 'nsec-change.kasp', so is moved so that the
tests on the same zone are grouped together.
Create a utility package for code shared by the python tests. The
utility functions should use reasonable defaults and be split up into
modules according to their functionality.
Ensure assert rewriting is enabled for the modules to get the most
useful output from pytest.
By default, the useful assertion message rewrite is used by pytest for
test modules only. Since another module is imported with shared
functionality, ensure it has pytest's assertion message rewriting
enabled to obtain more debug information in case it fails.
This file is executed outside of pytest with pure python, which doesn't
do any AssertionError message rewriting like pytest. Ensure the assert
messages in this file provide a useful debug message.
Rewrite and reorganize the test documentation to focus on the pytest
runner, omit any mentions of the legacy runner which are no longer
relevant, and mention a few pytest tricks.
The ns2/named-alt3.conf.in config file was removed in
f8e264ba6d. From then on, system test
reports:
sed: can't read ns2/named-alt3.conf.in: No such file or directory"
Drop the last remnant of ns2/named-alt3.conf.in.
when transferring in a non-inline-signing secondary for the first time,
we previously never set the value of zone->loadtime, so it remained
zero. this caused a test failure in the statschannel system test,
and that test case was temporarily disabled. the value is now set
correctly and the test case has been reinstated.
The AES algorithm for DNS cookies was being kept for legacy reasons, and
it can be safely removed in the next major release. Remove both the AES
usage for DNS cookies and the AES implementation itself.
Not every element tagged `skipped` in the JUnitXML tree has to contain
the `type` attribute. An example of that is a test that results in
xpass.
This has been verified with pytest version 7.4.2 and prior.
Add a test case where serve-stale is enabled on a server that also
servers a local authoritative zone.
The particular case tests a lame delegation and checks if falling
back to serving stale data does not attempt to retrieve the query
by recursing from the root down.
The lock-file configuration (both from configuration file and -X
argument to named) has better alternatives nowadays. Modern process
supervisor should be used to ensure that a single named process is
running on a given configuration.
Alternatively, it's possible to wrap the named with flock(1).
All changes in this commit were automated using the command:
shfmt -w -i 2 -ci -bn . $(find . -name "*.sh.in")
By default, only *.sh and files without extension are checked, so
*.sh.in files have to be added additionally. (See mvdan/sh#944)
When named fails to starts due to not being able to obtain
a lock on the lock file that lock file should remain. Check
that the lock file exists before and after the attempt to
start a second instance of named.
We test EDNS requests returning FORMERR where named is expected
to retry without EDNS.
We test EDNS requests returning NOTIMP where named is expected
to fail the transfer as the remote end is not protocol compliant.
The server is expected to retry the transfer using SOA and if
the returned serial is greater than the current serial AXFR.
Check the log that IXFR is request.
Now that inline-signing is ignored when there is no dnssec-policy,
add 'dnssec-policy default;' to the zones when attempting to add them
via 'rndc addzone'.
The 'dynamic-signed-inline-signing.kasp' zone was set up with
the environment variable 'ksktimes', but that should be 'csktimes'
which is set one line above. Since the values are currently the same
the behavior is identical, but of course it should use the correct
variable.
The 'step4.enable-dnssec.autosign' zone was set up twice. This is
unnecessary.
Add a test scenario for a dynamic zone that uses inline-signing which
accidentally has signed the raw version of the zone.
This should not trigger resign scheduling on the raw version of the
zone.
The usage of the newline in the replacement part of the 'sed' call
works in GNU systems, but not in OpenBSD. Use 'awk' instead.
Also use the extended syntax of regular expressions for 'grep', which
is similarly more portable across the supported systems.
'rndc thaw' initiates asynchrous loading of all the zones
similar to 'rndc load'. Wait for the test zone's load to
complete before testing that it is updatable again.
Instead of creating new memory pools for each new dns_message, change
dns_message_create() method to optionally accept externally created
dns_fixedname_t and dns_rdataset_t memory pools. This allows us to
preallocate the memory pools in ns_client and dns_resolver units for the
lifetime of dns_resolver_t and ns_clientmgr_t.
The XFRST_INITIALSOA state in the xfrin module is named like that,
because the first RR in a zone transfer must be SOA. However, the
name of the state is a bit confusing (especially when exposed to
the users with statistics channel), because it can be mistaken with
the refresh SOA request step, which takes place before the zone
transfer starts.
Rename the state to XFRST_ZONEXFRREQUEST (i.e. Zone Transfer Request).
During that step the state machine performs several operations -
establishing a connection, sending a request, and receiving/parsing
the first RR in the answer.
Add two more secondary zones to ns3 to be transferred from ns1,
using its IPv6 address for which the 'tcp-only' is set to 'yes'.
Check the statistics channel's incoming zone transfers information
to confirm that the expected transports were used for each of the
SOA query cases (UDP, TCP, TLS), and also for zone transfers (TCP,
TLS).
This allows the statistics channel to be viewed in a browser while
the transfer is in progress. Also set the transfer format to
one-answer to extend the amount of time the re-transfer takes.
When running the statschannel test on its own, use
<http://10.53.0.3:5304/xml/v3/xfrins> to see the output.
Note: the port is subject to future change.
Use the named -T transferslowly test options to slow down a zone
transfer from the primary server, and test that it's correctly
exposed in the statistics channel of the secondary server, while
it's in-progress.
Apply the semantic patch to catch all the places where we pass 'char' to
the <ctype.h> family of functions (isalpha() and friends, toupper(),
tolower()).
The Unix Domain Sockets support in BIND 9 has been completely disabled
since BIND 9.18 and it has been a fatal error since then. Cleanup the
code and the documentation that suggest that Unix Domain Sockets are
supported.
The previous symlink name convention was prone to name collisions If a
system test contained both a shell test and a pytest module of the same
name (e.g. dnstap test has both tests.sh and tests_dnstap.py), then
these would have the same convenience symlink, which could cause test
setup issues as well as confusion when examining test artifacts.
Update the naming convention to include the full pytest module name.
This results in a slightly more verbose names for shell tests (e.g.
dnstap_sh_dnstap instead of the previous dnstap_dnstap), but it removes
the chance of a collision.
Reorganize individual port fixtures and re-use the ports fixture to
obtain their number. Store it as integer and only cast it to string when
setting it as environment variable.
Remove code fork for legacy runner, reorganize imports and move a
pylint-silencing snippet to the top of the file. The rest of the code
was just unindented.
In order to python system tests, pytest (runner) has to be used
directly. This makes it possible to simplify the pytest runner and make
its behavior simpler and easier to extend.
The legacy runner can still be used to run shell system tests.
The legacy runner no longer uses make check. Ensure the legacy runner
script doesn't interact with that automake target in any way. The legacy
runner script remains available to execute the legacy runner, but there
is no out-of-the box support for running tests in parallel. Other tools
such as xargs can be utilized for that.
Pytest provides JUnit output and uses different exit codes from
Automake. Use the conversion script to interpret the JUnit test results
from python rather than relying on the status code.
It's important to parse the JUnit result file rather than relying on the
exit code from pytest, which has a different meaning. Include a .trs test
result for each test case and set an exit code which is most appropriate
as the aggregate result (e.g. it will be set to 77 (SKIP) if there's at
least one test case that was skipped).
Dependencies for these tests are already checked in prereq.sh - if the
dependencies are missing, these tests will be skipped. The extra
dependency check in Makefile.am is extraneous and only applied for the
legacy test runner.
To allow concurrent invocations of pytest, it is necessary to assign
ports properly to avoid conflicts. In order to do that, pytest needs to
know a complete list of all test modules.
When pytest is invoked from run.sh, the current working directory is the
system test directory. To properly detect other tests, the conftest.py
has to look in the bin/tests/system directory, rather than the current
working directory.
To conform with the expected naming convention, the pytest glue file for
the `allow-query` test should use underscore as the word separator in
the python file name: allow-query/tests_sh_allow_query.py
The _common directory is a special case directory which contains shared
files for other system test directories. Make sure it's tracked in git
and not deleted during temporary directory cleanup.
The old name "common" clashes with the convention of system test
directory naming. It appears as a system test directory, but it only
contains helper files.
To reduce confusion and to allow automatic detection of issues with
possibly missing test files, rename the helper directory to "_common".
The leading underscore indicates the directory is different and the its
name can no longer be confused with regular system test directories.
Reusing TCP connections with dns_dispatch_gettcp() used linear linked
list to lookup existing outgoing TCP connections that could be reused.
Replace the linked list with per-loop cds_lfht hashtable to speedup the
lookups. We use cds_lfht because it allows non-unique node insertion
that we need to check for dispatches in different connection states.
Test resolver-use-dns64 by simulating a connection to an IPv4-only
server through a NAT64.
This test uses EXTRAPORT1 rather than PORT for DNS traffic exchanged
between ns3 and ns4. Both servers also listen on PORT on their IPv4
addresses to support server startup testing in start.pl.
The dnssec-must-be-secure feature was added in the early days of BIND 9
and DNSSEC and it makes sense only as a debugging feature.
Remove the feature to simplify the code.
Use the new isc_mem_c*() calloc-like API for allocations that are
zeroed.
In turn, this also fixes couple of incorrect usage of the ISC_MEM_ZERO
for structures that need to be zeroed explicitly.
There are few places where isc_mem_cput() is used on structures with a
flexible member (or similar).
This allow for the EDNS options EXPIRE and NSID to be sent when
when making requests. The existing controls controlling whether
EDNS is used and whether EXPIRE or NSID are sent are honoured.
Adjust the expected byte counts in the xfer system test to reflect
the EDNS overhead. Adjust the dig call to match named's behavior
(don't set +expire as we are talking to a secondary).
*** CID 464884: Null pointer dereferences (REVERSE_INULL)
/bin/tests/system/dyndb/driver/db.c: 644 in create_db()
638
639 *dbp = (dns_db_t *)sampledb;
640
641 return (ISC_R_SUCCESS);
642
643 cleanup:
CID 464884: Null pointer dereferences (REVERSE_INULL)
Null-checking "sampledb" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
644 if (sampledb != NULL) {
645 if (dns_name_dynamic(&sampledb->common.origin)) {
646 dns_name_free(&sampledb->common.origin, mctx);
647 }
648
649 isc_mem_putanddetach(&sampledb->common.mctx, sampledb,
- Simplify configuration management by deducing SoftHSM module path
from openssl config
- Determine the engine flag (-E) value from openssl config
- Drop unused/unneeded environment variables
- Run pkcs11-provider tests on Debian "sid" ossl3 flavor
The rrl system test has been unstable and producing false positive
results for years (GL #172). Allow the test to be re-run (once) to
reduce the noise it causes.
The reclimit system test has been unstable and producing false positive
results for years (GL #1587). Allow the test to be re-run (once) to
reduce the noise it causes.
The qmin test is inherently unstable. It fails quite often with failure
modes described in GL #904. Allow the pytest runner to re-run the test
up to 3 times to only detect a more persistent and reproducible failures
rather than random noise caused by the nature of the test.
It is better to disable the specific check that causes the test to fail
rather than mark the entire test as xfail, which can mask other issues
which the test is capable of detecting.
Using check_PROGRAMS would postpone compiling the binaries needed by
system tests until `make check` would be called. Since it's preferable
to invoke pytest directly to run the system test suite, compile these
binaries without installing them during `make all` instead by using
noinst_PROGRAMS.
This removes the need to use TESTS= make -e check hack invoked from
pytest to work around this issue.
The command finds all directories in bin/tests/system which contain an
underscore. Underscore indicates either a temporary directory (_tmp_), a
symlink to test artifacts (TESTNAME_MODULENAME), or a python-related
cache. Using underscore for a system test name is invalid and a hyphen
must be used instead.
While it'd be fairly easy to split the function up into smaller ones,
the readability wouldn't be improved in this case. Silence the
suggestions instead.
While temporary directories are useful for test execution to keep
everything clean, they are difficult to work with manually. Create a
symlink for each test artifact directory with a stable and predictable
path. The symlink always either points to the latest artifacts, or is
missing in case the last run succeeded.
Ensure these symlinked directories aren't detected as test suites by the
pytest runner.
In some cases, BIND is not fast enough to fill the send buffer and
manages to answer all queries, contrary to what the test expects.
Repeat the check up to 3 times to limit this test instability.
If the flaky plugin for pytest is available, use its decorator to
support re-running unstable tests. In case the package is missing,
execute the test as usual without attempts to re-run it in case of
failure.
This is mostly intended to increase the test stability in CI. Using a
custom decorator enables us to keep the flaky package as an optional
dependency.
The following files were reported in CI by the legacy system test runner
and prevented job to pass. They should be removed.
$ if git rev-parse > /dev/null 2>&1; then ( ! grep "^I:.*:file.*not removed$" *.log ); fi
autosign.log:I:autosign:file autosign/ns3/kskonly.example.db.jbk not removed
autosign.log:I:autosign:file autosign/ns3/optout.example.db.jbk not removed
autosign.log:I:autosign:file autosign/ns3/reconf.example.db.jbk not removed
masterformat.log:I:masterformat:file masterformat/ns1/signed.db.raw.jbk not removed
masterformat.log:I:masterformat:file masterformat/ns1/signed.db.raw.signed not removed
masterformat.log:I:masterformat:file masterformat/ns1/signed.db.raw.signed.jnl not removed
Don't print an error when the ns*/inactive directory is not
present:
rmdir: ns*/inactive: No such file or directory
Remove nsupdate.out.test file instead of nsupdate.out, as the latter
does not exist.
The dns_dispatchmgr object was only set in the dns_view object making it
prone to use-after-free in the dns_xfrin unit when shutting down named.
Remove dns_view_setdispatchmgr() and optionally pass the dispatchmgr
directly to dns_view_create() when it is attached and not just assigned,
so the dns_dispatchmgr doesn't cease to exist too early.
The dns_view_getdnsdispatchmgr() is now protected by the RCU lock, the
dispatchmgr reference is incremented, so the caller needs to detach from
it, and the function can return NULL in case the dns_view has been
already shut down.
The mkeys system test could fail because root zone was resigned
within the same second as it was previously signed causing reloads
to fail. Add delays to the test to prevent this.
'addr', 'ckresult' and 'drop' should return 0 rather than 1 after
calling 'setret' as the error has been logged and these functions
are not expect to fail.
The setup.pl script has been replaced with static BIND configurations,
and in the course of this change, the unused ns1 server was removed.
This enhancement has greatly improved the overall test's readability.
The shell version of the test was completed only after all DNS zone
updates were sent, even if the BIND server crashed while processing
them, leading to prolonged execution and potential hang in the CI
environment. The Python rewrite of the test ensures that DNS update
tasks finish within five minutes of starting, irrespective of a BIND
crash possibility or DNS zone updates not finishing in time.
Lower the size requirement for the dnstap output file produced during
the "dnstap" system test from 454 to 450 bytes; while files of that size
are not generated in any GitLab CI job, they are in other environments
where the test passes.
The fstrm_capture utility is started in the background during the
"dnstap" system test. Consequently, "rndc dnstap-reopen" and similar
commands may be executed before fstrm_capture starts listening on the
Unix domain socket it is configured to receive dnstap data on. This
results in the dnstap data sent to that socket in the meantime to be
lost; while the fstrm writer thread is able to recover from such a
scenario within a couple of seconds (by reopening the configured dnstap
destination itself), only one write attempt is made for data
successfully queued to the writer thread, so dnstap frames can still be
lost in the process. This may happen during the "dnstap" system test,
leading to the dnstap output file being empty, which in turn causes the
test to fail.
Fix by waiting until fstrm_capture starts listening on the Unix domain
socket it is configured to use before asking named to reopen the
configured dnstap destination. Since various fstrm_capture versions log
different messages when the listening socket is set up, wait for a
common string that works for all fstrm_capture versions released to
date. Add a few extra debug messages indicating test progress and make
the test fail if the expected fstrm_capture log message is not generated
within 10 seconds.
The fstrm_capture.out file is overwritten when the fstrm_capture utility
is restarted during the "dnstap" system test. Use a separate output
file for each fstrm_capture instance to ensure all output produced by
that tool during the "dnstap" system test is preserved for forensic
purposes.
Errors getting transfer statistics from named.run where not detected
as ret was not set to one if there hadn't been a success after looping
for a while.
these options concentrate zone maintenance actions into
bursts for the benefit of servers with intermittent connections.
that's no longer something we really need to optimize.
'HOME=value command' should only change HOME for command but on
some platforms this occasionally sets HOME for the rest of the
test. Explicitly isolate the enviroment change using a sub shell.
When using automated DNSSEC management, it is required that the zone
is dynamic, or that inline-signing is enabled (or both). Update the
checkconf code to also allow inline-signing to be enabled within
dnssec-policy.
Add an option to enable/disable inline-signing inside the
dnssec-policy clause. The existing inline-signing option that is
set in the zone clause takes priority, but if it is omitted, then the
value that is set in dnssec-policy is taken.
The built-in policies use inline-signing.
This means that if you want to use the default policy without
inline-signing you either have to set it explicitly in the zone
clause:
zone "example" {
...
dnssec-policy default;
inline-signing no;
};
Or create a new policy, only overriding the inline-signing option:
dnssec-policy "default-dynamic" {
inline-signing no;
};
zone "example" {
...
dnssec-policy default-dynamic;
};
This also means that if you are going insecure with a dynamic zone,
the built-in "insecure" policy needs to be accompanied with
"inline-signing no;".
When dns_request was canceled via dns_requestmgr_shutdown() the cancel
event would be propagated on different loop (loop 0) than the loop where
request was created on. In turn this would propagate down to isc_netmgr
where we require all the events to be called from the matching isc_loop.
Pin the dns_requests to the loops and ensure that all the events are
called on the associated loop. This in turn allows us to remove the
hashed locks on the requests and change the single .requests list to be
a per-loop list for the request accounting.
Additionally, do some extra cleanup because some race condititions are
now not possible as all events on the dns_request are serialized.
The conditions that trigger the crash:
- a stale record is in cache
- stale-answer-client-timeout is 0
- multiple clients query for the stale record, enough of them to exceed
the recursive-clients quota
- the response from the authoritative is sufficiently delayed so that
recursive-clients quota is exceeded first
The reproducer attempts to simulate this situation. However, it hasn't
proven to be 100 % reproducible, especially in CI. When reproducing
locally, the priming query also seems to sometimes interfere and prevent
the crash. When the reproducer is ran twice, it appears to be more
reliable in reproducing the issue.
The keys directory should be cleaned up in clean.sh. Doing that in the
test itself isn't reliable which may lead to failing mkdir which causes
the test to fail with set -e.
After commit f4eb3ba4, that is part of removing 'auto-dnssec', the
inline system test started to fail in FIPS CI jobs. This is because
the 'nsec3-loop' zone started to use a RSASHA256 key size of 1024 and
this is not FIPS compliant.
This commit changes the key size from 1024 to 4096, in order to
become FIPS compliant again.
The catz module has a fail-safe code to recreate a member zone
that was expected to exist but was not found.
Improve a test case where the fail-safe code is expected to execute
to check that the log message exists.
Add a test case where the fail-safe code is not expected to execute
to check that the log message does not exist.
1. Change the _new, _add and _copy functions to return the new object
instead of returning 'void' (or always ISC_R_SUCCESS)
2. Cleanup the isc_ht_find() + isc_ht_add() usage - the code is always
locked with catzs->lock (mutex), so when isc_ht_find() returns
ISC_R_NOTFOUND, the isc_ht_add() must always succeed.
3. Instead of returning direct iterator for the catalog zone entries,
add dns_catz_zone_for_each_entry2() function that calls callback
for each catalog zone entry and passes two extra arguments to the
callback. This will allow changing the internal storage for the
catalog zone entries.
4. Cleanup the naming - dns_catz_<fn>_<obj> -> dns_catz_<obj>_<fn>, as an
example dns_catz_new_zone() gets renamed to dns_catz_zone_new().
When checking for the number of logs related to DNSKEY key maintenance
events, don't include CDNSKEY is published lines.
Also consider RSASHA1: If not supported, the key maintenance for
the nsec-only zone are not logged.
These two configuration options worked in conjunction with 'auto-dnssec'
to determine KSK usage, and thus are now obsoleted.
However, in the code we keep KSK processing so that when a zone is
reconfigured from using 'dnssec-policy' immediately to 'none' (without
going through 'insecure'), the zone is not immediately made bogus.
Add one more test case for going straight to none, now with a dynamic
zone (no inline-signing).
Change test configuration to make use of 'dnssec-policy' instead of
'auto-dnssec'.
Because we now use 'dnssec-policy', there is no need to create an
explicit key in the final test that adds multiple inline zones
followed by a reconfig.
Change test configuration to make use of 'dnssec-policy' instead of
'auto-dnssec'.
Because we now add a DNSKEY with dynamic update, the sign statistics
change. When adding signatures triggered by dynamic update, the
dnssec-refresh stats are not incremented (this is only incremented
when signing is triggered by resign in lib/dns/zone.c).
The mkeys system test configured 'auto-dnssec' on the root zone to do
smart signing and simulate root key changes that should be picked up
by the automated trust anchor management of BIND.
This does not require 'auto-dnssec' or 'dnssec-policy', so change the
tests to use manual smart signing with 'dnssec-signzone'.
This test uses key timing metadata to do rollovers, this is no longer
applicable with 'dnssec-policy'. Note that with 'dnssec-policy' key
timing metadata is still written, but it is not used for determining
what and when to do key rollovers.
The inline system test tests 'auto-dnssec' in conjunction with
'inline-signing'. Change the tests to make use of 'dnssec-policy'.
Remove some tests that no longer make sense:
- The 'retransfer3.' zone tests changing the parameters with
'rndc signing -nsec3param'. This command is going away and NSEC3
parameters now need to be configured with nsec3param within
'dnssec-policy'.
- The 'inactivezsk.' and 'inactiveksk.' zones test whether the ZSK take
over signing if the KSK is inactive, or vice versa. This fallback
mode longer makes sense when using a DNSSEC policy.
Some tests need to be adapted more than just changing 'auto-dnssec'
to 'dnssec-policy':
- The 'delayedkeys.' zone first needs to be configured as insecure,
then we can change it to start signing. Previously, no existing
keys means that you cannot sign the zone, with 'dnssec-policy'
new keys will be created.
- The 'updated.' zone needs to have key states in a specific state
so that the minimal journal check still works (otherwise CDS/
CDNSKEY and related records will be in the journal too).
- External keys are now added to the unsigned zone and no longer
are maintained with key files. Adjust the 'externalkey.' zone
accordingly.
- The 'nsec3-loop.' zone requires three signing keys. Since
'dnssec-policy' will ignore duplicates in the 'keys' section,
create RSASHA256 keys with different role and/or key length.
Finally, the 'externalkey.' zone checks for an expected number of
DNSKEY and RRSIG records in the response. This used to be 3 DNSKEY
and 2 RRSIG records. Due to logic behavior changes (key timing
metadata is no longer authoritative, these expected values are
changed to 4 DNSKEY records (two signing keys and two external keys
per algorithm) and 1 RRSIG record (one active KSK per signing
algorithm).
The dnssec system test has some tests that use auto-dnssec. Update
these tests to make use of dnssec-policy.
Remove any 'rndc signing -nsec3param' commands because with
dnssec-policy you set the NSEC3 parameters in the configuration.
Remove now duplicate tests that checked if CDS and CDNSKEY RRsets
are signed with KSK only (the dnssec-dnskey-kskonly option worked
in combination with auto-dnssec).
Also remove the publish-inactive.example test case because such
use cases are no longer supported (only with manual signing).
The auto-nsec and auto-nsec3 zones need to use an alternative
algorithm because duplicate lines in dnssec-policy/keys are ignored.
The autosign system test mainly tests the auto-dnssec configuration
option. Since this option is going to be removed, update the system
test so that it uses dnssec-policy.
We could remove the complete system test, but keeping an altered
version of the system test may still be useful to detect unexpected
behavior after code changes.
Change the ns1 (test root server) to use manual signing. This zone
has some weird corner cases that do not fit the dnssec-policy model
very well.
The ns2 bar zone also needs to use manual signing, because it revokes
its key, and RFC 5011 key revocation is not supported with
dnssec-policy.
There are also a couple of weird corner test cases that can be removed:
- Inactive KSK or ZSK. With dnssec-policy there is no such thing as
ZSK taking over the role of a KSK when the KSK is deleted, or vice
versa.
- The CDS and CDNSKEY DELETE records are now automated with
dnssec-policy and so the tests for persistence are no longer required.
In tests.sh, bump the expected number of root DNSKEY records to 11,
because with manual signing the activation before publication is
actually honored.
Also remove any 'rndc signing -nsec3param' commands because with
dnssec-policy you set the NSEC3 parameters in the configuration.
Remove any check interval tests, these "next key event" times are
now calculated and tested in the kasp system test.
The "uname -o" command is harmful on OpenBSD because this platform does
not know about the "-o" option. It is a permanent failure since system
tests are started with "set -e".
We removed DNSSEC management via dynamic update (see issue #3686),
this means we also should no longer add signing records (of private
type) for DNSKEY records added via dynamic update.
to reduce the amount of common code that will need to be shared
between the separated cache and zone database implementations,
clean up unused portions of dns_db.
the methods dns_db_dump(), dns_db_isdnssec(), dns_db_printnode(),
dns_db_resigned(), dns_db_expirenode() and dns_db_overmem() were
either never called or were only implemented as nonoperational stub
functions: they have now been removed.
dns_db_nodefullname() was only used in one place, which turned out
to be unnecessary, so it has also been removed.
dns_db_ispersistent() and dns_db_transfernode() are used, but only
the default implementation in db.c was ever actually called. since
they were never overridden by database methods, there's no need to
retain methods for them.
in rbtdb.c, beginload() and endload() methods are no longer defined for
the cache database, because that was never used (except in a few unit
tests which can easily be modified to use the zone implementation
instead). issecure() is also no longer defined for the cache database,
as the cache is always insecure and the default implementation of
dns_db_issecure() returns false.
for similar reasons, hashsize() is no longer defined for zone databases.
implementation functions that are shared between zone and cache are now
prepended with 'dns__rbtdb_' so they can become nonstatic.
serve_stale_ttl is now a common member of dns_db.
To improve the compatibility of the inline test with the `set -e`
option, ensure all commands which are expected to pass are explicitly
checked for return code and non-zero return codes are handled.
The changes were mostly done with sed:
find . -name '*.sh' | xargs sed -i 's/`\([^`]*\)`/$(\1)/g'
There have been a few manual changes where the regex wasn't sufficient
(e.g. backslashes inside the `...`) or wrong (`...` referring to docs or
in comments).
Change the way arithmetic operations are performed in system test shell
scripts from using `expr` to `$(())`. This ensures that updating the
variable won't end up with a non-zero exit code, which would case the
script to exit prematurely when `set -e` is in effect.
The following replacements were performed using sed in all text files
(git grep -Il '' | xargs sed -i):
s/status=`expr $status + $ret`/status=$((status + ret))/g
s/n=`expr $n + 1`/n=$((n + 1))/g
s/t=`expr $t + 1`/t=$((t + 1))/g
s/status=`expr $status + 1`/status=$((status + 1))/g
s/try=`expr $try + 1`/try=$((try + 1))/g
Ensure all shell system tests are executed with the errexit option set.
This prevents unchecked return codes from commands in the test from
interfering with the tests, since any failures need to be handled
explicitly.
Add this test scenario for a bug fixed a while ago. When a third key is
introduced while the previous rollover hasn't finished yet, the keymgr
could decide to remove the first two keys, because it was not checking
for an indirect dependency on the keys.
In other words, the previous bug behavior was that the first two keys
were removed from the zone too soon.
This test case checks that all three keys stay in the zone, and no keys
are removed premature after another new key has been introduced.
In the kasp script, if one expected key is not found, continue checking
the other key ids, even if there is no match for the first one. This
provides a bit more information which keys mismatch and makes for
easier debugging test failures.
Pass 5 second timeout to the rndc status command(s) to avoid hitting the
hard 10 second timeout from subprocess.call, which would result in an
unwanted exception that would only mask the real issue: if the rndc
status times out in this test, it is likely due to the server not
stopping as it should.
The shutdown test attempts to shut down the server using two different
methods - rndc and sigterm. Use pytest.mark.parametrize to run these as
separate test cases for easier identification of failures.
Surround the variables which are checked whether they're executable in
double quotes. Without them, empty paths won't be properly interpreted
as not executable.
Since delv can occasionally hang in system tests when running with TSAN
(see GL#4119), disable these tests as a workaround. Otherwise, the hung
delv process will just waste CI resources and prevent any meaningful
output from the rest of the test suite.
tsig-keygen is now used to generate key files for TSIG. These have
a different format to those that were generated by dnssec-keygen.
Test that dig can still read these files.
tsig-keygen generates key files that are different to those that
where generated by dnssec-keygen. Check that nsupdate can still
read those old format files.
The ability to read legacy HMAC-MD5 K* keyfile pairs using algorithm
number 157 was accidentally lost when the algorithm numbers were
consolidated into a single block, in commit
09f7e0607a.
The assumption was that these algorithm numbers were only known
internally, but they were also used in key files. But since HMAC-MD5
got renumbered from 157 to 160, legacy HMAC-MD5 key files no longer
work.
Move HMAC-MD5 back to 157 and GSSAPI back to 160. Add exception for
GSSAPI to list_hmac_algorithms.
the default value of dnssec-validation is 'auto', which causes
a server to send a key refresh query to the root zone when starting
up. this is undesirable behavior in system tests, so this commit
sets dnssec-validation to either 'yes' or 'no' in all tests where
it had not previously been set.
this change had the mostly-harmless side effect of changing the cached
trust level of unvalidated answer data from 'answer' to 'authanswer',
which caused a few test cases in which dumped cache data was examined in
the serve-stale system test to fail. those test cases have now been
updated to expect 'authanswer'.
Previously, the first check silently failed, as 454 is apparently (in my
local setup) the minimum output size for the dnstap output, rather than
470 which the test was expecting. Effectively, the check served as a 5
second sleep rather than waiting for the proper file size.
Additionally, check the expected file sizes and fail if expectations
aren't met.
The log message is supposed to contain the zone name which was
erroneously omitted, but didn't pop up during tests, since return code
was silently ignored.
Now it actually waits for the proper log message rather than being an
equivalent of 3 second sleep (which was also sufficient to make the test
pass, thus we detected no failure).
In HTTP/1.0 and HTTP/1.1, RFC 9112 section 9.6 says the last response
in a connection should include a `Connection: close` header, but the
statschannel server omitted it.
In an HTTP/1.0 response, the statschannel server can sometimes send a
`Connection: keep-alive` header when it is about to close the
connection. There are two ways:
If the first request on a connection is keep-alive and the second
request is not, then _both_ responses have `Connection: keep-alive`
but the connection is (correctly) closed after the second response.
If a single request contains
Connection: close
Connection: keep-alive
then RFC 9112 section 9.3 says the keep-alive header is ignored, but
the statschannel sends a spurious keep-alive in its response, though
it correctly closes the connection.
To fix these bugs, make it more clear that the `httpd->flags` are part
of the per-request-response state. The Connection: flags are now
described in terms of the effect they have instead of what causes them
to be set.
The "dns_dnssec_findzonekeys2" log message is a leftover from when that
was the name of the function. Rename to match the current name of the
function.
When we add DNSKEY records via dynamic update, this should no longer
trigger signing the zone with these keys. This currently happens when
'find_zone_keys()' looks up the keys by inspecting the DNSKEY RRset,
then attempting to read the corresponding key files.
Add checks that inspect the logs whether an attempt to read the key
files for the newly added keys was done (and failed because these files
are not available).
The purpose of the check is to verify the server has survived the
previous barrage of queries. This is done by sending a query and
checking we get a NOERROR response back.
Previously, that query could've been affected by a servfail cache - the
server would return a SERVFAIL answer, thus failing the check, despite
being up and running. Use version.bind txt ch query to avoid the
interference of servfail cache.
Prepare the fetchlimit system test for adding a clients-per-query
check. Change some functions and commands to accept a destination
NS IP address instead of using the hardcoded 10.53.0.3.
The get_core_dumps.sh script couldn't find and process core files of
out-of-tree configurations because it looked for them in the source
instead of the build directory.
Add a test case where when priming the cache with a slow authoritative
resolver, the stale-answer-client-timeout option should not return
a delegation to the client (it should wait until an applicable answer
is found, if no entry is found in the cache).
The check for 'would limit' log message is triggered by sending at least
three messages within one second. However, in extremely slow conditions
(currently when running with clang+TSAN in CI), the individual queries
might take too much time to send enough of them within one second.
Since this is a pretty rare condition, let's just silently skip this
test in environments where a single query takes more than 500 ms, since
there's no way to perform the check under such conditions.
Closes#4082
The 'update-nsec3.example' requires to be DNSSEC maintained via
dynamic update. Commit 03b22983cd20cec51ad8b9f25f2e7d0e472dc79c adds
checks to make sure the raw zone is not signed. So the test case neesd
to be updated to allow for DNSSEC maintenance.
A zone in multisigner model 2 should also be possible to remove
previously added DNSKEY, CDS and CDNSKEY records from the zone operated
by the other provider.
Add a test case where updates are being made against a hidden primary
and two bump in the wire signers (the providers in the multisigner
model) serve the zone.
The test covers the same cases as for two primary providers that is:
- Add DNSKEY
- Remove (previously added) DNSKEY
- Add CDNSKEY
- Remove (previously added) CDNSKEY
- Add CDS
- Remove (previously added) CDS
A zone in multisigner model 2 should also be possible to publish the
CDS and CDNSKEY records from their KSK into the zone operated by the
other provider.
Add a new system test to test multisigner model use cases. This
initial test just tests a small part of the model 2, and uses two
providers for the same zone, ns3 and ns4, each with their own unique
key set. This commit tests that each provider can import their ZSK
of the other provider into their DNSKEY RRset, using dynamic update.
Both providers use dnssec-policy, ns3 applies the DNSSEC records
directly, while ns4 uses inline-signing.
The check which attempts to forward dynamic update to a dead primary may
trigger a timing issue #4080. For some reason, this has manifested under
the pytest runner, while the test still passes with the legacy runner.
Move the dead primary check closer to the end of the test to avoid
hitting this issue before we have a proper fix.
The module-level logger has a handler that writes into a temporary
directory. Ensure the logging output is flushed and the handler is
closed before attempting to remove this temporary directory.
Previously, run.sh tried to use pytest's -k option for test selection.
The downside was that this filter expression matched any test case with
the given substring, rather than executing a system test suite with the
given name.
The run.sh has been rewritten to invoke pytest from a system test
directory instead. This behaves more consistently with the run.sh from
legacy system test framework.
run.sh is now also a shell script to avoid confusion regarding its
file extension.
EL8+ systems declare "which" function using environment variables in the
/etc/profile.d/which2.sh file. Because of our suboptimal environment
variable detection, which is required in order to support the legacy
runner, these variables are picked up by the pytest runner.
If subprocesses are spawned with these environment variables set, it
will cause the following issue when they spawn yet another subprocess:
/bin/sh: which: line 1: syntax error: unexpected end of file
/bin/sh: error importing function definition for `which'
Instantiate a new logger that is used during pytest initialization /
configuration. This logging isn't handled by pytest itself, since it
happens outside of any tests or fixtures.
Root logger can't be reused for this purpose, because that would
duplicate the logs. Instead, create a conftest-specific logger for this
purpose.
Unfortunately, this introduces another log file,
pytest.conftest.log.txt, which contains only the logging from pytest
initialization. However, unless one is debugging the runner /
environment, there should be no need to investigate this file.
In order to take the most advantage of parallel execution of tests,
ensure certain long running tests are scheduled first.
The list of tests considered long-running was created empirically. In
addition to the test run time, its position in the default
(alphabetical) ordering was also taken into account.
The logger fixture is provided as a test-level logging facility which
can be easily passed to tests to enable capturing and/or displaying
messages from tests written in Python.
While this works optimally with the pytest runner, messages on INFO
level or above will also be visible when using the legacy runner.
The test_zone_timers_secondary_json() and
test_zone_timers_secondary_xml() tests are affected by issue #3983. Due
to the way tests are run, they are only affected when executing them
with the pytest runner.
Strict mode is set for pytest runner, as it always fails there. The
strict mode ensures we'll catch the change when the it starts passing
once the underlying issue is fixed. It can't be set for the legacy
runner, since the test (incorrectly) passes there.
Related #3983
If a test fails with an assertion failure or exception, its content
along with traceback is displayed in pytest output. This information
should be preserved in the test-specific logger for a given system test
to make it easier to debug test failures.
In order to avoid issues with decoding/encoding env variables due to
different encodings on different systems, deal with the environment
variables directly as bytes.
The loadscope setting is required for parallel execution of our system
tests using pytest. The option ensure that all tests within a single
(module) scope will be assigned to the same worker.
This is neccessary because the worker sets up the nameservers for all
the tests within a module scope. If tests from the same module would be
assigned to different workers, then the setup could happen multiple
times, causing a race condition. This happens because each module uses
deterministic port numbers for the nameservers.
Utilize developers' muscle memory to incentivize using the pytest runner
instead of the legacy one. The script also serves as basic examples of
how to run the pyest command to achieve the same results as the legacy
runner.
Invoking pytest directly should be the end goal, since it offers many
potentially useful options (refer to pytest --help).
In order to run the shell system tests, the pytest runner has to pick
them up somehow. Adding an extra python file with a single function
for the shell tests for each system test proved to be the most
compatible way of running the shell tests across older pytest/xdist
versions.
Modify the legacy run.sh script to ignore these pytest-runner specific
glue files when executing tests written in pytest.
Special care needs to be taken to support older pytest / xdist versions.
The target versions are what is available in EL8, since that seems to
have the oldest versions that can be reasonably supported.
When an issue occurs inside a fixture (e.g. servers fail to start/stop),
the test result won't be detected as failed, but rather an error will be
thrown.
To ensure the tempdir is kept even if the test itself passes but the
system_test() fixture throws an error, a different mechanism is needed.
At the start of the critical test setup section, note that the fixture
hasn't finished yet. When this is detected in the system_test_dir()
fixture, it is recognized as error in test setup/teardown and the temp
directory is kept.
This may seem cumbersome, because it is. It's basically a workaround for
the way pytest handles fixtures and test errors in general.
The temporary directory contains artifacts for the pytest module. That
module may contain multiple individual tests which were executed
sequentially. The artifacts should be kept if even one of these tests
failed.
Since pytest doesn't have any facility to expose test results to
fixtures, customize the pytest_runtest_makereport() hook to enable that.
It stores the test results into a session scope variable which is
available in all fixtures.
When deciding whether to remove the temporary directory, find the
relevant test results for this module and don't remove the tmpdir if any
one the tests failed.
For every pytest module, create a copy of its system test directory and
run the test from that directory. This makes it easier to clean up
afterwards and makes it less error-prone when re-running (failed) tests.
Configure the logger to capture the module's log output into a separate
file available from the temporary directory. This is quite convenient
for exploring failures.
Cases where temporary directory should be kept are handled in a
follow-up commits.
This is basically the pytest re-implementation of the run.sh script.
The fixture is applied to every module and ensures complete test
setup/teardown, such as starting/stopping servers, detecting coredumps
etc.
Note that the fixture system_test_dir is not defined yet. It is omitted
now for review readability and it's added in follow-up commits.
Add fixtures for deriving system test name from the directory name and
for a module-specific logger.
Add fixtures to execute shell and perl scripts. Similar to how run.sh
calls commands, this functionality is also needed in pytest. The
fixtures take care of switching to a proper directory, logging
everything and handling errors.
Note that the fixture system_test_dir is not defined yet. It is omitted
now for review readability and it's added in follow-up commits.
This is basically a pytest re-implementation of the get_ports.sh script.
The main difference is that ports are assigned on a module basis, rather
than a directory basis. Module is the new atomic unit for parallel
execution, therefore it needs to have unique ports to avoid collisions.
Each module gets its ports through the env fixture which is updated with
ports and other module-specific variables.
Some system tests require extra programs and/or dependencies to be
compiled first. This is done via `make check` with the automake
framework when using check_* variables such as check_PROGRAMS.
To avoid running any tests via the automake framework, set the TESTS
env variable to empty string and utilize `make -e check` to override
default Makefile variables with environment ones. This ensures automake
will only compile the needed dependencies without running any tests.
Additional consideration needs to be taken for xdist. The compilation
command should be called just once before any tests are executed. To
achieve that, use the pytest_configure() hook and check that the
PYTEST_XDIST_WORKER env variable isn't set -- if it is, it indicates
we're in the spawned xdist worker and the compilation was already done
by the main pytest process that spawned the workers.
This is mostly done to have on-par functionality with legacy test
framework. In the future, we should get rid of the need to run "empty"
make -e check and perhaps compile test-stuff by default.
The commands executed by pytest during a system test need to have the
same environment variables set as if they were executed by the run.sh
shell script.
It was decided that for the moment, legacy way of executing system tests
with run.sh should be kept, which complicates things a bit. In order to
avoid duplicating the required variables in both conf.sh and pytest, it
was decided to use the existing conf.sh as the only authoritative
place for the variables.
It is necessary to process the environment variables from conf.sh right
when conftest.py is loaded, since they might be needed right away (e.g.
to test for feature support during test collection).
This solution is a bit hacky and is only meant to be used during the
transitory phase when both pytest and the legacy run.sh are both
supported. In the future, a superior pytest-only solution should be
used.
For discussion of other options, refer to
https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/6809#note_318889
Ensure pytest picks up our python test modules, since we're using
tests_*.py convention, which is different from the default.
Configure pytest logging to display the output from all tests (even the
ones that passed). This ensures we have sufficient amount of information
to debug test post-mortem just from the artifacts.
The legacy system test framework uses pytest to execute some tests.
Since it'd be quite difficult to convince pytest to decide whether to
include conftest.py (or which ones to include when launching from
subdir), it makes more sense to have a shared conftest.py which is used
by both the legacy test runner invocations of pytest and the new pytest
system test runner. It is ugly, but once we drop support for the legacy
runner, we'll get rid of it.
Properly scope the *port fixtures in order to ensure they'll work as
expected with the new pytest runner. Instead of using "session" (which
means the fixture is only evaluated once for the entire execution of
pytest), use "module" scope, which is evaluated separately for each
module. The legacy runner invoked pytest for each system test
separately, while the new pytest runner is invoked once for all system
tests -- therefore it requires the more fine-grained "module" scope to
for the fixtures to work properly.
Remove python shebang, as conftest.py isn't supposed to be an executable
script.
The line summarising TSAN reports was misplaced in the ASAN territory
and thus never used.
I also made core dumps, assertion failures, and TSAN reports detection
independent of each other.
When the FIPS provider is available, RSASHA1 signing keys for zone
"example.com." are ignored if the zone is attempted to be signed with
the dnssec-signzone "-F" (FIPS mode) option:
"fatal: No signing keys specified or found"
The upforwd test for forwarding updates to a dead primary can continue
running a little bit past its end, causing update replies to be
recorded during a subsequent test case. Correct this by only looking
for update requests and replies for the specific domain name being
tested at any given time.
After the RCU changes were merged, the `upforwd` test started
consistenly failing when run under thread sanitizer. After some
investigation, it turned out that retry attempts were continuing after
the "update forwarding to dead primary" test. This caused mismatches
in the DNSTAP message counts for the subsequent tests, because they
were also counting retries.
Fix this problem by `wait`ing for the `nsupdate` processes to exit.
While investigating the bug, I replaced several fixed 15 second delays
with `wait_for_log`, so the test runs faster.
All the places the qp-trie code was using `call_rcu()` needed
`__tsan_release()` and `__tsan_acquire()` annotations, so
add a couple of wrappers to encapsulate this pattern.
With these wrappers, the tests run almost clean under thread
sanitizer. The remaining problems are due to `rcu_barrier()`
which can be suppressed using `.tsan-suppress`. It does not
suppress the whole of `liburcu`, because we would like thread
sanitizer to detect problems in `call_rcu()` callbacks, which
are called from `liburcu`.
The CI jobs have been updated to use `.tsan-suppress` by
default, except for a special-case job that needs the
additional suppressions in `.tsan-suppress-extra`.
We might be able to get rid of some of this after liburcu gains
support for thread sanitizer.
Note: the `rcu_barrier()` suppression is not entirely effective:
tsan sometimes reports races that originate inside `rcu_barrier()`
but tsan has discarded the stack so it does not have the
information required to suppress the report. These "races" can
be made much easier to reproduce by adding `atexit_sleep_ms=1000`
to `TSAN_OPTIONS`. The problem with tsan's short memory can be
addressed by increasing `history_size`: when it is large enough
(6 or 7) the `rcu_barrier()` stack usually survives long enough
for suppression to work.
Previously, if an exception would happen inside the `with` block, the
error handler would wait indefinitely for the process to end. That would
never happen, since the termination signal was never sent to named and
the test would get stuck.
Using the try-finally block ensures that the named process is always
killed and any exception or errors will be handled gracefully.
Improve code readability by splitting the test into more functions. Some
could be re-used later on for more general-purpose subprocess handling
or named checks.
Add more tests to the dnstap system test to roll with different values.
Touch some files to make sure the number of existing files exceed the
number that we want to keep.
Add a test to the logfileconfig system test for the increment suffix.
When dns_request_create() failed in notify_send_toaddr(), sending the
notify would silently fail. When notify_done() failed, the error would
be logged on the DEBUG(2) level.
This commit remedies the situation by:
* Promoting several messages related to notifies to INFO level and add
a "success" log message at the INFO level
* Adding a TCP fallback - when sending the notify over UDP fails, named
will retry sending notify over TCP and log the information on the
NOTICE level
* When sending the notify over TCP fails, it will be logged on the
WARNING level
Closes: #4001, #4002
There is no 'ret' in this test, and it is obvious that 'ret=1'
should be 'tmp=1' for the check to work correctly, if the string
is not found in the log file.
Add a test case to cover #3679 where a user migrates from a KSK/ZSK
split using auto-dnssec maintain, to the default dnssec-policy (CSK).
The test actually does not use the default dnssec-policy, but it does
use one that has the same keys clause. For testing convenience, we use
the same propagation time values as other test cases that migrate to
dnssec-policy with mismatching existing key set.
At the time of test number (19), there were 10 "sending packet to
10.53.0.7" lines in the "legacy/ns1/named.run" file; usually, only seven
are present:
I:legacy:checking recursive lookup to edns 512 + no tcp server does not cause query loops (19)
I:legacy:ns1 sent 10 queries to ns7, expected less than 10
I:legacy:failed
Those three can be attributed to tests "8", "10", and "18", where the
dig of "resolution_fails()" retried after a timeout to succeed with
"status: SERVFAIL" subsequently, as seen in each of
dig.out.test{8,10,18} files.
;; communications error to 10.53.0.1#13093: timed out
; <<>> DiG 9.19.12-dev <<>> -p 13093 +tcp @10.53.0.1 edns512-notcp. TXT
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5368
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
This retry is unnecessary because "resolution_fails()" considers timeout
a positive result.
This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
The dnspython.Resolve.resolve() requires at least dnspython >= 2.0.0,
this wasn't enforced in the shutdown system test leading to infinite
loop waiting for the server start due to failing resolve() call.
We don't need a separate module/file for every test. Both the rpz tests
could live in the same file.
The setup/teardown of servers if performed separately for each module --
unless there is a need to do that, it's better to avoid it.
This adds rudimentary test for response-policy zones in multiple
views. Different combinations are tested:
- two views with response-policy inherited from options {};
- two views view explicit response-policy using same RPZ zone name
- two views view explicit response-policy using secondary RPZ zone
* nsupdate should take 12 seconds (one try and three retries with
3 second timeout for each), UDP mode
* nsupdate -u 4 -r 1 should take 8 seconds (one try and one retry with
4 second timeout for each), UDP mode
* nsupdate -u 0 -t 8 -r 1 should also take 8 seconds, UDP mode
* nsupdate -u 4 -t 30 -r 1 should also take 8 seconds, as -u takes
precedence over -t, UDP mode
* nsupdate -t 8 -v should also take 8 seconds, TCP mode
The checkds system test could fail if some parent secondary servers did
not yet loaded all the zones before ns9 started sending DS queries. This
leads to SERVFAIL responses, while the test case expects good DS
responses. In order to mitigate against this issue, call 'rndc loadkeys'
to quickly restart the checkds procedure again.
Also refactor the checkds system test, to get rid of the many zone
name duplications. Update the functions 'zone_check' and
'keystate_check' to make the zone name an FQDN so we can just pass
the 'zone' variable into the function.
If the 'checkds' option is not explicitly set, check if there are
'parental-agents' for the zone configured. If so, default to "explicit",
otherwise default to "yes".
Add two new checkds test servers, that are hidden secondaries (hidden
as in not published in the NS RRset), that can be used specifically
for testing explicitly configured parental-agents.
Implement the new feature, automatic parental-agents. This is enabled
with 'checkds yes'.
When set to 'yes', instead of querying the explicit configured
parental agents, look up the parental agents by resolving the parent
NS records. The found parent NS RRset is considered to be the list
of parental agents that should be queried during a KSK rollover,
looking up the DS RRset corresponding to the key signing keys.
For each NS record, look up the addresses in the ADB. These addresses
will be used to send the DS requests. Count the number of servers and
keep track of how many good DS responses were seen.
The previous test cases already test the more complex case where there
are empty non-terminals between the child apex and the parent domain.
Add a test case where this is not the case, to execute the other code
path.
Add test cases for when checkds is disabled. Copy the test cases that
would have resulted in a DSPublish or DSRemoved and make sure that
with 'checkds no' the metadata is not set.
Add the test cases for automatic parental-agents, i.e. when 'checkds'
is set to 'yes'. Split out the special cases that use a reference
or a resolver as parental-agent so that the common use cases can be
tested with the same function.
Make the checkds system test more structured with the many more test
cases to come. Add a README for clarity.
Update the 'has_signed_apex_nsec' helper function so it can take any
domain name regardless of the number of labels.
Change the DNS tree structure such that we have different TLD names
for the various test scenarios, because we need servers that respond
differently to DS queries. Note that this isn't applicable to the
existing "checkds explicit" test cases, but is preparation work for
testing "checkds yes" (automatic parental agents).
Add a trust-anchor to the server that will be querying for parent
NS records.
Add a new configuration option to set how the checkds method should
work. Acceptable values are 'yes', 'no', and 'explicit'.
When set to 'yes', the checkds method is to lookup the parental agents
by querying the NS records of the parent zone.
When set to 'no', no checkds method is enabled. Users should run
the 'rndc checkds' command to signal that DS records are published and
withdrawn.
When set to 'explicit', the parental agents are explicitly configured
with the 'parental-agents' configuration option.
Cleanup the remnants of MS Compiler bits from <isc/refcount.h>, printing
the information in named/main.c, and cleanup some comments about Windows
that no longer apply.
The bits in picohttpparser.{h,c} were left out, because it's not our
code.
hypothesis prior to 4.41.2 uses hashlib.md5 which is not FIPS
compliant causing the wildcard system test to fail. Check if
we are running if FIPS mode and if so make the minimum version
of hypothesis we will accept to be 4.41.2.
The existing set of kerberos credential used deprecated algorithms
which are not supported by some implementations in FIPS mode.
Regenerate the saved credentials using more modern algorithms.
Added tsiggss/krb/setup.sh which sets up a test KDC with the required
principals for the system test to work. The tsiggss system test
needs to be run once with this active and KRB5_CONFIG appropriately.
set. See tsiggss/tests.sh for an example of how to do this.
OPENSSL_CONF="" is treated differently to no OPENSSL_CONF in
the environment by OpenSSL. OPENSSL_CONF="" lead to crypto
failure being reported in FIPS mode.
Call dst_lib_init to set FIPS mode if it was turned on at configure
time.
Check that named-checkconf report that dnssec policies that wont
work in FIPS mode are reported if named would be running in FIPS
mode.
Diffie-Hellman key echange doesn't appear to work in FIPS mode for
OpenSSL 1.x.x. Add feature test (--have-fips-dh) to identify builds
where DH key exchanges work (non FIPS builds and OpenSSL 3.0.0+) and
exclude test that would otherwise fail.
- RSASHA1 (5) and NSEC3RSASHA1 (7) are not accepted in FIPS mode
- minimum RSA key size is set to 2048 bit
adjust kasp and checkconf system tests to ensure non FIPS
compliant configurations are not used in FIPS mode
when testing the DNSRPS API, instead of linking to an installed
librpz.so from fastrpz, we now link to the test library. code that
ran dnsrpzd and checked the fastrpz license is now unnecessary and
has been removed.
two dnsrps-specific test cases in rpz (qname_as_ns and ip_as_ns) have
been removed, because they were only supported by fastrpz and do not
work in the test library. in rpzrecurse, nsip-wait-recurse and
nsdname-wait-recurse are now only tested in native mode, due to those
tests being specific to the native implementation.
These options and zone type were created to address the
SiteFinder controversy, in which certain TLD's redirected queries
rather than returning NXDOMAIN. since TLD's are now DNSSEC-signed,
this is no longer likely to be a problem.
The deprecation message for 'type delegation-only' is issued from
the configuration checker rather than the parser. therefore,
isccfg_check_namedconf() has been modified to take a 'nodeprecate'
parameter to suppress the warning when named-checkconf is used with
the command-line option to ignore warnings on deprecated options (-i).
Previously, an AXFR request would be issued every second while waiting
for the zone to be signed. This might've been the cause of issues in CI
where many tests are running in parallel and any extra load may increase
test instability.
Instead, check for the last NSEC record to have a signature before
commencing the AXFR request to check the zone has been fully signed.
Also increase the time for the zone signing to a total of 60+10 seconds
up from the previous 30.
Ensure messages from dupsigs system test end up in its log rather than
stdout. Previously, the output was hard to debug when running the tests
in parallel and messages wouldn't end up in the dupsigs.log.
stop and restart the server in the 'tsiggss' test, in order
to confirm that GSS negotiated TSIG keys are saved and restored
when named loads.
added logging to dns_tsigkey_createfromkey() to indicate whether
a key has been statically configured, generated via GSS negotiation,
or restored from a file.
If the zone already has existing NSEC/NSEC3 chains then zone_sign
needs to continue to use them. If there are no chains then use
kasp setting otherwise generate an NSEC chain.
The dnstap system test fails intermittently, and it appears to be
a timing issue - adding a short delay after running 'fstrm_capture',
and before running 'dnstap -reopen' improves the situation from
50% failures (5 out of 10 times) to 0% failures (0 out of 20 times),
tested locally.
The reason is that 'fstrm_capture' is executed in the background,
and due to OS scheduling and other factors, the listener socket
may not be ready when the following command runs and tells 'named'
to (re)open it.
Dumping of the freshly transferred zone file can take some time.
Retry 5 times before failing.
The log excerpt below shows such a case, when dumping lasted more than
two seconds.
06-Mar-2023 09:32:09.973 zone example6/IN: Transfer started.
06-Mar-2023 09:32:10.301 zone example6/IN: zone transfer finished: success
06-Mar-2023 09:32:10.301 zone_dump: zone example6/IN: enter
06-Mar-2023 09:32:11.789 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): AXFR request
06-Mar-2023 09:32:11.801 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): transfer of 'example6/IN': AXFR ended: 5 messages, 2676 records, 55815 bytes, 0.011 secs (5074090 bytes/sec) (serial 1397051952)
06-Mar-2023 09:32:12.409 zone_gotwritehandle: zone example6/IN: enter
06-Mar-2023 09:32:12.421 dump_done: zone example6/IN: enter
06-Mar-2023 09:32:12.421 zone_journal_compact: zone example6/IN: target journal size 53044
There can be comments in dig output for a zone transfer only in case
of an error, so we should print those errors not when wait_for_tls_xfer
succeeds, but when it fails.
Also, there is no point in printing those comments when a failure was
indeed expected.
The serve-stale system test was intermittently failing due to a timing
issue:
I:serve-stale:check stale data.example TXT was refreshed...
I:serve-stale:failed
The RRset is refreshed, however, it first checks for an expected log
line, prior checking that the stale data.example TXT was refreshed
(using dig). This log line is there to ensure the record is actually
refreshed before we start querying again. Alternatively we could just
retry_quiet 10 <wait for dig output matches expectations>. It would
lower the chances for intermittent test failures, since there is no
longer a "check for log line, sleep one second if check fails, check
for log line, ...", prior to the check.
Completely remove the TKEY Mode 2 (Diffie-Hellman Exchanged Keying) from
BIND 9 (from named, named.conf and all the tools). The TKEY usage is
fringe at best and in all known cases, GSSAPI is being used as it should.
The draft-eastlake-dnsop-rfc2930bis-tkey specifies that:
4.2 Diffie-Hellman Exchanged Keying (Deprecated)
The use of this mode (#2) is NOT RECOMMENDED for the following two
reasons but the specification is still included in Appendix A in case
an implementation is needed for compatibility with old TKEY
implementations. See Section 4.6 on ECDH Exchanged Keying.
The mixing function used does not meet current cryptographic
standards because it uses MD5 [RFC6151].
RSA keys must be excessively long to achieve levels of security
required by current standards.
We might optionally implement Elliptic Curve Diffie-Hellman (ECDH) key
exchange mode 6 if the draft ever reaches the RFC status. Meanwhile the
insecure DH mode needs to be removed.
The trick is to configure a duplicate zone, which comes after the
catalog zone, where the duplicate zone is an existing member zone.
In that scenario, all the zones which come before the "faulty" zone
in the configuration file will fail to be reverted to the previous
version of the view after a reconfiguration error, and in this
particular case that will result in an assertion failure when the
catalog zone update is initiated, because it will be still tied to
the new version of the view, which was dismissed.
Building the bin/tests/system/rpz/dnsrps helper binary is currently not
possible at all as the necessary compiler and linker flag definitions
are missing from bin/tests/system/Makefile.am. Add these as a basis for
addressing the problem.
Unfortunately, this is where the "mostly" bit mentioned in this commit's
subject line comes into play. The dlopen() parts of DNSRPS code have
not yet been reworked to use libuv's dlopen() API (uv_dlopen() etc.)
(See commit 37b9511ce1 for prior work in
this area.) While it is certainly possible to do that, implementing
such a change without testing it in practice against a usable librpz.so
(i.e. a DNSRPS provider library) is bound to cause more trouble and
confusion than keeping the code the way it is right now. However,
making that code buildable as-is requires linking against a C standard
library that exports the dlopen(), dlsym(), and dlclose() symbols used
by the DNSRPS dynamic loading code. glibc 2.34+ satisfies that
requirement, but older glibc versions do not (these come with a separate
libdl shared library that would need to be linked in as well). (Other
C standard library implementations have not been examined.) Since the
long-term plan is to rely on libuv's dlopen() API exclusively and
detecting the shared object containing dlopen() & friends would only
pull in build system complexity for no good reason, assume for now that
the target system provides the dlopen() API in its C standard library.
This change enables the system test suite to be run for a BIND 9 build
prepared using --enable-dnsrps --enable-dnsrps-dl (on systems satisfying
the requirement explained above). However, it is important to note that
this change by itself does NOT enable actual testing of the DNSRPS
feature as doing that requires a DNSRPS provider library to be present
on the test host.
This implements node reference tracing that passes all the internal
layers from dns_db API (and friends) to increment_reference() and
decrement_reference().
It can be enabled by #defining DNS_DB_NODETRACE in <dns/trace.h> header.
The output then looks like this:
incr:node:check_address_records:rootns.c:409:0x7f67f5a55a40->references = 1
decr:node:check_address_records:rootns.c:449:0x7f67f5a55a40->references = 0
incr:nodelock:check_address_records:rootns.c:409:0x7f67f5a55a40:0x7f68304d7040->references = 1
decr:nodelock:check_address_records:rootns.c:449:0x7f67f5a55a40:0x7f68304d7040->references = 0
There's associated python script to find the missing detach located at:
https://gitlab.isc.org/isc-projects/bind9/-/snippets/1038
Change the commandline option -G to take a string that determines what
sync records should be published. It is a comma-separated string with
each element being either "cdnskey", or "cds:<algorithm>", where
<algorithm> is a valid digest type. Duplicates are suppressed.
Change one of the test cases to use a different digest type (4). The
system tests and kasp script need to be updated to take into account
the new algorithm (instead of the hard coded 2).
The test was setting a minimum count for recursive clients which
was not always being met (e.g. 91 instead of 100) producing a false
positive. Lower the lower bound on recursive clients for this
test to 1.
Add the 'ixfr-from-differences yes;' option to trigger a failed
zone postload operation when a zone is updated but the serial
number is not updated, then issue two successive 'rndc reload'
commands to trigger the bug, which causes an assertion failure.
the dns_xfrin module was still using the network manager directly to
manage TCP connections and send and receive messages. this commit
changes it to use the dispatch manager instead.
the 'dispatchmgr' member of the resolver object is used by both
the dns_resolver and dns_request modules, and may in the future
be used by others such as dns_xfrin. it doesn't make sense for it
to live in the resolver object; this commit moves it into dns_view.
the parser could crash when "include" specified an empty string in place
of the filename. this has been fixed by returning ISC_R_FILENOTFOUND
when the string length is 0.
Reproduce the assertion by configuring a 'named' resolver with
'recursive-clients 10;' configuration option and running 20
queries is parallel.
Also tweak the 'ans2/ans.pl' to simulate a 50ms network latency
when qname starts with "latency". This makes sure that queries
running in parallel don't get served immediately, thus allowing
the configured recursive clients quota limitation to be activated.
move database attach/detach functions to db.c, instead of
requiring them to be implemented for every database type.
instead, they must implement a 'destroy' function that is
called when references go to zero.
this enables us to use ISC_REFCOUNT_IMPL for databases,
with detailed tracing enabled by setting DNS_DB_TRACE to 1.
initialize dns_dbmethods, dns_sdbmethods and dns_rdatasetmethods
using explicit struct member names, so we don't have to keep track
of NULLs for unimplemented functions any longer.
some dns_db functions would have crashed if the DB implementation failed
to implement them, requiring the implementations to add functions that
did nothing but return ISC_R_NOTIMPLEMENTED or some obvious default
value. we can just have the dns_db wrapper functions themselves return
those values, and clean up the implementations accordingly.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
change functions using isc_taskmgr_beginexclusive() to use
isc_loopmgr_pause() instead.
also, removed an unnecessary use of exclusive mode in
named_server_tcptimeouts().
most functions that were implemented as task events because they needed
to be running in a task to use exclusive mode have now been changed
into loop callbacks instead. (the exception is catz, which is being
changed in a separate commit because it's a particularly complex change.)
dns_request_create() and _createraw() now take a 'loop' parameter
and run the callback event on the specified loop.
as the task manager is no longer used, it has been removed from
the dns_requestmgr structure. the dns_resolver_taskmgr() function
is also no longer used and has been removed.
Include MD5 feature detection in featuretest tool and use it in some
places. When RHEL distribution or Fedora ELN is in FIPS mode, then MD5
algorithm is unavailable completely and even hmac-md5 algorithm usage
will always fail. Work that around by checking MD5 works and if not,
skipping its usage.
Those changes were dragged as downstream patch bind-9.11-fips-tests.patch
in Fedora and RHEL.
Tests using diff to compare outputs of dig +short shall ignore lines
starting with ";". In dig +short output, such lines should only be
present for errors such as network issues. Since we utilize dig's
default timeout/retry mechanisms, these transitory issues should be
ignored and only the final output should be considered during the diff
comparison.
This adds an island of trust that is reachable from the root
where the trust anchors are added to island.conf.
This add an island of trust that is not reachable from the root
where the trust anchors are added to private.conf.
Occasionally, the allotted 10 seconds for the "running" line to appear
in log after named is started proved insufficient in CI, especially
during increased load. Give named up to 60 seconds to start up to
mitigate this issue.
isc_bind9 was a global bool used to indicate whether the library
was being used internally by BIND or by an external caller. external
use is no longer supported, but the variable was retained for use
by dyndb, which needed it only when being built without libtool.
building without libtool is *also* no longer supported, so the variable
can go away.
Send the test message from ns3 to ns2 instead of ns2 to ns3 as ns2
is started first and therefore the test doesn't have to wait on the
resend of the the NOTIFY message to be successful.
the nsupdate system test was intermittently failing due to the update
quota not being exceeded when it should have been. this is most likely
a timing issue: the client is sending updates too slowly, or the server
is processing them too quickly, for the quota to fill. this commit
attempts to make that the failure less likely by increasing the number
of update transactions from 10 to 20.
Following deleting the root trust anchor and reconfiguring the
server it takes some time to for trust anchor to appear in 'rndc
managed-keys status' output. Retry several times.
check in the log files of receiving servers that the originating
ports for notify and SOA query messages were set correctly from
configured notify-source and transfer-source options.
* rbt node chains were sized to allow for bitstring labels, so they
had 256 levels; but in the absence of bistrings, 128 is enough.
* dns_byaddr_createptrname() had a redundant options argument,
and a very outdated doc comment.
* A number of comments referred to bitstring labels in a way that is
no longer helpful. (A few informative comments remain.)
Set the DS state after issuing 'rndc dnssec -checkds'. If the DS
was published, it should go in RUMOURED state, regardless whether it
is already safe to do so according to the state machine.
Leaving it in HIDDEN (or if it was magically already in OMNIPRESENT or
UNRETENTIVE) would allow for easy shoot in the foot situations.
Similar, if the DS was withdrawn, the state should be set to
UNRETENTIVE. Leaving it in OMNIPRESENT (or RUMOURED/HIDDEN)
would also allow for easy shoot in the foot situations.
The malloced and maxmalloced memory counters were mostly useless since
we removed the internal allocator blocks - it would only differ from
inuse by the memory context size itself.
The validity default days value of 1 was used for debugging and
left as such accidentally.
Use 10950 days, as used elsewhere (for example, in doth test CA).
This does not affect anything, the value will be effective when
generating new test certificates in the future.
Change the 'forward' system test to enable DoT on ns2 server,
and test that forwarding from ns4 to the DoT-enabled ns2 works.
In order to test different scenarios, create a test CA (based on
similar CAs for 'doth' and 'nsupdate' system tests), and test
both insecure (no certificate validation) and secure (also with
mutual TLS) TLS configurations, as well as a configuration with an
expired certificate.
A 'tls' statement can be specified both for individual addresses
and for the whole list (as a default value when an individual
address doesn't have its own 'tls' set), just as it was done
before for the 'port' value.
Create a new function 'print_rawqstring()' to print a string residing
in a 'isc_textregion_t' type parameter.
Create a new function 'copy_string()' to copy a string from a
'cfg_obj_t' object into a 'isc_textregion_t'.
Add a test case for a server that uses a resolver as an parental-agent.
We need two root servers, ns1 and ns10, one that delegates to the
'checkds' tld with the DS published (ns2), and one that delegates to
the 'checkds' tld with the DS removed (ns5). Both root zones are
being setup in the 'ns1/setup.sh' script.
We also need two resolvers, ns3 and ns8, that use different root hints
(one uses ns1 address as a hint, the other uses ns10).
Then add the checks to test_checkds.py is similar to the existing tests.
Update 'types' because for zones that have the DS withdrawn (or to be
withdrawn), the CDS and CDNSKEY records should not be published and
thus should not be in the NSEC bitmap.
Add 'port' token to deprecated.conf. Also add options
'use-v4-udp-ports', 'use-v6-udp-ports', 'avoid-v4-udp-ports',
and 'avoid-v6-udp-ports'.
All of these should trigger warnings (except when deprecation warnings
are being ignored).
Deprecate the use of "port" when configuring query-source(-v6),
transfer-source(-v6), notify-source(-v6), parental-source(-v6),
etc. Also deprecate use-{v4,v6}-udp-ports and avoid-{v4,v6}udp-ports.
The condition was accidentally reversed during refactoring in
9730ac4c56 . It would result in skipped
tests on builds with proper support and false negatives on builds
without proper feature support.
Credit for reporting the issue and the fix goes to Stanislav Levin.
Instead of using the current working directory to find the ifconfig.sh
script, look for the ifconfig.sh.in template in the directory where the
testsock.pl script is located. This enables the testsock.pl script to be
called from any working directory.
Using the ifconfig.sh.in template is sufficient, since it contains
the necessary information to be extracted: the max= value (which is
hard-coded in the template).
Move the core dump detection functionality for system test runs into a
separate script. This enables reuse by the pytest runner. The
functionality remains the same.
Avoid creating any temporary files in the current workdir.
Additional/changing files in the bin/tests/system directory are
problematic for pytest/xdist collection phase, which assumes the list of
files doesn't change between the collection phase of the main pytest
thread and the subsequent collection phase of the xdist worker threads.
Since the testcrypto.sh is also called during pytest initialization
through conf.sh.common (to detect feature support), this could
occasionally cause a race condition when the list of files would be
different for the main pytest thread and the xdist worker.
verify that updates are refused when the client is disallowed by
allow-query, and update forwarding is refused when the client is
is disallowed by update-forwarding.
verify that "too many DNS UPDATEs" appears in the log file when too
many simultaneous updates are processing.
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
Prime the cache with the following records:
shortttl.cname.example. 1 IN CNAME longttl.target.example.
longttl.target.example. 600 IN A 10.53.0.2
Wait for the CNAME record to expire, disable the authoritative server,
and query 'shortttl.cname.example' again, expecting a stale answer.
This bug was masked in the tests because the `catz` test script did an
`rndc addzone` before an `rndc delzone`. The `addzone` autovivified
the NZF config, so `delzone` worked OK.
This commit swaps the order of two sections of the `catz` test script
so that it uses `delzone` before `addzone`, which provokes a crash
when `delzone` requires a non-NULL NZF config.
To fix the crash, we now try to remove the zone from the NZF config
only if it was dynamically added but not by a catalog zone.
Remove parsing the configuration options 'alt-transfer-source',
'alt-transfer-source-v6', and 'use-alt-transfer-source', and remove
the corresponding code that implements the feature.
Use the configured 'source' and 'source-v6' when initiating a zone
transfer, sending a notify, or when checking for the DS. Remove the
special code for using alternate transfer sources.
Update some system tests to use the new configuration and make sure
the tests still work.
Add a new way to configure the preferred source address when talking to
remote servers such as primaries and parental-agents. This will
eventually deprecate options such as 'parental-source',
'parental-source-v6', 'transfer-source', etc.
Example of the new configuration:
parental-agents "parents" port 5353 \
source 10.10.10.10 port 5354 dscp 54 \
source-v6 2001:db8::10 port 5355 dscp 55 {
10.10.10.11;
2001:db8::11;
};
The pre-defined test cases use named.$TESTCASE.conf naming convention,
where TESTCASE is a human readable name contaning actual word(s). The
autogenerated test cases' names always start with a number from 1 to 6.
bin/tests/system/rrsetorder/dig.out* files match a gitignore expression
present in bin/tests/system/.gitignore. Since these are meant to be
reference files that are compared to the files generated when the
"rrsetorder" system test is run, rename them to avoid listing tracked
files in .gitignore files.
The test expects a "connection timed out" message from DiG when it
experiences a timeout, while the current version of DiG prints just
a "timed out" message, like below:
;; communications error to 10.53.0.1#11314: timed out
;; communications error to 10.53.0.1#11314: timed out
;; communications error to 10.53.0.1#11314: timed out
; <<>> DiG 9.19.9-dev <<>> -p 11314 +tries +time +tcp +tries +time @10.53.0.1 dropedns. TXT
; (1 server found)
;; global options: +cmd
;; no servers could be reached
Change the expected string to match the current DiG output.
Use the '-F' switch for "grep" for matching a fixed string.
In order to have a common naming convention for system tests, rename the
only outlier "engine_pkcs11" to "enginepkcs11", which was the only
system test using an underscore in its name.
The only allowed word separators for system test names are either dash
or no separator.
It is better to use consistent file names to avoid issue with sorting
etc.
Using underscore in filenames as opposed to dash was chosen because it
seems more common in pytest/python to use underscore for filenames.
Also rename the bin/tests/system/timeouts/tests-tcp.py file to
bin/tests/system/timeouts/tests_tcp_timeouts.py to avoid pytest name
collision (there can't be two files named tests_tcp.py).
This introduces a Python dependency for running system tests. It is
needed in order to:
- write new test control scripts in Python
- gradually rewrite old Perl scripts into Python if needed
- eventually introduce pytest as the new test runner framework
This commit is not intended to be backported to 9.16.
Nothing from conf.sh.common is required to set these values. On the
contrary, a Python interpreter needs to be set in order to randomize the
algorithm set (which happens in conf.sh.common).
When testcrypto.sh is used as a standalone script, always use quiet mode
to avoid using undefined commands (such as echo_i) which require
inclusion of the entire conf.sh machinery.
The dispatches are not thread-bound, and used freely between various
threads (see the dns_resolver and dns_request units for details).
This refactoring make sure that all non-const dns_dispatch_t and
dns_dispentry_t members are accessed under a lock, and both object now
track their internal state (NONE, CONNECTING, CONNECTED, CANCELED)
instead of guessing the state from the state of various struct members.
During the refactoring, the artificial limit DNS_DISPATCH_SOCKSQUOTA on
UDP sockets per dispatch was removed as the limiting needs to happen and
happens on in dns_resolver and limiting the number of UDP sockets
artificially in dispatch could lead to unpredictable behaviour in case
one dispatch has the limit exhausted by others are idle.
The TCP artificial limit of DNS_DISPATCH_MAXREQUESTS makes even less
sense as the TCP connections are only reused in the dns_request API
that's not a heavy user of the outgoing connections.
As a side note, the fact that UDP and TCP dispatch pretends to be same
thing, but in fact the connected UDP is handled from dns_dispentry_t and
dns_dispatch_t acts as a broker, but connected TCP is handled from
dns_dispatch_t and dns_dispatchmgr_t acts as a broker doesn't really
help the clarity of this unit.
This refactoring kept to API almost same - only dns_dispatch_cancel()
and dns_dispatch_done() were merged into dns_dispatch_done() as we need
to cancel active netmgr handles in any case to not leave dangling
connections around. The functions handling UDP and TCP have been mostly
split to their matching counterparts and the dns_dispatch_<function>
functions are now thing wrappers that call <udp|tcp>_dispatch_<function>
based on the socket type.
More debugging-level logging was added to the unit to accomodate for
this fact.