The -n (nametype) option for keys defaults to ZONE for DNSKEY
type keys, and HOST for KEY type keys. There is currently no
practical reason to use any other name type; we can simplify
things by removing the option.
Explicitly use an empty 'trust-anchors' statement in the system
tests where it was used implicitly before.
In resolver/ns5/named.conf.in use the trust anchor in 'trusted.conf',
which was supposed to be used there.
The lock-file configuration (both from configuration file and -X
argument to named) has better alternatives nowadays. Modern process
supervisor should be used to ensure that a single named process is
running on a given configuration.
Alternatively, it's possible to wrap the named with flock(1).
All changes in this commit were automated using the command:
shfmt -w -i 2 -ci -bn . $(find . -name "*.sh.in")
By default, only *.sh and files without extension are checked, so
*.sh.in files have to be added additionally. (See mvdan/sh#944)
The changes were mostly done with sed:
find . -name '*.sh' | xargs sed -i 's/`\([^`]*\)`/$(\1)/g'
There have been a few manual changes where the regex wasn't sufficient
(e.g. backslashes inside the `...`) or wrong (`...` referring to docs or
in comments).
Change the way arithmetic operations are performed in system test shell
scripts from using `expr` to `$(())`. This ensures that updating the
variable won't end up with a non-zero exit code, which would case the
script to exit prematurely when `set -e` is in effect.
The following replacements were performed using sed in all text files
(git grep -Il '' | xargs sed -i):
s/status=`expr $status + $ret`/status=$((status + ret))/g
s/n=`expr $n + 1`/n=$((n + 1))/g
s/t=`expr $t + 1`/t=$((t + 1))/g
s/status=`expr $status + 1`/status=$((status + 1))/g
s/try=`expr $try + 1`/try=$((try + 1))/g
Ensure all shell system tests are executed with the errexit option set.
This prevents unchecked return codes from commands in the test from
interfering with the tests, since any failures need to be handled
explicitly.
In order to run the shell system tests, the pytest runner has to pick
them up somehow. Adding an extra python file with a single function
for the shell tests for each system test proved to be the most
compatible way of running the shell tests across older pytest/xdist
versions.
Modify the legacy run.sh script to ignore these pytest-runner specific
glue files when executing tests written in pytest.
At the time of test number (19), there were 10 "sending packet to
10.53.0.7" lines in the "legacy/ns1/named.run" file; usually, only seven
are present:
I:legacy:checking recursive lookup to edns 512 + no tcp server does not cause query loops (19)
I:legacy:ns1 sent 10 queries to ns7, expected less than 10
I:legacy:failed
Those three can be attributed to tests "8", "10", and "18", where the
dig of "resolution_fails()" retried after a timeout to succeed with
"status: SERVFAIL" subsequently, as seen in each of
dig.out.test{8,10,18} files.
;; communications error to 10.53.0.1#13093: timed out
; <<>> DiG 9.19.12-dev <<>> -p 13093 +tcp @10.53.0.1 edns512-notcp. TXT
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5368
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
This retry is unnecessary because "resolution_fails()" considers timeout
a positive result.
The test expects a "connection timed out" message from DiG when it
experiences a timeout, while the current version of DiG prints just
a "timed out" message, like below:
;; communications error to 10.53.0.1#11314: timed out
;; communications error to 10.53.0.1#11314: timed out
;; communications error to 10.53.0.1#11314: timed out
; <<>> DiG 9.19.9-dev <<>> -p 11314 +tries +time +tcp +tries +time @10.53.0.1 dropedns. TXT
; (1 server found)
;; global options: +cmd
;; no servers could be reached
Change the expected string to match the current DiG output.
Use the '-F' switch for "grep" for matching a fixed string.
The system test should never attempt to start or stop any other server
than those that belong to that system test. Therefore, it is not
necessary to specify the system test name in function calls.
Additionally, this makes it possible to run the test inside a
differently named directory, as its name is automatically detected with
the $SYSTESTDIR variable. This enables running the system tests inside a
temporary directory.
Direct use of stop.pl was replaced with a more systematic approach to
use stop_servers helper function.
The checkbashisms script reports errors like this one:
script util/check-line-length.sh does not appear to have a #! interpreter line;
you may get strange results
DiG implements different logic in the `recv_done()` callback function
when processing a failure:
1. For a timed-out query it applies the "retries" logic first, then,
when it fails, fail-overs to the next server.
2. For an EOF (end-of-file, or unexpected disconnect) error it tries to
make a single retry attempt (even if the user has requested more
retries), then, when it fails, fail-overs to the next server.
3. For other types of failures, DiG does not apply the "retries" logic,
and tries to fail-over to the next servers (again, even if the user
has requested to make retries).
Simplify the logic and apply the same logic (1) of first retries, and
then fail-over, for different types of failures in `recv_done()`.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
The ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGCTX did sanity checks on matching
size and memory context on the memory returned to the allocator. Those
will no longer needed when most of the allocator will be replaced with
jemalloc.
The SOA lookup for edns512 could succeed if the negative response
for ns.edns512/AAAA completed before all the edns512/SOA query
attempts are made. The ns.edns512/AAAA lookup returns tc=1 and
the SOA record is cached after processing the NODATA response.
Lookup a TXT record at edns512 and look it up instead of the
SOA record.
Removed 'checking that TCP failures do not influence EDNS statistics
in the ADB' as it is no longer appropriate.
* the legacy test with -T maxudp512 will just fail, e.g. if the packets
larger than 512 octets are dropped along the path, the proper response
is to fail
* digdelv test was just expecting default server EDNS buffer size to be
4096, the test needed only slight adjustment
In order to lower the amount of memory allocated at startup by named
instances used in the BIND system test suite, set the default value of
"max-cache-size" for these to 2 megabytes. The purpose of this change
is to prevent named instances (or even entire virtual machines) from
getting killed by the operating system on the test host due to excessive
memory use.
Remove all "max-cache-size" statements from named configuration files
used in system tests ("checkconf" notwithstanding) to prevent confusion
as the "-T maxcachesize=..." command line option takes precedence over
configuration files.
The $SYSTEMTESTTOP shell variable if often set to .. in various shell
scripts inside bin/tests/system/, but most of the time it is only
used one line later, while sourcing conf.sh. This hardly improves
code readability.
$SYSTEMTESTTOP is also used for the purpose of referencing
scripts/files living in bin/tests/system/, but given that the
variable is always set to a short, relative path, we can drop it and
replace all of its occurrences with the relative path without adversely
affecting code readability.
this changes most visble uses of master/slave terminology in tests.sh
and most uses of 'type master' or 'type slave' in named.conf files.
files in the checkconf test were not updated in order to confirm that
the old syntax still works. rpzrecurse was also left mostly unchanged
to avoid interference with DNSRPS.
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
The first step in all existing setup.sh scripts is to call clean.sh. To
reduce code duplication and ensure all system tests added in the future
behave consistently with existing ones, invoke clean.sh from run.sh
before calling setup.sh.
this adds functions in conf.sh.common to create DS-style trust anchor
files. those functions are then used to create nearly all of the trust
anchors in the system tests.
there are a few exceptions:
- some tests in dnssec and mkeys rely on detection of unsupported
algorithms, which only works with key-style trust anchors, so those
are used for those tests in particular.
- the mirror test had a problem with the use of a CSK without a
SEP bit, which still needs addressing
in the future, some of these tests should be changed back to using
traditional trust anchors, so that both types will be exercised going
forward.
With the netmgr in use, named may start answering queries before zones
are loaded. This can cause transient failures in system tests after
servers are restarted or reconfigured. This commit adds retry loops
and sleep statements where needed to address this problem.
Also incidentally silenced a clang warning.
- ns__client_request() is now called by netmgr with an isc_nmhandle_t
parameter. The handle can then be permanently associated with an
ns_client object.
- The task manager is paused so that isc_task events that may be
triggred during client processing will not fire until after the netmgr is
finished with it. Before any asynchronous event, the client MUST
call isc_nmhandle_ref(client->handle), to prevent the client from
being reset and reused while waiting for an event to process. When
the asynchronous event is complete, isc_nmhandle_unref(client->handle)
must be called to ensure the handle can be reused later.
- reference counting of client objects is now handled in the nmhandle
object. when the handle references drop to zero, the client's "reset"
callback is used to free temporary resources and reiniialize it,
whereupon the handle (and associated client) is placed in the
"inactive handles" queue. when the sysstem is shutdown and the
handles are cleaned up, the client's "put" callback is called to free
all remaining resources.
- because client allocation is no longer handled in the same way,
the '-T clienttest' option has now been removed and is no longer
used by any system tests.
- the unit tests require wrapping the isc_nmhandle_unref() function;
when LD_WRAP is supported, that is used. otherwise we link a
libwrap.so interposer library and use that.
If a TCP connection fails while attempting to send a query to a server,
the fetch context will be restarted without marking the target server as
a bad one. If this happens for a server which:
- was already marked with the DNS_FETCHOPT_EDNS512 flag,
- responds to EDNS queries with the UDP payload size set to 512 bytes,
- does not send response packets larger than 512 bytes,
and the response for the query being sent is larger than 512 byes, then
named will pointlessly alternate between sending UDP queries with EDNS
UDP payload size set to 512 bytes (which are responded to with truncated
answers) and TCP connections until the fetch context retry limit is
reached. Prevent such query loops by marking the server as bad for a
given fetch context if the advertised EDNS UDP payload size for that
server gets reduced to 512 bytes and it is impossible to reach it using
TCP.
- managed-keys is now deprecated as well as trusted-keys, though
it continues to work as a synonym for dnssec-keys
- references to managed-keys have been updated throughout the code.
- tests have been updated to use dnssec-keys format
- also the trusted-keys entries have been removed from the generated
bind.keys.h file and are no longer generated by bindkeys.pl.
Performing server setup checks using "+tries=3 +time=5" is redundant as
a single query is arguably good enough for determining whether a given
named instance was set up properly. Only use multiple queries with a
long timeout for resolution checks in the "legacy" system test, in order
to significantly reduce its run time (on a contemporary machine, from
about 1m45s to 0m40s).
In the "legacy" system test, in order to make server setup checks more
consistent with each other, add further checks for either presence or
absence of the EDNS OPT pseudo-RR in the responses returned by the
tested named instances.
Extract repeated dig and grep calls into two helper shell functions,
resolution_succeeds() and resolution_fails(), in order to reduce code
duplication in the "legacy" system test, emphasize the similarity
between all the resolution checks in that test, and make the conditions
for success and failure uniform for all resolution checks in that test.
When testing named instances which are configured to drop outgoing UDP
responses larger than 512 bytes, querying with DO=1 may be used instead
of querying for large TXT records as the effect achieved will be
identical: an unsigned response for a SOA query will be below 512 bytes
in size while a signed response for the same query will be over 512
bytes in size. Doing this makes all resolution checks in the "legacy"
system test more similar. Add checks for the TC flag being set in UDP
responses which are expected to be truncated to further make sure that
tested named instances behave as expected.