DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
Create a zone that triggers DNAME owner name checks in a zone that
is only reachable using a dual stack server. The answer contains
a name that is higher in the tree than the query name.
e.g.
foo.v4only.net. CNAME v4only.net.
v4only.net. A 10.0.0.1
ns4 is serving the test zone (ipv4-only)
ns6 is the root server for this test (dual stacked)
ns7 is acting as the dual stack server (dual stacked)
ns9 is the server under test (ipv6-only)
The test triggers a prefetch, but fails to check if it acutally
happened, which prevented it from catching a bug when the record's
TTL value matches the configured prefetch eligibility value.
Check that prefetch happened by comparing the TTL values.
Add a test ensuring that the amount of work fctx_getaddresses() performs
for any encountered delegation is limited: delegate example.net to a set
of 1,000 name servers in the redirect.com zone, the names of which all
resolve to IP addresses that nothing listens on, and query for a name in
the example.net domain, checking the number of times the findname()
function gets executed in the process; fail if that count is excessively
large.
Since the size of the referral response sent by ans3 is about 20 kB, it
cannot be sent back over UDP (EMSGSIZE) on some operating systems in
their default configuration (e.g. FreeBSD - see the
net.inet.udp.maxdgram sysctl). To enable reliable reproduction of
CVE-2022-2795 (retry patterns vary across BIND 9 versions) and avoid
false positives at the same time (thread scheduling - and therefore the
number of fetch context restarts - vary across operating systems and
across test runs), extend bin/tests/system/resolver/ans3/ans.pl so that
it also listens on TCP and make "ns1" in the "resolver" system test
always use TCP when communicating with "ans3".
Also add a test (foo.bar.sub.tld1/TXT) that ensures the new limitations
imposed on the resolution process by the mitigation for CVE-2022-2795 do
not prevent valid, glueless delegation chains from working properly.
the 'resolve' binary was added for testing dns_client as part of
the export library. the export libraries are no longer supported,
and tests using 'delv' provide the same coverage, so 'resolve' can
be removed now.
previously, when an iterative query returned FORMERR, resolution
would be stopped under the assumption that other servers for
the same domain would likely have the same capabilities. this
assumption is not correct; some domains have been reported for
which some but not all servers will return FORMERR to a given
query; retrying allows recursion to succeed.
a test case in the 'resolver' system test was reliant on
logged output that would only be present when query tracing
was enabled, as in developer builds. that test case is now
disabled when query tracing is not available. Thanks to
Anton Castelli.
- Removed all code that only runs under CYGWIN, and made all
code that doesn't run under CYGWIN non-optional.
- Removed the $TP variable which was used to add optional
trailing dots to filenames; they're no longer optional.
- Removed references to pssuspend and dos2unix.
- No need to use environment variables for diff and kill.
- Removed uses of "tr -d '\r'"; this was a workaround for
a cygwin regex bug that is no longer needed.
1. 10 seconds is an unfortunate pick because that reintroduces the
problem described in commit 5307bf64 (for an earlier check).
Change the +tries=3 +timeout=10 to +tries=2 +time=15, so that we
minimize the risk of dig missing any responses sent by the server in
the first 15 seconds while also increasing our chances of the
response arriving in time on machines under heavy load and allowing
it a single retry in case things go awry.
2. The comment about TCP above was misleading: as painfully proven by
GitLab CI, using TCP is no guarantee of receiving a response in a
timely manner. It may help a bit, but it is certainly not a 100%
reliable solution.
Change the dig invocation to just use UDP like in the two prior
tests for consistency (and revise that comment accordingly).
The resolver system tests was exhibiting often intermitten failures,
increase the timeout from default 5 second to 10 seconds to give the dig
more leeway for providing an answer.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
the resolver test checks that the correct number of fetches have
been sent NS rrsets of a given size, but it formerly did so by
counting queries received by the authoritative server, which could
result in an off-by-one count if one of the queries had been resent
due to a timeout or a port number collision.
this commit changes the test to count fetches initiated by the
resolver, which should prevent the intermittent test failure, and
is the actual datum we were interested in anyway.
Add a lame delegation to lame.example.org with only an A record
in the additional section; on failure, this will trigger a retry
with AAAA, which will loop. Test that dig returns SERVFAIL, in
addition to confirming that named doesn't hang on shutdown.
The additional processing method has been expanded to take the
owner name of the record, as HTTPS and SVBC need it to process "."
in service form.
The additional section callback can now return the RRset that was
added. We use this when adding CNAMEs. Previously, the recursion
would stop if it detected that a record you added already exists. With
CNAMEs this rule doesn't work, as you ultimately care about the RRset
at the target of the CNAME and not the presence of the CNAME itself.
Returning the record allows the caller to restart with the target
name. As CNAMEs can form loops, loop protection was added.
As HTTPS and SVBC can produce infinite chains, we prevent this by
tracking recursion depth and stopping if we go too deep.
The ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGCTX did sanity checks on matching
size and memory context on the memory returned to the allocator. Those
will no longer needed when most of the allocator will be replaced with
jemalloc.
one of the tests in the resolver system test depends on dig
getting no response to its first two query attempts, and SERVFAIL
on the third after resolution times out.
using a 5-second retry timer in dig means the SERVFAIL response
could occur while dig is discarding the second query and preparing
to send the third. in this case the server's response could be
missed. shortening the retry interval to 4 seconds ensures that
dig has already sent the third query when the SERVFAIL response
arrives.
also, the serve-stale system test could fail due to a race in which
it timed out after waiting ten seconds for a file to be written, and
the dig timeout was just a bit longer. this is addressed by extending
the dig timeout to 11 seconds for this test.
The $SYSTEMTESTTOP shell variable if often set to .. in various shell
scripts inside bin/tests/system/, but most of the time it is only
used one line later, while sourcing conf.sh. This hardly improves
code readability.
$SYSTEMTESTTOP is also used for the purpose of referencing
scripts/files living in bin/tests/system/, but given that the
variable is always set to a short, relative path, we can drop it and
replace all of its occurrences with the relative path without adversely
affecting code readability.
this changes most visble uses of master/slave terminology in tests.sh
and most uses of 'type master' or 'type slave' in named.conf files.
files in the checkconf test were not updated in order to confirm that
the old syntax still works. rpzrecurse was also left mostly unchanged
to avoid interference with DNSRPS.
Add a system test that counts how many address fetches are made
for different numbers of NS records and checks that the number
are successfully limited.
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
Added test to ensure that NXDOMAIN is returned when BIND is queried for a
non existing domain in CH class (if a view of CHAOS class is configured)
and that it also doesn't crash anymore in those cases.
The first step in all existing setup.sh scripts is to call clean.sh. To
reduce code duplication and ensure all system tests added in the future
behave consistently with existing ones, invoke clean.sh from run.sh
before calling setup.sh.
These two tests were failing basically because in order for prefetching to
happen, the TTL for a given DNS record must be greater than or equal to
the prefetch config value + 9.
The previous TTL for both records was 10, while prefetch value in
configuration was 3, thus making only records with TTL >= 12 elligible
for prefetching.
TTL value for both records was adjusted to the value 13, and prefetch
value was set to 4 (inc by 1), so records with TTL (4 + 9) >= 13 are
elligible for prefetching.
Adjusting prefetch value to 4 gives the test 1 second more to avoid time
problems when sharing resources on a heavy loaded PC.
Also prefetch value in settings is now read by the script and used
by it to corrrectly calculate the amount of time needed to delay before
sending a request to trigger prefetch, adding a bit of flexibility to
fine tune the test in the future.
The previous test had two problems:
1. It wasn't written specifically for testing what it was supposed to:
prefetch disabled.
2. It could fail in some circunstances if the computer's load is too
high, due to sleeps not taking parallel tests and cpu load into account.
The new test is testing prefetch disabled as follows:
1. It asks for a txt record for a given domain and takes note of the
record's TTL (which is 10).
2. It sleeps for (TTL - 5) = 5 seconds, having a window of 5 seconds to
issue new queries before the record expires from cache.
3. Three(3) queries are executed in a row, with a interval of 1 second
between them, and for each query we verify that the TTL in response is
less than the previous one, thus ensuring that prefetch is disabled (if
it were enabled this record would have been refreshed already and TTL
would be >= the first TTL).
Having a window of 5 seconds to perform 3 queries with a interval of 1
second between them gives the test a reasonable amount of time
to not suffer from a machine with heavy load.
this adds functions in conf.sh.common to create DS-style trust anchor
files. those functions are then used to create nearly all of the trust
anchors in the system tests.
there are a few exceptions:
- some tests in dnssec and mkeys rely on detection of unsupported
algorithms, which only works with key-style trust anchors, so those
are used for those tests in particular.
- the mirror test had a problem with the use of a CSK without a
SEP bit, which still needs addressing
in the future, some of these tests should be changed back to using
traditional trust anchors, so that both types will be exercised going
forward.
With the netmgr in use, named may start answering queries before zones
are loaded. This can cause transient failures in system tests after
servers are restarted or reconfigured. This commit adds retry loops
and sleep statements where needed to address this problem.
Also incidentally silenced a clang warning.
- ns__client_request() is now called by netmgr with an isc_nmhandle_t
parameter. The handle can then be permanently associated with an
ns_client object.
- The task manager is paused so that isc_task events that may be
triggred during client processing will not fire until after the netmgr is
finished with it. Before any asynchronous event, the client MUST
call isc_nmhandle_ref(client->handle), to prevent the client from
being reset and reused while waiting for an event to process. When
the asynchronous event is complete, isc_nmhandle_unref(client->handle)
must be called to ensure the handle can be reused later.
- reference counting of client objects is now handled in the nmhandle
object. when the handle references drop to zero, the client's "reset"
callback is used to free temporary resources and reiniialize it,
whereupon the handle (and associated client) is placed in the
"inactive handles" queue. when the sysstem is shutdown and the
handles are cleaned up, the client's "put" callback is called to free
all remaining resources.
- because client allocation is no longer handled in the same way,
the '-T clienttest' option has now been removed and is no longer
used by any system tests.
- the unit tests require wrapping the isc_nmhandle_unref() function;
when LD_WRAP is supported, that is used. otherwise we link a
libwrap.so interposer library and use that.
Some system tests assume dig's default setings are in effect. While
these defaults may only be silently overridden (because of specific
options set in /etc/resolv.conf) for BIND releases using liblwres for
parsing /etc/resolv.conf (i.e. BIND 9.11 and older), it is arguably
prudent to make sure that tests relying on specific +timeout and +tries
settings specify these explicitly in their dig invocations, in order to
prevent test failures from being triggered by any potential changes to
current defaults.
If dots are not escaped in the "1.2.3.4" regular expressions used for
checking whether IP address 1.2.3.4 is present in the tested resolver's
answers, a COOKIE that matches such a regular expression will trigger a
false positive for the "resolver" system test. Properly escape dots in
the aforementioned regular expressions to prevent that from happening.
For all system tests utilizing named instances, call clean.sh from each
test's setup.sh script in a consistent way to make sure running the same
system test multiple times using run.sh does not trigger false positives
caused by stale files created by previous runs.
Ideally we would just call clean.sh from run.sh, but that would break
some quirky system tests like "rpz" or "rpzrecurse" and being consistent
for the time being does not hurt.