There were two problems in the notify system test when it waited for
log messages to appear: the shellcheck refactoring introduced a call
to `wait_for_log` with a regex, but `wait_for_log` only supports fixed
strings, so it always ran for the full 45 second timeout; and the new
test to ensure that notify messages time out failed to reset the
nextpart pointer, so if the notify messages timed out before the test
ran, it would fail to see them.
This change adds a `wait_for_log_re` helper that matches a regex, and
uses it where appropriate in the notify system test, which stops the
test from waiting longer than necessary; and it resets the nextpart
pointer so that the notify timeout test works reliably.
Closes#3275
For some applications, it's useful to not listen on full battery of
threads. Add workers argument to all isc_nm_listen*() functions and
convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.
Prime the cache with a negative cache DS entry then make a query for
name beneath that entry. This will cause the DS entry to be retieved
as part of the validation process. Each RRset in the ncache entry
will be validated and the trust level for each will be updated.
Catalog zones change of ownership is special mechanism to facilitate
controlled migration of a member zone from one catalog to another.
It is implemented using catalog zones property named "coo" and is
documented in DNS catalog zones draft version 5 document.
Implement the feature using a new hash table in the catalog zone
structure, which holds the added "coo" properties for the catalog zone
(containing the target catalog zone's name), and the key for the hash
table being the member zone's name for which the "coo" property is being
created.
Change some log messages to have consistent zone name quoting types.
Update the ARM with change of ownership documentation and usage
examples.
Add tests which check newly the added features.
According to DNS catalog zones draft version 5 document, catalog
zone custom properties must be placed under the "ext" label.
Make necessary changes to support the new custom properties syntax in
catalog zones with version "2" of the schema.
Change the default catalog zones schema version from "1" to "2" in
ARM to prepare for the new features and changes which come starting
from this commit in order to support the latest DNS catalog zones draft
document.
Make some restructuring in ARM and rename the term catalog zone "option"
to "custom property" to better reflect the terms used in the draft.
Change the version of 'catalog1.zone.' catalog zone in the "catz" system
test to "2", and leave the version of 'catalog2.zone.' catalog zone at
version "1" to test both versions.
Add tests to check that the new syntax works only with the new schema
version, and that the old syntax works only with the legacy schema
version catalog zones.
Add a test case for a dynamically added CDS DELETE record and make
sure it is not removed when signing the zone. This happens because
BIND maintains CDS and CDNSKEY publishing and it will only allow
CDS DELETE records if the zone is transitioning to insecure. This is
a state that can be identified when using KASP through 'dnssec-policy',
but not when using 'auto-dnssec'.
Commit bf3fffff67 added a Python-based
name server (bin/tests/system/forward/ans11/ans.py) to the "forward"
system test, but did not update bin/tests/system/Makefile.am to ensure
Python is present in the test environment before the "forward" system
test is run. Update bin/tests/system/Makefile.am to enforce that
requirement.
Implement TCP support in the `ans11` Python-based DNS server.
Implement a control command channel in `ans11` to support an optional
silent mode of operation, which, when enabled, will ignore incoming
queries.
In the added check, make the `ans11` the NS server of
"a.root-servers.nil." for `ns3`, so it uses `ans11` (in silent mode)
for the regular (non-forwarded) name resolutions.
This will trigger the "hung fetch" scenario, which was causing `named`
to crash.
- Check that an NS in an authority section returned from a forwarder
which is above the name in a configured "forward first" or "forward
only" zone (i.e., net/NS in a response from a forwarder configured for
local.net) is not cached.
- Test that a DNAME for a parent domain will not be cached when sent
in a response from a forwarder configured to answer for a child.
- Check that glue is rejected if its name falls below that of zone
configured locally.
- Check that an extra out-of-bailiwick data in the answer section is
not cached (this was already working correctly, but was not explicitly
tested before).
Add a test case to check for lingering TCP sockets stuck in the
CLOSE_WAIT state. This can happen if a client sends some garbage after
its first query.
The system test runs the reproducer script and then sends another TCP
query to the resolver. The resolver is configured to allow one TCP
client only. If BIND has its TCP socket stuck in CLOSE_WAIT, it does
not have the resources available to answer the second query.
Note: A better test would be to check if the named daemon does not
have a TCP socket stuck in CLOSE_WAIT for example with netstat. When
running this test locally you can examine named with netstat manually.
But since netstat is platform specific it is not a good candidate to do
this as a system test.
If you, if you could return, don't let it burn.
Do you have to let it linger?
- Cranberries
This allows Gitlab to show nice summary for individual tests/test
directories and to expose the results in Gitlab API for consumption
elsewhere.
A catch: As of Gitlab 14.7.7, the detailed results are stored
only in artifacts and thus expire. All consumers (including API) need
to be "fast enough" to get the data before they disappear.
This also forces us to always store the artifacts intead of storing them
only on failure.
There are a couple of problems with dns_request_createvia(): a UDP
retry count of zero means unlimited retries (it should mean no
retries), and the overall request timeout is not enforced. The
combination of these bugs means that requests can be retried forever.
This change alters calls to dns_request_createvia() to avoid the
infinite retry bug by providing an explicit retry count. Previously,
the calls specified infinite retries and relied on the limit implied
by the overall request timeout and the UDP timeout (which did not work
because the overall timeout is not enforced). The `udpretries`
argument is also changed to be the number of retries; previously, zero
was interpreted as infinity because of an underflow to UINT_MAX, which
appeared to be a mistake. And `mdig` is updated to match the change in
retry accounting.
The bug could be triggered by zone maintenance queries, including
NOTIFY messages, DS parental checks, refresh SOA queries and stub zone
nameserver lookups. It could also occur with `nsupdate -r 0`.
(But `mdig` had its own code to avoid the bug.)
In `send_udp()` and `launch_next_query()` functions, when calling
`dighost_printmessage()` to print detailed information about the
sent query, dig always prints the data of the first query in the
lookup's queries list.
The first query in the list can be already finished, having its handles
freed, and accessing this information results in assertion failure.
Print the current query's information instead.
In recv_done(), when dig decides to start the lookup's next query in
the line using `start_udp()` or `start_tcp()`, and for some reason,
no queries get started, dig doesn't cancel the lookup.
This can occur, for example, when there are two queries in the lookup,
one with a regular IP address, and another with a IPv4 mapped IPv6
address. When the regular IP address fails to serve the query, its
`recv_done()` callback starts the next query in the line (in this
case the one with a mapped IP address), but because `dig` doesn't
connect to such IP addresses, and there are no other queries in the
list, no new queries are being started, and the lookup keeps hanging.
After calling `start_udp()` or `start_tcp()` in `recv_done()`, check
if there are no pending/working queries then cancel the lookup instead
of only detaching from the current query.
After switching to per-thread resources in the zonemgr, the performance
was decreased because the memory context, zonetask and loadtask was
picked from the pool at random.
Pin the zone to single threadid (.tid) and align the memory context,
zonetask and loadtask to be the same, this sets the hard affinity of the
zone to the netmgr thread.
a test case in the 'resolver' system test was reliant on
logged output that would only be present when query tracing
was enabled, as in developer builds. that test case is now
disabled when query tracing is not available. Thanks to
Anton Castelli.
the line "$GENERATE 19-28/2147483645 $ CNAME x" should generate
a single CNAME with the owner "19.example.com", but prior to the
overflow bug it generated several CNAMEs, half of them with large
negative values.
we now test for the bugfix by using "named-checkzone -D" and
grepping for a single CNAME in the output.
Ensure the update zone name is mentioned in the NOTAUTH error message
in the server log, so that it is easier to track down problematic
update clients. There are two cases: either the update zone is
unrelated to any of the server's zones (previously no zone was
mentioned); or the update zone is a subdomain of one or more of the
server's zones (previously the name of the irrelevant parent zone was
misleadingly logged).
Closes#3209
The use of isc_appctx_t in dns_client was used to wait for
dns_client_startresolve() to finish the processing (the resolve_done()
task callback).
This has been replaced with standard bool+cond+lock combination removing
the need of isc_appctx_t altogether.
Previously, the isc_ht API would always take the key as a literal input
to the hashing function. Change the isc_ht_init() function to take an
'options' argument, in which ISC_HT_CASE_SENSITIVE or _INSENSITIVE can
be specified, to determine whether to use case-sensitive hashing in
isc_hash32() when hashing the key.
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE(). Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
This commit extends the 'doth' system test with a set of Strict/Mutual
TLS related checks.
This commit also makes each doth NS instance use its own TLS
certificate that includes FQDN, IPv4, and IPv6 addresses, issued using
a common Certificate Authority, instead of ad-hoc certs.
Extend servers initialisation timeout to 60 seconds to improve the
tests stability in the CI as certain configurations could fail to
initialise on time under load.
The dnssec-settime -p and -up options print times in asctime() and
UNIX time_t formats, respectively. The asctime() format can also be
found inside K*.key public key files. Key files also contain times in
the YYYYMMDDHHMMSS format that can be used in timing parameter
options.
The dnssec-settime -p and -up time formats are now acceptable in
timing parameter options to dnssec-settime and dnssec-keygen, so it is
no longer necessary to parse key files to retrieve times that are
acceptable in timing parameter options.
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data. This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).
Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline. As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete. The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.
Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler
NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.
Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
From an attacker's point of view, a VLA declaration is essentially a
primitive for performing arbitrary arithmetic on the stack pointer. If
the attacker can control the size of a VLA they have a very powerful
tool for causing memory corruption.
To mitigate this kind of attack, and the more general class of stack
clash vulnerabilities, C compilers insert extra code when allocating a
VLA to probe the growing stack one page at a time. If these probes hit
the stack guard page, the program will crash.
From the point of view of a C programmer, there are a few things to
consider about VLAs:
* If it is important to handle allocation failures in a controlled
manner, don't use VLAs. You can use VLAs if it is OK for
unreasonable inputs to cause an uncontrolled crash.
* If the VLA is known to be smaller than some known fixed size,
use a fixed size array and a run-time check to ensure it is large
enough. This will be more efficient than the compiler's stack
probes that need to cope with arbitrary-size VLAs.
* If the VLA might be large, allocate it on the heap. The heap
allocator can allocate multiple pages in one shot, whereas the
stack clash probes work one page at a time.
Most of the existing uses of VLAs in BIND are in test code where they
are benign, but there was one instance in `named`, in the GSS-TSIG
verification code, which has now been removed.
This commit adjusts the style guide and the C compiler flags to allow
VLAs in test code but not elsewhere.
Rework the "ans8" server in the "digdelv" system test to support various
modes of operations using a control channel.
The supported modes are:
1. `silent` (do not respond)
2. `close` (UDP: same as `silent`; TCP: also close the connection)
3. `servfail` (always respond with `SERVFAIL`)
4. `unstable` (constantly switch between `silent` and `servfail`)
Add multiple tests to check the handling of both TCP and UDP socket
error scenarios in dig/host.
Add a test to check whether dig tries the next query/server after
a connection error.
Add a test to check whether dig tries the next query/server after
a one or more (default is 3) connection/request timeouts.
This test ensures that `dig` retries with another attempt after a
timed-out request, and that it does not crash when the retried
request returns a SERVFAIL result. See [GL #3020] for the latter
issue.
Since pytest itself skips tests using dnspython if the latter is not
available, also using Automake conditionals for silently skipping
pytest-based tests requiring dnspython is redundant and hides
information. Allow all pytest-based tests requiring dnspython to be run
whenever pytest itself is available, in order to ensure test skipping is
done in a uniform manner.
Note that the above reasoning only applies to pytest-based tests, so
similar adjustments were not made for shell-based tests using Python
scripts that require dnspython ("chain", "cookie", "dnssec", "qmin").
The ability to conveniently mark tests which should only be run when the
CI_ENABLE_ALL_TESTS environment variable is set seems to be useful on a
general level and therefore it should not be limited to the "timeouts"
system test, where it is currently used.
pytest documentation [1] suggests to reuse commonly used test markers by
putting them all in a single Python module which then has to be imported
by test files that want to use the markers defined therein. Follow that
advice by creating a new bin/tests/system/pytest_custom_markers.py
Python module containing the relevant marker definitions.
Note that "import pytest_custom_markers" works from a test-specific
subdirectory because pytest modifies sys.path so that it contains the
paths to all parent directories containing a conftest.py file (and
bin/tests/system/ is one). PyLint does not like that, though, so add a
relevant PyLint suppression.
The above changes make bin/tests/system/timeouts/conftest.py redundant,
so remove it.
[1] https://docs.pytest.org/en/7.0.x/how-to/skipping.html#id1
Ensure all "import dns.*" statements are always placed after
pytest.importorskip('dns') calls, in order to allow the latter to
fulfill their purpose. Explicitly import all dnspython modules used by
each dnspython-based test to avoid relying on nested imports. Replace
function-scoped imports with global imports to reduce code duplication.
The intended purpose of the @pytest.mark.dnspython{,2} decorators was to
cause dnspython-based tests to be skipped if dnspython is not available
(or not recent enough). However, a number of system tests employing
those decorators contain global "import dns.resolver" statements which
trigger ImportError exceptions during test initialization if dnspython
is not available. In other words, the @pytest.mark.dnspython{,2}
decorators serve no useful purpose.
Currently, whenever a Python-based test requires dnspython, that
requirement applies to all tests in a given *.py file. Given that,
employ global pytest.importorskip() calls to ensure dnspython-based
parts of various system tests are skipped when dnspython is not
available. Remove all occurrences of the @pytest.mark.dnspython{,2}
decorators (and all associated code) to prevent confusion.
The intended purpose of the @pytest.mark.requests decorator was to cause
Python-based parts of the "statschannel" system test to be skipped if
the requests Python module is not available. However, both
tests-json.py and tests-xml.py contain a global "import requests"
statement which triggers ImportError exceptions during test
initialization if the requests module is not available. In other words,
the @pytest.mark.requests decorator serves no useful purpose.
Since all tests in both tests-json.py and tests-xml.py depend on the
requests Python module, employ pytest.importorskip() to ensure the
Python-based parts of the "statschannel" system test are skipped when
the requests module is not available. Remove all occurrences of the
@pytest.mark.requests decorator (and all associated code) to prevent
confusion.
All tests in bin/tests/system/statschannel/tests-xml.py require libxml2
support to be enabled in BIND 9 at build-time. Instead of applying the
same pytest.mark.skipif() decorator to every test in that file, set the
'pytestmark' global accordingly in order to immediately skip all tests
in tests-xml.py if libxml2 support is not compiled in.
Remove all occurrences of the @pytest.mark.xml decorator (and all
associated code) from the "statschannel" system test as the
xml.etree.ElementTree module is a part of the Python standard library
since Python 2.5 (so checking whether it is available is redundant) and
checking for libxml2 support in the tested BIND 9 build is already
handled by setting the 'pytestmark' global accordingly.
All tests in bin/tests/system/statschannel/tests-json.py require json-c
support to be enabled in BIND 9 at build-time. Instead of applying the
same pytest.mark.skipif() decorator to every test in that file, set the
'pytestmark' global accordingly in order to immediately skip all tests
in tests-json.py if json-c support is not compiled in.
Remove all occurrences of the @pytest.mark.json decorator (and all
associated code) from the "statschannel" system test as the json module
is a part of the Python standard library since Python 2.6 (so checking
whether it is available is redundant) and checking for json-c support in
the tested BIND 9 build is already handled by setting the 'pytestmark'
global accordingly.
Also remove a related excerpt from bin/tests/system/rpzextra/conftest.py
as it is a copy-paste artifact that serves no purpose in the "rpzextra"
system test.
The "statschannel" system test contains two Python helper modules:
- generic.py: test functions directly invoked by both tests-json.py
and test-xml.py,
- helper.py: helper functions invoked by test functions in generic.py.
The above logic for splitting helper functions into Python modules
prevents selective test skipping from working due to unconditional
import statements being present in both helper modules. For example, if
dnspython is not available on the test host, tests-json.py imports
generic.py, which in turn imports helper.py, which in turn attempts to
import various dnspython modules, triggering ImportError exceptions
during test initialization. Various decorators used for some tests
(like @pytest.mark.dnspython) suggest that such a scenario should be
handled gracefully, but that is not the case - modifying the test
collection in conftest.py does not prevent pytest from failing due to
import errors.
Fix by moving helper functions around to achieve a different split:
- generic.py: helper functions only relying on the Python standard
library,
- generic_dnspython.py: helper functions requiring dnspython.
Only two tests in tests-{json,xml}.py need dnspython to work
(test_traffic_json(), test_traffic_xml()). Since all
dnspython-dependent code is now present in generic_dnspython.py, employ
pytest.importorskip() in those two tests to ensure they can be
selectively skipped when dnspython is not available. Adjust other code
to account for the revised Python helper module layout. Remove all
occurrences of the @pytest.mark.dnspython decorator (and all associated
code) from the "statschannel" system test to prevent confusion.
The find invocation used by the bin/tests/system/get_ports.sh script
("find . -maxdepth 1 -mindepth 1 -type d") assumes the list of
directories in bin/tests/system/ remains unchanged throughout the run
time of a single system test suite. With pytest in use and the
conftest.py file now present in bin/tests/system/, that assumption is no
longer true as a __pycache__ directory may be created when the first
pytest-based test is started. Since the list of names returned by the
above find invocation serves as a fixed-size array of "port range
slots", any changes to that list during a system test suite run may lead
to port assignment collisions [1].
Fix by making the find invocation more nuanced, so that it only returns
names of directories containing test code. Squash a grep / cut pipeline
into a single awk invocation.
[1] see commit 31e5ca4bd9