Commit graph

10998 commits

Author SHA1 Message Date
Ondřej Surý
576345a447
Refactor the privilege dropping
On Linux, libcap is now mandatory.  It makes things simpler for us.

Systems without {set,get}res{uid,gid} now have a compatibility shim using
setreuid/setregid or seteuid/setegid to set up the effective UID/GID, so
the same code can be called all the time (including on Linux).
2022-11-01 14:37:30 +01:00
Ondřej Surý
04a5477eb2
Rescan interfaces before dropping privileges
The ns_interfacemgr_scan() now requires the loopmgr to be running, so we
need to end exclusive mode for the rescan and then begin it again.

This is a relatively safe operation (because the scan happens on the timer
anyway), but we need to ensure that we won't load the configuration from
different threads.  This is already the case because the initial load
happens on the main thread and the control channel also listens just on
the main loop.
2022-11-01 11:48:56 +01:00
Aram Sargsyan
8c48eabbc1 Test managed-keys placeholder
Add a dnssec test to make sure that named can correctly process a
managed-keys zone with a placeholder KEYDATA record.
2022-11-01 09:50:34 +00:00
Evan Hunt
d9b85cbaae make dupsigs test less timing-sensitive
the dupsigs test is prone to failing on slow CI machines
because the first test can occur before the zone is fully
signed.

instead of just waiting ten seconds arbitrarily, we now
check every second, and allow up to 30 seconds before giving
up.
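
The polling approach can be sketched as below. This is a hypothetical
Python helper, not the test's actual shell code; the name retry_until and
its parameters are assumptions made for illustration.

```python
import time


def retry_until(check, attempts=30, delay=1.0, sleep=time.sleep):
    """Call check() up to `attempts` times, pausing `delay` seconds between tries.

    Returns True as soon as check() succeeds, False if every attempt fails.
    """
    for attempt in range(attempts):
        if check():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A caller would pass a function that inspects the zone (for example, one
that looks for the expected signatures) instead of sleeping for a fixed
interval before the first check.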
2022-10-31 04:03:01 -07:00
Ondřej Surý
04670889bc Refactor dns_master_dump*async() to use offloaded work
The dns_master_dump*async() functions were using isc_async_run() to
schedule work on the active loop; use isc_work_enqueue() instead.
2022-10-31 10:30:27 +00:00
Evan Hunt
b54c721894 refactor dns_master_dump*async() to use loop callbacks
Asynchronous zone dumping now uses loop callbacks instead of
task events.
2022-10-31 10:30:27 +00:00
Mark Andrews
1244a2ffb9 Test named's check-svcb behaviour with UPDATE
Checks that malformed _dns SVCB records are rejected unless
check-svcb is set to no, in which case they are accepted. Both
missing ALPN and missing DOHPATH are checked for.
2022-10-29 00:22:54 +11:00
Mark Andrews
c040e82c82 Check check-svcb processing in nsupdate 2022-10-29 00:22:54 +11:00
Mark Andrews
7782c78d15 Add various zones containing bad _dns SVCB records 2022-10-29 00:22:54 +11:00
Mark Andrews
da6359345e Add check-svcb to named
check-svcb signals whether to perform additional constraint tests
when loading / updating primary zone files.
2022-10-29 00:22:54 +11:00
Mark Andrews
f857006cd9 Add checking of _dns SVCB records constraints to nsupdate
_dns SVCB records have additional constraints which should be checked
when records are being added.  This adds those constraint checks but
allows the user to override them using 'check-svcb no'.
2022-10-29 00:22:54 +11:00
Matthijs Mekking
72530d2f9c Add new upforwd system test
Add a new upforwd system test that checks if update forwarding still
works if the first primary is badly configured.

We cannot reuse the 'example.' zone for this test because that
checks if update forwarding works for DoT. What transport is used
in the new test is of no relevance.

Update the system test to use different known good file names for
the different zones that are being tested.
2022-10-27 12:22:23 +02:00
Tom Krizek
f65f276f98
Randomize algorithm selection for mkeys test
Use the ALGORITHM_SET option to use a randomly selected default algorithm
in this test. Make sure the test works by using variables instead of
hard-coding values.
2022-10-27 12:14:29 +02:00
Tom Krizek
69b608ee9f
Set algorithms for system tests at runtime
Use the get_algorithms.py script to detect supported algorithms and
select random algorithms to use for the tests.

Make sure to load common.conf.sh after KEYGEN env var is exported.
2022-10-27 12:14:29 +02:00
Tom Krizek
5f480c8485
Script for random algorithm selection in system tests
Multiple algorithm sets can be defined in this script. These can be
selected via the ALGORITHM_SET environment variable. For compatibility
reasons, the "stable" set contains the currently used algorithms, since our
system tests need some changes before being compatible with randomly
selected algorithms.

The script works similarly to get_ports.py: environment variables are
created and then printed out as `export NAME=VALUE` commands, to be
interpreted by the shell. Once we support the pytest runner for system
tests, this should be a fixture instead.
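
A minimal sketch of this pattern follows. It is not the contents of
get_algorithms.py: the set contents, the "random" set name, and the
DEFAULT_ALGORITHM variable name used in the note are assumptions made
for illustration.

```python
import random
import shlex

# Hypothetical algorithm sets; the real script's names and contents may differ.
ALGORITHM_SETS = {
    "stable": ["ECDSAP256SHA256"],
    "random": ["RSASHA256", "ECDSAP256SHA256", "ED25519"],
}


def select_algorithm(set_name, rng=random):
    """Pick one algorithm at random from the named set."""
    return rng.choice(ALGORITHM_SETS[set_name])


def export_lines(variables):
    """Format variables as shell `export NAME=VALUE` commands, sorted by name."""
    return [
        f"export {name}={shlex.quote(str(value))}"
        for name, value in sorted(variables.items())
    ]
```

Printing the result of export_lines({"DEFAULT_ALGORITHM":
select_algorithm("stable")}) yields lines that a shell script can
evaluate to import the selection into its environment.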
2022-10-27 12:14:29 +02:00
Tom Krizek
37d14c69c0
Export env variables in system tests
Certain variables have to be exported in order for the system tests to
work. It makes little sense to export the variables in one place/script
while they're defined in another place.

Since it does no harm, export all the variables to make the behaviour
more predictable and consistent. Previously, some variables were
exported as environment variables, while others were just shell
variables which could be used once the configuration was sourced from
another script. However, they wouldn't be exposed to spawned processes.

For simplicity's sake (and for the upcoming effort to run system tests
with pytest), export all variables that are used. TESTS, PARALLEL_UNIX
and SUBDIRS variables are automake-specific, aren't used anywhere else
and thus not exported.
2022-10-27 12:14:29 +02:00
Tom Krizek
bb1c6bbdc7
Support testcrypto.sh usage without including conf.sh
The only variable really needed for the script to work is the path to
the $KEYGEN binary. Allow setting this via an environment variable to
avoid loading conf.sh (and causing a chicken-egg problem). Also make
testcrypto.sh executable to allow its use from conf.sh.
2022-10-27 12:14:29 +02:00
Tom Krizek
01b293b055
Unify indentation level in testcrypto.sh 2022-10-27 12:14:27 +02:00
Matthijs Mekking
72d3bf8e4e Fix config bug related to port setting
There are three levels for the port value, with increasing
priority:

1. The default ports, defined by 'port' and 'tls-port' config options.
2. The primaries-level default port: primaries port <number> { ... };
3. The primaries element-level port: primaries { <address> port
   <number>; ... };

In 'named_config_getipandkeylist()', the 'def_port' and 'def_tlsport'
variables are extracted from level 1. The 'port' variable is extracted
from level 2. Currently if that is unset, it defaults to the
default port ('def_port' or 'def_tlsport' depending on the transport
used), but overrides the level 2 port setting for the next primaries in
the list.

Update the code such that we inherit the port only if the level 3 port
is not set, and inherit from the default ports if the level 2 port is
also not set.
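
The intended inheritance order can be sketched as below. This is an
illustrative Python model, not the C code in
named_config_getipandkeylist(); the function names and the use of None
for "unset" are assumptions.

```python
def effective_port(element_port=None, primaries_port=None, default_port=53):
    """Return the port for one primary, preferring the most specific setting."""
    if element_port is not None:      # level 3: per-address port
        return element_port
    if primaries_port is not None:    # level 2: primaries-level port
        return primaries_port
    return default_port               # level 1: 'port' / 'tls-port' option


def resolve_ports(elements, primaries_port=None, default_port=53):
    """Resolve each element independently so one fallback never leaks into the next."""
    return [effective_port(p, primaries_port, default_port) for p in elements]
```

For example, resolve_ports([None, 5300, None]) returns [53, 5300, 53]:
the explicit port on the second element does not affect the third.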
2022-10-27 11:39:34 +02:00
Matthijs Mekking
622a499027 Add xfer system test case
Add a test case checking that if the first primary fails, the fallback
to a second primary on plain DNS works. This is mainly to test that the
port configuration inheritance works correctly.
2022-10-27 11:39:34 +02:00
Ondřej Surý
6ba0a22627
Change the return type of isc_lex_create() to void
The isc_lex_create() cannot fail, so cleanup the return type from
isc_result_t to void.
2022-10-26 12:55:06 +02:00
Tom Krizek
6295572b05
Remove misleading comment from serve-stale test
The stale-answer-client-timeout option is not set to 0 in the config,
nor is that the default value. This was probably caused by a
copy-paste error.
2022-10-24 14:23:27 +02:00
Tom Krizek
a4d72a57f9
Test serve stale cache with timeout 0 and CNAME
Add a couple of tests that verify the serve-stale behavior when
stale-answer-client-timeout is set to 0 and a (stale) CNAME record is
queried.

Related #3517
2022-10-24 14:23:26 +02:00
Aram Sargsyan
0227565cf1 Getting the "prefetch" setting from the configuration cannot fail
The "prefetch" setting is in "defaultconf" so it cannot fail, use
INSIST to confirm that.

The 'trigger' and 'eligible' variables are now prefixed with
'prefetch_' and their declaration moved to an upper level, because
there is no longer an additional code block after this change.
2022-10-21 10:19:54 +00:00
Aram Sargsyan
89fa9a6592 Add another prefetch check in the resolver system test
The test triggers a prefetch, but fails to check if it actually
happened, which prevented it from catching a bug when the record's
TTL value matches the configured prefetch eligibility value.

Check that prefetch happened by comparing the TTL values.
2022-10-21 10:17:03 +00:00
Artem Boldariev
fff01fe7eb
Fix named failing to start on Solaris systems with hundreds of CPUs
This commit fixes a startup issue on Solaris systems with
many (reportedly > 510) CPUs by bumping RLIMIT_NOFILE. This appears to
be a regression from 9.11.
2022-10-20 14:01:28 +02:00
Ondřej Surý
cd0e5c5784
Replace some raw nc usage in statschannel system test with curl
For tests where the TCP connection might get interrupted abruptly,
replace nc with curl, as the data sent from server to client might
get lost when the TCP connection is closed abruptly.  This happens when
the TCP connection gets closed while a large request is being sent to
the server.

As we already require curl for other system tests, replace the nc usage
in the statschannel test with curl that actually understands the
HTTP/1.1 protocol, so the same connection is reused for sending the
consecutive requests, but without client-side "pipelining".

For the record, the server doesn't support parallel processing of
pipelined requests, so it's a bit of a misnomer here, because what we are
actually testing is that we process all requests received in a single
TCP read callback.
2022-10-20 12:23:34 +02:00
Evan Hunt
575a924b1a add a test with CD=1 query for pending data
this is a regression test for [GL #3247].
2022-10-19 11:36:11 -07:00
Ondřej Surý
0f56a53d66
Remove the time requirement for the statschannel truncated test
The 5 seconds requirement to finish the 'pipelined with truncated
stream' was causing spurious failures in the CI because the job runners
might be very busy and sending 128k of data might simply take some time.

Remove the time requirement altogether; there's actually no reason why
the test SHOULD or even MUST finish under 5 seconds.
2022-10-19 14:08:24 +02:00
Tom Krizek
cbd0355328
Remove generated controls.conf file from system tests
The controls.conf file shouldn't be used directly without templating it
first. Remove this no longer used hard-coded file to avoid confusion.
2022-10-19 12:59:27 +02:00
Tom Krizek
cb0a2ae1dd
Revive dupsigs system test
Correctly source conf.sh in dupsigs test scripts (fix issue introduced
by 093af1c00a).

Update dupsigs test for dnssec-dnskey-kskonly default. Since v9.17.20,
the dnssec-dnskey-kskonly is set to yes. Update the test to not expect
the additional RRSIG with ZSK for DNSKEY.

Speed up the test from 20 minutes to 2.5 minutes and make it part of the
default test suite executed in CI.
- decrease number of records to sign from 2000 to 500
- decrease the signing interval by a factor of 6
- shorten the final part of the test after last signing (since nothing
  new happens there)

Finally, clarify misleading comments about (in)sufficient time for zone
re-signing. The time used in the test is in fact sufficient for the
re-signing to happen. If it wasn't, the previous ZSK would end up being
deleted while its signatures would still be present, which is a
situation where duplicate signatures can still happen.
2022-10-19 12:59:27 +02:00
Tom Krizek
7495deea3e
Revive the stress system test
Ensure the port numbers are dynamically filled in with copy_setports.

Clarify test fail condition.

Make the stress test part of the default test suite since it doesn't
seem to run too long or interfere with other tests any more (the
original note claiming so is more than 20 years old).

Related !6883
2022-10-19 12:59:27 +02:00
Tom Krizek
235ae5f344
Revive dialup system test
Properly template the port number in config files with copy_setports.

The test takes two minutes on my machine which doesn't seem like a
proper justification to exclude it from the test suite, especially
considering we run these tests in parallel nowadays. The resource usage
doesn't seem significantly increased so it shouldn't interfere with
other system tests.

There also exists a precedent for longer running system tests that are
already part of the default system test suite (e.g. serve-stale takes
almost three minutes on the same machine).
2022-10-19 12:59:27 +02:00
Tom Krizek
1e7d832342
Make digdelv test work in different network envs
When a target server is unreachable, the varying network conditions may
cause different ICMP messages (or no message). The host unreachable
message was discovered when attempting to run the test locally while
connected to a VPN network which handles all traffic.

Extend the dig output check with "host unreachable" message to avoid a
false negative test result in certain network environments.
2022-10-19 12:59:25 +02:00
Michał Kępień
604d8f0b96
Add tests for CVE-2022-2795
Add a test ensuring that the amount of work fctx_getaddresses() performs
for any encountered delegation is limited: delegate example.net to a set
of 1,000 name servers in the redirect.com zone, the names of which all
resolve to IP addresses that nothing listens on, and query for a name in
the example.net domain, checking the number of times the findname()
function gets executed in the process; fail if that count is excessively
large.

Since the size of the referral response sent by ans3 is about 20 kB, it
cannot be sent back over UDP (EMSGSIZE) on some operating systems in
their default configuration (e.g. FreeBSD - see the
net.inet.udp.maxdgram sysctl).  To enable reliable reproduction of
CVE-2022-2795 (retry patterns vary across BIND 9 versions) and avoid
false positives at the same time (thread scheduling - and therefore the
number of fetch context restarts - varies across operating systems and
across test runs), extend bin/tests/system/resolver/ans3/ans.pl so that
it also listens on TCP and make "ns1" in the "resolver" system test
always use TCP when communicating with "ans3".

Also add a test (foo.bar.sub.tld1/TXT) that ensures the new limitations
imposed on the resolution process by the mitigation for CVE-2022-2795 do
not prevent valid, glueless delegation chains from working properly.
2022-10-19 11:53:08 +02:00
Evan Hunt
3c11fafadf
test for growth of compressed pipelined responses
add a test to compare the Content-Length of successive compressed
messages on a single HTTP connection that should contain the same
data; fail if the size grows by more than 100 bytes from one query
to the next.
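
The size comparison can be sketched as below. This is a hypothetical
Python helper rather than the test's actual code; the name find_growth
is an assumption, and only the 100-byte threshold comes from the
description above.

```python
def find_growth(content_lengths, max_growth=100):
    """Return the index of the first response that grew too much, or None.

    `content_lengths` holds the Content-Length of successive responses
    received on the same connection, in order.
    """
    for i in range(1, len(content_lengths)):
        if content_lengths[i] - content_lengths[i - 1] > max_growth:
            return i
    return None
```

A test would fail when find_growth() returns an index, since responses
carrying the same data should compress to roughly the same size.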
2022-10-18 17:16:00 +02:00
Petr Špaček
c3e7bed1ab
Fix cookie system test for builds without --enable-developer
The "connecting via TCP" message comes from FCTXTRACE which is not
available on some builds.
2022-10-18 13:54:45 +02:00
Petr Špaček
ddf46056ca
Allow system tests to run under root user when inside CI
https://docs.gitlab.com/ee/ci/variables/predefined_variables.html
says the variable CI_SERVER="yes" is available in all versions of GitLab.
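
A minimal sketch of such a check, assuming the decision depends only on
the UID and the CI_SERVER variable. The function names are hypothetical
and the real check lives in the system test scripts, not in Python.

```python
import os


def may_run_as_root(environ=os.environ):
    """Allow running as root only when GitLab CI sets CI_SERVER=yes."""
    return environ.get("CI_SERVER") == "yes"


def check_user(uid, environ=os.environ):
    """Return True if the system tests may run for this UID."""
    return uid != 0 or may_run_as_root(environ)
```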
2022-10-18 13:30:16 +02:00
Petr Špaček
c8a38d70f0
Document that nsupdate ignores server command in GSS-TSIG mode
This behavior has been present since the introduction of GSS-TSIG support,
commit 289ae548d5.
2022-10-18 10:12:02 +02:00
Tony Finch
26ed03a61e Include the function name when reporting unexpected errors
I.e. print the name of the function in BIND that called the system
function that returned an error. Since it was useful for pthreads
code, it seems worthwhile doing so everywhere.
2022-10-17 13:43:59 +01:00
Tony Finch
ec50c58f52 De-duplicate __FILE__, __LINE__
Mostly generated automatically with the following semantic patch,
except where coccinelle was confused by #ifdef in lib/isc/net.c

@@ expression list args; @@
- UNEXPECTED_ERROR(__FILE__, __LINE__, args)
+ UNEXPECTED_ERROR(args)
@@ expression list args; @@
- FATAL_ERROR(__FILE__, __LINE__, args)
+ FATAL_ERROR(args)
2022-10-17 11:58:26 +01:00
Michal Nowak
212c4de043
Replace fgrep and egrep with grep -F/-E
GNU Grep 3.8 reports the following warnings:

    egrep: warning: egrep is obsolescent; using grep -E
    fgrep: warning: fgrep is obsolescent; using grep -F
2022-10-17 09:08:15 +02:00
Michal Nowak
65e91ef5e6
Remove stray backslashes
GNU Grep 3.8 reports several instances of stray backslashes in matching
patterns:

    grep: warning: stray \ before /
    grep: warning: stray \ before :
2022-10-17 09:08:15 +02:00
Tony Finch
45b2d8938b
Simplify and speed up DNS name compression
All we need for compression is a very small hash set of compression
offsets, because most of the information we need (the previously added
names) can be found in the message using the compression offsets.

This change combines dns_compress_find() and dns_compress_add() into
one function dns_compress_name() that both finds any existing suffix,
and adds any new prefix to the table. The old split led to performance
problems caused by duplicate names in the compression context.

Compression contexts are now either small or large, which the caller
chooses depending on the expected size of the message. There is no
dynamic resizing.

There is a behaviour change: compression now acts on all the labels in
each name, instead of just the last few.

A small benchmark suggests this is about 2x faster.
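
The combined find-and-add step can be illustrated with the simplified
sketch below. It is not the dns_compress_name() implementation: the real
code keeps only a hash set of offsets and reads the names back from the
message, whereas this sketch stores each suffix directly in a dictionary
for readability. The function name and signature are assumptions, and
labels are assumed to be ASCII strings.

```python
def compress_name(table, labels, offset):
    """Find the longest already-seen suffix of `labels` and record new ones.

    `table` maps a suffix (tuple of lowercase labels) to its message offset.
    `offset` is where this name's wire form would start in the message.
    Returns (prefix_labels, pointer); pointer is None when nothing matched.
    """
    labels = tuple(label.lower() for label in labels)
    pointer = None
    cut = len(labels)
    # Try the longest suffix first: the whole name, then drop leading labels.
    for i in range(len(labels)):
        if labels[i:] in table:
            pointer = table[labels[i:]]
            cut = i
            break
    # Register every new suffix that starts inside the uncompressed prefix.
    pos = offset
    for i in range(cut):
        # A compression pointer holds a 14-bit offset, so later positions
        # cannot be referenced and are not recorded.
        if pos <= 0x3FFF:
            table.setdefault(labels[i:], pos)
        pos += 1 + len(labels[i])  # length octet plus label bytes
    return labels[:cut], pointer
```

Calling it for www.example.com at offset 12 and then for
mail.example.com returns the prefix ("mail",) with a pointer to offset
16, where "example.com" was first written.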
2022-10-17 08:45:44 +02:00
Ondřej Surý
cedfc97974 Improve reporting for pthread_once errors
Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/once.h with
PTHREADS_RUNTIME_CHECK(), in order to improve error reporting for any
once-related run-time failures (by augmenting error messages with
file/line/caller information and the error string corresponding to
errno).
2022-10-14 16:39:21 +02:00
Tom Krizek
05180154d9
Remove system test delzone
There are multiple reasons to remove this test as obsolete:

- The test cannot possibly have worked for over 2.5 years, since
  98b3b93791 removed the rndc.py Python tool on which this test relies.
- It isn't part of the test suite either in CI or locally unless it is
  explicitly enabled. As a result, there are many issues which prevent
  the test from being executed caused by various refactoring efforts
  accumulated over time.
- Even if the test could be executed, it has no clear failure condition.
  If the python script(s) fail, the test still passes.
2022-10-14 16:35:20 +02:00
Ondřej Surý
cad2706cce Replace the statschannel truncated tests with two new tests
Now that the artificial limit on the recv buffer has been removed, the
current system test always fails because it tests if the truncation has
happened.

Add a test that sending more than 10 headers causes the connection to be
closed, and a test that sending a huge HTTP request causes the connection
to be closed.
2022-10-14 11:26:54 +02:00
Ondřej Surý
beecde7120 Rewrite isc_httpd using picohttpparser and isc_url_parse
Rewrite the isc_httpd to be more robust.

1. Replace the hand-crafted HTTP request parser with picohttpparser for
   parsing the whole HTTP/1.0 and HTTP/1.1 requests.  Limit the number
   of allowed headers to 10 (arbitrary number).

2. Replace the hand-crafted URL parser with isc_url_parse for parsing
   the URL from the HTTP request.

3. Increase the receive buffer to match the isc_netmgr buffers, so we
   can at least receive two full isc_nm_read()s.  This makes the
   truncation processing much simpler.

4. Process the received buffer from single isc_nm_read() in a single
   loop and schedule the sends to be independent of each other.

The first two changes make the code simpler and rely on libraries that
we already had (isc_url based on nodejs) or that are used elsewhere
(picohttpparser).

The second two changes remove the artificial "truncation" limit on
parsing multiple requests.  Now only a request that has too many
headers (currently 10) or is too big (so, the receive buffer fills up
without reaching the end of the request) will end the connection.

We can be benevolent here with the limits, because the statschannel
is by definition private and access must be allowed only to
administrators of the server.  There are no timers, no rate-limiting, no
upper limit on the number of requests that can be served, etc.
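
The two conditions that end a connection can be sketched as below. This
is an illustrative Python model, not the isc_httpd code and not the
picohttpparser API: it looks only at the header section, and
MAX_REQUEST_SIZE is a hypothetical buffer size, not a value taken from
the source.

```python
MAX_HEADERS = 10          # limit stated in the commit message
MAX_REQUEST_SIZE = 65536  # hypothetical buffer size, not taken from the source


def check_request(buffer):
    """Classify a receive buffer as 'ok', 'incomplete', or 'close'."""
    end = buffer.find(b"\r\n\r\n")
    if end == -1:
        # No end of headers yet: wait for more data unless the buffer is full.
        return "close" if len(buffer) >= MAX_REQUEST_SIZE else "incomplete"
    header_lines = buffer[:end].split(b"\r\n")[1:]  # skip the request line
    if len(header_lines) > MAX_HEADERS:
        return "close"
    return "ok"
```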
2022-10-14 11:26:54 +02:00
Petr Špaček
53b3ceacd4
Replace #define DNS_NAMEATTR_ with struct of bools
sizeof(dns_name_t) did not change but the boolean attributes are now
separated as one-bit structure members. This allows debuggers to
pretty-print dns_name_t attributes without any special hacks, plus we
got rid of manual bit manipulation code.
2022-10-13 17:04:02 +02:00
Artem Boldariev
95a551de7b doth system test: increase transfers-in/out limits
Sometimes doth test could intermittently fail shortly after start due
to inability to complete a zone transfer in time. As it turned out, it
could happen due to transfers-in/out limits. Initially the defaults
were fine, but over time, especially when adding Strict/Mutual TLS, we
added more than 10 zones so it became possible to hit the limits.

This commit takes care of that by bumping the limits.
2022-10-12 21:52:52 +03:00