Commit graph

9799 commits

Author SHA1 Message Date
Evan Hunt
69c1ee1ce9 rewrite statschannel to use netmgr
modify isc_httpd to use the network manager instead of the
isc_socket API.

also cleaned up bin/named/statschannel.c to use CHECK.
2020-07-15 22:35:07 -07:00
Mark Andrews
11ecf7901b Add regression test for [GL !3735]
Check that resign interval is actually in days rather than hours
by checking that RRSIGs are all within the allowed day range.
2020-07-14 10:59:59 +10:00
Tony Finch
030674b2a3 Fix re-signing when sig-validity-interval has two arguments
Since October 2019 I have had complaints from `dnssec-cds` reporting
that the signatures on some of my test zones had expired. These were
zones signed by BIND 9.15 or 9.17, with a DNSKEY TTL of 24h and
`sig-validity-interval 10 8`.

This is the same setup we have used for our production zones since
2015, which is intended to re-sign the zones every 2 days, keeping
at least 8 days signature validity. The SOA expire interval is 7
days, so even in the presence of zone transfer problems, no-one
should ever see expired signatures. (These timers are a bit too
tight to be completely correct, because I should have increased
the expiry timers when I increased the DNSKEY TTLs from 1h to 24h.
But that should only matter when zone transfers are broken, which
was not the case for the error reports that led to this patch.)

For example, this morning my test zone contained:

        dev.dns.cam.ac.uk. 86400 IN RRSIG DNSKEY 13 5 86400 (
                                20200701221418 20200621213022 ...)

But one of my resolvers had cached:

        dev.dns.cam.ac.uk. 21424 IN RRSIG DNSKEY 13 5 86400 (
                                20200622063022 20200612061136 ...)

This TTL was captured at 20200622105807 so the resolver cached the
RRset 64976 seconds previously (18h02m56s), at 20200621165511
only about 12h before expiry.

The other symptom of this error was incorrect `resign` times in
the output from `rndc zonestatus`.

For example, I have configured a test zone

        zone fast.dotat.at {
                file "../u/z/fast.dotat.at";
                type primary;
                auto-dnssec maintain;
                sig-validity-interval 500 499;
        };

The zone is reset to a minimal zone containing only SOA and NS
records, and when `named` starts it loads and signs the zone. After
that, `rndc zonestatus` reports:

        next resign node: fast.dotat.at/NS
        next resign time: Fri, 28 May 2021 12:48:47 GMT

The resign time should be within the next 24h, but instead it is
near the signature expiry time, which the RRSIG(NS) says is
20210618074847. (Note 499 hours is a bit more than 20 days.)
May/June 2021 is less than 500 days from now because expiry time
jitter is applied to the NS records.

Using this test I bisected this bug to 09990672d which contained a
mistake leading to the resigning interval always being calculated in
hours, when days are expected.

This bug only occurs for configurations that use the two-argument form
of `sig-validity-interval`.
2020-07-14 10:57:43 +10:00
Evan Hunt
29dcdeba1b purge pending command events when shutting down
When we're shutting the system down via "rndc stop" or "rndc halt",
or reconfiguring the control channel, there are potential shutdown
races between the server task and network manager.  These are adressed by:

- purging any pending command tasks when shutting down the control channel
- adding an extra handle reference before the command handler to
  ensure the handle can't be deleted out from under us before calling
  command_respond()
2020-07-13 13:17:08 -07:00
Evan Hunt
45ab0603eb use an isc_task to execute rndc commands
- using an isc_task to execute all rndc functions makes it relatively
  simple for them to acquire task exclusive mode when needed
- control_recvmessage() has been separated into two functions,
  control_recvmessage() and control_respond(). the respond function
  can be called immediately from control_recvmessage() when processing
  a nonce, or it can be called after returning from the task event
  that ran the rndc command function.
2020-07-13 13:16:53 -07:00
Evan Hunt
3551d3ffd2 convert rndc and control channel to use netmgr
- updated libisccc to use netmgr events
- updated rndc to use isc_nm_tcpconnect() to establish connections
- updated control channel to use isc_nm_listentcp()

open issues:

- the control channel timeout was previously 60 seconds, but it is now
  overridden by the TCP idle timeout setting, which defaults to 30
  seconds. we should add a function that sets the timeout value for
  a specific listener socket, instead of always using the global value
  set in the netmgr. (for the moment, since 30 seconds is a reasonable
  timeout for the control channel, I'm not prioritizing this.)
- the netmgr currently has no support for UNIX-domain sockets; until
  this is addressed, it will not be possible to configure rndc to use
  them. we will need to either fix this or document the change in
  behavior.
2020-07-13 13:16:53 -07:00
Evan Hunt
002c328437 don't use exclusive mode for rndc commands that don't need it
"showzone" and "tsig-list" both used exclusive mode unnecessarily;
changing this will simplify future refactoring a bit.
2020-07-13 13:12:33 -07:00
Evan Hunt
0580d9cd8c style cleanup
clean up style in rndc and the control channel in preparation for
changing them to use the new network manager.
2020-07-13 12:41:04 -07:00
Evan Hunt
ed37c63e2b make sure new_zone_lock is locked before unlocking it
it was possible for the count_newzones() function to try to
unlock view->new_zone_lock on return before locking it, which
caused a crash on shutdown.
2020-07-13 12:06:26 -07:00
Mark Andrews
d02a14c795 Fallback to built in trust-anchors, managed-keys, or trusted-keys
if the bind.keys file cannot be parsed.
2020-07-13 14:12:14 +10:00
Mark Andrews
a0e8a11cc6 Don't verify the zone when setting expire to "now+1s" as it can fail
as too much wall clock time may have elapsed.

Also capture signzone output for forensic analysis
2020-07-13 01:39:13 +00:00
Mark Andrews
c91dc92410 Remove redundant check for listener being non-NULL 2020-07-12 23:46:35 +00:00
Michał Kępień
53120279b5 Fix locking for LMDB 0.9.26
When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views.  Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:

 1. mdb_env_open(envA) is called when the view is first created.
 2. "rndc reconfig" is called.
 3. mdb_env_open(envB) is called for the new instance of the view.
 4. mdb_env_close(envA) is called for the old instance of the view.

This seems to have worked so far.  However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).

Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.

To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead.  Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps).  Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.

[1] 2fd44e3251
2020-07-10 11:29:18 +02:00
Evan Hunt
ba52377b37 use 'tsig-keygen' as the primary name for the tool
'ddns-confgen' is now an alias for 'tsig-keygen', rather than
the other way around.
2020-07-06 01:41:52 -07:00
Mark Andrews
c2c333e3f3 Bad isc_mem_put() size when an invalid type was specified in a ssu rule. 2020-07-06 10:33:27 +10:00
Matthijs Mekking
9347e7db7e Increase "rndc dnssec -status" output size
BUFSIZ (512 bytes on Windows) may not be enough to fit the status of a
DNSSEC policy and three DNSSEC keys.

Set the size of the relevant buffer to a hardcoded value of 4096 bytes,
which should be enough for most scenarios.
2020-07-03 12:14:53 +02:00
Ondřej Surý
f8b5958d28 Don't fail the system tests when shutdown test is missing pytest 2020-07-02 16:55:55 +02:00
Ondřej Surý
9ab86d0da2 Update the generated files after the source manpages update 2020-07-02 10:53:16 +02:00
Suzanne Goldlust
78af7e54e6 Text edits to manual paages
This commit updates the wording in following man pages:

* ddns-confgen.rst
* delv.rst
* dig.rst
* dnssec-dsfromkey.rst
* dnssec-importkey.rst
* dnssec-keyfromlabel.rst
* dnssec-keygen.rst
* dnssec-revoke.rst
* dnssec-settime.rst
* dnssec-signzone.rst
* dnssec-verify.rst
* dnstap-read.rst
* filter-aaaa.rst
* host.rst
* mdig.rst
* named-checkconf.rst
* named-checkzone.rst
* named-nzd2nzf.rst
* named.conf.rst
* named.rst
* nsec3hash.rst
* nsupdate.rst
* pkcs11-destroy.rst
* pkcs11-keygen.rst
* pkcs11-list.rst
* pkcs11-tokens.rst
* rndc-confgen.rst
* rndc.rst
2020-07-02 10:35:58 +02:00
Suzanne Goldlust
1efa88cf09 Text and formatting edits to various manual pages.
Follwing manual pages have been updated: rndc.conf.rst, rndc.rst
nsec3hash.rst, dnstap-read.rst, named-nzd2nzf.rst, mdig.rst,
named-rrchecker.rst, dnssec-revoke.rst, dnssec-cds.rst,
dnssec-keyfromlabel.rst, and dnssec-keygen.rst
2020-07-02 10:11:01 +02:00
Suzanne Goldlust
42386f3d9f Updates to .rst files to remove more references to "master" and "slave" 2020-07-02 09:47:27 +02:00
Suzanne Goldlust
e3e787bc14 Fix formatting of See Also section header 2020-07-01 23:45:04 +02:00
Matthijs Mekking
24e07ae98e Fix kasp test set_keytime
While the creation and publication times of the various keys
in this policy are nearly at the same time there is a chance that
one key is created a second later than the other.

The `set_keytimes_algorithm_policy` mistakenly set the keytimes
for KEY3 based of the "published" time from KEY2.
2020-07-01 22:42:29 +02:00
Evan Hunt
e43b3c1fa1 further tidying of primary/secondary terminology in system tests
this changes most visble uses of master/slave terminology in tests.sh
and most uses of 'type master' or 'type slave' in named.conf files.
files in the checkconf test were not updated in order to confirm that
the old syntax still works. rpzrecurse was also left mostly unchanged
to avoid interference with DNSRPS.
2020-07-01 11:12:12 -07:00
Evan Hunt
68c384e118 use primary/secondary terminology in 'rndc zonestatus' 2020-07-01 11:11:34 -07:00
Evan Hunt
f619708bbf prevent "primaries" lists from having duplicate names
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.
2020-07-01 11:11:34 -07:00
Evan Hunt
424a3cf3cc add "primary-only" as a synonym for "master-only"
update the "notify" option to use RFC 8499 terminology as well.
2020-07-01 11:11:34 -07:00
Evan Hunt
16e14353b1 add "primaries" as a synonym for "masters" in named.conf
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.

added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
2020-07-01 11:11:34 -07:00
Diego Fronza
042e509753 Added test for the fix
This test ensures that named will correctly shutdown
when receiving multiple control connections after processing
of either "rncd stop" or "kill -SIGTERM" commands.

Before the fix, named was crashing due to a race condition happening
between two threads, one running shutdown logic in named/server.c
and other handling control logic in controlconf.c.

This test tries to reproduce the above scenario by issuing multiple
queries to a target named instance, issuing either rndc stop or kill
-SIGTERM command to the same named instance, then starting multiple rndc
status connections to ensure it is not crashing anymore.
2020-07-01 11:59:01 +02:00
Ondřej Surý
be6cc53ec2 Don't continue opening a new rndc connection if we are shutting down
Due to lack of synchronization, whenever named was being requested to
stop using rndc, controlconf.c module could be trying to access an already
released pointer through named_g_server->interfacemgr in a separate
thread.

The race could only be triggered if named was being shutdown and more
rndc connections were ocurring at the same time.

This fix correctly checks if the server is shutting down before opening
a new rndc connection.
2020-07-01 08:44:56 +02:00
Evan Hunt
e3ee138098 update the acl system test to include a blackhole test case
this ACL was previously untested, which allowed a regression to
go undetected.
2020-06-30 17:29:09 -07:00
Matthijs Mekking
19ce9ec1d4 Output rndc dnssec -status
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.

Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.
2020-06-30 09:51:04 +02:00
Matthijs Mekking
e1ba1bea7c Implement dummy 'rndc dnssec -status' command
Add the code and documentation required to provide DNSSEC signing
status through rndc.  This does not yet show any useful information,
just provide the command that will output some dummy string.
2020-06-30 09:51:04 +02:00
Matthijs Mekking
f0b5eb03bb Add one more RFC 4592 test
This deals with the SRV example.
2020-06-30 05:22:24 +00:00
Mark Andrews
b3215125ea Fix the dnstap roll test by:
* fixing the find call.
* checking that we rolled a file.
2020-06-30 08:27:58 +10:00
Ondřej Surý
b51d10608e Fix miscellaneous little bugs in RST formatting 2020-06-29 19:39:03 +02:00
Ondřej Surý
5c56a0ddbc Add missing rndc.conf header that was breaking manpages section
The rndc.conf main header was missing the header markup and that was
breaking the TOC for all manpages in the ARM because sphinx-build
incorrectly remembered the markup for subheader to be ~~~~ instead of
----.
2020-06-29 19:37:18 +02:00
Matthijs Mekking
a47192ed5b kasp tests: fix wait for reconfig done
The wait until zones are signed after rndc reconfig is broken
because the zones are already signed before the reconfig.  Fix
by having a different way to ensure the signing of the zone is
complete.  This does require a call to the "wait_for_done_signing"
function after each "check_keys" call after the ns6 reconfig.

The "wait_for_done_signing" looks for a (newly added) debug log
message that named will output if it is done signing with a certain
key.
2020-06-26 08:43:45 +00:00
Matthijs Mekking
cf76d839ae kasp tests: Replace while loops with retry_quiet 2020-06-26 08:43:45 +00:00
Evan Hunt
a8baf79e33 append "0" to IPv6 addresses ending in "::" when printing YAML
such addresses broke some YAML parsers.
2020-06-25 16:42:13 -07:00
Matthijs Mekking
c6345fffe9 Add todo in dnssec system test for [GL #1689]
Add a note why we don't have a test case for the issue.

It is tricky to write a good test case for this if our tools are
not allowed to create signatures for unsupported algorithms.
2020-06-25 13:46:36 +02:00
Mark Andrews
abe2c84b1d Suppress cppcheck warnings:
cppcheck-suppress objectIndex
cppcheck-suppress nullPointerRedundantCheck
2020-06-25 12:04:36 +10:00
Mark Andrews
0cf25d7f38 Add INSIST's to silence cppcheck warnings 2020-06-25 12:04:36 +10:00
Mark Andrews
4bc3de070f Resize unamebuf[] to avoid warnings about snprintf() not having
enough buffer space.  Also change named_os_uname() prototype so
that it is now returning (const char *) rather than (char *).  If
uname() is not supported on a UNIX build prepopulate unamebuf[]
with "unknown architecture".
2020-06-24 23:21:36 +00:00
Mark Andrews
a289a57c7f Check that 'rndc dnstap -roll <value>' works 2020-06-23 20:20:39 +10:00
Evan Hunt
a9154f2aab reorder system tests to shorten runtime
if tests that take a particularly long time to complete
(serve-stale, dnssec, rpzrecurse) are run first, a parallel
run of the system tests can finish 1-2 minutes faster.
2020-06-22 12:05:32 +00:00
Evan Hunt
ba31b189b4 "check-names primary" and "check-names secondary" were ignored
these keywords were added to the parser as synonyms for "master"
and "slave" but were never hooked in to the configuration of named,
so they were ignored. this has been fixed and the option is now
checked for correctness.
2020-06-22 12:32:32 +02:00
Mark Andrews
3632b3c282 Add checking RFC 4592 responses examples to wildcard system test 2020-06-18 09:59:20 +02:00
Mark Andrews
e5b2eca1d3 The dsset returned by dns_keynode_dsset needs to be thread safe.
- clone keynode->dsset rather than return a pointer so that thread
  use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
  while in use.
- create a new keynode when removing DS records so that dangling
  pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
  when DS records are added.
2020-06-11 16:02:09 +10:00
Michał Kępień
fef15bc33d Disable temporarily unsupported tests on Windows
Due to the changes introduced by the Automake migration, system tests
requiring Python (chain, pipelined, qmin, tcp), dynamic loading of
shared objects (dlzexternal, dyndb, filter-aaaa), or LMDB (nzd2nzf)
currently do not work on Windows.  Temporarily disable them on that
platform by moving them from the PARALLEL_COMMON list to the
PARALLEL_UNIX list until the situation is rectified.
2020-06-09 15:35:54 +02:00