If we hit an error when issuing an 'rndc dnssec -step' command, and the
keymgr runs again at a later scheduled time, we don't want to enforce
transitions.
Rather than using the dnspython's facilities and defaults to create the
queries, use the isctest.query.create function in all the cases that
don't require special handling to have consistent defaults.
The kasp test cases assume that keymgr operations on the zone under test
have been completed before the test is executed. These are typically
quite fast, but the logs need to be explicitly checked for the messages,
otherwise there's a possibility of race conditions causing the
kasp/rollover tests to become unstable.
Call the wait function in all the kasp/rollover tests where it is
expected (which is generally in each test, unless we're dealing with
unsigned zones).
Many of our test cases only use a single NamedInstance from the
`servers` fixture. Introduce `nsX` helper fixtures to simplify these
tests and reduce boilterplate code further.
Specifically, the test no longer has to either define its own variable
to extract a single server from the list, or use the longer
servers["nsX"] syntax. While this may seem minor, the amount of times it
is repeated across the tests justifies the change. It also promotes
using more explicit server identification, i.e. `nsX`, rather than
generic `server`. This also improves the clarity of the tests and may be
helpful in traceback during debugging as well.
The test_kasp_case[secondary.kasp] can sometimes fail on freebsd13. It
appears the test gets stuck on some operation which should be very
quick, but for some reason takes at least a few seconds, causing the
cb_ixfr_is_signed() function to time out.
In one of the cases I investigated, it wasn't a query/response that
caused a timeout, but rather some operation in between. The test
attempts to read from a keyfile/statefile, but I see no reason why that
should block.
In any case, try to increase the timeout for the verification, as that
shouldn't hurt. Also allow the test to be re-run on freebsd13, as it's
likely to be caused by some odd behaviour on that platform -- the issue
doesn't appear anywhere else.
Create a test scenario where a signed zone is in multiple views and
then a key may be purged. This is a bug case where the key files are
removed by one view and then the other view starts complaining.
When going insecure, we publish CDS and CDNSKEY DELETE records. Update
the check_apex function to test this.
Also, skip some tests in the 'check_rollover_step()' function. If
we change the DNSSEC Policy, keys that no longer match the policy will
be retired. When this exactly happens is hard to determine, as it
happens on the reconfigure. So for these tests, we skip the key timing
metadata checks.
Also, the zone becomes unsigned, so don't call 'check_zone_is_signed'
in those cases.
These test cases involve a reconfiguration. The first one is a zone
that changes from dynamic to inline-signing. The others are tests that
key lifetimes are updated correctly after changing them.
Deduplicate the code for dynamic updates and increase code clarity by
using an actual dns.update.UpdateMessage rather than an undefined
intermediary format passed around as a list of arguments.
Move the 'csk-roll1' and 'csk-roll2' zones to the rollover test dir and
convert CSK rollover tests to pytest.
The DS swap spans multiple steps. Only the first time we should check
if the "CDS is now published" log is there, and only the first time we
should run 'rndc dnssec -checkds' on the keys. Add a new key to the
step dictionary to disable the DS swap checks.
This made me realize that we need to check for "is not None" in case
the value in the dictionary is False. Update check_rollover_step()
accordingly, and also add a log message which step/zone we are currently
checking.
Move the 'ksk-doubleksk' zones to the rollover test dir and convert KSK
rollover test to pytest.
Since the 'ksk-doubleksk' policy publishes different CDNSKEY/CDS RRsets,
update the 'check_rollover_step' to check which CDNSKEY/CDS RRsets should
be published and which should be prohibited. Update 'isctest.kasp'
accordingly.
We are changing the ZSK lifetime to unlimited in this test case as it
is of no importance (this actually discovered a bug in setting the
next time the keymgr should run).
Move the 'zsk-prepub' zones to the rollover test dir and convert ZSK
rollover test to pytest.
We need a way to signal a smooth rollover is going on. Signatures are
being replaced gradually during a ZSK rollover, so the existing
signatures of the predecessor ZSK are still being used. Add a smooth
operator to set the right expectations on what signatures are being
used.
Setting expected key relationships is a bit crude: a list of two
elements where the first element is the index of the expected keys that
is the predecessor, and the second element is the index of the expected
keys that is the successor.
We are changing the KSK lifetime to unlimited in this test case as it
is of no importance.
Move the 'enable-dnssec' to the rollover test dir and convert to pytest.
This requires new test functionality to check that "CDS is published"
messages are logged (or prohibited).
The setup part is slightly adapted such that it no longer needs to
set the '-P sync' value in most cases (this is then set by 'named'),
and to adjust for the inappropriate safety intervals fix.
Move the multi-signer test scenarios to the rollover directory and
convert tests to pytest.
- If the KeyProperties set the "legacy" to True, don't set expected
key times, nor check them. Also, when a matching key is found, set
key.external to True.
- External keys don't show up in the 'rndc dnssec -status' output so
skip them in the 'check_dnssecstatus' function. External keys never
sign RRsets, so also skip those keys in the '_check_signatures'
function.
- Key properties strings now can set expected key tag ranges, and if
KeyProperties have tag ranges set, they are checked.
In order to keep the kasp system test somewhat approachable, let's
move all rollover scenarios to its own test directory. Starting with
the manual rollover test cases.
A new test function is added to 'isctest.kasp', to verify that the
relationship metadata (Predecessor, Successor) is set correctly.
The configuration and setup for the zone 'manual-rollover.kasp' are
almost copied verbatim, the only exception is the keytimes. Similar
to the test kasp cases, we no longer set "SyncPublish/PublishCDS" in
the setup script. In addition to that, the offset is changed from one
day ago to one week ago, so that the key states match the timing
metadata (one day is too short to move a key from "hidden" to
"omnipresent").
The pytest cases checks if a zone is signed by looking at the NSEC
record at the apex. If that has an RRSIG record, it is considered
signed. But 'named' signs zones incrementally (in batches) and so
the zone may still lack some signatures. In other words, the tests
may consider a zone signed while in fact signing is not yet complete,
then performs additional checks such as is a subdomain signed with the
right key. If this check happens before the zone is actually fully
signed, the check will fail.
Fix this by using 'check_dnssec_verify' instead of
'check_is_zone_signed'. We were already doing this check, but we now
move it up. This will transfer the zone and then run 'dnssec-verify'
on the response. If the zone is partially signed, the check will fail,
and it will retry for up to ten times.
These tests ensure that if dnssec-policy is set on a higher level, the
zone is still signed (or unsigned) as expected. Or if a higher level
has an override, the new policy is honored as expected.
For 'keystore.kasp', a setting 'key-directories' is used. If set, this
will expect a list of two directories, the first one is where the KSKs
will be stored, the second in the list is the ZSK key directory. This
may be expanded in the future to test more complex key storage cases.
The 'rumoured.kasp' zone is weird, the key timings can never match
those key states. But it is a regression test for an early day bug,
so we convert it, but skip the expected key times check.
These test cases follow the same pattern as many other, but all require
some additional checks. These are set in "additional-tests".
The "zsk-missing.autosign" zone is special handled, as it expects the
KSK to sign the SOA RRset (because the ZSK is unavailable).
The kasp/ns3/setup.sh script is updated so the SyncPublish is not set
(named will initialize it correctly). For the test zones that have
missing private key files we do need to set the expected key timing
metadata.
Remove the counterparts for the newly added test from the kasp shell
tests script.
The zone 'pregenerated.kasp' is a case where there already exist more
keys than required. For this we set the 'pregenerated' setting. This
will change the 'keydir_to_keylist' function behavior: Only keys in use
are considered. A key is in use if all of the states are either
undefined, or set to 'hidden'.
The 'some-keys.kasp' zone is similar to 'pregenerated.kasp', except
only some keys have been pregenerated.
Write python-based tests for the many test cases from the kasp system
test. These test cases all follow the same pattern:
- Wait until the zone is signed.
- Check the keys from the key-directory against expected properties.
- Set the expected key timings derived from when the key was created.
- Check the key timing metadata against expected timings.
- Check the 'rndc dnssec -status' output.
- Check the apex is signed correctly.
- Check a subdomain is signed correctly.
- Verify that the zone is DNSSEC correct.
Remove the counterparts for the newly added test from the kasp shell
tests script.
This converts a special characters test case, a max-zone-ttl error
check, and two cases of insecure zones.
We no longer assert for having more than one DNSKEY and/or RRSIG
records. If the zone is insecure, this is no longer always true. And
we already check for the expected number of records in the
check_dnskeys/check_signatures functions.
This commit deals with converting the dynamic zone test cases to
pytest. The tests for 'inline-signing.kasp' are similar to the default
case, so these are added to 'test_kasp_default'.
Unfortunately I need to add sleep calls in between freezing, updating,
and thawing a zone. Without it the intermittent failures are too
frequent.
This commit deals with converting the test cases related to the default
dnssec-policy.
This requires a new method 'check_update_is_signed'. This method will
be used in future tests as well, and checks if an expected record is
in the zone and is properly signed.
Remove the counterparts for the newly added test from the kasp shell
tests script.
Convert the first couple of tests from 'kasp/tests.sh' to
'kasp/tests_kasp.py', those are test cases related to 'dnssec-keygen'
and 'dnssec-settime'.
For this, we also add a new KeyProperties method,
'policy_to_properties', that takes a list of strings which represent
the keys according to the dnssec-policy and the expected key states.
The dnssec-keygen command for the ZSK generation for the zone
multisigner-model2.kasp was wrong (no ZSK was generated in the setup
script, but when 'named' is started, the missing ZSK was created
anyway by 'dnssec-policy'.
There are a couple of cases where the safety intervals are added
inappropriately:
1. When setting the PublishCDS/SyncPublish timing metadata, we don't
need to add the publish-safety value if we are calculating the time
when the zone is completely signed for the first time. This value
is for when the DNSKEY has been published and we add a safety
interval before considering the DNSKEY omnipresent.
2. The retire-safety value should only be added to ZSK rollovers if
there is an actual rollover happening, similar to adding the sign
delay.
3. The retire-safety value should only be added to KSK rollovers if
there is an actual rollover happening. We consider the new DS
omnipresent a bit later, so that we are forced to keep the old DS
a bit longer.
In a multi-signer setup, removing DNSKEY records from the zone should
not be treated as a key that previously exists in the keyring, thus
blocking the keymgr. Add a test case to make sure.
Test that if a key to be purged is in the keyring, it does not
prevent the keymgr from running. Normally a key that is in the keyring
should be available again on the next run, but that is not true for
a key that can be purged.
In addition, fix some wait_for_log calls, by adding the missing
'|| ret=1' parts.
Some test cases were working but for the wrong reasons. These started
to fail when I implemented the first approach for #4763, where the
existence of a DNSKEY together with an empty keyring is suspicious and
would prevent the keymgr from running.
These are:
1. kasp: The multisigner-model2.kasp zone has ZSKs from other providers
in the zone, but not yet its own keys. Pregenerate signing keys and
add them to the unsigned zone as well.
2. kasp: The dynamic-signed-inline-signing.kasp zone has a key generated
and added in the raw version of the zone. But the key file is stored
outside the key-directory for the given zone. Add '-K keys' to the
dnssec-keygen command.
Prior to running the keymgr, first make sure that existing keys
are present in the new keylist. If not, treat this as an operational
error where the keys are made offline (temporarily), possibly unwanted.
In this specific case the key files are temporary unavailable, for
example because of an operator error, or a mount failure). In such
cases, BIND should not try to roll over these keys.
This commit adds support for timestamps in iso8601 format with timezone
when logging. This is exposed through the iso8601-tzinfo printtime
suboption.
It also makes the new logging format the default for -g output,
hopefully removing the need for custom timestamp parsing in scripts.
If there is a keytag conflict between keys with different algorithms,
we need to supply what key algorithm is used so we can get the right
public key.
For clarity, print the algorithm on the found keys after 'check_keys'.
The key lifetime should no longer be adjusted if the key is being
retired earlier, for example because a manual rollover was started.
This would falsely be seen as a dnssec-policy lifetime reconfiguration,
and would adjust the retire/removed time again.
This also means we should update the status output, and the next
rollover scheduled is now calculated using (retire-active) instead of
key lifetime.