All our MSVS Project files share the same intermediate directory. We
know that this doesn't cause any problems, so we can just disable the
detection in the project files.
Example of the warning:
warning MSB8028: The intermediate directory (.\Release\) contains files shared from another project (dnssectool.vcxproj). This can lead to incorrect clean and rebuild behavior.
Our vcxproj files set the WarningLevel to Level3, which is too verbose
for code that needs to be portable. In practice, that leads to ignoring
all the warnings that MSVC produces. This commit downgrades the
WarningLevel to Level1 and enables treating warnings as errors for
Release builds. For Debug builds, the WarningLevel is upgraded to
Level4, and treating warnings as errors is explicitly disabled.
We should eventually make the code clean of all MSVC warnings, but it's
a long way to go for Level4, so it's more reasonable to start at Level1.
For reference[1], these are the warning levels as described by MSVC
documentation:
* /W0 suppresses all warnings. It's equivalent to /w.
* /W1 displays level 1 (severe) warnings. /W1 is the default setting
in the command-line compiler.
* /W2 displays level 1 and level 2 (significant) warnings.
* /W3 displays level 1, level 2, and level 3 (production quality)
warnings. /W3 is the default setting in the IDE.
* /W4 displays level 1, level 2, and level 3 warnings, and all level 4
(informational) warnings that aren't off by default. We recommend
that you use this option to provide lint-like warnings. For a new
project, it may be best to use /W4 in all compilations. This option
helps ensure the fewest possible hard-to-find code defects.
* /Wall displays all warnings displayed by /W4 and all other warnings
that /W4 doesn't include — for example, warnings that are off by
default.
* /WX treats all compiler warnings as errors. For a new project, it
may be best to use /WX in all compilations; resolving all warnings
ensures the fewest possible hard-to-find code defects.
1. https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level?view=vs-2019
This test asserts that option "deny-answer-aliases" works correctly
when forwarding requests.
For example, a forwarding BIND instance configured with an option such
as deny-answer-aliases { "domain"; } is expected to return SERVFAIL
when it forwards a request for *.anything-but-domain and any answer it
receives contains a CNAME pointing to "*.domain".
(cherry picked from commit 9bdb960a16a69997b08746e698b6b02c8dc6c795)
Increase the DNSKEY TTL of the migrate.kasp zone for the following
reason: The key states are initialized depending on the timing
metadata. If a key is present long enough in the zone it will be
initialized to OMNIPRESENT. Long enough here is the time when it
was published (when the setup script was run) plus DNSKEY TTL.
Otherwise it is set to RUMOURED, or to HIDDEN if no timing metadata
is set or the time is still in the future.
Since the TTL is "only" 5 minutes, the DNSKEY state may be
initialized to OMNIPRESENT if the test is slow, but we expect it
to be in RUMOURED state. If we increase the TTL to a couple of
hours it is very unlikely that it will be initialized to something
else than RUMOURED.
This fixes another intermittent failure in the kasp system test.
It does not happen often, except in the Windows platform tests,
where the tests take a long time to run.
In the "kasp" system test, there is an "rndc reconfig" call which
triggers a new rekey event. check_next_key_event() verifies the time
remaining from the moment "rndc reconfig" is called until the next key
event. However, the next key event time is calculated from the key
times provided during key creation (i.e. during test setup). Given
this, if "rndc reconfig" is called a significant amount of time after
the test is started, some check_next_key_event() checks will fail.
Fix by calculating the time that passed between the start of the test
and the moment 'rndc reconfig' is called. Subtract this time from the
calculated next key event.
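The adjustment described above can be sketched as follows. This is a
minimal illustration in Python, not the shell code of the system test;
the function name, the parameter names, and the clamp at zero are all
assumptions made for the example.

```python
def expected_next_key_event(policy_interval, test_start, reconfig_time):
    """Return the expected number of seconds until the next key event.

    policy_interval: seconds until the next key event, derived from the
                     key times set during test setup (hypothetical name).
    test_start:      epoch seconds at which the test setup ran.
    reconfig_time:   epoch seconds at which "rndc reconfig" was called.
    """
    # Time that has already passed since the key times were set.
    elapsed = reconfig_time - test_start
    # Assumption for this sketch: never expect a negative waiting time.
    return max(policy_interval - elapsed, 0)
```

For instance, with a one-hour interval and a reconfig issued 300
seconds after setup, expected_next_key_event(3600, 100, 400) yields
3300.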
This only needs to be done after an "rndc reconfig" on zones where
the keymgr needs to wait for a period of time (for example for keys
to become OMNIPRESENT, or HIDDEN). This is on step 2 and step 5 of
the algorithm rollover. In step 2 there is a waiting period before
the DNSKEY is OMNIPRESENT, in step 5 there is a waiting period
before the DNSKEY is HIDDEN.
In step 1 new keys are created, in step 3 and 4 key states just
entered OMNIPRESENT, and in step 6 we no longer care because the
key lifetime is unlimited and we default to checking once per hour.
Regardless of our indifference about the next key event after step 6,
change some of the key timings in the setup script to better
reflect reality: DNSKEY is in HIDDEN after step 5, DS times have
changed when the new DS became active.
Add a statschannel test case for DNSSEC sign metrics that has more
keys than there are allocated stats counters for. This will produce
gibberish, but at least it should not crash.
Add a test to ensure migration from 'auto-dnssec maintain;' to
dnssec-policy works even if the algorithm is changed. The existing
keys should not be removed immediately, but their goal should be
changed to become hidden, and the new keys with the different
algorithm should be introduced immediately.
If we initialize goals on all keys, superfluous keys that match
the policy all desire to be active. For example, if there are six
keys available for a policy that needs just two, we only want to
set the goal state to OMNIPRESENT on two keys, not six.
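A rough sketch of that idea, written in Python for illustration only.
The function name is hypothetical, and which of the matching keys get
selected (here simply the first ones in list order) is an assumption;
the sketch also leaves the goal of the remaining keys unset rather than
guessing what the keymgr does with them.

```python
def assign_goals(matching_keys, needed):
    """Give the OMNIPRESENT goal to at most `needed` matching keys.

    Returns a dict mapping each key to "OMNIPRESENT" or None, where
    None means "no goal assigned by this step".
    """
    goals = {}
    for key in matching_keys:
        if needed > 0:
            goals[key] = "OMNIPRESENT"
            needed -= 1
        else:
            # Superfluous key: do not make it want to become active.
            goals[key] = None
    return goals


# Six keys match a policy that needs only two.
goals = assign_goals(["k1", "k2", "k3", "k4", "k5", "k6"], needed=2)
active = [key for key, goal in goals.items() if goal == "OMNIPRESENT"]
```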
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly. An earlier commit deals with migration where existing keys
match the policy. This commit deals with migration where existing
keys do not match the policy. In that case, named must not
immediately delete the existing keys, but gracefully roll to the
dnssec-policy.
However, named did remove the existing keys immediately. This is
because the legacy key states were initialized badly. Because
those keys had their states initialized to HIDDEN or RUMOURED, the
keymgr decided that they could be removed (a key can only be used
safely when its states are OMNIPRESENT).
The original reason for initializing key states to HIDDEN (and
RUMOURED to deal with existing keys) was to ensure that those keys
go through the required propagation time before the keymgr
decides they can be used safely. However, those keys have already
been in the zone for a long time and making the key states represent
otherwise is dangerous: keys may be pulled out of the zone while
in fact they are required to establish the chain of trust.
Fix initializing key states for existing keys by looking more closely
at the time metadata. Add TTL and propagation delays to the time
metadata and see if the DNSSEC records have been propagated.
Initialize the state to OMNIPRESENT if so, otherwise initialize to
RUMOURED. If the time metadata is in the future, or does not exist,
keep initializing the state to HIDDEN.
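The decision described above can be sketched like this. It is a
simplified Python illustration, not the keymgr code: the function and
parameter names are hypothetical, a single timestamp stands in for the
relevant time metadata, and the exact boundary comparison is an
assumption.

```python
HIDDEN, RUMOURED, OMNIPRESENT = "HIDDEN", "RUMOURED", "OMNIPRESENT"


def initial_state(now, published, ttl, propagation_delay):
    """Pick the initial state of a record belonging to an existing key.

    now, published:          epoch seconds (published may be None).
    ttl, propagation_delay:  seconds.
    """
    # No time metadata, or a time in the future: nothing is visible yet.
    if published is None or published > now:
        return HIDDEN
    # Published long enough ago that every cache has seen the record.
    if published + ttl + propagation_delay <= now:
        return OMNIPRESENT
    # Published, but caches may still hold older data.
    return RUMOURED
```

With a 300 second TTL and no propagation delay, a key published 400
seconds ago is OMNIPRESENT, while one published 100 seconds ago is
still RUMOURED.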
The added test makes sure that new keys matching the policy are
introduced, but existing keys are kept in the zone until the new
keys have been propagated.
A few kasp system test tweaks to improve test failure debugging and
deal with tests related to migration to dnssec-policy.
1. When clearing a key, set lifetime to "none". If it is "none",
expect no lifetime to be set in the state file. Legacy keys that
are migrated but don't match the dnssec-policy will not have a
lifetime.
2. The kasp system test prints which key id and file it is checking.
Log explicitly if we are checking the id or a file.
3. Add quotes around "ID" when setting the key id, for consistency.
4. Fix a typo (non -> none).
5. Print which key ids are found; this way it is easier to see which
KEY[1-4] failed to match one of the key files.
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly. Several adjustments in the keymgr are required to get it right:
- Set published time on keys when we calculate prepublication time.
This is not strictly necessary, but it is weird to have an active
key without the published time set.
- Initialize key states also before matching keys. Determine the
target state by looking at existing time metadata: If the time
data is set and is in the past, it is a hint that the key and
its corresponding records have been published in the zone already,
and the state is initialized to RUMOURED. Otherwise, initialize it
as HIDDEN. This fixes migration to dnssec-policy from existing
keys.
- Initialize key goal on keys that match key policy to OMNIPRESENT.
These may be existing legacy keys that are being migrated.
- A key that has its goal set to OMNIPRESENT *or* an active key can
match a kasp key. The code was changed with CHANGE 5354, which
was a bugfix to prevent creating new KSK keys for zones in the
initial stage of signing. However, this caused problems for
restarts when rollovers are in progress, because an outroducing
key can still be an active key.
The test for this introduces a new KEY property 'legacy'. This is
used to skip tests related to .state files.
These are mostly false positives; the clang-analyzer FAQ[1] explains
why they occur and how to fix them:
> The reason the analyzer often thinks that a pointer can be null is
> because the preceding code checked compared it against null. So if you
> are absolutely sure that it cannot be null, remove the preceding check
> and, preferably, add an assertion as well.
The 4 warnings reported are:
dnssec-cds.c:781:4: warning: Access to field 'base' results in a dereference of a null pointer (loaded from variable 'buf')
isc_buffer_availableregion(buf, &r);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:996:36: note: expanded from macro 'isc_buffer_availableregion'
^
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:821:16: note: expanded from macro 'ISC__BUFFER_AVAILABLEREGION'
(_r)->base = isc_buffer_used(_b); \
^~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:152:29: note: expanded from macro 'isc_buffer_used'
((void *)((unsigned char *)(b)->base + (b)->used)) /*d*/
^~~~~~~~~
1 warning generated.
--
byname_test.c:308:34: warning: Access to field 'fwdtable' results in a dereference of a null pointer (loaded from variable 'view')
RUNTIME_CHECK(dns_fwdtable_add(view->fwdtable, dns_rootname,
^~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/util.h:318:52: note: expanded from macro 'RUNTIME_CHECK'
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/error.h:50:21: note: expanded from macro 'ISC_ERROR_RUNTIMECHECK'
((void)(ISC_LIKELY(cond) || \
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/likely.h:23:43: note: expanded from macro 'ISC_LIKELY'
^
1 warning generated.
--
./rndc.c:255:6: warning: Dereference of null pointer (loaded from variable 'host')
if (*host == '/') {
^~~~~
1 warning generated.
--
./main.c:1254:9: warning: Access to field 'sctx' results in a dereference of a null pointer (loaded from variable 'named_g_server')
sctx = named_g_server->sctx;
^~~~~~~~~~~~~~~~~~~~
1 warning generated.
References:
1. https://clang-analyzer.llvm.org/faq.html#null_pointer
The tkey test was not adapted to dynamic ports, so we had to run it in
sequence. This commit adds support for dynamic ports, and also makes
all the scripts shellcheck clean.
The eddsa test was not adapted to dynamic ports, so we had to run it in
sequence. This commit adds support for dynamic ports, and also makes
all the scripts shellcheck clean.
The ecdsa test was not adapted to dynamic ports, so we had to run it in
sequence. This commit adds support for dynamic ports, and also makes
all the scripts shellcheck clean.
The isc_mem API now crashes on memory allocation failure. This is the
next commit in a series that cleans up code that could fail before but
cannot fail now; e.g. the isc_result_t return type has been changed to
void for the isc_log API functions that could only return
ISC_R_SUCCESS.
Waiting for the reply message will ensure that all messages being
looked for exist in the logs at the time of checking. When the
test was only waiting for the send message there was a race between
grep and the ns1 instance of named logging that it had seen the
request.
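The general pattern of waiting for a log message before checking can be
sketched as below. This is an illustrative Python helper, not the shell
code used by the system test; the function name, its parameters, and
the retry policy are assumptions.

```python
import time


def wait_for_log(path, needle, attempts=10, delay=1.0):
    """Return True once `needle` appears in the log file at `path`.

    The file is re-read up to `attempts` times, sleeping `delay`
    seconds between tries; a file that does not exist yet is treated
    like an empty one.
    """
    for attempt in range(attempts):
        if attempt:
            time.sleep(delay)
        try:
            with open(path, encoding="utf-8", errors="replace") as log:
                if any(needle in line for line in log):
                    return True
        except FileNotFoundError:
            pass
    return False
```

Only after such a helper returns True for the last expected message
would the test go on to grep for the earlier ones.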
The previous commit removed the code related to the internal symbol
table. On platforms where it is available, we can now use
backtrace_symbols() to print a more verbose symbol table to the output.
As there's now general availability of backtrace() and
backtrace_symbols() functions (see below), the commit also removes the
usage of glibc internals and the custom stack tracing.
* backtrace(), backtrace_symbols(), and backtrace_symbols_fd() are
provided in glibc since version 2.1.
* backtrace(), backtrace_symbols(), and backtrace_symbols_fd() first
appeared in Mac OS X 10.5.
* The backtrace() library of functions first appeared in NetBSD 7.0 and
FreeBSD 10.0.
The kasp system test is timing critical. The test passes on all
Linux-based machines, but fails frequently on Windows. The test
takes a lot more time on Windows and the final checks fail
because the expected next key event is too far off. For example:
I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
I:kasp:error: bad next key event time 20909 for zone \
step2.algorithm-roll.kasp (expect 21600)
I:kasp:failed
This is because the kasp system test calculates the time when the
next key event should occur based on the policy. This assumes that
named is able to do key management within a minute. But starting
named, doing key management for other zones, and reconfiguring takes
much more time on Windows, and thus the time until the next key event
on Windows is much shorter than anticipated.
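To make the failure above concrete, here is a small Python sketch of
such a check. The function name is hypothetical, and the 60 second
tolerance is an assumption derived from the "within a minute" remark;
the real test may use a different threshold.

```python
def next_key_event_ok(actual, expected, tolerance=60):
    """Check that the logged next key event is close to the expectation.

    All values are in seconds. `tolerance` is an assumed allowance for
    the time named needs to do key management.
    """
    return expected - tolerance <= actual <= expected + tolerance


# Values taken from the failure shown above: 21600 - 20909 = 691 seconds
# had passed, far more than the assumed one-minute allowance.
slow_run_passes = next_key_event_ok(20909, 21600)
```

Here slow_run_passes is False, whereas a run that logs 21570 would pass
the same check.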
That this happens is a good thing because this means that the
correct next key event is used, but is not so nice for testing, as
it is hard to determine how much time named needed before finishing
the current key event.
Disable the kasp test on Windows now because it is blocking the
release. We know the cause of these test failures, and it is clear
that this is a fault in the test, not the code. Therefore we feel
comfortable disabling the test for now and working on a fix while
unblocking the release.
When configuring the same dnssec-policy for two zones with the same
name but in different views, there is a race condition for who will
run the keymgr first. If the views run sequentially, only one set of
keys will be created; if they run in parallel, two sets of keys will
be created.
Lock the kasp when looking for keys and running the key manager. This
way, for the same zone in different views only one keyset will be
created.
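The effect of the lock can be illustrated with a small Python model.
Everything here is a stand-in: the Kasp class, the in-memory
key_directory that plays the role of key files on disk, and the
function names are hypothetical and do not mirror the C implementation.

```python
import threading


class Kasp:
    """Stand-in for a dnssec-policy object shared by both views."""

    def __init__(self):
        self.lock = threading.Lock()


key_directory = {}  # zone name -> key ids; models key files on disk
created = []        # one entry per keyset that was actually created


def run_keymgr(kasp, view, zone):
    # Looking for keys and creating missing ones happen under one lock,
    # so a second view cannot slip in between the two steps.
    with kasp.lock:
        keys = key_directory.get(zone)
        if not keys:
            keys = [f"{zone}ksk", f"{zone}zsk"]
            key_directory[zone] = keys
            created.append((view, zone))
        return keys


kasp = Kasp()
threads = [
    threading.Thread(target=run_keymgr, args=(kasp, view, "example."))
    for view in ("internal", "external")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Whichever view wins the race, created ends up with exactly one entry;
the other view finds the existing keys and reuses them.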
The dnssec-policy does not implement sharing keys between different
zones.
Some comments started with a lowercase letter. Capitalize them to
be more consistent with the rest of the comments.
Add some newlines between `set_*` calls and check calls, also to be
more consistent with the other test cases.
There is a failure mode which gets triggered on heavily loaded
systems. A key change is scheduled in 5 seconds to make ZSK2 inactive
and ZSK3 active, but `named` takes more than 5 seconds to progress
from `rndc loadkeys` to the query check. At this time the SOA RRset
is already signed by the new ZSK which is not expected to be active
at that point yet.
Split up the checks to test the case where RRsets are signed
correctly with the offline KSK (whose signature is maintained) and
the active ZSK. In the first run, RRsets should be signed with the
still active ZSK2; in the second run, RRsets should be signed with
the newly active ZSK3.
We may be checking the algorithm steps too fast: the reconfig
command may still be in progress. Make sure the zones are signed
and loaded by digging the NSEC records for these zones.
Add a test case for algorithm rollover. This is triggered by
changing the dnssec-policy. A new nameserver ns6 is introduced
for tests related to dnssec-policy changes.
This requires a slight change in check_next_key_event to only
check the last occurrence. Also, change the debug log message in
lib/dns/zone.c to deal with checks when no next scheduled key event
exists (and default to loadkeys interval 3600).
Algorithm rollover will require four keys, so introduce KEY4.
It also requires looking at key files for multiple algorithms, so
change getting key ids to be algorithm rollover agnostic (adjusting
count checks). The algorithm will be verified in check_key, so
relaxing 'get_keyids' is fine.
Replace '${_alg_num}' with '$(key_get KEY[1-4] ALG_NUM)' in checks
to deal with multiple algorithms.