Any CI job:
- I:dnssec:file dnssec/ns1/trusted.keys not removed
- I:rpzrecurse:file rpzrecurse/ns3/named.run.prev not removed
system:gcc:sid:amd64:
- I🪞file mirror/ns3/_default.nzf not removed
system:gcc:xenial:amd64:
- I:shutdown:file shutdown/.cache/v/cache/lastfailed not removed
(cherry picked from commit 14a104d121)
this changes most visble uses of master/slave terminology in tests.sh
and most uses of 'type master' or 'type slave' in named.conf files.
files in the checkconf test were not updated in order to confirm that
the old syntax still works. rpzrecurse was also left mostly unchanged
to avoid interference with DNSRPS.
(cherry picked from commit e43b3c1fa1)
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.
added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
(cherry picked from commit 16e14353b1)
In order to lower the amount of memory allocated at startup by named
instances used in the BIND system test suite, set the default value of
"max-cache-size" for these to 2 megabytes. The purpose of this change
is to prevent named instances (or even entire virtual machines) from
getting killed by the operating system on the test host due to excessive
memory use.
Remove all "max-cache-size" statements from named configuration files
used in system tests ("checkconf" notwithstanding) to prevent confusion
as the "-T maxcachesize=..." command line option takes precedence over
configuration files.
(cherry picked from commit dad6572093)
- ns__client_request() is now called by netmgr with an isc_nmhandle_t
parameter. The handle can then be permanently associated with an
ns_client object.
- The task manager is paused so that isc_task events that may be
triggred during client processing will not fire until after the netmgr is
finished with it. Before any asynchronous event, the client MUST
call isc_nmhandle_ref(client->handle), to prevent the client from
being reset and reused while waiting for an event to process. When
the asynchronous event is complete, isc_nmhandle_unref(client->handle)
must be called to ensure the handle can be reused later.
- reference counting of client objects is now handled in the nmhandle
object. when the handle references drop to zero, the client's "reset"
callback is used to free temporary resources and reiniialize it,
whereupon the handle (and associated client) is placed in the
"inactive handles" queue. when the sysstem is shutdown and the
handles are cleaned up, the client's "put" callback is called to free
all remaining resources.
- because client allocation is no longer handled in the same way,
the '-T clienttest' option has now been removed and is no longer
used by any system tests.
- the unit tests require wrapping the isc_nmhandle_unref() function;
when LD_WRAP is supported, that is used. otherwise we link a
libwrap.so interposer library and use that.
The "mirror" system test expects all dig queries (including recursive
ones) to be responded to within 1 second, which turns out to be overly
optimistic in certain cases and leads to false positives being
triggered. Increase dig query timeout used throughout the "mirror"
system test to 2 seconds in order to alleviate the issue.
Currently, ns3 in the "mirror" system test sends trust anchor telemetry
queries every second as it is started with "-T tat=1". Given the number
of trust anchors configured on ns3 (9), TAT-related traffic clutters up
log files, hindering troubleshooting efforts. Increase TAT query
interval to 3 seconds in order to alleviate the issue.
Note that the interval chosen cannot be much higher if intermittent test
failures are to be avoided: TAT queries are only sent after the
configured number of seconds passes since resolver startup. Quick
experiments show that even on contemporary hardware, ns3 should be
running for at least 5 seconds before it is first shut down, so a
3-second TAT query interval seems to be a reasonable, future-proof
compromise. Ensure the relevant check is performed before ns3 is first
shut down to emphasize this trade-off and make it more clear by what
time TAT queries are expected to be sent.
When a mirror zone is verified, the 'ignore_kskflag' argument passed to
dns_zoneverify_dnssec() is set to false. This means that in order for
its verification to succeed, a mirror zone needs to have at least one
key with the SEP bit set configured as a trust anchor. This brings no
security benefit and prevents zones signed only using keys without the
SEP bit set from being mirrored, so change the value of the
'ignore_kskflag' argument passed to dns_zoneverify_dnssec() to true.
The "mirror" system test checks whether log messages announcing a mirror
zone coming into effect are emitted properly. However, the helper
functions responsible for waiting for zone transfers and zone loading to
complete do not wait for these exact log messages, but rather for other
ones preceding them, which introduces a possibility of false positives.
This problem cannot be addressed by just changing the log message to
look for because the test still needs to discern between transferring a
zone and loading a zone.
Add two new log messages at debug level 99 (which is what named
instances used in system tests are configured with) that are to be
emitted after the log messages announcing a mirror zone coming into
effect. Tweak the aforementioned helper functions to only return once
the log messages they originally looked for are followed by the newly
added log messages. This reliably prevents races when looking for
"mirror zone is now in use" log messages and also enables a workaround
previously put into place in the "mirror" system test to be reverted.
In the "mirror" system test, ns3 periodically sends trust anchor
telemetry queries to ns1 and ns2. It may thus happen that for some
non-recursive queries for names inside mirror zones which are not yet
loaded, ns3 will be able to synthesize a negative answer from the cached
records it obtained from trust anchor telemetry responses. In such
cases, NXDOMAIN responses will be sent with the root zone SOA in the
AUTHORITY section. Since the root zone used in the "mirror" system test
has the same serial number as ns2/verify.db.in and zone verification
checks look for the specified serial numbers anywhere in the answer, the
test could be broken if different zone names were used.
The +noauth dig option could be used to address this weakness, but that
would prevent entire responses from being stored for later inspection,
which in turn would hamper troubleshooting test failures. Instead, use
a different serial number for ns2/verify.db.in than for any other zone
used in the "mirror" system test and check the number of records in the
ANSWER section of each response.
Due to the way the "mirror" system test is set up, it is impossible for
the "verify-unsigned" and "verify-untrusted" zones to contain any serial
number other than the original one present in ns2/verify.db.in. Thus,
using presence of a different serial number in the SOA records of these
zones as an indicator of problems with mirror zone verification is
wrong. Look for the original zone serial number instead as that is the
one that will be returned by ns3 if one of the aforementioned zones is
successfully verified.
Log a message if a mirror zone becomes unusable for the resolver (most
usually due to the zone's expiration timer firing). Ensure that
verification failures do not cause a mirror zone to be unloaded
(instead, its last successfully verified version should be served if it
is available).
Log a message when a mirror zone is successfully loaded from disk and
subsequently verified.
This could have been implemented in a simpler manner, e.g. by modifying
an earlier code branch inside zone_postload() which checks whether the
zone already has a database attached and calls attachdb() if it does
not, but that would cause the resulting logs to indicate that a mirror
zone comes into effect before the "loaded serial ..." message is logged,
which would be confusing.
Tweak some existing sed commands used in the "mirror" system test to
ensure that separate test cases comprising it do not break each other.
Log a message when a mirror zone is successfully transferred and
verified, but only if no database for that zone was yet loaded at the
time the transfer was initiated.
This could have been implemented in a simpler manner, e.g. by modifying
zone_replacedb(), but (due to the calling order of the functions
involved in finalizing a zone transfer) that would cause the resulting
logs to suggest that a mirror zone comes into effect before its transfer
is finished, which would be confusing given the nature of mirror zones
and the fact that no message is logged upon successful mirror zone
verification.
Once the dns_zone_replacedb() call in axfr_finalize() is made, it
becomes impossible to determine whether the transferred zone had a
database attached before the transfer was started. Thus, that check is
instead performed when the transfer context is first created and the
result of this check is passed around in a field of the transfer context
structure. If it turns out to be desired, the relevant log message is
then emitted just before the transfer context is freed.
Taking this approach means that the log message added by this commit is
not timed precisely, i.e. mirror zone data may be used before this
message is logged. However, that can only be fixed by logging the
message inside zone_replacedb(), which causes arguably more dire issues
discussed above.
dns_zone_isloaded() is not used to double-check that transferred zone
data was correctly loaded since the 'shutdown_result' field of the zone
transfer context will not be set to ISC_R_SUCCESS unless axfr_finalize()
succeeds (and that in turn will not happen unless dns_zone_replacedb()
succeeds).
Use a zone's 'type' field instead of the value of its DNS_ZONEOPT_MIRROR
option for checking whether it is a mirror zone. This makes said zone
option and its associated helper function, dns_zone_mirror(), redundant,
so remove them. Remove a check specific to mirror zones from
named_zone_reusable() since another check in that function ensures that
changing a zone's type prevents it from being reused during
reconfiguration.
Force ns3 to use a constant source address (10.53.0.3) when sending
transfer requests for the "initially-unavailable" zone to prevent
failures of transfers not triggered by bin/tests/system/mirror/tests.sh
from causing fallback to using a source address for which transfers of
that zone are refused throughout the entire "mirror" system test since
that might yield false positives.
When query processing hits a delegation from a locally configured zone,
an attempt may be made to look for a better answer in the cache. In
such a case, the zone-sourced delegation data is set aside and the
lookup is retried using the cache database. When that lookup is
completed, a decision is made whether the answer found in the cache is
better than the answer found in the zone.
Currently, if the zone-sourced answer turns out to be better than the
one found in the cache:
- qctx->zdb is not restored into qctx->db,
- qctx->node, holding the zone database node found, is not even saved.
Thus, in such a case both qctx->db and qctx->node will point at cache
data. This is not an issue for BIND versions which do not support
mirror zones because in these versions non-recursive queries always
cause the zone-sourced delegation to be returned and thus the
non-recursive part of query_delegation() is never reached if the
delegation is coming from a zone. With mirror zones, however,
non-recursive queries may cause cache lookups even after a zone
delegation is found. Leaving qctx->db assigned to the cache database
when query_delegation() determines that the zone-sourced delegation is
the best answer to the client's query prevents DS records from being
added to delegations coming from mirror zones. Fix this issue by
keeping the zone database and zone node in qctx while the cache is
searched for an answer and then restoring them into qctx->db and
qctx->node, respectively, if the zone-sourced delegation turns out to be
the best answer. Since this change means that qctx->zdb cannot be used
as the glue database any more as it will be reset to NULL by RESTORE(),
ensure that qctx->db is not a cache database before attaching it to
qctx->client->query.gluedb.
Furthermore, current code contains a conditional statement which
prevents a mirror zone from being used as a source of glue records.
Said statement was added to prevent assertion failures caused by
attempting to use a zone database's glue cache for finding glue for an
NS RRset coming from a cache database. However, that check is overly
strict since it completely prevents glue from being added to delegations
coming from mirror zones. With the changes described above in place,
the scenario this check was preventing can no longer happen, so remove
the aforementioned check.
If qctx->zdb is not NULL, qctx->zfname will also not be NULL;
qctx->zsigrdataset may be NULL in such a case, but query_putrdataset()
handles pointers to NULL pointers gracefully. Remove redundant
conditional expressions to make the cleanup code in query_freedata()
match the corresponding sequences of SAVE() / RESTORE() macros more
closely.
dns_view_zonecut() may associate the dns_rdataset_t structure passed to
it even if it returns a result different then ISC_R_SUCCESS. Not
handling this properly may cause a reference leak. Fix by ensuring
'nameservers' is cleaned up in all relevant failure modes.
Trying to resolve a trust anchor telemetry query for a locally served
zone does not cause upstream queries to be sent as the response is
determined just by consulting local data. Work around this issue by
calling dns_view_findzonecut() first in order to determine the NS RRset
for a given domain name and then passing the zone cut found to
dns_resolver_createfetch().
Note that this change only applies to TAT queries generated by the
resolver itself, not to ones received from downstream resolvers.
Update named_zone_reusable() so that it does not consider a zone to be
eligible for reuse if its old value of the "mirror" option differs from
the new one. This causes "rndc reconfig" to create a new zone structure
whenever the value of the "mirror" option is changed, which ensures that
the previous zone database is not reused and that flags are properly set
in responses sourced from zones whose "mirror" setting was changed at
runtime.
Replace "type: slave" with "type: mirror" in "rndc zonestatus" output
for mirror zones in order to enable the user to tell a regular slave
zone and a mirror zone apart.
Since the mirror zone feature is expected to mostly be used for the root
zone, prevent slaves from sending NOTIFY messages for mirror zones by
default. Retain the possibility to use "also-notify" as it might be
useful in certain cases.
As mirror zone data should be treated the way validated, cached DNS
responses are, outgoing mirror zone transfers should be disabled unless
they are explicitly enabled by zone configuration.
As mirror zone data should be treated the way validated, cached DNS
responses are, it should not be used when responding to clients who are
not allowed cache access. Reuse code responsible for determining cache
database access for evaluating mirror zone access.
If transferring or loading a mirror zone fails, resolution should still
succeed by means of falling back to regular recursive queries.
Currently, though, if a slave zone is present in the zone table and not
loaded, a SERVFAIL response is generated. Thus, mirror zones need
special handling in this regard.
Add a new dns_zt_find() flag, DNS_ZTFIND_MIRROR, and set it every time a
domain name is looked up rather than a zone itself. Handle that flag in
dns_zt_find() in such a way that a mirror zone which is expired or not
yet loaded is ignored when looking up domain names, but still possible
to find when the caller wants to know whether the zone is configured.
This causes a fallback to recursion when mirror zone data is unavailable
without making unloaded mirror zones invisible to code checking a zone's
existence.
Zone RRsets are assigned trust level "ultimate" upon load, which causes
the AD bit to not be set in responses coming from slave zones, including
mirror zones. Make dns_zoneverify_dnssec() update the trust level of
verified RRsets to "secure" so that the AD bit is set in such responses.
No rollback mechanism is implemented as dns_zoneverify_dnssec() fails in
case of any DNSSEC failure, which causes the mirror zone version being
verified to be discarded.
Section 4 of RFC 7706 suggests that responses sourced from a local copy
of a zone should not have the AA bit set. Follow that recommendation by
setting 'qctx->authoritative' to ISC_FALSE when a response to a query is
coming from a mirror zone.
When a resolver is a regular slave (i.e. not a mirror) for some zone,
non-recursive queries for names below that slaved zone will return a
delegation sourced from it. This behavior is suboptimal for mirror
zones as their contents should rather be treated as validated, cached
DNS responses. Modify query_delegation() and query_zone_delegation() to
permit clients allowed cache access to check its contents for a better
answer when responding to non-recursive queries.
Make ns3 mirror the "root" zone from ns1 and query the former for a
properly signed record below the root. Ensure ns1 is not queried during
resolution and that the AD bit is set in the response.
As mirror zone files are verified when they are loaded from disk, verify
journal files as well to ensure invalid data is not used. Reuse the
journals generated during IXFR tests to test this.
Update axfr_commit() so that all incoming versions of a mirror zone
transferred using AXFR are verified before being used. If zone
verification fails, discard the received version of the zone, wait until
the next refresh and retry.