Since version 5.0.0, decay-based purging is the only available dirty
page cleanup mechanism in jemalloc. It relies on so-called tickers,
which are simple data structures used for ensuring that certain actions
are taken "once every N times". Ticker data (state) is stored in a
thread-specific data structure called tsd in jemalloc parlance. Ticks
are triggered when extents are allocated and deallocated. Once every
1000 ticks, jemalloc attempts to release some of the dirty pages hanging
around (if any). This allows memory use to be kept in check over time.
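The "once every N times" behavior of a ticker can be sketched as
follows. This is a minimal illustration, not jemalloc's actual ticker
code; the `sketch_ticker_*` names and the struct layout are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ticker, for illustration only (not jemalloc's own type). */
typedef struct {
	int32_t tick;   /* ticks left until the action fires */
	int32_t nticks; /* period: fire once every nticks ticks */
} sketch_ticker_t;

static void
sketch_ticker_init(sketch_ticker_t *ticker, int32_t nticks) {
	assert(ticker != NULL && nticks > 0);
	ticker->tick = nticks;
	ticker->nticks = nticks;
}

/* Returns true once every nticks calls, then rearms itself. */
static bool
sketch_ticker_tick(sketch_ticker_t *ticker) {
	if (--ticker->tick > 0) {
		return false;
	}
	ticker->tick = ticker->nticks;
	return true;
}
```

With a period of 1000, a caller would attempt a purge only when
`sketch_ticker_tick()` returns true.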
This dirty page cleanup mechanism has a quirk. If the first
allocator-related action for a given thread is a free(), a
minimally-initialized tsd is set up which does not include ticker data.
When that thread subsequently calls *alloc(), the tsd transitions to its
nominal state, but due to a certain flag being set during minimal tsd
initialization, ticker data remains unallocated. This prevents
decay-based dirty page purging from working, effectively enabling memory
exhaustion over time. [1]
The quirk described above has been addressed (by moving ticker state to
a different structure) in jemalloc's development branch [2], but not in
any numbered jemalloc version released to date (the latest one being
5.2.1 as of this writing).
Work around the problem by ensuring that every thread spawned by
isc_thread_create() starts with a malloc() call. Avoid immediately
calling free() for the dummy allocation to prevent an optimizing
compiler from stripping away the malloc() + free() pair altogether.
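The workaround can be sketched roughly as below. This is a simplified
illustration rather than the actual BIND code: `sketch_trampoline_t`,
its `dummy` field, and the helper functions are hypothetical stand-ins
for struct isc__trampoline and its run/teardown routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*sketch_threadfunc_t)(void *);

/* Hypothetical stand-in for struct isc__trampoline. */
typedef struct {
	sketch_threadfunc_t start; /* real thread entry point */
	void *arg;                 /* argument passed to start() */
	void *dummy;               /* dummy allocation, freed at teardown */
} sketch_trampoline_t;

/* Example entry point: simply hands its argument back. */
static void *
sketch_worker(void *arg) {
	return arg;
}

/* Meant to run as the very first function of a new thread. */
static void *
sketch_trampoline_run(void *varg) {
	sketch_trampoline_t *trampoline = varg;
	assert(trampoline != NULL && trampoline->start != NULL);

	/*
	 * Make malloc() the thread's first allocator call. Keeping the
	 * pointer in the structure (instead of freeing it right away)
	 * stops the compiler from eliding the malloc() + free() pair.
	 */
	trampoline->dummy = malloc(8);

	return trampoline->start(trampoline->arg);
}

/* Release the dummy allocation once the thread is done. */
static void
sketch_trampoline_destroy(sketch_trampoline_t *trampoline) {
	free(trampoline->dummy);
	trampoline->dummy = NULL;
}
```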
An alternative implementation of this workaround was considered that
used a pair of isc_mem_create() + isc_mem_destroy() calls instead of
malloc() + free(), enabling the change to be fully contained within
isc__trampoline_run() (i.e. to not touch struct isc__trampoline), as the
compiler is not allowed to strip away arbitrary function calls.
However, that solution was eventually dismissed as it triggered
ThreadSanitizer reports when tools like dig, nsupdate, or rndc exited
abruptly without waiting for all worker threads to finish their work.
[1] https://github.com/jemalloc/jemalloc/issues/2251
[2] c259323ab3
Resolve "Improve functions parameter validation in lib/dns/message.c to prevent accessing the -1 index of an array"
Closes #2898
See merge request isc-projects/bind9!5824
dns_message_findname and dns_message_sectiontotext incorrectly accepted
DNS_SECTION_ANY. If DNS_SECTION_ANY was passed, the section array could
be accessed at index -1.
dns_message_pseudosectiontotext and dns_message_pseudosectiontoyaml
incorrectly accepted DNS_PSEUDOSECTION_ANY. These functions are
designed to process a single section.
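The kind of parameter check involved can be sketched as follows. The
`SKETCH_*` constants and helper functions are illustrative assumptions,
not the definitions or code from lib/dns/message.c; the point is only
that a wildcard value of -1 must be rejected before it is used as an
array index.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative constants; the real ones are defined by BIND. */
enum {
	SKETCH_SECTION_ANY = -1, /* wildcard, never a valid array index */
	SKETCH_SECTION_QUESTION = 0,
	SKETCH_SECTION_ANSWER = 1,
	SKETCH_SECTION_AUTHORITY = 2,
	SKETCH_SECTION_ADDITIONAL = 3,
	SKETCH_SECTION_MAX = 4
};

/* True only for values that name exactly one section. */
static bool
sketch_valid_section(int section) {
	return section >= 0 && section < SKETCH_SECTION_MAX;
}

/* Index the per-section array only after validating the section. */
static int
sketch_section_count(const int counts[SKETCH_SECTION_MAX], int section) {
	assert(sketch_valid_section(section));
	return counts[section];
}
```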
There were two problems in the notify system test when it waited for
log messages to appear: the shellcheck refactoring introduced a call
to `wait_for_log` with a regex, but `wait_for_log` only supports fixed
strings, so it always ran for the full 45 second timeout; and the new
test to ensure that notify messages time out failed to reset the
nextpart pointer, so if the notify messages timed out before the test
ran, it would fail to see them.
This change adds a `wait_for_log_re` helper that matches a regex, and
uses it where appropriate in the notify system test, which stops the
test from waiting longer than necessary; and it resets the nextpart
pointer so that the notify timeout test works reliably.
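The difference between the two matching modes can be illustrated in C
(the helpers in the system test itself are shell functions; this sketch
only shows why a regex handed to a fixed-string matcher never matches).
The `sketch_match_*` names are hypothetical, and POSIX `regcomp()` is
assumed to be available.

```c
#include <assert.h>
#include <regex.h> /* POSIX regcomp()/regexec() */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Fixed-string match: the pattern is taken literally. */
static bool
sketch_match_fixed(const char *line, const char *pattern) {
	assert(line != NULL && pattern != NULL);
	return strstr(line, pattern) != NULL;
}

/* Regex match: the pattern is a POSIX extended regular expression. */
static bool
sketch_match_re(const char *line, const char *pattern) {
	regex_t re;
	bool matched;

	assert(line != NULL && pattern != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
		return false; /* invalid pattern: treat as no match */
	}
	matched = (regexec(&re, line, 0, NULL, 0) == 0);
	regfree(&re);
	return matched;
}
```

Given a made-up log line such as `sending notify to 10.53.0.3#5300`
and the pattern `notify to 10\.53\.0\.[0-9]+#`, the fixed-string helper
finds nothing while the regex helper matches.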
Closes #3275
When TASKMGR_TRACE=1 is defined, task and event objects record
detailed tracing information (function, file, line, and backtrace, to
the extent tracked by gcc) about where they were created.
At exit, any unfinished tasks are printed along with this detailed
information.
The only place where isc_task_sendto() was used was in the dns_resolver
unit, where the "sendto" part was actually a no-op, because dns_resolver
uses bound tasks. Remove the isc_task_sendto() and
isc_task_sendtoanddetach() functions in favor of using bound tasks
created with isc_task_create_bound().
Additionally, cache the number of running netmgr threads (nworkers)
locally to reduce the number of function calls.
For some applications, it's useful to not listen on the full battery of
threads. Add a workers argument to all isc_nm_listen*() functions, along
with the convenience macros ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL.
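One way such a workers argument could be interpreted is sketched below.
The macro values (0 meaning "all workers", 1 meaning "one worker") and
the `sketch_listen_workers()` helper are assumptions made for
illustration; consult the netmgr headers for the real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define SKETCH_NM_LISTEN_ALL 0
#define SKETCH_NM_LISTEN_ONE 1

/*
 * Map the requested worker count to the number of threads that
 * should actually listen, given nworkers running netmgr threads.
 */
static size_t
sketch_listen_workers(size_t requested, size_t nworkers) {
	assert(nworkers > 0 && requested <= nworkers);
	if (requested == SKETCH_NM_LISTEN_ALL) {
		return nworkers;
	}
	return requested;
}
```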
dns_rdata_fromtext and dns_rdata_fromwire now check that there is
a valid name or OID at the start of the keydata when the key algorithm
is PRIVATEDNS or PRIVATEOID, respectively.
dns_rdata_totext now prints out the OID if the algorithm is PRIVATEOID.
Prime the cache with a negative cache DS entry, then make a query for
a name beneath that entry. This will cause the DS entry to be retrieved
as part of the validation process. Each RRset in the ncache entry
will be validated and the trust level for each will be updated.
dig previously set an exit code of 9 when a TCP connection failed
or when a UDP connection timed out. However, when the server address is
localhost, a UDP query can fail with ISC_R_CONNREFUSED. That code path
didn't update the exit code, causing dig to exit with status 0. Set
the exit code to 9 in this failure case.
Catalog zones change of ownership is a special mechanism to facilitate
controlled migration of a member zone from one catalog to another.
It is implemented using a catalog zone property named "coo" and is
documented in version 5 of the DNS catalog zones draft.
Implement the feature using a new hash table in the catalog zone
structure. The table holds the "coo" properties added for the catalog
zone (each containing the target catalog zone's name), keyed by the
name of the member zone for which the "coo" property is being
created.
Change some log messages to quote zone names consistently.
Update the ARM with change of ownership documentation and usage
examples.
Add tests which check the newly added features.
When there are multiple record datasets in a database node of a catalog
zone, and BIND encounters a soft error during processing of a dataset,
it breaks from the loop and doesn't process the other datasets in the
node.
There are cases when this is not desired. For example, the catalog zones
draft version 5 states that there must be a TXT RRset named
`version.$CATZ` with exactly one RR, but it places no restriction on
non-TXT RRsets named `version.$CATZ` existing alongside the TXT one. If
such an RRset exists, we get a processing error and do not continue the
loop to process the TXT RRset that comes next.
Remove the "break" statement to continue processing all record datasets.
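The effect of the change can be sketched as follows. This is a
simplified model, not the catz code itself: the `sketch_*` names are
hypothetical, and "only TXT is acceptable" stands in for whatever makes
a dataset fail with a soft error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SKETCH_TYPE_A, SKETCH_TYPE_TXT } sketch_rdtype_t;

/* Pretend only TXT datasets are acceptable; others are soft errors. */
static bool
sketch_process_dataset(sketch_rdtype_t type) {
	return type == SKETCH_TYPE_TXT;
}

/* Returns the number of datasets in the node processed successfully. */
static size_t
sketch_process_node(const sketch_rdtype_t *types, size_t ntypes) {
	size_t processed = 0;

	assert(types != NULL || ntypes == 0);
	for (size_t i = 0; i < ntypes; i++) {
		if (!sketch_process_dataset(types[i])) {
			/* Soft error: skip this dataset, keep going. */
			continue;
		}
		processed++;
	}
	return processed;
}
```

With a node holding an A dataset followed by a TXT dataset, the TXT
dataset is still processed; breaking out of the loop on the first soft
error would have skipped it.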
When processing a new or updated catalog zone, the record datasets
from the database are being processed in order. This creates a
problem because we need to know the version of the catalog zone
schema to process some of the records differently, but we do not
know the version until the 'version' record gets processed.
Find the 'version' record and process it first, only then iterate over
the database to process the rest, making sure not to process the
'version' record twice.
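The resulting processing order can be sketched like this. The function
below is a hypothetical illustration working on plain strings rather
than database nodes; it records the order in which entries would be
processed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write the processing order (as indices into names[]) to order[]:
 * the "version" entry first, if present, then everything else in
 * its original order. Returns the number of entries written.
 */
static size_t
sketch_processing_order(const char *const *names, size_t nnames,
			size_t *order) {
	size_t count = 0;
	size_t version = nnames; /* nnames means "not found" */

	assert(names != NULL && order != NULL);

	/* Pass 1: locate the version record and schedule it first. */
	for (size_t i = 0; i < nnames; i++) {
		if (strcmp(names[i], "version") == 0) {
			version = i;
			order[count++] = i;
			break;
		}
	}

	/* Pass 2: everything else, skipping the version record. */
	for (size_t i = 0; i < nnames; i++) {
		if (i == version) {
			continue;
		}
		order[count++] = i;
	}
	return count;
}
```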
According to version 5 of the DNS catalog zones draft, catalog
zone custom properties must be placed under the "ext" label.
Make the necessary changes to support the new custom properties syntax
in catalog zones with version "2" of the schema.
Change the default catalog zones schema version from "1" to "2" in
the ARM to prepare for the new features and changes introduced starting
with this commit in order to support the latest DNS catalog zones draft
document.
Restructure parts of the ARM and rename the term catalog zone "option"
to "custom property" to better reflect the terms used in the draft.
Change the version of 'catalog1.zone.' catalog zone in the "catz" system
test to "2", and leave the version of 'catalog2.zone.' catalog zone at
version "1" to test both versions.
Add tests to check that the new syntax works only with the new schema
version, and that the old syntax works only with catalog zones using
the legacy schema version.
In `+nssearch` mode `dig` starts the next query of the followup lookup
using `start_udp()` or `start_tcp()` calls without waiting for the
previous query to complete.
In UDP mode that happens in the `send_done()` callback of the previous
query. In TCP mode, however, it happens in the `start_tcp()` call of the
previous query (recursion), which doesn't work because `start_tcp()`
attaches `lookup->current_query` to the query it is starting, so a
recursive call results in an assertion failure.
Make TCP mode start the next query in `send_done()`, just like UDP
mode. By that time `lookup->current_query` has already been detached
by the `tcp_connected()`/`udp_ready()` callbacks.
Mention in the DNSSEC guide in the "revert to unsigned" recipe that you
can publish CDS and CDNSKEY DELETE records to remove the corresponding
DS records from the parent zone.
Update the function that synchronizes the CDS and CDNSKEY DELETE
records. It now allows for the possibility that the CDS DELETE record
is published and the CDNSKEY DELETE record is not, and vice versa.
Also update how 'dns_dnssec_syncdelete()' is called in zone.c.
With KASP, we still maintain the DELETE records ourselves. Otherwise,
we publish the CDS and CDNSKEY DELETE records only if they are added
to the zone. We do still check if these records can be signed by a KSK.
This change will allow users to add a CDS and/or CDNSKEY DELETE record
manually, without BIND removing them on the next zone sign.
Note that this commit removes the check for whether the key is a KSK.
That check is redundant because it is also made in
'dst_key_is_signing()' when the role is set to DST_BOOL_KSK.
Add a test case for a dynamically added CDS DELETE record and make
sure it is not removed when signing the zone. Such a removal happens
because BIND maintains CDS and CDNSKEY publishing and only allows
CDS DELETE records if the zone is transitioning to insecure. This is
a state that can be identified when using KASP through 'dnssec-policy',
but not when using 'auto-dnssec'.
Commit bf3fffff67 added a Python-based
name server (bin/tests/system/forward/ans11/ans.py) to the "forward"
system test, but did not update bin/tests/system/Makefile.am to ensure
Python is present in the test environment before the "forward" system
test is run. Update bin/tests/system/Makefile.am to enforce that
requirement.