The system test framework starts all named instances with the "-d 99"
command line option (unless it is overridden by a named.args file in a
given instance's working directory). This causes a lot of log messages
to be written to named.run files - currently over 5 million lines for a
single test suite run. While debugging information preserved in the log
files is essential for troubleshooting intermittent test failures, some
system tests involve sending hundreds or even thousands of queries,
which causes the relevant log files to explode in size. When multiple
tests (or even multiple test suites) are run in parallel, excessive
logging contributes considerably to the I/O load on the test host,
increasing the odds of intermittent test failures getting triggered.
Decrease the debug level for the seven most verbose named instances:
- use "-d 3" for ns2 in the "cacheclean" system test (it is the lowest
logging level at which the test still passes without the need to
apply any changes to tests.sh),
- use "-d 1" for the other six named instances.
This roughly halves the number of lines logged by each test suite run
while still leaving enough information in the logs to allow at least
basic troubleshooting in case of test failures.
This approach was chosen as it results in a greater decrease in the
number of lines logged than running all named instances with "-d 3",
without causing any test failures.
(cherry picked from commit 4a8d404876)
- ns__client_request() is now called by netmgr with an isc_nmhandle_t
parameter. The handle can then be permanently associated with an
ns_client object.
- The task manager is paused so that isc_task events that may be
triggred during client processing will not fire until after the netmgr is
finished with it. Before any asynchronous event, the client MUST
call isc_nmhandle_ref(client->handle), to prevent the client from
being reset and reused while waiting for an event to process. When
the asynchronous event is complete, isc_nmhandle_unref(client->handle)
must be called to ensure the handle can be reused later.
- reference counting of client objects is now handled in the nmhandle
object. when the handle references drop to zero, the client's "reset"
callback is used to free temporary resources and reiniialize it,
whereupon the handle (and associated client) is placed in the
"inactive handles" queue. when the sysstem is shutdown and the
handles are cleaned up, the client's "put" callback is called to free
all remaining resources.
- because client allocation is no longer handled in the same way,
the '-T clienttest' option has now been removed and is no longer
used by any system tests.
- the unit tests require wrapping the isc_nmhandle_unref() function;
when LD_WRAP is supported, that is used. otherwise we link a
libwrap.so interposer library and use that.
- all tests with "recursion yes" now also specify "dnssec-validation yes",
and all tests with "recursion no" also specify "dnssec-validation no".
this must be maintained in all new tests, or else validation will fail
when we use local root zones for testing.
- clean.sh has been modified where necessary to remove managed-keys.bind
and viewname.mkeys files.
- add CHANGES note
- update copyrights and license headers
- add -j to the make commands in .gitlab-ci.yml to take
advantage of parallelization in the gitlab CI process
3938. [func] Added quotas to be used in recursive resolvers
that are under high query load for names in zones
whose authoritative servers are nonresponsive or
are experiencing a denial of service attack.
- "fetches-per-server" limits the number of
simultaneous queries that can be sent to any
single authoritative server. The configured
value is a starting point; it is automatically
adjusted downward if the server is partially or
completely non-responsive. The algorithm used to
adjust the quota can be configured via the
"fetch-quota-params" option.
- "fetches-per-zone" limits the number of
simultaneous queries that can be sent for names
within a single domain. (Note: Unlike
"fetches-per-server", this value is not
self-tuning.)
- New stats counters have been added to count
queries spilled due to these quotas.
See the ARM for details of these options. [RT #37125]