The internal memory allocator had an extra code to keep a list of blocks
for small size allocation. This would help to reduce the interactions
with the system malloc as the memory would be already allocated from the
system, but there's an extra cost associated with that - all the
allocations/deallocations must be locked, effectively eliminating any
optimizations in the system allocator targeted at multi-threaded
applications. While the isc_mem API is still using locks pretty heavily,
this is a first step into reducing the memory allocation/deallocation
contention.
The <isc/readline.h> header provided a compatibility shim to use when
other non-GNU readline libraries are in use. The two places where
readline library is being used is nslookup and nsupdate, so the header
file has been moved to bin/dig directory and it's directly included from
bin/nsupdate.
This also conceals any readline headers exposed from the libisc headers.
This commit completes the support for DNS-over-HTTP(S) built on top of
nghttp2 and plugs it into the BIND. Support for both GET and POST
requests is present, as required by RFC8484.
Both encrypted (via TLS) and unencrypted HTTP/2 connections are
supported. The latter are mostly there for debugging/troubleshooting
purposes and for the means of encryption offloading to third-party
software (as might be desirable in some environments to simplify TLS
certificates management).
This commit includes work-in-progress implementation of
DNS-over-HTTP(S).
Server-side code remains mostly untested, and there is only support
for POST requests.
This commit resurrects the old TLS code from
8f73c70d23.
It also includes numerous stability fixes and support for
isc_nm_cancelread() for the TLS layer.
The code was resurrected to be used for DoH.
* Following the example set in 634bdfb16d, the tlsdns netmgr
module now uses libuv and SSL primitives directly, rather than
opening a TLS socket which opens a TCP socket, as the previous
model was difficult to debug. Closes#2335.
* Remove the netmgr tls layer (we will have to re-add it for DoH)
* Add isc_tls API to wrap the OpenSSL SSL_CTX object into libisc
library; move the OpenSSL initialization/deinitialization from dstapi
needed for OpenSSL 1.0.x to the isc_tls_{initialize,destroy}()
* Add couple of new shims needed for OpenSSL 1.0.x
* When LibreSSL is used, require at least version 2.7.0 that
has the best OpenSSL 1.1.x compatibility and auto init/deinit
* Enforce OpenSSL 1.1.x usage on Windows
* Added a TLSDNS unit test and implemented a simple TLSDNS echo
server and client.
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested. It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.
NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.
The changes that are included in this commit are listed here
(extensively, but not exclusively):
* The netmgr_test unit test was split into individual tests (udp_test,
tcp_test, tcpdns_test and newly added tcp_quota_test)
* The udp_test and tcp_test has been extended to allow programatic
failures from the libuv API. Unfortunately, we can't use cmocka
mock() and will_return(), so we emulate the behaviour with #define and
including the netmgr/{udp,tcp}.c source file directly.
* The netievents that we put on the nm queue have variable number of
members, out of these the isc_nmsocket_t and isc_nmhandle_t always
needs to be attached before enqueueing the netievent_<foo> and
detached after we have called the isc_nm_async_<foo> to ensure that
the socket (handle) doesn't disappear between scheduling the event and
actually executing the event.
* Cancelling the in-flight TCP connection using libuv requires to call
uv_close() on the original uv_tcp_t handle which just breaks too many
assumptions we have in the netmgr code. Instead of using uv_timer for
TCP connection timeouts, we use platform specific socket option.
* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}
When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
waiting for socket to either end up with error (that path was fine) or
to be listening or connected using condition variable and mutex.
Several things could happen:
0. everything is ok
1. the waiting thread would miss the SIGNAL() - because the enqueued
event would be processed faster than we could start WAIT()ing.
In case the operation would end up with error, it would be ok, as
the error variable would be unchanged.
2. the waiting thread miss the sock->{connected,listening} = `true`
would be set to `false` in the tcp_{listen,connect}close_cb() as
the connection would be so short lived that the socket would be
closed before we could even start WAIT()ing
* The tcpdns has been converted to using libuv directly. Previously,
the tcpdns protocol used tcp protocol from netmgr, this proved to be
very complicated to understand, fix and make changes to. The new
tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
Closes: #2194, #2283, #2318, #2266, #2034, #1920
* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
pass accepted TCP sockets between netthreads, but instead (similar to
UDP) uses per netthread uv_loop listener. This greatly reduces the
complexity as the socket is always run in the associated nm and uv
loops, and we are also not touching the libuv internals.
There's an unfortunate side effect though, the new code requires
support for load-balanced sockets from the operating system for both
UDP and TCP (see #2137). If the operating system doesn't support the
load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
on FreeBSD 12+), the number of netthreads is limited to 1.
* The netmgr has now two debugging #ifdefs:
1. Already existing NETMGR_TRACE prints any dangling nmsockets and
nmhandles before triggering assertion failure. This options would
reduce performance when enabled, but in theory, it could be enabled
on low-performance systems.
2. New NETMGR_TRACE_VERBOSE option has been added that enables
extensive netmgr logging that allows the software engineer to
precisely track any attach/detach operations on the nmsockets and
nmhandles. This is not suitable for any kind of production
machine, only for debugging.
* The tlsdns netmgr protocol has been split from the tcpdns and it still
uses the old method of stacking the netmgr boxes on top of each other.
We will have to refactor the tlsdns netmgr protocol to use the same
approach - build the stack using only libuv and openssl.
* Limit but not assert the tcp buffer size in tcp_alloc_cb
Closes: #2061
Add server-side TLS support to netmgr - that includes moving some of the
isc_nm_ functions from tcp.c to a wrapper in netmgr.c calling a proper
tcp or tls function, and a new isc_nm_listentls() function.
Add DoT support to tcpdns - isc_nm_listentlsdns().
this function sets the read timeout for the socket associated
with a netmgr handle and, if the timer is running, resets it.
for TCPDNS sockets it also sets the read timeout and resets the
timer on the outer TCP socket.
- isc_nm_tcpdnsconnect() sets up up an outgoing TCP DNS connection.
- isc_nm_tcpconnect(), _udpconnect() and _tcpdnsconnect() now take a
timeout argument to ensure connections time out and are correctly
cleaned up on failure.
- isc_nm_read() now supports UDP; it reads a single datagram and then
stops until the next time it's called.
- isc_nm_cancelread() now runs asynchronously to prevent assertion
failure if reading is interrupted by a non-network thread (e.g.
a timeout).
- isc_nm_cancelread() can now apply to UDP sockets.
- added shim code to support UDP connection in versions of libuv
prior to 1.27, when uv_udp_connect() was added
all these functions will be used to support outgoing queries in dig,
xfrin, dispatch, etc.
1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking
whether the socket was still alive and scheduling reads/sends on
closed socket.
2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
changed to always return the error conditions via the callbacks, so
they always succeed. This applies to all protocols (UDP, TCP and
TCPDNS).
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.
A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:
- reqhandle: held while waiting for a request callback (query,
notify, update)
- sendhandle: held while waiting for a send callback
- fetchhandle: held while waiting for a recursive fetch to
complete
- updatehandle: held while waiting for an update-forwarding
task to complete
control channel connection objects now contain:
- readhandle: held while waiting for a read callback
- sendhandle: held while waiting for a send callback
- cmdhandle: held while an rndc command is running
httpd connections contain:
- readhandle: held while waiting for a read callback
- sendhandle: held while waiting for a send callback
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
indicate whether it is to be used as a client or server socket
While
if (isc_refcount_decrement() == 1) { // memory_order_release
isc_refcount_destroy(); // memory_order_acquire
...
}
is theoretically the most efficent in practice, using
memory_order_acq_rel produces the same code on x86_64 and doesn't
trigger tsan data races (which use a idealistic model) if
isc_refcount_destroy() is not called immediately. In fact
isc_refcount_destroy() could be removed if we didn't want
to check for the count being 0 when isc_refcount_destroy() is
called.
https://stackoverflow.com/questions/49112732/memory-order-in-shared-pointer-destructor
This commit updates and simplifies the checks for the readline support
in nslookup and nsupdate:
* Change the autoconf checks to pkg-config only, all supported
libraries have accompanying .pc files now.
* Add editline support in addition to libedit and GNU readline
* Add isc/readline.h shim header that defines dummy readline()
function when no readline library is available
When pk11_numbits() is passed a user provided input that contains all
zeroes (via crafted DNS message), it would crash with assertion
failure. Fix that by properly handling such input.
Created isc_refcount_decrement_expect macro to test conditionally
the return value to ensure it is in expected range. Converted
unchecked isc_refcount_decrement to use isc_refcount_decrement_expect.
Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect.
The stdatomic shims for non-C11 compilers (Windows, old gcc, ...) and
mutexatomic implemented only and minimal subset of the atomic types.
This commit adds 16-bit operations for Windows and all atomic types as
defined in standard.
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.
in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().
The isc_nm_cancelread() function cancels reading on a connected
socket and calls its read callback function with a 'result'
parameter of ISC_R_CANCELED.
the isc_nm_tcpconnect() function establishes a client connection via
TCP. once the connection is esablished, a callback function will be
called with a newly created network manager handle.
there is no need for a caller to reference-count socket objects.
they need tto be able tto close listener sockets (i.e., those
returned by isc_nm_listen{udp,tcp,tcpdns}), and an isc_nmsocket_close()
function has been added for that. other sockets are only accessed via
handles.
when built with "configure --enable-singletrace", named will produce
detailed query logging at the highest debug level for any query with
query ID zero.
this enables monitoring of the progress of a single query by specifying
the QID using "dig +qid=0". the "client" logging category should be set
to a low severity level to suppress logging of other queries. (the
chance of another query using QID=0 at the same time is only 1 in 2^16.)
"--enable-singletrace" turns on "--enable-querytrace" as well, so if the
logging severity is not lowered, all other queries will be logged
verbosely as well. compiling with either of these options will impair
query performance; they should only be turned on when testing or
troubleshooting.
Instead of using bind() and passing the listening socket to the children
threads using uv_export/uv_import use one thread that does the accepting,
and then passes the connected socket using uv_export/uv_import to a random
worker. The previous solution had thundering herd problems (all workers
waking up on one connection and trying to accept()), this one avoids this
and is simpler.
The tcp clients quota is simplified with isc_quota_attach_cb - a callback
is issued when the quota is available.
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
The pk11/constants.h header contained static CK_BYTE arrays and
we had to use #defines to pull only those we need. This commit
changes the constants to only define byte arrays with the content
and either use them directly or define the CK_BYTE arrays locally
where used.
We introduce a isc_quota_attach_cb function - if ISC_R_QUOTA is returned
at the time the function is called, then a callback will be called when
there's quota available (with quota already attached). The callbacks are
organized as a LIFO queue in the quota structure.
It's needed for TCP client quota - with old networking code we had one
single place where tcp clients quota was processed so we could resume
accepting when the we had spare slots, but it's gone with netmgr - now
we need to notify the listener/accepter that there's quota available so
that it can resume accepting.
Remove unused isc_quota_force() function.
The isc_quote_reserve and isc_quota_release were used only internally
from the quota.c and the tests. We should not expose API we are not
using.
tcpdns used transport-specific functions to operate on the outer socket.
Use generic ones instead, and select the proper call in netmgr.c.
Make the missing functions (e.g. isc_nm_read) generic and add type-specific
calls (isc__nm_tcp_read). This is the preparation for netmgr TLS layer.
The isc_mem API now crashes on memory allocation failure, and this is
the next commit in series to cleanup the code that could fail before,
but cannot fail now, e.g. isc_result_t return type has been changed to
void for the isc_log API functions that could only return ISC_R_SUCCESS.
The <isc/md.h> header directly included <openssl/evp.h> header which
enforced all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace, we no longer enforce this.
In the long run, this might also allow us to switch cryptographic
library implementation without affecting the downstream users.
While making the isc_md_type_t type opaque, the API using the data type
was changed to use the pointer to isc_md_type_t instead of using the
type directly.
The <isc/md.h> header directly included <openssl/hmac.h> header which
enforced all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace, we no longer enforce this.
In the long run, this might also allow us to switch cryptographic
library implementation without affecting the downstream users.
The two "functions" that isc/safe.h declared before were actually simple
defines to matching OpenSSL functions. The downside of the approach was
enforcing all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace changing the defines into
simple functions, we no longer enforce this. In the long run, this
might also allow us to switch cryptographic library implementation
without affecting the downstream users.
The previous commit removed the code related to the internal symbol
table. On platforms where available, we can now use backtrace_symbols()
to print more verbose symbols table to the output.
As there's now general availability of backtrace() and
backtrace_symbols() functions (see below), the commit also removes the
usage of glibc internals and the custom stack tracing.
* backtrace(), backtrace_symbols(), and backtrace_symbols_fd() are
provided in glibc since version 2.1.
* backtrace(), backtrace_symbols(), and backtrace_symbols_fd() first
appeared in Mac OS X 10.5.
* The backtrace() library of functions first appeared in NetBSD 7.0 and
FreeBSD 10.0.
This commit simplifies a bit the lock management within dns_resolver_prime()
and prime_done() functions by means of turning resolver's attribute
"priming" into an atomic_bool and by creating only one dependent object on the
lock "primelock", namely the "primefetch" attribute.
By having the attribute "priming" as an atomic type, it save us from having to
use a lock just to test if priming is on or off for the given resolver context
object, within "dns_resolver_prime" function.
The "primelock" lock is still necessary, since dns_resolver_prime() function
internally calls dns_resolver_createfetch(), and whenever this function
succeeds it registers an event in the task manager which could be called by
another thread, namely the "prime_done" function, and this function is
responsible for disposing the "primefetch" attribute in the resolver object,
also for resetting "priming" attribute to false.
It is important that the invariant "priming == false AND primefetch == NULL"
remains constant, so that any thread calling "dns_resolver_prime" knows for sure
that if the "priming" attribute is false, "primefetch" attribute should also be
NULL, so a new fetch context could be created to fulfill this purpose, and
assigned to "primefetch" attribute under the lock protection.
To honor the explanation above, dns_resolver_prime is implemented as follow:
1. Atomically checks the attribute "priming" for the given resolver context.
2. If "priming" is false, assumes that "primefetch" is NULL (this is
ensured by the "prime_done" implementation), acquire "primelock"
lock and create a new fetch context, update "primefetch" pointer to
point to the newly allocated fetch context.
3. If "priming" is true, assumes that the job is already in progress,
no locks are acquired, nothing else to do.
To keep the previous invariant consistent, "prime_done" is implemented as follow:
1. Acquire "primefetch" lock.
2. Keep a reference to the current "primefetch" object;
3. Reset "primefetch" attribute to NULL.
4. Release "primefetch" lock.
5. Atomically update "priming" attribute to false.
6. Destroy the "primefetch" object by using the temporary reference.
This ensures that if "priming" is false, "primefetch" was already reset to NULL.
It doesn't make any difference in having the "priming" attribute not protected
by a lock, since the visible state of this variable would depend on the calling
order of the functions "dns_resolver_prime" and "prime_done".
As an example, suppose that instead of using an atomic for the "priming" attribute
we employed a lock to protect it.
Now suppose that "prime_done" function is called by Thread A, it is then preempted
before acquiring the lock, thus not reseting "priming" to false.
In parallel to that suppose that a Thread B is scheduled and that it calls
"dns_resolver_prime()", it then acquires the lock and check that "priming" is true,
thus it will consider that this resolver object is already priming and it won't do
any more job.
Conversely if the lock order was acquired in the other direction, Thread B would check
that "priming" is false (since prime_done acquired the lock first and set "priming" to false)
and it would initiate a priming fetch for this resolver.
An atomic variable wouldn't change this behavior, since it would behave exactly the
same, depending on the function call order, with the exception that it would avoid
having to use a lock.
There should be no side effects resulting from this change, since the previous
implementation employed use of the more general resolver's "lock" mutex, which
is used in far more contexts, but in the specifics of the "dns_resolver_prime"
and "prime_done" it was only used to protect "primefetch" and "priming" attributes,
which are not used in any of the other critical sections protected by the same lock,
thus having zero dependency on those variables.
hp implementation requires an object for each thread accessing
a hazard pointer. previous implementation had a hardcoded
HP_MAX_THREAD value of 128, which failed on machines with lots of
CPU cores (named uses 3n threads). We make isc__hp_max_threads
configurable at startup, with the value set to 4*named_g_cpus.
It's also important for this value not to be too big as we do
linear searches on a list.
The isc_refcount API that provides reference counting lost DbC checks for
overflows and underflows in the isc_refcount_{increment,decrement} functions.
The commit restores the overflow check in the isc_refcount_increment and
underflows check in the isc_refcount_decrement by checking for the previous
value to not be on the boundary.
- the socket stat counters have been moved from socket.h to stats.h.
- isc_nm_t now attaches to the same stats counter group as
isc_socketmgr_t, so that both managers can increment the same
set of statistics
- isc__nmsocket_init() now takes an interface as a paramter so that
the address family can be determined when initializing the socket.
- based on the address family and socket type, a group of statistics
counters will be associated with the socket - for example, UDP4Active
with IPv4 UDP sockets and TCP6Active with IPv6 TCP sockets. note
that no counters are currently associated with TCPDNS sockets; those
stats will be handled by the underlying TCP socket.
- the counters are not actually used by netmgr sockets yet; counter
increment and decrement calls will be added in a later commit.
After the network manager rewrite, tcp-higwater stats was only being
updated when a valid DNS query was received over tcp.
It turns out tcp-quota is updated right after a tcp connection is
accepted, before any data is read, so in the event that some client
connect but don't send a valid query, it wouldn't be taken into
account to update tcp-highwater stats, that is wrong.
This commit fix tcp-highwater to update its stats whenever a tcp connection
is established, independent of what happens after (timeout/invalid
request, etc).
The new ISC_THREAD_LOCAL macro unifies usage of platform dependent
Thread Local Storage definition thread_local vs __thread vs
__declspec(thread) to a single macro.
The commit also unifies the required level of support for TLS as for
some parts of the code it was mandatory and for some parts of the code
it wasn't.
FCTX_ATTR_SHUTTINGDOWN needs to be set and tested while holding the node
lock but the rest of the attributes don't as they are task locked. Making
fctx->attributes atomic allows both behaviours without races.
- restore support for tcp-initial-timeout, tcp-idle-timeout,
tcp-keepalive-timeout and tcp-advertised-timeout configuration
options, which were ineffective previously.
when the TCPDNS_CLIENTS_PER_CONN limit has been exceeded for a TCP
DNS connection, switch to sequential mode to ensure that memory cannot
be exhausted by too many simultaneous queries.
This allows a task to be temporary disabled so that objects won't be
processed simultaneously by libuv events and isc_task events. When a
task is paused, currently running events may complete, but no further
event will added to the run queue will be executed until the task is
unpaused.
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.
This is a replacement for the existing isc_socket and isc_socketmgr
implementation. It uses libuv for asynchronous network communication;
"networker" objects will be distributed across worker threads reading
incoming packets and sending them for processing.
UDP listener sockets automatically create an array of "child" sockets
so each worker can listen separately.
TCP sockets are shared amongst worker threads.
A TCPDNS socket is a wrapper around a TCP socket, which handles the
the two-byte length field at the beginning of DNS messages over TCP.
(Other wrapper socket types can be implemented in the future to handle
DNS over TLS, DNS over HTTPS, etc.)
The double-locked queue implementation is still currently in use
in ns_client, but will be replaced by a fetch-and-add array queue.
This commit moves it from queue.h to list.h so that queue.h can be
used for the new data structure, and clean up dependencies between
list.h and types.h. Later, when the ISC_QUEUE is no longer is use,
it will be removed completely.
cppcheck 1.89 enabled certain value flow analysis mechanisms [1] which
trigger null pointer dereference false positives in lib/dns/rpz.c:
lib/dns/rpz.c:582:7: warning: Possible null pointer dereference: tgt_ip [nullPointer]
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:1419:44: note: Calling function 'adj_trigger_cnt', 4th argument 'NULL' value is 0
adj_trigger_cnt(rpzs, rpz_num, rpz_type, NULL, 0, true);
^
lib/dns/rpz.c:582:7: note: Null pointer dereference
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:596:7: warning: Possible null pointer dereference: tgt_ip [nullPointer]
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:1419:44: note: Calling function 'adj_trigger_cnt', 4th argument 'NULL' value is 0
adj_trigger_cnt(rpzs, rpz_num, rpz_type, NULL, 0, true);
^
lib/dns/rpz.c:596:7: note: Null pointer dereference
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:610:7: warning: Possible null pointer dereference: tgt_ip [nullPointer]
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:1419:44: note: Calling function 'adj_trigger_cnt', 4th argument 'NULL' value is 0
adj_trigger_cnt(rpzs, rpz_num, rpz_type, NULL, 0, true);
^
lib/dns/rpz.c:610:7: note: Null pointer dereference
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
It seems that cppcheck no longer treats at least some REQUIRE()
assertion failures as fatal, so add extra assertion macro definitions to
lib/isc/include/isc/util.h that are only used when the CPPCHECK
preprocessor macro is defined; these definitions make cppcheck 1.89
behave as expected.
There is an important requirement for these custom definitions to work:
cppcheck must properly treat abort() as a function which does not
return. In order for that to happen, the __GNUC__ macro must be set to
a high enough number (because system include directories are used and
system headers compile attributes away if __GNUC__ is not high enough).
__GNUC__ is thus set to the major version number of the GCC compiler
used, which is what that latter does itself during compilation.
[1] aaeec462e6
The OASIS pkcs11.h header has a restrictive license. Replace the
pkcs11.h pkcs11f.h and pkcs11t.h headers with pkcs11.h from p11-kit.
For source distribution, the license for the OASIS headers itself
doesn't pose any licensing problem when combined with MPL license, but
it possibly creates problem for downstream distributors of BIND 9.
Previously the libisc allocator had ability to run unlocked when threading was
disabled. As the threading is now always on, remove the ISC_MEMFLAG_NOLOCK
memory flag as it serves no purpose.
The isc_mem_createx() function was only used in the tests to eliminate using the
default flags (which as of writing this commit message was ISC_MEMFLAG_INTERNAL
and ISC_MEMFLAG_FILL). This commit removes the isc_mem_createx() function from
the public API.
Previously, the isc_mem_create() and isc_mem_createx() functions took `max_size`
and `target_size` as first two arguments. Those values were never used in the
BIND 9 code. The refactoring removes those arguments and let BIND 9 always use
the default values.
Previously, the isc_mem_create() and isc_mem_createx() functions could have
failed because of failed memory allocation. As this was no longer true and the
functions have always returned ISC_R_SUCCESS, the have been refactored to return
void.
The native implementation's conversion from the uint8_t buffers to uint64_t now
follows the reference implementation that doesn't require aligned buffers.
This commit changes the BIND cookie algorithms to match
draft-sury-toorop-dnsop-server-cookies-00. Namely, it changes the Client Cookie
algorithm to use SipHash 2-4, adds the new Server Cookie algorithm using SipHash
2-4, and changes the default for the Server Cookie algorithm to be siphash24.
Add siphash24 cookie algorithm, and make it keep legacy aes as
The ThreadSanitizer found several possible data races in our rwlock
implementation. This commit changes all the unprotected variables to atomic and
also changes the explicit memory ordering (atomic_<foo>_explicit(..., <order>)
functions to use our convenience macros (atomic_<foo>_<order>).
The json-c have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header. This MR fixes the usage making the caller object opaque.
The libxml2 have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header. This MR fixes the usage making the caller object opaque.
Move the macOS section of <isc/endian.h> to a lower spot as it is
believed not to be the most popular platform for running BIND. Add a
comment and remove redundant definitions.
Instead of only supporting Linux, try making <isc/endian.h> support
other GNU platforms as well. Since some compilers define __GNUC__ on
BSDs (e.g. Clang on FreeBSD), move the relevant section to the bottom of
the platform-specific part of <isc/endian.h>, so that it only gets
evaluated when more specific platform determination criteria are not
met. Also include <byteswap.h> so that any byte-swapping macros which
may be defined in that file on older platforms are used in the fallback
definitions of the nonstandard hto[bl]e{16,32,64}() and
[bl]e{16,32,64}toh() conversion functions.
While Solaris does not support the nonstandard hto[bl]e{16,32,64}() and
[bl]e{16,32,64}toh() conversion functions, it does have some
byte-swapping macros available in <sys/byteorder.h>. Ensure these
macros are used in the fallback definitions of the aforementioned
nonstandard functions.
Since the hto[bl]e{16,32,64}() and [bl]e{16,32,64}toh() conversion
functions are nonstandard, add fallback definitions of these functions
to <isc/endian.h>, so that their unavailability does not prevent
compilation from succeeding.
Current versions of DragonFly BSD, FreeBSD, NetBSD, and OpenBSD all
support the modern variants of functions converting values between host
and big-endian/little-endian byte order while older ones might not.
Ensure <isc/endian.h> works properly in both cases.
This work cleans up the API which includes couple of things:
1. Make the isc_appctx_t type fully opaque
2. Protect all access to the isc_app_t members via stdatomics
3. sigwait() is part of POSIX.1, remove dead non-sigwait code
4. Remove unused code: isc_appctx_set{taskmgr,sockmgr,timermgr}
The header file <isc/atomic.h> now contains convenience macros for
most useful explicit memory ordering for C11 stdatomics, only relaxed
and acquire-release semantics is being used. These macros SHOULD be
used instead of atomic_<func>_explicit functions.
- if the TCP quota has been exceeded but there are no clients listening
for new connections on the interface, we can now force attachment to the
quota using isc_quota_force(), instead of carrying on with the quota not
attached.
- the TCP client quota is now referenced via a reference-counted
'ns_tcpconn' object, one of which is created whenever a client begins
listening for new connections, and attached to by members of that
client's pipeline group. when the last reference to the tcpconn
object is detached, it is freed and the TCP quota slot is released.
- reduce code duplication by adding mark_tcp_active() function
- convert counters to stdatomic
(cherry picked from commit a8dd133d270873b736c1be9bf50ebaa074f5b38f)
(cherry picked from commit 4a8fc979c4)
If we know that we'll have a task pool doing specific thing it's better
to use this knowledge and bind tasks to task queues, this behaves better
than randomly choosing the task queue.
- use bound resolver tasks - we have a pool of tasks doing resolutions,
we can spread the load evenly using isc_task_create_bound
- quantum set universally to 25
While implementing the new unit testing framework cmocka, it was found that the
BIND 9 code doesn't compile when assertions are disabled or replaced with any
function (such as mock_assert() from cmocka unit testing framework) that's not
directly recognized as assertion by the compiler.
This made the compiler to complain about blocks of code that was recognized as
unreachable before, but now it isn't.
The changes in this commit include:
* assigns default values to couple of local variables,
* moves some return statements around INSIST assertions,
* adds __builtin_unreachable(); annotations after some INSIST assertions,
* fixes one broken assertion (= instead of ==)
Remove the following functions in order to simplify socket code:
- isc_socket_recvv()
- isc_socket_sendtov()
- isc_socket_sendtov2()
- isc_socket_sendv()
- this enables memory to be allocated and freed in dyndb modules
when named is linked statically. when we standardize on libtool,
this should become unnecessary.
- also, simplified the isc_mem_create/createx API by removing
extra compatibility functions
This commit reverts the previous change to use system provided
entropy, as (SYS_)getrandom is very slow on Linux because it is
a syscall.
The change introduced in this commit adds a new call isc_nonce_buf
that uses CSPRNG from cryptographic library provider to generate
secure data that can be and must be used for generating nonces.
Example usage would be DNS cookies.
The isc_random() API has been changed to use fast PRNG that is not
cryptographically secure, but runs entirely in user space. Two
contestants have been considered xoroshiro family of the functions
by Villa&Blackman and PCG by O'Neill. After a consideration the
xoshiro128starstar function has been used as uint32_t random number
provider because it is very fast and has good enough properties
for our usage pattern.
The other change introduced in the commit is the more extensive usage
of isc_random_uniform in places where the usage pattern was
isc_random() % n to prevent modulo bias. For usage patterns where
only 16 or 8 bits are needed (DNS Message ID), the isc_random()
functions has been renamed to isc_random32(), and isc_random16() and
isc_random8() functions have been introduced by &-ing the
isc_random32() output with 0xffff and 0xff. Please note that the
functions that uses stripped down bit count doesn't pass our
NIST SP 800-22 based random test.
- mark the 'geoip-use-ecs' option obsolete; warn when it is used
in named.conf
- prohibit 'ecs' ACL tags in named.conf; note that this is a fatal error
since simply ignoring the tags could make ACLs behave unpredictably
- re-simplify the radix and iptable code
- clean up dns_acl_match(), dns_aclelement_match(), dns_acl_allowed()
and dns_geoip_match() so they no longer take ecs options
- remove the ECS-specific unit and system test cases
- remove references to ECS from the ARM
- Replace external -DOPENSSL/-DPKCS11CRYPTO with properly AC_DEFINEd
HAVE_OPENSSL/HAVE_PKCS11
- Don't enforce the crypto provider from platform.h, just from dst_api.c
and configure scripts
The three functions has been modeled after the arc4random family of
functions, and they will always return random bytes.
The isc_random family of functions internally use these CSPRNG (if available):
1. getrandom() libc call (might be available on Linux and Solaris)
2. SYS_getrandom syscall (might be available on Linux, detected at runtime)
3. arc4random(), arc4random_buf() and arc4random_uniform() (available on BSDs and Mac OS X)
4. crypto library function:
4a. RAND_bytes in case OpenSSL
4b. pkcs_C_GenerateRandom() in case PKCS#11 library
Certain isc_buffer_*() functions might call memmove() with the second
argument (source) set to NULL and the third argument (length) set to 0.
While harmless, it triggers an ubsan warning:
runtime error: null pointer passed as argument 2, which is declared to never be null
Modify all memmove() call sites in lib/isc/include/isc/buffer.h and
lib/isc/buffer.c which may potentially use NULL as the second argument
(source) so that memmove() is only called if the third argument (length)
is non-zero.
4807. [cleanup] isc_rng_randombytes() returns a specified number of
bytes from the PRNG; this is now used instead of
calling isc_rng_random() multiple times. [RT #46230]
4768. [func] By default, memory is no longer filled with tag values
when it is allocated or freed; this improves
performance but makes debugging of certain memory
issues more difficult. "named -M fill" turns memory
filling back on. (Building "configure
--enable-developer", turns memory fill on by
default again; it can then be disabled with
"named -M nofill".) [RT #45123]
4762. [func] "update-policy local" is now restricted to updates
from local addresses. (Previously, other addresses
were allowed so long as updates were signed by the
local session key.) [RT #45492]
4724. [func] By default, BIND now uses the random number
functions provided by the crypto library (i.e.,
OpenSSL or a PKCS#11 provider) as a source of
randomness rather than /dev/random. This is
suitable for virtual machine environments
which have limited entropy pools and lack
hardware random number generators.
This can be overridden by specifying another
entropy source via the "random-device" option
in named.conf, or via the -r command line option;
however, for functions requiring full cryptographic
strength, such as DNSSEC key generation, this
cannot be overridden. In particular, the -r
command line option no longer has any effect on
dnssec-keygen.
This can be disabled by building with
"configure --disable-crypto-rand".
[RT #31459] [RT #46047]
4708. [cleanup] Legacy Windows builds (i.e. for XP and earlier)
are no longer supported. [RT #45186]
4707. [func] The lightweight resolver daemon and library (lwresd
and liblwres) have been removed. [RT #45186]
4706. [func] Code implementing name server query processing has
been moved from bin/named to a new library "libns".
Functions remaining in bin/named are now prefixed
with "named_" rather than "ns_". This will make it
easier to write unit tests for name server code, or
link name server functionality into new tools.
[RT #45186]
4579. [func] Logging channels and dnstap output files can now
be configured with a "suffix" option, set to
either "increment" or "timestamp", indicating
whether to use incrementing numbers or timestamps
as the file suffix when rolling over a log file.
[RT #42838]
4518. [func] The "print-time" option in the logging configuration
can now take arguments "local", "iso8601" or
"iso8601-utc" to indicate the format in which the
date and time should be logged. For backward
compatibility, "yes" is a synonym for "local".
[RT #42585]
4503. [cleanup] "make uninstall" now removes file installed by
BIND. (This currently excludes Python files
due to lack of support in setup.py.) [RT #42912]
4445. [cleanup] isc_errno_toresult() can now be used to call the
formerly private function isc__errno2result().
[RT #43050]
4444. [bug] Fixed some issues related to dyndb: A bug caused
braces to be omitted when passing configuration text
from named.conf to a dyndb driver, and there was a
use-after-free in the sample dyndb driver. [RT #43050]
Patch for dyndb driver submitted by Petr Spacek at Red Hat.
4411. [func] "rndc dnstap -roll" automatically rolls the
dnstap output file; the previous version is
saved with ".0" suffix, and earlier versions
with ".1" and so on. An optional numeric argument
indicates how many prior files to save. [RT #42830]
provisioning secondary servers in which a list of
zones to be served is stored in a DNS zone and can
be propagated to slaves via AXFR/IXFR. [RT #41581]
4375. [func] Add support for automatic reallocation of isc_buffer
to isc_buffer_put* functions. [RT #42394]
4235. [func] Added support in named for "dnstap", a fast method of
capturing and logging DNS traffic, and a new command
"dnstap-read" to read a dnstap log file. Use
"configure --enable-dnstap" to enable this
feature (note that this requires libprotobuf-c
and libfstrm). See the ARM for configuration details.
Thanks to Robert Edmonds of Farsight Security.
[RT #40211]
4224. [func] Added support for "dyndb", a new interface for loading
zone data from an external database, developed by
Red Hat for the FreeIPA project.
DynDB drivers fully implement the BIND database
API, and are capable of significantly better
performance and functionality than DLZ drivers,
while taking advantage of advanced database
features not available in BIND such as multi-master
replication.
Thanks to Adam Tkac and Petr Spacek of Red Hat.
[RT #35271]
4183. [cleanup] Use timing-safe memory comparisons in cryptographic
code. Also, the timing-safe comparison functions have
been renamed to avoid possible confusion with
memcmp(). [RT #40148]
3938. [func] Added quotas to be used in recursive resolvers
that are under high query load for names in zones
whose authoritative servers are nonresponsive or
are experiencing a denial of service attack.
- "fetches-per-server" limits the number of
simultaneous queries that can be sent to any
single authoritative server. The configured
value is a starting point; it is automatically
adjusted downward if the server is partially or
completely non-responsive. The algorithm used to
adjust the quota can be configured via the
"fetch-quota-params" option.
- "fetches-per-zone" limits the number of
simultaneous queries that can be sent for names
within a single domain. (Note: Unlike
"fetches-per-server", this value is not
self-tuning.)
- New stats counters have been added to count
queries spilled due to these quotas.
See the ARM for details of these options. [RT #37125]
experimental SIT option of BIND 9.10. The following
named.conf directives are avaliable: send-cookie,
cookie-secret, cookie-algorithm and nocookie-udp-size.
The following dig options are available:
+[no]cookie[=value] and +[no]badcookie. [RT #39928]
4005. [func] The buffer used for returning text from rndc
commands is now dynamically resizable, allowing
arbitrarily large amounts of text to be sent back
to the client. (Prior to this change, it was
possible for the output of "rndc tsig-list" to be
truncated.) [RT #37731]
3999. [func] "mkeys" and "nzf" files are now named after
their corresponding views, unless the view name
contains characters that would be incompatible
with use in a filename (i.e., slash, backslash,
or capital letters). If a view name does contain
these characters, the files will still be named
using a cryptographic hash of the view name.
Regardless of this, if a file using the old name
format is found to exist, it will continue to be
used. [RT #37704]
startup-notify-rate instead of serial-query-rate.
[RT #24454]
3955. [bug] Notify messages due to changes are no longer queued
behind startup notify messages. [RT #24454]
3936. [func] Added authoritative support for the EDNS Client
Subnet (ECS) option.
ACLs can now include "ecs" elements which specify
an address or network prefix; if an ECS option is
included in a DNS query, then the address encoded
in the option will be matched against "ecs" ACL
elements.
Also, if an ECS address is included in a query,
then it will be used instead of the client source
address when matching "geoip" ACL elements. This
behavior can be overridden with "geoip-use-ecs no;".
When "ecs" or "geoip" ACL elements are used to
select a view for a query, the response will include
an ECS option to indicate which client network the
answer is valid for.
(Thanks to Vincent Bernat.) [RT #36781]
No CHANGES entry was added as this commit mainly adds tests related
code.
Squashed commit of the following:
commit d3d44508daa128fb8b60f64b3a8c81f80602273d
Author: Evan Hunt <each@isc.org>
Date: Wed May 7 09:36:41 2014 -0700
[rt35904] remove private non-static names from .def file
commit dbca45661c3939f21c3bb3f405d08cfe1b35d7aa
Author: Mukund Sivaraman <muks@isc.org>
Date: Wed May 7 21:39:32 2014 +0530
Remove test for shortcut findnode()
The implementation was not included in this review branch, but the tests
erroneously made it through.
This functionality will be addressed in a different ticket (RT#35906).
commit 94ff14576ab3407f2612d34727b7eacfefc3668c
Author: Mukund Sivaraman <muks@isc.org>
Date: Wed May 7 21:36:50 2014 +0530
Minor indent fix
commit 50972f17697bb222996e433faa8224843366f9b2
Author: Evan Hunt <each@isc.org>
Date: Tue May 6 20:05:21 2014 -0700
[rt35904] style
commit 5c4d5d41fcc5bfecdeebc008896974385c841b8d
Author: Mukund Sivaraman <muks@isc.org>
Date: Sun May 4 19:19:36 2014 +0530
RBT related updates
* Add various RBT unit tests
* Add some helper methods useful in unit testing RBT code
* General cleanup
3786. [func] Provide more detailed error codes when using
native PKCS#11. "pkcs11-tokens" now fails robustly
rather than asserting when run against an HSM with
an incomplete PCKS#11 API implementation. [RT #35479]
3760. [bug] Improve SIT with native PKCS#11 and on Windows.
[RT #35433]
3759. [port] Enable delve on Windows. [RT #35441]
3758. [port] Enable export library APIs on windows. [RT #35382]
(which are similar to DNS Cookies by Donald Eastlake)
and are designed to help clients detect off path
spoofed responses and for servers to detect legitimate
clients.
SIT use a experimental EDNS option code (65001).
SIT can be enabled via --enable-developer or
--enable-sit. It is on by default in Windows.
RRL processing as been updated to know about SIT with
legitimate clients not being rate limited. [RT #35389]
3741. [func] "delve" (domain entity lookup and validation engine):
A new tool with dig-like semantics for performing DNS
lookups, with internal DNSSEC validation, using the
same resolver and validator logic as named. This
allows easy validation of DNSSEC data in environments
with untrustworthy resolvers, and assists with
troubleshooting of DNSSEC problems. (Note: not yet
available on win32.) [RT #32406]
information will be automatically updated if the
OS supports routing sockets. Use
"automatic-interface-scan no;" to disable.
Add "rndc scan" to trigger a scan. [RT #23027]
3705. [func] "configure --enable-native-pkcs11" enables BIND
to use the PKCS#11 API for all cryptographic
functions, so that it can drive a hardware service
module directly without the need to use a modified
OpenSSL as intermediary (so long as the HSM's vendor
provides a complete-enough implementation of the
PKCS#11 interface). This has been tested successfully
with the Thales nShield HSM and with SoftHSMv2 from
the OpenDNSSEC project. [RT #29031]
3700. [func] Allow access to subgroups of XML statistics via
special URLs http://<server>:<port>/xml/v3/server,
/zones, /net, /tasks, /mem, and /status. [RT #35115]
3699. [bug] Improvements to statistics channel XSL stylesheet:
the stylesheet can now be cached by the browser;
section headers are omitted from the stats display
when there is no data in those sections to be
displayed; counters are now right-justified for
easier readability. [RT #35117]
3535. [func] Add support for setting Differentiated Services Code
Point (DSCP) values in named. Most configuration
options which take a "port" option (e.g.,
listen-on, forwarders, also-notify, masters,
notify-source, etc) can now also take a "dscp"
option specifying a code point for use with
outgoing traffic, if supported by the underlying
OS. [RT #27596]
3524. [func] Added an alternate statistics channel in JSON format,
when the server is built with the json-c library:
http://[address]:[port]/json. [RT #32630]