In several cases where IDNA2008 mappings do not exist whereas IDNA2003
mappings do, dig was failing to process the suplied domain name. Take a
backwards compatible approach, and convert the domain to IDNA2008 form,
and if that fails try the IDNA2003 conversion.
YAML strings should be quoted if they contain colon characters.
Since IPv6 addresses do, we now quote the query_address and
response_address strings in all YAML output.
Previously:
* applications were using isc_app as the base unit for running the
application and signal handling.
* networking was handled in the netmgr layer, which would start a
number of threads, each with a uv_loop event loop.
* task/event handling was done in the isc_task unit, which used
netmgr event loops to run the isc_event calls.
In this refactoring:
* the network manager now uses isc_loop instead of maintaining its
own worker threads and event loops.
* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
and every isc_task runs on a specific isc_loop bound to the specific
thread.
* applications have been updated as necessary to use the new API.
* new ISC_LOOP_TEST macros have been added to enable unit tests to
run isc_loop event loops. unit tests have been updated to use this
where needed.
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
completely removed; isc_timer objects are now directly created on the
isc_loop event loops.
* the isc_timer API has been simplified. the "inactive" timer type has
been removed; timers are now stopped by calling isc_timer_stop()
instead of resetting to inactive.
* isc_manager now creates a loop manager rather than a timer manager.
* modules and applications using isc_timer have been updated to use the
new API.
Clean up dns_rdatalist_tordataset() and dns_rdatalist_fromrdataset()
functions by making them return void, because they cannot fail.
Clean up other functions that subsequently cannot fail.
When DiG finishes its work with a lookup (due to success or error), it
calls the clear_current_lookup() function, which decreases the lookup's
reference count. That decrease action is the counterpart of the initial
creation of the reference counter, so this function was designed in such
a way that it should decrease the reference count only once, when there
are no more active queries in the lookup.
The way it checks whether there are any active queries is by looking
at the queries list of the lookup object - if it's NULL then there are
no active queries. But that is not always true - the cancel_lookup()
function, when canceling the queries one by one, also removes them
from the lookup's list, but in NSSEARCH mode, when the queries are
working in parallel, some of those queries can be still active. And
when their recv_done() callback gets called, it sees that the lookup
has been canceled, calls clear_current_lookup(), which decreases the
reference count every time for each query that was still active
(because ISC_LIST_HEAD(lookup->q) is NULL) and results in a reference
counting error.
Fix the issue by introducing a new "cleared" property for the lookup,
which will ensure that the clear_current_lookup() function does its
job only once per lookup.
In the NSSEARCH followup lookup, when one of the queries fails to be
set up (UDP) or connected (TCP), DiG doesn't start the next query.
This is a mistake, because in NSSEARCH mode the queries are independent
and DiG shouldn't stop the lookup process just because setting up (or
connecting to) one of the name servers returns an error code in the
`udp_ready()` or `tcp_connected()` callbacks.
Write a new `nssearch_next()` function which takes care of starting the
next query in NSSEARCH mode, so it can be used in several places without
code repetition.
Make sure that the `udp_ready()` and `tcp_connected()` functions call
`nssearch_next()` in case they won't be calling `send_udp()` and
`send_tcp()` respectively, because in that case the `send_done()`
callback, which usually does the job, won't be called.
Refactor `send_done()` to use the newly written `nssearch_next()`
function.
In the NSSEARCH followup lookup, when one of the queries fails to be
sent, DiG doesn't start the next query. This is a mistake, because in
NSSEARCH mode the queries are independent and DiG shouldn't stop the
lookup process just because sending a query to one of the name servers
returns an error code.
Restructure the `send_done()` function to unconditionally send the next
query in NSSEARCH mode, if it exists.
DiG implements different logic in the `recv_done()` callback function
when processing a failure:
1. For a timed-out query it applies the "retries" logic first, then,
when it fails, fail-overs to the next server.
2. For an EOF (end-of-file, or unexpected disconnect) error it tries to
make a single retry attempt (even if the user has requested more
retries), then, when it fails, fail-overs to the next server.
3. For other types of failures, DiG does not apply the "retries" logic,
and tries to fail-over to the next servers (again, even if the user
has requested to make retries).
Simplify the logic and apply the same logic (1) of first retries, and
then fail-over, for different types of failures in `recv_done()`.
When the `send_done()` callback function gets called with a failure
result code, DiG erroneously cancels the lookup.
Stop canceling the lookup and give DiG a chance to retry the failed
query, or fail-over to another server, using the logic implemented in
the `recv_done()` callback function.
When the `udp_ready()` callback function gets called with a failure
result code, DiG erroneously cancels the lookup.
Copy the logic behind `tcp_connected()` callback function into
`udp_ready()` so that DiG will now retry the failed query (if retries
are enabled) and then, if it fails again, it will fail-over to the next
server in the list, which synchronizes the behavior between TCP and UDP
modes.
Also, `udp_ready()` was calling `lookup_detach()` without calling
`lookup_attach()` first, but the issue was masked behind the fact
that `clear_current_lookup()` wasn't being called when needed, and
`lookup_detach()` was compensating for that. This also has been fixed.
Instead of returning error values from isc_rwlock_*(), isc_mutex_*(),
and isc_condition_*() macros/functions and subsequently carrying out
runtime assertion checks on the return values in the calling code,
trigger assertion failures directly in those macros/functions whenever
any pthread function returns an error, as there is no point in
continuing execution in such a case anyway.
In special NS search mode, after the initial lookup, dig starts the
followup lookup with discovered NS servers in the queries list. If one
of those queries then fail, dig, as usual, tries to start the next query
in the list, which results in a crash, because the NS search mode is
special in a way that the queries are running in parallel, so the next
query is usually already started.
Apply some special logic in `recv_done()` function to deal with the
described situation when handling the query result for the NS search
mode. Particularly, print a warning message for the failed query,
and do not try to start the next query in the list. Also, set a non-zero
exit code if all the queries in the followup lookup fail.
Fix an issue reported by Coverity by removing the unneded check.
*** CID 352554: Null pointer dereferences (REVERSE_INULL)
/bin/dig/dighost.c: 3056 in start_tcp()
3050
3051 if (ISC_LINK_LINKED(query, link)) {
3052 next = ISC_LIST_NEXT(query, link);
3053 } else {
3054 next = NULL;
3055 }
>>> CID 352554: Null pointer dereferences (REVERSE_INULL)
>>> Null-checking "connectquery" suggests that it may be null, but it
has already been dereferenced on all paths leading to the check.
3056 if (connectquery != NULL) {
3057 query_detach(&connectquery);
3058 }
3059 query_detach(&query);
3060 if (next == NULL) {
3061 clear_current_lookup();
There was a proposal in the late 1990s that it might, but it turned
out to be unworkable. See RFC 6891, Extension Mechanisms for
DNS (EDNS(0)), section 5, Extended Label Types.
The remnants of the code that supported this in BIND are redundant.
Previously, tasks could be created either unbound or bound to a specific
thread (worker loop). The unbound tasks would be assigned to a random
thread every time isc_task_send() was called. Because there's no logic
that would assign the task to the least busy worker, this just creates
unpredictability. Instead of random assignment, bind all the previously
unbound tasks to worker 0, which is guaranteed to exist.
This commit removes dead code from cleanup handling part of the
get_create_tls_context().
In particular, currently:
* there is no way 'found_ctx' might equal 'ctx';
* there is no way 'session_cache' might equal a non-NULL value while
cleaning up after a TLS initialisation error.
This commit ensures that isc_nm_cancelread() is not called from within
dig code for HTTP sockets, as these lack its implementation.
It does not have much sense to have it due to transactional nature of
HTTP.
Every HTTP request-response pair is represented by a virtual socket,
where read callback is called only when full DNS message is received
or when an error code is being passed there. That is, there is nothing
to cancel at the time of the call.
This commit extends TLS context cache with TLS client session cache so
that an associated session cache can be stored alongside the TLS
context within the context cache.
The dns_message_gettempname(), dns_message_gettemprdata(),
dns_message_gettemprdataset(), and dns_message_gettemprdatalist() always
succeeds because the memory allocation cannot fail now. Change the API
to return void and cleanup all the use of aforementioned functions.
3034 next = ISC_LIST_NEXT(query, link);
3035 } else {
3036 next = NULL;
3037 }
CID 352554 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking connectquery suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
3038 if (connectquery != NULL) {
3039 query_detach(&connectquery);
3040 }
- var_decl: Declaring variable "tbuf" without initializer
- assign: Assigning: "target.base" = "tbuf", which points to
uninitialized data
- assign: Assigning: "r.base" = "target.base", which points to
uninitialized data
I expect it would correctly initialize length always. Add simple
initialization to silent coverity.
There was a query_detach() call missing in dig, which could lead to
dig hanging on TLS context creation errors. This commit fixes.
The error was introduced because the Strict TLS implementation was
initially made over an older version of the code, where this extra
query_detach() call was not needed.
Man pages for dig/mdig/delv used `.. option:: +[no]bla` to describe two
options at once, and very old Sphinx does not support that [] in option
names.
Solution is to split negative and positive options into `+bla, +nobla`
form. In the end it improves readability because it transforms hard to
read strings with double brackets from
`+[no]subnet=addr[/prefix-length]` to
`+subnet=addr[/prefix-length], +nosubnet`.
As a side-effect it also allows easier linking to dig/mdig/delv options
using their name directly instead of always overriding the link target
to `+[no]bla` form.
Transformation was done using regex:
s/:: +\[no\]\(.*\)/:: +\1, +no\1
... and manual review around occurences matching regex
+no.*=
Fixes: #3301
dig previously set an exit code of 9 when a TCP connection failed
or when a UDP connection timed out, but when the server address is
localhost it's possible for a UDP query to fail with ISC_R_CONNREFUSED.
that code path didn't update the exit code, causing dig to exit with
status 0. we now set the exit code to 9 in this failure case.
In `+nssearch` mode `dig` starts the next query of the followup lookup
using `start_udp()` or `start_tcp()` calls without waiting for the
previous query to complete.
In UDP mode that happens in the `send_done()` callback of the previous
query, but in TCP mode that happens in the `start_tcp()` call of the
previous query (recursion) which doesn't work because `start_tcp()`
attaches the `lookup->current_query` to the query it is starting, so a
recursive call will result in an assertion failure.
Make the TCP mode to start the next query in `send_done()`, just like in
the UDP mode. During that time the `lookup->current_query` is already
detached by the `tcp_connected()`/`udp_ready()` callbacks.
The error code path handling the `ISC_R_CANCELED` code lacks a
`clear_current_lookup()` call, without which dig hangs indefinitely
when handling the error.
Add the missing call to account for all references of the lookup so
it can be destroyed.
In `send_udp()` and `launch_next_query()` functions, when calling
`dighost_printmessage()` to print detailed information about the
sent query, dig always prints the data of the first query in the
lookup's queries list.
The first query in the list can be already finished, having its handles
freed, and accessing this information results in assertion failure.
Print the current query's information instead.
In recv_done(), when dig decides to start the lookup's next query in
the line using `start_udp()` or `start_tcp()`, and for some reason,
no queries get started, dig doesn't cancel the lookup.
This can occur, for example, when there are two queries in the lookup,
one with a regular IP address, and another with a IPv4 mapped IPv6
address. When the regular IP address fails to serve the query, its
`recv_done()` callback starts the next query in the line (in this
case the one with a mapped IP address), but because `dig` doesn't
connect to such IP addresses, and there are no other queries in the
list, no new queries are being started, and the lookup keeps hanging.
After calling `start_udp()` or `start_tcp()` in `recv_done()`, check
if there are no pending/working queries then cancel the lookup instead
of only detaching from the current query.
The `udp_ready()` and `tcp_connected()` functions in dighost.c are
used for similar purposes for UDP and TCP respectively.
Synchronize the `udp_ready()` function entry code to behave like
`tcp_connected()` by adding input validation, debug messages and
early exit code when `cancel_now` is `true`.
When finishing the NSSEARCH task and there is no more followup
lookups to start, dig does not destroy the last lookup, which
causes it to hang indefinitely.
Rename the unused `first_pass` member of `dig_query_t` to `started`
and make it `true` in the first callback after `start_udp()` or
`start_tcp()` of the query to indicate that the query has been
started.
Create a new `check_if_queries_done()` function to check whether
all of the queries inside a lookup have been started and finished,
or canceled.
Use the mentioned function in the TRACE code block in `recv_done()`
to check whether the current query is the last one in the lookup and
cancel the lookup in that case to free the resources.
This commit adds support for Strict/Mutual TLS to dig.
The new command-line options and their behaviour are modelled after
kdig (+tls-ca, +tls-hostname, +tls-certfile, +tls-keyfile) for
compatibility reasons. That is, using +tls-* is sufficient to enable
DoT in dig, implying +tls-ca
If there is no other DNS transport specified via command-line,
specifying any of +tls-* options makes dig use DoT. In this case, its
behaviour is the same as if +tls-ca is specified: that is, the remote
peer's certificate is verified using the platform-specific
intermediate CA certificates store. This behaviour is introduced for
compatibility with kdig.
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data. This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).
Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.
Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.
Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.
In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
When encountering a TCP connection error while trying to initiate a
connection to a server, dig erroneously cancels the lookup even when
there are other server(s) to try, which results in an assertion failure.
Cancel the lookup only when there are no more queries left in the
lookup's queries list (i.e. `next` is NULL).
When timing-out or having other types of socket errors during a query,
dig isn't trying to perform the lookup using other servers which exist
in the lookup's queries list.
After configured amount of timeout retries, or after a socket error,
check if there are other queries/servers in the lookup's queries list,
and start the next one if it exists, instead of unconditionally failing.
When a query times out, and `dig` (or `host`) creates a new query
to resend the request, it is being prepended to the lookup's queries
list, which can cause a confusion later, making `dig` (or `host`)
believe that there is another new query in the list, but that is
actually the old one, which was timed out. That mistake will result
in an assertion failure.
That can happen, in particular, when after a timed out request,
the retried request returns a SERVFAIL result, and the recursion
is enabled, and `+nofail` option was used with `dig` (that is the
default behavior in `host`, unless the `-s` option is provided).
Fix the problem by inserting the query just after the current,
timed-out query, instead of prepending to the list.
Before calling start_udp() detach `l->current_query`, like it is
done in another place in the function.
Slightly update a couple of debug messages to make them more
consistent.
After a query results in a SERVFAIL result, and there is another
registered query in the lookup's queries list, `dig` starts the next
query to try another server, but for some reason, reports about that
also when the current query is in the head of the list, even if there
is no other query in the list to try.
Use the same condition for both decisions, and after starting the next
query, jump to the "detach_query" label instead of "next_lookup",
because there is no need to start the next lookup after we just started
a query in the current lookup.
The clang-format-15 has new option InsertBraces that could add missing
branches around single line statements. Use that to our advantage
without switching to not-yet-released LLVM version to add missing braces
in couple of places.