There were two problems how udp_send_direct() was used:
1. The udp_send_direct() can return ISC_R_CANCELED (or translated error
from uv_udp_send()), but the isc__nm_async_udpsend() wasn't checking
the error code and not releasing the uvreq in case of an error.
2. In isc__nm_udp_send(), when the UDP send is already in the right
netthread, it uses udp_send_direct() to send the UDP packet right
away. When that happened the uvreq was not freed, and the error code
was returned to the caller. We need to return ISC_R_SUCCESS and
rather use the callback to report an error in such case.
(cherry picked from commit afca2e3b21)
This test is very simple, two nameserver instances are created:
- ns4: master, with 'minimal-responses yes', authoritative
for example. zone
- ns5: slave, stub zone
The first thing verified is the transfer of zone data from master
to slave, which should be saved in ns5/example.db.
After that, a query is issued to ns5 asking for target.example.
TXT, a record present in the master database with the "test" string
as content.
If that query works, it means stub zone successfully request
nameserver addresses from master, ns4.example. A/AAAA
The presence of both A/AAAA records for ns4 is also verified in the
stub zone local file, ns5/example.db.
Stub zones don't make use of AXFR/IXFR for the transfering of zone
data, instead, a single query is issued to the master asking for
their nameserver records (NS).
That works fine unless master is configured with 'minimal-responses'
set to yes, in which case glue records are not provided by master
in the answer with nameservers authoritative for the zone, leaving
stub zones with incomplete databases.
This commit fix this problem in a simple way, when the answer with
the authoritative nameservers is received from master (stub_callback),
for each nameserver listed (save_nsrrset), a A and AAAA records for
the name is verified in the additional section, and if not present
a query is created to resolve the corresponsing missing glue.
A struct 'stub_cb_args' was added to keep relevant information for
performing a query, like TSIG key, udp size, dscp value, etc, this
information is borrowed from, and created within function 'ns_query',
where the resolving of nameserver from master starts.
A new field was added to the struct 'dns_stub', an atomic integer,
namely pending_requests, which is used to keep how many queries are
created when resolving nameserver addresses that were missing in
the glue.
When the value of pending_requests is zero we know we can release
resources, adjust zone timers, dump to zone file, etc.
When networking statistics was added to the netmgr (in commit
5234a8e00a), two lines were added that
increment the 'STATID_RECVFAIL' statistic: One if 'uv_read_start'
fails and one at the end of the 'read_cb'. The latter happens
if 'nread < 0'.
According to the libuv documentation, I/O read callbacks (such as for
files and sockets) are passed a parameter 'nread'. If 'nread' is less
than 0, there was an error and 'UV_EOF' is the end of file error, which
you may want to handle differently.
In other words, we should not treat EOF as a RECVFAIL error.
(cherry picked from commit 6c5ff94218)
This commit ensures that dnstap output files captured
by fstrm_capture are properly flushed before any attempt
on reading them with dnstap-read is done.
By reading fstrm-capture source code it was noticed that
signal SIGHUP is used to flush the capture file.
Add a +burst option to mdig so that we have a second to setup the
mdig calls then they run at the start of the next second.
RRL uses 'queries in a second' as a approximation to
'queries per second'. Getting the bursts of traffic to all happen in
the same second should prevent false negatives in the system test.
We now have a second to setup the traffic in. Then the traffic should
be sent at the start of the next second. If that still fails we
should move to +burst=<now+2> (further extend mdig) instead of the
implicit <now+1> as the trigger second.
(cherry picked from commit 92cdc7b6c7)
isc_nmhandle_detach() needs to complete in the same thread
as shutdown_walk_cb() to avoid a race. Clear the caller's
pointer then pass control to the worker if necessary.
WARNING: ThreadSanitizer: data race
Write of size 8 at 0x000000000001 by thread T1:
#0 isc_nmhandle_detach lib/isc/netmgr/netmgr.c:1258:15
#1 control_command bin/named/controlconf.c:388:3
#2 dispatch lib/isc/task.c:1152:7
#3 run lib/isc/task.c:1344:2
Previous read of size 8 at 0x000000000001 by thread T2:
#0 isc_nm_pauseread lib/isc/netmgr/netmgr.c:1449:33
#1 recv_data lib/isccc/ccmsg.c:109:2
#2 isc__nm_tcp_shutdown lib/isc/netmgr/tcp.c:1157:4
#3 shutdown_walk_cb lib/isc/netmgr/netmgr.c:1515:3
#4 uv_walk <null>
#5 process_queue lib/isc/netmgr/netmgr.c:659:4
#6 process_normal_queue lib/isc/netmgr/netmgr.c:582:10
#7 process_queues lib/isc/netmgr/netmgr.c:590:8
#8 async_cb lib/isc/netmgr/netmgr.c:548:2
#9 <null> <null>
(cherry picked from commit f95ba8aa20)
In set_sndbuf() we were using ISC_PLATFORM_HAVEIPV6 macro that doesn't
exist anymore, because we assume that IPv6 support is always available.
(cherry picked from commit 96ac91a18a)
If we clone the csock (children socket) in TCP accept_connection()
instead of passing the ssock (server socket) to the call back and
cloning it there we unbreak the assumption that every socket is handled
inside it's own worker thread and therefore we can get rid of (at least)
callback locking.
(cherry picked from commit e8b56acb49)
The isc__nm_tcpdns_stoplistening() would call isc__nmsocket_clearcb()
that would clear the .accept_cb from non-netmgr thread. Change the
tcpdns_stoplistening to enqueue ievent that would get processed in the
right netmgr thread to avoid locking.
(cherry picked from commit d86a74d8a4)
This was accidentally lost in the process of moving rmessage from fctx
to query. Without this dns_message_setclass() will fail.
(cherry picked from commit 1f63bb15b3)
* legacy test was just expecting default server EDNS buffer size to be 4096,
the test needed the adjustment to reset the buffer sizes back to 4096.
(cherry picked from commit 354a2e102d5b8b0a73c9bcea14a4af7091ed6e31)
* digdelv test was just expecting default server EDNS buffer size to be
4096, the test needed only slight adjustment
(cherry picked from commit f1556f8c41)
The DNS Flag Day 2020 aims to remove the IP fragmentation problem from
the UDP DNS communication. In this commit, we implement the minimal
required changes by changing the defaults for `edns-udp-size`,
`max-udp-size` and `nocookie-udp-size` to `1232` (the value picked by
DNS Flag Day 2020).
(cherry picked from commit bb990030d3)