The following compiler warning is emitted for the BACKTRACE_X86STACK
part of lib/isc/backtrace.c:
backtrace.c: In function ‘getrbp’:
backtrace.c:142:1: warning: no return statement in function returning non-void [-Wreturn-type]
While getrbp() stores the value of the RBP register in the RAX register
and thus does attempt to return a value, this is not enough for an
optimizing compiler to always produce the expected result. With -O2,
the following machine code may be generated in isc_backtrace_gettrace():
0x00007ffff7b0ff7a <+10>: mov %rbp,%rax
0x00007ffff7b0ff7d <+13>: mov $0x17,%eax
0x00007ffff7b0ff82 <+18>: retq
The above is equivalent to:
sp = (void **)getrbp();
return (ISC_R_NOTFOUND);
and results in the backtrace never getting printed.
Fix by using an intermediate variable. With this change in place, the
machine code generated with -O2 becomes something like:
0x00007ffff7af5638 <+24>: mov $0x17,%eax
0x00007ffff7af563d <+29>: mov %rbp,%rdx
0x00007ffff7af5640 <+32>: test %rdx,%rdx
0x00007ffff7af5643 <+35>: je 0x7ffff7af56bd <isc_backtrace_gettrace+157>
...
0x00007ffff7af56bd <+157>: retq
(Note that this method of grabbing a stack trace is finicky anyway
because in order for RBP to be relied upon, -fno-omit-frame-pointer must
be present among CFLAGS.)
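A minimal sketch of the intermediate-variable approach, assuming a
GNU-compatible compiler. The name sketch_getrbp() is hypothetical, and
the fallback to __builtin_frame_address() on targets other than x86-64
is an addition that keeps the example portable; it is not part of this
change.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy RBP into a local variable and return that variable explicitly,
 * so the compiler sees a real return value instead of depending on
 * whatever happens to be left in RAX.
 */
static void *
sketch_getrbp(void) {
#if defined(__x86_64__)
    void *rbp;

    __asm__ volatile("movq %%rbp, %0" : "=r"(rbp));
    return (rbp);
#else
    /* Assumption: other targets use the compiler builtin instead. */
    return (__builtin_frame_address(0));
#endif
}
```

Because the function now has an ordinary return statement, the
-Wreturn-type warning no longer applies to it.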
Some operating systems (e.g. Linux, FreeBSD) provide the
_Unwind_Backtrace() function in libgcc_s.so, which is automatically
linked into any binary using the functions provided by that library. On
OpenBSD, though, _Unwind_Backtrace() is provided by libc++abi.so, which
is not automatically linked into binaries produced by the stock system C
compiler.
Meanwhile, lib/isc/backtrace.c assumes that any GNU-compatible toolchain
allows _Unwind_Backtrace() to be used without any extra provisions in
the build system. This causes build failures on OpenBSD (and possibly
other systems).
Instead of making assumptions, actually check for _Unwind_Backtrace()
support in the toolchain if the backtrace() function is unavailable.
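For illustration, a link check of this kind could compile and link a
small program along the following lines. This is a sketch, not the
actual build-system test: the sketch_* names are hypothetical, and only
<unwind.h>, _Unwind_Backtrace() and _Unwind_GetIP() are taken from the
toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <unwind.h>

struct sketch_trace_arg {
    void **frames; /* caller-supplied array of return addresses */
    int max;       /* capacity of the array */
    int count;     /* number of entries filled in so far */
};

/* Called once per stack frame by _Unwind_Backtrace(). */
static _Unwind_Reason_Code
sketch_trace_cb(struct _Unwind_Context *ctx, void *data) {
    struct sketch_trace_arg *arg = data;

    if (arg->count >= arg->max) {
        return (_URC_END_OF_STACK); /* array is full, stop walking */
    }
    arg->frames[arg->count++] = (void *)_Unwind_GetIP(ctx);
    return (_URC_NO_REASON); /* continue with the next frame */
}

/* Returns the number of frames stored in 'frames'. */
static int
sketch_collect_trace(void **frames, int max) {
    struct sketch_trace_arg arg = { frames, max, 0 };

    assert(frames != NULL && max > 0);
    _Unwind_Backtrace(sketch_trace_cb, &arg);
    return (arg.count);
}
```

If such a program fails to link with the stock compiler and default
libraries, the toolchain cannot be assumed to provide
_Unwind_Backtrace() without extra linker flags.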
If the connection is closed while we're processing the request,
we might access the TCPDNS outerhandle after it has been reset. Check
for this condition and call the callback with an ISC_R_CANCELED result.
(cherry picked from commit c41ce8e0c9)
When a client disconnects before the connection can be accepted, named
would log a spurious message:
error: Accepting TCP connection failed: socket is not connected
We now ignore the ISC_R_NOTCONNECTED result code and log only other
errors.
(cherry picked from commit 5ef71c420f)
1. The isc__nm_tcp_send() and isc__nm_tcp_read() were not checking
whether the socket was still alive and were scheduling reads/sends on
a closed socket.
2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
changed to always return the error conditions via the callbacks, so
they always succeed. This applies to all protocols (UDP, TCP and
TCPDNS).
(cherry picked from commit f7c82e406e)
There were two problems with how tcp_send_direct() was used:
1. The tcp_send_direct() can return ISC_R_CANCELED (or translated error
from uv_tcp_send()), but the isc__nm_async_tcpsend() was neither
checking the error code nor releasing the uvreq in case of an error.
2. In isc__nm_tcp_send(), when the TCP send is already in the right
netthread, it uses tcp_send_direct() to send the TCP packet right
away. When that happened the uvreq was not freed, and the error code
was returned to the caller. We need to return ISC_R_SUCCESS and
rather use the callback to report an error in such case.
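The intended behavior can be sketched as follows. All sketch_* names
and the simplified result codes are hypothetical stand-ins for the
netmgr types; the point is only that a failed direct send is reported
through the callback, the request is released, and the caller always
sees success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { SKETCH_R_SUCCESS, SKETCH_R_CANCELED } sketch_result_t;
typedef void (*sketch_cb_t)(sketch_result_t result, void *cbarg);

typedef struct {
    sketch_cb_t cb;
    void *cbarg;
} sketch_req_t;

static int sketch_live_reqs = 0; /* requests allocated but not freed */

static sketch_req_t *
sketch_req_get(sketch_cb_t cb, void *cbarg) {
    sketch_req_t *req = malloc(sizeof(*req));

    assert(req != NULL);
    req->cb = cb;
    req->cbarg = cbarg;
    sketch_live_reqs++;
    return (req);
}

static void
sketch_req_put(sketch_req_t *req) {
    free(req);
    sketch_live_reqs--;
}

/* Stand-in for the direct send: fails once the socket is inactive. */
static sketch_result_t
sketch_send_direct(bool sock_active) {
    return (sock_active ? SKETCH_R_SUCCESS : SKETCH_R_CANCELED);
}

static sketch_result_t
sketch_send(bool sock_active, sketch_cb_t cb, void *cbarg) {
    sketch_req_t *req = sketch_req_get(cb, cbarg);
    sketch_result_t result = sketch_send_direct(sock_active);

    /*
     * Success or failure, the outcome reaches the caller only through
     * the callback, and the request is always released. (With an event
     * loop, success would be reported later from the write-completion
     * callback; it is reported immediately here to keep the sketch
     * single-threaded.)
     */
    req->cb(result, req->cbarg);
    sketch_req_put(req);
    return (SKETCH_R_SUCCESS);
}

/* Example callback: stores the result where the caller can see it. */
static void
sketch_record_result(sketch_result_t result, void *cbarg) {
    *(sketch_result_t *)cbarg = result;
}
```

Calling sketch_send(false, sketch_record_result, &seen) returns
SKETCH_R_SUCCESS, leaves SKETCH_R_CANCELED in 'seen', and leaves no
request allocated.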
(cherry picked from commit 6af08d1ca6)
When closing a socket that is actively reading from the stream, the
read_cb() could be called between uv_close() and the close callback,
after the server socket had already been detached, and would therefore
use sock->statichandle after it had been freed.
(cherry picked from commit 97b33e5bde)
There were two problems with how udp_send_direct() was used:
1. The udp_send_direct() can return ISC_R_CANCELED (or translated error
from uv_udp_send()), but the isc__nm_async_udpsend() was neither
checking the error code nor releasing the uvreq in case of an error.
2. In isc__nm_udp_send(), when the UDP send is already in the right
netthread, it uses udp_send_direct() to send the UDP packet right
away. When that happened the uvreq was not freed, and the error code
was returned to the caller. We need to return ISC_R_SUCCESS and
rather use the callback to report an error in such case.
(cherry picked from commit afca2e3b21)
When networking statistics were added to the netmgr (in commit
5234a8e00a), two lines were added that
increment the 'STATID_RECVFAIL' statistic: one if 'uv_read_start'
fails and one at the end of 'read_cb'. The latter happens
if 'nread < 0'.
According to the libuv documentation, I/O read callbacks (such as for
files and sockets) are passed a parameter 'nread'. If 'nread' is less
than 0, there was an error and 'UV_EOF' is the end of file error, which
you may want to handle differently.
In other words, we should not treat EOF as a RECVFAIL error.
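A simplified sketch of the distinction, without depending on libuv
headers. SKETCH_EOF is a hypothetical stand-in for libuv's UV_EOF, and
the stats structure and function name are likewise invented for the
example.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_EOF (-4095) /* stand-in; real code compares with UV_EOF */

typedef struct {
    unsigned int recvfail; /* stand-in for the STATID_RECVFAIL counter */
} sketch_stats_t;

/*
 * Returns true if reading should continue. A negative 'nread' ends the
 * read, but only values other than end-of-file count as failures.
 */
static bool
sketch_read_cb(long nread, sketch_stats_t *stats) {
    if (nread >= 0) {
        return (true);
    }
    if (nread != SKETCH_EOF) {
        stats->recvfail++;
    }
    return (false);
}
```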
(cherry picked from commit 6c5ff94218)
isc_nmhandle_detach() needs to complete in the same thread
as shutdown_walk_cb() to avoid a race. Clear the caller's
pointer then pass control to the worker if necessary.
WARNING: ThreadSanitizer: data race
Write of size 8 at 0x000000000001 by thread T1:
#0 isc_nmhandle_detach lib/isc/netmgr/netmgr.c:1258:15
#1 control_command bin/named/controlconf.c:388:3
#2 dispatch lib/isc/task.c:1152:7
#3 run lib/isc/task.c:1344:2
Previous read of size 8 at 0x000000000001 by thread T2:
#0 isc_nm_pauseread lib/isc/netmgr/netmgr.c:1449:33
#1 recv_data lib/isccc/ccmsg.c:109:2
#2 isc__nm_tcp_shutdown lib/isc/netmgr/tcp.c:1157:4
#3 shutdown_walk_cb lib/isc/netmgr/netmgr.c:1515:3
#4 uv_walk <null>
#5 process_queue lib/isc/netmgr/netmgr.c:659:4
#6 process_normal_queue lib/isc/netmgr/netmgr.c:582:10
#7 process_queues lib/isc/netmgr/netmgr.c:590:8
#8 async_cb lib/isc/netmgr/netmgr.c:548:2
#9 <null> <null>
(cherry picked from commit f95ba8aa20)
In set_sndbuf() we were using ISC_PLATFORM_HAVEIPV6 macro that doesn't
exist anymore, because we assume that IPv6 support is always available.
(cherry picked from commit 96ac91a18a)
If we clone the csock (child socket) in TCP accept_connection()
instead of passing the ssock (server socket) to the callback and
cloning it there, we restore the assumption that every socket is handled
inside its own worker thread, and therefore we can get rid of (at least)
callback locking.
(cherry picked from commit e8b56acb49)
The isc__nm_tcpdns_stoplistening() would call isc__nmsocket_clearcb(),
which would clear the .accept_cb from a non-netmgr thread. Change
isc__nm_tcpdns_stoplistening() to enqueue an ievent that gets processed
in the right netmgr thread to avoid locking.
(cherry picked from commit d86a74d8a4)
The SO_REUSEADDR, SO_REUSEPORT and SO_REUSEPORT_LB options have
different meanings on different platforms. In this commit, we split the
function into separate functions for setting address/port reuse and for
setting load-balancing.
The libuv library already has multiplatform support for setting
SO_REUSEADDR and SO_REUSEPORT that allows binding to the same address
and port, but unfortunately, when used after the load-balancing socket
options have been already set, it overrides the previous setting, so we
need our own helper function to enable the SO_REUSEADDR/SO_REUSEPORT
first and then enable the load-balancing socket option.
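A rough sketch of the two helpers, using plain setsockopt(). The
sketch_* names are hypothetical and the platform selection is an
assumption made for illustration (SO_REUSEPORT treated as the
load-balancing option on Linux, SO_REUSEPORT_LB where it is defined);
the real helpers may choose differently.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Allow binding to the same address and port. */
static int
sketch_socket_reuse(int fd) {
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1) {
        return (-1);
    }
#if defined(SO_REUSEPORT) && !defined(__linux__)
    /* Assumption: outside Linux this only permits shared binding. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) == -1) {
        return (-1);
    }
#endif
    return (0);
}

/* Enable kernel load-balancing between sockets bound to the same port. */
static int
sketch_socket_reuse_lb(int fd) {
    int on = 1;

#if defined(SO_REUSEPORT_LB)
    return (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT_LB, &on, sizeof(on)));
#elif defined(SO_REUSEPORT) && defined(__linux__)
    return (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)));
#else
    (void)fd;
    (void)on;
    errno = ENOPROTOOPT; /* no load-balancing option on this platform */
    return (-1);
#endif
}
```

A caller would invoke sketch_socket_reuse() first and
sketch_socket_reuse_lb() second, matching the order described above.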
(cherry picked from commit fd975a551d)
On POSIX-based systems, uv_os_sock_t and uv_os_fd_t are both typedefs
of int. That's not true on Windows, where uv_os_sock_t is SOCKET and
uv_os_fd_t is HANDLE, and they differ in level of indirection.
(cherry picked from commit acb6ad9e3c)
The isc__nm_socket_freebind() has been refactored to match other
isc__nm_socket_...() helper functions and take uv_os_fd_t and
sa_family_t as function arguments.
(cherry picked from commit 9dc01a636b)
The isc_nm_pause(), isc_nm_resume() and finishing the nm_thread() from
nm_destroy() have been refactored, so all use the netievents instead of
directly touching the worker structure members. This allows us to
remove most of the locking as the .paused and .finished members are
always accessed from the matching nm_thread.
When shutting down the nm_thread(), instead of issuing uv_stop(), we
just shut down the .async handler, so all uv_loop_t events are properly
finished first and uv_run() ends gracefully with no outstanding active
handles in the loop.
(cherry picked from commit e5ab137ba3)
If NETMGR_TRACE is defined, we now maintain a list of active sockets
in the netmgr object and a list of active handles in each socket
object; by walking the list and printing `backtrace` in a debugger
we can see where they were created, to assist in debugging of
reference counting errors.
On shutdown, if netmgr finds there are still active sockets after
waiting, isc__nm_dump_active() will be called to log the list of
active sockets and their underlying handles, along with some details
about them.
(cherry picked from commit 00e04a86c8)
if more than 10 seconds pass while we wait for netmgr events to
finish running on shutdown, something is almost certainly wrong
and we should assert and crash.
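A single-threaded sketch of such a bounded wait. The sketch_* names are
hypothetical, the polling interval and limit are parameters chosen by
the caller, and the function returns false where the real code would
assert, so that it can be exercised safely. Elapsed time is
approximated by summing the sleep intervals.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */

#include <assert.h>
#include <stdbool.h>
#include <time.h>

typedef int (*sketch_refs_fn)(void *arg);

/*
 * Poll until only one reference remains. Returns false if more than
 * 'limit_ms' milliseconds were spent sleeping while waiting.
 */
static bool
sketch_wait_for_detach(sketch_refs_fn refs, void *arg, long interval_ms,
                       long limit_ms) {
    long waited_ms = 0;

    assert(interval_ms > 0 && limit_ms >= 0);
    while (refs(arg) > 1) {
        if (waited_ms >= limit_ms) {
            return (false); /* real code would assert and crash */
        }
        struct timespec ts = { interval_ms / 1000,
                               (interval_ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
        waited_ms += interval_ms;
    }
    return (true);
}

/* Example source: reports the count, then drops one reference. */
static int
sketch_countdown(void *arg) {
    int *n = arg;

    return ((*n > 1) ? (*n)-- : *n);
}

/* Example source: reports the count without ever changing it. */
static int
sketch_peek(void *arg) {
    return (*(int *)arg);
}
```

With sketch_peek() and a count that never drops, the function gives up
once the limit is reached instead of waiting forever.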
(cherry picked from commit 2f2d60a989)
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.
A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:
- reqhandle: held while waiting for a request callback (query,
notify, update)
- sendhandle: held while waiting for a send callback
- fetchhandle: held while waiting for a recursive fetch to
complete
- updatehandle: held while waiting for an update-forwarding
task to complete
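The pattern can be sketched as below, with a bare reference counter
standing in for the real handle object. The sketch_* names are
hypothetical; reqhandle and sendhandle mirror two of the pointers
listed above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int references;
} sketch_handle_t;

/* Take a new reference and store it in '*dstp', which must be NULL. */
static void
sketch_handle_attach(sketch_handle_t *handle, sketch_handle_t **dstp) {
    assert(handle != NULL && dstp != NULL && *dstp == NULL);
    handle->references++;
    *dstp = handle;
}

/* Drop the reference held in '*handlep' and clear the pointer. */
static void
sketch_handle_detach(sketch_handle_t **handlep) {
    sketch_handle_t *handle = NULL;

    assert(handlep != NULL && *handlep != NULL);
    handle = *handlep;
    *handlep = NULL;
    assert(handle->references > 0);
    handle->references--;
}

/* One named pointer per pending asynchronous operation. */
typedef struct {
    sketch_handle_t *reqhandle;
    sketch_handle_t *sendhandle;
} sketch_client_t;
```

Because each pending operation owns its own pointer, a leaked reference
shows up as a specific non-NULL member rather than as an anonymous
count.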
(cherry picked from commit 57b4dde974)
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
indicate whether it is to be used as a client or server socket
(cherry picked from commit 7eb4564895)
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.
This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
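A sketch of the idea with simplified types. sketch_buf_t only mimics
the base/len shape of a libuv buffer, the callbacks do not use the real
libuv allocator signature, and the TCP limit of a single 2^16-byte
message is an assumption based on the description above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MSG_MAX (1u << 16)                     /* one DNS message */
#define SKETCH_RECVBUF_SIZE (20 * SKETCH_MSG_MAX)     /* whole buffer */

typedef struct {
    char *base;
    size_t len;
} sketch_buf_t;

typedef struct {
    char recvbuf[SKETCH_RECVBUF_SIZE];
} sketch_worker_t;

static sketch_worker_t sketch_worker; /* static: too large for a stack */

/* Hand out the worker's receive buffer, capped at 'limit' bytes. */
static void
sketch_alloc(sketch_worker_t *worker, size_t limit, sketch_buf_t *buf) {
    assert(limit <= sizeof(worker->recvbuf));
    buf->base = worker->recvbuf;
    buf->len = limit;
}

/* TCP reads one message at a time. */
static void
sketch_tcp_alloc_cb(sketch_worker_t *worker, sketch_buf_t *buf) {
    sketch_alloc(worker, SKETCH_MSG_MAX, buf);
}

/* UDP may receive many messages in one call and gets the full space. */
static void
sketch_udp_alloc_cb(sketch_worker_t *worker, sketch_buf_t *buf) {
    sketch_alloc(worker, SKETCH_RECVBUF_SIZE, buf);
}
```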
(cherry picked from commit 38264b6a4d)
When a new IPv6 interface/address appears, it is first in a tentative
state in which we cannot bind to it, yet it is already being reported
by the route socket. Because of that, BIND 9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting the IP_FREEBIND option
(or the equivalent option on other OSes) and then retrying the bind() call.
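The retry can be sketched with plain sockets as follows. The function
name is hypothetical, only the Linux IP_FREEBIND option is handled
(equivalents on other systems are omitted), and retrying only on
EADDRNOTAVAIL is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Try a normal bind() first; if the address is not (yet) available,
 * enable free-bind on the socket and try once more.
 */
static int
sketch_bind_freebind(int fd, const struct sockaddr *addr, socklen_t len) {
    if (bind(fd, addr, len) == 0) {
        return (0);
    }
    if (errno != EADDRNOTAVAIL) {
        return (-1); /* some other failure; free-bind would not help */
    }
#if defined(IP_FREEBIND)
    int on = 1;

    if (setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &on, sizeof(on)) == -1) {
        return (-1);
    }
    return (bind(fd, addr, len));
#else
    errno = EADDRNOTAVAIL; /* no free-bind option known to this sketch */
    return (-1);
#endif
}
```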
(cherry picked from commit a0f7d28967)
by having these functions act on netmgr handles instead of socket
objects, they can be used in callback functions outside the netmgr.
(cherry picked from commit 55896df79d)
We erroneously tried to destroy a socket after issuing
isc__nm_tcp{,dns}_close. Under some (race) circumstances we could get
nm_socket_cleanup to be called twice for the same socket, causing
access to already freed memory.
(cherry picked from commit 233f134a4f)
There's a possible race in isc__nm_tcpconnect: if the asynchronous
connect operation finishes with all the callbacks before we exit
isc__nm_tcpconnect itself, we might access already freed memory.
Fix it by creating an additional reference to the socket, which is
released at the end of isc__nm_tcpconnect.
(cherry picked from commit 896db0f419)
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.
in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().
(cherry picked from commit 23c7373d68)
isc__nm_tcpdns_send() was not asynchronous and accessed socket
internal fields in an unsafe manner, which could lead to a race
condition and subsequent crash. Fix it by moving tcpdns processing
to a proper netmgr thread.
(cherry picked from commit 591b79b597)
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process; otherwise we might destroy the
callback argument before we actually stop listening, and call
the callback on bad memory.
(cherry picked from commit 1cf65cd882)
this prevents a crash when some non-netmgr thread, such as a
recursive lookup, times out after the TCP socket is already
disconnected.
(cherry picked from commit 3704c4fff2)
this will allow recv event handlers to distinguish between cases
in which the region is NULL because of error, shutdown, or cancelation.
(cherry picked from commit 75c985c07f)
The isc_nm_cancelread() function cancels reading on a connected
socket and calls its read callback function with a 'result'
parameter of ISC_R_CANCELED.
(cherry picked from commit 5191ec8f86)
when isc_nm_destroy() is called, there's a loop that waits for
other references to be detached, pausing and unpausing the netmgr
to ensure that all the workers' events are run, followed by a
1-second sleep. this caused a delay on shutdown which will be
noticeable when netmgr is used in tools other than named itself,
so the delay has now been reduced to a hundredth of a second.
(cherry picked from commit 870204fe47)
the isc_nm_tcpconnect() function establishes a client connection via
TCP. once the connection is established, a callback function will be
called with a newly created network manager handle.
(cherry picked from commit abbb79f9d1)
A TCPDNS socket creates a handle for each complete DNS message.
Previously, when all the handles were disconnected, the socket
would be closed, but the wrapped TCP socket might still have
more to read.
Now, when a connection is established, the TCPDNS socket creates
a reference to itself by attaching itself to sock->self. This
reference isn't cleared until the connection is closed via
EOF, timeout, or server shutdown. This allows the socket to remain
open even when there are no active handles for it.
(cherry picked from commit cd79b49538)
- isc__nmhandle_get() now attaches to the sock in the nmhandle object.
the caller is responsible for dereferencing the original socket
pointer when necessary.
- tcpdns listener sockets attach sock->outer to the outer tcp listener
socket. tcpdns connected sockets attach sock->outerhandle to the handle
for the tcp connected socket.
- only listener sockets need to be attached/detached directly. connected
sockets should only be accessed and reference-counted via their
associated handles.
(cherry picked from commit 5ea26ee1f1)
there is no need for a caller to reference-count socket objects.
they need to be able to close listener sockets (i.e., those
returned by isc_nm_listen{udp,tcp,tcpdns}), and an isc_nmsocket_close()
function has been added for that. other sockets are only accessed via
handles.
(cherry picked from commit 9e740cad21)
The following reverted changes will be picked again as part of the
netmgr sync with main branch.
Revert "Merge branch '1996-confidential-issue-v9_16' into 'security-v9_16'"
This reverts commit e160b1509f, reversing
changes made to c01e643715.
Revert "Merge branch '2038-use-freebind-when-bind-fails-v9_16' into 'v9_16'"
This reverts commit 5f8ecfb918, reversing
changes made to 23021385d5.
Revert "Merge branch '1936-blackhole-fix-v9_16' into 'v9_16'"
This reverts commit f20bc90a72, reversing
changes made to 490016ebf1.
Revert "Merge branch '1938-fix-udp-race' into 'v9_16'"
This reverts commit 0a6c7ab2a9, reversing
changes made to 4ea84740e6.
Revert "Merge branch '1947-fix-tcpdns-race' into 'v9_16'"
This reverts commit 4ea84740e6, reversing
changes made to d761cd576b.