We must ensure that the fd provided by userspace is really for a UDP
socket. If it's not we'll panic in udp_set_kernel_tunneling().
Reported by: Gert Doering <gert@greenie.muc.de>
Sponsored by: Rubicon Communications, LLC ("Netgate")
Fixes INVARIANTS build with Clang 15, which previously failed due to
set-but-not-used variable warnings.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36096
SDIO CMD53 (RW Extented) can either write to the same address (useful for FIFO)
or auto increment the destination address (to write to multiple registers).
It is more logical to have read/write_4 to use incremental mode and make other
helper function for writing to a FIFO destination especially since most FIFO
write/read will be 8bits based and not 32bits based.
Do not use b/l but _1/_4 also address comes first and then data.
This makes them closer to something like bus_space_{read,write}
We have no users in the tree.
Fixes INVARIANTS build with Clang 15, which previously failed due to
set-but-not-used variable warnings.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Currently, subr_bus.c shares logic for (a) maintaining all HW devices
(e.g. discovery/attach/detach logic) and (b) generic devctl notification
layer for devices/PMU/GEOM/interfaces/etc).
These two subsystems share really tiny interaction interface, composed of 3
notification functions. With that in mind, move devctl layer to a
separate file, establishing a clear notification interface between the
sub.c bus layer and the provider (devctl).
The primary driver of this change is netlink implementation (D36002).
The idea is to propagate device-level events to netlink as well, so all
netlink customers can subscribe to these changes.
The long-term goal is to deprecate devctl and to use netlink as the
kernel<> userland transport provided netlink gets enough traction.
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D36091
MFC after: 1 month
route_ctl.c size has grown considerably since initial introduction.
Factor out non-relevant parts:
* all rtentry logic, such as creation/destruction and accessors
goes to net/route/route_rtentry.c
* all rtable subscription logic goes to net/route/route_subscription.c
Differential Revision: https://reviews.freebsd.org/D36074
MFC after: 1 month
This change adds public KPI to work with routes using pre-created
nexthops, instead of using data from addrinfo structures. These
functions will be later used for adding/deleting kernel-originated
routes and upcoming netlink protocol.
As a part of providing this KPI, low-level route addition code has been
reworked to provide more control over route creation or change.
Specifically, a number of operation flags
(RTM_F_<CREATE|EXCL|REPLACE|APPEND>) have been added, defining the
desired behaviour the the route already exists (or not exists). This
change required some changes in the multipath addition code, resulting
in moving this code to route_ctl.c, rendering mpath_ctl.c empty.
Differential Revision: https://reviews.freebsd.org/D36073
MFC after: 1 month
This change is required for the upcoming introduction of the next
nexhop-based operations KPI, as it will create rtentry and nexthops
at different stages of route table modification.
Differential Revision: https://reviews.freebsd.org/D36072
MFC after: 2 weeks
* Use same filter func (rib_filter_f_t) for nexhtop groups to
simplify callbacks.
* simplify conditional route deletion & remove the need to pass
rt_addrinfo to the low-level deletion functions
* speedup rib_walk_del() by removing an additional per-prefix lookup
Differential Revision: https://reviews.freebsd.org/D36071
MFC after: 1 month
This and the follow-up routing-related changes target to remove or
reduce `struct rt_addrinfo` usage and use recently-landed nhop(9)
KPI instead.
Traditionally `rt_addrinfo` structure has been used to propagate all necessary
information between the protocol/rtsock and a routing layer. Many
functions inside routing subsystem uses it internally. However, using
this structure became somewhat complicated, as there are too many ways
of specifying a single state and verifying data consistency is hard.
For example, arerouting flgs consistent with mask/gateway sockaddr pointers?
Is mask really a host mask? Are sockaddr "valid" (e.g. properly zeroed, masked,
have proper length)? Are they mutable? Is the suggested interface specified
by the interface index embedded into the sockadd_dl gateway, or passed
as RTAX_IFP parameter, or directly provided by rti_ifp or it needs to
be derived from the ifa?
These (and other similar) questions have to be considered every time when
a function has `rt_addrinfo` pointer as an argument.
The new approach is to bring more control back to the protocols and
construct the desired routing objects themselves - in the end, it's the
protocol/subsystem who knows the desired outcome.
This specific diff changes the following:
* add explicit basic low-level radix operations:
add_route() (renamed from add_route_nhop())
delete_route() (factored from change_route_nhop())
change_route() (renamed from change_route_nhop)
* remove "info" parameter from change_route_conditional() as a part
of reducing rt_addrinfo usage in the internal KPIs
* add lookup_prefix_rt() wrapper for doing re-lookups after
RIB lock/unlock
Differential Revision: https://reviews.freebsd.org/D36070
MFC after: 2 weeks
This streamlines cloning of a socket from a listener. Now we do not
drop the inpcb lock during creation of a new socket, do not do useless
state transitions, and put a fully initialized socket+inpcb+tcpcb into
the listen queue.
Before this change, first we would allocate the socket and inpcb+tcpcb via
tcp_usr_attach() as TCPS_CLOSED, link them into global list of pcbs, unlock
pcb and put this onto incomplete queue (see 6f3caa6d81). Then, after
sonewconn() we would lock it again, transition into TCPS_SYN_RECEIVED,
insert into inpcb hash, finalize initialization of tcpcb. And then, in
call into tcp_do_segment() and upon transition to TCPS_ESTABLISHED call
soisconnected(). This call would lock the listening socket once again
with a LOR protection sequence and then we would relocate the socket onto
the complete queue and only now it is ready for accept(2).
Reviewed by: rrs, tuexen
Differential revision: https://reviews.freebsd.org/D36064
as alternative KPI to sonewconn(). The latter has three stages:
- check the listening socket queue limits
- allocate a new socket
- call into protocol attach method
- link the new socket into the listen queue of the listening socket
The attach method, originally designed for a creation of socket by the
socket(2) syscall has slightly different semantics than attach of a socket
cloned by listener. Make it possible for protocols to call into the
first stage, then perform a different attach, and then call into the
final stage. The first stage, that checks limits and clones a socket
is called solisten_clone(), and the function that enqueues the socket
is solisten_enqueue().
Reviewed by: tuexen
Differential revision: https://reviews.freebsd.org/D36063
Changing mode on a pin (input/output/pullup/pulldown) is a bit slow.
Improve this by caching what we can.
We need to check if the pin is in gpio mode, do that the first time
that we have a request for this pin and cache the result. We can't do
that at attach as we are a child of rk_pinctrl and it didn't finished
its attach then.
Cache also the flags specific to the pinctrl (pullup or pulldown) if the
pin is in input mode.
Cache the registers that deals with input/output mode and output value. Also
remove some register reads when we change the direction of a pin or when we
change the output value since the bit changed in the registers only affect output
pins.
Define PAGE_SIZE and PAGE_MASK based on PAGE_SHIFT. With this we only
need to set one value to change one value to change the page size.
While here remove the unused PAGE_MASK_* macros.
Sponsored by: The FreeBSD Foundation
Fixes INVARIANTS build with Clang 15, which previously failed due to
set-but-not-used variable warnings.
Reviewed by: dim
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36097
Imagine we are in SYN-RCVD state and two ACKs arrive at the same time,
both valid, e.g. coming from the same host and with valid sequence.
First packet would locate the listening socket in the inpcb database,
write-lock it and start expanding the syncache entry into a socket.
Meanwhile second packet would wait on the write lock of the listening
socket. First packet will create a new ESTABLISHED socket, free the
syncache entry and unlock the listening socket. Second packet would
call into syncache_expand(), but this time it will fail as there
is no syncache entry. Second packet would generate RST, effectively
resetting the remote connection.
It seems to me, that it is impossible to solve this problem with
just rearranging locks, as the race happens at a wire level.
To solve the problem, for an ACK packet arrived on a listening socket,
that failed syncache lookup, perform a second non-wildcard lookup right
away. That lookup may find the new born socket. Otherwise, we indeed
send RST.
Tested by: kp
Reviewed by: tuexen, rrs
PR: 265154
Differential revision: https://reviews.freebsd.org/D36066
destinations.
The current assumption is that kernel-handled rtadv prefixes along with
the interface address prefixes are the only prefixes considered in
the ND neighbor eligibility code.
Change this by allowing any non-gatewaye routes to be eligible. This
will allow DHCPv6-controlled routes to be correctly handled by
the ND code.
Refactor nd6_is_new_addr_neighbor() to enable more deterministic
performance in "found" case and remove non-needed
V_rt_add_addr_allfibs handling logic.
Reviewed By: kbowling
Differential Revision: https://reviews.freebsd.org/D23695
MFC after: 1 month
Node names for gpio bank were made generic in Linux 5.16 so stop
using them to map the gpio controller to the pin controller bank unit.
Sponsored by: Beckhoff Automation GmbH & Co. KG
Put the return value of ip_output()/ip6_output in the output event
instead of adding another one in case of an error. This improves
consistency with other similar places.
Reviewed by: rscheff
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D36085
Here, the provider is responsible for updating the trapframe to redirect
control flow and for computing the return address. Once software-saved
registers are restored, the emulation shifts the remaining context down
on the stack to make space for the return address, then copies the
address provided by the invop handler. dtrace_invop() is modified to
allocate temporary storage space on the stack for use by the provider to
return the return address.
This is to support a new provider for amd64 which can instrument
arbitrary instructions, not just function entry and exit instructions as
FBT does.
In collaboration with: christos
Sponsored by: Google, Inc. (GSoC 2022)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
dtrace invop handlers have access to the whole trapframe, just use that
to extract %rax/%eax for return probes instead of relying on an
additional parameter to the handler. No functional change intended.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
As in vm_fault_cow(), it's possible, albeit rare, for multiple vm_maps
to share a shadow object. When copying a page from a backing object
into the shadow, all mappings of the source page must therefore be
removed. Otherwise, future operations on the object tree may detect
that the source page is fully shadowed and thus can be freed.
Approved by: so
Security: FreeBSD-SA-22:11.vm
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35635
Resulting sbuf_len() from proc_getargv() might return 0 if user mangled
ps_strings enough. Also, sbuf_len() API contract is to return -1 if the
buffer overflowed. The later should not occur because get_ps_strings()
checks for catenated length, but check for this subtle detail explicitly
as well to be more resilent.
The end result is that p_comm is used in this situations.
Approved by: so
Security: FreeBSD-SA-22:09.elf
Reported by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed by: delphij, markj
admbugs: 988
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35391
This driver supports the auto negotiation mode between the copper and fiber
ports.
This PHY has two independent PHYs (one for copper and other for fiber) but in
this case the functionality is presented as a single PHY for easy management.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Other arches like powerpc* needs it.
Fixes: d387a1b4b1 ("linuxkpi: io.h: Do not include asm/set_memory.h for armv6 and armv7")
Fixes: 789dbdbb48 ("linuxkpi: Add arch_io_{reserve,free}_memtype_wc")
Sponsored by: Beckhoff Automation GmbH & Co. KG
Remove a few more remnants from the old pre-KTLS support and instead
assume that each work request sends a single TLS record.
Sponsored by: Chelsio Communications
If an adapter advertises support for TLS keys but an empty TLS key
storage area in on-board memory, fail the request rather than invoking
vmem_alloc on an uninitialized vmem.
Sponsored by: Chelsio Communications
The recent commit to fix a gzip header extra field processing bug
introduced the new bug fixed here.
(cherry picked from zlib commit 1eb7682f845ac9e9bf9ae35bbfb3bad5dacbd91d)