Commit graph

151188 commits

Author SHA1 Message Date
Warner Losh
175b2c00a6 Fix bnxt build in LINT
LINT includes bnxt_re driver. Adjust the path in files, add missing
files and add a new BNXT_C to build (which thinly wraps OFED version
with bnxt specific stuff).

Sponsored by:		Netflix
Fixes: acd884dec9 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
2024-05-29 09:49:53 -06:00
Son Phan Trung
6d849754b9 linux: implement PR_CHILD_SET_SUBREAPER
Reviewed by: imp, dchagin
Pull Request: https://github.com/freebsd/freebsd-src/pull/1260
2024-05-29 07:56:23 -06:00
Mariusz Zaborski
bb421be6c1 libutil: move ftime to libutil
It seems that there are still some applications that use ftime(3)
(for example, science/siconos and sysutils/lcdproc). The issue
is that we don't build libcompat as a shared library anymore.
The easiest solution is to move it to libutil, until we
deprecate it for good.

This solution was proposed by kib@ in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257789.

PR:		257789
MFC after:	1 month
Reviewed by:	kib (ages ago)
Differential Revision:	https://reviews.freebsd.org/D39994
2024-05-29 14:36:09 +02:00
Kristof Provost
6ee3e37682 pf: fix incorrect anchor_call to userspace
777a4702c changed how we copy out the anchor_call string, and
incorrectly limited it to 8 (4 on 32-bit systems) bytes. Fix that so we
get the full anchor path, rather than just the first few characters.

PR:		279225
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2024-05-28 22:27:22 +02:00
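A truncation to 8 bytes on 64-bit systems and 4 bytes on 32-bit systems, as described in 6ee3e37682 above, is typical of a copy whose length comes from the size of a pointer rather than the size of the buffer. The following is a minimal user-space sketch of that general pattern, not the actual pf code; `ANCHOR_PATH_MAX`, `copy_anchor_buggy()` and `copy_anchor_fixed()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ANCHOR_PATH_MAX 64	/* hypothetical buffer size for this sketch */

/*
 * Buggy pattern: dst is a pointer here, so sizeof(dst) is the pointer
 * size (8 on 64-bit, 4 on 32-bit), not the size of the caller's buffer.
 */
size_t
copy_anchor_buggy(char *dst, const char *src)
{
	size_t n = sizeof(dst);

	strncpy(dst, src, n - 1);
	dst[n - 1] = '\0';
	return (strlen(dst));
}

/* Fixed pattern: the caller passes the real buffer length explicitly. */
size_t
copy_anchor_fixed(char *dst, size_t dstlen, const char *src)
{
	assert(dstlen > 0);
	strncpy(dst, src, dstlen - 1);
	dst[dstlen - 1] = '\0';
	return (strlen(dst));
}
```

Calling `copy_anchor_fixed(buf, sizeof(buf), path)` from the function that owns the `buf` array yields the full path, while the buggy variant keeps only the first few characters.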
Gleb Smirnoff
2780e5f43d linux: allow RTM_GETADDR without full ifaddrmsg argument
Even modern glibc uses a truncated argument for RTM_GETADDR when it wants to
list all addresses in a system.  See
sysdeps/unix/sysv/linux/ifaddrs.c:__netlink_sendreq().  It sends a one
char payload.  The Linux kernel allows that as long as the given socket is not
marked as 'strict'.  We have a similar flag in the general netlink code
and it is checked in
sys/netlink/netlink_message_parser.h:nl_parse_header().  If the flag is
not present, the parser will allocate a temporary zeroed buffer to make the
message correct.  The checks added in b977dd1ea5 blocked such messages
before they reached the parser.  My reading of glibc says that there are two
types of messages that are sent with __netlink_sendreq() - RTM_GETLINK and
RTM_GETADDR.  RTM_GETLINK is binary compatible between Linux and
FreeBSD and thus doesn't need any ABI handler.

PR:		279012
Fixes:		b977dd1ea5
2024-05-28 13:13:08 -07:00
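The non-strict handling described in 2780e5f43d above (a short payload is accepted and the missing bytes are treated as zero) can be sketched in user space as follows. This is an illustration under stated assumptions, not the code in nl_parse_header(): `struct demo_hdr` is a hypothetical stand-in for the fixed-size request header, and `demo_parse_hdr()` is an invented helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size header standing in for the real request header. */
struct demo_hdr {
	unsigned char	family;
	unsigned char	prefixlen;
	unsigned char	flags;
	unsigned char	scope;
	unsigned int	index;
};

/*
 * Parse a possibly truncated payload.  In strict mode a short payload
 * is rejected; otherwise the missing bytes are zero-filled so the rest
 * of the parser sees a complete header.  Returns true on success.
 */
bool
demo_parse_hdr(const void *payload, size_t len, bool strict,
    struct demo_hdr *out)
{
	assert(payload != NULL && out != NULL);
	if (len < sizeof(*out)) {
		if (strict)
			return (false);
		memset(out, 0, sizeof(*out));	/* zeroed scratch header */
		memcpy(out, payload, len);	/* keep what was sent */
		return (true);
	}
	memcpy(out, payload, sizeof(*out));
	return (true);
}
```

With a one-byte payload, the non-strict call succeeds with only the first field set, and the strict call fails.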
Mark Johnston
c867ba7288 bnxt: Do not compile on 32-bit platforms
The new bnxt_re driver doesn't compile on any of them (it uses writeq()
from the LinuxKPI, which isn't implemented there), and had already been
disconnected from the build on i386.

Reported by:	Jenkins
Fixes:	acd884dec9 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
2024-05-28 09:12:52 -04:00
Mark Johnston
bbe42332e5 bnxt_re: Explicitly cast pointer-to-integer conversions
Reported by:	Jenkins
Fixes:	acd884dec9 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
2024-05-28 09:12:42 -04:00
Mark Johnston
bf56e8b9c8 bnxt: Add a module makefile to fix the build
Fixes:	35b53f8c98 ("bnxt_en: Add PFC, ETS & App TLVs protocols support")
2024-05-28 08:02:19 -04:00
Chandrakanth patil
faeff3b851 bnxt_{en/re}: Update bnxt_en and bnxt_re Makefile
Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45202
2024-05-28 10:36:11 +00:00
Sumit Saxena
acd884dec9 RDMA/bnxt_re: Add bnxt_re RoCE driver
This patch introduces the RoCE driver for the
Broadcom NetXtreme-E 10/25/50/100/200G RoCE HCAs.

The RoCE driver is a two part driver that relies
on the bnxt_en NIC driver to operate. The changes
needed in the bnxt_en driver are included through
another patch "L2-RoCE driver communication interface"
in this set.

Presently, there is no user space support, so the
krping kernel module is recommended for
testing. User space support will be incorporated in
subsequent patch submissions.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45011
2024-05-28 10:36:11 +00:00
Chandrakanth patil
862af86f4b bnxt_en: Driver version update to 230.0.133.0
Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45010
2024-05-28 10:36:11 +00:00
Chandrakanth patil
3d8bbe0011 bnxt_en: Firmware header version update to 1.10.3.42
This file is automatically generated from the firmware code to
export the driver interfaces.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45009
2024-05-28 10:36:11 +00:00
Chandrakanth patil
c9965974a5 bnxt_en: Firmware error recovery support
Implement firmware error recovery support for Thor adapters.
This entails enabling the capability for the firmware to initiate
error recovery. Specifically, the firmware will send the reset notify
asynchronous event to notify the driver of an error and impending reset.
Subsequently, the driver will queue a task to execute the following steps.

1. Deactivate the allocated resources.
2. Await completion of the firmware's recovery process.
3. Configure the resources and reactivate the network interface.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45008
2024-05-28 10:36:11 +00:00
Chandrakanth patil
032899b59c bnxt_en: Added support for priority queues extended stats
The following priority queue extended stats are exposed via sysctl:

tx_bytes_pri{0-7}
rx_bytes_pri{0-7}
tx_packets_pri{0-7}
rx_packets_pri{0-7}

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45007
2024-05-28 10:36:11 +00:00
Chandrakanth patil
050d28e13c bnxt_en: L2-RoCE driver communication interface
- Added Aux bus support for RoCE.
- Implemented the ulp ops that are required by RoCE driver.
- Restructure context memory data structures
- DBR pacing support

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45006
2024-05-28 10:36:10 +00:00
Chandrakanth patil
35b53f8c98 bnxt_en: Add PFC, ETS & App TLVs protocols support
Created new directory "bnxt_en" in /dev/bnxt and /modules/bnxt
and moved source files and Makefile into respective directory.

ETS support:

   - Added new files bnxt_dcb.c & bnxt_dcb.h
   - Added sysctl node 'dcb' and created handlers 'ets' and
     'dcbx_cap'
   - Add logic to validate user input and configure ETS in
     the firmware
   - Updated makefile to include bnxt_dcb.c & bnxt_dcb.h

PFC support:

   - Created sysctl handlers 'pfc' under node 'dcb'
   - Added logic to validate user input and configure PFC in
     the firmware.

App TLV support:

   - Created 3 new sysctl handlers under node 'dcb'
       - set_apptlv (write only): Sets a specified TLV
       - del_apptlv (write only): Deletes a specified TLV
       - list_apptlv (read only): Lists all APP TLVs configured
   - Added logic to validate user input and configure APP TLVs
     in the firmware.

Added the following DCB ops for the management interface:

   - Set PFC, Get PFC, Set ETS, Get ETS, Add App_TLV, Del App_TLV,
     List App_TLV

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D45005
2024-05-28 10:15:29 +00:00
Keith Reynolds
1c45a62a2f qlnxe: Fix multiple locking issues
Multiple issues are reported with WITNESS and code inspection of the
locking and lock initialization.

PR:		278084
MFC after:	1 week
2024-05-27 23:41:05 -07:00
Kevin Bowling
fb78e20b4e Revert "qlnxe: Fix multiple locking issues"
This commit is missing a file, revert so I can do it correctly,
atomically.

This reverts commit 29684d08fa.
2024-05-27 23:39:23 -07:00
Keith Reynolds
29684d08fa qlnxe: Fix multiple locking issues
Multiple issues are reported with WITNESS and code inspection of the
locking and lock initialization.

PR:		278084
MFC after:	1 week
2024-05-27 23:13:10 -07:00
Keith Reynolds
e3ec564ecb qlnxe: Fix promiscuous and allmulti settings
PR:		278087
MFC after:	1 week
2024-05-27 22:57:44 -07:00
Fuqian Huang
9370f49ad1 qlnx: qlnxe: Fix kernel address leakage
In function qlnx_rdma_deregister_if,
the address of object rdma_if will be printed out.
rdma_if is the address of a global variable qlnxr_drv,
which is passed from dev/qlnx/qlnxr/qlnxr_os.c
A kernel address leakage happens.
Fix this by removing the printf statement.

PR:		238646
MFC after:	1 week
2024-05-27 22:45:52 -07:00
Fuqian Huang
ae38977758 qlxge: replace device_printf with QL_DPRINT2
QL_DPRINT2 checks the debug level first before printing.
Replace device_printf with QL_DPRINT2 to check debug level
first before printing out the kernel pointers.

PR:		238656
MFC after:	1 week
2024-05-27 22:40:12 -07:00
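The idea behind ae38977758 above, printing pointers only when a debug level asks for it, can be sketched with a gated print macro. This is an illustration only: `DEMO_DPRINT2`, `demo_dbg_level` and `demo_report()` are hypothetical, and the sketch assumes a simple threshold test, which may differ from how the driver's real QL_DPRINT2 macro checks its level.

```c
#include <stdio.h>

static int demo_dbg_level = 0;	/* hypothetical tunable, off by default */
static int demo_dbg_prints = 0;	/* counts messages actually emitted */

/* Print only when the debug level is at least 2 (assumed threshold). */
#define DEMO_DPRINT2(...) do {				\
	if (demo_dbg_level >= 2) {			\
		demo_dbg_prints++;			\
		printf(__VA_ARGS__);			\
	}						\
} while (0)

/* Reports a pointer value, but only to users who enabled debugging. */
int
demo_report(void *p)
{
	DEMO_DPRINT2("object at %p\n", p);
	return (demo_dbg_prints);
}
```

With the default level nothing is printed, so the address does not reach the log unless debugging was deliberately turned on.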
Fuqian Huang
3d6c7ee87e qlxgbe: fix debug prints in ql_os.c
QL_DPRINT2 checks the debug level first and then prints.
Replace device_printf with QL_DPRINT2 to check debug level
first before printing out the kernel pointers.

PR:		238655
MFC after:	1 week
2024-05-27 22:32:16 -07:00
Fuqian
a58b4ee025 qlxgbe: Remove pointer printing in ql_ioctl.c
PR:		238653
MFC after:	1 week
2024-05-27 22:18:52 -07:00
Zhenlei Huang
2439ae9483 mlx4, mlx5: Eliminate redundant NULL check for packet filter
mlx4 and mlx5 are Ethernet devices and ether_ifattach() does an
unconditional bpfattach(). From commit 16d878cc99 [1] and on, we
should not check ifp->if_bpf to tell us whether or not we have any bpf
peers that might be interested in receiving packets. And since commit
2b9600b449 [2], ifp->if_bpf can not be NULL even after the network
interface has been detached.

No functional change intended.

1. 16d878cc99 Fix the following bpf(4) race condition which can result in a panic
2. 2b9600b449 Add dead_bpf_if structure, that should be used as fake bpf_if during ifnet detach

Reviewed by:	kp, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D45196
2024-05-28 12:46:04 +08:00
Rick Macklem
6c9170e0af svc.c: Check for a non-NULL xp_socket
Commit a16ff32f04 added support to the kernel RPC to set
TCP_USE_DDP.
However, for the unusual case of a NFSv4.1/4.2 non-NULL callback,
the xp_socket field of SVCXPRT is NULL, since it uses the same
socket as the client->server connection.

This patch adds the check for this to avoid crashes.

This only affects NFSv4.1/4.2 mounts where either pNFS or
delegations are in use.

MFC after:	3 days
2024-05-27 19:22:04 -07:00
Mitchell Horne
deab57178f Adjust comments referencing vm_mem_init()
I cannot find a time where the function was not named this.

Reviewed by:	kib, markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45383
2024-05-27 18:37:40 -03:00
Bojan Novković
4c053c17f2 zfs: Update use of UMA-related symbols in arc_available_memory
da76d34 repurposed the use of UMA_MD_SMALL_ALLOC in a way that breaks
arc_available_memory on -CURRENT. This change ensures that
arc_available_memory uses the new symbol while maintaining compatibility
with older FreeBSD releases. This change was submitted to upstream
as well.

Approved by:	markj (mentor)
Fixes:	da76d34
2024-05-27 15:47:17 +02:00
Ryan Libby
9c975a0d90 pbuf_ctor(): Stop using LK_NOWAIT, use LK_NOWITNESS
The LK_NOWAIT was added to suppress a witness warning, but LK_NOWITNESS
is more what we mean.  This makes pbuf_ctor() more consistent with
buf_alloc(), although, unlike buf_alloc(), for pbuf there should not be
any danger of a wild locker relying on the type stability of the buf to
attempt a lock.  That is, this is essentially cosmetic.

Relevant history:
 - 531f8cfea0 Use dedicated lock name for pbufs
 - 5875b94c74 buf_alloc(): lock the buffer with LK_NOWAIT
 - c9e023541a pbuf_ctor(): lock the buffer with LK_NOWAIT
 - 1fb00c8f10 buf_alloc(): Stop using LK_NOWAIT, use LK_NOWITNESS

Reviewed by:	rew, kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D45360
2024-05-26 10:20:52 -07:00
Ryan Libby
6bd3f23a2a tmpfs_node_init: use MTX_NEW on lock from uninitialized memory
Reported by:	netchild
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D45364
2024-05-26 10:20:52 -07:00
Rick Macklem
c68db4608e Revert "nfscl: Do not do readahead for directories"
The PR reported hangs that were avoided when this commit was
reverted.  Since it was only a cleanup, revert it.
The LORs in the PR need further investigation, since I think
readahead only hides the problem.

PR:	279138
This reverts commit fbe965591f.
2024-05-26 08:02:30 -07:00
Bojan Novković
d25ed65043 uma: Fix improper uses of UMA_MD_SMALL_ALLOC
UMA_MD_SMALL_ALLOC was recently replaced by UMA_USE_DMAP, but
da76d349b6 missed some improper uses of the old symbol.
This change makes sure that UMA_USE_DMAP is used properly in
code that selects uma_small_alloc.

Fixes: da76d349b6
Reported by: eduardo, rlibby
Approved by: markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D45368
2024-05-26 07:27:37 +02:00
Michael Tuexen
df9de82f54 tcp: fix sending RST after second inp lookup
When we first find an inp, we also set the tp. If a second
lookup is then necessary, the inp is recomputed. If this fails, the
tp is not cleared, which resulted in a failing KASSERT.
Therefore, clear the tp when starting the inp lookup procedure.
Reported by:	Jenkins
Fixes:		02d15215ce ("tcp: improve blackhole support")
MFC after:	1 week
Sponsored by:	Netflix, Inc.
2024-05-25 19:58:48 +02:00
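The pattern behind df9de82f54 above is to reset every output of a lookup at its start, so that a failed second lookup cannot leave a stale pointer from the first. A minimal sketch follows, assuming hypothetical stand-in types (`demo_conn`, `demo_lookup`) rather than the real inpcb and tcpcb structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one connection and its TCP state. */
struct demo_conn {
	int	port;
	int	tcp_state;
};

struct demo_lookup {
	struct demo_conn *inp;	/* connection found, or NULL */
	int		 *tp;	/* its TCP state, or NULL */
};

/*
 * Look up a connection by port.  Both outputs are cleared first, so a
 * miss always leaves inp and tp consistent (both NULL).
 */
void
demo_lookup_port(struct demo_conn *tbl, size_t n, int port,
    struct demo_lookup *res)
{
	assert(res != NULL);
	res->inp = NULL;
	res->tp = NULL;
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].port == port) {
			res->inp = &tbl[i];
			res->tp = &tbl[i].tcp_state;
			return;
		}
	}
}
```

Reusing the same `demo_lookup` for a second, failing lookup leaves both fields NULL, which is the invariant a later assertion would rely on.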
Bojan Novković
0a44b8a56d vm: Simplify startup page dumping conditional
This commit introduces the MINIDUMP_STARTUP_PAGE_TRACKING symbol and
uses it to simplify several instances of a complex preprocessor conditional
for adding pages allocated when bootstrapping the kernel to minidumps.

Reviewed by:	markj, mhorne
Approved by:	markj (mentor)
Differential Revision: https://reviews.freebsd.org/D45085
2024-05-25 19:24:55 +02:00
Bojan Novković
da76d349b6 uma: Deduplicate uma_small_alloc
This commit refactors the UMA small alloc code and
removes most UMA machine-dependent code.
The existing machine-dependent uma_small_alloc code is almost identical
across all architectures, except for powerpc where using the direct
map addresses involved extra steps in some cases.

The MI/MD split was replaced by a default uma_small_alloc
implementation that can be overridden by architecture-specific code by
defining the UMA_MD_SMALL_ALLOC symbol. Furthermore, UMA_USE_DMAP was
introduced to replace most UMA_MD_SMALL_ALLOC uses.

Reviewed by: markj, kib
Approved by: markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D45084
2024-05-25 19:24:46 +02:00
Ed Maste
9b1de7e484 vt/sc: retire logic to select vt(4) by default for UEFI boot
We previously defaulted to using sc(4) with a special case to prefer
vt(4) when booted via UEFI.  As vt(4) is now always the default we can
simplify this.

Reviewed by:	imp, kevans
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45356
2024-05-25 11:00:35 -04:00
Michael Tuexen
02d15215ce tcp: improve blackhole support
There are two improvements to the TCP blackhole support:
(1) If net.inet.tcp.blackhole is set to 2, also send no RST whenever
    a segment is received on an existing closed socket or if there is
    a port mismatch when using UDP encapsulation.
(2) If net.inet.tcp.blackhole is set to 3, no RST segment is sent in
    response to incoming segments on closed sockets or in response to
    unexpected segments on listening sockets.
Thanks to gallatin@ for suggesting such an improvement.

Reviewed by:		gallatin
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D45304
2024-05-24 06:59:13 +02:00
Konstantin Belousov
40d951bc59 x86/iommu: extract useful utilities into x86_iommu.c
related to the page tables page allocation and mapping.

Sponsored by:	The FreeBSD Foundation
Sponsored by:	Advanced Micro Devices (AMD)
MFC after:	1 week
2024-05-25 08:32:01 +03:00
cnbatch
ff92493a4f netlink: Fix C++ compile errors
Allow these files to be included in C++ programs with careful casting to
the proper type, like C++ wants (and in a way that also works for C).

MFC After: 1 week
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1245
2024-05-24 22:31:42 -06:00
Lexi Winter
bfd248f59d sys/amd64/conf/LINT-NOINET{6,}: don't set WITHOUT_INET{6,}_SUPPORT
Previously, it was necessary to set WITHOUT_INET_SUPPORT when building
the kernel without INET, and WITHOUT_INET6_SUPPORT when building the
kernel without INET6, or else the modules build would fail.  The
LINT-NOINET and LINT-NOINET6 configs did this using makeoptions.

After recent changes, this is no longer required, so remove these
makeoptions.  This avoids masking potential future build issues when
these aren't set.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1255
2024-05-24 22:21:25 -06:00
Lexi Winter
0e2ce86627 ipfw: don't build the module if INET not in kernel
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1255
2024-05-24 22:21:24 -06:00
Ryan Libby
a216e311a7 vm_pageout_scan_inactive: take a lock break
In vm_pageout_scan_inactive, release the object lock when we go to
refill the scan batch queue so that someone else has a chance to acquire
it.  This improves access latency to the object when the pagedaemon is
processing many consecutive pages from a single object, and also in any
case avoids a hiccup during refill for the last touched object.

Reviewed by:	alc, markj (previous version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D45288
2024-05-24 08:52:58 -07:00
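The lock break described in a216e311a7 above can be modelled with a toy lock that counts acquisitions. This is a rough user-space sketch, not the pagedaemon code: `demo_lock`, `demo_scan()` and the batch handling are hypothetical simplifications in which all items belong to a single object.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that records how often it was taken. */
struct demo_lock {
	bool	held;
	int	acquisitions;
};

static void
demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = true;
	l->acquisitions++;
}

static void
demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Process nitems items of one object in batches.  The object lock is
 * released before each batch refill, giving other threads a window to
 * acquire it.  Returns the number of items processed.
 */
int
demo_scan(struct demo_lock *obj_lock, int nitems, int batch)
{
	int done = 0;

	assert(batch > 0);
	while (done < nitems) {
		/* Refill the batch with the object lock not held. */
		int n = nitems - done < batch ? nitems - done : batch;

		demo_lock_acquire(obj_lock);
		for (int i = 0; i < n; i++)
			done++;			/* process one item */
		demo_lock_release(obj_lock);	/* lock break */
	}
	return (done);
}
```

Scanning 10 items with a batch size of 4 takes the lock three times instead of holding it once for the whole run, which is the latency trade the commit message describes.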
Warner Losh
d09ee08f10 nvme: Count number of alignment splits
When possible, we split up I/Os to NVMe drives that advertise a
preferred alignment. Add a counter for this.

Sponsored by:		Netflix
Reviewed by:		chuck, mav
Differential Revision:	https://reviews.freebsd.org/D45311
2024-05-24 08:32:47 -06:00
Warner Losh
99c14fb99f cam: Drop periph lock when completing I/O with ENOMEM status
When biofinish calls g_io_deliver with an error of ENOMEM, that kicks
off the slowdown protocol, forcing I/O to go through g_down rather than
be directly dispatched. One of the side effects is that the I/O is
resubmitted, so the start routines get called recursively, leading to a
recursive lock panic. Rather than make the periph lock recursive, drop
and reacquire the lock around such calls to biofinish.

For nda, this happens only when we can't allocate space to construct a
TRIM. For ada and da, this is only for certain ZONE operations.

Sponsored by:		Netflix
Reviewed by:		gallatin
Differential Revision:	https://reviews.freebsd.org/D45310
2024-05-24 08:32:04 -06:00
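The approach in 99c14fb99f above, dropping a non-recursive lock around a completion call that may re-enter the start routine, can be sketched as follows. This is a toy model, not the CAM code: `demo_periph`, `demo_start()` and `demo_complete_enomem()` are hypothetical, and the assertion in `demo_lock()` stands in for the recursive lock panic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical peripheral with a non-recursive lock. */
struct demo_periph {
	bool	locked;
	int	retries_left;	/* simulated ENOMEM completions */
	int	starts;		/* times the start routine ran */
};

static void
demo_lock(struct demo_periph *p)
{
	assert(!p->locked);	/* recursing here models the panic */
	p->locked = true;
}

static void
demo_unlock(struct demo_periph *p)
{
	assert(p->locked);
	p->locked = false;
}

void demo_start(struct demo_periph *p);

/* Stand-in for the completion path: the I/O is resubmitted, which
 * re-enters the start routine under the peripheral lock. */
static void
demo_complete_enomem(struct demo_periph *p)
{
	demo_lock(p);
	demo_start(p);
	demo_unlock(p);
}

/* Start routine, called with the lock held. */
void
demo_start(struct demo_periph *p)
{
	assert(p->locked);
	p->starts++;
	if (p->retries_left > 0) {
		p->retries_left--;
		demo_unlock(p);			/* drop before completing */
		demo_complete_enomem(p);
		demo_lock(p);			/* reacquire afterwards */
	}
}
```

Removing the `demo_unlock()`/`demo_lock()` pair around the completion call makes the nested `demo_lock()` assertion fire, which mirrors the failure the commit avoids without making the lock recursive.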
Warner Losh
6d83b38186 geom_io: Shift to pause_sbt to eliminate bogus min and update comment.
Update to eliminate bogus min to ensure 0 was never passed to
pause. Instead, request 1ms with an 'infinite' precision, which
defaults to whatever the underlying time counter can do. This should
ensure we run fairly quickly to start processing done events, while
still giving a small pause for the system to catch its breath. This rate
limiter still is less than ideal, and this commit doesn't change
that. It should really have no functional change: it just uses a better
interface to express the desired sleep.

Sponsored by:		Netflix
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D45316
2024-05-24 08:31:55 -06:00
Warner Losh
32f40fc983 geom: Add counts for enomem and pausing
Add counts for the number of requests that complete with the ENOMEM as
kern.geom.nomem_count and the number of times we pause the g_down thread
to let the system recover as kern.geom.pause_count.

Sponsored by:		Netflix
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D45309
2024-05-24 08:31:15 -06:00
Mitchell Horne
1d3c23676d arm64, riscv: remove unused declaration
It is inherited from arm, where the global exists and is used. No
functional change.

Reviewed by:	markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45323
2024-05-24 10:55:24 -03:00
Mitchell Horne
b5e17840de arm64, riscv: remove unused struct pv_addr
No functional change.

Reviewed by:	markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45322
2024-05-24 10:55:24 -03:00
Bjoern A. Zeeb
45bce6fa30 LinuxKPI: 802.11: lock MO tx/wake_tx_queue() downcalls
Lock the two TX MO downcalls into driver/firmware in
lkpi_80211_txq_tx_one() to make sure they cannot happen in the
middle of other (net80211 triggered) updates calling down into
the driver/firmware.

Sponsored by:	The FreeBSD Foundation (commit)
MFC after:	3 days
Reviewed by:	cc
Differential Revision: https://reviews.freebsd.org/D43966
2024-05-23 23:43:29 +00:00
Henrich Hartzer
674956e199 sys/netinet/cc: Switch from deprecated random() to prng32()
Related: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277655

Signed-off-by: henrichhartzer@tuta.io
Reviewed by: imp, mav
Pull Request: https://github.com/freebsd/freebsd-src/pull/1162
2024-05-23 15:10:09 -06:00