Commit graph

152568 commits

Author SHA1 Message Date
Roger Pau Monné
e7fe856437 xen/blk{front,back}: fix usage of sector sizes different than 512b
The size reported in the 'sectors' xenbus node is always in units of 512b,
regardless of the value of the 'sector-size' node.  The sector offsets in
the ring requests are also always based on 512b sectors, regardless of the
'sector-size' reported in xenbus.

Fix both blkfront and blkback to assume 512b sectors in the required fields.

The blkif.h public header has been recently updated in upstream Xen repository
to fix the regressions in the specification introduced by later modifications,
and clarify the base units of xenstore and shared ring fields.
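
As a rough illustration of the unit rule described above, here is a minimal C sketch. It is not the blkfront/blkback code; `DEMO_RING_SECTOR_SIZE` and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: the fixed 512-byte unit used by the 'sectors' node
 * and by sector offsets in ring requests. */
#define DEMO_RING_SECTOR_SIZE 512u

/* Disk size in bytes: 'sectors' counts 512-byte units, whatever the
 * 'sector-size' node reports. */
static uint64_t
demo_disk_size_bytes(uint64_t xenbus_sectors)
{
	return (xenbus_sectors * DEMO_RING_SECTOR_SIZE);
}

/* Convert a logical block number, expressed in 'sector-size' units, to the
 * 512-byte-based sector number carried in a ring request. */
static uint64_t
demo_lba_to_ring_sector(uint64_t lba, uint32_t sector_size)
{
	/* Assumption: the logical sector size is a multiple of 512. */
	assert(sector_size >= DEMO_RING_SECTOR_SIZE &&
	    sector_size % DEMO_RING_SECTOR_SIZE == 0);
	return (lba * (sector_size / DEMO_RING_SECTOR_SIZE));
}
```

Under these assumptions, logical block 3 on a disk with 4096-byte sectors maps to ring sector 24, while a disk with 512-byte sectors maps one to one.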

PR: 280884
Reported by: Christian Kujau
MFC after: 1 week
Sponsored by: Cloud Software Group
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D46756
2024-10-08 09:29:13 +02:00
Konstantin Belousov
e90b2b7d6c ptrace(PT_VM_ENTRY): report max protection
Reviewed by:	brooks, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46971
2024-10-08 09:50:17 +03:00
Konstantin Belousov
409c2fa385 kinfo_vmentry: report max protection
Reviewed by:	brooks, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46971
2024-10-08 09:50:17 +03:00
Warner Losh
6c711019f2 nvme: Don't create sysctl for io queues not created
When we can't set the number of I/O queues on the admin queue, we
continue on. However, we don't create the I/O queue structures, so
having pointers (NULL) into them for sysctls makes no sense and leads to
a panic when accessed. When summing up different stats, also skip the
ioq stats when it's NULL.

Sponsored by:		Netflix
2024-10-07 22:22:40 -06:00
Mark Johnston
c59166e5b4 vm_page: Fix a logic bug in vm_page_unwire_managed()
When releasing a page reference, we have logic for various cases, based
on the value of the counter.  But, the implementation fails to take into
account the possibility that the VPRC_BLOCKED flag is set, which is ORed
into the counter for short windows when removing mappings of a page.  If
the flag is set while the last reference is being released, we may fail
to add the page to a page queue when the last wiring reference is
released.

Fix the problem by performing comparisons with VPRC_BLOCKED masked off.
While here, add a related assertion.
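
The effect of masking the flag can be sketched in a few lines of C. This is a simplified model, not the code in vm_page_unwire_managed(); `DEMO_REF_BLOCKED` is an illustrative stand-in for VPRC_BLOCKED and its value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for VPRC_BLOCKED; the bit value is an assumption. */
#define DEMO_REF_BLOCKED 0x40000000u

/* Raw comparison: misses the last reference while the flag is ORed in. */
static bool
demo_is_last_ref_naive(uint32_t count)
{
	return (count == 1);
}

/* Comparison with the flag masked off: sees the last reference either way. */
static bool
demo_is_last_ref_masked(uint32_t count)
{
	return ((count & ~DEMO_REF_BLOCKED) == 1);
}
```

With the flag set on a counter holding one reference, the raw comparison reports "not last" and the masked comparison reports "last".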

Reviewed by:	dougm, kib
Tested by:	pho
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D46944
2024-10-07 20:52:15 +00:00
Mark Johnston
d8b32da235 vm_page: Use atomic loads for cmpset loops
Make sure that the compiler loads the initial value only once.
Because atomic_fcmpset is used to load the value for subsequent
iterations, this is probably not needed, but we should not rely on that.

I verified that code generated for an amd64 GENERIC kernel does not
change.
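
For readers unfamiliar with the pattern, a userspace analogue using C11 atomics might look like the sketch below. It is not the kernel code: `atomic_compare_exchange_weak` stands in for atomic_fcmpset (both refresh the expected value on failure), and `demo_set_flag` is a hypothetical helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Set 'flag' in *p and return the value observed before the update. */
static uint32_t
demo_set_flag(_Atomic uint32_t *p, uint32_t flag)
{
	/* One explicit atomic load provides the initial expected value. */
	uint32_t old = atomic_load_explicit(p, memory_order_relaxed);

	/* On failure the compare-exchange writes the current value back into
	 * 'old', so the loop needs no further loads. */
	while (!atomic_compare_exchange_weak(p, &old, old | flag))
		;
	return (old);
}
```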

Reviewed by:	dougm, alc, kib
Tested by:	pho
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D46943
2024-10-07 20:52:08 +00:00
Navdeep Parhar
52e5a66eac cxgbe(4): Use correct synchronization when marking the adapter offline.
adapter->flags are guarded by a synch_op, as noted in the comment in
adapter.h where the flags are defined.

Fixes:	5241b210a4 cxgbe(4): Basic infrastructure for ULDs to participate in adapter reset.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2024-10-07 10:25:53 -07:00
Roger Pau Monné
9a73b5b1e8 xen: remove PV suspend/resume support copyright
The code for PV suspend/resume support has long been removed; also remove the
copyright notice associated with it.

There are still two copyright blocks with (to my understanding) slightly
different wordings of the BSD 2 clause license.  I however don't feel like
merging them due to those wording differences.

The removal of the PV suspend/resume code was done in
ed95805e90.

Sponsored by: Cloud Software Group
Reviewed by: imp
Differential revision: https://reviews.freebsd.org/D46860
2024-10-07 18:59:45 +02:00
Roger Pau Monné
9dd5105f22 xen: expose support for poweroff/reboot/suspend on xenbus
Some toolstacks won't attempt to signal power actions on xenbus unless the VM
explicitly exposes support for them.  FreeBSD supports all power actions, hence
signal on xenbus such support by setting the nodes to the value of "1".

Sponsored by: Cloud Software Group
Reviewed by: markj
Differential review: https://reviews.freebsd.org/D46859
2024-10-07 18:59:45 +02:00
Bojan Novković
a02f9685ed vm_meter: Add counter for NOFREE pages
This change adds a new counter that tracks the total number
of permanently allocated pages.

Differential Revision:	https://reviews.freebsd.org/D46978
Reviewed by:	alc, markj
2024-10-07 18:46:32 +02:00
Doug Moore
6af02087d2 swap_pager: rename iter init functions
Add _init to the function names of the functions that initialize
iterators for swblks.

Reported by:	alc, markj
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D46974
2024-10-07 11:11:33 -05:00
Michael Tuexen
3326ab87cc getsockopt: improve locking for SOL_SOCKET level socket options
Ensure SOLISTENING() is done inside SOCK_LOCK()/SOCK_UNLOCK()
for getsockopt() handling of SOL_SOCKET-level socket options.

Reviewed by:		markj, rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46881
2024-10-07 16:46:41 +02:00
Zhenlei Huang
8161000892 iflib: Make iflib_stop() static
It is declared as static. Make the definition consistent with the
declaration.

This follows 7ff9ae90f0 and partially reverts 09f6ff4f1a.

Reviewed by:	erj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D46185
2024-10-07 22:19:02 +08:00
Igor Ostapenko
dfcb8de5ef dummymbuf: Log the entire rule set if no delimiters are present
An empty string was printed instead.

Reviewed by:	kp
Approved by:	kp (mentor)
Differential Revision:	https://reviews.freebsd.org/D46964
2024-10-07 11:16:44 +00:00
Konstantin Belousov
6a3fbdc7e9 kinfo_vmobject: report backing object of the SysV shm segments
Use reserved word for kvo_flags.
Mark such object with KVMO_FLAG_SYSVSHM.
Provide segment key in kvo_vn_fileid, vnode never can back shm mapping.
Provide sequence number in kvo_vn_fsid_freebsd11.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46959
2024-10-07 11:22:12 +03:00
Konstantin Belousov
d3dd6bd403 kinfo_vmentry: report mappings of the SysV shm segments
Mark such mappings with the new flag KVME_FLAG_SYSVSHM.
Provide segment key in kve_vn_fileid, vnode never can back shm mapping.
Provide sequence number in kve_vn_fsid_freebsd11.

Reviewed by:	markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46959
2024-10-07 11:22:12 +03:00
Konstantin Belousov
b72029589e sysvshm: add shmobjinfo() function to find key/seq of the segment backed by obj
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46959
2024-10-07 11:22:12 +03:00
Konstantin Belousov
f186252e0d vm_object: add OBJ_SYSVSHM flag to indicate SysV shm backing object
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46959
2024-10-07 11:22:12 +03:00
Konstantin Belousov
34935a6b37 vm_object: reformat flags definitions
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46959
2024-10-07 11:22:12 +03:00
Konstantin Belousov
8771dc950a sysv_ipc: remove sys/cdefs.h include
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D46959
2024-10-07 11:22:12 +03:00
Mark Johnston
fdd100a715 devctl: Add missing validation to DEV_RESET
As in other ioctls which access the parent bus, we need to check for a
NULL parent here.  Otherwise it's possible to trigger a null pointer
dereference by resetting the root device.

Reported by:	Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D46965
2024-10-07 00:16:07 +00:00
Mark Johnston
7f1dfd6c33 vm_object: Fix the argument type to vm_object_set_flag()
Reported by:	kib
Fixes:		9d52823bf1 ("vm_object: Widen the flags field")
2024-10-06 22:55:02 +00:00
Doug Moore
9147a0c93b pctrie: don't assign to root
Use pctrie_root_store(*, PCTRIE_LOCKED) to change the root value of a
pctrie, to ensure proper synchronization when smr is in use.

Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D46968
2024-10-06 15:10:10 -05:00
Ed Maste
357185a966 membarrier.h: fix typo
Sponsored by:	The FreeBSD Foundation
2024-10-06 13:22:21 -04:00
Mark Johnston
9d52823bf1 vm_object: Widen the flags field
Take advantage of a nearby 2-byte hole to avoid growing the struct.
This way, only the offsets of "flags" and "pg_color" change.  Bump
__FreeBSD_version since some out-of-tree kernel modules may access these
fields, though I haven't found any examples so far.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35905
2024-10-06 13:13:30 +00:00
Igor Ostapenko
6bd8d85579 dummymbuf: Fix code style
No functional change intended.

Reviewed by:	kp
Approved by:	kp (mentor)
Differential Revision:	https://reviews.freebsd.org/D46958
2024-10-06 11:34:28 +00:00
Bojan Novković
149e1af6ae vm_kern: Use VM_ALLOC_NOFREE when allocating 'zero_region' page
Allocate the 'zero_region' page using VM_ALLOC_NOFREE since
it never gets released.

Differential Revision:	https://reviews.freebsd.org/D46885
Reviewed by:	alc, markj, kib
2024-10-05 17:05:40 +02:00
Igor Ostapenko
9f146a81d2 dummymbuf: Validate syntax upon write to net.dummymbuf.rules sysctl
For now, opargs are not validated due to their runtime nature.

Reviewed by:	kp
Approved by:	kp (mentor)
Differential Revision:	https://reviews.freebsd.org/D46496
2024-10-05 08:04:08 +00:00
Brooks Davis
d0af970e3c sysv shm: Fix SHM_REMAP flag value
SHM_REMAP was incorrectly defined to 030000 which collides with
SHM_RDONLY and SHM_RND.  Renumber to 040000 (incidentally matching
Linux).

This is an ABI break, but the previous ABI was unusable (SHM_REMAP would
imply SHM_RDONLY and vice versa).  Fortunately SHM_REMAP has very few
consumers in the wild (I spotted openjdk for Linux, libfabric, MPICH,
and one other MPI thing in Debian code search).
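
The collision is easy to see with the octal values written out. In the sketch below the `DEMO_` macros are hypothetical; the SHM_RDONLY and SHM_RND values (010000 and 020000) are assumed from the collision described above, not taken from this commit.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_SHM_RDONLY    010000	/* assumed value */
#define DEMO_SHM_RND       020000	/* assumed value */
#define DEMO_SHM_REMAP_OLD 030000	/* old definition: RDONLY | RND */
#define DEMO_SHM_REMAP_NEW 040000	/* renumbered: a bit of its own */

/* The usual way a flag is tested: is any bit of 'flag' present in 'flags'? */
static bool
demo_flag_set(int flags, int flag)
{
	return ((flags & flag) != 0);
}
```

With the old value, a caller passing only the read-only flag also appears to request a remap, and the reverse; with the new value the tests are independent.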

Reviewed by:	kib
Fixes:		ea7e7006db Implement shmat(2) flag SHM_REMAP.
Differential Revision:	https://reviews.freebsd.org/D46825
2024-10-04 21:34:03 +01:00
Florian Walpen
e0c37c160b snd_hdsp(4): Support AO4S-192 and AI4S-192 extension boards.
Create an additional 4 channel pcm device for RME HDSP 9632 sound cards,
to support the optional AO4S-192 and AI4S-192 extension boards. For
simplicity, the <HDSP 9632 [ext]> pcm device is always present, even if
the extension boards are not installed.

Unfortunately I cannot test this with actual hardware, but I made sure
the additional channels do not affect the functionality of the HDSP 9632
as currently in src.

Reviewed by: christos, br
Differential Revision: https://reviews.freebsd.org/D46837
2024-10-04 19:51:49 +01:00
Florian Walpen
8fb4675688 snd_hdspe(4): Addendum to AO4S-192 and AI4S-192 support.
Fix unified pcm mode after support for the AO4S-192 and AI4S-192
extension boards was added. Adjust the man page accordingly.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D46946
2024-10-04 19:46:39 +01:00
Ruslan Bukin
9e7e15b539 snd_hdspe(4): AO4S/AI4S support.
Add support for RME AO4S/AI4S extension cards. They are designed as a bracket
with 4 stereo TRS jacks each.

https://archiv.rme-audio.de/download/ao4s192_e.pdf
https://archiv.rme-audio.de/download/ai4s192_e.pdf

Reviewed by: Florian Walpen <dev@submerge.ch>
Differential Revision: https://reviews.freebsd.org/D46409
2024-10-04 19:36:06 +01:00
Mark Johnston
33c2c58f0a shm: Respect PROT_MAX when creating private mappings
We were previously unconditionally adding PROT_WRITE to the maxprot of
private mapping (because a private mapping can be written even if the
fd is read-only), but this might violate the user's PROT_MAX request.

While here, rename cap_maxprot to max_maxprot.  This is the intersection
of the maximum protections imposed by capsicum rights on the fd (not
really relevant for private mappings) and the user-required maximum
protections (which were not being obeyed).  In particular, cap_maxprot
is a misnomer after the introduction of PROT_MAX.

Add some regression test cases.  mmap__maxprot_shm fails without this
patch.

Note: Capsicum's CAP_MMAP_W is a bit ambiguous.  Should it be required
in order to create writeable private mappings?  Currently it is, even
though such mappings don't permit writes to the object referenced by the
fd.

Reported by:	brooks
Reviewed by:	brooks
MFC after:	1 month
Fixes:		c7841c6b8e ("Relax restrictions on private mappings of POSIX shm objects.")
Differential Revision:	https://reviews.freebsd.org/D46741
2024-10-04 15:56:34 +00:00
Mark Johnston
b37b2543a2 ggate: Avoid dropping the GEOM topology lock in dumpconf
In general it's not safe to drop the topology lock in these routines, as
GEOM assumes that the mesh will be consistent during traversal.
However, there's no reason we can't hold the topology lock across calls
to g_gate_release().  (Note that g_gate_hold() can be called with the
topology lock held.)

PR:		238814
MFC after:	2 weeks
2024-10-04 15:56:34 +00:00
Andrew Turner
48979e8def arm64: Support HWCAP2_AFP and HWCAP2_RPRES
These add alternative behaviour to some floating-point instructions
so don't need any kernel support and can just be exposed to userspace.

Sponsored by:	Arm Ltd
2024-10-04 14:06:29 +00:00
Doug Moore
75734c4360 tmpfs: check residence in data_locked
tmpfs_seek_data_locked should return the offset of the first page
either resident in memory or in swap, but may return an offset to a
nonresident page. Check for residence to fix that.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D46879
2024-10-04 02:44:19 -05:00
Pierre Pronchery
64b0f52be2 ctl: limit memory allocation in pci_virtio_scsi
The virtio_scsi device allows a VM guest to directly send SCSI commands
(ctsio->cdb array) to the kernel driver exposed on /dev/cam/ctl
(ctl.ko).

All kernel commands accessible from the guest are defined by
ctl_cmd_table.

The command ctl_persistent_reserve_out (cdb[0]=0x5F and cdb[1]=0) allows
the caller to call malloc() with an arbitrary size (uint32_t). This can
be used by the guest to overload the kernel memory (DoS attack).
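
The general shape of such a limit can be sketched in userspace C as follows. This is not the ctl(4) change itself; the cap `DEMO_MAX_PARAM_LEN` and the helper name are hypothetical, and the real limit may be chosen differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical upper bound on a guest-supplied parameter list length. */
#define DEMO_MAX_PARAM_LEN (64u * 1024u)

/* Allocate a buffer for a guest-supplied length, refusing empty or
 * oversized requests instead of passing them straight to malloc(). */
static void *
demo_alloc_param_buf(uint32_t requested_len)
{
	if (requested_len == 0 || requested_len > DEMO_MAX_PARAM_LEN)
		return (NULL);
	return (malloc(requested_len));
}
```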

Reported by:    Synacktiv
Reviewed by:	asomers
Security:       HYP-08
Sponsored by:   The Alpha-Omega Project
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D46044
2024-10-03 20:22:34 -04:00
Brooks Davis
0f1d5bfb72 sysent: fix a couple more do-not-edit comments
Add blank line after comment to align with upcoming refactor for
makesysent.lua.

Fixes:		0d490c6a44 sysent: make header comments more consistent
2024-10-03 18:01:30 +01:00
John Baldwin
59f5f100b7 openzfs: Reduce local diffs
These are all local diffs that have no functional change.

Reviewed by:	mav, emaste
Sponsored by:	AFRL, DARPA
Differential Revision:	https://reviews.freebsd.org/D46530
2024-10-03 12:07:43 -04:00
Kajetan Staszkiewicz
65074f6f31 pf: fix double ruleset evaluation for fragments sent to dummynet
The function `pf_setup_pdesc()` handles ruleset evaluation for non-reassembled
packets. Having it called before `pf_mtag` is checked for flags
`PF_MTAG_FLAG_ROUTE_TO` and `PF_MTAG_FLAG_DUMMYNET` will cause loops for
fragmented packets if reassembly is disabled.

Move `pd` zeroing and `pf_mtag` extraction from `pf_setup_pdesc()` to a separate
function `pf_init_pdesc()` and change the order of function calls: first
call `pf_init_pdesc()`, then check if the currently processed packet has been
reinjected from dummynet, finally call `pf_setup_pdesc()`.

Add functionality of sending UDP packets to `pft_ping.py` with fragmentation
support and fix broken IPv6 reassembly.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D46880
2024-10-03 13:49:57 +02:00
Konstantin Belousov
eb8326421e iommu_qi_seq_processed: use atomic to read hw-written seq number
otherwise iommu_qi_wait_for_seq() can be legitimately optimized out.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-10-03 09:19:09 +03:00
Kyle Evans
e3953c036f sys: Chase libmd version bump with a __FreeBSD_version bump
Ports need to be rebuilt anew following this change to get off of the
old libmd.so.6.
2024-10-02 14:55:52 -05:00
John Baldwin
519981e3c0 tcp_output: Clear FIN if tcp_m_copym truncates output length
Reviewed by:	rscheff, tuexen, gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D46824
2024-10-02 15:12:37 -04:00
John Baldwin
c08e016f00 unix: Use a dedicated mtx pool for vnode name locks
mtxpool_sleep should be used as a leaf lock since it is a general
purpose pool shared across many consumers.  Holding a pool lock while
acquiring other locks (e.g. the socket buffer lock in soreserve()) can
trigger LOR warnings for unrelated code.

Reviewed by:	glebius
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D46792
2024-10-02 15:12:37 -04:00
Bartosz Fabianowski
cd8c3af747 ACPI: Treat all 20-element _BIX entries as revision 0
Some Fujitsu Lifebooks return an invalid _BIX object. The first element
of _BIX is a revision number, which indicates what elements will follow:
* ACPI 4.0 defined _BIX revision 0 with 20 elements.
* ACPI 6.0 introduced _BIX revision 1 with 21 elements.
The problem is that the offending Lifebooks have a non-zero _BIX
revision, but provide 20 fields only.

The ACPICA parser chokes on this [1], but that seems to be
inconsequential. More importantly, our own battery info handling code [2]
also verifies that for revision > 0, there are at least 21 fields - and
refuses to process the invalid _BIX. One workaround would be to
introduce special case / quirk handling for Fujitsu Lifebooks. A better
one is to relax the requirements check: If there are only 20 elements,
treat the _BIX as revision 0, no matter what revision number was
provided by the device.
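
The relaxed check can be summarized by a small C sketch. It is not the acpi_cmbat.c code: the helper name is hypothetical, and rejecting packages with fewer than 20 elements is an assumption made for the example.

```c
#include <assert.h>

#define DEMO_BIX_REV0_ELEMS 20	/* _BIX revision 0 (ACPI 4.0) */

/* Return the revision to parse a _BIX package as, or -1 if it is too
 * short to be any known revision (assumed behaviour). */
static int
demo_bix_effective_revision(int reported_rev, int nelems)
{
	if (nelems < DEMO_BIX_REV0_ELEMS)
		return (-1);
	/* Exactly 20 elements: treat as revision 0 whatever the device says. */
	if (nelems == DEMO_BIX_REV0_ELEMS)
		return (0);
	return (reported_rev);
}
```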

Linux doesn't run into this problem by the way because it only supports
the 20 fields defined in the ACPI 4.0 spec [3]. It never looks at the
revision number or the 21st field added in ACPI 6.0.

[1] https://cgit.freebsd.org/src/tree/sys/contrib/dev/acpica/components/namespace/nsprepkg.c#n815
[2] https://cgit.freebsd.org/src/tree/sys/dev/acpica/acpi_cmbat.c#n371
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/battery.c#n418

PR: 252030
Reviewed by: imp
MFC After: 2 weeks
2024-10-02 12:30:15 -06:00
Warner Losh
ab03b79062 uart: Add entry for an Intel UART
While we really should infer this baud-clock rate in some cases, use the
right baud-clock for this device.

Sponsored by:		Netflix
2024-10-02 12:29:24 -06:00
Dag-Erling Smørgrav
1c82bbd865 pf: Fix NOINET and NOINET6 build.
The issue was only apparent when ALTQ was not enabled, which is why it
was not revealed by `make universe`.

Fixes:		27f54be50b
Reviewed by:	kp, imp
Differential Revision:	https://reviews.freebsd.org/D46877
2024-10-02 19:54:10 +02:00
Kajetan Staszkiewicz
e5c64b2662 pf: replace union pf_krule_ptr with struct pf_krule in in-kernel structs
There is no need for the union pf_krule_ptr for kernel-only structs like
pf_kstate and pf_ksrc_node. The rules are always accessed by pointer. The rule
numbers are a leftover from using the same structure for pfctl(8) and pf(4).

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D46868
2024-10-02 19:53:26 +02:00
Pierre Pronchery
d19fa9c1b7 vmm: avoid potential KASSERT kernel panic in vm_handle_db
If the guest VM emits the exit code VM_EXITCODE_DB the kernel will
execute the function named vm_handle_db.

If the value of rsp is not page aligned and if rsp+sizeof(uint64_t)
spans across two pages, the function vm_copy_setup will need two structs
vm_copyinfo to prepare the copy operation.

For instance, if the rsp value is 0xFFC, two vm_copyinfo objects are needed:

* address=0xFFC, len=4
* address=0x1000, len=4
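
The split in this example can be reproduced with a short C sketch. It only models the arithmetic and is not vm_copy_setup(); `struct demo_chunk`, `demo_split_by_page` and the 4 KiB page size are assumptions. Returning an error when the caller's array is too small mirrors the defensive approach of preferring an error return over an assertion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 0x1000u	/* assumed 4 KiB pages */

struct demo_chunk {
	uint64_t addr;
	size_t len;
};

/* Split [addr, addr + len) into pieces that never cross a page boundary.
 * Returns the number of pieces, or -1 if 'max' entries are not enough. */
static int
demo_split_by_page(uint64_t addr, size_t len, struct demo_chunk *out, int max)
{
	int n = 0;

	while (len > 0) {
		/* Bytes left before the end of the current page. */
		size_t room = DEMO_PAGE_SIZE -
		    (size_t)(addr & (DEMO_PAGE_SIZE - 1));
		size_t take = len < room ? len : room;

		if (n == max)
			return (-1);
		out[n].addr = addr;
		out[n].len = take;
		n++;
		addr += take;
		len -= take;
	}
	return (n);
}
```

For an 8-byte write at 0xFFC this yields the two pieces listed above; the same write at 0xFF8 fits in a single piece.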

The vulnerability was addressed by commit 51fda658ba ("vmm: Properly
handle writes spanning across two pages in vm_handle_db").  Still,
replace the KASSERT with an error return as a more defensive approach.

Reported by:    Synacktiv
Reviewed by:	markj, emaste
Security:       HYP-09
Sponsored by:   The Alpha-Omega Project
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D46133
2024-10-02 12:58:45 -04:00
Bojan Novković
51fda658ba vmm: Properly handle writes spanning across two pages in vm_handle_db
The vm_handle_db function is responsible for writing correct status
register values into memory when a guest VM is being single-stepped
using the RFLAGS.TF mechanism. However, it currently does not properly
handle an edge case where the resulting write spans across two pages.
This commit fixes this by making vm_handle_db use two vm_copyinfo
structs.

Security:	HYP-09
Reviewed by:	markj
2024-10-02 18:43:36 +02:00