139605 commits

Author SHA1 Message Date
Roger Pau Monné
41a0aef504 vt/vga: ignore ACPI_FADT_NO_VGA unless running virtualized
There is too much broken hardware out there that wrongly has the
ACPI_FADT_NO_VGA bit set. Ignore it unless running as a virtualized
guest, as then the expectation would be that the hypervisor does
provide correct ACPI tables.

Reviewed by: emaste, 0mp, eugen
Sponsored by: Citrix Systems R&D
PR: 230172

(cherry picked from commit 0518832011)
2022-03-23 14:44:30 +01:00
Roger Pau Monné
88aff320c8 x86/xen: fix CPUID signature
Reviewed by: cem
Sponsored by: Citrix Systems R&D

(cherry picked from commit 396a8479b0)
2022-03-23 14:44:07 +01:00
Ed Maste
79902c8c2d Add Tempo Semiconductor 92HD95B HDA codec ID
This codec is found in recent versions of the Framework laptop.  Tempo
Semiconductor acquired these products from IDT's Audio Business Unit.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit e997f33700)
2022-03-22 21:27:16 -04:00
Piotr Pawel Stefaniak
acec1d6969 mount: improve error message for invalid filesystem names
For an invalid filesystem name used like this:
mount -t asdfs /dev/ada1p5 /usr/obj

emit an error message like this:
mount: /dev/ada1p5: Invalid fstype: Invalid argument

instead of:
mount: /dev/ada1p5: Operation not supported by device

Differential Revision:	https://reviews.freebsd.org/D31540

(cherry picked from commit 6e8272f317)
2022-03-22 19:47:13 +01:00
hlh-restart
205fa5f0a5 rtsx: Call rtsx_init() on resume.
MFC after:	3 days

(cherry picked from commit 1b1bab0078)
2022-03-21 20:28:34 -04:00
Mark Johnston
476b3bb091 fusefs: Initialize a pad word in the mknod message
Reported by:	Jenkins (KMSAN job)
Reviewed by:	asomers
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit c0b98fe16f)
2022-03-21 10:42:39 -04:00
Vincenzo Maffione
ddb842e2ad netmap: add a tunable for the maximum number of VALE switches
The new dev.netmap.max_bridges sysctl tunable can be set in
loader.conf(5) to change the default maximum number of VALE
switches that can be created. The current default is 8.

MFC after:	2 weeks

(cherry picked from commit dd6ab49a9a)
2022-03-20 09:00:50 +00:00
Kristof Provost
f6138d93b5 if_epair: build fix
66acf7685b failed to build on riscv (and mips). This is because the
atomic_testandset_int() (and friends) functions do not exist there.
Happily those platforms do have the long variant, so switch to that.

PR:		262571
MFC after:	3 days

(cherry picked from commit 0bf7acd6b7)
2022-03-20 01:25:03 +01:00
Vincenzo Maffione
9f600a260a netmap: Fix TOCTOU vulnerability in nmreq_copyin
The total size of the user-provided nmreq was first computed and then
trusted during the copyin. This might lead to kernel memory corruption
and escape from jails/containers.
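
The double-fetch pattern described above can be illustrated with a minimal user-space sketch. This is not the netmap patch: struct req_hdr, REQ_MAX, fake_copyin() and req_copyin_once() are hypothetical names, and memcpy() stands in for copyin(9). The idea is to fetch the length once, validate that private copy, and never trust the user-visible field again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request header: len is the total size, header included. */
struct req_hdr {
	uint32_t len;
};

#define REQ_MAX 4096u	/* hypothetical upper bound on a request */

/* Stand-in for copyin(9); a real kernel would fault-check the user address. */
static int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	memcpy(kaddr, uaddr, len);
	return (0);
}

/* Copy a request in, fetching and validating the length exactly once. */
static void *
req_copyin_once(const void *ubuf, size_t *out_len)
{
	struct req_hdr hdr;
	unsigned char *kbuf;

	/* Single fetch of the length into memory the user cannot modify. */
	if (fake_copyin(ubuf, &hdr, sizeof(hdr)) != 0)
		return (NULL);
	if (hdr.len < sizeof(hdr) || hdr.len > REQ_MAX)
		return (NULL);

	kbuf = malloc(hdr.len);
	if (kbuf == NULL)
		return (NULL);
	if (fake_copyin(ubuf, kbuf, hdr.len) != 0) {
		free(kbuf);
		return (NULL);
	}

	/*
	 * The user copy may have changed between the two fetches, so pin the
	 * validated header over whatever the second copy brought in.
	 */
	memcpy(kbuf, &hdr, sizeof(hdr));
	*out_len = hdr.len;
	return (kbuf);
}
```

Later code should then size every access from the returned length rather than re-reading a length from the user buffer.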

Reported by: Lucas Leong (@_wmliang_) of Trend Micro Zero Day Initiative
Security: CVE-2022-23084
MFC after:	3 days

(cherry picked from commit 3937299165)
2022-03-19 17:36:39 +00:00
Vincenzo Maffione
9df8dd3ea3 netmap: Fix integer overflow in nmreq_copyin
An unsanitized field in an option could be abused, causing an integer
overflow followed by kernel memory corruption. This might be used
to escape jails/containers.
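
As a generic illustration of this class of bug (not the actual netmap fix), the sketch below computes a buffer size from user-controlled 32-bit fields without letting the arithmetic wrap. The function name opt_total_size() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute hdr_size + count * elem_size, rejecting any result above limit.
 * The product is formed in 64 bits, so two 32-bit inputs cannot wrap it.
 */
static bool
opt_total_size(uint32_t count, uint32_t elem_size, size_t hdr_size,
    size_t limit, size_t *out)
{
	uint64_t body = (uint64_t)count * elem_size;

	/* Check the body first so that limit - body cannot underflow. */
	if (body > limit || hdr_size > limit - body)
		return (false);
	*out = (size_t)body + hdr_size;
	return (true);
}
```

With 32-bit arithmetic, a count of 0x10000 and an element size of 0x10000 multiply to zero, which is how an unchecked field can turn into an undersized allocation followed by an oversized copy.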

Reported by: Reno Robert and Lucas Leong (@_wmliang_) of Trend Micro
Zero Day Initiative
Security: CVE-2022-23085

(cherry picked from commit 694ea59c70)
2022-03-19 17:36:27 +00:00
Eugene Grosbein
4a11315a2c virtio_random(8): MFC: avoid deadlock at shutdown time (regression fix)
FreeBSD 13+ running as a virtual guest may load the virtio_random(8) driver
by means of devd(8) unless the driver is blacklisted or disabled
via device.hints(5). Currently, the driver may prevent
the system from rebooting or shutting down correctly.

This change deactivates virtio_random at a very late stage
of the system shutdown sequence to avoid a deadlock
that results in a kernel hang.

PR:		253175
Tested by:	tom
Relnotes:	yes

(cherry picked from commit adbf7727b3)
2022-03-19 11:20:58 +07:00
Zhenlei Huang
f877ca16c9 x86: Correctly report unexpected cache level
Reviewed by:	rpokala, emaste
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D34577

(cherry picked from commit ba46c6c4b7)
2022-03-18 20:31:00 -04:00
Mark Johnston
c764d4468f armv8crypto: Remove leftover debug printfs
Fixes:	26b08c5d21 ("armv8crypto: Use cursors to access crypto buffer data")
Reported by:	bz

(cherry picked from commit c89def05b5)
2022-03-18 11:31:57 -04:00
Mark Johnston
f180b84358 armv8crypto: Use cursors to access crypto buffer data
Currently armv8crypto copies the scheme used in aesni(9), where payload
data and output buffers are allocated on the fly if the crypto buffer is
not virtually contiguous.  This scheme is simple but incurs a lot of
overhead: for an encryption request with a separate output buffer we
have to
- allocate a temporary buffer to hold the payload
- copy input data into the buffer
- copy the encrypted payload to the output buffer
- zero the temporary buffer before freeing it

We have a handy crypto buffer cursor abstraction now, so reimplement the
armv8crypto routines using that instead of temporary buffers.  This
introduces some extra complexity, but gallatin@ reports a 10% throughput
improvement with a KTLS workload without additional CPU usage.  The
driver still allocates an AAD buffer for AES-GCM if necessary.
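
To illustrate the cursor idea in isolation, here is a minimal user-space sketch that transforms a non-contiguous buffer in place, segment by segment, without a temporary copy. It does not use the opencrypto API: struct seg, struct seg_cursor and cursor_xor() are hypothetical, and a byte-wise XOR stands in for the cipher.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous piece of a scattered buffer (hypothetical). */
struct seg {
	unsigned char *base;
	size_t len;
};

/* Position within a list of segments (hypothetical). */
struct seg_cursor {
	const struct seg *segs;
	size_t nsegs;
	size_t idx;	/* current segment */
	size_t off;	/* offset within the current segment */
};

/* XOR up to n bytes in place from the cursor; returns bytes processed. */
static size_t
cursor_xor(struct seg_cursor *c, unsigned char key, size_t n)
{
	size_t done = 0;

	while (done < n && c->idx < c->nsegs) {
		const struct seg *s = &c->segs[c->idx];
		size_t avail = s->len - c->off;
		size_t todo = (n - done < avail) ? n - done : avail;

		for (size_t i = 0; i < todo; i++)
			s->base[c->off + i] ^= key;
		done += todo;
		c->off += todo;
		if (c->off == s->len) {
			/* Segment exhausted: step to the next one. */
			c->idx++;
			c->off = 0;
		}
	}
	return (done);
}
```

The four steps listed above (allocate, copy in, copy out, zero) disappear because the data is only ever touched where it already lives.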

Reviewed by:	jhb
Tested by:	gallatin
Sponsored by:	Ampere Computing LLC
Submitted by:	Klara Inc.

(cherry picked from commit 26b08c5d21)
2022-03-18 11:31:44 -04:00
Mark Johnston
2d7f27a4fb opencrypto: Add a routine to copy a crypto buffer cursor
This was useful in converting armv8crypto to use buffer cursors.  There
are some cases where one wants to make two passes over data, and this
provides a way to "reset" a cursor.
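
A minimal sketch of the two-pass use case, under the assumption that a cursor is a small value type that can be saved and restored by plain assignment. The types and the helpers cursor_copy() and cursor_sum() are hypothetical and are not the opencrypto routines.

```c
#include <assert.h>
#include <stddef.h>

struct seg {
	const unsigned char *base;
	size_t len;
};

struct seg_cursor {
	const struct seg *segs;
	size_t nsegs;
	size_t idx;	/* current segment */
	size_t off;	/* offset within the current segment */
};

/* Save a cursor position; the copy can later be used to "reset". */
static void
cursor_copy(const struct seg_cursor *from, struct seg_cursor *to)
{
	*to = *from;
}

/* Sum up to n bytes, advancing the cursor as it goes. */
static unsigned
cursor_sum(struct seg_cursor *c, size_t n)
{
	unsigned sum = 0;

	while (n > 0 && c->idx < c->nsegs) {
		const struct seg *s = &c->segs[c->idx];

		if (c->off == s->len) {
			/* Segment exhausted: step to the next one. */
			c->idx++;
			c->off = 0;
			continue;
		}
		sum += s->base[c->off++];
		n--;
	}
	return (sum);
}
```

A caller saves the cursor with cursor_copy() before the first pass, then copies the saved value back before the second pass so both passes start at the same position.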

Reviewed by:	jhb

(cherry picked from commit 09bfa5cf16)
2022-03-18 11:31:33 -04:00
Mark Johnston
04df02b2f4 armv8crypto: Factor out some duplicated GCM code
This is in preparation for using buffer cursors.  No functional change
intended.

Reviewed by:	jhb
Sponsored by:	Ampere Computing LLC
Submitted by:	Klara Inc.

(cherry picked from commit 0b3235ef74)
2022-03-18 11:29:24 -04:00
Hans Petter Selasky
19b779498c xhci(4): Add quirk for "Fresco Logic FL1009 USB3.0 xHCI Controller".
Submitted by:		John F Carr <jfc@mit.edu>
Sponsored by:		NVIDIA Networking

(cherry picked from commit 19837718ab)
2022-03-17 10:55:18 +01:00
Hans Petter Selasky
d17c5a4f62 LinuxKPI: Add comment describing proper use of the on_each_cpu() function.
Sponsored by:		NVIDIA Networking

(cherry picked from commit c6cf874c7d)
2022-03-17 10:55:15 +01:00
Hans Petter Selasky
d67b2c9615 lindebugfs: Make single_release() NULL safe.
Sponsored by:	NVIDIA Networking

(cherry picked from commit a23e475c48)
2022-03-17 10:55:12 +01:00
Hans Petter Selasky
a16772a811 lindebugfs: The Linux file operations use negative return values in the kernel.
Fix sign.

Sponsored by:	NVIDIA Networking

(cherry picked from commit 68ec2949ad)
2022-03-17 10:55:07 +01:00
Hans Petter Selasky
b5cc52c21d lindebugfs: Zero the linux_file structure before use.
This avoids clients using garbage values on the stack and makes
debugging easier.

Sponsored by:	NVIDIA Networking

(cherry picked from commit 88a29d89eb)
2022-03-17 10:55:02 +01:00
Michael Gmelin
bb9ad300f0 if_epair: fix race condition on multi-core systems
As an unwanted side effect of the performance improvements in
24f0bfbad5, epair interfaces stop forwarding traffic on higher
load levels when running on multi-core systems.

This happens due to a race condition in the logic that decides when to
place work in the task queue(s) responsible for processing the content
of ring buffers.

In order to fix this, a field named state is added to the epair_queue
structure. This field is used by the affected functions to signal each
other that something happened in the underlying ring buffers that might
require work to be scheduled in task queue(s), replacing the existing
logic, which relied on checking if ring buffers are empty or not.

epair_menq() does:
  - set BIT_MBUF_QUEUED
  - queue mbuf
  - if testandset BIT_QUEUE_TASK:
      enqueue task

epair_tx_start_deferred() does:
  - swap ring buffers
  - process mbufs
  - clear BIT_QUEUE_TASK
  - if testandclear BIT_MBUF_QUEUED
      enqueue task
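
The handshake above can be sketched in user space with C11 atomics. Only the bit names BIT_QUEUE_TASK and BIT_MBUF_QUEUED come from this message; the struct, the helpers and the counters are hypothetical. The sketch assumes that test-and-set and test-and-clear return the previous value of the bit, that the producer schedules the task only when the task bit was previously clear, and that the consumer reschedules when the mbuf bit was set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BIT_QUEUE_TASK	0u
#define BIT_MBUF_QUEUED	1u

struct queue_sketch {
	atomic_uint state;
	int ring_len;		/* stand-in for the ring buffers */
	int tasks_enqueued;	/* how often the task was scheduled */
};

static void
queue_sketch_init(struct queue_sketch *q)
{
	atomic_init(&q->state, 0);
	q->ring_len = 0;
	q->tasks_enqueued = 0;
}

/* Set a bit and report whether it was already set. */
static bool
test_and_set(atomic_uint *s, unsigned bit)
{
	unsigned mask = 1u << bit;

	return ((atomic_fetch_or(s, mask) & mask) != 0);
}

/* Clear a bit and report whether it was set before. */
static bool
test_and_clear(atomic_uint *s, unsigned bit)
{
	unsigned mask = 1u << bit;

	return ((atomic_fetch_and(s, ~mask) & mask) != 0);
}

/* Producer side, mirroring the epair_menq() steps. */
static void
producer_enqueue(struct queue_sketch *q)
{
	atomic_fetch_or(&q->state, 1u << BIT_MBUF_QUEUED);
	q->ring_len++;				/* queue mbuf */
	if (!test_and_set(&q->state, BIT_QUEUE_TASK))
		q->tasks_enqueued++;		/* enqueue task */
}

/* Consumer side, mirroring the epair_tx_start_deferred() steps. */
static void
consumer_run(struct queue_sketch *q)
{
	q->ring_len = 0;			/* swap rings, process mbufs */
	atomic_fetch_and(&q->state, ~(1u << BIT_QUEUE_TASK));
	if (test_and_clear(&q->state, BIT_MBUF_QUEUED))
		q->tasks_enqueued++;		/* work may have arrived: rerun */
}
```

The sketch is single-threaded and only shows the bookkeeping: two enqueues schedule the task once, the consumer reschedules itself once because the mbuf bit was set, and the follow-up run finds both bits clear and goes idle.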

PR:		262571
Approved by:    re (gjb, early MFC)
Reported by:	Johan Hendriks <joh.hendriks@gmail.com>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D34569

(cherry picked from commit 66acf7685b)
2022-03-17 00:38:33 +01:00
Hans Petter Selasky
ea318f1ad1 xhci(4): Add quirk for "TUSB73x0 USB3.0 xHCI Controller".
Tested by:	br@
Sponsored by:	NVIDIA Networking

(cherry picked from commit 33cbbf268f)
2022-03-16 15:55:22 +01:00
Mateusz Guzik
981e9c1486 vfs: [2/2] fix stalls in vnode reclaim by only counting attempts
... and ignoring whether they succeeded, which matches historical behavior.

Reported by:	pho

(cherry picked from commit 3a4c5dab92)
2022-03-15 21:12:50 +00:00
Mateusz Guzik
aa49e413a8 vfs: [1/2] fix stalls in vnode reclaim by not requeueing from vnlru
Reported by:	pho

(cherry picked from commit c35ec1efdc)
2022-03-15 21:12:43 +00:00
Mateusz Guzik
fa86eac818 cache: hide hash stats behind DEBUG_CACHE
They take a long time to dump and hinder sysctl -a when used with
DIAGNOSTIC.

(cherry picked from commit afb08a6d07)
2022-03-15 21:12:30 +00:00
Mark Johnston
b01d706cca i386: Call clock_init() after finishidentcpu()
In a subsequent commit clock_init() will attempt to determine the TSC
frequency, and this requires that CPU identification is finalized.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit c3d830cf7c)
2022-03-15 11:41:02 -04:00
Mark Johnston
c04e4ff616 fasttrap: Avoid creating WX mappings
fasttrap instruments certain instructions by overwriting them and
copying the original instruction to some per-thread scratch space which
is executed after the probe fires.  This trampoline jumps back to the
tracepoint after executing the original instruction.

The created mapping has both write and execute permissions, and so this
mechanism doesn't work when allow_wx is disabled.  Work around the
restriction by using proc_rwmem() to write to the trampoline.

Reviewed by:	vangyzen
Tested by:	Amit <akamit91@hotmail.com>
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 3a56cfedbc)
2022-03-15 11:40:47 -04:00
Mark Johnston
87e1a4346d fasttrap: Assert that fasttrap_fork() successfully unmaps scratch space
No functional change intended.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 83958173eb)
2022-03-15 11:40:36 -04:00
Mark Johnston
7c27fee0ea proc: Remove assertion that P_WEXIT is not set in proc_rwmem()
exit1() sets P_WEXIT before waiting for holding threads to finish,
rather than after, so this assertion is racy.

Fixes:	12fb39ec3e ("proc: Relax proc_rwmem()'s assertion on the process hold count")
Reported by:	Jenkins

(cherry picked from commit 879b0604a8)
2022-03-15 11:40:22 -04:00
Mark Johnston
76dcbd770d proc: Relax proc_rwmem()'s assertion on the process hold count
This reference ensures that the process and its associated vmspace will
not be destroyed while proc_rwmem() is executing.  If, however, the
calling thread belongs to the target process, then it is unnecessary to
hold the process.  In particular, fasttrap - a module which enables
userspace dtrace - may frequently call proc_rwmem(), and we'd prefer to
avoid the overhead of locking and bumping the hold count when possible.

Thus, make the assertion conditional on "p != curproc".  Also assert
that the process is not already exiting.  No functional change intended.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 12fb39ec3e)
2022-03-15 11:39:55 -04:00
Andrew Turner
1169099f8f Make the arm64 get_pcpu a function again
We assume the pointer returned from get_pcpu will be consistent even
if the thread is moved to a new CPU. Fix this by partially reverting
63c858a04d to make get_pcpu a function again.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ed30663426)
2022-03-14 15:00:27 +00:00
Mark Johnston
831049dc34 riscv: Add support for enabling SV48 mode
This increases the size of the user map from 256GB to 128TB.  The kernel
map is left unchanged for now.
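
The two sizes follow from the virtual address widths, assuming user space occupies the lower half of the address space: SV39 has 39-bit virtual addresses, so the user half is 2^38 bytes, and SV48 has 48-bit addresses, giving 2^47 bytes. A small check of that arithmetic, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of the lower (user) half of a va_bits-wide address space. */
static uint64_t
user_half_bytes(unsigned va_bits)
{
	return (UINT64_C(1) << (va_bits - 1));
}
```

Here user_half_bytes(39) is 256 * 2^30 bytes (256GB) and user_half_bytes(48) is 128 * 2^40 bytes (128TB).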

For now SV48 mode is left disabled by default, but can be enabled with a
tunable.  Note that extant hardware does not implement SV48, but QEMU
does.

- In pmap_bootstrap(), allocate a L0 page and attempt to enable SV48
  mode.  If the write to SATP doesn't take, the kernel continues to run
  in SV39 mode.
- Define VM_MAX_USER_ADDRESS to refer to the SV48 limit.  In SV39 mode,
  the region [VM_MAX_USER_ADDRESS_SV39, VM_MAX_USER_ADDRESS_SV48] is not
  mappable.
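
The "write and see whether it took" probe from the first item can be modelled in user space. This is a simulation, not pmap code: struct fake_hart, fake_csr_write_satp() and probe_va_bits() are hypothetical. It relies on the SATP layout in which the MODE field sits in bits 63:60 with 8 meaning SV39 and 9 meaning SV48, and on a write with an unsupported MODE having no effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SATP_MODE_SHIFT	60
#define SATP_MODE_SV39	UINT64_C(8)
#define SATP_MODE_SV48	UINT64_C(9)

/* Simulated hart: one SATP register and a capability flag. */
struct fake_hart {
	uint64_t satp;
	bool has_sv48;
};

/* Simulated CSR write: an unsupported mode leaves SATP untouched. */
static void
fake_csr_write_satp(struct fake_hart *h, uint64_t val)
{
	if ((val >> SATP_MODE_SHIFT) == SATP_MODE_SV48 && !h->has_sv48)
		return;
	h->satp = val;
}

/* Try SV48 with the given L0 page number; report the mode now in effect. */
static unsigned
probe_va_bits(struct fake_hart *h, uint64_t l0_ppn)
{
	fake_csr_write_satp(h, (SATP_MODE_SV48 << SATP_MODE_SHIFT) | l0_ppn);
	return ((h->satp >> SATP_MODE_SHIFT) == SATP_MODE_SV48 ? 48 : 39);
}
```

Reading the register back is the only detection step: if the mode field still says SV39, the caller keeps the existing three-level setup.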

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 31218f3209)
2022-03-14 10:45:49 -04:00
Mark Johnston
25b18f0a12 riscv: Add support for dynamically allocating L1 page table pages
This is required in SV48 mode.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 6ce716f7c3)
2022-03-14 10:45:47 -04:00
Mark Johnston
536a8230b4 riscv: Handle four-level page tables in various pmap traversal routines
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 1321117200)
2022-03-14 10:45:45 -04:00
Mark Johnston
79e0f742da riscv: Maintain the allpmaps list only in SV39 mode
When four-level page tables are used, there is no need to distribute
updates to the top-level page to all pmaps.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ceed61483c)
2022-03-14 10:45:43 -04:00
Mark Johnston
89b4150af5 riscv: Add pmap helper functions required by four-level page tables
No functional change intended.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 5cf3a8216e)
2022-03-14 10:45:40 -04:00
Mark Johnston
7e27397935 riscv: Try to improve the comments for locore's page table setup
No functional change intended.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 4337979236)
2022-03-14 10:45:38 -04:00
Mark Johnston
4bf624d7cf riscv: Conditionally modify the ELF64 sysentvec for SV48
A sysinit determines whether the pmap has enabled SV48 mode and modifies
the corresponding fields which describe the user memory map.

Reviewed by:	kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ecaf115434)
2022-03-14 10:45:36 -04:00
Mark Johnston
879863f7e1 riscv: Define a SV48 memory map
No functional change intended.

Reviewed by:	kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 35d0f443cf)
2022-03-14 10:45:34 -04:00
Mark Johnston
709db7b67f riscv: Add various pmap definitions needed to support SV48 mode
No functional change intended.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 59f192c507)
2022-03-14 10:45:32 -04:00
Mark Johnston
c4f8506809 riscv: Use generic CSR macros for writing SATP
Instead of having the one-off load_satp(), just use csr_write().  No
functional change intended.

Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 2e956c30ca)
2022-03-14 10:45:29 -04:00
Mark Johnston
7414b4f4b4 riscv: Rename struct pmap's pm_l1 field to pm_top
In SV48 mode, the top-level page will be an L0 page rather than an L1
page.  Rename the field accordingly.  No functional change intended.

Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 82f4e0d0f0)
2022-03-14 10:45:27 -04:00
Mark Johnston
40e9da8c6b rmlock: Add required compiler barriers to _rm_runlock()
Also remove excessive whitespace in _rm_rlock().

Reviewed by:	jah, mjg
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 89ae8eb74e)
2022-03-14 10:45:06 -04:00
Konstantin Belousov
dd54e44a27 buf_alloc(): Stop using LK_NOWAIT, use LK_NOWITNESS
Although the buffer is taken from the cache or the free list, it can
still be locked, because the 'lockless lookup' in getblkx() may operate
on freed buffers.  The lock is transient, but it prevents the use of
LK_NOWAIT there for the goal of neutralizing WITNESS.

Just use LK_NOWITNESS.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 1fb00c8f10)
2022-03-14 10:10:23 -04:00
Eugene Grosbein
80bab8aa7e linuxkpi: fix module build outside of kernel build environment
(cherry picked from commit f5a2e7b0e8)
2022-03-13 15:56:38 +07:00
Martin Matuska
bd2e56ef47 zfs: merge openzfs/zfs@ef83e07db (zfs-2.1-release) into stable/13
OpenZFS release 2.1.3

Notable upstream pull request merges:
  #12569 FreeBSD: Really zero the zero page
  #12828 FreeBSD: Add vop_standard_writecount_nomsync
  #12828 zfs: Fix a deadlock between page busy and the teardown lock
  #12828 FreeBSD: Catch up with more VFS changes
  #12851 FreeBSD: Provide correct file generation number
  #12857 Verify dRAID empty sectors
  #12874 FreeBSD: Update argument types for VOP_READDIR
  #12896 Reduce number of arc_prune threads
  #12934 FreeBSD: Fix zvol_*_open() locking
  #12961 FreeBSD: Fix leaked strings in libspl mnttab
  #12964 Fix handling of errors from dmu_write_uio_dbuf() on FreeBSD
  #12981 Introduce a flag to skip comparing the local mac when
         raw sending
  #12985 Avoid memory allocations in the ARC eviction thread
  #13014 Report dnodes with faulty bonuslen
  #13016 FreeBSD: Fix zvol_cdev_open locking
  #13027 Fix clearing set-uid and set-gid bits on a file when
         replaying a write
  #13031 Add enumerated vdev names to 'zpool iostat -v' and
         'zpool list -v'
  #13074 Enable encrypted raw sending to pools with greater ashift
  #13076 Receive checks should allow unencrypted child datasets
  #13098 Avoid dirtying the final TXGs when exporting a pool
  #13172 Fix ENOSPC when unlinking multiple files from full pool

Obtained from:	OpenZFS
OpenZFS commit:	ef83e07db5
OpenZFS tag:	zfs-2.1.3
Relnotes:	yes
2022-03-11 10:54:49 +01:00
Colin Percival
dd6c1475a6 Add support for getting early entropy from UEFI
UEFI provides a protocol for accessing randomness. This is a good way
to gather early entropy, especially when there's no driver for the RNG
on the platform (as is the case on the Marvell Armada8k (MACCHIATObin)
for now).

If the entropy_efi_seed option is enabled in loader.conf (default: YES)
obtain 2048 bytes of entropy from UEFI and pass it to the kernel as a
"module" of name "efi_rng_seed" and type "boot_entropy_platform"; if
present, ingest it into the kernel RNG.

Submitted by:	Greg V
Reviewed by:	markm, kevans
Approved by:	csprng (markm)
Differential Revision:	https://reviews.freebsd.org/D20780
2022-03-10 18:11:41 -08:00
Santiago Martinez
20ea94a9ec if_epair: fix build with RSS and INET or INET6 disabled
Reviewed by:	kp
MFC after:	1 week

(cherry picked from commit 52bcdc5b80)
2022-03-10 09:51:41 +01:00
Hans Petter Selasky
cea6dbdf1b Make sure the avr32dci_odevd structure is used.
This fixes a compilation error.

Sponsored by:	NVIDIA Networking

(cherry picked from commit 3f5054862a)
2022-03-10 09:29:22 +01:00