9295 commits

Author SHA1 Message Date
Ahmad Khalifa
b538d49110 Add a new sysctl in order to differentiate UEFI architectures
With the new 32-bit UEFI loader, it's convenient to have a sysctl to
figure out how we booted. It can be accessed at machdep.efi_arch.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1098
2024-09-20 08:45:09 -06:00
Konstantin Belousov
666303f598 sysarch: improve checks for max user address
making LA48 processes have the same limit as with the pre-LA57 kernels.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-17 02:02:14 +03:00
Konstantin Belousov
29a0a720c3 amd64 sysarch(2): style
Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-17 02:02:14 +03:00
Konstantin Belousov
e134cd9580 amd64: pml5 entries do not support PAT bits
Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-17 02:02:14 +03:00
Konstantin Belousov
4f82af24f1 amd64 pmap: do not set PG_G for usermode pmap pml5 kernel entry
Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-17 02:02:14 +03:00
Konstantin Belousov
bbb00b1719 pmap_bootstrap_la57(): reload IDT
after the trip through protected mode.  This is required by AMD64 ARM.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-17 02:02:14 +03:00
Konstantin Belousov
678bc2281c la57: do not set global bit for PML5 entry
The bit is reserved for PML5, causing #PF on KVA access on real
hardware, unlike QEMU.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:13:51 +03:00
Konstantin Belousov
280e50461a amd64 la57_trampoline: save registers in memory
AMD64 ARM states that the 64-bit part of the architectural state is
undefined after 32<->64 mode switching.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:12:25 +03:00
Konstantin Belousov
687b896f8e amd64 la57_trampoline: lgdt descriptor is always 10 bytes in long mode
Extend its storage to be compliant.
This is currently a no-op due to padding and the null GDT descriptor
right after the lgdt descriptor.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:11:54 +03:00
Konstantin Belousov
1be58e67eb amd64 la57_trampoline: turn off global pages and PCID before turning off paging
The SDM is explicit that having CR4.PCID=1 while toggling CR0.PG causes #GP.
To be safe and to avoid some more effects, also turn off CR4.PGE.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:10:16 +03:00
Konstantin Belousov
b7ea2b69ef amd64 la57_trampoline: disable EFER.LME around setting CR4.LA57
Changing paging mode while LME is set does not seem to be allowed.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:09:38 +03:00
Konstantin Belousov
9a49c98baf amd64 la57_trampoline: stop using %rdx to remember original %cr0
Store %cr0 in %ebp.  %rdx is needed for MSR access.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:09:20 +03:00
Konstantin Belousov
180c8ab079 amd64 la57_trampoline: jump immediately after re-enabling paging
Literally follow requirements from SDM and execute jmp right after
%cr0 CR0_PG bit is toggled back.

Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:09:03 +03:00
Konstantin Belousov
787259bfe5 amd64 pmap: flush whole TLB after LA57 trampoline is installed
Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:08:53 +03:00
Konstantin Belousov
2912c2fbd4 amd64 pmap: be more verbose around entering and leaving LA57 trampoline
Sponsored by:	Advanced Micro Devices (AMD)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-09-16 11:08:53 +03:00
Doug Moore
8aa2cd9d13 rangeset: speed up range traversal
For rangeset-next search, use exact search rather than greater-than search.

Move a bit of the testing logic from the pmap code to the common rangeset code.

Reviewed by:	kib (previous version)
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D46314
2024-09-09 16:50:14 -05:00
Wuyang Chung
5d889e60c1 amd64: move the right parenthesis to the right place
Reviewed by: imp, emaste
Pull Request: https://github.com/freebsd/freebsd-src/pull/1356
2024-09-06 12:34:31 -06:00
Mark Johnston
133a513ddc vmm: Make vmm_dev.h more self-contained
vmm.h is required for VM_MAX_SUFFIXLEN.  vmm_snapshot.h is required for
struct vm_snapshot_meta.

This is a prerequisite for including vmm_dev.h in the headers parsed by
libsysdecode.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D46485
2024-09-01 14:03:15 +00:00
Mark Johnston
a852dc580c vmm: Harmonize compat ioctl definitions
For compat ioctls and structures, we use a mix of suffixes: _old,
_fbsd<version>, _<version>.  Standardize on _<version> to make things
more consistent.  No functional change intended.

Reported by:	jhb
Reviewed by:	corvink, jhb
Differential Revision:	https://reviews.freebsd.org/D46449
2024-08-28 19:12:32 +00:00
Mark Johnston
e12b6aaf0d vmm: Move compat ioctl definitions to vmm_dev.c
There is no reason to keep them in vmm_dev.h.  No functional change
intended.

Reviewed by:	corvink, jhb
Differential Revision:	https://reviews.freebsd.org/D46432
2024-08-26 18:42:13 +00:00
Mark Johnston
b9ef152bec vmm: Merge vmm_dev.c
This file contains the vmm device file implementation.  Most of this
code is not machine-dependent and so shouldn't be duplicated this way.
Move most of it into a generic dev/vmm/vmm_dev.c.  This will make it
easier to introduce a cdev-based interface for VM creation, which in
turn makes it possible to implement support for running bhyve as an
unprivileged user.

Machine-dependent ioctls continue to be handled in machine-dependent
code.  To make the split a bit easier to handle, introduce a pair of
tables which define MI and MD ioctls.  Each table entry can set flags
which determine which locks need to be held in order to execute the
handler.  vmmdev_ioctl() now looks up the ioctl in one of the tables,
acquires locks and either handles the ioctl directly or calls
vmmdev_machdep_ioctl() to handle it.

No functional change intended.  There is a lot of churn in this change
but the underlying logic in the ioctl handlers is the same.  For now,
vmm_dev.h is still mostly separate, even though some parts could be
merged in principle.  This would involve changing include paths for
userspace, though.

Reviewed by:	corvink, jhb
Differential Revision:	https://reviews.freebsd.org/D46431
2024-08-26 18:41:39 +00:00
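The table-driven dispatch described in b9ef152bec can be pictured with a small sketch. This is not the code from dev/vmm/vmm_dev.c: every name below (sk_ioctl_entry, the SK_LOCK_* flags, sk_dispatch) and the command numbers are hypothetical, chosen only to show how one lookup can both classify an ioctl as MI or MD and report which locks its handler would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock-requirement flags; not the names used in vmm_dev.c. */
#define	SK_LOCK_NONE		0x0
#define	SK_LOCK_ONE_VCPU	0x1
#define	SK_LOCK_ALL_VCPUS	0x2

struct sk_ioctl_entry {
	unsigned long	cmd;	/* ioctl command number */
	int		flags;	/* locks the handler needs */
};

/* Made-up command numbers standing in for MI and MD ioctls. */
static const struct sk_ioctl_entry sk_mi_ioctls[] = {
	{ 1, SK_LOCK_ONE_VCPU },
	{ 2, SK_LOCK_ALL_VCPUS },
};
static const struct sk_ioctl_entry sk_md_ioctls[] = {
	{ 100, SK_LOCK_NONE },
};

#define	SK_NITEMS(a)	(sizeof(a) / sizeof((a)[0]))

/* Linear search of one table; returns NULL when cmd is not listed. */
static const struct sk_ioctl_entry *
sk_lookup(const struct sk_ioctl_entry *tbl, size_t n, unsigned long cmd)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].cmd == cmd)
			return (&tbl[i]);
	}
	return (NULL);
}

/*
 * Returns 0 when the command is in the MI table, 1 when it is in the MD
 * table, and -1 when it is unknown.  On success *flagsp receives the lock
 * flags; a real handler would acquire those locks before dispatching.
 */
static int
sk_dispatch(unsigned long cmd, int *flagsp)
{
	const struct sk_ioctl_entry *e;
	int md = 0;

	assert(flagsp != NULL);
	e = sk_lookup(sk_mi_ioctls, SK_NITEMS(sk_mi_ioctls), cmd);
	if (e == NULL) {
		e = sk_lookup(sk_md_ioctls, SK_NITEMS(sk_md_ioctls), cmd);
		md = 1;
	}
	if (e == NULL)
		return (-1);
	*flagsp = e->flags;
	return (md);
}
```

Keeping the lock requirements in the table means the locking policy sits in one place instead of being repeated in each handler, which is presumably part of what makes the MI/MD split easier to manage.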
Mark Johnston
3df92c9728 vmm: Enable assertions in vmmdev_lookup()
The comment has been there since the initial import of the vmm code
and presumably reflected some kind of problem with standalone builds of
vmm.ko.  However, I don't see any problems with it, and mtx_assert() is
used elsewhere within the vmm code.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D46438
2024-08-26 18:41:23 +00:00
Mark Johnston
93e81baa1c vmm: Move duplicated stats code into a generic file
There is a small difference between the arm64 and amd64 implementations:
the latter makes use of a "scope" to exclude AMD-specific stats on Intel
systems and vice-versa.  Replace this with a more generic predicate
callback which can be used for the same purpose.

No functional change intended.

Reviewed by:	corvink, jhb
Differential Revision:	https://reviews.freebsd.org/D46430
2024-08-26 18:41:14 +00:00
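A rough illustration of the predicate callback mentioned in 93e81baa1c follows. It is a sketch under assumptions, not the vmm stats code: the struct layout, the sk_ prefix, and the sk_host_is_intel flag are all hypothetical. The idea shown is only that each statistic may carry a function deciding whether it applies on the running host, which can express the old vendor "scope" as one special case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_stat_type;
typedef bool (*sk_stat_pred_t)(const struct sk_stat_type *);

struct sk_stat_type {
	const char	*desc;
	sk_stat_pred_t	 pred;	/* NULL means the stat always applies */
};

/* Hypothetical stand-in for a CPU vendor check. */
static bool sk_host_is_intel = true;

static bool
sk_pred_intel(const struct sk_stat_type *st)
{
	(void)st;
	return (sk_host_is_intel);
}

static bool
sk_pred_amd(const struct sk_stat_type *st)
{
	(void)st;
	return (!sk_host_is_intel);
}

/* Illustrative statistics; the descriptions are invented. */
static const struct sk_stat_type sk_stats[] = {
	{ "generic exits",	NULL },
	{ "intel-only stat A",	sk_pred_intel },
	{ "intel-only stat B",	sk_pred_intel },
	{ "amd-only stat",	sk_pred_amd },
};

/* Count the statistics whose predicate accepts the current host. */
static size_t
sk_count_visible(const struct sk_stat_type *stats, size_t n)
{
	size_t visible = 0;

	for (size_t i = 0; i < n; i++) {
		if (stats[i].pred == NULL || stats[i].pred(&stats[i]))
			visible++;
	}
	return (visible);
}
```

A callback is more general than a fixed scope enum because a platform without any vendor split can simply leave every predicate NULL.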
Mark Johnston
3ccb02334b vmm: Move vmm_ktr.h to a common directory
No functional change intended.

Reviewed by:	corvink, jhb, emaste
Differential Revision:	https://reviews.freebsd.org/D46429
2024-08-26 18:41:05 +00:00
John Baldwin
776cd02b89 vmm ppt: Enable busmastering and BAR decoding while a device is assigned
Reviewed by:	corvink, markj
Fixes:		f44ff2aba2 bhyve: Treat the COMMAND register for PCI passthru devices as emulated
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D46245
2024-08-22 14:40:48 -04:00
Konstantin Belousov
47656cc1ef amd64: use INVLPGB for kernel pmap invalidations
avoiding broadcast IPIs.

Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D45191
2024-08-21 19:35:15 +03:00
Konstantin Belousov
bc4ffcadf2 amd64: add variables indicating INVLPGB works
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D45191
2024-08-21 19:35:07 +03:00
Konstantin Belousov
111c7fc2fe amd64: add convenience wrappers for INVLPGB and TLBSYNC
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D45191
2024-08-21 19:34:59 +03:00
Warner Losh
ce7fac64ba Revert "nvme: Separate total failures from I/O failures"
All kinds of crazy stuff was mixed into this commit. Revert
it and do it again.

This reverts commit d5507f9e43.

Sponsored by:		Netflix
2024-08-15 21:29:53 -06:00
Warner Losh
d5507f9e43 nvme: Separate total failures from I/O failures
When it's an I/O failure, we can still send admin commands. Separate out
the admin failures and flag them as such so that we can still send admin
commands on half-failed drives.

Fixes: 9229b3105d (nvme: Fail passthrough commands right away in failed state)
Sponsored by: Netflix
2024-08-15 20:22:18 -06:00
Bojan Novković
ddc09a10ea pmap_growkernel: Use VM_ALLOC_NOFREE when allocating pagetable pages
This patch modifies pmap_growkernel in all pmaps to use VM_ALLOC_NOFREE
when allocating new pagetable pages. This should help reduce long-term
fragmentation as these pages are never released after they are
allocated.

Differential Revision:	https://reviews.freebsd.org/D45998
Reviewed by:	alc, markj, kib, mhorne
Tested by:	alc
2024-07-30 17:38:24 +02:00
Mark Johnston
ba682f8b9b vm: Remove kernel stack swapping support, part 5
- Remove cpu_thread_swapin() and cpu_thread_swapout().

Tested by:	pho
Reviewed by:	alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D46116
2024-07-29 01:40:39 +00:00
Bjoern A. Zeeb
d1bdc2821f Deprecate contigfree(9) in favour of free(9)
As of 9e6544dd6e contigfree(9) is no longer
needed and should not be used anymore.  We leave a wrapper for 3rd party
code in at least 15.x but remove (almost) all other cases from the tree.

This leaves one use of contigfree(9) untouched; that was the original
trigger for 9e6544dd6e and is handled in D45813 (to be committed
separately later).

Sponsored by:	The FreeBSD Foundation
Reviewed by:	markj, kib
Tested by:	pho (10h stress test run)
Differential Revision: https://reviews.freebsd.org/D46099
2024-07-26 10:45:01 +00:00
Alan Cox
5b8c01d13a amd64 pmap: Optimize PKU lookups when creating superpage mappings
Modify pmap_pkru_same() to update the prototype PTE at the same time as
checking the address range.  This eliminates the need for calling
pmap_pkru_get() in addition to pmap_pkru_same().  pmap_pkru_same() was
already doing most of the work of pmap_pkru_get().

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D46135
2024-07-26 00:38:46 -05:00
Jessica Clarke
8415a654d0 Retire non-NEW_PCIB code and remove config option
All architectures enable NEW_PCIB in DEFAULTS (arm being the most recent
to do so in 121be55599 ("arm: Set NEW_PCIB in DEFAULTS rather than a
subset of kernel configs")), so it's time we removed the legacy code
that no longer sees much testing and has a significant maintenance
burden.

Reviewed by:	jhb, andrew, emaste
Differential Revision:	https://reviews.freebsd.org/D32954
2024-07-18 18:55:12 +01:00
Warner Losh
e9ac41698b Remove residual blank line at start of Makefile
This is a residual of the $FreeBSD$ removal.

MFC After: 3 days (though I'll just run the command on the branches)
Sponsored by: Netflix
2024-07-15 16:43:39 -06:00
John Baldwin
9cc06bf7aa amd64 GENERIC: Switch uart hints from "isa" to "acpi"
This causes these hints to be only used to wire device unit numbers
for serial ports enumerated by ACPI but will not create ISA device
nodes if ACPI doesn't enumerate them.  Note that IRQ hints are not
used for wiring so have been removed.

PR:		270707
Reported by:	aixdroix_OSS@protonmail.com, Michael Dexter
Reported by:	mfw_burn@pm.me, Hannes Hauswedell <h2+fbsdports@fsfe.org>
Reported by:	Matthias Lanter <freebsd@lanter-it.ch>
Reported by:	William Bulley <web@umich.edu>
Reviewed by:	imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D45945
2024-07-15 12:15:29 -07:00
John Baldwin
480cc750a2 amd64 GENERIC: Drop hints for fdc0 and ppc0
Modern x86 systems do not ship with ISA floppy disk controllers or LPT
ports.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D45946
2024-07-15 07:16:48 -07:00
Konstantin Belousov
10a6ae0ddf amd64 pmap_allocpte_nosleep(): stop testing tautological condition
PTI being enabled for a given pmap is equivalent to pm_ucr3 being valid,
which in turn is equivalent to the root userspace page table page
pm_pmltopu being allocated.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D45966
2024-07-14 06:22:45 +03:00
Konstantin Belousov
616dd88a2e amd64 pmap_allocpte_nosleep(): fix indent
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D45966
2024-07-14 06:22:45 +03:00
Konstantin Belousov
7a296a86d1 IPSEC_OFFLOAD: add the option to GENERIC on amd64 and arm64
Sponsored by:	NVIDIA networking
2024-07-12 07:27:59 +03:00
Emmanuel Vadot
13d00a43cb conf: Add usbhid and hidbus to GENERIC* kernel configs
Include the new unified HID stack by default in generic.
This will allow us to migrate to the multi-stack hkbd and hms instead of
relying on the older ukbd and ums which only work with USB.
To test those drivers, just add hw.usb.usbhid.enable=1 to loader.conf.

Differential Revision:	https://reviews.freebsd.org/D45658
Reviewed by:	emaste, imp, wulf (all older version)
Sponsored by:	Beckhoff Automation GmbH & Co. KG
2024-07-10 08:05:25 +02:00
Mitchell Horne
e7f849e25b amd64: use pc_is_full() helper function
This seems to have been missed in a few places. No functional change
intended.

Reviewed by:	kib
MFC after:	1 week
Fixes:	4d90a5afc5 ("sys: Consolidate common implementation details...")
Differential Revision:	https://reviews.freebsd.org/D45920
2024-07-09 15:46:16 -03:00
Mark Johnston
fa6cbe8d60 sdt: Use a multibyte nop for tracepoints on amd64
Differential Revision:	https://reviews.freebsd.org/D45666
2024-07-08 11:40:06 -04:00
Alan Cox
fb32ba6aa4 amd64/arm64: Eliminate unnecessary demotions in pmap_protect()
In pmap_protect(), when the mapping isn't changing, we don't need to
perform a superpage demotion, even though the requested change doesn't
cover the entire superpage.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D45886
2024-07-06 15:48:10 -05:00
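The reasoning in fb32ba6aa4 can be sketched as a small decision function. This is an illustration only, not the pmap_protect() code: the bit definitions and the function name are hypothetical stand-ins, and the real pmap handles many more cases. It shows the single point made above, namely that if applying the requested protection change leaves the superpage entry unchanged, no demotion is needed even when the request covers only part of the superpage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the writable and no-execute entry bits. */
#define	SK_PG_RW	(UINT64_C(1) << 1)
#define	SK_PG_NX	(UINT64_C(1) << 63)

/*
 * Decide whether a superpage mapping must be demoted to apply a
 * protection change.  "covers_whole" says whether the requested range
 * spans the entire superpage.
 */
static bool
sk_needs_demotion(uint64_t pde, bool remove_write, bool remove_exec,
    bool covers_whole)
{
	uint64_t npde = pde;

	if (remove_write)
		npde &= ~SK_PG_RW;
	if (remove_exec)
		npde |= SK_PG_NX;

	/* The mapping would not change, so there is nothing to split. */
	if (npde == pde)
		return (false);

	/* A change to the whole superpage can be applied in place. */
	return (!covers_whole);
}
```

For example, removing write access from half of an already read-only superpage returns false here, which is the demotion the commit describes as unnecessary.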
Ryan Libby
2730f42914 amd64 pcpu: fix clobbers, suppress warnings, and clean up
These changes mostly apply to the !__SEG_GS section, which is no longer
the normal compilation path.  They're made to be consistent with changes
to i386.

 - Add missing cc clobber to __PCPU_ADD (which is currently unused).
 - Allow the compiler the opportunity to marginally improve code
   generation from __PCPU_PTR by letting it figure out how to do the add
   (also removing the addition fixes a missing cc clobber).
 - Quiet gcc -Warray-bounds by using constant operands instead of bogus
   memory references.
 - Remove the struct __s __s temporaries, just cast through the type.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D45827
2024-07-03 08:36:31 -07:00
Doug Moore
5dbf886104 x86: use order_base_2
Use order_base_2 in place of expressions involving fls.

Reviewed by:	alc, markj
Differential Revision:	https://reviews.freebsd.org/D45536
2024-06-24 02:26:23 -05:00
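For readers unfamiliar with the helper named in 5dbf886104, the sketch below shows the kind of fls-based expression that an order_base_2-style helper can replace. It assumes the usual meaning of order_base_2(n), the smallest k with 2^k >= n; the sk_ functions are hypothetical user-space stand-ins, not the kernel's definitions, and the expressions actually replaced in the tree are not reproduced here.

```c
#include <assert.h>

/*
 * Find last set bit, 1-based; returns 0 for x == 0.  This mirrors the
 * usual fls() contract but is a plain loop for portability.
 */
static int
sk_fls(unsigned int x)
{
	int bit = 0;

	while (x != 0) {
		bit++;
		x >>= 1;
	}
	return (bit);
}

/* Smallest k such that (1 << k) >= n, for n >= 1. */
static int
sk_order_base_2(unsigned int n)
{
	assert(n >= 1);
	return (sk_fls(n - 1));
}
```

A named helper states the intent (round up to a power-of-two exponent) directly, whereas the fls(n - 1) form leaves the reader to work out the off-by-one reasoning.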
Ryan Libby
6095f4b04c amd64 kernel __storeload_barrier: quiet gcc -Warray-bounds
Use a constant input operand instead of an output operand to tell the
compiler about OFFSETOF_MONITORBUF.  If we tell it we are writing to
*(u_int *)OFFSETOF_MONITORBUF, it rightly complains, but we aren't.  The
memory clobber already covers the necessary semantics for the compiler.

Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D45694
2024-06-23 16:23:14 -07:00
Mark Johnston
ddf0ed09bd sdt: Implement SDT probes using hot-patching
The idea here is to avoid a memory access and conditional branch per
probe site.  Instead, the probe is represented by an "unreachable"
unconditional function call.  asm goto is used to store the address of
the probe site (represented by a no-op sled) and the address of the
function call into a tracepoint record.  Each SDT probe carries a list
of tracepoints.

When the probe is enabled, the no-op sled corresponding to each
tracepoint is overwritten with a jmp to the corresponding label.  The
implementation uses smp_rendezvous() to park all other CPUs while the
instruction is being overwritten, as this can't be done atomically in
general.  The compiler moves argument marshalling code and the
sdt_probe() function call out-of-line, i.e., to the end of the function.

Per gallatin@ in D43504, this approach has less overhead when probes are
disabled.  To make the implementation a bit simpler, I removed support
for probes with 7 arguments; nothing makes use of this except a
regression test case.  It could be re-added later if need be.

The approach taken in this patch enables some more improvements:
1. We can now automatically fill out the "function" field of SDT probe
   names.  The SDT macros let the programmer specify the function and
   module names, but this is really a bug and shouldn't have been
   allowed.  The intent was to be able to have the same probe in
   multiple functions and to let the user restrict which probes actually
   get enabled by specifying a function name or glob.
2. We can avoid branching on SDT_PROBES_ENABLED() by adding the ability
   to include blocks of code in the out-of-line path.  For example:

	if (SDT_PROBES_ENABLED()) {
		int reason = CLD_EXITED;

		if (WCOREDUMP(signo))
			reason = CLD_DUMPED;
		else if (WIFSIGNALED(signo))
			reason = CLD_KILLED;
		SDT_PROBE1(proc, , , exit, reason);
	}

could be written

	SDT_PROBE1_EXT(proc, , , exit, reason,
		int reason;

		reason = CLD_EXITED;
		if (WCOREDUMP(signo))
			reason = CLD_DUMPED;
		else if (WIFSIGNALED(signo))
			reason = CLD_KILLED;
	);

In the future I would like to use this mechanism more generally, e.g.,
to remove branches and marshalling code used by hwpmc, and generally to
make it easier to add new tracepoint consumers without having to add
more conditional branches to hot code paths.

Reviewed by:	Domagoj Stolfa, avg
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D44483
2024-06-19 16:57:41 -04:00
Mark Johnston
46bb2dca53 kasan: Increase the size of the bootstrap PTP reservation
We were undercounting in the case where the boot stack crosses a 2MB
boundary, resulting in a panic during locore execution.

MFC after:	1 week
Fixes:	756bc3adc5 ("kasan: Create a shadow for the bootstack prior to hammer_time()")
2024-06-16 13:33:13 -04:00
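The undercount described in 46bb2dca53 is an instance of a general counting pitfall, sketched below. This is not the kasan bootstrap code, and the commit does not spell out the exact formula involved; the function names and the sample addresses are hypothetical. The sketch only shows why an estimate based on size alone can be one short when a range straddles a 2MB boundary.

```c
#include <assert.h>
#include <stdint.h>

#define	SK_2M	(UINT64_C(2) * 1024 * 1024)

/* Number of 2MB-aligned regions touched by [start, start + size). */
static uint64_t
sk_regions_2m(uint64_t start, uint64_t size)
{
	uint64_t first, last;

	assert(size > 0);
	first = start / SK_2M;
	last = (start + size - 1) / SK_2M;
	return (last - first + 1);
}

/* Naive estimate that ignores where the range starts. */
static uint64_t
sk_naive_regions_2m(uint64_t size)
{
	return ((size + SK_2M - 1) / SK_2M);
}
```

With a 16KB range starting 8KB below a 2MB boundary, the naive estimate is 1 while the range actually touches 2 regions, so any reservation sized from the naive figure would come up short.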