Commit graph

8474 commits

Author SHA1 Message Date
Ed Maste
8dc330532b Correct "Fondation" typo (missing "u")
(cherry picked from commit 54399caa2f)
2021-09-04 01:26:23 +08:00
Ka Ho Ng
dee7519333 vmm: Fix wrong assert in ivhd_dev_add_entry
The correct condition is to check that the number of ivhd entries fits
into the array.

Reported by:	bz
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31514

(cherry picked from commit 179bc5729d)
2021-09-04 01:21:10 +08:00
Konstantin Belousov
048a9b2d37 amd64: correctly calculate KVA of the preloaded ucode blob
(cherry picked from commit 9939af1a16)
2021-09-03 04:08:35 +03:00
Konstantin Belousov
ebc0d29e14 amd64: remove lfence after swapgs on syscall entry
(cherry picked from commit 7aa47cace1)
2021-09-02 03:52:24 +03:00
Alan Cox
026f9acc38 pmap: Micro-optimize pmap_remove_pages() on amd64 and arm64
Reduce the live ranges for three variables so that they do not span the
call to PHYS_TO_VM_PAGE().  This enables the compiler to generate
slightly smaller machine code.

Reviewed by:	kib, markj

(cherry picked from commit d411b285bc)
2021-09-01 09:29:01 -04:00
Alan Cox
4b38880259 Clear the accessed bit when copying a managed superpage mapping
pmap_copy() is used to speculatively create mappings, so those mappings
should not have their accessed bit preset.

Reviewed by:	kib, markj

(cherry picked from commit 325ff93274)
2021-09-01 09:29:01 -04:00
Jason A. Harmening
bfff99f7d1 factor out PT page allocation/freeing
As follow-on work to e4b8deb222, move page table page
allocation and freeing into their own functions.  Use these
functions to provide separate kernel vs. user page table page
accounting, and to wrap common tasks such as management of
zero-filled page state.

Requested by:	markj, kib
Reviewed by:	kib

(cherry picked from commit c2460d7cfe)
2021-09-01 09:29:01 -04:00
Jason A. Harmening
166784e275 amd64 pmap: convert to counter(9), add PV and pagetable page counts
This change converts most of the counters in the amd64 pmap from
global atomics to scalable counter(9) counters.  Per discussion
with kib@, it also removes the handrolled per-CPU PCID save count
as it isn't considered generally useful.

The bulk of these counters remain guarded by PV_STATS, as it seems
unlikely that they will be useful outside of very specific debugging
scenarios.  However, this change does add two new counters that
are available without PV_STATS.  pt_page_count and pv_page_count
track the number of active page table pages and physical-to-virtual
list pages, respectively.  These will be useful in evaluating
the memory footprint of pmap structures under various workloads,
which will help to guide future changes in this area.

Reviewed by:	kib

(cherry picked from commit e4b8deb222)
2021-09-01 09:29:01 -04:00
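
The idea behind moving from global atomics to scalable counters in 166784e275 can be sketched in userland C as follows. This is only an illustration under stated assumptions: it does not use the kernel's counter(9) API, and the names x_counter, x_counter_add, x_counter_fetch and X_NSLOTS are hypothetical. Each CPU updates its own slot, so updates do not contend on one shared word, and a reader sums the slots.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define X_NSLOTS 4	/* hypothetical number of CPUs; one slot per CPU */

/* Hypothetical sharded counter; not the kernel's counter(9) type. */
struct x_counter {
	_Atomic uint64_t slot[X_NSLOTS];
};

/* Add 'inc' to the slot owned by 'cpu'; other slots are not touched. */
static void
x_counter_add(struct x_counter *c, unsigned int cpu, uint64_t inc)
{
	assert(cpu < X_NSLOTS);
	atomic_fetch_add_explicit(&c->slot[cpu], inc, memory_order_relaxed);
}

/* Sum all slots; the result is a snapshot, not a synchronized total. */
static uint64_t
x_counter_fetch(struct x_counter *c)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < X_NSLOTS; i++)
		sum += atomic_load_explicit(&c->slot[i], memory_order_relaxed);
	return (sum);
}
```

The trade-off in this sketch is that updates are cheap while reads walk every slot, which suits statistics that are updated often and read rarely.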
Cyril Zhang
2d4c599e7d vmm: Add credential to cdev object
Add a credential to the cdev object in sysctl_vmm_create(), then check
that we have the correct credentials in sysctl_vmm_destroy(). This
prevents a process in one jail from opening or destroying the /dev/vmm
file corresponding to a VM in a sibling jail.

Add regression tests.

Reviewed by:	jhb, markj
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit a85404906b)
2021-09-01 09:07:42 -04:00
Alan Cox
8024a900a4 amd64: Don't repeat unnecessary tests when cmpset fails
When a cmpset for removing the PG_RW bit in pmap_promote_pde() fails,
there is no need to repeat the alignment, PG_A, and PG_V tests just to
reload the PTE's value.  The only bit that we need be concerned with at
this point is PG_M.  Use fcmpset instead.

(cherry picked from commit 3687797618)
2021-08-31 15:09:24 -04:00
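
The cmpset-to-fcmpset change described in 8024a900a4 can be illustrated with C11 atomics, whose compare-exchange also writes the observed value back into the expected operand on failure. This is a minimal userland sketch, not the pmap code: the bit values X_RW and X_M and the function name are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the real PTE layout. */
#define X_RW	UINT64_C(0x002)
#define X_M	UINT64_C(0x040)

/*
 * Clear X_RW while the entry is writable but not modified.
 * Returns the value the entry is known to hold afterwards.
 */
static uint64_t
x_clear_rw_if_clean(_Atomic uint64_t *pte)
{
	uint64_t old;

	assert(pte != NULL);
	old = atomic_load(pte);
	while ((old & (X_M | X_RW)) == X_RW) {
		/*
		 * On failure the call stores the current value in 'old',
		 * so the loop re-tests only the bits it cares about and
		 * needs no separate reload.
		 */
		if (atomic_compare_exchange_weak(pte, &old, old & ~X_RW))
			old &= ~X_RW;
	}
	return (old);
}
```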
Alan Cox
1fe88bc851 amd64: Eliminate a redundant test from pmap_enter_object()
The call to pmap_allow_2m_x_page() in pmap_enter_object() is redundant.
Specifically, even without the call to pmap_allow_2m_x_page() in
pmap_enter_object(), pmap_allow_2m_x_page() is eventually called by
pmap_enter_pde(), so the outcome will be the same.  Essentially,
calling pmap_allow_2m_x_page() in pmap_enter_object() amounts to
"optimizing" for the unexpected case.

Reviewed by:	kib

(cherry picked from commit b7de535288)
2021-08-31 15:09:24 -04:00
Alan Cox
380c653c7d On a failed fcmpset don't pointlessly repeat tests
In a few places, on a failed compare-and-set, both the amd64 pmap and
the arm64 pmap repeat tests on bits that won't change state while the
pmap is locked.  Eliminate some of these unnecessary tests.

Reviewed by:	andrew, kib, markj

(cherry picked from commit e41fde3ed7)
2021-08-31 15:09:24 -04:00
Alan Cox
605e07a27e amd64: a simplification to pmap_remove_{all,write}
Eliminate some unnecessary unlocking and relocking when we have to retry
the operation to avoid deadlock.  (All of the other pmap functions that
iterate over a PV list already implemented retries without these same
unlocking and relocking operations.)

Reviewed by:	kib, markj

(cherry picked from commit 1a8bcf30f9)
2021-08-31 15:09:23 -04:00
Ka Ho Ng
6e2fc728d8 AMD-vi: Fortify IVHD device_identify process
- Use malloc(9) to allocate ivhd_hdrs list. The previous assumption
  that there are at most 10 IVHDs in a system is not true. A
  counterexample would be a system with 4 IOMMUs, where each IOMMU is
  related to IVHDs of type 10h, 11h and 40h in the ACPI IVRS table.
- Always scan through the whole ivhd_hdrs list to find IVHDs that have
  the same DeviceId but a lower-priority IVHD type.

Sponsored by:	The FreeBSD Foundation
MFC with:	74ada297e8
Reviewed by:	grehan
Approved by:	lwhsu (mentor)
Differential Revision:	https://reviews.freebsd.org/D29525

(cherry picked from commit 6fe60f1d5c)
2021-08-27 21:05:58 +08:00
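
The two points in 6e2fc728d8 can be sketched in plain C: size the list from the actual number of headers instead of a fixed bound of 10, and compare each header against every entry kept so far. This is an illustrative sketch only. The struct, the function name, and the rule that a numerically larger type is preferred are assumptions made for the example, not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an IVHD header. */
struct x_ivhd {
	int	type;		/* e.g. 0x10, 0x11, 0x40 */
	int	device_id;
};

/*
 * Return a heap array holding one header per device_id, keeping the
 * entry with the largest type (assumed here to be the preferred one).
 * The caller frees the result; *out_n receives the number of entries.
 */
static struct x_ivhd *
x_pick_preferred(const struct x_ivhd *in, size_t n, size_t *out_n)
{
	struct x_ivhd *out;
	size_t i, j, cnt = 0;

	assert(out_n != NULL);
	*out_n = 0;
	if (n == 0)
		return (NULL);
	/* Sized from the input, so 4 IOMMUs x 3 types = 12 headers fit. */
	out = malloc(n * sizeof(*out));
	if (out == NULL)
		return (NULL);
	for (i = 0; i < n; i++) {
		/* Scan the whole list kept so far, not only the last entry. */
		for (j = 0; j < cnt; j++) {
			if (out[j].device_id == in[i].device_id)
				break;
		}
		if (j == cnt)
			out[cnt++] = in[i];
		else if (in[i].type > out[j].type)
			out[j] = in[i];
	}
	*out_n = cnt;
	return (out);
}
```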
Ka Ho Ng
877ba067c0 vmm: Bump vmname buffer in struct vm to VM_MAX_NAMELEN + 1
In hw.vmm.create sysctl handler the maximum length of vm name is
VM_MAX_NAMELEN. However in vm_create() the maximum length allowed is
only VM_MAX_NAMELEN - 1 chars. Bump the length of the internal buffer to
allow the length of VM_MAX_NAMELEN for vm name.

Reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31372

(cherry picked from commit df95cc76af)
2021-08-27 16:52:49 +08:00
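
The off-by-one addressed in 877ba067c0 comes down to reserving room for the terminating NUL. A minimal sketch, assuming a hypothetical limit X_MAX_NAMELEN and hypothetical names x_vm and x_vm_set_name (the real struct vm and vm_create() are not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define X_MAX_NAMELEN 8	/* hypothetical stand-in for VM_MAX_NAMELEN */

struct x_vm {
	/* X_MAX_NAMELEN characters plus one byte for the NUL. */
	char	name[X_MAX_NAMELEN + 1];
};

/* Store 'name' if it has between 1 and X_MAX_NAMELEN characters. */
static bool
x_vm_set_name(struct x_vm *vm, const char *name)
{
	size_t len;

	assert(vm != NULL && name != NULL);
	len = strlen(name);
	if (len == 0 || len > X_MAX_NAMELEN)
		return (false);
	memcpy(vm->name, name, len + 1);	/* copies the NUL as well */
	return (true);
}
```

With a buffer of only X_MAX_NAMELEN bytes, a name of exactly X_MAX_NAMELEN characters would have to be rejected or truncated, which is the mismatch the commit message describes.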
Konstantin Belousov
89423483e8 amd64 pmap_vm_page_alloc_check(): loosen the assert
(cherry picked from commit 665895db26)
2021-08-24 02:21:14 +03:00
Konstantin Belousov
9452209154 amd64 pmap_vm_page_alloc_check(): print more data for failed assert
(cherry picked from commit 1a55a3a729)
2021-08-24 02:21:14 +03:00
Konstantin Belousov
a686d177a7 Add pmap_vm_page_alloc_check()
(cherry picked from commit 041b7317f7)
2021-08-24 02:21:13 +03:00
Konstantin Belousov
8ca493ffb4 amd64: do not assume that kernel is loaded at 2M physical
(cherry picked from commit e18380e341)
2021-08-24 02:21:13 +03:00
Konstantin Belousov
a8d453eec1 amd64: stop doing special allocation for the AP startup trampoline
(cherry picked from commit b27fe1c3ba)
2021-08-24 02:21:13 +03:00
Konstantin Belousov
c946f69985 amd64: rework AP startup
(cherry picked from commit d6717f8778)
2021-08-24 02:21:12 +03:00
Mark Johnston
ed03d05824 amd64: Fix output operand specs for the stmxcsr and vmread intrinsics
This does not appear to affect code generation, at least with the
default toolchain.

Noticed because incorrect output specifications lead to false positives
from KMSAN, as the instrumentation uses them to update shadow state for
output operands.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit e54ae8258d)
2021-08-16 09:01:29 -04:00
Mark Johnston
034eea1ee5 vmm: Make iommu ops tables const
While here, use designated initializers and rename some AMD iommu method
implementations to match the corresponding op names.  No functional
change intended.

Reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 41335c6b7f)
2021-08-16 09:01:20 -04:00
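
The pattern named in 034eea1ee5, a const ops table filled in with designated initializers and implementations named after the ops, looks roughly like this in C. The struct layout and every x_ name below are hypothetical and chosen for the example; they are not the vmm iommu interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; the member names are illustrative only. */
struct x_iommu_ops {
	int	(*init)(void);
	void	(*cleanup)(void);
	int	(*add_device)(int rid);
};

static int x_ndevs;	/* number of devices added so far */

/* Implementations are named after the op they provide. */
static int
x_amd_init(void)
{
	x_ndevs = 0;
	return (0);
}

static void
x_amd_cleanup(void)
{
	x_ndevs = 0;
}

static int
x_amd_add_device(int rid)
{
	(void)rid;
	return (++x_ndevs);
}

/* const: the table cannot be modified after initialization. */
static const struct x_iommu_ops x_amd_ops = {
	.init = x_amd_init,
	.cleanup = x_amd_cleanup,
	.add_device = x_amd_add_device,
};

/* Dispatch through whichever table the caller was given. */
static int
x_iommu_add_device(const struct x_iommu_ops *ops, int rid)
{
	assert(ops != NULL && ops->add_device != NULL);
	return (ops->add_device(rid));
}
```

Designated initializers keep each function tied to its member by name, so reordering the struct members does not silently bind an op to the wrong implementation.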
Peter Grehan
d7388d33b4 MFC 517904de5c: igc(4): Introduce new driver for the Intel I225 Ethernet controller.
This controller supports 2.5G/1G/100MB/10MB speeds, and allows
tx/rx checksum offload, TSO, LRO, and multi-queue operation.

The driver was derived from code contributed by Intel, and modified
by Netgate to fit into the iflib framework.

Thanks to Mike Karels for testing and feedback on the driver.

Reviewed by:	bcr (manpages), kbowling, scottl, erj
Relnotes:	yes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30668

(cherry picked from commit 517904de5c)
2021-08-15 20:33:54 +10:00
Mark Johnston
7f39284c27 amd64: Set MSR_KGSBASE to 0 during AP startup
There is no reason to initialize it to anything else, and this matches
initialization of the BSP.  No functional change intended.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit e153745083)
2021-08-12 09:43:31 -04:00
Mark Johnston
b8adacf39a amd64: Set GS.base before calling init_secondary() on APs
KMSAN instrumentation requires thread-local storage to track
initialization state for function parameters and return values.  This
buffer is accessed as part of each function prologue.  It is provided by
the KMSAN runtime, which looks up a pointer in the current thread's
structure.

When KMSAN is configured, init_secondary() is instrumented, but this
means that GS.base must be initialized first, otherwise the runtime
cannot safely access curthread.  Work around this by loading GS.base
before calling init_secondary(), so that the runtime can at least check
curthread == NULL and return a pointer to some dummy storage.  Note that
init_secondary() still must reload GS.base after calling lgdt(), which
loads a selector into %gs, which in turn clears the base register.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 4b136ef259)
2021-08-12 09:43:21 -04:00
Konstantin Belousov
6ba7789189 amd64: add pmap_alloc_page_below_4g()
(cherry picked from commit c8bae074d9)
2021-08-03 12:52:37 +03:00
Konstantin Belousov
5a84640fcf amd64 pti init: fix calculation of the kernel text start
(cherry picked from commit 34516d4ad1)
2021-08-03 12:52:37 +03:00
Konstantin Belousov
2c7315c09a amd64: do not touch low memory in AP startup unless we used legacy boot
(cherry picked from commit 2572376f7f)
2021-08-03 12:52:36 +03:00
Konstantin Belousov
17332276a6 amd64: do not touch low memory in AP startup unless we used legacy boot
(cherry picked from commit 48216088b1)
2021-08-03 12:52:36 +03:00
Konstantin Belousov
21049f0567 amd64: make efi_boot global
(cherry picked from commit 6a3821369f)
2021-08-03 12:52:36 +03:00
Konstantin Belousov
39f259b1d5 Do not call FreeBSD-ABI specific code for all ABIs
(cherry picked from commit 28a66fc3da)
2021-07-22 01:11:52 +03:00
Ka Ho Ng
fc661f1903 vmm: Fix AMD-vi using wrong rid range
The ACPI parsing code around the rid range wrongly assumed that there is
only one pair of start/end device ids. Besides, ivhd_dev_parse()
never worked as intended: the start/end rid info was always zero.

Restructure the code to build dynamic-sized tables for each IOMMU softc
holding device entries. The device entries are enumerated to find a
suitable IOMMU unit. Operations on devices not governed (e.g. the IOMMU
unit itself) are no-ops from now on. There is also a minor fix for wrong
%b format string usage.

Tested on my EPYC 7282.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	grehan
Differential Revision:	https://reviews.freebsd.org/D30827

(cherry picked from commit b5c74dfd64)
2021-07-21 23:06:35 +08:00
Konstantin Belousov
fec5a70e1f x86: use ANSI C definition style for trap_fatal
PR:	257062

(cherry picked from commit 55e63ed307)
2021-07-17 13:52:04 +03:00
Konstantin Belousov
4b52622de7 amd64 pmap: unexpand the NBPDR macro definition
(cherry picked from commit fdc71fa112)
2021-07-17 13:51:59 +03:00
Konstantin Belousov
dc5511d75d amd64 locore.S: trim .globl list from symbols gone for long time
(cherry picked from commit 9dc715230c)
2021-07-17 13:51:55 +03:00
Konstantin Belousov
48619df1d2 amd64 mpboot.S: fix typo in comment
(cherry picked from commit 71463a34ab)
2021-07-17 13:51:50 +03:00
Konstantin Belousov
0ecd3cde77 amd64 locore.S: add FF copyright for LA57 work
(cherry picked from commit 63664df720)
2021-07-17 13:51:45 +03:00
Helge Oldach
864b57281a MINIMAL: remove debugging and some loadable network modules
Remove debugging stuff, since it's arguably not needed in a minimal
setup. Also vlan, tuntap and gif since they can be loaded.

imp didn't include the part of the patch that removed xen guest support.
Xen guest is relatively small and has no way of being loaded.

Reviewed by:	imp
PR:		229564
MFC After:	3 days

(cherry picked from commit b21f19c9e0)
2021-07-16 12:28:44 -06:00
Ka Ho Ng
67d02e1301 vmm: Fix ivrs_drv device_printf usage
The original %b description string is wrong.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	imp, jhb
Differential Revision:	https://reviews.freebsd.org/D30805

(cherry picked from commit 210e6aec4f)
2021-07-14 01:58:56 +08:00
Konstantin Belousov
99c12760d0 amd64: do not touch BIOS reset flag halfword, unless we boot through BIOS
(cherry picked from commit 33e1287b6a)
2021-06-30 07:42:13 +03:00
Mateusz Guzik
9aee734554 amd64: typo fix: memcmpy -> memcmp in a comment
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 9a8e4527f0)
2021-06-26 16:27:06 +00:00
Konstantin Belousov
d16b938e73 amd64 efirt: initialize vm_pages backing EFI runtime memory
(cherry picked from commit 0247c33e89)
2021-06-24 05:20:41 +03:00
Konstantin Belousov
52d8029e93 Add quirks for Linux ABI signals handling
(cherry picked from commit 870e197d52)
2021-06-22 04:45:32 +03:00
Mark Johnston
4a77ce73ea amd64: Fix propagation of LDT updates
When a process has used sysarch(2) to specify descriptors for its
private LDT, upon rfork(RFMEM) descriptors are copied into the new child
process.  Any updates to the descriptors are thus reflected to all other
processes sharing the vmspace.  However, this is incorrect in the rather
obscure case where the child process was created before the LDT was
modified.  Fix this by only modifying other processes which already
share the LDT.

Reported by:	syzkaller
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 70dd5eebc0)
2021-06-21 09:13:20 -04:00
Mark Johnston
85a55e0c5e vmm: Let guests enable SMEP/SMAP if the host supports it
Reviewed by:	kib, grehan, jhb
Tested by:	grehan (AMD)
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 4c599db71a)
2021-06-16 10:03:12 -04:00
Andrew Turner
ade8b810b0 Create VM_MEMATTR_DEVICE on all architectures
This is intended to be used with memory mapped IO, e.g. from
bus_space_map with no flags, or pmap_mapdev.

Use this new memory type in the map request configured by
resource_init_map_request, and in pciconf.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29692

(cherry picked from commit 5d2d599d3f)
2021-06-13 16:49:36 +01:00
Konstantin Belousov
dc107fe1f9 linuxolator: Add compat.linux.setid_allowed knob
PR:	21463

(cherry picked from commit 598f6fb49c)
2021-06-13 04:22:33 +03:00
Mark Johnston
cb5fe9aa9f amd64: Clear the local TSS when creating a new thread
Otherwise it is copied from the creating thread.  Then, if either thread
exits, the other is left with a dangling pointer, typically resulting in
a page fault upon the next context switch.

Reported by:	syzkaller
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 8cd05b8833)
2021-06-08 09:33:59 -04:00
Mark Johnston
2a87d7c013 amd64: Relax the assertion added in commit 4a59cbc12
We only need to ensure that interrupts are disabled when handling a
fault from iret.  Otherwise it's possible to trigger the assertion
legitimately, e.g., by copying in from an invalid address.

Fixes:		4a59cbc12
Reported by:	pho
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 6cda627556)
2021-06-06 21:02:38 -04:00