Commit graph

2101 commits

Author SHA1 Message Date
Mark Johnston
eb62bd1d38 x86: Deduplicate clock.h
The headers were mostly identical on amd64 and i386.

No functional change intended.

Reviewed by:	cperciva, mav, imp, kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f06f1d1fdb)
2021-12-27 10:48:00 -05:00
Mark Johnston
55e020a6f9 amd64: Reduce the amount of cpuset copying done for TLB shootdowns
We use pmap_invalidate_cpu_mask() to get the set of active CPUs.  This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers.  Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ab12e8db29)
2021-12-15 08:31:48 -05:00
Mitchell Horne
a6172e6353 amd64: provide PHYS_IN_DMAP() and VIRT_IN_DMAP()
It is useful for quickly checking an address against the DMAP region.
These definitions exist already on arm64 and riscv.

Reviewed by:	kib, markj
MFC after:	3 days
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D32962

(cherry picked from commit 90d4da6225)
2021-12-03 10:02:02 -04:00
Mark Johnston
bc7bc3bdf9 netinet: Remove in_cksum_update()
It was never implemented on powerpc or riscv and appears to have been
unused since it was added in 1998.  No functional change intended.

Reviewed by:	kp, glebius, cy
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 09100f936b)
2021-12-01 07:41:55 -05:00
Mark Johnston
e0cee46f58 amd64: Initialize kernel_pmap's active CPU set to all_cpus
This is in preference to simply filling the cpuset, and allows the
conditional in pmap_invalidate_cpu_mask() to be elided.

Also export pmap_invalidate_cpu_mask() outside of pmap.c for use in a
subsequent commit.

Suggested by:	kib
Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 71e6e9da22)
2021-11-29 09:17:08 -05:00
Mark Johnston
36ca4b79b8 amd64: Define KVA regions for KMSAN shadow maps
KMSAN requires two shadow maps, each one-to-one with the kernel map.
Allocate regions of the kernels PML4 page for them.  Add functions to
create mappings in the shadow map regions, these will be used by the
KMSAN runtime.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f95f780ea4)
2021-11-02 18:17:58 -04:00
Mark Johnston
bf0986b742 Generalize bus_space(9) and atomic(9) sanitizer interceptors
Make it easy to define interceptors for new sanitizer runtimes, rather
than assuming KCSAN.  Lay a bit of groundwork for KASAN and KMSAN.

When a sanitizer is compiled in, atomic(9) and bus_space(9) definitions
in atomic_san.h are used by default instead of the inline
implementations in the platform's atomic.h.  These definitions are
implemented in the sanitizer runtime, which includes
machine/{atomic,bus}.h with SAN_RUNTIME defined to pull in the actual
implementations.

No functional change intended.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 3ead60236f)
2021-11-01 10:16:39 -04:00
Mark Johnston
65f3c07b98 amd64: Add MD bits for KASAN
- Initialize KASAN before executing SYSINITs.
- Add a GENERIC-KASAN kernel config, akin to GENERIC-KCSAN.
- Increase the kernel stack size if KASAN is enabled.  Some of the
  ASAN instrumentation increases stack usage and it's enough to
  trigger stack overflows in ZFS.
- Mark the trapframe as valid in interrupt handlers if it is
  assigned to td_intr_frame.  Otherwise, an interrupt in a function
  which creates a poisoned alloca region can trigger false positives.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f115c06121)
2021-11-01 10:02:41 -04:00
Mark Johnston
a3d4c8e21d amd64: Implement a KASAN shadow map
The idea behind KASAN is to use a region of memory to track the validity
of buffers in the kernel map.  This region is the shadow map.  The
compiler inserts calls to the KASAN runtime for every emitted load
and store, and the runtime uses the shadow map to decide whether the
access is valid.  Various kernel allocators call kasan_mark() to update
the shadow map.

Since the shadow map tracks only accesses to the kernel map, accesses to
other kernel maps are not validated by KASAN.  UMA_MD_SMALL_ALLOC is
disabled when KASAN is configured to reduce usage of the direct map.
Currently we have no mechanism to completely eliminate uses of the
direct map, so KASAN's coverage is not comprehensive.

The shadow map uses one byte per eight bytes in the kernel map.  In
pmap_bootstrap() we create an initial set of page tables for the kernel
and preloaded data.

When pmap_growkernel() is called, we call kasan_shadow_map() to extend
the shadow map.  kasan_shadow_map() uses pmap_kasan_enter() to allocate
memory for the shadow region and map it.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29417

(cherry picked from commit 6faf45b34b)
2021-11-01 09:57:30 -04:00
Konstantin Belousov
56c136a0d5 amd64: add pmap_page_set_memattr_noflush()
(cherry picked from commit 33c17670af)
2021-10-10 12:21:18 +03:00
Konstantin Belousov
a53e3f3bf2 Revert "linux32: add a hack to avoid redefining the type of the savefpu tag"
(cherry picked from commit e36d0e86e3)
2021-10-10 12:21:17 +03:00
Konstantin Belousov
3804d1ff77 amd64: eliminate td_md.md_fpu_scratch
(cherry picked from commit bd9e0f5df6)
2021-10-10 12:21:17 +03:00
Konstantin Belousov
4d0d7ea9e7 amd64: stop using top of the thread' kernel stack for FPU user save area
MFC note: this commit changes layout of td_md for amd64, resulting in
static checks for struct thread ABI in kern_thread.c to fail.  Next
two commits restore the layout, I decided to not overcomplicate the
merge and not do the work that is going to be overwritten immediately.

(cherry picked from commit df8dd6025a)
2021-10-10 12:21:17 +03:00
Konstantin Belousov
61b29720ef amd64: centralize definitions of CS_SECURE and EFL_SECURE
(cherry picked from commit a42d362bb5)
2021-10-10 12:21:17 +03:00
Adam Fenn
9ba8ea3f05 x86: cpufunc: Add rdtsc_ordered()
(cherry picked from commit 652ae7b114)
2021-10-10 12:21:16 +03:00
Adam Fenn
60490b72ba x86: cpufunc: Add rdtscp_aux()
(cherry picked from commit 908e277230)
2021-10-10 12:21:16 +03:00
Jason A. Harmening
166784e275 amd64 pmap: convert to counter(9), add PV and pagetable page counts
This change converts most of the counters in the amd64 pmap from
global atomics to scalable counter(9) counters.  Per discussion
with kib@, it also removes the handrolled per-CPU PCID save count
as it isn't considered generally useful.

The bulk of these counters remain guarded by PV_STATS, as it seems
unlikely that they will be useful outside of very specific debugging
scenarios.  However, this change does add two new counters that
are available without PV_STATS.  pt_page_count and pv_page_count
track the number of active physical-to-virtual list pages and page
table pages, respectively.  These will be useful in evaluating
the memory footprint of pmap structures under various workloads,
which will help to guide future changes in this area.

Reviewed by:	kib

(cherry picked from commit e4b8deb222)
2021-09-01 09:29:01 -04:00
Konstantin Belousov
89423483e8 amd64 pmap_vm_page_alloc_check(): loose the assert
(cherry picked from commit 665895db26)
2021-08-24 02:21:14 +03:00
Konstantin Belousov
9452209154 amd64 pmap_vm_page_alloc_check(): print more data for failed assert
(cherry picked from commit 1a55a3a729)
2021-08-24 02:21:14 +03:00
Konstantin Belousov
a686d177a7 Add pmap_vm_page_alloc_check()
(cherry picked from commit 041b7317f7)
2021-08-24 02:21:13 +03:00
Konstantin Belousov
8ca493ffb4 amd64: do not assume that kernel is loaded at 2M physical
(cherry picked from commit e18380e341)
2021-08-24 02:21:13 +03:00
Konstantin Belousov
c946f69985 amd64: rework AP startup
(cherry picked from commit d6717f8778)
2021-08-24 02:21:12 +03:00
Konstantin Belousov
6ba7789189 amd64: add pmap_alloc_page_below_4g()
(cherry picked from commit c8bae074d9)
2021-08-03 12:52:37 +03:00
Konstantin Belousov
21049f0567 amd64: make efi_boot global
(cherry picked from commit 6a3821369f)
2021-08-03 12:52:36 +03:00
Andrew Turner
ade8b810b0 Create VM_MEMATTR_DEVICE on all architectures
This is intended to be used with memory mapped IO, e.g. from
bus_space_map with no flags, or pmap_mapdev.

Use this new memory type in the map request configured by
resource_init_map_request, and in pciconf.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29692

(cherry picked from commit 5d2d599d3f)
2021-06-13 16:49:36 +01:00
Mitchell Horne
e21ed730a5 gdb: report specific stop reason for watchpoints
The remote protocol allows for implementations to report more specific
reasons for the break in execution back to the client [1]. This is
entirely optional, so it is only implemented for amd64, arm64, and i386
at the moment.

[1] https://sourceware.org/gdb/current/onlinedocs/gdb/Stop-Reply-Packets.html

Reviewed by:	jhb
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
NetApp PR:	51

(cherry picked from commit 7446b0888d)
2021-04-21 10:20:33 -03:00
Mitchell Horne
20e6d79f43 x86: implement kdb watchpoint functions
Add wrappers around the dbreg interface that can be consumed by MI
kernel debugger code. The dbreg functions themselves are updated to
return error codes, not just -1. dbreg_set_watchpoint() is extended to
accept access bits as an argument.

Reviewed by:	jhb, kib, markj
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.

(cherry picked from commit 15dc1d4452)
2021-04-21 10:20:33 -03:00
Mark Johnston
5e96fbdfc4 amd64: Make KPDPphys local to pmap.c
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 7ae2e70336)
2021-03-31 09:16:11 -04:00
D Scott Phillips
7590d7800c bhyve: support relocating fbuf and passthru data BARs
We want to allow the UEFI firmware to enumerate and assign
addresses to PCI devices so we can boot from NVMe[1]. Address
assignment of PCI BARs is properly handled by the PCI emulation
code in general, but a few specific cases need additional support.
fbuf and passthru map additional objects into the guest physical
address space and so need to handle address updates. Here we add a
callback to emulated PCI devices to inform them of a BAR
configuration change. fbuf and passthru then watch for these BAR
changes and relocate the frame buffer memory segment and passthru
device mmio area respectively.

We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls
to vmm(4) to facilitate the unmapping needed for addres updates.

[1]: https://github.com/freebsd/uefi-edk2/pull/9/

Originally by:	scottph
Sponsored by:	Intel Corporation
Reviewed by:	grehan
Approved by:	philip (mentor)
Differential Revision:	https://reviews.freebsd.org/D24066

(cherry picked from commit f8a6ec2d57)
2021-03-26 21:50:41 +08:00
Mark Johnston
980da033a8 Rename _cscan_atomic.h and _cscan_bus.h to atomic_san.h and bus_san.h
Other kernel sanitizers (KMSAN, KASAN) require interceptors as well, so
put these in a more generic place as a step towards importing the other
sanitizers.

No functional change intended.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29103

(cherry picked from commit 435c7cfb24)
2021-03-15 11:39:11 -04:00
Mateusz Guzik
6d3ce617f3 amd64: use compiler intrinsics for bsf* and bsr*
(cherry picked from commit aae89f6f09)
2021-02-04 18:01:08 +00:00
Mateusz Guzik
be03df57d6 amd64: retire sse2_pagezero
All page zeroing is using temporal stores with rep movs*, the routine is
unused for several years.

Should a need arise for zeroing using non-temporal stores, a more
optimized variant can be implemented with a more descriptive name.

(cherry picked from commit d1de5698df)
2021-02-01 12:39:18 +00:00
Mateusz Guzik
37bd3aa6fa amd64: use builtins for all ffs* variants
While here even up whitespace.
2021-01-14 14:37:22 +00:00
Roger Pau Monne
ed78016d00 xen/privcmd: implement the dm op ioctl
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.

This allows user-space to make use of the dm_op family of hypercalls,
which are used by device models.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:33:27 +01:00
Konstantin Belousov
45974de8fb x86: Add rdtscp32() into cpufunc.h.
Suggested by:	markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27986
2021-01-10 04:42:34 +02:00
Mitchell Horne
72939459bd amd64: use register macros for gdb_cpu_getreg()
Prefer these newly-added definitions to bare values.

MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
2020-12-18 16:16:03 +00:00
Mitchell Horne
0ef474de88 amd64: allow gdb(4) to write to most registers
Similar to the recent patch to arm's gdb stub in r368414, allow GDB to
update the contents of most general purpose registers.

Reviewed by:	cem, jhb, markj
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
NetApp PR:	44
Differential Revision:	https://reviews.freebsd.org/D27642
2020-12-18 16:09:24 +00:00
Peter Grehan
15add60d37 Convert vmm_ops calls to IFUNC
There is no need for these to be function pointers since they are
never modified post-module load.

Rename AMD/Intel ops to be more consistent.

Submitted by:	adam_fenn.io
Reviewed by:	markj, grehan
Approved by:	grehan (bhyve)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D27375
2020-11-28 01:16:59 +00:00
Maxim Sobolev
fd2ef8ef5a Unobfuscate "KERNLOAD" parameter on amd64. This change lines-up amd64 with the
i386 and the rest of supported architectures by defining KERNLOAD in the
vmparam.h and getting rid of magic constant in the linker script, which albeit
documented via comment but isn't programmatically accessible at a compile time.

Use KERNLOAD to eliminate another (matching) magic constant 100 lines down
inside unremarkable TU "copy.c" 3 levels deep in the EFI loader tree.

Reviewed by:	markj
Approved by:	markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D27355
2020-11-25 23:19:01 +00:00
John Baldwin
1925586e03 Honor the disabled setting for MSI-X interrupts for passthrough devices.
Add a new ioctl to disable all MSI-X interrupts for a PCI passthrough
device and invoke it if a write to the MSI-X capability registers
disables MSI-X.  This avoids leaving MSI-X interrupts enabled on the
host if a guest device driver has disabled them (e.g. as part of
detaching a guest device driver).

This was found by Chelsio QA when testing that a Linux guest could
switch from MSI-X to MSI interrupts when using the cxgb4vf driver.

While here, explicitly fail requests to enable MSI on a passthrough
device if MSI-X is enabled and vice versa.

Reported by:	Sony Arpita Das @ Chelsio
Reviewed by:	grehan, markj
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D27212
2020-11-24 23:18:52 +00:00
Mark Johnston
6f5a960678 vmm: Make pmap_invalidate_ept() wait synchronously for guest exits
Currently EPT TLB invalidation is done by incrementing a generation
counter and issuing an IPI to all CPUs currently running vCPU threads.
The VMM inner loop caches the most recently observed generation on each
host CPU and invalidates TLB entries before executing the VM if the
cached generation number is not the most recent value.
pmap_invalidate_ept() issues IPIs to force each vCPU to stop executing
guest instructions and reload the generation number.  However, it does
not actually wait for vCPUs to exit, potentially creating a window where
guests may continue to reference stale TLB entries.

Fix the problem by bracketing guest execution with an SMR read section
which is entered before loading the invalidation generation.  Then,
pmap_invalidate_ept() increments the current write sequence before
loading pm_active and sending IPIs, and polls readers to ensure that all
vCPUs potentially operating with stale TLB entries have exited before
pmap_invalidate_ept() returns.

Also ensure that unsynchronized loads of the generation counter are
wrapped with atomic(9), and stop (inconsistently) updating the
invalidation counter and pm_active bitmask with acquire semantics.

Reviewed by:	grehan, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26910
2020-11-11 15:01:17 +00:00
Mark Johnston
cff169880e amd64: Make it easier to configure exception stack sizes
The amd64 kernel handles certain types of exceptions on a dedicated
stack.  Currently the sizes of these stacks are all hard-coded to
PAGE_SIZE, but for at least NMI handling it can be useful to use larger
stacks.  Add constants to intr_machdep.h to make this easier to tweak.

No functional change intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D27076
2020-11-04 16:42:20 +00:00
Konstantin Belousov
546df7a45d amd64 pmap.h: explicitly provide constants values instead of relying
on some more advanced C features.

This fixes gcc-toolchain build of exception.S.

Reported and tested by:	kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-10-16 16:22:32 +00:00
Konstantin Belousov
e406235000 Fix for mis-interpretation of PCB_KERNFPU.
RIght now PCB_KERNFPU is used both as indication that kernel prepared
hardware FPU context to use and that the thread is fpu-kern
thread.  This also breaks fpu_kern_enter(FPU_KERN_NOCTX), since
fpu_kern_leave() then clears PCB_KERNFPU.

Introduce new flag PCB_KERNFPU_THR which indicates that the thread is
fpu-kern.  Do not clear PCB_KERNFPU if fpu-kern thread leaves noctx
fpu region.

Reported and tested by:	jhb (amd64)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25511
2020-10-14 23:01:41 +00:00
Konstantin Belousov
df01340989 amd64: Store full 64bit of FIP/FDP for 64bit processes when using XSAVE.
If current process is 64bit, use rex-prefixed version of XSAVE
(XSAVE64).  If current process is 32bit and CPU supports saving
segment registers cs/ds in the FPU save area, use non-prefixed variant
of XSAVE.

Reported and tested by:	Michał Górny <mgorny@mgorny@moritz.systems>
PR:	250043
Reviewed by:	emaste, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26643
2020-10-03 23:17:29 +00:00
Konstantin Belousov
5e8ea68fd8 Move ctx_switch_xsave declaration to amd64 md_var.h.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-10-03 23:07:09 +00:00
Edward Tomasz Napierala
1e2521ffae Get rid of sa->narg. It serves no purpose; use sa->callp->sy_narg instead.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26458
2020-09-27 18:47:06 +00:00
Mark Johnston
78257765f2 Add a vmparam.h constant indicating pmap support for large pages.
Enable SHM_LARGEPAGE support on arm64.

Reviewed by:	alc, kib
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26467
2020-09-23 19:34:21 +00:00
D Scott Phillips
00e6614750 Sparsify the vm_page_dump bitmap
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).

Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.

Reviewed by:	jhb
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26131
2020-09-21 22:21:59 +00:00
D Scott Phillips
ab041f713a Move vm_page_dump bitset array definition to MI code
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.

Reviewed by:	kib, markj
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26129
2020-09-21 22:20:37 +00:00