Commit graph

328 commits

Author SHA1 Message Date
Dmitry Chagin
6fddab804a amd64: Reload CPU ext features after resume or cr4 changes
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D35555
MFC after:		2 weeks

(cherry picked from commit 050f5a8405)
2022-07-13 14:48:49 +03:00
Dmitry Chagin
cca86d9418 Remove dead code.
is_physical_memory() dead since 235a54de.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35056
MFC after:		2 weeks

(cherry picked from commit fe2c9f83a6)
2022-06-17 22:34:10 +03:00
Dmitry Chagin
5a66ec2748 fork: Allow ABI to specify fork return values for child.
At least Linux x86 ABI's does not use carry bit and expects that the dx register
is preserved. For this add a new sv_set_fork_retval hook and call it from cpu_fork().

Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.

Reviewed by:            kib
Differential revision:  https://reviews.freebsd.org/D31472
MFC after:              2 weeks

(cherry picked from commit de8374df28)
2022-06-17 22:33:28 +03:00
John Baldwin
382ffc8786 x86: Add a NT_X86_SEGBASES register set.
This register set contains the values of the fsbase and gsbase
registers.  Note that these registers can already be controlled
individually via ptrace(2) via MD operations, so the main reason for
adding this is to include these register values in core dumps.  In
particular, this will enable looking up the value of TLS variables
from core dumps in gdb.

The value of NT_X86_SEGBASES was chosen to match the value of
NT_386_TLS on Linux.  The notes serve similar purposes, but FreeBSD
will never dump a note equivalent to NT_386_TLS (which dumps a single
segment descriptor rather than a pair of addresses) and picking a
currently-unused value in the NT_X86_* range could result in a future
conflict.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D34650

(cherry picked from commit 931983ee08)
2022-05-13 09:45:19 -07:00
Andrew Turner
e5e2c7ffa0 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830

(cherry picked from commit b792434150)
2022-05-12 15:12:59 -07:00
John Baldwin
5159d50a1e Simplify swi for bus_dma.
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages.  For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm).  However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi.  I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux.  However, in the current scheme there's no need for the mux.

Instead, remove swi_vm and vm_ih.  Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly.  This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.

One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.

Reviewed by:	imp, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D33447

(cherry picked from commit 254e4e5b77)
2022-04-29 14:27:47 -07:00
John Baldwin
25f14b19c6 Add <machine/tls.h> header to hold MD constants and helpers for TLS.
The header exports the following:

- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)

Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).

Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr.  libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).

A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.

Reviewed by:	kib (older version of x86/tls.h), jrtc27
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33351

For stable/13 only, sys/arm/include/tls.h includes support for
ARM_TP_ADDRESS which is not present in main.

(cherry picked from commit 1a62e9bc00)
2022-04-29 13:50:05 -07:00
Mark Johnston
44eddec48e x86: Probe the TSC frequency earlier
This lets us use the TSC to implement early DELAY, limiting the use of
the sometimes-unreliable 8254 PIT.

PR:		262155
Reviewed by:	emaste
Tested by:	emaste, mike tancsa <mike@sentex.net>, Stefan Hegnauer <stefan.hegnauer@gmx.ch>
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 84369dd523)
2022-03-31 12:05:25 -04:00
Ed Maste
94e6d14488 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 9feff969a0)
2022-02-08 15:00:55 -05:00
Mark Johnston
13dbf9da68 x86: Perform late TSC calibration before LAPIC timer calibration
This ensures that LAPIC calibration is done using the correct tsc_freq
value, i.e., the one associated with the TSC timecounter.  It does mean
though that TSC calibration cannot use sbinuptime() to read the
reference timecounter, as timehands are not yet set up.

Reviewed by:	kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 553af8f1ec)
2021-12-29 10:39:22 -05:00
Mark Johnston
c2f86ca65b x86: Defer LAPIC calibration until after timecounters are available
This ensures that we have a good reference timecounter for performing
calibration.

Change lapic_setup to avoid configuring the timer when booting, and move
calibration and initial configuration to a new lapic routine,
lapic_calibrate_timer.  This calibration will be initiated from
cpu_initclocks(), before an eventtimer is selected.

Reviewed by:	kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 62d09b46ad)
2021-12-29 10:39:10 -05:00
Mark Johnston
eb62bd1d38 x86: Deduplicate clock.h
The headers were mostly identical on amd64 and i386.

No functional change intended.

Reviewed by:	cperciva, mav, imp, kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f06f1d1fdb)
2021-12-27 10:48:00 -05:00
Mark Johnston
55e020a6f9 amd64: Reduce the amount of cpuset copying done for TLB shootdowns
We use pmap_invalidate_cpu_mask() to get the set of active CPUs.  This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers.  Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ab12e8db29)
2021-12-15 08:31:48 -05:00
Alexander Motin
550ccfd8f6 mca: Decode new Intel status bits.
MFC after:	1 week

(cherry picked from commit 3bdba24c74)
2021-12-14 23:00:17 -05:00
Mitchell Horne
8e0b699730 x86: remove unused T_USER flag
It stopped being used in 3c256f5395, when trap() was reorganized to
have separate switch statements for user and kernel traps. Remove the
two leftover references and the flag itself.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D33253

(cherry picked from commit 03b3d7bbec)
2021-12-10 14:31:20 -04:00
Mitchell Horne
eb2ea57ef1 minidump: Parameterize minidumpsys()
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.

This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.

Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.

The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.

Reviewed by:	kib, markj, jhb
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31989

(cherry picked from commit 1adebe3cd6)
2021-12-03 10:02:03 -04:00
Mark Johnston
bf0986b742 Generalize bus_space(9) and atomic(9) sanitizer interceptors
Make it easy to define interceptors for new sanitizer runtimes, rather
than assuming KCSAN.  Lay a bit of groundwork for KASAN and KMSAN.

When a sanitizer is compiled in, atomic(9) and bus_space(9) definitions
in atomic_san.h are used by default instead of the inline
implementations in the platform's atomic.h.  These definitions are
implemented in the sanitizer runtime, which includes
machine/{atomic,bus}.h with SAN_RUNTIME defined to pull in the actual
implementations.

No functional change intended.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 3ead60236f)
2021-11-01 10:16:39 -04:00
Mitchell Horne
fc7febf483 minidump: De-duplicate the progress bar
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31885

(cherry picked from commit ab4ed843a3)
2021-10-15 12:20:48 -03:00
Konstantin Belousov
a53e3f3bf2 Revert "linux32: add a hack to avoid redefining the type of the savefpu tag"
(cherry picked from commit e36d0e86e3)
2021-10-10 12:21:17 +03:00
Konstantin Belousov
45e930db73 linux32: add a hack to avoid redefining the type of the savefpu tag
(cherry picked from commit 0f6829488e)
2021-10-10 12:21:17 +03:00
Adam Fenn
0c301b963d pvclock: Add vDSO support
(cherry picked from commit d4b2d3035a)
2021-10-10 12:21:17 +03:00
Adam Fenn
0977ab30c9 kvm_clock: KVM paravirtual clock support
(cherry picked from commit 6c69c6bb4c)
2021-10-10 12:21:16 +03:00
Adam Fenn
b2f8b90f6e pvclock: Add 'struct pvclock' API
(cherry picked from commit 0b3382b863)
2021-10-10 12:21:16 +03:00
Adam Fenn
9ba8ea3f05 x86: cpufunc: Add rdtsc_ordered()
(cherry picked from commit 652ae7b114)
2021-10-10 12:21:16 +03:00
Konstantin Belousov
a8d453eec1 amd64: stop doing special allocation for the AP startup trampoline
(cherry picked from commit b27fe1c3ba)
2021-08-24 02:21:13 +03:00
Konstantin Belousov
56f6a96f93 x86_msr_op: extend the KPI to allow MSR read and single-CPU operations
(cherry picked from commit d0bc4b4666)
2021-08-23 12:24:39 +03:00
Mitchell Horne
9d61599983 Consolidate machine/endian.h definitions
This change serves two purposes.

First, we take advantage of the compiler provided endian definitions to
eliminate some long-standing duplication between the different versions
of this header. __BYTE_ORDER__ has been defined since GCC 4.6, so there
is no need to rely on platform defaults or e.g. __MIPSEB__ to determine
endianness. A new common sub-header is added, but there should be no
changes to the visibility of these definitions.

Second, this eliminates the hand-rolled __bswapNN() routines, again in
favor of the compiler builtins. This was done already for x86 in
e6ff6154d2. The benefit here is that we no longer have to maintain our
own implementations on each arch, and can instead rely on the compiler
to emit appropriate instructions or libcalls, as available. This should
result in equivalent or better code generation. Notably 32-bit arm will
start using the `rev` instruction for these routines, which is available
on armv6+.

PR:		236920
Reviewed by:	arichardson, imp
Tested by:	bdragon (BE powerpc)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D29012

(cherry picked from commit 720dc6bcb5)
2021-06-24 20:42:56 -03:00
Konstantin Belousov
88d6f049c1 x86: add x86_clear_dbregs() helper
(cherry picked from commit a8b75a57c9)
2021-04-23 14:14:08 +03:00
Mitchell Horne
20e6d79f43 x86: implement kdb watchpoint functions
Add wrappers around the dbreg interface that can be consumed by MI
kernel debugger code. The dbreg functions themselves are updated to
return error codes, not just -1. dbreg_set_watchpoint() is extended to
accept access bits as an argument.

Reviewed by:	jhb, kib, markj
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.

(cherry picked from commit 15dc1d4452)
2021-04-21 10:20:33 -03:00
Mitchell Horne
47301dfb9a x86: consolidate hw watchpoint logic into new file
This is a prerequisite to using these functions outside of ddb, but also
provides some cleanup and minor refactoring. This code is almost
entirely duplicated between the two implementations, the only
significant difference being the lack of dbreg synchronization on i386.

Cleanups are:
 - demote some internal functions to static
 - use the constant NDBREGS instead of a '4' literal
 - remove K&R definitions
 - some added comments

Reviewed by:	kib, jhb
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.

(cherry picked from commit c02c04f113)
2021-04-21 10:20:32 -03:00
Mark Johnston
980da033a8 Rename _cscan_atomic.h and _cscan_bus.h to atomic_san.h and bus_san.h
Other kernel sanitizers (KMSAN, KASAN) require interceptors as well, so
put these in a more generic place as a step towards importing the other
sanitizers.

No functional change intended.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29103

(cherry picked from commit 435c7cfb24)
2021-03-15 11:39:11 -04:00
Mateusz Guzik
1a2bc12c4b x86: use compiler intrinsics for bswap*
(cherry picked from commit e6ff6154d2)
2021-02-04 18:01:13 +00:00
Roger Pau Monné
51ab5e0d82 stand/multiboot: adjust the protocol between loader and kernel
There's a currently ad-hoc protocol to hand off the FreeBSD kernel
payload between the loader and the kernel itself when Xen is in the
middle of the picture. Such protocol wasn't very resilient to changes
to the loader itself, because it relied on moving metadata around to
package it using a certain layout. This has proven to be fragile, so
replace it with a more robust version.

The new protocol requires using a xen_header structure that will be
used to pass data between the FreeBSD loader and the FreeBSD kernel
when booting in dom0 mode. At the moment the only data conveyed is the
offset of the start of the module metadata relative to the start of the
module itself.

This is a slightly disruptive change since it also requires a change
to the kernel which is contained in this patch. In order to update
with this change the kernel must be updated before updating the
loader, as described in the handbook. Note this is only required when
booting a FreeBSD/Xen dom0. This change doesn't affect the normal
FreeBSD boot protocol.

This fixes booting FreeBSD/Xen in dom0 mode after
3630506b9d.

(cherry picked from commit b6d85a5f51)
2021-02-03 14:07:43 +01:00
Toomas Soome
a4a10b37d4 Add VT driver for VBE framebuffer device
Implement vt_vbefb to support Vesa Bios Extensions (VBE) framebuffer with VT.
vt_vbefb is built based on vt_efifb and is assuming similar data for
initialization, use MODINFOMD_VBE_FB to identify the structure vbe_fb
in kernel metadata.

struct vbe_fb, is populated by boot loader, and is passed to kernel via
metadata payload.

Differential Revision:	https://reviews.freebsd.org/D27373
2020-11-30 08:22:40 +00:00
Konstantin Belousov
d3ba71b2b1 Limit workaround for errata E400 to appropriate AMD cpus.
From Linux sources and several datasheets I looked at, it seems that
the workaround is only needed on families 0xf and 0x10.  For instance,
Ryzens do not implement the accessed MSR at all, it is documented as
reserved.  Also, hypervisors should not allow guest to put CPU into
idle state, so activate workaround only when on bare hardware.

While there, style the code:
    move MSR defines to specialreg.h
    move identification to initcpu.c

Reported by:	whu
Reviewed by:	avg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26470
2020-10-14 22:57:50 +00:00
Konstantin Belousov
5e8ea68fd8 Move ctx_switch_xsave declaration to amd64 md_var.h.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-10-03 23:07:09 +00:00
Michal Meloun
88f7c52f31 Add missing declarations of 64-bit variants of bus_peek/bus_poke on amd64.
It fixes GENERIC-KCSAN build.

Reported by:	rpokala
MFC after:	1 month
MFC with:	r365899
2020-09-24 08:40:32 +00:00
D Scott Phillips
ab041f713a Move vm_page_dump bitset array definition to MI code
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.

Reviewed by:	kib, markj
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26129
2020-09-21 22:20:37 +00:00
Michal Meloun
3182062142 Add missing assignment forgotten in r365899
Noticed by:	mav
MFC after:	1 month
MFC with:	r365899
2020-09-20 15:11:52 +00:00
Michal Meloun
95a85c125d Add NetBSD compatible bus_space_peek_N() and bus_space_poke_N() functions.
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.

Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.

This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25371
2020-09-19 11:06:41 +00:00
John Baldwin
d8dc46f6e9 Add constant for the DE_CFG MSR on AMD CPUs.
Reported by:	Patrick Mooney <pmooney@pfmooney.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25885
2020-09-11 20:32:40 +00:00
Mateusz Guzik
ab6c81a218 x86: clean up empty lines in .c and .h files 2020-09-01 21:23:59 +00:00
Konstantin Belousov
25f2da2e64 Add amd64 procctl(2) ops to manage forced LA48/LA57 VA after exec.
Tested by:	pho (LA48 hardware)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:32:13 +00:00
Konstantin Belousov
4ba405dcdb Add definition for CR4.LA57 bit.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:08:05 +00:00
Peter Grehan
3a3f1e9dfa Export a routine to provide the TSC_AUX MSR value and use this in vmm.
Also, drop an unnecessary set of braces.

Requested by:	kib
Reviewed by:	kib
MFC after:	3 weeks
2020-08-18 11:36:38 +00:00
Ruslan Bukin
c4cd699010 o Add machine/iommu.h and include MD iommu headers from it,
so we don't ifdef for every arch in busdma_iommu.c;
o No need to include specialreg.h for x86, remove it.

Requested by:	andrew
Reviewed by:	kib
Sponsored by:	DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D25957
2020-08-05 19:11:31 +00:00
Ruslan Bukin
1b0c9c21d9 o Move iommu_set_buswide_ctx, iommu_is_buswide_ctx to
the generic iommu busdma backend;
o Move bus_dma_iommu_set_buswide, bus_dma_iommu_load_ident
  prototypes to iommu.h.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D25866
2020-07-29 13:23:27 +00:00
Ruslan Bukin
ea4c01156a o Move the buswide_ctxs bitmap to iommu_unit and rename related functions.
o Rename bus_dma_dmar_load_ident() as well.

Reviewed by:	kib
Sponsored by:	DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D25852
2020-07-28 16:08:14 +00:00
Alexander Motin
855e49f3b0 Add initial driver for ACPI Platform Error Interfaces.
APEI allows platform to report different kinds of errors to OS in several
ways.  We've found that Supermicro X10/X11 motherboards report PCIe errors
appearing on hot-unplug via this interface using NMI.  Without respective
driver it ended up in kernel panic without any additional information.

This driver introduces support for the APEI Generic Hardware Error Source
reporting via NMI, SCI or polling.  It decodes the reported errors and
either pass them to pci(4) for processing or just logs otherwise.  Errors
marked as fatal still end up in kernel panic, but some more informative.

When somebody get to native PCIe AER support implementation both of the
reporting mechanisms should get common error recovery code.  Since in our
case errors happen when the device is already gone, there is nothing to
recover, so the code just clears the error statuses, practically ignoring
the otherwise destructive NMIs in nicer way.

MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2020-07-27 21:19:41 +00:00
Alexander Motin
aba10e131f Allow swi_sched() to be called from NMI context.
For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.

To do it this change introduces new swi_sched() flag SWI_FROMNMI, making
it careful about used KPIs.  On platforms allowing IPI sending from NMI
context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI,
otherwise it works just like SWI_DELAY.  To handle the delayed SWIs this
patch calls clk_intr_event on every hardclock() tick.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25754
2020-07-25 15:19:38 +00:00