Commit graph

1205 commits

Author SHA1 Message Date
John Baldwin
382ffc8786 x86: Add a NT_X86_SEGBASES register set.
This register set contains the values of the fsbase and gsbase
registers.  Note that these registers can already be controlled
individually via ptrace(2) via MD operations, so the main reason for
adding this is to include these register values in core dumps.  In
particular, this will enable looking up the value of TLS variables
from core dumps in gdb.

The value of NT_X86_SEGBASES was chosen to match the value of
NT_386_TLS on Linux.  The notes serve similar purposes, but FreeBSD
will never dump a note equivalent to NT_386_TLS (which dumps a single
segment descriptor rather than a pair of addresses) and picking a
currently-unused value in the NT_X86_* range could result in a future
conflict.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D34650

(cherry picked from commit 931983ee08)
2022-05-13 09:45:19 -07:00
Andrew Turner
e5e2c7ffa0 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830

(cherry picked from commit b792434150)
2022-05-12 15:12:59 -07:00
John Baldwin
5159d50a1e Simplify swi for bus_dma.
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages.  For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm).  However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi.  I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux.  However, in the current scheme there's no need for the mux.

Instead, remove swi_vm and vm_ih.  Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly.  This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.

One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.

Reviewed by:	imp, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D33447

(cherry picked from commit 254e4e5b77)
2022-04-29 14:27:47 -07:00
John Baldwin
25f14b19c6 Add <machine/tls.h> header to hold MD constants and helpers for TLS.
The header exports the following:

- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)

Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).

Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr.  libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).

A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.

Reviewed by:	kib (older version of x86/tls.h), jrtc27
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33351

For stable/13 only, sys/arm/include/tls.h includes support for
ARM_TP_ADDRESS which is not present in main.

(cherry picked from commit 1a62e9bc00)
2022-04-29 13:50:05 -07:00
Roger Pau Monné
a1a5c53516 x86/xen: fallback when VCPUOP_send_nmi is not available
It has been reported that on some AWS instances VCPUOP_send_nmi
returns -38 (ENOSYS). The hypercall is only available for HVM guests
in Xen 4.7 and newer. Add a fallback to use the native NMI sending
procedure when VCPUOP_send_nmi is not available, so that the NMI is
not lost.

Reported and Tested by: avg
Fixes: b2802351c1 ('xen: fix dispatching of NMIs')
Sponsored by: Citrix Systems R&D
(cherry picked from commit ad15eeeaba)
2022-04-12 10:05:47 +02:00
Mark Johnston
85e36b3c9c i386: Fix the nodevice apic build
PR:		263124
Fixes:		62d09b46ad ("x86: Defer LAPIC calibration until after timecounters are available")
Reviewed by:	kib, jhb, emaste
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit aa597d4049)
2022-04-11 09:43:27 -04:00
Gordon Bergling
b5efaf71b1 xen(4): Fix a few typos in source code comments
- s/querried/queried/

(cherry picked from commit bba12ee453)
2022-04-02 15:33:07 +02:00
Mark Johnston
b6c207a830 x86: Defer early TSC timecounter calibration to SI_SUB_CPU
If we can't determine the TSC frequency using CPU registers, we need to
give a chance for Hyper-V drivers to register a timecounter (during
SI_SUB_HYPERVISOR) since an emulated 8254 might not be available.
Thus, split probe_tsc_freq() into early and late stages, and wait until
the latter to attempt calibration using a reference clock.

Fixes:		84369dd523 ("x86: Probe the TSC frequency earlier")
Reported and tested by:	khng, Shawn Webb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 075e2779ac)
2022-03-31 12:05:46 -04:00
Mark Johnston
44eddec48e x86: Probe the TSC frequency earlier
This lets us use the TSC to implement early DELAY, limiting the use of
the sometimes-unreliable 8254 PIT.

PR:		262155
Reviewed by:	emaste
Tested by:	emaste, mike tancsa <mike@sentex.net>, Stefan Hegnauer <stefan.hegnauer@gmx.ch>
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 84369dd523)
2022-03-31 12:05:25 -04:00
Roger Pau Monné
88aff320c8 x86/xen: fix CPUID signature
Reviewed by: cem
Sponsored by: Citrix Systems R&D

(cherry picked from commit 396a8479b0)
2022-03-23 14:44:07 +01:00
Zhenlei Huang
f877ca16c9 x86: Correctly report unexpected cache level
Reviewed by:	rpokala, emaste
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D34577

(cherry picked from commit ba46c6c4b7)
2022-03-18 20:31:00 -04:00
Austin Zhang
b8319d9f78 atrtc: reads Century field from FADT table
The ACPI spec describes the FADT->Century field as:

    The RTC CMOS RAM index to the century of data value (hundred and
    thousand year decimals).  If this field contains a zero, then the
    RTC centenary feature is not supported.  If this field has a non-zero
    value, then this field contains an index into RTC RAM space that
    OSPM can use to program the centenary field.

Use this field to decide whether to program the CENTURY register
of the CMOS RTC device.

Reviewed by:	akumar3@isilon.com, dab, vangyzen
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D33667

MFC after:	1 week
Sponsored by:	Dell EMC Isilon

(cherry picked from commit e1ef6c0ef2)
2022-03-03 08:20:07 -06:00
Allan Jude
7abbfbda1e smbios: Move smbios driver out from x86 machdep code
Add it to the x86 GENERIC and MINIMAL kernels

Sponsored by:	Ampere Computing LLC
Submitted by:	Klara Inc.
Reviewed by:	rpokala
Differential Revision:	https://reviews.freebsd.org/D28738

(cherry picked from commit d0673fe160)
2022-03-03 08:20:07 -06:00
Eric van Gyzen
94df301ac2 Wait longer for a previous IPI to be sent
When sending an IPI, if a previous IPI is still pending delivery,
native_lapic_ipi_vectored() waits for the previous IPI to be sent.
We've seen a few inexplicable panics with the current timeout of 50 ms.
Increase the timeout to 1 second and make it tunable.

No hardware specification mentions a timeout in this case; I checked
the Intel SDM, Intel MP spec, and Intel x2APIC spec.  Linux and illumos
wait forever.  In Linux, see __default_send_IPI_shortcut() in
arch/x86/kernel/apic/ipi.c.  In illumos, see apic_send_ipi() in
usr/src/uts/i86pc/io/pcplusmp/apic_common.c.  However, misbehaving hardware
could hang the system if we wait forever.

Reviewed by:	mav kib
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D29942

(cherry picked from commit 2f32a971b7)
2022-03-02 15:56:30 -06:00
Edward Tomasz Napierala
cbab32bdf1 linux: deduplicate DUMMY() entries
No functional changes.

Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30524

(cherry picked from commit 83043a741d)
2022-02-13 22:22:11 +00:00
Colin Percival
a605a66017 Use CPUID leaf 0x40000010 for local APIC freq
Some VM systems announce the frequency of the local APIC via the
CPUID leaf 0x40000010.  Using this allows us to boot slightly
faster by avoiding the need for timer calibration.

Reviewed by:	markj
Sponsored by:	https://www.patreon.com/cperciva

(cherry picked from commit de1292c6ff)
2022-02-10 22:52:00 -08:00
Colin Percival
51e77b34e4 TSC: Use 0x40000010 CPUID leaf for all VM types
While this CPUID leaf was originally only used by VMWare, other
hypervisors now also use it to announce the TSC frequency to guests.

This speeds up the boot process by 100 ms in EC2 and other systems,
by allowing the early calibration DELAY to be skipped.

Reviewed by:	markj
Sponsored by:	https://www.patreon.com/cperciva

(cherry picked from commit 4a432614f6)
2022-02-10 22:52:00 -08:00
Colin Percival
eb28df303b Detect CPU type before asking VMWare for TSC freq
This allows us to set tsc_is_invariant and select appropriately
fenced versions of RDTSC based on the CPU type.

Reviewed by:	markj
Sponsored by:	https://www.patreon.com/cperciva

(cherry picked from commit fd980feb57)
2022-02-10 22:52:00 -08:00
Colin Percival
baee6cc181 x86: Speed up clock calibration
Prior to this commit, the TSC and local APIC frequencies were calibrated
at boot time by measuring the clocks before and after a one-second sleep.
This was simple and effective, but had the disadvantage of *requiring a
one-second sleep*.

Rather than making two clock measurements (before and after sleeping) we
now perform many measurements; and rather than simply subtracting the
starting count from the ending count, we calculate a best-fit regression
between the target clock and the reference clock (for which the current
best available timecounter is used). While we do this, we keep track
of an estimate of the uncertainty in the regression slope (aka. the ratio
of clock speeds), and stop measuring when we believe the uncertainty is
less than 1 PPM.

In order to avoid the risk of aliasing resulting from the data-gathering
loop synchronizing with (a multiple of) the frequency of the reference
clock, we add some additional spinning depending upon the iteration number.

For numerical stability and simplicity of implementation, we make use of
floating-point arithmetic for the statistical calculations.

On the author's Dell laptop, this reduces the time spent in calibration
from 2000 ms to 29 ms; on an EC2 c5.xlarge instance, it is reduced from
2000 ms to 2.5 ms.

Reviewed by:	bde (previous version), kib
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D33802

(cherry picked from commit c2705ceaeb)
2022-02-10 22:52:00 -08:00
Kyle Evans
00bc7bbde5 sched: separate out schedinit_ap()
schedinit_ap() sets up an AP for a later call to sched_throw(NULL).

Currently, ULE sets up some pcpu bits and fixes the idlethread lock with
a call to sched_throw(NULL); this results in a window where curthread is
setup in platforms' init_secondary(), but it has the wrong td_lock.
Typical platform AP startup procedure looks something like:

- Setup curthread
- ... other stuff, including cpu_initclocks_ap()
- Signal smp_started
- sched_throw(NULL) to enter the scheduler

cpu_initclocks_ap() may have callouts to process (e.g., nvme) and
attempt to sched_add() for this AP, but this attempt fails because
of the noted violated assumption leading to locking heartburn in
sched_setpreempt().

Interrupts are still disabled until cpu_throw() so we're not really at
risk of being preempted -- just let the scheduler in on it a little
earlier as part of setting up curthread.

(cherry picked from commit 589aed00e3)
2022-02-10 14:55:29 -06:00
Ed Maste
94e6d14488 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 9feff969a0)
2022-02-08 15:00:55 -05:00
Colin Percival
a04376045d Fix variable name: freq_khz -> freq
An earlier version of this code computed the TSC frequency in kHz.
When the code was changed to compute the frequency more accurately,
the variable name was not updated.

Reviewed by:	markj
Fixes:		22875f8879 x86: Implement deferred TSC calibration
Differential Revision:	https://reviews.freebsd.org/D33696

(cherry picked from commit 698727d637)
2022-01-15 21:33:48 -08:00
Colin Percival
7299cefe39 Skip TSC calibration if exact value known
It's possible that the "early" TSC calibration gave us a value which
is known to be exact; in that case, skip the later re-calibration.

Differential Revision:	https://reviews.freebsd.org/D33695

(cherry picked from commit 9cb3288287)
2022-01-15 21:33:33 -08:00
Stefan Eßer
dc4114875e Make CPU_SET macros compliant with other implementations
(cherry picked from commit e2650af157)
2022-01-14 18:17:30 +02:00
Alexander Motin
2123d7a2d6 x86: Remove CTLFLAG_NEEDGIANT from sysctls.
MFC after:	2 weeks

(cherry picked from commit 1d6fb900ed)
2022-01-08 20:24:04 -05:00
Mark Johnston
302738e4ff x86: Fix up the MFC of 08161fd3b2
This is a direct commit to stable/13.
2022-01-06 09:07:45 -05:00
Mark Johnston
08161fd3b2 x86: Skip late calibration if our reference timer has low quality
Some AMD Geode-based systems end up using the 8254 PIT to calibrate the
TSC during late calibration, which doesn't work because that
timecounter's mask (65535) is much smaller than its frequency (1193182).
Moreover, early calibration is done against the 8254 timer anyway.

Work around the problem by simply using early calibration results if no
high-quality timecounters exist.

PR:		260868
Fixes:		22875f8879 ("x86: Implement deferred TSC calibration")
Reported and tested by:	mike@sentex.net, Stefan Hegnauer <stefan.hegnauer@gmx.ch>
Reviewed by:	imp, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 0e494a9e3f)
2022-01-06 08:46:33 -05:00
Alexander Motin
b7668d009e Make CPU children explicitly share parent unit numbers.
Before this device unit number match was coincidental and broke if I
disabled some CPU device(s).  Aside of cosmetics, for some drivers
(may be considered broken) it caused talking to wrong CPUs.

(cherry picked from commit d3a8f98acb)
2022-01-04 12:21:42 -05:00
Bjoern A. Zeeb
a34668185b modules: increase MAXMODNAME and provide backward compat
With various firmware files used by graphics and wireless drivers
we are exceeding the current 32 character module name (file path
in kldxref) length.
In order to overcome this issue bump it to the maximum path length
for the next version.
To be able to MFC provide backward compat support for another version
of the struct as the offsets for the second half change due to the
array size increase.

MAXMODNAME being defined to MAXPATHLEN needs param.h to be
included first.  With only 7 modules (or LinuxKPI module.h) not
doing that adjust them rather than including param.h in module.h [1].

Reported by:	Greg V (greg unrelenting.technology)
Sponsored by:	The FreeBSD Foundation
Suggested by:	imp [1]
Reviewed by:	imp (and others to different level)
Differential Revision:	https://reviews.freebsd.org/D32383

(cherry picked from commit df38ada293)
2021-12-30 18:26:18 +00:00
Mark Johnston
4642ec266c x86: Do not attempt to calibrate the LAPIC timer if no APIC is present
Reported and tested by:	Michael Butler <imb@protected-networks.net>
Reviewed by:	jhb, kib
Fixes:	62d09b46ad ("x86: Defer LAPIC calibration until after timecounters are available")
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 0ecda8d5ae)
2021-12-29 10:39:41 -05:00
Mark Johnston
13dbf9da68 x86: Perform late TSC calibration before LAPIC timer calibration
This ensures that LAPIC calibration is done using the correct tsc_freq
value, i.e., the one associated with the TSC timecounter.  It does mean
though that TSC calibration cannot use sbinuptime() to read the
reference timecounter, as timehands are not yet set up.

Reviewed by:	kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 553af8f1ec)
2021-12-29 10:39:22 -05:00
Mark Johnston
c2f86ca65b x86: Defer LAPIC calibration until after timecounters are available
This ensures that we have a good reference timecounter for performing
calibration.

Change lapic_setup to avoid configuring the timer when booting, and move
calibration and initial configuration to a new lapic routine,
lapic_calibrate_timer.  This calibration will be initiated from
cpu_initclocks(), before an eventtimer is selected.

Reviewed by:	kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 62d09b46ad)
2021-12-29 10:39:10 -05:00
Mark Johnston
1e40acb545 x86: Implement deferred TSC calibration
There is no universal way to find the TSC frequency.  Newer Intel CPUs
may report it via CPUID leaves 0x15 and 0x16.  Sometimes it can be
obtained from the PLATFORM_INFO MSR as well, though we never use that.
On older platforms we derive the frequency using a DELAY(1000000) call,
which uses the 8254 PIT.  On some newer platforms the 8254 is apparently
non-functional, leading to bogus calibration results.  On such platforms
the TSC frequency must be available from CPUID.  It is also possible to
disable calibration with a tunable, in which case we try to parse the
brand string if the TSC freq is not available from CPUID.

CPUID 0x15 provides an authoritative TSC frequency value, but even that
is not always available on new Intel platforms.  CPUID 0x16 provides the
specified processor base frequency, which is not the same as the TSC
frequency.  Empirically, it is close enough for early boot, but too far
off for timekeeping: on a Comet Lake NUC, CPUID 0x16 yields 1600MHz but
the TSC frequency is rougly 1608MHz, leading to frequent clock stepping
when NTP is in use.

Thus we have a situation where we cannot calibrate using the PIT and
cannot obtain a precise frequency from CPUID (or MSRs).  This change
seeks to address that by using the CPUID 0x16 value during early boot
and refining the calibration later once ACPI-based timecounters are
available.  TSC frequency detection is thus split into two phases:

Early phase:
- On Intel platforms, query CPUID 0x15 and 0x16 and use that value
  initially if available.
- Otherwise, get an estimate using the PIT, reducing the delay loop to
  100ms from 1s.
- Continue to register the TSC as the CPU ticks provider early, even
  though the frequency may be off.  Otherwise any code executed during
  boot that uses cpu_ticks() (e.g., context switching) gets tripped up
  when the ticks provider changes.

Later phase:
- In SI_SUB_CLOCKS, once the timehands are initialized, load the current
  TSC and timecounter (sbinuptime()) values at the beginning and end of
  a 1s interval and use the timecounter frequency (typically from
  kvmclock, HPET or the ACPI PM timer) to estimate the TSC frequency.
- Update the TSC timecounter, global tsc_freq and CPU ticker with the
  new frequency and finally register the TSC as a timecounter.

Reviewed by:	kib, jhb (previous version)
Discussed with:	imp, cperciva
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 22875f8879)
2021-12-29 10:38:50 -05:00
Mark Johnston
eb62bd1d38 x86: Deduplicate clock.h
The headers were mostly identical on amd64 and i386.

No functional change intended.

Reviewed by:	cperciva, mav, imp, kib, jhb
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f06f1d1fdb)
2021-12-27 10:48:00 -05:00
Alexander Motin
c3593fcdfb smist: Remove unneeded Giant from bus_dma_tag_create().
bus_dmamap_load() call uses BUS_DMA_NOWAIT.

MFC after:	2 weeks

(cherry picked from commit a69f810466)
2021-12-23 20:05:24 -05:00
Alexander Motin
0d3b432779 busdma: Remove outdated comments about Giant.
MFC after:	2 weeks

(cherry picked from commit 8493918868)
2021-12-23 20:05:17 -05:00
Alexander Motin
8596d2b3d5 mca: Some error handling logic improvements.
- Enable local MCEs on capable Intel CPUs.  It delivers exceptions
only to the affected CPU instead of global broadcast, requiring a lot
of synchronization between CPUs.  AMD always deliver MCEs locally.
 - Make MCE handler process only uncorrected errors, while CMCI and
polling only corrected.  It reduces synchronization problems between
them and is explicitly recommended by the documentation.
 - Add minimal support for uncorrected software recoverable errors
on Intel CPUs.  It allows to avoid kernel panics in case uncorrected
errors do not affect current operation, like ones found during scrub
or write.  Such errors are only logged, postponing the panic until
the corrupted data will actually be needed (that may never happen).
 - Reduce polling period from 1 hour to 5 minutes.

MFC after:	2 weeks

(cherry picked from commit 63346fef33)
2021-12-22 23:03:04 -05:00
Alexander Motin
96787c2ffe x86: Make AMD elvtX dump more compact.
MFC after:	2 weeks

(cherry picked from commit a07d426509)
2021-12-18 23:49:07 -05:00
Corvin Köhne
3a5ac549a4 pci: add missing PCI id of Coffee Lake GPU
(cherry picked from commit 16f02a4cb4)
2021-12-19 04:42:52 +02:00
Mark Johnston
55e020a6f9 amd64: Reduce the amount of cpuset copying done for TLB shootdowns
We use pmap_invalidate_cpu_mask() to get the set of active CPUs.  This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers.  Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit ab12e8db29)
2021-12-15 08:31:48 -05:00
Alexander Motin
98a6200ee2 mca: Switch to using taskqueue_enqueue_timeout_sbt().
Previously it was not allowed on fast taskqueues.  It was fixed in
4730a8972b.  This should make no functional change, just a bit
cleaner and efficient code.

MFC after:	1 week

(cherry picked from commit 9a128e1678)
2021-12-14 23:00:17 -05:00
Alexander Motin
550ccfd8f6 mca: Decode new Intel status bits.
MFC after:	1 week

(cherry picked from commit 3bdba24c74)
2021-12-14 23:00:17 -05:00
Alexander Motin
218457166f mca: Remove excessively verbose debug messages.
Expecially in case of AMD there was more than dozen lines per CPU.

MFC after:	1 week

(cherry picked from commit 935dc0de88)
2021-12-14 23:00:17 -05:00
Alexander Motin
00c069c68c mca: Make some sysctls also a loader tunables.
MFC after:	1 week

(cherry picked from commit c2003f2684)
2021-12-14 23:00:17 -05:00
Mitchell Horne
8e0b699730 x86: remove unused T_USER flag
It stopped being used in 3c256f5395, when trap() was reorganized to
have separate switch statements for user and kernel traps. Remove the
two leftover references and the flag itself.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D33253

(cherry picked from commit 03b3d7bbec)
2021-12-10 14:31:20 -04:00
Mitchell Horne
eb2ea57ef1 minidump: Parameterize minidumpsys()
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.

This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.

Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.

The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.

Reviewed by:	kib, markj, jhb
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31989

(cherry picked from commit 1adebe3cd6)
2021-12-03 10:02:03 -04:00
Alexander Motin
7fd7b6b4a5 Prefer CPUID leaf 1Fh for Intel CPU topology detection.
Leaf 1Fh is a prefered extended version of 0Bh.  It is supported by
new Lader Lake CPUs, though does not report anything new so far.

MFC after:	2 weeks

(cherry picked from commit 6badb512a9)
2021-11-19 19:26:29 -05:00
Mark Johnston
bf0986b742 Generalize bus_space(9) and atomic(9) sanitizer interceptors
Make it easy to define interceptors for new sanitizer runtimes, rather
than assuming KCSAN.  Lay a bit of groundwork for KASAN and KMSAN.

When a sanitizer is compiled in, atomic(9) and bus_space(9) definitions
in atomic_san.h are used by default instead of the inline
implementations in the platform's atomic.h.  These definitions are
implemented in the sanitizer runtime, which includes
machine/{atomic,bus}.h with SAN_RUNTIME defined to pull in the actual
implementations.

No functional change intended.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 3ead60236f)
2021-11-01 10:16:39 -04:00
Mark Johnston
281ad195bd x86: Mark the trapframe as initialized in ipi_bitmap_handler()
Otherwise KASAN may generate false positives if the trapframe was
written into a poisoned region of the stack.

Reported by:	pho
Reported by:	syzbot+ee60455cd58e6eed20c9@syzkaller.appspotmail.com
Reported by:	syzbot+be5f9df26426ace3a00c@syzkaller.appspotmail.com
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 36226163fa)
2021-11-01 10:06:57 -04:00
Mark Johnston
0dbb46e5fa stack(9): Disable KASAN in stack_capture()
When unwinding the stack, we may encounter a stack frame in a poisoned
region of the stack, triggering a false positive.

Reviewed by:	andrew, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 831850d8b0)
2021-11-01 10:06:00 -04:00