Commit graph

2328 commits

Author SHA1 Message Date
Mitchell Horne
0db22efc94 Use KERNEL_PANICKED() in more places
This is slightly more optimized than checking panicstr directly. For
most of these instances performance doesn't matter, but let's make
KERNEL_PANICKED() the common idiom.

Reviewed by:	mjg
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D35373

(cherry picked from commit 35eb9b10c2)
2022-06-23 19:19:26 -03:00
Mark Johnston
d818a78393 boot/zfs: Extend zfsimpl.h and make it easier to use
Some makefs(8) patches make use of zfsimpl.h (not zfsimpl.c though) to
provide definitions for various on-disk structures.  Most of this diff
simply adds new definitions that are useful.

Also reduce dependencies of the header:
- remove an unused list_node_t field to drop the sys/list.h dependency
- replace CTASSERT with _Static_assert
And fix the declaration of decode_embedded_bp_compressed().

No functional change intended.

Reviewed by:	tsoome
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 77649f35a7)
2022-06-05 11:15:04 -04:00
Mark Johnston
abf56f145e libsa: Make the nvlist implementation more self-contained
Move declarations into a new nvlist.h rather than putting everything in
libzfs.h.  This makes this nvlist code easier to reuse elsewhere.  In
particular, the nvlist implementation in sys/contrib/libnv does not
provide XDR encoding, but this is needed when reading from or writing to
ZFS pools.

Also:
- Remove references to boolean_t.  It has to be a 32-bit int here, so
  just reference the underlying type.
- Add includes needed when compiling the nvlist code outside of stand/.

No functional change intended.

Reviewed by:	tsoome
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit e097436cb2)
2022-05-27 09:15:14 -04:00
Andrew Turner
e5e2c7ffa0 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830

(cherry picked from commit b792434150)
2022-05-12 15:12:59 -07:00
Mark Johnston
2baed7fc24 stand/zfs: Fix const-qual warnings
The input buffer is read-only, update casts to match.

No functional change intended.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 8d20f1560d)
2022-05-12 08:55:56 -04:00
Mark Johnston
2ecf3b58eb fbt: Add support for CTFv3 containers
The general aim in this and subsequent patches is to minimize the
amount of code that directly references CTF types such as ctf_type_t,
ctf_array_t, etc.  To that end, introduce some routines similar to the
existing fbt_get_ctt_size() (which exists to deal with differences
between v1 and v2) and change ctf_lookup_by_id() to return a void
pointer.

Support for v2 containers is preserved.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit d9175438c0)
2022-04-06 20:30:44 -04:00
Mark Johnston
3681c4f065 ctf: Import ctf.h from OpenBSD
Use it instead of the existing ctf.h from OpenSolaris.  This makes it
easier to use CTF in the core kernel, and to extend the CTF format to
support wider type IDs.

The imported ctf.h is modified to depend only on _types.h, and also to
provide macros which use the "parent" bit of a type ID to refer to types
in a parent CTF container.

No functional change intended.

Reviewed by:	Domagoj Stolfa, emaste
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 2d5d2a986c)
2022-04-06 20:30:44 -04:00
Mark Johnston
c04e4ff616 fasttrap: Avoid creating WX mappings
fasttrap instruments certain instructions by overwriting them and
copying the original instruction to some per-thread scratch space which
is executed after the probe fires.  This trampoline jumps back to the
tracepoint after executing the original instruction.

The created mapping has both write and execute permissions, and so this
mechanism doesn't work when allow_wx is disabled.  Work around the
restriction by using proc_rwmem() to write to the trampoline.

Reviewed by:	vangyzen
Tested by:	Amit <akamit91@hotmail.com>
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 3a56cfedbc)
2022-03-15 11:40:47 -04:00
Mark Johnston
87e1a4346d fasttrap: Assert that fasttrap_fork() successfully unmaps scratch space
No functional change intended.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 83958173eb)
2022-03-15 11:40:36 -04:00
Mark Johnston
c9b56f90d6 fbt: Remove handling for CTFv1
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 2997ab002e)
2022-03-02 08:59:16 -05:00
Andrew Turner
61a7ad35bb Teach DTrace about BTI on arm64
The Branch Target Identification (BTI) Armv8-A extension adds new
instructions that can be placed where we may indirrectly branch to,
e.g. at the start of a function called via a function pointer. We can't
emulate these in DTrace as the kernel will have raised a different
exception before the DTrace handler has run.

Skip over the BTI instruction if it's used as the first instruction in
a function.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit b5876847ac)
2022-02-22 16:23:07 +00:00
Andrew Turner
efe35aecb8 Handle functions that use a nop in the arm64 fbt
To trace leaf asm functions we can insert a single nop instruction as
the first instruction in a function and trigger off this.

Reviewed by:	gnn
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D28132

(cherry picked from commit 28d945204e)
2022-02-22 16:23:07 +00:00
Ed Maste
b32bf19a50 sys/cddl: remove extraneous semicolons
Fixes:		5a1b490d50 ("FreeBSD changes to vendor source.")
Fixes:		91eaf3e183 ("Custom DTrace kernel module...")
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit b1a217a369)
2022-02-08 15:04:31 -05:00
Ed Maste
1b69d2aedc DTrace: remove sparc64 remnants in non-contrib code
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 64a790d264)
2022-02-08 15:00:55 -05:00
Andriy Gapon
eacdf85d44 dtrace: add a knob to control maximum size of principal buffers
We had a hardcoded limit of 1/128-th of physical memory that was further
subdivided between all CPUs as principal buffers are allocated on the
per-CPU basis.  Actually, the buffers could use up 1/64-th of the
memmory because with the default switch policy there are two buffers per
CPU.

This commit allows to change that limit.

Note that the discussed limit is per dtrace command invocation.
The idea is to limit the size of a single malloc(9) call, not the total
memory size used by DTrace buffers.

(cherry picked from commit 7fdf0e8835)
2022-02-01 10:12:03 +02:00
Andrew Turner
7b7f391db4 Allow ddb and dtrace use the DMAP region on arm64
When writing to memory on arm64 we may be trying to be accessing a
read-only page. In this case try to access via the DMAP region to
get a writable location.

While here simplify writing data in DDB and stop trashing the size as
it is passed into the cache handling functions.

Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32053

(cherry picked from commit 3d2533f5c2)
2022-01-04 10:49:51 +00:00
Andrew Turner
5143c53dfb Fix dtrace fbt return probes on arm64
As with arm and riscv fix return fbt probes on arm64. arg0 should be
the offset within the function of the return instruction and arg1
should be the return value.

Reviewed by:	kp, markj
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33440

(cherry picked from commit e3ccf4f9de)
2021-12-29 10:06:00 +00:00
Domagoj Stolfa
6c9c924b43 dtrace: Disable getf() as it is broken on FreeBSD
getf() on FreeBSD calls _sx_slock(), _sx_sunlock() and fget_locked().
Furthermore, it does not set the per-core fault flag, meaning it
usually ends up in a double fault panic once getf() does get called,
especially from fbt.

Reviewing the DTrace Toolkit + a number of other scripts scattered
around FreeBSD, I have not been able to find one use of getf(). Given
how broken the implementation currently is, we disable it until it
can be implemented properly.

Also comment out a test in aggs/tst.subr.d for getf().

Reviewed by:	markj

(cherry picked from commit 30ec3138ed)
2021-12-26 19:15:07 -05:00
Kyle Evans
08703a5649 kern: drop remaining references to removed makesyscalls.sh
This was accidentally omitted from the recent removal of makeyscalls.sh.

(cherry picked from commit 35aa1d6e45)
2021-09-14 20:53:03 -05:00
Mark Johnston
26b117bc6a fbt: Remove some handling for multiple CTF containers
This was ported from illumos but not completely done.  Currently we do
not perform type deduplication between KLDs and the kernel, i.e., kernel
modules have a complete type graph.  So, remove it for now since it's
not functional and complicates the task of modifying various CTF type
definitions, and we are hitting some limits in the current format which
necessitate an update.

No functional change intended.

(cherry picked from commit 4d221f59b8)
2021-04-15 21:27:24 -04:00
Robert Watson
ba08ba5226 Teach DTrace that unaligned accesses are OK on aarch64, not just x86.
MFC after:	3 days
Reviewed:	andrew
Differential Revision:	https://reviews.freebsd.org/D29369
2021-03-25 09:17:38 -04:00
Robert Watson
fcc700abe4 Tune DTrace 'aframes' for the FBT and profile providers on arm64.
In both cases, too few frames were trimmed, leading to exception handling
or DTrace internals being exposed in stack traces exposed by D's stack()
primitive.

Reviewed by:	emaste, andrew
Differential Revision:	https://reviews.freebsd.org/D29356

(cherry picked from commit 599fb1d198)
2021-03-25 09:16:41 -04:00
Robert Watson
bbcdd9faca Reimplemen FreeBSD/arm64 dtrace_gethrtime() to use the system timer.
dtrace_gethrtime() is the high-resolution nanosecond timestemp used
for the DTrace 'timestamp' built-in variable.  The new implementation
uses the EL0 cycle counter and frequency registers in ARMv8-A.  This
replaces a previous implementation that relied on an
instrumentation-safe implementation of getnanotime(), which provided
only timer resolution.

Approved by:	re (gjb)
Reviewed by:	andrew, bsdimp (older version)
Useful comments appreciated:	jrtc27, emaste
Differential Revision:	https://reviews.freebsd.org/D28723
2021-02-25 21:38:30 +00:00
Mitchell Horne
0b92d1dd18 riscv: fix kernel build
A more complete fix for this function is being worked on in D28054. Fix
the uninitialized variable error so that builds can at least proceed.

Reported by:	several
2021-01-15 11:57:04 -04:00
Andrew Turner
c00ec4dab2 Handle using a sub instruction in the arm64 fbt
Some stack frames are too large for a store pair instruction we already
detect in the arm64 fbt code. Add support for handling subtracting the
stack pointer directly.

Sponsored by:	Innovate UK
2021-01-12 12:42:23 +00:00
Andrew Turner
d0df1a2d54 Only allow a store through sp in the arm64 fbt
When searching for an instruction to patch out in the arm64 function
boundary trace we search for a store pair with a write back. This
instruction is commonly used to store two registers to the stack
and update the stack pointer to hold space for more.

This works in many cases, however not all functions use this, e.g.
when the stack frame is too large. In these cases we may find another
instruction of the same type that doesn't store through the stack
pointer. Filter these instructions out and assume if we see one we
are past the function prologue.

Reported by:	rwatson
Sponsored by:	Innovate UK
2021-01-12 12:42:23 +00:00
Kristof Provost
f743976583 dtrace: Blacklist riscv exception handlers for fbt
We can't safely instrument those exception handlers, so blacklist them.

Test case: dtrace -n :::

Reviewed by:		markj (previous version)
Differential Revision:	https://reviews.freebsd.org/D27754
2021-01-12 10:33:16 +01:00
Robert Watson
30b68ecda8 Changes that improve DTrace FBT reliability on freebsd/arm64:
- Implement a dtrace_getnanouptime(), matching the existing
  dtrace_getnanotime(), to avoid DTrace calling out to a potentially
  instrumentable function.

  (These should probably both be under KDTRACE_HOOKS.  Also, it's not clear
  to me that they are correct implementations for the DTrace thread time
  functions they are used in .. fixes for another commit.)

- Don't allow FBT to instrument functions involved in EL1 exception handling
  that are involved in FBT trap processing: handle_el1h_sync() and
  do_el1h_sync().

- Don't allow FBT to instrument DDB and KDB functions, as that makes it
  rather harder to debug FBT problems.

Prior to these changes, use of FBT on FreeBSD/arm64 rapidly led to kernel
panics due to recursion in DTrace.

Reliable FBT on FreeBSD/arm64 is reliant on another change from @andrew to
have the aarch64 instrumentor more carefully check that instructions it
replaces are against the stack pointer, which can otherwise lead to memory
corruption.  That change remains under review.

MFC after:	2 weeks
Reviewed by:	andrew, kp, markj (earlier version), jrtc27 (earlier version)
Differential revision:	https://reviews.freebsd.org/D27766
2021-01-11 15:42:22 +00:00
Bryan Drewery
f222a6b886 dtrace: Fix /"string" == NULL/ comparisons using an uninitialized value.
A test of this is funcs/tst.strtok.d which has this filter:

    BEGIN
    /(this->field = strtok(this->str, ",")) == NULL/
    {
            exit(1);
    }
The test will randomly fail with exit status of 1 indicating that this->field
was NULL even though printing it out shows it is not.

This is compiled to the DTrace instruction set:
    // Pushed arguments not shown here
    // call strtok() and set result into %r1
    07: 2f001f01    call DIF_SUBR(31), %r1          ! strtok
    // set thread local scalar this->field from %r1
    08: 39050101    stls %r1, DT_VAR(1281)          ! DT_VAR(1281) = "field"
    // Prepare for the == comparison
    // Set right side of %r2 to NULL
    09: 25000102    setx DT_INTEGER[1], %r2         ! 0x0
    // string compare %r1 (strtok result) to %r2
    10: 27010200    scmp %r1, %r2

In this case only %r1 is loaded with a string limit set to lim1.  %r2 being
NULL does not get loaded and does not set lim2.  Then we call dtrace_strncmp()
with MIN(lim1, lim2) resulting in passing 0 and comparing neither side.
dtrace_strncmp() handles this case fine and it already has been while
being lucky with what lim2 was [un]initialized as.

Reviewed by:	markj, Don Morris <dgmorris AT earthlink.net>
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D27671
2021-01-08 14:37:17 -08:00
Alex Richardson
1f6612b444 Install dtrace.h and dependencies
This makes the minimum amount of changes to allow inclusion of dtrace.h
without all the solaris compatibility headers. Installing dtrace.h allows
compiling consumers of libdtrace (e.g. https://github.com/tmetsch/python-dtrace)
without requiring a copy of the source tree.
For python-dtrace I worked around this in 58019c9a12
but being able to build the library without installed sources would be
extremely useful.

Reviewed By:	gnn
Differential Revision: https://reviews.freebsd.org/D27884
2021-01-07 09:26:21 +00:00
John Baldwin
ae95396817 Check that the frame pointer is within the current stack.
This same check is used on other architectures.  Previously this would
permit a stack frame to unwind into any arbitrary kernel address
(including unmapped addresses).

Reviewed by:	andrew, markj
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27362
2020-12-08 18:00:58 +00:00
John Baldwin
9b9e7f4c51 Stack unwinding robustness fixes for RISC-V.
- Push the kstack_contains check down into unwind_frame() so that it
  is honored by DDB and DTrace.

- Check that the trapframe for an exception frame is contained in the
  traced thread's kernel stack for DDB traces.

Reviewed by:	markj
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27357
2020-12-08 17:57:18 +00:00
Mark Johnston
7be2770a42 sdt: Create providers and probes in separate passes when loading sdt.ko
The sdt module's load handler iterates over SDT linker sets for the
kernel and all loaded modules to create probes and providers defined by
SDT(9).  Probes in one module may belong to a provider in a different
module, but when a probe is created we assume that the provider is
already defined.  To maintain this invariant, modify the load handler to
perform two separate passes over loaded modules: one to define providers
and the other to define probes.

The problem manifests when loading linux.ko, which depends on
linux_common.ko, which defines providers used by probes defined in
linux.ko.

Reported by:	gallatin
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-12-03 17:10:00 +00:00
Konstantin Belousov
013a1ae66e Fix syntax 2020-12-01 23:51:48 +00:00
Konstantin Belousov
24f10b03a1 Fix syntax 2020-12-01 22:44:23 +00:00
John Baldwin
4d16f94191 Use uintptr_t instead of uint64_t for pointers in stack frames.
Reviewed by:	andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27361
2020-12-01 18:22:34 +00:00
John Baldwin
fdd947e4ce Use uintptr_t for pointers in stack frames.
This catches up to the changes made to struct unwind_state in r364180.

Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27360
2020-12-01 18:08:22 +00:00
John Baldwin
5941edfcdc Add a kstack_contains() helper function.
This is useful for stack unwinders which need to avoid out-of-bounds
reads of a kernel stack which can trigger kernel faults.

Reviewed by:	kib, markj
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27356
2020-12-01 17:04:46 +00:00
Adrian Chadd
975e1c1ce6 [cddl] Fix lz4 function definitions to not tri pup compile.
This tripped up in llvm compilation on amd64 noting that lz4_init/lz4_fini
were lacking in being previously defined.

Reviewed by:	emaste, freqlabs, brooks
Differential Revision:	https://reviews.freebsd.org/D27240
2020-11-17 17:11:07 +00:00
Mateusz Guzik
bdcc222644 malloc: move malloc_type_internal into malloc_type
According to code comments the original motivation was to allow for
malloc_type_internal changes without ABI breakage. This can be trivially
accomplished by providing spare fields and versioning the struct, as
implemented in the patch below.

The upshots are one less memory indirection on each alloc and disappearance
of mt_zone.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27104
2020-11-06 21:33:59 +00:00
Edward Tomasz Napierala
bce7ee9d41 Drop "All rights reserved" from all my stuff. This includes
Foundation copyrights, approved by emaste@.  It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make similar change.

Reviewed by:	emaste, imp, gbe (manpages)
Differential Revision:	https://reviews.freebsd.org/D26980
2020-10-28 13:46:11 +00:00
Mitchell Horne
6cb13a3058 Fix build after r367020
DTrace also relies on these definitions.

Reported by:	jenkins
2020-10-24 23:21:51 +00:00
Matt Macy
180f822596 Update OpenZFS to 2.0.0-rc3-gfc5966
- fix panic due to tqid overflow
- Improve libzfs_error_init messages
- Expose zfetch_max_idistance tunable
- Make dbufstat work on FreeBSD
- Fix EIO after resuming receive of new dataset over an existing one
2020-10-17 01:06:04 +00:00
Warner Losh
9257c69b1c Turn off zstd on aarch64
loader support for zstd and zfs doesn't work for aarch64. Disable it
to unbreak the build.
2020-10-13 02:36:16 +00:00
Warner Losh
2fec3ae896 Add zstd support to the boot loader.
Add support to the _STANDALONE environment enough bits of the kernel
that we can compile it. We still have a small zstd_shim.c since there
were 3 items that were a bit hard to nail down and may be cleaned up
in the future. These go hand in hand with a number of commits to
sys/sys in the past weeks, should this need be MFCd.

Discussed with: mmacy (in review and on IRC/Slack)
Reviewed by: freqlabs (on openzfs repo)
Differential Revision: https://reviews.freebsd.org/D26218
2020-10-12 22:19:07 +00:00
Toomas Soome
e307eb94ae loader: zfs should support bootonce an nextboot
bootonce feature is temporary, one time boot, activated by
"bectl activate -t BE", "bectl activate -T BE" will reset the bootonce flag.

By default, the bootonce setting is reset on attempt to boot and the next
boot will use previously active BE.

By setting zfs_bootonce_activate="YES" in rc.conf, the bootonce BE will
be set permanently active.

bootonce dataset name is recorded in boot pool labels, bootenv area.

in case of nextboot, the nextboot_enable boolean variable is recorded in
freebsd:nvstore nvlist, also stored in boot pool label bootenv area.
On boot, the loader will process /boot/nextboot.conf if nextboot_enable
is "YES", and will set nextboot_enable to "NO", preventing /boot/nextboot.conf
processing on next boot.

bootonce and nextboot features are usable in both UEFI and BIOS boot.

To use bootonce/nextboot features, the boot loader needs to be updated on disk;
if loader.efi is stored on ESP, then ESP needs to be updated and
for BIOS boot, stage2 (zfsboot or gptzfsboot) needs to be updated
(gpart or other tools).

At this time, only lua loader is updated.

Sponsored by:	Netflix, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D25512
2020-09-21 09:01:10 +00:00
Kristof Provost
effd82ca70 dtrace: fix fbt return probes on RISC-V
Return values are passed in a0, so read it from there. We also pass a1 through
to userspace, as the ABI allows small structs to be returned in registers
a0/a1. While here read the register values directly from the trapframe rather
than rtval, and remove the now unneeded argument from dtrace_invop().

Set fbtp_roffset so that we get the correct return location in arg0.

Reviewed by:	markj
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D26389
2020-09-11 09:15:49 +00:00
Mark Johnston
fcc0db1734 Tighten frame pointer checking in DTrace's amd64 stack unwinder.
Avoid assuming that the kernel was compiled with
-fno-omit-frame-pointer.

MFC after:	1 week
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
2020-09-01 15:15:44 +00:00
Matt Macy
a86e97e50d ZFS: band-aid for -DNO_CLEAN
Submitted by:	Neal Chauhan
Approved by:	imp@
Differential Revision:	https://reviews.freebsd.org/D26183
2020-08-25 23:35:55 +00:00
Matt Macy
9e5787d228 Merge OpenZFS support in to HEAD.
The primary benefit is maintaining a completely shared
code base with the community allowing FreeBSD to receive
new features sooner and with less effort.

I would advise against doing 'zpool upgrade'
or creating indispensable pools using new
features until this change has had a month+
to soak.

Work on merging FreeBSD support in to what was
at the time "ZFS on Linux" began in August 2018.
I first publicly proposed transitioning FreeBSD
to (new) OpenZFS on December 18th, 2018. FreeBSD
support in OpenZFS was finally completed in December
2019. A CFT for downstreaming OpenZFS support in
to FreeBSD was first issued on July 8th. All issues
that were reported have been addressed or, for
a couple of less critical matters there are
pull requests in progress with OpenZFS. iXsystems
has tested and dogfooded extensively internally.
The TrueNAS 12 release is based on OpenZFS with
some additional features that have not yet made
it upstream.

Improvements include:
  project quotas, encrypted datasets,
  allocation classes, vectorized raidz,
  vectorized checksums, various command line
  improvements, zstd compression.

Thanks to those who have helped along the way:
Ryan Moeller, Allan Jude, Zack Welch, and many
others.

Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25872
2020-08-25 02:21:27 +00:00