Commit graph

18797 commits

Author SHA1 Message Date
Konstantin Belousov
36960099a0 vfs_mount.c: convert explicit panics and KASSERTs to MPASSERT/MPPASS
(cherry picked from commit ad175a107b)
2022-07-06 15:33:30 +03:00
Konstantin Belousov
96251798e0 vfs_op_exit(): assert that mnt_vfs_ops stays non-zero for unmount or suspend
(cherry picked from commit 1e54362824)
2022-07-06 15:33:30 +03:00
Mitchell Horne
11895d2176 set_cputicker: use a bool
The third argument to this function indicates whether the supplied
ticker is fixed or variable, i.e. requiring calibration. Give this
argument a type and name that better conveys this purpose.

Reviewed by:	kib, markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35459

(cherry picked from commit 8701571df9)
2022-07-04 13:37:05 -03:00
Jamie Gritton
cf18a61708 MFC jail: Remove a prison's shared memory when it dies
Add shm_remove_prison(), that removes all POSIX shared memory segments
belonging to a prison.  Call it from prison_cleanup() so a prison
won't be stuck in a dying state due to the resources still held.

PR:		257555
Reported by:	grembo

(cherry picked from commit 7060da62ff)
2022-07-03 12:25:43 -07:00
Jamie Gritton
06dcf1499b MFC jail: add prison_cleanup() to release resources held by a dying jail
Currently, when a jail starts dying, either by losing its last user
reference or by being explicitly killed,
osd_jail_call(...PR_METHOD_REMOVE...) is called.  Encapsulate this
into a function prison_cleanup() that can then do other cleanup.

(cherry picked from commit a9f7455c38)
2022-07-03 12:24:49 -07:00
Mark Johnston
353aa91c64 mount: Fix an incorrect assertion in kernel_mount()
The pointer to the mount values may be null if an error occurred while
copying them in, so fix the assertion condition to reflect that
possibility.

While here, move some initialization code into the error == 0 block.  No
functional change intended.

Reported by:	syzkaller
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 7565431f30)
2022-06-29 10:12:33 -04:00
Mitchell Horne
5a96b88f05 kerneldump: remove physical from dump routines
It is unused, especially now that the underlying d_dumper methods do not
accept the argument.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35174

(cherry picked from commit db71383b88)
2022-06-27 16:32:06 -03:00
Mitchell Horne
e06f07bc3f kerneldump: remove physical argument from d_dumper
The physical address argument is essentially ignored by every dumper
method. In addition, the dump routines don't actually pass a real
address; every call to dump_append() passes a value of zero for
physical.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35173

(cherry picked from commit 489ba22236)
2022-06-27 16:32:06 -03:00
Mitchell Horne
13f544bc8e livedump: add event handler hooks
Add three hooks to the livedump process: before, after, and for each
block of dumped data. This allows, for example, quiescing the system
before the dump begins or protecting data of interest to ensure its
consistency in the final output.

Reviewed by:	markj, kib (previous version)
Reviewed by:	debdrup (manpages)
Reviewed by:	Pau Amma <pauamma@gundo.com> (manpages)
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34067

(cherry picked from commit eb9d205fa6)
2022-06-27 16:32:06 -03:00
Mitchell Horne
758e72c0a8 Add new vnode dumper to support live minidumps
This dumper can instantiate and write the dump's contents to a
file-backed vnode.

Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.

As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.

A future change to savecore(8) will add an option to save a live dump.

Reviewed by:	markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with:	kib
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D33813

(cherry picked from commit c9114f9f86)
2022-06-27 16:32:06 -03:00
Mitchell Horne
6d26e87f48 Split out dumper allocation from list insertion
Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() will call this function and retains the existing
behaviour.

This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.

free_single_dumper() is made public and renamed to dumper_destroy().

Reviewed by:	kib, markj
MFC after:	1 week
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34068

(cherry picked from commit 59c27ea18c)
2022-06-27 16:32:06 -03:00
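
The split between allocation and list insertion described in 6d26e87f48 can be sketched in userland C. This is only an illustration under stated assumptions: the function names dumper_create(), dumper_insert() and dumper_destroy() come from the commit message, but the struct layout, the signatures, and the panic_dumpers list below are hypothetical and are not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified dumper; the real kernel structure differs. */
struct dumper {
	char		name[32];
	struct dumper	*next;
};

/* Hypothetical stand-in for the list of panic-time dumpers. */
static struct dumper *panic_dumpers;

/* Allocate and configure a dumper without linking it into any list. */
static struct dumper *
dumper_create(const char *name)
{
	struct dumper *d;

	assert(name != NULL);
	d = calloc(1, sizeof(*d));
	if (d == NULL)
		return (NULL);
	/* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
	strncpy(d->name, name, sizeof(d->name) - 1);
	return (d);
}

/* Release a dumper that is not (or is no longer) on the list. */
static void
dumper_destroy(struct dumper *d)
{
	free(d);
}

/* Keep the old behaviour: allocate, then add to the panic-time list. */
static int
dumper_insert(const char *name)
{
	struct dumper *d;

	d = dumper_create(name);
	if (d == NULL)
		return (-1);
	d->next = panic_dumpers;
	panic_dumpers = d;
	return (0);
}
```

A live-dump caller in this sketch would call dumper_create() and dumper_destroy() directly, so its dumper never appears on the panic-time list, while existing callers keep using dumper_insert().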
Eric van Gyzen
8320036255 netdump: send key before dump, in case dump fails
Previously, if an encrypted netdump failed, such as due to a timeout or
network failure, the key was not saved, so a partial dump was
completely useless.

Send the key first, so the partial dump can be decrypted, because even a
partial dump can be useful.

Reviewed by:	bdrewery, markj
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D31453

(cherry picked from commit 13a58148de)
2022-06-27 16:32:06 -03:00
Damjan Jovanovic
db8710e219 struct kinfo_file changes needed for lsof to work using only usermode APIs
(cherry picked from commit 8c309d48aa)
2022-06-24 22:37:33 +03:00
Damjan Jovanovic
c1731fa54d KERN_LOCKF: report kl_file_fsid consistently with stat(2)
PR:	264723

(cherry picked from commit 8ae7694913)
2022-06-24 22:37:33 +03:00
Konstantin Belousov
9a24a80a17 reap_kill_proc(): avoid singlethreading any other process if we are exiting
Tested by:	pho (whole series MFC)

(cherry picked from commit 1575804961)
2022-06-24 17:45:50 +03:00
Konstantin Belousov
b18df35be0 reap_kill_subtree(): hold the reaper when entering it into the queue to handle later
(cherry picked from commit e0343eacf3)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
935509ba18 reap_kill_subtree_once(): handle proctree_lock unlock in reap_kill_proc()
(cherry picked from commit 1d4abf2cfa)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
9644a36d95 reap_kill_proc: do not retry on thread_single() failure
(cherry picked from commit addf103ce6)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
6a0a83e5cc Make stop_all_proc_block interruptible to avoid deadlock with parallel suspension
(cherry picked from commit 008b2e6544)
2022-06-24 17:45:46 +03:00
Mark Johnston
64717c0148 thread_single_end(): consistently maintain p_boundary_count for ALLPROC mode
(cherry picked from commit 2d5ef216b6)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
ea6a7512d2 thread_unsuspend(): do not unsuspend the suspended leader thread doing SINGLE_ALLPROC
(cherry picked from commit 1b4701fe1e)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
b66c168721 thread_single(): remove already checked conditional expression
(cherry picked from commit b9009b1789)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
d64c3f263f Do not single-thread itself when the process has single-threaded another process
(cherry picked from commit 4493a13e3b)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
36f99db22b weed_inhib(): correct the condition to re-suspend a thread
(cherry picked from commit dd883e9a7e)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
209131e656 weed_inhib(): do not double-suspend already suspended thread if the loop reiterates
(cherry picked from commit b9893b3533)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
ed770595b0 thread_single: wait for P_STOPPED_SINGLE to pass
(cherry picked from commit d7a9e6e740)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
c0a303b60b issignal(): ignore signals when process is single-threading for exit
(cherry picked from commit 02a2aacbe2)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
ddd432de61 P2_WEXIT: avoid thread_single() for exiting process earlier
(cherry picked from commit d3000939c7)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
dbb76ce57d Fix another race between fork(2) and PROC_REAP_KILL subtree
(cherry picked from commit 709783373e)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
78d5ef6505 Fix a race between fork(2) and PROC_REAP_KILL subtree
(cherry picked from commit 39794d80ad)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
034c2787d2 kern_procctl: add possibility to take stop_all_proc_block() around exec
(cherry picked from commit d1df347368)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
a03a5ac3ba Add stop_all_proc_block(9)
(cherry picked from commit 2e7595ef2f)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
ccba0468b8 reap_kill(): split children and subtree killers into helpers
(cherry picked from commit 54a11adbd9)
2022-06-24 17:45:45 +03:00
Konstantin Belousov
55b04fbcf8 reap_kill(): rename the reap variable to reaper
(cherry picked from commit 134529b11b)
2022-06-24 17:45:44 +03:00
Konstantin Belousov
70f54bc4f3 reap_kill(): de-inline LIST_FOREACH(), twice
(cherry picked from commit e4ce431e2a)
2022-06-24 17:45:44 +03:00
Konstantin Belousov
e6e9b3d736 reaper_abandon_children(): upgrade proctree_lock assert to exclusive
(cherry picked from commit b9294a3e15)
2022-06-24 17:45:44 +03:00
Mitchell Horne
0db22efc94 Use KERNEL_PANICKED() in more places
This is slightly more optimized than checking panicstr directly. For
most of these instances performance doesn't matter, but let's make
KERNEL_PANICKED() the common idiom.

Reviewed by:	mjg
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D35373

(cherry picked from commit 35eb9b10c2)
2022-06-23 19:19:26 -03:00
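
The idiom from 0db22efc94 can be illustrated with a small userland sketch. The macro name KERNEL_PANICKED() and the panicstr variable come from the commit message; everything else here is an assumption made for illustration, including the boolean flag, the use of the GCC/Clang builtin __builtin_expect as the branch hint, and the helper functions record_panic() and should_skip_locking(). It is not a copy of the kernel header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's panic state. */
static bool panicked;
static const char *panicstr;

/*
 * Assumed shape of the idiom: test a boolean flag and hint to the
 * compiler that the panicked case is rare.
 */
#define	KERNEL_PANICKED()	__builtin_expect(panicked, 0)

/* Hypothetical helper: record the message and raise the flag. */
static void
record_panic(const char *msg)
{
	assert(msg != NULL);
	panicstr = msg;
	panicked = true;
}

/* Hypothetical caller: use the macro instead of "panicstr != NULL". */
static bool
should_skip_locking(void)
{
	return (KERNEL_PANICKED());
}
```

Callers in this sketch get the same answer as a direct panicstr check, but share one spelling of the test, which is the point the commit message makes about a common idiom.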
Warner Losh
db761c6a64 Create wrapper for Giant taken for newbus
Create a wrapper for newbus to take giant and for busses to take it too.
bus_topo_lock() should be called before interacting with newbus routines
and unlocked with bus_topo_unlock(). If you need the topology lock for
some reason, bus_topo_mtx() will provide that.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D31831

(cherry picked from commit c6df6f5322)
2022-06-21 17:13:20 +02:00
Mark Johnston
c75a5bc2f6 vm_object: Use the vm_object_(set|clear)_flag() helpers
... rather than setting and clearing flags inline.  No functional change
intended.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 630f633f2a)
2022-06-21 08:53:24 -04:00
John Baldwin
e4aabaaa3d vfs: Consistently validate AT_* flags in kern_* functions.
Some syscalls checked for invalid AT_* flags in sys_* and others in
kern_*.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D32864

(cherry picked from commit 57093f9366)
2022-06-17 22:35:42 +03:00
Dmitry Chagin
089a76e915 Finish cpuset_getaffinity() after f35093f8
Split cpuset_getaffinity() into two counterparts, where
user_cpuset_getaffinity() is intended to operate on the cpuset_t from
user va, while kern_cpuset_getaffinity() expects the cpuset from kernel
va.
Accordingly, the code that clears the high bits is moved to the
user_cpuset_getaffinity(). Linux sched_getaffinity() syscall returns
the size of set copied to the user-space and then glibc wrapper clears
the high bits.

MFC after:		2 weeks

(cherry picked from commit d46174cd88)
2022-06-17 22:35:31 +03:00
Dmitry Chagin
7aeec4eea8 sysent: Get rid of bogus sys/sysent.h include.
Where appropriate hide sysent.h under proper condition.

MFC after:	2 weeks

(cherry picked from commit 31d1b816fe)
2022-06-17 22:35:31 +03:00
Dmitry Chagin
3cf95e49cb Retire sv_transtrap
Call translate_traps directly from sendsig().

MFC after:		2 weeks

(cherry picked from commit eca368ecb6)
2022-06-17 22:35:27 +03:00
Dmitry Chagin
7d2b9eb04c kqueue: Trim trailing whitespace
MFC after:		1 week

(cherry picked from commit 2479e381cd)
2022-06-17 22:35:24 +03:00
Dmitry Chagin
2fe96ee753 sysvsem: Fix a typo
Per jamie@ rpr can be NULL if the jail is created with sysvsem=disable.
But at least it doesn't appear to be fatal, since rpr is never dereferenced
but is only compared to other prison pointers.

Reviewed by:		jamie
Differential revision:	https://reviews.freebsd.org/D35198
MFC after:		2 weeks

(cherry picked from commit cb2ae61631)
2022-06-17 22:35:17 +03:00
Dmitry Chagin
c2f736c9ff sysvsem: Style(9)
MFC after:	2 weeks

(cherry picked from commit b6c8f461f0)
2022-06-17 22:35:17 +03:00
Dmitry Chagin
89daba2ff4 sysvsem: Trim trailing whitespace
MFC after:	2 weeks

(cherry picked from commit f0b0fdf15e)
2022-06-17 22:35:17 +03:00
Dmitry Chagin
6e2a3ed6a7 kdump: Decode cpuset_t.
Reviewed by:		jhb
Differential revision:	https://reviews.freebsd.org/D34982
MFC after:		2 weeks

(cherry picked from commit 586ed32106)
2022-06-17 22:35:15 +03:00
Dmitry Chagin
72bc1e6806 cpuset: Byte swap cpuset for compat32 on big endian architectures
Summary:
BITSET uses long as its basic underlying type, which is dependent on the
compile type, meaning on 32-bit builds the basic type is 32 bits, but on
64-bit builds it's 64 bits.  On little endian architectures this doesn't
matter, because the LSB is always at the low bit, so the words get
effectively concatenated moving between 32-bit and 64-bit, but on
big-endian architectures it throws a wrench in, as setting bit 0 in
32-bit mode is equivalent to setting bit 32 in 64-bit mode.  To
demonstrate:

32-bit mode:

BIT_SET(foo, 0):        0x00000001

64-bit sees: 0x0000000100000000

cpuset is the only system interface that uses bitsets, so solve this
by swapping the integer sub-components at the copyin/copyout points.

Reviewed by:    kib
Sponsored by:   Juniper Networks, Inc.
Differential Revision:  https://reviews.freebsd.org/D35225

(cherry picked from commit 47a57144af)

Fix the build after 47a57144

(cherry picked from commit 89737eb829)

cpuset: Fix the KASAN and KMSAN builds

Rename the "copyin" and "copyout" fields of struct cpuset_copy_cb to
something less generic, since sanitizers define interceptors for
copyin() and copyout() using #define.

Reported by:    syzbot+2db5d644097fc698fb6f@syzkaller.appspotmail.com
Fixes:  47a57144af ("cpuset: Byte swap cpuset for compat32 on big endian architectures")
Sponsored by:   The FreeBSD Foundation

(cherry picked from commit 4a3e51335e)

Use Linux semantics for the thread affinity syscalls.

Linux has more tolerant checks of the user supplied cpuset_t's.

Minimum cpuset_t size that the Linux kernel permits in case of
getaffinity() is the maximum CPU id, present in the system / NBBY,
the maximum size is not limited.
For setaffinity(), Linux does not limit the size of the user-provided
cpuset_t, internally using only the meaningful part of the set, where
the upper bound is the maximum CPU id, present in the system, no larger
than the size of the kernel cpuset_t.
Unlike FreeBSD, Linux ignores high bits if set in the setaffinity(),
so clear it in the sched_setaffinity() and Linuxulator itself.

Reviewed by:            Pau Amma (man pages)
In collaboration with:  jhb
Differential revision:  https://reviews.freebsd.org/D34849
MFC after:              2 weeks

(cherry picked from commit f35093f8d6)
2022-06-17 22:35:14 +03:00
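
The big-endian layout problem described in 72bc1e6806 can be reproduced on any host by modelling the memory as an array of 32-bit words. This is a minimal sketch, not the kernel's code: be64_view() and swap_word_pairs() are hypothetical helper names, and the sketch only assumes what the commit message states, namely that a 32-bit bitset word with bit 0 set is read by a 64-bit big-endian consumer as 0x0000000100000000 and that swapping the 32-bit sub-components fixes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Value a big-endian 64-bit load would produce from two adjacent
 * 32-bit words: the first word in memory becomes the high half.
 */
static uint64_t
be64_view(const uint32_t w[2])
{
	return (((uint64_t)w[0] << 32) | w[1]);
}

/*
 * Swap each pair of adjacent 32-bit words in place, as would be done
 * at a copyin/copyout boundary.  nwords must be even.
 */
static void
swap_word_pairs(uint32_t *w, size_t nwords)
{
	assert(nwords % 2 == 0);
	for (size_t i = 0; i + 1 < nwords; i += 2) {
		uint32_t tmp = w[i];

		w[i] = w[i + 1];
		w[i + 1] = tmp;
	}
}
```

Starting from the words {0x00000001, 0x00000000}, which is bit 0 set by a 32-bit process, be64_view() yields 0x0000000100000000 before the swap and 0x0000000000000001 after it.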
Dmitry Chagin
da7cee20aa sysvsem: Add a timeout argument to the semop.
For future use in the Linux emulation layer for the semtimedop syscall,
split the sys_semop syscall into two counterparts and add a
struct timespec *timeout argument to the latter.

Reviewed by:		jhb, kib
Differential revision:	https://reviews.freebsd.org/D35121
MFC after:		2 weeks

(cherry picked from commit f04534f5c8)
2022-06-17 22:34:42 +03:00