syzbot reported a single boot-time crash in g_event_procbody(), a page
fault when dereferencing g_event_td. g_event_td is initialized by the
kproc_kthread_add() call which creates the GEOM event thread:
    kproc_kthread_add(g_event_procbody, NULL, &g_proc, &g_event_td,
        RFHIGHPID, 0, "geom", "g_event");
I believe that the caller of kproc_kthread_add() was preempted after
adding the new thread to the scheduler, and before setting *newtdp,
which is equal to g_event_td. Thus, since the first action of the GEOM
event thread is to lock itself, it ended up dereferencing a NULL
pointer.
Fix the problem simply by initializing *newtdp earlier. I see no harm
in that, and it matches kproc_create1(). The scheduler provides
sufficient synchronization to ensure that the store is visible to the
new thread, wherever it happens to run.
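To illustrate the ordering problem, here is a single-threaded model, not the kernel code. All demo_* names are hypothetical. Calling body() directly stands for the worst case in which the creator is preempted the moment the new thread becomes runnable.

```c
#include <assert.h>
#include <stddef.h>

struct demo_thread {
	int	dt_id;
};

static struct demo_thread *demo_td;	/* plays the role of g_event_td */
static int demo_saw_null;		/* set if the body ran too early */

/* Body of the new "thread": its first action is to look at its handle. */
static void
demo_body(void)
{
	if (demo_td == NULL)
		demo_saw_null = 1;
}

/*
 * Model of thread creation.  With publish_early set, *newtdp is stored
 * before the thread can run; otherwise it is stored afterwards, which
 * is the ordering that produced the crash described above.
 */
static void
demo_create(struct demo_thread *td, struct demo_thread **newtdp,
    void (*body)(void), int publish_early)
{
	assert(newtdp != NULL);
	if (publish_early)
		*newtdp = td;
	body();			/* "add to scheduler": thread may run now */
	if (!publish_early)
		*newtdp = td;	/* too late, the thread already looked */
}
```

In this model the late store lets demo_body() observe a NULL handle, while the early store does not.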
Reported by: syzbot+5397f4d39219b85a9409@syzkaller.appspotmail.com
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42986
(cherry picked from commit ae77041e0714627f9ec8045ca9ee2b6ea563138e)
This was handy for some ad-hoc debugging and fits in with other
kmsan_check_*() routines which operate on some kind of data container.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
(cherry picked from commit be5464ae233ada46a778cc82f7107a10a7d5343b)
Four pad bytes at the end of each xtty structure were not being cleared
before being copied out. Fix this by clearing the whole structure
before populating fields.
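A minimal sketch of the pattern, using a hypothetical struct demo_rec rather than the real xtty layout. On typical LP64 ABIs the int member is followed by four pad bytes, which is what the memset() is there to clear.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with trailing padding; not the real struct xtty. */
struct demo_rec {
	long	dr_size;
	int	dr_flags;	/* usually followed by pad bytes on LP64 */
};

/*
 * Fill *out for export.  Zeroing the whole structure first means the
 * pad bytes do not carry stale memory contents to the consumer.
 */
static void
demo_rec_fill(struct demo_rec *out, long size, int flags)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->dr_size = size;
	out->dr_flags = flags;
}
```

Clearing first and populating afterwards keeps the function correct even if fields are added later, since nothing depends on knowing where the padding is.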
MFC after: 3 days
Reported by: KMSAN
(cherry picked from commit 3c0fb026b2fc998fa9bea8aed76e96c58671aee3)
We only want to produce syscall.mk for the main syscall table so default
to not producing it (send it to /dev/null) and add a syscalls.conf to
sys/kern to trigger the creation of sys/sys/syscall.mk. This eliminates
the need for entries in other syscalls.conf files and is a cleaner
pattern going forward.
Reviewed by: kevans, imp
Differential Revision: https://reviews.freebsd.org/D42663
(cherry picked from commit 54d487c4d01d68ef0ac03eae1fc574f7533d46f6)
It is possible to reach this function from ddb via the "reset" command.
When this happens, we don't actually exit kdb, meaning we never execute
the latter steps of kdb_break() to restore the system state (e.g.
re-enable scheduler).
Therefore, we should not clear the kdb_active flag in this function, as
the debugger is still active. Put differently, kern_reboot() is not an
authority on kdb state, and should not touch it. The original motivation
for this assignment is not clear; I have checked thoroughly and I am
convinced it is not required by any reset code.
This fixes an edge case where a panic can be triggered during reset from
ddb:
1. Enter ddb via keyboard break sequence (KERNEL_PANICKED() == false &&
td->td_critnest > 0)
2. Execute the "reset" command
3. kern_reboot() sets kdb_active = false
4. A witness_checkorder() call via shutdown handler sees !kdb_active
and panics
Reviewed by: imp, markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42684
(cherry picked from commit 4e78a766f607192698514d970ff4e9fa91d0482d)
This is to handle the case where the system has not panicked but the
debugger is active, where we still can't wait for thread termination.
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42683
(cherry picked from commit 960612a19f009df602a4cb008fa90a45a6e869bb)
Don't try to gracefully terminate the pkt_manager thread if the
scheduler is not running.
We should not attempt to shut down ald if RB_NOSYNC is set, and must not
if the scheduler is stopped (the function calls wakeup()).
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42340
(cherry picked from commit d79a9edb5ce162c1ba49e12e5c93b894e6a25ad2)
AT_BSDFLAGS shouldn't be sign extended on 64-bit systems so use a
uint32_t instead of an int.
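The effect can be seen in isolation with two hypothetical helpers (not the code touched by this change) that widen a 32-bit flags word to 64 bits:

```c
#include <assert.h>
#include <stdint.h>

/* Widening from int: bit 31 is sign-extended into the upper half. */
static uint64_t
widen_from_int(int flags)
{
	return ((uint64_t)(int64_t)flags);
}

/* Widening from uint32_t: the upper half stays zero. */
static uint64_t
widen_from_u32(uint32_t flags)
{
	return ((uint64_t)flags);
}
```

With the top bit set, widen_from_int() yields a value whose upper 32 bits are all ones, while widen_from_u32() leaves them clear.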
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D42365
(cherry picked from commit 326bf5089ca788d5ff1951eed7a9067281a2b65e)
This case is very common: according to dtrace, almost all calls with
SEEK_CUR made while running poudriere pass an offset of 0.
(cherry picked from commit 305a2676ae93fb50a623024d51039415521cb2da)
When a process attempts to access a snapshot under
/<dataset>/.zfs/snapshot, the snapshot is automounted.
However, without this patch, the automount does not
set mnt_exjail, which results in the snapshot not being
accessible over NFS.
This patch defines a new function called vfs_exjail_clone()
which sets mnt_exjail from another mount point and
then uses that function to set mnt_exjail in the snapshot
automount. A separate patch, currently a pull request for OpenZFS,
calls this function to fix the problem.
PR: 275200
(cherry picked from commit f5f277728adec4c5b3e840a1fb16bd16f8cc956d)
If a module (e.g. the ertt hhook for TCP) can't clean up at
shutdown, there is nothing to be done about it. In the ertt case,
cleanup is just shutting down a UMA zone, which doesn't need to be
done. Suppress EBUSY warnings on shutdown.
PR: 271677
Reviewed by: tuexen, imp
Differential Revision: https://reviews.freebsd.org/D42650
(cherry picked from commit 415c1c748d5492e41328fedf96b6bf3c9be94595)
The tty_rubchar() code handling backspaces for UTF-8 characters didn't
properly check whether the beginning of the current line was reached.
This resulted in a kernel panic in ttyinq_unputchar() when prodded with
certain malformed UTF-8 sequences.
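A sketch of a bounded backward scan, assuming the line is available as a flat byte array. rub_nbytes() and utf8_is_cont() are hypothetical names, not the tty code. The point is only that the scan stops at the start of the line and after four bytes, so malformed input cannot push it out of bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	UTF8_MAX_BYTES	4	/* longest sequence permitted by RFC 3629 */

/* True for UTF-8 continuation bytes, which have the form 10xxxxxx. */
static bool
utf8_is_cont(unsigned char c)
{
	return ((c & 0xc0) == 0x80);
}

/*
 * Return how many bytes to erase for the character ending at
 * line[len - 1].  Falls back to a single byte when the bytes scanned
 * are not preceded by a plausible lead byte (11xxxxxx).
 */
static size_t
rub_nbytes(const unsigned char *line, size_t len)
{
	size_t n;

	assert(line != NULL);
	if (len == 0)
		return (0);
	n = 1;
	while (n < len && n < UTF8_MAX_BYTES && utf8_is_cont(line[len - n]))
		n++;
	if (n > 1 && (line[len - n] & 0xc0) != 0xc0)
		return (1);
	return (n);
}
```

This does not check that the lead byte announces exactly n bytes; a full implementation would validate that as well.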
PR: 275009
Reviewed by: christos
Differential Revision: https://reviews.freebsd.org/D42564
(cherry picked from commit c6d7be214811c315d234d64c6cbaa92d4f55d2c1)
This uses the new UMA_ALIGN_CACHE_AND_MASK() facility, which makes it
possible to simultaneously guarantee a minimum of 32 bytes of alignment
(the 5 lower bits are always 0).
For the record, to this day, here's a (possibly non-exhaustive) list of
synchronization primitives using lower bits to store flags in pointers
to thread structures:
- lockmgr, rwlock and sx all use the 5 bits directly.
- rmlock indirectly relies on sx, so can use the 5 bits.
- mtx (non-spin) relies on the 3 lower bits.
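For readers unfamiliar with the technique, here is a userland sketch of storing flags in the low bits of an aligned pointer. The owner_* names and the 32-byte constant are illustrative assumptions; the kernel primitives listed above each have their own encodings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define	OWNER_ALIGN	32u				/* bytes */
#define	OWNER_FLAGMASK	((uintptr_t)OWNER_ALIGN - 1)	/* low 5 bits */

/* Pack an aligned owner pointer and up to 5 flag bits into one word. */
static uintptr_t
owner_pack(void *owner, unsigned int flags)
{
	/* Only valid if the pointer's low bits are guaranteed to be 0. */
	assert(((uintptr_t)owner & OWNER_FLAGMASK) == 0);
	assert(((uintptr_t)flags & ~OWNER_FLAGMASK) == 0);
	return ((uintptr_t)owner | flags);
}

/* Recover the pointer by masking off the flag bits. */
static void *
owner_ptr(uintptr_t word)
{
	return ((void *)(word & ~OWNER_FLAGMASK));
}

/* Recover the flag bits. */
static unsigned int
owner_flags(uintptr_t word)
{
	return ((unsigned int)(word & OWNER_FLAGMASK));
}
```

An object obtained with aligned_alloc(32, 64) can be packed together with any flag value from 0 to 31 and recovered unchanged, which is the property the alignment guarantee exists to protect.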
Reviewed by: markj, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42266
(cherry picked from commit 7d1469e555bdce32b3dfc898478ae5564d5072b1)
When the cross-mount walking logic in vfs_lookup() was factored into
a separate function, the main cross-mount traversal loop was changed
from a do...while loop conditional on the current vnode having
VIRF_MOUNTPOINT set to an unconditional for(;;) loop. For the
unionfs 'crosslock' case in which the vnode may be re-locked, this
meant that continuing the loop upon finding inconsistent
v_mountedhere state would no longer branch to a check that the vnode
is in fact still a mountpoint. This would in turn lead to over-
iteration and, for INVARIANTS builds, a failed assert on the next
iteration.
Fix this by restoring the previous loop behavior.
Reported by: pho
Tested by: pho
Fixes: 80bd5ef070
(cherry picked from commit 586fed0b03561558644eccc37f824c7110500182)
Otherwise a KMSAN report (which panics the system by default) could
trigger a recursive panic.
MFC after: 1 week
Fixes: ca6cd604c8 ("kmsan: Use the correct origin bytes in kmsan_check_arg()")
(cherry picked from commit 346134f19aa9ba38a0384244609e2bcd4f7838f4)
When the scheduler is stopped, mtx_unlock() turns into a no-op, so the
loop
    while (mtx_owned(&Giant))
        mtx_unlock(&Giant);
runs forever if the calling thread has Giant locked.
Reviewed by: mhorne
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42460
(cherry picked from commit deacab756026f86515781944a9e0271e8db9f86b)
All of the kern_* prototypes belong in this header. While here, sort
the prototypes by function name.
Reviewed by: dchagin
Fixes: 6453d4240f vfs: Export exattr methods to reuse by Linuxulator
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D41766
(cherry picked from commit 3555be0124a4f105c72d932f00071f332691e8cf)
As of LLVM 16, the -fsanitize-memory-param-retval option is set to true
by default, meaning that MSan will eagerly report uninitialized function
parameters and return values, even if they are not used. A
witness_save()/witness_restore() call pair fails this test since
witness_save() may return before saving file and line number
information.
Modify witness_save() to initialize the out-params unconditionally; this
appears to be the only instance of the problem triggered when booting to
a login prompt, so let's just address it directly.
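The shape of the fix, sketched with hypothetical demo_* names rather than the witness code: the out-params are written before any early return, so the caller never passes uninitialized values on to the restore side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock record; not the witness data structures. */
struct demo_lock {
	bool		 dl_tracked;
	const char	*dl_file;
	int		 dl_line;
};

static void
demo_save(const struct demo_lock *lk, const char **filep, int *linep)
{
	assert(filep != NULL && linep != NULL);

	/* Initialize first so every return path leaves them defined. */
	*filep = NULL;
	*linep = 0;

	if (lk == NULL || !lk->dl_tracked)
		return;
	*filep = lk->dl_file;
	*linep = lk->dl_line;
}
```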
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
MFC after: 1 week
(cherry picked from commit 7123222220aa563dc16bf1989d335722e4ff57a6)
Make sure that we don't try to copy with a negative resid.
Make sure that we don't walk off the end of the iovec array.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42098
(cherry picked from commit 8fd0ec53deaad34383d4b344714b74d67105b258)
Accesses to KMSAN's TLS block are not instrumented, so there's no need
to use kmsan_memset(). No functional change intended.
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
(cherry picked from commit e5caed14067b40f1454d74e99789a28508d0eea3)
This is a temporary solution to fix the PR before release.
During 15.0 it's necessary to refactor symlinks handling
between vfs & namecache.
PR: 273414
Reported by: Vincent Milum Jr, Dan Kotowski, glebius
Tested by: Dan Kotowski, glebius
Reviewed by:
Differential Revision: https://reviews.freebsd.org/D41806
MFC after: 3 days
(cherry picked from commit bb8ecf259f96510b9c2146d846403393543061b7)
This patch fixes UTF-8 sequence validation logic in
teken_utf8_bytes_to_codepoint() and fixes fallback behaviour in
ttydisc_rubchar() when an invalid UTF-8 sequence is encountered. The code
previously used __bitcount() to extract sequence length information from
the leading byte. However, this assumption breaks for certain code
points that have additional bits set in the first half of the leading
byte (e.g. Cyrillic characters). This led to incorrect behaviour when
deleting those characters using backspaces. The code now checks the
number of consecutive set bits in the leading byte starting from the
MSB, as per RFC 3629.
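The difference between the two approaches can be shown with a standalone sketch. popcount8() and utf8_seq_len() are hypothetical helpers, not the teken code, and the sketch only inspects the bit pattern of the lead byte; it does not reject the specific lead values that RFC 3629 also forbids.

```c
#include <assert.h>

/* Count all set bits in a byte, as a plain population count would. */
static int
popcount8(unsigned char b)
{
	int n = 0;

	for (; b != 0; b >>= 1)
		n += b & 1;
	return (n);
}

/*
 * Sequence length implied by a lead byte: the number of consecutive
 * set bits starting at the MSB.  Returns 0 for bytes that cannot start
 * a sequence (continuation bytes and patterns longer than 4 bytes).
 */
static int
utf8_seq_len(unsigned char lead)
{
	int ones = 0;

	while (ones < 8 && (lead & (0x80 >> ones)) != 0)
		ones++;
	if (ones == 0)
		return (1);	/* 0xxxxxxx: single byte */
	if (ones == 1 || ones > 4)
		return (0);	/* 10xxxxxx or too many leading ones */
	return (ones);
}
```

For the lead byte 0xD0 (binary 11010000), which starts many Cyrillic characters, a population count sees three set bits, whereas only the first two are consecutive from the MSB, so the sequence is two bytes long.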
Reviewed by: christos
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42147
(cherry picked from commit 2fed1c579c52d63b72fc08ffcc652ba0183f9254)
This patch adds additional logic in ttydisc_rubchar() to properly handle
backspace behaviour for UTF-8 characters.
Currently, typing a backspace after a UTF-8 character will delete only
one byte from the byte sequence, leaving garbled output in the tty's
output queue. With this change all of the character's bytes are deleted.
This change is only active when the IUTF8 flag is set (see
19054eb6053189144aa962b2ecc1bf5087758a3e "(s)tty: add support for IUTF8
input flag")
The code uses the teken_wcwidth() function to properly handle character
column widths for different code points, and adds the
teken_utf8_bytes_to_codepoint() function that converts a UTF-8 byte
sequence to a codepoint, as specified in RFC 3629.
Reported by: christos
Reviewed by: christos, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42067
(cherry picked from commit 9e589b0938579f3f4d89fa5c051f845bf754184d)
This patch adds the necessary kernel and stty code to support setting
the IUTF8 flag for ttys. It is the first of two patches that fix
backspace behaviour for UTF-8 encoded characters when in canonical mode.
Reported by: christos
Reviewed by: christos, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42066
(cherry picked from commit 128f63cedc14ae21b35f74e11e2fe1a5659c58e8)
When recvmsg(2) is used with MSG_TRUNC on an atomic socket type (DGRAM
or SEQPACKET), soreceive_generic() and uipc_peek_dgram() may
intentionally underflow uio_resid so that userspace can find out how
many bytes it should have asked for.
If this happens, and KTR_GENIO is enabled, ktrgenio() will attempt to
copy in beyond the end of the output buffer's iovec. In general this
will silently cause the ktrace operation to fail since it'll result in
EFAULT from uiomove(). Let's be more careful and make sure not to try
and copy more bytes than we have.
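The idea of the clamp, sketched in userland with the POSIX struct iovec. genio_copy_len() and iov_total() are hypothetical helpers, not the ktrgenio() implementation: a reported transfer size that is negative or larger than the buffers is never used as a copy length.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sum the lengths of all iovec entries. */
static size_t
iov_total(const struct iovec *iov, int iovcnt)
{
	size_t total = 0;
	int i;

	for (i = 0; i < iovcnt; i++)
		total += iov[i].iov_len;
	return (total);
}

/*
 * Number of bytes that may safely be copied.  'done' is the transfer
 * size reported by the operation; with MSG_TRUNC it can exceed what the
 * buffers actually hold, so it is clamped to the iovec total.
 */
static size_t
genio_copy_len(const struct iovec *iov, int iovcnt, ssize_t done)
{
	size_t total;

	assert(iovcnt >= 0);
	if (done <= 0)
		return (0);
	total = iov_total(iov, iovcnt);
	return ((size_t)done < total ? (size_t)done : total);
}
```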
Fixes: be1f485d7d ("sockets: add MSG_TRUNC flag handling for recvfrom()/recvmsg().")
Reported by: syzbot+30b4bb0c0bc0f53ac198@syzkaller.appspotmail.com
Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42099
(cherry picked from commit 761ae1ce798add862d78728cc5ac5240ce7db779)
The loader tunable 'debug.kmsan.disabled' does not have a corresponding
sysctl MIB entry. Add it so that it can be retrieved, and `sysctl -T`
will also report it correctly.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42138
(cherry picked from commit 1d2b743784f7527a6840fe35ddb7e34cd41bc17a)
The loader tunable 'debug.kasan.disabled' does not have a corresponding
sysctl MIB entry. Add it so that it can be retrieved, and `sysctl -T`
will also report it correctly.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42138
(cherry picked from commit db5d0bc868be669ed6588ebeccf8c02e76aabc41)
The loader tunable 'kern.boottrace.table_size' does not have a
corresponding sysctl MIB entry. Add it so that it can be retrieved,
and `sysctl -T` will also report it correctly.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42138
(cherry picked from commit 51dc362d1a148362dc4cfacaa3629db928523204)
cr_canseeotheruids(), cr_canseeothergids() and cr_canseejailproc()
should not be used directly now. cr_bsd_visible() has to be called
instead.
Reviewed by: mhorne
Sponsored by: Kumacom SAS
Differential Revision: https://reviews.freebsd.org/D40629
(cherry picked from commit 91e9d669b475d1900e8dc01a49ad90a621c4a068)
Using the effective group and not the real one when testing membership
has the consequence that unprivileged processes cannot see setuid
commands they launch until these have relinquished their privileges.
This is also in contradiction with how the similar cr_canseeotheruids()
works, i.e., by taking into account real user IDs.
Fix this by substituting groupmember() with realgroupmember(). While
here, simplify the code.
PR: 272093
Reviewed by: mhorne
MFC after: 2 weeks
Sponsored by: Kumacom SAS
Differential Revision: https://reviews.freebsd.org/D40642
Differential Revision: https://reviews.freebsd.org/D40644
(cherry picked from commit 91658080f1a598ddda03943a783c9a941199f7d2)
(cherry picked from commit 0452dd841336cea7cd979b13ef12b6ea5e992eff)
Like groupmember(), but taking into account the real group instead of
the effective group. Leverages the new supplementary_group_member()
function.
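A simplified model of the two checks, using a hypothetical struct demo_cred instead of the kernel's struct ucred and plain unsigned integers for group IDs. It only illustrates which primary group each function consults; the shared supplementary-group search is a linear scan here for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical credential; not the layout of struct ucred. */
struct demo_cred {
	unsigned int		 dc_egid;	/* effective group */
	unsigned int		 dc_rgid;	/* real group */
	int			 dc_ngroups;
	const unsigned int	*dc_groups;	/* supplementary groups */
};

/* Shared search through the supplementary groups. */
static bool
demo_supp_member(unsigned int gid, const struct demo_cred *cr)
{
	int i;

	for (i = 0; i < cr->dc_ngroups; i++)
		if (cr->dc_groups[i] == gid)
			return (true);
	return (false);
}

/* Membership judged against the effective group. */
static bool
demo_groupmember(unsigned int gid, const struct demo_cred *cr)
{
	assert(cr != NULL);
	return (gid == cr->dc_egid || demo_supp_member(gid, cr));
}

/* Same, but judged against the real group. */
static bool
demo_realgroupmember(unsigned int gid, const struct demo_cred *cr)
{
	assert(cr != NULL);
	return (gid == cr->dc_rgid || demo_supp_member(gid, cr));
}
```

For a credential whose effective group differs from its real group, the two functions disagree only on those two primary groups and agree on every supplementary group.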
Reviewed by: mhorne
MFC after: 2 weeks
Sponsored by: Kumacom SAS
Differential Revision: https://reviews.freebsd.org/D40641
Differential Revision: https://reviews.freebsd.org/D40643
(cherry picked from commit 2a2bfa6ad92e9c82dcc55733ad2fd58fd2ea7559)
(cherry picked from commit 5d9f38405a10fdcd9fc108c940dcf2642e9f1833)
This is in preparation for the introduction of the new realgroupmember()
function, which does the same search into supplementary groups as
groupmember().
Reviewed by: mhorne
MFC after: 2 weeks
Sponsored by: Kumacom SAS
Differential Revision: https://reviews.freebsd.org/D40640
(cherry picked from commit b725f232f3b09b4bcbc426854fe1545234c66965)
As implemented, this security policy would only prevent seeing processes
in sub-jails, but would not prevent sending signals to, changing
priority of or debugging processes in these, enabling attacks where
unprivileged users could tamper with random processes in sub-jails in
particular circumstances (conflated UIDs) despite the policy being
enforced.
PR: 272092
Reviewed by: mhorne
MFC after: 2 weeks
Sponsored by: Kumacom SAS
Differential Revision: https://reviews.freebsd.org/D40628
(cherry picked from commit 5817169bc4a06a35aa5ef7f5ed18f6cb35037e18)