Bus drivers which use an rman to sub-divide a resource allocated from
a parent bus should handle mapping requests (and activate/deactivate
requests) for those sub-allocated resources by doing a subset mapping
of the resource allocated from the parent (and then using this to
handle activate/deactivate requests).
However, not all bus drivers which use internal rmans (such as acpi(4)
and pci_pci(4)) do that since not all nexus drivers support
bus_map/unmap. Eventually bus drivers should be updated to do this
properly at which point these assertions can be reenabled.
Reported by: delphij, kib
(cherry picked from commit ed88eef140a1c3d57d546f409c216806dd3da809)
These routines can be used to implement
bus_alloc/adjust/activate/deactivate/release_resource on bus drivers
which suballocate resources from rman(9) resource managers.
These methods require a new bus_get_rman method in the bus driver to
return the suitable rman for a given resource type. The
activate/deactivate helpers also require the bus to implement the
bus_map/unmap_resource methods.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42739
(cherry picked from commit 751615c538446ea0384f8faa9cb2508670c3799a)
This helper function for BUS_MAP_RESOURCE performs common argument
validation.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42723
(cherry picked from commit 19f073c612afa0111d216e5ccab9525bfc97ec32)
Change vfs_byname_kld to always return an error value of ENODEV to
indicate an unsupported fstype, leaving ENOENT to indicate errors such
as a missing mount point or invalid path. This allows nmount(2) to
better distinguish these cases and avoid treating a missing device
node as an invalid fstype after commit 6e8272f317.
While here, change mount(2) to return EINVAL instead of ENODEV for an
invalid fstype to match nmount(2).
PR: 274600
Reviewed by: pstef, markj
Differential Revision: https://reviews.freebsd.org/D42327
(cherry picked from commit 3eed4803f943e2937325e81140b88e2e8eea8deb)
When copy_file_range(2) was first being developed,
*inoffp + len had to be <= infile_size or an error was
returned. This semantic (as defined by Linux) changed
to allow *inoffp + len to be greater than infile_size and
the copy would end at infile_size (the end of the input file).
Unfortunately, the code that decided if the outfd should
be truncated in length did not get updated for this
semantics change.
As such, if a copy_file_range(2) is done, where infile_size - *inoffp
is less than outfile_size but len is large, the outfd file is truncated
when it should not be. (The Linux semantics are to not truncate outfd
in this case.)
This patch fixes the problem. I believe the calculation is safe
for all non-negative values of outsize, *outoffp, *inoffp and insize,
which should be ok, since they are all guaranteed to be non-negative.
Note that this bug is not observed over NFSv4.2, since it truncates
len to infile_size - *inoffp.
PR: 276045
(cherry picked from commit 2319ca6a01816f7fc85d623097c639f239e18c6a)
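The copy_file_range(2) truncation decision described above can be
modelled in userland as follows. This is only a sketch, not the kernel
code: needs_truncate() and its parameter names are hypothetical, and it
assumes truncation is only wanted when the copied range, clamped to the
end of the input file, reaches the end of the output file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model only.  Decide whether the output file would need
 * truncating, given that the copy stops at the end of the input file.
 * All arguments are assumed to be non-negative.
 */
static bool
needs_truncate(int64_t outsize, int64_t outoff, int64_t inoff,
    int64_t insize, int64_t len)
{
	int64_t avail, copied;

	/* Bytes actually available in the input starting at inoff. */
	avail = inoff < insize ? insize - inoff : 0;
	copied = len < avail ? len : avail;

	/* Nothing is overwritten if the copy starts at or past EOF. */
	if (outoff >= outsize)
		return (false);
	/* Guard the addition below against signed overflow. */
	if (outoff > INT64_MAX - copied)
		return (true);
	/* Truncate only if the clamped copy reaches the output's EOF. */
	return (outoff + copied >= outsize);
}
```

Using the clamped count rather than len is the point of the sketch: a
huge len with a short input file no longer looks like a copy that covers
the rest of the output file.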
Call sigexit rather than exit1 so that a core is generated.
If running the SIGABRT handler is desired, this would need to use
kern_psignal() instead. In that case a userspace wrapper in libc
would be needed to force an exit if the handler doesn't exit. Given
that abort2(2)'s intended use case is when userland is in a
sufficiently bad state such that it can't safely call syslog(3) before
abort(3), a userspace abort2(3) wrapper in libc might be dubious.
Reviewed by: Olivier Certner <olce.freebsd@certner.fr>, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42163
(cherry picked from commit 9b57e30cf5b6036263a1a2551df8574571c6f5a4)
In general we copy error strings as part of reporting an error from
lower layers, so if the copyout() fails there's nothing to do since we'd
prefer to preserve the original error.
This is in preparation for annotating copyin() and related functions
with __result_use_check.
Reviewed by: olce, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43147
(cherry picked from commit 099d25c354d93d9cd9c9cd261428f5ab0547a194)
This is in preparation for annotating copyin() and related functions
with __result_use_check.
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D43144
(cherry picked from commit 3379d9b5de4c4876a317d25ca008e66b1111b701)
It does not seem reasonable to return to userspace after calling
umtx_thread_exit().
This is in preparation for annotating copyin() and related functions
with __result_use_check.
Reviewed by: olce, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43143
(cherry picked from commit f450277f7a608f26624384e046c1987490c51296)
There is a documented bug in sendfile.2 which notes that sendfile(2)
does not raise an error if it fails to copy out the number of bytes
written. Explicitly ignore the error from copyout() calls in
preparation for annotating copyout() with __result_use_check.
Reviewed by: glebius, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43129
(cherry picked from commit d0adc2f283ad5db6b568ca533a056c9f635551cd)
Some implementations copy data to userspace, an operation which can in
principle fail. In preparation for adding a __result_use_check
annotation to copyin() and related functions, let implementations of
cpu_set_upcall() return an error, and check for errors when copying data
to user memory.
Reviewed by: kib, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43100
(cherry picked from commit 7b68fb5ab2a276ccd081cc1a43cebf0fb315e952)
This is in preparation for adding a __result_use_check annotation to
copyin() and related functions.
Reviewed by: imp, kib, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43099
(cherry picked from commit 4f35450ce52a7b141e7ae8d37fa257b5f8971dda)
syzbot reported a single boot-time crash in g_event_procbody(), a page
fault when dereferencing g_event_td. g_event_td is initialized by the
kproc_kthread_add() call which creates the GEOM event thread:
    kproc_kthread_add(g_event_procbody, NULL, &g_proc, &g_event_td,
        RFHIGHPID, 0, "geom", "g_event");
I believe that the caller of kproc_kthread_add() was preempted after
adding the new thread to the scheduler, and before setting *newtdp,
which is equal to g_event_td. Thus, since the first action of the GEOM
event thread is to lock itself, it ended up dereferencing a NULL
pointer.
Fix the problem simply by initializing *newtdp earlier. I see no harm
in that, and it matches kproc_create1(). The scheduler provides
sufficient synchronization to ensure that the store is visible to the
new thread, wherever it happens to run.
Reported by: syzbot+5397f4d39219b85a9409@syzkaller.appspotmail.com
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42986
(cherry picked from commit ae77041e0714627f9ec8045ca9ee2b6ea563138e)
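The ordering problem in the g_event_td report above can be shown with a
single-threaded userland model. Everything here is hypothetical
(fake_thread, fake_sched_add(), create_thread()); the "scheduler" simply
runs the new thread's body at once, which stands in for the creator being
preempted right after the thread becomes runnable.

```c
#include <assert.h>
#include <stddef.h>

struct fake_thread { int id; };

static struct fake_thread *event_td;	/* plays the role of g_event_td */
static int body_saw_null;

/* First action of the new thread: look at its own thread pointer. */
static void
event_body(void)
{
	body_saw_null = (event_td == NULL);
}

/*
 * Model of "make runnable": the new thread may run immediately, before
 * the creator executes another instruction.
 */
static void
fake_sched_add(void (*body)(void))
{
	body();
}

static void
create_thread(struct fake_thread *td, struct fake_thread **newtdp,
    void (*body)(void), int publish_early)
{
	if (publish_early)
		*newtdp = td;		/* fixed ordering */
	fake_sched_add(body);
	if (!publish_early)
		*newtdp = td;		/* old ordering: too late */
}
```

With publish_early set to 0 the body observes a NULL pointer; with 1 it
does not. The model says nothing about memory ordering, which the
message above attributes to the scheduler.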
This was handy for some ad-hoc debugging and fits in with other
kmsan_check_*() routines which operate on some kind of data container.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
(cherry picked from commit be5464ae233ada46a778cc82f7107a10a7d5343b)
Four pad bytes at the end of each xtty structure were not being cleared
before being copied out. Fix this by clearing the whole structure
before populating fields.
MFC after: 3 days
Reported by: KMSAN
(cherry picked from commit 3c0fb026b2fc998fa9bea8aed76e96c58671aee3)
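The xtty pad byte fix above follows a common pattern: zero the whole
object before filling in fields. A minimal sketch, using a hypothetical
struct record rather than the real xtty layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; on common ABIs pad bytes follow 'flag'. */
struct record {
	uint64_t	size;
	uint8_t		flag;
};

/* Zero the whole object first so pad bytes never carry stale data. */
static void
record_fill(struct record *r, uint64_t size, uint8_t flag)
{
	memset(r, 0, sizeof(*r));
	r->size = size;
	r->flag = flag;
}
```

Assigning the fields one by one without the memset() would leave
whatever was previously in memory in the trailing bytes, and a copy of
sizeof(struct record) bytes would then expose them.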
We only want to produce syscall.mk for the main syscall table so default
to not producing it (send it to /dev/null) and add a syscalls.conf to
sys/kern to trigger the creation of sys/sys/syscall.mk. This eliminates
the need for entries in other syscalls.conf files and is a cleaner
pattern going forward.
Reviewed by: kevans, imp
Differential Revision: https://reviews.freebsd.org/D42663
(cherry picked from commit 54d487c4d01d68ef0ac03eae1fc574f7533d46f6)
It is possible to reach this function from ddb via the "reset" command.
When this happens, we don't actually exit kdb, meaning we never execute
the latter steps of kdb_break() to restore the system state (e.g.
re-enable scheduler).
Therefore, we should not clear the kdb_active flag in this function, as
the debugger is still active. Put differently, kern_reboot() is not an
authority on kdb state, and should not touch it. The original motivation
for this assignment is not clear; I have checked thoroughly and I am
convinced it is not required by any reset code.
This fixes an edge case where a panic can be triggered during reset from
ddb:
1. Enter ddb via keyboard break sequence (KERNEL_PANICKED() == false &&
td->td_critnest > 0)
2. Execute the "reset" command
3. kern_reboot() sets kdb_active = false
4. A witness_checkorder() call via shutdown handler sees !kdb_active
and panics
Reviewed by: imp, markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42684
(cherry picked from commit 4e78a766f607192698514d970ff4e9fa91d0482d)
This is to handle the case where the system has not panicked but the
debugger is active, where we still can't wait for thread termination.
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42683
(cherry picked from commit 960612a19f009df602a4cb008fa90a45a6e869bb)
Don't try to gracefully terminate the pkt_manager thread if the
scheduler is not running.
We should not attempt to shut down ald if RB_NOSYNC is set, and must not
if the scheduler is stopped (the function calls wakeup()).
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42340
(cherry picked from commit d79a9edb5ce162c1ba49e12e5c93b894e6a25ad2)
AT_BSDFLAGS shouldn't be sign extended on 64-bit systems so use a
uint32_t instead of an int.
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D42365
(cherry picked from commit 326bf5089ca788d5ff1951eed7a9067281a2b65e)
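The sign extension concern behind the AT_BSDFLAGS change above can be
illustrated with two small helpers. The function names are hypothetical;
the conversions themselves are standard C.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 32-bit flags word to 64 bits via a signed type. */
static uint64_t
widen_signed(int32_t flags)
{
	return ((uint64_t)(int64_t)flags);	/* sign-extends bit 31 */
}

/* Widen a 32-bit flags word to 64 bits via an unsigned type. */
static uint64_t
widen_unsigned(uint32_t flags)
{
	return ((uint64_t)flags);		/* zero-extends */
}
```

If bit 31 of the flags word is ever set, the signed path fills the upper
32 bits with ones, while the unsigned path leaves them clear.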
It is very common and, according to dtrace while running poudriere,
almost all calls with SEEK_CUR pass 0.
(cherry picked from commit 305a2676ae93fb50a623024d51039415521cb2da)
When a process attempts to access a snapshot under
/<dataset>/.zfs/snapshot, the snapshot is automounted.
However, without this patch, the automount does not
set mnt_exjail, which results in the snapshot not being
accessible over NFS.
This patch defines a new function called vfs_exjail_clone()
which sets mnt_exjail from another mount point and
then uses that function to set mnt_exjail in the snapshot
automount. A separate patch, currently a pull request
for OpenZFS, calls this function to fix the problem.
PR: 275200
(cherry picked from commit f5f277728adec4c5b3e840a1fb16bd16f8cc956d)
If a module (e.g. the ertt hhook for TCP) can't clean up at
shutdown, there is nothing to be done about it. In the ertt case,
cleanup is just shutting down a UMA zone, which doesn't need to be
done. Suppress EBUSY warnings on shutdown.
PR: 271677
Reviewed by: tuexen, imp
Differential Revision: https://reviews.freebsd.org/D42650
(cherry picked from commit 415c1c748d5492e41328fedf96b6bf3c9be94595)
The tty_rubchar() code handling backspaces for UTF-8 characters didn't
properly check whether the beginning of the current line was reached.
This resulted in a kernel panic in ttyinq_unputchar() when prodded with
certain malformed UTF-8 sequences.
PR: 275009
Reviewed by: christos
Differential Revision: https://reviews.freebsd.org/D42564
(cherry picked from commit c6d7be214811c315d234d64c6cbaa92d4f55d2c1)
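The missing start-of-line check in the tty_rubchar() report above is the
kind of bound shown in this sketch. last_char_len() is a hypothetical
userland helper, not the tty(4) code: it walks backwards over UTF-8
continuation bytes and never looks before the start of the line.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return how many bytes the last UTF-8 character in line[0..len)
 * occupies, never scanning past the start of the line.
 */
static size_t
last_char_len(const unsigned char *line, size_t len)
{
	size_t n = 0;

	while (n < len) {
		n++;
		/* Stop at the first byte that is not 10xxxxxx. */
		if ((line[len - n] & 0xc0) != 0x80)
			break;
		/* UTF-8 characters are at most 4 bytes long. */
		if (n == 4)
			break;
	}
	return (n);
}
```

A line made up only of continuation bytes, which is malformed input,
makes the loop stop at the line start instead of running off the front
of the buffer.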
Use the new UMA_ALIGN_CACHE_AND_MASK() facility, which makes it possible
to request cache-line alignment while simultaneously guaranteeing a
minimum of 32 bytes of alignment (the 5 lower bits are always 0).
For the record, to this day, here's a (possibly non-exhaustive) list of
synchronization primitives using lower bits to store flags in pointers
to thread structures:
- lockmgr, rwlock and sx all use the 5 bits directly.
- rmlock indirectly relies on sx, so can use the 5 bits.
- mtx (non-spin) relies on the 3 lower bits.
Reviewed by: markj, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42266
(cherry picked from commit 7d1469e555bdce32b3dfc898478ae5564d5072b1)
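The reason 32-byte alignment matters for the lock primitives listed
above is that the 5 low bits of such a pointer are free to hold flags. A
userland sketch of that packing, with hypothetical names (TD_ALIGN,
TD_FLAGMASK, pack() and friends are not kernel identifiers):

```c
#include <assert.h>
#include <stdint.h>

#define	TD_ALIGN	32u				/* hypothetical */
#define	TD_FLAGMASK	((uintptr_t)TD_ALIGN - 1)	/* low 5 bits */

/* Pack flag bits into the low bits of a suitably aligned pointer. */
static uintptr_t
pack(void *td, uintptr_t flags)
{
	assert(((uintptr_t)td & TD_FLAGMASK) == 0);
	return ((uintptr_t)td | (flags & TD_FLAGMASK));
}

/* Recover the pointer by masking the flag bits off again. */
static void *
unpack_ptr(uintptr_t word)
{
	return ((void *)(word & ~TD_FLAGMASK));
}

/* Recover the flag bits. */
static uintptr_t
unpack_flags(uintptr_t word)
{
	return (word & TD_FLAGMASK);
}
```

If the allocator returned an object aligned to fewer than 32 bytes, the
assertion in pack() would fire, since a flag bit would overlap a real
address bit.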
When the cross-mount walking logic in vfs_lookup() was factored into
a separate function, the main cross-mount traversal loop was changed
from a do...while loop conditional on the current vnode having
VIRF_MOUNTPOINT set to an unconditional for(;;) loop. For the
unionfs 'crosslock' case in which the vnode may be re-locked, this
meant that continuing the loop upon finding inconsistent
v_mountedhere state would no longer branch to a check that the vnode
is in fact still a mountpoint. This would in turn lead to over-
iteration and, for INVARIANTS builds, a failed assert on the next
iteration.
Fix this by restoring the previous loop behavior.
Reported by: pho
Tested by: pho
Fixes: 80bd5ef070
(cherry picked from commit 586fed0b03561558644eccc37f824c7110500182)
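The loop-shape issue in the vfs_lookup() fix above comes down to where
"continue" lands. In a do...while loop it jumps to the condition; in a
for(;;) loop it jumps straight back to the top. A small model, with a
hypothetical struct node standing in for the vnode state:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; not the kernel structure. */
struct node {
	bool	mountpoint;
	int	retries;
};

/* Returns the number of loop bodies executed. */
static int
walk(struct node *n)
{
	int iters = 0;

	do {
		iters++;
		if (n->retries > 0) {
			/* State changed under us, e.g. after relocking. */
			n->retries--;
			n->mountpoint = false;
			continue;	/* jumps to the while () test */
		}
		break;
	} while (n->mountpoint);
	return (iters);
}
```

Here the loop exits after one iteration because the condition is
re-evaluated after "continue". Written as for(;;), the same body would
run a second time on a node that is no longer a mountpoint.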
Otherwise a KMSAN report (which panics the system by default) could
trigger a recursive panic.
MFC after: 1 week
Fixes: ca6cd604c8 ("kmsan: Use the correct origin bytes in kmsan_check_arg()")
(cherry picked from commit 346134f19aa9ba38a0384244609e2bcd4f7838f4)
When the scheduler is stopped, mtx_unlock() turns into a no-op, so the
loop
    while (mtx_owned(&Giant))
        mtx_unlock(&Giant);
runs forever if the calling thread has Giant locked.
Reviewed by: mhorne
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42460
(cherry picked from commit deacab756026f86515781944a9e0271e8db9f86b)
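The non-terminating Giant loop above can be reproduced with userland
stand-ins. None of these names are kernel APIs, and the guard in
drop_all() is only one possible way to avoid the loop; the actual change
may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Userland stand-ins; NOT the kernel's mutex or scheduler APIs. */
struct fake_mtx { int depth; };
static bool fake_scheduler_stopped;

static bool
fake_mtx_owned(const struct fake_mtx *m)
{
	return (m->depth > 0);
}

static void
fake_mtx_unlock(struct fake_mtx *m)
{
	if (fake_scheduler_stopped)
		return;		/* models the no-op behaviour */
	m->depth--;
}

/* Drop every recursion level, unless unlocking has become a no-op. */
static void
drop_all(struct fake_mtx *m)
{
	if (fake_scheduler_stopped)
		return;
	while (fake_mtx_owned(m))
		fake_mtx_unlock(m);
}
```

Without the early return, drop_all() would spin forever once
fake_scheduler_stopped is true and depth is non-zero, which mirrors the
behaviour described in the message.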
All of the kern_* prototypes belong in this header. While here, sort
the prototypes by function name.
Reviewed by: dchagin
Fixes: 6453d4240f vfs: Export exattr methods to reuse by Linuxulator
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D41766
(cherry picked from commit 3555be0124a4f105c72d932f00071f332691e8cf)
As of LLVM 16, the -fsanitize-memory-param-retval option is set to true
by default, meaning that MSan will eagerly report uninitialized function
parameters and return values, even if they are not used. A
witness_save()/witness_restore() call pair fails this test since
witness_save() may return before saving file and line number
information.
Modify witness_save() to initialize the out-params unconditionally; this
appears to be the only instance of the problem triggered when booting to
a login prompt, so let's just address it directly.
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
MFC after: 1 week
(cherry picked from commit 7123222220aa563dc16bf1989d335722e4ff57a6)
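The witness_save() change above amounts to writing the out-parameters on
every path. A minimal sketch of that pattern, with a hypothetical
save_location() in place of the real function:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup; writes both out-params on every path. */
static void
save_location(int tracked, const char **filep, int *linep)
{
	/* Initialize first so an early return leaves nothing undefined. */
	*filep = NULL;
	*linep = 0;
	if (!tracked)
		return;
	*filep = "example.c";
	*linep = 42;
}
```

A caller can then pass the saved values on to a matching restore call
without a sanitizer flagging them as uninitialized, even when the early
return was taken.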
Make sure that we don't try to copy with a negative resid.
Make sure that we don't walk off the end of the iovec array.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42098
(cherry picked from commit 8fd0ec53deaad34383d4b344714b74d67105b258)
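The two checks above (no negative resid, no walking off the end of the
iovec array) can be sketched in userland as below. gather() is a
hypothetical helper, not the kernel's uiomove(), and the kernel may well
assert rather than return an error.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Userland model of a gather copy.  Copies up to resid bytes from the
 * iovec array into dst and refuses inconsistent input.
 */
static int
gather(char *dst, const struct iovec *iov, int iovcnt, ssize_t resid)
{
	size_t cnt;
	int i = 0;

	if (resid < 0)
		return (EINVAL);	/* never copy a negative resid */
	while (resid > 0) {
		if (i >= iovcnt)
			return (EINVAL);	/* resid exceeds the iovecs */
		cnt = iov[i].iov_len;
		if (cnt > (size_t)resid)
			cnt = (size_t)resid;
		memcpy(dst, iov[i].iov_base, cnt);
		dst += cnt;
		resid -= (ssize_t)cnt;
		i++;
	}
	return (0);
}
```

The index check runs before each element is touched, so a resid larger
than the total iovec length is rejected instead of reading past the
array.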