This is important for wpa_supplicant operation on a crowded network.
Note: we actually need an API to increase maximum datagram size on a
socket. Previously SO_SNDBUF magically acted like that, but that was
an undocumented "feature".
Also move the comment to the proper line. Previously it was the receive
buffer that imposed the limit. Now the notions of buffer size and maximum
datagram size are separate.
Reviewed by: bz, tuexen, karels
Differential Revision: https://reviews.freebsd.org/D42830
PR: 274990
Just as was done for accept(2) in cfb1e92912, use the same approach
for two simpler syscalls that return socket addresses. Although
these two syscalls aren't performance critical, this change shares
some code between the 3 syscalls, trimming code size.
Following the example of accept(2), provide VNET-aware and INVARIANT-checking
wrappers sopeeraddr() and sosockaddr() around protosw methods.
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D42694
Let the accept functions provide stack memory for protocols to fill in.
Generic code should provide sockaddr_storage, specialized code may provide
a smaller structure.
While rewriting accept(2) make 'addrlen' a true in/out parameter, reporting
the required length if the provided length was insufficient. Our manual
page accept(2) and POSIX don't explicitly require that, but one can read
the text as they do. Linux also does that. Update tests accordingly.
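
As a rough illustration of the in/out 'addrlen' semantics described above, here is a minimal userland model. It is not the kernel code; the function name copyout_addr() and its parameters are invented for this sketch. It copies as much of the address as fits and then reports the length that would have been required:

```c
#include <stddef.h>
#include <string.h>

/*
 * Userland model only (hypothetical helper, not a kernel function):
 * copy at most *addrlen bytes of the address into buf, then store the
 * full required length back into *addrlen.
 */
void
copyout_addr(const void *addr, size_t needed, void *buf, size_t *addrlen)
{
	size_t n = *addrlen < needed ? *addrlen : needed;

	if (n > 0)
		memcpy(buf, addr, n);
	*addrlen = needed;	/* in: buffer size, out: required size */
}
```

A caller that sees the returned length exceed the size it passed in knows the address was truncated and can retry with a larger buffer.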
Reviewed by: rscheff, tuexen, zlei, dchagin
Differential Revision: https://reviews.freebsd.org/D42635
Currently, a prison in "dying" state (removed but still holding
resources) can be brought back to alive state via "jail -d", or
the JAIL_DYING flag to jail_set(2). This seemed like a good idea
at the time.
Its main use was to improve support for specifying the jid when
creating a jail, which also seemed like a good idea at the time.
But resurrecting a jail that was partway through the process of
shutting down is trouble waiting to happen.
This patch deprecates that flag, leaving it as a no-op for creating
jails (but still useful for looking at dying jails). It still allows
creating a new jail with the same jid as a dying one, but will renumber
the old one in that case. That's imperfect, but allows for current
behavior.
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D28150
It is similar to VOP_GETWRITEMOUNT(), and for a given vnode vp it should
return the lower vnode which would actually handle writes to vp.
The flags argument allows specifying FREAD or FWRITE for the benefit of
a possible unionfs implementation.
Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42603
Since a kqueue timer may exist after the process that created it has
exited (same scenario with rfork(2) as in PR 275286), make the tailq
p_kqtim_stop accessed by filt_timerdetach() type-stable.
Noted and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42777
It is enough to know the knlist to remove from it, and the list is
autodestroyed on the last removal.
PR: 275286
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42777
This reverts commit 393ac29f0b. A different fix follows, which
preserves the semantics required by the
sys.kqueue.proc3_test.proc3 test.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
PR: 275286
Differential revision: https://reviews.freebsd.org/D42777
If you have a pointer which you know points to stale data, you can
fill it with junk so that a later dereference will trap.
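
The idea can be sketched in userland C as follows. The macro name JUNK_FILL_PTR and the fill byte are assumptions made for this illustration and are not taken from the commit. Whether a later dereference actually traps depends on the platform; on common 64-bit machines the resulting bit pattern is typically not a mapped address.

```c
#include <string.h>

#define	JUNK_BYTE	0xa5	/* arbitrary fill pattern chosen for this sketch */

/*
 * Overwrite the pointer variable itself (not the memory it points to),
 * so the stale address can no longer be used by accident.
 */
#define	JUNK_FILL_PTR(p)	memset(&(p), JUNK_BYTE, sizeof(p))
```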
Reviewed by: kib
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D40946
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.
Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
Sponsored by: Netflix
For the uncommon items: Go through the tree and remove SCCS tags that
didn't fit any nice pattern. Where they were nearby, other SCM tags were
also removed when they were detritus of long-ago CVS from the early
mists of the project. Some adjacent copyright strings were removed (they
duplicated the copyright notices in the file). This also removed
non-standard ways of omitting SCCS tags (usually by adding an
extra #if 0 somewhere).
After this commit, a number of strings tagged with the 'what' @(#)
prefix remain, but they are primarily copyright notices.
Sponsored by: Netflix
Remove ancient SCCS tags from the tree using automated scripting, with
two minor fixups to keep things compiling. All the common forms in the
tree were removed with a perl script.
Sponsored by: Netflix
Bus drivers which use an rman to sub-divide a resource allocated from
a parent bus should handle mapping requests (and activate/deactivate
requests) for those sub-allocated resources by doing a subset mapping
of the resource allocated from the parent (and then using this to
handle activate/deactivate requests).
However, not all bus drivers which use internal rmans (such as acpi(4)
and pci_pci(4)) do that since not all nexus drivers support
bus_map/unmap. Eventually bus drivers should be updated to do this
properly at which point these assertions can be reenabled.
Reported by: delphij, kib
These routines can be used to implement
bus_alloc/adjust/activate/deactivate/release_resource on bus drivers
which suballocate resources from rman(9) resource managers.
These methods require a new bus_get_rman method in the bus driver to
return the suitable rman for a given resource type. The
activate/deactivate helpers also require the bus to implement the
bus_map/unmap_resource methods.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42739
Normally, a process already has all its kqueue fds destroyed by the time
p_klist is detached in the exit flow. But if the process was created with
rfork(2) with shared file descriptors, its signal knotes can survive.
Then, knlist_detach() does not destroy the non-empty knlist. Later, when
the owning kqueue is closed, we access freed (or rather, reused, because
struct proc is type-stable) memory by referencing p->p_klist from such
a knote.
Handle this situation by deleting all knotes hanging from p_klist.
PR: 275286
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42745
This helper function for BUS_MAP_RESOURCE performs common argument
validation.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42723
It is possible to reach this function from ddb via the "reset" command.
When this happens, we don't actually exit kdb, meaning we never execute
the latter steps of kdb_break() to restore the system state (e.g.
re-enabling the scheduler).
Therefore, we should not clear the kdb_active flag in this function, as
the debugger is still active. Put differently, kern_reboot() is not an
authority on kdb state, and should not touch it. The original motivation
for this assignment is not clear; I have checked thoroughly and I am
convinced it is not required by any reset code.
This fixes an edge case where a panic can be triggered during reset from
ddb:
1. Enter ddb via keyboard break sequence (KERNEL_PANICKED() == false &&
td->td_critnest > 0)
2. Execute the "reset" command
3. kern_reboot() sets kdb_active = false
4. A witness_checkorder() call via shutdown handler sees !kdb_active
and panics
Reviewed by: imp, markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42684
This is to handle the case where the system has not panicked but the
debugger is active, where we still can't wait for thread termination.
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42683
Don't try to gracefully terminate the pkt_manager thread if the
scheduler is not running.
We should not attempt to shutdown ald if RB_NOSYNC is set, and must not
if the scheduler is stopped (the function calls wakeup()).
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42340
When a process attempts to access a snapshot under
/<dataset>/.zfs/snapshot, the snapshot is automounted.
However, without this patch, the automount does not
set mnt_exjail, which results in the snapshot not being
accessible over NFS.
This patch defines a new function called vfs_exjail_clone()
which sets mnt_exjail from another mount point and
then uses that function to set mnt_exjail in the snapshot
automount. A separate patch that is currently a pull request
for OpenZFS, calls this function to fix the problem.
PR: 275200
Reviewed by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42672
Just skip compiling this file if RACCT isn't defined. This allows
skipping the inclusion of headers that no code uses at all, and also
removing the whole file's #ifdef/#endif bracketing.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
If RCTL is not defined, only the system call stubs returning ENOSYS are
compiled in. In this case, don't waste time including most headers
since their code is not used.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
It seems this was an "emergency" knob to revert a newly introduced
behavior. Overall, we want better system-wide signal receive latency,
and it doesn't seem that some contrary policy was ever needed (and if
that comes up, it should rather be implemented, e.g., per-process).
Suggested by: kib
Reviewed by: kib, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42315
in case COMPAT_FREEBSD32 was enabled in config but hardware does not
support executing 32bit binaries.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42641
Change vfs_byname_kld to always return an error value of ENODEV to
indicate an unsupported fstype, leaving ENOENT to indicate errors such
as a missing mount point or invalid path. This allows nmount(2) to
better distinguish these cases and avoid treating a missing device
node as an invalid fstype after commit 6e8272f317.
While here, change mount(2) to return EINVAL instead of ENODEV for an
invalid fstype to match nmount(2).
PR: 274600
Reviewed by: pstef, markj
Differential Revision: https://reviews.freebsd.org/D42327
We only want to produce syscall.mk for the main syscall table so default
to not producing it (send it to /dev/null) and add a syscalls.conf to
sys/kern to trigger the creation of sys/sys/syscall.mk. This eliminates
the need for entries in other syscalls.conf files and is a cleaner
pattern going forward.
Reviewed by: kevans, imp
Differential Revision: https://reviews.freebsd.org/D42663
If a module (e.g. the ertt hhook for TCP) can't clean up at
shutdown, there is nothing to be done about it. In the ertt case,
cleanup is just shutting down a UMA zone, which doesn't need to be
done. Suppress EBUSY warnings on shutdown.
PR: 271677
Reviewed by: tuexen, imp
Differential Revision: https://reviews.freebsd.org/D42650
Lock the vnode in the most exclusive lock mode requested, once.
All callers already ensure that vp1 != vp2 or are careful enough to only
unlock once otherwise.
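
A simplified userland model of "lock once, in the most exclusive mode requested" is sketched below. All names here (struct obj, lock_pair(), the mode constants) are hypothetical, and the model ignores the ordering and deadlock concerns that real vnode locking has to deal with:

```c
#include <stddef.h>

/* Hypothetical lock modes, ordered so that a larger value is stronger. */
enum lkmode { MODE_SHARED = 1, MODE_EXCLUSIVE = 2 };

struct obj {
	int		lock_count;	/* how many times it was locked */
	enum lkmode	mode;		/* mode of the last acquisition */
};

static void
lock_one(struct obj *o, enum lkmode m)
{
	o->lock_count++;
	o->mode = m;
}

void
lock_pair(struct obj *o1, enum lkmode m1, struct obj *o2, enum lkmode m2)
{
	if (o1 == o2) {
		/* Same object: take it once, in the stronger of the two modes. */
		lock_one(o1, m1 > m2 ? m1 : m2);
		return;
	}
	lock_one(o1, m1);
	lock_one(o2, m2);
}
```

In this model a caller passing the same object twice must also unlock it only once, which mirrors the caller obligation stated above.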
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42642
Switch from using the shell's builtin echo command to using the
builtin printf command to print the asserts.
Reported by: jrtc27
Suggested by: imp
Fixes: accfb4cc93
Sponsored by: Netflix
Instead, use a here document for the input. This allows us to run the
while loop in the main script so we can build the list of asserts in
a shell variable. We then print out the list of asserts at the end of
the loop.
Reviewed by: imp
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D42407
Call sigexit rather than exit1 so that a core is generated.
If running the SIGABRT handler is desired, this would need to use
kern_psignal() instead. In that case a userspace wrapper in libc
would be needed to force an exit if the handler doesn't exit. Given
that abort2(2)'s intended use case is when userland is in a
sufficiently bad state such that it can't safely call syslog(3) before
abort(3), a userspace abort2(3) wrapper in libc might be dubious.
Reviewed by: Olivier Certner <olce.freebsd@certner.fr>, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42163
This is required e.g. for nullfs to ensure liveness of the lower mount
points.
Reviewed by: jah, rmacklem, Olivier Certner <olce.freebsd@certner.fr>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42554
The get operations change the data pointed to by the structure, but do
not update the contents of the struct.
Mark the struct mac arguments of mac_[gs]etsockopt_*label() and
mac_check_structmac_consistent() const to prevent this from changing
in the future.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14488
The tty_rubchar() code handling backspaces for UTF-8 characters didn't
properly check whether the beginning of the current line was reached.
This resulted in a kernel panic in ttyinq_unputchar() when prodded with
certain malformed UTF-8 sequences.
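
The kind of bound check involved can be illustrated with a small userland sketch. This is not the tty code; rub_len() and its parameters are invented for the example. It walks back over UTF-8 continuation bytes but never steps past the start of the current line, even when the input consists only of continuation bytes:

```c
#include <stddef.h>

/*
 * Return how many bytes to erase for the character ending at
 * buf[pos - 1], never stepping back past line_start.  Returns 0 if
 * there is nothing to erase.  (Hypothetical helper for illustration.)
 */
size_t
rub_len(const unsigned char *buf, size_t line_start, size_t pos)
{
	size_t n = 0;

	while (pos > line_start) {
		pos--;
		n++;
		/* Stop at the first byte that is not a 10xxxxxx continuation. */
		if ((buf[pos] & 0xc0) != 0x80)
			break;
	}
	return (n);
}
```

With a malformed sequence made only of continuation bytes, the loop ends when it reaches line_start instead of reading before the line.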
Fixes: PR 275009
Reviewed by: christos
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42564
Measure the total deferred time (from the time we decide to defer until
we try again) for busdma_load requests. On systems that don't ever
defer, there is no performance change. Add a new sysctl
hw.busdma.zoneX.total_deferred_time to report this (in
microseconds).
Normally, deferrals don't happen in modern hardware... Except there's a
lot of buggy hardware that can't cope with memory > 4GB or that can't
cross a 4GB boundary (or even more restrictive values), necessitating
bouncing. This will measure the effect on the I/Os of this deferral.
Sponsored by: Netflix
Reviewed by: gallatin, mav
Differential Revision: https://reviews.freebsd.org/D42550
To allow for architecture-specific protections, add sv_protect to struct
sysentvec. This can be used to apply them after the executable is loaded
into the new address space.
Reviewed by: kib
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42440
This is useful to check if a note is present and contains an expected
value, e.g. to read NT_GNU_PROPERTY_TYPE_0 on arm64 to see if we should
enable BTI.
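
To illustrate what "check if a note is present and contains an expected value" involves, here is a self-contained userland sketch that scans a buffer of ELF notes. The function find_note() and struct note_hdr are names made up for this example and are not the kernel interface. The sketch assumes 4-byte padding of name and descriptor; some note types on 64-bit targets use 8-byte alignment, which it does not handle.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF note header: three 32-bit words followed by name and descriptor. */
struct note_hdr {
	uint32_t namesz;
	uint32_t descsz;
	uint32_t type;
};

#define	ROUNDUP4(x)	(((size_t)(x) + 3) & ~(size_t)3)

/*
 * Hypothetical helper: return a pointer to the descriptor of the first
 * note matching name and type, or NULL if none is found or the buffer
 * is truncated.  On success *descsz receives the descriptor size.
 */
const void *
find_note(const unsigned char *buf, size_t len, const char *name,
    uint32_t type, uint32_t *descsz)
{
	size_t off = 0;

	while (len - off >= sizeof(struct note_hdr)) {
		struct note_hdr h;
		size_t name_off, desc_off;

		memcpy(&h, buf + off, sizeof(h));
		name_off = off + sizeof(h);
		if (h.namesz > len - name_off ||
		    ROUNDUP4(h.namesz) > len - name_off)
			return (NULL);		/* name runs past the buffer */
		desc_off = name_off + ROUNDUP4(h.namesz);
		if (h.descsz > len - desc_off ||
		    ROUNDUP4(h.descsz) > len - desc_off)
			return (NULL);		/* descriptor runs past the buffer */
		if (h.type == type && h.namesz == strlen(name) + 1 &&
		    memcmp(buf + name_off, name, h.namesz) == 0) {
			*descsz = h.descsz;
			return (buf + desc_off);
		}
		off = desc_off + ROUNDUP4(h.descsz);
	}
	return (NULL);
}
```

A caller would then compare the returned descriptor bytes against the value it expects before enabling a feature.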
Reviewed by: kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42439
Move the definition of GNU_ABI_VENDOR to a common location so it can
be used in multiple files.
Reviewed by: emaste, kib, imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42442
- Pass zone pointer to trash_ctor() and report zone name in the panic
message. It may be difficult to figure out the zone just by the item size.
- Do not pass user arguments to internal trash calls, pass the zone.
- Report malloc type name in the same unified panic message.
- Report corruption offset from the beginning of the items instead of
the full pointer. It makes panic message shorter and more readable.
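
The last point, reporting an offset rather than a pointer, can be sketched with a small userland model of a trash check. The function names and the fill pattern below are assumptions for this sketch and do not reproduce the allocator's code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	TRASH_WORD	0xdeadc0deU	/* fill pattern assumed for this sketch */

/* Fill a freed item with the pattern; size is assumed a multiple of 4. */
void
trash_fill(void *item, size_t size)
{
	uint32_t w = TRASH_WORD;

	for (size_t off = 0; off + sizeof(w) <= size; off += sizeof(w))
		memcpy((char *)item + off, &w, sizeof(w));
}

/*
 * Return the byte offset, from the beginning of the item, of the first
 * word that no longer holds the pattern, or -1 if the item is intact.
 */
long
trash_check(const void *item, size_t size)
{
	uint32_t w;

	for (size_t off = 0; off + sizeof(w) <= size; off += sizeof(w)) {
		memcpy(&w, (const char *)item + off, sizeof(w));
		if (w != TRASH_WORD)
			return ((long)off);
	}
	return (-1);
}
```

An offset such as 12 can be matched directly against a structure layout, which is harder to do with a raw kernel address.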
We need to vfs_op_enter()/vn_seqc_write_start() before jumping to
cleanup.
PR: 274992
Reported by: trasz
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Fixes: 9ef7a491a4
LINKER_LOAD_FILE() calls linker_load_dependencies() which will return
EEXIST in case the module to be loaded has already been compiled into
the kernel. Since the format of the module has been recognized at that
point, there is no need to retry loading with a different linker;
otherwise userland will get the misleading error number ENOEXEC.
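
The control flow described here can be modelled in userland roughly as follows. The loader callbacks, the call counter, and try_loaders() are all hypothetical names for this sketch. The real linker code handles more error cases than this model, which only distinguishes "format recognized" from "format not recognized":

```c
#include <errno.h>
#include <stddef.h>

typedef int (*load_fn)(const char *);	/* hypothetical per-format loader */

int loader_calls;			/* counts invocations, for illustration */

/* Loader that does not recognize the file format. */
int
load_wrong_format(const char *file)
{
	(void)file;
	loader_calls++;
	return (ENOEXEC);
}

/* Loader that recognizes the format but finds the module already present. */
int
load_already_present(const char *file)
{
	(void)file;
	loader_calls++;
	return (EEXIST);
}

/* Try each loader in turn; stop as soon as one recognizes the format. */
int
try_loaders(const load_fn *loaders, size_t n, const char *file)
{
	for (size_t i = 0; i < n; i++) {
		int error = loaders[i](file);

		/* Success or EEXIST: the format was recognized, stop here. */
		if (error == 0 || error == EEXIST)
			return (error);
	}
	return (ENOEXEC);
}
```

Without the early return on EEXIST, the loop would fall through to the remaining loaders and the caller would see ENOEXEC instead of the more accurate EEXIST.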
PR: 274936
Reviewed by: dfr
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42474