Use pctrie_root_store(*, PCTRIE_LOCKED) to change the root value of a
pctrie, to ensure proper synchronization when smr is in use.
Reviewed by: alc
Differential Revision: https://reviews.freebsd.org/D46968
Take advantage of a nearby 2-byte hole to avoid growing the struct.
This way, only the offsets of "flags" and "pg_color" change. Bump
__FreeBSD_version since some out-of-tree kernel modules may access these
fields, though I haven't found any examples so far.
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35905
Allocate the 'zero_region' page using VM_ALLOC_NOFREE since
it never gets released.
Differential Revision: https://reviews.freebsd.org/D46885
Reviewed by: alc, markj, kib
For now, opargs are not validated due to their runtime nature.
Reviewed by: kp
Approved by: kp (mentor)
Differential Revision: https://reviews.freebsd.org/D46496
SHM_REMAP was incorrectly defined to 030000 which collides with
SHM_RDONLY and SHM_RND. Renumber to 040000 (incidentally matching
Linux).
This is an ABI break, but the previous ABI was unusable (SHM_REMAP would
imply SHM_RDONLY and vice versa). Fortunately SHM_REMAP has very few
consumers in the wild (I spotted openjdk for Linux, libfabric, MPICH,
and one other MPI thing in Debian code search).
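The collision can be checked with a few lines of C. This is a minimal sketch: the EX_ names are hypothetical and exist only to avoid clashing with <sys/shm.h>, and the SHM_RDONLY and SHM_RND values are the usual 010000 and 020000.

```c
#include <assert.h>
#include <stdbool.h>

/* Octal flag values; the EX_ prefix is illustrative only. */
#define	EX_SHM_RDONLY		010000
#define	EX_SHM_RND		020000
#define	EX_SHM_REMAP_OLD	030000	/* previous, colliding value */
#define	EX_SHM_REMAP_NEW	040000	/* renumbered value */

/* The old value is exactly the union of the two other flags. */
static_assert(EX_SHM_REMAP_OLD == (EX_SHM_RDONLY | EX_SHM_RND),
    "old SHM_REMAP overlaps SHM_RDONLY and SHM_RND");

/* Returns true if 'flag' shares any bit with SHM_RDONLY or SHM_RND. */
static bool
ex_collides(int flag)
{
	return ((flag & (EX_SHM_RDONLY | EX_SHM_RND)) != 0);
}

/*
 * Returns true if a caller passing only SHM_RDONLY would also appear to
 * have requested a remap when tested with the mask 'remap'.
 */
static bool
ex_rdonly_looks_like_remap(int remap)
{
	int shmflg = EX_SHM_RDONLY;

	return ((shmflg & remap) != 0);
}
```

With the old value a plain `shmflg & SHM_REMAP` test fires for read-only attaches; with 040000 the three flags occupy distinct bits.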
Reviewed by: kib
Fixes: ea7e7006db ("Implement shmat(2) flag SHM_REMAP.")
Differential Revision: https://reviews.freebsd.org/D46825
Create an additional 4 channel pcm device for RME HDSP 9632 sound cards,
to support the optional AO4S-192 and AI4S-192 extension boards. For
simplicity, the <HDSP 9632 [ext]> pcm device is always present, even if
the extension boards are not installed.
Unfortunately I cannot test this with actual hardware, but I made sure
the additional channels do not affect the functionality of the HDSP 9632
as currently in src.
Reviewed by: christos, br
Differential Revision: https://reviews.freebsd.org/D46837
Fix unified pcm mode after support for the AO4S-192 and AI4S-192
extension boards was added. Adjust the man page accordingly.
Reviewed by: br
Differential Revision: https://reviews.freebsd.org/D46946
We were previously unconditionally adding PROT_WRITE to the maxprot of
private mappings (because a private mapping can be written even if the
fd is read-only), but this might violate the user's PROT_MAX request.
While here, rename cap_maxprot to max_maxprot. This is the intersection
of the maximum protections imposed by capsicum rights on the fd (not
really relevant for private mappings) and the user-required maximum
protections (which were not being obeyed). In particular, cap_maxprot
is a misnomer after the introduction of PROT_MAX.
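The difference can be modelled in userspace as follows. This is a simplified sketch, not the kernel code: the function names, the EX_ constants, and the exact order of operations are assumptions chosen to illustrate why the user's limit must be applied after PROT_WRITE is added.

```c
#include <assert.h>

#define	EX_PROT_READ	0x1
#define	EX_PROT_WRITE	0x2
#define	EX_PROT_EXEC	0x4
#define	EX_PROT_ALL	(EX_PROT_READ | EX_PROT_WRITE | EX_PROT_EXEC)

/*
 * Model of the old behaviour: PROT_WRITE is added for private mappings
 * after the limit has been applied, so it can exceed max_maxprot.
 */
static int
ex_maxprot_old(int fd_maxprot, int max_maxprot, int is_private)
{
	int maxprot;

	assert((max_maxprot & ~EX_PROT_ALL) == 0);
	maxprot = fd_maxprot & max_maxprot;
	if (is_private)
		maxprot |= EX_PROT_WRITE;
	return (maxprot);
}

/*
 * Model of the fixed behaviour: the limit is applied last, so the
 * result is always a subset of max_maxprot.
 */
static int
ex_maxprot_new(int fd_maxprot, int max_maxprot, int is_private)
{
	int maxprot;

	assert((max_maxprot & ~EX_PROT_ALL) == 0);
	maxprot = fd_maxprot;
	if (is_private)
		maxprot |= EX_PROT_WRITE;
	return (maxprot & max_maxprot);
}
```

For a read-only fd mapped privately with a read-only PROT_MAX request, the old model yields read and write while the new model yields read only.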
Add some regression test cases. mmap__maxprot_shm fails without this
patch.
Note: Capsicum's CAP_MMAP_W is a bit ambiguous. Should it be required
in order to create writeable private mappings? Currently it is, even
though such mappings don't permit writes to the object referenced by the
fd.
Reported by: brooks
Reviewed by: brooks
MFC after: 1 month
Fixes: c7841c6b8e ("Relax restrictions on private mappings of POSIX shm objects.")
Differential Revision: https://reviews.freebsd.org/D46741
In general it's not safe to drop the topology lock in these routines, as
GEOM assumes that the mesh will be consistent during traversal.
However, there's no reason we can't hold the topology lock across calls
to g_gate_release(). (Note that g_gate_hold() can be called with the
topology lock held.)
PR: 238814
MFC after: 2 weeks
These add alternative behaviour to some floating-point instructions
so don't need any kernel support and can just be exposed to userspace.
Sponsored by: Arm Ltd
tmpfs_seek_data_locked should return the offset of the first page
either resident in memory or in swap, but may return an offset to a
nonresident page. Check for residence to fix that.
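The intended contract can be sketched over a plain array of page states. Everything here is hypothetical (the enum, the page size constant, the function name); the real routine works on the tmpfs VM object, not on an array.

```c
#include <assert.h>
#include <stddef.h>

#define	EX_PAGE_SIZE	4096

enum ex_page_state { EX_NONRESIDENT, EX_RESIDENT, EX_SWAPPED };

/*
 * Return the byte offset of the first page at or after 'start_page'
 * that is resident in memory or in swap, or -1 if there is none.
 * Nonresident pages are skipped rather than reported as data.
 */
static long
ex_seek_data(const enum ex_page_state *pages, size_t npages,
    size_t start_page)
{
	assert(pages != NULL);
	for (size_t i = start_page; i < npages; i++) {
		if (pages[i] != EX_NONRESIDENT)
			return ((long)(i * EX_PAGE_SIZE));
	}
	return (-1);
}
```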
Reviewed by: alc, kib
Differential Revision: https://reviews.freebsd.org/D46879
The virtio_scsi device allows a VM guest to directly send SCSI commands
(ctsio->cdb array) to the kernel driver exposed on /dev/cam/ctl
(ctl.ko).
All kernel commands accessible from the guest are defined by
ctl_cmd_table.
The command ctl_persistent_reserve_out (cdb[0]=0x5F and cdb[1]=0) allows
the caller to call malloc() with an arbitrary size (uint32_t). This can
be used by the guest to exhaust kernel memory (DoS attack).
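One general way to close this class of hole is to validate a caller-supplied length before allocating. The sketch below is a userspace illustration only: the message above does not spell out the actual fix, and the cap EX_MAX_PARAM_LEN and the function name are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical upper bound on a guest-supplied parameter list length. */
#define	EX_MAX_PARAM_LEN	4096u

/*
 * Allocate a buffer for a guest-supplied length, rejecting zero and
 * oversized requests instead of passing them straight to malloc().
 * Returns 0 on success or an errno value; *bufp is NULL on failure.
 */
static int
ex_alloc_param(uint32_t param_len, void **bufp)
{
	assert(bufp != NULL);
	*bufp = NULL;
	if (param_len == 0 || param_len > EX_MAX_PARAM_LEN)
		return (EINVAL);
	*bufp = malloc(param_len);
	return (*bufp == NULL ? ENOMEM : 0);
}
```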
Reported by: Synacktiv
Reviewed by: asomers
Security: HYP-08
Sponsored by: The Alpha-Omega Project
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D46044
These are all local diffs that have no functional change.
Reviewed by: mav, emaste
Sponsored by: AFRL, DARPA
Differential Revision: https://reviews.freebsd.org/D46530
The function `pf_setup_pdesc()` handles ruleset evaluation for non-reassembled
packets. Having it called before `pf_mtag` is checked for flags
`PF_MTAG_FLAG_ROUTE_TO` and `PF_MTAG_FLAG_DUMMYNET` will cause loops for
fragmented packets if reassembly is disabled.
Move `pd` zeroing and `pf_mtag` extraction from `pf_setup_pdesc()` to a separate
function `pf_init_pdesc()` and change the order of function calls: first
call `pf_init_pdesc()`, then check if the currently processed packet has been
reinjected from dummynet, finally call `pf_setup_pdesc()`.
Add functionality of sending UDP packets to `pft_ping.py` with fragmentation
support and fix broken IPv6 reassembly.
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D46880
mtxpool_sleep should be used as a leaf lock since it is a general
purpose pool shared across many consumers. Holding a pool lock while
acquiring other locks (e.g. the socket buffer lock in soreserve()) can
trigger LOR warnings for unrelated code.
Reviewed by: glebius
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D46792
Some Fujitsu Lifebooks return an invalid _BIX object. The first element
of _BIX is a revision number, which indicates what elements will follow:
* ACPI 4.0 defined _BIX revision 0 with 20 elements.
* ACPI 6.0 introduced _BIX revision 1 with 21 elements.
The problem is that the offending Lifebooks have a non-zero _BIX
revision, but provide 20 fields only.
The ACPICA parser chokes on this [1], but that seems to be
inconsequential. More importantly, our own battery info handling code [2]
also verifies that for revision > 0, there are at least 21 fields - and
refuses to process the invalid _BIX. One workaround would be to
introduce special case / quirk handling for Fujitsu Lifebooks. A better
one is to relax the requirements check: If there are only 20 elements,
treat the _BIX as revision 0, no matter what revision number was
provided by the device.
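The relaxed check amounts to choosing an effective revision from the element count. A minimal sketch, with hypothetical names; the real code operates on the ACPI package object rather than on two integers:

```c
#include <assert.h>

#define	EX_BIX_REV0_ELEMS	20	/* ACPI 4.0 layout */
#define	EX_BIX_REV1_ELEMS	21	/* ACPI 6.0 layout */

static_assert(EX_BIX_REV1_ELEMS == EX_BIX_REV0_ELEMS + 1,
    "revision 1 adds exactly one element");

/*
 * Return the revision to use when parsing a _BIX package with 'nelems'
 * elements, or -1 if the package is too short for any known layout.
 * A 20-element package is treated as revision 0 no matter what
 * revision number the firmware reported.
 */
static int
ex_bix_effective_revision(int reported_rev, int nelems)
{
	if (nelems < EX_BIX_REV0_ELEMS)
		return (-1);
	if (nelems == EX_BIX_REV0_ELEMS)
		return (0);
	return (reported_rev);
}
```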
Linux doesn't run into this problem by the way because it only supports
the 20 fields defined in the ACPI 4.0 spec [3]. It never looks at the
revision number or the 21st field added in ACPI 6.0.
[1] https://cgit.freebsd.org/src/tree/sys/contrib/dev/acpica/components/namespace/nsprepkg.c#n815
[2] https://cgit.freebsd.org/src/tree/sys/dev/acpica/acpi_cmbat.c#n371
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/battery.c#n418
PR: 252030
Reviewed by: imp
MFC after: 2 weeks
The issue was only apparent when ALTQ was not enabled, which is why it
was not revealed by `make universe`.
Fixes: 27f54be50b
Reviewed by: kp, imp
Differential Revision: https://reviews.freebsd.org/D46877
There is no need for the union pf_krule_ptr for kernel-only structs like
pf_kstate and pf_ksrc_node. The rules are always accessed by pointer. The rule
numbers are a leftover from using the same structure for pfctl(8) and pf(4).
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D46868
If the guest VM emits the exit code VM_EXITCODE_DB the kernel will
execute the function named vm_handle_db.
If the value of rsp is not page aligned and if rsp+sizeof(uint64_t)
spans across two pages, the function vm_copy_setup will need two structs
vm_copyinfo to prepare the copy operation.
For instance, if the rsp value is 0xFFC, two vm_copyinfo objects are needed:
* address=0xFFC, len=4
* address=0x1000, len=4
The vulnerability was addressed by commit 51fda658ba ("vmm: Properly
handle writes spanning across two pages in vm_handle_db"). Still,
replace the KASSERT with an error return as a more defensive approach.
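The page split, together with an error return in place of an assertion, can be sketched like this. The struct and function names are hypothetical stand-ins, not the vmm(4) API, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	EX_PAGE_SIZE	0x1000u

struct ex_copyinfo {
	uint64_t	addr;	/* start address of this chunk */
	size_t		len;	/* bytes that fit in the chunk's page */
};

/*
 * Split [addr, addr + len) at page boundaries into at most 'max'
 * chunks.  Returns the number of chunks used, or -1 if 'max' entries
 * are not enough (an error return rather than an assertion).
 */
static int
ex_copy_setup(uint64_t addr, size_t len, struct ex_copyinfo *ci, int max)
{
	int n = 0;

	assert(ci != NULL);
	while (len > 0) {
		size_t room = EX_PAGE_SIZE -
		    (size_t)(addr & (EX_PAGE_SIZE - 1));
		size_t chunk = len < room ? len : room;

		if (n == max)
			return (-1);
		ci[n].addr = addr;
		ci[n].len = chunk;
		n++;
		addr += chunk;
		len -= chunk;
	}
	return (n);
}
```

Calling `ex_copy_setup(0xFFC, 8, ci, 2)` fills two entries matching the list above, while the same call with room for only one entry returns -1.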
Reported by: Synacktiv
Reviewed by: markj, emaste
Security: HYP-09
Sponsored by: The Alpha-Omega Project
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D46133
The vm_handle_db function is responsible for writing correct status
register values into memory when a guest VM is being single-stepped
using the RFLAGS.TF mechanism. However, it currently does not properly
handle an edge case where the resulting write spans across two pages.
This commit fixes this by making vm_handle_db use two vm_copyinfo
structs.
Security: HYP-09
Reviewed by: markj
The variable struct pd->nat_rule is set only during rule evaluation, that
is only for the first packet of a connection. Use struct pf_kstate->nat_rule
instead.
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D46867
An upcoming refactor appends do-not-merge comments to all headers
centrally; do the same here to reduce the final diff. Headers also start
with a comment line (for /*) and end with a blank line.
Comment aligning was inconsistent and required a ton of book-keeping.
Replaced comment aligning with a simple, single tab out.
Pull Request: https://github.com/freebsd/freebsd-src/pull/1441
Signed-off-by: agge3 <sterspark@gmail.com>
As for the consumer `enc_add_hhooks()`, `hhook_add_hook()` will never
fail for the given parameters. Meanwhile, to build the module if_enc(4),
at least option INET or INET6 is required, so no need for the error
EPFNOSUPPORT.
No functional change intended.
Reviewed by: ae
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D46770
This simplifies the code slightly, and brings us closer to the OpenBSD code.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D46707
Reduces code and fixes a bunch of bugs with fragment handling not being in sync
with the rest of the ruleset.
Much feedback from mpf, bluhm & markus
Thanks to Tony Sarendal for help with testing
ok bluhm; various previous versions ok henning, claudio, mpf, markus
Note that while this changes the order of src addr/src port/dst addr/dst port
skips, this doesn't actually affect the kernel/userspace ABI. The kernel always
recalculates skip steps. As a result we have to fix one of the pfctl parser
tests. Note that this is an order change that does not affect what packets are
accepted or dropped.
Obtained from: OpenBSD, mcbride <mcbride@openbsd.org>, 04c69899a7
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D46705
nfsrv_freeopen() was being called after the mutex
lock was released, making it possible for other
kernel threads to change the lists while nfsrv_freeopen()
took the nfsstateid out of the lists.
This patch moves the code around
"if (nfsrv_freeopen(stp, vp, 1, p) == 0) {"
into nfsrv_freeopen(), so that it can remove the nfsstateid
structure from all lists before unlocking the mutex.
This should avoid any race between CLOSE and other nfsd threads
updating the NFSv4 state.
The patch does not affect semantics when vfs.nfsd.enable_locallocks=0.
PR: 280978
Tested by: Matthew L. Dailey <matthew.l.dailey@dartmouth.edu>
MFC after: 1 week
Visibility can get complicated when, e.g., ifuncs are involved. In
particular, SHA256/SHA512 on aarch64 use ifuncs for their _Transform
implementations, which then exposes global symbols of the same name that
break things trying to statically link both libcrypto and libmd.
Revert this part of the _Transform removal to fix the pkg-static build
on aarch64.
Fixes: 81de655acd ("libmd: stop exporting Transform() symbols")
They all were experimental and some comments refer to internal Netflix
versions. There is no reason to leak that into the header. Style unused
options so that they have the available value aligned with really used
values.
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D46779
When the sysctl-variable net.inet.ip.accept_sourceroute is non-zero,
an mbuf would be leaked when processing a SYN-segment containing an
IPv4 strict or loose source routing option, when the on-stack
syncache entry is used or there is an error related to processing
TCP MD5 options.
Fix this by freeing the mbuf whenever an error occurred or the
on-stack syncache entry is used.
Reviewed by: markj, rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D46839
Use explicit atomic load/store operations for all producer and consumer
head and tail accesses. This allows us to remove the volatile
annotation from these variables.
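The idea can be illustrated with a userspace single-producer, single-consumer ring built on C11 <stdatomic.h>. This is an analogy under stated assumptions, not the kernel code: the struct layout, names, and the choice of C11 atomics instead of the kernel's atomic(9) primitives are all illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	EX_RING_SIZE	8u

static_assert((EX_RING_SIZE & (EX_RING_SIZE - 1)) == 0,
    "ring size must be a power of two");

struct ex_ring {
	_Atomic uint32_t prod_tail;	/* next slot the producer fills */
	_Atomic uint32_t cons_head;	/* next slot the consumer drains */
	void		*slots[EX_RING_SIZE];
};

static void
ex_ring_init(struct ex_ring *r)
{
	atomic_init(&r->prod_tail, 0);
	atomic_init(&r->cons_head, 0);
}

/* Producer side: returns false if the ring is full. */
static bool
ex_ring_enqueue(struct ex_ring *r, void *item)
{
	uint32_t tail = atomic_load_explicit(&r->prod_tail,
	    memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->cons_head,
	    memory_order_acquire);

	if (tail - head == EX_RING_SIZE)
		return (false);
	r->slots[tail & (EX_RING_SIZE - 1)] = item;
	/* Release: publish the slot contents before the new tail. */
	atomic_store_explicit(&r->prod_tail, tail + 1, memory_order_release);
	return (true);
}

/* Consumer side: returns NULL if the ring is empty. */
static void *
ex_ring_dequeue(struct ex_ring *r)
{
	uint32_t head = atomic_load_explicit(&r->cons_head,
	    memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->prod_tail,
	    memory_order_acquire);
	void *item;

	if (head == tail)
		return (NULL);
	item = r->slots[head & (EX_RING_SIZE - 1)];
	/* Release: finish reading the slot before handing it back. */
	atomic_store_explicit(&r->cons_head, head + 1, memory_order_release);
	return (item);
}
```

Because every access to the indices goes through an explicit load or store with a stated memory order, the index fields need no volatile qualifier.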
Reviewed by: alc, imp, kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46380
When the FreeBSD/arm64 port was created we only supported FDT. We now
also support ACPI, and have for many years. When this support was
added we kept FDT as the default.
There are some setups where both ACPI tables and an FDT DTB are passed
into the kernel. In most of these cases the DTB is only used to pass
in minimal information.
To handle the cases where both are passed in, prefer ACPI over FDT.
Reviewed by: bz, imp, emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46750
Report when SVE is present and allow it to be used by calling
sve_restore_state on an SVE exception from userspace.
Reviewed by: kib
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43310