Commit graph

145108 commits

Author SHA1 Message Date
Gleb Smirnoff
b2c558c898 cxgbe: include headers required to include t4_tom.h
Before the change we would get struct tcpcb forward declaration
only with help of pollution via in_pcb.h.
2022-10-19 15:15:53 -07:00
Gleb Smirnoff
e1401f7579 cxgbe: use standard sototcpcb() accessor macro to get socket's tcpcb
Reviewed by:		np
Differential revision:	https://reviews.freebsd.org/D37041
2022-10-19 15:15:32 -07:00
Mark Johnston
2dba2288aa uma: Never pass cache zones to memguard
Items allocated from cache zones cannot usefully be protected by
memguard.

PR:		267151
Reported and tested by:	pho
MFC after:	1 week
2022-10-19 14:36:36 -04:00
Konstantin Belousov
e9adbcdf2e tmpfs: report minimal hole size
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov
85cff1455a tmpfs: implement FIOSEEKDATA and FIOSEEKHOLE
providing the support for lseek(2) SEEK_DATA and SEEK_HOLE.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov
934bfc128e Add vm_page_any_valid()
Use it and several other vm_page_*_valid() functions in more places.

Suggested and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov
5bd45b2ba3 swap_pager_find_least(): assert that the function is called on the right object type
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov
33ce178835 vn_bmap_seekhole: check that passed offset is non-negative
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov
8b32cdec9c tmpfs: order include files alphabetically
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov
7f055843ac tmpfs: change return type of tmpfs_pages_check_avail() to bool
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov
555a861d68 device_get_path(): take sbuf directly
This allows to fix a bug where sbuf allocation done in the context of
dev_wired_cache_match() must use non-sleepable allocations.

Suggested by:	jhb
Reviewed by:	jhb, takawata
Discussed with:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D36899
2022-10-19 19:39:40 +03:00
Konstantin Belousov
8cf783bde3 device_get_path(): handle case when dev is root
PR:	266862
Based on submission by:	takawata
Reviewed by:	jhb, takawata
Disscussed with:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D36899
2022-10-19 19:39:33 +03:00
Konstantin Belousov
d9c5a9ea49 device_get_path(): do not drop the error from BUS_GET_DEVICE_PATH()
Later it would silently converted to ENOMEM always, because any error
was reported as NULL return path.

Reviewed by:	jhb, takawata
Discussed with:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D36899
2022-10-19 19:39:26 +03:00
Konstantin Belousov
23d2fcfbb2 subr_bus.c: some style
Wrap long lines in devctl2_ioctl DEV_GET_PATH and dev_wired_cache_match()

Reviewed by:	jhb, takawata
Discussed with:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D36899
2022-10-19 19:39:17 +03:00
Alan Somers
f6e5319550 fusefs: fix VOP_ADVLOCK with SEEK_END
When the user specifies SEEK_END, unlike SEEK_CUR, VOP_ADVLOCK must
adjust lock offsets itself.

Sort-of related to bug 266886.

MFC after:	2 weeks
Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D37040
2022-10-18 19:11:49 -06:00
Gleb Smirnoff
c37384665f tcp: style the struct tcpcb definition
- Use C99 types uintXX_t instead of u_intXX_t.
- Try to make space/tab usage a little bit more consistent.
- Shorten comments to fit into 80 chars.

Not a functional change, just making future changes easier to read.
2022-10-18 18:00:30 -07:00
Rick Macklem
ae7816576e nfsd: Make the pNFS server update Change for Setxattr/Rmxattr
When the NFS server does the Setxattr or Rmxattr operation,
the Change attribute (va_filerev) needs to be updated.

Without this patch, that was not happening for the
pNFS server configuration.  This patch does a Setattr
against the DS file to make the Change attribute
change.

This bug was discovered during a recent IETF NFSv4 testing
event, where the Change attribute wasn't changed in the
operation reply.

MFC after:	1 month
2022-10-18 15:47:07 -07:00
Gleb Smirnoff
e03b7883e9 mbuf: don't include lock.h conditionally
Using keywords from opt_global.h in the system headers allows to
create cryptic kernel build failures, that depend on the options
used in the kernel config, very hard to debug and understand.

Fixes:	063d811465
2022-10-18 08:34:03 -07:00
Zhenlei Huang
5be5d0d5cb geom_part: Check number of GPT entries and size of GPT entry
Current specification does not have upper limit of the number of
partition entries and the size of partition entry. In
799eac8c3d Andrey V. Elsukov introduced a
limit maximum number of GPT entries to 4k, but that is for write routine
(gpart create) only. When attaching disks that have large number of GPT
entries exceeding the limit, or disks with large size of partition
entry, it is still possible to exhaust kernel memory.

1. Reuse the limit of the maximum number of partition entries.
2. Limit the maximum size of GPT entry to 1k.

In current specification (2.10) the size of GPT entry is 128 *
2^n while n >= 0, and the size - 128 is reserved. 1k should be
sufficient enough for foreseen future.

PR:		266548
Discussed with:	imp
Reviewed by:	markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D36717
2022-10-18 11:03:02 -04:00
Takanori Watanabe
7b5d62bb73 ofw: add BUS_GET_DEVICE_PATH interface to openfirm/fdt, somewhat incomplete.
This add BUS_GET_DEVICE_PATH interface,
which shows device tree of openfirm/fdt.

In qemu-system-arm64 with "virt" machine with device-tree firmware,
% devctl getpath OFW cpu0

Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D37031
2022-10-18 16:55:47 +09:00
Colin Percival
469ad86031 amd64: Add FIRECRACKER kernel configuration
This kernel configuration supports the Firecracker VMM environment.

Relnotes:	FreeBSD can now run inside the Firecracker VMM
		via the amd64 FIRECRACKER kernel configuration.
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36672
2022-10-17 23:02:22 -07:00
Colin Percival
13f34e211b PVH: Set bootmethod to PVH
Now that we can PVH boot on a non-Xen hypervisor, we shouldn't set
machdep.bootmethod to "XEN".  Instead, set it to "PVH"; there are
other ways to discern the hypervisor.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36191
2022-10-17 23:02:22 -07:00
Colin Percival
c4a4011c74 PVH: support whitespace cmdline splitting
For historical reasons, Xen kernel command lines have options
separated by commas.  Every other FreeBSD platform uses whitespace;
this is also necessary in PVH in order to support the Firecracker
VMM.  Allow options to be separated by any combination of commas
and whitespace.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36190
2022-10-17 23:02:22 -07:00
Colin Percival
a8ea154064 x86: Distinguish Xen from non-Xen PVH boots
The PVH boot protocol, introduced by Xen, is now used by some non-Xen
platforms (e.g. the Firecracker VM) as well.  In order to accommodate
these, we use CPUID to detect Xen and only perform Xen-specific setup
when running on that platform.

The "isxen" function duplicates some work done by identcpu.c later in
the boot process; but we need it here since this is the very first C
code which runs when PVH booting (even before hammer_time).

In many places the existing code had
        xc_printf(...);
        HYPERVISOR_shutdown(SHUTDOWN_crash);
making use of Xen functionality to print a message and shut down; in
the places where this idiom can be reached in the non-xen case, we
replace it idiom with a CRASH(...) macro which calls those in the Xen
case and halts in the non-Xen case.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35801
2022-10-17 23:02:22 -07:00
Colin Percival
023a025b5c x86: Add support for PVH version 1 memmap
Version 0 of PVH booting uses a Xen hypercall to retrieve the system
memory map; in version 1 the memory map can be provided via the
start_info structure.

Using the memory map from the version 1 start_info structure allows
FreeBSD to use PVH booting on systems other than Xen, e.g. on the
Firecracker VM.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35800
2022-10-17 23:02:22 -07:00
Colin Percival
d1ca8cc638 x86: Add MPTABLE_LINUX_BUG_COMPAT option
Linux has two bugs in its handling of the x86 MP table:
1. It assumes that there is always 640 kB of base memory, and looks for
the MP table in the top kB of this even if the memory map indicates
that memory location does not exist.
2. It ignores that entry_count field and instead iterates through the
MP table by scanning until it runs out of bytes in the table.

The Firecracker VM (and probably other related VMs) relies on both of
these bugs.  With the MPTABLE_LINUX_BUG_COMPAT option, we search for
the MP table at address 639k even if that isn't in the memory map; and
replace a zeroed entry_count with a value computed from scanning the
table until we run out of table bytes.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35799
2022-10-17 23:02:22 -07:00
Colin Percival
2297a1633d Add NO_LEGACY_PCIB kernel option to i386, amd64
On systems without a PCI bus, legacy_pcib_identify by default creates
one anyway:
    legacy_pcib_identify: no bridge found, adding pcib0 anyway

This commit adds a kernel option NO_LEGACY_PCIB which disables this,
allowing systems to be fully PCI-free.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35798
2022-10-17 23:02:22 -07:00
Colin Percival
c4b68e7e53 ns8250: Check if flush via FCR succeeded
The emulated UART in the Firecracker VMM (aka the implementation in the
rust-vmm/vm-superio project) includes FIFOs but does not implement the
FCR register, which is used by ns8250_flush to flush the FIFOs.

Check the LSR to see if there is still data in the FIFOs and call
ns8250_drain if necessary.

Discussed with:	emaste, imp, jrtc27
Sponsored by:	https://patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36979
2022-10-17 23:02:21 -07:00
Colin Percival
782105f7c8 vtblk: Use busdma
We assume that bus addresses from busdma are the same thing as
"physical" addresses in the Virtio specification; this seems to
be true for the platforms on which Virtio is currently supported.

For block devices which are limited to a single data segment per
I/O, we request PAGE_SIZE alignment from busdma; this is necessary
in order to support unaligned I/Os from userland, which may cross
a boundary between two non-physically-contiguous pages.  On devices
which support more data segments per I/O, we retain the existing
behaviour of limiting I/Os to (max data segs - 1) * PAGE_SIZE.

Reviewed by:	bryanv
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36667
2022-10-17 23:02:21 -07:00
Colin Percival
3a8aff9d08 vtblk: Include pointer to softc in request
No functional change intended.

Reviewed by:	bryanv, imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36666
2022-10-17 23:02:21 -07:00
Colin Percival
cc25cfc9cf vtblk: Requeue inside vtblk_request_execute
Most virtio_blk requests are launched from vtblk_startio; prior to this
commit, if vtblk_request_execute failed (e.g. due to a lack of space on
the virtio queue) vtblk_startio would requeue the request to be
reattempted later.

Add a flag "vbr_requeue_on_error" to requests and perform the requeuing
from inside vtblk_request_execute instead.

No functional change intended.

Reviewed by:	bryanv, imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36665
2022-10-17 23:02:21 -07:00
Colin Percival
86f8f5ccb7 vtblk: Make vtblk_request_execute return void.
The error, if any, now gets stashed in the request structure.  (Step 1
of reworking this driver to use busdma.)

No functional change intended.

Reviewed by:	bryanv, imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36664
2022-10-17 23:02:21 -07:00
Colin Percival
0e1f5ab7db virtio_mmio: Support command-line parameters
The Virtio MMIO bus driver was added in 2014 with support for devices
exposed via FDT; in 2018 support was added to discover Virtio MMIO
devices via ACPI tables, as in QEMU.  The Firecracker VMM eschews both
FDT and ACPI, instead presenting device information via kernel command
line arguments of the form virtio_mmio.device=<parameters>.

These command line parameters get converted into kernel environment
variables; this adds support for parsing those variables and attaching
virtio_mmio children to nexus.

There is a case to be made that it would be cleaner to have a new
"cmdlinebus" attached to nexus and virtio_mmio children attached to
that.  A future commit might do that.

Discussed with:	imp, jrtc27
Sponsored by:	https://patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36189
2022-10-17 23:02:21 -07:00
Colin Percival
c32bd97641 kern: Support duplicate variables in early kenv
Some virtual machines pass virtio MMIO device parameters via the kernel
command line as a series of virtio_mmio.device=<parameters> options.
These get translated into FreeBSD kernel environment variables; but
unfortunately they all use the same variable name, which resulted in
all but the first such parameter being ignored when the dynamic kernel
environment is set up from the initial environment buffers.

With this commit, duplicate environment settings will instead be stored
as ${name}_1, ${name}_2... ${name}_9999.  In the unlikely event that
the same variable is set over 10000 times before the dynamic kernel
environment is set up, we panic.

Variable settings after the dynamic environment is initialized continue
to override the previously-set value; the change is limited to the very
early kernel boot (prior to SI_SUB_KMEM + 1) and changes behaviour from
"ignore" to "store with a different name" only.

Reviewed by:	imp
Feedback from:	kevans
Sponsored by:	https://patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36187
2022-10-17 23:02:20 -07:00
Gleb Smirnoff
d6eabdac2e dpaa2: fix build without WITNESS
Using mutex(9) requires including <sys/lock.h> per manual page.  With
WITNESS the header was cryptically included via dpaa_ni.h -> mbuf.h.
2022-10-17 22:38:40 -07:00
Gleb Smirnoff
2782ed8f6c dpaa2: fix standalone module build 2022-10-17 22:38:24 -07:00
Gleb Smirnoff
7fb975c8fb dpaa2: fix build without FDT 2022-10-17 22:38:02 -07:00
Eric Joyner
9c95013905
iflib: Introduce v2 of TX Queue Select Functionality
For v2, iflib will parse packet headers before queueing a packet.

This commit also adds a new field in the structure that holds parsed
header information from packets; it stores the IP ToS/traffic class
field found in the IPv4/IPv6 header.

To help, it will only partially parse header packets before queueing
them by using a new header parsing function that does less than the
current parsing header function; for our purposes we only need up to the
minimal IP header in order to get the IP ToS infromation and don't need
to pull up more data.

For now, v1 and v2 co-exist in this patch; v1 still offers a
less-invasive method where none of the packet is parsed in iflib before
queueing.

This also bumps the sys/param.h version.

Signed-off-by:	Eric Joyner <erj@FreeBSD.org>
Tested by:	IntelNetworking
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision: 	https://reviews.freebsd.org/D34742
2022-10-17 14:59:55 -07:00
Ed Maste
9f6097d6a6 linuxkpi: retire now-unused MIPS support
Reviewed by:	bz, manu
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37023
2022-10-17 16:31:20 -04:00
Gleb Smirnoff
a3da8329c5 carp: fix regression panic from ccd69bd573
Reported & tested by:	Oleg Ginzburg <olevole olevole.ru>
Fixes:			ccd69bd573
2022-10-17 11:39:40 -07:00
Ali Abdallah
ba4782022a ksched: correct return code for invalid priority
By convention, EINVAL is returned when validating arguments, not EPERM.
This matches the documented behaviour of sched_setscheduler(3), and that
of SCHED_OTHER.

PR:		227735
MFC after:	1 week
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D37021
2022-10-17 15:12:13 -03:00
Kenneth D. Merry
11778fca4a Fix mpr(4) panic during a firmware update.
Issue Description:
The RequestCredits field of IOCFacts got changed between the Phase23
firmware to Phase24 firmware. So as part of firmware update operation,
driver has to free the resources & pools which are created with the Phase23
Firmware's IOCFacts data (i.e. during driver load time) and has to
reallocate the resources and pools using Phase24's IOCFacts data. Here
driver has freed the interrupts but missed to reallocate the interrupts and
hence config page read operation is getting timed out and controller is
going for recursive reinit (controller reset) operations and leading to
kernel panic.

Fix:
Reallocate the interrupts if the interrupts are disabled as part of
firmware update/downgrade operation.

Submitted by:	Sreekanth Ready <sreekanth.reddy@broadcom.com>
Tested by:	ken
MFC after:	3 days
2022-10-17 12:48:34 -04:00
Gert Doering
2e797555f7 if_ovpn(4): implement ioctl() to set if_flags
Fully working openvpn(8) --iroute support needs real subnet config
on ovpn(4) interfaces (IFF_BROADCAST), while client-side/p2p
configs need IFF_POINTOPOINT setting.  So make this configurable.

Reviewed by:	kp
2022-10-17 15:33:45 +02:00
Alan Somers
3c3b906b54 fusefs: After successful F_GETLK, l_whence should be SEEK_SET
PR:		266886
Reported by:	John Millikin <jmillikin@gmail.com>
MFC after:	2 weeks
Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D37014
2022-10-17 07:09:50 -06:00
Kristof Provost
b136983a8a if_ovpn: fix use-after-free
ovpn_encrypt_tx_cb() calls ovpn_encap() to transmit a packet, then adds
the length of the packet to the "tunnel_bytes_sent" counter.  However,
after ovpn_encap() returns 0, the mbuf chain may have been freed, so the
load of m->m_pkthdr.len may be a use-after-free.

Reported by:	markj
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-10-17 09:24:41 +02:00
Cy Schubert
8cee2ebac5 Revert "unbound: Vendor import 1.17.0"
This reverts commit 64d318ea98, reversing
changes made to 8063dc0320.

Revert a mismerge which reversed 8063dc0320.
2022-10-16 13:42:15 -07:00
Cy Schubert
64d318ea98 unbound: Vendor import 1.17.0
Added ACL per interface, proxy protocol and bug fixes.

Announcement:   https://nlnetlabs.nl/news/2022/Oct/13/unbound-1.17.0-released/

Merge commit '643f9a0581e8aac7eb790ced1164748939829826' into new_merge
2022-10-16 13:32:55 -07:00
Rick Macklem
8063dc0320 nfsd: Make Setxattr/Removexattr NFSv4.2 ops IO_SYNC
When the NFS server does Setxattr or Removexattr, the
operations must be done IO_SYNC. If a server
crashes/reboots immediately after replying it must
have the extended attribute changes.

Since UFS does extended attributes asynchronously
by default and there is no "ioflag" argument in
the VOP calls, follow the VOP calls with VOP_FSYNC(),
to ensure the operation has been done synchronously.

This was found by inspection while investigating a
bug discovered during a recent IETF NFSv4 testing
event, where the Change attribute wasn't changed
in the operation reply.

This bug will take further work for ZFS and the
pNFS server configuration, but is now fixed for
a non-pNFS UFS exported file system.

MFC after:	1 month
2022-10-16 13:27:32 -07:00
Mitchell Horne
39888ed7a3 kern_intr: Check for NULL event in intr_destroy()
It likely won't happen, but is consistent with the other functions of
this KPI.

Reviewed by:	imp, jhb
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D33479
2022-10-15 15:51:44 -03:00
Rick Macklem
7d9dc91a99 nfscl: Fix the NFSv4.0 mount so that it does not crash
Commit efe58855f3 modifies IN_LOOPBACK() so that it uses a VNET
variable. Without this patch, nfscl_getmyip() uses IN_LOOPBACK()
when the VNET is not set and crashes the system.
nfscl_getmyip() is only called when a NFSv4.0 (not NFSv4.1/4.2)
mount is done.

This patch re-organizes nfscl_getmyip() so that IN_LOOPBACK()
is before the CURVENT_RESTORE() macro, to avoid the crashes.

Reviewed by:	karels, zlei.huang_gmail.com
Differential Revision:	https://reviews.freebsd.org/D37008
2022-10-15 08:38:07 -07:00