This currently only implements the address space handler and attempts to
configure pins with flags obtained from ACPI.
Reviewed by: wulf
MFC after: 1 month
Pull Request: https://github.com/freebsd/freebsd-src/pull/1359
With page_pools and 7622 ignored, and with the few needed adjustments,
mt7615 seems to build as well. Add it so that, once we can connect it to
the build, people can start testing and debugging.
(The actual work was done on a newer version of the mt76 drivers, but
the changes needed to build seem to apply equally here already.)
Requested by: Radu-Cristian Fotescu (freebsd-wireless, 2024-07-31)
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Notable upstream pull request merges:
#15892 -multiple Fast Dedup: Introduce the FDT on-disk format and feature flag
#15893 -multiple Fast Dedup: “flat” DDT entry format
#15895 -multiple Fast Dedup: FDT-log feature
#16239 6be8bf555 zpool: Provide GUID to zpool-reguid(8) with -g
#16277 -multiple Fast Dedup: prune unique entries
#16316 5807de90a Fix null ptr deref when renaming a zvol with snaps and snapdev=visible
#16343 77a797a38 Enable L2 cache of all (MRU+MFU) metadata but MFU data only
#16446 83f359245 FreeBSD: fix build without kernel option MAC
#16449 963e6c9f3 Fix incorrect error report on vdev attach/replace
#16505 b10992582 spa_prop_get: require caller to supply output nvlist
Obtained from: OpenZFS
OpenZFS commit: b109925820
This file contains the vmm device file implementation. Most of this
code is not machine-dependent and so shouldn't be duplicated this way.
Move most of it into a generic dev/vmm/vmm_dev.c. This will make it
easier to introduce a cdev-based interface for VM creation, which in
turn makes it possible to implement support for running bhyve as an
unprivileged user.
Machine-dependent ioctls continue to be handled in machine-dependent
code. To make the split a bit easier to handle, introduce a pair of
tables which define MI and MD ioctls. Each table entry can set flags
which determine which locks need to be held in order to execute the
handler. vmmdev_ioctl() now looks up the ioctl in one of the tables,
acquires locks and either handles the ioctl directly or calls
vmmdev_machdep_ioctl() to handle it.
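The table-driven dispatch described above can be sketched in userspace as
follows. This is a minimal model under stated assumptions, not the code from
the change: the flag names, command numbers, table layout and the dispatch()
helper are all hypothetical, and "locking" is reduced to recording which
flags were in effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock-requirement flags; the real names may differ. */
#define IOC_LOCK_NONE      0
#define IOC_LOCK_ONE_VCPU  1
#define IOC_LOCK_ALL_VCPUS 2

/* Illustrative command numbers, not real vmm ioctls. */
enum { CMD_GET_STATS = 1, CMD_RUN = 2, CMD_MD_SET_REG = 100 };

struct ioctl_entry {
	unsigned long cmd;	/* ioctl command number */
	int flags;		/* locks to hold around the handler */
};

/* Machine-independent and machine-dependent tables. */
static const struct ioctl_entry mi_ioctls[] = {
	{ CMD_GET_STATS, IOC_LOCK_NONE },
	{ CMD_RUN, IOC_LOCK_ONE_VCPU },
};
static const struct ioctl_entry md_ioctls[] = {
	{ CMD_MD_SET_REG, IOC_LOCK_ALL_VCPUS },
};

#define NELEM(a) (sizeof(a) / sizeof((a)[0]))

static int locks_held;	/* stand-in for real lock state */

static const struct ioctl_entry *
lookup(const struct ioctl_entry *tbl, size_t n, unsigned long cmd)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].cmd == cmd)
			return (&tbl[i]);
	return (NULL);
}

/*
 * Returns 0 if the command is machine-independent, 1 if it would be passed
 * to the machine-dependent handler, and -1 if it is unknown. *flags_seen
 * records which locks were held while the handler would have run.
 */
static int
dispatch(unsigned long cmd, int *flags_seen)
{
	const struct ioctl_entry *e;
	int md = 0;

	assert(locks_held == 0);
	e = lookup(mi_ioctls, NELEM(mi_ioctls), cmd);
	if (e == NULL) {
		e = lookup(md_ioctls, NELEM(md_ioctls), cmd);
		md = 1;
	}
	if (e == NULL)
		return (-1);
	locks_held = e->flags;		/* "acquire" */
	*flags_seen = locks_held;	/* the handler would run here */
	locks_held = 0;			/* "release" */
	return (md);
}
```

Keeping the lock requirements in the table means the acquire/release logic
lives in one place rather than being repeated in every handler.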
No functional change intended. There is a lot of churn in this change
but the underlying logic in the ioctl handlers is the same. For now,
vmm_dev.h is still mostly separate, even though some parts could be
merged in principle. This would involve changing include paths for
userspace, though.
Reviewed by: corvink, jhb
Differential Revision: https://reviews.freebsd.org/D46431
There is a small difference between the arm64 and amd64 implementations:
the latter makes use of a "scope" to exclude AMD-specific stats on Intel
systems and vice-versa. Replace this with a more generic predicate
callback which can be used for the same purpose.
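As a rough illustration of the predicate approach, the sketch below attaches
an optional callback to each stat description and skips stats whose
predicate returns false. The structure layout, the stat names and the
cpu_is_intel flag are assumptions made for the example, not the interface
introduced by the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool cpu_is_intel;	/* stand-in for a real CPU vendor check */

static bool pred_intel(void) { return (cpu_is_intel); }
static bool pred_amd(void) { return (!cpu_is_intel); }

struct stat_type {
	const char *desc;
	bool (*pred)(void);	/* NULL: the stat always applies */
};

/* Hypothetical stat descriptions. */
static const struct stat_type stats[] = {
	{ "total vm exits", NULL },
	{ "intel-only counter A", pred_intel },
	{ "intel-only counter B", pred_intel },
	{ "amd-only counter", pred_amd },
};

/* Count the stats that apply on the current system. */
static size_t
count_visible_stats(void)
{
	size_t n = 0;

	for (size_t i = 0; i < sizeof(stats) / sizeof(stats[0]); i++) {
		assert(stats[i].desc != NULL);
		if (stats[i].pred == NULL || stats[i].pred())
			n++;
	}
	return (n);
}
```

A callback is more general than a fixed vendor "scope" because any
condition can be tested, which is what lets a platform without such a
distinction share the same code.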
No functional change intended.
Reviewed by: corvink, jhb
Differential Revision: https://reviews.freebsd.org/D46430
These just need to include the common code with macros to ensure it is
built correctly.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46083
We can share some of the vmm code between VHE and non-VHE modes. To
support this create new files that include the common code and create
macros to name what will be the common functions.
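The naming scheme can be modelled in a single file as below. This is only a
sketch: HYP_PREFIX, HYP_FUNC, MODE_OFFSET and add_offset are hypothetical
names, and the COMMON_CODE macro stands in for a shared source file that a
real tree would #include after defining the naming macro.

```c
#include <assert.h>

/* Two-level paste so that HYP_PREFIX is expanded before concatenation. */
#define CONCAT_(a, b)	a##b
#define CONCAT(a, b)	CONCAT_(a, b)
#define HYP_FUNC(name)	CONCAT(HYP_PREFIX, name)

/* Stand-in for the common code that both modes share. */
#define COMMON_CODE							\
	static int							\
	HYP_FUNC(add_offset)(int x)					\
	{								\
		return (x + MODE_OFFSET);				\
	}

/* First instantiation: functions get a vhe_ prefix. */
#define HYP_PREFIX	vhe_
#define MODE_OFFSET	1
COMMON_CODE
#undef HYP_PREFIX
#undef MODE_OFFSET

/* Second instantiation: the same code with an nvhe_ prefix. */
#define HYP_PREFIX	nvhe_
#define MODE_OFFSET	2
COMMON_CODE
#undef HYP_PREFIX
#undef MODE_OFFSET

/* Both variants now exist side by side under distinct names. */
static void
demo(void)
{
	assert(vhe_add_offset(10) == 11);
	assert(nvhe_add_offset(10) == 12);
}
```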
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D46072
To simplify disabling the kernel sanitizers in some files add
NOSAN_CFLAGS and NOSAN_C variables. These are CFLAGS and NORMAL_C with
the sanitizer flags removed.
While here add MSAN_CFLAGS to simplify keeping KMSAN in kern_kcov.c
Reviewed by: khng, brooks, imp, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45498
When it's an I/O failure, we can still send admin commands. Separate out
the admin failures and flag them as such so that we can still send admin
commands on half-failed drives.
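A minimal sketch of the resulting decision, assuming two separate failure
flags. The structure, field names and helper are hypothetical and only
model the behaviour described above; the sketch also assumes that an admin
failure means the drive is fully failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical controller state; the field names are illustrative. */
struct ctrlr_state {
	bool io_failed;		/* I/O queues are unusable */
	bool admin_failed;	/* the admin queue is unusable as well */
};

enum cmd_kind { CMD_ADMIN, CMD_IO };

/* Returns true if a passthrough command must be failed right away. */
static bool
must_fail_fast(const struct ctrlr_state *c, enum cmd_kind kind)
{
	assert(c != NULL);
	/* Assumption: an admin failure means nothing can be sent. */
	if (c->admin_failed)
		return (true);
	/* Half-failed drive: only I/O commands are rejected. */
	return (kind == CMD_IO && c->io_failed);
}
```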
Fixes: 9229b3105d (nvme: Fail passthrough commands right away in failed state)
Sponsored by: Netflix
Notable upstream pull request merges:
#15817 5536c0dee Sync AUX label during pool import
#15889 c7ada64bb ddt: dedup table quota enforcement
#15890 62e7d3c89 ddt: add support for prefetching tables into the ARC
#15894 e26b3771e spa_preferred_class: pass the entire zio
#15894 d54d0fff3 dnode: allow storage class to be overridden by object type
#16197 55427add3 Several improvements to ARC shrinking
#16217 -multiple JSON output for various zfs and zpool subcommands
#16248 24e6585e7 libzfs.h: Set ZFS_MAXPROPLEN and ZPOOL_MAXPROPLEN
                 to ZAP_MAXVALUELEN
#16264 9dfc5c4a0 Fix long_free_dirty accounting for small files
#16268 ed0db1cc8 Make txg_wait_synced conditional in zfsvfs_teardown,
                 for FreeBSD
#16288 d60debbf5 Fix sa_add_projid to lookup and update SA_ZPL_DXATTR
#16308 ec580bc52 zfs: add bounds checking to zil_parse
#16310 c21dc56ea Fix zdb_dump_block for little endian
#16315 7ddc1f737 zil: add stats for commit failure/fallback
#16326 b0bf14cdb abd: lift ABD zero scan from zio_compress_data()
                 to abd_cmp_zero()
#16337 c8184d714 Block cloning conditionally destroy ARC buffer
#16338 dbe07928b Add support for multiple lines to the sharenfs property
                 for FreeBSD
#16374 1a3e32e6a Cleanup DB_DNODE() macros usage
#16374 ed87d456e Skip dnode handles use when not needed
#16346 fb6d8cf22 Add some missing vdev properties
#16364 670147be5 zvol: ensure device minors are properly cleaned up
#16382 dea8fabf7 FreeBSD: Fix RLIMIT_FSIZE handling for block cloning
#16387 aef452f10 Improve zfs_blkptr_verify()
#16395 cbcb52243 Fix the names of some FreeBSD sysctls in
                 include/tunables.cfg
#16401 5b9f3b766 Soften pruning threshold on not evictable metadata
#16404 cdd53fea1 FreeBSD: Add missing memory reclamation accounting
#16404 1fdcb653b Once more refactor arc_summary output
#16419 1f5bf91a8 Fix memory corruption during parallel zpool import
                 with -o cachefile
#16426 cf6e8b218 zstream: remove duplicate highbit64 definition
Obtained from: OpenZFS
OpenZFS commit: 9c56b8ec78
Disable the hdmi_audio_infoframe_pack_for_dp() function for now, as it
depends on drm sources that have not been imported yet and is not used by
drm-kmod.
Reviewed by: manu
Sponsored by: Serenity CyberSecurity, LLC
Differential Revision: https://reviews.freebsd.org/D46224
A new instance of using ld with -T to bring in the kernel ld script
crept into the tree after I originally did the refactoring. It too needs
-L ${SYSDIR}/conf added.
Fixes: 37d6d682af
Sponsored by: Netflix
Right now, only IPv4 transport mode with aes-gcm ESP is supported.
The driver also cooperates with NAT-T and obeys socket policies, which
allows IKE daemons such as StrongSwan to work.
Sponsored by: NVIDIA networking
Compile more of the IPMI into the kernel, and include all the
dependencies in ipmi.ko.
Sponsored by: Netflix
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D45765
This was done in the original DTrace import, presumably because that
made it a bit easier to handle includes. However, this can cause
dtrace_getpcstack() to be inlined into dtrace_probe(), resulting in a
missing frame in stack traces since dtrace_getpcstack() takes care to
bump "aframes" to account for its own stack frame.
To avoid this, compile dtrace_isa.c separately on all platforms. Add
requisite includes.
MFC after: 2 weeks
Sponsored by: Innovate UK
All architectures enable NEW_PCIB in DEFAULTS (arm being the most recent
to do so in 121be55599 ("arm: Set NEW_PCIB in DEFAULTS rather than a
subset of kernel configs")), so it's time we removed the legacy code
that no longer sees much testing and has a significant maintenance
burden.
Reviewed by: jhb, andrew, emaste
Differential Revision: https://reviews.freebsd.org/D32954
Notable upstream pull request merges:
#16209 --multi-- icp: rip out everything we don't use
#16230 20c8bdd85 FreeBSD: Update use of UMA-related symbols in
                 arc_available_memory
#16242 121a2d335 FreeBSD: unregister mountroot eventhandler on unload
#16258 5de3ac223 vdev_open: clear async fault flag after reopen
#16270 436731276 zvol: Fix suspend lock leaks
#16273 c87cb22ba head_errlog: fix use-after-free
#16284 f72e081fb FreeBSD: Use a statement expression to implement
                 SET_ERROR()
#16300 a10faf5ce FreeBSD: Use the new freeuio() helper to free dynamically
                 allocated UIOs
#16302 a7fc4c85e zstd: don't call zstd_mempool_reap if there are no buffers
#16334 dc91e7452 zdb: dump ZAP_FLAG_UINT64_KEY ZAPs properly
Obtained from: OpenZFS
OpenZFS commit: 1147a27978
These are only included in the amd64 vmm code, so it doesn't make sense
to list them unconditionally.
PR: 280171
Reviewed by: wosch, imp, emaste
Differential Revision: https://reviews.freebsd.org/D45964
Inline IPSEC offload moves almost all IPSEC processing from the
CPU/MCU, and possibly a crypto accelerator, to the network card.
The transmitted packet content is not touched by the CPU during TX
operations. The kernel only does the required policy and security
association lookups to find out that the given flow is offloaded, and
then the packet is transmitted as plain text to the card. For driver
convenience, metadata is attached to the packet identifying the SA
which must process the packet. The card encrypts the payload, adds
padding, calculates authentication, and reformats the packet according
to the policy.
Similarly, on receive, the card does the decapsulation, decryption, and
authentication. The kernel receives the identifier of the SA that was
used to process the packet, together with the plain-text packet.
Overall, payload octets are only read or written by the card's DMA
engine, removing a lot of memory subsystem overhead and saving CPU time
because the IPSEC algorithm calculations are avoided.
If driver declares support for inline IPSEC offload (with the
IFCAP2_IPSEC_OFFLOAD capability set and registering method table struct
if_ipsec_accel_methods), kernel offers the SPD and SAD to driver.
Driver decides which policies and SAs can be offloaded based on
hardware capacity, and acks/nacks each SA for given interface to
kernel. Kernel needs to keep this information to make a decision to
skip software processing on TX, and to assume processing already done
on RX. This shadow SPD/SAD database of offloads is rooted from
policies (struct secpolicy accel_ifps, struct ifp_handle_sp) and SAs
(struct secasvar accel_ipfs, struct ifp_handle_sav).
Some extensions to the PF_KEY socket allow limiting the interfaces on
which a given SP/SA may be offloaded (proposed for offload). Also,
additional statistics extensions allow observing the allocation/octet/use
counters for a specific SA.
Since SPs and SAs are typically instantiated in non-sleepable context,
while offloading them into card is expected to require costly async
manipulations of the card state, calls to the driver for offload and
termination are executed in the threaded taskqueue. It also solves
the issue of allocating resources needed for the offload database.
Neither ifp_handle_sp nor ifp_handle_sav adds a reference to the
owning SP/SA, so the offload must be terminated before the last
reference is dropped. ipsec_accel only adds transient references to
ensure safe pointer ownership by the taskqueue.
Maintaining the SA counters for hardware-accelerated packets is the
duty of the driver. The helper ipsec_accel_drv_sa_lifetime_update()
is provided to hide accel infrastructure from drivers which would use
expected callout to query hardware periodically for updates.
Reviewed by: rscheff (transport, stack integration), np
Sponsored by: NVIDIA networking
Differential revision: https://reviews.freebsd.org/D44219
We're building ACPI, so we need -DDEV_ACPI on CFLAGS. Normally, the
kernel config brings this in, but there's no kernel directory for the
standalone build.
Sponsored by: Netflix
Summary:
Add support for building ossl(4) on powerpc64* by implementing ossl_cpuid and
other support functions for powerpc. The required assembly files for ppc were
already present in-tree.
Test Plan: The changes were tested using the in-tree tools/tools/crypto/cryptocheck.c tool on both powerpc64 and powerpc64le on a POWER9 system.
Reviewed by: #powerpc, jhibbits, jhb
Differential Revision: https://reviews.freebsd.org/D41837
The idea here is to avoid a memory access and conditional branch per
probe site. Instead, the probe is represented by an "unreachable"
unconditional function call. asm goto is used to store the address of
the probe site (represented by a no-op sled) and the address of the
function call into a tracepoint record. Each SDT probe carries a list
of tracepoints.
When the probe is enabled, the no-op sled corresponding to each
tracepoint is overwritten with a jmp to the corresponding label. The
implementation uses smp_rendezvous() to park all other CPUs while the
instruction is being overwritten, as this can't be done atomically in
general. The compiler moves argument marshalling code and the
sdt_probe() function call out-of-line, i.e., to the end of the function.
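The compile-time half of this can be illustrated with a small userspace
sketch that relies on the asm goto extension of GCC and Clang. It is not
the kernel implementation: fake_probe() is a hypothetical stand-in for
sdt_probe(), the asm body is left empty rather than emitting a no-op sled,
and no tracepoint record is written, so nothing here can be patched at
runtime.

```c
#include <assert.h>

static int probe_hits;	/* counts calls to the stand-in probe function */

static void
fake_probe(int arg)	/* hypothetical stand-in for sdt_probe() */
{
	assert(arg >= 0);
	probe_hits++;
}

static int
traced_work(int x)
{
	/*
	 * Empty asm goto: it emits no instructions and never jumps, but it
	 * tells the compiler that control may reach the "probe" label. A
	 * real implementation would emit a no-op sled here and store its
	 * address and the label's address in a tracepoint record so that
	 * the sled can later be overwritten with a jmp.
	 */
	__asm__ goto("" : : : : probe);
	return (x * 2);
probe:
	/* Probe path: argument marshalling and the call live here. */
	fake_probe(x);
	return (x * 2);
}
```

Because the asm body is empty, the probe path is never taken in this
sketch; calling traced_work() leaves probe_hits at zero, which mirrors the
"disabled" state with no memory access or conditional branch at the probe
site.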
Per gallatin@ in D43504, this approach has less overhead when probes are
disabled. To make the implementation a bit simpler, I removed support
for probes with 7 arguments; nothing makes use of this except a
regression test case. It could be re-added later if need be.
The approach taken in this patch enables some more improvements:
1. We can now automatically fill out the "function" field of SDT probe
names. The SDT macros let the programmer specify the function and
module names, but this is really a bug and shouldn't have been
allowed. The intent was to be able to have the same probe in
multiple functions and to let the user restrict which probes actually
get enabled by specifying a function name or glob.
2. We can avoid branching on SDT_PROBES_ENABLED() by adding the ability
to include blocks of code in the out-of-line path. For example:
    if (SDT_PROBES_ENABLED()) {
        int reason = CLD_EXITED;
        if (WCOREDUMP(signo))
            reason = CLD_DUMPED;
        else if (WIFSIGNALED(signo))
            reason = CLD_KILLED;
        SDT_PROBE1(proc, , , exit, reason);
    }
could be written
    SDT_PROBE1_EXT(proc, , , exit, reason,
        int reason;
        reason = CLD_EXITED;
        if (WCOREDUMP(signo))
            reason = CLD_DUMPED;
        else if (WIFSIGNALED(signo))
            reason = CLD_KILLED;
    );
In the future I would like to use this mechanism more generally, e.g.,
to remove branches and marshalling code used by hwpmc, and generally to
make it easier to add new tracepoint consumers without having to add
more conditional branches to hot code paths.
Reviewed by: Domagoj Stolfa, avg
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D44483
This is derived from swills@'s fork of the Juniper virtfs with many
changes by me including bug fixes, style improvements, clearer layering
and more consistent logging. The filesystem is renamed to p9fs to better
reflect its function and to prevent possible future confusion with
virtio-fs.
Several updates and fixes from Juniper have been integrated into this
version by Val Packett and these contributions along with the original
Juniper authors are credited below.
To use this with bhyve, add 'virtio_p9fs_load=YES' to loader.conf. The
bhyve virtio-9p device allows access from the guest to files on the host
by mapping a 'sharename' to a host path. It is possible to use p9fs as a
root filesystem by adding this to /boot/loader.conf:
vfs.root.mountfrom="p9fs:sharename"
for non-root filesystems add something like this to /etc/fstab:
sharename /mnt p9fs rw 0 0
In both examples, substitute the share name used on the bhyve command
line.
The 9P filesystem protocol relies on stateful file opens which map
protocol-level FIDs to host file descriptors. The FreeBSD vnode
interface doesn't really support this and we use heuristics to guess the
right FID to use for file operations. This can be confused by privilege
lowering and does not guarantee that the FID created for a given file
open is always used for file operations, even if the calling process is
using the file descriptor from the original open call. Improving this
would involve changes to the vnode interface which is out-of-scope for
this import.
Differential Revision: https://reviews.freebsd.org/D41844
Reviewed by: kib, emaste, dch
MFC after: 3 months
Co-authored-by: Val Packett <val@packett.cool>
Co-authored-by: Ka Ho Ng <kahon@juniper.net>
Co-authored-by: joyu <joyul@juniper.net>
Co-authored-by: Kumara Babu Narayanaswamy <bkumara@juniper.net>
This code runs at EL2 while the kernel runs at EL1. We build these
files for EL2 through a dependency in vmm_hyp_blob.elf.full so there
is no need to include them in SRCS.
Reviewed by: imp, kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45467
Currently FreeBSD uses IPI-based flushing for remote TLBs.
Hyper-V provides hypercalls to flush local and remote TLBs,
and using them gives a significant performance improvement
in TLB operations.
In testing, this patch set has shown close to a 40 percent
improvement in TLB performance.
This patch also adds a rep hypercall implementation.
Reviewed by: whu, kib
Tested by: whu
Authored-by: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Co-Authored-by: Erni Sri Satya Vennela <ernis@microsoft.com>
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D45521
Firmware advertises the maximum transfer length for write same commands to
the driver during init. Any write same I/O that has the ndob and unmap bits
set and a transfer length greater than the max write same length specified
by the firmware is directed to the firmware instead of the hardware;
otherwise the hardware will break.
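The routing condition can be written as a single predicate. The sketch
below is illustrative only: the request structure, its field names and the
helper name are assumptions, and the transfer length is taken to be in the
same units as the limit advertised by the firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a write same request. */
struct ws_req {
	bool ndob;		/* NDOB bit set in the CDB */
	bool unmap;		/* UNMAP bit set in the CDB */
	uint32_t xfer_len;	/* requested transfer length */
};

/*
 * Returns true if the request has to go through the firmware path rather
 * than directly to the hardware.
 */
static bool
route_to_firmware(const struct ws_req *r, uint32_t fw_max_ws_len)
{
	assert(r != NULL);
	return (r->ndob && r->unmap && r->xfer_len > fw_max_ws_len);
}
```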
Reviewed by: imp
Approved by: imp
Differential revision: https://reviews.freebsd.org/D44452
When we enable checking for BTI on arm64 we need to include an ELF
note in all object files linked into a module.
Using objcopy to convert a binary into an ELF object file doesn't add
the note, so switch to using .incbin from an assembly file. This allows
us to add the needed note without affecting the included object.
Reviewed by: imp, kib, emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45468
LLD has the -zbti-report=error argument to check if the BTI note is
present when linking. To allow for this to be used when linking the
kernel and modules:
- Add the BTI note to the remaining assembly files
- Mark ptrauth.c as protected by BTI
- Disable -zbti-report for vmm hypervisor switching code as it's not
used there.
The linux64 module doesn't build with the flag as it includes vdso code
that doesn't include the note.
Reviewed by: imp, kib, emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45466
Notable upstream pull request merges:
#15940 41ae864b6 Replace P2ALIGN with P2ALIGN_TYPED and delete P2ALIGN
#16128 5137c132a zpool import output is not formated properly
#16138 efbef9e6c FreeBSD: Add zfs_link_create() error handling
#16146 04bae5ec9 Disable high priority ZIO threads on FreeBSD and Linux
#16151 cc3869153 zfs_ioc_send: use a dedicated taskq thread for send
#16151 adda768e3 spa: remove spa_taskq_dispatch_sync()
#16151 515c4dd21 spa: flatten spa_taskq_dispatch_ent()
#16151 0a543db37 spa_taskq_dispatch_ent: simplify arguments
#16153 975a13259 Add support for parallel pool exports
#16153 89acef992 Simplified the scope of the namespace lock
#16159 136c05321 ZAP: Fix leaf references on zap_expand_leaf() errors
#16162 af5dbed31 Fix scn_queue races on very old pools
#16165 3400127a7 Fix ZIL clone records for legacy holes
#16167 414acbd37 Unbreak FreeBSD cross-build on MacOS broken in 051460b8b
#16172 eced2e2f1 libzfs: Fix mounting datasets under thread limit pressure
#16178 b64afa41d Better control the thread pool size when mounting datasets
#16181 fa99d9cd9 zfs_dbgmsg_print: make FreeBSD and Linux consistent
#16191 e675852bc dbuf: separate refcount calls for dbuf and dbuf_user
#16198 a043b60f1 Correct level handling in zstream recompress
#16204 34906f8bb zap: reuse zap_leaf_t on dbuf reuse after shrink
#16206 d0aa9dbcc Use memset to zero stack allocations containing unions
#16207 8865dfbca Fix assertion in Persistent L2ARC
#16208 08648cf0d Allow block cloning to be interrupted by a signal
#16210 e2357561b FreeBSD: Add const qualifier to members of struct
                 opensolaris_utsname
#16214 800d59d57 Some improvements to metaslabs eviction
#16216 02c5aa9b0 Destroy ARC buffer in case of fill error
#16225 01c8efdd5 Simplify issig()
Obtained from: OpenZFS
OpenZFS commit: e2357561b9
LINT includes the bnxt_re driver. Adjust the path in files, add missing
files and add a new BNXT_C to build (which thinly wraps the OFED version
with bnxt-specific stuff).
Sponsored by: Netflix
Fixes: acd884dec9 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
The new bnxt_re driver doesn't compile on any of them (it uses writeq()
from the LinuxKPI, which isn't implemented there), and had already been
disconnected from the build on i386.
Reported by: Jenkins
Fixes: acd884dec9 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
This patch introduces the RoCE driver for the
Broadcom NetXtreme-E 10/25/50/100/200G RoCE HCAs.
The RoCE driver is a two-part driver that relies
on the bnxt_en NIC driver to operate. The changes
needed in the bnxt_en driver are included through
another patch, "L2-RoCE driver communication interface",
in this set.
Presently there is no user space support, hence the
recommendation to use the krping kernel module for
testing. User space support will be incorporated in
subsequent patch submissions.
Reviewed by: imp
Approved by: imp
Differential revision: https://reviews.freebsd.org/D45011
- Added Aux bus support for RoCE.
- Implemented the ulp ops that are required by RoCE driver.
- Restructure context memory data structures
- DBR pacing support
Reviewed by: imp
Approved by: imp
Differential revision: https://reviews.freebsd.org/D45006
Created new directory "bnxt_en" in /dev/bnxt and /modules/bnxt
and moved source files and Makefile into respective directory.
ETS support:
- Added new files bnxt_dcb.c & bnxt_dcb.h
- Added sysctl node 'dcb' and created handlers 'ets' and
'dcbx_cap'
- Add logic to validate user input and configure ETS in
the firmware
- Updated makefile to include bnxt_dcb.c & bnxt_dcb.h
PFC support:
- Created sysctl handlers 'pfc' under node 'dcb'
- Added logic to validate user input and configure PFC in
the firmware.
App TLV support:
- Created 3 new sysctl handlers under node 'dcb'
- set_apptlv (write only): Sets a specified TLV
- del_apptlv (write only): Deletes a specified TLV
- list_apptlv (read only): Lists all APP TLVs configured
- Added logic to validate user input and configure APP TLVs
in the firmware.
Added the following DCB ops for the management interface:
- Set PFC, Get PFC, Set ETS, Get ETS, Add App_TLV, Del App_TLV,
List App_TLV
Reviewed by: imp
Approved by: imp
Differential revision: https://reviews.freebsd.org/D45005
This policy enables a user to become another user without having to be
root (hence no setuid binary). It is configured via rules using the sysctl
security.mac.do.rules.
For example:
security.mac.do.rules=uid=1001:80,gid=0:any
The above rules mean that the user identified by uid 1001 is able to
become user 80, and that any user in group 0 is allowed to become any
user on the system.
The mdo(1) utility expects the MAC/do policy to be installed and its
rules defined.
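To make the rule format concrete, here is a small userspace sketch that
evaluates a rule string of the form shown in the example. It is not the
policy's parser: the function names are hypothetical, only the "uid=",
"gid=" and "any" forms from the example are handled, malformed rules are
silently skipped, and only a single gid is checked rather than full group
membership.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Parse a whole string as a decimal id; reject empty or trailing junk. */
static bool
parse_id(const char *s, long *out)
{
	char *endp;

	if (*s == '\0')
		return (false);
	*out = strtol(s, &endp, 10);
	return (*endp == '\0');
}

/*
 * Returns true if some rule in "rules" lets a process with the given uid
 * and gid become user "target".
 */
static bool
rules_allow(const char *rules, long uid, long gid, long target)
{
	const char *p = rules;

	assert(rules != NULL);
	for (;;) {
		const char *end = strchr(p, ',');
		size_t len = (end != NULL) ? (size_t)(end - p) : strlen(p);
		char buf[64], *colon;
		long from, to;

		if (len > 4 && len < sizeof(buf)) {
			memcpy(buf, p, len);
			buf[len] = '\0';
			colon = strchr(buf, ':');
			if (colon != NULL && (strncmp(buf, "uid=", 4) == 0 ||
			    strncmp(buf, "gid=", 4) == 0)) {
				*colon = '\0';	/* split "from" and "to" */
				if (parse_id(buf + 4, &from) &&
				    from == (buf[0] == 'u' ? uid : gid) &&
				    (strcmp(colon + 1, "any") == 0 ||
				    (parse_id(colon + 1, &to) && to == target)))
					return (true);
			}
		}
		if (end == NULL)
			return (false);
		p = end + 1;
	}
}
```

With the example rules, uid 1001 may become user 80 but not user 0, while
any process whose gid is 0 may become any user.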
Reviewed by: des
Differential Revision: https://reviews.freebsd.org/D45145