140679 commits

Author SHA1 Message Date
Jamie Gritton
cf18a61708 MFC jail: Remove a prison's shared memory when it dies
Add shm_remove_prison(), that removes all POSIX shared memory segments
belonging to a prison.  Call it from prison_cleanup() so a prison
won't be stuck in a dying state due to the resources still held.

PR:		257555
Reported by:	grembo

(cherry picked from commit 7060da62ff)
2022-07-03 12:25:43 -07:00
Jamie Gritton
06dcf1499b MFC jail: add prison_cleanup() to release resources held by a dying jail
Currently, when a jail starts dying, either by losing its last user
reference or by being explicitly killed,
osd_jail_call(...PR_METHOD_REMOVE...) is called.  Encapsulate this
into a function prison_cleanup() that can then do other cleanup.

(cherry picked from commit a9f7455c38)
2022-07-03 12:24:49 -07:00
Bjoern A. Zeeb
66754c01ff XHCI: clear warm and port reset
It seems we do not clear UPS_C_BH_PORT_RESET and UPS_C_PORT_RESET
conditions after warm or port reset.  Add that code.

Obtained from:	an old patch mainly debugging other problems
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D35483

(cherry picked from commit 8f892e9bee)
2022-07-01 13:50:19 +00:00
Bjoern A. Zeeb
39cd7aa134 USB: add quirks to XHCI
While XHCI is very generic some revisions of chipsets have problems.
On dwc3 <= 3.00a Port Disable does not seem to work so we need to not
enable it.
For that introduce quirks to xhci so that controllers can steer
certain features.  I would hope that this is and remains the only one.

Obtained from:	an old patch mainly debugging other problems
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D35482

(cherry picked from commit 447c418da0)
2022-07-01 13:50:10 +00:00
Bjoern A. Zeeb
2217448bcc LinuxKPI: 802.11: cleanup lsta better
This change cleans up lsta from the VIF station list as well as
deals with freeing the lsta itself so it is not leaked.

lkpi_iv_update_bss() makes this more complicated than it should be
as we tie more sta state (incl. drv/fw) to the node that net80211
does not know about.  There is more work to be done detangling this
now that it is better understood.

(cherry picked from commit e24e8103e0)
2022-07-01 13:50:03 +00:00
Bjoern A. Zeeb
05b1792754 LinuxKPI: 802.11: remove an early bandaid to make sure queues are allocated
iwlwifi allocates queues on first wakeup.  This takes a lot longer on
FreeBSD's work implementation than it seems to on Linux based on some
discussion.  That meant that we couldn't get non-data frames out quickly
enough initially and failed to associate.
d0d2911035 should have solved most of this
for us with iwlwifi.  None of the other drivers ported to LinuxKPI/802.11
up to today will call a dequeue so we get notified when the queues are
allocated or even need to do so.
Remove the bandaid initially put in for iwlwifi now and speed up the
overall process of getting us associated.

(cherry picked from commit 841719c08f)
2022-07-01 13:49:55 +00:00
Bjoern A. Zeeb
d4d3ba68b7 LinuxKPI: 802.11: sync sta->addr in lkpi_iv_update_bss()
In lkpi_iv_update_bss() introduced in d9f59799fc we swap lsta and
along with that sta and drv state if ni gets reused and swapped under
us by net80211.  What we did not do was to sync sta->addr which later
(usually in lkpi_sta_assoc_to_run) during a bss_info update caused
problems in drivers (or firmware) as the BSSID and the station address
were not aligned.

If this proves to hold up to fix iwlwifi issues seen on firmware
for older chipsets, multi-assoc runs, and rtw89 (which this fixes)
we should add asserts that lkpi_iv_update_bss() can only happen in
pre-auth stages and/or make sure we factor out synching more state
fields.

Found debugging:	rtw89

(cherry picked from commit ed3ef56b29)
2022-07-01 13:49:42 +00:00
Bjoern A. Zeeb
07bec0d40e net80211 / LinuxKPI: 802.11: add Control Trigger Subframe information
Add definitions related to 802.11ax Control Trigger frame format
needed for rtw89.

(cherry picked from commit 4c3684ef5c)
2022-07-01 13:49:19 +00:00
Bjoern A. Zeeb
f5d0b181f4 LinuxKPI: 802.11: ieee80211_start_tx_ba_session()
For as long as we do not implement the compat code for tx aggregation
return -EINVAL in ieee80211_start_tx_ba_session() as both rtw88 and
rtw89 check for this value and only then disable further attempts.

(cherry picked from commit 799051e2ca)
2022-07-01 13:49:11 +00:00
Bjoern A. Zeeb
54f9ddf4ae rtw88: update Realtek's rtw88 driver
Update rtw88 based on wireless-testing at
4e051428044d5c47cd2c81c3b154788efe07ee11 (tag: wt-2022-06-10).

This is in preparation to apply USB changes to work on these and
LinuxKPI for them over the next weeks, as well as to debug a
reported issue, and possibly extract and upstream some local fixes.

(cherry picked from commit 9c951734c2)
2022-07-01 13:49:02 +00:00
Bjoern A. Zeeb
a478f4afd8 LinuxKPI: move pm_message_t from kernel.h to pm.h
Move pm_message_t from kernel.h to pm.h and remove a private define
in usb.h as well as adjust the implementation in linux_usb.c.
This cleans up what I believe to be a historic shortcut and is
needed for future wireless driver updates.

Leave a note in UPDATING that drm-kmod users need to update to the
latest version before re-compiling a new kernel to avoid errors
(see PR).

Sponsored by:	The FreeBSD Foundation
PR:		264449 (drm-kmod port update, thanks wulf)
Obtained from:	bz_git_iwlwifi (Dec 2020) (partly)
Reviewed by:	hselasky, imp
Differential Revision: https://reviews.freebsd.org/D35276

(cherry picked from commit 0e981d79b1)
2022-07-01 13:48:24 +00:00
Bjoern A. Zeeb
74cedb5e90 LinuxKPI: 802.11: rework handling of the special IEEE80211_NUM_TIDS queue
Rework the way we are dealing with the last queue.  If the driver
opts in to STA_MMPDU_TXQ then preferably send all non-data frames
via the last (IEEE80211_NUM_TIDS) queue which otherwise is not used
in station mode.
If we do not have that queue we do individual tx() calls for non-data
frames now.
Everything else goes via the selected queue if possible for as long as
we have a ni (sta) and otherwise resorts to direct tx.

Tested on:	Intel AX200 and AX210
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit d0d2911035)
(cherry picked from commit fb6eaf74e9)
2022-07-01 13:45:09 +00:00
Mark Johnston
70fd40edb8 debugnet: Fix an error handling bug in the DDB command tokenizer
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit c262d5e877)
2022-06-30 10:12:15 -04:00
Mark Johnston
533a247fa8 debugnet: Handle batches of packets from if_input
Some drivers will collect multiple mbuf chains, linked by m_nextpkt,
before passing them to upper layers.  debugnet_pkt_in() didn't handle
this and would process only the first packet, typically leading to
retransmits.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 8414331481)
2022-06-30 10:11:52 -04:00
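The batch handling described in the debugnet commit above can be sketched as follows. This is a minimal illustration, not the kernel code: `struct pkt`, `handle_one()`, `process_batch()` and `total_len` are hypothetical stand-ins for an mbuf chain head linked by m_nextpkt and its input handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the head of one mbuf chain. */
struct pkt {
	struct pkt *nextpkt;	/* next packet in the batch, or NULL */
	int len;		/* payload length in bytes */
};

static int total_len;		/* bytes seen by the handler so far */

/* Hypothetical per-packet handler; here it only accounts the length. */
static void
handle_one(struct pkt *p)
{
	assert(p->len >= 0);
	total_len += p->len;
}

/*
 * Walk every packet in the batch instead of only the first one.
 * Each packet is detached from the list before it is handled, so the
 * handler never sees a stale link.  Returns the number of packets.
 */
static int
process_batch(struct pkt *head)
{
	struct pkt *p, *next;
	int n = 0;

	for (p = head; p != NULL; p = next) {
		next = p->nextpkt;
		p->nextpkt = NULL;
		handle_one(p);
		n++;
	}
	return (n);
}
```

Saving the next pointer before calling the handler matters because the handler may consume or free the packet.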
Hans Petter Selasky
973716df6c tcp: Correctly compute the TCP goodput in bits per second by using SEQ_SUB().
TCP sequence number differences should be computed using SEQ_SUB().

Differential Revision:	https://reviews.freebsd.org/D35505
Reviewed by:	rscheff@
Sponsored by:	NVIDIA Networking

(cherry picked from commit f5766992c0)
2022-06-30 11:39:43 +02:00
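A minimal sketch of why the goodput commit above needs a wrap-safe subtraction: TCP sequence numbers are 32-bit and wrap around, so a plain comparison of the two values gives the wrong answer near the wrap point. The macro below is a local stand-in written for this example, and `goodput_bps()` with its parameters is a hypothetical helper, not the function changed by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in: signed difference of two 32-bit sequence numbers. */
#define SEQ_SUB(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)))

/*
 * Hypothetical helper: bits per second delivered between two sequence
 * numbers over elapsed_us microseconds.
 */
static uint64_t
goodput_bps(uint32_t seq_end, uint32_t seq_start, uint64_t elapsed_us)
{
	int32_t bytes = SEQ_SUB(seq_end, seq_start);

	assert(elapsed_us > 0);
	if (bytes <= 0)
		return (0);
	return ((uint64_t)bytes * 8 * 1000000 / elapsed_us);
}
```

With seq_start just below the wrap point and seq_end just above it, the modular difference still yields the small positive byte count that was actually transferred.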
Hans Petter Selasky
20d3224919 uhid(4): Don't read-ahead from the USB IN endpoint.
This avoids an issue where IN endpoint data received from the device right
before the file handle is closed, gets lost.

PR:		263995
Sponsored by:	NVIDIA Networking

(cherry picked from commit b6f615255d)
2022-06-30 11:39:43 +02:00
Alexander Motin
15183f36e5 amd64: Stop using REP MOVSB for backward memmove()s.
Enhanced REP MOVSB feature of CPUs starting from Ivy Bridge makes
REP MOVSB the fastest way to copy memory in most cases. However, the
Intel Optimization Reference Manual says: "setting the DF to force
REP MOVSB to copy bytes from high towards low addresses will
experience significant performance degradation". Measurements on Intel
Cascade Lake and Alder Lake, as well as on AMD Zen3, show that it can
drop throughput to as low as 2.5-3.5GB/s, compared to ~10-30GB/s
for REP MOVSQ or a hand-rolled loop, used for non-ERMS CPUs.

This patch keeps ERMS use for forward ordered memory copies, but
removes it for backward overlapped moves where it does not work.

Reviewed by:	mjg
MFC after:	2 weeks

(cherry picked from commit 6210ac95a1)
2022-06-29 21:15:49 -04:00
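The forward/backward distinction in the memmove() commit above can be sketched in portable C. This only illustrates how the copy direction is chosen; it uses plain byte loops rather than the REP MOVSB/MOVSQ assembly the commit touches, and `sketch_memmove()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static void *
sketch_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert(len == 0 || (dst != NULL && src != NULL));

	/*
	 * The unsigned difference is >= len when dst is below src or
	 * when the regions do not overlap; a forward copy is safe then.
	 */
	if ((uintptr_t)d - (uintptr_t)s >= len) {
		while (len-- > 0)
			*d++ = *s++;
	} else {
		/* dst overlaps the tail of src: copy from the end. */
		d += len;
		s += len;
		while (len-- > 0)
			*--d = *--s;
	}
	return (dst);
}
```

Only the second branch corresponds to the "backward" case for which the commit stops using REP MOVSB.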
Neel Chauhan
7a8188739a if_ix: Reset on an ECC error
This mirrors the Linux behavior as seen in the kernel commit d773ce2.

Reviewed by:		kbowling
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D35542

(cherry picked from commit 4f1d91e413)
2022-06-29 10:07:50 -07:00
Mark Johnston
f1400b2ecc pmap: Keep PTI page table pages busy
PTI page table pages are allocated from a VM object, so must be
exclusively busied when they are freed, e.g., when a thread loses a race
in pmap_pti_pde().  Simply keep PTPs busy at all times, as was done for
some other kernel allocators in commit
e9ceb9dd11.

Also remove some redundant assertions on "ref_count":
vm_page_unwire_noq() already asserts that the page's reference count is
greater than zero.

Reported by:	syzkaller
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit c6d092b510)
2022-06-29 10:13:57 -04:00
Mark Johnston
c31c881f7e Fix the test used to wait for AP startup on x86, arm64, riscv
On arm64, testing pc_curpcb != NULL is not correct since pc_curpcb is
set in pmap_switch() while the bootstrap stack is still in use.  As a
result, smp_after_idle_runnable() can free the boot stack prematurely.

Take a different approach: use smp_rendezvous() to wait for all APs to
acknowledge an interrupt.  Since APs must not enable interrupts until
they've entered the scheduler, i.e., switched off the boot stack, this
provides the right guarantee without depending as much on the
implementation of cpu_throw().  And, this approach applies to all
platforms, so convert x86 and riscv as well.

Reported by:	mmel
Tested by:	mmel
Reviewed by:	kib
Fixes:		8db2e8fd16 ("Remove the secondary_stacks array in arm64 and riscv kernels.")
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f6b799a86b)
2022-06-29 10:13:44 -04:00
Mark Johnston
cc81b8661d vm_fault: Fix a racy copy of page valid bits
We do not hold the object lock or a page busy lock when copying src_m's
validity state.  Prior to commit 45d72c7d7f we marked dst_m as fully
valid.

Use the source object's read lock to ensure that valid bits are not
concurrently cleared.

Reviewed by:	alc, kib
Fixes:		45d72c7d7f ("vm_fault_copy_entry: accept invalid source pages.")
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit d0443e2b98)
2022-06-29 10:12:34 -04:00
Mark Johnston
3fe539651a vm_fault: Avoid unnecessary object relocking in vm_fault_copy_entry()
Suggested by:	alc
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 1f88394b7f)
2022-06-29 10:12:34 -04:00
Mark Johnston
353aa91c64 mount: Fix an incorrect assertion in kernel_mount()
The pointer to the mount values may be null if an error occurred while
copying them in, so fix the assertion condition to reflect that
possibility.

While here, move some initialization code into the error == 0 block.  No
functional change intended.

Reported by:	syzkaller
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 7565431f30)
2022-06-29 10:12:33 -04:00
Konstantin Belousov
65d8e97c4c UFS: make mkdir() and link() reliable when using SU and reaching nlink limit
PR:	165392

(cherry picked from commit 8db679af66)
2022-06-29 12:38:26 +03:00
Alexander Motin
0e897d87f7 CTL: Fix double command completions on HA failover.
I've found a couple of cases where CTL_FLAG_SENT_2OTHER_SC flags were not
cleared on command return from the active node or on send failure.  This
created races where a ctl_failover_lun() call before ctl_process_done()
could cause second ctl_done() and ctl_process_done() calls, causing
all sorts of problems.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.

(cherry picked from commit 3b0e3e8d2a)
2022-06-28 14:14:42 -04:00
Hans Petter Selasky
78088c5714 busdma: Protect ARM busdma bounce page counters using the bounce page lock.
In bus_dmamap_unload() on ARM, the counters for free_bpages and reserved_bpages
appear to be vulnerable to unprotected read-modify-write operations that result
in accounting that looks like a page leak.

This was noticed on a 2GB quad core i.MX6 system that has more than one device
attached via FTDI based USB serial connection.

Submitted by:	John Hein <jcfyecrayz@liamekaens.com>
Differential Revision:	https://reviews.freebsd.org/D35553
PR:		264836
Sponsored by:	NVIDIA Networking

(cherry picked from commit 6c4b6f55f7)
2022-06-28 08:20:34 +02:00
Mitchell Horne
dac438a9b5 mips: fix use of dump_append()
Direct commit to stable/13.

Reported by:	Jenkins
Fixes:		5a96b88f05
2022-06-27 18:02:02 -03:00
Mitchell Horne
5a96b88f05 kerneldump: remove physical from dump routines
It is unused, especially now that the underlying d_dumper methods do not
accept the argument.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35174

(cherry picked from commit db71383b88)
2022-06-27 16:32:06 -03:00
Mitchell Horne
e06f07bc3f kerneldump: remove physical argument from d_dumper
The physical address argument is essentially ignored by every dumper
method. In addition, the dump routines don't actually pass a real
address; every call to dump_append() passes a value of zero for
physical.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35173

(cherry picked from commit 489ba22236)
2022-06-27 16:32:06 -03:00
Mitchell Horne
13f544bc8e livedump: add event handler hooks
Add three hooks to the livedump process: before, after, and for each
block of dumped data. This allows, for example, quiescing the system
before the dump begins or protecting data of interest to ensure its
consistency in the final output.

Reviewed by:	markj, kib (previous version)
Reviewed by:	debdrup (manpages)
Reviewed by:	Pau Amma <pauamma@gundo.com> (manpages)
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34067

(cherry picked from commit eb9d205fa6)
2022-06-27 16:32:06 -03:00
Mitchell Horne
758e72c0a8 Add new vnode dumper to support live minidumps
This dumper can instantiate and write the dump's contents to a
file-backed vnode.

Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.

As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.

A future change to savecore(8) will add an option to save a live dump.

Reviewed by:	markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with:	kib
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D33813

(cherry picked from commit c9114f9f86)
2022-06-27 16:32:06 -03:00
Mitchell Horne
6d26e87f48 Split out dumper allocation from list insertion
Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() calls this function and retains the existing
behaviour.

This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.

free_single_dumper() is made public and renamed to dumper_destroy().

Reviewed by:	kib, markj
MFC after:	1 week
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34068

(cherry picked from commit 59c27ea18c)
2022-06-27 16:32:06 -03:00
Eric van Gyzen
8320036255 netdump: send key before dump, in case dump fails
Previously, if an encrypted netdump failed, such as due to a timeout or
network failure, the key was not saved, so a partial dump was
completely useless.

Send the key first, so the partial dump can be decrypted, because even a
partial dump can be useful.

Reviewed by:	bdrewery, markj
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D31453

(cherry picked from commit 13a58148de)
2022-06-27 16:32:06 -03:00
Bryan Drewery
ad78db5a3f netdump: Fix leaking debugnet state on errors.
Reviewed by:	cem, markj
Sponsored by:	Dell EMC
Differential Revision: https://reviews.freebsd.org/D31319

(cherry picked from commit a573243370)
2022-06-27 16:32:06 -03:00
Mark Johnston
2cecf3cfbb bpf: Correct a comment
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit c88f6908b4)
2022-06-27 10:11:20 -04:00
Mark Johnston
18c53b8dde bpf: Zero pad bytes preceding BPF headers
BPF headers are word-aligned when copied into the store buffer.  Ensure
that pad bytes following the preceding packet are cleared.

Reported by:	KMSAN
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 60b4ad4b6b)
2022-06-27 10:11:10 -04:00
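The padding fix in the bpf commit above can be sketched as follows. This is a simplified model under stated assumptions: the alignment is fixed at 8 bytes for the example, and `PAD_ALIGNMENT`, `PAD_WORDALIGN()` and `align_and_zero()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed alignment for this sketch. */
#define PAD_ALIGNMENT	((size_t)8)
#define PAD_WORDALIGN(x) \
	(((x) + (PAD_ALIGNMENT - 1)) & ~(PAD_ALIGNMENT - 1))

/*
 * Given the offset at which the previous record ends, zero the pad
 * bytes up to the next aligned offset and return that offset, which
 * is where the next header would be stored.
 */
static size_t
align_and_zero(unsigned char *buf, size_t end)
{
	size_t next = PAD_WORDALIGN(end);

	assert(buf != NULL);
	memset(buf + end, 0, next - end);
	return (next);
}
```

Without the memset(), the pad bytes would carry whatever the buffer held before, which is the kind of uninitialized data a sanitizer such as KMSAN reports.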
Hans Petter Selasky
1ecd211915 ibcore: Fix sysfs registration error flow
The kernel commit cited below restructured ib device management
so that the device kobject is initialized in ib_alloc_device.

As part of the restructuring, the kobject is now initialized in
procedure ib_alloc_device, and is later added to the device hierarchy
in the ib_register_device call stack, in procedure
ib_device_register_sysfs (which calls device_add).

However, in the ib_device_register_sysfs error flow, if an error
occurs following the call to device_add, the cleanup procedure
device_unregister is called. This call results in the device object
being deleted -- which results in various use-after-free crashes.

The correct cleanup call is device_del -- which undoes device_add
without deleting the device object.

The device object will then (correctly) be deleted in the
ib_register_device caller's error cleanup flow, when the caller invokes
ib_dealloc_device.

Linux commit:
b312be3d87e4c80872cbea869e569175c5eb0f9a

PR:		264472
Sponsored by:	NVIDIA Networking

(cherry picked from commit 55d1833671)
2022-06-27 10:14:49 +02:00
Doug Moore
c253ae9fe4 iommu_gas: restrict tree search to promising paths
In iommu_gas_lowermatch and iommu_gas_uppermatch, a subtree search is
quickly terminated if the largest available free space in the subtree
is below a limit, where that limit is related to the size of the
allocation request. However, that limit is too small; it does not
account for both of the guard pages that will surround the allocated
space, but only for one of them. Consequently, it permits the search
to proceed through nodes that cannot produce a successful allocation
for all the requested space. Fix that limit to improve search
performance.

Reviewed by:	alc, kib
Submitted by:	Weixi Zhu (wxzhu@rice.edu)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D35414

(cherry picked from commit b831865fe3)
2022-06-27 00:36:42 -05:00
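The pruning condition described in the iommu_gas commit above can be sketched as a single comparison. This is a simplified model: the page size is assumed to be 4096 bytes, the real kernel check may include further terms, and `SKETCH_PAGE_SIZE` and `subtree_worth_searching()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed guard page size for this sketch. */
#define SKETCH_PAGE_SIZE	((uint64_t)4096)

/*
 * free_down is the largest free gap anywhere in the subtree.  An
 * allocation of "size" bytes needs a guard page on each side, so a
 * subtree whose largest gap is smaller than size plus two guard pages
 * cannot satisfy the request and need not be searched.
 */
static bool
subtree_worth_searching(uint64_t free_down, uint64_t size)
{
	assert(size > 0);
	return (free_down >= size + 2 * SKETCH_PAGE_SIZE);
}
```

Accounting for only one guard page, as before the fix, would let the search descend into subtrees whose largest gap is one page too small.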
Doug Moore
4d517feaea busdma_iommu: simplify split logic
iommu_bus_dmamap_load_something1 includes code for handling the
possibility of splitting a buffer that is needlessly complex.
Simplify it.

Reviewed by:	alc, kib
MFC after:	3 weeks
Tested by: pho (previous revisions)
Differential Revision:	https://reviews.freebsd.org/D35232

(cherry picked from commit 04e86ae357)
2022-06-27 00:34:59 -05:00
Martin Matuska
5ce13b8aa5 zfs: merge openzfs/zfs@6c3c5fcfb (zfs-2.1-release) into stable/13
OpenZFS release 2.1.5

Notable upstream pull request merges:
  #12575 Reject zfs send -RI with nonexistent fromsnap
  #12687 Skip spacemaps reading in case of pool readonly import
  #12746 Default to zfs_dmu_offset_next_sync=1
  #13277 FreeBSD: Use NDFREE_PNBUF if available
  #13311 Fix error handling in FreeBSD's get/putpages VOPs
  #13345 FreeBSD: Fix translation from ABD to physical pages
  #13373 zfs: holds: dequadratify
  #13375 Corrected edge case in uncompressed ARC->L2ARC handling
  #13405 Reduce dbuf_find() lock contention
  #13406 FreeBSD: use zero_region instead of allocating a dedicated page
  #13499 zed: Take no action on scrub/resilver checksum errors
  #13484 FreeBSD: libspl: Add locking around statfs globals
  #13513 Remove wrong assertion in log spacemap
  #13537 Improve sorted scan memory accounting

Obtained from:	OpenZFS
OpenZFS tag:	zfs-2.1.5
OpenZFS commit:	6c3c5fcfbe
Relnotes:	yes
2022-06-25 09:13:54 +02:00
Damjan Jovanovic
db8710e219 struct kinfo_file changes needed for lsof to work using only usermode APIs
(cherry picked from commit 8c309d48aa)
2022-06-24 22:37:33 +03:00
Damjan Jovanovic
c1731fa54d KERN_LOCKF: report kl_file_fsid consistently with stat(2)
PR:	264723

(cherry picked from commit 8ae7694913)
2022-06-24 22:37:33 +03:00
Konstantin Belousov
9a24a80a17 reap_kill_proc(): avoid singlethreading any other process if we are exiting
Tested by:	pho (whole series MFC)

(cherry picked from commit 1575804961)
2022-06-24 17:45:50 +03:00
Konstantin Belousov
b18df35be0 reap_kill_subtree(): hold the reaper when entering it into the queue to handle later
(cherry picked from commit e0343eacf3)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
935509ba18 reap_kill_subtree_once(): handle proctree_lock unlock in reap_kill_proc()
(cherry picked from commit 1d4abf2cfa)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
9644a36d95 reap_kill_proc: do not retry on thread_single() failure
(cherry picked from commit addf103ce6)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
6a0a83e5cc Make stop_all_proc_block interruptible to avoid deadlock with parallel suspension
(cherry picked from commit 008b2e6544)
2022-06-24 17:45:46 +03:00
Mark Johnston
64717c0148 thread_single_end(): consistently maintain p_boundary_count for ALLPROC mode
(cherry picked from commit 2d5ef216b6)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
ea6a7512d2 thread_unsuspend(): do not unsuspend the suspended leader thread doing SINGLE_ALLPROC
(cherry picked from commit 1b4701fe1e)
2022-06-24 17:45:46 +03:00
Konstantin Belousov
b66c168721 thread_single(): remove already checked conditional expression
(cherry picked from commit b9009b1789)
2022-06-24 17:45:46 +03:00