After we wipe a PMC's configuration, including its local enable bit(s),
we no longer care about its global enable bit. Global enable bits
may now only be cleared by the interrupt handler in case of an error
(sample buffer overflow). Being set is actually the reset default for
them. This saves one WRMSR per process-scope PMC per context switch,
which is clearly visible in profiles.
MFC after: 1 month
(cherry picked from commit 326a8d3e08)
The PMC subsystem is not designed for non-uniform CPU capabilities
(P/E-cores are different), but at least several working architectural
events like cpu_clk_unhalted.thread_p should be better than nothing.
MFC after: 1 month
(cherry picked from commit fe109d3113)
Intel's JSON files use event=0 to specify the fixed counter number via
umask. Alternatively, fixed counters have an equivalent programmable
event/umask.
MFC after: 1 month
(cherry picked from commit c1e813d123)
Since version 2, Intel CPUs can freeze PMCs when entering a PMI to
reduce the PMI's effect on collected statistics. Since version 4,
hardware supports a "streamlined" mechanism that does not require
IA_GLOBAL_CTRL MSR access.
MFC after: 1 month
(cherry picked from commit 81ffb45f02)
This variable is set based on the exact CPU model detected. If this
value is set too small, it could lead to a NULL-dereference from an
improperly initialized pmc_rowindex_to_classdep array.
Though it has been fixed, this was previously the case for Broadwell.
Add two asserts to catch this in DEBUG kernels, as it represents a
configuration error that may be hard to uncover otherwise.
PR: 253687
Reported by: Zhenlei Huang <zlei.huang@gmail.com>
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 8399d923a5)
get_counts() doesn't do anything at the moment but return the result of
get_cycles(), so remove it.
For clarity, rename get_cycles() to get_timecount(); RISC-V defines
separate time and cyclecount CSRs, so let's avoid confusing the two.
They may be backed by the same underlying clock, but this is an
implementation detail.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35461
(cherry picked from commit b82f4170fc)
The third argument to this function indicates whether the supplied
ticker is fixed or variable, i.e. requiring calibration. Give this
argument a type and name that better conveys this purpose.
Reviewed by: kib, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35459
(cherry picked from commit 8701571df9)
- Prune unused definitions and includes
- Slight renaming of callback functions to indicate their usage
- Place vdso_fill_timehands callback logically in the file
- Small style nits
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35460
(cherry picked from commit 715276a08b)
This is cheaper than the default of tc_cpu_ticks().
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35462
(cherry picked from commit 33734a1f76)
Calling bus_dmamap_destroy() for a mapping which was allocated with
bus_dmamem_alloc() will result in a panic. This change is not run-time
tested, but I identified the issue while implementing the analogous
method in if_dwc(4), using this implementation as the template.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 8757d0fca9)
We already increment the unicast IPACKETS and OPACKETS counters in the
rx/tx paths, respectively. Multicast packets are counted in the generic
ethernet code. Therefore, we shouldn't increment these counters in
dwc_harvest_stats().
Drop the early return from dwc_rxfinish_one() so that we still count
received packets with e.g. a checksum error.
PR: 263817
Reported by: Jiahao LI <jiahali@blackberry.com>
Reviewed by: manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35499
(cherry picked from commit 9718759043)
It can be useful for testing.
Reviewed by: manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35496
(cherry picked from commit 27b39e58b3)
We claim support in ifcaps, but don't actually enable it.
PR: 263886
Reviewed by: manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35498
(cherry picked from commit 35c9edab41)
Per the reports, some Allwinner device trees now list the desired
phy-mode as "rgmii-id". The manual string comparison fails to detect
this, and we end up falling back to MII mode. Instead, select the clock
name using the sc->phy_mode variable, which is set in the main attach
function.
The logic to actually handle rgmii-id mode delays will be added to the
relevant PHY driver.
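
As a rough illustration of why the exact string match misses
"rgmii-id", here is a minimal userland sketch. The enum, both helper
functions, and the "example-*" clock names are hypothetical stand-ins,
not the driver's actual identifiers; only the idea of parsing the mode
once and then switching on it is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connection types; the real driver keeps its own. */
enum phy_mode {
	PHY_MODE_UNKNOWN,
	PHY_MODE_MII,
	PHY_MODE_RGMII,
	PHY_MODE_RGMII_ID,
	PHY_MODE_RGMII_RXID,
	PHY_MODE_RGMII_TXID
};

/* Parse the device tree string once, e.g. at attach time. */
static enum phy_mode
parse_phy_mode(const char *s)
{
	static const struct {
		const char	*name;
		enum phy_mode	 mode;
	} tbl[] = {
		{ "mii",	PHY_MODE_MII },
		{ "rgmii",	PHY_MODE_RGMII },
		{ "rgmii-id",	PHY_MODE_RGMII_ID },
		{ "rgmii-rxid",	PHY_MODE_RGMII_RXID },
		{ "rgmii-txid",	PHY_MODE_RGMII_TXID },
	};
	size_t i;

	assert(s != NULL);
	for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++)
		if (strcmp(s, tbl[i].name) == 0)
			return (tbl[i].mode);
	return (PHY_MODE_UNKNOWN);
}

/* Every RGMII variant selects the same (made-up) clock name. */
static const char *
select_tx_clock(enum phy_mode mode)
{
	switch (mode) {
	case PHY_MODE_RGMII:
	case PHY_MODE_RGMII_ID:
	case PHY_MODE_RGMII_RXID:
	case PHY_MODE_RGMII_TXID:
		return ("example-rgmii-tx");
	default:
		return ("example-mii-tx");
	}
}

/* The fragile form: an exact match silently sends "rgmii-id" to MII. */
static const char *
select_tx_clock_by_string(const char *s)
{
	return (strcmp(s, "rgmii") == 0 ?
	    "example-rgmii-tx" : "example-mii-tx");
}
```

With this shape, supporting a new variant means adding one table row
and one case label, rather than another string comparison at each use.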
PR: 261355, 264673
Reported by: Maren <marentoy@protonmail.com>
Reported by: Arie Bikker <src-2016@bikker.homeunix.net>
Reviewed by: manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35500
(cherry picked from commit 3428997cb3)
And if_t rather than struct ifnet *. No functional change intended.
Reviewed by: manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35497
(cherry picked from commit ca01879004)
Commit 4b8365d752 introduced the ability to dynamically register
VM object types, for use by tmpfs, which creates swap-backed objects.
As a part of this, checks for such objects changed from
    object->type == OBJT_DEFAULT || object->type == OBJT_SWAP
to
    object->type == OBJT_DEFAULT || (object->flags & OBJ_SWAP) != 0
In particular, objects of type OBJT_DEFAULT do not have OBJ_SWAP set;
the swap pager sets this flag when converting from OBJT_DEFAULT to
OBJT_SWAP.
A few of these checks are done without the object lock held. It turns
out that this can result in false negatives since the swap pager
converts objects like so:
    object->type = OBJT_SWAP;
    object->flags |= OBJ_SWAP;
Fix the problem by adding explicit tests for OBJT_SWAP objects in
unlocked checks.
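
The window can be reproduced in a single-threaded userland sketch by
stopping between the two stores. The struct, enum values, and function
names below are simplified, hypothetical stand-ins for the kernel's
definitions; the real code also has to consider how the fields are
read without the lock, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's object types and flags. */
enum objtype { OBJT_DEFAULT, OBJT_SWAP, OBJT_VNODE };
#define	OBJ_SWAP	0x0001

struct object {
	enum objtype	type;
	unsigned int	flags;
};

/* The check as changed by 4b8365d752; reliable only under the lock. */
static bool
is_anon_locked_only(const struct object *obj)
{
	return (obj->type == OBJT_DEFAULT || (obj->flags & OBJ_SWAP) != 0);
}

/* Unlocked variant: also accept an object caught mid-conversion. */
static bool
is_anon_unlocked(const struct object *obj)
{
	return (obj->type == OBJT_DEFAULT || obj->type == OBJT_SWAP ||
	    (obj->flags & OBJ_SWAP) != 0);
}

/* Build the intermediate state: type already updated, flag not yet set. */
static struct object
mid_conversion(void)
{
	struct object obj = { .type = OBJT_DEFAULT, .flags = 0 };

	assert(is_anon_locked_only(&obj));
	obj.type = OBJT_SWAP;	/* first store of the conversion */
	return (obj);		/* "flags |= OBJ_SWAP" has not run yet */
}
```

An observer that sees the object in the state returned by
mid_conversion() gets a false negative from the first predicate and the
expected answer from the second.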
PR: 258932
Fixes: 4b8365d752 ("Add OBJT_SWAP_TMPFS pager")
Reported by: bdrewery
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit e123264e4d)
Add shm_remove_prison(), that removes all POSIX shared memory segments
belonging to a prison. Call it from prison_cleanup() so a prison
won't be stuck in a dying state due to the resources still held.
PR: 257555
Reported by: grembo
(cherry picked from commit 7060da62ff)
Currently, when a jail starts dying, either by losing its last user
reference or by being explicitly killed,
osd_jail_call(...PR_METHOD_REMOVE...) is called. Encapsulate this
into a function prison_cleanup() that can then do other cleanup.
(cherry picked from commit a9f7455c38)
It seems we do not clear UPS_C_BH_PORT_RESET and UPS_C_PORT_RESET
conditions after warm or port reset. Add that code.
Obtained from: an old patch mainly debugging other problems
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D35483
(cherry picked from commit 8f892e9bee)
While XHCI is very generic, some revisions of chipsets have problems.
On dwc3 <= 3.00a Port Disable does not seem to work, so we must not
enable it.
To that end, introduce quirks to xhci so that controllers can steer
certain features. I would hope that this is and remains the only one.
Obtained from: an old patch mainly debugging other problems
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D35482
(cherry picked from commit 447c418da0)
This change cleans up lsta from the VIF station list and also
deals with freeing the lsta itself so it is not leaked.
lkpi_iv_update_bss() makes this more complicated than it should be
as we tie more sta state (incl. drv/fw) to the node that net80211
does not know about. There is more work to be done detangling this
now that it is better understood.
(cherry picked from commit e24e8103e0)
iwlwifi allocates queues on first wakeup. Based on some discussion,
this takes a lot longer on FreeBSD's work implementation than it seems
to on Linux. That meant that we couldn't get non-data frames out quickly
enough initially and failed to associate.
d0d2911035 should have solved most of this
for us with iwlwifi. None of the other drivers ported to LinuxKPI/802.11
up to today will call a dequeue so we get notified when the queues are
allocated or even need to do so.
Remove the bandaid initially put in for iwlwifi now and speed up the
overall process of getting us associated.
(cherry picked from commit 841719c08f)
In lkpi_iv_update_bss(), introduced in d9f59799fc, we swap lsta and
along with that sta and drv state if ni gets reused and swapped under
us by net80211. What we did not do was sync sta->addr, which later
(usually in lkpi_sta_assoc_to_run) during a bss_info update causes
problems in drivers (or firmware) as the BSSID and the station address
were not aligned.
If this proves to hold up and fixes the iwlwifi issues seen on firmware
for older chipsets, multi-assoc runs, and rtw89 (which this fixes),
we should add asserts that lkpi_iv_update_bss() can only happen in
pre-auth stages and/or make sure we factor out syncing more state
fields.
Found debugging: rtw89
(cherry picked from commit ed3ef56b29)
For as long as we do not implement the compat code for tx aggregation
return -EINVAL in ieee80211_start_tx_ba_session() as both rtw88 and
rtw89 check for this value and only then disable further attempts.
(cherry picked from commit 799051e2ca)
Update rtw88 based on wireless-testing at
4e051428044d5c47cd2c81c3b154788efe07ee11 (tag: wt-2022-06-10).
This is in preparation to apply USB changes to work on these and
LinuxKPI for them over the next weeks, as well as to debug a
reported issue, and possibly extract and upstream some local fixes.
(cherry picked from commit 9c951734c2)
Move pm_message_t from kernel.h to pm.h and remove a private define
in usb.h as well as adjust the implementation in linux_usb.c.
This cleans up what I believe to be a historic shortcut and is
needed for future wireless driver updates.
Leave a note in UPDATING that drm-kmod users need to update to the
latest version before re-compiling a new kernel to avoid errors
(see PR).
Sponsored by: The FreeBSD Foundation
PR: 264449 (drm-kmod port update, thanks wulf)
Obtained from: bz_git_iwlwifi (Dec 2020) (partly)
Reviewed by: hselasky, imp
Differential Revision: https://reviews.freebsd.org/D35276
(cherry picked from commit 0e981d79b1)
Rework the way we are dealing with the last queue. If the driver
opts in to STA_MMPDU_TXQ then preferably send all non-data frames
via the last (IEEE80211_NUM_TIDS) queue which otherwise is not used
in station mode.
If we do not have that queue we do individual tx() calls for non-data
frames now.
Everything else goes via the selected queue if possible for as long as
we have a ni (sta) and otherwise resorts to direct tx.
Tested on: Intel AX200 and AX210
Sponsored by: The FreeBSD Foundation
(cherry picked from commit d0d2911035)
(cherry picked from commit fb6eaf74e9)
Some drivers will collect multiple mbuf chains, linked by m_nextpkt,
before passing them to upper layers. debugnet_pkt_in() didn't handle
this and would process only the first packet, typically leading to
retransmits.
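
A minimal sketch of walking such a batch, assuming each packet should
be unhooked from the list before it is handed on. struct pkt,
handle_one(), and handle_batch() are hypothetical stand-ins for the
mbuf and the debugnet input path, not the actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down packet header; only the link matters. */
struct pkt {
	struct pkt	*nextpkt;	/* next packet in the batch, or NULL */
	int		 handled;	/* set once the packet was processed */
};

/* Stand-in for the per-packet input routine. */
static void
handle_one(struct pkt *p)
{
	assert(p->nextpkt == NULL);	/* must be detached from the batch */
	p->handled = 1;
}

/* Walk the whole batch rather than just its head; returns the count. */
static int
handle_batch(struct pkt *head)
{
	struct pkt *p, *next;
	int n = 0;

	for (p = head; p != NULL; p = next) {
		next = p->nextpkt;	/* save the link before unhooking */
		p->nextpkt = NULL;
		handle_one(p);
		n++;
	}
	return (n);
}
```

Saving the link before clearing it is what lets the per-packet routine
treat every packet as a standalone one.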
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 8414331481)
TCP sequence number differences should be computed using SEQ_SUB().
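
For illustration, a small userland sketch of why a wrap-aware
subtraction matters. The macro below is a local stand-in written for
this example (the kernel provides its own SEQ_SUB()), and
seq_distance() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t tcp_seq;	/* TCP sequence numbers are 32 bits wide. */

/*
 * Local stand-in for SEQ_SUB(): subtract modulo 2^32, then reinterpret
 * the result as a signed distance.
 */
#define	SEQ_SUB(a, b)	((int32_t)((tcp_seq)(a) - (tcp_seq)(b)))

/* Hypothetical helper: bytes from an older to a newer sequence number. */
static int32_t
seq_distance(tcp_seq newer, tcp_seq older)
{
	int32_t d = SEQ_SUB(newer, older);

	assert(d >= 0);		/* The caller promises "newer" is not behind. */
	return (d);
}
```

When the sequence space wraps between the two values, the signed
result is still the small distance between them, whereas subtracting
the values after widening them to a larger type would not be.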
Differential Revision: https://reviews.freebsd.org/D35505
Reviewed by: rscheff@
Sponsored by: NVIDIA Networking
(cherry picked from commit f5766992c0)
This avoids an issue where IN endpoint data received from the device
right before the file handle is closed gets lost.
PR: 263995
Sponsored by: NVIDIA Networking
(cherry picked from commit b6f615255d)
The Enhanced REP MOVSB (ERMS) feature of CPUs starting from Ivy Bridge
makes REP MOVSB the fastest way to copy memory in most cases. However,
the Intel Optimization Reference Manual says: "setting the DF to force
REP MOVSB to copy bytes from high towards low addresses will
experience significant performance degradation". Measurements on Intel
Cascade Lake and Alder Lake, as well as on AMD Zen3, show that it can
drop throughput to as low as 2.5-3.5GB/s, compared to ~10-30GB/s
for REP MOVSQ or the hand-rolled loop used for non-ERMS CPUs.
This patch keeps ERMS use for forward ordered memory copies, but
removes it for backward overlapped moves, where it performs poorly.
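
The direction choice itself can be sketched in portable C. This is
only an illustration of when a move has to run backward; move_bytes()
is a hypothetical userland function, and the comments mark which branch
would remain eligible for ERMS under the policy described above. It
says nothing about the actual assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration of the direction choice only.
 * Returns 1 if the copy had to run backward, 0 if it ran forward.
 */
static int
move_bytes(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	assert(len == 0 || (dst != NULL && src != NULL));

	/*
	 * Unsigned wrap-around makes this true both when dst is below
	 * src and when dst starts at or past the end of the source.
	 */
	if ((uintptr_t)d - (uintptr_t)s >= len) {
		/* Forward copy: the case that would keep using ERMS. */
		for (i = 0; i < len; i++)
			d[i] = s[i];
		return (0);
	}

	/* dst overlaps the tail of src: copy from high to low addresses. */
	for (i = len; i > 0; i--)
		d[i - 1] = s[i - 1];
	return (1);
}
```

Only the second branch needs the descending direction, so it is the
only place where the slow DF=1 form of REP MOVSB would have been used.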
Reviewed by: mjg
MFC after: 2 weeks
(cherry picked from commit 6210ac95a1)
This mirrors the Linux behavior as seen in the kernel commit d773ce2.
Reviewed by: kbowling
MFH after: 3 days
Differential Revision: https://reviews.freebsd.org/D35542
(cherry picked from commit 4f1d91e413)
PTI page table pages are allocated from a VM object, so must be
exclusively busied when they are freed, e.g., when a thread loses a race
in pmap_pti_pde(). Simply keep PTPs busy at all times, as was done for
some other kernel allocators in commit
e9ceb9dd11.
Also remove some redundant assertions on "ref_count":
vm_page_unwire_noq() already asserts that the page's reference count is
greater than zero.
Reported by: syzkaller
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit c6d092b510)
On arm64, testing pc_curpcb != NULL is not correct since pc_curpcb is
set in pmap_switch() while the bootstrap stack is still in use. As a
result, smp_after_idle_runnable() can free the boot stack prematurely.
Take a different approach: use smp_rendezvous() to wait for all APs to
acknowledge an interrupt. Since APs must not enable interrupts until
they've entered the scheduler, i.e., switched off the boot stack, this
provides the right guarantee without depending as much on the
implementation of cpu_throw(). And, this approach applies to all
platforms, so convert x86 and riscv as well.
Reported by: mmel
Tested by: mmel
Reviewed by: kib
Fixes: 8db2e8fd16 ("Remove the secondary_stacks array in arm64 and riscv kernels.")
Sponsored by: The FreeBSD Foundation
(cherry picked from commit f6b799a86b)
We do not hold the object lock or a page busy lock when copying src_m's
validity state. Prior to commit 45d72c7d7f we marked dst_m as fully
valid.
Use the source object's read lock to ensure that valid bits are not
concurrently cleared.
Reviewed by: alc, kib
Fixes: 45d72c7d7f ("vm_fault_copy_entry: accept invalid source pages.")
Sponsored by: The FreeBSD Foundation
(cherry picked from commit d0443e2b98)
The pointer to the mount values may be null if an error occurred while
copying them in, so fix the assertion condition to reflect that
possibility.
While here, move some initialization code into the error == 0 block. No
functional change intended.
Reported by: syzkaller
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 7565431f30)
I've found a couple of cases where CTL_FLAG_SENT_2OTHER_SC flags were
not cleared when a command returned from the active node or when the
send failed. This created races where a ctl_failover_lun() call before
ctl_process_done() could cause second ctl_done() and ctl_process_done()
calls, causing all sorts of problems.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
(cherry picked from commit 3b0e3e8d2a)