Commit graph

39473 commits

Author SHA1 Message Date
Alexander Motin
35f0bf2d37 atkbd: Reduce polling rate from 10Hz to ~1Hz.
In my understanding this is only needed to work around lost interrupts.
I was thinking to remove it completely, but the comment about edge-
triggered interrupt may be true and needs deeper investigation.  ~1Hz
should be often enough to handle the supposedly rare loss cases, but
rare enough to not appear in top.  Add sysctl hw.atkbd.hz to tune it.

MFC after:	1 month

(cherry picked from commit 9e007a88d6)
2022-02-03 19:55:44 -05:00
Andriy Gapon
8934d3e7b9 sdhci: fix dumping support in MMCCAM configuration
This change fixes interaction with recently added sddadump.

(cherry picked from commit 94ff1d9cc8)
2022-02-02 08:51:13 +02:00
Andriy Gapon
08bc6b60c3 dwwdt: make it actually useful
Flip dwwdt_prevent_restart to false.  What's the use of a watchdog if it
does not restart a hung system?

Add a knob for panic-ing on the first timeout, resetting on the second
one.  This can be useful if interrupts can still work, otherwise a reset
recovers a system without any aid for debugging the hang.

The change also doubles the timeout that's programmed into the hardware.
The previous version of the code always had the interrupt on the first
timeout enabled, but it took no action on it.  Only the second timeout
could be configured to reset the system.  So, the hardware timeout was
set to half of the user requested timeout.  But now, we can take a
corrective action on the first timeout, so we use the user requested
timeout.

While here, define boolean sysctl-s as such.

(cherry picked from commit ee900888c4)
2022-02-01 10:11:26 +02:00
Kevin Bowling
8082242b96 igc: Remove redundant IFCAP_VLAN_HWTAGGING check
Match igb(4) as in f7926a6d0c. From Vincenzo, this check is redundant
to setup providing us an IGC_RXD_STAT_VP bit and would make for an
unexpected condition if IFCAP_VLAN_HWTAGGING were not set but the tag
was stripped, which would be passed up the stack breaking isolation.

PR:		260068
Approved by:	vmaffione

(cherry picked from commit b4a58b3d58)
2022-01-30 13:40:02 -07:00
Gordon Bergling
037fe75b38 hwpmc(4): Fix a typo in a sysctl description
- s/avalable/available/

(cherry picked from commit 9966757dd6)
2022-01-29 09:44:47 +01:00
Mark Johnston
d7af180a30 vt: Use a taskqueue to clear splash_cpu logos
vt_fini_logos() calls vtbuf_grow(), which reallocates the console
window's buffer using malloc(M_WAITOK).  Because vt_fini_logos() is
called via a callout, we end up panicking if INVARIANTS is enabled.

Fix the problem simply by clearing the logos using a timed taskqueue.
taskqueue_thread is formally allowed to sleep; of course, if we actually
end up sleeping to satisfy the allocation, then we have bigger problems.

PR:		260896
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 6c7e4d72b1)
2022-01-28 10:28:30 -05:00
Vincenzo Maffione
05c9fb0087 net: iflib: fix vlan processing in the drivers
The logic that sets iri_vtag and M_VLANTAG does not handle the
case where the 802.1Q VLAN tag is 0. Fix this issue across
the iflib drivers. While there, also improve and align the
VLAN tag check extraction, by moving it outside the RX descriptor
loop, eliminating a local variable and additional checks.
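
A minimal sketch of the idea, assuming the RX descriptor reports a "VLAN packet" status bit: the presence of a tag is decided by that bit rather than by the tag being non-zero, so a tag of 0 is still reported. The names `RXD_STAT_VP`, `struct rx_info` and `extract_vtag` and the bit value are hypothetical and do not come from the iflib drivers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "VLAN packet" bit in the RX descriptor status word. */
#define RXD_STAT_VP	0x08u

/* Hypothetical per-packet result, standing in for iri_vtag and M_VLANTAG. */
struct rx_info {
	uint16_t vtag;
	bool	 has_vtag;
};

static void
extract_vtag(uint32_t status, uint16_t desc_vlan, struct rx_info *ri)
{
	/* Test the status bit, not "desc_vlan != 0": tag 0 is a valid tag. */
	if ((status & RXD_STAT_VP) != 0) {
		ri->vtag = desc_vlan;
		ri->has_vtag = true;
	} else {
		ri->vtag = 0;
		ri->has_vtag = false;
	}
}
```

With this shape, a stripped tag of 0 still sets `has_vtag`, which a check of the form `if (vtag != 0)` would miss.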

PR:             260068
Reviewed by:    kbowling, gallatin
Reported by:	erj
MFC after:      1 month
Differential Revision:  https://reviews.freebsd.org/D33156

(cherry picked from commit f7926a6d0c)
2022-01-27 22:39:09 +00:00
Vincenzo Maffione
e99828dfbd net: iflib: let the drivers use isc_capenable
Since isc_capenable (private copy of ifp->if_capenable) is
now synchronized to if_capenable, use it in the drivers
when checking the IFCAP_* bits.
This results in better cache usage and avoids indirection
through the ifp pointer.

PR:             260068
Reviewed by:    kbowling, gallatin
MFC after:      1 week
Differential Revision:  https://reviews.freebsd.org/D33156

(cherry picked from commit 52f45d8ace)
2022-01-27 22:26:30 +00:00
Andriy Gapon
79c3478e76 mmc_da: implement d_dump method, sddadump
sddadump has been derived from sddastart.

mmc_sim interface has grown a new method, cam_poll, to support polled
operation.

mmc_sim code has been changed to provide a sim_poll hook only if the
controller implements the new method.  The hook is implemented in terms
of the new mmc_sim_cam_poll method.
Additionally, in-progress CCB-s now have CAM_REQ_INPROG status to
satisfy xpt_pollwait().

mmc_sim_cam_poll method has been implemented in dwmmc host controller.

Relnotes:	perhaps

(cherry picked from commit 44682688f0)
2022-01-26 09:27:21 +02:00
Ka Ho Ng
192c87bf7d iscsi: Fix missing is_lock unlock after cam_simq_alloc() failed
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit fa66950534)
2022-01-25 01:02:33 -05:00
Jessica Clarke
5272c66a00 hwpmc: Fix amd/arm64/armv7/uncore sampling overflow race
If a counter more than overflows just as we read it on switch out then,
if using sampling mode, we will negate this small value to give a huge
reload count, and if we later switch back in that context we will
validate that value against pm_reloadcount and panic an INVARIANTS
kernel with:

  panic: [pmc,1470] pmcval outside of expected range cpu=2 ri=16 pmcval=fffff292 pm_reloadcount=10000

or similar. Presumably in a non-INVARIANTS kernel we will instead just
use the provided value as the reload count, which would lead to the
overflow not happening for a very long time (e.g. 78 hours for a 48-bit
counter incrementing at an average rate of 1GHz).
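
The estimate can be checked with a little arithmetic: 2^48 ticks at 1 GHz is about 281,475 seconds, roughly 78 hours. A small sketch of that calculation follows; the helper name `wrap_seconds` is hypothetical.

```c
#include <stdint.h>

/*
 * Seconds until a counter of the given width wraps when it increments at
 * rate_hz.  width_bits must be below 64 so the shift is well defined.
 */
static double
wrap_seconds(unsigned width_bits, double rate_hz)
{
	return ((double)(UINT64_C(1) << width_bits) / rate_hz);
}
```

Dividing `wrap_seconds(48, 1e9)` by 3600 gives the figure in hours.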

Instead, clamp the reload count to 0 (which corresponds precisely to the
value we would compute if it had just overflowed and no more), which
will result in hwpmc using the full original reload count again. This is
the approach used by core for Intel (for both fixed and programmable
counters).
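
A minimal sketch of the clamping, assuming a 48-bit counter that is loaded with the negated reload count and counts up towards overflow, and assuming reload counts stay below 2^47. The macro and function names are hypothetical and not the hwpmc code.

```c
#include <stdint.h>

#define CTR_WIDTH	48
#define CTR_MASK	((UINT64_C(1) << CTR_WIDTH) - 1)
#define CTR_TOPBIT	(UINT64_C(1) << (CTR_WIDTH - 1))

/*
 * Return the number of events still to go before the counter overflows.
 * While the counter is still counting up from -reload its top bit is set.
 * Once it has wrapped, the top bit is clear and negating the small value
 * would yield a huge bogus count, so clamp to 0 instead.
 */
static uint64_t
remaining_count(uint64_t hw)
{
	hw &= CTR_MASK;
	if ((hw & CTR_TOPBIT) == 0)
		return (0);		/* overflowed: clamp */
	return ((-hw) & CTR_MASK);	/* unsigned negate, no UB */
}
```

A counter that wrapped and then advanced by 5 reads as 5; without the clamp the computed reload would be 2^48 - 5.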

As part of this, armv7 and arm64 are made conceptually simpler; rather
than skipping modifying the overflow count for sampling mode counters so
it's always kept as ~0, those special cases are removed so it's always
applicable and the concatenation of it and the hardware counter can
always be viewed as a 64-bit counter, which also makes them look more
like other architectures.

Whilst here, fix an instance of UB (shifting a 1 into the sign bit) for
amd in its sign-extension code.
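
This kind of undefined behavior typically comes from an expression such as `1 << 31` on a 32-bit int. A sketch of a sign extension that avoids it by only shifting unsigned constants follows; the 48-bit width and the name `sign_extend48` are assumptions for illustration and are not taken from the amd code.

```c
#include <stdint.h>

/*
 * Sign-extend a 48-bit value to 64 bits.  All shifts operate on unsigned
 * 64-bit constants, so no 1 is ever shifted into a signed type's sign bit.
 * The final conversion to int64_t is implementation-defined rather than
 * undefined, and yields the two's complement value on common compilers.
 */
static int64_t
sign_extend48(uint64_t v)
{
	const uint64_t sign = UINT64_C(1) << 47;

	v &= (UINT64_C(1) << 48) - 1;
	return ((int64_t)((v ^ sign) - sign));
}
```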

Reviewed by:	andrew, mhorne, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33654

(cherry picked from commit e74c7ffcb1)
2022-01-25 00:00:01 +00:00
Jessica Clarke
3ca6bdef85 xdma: Fix another -Wunused-but-set-variable warning previously missed
MFC after:	1 week

(cherry picked from commit fa5af3219f)
2022-01-24 23:59:38 +00:00
Jessica Clarke
c95bc1a7f9 xdma: Fix -Wunused-but-set-variable warnings
MFC after:	1 week

(cherry picked from commit d90c3b51cc)
2022-01-24 23:59:33 +00:00
Alexander Motin
0dc8e3fe6f mps/mpr: Relax doorbell polling precision.
It does not matter how often we check firmware for crashes.

MFC after:	2 weeks

(cherry picked from commit 1849bc5f3f)
2022-01-23 14:57:35 -05:00
Warner Losh
67873a4fca nvd: For AHCI attached devices, report ahci bridge
When an NVME device is attached via an AHCI controller, we have no access
to its config space. So instead of information about the nvme drive
itself, return info about the AHCI controller as the next best
thing. Since the Intel Hardware RAID support looks at these values, this
likely is best.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D33286

(cherry picked from commit b8194f3766)
2022-01-21 13:49:57 -05:00
Alexander Motin
fb56e14ce6 nvme: Do not rearm timeout for commands without one.
Admin queues almost always have several ASYNC_EVENT_REQUEST outstanding.
They have no timeouts, but their presence in qpair->outstanding_tr caused
useless timeout callout rearming twice a second.

While there, relax timeout callout period from 0.5s to 0.5-1s to improve
aggregation.  Command timeouts are measured in seconds, so we don't need
to be precise here.

Reviewed by:	imp
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33781

(cherry picked from commit b3c9b6060f)
2022-01-20 21:07:31 -05:00
Warner Losh
f102ae7159 nvme_sim: Only report PCI related stats when we can
For AHCI attached devices, we report the location and identification
information of the AHCI controller that we're attached to. We also
don't report link speed in that case, since we can't get to the PCIe
config space registers to find that out.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D33287

(cherry picked from commit 8f07932272)
2022-01-20 21:07:31 -05:00
Warner Losh
cc20d75705 nvme_ahci: Mark AHCI devices as such in the controller
Add a quirk to flag AHCI attachment to the controller. This is for any
of the strategies for attaching nvme devices as children of the AHCI
device for Intel's RAID devices. This also has a side effect of cleaning
up resource allocation from failed nvme_attach calls now.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D33285

(cherry picked from commit 7cf8d63c88)
2022-01-20 21:07:31 -05:00
Warner Losh
2b2925d1e8 nvme: Move to a quirk for the Intel alignment data
Prior to NVMe 1.3, Intel produced a series of drives that had
performance alignment data in the vendor specific space since no
standard had been defined. Move testing the versions to a quirk so the
NVMe NS code doesn't know about PCI device info.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D33284

(cherry picked from commit 053f8ed6eb)
2022-01-20 21:07:31 -05:00
Warner Losh
f022d47f2f nvme: Reduce traffic to the doorbell register
Reduce traffic to doorbell register when processing multiple completion
events at once. Only write it at the end of the loop after we've
processed everything (assuming we found at least one completion,
even if that completion wasn't valid).
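
A simplified sketch of the batching, using a simulated completion queue in which only the phase bit of each entry is modelled. The structure and field names are hypothetical; a real driver would write a device register where this sketch bumps `doorbell_writes`.

```c
#include <stdint.h>

/* Hypothetical, simplified completion queue. */
struct cq {
	const uint8_t	*entry_phase;	/* phase bit of each entry */
	uint32_t	 size;		/* number of entries */
	uint32_t	 head;		/* next entry to examine */
	uint8_t		 phase;		/* phase value marking new entries */
	uint32_t	 doorbell_writes;
	uint32_t	 doorbell_value;
};

/* Consume all new entries, then ring the doorbell at most once. */
static uint32_t
process_completions(struct cq *q)
{
	uint32_t done = 0;

	while (q->entry_phase[q->head] == q->phase) {
		done++;
		if (++q->head == q->size) {
			q->head = 0;
			q->phase ^= 1;	/* phase flips on wrap-around */
		}
	}
	if (done > 0) {
		/* One write after the loop instead of one per entry. */
		q->doorbell_value = q->head;
		q->doorbell_writes++;
	}
	return (done);
}
```

Three pending completions therefore cost a single doorbell write, and a call that finds nothing to process writes nothing.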

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32470

(cherry picked from commit 2ec165e3f0)
2022-01-20 21:07:31 -05:00
Warner Losh
13b711e8c8 nvme: Restore hotplug warning
Restore hotplug warning in recovery state machine. No functional change
other than what message gets printed.

Sponsored by:		Netflix

(cherry picked from commit 18dc12bfd2)
2022-01-20 21:07:31 -05:00
Warner Losh
86721e606c nvme: Use adaptive spinning when polling for completion or state change
We only use nvme_completion_poll in the initialization path. The
commands they queue and wait for finish quickly as they involve no I/O
to the drive's media. These commands take about 20-200 microseconds
each. Set the wait time to 1us and then increase it by a factor of 1.5
each successive iteration (max 1ms). This reduces initialization time by
80ms in cperciva's tests.

Use this same technique waiting for RDY state transitions. This saves
another 20ms. In total we're down from ~330ms to ~2ms.
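
The backoff schedule can be sketched as below. This is an illustration under stated assumptions, not the driver code: waits are kept in nanoseconds so that the 1.5 multiplier works in integer arithmetic, and the names `next_wait_ns` and `backoff_total_ns` are hypothetical.

```c
#include <stdint.h>

#define WAIT_START_NS	UINT64_C(1000)		/* 1us initial wait */
#define WAIT_MAX_NS	UINT64_C(1000000)	/* 1ms cap */

/* Grow the wait by a factor of 1.5, capped at WAIT_MAX_NS. */
static uint64_t
next_wait_ns(uint64_t cur_ns)
{
	uint64_t next = cur_ns * 3 / 2;

	return (next > WAIT_MAX_NS ? WAIT_MAX_NS : next);
}

/* Total time spent waiting if the condition is met after `polls` waits. */
static uint64_t
backoff_total_ns(unsigned polls)
{
	uint64_t wait = WAIT_START_NS, total = 0;

	for (unsigned i = 0; i < polls; i++) {
		total += wait;
		wait = next_wait_ns(wait);
	}
	return (total);
}
```

Short commands are noticed within a few microseconds, while a slow state transition is polled no more often than once per millisecond.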

Tested by:		cperciva
Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32259

(cherry picked from commit 83581511d9)
2022-01-20 21:07:31 -05:00
Warner Losh
50b3d57e71 nvme: Only reset once on attach.
The FreeBSD nvme driver has reset the nvme controller twice on attach to
address a theoretical issue assuring the hardware is in a known
state. However, experience has shown the second reset is unnecessary and
increases the time to boot. Eliminate the second reset. Should there be
a situation when you need a second reset (for buggy or at least somewhat
out of the mainstream hardware), the hardware option NVME_2X_RESET will
restore the old behavior. Document this in nvme(4).

If there's any trouble at all with this, I'll add a sysctl tunable to
control it.

Sponsored by:		Netflix
Reviewed by:		cperciva, mav
Differential Revision:	https://reviews.freebsd.org/D32241

(cherry picked from commit 4b3da659bf)
2022-01-20 21:07:31 -05:00
Warner Losh
932df0a258 nvme: Remove pause while resetting
After some study of the code and the standard, I think we can just drop
the pause(), unconditionally.  If we're not initialized, then there's
nothing to wait for from a software perspective.  If we are initialized,
then there might be outstanding I/O. If so, then the qpair 'recovery
state' will transition to WAITING in nvme_ctrlr_disable_qpairs, which
will ignore any interrupts for items that complete before we complete
the reset by setting cc.en=0.

If we go on to fail the controller, we'll cancel the outstanding I/O
transactions.  If we reset the controller, the hardware throws away
pending transactions and we retry all the pending I/O transactions. Any
transactions that happened to complete before cc.en=0 will have the same
effect in the end (doing the same transaction twice is just inefficient,
it won't affect the state of the device any differently than having done
it once).

The standard imposes no wait times here, so it isn't needed from that
perspective.

Unanswered Question: Do we need to disable interrupts while we
disable in legacy mode, since those are level-sensitive?

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32248

(cherry picked from commit e5e26e4a24)
2022-01-20 21:07:30 -05:00
Warner Losh
0c3d88b9ab nvme: Explain a workaround a little better
The "don't touch the mmio of the drive after we do an EN 1->0 transition"
workaround is only for a tiny number of drives that have this unfortunate issue.

Sponsored by:		Netflix

(cherry picked from commit 77054a897f)
2022-01-20 21:07:30 -05:00
Warner Losh
02325a44b5 nvme_ctrlr_enable: Small style nits
Rewrite the nested if's using the preferred FreeBSD style for branches
of ifs that return. NFC. Minor tweaks to the comments to better fit new
code layout.

Sponsored by:		Netflix
Reviewed by:		mav, chuck (prior rev, but comments rolled in)
Differential Revision:	https://reviews.freebsd.org/D32245

(cherry picked from commit a245627a4e)
2022-01-20 21:07:30 -05:00
Warner Losh
21de49b06c nvme: Use MS_2_TICKS rather than rolling our own
Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32246

(cherry picked from commit 26259f6ab9)
2022-01-20 21:07:30 -05:00
Warner Losh
a8ce655445 nvme_ctrlr_enable: Remove unnecessary 5ms delays
Remove the 5ms delays after writing the administrative queue
registers. These delays are from the very earliest days of the driver
(they are in the first commit) and were most likely vestiges of the
Chatham NVMe prototype card that was used to create this driver. Many of
the workarounds necessary for it aren't necessary for standards
compliant cards. The original driver had other areas marked for Chatham,
but these were not. They are unneeded. There are three lines of supporting
evidence.

First, the NVMe standards make no mention of a delay time after these
registers are written. Second, the Linux driver doesn't have them, even
as an option. Third, all my nvme cards work w/o them.

To be safe, add a write barrier between setting up the admin queue and
enabling the controller.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D32247

(cherry picked from commit d5fca1dc1d)
2022-01-20 21:07:30 -05:00
Warner Losh
1397c8ebb7 nvme: Sanity check completion id
Make sure the completion ID is in the range of [0..num_trackers) since
the values past the end of the act_tr array are never going to be valid
trackers and will lead to pain and suffering if we try to dereference
them to get the tracker or to set the tracker back to NULL as we
complete the I/O.
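
A minimal sketch of such a bounds check, with hypothetical structure and function names (only `num_trackers` and `act_tr` are taken from the message above):

```c
#include <stddef.h>
#include <stdint.h>

struct tracker {
	int in_use;
};

struct qpair {
	uint32_t	  num_trackers;
	struct tracker	**act_tr;	/* num_trackers slots */
};

/*
 * Look up the tracker for a completion ID reported by the device.
 * IDs outside [0..num_trackers) are rejected before indexing act_tr.
 */
static struct tracker *
lookup_tracker(const struct qpair *q, uint16_t cid)
{
	if (cid >= q->num_trackers)
		return (NULL);
	return (q->act_tr[cid]);
}
```

A caller can then treat a NULL result as an invalid completion rather than dereferencing memory past the end of the array.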

Sponsored by:		Netflix
Reviewed by:		mav, chs, chuck
Differential Revision:	https://reviews.freebsd.org/D32088

(cherry picked from commit 36a87d0c6f)
2022-01-20 21:07:30 -05:00
Warner Losh
86990decd7 nvme: count number of ignored interrupts
Count the number of times we're asked to process completions, but that
we ignore because the state of the qpair isn't in RECOVERY_NONE.

Sponsored by:		Netflix
Reviewed by:		mav, chuck
Differential Revision:	https://reviews.freebsd.org/D32212

(cherry picked from commit 587aa25525)
2022-01-20 21:07:30 -05:00
Warner Losh
7144882ae1 nvme: Add sanity check for phase on startup.
The proper phase for the qpair right after reset in the first interrupt
is 1. For it, make sure that we're not still in phase 0. This is an
illegal state to be processing interrupts and indicates that we've
failed to properly protect against a race between initializing our state
and processing interrupts. Modify stat resetting code so it resets the
number of interrupts to 1 instead of 0 so we don't trigger a false
positive panic.

Sponsored by:		Netflix
Reviewed by:		cperciva, mav (prior version)
Differential Revision:	https://reviews.freebsd.org/D32211

(cherry picked from commit 7d5eebe0f4)
2022-01-20 21:07:30 -05:00
Warner Losh
e8f693131c nvme: start qpair in state RECOVERY_WAITING
An interrupt happens on the admin queue right away after the reset, so
as soon as we enable interrupts, we'll get a call to our interrupt
handler. It is safe to ignore this interrupt if we're not yet
initialized, or to process it if we are. If we are initialized, we'll
see there's no completion records and return. If we're not, we'll
process no completion records and return. Either way, nothing is
processed and nothing is lost.

Until we've completely setup the qpair, we need to avoid processing
completion records. Start the qpair in the waiting recovery state so we
return immediately when we try to process completions. The code already
sets it to 'NONE' when our initialization is complete. It's safe to
defer completion processing here because we don't send any commands
before the initialization of the software state of the qpair is
complete. And even if we were to somehow send a command prior to that
completing, the completion record for that command would be processed
when we send commands to the admin qpair after we've setup the software
state. There's no good central point to add an assert for this last
condition.

This fixes a KASSERT "received completion for unknown cmd" panic on
boot.

Fixes:			502dc84a8b
Sponsored by:		Netflix
Reviewed by:		mav, cperciva, gallatin
Differential Revision:	https://reviews.freebsd.org/D32210

(cherry picked from commit fa81f3731d)
2022-01-20 21:07:30 -05:00
Warner Losh
9bbd0a7ca9 nvme: Use shared timeout rather than timeout per transaction
Keep track of the approximate time commands are 'due' and the next
deadline for a command. Twice a second, wake up to see if any commands
have entered timeout. If so, quiesce and then enter a recovery mode
half the timeout further in the future to allow the ISR to
complete. Once we exit recovery mode, we go back to operations as
normal.
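
The periodic check can be sketched as a scan over outstanding commands from one shared timer, instead of arming a timer per command. This is a simplified illustration; the structure and function names are hypothetical and time is an abstract tick count.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of an outstanding command. */
struct cmd {
	bool	active;
	int64_t	deadline;	/* tick at which the command is due */
};

/*
 * Called from a single periodic timer (twice a second in the commit
 * above).  Returns true if any active command has passed its deadline.
 */
static bool
any_timed_out(const struct cmd *cmds, size_t n, int64_t now)
{
	for (size_t i = 0; i < n; i++) {
		if (cmds[i].active && now > cmds[i].deadline)
			return (true);
	}
	return (false);
}
```

Because the timer only fires twice a second, a deadline is detected up to half a second late, which is acceptable when timeouts are measured in seconds.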

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D28583

(cherry picked from commit 502dc84a8b)
2022-01-20 21:07:30 -05:00
Warner Losh
24d2d1813e nvme/nda: Fail all nvme I/Os after controller fails
Once the controller has failed, fail all I/O w/o sending it to the
device. The rest of the nvme driver won't schedule any I/O to the
failed device, and the controller is in an indeterminate state and can't
accept I/O. Fail both at the top end of the sim and the bottom
end. Don't bother queueing up the I/O for failure in a different task.

Reviewed by:		chuck
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D31341

(cherry picked from commit 4b977e6dda)
2022-01-20 21:07:30 -05:00
Colin Percival
eb4d2eab07 Add some nvme initialization routines to TSLOG
About 335 ms of EC2 instance boot time is being spent here.

(cherry picked from commit bad42df9bf)
2022-01-20 21:07:30 -05:00
Michal Meloun
d6529c0d12 pci_dw_mv: Don't enable unhandled interrupts.
Mainly, link error interrupts should only be activated on a fully linked port,
otherwise noise on lanes can cause livelock. But we don't have error
counters yet, so leave these interrupts disabled.

(cherry picked from commit ce5a4083de)
2022-01-20 11:35:51 +01:00
Michal Meloun
139afdb172 simple_mfd: switch to controllable locking for syscon provider.
MFC after:	3 weeks

(cherry picked from commit f97f57b518)
2022-01-20 11:34:28 +01:00
Michal Meloun
f277be277d extres/clk: Add a method to detect the HW state of the clock gate.
- add method to read gate enable/disable status from HW
- show gate status in sysctl clock dump

MFC after:	1 week

(cherry picked from commit 1a74d77f85)
2022-01-20 11:14:22 +01:00
Michal Meloun
3d4b9e5fa1 extres/clk: Improve sysctl dump of clocks.
Always recalculate the frequency, the cache is lazily initialized so it is not always up to date.
While I'm in, mark sysctl as MPSAFE.

Discussed with:	manu, adrian
MFC after:	1 week

(cherry picked from commit 72a2f3b5e2)
2022-01-20 11:14:04 +01:00
Michal Meloun
a5e76683b2 dwmmc: Calculate the maximum transaction length correctly.
We should reserve two descriptors (not MMC_SECTORS) for potentially
unaligned (so bounced) buffer fragments, one for the starting fragment
and one for the ending fragment.
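
As a rough sketch of the calculation, assuming each DMA descriptor covers at most one fixed-size segment, the usable transfer size is the descriptor count minus the two reserved descriptors, times the segment size. The function and parameter names are hypothetical, not the dwmmc code.

```c
#include <stddef.h>

/*
 * Maximum transfer length when two descriptors are held back for the
 * potentially unaligned (bounced) first and last buffer fragments.
 */
static size_t
max_xfer_bytes(size_t desc_count, size_t seg_size)
{
	if (desc_count <= 2)
		return (0);
	return ((desc_count - 2) * seg_size);
}
```

For example, 256 descriptors of 4096 bytes each would allow transfers of up to 254 * 4096 bytes.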

Submitted by:	kjopek@gmail.com
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30387

(cherry picked from commit dfb7360222)
2022-01-20 11:11:17 +01:00
Michal Meloun
57dd17dd60 Fix error value returned by ofw_bus_gen_get_node().
By definition ofw_bus_get_node() should consistently return -1 when there
is no associated OF node.

MFC after:	4 weeks
Discussed with:	nwhitehorn
Analyzed in: 	https://reviews.freebsd.org/D30761

(cherry picked from commit 3eae4e106a)
2022-01-20 11:00:55 +01:00
Stefan Eßer
dc4114875e Make CPU_SET macros compliant with other implementations
(cherry picked from commit e2650af157)
2022-01-14 18:17:30 +02:00
Greg V
ae4067ce94 efifb,vbefb: implement vd_fini
This removes the pmap entry when switching away to e.g. drm fb.

Differential Revision:	https://reviews.freebsd.org/D29020
MFC After:	1 month

(cherry picked from commit 8ebda6e44b)
2022-01-14 16:55:23 +01:00
Mark Johnston
4905ce27b8 mvneta: Unconditionally print an error message if mii_attach() fails
The error message is useful for diagnosing mvneta_attach() failures.

(cherry picked from commit ed166a0173)
2022-01-11 09:28:33 -05:00
Alexander Motin
4ee9fbcd85 acpi_support: Remove CTLFLAG_NEEDGIANT from sysctls.
MFC after:	2 weeks

(cherry picked from commit 6237a1cc2d)
2022-01-09 19:30:14 -05:00
Alexander Motin
1b1f80ae1c sound: Remove CTLFLAG_NEEDGIANT from some sysctls.
While there, remove some dead code.

MFC after:	2 weeks

(cherry picked from commit 3b4c543322)
2022-01-09 19:29:59 -05:00
Alexander Motin
f39bf9a217 acpica: Remove CTLFLAG_NEEDGIANT from most sysctls.
MFC after:	2 weeks

(cherry picked from commit 3e68d2c52b)
2022-01-09 19:29:55 -05:00
Alexander Motin
6bc8606fca pccbb: Remove Giant mention in comments.
MFC after:	2 weeks

(cherry picked from commit 22405bb2e4)
2022-01-08 20:24:13 -05:00
Alexander Motin
2a36679b74 amdtemp: Remove CTLFLAG_NEEDGIANT from sysctls.
It seems to be needed only to serialize very old K8 registers access.
Introduce separate lock for that and remove Giant dependency.

MFC after:	2 weeks

(cherry picked from commit 6c101ed7a3)
2022-01-08 20:24:07 -05:00
Alexander Motin
cd9fe8d81d uart: Remove CTLFLAG_NEEDGIANT from sysctl.
MFC after:	2 weeks

(cherry picked from commit c214c2c004)
2022-01-08 20:24:01 -05:00