If we can't allocate more MSI-X vectors, accept a single shared one.
If we can't allocate any MSI-X vectors, try to allocate 2 MSI vectors,
but accept a single shared one. If still no luck, fall back to shared
INTx. This provides maximal flexibility in some limited scenarios. For
example, vmd(4) does not support INTx and can handle only a limited
number of MSI/MSI-X vectors without sharing.
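The fallback order above can be sketched as a small userland model. This
is only an illustration: struct intr_caps, struct intr_choice and
pick_interrupts() are hypothetical names, and the real driver asks the
PCI bus for vectors instead of reading counts from a structure.

```c
#include <assert.h>

enum intr_mode { INTR_NONE, INTR_MSIX, INTR_MSI, INTR_INTX };

/* Hypothetical model of what the parent bus is able to grant. */
struct intr_caps {
	int msix_avail;		/* MSI-X vectors that can be granted */
	int msi_avail;		/* MSI vectors that can be granted */
	int intx_ok;		/* non-zero if legacy INTx is usable */
};

struct intr_choice {
	enum intr_mode mode;
	int nvec;		/* number of vectors in use */
	int shared;		/* non-zero if one vector serves everything */
};

/* Walk the fallback chain: MSI-X, shared MSI-X, 2 MSI, shared MSI, INTx. */
static struct intr_choice
pick_interrupts(const struct intr_caps *caps, int want_msix)
{
	struct intr_choice c = { INTR_NONE, 0, 0 };

	assert(want_msix >= 1);
	if (caps->msix_avail >= want_msix) {
		c.mode = INTR_MSIX;
		c.nvec = want_msix;
	} else if (caps->msix_avail >= 1) {
		/* Not enough MSI-X vectors: accept a single shared one. */
		c.mode = INTR_MSIX;
		c.nvec = 1;
		c.shared = 1;
	} else if (caps->msi_avail >= 2) {
		c.mode = INTR_MSI;
		c.nvec = 2;
	} else if (caps->msi_avail >= 1) {
		c.mode = INTR_MSI;
		c.nvec = 1;
		c.shared = 1;
	} else if (caps->intx_ok) {
		/* Last resort: shared legacy interrupt. */
		c.mode = INTR_INTX;
		c.nvec = 1;
		c.shared = 1;
	}
	return (c);
}
```

In this model a parent that grants a single MSI-X vector and no INTx
(roughly the vmd(4) case) ends up with one shared MSI-X vector rather
than a failed attach.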
MFC after: 1 week
(cherry picked from commit e3bdf3da76)
Before this change the devq was frozen only if some command was sent to
the target after the reset started, but the release was always called.
This change freezes the devq immediately, leaving the check in
mprsas_action_scsiio() only to cover the race condition caused by the
different lock the devq uses. This should also avoid unnecessary
requeueing of commands, which created additional log noise and confused
some broken apps.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
(cherry picked from commit 9781c28c6d)
SAS9305-16e with firmware 16.00.01.00 reports a HighPriorityCredit of
only 8, while for comparison some other combinations I have report
100 or even 128. When a large JBOD is detached, the requirement to send
a target reset command to each target at the same time overflows the
limit, and without adequate handling leaves devices stuck in a
half-detached state, preventing later re-attach.
To handle that, on allocation error mark the target with the new
MPRSAS_TARGET_TOREMOVE flag, and retry the removal the next time
something else frees a high priority command. With this patch I can
successfully detach/attach a 102-disk JBOD from/to the SAS9305-16e.
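The retry scheme can be illustrated with a small userland model. This is
a sketch under assumptions, not the driver code: struct ctrl,
start_removal() and free_hp_command() are hypothetical, TARGET_TOREMOVE
stands in for MPRSAS_TARGET_TOREMOVE, and TARGET_REMOVING is an invented
marker for a reset that is in flight.

```c
#include <assert.h>
#include <stddef.h>

#define TARGET_TOREMOVE	0x01	/* removal deferred, retry later */
#define TARGET_REMOVING	0x02	/* hypothetical: target reset in flight */

struct target {
	unsigned flags;
};

struct ctrl {
	int hp_credits;		/* free high priority command slots */
	struct target *targets;
	size_t ntargets;
};

/* Try to start a removal; on allocation failure only flag the target. */
static void
start_removal(struct ctrl *sc, struct target *t)
{
	assert(sc->hp_credits >= 0);
	if (sc->hp_credits == 0) {
		t->flags |= TARGET_TOREMOVE;
		return;
	}
	sc->hp_credits--;
	t->flags &= ~TARGET_TOREMOVE;
	t->flags |= TARGET_REMOVING;
}

/* A high priority command finished: free it and retry deferred removals. */
static void
free_hp_command(struct ctrl *sc, struct target *done)
{
	size_t i;

	done->flags &= ~TARGET_REMOVING;
	sc->hp_credits++;
	for (i = 0; i < sc->ntargets && sc->hp_credits > 0; i++) {
		if (sc->targets[i].flags & TARGET_TOREMOVE)
			start_removal(sc, &sc->targets[i]);
	}
}
```

With 2 credits and 3 targets, the third target is only flagged at first
and its removal starts as soon as one of the first two resets completes.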
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
(cherry picked from commit e3c5965c25)
When a DMA chain can't be loaded, set the state to STATE_INQUEUE so that
mp[rs]_complete_command can properly fail the command.
Sponsored by: Netflix
(cherry picked from commit 33755dbb20)
When the mpr(4) and mps(4) drivers probe a SATA device, they issue an
ATA Identify command (via mp{s,r}sas_get_sata_identify()) before the
target is fully set up in the driver. The drivers wait for completion of
the identify command, and have a 5 second timeout. If the timeout
fires, the command is marked with the SATA_ID_TIMEOUT flag so it can be
freed later.
That is where the use-after-free problem comes in. Once the ATA
Identify times out, the driver sends a target reset, and then frees any
identify commands that have timed out. But, once the target reset
completes, commands that were queued to the drive are returned to the
driver by the controller.
At that point, the driver (in mp{s,r}_intr_locked()) looks up the
command descriptor for that particular SMID, marks it CM_STATE_BUSY and
sends it on for completion handling.
The problem at this stage is that the command has already been freed,
and put on the free queue, so its state is CM_STATE_FREE. If INVARIANTS
are turned on, we get a panic as soon as this command is allocated,
because its state is no longer CM_STATE_FREE, but rather CM_STATE_BUSY.
So, the solution is to not free ATA Identify commands that get stuck
until they actually return from the controller. Hopefully this works
correctly on older firmware versions. If not, it could result in
commands hanging around indefinitely. But, the alternative is a
use-after-free panic or assertion (in the INVARIANTS case).
This also tightens up the state transitions between CM_STATE_FREE,
CM_STATE_BUSY and CM_STATE_INQUEUE, so that the state transitions happen
once, and we have assertions to make sure that commands are in the
correct state before transitioning to the next state. Also, for each
state assertion, we print out the current state of the command if it is
incorrect.
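The state checks can be modeled in userland as follows. This is a sketch
only: the state names are taken from the text above, but cm_transition()
and its wrappers are hypothetical, and the driver uses kernel assertions
that panic instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum cm_state { CM_STATE_FREE, CM_STATE_BUSY, CM_STATE_INQUEUE };

struct command {
	enum cm_state state;
};

static const char *
cm_state_name(enum cm_state s)
{
	switch (s) {
	case CM_STATE_FREE:	return ("FREE");
	case CM_STATE_BUSY:	return ("BUSY");
	case CM_STATE_INQUEUE:	return ("INQUEUE");
	}
	return ("UNKNOWN");
}

/* Move from 'from' to 'to'; report the actual state if it is wrong. */
static bool
cm_transition(struct command *cm, enum cm_state from, enum cm_state to)
{
	assert(cm != NULL);
	if (cm->state != from) {
		fprintf(stderr, "command in state %s, expected %s\n",
		    cm_state_name(cm->state), cm_state_name(from));
		return (false);
	}
	cm->state = to;
	return (true);
}

/* Each transition has exactly one entry point. */
static bool cm_alloc(struct command *cm)
{ return (cm_transition(cm, CM_STATE_FREE, CM_STATE_BUSY)); }
static bool cm_enqueue(struct command *cm)
{ return (cm_transition(cm, CM_STATE_BUSY, CM_STATE_INQUEUE)); }
static bool cm_complete(struct command *cm)
{ return (cm_transition(cm, CM_STATE_INQUEUE, CM_STATE_BUSY)); }
static bool cm_free(struct command *cm)
{ return (cm_transition(cm, CM_STATE_BUSY, CM_STATE_FREE)); }
```

In this model, freeing a command that is still queued in the controller
is rejected, which mirrors the rule that a timed-out ATA Identify
command is freed only after it comes back.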
mp{s,r}.c:          Add a new sysctl variable, dump_reqs_alltypes,
                    that controls the behavior of the dump_reqs sysctl.
                    If dump_reqs_alltypes is non-zero, it will dump
                    all commands, not just the commands that are in the
                    CM_STATE_INQUEUE state. (You can see the commands
                    that are in the queue by using mp{s,r}util debug
                    dumpreqs.)
                    Make sure that the INQUEUE -> BUSY state transition
                    happens in one place, the mp{s,r}_complete_command
                    routine.
mp{s,r}_sas.c:      Make sure we print the current command type in
                    command state assertions.
mp{s,r}_sas_lsi.c:  Add a new completion handler,
                    mp{s,r}sas_ata_id_complete. This completion
                    handler will free data allocated for an ATA
                    Identify command and free the command structure.
                    In mp{s,r}_ata_id_timeout, do not set the command
                    state to CM_STATE_BUSY. The command is still in
                    queue in the controller. Since we were blocking
                    waiting for this command to complete, there was
                    no completion handler previously. Set the
                    completion handler, so that whenever the command
                    does come back, it will get freed properly.
                    Do not free ATA Identify commands that have timed
                    out in mp{s,r}sas_add_device(). Wait for them
                    to actually come back from the controller.
mp{s,r}var.h:       Add a dump_reqs_alltypes variable for the new
                    dump_reqs_alltypes sysctl.
                    Make sure we print the current state for state
                    transition asserts.
This was tested in the Spectra Logic test bed (as described in the
review), as well as in Netflix's Open Connect fleet (where panics
dropped from a dozen or two a month to zero).
Reviewed by: imp@ (who is handling the commit with ken's OK)
Sponsored by: Spectra Logic
Differential Revision: https://reviews.freebsd.org/D25476
(cherry picked from commit 175ad3d003)
When adjusting resources we should write the updated window base/limit
into the registers. Without this, the newly added address range won't be
routed through the bridge properly.
Use MIN()/MAX() against the current window base/limit so as not to
shrink it on the other side if the window is shared by several
resources.
Align the passed resource start/end to the window granularity to keep
it properly aligned. Currently this is mostly called by other bridges
having the same window alignment, but that may change one day.
Reviewed by: jrtc27, jhb
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D31693
If we allocate a new window for a bridge rather than reusing an existing
one set up by firmware to cover all the devices then the new window only
includes the range needed for the first device to allocate the resource.
If a request comes in to adjust this resource in order to extend a
downstream window for another device then this will fail as the rman
doesn't have any space, so we must first grow the bridge's own window.
This is needed to support successfully attaching more than one PCI
device on SiFive's HiFive Unmatched, which has the following topology:
Root Port <---> Bridge <---> Bridge <-+-> Bridge <---> (Unused)
 (pcib0)        (pcib1)      (pcib2)  |   (pcib3)
                                      +-> Bridge <---> xHCI
                                      |   (pcib4)
                                      +-> Bridge <---> M.2 E-key
                                      |   (pcib5)
                                      +-> Bridge <---> M.2 M-key
                                      |   (pcib6)
                                      +-> Bridge <---> x16 slot
                                          (pcib7)
Without this, the xHCI endpoint successfully attaches but the NVMe M.2
M-key endpoint fails to attach as, when its adjacent bridge (pcib6) attempts
to allocate a window from its parent (pcib2) on the other side of the
switch, its parent attempts to grow its own window by calling
bus_adjust_resource on its own parent (pcib1) which fails to call the
root port device (pcib0) to request more memory to grow its own window.
Had the root port been directly connected to the switch without the
bridge in the middle then the existing code would have worked, but the
extra hop broke it.
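The idea of growing a bridge's own window through its parent first can
be sketched recursively. This is an illustrative model only: struct
bridge and bridge_grow() are hypothetical, windows are plain [base,
limit] ranges, and the root port is assumed to have a fixed aperture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN(a, b)	((a) < (b) ? (a) : (b))
#define MAX(a, b)	((a) > (b) ? (a) : (b))

/* Hypothetical bridge: a window plus a link to the upstream bridge. */
struct bridge {
	struct bridge *parent;	/* NULL for the root port */
	uint64_t base;
	uint64_t limit;
};

/*
 * Make sure 'b' covers [start, end].  If it does not, ask the parent to
 * cover the enlarged range first, then extend our own window.
 * Returns 0 on success, -1 if some bridge upstream cannot grow.
 */
static int
bridge_grow(struct bridge *b, uint64_t start, uint64_t end)
{
	uint64_t nbase, nlimit;

	assert(start <= end);
	if (start >= b->base && end <= b->limit)
		return (0);		/* already covered */
	if (b->parent == NULL)
		return (-1);		/* root aperture is fixed */
	nbase = MIN(b->base, start);
	nlimit = MAX(b->limit, end);
	if (bridge_grow(b->parent, nbase, nlimit) != 0)
		return (-1);
	b->base = nbase;
	b->limit = nlimit;
	return (0);
}
```

In this model a request from pcib2 reaches pcib0 through pcib1 no matter
how many hops are in between, and no window is changed if the root
cannot satisfy the request.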
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31035
Virtio modern has the common data organized in little endian, but
on powerpc64 BE it was read and written with the wrong endianness.
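A host-independent way to access a little-endian 32-bit field is shown
below as a sketch; le32_load() and le32_store() are hypothetical helpers
written for this example. FreeBSD provides le32toh()/htole32() and
le32dec()/le32enc() for the same purpose, and this is not a copy of the
actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value, whatever the host byte order is. */
static uint32_t
le32_load(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Write a 32-bit value in little-endian byte order. */
static void
le32_store(uint8_t *p, uint32_t v)
{
	assert(p != NULL);
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Because the bytes are assembled explicitly, the same code gives the same
result on little-endian and big-endian hosts such as powerpc64 BE.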
Submitted by: Leonardo Bianconi <leonardo.bianconi@eldorado.org.br>
Reviewed by: bryanv, alfredo
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28947
(cherry picked from commit fb53b42e36)
This is useful for bhyve, which otherwise has to use /dev/io to handle
accesses to I/O port BARs when PCI passthrough is in use.
Reviewed by: imp, kib
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 7e14be0b07)
Sync the e1000 shared code with DPDK shared code
"cid-gigabit.2020.06.05.tar.gz released by ND"
Primary focus was on client platforms (ich8lan). More work remains here
but we need an Intel contact for client networking.
Reviewed by: grehan, Intel Networking (erj, earlier rev)
Obtained from: DPDK <http://git.dpdk.org/dpdk/tree/drivers/net/e1000/base>
MFC after: 1 week
Sponsored by: me
Differential Revision: https://reviews.freebsd.org/D31547
(cherry picked from commit fc7682b17f)
The only thing around NTB using Giant lock is NewBus, and these callouts
have nothing to do with it.
MFC after: 2 weeks
(cherry picked from commit c6902e7796)
This was an error: we cannot use sbuf_trim(9) in the
ixgbe_sbuf_fw_version function because it also gets called in
the context of sbuf_new_for_sysctl(9). sbuf(9) explains the interaction
with drain functions as used by sbuf_new_for_sysctl(9).
Reviewed by: imp
Fixes: 7660e4ea5c
MFC after: 1 day
Differential Revision: https://reviews.freebsd.org/D31633
(cherry picked from commit 5de5419b5e)
Some devices, like eGalax touchscreens, use a value of 0x33 instead of
0x13 for inches as the unit of measure.
Reported by: Mark Kane <mark_AT_kane_DOT_mn>
(cherry picked from commit be75951af1)
This allows singletouch devices which use multitouch protocols to work.
Reported by: Mark Kane <mark_AT_kane_DOT_mn>
(cherry picked from commit e40fec4ec9)
To enable RSS hashing in the NIC, the PCSD bit must be set.
By default, this is never set when RXCSUM is disabled - which
causes problems higher up in the stack.
While here, improve the RXCSUM flag assignments when enabling or
disabling IFCAP_RXCSUM.
See also: https://lists.freebsd.org/pipermail/freebsd-current/2020-May/076148.html
Reviewed by: markj, Franco Fichtner <franco@opnsense.org>,
Stephan de Wit <stephan.dewt@yahoo.co.uk>
Obtained from: OPNsense
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31501
Co-authored-by: Stephan de Wit <stephan.dewt@yahoo.co.uk>
Co-authored-by: Franco Fichtner <franco@opnsense.org>
(cherry picked from commit 69e8e8ea3d)
Both callout and taskqueue now have drain() routines that do not require
external locking. This allows removing the TASK flag and the manual
drain, so the only thing left for the lock to protect inside the callout
handler is the ks_inq_length zero comparison, which can be lockless.
MFC after: 2 weeks
(cherry picked from commit e5018628e7)
Simplify the setup of srrctl.BSIZEPKT on igb class NICs.
Improve the setup of rctl.BSIZE on lem and em class NICs.
Don't try to touch rfctl on lem class NICs.
Manipulate rctl.BSEX correctly on lem and em class NICs.
Approved by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31457
(cherry picked from commit 12e8addd32)
It's rather confusing when adapter->hw and hw are mixed and matched
within a particular function.
Some of this was missed in cd1cf2fc1d
and r353778 respectively.
(cherry picked from commit c1655b0f89)
if it is only multiplexed device. Also enable syncbit checks for them.
This fixes touchpad recognition on Panasonic Toughbook CF-MX4 laptop.
Reported by: Tomasz "CeDeROM" CEDRO <tomek_AT_cedro_DOT_info>
PR: 253279
Differential revision: https://reviews.freebsd.org/D28502
(cherry picked from commit f5998d20ed)
during suspend/resume cycle. The previously used
bus_generic_suspend_intr and bus_generic_resume_intr may cause an
interrupt storm because of missed interrupt acknowledges caused by
blocking of the intr handler.
Reported by: J.R. Oldroyd <jr_AT_opal_DOT_com>
(cherry picked from commit 82626fef62)
Commit message of the identical change in Linux driver says:
"When an I2C HID device is powered off during system sleep, as a result
of removing its power resources (by the ACPI core) the interrupt line
might go low as well. This results inadvertent interrupts."
This change fixes suspend/resume on Asus S510UQ laptops.
While here add a couple of typo fixes as well as a slight change to the
iichid_attach() code to have the power_on flag set properly.
Submitted by: J.R. Oldroyd <jr_AT_opal_DOT_com>
Reviewed by: wulf
(cherry picked from commit 5236888db7)
This controller supports 2.5G/1G/100MB/10MB speeds, and allows
tx/rx checksum offload, TSO, LRO, and multi-queue operation.
The driver was derived from code contributed by Intel, and modified
by Netgate to fit into the iflib framework.
Thanks to Mike Karels for testing and feedback on the driver.
Reviewed by: bcr (manpages), kbowling, scottl, erj
Relnotes: yes
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30668
(cherry picked from commit 517904de5c)
Using smp_rendezvous_cpus() instead of sched_bind() avoids blocking
indefinitely if the target CPU is running some thread with a higher
priority, while all we need is a single rdmsr/wrmsr instruction call.
I guess it should also be much cheaper than a full thread migration.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
(cherry picked from commit 74f80bc1af)