On architectures with strict alignment requirements (e.g. arm), clang 14
warns about a packed struct which encloses a non-packed union:
In file included from sys/dev/bwi/bwimac.c:79:
sys/dev/bwi/if_bwivar.h:308:7: error: field iv_val within 'struct bwi_fw_iv' is less aligned than 'union (unnamed union at sys/dev/bwi/if_bwivar.h:305:2)' and is usually due to 'struct bwi_fw_iv' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
} iv_val;
^
It appears to help if you also add __packed to the inner union (i.e.
iv_val). No change to the layout is intended.
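A minimal sketch of the pattern follows. It uses a hypothetical structure
(struct fw_iv_demo and its field names are illustrative, not the real
struct bwi_fw_iv definition) and spells out the GCC/clang attribute
instead of the kernel's __packed macro. The static assertions check that
packing the inner union leaves the size and offset as they would be with
only the outer struct packed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver structure; names are illustrative. */
struct fw_iv_demo {
	uint16_t iv_ofs;
	union {
		uint32_t val32;
		uint16_t val16;
	} __attribute__((packed)) iv_val;	/* inner union is packed too */
} __attribute__((packed));

/* Same layout as with only the outer struct packed: 2 + 4 bytes. */
_Static_assert(sizeof(struct fw_iv_demo) == 6, "unexpected size");
_Static_assert(offsetof(struct fw_iv_demo, iv_val) == 2, "unexpected offset");
```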
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34196
cxgbe_refresh_stats takes into account VI_SKIP_STATS but not
VI_INIT_DONE when deciding whether to read the hardware stats. But
before this change VI_SKIP_STATS was set only for VIs with VI_INIT_DONE.
That meant that cxgbe_refresh_stats always accessed the hardware for
uninitialized VIs, and this is a problem if the adapter is suspended or
in the middle of a reset.
Fix this by setting VI_SKIP_STATS on all VIs during suspend. While
here, ignore VI_INIT_DONE in vi_refresh_stats too to be consistent with
cxgbe_refresh_stats.
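The intended flag handling can be sketched roughly as below. The
structure, flag values and function names (demo_vi, DEMO_VI_*,
demo_suspend, demo_refresh_stats) are assumptions made for illustration
and are not the driver's definitions; hardware access is simulated with
a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values and structure; not the driver's definitions. */
#define DEMO_VI_INIT_DONE	0x1
#define DEMO_VI_SKIP_STATS	0x2

struct demo_vi {
	unsigned int flags;
	unsigned int hw_reads;	/* counts simulated hardware accesses */
};

/* On suspend, mark every VI, whether it was initialized or not. */
static void
demo_suspend(struct demo_vi *vis, size_t n)
{
	for (size_t i = 0; i < n; i++)
		vis[i].flags |= DEMO_VI_SKIP_STATS;
}

/* Refresh looks only at SKIP_STATS; INIT_DONE is deliberately ignored. */
static bool
demo_refresh_stats(struct demo_vi *vi)
{
	if (vi->flags & DEMO_VI_SKIP_STATS)
		return (false);
	vi->hw_reads++;
	return (true);
}
```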
MFC after: 1 week
Sponsored by: Chelsio Communications
Backport from Linux 5.17 (drivers/infiniband/hw/mlx5/fs.c)
This fixes creating flow rules from user-space after the
kernel space update based on Linux 5.7-rc1.
Sponsored by: NVIDIA Networking
This was missed in 74d6c131cb where other geom modules were annotated
with MODULE_VERSION. Again, the problem is the same: we can't detect
that geom_md is loaded into the kernel without it.
This was noticed in release builds on the cluster; mdconfig attempts to
load geom_md because it can't detect it in the kernel, but the cluster
config includes md(4) and does not build the kmod. This problem would
have been masked on hosts with the kmod built, as the kmod attempts to
register the g_md module and fails. With this commit, mdconfig would
not even try to load it again.
Reported by: re (cperciva)
MFC after: 3 days
In the (extremely unlikely) case of vd->vd_height ==
vt_logo_sprite_height the vd_drawrect code would write outside of
frame-buffer memory.
MFC after: 1 week
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D34220
Having a single pool of worker threads adds extra complexity and
overhead. The software backend also uses per-connection kthreads.
Sponsored by: Chelsio Communications
Previously the driver was called to send PDUs to the NIC synchronously
from the icl_conn_pdu_queue_cb callback. However, this performed a
fair bit of work while holding the icl connection lock. Instead,
change the callback to add sent PDUs to a STAILQ and defer dispatching
of PDUs to the NIC to a helper thread similar to the scheme used in
the TCP iSCSI backend.
- Replace rx_flags int and the sole RXF_ACTIVE flag with a simple
rx_active bool.
- Add a pool of transmit worker threads for cxgbei.
- Fix worker thread exit to depend on the wakeup in kthread_exit()
to fix a race with module unload.
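The queue-and-defer scheme can be sketched as below, using the STAILQ
macros from sys/queue.h. All type and function names (demo_pdu,
demo_conn, demo_pdu_queue, demo_tx_worker) are hypothetical, and locking
and the worker thread itself are left out: the sketch only shows that
the callback path appends and the worker path drains in FIFO order.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical PDU and connection types for illustration only. */
struct demo_pdu {
	int id;
	STAILQ_ENTRY(demo_pdu) link;
};

STAILQ_HEAD(demo_pduq, demo_pdu);

struct demo_conn {
	struct demo_pduq sent_pdus;	/* filled by the queue callback */
	int dispatched;			/* PDUs handed to the "NIC" */
};

static void
demo_conn_init(struct demo_conn *c)
{
	STAILQ_INIT(&c->sent_pdus);
	c->dispatched = 0;
}

/* Callback path: only append; the connection lock would be held here. */
static void
demo_pdu_queue(struct demo_conn *c, struct demo_pdu *p)
{
	STAILQ_INSERT_TAIL(&c->sent_pdus, p, link);
}

/* Worker path: drain in FIFO order; returns the id of the last PDU. */
static int
demo_tx_worker(struct demo_conn *c)
{
	struct demo_pdu *p;
	int last = -1;

	while ((p = STAILQ_FIRST(&c->sent_pdus)) != NULL) {
		STAILQ_REMOVE_HEAD(&c->sent_pdus, link);
		c->dispatched++;	/* stand-in for sending to the NIC */
		last = p->id;
	}
	return (last);
}
```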
Reported by: mav
Sponsored by: Chelsio Communications
These headers originate with the Xen project and shouldn't be mixed with
the main portion of the FreeBSD kernel. Notably they shouldn't be the
target of clean-up commits.
Switch to use the headers in sys/contrib/xen.
Reviewed by: royger
There's no need to explicitly add linear mappings for the grant table
area, as the memory is allocated using xenmem_alloc and it should
already have a linear mapping that can be obtained using
rman_get_virtual.
While there also remove the return value of gnttab_map, since there's
no return value anymore.
Sponsored by: Citrix Systems R&D
Reviewed by: Elliott Mitchell <ehem+freebsd@m5p.com>
Differential revision: https://reviews.freebsd.org/D29602
sbcut() returns mbufs in reverse order so is not suitable for reading
data from the socket buffer. Instead, check for already-received data
in the receive worker thread before passing offload PDUs up to the
iSCSI layer. This uses soreceive() to read data from the socket and
is also able to use M_WAITOK since it now runs from a worker thread instead
of an interrupt thread.
Also, fix decoding of the data segment length for pre-offload PDUs.
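For reference, the data segment length is a 24-bit big-endian field in
bytes 5..7 of the 48-byte iSCSI basic header segment. The sketch below
decodes it; demo_bhs_data_len is a hypothetical helper and is not meant
to show what the driver's decoding bug was, which this message does not
describe.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode the 24-bit big-endian DataSegmentLength from a 48-byte iSCSI
 * basic header segment (bytes 5..7). Hypothetical helper, not driver code.
 */
static uint32_t
demo_bhs_data_len(const uint8_t bhs[48])
{
	return ((uint32_t)bhs[5] << 16 | (uint32_t)bhs[6] << 8 | bhs[7]);
}
```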
Reported by: Jithesh Arakkan @ Chelsio
Fixes: a8c4147edc cxgbei: Parse all PDUs received prior to enabling offload mode.
Sponsored by: Chelsio Communications
Summary:
This switch is based off of the AR8327/AR8337 external switch/PHY.
However, unlike the AR8327/AR8337, it doesn't have any PHYs of its own;
instead an external PHY connects to it using the PSGMII port.
Differential Revision: https://reviews.freebsd.org/D34112
Reviewed by: manu
This code is inspired by the ar40xx code in openwrt, which itself
is based on the Qualcomm QCA-SSDK. Both of these sources are, amusingly,
BSD licenced - and thus I have included some of the comments in the
hardware workaround paths to document some of the magic numbers.
This adds support for the IPQ4018/IPQ4019 MDIO bus. This is used to
talk to external PHYs and switches. (There's an internal switch
in the IPQ4018/IPQ4019 as well, but it's accessible via MMIO/AXI.)
Differential Revision: https://reviews.freebsd.org/D34110
Reviewed by: manu
A lot more generic cam related things were done in mmc_sim so this
simplifies the driver a lot.
Differential Revision: https://reviews.freebsd.org/D32154
Reviewed by: imp
Some systems appear to return garbage here. I still don't know why, but
I hope this check at least fixes the endless printf loop.
MFC after: 2 weeks
74cf7cae4d ("softclock: Use dedicated ithreads for running callouts.")
switched callouts away from the swi infrastructure. It turns out that
this was a major source of entropy in early boot, which we've now lost.
As a result, first boot on hardware without a 'fast' entropy source
would block waiting for fortuna to be seeded with little hope of
progressing without manual intervention.
Let's resolve it by explicitly harvesting entropy in callout_process()
if we've handled any callouts. cc/curthread/now seem to be reasonable
sources of entropy, so use those.
Discussed with: jhb (also proposed initial patch)
Reported by: many
Reviewed by: cem, markm (both csprng)
Differential Revision: https://reviews.freebsd.org/D34150
When FILEMON_SET_FD is used, the filemon handle effectively wraps the
passed file. In particular, the handle may be inherited by a child
process, or transferred over a unix domain socket, so we must verify
that the backing file permits this.
Reported by: syzbot+36e6be9e02735fe66ca8@syzkaller.appspotmail.com
Reviewed by: emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34128
According to Broadcom, mixing 64-bit SGEs with 32-bit chain entries can
lead to IOC Fault code 0x40000d04. This fault code has been observed to
suddenly increase on certain machines when the OCA firmware images are
deployed. The hardware interprets all elements of a 64-bit SGE, even
ones marked as 32-bit. Depending on the other bits, this will just work,
but sometimes generate the above fault. Broadcom recommends using 64-bit
chain entries with 64-bit SGEs, and the Linux and NetBSD drivers do so.
Rework the chaining code to use MPI2_SGE_CHAIN64 instead of
MPI2_SGE_CHAIN32. Adjust MPS_SGC_SIZE from 8 to 12 to match the size of
the new structure. Flag the structure as being 64-bits now. Since
MPS_SGE64_SIZE and MPS_SGC_SIZE are the same now, mps_push_sge could be
simplified (in the same fashion as mpr). The various cases collapse to
whether or not there's room for the segments; if not, we need a chain.
However, these changes haven't been made yet, as the current code
handles those cases properly with the new defines.
Make chain_busaddr 64 bits wide, even though we ask for all allocations
to be below 4GB for this tag. Use it to set both parts of the CHAIN64
address rather than baking in the 4GB assumption. Add asserts around the
allocation to detect any BUSDMA bugs in allocation.
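Setting both halves of the chain address from a 64-bit bus address can
be sketched as follows. struct demo_chain64 and demo_set_chain_addr are
hypothetical: the field names only mirror the shape of a 12-byte chain
entry, and the byte-order conversion a real driver would need is left
out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative 64-bit chain element; field names are hypothetical and
 * only mirror the shape of a 12-byte chain entry.
 */
struct demo_chain64 {
	uint16_t length;
	uint8_t  next_chain_offset;
	uint8_t  flags;
	uint32_t addr_low;
	uint32_t addr_high;
};

_Static_assert(sizeof(struct demo_chain64) == 12, "chain entry is 12 bytes");

/* Fill both address halves from the full bus address. */
static void
demo_set_chain_addr(struct demo_chain64 *sgc, uint64_t busaddr)
{
	sgc->addr_low = (uint32_t)(busaddr & 0xffffffffu);
	sgc->addr_high = (uint32_t)(busaddr >> 32);
}
```

With allocations kept below 4GB the high half is always zero, but the
helper no longer relies on that.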
Remove asserts and associated comment in mpi_pre_fw_download and
mpi_pre_fw_upload. The code does not, it seems, depend on this
invariant. The mpr driver has similar code, no asserts and also doesn't
depend on this.
Adjust comments to reflect the updated size.
Sponsored by: Netflix
Reviewed by: scottl, mav
Differential Revision: https://reviews.freebsd.org/D34016
If port resume fails, the USB device has likely been detached. Ignore such
errors, because otherwise the USB stack might keep trying to resume the
device forever before it proceeds to detach it.
MFC after: 1 week
Sponsored by: NVIDIA Networking
TLS RX support is modeled after TLS TX support. The basic structures and layouts
are almost identical, except that the send tag created filters RX traffic and
not TX traffic.
The TLS RX tag keeps track of past TLS records up to a certain limit,
approximately 1 Gbyte of TCP data. TLS records of the same length are joined
into a single database record.
The HW is queried regularly for TLS RX progress information. The TCP sequence
number obtained from the HW is then matched against the database of TLS TCP
sequence number records and lengths. If a match is found, a static params WQE
is queued on the IQ and the hardware should immediately resume decrypting TLS
data until the next non-sequential TCP packet arrives.
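One possible shape for the sequence number lookup is sketched below. It
assumes that a "match" means the sequence number falls on the first byte
of a tracked TLS record, and that records of equal length are stored as
a run with a start, a length and a count. The names demo_tls_run and
demo_tls_seq_match are hypothetical and the driver's actual database
layout may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical database entry: a run of equal-length TLS records. */
struct demo_tls_run {
	uint32_t start_seq;	/* TCP sequence number of the first record */
	uint32_t rec_len;	/* length of each record in the run, > 0 */
	uint32_t count;		/* number of records joined into this run */
};

/* Return true if seq is the first byte of a record in one of the runs. */
static bool
demo_tls_seq_match(const struct demo_tls_run *runs, size_t n, uint32_t seq)
{
	for (size_t i = 0; i < n; i++) {
		/* Unsigned subtraction handles sequence number wraparound. */
		uint32_t off = seq - runs[i].start_seq;
		uint64_t span = (uint64_t)runs[i].rec_len * runs[i].count;

		if (off < span && off % runs[i].rec_len == 0)
			return (true);
	}
	return (false);
}
```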
Offloading TLS RX data is supported for untagged, prio-tagged, and
regular VLAN traffic.
MFC after: 1 week
Sponsored by: NVIDIA Networking
If the driver_version capability bit is enabled, send the driver
version to firmware after the init HCA command, for display purposes.
Example of driver version: "FreeBSD,mlx5_core,14.0.0,3.x-xxx"
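A string of that shape could be built along these lines. The helper
demo_format_driver_version and its parameters are hypothetical; the
meaning of the comma-separated fields (OS name, driver name, OS version,
driver version) is inferred from the example above and is not confirmed
by this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build an "OS,driver,major.minor.patch,driver-version" string.
 * Hypothetical helper; returns what snprintf() returns.
 */
static int
demo_format_driver_version(char *buf, size_t len, unsigned int major,
    unsigned int minor, unsigned int patch, const char *drv_ver)
{
	return (snprintf(buf, len, "FreeBSD,mlx5_core,%u.%u.%u,%s",
	    major, minor, patch, drv_ver));
}
```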
Linux commits:
012e50e109fd27ff989492ad74c50ca7ab21e6a1
MFC after: 1 week
Sponsored by: NVIDIA Networking
Currently, unicast/multicast loopback raw ethernet (non-RDMA) packets
are sent back to the vport. A unicast loopback packet is a packet
whose destination MAC address is the same as its source MAC address. For
multicast, the destination MAC address is in the vport's multicast
filter list.
Moreover, the local loopback is not needed if there is at most one
user space context.
After this patch, the raw ethernet unicast and multicast local
loopback are disabled by default. When there is more than one user
space context, the local loopback is enabled.
Note that when local loopback is disabled, raw ethernet packets are
not looped back to the vport and are forwarded to the next routing
level (eswitch, or multihost switch, or out to the wire depending on
the configuration).
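The enable/disable rule can be sketched as a simple count of user space
contexts. The structure and function names (demo_lb_state,
demo_ctx_open, demo_ctx_close) are hypothetical and the actual hardware
programming is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state; not the driver's data structures. */
struct demo_lb_state {
	unsigned int user_contexts;
	bool loopback_enabled;
};

/* Loopback is only needed once a second user space context exists. */
static void
demo_lb_update(struct demo_lb_state *s)
{
	s->loopback_enabled = (s->user_contexts > 1);
}

static void
demo_ctx_open(struct demo_lb_state *s)
{
	s->user_contexts++;
	demo_lb_update(s);
}

static void
demo_ctx_close(struct demo_lb_state *s)
{
	if (s->user_contexts > 0)
		s->user_contexts--;
	demo_lb_update(s);
}
```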
Linux commits:
c85023e153e3824661d07307138fdeff41f6d86a
8978cc921fc7fad3f4d6f91f1da01352aeeeff25
MFC after: 1 week
Sponsored by: NVIDIA Networking
This change adds convenience functions to setup a flow steering rule based on
a TCP socket. The helper function gets all the address information from the
socket and returns a steering rule, to be used with HW TLS RX offload.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Previously flow steering tables and rules were only created and destroyed
at link up and down events, respectively. Due to new requirements for adding
TLS RX flow tables and rules, the main flow steering table must always be
available as there are permanent redirections from the TLS RX flow table
to the vlan flow table.
MFC after: 1 week
Sponsored by: NVIDIA Networking
All packets must go through the indirection table, RQT,
because it is not possible to modify the RQN of the TIR
for direct dispatch after it is created, typically
when the link goes up and down.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Add support to map an SQ to a specific schedule queue using a
special WQE as a performance enhancement.
SQ remap operation is handled by a privileged internal queue, IQ,
and the mapping is enabled from one rate to another.
The transition from paced to non-paced should however always go
through FW.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Internal send queues are regular send queues which are reserved for WQE commands
towards the hardware and firmware. These queues typically carry resync
information for ongoing TLS RX connections and requests to change schedule
queues for rate limited connections.
The internal queue, IQ, code is more or less a stripped down copy
of the existing SQ managing code with the exception of:
1) An optional single segment memory buffer, which can be read or
written as a whole by the hardware, may be provided.
2) An optional completion callback for all transmit operations may
be provided.
3) Does not support mbufs.
MFC after: 1 week
Sponsored by: NVIDIA Networking