Overview:
This is the first stage of an RDMA stack upgrade. It introduces kernel
changes only and is based on Linux 5.7-rc1.
This patch covers four main areas of work:
- Update of the IB uobjects system:
- The memory holding so-called AH, CQ, PD, SRQ and UCONTEXT objects
is now managed by ibcore. This also requires some changes in the
kernel verbs API. The updated verbs typically initialize and
deinitialize objects, and no longer allocate and free memory.
- Update of the uverbs IOCTL framework:
- The parsing and handling of user-space commands has been
completely refactored to integrate with the updated IB uobjects
system.
- Various changes and updates to the generic uverbs interfaces in
device drivers including the new uAPI surface.
- The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.
Dependencies:
- The mlx4ib driver code has been updated with the minimum changes
needed.
- The mlx5ib driver code has been updated with the minimum changes
needed including DV support.
Compatibility:
- All user-space facing APIs are backwards compatible after this
change.
- All kernel-space facing RDMA APIs are backwards compatible after
this change, with the exception of ib_create_ah() and ib_destroy_ah(),
which take a new flag.
- The "ib_device_ops" structure exists, but only contains the driver ID
and some structure sizes.
Differences from Linux:
- Infiniband drivers must use the INIT_IB_DEVICE_OPS() macro to set
the sizes needed for allocating various IB objects, when adding
IB device instances.
Security:
- PRIV_NET_RAW is needed to use raw ethernet transmit features.
- PRIV_DRIVER is needed to use other privileged operations.
Based on upstream Linux, Torvalds (5.7-rc1):
8632e9b5645bbc2331d21d892b0d6961c1a08429
MFC after: 1 week
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31149
Sponsored by: NVIDIA Networking
Since neither ib_post_send() nor ib_post_recv() modify the data structure
their second argument points at, declare that argument const. This change
makes it necessary to declare the 'bad_wr' argument const too and also to
modify all ULPs that call ib_post_send(), ib_post_recv() or
ib_post_srq_recv(). This patch does not change any functionality but makes
it possible for the compiler to verify that the
ib_post_(send|recv|srq_recv) functions really do not modify the posted
work request.
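A minimal sketch of the const-qualified calling convention described above.
The demo_send_wr structure and demo_post_send() function are hypothetical
stand-ins, not the real ibcore definitions. Because the chain is walked
through a pointer to const, the failing request can only be reported through
a pointer to a const pointer, which is why 'bad_wr' has to change as well.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a send work request. */
struct demo_send_wr {
	const struct demo_send_wr *next;	/* next request in the chain */
	unsigned long wr_id;			/* caller-chosen identifier */
	int num_sge;				/* scatter/gather entry count */
};

/*
 * Hypothetical post routine: walks the chain read-only and reports the
 * first request it rejects through bad_wr.
 */
int
demo_post_send(const struct demo_send_wr *wr,
    const struct demo_send_wr **bad_wr)
{
	for (; wr != NULL; wr = wr->next) {
		if (wr->num_sge <= 0) {
			/* Needs a const target, since wr is const. */
			*bad_wr = wr;
			return (-1);
		}
		/* Writing to wr->num_sge here would not compile. */
	}
	*bad_wr = NULL;
	return (0);
}
```

With this shape, an accidental write to the posted request inside the
routine becomes a compile-time error rather than a silent side effect.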
Linux commit:
f696bf6d64b195b83ca1bdb7cd33c999c9dcf514
7bb1fafc2f163ad03a2007295bb2f57cfdbfb630
d34ac5cd3a73aacd11009c4fc3ba15d7ea62c411
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
All callers of ib_modify_qp_is_ok() provide an enum ib_qp_state, which
makes the out-of-range state checks redundant. Remove them and update the
function signature to return a boolean result.
While at it, remove the unused "ll" parameter from ib_modify_qp_is_ok().
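A rough sketch of the idea, using hypothetical names and a deliberately
reduced, illustrative transition table (not the real ibcore one). Since
both arguments are enum-typed, the lookup needs no range checks and the
result is a plain boolean.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of queue pair states. */
enum demo_qp_state {
	DEMO_QPS_RESET,
	DEMO_QPS_INIT,
	DEMO_QPS_RTR,
	DEMO_QPS_COUNT
};

/* Illustrative transitions only; entries not listed default to false. */
static const bool demo_valid[DEMO_QPS_COUNT][DEMO_QPS_COUNT] = {
	[DEMO_QPS_RESET] = {
		[DEMO_QPS_RESET] = true,
		[DEMO_QPS_INIT] = true,
	},
	[DEMO_QPS_INIT] = {
		[DEMO_QPS_RESET] = true,
		[DEMO_QPS_INIT] = true,
		[DEMO_QPS_RTR] = true,
	},
	[DEMO_QPS_RTR] = {
		[DEMO_QPS_RESET] = true,
	},
};

/* Returns true when the cur -> next transition is listed as valid. */
bool
demo_modify_qp_is_ok(enum demo_qp_state cur, enum demo_qp_state next)
{
	return (demo_valid[cur][next]);
}
```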
Linux commit:
19b1f54099b6ee334acbfbcfbdffd1d1f057216d
d31131bba5a1630304c55ea775c48cc84912ab59
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
The driver needs to provide a LinuxKPI device structure to register
itself with the IB subsystem. It was erroneously using a copy of its
FreeBSD device structure for this purpose.
Use linux_pci_attach_device() instead, following the example of the
Chelsio iwarp driver. Also ensure that we don't leak the faked device
during detach.
Reviewed by: hselasky
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29595
Remove unused #includes of a LinuxKPI header noticed while trying to
solve LinuxKPI struct net_device and related functions.
This takes qlnxr out of the picture of D29366.
Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
X-D-R: D29366 (extracted as further cleanup)
upper_32_bits() and lower_32_bits() are defined twice in this file.
With the extra conditional removed from LinuxKPI in 3b1ecc9fa1
they are also included from there already. Use the LinuxKPI version
and remove the two local ones.
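For reference, helpers of this kind are usually written along the lines of
the sketch below. The demo_ prefixed names are hypothetical and are used
here only to avoid clashing with the LinuxKPI definitions; the double
16-bit shift is assumed to be there so that the macro stays well defined
when the argument is only 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical stand-ins for upper_32_bits() and lower_32_bits(). */
#define	demo_upper_32_bits(n)	((uint32_t)(((n) >> 16) >> 16))
#define	demo_lower_32_bits(n)	((uint32_t)((n) & 0xffffffff))
```

A 64-bit DMA address can then be split into the two 32-bit halves that a
hardware descriptor typically expects.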
Sponsored-by: The FreeBSD Foundation
Reviewed-by: hselasky
MFC-after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29392
r360870 added linux/slab.h into linux/bitmap.h, which in turn includes
linux/types.h. The qlnx driver was redefining some of those types, so
remove the redefinitions and add an explicit linux/types.h include.
Pointy hat: manu
Reported by: Austin Shafer <ashafer@badland.io>
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
This fixes 32-bit compat (no ioctl command definitions are required
as struct ifreq is the same size). This is believed to be sufficient to
fully support ifconfig on 32-bit systems.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900
QL_ALIGN is a set of copies of roundup2, but it was missing an outer set
of parentheses, which began to matter with r324538. Now, fully copy the
parenthesization of roundup2.
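A small sketch of why the outer parentheses matter. Both macros below are
hypothetical illustrations, not the driver's actual QL_ALIGN definitions.
Without the outer pair, an operator that binds tighter than '&' (here '*')
captures only the left half of the expansion.

```c
/* Hypothetical: same body as a roundup2-style macro, outer parens missing. */
#define	DEMO_ALIGN_BAD(x, y)	((x) + ((y) - 1)) & (~((y) - 1))
/* Hypothetical: fully parenthesized. */
#define	DEMO_ALIGN_GOOD(x, y)	(((x) + ((y) - 1)) & (~((y) - 1)))

/* Expands to (2 * (size + 7)) & ~7, which is not the intended value. */
unsigned int
demo_double_aligned_bad(unsigned int size)
{
	return (2 * DEMO_ALIGN_BAD(size, 8u));
}

/* Expands to 2 * ((size + 7) & ~7), as intended. */
unsigned int
demo_double_aligned_good(unsigned int size)
{
	return (2 * DEMO_ALIGN_GOOD(size, 8u));
}
```

For a size of 5, the fully parenthesized form yields 16 (twice the aligned
value 8), while the form without outer parentheses yields 24.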
Reviewed by: davidcs
Differential Revision: https://reviews.freebsd.org/D12673
TCP connections (order of tens of thousands), with predominantly Transmits.
Choice to perform receive operations either in IThread or Taskqueue Thread.
Submitted by: Vaishali.Kulkarni@cavium.com
MFC after: 5 days
Propagate warning flags from kern.opts.mk and then fix minor -Werror
issues when building with gcc from -Wredundant-decls, -Wnested-externs,
-Wuninitialized.
Reviewed by: davidcs
Approved by: markj (mentor)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11413
2. Use sysctls for TRACE_LRO_CNT and TRACE_TSO_PKT_LEN
3. remove unused mtx tx_lock
4. bind taskqueue kernel thread to the appropriate cpu core
5. When tx_ring is full, stop further transmits until at least 1/16th of
the Tx Ring is empty (1K entries in our case). Also, if there are rx_pkts
to process, put the taskqueue thread to sleep for 100ms before enabling
interrupts.
6. Use rx_pkt_threshold of 128.
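The transmit back-off rule in item 5 can be sketched roughly as follows.
All names are hypothetical, and the ring size of 16K entries is an
assumption inferred only from the statement that 1/16th of the ring is 1K
entries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ring size, inferred from "1/16th ... 1K entries". */
#define	DEMO_TX_RING_SIZE		(16u * 1024u)
/* Resume transmits once at least 1/16th of the ring is free. */
#define	DEMO_TX_RESUME_THRESHOLD	(DEMO_TX_RING_SIZE / 16u)

/* Returns true when a stalled transmit path may resume. */
bool
demo_tx_can_resume(uint32_t free_entries)
{
	return (free_entries >= DEMO_TX_RESUME_THRESHOLD);
}
```

Waiting for a batch of free entries, rather than a single one, is assumed
to be what keeps a full ring from repeatedly stopping and restarting.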
MFC after: 3 days
1. Dead code in ecore_init_cache_line_size(), qlnx_ioctl() and
qlnx_clean_filters()
2. ARRAY_VS_SINGLETON issue in qlnx_remove_all_mcast_mac() and
qlnx_update_rx_prod()
MFC after: 5 days
- Move all bitmap related functions from bitops.h to bitmap.h, similar
to what Linux does.
- Apply some minor code cleanup and simplifications to optimize the
generated code when using static inline functions.
- Implement the following list of bitmap functions which are needed by
drm-next and ibcore:
- bitmap_find_next_zero_area_off()
- bitmap_find_next_zero_area()
- bitmap_or()
- bitmap_and()
- bitmap_xor()
- Add missing include directives to the qlnxe driver
(davidcs@ has been notified)
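To illustrate what a zero-area search does, here is a simplified sketch.
The demo_ functions are hypothetical and are not the LinuxKPI
implementation: the search ignores any alignment constraint, requires
nr > 0, and returns size when no suitable run exists.

```c
#include <limits.h>

#define	DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Returns nonzero when the given bit is set in the bitmap. */
static int
demo_test_bit(const unsigned long *map, unsigned int bit)
{
	return ((map[bit / DEMO_BITS_PER_LONG] >>
	    (bit % DEMO_BITS_PER_LONG)) & 1UL);
}

/*
 * Finds the first run of nr clear bits in [start, size) and returns the
 * index of its first bit, or size when there is no such run.
 */
unsigned int
demo_find_zero_area(const unsigned long *map, unsigned int size,
    unsigned int start, unsigned int nr)
{
	unsigned int i, run = 0;

	for (i = start; i < size; i++) {
		if (demo_test_bit(map, i))
			run = 0;		/* run broken, start over */
		else if (++run == nr)
			return (i + 1 - nr);	/* first bit of the run */
	}
	return (size);
}
```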
MFC after: 1 week
Sponsored by: Mellanox Technologies