Overview:
This is the first stage of an RDMA stack upgrade, introducing kernel
changes only, based on Linux 5.7-rc1.
This patch covers four main areas of work:
- Update of the IB uobjects system:
- The memory holding the so-called AH, CQ, PD, SRQ and UCONTEXT objects
is now managed by ibcore. This also requires some changes to the
kernel verbs API. The updated verbs typically initialize and
deinitialize objects, and no longer allocate and free the memory
themselves.
- Update of the uverbs IOCTL framework:
- The parsing and handling of user-space commands has been
completely refactored to integrate with the updated IB uobjects
system.
- Various changes and updates to the generic uverbs interfaces in
device drivers including the new uAPI surface.
- The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.
Dependencies:
- The mlx4ib driver code has been updated with the minimum changes
needed.
- The mlx5ib driver code has been updated with the minimum changes
needed, including DV support.
Compatibility:
- All user-space facing APIs are backwards compatible after this
change.
- All kernel-space facing RDMA APIs are backwards compatible after
this change, with the exception of ib_create_ah() and ib_destroy_ah(),
which take a new flag.
- The "ib_device_ops" structure exists, but only contains the driver ID
and some structure sizes.
Differences from Linux:
- InfiniBand drivers must use the INIT_IB_DEVICE_OPS() macro to set
the sizes needed for allocating various IB objects when adding
IB device instances.
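The division of labour described above (the core allocates object memory
using sizes declared by the driver, and the driver verbs only initialize
and deinitialize) can be sketched in plain userspace C. This is a minimal
mock, not the ibcore code: every name ending in _sketch is hypothetical,
the ops table layout is an assumption, and calloc() stands in for the
kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Core-visible object; a driver embeds it as the first member. */
struct ib_pd_sketch {
	int core_refs;
};

/* Hypothetical driver-private PD wrapping the core object. */
struct drv_pd_sketch {
	struct ib_pd_sketch ibpd;	/* must be first so the cast is valid */
	unsigned int pdn;		/* stand-in for a hardware PD number */
};

/* Hypothetical ops table: the driver declares a size, not an allocator. */
struct dev_ops_sketch {
	size_t size_ib_pd;
	int (*alloc_pd)(struct ib_pd_sketch *pd);	/* initialize only */
	void (*dealloc_pd)(struct ib_pd_sketch *pd);	/* deinitialize only */
};

/* Driver side: the memory already exists, just fill it in. */
int
drv_alloc_pd(struct ib_pd_sketch *pd)
{
	struct drv_pd_sketch *dpd = (struct drv_pd_sketch *)pd;

	dpd->pdn = 42;	/* placeholder for a firmware command result */
	return (0);
}

void
drv_dealloc_pd(struct ib_pd_sketch *pd)
{
	struct drv_pd_sketch *dpd = (struct drv_pd_sketch *)pd;

	dpd->pdn = 0;
}

/* Core side: allocate with the declared size, then let the driver init. */
struct ib_pd_sketch *
core_alloc_pd(const struct dev_ops_sketch *ops)
{
	struct ib_pd_sketch *pd;

	assert(ops->size_ib_pd >= sizeof(*pd));
	pd = calloc(1, ops->size_ib_pd);
	if (pd == NULL)
		return (NULL);
	if (ops->alloc_pd(pd) != 0) {
		free(pd);
		return (NULL);
	}
	pd->core_refs = 1;
	return (pd);
}

/* Core side: driver deinitializes, the core frees the memory. */
void
core_dealloc_pd(const struct dev_ops_sketch *ops, struct ib_pd_sketch *pd)
{
	ops->dealloc_pd(pd);
	free(pd);
}
```

A driver in this mock would set size_ib_pd to sizeof(struct
drv_pd_sketch) once when registering, which is the role the text above
assigns to INIT_IB_DEVICE_OPS().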
Security:
- PRIV_NET_RAW is needed to use raw Ethernet transmit features.
- PRIV_DRIVER is needed to use other privileged operations.
Based on upstream Linux, Torvalds (5.7-rc1):
8632e9b5645bbc2331d21d892b0d6961c1a08429
MFC after: 1 week
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31149
Sponsored by: NVIDIA Networking
Define a new option in 'rdma_set_option' to override the calculated QP
timeout when requested to provide QP attributes to modify a QP.
At the same time, pack tos_set into a bitfield.
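The override and the bitfield packing can be illustrated with a small
userspace mock. This is a sketch under stated assumptions, not the
RDMA CM code: the option IDs, the struct and the function names (all
ending in _sketch) are hypothetical, and the name timeout_set is assumed
by analogy with tos_set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical option IDs; the real ones live in the RDMA CM headers. */
enum { OPT_TOS_SKETCH = 0, OPT_ACK_TIMEOUT_SKETCH = 1 };

struct cm_id_sketch {
	uint8_t tos;
	uint8_t timeout;
	uint8_t tos_set:1;	/* single-bit flags share a storage unit */
	uint8_t timeout_set:1;	/* assumed name, mirrors tos_set */
};

/* Record an option and remember that the caller set it explicitly. */
int
set_option_sketch(struct cm_id_sketch *id, int optname, uint8_t val)
{
	switch (optname) {
	case OPT_TOS_SKETCH:
		id->tos = val;
		id->tos_set = 1;
		return (0);
	case OPT_ACK_TIMEOUT_SKETCH:
		id->timeout = val;
		id->timeout_set = 1;
		return (0);
	default:
		return (-1);	/* unknown option */
	}
}

/* When filling QP attributes, an explicit value wins over the calculated one. */
uint8_t
qp_timeout_sketch(const struct cm_id_sketch *id, uint8_t calculated)
{
	return (id->timeout_set ? id->timeout : calculated);
}
```

With a zero-initialized ID the calculated value is passed through
unchanged; after the timeout option is set, the stored value is used
instead.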
Linux commit:
2c1619edef61a03cb516efaa81750784c3071d10
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
This change includes several changes, listed below, all related to UAR.
UAR is a special PCI memory area where the so-called doorbell register and
blue flame register live. Blue flame is a feature for sending small packets
more efficiently via a PCI memory page, instead of using PCI DMA.
- All structures and functions named xxx_uuars were renamed to xxx_bfreg.
- Remove partially implemented blue flame support from mlx5en(4) and mlx5ib.
- Implement blue flame register allocator.
- Use blue flame register allocator in mlx5ib.
- A common UAR page is now allocated by the core to support doorbell register
writes for all of mlx5en and mlx5ib, instead of allocating one UAR per
send queue.
- Add support for DEVX query UAR.
- Add support for 4K UAR for libmlx5.
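As a rough illustration of what a register allocator of this kind has to
do, the following userspace sketch hands out register indices from a
fixed pool tracked by a bitmap. It is not the mlx5 allocator and does
not reflect its actual policy; the pool size and all names ending in
_sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BFREGS_SKETCH 32	/* hypothetical pool size */

struct bfreg_pool_sketch {
	uint32_t used;	/* bit i set means register i is allocated */
};

/* Return the lowest free register index, or -1 if the pool is exhausted. */
int
bfreg_alloc_sketch(struct bfreg_pool_sketch *p)
{
	for (int i = 0; i < BFREGS_SKETCH; i++) {
		if ((p->used & (UINT32_C(1) << i)) == 0) {
			p->used |= UINT32_C(1) << i;
			return (i);
		}
	}
	return (-1);
}

/* Release a previously allocated register index. */
void
bfreg_free_sketch(struct bfreg_pool_sketch *p, int idx)
{
	assert(idx >= 0 && idx < BFREGS_SKETCH);
	assert(p->used & (UINT32_C(1) << idx));	/* catch a double free */
	p->used &= ~(UINT32_C(1) << idx);
}
```

A freed index becomes the next candidate again, so a short-lived user
does not shrink the pool for everyone else.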
Linux commits:
7c043e908a74ae0a935037cdd984d0cb89b2b970
2f5ff26478adaff5ed9b7ad4079d6a710b5f27e7
0b80c14f009758cefeed0edff4f9141957964211
30aa60b3bd12bd79b5324b7b595bd3446ab24b52
5fe9dec0d045437e48f112b8fa705197bd7bc3c0
0118717583cda6f4f36092853ad0345e8150b286
a6d51b68611e98f05042ada662aed5dbe3279c1e
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Creating a UD address handle from user-space or kernel-space, when the
link layer is Ethernet, requires resolving the remote L3 address into
an L2 address. Doing this from the kernel is easy because the required
ARP (IPv4) and ND6 (IPv6) address resolution APIs are readily
available. In user-space no such interface exists and kernel help is
required.
It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example, information like the VLAN ID and SCOPE ID is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.
This patch optimises the L3 to L2 address resolution using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.
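The idea of returning the resolved L2 address in an extended response
can be sketched as follows. This is a userspace mock under stated
assumptions: the response layout, the field names and the functions
(all ending in _sketch) are hypothetical and are not the uverbs ABI.
The point it shows is the length field, which lets user-space detect
whether the kernel actually supplied the MAC address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN_SKETCH 6

/* Hypothetical driver response appended to the create-AH reply. */
struct create_ah_resp_sketch {
	uint32_t response_length;	/* bytes the kernel actually filled */
	uint8_t dmac[ETH_ALEN_SKETCH];	/* resolved destination MAC */
	uint8_t reserved[6];
};

/* Bytes needed for the response to include a complete dmac field. */
#define RESP_NEED_DMAC_SKETCH \
	(offsetof(struct create_ah_resp_sketch, dmac) + ETH_ALEN_SKETCH)

/* Kernel side: report the MAC only if the caller left room for it. */
void
fill_resp_sketch(struct create_ah_resp_sketch *resp, size_t outlen,
    const uint8_t mac[ETH_ALEN_SKETCH])
{
	memset(resp, 0, sizeof(*resp));
	resp->response_length = sizeof(resp->response_length);
	if (outlen >= RESP_NEED_DMAC_SKETCH) {
		memcpy(resp->dmac, mac, ETH_ALEN_SKETCH);
		resp->response_length = (uint32_t)RESP_NEED_DMAC_SKETCH;
	}
}

/* User side: trust dmac only when the reported length covers it. */
int
resp_get_dmac_sketch(const struct create_ah_resp_sketch *resp,
    uint8_t mac[ETH_ALEN_SKETCH])
{
	if (resp->response_length < RESP_NEED_DMAC_SKETCH)
		return (-1);	/* MAC not provided by the kernel */
	memcpy(mac, resp->dmac, ETH_ALEN_SKETCH);
	return (0);
}
```

A caller that gets -1 back knows the kernel did not supply the address
and can fail the request or fall back to whatever it did before.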
This commit combines the following Linux upstream commits:
IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah
MFC after: 1 week
Sponsored by: Mellanox Technologies
compilation under FreeBSD. The mthca driver was temporarily removed as
part of the Linux 4.9 RoCE/InfiniBand upgrade.
Top commit in Linux source tree:
69973b830859bc6529a7a0468ba0d80ee5117826
Sponsored by: Mellanox Technologies
This patch currently supports:
- ibcore as a kernel module only
- krping as a kernel module only
- ipoib as a kernel module only
Sponsored by: Mellanox Technologies