Commit graph

21 commits

Author SHA1 Message Date
John Baldwin
365b89e8ea nvmf: Switch several ioctls to using nvlists
For requests that hand off queues from userspace to the kernel as well
as the request to fetch reconnect parameters from the kernel, switch
from using flat structures to nvlists.  In particular, this will
permit adding support for additional transports in the future without
breaking the ABI of the structures.

Note that this is an ABI break for the ioctls used by nvmf(4) and
nvmft(4).  Since this is only present in main I did not bother
implementing compatibility shims.

Inspired by:	imp (suggestion on a different review)
Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D48230
2024-12-30 13:52:21 -05:00
John Baldwin
4d3b659f24 nvmf: Track SQ flow control
This isn't really needed since the host driver never submits more
commands to a queue than it can hold, but I noticed that the
recently-added SQ head and tail sysctl nodes were not updating.  This
fixes that and also uses these values to assert that we never
submit a command while a queue pair is full.

Sponsored by:	Chelsio Communications
2024-11-11 11:39:05 -05:00
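As an illustration of the SQ flow control tracking described in 4d3b659f24, here is a minimal sketch rather than the driver's code. All names (`sketch_qpair`, `sketch_submit`, and so on) are hypothetical, and it assumes the usual ring convention in which one slot stays unused so that a full queue can be told apart from an empty one. The commit asserts that a full queue is never submitted to; the sketch returns `false` instead so the condition can be observed from a test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue-pair state; not the driver's structure. */
struct sketch_qpair {
	uint16_t qsize;		/* number of submission queue entries */
	uint16_t sqhd;		/* head, as last reported by the controller */
	uint16_t sqtail;	/* tail, advanced by the host on submit */
};

/* The queue is full when advancing the tail would collide with the head. */
static bool
sketch_sq_full(const struct sketch_qpair *qp)
{
	return ((qp->sqtail + 1) % qp->qsize == qp->sqhd);
}

/* Advance the tail for one submitted command; refuse if the queue is full. */
static bool
sketch_submit(struct sketch_qpair *qp)
{
	if (sketch_sq_full(qp))
		return (false);
	qp->sqtail = (qp->sqtail + 1) % qp->qsize;
	return (true);
}

/* Record the SQ head value carried in a completion. */
static void
sketch_complete(struct sketch_qpair *qp, uint16_t sqhd)
{
	qp->sqhd = sqhd;
}
```

With `qsize` set to 4, three submissions fill the queue, and a further submission is refused until a completion moves the head forward.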
John Baldwin
3ff90d91b4 nvmf: Schedule requests across multiple I/O queues
Similar to nvme(4), use the current CPU to select which I/O queue to
use.  The assignment in nvmf_attach() had to be moved down since
sc->num_io_queues is initialized in nvmf_establish_connection().

Note that nvmecontrol(8) still defaults to using a single I/O queue
for an association.

Sponsored by:	Chelsio Communications
2024-11-11 11:37:32 -05:00
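The CPU-based queue selection in 3ff90d91b4 can be pictured with the sketch below. The function name and the proportional mapping are assumptions for illustration; the expression the driver actually uses may differ. It also shows why the number of I/O queues has to be known before the mapping is set up: every result depends on `num_io_queues`.

```c
/*
 * Hypothetical mapping from a CPU index to an I/O queue index.
 * Spreads ncpus CPUs evenly over num_io_queues queues; the caller
 * must ensure cpu < ncpus and num_io_queues >= 1.
 */
static unsigned
sketch_select_io_queue(unsigned cpu, unsigned ncpus, unsigned num_io_queues)
{
	return (cpu * num_io_queues / ncpus);
}
```

With 8 CPUs and 4 queues, CPUs 0 and 1 share queue 0 and CPU 7 lands on queue 3. With a single queue, which the commit notes is still the nvmecontrol(8) default, every CPU maps to queue 0.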
John Baldwin
8922c5b821 nvmf: Fix an off by one error when scanning active namespace IDs
The active namespace list query fetches namespaces greater than the
passed in namespace ID, not greater than or equal to the passed in
namespace ID.  Thus, a multi-page request should start with the last
namespace ID from the previous page, not that ID plus 1.

While here, make use of NVME_GLOBAL_NAMESPACE_TAG instead of a magic
number to handle the edge case that the last namespace ID in a page is
the largest valid namespace ID.

Reviewed by:	chuck
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D47393
2024-11-04 20:27:14 -05:00
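The paging rule fixed by 8922c5b821 can be sketched as follows. This is a user-space simulation, not the driver's code: `sketch_query_active` is a hypothetical stand-in for the IDENTIFY active namespace list query, the page size is shrunk to 4 entries to keep the example small, and `SKETCH_GLOBAL_NAMESPACE_TAG` mirrors the value of NVME_GLOBAL_NAMESPACE_TAG (0xffffffff). The `active` array is assumed to be sorted in ascending order.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_ENTRIES		4	/* shrunk for illustration */
#define SKETCH_GLOBAL_NAMESPACE_TAG	0xffffffffu

/*
 * Hypothetical query: fill page[] with up to SKETCH_PAGE_ENTRIES IDs
 * strictly greater than 'after', zero-padding any unused slots.
 */
static void
sketch_query_active(const uint32_t *active, size_t n, uint32_t after,
    uint32_t page[SKETCH_PAGE_ENTRIES])
{
	size_t out = 0;

	for (size_t i = 0; i < n && out < SKETCH_PAGE_ENTRIES; i++)
		if (active[i] > after)
			page[out++] = active[i];
	while (out < SKETCH_PAGE_ENTRIES)
		page[out++] = 0;
}

/*
 * Walk all pages, storing up to max_seen IDs in seen[].
 * Returns the total number of active namespace IDs visited.
 */
static size_t
sketch_scan_active(const uint32_t *active, size_t n, uint32_t *seen,
    size_t max_seen)
{
	uint32_t page[SKETCH_PAGE_ENTRIES];
	uint32_t nsid = 0;
	size_t count = 0;

	for (;;) {
		sketch_query_active(active, n, nsid, page);
		for (size_t i = 0; i < SKETCH_PAGE_ENTRIES; i++) {
			if (page[i] == 0)
				return (count);	/* end of list */
			if (count < max_seen)
				seen[count] = page[i];
			count++;
		}
		/* Resume from the last ID itself, not that ID plus 1. */
		nsid = page[SKETCH_PAGE_ENTRIES - 1];
		/* The largest valid ID cannot be followed by another page. */
		if (nsid >= SKETCH_GLOBAL_NAMESPACE_TAG - 1)
			return (count);
	}
}
```

Because the query is exclusive, resuming from the last ID plus 1 would silently skip the first namespace of each later page whenever IDs are contiguous.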
John Baldwin
a6ec214741 nvmf: Deregister the post_sync eventhandler correctly during detach
Previously the handler was removed from the wrong eventhandler list.

Fixes:		f46d4971b5 nvmf: Handle shutdowns more gracefully
Sponsored by:	Chelsio Communications
2024-11-02 09:54:36 -04:00
John Baldwin
931dd5feb0 nvmf: Add sysctl nodes for each queue pair
These report the queue size, queue head, queue tail, and the number of
commands submitted.

Sponsored by:	Chelsio Communications
2024-11-02 09:54:13 -04:00
John Baldwin
d1516ec33e nvmf: Fail pass through commands while a controller is not associated
Previously this just dereferenced NULL qp pointers and panicked.
Instead, use a shared lock on the connection lock to protect access to
the qp pointers and allocate a request.  If the controller is not
associated, fail the request with ECONNABORTED.

Possibly this should be honoring kern.nvmf.fail_on_disconnection and
block waiting for a reconnect request while disconnected if that
tunable is false.

Reported by:	Suhas Lokesha <suhas@chelsio.com>
Sponsored by:	Chelsio Communications
2024-10-17 12:09:27 -04:00
John Baldwin
ef052adf09 nvmf: Narrow scope of sim lock in nvmf_sim_io
nvmf_submit_request() handles races with concurrent queue pair
destruction (or the queue pair being destroyed between
nvmf_allocate_request and nvmf_submit_request), so the lock is not
needed here.  This avoids holding the lock across transport-specific
logic such as queueing mbufs for PDUs to a socket buffer, etc.

Holding the lock across nvmf_allocate_request() ensures that the queue
pair pointers in the softc are still valid as shutdown attempts will
block on the lock before destroying the queue pairs.

Sponsored by:	Chelsio Communications
2024-09-25 21:14:06 -04:00
John Baldwin
aec2ae8b57 nvmf: Always use xpt_done instead of xpt_done_direct
The last reference on a pending I/O request might be held by an mbuf
in the socket buffer.  When this mbuf is freed, the I/O request is
completed which triggers completion of the CCB.  However, this can
occur with locks held (e.g. with so_snd locked when the mbuf is freed
by sbdrop()) raising a LOR between so_snd and the CAM device lock.
Instead, defer CCB completion processing to a thread where locks are
not held.

Sponsored by:	Chelsio Communications
2024-09-25 21:10:44 -04:00
Mark Johnston
b67f248523 nvmf: Use device_set_descf()
No functional change intended.

MFC after:	1 week
2024-06-16 16:37:26 -04:00
John Baldwin
f46d4971b5 nvmf: Handle shutdowns more gracefully
If an association is disconnected during a clean shutdown, abort all
pending and future I/O requests with an error to avoid hangs either due
to filesystem unmounts or a stuck GEOM event.

If an association is connected during a clean shutdown, gracefully
disconnect from the remote controller and close the open queues.

Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45462
2024-06-05 12:59:28 -07:00
John Baldwin
aacaeeee8e nvmf: Permit failing I/O requests while disconnected
Add a kern.nvmf.fail_on_disconnection sysctl similar to the
kern.iscsi.fail_on_disconnection sysctl.  This causes pending I/O
requests to fail with an error if an association is disconnected
instead of requeueing to be retried once the association is
reconnected.  As with iSCSI, the default is to queue and retry
operations.

Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45308
2024-06-05 12:59:07 -07:00
John Baldwin
e140f85dc1 nvmf: Rescan namespaces after reconnecting
While a host was disconnected from a remote controller, namespaces
might have been added or removed, or had their properties altered.
Rescan the namespaces after reconnecting to detect any such changes.

Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45461
2024-06-05 12:53:08 -07:00
John Baldwin
f6d434f110 nvmf: Rescan all namespaces if the changed NS log page is too large
Previously this just punted with a warning message.

Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45460
2024-06-05 12:52:43 -07:00
John Baldwin
8a082ca89f nvmf: Factor out most of nvmf_rescan_ns into a helper routine
This function accepts a namespace ID and associated namespace data
from IDENTIFY and takes care of updating nvmeXnY and ndaZ.

Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45459
2024-06-05 12:52:24 -07:00
John Baldwin
02ddb305cc nvmf: Refactor nvmf_add_namespaces to be more generic
Rename to nvmf_scan_active_namespaces and accept an additional
callback function and callback argument.  The callback is invoked on
each active namespace enumerated by the active namespace list from the
IDENTIFY command.

Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45458
2024-06-05 12:51:56 -07:00
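The callback shape described in 02ddb305cc might look roughly like the sketch below. The type and function names are hypothetical, the callback signature is an assumption rather than the driver's, and only a single already-fetched list page is walked. A zero entry is treated as the end of the list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: invoked once per active namespace ID. */
typedef void sketch_ns_cb(uint32_t nsid, void *arg);

/* Invoke cb on each nonzero ID in one list page, stopping at the first 0. */
static void
sketch_for_each_active_ns(const uint32_t *list, size_t entries,
    sketch_ns_cb *cb, void *arg)
{
	for (size_t i = 0; i < entries; i++) {
		if (list[i] == 0)
			break;
		cb(list[i], arg);
	}
}

/* Example callback: count namespaces through the opaque argument. */
static void
sketch_count_cb(uint32_t nsid, void *arg)
{
	(void)nsid;
	(*(size_t *)arg)++;
}
```

Passing the operation as a callback lets the same enumeration serve both the initial attach and later rescans, each supplying its own per-namespace action.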
John Baldwin
bed59baba2 nvmf: Pass const pointers to namespace data to nvmf_*_ns
Reviewed by:	imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45457
2024-06-05 12:51:37 -07:00
Chuck Tuffli
ce75bfcac9 nvme: Change namespace device name
Changes the device name for NVMe and NVMe-oF namespaces from using "ns"
to "n" to be more compatible with other operating systems. For example,
a device which was previously /dev/nvme0ns1 is now /dev/nvme0n1.

Preserves the existing functionality by creating an alias from nvmeXnY to
nvmeXnsY.

Reviewed by:	imp
MFC after:	1 month
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D45414
2024-06-01 04:14:14 -07:00
John Baldwin
da4230af3f nvme/f: Use strlcpy instead of strncpy + manual string termination
Reviewed by:	dab, imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D45153
2024-05-13 12:04:03 -07:00
John Baldwin
1f029b86bb nvmf: Use strlcpy instead of strncpy to ensure termination
Reported by:	Coverity Scan
CID:	 	1545054
Sponsored by:	Chelsio Communications
2024-05-10 08:56:51 -07:00
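The termination guarantee behind the two strlcpy changes above (da4230af3f and 1f029b86bb) can be shown with a small sketch. strncpy(3) leaves the destination unterminated when the source is at least as long as the buffer, whereas strlcpy(3) always terminates a non-empty buffer and returns the source length so that truncation can be detected. Since strlcpy is not available in every C library, the example uses a hypothetical local stand-in, `sketch_strlcpy`, written to follow the documented strlcpy semantics.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(3): copy at most dstsize - 1 bytes,
 * always NUL-terminate when dstsize is nonzero, return strlen(src).
 */
static size_t
sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize != 0) {
		size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (srclen);
}

/* Report whether a fixed-size buffer contains a terminator at all. */
static int
sketch_is_terminated(const char *buf, size_t size)
{
	return (memchr(buf, '\0', size) != NULL);
}
```

A return value greater than or equal to the buffer size means the copy was truncated.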
John Baldwin
a1eda74167 nvmf: The in-kernel NVMe over Fabrics host
This is the client (initiator in SCSI terms) for NVMe over Fabrics.
Userland is responsible for creating a set of queue pairs and then
handing them off via an ioctl to this driver, e.g. via the 'connect'
command from nvmecontrol(8).  An nvmeX new-bus device is created
at the top-level to represent the remote controller similar to PCI
nvmeX devices for PCI-express controllers.

As with nvme(4), namespace devices named /dev/nvmeXnsY are created and
pass through commands can be submitted to either the namespace devices
or the controller device.  For example, 'nvmecontrol identify nvmeX'
works for a remote Fabrics controller the same as for a PCI-express
controller.

nvmf exports remote namespaces via nda(4) devices using the new NVMF
CAM transport.  nvmf does not support nvd(4), only nda(4).

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D44714
2024-05-02 16:29:37 -07:00