Commit graph

219 commits

Author SHA1 Message Date
Mitchell Horne
a8ba64ca91 iscsi: adjust shutdown_pre_sync handler
Don't attempt to service reconnections if RB_NOSYNC is set. More
crucially, don't do it if the scheduler is stopped, as the maintenance
thread will never run again.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D42342

(cherry picked from commit 2ce1c45b3411410a5d0a4d08198a3b0010d493b7)
2023-12-08 18:02:44 -04:00
Mitchell Horne
951d60ee3a shutdown: audit shutdown_post_sync event callbacks
Ensure they are all panic/debugger safe.

Most handlers for this event are for disk drivers/geom modules. There
are a mix of checks being used here (or not), so let's standardize on
checking the presence of the RB_NOSYNC flag.

This flag is set whenever:
 1. The kernel has panicked and kern.sync_on_panic=0*
 2. We reboot from within the kernel debugger (the "reset" command)
 3. Userspace requested it, e.g. by 'reboot -n'

Name the functions consistently.

*This sysctl is tuned to zero by default, but its existence means that
these handlers can be executed after a panic, at the user's discretion.
IMO this use-case is implicitly understood to be risky, and we'd be
better off eliminating it altogether.

Reviewed by:    markj
Sponsored by:   The FreeBSD Foundation
MFC after:      1 week
Differential Revision:  https://reviews.freebsd.org/D42337

(cherry picked from commit 4eb861d362d6a9493df7f77eab8e28f9c826702a)
2023-12-08 18:02:44 -04:00
Warner Losh
031beb4e23 sys: Remove $FreeBSD$: one-line sh pattern
Remove /^\s*#[#!]?\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:58 -06:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Dimitry Andric
02a226ac34 Suppress possible unused variable warning for icl_soft.c
With clang 15, the following -Werror warning is produced on i386:

    sys/dev/iscsi//icl_soft.c:1277:6: error: variable 'i' set but not used [-Werror,-Wunused-but-set-variable]
            int i;
                ^

The 'i' variable is used later in the icl_soft_conn_pdu_get_bio()
function, via the PHYS_TO_DMAP() macro. However, on i386 and some other
architectures, this macro is defined to panic immediately, so in those
cases, 'i' is indeed not used. Suppress the warning by marking 'i' as
unused.

MFC after:	3 days
2022-07-27 21:13:58 +02:00
Dimitry Andric
f4f8470180 Fix unused variable warning in icl_soft.c
With clang 15, the following -Werror warning is produced:

    sys/dev/iscsi//icl_soft.c:886:6: error: variable 'coalesced' set but not used [-Werror,-Wunused-but-set-variable]
            int coalesced, error;
                ^

The 'coalesced' variable is eventually used only in an #if 0'd block,
obviously meant for debugging. Ensure that 'coalesced' is only declared
and used when DEBUG_COALESCED is defined, so the debugging can be easily
turned on later, if desired.

MFC after:	3 days
2022-07-25 00:40:12 +02:00
John Baldwin
7b02c1e8c6 iscsi: Fetch limits based on a socket rather than assuming global limits.
cxgbei needs the ability to return different limits based on the
connection (e.g. if the connection is over a T5 adapter or a T6
adapter as well as factoring in the MTU).

This change plumbs through the changes in the ioctls without changing
any of the backends.  The limits callback passed to icl_register now
accepts a second socket argument which holds the integer file
descriptor.  To support ABI compatiblity for old binaries, the
callback should return "global" values if the socket fd is zero.

The CTL_ISCSI_LIMITS argument used with CTL_ISCSI by ctld(8) now
accepts the socket fd in a field that was previously part of a
reserved spare field.  Old binaries zero this request which results in
passing a socket fd of 0 to the limits callback.

The ISCSIDREQUEST ioctl no longer returns limits.  Instead, iscsid(8)
invokes a new ISCSIDLIMITS ioctl after establishing the connection via
connect(2).  For ABI compat, if the old ISCSIDREQUEST is invoked, the
global limits are still fetched (with a socket fd of 0) and returned.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D34928
2022-04-18 12:53:28 -07:00
Gordon Bergling
756220b515 isci(4): Remove a double word in an error message
- s/is is/is/

MFC after:	1 week
2022-04-03 16:07:20 +02:00
John Baldwin
832acea92f icl_soft: Use PHYS_TO_DMAP instead of pmap_map_io_transient.
The latter API is not actually MI but is only supported on amd64,
arm64, and RISC-V.

Sponsored by:	Chelsio Communications
2022-03-10 18:20:28 -08:00
John Baldwin
530e725d8e iscsi: Support unmapped I/O requests in the default initiator.
- Add icl_pdu_append_bio and icl_pdu_get_bio methods.

- When ICL_NOCOPY is used to append data from an unmapped I/O request
  to a PDU, construct unmapped mbufs from the relevant pages backing
  the struct bio.

- Use m_apply with a helper to compute crc32 digests on mbuf chains
  to handle unmapped mbufs.  Since m_apply requires PMAP_HAS_DMAP
  for unmapped mbufs, only support unmapped requests when PMAP_HAS_DMAP
  is true.

Reviewed by:	mav
Differential Revision:	https://reviews.freebsd.org/D34406
2022-03-10 15:50:26 -08:00
John Baldwin
7aab9c14a4 iscsi: Handle unmapped I/O requests.
Don't assume that csio->data_ptr is pointer to a data buffer that can
be passed to icl_get_pdu_data and icl_append_data.  For unmapped I/O
requests, csio->data_ptr is instead a pointer to a struct bio as
indicated by CAM_DATA_BIO.  To support these requests, add
icl_pdu_append_bio and icl_pdu_get_bio methods which pass a pointer to
the bio and an offset and length relative to the bio's buffer.

Note that only backends supporting unmapped requests need to implement
these hooks.

Implement simple no-op hooks for the iser backend.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D34382
2022-03-10 15:49:53 -08:00
John Baldwin
9c7a4875bc iscsi: Use ICL_NOCOPY for SCSI command immediate data and R2T.
The associated csio ccb will not be completed via xpt_done() until
after the associated PDUs are transmitted to the other side and either
the original PDU is acked with a SCSI response, or a response is
received for a subsequent abort CCB (which means the earlier PDU has
also been sent since it would have been sent before the abort PDU).

This does assume that once an I/O request has been aborted, no further
PDUs with data payload are queued for that I/O request.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D34405
2022-03-10 15:48:20 -08:00
Richard Scheffenegger
bd6bb49397 iscsi: per-session timeouts and rapid teardown of session on reconnect
Add per-Session configurable ping (SCSI NOP) and login timeouts.

Remove the torn down, old iSCSI session quickly, when performing a reconnect.

Reviewed By: trasz
Sponsored by:        NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34198
2022-02-25 10:35:47 +01:00
Richard Scheffenegger
972a7d95eb iscsi: Use calloutng instead of ticks in iscsi initiator
callout *_sbt functions are used to reduce ping/timeout scheduling
overhead, while allowing later improvments in the functionality.
Keep similar 1000ms callouts while adding a 10 ms window, to allow
some kernel scheduling improvements.

Reviewed By: jhb
Sponsored by:        NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34222
2022-02-15 17:36:22 +01:00
Richard Scheffenegger
70e9f880d8 iscsi: address unused-but-set-variable warning
remove "interrupted" in icl_soft_proxy_connect()

Reviewed By: hselasky
Sponsored by:        NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34223
2022-02-12 06:15:02 +01:00
Ka Ho Ng
fa66950534 iscsi: Fix missing is_lock unlock after cam_simq_alloc() failed
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-01-21 16:34:18 -05:00
John Baldwin
8903d8e37f iscsi: Pass the request PDU to icl_conn_transfer_setup().
This matches icl_conn_task_setup() which passes the PDU and avoids the
need for a layering violation in cxgbei to fetch the request PDU from
the ctl_io.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D33746
2022-01-04 14:37:17 -08:00
John Baldwin
e900338c09 Move the ICL_CONN_*LOCK* macros to <dev/iscsi/icl.h>.
These macros are not backend-specific but reference a
backend-independent field in struct icl_conn.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D32858
2021-11-05 16:38:25 -07:00
John Baldwin
cdbc4a074b Further refine the ExpDataSN checks for SCSI Response PDUs.
According to 11.4.8 in RFC 7143, ExpDataSN MUST be 0 if the response
code is not Command Completed, but we were requiring it to always be
the count of DataIn PDUs regardless of the response code.

In addition, at least one target (OCI Oracle iSCSI block device)
returns an ExpDataSN of 0 when returning a valid completion with an
error status (Check Condition) in response to a SCSI Inquiry.  As a
workaround for this target, only warn without resetting the connection
for a 0 ExpDataSN for responses with a non-zero error status.

PR:		259152
Reported by:	dch
Reviewed by:	dch, mav, emaste
Fixes:		4f0f5bf995 iscsi: Validate DataSN values in Data-In PDUs in the initiator.
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D32650
2021-10-26 14:50:05 -07:00
John Baldwin
4f0f5bf995 iscsi: Validate DataSN values in Data-In PDUs in the initiator.
As is done in the target, require that DataSN values are consecutive
and in-order.  If an out of order Data-In PDU is received, force a
session reconnect.  In addition, when a SCSI Response PDU is received,
verify that the ExpDataSN field matches the count of Data-In PDUs
received for this command.  If not, force a session reconnect.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31594
2021-08-24 14:58:34 -07:00
John Baldwin
c261b6ea4e iscsi: Teach the iSCSI stack about "large" received PDUs.
When using iSCSI PDU offload (cxgbei) on T6 adapters, a burst of
received PDUs can be reported via a single message to the driver.

Previously the driver passed these multi-PDU bursts up to the iSCSI
stack up as a single "large" PDU by rewriting the buffer offset, data
segment length, and DataSN fields in the iSCSI header.  The DataSN
field in particular was rewritten so that each of the "large" PDUs
used consecutively increasing values.  While this worked, the forged
DataSN values did not match the ExpDataSN value in the subsequent SCSI
Response PDU.  The initiator does not currently verify this value, but
the forged DataSN values prevent adding a check.

To avoid this, allow a logical iSCSI PDU (struct icl_pdu) to describe
a burst of PDUs via a new 'ip_additional_pdus' field.  Normally this
field is set to zero when 'struct icl_pdu' represents a single PDU.
If logical PDU represents a burst of on-the-wire PDUs, then 'ip_npdus'
contains the count of additional on-the-wire PDUs.  The header of this
"large" PDU is still modified, but the DataSN field now contains the
DataSN value of the first on-the-wire PDU in the burst.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31577
2021-08-18 10:56:28 -07:00
John Baldwin
f0594f52f6 iSCSI: Add support for segmentation offload for hardware offloads.
Similar to TSO, iSCSI segmentation offload permits the upper layers to
submit a "large" virtual PDU which is split up into multiple segments
(PDUs) on the wire.  Similar to how the TCP/IP headers are used as
templates for TSO, the BHS at the start of a large PDU is used as a
template to construct the specific BHS at the start of each PDU.  In
particular, the DataSN is incremented for each subsequent PDU, and the
'F' flag is only set on the last PDU.

struct icl_conn has a new 'ic_hw_isomax' field which defaults to 0,
but can be set to the largest virtual PDU a backend supports.  If this
value is non-zero, the iSCSI target and initiator use this size
instead of 'ic_max_send_data_segment_length' to determine the maximum
size for SCSI Data-In and SCSI Data-Out PDUs.  Note that since PDUs
can be constructed from multiple buffers before being dispatched, the
target and initiator must wait for the PDU to be fully constructed
before determining the number of DataSN values were consumed (and thus
updating the per-transfer DataSN value used for the start of the next
PDU).

The target generates large PDUs for SCSI Data-In PDUs in
cfiscsi_datamove_in().  The initiator generates large PDUs for SCSI
Data-Out PDUs generated in response to an R2T.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31222
2021-08-06 14:03:00 -07:00
John Baldwin
87322a9075 iscsi: Remove icl_soft-only fields from struct icl_conn.
Create a struct icl_soft_conn which extends struct icl_conn and
move fields only used by icl_soft from struct icl_conn to
struct icl_soft_conn.

Reviewed by:	mav
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31414
2021-08-05 12:05:30 -07:00
Warner Losh
30f8afd027 cam: fix xpt_bus_register and xpt_bus_deregister return errno
xpt_bus_register and xpt_bus_deregister returns a hybrid error that's
neither a cam_status, nor an errno, but a mix of both.  Update
xpt_bus_register and xpt_bus_deregister to return an errno. The vast
majority of current users compare against zero, which can also be
spelled CAM_SUCCESS. Nobody uses CAM_FAILURE, so remove that symbol
to prevent comfusion (nothing returns it either).

Where the return value is saved, ensure that the variable 'error' is
used to store an errno and 'status' is used to store a cam_status where
it makes the code clearer (usually just in functions that already mix
and match). Where the return value isn't used at all, avoid storing it
at all.

Reviewed by:		scottl@, mav@ (earlier version)
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D30860
2021-06-28 16:13:03 -06:00
Mark Johnston
a100217489 Consistently use the SOCKBUF_MTX() and SOCK_MTX() macros
This makes it easier to change the socket locking protocols.  No
functional change intended.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-06-14 17:32:32 -04:00
John Baldwin
0cc7d64a2a iscsi: Move the maximum data segment limits into 'struct icl_conn'.
This fixes a few bugs in iSCSI backends where the backends were using
the limits they advertised initially during the login phase as the
final values instead of the values negotiated with the other end.

Reported by:	Jithesh Arakkan @ Chelsio
Reviewed by:	mav
Differential Revision:	https://reviews.freebsd.org/D30271
2021-05-20 09:59:11 -07:00
John Baldwin
89df484739 iscsi: Kick threads out of iscsi_ioctl() during unload.
iscsid can be sleeping in iscsi_ioctl() causing the destroy_dev() to
sleep forever if iscsi.ko is unloaded while iscsid is running.

Reported by:	Jithesh Arakkan @ Chelsio
Reviewed by:	mav
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D29688
2021-04-12 13:58:21 -07:00
Alexander Motin
afc3e54eee Move ic_check_send_space clear to the actual check.
It closes tiny race when the flag could be set between being cleared
and the space is checked, that would create us some more work.  The
flag setting is protected by both locks, so we can clear it in either
place, but in between both locks are dropped.

MFC after:	1 week
2021-03-03 15:29:35 -05:00
Alexander Motin
aff9b9ee89 Restore condition removed in df3747c660.
I think it allowed to avoid some TX thread wakeups while the socket
buffer is full.  But add there another options if ic_check_send_space
is set, which means socket just reported that new space appeared, so
it may have sense to pull more data from ic_to_send for better TX
coalescing.

MFC after:	1 week
2021-03-03 12:03:08 -05:00
Alexander Motin
df3747c660 Replace STAILQ_SWAP() with simpler STAILQ_CONCAT().
Also remove stray STAILQ_REMOVE_AFTER(), not causing problems only
because STAILQ_SWAP() fixed corrupted stqh_last.

MFC after:	1 week
2021-03-02 18:48:49 -05:00
Alexander Motin
06e9c71099 Fix initiator panic after 6895f89fe5.
There are sessions without socket that are not disconnecting yet.

MFC after:	3 weeks
2021-03-02 16:11:29 -05:00
Alexander Motin
b85a67f54a Optimize TX coalescing by keeping pointer to last mbuf.
Before m_cat() each time traversed through all the coalesced chain.

MFC after:	1 week
2021-03-01 23:34:11 -05:00
Alexander Motin
6895f89fe5 Coalesce socket reads in software iSCSI.
Instead of 2-4 socket reads per PDU this can do as low as one read
per megabyte, dramatically reducing TCP overhead and lock contention.

With this on iSCSI target I can write more than 4GB/s through a
single connection.

MFC after:	1 month
2021-02-22 12:51:59 -05:00
John Baldwin
47769bc557 iscsi: Mark iSCSI CAM sims as non-pollable.
Previously, iscsi_poll() just panicked.  This meant if you got a panic
on a box when using the iSCSI initiator, the attempt to shutdown would
trigger a nested panic and never write out a core.  Now, CCB's sent to
iSCSI devices (such as the sychronize-cache request in dashutdown())
just fail with a timeout during a panic shutdown.

Reviewed by:	scottl, mav
MFC after:	2 weeks
Sponsored by:	Chelsio
Differential Revision:	https://reviews.freebsd.org/D28455
2021-02-11 13:52:18 -08:00
Alexander Motin
3dd2a7a5ea Make DataSN counter of solicited Data-Out local.
DataSN for solicited Data-Out is per-R2T.  Since we handle whole R2T
in one go, we don't need to store it anywhere, especially in global
per-command structure.  This may allow us to handle multiple R2T per
command at once, if we decide, or may be relax locking.

Rename the second use of that field to io_referenced_task_tag.

MFC after:	1 month
2021-02-02 13:56:47 -05:00
Alexander Motin
b75168ed24 Make software iSCSI more configurable.
Move software iSCSI tunables/sysctls into kern.icl.soft subtree.
Replace several hardcoded length constants there with variables.

While there, stretch the limits to better match Linux' open-iscsi
and our own initiator with new MAXPHYS of 1MB.  Our CTL target is
also optimized for up to 1MB I/Os, so there is also a match now.
For Windows 10 and VMware 6.7 initiators at default settings it
should make no change, since previous limits were sufficient there.

Tests of QD1 1MB writes from FreeBSD over 10GigE link show throughput
increase by 29% on idle connection and 132% with concurrent QD8 reads.

MFC after:	3 days
Sponsored by:	iXsystems, Inc.
2021-01-28 16:22:45 -05:00
Alexander Motin
9bee9a98ff Exclude reserved iSCSI Initiator Task Tag.
RFC 7143 (11.2.1.8):
   An ITT value of 0xffffffff is reserved and MUST NOT be assigned for a
   task by the initiator.  The only instance in which it may be seen on
   the wire is in a target-initiated NOP-In PDU (Section 11.19) and in
   the initiator response to that PDU, if necessary.

MFC after:	1 month
2021-01-24 14:23:04 -05:00
Alexander Motin
ff751ee05c Remove FirstBurstLength limit for software iSCSI.
For hardware offload solicited data may potentially be handled more
efficiently than unsolicited due to direct data placement.  Or there
can be some unsolicited write buffering limitations.  It may create
situations where FirstBurstLength limit is really useful.

Software driver though has no those factors, having to do memcopy()
any way and having no so hard limit on the temporary storage.  Same
time more active use of unsolicited transfers allows to avoid some
of Ready To Transfer (R2T) PDU round-trip times and processing.

This change effectively doubles from 64KB to 128KB the maximum size
of write command that can be transferred within one link RTT.  Tests
of (64KB, 128KB] QD1 writes mixed with simultaneous QD8 reads over
the same connection, increasing RTT, shows almost double write speed
and half latency, while we should be able to afford few megabytes of
RAM for additional buffering on a target these days.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2021-01-20 23:17:12 -05:00
Mateusz Guzik
6b3a9a0f3d Convert remaining cap_rights_init users to cap_rights_init_one
semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
2021-01-12 13:16:10 +00:00
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Edward Tomasz Napierala
bce7ee9d41 Drop "All rights reserved" from all my stuff. This includes
Foundation copyrights, approved by emaste@.  It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make similar change.

Reviewed by:	emaste, imp, gbe (manpages)
Differential Revision:	https://reviews.freebsd.org/D26980
2020-10-28 13:46:11 +00:00
Alexander Motin
8836496815 Introduce support of SCSI Command Priority.
SAM-3 specification introduced concept of Task Priority, that was renamed
to Command Priority in SAM-4, and supported by all modern SCSI transports.
It provides 15 levels of relative priorities: 1 - highest, 15 - lowest and
0 - default.  SAT specification for SATA devices translates priorities 1-3
into NCQ high priority.

This change adds new "priority" field into empty spots of struct ccb_scsiio
and struct ccb_accept_tio of CAM and struct ctl_scsiio of CTL.  Respective
support is added into iscsi(4), isp(4), mpr(4), mps(4) and ocs_fc(4) drivers
for both initiator and where applicable target roles.  Minimal support was
added to CTL to receive the priority value from different frontends, pass it
between HA controllers and report in few places.

This patch does not add consumers of this functionality, so nothing should
really change yet, since the field is still set to 0 (default) on initiator
and not actively used on target.  Those are to be implemented separately.

I've confirmed priority working on WD Red SATA disks connected via mpr(4)
and properly transferred to CTL target via iscsi(4), isp(4) and ocs_fc(4).

While there, added missing tag_action support to ocs_fc(4) initiator role.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2020-10-25 19:34:02 +00:00
Richard Scheffenegger
4dfbcffbb9 Add network QoS support for PCP to iscsi initiator.
Make the Ethernet PCP codepoint configurable
for L2 local traffic, to allow lower latency for
iSCSI block IO. This addresses the initiator
side only.

Reviewed by:	mav, trasz, bcr
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D26739
2020-10-24 21:07:13 +00:00
Alexander Motin
7dbbd1aeae Negotiate iSCSIProtocolLevel of 2 (RFC 7144) in initiator.
It does not change anything immediately, but allows further support of
Command Priority, Status Qualifier and new task management functions.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2020-10-22 20:26:27 +00:00
Edward Tomasz Napierala
186bcdaac7 If the SIM freezes the queue at exactly the wrong moment, after
another thread has started to send in a CCB and already checked
the queue wasn't frozen, we would end up with iscsi_action()
being called despite the queue is now frozen.

Add a check to make sure this doesn't happen . Perhaps this should
be fixed at the CAM level instead, but given how the send queue and
SIM are governed by two separate mutexes, it is somewhat hard to do.

Reviewed by:	imp, mav
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26750
2020-10-18 16:30:49 +00:00
Richard Scheffenegger
bfabdade5c Add DSCP support for network QoS to iscsi initiator.
Allow the DSCP codepoint also to be configurable
for the traffic in the direction from the initiator
to the target, such that writes and any requests
are also treated in the appropriate QoS class.

Reviewed by:	mav
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D26714
2020-10-09 14:33:09 +00:00
Mateusz Guzik
2140d5b64f iscsi: clean up empty lines in .c and .h files 2020-09-01 21:30:22 +00:00
Alexander Motin
9a4510ac32 Implement zero-copy iSCSI target transmission/read.
Add ICL_NOCOPY flag to icl_pdu_append_data(), specifying that the method
can just reference the data buffer instead of immediately copying it.

Extend the offload KPI with optional PDU queue method, allowing to specify
completion callback, called when all the data referenced by above has been
transferred and won't be accessed any more (the buffers can be freed).

Implement the above functionality in software iSCSI driver using mbufs
with external storage and reference counter.  Note that some NICs (ixl(4))
may keep the mbuf in TX queue for a long time, so CTL has to be ready.

Add optional method to struct ctl_scsiio for buffer reference counting.
Implement it for CTL block backend, allowing to delay free of the struct
ctl_be_block_io and memory it references as needed.  In first reincarnation
of the patch I tried to delay whole I/O as it is done for FibreChannel,
that was cleaner, but due to the above callback delays I had to rewrite
it this way to not leave LUN referenced potentially for hours or more.

All together on sequential read from ZFS ARC this saves about 30% of CPU
time and memory bandwidth by avoiding one of 3 memory copies (the other
two are from ZFS ARC to DMU cache and then from DMU cache to CTL buffers).
On tests with 2x Xeon Silver 4114 this allows to reach full line rate of
100GigE NIC.  Tests with Gold CPUs and two 100GigE NICs are stil TBD,
but expectations to saturate them are pretty high. ;)

Discussed with:	Chelsio
Sponsored by:	iXsystems, Inc.
2020-06-08 20:53:57 +00:00