Let firmware do its best first, and if it can't, try software recovery.
I would remove the software timeout handler completely, but I found a
bunch of complaints about command timeouts on the sparc64 mailing list
from a few years ago, so better be safe in case of interrupt loss.
MFC after: 2 weeks
For 24xx and above use 2 vectors (default and response queue).
For 26xx and above use 3 vectors (default, response and ATIO queues).
Due to the global lock, interrupt handlers never run simultaneously now,
but at least this saves one register read per interrupt.
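
The vector counts above can be sketched as a small selection function.
This is only an illustration: the enum and function names are
hypothetical and are not the driver's own identifiers, and the
single-vector case for older chips is an assumption rather than something
stated here.

```c
/* Hypothetical chip generations, not the driver's real type macros. */
enum chip_gen {
	GEN_PRE_24XX,	/* anything older than 24xx */
	GEN_24XX,	/* 24xx and above, below 26xx */
	GEN_26XX	/* 26xx and above */
};

/* Number of interrupt vectors to ask for on a given generation. */
static int
vectors_wanted(enum chip_gen gen)
{
	switch (gen) {
	case GEN_26XX:
		return (3);	/* default, response queue, ATIO queue */
	case GEN_24XX:
		return (2);	/* default, response queue */
	default:
		return (1);	/* assumed: one shared vector */
	}
}
```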
MFC after: 2 weeks
Since we support RQSTYPE_RPT_ID_ACQ, that functionality is only useful
in loop mode, which is probably not worth keeping this hack for in 2017.
MFC after: 2 weeks
Instead of a single isp_intr() function doing all possible magic,
introduce four different functions to handle mailbox operation
completions, async events, and the response and ATIO queues. The goal is
to isolate the different code paths to make the code more readable and to
make it easier to support multiple interrupt vectors. Even the oldest
hardware can in many cases identify which code path it should run on an
interrupt. Contemporary hardware can assign them to different interrupt
vectors.
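
A minimal sketch of the split, assuming a hypothetical bitmask that says
which paths need service. The cause bits, the counters structure and all
function names are invented for illustration; the real handlers and the
way hardware reports the cause differ.

```c
#include <stdint.h>

/* Hypothetical cause bits, one per code path. */
#define CAUSE_MBOX	0x1u	/* mailbox operation completed */
#define CAUSE_ASYNC	0x2u	/* async event from firmware */
#define CAUSE_RESP	0x4u	/* response queue has entries */
#define CAUSE_ATIO	0x8u	/* ATIO queue has entries */

/* Stand-in for driver state: counts how often each path ran. */
struct counters {
	int mbox, async, resp, atio;
};

static void on_mbox(struct counters *c)  { c->mbox++; }
static void on_async(struct counters *c) { c->async++; }
static void on_resp(struct counters *c)  { c->resp++; }
static void on_atio(struct counters *c)  { c->atio++; }

/*
 * Shared-vector case: run only the paths the cause word names.
 * With separate vectors, each vector would call its handler directly.
 */
static void
dispatch(struct counters *c, uint32_t cause)
{
	if (cause & CAUSE_MBOX)
		on_mbox(c);
	if (cause & CAUSE_ASYNC)
		on_async(c);
	if (cause & CAUSE_RESP)
		on_resp(c);
	if (cause & CAUSE_ATIO)
		on_atio(c);
}
```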
MFC after: 2 weeks
It was implemented to reduce context switches when uploading firmware to
the card's RAM. But this mechanism has not been used for the last 10
years, since all mbox operations are now polled, and it was never used for
cards produced in the last 15 years. Newer cards can use DMA to upload
firmware.
MFC after: 2 weeks
This change fixes a DMA resource leak on driver unload. It also removes
the allocation of DMA resources for a hardcoded number of requests before
fetching the real number from firmware, and it prepares the ground for
more flexible IRQ allocation according to firmware capabilities.
MFC after: 2 weeks
It's more important for SPI HBAs, as they don't support CDBs above 12 bytes.
The new error code makes CAM fall back to alternative commands.
MFC after: 2 weeks
The isp(4) driver was changing the tag type for REQUEST SENSE
commands to Head of Queue, when the CAM CCB flag
CAM_TAG_ACTION_VALID was NOT set. CAM_TAG_ACTION_VALID is set
when the tag action in the XPT_SCSI_IO is not CAM_TAG_ACTION_NONE
and when the target has tagged queueing turned on.
In most cases when CAM_TAG_ACTION_VALID is not set, it is because
the target is not doing tagged queueing. In those cases, trying to
send a Head of Queue tag may cause problems. Instead, default to
sending a simple tag.
IBM tape drives claim to support tagged queueing in their standard
Inquiry data, but have the DQue bit set in the control mode page
(mode page 10). CAM correctly detects that these drives do not
support tagged queueing, and clears the CAM_TAG_ACTION_VALID flag
on CCBs sent down to the drives.
This caused the isp(4) driver to go down the path of setting the
tag action to a default value, and for Request Sense commands only,
set the tag action to Head of Queue.
If an IBM tape drive does get a Head of Queue tag, it rejects it with
Invalid Message Error (0x49,0x00). (The Qlogic firmware translates that
to a Transport Error, which the driver translates to an Unrecoverable
HBA Error, or CAM_UNREC_HBA_ERROR.) So, by default, it wasn't possible
to get a good response from a REQUEST SENSE to an FC-attached IBM
tape drive with the isp(4) driver.
IBM tape drives (tested on an LTO-5 with G9N1 firmware and a TS1150
with 4470 firmware) also have a bug in that sending a command with a
non-simple tag attribute breaks the tape drive's Command Reference
Number (CRN) accounting and causes it to ignore all subsequent
commands because it and the initiator disagree about the next
expected CRN. The drives do reject the initial command with a head
of queue tag with an Invalid Message Error (0x49,0x00), but after that
they ignore any subsequent commands. IBM confirmed that it is a bug,
and sent me test firmware that fixes the bug. However tape drives in
the field will still exhibit the bug until they are upgraded.
Request Sense is not often sent to targets because most errors are
reported automatically through autosense in Fibre Channel and other
modern transports. ("Modern" meaning post SCSI-2.) So this is not
an error that would crop up frequently. But Request Sense is useful on
tape devices to report status information, aside from error reporting.
This problem is less serious without FC-Tape features turned on,
specifically precise delivery of commands (which enables Command
Reference Numbers), enabled on the target and initiator. Without
FC-Tape features turned on, the target would return an error and
things would continue on.
And it also does not cause problems for targets that do tagged
queueing, because in those cases the isp(4) driver just uses the
tag type that is specified in the CCB, assuming the
CAM_TAG_ACTION_VALID flag is set, and defaults to sending a Simple
tag action if it isn't an ordered or head of queue tag.
sys/dev/isp/isp.c:
In isp_start(), don't try to send Request Sense commands
with the Head of Queue tag attribute if the CCB doesn't
have a valid tag action. The tag action likely isn't valid
because the target doesn't support tagged queueing.
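
The resulting tag selection can be sketched roughly as follows. This is
not the code in isp_start(): the enum and the function are hypothetical
stand-ins, and the boolean parameter stands in for testing
CAM_TAG_ACTION_VALID on the CCB.

```c
#include <stdbool.h>

/* Hypothetical tag types, not the CAM or firmware constants. */
enum tag_type {
	TAG_SIMPLE,
	TAG_HEAD_OF_QUEUE,
	TAG_ORDERED
};

/*
 * Pick the tag attribute for a command.
 * tag_action_valid: stand-in for the CAM_TAG_ACTION_VALID flag.
 * requested: the tag action carried in the CCB.
 */
static enum tag_type
pick_tag(bool tag_action_valid, enum tag_type requested)
{
	/*
	 * No valid tag action: always fall back to a simple tag.
	 * Before the fix, REQUEST SENSE got Head of Queue here.
	 */
	if (!tag_action_valid)
		return (TAG_SIMPLE);

	/* Valid tag action: honor ordered and head of queue requests. */
	switch (requested) {
	case TAG_ORDERED:
	case TAG_HEAD_OF_QUEUE:
		return (requested);
	default:
		return (TAG_SIMPLE);
	}
}
```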
Sponsored by: Spectra Logic
MFC after: 3 days
Replace archaic "busses" with modern form "buses."
Intentionally excluded:
* Old/random drivers I didn't recognize
* Old hardware in general
* Use of "busses" in code as identifiers
No functional change.
http://grammarist.com/spelling/buses-busses/
PR: 216099
Reported by: bltsrc at mail.ru
Sponsored by: Dell EMC Isilon
This code was originally implemented 7 years ago, but never really worked
due to a trivial error. I think this functionality may not be required.
Initiators supporting optional periodic command status checks detected
those terminated commands and retried them 3 seconds later. But
considering less featureful initiators, and the fact that it is our own
race that makes virtual ports "unknown", it may be good to have this
feature.
It is normal for ZOMBIE ports to be logged out. This status is not really
an error until the Gone Device Timeout expires, so make CAM retry after a
delay.
MFC after: 1 week
Firmware automatically logs in only to local loop ports, and those ports
can be easily identified without an extra flag by their zero domain and
area IDs.
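
As an illustration of that check: a 24-bit Fibre Channel port ID carries
the domain in bits 23..16 and the area in bits 15..8, so a local loop port
has both of those bytes zero. The function name below is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the 24-bit port ID has zero domain (bits 23..16)
 * and zero area (bits 15..8), leaving only the low byte set.
 */
static bool
is_local_loop_port(uint32_t portid)
{
	return ((portid & 0xffff00u) == 0);
}
```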
MFC after: 1 week
Since we no longer need additional buffers for request and response IOCBs,
we can increase receive space by 192 bytes, which is enough for fetching
48 more ports. The new limit is 1020 fabric ports per virtual port.
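
The arithmetic behind those numbers, as a small sketch. The per-port size
and the previous limit are not stated here; they are only derived from the
three figures given (192 bytes, 48 ports, 1020 ports), and the names are
hypothetical.

```c
#include <stddef.h>

#define EXTRA_BYTES	192	/* added receive space, from the text */
#define EXTRA_PORTS	48	/* additional ports it holds, from the text */
#define NEW_LIMIT	1020	/* new per-virtual-port limit, from the text */

/* Derived: receive space consumed by one port entry. */
static size_t
bytes_per_port(void)
{
	return (EXTRA_BYTES / EXTRA_PORTS);
}

/* Derived: what the limit would have been before the increase. */
static size_t
previous_limit(void)
{
	return (NEW_LIMIT - EXTRA_PORTS);
}

/* Derived: receive space taken by a full list of port entries. */
static size_t
port_list_bytes(void)
{
	return (NEW_LIMIT * bytes_per_port());
}
```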
MFC after: 1 month
This should close the race between a request arriving on a new target mode
virtual port and its scanner thread finally fetching its address for
request routing.
For some reason firmware sends Port Database Changed notifications in
response to explicit login requests from the driver when the target port
is unavailable. Those notifications don't give the driver any new
information, but only cause an infinite scan loop.
Previously we had to do it synchronously because we could not drop the lock
due to potential scratch memory use conflicts. Previous commits fixed that
collision, so now the slower and less reliable external requests are
executed asynchronously, without spinning in a tight loop and with safer
timeout handling.
Usually IOCBs should be put on the queue for asynchronous processing and
should not require additional DMA memory. But there are some cases, like
aborts and resets, that for external reasons have to be synchronous. Give
those cases a separate 2*64 byte DMA area to decouple them from the other
users of the DMA scratch area, which use it for asynchronous requests.
This is a cosmetic change that simplifies identification of new ports on
an FC switch. It would be good to use the target name from CTL here
instead of the hostname, but it is not passed here through CAM now.
MFC after: 2 weeks
This space does not require DMA syncing. It reduces the lock scope of the
DMA scratch space. It allows the whole DMA scratch space to be used for
I/O, so now we can fetch up to ~1000 ports from SNS.
Due to the last fact, increase the maximum number of ports from 256 to
1024.
Before this change, virtual port control IOCBs were executed synchronously
via the Execute IOCB mailbox command. That required exclusive use of the
driver's scratch space and the hardware's mailbox registers. Because of
that use of shared resources, this code could not really sleep and had to
spin for completion, blocking any other operation.
This change introduces a new asynchronous design, sending the IOCBs
directly on the request queue and gracefully waiting for their return on
the response queue. Returned IOCBs are identified with the unified handle
space from r292725.
I am not sure why this was split long ago, but I see no reason for it.
At this point this unification just slightly reduces memory usage, but
as the next step I plan to reuse the shared handle space for other IOCB
types.
- Make a scan aborted by an event restart immediately and without limit.
- Improve handling of some loop events from firmware.
- Remove the loop down timer, adding its functionality to the scanner
  thread.
- Some more unification and simplification.
Hacks to enable target mode there complicated the code while not really
working, and fixing it for outdated hardware is not really interesting.
Initiator mode, tested with a Qlogic 1080 adapter, still works fine.