opnsense-src/sys/dev/cxgbe
Navdeep Parhar 7951040f8a cxgbe(4): major tx rework.
a) Front load as much work as possible in if_transmit, before any driver
lock or software queue has to get involved.

b) Replace buf_ring with a brand new mp_ring (multiproducer ring).  This
is specifically for the tx multiqueue model where one of the if_transmit
producer threads becomes the consumer and other producers carry on as
usual.  mp_ring is implemented as standalone code and it should be
possible to use it in any driver with tx multiqueue.  It also has:
- the ability to enqueue/dequeue multiple items.  This might become
  significant if packet batching is ever implemented.
- an abdication mechanism to allow a thread to give up writing tx
  descriptors and have another if_transmit thread take over.  A thread
  that's writing tx descriptors can end up doing so for an unbounded
  time period if a) there are other if_transmit threads continuously
  feeding the software queue, and b) the chip keeps up with whatever the
  thread is throwing at it.
- accurate statistics about interesting events even when the stats come
  at the expense of additional branches/conditional code.

The NIC txq lock is uncontested on the fast path at this point.  I've
left it there for synchronization with the control events (interface
up/down, modload/unload).
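
The producer-becomes-consumer idea in (b) can be illustrated with a
minimal sketch.  This is not the code in t4_mp_ring.c: the names
(ring_enqueue, drain_ring, sum_consumer), the 8-slot ring, the
single-item enqueue, and the packed 64-bit state word are assumptions
made for illustration, and abdication and statistics are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical size; one slot always stays empty */

enum { IDLE = 0, BUSY = 1 };

/* All ring state lives in one 64-bit word so a single CAS can update it. */
union ring_state {
	struct { uint16_t pidx_head, pidx_tail, cidx, flags; };
	uint64_t word;
};

struct ring {
	_Atomic uint64_t state;
	void *items[RING_SIZE];
	void (*consume)(void *item);	/* stands in for writing a tx descriptor */
};

/* Example consumer: adds up the ints it is handed. */
static long consumed_total;
static void sum_consumer(void *item) { consumed_total += *(int *)item; }

static unsigned
space_available(union ring_state s)
{
	return ((s.cidx + RING_SIZE - s.pidx_head - 1) % RING_SIZE);
}

/* Runs only in the one thread that moved the ring from IDLE to BUSY. */
static void
drain_ring(struct ring *r)
{
	union ring_state os, ns;

	os.word = atomic_load(&r->state);
	assert(os.flags == BUSY);
	for (;;) {
		if (os.cidx == os.pidx_tail) {
			/* Nothing published: try to go idle and stop consuming. */
			ns = os;
			ns.flags = IDLE;
			if (atomic_compare_exchange_weak(&r->state, &os.word, ns.word))
				return;
			continue;	/* state changed under us; look again */
		}
		r->consume(r->items[os.cidx]);
		uint16_t next = (uint16_t)((os.cidx + 1) % RING_SIZE);
		do {
			ns = os;
			ns.cidx = next;
		} while (!atomic_compare_exchange_weak(&r->state, &os.word, ns.word));
		os = ns;
	}
}

/* Returns 0 on success, -1 if the ring is full. */
int
ring_enqueue(struct ring *r, void *item)
{
	union ring_state os, ns;

	/* Step 1: reserve a slot by advancing pidx_head. */
	os.word = atomic_load(&r->state);
	do {
		if (space_available(os) == 0)
			return (-1);
		ns = os;
		ns.pidx_head = (uint16_t)((os.pidx_head + 1) % RING_SIZE);
	} while (!atomic_compare_exchange_weak(&r->state, &os.word, ns.word));

	uint16_t slot = os.pidx_head;
	r->items[slot] = item;

	/* Step 2: publish in reservation order and mark the ring BUSY. */
	for (;;) {
		os.word = atomic_load(&r->state);
		if (os.pidx_tail != slot)
			continue;	/* an earlier producer has not published yet */
		ns = os;
		ns.pidx_tail = (uint16_t)((slot + 1) % RING_SIZE);
		ns.flags = BUSY;
		if (atomic_compare_exchange_weak(&r->state, &os.word, ns.word))
			break;
	}

	/* Step 3: whoever found the ring IDLE becomes the consumer. */
	if (os.flags == IDLE)
		drain_ring(r);
	return (0);
}
```

A producer that finds the ring already BUSY returns right after
publishing its item, which is the "other producers carry on as usual"
behavior described above.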

c) Add support for "type 1" coalescing work request in the normal NIC tx
path.  This work request is optimized for frames with a single item in
the DMA gather list.  These are very common when forwarding packets.
Note that netmap tx in cxgbe already uses these "type 1" work requests.

d) Do not request automatic cidx updates every 32 descriptors.  Instead,
request updates via bits in individual work requests (still every 32
descriptors approximately).  Also, request an automatic final update
when the queue idles after activity.  This means NIC tx reclaim is still
performed lazily but it will catch up quickly as soon as the queue
idles.  This seems to be the best middle ground and I'll probably do
something similar for netmap tx as well.
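
The per-work-request update scheme in (d) might look roughly like the
sketch below.  The flag name, the struct, and the helper are
hypothetical and are not taken from t4_sge.c; the only detail borrowed
from the text above is the interval of about 32 descriptors.

```c
#include <assert.h>
#include <stdint.h>

#define UPDATE_INTERVAL		32u	/* descriptors between update requests */
#define WR_REQUEST_CIDX_UPDATE	0x1u	/* hypothetical per-WR flag bit */

struct txq_sketch {
	unsigned descs_since_update;	/* descriptors written since last request */
};

/*
 * Returns the flag bits to set in a work request that occupies ndesc
 * descriptors.  The update is requested on the first work request that
 * brings the running count to 32 or more, so the spacing is only
 * approximately 32 when work requests span several descriptors.
 */
static uint32_t
wr_flags_for(struct txq_sketch *q, unsigned ndesc)
{
	assert(ndesc > 0);
	q->descs_since_update += ndesc;
	if (q->descs_since_update < UPDATE_INTERVAL)
		return (0);
	q->descs_since_update = 0;
	return (WR_REQUEST_CIDX_UPDATE);
}
```

With 3-descriptor work requests, for example, the flag lands on the
11th request (33 descriptors), not on an exact multiple of 32.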

e) Implement a faster tx path for WRQs (used by TOE tx and control
queues, _not_ by the normal NIC tx).  Allow work requests to be written
directly to the hardware descriptor ring if room is available.  I will
convert t4_tom and iw_cxgbe modules to this faster style gradually.
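
The decision in (e) can be sketched as follows.  All names are
hypothetical, and the rule that nothing may bypass already-parked work
requests is an assumption made here to keep submissions in order; the
driver's actual WRQ code is not shown.

```c
#include <assert.h>

struct wrq_sketch {
	unsigned ring_avail;	/* free descriptors in the hardware ring */
	unsigned pending;	/* work requests parked in software */
};

enum wr_path { WR_DIRECT, WR_QUEUED };

/*
 * Decide where a work request needing ndesc descriptors goes.  It is
 * written straight to the ring only if there is room and nothing is
 * already parked ahead of it (assumed ordering rule).
 */
static enum wr_path
wrq_submit(struct wrq_sketch *q, unsigned ndesc)
{
	assert(ndesc > 0);
	if (q->pending == 0 && q->ring_avail >= ndesc) {
		q->ring_avail -= ndesc;
		return (WR_DIRECT);
	}
	q->pending++;
	return (WR_QUEUED);
}
```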

MFC after:	2 months
2014-12-31 23:19:16 +00:00
..
common cxgbe(4): figure out the max payload size and save it for later. 2014-11-19 20:16:56 +00:00
firmware cxgbe(4): adjust PMRX and PMTX parameters. 2014-11-10 19:45:28 +00:00
iw_cxgbe iw_cxgbe: don't forget to close the socket in c4iw_connect if soconnect 2014-11-13 03:59:36 +00:00
tom Check for SS_NBIO in so->so_state instead of sb->sb_flags in 2014-12-15 17:52:08 +00:00
adapter.h cxgbe(4): major tx rework. 2014-12-31 23:19:16 +00:00
offload.h Some hooks in cxgbe(4) for the offloaded iSCSI driver. 2014-07-24 18:39:08 +00:00
osdep.h Add hooks in base cxgbe(4) for the iWARP upper-layer driver. Update a 2013-08-28 20:45:45 +00:00
t4_ioctl.h cxgbe(4): T4_SET_SCHED_CLASS and T4_SET_SCHED_QUEUE ioctls to program 2013-12-03 18:34:52 +00:00
t4_l2t.c cxgbe(4): major tx rework. 2014-12-31 23:19:16 +00:00
t4_l2t.h cxgbe(4): Updates to the hardware L2 table management code. 2013-01-14 20:36:22 +00:00
t4_main.c cxgbe(4): major tx rework. 2014-12-31 23:19:16 +00:00
t4_mp_ring.c cxgbe(4): major tx rework. 2014-12-31 23:19:16 +00:00
t4_mp_ring.h cxgbe(4): major tx rework. 2014-12-31 23:19:16 +00:00
t4_netmap.c Whitespace nit. 2014-09-09 18:36:00 +00:00
t4_sge.c cxgbe(4): major tx rework. 2014-12-31 23:19:16 +00:00
t4_tracer.c cxgbe(4): Remove stray if_up from the code that creates the tracing ifnet. 2014-05-23 01:45:44 +00:00