956 commits

Author SHA1 Message Date
John Baldwin
8cce4145fa Add support for KTLS RX over TOE to T6.
This largely reuses the TLS TOE support added in r330884.  However,
this uses the KTLS framework in upstream OpenSSL rather than requiring
Chelsio-specific patches to OpenSSL.  As with the existing TLS TOE
support, use of RX offload requires setting the tls_rx_ports sysctl.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24453
2020-04-27 23:59:42 +00:00
John Baldwin
f1f9347546 Initial support for kernel offload of TLS receive.
- Add a new TCP_RXTLS_ENABLE socket option to set the encryption and
  authentication algorithms and keys as well as the initial sequence
  number.

- When reading from a socket using KTLS receive, applications must use
  recvmsg().  Each successful call to recvmsg() will return a single
  TLS record.  A new TCP control message, TLS_GET_RECORD, will contain
  the TLS record header of the decrypted record.  The regular message
  buffer passed to recvmsg() will receive the decrypted payload.  This
  is similar to the interface used by Linux's KTLS RX except that
  Linux does not return the full TLS header in the control message.

- Add plumbing to the TOE KTLS interface to request either transmit
  or receive KTLS sessions.

- When a socket is using receive KTLS, redirect reads from
  soreceive_stream() into soreceive_generic().

- Note that this interface is currently only defined for TLS 1.1 and
  1.2, though I believe we will be able to reuse the same interface
  and structures for 1.3.
2020-04-27 23:17:19 +00:00
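The commit above says that each recvmsg() call returns one TLS record, with the record header delivered in the TLS_GET_RECORD control message. As a rough illustration of what such a header holds, the sketch below decodes the standard 5-byte TLS record header as it appears on the wire. The helper name is hypothetical, and the assumption that the control message uses the wire layout is mine; the kernel may well deliver the same fields in a different struct.

```python
import struct

# Wire layout of a TLS record header: content type (1 byte),
# protocol version major/minor (1 byte each), payload length
# (2 bytes, big-endian).  "!" disables padding, so the size is 5.
TLS_HEADER = struct.Struct("!BBBH")


def parse_tls_record_header(header: bytes) -> dict:
    """Decode a 5-byte TLS record header (hypothetical helper)."""
    if len(header) != TLS_HEADER.size:
        raise ValueError("expected a 5-byte TLS record header")
    rec_type, major, minor, length = TLS_HEADER.unpack(header)
    return {"type": rec_type, "version": (major, minor), "length": length}


# Application data (type 23), TLS 1.2 (version 3.3), 256-byte payload.
hdr = parse_tls_record_header(bytes([23, 3, 3, 0x01, 0x00]))
```

An application would typically check the type field to tell application data apart from alerts or handshake records before consuming the payload buffer.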
Navdeep Parhar
55eae197fc cxgbe/crypto: Fix the key size in a couple of places to catch up with
the recent OCF refactor.

Sponsored by:	Chelsio Communications
2020-04-23 23:54:23 +00:00
Navdeep Parhar
a3372bd833 cxgbe/iw_cxgbe: Create a LinuxKPI pci device for an adapter and use it
as the dma_device during RDMA registration.

cxgbe's struct device cannot be used as-is because it's a native FreeBSD
driver and ibcore is LinuxKPI based.

MFC after:	1 week
MFC after:	r360196
2020-04-22 21:54:21 +00:00
Alexander V. Chernikov
8d6708ba80 Convert TOE routing lookups to the new routing KPI.
Reviewed by:	np
Differential Revision:	https://reviews.freebsd.org/D24388
2020-04-22 07:53:43 +00:00
John Baldwin
29fe41ddd7 Retire the CRYPTO_F_IV_GENERATE flag.
The sole in-tree user of this flag has been retired, so remove this
complexity from all drivers.  While here, add a helper routine drivers
can use to read the current request's IV into a local buffer.  Use
this routine to replace duplicated code in nearly all drivers.

Reviewed by:	cem
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24450
2020-04-20 22:24:49 +00:00
John Baldwin
708652acc4 Set inp_flowid's for TOE connections.
KTLS uses the flowid to distribute software encryption tasks among its
pool of worker threads.  Without this change, all software KTLS
requests for TOE sockets ended up on the first worker thread.

Note that the flowid for TOE sockets created via connect() is not a
hash of the 4-tuple, but is instead the id of the TOE pcb (tid).  The
flowid of TOE sockets created from TOE listen sockets does use the
4-tuple RSS hash as the flowid since the firmware provides the hash in
the message containing the original SYN.

Reviewed by:	np (earlier version)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24348
2020-04-15 19:28:51 +00:00
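To see why a missing flowid funnels all work onto one thread, consider the sketch below. It assumes the worker is picked by reducing the flowid modulo the number of workers; the commit only states that the flowid is used for distribution, so the modulo mapping and both function names are illustrative assumptions.

```python
def pick_worker(flowid: int, nworkers: int) -> int:
    """Map a flowid to a worker index (assumed modulo mapping)."""
    return flowid % nworkers


def spread(flowids, nworkers: int) -> list:
    """Count how many connections land on each worker."""
    counts = [0] * nworkers
    for flowid in flowids:
        counts[pick_worker(flowid, nworkers)] += 1
    return counts


# Eight connections that all carry an unset flowid of 0.
unset = spread([0] * 8, 4)

# Eight connections with distinct flowids (for example, tids 0..7).
distinct = spread(range(8), 4)
```

With every flowid left at 0, all eight connections map to worker 0; with distinct values they spread evenly across the four workers.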
John Baldwin
f3b6d8ad2e Clear CPL_GET_TCB_RPL handler on module unload.
This fixes a panic when unloading and reloading t4_tom.ko since the
old pointer is still stored when t4_tom_load tries to set it.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24358
2020-04-15 19:23:53 +00:00
Navdeep Parhar
ddde90ac81 cxgbe/iw_cxgbe: Do not start the EP timer if soaccept fails.
This fixes a panic that would occur when the timer tried to close a
stale socket.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-04-15 03:40:33 +00:00
Andrew Gallatin
23feb56348 KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself.
While the original implementation of unmapped mbufs was a large
step forward in terms of reducing cache misses by enabling mbufs
to carry more than a single page for sendfile, they are rather
cache unfriendly when accessing the ext_pgs metadata and
data. This is because the ext_pgs part of the mbuf is allocated
separately, and almost guaranteed to be cold in cache.

This change takes advantage of the fact that unmapped mbufs
are never used at the same time as pkthdr mbufs. Given this
fact, we can overlap the ext_pgs metadata with the mbuf
pkthdr, and carry the ext_pgs meta directly in the mbuf itself.
Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array
of pages) directly after the existing m_ext.

In order to be able to carry 5 pages (which is the minimum
required for a 16K TLS record which is not perfectly aligned) on
LP64, I've had to steal ext_arg2. The only user of this in the
xmit path is sendfile, and I've adjusted it to use arg1 when
using unmapped mbufs.

This change is almost entirely mechanical, except that we
change mb_alloc_ext_pgs() to no longer allow allocating
pkthdrs, the change to avoid ext_arg2 as mentioned above,
and the removal of the ext_pgs zone.

This change saves roughly 2% "raw" CPU (~59% -> 57%), or over
3% "scaled" CPU on a Netflix 100% software kTLS workload at
90+ Gb/s on Broadwell Xeons.

In a follow-on commit, I plan to remove some hacks to avoid
accessing ext_pgs fields of mbufs, since they will now be in
cache.

Many thanks to glebius for helping to make this better in
the Netflix tree.

Reviewed by:	hselasky, jhb, rrs, glebius (early version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24213
2020-04-14 14:46:06 +00:00
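The five-page figure in the commit above follows from simple arithmetic. The sketch below assumes 4096-byte pages (an assumption; the commit does not state the page size) and counts how many pages a buffer touches given its starting offset within the first page. The function name is hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


def pages_spanned(offset: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages touched by `length` bytes starting `offset` bytes
    into the first page."""
    first_page_offset = offset % page_size
    return (first_page_offset + length + page_size - 1) // page_size


RECORD = 16 * 1024  # 16K of TLS record data

aligned = pages_spanned(0, RECORD)      # starts on a page boundary
unaligned = pages_spanned(1, RECORD)    # starts one byte into a page
worst = max(pages_spanned(off, RECORD) for off in range(PAGE_SIZE))
```

A page-aligned 16K record fits in four pages, while any other starting offset spills into a fifth, which is why the mbuf has to carry at least five page pointers.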
Navdeep Parhar
843b264a85 cxgbe(4): Make sure 'flags' is at the same offset in structs toepcb and
synq_entry.  TAILQ_ENTRY isn't always the same size as two pointers.

Reported by:	rmacklem@
MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-04-13 20:12:47 +00:00
John Baldwin
94fad5ffc6 Use both crypto engines on a T6.
A T6 adapter contains two crypto engines on separate channels.  This
commit distributes sessions between the two engines.  Previously, only
the first engine was used.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24347
2020-04-10 22:27:45 +00:00
John Baldwin
c034143269 Refactor driver and consumer interfaces for OCF (in-kernel crypto).
- The linked list of cryptoini structures used in session
  initialization is replaced with a new flat structure: struct
  crypto_session_params.  This structure includes a new mode to define
  how the other fields should be interpreted.  Available modes
  include:

  - COMPRESS (for compression/decompression)
  - CIPHER (for simple encryption/decryption)
  - DIGEST (computing and verifying digests)
  - AEAD (combined auth and encryption such as AES-GCM and AES-CCM)
  - ETA (combined auth and encryption using encrypt-then-authenticate)

  Additional modes could be added in the future (e.g. if we wanted to
  support TLS MtE for AES-CBC in the kernel we could add a new mode
  for that.  TLS modes might also affect how AAD is interpreted, etc.)

  The flat structure also includes the key lengths and algorithms as
  before.  However, code doesn't have to walk the linked list and
  switch on the algorithm to determine which key is the auth key vs
  encryption key.  The 'csp_auth_*' fields are always used for auth
  keys and settings and 'csp_cipher_*' for cipher.  (Compression
  algorithms are stored in csp_cipher_alg.)

- Drivers no longer register a list of supported algorithms.  This
  doesn't quite work when you factor in modes (e.g. a driver might
  support both AES-CBC and SHA2-256-HMAC separately but not combined
  for ETA).  Instead, a new 'crypto_probesession' method has been
  added to the kobj interface for symmetric crypto drivers.  This
  method returns a negative value on success (similar to how
  device_probe works) and the crypto framework uses this value to pick
  the "best" driver.  There are three constants for hardware
  (e.g. ccr), accelerated software (e.g. aesni), and plain software
  (cryptosoft) that give preference in that order.  One effect of this
  is that if you request only hardware when creating a new session,
  you will no longer get a session using accelerated software.
  Another effect is that the default setting to disallow software
  crypto via /dev/crypto now disables accelerated software.

  Once a driver is chosen, 'crypto_newsession' is invoked as before.

- Crypto operations are now solely described by the flat 'cryptop'
  structure.  The linked list of descriptors has been removed.

  A separate enum has been added to describe the type of data buffer
  in use instead of using CRYPTO_F_* flags to make it easier to add
  more types in the future if needed (e.g. wired userspace buffers for
  zero-copy).  It will also make it easier to re-introduce separate
  input and output buffers (in-kernel TLS would benefit from this).

  Try to make the flags related to IV handling less insane:

  - CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv'
    member of the operation structure.  If this flag is not set, the
    IV is stored in the data buffer at the 'crp_iv_start' offset.

  - CRYPTO_F_IV_GENERATE means that a random IV should be generated
    and stored into the data buffer.  This cannot be used with
    CRYPTO_F_IV_SEPARATE.

  If a consumer wants to deal with explicit vs implicit IVs, etc. it
  can always generate the IV however it needs and store partial IVs in
  the buffer and the full IV/nonce in crp_iv and set
  CRYPTO_F_IV_SEPARATE.

  The layout of the buffer is now described via fields in cryptop.
  crp_aad_start and crp_aad_length define the boundaries of any AAD.
  Previously with GCM and CCM you defined an auth crd with this range,
  but for ETA your auth crd had to span both the AAD and plaintext
  (and they had to be adjacent).

  crp_payload_start and crp_payload_length define the boundaries of
  the plaintext/ciphertext.  Modes that only do a single operation
  (COMPRESS, CIPHER, DIGEST) should only use this region and leave the
  AAD region empty.

  If a digest is present (or should be generated), its starting
  location is marked by crp_digest_start.

  Instead of using the CRD_F_ENCRYPT flag to determine the direction
  of the operation, cryptop now includes an 'op' field defining the
  operation to perform.  For digests I've added a new VERIFY digest
  mode which assumes a digest is present in the input and fails the
  request with EBADMSG if it doesn't match the internally-computed
  digest.  GCM and CCM already assumed this, and the new AEAD mode
  requires this for decryption.  The new ETA mode now also requires
  this for decryption, so IPsec and GELI no longer do their own
  authentication verification.  Simple DIGEST operations can also do
  this, though there are no in-tree consumers.

  To eventually support some refcounting to close races, the session
  cookie is now passed to crypto_getop() and clients should no longer
  set crp_session directly.

- Asymmetric crypto operation structures should be allocated via
  crypto_getkreq() and freed via crypto_freekreq().  This permits the
  crypto layer to track open asym requests and close races with a
  driver trying to unregister while asym requests are in flight.

- crypto_copyback, crypto_copydata, crypto_apply, and
  crypto_contiguous_subsegment now accept the 'crp' object as the
  first parameter instead of individual members.  This makes it easier
  to deal with different buffer types in the future as well as
  separate input and output buffers.  It's also simpler for driver
  writers to use.

- bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer.
  This understands the various types of buffers so that drivers that
  use DMA do not have to be aware of different buffer types.

- Helper routines now exist to build an auth context for HMAC IPAD
  and OPAD.  This reduces some duplicated work among drivers.

- Key buffers are now treated as const throughout the framework and in
  device drivers.  However, session key buffers provided when a session
  is created are expected to remain alive for the duration of the
  session.

- GCM and CCM sessions now only specify a cipher algorithm and a cipher
  key.  The redundant auth information is not needed or used.

- For cryptosoft, split up the code a bit such that the 'process'
  callback now invokes a function pointer in the session.  This
  function pointer is set based on the mode (in effect) though it
  simplifies a few edge cases that would otherwise be in the switch in
  'process'.

  It does split up GCM vs CCM which I think is more readable even if there
  is some duplication.

- I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC
  as an auth algorithm and updated cryptocheck to work with it.

- Combined cipher and auth sessions via /dev/crypto now always use ETA
  mode.  The COP_F_CIPHER_FIRST flag is now a no-op that is ignored.
  This was actually documented as being true in crypto(4) before, but
  the code had not implemented this before I added the CIPHER_FIRST
  flag.

- I have not yet updated /dev/crypto to be aware of explicit modes for
  sessions.  I will probably do that at some point in the future as well
  as teach it about IV/nonce and tag lengths for AEAD so we can support
  all of the NIST KAT tests for GCM and CCM.

- I've split up the existing crypto.9 manpage into several pages
  of which many are written from scratch.

- I have converted all drivers and consumers in the tree and verified
  that they compile, but I have not tested all of them.  I have tested
  the following drivers:

  - cryptosoft
  - aesni (AES only)
  - blake2
  - ccr

  and the following consumers:

  - cryptodev
  - IPsec
  - ktls_ocf
  - GELI (lightly)

  I have not tested the following:

  - ccp
  - aesni with sha
  - hifn
  - kgssapi_krb5
  - ubsec
  - padlock
  - safe
  - armv8_crypto (aarch64)
  - glxsb (i386)
  - sec (ppc)
  - cesa (armv7)
  - cryptocteon (mips64)
  - nlmsec (mips64)

Discussed with:	cem
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D23677
2020-03-27 18:25:23 +00:00
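The driver selection described in the commit above (a probe method that returns a negative value on success, with hardware preferred over accelerated software over plain software) can be modelled roughly as follows. This is a sketch, not the framework's code: the constant values, the driver names hw0/accel0/soft0, their supported modes, and the hardware_only switch are all illustrative assumptions. Any non-negative return is treated here as a refusal.

```python
import errno

# Assumed preference values: closer to zero means more preferred.
PROBE_HARDWARE = -100
PROBE_ACCEL_SOFTWARE = -200
PROBE_SOFTWARE = -500


def make_probe(score: int, modes: set):
    """Build a probe callback that accepts only the given modes."""
    def probe(params: dict) -> int:
        return score if params["mode"] in modes else errno.EINVAL
    return probe


# Hypothetical drivers, one per preference class.
DRIVERS = [
    ("soft0", make_probe(PROBE_SOFTWARE,
                         {"COMPRESS", "CIPHER", "DIGEST", "AEAD", "ETA"})),
    ("accel0", make_probe(PROBE_ACCEL_SOFTWARE, {"CIPHER", "AEAD"})),
    ("hw0", make_probe(PROBE_HARDWARE, {"AEAD"})),
]


def choose_driver(drivers, params: dict, hardware_only: bool = False):
    """Return the name of the best driver, or None if none accepts."""
    best_name, best_score = None, None
    for name, probe in drivers:
        score = probe(params)
        if score >= 0:
            continue  # driver refused the session parameters
        if hardware_only and score != PROBE_HARDWARE:
            continue  # accelerated software no longer counts as hardware
        if best_score is None or score > best_score:
            best_name, best_score = name, score
    return best_name
```

In this model an AEAD session lands on hw0, a CIPHER session on accel0, and an ETA session falls back to soft0. Requesting hardware only for a CIPHER session yields no driver at all, which mirrors the behaviour change the commit calls out.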
Navdeep Parhar
aa301e5ffe cxgbe(4): Split sge_nm_rxq into three cachelines.
This reduces the lines bouncing around between the driver rx ithread and
the netmap rxsync thread.  There is no net change in the size of the
struct (it continues to waste a lot of space).

This kind of split was originally proposed in D17869 by Marc De La
Gueronniere @ Verisign, Inc.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-03-20 05:12:16 +00:00
Navdeep Parhar
7a25fb9963 cxgbe(4): Do not display error messages related to the CLIP table if
it's not in use by TOE or KTLS.

Reviewed by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24046
2020-03-13 00:12:15 +00:00
Navdeep Parhar
87d228f935 cxgbe/t4_tom: The MSS in a FLOWC work request must not be 0.
Submitted by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-03-10 21:49:56 +00:00
Navdeep Parhar
2b9010f070 cxgbe(4): Do not try to use 0 as an rx buffer address when the driver is
already allocating from the safe zone and the allocation fails.

This bug was introduced in r357481.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-03-10 21:44:20 +00:00
Navdeep Parhar
7ba6f5493d cxgbe/t4_tom: Do not uninitialize a toepcb that has not been initialized.
This fixes the following panic:
--- trap 0xc, rip = 0xffffffff80c00411, rsp = 0xfffffe0025192840, rbp = 0xfffffe0025192860 ---
vmem_xfree() at vmem_xfree+0xd1/frame 0xfffffe0025192860
tls_uninit_toep() at tls_uninit_toep+0x78/frame 0xfffffe0025192880
free_toepcb() at free_toepcb+0x32/frame 0xfffffe00251928a0
t4_connect() at t4_connect+0x3be/frame 0xfffffe0025192950
tcp_offload_connect() at tcp_offload_connect+0xa4/frame 0xfffffe0025192990
tcp_usr_connect() at tcp_usr_connect+0xec/frame 0xfffffe00251929f0
soconnect() at soconnect+0xae/frame 0xfffffe0025192a30
kern_connectat() at kern_connectat+0xe2/frame 0xfffffe0025192a90
sys_connect() at sys_connect+0x75/frame 0xfffffe0025192ad0
amd64_syscall() at amd64_syscall+0x137/frame 0xfffffe0025192bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0025192bf0
--- syscall (98, FreeBSD ELF64, sys_connect), rip = 0x8008e9d8a, rsp = 0x7fffffffc0f8, rbp = 0x7fffffffc130 ---

Reviewed by:	jhb@
MFC after:	3 days
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D23989
2020-03-06 19:56:12 +00:00
John Baldwin
6d44e8e6b5 Rename TOE TLS stats from [rt]x_tls_* to [rt]x_toe_tls_*.
This more clearly differentiates TLS records encrypted and decrypted
in TOE connections from those encrypted via NIC TLS.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-28 00:42:27 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Navdeep Parhar
02cd773916 cxgbe(4): Congestion drops are maintained per E-channel and not per
buffer group.

This fixes a bug where congestion drops on port 1 of a T6 card would
incorrectly be counted as drops on port 0.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-19 00:48:58 +00:00
Navdeep Parhar
9a4a1be02c cxgbe/iw_cxgbe: correctly enforce the max reg_mr depth.
Reported by:	Andrew Zhu @ Netapp
Obtained from:	Chelsio Communications
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-18 20:43:10 +00:00
John Baldwin
ca3b3c573e Remove the per-TXQ tls_wrs stat.
It duplicated the kern_tls_records stat and was not conditional on NIC
TLS being enabled.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D23670
2020-02-13 22:55:45 +00:00
Navdeep Parhar
77ad00bf36 cxgbe(4): Update T4/5/6 firmwares to 1.24.12.0.
Obtained from:	Chelsio Communications
MFC after:	1 month
Sponsored by:	Chelsio Communications
2020-02-12 02:55:06 +00:00
Navdeep Parhar
21935a41fd cxgbe(4): Add native netmap support to the main interface.
This means that extra virtual interfaces (VIs) created with
hw.cxgbe.num_vis are no longer required to use netmap.  Use this
tunable to enable native netmap support on the main interface:

hw.cxgbe.native_netmap="3"

There is no change in default behavior.

Suggested by:	jch@
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-05 22:29:01 +00:00
Navdeep Parhar
f4220a703d cxgbe(4): Add a knob to allow netmap tx traffic to be checksummed by
the hardware.

hw.cxgbe.nm_txcsum=1

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-05 00:13:15 +00:00
Navdeep Parhar
ba8b75ae01 cxgbe(4): Allow nm_black_hole and nm_cong_drop to be set at any time.
The cong_drop setting will apply to queues created after the setting is
changed and not to existing queues.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-05 00:08:58 +00:00
Navdeep Parhar
3479fe20e2 cxgbe(4): Report accurate rx_buf_maxsize to netmap.
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-04 23:55:21 +00:00
Navdeep Parhar
87bbb3338e cxgbe(4): Add pfil(9) hooks to the driver's rx.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 01:09:02 +00:00
Navdeep Parhar
1486d2de9e cxgbe(4): Treat NIC rx as special and run its handler directly and not
via the t4_cpl_handler dispatch table.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 01:01:35 +00:00
Navdeep Parhar
46e1e307ed cxgbe(4): Retire the allow_mbufs_in_cluster optimization.
This simplifies the driver's rx fast path as well as the bookkeeping
code that tracks various rx buffer sizes and layouts.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 00:51:10 +00:00
Navdeep Parhar
d6f79b2710 cxgbe(4): Avoid ext_arg2 in rxb_free.
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs.  Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:50:29 +00:00
Navdeep Parhar
44c6fea82b cxgbe(4): Do not use pack boundary > 512B unless it is explicitly
requested.

This is a tradeoff between PCIe efficiency during large packet rx and
packing efficiency during small packet rx.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:30:39 +00:00
Navdeep Parhar
a9c4062a9a cxgbe(4): Initialize the rx buffer's metadata on first-use and not on
allocation.

refill_fl doesn't touch any part of a freshly allocated cluster after
this change.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:25:12 +00:00
Navdeep Parhar
9087a3df60 cxgbe(4): Only checksummed TCP should be considered for LRO.
This avoids the per-packet nanouptime in tcp_lro_rx for traffic that's
not even TCP.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:06:42 +00:00
Navdeep Parhar
46d29cab25 cxgbe/iw_cxgbe: Do not allow memory registrations with page size greater
than 128MB, which is the maximum supported by the hardware in RDMA mode.

Obtained from:	Chelsio Communications
MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-01-14 01:43:04 +00:00
Bjoern A. Zeeb
334fc5822b vnet: virtualise more network stack sysctls.
Virtualise tcp_always_keepalive, TCP and UDP log_in_vain.  All three are
set in the netoptions startup script, which we would love to run for VNETs
as well [1].

While virtualising the log_in_vain sysctls seems pointless at first for as
long as the kernel message buffer is not virtualised, it at least allows
an administrator to debug the base system or an individual jail if needed
without turning the logging on for all jails running on a system.

PR:		243193 [1]
MFC after:	2 weeks
2020-01-08 23:30:26 +00:00
Gleb Smirnoff
e9edde4110 Fix a typo - passing wrong mbuf pointer to needs_udp_csum(). Will
trigger panic only on a kernel with RATELIMIT.

Submitted by:	rrs
2020-01-07 21:29:42 +00:00
Navdeep Parhar
93065a5afd cxgbe(4): check if the firmware supports FW_RI_FR_NSMR_TPTE_WR work
request.

This is used by iw_cxgbe to figure out how best to register memory.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2019-12-18 19:10:30 +00:00
John Baldwin
93dafad57a Expand net epoch in the cxgbe TOE driver to satisfy assertions.
Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22483
2019-12-13 23:33:54 +00:00
Navdeep Parhar
c0236bd93d cxgbe(4): Use the _XT variant of the CPL used to transmit NIC traffic.
CPL_TX_PKT_XT disables the internal parser on the chip and instead
relies on the driver to provide the exact length of the L2 and L3
headers.  This allows hw checksumming and TSO to be used with L2 and
L3 encapsulations that the chip doesn't understand directly.

Note that netmap tx still uses the old CPL as it never uses the hw
to generate the checksum on tx.

Reviewed by:	jhb@
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22788
2019-12-13 20:38:58 +00:00
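With the chip's parser disabled, the L2 and L3 header lengths have to come from the driver. The commit above does not say how the driver obtains them, so the sketch below is only an illustration of what those two numbers are for a plain or VLAN-tagged Ethernet frame carrying IPv4 or IPv6. The function name is hypothetical, and IPv6 extension headers are deliberately ignored.

```python
import struct

ETHERTYPE_VLAN = 0x8100
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def l2_l3_header_lengths(frame: bytes) -> tuple:
    """Return (L2 length, L3 length) in bytes for an Ethernet frame."""
    l2len = 14  # destination MAC + source MAC + ethertype
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_VLAN:
        l2len = 18  # one 802.1Q tag adds 4 bytes
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    if ethertype == ETHERTYPE_IPV4:
        # The low nibble of the first IPv4 byte is the header length
        # in 32-bit words.
        l3len = (frame[l2len] & 0x0F) * 4
    elif ethertype == ETHERTYPE_IPV6:
        l3len = 40  # fixed IPv6 header only
    else:
        raise ValueError("unsupported ethertype 0x%04x" % ethertype)
    return l2len, l3len


# Untagged IPv4 frame with a 20-byte IP header (version/IHL byte 0x45).
plain = bytes(12) + b"\x08\x00" + bytes([0x45]) + bytes(19)

# VLAN-tagged IPv4 frame with a 24-byte IP header (version/IHL byte 0x46).
tagged = bytes(12) + b"\x81\x00" + b"\x00\x01" + b"\x08\x00" \
    + bytes([0x46]) + bytes(23)
```

Here l2_l3_header_lengths(plain) gives (14, 20) and l2_l3_header_lengths(tagged) gives (18, 24).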
Navdeep Parhar
82694ec0c0 cxgbe(4): Never use hardware checksumming in netmap tx.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-12-12 21:33:00 +00:00
Navdeep Parhar
c08c2d42cf cxgbe(4): Simplify the firmware version checks a bit.
No functional change.

MFC after:	1 week
2019-12-10 20:12:21 +00:00
Navdeep Parhar
aa7bdbc00c cxgbe(4): Use TX_PKTS2 work requests in netmap Tx if it's available.
TX_PKTS2 is more efficient within the firmware and this improves netmap
Tx by a few Mpps in some common scenarios.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-12-10 08:16:19 +00:00
Navdeep Parhar
6f012c14bc cxgbe(4): Update T4/5/6 firmwares to 1.24.11.0.
These were obtained from the Chelsio Unified Wire v3.12.0.1 beta
release.

Note that the firmwares are not uuencoded any more.

MFH:		1 month
Sponsored by:	Chelsio Communications
2019-12-10 07:45:10 +00:00
Navdeep Parhar
168bde45c2 cxgbe/iw_cxgbe: Support 64b length in the memory registration routines.
Submitted by:	bharat @ chelsio
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-12-09 19:10:42 +00:00
Michael Tuexen
fa49a96419 In order for the TCP Handshake to support ECN++, and further ECN-related
improvements, the ECN bits need to be exposed to the TCP SYNcache.
This change is a minimal modification to the function headers, without any
functional change intended.

Submitted by:		Richard Scheffenegger
Reviewed by:		rgrimes@, rrs@, tuexen@
Differential Revision:	https://reviews.freebsd.org/D22436
2019-12-01 18:05:02 +00:00
Navdeep Parhar
e3338dee08 cxgbe(4): Allow the driver to specify multiple FECs that the firmware
should try in order to link up with the peer.

Various FEC variables within the driver can now have multiple bits set
instead of being powers of 2.  0 and -1 in the user knobs still mean no
FEC and auto (driver decides) respectively for backward compatibility,
but no-FEC and auto now have their own bits in the internal
representation.  There is a new bit that can be set to request the FEC
recommended by the cable/transceiver module.

Add sysctls to display link related capabilities of the local side as
well as the link partner.

Note that all this needs a new firmware and the documentation for the
driver FEC knobs will be updated after that firmware is added to the
driver.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-11-26 05:54:25 +00:00
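The split between the user-visible knob values and the internal bitmask described in the commit above might look like the sketch below. The bit assignments and names are hypothetical; the commit only establishes that 0 and -1 keep their legacy meanings in the knob, that no-FEC and auto have dedicated bits internally, and that a new bit requests the module-recommended FEC.

```python
# Hypothetical internal bit assignments.
FEC_RS = 1 << 0        # Reed-Solomon FEC
FEC_BASER = 1 << 1     # BASE-R (Firecode) FEC
FEC_NONE = 1 << 2      # explicit "no FEC"
FEC_AUTO = 1 << 3      # driver decides
FEC_MODULE = 1 << 4    # use what the cable/transceiver recommends

# Bits a user may combine directly in the knob (assumed).
USER_BITS = FEC_RS | FEC_BASER | FEC_MODULE


def knob_to_internal(knob: int) -> int:
    """Translate a user knob value to the internal FEC bitmask."""
    if knob == 0:
        return FEC_NONE   # legacy meaning: no FEC
    if knob == -1:
        return FEC_AUTO   # legacy meaning: auto
    if knob < 0 or knob & ~USER_BITS:
        raise ValueError("invalid FEC setting: %d" % knob)
    return knob           # several bits may be set at once


def fec_names(mask: int) -> list:
    """List the names of the bits set in an internal mask."""
    names = [("rs", FEC_RS), ("baser", FEC_BASER), ("none", FEC_NONE),
             ("auto", FEC_AUTO), ("module", FEC_MODULE)]
    return [name for name, bit in names if mask & bit]
```

Giving no-FEC its own bit means an internal value of zero can no longer be mistaken for a deliberate request to disable FEC.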
Navdeep Parhar
515a40d5d9 cxgbe(4): sysctl to reset the temperature/voltage sensor.
# sysctl dev.<nexus>.<inst>.reset_sensor=1
# sysctl dev.t6nex.0.reset_sensor=1

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-11-24 16:40:54 +00:00
Navdeep Parhar
e56d731b7d cxgbe(4): Update the firmware interface header.
This allows the driver to be updated for the next firmware without
waiting for it to be released.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2019-11-24 05:37:28 +00:00