All of the request handlers no longer modify session state, so remove
the mutex limiting operations to one per session. In addition, change
the pointer to the session state passed to process callbacks to const.
Suggested by: mjg
Reviewed by: mjg, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33317
Only pre-allocate auth contexts when a session-wide key is provided or
for sessions without keys. For sessions with per-operation keys,
always initialize the on-stack context directly rather than
initializing the session context in swcr_authprepare (now removed) and
then copying that session context into the on-stack context.
This approach permits parallel auth operations without needing a
serializing lock. In addition, the previous code assumed that auth
sessions always provided an initial key unlike cipher sessions which
assume either an initial key or per-op keys.
While here, fix the Blake2 auth transforms to function like other auth
transforms where Setkey is invoked after Init rather than before.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33316
As is done with authentication contexts, allocate cipher contexts on
the stack while completing requests. This permits safely dispatching
concurrent requests on a single session. The cipher context in the
session is now only allocated when a session key is provided during
session setup to serve as a template to initialize the on-stack
context similar to auth operations.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33198
The cipher context isn't always a key schedule, so use a more generic
name.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33197
Extend struct enc_xform to add new members to handle auth operations
for AEAD ciphers. In particular, AEAD operations in cryptosoft no
longer use a struct auth_hash. Instead, the setkey and reinit methods
of struct enc_xform are responsible for initializing both the cipher
and auth state.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33196
Previously, these values were only cleared in AES_GMAC_Init(), so a
second set of operations could reuse the final hash as the initial
hash. Currently this bug does not trigger in cryptosoft as existing
GMAC and GCM operations always use an on-stack auth context
initialized from a template context.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33315
This centralizes the check for valid nonce lengths for AES-GCM.
While here, remove some duplicate checks for valid AES-GCM tag lengths
from ccp(4) and ccr(4).
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33194
It is always valid for crp_payload_output_start to be 0. However, if
an output buffer is empty (e.g. a decryption request with a tag but an
empty payload), the existing assertion failed since 0 is not less than
0.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33193
Don't pass the same name to multiple mutexes while using unique types
for WITNESS. Just use the unique types as the mutex names.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32740
The starting sequence number used to verify that TLS 1.0 CBC records
are encrypted in-order in the OCF layer was always set to 0 and not to
the initial sequence number from the struct tls_enable.
In practice, OpenSSL always starts TLS transmit offload with a
sequence number of zero, so this only matters for tests that use a
random starting sequence number.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32676
As a followup to SW KTLS assuming an OCF backend, rename
struct ocf_session to struct ktls_ocf_session and forward
declare it in <sys/ktls.h> to use as the type of
struct ktls_session.cipher.
Reviewed by: gallatin, hselasky
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32565
Pass the ivlen along through, and just drop this KASSERT() if we're
building _STANDALONE for the time being.
Fixes: 1833d6042c ("crypto: Permit variable-sized IVs ...")
This is useful for WireGuard which uses a nonce of 8 bytes rather
than the 12 bytes used for IPsec and TLS.
Note that this also fixes a (should be) harmless bug in ossl(4) where
the counter was incorrectly treated as a 64-bit counter instead of a
32-bit counter in terms of wrapping when using a 12 byte nonce.
However, this required a single message (TLS record) longer than 64 *
(2^32 - 1) bytes (about 256 GB) to trigger.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32122
The tag length is included as one of the values in the flags byte of
block 0 passed to CBC_MAC, so merely copying the first N bytes is
insufficient.
To avoid adding more sideband data to the CBC MAC software context,
pull the generation of block 0, the AAD length, and AAD padding out of
cbc_mac.c and into cryptosoft.c. This matches how GCM/GMAC are
handled where the length block is constructed in cryptosoft.c and
passed as an input to the Update callback. As a result, the CBC MAC
Update() routine is now much simpler and simply performs the
XOR-and-encrypt step on each input block.
While here, avoid a copy to the staging block in the Update routine
when one or more full blocks are passed as input to the Update
callback.
Reviewed by: sef
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32120
Permit nonces of lengths 7 through 13 in the OCF framework and the
cryptosoft driver. A helper function (ccm_max_payload_length) can be
used in OCF drivers to reject CCM requests which are too large for the
specified nonce length.
Reviewed by: sef
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32111
If an operation would generate a MAC output (e.g. for digest operation
or for an AEAD or EtA operation), then an empty payload buffer is
valid. Only reject requests with an empty buffer for "plain" cipher
sessions.
Some of the AES-CCM NIST KAT vectors use an empty payload.
While here, don't advance crp_payload_start for requests that use an
empty payload with an inline IV. (*)
Reported by: syzbot+d4b94fbd9a44b032f428@syzkaller.appspotmail.com (*)
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32109
A request without AAD for an AEAD cipher can be submitted via
CIOCCRYPT rather than CIOCCRYPTAEAD.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32108
Add 'ivlen' and 'maclen' fields to the structure used for CIOGSESSION2
to specify the explicit IV/nonce and MAC/tag lengths for crypto
sessions. If these fields are zero, the default lengths are used.
This permits selecting an alternate nonce length for AEAD ciphers such
as AES-CCM which support multiple nonce leengths. It also supports
truncated MACs as input to AEAD or ETA requests.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32107
Rather than copying crp_iv to a local array on the stack that is then
passed to xform reinit routines, pass crp_iv directly and remove the
local copy.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32106
Add a 'len' argument to the reinit hook in 'struct enc_xform' to
permit support for AEAD ciphers such as AES-CCM and Chacha20-Poly1305
which support different nonce lengths.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32105
- Retire cse->mode and use csp->csp_mode instead.
- Use csp->csp_cipher_algorithm instead of the ivsize when checking
for the fixup for the IV length for AES-XTS.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32103
Otherwise we can end up comparing the computed digest with an
uninitialized kernel buffer.
In cryptoaead_op() we already unconditionally fail the request if a
pointer to a digest buffer is not specified.
Based on a patch by Simran Kathpalia.
Reported by: syzkaller
Reviewed by: jhb
MFC after: 1 week
Pull Request: https://github.com/freebsd/freebsd-src/pull/529
Differential Revision: https://reviews.freebsd.org/D32124
KTLS OCF support was originally targeted at software backends that
used host CPU cycles to encrypt TLS records. As a result, each KTLS
worker thread queued a single TLS record at a time and waited for it
to be encrypted before processing another TLS record. This works well
for software backends but limits throughput on OCF drivers for
coprocessors that support asynchronous operation such as qat(4) or
ccr(4). This change uses an alternate function (ktls_encrypt_async)
when encrypt TLS records via a coprocessor. This function queues TLS
records for encryption and returns. It defers the work done after a
TLS record has been encrypted (such as marking the mbufs ready) to a
callback invoked asynchronously by the coprocessor driver when a
record has been encrypted.
- Add a struct ktls_ocf_state that holds the per-request state stored
on the stack for synchronous requests. Asynchronous requests malloc
this structure while synchronous requests continue to allocate this
structure on the stack.
- Add a ktls_encrypt_async() variant of ktls_encrypt() which does not
perform request completion after dispatching a request to OCF.
Instead, the ktls_ocf backends invoke ktls_encrypt_cb() when a TLS
record request completes for an asynchronous request.
- Flag AEAD software TLS sessions as async if the backend driver
selected by OCF is an async driver.
- Pull code to create and dispatch an OCF request out of
ktls_encrypt() into a new ktls_encrypt_one() function used by both
ktls_encrypt() and ktls_encrypt_async().
- Pull code to "finish" the VM page shuffling for a file-backed TLS
record into a helper function ktls_finish_noanon() used by both
ktls_encrypt() and ktls_encrypt_cb().
Reviewed by: markj
Tested on: ccr(4) (jhb), qat(4) (markj)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31665
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).
Sponsored by: The FreeBSD Foundation
This function combines crypto_cursor_segbase() and
crypto_cursor_seglen() into a single function. This is mostly
beneficial in the unmapped mbuf case where back to back calls of these
two functions have to iterate over the sub-components of unmapped
mbufs twice.
Bump __FreeBSD_version for crypto drivers in ports.
Suggested by: markj
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30445
This avoids creating a duplicate copy on the stack just to
append the trailer.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30139
This removes support for loadable software backends. The KTLS OCF
support is now always included in kernels with KERN_TLS and the
ktls_ocf.ko module has been removed. The software encryption routines
now take an mbuf directly and use the TLS mbuf as the crypto buffer
when possible.
Bump __FreeBSD_version for software backends in ports.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30138
This is not a functional change as the Poly1305 hash is the same
length as the GMAC hash length.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30137
This is intended for use in KTLS transmit where each TLS record is
described by a single mbuf that is itself queued in the socket buffer.
Using the existing CRYPTO_BUF_MBUF would result in
bus_dmamap_load_crp() walking additional mbufs in the socket buffer
that are not relevant, but generating a S/G list that potentially
exceeds the limit of the tag (while also wasting CPU cycles).
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30136
There haven't been any non-obscure drivers that supported this
functionality and it has been impossible to test to ensure that it
still works. The only known consumer of this interface was the engine
in OpenSSL < 1.1. Modern OpenSSL versions do not include support for
this interface as it was not well-documented.
Reviewed by: cem
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D29736
Copy the iovec for the trailer from the proper place. This is the same
fix for CBC encryption from ff6a7e4ba6.
Reported by: gallatin
Reviewed by: gallatin, markj
Fixes: 49f6925ca
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29177
cryptosoft is always present and doesn't print any useful information
when it attaches.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29098
There currently isn't a need to provide a public interface to a
software Poly1305 implementation beyond what is already available via
libsodium's APIs and these symbols conflict with symbols shared within
the ossl.ko module between ossl_poly1305.c and ossl_chacha20.c.
Reported by: se, kp
Fixes: 78991a93eb
Sponsored by: Netflix
Maintain a cache of physically contiguous runs of pages for use as
output buffers when software encryption is configured and in-place
encryption is not possible. This makes allocation and free cheaper
since in the common case we avoid touching the vm_page structures for
the buffer, and fewer calls into UMA are needed. gallatin@ reports a
~10% absolute decrease in CPU usage with sendfile/KTLS on a Xeon after
this change.
It is possible that we will not be able to allocate these buffers if
physical memory is fragmented. To avoid frequently calling into the
physical memory allocator in this scenario, rate-limit allocation
attempts after a failure. In the failure case we fall back to the old
behaviour of allocating a page at a time.
N.B.: this scheme could be simplified, either by simply using malloc()
and looking up the PAs of the pages backing the buffer, or by falling
back to page by page allocation and creating a mapping in the cache
zone. This requires some way to save a mapping of an M_EXTPG page array
in the mbuf, though. m_data is not really appropriate. The second
approach may be possible by saving the mapping in the plinks union of
the first vm_page structure of the array, but this would force a vm_page
access when freeing an mbuf.
Reviewed by: gallatin, jhb
Tested by: gallatin
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D28556