This lets one use ossl(4) for AES-GCM operations on contemporary amd64
platforms. A kernel benchmark indicates that this gives roughly
equivalent throughput to aesni(4) for various buffer sizes.
Bulk processing is done in aesni-gcm-x86_64.S, the rest is handled in a
C wrapper ported from OpenSSL's gcm128.c.
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Reviewed by: jhb
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D39967
aes-gcm-avx512.S is generated from OpenSSL 3.1 and implements AES-GCM.
ossl_x86.c detects whether the CPU implements the required AVX512
instructions; if not, the ossl(4) module does not provide an AES-GCM
implementation. The VAES implementation increases throughput for all
buffer sizes in both directions, up to 2x for sufficiently large
buffers.
The "process" implementation is in two parts: a generic OCF layer in
ossl_aes.c that calls a set of MD functions to do the heavy lifting.
The intent there is to make it possible to add other implementations for
other platforms, e.g., to reduce the diff required for D37421.
A follow-up commit will add a fallback path to legacy AES-NI, so that
ossl(4) can be used in preference to aesni(4) on all amd64 platforms.
In the long term we would like to replace aesni(4) and armv8crypto(4)
with ossl(4).
Note, currently this implementation will not be selected by default
since aesni(4) and ossl(4) return the same probe priority for crypto
sessions, and the opencrypto framework selects the first registered
implementation to break a tie. Since aesni(4) is compiled into the
kernel, aesni(4) wins. A separate change may modify ossl(4) to have
priority.
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Reviewed by: jhb
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D39783
AES-CBC OpenSSL assembly is used underneath.
The glue layer(ossl_aes.c) is based on CHACHA20 implementation.
Contrary to the SHA and CHACHA20, AES OpenSSL assembly logic
does not have a fallback implementation in case CPU doesn't
support required instructions.
Because of that CPU caps are checked during initialization and AES
support is advertised only if available.
The feature is available on all architectures that ossl supports:
i386, amd64, arm64.
The biggest advantage of this patch over existing solutions
(aesni(4) and armv8crypto(4)) is that it supports SHA,
allowing for ETA operations.
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: jhb (previous version)
Differential revision: https://reviews.freebsd.org/D32099
AES-CBC OpenSSL assembly is used underneath.
The glue layer(ossl_aes.c) is based on CHACHA20 implementation.
Contrary to the SHA and CHACHA20, AES OpenSSL assembly logic
does not have a fallback implementation in case CPU doesn't
support required instructions.
Because of that CPU caps are checked during initialization and AES
support is advertised only if available.
The feature is available on all architectures that ossl supports:
i386, amd64, arm64.
The biggest advantage of this patch over existing solutions
(aesni(4) and armv8crypto(4)) is that it supports SHA,
allowing for ETA operations.
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: jhb
Differential revision: https://reviews.freebsd.org/D32099
Enable in-kernel acceleration of SHA1 and SHA2 operations on arm64 by adding
support for the ossl(4) crypto driver. This uses OpenSSL's assembly routines
under the hood, which will detect and use SHA intrinsics if they are
supported by the CPU.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27390
Make room for adding arm64 support to this driver by moving the
x86-specific feature parsing to a separate file.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27388
Currently, this supports SHA1 and SHA2-{224,256,384,512} both as plain
hashes and in HMAC mode on both amd64 and i386. It uses the SHA
intrinsics when present similar to aesni(4), but uses SSE/AVX
instructions when they are not.
Note that some files from OpenSSL that normally wrap the assembly
routines have been adapted to export methods usable by 'struct
auth_xform' as is used by existing software crypto routines.
Reviewed by: gallatin, jkim, delphij, gnn
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26821