I previously forgot to mention these as they are set up through
contrib/arm-optimized/routines/string.
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
This should pick up our optimised memchr(), strlen(), and strlcpy()
when strlcat() is called.
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42863
Somewhat similar to stpncpy, but different in that we need to compute
the full source length even if the buffer is shorter than the source.
strlcat is implemented as a simple wrapper around strlcpy. The scalar
implementation of strlcpy just calls into strlen() and memcpy() to do
the job.
Perf-wise we're very close to stpncpy. The code is slightly slower as
it needs to carry on with finding the source string length even if the
buffer ends before the string.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42863
A straightforward derivation from the stpncpy unit test.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42863
strcat has a bespoke scalar assembly implementation we
inherited from NetBSD. While it performs well, it is
better to call into our SIMD implementations if any SIMD
features are available at all. So do that and implement
strcat() by calling into strlen() and strcpy() if these
are available.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Reviison: https://reviews.freebsd.org/D42600
This was surprisingly annoying to get right, despite being such a simple
function. A scalar implementation is also provided, it just calls into
our optimised memchr(), memcpy(), and memset() routines to carry out its
job.
I'm quite happy with the performance. glibc only beats us for very long
strings, likely due to the use of AVX-512. The scalar implementation
just calls into our optimised memchr(), memcpy(), and memset() routines,
so it has a high overhead to begin with but then performs ok for the
amount of effort that went into it. Still beats the old C code, except
for very short strings.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42519
This adds additional unit tests validating the function for
All possible alignment offsets of source and destination.
Also extend the test to allow testing of an external stpncpy
implementation, which greatly simplifies the development of
custom implementations.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42519
The strsep() function is basically strcspn() with extra steps.
On amd64, we now have an optimised implementation of strcspn(),
so instead of implementing the inner loop manually, just call
into the optimised routine.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42346
Also mention missing rindex() entry, which is provided through
strrchr().
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42217
The baseline implementation is very straightforward, while the scalar
implementation suffers from register pressure and the need to use SWAR
techniques similar to those used for strchr().
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42217
The scalar implementation is fairly straightforward and merely unrolled
four times. The baseline implementation closely follows D41971 with
appropriate extensions and extra code paths to pay attention to string
length.
Performance is quite good. We beat both glibc (except for very long
strings, but they likely use AVX which we don't) and Bionic (except for
medium-sized aligned strings, where we are still in the same ballpark).
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42122
These are patterned after the previously added (D41970)
strcmp tests, but are extended to check for various length
conditions.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42122
This lets us use our optimised strcspn() routine for strpbrk() calls.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D41980
This is the most complicated one so far. The basic idea is to process
the bulk of the string in aligned blocks of 16 bytes such that one
string runs ahead and the other runs behind. The string that runs ahead
is checked for NUL bytes, the one that runs behind is compared with the
corresponding chunk of the string that runs ahead. This trades an extra
load per iteration for the very complicated block-reassembly needed in
the other implementations (bionic, glibc). On the flip side, we need
two code paths depending on the relative alignment of the two buffers.
The initial part of the string is compared directly if it is known not
to cross a page boundary. Otherwise, a complex slow path to avoid
crossing into unmapped memory commences.
Performance-wise we beat bionic for misaligned strings (i.e. the strings
do not share an alignment offset) and reach comparable performance for
aligned strings. glibc is a bit better as it has a special kernel for
AVX-512, where this stuff is a bit easier to do.
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D41971
Before every wait for FIFO interrupt set how much data/space do we
want to see there. Previous code was not using it for receive, as
result aggregating interrupts only within processing latency. The
new code needs only one interrupt per transfer per FIFO length.
On my Dell XPS 13 9310 with iichid(4) touchscreen and touchpad this
reduces the interrupt rate per device down to 2 per sample or 16-20
per second when idle and 120-160 per second when actively touched.
MFC after: 1 month
Amdgpu driver does a lot of memory allocations in FPU-protected sections
of code for certain display cores, e.g. for DCN30. This does not work
currently on FreeBSD as its malloc function can not be run within a
critical section. Allocate memory for FPU context to overcome such
restriction.
Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu (previous version), markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42822
acpi_dev_present detects that a given ACPI device is present based on
Hardware ID, Unique ID and Hardware Revision of the device.
Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42823
It does a Read-Modify-Write operation using clear and set bitmasks on
PCI Express Capability Register at pos. As certain PCI Express
Capability Registers are accessed concurrently in RMW fashion, hence
require locking which is handled transparently to the caller.
Sponsored by: Serenity CyberSecurity, LLC
Reviewed by: manu, bz
MFC after: 1 week
MFC TODO: Move pcie_cap_lock to bottom to preserve KBI compatibility
Differential Revision: https://reviews.freebsd.org/D42821
We assume that backlight (in Linux term) is always "native".
Also stub acpi_video_register_backlight()
Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu, bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42814
and bitmap_shift_right() functions to linux/bitmap.h
They perform calculation of two bitmaps intersection,
copying the contents of u32 array of bits to bitmap and
logical right shifting of the bits in a bitmap.
Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42812
Cancel a work not waiting for it to finish.
Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42811
Linux uses it only if SHRINKER_DEBUG config option is enabled. Ignore it.
Sponsored by: Serenity Cyber Security, LLC
Reviewers: manu, bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42809
ida_alloc_min() allocates an unused ID. between min and INT_MAX.
While here allow end parameter of ida_simple_get() be larger than
INT_MAX. Linux caps the value to INT_MAX.
Sponsored by: Serenity Cyber Security, LLC
Reviewers: manu, bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42806
It is implemented like ktime_get_raw_ns() function but uses less
precise getnanouptime() call.
Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu, bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42804