Commit graph

12 commits

Author SHA1 Message Date
Konstantin Belousov
ae507c25de amd64 libc: add missed GNU-stack annotation to memmove/memcpy
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2022-11-18 15:31:38 +02:00
Alexander Motin
f22068d91b amd64: Stop using REP MOVSB for backward memmove()s.
Enhanced REP MOVSB feature of CPUs starting from Ivy Bridge makes
REP MOVSB the fastest way to copy memory in most of cases. However
Intel Optimization Reference Manual says: "setting the DF to force
REP MOVSB to copy bytes from high towards low addresses will expe-
rience significant performance degradation". Measurements on Intel
Cascade Lake and Alder Lake, same as on AMD Zen3 show that it can
drop throughput to as low as 2.5-3.5GB/s, comparing to ~10-30GB/s
of REP MOVSQ or hand-rolled loop, used for non-ERMS CPUs.

This patch keeps ERMS use for forward ordered memory copies, but
removes it for backward overlapped moves where it does not work.

This is just a cosmetic sync with kernel, since libc does not use
ERMS at this time.

Reviewed by:    mjg
MFC after:	2 weeks
2022-06-16 14:51:50 -04:00
Mateusz Guzik
0db6aef407 amd64: add a note about simd to libc memset, memmove and memcmp 2021-01-31 16:07:19 +00:00
Mateusz Guzik
164c3b8184 amd64: add missing ALIGN_TEXT to loops in memset and memmove 2021-01-30 00:01:44 +00:00
Mateusz Guzik
ddf6571230 amd64: align target memmove buffer to 16 bytes before using rep movs
See the review for sample test results.

Reviewed by:	kib (kernel part)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18401
2018-12-01 14:20:32 +00:00
Mateusz Guzik
94243af2da amd64: handle small memmove buffers with overlapping stores
Handling sizes of > 32 backwards will be updated later.

Reviewed by:	kib (kernel part)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18387
2018-11-30 20:58:08 +00:00
Mateusz Guzik
2847cfce54 amd64: remove stale attribution for memmove work
While the routine started as expanded bcopy, it is now entirely rewritten.

Sponsored by:	The FreeBSD Foundation
2018-11-30 00:47:36 +00:00
Mateusz Guzik
dd219e5ea5 amd64: tidy up copying backwards in memmove
For non-ERMS case the code used handle possible trailing bytes with
movsb first and then followed it up with movsq. This also happened
to alter how calculations were done for other cases.

Handle the tail with regular movs, just like when copying forward.
Use leaq to calculate the right offset from the get go, instead of
doing separate add and sub.

This adjusts the offset for non-rep cases so that they can be used
to handle the tail.

The routine is still a work in progress.

Sponsored by:	The FreeBSD Foundation
2018-11-30 00:45:10 +00:00
Mateusz Guzik
1e52ba8c62 amd64: import updated kernel memmove to libc
bcopy is left alone as it is expected to be converted to a C func.

Due to header mess ALIGN_TEXT is temporarily defined explicitly in memmove.S

Reviewed by:	kib
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17538
2018-10-13 21:15:47 +00:00
Konstantin Belousov
adc6846785 Remove duplicate .note.GNU-stack section declaration. bcopy already
made the neccessary provisions.

Reported by:	arundel
2011-02-04 21:04:00 +00:00
Konstantin Belousov
93ab758670 Add section .note.GNU-stack for assembly files used by 386 and amd64. 2011-01-07 16:08:40 +00:00
Alan Cox
91c09a383a Add machine-specific, optimized implementations of bcopy, bzero, memcpy,
memmove, and memset.

PR: 73111
Submitted by: Ville-Pertti Keinonen <will@iki.fi> (taken from NetBSD)
MFC after: 3 weeks
2005-04-07 03:56:03 +00:00