opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-02-25 11:00:15 -05:00

Author	SHA1	Message	Date
Brandon Bergren	9941cb0657	[PowerPC] Fix atomic_cmpset_masked(). A recent kernel change caused the previously unused atomic_cmpset_masked() to be used. It had a typo in it. Instead of reading the old value from an uninitialized variable, read it from the passed-in pointer as intended. This fixes crashes on 64 bit Book-E. Obtained from: jhibbits	2020-05-26 19:03:45 +00:00
Conrad Meyer	ca0ec73c11	Expand generic subword atomic primitives The goal of this change is to make the atomic_load_acq_{8,16}, atomic_testandset{,_acq}_long, and atomic_testandclear_long primitives available in MI-namespace. The second goal is to get this draft out of my local tree, as anything that requires a full tinderbox is a big burden out of tree. MD specifics can be refined individually afterwards. The generic implementations may not be ideal for your architecture; feel free to implement better versions. If no subword_atomic definitions are needed, the include can be removed from your arch's machine/atomic.h. Generic definitions are guarded by defined macros of the same name. To avoid picking up conflicting generic definitions, some macro defines are added to various MD machine/atomic.h to register an existing implementation. Include _atomic_subword.h in arm and arm64 machine/atomic.h. For some odd reason, KCSAN only generates some versions of primitives. Generate the _acq variants of atomic_load._8, atomic_load._16, and atomic_testandset.*_long. There are other questionably disabled primitives, but I didn't run into them, so I left them alone. KCSAN is only built for amd64 in tinderbox for now. Add atomic_subword implementations of atomic_load_acq_{8,16} implemented using masking and atomic_load_acq_32. Add generic atomic_subword implementations of atomic_testandset_long(), atomic_testandclear_long(), and atomic_testandset_acq_long(), using atomic_fcmpset_long() and atomic_fcmpset_acq_long(). On x86, add atomic_testandset_acq_long as an alias for atomic_testandset_long. Reviewed by: kevans, rlibby (previous versions both) Differential Revision: https://reviews.freebsd.org/D22963	2020-03-25 23:12:43 +00:00
Brandon Bergren	9aafc7c052	[PowerPC] [MIPS] Implement 32-bit kernel emulation of atomic64 operations This is a lock-based emulation of 64-bit atomics for kernel use, split off from an earlier patch by jhibbits. This is needed to unblock future improvements that reduce the need for locking on 64-bit platforms by using atomic updates. The implementation allows for future integration with userland atomic64, but as that implies going through sysarch for every use, the current status quo of userland doing its own locking may be for the best. Submitted by: jhibbits (original patch), kevans (mips bits) Reviewed by: jhibbits, jeff, kevans Differential Revision: https://reviews.freebsd.org/D22976	2020-01-02 23:20:37 +00:00
Justin Hibbits	d0bdb11139	atomic: Add atomic_cmpset_masked to powerpc and use it Summary: This is a more optimal way of doing atomic_cmpset_masked() than the fallback in sys/_atomic_subword.h. There's also an override for _atomic_fcmpset_masked_word(), which may or may not be necessary, and is unused for powerpc. Reviewed by: kevans, kib Differential Revision: https://reviews.freebsd.org/D22359	2019-11-15 04:33:07 +00:00
Justin Hibbits	9551397f51	powerpc/atomic: Fix atomic_cmpset_rel() Need a release barrier, not an acquire barrier, else bad things happen.	2019-10-15 03:37:21 +00:00
Justin Hibbits	84046d16eb	powerpc: Implement atomic_(f)cmpset_ for short and char \| This adds two implementations for each atomic_fcmpset_ and atomic_cmpset_ short and char functions, selectable at compile time for the target architecture. By default, it uses a generic shift-and-mask to perform atomic updates to sub-components of 32-bit words from <sys/_atomic_subword.h>. However, if ISA_206_ATOMICS is defined it uses the ll/sc instructions for halfword and bytes, introduced in PowerISA 2.06. These instructions are supported by all IBM processors from POWER7 on, as well as the Freescale/NXP e6500 core. Although the e5500 and e500mc both implement PowerISA 2.06 they do not implement these instructions. As part of this, clean up the atomic_(f)cmpset_acq and _rel wrappers, by using macros to reduce code duplication. ISA_206_ATOMICS requires clang or newer binutils (2.20 or later). Differential Revision: https://reviews.freebsd.org/D21682	2019-10-08 01:36:34 +00:00
Justin Hibbits	e44ed9d3d4	powerpc/atomic: Follow recommendations on atomic primitive comparisons Both IBM and Freescale programming examples presume the cmpset operands will favor equal, and pessimize the non-equal case instead. Do the same for atomic_cmpset_* and atomic_fcmpset_*. This slightly pessimizes the failure case, in favor of the success case. MFC after: 3 weeks	2019-09-25 01:39:58 +00:00
Hans Petter Selasky	d7a9bfee8f	Implement atomic_swap_xxx() for all platforms. Differential Revision: https://reviews.freebsd.org/D18450 Reviewed by: kib@ MFC after: 3 days Sponsored by: Mellanox Technologies	2018-12-10 13:38:13 +00:00
Justin Hibbits	6a0fd1a51b	powerpc/atomic: Loosen the memory barrier on atomic_load_acq_() 'sync' is pretty heavy-handed, and is unnecessary for this use case. It's a full barrier, which is applicable for all storage types. However, atomic_load_acq_() is only expected to operate on physical memory, not device memory, so lwsync is sufficient (lwsync provides access ordering on memory that is marked as Coherency Required and is not Write Through nor Cache Inhibited). On 32-bit systems, this is a nop, since powerpc_lwsync() is defined to use sync, as a workaround for a silicon bug in the Freescale e500 core.	2018-11-07 01:42:00 +00:00
Konstantin Belousov	30d4f9e888	Add atomic_load(9) and atomic_store(9) operations. They provide relaxed-ordered atomic access semantics. Due to the FreeBSD memory model, the operations are syntactic wrappers around the volatile accesses. The volatile qualifier is used to ensure that the access is not optimized out and in turn depends on the volatile semantics as implemented by supported compilers. The motivation for adding the operation is to help people coming from other systems or knowing the C11/C++ standards where atomics have special type and require use of the special access operations. It is still the case that FreeBSD requires plain load and stores of aligned integer types to be atomic. Suggested by: jhb Reviewed by: alc, jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D13534	2017-12-19 09:59:20 +00:00
Pedro F. Giffuni	71e3c3083b	sys/powerpc: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-27 15:09:59 +00:00
Justin Hibbits	d3a8234cef	Don't retry a lost reservation in atomic_fcmpset() The desired behavior of atomic_fcmpset_() is to always exit on error. Instead of retrying on lost reservation, leave the retry to the caller, and return error. Reported by: kib	2017-01-31 03:40:13 +00:00
Justin Hibbits	37af2ad077	Drop the __GNUCLIKE_ASM guards around most atomic inlines. There are no alternatives defined, so there's no point in keeping them. Also, they weren't around every inline asm block anyway. Without __GNUCLIKE_ASM defined, the guarded functions return garbage. Reported by: Andrew Thompson	2017-01-30 02:52:15 +00:00
Justin Hibbits	02f151d412	Add atomic_fcmpset_() inlines for powerpc Summary: atomic_fcmpset_() is analogous to atomic_cmpset(), but saves off the read value from the target memory location into the 'old' pointer in the case of failure. Requested by: mjg Differential Revision: https://reviews.freebsd.org/D9325	2017-01-30 02:15:54 +00:00
Konstantin Belousov	0b39ffb35f	On PowerPC 64bit, the linux-compat mb() definition is implemented with lwsync instruction, which does not provide Store/Load barrier. Fix this by using "full" sync barrier for mb(). atomic_store_rel() does not need full barrier, change mb() call there to the lwsync instruction if not hitting the known CPU erratas (i.e. on 32bit). Provide powerpc_lwsync() helper to isolate the lwsync/sync compile time selection, and use it in atomic_store_rel() and several other places which duplicate the code. Noted by: alc Reviewed and tested by: nwhitehorn Sponsored by: The FreeBSD Foundation	2015-11-24 09:13:21 +00:00
Konstantin Belousov	8954a9a4e6	Add the atomic_thread_fence() family of functions with intent to provide a semantic defined by the C11 fences with corresponding memory_order. atomic_thread_fence_acq() gives r | r, w, where r and w are read and write accesses, and | denotes the fence itself. atomic_thread_fence_rel() is r, w | w. atomic_thread_fence_acq_rel() is the combination of the acquire and release in single operation. Note that reads after the acq+rel fence could be made visible before writes preceding the fence. atomic_thread_fence_seq_cst() orders all accesses before/after the fence, and the fence itself is globally ordered against other sequentially consistent atomic operations. Reviewed by: alc Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-07-08 18:12:24 +00:00
Justin Hibbits	181ca73b1a	Small performance optimization. Clobber only cr0, rather than the entire CR. Discussed with: rdivacky,nwhitehorn MFC after: 3 weeks	2014-04-11 06:17:44 +00:00
Andreas Tobler	feb86bbe4f	Described in the man page but not implemented. Here it comes, atomic_swap_32/64. The latter only for powerpc64. MFC after: 1 month	2014-01-13 22:21:29 +00:00
Bjoern A. Zeeb	08c5f3303d	Add a missing " to get closer to compiling.	2012-05-24 23:46:17 +00:00
Nathan Whitehorn	270dc329b7	Atomic operation acquire barriers also need to be isync on 64-bit systems.	2012-05-24 22:14:39 +00:00
Marcel Moolenaar	7097794901	Revert isync for ILP32 to sync as per my original change that I discussed with Nathan. Leave __ATOMIC_ACQ as an isync as per Nathan.	2012-05-24 22:06:00 +00:00
Marcel Moolenaar	df0bef25eb	Fix the memory barriers for CPUs that do not like lwsync and wedge or cause exceptions early enough during boot that the kernel will do the same. Use lwsync only when compiling for LP64 and revert to the more proven isync when compiling for ILP32. Note that in the end (i.e. between revision 222198 and this change) ILP32 changed from using sync to using isync. As per Nathan the isync is needed to make sure I/O accesses are properly serialized with locks and isync tends to be more efficient than sync. While here, undefine __ATOMIC_ACQ and __ATOMIC_REL at the end of the file so as not to leak their definitions. Discussed with: nwhitehorn	2012-05-24 20:45:44 +00:00
Nathan Whitehorn	bc96dccc69	Fix final bugs in memory barriers on PowerPC: - Use isync/lwsync unconditionally for acquire/release. Use of isync guarantees a complete memory barrier, which is important for serialization of bus space accesses with mutexes on multi-processor systems. - Go back to using sync as the I/O memory barrier, which solves the same problem as above with respect to mutex release using lwsync, while not penalizing non-I/O operations like a return to sync on the atomic release operations would. - Place an acquisition barrier around thread lock acquisition in cpu_switchin().	2012-05-04 16:00:22 +00:00
Nathan Whitehorn	a4cbf436e7	Provide a clearer split between read/write and acquire/release barriers. This should really, actually be correct now.	2012-04-22 22:27:35 +00:00
Nathan Whitehorn	a6349a998d	Clarify what we are doing in r234583 a little better: eieio and isync do not provide general barriers, but only barriers in the context of the atomic sequences here. As such, make them private and keep the global *mb() routines using a variant of sync.	2012-04-22 21:11:01 +00:00
Nathan Whitehorn	83ae3d5531	On non-64-bit systems (which generally don't have lwsync), use eieio and isync to implement read and write barriers, following Appendix B.2 of Book II of the architecture manual. This provides a 25% speed increase to fork() on the PowerPC G4.	2012-04-22 20:23:34 +00:00
Nathan Whitehorn	6f26a88999	Use lwsync to provide memory barriers on systems that support it instead of sync (lwsync is an alternate encoding of sync on systems that do not support it, providing graceful fallback). This provides more than an order of magnitude reduction in the time required to acquire or release a mutex. MFC after: 2 months	2012-04-22 19:00:51 +00:00
Attilio Rao	dc6dc1f573	Merge r221614,221696,221737,221840 from largeSMP project branch: Rewrite atomic operations for powerpc in order to achieve the following: - Produce a type-clean implementation (in terms of functions arguments and returned values) for the primitives. - Fix errors with _long() atomics where they ended up with the wrong arguments to be accepted. - Follow the sys/types.h specifics that define the numbered types starting from standard C types. - Let _ptr() version to not auto-magically cast arguments, but leave the burden on callers, as _ptr() atomic is intended to be used relatively rarely. Fix cfi in order to support the latest point. In collaboration with: bde Tested by: andreast, nwhitehorn, jceel MFC after: 2 weeks	2011-05-22 20:55:54 +00:00
Nathan Whitehorn	c3e289e1ce	MFppc64: Kernel sources for 64-bit PowerPC, along with build-system changes to keep 32-bit kernels compiling (build system changes for 64-bit kernels are coming later). Existing 32-bit PowerPC kernel configurations must be updated after this change to specify their architecture.	2010-07-13 05:32:19 +00:00
Marcel Moolenaar	7b30cb9c7c	Unbreak previous commit.	2008-11-22 22:15:34 +00:00
Kip Macy	db7f0b974f	- bump __FreeBSD version to reflect added buf_ring, memory barriers, and ifnet functions - add memory barriers to <machine/atomic.h> - update drivers to only conditionally define their own - add lockless producer / consumer ring buffer - remove ring buffer implementation from cxgb and update its callers - add if_transmit(struct ifnet ifp, struct mbuf m) to ifnet to allow drivers to efficiently manage multiple hardware queues (i.e. not serialize all packets through one ifq) - expose if_qflush to allow drivers to flush any driver managed queues This work was supported by Bitgravity Inc. and Chelsio Inc.	2008-11-22 05:55:56 +00:00
Marcel Moolenaar	bf8ad5a884	Fix copy-n-paste typos in free text.	2008-04-10 02:37:26 +00:00
Marcel Moolenaar	c563d53362	Reimplement atomic_add, atomic_clear, atomic_set and atomic_subtract so that all implemented variants have proper prototypes. The 8-bit, 16-bit and 64-bit variants are not implemented. This really fixes the current build breakages caused by type casting and struct aliasing rules.	2008-04-09 01:00:35 +00:00
Marcel Moolenaar	ca6f63a1ed	Quick fix for the kernel build breakage in netgraph and the aliasing warning in libthr. A more elaborate fix is in the works that makes sure that all variants have proper inline functions with proper types.	2008-04-08 16:34:50 +00:00
Pawel Jakub Dawidek	6eb4157ffc	Implement atomic_fetchadd_long() for all architectures and document it. Reviewed by: attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)	2008-03-16 21:20:50 +00:00
Jason Evans	8af8e94855	Define atomic_readandclear_ptr.	2007-11-27 06:34:15 +00:00
John Birrell	ba90c265b0	Implement the _long functions using u_long rather than trying to cast as uint32_t which is defined as unsigned int. gcc doesn't want to consider that there might not be much difference between an int and a long on a 32 bit architecture.	2007-11-26 05:52:45 +00:00
John Birrell	912097517a	Define atomic_cmpset_acq_long and atomic_cmpset_rel_long so that they use casts rather than just assuming that the compiler will DTRT without complaining.	2007-11-19 03:16:16 +00:00
Marcel Moolenaar	c108b80c8c	Cast the arguments to atomic__ptr() when mapping it to atomic__32() This is a minimal fix. Approved by: re (kensmith)	2007-07-10 04:40:00 +00:00
John Baldwin	3c2bc2bf26	Add a new atomic_fetchadd() primitive that atomically adds a value to a variable and returns the previous value of the variable. Tested on: i386, alpha, sparc64, arm (cognet) Reviewed by: arch@ Submitted by: cognet (arm) MFC after: 1 week	2005-09-27 17:39:11 +00:00
John Baldwin	80d52f16da	Stop using the '+' constraint modifier with inline assembly. The '+' constraint is actually only allowed for register operands. Instead, use separate input and output memory constraints. Education from: alc Reviewed by: alc Tested on: i386, alpha MFC after: 1 week	2005-09-15 19:31:22 +00:00
John Baldwin	122eceef61	Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct. MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm	2005-07-15 18:17:59 +00:00
Joerg Wunsch	a5f50ef9e4	netchild's mega-patch to isolate compiler dependencies into a central place. This moves the dependency on GCC's and other compiler's features into the central sys/cdefs.h file, while the individual source files can then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42. By now, GCC and ICC (the Intel compiler) have been actively tested on IA32 platforms by netchild. Extension to other compilers is supposed to be possible, of course. Submitted by: netchild Reviewed by: various developers on arch@, some time ago	2005-03-02 21:33:29 +00:00
Peter Grehan	70134d768c	- change all u_int_XX to uint_XX - cast param for atomic_subtract_long, since Netgraph uses it.	2005-02-01 11:17:24 +00:00
Peter Grehan	8e9238c604	Fix bugs with operand ordering and unnecessary sync/eieio ops. Mostly obtained from Alpha atomic.h Approved by: Benno	2003-01-18 11:28:36 +00:00
Peter Grehan	6e1073f023	Fixed branch labels Approved by: benno	2002-09-19 04:39:59 +00:00
Benno Rice	dfc02c301d	Make atomic_cmpset_32 correctly return 0 on failure.	2002-02-24 23:31:49 +00:00
Benno Rice	abc5579e8c	Fix the atomic_*_32 operations. These were written before I had the ability to test them properly and before I had a working knowledge of GCC asm constraints.	2001-06-27 12:17:23 +00:00
Benno Rice	7a0e745f1a	Don't initialise ret in atomic_cmpset_32. Add more synchronisation.	2001-06-26 13:54:17 +00:00
Benno Rice	e2d53d7c4a	Fix asm constraints for atomic_cmpset_32. This fix may also be needed elsewhere.	2001-06-24 06:36:28 +00:00
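
Commits 02f151d412 and d3a8234cef describe atomic_fcmpset_() as a cmpset variant that stores the value it read into the caller's 'old' pointer on failure and leaves any retry to the caller. A minimal sketch of that contract follows, using portable C11 <stdatomic.h> rather than the PowerPC inline assembly in the tree. The sketch_* names are hypothetical and exist only for illustration; they are not the kernel's functions.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * cmpset-style (hypothetical name): returns nonzero if *p held 'expect'
 * and was replaced by 'newval'. The observed value is discarded on failure.
 */
static int
sketch_cmpset_32(_Atomic uint32_t *p, uint32_t expect, uint32_t newval)
{
	return (atomic_compare_exchange_strong(p, &expect, newval));
}

/*
 * fcmpset-style (hypothetical name): a single attempt. On failure the
 * value actually read is written back through 'expect', and the function
 * returns 0 without retrying. The weak C11 form is used here because it
 * may also fail spuriously, much like a lost reservation.
 */
static int
sketch_fcmpset_32(_Atomic uint32_t *p, uint32_t *expect, uint32_t newval)
{
	return (atomic_compare_exchange_weak(p, expect, newval));
}

/*
 * Caller-side retry loop: because a failed fcmpset refreshes 'old',
 * the loop does not need to reload *p by hand. Returns the prior value.
 */
static uint32_t
sketch_fetch_or_32(_Atomic uint32_t *p, uint32_t bits)
{
	uint32_t old = atomic_load(p);

	while (!sketch_fcmpset_32(p, &old, old | bits))
		;	/* 'old' now holds the latest observed value */
	return (old);
}
```

The retry loop is the pattern the second commit points toward: the primitive reports failure and the caller decides whether to try again.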


53 commits
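
Commits 84046d16eb and ca0ec73c11 mention a generic shift-and-mask fallback that updates a byte or halfword by operating on its containing 32-bit word. The following is a rough sketch of that idea in portable C11, not the code from sys/_atomic_subword.h. The function name and the 'lane' parameter (byte position counted from the least significant byte) are assumptions made for this example; the real header derives the shift from the address and the byte order.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: compare-and-set one 8-bit lane inside an aligned
 * 32-bit word. 'lane' is 0..3, counted from the least significant byte.
 * Returns 1 if the lane held 'expect' and now holds 'newval', else 0.
 */
static int
sketch_cmpset_8_in_word(_Atomic uint32_t *word, unsigned lane,
    uint8_t expect, uint8_t newval)
{
	unsigned shift = lane * 8;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old = atomic_load(word);

	for (;;) {
		/* Fail if the target lane does not match. */
		if (((old & mask) >> shift) != expect)
			return (0);
		/* Replace only the target lane, keep the other three. */
		uint32_t desired = (old & ~mask) | ((uint32_t)newval << shift);
		/* On failure 'old' is refreshed; recheck the lane and retry. */
		if (atomic_compare_exchange_weak(word, &old, desired))
			return (1);
	}
}
```

The loop retries when a neighbouring lane changes underneath it, since that is not a failure from the point of view of the byte being updated.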