opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-02-24 02:10:45 -05:00

Author	SHA1	Message	Date
Doug Moore	8df38859d0	radix_trie: replace node count with popmap Replace the 'count' field in a trie node with a bitmap that identifies non-NULL children. Drop the 'last' field, and use the last bit set in the bitmap instead. In lookup_le, lookup_ge, remove, and reclaim_all, use the bitmap to find the previous/next/only/every non-null child in constant time by examining the bitmask instead of looping across array elements and null-checking them one-by-one. A buildworld test suggests that this reduces the cycle count on those functions that eliminate some null-checks by 4.9%, 1.5%, 0.0% and 13.3%. Reviewed by: alc Tested by: pho Differential Revision: https://reviews.freebsd.org/D40775	2023-07-07 11:09:36 -05:00
Konstantin Belousov	ef747607ea	vm_fault: move FAULT_* return codes out of range for Mach errors This way a possible clash between FAULT_* and KERN_* numbering is avoided, and panics checks for fault_status confusion become more efficient. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D40771	2023-06-28 00:03:14 +03:00
Doug Moore	da72505f9c	radix_trie: pass fewer params to node_get Let node_get calculate it's own owner value. Don't pass the count parameter, since it's always 2. Save 16 bytes in insert(). Move, without modifying, slot and trimkey to handle use-before-declaration problem. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D40723	2023-06-27 12:21:11 -05:00
Doug Moore	9cfed089ac	radix_trie: clean up overlong lines This is purely a cosmetic change. vm_radix.c has lines that reach past column 80 and this change cleans that up. The associated changes to subr_pctrie.c are just to keep mirroring vm_radix.c. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D40764	2023-06-27 12:01:33 -05:00
Doug Moore	72c3a43b16	radix_trie: skip compare in lookup_le, lookup_ge In _lookup_ge, where a loop "looks for an available edge or val within the current bisection node" (to quote the code comment), the value of index has already been modified to guarantee that it is the least value than can be found in the non-NULL child node being examined. Therefore, if the non-NULL child is a leaf, there's no need to compare 'index' to anything, and the value can just be returned. The same is true for _lookup_le with 'most' replacing 'least'. Reviewed by: alc Tested by: pho Differential Revision: https://reviews.freebsd.org/D40746	2023-06-27 00:42:41 -05:00
Alan Cox	d8e6f4946c	vm: Fix anonymous memory clustering under ASLR By default, our ASLR implementation is supposed to cluster anonymous memory allocations, unless the application's mmap(..., MAP_ANON, ...) call included a non-zero address hint. Unfortunately, clustering never occurred because kern_mmap() always replaced the given address hint when it was zero. So, the ASLR implementation always believed that a non-zero hint had been provided and randomized the mapping's location in the address space. To fix this problem, I'm pushing down the point at which we convert a hint of zero to the minimum allocatable address from kern_mmap() to vm_map_find_min(). Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40743	2023-06-26 23:42:48 -05:00
Doug Moore	a42d8fe001	radix_trie: simplify trimkey functions Replacing a branch and two shifts with a single masking operation saves 64 bytes the pair of functions lookup_le and lookup_ge on amd64. Refresh the associated comments. Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D40722	2023-06-25 12:49:15 -05:00
Doug Moore	e8efee297c	radix_trie: avoid reloading radix node In the vm_radix:remove loop that searches for the last child, load that child once, without loading it again after the search is over. Change KASSERTS from index check to NULL node check. Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D40721	2023-06-23 18:47:23 -05:00
Doug Moore	1efa7dbc07	vm_radix: drop unused function; use bool. Replace boolean_t with bool in vm_radix.c. Drop the unused function vm_radix_is_singleton, which is unused and has no corresponding function in subr_pctrie.c. Reviewed by: alc Differential Revision: <https://reviews.freebsd.org/D40586>	2023-06-20 23:52:27 -05:00
Doug Moore	05963ea4d1	radix_trie: eliminate iteration in keydiff Use flsll(), instead of a loop, to find where two keys differ, and then arithmetic to transform that to a trie level. Approved by: alc, markj Differential Revision: https://reviews.freebsd.org/D40585	2023-06-20 11:30:29 -05:00
Alan Cox	58d4271721	vm_phys: Fix typo in `9e81742892`	2023-06-16 03:12:42 -05:00
Doug Moore	9e81742892	vm_phys: add binary segment search Replace several sequential searches for a segment that contains a phyiscal address with a call to a function that does it by binary search. In vm_page_reclaim_contig_domain_ext, find the first segment to reclaim from, and reclaim from each subsequent appropriate segment. Eliminate vm_phys_scan_contig. Reviewed by: alc, markj Differential Revision: https://reviews.freebsd.org/D40058	2023-06-16 01:43:45 -05:00
Mark Johnston	6062d9faf2	vm_phys: Change the return type of vm_phys_unfree_page() to bool This is in keeping with the trend of removing uses of boolean_t, and the sole caller was implicitly converting it to a "bool". No functional change intended. Reviewed by: dougm, alc, imp, kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40401	2023-06-05 12:22:11 -04:00
Colin Percival	45cc8519f5	tslog: Annotate parts of SYSINIT cpu Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM, SYSINIT cpu takes roughly 2770 us: * 2280 us in vm_ksubmap_init * 535 us in kmem_malloc * 450 us in pmap_zero_page * 1720 us in pmap_growkernel * 1620 us in pmap_zero_page * 80 us in bufinit * 480 us in cpu_setregs * 430 us in cpu_setregs calling load_cr0 Much of this is hypervisor overhead: load_cr0 is slow because it traps to the hypervisor, and 99% of the time in pmap_zero_page is spent when we first touch the page, presumably due to the host Linux kernel faulting in backing pages one by one. Sponsored by: https://www.patreon.com/cperciva Differential Revision: https://reviews.freebsd.org/D40327	2023-06-04 10:16:35 -07:00
Warner Losh	4d846d260e	spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix	2023-05-12 10:44:03 -06:00
Andrew Gallatin	8b0dafdb2f	vm: implement vm_page_reclaim_contig_domain_ext() Implement vm_page_reclaim_contig_domain_ext() to reclaim multiple contiguous regions at once. This makes it more efficient for users that need multiple contiguous regions to reclaim those regions efficiently. This is needed because callers like ktls may need to reclaim many contiguous regions, and each scan of physical memory can take multiple seconds on a large memory machine (order of 100GB of RMA). Rather than modifying the core algorithm, I extended vm_page_reclaim_contig_domain() to take a "desired_runs" argument to allow the caller to request that it reclaim more than just a single run. There is no functional change intended for all existing callers. The first user for this interface is the ktls code (https://reviews.freebsd.org/D39421). By reclaiming multiple runs, ktls goes from consuming hours of CPU to refill its buffer zone to just seconds or minutes. Differential Revision: https://reviews.freebsd.org/D39739 Sponsored by: Netflix Reviewed by: alc, jhb, markj	2023-05-09 13:09:34 -04:00
Dimitry Andric	f74be55e30	vm: fix a number of functions to match the expected prototypes Noticed while attempting to make boolean_t unsigned: some vm-related function declarations and defintions were using boolean_t where they should have used int, and vice versa. MFC after: 1 week Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D39753	2023-04-25 19:58:18 +02:00
Konstantin Belousov	1e0e335b0f	amd64: fix PKRU and swapout interaction When vm_map_remove() is called from vm_swapout_map_deactivate_pages() due to swapout, PKRU attributes for the removed range must be kept intact. Provide a variant of pmap_remove(), pmap_map_delete(), to allow pmap to distinguish between real removes of the UVA mappings and any other internal removes, e.g. swapout. For non-amd64, pmap_map_delete() is stubbed by define to pmap_remove(). Reported by: andrew Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39556	2023-04-15 02:53:59 +03:00
Konstantin Belousov	28f957b8b3	vnode_pager_input: return runningbufspace back Both vnode_pager_input_smlfs() and vnode_pager_generic_getpages() increment runningbufspace, but also both delegate io completion handling on the pbuf to either plain bdone() or filesystem-specific strategy routine. Accidentally, for e.g. UFS it is g_vfs_strategy()/g_vfs_done(). The later calls bufdone() which handles runningbufspace reclamation. For plain bdone() io done handler, nothing would return accounted b_runningbufspace back. Do it in the new helper vnode_pager_input_bdone(), as well as in vnode_pager_generic_getpages_done() explicitly. Note that potential multiple calls to runningbufwakeup() for the same pbuf or buf completion are safe. runningbufwakeup() clears accounting for the buffer, so second and later calls are nop. The problem was found due to tarfs using small vnode pager input but not g_vfs_strategy(). Reported by: des Reviewed by: markj, sjg Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39263	2023-03-26 00:55:29 +02:00
Mateusz Guzik	0e71f4f77c	vm: add unlocked page lookup before trying vm_fault_soft_fast Shaves a read lock + tryupgrade trip most of the time. Stats from doing a kernel build (counters not present in the tree): vm.fault_soft_fast_ok: 262653 vm.fault_soft_fast_failed_other: 41 vm.fault_soft_fast_failed_no_page: 39595772 vm.fault_soft_fast_failed_page_busy: 1929 vm.fault_soft_fast_failed_page_invalid: 22183 Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D39268	2023-03-25 22:14:59 +00:00
Mateusz Guzik	0a310c94ee	vm: consistently prefix fault helpers with vm_fault_ Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D39029	2023-03-13 11:00:28 +00:00
Mateusz Guzik	3c3a434f8e	vm: avoid lock upgrade if possible in vm_fault_next In my tests during buildkernel fs->m was always NULL at that stage. Note the change has no impact on vm obj contention during said workload. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D39027	2023-03-11 21:52:01 +00:00
Mateusz Guzik	fdb1dbb1cc	vm: read-locked fault handling for backing objects This is almost the simplest patch which manages to avoid write locking for backing objects, as a result mostly fixing vm object contention problems. What is not fixed: 1. cacheline ping pong due to read-locks 2. cacheline ping pong due to pip 3. cacheling ping pong due to object busying 4. write locking on first object On top of it the use of VM_OBJECT_UNLOCK instead of explicitly tracking the state is slower multithreaded that it needs to be, done for simplicity for the time being. Sample lock profiling results doing -j 104 buildkernel on tmpfs: before: 71446200 (rw:vmobject) 14689706 (sx:vm map (user)) 4166251 (rw:pmap pv list) 2799924 (spin mutex:turnstile chain) after: 19940411 (rw:vmobject) 8166012 (rw:pmap pv list) 6017608 (sx:vm map (user)) 1151416 (sleep mutex:pipe mutex) Reviewed by: kib Reviewed by: markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D38964	2023-03-11 11:08:21 +00:00
Mateusz Guzik	bdfd1adc99	vm: add VM_OBJECT_UNLOCK Reviewed by: kib Reviewed by: markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D38964	2023-03-11 11:08:21 +00:00
Mateusz Guzik	73b951cd39	vm: move up object lock asserts in fault functions No functional changes. Reviewed by: kib Reviewed by: markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D38964	2023-03-11 11:08:21 +00:00
Mark Johnston	e08302f649	vm_fault: Update a comment to reflect the removal of the default pager Fixes: `5d32157d4e` ("vm_object: Modify vm_object_allocate_anon() to return OBJT_SWAP objects") Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D38985	2023-03-09 11:15:49 -05:00
Ed Maste	c3821149f4	Drop space in "vm object" lock name to improve wchan Lock names are shown in top as a `` followed by the first five characters of the name. `vmobj` a little more obvious and easier to search for than `*vm ob`. Differential Revision: https://reviews.freebsd.org/D36264	2023-02-15 08:31:17 -05:00
Mark Johnston	d099194818	vm_fault: Fix a race in vm_fault_soft_fast() When vm_fault_soft_fast() creates a mapping, it release the VM map lock before unbusying the top-level object. Without the map lock, however, nothing prevents the VM object from being deallocated while still busy. Fix the problem by unbusying the object before releasing the VM map lock. If vm_fault_soft_fast() fails to create a mapping, the VM map lock is not released, so those cases don't need to change. Reported by: syzkaller Reviewed by: kib (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D38527	2023-02-13 16:35:47 -05:00
Mateusz Guzik	bbb6228eae	vm: ansify Sponsored by: Rubicon Communications, LLC ("Netgate")	2023-02-13 18:23:21 +00:00
Andrew Gallatin	9cb6ba29cb	vm: centralize VM_BATCHQUEUE_SIZE definition Remove the platform-specific definitions of VM_BATCHQUEUE_SIZE for amd64 and powerpc64, and instead treat all 64-bit platforms identically. This has the effect of increasing the arm64 and riscv VM_BATCHQUEUE_SIZE to match that of other platforms. Reviewed by: jhb, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D37707	2023-01-21 14:30:00 -05:00
Konstantin Belousov	6189672e60	Handle ERELOOKUP from VOP_FSYNC() in several other places We need to repeat the operation if the vnode was relocked. Reported and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D38114	2023-01-20 03:54:56 +02:00
Konstantin Belousov	70e1b11216	vm_object.c: minor style Sponsored by: The FreeBSD Foundation MFC after: 3 days	2023-01-20 03:54:42 +02:00
Mark Johnston	b050ee6c97	vm_object: Fix a kernel memory disclosure via the vm_object list sysctl Reported by: Chris J-D <chris@accessvector.net> MFC after: 1 week Sponsored by: The FreeBSD Foundation	2023-01-16 11:27:54 -05:00
Mateusz Guzik	f45feecfb2	vfs: add vn_getsize getattr is very expensive and in important cases only gets called to get the size. This can be optimized with a dedicated routine which obtains that statistic. As a step towards that goal make size-only consumers use a dedicated routine. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D37885	2022-12-28 22:43:49 +00:00
Konstantin Belousov	3249449190	vm_page_grab_valid(): clear *mp in case of pager denying page allocation Same as it is done in other error return cases. Callers depend on error case returning NULL, e.g. vm_imgact_hold_page(). Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37719	2022-12-17 19:01:43 +02:00
Andrew Gallatin	1cac76c93f	vm: reduce lock contention when processing vm batchqueues Rather than waiting until the batchqueue is full to acquire the lock & process the queue, we now start trying to acquire the lock using trylocks when the batchqueue is 1/2 full. This removes almost all contention on the vm pagequeue mutex for for our busy sendfile() based web workload. It also greadly reduces the amount of time a network driver ithread remains blocked on a mutex, and eliminates some packet drops under heavy load. So that the system does not loose the benefit of processing large batchqueues, I've doubled the size of the batchqueues. This way, when there is no contention, we process the same batch size as before. This has been run for several months on a busy Netflix server, as well as on my personal desktop. Reviewed by: markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D37305	2022-12-14 14:34:07 -05:00
Konstantin Belousov	645510e62e	Provide consistent prototype for swp_pager_meta_free() This should fix 32bit build breakage. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2022-12-09 17:23:09 +02:00
Konstantin Belousov	cd086696c2	vm_pager_allocate(): override resulting object type For dynamically allocated pager type, which inherits the parent's alloc method, type of the returned object is set to the parent's type otherwise. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37097	2022-12-09 14:17:03 +02:00
Konstantin Belousov	ec201dddfb	vm_pager: add method to veto page allocation Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37097	2022-12-09 14:15:37 +02:00
Konstantin Belousov	d537d1f12e	vm_pager: add methods for page insertion and removal notifications Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37097	2022-12-09 14:15:37 +02:00
Konstantin Belousov	d9dc64f158	tmpfs: make vm_object point to the tmpfs node instead of vnode The vnode could be reclaimed and allocated again during the lifecycle of the node, but the node cannot. Also, referencing the node would allow to reach it and tmpfs mount data from the object, regardless of the state of the possibly absent vnode. Still use swp_tmpfs for back-pointer, instead of using handle. Use of named swap objects would incur taking the sw_alloc_sx on node allocation and deallocation. swp_tmpfs is renamed to swp_priv to remove the last bit of tmpfs in vm/. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37097	2022-12-09 14:15:37 +02:00
Konstantin Belousov	baa1ccceef	Make swap_pager_freespace() global also make it return the count of the swap pages freed, which are not simultaneously resident in the object. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37097	2022-12-09 14:15:37 +02:00
Mitchell Horne	03d6764b38	ddb: don't limit pindex output in 'show vmopag' This command already prints a tremendous amount of output, and properly obeys the pager. It no longer makes sense to arbitrarily limit the pages that are printed, as the reader will not be aware that this has happened. Reviewed by: markj MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D37361	2022-11-11 14:25:39 -04:00
Anton Rang	cfbf1da0de	vm_page_unswappable: remove wrong assertion markj says: ...the assertion is incorrect and should simply be removed. It has been racy since we removed the use of the page hash lock to synchronize wiring of pages. PR: 267621 Reviewed by: markj, Anton Rang <rang@acm.org> MFC after: 1 week Sponsored by: Dell Inc. Differential Revision: https://reviews.freebsd.org/D37320	2022-11-09 14:28:03 -06:00
Mark Johnston	2dba2288aa	uma: Never pass cache zones to memguard Items allocated from cache zones cannot usefully be protected by memguard. PR: 267151 Reported and tested by: pho MFC after: 1 week	2022-10-19 14:36:36 -04:00
Konstantin Belousov	934bfc128e	Add vm_page_any_valid() Use it and several other vm_page_*_valid() functions in more places. Suggested and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37024	2022-10-19 20:24:07 +03:00
Konstantin Belousov	5bd45b2ba3	swap_pager_find_least(): assert that the function is called on the right object type Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37024	2022-10-19 20:24:07 +03:00
Mark Johnston	2c9dc2384f	vm_page: Fix a logic error in the handling of PQ_ACTIVE operations As an optimization, vm_page_activate() avoids requeuing a page that's already in the active queue. A page's location in the active queue is mostly unimportant. When a page is unwired and placed back in the page queues, vm_page_unwire() avoids moving pages out of PQ_ACTIVE to honour the request, the idea being that they're likely mapped and so will simply get bounced back in to PQ_ACTIVE during a queue scan. In both cases, if the page was logically in PQ_ACTIVE but had not yet been physically enqueued (i.e., the page is in a per-CPU batch), we would end up clearing PGA_REQUEUE from the page. Then, batch processing would ignore the page, so it would end up unwired and not in any queues. This can arise, for example, when a page is allocated and then vm_page_activate() is called multiple times in quick succession. The result is that the page is hidden from the page daemon, so while it will be freed when its VM object is destroyed, it cannot be reclaimed under memory pressure. Fix the bug: when checking if a page is in PQ_ACTIVE, only perform the optimization if the page is physically enqueued. PR: 256507 Fixes: `f3f38e2580` ("Start implementing queue state updates using fcmpset loops.") Reviewed by: alc, kib MFC after: 1 week Sponsored by: E-CARD Ltd. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D36839	2022-10-05 15:12:46 -04:00
John Baldwin	a9fca3b987	Fix various places which cast a pointer to a vm_paddr_t or vice versa. GCC warns about the mismatched sizes on i386 where vm_paddr_t is 64 bits. Reviewed by: imp, markj Differential Revision: https://reviews.freebsd.org/D36750	2022-10-03 16:10:41 -07:00
John Baldwin	f49fd63a6a	kmem_malloc/free: Use void * instead of vm_offset_t for kernel pointers. Reviewed by: kib, markj Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D36549	2022-09-22 15:09:19 -07:00

1 2 3 4 5 ...

4747 commits