opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-05-28 04:12:45 -04:00

Author	SHA1	Message	Date
Mark Johnston	3f85c51824	swap_pager: uma_zcreate() doesn't fail Remove always-false checks for UMA zone creation failure. No functional change intended. Reviewed by: alc, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `43b3b8e52d`)	2022-01-18 08:36:13 -05:00
Mark Johnston	d41768d5c1	vm_pageout: Group sysctl variables together with sysctl definitions Fix some style bugs while here. No functional change intended. Reviewed by: alc, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `c4a25e0713`)	2022-01-18 08:36:04 -05:00
Dawid Gorecki	16a900ae02	setrlimit: Take stack gap into account. Calling setrlimit with stack gap enabled and with low values of stack resource limit often caused the program to abort immediately after exiting the syscall. This happened due to the fact that the resource limit was calculated assuming that the stack started at sv_usrstack, while with stack gap enabled the stack is moved by a random number of bytes. Save information about stack size in struct vmspace and adjust the rlim_cur value. If the sum of rlim_cur and the stack gap is bigger than rlim_max, then the value is truncated to rlim_max. PR: 253208 Reviewed by: kib Obtained from: Semihalf Sponsored by: Stormshield MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D31516 (cherry picked from commit `889b56c8cd`)	2021-12-30 16:24:59 +01:00
Stephen J. Kiernan	bd7e18a378	Eliminate key press requirement for "show vmopag" command output. Summary: One was required to press a key to continue after every 18 lines of output. This requirement had been in the "show vmopag" command since it was introduced, which was many years before paging was added to DDB. With paging, this explicit key check is no longer necessary. Obtained from: Juniper Networks, Inc. MFC after: 1 week Test Plan: Run "show vmopag" from db> prompt and see that it does not need additional keypresses other than the ones needed for the pager. Subscribers: imp, #contributor_reviews_base Differential Revision: https://reviews.freebsd.org/D33550 (cherry picked from commit `18048b6e3c`)	2021-12-29 14:32:48 -05:00
Doug Moore	dd8ea1c755	vm_reserv: fix zero-boundary error Handle specially the boundary==0 case of vm_reserv_reclaim_config, by turning off boundary adjustment in that case. Reviewed by: alc Tested by: pho, madpilot (cherry picked from commit `49fd2d51f0`)	2021-12-29 11:23:48 -06:00
Mark Johnston	0fc6eebbf7	vm_fault: Fix vm_fault_populate()'s handling of VM_FAULT_WIRE vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in the range. (For largepage mappings, it calls vm_fault() once per large page.) A pager's populate method may return more than one page to be mapped. If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not just the fault page. Consider an object with two pages mapped in a vm_map_entry, and suppose vm_map_wire() is called on the entry. Then, the first vm_fault() would allocate and wire both pages, and the second would encounter a valid page upon lookup and wire it again in the regular fault handler. So the second page is wired twice and will be leaked when the object is destroyed. Fix the problem by modifying vm_fault_populate() to wire only the fault page. Also modify the error handler for pmap_enter(psind=1) to not test fs->wired, since it must be false. PR: 260347 Reviewed by: alc, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `88642d978a`)	2021-12-27 19:36:07 -05:00
Jason A. Harmening	fa4e4d55b3	Clean up a couple of MD warts in vm_fault_populate(): --Eliminate a big ifdef that encompassed all currently-supported architectures except mips and powerpc32. This applied to the case in which we've allocated a superpage but the pager-populated range is insufficient for a superpage mapping. For platforms that don't support superpages the check should be inexpensive as we shouldn't get a superpage in the first place. Make the normal-page fallback logic identical for all platforms and provide a simple implementation of pmap_ps_enabled() for MIPS and Book-E/AIM32 powerpc. --Apply the logic for handling pmap_enter() failure if a superpage mapping can't be supported due to additional protection policy. Use KERN_PROTECTION_FAILURE instead of KERN_FAILURE for this case, and note Intel PKU on amd64 as the first example of such protection policy. Reviewed by: kib, markj, bdragon (cherry picked from commit `8dc8feb53d`)	2021-12-27 19:35:55 -05:00
Doug Moore	42f18ad112	Correct type size format error in KASSERT. Reported by: jenkins Fixes: `6f1c890827` vm: Don't break vm reserv that can't meet align reqs (cherry picked from commit `f7aa44763d`)	2021-12-23 02:02:42 -06:00
Doug Moore	3b8062cdd5	vm: Don't break vm reserv that can't meet align reqs Function vm_reserv_test_contig has incorrectly used its alignment and boundary parameters to find a well-positioned range of empty pages in a reservation. Consequently, a reservation could be broken mistakenly when it was unable to provide a satisfactory set of pages. Rename the function, correct the errors, and add assertions to detect the error in case it appears again. Reviewed by: alc, markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D33344 (cherry picked from commit `6f1c890827`)	2021-12-23 02:01:17 -06:00
Konstantin Belousov	1791debf4a	swapoff: add one more variant of the syscall For MFC, COMPAT_FREEBSD13 braces were removed. (cherry picked from commit `5346570276`)	2021-12-20 02:29:11 +02:00
Konstantin Belousov	45786883b0	swapoff(2): add a SWAPOFF_FORCE flag (cherry picked from commit `e8dc2ba29c`)	2021-12-20 02:29:11 +02:00
Konstantin Belousov	6ceede7d36	swapoff(2): replace special device name argument with a structure (cherry picked from commit `a4e4132fa3`)	2021-12-20 02:29:11 +02:00
Doug Moore	0848451a2e	Set uninitialized popmap bits in vm_reserv_init In vm_reserv_init, set all the marker popmap bits, and not just the bits of the first popmap entry. Reviewed by: markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D33258 (cherry picked from commit `9f32cb5b1c`)	2021-12-13 23:09:13 -06:00
Mark Johnston	e302ae7756	vm_page: Tighten the object lock assertion in vm_page_invalid() A page must not become invalid while vm_fault_soft_fast() is attempting to map unbusied pages for reading. Note that all callers hold the object write lock already, and vm_page_set_invalid() asserts the object write lock. Reviewed by: kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `39a7396f5d`)	2021-12-13 08:26:34 -05:00
Konstantin Belousov	dea036bd15	swap_pager.c: Remove MPSAFE and ARGSUSED annotations (cherry picked from commit `6df359449f`)	2021-12-10 04:32:18 +02:00
Mark Johnston	c4c2d50242	vm_fault: Factor out per-object operations into vm_fault_object() No functional change intended. Obtained from: jeff (object_concurrency patches) Reviewed by: kib (cherry picked from commit `d47d3a94bb`)	2021-12-08 08:41:30 -05:00
Mark Johnston	e01ba31b9d	vm_fault: Introduce a fault_status enum for internal return types Rather than overloading the meanings of the Mach statuses, introduce a new set for use internally in the fault code. This makes the control flow easier to follow and provides some extra error checking when a fault status variable is used in a switch statement. vm_fault_lookup() and vm_fault_relookup() continue to use Mach statuses for now, as there isn't much benefit to converting them and they effectively pass through a status from vm_map_lookup(). Obtained from: jeff (object_concurrency patches) Reviewed by: kib (cherry picked from commit `f1b642c255`)	2021-12-08 08:41:24 -05:00
Mark Johnston	61c3b6832d	vm_fault: Move nera into faultstate This makes it easier to factor out pieces of vm_fault(). No functional change intended. Obtained from: jeff (object_concurrency patches) Reviewed by: kib (cherry picked from commit `45c09a74d6`)	2021-12-08 08:39:47 -05:00
Konstantin Belousov	08d995ca8f	swapoff_one(): only check free pages count when manually turning swap off (cherry picked from commit `0190c38b9d`)	2021-12-06 02:29:43 +02:00
Mitchell Horne	233ec6b12b	minidump: Use the provided dump bitset When constructing the set of dumpable pages, use the bitset provided by the state argument, rather than assuming vm_page_dump invariably. For normal kernel minidumps this will be a pointer to vm_page_dump, but when dumping the live system it will not. To do this, the functions in vm_dumpset.h are extended to accept the desired bitset as an argument. Note that this provided bitset is assumed to be derived from vm_page_dump, and therefore has the same size. Reviewed by: kib, markj, jhb MFC after: 2 weeks Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D31992 (cherry picked from commit `10fe6f80a6`)	2021-12-03 10:02:03 -04:00
Konstantin Belousov	3a98b98be5	swap_pager: lock vnode in swapdev_strategy() (cherry picked from commit `b19740f4ce`)	2021-12-02 04:21:15 +02:00
Konstantin Belousov	4b2caeec43	swapon: extend the region where the swap vnode is locked (cherry picked from commit `6ddf41faa6`)	2021-12-02 04:21:14 +02:00
Konstantin Belousov	81c9a051ea	swap pager: lock vnode around VOP_CLOSE() (cherry picked from commit `a6d04f34a4`)	2021-12-02 04:21:14 +02:00
Mark Johnston	1556ae1356	vm_page: Remove vm_page_sbusy() and vm_page_xbusy() They are unused today and cannot be safely used in the face of unlocked lookup, in which pages may be busied without the object lock held. Obtained from: jeff (object_concurrency patches) Reviewed by: kib (cherry picked from commit `a2665158d0`)	2021-11-29 09:11:37 -05:00
Mark Johnston	cb081566cf	vm_page: Consolidate page busy sleep mechanisms - Modify vm_page_busy_sleep() and vm_page_busy_sleep_unlocked() to take a VM_ALLOC_* flag indicating whether to sleep on shared-busy, and fix up callers. - Modify vm_page_busy_sleep() to return a status indicating whether the object lock was dropped, and fix up callers. - Convert callers of vm_page_sleep_if_busy() to use vm_page_busy_sleep() instead. - Remove vm_page_sleep_if_(x)busy(). No functional change intended. Obtained from: jeff (object_concurrency patches) Reviewed by: kib (cherry picked from commit `87b646630c`)	2021-11-29 09:11:29 -05:00
Mark Johnston	fdd27db348	vm: Add a mode to vm_object_page_remove() which skips invalid pages This will be used to break a deadlock in ZFS between the per-mountpoint teardown lock and page busy locks. In particular, when purging data from the page cache during dataset rollback, we want to avoid blocking on the busy state of invalid pages since the busying thread may be blocked on the teardown lock in zfs_getpages(). Add a helper, vn_pages_remove_valid(), for use by filesystems. Bump __FreeBSD_version so that the OpenZFS port can make use of the new helper. PR: 258208 Reviewed by: avg, kib, sef Tested by: pho (part of a larger patch) Sponsored by: The FreeBSD Foundation (cherry picked from commit `d28af1abf0`)	2021-11-29 09:09:28 -05:00
Mark Johnston	0d900a16d0	vm_pager: Optimize an assertion Obtained from: jeff (object_concurrency patches) Reviewed by: kib (cherry picked from commit `b0acc3f11b`)	2021-11-22 08:44:08 -05:00
Mark Johnston	ce9c3848ff	uma: Fix handling of reserves in zone_import() Kegs with no items reserved have uk_reserve = 0. So the check keg->uk_reserve >= dom->ud_free_items will be true once all slabs are depleted. Then, rather than go and allocate a fresh slab, we return to the cache layer. The intent was to do this only when the keg actually has a reserve, so modify the check to verify this first. Another approach would be to make uk_reserve signed and set it to -1 until uma_zone_reserve() is called, but this requires a few casts elsewhere. Fixes: `1b2dcc8c54` ("uma: Avoid depleting keg reserves when filling a bucket") Sponsored by: The FreeBSD Foundation (cherry picked from commit `7585c5db25`)	2021-11-15 09:07:10 -05:00
Mark Johnston	d5ebaa6f8f	uma: Improve M_USE_RESERVE handling in keg_fetch_slab() M_USE_RESERVE is used in a couple of places in the VM to avoid unbounded recursion when the direct map is not available, as is the case on 32-bit platforms or when certain kernel sanitizers (KASAN and KMSAN) are enabled. For example, to allocate KVA, the kernel might allocate a kernel map entry, which might require a new slab, which requires KVA. For these zones, we use uma_prealloc() to populate a reserve of items, and then in certain serialized contexts M_USE_RESERVE can be used to guarantee a successful allocation. uma_prealloc() allocates the requested number of items, distributing them evenly among NUMA domains. Thus, in a first-touch zone, to satisfy an M_USE_RESERVE allocation we might have to check the slab lists of other domains than the current one to provide the semantics expected by consumers. So, try harder to find an item if M_USE_RESERVE is specified and the keg doesn't have anything for current (first-touch) domain. Specifically, fall back to a round-robin slab allocation. This change fixes boot-time panics on NUMA systems with KASAN or KMSAN enabled.[1] Alternately we could have uma_prealloc() allocate the requested number of items for each domain, but for some existing consumers this would be quite wasteful. In general I think keg_fetch_slab() should try harder to find free slabs in other domains before trying to allocate fresh ones, but let's limit this to M_USE_RESERVE for now. Also fix a separate problem that I noticed: in a non-round-robin slab allocation with M_WAITOK, rather than sleeping after a failed slab allocation we simply try again. Call vm_wait_domain() before retrying. Reported by: mjg, tuexen [1] Reviewed by: alc Sponsored by: The FreeBSD Foundation (cherry picked from commit `fab343a716`)	2021-11-15 09:06:54 -05:00
Gordon Bergling	e3f2519c5c	Fix a common typo in sysctl descriptions - s/maxiumum/maximum/ (cherry picked from commit `c28e39c3d6`)	2021-11-06 08:52:57 +01:00
Mark Johnston	5dc9004b72	vm_page: Break reservations to handle noobj allocations vm_reserv_reclaim_*() will release pages to the default freepool, not the direct freepool from which noobj allocations are drawn. But if both pools are empty, the noobj allocator variants must break reservations to make progress. Reported by: cy Reviewed by: kib (previous version) Fixes: `b498f71bc5` ("vm_page: Add a new page allocator interface for unnamed pages") Sponsored by: The FreeBSD Foundation (cherry picked from commit `d7acbe481d`)	2021-11-03 13:44:47 -04:00
Mark Johnston	f86bda068c	Convert consumers to vm_page_alloc_noobj_contig() Remove now-unneeded page zeroing. No functional change intended. Reviewed by: alc, hselasky, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `84c3922243`)	2021-11-03 13:41:40 -04:00
Mark Johnston	fb3ba080a1	Introduce vm_page_alloc_noobj_contig() This is the same as vm_page_alloc_noobj(), but allocates physically contiguous runs of memory. For now it is implemented in terms of vm_page_alloc_contig(), with the difference that vm_page_alloc_noobj_contig() implements VM_ALLOC_ZERO by zeroing the page. Reviewed by: alc, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `92db9f3bb7`)	2021-11-03 13:41:00 -04:00
Mark Johnston	66cb1858f4	Convert vm_page_alloc() callers to use vm_page_alloc_noobj(). Remove page zeroing code from consumers and stop specifying VM_ALLOC_NOOBJ. In a few places, also convert an allocation loop to simply use VM_ALLOC_WAITOK. Similarly, convert vm_page_alloc_domain() callers. Note that callers are now responsible for assigning the pindex. Reviewed by: alc, hselasky, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `a4667e09e6`)	2021-11-03 13:39:36 -04:00
Mark Johnston	24204bede3	vm_page: Add a new page allocator interface for unnamed pages The diff adds vm_page_alloc_noobj() and vm_page_alloc_noobj_domain(). These mostly correspond to vm_page_alloc() and vm_page_alloc_domain() when no VM object is specified, with the exception that they handle VM_ALLOC_ZERO by zeroing the page, rather than by preserving PG_ZERO. This simplifies callers and will permit simplification of the vm_page_alloc_domain() definition. Since the new allocator variant is similar to vm_page_alloc_freelist(), implement both of them using a common backend allocator function. No functional change intended. Reviewed by: alc, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `b498f71bc5`)	2021-11-03 13:35:25 -04:00
Ryan Stone	8deb5f2f64	Add a VM flag to prevent reclaim on a failed contig allocation If a M_WAITOK contig alloc fails, the VM subsystem will try to reclaim contiguous memory twice before actually failing the request. On a system with 64GB of RAM I've observed this take 400-500ms before it finally gives up, and I believe that this will only be worse on systems with even more memory. In certain contexts this delay is extremely harmful, so add a flag that will skip reclaim for allocation requests to allow those paths to opt-out of doing an expensive reclaim. Sponsored by: Dell Inc Differential Revision: https://reviews.freebsd.org/D28422 Reviewed by: markj, kib (cherry picked from commit `660344ca44`)	2021-11-03 13:35:16 -04:00
Mark Johnston	bdfb568f8d	redzone: Raise a compile error if KASAN is configured redzone(9) does some munging of the allocation to insert redzones before and after a valid memory buffer, but KASAN does not know about this and will raise false positives if both are configured. Until this is fixed, do not allow both to be configured. Note that KASAN provides similar checking on its own but currently does not force the creation of redzones for all UMA allocations; this should be addressed as well. Sponsored by: The FreeBSD Foundation (cherry picked from commit `4e8e26a004`)	2021-11-01 10:07:31 -04:00
Mark Johnston	db33d492c8	uma: Fix a few problems with KASAN integration - Ensure that all items returned by UMA are aligned to KASAN_SHADOW_SCALE (8). This was true in practice since smaller alignments are not used by any consumers, but we should enforce it anyway. - Use a non-zero code for marking redzones that appear naturally in items that are not a multiple of the scale factor in size. Currently we do not modify keg layouts to force the creation of redzones. - Use a non-zero code for marking freed per-CPU items, otherwise accesses of freed per-CPU items are not detected by the runtime. Sponsored by: The FreeBSD Foundation (cherry picked from commit `b0dfc48684`)	2021-11-01 10:07:04 -04:00
Mark Johnston	28c338b342	realloc: Fix KASAN(9) shadow map updates When copying from the old buffer to the new buffer, we don't know the requested size of the old allocation, but only the size of the allocation provided by UMA. This value is "alloc". Because the copy may access bytes in the old allocation's red zone, we must mark the full allocation valid in the shadow map. Do so using the correct size. Reported by: kp Tested by: kp Sponsored by: The FreeBSD Foundation (cherry picked from commit `9a7c2de364`)	2021-11-01 10:05:22 -04:00
Mark Johnston	ed66f9c61b	kmem: Add KASAN state transitions Memory allocated with kmem_* is unmapped upon free, so KASAN doesn't provide a lot of benefit, but since allocations are always a multiple of the page size we can create a redzone when the allocation request size is not a multiple of the page size. Sponsored by: The FreeBSD Foundation (cherry picked from commit `2b914b85dd`)	2021-11-01 10:03:11 -04:00
Mark Johnston	9d95539ffe	kstack: Add KASAN state transitions We allocate kernel stacks using a UMA cache zone. Cache zones have KASAN disabled by default, but in this case it makes sense to enable it. Reviewed by: andrew (cherry picked from commit `244f3ec642`)	2021-11-01 10:03:02 -04:00
Mark Johnston	82f3e32c39	uma: Add KASAN state transitions - Add a UMA_ZONE_NOKASAN flag to indicate that items from a particular zone should not be sanitized. This is applied implicitly for NOFREE and cache zones. - Add KASAN callbacks which get invoked: 1) when a slab is imported into a keg 2) when an item is allocated from a zone 3) when an item is freed to a zone 4) when a slab is freed back to the VM In state transitions 1 and 3, memory is poisoned so that accesses will trigger a panic. In state transitions 2 and 4, memory is marked valid. - Disable trashing if KASAN is enabled. It just adds extra CPU overhead to catch problems that are detected by KASAN. Sponsored by: The FreeBSD Foundation (cherry picked from commit `09c8cb717d`)	2021-11-01 10:02:54 -04:00
Konstantin Belousov	9b392d0738	sysctl vm.objects: yield if hog (cherry picked from commit `350fc36b4c`)	2021-11-01 02:44:51 +02:00
Konstantin Belousov	5ac0e08ef6	vm.objects_swap: disable reporting some information (cherry picked from commit `7738118e9a`)	2021-11-01 02:44:51 +02:00
Konstantin Belousov	c54be5cfcf	Add vm.swap_objects sysctl (cherry picked from commit `42812ccc96`)	2021-11-01 02:44:51 +02:00
Konstantin Belousov	7db438d470	vm_object_list: split sysctl handler in separate function (cherry picked from commit `1b610624fd`)	2021-11-01 02:44:51 +02:00
Mark Johnston	74efe421ea	vm_page: Move vm_page_alloc_check() to after page allocator definitions This way all of the vm_page_alloc_*() allocator functions are grouped together. Sponsored by: The FreeBSD Foundation (cherry picked from commit `a23e6a1078`)	2021-10-27 09:53:29 -04:00
Mitchell Horne	5794f8c75e	minidump: De-duplicate is_dumpable() The function is identical in each minidump implementation, so move it to vm_phys.c. The only slight exception is powerpc where the function was public, for use in moea64_scan_pmap(). Reviewed by: kib, markj, imp (earlier version) MFC after: 2 weeks Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D31884 (cherry picked from commit `31991a5a45`)	2021-10-15 12:20:48 -03:00
Konstantin Belousov	0b29fd06da	vm_fault: do not trigger OOM too early (cherry picked from commit `174aad047e`)	2021-10-10 12:22:58 +03:00
Mark Johnston	e68465ecbf	uma: Show the count of free slabs in each per-domain keg's sysctl tree This is useful for measuring the number of pages that could be freed from a NOFREE zone under memory pressure. Sponsored by: The FreeBSD Foundation (cherry picked from commit `d6e77cda9b`)	2021-09-24 09:01:22 -04:00
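
Note on commit ce9c3848ff (uma: Fix handling of reserves in zone_import()): the message describes a comparison that is unintentionally true when a keg has no reserve and its slabs are depleted. The following is a minimal sketch of that logic only, not the kernel code. The struct names `keg_sketch` and `dom_sketch` and the helpers `stop_import_old()`, `stop_import_fixed()`, and `reserve_check_demo()` are hypothetical; only the field names `uk_reserve` and `ud_free_items` and the comparison itself come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins holding only the two counters the commit names. */
struct keg_sketch {
	uint32_t uk_reserve;	/* items reserved for the keg; 0 means no reserve */
};

struct dom_sketch {
	uint32_t ud_free_items;	/* free items left in this domain's slabs */
};

/*
 * Check as described before the fix: with uk_reserve == 0 and
 * ud_free_items == 0 this is true (0 >= 0), so the import stops
 * instead of going on to allocate a fresh slab.
 */
static bool
stop_import_old(const struct keg_sketch *keg, const struct dom_sketch *dom)
{
	return (keg->uk_reserve >= dom->ud_free_items);
}

/*
 * Check as described after the fix: first verify that the keg actually
 * has a reserve, then compare it against the free item count.
 */
static bool
stop_import_fixed(const struct keg_sketch *keg, const struct dom_sketch *dom)
{
	return (keg->uk_reserve != 0 &&
	    keg->uk_reserve >= dom->ud_free_items);
}

/* Small self-check exercising both variants. */
static void
reserve_check_demo(void)
{
	struct keg_sketch keg = { .uk_reserve = 0 };
	struct dom_sketch dom = { .ud_free_items = 0 };

	/* No reserve, slabs depleted: the old check stops, the fixed one does not. */
	assert(stop_import_old(&keg, &dom));
	assert(!stop_import_fixed(&keg, &dom));

	/* A real reserve that would be dipped into still stops the import. */
	keg.uk_reserve = 4;
	dom.ud_free_items = 3;
	assert(stop_import_fixed(&keg, &dom));
}
```

The commit message also mentions an alternative, making `uk_reserve` signed and using -1 as a "no reserve" marker, which it sets aside because it needs a few casts elsewhere; the sketch follows the approach the commit says was taken.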

4587 commits