4610 commits

Author SHA1 Message Date
Gordon Bergling
d44793af7e vm_page: Fix a typo in a source code comment
- s/consistancy/consistency/

(cherry picked from commit f77a88c855)
2022-06-10 14:30:17 +02:00
Gordon Bergling
59f3ff80f5 vm: Fix a common typo in a source code comment
- s/independant/independent/

(cherry picked from commit 860740ae0f)
2022-06-10 14:25:56 +02:00
John Baldwin
0c1258e707 Add a VA_IS_CLEANMAP() macro.
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.

In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions.  Abstracting this check reduces the diff relative
to FreeBSD.  It is perhaps slightly more readable as well.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D28710

(cherry picked from commit 67932460c7)
2022-05-10 10:47:07 -07:00
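
The kind of range check that a macro like VA_IS_CLEANMAP() abstracts can be sketched in userspace as follows. This is only an illustration: the bounds `example_clean_sva` and `example_clean_eva`, the macro `EXAMPLE_VA_IS_CLEANMAP`, and the helper `example_is_clean()` are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds of a "clean" submap, chosen only for illustration. */
static const uintptr_t example_clean_sva = 0x1000;
static const uintptr_t example_clean_eva = 0x5000;

/* True when va lies in the half-open range [example_clean_sva, example_clean_eva). */
#define EXAMPLE_VA_IS_CLEANMAP(va)					\
	((uintptr_t)(va) >= example_clean_sva &&			\
	    (uintptr_t)(va) < example_clean_eva)

/* Function wrapper so callers do not repeat the comparison. */
static bool
example_is_clean(uintptr_t va)
{
	assert(example_clean_sva < example_clean_eva);
	return (EXAMPLE_VA_IS_CLEANMAP(va));
}
```

Hiding the comparison behind one name means a kernel whose clean regions are laid out differently only needs to change the macro body, not every caller.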
Mark Johnston
11eac05a40 vm: Move the "vm_wait in early boot" assertion to the proper place
The assertion was added in commit 1771e987ca.  After that, vm_wait()
and friends were refactored such that the actual sleep happens
elsewhere.  Now the assertion condition is not checked when
vm_wait_doms() is called directly, and it is checked even if we are not
going to sleep (because vm_page_count_min_set(wdoms) is false).

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 6fb7c42d59)
2022-04-21 09:18:14 -04:00
Mark Johnston
9a4e701578 vm: Initialize the transient buffer mapping arena with M_WAITOK
The wait flag is passed to UMA when allocating boundary tags for the
initial span, and UMA expects either M_WAITOK or M_NOWAIT to be present.

Reported by:	cperciva
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f82177b8cf)
2022-04-21 09:18:04 -04:00
Mark Johnston
f9677b7e74 uma: Don't allow a limit to be set in a warm zone
The limit accounting in UMA does not tolerate this.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit d53927b0ba)
2022-04-13 08:10:35 -04:00
Gordon Bergling
6265d53f90 memguard(9): Fix two typos in source code comments
- s/comparsion/comparison/

(cherry picked from commit f167c46e79)
2022-04-09 08:08:00 +02:00
Mark Johnston
229eff21b7 uma: Use the correct type for a return value
zone_alloc_bucket() returns a pointer, not a bool.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 54361f9020)
2022-04-06 20:30:45 -04:00
Mark Johnston
13ba1d2836 vm_pageout: Print a more accurate message to the console before an OOM kill
Previously we'd always print "out of swap space."  This can be
misleading, as there are other reasons an OOM kill can be triggered.  In
particular, it's entirely possible to trigger an OOM kill on a system
with plenty of free swap space.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 4a864f624a)
2022-02-28 09:06:58 -05:00
Hans Petter Selasky
6bf4f9268f uma: Add UMA_ZONE_UNMANAGED
Allow a zone to opt out of cache size management.  In particular,
uma_reclaim() and uma_reclaim_domain() will not reclaim any memory from
the zone, nor will uma_timeout() purge cached items if the zone is idle.
This effectively means that the zone consumer has control over when
items are reclaimed from the cache.  In particular, uma_zone_reclaim()
will still reclaim cached items from an unmanaged zone.

Reviewed by:	hselasky, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34142

(cherry picked from commit 389a3fa693)
2022-02-24 10:59:28 +01:00
Edward Tomasz Napierala
b2db87294a Make vmdaemon timeout configurable
Make vmdaemon timeout configurable, so that one can adjust
how often it runs.

Here's a trick: set this to 1, then run 'limits -m 0 sh',
then run whatever you want with 'ktrace -it XXX', and observe
how the working set changes over time.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D22038

(cherry picked from commit 0f559a9f09)
2022-02-14 19:28:56 +00:00
John Baldwin
1a9f14cfa5 Use vmspace->vm_stacktop in place of sv_usrstack in more places.
Reviewed by:	markj
Obtained from:	CheriBSD

(cherry picked from commit becaf6433b)
2022-02-16 11:55:37 -05:00
Mark Johnston
a097a58543 fork: Copy the vm_stacktop field into the new vmspace
Fixes:	1811c1e957 ("exec: Reimplement stack address randomization")
Reported by:	pho
Reported by:	syzbot+0446312a51bc13ead834@syzkaller.appspotmail.com
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 46d35d415a)
2022-02-16 11:55:11 -05:00
Mark Johnston
5fa005e915 exec: Reimplement stack address randomization
The approach taken by the stack gap implementation was to insert a
random gap between the top of the fixed stack mapping and the true top
of the main process stack.  This approach was chosen so as to avoid
randomizing the previously fixed address of certain process metadata
stored at the top of the stack, but had some shortcomings.  In
particular, mlockall(2) calls would wire the gap, bloating the process'
memory usage, and RLIMIT_STACK included the size of the gap so small
(< several MB) limits could not be used.

There is little value in storing each process' ps_strings at a fixed
location, as only very old programs hard-code this address; consumers
were converted decades ago to use a sysctl-based interface for this
purpose.  Thus, this change re-implements stack address randomization by
simply breaking the convention of storing ps_strings at a fixed
location, and randomizing the location of the entire stack mapping.
This implementation is simpler and avoids the problems mentioned above,
while being unlikely to break compatibility anywhere the default ASLR
settings are used.

The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack,
and is re-enabled by default.

PR:		260303
Reviewed by:	kib
Discussed with:	emaste, mw
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 1811c1e957)
2022-02-16 11:55:03 -05:00
Konstantin Belousov
3261dea72c Revert "vm_pageout_scans: correct detection of active object"
This reverts commit 3de96d664a.

PR:	261707

(cherry picked from commit b51927b7b0)
2022-02-10 16:56:15 +02:00
Konstantin Belousov
c383935d00 vmmeter(): Fix detection of the named swap objects
(cherry picked from commit 0b8643eaf6)
2022-02-09 02:42:44 +02:00
Konstantin Belousov
1617ca2c16 vm_object: restore handling of shadow_count for all types of objects
(cherry picked from commit 4cf9f5d807)
2022-02-09 02:42:44 +02:00
Rick Macklem
6d95a66f1b vm_object: Make is_object_active() global
(cherry picked from commit cd37afd8b6)
2022-02-09 02:42:44 +02:00
Konstantin Belousov
64e0f75f39 vm/vm_extern.h, vm/vm_page.h: use sys/kassert.h
(cherry picked from commit d950c5898a)
2022-02-08 08:42:07 +02:00
Konstantin Belousov
38a7a43505 vm/vm_pager.h: use sys/systm.h header
(cherry picked from commit f4cdb9d7c3)
2022-02-08 08:42:07 +02:00
Konstantin Belousov
78d27f25c7 Use dedicated lock name for pbufs
(cherry picked from commit 531f8cfea0)
2022-02-07 11:38:49 +02:00
Konstantin Belousov
2366d7ce87 vm_pageout_scans: correct detection of active object
(cherry picked from commit 3de96d664a)
2022-01-29 03:10:44 +02:00
Mark Johnston
0883a572d2 uma: Avoid polling for an invalid SMR sequence number
Buckets in an SMR-enabled zone can legitimately be tagged with
SMR_SEQ_INVALID.  This effectively means that the zone destructor (if
any) was invoked on all items in the bucket, and the contained memory is
safe to reuse.  If the first bucket in the full bucket list was tagged
this way, UMA would unnecessarily poll per-CPU state before attempting
to fetch a full bucket from the list.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit a04ce833f9)
2022-01-28 09:13:24 -05:00
Mark Johnston
3f85c51824 swap_pager: uma_zcreate() doesn't fail
Remove always-false checks for UMA zone creation failure.  No functional
change intended.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 43b3b8e52d)
2022-01-18 08:36:13 -05:00
Mark Johnston
d41768d5c1 vm_pageout: Group sysctl variables together with sysctl definitions
Fix some style bugs while here.  No functional change intended.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit c4a25e0713)
2022-01-18 08:36:04 -05:00
Dawid Gorecki
16a900ae02 setrlimit: Take stack gap into account.
Calling setrlimit with stack gap enabled and with low values of stack
resource limit often caused the program to abort immediately after
exiting the syscall. This happened due to the fact that the resource
limit was calculated assuming that the stack started at sv_usrstack,
while with stack gap enabled the stack is moved by a random number
of bytes.

Save information about stack size in struct vmspace and adjust the
rlim_cur value. If the sum of rlim_cur and the stack gap is bigger than
rlim_max, then the value is truncated to rlim_max.

PR: 253208
Reviewed by: kib
Obtained from: Semihalf
Sponsored by: Stormshield
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D31516

(cherry picked from commit 889b56c8cd)
2021-12-30 16:24:59 +01:00
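
The limit adjustment described in the setrlimit commit above amounts to a clamped addition. A minimal sketch, assuming a hypothetical helper `example_adjust_stack_limit()` that is not taken from the kernel source:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: add the stack gap to the requested soft limit,
 * truncating the result to the hard limit if the sum would exceed it.
 */
static uint64_t
example_adjust_stack_limit(uint64_t rlim_cur, uint64_t rlim_max, uint64_t gap)
{
	assert(rlim_cur <= rlim_max);
	/* Compare against the headroom so rlim_cur + gap cannot overflow. */
	if (gap > rlim_max - rlim_cur)
		return (rlim_max);
	return (rlim_cur + gap);
}
```

For example, a soft limit of 100 with a gap of 50 under a hard limit of 1000 yields 150, while a soft limit of 900 with a gap of 200 is truncated to 1000.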
Stephen J. Kiernan
bd7e18a378 Eliminate key press requirement in "show vmopag" command output.
Summary:
One was required to press a key to continue after every 18 lines of
output. This requirement had been in the "show vmopag" command since it
was introduced, which was many years before paging was added to DDB.
With paging, this explicit key check is no longer necessary.

Obtained from:	Juniper Networks, Inc.
MFC after:	1 week

Test Plan:
Run "show vmopag" from db> prompt and see that it does not need additional
keypresses other than the ones needed for the pager.

Subscribers: imp, #contributor_reviews_base

Differential Revision: https://reviews.freebsd.org/D33550

(cherry picked from commit 18048b6e3c)
2021-12-29 14:32:48 -05:00
Doug Moore
dd8ea1c755 vm_reserv: fix zero-boundary error
Handle specially the boundary==0 case of vm_reserv_reclaim_config,
by turning off boundary adjustment in that case.

Reviewed by:	alc
Tested by:	pho, madpilot

(cherry picked from commit 49fd2d51f0)
2021-12-29 11:23:48 -06:00
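
For context on the boundary==0 case in the vm_reserv commit above: a boundary of zero conventionally means "no boundary constraint" in contiguous allocation requests, so any power-of-two arithmetic on it needs a special case. The sketch below is a generic boundary-crossing test, not the code of vm_reserv_reclaim_config(); `example_crosses_boundary()` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether [start, start + size) crosses a
 * multiple of boundary.  boundary must be zero or a power of two.
 */
static bool
example_crosses_boundary(uint64_t start, uint64_t size, uint64_t boundary)
{
	assert(size > 0);
	assert(boundary == 0 || (boundary & (boundary - 1)) == 0);
	if (boundary == 0)
		return (false);	/* Zero means no boundary constraint. */
	/* First and last byte differ in a bit at or above the boundary bit. */
	return (((start ^ (start + size - 1)) & ~(boundary - 1)) != 0);
}
```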
Mark Johnston
0fc6eebbf7 vm_fault: Fix vm_fault_populate()'s handling of VM_FAULT_WIRE
vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in
the range.  (For largepage mappings, it calls vm_fault() once per large
page.)

A pager's populate method may return more than one page to be mapped.
If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not
just the fault page.  Consider an object with two pages mapped in a
vm_map_entry, and suppose vm_map_wire() is called on the entry.  Then,
the first vm_fault() would allocate and wire both pages, and the second
would encounter a valid page upon lookup and wire it again in the
regular fault handler.  So the second page is wired twice and will be
leaked when the object is destroyed.

Fix the problem by modifying vm_fault_populate() to wire only the fault
page.  Also modify the error handler for pmap_enter(psind=1) to not test
fs->wired, since it must be false.

PR:		260347
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 88642d978a)
2021-12-27 19:36:07 -05:00
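
The double-wiring scenario in the vm_fault_populate() commit above can be reproduced with a toy model. This is a userspace simulation under simplifying assumptions (two pages, a pager that always populates both); `struct example_page`, `example_fault_wire()`, and `example_wire_range()` are hypothetical names and do not mirror the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_NPAGES 2

struct example_page {
	bool valid;
	int wire_count;
};

/*
 * Toy wiring fault.  If the fault page is already valid, wire it once
 * (the regular path).  Otherwise the pager "populates" the whole run;
 * wire_whole_run selects the old behavior (wire every page in the run)
 * or the fixed behavior (wire only the fault page).
 */
static void
example_fault_wire(struct example_page *pages, size_t fault_idx,
    bool wire_whole_run)
{
	size_t i;

	assert(fault_idx < EXAMPLE_NPAGES);
	if (pages[fault_idx].valid) {
		pages[fault_idx].wire_count++;
		return;
	}
	for (i = 0; i < EXAMPLE_NPAGES; i++) {
		pages[i].valid = true;
		if (wire_whole_run || i == fault_idx)
			pages[i].wire_count++;
	}
}

/* Fault once per page, as a range-wiring loop would; return page 1's count. */
static int
example_wire_range(bool wire_whole_run)
{
	struct example_page pages[EXAMPLE_NPAGES] = {{false, 0}, {false, 0}};
	size_t i;

	for (i = 0; i < EXAMPLE_NPAGES; i++)
		example_fault_wire(pages, i, wire_whole_run);
	return (pages[1].wire_count);
}
```

With `wire_whole_run` set, the second page ends up with a wire count of 2 after a single pass over the range; wiring only the fault page leaves it at 1.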
Jason A. Harmening
fa4e4d55b3 Clean up a couple of MD warts in vm_fault_populate():
--Eliminate a big ifdef that encompassed all currently-supported
architectures except mips and powerpc32.  This applied to the case
in which we've allocated a superpage but the pager-populated range
is insufficient for a superpage mapping.  For platforms that don't
support superpages the check should be inexpensive as we shouldn't
get a superpage in the first place.  Make the normal-page fallback
logic identical for all platforms and provide a simple implementation
of pmap_ps_enabled() for MIPS and Book-E/AIM32 powerpc.

--Apply the logic for handling pmap_enter() failure if a superpage
mapping can't be supported due to additional protection policy.
Use KERN_PROTECTION_FAILURE instead of KERN_FAILURE for this case,
and note Intel PKU on amd64 as the first example of such protection
policy.

Reviewed by:	kib, markj, bdragon

(cherry picked from commit 8dc8feb53d)
2021-12-27 19:35:55 -05:00
Doug Moore
42f18ad112 Correct type size format error in KASSERT.
Reported by:	jenkins
Fixes:	6f1c890827 vm: Don't break vm reserv that can't meet align reqs

(cherry picked from commit f7aa44763d)
2021-12-23 02:02:42 -06:00
Doug Moore
3b8062cdd5 vm: Don't break vm reserv that can't meet align reqs
Function vm_reserv_test_contig has incorrectly used its alignment
and boundary parameters to find a well-positioned range of empty pages
in a reservation.  Consequently, a reservation could be broken
mistakenly when it was unable to provide a satisfactory set of pages.

Rename the function, correct the errors, and add assertions to detect
the error in case it appears again.

Reviewed by:	alc, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33344

(cherry picked from commit 6f1c890827)
2021-12-23 02:01:17 -06:00
Konstantin Belousov
1791debf4a swapoff: add one more variant of the syscall
For MFC, COMPAT_FREEBSD13 braces were removed.

(cherry picked from commit 5346570276)
2021-12-20 02:29:11 +02:00
Konstantin Belousov
45786883b0 swapoff(2): add a SWAPOFF_FORCE flag
(cherry picked from commit e8dc2ba29c)
2021-12-20 02:29:11 +02:00
Konstantin Belousov
6ceede7d36 swapoff(2): replace special device name argument with a structure
(cherry picked from commit a4e4132fa3)
2021-12-20 02:29:11 +02:00
Doug Moore
0848451a2e Set uninitialized popmap bits in vm_reserv_init
In vm_reserv_init, set all the marker popmap bits,
and not just the bits of the first popmap entry.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33258

(cherry picked from commit 9f32cb5b1c)
2021-12-13 23:09:13 -06:00
Mark Johnston
e302ae7756 vm_page: Tighten the object lock assertion in vm_page_invalid()
A page must not become invalid while vm_fault_soft_fast() is attempting
to map unbusied pages for reading.

Note that all callers hold the object write lock already, and
vm_page_set_invalid() asserts the object write lock.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 39a7396f5d)
2021-12-13 08:26:34 -05:00
Konstantin Belousov
dea036bd15 swap_pager.c: Remove MPSAFE and ARGSUSED annotations
(cherry picked from commit 6df359449f)
2021-12-10 04:32:18 +02:00
Mark Johnston
c4c2d50242 vm_fault: Factor out per-object operations into vm_fault_object()
No functional change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib

(cherry picked from commit d47d3a94bb)
2021-12-08 08:41:30 -05:00
Mark Johnston
e01ba31b9d vm_fault: Introduce a fault_status enum for internal return types
Rather than overloading the meanings of the Mach statuses, introduce a
new set for use internally in the fault code.  This makes the control
flow easier to follow and provides some extra error checking when a
fault status variable is used in a switch statement.

vm_fault_lookup() and vm_fault_relookup() continue to use Mach statuses
for now, as there isn't much benefit to converting them and they
effectively pass through a status from vm_map_lookup().

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib

(cherry picked from commit f1b642c255)
2021-12-08 08:41:24 -05:00
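
The benefit of a dedicated internal status enum, as described in the fault_status commit above, can be sketched like this. The enumerators, the `EXAMPLE_KERN_*` values, and `example_handle()` are hypothetical and are not the names used in the fault code.

```c
#include <assert.h>

/* Hypothetical internal statuses, distinct from externally visible codes. */
enum example_fault_status {
	EXAMPLE_FAULT_SUCCESS,
	EXAMPLE_FAULT_RESTART,
	EXAMPLE_FAULT_FAILURE,
};

/* Hypothetical externally visible return codes. */
#define EXAMPLE_KERN_SUCCESS	0
#define EXAMPLE_KERN_FAILURE	5
#define EXAMPLE_RETRY		(-1)

/*
 * Translate an internal status.  The switch has no default case, so a
 * compiler with -Wswitch enabled can warn if a new enumerator is added
 * without being handled here.
 */
static int
example_handle(enum example_fault_status st)
{
	switch (st) {
	case EXAMPLE_FAULT_SUCCESS:
		return (EXAMPLE_KERN_SUCCESS);
	case EXAMPLE_FAULT_RESTART:
		return (EXAMPLE_RETRY);
	case EXAMPLE_FAULT_FAILURE:
		return (EXAMPLE_KERN_FAILURE);
	}
	assert(0 && "unhandled status");
	return (EXAMPLE_KERN_FAILURE);
}
```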
Mark Johnston
61c3b6832d vm_fault: Move nera into faultstate
This makes it easier to factor out pieces of vm_fault().  No functional
change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib

(cherry picked from commit 45c09a74d6)
2021-12-08 08:39:47 -05:00
Konstantin Belousov
08d995ca8f swapoff_one(): only check free pages count when manually turning swap off
(cherry picked from commit 0190c38b9d)
2021-12-06 02:29:43 +02:00
Mitchell Horne
233ec6b12b minidump: Use the provided dump bitset
When constructing the set of dumpable pages, use the bitset provided by
the state argument, rather than assuming vm_page_dump invariably. For
normal kernel minidumps this will be a pointer to vm_page_dump, but when
dumping the live system it will not.

To do this, the functions in vm_dumpset.h are extended to accept the
desired bitset as an argument. Note that this provided bitset is assumed
to be derived from vm_page_dump, and therefore has the same size.

Reviewed by:	kib, markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31992

(cherry picked from commit 10fe6f80a6)
2021-12-03 10:02:03 -04:00
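
The interface change in the minidump commit above, where bitset helpers take the set as an argument instead of assuming one global, can be sketched as follows. All names here (`example_global_dump`, `example_dump_add()`, and so on) are hypothetical and do not reproduce vm_dumpset.h.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_NPAGES		128
#define EXAMPLE_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define EXAMPLE_NWORDS							\
	((EXAMPLE_NPAGES + EXAMPLE_WORD_BITS - 1) / EXAMPLE_WORD_BITS)

/* Stand-in for the global set of dumpable pages. */
static unsigned long example_global_dump[EXAMPLE_NWORDS];

/* Each helper operates on the caller-supplied set, not on the global. */
static void
example_dump_add(unsigned long *set, size_t page)
{
	assert(page < EXAMPLE_NPAGES);
	set[page / EXAMPLE_WORD_BITS] |= 1UL << (page % EXAMPLE_WORD_BITS);
}

static bool
example_dump_test(const unsigned long *set, size_t page)
{
	assert(page < EXAMPLE_NPAGES);
	return (((set[page / EXAMPLE_WORD_BITS] >>
	    (page % EXAMPLE_WORD_BITS)) & 1UL) != 0);
}

static size_t
example_dump_count(const unsigned long *set)
{
	size_t i, n;

	n = 0;
	for (i = 0; i < EXAMPLE_NPAGES; i++)
		if (example_dump_test(set, i))
			n++;
	return (n);
}

/* Mark pages in a private set of the same size, leaving the global alone. */
static size_t
example_snapshot_demo(void)
{
	unsigned long snap[EXAMPLE_NWORDS] = {0};

	example_dump_add(example_global_dump, 3);
	example_dump_add(snap, 7);
	example_dump_add(snap, 70);
	return (example_dump_count(snap));
}
```

Because the private set has the same size as the global one, the same helpers work on either without further checks, which matches the size assumption stated in the commit message.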
Konstantin Belousov
3a98b98be5 swap_pager: lock vnode in swapdev_strategy()
(cherry picked from commit b19740f4ce)
2021-12-02 04:21:15 +02:00
Konstantin Belousov
4b2caeec43 swapon: extend the region where the swap vnode is locked
(cherry picked from commit 6ddf41faa6)
2021-12-02 04:21:14 +02:00
Konstantin Belousov
81c9a051ea swap pager: lock vnode around VOP_CLOSE()
(cherry picked from commit a6d04f34a4)
2021-12-02 04:21:14 +02:00
Mark Johnston
1556ae1356 vm_page: Remove vm_page_sbusy() and vm_page_xbusy()
They are unused today and cannot be safely used in the face of unlocked
lookup, in which pages may be busied without the object lock held.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib

(cherry picked from commit a2665158d0)
2021-11-29 09:11:37 -05:00
Mark Johnston
cb081566cf vm_page: Consolidate page busy sleep mechanisms
- Modify vm_page_busy_sleep() and vm_page_busy_sleep_unlocked() to take
  a VM_ALLOC_* flag indicating whether to sleep on shared-busy, and fix
  up callers.
- Modify vm_page_busy_sleep() to return a status indicating whether the
  object lock was dropped, and fix up callers.
- Convert callers of vm_page_sleep_if_busy() to use vm_page_busy_sleep()
  instead.
- Remove vm_page_sleep_if_(x)busy().

No functional change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib

(cherry picked from commit 87b646630c)
2021-11-29 09:11:29 -05:00
Mark Johnston
fdd27db348 vm: Add a mode to vm_object_page_remove() which skips invalid pages
This will be used to break a deadlock in ZFS between the per-mountpoint
teardown lock and page busy locks.  In particular, when purging data
from the page cache during dataset rollback, we want to avoid blocking
on the busy state of invalid pages since the busying thread may be
blocked on the teardown lock in zfs_getpages().

Add a helper, vn_pages_remove_valid(), for use by filesystems.  Bump
__FreeBSD_version so that the OpenZFS port can make use of the new
helper.

PR:		258208
Reviewed by:	avg, kib, sef
Tested by:	pho (part of a larger patch)
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit d28af1abf0)
2021-11-29 09:09:28 -05:00
Mark Johnston
0d900a16d0 vm_pager: Optimize an assertion
Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib

(cherry picked from commit b0acc3f11b)
2021-11-22 08:44:08 -05:00