With INVARIANTS defined, have vm_addr_align_ok and vm_addr_bound_ok
panic when passed an alignment/boundary parameter that is not a power
of two.
Reviewed by: alc
Suggested by: kib, se
Differential Revision: https://reviews.freebsd.org/D33725
(cherry picked from commit ae13829ddc)
Move an assignment back to where it was before, to turn the
defined-but-not-used error back into a set-but-not-used warning.
Fixes: 01e115ab83 vm_phys: #include vm_extern
(cherry picked from commit e6930b1c5f)
Arm64 and powerpc don't include vm_extern.h indirectly in vm_phys.c, which
means that for the sake of those architectures, it must be included explicitly.
Also, fix a set-unused warning that jenkins also found.
Reported by: Jenkins
Fixes: c606ab59e7 vm_extern: use standard address checkers everywhere
(cherry picked from commit 01e115ab83)
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D33685
(cherry picked from commit c606ab59e7)
vm_reserv.c uses its own bitstring implemenation for popmaps. Using
the bitstring_t type from a standard header eliminates the code
duplication, allows some bit-at-a-time operations to be replaced with
more efficient bitstring range operations, and, in
vm_reserv_test_contig, allows bit_ffc_area_at to more efficiently
search for a big-enough set of consecutive zero-bits.
Make bitstring changes improve the vm_reserv code. Define a bit_ntest
method to test whether a range of bits is all set, or all clear.
Define bit_ff_at and bit_ff_area_at to implement the ffs and ffc
versions with a parameter to choose between set- and clear- bits.
Improve the area_at implementation. Modify the bit_nset and
bit_nclear implementations to allow code optimization in the cases
when start or end are multiples of _BITSTR_BITS.
Add a few new cases to bitstring_test.
Discussed with: alc
Reviewed by: markj
Tested by: pho (earlier version)
Differential Revision: https://reviews.freebsd.org/D33312
(cherry picked from commit 84e2ae64c5)
Function vm_reserv_reclaim_contig breaks a reservation with enough
free space to satisfy an allocation request and returns the free space
to the buddy allocator. Change the function to allocate the request
memory from the reservation before breaking it, and return that memory
to the caller. That avoids a second call to the buddy allocator and
guarantees successful allocation after breaking the reservation, where
that success is not currently guaranteed.
Reviewed by: alc, kib (previous version)
Differential Revision: https://reviews.freebsd.org/D33644
(cherry picked from commit 0d5fac2872)
Fix a very recent change that introduced a page accounting error in
case of a reserveration being broken.
Reviewed by: alc
Fixes: fb38b29b56 (page_alloc_br) vm_page: Remove extra test, dup code from page alloc
Differential Revision: https://reviews.freebsd.org/D33645
(cherry picked from commit 184c63db3c)
Extract code from vm_page_alloc_contig_domain into a new function. Do
so in a way that eliminates a bound-to-fail reservation test after a
reservation is broken by a call from vm_page_alloc_contig_domain.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D33551
(cherry picked from commit fb38b29b56)
It is only called in the file that defines it, so make it static and
remove the declaration from the header.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D33688
(cherry picked from commit 8119cdd38b)
Commit 4b8365d752 introduced the ability to dynamically register
VM object types, for use by tmpfs, which creates swap-backed objects.
As a part of this, checks for such objects changed from
object->type == OBJT_DEFAULT || object->type == OBJT_SWAP
to
object->type == OBJT_DEFAULT || (object->flags & OBJ_SWAP) != 0
In particular, objects of type OBJT_DEFAULT do not have OBJ_SWAP set;
the swap pager sets this flag when converting from OBJT_DEFAULT to
OBJT_SWAP.
A few of these checks are done without the object lock held. It turns
out that this can result in false negatives since the swap pager
converts objects like so:
object->type = OBJT_SWAP;
object->flags |= OBJ_SWAP;
Fix the problem by adding explicit tests for OBJT_SWAP objects in
unlocked checks.
PR: 258932
Fixes: 4b8365d752 ("Add OBJT_SWAP_TMPFS pager")
Reported by: bdrewery
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit e123264e4d)
We do not hold the object lock or a page busy lock when copying src_m's
validity state. Prior to commit 45d72c7d7f we marked dst_m as fully
valid.
Use the source object's read lock to ensure that valid bits are not
concurrently cleared.
Reviewed by: alc, kib
Fixes: 45d72c7d7f ("vm_fault_copy_entry: accept invalid source pages.")
Sponsored by: The FreeBSD Foundation
(cherry picked from commit d0443e2b98)
... rather than setting and clearing flags inline. No functional change
intended.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 630f633f2a)
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.
In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions. Abstracting this check reduces the diff relative
to FreeBSD. It is perhaps slightly more readable as well.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D28710
(cherry picked from commit 67932460c7)
The assertion was added in commit 1771e987ca. After that, vm_wait()
and friends were refactored such that the actual sleep happens
elsewhere. Now the assertion condition is not checked when
vm_wait_doms() is called directly, and it is checked even if we are not
going to sleep (because vm_page_count_min_set(wdoms) is false).
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 6fb7c42d59)
The wait flag is passed to UMA when allocating boundary tags for the
initial span, and UMA expects either M_WAITOK or M_NOWAIT to be present.
Reported by: cperciva
Sponsored by: The FreeBSD Foundation
(cherry picked from commit f82177b8cf)
Previously we'd always print "out of swap space." This can be
misleading, as there are other reasons an OOM kill can be triggered. In
particular, it's entirely possible to trigger an OOM kill on a system
with plenty of free swap space.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 4a864f624a)
Allow a zone to opt out of cache size management. In particular,
uma_reclaim() and uma_reclaim_domain() will not reclaim any memory from
the zone, nor will uma_timeout() purge cached items if the zone is idle.
This effectively means that the zone consumer has control over when
items are reclaimed from the cache. In particular, uma_zone_reclaim()
will still reclaim cached items from an unmanaged zone.
Reviewed by: hselasky, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34142
(cherry picked from commit 389a3fa693)
Make vmdaemon timeout configurable, so that one can adjust
how often it runs.
Here's a trick: set this to 1, then run 'limits -m 0 sh',
then run whatever you want with 'ktrace -it XXX', and observe
how the working set changes over time.
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D22038
(cherry picked from commit 0f559a9f09)
The approach taken by the stack gap implementation was to insert a
random gap between the top of the fixed stack mapping and the true top
of the main process stack. This approach was chosen so as to avoid
randomizing the previously fixed address of certain process metadata
stored at the top of the stack, but had some shortcomings. In
particular, mlockall(2) calls would wire the gap, bloating the process'
memory usage, and RLIMIT_STACK included the size of the gap so small
(< several MB) limits could not be used.
There is little value in storing each process' ps_strings at a fixed
location, as only very old programs hard-code this address; consumers
were converted decades ago to use a sysctl-based interface for this
purpose. Thus, this change re-implements stack address randomization by
simply breaking the convention of storing ps_strings at a fixed
location, and randomizing the location of the entire stack mapping.
This implementation is simpler and avoids the problems mentioned above,
while being unlikely to break compatibility anywhere the default ASLR
settings are used.
The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack,
and is re-enabled by default.
PR: 260303
Reviewed by: kib
Discussed with: emaste, mw
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 1811c1e957)
Buckets in an SMR-enabled zone can legitimately be tagged with
SMR_SEQ_INVALID. This effectively means that the zone destructor (if
any) was invoked on all items in the bucket, and the contained memory is
safe to reuse. If the first bucket in the full bucket list was tagged
this way, UMA would unnecessarily poll per-CPU state before attempting
to fetch a full bucket from the list.
Sponsored by: The FreeBSD Foundation
(cherry picked from commit a04ce833f9)
Remove always-false checks for UMA zone creation failure. No functional
change intended.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 43b3b8e52d)
Fix some style bugs while here. No functional change intended.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit c4a25e0713)
Calling setrlimit with stack gap enabled and with low values of stack
resource limit often caused the program to abort immediately after
exiting the syscall. This happened due to the fact that the resource
limit was calculated assuming that the stack started at sv_usrstack,
while with stack gap enabled the stack is moved by a random number
of bytes.
Save information about stack size in struct vmspace and adjust the
rlim_cur value. If the rlim_cur and stack gap is bigger than rlim_max,
then the value is truncated to rlim_max.
PR: 253208
Reviewed by: kib
Obtained from: Semihalf
Sponsored by: Stormshield
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D31516
(cherry picked from commit 889b56c8cd)
Summary:
One was required to press a key to continue after every 18 lines of
output. This requirement had been in the "show vmopag" command since it
was introduced, which was many years before paging was added to DDB.
With paging, this explict key check is no longer necessary.
Obtained from: Juniper Networks, Inc.
MFC after: 1 week
Test Plan:
Run "show vmopag" from db> prompt and see that it does not need additional
keypresses other than the ones needed for the pager.
Subscribers: imp, #contributor_reviews_base
Differential Revision: https://reviews.freebsd.org/D33550
(cherry picked from commit 18048b6e3c)
Handle specially the boundary==0 case of vm_reserv_reclaim_config,
by turning off boundary adjustment in that case.
Reviewed by: alc
Tested by: pho, madpilot
(cherry picked from commit 49fd2d51f0)
vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in
the rage. (For largepage mappings, it calls vm_fault() once per large
page.)
A pager's populate method may return more than one page to be mapped.
If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not
just the fault page. Consider an object with two pages mapped in a
vm_map_entry, and suppose vm_map_wire() is called on the entry. Then,
the first vm_fault() would allocate and wire both pages, and the second
would encounter a valid page upon lookup and wire it again in the
regular fault handler. So the second page is wired twice and will be
leaked when the object is destroyed.
Fix the problem by modify vm_fault_populate() to wire only the fault
page. Also modify the error handler for pmap_enter(psind=1) to not test
fs->wired, since it must be false.
PR: 260347
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 88642d978a)
--Eliminate a big ifdef that encompassed all currently-supported
architectures except mips and powerpc32. This applied to the case
in which we've allocated a superpage but the pager-populated range
is insufficient for a superpage mapping. For platforms that don't
support superpages the check should be inexpensive as we shouldn't
get a superpage in the first place. Make the normal-page fallback
logic identical for all platforms and provide a simple implementation
of pmap_ps_enabled() for MIPS and Book-E/AIM32 powerpc.
--Apply the logic for handling pmap_enter() failure if a superpage
mapping can't be supported due to additional protection policy.
Use KERN_PROTECTION_FAILURE instead of KERN_FAILURE for this case,
and note Intel PKU on amd64 as the first example of such protection
policy.
Reviewed by: kib, markj, bdragon
(cherry picked from commit 8dc8feb53d)
Function vm_reserv_test_contig has incorrectly used its alignment
and boundary parameters to find a well-positioned range of empty pages
in a reservation. Consequently, a reservation could be broken
mistakenly when it was unable to provide a satisfactory set of pages.
Rename the function, correct the errors, and add assertions to detect
the error in case it appears again.
Reviewed by: alc, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33344
(cherry picked from commit 6f1c890827)
In vm_reserv_init, set all the marker popmap bits in vm_reserv_init,
and not just the bits of the first popmap entry.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33258
(cherry picked from commit 9f32cb5b1c)