Historically we have not distinguished between kernel wirings and user
wirings for accounting purposes. User wirings (via mlock(2)) were
subject to a global limit on the number of wired pages, so if large
swaths of physical memory were wired by the kernel, as happens with
the ZFS ARC among other things, the limit could be exceeded, causing
user wirings to fail.
The change adds a new counter, v_user_wire_count, which counts the
number of virtual pages wired by user processes via mlock(2) and
mlockall(2). Only user-wired pages are subject to the system-wide
limit which helps provide some safety against deadlocks. In
particular, while sources of kernel wirings typically support some
backpressure mechanism, there is no way to reclaim user-wired pages
shorting of killing the wiring process. The limit is exported as
vm.max_user_wired, renamed from vm.max_wired, and changed from u_int
to u_long.
The choice to count virtual user-wired pages rather than physical
pages was done for simplicity. There are mechanisms that can cause
user-wired mappings to be destroyed while maintaining a wiring of
the backing physical page; these make it difficult to accurately
track user wirings at the physical page layer.
The change also closes some holes which allowed user wirings to succeed
even when they would cause the system limit to be exceeded. For
instance, mmap() may now fail with ENOMEM in a process that has called
mlockall(MCL_FUTURE) if the new mapping would cause the user wiring
limit to be exceeded.
Note that bhyve -S is subject to the user wiring limit, which defaults
to 1/3 of physical RAM. Users that wish to exceed the limit must tune
vm.max_user_wired.
Reviewed by: kib, ngie (mlock() test changes)
Tested by: pho (earlier version)
MFC after: 45 days
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19908
The man page is years out of date regarding errors. Our implementation _does_
allow unaligned addresses, and it _does_not_ check for negative lengths,
because the length is unsigned. It checks for overflow instead.
Update the tests accordingly.
Reviewed by: bcr
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D13826
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
vm.max_wired is a system-wide limit, not per-process. Reword the
section to make this more clear.
PR: docs/189214
Submitted by: Lawrence Chen (original text)
Approved by: hrs (mentor)
Stop calling system calls "function calls".
Use "The .Fn system call" a-la "The .Nm utility".
When referring to a non-BSD implementation in
the HISTORY section, call syscall a function,
to be safe.
mlock, mmap, mprotect, msync, munlock, and munmap are defined by
POSIX as taking void *. The const modifier has been added to
mlock, munlock, and mprotect as the standard dictates.
minherit comes from OpenBSD and has been updated to conform with
their recent change to void *.
madvise and mincore are not defined by POSIX, but their arguments
have been modified to be consistent with the POSIX-defined functions.
mincore takes a const pointer, but madvise does not due to the
MADV_FREE case.
Discussed with: bde
in a bunch of man pages.
Use the correct .Bx (BSD UNIX) or .At (AT&T UNIX) macros
instead of explicitly specifying the version in the text
in a bunch of man pages.