Commit graph

2597 commits

Author SHA1 Message Date
Alan Cox
452e6e0d44 MFC r208159
Add a comment about the proper use of vm_object_page_remove().
2010-05-23 21:57:45 +00:00
Alan Cox
b28c6ddbbc MFC r207306
Change vm_object_madvise() so that it checks whether the page is invalid
  or unmanaged before acquiring the page queues lock.  Neither of these
  tests require that lock.  Moreover, a better way of testing if the page
  is unmanaged is to test the type of vm object.  This avoids a pointless
  vm_page_lookup().
2010-05-20 16:21:19 +00:00
Konstantin Belousov
04b7b96b05 MFC r206814 (by alc):
Remove a nonsensical test from vm_pageout_clean().  A page can't be in the
inactive queue and have a non-zero wire count.
2010-05-13 20:31:24 +00:00
Konstantin Belousov
1f93868d76 MFC elimination of several settings of PG_REFERENCED bit, that either
do not make sense or are harmful.

MFC r206761 (by alc):
Setting PG_REFERENCED on the requested page in swap_pager_getpages() is
either redundant or harmful, depending on the caller.

MFC r206768 (by alc):
In vm_object_backing_scan(), setting PG_REFERENCED on a page before
sleeping on that page is nonsensical.

MFC r206770 (by alc):
In vm_object_madvise() setting PG_REFERENCED on a page before sleeping on
that page only makes sense if the advice is MADV_WILLNEED.

MFC r206801 (by alc):
There is no justification for vm_object_split() setting PG_REFERENCED on a
page that it is going to sleep on.
2010-05-13 20:26:16 +00:00
Konstantin Belousov
ef7b0ac47b MFC r207365:
When doing kstack swapin, read as much pages in one run as possible.
2010-05-13 18:17:01 +00:00
Konstantin Belousov
83d6c6d862 MFC r206545 (by alc):
Simplify vm_thread_swapin().

Approved by:	alc
2010-05-13 18:12:44 +00:00
Konstantin Belousov
e6ca6764e8 MFC r207364:
In swap pager, do not free the non-requested pages from the run if they are
wired. Kstack pages are wired, this change prepares swap pager for handling
of long runs of kstack pages.
2010-05-13 09:26:31 +00:00
Konstantin Belousov
70020ca01e MFC r207580:
Handle busy status of the page in a way expected for pager_getpage().
Flush requested page, unbusy other pages, do not clear m->busy.
2010-05-10 11:50:26 +00:00
Konstantin Belousov
88389d6103 MFC r206264:
When OOM searches for a process to kill, ignore the processes already
killed by OOM. When killed process waits for a page allocation, try to
satisfy the request as fast as possible.
2010-05-04 05:14:43 +00:00
Alan Cox
311d363969 MFC r206409
Introduce the function kmem_alloc_attr(), which allocates kernel virtual
  memory with the specified physical attributes.

  Correct an error in the prototype for kmem_malloc().
2010-04-14 16:31:59 +00:00
John Baldwin
f70ad5484f MFC 205536:
Reject attempts to create a MAP_ANON mapping with a non-zero offset.
2010-04-14 15:22:58 +00:00
Alan Cox
55146b83de MFC r206174
vm_reserv_alloc_page() should never be called on an OBJT_SG object, just
  as it is never called on an OBJT_DEVICE object.  (This change should have
  been included in r195840.)
2010-04-09 06:40:30 +00:00
Marcel Moolenaar
dfeca18773 MFC rev 198341 and 198342:
o   Introduce vm_sync_icache() for making the I-cache coherent with
    the memory or D-cache, depending on the semantics of the platform.
    vm_sync_icache() is basically a wrapper around pmap_sync_icache(),
    that translates the vm_map_t argumument to pmap_t.
o   Introduce pmap_sync_icache() to all PMAP implementation. For powerpc
    it replaces the pmap_page_executable() function, added to solve
    the I-cache problem in uiomove_fromphys().
o   In proc_rwmem() call vm_sync_icache() when writing to a page that
    has execute permissions. This assures that when breakpoints are
    written, the I-cache will be coherent and the process will actually
    hit the breakpoint.
o   This also fixes the Book-E PMAP implementation that was missing
    necessary locking while trying to deal with the I-cache coherency
    in pmap_enter() (read: mmu_booke_enter_locked).
2010-03-31 02:43:58 +00:00
Konstantin Belousov
95e5a0907c MFC r204415:
Update comment for vm_page_alloc(9), listing all acceptable flags [1].
Note that the function does not sleep, it can block.
2010-03-02 10:41:34 +00:00
Konstantin Belousov
f407eeb3b4 MFC r204205:
Remove write-only variable.
2010-02-25 10:40:52 +00:00
Konstantin Belousov
02d65815cb MFC r202529:
vunref() the vnode in vm object deallocation code for OBJT_VNODE
appropriate number of times to prevent possible vnode reference leak.
2010-02-07 10:51:17 +00:00
Konstantin Belousov
546b397a1f MFC r203175:
The MAP_ENTRY_NEEDS_COPY flag belongs to protoeflags, cow variable
uses different namespace.
2010-02-01 10:45:23 +00:00
Antoine Brodin
e2b36efde5 MFC r201145 to stable/8:
(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument.
  Fix some wrong usages.
  Note: this does not affect generated binaries as this argument is not used.

  PR:		137213
  Submitted by:	Eygene Ryabinkin (initial version)
2010-01-30 12:11:21 +00:00
Konstantin Belousov
2d63cbda24 MFC r200770:
Remove VI_OBJDIRTY and make sure that OBJ_MIGHTBEDIRTY is set only for
vnode-backed vm objects.
2010-01-11 12:35:16 +00:00
Antoine Brodin
f12d6d2a3f MFC r200129 to stable/8:
Remove trailing ";" in UMA_HASH_INSERT and UMA_HASH_REMOVE macros.
2010-01-07 19:37:21 +00:00
Konstantin Belousov
4a0dfeb299 MFC r198505:
When protection of wired read-only mapping is changed to read-write,
install new shadow object behind the map entry and copy the pages
from the underlying objects to it. This makes the mprotect(2) call to
actually perform the requested operation instead of silently do nothing
and return success, that causes SIGSEGV on later write access to the
mapping.

Reuse vm_fault_copy_entry() to do the copying, modifying it to behave
correctly when src_entry == dst_entry.
2009-11-17 18:38:00 +00:00
Konstantin Belousov
82bccc6465 MFC r198476 (by alc):
Simplify the inner loop of vm_fault_copy_entry().

Approved by:	alc
2009-11-17 18:31:09 +00:00
Alan Cox
38c15033c7 MFC r198472
Eliminate an unnecessary check from vm_fault_prefault().
2009-10-31 18:18:32 +00:00
John Baldwin
a6fb726860 MFC 196615:
Extend the device pager to support different memory attributes on different
pages in an object.
- Add a new variant of d_mmap() currently called d_mmap2() which accepts
  an additional in/out parameter that is the memory attribute to use for
  the requested page.
- A driver either uses d_mmap() or d_mmap2() for all requests but not both.
  The current implementation uses a flag in the cdevsw (D_MMAP2) to indicate
  that the driver provides a d_mmap2() handler instead of d_mmap().  This
  is done to make the change ABI compatible with existing drivers and
  MFC'able to 7 and 8.
2009-10-29 15:09:54 +00:00
Konstantin Belousov
46aaa1ed67 MFC r198201:
Remove spurious call to priv_check(PRIV_VM_SWAP_NOQUOTA).
Call priv_check(PRIV_VM_SWAP_NORLIMIT) only when per-uid limit is
actually exceed.

Approved by:	re (kensmith)
2009-10-21 15:07:34 +00:00
Konstantin Belousov
6832666db4 MFC r197661:
Move the annotation for vm_map_startup() immediately before the function.

Approved by:	re (bz, kensmith)
2009-10-04 12:14:49 +00:00
Konstantin Belousov
2c5f9fbe6a MFC r197348:
For a.out and pre-8 ELF binaries, allow the mmap of zero length.

Approved by:	re (kensmith)
2009-09-23 13:49:41 +00:00
Konstantin Belousov
2af00decb8 MFC r196730:
Remove the altkstacks, instead instantiate threads with kernel stack
allocated with the right size from the start. For the thread that has
kernel stack cached, verify that requested stack size is equial to the
actual, and reallocate the stack if sizes differ.

Introduce separate kernel stack cache that keeps some limited amount of
preallocated kernel stacks to lower the latency of thread allocation.

Not a merge: instead of removing td_altkstack* members of struct thread,
replace them with placeholders to keep struct thread layout on the
stable branch.

Also, record r196640, r196644 and r196648 as merged.

Approved by:	re (kensmith)
2009-09-08 15:31:23 +00:00
John Baldwin
8a503ad2c9 MFC 196637:
Mark the fake pages constructed by the OBJT_SG pager valid.  This was
accidentally lost at one point during the PAT development.  Without this
fix vm_pager_get_pages() was zeroing each of the pages.

Approved by:	re (kib)
2009-09-01 15:50:07 +00:00
John Baldwin
63cbfee7b0 Remove debugging that crept in with previous commit.
Reported by:	nwhitehorn
Approved by:	re (kib)
2009-07-24 15:06:49 +00:00
John Baldwin
013818111a Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses.  The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-24 13:50:29 +00:00
Alan Cox
9861cbc6ca Change the handling of fictitious pages by pmap_page_set_memattr() on
amd64 and i386.  Essentially, fictitious pages provide a mechanism for
creating aliases for either normal or device-backed pages.  Therefore,
pmap_page_set_memattr() on a fictitious page needn't update the direct
map or flush the cache.  Such actions are the responsibility of the
"primary" instance of the page or the device driver that "owns" the
physical address.  For example, these actions are already performed by
pmap_mapdev().

The device pager needn't restore the memory attributes on a fictitious
page before releasing it.  It's now pointless.

Add pmap_page_set_memattr() to the Xen pmap.

Approved by:	re (kib)
2009-07-19 21:40:19 +00:00
Alan Cox
13de722155 An addendum to r195649, "Add support to the virtual memory system for
configuring machine-dependent memory attributes...":

Don't set the memory attribute for a "real" page that is allocated to
a device object in vm_page_alloc().  It is a pointless act, because
the device pager replaces this "real" page with a "fake" page and sets
the memory attribute on that "fake" page.

Eliminate pointless code from pmap_cache_bits() on amd64.

Employ the "Self Snoop" feature supported by some x86 processors to
avoid cache flushes in the pmap.

Approved by:	re (kib)
2009-07-18 01:50:05 +00:00
John Baldwin
0fe0ed8bf8 - Change mmap() to fail requests with EINVAL that pass a length of 0. This
behavior is mandated by POSIX.
- Do not fail requests that pass a length greater than SSIZE_MAX
  (such as > 2GB on 32-bit platforms).  The 'len' parameter is actually
  an unsigned 'size_t' so negative values don't really make sense.

Submitted by:	Alexander Best  alexbestms at math.uni-muenster.de
Reviewed by:	alc
Approved by:	re (kib)
MFC after:	1 week
2009-07-14 19:45:36 +00:00
Alan Cox
3153e878dd Add support to the virtual memory system for configuring machine-
dependent memory attributes:

Rename vm_cache_mode_t to vm_memattr_t.  The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.

Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.

Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes.  Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures.  The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map.  The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:

  kmem_alloc_contig() can now be used to allocate kernel memory with
  non-default memory attributes on amd64 and i386.

  vm_page_alloc() and the device pager will set the memory attributes
  for the real or fictitious page according to the object's default
  memory attributes.

Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.

Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386.  In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.

In collaboration with: jhb

Approved by:	re (kib)
2009-07-12 23:31:20 +00:00
Konstantin Belousov
529ab57b9a When VM_MAP_WIRE_HOLESOK is not specified and vm_map_wire(9) encounters
non-readable and non-executable map entry, the entry is skipped from
wiring and loop is aborted. But, since MAP_ENTRY_WIRE_SKIPPED was not
set for the map entry, its wired_count is later erronously decremented.
vm_map_delete(9) for such map entry stuck in "vmmaps".

Properly set MAP_ENTRY_WIRE_SKIPPED when aborting the loop.

Reported by:	John Marshall <john.marshall riverwillow com au>
Approved by:	re (kensmith)
2009-07-12 12:37:38 +00:00
Konstantin Belousov
121fd46175 When forking a vm space that has wired map entries, do not forget to
charge the objects created by vm_fault_copy_entry. The object charge
was set, but reserve not incremented.

Reported by:	Greg Rivers <gcr+freebsd-current tharned org>
Reviewed by:	alc (previous version)
Approved by:	re (kensmith)
2009-07-03 22:17:37 +00:00
Konstantin Belousov
9b4d473a6e Eliminiate code duplication by calling vm_object_destroy()
from vm_object_collapse().

Requested and reviewed by:	alc
Approved by:	re (kensmith)
2009-06-28 08:42:17 +00:00
Alan Cox
e999111ae7 This change is the next step in implementing the cache control functionality
required by video card drivers.  Specifically, this change introduces
vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all
architectures.  In addition, this changes adds a vm_cache_mode_t parameter
to kmem_alloc_contig() and vm_phys_alloc_contig().  These will be the
interfaces for allocating mapped kernel memory and physical memory,
respectively, with non-default cache modes.

In collaboration with:	jhb
2009-06-26 04:47:43 +00:00
Konstantin Belousov
9f80ce043d Change the type of uio_resid member of struct uio from int to ssize_t.
Note that this does not actually enable full-range i/o requests for
64 architectures, and is done now to update KBI only.

Tested by:	pho
Reviewed by:	jhb, bde (as part of the review of the bigger patch)
2009-06-25 18:46:30 +00:00
Konstantin Belousov
7a8af8eee2 Initialize the uip to silence gcc warning that seems to sneak in in some
build environments.

Reported by:	alc, bf1783 at googlemail com
2009-06-24 09:26:33 +00:00
Alan Cox
26f4eea53f The bits set in a page's dirty mask are a subset of the bits set in its
valid mask.  Consequently, there is no need to perform a bit-wise and of
the page's dirty and valid masks in order to determine which parts of a
page are dirty and valid.

Eliminate an unnecessary #include.
2009-06-24 04:45:03 +00:00
Konstantin Belousov
3364c323e6 Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge appropriate uid when allocation if performed by
kernel, e.g. md(4).

Several syscalls, among them is fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:45:22 +00:00
Alan Cox
ce1b857f33 Validate the page in one place, dev_pager_getpages(), rather than doing it
in two places, dev_pager_getfake() and dev_pager_updatefake().

Compare a pointer to "NULL" rather than "0".
2009-06-22 19:09:48 +00:00
Alan Cox
ef327c3ee7 Implement a mechanism within vm_phys_alloc_contig() to defer all necessary
calls to vdrop() until after the free page queues lock is released.  This
eliminates repeatedly releasing and reacquiring the free page queues lock
each time the last cached page is reclaimed from a vnode-backed object.
2009-06-21 20:29:14 +00:00
Alan Cox
6f0489c670 Strive for greater consistency among the places that implement real,
fictious, and contiguous page allocation.  Eliminate unnecessary
reinitialization of a page's fields.
2009-06-21 00:21:33 +00:00
Andrew Thompson
f06a3a36ac Track the kernel mapping of a physical page by a new entry in vm_page
structure. When the page is shared, the kernel mapping becomes a special
type of managed page to force the cache off the page mappings. This is
needed to avoid stale entries on all ARM VIVT caches, and VIPT caches
with cache color issue.

Submitted by:	Mark Tinguely
Reviewed by:	alc
Tested by:	Grzegorz Bernacki, thompsa
2009-06-18 20:42:37 +00:00
Alan Cox
aea6e893ed Add support for UMA_SLAB_KERNEL to page_free(). (While I'm here remove an
unnecessary newline character from the end of two panic messages.)
2009-06-18 07:27:11 +00:00
Alan Cox
f0553fdbc4 Eliminate unnecessary forward declarations. 2009-06-17 20:12:23 +00:00
Alan Cox
d78200e4e8 Refactor contigmalloc() into two functions: a simple front-end that deals
with the malloc tag and calls a new back-end, kmem_alloc_contig(), that
allocates the pages and maps them.

The motivations for this change are two-fold: (1) A cache mode parameter
will be added to kmem_alloc_contig().  In other words, kmem_alloc_contig()
will be extended to support the allocation of memory with caller-specified
caching. (2) The UMA allocation function that is used by the two jumbo
frames zones can use kmem_alloc_contig() in place of contigmalloc() and
thereby avoid having free jumbo frames held by the zone counted as live
malloc()ed memory.
2009-06-17 17:19:48 +00:00