opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-03-11 02:31:16 -04:00

Author	SHA1	Message	Date
Alan Cox	7827d9b0fe	- Introduce and use a mutex synchronizing access to the swblock hash table.	2003-10-26 19:55:35 +00:00
Alan Cox	43186e53ae	- Simplify vm_object_collapse()'s collapse case, reducing the number of lock acquires and releases performed. - Move an assertion from vm_object_collapse() to vm_object_zdtor() because it applies to all cases of object destruction.	2003-10-26 06:29:26 +00:00
Alan Cox	ee3dc7d7fe	- Add some of the required vm object locking, including assertions where the vm object lock is required and already held.	2003-10-25 23:42:17 +00:00
Alan Cox	93dbd07122	- Align a comment within struct vm_page. - Annotate the vm_page's valid field as synchronized by the containing vm object's lock.	2003-10-25 18:33:04 +00:00
Alan Cox	52051abcf1	- Call vnode_pager_input_old() with the vm object locked.	2003-10-25 05:21:16 +00:00
Alan Cox	2e3b314d3a	- Push down Giant from vm_pageout() to vm_pageout_scan(), freeing vm_pageout_page_stats() from Giant. - Modify vm_pager_put_pages() and vm_pager_page_unswapped() to expect the vm object to be locked on entry. (All of the pager routines now expect this.)	2003-10-24 06:43:04 +00:00
Alan Cox	ab42316c2f	- Retire vm_pageout_page_free(). Instead, use vm_page_select_cache() from vm_pageout_scan(). Rationale: I don't like leaving a busy page in the cache queue with neither the vm object nor the vm page queues lock held. - Assert that the page is active in vm_pageout_page_stats().	2003-10-22 18:41:32 +00:00
Alan Cox	d3c09dd7db	- Assert that every page found in the active queue is an active page.	2003-10-22 03:08:24 +00:00
Alan Cox	0d42c05ff4	- Assert that the containing vm object is locked in vm_page_set_validclean(). (This function reads and modifies the vm page's valid field, which is synchronized by the lock on the containing vm object.)	2003-10-21 19:36:51 +00:00
Alan Cox	fee181a696	- Remove some long unused code.	2003-10-20 18:57:01 +00:00
Alan Cox	3ad8097fd4	- Remove comments referring to functions that no longer exist.	2003-10-20 05:16:27 +00:00
Alan Cox	2bf43e4374	- Hold the vm object's lock around calls to vm_page_set_validclean().	2003-10-20 04:05:24 +00:00
Alan Cox	1b26eb10ff	- Synchronize access to a vm page's valid field using the containing vm object's lock. - Reduce the scope of the vm page queues lock in two places.	2003-10-19 00:01:56 +00:00
Alan Cox	8b575f6c28	- Synchronize access to the page's valid field in vnode_pager_generic_getpages() using the containing object's lock.	2003-10-18 21:30:29 +00:00
Alan Cox	7a93508274	- Increase the object lock's scope in vm_contig_launder() so that access to the object's type field and the call to vm_pageout_flush() are synchronized. - The above change allows for the elimination of the last parameter to vm_pageout_flush(). - Synchronize access to the page's valid field in vm_pageout_flush() using the containing object's lock.	2003-10-18 21:09:21 +00:00
Alan Cox	cbef13d877	Corrections to revision 1.305 - Specifying VM_MAP_WIRE_HOLESOK should not assume that the start address is the beginning of the map. Instead, move to the first entry after the start address. - The implementation of VM_MAP_WIRE_HOLESOK was incomplete. This caused the failure of mlockall(2) in some circumstances.	2003-10-18 18:48:17 +00:00
Poul-Henning Kamp	2c18019f14	DuH! bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in the file)	2003-10-18 14:10:28 +00:00
Poul-Henning Kamp	9fbf91c0dd	Initialize bp->b_offset before calling VOP_[SPEC]STRATEGY(). Remove stale comment about B_PHYS.	2003-10-18 11:11:05 +00:00
Alan Cox	6989c456b3	- Synchronize access to a vm page's valid field using the containing vm object's lock. - Release the vm object and vm page queues locks around vput().	2003-10-17 05:07:17 +00:00
Alan Cox	c5b65a6723	- vm_fault_copy_entry() should not assume that the source object contains every page. If the source entry was read-only, one or more wired pages could be in backing objects. - vm_fault_copy_entry() should not set the PG_WRITEABLE flag on the page unless the destination entry is, in fact, writeable.	2003-10-15 08:00:45 +00:00
Alan Cox	8afcf0cc36	Lock the destination object in vm_fault_copy_entry().	2003-10-08 07:11:19 +00:00
Alan Cox	669890eaeb	Retire vm_page_copy(). Its reason for being ended when peter@ modified pmap_copy_page() et al. to accept a vm_page_t rather than a physical address. Also, this change will facilitate locking access to the vm page's valid field.	2003-10-08 05:35:12 +00:00
Bruce M Simpson	11f7ddc563	Only the super-user should be able to wire pages via the mlock() family of system calls at this time. Remove various #ifdef's to enforce this.	2003-10-06 01:59:04 +00:00
Bruce M Simpson	2bc7dd5661	Move pmap_resident_count() from the MD pmap.h to the MI pmap.h. Add a definition of pmap_wired_count(). Add a definition of vmspace_wired_count(). Reviewed by: truckman Discussed with: peter	2003-10-06 01:47:12 +00:00
Alan Cox	9aa3d17d37	The addition of a locking assertion to vm_page_zero_invalid() has revealed a long-time bug: vm_pager_get_pages() assumes that m[reqpage] contains a valid page upon return from pgo_getpages(). In the case of the device pager this page has been freed and replaced by a fake page. The fake page is properly inserted into the vm object but m[reqpage] is left pointing to a freed page. For now, update m[reqpage] to point to the fake page. Submitted by: tegge	2003-10-05 22:23:44 +00:00
Bruce M Simpson	5d264f84f3	Revert previous commit. Come back vslock(), all is forgiven. Pointy hat to: bms	2003-10-05 12:41:08 +00:00
Bruce M Simpson	aac7652ecd	Retire vslock() and vsunlock() with extreme prejudice. Discussed with: pete	2003-10-05 09:47:54 +00:00
Alan Cox	5a3970febf	Assert that the containing vm object's lock is held in vm_page_set_invalid().	2003-10-05 06:58:07 +00:00
Alan Cox	874f526de6	Assert that the containing vm object's lock is held in vm_page_zero_invalid().	2003-10-04 21:56:27 +00:00
Alan Cox	cbfbaad8be	Synchronize access to a vm page's valid field using the containing vm object's lock.	2003-10-04 21:35:48 +00:00
Alan Cox	bf0da100d6	- Extend the scope of the vm object lock to cover calls to vm_page_is_valid(). - Assert that the lock on the containing vm object is held in vm_page_is_valid().	2003-10-04 19:23:29 +00:00
Alan Cox	49c06616ae	Synchronize access to a vm page's valid field using the containing vm object's lock.	2003-10-04 19:13:27 +00:00
Jeff Roberson	f3c625e47a	- Use the UMA_ZONE_VM flag on the fakepg and object zones to prevent vm recursion and LORs. This may be necessary for other zones created in the vm but this needs to be verified.	2003-10-04 14:21:53 +00:00
Alan Cox	566526a957	Migrate pmap_prefault() into the machine-independent virtual memory layer. A small helper function pmap_is_prefaultable() is added. This function encapsulates the few lines of pmap_prefault() that actually vary from machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have much in common. Going forward, it's worth considering their merger.	2003-10-03 22:46:53 +00:00
Alan Cox	50028aa7d2	In vm_page_remove(), assert that the vm object is locked, unless an Alpha. (The Alpha still requires updates to its pmap.)	2003-09-28 04:50:48 +00:00
Marcel Moolenaar	fd75d71049	Part 2 of implementing rstacks: add the ability to create rstacks and use the ability on ia64 to map the register stack. The orientation of the stack (i.e. its grow direction) is passed to vm_map_stack() in the overloaded cow argument. Since the grow direction is represented by bits, it is possible and allowed to create bi-directional stacks. This is not an advertised feature, more of a side-effect. Fix a bug in vm_map_growstack() that's specific to rstacks and which we could only find by having the ability to create rstacks: when the mapped stack ends at the faulting address, we have not actually mapped the faulting address. We need to include or cover the faulting address. Note that at this time mmap(2) has not been extended to allow the creation of rstacks by processes. If such a need arises, this can be done. Tested on: alpha, i386, ia64, sparc64	2003-09-27 22:28:14 +00:00
Poul-Henning Kamp	e0f86251a7	Provide a bit more help with "memory overwritten after free" style bugs.	2003-09-27 21:33:13 +00:00
Peter Wemm	c460ac3a00	Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit systems where the data/stack/etc limits are too big for a 32 bit process. Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c. Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 hierarchy. Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced. Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does. Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.	2003-09-25 01:10:26 +00:00
Mike Silbersack	3fde38df46	Adjust the kmapentzone limit so that it takes into account the size of maxproc and maxfiles, as procs, pipes, and other structures cause allocations from kmapentzone. Submitted by: tegge	2003-09-23 18:56:54 +00:00
Alan Cox	6c527f260e	Change the handling of the kernel and kmem objects in vm_map_delete(): In order to use "unmanaged" pages in the kmem object, vm_map_delete() must unconditionally perform pmap_remove(). Otherwise, sparc64 has problems. Tested by: jake	2003-09-23 04:28:04 +00:00
Alan Cox	95aad59a53	Initialize the page's pindex field even for VM_ALLOC_NOOBJ allocations. (This field is useful for implementing sanity checks even if the page does not belong to an object.)	2003-09-22 00:56:13 +00:00
Jeff Roberson	009b6fcb03	- Fix MD_SMALL_ALLOC on architectures that support it. Define a new alloc function, startup_alloc(), that is used for single page allocations prior to the VM starting up. If it is used after the VM starts up, it replaces the zone's allocf pointer with either page_alloc() or uma_small_alloc() where appropriate. Pointy hat to: me Tested by: phk/amd64, me/x86	2003-09-21 07:39:16 +00:00
Peter Wemm	c43ab0b5a1	Bad Jeffr! No cookie! Temporarily disable the UMA_MD_SMALL_ALLOC stuff since recent commits break sparc64, amd64, ia64 and alpha. It appears only i386 and maybe powerpc were not broken.	2003-09-20 23:35:33 +00:00
Jeff Roberson	9643769a3a	- Remove the working-set algorithm. Instead, use the per cpu buckets as the working set cache. This has several advantages. Firstly, we never touch the per cpu queues now in the timeout handler. This removes one more reason for having per cpu locks. Secondly, it reduces the size of the zone by 8 bytes, bringing it under 200 bytes for a single proc x86 box. This tidies up other logic as well. - The 'destroy' flag no longer needs to be passed to zone_drain() since it always frees everything in the zone's slabs. - cache_drain() is now only called from zone_dtor() and so it destroys by default. It also does not need the destroy parameter now.	2003-09-19 23:27:46 +00:00
Jeff Roberson	3e0cab95c0	- Remove the cache colorization code. We can't use it due to all of the broken consumers of the malloc interface who assume that the allocated address will be an even multiple of the size. - Remove disabled time delay code on uma_reclaim(). The comment there said it all. It was not an effective strategy and it should not be left in #if 0'd for all eternity.	2003-09-19 23:04:44 +00:00
Jeff Roberson	64f051e99a	- There are an endless stream of style(9) errors in this file. Fix a few. Also catch some spelling errors.	2003-09-19 22:31:45 +00:00
Jeff Roberson	44eca34adb	- Don't inspect the zone in page_alloc(). It may be NULL. - Don't cache more items than the zone would like in uma_zalloc_bucket().	2003-09-19 09:22:04 +00:00
Jeff Roberson	45bf76f0f8	- Move the logic for dealing with the uma_boot_pages cache into the page_alloc() function from the slab_zalloc() function. This allows us to unconditionally call uz_allocf(). - In page_alloc() clean up the boot_pages logic some. Previously memory from this cache that was not used by the time the system started was left in the cache and never used. Typically this wasn't more than a few pages, but now we will use this cache so long as memory is available.	2003-09-19 08:53:33 +00:00
Jeff Roberson	b60f5b794e	- Fix the silly flag situation in UMA. Remove redundant ZFLAG/ZONE flags by accepting the user supplied flags directly. Previously this was not done so that flags for the same field would not be defined in two different files. Add comments in each header instructing future developers on how not to shoot their feet. - Fix a test for !OFFPAGE which should have been a test for HASH. This would have caused a panic if we had ever destructed a malloc zone. This also opens up the possibility that other zones could use the vsetobj() method rather than a hash.	2003-09-19 08:37:44 +00:00
Jeff Roberson	961647dfd0	- Don't abuse M_DEVBUF, define a tag for UMA hashes.	2003-09-19 07:23:50 +00:00
Jeff Roberson	b983089a05	- Eliminate a pair of unnecessary variables.	2003-09-19 06:41:06 +00:00
Jeff Roberson	cae33c1429	- Initialize a pool of bucket zones so that we waste less space on zones that don't cache as many items. - Introduce the bucket_alloc(), bucket_free() functions to wrap bucket allocation. These functions select the appropriate bucket zone to allocate from or free to. - Rename ub_ptr to ub_cnt to reflect a change in its use. ub_cnt now reflects the count of free items in the bucket. This gets rid of many unnatural subtractions by 1 throughout the code. - Add ub_entries which reflects the number of entries possibly held in a bucket.	2003-09-19 06:26:45 +00:00
Alan Cox	45ae1d9147	Merge vm_pageout_free_page_calc() into vm_pageout(), eliminating some unneeded code.	2003-09-19 05:03:45 +00:00
Alan Cox	417a26a154	Add vm object locking to vnode_pager_lock(). (This triggers the movement of a VM_OBJECT_LOCK() in vm_fault().)	2003-09-18 02:26:03 +00:00
Alan Cox	1dabe30610	Remove GIANT_REQUIRED from vm_object_shadow().	2003-09-17 07:00:14 +00:00
Alan Cox	4b5f553179	When calling vget() on a vnode-backed vm object, acquire the vnode interlock before releasing the vm object's lock.	2003-09-17 06:55:42 +00:00
Alan Cox	82f9defeaf	Eliminate the use of Giant from vm_object_reference().	2003-09-15 05:58:27 +00:00
Alan Cox	30bb12a4e8	Call vm_page_unmanage() on pages belonging to the kmem_object. This eliminates the unnecessary overhead of managing "PV" entries for these pages.	2003-09-14 02:37:59 +00:00
Alan Cox	b881da26a5	There is no need for an atomic increment on the vm object's generation count in _vm_object_allocate(). (Access to the generation count is governed by the vm object's lock.) Note: the introduction of the atomic increment in revision 1.238 appears to be an accident. The purpose of that commit was to fix an Alpha-specific bug in UMA's debugging code.	2003-09-13 20:07:26 +00:00
Alan Cox	b9850eb224	Add a new parameter to pmap_extract_and_hold() that is needed to eliminate Giant from vmapbuf(). Idea from: tegge	2003-09-12 07:07:49 +00:00
Alan Cox	ba2157f218	Introduce a new pmap function, pmap_extract_and_hold(). This function atomically extracts and holds the physical page that is associated with the given pmap and virtual address. Such a function is needed to make the memory mapping optimizations used by, for example, pipes and raw disk I/O MP-safe. Reviewed by: tegge	2003-09-08 02:45:03 +00:00
Alan Cox	7ebcee376a	Revise the locking in mincore(2).	2003-09-07 18:47:54 +00:00
Poul-Henning Kamp	afeb65e61d	Don't open with exclusive bit, swapon(8) wants to trash our swapdev. Add XXX comment with a rating of this concept.	2003-09-02 05:53:44 +00:00
Eivind Eklund	2ae51145e8	Change clean_map from a global to an auto variable	2003-09-01 16:46:47 +00:00
Alan Cox	3562af1215	- Add vm object locking to the part of vm_pageout_scan() that launders dirty pages. - Remove some unused variables.	2003-08-31 00:00:46 +00:00
Marcel Moolenaar	b21a0008ba	Introduce MAP_ENTRY_GROWS_DOWN and MAP_ENTRY_GROWS_UP to allow for growable (stack) entries that not only grow down, but also grow up. Have vm_map_growstack() take these flags into account when growing an entry. This is the first step in adding support for upward growable stacks. It is a required feature on ia64 to support the register stack (or rstack as I like to call it -- it also means reverse stack). We do not currently create rstacks, so the upward growing is not exercised and the change should be a functional no-op. Reviewed by: alc	2003-08-30 21:25:23 +00:00
Poul-Henning Kamp	dee34ca4fc	Add a close() method to a swapdev. Add a GEOM based backend. Remove the device/VOP_SPECSTRATEGY() based backend.	2003-08-30 16:44:26 +00:00
Poul-Henning Kamp	20da9c2eaf	Protect the swapdevice tailq with a mutex. Store the udev_t we will report to userland in the swdevt.	2003-08-30 16:10:28 +00:00
Poul-Henning Kamp	59efee01a3	Continue the objectification of the swapdev backends: Remove the vnode and dev_t fields and replace them with a void *. Introduce separate strategy functions for devices and regular (NFS) vnodes. For devices we don't need the vnode v_numoutput stuff. Add a generic swaponsomething() function to add a swapdevice and split the remainder of swaponvp() into swaponvp() and swapondev() which calls this backend.	2003-08-30 11:33:25 +00:00
Poul-Henning Kamp	4b03903a46	Make the strategy function a method of the individual swapdev.	2003-08-30 09:42:00 +00:00
Poul-Henning Kamp	2f249180f5	Consistently use modern function definitions	2003-08-30 08:32:42 +00:00
Marcel Moolenaar	23562e4bc6	In vnode_pager_generic_putpages(), change the printf format specifier to long and explicitly cast field dirty of struct vm_page to unsigned long. When PAGE_SIZE is 32K, this field is actually unsigned long.	2003-08-29 00:16:30 +00:00
Alan Cox	2370c6d40c	Recent pmap changes permit the use of a more precise locking assertion in vm_page_lookup().	2003-08-28 23:23:04 +00:00
Marcel Moolenaar	16bc6ff39e	Assert that u_long is at least 64 bits if PAGE_SIZE is 32K. Suggested by: phk	2003-08-25 19:58:01 +00:00
Alan Cox	529e15ed69	Held pages, just like wired pages, should not be added to the cache queues. Submitted by: tegge	2003-08-23 20:29:29 +00:00
Alan Cox	b7ad744dc5	Hold the page queues lock when performing vm_page_clear_dirty() and vm_page_set_invalid().	2003-08-23 18:11:53 +00:00
Alan Cox	8d8b9c6e70	To implement the sequential access optimization, vm_fault() may need to reacquire the "first" object's lock while a backing object's lock is held. Since this is a lock-order reversal, vm_fault() uses trylock to acquire the first object's lock, skipping the sequential access optimization in the unlikely event that the trylock fails.	2003-08-23 06:52:32 +00:00
Marcel Moolenaar	21a708cfde	Also define VM_PAGE_BITS_ALL for 16K and 32K pages. Make the constant unsigned for all page sizes and unsigned long for 32K pages.	2003-08-23 06:30:47 +00:00
Marcel Moolenaar	1fa057c6f1	Add support for 16K and 32K page sizes. The valid and dirty maps in struct vm_page are defined as u_int for 16K pages and u_long for 32K pages, with the implied assumption that long will at least be 64 bits wide on platforms where we support 32K pages.	2003-08-23 06:24:00 +00:00
Alan Cox	0f132ba697	Assert that the vm object's lock is held on entry to vm_page_grab(); remove code from this function that was needed when vm object locking was incomplete.	2003-08-21 20:59:07 +00:00
Alan Cox	891c1d4bd3	Assert that the vm object lock is held in vm_page_alloc().	2003-08-20 20:24:29 +00:00
Bosko Milekic	1c35e213f1	In sysctl_vm_zone, do not calculate per-cpu cache stats on UMA_ZFLAG_INTERNAL zones at all. Apparently, Wilko's alpha was crashing while entering multi-user because, I think, we were calculating the garbage cachefree for pcpu caches that essentially don't exist for at least the 'zones' zone and it so happened that we were reading from an unmapped location. Confirmed to fix crash: wilko Helped debug: wilko, gallatin	2003-08-20 18:22:06 +00:00
Poul-Henning Kamp	6a4b58230c	Replace a homegrown bdone()/bwait() implementation by the real thing	2003-08-18 19:47:16 +00:00
Alan Cox	ef13663bb6	Three unrelated changes to vm_proc_new(): (1) add vm object locking on the U pages object; (2) reorganize such that the U pages object is created and filled in one block; and (3) remove an unnecessary clearing of PG_ZERO.	2003-08-18 01:31:43 +00:00
Poul-Henning Kamp	ec7948490b	Use NULL for 3rd argument of VOP_BMAP() rather than custom cast. Eliminate unused variable.	2003-08-17 18:54:23 +00:00
Marcel Moolenaar	710338e94f	In vm_thread_swap{in|out}(), remove the alpha specific conditional compilation and replace it with a call to cpu_thread_swap{in|out}(). This allows us to add similar code on ia64 without cluttering the code even more.	2003-08-16 23:15:15 +00:00
Poul-Henning Kamp	395714feb7	Eliminate unnecessary udev_t variable: we can derive it from the dev_t when we need it.	2003-08-15 13:14:25 +00:00
Poul-Henning Kamp	89dc784fa3	Make swaponvp() static to the swap_pager.	2003-08-15 12:04:29 +00:00
Alan Cox	3e1b578a28	Extend the scope of the page queues lock in vm_pageout_scan() to cover the traversal of the PQ_INACTIVE queue.	2003-08-15 05:13:36 +00:00
Alan Cox	5402d8ec23	Remove GIANT_REQUIRED from vmspace_alloc().	2003-08-13 19:23:51 +00:00
Alan Cox	46add12552	Reduce the size of the vm map (and by inclusion the vm space) on 64-bit architectures by moving a field within the structure.	2003-08-13 03:13:22 +00:00
Warner Losh	06b4bf3e55	Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's copyrighted files. Approved by: Matt Dillon	2003-08-12 23:24:05 +00:00
Alan Cox	c759a3ca06	Reduce the size of the vm object on 64-bit architectures by moving a field within the structure.	2003-08-12 20:10:32 +00:00
Bosko Milekic	20e8e865bd	- When deciding whether to init the zone with small_init or large_init, compare the zone element size (+1 for the byte of linkage) against UMA_SLAB_SIZE - sizeof(struct uma_slab), and not just UMA_SLAB_SIZE. Add a KASSERT in zone_small_init to make sure that the computed ipers (items per slab) for the zone is not zero, despite the addition of the check, just to be sure (this part submitted by: silby) - UMA_ZONE_VM used to imply BUCKETCACHE. Now it implies CACHEONLY instead. CACHEONLY is like BUCKETCACHE in the case of bucket allocations, but in addition to that also ensures that we don't setup the zone with OFFPAGE slab headers allocated from the slabzone. This means that we're not allowed to have a UMA_ZONE_VM zone initialized for large items (zone_large_init) because it would require the slab headers to be allocated from slabzone, and hence kmem_map. Some of the zones init'd with UMA_ZONE_VM are so init'd before kmem_map is suballoc'd from kernel_map, which is why this change is necessary.	2003-08-11 19:39:45 +00:00
Bruce M Simpson	abd498aa71	Add the mlockall() and munlockall() system calls. - All those diffs to syscalls.master for each architecture are necessary. This needed clarification; the stub code generation for mlockall() was disabled, which would prevent applications from linking to this API (suggested by mux) - Giant has been quashed. It is no longer held by the code, as the required locking has been pushed down within vm_map.c. - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES to express their intention explicitly. - Inspected at the vmstat, top and vm pager sysctl stats level. Paging-in activity is occurring correctly, using a test harness. - The RES size for a process may appear to be greater than its SIZE. This is believed to be due to mappings of the same shared library page being wired twice. Further exploration is needed. - Believed to back out of allocations and locks correctly (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC). PR: kern/43426, standards/54223 Reviewed by: jake, alc Approved by: jake (mentor) MFC after: 2 weeks	2003-08-11 07:14:08 +00:00
Mike Silbersack	cebde06978	More pipe changes: From alc: Move pageable pipe memory to a separate kernel submap to avoid awkward vm map interlocking issues. (Bad explanation provided by me.) From me: Rework pipespace accounting code to handle this new layout, and adjust our default values to account for the fact that we now have a solid limit on allocations. Also, remove the "maxpipes" limit, as it no longer has a purpose. (The limit on kva usage solves the problem of having too many pipes.)	2003-08-11 05:51:51 +00:00
Poul-Henning Kamp	ef3c5abdba	Make the first two pages magic to protect the BSD labels rather than only one.	2003-08-06 14:13:38 +00:00
Poul-Henning Kamp	07f81f9159	Remove an unused variable.	2003-08-06 12:09:34 +00:00
Poul-Henning Kamp	751221fd32	Staticize swap_pager_putpages(). Eliminate a lot of checks to make sure requests are not cross-device, which are unnecessary with the new layout. We know a sequential request cannot possibly be cross-device because there is a reserved page between the devices. Remove a couple of comments which no longer are relevant.	2003-08-06 12:08:27 +00:00
Poul-Henning Kamp	030b34923d	Access the swap_pagers' ->putpages() through swappagerops instead of directly, this is a cleaner way to do it.	2003-08-06 12:05:48 +00:00
Poul-Henning Kamp	f976cfd99a	Add XXX: comment to vm_pager_unswapped().	2003-08-06 10:51:40 +00:00
Poul-Henning Kamp	5e04322a6e	Explicitly set B_PAGING	2003-08-06 09:22:47 +00:00
Poul-Henning Kamp	c37a77ee86	Rip out the totally bogus vnode swapdev_vp with extreme prejudice. Don't mark buffers with B_KEEPGIANT, we don't drop giant in strategy at this point in time.	2003-08-06 06:53:31 +00:00
Poul-Henning Kamp	e04e4bacf6	Use sparse struct initialization for struct pagerops. Mark our buffers B_KEEPGIANT before sending them downstream. Remove swap_pager_strategy implementation.	2003-08-05 06:54:56 +00:00
Poul-Henning Kamp	4e6586002d	Use sparse struct initializations for struct pagerops. This makes grepping for which pagers implement which methods easier.	2003-08-05 06:51:26 +00:00
Poul-Henning Kamp	665c0caf03	Put an uncovered page between the swap devices, that way we can be sure to not get any cross-device I/O requests. (The unallocated first page protecting BSD labels already gave us this, but that hack may go away at some point in time). Remove the check for cross-device I/O requests in swap_pager_strategy. Move the repeated statistics updating into flushchainbuf().	2003-08-04 08:22:49 +00:00
Alan Cox	981371629a	Use kmem_alloc_nofault() instead of kmem_alloc_pageable() to allocate swapbkva. Swapbkva mappings are explicitly managed using pmap_qenter(), not on-demand by vm_fault(), making kmem_alloc_nofault() more appropriate. Submitted by: tegge	2003-08-04 04:35:04 +00:00
Poul-Henning Kamp	12692209a6	Name swap_pager_find_dev() more correctly swp_pager_finde_dev(). Use ->bio_children to count child buffers, rather than abuse the bio_caller1 pointer. Expand the relevant bits of waitchainbuf() inline, this clarifies the code a little bit.	2003-08-03 21:22:42 +00:00
Poul-Henning Kamp	5ff0108d21	I accidentally hit undo before committing, fix the resulting off-by-one.	2003-08-03 14:53:52 +00:00
Poul-Henning Kamp	8f60c087e6	Change the layout policy of the swap_pager from a hardcoded width striping to a per device round-robin algorithm. Because of the policy of not attempting to retain previous swap allocation on page-out, this means that a newly added swap device almost instantly takes its 1/N share of the I/O load but it takes somewhat longer for it to assume its 1/N share of the pages if there is plenty of space on the other devices. Change the 8G total swapspace limitation to 8G per device instead by using a per device blist rather than one global blist. This reduces the memory footprint by 75% (typically a couple hundred kilobytes) for the common case with one swapdevice but NSWAPDEV=4. Remove the compile time constant limit of number of swap devices, there is no limit now. Instead of a fixed size array, store the per swapdev structure in a TAILQ. Total swap space is still addressed by a 32 bit page number and therefore the upper limit is now 2^44 bytes = 16TB (for i386). We still do not allocate the first page of each device in order to give some amount of protection to any bsdlabel at the start of the device. A new device is appended after the existing devices in the swap space, no attempt is made to fill in holes left behind by swapoff (this can trivially be changed should it ever become a problem). The sysctl vm.nswapdev now reflects the number of currently configured swap devices. Rename vm_swap_size to swap_pager_avail for consistency with other exported names. Change argument type for vm_proc_swapin_all() and swap_pager_isswapped() to be a struct swdevt pointer rather than an index. Not changed: we are still using blists to manage the free space, but since the swapspace is no longer fragmented by the striping different resource managers might fare better.	2003-08-03 13:35:31 +00:00
Poul-Henning Kamp	745f330503	Move extern declaration of the various pagerops from vm_pager.c to vm_pager.h where the various pagers will also see them.	2003-08-03 09:27:39 +00:00
Alan Cox	b245ac95cf	Revise obj_alloc(). Most notably, use the object's lock to prevent two concurrent invocations from acquiring the same address(es). Also, in case of an incomplete allocation, free any allocated pages. In collaboration with: tegge	2003-08-03 06:08:48 +00:00
Bosko Milekic	48bf87258f	When INVARIANTS is on and we're in uma_zalloc_free(), we need to make sure that uma_dbg_free() is called if we're about to call uma_zfree_internal() but we're asking it to skip the dtor and uma_dbg_free() call itself. So, if we're about to call uma_zfree_internal() from uma_zfree_arg() and skip == 1, call uma_dbg_free() ourselves.	2003-08-02 22:40:27 +00:00
Alan Cox	b77c2bcd98	Update the comment at the head of kmem_alloc_nofault() to describe its purpose and use.	2003-08-01 19:51:43 +00:00
Bosko Milekic	174ab4501e	Only free the pcpu cache buckets if they are non-NULL. Crashed this person's machine: harti Pointy-hat to: me	2003-08-01 17:42:27 +00:00
Poul-Henning Kamp	8d677ef93f	Remove unused stuff. Move used stuff to swap_pager.c where it belongs. This file no longer exports anything to userland.	2003-07-31 22:19:28 +00:00
Peter Wemm	15a7ad60fb	Add #include "opt_kstack_pages.h" and "opt_kstack_max_pages.h" to remain in sync with the backend machdep code. When cpu_thread_init() does not have the same idea of KSTACK_PAGES as the thing that created the kstack, all hell breaks loose. Bad alc! no cookie! :-)	2003-07-31 01:25:05 +00:00
Bosko Milekic	d56368d779	Plug a race and a leak in UMA. 1) The race has to do with zone destruction. From the zone destructor we would lock the zone, set the working set size to 0, then unlock the zone, drain it, and then free the structure. Within the window following the working-set-size set to 0 and unlocking of the zone and the point where in zone_drain we re-acquire the zone lock, the uma timer routine could have fired off and changed the working set size to something non-zero, thereby potentially preventing us from completely freeing slabs before destroying the zone (and thus leaking them). 2) The leak has to do with zone destruction as well. When destroying a zone we would take care to free all the buckets cached in the zone, but although we would drain the pcpu cache buckets, we would not free them. This resulted in leaking a couple of bucket structures (512 bytes each) per cpu on SMP during zone destruction. While I'm here, also silence GCC warnings by turning uma_slab_alloc() from inline to real function. It's too big to be an inline. Reviewed by: JeffR	2003-07-30 18:55:15 +00:00
Bosko Milekic	a40fdcb439	When generating the zone stats make sure to handle the master zone ("UMA Zone") carefully, because it does not have pcpu caches allocated at all. In the UP case, we did not catch this because one pcpu cache is always allocated with the zone, but for the MP case, we were getting bogus stats for this zone. Tested by: Lukas Ertl <le@univie.ac.at>	2003-07-30 15:22:37 +00:00
Poul-Henning Kamp	7b4bd98ad5	Remove the disabling of buckets workaround. Thanks to: jeffr	2003-07-30 07:50:19 +00:00
Jeff Roberson	f828e5bedb	- Get rid of the ill-conceived uz_cachefree member of uma_zone. - In sysctl_vm_zone use the per cpu locks to read the current cache statistics; this makes them more accurate while under heavy load. Submitted by: tegge	2003-07-30 05:59:17 +00:00
Jeff Roberson	d11e0ba565	- Check to see if we need a slab prior to allocating one. Failure to do so not only wastes memory but it can also cause a leak in zones that will be destroyed later. The problem is that the slab allocation code places newly created slabs on the partially allocated list because it assumes that the caller will actually allocate some memory from it. Failure to do so places an otherwise free slab on the partial slab list where we won't find it later in zone_drain(). Continuously prodded to fix by: phk (Thanks)	2003-07-30 05:42:55 +00:00
Poul-Henning Kamp	0c32d97ab5	Temporary workaround: Always disable buckets, there is a bug there somewhere. JeffR will look at this as soon as he has time. OK'ed by: jeffr	2003-07-29 22:07:10 +00:00
Alan Cox	234c7726c8	None of the "alloc" functions used by UMA assume that Giant is held any longer. (If they still need it, e.g., contigmalloc(), they acquire it themselves.) Therefore, we need not acquire Giant in slab_zalloc().	2003-07-28 02:29:07 +00:00
Alan Cox	f50ab15dff	Remove GIANT_REQUIRED from kmem_alloc().	2003-07-27 18:31:32 +00:00
Maxime Henrion	085f5d6043	Use pmap_zero_page() to zero pages instead of bzero() because they haven't been vm_map_wire()'d yet.	2003-07-27 10:41:33 +00:00
Alan Cox	9c65e7a336	Allow vm_object_reference() on kernel_object without Giant.	2003-07-27 05:43:58 +00:00
Alan Cox	17d89a1f67	Acquire Giant rather than asserting it is held in contigmalloc(). This is a prerequisite to removing further uses of Giant from UMA.	2003-07-26 21:48:46 +00:00
Poul-Henning Kamp	a8d43c90af	Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland. The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/. For now pass -1 all over the place.	2003-07-26 07:32:23 +00:00
Alan Cox	0c1a133f56	Gulp ... call kmem_malloc() without Giant.	2003-07-26 03:55:32 +00:00
Maxime Henrion	b9ff8db1be	Add support for the M_ZERO flag to contigmalloc(). Reviewed by: jeff	2003-07-25 21:02:25 +00:00
Poul-Henning Kamp	a5edd34afe	Remove all but one of the inlines here, this reduces the code size by 2032 bytes and has no measurable impact on performance.	2003-07-22 20:54:26 +00:00
Poul-Henning Kamp	b4ae478044	Don't inline very large functions. Gcc has silently not been doing this for a long time.	2003-07-22 09:27:58 +00:00
Peter Wemm	da5fd14534	swp_pager_hash() was called before it was instantiated inline. This made gcc (quite rightly) unhappy. Move it earlier.	2003-07-22 06:55:48 +00:00
Poul-Henning Kamp	85fdafb98d	Fix a printf format warning I introduced. Use the macro max number of swap devices rather than cache the constant in a variable. Avoid a (now) pointless variable.	2003-07-18 22:11:17 +00:00
Hartmut Brandt	8522511b2a	When INVARIANTS is defined make sure that uma_zalloc_arg (and hence uma_zalloc) is called with exactly one of either M_WAITOK or M_NOWAIT and that it is called with neither M_TRYWAIT or M_DONTWAIT. Print a warning if anything is wrong. Default to M_WAITOK if no flag is given. This is the same test as in malloc(9).	2003-07-18 16:04:36 +00:00
Poul-Henning Kamp	d3dd89ab11	If a proposed swap device exceeds the 8G artificial limit which our radix-tree code imposes, truncate the device instead of rejecting it.	2003-07-18 11:01:23 +00:00
Poul-Henning Kamp	ec38b344cb	Move the implementation of the vmspace_swap_count() (used only in the "toss the largest process" emergency handling) from vm_map.c to swap_pager.c. The quantity calculated depends strongly on the internals of the swap_pager and by moving it, we no longer need to expose the internal metrics of the swap_pager to the world.	2003-07-18 10:47:58 +00:00
Poul-Henning Kamp	567104a148	Add a new function swap_pager_status() which reports the total size of the paging space and how much of it is in use (in pages). Use this interface from the Linuxolator instead of groping around in the internals of the swap_pager.	2003-07-18 10:26:09 +00:00
Poul-Henning Kamp	e9c0cc157b	Merge swap_pager.c and vm_swap.c into swap_pager.c, the separation is not natural and needlessly exposes a lot of dirty laundry. Move private interfaces between the two from swap_pager.h to swap_pager.c and staticize as much as possible. No functional change.	2003-07-18 10:02:44 +00:00
Poul-Henning Kamp	116b3c2af9	Make sure that SWP_NPAGES always has the same value in all source files, so that SWAP_META_PAGES does not vary either. swap_pager.c ended up with a value of 16, everybody else 8. Go with the 16 for now. This should only have any effect in the "kill processes because we are out of swap" scenario, where it will make some sort of estimate of something more precise.	2003-07-17 21:58:43 +00:00
Robert Drehmel	857961d925	Avoid an unnecessary calculation: there is no need to subtract `firstaddr' from `v' if we know that the former equals zero.	2003-07-13 21:02:11 +00:00
Alan Cox	ecf6279f00	- Complete the vm object locking in vm_pageout_object_deactivate_pages(). - Change vm_pageout_object_deactivate_pages()'s first parameter from a vm_map_t to a pmap_t. - Change vm_pageout_object_deactivate_pages()'s and vm_pageout_map_deactivate_pages()'s last parameter from a vm_pindex_t to a long. Since the number of pages in an address space doesn't require 64 bits on an i386, vm_pindex_t is overkill.	2003-07-07 07:16:29 +00:00
Alan Cox	f278f0fbab	Lock a vm object when freeing a page from it.	2003-07-05 20:51:22 +00:00
Poul-Henning Kamp	a5d841d4ce	Remove unnecessary cast.	2003-07-04 12:23:43 +00:00
Alan Cox	1f78f902a8	Background: pmap_object_init_pt() premaps the pages of a object in order to avoid the overhead of later page faults. In general, it implements two cases: one for vnode-backed objects and one for device-backed objects. Only the device-backed case is really machine-dependent, belonging in the pmap. This commit moves the vnode-backed case into the (relatively) new function vm_map_pmap_enter(). On amd64 and i386, this commit only amounts to code rearrangement. On alpha and ia64, the new machine independent (MI) implementation of the vnode case is smaller and more efficient than their pmap-based implementations. (The MI implementation takes advantage of the fact that objects in -CURRENT are ordered collections of pages.) On sparc64, pmap_object_init_pt() hadn't (yet) been implemented.	2003-07-03 20:18:02 +00:00
Maxime Henrion	b3670b9cd0	Fix a few style(9) nits.	2003-07-02 01:47:47 +00:00
Alan Cox	c53e8c5654	Modify vm_page_alloc() and vm_page_select_cache() to allow the page that is returned by vm_page_select_cache() to belong to the object that is already locked by the caller to vm_page_alloc().	2003-07-01 07:33:41 +00:00
Alan Cox	8526ce9b64	Check the address provided to vm_map_stack() against the vm map's maximum, returning an error if the address is too high.	2003-07-01 03:57:25 +00:00
Alan Cox	0551c08dee	Introduce vm_map_pmap_enter(). Presently, this is a stub calling the MD pmap_object_init_pt().	2003-06-29 23:32:55 +00:00
Alan Cox	dca96f1adc	- Export pmap_enter_quick() to the MI VM. This will permit the implementation of a largely MI pmap_object_init_pt() for vnode-backed objects. pmap_enter_quick() is implemented via pmap_enter() on sparc64 and powerpc. - Correct a mismatch between pmap_object_init_pt()'s prototype and its various implementations. (I plan to keep pmap_object_init_pt() as the MD hook for device-backed objects on i386 and amd64.) - Correct an error in ia64's pmap_enter_quick() and adjust its interface to match the other versions. Discussed with: marcel	2003-06-29 21:20:04 +00:00
Alan Cox	0774dfb376	Add vm object locking to vm_pageout_map_deactivate_pages().	2003-06-29 19:51:24 +00:00
Alan Cox	8e1e7b93b3	Remove GIANT_REQUIRED from kmem_malloc().	2003-06-28 22:04:52 +00:00
Alan Cox	5163584c7e	- Add vm object locking to vm_pageout_clean().	2003-06-28 20:07:54 +00:00
Alan Cox	baaaadf125	- Use an int rather than a vm_pindex_t to represent the desired page color in vm_page_alloc(). (This also has small performance benefits.) - Eliminate vm_page_select_free(); vm_page_alloc() might as well call vm_pageq_find() directly.	2003-06-28 07:58:10 +00:00
Alan Cox	23252eeabe	Simple read-modify-write operations on a vm object's flags, ref_count, and shadow_count can now rely on its mutex for synchronization. Remove one use of Giant from vm_map_insert().	2003-06-27 18:52:49 +00:00
Alan Cox	9f2b1758c3	vm_page_select_cache() enforces a number of conditions on the returned page. Add the ability to lock the containing object to those conditions.	2003-06-26 15:44:03 +00:00
Alan Cox	2099bdfded	Modify vm_pageq_requeue() to handle a PQ_NONE page without dereferencing a NULL pointer; remove some now unused code.	2003-06-26 03:14:40 +00:00
Bosko Milekic	d88797c2ba	Move the pcpu lock out of the uma_cache and instead have a single set of pcpu locks. This makes uma_zone somewhat smaller (by (LOCKNAME_LEN * sizeof(char) + sizeof(struct mtx) * maxcpu) bytes, to be exact). No Objections from jeff.	2003-06-25 20:49:48 +00:00
Bosko Milekic	5c133dfa0e	Make sure that the zone destructor doesn't get called twice in certain free paths.	2003-06-25 17:25:45 +00:00
Alan Cox	95018011e5	Remove a GIANT_REQUIRED on the kernel object that we no longer need.	2003-06-25 05:31:02 +00:00
Alan Cox	dd5e55f872	Maintain the lock on a vm object when calling vm_page_grab().	2003-06-25 04:53:56 +00:00
Alan Cox	a8ab48702b	Assert that the vm object is locked on entry to dev_pager_getpages().	2003-06-24 19:48:34 +00:00
Alan Cox	f566a0b6ba	Assert that the vm object is locked on entry to vm_pager_get_pages().	2003-06-23 06:15:05 +00:00
Alan Cox	f29ba63ec9	Maintain a lock on the vm object of interest throughout vm_fault(), releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages() or vm_pager_has_pages()). If we sleep, we have marked the vm object with the paging-in-progress flag.	2003-06-22 21:35:41 +00:00
Poul-Henning Kamp	3b6d965263	Add a f_vnode field to struct file. Several of the subtypes have an associated vnode which is used for stuff like the f*() functions. By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use. At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.	2003-06-22 08:41:43 +00:00
Alan Cox	c8567c3a77	As vm_fault() descends the chain of backing objects, set paging-in- progress on the next object before clearing it on the current object.	2003-06-22 05:36:53 +00:00
Alan Cox	7ca33ad1e8	Complete the vm object locking in vm_object_backing_scan(); specifically, deal with the case where we need to sleep on a busy page with two vm object locks held.	2003-06-22 02:35:06 +00:00
Alan Cox	d98ddc4615	Make some style and white-space changes to the copy-on-write path through vm_fault(); remove a pointless assignment statement from that path.	2003-06-22 00:00:11 +00:00
Poul-Henning Kamp	a6af4ff136	Use a do {...} while (0); and a couple of breaks to reduce the level of indentation a bit.	2003-06-21 08:27:06 +00:00
Alan Cox	ebf7512532	Lock one of the vm objects involved in an optimized copy-on-write fault.	2003-06-21 06:31:42 +00:00
Alan Cox	06ecade7d8	- Increase the scope of the vm object lock in vm_object_collapse(). - Assert that the vm object and its backing vm object are both locked in vm_object_qcollapse().	2003-06-21 04:14:48 +00:00
Alan Cox	5ea4972cd4	Make swap_pager_haspages() static; remove unused function prototypes.	2003-06-20 20:20:06 +00:00
Poul-Henning Kamp	adece6e592	Initialize b_saveaddr when we hand out pbufs	2003-06-20 08:35:28 +00:00
Alan Cox	e50346b5e0	The so-called "optimized copy-on-write fault" case should not require the vm map lock. What's really needed is vm object locking, which is (for the moment) provided by Giant. Reviewed by: tegge	2003-06-20 04:20:36 +00:00
Alan Cox	37681d8642	Assert that the vm object is locked in vm_page_try_to_free().	2003-06-19 01:50:14 +00:00
Alan Cox	d18e8afe99	Fix a vm object reference leak in the page-based copy-on-write mechanism used by the zero-copy sockets implementation. Reviewed by: gallatin	2003-06-19 01:40:44 +00:00
Alan Cox	31953be936	Lock the vm object when freeing a vm page.	2003-06-18 04:27:18 +00:00
Poul-Henning Kamp	b94b853bf1	This file was ignored by CVS in my last commit for some reason: Remove pointless initialization of b_spc field, which now no longer exists.	2003-06-16 09:31:15 +00:00
Poul-Henning Kamp	cefb5754dd	Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations to check that the buffer points to the correct vnode.	2003-06-15 18:53:00 +00:00
Alan Cox	bf5f21b622	Remove an unnecessary forward declaration.	2003-06-15 07:28:33 +00:00
Alan Cox	a04a7f2242	Use #ifdef __alpha__, not __alpha.	2003-06-15 00:12:42 +00:00
Alan Cox	49a2507bd1	Migrate the thread stack management functions from the machine-dependent to the machine-independent parts of the VM. At the same time, this introduces vm object locking for the non-i386 platforms. Two details: 1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The different machine-dependent implementations used various combinations of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set KSTACK_GUARD_PAGES to 0. 2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In 5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed to vm_page_alloc() or vm_page_grab().	2003-06-14 23:23:55 +00:00
Alan Cox	89f4fca265	Move the _new_altkstack() and _dispose_altkstack() functions out of the various pmap implementations into the machine-independent vm. They were all identical.	2003-06-14 06:20:25 +00:00
Alan Cox	33a609ece0	Extend the scope of the vm object lock in swp_pager_async_iodone() to cover a vm_page_free().	2003-06-13 06:17:42 +00:00
Alan Cox	8630c1173e	Add vm object locking to various pagers' "get pages" methods, i386 stack management functions, and a u area management function.	2003-06-13 03:02:28 +00:00
David E. O'Brien	874651b13c	Use __FBSDID().	2003-06-11 23:50:51 +00:00
Peter Wemm	77e2a274d0	GC unused cpu_wait() function	2003-06-11 05:20:33 +00:00
Alan Cox	2a8f9ab57f	- Finish vm object and page locking in vnode_pager_setsize(). - Make some small style changes to vnode_pager_setsize(); most notably, move two comments to a more logical place.	2003-06-10 20:28:41 +00:00
Poul-Henning Kamp	c1f5a18201	Revert last commit, I have no idea what happened.	2003-06-09 22:51:36 +00:00
Poul-Henning Kamp	47f94c12da	A white-space nit I noticed.	2003-06-09 19:40:34 +00:00
Alan Cox	bc5b057f6c	Hold the vm object's lock when performing vm_page_lookup().	2003-06-09 07:01:05 +00:00
Alan Cox	3471677cc9	Don't use vm_object_set_flag() to initialize the vm object's flags.	2003-06-09 06:50:02 +00:00
Alan Cox	138449dc19	- Properly handle the paging_in_progress case on two vm objects in vm_object_deallocate(). - Remove vm_object_pip_sleep().	2003-06-08 23:01:24 +00:00
Alan Cox	984a95d563	Lock the kernel object in kmem_alloc().	2003-06-07 23:24:10 +00:00
Alan Cox	36d1fdf5a2	Teach vm_page_grab() how to handle the vm object's lock.	2003-06-07 23:22:04 +00:00
Alan Cox	19ba4c8e49	Assert that the vm object is locked on entry to swap_pager_freespace().	2003-06-07 20:43:16 +00:00
Alan Cox	d7fc221044	Pass the vm object to vm_object_collapse() with its lock held.	2003-06-07 02:29:17 +00:00
Poul-Henning Kamp	8f16d45326	Fix NFS file swapping, I broke it 3 months ago it seems.	2003-06-05 21:57:19 +00:00
Alan Cox	40b808a842	- Extend the scope of the backing object's lock in vm_object_collapse().	2003-06-05 20:55:27 +00:00
Alan Cox	b72b0115ee	- Add further vm object locking to vm_object_deallocate(), specifically, for accessing a vm object's shadows.	2003-06-04 21:07:42 +00:00
Alan Cox	bc73ee3fe7	- Add VM_OBJECT_TRYLOCK().	2003-06-04 19:59:23 +00:00
Alan Cox	3b68228cce	- Add vm object locking to vm_object_deallocate(). (Still more changes are required.) - Remove special-case macros for kmem object locking. They are no longer used.	2003-06-04 06:00:55 +00:00
Alan Cox	bdbfbaafcc	Add vm object locking to vm_object_coalesce().	2003-06-03 19:37:01 +00:00
Alan Cox	cccf11b865	Change kernel_object and kmem_object to (&kernel_object_store) and (&kmem_object_store), respectively. This allows the address of these objects to be resolved at link-time rather than run-time.	2003-06-01 23:59:48 +00:00
Poul-Henning Kamp	c5d771b807	Prepend _ to internal union members to avoid ambiguity. Found by: FlexeLint	2003-05-31 19:52:15 +00:00
Poul-Henning Kamp	0b074f6c93	Remove unused variables Found by: FlexeLint	2003-05-31 19:51:05 +00:00
Alan Cox	34567de7fc	Add vm object locking to vm_object_madvise().	2003-05-31 19:40:57 +00:00
David Schultz	e92686d065	If we seem to be out of VM, don't allow the pagedaemon to kill processes in the first pass. Among other things, this will give us a chance to launder vnode-backed pages before concluding that we need more swap. This is particularly useful for systems that have no swap. While here, update a comment and remove some long-unused code. Reported by: Lucky Green <shamrock@cypherpunks.to> Suggested by: dillon Approved by: re (rwatson)	2003-05-19 00:51:07 +00:00
Alan Cox	1c500307d1	Reduce the size of a vm object by converting its shadow list from a TAILQ to a LIST. Approved by: re (rwatson)	2003-05-18 04:10:16 +00:00
John Baldwin	90af4afacb	- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe. Reviewed by: arch@ Approved by: re (rwatson)	2003-05-13 20:36:02 +00:00
Alan Cox	3a12f5da1f	Give the kmem object's mutex a unique name, instead of "vm object", to avoid false reports of lock-order reversal with a system map mutex. Approved by: re (jhb)	2003-05-09 02:13:23 +00:00
Alan Cox	658ad5fff5	Lock the vm_object when performing vm_pager_deallocate().	2003-05-06 02:45:28 +00:00
Alan Cox	f7dd7b637b	Extend the scope of the vm_object lock in vm_object_terminate().	2003-05-04 19:23:40 +00:00
Alan Cox	c4a1d732a3	Avoid a lock-order reversal and implement vm_object locking in vm_pageout_page_free().	2003-05-04 06:56:27 +00:00
Alan Cox	ad682c4825	Lock the vm_object on entry to vm_object_vndeallocate().	2003-05-03 20:28:26 +00:00
Alan Cox	bff99f0d12	- Revert kern/vfs_subr.c revision 1.444. The vm_object's size isn't trustworthy for vnode-backed objects. - Restore the old behavior of vm_object_page_remove() when the end of the given range is zero. Add a comment to vm_object_page_remove() regarding this behavior. Reported by: iedowse	2003-05-03 08:09:24 +00:00
Alan Cox	f92039a1fc	Move a declaration to its proper place.	2003-05-03 04:21:16 +00:00
Alan Cox	6be365253d	Lock the vm_object when updating its shadow list.	2003-05-02 04:55:21 +00:00
Alan Cox	4f7c7f6e23	Simplify the removal of a shadow object in vm_object_collapse().	2003-05-02 03:00:21 +00:00
Alan Cox	8e3a76fb6f	Extend the scope of the vm_object locking in vm_object_split().	2003-05-01 05:06:33 +00:00
Alan Cox	1534781737	- Update the vm_object locking in vm_object_reference(). - Convert some dead code in vm_object_reference() into a comment.	2003-05-01 03:29:20 +00:00
Alan Cox	4e73db5f40	Increase the scope of the vm_object lock in vm_map_delete().	2003-04-30 19:18:09 +00:00
Alan Cox	85b1dc89b6	Eliminate an unused parameter from vm_pageout_object_deactivate_pages().	2003-04-30 03:08:16 +00:00
Alan Cox	8ba20a48bd	Add vm_object locking to vmspace_swap_count().	2003-04-30 00:43:17 +00:00
Alan Cox	24b3046aac	Remove unused declarations and definitions.	2003-04-29 18:49:25 +00:00
Alexander Kabaev	104a9b7e3e	Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-04-29 13:36:06 +00:00
Alan Cox	17cd3642fe	- Lock the vm_object when performing swap_pager_isswapped(). - Assert that the vm_object is locked in swap_pager_isswapped().	2003-04-28 17:13:53 +00:00
Alan Cox	82774d8040	uma_zone_set_obj() must perform VM_OBJECT_LOCK_INIT() if the caller provides storage for the vm_object.	2003-04-28 06:11:32 +00:00
Alan Cox	ed6a786313	- Define VM_OBJECT_LOCK_INIT(). - Avoid repeatedly mtx_init()ing and mtx_destroy()ing the vm_object's lock using UMA's uminit callback, in this case, vm_object_zinit().	2003-04-28 03:45:35 +00:00
Alan Cox	c9917419ef	- Tell witness that holding two or more vm_object locks is okay. - In vm_object_deallocate(), lock the child when removing the parent from the child's shadow list.	2003-04-27 20:07:57 +00:00
Alan Cox	570a2f4ac5	Various changes to vm_object_shadow(): (1) update the vm_object locking, (2) remove a pointless assertion, and (3) make a trivial change to a comment.	2003-04-27 05:43:03 +00:00
Alan Cox	ecde4b3218	Various changes to vm_object_page_remove(): - Eliminate an odd, special-case feature: if start == end == 0 then all pages are removed. Only one caller used this feature and that caller can trivially pass the object's size. - Assert that the vm_object is locked on entry; don't bother testing for a NULL vm_object. - Style: Fix lines that are longer than 80 characters.	2003-04-26 23:41:30 +00:00
Alan Cox	c829b9d0fc	- Lock the vm_object on entry to vm_object_terminate().	2003-04-26 19:36:19 +00:00
Alan Cox	1ca5895341	- Convert vm_object_pip_wait() from using tsleep() to msleep(). - Make vm_object_pip_sleep() static. - Lock the vm_object when performing vm_object_pip_wait().	2003-04-26 18:33:18 +00:00
Alan Cox	155080d31e	- Extend the scope of two existing vm_object locks to cover swap_pager_freespace().	2003-04-26 05:30:56 +00:00
Alan Cox	5103186c8c	Remove an XXX comment. It is no longer a problem.	2003-04-26 05:00:56 +00:00
John Baldwin	8f88740381	- Don't bother using the proc lock to test just P_SYSTEM as that is set in fork1() and never changes. - The proc lock is enough to cover reading p_state, so push down sched_lock into the PRS_NORMAL case of the switch on p_state.	2003-04-25 20:06:30 +00:00
Alan Cox	6a07e90d63	- Lock the vm_object when iterating over its list of resident pages.	2003-04-25 16:30:02 +00:00
Alan Cox	5299887de5	- Relax the Giant required in vm_page_remove(). - Remove the Giant required from vm_page_free_toq(). (Any locking errors will be caught by vm_page_remove().) This remedies a panic that occurred when kmem_malloc(NOWAIT) performed without Giant failed to allocate the necessary pages. Reported by: phk	2003-04-25 06:35:05 +00:00
Alan Cox	875791f63d	- Move swap_pager_isswapped()'s prototype to a more logical place.	2003-04-24 05:29:27 +00:00
Alan Cox	b6e48e0372	- Acquire the vm_object's lock when performing vm_object_page_clean(). - Add a parameter to vm_pageout_flush() that tells vm_pageout_flush() whether its caller has locked the vm_object. (This is a temporary measure to bootstrap vm_object locking.)	2003-04-24 04:31:25 +00:00
John Baldwin	11edc1e0d7	Fix compiling in the NO_SWAPPING case. Submitted by: bde (partially)	2003-04-23 18:21:41 +00:00
John Baldwin	897ecacd64	Lock the proc to check p_flag and several other related tests in vm_daemon(). We don't need to hold sched_lock as long now as a result.	2003-04-22 20:03:08 +00:00
John Baldwin	eeec6bab2e	Prefer the proc lock to sched_lock when testing PS_INMEM now that it is safe to do so.	2003-04-22 20:01:56 +00:00
John Baldwin	664f718ba1	- Always call faultin() in _PHOLD() if PS_INMEM is clear. This closes a race where a thread could assume that a process was swapped in by PHOLD() when it actually wasn't fully swapped in yet. - In faultin(), always msleep() if PS_SWAPPINGIN is set instead of doing this check after bumping p_lock in the PS_INMEM == 0 case. Also, sched_lock is only needed for setting and clearing swapping PS_* flags and the swap thread inhibitor. - Don't set and clear the thread swap inhibitor in the same loops as the pmap_swapin/out_thread() since we have to do it under sched_lock. Instead, mimic the treatment of the PS_INMEM flag and use separate loops to set the inhibitors when clearing PS_INMEM and clear the inhibitors when setting PS_INMEM. - swapout() now returns with the proc lock held as it holds the lock while adjusting the swapping-related PS_* flags so that the proc lock can be used to test those flags. - Only use the proc lock to check the swapping-related PS_* flags in several places. - faultin() no longer requires sched_lock to be held by callers. - Rename PS_SWAPPING to PS_SWAPPINGOUT to be less ambiguous now that we have PS_SWAPPINGIN.	2003-04-22 20:00:26 +00:00
Alan Cox	2e9d00a15d	Revision 1.246 should have also included - Weaken the assertion in vm_page_insert() to require Giant only if the vm_object isn't locked. Reported by: "Ilmar S. Habibulin" <ilmar@watson.org>	2003-04-22 14:26:02 +00:00
Alan Cox	26da32cc73	Remove unused declarations.	2003-04-22 06:26:42 +00:00
Alan Cox	03d4c1e644	Revision 1.52 of vm/uma_core.c has led to UMA's obj_alloc() being called without Giant; and obj_alloc() in turn calls vm_page_alloc() without Giant. This causes an assertion failure in vm_page_alloc(). Fortunately, obj_alloc() is now MPSAFE. So, we need only clean up some assertions. - Weaken the assertion in vm_page_lookup() to require Giant only if the vm_object isn't locked. - Remove an assertion from vm_page_alloc() that duplicates a check performed in vm_page_lookup(). In collaboration with: gallatin, jake, jeff	2003-04-22 05:36:14 +00:00
Alan Cox	1c067f7ebc	Add VM_OBJECT_LOCKED().	2003-04-22 04:47:29 +00:00
Alan Cox	d647a0ed5a	- Assert that the vm_object is locked in vm_object_clear_flag(), vm_object_pip_add() and vm_object_pip_wakeup(). - Remove GIANT_REQUIRED from vm_object_pip_subtract() and vm_object_pip_subtract(). - Lock the vm_object when performing vm_object_page_remove().	2003-04-21 06:33:52 +00:00
Alan Cox	d7a013c320	- Lock the vm_object when performing either vm_object_clear_flag() or vm_object_pip_wakeup().	2003-04-20 23:23:41 +00:00
Alan Cox	1d284e00b5	- Update the vm_object locking in vm_map_insert().	2003-04-20 21:56:40 +00:00
Alan Cox	72ba747d16	- Lock the vm_object when performing vm_object_pip_wakeup(). - Merge two identical cases in a switch statement.	2003-04-20 20:37:14 +00:00
Alan Cox	b009d5a0af	- Lock the vm_object when performing vm_object_pip_wakeup().	2003-04-20 19:25:28 +00:00
Alan Cox	d68d828b43	- Lock the vm_object when performing vm_object_pip_add(). - Remove an unnecessary variable.	2003-04-20 07:08:30 +00:00
Alan Cox	7d040e3cc5	Update vm_object locking in vm_map_delete().	2003-04-20 04:35:47 +00:00
Alan Cox	d22bc7101c	- Lock the vm_object when performing vm_object_pip_add().	2003-04-20 03:41:21 +00:00
Alan Cox	0fa05eae77	- Lock the vm_object when performing vm_object_pip_subtract(). - Assert that the vm_object lock is held in vm_object_pip_subtract().	2003-04-19 22:11:41 +00:00
Alan Cox	0d420ad3e6	- Lock the vm_object when performing vm_object_pip_wakeupn(). - Assert that the vm_object lock is held in vm_object_pip_wakeupn(). - Add a new macro VM_OBJECT_LOCK_ASSERT().	2003-04-19 21:15:44 +00:00
Alan Cox	034b3d7a6f	o Update locking around vm_object_page_remove() in vm_map_clean() to use the new macros. o Remove unnecessary increment and decrement of the vm_object's reference count in vm_map_clean().	2003-04-19 01:43:32 +00:00
Alan Cox	410cfc455e	Lock the vm_object in obj_alloc().	2003-04-19 00:30:36 +00:00
Alan Cox	49281fbf68	Update locking around vm_object_page_remove() to use the new macros.	2003-04-18 16:39:03 +00:00
Andrew Gallatin	b37d8ead52	Don't grab Giant in slab_zalloc() if M_NOWAIT is specified. This should allow the use of INTR_MPSAFE network drivers. Tested by: njl Glanced at by: jeff	2003-04-18 13:02:29 +00:00
John Baldwin	69297bf8c9	suser() does not need the proc lock, just the setting of P_PROTECTED in p_flag needs the lock.	2003-04-17 22:38:27 +00:00
Tom Rhodes	9faaf3b3c8	Add some tunable descriptions. Submitted by: hmp Discussed with: bde	2003-04-17 15:44:22 +00:00
Tom Rhodes	2a3eeaa240	Pre-content whitespace commit. Discussed with: bde	2003-04-17 15:39:12 +00:00
Alan Cox	acbff226fc	Update locking on the kmem_object to use the new macros.	2003-04-15 01:16:05 +00:00
Alan Cox	de5ef10142	Update locking on the kernel_object to use the new macros.	2003-04-14 00:36:53 +00:00
Alan Cox	d1dc776d9d	Lock some manipulations of the vm object's flags.	2003-04-13 23:43:34 +00:00
Alan Cox	e2479b4fc3	Lock some manipulations of the vm object's flags.	2003-04-13 20:22:02 +00:00
Alan Cox	b077a36297	Lock some manipulations of the vm object's flags.	2003-04-13 19:36:18 +00:00
Alan Cox	fdff41609d	Add new macros for locking and unlocking a vm object.	2003-04-13 18:39:47 +00:00
Alan Cox	f279b88deb	Permit vm_object_pip_add() and vm_object_pip_wakeup() on the kmem_object without Giant held.	2003-04-13 00:43:48 +00:00
Alan Cox	f31c239da1	Eliminate unnecessary gotos from kmem_malloc().	2003-04-13 00:23:42 +00:00
John Baldwin	d8fed0f0f2	- Kill the pv_flags member of the alpha mdpage since it stopped being used in rev 1.61 of pmap.c. - Now that pmap_page_is_free() is empty and since it is just a hack for the Alpha pmap, remove it.	2003-04-10 18:42:06 +00:00
Alan Cox	2ac8b16089	Remove GIANT_REQUIRED from getpbuf(). Reviewed by: tegge Reduce pbuf_mtx's scope in relpbuf(). Submitted by: tegge	2003-04-05 21:01:16 +00:00
Dag-Erling Smørgrav	e8c7f48855	Rename a static variable to avoid future conflicts.	2003-04-04 12:08:42 +00:00
Wes Peters	f4cf2141f6	Add a facility allowing processes to inform the VM subsystem they are critical and should not be killed when pageout is looking for more memory pages in all the wrong places. Reviewed by: arch@ Sponsored by: St. Bernard Software	2003-03-31 21:09:57 +00:00
Maxime Henrion	6900a17c75	The object type can't be OBJT_PHYS in vm_mmap(). Reviewed by: peter	2003-03-30 00:56:20 +00:00
Tor Egge	125ee0d161	Obtain Giant before calling kmem_alloc without M_NOWAIT and before calling kmem_free if Giant isn't already held.	2003-03-26 18:44:53 +00:00
Jake Burkholder	227f9a1c58	- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses are larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long. Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms. Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)	2003-03-25 00:07:06 +00:00
Maxime Henrion	dab392a4d4	Remove an empty comment.	2003-03-19 00:34:43 +00:00
Poul-Henning Kamp	b4b138c27f	Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a nested include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.	2003-03-18 08:45:25 +00:00
Jake Burkholder	9f77ba59c5	Subtract the memory that backs the vm_page structures from phys_avail after mapping it. This makes it possible to determine if a physical page has a backing vm_page or not.	2003-03-17 03:16:00 +00:00
Jake Burkholder	5501d40bb9	Made the prototypes for pmap_kenter and pmap_kremove MD. These functions are machine dependent because they are not required to update the tlb when mappings are added or removed, and doing so is machine dependent. In addition, an implementation may require that pages mapped with pmap_kenter have a backing vm_page_t, which is not necessarily true of all physical pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the physical address in order to make this requirement clear.	2003-03-16 04:16:03 +00:00
David Schultz	72d97679ff	- When the VM daemon is out of swap space and looking for a process to kill, don't block on a map lock while holding the process lock. Instead, skip processes whose map locks are held and find something else to kill. - Add vm_map_trylock_read() to support the above. Reviewed by: alc, mike (mentor)	2003-03-12 23:13:16 +00:00
Kenneth D. Merry	9b80d344ec	Zero copy send and receive fixes: - On receive, vm_map_lookup() needs to trigger the creation of a shadow object. To make that happen, call vm_map_lookup() with PROT_WRITE instead of PROT_READ in vm_pgmoveco(). - On send, a shadow object will be created by the vm_map_lookup() in vm_fault(), but vm_page_cowfault() will delete the original page from the backing object rather than simply letting the legacy COW mechanism take over. In other words, the new page should be added to the shadow object rather than replacing the old page in the backing object. (i.e. vm_page_cowfault() should not be called in this case.) We accomplish this by making sure fs.object == fs.first_object before calling vm_page_cowfault() in vm_fault(). Submitted by: gallatin, alc Tested by: ken	2003-03-08 06:58:18 +00:00
Alan Cox	09c80124a3	Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress. Discussed on: arch@	2003-03-06 03:41:02 +00:00
Robert Watson	1b2c2ab29a	Provide a mac_check_system_swapoff() entry point, which permits MAC modules to authorize disabling of swap against a particular vnode. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-05 23:50:15 +00:00
John Baldwin	263067951a	Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls to WITNESS_WARN().	2003-03-04 21:03:05 +00:00
Poul-Henning Kamp	afadcb6108	NO_GEOM cleanup: Use VOP_IOCTL(DIOCGMEDIASIZE) to check the size of a potential swap device instead of the cdevsw->d_psize() method.	2003-03-02 14:37:52 +00:00
Alan Cox	1a1e9f41e5	Teach vm_page_sleep_if_busy() to release the vm_object lock before sleeping.	2003-03-01 19:16:32 +00:00
Alan Cox	077808c588	Fuse two #ifdefs with identical conditions.	2003-02-25 06:46:08 +00:00
Jeff Roberson	17661e5ac4	- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK. - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead. - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock. Reviewed by: arch, mckusick	2003-02-25 03:37:48 +00:00
Maxime Henrion	07159f9c56	Cleanup of the d_mmap_t interface. - Get rid of the useless atop() / pmap_phys_address() detour. The device mmap handlers must now give back the physical address without atop()'ing it. - Don't borrow the physical address of the mapping in the returned int. Now we properly pass a vm_offset_t * and expect it to be filled by the mmap handler when the mapping was successful. The mmap handler must now return 0 when successful, any other value is considered as an error. Previously, returning -1 was the only way to fail. This change thus accidentally fixes some devices which were bogusly returning errno constants which would have been considered as addresses by the device pager. - Garbage collect the poorly named pmap_phys_address() now that it's no longer used. - Convert all the d_mmap_t consumers to the new API. I'm still not sure whether we need a __FreeBSD_version bump for this, since we didn't guarantee API/ABI stability until 5.1-RELEASE. Discussed with: alc, phk, jake Reviewed by: peter Compile-tested on: LINT (i386), GENERIC (alpha and sparc64) Runtime-tested on: i386	2003-02-25 03:21:22 +00:00
Alan Cox	3fa24ec9f1	In vm_page_dirty(), assert that the page is not in the free queue(s).	2003-02-24 17:30:45 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Alan Cox	6420521aa5	Remove GIANT_REQUIRED from vm_pageq_remove().	2003-02-16 06:36:48 +00:00
Alan Cox	814f5c92d7	Remove the acquisition and release of Giant around pmap_growkernel(). It's unnecessary for two reasons: (1) Giant is at present already held in such cases and (2) our various implementations of pmap_growkernel() look to be MP safe. (For example, for sparc64 the proof of (2) is trivial.)	2003-02-15 20:01:09 +00:00
Alan Cox	53b1963649	Move kernel_vm_end's declaration to pmap.h; add a comment regarding the synchronization of access to kernel_vm_end.	2003-02-15 19:38:23 +00:00
Alan Cox	6b4b77ad34	Add a comment describing how pagedaemon_wakeup() should be used and synchronized. Suggested by: tegge	2003-02-09 20:40:36 +00:00
Poul-Henning Kamp	886eaaacfa	Change a printf to also tell how many items were left in the zone.	2003-02-04 08:23:18 +00:00
Alan Cox	a1c0a78518	- It's more accurate to say that vm_paging_needed() returns TRUE than a positive number. - In pagedaemon_wakeup(), set vm_pages_needed to 1 rather than incrementing it to accomplish the same.	2003-02-02 07:16:40 +00:00
Alan Cox	8e1d8de578	- Convert vm_pageout()'s tsleep()s to msleep()s with the page queue lock.	2003-02-02 01:11:21 +00:00
Alan Cox	8b24576748	- Remove (some) unnecessary explicit initializations to zero. - Style changes to vm_pageout(): declarations and white-space.	2003-02-01 21:55:30 +00:00
Alan Cox	e6f2748cbc	- Convert the tsleep()s in vm_wait() and vm_waitpfault() to msleep()s with the page queue lock. - Assert that the page queue lock is held in vm_page_free_wakeup().	2003-02-01 21:18:16 +00:00
Alan Cox	75741c0497	Simplify vm_object_page_remove(): The object's memq is now ordered. The two cases that existed before for performance optimization purposes can be reduced to one.	2003-01-27 01:12:35 +00:00
Alan Cox	d923c5986e	Add MTX_DUPOK to the initialization of system map locks.	2003-01-25 18:45:55 +00:00
Alfred Perlstein	c3dfdfd132	use 'void *' instead of 'caddr_t' for useracc, kernacc, vslock and vsunlock.	2003-01-21 11:34:57 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Matthew Dillon	8575a17e02	Fix swapping to a file, it was broken when SPECSTRATEGY was introduced.	2003-01-20 20:00:32 +00:00
Matthew Dillon	2d5c7e4506	Close the remaining user address mapping races for physical I/O, CAM, and AIO. Still TODO: streamline useracc() checks. Reviewed by: alc, tegge MFC after: 7 days	2003-01-20 17:46:48 +00:00
Alan Cox	28ec30cd9f	- Hold the page queues lock around vm_page_hold(). - Assert that the page queues lock rather than Giant is held in vm_page_hold().	2003-01-20 09:24:03 +00:00
Jeff Roberson	ebc85edf5e	- M_WAITOK is 0 and not a real flag. Test for this properly. Submitted by: tmm Pointy hat to: jeff	2003-01-20 01:32:56 +00:00
David E. O'Brien	c4aa0a2e38	Rev 1.16 renamed VM_METER to VM_TOTAL. This is breaking 3rd-party apps. So add a VM_METER compat define. Submitted by: Andy Fawcett <andy@athame.co.uk>	2003-01-18 21:14:02 +00:00
Matthew Dillon	e3669cee72	Merge all the various copies of vm_fault_quick() into a single portable copy.	2003-01-16 00:02:21 +00:00
Alan Cox	b0ef8c5fe4	- Update vm_pageout_deficit using atomic operations. It's a simple counter outside the scope of existing locks. - Eliminate a redundant clearing of vm_pageout_deficit.	2003-01-14 06:57:03 +00:00
Alan Cox	ff2023a5df	Make vm_pageout_page_free() static.	2003-01-14 02:28:39 +00:00
Matthew Dillon	3db161e079	It is possible for an active aio to prevent shared memory from being dereferenced when a process exits due to the vmspace ref-count being bumped. Change shmexit() and shmexit_myhook() to take a vmspace instead of a process and call it in vmspace_dofree(). This way if it is missed in exit1()'s early-resource-free it will still be caught when the zombie is reaped. Also fix a potential race in shmexit_myhook() by NULLing out vmspace->vm_shm prior to calling shm_delete_mapping() and free(). MFC after: 7 days	2003-01-13 23:04:32 +00:00
Poul-Henning Kamp	ca94e7c4ca	We can get past here on a normal vnode as well, so use VOP_STRATEGY if so.	2003-01-13 21:32:16 +00:00
Matthew Dillon	48e3128b34	Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.	2003-01-13 00:33:17 +00:00
Alan Cox	a15700fe32	Make vm_page_alloc() return PG_ZERO only if VM_ALLOC_ZERO is specified. The objective being to eliminate some cases of page queues locking. (See, for example, vm/vm_fault.c revision 1.160.) Reviewed by: tegge (Also, pointed out by tegge that I changed vm_fault.c before changing vm_page.c. Oops.)	2003-01-12 23:32:46 +00:00
Alan Cox	1761f1829d	vm_fault_copy_entry() needn't clear PG_ZERO because it didn't pass VM_ALLOC_ZERO to vm_page_alloc().	2003-01-12 07:33:16 +00:00
Matthew Dillon	cd72f2180b	Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.	2003-01-12 01:37:13 +00:00
Alan Cox	b5dc830507	In vm_page_alloc(), fuse two if statements that are conditioned on the same expression.	2003-01-11 20:07:17 +00:00
Matthew Dillon	f7550ecf3f	Make 'sysctl vm.vmtotal' work properly using updated patch from Hiten. (the patch in the PR was stale). PR: kern/5689 Submitted by: Hiten Pandya <hiten@unixdaemons.com>	2003-01-11 07:29:47 +00:00
Alan Cox	9a032278bd	In vm_page_alloc(), honor VM_ALLOC_ZERO for system and interrupt class requests when the number of free pages is below the reserved threshold. Previously, VM_ALLOC_ZERO was only honored when the number of free pages was above the reserved threshold. Honoring it in all cases generally makes sense, does no harm, and simplifies the code.	2003-01-08 19:58:42 +00:00
Poul-Henning Kamp	5266a767e5	Convert VOP_STRATEGY to VOP_SPECSTRATEGY in the generic getpages and the pager input for small filesystems.	2003-01-05 20:32:03 +00:00
Alan Cox	6c4952c7b4	Use atomic add and subtract to update the global wired page count, cnt.v_wire_count.	2003-01-05 01:31:45 +00:00
Poul-Henning Kamp	f5b11b6e2d	Temporarily introduce a new VOP_SPECSTRATEGY operation while I try to sort out disk-io from file-io in the vm/buffer/filesystem space. The intent is to sort VOP_STRATEGY calls into those which operate on "real" vnodes and those which operate on VCHR vnodes. For the latter kind, the call will be changed to VOP_SPECSTRATEGY, possibly conditionally for those places where dual-use happens. Add a default VOP_SPECSTRATEGY method which will call the normal VOP_STRATEGY. First time it is called it will print debugging information. This will only happen if a normal vnode is passed to VOP_SPECSTRATEGY by mistake. Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY does on a VCHR vnode today. Add a new VOP_STRATEGY method in specfs to catch instances where the conversion to VOP_SPECSTRATEGY has not yet happened. Handle the request just like we always did, but first time called print debugging information. Apart from up to two instances of console messages per boot, this amounts to a glorified no-op commit. If you get any of the messages on your console I would very much like a copy of them mailed to phk@freebsd.org	2003-01-04 22:10:36 +00:00
Alan Cox	469c4ba59e	Allow kmem_malloc() without Giant if M_NOWAIT is specified.	2003-01-04 19:26:35 +00:00
Alan Cox	4dbeceee96	Use vm_object_lock() and vm_object_unlock() in vm_object_deallocate(). (This procedure needs further work, but this change is sufficient for locking the kmem_object.)	2003-01-04 19:23:19 +00:00
Alan Cox	009f3e7a1e	Refine the assertions in vm_page_alloc().	2003-01-04 19:07:13 +00:00
Alan Cox	5440b5a974	Refine the assertion in vm_object_clear_flag() to allow operation on the kmem_object without Giant. In that case, assert that the kmem_object's mutex is held.	2003-01-03 19:19:08 +00:00
Poul-Henning Kamp	d6b3a1df18	Revert use of dmmax_mask, I had overlooked a '~'. Spotted by: bde	2003-01-03 19:16:48 +00:00
Poul-Henning Kamp	42c43e6031	Make struct swblock kernel only, to make vm/swap_pager.h userland includable. Move struct swdevt from sys/conf.h to the more appropriate vm/swap_pager.h. Adjust #include use in libkvm and pstat(8) to match.	2003-01-03 16:23:12 +00:00
Poul-Henning Kamp	c410df597b	Avoid extern decls in .c files by putting them in the vm/swap_pager.h include file where they belong. Share the dmmax_mask variable.	2003-01-03 14:30:46 +00:00
Poul-Henning Kamp	3ccbf2d533	Use correct _VM_SWAP_PAGER_H_ to check for multiple inclusion.	2003-01-03 14:22:52 +00:00
Poul-Henning Kamp	69fd75d094	Retire sys/dmap.h by including the two lines of it which matters directly in vm/vm_swap.c.	2003-01-03 09:55:05 +00:00
Alan Cox	a6864937e2	Lock the vm object when performing vm_object_clear_flag().	2003-01-03 09:15:43 +00:00
Poul-Henning Kamp	862702306b	Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.	2003-01-03 06:32:15 +00:00
Alan Cox	49247edca6	Add vm map and vm object locking to vmtotal().	2003-01-03 05:52:02 +00:00
Alan Cox	81e4e48d24	Lock the vm object when performing vm_object_clear_flag().	2003-01-02 09:09:27 +00:00
Alan Cox	d61e1287a4	Update the assertions in vm_page_insert() and vm_page_lookup() to reflect locking of the kmem_object.	2003-01-01 19:45:36 +00:00
Jens Schweikhardt	9d5abbddbf	Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.	2003-01-01 18:49:04 +00:00
Alan Cox	ea0081b61e	Add a needed #include. Reported by: ia64 tinderbox	2003-01-01 00:13:01 +00:00
Alan Cox	36daaecd04	Implement a variant locking scheme for vm maps: Access to system maps is now synchronized by a mutex, whereas access to user maps is still synchronized by a lockmgr()-based lock. Why? No single type of lock, including sx locks, meets the requirements of both types of vm map. Sometimes we sleep while holding the lock on a user map. Thus, a mutex isn't appropriate. On the other hand, both lockmgr()-based and sx locks release Giant when a thread/process blocks during contention for a lock. This could lead to a race condition in a legacy driver (that relies on Giant for synchronization) if it attempts to kmem_malloc() and fails to immediately obtain the lock. Fortunately, we never sleep while holding a system map lock.	2002-12-31 19:38:04 +00:00
Alan Cox	c9267356b7	- Mark the kernel_map as a system map immediately after its creation. - Correct a cast.	2002-12-30 05:55:41 +00:00
Alan Cox	3a92e5d5e9	- Increment the vm_map's timestamp if _vm_map_trylock() succeeds. - Introduce map_sleep_mtx and use it to replace Giant in vm_map_unlock_and_wait() and vm_map_wakeup(). (Original version by: tegge.)	2002-12-30 00:41:33 +00:00
Alan Cox	e3a9e1b2a8	- Remove vm_object_init2(). It is unused. - Add a mtx_destroy() to vm_object_collapse(). (This allows a bzero() to migrate from _vm_object_allocate() to vm_object_zinit(), where it will be performed less often.)	2002-12-29 21:01:14 +00:00
Alan Cox	a28cc55e5b	Reduce the number of times that we acquire and release the page queues lock by making vm_page_rename()'s caller, rather than vm_page_rename(), responsible for acquiring it.	2002-12-29 07:17:06 +00:00
Alan Cox	2ee5fea7d3	Assert that the page queues lock rather than Giant is held in vm_page_flag_clear().	2002-12-28 22:49:37 +00:00
Matthew Dillon	40bb4f4bcf	vm_pager_put_pages() takes VM_PAGER_* flags, not OBJPC_* flags. It just so happens that OBJPC_SYNC has the same value as VM_PAGER_PUT_SYNC so no harm done. But fix it :-) No operational changes. MFC after: 1 day	2002-12-28 21:15:39 +00:00
Matthew Dillon	43b7990e30	Allow the VM object flushing code to cluster. When the filesystem syncer comes along and flushes a file which has been mmap()'d SHARED/RW, with dirty pages, it was flushing the underlying VM object asynchronously, resulting in thousands of 8K writes. With this change the VM Object flushing code will cluster dirty pages in 64K blocks. Note that until the low memory deadlock issue is reviewed, it is not safe to allow the pageout daemon to use this feature. Forced pageouts still use fs block size'd ops for the moment. MFC after: 3 days	2002-12-28 21:03:42 +00:00
Alan Cox	a623fedef7	Two changes to kmem_malloc(): - Use VM_ALLOC_WIRED. - Perform vm_page_wakeup() after pmap_enter(), like we do everywhere else.	2002-12-28 19:03:54 +00:00
Alan Cox	35c016315f	- Change vm_object_page_collect_flush() to assert rather than acquire the page queues lock. - Acquire the page queues lock in vm_object_page_clean().	2002-12-27 20:16:13 +00:00
Alan Cox	969da54c3a	Increase the scope of the page queues lock in phys_pager_getpages().	2002-12-27 06:09:56 +00:00
Alan Cox	82ea080d88	- Hold the page queues lock around calls to vm_page_flag_clear().	2002-12-24 19:02:03 +00:00
Alan Cox	dc907f6632	- Hold the page queues lock around vm_page_wakeup().	2002-12-24 04:24:58 +00:00
Alan Cox	6e14fce9d9	- Hold the kernel_object's lock around vm_page_insert(..., kernel_object, ...).	2002-12-23 20:39:15 +00:00
Alan Cox	7af7dd3c6f	Eliminate some dead code. (Any possible use for this code died with vm/vm_page.c revision 1.220.) Submitted by: bde	2002-12-23 04:35:38 +00:00
Matthew Dillon	9991ea7178	The UP -current was not properly counting the per-cpu VM stats in the sysctl code. This makes 'systat -vm 1's syscall count work again. Submitted by: Michal Mertl <mime@traveller.cz> Note: also slated for 5.0	2002-12-22 05:04:30 +00:00
Alan Cox	671e427ce9	Increase the scope of the kmem_object locking in kmem_malloc().	2002-12-20 18:59:23 +00:00
Alan Cox	4b420d501f	Add a mutex to struct vm_object. Initialize and destroy that mutex at appropriate times. For the moment, the mutex is only used on the kmem_object.	2002-12-20 05:10:32 +00:00
Alan Cox	cf3e6e4837	Remove the hash_rand field from struct vm_object. As of revision 1.215 of vm/vm_page.c, it is unused.	2002-12-19 20:01:22 +00:00
Alan Cox	24c9ad6bed	- Remove vm_page_sleep_busy(). The transition to vm_page_sleep_if_busy(), which incorporates page queue and field locking, is complete. - Assert that the page queue lock rather than Giant is held in vm_page_flag_set().	2002-12-19 07:23:46 +00:00
Alan Cox	9a96b6382a	- Hold the page queues lock when performing vm_page_busy() or vm_page_flag_set(). - Replace vm_page_sleep_busy() with proper page queues locking and vm_page_sleep_if_busy().	2002-12-19 01:20:24 +00:00
Alan Cox	bd82dc7460	- Hold the page queues lock when performing vm_page_busy(). - Replace vm_page_sleep_busy() with proper page queues locking and vm_page_sleep_if_busy().	2002-12-18 04:39:15 +00:00
Alan Cox	b365ea9e30	Hold the page queues lock when performing vm_page_flag_set().	2002-12-18 04:02:02 +00:00
Alan Cox	d8e7c54e1e	Hold the page queues lock when performing vm_page_flag_set().	2002-12-17 19:55:28 +00:00
Matthew Dillon	fa7dd9c5bc	Change the way ELF coredumps are handled. Instead of unconditionally skipping read-only pages, which can result in valuable non-text-related data not getting dumped, the ELF loader and the dynamic loader now mark read-only text pages NOCORE and the coredump code only checks (primarily) for complete inaccessibility of the page or NOCORE being set. Certain applications which map large amounts of read-only data will produce much larger cores. A new sysctl has been added, debug.elf_legacy_coredump, which will revert to the old behavior. This commit represents collaborative work by all parties involved. The PR contains a program demonstrating the problem. PR: kern/45994 Submitted by: "Peter Edwards" <pmedwards@eircom.net>, Archie Cobbs <archie@dellroad.org> Reviewed by: jdp, dillon MFC after: 7 days	2002-12-16 19:24:43 +00:00
Alan Cox	4b36fe0cbd	Perform vm_object_lock() and vm_object_unlock() on kmem_object around vm_page_lookup() and vm_page_free().	2002-12-15 21:09:09 +00:00
Matthew Dillon	92da00bb24	This is David Schultz's swapoff code which I am finally able to commit. This should be considered highly experimental for the moment. Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> MFC after: 3 weeks	2002-12-15 19:17:57 +00:00
Matthew Dillon	389d2b6e21	Fix a refcount race with the vmspace structure. In order to prevent resource starvation we clean-up as much of the vmspace structure as we can when the last process using it exits. The rest of the structure is cleaned up when it is reaped. But since exit1() decrements the ref count it is possible for a double-free to occur if someone else, such as the process swapout code, references and then dereferences the structure. Additionally, the final cleanup of the structure should not occur until the last process referencing it is reaped. This commit solves the problem by introducing a secondary reference count, called 'vm_exitingcnt'. The normal reference count is decremented on exit and vm_exitingcnt is incremented. vm_exitingcnt is decremented when the process is reaped. When both vm_exitingcnt and vm_refcnt are 0, the structure is freed for real. MFC after: 3 weeks	2002-12-15 18:50:04 +00:00
Alan Cox	2840cabe6a	As per the comments, vm_object_page_remove() now expects its caller to lock the object (i.e., acquire Giant).	2002-12-15 07:30:51 +00:00
Alan Cox	5e83956af5	Perform vm_object_lock() and vm_object_unlock() around vm_object_page_remove().	2002-12-15 07:16:51 +00:00
Alan Cox	475e8011ab	Perform vm_object_lock() and vm_object_unlock() around vm_object_page_remove().	2002-12-15 05:41:56 +00:00
Alan Cox	495bedfbd0	Assert that the page queues lock is held in vm_page_unhold(), vm_page_remove(), and vm_page_free_toq().	2002-12-15 00:06:02 +00:00
Alan Cox	bc105a6797	Hold the page queues lock when calling pmap_protect(); it updates fields of the vm_page structure. Make the style of the pmap_protect() calls consistent. Approved by: re (blanket)	2002-12-01 18:57:56 +00:00
Alan Cox	38857e7f73	Hold the page queues lock when calling pmap_protect(); it updates fields of the vm_page structure. Nearby, remove an unnecessary semicolon and return statement. Approved by: re (blanket)	2002-12-01 05:40:18 +00:00
Alan Cox	78f7187d01	Increase the scope of the page queue lock in vm_pageout_scan(). Approved by: re (blanket)	2002-12-01 00:02:39 +00:00
Alan Cox	e80b7b691e	Lock page field accesses in mincore(). Approved by: re (blanket)	2002-11-28 08:01:39 +00:00
Alan Cox	85e0124324	Hold the page queues lock when performing pmap_clear_modify(). Approved by: re (blanket)	2002-11-27 19:51:48 +00:00
Alan Cox	3a199de3d9	Hold the page queues lock while performing pmap_page_protect(). Approved by: re (blanket)	2002-11-27 08:03:24 +00:00
Alan Cox	85e03a7e1e	Acquire and release the page queues lock around calls to pmap_protect() because it updates flags within the vm page. Approved by: re (blanket)	2002-11-25 22:00:31 +00:00
Alan Cox	13dc71ed40	Extend the scope of the page queues/fields locking in vm_freeze_copyopts() to cover pmap_remove_all(). Approved by: re	2002-11-24 06:13:38 +00:00
Alan Cox	178949e021	Hold the page queues/flags lock when calling vm_page_set_validclean(). Approved by: re	2002-11-23 19:10:31 +00:00
Alan Cox	ba0208b945	Assert that the page queues lock rather than Giant is held in vm_pageout_page_free(). Approved by: re	2002-11-23 08:08:54 +00:00
Alan Cox	e8a27959f6	Add page queue and flag locking in vnode_pager_setsize(). Approved by: re	2002-11-23 03:58:35 +00:00
Jeff Roberson	855a310fcb	- Add an event that is triggered when the system is low on memory. This is intended to be used by significant memory consumers so that they may drain some of their caches. Inspired by: phk Approved by: re Tested on: x86, alpha	2002-11-21 09:17:56 +00:00
Jeff Roberson	74c924b553	- Wakeup the correct address when a zone is no longer full. Spotted by: jake	2002-11-18 08:27:14 +00:00
Alan Cox	a12cc0e489	Remove vm_page_protect(). Instead, use pmap_page_protect() directly.	2002-11-18 04:05:22 +00:00
Jeff Roberson	f3da1873bc	- Don't forget the flags value when using boot pages. Reported by: grehan	2002-11-16 20:57:41 +00:00
Alan Cox	4fec79bef8	Now that pmap_remove_all() is exported by our pmap implementations use it directly.	2002-11-16 07:44:25 +00:00
Alan Cox	81b9ee99e7	Remove dead code that hasn't been needed since the demise of share maps in various revisions of vm/vm_map.c between 1.148 and 1.153.	2002-11-13 19:50:06 +00:00
Alan Cox	eea85e9bb6	Move pmap_collect() out of the machine-dependent code, rename it to reflect its new location, and add page queue and flag locking. Notes: (1) alpha, i386, and ia64 had identical implementations of pmap_collect() in terms of machine-independent interfaces; (2) sparc64 doesn't require it; (3) powerpc had it as a TODO.	2002-11-13 05:39:58 +00:00
Olivier Houchard	f64e99baa2	Remove extra #include<sys/vmmeter.h>.	2002-11-11 13:57:50 +00:00
Matt Jacob	81f71edaec	atomic_set_8 isn't MI. Instead, follow Jake's suggestions about ZONE_LOCK.	2002-11-11 11:50:03 +00:00
Alan Cox	6372d61e3e	- Clear the page's PG_WRITEABLE flag in the i386's pmap_changebit() if we're removing write access from the page's PTEs. - Export pmap_remove_all() on alpha, i386, and ia64. (It's already exported on sparc64.)	2002-11-11 05:17:34 +00:00
Matt Jacob	7ca05a39c7	Use atomic_set_8 on the us_freelist maps as they are not otherwise protected. Furthermore, in some RISC architectures with no normal byte operations, the surrounding 3 bytes are also affected by the read-modify-write that has to occur.	2002-11-10 16:16:44 +00:00
Alan Cox	d154fb4fe6	When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than indirectly through vm_page_protect(). The one remaining page flag that is updated by vm_page_protect() is already being updated by our various pmap implementations. Note: A later commit will similarly change the VM_PROT_READ case and eliminate vm_page_protect().	2002-11-10 07:12:04 +00:00
Alan Cox	f6116791a2	Fix an error case in vm_map_wire(): unwiring of an entry during cleanup after a user wire error fails when the entry is already system wired. Reported by: tegge	2002-11-09 21:26:49 +00:00
Alan Cox	1f7c5f98d7	In vm_page_remove(), avoid calling vm_page_splay() if the object's memq is empty.	2002-11-09 08:27:42 +00:00
Thomas Moestl	0fca57b8b8	Move the definitions of the hw.physmem, hw.usermem and hw.availpages sysctls to MI code; this reduces code duplication and makes all of them available on sparc64, and the latter two on powerpc. The semantics by the i386 and pc98 hw.availpages is slightly changed: previously, holes between ranges of available pages would be included, while they are excluded now. The new behaviour should be more correct and brings i386 in line with the other architectures. Move physmem to vm/vm_init.c, where this variable is used in MI code.	2002-11-07 23:57:17 +00:00
Maxime Henrion	bf1001fa0f	Better printf() formats.	2002-11-07 23:16:22 +00:00
Maxime Henrion	e47cd172e0	Some more printf() format fixes.	2002-11-07 23:03:04 +00:00
Maxime Henrion	cd034a5be9	Correctly print vm_offset_t types.	2002-11-07 22:49:07 +00:00
Alan Cox	ada2a050be	Export the function vm_page_splay().	2002-11-04 19:21:39 +00:00
Alan Cox	c71f01affe	- Remove the memory allocation for the object/offset hash table because it's no longer used. (See revision 1.215.) - Fix a harmless bug: the number of vm_page structures allocated wasn't properly adjusted when uma_bootstrap() was introduced. Consequently, we were allocating 30 unused vm_page structures. - Wrap a long line.	2002-11-03 22:20:42 +00:00
Alan Cox	02af9de6fc	Remove the vm page buckets mutex. As of revision 1.215 of vm/vm_page.c, it is unused.	2002-11-02 22:39:30 +00:00
Jeff Roberson	48eea37508	- Add support for machine dependent page allocation routines. MD code may define UMA_MD_SMALL_ALLOC to make use of this feature. Reviewed by: peter, jake	2002-11-01 01:01:27 +00:00
Jeff Roberson	026aa839a4	- Add a new flag to vm_page_alloc, VM_ALLOC_NOOBJ. This tells vm_page_alloc not to insert this page into an object. The pindex is still used for colorization. - Rework vm_page_select_* to accept a color instead of an object and pindex to work with VM_PAGE_NOOBJ. - Document other VM_ALLOC_ flags. Reviewed by: peter, jake	2002-11-01 00:59:03 +00:00
Robert Watson	03ce2c0c9b	Merge from MAC tree: rename mac_check_vnode_swapon() to mac_check_system_swapon(), to reflect the fact that the primary object of this change is the running kernel as a whole, rather than just the vnode. We'll drop additional checks of this class into the same check namespace, including reboot(), sysctl(), et al. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-27 06:54:06 +00:00
Jeff Roberson	bbee39c629	- Now that uma_zalloc_internal is not the fast path don't be so fussy about extra function calls. Refactor uma_zalloc_internal into separate functions for finding the most appropriate slab, filling buckets, allocating single items, and pulling items off of slabs. This makes the code significantly cleaner. - This also fixes the "Returning an empty bucket." panic that a few people have seen. Tested On: alpha, x86	2002-10-24 07:59:03 +00:00
Jeff Roberson	bba739abf9	- Move the destructor calls so that they are not called with the zone lock held. This avoids a lock order reversal when destroying zones. Unfortunately, this also means that the free checks are not done before the destructor is called. Reported by: phk	2002-10-24 06:17:30 +00:00
Robert Watson	3e732e7d7d	Invoke mac_check_vnode_mmap() during mmap operations on vnodes, permitting policies to restrict access to memory mapping based on the credential requesting the mapping, the target vnode, the requested rights, or other policy considerations. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-22 15:56:44 +00:00
Robert Watson	1cbfd977fd	Introduce MAC_CHECK_VNODE_SWAPON, which permits MAC policies to perform authorization checks during swapon() events; policies might choose to enforce protections based on the credential requesting the swap configuration, the target of the swap operation, or other factors such as internal policy state. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-22 15:53:43 +00:00
John Baldwin	1c865ac70e	- Check that a process isn't a new process (p_state == PRS_NEW) before trying to acquire its proc lock since the proc lock may not have been constructed yet. - Split up the one big comment at the top of the loop and put the pieces in the right order above the various checks. Reported by: kris (1)	2002-10-22 14:31:32 +00:00
Sheldon Hearn	29b4d52653	Fix typo in comments (misspelled "necessary").	2002-10-22 12:10:27 +00:00
Alan Cox	f3b676f0ad	o Reinline vm_page_undirty(), reducing the kernel size. (This reverts a part of vm_page.h revision 1.87 and vm_page.c revision 1.167.)	2002-10-20 19:57:55 +00:00
Alan Cox	f4ecdf056e	Complete the page queues locking needed for the page-based copy- on-write (COW) mechanism. (This mechanism is used by the zero-copy TCP/IP implementation.) - Extend the scope of the page queues lock in vm_fault() to cover vm_page_cowfault(). - Modify vm_page_cowfault() to release the page queues lock if it sleeps.	2002-10-19 18:34:39 +00:00
Matthew Dillon	b86ec922be	Replace the vm_page hash table with a per-vmobject splay tree. There should be no major change in performance from this change at this time but this will allow other work to progress: Giant lock removal around VM system in favor of per-object mutexes, ranged fsyncs, more optimal COMMIT rpc's for NFS, partial filesystem syncs by the syncer, more optimal object flushing, etc. Note that the buffer cache is already using a similar splay tree mechanism. Note that a good chunk of the old hash table code is still in the tree. Alan or I will remove it prior to the release if the new code does not introduce unsolvable bugs, else we can revert more easily. Submitted by: alc (this is Alan's code) Approved by: re	2002-10-18 17:24:30 +00:00
Poul-Henning Kamp	af045176d1	Properly put macro args in (). Spotted by: FlexeLint.	2002-10-16 10:52:15 +00:00
Julian Elischer	d524d69b16	Remove old useless debugging code	2002-10-14 20:31:54 +00:00
Jeff Roberson	b43179fbe8	- Create a new scheduler api that is defined in sys/sched.h - Begin moving scheduler specific functionality into sched_4bsd.c - Replace direct manipulation of scheduler data with hooks provided by the new api. - Remove KSE specific state modifications and single runq assumptions from kern_switch.c Reviewed by: -arch	2002-10-12 05:32:24 +00:00
John Baldwin	551cf4e150	Rename the mutex thread and process states to use a more generic 'LOCK' name instead. (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of TD_ON_MUTEX()) Eventually a turnstile abstraction will be added that will be shared with mutexes and other types of locks. SLOCK/TDI_LOCK will be used internally by the turnstile code and will not be specific to mutexes. Making the change now ensures that turnstiles can be dropped in at a later date without affecting the ABI of userland applications.	2002-10-02 20:31:47 +00:00
Scott Long	316ec49abd	Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates. Reviewed by: jake, peter, jhb	2002-10-02 07:44:29 +00:00
Poul-Henning Kamp	37c841831f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
Jeff Roberson	3ef3e7c42b	- Get rid of the unused LK_NOOBJ.	2002-09-25 01:24:58 +00:00
Jeff Roberson	6a2eac8acc	- Lock access to numoutput on the swap devices.	2002-09-25 01:24:17 +00:00
Jeff Roberson	63e7e60dba	- Add a ASSERT_VOP_LOCKED in vnode_pager_alloc. - Lock access to v_iflags.	2002-09-25 01:23:43 +00:00
Matthew N. Dodd	4a2eca23ca	Modify vm_map_clean() (and thus the msync(2) system call) to support invalidation of cached pages for objects of type OBJT_DEVICE. Submitted by: Christian Zander <zander@minion.de> Approved by: alc	2002-09-22 08:22:32 +00:00
Alan Cox	e94ce82689	o Update some comments.	2002-09-22 04:33:43 +00:00
Jake Burkholder	05ba50f522	Use the fields in the sysentvec and in the vm map header in place of the constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS. This is mainly so that they can be variable even for the native abi, based on different machine types. Get stack protections from the sysentvec too. This makes it trivial to map the stack non-executable for certain abis, on machines that support it.	2002-09-21 22:07:17 +00:00
Alan Cox	8aadcc5368	Reduce namespace pollution. Submitted by: bde	2002-09-21 07:51:44 +00:00
Jeff Roberson	f461cf2297	- Use my freebsd email alias in the copyright. - Remove redundant instances of my email alias in the file summary.	2002-09-19 06:05:32 +00:00
Jeff Roberson	99571dc345	- Split UMA_ZFLAG_OFFPAGE into UMA_ZFLAG_OFFPAGE and UMA_ZFLAG_HASH. - Remove all instances of the mallochash. - Stash the slab pointer in the vm page's object pointer when allocating from the kmem_obj. - Use the overloaded object pointer to find slabs for malloced memory.	2002-09-18 08:26:30 +00:00
Nate Lawson	06be2aaa83	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)	2002-09-14 09:02:28 +00:00
Julian Elischer	71fad9fdee	Completely redo thread states. Reviewed by: davidxu@freebsd.org	2002-09-11 08:13:56 +00:00
Seigo Tanimura	b1f99ebe2b	- Do not swap out a process if it is in creation. The process may have no address space yet. - Check whether a process is a system process prior to dereferencing its p_vmspace. Aio assumes that only the curthread switches the address space of a system process.	2002-09-09 09:05:06 +00:00
Julian Elischer	1faf202ea9	Use UMA as a complex object allocator. The process allocator now caches and hands out complete process structures including substructures, i.e. it gets the process structure with the first thread (and soon KSE) already allocated and attached, all in one hit. For the average non threaded program (non KSE that is) the allocated thread and its stack remain attached to the process, even when the process is unused and in the process cache. This saves having to allocate and attach it later, effectively bringing us (hopefully) close to the efficiency of pre-KSE systems where these were a single structure. Reviewed by: davidxu@freebsd.org, peter@freebsd.org	2002-09-06 07:00:37 +00:00
Bruce Evans	6af7f1e511	Use `struct uma_zone *' instead of uma_zone_t, so that <sys/uma.h> isn't a prerequisite.	2002-09-05 14:04:34 +00:00
David Xu	1279572a92	s/SGNL/SIG/ s/SNGL/SINGLE/ s/SNGLE/SINGLE/ Fix abbreviation for P_STOPPED_* etc flags, in original code they were inconsistent and difficult to distinguish between them. Approved by: julian (mentor)	2002-09-05 07:30:18 +00:00
Alan Cox	8a59b15cd4	o Synchronize updates to struct vm_page::cow with the page queues lock.	2002-09-02 04:04:12 +00:00
Matthew Dillon	ec61f55d42	Reduce the maximum KVA reserved for swap meta structures from 70 to 32 MB. Reduce the swap meta calculation by a factor of 2, it's still massive overkill. X-MFC after: immediately	2002-08-31 21:15:29 +00:00
Peter Wemm	447b3772dc	Change hw.physmem and hw.usermem to unsigned long like they used to be in the original hardwired sysctl implementation. The buf size calculator still overflows an integer on machines with large KVA (eg: ia64) where the number of pages does not fit into an int. Use 'long' there. Change Maxmem and physmem and related variables to 'long', mostly for completeness. Machines are not likely to overflow 'int' pages in the near term, but then again, 640K ought to be enough for anybody. This comes for free on 32 bit machines, so why not?	2002-08-30 04:04:37 +00:00
Alan Cox	6508a194aa	o Retire pmap_pageable(). It's an advisory routine that none of our platforms implements.	2002-08-25 04:20:05 +00:00
Alan Cox	fff6062ab6	o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since pmap_zero_page() and pmap_zero_page_area() were modified to accept a struct vm_page * instead of a physical address, vm_page_zero_fill() and vm_page_zero_fill_area() have served no purpose.	2002-08-25 00:22:31 +00:00
Alan Cox	15c176c119	o Use vm_object_lock() in place of directly locking Giant. Reviewed by: md5	2002-08-24 18:44:52 +00:00
Alan Cox	4eaa117956	o Use vm_object_lock() in place of Giant when manipulating a vm object in vm_map_insert().	2002-08-24 17:52:08 +00:00
Alan Cox	d52bc3438c	o Resurrect vm_object_lock() and vm_object_unlock() from revision 1.19. (For now, they simply acquire and release Giant.)	2002-08-24 07:15:14 +00:00
Archie Cobbs	55f7c614fd	Don't use "NULL" when "0" is really meant.	2002-08-21 23:39:52 +00:00
Alan Cox	60582cbe6d	o Assert that the page queues lock is held in vm_page_activate().	2002-08-11 00:21:40 +00:00
Alan Cox	99cb3c4c0f	o Lock page queue accesses by vm_page_activate().	2002-08-11 00:14:10 +00:00
Alan Cox	67ef391e00	o Lock page queue accesses by vm_page_activate().	2002-08-10 23:53:59 +00:00
Alan Cox	a9911f9a0f	o Move a call to vm_page_wakeup() inside the scope of the page queues lock.	2002-08-10 23:27:06 +00:00
Alan Cox	38f612e053	o Remove the setting and clearing of the PG_MAPPED flag from the alpha and ia64 pmap. o Remove the PG_MAPPED flag's declaration.	2002-08-10 18:01:39 +00:00
Alan Cox	db44450b11	o Remove the setting and clearing of the PG_MAPPED flag. (This flag is obsolete.)	2002-08-10 07:11:16 +00:00
Alan Cox	06ec58b740	o Use pmap_page_is_mapped() in vm_page_protect() rather than the PG_MAPPED flag. (This is the only place in the entire kernel where the PG_MAPPED flag is tested. It will be removed soon.)	2002-08-08 19:12:36 +00:00
Alan Cox	24c28f1ad6	o Acquire the page queues lock before checking the page's busy status in vm_page_grab(). Also, replace the nearby tsleep() with an msleep() on the page queues lock.	2002-08-04 19:05:20 +00:00
Jeff Roberson	e6e370a7fe	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
Alan Cox	7f0bf36a2e	o Extend the scope of the page queues lock in contigmalloc1(). o Replace vm_page_sleep_busy() with vm_page_sleep_if_busy() in vm_contig_launder().	2002-08-04 07:07:34 +00:00
Alan Cox	aa9b1d9412	o Remove the setting of PG_MAPPED from vm_page_wire() and vm_page_alloc(VM_ALLOC_WIRED).	2002-08-03 01:29:52 +00:00
Alan Cox	00f9e8b421	o Convert two instances of vm_page_sleep_busy() into vm_page_sleep_if_busy() with appropriate page queue locking.	2002-08-02 18:55:29 +00:00
Alan Cox	1e7ce68ff4	o Lock page queue accesses in nwfs and smbfs. o Assert that the page queues lock is held in vm_page_deactivate().	2002-08-02 05:23:58 +00:00
Alan Cox	91bb74a88c	o Lock page queue accesses by vm_page_deactivate().	2002-08-02 03:56:31 +00:00
Alan Cox	46086ddf91	o Acquire the page queues lock before calling vm_page_io_finish(). o Assert that the page queues lock is held in vm_page_io_finish().	2002-08-01 17:57:42 +00:00
Alan Cox	239b5b9707	o Setting PG_MAPPED and PG_WRITEABLE on pages that are mapped and unmapped by pmap_qenter() and pmap_qremove() is pointless. In fact, it probably leads to unnecessary pmap_page_protect() calls if one of these pages is paged out after unwiring. Note: setting PG_MAPPED asserts that the page's pv list may be non-empty. Since checking the status of the page's pv list isn't any harder than checking this flag, the flag should probably be eliminated. Alternatively, PG_MAPPED could be set by pmap_enter() exclusively rather than various places throughout the kernel.	2002-07-31 18:46:47 +00:00
Alan Cox	67c1fae92e	o Lock page accesses by vm_page_io_start() with the page queues lock. o Assert that the page queues lock is held in vm_page_io_start().	2002-07-31 07:27:08 +00:00
Alan Cox	32585dd617	o In vm_object_madvise() and vm_object_page_remove() replace vm_page_sleep_busy() with vm_page_sleep_if_busy(). At the same time, increase the scope of the page queues lock. (This should significantly reduce the locking overhead in vm_object_page_remove().) o Apply some style fixes.	2002-07-30 07:23:04 +00:00
Seigo Tanimura	9eb881f804	- Optimize wakeup() and its friends; if a thread being woken up is being swapped in, we do not have to ask for the scheduler thread to do that. - Assert that a process is not swapped out in runq functions and swapout(). - Introduce thread_safetoswapout() for readability. - In swapout_procs(), perform a test that may block (check of a thread working on its vm map) first. This lets us call swapout() with the sched_lock held, providing better atomicity.	2002-07-30 06:54:05 +00:00
Alan Cox	e5f8bd9418	o Introduce vm_page_sleep_if_busy() as an eventual replacement for vm_page_sleep_busy(). vm_page_sleep_if_busy() uses the page queues lock.	2002-07-29 19:41:22 +00:00
Julian Elischer	b7f2cf173e	Remove an XXXKSE comment. The code is no longer a problem.	2002-07-29 18:47:19 +00:00
Julian Elischer	1d7b9ed2e6	Create a new thread state to describe threads that would be ready to run except for the fact that they are presently swapped out. Also add a process flag to indicate that the process has started the struggle to swap back in. This will be needed for the case where multiple threads start the swapin action top a collision. Also add code to stop a process from being swapped out if one of the threads in this process is actually off running on another CPU.. that might hurt... Submitted by: Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>	2002-07-29 18:33:32 +00:00
Alan Cox	14f8ceaa07	o Pass VM_ALLOC_WIRED to vm_page_grab() rather than calling vm_page_wire() in pmap_new_thread(), pmap_pinit(), and vm_proc_new(). o Lock page queue accesses by vm_page_free() in pmap_object_init_pt().	2002-07-29 05:42:44 +00:00
Alan Cox	2c071f61f0	o Modify vm_page_grab() to accept VM_ALLOC_WIRED.	2002-07-28 23:46:19 +00:00
Alan Cox	e43c2eab07	o Lock page queue accesses by vm_page_free(). o Apply some style fixes.	2002-07-28 20:13:48 +00:00
Alan Cox	6a684ecf05	o Lock page queue accesses by vm_page_free().	2002-07-28 19:01:38 +00:00
Alan Cox	299018d3b6	o Lock page queue accesses by vm_page_free(). o Increment cnt.v_dfree inside vm_pageout_page_free() rather than at each call.	2002-07-28 05:46:47 +00:00
Alan Cox	57123de641	o Lock page queue accesses by vm_page_free().	2002-07-28 04:23:03 +00:00
Alan Cox	55df3298c6	o Require that the page queues lock is held on entry to vm_pageout_clean() and vm_pageout_flush(). o Acquire the page queues lock before calling vm_pageout_clean() or vm_pageout_flush().	2002-07-27 23:20:32 +00:00
Alan Cox	4abd55b296	o Lock page queue accesses by vm_page_activate().	2002-07-27 07:20:27 +00:00
Alan Cox	ce18aebde4	o Lock page queue accesses by vm_page_activate() and vm_page_deactivate() in vm_pageout_object_deactivate_pages(). o Apply some style fixes to vm_pageout_object_deactivate_pages().	2002-07-27 06:41:03 +00:00
Alan Cox	9d52288860	o Lock page queue accesses by vm_page_activate() and vm_page_deactivate().	2002-07-27 04:30:46 +00:00
Alan Cox	f4f5cb1ffb	o Remove a vm_page_deactivate() that is immediately followed by a vm_page_rename() from vm_object_backing_scan(). vm_page_rename() also performs vm_page_deactivate() on pages in the cache queues, making the removed vm_page_deactivate() redundant.	2002-07-25 19:09:07 +00:00
Alan Cox	ef594d3186	o Merge vm_fault_wire() and vm_fault_user_wire() by adding a new parameter, user_wire.	2002-07-24 19:47:56 +00:00
Alan Cox	2999e9faca	o Lock page queue accesses by vm_page_dontneed(). o Assert that the page queue lock is held in vm_page_dontneed().	2002-07-23 04:39:48 +00:00
Alan Cox	8ffc151979	o Extend the scope of the page queues lock in vm_pageout_scan() to cover the traversal of the cache queue.	2002-07-23 02:42:25 +00:00
Alfred Perlstein	8209f090f1	Change struct vmspace->vm_shm from void * to struct shmmap_state *, this removes the need for casts in several cases.	2002-07-22 16:22:27 +00:00
Alfred Perlstein	2cc593fd8e	Remove caddr_t.	2002-07-22 16:12:55 +00:00
Alan Cox	2ad9827349	o Lock page queue accesses by vm_page_free() and vm_page_deactivate().	2002-07-21 21:20:57 +00:00
Alan Cox	ab9abe5d7e	o Lock page queue accesses by vm_page_free().	2002-07-21 20:38:45 +00:00
Seigo Tanimura	1b64ed3b5b	Do not pass a thread with the state TDS_RUNQ to setrunqueue(), otherwise assertion in setrunqueue() fails.	2002-07-21 10:55:57 +00:00
Alan Cox	40eab1e944	o Lock page queue accesses by vm_page_try_to_cache(). (The accesses in kern/vfs_bio.c are already locked.) o Assert that the page queues lock is held in vm_page_try_to_cache().	2002-07-20 20:58:46 +00:00
Alan Cox	d82efd2956	o Assert that the page queues lock is held in vm_page_try_to_free().	2002-07-20 20:12:57 +00:00
Alan Cox	15a5d2108e	o Lock page queue accesses by vm_page_cache() in vm_fault() and vm_pageout_scan(). (The others are already locked.) o Assert that the page queues lock is held in vm_page_cache().	2002-07-20 19:34:21 +00:00
Alan Cox	48c0444c98	o Lock accesses to the active page queue in vm_pageout_scan() and vm_pageout_page_stats().	2002-07-20 18:45:25 +00:00
Alan Cox	bda441aa04	o Lock page queue accesses by vm_page_cache() in vm_contig_launder(). o Micro-optimize the control flow in vm_contig_launder().	2002-07-20 06:11:16 +00:00
Alan Cox	6fd77192b2	o Remove dead and/or unused code.	2002-07-20 05:06:20 +00:00
Peter Wemm	3ebc124838	Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable handler in the kernel at the same time. Also, allow for the exec_new_vmspace() code to build a different sized vmspace depending on the executable environment. This is a big help for execing i386 binaries on ia64. The ELF exec code grows the ability to map partial pages when there is a page size difference, eg: emulating 4K pages on 8K or 16K hardware pages. Flesh out the i386 emulation support for ia64. At this point, the only binary that I know of that fails is cvsup, because the cvsup runtime tries to execute code in pages not marked executable. Obtained from: dfr (mostly, many tweaks from me).	2002-07-20 02:56:12 +00:00
Peter Wemm	16e12eab5a	Set P_NOLOAD on the pagezero kthread so that it doesn't artificially skew the loadav. This is not real load. If you have a nice process running in the background, pagezero may sit in the run queue for ages and add one to the loadav, and thereby affecting other scheduling decisions.	2002-07-19 21:06:01 +00:00
Alan Cox	eeeaf0fdd1	o Duplicate an odd side-effect of vm_page_wire() in vm_page_allocate() when VM_ALLOC_WIRED is specified: set the PG_MAPPED bit in flags. o In both vm_page_wire() and vm_page_allocate() add a comment saying that setting PG_MAPPED does not belong there.	2002-07-19 03:33:04 +00:00
Alan Cox	f23050633f	o Remove the acquisition and release of Giant from the idle priority thread that pre-zeroes free pages. o Remove GIANT_REQUIRED from some low-level page queue functions. (Instead assertions on the page queue lock are being added to the higher-level functions, like vm_page_wire(), etc.) In collaboration with: peter	2002-07-18 17:40:07 +00:00
Mark Murray	f6e34b823a	Void functions cannot return values.	2002-07-18 15:53:11 +00:00
Peter Wemm	9e7c1bce60	(VM_MAX_KERNEL_ADDRESS - KERNBASE) / PAGE_SIZE may not fit in an integer. Use lmin(long, long), not min(u_int, u_int). This is a problem here on ia64 which has way more than 2^32 pages of KVA. 281474976710655 pages to be precise.	2002-07-18 10:28:00 +00:00
Alan Cox	827b2fa091	o Introduce an argument, VM_ALLOC_WIRED, that requests vm_page_alloc() to return a wired page. o Use VM_ALLOC_WIRED within Alpha's pmap_growkernel(). Also, because Alpha's pmap_growkernel() calls vm_page_alloc() from within a critical section, specify VM_ALLOC_INTERRUPT instead of VM_ALLOC_SYSTEM. (Only VM_ALLOC_INTERRUPT is implemented entirely with a spin mutex.) o Assert that the page queues mutex is held in vm_page_wire() on Alpha, just like the other platforms.	2002-07-18 04:08:10 +00:00
Alan Cox	072e9cbb50	o Use vm_pageq_remove_nowakeup() and vm_pageq_enqueue() in vm_page_zero_idle() instead of partially duplicated implementations. In particular, this change guarantees that the number of free pages in the free queue(s) matches the global free page count when Giant is released. Submitted by: peter (via his p4 "pmap" branch)	2002-07-16 19:39:40 +00:00
Alan Cox	5c8cdc0e2a	o Create vm_contig_launder() to replace code that appears twice in contigmalloc1().	2002-07-15 06:33:31 +00:00
Alan Cox	8b8b8202f9	o Lock page queue accesses by vm_page_wire() that aren't within a critical section. o Assert that the page queues lock is held in vm_page_wire() unless on an Alpha.	2002-07-14 23:51:55 +00:00
Alan Cox	e16cfdbea4	o Lock page queue accesses by vm_page_wire().	2002-07-14 19:36:15 +00:00
Alan Cox	eed6f3fd45	o Lock page queue accesses by vm_page_unmanage(). o Assert that the page queues lock is held in vm_page_unmanage().	2002-07-13 23:55:30 +00:00
Alan Cox	1f54526952	o Complete the locking of page queue accesses by vm_page_unwire(). o Assert that the page queues lock is held in vm_page_unwire(). o Make vm_page_lock_queues() and vm_page_unlock_queues() visible to kernel loadable modules.	2002-07-13 20:55:21 +00:00
Alan Cox	2d09a6ad97	o Lock some page queue accesses, in particular, those by vm_page_unwire().	2002-07-13 19:24:04 +00:00
Alan Cox	93bc4879e6	o Assert GIANT_REQUIRED on system maps in _vm_map_lock(), _vm_map_lock_read(), and _vm_map_trylock(). Submitted by: tegge o Remove GIANT_REQUIRED from kmem_alloc_wait() and kmem_free_wakeup(). (This clears the way for exec_map accesses to move outside of Giant. The exec_map is not a system map.) o Remove some premature MPSAFE comments. Reviewed by: tegge	2002-07-12 23:20:06 +00:00
Matthew Dillon	fbcf77c2ea	Re-enable the idle page-zeroing code. Remove all IPIs from the idle page-zeroing code as well as from the general page-zeroing code and use a lazy tlb page invalidation scheme based on a callback made at the end of mi_switch. A number of people came up with this idea at the same time so credit belongs to Peter, John, and Jake as well. Two-way SMP buildworld -j 5 tests (second run, after stabilization) 2282.76 real 2515.17 user 704.22 sys before peter's IPI commit 2266.69 real 2467.50 user 633.77 sys after peter's commit 2232.80 real 2468.99 user 615.89 sys after this commit Reviewed by: peter, jhb Approved by: peter	2002-07-12 20:17:06 +00:00
Peter Wemm	a7e9138e37	Avoid a vm_page_lookup() - that uses a spinlock protected hash. We can just use the object's memq for our nefarious purposes.	2002-07-12 04:38:51 +00:00
Alan Cox	7538e5500d	o Lock some (unfortunately, not yet all) accesses to the page queues.	2002-07-12 03:17:22 +00:00
Alan Cox	60e15726af	o Lock accesses to the page queues.	2002-07-12 02:55:55 +00:00
Alan Cox	9688f93163	o Add a "needs wakeup" flag to the vm_map for use by kmem_alloc_wait() and kmem_free_wakeup(). Previously, kmem_free_wakeup() always called wakeup(). In general, no one was sleeping. o Export vm_map_unlock_and_wait() and vm_map_wakeup() from vm_map.c for use in vm_kern.c.	2002-07-11 02:39:24 +00:00
Alan Cox	56030358cb	o Lock accesses to the page queues in vm_object_terminate(). o Eliminate some unnecessary 64-bit arithmetic in vm_object_split().	2002-07-09 18:02:03 +00:00
Peter Wemm	5e13bcd6c4	vm_page_queue_free_mtx is a spin mutex, not a normal sleep mutex. I do not know why this didn't panic my box, but I have most certainly been using it: peter@overcee[3:14pm]~src/sys/i386/i386-110> sysctl -a | grep zero vm.stats.misc.zero_page_count: 2235 vm.stats.misc.cnt_prezero: 638951 vm.idlezero_enable: 1 vm.idlezero_maxrun: 16 Submitted by: Tor.Egge@cvsup.no.freebsd.org Approved by: Tor's patches are never wrong. :-)	2002-07-08 23:12:37 +00:00
Peter Wemm	b428c5fd23	Turn the zeroidle process off for SMP systems, there is still a possible TLB problem when bouncing from one cpu to another (the original cpu will not have purged its TLB if it simply went idle). Pointed out by: Tor.Egge@cvsup.no.freebsd.org Approved by: Tor is never wrong. :-)	2002-07-08 23:09:11 +00:00
Peter Wemm	a58b3a6878	Add a special page zero entry point intended to be called via the single threaded VM pagezero kthread outside of Giant. For some platforms, this is really easy since it can just use the direct mapped region. For others, IPI sending is involved or there are other issues, so grab Giant when needed. We still have preemption issues to deal with, but Alan Cox has an interesting suggestion on how to minimize the problem on x86. Use Luigi's hack for preserving the (lack of) priority. Turn the idle zeroing back on since it can now actually do something useful outside of Giant in many cases.	2002-07-08 04:24:26 +00:00
Peter Wemm	f59685a4b7	Avoid vm_page_lookup() [grabs a spinlock] and just process the upage object memq instead. Suggested by: alc	2002-07-08 01:11:10 +00:00
Peter Wemm	a136efe9b6	Collect all the (now equivalent) pmap_new_proc/pmap_dispose_proc/ pmap_swapin_proc/pmap_swapout_proc functions from the MD pmap code and use a single equivalent MI version. There are other cleanups needed still. While here, use the UMA zone hooks to keep a cache of preinitialized proc structures handy, just like the thread system does. This eliminates one dependency on 'struct proc' being persistent even after being freed. There are some comments about things that can be factored out into ctor/dtor functions if it is worth it. For now they are mostly just doing statistics to get a feel of how it is working.	2002-07-07 23:05:27 +00:00
Alan Cox	25524d3eba	o Lock accesses to the free queue(s) in vm_page_zero_idle().	2002-07-07 19:27:57 +00:00
Alan Cox	c7118ed61b	o Traverse the object's memq rather than repeatedly calling vm_page_lookup() in vm_object_split().	2002-07-07 06:01:25 +00:00
Jeff Roberson	f6b5b182e8	- Hold a lock on the vnode acquired from the file table across the call to vm_mmap() as well as the GETATTR etc. - If the handle is a vnode in vm_mmap() assert that it is locked. - Wiggle Giant around a little to account for the extra vnode operation.	2002-07-06 22:14:38 +00:00
Andrew Gallatin	f784043a9f	Remove bogus vm_page_wakeup() in vm_page_cowfault() that will cause panics in the zero-copy send path if a process attempts to write to a page which is still in flight. reviewed by: ken	2002-07-05 23:33:27 +00:00
Jeff Roberson	17b9cc4941	Fix a lock order reversal in uma_zdestroy. The uma_mtx needs to be held across calls to zone_drain(). Noticed by: scottl	2002-07-05 21:39:52 +00:00
Alan Cox	21f1b5331f	o Lock accesses to the free page queues in contigmalloc1().	2002-07-05 06:43:32 +00:00
Jeff Roberson	f5118d6aaf	Remove unnecessary includes.	2002-07-05 05:16:19 +00:00
Alan Cox	70c1763634	o Resurrect vm_page_lock_queues(), vm_page_unlock_queues(), and the free queue lock (revision 1.33 of vm/vm_page.c removed them). o Make the free queue lock a spin lock because it's sometimes acquired inside of a critical section.	2002-07-04 22:07:37 +00:00
Julian Elischer	8108a14544	A small cleanup.	2002-07-04 12:37:13 +00:00
Julian Elischer	a30ec8f8b8	Don't call the thread setup routines from here. They are already called when uma calls thread_init()	2002-07-04 12:31:54 +00:00
Alan Cox	22a97b04de	o Make the reservation of KVA space for kernel map entries a function of the KVA space's size in addition to the amount of physical memory and reduce it by a factor of two. Under the old formula, our reservation amounted to one kernel map entry per virtual page in the KVA space on a 4GB i386.	2002-07-03 19:16:37 +00:00
Jeff Roberson	e221e841b0	Actually use the fini callback. Pointy hat to: me :-( Noticed By: Julian	2002-07-03 00:30:51 +00:00
Robert Drehmel	47e151dd7a	- Use (OFF_TO_IDX(off) - pi) instead of (OFF_TO_IDX(off - IDX_TO_OFF(pi))). - Reformat a comment.	2002-07-01 14:14:07 +00:00
Alan Cox	c2eda4b565	o Remove some long dead code: from revision 1.41 of vm/vm_pager.c 3+ years ago. o Remove some unused prototypes.	2002-07-01 02:38:05 +00:00
Ian Dowse	300b96aca2	Change the type of `tscan' in vm_object_page_clean() to vm_pindex_t, as it stores an absolute page index that may not fit in a vm_offset_t.	2002-06-29 20:04:38 +00:00
Julian Elischer	e602ba25fd	Part 1 of KSE-III The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..	2002-06-29 17:26:22 +00:00
Ian Dowse	23f09d50bb	Avoid using the 64-bit vm_pindex_t in a few places where 64-bit types are not required, as the overhead is unnecessary: o In the i386 pmap_protect(), `sindex' and `eindex' represent page indices within the 32-bit virtual address space. o In swp_pager_meta_build() and swp_pager_meta_ctl(), use a temporary variable to store the low few bits of a vm_pindex_t that gets used as an array index. o vm_uiomove() uses `osize' and `idx' for page offsets within a map entry. o In vm_object_split(), `idx' is a page offset within a map entry.	2002-06-26 20:32:51 +00:00
Ian Dowse	5125fe4f45	Use an explicit cast to avoid relying on sign extension to do the right thing in code such as `vm_pindex_t x = ~SWAP_META_MASK'. Reviewed by: dillon	2002-06-26 19:18:14 +00:00
Kenneth D. Merry	98cb733c67	At long last, commit the zero copy sockets code. MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes. ti.4: Update the ti(4) man page to include information on the TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options, and also include information about the new character device interface and the associated ioctls. man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated links. jumbo.9: New man page describing the jumbo buffer allocator interface and operation. zero_copy.9: New man page describing the general characteristics of the zero copy send and receive code, and what an application author should do to take advantage of the zero copy functionality. NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS, TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT. conf/files: Add uipc_jumbo.c and uipc_cow.c. conf/options: Add the 5 options mentioned above. kern_subr.c: Receive side zero copy implementation. This takes "disposable" pages attached to an mbuf, gives them to a user process, and then recycles the user's page. This is only active when ZERO_COPY_SOCKETS is turned on and the kern.ipc.zero_copy.receive sysctl variable is set to 1. uipc_cow.c: Send side zero copy functions. Takes a page written by the user and maps it copy on write and assigns it kernel virtual address space. Removes copy on write mapping once the buffer has been freed by the network stack. uipc_jumbo.c: Jumbo disposable page allocator code. This allocates (optionally) disposable pages for network drivers that want to give the user the option of doing zero copy receive. uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are enabled if ZERO_COPY_SOCKETS is turned on. Add zero copy send support to sosend() -- pages get mapped into the kernel instead of getting copied if they meet size and alignment restrictions. uipc_syscalls.c:Un-staticize some of the sf* functions so that they can be used elsewhere. 
(uipc_cow.c) if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid calling malloc() with M_WAITOK. Return an error if the M_NOWAIT malloc fails. The ti(4) driver and the wi(4) driver, at least, call this with a mutex held. This causes witness warnings for 'ifconfig -a' with a wi(4) or ti(4) board in the system. (I've only verified for ti(4)). ip_output.c: Fragment large datagrams so that each segment contains a multiple of PAGE_SIZE amount of data plus headers. This allows the receiver to potentially do page flipping on receives. if_ti.c: Add zero copy receive support to the ti(4) driver. If TI_PRIVATE_JUMBOS is not defined, it now uses the jumbo(9) buffer allocator for jumbo receive buffers. Add a new character device interface for the ti(4) driver for the new debugging interface. This allows (a patched version of) gdb to talk to the Tigon board and debug the firmware. There are also a few additional debugging ioctls available through this interface. Add header splitting support to the ti(4) driver. Tweak some of the default interrupt coalescing parameters to more useful defaults. Add hooks for supporting transmit flow control, but leave it turned off with a comment describing why it is turned off. if_tireg.h: Change the firmware rev to 12.4.11, since we're really at 12.4.11 plus fixes from 12.4.13. Add defines needed for debugging. Remove the ti_stats structure, it is now defined in sys/tiio.h. ti_fw.h: 12.4.11 firmware. ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13, and my header splitting patches. Revision 12.4.13 doesn't handle 10/100 negotiation properly. (This firmware is the same as what was in the tree previously, with the addition of header splitting support.) sys/jumbo.h: Jumbo buffer allocator interface. sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to indicate that the payload buffer can be thrown away / flipped to a userland process. socketvar.h: Add prototype for socow_setup. 
tiio.h: ioctl interface to the character portion of the ti(4) driver, plus associated structure/type definitions. uio.h: Change prototype for uiomoveco() so that we'll know whether the source page is disposable. ufs_readwrite.c: Update for new prototype of uiomoveco(). vm_fault.c: In vm_fault(), check to see whether we need to do a page based copy on write fault. vm_object.c: Add a new function, vm_object_allocate_wait(). This does the same thing that vm_object_allocate() does, except that it gives the caller the opportunity to specify whether it should wait on the uma_zalloc() of the object structure. This allows vm objects to be allocated while holding a mutex. (Without generating WITNESS warnings.) vm_object_allocate() is implemented as a call to vm_object_allocate_wait() with the malloc flag set to M_WAITOK. vm_object.h: Add prototype for vm_object_allocate_wait(). vm_page.c: Add page-based copy on write setup, clear and fault routines. vm_page.h: Add page based COW function prototypes and variable in the vm_page structure. Many thanks to Drew Gallatin, who wrote the zero copy send and receive code, and to all the other folks who have tested and reviewed this code over the years.	2002-06-26 03:37:47 +00:00
Matthew Dillon	a69ac1740f	Enforce RLIMIT_VMEM on growable mappings (aka the primary stack or any MAP_STACK mapping). Suggested by: alc	2002-06-26 03:13:46 +00:00
Matthew Dillon	070f64fe6f	Part I of RLIMIT_VMEM implementation. Implement core functionality for a new resource limit that covers a process's entire VM space, including mmap()'d space. (Part II will be additional code to check RLIMIT_VMEM during exec() but it needs more fleshing out). PR: kern/18209 Submitted by: Andrey Alekseyev <uitm@zenon.net>, Dmitry Kim <jason@nichego.net> MFC after: 7 days	2002-06-26 00:29:28 +00:00
Ian Dowse	6395da5437	Complete the initial set of VM changes required to support full 64-bit file sizes. This step simply addresses the remaining overflows, and does not attempt to optimise performance. The details are: o Use a 64-bit type for the vm_object `size' and the size argument to vm_object_allocate(). o Use the correct type for index variables in dev_pager_getpages(), vm_object_page_clean() and vm_object_page_remove(). o Avoid an overflow in the i386 pmap_object_init_pt().	2002-06-25 22:14:06 +00:00
Jeff Roberson	e78f35b33f	Turn VM_ALLOC_ZERO into a flag. Submitted by: tegge Reviewed by: dillon	2002-06-25 22:01:12 +00:00
Jeff Roberson	5c0e403ba2	Reduce the amount of code that runs with the zone lock held in slab_zalloc(). This allows us to run the zone initialization functions without any locks held.	2002-06-25 21:04:50 +00:00
Alan Cox	366838ddfe	o Eliminate vmspace::vm_minsaddr. It's initialized but never used. o Replace stale comments in vmspace by "const until freed" annotations on some fields.	2002-06-25 18:14:38 +00:00
Alan Cox	848d14193d	o Remove GIANT_REQUIRED from kmem_alloc_pageable(), kmem_alloc_nofault(), and kmem_free(). (Annotate as MPSAFE.) o Remove incorrect casts from kmem_alloc_pageable() and kmem_alloc_nofault().	2002-06-23 18:07:40 +00:00
Alan Cox	2cd301d1e1	o Remove the unnecessary acquisition and release of Giant around fdrop() in mmap(2).	2002-06-23 01:48:22 +00:00
Alan Cox	c04c996b25	o Reduce the scope of Giant in vm_mmap() to just the code that manipulates a vnode. (Thus, MAP_ANON and MAP_STACK never acquire Giant.)	2002-06-22 19:13:56 +00:00
Alan Cox	c8664f82a5	o Replace mtx_assert(&Giant, MA_OWNED) in dev_pager_alloc() with the acquisition and release of Giant. (Annotate as MPSAFE.) o Reorder the sanity checks in dev_pager_alloc() to reduce the time that Giant is held.	2002-06-22 18:36:51 +00:00
Alan Cox	409748276e	o In vm_map_insert(), replace GIANT_REQUIRED by the acquisition and release of Giant around the direct manipulation of the vm_object and the optional call to pmap_object_init_pt(). o In vm_map_findspace(), remove GIANT_REQUIRED. Instead, acquire and release Giant around the occasional call to pmap_growkernel(). o In vm_map_find(), remove GIANT_REQUIRED.	2002-06-22 17:47:12 +00:00
Alan Cox	24c46d036d	o Replace GIANT_REQUIRED in swap_pager_alloc() by the acquisition and release of Giant. (Annotate as MPSAFE.)	2002-06-22 08:03:21 +00:00
Alan Cox	2a1618cd59	o Remove GIANT_REQUIRED from phys_pager_alloc(). If handle isn't NULL, acquire and release Giant. If handle is NULL, Giant isn't needed. o Annotate phys_pager_alloc() and phys_pager_dealloc() as MPSAFE.	2002-06-22 07:54:42 +00:00
Alan Cox	990ab7add4	o Replace GIANT_REQUIRED in vnode_pager_alloc() by the acquisition and release of Giant. (Annotate as MPSAFE.) o Also, in vnode_pager_alloc(), remove an unnecessary re-initialization of struct vm_object::flags and move a statement that is duplicated in both branches of an if-else.	2002-06-22 07:28:06 +00:00
Alan Cox	43a90f3a1b	o Remove GIANT_REQUIRED from vslock(). o Annotate kernacc(), useracc(), and vslock() as MPSAFE. Motivated by: alfred	2002-06-22 01:26:02 +00:00
Alan Cox	27168693db	o Remove GIANT_REQUIRED from vm_map_stack().	2002-06-21 06:03:47 +00:00
Alan Cox	7942194583	o Remove GIANT_REQUIRED from vm_pager_allocate() and vm_pager_deallocate().	2002-06-21 05:04:56 +00:00
Alan Cox	3d66f1384e	o Remove an incorrect cast from obreak(). This cast would, for example, break an sbrk(>=4GB) on 64-bit architectures even if the resource limit allowed it. o Correct an off-by-one error. o Correct a spelling error in a comment. o Reorder an && expression so that the commonly FALSE expression comes first. Submitted by: bde (bullets 1 and 2)	2002-06-20 18:38:28 +00:00
Alan Cox	5375be1861	o Acquire and release the vm_map lock instead of Giant in obreak(). Consequently, use vm_map_insert() and vm_map_delete(), which expect the vm_map to be locked, instead of vm_map_find() and vm_map_remove(), which do not.	2002-06-20 02:04:55 +00:00
Jeff Roberson	1e081f889b	- Move the computation of pflags out of the page allocation loop in kmem_malloc() - zero fill pages if PG_ZERO bit is not set after allocation in kmem_malloc() Suggested by: alc, jake	2002-06-19 23:49:57 +00:00
Jeff Roberson	3370c5bfd7	- Remove bogus use of kmem_alloc that was inherited from the old zone allocator. - Properly set M_ZERO when talking to the back end page allocators for non malloc zones. This forces us to zero fill pages when they are first brought into a cache. - Properly handle M_ZERO in uma_zalloc_internal. This fixes a problem where per cpu buckets weren't always getting zeroed.	2002-06-19 20:49:44 +00:00
Jeff Roberson	95f24639b7	Teach kmem_malloc about M_ZERO.	2002-06-19 20:47:18 +00:00
Alan Cox	00e1854a1f	o Replace GIANT_REQUIRED in vm_object_coalesce() by the acquisition and release of Giant. o Reduce the scope of GIANT_REQUIRED in vm_map_insert(). These changes will enable us to remove the acquisition and release of Giant from obreak().	2002-06-19 06:02:03 +00:00
Alan Cox	515630b12f	o Remove LK_CANRECURSE from the vm_map lock.	2002-06-18 18:31:35 +00:00
Jeff Roberson	4741dcbff5	Honor the BUCKETCACHE flag on free as well.	2002-06-17 23:53:58 +00:00
Jeff Roberson	18aa2de5a7	- Introduce the new M_NOVM option which tells uma to only check the currently allocated slabs and bucket caches for free items. It will not go ask the vm for pages. This differs from M_NOWAIT in that it not only doesn't block, it doesn't even ask. - Add a new zcreate option ZONE_VM, that sets the BUCKETCACHE zflag. This tells uma that it should only allocate buckets out of the bucket cache, and not from the VM. It does this by using the M_NOVM option to zalloc when getting a new bucket. This is so that the VM doesn't recursively enter itself while trying to allocate buckets for vm_map_entry zones. If there are already allocated buckets when we get here we'll still use them but otherwise we'll skip it. - Use the ZONE_VM flag on vm map entries and pv entries on x86.	2002-06-17 22:02:41 +00:00
Alan Cox	b49ecb86d0	o Acquire and release Giant in vm_map_wakeup() to prevent a lost wakeup(). Reviewed by: tegge	2002-06-17 13:27:40 +00:00
Alan Cox	042bb29940	o Remove GIANT_REQUIRED from vm_fault_user_wire(). o Move pmap_pageable() outside of Giant in vm_fault_unwire(). (pmap_pageable() is a no-op on all supported architectures.) o Remove the acquisition and release of Giant from mlock().	2002-06-16 20:42:29 +00:00
Alan Cox	319490fb7b	o Remove GIANT_REQUIRED from useracc() and vsunlock(). Neither vm_map_check_protection() nor vm_map_unwire() expect Giant to be held.	2002-06-15 19:10:19 +00:00
Alan Cox	e30616dbfe	o Remove the acquisition and release of Giant from munlock(). Reviewed by: tegge	2002-06-15 05:05:04 +00:00
Alan Cox	1d7cf06c8c	o Use vm_map_wire() and vm_map_unwire() in place of vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_pageable() and vm_map_user_pageable(). o Remove vm_map_clear_recursive() and vm_map_set_recursive(). (They were only used by vm_map_pageable() and vm_map_user_pageable().) Reviewed by: tegge	2002-06-14 18:21:01 +00:00
Alan Cox	d46e7d6bee	o Acquire and release Giant in vm_map_unlock_and_wait(). Submitted by: tegge	2002-06-12 08:15:52 +00:00
Alan Cox	28c58286ef	o Properly handle a failure by vm_fault_wire() or vm_fault_user_wire() in vm_map_wire(). o Make two white-space changes in vm_map_wire(). Reviewed by: tegge	2002-06-11 19:13:59 +00:00
Alan Cox	73b2bace26	o Teach vm_map_delete() to respect the "in-transition" flag on a vm_map_entry by sleeping until the flag is cleared. Submitted by: tegge	2002-06-11 05:24:22 +00:00
Alan Cox	2b4a2c272d	o In vm_map_entry_create(), call uma_zalloc() with M_NOWAIT on system maps. Submitted by: tegge o Eliminate the "!mapentzone" check from vm_map_entry_create() and vm_map_entry_dispose(). Reviewed by: tegge o Fix white-space usage in vm_map_entry_create().	2002-06-10 06:11:45 +00:00
Ian Dowse	f97d6ce396	Correct the logic for determining whether the per-CPU locks need to be destroyed. This fixes a problem where destroying a UMA zone would fail to destroy all zone mutexes. Reviewed by: jeff	2002-06-10 03:25:23 +00:00
Alan Cox	12d7cc840f	o Add vm_map_wire() for wiring contiguous regions of either kernel or user vm_maps. This implementation has two key benefits when compared to vm_map_{user_,}pageable(): (1) it avoids a race condition through the use of "in-transition" vm_map entries and (2) it eliminates lock recursion on the vm_map. Note: there is still an error case that requires clean up. Reviewed by: tegge	2002-06-09 20:25:18 +00:00
Alan Cox	b2f3846aef	o Simplify vm_map_unwire() by merging the second and third passes over the caller-specified region.	2002-06-08 19:00:40 +00:00
Alan Cox	e27e17b711	o Remove an unnecessary call to vm_map_wakeup() from vm_map_unwire(). o Add a stub for vm_map_wire(). Note: the description of the previous commit had an error. The in-transition flag actually blocks the deallocation of a vm_map_entry by vm_map_delete() and vm_map_simplify_entry().	2002-06-08 07:32:38 +00:00
Alan Cox	acd9a301ec	o Add vm_map_unwire() for unwiring contiguous regions of either kernel or user vm_maps. In accordance with the standards for munlock(2), and in contrast to vm_map_user_pageable(), this implementation does not allow holes in the specified region. This implementation uses the "in transition" flag described below. o Introduce a new flag, "in transition," to the vm_map_entry. Eventually, vm_map_delete() and vm_map_simplify_entry() will respect this flag by deallocating in-transition vm_map_entrys, allowing the vm_map lock to be safely released in vm_map_unwire() and (the forthcoming) vm_map_wire(). o Modify vm_map_simplify_entry() to respect the in-transition flag. In collaboration with: tegge	2002-06-07 18:34:23 +00:00
Alfred Perlstein	fa7212543f	fix typo in _SYS_SYSPROTO_H_ case: s/mlockall_args/munlockall_args Submitted by: Mark Santcroos <marks@ripe.net>	2002-06-06 18:51:14 +00:00
Jeff Roberson	494273bead	Add a comment describing a resource leak that occurs during a failure case in obj_alloc.	2002-06-03 22:59:19 +00:00
Alan Cox	c5aaa06ded	o Migrate vm_map_split() from vm_map.c to vm_object.c, renaming it to vm_object_split(). Its interface should still be changed to resemble vm_object_shadow().	2002-06-02 23:54:09 +00:00
Alan Cox	0d78c0dce2	o Style fixes to vm_map_split(), including the elimination of one variable declaration that shadows another. Note: This function should really be vm_object_split(), not vm_map_split(). Reviewed by: md5	2002-06-02 19:32:05 +00:00
Alan Cox	72353893d4	o Condition vm_object_pmap_copy_1()'s compilation on the kernel option ENABLE_VFS_IOOPT. Unless this option is in effect, vm_object_pmap_copy_1() is not used.	2002-06-02 06:31:41 +00:00
Alan Cox	61c075b67f	o Remove GIANT_REQUIRED from vm_map_zfini(), vm_map_zinit(), vm_map_create(), and vm_map_submap(). o Make further use of a local variable in vm_map_entry_splay() that caches a reference to one of a vm_map_entry's children. (This reduces code size somewhat.) o Revert a part of revision 1.66, deinlining vmspace_pmap(). (This function is MPSAFE.)	2002-06-01 22:41:43 +00:00
Alan Cox	794316a866	o Revert a part of revision 1.66: contrary to what that commit message says, deinlining vm_map_entry_behavior() and vm_map_entry_set_behavior() actually increases the kernel's size. o Make vm_map_entry_set_behavior() static and add a comment describing its purpose. o Remove an unnecessary initialization statement from vm_map_entry_splay().	2002-06-01 16:59:30 +00:00
Dag-Erling Smørgrav	8dcfdf3f80	Export nswapdev through sysctl(8). Sponsored by: DARPA, NAI Labs	2002-05-31 08:17:58 +00:00
Alan Cox	9917e01041	Further work on pushing Giant out of the vm_map layer and down into the vm_object layer: o Acquire and release Giant in vm_object_shadow() and vm_object_page_remove(). o Remove the GIANT_REQUIRED assertion preceding vm_map_delete()'s call to vm_object_page_remove(). o Remove the acquisition and release of Giant around vm_map_lookup()'s call to vm_object_shadow().	2002-05-31 03:48:55 +00:00
Alfred Perlstein	99b9331a4f	Check for defined(__i386__) instead of just defined(i386) since the compiler will be updated to only define(__i386__) for ANSI cleanliness.	2002-05-30 07:32:58 +00:00
Peter Wemm	7550be9c57	The kernel printf does not have %i	2002-05-29 08:25:13 +00:00
Alan Cox	8f2ba19c90	o Remove unused #defines.	2002-05-27 22:10:28 +00:00
Alan Cox	4b9fdc2bce	o Acquire and release Giant around pmap operations in vm_fault_unwire() and vm_map_delete(). Assert GIANT_REQUIRED in vm_map_delete() only if operating on the kernel_object or the kmem_object. o Remove GIANT_REQUIRED from vm_map_remove(). o Remove the acquisition and release of Giant from munmap().	2002-05-26 04:54:56 +00:00
Alan Cox	4e94f40222	o Replace the vm_map's hint by the root of a splay tree. By design, the last accessed datum is moved to the root of the splay tree. Therefore, on lookups in which the hint resulted in O(1) access, the splay tree still achieves O(1) access. In contrast, on lookups in which the hint failed miserably, the splay tree achieves amortized logarithmic complexity, resulting in dramatic improvements on vm_maps with a large number of entries. For example, the execution time for replaying an access log from www.cs.rice.edu against the thttpd web server was reduced by 23.5% due to the large number of files simultaneously mmap()ed by this server. (The machine in question has enough memory to cache most of this workload.) Nothing comes for free: At present, I see a 0.2% slowdown on "buildworld" due to the overhead of maintaining the splay tree. I believe that some or all of this can be eliminated through optimizations to the code. Developed in collaboration with: Juan E Navarro <jnavarro@cs.rice.edu> Reviewed by: jeff	2002-05-24 01:33:24 +00:00
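The move-to-root behaviour described in the commit above can be illustrated with a minimal top-down splay sketch. This is not the kernel's vm_map_entry_splay(); `struct entry`, `splay()` and `splay_insert()` are hypothetical names chosen for illustration, and keys are assumed to be distinct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a map entry, keyed by its start address. */
struct entry {
	unsigned long start;
	struct entry *left, *right;
};

/*
 * Top-down splay: restructure the tree so that the entry with the given
 * key, or the last entry visited while searching for it, becomes the root.
 */
struct entry *
splay(struct entry *root, unsigned long key)
{
	struct entry header = { 0, NULL, NULL };
	struct entry *l = &header, *r = &header, *t;

	if (root == NULL)
		return (NULL);
	for (;;) {
		if (key < root->start) {
			if (root->left == NULL)
				break;
			if (key < root->left->start) {
				/* Rotate right. */
				t = root->left;
				root->left = t->right;
				t->right = root;
				root = t;
				if (root->left == NULL)
					break;
			}
			/* Hang the current root on the right-hand tree. */
			r->left = root;
			r = root;
			root = root->left;
		} else if (key > root->start) {
			if (root->right == NULL)
				break;
			if (key > root->right->start) {
				/* Rotate left. */
				t = root->right;
				root->right = t->left;
				t->left = root;
				root = t;
				if (root->right == NULL)
					break;
			}
			/* Hang the current root on the left-hand tree. */
			l->right = root;
			l = root;
			root = root->right;
		} else
			break;
	}
	/* Reassemble the left tree, the new root and the right tree. */
	l->right = root->left;
	r->left = root->right;
	root->left = header.right;
	root->right = header.left;
	return (root);
}

/* Insert n and return it as the new root. Keys are assumed distinct. */
struct entry *
splay_insert(struct entry *root, struct entry *n)
{
	assert(n != NULL);
	n->left = n->right = NULL;
	if (root == NULL)
		return (n);
	root = splay(root, n->start);
	if (n->start < root->start) {
		n->left = root->left;
		n->right = root;
		root->left = NULL;
	} else {
		n->right = root->right;
		n->left = root;
		root->right = NULL;
	}
	return (n);
}
```

A second lookup of the same key finds it at the root immediately, which is the case the old hint handled well. A lookup of a different key pays only the amortized logarithmic cost.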
Alan Cox	03adb816d7	o Make contigmalloc1() static.	2002-05-22 01:01:37 +00:00
John Baldwin	4c1cc01cd8	In uma_zalloc_arg(), if we are performing a M_WAITOK allocation, ensure that td_intr_nesting_level is 0 (like malloc() does). Since malloc() calls uma we can probably remove the check in malloc() for this now. Also, perform an extra witness check in that case to make sure we don't hold any locks when performing a M_WAITOK allocation.	2002-05-20 17:54:48 +00:00
Alan Cox	e0be79afbf	o Eliminate the acquisition and release of Giant from minherit(2). (vm_map_inherit() no longer requires Giant to be held.)	2002-05-18 18:59:00 +00:00
Alan Cox	094f6d2694	o Remove GIANT_REQUIRED from vm_map_madvise(). Instead, acquire and release Giant around vm_map_madvise()'s call to pmap_object_init_pt(). o Replace GIANT_REQUIRED in vm_object_madvise() with the acquisition and release of Giant. o Remove the acquisition and release of Giant from madvise().	2002-05-18 07:48:06 +00:00
Alan Cox	4328504956	o Remove the acquisition and release of Giant from mprotect().	2002-05-18 03:58:16 +00:00
Tom Rhodes	d394511de3	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
Poul-Henning Kamp	98b0c78978	Make daddr_t and u_daddr_t 64bits wide. Retire daddr64_t and use daddr_t instead. Sponsored by: DARPA & NAI Labs.	2002-05-14 11:09:43 +00:00
Jeff Roberson	713deb3677	Don't call the uz free function while the zone lock is held. This can lead to lock order reversals. uma_reclaim now builds a list of freeable slabs and then unlocks the zones to do all of the frees.	2002-05-13 05:08:18 +00:00
Jeff Roberson	0aef6126a1	Remove the hash_free() lock order reversal. This could have happened for several reasons before. Fixing it involved restructuring the generic hash code to require calling code to handle locking, unlocking, and freeing hashes on error conditions.	2002-05-13 04:39:28 +00:00
Alan Cox	a47335fdb4	o Remove GIANT_REQUIRED and an excessive number of blank lines from vm_map_inherit(). (minherit() need not acquire Giant anymore.)	2002-05-12 18:42:05 +00:00
Alan Cox	47c3ccc467	o Acquire and release Giant in vm_object_reference() and vm_object_deallocate(), replacing the assertion GIANT_REQUIRED. o Remove GIANT_REQUIRED from vm_map_protect() and vm_map_simplify_entry(). o Acquire and release Giant around vm_map_protect()'s call to pmap_protect(). Altogether, these changes eliminate the need for mprotect() to acquire and release Giant.	2002-05-12 05:22:56 +00:00
Alan Cox	b3a882e936	o Header files shouldn't depend on options: Provide prototypes for uiomoveco(), uioread(), and vm_uiomove() regardless of whether ENABLE_VFS_IOOPT is defined or not. Submitted by: bde	2002-05-06 06:20:04 +00:00
Alan Cox	c0b6bbb80b	o Condition the compilation and use of vm_freeze_copyopts() on ENABLE_VFS_IOOPT.	2002-05-06 05:45:57 +00:00
Alan Cox	dcc5840ed5	o Some improvements to the page coloring of vm objects, particularly, for shadow objects. Submitted by: bde	2002-05-06 03:34:17 +00:00
Alan Cox	e86256c1f4	o Move vm_freeze_copyopts() from vm_map.{c,h} to vm_object.{c,h}. It's plainly an operation on a vm_object and belongs in the latter place.	2002-05-06 00:12:47 +00:00
Alan Cox	c50fe92b8d	o Condition the compilation of uiomoveco() and vm_uiomove() on ENABLE_VFS_IOOPT. o Add a comment to the effect that this code is experimental support for zero-copy I/O.	2002-05-05 22:42:40 +00:00
Poul-Henning Kamp	81e017430a	Expand the one-line function pbreassignbuf() in the only place it is or could be used.	2002-05-05 20:37:08 +00:00
Alan Cox	15fdd586e3	o Remove GIANT_REQUIRED from vm_map_lookup() and vm_map_lookup_done(). o Acquire and release Giant around vm_map_lookup()'s call to vm_object_shadow().	2002-05-05 05:36:28 +00:00
Jeff Roberson	c7173f58fa	Use pages instead of uz_maxpages, which has not been initialized yet, when creating the vm_object. This was broken after the code was rearranged to grab giant itself. Spotted by: alc	2002-05-04 21:49:29 +00:00
Alan Cox	79660d837c	o Make _vm_object_allocate() and vm_object_allocate() callable without holding Giant. o Begin documenting the trivial cases of the locking protocol on vm_object.	2002-05-04 20:23:48 +00:00
Alan Cox	8c5c5d049f	o Remove GIANT_REQUIRED from vm_map_lookup_entry() and vm_map_check_protection(). o Call vm_map_check_protection() without Giant held in munmap().	2002-05-04 02:07:36 +00:00
Alan Cox	bc91c5107a	o Change the implementation of vm_map locking to use exclusive locks exclusively. The interface still, however, distinguishes between a shared lock and an exclusive lock.	2002-05-02 17:32:27 +00:00
Jeff Roberson	8f70816cf2	Hide a pointer to the malloc_type bucket at the end of the freed memory. If this memory is modified after it has been freed we can now report its previous owner.	2002-05-02 09:07:04 +00:00
Jeff Roberson	b9ba893179	Move around the dbg code a bit so it's always under a lock. This stops a weird potential race if we were preempted right as we were doing the dbg checks.	2002-05-02 09:05:36 +00:00
Andrew R. Reiter	c3bdc05fb9	- Changed the size element of uma_zctor_args to be size_t instead of int. - Changed uma_zcreate to accept the size argument as a size_t instead of int. Approved by: jeff	2002-05-02 07:36:30 +00:00
Jeff Roberson	5a34a9f089	malloc/free(9) no longer require Giant. Use the malloc_mtx to protect the mallochash. Mallochash is going to go away as soon as I introduce the kfree/kmalloc api and partially overhaul the malloc wrapper. This can't happen until all users of the malloc api that expect memory to be aligned on the size of the allocation are fixed.	2002-05-02 07:22:19 +00:00
Alan Cox	569687d02f	o Remove dead and lockmgr()-specific debugging code.	2002-05-02 02:32:09 +00:00
Jeff Roberson	639c9550fb	Remove the temporary alignment check in free(). Implement the following checks on freed memory in the bucket path: - Slab membership - Alignment - Duplicate free This previously was only done if we skipped the buckets. This code will slow down INVARIANTS a bit, but it is smp safe. The checks were moved out of the normal path and into hooks supplied in uma_dbg.	2002-05-02 02:08:48 +00:00
Alan Cox	ea0f50bcf0	o Convert the vm_page buckets mutex to a spin lock. (This resolves an issue on the Alpha platform found by jeff@.) o Simplify vm_page_lookup(). Reviewed by: jhb	2002-04-30 21:24:47 +00:00
Jeff Roberson	8efc4eff00	Add a new UMA debugging facility. This will overwrite freed memory with 0xdeadc0de and then check for it just before memory is handed off as part of a new request. This will catch any post free/pre alloc modification of memory, as well as introduce errors for anything that tries to dereference it as a pointer. This code takes the form of special init, fini, ctor and dtor routines that are specifically used by malloc. It is in a separate file because additional debugging aids will want to live here as well.	2002-04-30 07:54:25 +00:00
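The poison-and-check idea in the commit above can be sketched in a few lines of userland C. The function names `poison_fill()` and `poison_check()` are hypothetical, not the UMA routines, and the sketch assumes the buffer is 4-byte aligned with a size that is a multiple of 4.

```c
#include <stddef.h>
#include <stdint.h>

#define POISON	0xdeadc0deU	/* pattern named in the commit message */

/* On free: overwrite the buffer with the poison pattern, word by word. */
void
poison_fill(void *mem, size_t size)
{
	uint32_t *p = mem;
	size_t i;

	for (i = 0; i < size / sizeof(*p); i++)
		p[i] = POISON;
}

/*
 * Before reuse: return 1 if the buffer still holds only the poison
 * pattern, or 0 if something wrote to it after it was freed.
 */
int
poison_check(const void *mem, size_t size)
{
	const uint32_t *p = mem;
	size_t i;

	for (i = 0; i < size / sizeof(*p); i++)
		if (p[i] != POISON)
			return (0);
	return (1);
}
```

An allocator using this would call `poison_fill()` in its free path and `poison_check()` just before handing the memory out again, reporting an error when the check returns 0.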
Jeff Roberson	2cc35ff9c6	Move the implementation of M_ZERO into UMA so that it can be passed to uma_zalloc and friends. Remove this functionality from the malloc wrapper. Document this change in uma.h and adjust variable names in uma_core.	2002-04-30 04:26:34 +00:00
Alan Cox	7788e21963	o Revert vm_fault1() to its original name vm_fault(), eliminating the wrapper that took its place for the purposes of acquiring and releasing Giant.	2002-04-30 03:44:34 +00:00
Jeff Roberson	28bc44195c	Add a new zone flag UMA_ZONE_MTXCLASS. This puts the zone in its own mutex class. Currently this is only used for kmapentzone because kmapents are potentially allocated when freeing memory. This is not dangerous though because no other allocations will be done while holding the kmapentzone lock.	2002-04-29 23:45:41 +00:00
Peter Wemm	db17c6fc07	Tidy up some loose ends. i386/ia64/alpha - catch up to sparc64/ppc: - replace pmap_kernel() with refs to kernel_pmap - change kernel_pmap pointer to (&kernel_pmap_store) (this is a speedup since ld can set these at compile/link time) all platforms (as suggested by jake): - gc unused pmap_reference - gc unused pmap_destroy - gc unused struct pmap.pm_count (we never used pm_count - we track address space sharing at the vmspace)	2002-04-29 07:43:16 +00:00
Alan Cox	532eadef77	Document three synchronization issues in vm_fault().	2002-04-29 05:23:01 +00:00
Alan Cox	780b1c0997	Pass the caller's file name and line number to the vm_map locking functions.	2002-04-28 23:12:52 +00:00
Alan Cox	d974f03c69	o Introduce and use vm_map_trylock() to replace several direct uses of lockmgr(). o Add missing synchronization to vmspace_swap_count(): Obtain a read lock on the vm_map before traversing it.	2002-04-28 06:07:54 +00:00
Peter Wemm	44e74ba6c3	We do not necessarily need to map/unmap pages to zero parts of them. On systems where physical memory is also direct mapped (alpha, sparc, ia64 etc) this is slightly harmful.	2002-04-28 00:15:48 +00:00
Alan Cox	089b073345	o Begin documenting the (existing) locking protocol on the vm_map in the same style as sys/proc.h. o Undo the de-inlining of several trivial, MPSAFE methods on the vm_map. (Contrary to the commit message for vm_map.h revision 1.66 and vm_map.c revision 1.206, de-inlining these methods increased the kernel's size.)	2002-04-27 22:01:37 +00:00
Alan Cox	cbd53e95fe	o Control access to the vm_page_buckets with a mutex. o Fix some style(9) bugs.	2002-04-26 22:44:15 +00:00
Andrew R. Reiter	d4d6aee5a0	- Fix a round down bogon in uma_zone_set_max(). Submitted by: jeff@	2002-04-25 06:24:40 +00:00
Alan Cox	a569838764	Reintroduce locking on accesses to vm_object_list.	2002-04-20 07:23:22 +00:00
Alan Cox	92de35b0ce	o Move the acquisition of Giant from vm_fault() to the point after initialization in vm_fault1(). o Fix some style problems in vm_fault1().	2002-04-19 04:20:31 +00:00
Alan Cox	ff8f4ebe22	Add a comment documenting a race condition in vm_fault(): Specifically, a modification is made to the vm_map while only a read lock is held.	2002-04-18 03:55:50 +00:00
Alan Cox	6139043b1f	o Call vm_map_growstack() from vm_fault() if vm_map_lookup() has failed due to conditions that suggest the possible need for stack growth. This has two beneficial effects: (1) we can now remove calls to vm_map_growstack() from the MD trap handlers and (2) simple page faults are faster because we no longer unnecessarily perform vm_map_growstack() on every page fault. o Remove vm_map_growstack() from the i386's trap_pfault(). o Remove the acquisition and release of Giant from i386's trap_pfault(). (vm_fault() still acquires it.)	2002-04-18 03:28:27 +00:00
Peter Wemm	334f706177	Do not free the vmspace until p->p_vmspace is set to null. Otherwise statclock can access it in the tail end of statclock_process() at an unfortunate time. This bit me several times on an SMP alpha (UP2000) and the problem went away with this change. I'm not sure why it doesn't break x86 as well. Maybe it's because the clocks are much faster on alpha (HZ=1024 by default).	2002-04-17 05:26:42 +00:00
Alan Cox	b208d0633f	Remove an unused option, VM_FAULT_HOLD, to vm_fault().	2002-04-17 02:23:57 +00:00
Peter Wemm	1a87a0da66	Pass vm_page_t instead of physical addresses to pmap_zero_page[_area]() and pmap_copy_page(). This gets rid of a couple more physical addresses in upper layers, with the eventual aim of supporting PAE and dealing with the physical addressing mostly within pmap. (We will need either 64 bit physical addresses or page indexes, possibly both depending on the circumstances. Leaving this to pmap itself gives more flexibility.) Reviewed by: jake Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)	2002-04-15 16:00:03 +00:00
Jeff Roberson	5300d9dda2	Fix a witness warning when expanding a hash table. We were allocating the new hash while holding the lock on a zone. Fix this by doing the allocation separately from the actual hash expansion. The lock is dropped before the allocation and reacquired before the expansion. The expansion code checks to see if we lost the race and frees the new hash if we do. We really never will lose this race because the hash expansion is single threaded via the timeout mechanism.	2002-04-14 13:47:10 +00:00
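The drop-lock, allocate, reacquire, re-check pattern described in the commit above can be sketched in userland C with a pthread mutex. `struct table` and `table_expand()` are hypothetical names, not UMA code. A real hash table would rehash its entries into the new array; this sketch copies slots directly to stay short.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical table protected by a single mutex. */
struct table {
	pthread_mutex_t lock;
	void **slots;
	size_t nslots;
};

/* Return 1 if this call installed a larger slot array, 0 otherwise. */
int
table_expand(struct table *t)
{
	void **newslots, **old;
	size_t oldn, newn, i;

	/* Sample the current size, then drop the lock before allocating. */
	pthread_mutex_lock(&t->lock);
	oldn = t->nslots;
	pthread_mutex_unlock(&t->lock);

	newn = oldn != 0 ? oldn * 2 : 8;
	newslots = calloc(newn, sizeof(*newslots));	/* no lock held here */
	if (newslots == NULL)
		return (0);

	pthread_mutex_lock(&t->lock);
	if (t->nslots != oldn) {
		/* Lost the race: someone else resized; discard our array. */
		pthread_mutex_unlock(&t->lock);
		free(newslots);
		return (0);
	}
	for (i = 0; i < oldn; i++)
		newslots[i] = t->slots[i];
	old = t->slots;
	t->slots = newslots;
	t->nslots = newn;
	pthread_mutex_unlock(&t->lock);

	free(old);	/* freeing also happens outside the lock */
	return (1);
}
```

The size comparison after reacquiring the lock is what makes dropping it safe: the new array is installed only if nobody else resized the table in the meantime.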
Jeff Roberson	0da47b2fc6	Protect the initial list traversal in sysctl_vm_zone() with the uma_mtx.	2002-04-14 12:39:38 +00:00
Jeff Roberson	af7f9b97b6	Fix the calculation that determines uz_maxpages. It was off for large zones. Fortunately we have no large zones with maximums specified yet, so it wasn't breaking anything. Implement blocking when a zone exceeds the maximum and M_WAITOK is specified. Previously this just failed like the old zone allocator did. The old zone allocator didn't support WAITOK/NOWAIT though so we should do what we advertise. While I was in there I cleaned up some more zalloc logic to further simplify that code path and reduce redundant code. This was needed to make the blocking work properly anyway.	2002-04-14 01:56:25 +00:00
Jeff Roberson	bce9779110	Remember to unlock the zone if the fill count is too high. Pointed out by: pete, jake, jhb	2002-04-10 01:52:50 +00:00
Jeff Roberson	1d4cb54ba8	Quiet witness warnings about acquiring several zone locks. In the case that this happens it is OK.	2002-04-08 21:08:17 +00:00
Jeff Roberson	86bbae32f4	Add a mechanism to disable buckets when the v_free_count drops below v_free_min. This should help performance in memory starved situations.	2002-04-08 06:20:34 +00:00
Jeff Roberson	605cbd6a08	Don't release the zone lock until after the dtor has been called. As far as I can tell this could not have caused any problems yet because UMA is still called with giant. Pointy hat to: jeff Noticed by: jake	2002-04-08 05:13:48 +00:00
Jeff Roberson	9c2cd7e5a9	Implement uma_zdestroy(). Its prototype changed slightly. I decided that I didn't like the wait argument and that if you were removing a zone it had better be empty. Also, I broke out part of hash_expand and made a separate hash_free() for use in uma_zdestroy.	2002-04-08 04:48:58 +00:00
Jeff Roberson	a553d4b8eb	Rework most of the bucket allocation and free code so that per cpu locks are never held across blocking operations. Also, fix two other lock order reversals that were exposed by jhb's witness change. The free path previously had a bug that would cause it to skip the free bucket list in some cases and go straight to allocating a new bucket. This has been fixed as well. These changes made the bucket handling code much cleaner and removed quite a few lock operations. This should be marginally faster now. It is now possible to call malloc w/o Giant and avoid any witness warnings. This still isn't entirely safe though because malloc_type statistics are not protected by any lock.	2002-04-08 02:42:55 +00:00
Jeff Roberson	c235bfa551	Spelling correction; s/seperate/separate/g Submitted by: eric	2002-04-07 22:56:48 +00:00
Jeff Roberson	fedfeee018	There should be no remaining references to these two files in the tree. If there are, it is an error. vm_zone has been superseded by uma.	2002-04-07 22:51:18 +00:00
Jeff Roberson	d0b06acbe1	This fixes a bug where isitem never got set to 1 if a certain chain of events relating to extreme low memory situations occurred. This was only ever seen on the port build cluster, so many thanks to kris for helping me debug this. Tested by: kris	2002-04-07 22:47:36 +00:00
Alan Cox	aa4d062142	o Eliminate the use of grow_stack() and useracc() from sendsig(), osendsig(), and osf1_sendsig(). o Eliminate the prototype for the MD grow_stack() now that it has been removed from all platforms.	2002-04-05 00:52:15 +00:00
Matthew Dillon	80f5c8bf42	Embed a struct vmmeter in the per-cpu structure and add a macro, PCPU_LAZY_INC() which increments elements in it for cases where we can afford the occasional inaccuracy. Use of per-cpu stats counters avoids significant cache stalls in various critical paths that would otherwise severely limit our cpu scalability. Adjust all sysctls accessing cnt.* elements to now use a procedure which aggregates the requested field for all cpus and for the global vmmeter. The global vmmeter is retained, since some stats counters, like v_free_min, cannot be made per-cpu. Also, this allows us to convert counters from the global vmmeter to the per-cpu vmmeter in a piecemeal fashion, so have at it!	2002-04-04 21:38:47 +00:00
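The per-CPU counting scheme in the commit above can be pictured with a small sketch: each CPU bumps its own counter without locks or atomics, and a reader sums the global counter with every per-CPU copy. The names `NCPU`, `struct counters`, `lazy_inc()` and `read_faults()` are hypothetical and do not correspond to the kernel's PCPU_LAZY_INC() or its sysctl handlers.

```c
#include <stddef.h>

#define NCPU	4	/* hypothetical fixed CPU count for the sketch */

/* Hypothetical stand-in for the statistics structure. */
struct counters {
	unsigned long faults;
};

static struct counters global_cnt;	/* counters not yet made per-CPU */
static struct counters pcpu_cnt[NCPU];	/* one private copy per CPU */

/*
 * Increment the calling CPU's private counter. No lock or atomic is
 * used, so an occasional lost update is accepted by design.
 */
void
lazy_inc(int cpu)
{
	pcpu_cnt[cpu].faults++;
}

/* Aggregate the field across the global and all per-CPU structures. */
unsigned long
read_faults(void)
{
	unsigned long sum = global_cnt.faults;
	size_t i;

	for (i = 0; i < NCPU; i++)
		sum += pcpu_cnt[i].faults;
	return (sum);
}
```

Keeping the global structure in the sum is what allows counters to migrate to the per-CPU copies one at a time, as the commit message notes.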
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Jake Burkholder	48f9a59443	Fix a long standing 32bit-ism. Don't assume that the size of a chunk of memory in phys_avail will fit in 'int', use vm_size_t. This fixes booting on sparc64 machines with more than 2 gigs of ram. Thanks to Jan Chrillesen for providing me with access to a 4 gig machine.	2002-04-03 06:57:52 +00:00
Alfred Perlstein	157d7b3538	fix comment typo, s/neccisary/necessary/g	2002-04-02 21:25:12 +00:00
John Baldwin	44731cab3b	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
Jeff Roberson	f22a4b62f5	Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks with this flag. Remove the dup_list and dup_ok code from subr_witness. Now we just check for the flag instead of doing string compares. Also, switch the process lock, process group lock, and uma per cpu locks over to this interface. The original mechanism did not work well for uma because per cpu lock names are unique to each zone. Approved by: jhb	2002-03-27 09:23:41 +00:00
Alan Cox	433b72aa12	Remove an unused prototype.	2002-03-26 05:30:59 +00:00
Jeff Roberson	f4af24d55d	Reset the cachefree statistics after draining the cache. This fixes a bug where a sysctl within 20 seconds of a cache_drain could yield negative "USED" counts. Also, grab the uma_mtx while in the sysctl handler. This hadn't caused problems yet because Giant is held all the time. Reported by: kkenn	2002-03-24 10:56:11 +00:00
Jeff Roberson	736ee5907f	Add uma_zone_set_max() to add enforced limits to non vm obj backed zones.	2002-03-20 05:28:34 +00:00
Jeff Roberson	670d17b5c0	Remove references to vm_zone.h and switch over to the new uma API.	2002-03-20 04:02:59 +00:00
Alfred Perlstein	11caded34f	Remove __P.	2002-03-19 22:20:14 +00:00
Jeff Roberson	9eb6e51923	Quiet a warning introduced by UMA. This only occurs on machines where vm_size_t != unsigned long. Reviewed by: phk	2002-03-19 11:49:10 +00:00
Peter Wemm	30171114b3	Fix a gcc-3.1+ warning. warning: deprecated use of label at end of compound statement ie: you cannot do this anymore: switch(foo) { .... default: }	2002-03-19 11:02:06 +00:00
Jeff Roberson	8355f576a9	This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@	2002-03-19 09:11:49 +00:00
Brian Feldman	25adb370be	Back out the modification of vm_map locks from lockmgr to sx locks. The best path forward now is likely to change the lockmgr locks to simple sleep mutexes, then see if any extra contention it generates is greater than removed overhead of managing local locking state information, cost of extra calls into lockmgr, etc. Additionally, making the vm_map lock a mutex and respecting it properly will put us much closer to not needing Giant magic in vm.	2002-03-18 15:08:09 +00:00
Alan Cox	9f0567f557	Remove vm_object_count: It's unused, incorrectly maintained and duplicates information maintained by the zone allocator.	2002-03-17 18:37:37 +00:00
Alan Cox	5ee9fe6ba1	Undo part of revision 1.57: Now that (o)sendsig() doesn't call useracc(), the motivation for saving and restoring the map->hint in useracc() is gone. (The same tests that motivated this change in revision 1.57 now show that there is no performance loss from removing it.) This was really a hack and some day we would have had to add new synchronization here on map->hint to maintain it.	2002-03-17 07:01:42 +00:00
Alan Cox	2f6c16e1e8	Acquire a read lock on the map inside of vm_map_check_protection() rather than expecting the caller to do so. This (1) eliminates duplicated code in kernacc() and useracc() and (2) fixes missing synchronization in munmap().	2002-03-17 03:19:31 +00:00
Jake Burkholder	ac59490b5e	Convert all pmap_kenter/pmap_kremove pairs in MI code to use pmap_qenter/ pmap_qremove. pmap_kenter is not safe to use in MI code because it is not guaranteed to flush the mapping from the tlb on all cpus. If the process in question is preempted and migrates cpus between the call to pmap_kenter and pmap_kremove, the original cpu will be left with stale mappings in its tlb. This is currently not a problem for i386 because we do not use PG_G on SMP, and thus all mappings are flushed from the tlb on context switches, not just user mappings. This is not the case on all architectures, and if PG_G is to be used with SMP on i386 it will be a problem. This was committed by peter earlier as part of his fine grained tlb shootdown work for i386, which was backed out for other reasons. Reviewed by: peter	2002-03-17 00:56:41 +00:00
Kirk McKusick	0d2af52141	Introduce the new 64-bit size disk block, daddr64_t. Change the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.	2002-03-15 18:49:47 +00:00
Brian Feldman	9cb574590e	Document faultstate.lookup_still_valid more than none. Requested by: alfred	2002-03-14 02:10:14 +00:00
Brian Feldman	0e0af8ecda	Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate. While doing this, move it earlier in the sysinit boot process so that the VM system can use it. After that, the system is now able to use sx locks instead of lockmgr locks in the VM system. To accomplish this, some of the more questionable uses of the locks (such as testing whether they are owned or not, as well as allowing shared+exclusive recursion) are removed, and simpler logic throughout is used so locks should also be easier to understand. This has been tested on my laptop for months, and has not shown any problems on SMP systems, either, so appears quite safe. One more user of lockmgr down, many more to go :)	2002-03-13 23:48:08 +00:00
Eivind Eklund	a128794977	- Remove a number of extra newlines that do not belong here according to style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs. Reviewed by: alc	2002-03-10 21:52:48 +00:00
Tor Egge	ff91d7800f	Revert change in revision 1.53 and add a small comment to protect the revived code. vm pages newly allocated are marked busy (PG_BUSY), thus calling vm_page_delete before the pages has been freed or unbusied will cause a deadlock since vm_page_object_page_remove will wait for the busy flag to be cleared. This can be triggered by calling malloc with size > PAGE_SIZE and the M_NOWAIT flag on systems low on physical free memory. A kernel module that reproduces the problem, written by Logan Gabriel <logan@mail.2cactus.com>, can be found in the freebsd-hackers mail archive (12 Apr 2001). The problem was recently noticed again by Archie Cobbs <archie@dellroad.org>. Reviewed by: dillon	2002-03-09 16:24:27 +00:00
Matthew Dillon	8c5dffe8ca	Fix a bug in the vm_map_clean() procedure. msync()ing an area of memory that has just been mapped MAP_ANON|MAP_NOSYNC and has not yet been accessed will panic the machine. MFC after: 1 day	2002-03-07 03:54:56 +00:00
Matthew Dillon	b9b7a4be90	Add a sequential iteration optimization to vm_object_page_clean(). This moderately improves msync's and VM object flushing for objects containing randomly dirtied pages (fsync(), msync(), filesystem update daemon), and improves cpu use for small-ranged sequential msync()s in the face of very large mmap()ings from O(N) to O(1) as might be performed by a database. A sysctl, vm.msync_flush_flag, has been added and defaults to 3 (the two committed optimizations are turned on by default). 0 will turn off both optimizations. This code has already been tested under stable and is one in a series of memq / vp->v_dirtyblkhd / fsync optimizations to remove O(N^2) restart conditions that will be coming down the pipe. MFC after: 3 days	2002-03-06 02:42:56 +00:00
Eivind Eklund	f52bd684f3	* Move bswlist declaration and initialization from kern/vfs_bio.c to vm/vm_pager.c, which is the only place it is used. * Make the QUEUE_* definitions and bufqueues local to vfs_bio.c. * constify buf_wmesg.	2002-03-05 18:20:58 +00:00
Alan Cox	2be21c5e68	o Create vm_pageq_enqueue() to encapsulate code that is duplicated time and again in vm_page.c and vm_pageq.c. o Delete unused prototypes. (Mainly a result of the earlier renaming of various functions from vm_page_() to vm_pageq_().)	2002-03-04 18:55:26 +00:00
Alan Cox	64190c7a2f	Call vm_pageq_remove_nowakeup() rather than duplicating it.	2002-03-03 22:36:14 +00:00
Alan Cox	5714577006	Remove some long dead code.	2002-03-02 22:21:42 +00:00
John Baldwin	fdcc1cc09f	Use thread0.td_ucred instead of proc0.p_ucred. This change is cosmetic and isn't strictly required. However, it lowers the number of false positives found when grep'ing the kernel sources for p_ucred to ensure proper locking.	2002-02-27 19:18:10 +00:00
John Baldwin	a854ed9893	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
Mike Silbersack	7f3a40933b	Fix a horribly suboptimal algorithm in the vm_daemon. In order to determine what to page out, the vm_daemon checks reference bits on all pages belonging to all processes. Unfortunately, the algorithm used reacted badly with shared pages; each shared page would be checked once per process sharing it; this caused an O(N^2) growth of tlb invalidations. The algorithm has been changed so that each page will be checked only 16 times. Prior to this change, a fork/sleepbomb of 1300 processes could cause the vm_daemon to take over 60 seconds to complete, effectively freezing the system for that time period. With this change in place, the vm_daemon completes in less than a second. Any system with hundreds of processes sharing pages should benefit from this change. Note that the vm_daemon is only run when the system is under extreme memory pressure. It is likely that many people with loaded systems saw no symptoms of this problem until they reached the point where swapping began. Special thanks go to dillon, peter, and Chuck Cranor, who helped me get up to speed with vm internals. PR: 33542, 20393 Reviewed by: dillon MFC after: 1 week	2002-02-27 18:03:02 +00:00
Peter Wemm	d1693e1701	Back out all the pmap related stuff I've touched over the last few days. There is some unresolved badness that has been eluding me, particularly affecting uniprocessor kernels. Turning off PG_G helped (which is a bad sign) but didn't solve it entirely. Userland programs still crashed.	2002-02-27 09:51:33 +00:00
Peter Wemm	bd1e3a0f89	Jake further reduced IPI shootdowns on sparc64 in loops by using ranged shootdowns in a couple of key places. Do the same for i386. This also hides some physical addresses from higher levels and has it use the generic vm_page_t's instead. This will help for PAE down the road. Obtained from: jake (MI code, suggestions for MD part)	2002-02-27 02:14:58 +00:00
Peter Wemm	dd50331c0e	Remove unused variable (td)	2002-02-26 01:01:37 +00:00
Poul-Henning Kamp	57c10583aa	GC: BIO_ORDERED, various infrastructure dealing with BIO_ORDERED.	2002-02-22 09:26:35 +00:00
Tor Egge	d2760948fe	Add a page queue, PQ_HOLD, that temporarily owns pages with nonzero hold count that would otherwise be on one of the free queues. This eliminates a panic when broken programs unmap memory that still has pending IO from raw devices. Reviewed by: dillon, alc	2002-02-19 23:19:30 +00:00
Mike Silbersack	0c9e47230a	Add one more comment to the OOM changes so that future readers of the code may better understand the code. Suggested by: dillon MFC after: 1 week	2002-02-19 18:50:49 +00:00
Mike Silbersack	ef6020d187	Changes to make the OOM killer much more effective: - Allow the OOM killer to target processes currently locked in memory. These very often are the ones doing the memory hogging. - Drop the wakeup priority of processes currently sleeping while waiting for their page fault to complete. In order for the OOM killer to work well, the killed process and other system processes waiting on memory must be allowed to wakeup first. Reviewed by: dillon MFC after: 1 week	2002-02-19 18:34:02 +00:00
Bruce Evans	1e92845e1b	Garbage-collect options ACPI_NO_ENABLE_ON_BOOT, AML_DEBUG, BLEED, DEVICE_SYSCTLS, KEY, LOUTB, NFS_MUIDHASHSIZ, NFS_UIDHASHSIZ, PCI_QUIET and SIMPLELOCK_DEBUG.	2002-02-15 13:16:11 +00:00
Julian Elischer	2c1007663f	In a threaded world, different priorities become properties of different entities. Make it so. Reviewed by: jhb@freebsd.org (john baldwin)	2002-02-11 20:37:54 +00:00
Julian Elischer	079b7badea	Pre-KSE/M3 commit. This is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embedded there but remove all the code that assumes that in preparation for the next commit which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,	2002-02-07 20:58:47 +00:00
Alfred Perlstein	582ec34cd8	Fix a race with free'ing vmspaces at process exit when vmspaces are shared. Also introduce vm_endcopy instead of using pointer tricks when initializing new vmspaces. The race occurred because of how the reference was utilized: test vmspace reference, possibly block, decrement reference. When sharing a vmspace between multiple processes it was possible for two processes exiting at the same time to test the reference count, possibly block and neither one free because they wouldn't see the other's update. Submitted by: green	2002-02-05 21:23:05 +00:00
Matthew Dillon	027df6bdd7	GC P_BUFEXHAUST leftovers, we've had a new mechanism to avoid buffer cache lockups for over a year now. MFC after: 0 days	2002-01-31 18:39:44 +00:00
David Malone	d2979f90e7	Remove a parameter name from a prototype.	2002-01-25 21:33:10 +00:00
Bruce Evans	e50f5c2e8d	Don't declare vm_swapout() in the NO_SWAPPING case when it is not defined. Fixed some style bugs.	2002-01-17 16:46:26 +00:00
Alfred Perlstein	a4db49537b	Replace ffind_* with fget calls. Make fget MPsafe. Make fgetvp and fgetsock use the fget subsystem to reduce code bloat. Push giant down in fpathconf().	2002-01-14 00:13:45 +00:00
Alfred Perlstein	426da3bcfb	SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc: protects all the fields; protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file: protects the refcount fields; doesn't protect anything else; the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container; could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file *fhold(struct file *fp); /* increments reference count on a file */ struct file *fhold_locked(struct file *fp); /* like fhold but expects file to locked */ struct file *ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file *ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.	2002-01-13 11:58:06 +00:00
John Baldwin	c86b6ff551	Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed: The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex release and swi schedule, respectively, when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.) I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly. Reviewed by: peter Tested on: i386, alpha	2002-01-05 08:47:13 +00:00
Matthew Dillon	23b590188f	Fix a BUF_TIMELOCK race against BUF_LOCK and fix a deadlock in vget() against VM_WAIT in the pageout code. Both fixes involve adjusting the lockmgr's timeout capability so locks obtained with timeouts do not interfere with locks obtained without a timeout. Hopefully MFC: before the 4.5 release	2001-12-20 22:42:27 +00:00
Matthew Dillon	3ebeaf5984	This fixes a large number of bugs in our NFS client side code. A recent commit by Kirk also fixed a softupdates bug that could easily be triggered by server side NFS. * An edge case with shared R+W mmap()'s and truncate whereby the system would inappropriately clear the dirty bits on still-dirty data. (applicable to all filesystems) THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING. see vm/vm_page.c line 1641 * The straddle case for VM pages and buffer cache buffers when truncating. (applicable to NFS client side) * Possible SMP database corruption due to vm_pager_unmap_page() not clearing the TLB for the other cpu's. (applicable to NFS client side but could affect all filesystems). Note: not considered serious since the corruption occurs beyond the file EOF. * When flushing a dirty buffer due to B_CACHE getting cleared, we were accidentally setting B_CACHE again (that is, bwrite() sets B_CACHE), when we really want it to stay clear after the write is complete. This resulted in a corrupt buffer. (applicable to all filesystems but probably only triggered by NFS) * We have to call vtruncbuf() when ftruncate()ing to remove any buffer cache buffers. This is still tentative, I may be able to remove it due to the second bug fix. (applicable to NFS client side) * vnode_pager_setsize() race against nfs_vinvalbuf()... we have to set n_size before calling nfs_vinvalbuf or the NFS code may recursively vnode_pager_setsize() to the original value before the truncate. This is what was causing the user mmap bus faults in the nfs tester program. (applicable to NFS client side) * Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made by Kirk). Testing program written by: Avadis Tevanian, Jr. Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step') MFC after: 1 week	2001-12-14 01:16:57 +00:00
Luigi Rizzo	60363fb9f7	vm/vm_kern.c: rate limit (to once per second) diagnostic printf when you run out of mbuf address space. kern/subr_mbuf.c: print a warning message when mb_alloc fails, again rate-limited to at most once per second. This covers other cases of mbuf allocation failures. Probably it also overlaps the one handled in vm/vm_kern.c, so maybe the latter should go away. This warning will let us gradually remove the printf that are scattered across most network drivers to report mbuf allocation failures. Those are potentially dangerous, in that they are not rate-limited and can easily cause systems to panic. Unless there is disagreement (which does not seem to be the case judging from the discussion on -net so far), and because this is sort of a safety bugfix, I plan to commit a similar change to STABLE during the weekend (it affects kern/uipc_mbuf.c there). Discussed-with: jlemon, silby and -net	2001-12-01 00:21:30 +00:00
Jonathan Lemon	4584bbf555	When laying out objects in a ZONE_INTERRUPT zone, allow them to cross a page boundary, since we've already allocated all our contiguous kva space up front. This eliminates some memory wastage, and allows us to actually reach the # of objects that were specified in the zinit() call. Reviewed by: peter, dillon	2001-11-17 00:40:48 +00:00
Matthew Dillon	fe8e0238cc	Fix deadlock introduced in 1.73 (Jan 1998). The paging-in-progress count on a vnode-backed object must be incremented after obtaining the vnode lock. If it is bumped before obtaining the vnode lock we can deadlock against vtruncbuf(). Submitted by: peter, ps MFC after: 3 days	2001-11-09 21:34:45 +00:00
Matthew Dillon	33c6774151	Adjust vnode_pager_input_smlfs() to not attempt to BMAP blocks beyond the file EOF. This works around a bug in the ISOFS (CDRom) BMAP code which returns bogus values for requests beyond the file EOF rather than returning an error, resulting in either corrupt data being mmap()'d beyond the file EOF or resulting in a seg-fault on the last page of a mmap()'d file (mmap()s of CDRom files). Reported by: peter / Yahoo MFC after: 3 days	2001-11-05 18:58:47 +00:00
Matthew Dillon	e302698320	Don't let pmap_object_init_pt() exhaust all available free pages (allocating pv entries w/ zalloci) when called in a loop due to an madvise(). It is possible to completely exhaust the free page list and cause a system panic when an expected allocation fails.	2001-10-31 03:06:33 +00:00
Matthew Dillon	7a5a635273	Move recently added procedure which was incorrectly placed within an #ifdef DDB block.	2001-10-26 16:27:54 +00:00
Matthew Dillon	245df27cee	Implement kern.maxvnodes. adjusting kern.maxvnodes now actually has a real effect. Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%. Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, in regards to vnode stability when the mount list mutex is held. Improves looping case by 500%. (more optimization work is needed on top of these fixes) MFC after: 1 week	2001-10-26 00:08:05 +00:00
Matthew Dillon	57601bcb5d	Syntax cleanup and documentation, no operational changes. MFC after: 1 day	2001-10-21 06:12:06 +00:00
Ian Dowse	0eb6ce3169	Move the code that computes the system load average from vm_meter.c to kern_synch.c in preparation for adding some jitter to the inter-sample time. Note that the "vm.loadavg" sysctl still lives in vm_meter.c which isn't the right place, but it is appropriate for the current (bad) name of that sysctl. Suggested by: jhb (some time ago) Reviewed by: bde	2001-10-20 13:10:43 +00:00
Matthew Dillon	b386828956	contigmalloc1() could cause the vm_page_zero_count to become incorrect. Properly track the count. Submitted by: mark tinguely <tinguely@web.cs.ndsu.nodak.edu>	2001-10-17 17:34:34 +00:00
Tor Egge	d6844b6bf6	Don't use an uninitialized field reserved for callers in the bio structure passed to swap_pager_strategy(). Instead, use a field reserved for drivers and initialize it before usage. Reviewed by: dillon	2001-10-15 23:02:54 +00:00
Tor Egge	30105b9ec4	Don't remove all mappings of a swapped out process if the vm map contained wired entries. vm_fault_unwire() depends on the mapping being intact. Reviewed by: dillon	2001-10-14 20:51:14 +00:00
Tor Egge	e7673b8424	Fix locking violations during page wiring: - vm map entries are not valid after the map has been unlocked. - An exclusive lock on the map is needed before calling vm_map_simplify_entry(). Fix cleanup after page wiring failure to unwire all pages that had been successfully wired before the failure was detected. Reviewed by: dillon	2001-10-14 20:47:08 +00:00
Matthew Dillon	33bd457d91	Makes contigalloc[1]() create the vm_map / underlying wired pages in the kernel map and object in a manner that contigfree() is actually able to free. Previously contigfree() freed up the KVA space but could not unwire & free the underlying VM pages due to mismatched pageability between the map entry and the VM pages. Submitted by: Thomas Moestl <tmoestl@gmx.net> Testing by: mark tinguely <tinguely@web.cs.ndsu.nodak.edu> MFC after: 3 days	2001-10-13 04:23:37 +00:00
Matthew Dillon	00a6f47f13	Finally fix the VM bug where a file whose EOF occurs in the middle of a page would sometimes prevent a dirty page from being cleaned, even when synced, resulting in the dirty page being re-flushed to disk every 30-60 seconds or so, forever. The problem is that when the filesystem flushes a page to its backing file it typically does not clear dirty bits representing areas of the page that are beyond the file EOF. If the file is also mmap()'d and a fault is taken, vm_fault (properly, is required to) set the vm_page_t->dirty bits to VM_PAGE_BITS_ALL. This combination could leave us with an uncleanable, unfreeable page. The solution is to have the vnode_pager detect the edge case and manually clear the dirty bits representing areas beyond the file EOF. The filesystem does the rest and the page comes up clean after the write completes. MFC after: 3 days	2001-10-12 18:17:34 +00:00
John Baldwin	bd78cece5d	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
John Baldwin	61d80e90a9	Add missing includes of sys/ktr.h.	2001-10-11 17:53:43 +00:00
Paul Saab	cbc89bfbfe	Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable. Reviewed by: peter MFC after: 2 weeks	2001-10-10 23:06:54 +00:00
Ian Dowse	564bfabecb	Remove the SSLEEP case from the load average computation. This has been a no-op for as long as our CVS history goes back. Processes in state SSLEEP could only be counted if p_slptime == 0, but immediately before loadav() is called, schedcpu() has just incremented p_slptime on all SSLEEP processes.	2001-10-04 22:33:31 +00:00
Robert Watson	8c5d4fe829	o Modify access control checks in mmap() to use securelevel_gt() instead of direct variable access. Obtained from: TrustedBSD Project	2001-09-26 20:29:39 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2. Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Peter Wemm	eb30c1c0b9	Rip some well duplicated code out of cpu_wait() and cpu_exit() and move it to the MI area. KSE touched cpu_wait() which had the same change replicated five ways for each platform. Now it can just do it once. The only MD parts seemed to be dealing with fpu state cleanup and things like vm86 cleanup on x86. The rest was identical. XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional stub in place. Reviewed by: jake, tmm, dillon	2001-09-10 04:28:58 +00:00
John Baldwin	29fdb744d1	Process priority is locked by the sched_lock, not the proc lock.	2001-09-01 20:16:30 +00:00
Matthew Dillon	7feaf028be	make swapon() MPSAFE (will adjust syscalls.master later)	2001-08-31 22:15:37 +00:00
Matthew Dillon	6a33d53c48	mark obreak() and ovadvise() as being MPSAFE	2001-08-31 22:10:03 +00:00
Matthew Dillon	d2c60af81a	Cleanup	2001-08-31 01:26:30 +00:00
Peter Wemm	3516c025ff	Implement idle zeroing of pages. I've been tinkering with this on and off since John Dyson left his work-in-progress. It is off by default for now. sysctl vm.zeroidle_enable=1 to turn it on. There are some hacks here to deal with the present lack of preemption - we yield after doing a small number of pages since we won't preempt otherwise. This is basically Matt's algorithm [with hysteresis] with an idle process to call it in a similar way it used to be called from the idle loop. I cleaned up the includes a fair bit here too.	2001-08-25 05:00:44 +00:00
Matthew Dillon	676274db9b	Remove support for the badly broken MAP_INHERIT (from -current only).	2001-08-24 19:29:56 +00:00
Matthew Dillon	219d632c15	Move most of the kernel submap initialization code, including the timeout callwheel and buffer cache, out of the platform specific areas and into the machine independent area. i386 and alpha adjusted here. Other cpus can be fixed piecemeal. Reviewed by: freebsd-smp, jake	2001-08-22 04:07:27 +00:00
Matthew Dillon	0b76df7146	KASSERT if vm_page_t->wire_count overflows.	2001-08-22 04:01:56 +00:00
Matthew Dillon	2f9e4e8025	Limit the amount of KVM reserved for the buffer cache and for swap-meta information. The default limits only affect machines with > 1GB of ram and can be overridden with two new kernel conf variables VM_SWZONE_SIZE_MAX and VM_BCACHE_SIZE_MAX, or with loader variables kern.maxswzone and kern.maxbcache. This has the effect of leaving more KVM available for sizing NMBCLUSTERS and 'maxusers' and should avoid tripups where a sysad adds memory to a machine and then sees the kernel panic on boot due to running out of KVM. Also change the default swap-meta auto-sizing calculation to allocate half of what it was previously allocating. The prior defaults were way too high. Note that we cannot afford to run out of swap-meta structures so we still stay somewhat conservative here.	2001-08-20 00:41:12 +00:00
John Baldwin	02cd7c3cf2	- Remove asleep(), await(), and M_ASLEEP. - Callers of asleep() and await() have been converted to calling tsleep(). The only caller outside of M_ASLEEP was the ata driver, which called both asleep() and await() with spl-raised, so there was no need for the asleep() and await() pair. M_ASLEEP was unused. Reviewed by: jasone, peter	2001-08-10 06:56:12 +00:00
John Baldwin	8ec48c6dbf	- Remove asleep(), await(), and M_ASLEEP. - Callers of asleep() and await() have been converted to calling tsleep(). The only caller outside of M_ASLEEP was the ata driver, which called both asleep() and await() with spl-raised, so there was no need for the asleep() and await() pair. M_ASLEEP was unused. Reviewed by: jasone, peter	2001-08-10 06:37:05 +00:00
Thomas Moestl	59fa485c3e	Add a missing semicolon to unbreak the kernel build with INVARIANTS (which was unfortunately turned off in the configuration I used for the last test build). Spotted by: jake Pointy hat to: tmm	2001-08-05 03:55:02 +00:00
John Baldwin	bd8e0d5871	Whitespace fixes.	2001-08-04 20:49:29 +00:00
Thomas Moestl	b4c53a8111	Add a zdestroy() function to the zone allocator. This is needed for the unload case of modules that use their own zones. It has been tested with the nfs module.	2001-08-04 20:17:05 +00:00
Alfred Perlstein	61ce6eeee3	Fixups for the initial allocation by dillon: 1) allocate fewer buckets 2) when failing to allocate swap zone, keep reducing the zone by a third rather than a half in order to reduce the chance of allocating way too little. I also moved around some code for readability. Suggested by: dillon Reviewed by: dillon	2001-08-02 07:54:58 +00:00
Jake Burkholder	3a9b5daf48	Oops. Last commit to vm_object.c should have got these files too. Remove the use of atomic ops to manipulate vm_object and vm_page flags. Giant is required here, so they are superfluous. Discussed with: dillon	2001-07-31 04:09:52 +00:00
Jake Burkholder	b06805ad34	Remove the use of atomic ops to manipulate vm_object and vm_page flags. Giant is required here, so they are superfluous. Discussed with: dillon	2001-07-31 04:03:53 +00:00
Ian Dowse	a4821e444e	Permit direct swapping to NFS regular files using swapon(2). We already allow this for NFS swap configured via BOOTP, so it is known to work fine. For many diskless configurations it is more flexible to have the client set up swapping itself; it can recreate a sparse swap file to save on server space for example, and it works with a non-NFS root filesystem such as an in-kernel filesystem image.	2001-07-28 20:18:38 +00:00
Assar Westerlund	d3e5863fa9	make vm_page_select_cache static Requested by: bde	2001-07-23 12:34:31 +00:00
Assar Westerlund	0379d76358	(vm_page_select_cache): add prototype	2001-07-21 17:08:15 +00:00
Benno Rice	1f246456a5	The i386-specific includes in this file were "fixed" by bracketing them with #ifndef __alpha__. Fix this for the rest of the world by turning it into #ifdef __i386__. Reviewed by: obrien	2001-07-15 04:11:51 +00:00
Dag-Erling Smørgrav	bf3009895e	Fix missing newline and terminator at the end of the vm.zone sysctl.	2001-07-09 03:37:33 +00:00
Matt Jacob	f343cf2135	Apply field bandages to the includes so compiles happen on alpha.	2001-07-05 06:13:44 +00:00
Matthew Dillon	7197571105	Move vm_page_zero_idle() from machine-dependent sections to a machine-independent source file, vm/vm_zeroidle.c. It was exactly the same for all platforms and updating them all was getting annoying.	2001-07-05 01:32:42 +00:00
Matthew Dillon	6d03d577a5	Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc). Also removed some spl's and added some VM mutexes, but they are not actually used yet, so this commit does not really make any operational changes to the system. vm_page.c relates to vm_page_t manipulation, including high level deactivation, activation, etc... vm_pageq.c relates to finding free pages and acquiring exclusive access to a page queue (exclusivity part not yet implemented). And the world still builds... :-)	2001-07-04 23:27:09 +00:00
Matthew Dillon	1b40f8c036	Change inlines back into mainline code in preparation for mutexing. Also, most of these inlines had been bloated in -current far beyond their original intent. Normalize prototypes and function declarations to be ANSI only (half already were). And do some general cleanup. (kernel size also reduced by 50-100K, but that isn't the prime intent)	2001-07-04 20:15:18 +00:00
Matthew Dillon	54d9214595	whitespace / register cleanup	2001-07-04 19:00:13 +00:00
Matthew Dillon	0cddd8f023	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
John Baldwin	b62b9b648b	Fix a XXX comment by moving the initialization of the number of pbuf's for the vnode pager to a new vnode pager init method instead of making it a hack in getpages().	2001-07-03 07:35:56 +00:00
John Baldwin	6d541bf1ae	- Protect all accesses to nsw_[rw]count{,_{,a}sync} with the pbuf mutex. - Don't drop the vm mutex while grabbing the pbuf mutex to manipulate said variables.	2001-06-22 21:12:19 +00:00
Bosko Milekic	08442f8a82	Introduce numerous SMP friendly changes to the mbuf allocator. Namely, introduce a modified allocation mechanism for mbufs and mbuf clusters; one which can scale under SMP and which offers the possibility of resource reclamation to be implemented in the future. Notable advantages: o Reduce contention for SMP by offering per-CPU pools and locks. o Better use of data cache due to per-CPU pools. o Much less code cache pollution due to excessively large allocation macros. o Framework for `grouping' objects from same page together so as to be able to possibly free wired-down pages back to the system if they are no longer needed by the network stacks. Additional things changed with this addition: - Moved some mbuf specific declarations and initializations from sys/conf/param.c into mbuf-specific code where they belong. - m_getclr() has been renamed to m_get_clrd() because the old name is really confusing. m_getclr() HAS been preserved though and is defined to the new name. No tree sweep has been done "to change the interface," as the old name will continue to be supported and is not deprecated. The change was merely done because m_getclr() sounds too much like "m_get a cluster." - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and systat(1) (see TODO below). - Fixed systat(1) to display number of "free mbufs" based on new per-CPU stat structures. - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported per-CPU stat structures. All infos are fetched via sysctl. TODO (in order of priority): - Re-enable mbtypes statistics in both netstat(1) and systat(1) after introducing an SMP friendly way to collect the mbtypes stats under the already introduced per-CPU locks (i.e. hopefully don't use atomic() - it seems too costly for a mere stat update, especially when other locks are already present). - Optionally have systat(1) display not only "total free mbufs" but also "total free mbufs per CPU pool." - Fix minor length-fetching issues in netstat(1) related to recently re-enabled option to read mbuf stats from a core file. - Move reference counters at least for mbuf clusters into an unused portion of the cluster itself, to save space and need to allocate a counter. - Look into introducing resource freeing possibly from a kproc. Reviewed by (in parts): jlemon, jake, silby, terry Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha) Preliminary performance measurements: jlemon (and me, obviously) URL: http://people.freebsd.org/~bmilekic/mb_alloc/	2001-06-22 06:35:32 +00:00
John Baldwin	ad6c5bbede	Don't lock around swap_pager_swap_init() that is only called once during the pagedaemon's startup code since it calls malloc which results in lock order reversals.	2001-06-20 23:34:06 +00:00
John Baldwin	69a78d4666	Put the scheduler, vmdaemon, and pagedaemon kthreads back under Giant for now. The proc locking isn't actually safe yet and won't be until the proc locking is finished.	2001-06-20 00:48:20 +00:00
Matthew Dillon	ef6a93ef81	Cleanup the tabbing	2001-06-11 19:17:05 +00:00
Matthew Dillon	ff2b5645b5	Two fixes to the out-of-swap process termination code. First, start killing processes a little earlier to avoid a deadlock. Second, when calculating the 'largest process' do not just count RSS. Instead count the RSS + SWAP used by the process. Without this the code tended to kill small inconsequential processes like, oh, sshd, rather than one of the many 'eatmem 200MB' I run on a whim :-). This fix has been extensively tested on -stable and somewhat tested on -current and will be MFCd in a few days. Shamed into fixing this by: ps	2001-06-09 18:06:58 +00:00
Thomas Moestl	5c5c8fa826	Change the way information about swap devices is exported to be more canonical: define a versioned struct xswdev, and add a sysctl node handler that allows the user to get this structure for a certain device index by specifying this index as last element of the MIB. This new node handler, vm.swap_info, replaces the old vm.nswapdev and vm.swapdevX.* (where X was the index) sysctls.	2001-06-01 22:53:10 +00:00
Thomas Moestl	d279178df7	Clean up the code exporting interrupt statistics via sysctl a bit: - move the sysctl code to kern_intr.c - do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine the length of the intrcnt array - move the declarations of intrnames, eintrnames, intrcnt and eintrcnt from machine-dependent include files to sys/interrupt.h - remove the hw.nintr sysctl, it is not needed. - fix various style bugs Requested by: bde Reviewed by: bde (some time ago)	2001-06-01 13:23:28 +00:00
John Baldwin	342a1480aa	Don't hold the VM lock across VOP's and other things that can sleep.	2001-05-29 16:58:25 +00:00
John Baldwin	190609dd48	Stick VM syscalls back under Giant if the BLEED option is not defined.	2001-05-24 18:04:29 +00:00
Matthew Dillon	ac8f990bde	This patch implements O_DIRECT about 80% of the way. It takes a patchset Tor created a while ago, removes the raw I/O piece (that has cache coherency problems), and adds a buffer cache / VM freeing piece. Essentially this patch causes O_DIRECT I/O to not be left in the cache, but does not prevent it from going through the cache, hence the 80%. For the last 20% we need a method by which the I/O can be issued directly to buffer supplied by the user process and bypass the buffer cache entirely, but still maintain cache coherency. I also have the code working under -stable but the changes made to sys/file.h may not be MFCable, so an MFC is not on the table yet. Submitted by: tegge, dillon	2001-05-24 07:22:27 +00:00
John Baldwin	e6b961ffbd	- Assert Giant is held in the vnode pager methods. - Lock the VM while walking down a vm_object's backing_object list in vnode_pager_lock().	2001-05-23 22:51:23 +00:00
John Baldwin	3614c6fcbb	- Add in several asserts of vm_mtx. - Assert Giant in vm_pageout_scan() for the vnode hacking that it does. - Don't hold vm_mtx around vget() or vput(). - Lock Giant when calling vm_pageout_scan() from the pagedaemon. Also, lock curproc while setting the P_BUFEXHAUST flag. - For now we still hold Giant for all of the vm_daemon. When process limits are locked we will only need Giant for swapout_procs().	2001-05-23 22:48:28 +00:00
John Baldwin	60517fd1f7	- Assert that the vm lock is held for all of _vm_object_allocate(). - Restore the previous order of setting up a new vm_object. The previous had a small bug where we zero'd out the flags after we set the OBJ_ONEMAPPING flag. - Add several asserts of vm_mtx. - Assert Giant is held rather than locking and unlocking it in a few places. - Add in some #ifdef objlocks code to lock individual vm objects when vm objects each have their own lock someday. - Don't bother acquiring the allproc lock for a ddb command. If DDB blocked on the lock, that would be worse than having an inconsistent allproc list.	2001-05-23 22:42:10 +00:00
John Baldwin	21c641b2a9	- Add lots of vm_mtx assertions. - Add a few KTR tracepoints to track the addition and removal of vm_map_entry's and the creation and free'ing of vmspace's. - Adjust a few portions of code so that we update the process' vmspace pointer to its new vmspace before freeing the old vmspace.	2001-05-23 22:38:00 +00:00
John Baldwin	3a2189d451	- Lock the VM around the pmap_swapin_proc() call in faultin(). - Don't lock Giant in the scheduler() function except for when calling faultin(). - In swapout_procs(), lock the VM before the process to avoid a lock order violation. - In swapout_procs(), release the allproc lock before calling swapout(). We restart the process scan after swapping out a process. - In swapout_procs(), un #if 0 the code to bump the vmspace reference count and lock the process' vm structures. This bug was introduced by me and could result in the vmspace being free'd out from under a running process. - Fix an old bug where the vmspace reference count was not free'd if we failed the swap_idle_threshold2 test.	2001-05-23 22:35:45 +00:00
John Baldwin	b608320d4a	- Fix the sw_alloc_interlock to actually lock itself when the lock is acquired. - Assert Giant is held in the strategy, getpages, and putpages methods and the getchainbuf, flushchainbuf, and waitchainbuf functions. - Always call flushchainbuf() w/o the VM lock.	2001-05-23 22:31:15 +00:00
John Baldwin	6d556da5c2	Assert Giant is held for the device pager alloc and getpages methods since we call the mmap method of the cdevsw of the device we are mmap'ing.	2001-05-23 22:27:52 +00:00
John Baldwin	e4ca250d4b	- Obtain Giant in mmap() syscall while messing with file descriptors and vnodes. - Fix an old bug that would leak a reference to a fd if the vnode being mmap'd wasn't of type VREG or VCHR. - Lock Giant in vm_mmap() around calls into the VM that can call into pager routines that need Giant or into other VM routines that need Giant. - Replace code that used a goto to jump around the else branch of a test to use an else branch instead.	2001-05-23 22:17:43 +00:00
John Baldwin	bb10bb4978	Acquire Giant around vm_map_remove() inside of the obreak() syscall for vm_object_terminate().	2001-05-23 22:13:10 +00:00
John Baldwin	576f0c5fa4	Take a more conservative approach and still lock Giant around VM faults for now.	2001-05-23 22:09:18 +00:00
John Baldwin	c52f090cfb	Set the phys_pager_alloc_lock to 1 when it is acquired so that it is actually locked.	2001-05-23 19:52:23 +00:00
Alfred Perlstein	c5e62505ad	acquire Giant when playing with the buffercache and doing IO. use msleep against the vm mutex while waiting for a page IO to complete.	2001-05-23 10:28:11 +00:00
Alfred Perlstein	240e0fdd93	acquire vm mutex in swp_pager_async_iodone. Don't call swp_pager_async_iodone with the mutex held.	2001-05-22 19:01:26 +00:00
John Baldwin	86e92ee7e1	Remove duplicate include and sort includes.	2001-05-22 07:21:46 +00:00
John Baldwin	7d4ad42de5	Sort includes.	2001-05-22 07:01:11 +00:00
John Baldwin	12635f9c89	Unlock the VM lock at the end of munlock() instead of locking it again.	2001-05-22 06:07:36 +00:00
John Baldwin	874468957d	Sort includes from previous commit.	2001-05-22 05:35:45 +00:00
John Baldwin	4edf4a58e6	Sort includes.	2001-05-22 00:56:25 +00:00
Alfred Perlstein	2395531439	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
John Baldwin	ea7549540f	- Use a timeout for the tsleep in scheduler() instead of having vmmeter() wakeup proc0 by hand to enforce the timeout. - When swapping out a process, keep the process locked via the proc lock from the first checks up until we clear PS_INMEM and set PS_SWAPPING in swapout(). The swapout() function now must be called with the proc lock held and releases it before returning. - Comment out the code to attempt to lock a process' VM structures before swapping out. It is broken in that it releases the lock after obtaining it. If it does grab the lock, it needs to hand it off to swapout() instead of releasing it. This can be revisited when the VM is locked as this is a valid test to perform. It also causes a lock order reversal for the time being, which is the immediate cause for temporarily disabling it.	2001-05-18 00:08:38 +00:00
John Baldwin	1c58e4e550	During the code to pick a process to kill when memory is exhausted, keep the process in question locked as soon as we find it and determine it to be eligible until we actually kill it. To avoid deadlock, we don't block on the process lock but skip any process that is already locked during our search.	2001-05-17 22:49:03 +00:00
John Baldwin	c96d52a913	- Use PROC_LOCK_ASSERT instead of a direct mtx_assert. - Don't hold Giant in the swapper daemon while we walk the list of processes looking for a process to swap back in. - Don't bother grabbing the sched_lock while checking a process' sleep time in swapout_procs() to ensure that a process has been idle for at least swap_idle_threshold2 before swapping it out. If we lose the race we just let a process stay in memory until the next call of swapout_procs(). - Remove some unneeded spl's, sched_lock does all the locking needed in this case.	2001-05-15 22:20:44 +00:00
Poul-Henning Kamp	a468031ce8	Actually biofinish(struct bio *, struct devstat *, int error) is more general than the bioerror(). Most of this patch is generated by scripts.	2001-05-06 20:00:03 +00:00
Mark Murray	559034b748	Putting sys/lockmgr.h in here allows us to depollute userland includes a bit. OK'ed by: bde	2001-05-03 11:33:51 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Greg Lehey	60fb0ce365	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00


2534 commits