systems. Of note:
- Implement a direct mapped region using 2MB pages. This eliminates the
need for temporary mappings when getting ptes. This supports up to
512GB of physical memory for now. This should be enough for a while.
- Implement a 4-tier page table system. Most of the infrastructure is
there for 128TB of userland virtual address space, but only 512GB is
presently enabled due to a mystery bug somewhere. The design of this
was heavily inspired by the alpha pmap.c.
- The kernel is moved into the negative address space(!).
- The kernel has 2GB of KVM available.
- Provide a uma memory allocator to use the direct map region to take
advantage of the 2MB TLBs.
- Fixed some assumptions in the bus_space macros about the ability
to fit virtual addresses in an 'int'.
Notable missing things:
- pmap_growkernel() should be able to grow to 512GB of KVM by expanding
downwards below kernbase. The kernel must be at the top 2GB of the
negative address space because of gcc code generation strategies.
- need to fix the >512GB user vm code.
Approved by: re (blanket)
- Fix visibilty test for LONG_BIT and WORD_BIT. `#if defined(__FOO_VISIBLE)'
is alays wrong because __FOO_VISIBLE is always defined (to 0 for
invisibility).
sys/<arch>/include/limits.h
sys/<arch>/include/_limits.h:
- Style fixes.
Submitted by: bde
Reviewed by: bsdmike
Approved by: re (scottl)
load_gs() calls into a single place that is less likely to go wrong.
Eliminate the per-process context switching of MSR_GSBASE, because it
should be constant for a single cpu. Instead, save/restore it during
the loading of the new %gs selector for the new process.
Approved by: re (amd64/* blanket)
stolen from the ia64/ia32 code (indeed there was a repocopy), but I've
redone the MD parts and added and fixed a few essential syscalls. It
is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic)
and p4. The ia64 code has not implemented signal delivery, so I had
to do that.
Before you say it, yes, this does need to go in a common place. But
we're in a freeze at the moment and I didn't want to risk breaking ia64.
I will sort this out after the freeze so that the common code is in a
common place.
On the AMD64 side, this required adding segment selector context switch
support and some other support infrastructure. The %fs/%gs etc code
is hairy because loading %gs will clobber the kernel's current MSR_GSBASE
setting. The segment selectors are not used by the kernel, so they're only
changed at context switch time or when changing modes. This still needs
to be optimized.
Approved by: re (amd64/* blanket)
- Move struct sigacts out of the u-area and malloc() it using the
M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
and thread_stopped() are now MP safe.
Reviewed by: arch@
Approved by: re (rwatson)
we do not have to run so long with interrupts disabled. This involved
creating tf_addr in the trapframe. Reorganize the trap stubs so that
they consistently reserve the stack space and initialize any missing
bits.
Approved by: re (amd64 stuff)
bus_dma MD code for AMD64. (And a trivial ifdef update in dev/kbd because
of this). More updates are needed here to take advantage of the 64 bit
instructions.
Approved by: re (blanket amd64/*)
value on entry and exit. This isn't as easy as it sounds because when
we recursively trap or interrupt, we have to avoid duplicating the
swapgs instruction or we end up back with the userland %gs. I implemented
this by testing TF_CS to see if we're coming from supervisor mode
already, and check for returning to supervisor. To avoid a race with
interrupts in the brief period after beginning executing the handler and
before the swapgs, convert all trap gates to interrupt gates, and reenable
interrupts immediately after the swapgs. I am not happy with this.
There are other possible ways to do this that should be investigated.
(eg: storing the GS.base MSR value in the trapframe)
Add some sysarch functions to let the userland code get to this.
Approved by: re (blanket amd64/*)
could use different versions of the math code depending on whether there
was real floating point hardware or math emulation. Since the fpu is
part of the core specification on amd64, there is no need for this here.
Approved by: re (blanket amd64/*)
should really be renamed to fpu.h and npx.c to fpu.c since its part of
the core architecture on amd64 systems, not an isa 'numeric processor
extension'.
that interrupts be disabled and remain disabled until %cr2 is read.
Otherwise we can preempt and another process can fault, and by the
time we read %cr2, we see a different processes fault address. This
Greatly Confuses vm_fault() (to say the least). The i386 port has
got this marked as a bug workaround for a Cyrix CPU, which is what
lead me astray. Its actually necessary for preemption, regardless
of whether Cyrix cpus had a bug or not.
call handler before it was safe. It was possible for to lose context
and for something else to clobber the PCPU scratch variable. This
moves the interrupt enable *way* too late, but its better safe than
sorry for the moment.
Remove DBL_DIG, DBL_MIN, DBL_MAX and their FLT_ counterparts, they
were marked for deprecation ever since SUSv1 at least.
Only define ULLONG_MIN/MAX and LLONG_MAX if long long type is
supported.
Restore a lost comment in MI _limits.h file and remove it from
sys/limits.h where it does not belong.
that were added to sparc64 and later powerpc, really should have been in
the MI area. But changing that now with insufficient preperation will
just cause too much pain.
Move MD_FETCH() to the MI sys/linker.h file to avoid another two copies
of it.
a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to
attempt to get a stable base to start from. There is a lot missing still.
Worth noting:
- The kernel runs at 1GB in order to cheat with the pmap code. pmap uses
a variation of the PAE code in order to avoid having to worry about 4
levels of page tables yet.
- It boots in 64 bit "long mode" with a tiny trampoline embedded in the
i386 loader. This simplifies locore.s greatly.
- There are still quite a few fragments of i386-specific code that have
not been translated yet, and some that I cheated and wrote dumb C
versions of (bcopy etc).
- It has both int 0x80 for syscalls (but using registers for argument
passing, as is native on the amd64 ABI), and the 'syscall' instruction
for syscalls. int 0x80 preserves all registers, 'syscall' does not.
- I have tried to minimize looking at the NetBSD code, except in a couple
of places (eg: to find which register they use to replace the trashed
%rcx register in the syscall instruction). As a result, there is not a
lot of similarity. I did look at NetBSD a few times while debugging to
get some ideas about what I might have done wrong in my first attempt.
Rename visible x86_64 references to amd64.
Kill MID_MACHINE, its a.out specific, the only platform that supports it
is i386. All of the other platforms should remove it too.
syscall return values should be cleared. The system calls
getcontext() and swapcontext() want to return 0 on success
but these contexts can be switched to at a later time so
the return values need to be cleared in the saved register
sets. Other callers of get_mcontext() would normally want
the context without clearing the return values.
Remove the i386-specific context saving from the KSE code.
get_mcontext() is not i386-specific any more.
Fix a bad pointer in the alpha get_mcontext() code. The
context was being bcopy()'d from &td->tf_frame, but tf_frame
is itself a pointer, so the thread was being copied instead.
Spotted by jake.
Glanced at by: jake
Reviewed by: bde (months ago)
to get actual constant values. This is in preparation for machine/limits.h
retirement.
Discussed on: standards@
Submitted by: Craig Rodrigues <rodrigc@attbi.com> (*)
Modified by: kan
kern_sigprocmask() in the various binary compatibility emulators.
- Replace calls to sigsuspend(), sigaltstack(), sigaction(), and
sigprocmask() that used the stackgap with calls to the corresponding
kern_sig*() functions instead without using the stackgap.
cpu_switch_load_gs, cpu is in context switch, so don't enable interrupt.
because it is in context switch, it is expected sched_lock was held,
so don't PROC_LOCK(p) and psignal, it is LOR, probably we can
set a P_XSIGBUS like flag in p_sflags, and set TDF_ASTPENDING in
td_flags, in ast(), post a SIGBUS to process if P_XSIGBUS was set.
ethernet controller. The driver has been tested with the LinkSys
USB200M adapter. I know for a fact that there are other devices out
there with this chip but don't have all the USB vendor/device IDs.
Note: I'm not sure if this will force the driver to end up in the
install kernel image or not. Special magic needs to be done to exclude
it to keep the boot floppies from bloating again, someone please
advise.
Ignoring maxsegsz may lead to fatal data corruption for some devices.
ex. SBP-2/FireWire
We should apply this change to other platforms except for sparc64.
MFC after: 1 week
enum to an int and redefine the BUS_DMASYNC_* constants as
flags. This allows us to specify several operations in one
call to bus_dmamap_sync() as in NetBSD.
by allprison_mtx), a unique prison/jail identifier field, two path
fields (pr_path for reporting and pr_root vnode instance) to store
the chroot() point of each jail.
o Add jail_attach(2) to allow a process to bind to an existing jail.
o Add change_root() to perform the chroot operation on a specified
vnode.
o Generalize change_dir() to accept a vnode, and move namei() calls
to callers of change_dir().
o Add a new sysctl (security.jail.list) which is a group of
struct xprison instances that represent a snapshot of active jails.
Reviewed by: rwatson, tjr
backend for bus_dmamap_load_mbuf and bus_dmamap_load_uio.
- Increaes MAX_BPAGES to 512. Less than this causes fxp to quickly runs out
of bounce pages.
- Add an argument to reserve_bounce_pages indicating wether this operation
should fail or be queued for later processing if we run out of memory.
The EINPROGRESS return value is not handled properly by consumers of
bus_dmamap_load_mbuf.
- If bounce buffers are required allocate minimum 1 bounce page at map
creation time. If maxsize was small previously this could get truncated
to 0 and the drivers would quickly run out of bounce pages.
- Fix a bug handling the return value of alloc_bounce_pages at map creation
time. It returns the number of pages allocated, not 0 on success.
- Use bus_addr_t for physical addresses to avoid truncation.
- Assert that the map is non-null and not the no bounce map in
add_bounce_pages.
Sponsored by: DARPA, Network Associates Laboratories
the top of the address space to be reclaimed. The problem is that with
the APTD gone the mapable kernel address space runs right to the end of
the 32 bit address space. As a max this is 0x100000000, which can't be
represented in 32 bits, so we have to use ptd entry n-1 and pte offset
n-1, instead of ptd entry n and pte offset 0. There's still 1 page we
can't use, but we gain just under 4 megs of kva (8 megs with PAE).
Sponsored by: DARPA, Network Associates Laboratories
to take care of the KAME IPv6 code which needs ovbcopy() because NetBSD's
bcopy() doesn't handle overlap like ours.
Remove all implementations of ovbcopy().
Previously, bzero was a function pointer on i386, to save a jmp to
bzero_vector. Get rid of this microoptimization as it only confuses
things, adds machine-dependent code to an MD header, and doesn't really
save all that much.
This commit does not add my pagezero() / pagecopy() code.
as it could be and can do with some more cleanup. Currently its under
options LAZY_SWITCH. What this does is avoid %cr3 reloads for short
context switches that do not involve another user process. ie: we can
take an interrupt, switch to a kthread and return to the user without
explicitly flushing the tlb. However, this isn't as exciting as it could
be, the interrupt overhead is still high and too much blocks on Giant
still. There are some debug sysctls, for stats and for an on/off switch.
The main problem with doing this has been "what if the process that you're
running on exits while we're borrowing its address space?" - in this case
we use an IPI to give it a kick when we're about to reclaim the pmap.
Its not compiled in unless you add the LAZY_SWITCH option. I want to fix a
few more things and get some more feedback before turning it on by default.
This is NOT a replacement for Bosko's lazy interrupt stuff. This was more
meant for the kthread case, while his was for interrupts. Mine helps a
little for interrupts, but his helps a lot more.
The stats are enabled with options SWTCH_OPTIM_STATS - this has been a
pseudo-option for years, I just added a bunch of stuff to it.
One non-trivial change was to select a new thread before calling
cpu_switch() in the first place. This allows us to catch the silly
case of doing a cpu_switch() to the current process. This happens
uncomfortably often. This simplifies a bit of the asm code in cpu_switch
(no longer have to call choosethread() in the middle). This has been
implemented on i386 and (thanks to jake) sparc64. The others will come
soon. This is actually seperate to the lazy switch stuff.
Glanced at by: jake, jhb
a pointer that is in user space. It will be used as the basic primitive
for a kernel supported user space lock implementation.
- Implement this function in x86's support.s
- Provide stubs that return -1 in all other architectures. Implementations
will follow along shortly.
Reviewed by: jake
a follow on commit to kern_sig.c
- signotify() now operates on a thread since unmasked pending signals are
stored in the thread.
- PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
- Change all consumers to pass in a thread.
Right now this does not cause any functional changes but it will be important
later when signals can be delivered to specific threads.
kernel opition 'options PAE'. This will only work with device drivers which
either use busdma, or are able to handle 64 bit physical addresses.
Thanks to Lanny Baron from FreeBSD Systems for the loan of a test machine
with 6 gigs of ram.
Sponsored by: DARPA, Network Associates Laboratories, FreeBSD Systems
accessing an alternate address space this causes 1 page table page at
a time to be mapped in, rather than using the recursive mapping technique
to map in an entire alternate address space. The recursive mapping
technique changes large portions of the address space and requires global
tlb flushes, which seem to cause problems when PAE is enabled. This will
also allow IPIs to be avoided when mapping in new page table pages using
the same technique as is used for pmap_copy_page and pmap_zero_page.
Sponsored by: DARPA, Network Associates Laboratories
This keeps the logical cpu's halted in the idle loop. By default
the logical cpu's are halted at startup. It is also possible to
halt any cpu in the idle loop now using machdep.hlt_cpus.
Examples of how to use this:
machdep.hlt_cpus=1 halt cpu0
machdep.hlt_cpus=2 halt cpu1
machdep.hlt_cpus=4 halt cpu2
machdep.hlt_cpus=3 halt cpu0,cpu1
Reviewed by: jhb, peter
1) Its critical for HTT. There's less foot-shooting opportunity.
2) I've seen significant improvements in interactive response to commands
over ssh sessions. I assume this is less lock contention.
3) As incentive to finish the idle cpu IPI wakeup stuff.
4) The machine on my desk was blowing hot air in my general direction
because somebody forgot to turn the hlt on, and it saves 50 watts per
cpu..
The machdep.cpu_idle_hlt sysctl is still available, but now the default
is the same as on UP kernels.
where physical addresses larger than virtual addresses, such as i386s
with PAE.
- Use this to represent physical addresses in the MI vm system and in the
i386 pmap code. This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
detection code, and due to kvtop returning vm_paddr_t instead of u_long.
Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.
Sponsored by: DARPA, Network Associates Laboratories
Discussed with: re, phk (cdevsw change)
doesn't do it. This fixes all known causes of "Context switches not
allowed in the debugger" in mi_switch(). The main cause was trap_fatal()
calling kdb_trap() with interrupts enabled. Switching to ithreads for
interrupt handling then made fatal traps more fatal and harder to debug.
The problem was limited in -current because most interrupt handlers are
blocked by Giant, but it occurred almost deterministically for me because
my clock interrupt handlers are non-fast and not blocked by Giant.
in busdma tags. There are currently no tags shared accross
different drivers so this isn't needed at the moment, but it
will be required when we'll have a proper newbus method to get
the parent busdma tag.
4 bits. This reportedly fixes booting on the SW7500CW2. Much thanks to
the submitter for tracking this down!
Submitted by: Brian Buchanan <brian@ncircle.com>
Reviewed by: peter
MFC after: 3 days
are machine dependent because they are not required to update the tlb when
mappings are added or removed, and doing so is machine dependent.
In addition, an implementation may require that pages mapped with pmap_kenter
have a backing vm_page_t, which is not necessarily true of all physical
pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the
physical address in order to make this requirement clear.
pmap_release.
- Merged pmap_release and pmap_release_free_page. When pmap_release is
called only the page directory page(s) can be left in the pmap pte object,
since all page table pages will have been freed by pmap_remove_pages and
pmap_remove. In addition, there can only be one reference to the pmap and
the page directory is wired, so the page(s) can never be busy. So all there
is to do is clear the magic mappings from the page directory and free the
page(s).
Sponsored by: DARPA, Network Associates Laboratories
The random value sometimes causes macro CLKF_USERMODE to return true
because PSL_VM bit is set and really shoudn't be, this causes statclock()
to execute in wrong path, and further breaks KSE code and kernel crashes
when executing threaded program.
check, mac_check_sysarch_ioperm(), permitting MAC security policy
modules to control access to these interfaces. Currently, they
protect access to IOPL on i386, and setting HAE on Alpha.
Additional checks might be required on other platforms to prevent
bypass of kernel security protections by unauthorized processes.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
branches:
Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.
This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.
Approved by: re(scottl)
Fixed memory leak in the "nodevice" option implementation.
Use these instead of sed(1) in MD NOTES.
Use a single makefile (sys/conf/makeLINT.mk) to generate
LINT for all architectures. (Previous versions missed
the LINT dependency on Makefile, and i386 version also
missed the dependency on ${NOTES}.)
Fixed bugs in the previous NOTES conversion using the
"nodevice" token and sed(1):
- i386 LINT lost "device pst".
- pc98 LINT lost SC_*, MAXCONS and KBD_DISABLE_KEYMAP_LOAD
options, and got needless DPT_* options.
- Added nooptions PPC_DEBUG, PPC_PROBE_CHIPSET, KBD_INSTALL_CDEV
to sparc64 LINT so that it has a chance to config(8).
This basically returns us to where we were before.
for testing and setting the current and alternate address spaces.
- Changed PTDpde and APTDpde to arrays to support multiple page directory
pages.
ponsored by: DARPA, Network Associates Laboratories
- Get rid of the useless atop() / pmap_phys_address() detour. The
device mmap handlers must now give back the physical address
without atop()'ing it.
- Don't borrow the physical address of the mapping in the returned
int. Now we properly pass a vm_offset_t * and expect it to be
filled by the mmap handler when the mapping was successful. The
mmap handler must now return 0 when successful, any other value
is considered as an error. Previously, returning -1 was the only
way to fail. This change thus accidentally fixes some devices
which were bogusly returning errno constants which would have been
considered as addresses by the device pager.
- Garbage collect the poorly named pmap_phys_address() now that it's
no longer used.
- Convert all the d_mmap_t consumers to the new API.
I'm still not sure wheter we need a __FreeBSD_version bump for this,
since and we didn't guarantee API/ABI stability until 5.1-RELEASE.
Discussed with: alc, phk, jake
Reviewed by: peter
Compile-tested on: LINT (i386), GENERIC (alpha and sparc64)
Runtime-tested on: i386
- Changed VM_MAXUSER_ADDRESS to be defined in terms of PTDPTDI. In order for
assumptions about the recursive page table map to work it must be the base
of the recursive map. Any pte offset that's not NPTEPG will break these
assumptions.
Sponsored by: DARPA, Network Associates Laboratories
instead of allocating another page of kva and mapping it in again. This was
likely an oversight in revision 1.174 (cut and paste from pmap_pinit).
Discussed with: peter, tegge
Sponsored by: DARPA, Network Associates Laboratories
page directory.
- Use these instead of the magic constants 1 or PAGE_SIZE where appropriate.
There are still numerous assumptions that the page directory is exactly
1 page.
Sponsored by: DARPA, Network Associates Laboratories
#if'ed out for a while. Complete the deed and tidy up some other bits.
We need to be able to call this stuff from outer edges of interrupt
handlers for devices that have the ISR bits in pci config space. Making
the bios code mpsafe was just too hairy. We had also stubbed it out some
time ago due to there simply being too much brokenness in too many systems.
This adds a leaf lock so that it is safe to use pci_read_config() and
pci_write_config() from interrupt handlers. We still will use pcibios
to do interrupt routing if there is no acpi.. [yes, I tested this]
Briefly glanced at by: imp
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.
Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@
o Add a MD header private to libc called _fpmath.h; this header
contains bitfield layouts of MD floating-point types.
o Add a MI header private to libc called fpmath.h; this header
contains bitfield layouts of MI floating-point types.
o Add private libc variables to lib/libc/$arch/gen/infinity.c for
storing NaN values.
o Add __double_t and __float_t to <machine/_types.h>, and provide
double_t and float_t typedefs in <math.h>.
o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF,
HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to
<math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via
<machine/float.h>.
o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based
on the size of its argument. __fpclassifyl() is never called on
alpha because (sizeof(long double) == sizeof(double)), which is good
since __fpclassifyl() can't deal with such a small `long double'.
This was developed by David Schultz and myself with input from bde and
fenner.
PR: 23103
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
(significant portions)
Reviewed by: bde, fenner (earlier versions)
Remove all the stuff that does not relate to the TSC.
Change the calibration to use DELAY(1000000) rather than trying to check
it against the CMOS RTC, this drastically increases precision:
Using 25 samples on a Athlon 700MHz UP machine I find:
stddev min max average
CMOS 22200 Hz -74980 Hz 34301 Hz 704928721 Hz
DELAY 1805 Hz -1984 Hz 2678 Hz 704937583 Hz
(The difference between the two averages is not statistically significant.)
expressed in PPM of the frequency:
stddev min max
CMOS 31.49 PPM -106.37 PPM 48.66 PPM
DELAY 2.56 PPM 2.81 PPM 3.80 PPM
This code will not be used until a followup commit to sys/isa/clock.c
and sys/pc98/pc98/clock.c which will only happen after some field testing.
uio segment is empty. In this case no dma segment is create by
bus_dmamap_load_buffer, but the calling routine clears the first flag.
Under certain combinations of addresses of the first and second mbuf/uio
buffer this leads to corrupted DMA segment descriptors. This was already
fixed by tmm in sparc64/sparc64/iommu.c.
PR: kern/47733
Reviewed by: sam
Approved by: jake (mentor)
prevent the compiler from optimizing assignments into byte-copy
operations which might make access to the individual fields non-atomic.
Use the individual fields throughout, and don't bother locking them with
Giant: it is no longer needed.
Inspired by: tjr
statclock based on profhz when profiling is enabled MD, since most platforms
don't use this anyway. This removes the need for statclock_process, whose
only purpose was to subdivide profhz, and gets the profiling clock running
outside of sched_lock on platforms that implement suswintr.
Also changed the interface for starting and stopping the profiling clock to
do just that, instead of changing the rate of statclock, since they can now
be separate.
Reviewed by: jhb, tmm
Tested on: i386, sparc64
- Use atomic subtract to update the global wired pages count. (See
also vm/vm_page.c revision 1.233.)
- Assert that the page queue lock is held in pmap_remove_entry().
I'm not convinced there is anything major wrong with the patch but
them's the rules..
I am using my "David's mentor" hat to revert this as he's
offline for a while.
counterparts to bus_dmamem_alloc() and bus_dmamem_free(). This allows
the caller to specify the size of the allocation instead of it defaulting
to the max_size field of the busdma tag.
This is intended to aid in converting drivers to busdma. Lots of
hardware cannot understand scatter/gather lists, which forces the
driver to copy the i/o buffers to a single contiguous region
before sending it to the hardware. Without these new methods, this
would require a new busdma tag for each operation, or a complex
internal allocator/cache for each driver.
Allocations greater than PAGE_SIZE are rounded up to the next
PAGE_SIZE by contigmalloc(), so this is not suitable for multiple
static allocations that would be better served by a single
fixed-length subdivided allocation.
Reviewed by: jake (sparc64)
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.
A thread owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and takes those contexts back
to userland. Any thread without upcall structure has to export their
contexts and exit at user boundary.
Any thread running in user mode owns an upcall structure, when it enters
kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in kernel, a new UPCALL thread is created and
the upcall structure is transfered to the new UPCALL thread. if the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in kernel, no UPCALL thread will be created.
Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit, when all upcalls in ksegrp are removed, the group is
atomatically shutdown. An upcall owner thread also exits when process is
in exiting state. when an owner thread exits, the upcall it owns is also
removed.
KSE is a pure scheduler entity. it represents a virtual cpu. when a thread
is running, it always has a KSE associated with it. scheduler is free to
assign a KSE to thread according thread priority, if thread priority is changed,
KSE can be moved from one thread to another.
When a ksegrp is created, there is always N KSEs created in the group. the
N is the number of physical cpu in the current system. This makes it is
possible that even an userland UTS is single CPU safe, threads in kernel still
can execute on different cpu in parallel. Userland calls kse_create to add more
upcall structures into ksegrp to increase concurrent in userland itself, kernel
is not restricted by number of upcalls userland provides.
The code hasn't been tested under SMP by author due to lack of hardware.
Reviewed by: julian
- Sort definition of cpu_* variables appropriately.
- Move cpu_fxsr out of the magic non-BSS set of variables and stick it in
the BSS along with hw_instruction_sse (make the latter static as well).
Submitted by: bde (partially)
variable to something in the cpu_* namespace since that's what all the
other cpuid variables were named and cpu_procinfo is what I came up with.
Requested by: bde
metadata. This fixes module dependency resolution by the kernel linker on
sparc64, where the relocations for the metadata are different than on other
architectures; the relative offset is in the addend of an Elf_Rela record
instead of the original value of the location being patched.
Also fix printf formats in debug code.
Submitted by: Hartmut Brandt <brandt@fokus.gmd.de>
PR: 46732
Tested on: alpha (obrien), i386, sparc64
<machine/ieeefp.h> where it belongs.
o Remove the i386 specific inclusion of <machine/floatingpoint.h>
from <ieeefp.h>, now that including <machine/ieeefp.h> is enough
for all architectures.
o Allow <machine/ieeefp.h> to inline the functions exposed by the
headers by checking for _IEEEFP_INLINED_ in the MI header. When
defined, prototypes are not given and it is assumed that the MD
headers, when inlining only a subset of the functions provide
prototypes for the functions not being inlined.
Based on patch from: Terry Lambert <tlambert2@mindspring.com>
Tested with: make release.
portable copy. Note that pmap_extract() must be used instead of
pmap_kextract().
This is precursor work to a reorganization of vmapbuf() to close remaining
user/kernel races (which can lead to a panic).
cpu_exthigh and cpu_brand in printcpuinfo() instead of in identify_cpu().
We also only do it for known-good values of cpu_vendor which is a bit more
conservative.
Reviewed by: bde (mostly)
print_AMD_foo() functions.
- Add a brand name table for the brand index provided on Intel CPU's in
%ebx after cpuid 1.
- For Intel CPUs, if we don't get a processor name from the extended cpuid
then use the brand index in cpuid_cpuinfo to pick a name from the brand
table and copy that name into cpu_brand.
- Replace the duplicated code to use the extended cpuid to replace
cpu_model with the processor name in the AMD and Transmeta sections of
printcpuinfo() with generic code that replaces cpu_model with
cpu_brand if cpu_brand is not an empty string. We also trim leading
spaces from cpu_brand prior to doing this since at least some processor
names (notably those of Intel CPUs) have leading spaces in the name.
- Give print_AMD_features() its own private regs[] array since
printcpuinfo() doesn't use the one it has anymore.
returned from cpuid 0x80000000.
- Add a cpu_brand char array to hold the processor name returned by
cpuid 0x80000002-0x80000004 on AMD, Intel, Transmeta, and possibly
other CPUs.
- Use cpuid to set cpu_exthigh and read the processor name if it is present
in identify_cpu().
in the mptable. The way this works is that we determine if the system
has hyperthreading and how many logical CPU's should be in each physical
CPU by using the information returned by cpuid. During the first pass of
the mptable, we build a bitmask of the APIC IDs of the CPUs listed in the
mptable. We then scan that bitmask to see if the CPUs are already listed
by the mptable, or if there are any APIC IDs already in use that would
conflict with the APIC IDs of the logical CPUs. If that test succeeds,
then we fixup the count of application processors. Later on during the
second pass of the mptable we create fake processor entries for logical
CPUs and add them to the system.
We only need this type of fixup hack when using the mptable to enumerate
CPUs. The ACPI MADT table properly enumerates all logical CPUs.
(show thread {address})
Remove the IDLE kse state and replace it with a change in
the way threads sahre KSEs. Every KSE now has a thread, which is
considered its "owner" however a KSE may also be lent to other
threads in the same group to allow completion of in-kernel work.
n this case the owner remains the same and the KSE will revert to the
owner when the other work has been completed.
All creations of upcalls etc. is now done from
kse_reassign() which in turn is called from mi_switch or
thread_exit(). This means that special code can be removed from
msleep() and cv_wait().
kse_release() does not leave a KSE with no thread any more but
converts the existing thread into teh KSE's owner, and sets it up
for doing an upcall. It is just inhibitted from being scheduled until
there is some reason to do an upcall.
Remove all trace of the kse_idle queue since it is no-longer needed.
"Idle" KSEs are now on the loanable queue.
of the `machdep.acpi_root' sysctl. This is required on ia64
because the root pointer hardly ever, if at all, lives in the
first MB of memory and also because scanning the first MB of
memory can cause machine checks.
This provides a save and reliable way for ACPI tools to work
with the tables if ACPI support is present in the kernel. On
ia64 ACPI is non-optional.
GENERIC. Each device can be re-enabled at startup time by unsetting the
disabled hint in the loader.
Requested by: mdodd
Approved by: re
Prodded by: rwatson
The correct range is [1...7] with Sunday=1, but we have been writing
[0...6] with Sunday=0.
The Soekris computers flagged the zero, zapped the date, so if you
rebooted your soekris on a sunday, it would come up with a wrong
date.
Bruce has a more extensive rework of this code, but we will stick with
the minimalist fix for now.
Spotted by: Soren Kristensen <soren@soekris.com>
Thanks to: Michael Sierchio <kudzu@tenebras.com>.
Confirmed by: bde
Approved by: re
to accomodate the new SSE/XMM floating point save/restore
instructions.
This commit is mostly from bde and includes some style nits.
Approved by: re (jhb)
pmap_remove_pte(). Use vm_page_sleep_if_busy() in
_pmap_unwire_pte_hold() so that the page queues lock is released
when sleeping.
Approved by: re (blanket)
to the sparc64 implementation. (Note: With modest effort on the alpha and
ia64 this function could migrate to the MI part of the kernel.)
Approved by: re (blanket)
i386 cpu_thread_exit(). This resulted in a panic with WITNESS
since we need to hold Giant to call kmem_free(), and we weren't
helding it anymore in cpu_thread_exit(). We now do this from a
new MD function, cpu_thread_dtor(), called by thread_dtor().
Approved by: re@
Suggested by: jhb
macro for use when parsing MADT tables, thus we always tried to set the
interrupt model to APIC. This proved to be harmful on UP machines with
IO APIC's (or for UP kernels on SMP machines) since the wrong interrupt
routing information would be returned.
Pointy hat to: jhb
Approved by: re (rwatson)
Previously these were libc functions but were requested to
be made into system calls for atomicity and to coalesce what
might be two entrances into the kernel (signal mask setting
and floating point trap) into one.
A few style nits and comments from bde are also included.
Tested on alpha by: gallatin
to reflect its new location, and add page queue and flag locking.
Notes: (1) alpha, i386, and ia64 had identical implementations
of pmap_collect() in terms of machine-independent interfaces;
(2) sparc64 doesn't require it; (3) powerpc had it as a TODO.
has broken int 12H.
If hw.hasbrokenint12="1" in loader environment, kernel never use BIOS
INT 12 call to determine base memory size.
Otherwise, kernel use INT 12 in old behaviour.
This should fix kernel panic problem caused by 1.544 changes.
MFC after: 1 day
sysctls to MI code; this reduces code duplication and makes all of them
available on sparc64, and the latter two on powerpc.
The semantics by the i386 and pc98 hw.availpages is slightly changed:
previously, holes between ranges of available pages would be included,
while they are excluded now. The new behaviour should be more correct
and brings i386 in line with the other architectures.
Move physmem to vm/vm_init.c, where this variable is used in MI code.
take advantage of the fact that the vm object's list of pages is
now ordered to reduce the overhead of finding the desired set of
pages to be mapped. (See revision 1.215 of vm/vm_page.c.)
remove global variable in_vm86call, set vm86 calling flag in PCB flags.
2.Fix vm86 BIOS calling preempted problem by changing vm86_lock mutex type
from MTX_DEF to MTX_SPIN. vm86pcb is not remembered in thread struct,
when the thread calling vm86 BIOS is preempted by interrupt thread,
and later switching back to the thread would cause incorrect context be
loaded into CPU registers, this leads to kernel crash.
not look like the prerequisites to fill it in properly will be in the tree
for the upcoming release, but it's mostly done, so there is no need for these
to stay around to remind us.
o It turns out that we always need to try to route the interrupts for
the case where the $PIR tells us there can be only one. Some machines
require this, while others fail when we try to do this (bogusly, imho).
Since we have no apriori way of knowing which is which, we always try to
do the routing and hope for the best if things fail.
o Add some additional comments that state the obvious, but amplify it in
non-obvious ways (judging from the questions I've gotten).
This should un-break older laptops that still have to use PCIBIOS to route
interrupts.
Tested by: sam
Use exact width types, since this is a MD file and won't be used elsewhere.
Fix a couple of resulting printf breakages
Bug found by: phk using Flexlint
handling clean and functional as 5.x evolves. This allows some of the
nasty bandaids in the 5.x codepaths to be unwound.
Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an
anti-foot-shooting measure in place, 5.x folks need this for a while) and
finish encapsulating the older stuff under COMPAT_43. Since the ancient
stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *'
to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn
is supposed to take), add a compile time check to prevent foot shooting
there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.
Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago).
Approved by: re
Try INT 15H/E820H first, then fall back to the old compatibility
method (INT 12H).
This is a workaround for newer machines which have broken INT 12H BIOS
service implementation.
Reviewed by: -current ML
MFC after: 3 days
long doubles at the moment (printf truncates them to doubles).
However, long doubles to appear to work to the ranges listed in this
commit on both -stable (4.5) and -current. There may be some slight
rounding issues with long doubles, but that's an orthogonal issue to
these constants.
I've had this in my local tree for 3 months, and in my company's local
tree for 15 months with no ill effects.
Obtained from: NetBSD
Not likely to like it: bde
so that there is ony one copy of it. Fix that one copy
so that KSEs with no mailbox in a KSE program are not a cause
of page faults (this can legitmatly happen).
Submitted by: (parts) davidxu
This is for the not-quite-ready signal/fpu abi stuff. It may not see
the light of day, but I'm certainly not going to be able to validate it
when getting shot in the foot due to syscall number conflicts.
execve_secure() system call, which permits a process to pass in a label
for a label change during exec. This permits SELinux to change the
label for the resulting exec without a race following a manual label
change on the process. Because this interface uses our general purpose
MAC label abstraction, we call it execve_mac(), and wrap our port of
SELinux's execve_secure() around it with appropriate sid mappings.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
The primary reason for this is to allow MD code to process machine
specific attributes, segments or sections in the ELF file and
update machine specific state accordingly. An immediate use of this
is in the ia64 port where unwind information is updated to allow
debugging and tracing in/across modules. Note that this commit
does not add the functionality to the ia64 port. See revision 1.9
of ia64/ia64/elf_machdep.c.
Validated on: alpha, i386, ia64
ACL configuration changes, this shouldn't result in different code paths
for file systems not explicitly configured for ACLs by the system
administrator. For UFS1, administrators must still recompile their
kernel to add support for extended attributes; for UFS2, it's sufficient
to enable ACLs using tunefs or at mount-time (tunefs preferred for
reliability reasons). UFS2, for a variety of reasons, including
performance and reliability, is the preferred file system for use with
ACLs.
Approved by: re
2. Update a comment. We now restore much more than RTC updates and
interrupts.
3. Order change. Stop interrupts by writing to RTC_STATUSB,
restore rate bits for the interrupts by writing to RTC_STATUSA,
then enable interrupts again.
This seems to be done perfectly backwards in startrtclock().
Otherwise, the idea for this change was obtained from
startrtclock().
4. Don't stop the clock (RTCB_HALT). We only program some control bits
and don't want to stop the clock.
5. (Not really related.) Add caveats to the comment about timer_restore().
The update is non-atomic since locking is not done.
On locking:
6. rtcin() and writertc() are locked() adequately by splhigh() in RELENG_4,
but this locking is null in -current.
7. Doing things in the correct order in (3) combined with (6) is probably
enough locking for rtcrestore() in RELENG_4. In -current, the
writertc()'s race with rtcintr() unless the BIOS disables RTC interrupts.
Submitted by: bde (including commit message)
MFC after: 1 week
This is most beneficial for vmware client os installs.
Reviewed by: jmallet, iedowse, tlambert2@mindspring.com
MFC After: never, -STABLE does not currently use this instruction
- Begin moving scheduler specific functionality into sched_4bsd.c
- Replace direct manipulation of scheduler data with hooks provided by the
new api.
- Remove KSE specific state modifications and single runq assumptions from
kern_switch.c
Reviewed by: -arch
that add an instance of themselves. The npx(4) driver doesn't even check
the npx 'port' hint but hardcodes IO_NPX instead. The npx(4) driver also
will use isa IRQ 13 (on x86, 8 on pc98) by default if no 'irq' hint is
specified, so we don't need that hint either.
in specific situations. The owner thread must be blocked, and the
borrower can not proceed back to user space with the borrowed KSE.
The borrower will return the KSE on the next context switch where
teh owner wants it back. This removes a lot of possible
race conditions and deadlocks. It is consceivable that the
borrower should inherit the priority of the owner too.
that's another discussion and would be simple to do.
Also, as part of this, the "preallocatd spare thread" is attached to the
thread doing a syscall rather than the KSE. This removes the need to lock
the scheduler when we want to access it, as it's now "at hand".
DDB now shows a lot mor info for threaded proceses though it may need
some optimisation to squeeze it all back into 80 chars again.
(possible JKH project)
Upcalls are now "bound" threads, but "KSE Lending" now means that
other completing syscalls can be completed using that KSE before the upcall
finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is
one of the highest priorities in the KSE system.) The upcall when it happens
will present all the completed syscalls to the KSE for selection.
there are some strange machines that seem to need this.
o delete bogus comment.
o don't use the the bios for read/writing config space. They interact badly
with SMP and being called from ISR. This brings -current in line with
-stable.
# make the latter #ifdef on USE_PCI_BIOS_FOR_READ_WRITE in case we
# need to go back in a hurry.
NB: But it will enable it in all kernels not having options "NO_GEOM"
Put the GEOM related options into the intended order.
Add "options NO_GEOM" to all kernel configs apart from NOTES.
In some order of controlled fashion, the NO_GEOM options will be
removed, architecture by architecture in the coming days.
There are currently three known issues which may force people to
need the NO_GEOM option:
boot0cfg/fdisk:
Tries to update the MBR while it is being used to control
slices. GEOM does not allow this as a direct operation.
SCSI floppy drives:
Appearantly the scsi-da driver return "EBUSY" if no media
is inserted. This is wrong, it should return ENXIO.
PC98:
It is unclear if GEOM correctly recognizes all variants of
PC98 disklabels. (Help Wanted! I have neither docs nor HW)
These issues are all being worked.
Sponsored by: DARPA & NAI Labs.
This will be removed when new versions of syscalls sigreturn()
and sigaction() are added (mini is working on this but is in
the middle of a move).
This should fix the problem of cvsupd dying.
These are still unknown name but these are working as well
as the other ServerWorks chipset.
Description strings should be corrected when the chipsets
are known.
MFC after: 1 week
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create. Passing the
value 0 prevents the alternate kstack from being created. Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.
Reviewed by: jake, peter, jhb
of 1 so that it is not probed until after acpi0 is probed and attached.
- In legacy_probe(), return ENXIO if acpi0 is around and alive.
- nexus_attach() is now much simpler and just lets its child drivers do
all the work.
and attach routines have succeeded so that if they fail we can still use
the PnP BIOS to find ISA on-board devices. The fact that we do this here
is gross but fixing it properly involves a lot more work.
- nexus no longer has PCI bridges as direct children, so the PCI bus
ivar is no longer used and is removed.
- Don't attach default EISA, ISA, or MCA busses. Instead, if we do not
have an acpi0 device after bus_generic_probe(), add a legacy0 child
device.
- Remove machine/nexusvar.h.
ACPI or for when ACPI support is disabled or not present in the kernel.
Basically, the nexus device is now split into two with some parts
(such as adding default ISA, MCA, and EISA busses if they aren't found
as well as support for PCI bus device ivars) being moved to the legacy
driver.
expand to __attribute__((packed)) and __attribute__((aligned(x)))
respectively. Replace the handful of gcc-ism's that use
__attribute__((aligned(16))) etc around the kernel with __aligned(16).
There are over 400 __attribute__((packed)) to deal with, that can come
later. I just want to use __packed in new code rather than add more
gcc-ism's.
so that it is MI. Allow nfs_mountroot to return an error if the nfs_diskless
struct is not valid, rather than panicing later on. Call nfs_setup_diskless()
from nfs_mountroot if NFS_ROOT is defined, like bootpc_init(). Removed legacy
root mount support for sparc64, and enabled NFS_ROOT by default.
function were put in i386/i386/machdep.c from where it has been
cut and pasted to other architectures with only minor corruption.
Disklabel is really a MI format in many ways, at least it certainly
is when you operate on struct disklabel.
Put bounds_check_with_label() back in subr_disklabel.c where it belongs.
Sponsored by: DARPA & NAI Labs.
MD function is just a wrapper around db_stack_trace_cmd() that prints out
a backtrace of curthread. Currently, this function is only implemented
on i386 and alpha (and the alpha version isn't quite tested yet, will do
that in a bit). Other changes:
- For i386, fix a bug in the raw frame address case. The eip we extract
from the passed in frame address does not match the frame we received.
Thus, instead of printing a bogus frame with the wrong eip, go ahead
and advance frame down to the same frame as the eip we are using.
- For alpha, attempt to add a way of doing a raw trace for alpha. Instead
of passing a frame address in 'addr', pass in a pointer to a structure
containing PC and KSP and use those to start the backtrace. The alpha
db_print_backtrace() uses asm to read in the current PC and KSP values
into such a request.
Tested on: i386
Requested by: many
This patch addresses a bug that can cause a GPF in the kernel - if a
process makes use of i386_set_ldt to install a LDT entry, then loads
a corresponding segment descriptor into %gs, forks, and if the child
execs.
In this scenario, setregs executes user_ldt_free and then determines
how to reset the %gs register:
/* reset %gs as well */
if (pcb == curpcb)
load_gs(_udatasel);
else
pcb->pcb_gs = _udatasel;
This is insufficient in the fork/exec case, since pcb will be equal
to curpcb when the child execs; load_gs will reset %gs to _udatasel
but it doesn't reset pcb->pcb_gs; upon return from the system call,
cpu_switch_load_gs will thus attempt to restore %gs from pcb->pcb_gs
and trigger a GPF since all LDT entries have already been cleared.
The fix is to always reset pcb->pcb_gs to _udatasel.
Submitted by: Christian Zander <zander@minion.de>
Reviewed by: jake
under way to move the remnants of the a.out toolchain to ports. As the
comment in src/Makefile said, this stuff is deprecated and one should not
expect this to remain beyond 4.0-REL. It has already lasted WAY beyond
that.
Notable exceptions:
gcc - I have not touched the a.out generation stuff there.
ldd/ldconfig - still have some code to interface with a.out rtld.
old as/ld/etc - I have not removed these yet, pending their move to ports.
some includes - necessary for ldd/ldconfig for now.
Tested on: i386 (extensively), alpha
- Maintain fpu state across signals.
- Use ucontext_t's to store KSE thread state.
- Synthesize state for the UTS upon each upcall, rather than
saving and copying a trapframe.
- Save and restore FPU state properly in ucontext_t's.
Reviewed by: deischen, julian
Approved by: -arch
next step is to allow > 1 to be allocated per process. This would give
multi-processor threads. (when the rest of the infrastructure is
in place)
While doing this I noticed libkvm and sys/kern/kern_proc.c:fill_kinfo_proc
are diverging more than they should.. corrective action needed soon.
to control the mapping of things like the ACPI and APM into memory.
The problem is that starting X changes these values, so if something
was using the bits of BIOS mapped into memory (say ACPI or APM),
then next time they access this memory the machine would hang.
This patch refuse to change MTRR values it doesn't understand,
unless a new "force" option is given. This means X doesn't change
them by accident but someone can override that if they really want
to.
PR: 28418
Tested by: Christopher Masto <chris@netmonger.net>,
David Bushong <david@bushong.net>,
Santos <casd@myrealbox.com>
MFC after: 1 week
to userland in the signal handler that were not being iflled out before, but
should and can be.
This part of sendsig could be slightly refactored to use an MI interface, or
ideally, *sendsig*() would have an API change to accept a siginfo_t, which
would be filled out by an MI function in the level above sendsig, and said MI
function would make a small call into MD code to fill out the MD parts (some
of which may be bogus, such as the si_addr stuff in some places). This would
eventually make it possible for parts of the kernel sending signals to set up
a siginfo with meaningful information.
Reviewed by: mux
MFC after: 2 weeks
if compiling with I686_CPU as a target. CPU_DISABLE_SSE will prevent
this from happening and will guarantee the code is not compiled in.
I am still not happy with this, but gcc is now generating code that uses
these instructions if you set CPUTYPE to p3/p4 or athlon-4/mp/xp or higher.
route interrupts if the child bus is described in the PCIBIOS interrupt
routing table. For child busses that are in the routing table, they do
not necessarily use a 'swizzle' on their pins on the parent bus to route
interrupts for child devices. If the child bus is an embedded device then
the pins on the child devices can be (and usually are) directly connected
either to a PIC or to a Interrupt Router. This fixes PCIBIOS interrupt
routing across PCI-PCI bridges for embedded devices.
IRQ for an entry in a PCIBIOS interrupt routing ($PIR) table.
- Change pci_cfgintr() to except the current IRQ of a device as a fourth
argument and to use that IRQ for the device if it is valid.
- If an intpin entry in a $PIR entry has a link of 0, it means that that
intpin isn't connected to anything that can trigger an interrupt. Thus,
test the link against 0 to find invalid entries in the table instead of
implicitly relying on the irqs field to be zero. In the machines I have
looked at, intpin entries with a link of 0 often have the bits for all
possible interrupts for PCI devices set.
not the 'entry' member. The entry point is formed from both a base and
a relative entry point. 'entry' is that relative offset. It is perfectly
valid to have an entry point with a relative offset of 0. PCIbios.ventry
is the virtual address of the entry point that takes both 'base' and
'entry' into account, thus it is the proper variable to test to see if we
have an entry point or not.
lnc(4) will attach to AMD PCnet/FAST NICs if pcn(4) does not attach.
I.e. pcn(4) gets first chance. There is a problem however in that pcn(4)
was moved out of the install kernel so that the module would be used.
This however causes bad installs if one has an AMD PCnet/FAST NIC.
sysentvec. Initialized all fields of all sysentvecs, which will allow
them to be used instead of constants in more places. Provided stack
fixup routines for emulations that previously used the default.
Don't attempt to follow null pointers for zombie processes in db_ps().
Style fix: use explicit an comparison with NULL for all null pointer
checks in db_ps() instead of for half of them.
db_interface.c:
Fixed ddb's handling of traps from with ddb on i386's only.
This was mostly fixed in rev.1.27 (by longjmp()'ing back to the top
level) but was completly broken in rev.1.48 (by not unwinding the new
state (mainly db_active) either before or after the longjmp(). This
mostly never worked for other arches, since rev.1.27 has not been ported
and lower level longjmp()'s only handle traps for memory accesses. All
cases should be handled at a lower level to provided better control and
simplify unwinding of state.
Implementation details: don't pretend to maintain db_active in a nested
way -- ddb cannot be reentered in a nested way. Use db_active instead
of the db_global_jmpbuf_valid flag and longjmp()'s return value for things
related to reentering ddb. [re]entering is still not atomic enough.
in the original hardwired sysctl implementation.
The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int. Use
'long' there.
Change Maxmem and physmem and related variables to 'long', mostly for
completeness. Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody. This
comes for free on 32 bit machines, so why not?
alive!" message right as the scsi probe messages happen. This is a bit
nasty, but it seems to work. At the point that we unlock the AP's, briefly
wait till they are all done while we hold the console on their behalf.
These types are unlikely to ever become very MD. They include:
clockid_t, ct_rune_t, fflags_t, intrmask_t, mbstate_t, off_t, pid_t,
rune_t, socklen_t, timer_t, wchar_t, and wint_t.
While moving them, make a few adjustments (submitted by bde):
o __ct_rune_t needs to be precisely `int', not necessarily __int32_t,
since the arg type of the ctype functions is int.
o __rune_t, __wchar_t and __wint_t inherit this via a typedef of
__ct_rune_t.
o Some minor wording changes in the comment blocks for ct_rune_t and
mbstate_t.
Submitted by: bde (partially)
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif
Concept by: bde
Reviewed by: jake, obrien