Use it universally. Book-E traps may also need revisiting due to the
introduction of fixed-offset traps and the deprecation of IVORs in POWER
ISA 2.06, but that's very much an issue for another day.
platform modules. Whether to call this function or not is highly machine
dependent: on some systems, it is required, while on others it breaks
everything. Platform modules are in a better position to figure this
out. This is required for POWER hypervisor SCSI to work correctly. There
are no functional changes on Powermac systems.
Approved by: re (kib)
making sure they are all misaligned at +8 bytes. This fixes clang builds
of powerpc64 kernels (aside from a required increase in KSTACK_PAGES which
will come later).
This commit from FreeBSD/powerpc64 with a clang-built kernel.
MFC after: 2 weeks
Issues were noted by Bruce Evans and are present on all architectures.
On i386, a counter fetch should use atomic read of 64bit value,
otherwise carry from the increment on other CPU could be lost for the
given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not
available on the machine, it cannot be SMP and it is enough to disable
preemption around read to avoid the split read.
On x86 the counter increment is not atomic on purpose, which makes it
possible for the store of the incremented result to override just
zeroed per-cpu slot. The effect would be a counter going off by
arbitrary value after zeroing. Perform the counter zeroing on the
same processor which does the increments, making the operations
mutually exclusive. On i386, same as for the fetching, if the
cmpxchg8b is not available, machine is not SMP and we disable
preemption for zeroing.
PowerPC64 is treated the same as amd64.
For other architectures, the changes made to allow the compilation to
succeed, without fixing the issues with zeroing or fetching. It
should be possible to handle them by using the 64bit loads and stores
atomic WRT preemption (assuming the architectures also converted from
using critical sections to proper asm). If architecture does not
provide the facility, using global (spin) mutex would be non-optimal
but working solution.
Noted by: bde
Sponsored by: The FreeBSD Foundation
order to match the MAXCPU concept. The change should also be useful
for consolidation and consistency.
Sponsored by: EMC / Isilon storage division
Obtained from: jeff
Reviewed by: alc
as CTASSERT in MI pcpu.h, stop including all possible mutually exclusive
PCPU_MD_FIELDS fields into LINT kernels, due to brekaing
aforementioned CTASSERT.
Introduce counter(9) API, that implements fast and raceless counters,
provided (but not limited to) for gathering of statistical data.
See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
for more details.
In collaboration with: kib
Reviewed by: luigi
Tested by: ae, ray
Sponsored by: Nginx, Inc.
This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the
ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an
unsigned short with the former preferred.
Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.
Discussed with: bde
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.
The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.
The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.
Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.
Minimal stubs neccessary for non-x86 architectures to still compile
are provided.
Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month
layer, but it is read directly by the MI VM layer. This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.
Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.
As an added bonus, tidy up some nearby comments concerning page flags.
Reviewed by: kib
MFC after: 6 weeks
implementation specific vs. the common architecture definition.
Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under
BOOKE_PPC4XX are not used in the code yet.
This change set is not supposed to affect existing E500 support, it's just
another reorg step before bringing support for E500mc, E5500 and PPC465.
Obtained from: AppliedMicro, Freescale, Semihalf
in_cksum.h required ip.h to be included for struct ip. To be
able to use some general checksum functions like in_addword()
in a non-IPv4 context, limit the (also exported to user space)
IPv4 specific functions to the times, when the ip.h header is
present and IPVERSION is defined (to 4).
We should consider more general checksum (updating) functions
to also allow easier incremental checksum updates in the L3/4
stack and firewalls, as well as ponder further requirements by
certain NIC drivers needing slightly different pseudo values
in offloading cases. Thinking in terms of a better "library".
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC After: 3 days
exceptions early enough during boot that the kernel will do ithe same.
Use lwsync only when compiling for LP64 and revert to the more proven isync
when compiling for ILP32. Note that in the end (i.e. between revision 222198
and this change) ILP32 changed from using sync to using isync. As per Nathan
the isync is needed to make sure I/O accesses are properly serialized with
locks and isync tends to be more effecient than sync.
While here, undefine __ATOMIC_ACQ and __ATOMIC_REL at the end of the file
so as not to leak their definitions.
Discussed with: nwhitehorn
range operations like pmap_remove() and pmap_protect() as well as allowing
simple operations like pmap_extract() not to involve any global state.
This substantially reduces lock coverages for the global table lock and
improves concurrency.
- Use isync/lwsync unconditionally for acquire/release. Use of isync
guarantees a complete memory barrier, which is important for serialization
of bus space accesses with mutexes on multi-processor systems.
- Go back to using sync as the I/O memory barrier, which solves the same
problem as above with respect to mutex release using lwsync, while not
penalizing non-I/O operations like a return to sync on the atomic release
operations would.
- Place an acquisition barrier around thread lock acquisition in
cpu_switchin().
not provide general barriers, but only barriers in the context of the
atomic sequences here. As such, make them private and keep the global
*mb() routines using a variant of sync.
isync to implement read and write barriers, following Appendix B.2 of
Book II of the architecture manual. This provides a 25% speed increase
to fork() on the PowerPC G4.
of sync (lwsync is an alternate encoding of sync on systems that do not
support it, providing graceful fallback). This provides more than an order
of magnitude reduction in the time required to acquire or release a mutex.
MFC after: 2 months
sync performs a strict superset of the functions of eieio, so using both
is redundant. While here, expand bus barriers to all bus_space operations,
since many drivers do not correctly use bus_space_barrier().
In principle, we can also replace sync just with eieio, for a significant
performance increase, but it remains to be seen whether any poorly-written
drivers currently depend on the side effects of sync to properly function.
MFC after: 1 week
(slightly) different semantics and renaming it prevents a (harmless)
WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs.
MFC after: 5 days
be less ambiguous and more clearly identify what it means. This
attribute is what Intel refers to as UC-, and it's only difference
relative to normal UC memory is that a WC MTRR will override a UC-
PAT entry causing the memory to be treated as WC, whereas a UC PAT
entry will always override the MTRR.
- Remove the VM_MEMATTR_UNCACHED alias from powerpc.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
didn't already have them. This is because the ternary expression will
return int, due to the Usual Arithmetic Conversions. Such casts are not
needed for the 32 and 64 bit variants.
While here, add additional parentheses around the x86 variant, to
protect against unintended consequences.
MFC after: 2 weeks
Without this patch we were not able to see the assembly function.
Only the function descriptor was visible.
- Distinguish between user-land and kernel when creating the ENTRY() point of
assembly source.
- Make the ENTRY() macro more readable, replace the .align directive with the
gas platform independant .p2align directive.
- Create an END()macro for later use to provide traceback tables on powerpc64.
profiling and kernel profiling. To enable kernel profiling one has to build
kgmon(8). I will enable the build once I managed to build and test powerpc
(32-bit) kernels with profiling support.
- add a powerpc64 PROF_PROLOGUE for _mcount.
- add macros to avoid adding the PROF_PROLOGUE in certain assembly entries.
- apply these macros where needed.
- add size information to the MCOUNT function.
MFC after: 3 weeks, together with r230291
that the compiler promotes floats to double precision in computations, but
inspection of the output of a cross-compiler indicates that this isn't the
case on powerpc.
possible, and double faults within an SLB trap handler are not. The result
is that it possible to take an SLB fault at any time, on any address, for
any reason, at any point in the kernel.
This lets us do two important things. First, it removes the (soft) 16 GB RAM
ceiling on PPC64 as well as any architectural limitations on KVA space.
Second, it lets the kernel tolerate poorly designed hypervisors that
have a tendency to fail to restore the SLB properly after a hypervisor
context switch.
MFC after: 6 weeks
AIM systems to 4 GB on 32-bit systems and 2^64 bytes on 64-bit systems.
VM_MAXUSER_ADDRESS remains at 2 GB on pending Book-E, pending review of
an increase to 3 GB by those more familiar with Book-E.
pmap_remove() for large sparse requests. This can prevent pmap_remove()
operations on 64-bit process destruction or swapout that would take
several hundred times the lifetime of the universe to complete. This
behavior is largely indistinguishable from a hang.
implement a deprecated FPU control interface in addition to the
standard one. To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.
Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware. Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.
to VPO_UNMANAGED (and also making the flag protected by the vm object
lock, instead of vm page queue lock).
- Mark the fake pages with both PG_FICTITIOUS (as it is now) and
VPO_UNMANAGED. As a consequence, pmap code now can use use just
VPO_UNMANAGED to decide whether the page is unmanaged.
Reviewed by: alc
Tested by: pho (x86, previous version), marius (sparc64),
marcel (arm, ia64, powerpc), ray (mips)
Sponsored by: The FreeBSD Foundation
Approved by: re (bz)
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.
No MFC is previewed for this patch.
Tested by: pluknet
Approved by: re (kib)
instead of a PCPU field for curthread. This averts a race on SMP systems
with a high interrupt rate where the thread looking up the value of
curthread could be preempted and migrated between obtaining the PCPU
pointer and reading the value of pc_curthread, resulting in curthread being
observed to be the current thread on the thread's original CPU. This played
merry havoc with the system, in particular with mutexes. Many thanks to
jhb for helping me work this one out.
Note that Book-E is in principle susceptible to the same problem, but has
not been modified yet due to lack of Book-E hardware.
MFC after: 2 weeks
Renovate and improve the AIM Open Firmware support:
- Add RTAS (Run-Time Abstraction Services) support, found on all IBM systems
and some Apple ones
- Improve support for 32-bit real mode Open Firmware systems
- Pull some more OF bits over from the AIM directory
- Fix memory detection on IBM LPARs and systems with more than one /memory
node (by andreast@)
o In bare_probe(), change the logic that determines the maximum
number of processors/cores into a switch statement and take
advantage of the fact that bit 3 of the SVR value indicates
whether we're running on a security enabled version. Since we
don't care about that here, mask the bit. All -E versions
are taken care of automatically.
Rewrite atomic operations for powerpc in order to achieve the following:
- Produce a type-clean implementation (in terms of functions arguments
and returned values) for the primitives.
- Fix errors with _long() atomics where they ended up with the wrong
arguments to be accepted.
- Follow the sys/type.h specifics that define the numbered types starting
from standard C types.
- Let _ptr() version to not auto-magically cast arguments, but leave
the burden on callers, as _ptr() atomic is intended to be used
relatively rarely.
Fix cfi in order to support the latest point.
In collabouration with: bde
Tested by: andreast, nwhitehorn, jceel
MFC after: 2 weeks
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk. Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.
Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file. (Darn the unfamiliar development
environment).
Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.
Requested by: alc
MFC after: 1 week
MFC with: r221853
starting from base C types (int, long, etc).
That is also reflected when building atomic operations, as the
size-bounded types are built from the base C types.
However, powerpc does the inverse thing, leading to a serie of nasty
bugs.
Cleanup the atomic implementation by defining as base the base C type
version and depending on them, appropriately.
Tested by: jceel
already supported nested PICs, but was limited to having a nested
AT-PIC only. With G5 support the need for nested OpenPIC controllers
needed to be added. This was done the wrong way and broke the MPC8555
eval system in the process.
OFW, as well as FDT, describe the interrupt routing in terms of a
controller and an interrupt pin on it. This needs to be mapped to a
flat and global resource: the IRQ. The IRQ is the same as the PCI
intline and as such needs to be representable in 8 bits. Secondly,
ISA support pretty much dictates that IRQ 0-15 should be reserved
for ISA interrupts, because of the internal workins of south bridges.
Both were broken.
This change reverts revision 209298 for a big part and re-implements
it simpler. In particular:
o The id() method of the PIC I/F is removed again. It's not needed.
o The openpic_attach() function has been changed to take the OFW
or FDT phandle of the controller as a second argument. All bus
attachments that previously used openpic_attach() as the attach
method of the device I/F now implement as bus-specific method
and pass the phandle_t to the renamed openpic_attach().
o Change powerpc_register_pic() to take a few more arguments. In
particular:
- Pass the number of IPIs specificly. The number of IRQs carved
out for a PIC is the sum of the number of int. pins and IPIs.
- Pass a flag indicating whether the PIC is an AT-PIC or not.
This tells the interrupt framework whether to assign IRQ 0-15
or some other range.
o Until we implement proper multi-pass bus enumeration, we have to
handle the case where we need to map from PIC+pin to IRQ *before*
the PIC gets registered. This is done in a similar way as before,
but rather than carving out 256 IRQs per PIC, we carve out 128
IRQs (124 pins + 4 IPIs). This is supposed to handle the G5 case,
but should really be fixed properly using multiple passes.
o Have the interrupt framework set root_pic in most cases and not
put that burden in PIC drivers (for the most part).
o Remove powerpc_ign_lookup() and replace it with powerpc_get_irq().
Remove IGN_SHIFT, INTR_INTLINE and INTR_IGN.
Related to the above, fix the Freescale PCI controller driver, broken
by the FDT code. Besides not attaching properly, bus numbers were
assigned improperly and enumeration was broken in general. This
prevented the AT PIC from being discovered and interrupt routing to
work properly. Consequently, the ata(4) controller stopped functioning.
Fix the driver, and FDT PCI support, enough to get the MPC8555CDS
going again. The FDT PCI code needs a whole lot more work.
No breakages are expected, but lackiong G5 hardware, it's possible
that there are unpleasant side-effects. At least MPC85xx support is
back to where it was 7 months ago -- it's amazing how badly support
can be broken in just 7 months...
Sponsored by: Juniper Networks
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
mem.c for all platforms. Add missing #include guards for machine/memdev.h
and sys/memrange.h. Clean up some nearby style(9) nits.
MFC after: 1 month
and pointers don't always have the same size, e.g. the __mips_n32 ABI
(ILP32) has 64 bit registers but 32 bit pointers.
On mips introduce PRIptr to fix the format specifier for (u)intptr_t.
Prefix PRI64 and PRIptr with underscores because macro names starting with
PRI[a-zX] are reserved for future use.
Approved by: kib (mentor)
architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and
corresponding macros) are different from 32 bit. [1]
Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.
Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition
for (u)intmax_t. Do this on all architectures for consistency.
Suggested by: bde [1]
Approved by: kib (mentor)
of (unsigned) int __attribute__((__mode__(__DI__))). This aligns better
with macros such as (U)INT64_C, (U)INT64_MAX, etc. which assume (u)int64_t
has type (unsigned) long long.
The mode attribute was used because long long wasn't standardised until
C99. Nowadays compilers should support long long and use of the mode
attribute is discouraged according to GCC Internals documentation.
The type definition has to be marked with __extension__ to support
compilation with "-std=c89 -pedantic".
Discussed with: bde
Approved by: kib (mentor)
On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
However, lacking integer suffixes for types smaller than int, their type
should correspond to that of an object of type unsigned char (or short)
when used in an expression with objects of type int. In that case unsigned
char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and
USHRT_MAX should also be int.
Where MIN/MAX constants implicitly have the correct type the suffix has
been removed.
While here, correct some comments.
Reviewed by: bde
Approved by: kib (mentor)
It was used mainly to discover and fix some 64-bit portability problems
before 64-bit arches were widely available.
Discussed with: bde
Approved by: kib (mentor)
available on firmwares 3.15 and earlier.
Caveats: Support for the internal SATA controller is currently missing,
as is support for framebuffer resolutions other than 720x480. These
deficiencies will be remedied soon.
Special thanks to Peter Grehan for providing the hardware that made this
port possible, and thanks to Geoff Levand of Sony Computer Entertainment
for advice on the LV1 hypervisor.
logic to support modifying the page table through a hypervisor. This
uses KOBJ inheritance to provide subclasses of the base 64-bit AIM MMU
class with additional methods for page table manipulation.
Many thanks to Peter Grehan for suggesting this design and implementing
the MMU KOBJ inheritance mechanism.
Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
causes a crash/hang since the 'loop' instruction decrements the counter
before checking if it's zero.
PR: kern/80980
Discussed with: jhb
byte-swapped versions of compile-time constants. This allows use of
bswap() and htole*() in initializers, which is required to cross-build
btxld.
Obtained from: sparc64
hypervisor infrastructure support:
- Fix coexistence of multiple platform modules in the same kernel
- Allow platform modules to provide an SMP topology
- PowerPC hypervisors limit the amount of memory accessible in real mode.
Allow the platform modules to specify the maximum real-mode address,
and modify the bits of the kernel that need to allocate
real-mode-accessible buffers to respect this limits.
routines.
This unbreaks Book-E build after the recent machine/mutex.h removal.
While there move tlb_*lock() prototypes to machine/tlb.h.
Submitted by: jhb
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
in the _mtx_* or __mtx_* namespaces. While here, change the names to more
closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.
Suggested by: bde (1, 2)
concurrency bug. Since all SLB/SR entries were invalidated during an
exception, a decrementer exception could cause the user segment to be
invalidated during a copyin()/copyout() without a thread switch that
would cause it to be restored from the PCB, potentially causing the
operation to continue on invalid memory. This is now handled by explicit
restoration of segment 12 from the PCB on 32-bit systems and a check in
the Data Segment Exception handler on 64-bit.
While here, cause copyin()/copyout() to check whether the requested
user segment is already installed, saving some pipeline flushes, and
fix the synchronization primitives around the mtsr and slbmte
instructions to prevent accessing stale segments.
MFC after: 2 weeks
values to zero. A correct solution would involve emulating vector
operations on denormalized values, but this has little effect on accuracy
and is much less complicated for now.
MFC after: 2 weeks
Unlike actual MTRR, this only controls the mapping attributes for
subsequent mmap() of /dev/mem. Nonetheless, the support is sufficiently
MTRR-like that Xorg can use it, which translates into an enormous increase
in graphics performance on PowerPC.
MFC after: 2 weeks
which are similar to the previous ones, and one for user maps, which
are arrays of pointers into the SLB tree. This changes makes user SLB
updates atomic, closing a window for memory corruption. While here,
rearrange the allocation functions to make context switches faster.
hardware with a lockless sparse tree design. This marginally improves
the performance of PMAP and allows copyin()/copyout() to run without
acquiring locks when used on wired mappings.
Submitted by: mdf
include/mmuvar.h - Change the MMU_DEF macro to also create the class
definition as well as define the DATA_SET. Add a macro, MMU_DEF_INHERIT,
which has an extra parameter specifying the MMU class to inherit methods
from. Update the comments at the start of the header file to describe the
new macros.
booke/pmap.c
aim/mmu_oea.c
aim/mmu_oea64.c - Collapse mmu_def_t declaration into updated MMU_DEF macro
The MMU_DEF_INHERIT macro will be used in the PS3 MMU implementation to
allow it to inherit the stock powerpc64 MMU methods.
Reviewed by: nwhitehorn
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.
There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.
As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.
Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.
A make buildkernel -j4 uses ~360% CPU.
- Bracket the AP spinup printf with a mutex to avoid garbled output.
- Enable SMP by default on powerpc64.
Reviewed by: nwhitehorn
the existing code was very platform specific, and broken for SMP systems
trying to reboot from KDB.
- Add a new PLATFORM_RESET() method to the platform KOBJ interface, and
migrate existing reset functions into platform modules.
- Modify the OF_reboot() routine to submit the request by hand to avoid
the IPIs involved in the regular openfirmware() routine. This fixes
reboot from KDB on SMP machines.
- Move non-KDB reset and poweroff functions on the Powermac platform
into the relevant power control drivers (cuda, pmu, smu), instead of
using them through the Open Firmware backdoor.
- Rename platform_chrp to platform_powermac since it has become
increasingly Powermac specific. When we gain support for IBM systems,
we will grow a new platform_chrp.
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.
Tested by: marius (sparc64)
MFC after: 1 month
IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that
constructed a mask for a single CPU with calls to ipi_cpu() instead. This
will matter more in the future when we transition from cpumask_t to
cpuset_t for CPU masks in which case building a CPU mask is more expensive.
Submitted by: peter, sbruno
Reviewed by: rookie
Obtained from: Yahoo! (x86)
MFC after: 1 month
now it uses a very dumb first-touch allocation policy. This will change in
the future.
- Each architecture indicates the maximum number of supported memory domains
via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
- Each cpu now has a PCPU_GET(domain) member to indicate the memory domain
a CPU belongs to. Domain values are dense and numbered from 0.
- When a platform supports multiple domains, the default freelist
(VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain.
The MD code is required to populate an array of mem_affinity structures.
Each entry in the array defines a range of memory (start and end) and a
domain for the range. Multiple entries may be present for a single
domain. The list is terminated by an entry where all fields are zero.
This array of structures is used to split up phys_avail[] regions that
fall in VM_FREELIST_DEFAULT into per-domain freelists.
- Each memory domain has a separate lookup-array of freelists that is
used when fulfulling a physical memory allocation. Right now the
per-domain freelists are listed in a round-robin order for each domain.
In the future a table such as the ACPI SLIT table may be used to order
the per-domain lookup lists based on the penalty for each memory domain
relative to a specific domain. The lookup lists may be examined via a
new vm.phys.lookup_lists sysctl.
- The first-touch policy is implemented by using PCPU_GET(domain) to
pick a lookup list when allocating memory.
Reviewed by: alc
name of 32bit sibling architecture instead of the host one. Do the
same for hw.machine on amd64.
Add a safety belt debug.adaptive_machine_arch sysctl, to turn the
substitution off.
Reviewed by: jhb, nwhitehorn
MFC after: 2 weeks
Kernel sources for 64-bit PowerPC, along with build-system changes to keep
32-bit kernels compiling (build system changes for 64-bit kernels are
coming later). Existing 32-bit PowerPC kernel configurations must be
updated after this change to specify their architecture.