- Remove now unnecessary MACHINE_ARCHES definition. The default logic
in kern_mib.c works fine now for RISC-V.
- Remove custom sv_machine_arch hook from sysentvec.
Fixes: 1ca12bd927 Remove the riscv64sf architecture.
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D40648
This avoids bloating the kernel image when MAXCPU is large.
A follow-up patch for kgdb and other kernel debuggers is needed since
the stoppcbs symbol is now a pointer. Bump __FreeBSD_version so that
debuggers can use osreldate to figure out how to handle stoppcbs.
PR: 269572
MFC after: never
Reviewed by: mjg, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39806
There are now several Supervisor-mode extensions that have entered the
'ratified' status, so begin parsing and reporting a few of these.
Recognize the following extensions:
- Sstc: stimecmp/vstimecmp CSR
- Svnapot: NAPOT* translation contiguity
- Svpbmt: page-based memory types
- Svinval: fine-grained TLB invalidation instructions
- Sscofpmf: performance counter overflow
*i.e. "naturally aligned power-of-2" page granularity
For now, provide globals for Sstc and Sscofpmf, as we will make use of
these in the near future.
Plus, update the copyright statement after my recent work on this file.
Reviewed by: jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40240
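A minimal sketch of how the recognition step could be table-driven, mapping
a multi-letter extension name to a feature flag. The flag names, the table,
and lookup_s_ext() are hypothetical and chosen for illustration; they are not
the kernel's actual identifiers.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits; the kernel's real names may differ. */
#define	EXT_SSTC	(1u << 0)
#define	EXT_SVNAPOT	(1u << 1)
#define	EXT_SVPBMT	(1u << 2)
#define	EXT_SVINVAL	(1u << 3)
#define	EXT_SSCOFPMF	(1u << 4)

struct s_ext {
	const char	*name;
	unsigned int	flag;
};

static const struct s_ext s_exts[] = {
	{ "sstc",	EXT_SSTC },
	{ "svnapot",	EXT_SVNAPOT },
	{ "svpbmt",	EXT_SVPBMT },
	{ "svinval",	EXT_SVINVAL },
	{ "sscofpmf",	EXT_SSCOFPMF },
};

/*
 * Return the flag for the extension name of the given length, or 0 if
 * the name is not in the table. The comparison ignores case.
 */
unsigned int
lookup_s_ext(const char *name, size_t len)
{
	size_t i, j;

	for (i = 0; i < sizeof(s_exts) / sizeof(s_exts[0]); i++) {
		if (strlen(s_exts[i].name) != len)
			continue;
		for (j = 0; j < len; j++) {
			if (tolower((unsigned char)name[j]) != s_exts[i].name[j])
				break;
		}
		if (j == len)
			return (s_exts[i].flag);
	}
	return (0);
}
```

Unknown names return 0, so a caller can OR the result into a feature mask
without special-casing extensions it does not track.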
Use PMAP_MAPDEV_EARLY_SIZE instead of assuming that its value is always
L2_SIZE. Add compile-time assertions to check that the size matches the
expectations in locore.
Reviewed by: mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D40110
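The kind of compile-time check described above might look like the sketch
below. The numeric values, LOCORE_DEVMAP_L2_ENTRIES, and early_devmap_end()
are assumptions made for illustration only; the real definitions live in the
riscv pmap headers and locore.

```c
#include <stdint.h>

/* Illustrative values; assumed, not copied from the kernel headers. */
#define	L2_SHIFT		21
#define	L2_SIZE			(UINT64_C(1) << L2_SHIFT)
#define	PMAP_MAPDEV_EARLY_SIZE	L2_SIZE

/* Hypothetical: number of L2 entries the early boot code sets aside. */
#define	LOCORE_DEVMAP_L2_ENTRIES	1

/* Fail the build if the size stops matching what early boot expects. */
_Static_assert(PMAP_MAPDEV_EARLY_SIZE % L2_SIZE == 0,
    "early devmap size must be a multiple of L2_SIZE");
_Static_assert(PMAP_MAPDEV_EARLY_SIZE / L2_SIZE == LOCORE_DEVMAP_L2_ENTRIES,
    "early devmap size must match the entries reserved in locore");

/* Compute the end of the early devmap region from its start address. */
uint64_t
early_devmap_end(uint64_t start)
{
	return (start + PMAP_MAPDEV_EARLY_SIZE);
}
```

Code that uses the named constant keeps working if the size changes, and the
assertions catch a mismatch at build time instead of at boot.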
Detect and report the supported MMU for each CPU. Export the
capabilities to the rest of the kernel and use it in pmap_bootstrap() to
check for Sv48 support.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39814
Modify when and how we perform parsing and reporting. Most notably,
everything now executes on CPU 0.
The de-facto standard way to enumerate CPU features (ISA extensions) on
RISC-V is by parsing each CPU's ISA string. We currently obtain this
information from the device tree, and in the future will be able to pull
it from ACPI tables.
Eliminate the SYSINIT from identcpu.c. We still need to walk the /cpus
list in the device tree, but now do this one CPU at a time, as a step in
the identify_cpu() procedure. This is slightly less error prone, and
allows us to parse ISA features for CPU 0 much earlier.
Make use of the SMP hooks cpu_mp_start() and cpu_mp_announce() to
identify and print secondary CPU info, respectively. This causes
secondary processor identification to be printed much earlier in boot;
everything is done by SI_SUB_CPU, SI_ORDER_THIRD. Adjust some other
printf() calls so that we get enough useful info to debug under
bootverbose.
Reviewed by: markj (slightly earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39811
It is advantageous to have knowledge of ISA features as early as
possible. For example, the presence of newer virtual memory extensions
may be useful to pmap_bootstrap().
To achieve this, split out the printf() parts of identify_cpu() into a
separate function, printcpuinfo(). This latter function will be called
later in boot after the console has been initialized.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39810
Make better use of the RISC-V identification CSRs: mvendorid, marchid,
and mimpid. This code was written before these registers were
well-specified, or even available to the kernel. It currently fails to
recognize any CPU or platform.
Per the privileged specification, mvendorid contains the JEDEC vendor ID,
or zero.
The marchid register denotes the CPU microarchitecture. This is either
one of the globally allocated open-source implementation IDs, or the
field has a custom encoding. Therefore, for known vendors (SiFive) we
can also maintain a list of known marchid values. If we cannot give a
name to the CPU but marchid is non-zero, then just print its value in
the report.
The mimpid (implementation ID) could be used in the future to more
uniquely identify the micro-architecture, but it really remains to be
seen how it gets used. For now we just print its value.
Thank you to Danjel Qyteza <danq1222@gmail.com> who submitted an early
version of this change to me, although it has been almost entirely
rewritten.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39809
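As a hedged illustration of the decoding described above: the privileged
specification splits mvendorid into a JEDEC bank count (upper bits) and an
offset (low 7 bits), and uses the most significant bit of marchid to
distinguish commercial from open-source IDs. The function names, the enum,
and the one-entry vendor table below are hypothetical, and marchid is
assumed to be 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* mvendorid layout: continuation-code count above bit 6, offset below. */
#define	MVENDORID_BANK(v)	((uint32_t)(v) >> 7)
#define	MVENDORID_OFFSET(v)	((uint32_t)(v) & 0x7f)

#define	MVENDORID_SIFIVE	0x489

enum marchid_kind {
	MARCHID_UNIMPL,		/* register reads as zero */
	MARCHID_OPENSOURCE,	/* MSB clear: globally allocated ID */
	MARCHID_COMMERCIAL	/* MSB set: vendor-defined encoding */
};

/* Classify a 64-bit marchid value by its most significant bit. */
enum marchid_kind
classify_marchid(uint64_t marchid)
{
	if (marchid == 0)
		return (MARCHID_UNIMPL);
	return ((marchid >> 63) != 0 ? MARCHID_COMMERCIAL :
	    MARCHID_OPENSOURCE);
}

/* Return a vendor name, or NULL so the caller can print the raw value. */
const char *
vendor_name(uint32_t mvendorid)
{
	switch (mvendorid) {
	case 0:
		return ("Unspecified");
	case MVENDORID_SIFIVE:
		return ("SiFive");
	default:
		return (NULL);
	}
}
```

Returning NULL for an unknown vendor mirrors the fallback in the text: when
no name is known, the report prints the numeric value instead.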
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
This existing helper function is preferable to the hand-rolled
calculation of the kstack bounds.
Make some small style improvements while here. Notably, rename every
instance of "r", the return address, to "ra". Tidy the includes in the
affected files.
Reviewed by: jkoshy
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39909
When vm_map_remove() is called from vm_swapout_map_deactivate_pages()
due to swapout, PKRU attributes for the removed range must be kept
intact. Provide a variant of pmap_remove(), pmap_map_delete(), to
allow pmap to distinguish between real removes of the UVA mappings
and any other internal removes, e.g. swapout.
For non-amd64, pmap_map_delete() is stubbed by define to pmap_remove().
Reported by: andrew
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39556
sig_atomic_t is defined as a long and thus is 64-bit on arm64. For some
reason its limit was incorrectly specified as a 32-bit number. This had
the unfortunate side effect of causing gnulib to override most of the
definitions in stdint.h. On CheriBSD this breaks all software that uses
gnulib in annoying and hard to debug ways.
Technically updating the limits might be an ABI change, but these
defines are largely unused (the only use in tree is in the libc++ test
suite, where it is used in an assertion that will fail due to this bug).
Further, since the underlying type remains the same, we're just
increasing the range of values a paranoid program might use.
Reviewed by: emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D39194
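The mismatch can be illustrated with a small consistency check between a
signed type's size and its declared maximum. This is a sketch only; both
function names are hypothetical, and the code assumes a two's-complement
type no wider than intmax_t.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Largest value representable by a signed integer type of 'size' bytes.
 * Assumes 1 <= size <= sizeof(intmax_t).
 */
intmax_t
signed_max_for_size(size_t size)
{
	unsigned int unused_bits;

	unused_bits = (unsigned int)((sizeof(uintmax_t) - size) * CHAR_BIT);
	return ((intmax_t)(UINTMAX_MAX >> (unused_bits + 1)));
}

/* True if a declared limit agrees with the width of the underlying type. */
bool
limit_matches_type(size_t size, intmax_t declared_max)
{
	return (declared_max == signed_max_for_size(size));
}
```

With an 8-byte type and a declared limit of INT32_MAX the check fails, which
is the kind of inconsistency a configure-time probe such as gnulib's can
react to.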
The unwind logic was copied from AArch64 which follows the peculiar
AAPCS (where, unlike typical RISC architectures, its frame pointer
follows an x86/stack machine-like convention where the frame pointer
points at the bottom of the frame record, not the top). Delete the
pointless riscv_frame struct and fix this.
Reviewed by: mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28054
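A sketch of one unwind step under the RISC-V convention, where the frame
pointer points at the top of the frame record, so the saved return address
and the caller's frame pointer sit in the two words just below it. The
struct and function names are illustrative, and the stack bounds are passed
in explicitly rather than derived from a thread.

```c
#include <stdbool.h>
#include <stdint.h>

struct unwind_state {
	uintptr_t	fp;	/* current frame pointer */
	uintptr_t	pc;	/* return address recovered so far */
};

/*
 * Step to the caller's frame. The valid stack range is [lo, hi).
 * Returns false when fp is misaligned or the record is out of bounds.
 */
bool
unwind_frame(struct unwind_state *st, uintptr_t lo, uintptr_t hi)
{
	uintptr_t fp = st->fp;

	if (fp % sizeof(uintptr_t) != 0)
		return (false);
	/* Two words below fp must lie inside the stack. */
	if (fp < lo + 2 * sizeof(uintptr_t) || fp > hi)
		return (false);

	st->pc = ((uintptr_t *)fp)[-1];	/* saved ra */
	st->fp = ((uintptr_t *)fp)[-2];	/* caller's fp */
	return (true);
}
```

Because the record is addressed with negative offsets from fp, no separate
frame structure is needed, which is consistent with deleting riscv_frame.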
This code was originally written under the assumption that the ISA
string would only contain single-letter extensions. The RISC-V
specification has extended its description of the format quite a bit,
allowing for much longer ISA strings containing multi-letter extension
names.
Newer versions of QEMU (7.1.0) will append to the riscv,isa property
indicating the presence of multi-letter standard extensions such as
Zfencei. This triggers a KASSERT about the expected length of the
string, preventing boot.
Increase the size of the isa array significantly, and teach the code
to parse (skip over) multi-letter extensions, and optional extension
version numbers. We currently ignore them completely, but this will
change in the future as we start supporting supervisor-level extensions.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36601
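The parsing approach described above can be sketched roughly as follows:
record single-letter extensions in a bitmask, skip optional version numbers,
and skip multi-letter extensions entirely. The function name, the EXT_BIT()
macro, and the choice of 's', 'x', and 'z' as multi-letter prefixes are
assumptions for illustration, as is the lowercase "rv32"/"rv64" prefix.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	EXT_BIT(c)	(UINT32_C(1) << ((c) - 'a'))

/* Parse an ISA string such as "rv64imafdc_zicsr_zifencei". */
bool
parse_isa(const char *isa, uint32_t *extmask)
{
	uint32_t mask = 0;
	size_t i;
	int c;

	if (strncmp(isa, "rv64", 4) != 0 && strncmp(isa, "rv32", 4) != 0)
		return (false);

	i = 4;
	while (isa[i] != '\0') {
		c = tolower((unsigned char)isa[i]);
		if (c == '_') {
			i++;
			continue;
		}
		if (c == 's' || c == 'x' || c == 'z') {
			/* Multi-letter extension: skip name and version. */
			while (isa[i] != '\0' && isa[i] != '_')
				i++;
			continue;
		}
		if (c < 'a' || c > 'z')
			return (false);
		mask |= EXT_BIT(c);
		i++;

		/* Optional version: major digits, then 'p' and minor digits. */
		if (isdigit((unsigned char)isa[i])) {
			while (isdigit((unsigned char)isa[i]))
				i++;
			if (tolower((unsigned char)isa[i]) == 'p' &&
			    isdigit((unsigned char)isa[i + 1])) {
				i++;
				while (isdigit((unsigned char)isa[i]))
					i++;
			}
		}
	}
	*extmask = mask;
	return (true);
}
```

The 'p' separator is only treated as part of a version when it follows major
digits, so a standalone 'p' is still parsed as a single-letter extension.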
Add a <sys/_pv_entry.h> intended for use in <machine/pmap.h> to
define struct pv_entry, pv_chunk, and related macros and inline
functions.
Note that powerpc does not yet use this: while the mmu_radix pmap
in powerpc uses the new scheme (albeit with fewer PV entries in a
chunk than normal due to an unused pv_pmap field in struct pv_entry),
the Book-E pmaps for powerpc use the older style PV entries without
chunks (and thus require the pv_pmap field).
Suggested by: kib
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36685
This matches the return type of pmap_mapdev/bios.
Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548
This applies one of the changes from
5567d6b441 to other architectures
besides arm64.
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36263
The devmap variants used vm_offset_t for some reason, and a few places
explicitly cast bus addresses to vm_offset_t. (Probably those casts
along with similar casts for vm_size_t should just be removed and
instead permit the compiler to DTRT.)
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D35961
After the addition of SV48 support, VIRT_IS_VALID() did not exclude
addresses that are in the SV39 address space hole but not in the SV48
address space hole. This can result in mishandling of accesses to that
range when in SV39 mode.
Fix the problem by modifying VIRT_IS_VALID() to use the runtime address
space bounds. Then, if the address is invalid, and pcb_onfault is set,
give vm_fault_trap() a chance to veto the access instead of panicking.
PR: 265439
Reviewed by: jhb
Reported and tested by: Robert Morris <rtm@lcs.mit.edu>
Fixes: 31218f3209 ("riscv: Add support for enabling SV48 mode")
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35952
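To make the address space hole concrete, here is a sketch of a validity
check parameterized by the active translation mode. An address is valid when
all bits from the top implemented bit upward are equal. The function name
and the va_bits parameter are hypothetical; the kernel uses runtime address
space bounds rather than this exact form.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * va_bits is 39 for SV39 and 48 for SV48. The address is valid if
 * bits 63..(va_bits - 1) are all zeros or all ones.
 */
bool
virt_is_valid(uint64_t va, unsigned int va_bits)
{
	uint64_t upper = va >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return (upper == 0 || upper == all_ones);
}
```

An address such as 0x4000000000 passes the SV48 check but falls in the SV39
hole, which is the case a compile-time SV48 bound would miss.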
These files no longer depend on the macros required when these checks
were added.
PR: 263102 (exp-run)
Reviewed by: brooks, imp, emaste
Differential Revision: https://reviews.freebsd.org/D34804
Since physical memory management is now handled by subr_physmem.c, the
need to keep this global array has diminished. It is not referenced
outside of early boot-time, and is populated by physmem_avail() in
pmap_bootstrap(). Just allocate the array on the stack for the duration
of its lifetime.
The check against physmap[0] in initriscv() can be dropped altogether,
as there is no consequence for excluding a memory range twice.
Reviewed by: markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34778
This increases the size of the user map from 256GB to 128TB. The kernel
map is left unchanged for now.
For now SV48 mode is left disabled by default, but can be enabled with a
tunable. Note that extant hardware does not implement SV48, but QEMU
does.
- In pmap_bootstrap(), allocate a L0 page and attempt to enable SV48
mode. If the write to SATP doesn't take, the kernel continues to run
in SV39 mode.
- Define VM_MAX_USER_ADDRESS to refer to the SV48 limit. In SV39 mode,
the region [VM_MAX_USER_ADDRESS_SV39, VM_MAX_USER_ADDRESS_SV48] is not
mappable.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34280
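The "write SATP and see whether it takes" probe can be modelled in userland
with a simulated register that ignores writes selecting an unsupported mode.
Everything here (struct fake_hart, fake_satp_write(), try_enable_sv48(), and
the boolean standing in for the tunable) is a hypothetical model, not kernel
code; only the MODE encodings 8 (SV39) and 9 (SV48) come from the privileged
specification.

```c
#include <stdbool.h>
#include <stdint.h>

#define	SATP_MODE_SHIFT	60
#define	SATP_MODE_SV39	(UINT64_C(8) << SATP_MODE_SHIFT)
#define	SATP_MODE_SV48	(UINT64_C(9) << SATP_MODE_SHIFT)

/* Simulated hart: a satp register plus whether SV48 is implemented. */
struct fake_hart {
	bool		has_sv48;
	uint64_t	satp;
};

/* A write that selects an unsupported mode has no effect at all. */
void
fake_satp_write(struct fake_hart *h, uint64_t val)
{
	if ((val >> SATP_MODE_SHIFT) == 9 && !h->has_sv48)
		return;
	h->satp = val;
}

/*
 * Attempt to switch to SV48 using the given L0 page number. Returns
 * true if the write took; otherwise satp is left in its previous mode.
 */
bool
try_enable_sv48(struct fake_hart *h, uint64_t l0_ppn, bool tunable_enabled)
{
	uint64_t want;

	if (!tunable_enabled)
		return (false);
	want = SATP_MODE_SV48 | l0_ppn;
	fake_satp_write(h, want);
	return (h->satp == want);
}
```

Reading the register back is what lets the same code run on hardware without
SV48: a rejected write leaves the SV39 configuration untouched.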
Instead of having the one-off load_satp(), just use csr_write(). No
functional change intended.
Reviewed by: alc, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34271
In SV48 mode, the top-level page will be an L0 page rather than an L1
page. Rename the field accordingly. No functional change intended.
Reviewed by: alc, jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34270
- Move busdma_lock_mutex to subr_bus_dma.c.
- Move _busdma_lock_dflt to subr_bus_dma.c. This function was named a
couple of different things previously. It is not a public API but
an internal helper used in place of a NULL pointer. The prototype
is in <sys/bus_dma.h> as not all backends include
<sys/bus_dma_internal.h>.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33694
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages. For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm). However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi. I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux. However, in the current scheme there's no need for the mux.
Instead, remove swi_vm and vm_ih. Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly. This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.
One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.
Reviewed by: imp, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33447
The header exports the following:
- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)
Note that TLS_TP_OFFSET does not account for whether the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).
Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr. libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).
A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.
Reviewed by: kib (older version of x86/tls.h), jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33351
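A sketch of how the exported pieces could fit together on an architecture
where the unbiased thread pointer points at the end of the TCB. The two-field
struct layout, the TLS_TP_OFFSET value of 0, and the helper names tcb_to_tp()
and tp_to_tcb() are assumptions for illustration; the real helpers read and
write the thread pointer register.

```c
#include <stddef.h>
#include <stdint.h>

struct pthread;

/* Assumed Variant I layout: a DTV pointer and a thread pointer. */
struct tcb {
	uintptr_t	*tcb_dtv;
	struct pthread	*tcb_thread;
};

#define	TLS_TCB_SIZE	(2 * sizeof(void *))
#define	TLS_TP_OFFSET	0	/* assumed bias for this sketch */

_Static_assert(sizeof(struct tcb) == TLS_TCB_SIZE,
    "struct tcb must match TLS_TCB_SIZE");

/* Thread pointer value for a TCB when tp points at the end of the TCB. */
uintptr_t
tcb_to_tp(struct tcb *tcb)
{
	return ((uintptr_t)tcb + TLS_TCB_SIZE + TLS_TP_OFFSET);
}

/* Inverse: recover the TCB from a thread pointer value. */
struct tcb *
tp_to_tcb(uintptr_t tp)
{
	return ((struct tcb *)(tp - TLS_TP_OFFSET - TLS_TCB_SIZE));
}
```

On an architecture where the thread pointer points at the start of the TCB,
the TLS_TCB_SIZE term would simply be dropped from both conversions.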
After a round of cleanups in late 2020, all definitions are
functionally identical.
This removes a rotted __aligned(8) on arm. It was added in
b7112ead32 and was intended to align the
args member so that 64-bit types (off_t, etc) could be safely read on
armeb compiled with clang. With the removal of armeb, this is no
longer needed (armv7 requires that 32-bit aligned reads of 64-bit
values be supported and we enable such support on armv6). As further
evidence this is unnecessary, cleanups to struct syscall_args have
resulted in args being 32-bit aligned on 32-bit systems. The sole
effect is to bloat the struct by 4 bytes.
Reviewed by: kib, jhb, imp
Differential Revision: https://reviews.freebsd.org/D33308
This definition enables callers to estimate remaining space on the
kstack, and take action on it. Notably, it enables optimizations in the
GEOM and netgraph subsystems to directly dispatch work items when there
is sufficient stack space, rather than queuing them for a worker thread.
Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
not go unimplemented elsewhere.
PR: 259157
Reviewed by: mav, kib, markj (previous version)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32580
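The estimate itself is simple arithmetic on a downward-growing stack. The
sketch below shows one way to express it; the function names and the
"needed bytes" policy are hypothetical and do not reflect the exact
thresholds GEOM or netgraph use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Report the total stack size and the bytes used so far, given the
 * lowest stack address, its size, and the current stack pointer.
 * The stack grows down from kstack_base + kstack_size.
 */
void
get_stack_usage(uintptr_t kstack_base, size_t kstack_size, uintptr_t sp,
    size_t *total, size_t *used)
{
	*total = kstack_size;
	*used = (size_t)(kstack_base + kstack_size - sp);
}

/* Decide whether to run work inline or queue it for a worker thread. */
bool
can_dispatch_directly(size_t total, size_t used, size_t needed)
{
	return (used <= total && total - used >= needed);
}
```

A caller would queue the work item whenever can_dispatch_directly() returns
false, trading latency for a fresh stack.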
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.
This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.
Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.
The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.
Reviewed by: kib, markj, jhb
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D31989
This is needed for LinuxKPI's _ioremap_attr. This reuses the generic
implementation introduced for aarch64, and itself requires implementing
pmap_kenter, which is trivial to do given riscv currently treats all
mapping attributes the same due to the Svpbmt extension not yet being
ratified and in hardware.
Reviewed by: markj, mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32445
When handling a kernel page fault, check explicitly that stval resides
in either the user or kernel address spaces, and make the page fault
fatal if not. Otherwise, a properly crafted address may appear to
pmap_fault() as a valid and present page in the kernel map, causing the
page fault to be retried continuously. This is mainly due to the fact
that the upper bits of virtual addresses are not validated by most of
the pmap code.
Faults of this nature should only occur due to some kind of bug in the
kernel, but it is best to handle them gracefully when they do.
Handle user page faults in the same way, sending a SIGSEGV immediately
when a malformed address is encountered.
Add an assertion to pmap_l1(), which should help catch other bugs of
this kind that make it this far.
Reviewed by: jrtc27, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31208
pmap_change_attr is required by drm-kmod so we need the function to
exist. Since the Svpbmt extension is on the horizon we will likely end
up with a real implementation of it, so this stub implementation does
all the necessary page table walking to validate the input, ensuring
that no new errors are returned once it's implemented fully (other than
due to out of memory conditions when demoting L2 entries) and providing
a skeleton for that future implementation.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31996
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D31885
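A sketch of what shared progress tracking with an explicit reset might look
like. The struct, the function names, and the 10% step size are hypothetical
and do not mirror the code in kern_dump.c.

```c
#include <stdbool.h>
#include <stdint.h>

struct dump_progress {
	uint64_t	total;		/* bytes expected for this dump */
	uint64_t	done;		/* bytes written so far */
	unsigned int	last_step;	/* last 10% step reported */
};

/* Start a new dump; without this, a second dump would report nothing. */
void
progress_reset(struct dump_progress *p, uint64_t total)
{
	p->total = total;
	p->done = 0;
	p->last_step = 0;
}

/*
 * Account for newly written bytes. Returns true when a new 10% step
 * has been crossed and the caller should print an update.
 */
bool
progress_update(struct dump_progress *p, uint64_t bytes)
{
	unsigned int step;

	if (p->total == 0)
		return (false);
	p->done += bytes;
	if (p->done > p->total)
		p->done = p->total;
	step = (unsigned int)(p->done * 10 / p->total);
	if (step > p->last_step) {
		p->last_step = step;
		return (true);
	}
	return (false);
}
```

Keeping the reset in the shared code is what guarantees every platform starts
each dump from zero.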
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preparation for adding PT_GETREGSET to ptrace(2).
Reviewed by: imp, markj
Sponsored by: DARPA, AFRL (original work)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19830
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).
Sponsored by: The FreeBSD Foundation
which is the place to put MD asserts about allocated pages.
On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned. This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.
This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.
Approved by: markj (mentor)
Reviewed by: kib, bcr (manpages)
Differential Revision: https://reviews.freebsd.org/D29185