Check if RISC-V SSTC is available and advertise to the guest.
This is needed for Eswin EIC7700 that does not include SSTC.
As we don't have a mechanism for reporting extension presence
from the kernel to userspace, then use vm_cap_type for now.
Reviewed by: mhorne, markj
Differential Revision: https://reviews.freebsd.org/D48058
Switch to INTR_ROOT_COUNT as this name better describes its purpose.
Remove the default INTR_ROOT_IRQ from the core. Define it (redundantly)
in each architecture's header, but now placed alongside its sibling
values (if defined by the platform, e.g. arm64 INTR_ROOT_FIQ).
Reviewed by: mhorne
Pull Request: https://github.com/freebsd/freebsd-src/pull/1280
In order to match reality, allow using these functions with pointers on
const objects, and bring us closer to C11.
Remove the '+' modifier in the atomic_load_acq_64_i586()'s inline asm
statement's constraint for '*p' (the value to load). CMPXCHG8B always
writes back some value, even when the value exchange does not happen in
which case what was read is written back. atomic_load_acq_64_i586()
further takes care of the operation atomically writing back the same
value that was read in any case. All in all, this makes the inline
asm's write back undetectable by any other code, whether executing on
other CPUs or code on the same CPU before and after the call to
atomic_load_acq_64_i586(), except for the fact that CMPXCHG8B will
trigger a #GP(0) if the memory address is part of a read-only mapping.
This unfortunate property is however out of scope of the C abstract
machine, and in particular independent of whether the 'uint64_t' pointed
to is declared 'const' or not.
Approved by: markj (mentor)
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D46887
Sometimes we need defines from this file in assembler code. Today we do
the heavyweight approach of using genassym for that. However, they are
just #defines, so in the future we want to include sys/intr.h to pick up
the needed constants in exception.S.
PR: 283041
Sponsored by: Netflix
Reviewed by: mmel, andrew
Differential Revision: https://reviews.freebsd.org/D47846
The T-HEAD custom PTE bits are defined in such a way that the
default/normal memory type is non-zero value. This _unthoughtful_ choice
means that, unlike the Svpbmt and non-Svpbmt cases, this field cannot be
left bare in our bootstrap PTEs, or the hardware will fail to proceed
far enough in boot (cache strangeness). On the other hand, we cannot
unconditionally apply the PTE_THEAD_MA_NONE attributes, as this is not
compatible with spec-compliant RISC-V hardware, and will result in a
fatal exception.
Therefore, in order to handle this errata, we are forced to perform a
check of the CPU type at the first moment possible. Do so, and fix up
the PTEs with the correct memory attribute bits in the T-HEAD case.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47458
T-HEAD CPUs provide a spec-violating implementation of page-based memory
types, using PTE bits [63:59]. Add basic support for this "errata",
referred to in some places as an "extension".
Note that this change is not enough on its own, but a workaround is
needed for the bootstrap (locore) page tables as well.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45472
This is the first major quirk we need to support in order to run on
current T-HEAD/XuanTie CPUs, e.g. the C906 or C910, found in several
existing RISC-V SBCs. With these custom dcache routines installed,
busdma can reliably communicate with devices which are not coherent
w.r.t. the CPU's data caches.
This patch introduces the first quirk/errata handling functions to
identcpu.c, and thus is forced to make some decisions about how this
code is structured. It will be amended with the changes that follow in
the series, yet I feel the final result is (unavoidably) somewhat
clumsy. I expect the CPU identification code will continue to evolve as
more CPUs and their quirks are eventually supported.
Discussed with: jrtc27
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47455
Cache management operations were, for a long time, unspecified by the
RISC-V ISA, and thus these functions have been no-ops. To cope, hardware
with non-coherent I/O has implemented custom cache flush mechanisms,
either in the form of custom instructions or special device registers.
Additionally, the RISC-V CMO extension is ratified and these official
instructions will start to show up in hardware eventually. Therefore, a
method is needed to select the dcache management routines at runtime.
Add a simple set of function hooks, as well as a routine to install them
and specify the minimum dcache line size. The first consumer will be the
non-standard cache management instructions for T-HEAD CPUs.
The unused I-cache variables and macros are removed.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47454
For current architectures, these are just aliases for the existing
operation on the relevant scalar integer.
Reviewed by: imp, kib
Obtained from: CheriBSD
Sponsored by: AFRL, DARPA
Differential Revision: https://reviews.freebsd.org/D47631
These use amoor and amoand rather than a loop.
Also define atomic_testandset_acq_(64|long) using amoor.aq.
Reviewed by: mhorne, kib
Sponsored by: AFRL, DARPA
Differential Revision: https://reviews.freebsd.org/D47627
- Make machine/vmm_dev.h self-contained.
- Check for errors from vmmdev_init().
- Make VM_MAX_NAMELEN visible to userspace.
Reported by: Jenkins
Fixes: a97f683fe3 ("vmm: Add a device file interface for creating and destroying VMs")
Add kernel code for 'H' — Hypervisor Extension[1] to support
virtualization on RISC-V ISA.
This comes with a separate userspace patch allowing us to boot
unmodified freebsd/riscv guest. Other operating systems are untested.
This also comes with a U-Boot port that is configured to run in bhyve
guest environment — in RISC-V virtual supervisor mode.
The vmm SBI code emulates RISC-V machine-mode for the guest, handling
SBI calls partly in vmm kernel and partly in bhyve userspace.
Developed in Spike simulator during short period of time, the support
is considered experimental. The first real hardware with hypervisor
spec included should have just reached the market, so this was tested
in Spike and QEMU only. Note that this depends on Sstc extension
presence in the hardware (both Spike and QEMU have it).
Note that booting multiple guests at the same time is not tested and
may require additional work. Some TODOs are indicated within the
code, and some listed in the project's home page[2].
Many thanks to Jessica Clarke, Mitchell Horne and Mark Johnston
for help with parts, test and review.
1. https://riscv.org/technical/specifications/
2. https://wiki.freebsd.org/riscv/bhyve
Sponsored by: UK Research and Innovation
Differential Revision: https://reviews.freebsd.org/D45553
This reverts commit 536c8d948e. The
change seemed fine on the surface, but converting to an enum has raised
some concerns due to the asm <-> C interface. Back it out and let
someone else deal with it later if they'd like to.
Further context about the concerns can be found in D47279.
sys/intr.h originally started life as an extract of arm's intr.h, and
this include was dropped in its place. Changes in flight want to add
some MD definitions that we'll use in the more MI parts of INTRNG.
Let's formally reverse the dependency now since this is way more
common in general. All of the includes switched in this change that I
spot-checked were in-fact wanting declarations historically included in
sys/intr.h anyways.
Reviewed by: andrew, imp, jrtc27, mhorne, mmel, olce
Differential Revision: https://reviews.freebsd.org/D47002
uint32_t is handy for directly interfacing with assembly-language. For
the C portion, enum is much handier. In particular there is no need to
count the number of roots by hand. This also works better for being
able to build kernels with varying numbers of roots.
Switch to INTR_ROOT_COUNT as this better matches the purpose of the
value. Switch to root_type, rather than rootnum for similar reasons.
Remove the default from the core. Better to require the architectures
to declare the type since they will routinely deviate and a default
chosen now will likely be suboptimal.
Leave intr_irq_handler() taking a register type as that better matches
for interfacing with assembly-language.
Fix csr_swap() macro so that we don't overwrite the argument (which is not
even possible when the argument is an immediate value)
Reviewed by: jrtc27
Differential Revision: https://reviews.freebsd.org/D46526
This was ratified earlier this year as an alias for Zba_Zbb_Zbs. Whilst
we don't currently export multi-letter extensions, we can still export
this alias in AT_HWCAP.
Reviewed by: mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D46277
The Svpbmt extension provides specification of "Page-Based Memory
Types", or memory attributes (e.g. cacheability constraints).
Extend the pmap code to apply memory attributes when creating/updating
PTEs. This is done in a way which has no effect on CPUs lacking Svpbmt
support, and is non-hostile to alternate encodings of memory attributes
-- a future change will enable this for T-HEAD CPUs, which implement
this PTE feature in an different (incompatible) way.
Reviewed by: jhb
Tested by: br
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45471
As of 9e6544dd6e contigfree(9) is no longer
needed and should not be used anymore. We leave a wrapper for 3rd party
code in at least 15.x but remove (almost) all other cases from the tree.
This leaves one use of contigfree(9) untouched; that was the original
trigger for 9e6544dd6e and is handled in D45813 (to be committed
seperately later).
Sponsored by: The FreeBSD Foundation
Reviewed by: markj, kib
Tested by: pho (10h stress test run)
Differential Revision: https://reviews.freebsd.org/D46099
All architectures enable NEW_PCIB in DEFAULTS (arm being the most recent
to do so in 121be55599 (arm: Set NEW_PCIB in DEFAULTS rather than a
subset of kernel configs")), so it's time we removed the legacy code
that no longer sees much testing and has a significant maintenance
burden.
Reviewed by: jhb, andrew, emaste
Differential Revision: https://reviews.freebsd.org/D32954
Add new SBI implementation IDs including recently allocated one for bhyve.
Reviewed by: mhorne
Sponsored by: UKRI
Differential Revision: https://reviews.freebsd.org/D45696
And from struct riscv_bootparams. It is no longer needed.
Reviewed by: br, markj
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45470
The overall goal of the change is to reduce the amount of work done in
locore assembly, and defer as much as possible until pmap_bootstrap().
Currently, half the setup is done in assembly, and then we pass the l1pt
address to pmap_bootstrap() where it is amended with other mappings.
Inspiration and understanding has been taken from amd64's
create_pagetables() routine, and I try to present the page table
construction in the same way: a linear procedure with commentary
explaining what we are doing and why. Thus the core of the new
implementation is contained in pmap_create_pagetables().
Once pmap_create_pagetables() has finished, we switch to the new
pagetable root and leave the bootstrap ones created by locore behind,
resulting in a minimal 8kB of wasted space.
Having the whole procedure in one place, in C code, allows it to be more
easily understood, while also making it more amenable to future changes
which depend on CPU feature/errata detection.
Note that with this change the size of the early devmap is bumped up
from one to four L2 pages (8MB).
Reviewed by: markj
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45327
This is useful for two reasons. Within this change, it allows the
early DTB mapping to be eliminated, as we can now just dereference the
physical address provided by FW and copy the DTB contents into KVA.
It will also aid in an upcoming change: the larger reworking of page
table bootstrapping on this platform.
Reviewed by: markj, jhb
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45324
The idea here is to avoid a memory access and conditional branch per
probe site. Instead, the probe is represented by an "unreachable"
unconditional function call. asm goto is used to store the address of
the probe site (represented by a no-op sled) and the address of the
function call into a tracepoint record. Each SDT probe carries a list
of tracepoints.
When the probe is enabled, the no-op sled corresponding to each
tracepoint is overwritten with a jmp to the corresponding label. The
implementation uses smp_rendezvous() to park all other CPUs while the
instruction is being overwritten, as this can't be done atomically in
general. The compiler moves argument marshalling code and the
sdt_probe() function call out-of-line, i.e., to the end of the function.
Per gallatin@ in D43504, this approach has less overhead when probes are
disabled. To make the implementation a bit simpler, I removed support
for probes with 7 arguments; nothing makes use of this except a
regression test case. It could be re-added later if need be.
The approach taken in this patch enables some more improvements:
1. We can now automatically fill out the "function" field of SDT probe
names. The SDT macros let the programmer specify the function and
module names, but this is really a bug and shouldn't have been
allowed. The intent was to be able to have the same probe in
multiple functions and to let the user restrict which probes actually
get enabled by specifying a function name or glob.
2. We can avoid branching on SDT_PROBES_ENABLED() by adding the ability
to include blocks of code in the out-of-line path. For example:
if (SDT_PROBES_ENABLED()) {
int reason = CLD_EXITED;
if (WCOREDUMP(signo))
reason = CLD_DUMPED;
else if (WIFSIGNALED(signo))
reason = CLD_KILLED;
SDT_PROBE1(proc, , , exit, reason);
}
could be written
SDT_PROBE1_EXT(proc, , , exit, reason,
int reason;
reason = CLD_EXITED;
if (WCOREDUMP(signo))
reason = CLD_DUMPED;
else if (WIFSIGNALED(signo))
reason = CLD_KILLED;
);
In the future I would like to use this mechanism more generally, e.g.,
to remove branches and marshalling code used by hwpmc, and generally to
make it easier to add new tracepoint consumers without having to add
more conditional branches to hot code paths.
Reviewed by: Domagoj Stolfa, avg
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D44483
Add basic stage 2 translation support (guest-physical to host-physical).
RISC-V hypervisor spec[1] introduces new translation schemes: Sv32x4,
Sv39x4, Sv48x4 and Sv57x4.
In each case, the size of the incoming address is widened by 2 bits (e.g.
Sv39 becomes 41-bit system).
To accommodate the 2 extra bits, the root page table (only) is expanded
by a factor of four to be 16 KiB instead of the usual 4 KiB. The rest of
page table system (including PTE format) is similar.
This gives us 4x of memory space in each scheme, but it does not make sense
to support all that memory for now.
Allocate required amount of pages for the top directory in case of stage 2,
but leave it unused.
1. https://github.com/riscv/riscv-isa-manual/blob/main/src/hypervisor.adoc
Reviewed by: mhorne
Sponsored by: UKRI
Differential Revision: https://reviews.freebsd.org/D45481
This commit introduces the MINIDUMP_STARTUP_PAGE_TRACKING symbol and
uses it to simplify several instances of a complex preprocessor conditional
for adding pages allocated when bootstraping the kernel to minidumps.
Reviewed by: markj, mhorne
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D45085
This commit refactors the UMA small alloc code and
removes most UMA machine-dependent code.
The existing machine-dependent uma_small_alloc code is almost identical
across all architectures, except for powerpc where using the direct
map addresses involved extra steps in some cases.
The MI/MD split was replaced by a default uma_small_alloc
implementation that can be overridden by architecture-specific code by
defining the UMA_MD_SMALL_ALLOC symbol. Furthermore, UMA_USE_DMAP was
introduced to replace most UMA_MD_SMALL_ALLOC uses.
Reviewed by: markj, kib
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D45084
It is inherited from arm, where the global exists and is used. No
functional change.
Reviewed by: markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45323
No functional change.
Reviewed by: markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45322
Implement atomic_load_acq_16, atomic_store_rel_16.
These are needed by bhyve(8) PCIe bus emulation code.
Group 16-bit atomic functions similarly to 32 and 64-bit.
Reviewed by: mhorne
Differential Revision: https://reviews.freebsd.org/D45228
Currently the local interrupt controller implementation is based on
pre-INTRNG arm/arm64 code, using hand-rolled event code rather than
INTRNG. This then interacts weirdly with the PLIC, and other future
interrupt controllers like the APLIC and IMSICs in the upcoming AIA
specification, since they become the root PIC despite not being the
logical root. Instead, use a real newbus device for it and register
it as the root PIC.
This also adapts the IPI code to make use of the newly-added INTRNG
generic IPI handling framework, adding a new sbi_ipi as the PIC. In
future there will be alternative devices for sending IPIs that will
register with higher priorities, such as the proposed AIA IMSIC and
ACLINT SSWI.
Reviewed by: mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35901
This approach is based on the Arm PSCI driver, though that makes more
extensive use of its softc than we do here. This will be used to extract
the SBI IPI code as a real PIC.
Reviewed by: mhorne, imp
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35900
After removing filter functionality, the naming doesn't clearly
represent what the function does, so try to address this. Include some
code clarity and style improvements.
Create a common version in subr_busdma_bounce.c, used by most
implementations. powerpc still needs its own version of the function,
due to its dmat->iommu == NULL check.
No functional change intended.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42896
Without filter functions, we do not need to keep track of tag ancestry.
All inheritance of the parent tag's parameters occurs when creating the
new child tag.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42895
Address filter functions are unused, unsupported, and now rejected.
Simplify some busdma code by removing filter functionality completely.
Note that the chains of parent tags become useless, and will be cleaned
up in the next commit.
No functional change intended.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42894
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.
Sponsored by: Netflix
All of these used the 'immediately at beginning' variation of the
BSD-2-Clause license. This wasn't intentional, just what I copied from
from a random file in the tree back in 2005. It was not an intentional
decision.
The different arch bus.h files are a mix of BSD-2-Clause and
BSD-4-Clause that have various copyright holders (Charles M. Hannum,
Christopher G. Demetriou, The NetBSD Foundation and KATO Takenori), and
some of the content of these files were likely copied from there.
However, apart from the uncopyrightable interface lines, there are very
few comments. It's unclear if these comments are 'original material'
here to copyright, but to the extent that there is, license it under the
standard BSD-2-Clause copyright that's the norm for the project today.
In any event, the standard BSD-2-Clause is also closer to those
originals.
In addition, FreeBSD uses different type definitions than the original
NetBSD code in part. The comments that were copied have been copied a
lot, but appear in NetBSD's bus.h files in NetBSD 1.3.
While I'm here, assign the copyright, to the extent any exists from me,
to the FreeBSD Foundation. I just cut and pasted these into _bus.h from
the different machine files and those files have a rich history of
modification from the original imports from NetBSD over more than 25
years so it's tricky to say who, exactly, wrote each bit. Given the size
of the files, this seems like the best compromise. Also add an
acknowledgement to the NetBSD 1.3 bus.h files and their authors (there
were no additional FreeBSD authors listed in the various
sys/*/include/bus.h files). Finally, use the SPDX identifier instead of
multiple copies of the text.
Differential Revision: https://reviews.freebsd.org/D42532
Sponsored by: Netflix
- When promoting, do not require that all PTEs all have PTE_A set.
Instead, record whether they did and store this information in the
PTP's valid bits.
- Synchronize some comments in pmap_promote_l2().
- Make pmap_promote_l2() scan starting from the end of the 2MB range
instead of the beginning. See the commit log for 9d1b7fa31f for
justification of this, which I believe applies here as well.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D42288
- Remove now unnecessary MACHINE_ARCHES definition. The default logic
in kern_mib.c works fine now for RISC-V.
- Remove custom sv_machine_arch hook from sysentvec.
Fixes: 1ca12bd927 Remove the riscv64sf architecture.
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D40648
This avoids bloating the kernel image when MAXCPU is large.
A follow-up patch for kgdb and other kernel debuggers is needed since
the stoppcbs symbol is now a pointer. Bump __FreeBSD_version so that
debuggers can use osreldate to figure out how to handle stoppcbs.
PR: 269572
MFC after: never
Reviewed by: mjg, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39806
There are now several Supervisor-mode extensions that have entered the
'ratified' status, so begin parsing and reporting a few of these.
Recognize the following extensions:
- Sstc: stimecmp/vstimecmp CSR
- Svnapot: NAPOT* translation contiguity
- Svpbmt: page-based memory types
- Svinval: fine-grained TLB invalidation instructions
- Sscofpmf: performance counter overflow
*i.e. "naturally aligned power-of-2" page granularity
For now, provide globals for Sstc and Sscofpmf, as we will make use of
these in the near future.
Plus, update the copyright statement after my recent work on this file.
Reviewed by: jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40240
Use PMAP_MAPDEV_EARLY_SIZE instead of assuming that its value is always
L2_SIZE. Add compile-time assertions to check that the size matches the
expectations in locore.
Reviewed by: mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D40110
Detect and report the supported MMU for each CPU. Export the
capabilities to the rest of the kernel and use it in pmap_bootstrap() to
check for Sv48 support.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39814