When adding support for new hardware extensions we may not want to
enable support for the FreeBSD and Linux ABIs at the same time. To
support this split the Linux ID register and hwcaps so they can be
configured separately.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42372
(cherry picked from commit e6dbc99d47ddb254d75822817592bb82b5ce4d97)
When returning from a Linux signal use the Linux sigframe to find the
register values to restore.
Remove the FreeBSD ucontext from the stack as it's now unneeded.
PR: 270250
Reviewed by: dchagin, emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D42360
(cherry picked from commit 070a4ff82a34652d533f9315ae9ad0aa8f1fdeb2)
The lr register is cleared at the beginning of the _dl_start and _start,
so there is no need to initialize it.
Gnu libc _start takes an rtld_fini pointer in x0 which is set by ld.so
for __libc_start_main, the kernel does not register any atexit pointers.
While here fix whitespaces.
MFC after: 1 week
(cherry picked from commit 20845a6994c548977874d1f413044d43c8474f0a)
To avoid clobbering of any registers by the trampoline code use Linux
way to call signal handlers. I.e., we are out from the kernel right into
the signal handler, put return address from the signal handler into the
link register.
The mysterious NOP is required for some unwinders (e.g. libc++) that
unconditionally subtract one from the result of _Unwind_GetIP() in order
to identify the calling function.
MFC after: 1 week
To allow unwinders to go througth a previous to sigreturn frame we should
properly emulate the trampoline frame record which should points to the
previous frame and set the trampoline frame pointer to the emulated frame
before calling signal handler.
MFC after: 1 week
An Aarch64 sigreturn trampoline frame can't currently be described in
a DWARF .eh_frame section, because Aarch64 does not define a register
number for PC and provide no direct way to encode PC of the previous
frame. Instead, unwinders (libgcc, gdb, libunwind) detect the sigreturn
frame by looking for the sigreturn instruction. If a sigreturn frame is
detected, unwinders restores all the gprs, SP and PC by assuming that
sp points to an rt_sigframe Linux kernel struct
When entering the kernel, the link register (lr) contains the return
address of the previous frame, the exception link register (elr) contains
the address of the next instruction after the one which generated the
exception, i.e., PC.
MFC after: 1 week
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
Export default MINSIGSTKSZ value for the x86 until we do not preserve AVX
registers in the signal context.
Differential Revision: https://reviews.freebsd.org/D39644
MFC after: 1 month
This allows the syscallname() function to give a usable result for Linux
ABIs.
Reported by: jrtc27
Reviewed by: jrtc27, markj, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37199
Store the shared page address in struct vmspace.
Also instead of storing absolute addresses of various shared page
segments save their offsets with respect to the shared page address.
This will be more useful when the shared page address is randomized.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35393
The implemenation differs from others Linuxulators.
For unwinders Linux ucontext_t is stored, however native machine context
is used to store/restore process state to avoid code duplication.
As DWARF Aarch64 does not define a register number for PC and provides no
direct way to encode the PC of the previous frame, CFI cannot describe a
signal trampoline frame. So, modified the vdso linker script to discard
unused sections.
Extensions are not implemented.
MFC after: 2 weeks
The signal trampoine-related definitions are used only in the MD part
of code, wherefore moved from everywhere used linux.h to separate MD
headers.
MFC after: 2 weeks
Rather than fetching the ps_strings address directly from a process'
sysentvec, use this macro. With stack address randomization the
ps_strings address is no longer fixed.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33704
The size of the ps_strings structure varies between ABIs, so this is
useful for computing the address of the ps_strings structure relative to
the top of the stack when stack address randomization is enabled.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33704
Simplify control flow around handling of the execpath length and signal
trampoline. Cache the sysentvec pointer in a local variable.
No functional change intended.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33703
Use nitems instead and just use a magic `8` for the size of the args
array. MAXARGS was rarely used (only in arm64 code) and is an overly
generic name to polute the namespace with.
Requested by: kib in D33308
In preparation for clone3 system call add struct clone_args and use it in
clone implementation.
Move all of clone related bits to the newly created linux_fork.h header.
Differential revision: https://reviews.freebsd.org/D31474
MFC after: 2 weeks
Note that this still uses FreeBSD-style sigframe;
this will be addressed later.
Reviewed By: dchagin
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D31258
The vDSO initialisation order should be as follows:
- native abi init via exec_sysvec_init();
- vDSO symbols queued to the linux_vdso_syms list;
- linux_vdso_install();
- linux_exec_sysvec_init();
As the exec_sysvec_init() called with SI_ORDER_ANY (last) at SI_SUB_EXEC
order, move linux_vdso_install() and linux_exec_sysvec_init() to the
SI_SUB_EXEC+1 order.
Reviewed by: trasz
Differential Revision: https://reviews.freebsd.org/D30902
MFC after 2 weeks
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, has no process private data.
The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid contex-switch overhead frequently used system calls can be
implemented in the vDSO: for now gettimeofday, clock_gettime.
The first two have been implemented, so add the implementation of system
calls.
System calls implemenation based on a native timekeeping code with some
limitations:
- ifunc can't be used, as vDSO r/o mapped to the process VA and rtld
can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).
In case on any error vDSO system calls fallback to the kernel system
calls. For unimplemented vDSO system calls added prototypes which call
corresponding kernel system call.
Tested by: trasz (arm64)
Differential revision: https://reviews.freebsd.org/D30900
MFC after: 2 weeks
Temporary add stubs to the Linux emulation layer which calls the existing hook.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D30911
MFC after: 2 weeks
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned. This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.
This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.
Approved by: markj (mentor)
Reviewed by: kib, bcr (manpages)
Differential Revision: https://reviews.freebsd.org/D29185
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned. This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.
Approved by: markj (mentor)
Reviewed by: kib, bcr (manpages)
Differential Revision: https://reviews.freebsd.org/D29185
Implement dumping core for Linux binaries on amd64, for both
32- and 64-bit executables. Some bits are still missing.
This is based on a prototype by chuck@.
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D30019
This adds `sv_elf_core_osabi`, `sv_elf_core_abi_vendor`,
and `sv_elf_core_prepare_notes` fields to `struct sysentvec`,
and modifies imgact_elf.c to make use of them instead
of hardcoding FreeBSD-specific values. It also updates all
of the ABI definitions to preserve current behaviour.
This makes it possible to implement non-native ELF coredump
support without unnecessary code duplication. It will be used
for Linux coredumps.
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D30921
Assuming we can't run on i486, i586 class cpu, retire linux_kplatform var
and use hardcoded 'machine' value in linux_newuname().
I have added linux_kplatform for consistency with linux_platform which is
placed in to vdso to avoid excess copyout it on stack for AT_PLATFORM at
exec time.
This is the first stage of Linuxulator's vdso revision.
Reviewed by: trasz, imp
Differential Revision: https://reviews.freebsd.org/D30774
MFC after: 2 weeks
Require queueing of the signals with default action, and disable
dequeueing SIGCHLD on wait for live process.
Reported and tested by: dchagin
Reviewed by: dchagin, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30675