Commit graph

386 commits

Author SHA1 Message Date
Dmitry Chagin
707e567a40 linux(4): Add a helper intended for copying timespec's from the userspace.
There are many places where we copyin Linux timespec from the userspace
and then convert it to the kernel timespec. To avoid code duplication
add a tiny halper for doing this.

MFC after:		2 weeks
2022-05-08 16:16:47 +03:00
Dmitry Chagin
1579b320f1 linux(4): Zero out high order bits of nanoseconds in the compat mode.
Assuming the kernel would use random data, the 64-bit Linux kernel ignores
upper 32 bits of tv_nsec of struct timespec64 for 32-bit binaries.

MFC after:		2 weeks
2022-05-08 15:38:19 +03:00
Dmitry Chagin
9a9482f874 linux(4): Add a helper intended for copying timespec's to the userspace.
There are many places where we convert natvie timespec and copyout it to
the userspace. To avoid code duplication add a tiny halper for doing this.

MFC after:		2 weeks
2022-05-08 15:37:27 +03:00
Dmitry Chagin
8c84ca657b linux(4): Implement sched_rr_get_interval_time64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:47 +03:00
Dmitry Chagin
5171ed79f6 linux(4): Copyout pselect timeout.
According to pselect6 manual, on error timeout becomes undefined, by fact
Linux modifies the timeout and ignore EFAULT error if so.

MFC after:	2 weeks
2022-04-26 19:35:56 +03:00
Dmitry Chagin
fe894a3705 linux(4): Check that the thread tid in the thread group pid in linux_tdfind().
MFC after:		2 weeks
2022-04-25 10:21:51 +03:00
Dmitry Chagin
d5dc757e84 linux(4): Add compat.linux32.emulate_i386 knob.
Historically 32-bit Linuxulator under amd64 emulated the real i386
behavior. Since 3d8dd983 the old i386 Linux world can't be used under
amd64 Linuxulator as it don't know anything about amd64 machine (which
is returned now by newuname() syscall). So, add a knob to allow to swith
the behavior and use i386 Linux binaries on amd64.
Set knob to the new behavior as I think this is common to the modern
Linux distros.

Reviewed by:		Pau Amma (doc), emaste
Differential revision:	https://reviews.freebsd.org/D34708
MFC after:		2 weeks
2022-03-31 21:01:09 +03:00
Dmitry Chagin
9103c5582a linux(4): wait4() returns ESRCH if pid is INT_MIN.
Weird and undocumented patch was added to the Linux kernel in 2017,
fixes wait403 LTP test.

MFC after:	2 weeks
2022-03-31 20:49:39 +03:00
Dmitry Chagin
4e3aefb923 linux(4): Fixup waitid handling P_PGID idtype.
Since Linux 5.4, if id is zero, then wait for any child that is in the same
process grop as the caller's process group.

Differential revision:  https://reviews.freebsd.org/D31567
MFC after:		2 weeks
2022-03-31 20:44:00 +03:00
Dmitry Chagin
be1e4a0bdf linux(4): Return ENOSYS for unsupported P_PIDFD waitid idtype.
Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31559
MFC after:		2 weeks
2022-03-31 20:42:42 +03:00
Dmitry Chagin
c7ef7c3fac linux(4): Add support for __WALL wait option bit.
As FreeBSD does not have __WALL option bit analogue explicitly set all
possible option bits to emulate Linux __WALL wait option bit.

Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31555
MFC after:		2 weeks
2022-03-31 20:42:03 +03:00
Dmitry Chagin
d7e1e138ec linux(4): Fixup options value validation in linux_waitid().
Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31554
MFC after:		2 weeks
2022-03-31 20:41:19 +03:00
Dmitry Chagin
2fd480b318 linux(4): Eliminate bogus options value check validation.
The kernel do it for us.

Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31553
MFC after:		2 weeks
2022-03-31 20:41:02 +03:00
Dmitry Chagin
0c6b1ff7de linux(4): Consolidate wait* facility into linux_common_wait().
Also fix bug in waitid() implementation, use wru_self not wru_children.

Differential revision:  https://reviews.freebsd.org/D31552
MFC after:		2 weeks
2022-03-31 20:40:22 +03:00
Mateusz Guzik
bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd) 2022-03-24 10:20:51 +00:00
Edward Tomasz Napierala
99454d3e98 linux: Provide dummy seccomp(2)
Don't emit messages; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it.  This is also similar with how we handle seccomp
in linux_prctl().

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D33808
2022-01-28 11:45:41 +00:00
Edward Tomasz Napierala
9caeb82eab Revert "linux: Provide dummy seccomp(2)"
This reverts commit 56981629f9.

Wrong patch; fails to build on i386.
2022-01-20 22:25:15 +00:00
Edward Tomasz Napierala
56981629f9 linux: Provide dummy seccomp(2)
Don't emit warnings; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it.  This is also similar with how we handle seccomp
in linux_prctl().

Sponsored By:	EPSRC
Differential Revision: https://reviews.freebsd.org/D33808
2022-01-25 11:54:00 +00:00
Mateusz Guzik
0c8d7eebfd linux: plug set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-18 13:12:15 +00:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Mateusz Guzik
af4051d250 linux: remove the always curthread argument from lconvpath 2021-11-25 22:50:42 +00:00
Mateusz Guzik
74a0e24f07 linux: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:16:03 +00:00
Edward Tomasz Napierala
a90ff3c4bc linux: Add ptrace(2) support on arm64
This moves linux_ptrace.c from sys/amd64/linux/ to sys/compat/linux/,
making it possible to use it on architectures other than amd64.
It also enables Linux ptrace(2) on arm64.

Relnotes:	yes
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32868
2021-11-07 08:39:24 +00:00
Edward Tomasz Napierala
2f514e6f13 linux(4): implement PR_SET_NO_NEW_PRIVS
This makes prctl(2) support PR_SET_NO_NEW_PRIVS, by mapping it
to the native PROC_NO_NEW_PRIVS_CTL procctl(2).

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30973
2021-07-03 08:42:37 +01:00
Dmitry Chagin
c1da89fec2 linux(4): Retire linux_kplatform.
Assuming we can't run on i486, i586 class cpu, retire linux_kplatform var
and use hardcoded 'machine' value in linux_newuname().

I have added linux_kplatform for consistency with linux_platform which is
placed in to vdso to avoid excess copyout it on stack for AT_PLATFORM at
exec time.

This is the first stage of Linuxulator's vdso revision.

Reviewed by:		trasz, imp
Differential Revision:	https://reviews.freebsd.org/D30774
MFC after:		2 weeks
2021-06-22 08:36:21 +03:00
Dmitry Chagin
2eff670fde linux(4): Implement poll system call via linux_common_ppol()
for the sake of converting events to/from native.

MFC after:	2 weeks
2021-06-22 08:07:46 +03:00
Dmitry Chagin
26795a0378 linux(4): Rework Linux ppoll system call.
For now the Linux emulation layer uses in kernel ppoll(2) without
conversion of user supplied fd 'events', and does not convert the
kernel supplied fd 'revents'.

At least POLLRDHUP is handled by FreeBSD differently than by
Linux. Seems that Linux silencly ignores POLLRDHUP on non socket fd's
unlike FreeBSD, which does more strictly check and fails.

Rework the Linux ppoll, using kern_poll and converting 'events'
and 'revents' values.
While here, move poll events defines to the MI part of code as they
mostly identical on all arches except arm.

Differential Revision:	https://reviews.freebsd.org/D30716
MFC after:		2 weeks
2021-06-22 08:06:05 +03:00
Dmitry Chagin
ed61e0ce1d linux(4): Implement ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:18:46 +03:00
Dmitry Chagin
f6d075ecd7 linux(4): Implement pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:03:30 +03:00
Dmitry Chagin
e4bffb80bb linux(4): Implement utimensat_time64 system call.
MFC after:	2 weeks
2021-06-07 04:54:30 +03:00
Dmitry Chagin
2a0fa277f6 linux(4): Microoptimize futimesat, utimes, utime.
While here wrap long line.

Differential Revision:	https://reviews.freebsd.org/D30488
MFC after:		2 weeks
2021-05-31 22:54:18 +03:00
Dmitry Chagin
b4f9b6eef2 linux(4): Handle AT_EMPTY_PATH in the utimensat syscall.
Differential Revision:	https://reviews.freebsd.org/D30518
MFC after:		2 weeks
2021-05-31 22:37:06 +03:00
Dmitry Chagin
8505eb5dd8 linux(4): Convert flags before use in utimensat.
Differential Revision:	https://reviews.freebsd.org/D30487
MFC after:		2 weeks
2021-05-31 22:30:37 +03:00
Edward Tomasz Napierala
ec2700e015 linux: mute the "unsupported prctl option 23" warnings
Make the PR_CAPBSET_READ prctl(2) return EINVAL without logging
any warnings; this is way too noisy with Focal.

Sponsored by:	The FreeBSD Foundation
2021-01-13 10:31:56 +00:00
Conrad Meyer
76b2bfeda4 linux(4): Fix loadable modules after r367395
Move dtrace SDT definitions into linux_common module code.  Also, build
linux_dummy.c into the linux_common kld -- we don't need separate
versions of these stubs for 32- and 64-bit emulation.

Reported by:	several
PR:		250897
Discussed with:	emaste, trasz
Tested by:	John Kennedy, Yasuhiro KIMURA, Oleg Sidorkin
X-MFC-With:	r367395
Differential Revision:	https://reviews.freebsd.org/D27124
2020-11-06 22:04:57 +00:00
Conrad Meyer
eaa5afcefa linux(4) prctl(2): Implement PR_[GS]ET_DUMPABLE
Proxy the flag to the roughly analogous FreeBSD procctl 'TRACE'.

TRACE-disabled processes are not coredumped, and Linux !DUMPABLE processes
can not be ptraced.  There are some additional semantics around ownership of
files in the /proc/[pid] pseudo-filesystem, which we do not attempt to
emulate correctly at this time.

Reviewed by:	markj (earlier version)
Differential Revision:	https://reviews.freebsd.org/D27015
2020-11-03 02:10:54 +00:00
Conrad Meyer
a98f03786e linux(4): style: Eliminate dead 'break' after 'return'
No functional change.
2020-11-03 01:10:27 +00:00
Mateusz Guzik
1024de70f9 linux: add missing conversions for compat.linux.use_emul_path handling 2020-10-26 18:02:52 +00:00
Edward Tomasz Napierala
62b1382ff3 Further improve prctl(2) debug.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26916
2020-10-24 14:23:44 +00:00
Edward Tomasz Napierala
1c7481377c Improve prctl(2) debug.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26899
2020-10-23 12:00:30 +00:00
Edward Tomasz Napierala
54669eb779 Add compat.linux.dummy_rlimits, and disable by default.
Turns out the dummy rlimits fix prlimit(1), but break su(8)
(login-1:4.5-1ubuntu2) - although not sudo(8), for some reason.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26814
2020-10-18 15:58:16 +00:00
Edward Tomasz Napierala
139c09788b Make linux getrlimit(2) and prlimit(2) return something reasonable
for linux-specific limits.  Fixes prlimit (util-linux-2.31.1-0.4ubuntu3.7).

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26777
2020-10-16 10:10:09 +00:00
Mateusz Guzik
1a18003240 compat: clean up empty lines in .c and .h files 2020-09-01 21:24:33 +00:00
Mateusz Guzik
a125ed50a6 linux: add sysctl compat.linux.use_emul_path
This is a step towards facilitating jails with only Linux binaries.
Supporting emul_path adds path lookups which are completely spurious
if the binary at hand runs in a Linux-based root directory.

It defaults to on (== current behavior).

make -C /root/linux-5.3-rc8 -s -j 1 bzImage:

use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total
use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total
2020-08-18 22:04:22 +00:00
Edward Tomasz Napierala
3d8dd98381 Make Linux uname(2) return x86_64 to 32-bit apps. This helps Steam.
PR:		kern/240432
Analyzed by by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25248
2020-06-15 20:12:10 +00:00
Mark Johnston
479f70ef24 Fix a couple of nits in Linux sysinfo(2) emulation.
- Use the same definition of free memory as Linux.
- Rename the totalbig and freebig fields to match the corresponding
  names on Linux.

Discussed with:	alc
MFC after:	1 week
2020-06-10 23:52:50 +00:00
Mark Johnston
27e4374dd4 Add a comment reflecting the commit log for r361945.
Suggested by:	alc
Reviewed by:	alc
MFC with:	r361945
2020-06-10 23:52:39 +00:00
Mark Johnston
3e5fae34fc Stop computing a "sharedram" value when emulating Linux sysinfo(2).
The previous code was computing an incorrect value in a very expensive
manner.  "sharedram" is supposed to be the amount of memory used by
named swap objects, which on FreeBSD basically corresponds to memory
usage by shared memory objects (including, for example, GEM objects) and
tmpfs.  We currently have no cheap way to count such pages.  The
previous code tried to determine the number of copy-on-write pages
shared between processes.

Just replace the computed value with 0.  illumos reportedly does the
same thing.  Linux itself did not populate this field until a 2014
commit, "mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces".

Reported by:	mjg
MFC after:	1 week
2020-06-08 22:29:52 +00:00
Tijl Coosemans
b4147bf6b4 Move compat.linux.map_sched_prio sysctl definition to linux_mib.c so it is
only defined by linux_common kernel module and not both linux and linux64
modules.

Reported by:	Yuri Pankov <ypankov@fastmail.com>
2020-03-05 14:41:27 +00:00
Tijl Coosemans
f8b9b299a2 linuxulator: Map scheduler priorities to Linux priorities.
On Linux the valid range of priorities for the SCHED_FIFO and SCHED_RR
scheduling policies is [1,99].  For SCHED_OTHER the single valid priority is
0.  On FreeBSD it is [0,31] for all policies.  Programs are supposed to
query the valid range using sched_get_priority_(min|max), but of course some
programs assume the Linux values are valid.

This commit adds a tunable compat.linux.map_sched_prio.  When enabled
sched_get_priority_(min|max) return the Linux values and sched_setscheduler
and sched_(get|set)param translate between FreeBSD and Linux values.

Because there are more Linux levels than FreeBSD levels, multiple Linux
levels map to a single FreeBSD level, which means pre-emption might not
happen as it does on Linux, so the tunable allows to disable this behaviour.
It is enabled by default because I think it is unlikely that anyone runs
real-time software under Linux emulation on FreeBSD that critically relies
on correct pre-emption.

This fixes FMOD, a commercial sound library used by several games.

PR:		240043
Tested by:	Alex S <iwtcex@gmail.com>
Reviewed by:	dchagin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D23790
2020-03-01 13:12:04 +00:00