Commit graph

390 commits

Author SHA1 Message Date
Dmitry Chagin
8287577434 linux(4): Trace Linux l_sigset_t.
MFC after:		2 weeks

(cherry picked from commit 9310737333)
2022-07-06 14:02:13 +03:00
Dmitry Chagin
50fe2c32d5 linux(4): Use the copyin_sigset() in the remaining places
MFC after:		2 weeks

(cherry picked from commit 5e872c279a)
2022-06-17 22:35:39 +03:00
Dmitry Chagin
089a76e915 Finish cpuset_getaffinity() after f35093f8
Split cpuset_getaffinity() into a two counterparts, where the
user_cpuset_getaffinity() is intended to operate on the cpuset_t from
user va, while kern_cpuset_getaffinity() expects the cpuset from kernel
va.
Accordingly, the code that clears the high bits is moved to the
user_cpuset_getaffinity(). Linux sched_getaffinity() syscall returns
the size of set copied to the user-space and then glibc wrapper clears
the high bits.

MFC after:		2 weeks

(cherry picked from commit d46174cd88)
2022-06-17 22:35:31 +03:00
Dmitry Chagin
959d0bdc24 linux(4): Deduplicate execve
As linux_execve is common across archs, except amd64 32-bit Linuxulator,
move it under compat/linux.

Noted by:               andrew@
MFC after:              2 weeks

(cherry picked from commit 26700ac0c4)
2022-06-17 22:35:30 +03:00
Dmitry Chagin
0e89b88d1c linux(4): Revert c7ef7c3 as it's wrong at all.
Reported by:		trasz

(cherry picked from commit 5326ebfd05)
2022-06-17 22:35:15 +03:00
Dmitry Chagin
72bc1e6806 cpuset: Byte swap cpuset for compat32 on big endian architectures
Summary:
BITSET uses long as its basic underlying type, which is dependent on the
compile type, meaning on 32-bit builds the basic type is 32 bits, but on
64-bit builds it's 64 bits.  On little endian architectures this doesn't
matter, because the LSB is always at the low bit, so the words get
effectively concatenated moving between 32-bit and 64-bit, but on
big-endian architectures it throws a wrench in, as setting bit 0 in
32-bit mode is equivalent to setting bit 32 in 64-bit mode.  To
demonstrate:

32-bit mode:

BIT_SET(foo, 0):        0x00000001

64-bit sees: 0x0000000100000000

cpuset is the only system interface that uses bitsets, so solve this
by swapping the integer sub-components at the copyin/copyout points.

Reviewed by:    kib
Sponsored by:   Juniper Networks, Inc.
Differential Revision:  https://reviews.freebsd.org/D35225

(cherry picked from commit 47a57144af)

Fix the build after 47a57144

(cherry picked from commit 89737eb829)

cpuset: Fix the KASAN and KMSAN builds

Rename the "copyin" and "copyout" fields of struct cpuset_copy_cb to
something less generic, since sanitizers define interceptors for
copyin() and copyout() using #define.

Reported by:    syzbot+2db5d644097fc698fb6f@syzkaller.appspotmail.com
Fixes:  47a57144af ("cpuset: Byte swap cpuset for compat32 on big endian architectures")
Sponsored by:   The FreeBSD Foundation

(cherry picked from commit 4a3e51335e)

Use Linux semantics for the thread affinity syscalls.

Linux has more tolerant checks of the user supplied cpuset_t's.

Minimum cpuset_t size that the Linux kernel permits in case of
getaffinity() is the maximum CPU id, present in the system / NBBY,
the maximum size is not limited.
For setaffinity(), Linux does not limit the size of the user-provided
cpuset_t, internally using only the meaningful part of the set, where
the upper bound is the maximum CPU id, present in the system, no larger
than the size of the kernel cpuset_t.
Unlike FreeBSD, Linux ignores high bits if set in the setaffinity(),
so clear it in the sched_setaffinity() and Linuxulator itself.

Reviewed by:            Pau Amma (man pages)
In collaboration with:  jhb
Differential revision:  https://reviews.freebsd.org/D34849
MFC after:              2 weeks

(cherry picked from commit f35093f8d6)
2022-06-17 22:35:14 +03:00
Dmitry Chagin
573a0f3e68 linux(4): Add a helper intended for copying timespec's from the userspace.
There are many places where we copyin Linux timespec from the userspace
and then convert it to the kernel timespec. To avoid code duplication
add a tiny halper for doing this.

MFC after:		2 weeks

(cherry picked from commit 707e567a40)
2022-06-17 22:34:59 +03:00
Dmitry Chagin
c0584b324e linux(4): Zero out high order bits of nanoseconds in the compat mode.
Assuming the kernel would use random data, the 64-bit Linux kernel ignores
upper 32 bits of tv_nsec of struct timespec64 for 32-bit binaries.

MFC after:		2 weeks

(cherry picked from commit 1579b320f1)
2022-06-17 22:34:58 +03:00
Dmitry Chagin
5a37b53f16 linux(4): Add a helper intended for copying timespec's to the userspace.
There are many places where we convert natvie timespec and copyout it to
the userspace. To avoid code duplication add a tiny halper for doing this.

MFC after:		2 weeks

(cherry picked from commit 9a9482f874)
2022-06-17 22:34:58 +03:00
Dmitry Chagin
b2f336bb73 linux(4): Implement sched_rr_get_interval_time64 syscall.
MFC after:		2 weeks

(cherry picked from commit 8c84ca657b)
2022-06-17 22:34:15 +03:00
Dmitry Chagin
c78e5c4c6a linux(4): Copyout pselect timeout.
According to pselect6 manual, on error timeout becomes undefined, by fact
Linux modifies the timeout and ignore EFAULT error if so.

MFC after:	2 weeks

(cherry picked from commit 5171ed79f6)
2022-06-17 22:34:04 +03:00
Dmitry Chagin
e4de536940 linux(4): Check that the thread tid in the thread group pid in linux_tdfind().
MFC after:		2 weeks

(cherry picked from commit fe894a3705)
2022-06-17 22:33:54 +03:00
Dmitry Chagin
82fa7c3b83 linux(4): Add compat.linux32.emulate_i386 knob.
Historically 32-bit Linuxulator under amd64 emulated the real i386
behavior. Since 3d8dd983 the old i386 Linux world can't be used under
amd64 Linuxulator as it don't know anything about amd64 machine (which
is returned now by newuname() syscall). So, add a knob to allow to swith
the behavior and use i386 Linux binaries on amd64.
Set knob to the new behavior as I think this is common to the modern
Linux distros.

Reviewed by:		Pau Amma (doc), emaste
Differential revision:	https://reviews.freebsd.org/D34708
MFC after:		2 weeks

(cherry picked from commit d5dc757e84)
2022-06-17 22:33:47 +03:00
Dmitry Chagin
ccbef519e0 linux(4): wait4() returns ESRCH if pid is INT_MIN.
Weird and undocumented patch was added to the Linux kernel in 2017,
fixes wait403 LTP test.

MFC after:	2 weeks

(cherry picked from commit 9103c5582a)
2022-06-17 22:33:46 +03:00
Dmitry Chagin
cbb2112dca linux(4): Fixup waitid handling P_PGID idtype.
Since Linux 5.4, if id is zero, then wait for any child that is in the same
process grop as the caller's process group.

Differential revision:  https://reviews.freebsd.org/D31567
MFC after:		2 weeks

(cherry picked from commit 4e3aefb923)
2022-06-17 22:33:45 +03:00
Dmitry Chagin
2c029214d3 linux(4): Return ENOSYS for unsupported P_PIDFD waitid idtype.
Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31559
MFC after:		2 weeks

(cherry picked from commit be1e4a0bdf)
2022-06-17 22:33:45 +03:00
Dmitry Chagin
638a0649cf linux(4): Add support for __WALL wait option bit.
As FreeBSD does not have __WALL option bit analogue explicitly set all
possible option bits to emulate Linux __WALL wait option bit.

Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31555
MFC after:		2 weeks

(cherry picked from commit c7ef7c3fac)
2022-06-17 22:33:45 +03:00
Dmitry Chagin
7df3421cb6 linux(4): Fixup options value validation in linux_waitid().
Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31554
MFC after:		2 weeks

(cherry picked from commit d7e1e138ec)
2022-06-17 22:33:44 +03:00
Dmitry Chagin
de1382dc4b linux(4): Eliminate bogus options value check validation.
The kernel do it for us.

Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31553
MFC after:		2 weeks

(cherry picked from commit 2fd480b318)
2022-06-17 22:33:44 +03:00
Dmitry Chagin
721dcd87ab linux(4): Consolidate wait* facility into linux_common_wait().
Also fix bug in waitid() implementation, use wru_self not wru_children.

Differential revision:  https://reviews.freebsd.org/D31552
MFC after:		2 weeks

(cherry picked from commit 0c6b1ff7de)
2022-06-17 22:33:44 +03:00
Mateusz Guzik
6a8133e329 linux: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 74a0e24f07)
2022-06-17 22:33:43 +03:00
Edward Tomasz Napierala
d005a6f2c8 linux: Provide dummy seccomp(2)
Don't emit messages; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it.  This is also similar with how we handle seccomp
in linux_prctl().

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D33808

(cherry picked from commit 99454d3e98)
2022-06-17 22:33:40 +03:00
Mateusz Guzik
0506fe6d3c linux: plug set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 0c8d7eebfd)
2022-06-17 22:33:40 +03:00
Mateusz Guzik
753605353d linux: remove the always curthread argument from lconvpath
(cherry picked from commit af4051d250)
2022-06-17 22:33:39 +03:00
Edward Tomasz Napierala
99950e8beb linux: Add ptrace(2) support on arm64
This moves linux_ptrace.c from sys/amd64/linux/ to sys/compat/linux/,
making it possible to use it on architectures other than amd64.
It also enables Linux ptrace(2) on arm64.

Relnotes:	yes
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32868

(cherry picked from commit a90ff3c4bc)
2022-06-17 22:33:37 +03:00
Dmitry Chagin
0cd177a3d3 linux(4): Retire linux_kplatform.
Assuming we can't run on i486, i586 class cpu, retire linux_kplatform var
and use hardcoded 'machine' value in linux_newuname().

I have added linux_kplatform for consistency with linux_platform which is
placed in to vdso to avoid excess copyout it on stack for AT_PLATFORM at
exec time.

This is the first stage of Linuxulator's vdso revision.

Reviewed by:            trasz, imp
Differential Revision:  https://reviews.freebsd.org/D30774
MFC after:              2 weeks

(cherry picked from commit c1da89fec2)
2022-06-17 22:30:23 +03:00
Dmitry Chagin
b7289a2e08 linux(4): Implement poll system call via linux_common_ppol()
for the sake of converting events to/from native.

MFC after:	2 weeks

(cherry picked from commit 2eff670fde)
2022-06-17 22:30:19 +03:00
Dmitry Chagin
b2deba043c linux(4): Rework Linux ppoll system call.
For now the Linux emulation layer uses in kernel ppoll(2) without
conversion of user supplied fd 'events', and does not convert the
kernel supplied fd 'revents'.

At least POLLRDHUP is handled by FreeBSD differently than by
Linux. Seems that Linux silencly ignores POLLRDHUP on non socket fd's
unlike FreeBSD, which does more strictly check and fails.

Rework the Linux ppoll, using kern_poll and converting 'events'
and 'revents' values.
While here, move poll events defines to the MI part of code as they
mostly identical on all arches except arm.

Differential Revision:	https://reviews.freebsd.org/D30716
MFC after:		2 weeks

(cherry picked from commit 26795a0378)
2022-06-17 22:30:19 +03:00
Dmitry Chagin
ad561d55e4 linux(4): Implement ppoll_time64 system call.
MFC after:	2 weeks

(cherry picked from commit ed61e0ce1d)
2022-06-17 22:30:17 +03:00
Dmitry Chagin
4250154ede linux(4): Implement pselect6_time64 system call.
MFC after:	2 weeks

(cherry picked from commit f6d075ecd7)
2022-06-17 22:30:16 +03:00
Dmitry Chagin
43ee9ad567 linux(4): Implement utimensat_time64 system call.
MFC after:	2 weeks

(cherry picked from commit e4bffb80bb)
2022-06-17 22:27:55 +03:00
Dmitry Chagin
a836accf4c linux(4): Microoptimize futimesat, utimes, utime.
While here wrap long line.

Differential Revision:	https://reviews.freebsd.org/D30488
MFC after:		2 weeks

(cherry picked from commit 2a0fa277f6)
2022-06-17 22:22:17 +03:00
Dmitry Chagin
5e777d88c4 linux(4): Handle AT_EMPTY_PATH in the utimensat syscall.
Differential Revision:	https://reviews.freebsd.org/D30518
MFC after:		2 weeks

(cherry picked from commit b4f9b6eef2)
2022-06-17 22:22:16 +03:00
Dmitry Chagin
0f5f0e5e56 linux(4): Convert flags before use in utimensat.
Differential Revision:	https://reviews.freebsd.org/D30487
MFC after:		2 weeks

(cherry picked from commit 8505eb5dd8)
2022-06-17 22:22:16 +03:00
Mark Johnston
9171b8068b cpuset: Fix the KASAN and KMSAN builds
Rename the "copyin" and "copyout" fields of struct cpuset_copy_cb to
something less generic, since sanitizers define interceptors for
copyin() and copyout() using #define.

Reported by:	syzbot+2db5d644097fc698fb6f@syzkaller.appspotmail.com
Fixes:	47a57144af ("cpuset: Byte swap cpuset for compat32 on big endian architectures")
Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 4a3e51335e)
2022-05-23 17:11:22 -05:00
Dmitry Chagin
01f281d0ee Fix the build after 47a57144
(cherry picked from commit 89737eb829)
2022-05-23 17:11:22 -05:00
Edward Tomasz Napierala
b5cae0d5ed linux(4): implement PR_SET_NO_NEW_PRIVS
This makes prctl(2) support PR_SET_NO_NEW_PRIVS, by mapping it
to the native PROC_NO_NEW_PRIVS_CTL procctl(2).

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30973

(cherry picked from commit 2f514e6f13)
2022-02-14 00:08:55 +00:00
Edward Tomasz Napierala
ec2700e015 linux: mute the "unsupported prctl option 23" warnings
Make the PR_CAPBSET_READ prctl(2) return EINVAL without logging
any warnings; this is way too noisy with Focal.

Sponsored by:	The FreeBSD Foundation
2021-01-13 10:31:56 +00:00
Conrad Meyer
76b2bfeda4 linux(4): Fix loadable modules after r367395
Move dtrace SDT definitions into linux_common module code.  Also, build
linux_dummy.c into the linux_common kld -- we don't need separate
versions of these stubs for 32- and 64-bit emulation.

Reported by:	several
PR:		250897
Discussed with:	emaste, trasz
Tested by:	John Kennedy, Yasuhiro KIMURA, Oleg Sidorkin
X-MFC-With:	r367395
Differential Revision:	https://reviews.freebsd.org/D27124
2020-11-06 22:04:57 +00:00
Conrad Meyer
eaa5afcefa linux(4) prctl(2): Implement PR_[GS]ET_DUMPABLE
Proxy the flag to the roughly analogous FreeBSD procctl 'TRACE'.

TRACE-disabled processes are not coredumped, and Linux !DUMPABLE processes
can not be ptraced.  There are some additional semantics around ownership of
files in the /proc/[pid] pseudo-filesystem, which we do not attempt to
emulate correctly at this time.

Reviewed by:	markj (earlier version)
Differential Revision:	https://reviews.freebsd.org/D27015
2020-11-03 02:10:54 +00:00
Conrad Meyer
a98f03786e linux(4): style: Eliminate dead 'break' after 'return'
No functional change.
2020-11-03 01:10:27 +00:00
Mateusz Guzik
1024de70f9 linux: add missing conversions for compat.linux.use_emul_path handling 2020-10-26 18:02:52 +00:00
Edward Tomasz Napierala
62b1382ff3 Further improve prctl(2) debug.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26916
2020-10-24 14:23:44 +00:00
Edward Tomasz Napierala
1c7481377c Improve prctl(2) debug.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26899
2020-10-23 12:00:30 +00:00
Edward Tomasz Napierala
54669eb779 Add compat.linux.dummy_rlimits, and disable by default.
Turns out the dummy rlimits fix prlimit(1), but break su(8)
(login-1:4.5-1ubuntu2) - although not sudo(8), for some reason.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26814
2020-10-18 15:58:16 +00:00
Edward Tomasz Napierala
139c09788b Make linux getrlimit(2) and prlimit(2) return something reasonable
for linux-specific limits.  Fixes prlimit (util-linux-2.31.1-0.4ubuntu3.7).

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26777
2020-10-16 10:10:09 +00:00
Mateusz Guzik
1a18003240 compat: clean up empty lines in .c and .h files 2020-09-01 21:24:33 +00:00
Mateusz Guzik
a125ed50a6 linux: add sysctl compat.linux.use_emul_path
This is a step towards facilitating jails with only Linux binaries.
Supporting emul_path adds path lookups which are completely spurious
if the binary at hand runs in a Linux-based root directory.

It defaults to on (== current behavior).

make -C /root/linux-5.3-rc8 -s -j 1 bzImage:

use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total
use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total
2020-08-18 22:04:22 +00:00
Edward Tomasz Napierala
3d8dd98381 Make Linux uname(2) return x86_64 to 32-bit apps. This helps Steam.
PR:		kern/240432
Analyzed by by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25248
2020-06-15 20:12:10 +00:00
Mark Johnston
479f70ef24 Fix a couple of nits in Linux sysinfo(2) emulation.
- Use the same definition of free memory as Linux.
- Rename the totalbig and freebig fields to match the corresponding
  names on Linux.

Discussed with:	alc
MFC after:	1 week
2020-06-10 23:52:50 +00:00