Commit graph

412 commits

Author SHA1 Message Date
Dmitry Chagin
ce693de706 linux(4): Deorbit linux_nosys
Differential Revision:	https://reviews.freebsd.org/D41901
MFC after:		1 week

(cherry picked from commit 199e397e9bf1076ae905e2742ef8e294870f5b27)
2023-10-10 08:12:12 +03:00
Dmitry Chagin
a321b17b8e linux(4): Deduplicate mmap2
To help porting the Linux emulation layer to a new platforms start using
Linux names for conditional builds instead of architecture-specific ifdefs.

MFC after:		1 week

(cherry picked from commit 2a1cf1b6b55c8326bbe85d0fdf17b0f2fb9b34ce)
2023-09-12 10:48:56 +03:00
Dmitry Chagin
4d17239617 linux(4): Deduplicate mprotect, madvise
MFC after:		1 week

(cherry picked from commit 553b1a4e4eb426d2c29a033b284c13bfbab30334)
2023-09-12 10:48:39 +03:00
Dmitry Chagin
3460fab5fc linux(4): Remove sys/cdefs.h inclusion where it's not needed due to 685dc743 2023-08-18 13:12:02 +03:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Dmitry Chagin
bbe017e041 linux(4): Add a dedicated ioprio system calls
On Linux these system calls have an effect only when used in conjuction
with an I/O scheduler that supports I/O priorities. If no I/O scheduler
has been set for a thread, then by defaut the I/O priority will follow
the CPU nice value. Due to FreeBSD lack of I/O scheduler facilities, the
default Linux behavior is implemented.

Ubuntu 23.04 debootstrap requires Linux ionice which depends on these
syscalls.

Differential Revision:	https://reviews.freebsd.org/D41153
MFC after:		1 month
2023-08-04 16:03:57 +03:00
Dmitry Chagin
8340b03425 linux(4): Add a dedicated linux_exec_copyin_args()
Because Linux allows to exec binaries with 0 argc.

Reviewed by:		brooks
Differential Revision:	https://reviews.freebsd.org/D40148
MFC after:		2 month
2023-05-29 12:18:16 +03:00
Dmitry Chagin
fd745e1db6 linux(4): Use pwd_altroot() to tell namei() about ABI root path
PR:			72920
Differential Revision:	https://reviews.freebsd.org/D40090
MFC after:		2 month
2023-05-29 11:16:46 +03:00
Dmitry Chagin
166e2e5a9e linux(4): Uniformly dev_t arguments translation
The two main uses of dev_t are in struct stat and as a parameter of the
mknod system calls.
As of version 2.6.0 of the Linux kernel, dev_t is a 32-bit quantity
with 12 bits set asaid for the major number and 20 for the minor number.
The in-kernel dev_t encoded as MMMmmmmm, where M is a hex digit of the
major number and m is a hex digit of the minor number.
The user-space dev_t encoded as mmmM MMmm, where M and m is the major
and minor numbers accordingly. This is downward compatible with legacy
systems where dev_t is 16 bits wide, encoded as MMmm.
In glibc dev_t is a 64-bit quantity, with 32-bit major and minor numbers,
encoded as MMMM Mmmm mmmM MMmm. This is downward compatible with the Linux
kernel and with legacy systems where dev_t is 16 bits wide.
In the FreeBSD dev_t is a 64-bit quantity. The major and minor numbers
are encoded as MMMmmmMm, therefore conversion of the device numbers between
Linux user-space and FreeBSD kernel required.
2023-04-28 11:55:05 +03:00
Dmitry Chagin
e185d83fc4 linux(4): Use inlined LINUX_KERNVER for tests to improve readability
MFC after:		1 month
2023-04-26 16:57:30 +03:00
Dmitry Chagin
c8a79231a5 linux(4): Rename linux_timer.h to linux_time.h
To avoid confusing people, rename linux_timer.h to linux_time.h,
as linux_timer.c is the implementation of timer syscalls only,
while linux_time.c contains implementation of all stuff declared
in linux_time.h.

MFC after:		2 weeks
2023-02-14 17:46:33 +03:00
Dmitry Chagin
d8e53d94fa linux(4): Cleanup includes under compat/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
acbbd5c039 linux(4): Cleanup sys/uio.h where linux_uitl.h is included
MFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
50c85a32d9 linux(4): Move uselib() to i386
This obsolete system call is not supported by glibc. In ancient libc
versions (before glibc 2.0), uselib() was used to load the shared
libraries with names found in an array of names in the binary.
On Linux, since 3.15, this system call is available only when
the kernel is configured with the CONFIG_USELIB option.

It doesn't look like anyone needs this syscall for others Linuxulators,
so move it to the corresponding MD Linuxulator.

MFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
513eb69edf linux(4): Cleanup sys/sysent.h from linux_util
Include sys/sysent.h directly where it needed. The linux_util.h included
in a most source files of the Linuxulator, avoid collecting a rarely used
includes here.

MFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
31e938c531 linux(4): Cleanup vm includes from linux_util.h
Include vm headers directly where they needed. The linux_util.h included
in a most source files of the Linuxulator, avoid collecting a rarely used
includes here.

MFC after:		2 weeks
2023-02-14 17:46:30 +03:00
Dmitry Chagin
10d16789a3 linux(4): Get rid of the opt_compat.h include.
Since e013e369 COMPAT_LINUX, COMPAT_LINUX32 build options are removed,
so include of opt_compat.h is no more needed.

MFC after:		2 weeks
2023-02-12 20:24:32 +03:00
Dmitry Chagin
3e0c56a717 linux(4): Use designated initializers.
MFC after:		1 week
2023-02-03 19:17:15 +03:00
Dmitry Chagin
9310737333 linux(4): Trace Linux l_sigset_t.
MFC after:		2 weeks
2022-06-22 14:09:54 +03:00
Dmitry Chagin
5e872c279a linux(4): Use the copyin_sigset() in the remaining places
MFC after:		2 weeks
2022-05-30 19:59:45 +03:00
Dmitry Chagin
d46174cd88 Finish cpuset_getaffinity() after f35093f8
Split cpuset_getaffinity() into a two counterparts, where the
user_cpuset_getaffinity() is intended to operate on the cpuset_t from
user va, while kern_cpuset_getaffinity() expects the cpuset from kernel
va.
Accordingly, the code that clears the high bits is moved to the
user_cpuset_getaffinity(). Linux sched_getaffinity() syscall returns
the size of set copied to the user-space and then glibc wrapper clears
the high bits.

MFC after:		2 weeks
2022-05-28 20:53:08 +03:00
Dmitry Chagin
26700ac0c4 linux(4): Deduplicate execve
As linux_execve is common across archs, except amd64 32-bit Linuxulator,
move it under compat/linux.

Noted by:		andrew@
MFC after:		2 weeks
2022-05-23 13:18:41 +03:00
Mark Johnston
4a3e51335e cpuset: Fix the KASAN and KMSAN builds
Rename the "copyin" and "copyout" fields of struct cpuset_copy_cb to
something less generic, since sanitizers define interceptors for
copyin() and copyout() using #define.

Reported by:	syzbot+2db5d644097fc698fb6f@syzkaller.appspotmail.com
Fixes:	47a57144af ("cpuset: Byte swap cpuset for compat32 on big endian architectures")
Sponsored by:	The FreeBSD Foundation
2022-05-20 10:34:25 -04:00
Dmitry Chagin
89737eb829 Fix the build after 47a57144 2022-05-19 21:40:59 +03:00
Dmitry Chagin
5326ebfd05 linux(4): Revert c7ef7c3 as it's wrong at all.
Reported by:		trasz
2022-05-11 21:00:54 +03:00
Dmitry Chagin
f35093f8d6 Use Linux semantics for the thread affinity syscalls.
Linux has more tolerant checks of the user supplied cpuset_t's.

Minimum cpuset_t size that the Linux kernel permits in case of
getaffinity() is the maximum CPU id, present in the system / NBBY,
the maximum size is not limited.
For setaffinity(), Linux does not limit the size of the user-provided
cpuset_t, internally using only the meaningful part of the set, where
the upper bound is the maximum CPU id, present in the system, no larger
than the size of the kernel cpuset_t.
Unlike FreeBSD, Linux ignores high bits if set in the setaffinity(),
so clear it in the sched_setaffinity() and Linuxulator itself.

Reviewed by:		Pau Amma (man pages)
In collaboration with:	jhb
Differential revision:	https://reviews.freebsd.org/D34849
MFC after:		2 weeks
2022-05-11 10:36:01 +03:00
Dmitry Chagin
707e567a40 linux(4): Add a helper intended for copying timespec's from the userspace.
There are many places where we copyin Linux timespec from the userspace
and then convert it to the kernel timespec. To avoid code duplication
add a tiny halper for doing this.

MFC after:		2 weeks
2022-05-08 16:16:47 +03:00
Dmitry Chagin
1579b320f1 linux(4): Zero out high order bits of nanoseconds in the compat mode.
Assuming the kernel would use random data, the 64-bit Linux kernel ignores
upper 32 bits of tv_nsec of struct timespec64 for 32-bit binaries.

MFC after:		2 weeks
2022-05-08 15:38:19 +03:00
Dmitry Chagin
9a9482f874 linux(4): Add a helper intended for copying timespec's to the userspace.
There are many places where we convert natvie timespec and copyout it to
the userspace. To avoid code duplication add a tiny halper for doing this.

MFC after:		2 weeks
2022-05-08 15:37:27 +03:00
Dmitry Chagin
8c84ca657b linux(4): Implement sched_rr_get_interval_time64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:47 +03:00
Dmitry Chagin
5171ed79f6 linux(4): Copyout pselect timeout.
According to pselect6 manual, on error timeout becomes undefined, by fact
Linux modifies the timeout and ignore EFAULT error if so.

MFC after:	2 weeks
2022-04-26 19:35:56 +03:00
Dmitry Chagin
fe894a3705 linux(4): Check that the thread tid in the thread group pid in linux_tdfind().
MFC after:		2 weeks
2022-04-25 10:21:51 +03:00
Dmitry Chagin
d5dc757e84 linux(4): Add compat.linux32.emulate_i386 knob.
Historically 32-bit Linuxulator under amd64 emulated the real i386
behavior. Since 3d8dd983 the old i386 Linux world can't be used under
amd64 Linuxulator as it don't know anything about amd64 machine (which
is returned now by newuname() syscall). So, add a knob to allow to swith
the behavior and use i386 Linux binaries on amd64.
Set knob to the new behavior as I think this is common to the modern
Linux distros.

Reviewed by:		Pau Amma (doc), emaste
Differential revision:	https://reviews.freebsd.org/D34708
MFC after:		2 weeks
2022-03-31 21:01:09 +03:00
Dmitry Chagin
9103c5582a linux(4): wait4() returns ESRCH if pid is INT_MIN.
Weird and undocumented patch was added to the Linux kernel in 2017,
fixes wait403 LTP test.

MFC after:	2 weeks
2022-03-31 20:49:39 +03:00
Dmitry Chagin
4e3aefb923 linux(4): Fixup waitid handling P_PGID idtype.
Since Linux 5.4, if id is zero, then wait for any child that is in the same
process grop as the caller's process group.

Differential revision:  https://reviews.freebsd.org/D31567
MFC after:		2 weeks
2022-03-31 20:44:00 +03:00
Dmitry Chagin
be1e4a0bdf linux(4): Return ENOSYS for unsupported P_PIDFD waitid idtype.
Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31559
MFC after:		2 weeks
2022-03-31 20:42:42 +03:00
Dmitry Chagin
c7ef7c3fac linux(4): Add support for __WALL wait option bit.
As FreeBSD does not have __WALL option bit analogue explicitly set all
possible option bits to emulate Linux __WALL wait option bit.

Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31555
MFC after:		2 weeks
2022-03-31 20:42:03 +03:00
Dmitry Chagin
d7e1e138ec linux(4): Fixup options value validation in linux_waitid().
Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31554
MFC after:		2 weeks
2022-03-31 20:41:19 +03:00
Dmitry Chagin
2fd480b318 linux(4): Eliminate bogus options value check validation.
The kernel do it for us.

Reviewed by:		emaste
Differential revision:  https://reviews.freebsd.org/D31553
MFC after:		2 weeks
2022-03-31 20:41:02 +03:00
Dmitry Chagin
0c6b1ff7de linux(4): Consolidate wait* facility into linux_common_wait().
Also fix bug in waitid() implementation, use wru_self not wru_children.

Differential revision:  https://reviews.freebsd.org/D31552
MFC after:		2 weeks
2022-03-31 20:40:22 +03:00
Mateusz Guzik
bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd) 2022-03-24 10:20:51 +00:00
Edward Tomasz Napierala
99454d3e98 linux: Provide dummy seccomp(2)
Don't emit messages; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it.  This is also similar with how we handle seccomp
in linux_prctl().

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D33808
2022-01-28 11:45:41 +00:00
Edward Tomasz Napierala
9caeb82eab Revert "linux: Provide dummy seccomp(2)"
This reverts commit 56981629f9.

Wrong patch; fails to build on i386.
2022-01-20 22:25:15 +00:00
Edward Tomasz Napierala
56981629f9 linux: Provide dummy seccomp(2)
Don't emit warnings; this isn't any different from a Linux kernel
built without OPTIONS_SECCOMP, so the userspace already needs to know
how to deal with it.  This is also similar with how we handle seccomp
in linux_prctl().

Sponsored By:	EPSRC
Differential Revision: https://reviews.freebsd.org/D33808
2022-01-25 11:54:00 +00:00
Mateusz Guzik
0c8d7eebfd linux: plug set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-18 13:12:15 +00:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Mateusz Guzik
af4051d250 linux: remove the always curthread argument from lconvpath 2021-11-25 22:50:42 +00:00
Mateusz Guzik
74a0e24f07 linux: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:16:03 +00:00
Edward Tomasz Napierala
a90ff3c4bc linux: Add ptrace(2) support on arm64
This moves linux_ptrace.c from sys/amd64/linux/ to sys/compat/linux/,
making it possible to use it on architectures other than amd64.
It also enables Linux ptrace(2) on arm64.

Relnotes:	yes
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32868
2021-11-07 08:39:24 +00:00
Edward Tomasz Napierala
2f514e6f13 linux(4): implement PR_SET_NO_NEW_PRIVS
This makes prctl(2) support PR_SET_NO_NEW_PRIVS, by mapping it
to the native PROC_NO_NEW_PRIVS_CTL procctl(2).

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30973
2021-07-03 08:42:37 +01:00