Commit graph

58 commits

Author SHA1 Message Date
Vico Chen
3c93ba3d7f linux(4): Convert flags in timerfd_create
The timerfd is introduced in FreeBSD 14, and the Linux ABI timerfd is
also moved to FreeBSD native timerfd, but it can't work well as Linux
TFD_CLOEXEC and TFD_NONBLOCK haven't been converted to FreeBSD
TFD_CLOEXEC and TFD_NONBLOCK.

Reviewed by:		dchagin, jfree
PR:			273662
Differential revision:	https://reviews.freebsd.org/D41708
MFC after:		1 week

(cherry picked from commit aadc14bceb4e94f5b75a05de96cd9619b877b030)
2023-09-12 10:47:54 +03:00
Jake Freeland
af93fea710 timerfd: Move implementation from linux compat to sys/kern
Move the timerfd impelemntation from linux compat code to sys/kern. Use
it to implement the new system calls for timerfd. Add a hook to kern_tc
to allow timerfd to know when the system time has stepped. Add kqueue
support to timerfd. Adjust a few names to be less Linux centric.

RelNotes: YES
Reviewed by: markj (on irc), imp, kib (with reservations), jhb (slack)
Differential Revision: https://reviews.freebsd.org/D38459
2023-08-24 14:28:56 -06:00
Dmitry Chagin
3460fab5fc linux(4): Remove sys/cdefs.h inclusion where it's not needed due to 685dc743 2023-08-18 13:12:02 +03:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Dmitry Chagin
c8a79231a5 linux(4): Rename linux_timer.h to linux_time.h
To avoid confusing people, rename linux_timer.h to linux_time.h,
as linux_timer.c is the implementation of timer syscalls only,
while linux_time.c contains implementation of all stuff declared
in linux_time.h.

MFC after:		2 weeks
2023-02-14 17:46:33 +03:00
Dmitry Chagin
d8e53d94fa linux(4): Cleanup includes under compat/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
10d16789a3 linux(4): Get rid of the opt_compat.h include.
Since e013e369 COMPAT_LINUX, COMPAT_LINUX32 build options are removed,
so include of opt_compat.h is no more needed.

MFC after:		2 weeks
2023-02-12 20:24:32 +03:00
Konstantin Belousov
c6d31b8306 AST: rework
Make most AST handlers dynamically registered.  This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it.  For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.

Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit.  For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.

The dynamic registration also allows third-party modules to register AST
handlers if needed.  There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it.  In fact, this
is already present behavior for hwpmc.ko and ufs.ko.  I do not think it
is worth the efforts and the runtime overhead to try to fix it.

Reviewed by:	markj
Tested by:	emaste (arm64), pho
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35888
2022-08-02 21:11:09 +03:00
Dmitry Chagin
9310737333 linux(4): Trace Linux l_sigset_t.
MFC after:		2 weeks
2022-06-22 14:09:54 +03:00
Dmitry Chagin
707e567a40 linux(4): Add a helper intended for copying timespec's from the userspace.
There are many places where we copyin Linux timespec from the userspace
and then convert it to the kernel timespec. To avoid code duplication
add a tiny halper for doing this.

MFC after:		2 weeks
2022-05-08 16:16:47 +03:00
Dmitry Chagin
ce9f8d6ab0 linux(4): Implement timerfd_gettime64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:52 +03:00
Dmitry Chagin
b1f0b08d93 linux(4): Implement timerfd_settime64 syscall.
MFC after:		2weeks
2022-05-04 13:06:50 +03:00
Dmitry Chagin
e00aad1042 linux(4): Add epoll_pwai2 syscall.
MFC after:	2 weeks
2022-04-26 19:35:59 +03:00
Dmitry Chagin
3923e63209 linux(4): Add copyin_sigset() helper.
MFC after:	2 weeks
2022-04-26 19:35:57 +03:00
Dmitry Chagin
27a25179c8 linux(4): Add linux_epoll_pwait_ts() helper for future use.
MFC after:	2 weeks
2022-04-26 19:35:56 +03:00
Mateusz Guzik
74a0e24f07 linux: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:16:03 +00:00
Mateusz Guzik
2b68eb8e1d vfs: remove thread argument from VOP_STAT
and fo_stat.
2021-10-11 13:22:32 +00:00
Dmitry Chagin
2b38186330 Drop rdivacky@ "All rights reserved" from linux_event.
I got explicit permission from Roman.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30913
MFC after:		2 weeks
2021-07-20 10:06:16 +03:00
Dmitry Chagin
1ca6b15bbd Drop "All rights reserved" from my copyright statements.
Add email and fixup years while here.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30912
MFC after:		2 weeks
2021-07-20 10:05:50 +03:00
Vladimir Kondratyev
b3c6fe663b epoll: Store epoll_event udata member in ext member of kevent.
Current epoll implementation stores udata fields of epoll_event
structure in special dynamically-sized table rather than in udata field
of backing kevent structure because of 2 reasons:
1. Kevent's udata size is smaller than epoll's on 32-bit archs.
2. Kevent's udata can be clobbered on execution EPOLL_CTL_ADD as kqueue
   modifies existing event while epoll returns error in this case.

After r320043 has introduced four new 64bit user data members (ext[]),
we can store epoll udata in one of them and drop aforementioned table.
According to kqueue_register() source code ext members are not updated
when existing kevent is modified that fixes p.2.

As a side effect the patch fixes PR/252582.

Reviewed by:	trasz
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D28169
2021-02-08 02:46:14 +03:00
shu
14c40d2c29 linux: remove locks around callout_drain in timerfd_close()
The lock around callout_drain() is unnecessary and may cause
deadlock when one closes a timer descriptor during timer execution.

Reviewed By:	delphij
Submitted By:	ankohuu_outlook.com (Shunchao Hu)
Differential Revision: https://reviews.freebsd.org/D28148
2021-02-03 19:47:38 +00:00
shu
ae71b794cb linux: make timerfd_settime(2) set expirations count to zero
On Linux, read(2) from a timerfd file descriptor returns an unsigned
8-byte integer (uint64_t) containing the number of expirations
that have occurred, if the timer has already expired one or more
times since its settings were last modified using timerfd_settime(),
or since the last successful read(2).  That's to say, once we do
a read or call timerfd_settime(), timer fd's expiration count should
be zero.  Some Linux applications create timerfd and add it to epoll
with LT mode, when event comes, they do timerfd_settime instead
of read to stop event source from trigger.  On FreeBSD,
timerfd_settime(2) didn't set the count to zero, which caused high
CPU utilization.

Submitted by:	ankohuu_outlook.com (Shunchao Hu)
Differential Revision: https://reviews.freebsd.org/D28231
2021-02-03 19:08:40 +00:00
Mateusz Guzik
6b3a9a0f3d Convert remaining cap_rights_init users to cap_rights_init_one
semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
2021-01-12 13:16:10 +00:00
Konstantin Belousov
7a202823aa Expose eventfd in the native API/ABI using a new __specialfd syscall
eventfd is a Linux system call that produces special file descriptors
for event notification. When porting Linux software, it is currently
usually emulated by epoll-shim on top of kqueues.  Unfortunately, kqueues
are not passable between processes.  And, as noted by the author of
epoll-shim, even if they were, the library state would also have to be
passed somehow.  This came up when debugging strange HW video decode
failures in Firefox.  A native implementation would avoid these problems
and help with porting Linux software.

Since we now already have an eventfd implementation in the kernel (for
the Linuxulator), it's pretty easy to expose it natively, which is what
this patch does.

Submitted by:   greg@unrelenting.technology
Reviewed by:    markj (previous version)
MFC after:      2 weeks
Differential Revision:  https://reviews.freebsd.org/D26668
2020-12-27 12:57:26 +02:00
Konstantin Belousov
7cb901bf22 Remove useless ARGUSED annotations.
Submitted by:	greg@unrelenting.technology
2020-12-27 12:57:26 +02:00
Konstantin Belousov
11c9f2ff1a Add SPDX tag.
Submitted by:	greg@unrelenting.technology
2020-12-27 12:57:26 +02:00
Mateusz Guzik
1a18003240 compat: clean up empty lines in .c and .h files 2020-09-01 21:24:33 +00:00
Edward Tomasz Napierala
86e794eb65 Don't use newlines with linux_msg(). No functional changes.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-11 14:57:30 +00:00
Vladimir Kondratyev
71b8e362c5 Linux epoll: Allow passing of any negative timeout value to epoll_wait
Linux epoll allow passing of any negative timeout value to epoll_wait()
to cause unbound blocking

Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22517
2019-11-24 20:51:09 +00:00
Vladimir Kondratyev
335fe0afb8 Linux epoll: Register events with zero event mask
Such an events are legal and should be interpreted as EPOLLERR | EPOLLHUP.
Register a disabled kqueue event in that case as we do not support EPOLLHUP yet.

Required by Linux Steam client.

PR:		240590
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22516
2019-11-24 20:47:40 +00:00
Vladimir Kondratyev
461120b834 Linux epoll: Check both read and write kqueue events existence in EPOLL_CTL_ADD
Linux epoll EPOLL_CTL_ADD op handler should always check registration
of both EVFILT_READ and EVFILT_WRITE kevents to deceide if supplied
file descriptor fd is already registered with epoll instance.

Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22515
2019-11-24 20:44:14 +00:00
Vladimir Kondratyev
896a4c279d Linux epoll: Don't deregister file descriptor after EPOLLONESHOT is fired
Linux epoll does not remove descriptor after one-shot event has been triggered.
Set EV_DISPATCH kqueue flag rather then EV_ONESHOT to get the same behavior.

Required by Linux Steam client.

PR:		240590
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste, imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22513
2019-11-24 20:41:47 +00:00
Mark Johnston
792843c38f Pass malloc flags directly through kevent(2) subroutines.
Some kevent functions have a boolean "waitok" parameter for use when
calling malloc(9).  Replace them with the corresponding malloc() flags:
the desired behaviour is known at compile-time, so this eliminates a
couple of conditional branches, and makes the code easier to read.

No functional change intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18318
2018-11-24 17:06:01 +00:00
Alan Somers
6040822c4e Make timespecadd(3) and friends public
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.

Our kernel currently defines two-argument versions of timespecadd and
timespecsub.  NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions.  Solaris also defines a three-argument version, but
only in its kernel.  This revision changes our definition to match the
common three-argument version.

Bump _FreeBSD_version due to the breaking KPI change.

Discussed with:	cem, jilles, ian, bde
Differential Revision:	https://reviews.freebsd.org/D14725
2018-07-30 15:46:40 +00:00
Ed Maste
931e2a1a6e linuxulator: do not include legacy syscalls on arm64
Existing linuxulator platforms (i386, amd64) support legacy syscalls,
such as non-*at ones like open, but arm64 and other new platforms do
not.

Wrap these in #ifdef LINUX_LEGACY_SYSCALLS, #defined in the MD linux.h
files.  We may need finer grained control in the future but this is
sufficient for now.

Reviewed by:	andrew
Sponsored by:	Turing Robotic Industries
Differential Revision:	https://reviews.freebsd.org/D15237
2018-06-15 14:41:51 +00:00
Matt Macy
cbd92ce62e Eliminate the overhead of gratuitous repeated reinitialization of cap_rights
- Add macros to allow preinitialization of cap_rights_t.

- Convert most commonly used code paths to use preinitialized cap_rights_t.
  A 3.6% speedup in fstat was measured with this change.

Reported by:	mjg
Reviewed by:	oshogbo
Approved by:	sbruno
MFC after:	1 month
2018-05-09 18:47:24 +00:00
Ed Maste
132f90c660 Linuxolator whitespace cleanup
A version of each of the MD files by necessity exists for each CPU
architecture supported by the Linuxolator.  Clean these up so that new
architectures do not inherit whitespace issues.

Clean up shared Linuxolator files while here.

Sponsored by:	Turing Robotic Industries Inc.
2018-02-05 17:29:12 +00:00
Dmitry Chagin
0633df29be Linux epoll return EEXIST on case when op is EPOLL_CTL_ADD, and the supplied
file descriptor fd is already registered with this epoll instance.

MFC after:	1 month
2017-02-28 19:55:16 +00:00
Dmitry Chagin
8bc4edfd75 Linux epoll return ENOENT error in case when op is EPOLL_CTL_MOD or
EPOLL_CTL_DEL, and fd is not registered with this epoll instance.

MFC after:	1 month
2017-02-28 19:54:22 +00:00
Dmitry Chagin
29b1ecbe01 FreeBSD does not have analgue for epill EPOLLPRI event type.
So, do not set EPOLLPRI event acidently.
Also, do not set EPOLLWRNORM and EPOLLRDNORM events as epoll
do not set this events.

MFC after:	1 month
2017-02-28 19:49:21 +00:00
Dmitry Chagin
281afc595f Return EINVAL when an invalid file descriptor specified.
MFC after:	1 month
2017-02-27 16:55:09 +00:00
Dmitry Chagin
ed211c40f4 Unify eventfd ioctl method and use it for other similar interfaces.
MFC after:	1 month
2017-02-27 16:53:52 +00:00
Dmitry Chagin
2fa6d2fe84 Return EINVAL in case when an invalid size of signal mask specified.
MFC after:	1 month
2017-02-26 20:01:58 +00:00
Dmitry Chagin
80c7315d17 Restore signal mask in epoll_pwait.
MFC after:	1 month
2017-02-26 19:54:17 +00:00
Dmitry Chagin
e0a254f6df Return EINVAL when an invalid file descriptor is specified.
MFC after:	1 month
2017-02-26 19:51:44 +00:00
Dmitry Chagin
dd93b628e9 Implement timerfd family syscalls.
MFC after:	1 month
2017-02-26 09:48:18 +00:00
Dmitry Chagin
01d4a346c2 Nostly style(9) changes, replace unused eventfd_truncate()
by default invfo_truncate() method.

MFC after:	1 month
2017-02-26 09:42:34 +00:00
Ed Maste
df4336ddfa Catch up to sys/capability.h rename to sys/capsicum.h in r263232
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-09-19 18:44:43 +00:00
Dmitry Chagin
09806d8e3e When write(2) on eventfd object fails with the error EAGAIN do not return
the number of bytes written.

MFC after:	1 week
2016-03-26 19:16:53 +00:00