When the file system does not support symbolic links (like in the case
of MSDOS), symlink() returns -1 and sets errno to EOPNOTSUPP.
Document this behavior.
Reviewed by: glebius, markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D49803
This patch updates the man page for the MNT_NAMEDATTR flag.
Another man page that explains named attributes will
be introduced in a future commit.
This is a content change.
Reviewed by: manpages (zaiee)
Fixes: 2ec2ba7e23 ("vfs: Add VFS/syscall support for Solaris style extended attributes")
Differential Revision: https://reviews.freebsd.org/D49719
This patch updates the man page for the O_NAMEDATTR flag.
Another man page that explains named attributes will
be introduced in a future commit.
This is a content change.
Reviewed by: manpages (zaiee)
Fixes: 2ec2ba7e ("Add support for Solaris style extended attr")
Differential Revision: https://reviews.freebsd.org/D49718
The ones that were effectively unchanged from
d97e44784bb5a^..e24279e0f9e did not have `.Dd` bumped. Only
the ones that had a net content change between those
revisions.
MFC after: 2 weeks
MFC with: d97e44784be24279e0f9
aio(4) is a hard requirement in the kernel as of f3215338ef. The
scenario that the patch was submitted for is no longer possible.
This isn't a straight up revert since the previous change also addressed
some minor issues.
PR: 190942
Reported by: asomers
MFC after: 2 weeks
MFC with: d97e44784b
Fixes: d97e44784b ("aio_*(2): mention ENOSYS under ERRORS")
Differential Revision: https://reviews.freebsd.org/D49541
ENOSYS can occur if aio(4) is not loaded in the kernel. Document this
behavior so consumers on FreeBSD can better understand that this is a
possible scenario.
Clean up the manpages slightly while here:
- Sort `ERRORS` by errno(3).
- Use `.Fx` instead of `FreeBSD`.
MFC after: 2 weeks
Reviewed by: ziaee
PR: 190942
Differential Revision: https://reviews.freebsd.org/D49502
Previously, __realpathat was in libc and libsys (as is currently
standard), but not exported from libc which meant the stub in libc was
not filtered and thus libc's copy of the syscall was used. This broke
an upcoming change to CheriBSD limiting syscalls to libsys.
The realpath(3) implementation now uses __sys___realpathat so there are no
consumers of __realpathat. Switch it to PSEUDO (only _foo and __sys_foo
symbols) and remove __realpathat from Symbol.map.
This is a corrected version of 58d43a3cd7.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D49049
Previously, __realpathat was in libc and libsys (as is currently
standard), but not exported from libc which meant the stub in libc was
not filtered and thus libc's copy of the syscall was used. This broke
an upcoming change to CheriBSD limiting syscalls to libsys.
The realpath(3) implementation now uses __sys___realpathat so there are no
consumers of __realpathat. Switch it to PSEUDO (only _foo and __sys_foo
symbols) and remove __realpathat from Symbol.map.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D49049
Clarify the RETURN VALUES section with improved structure,
the condition of the return value 0, and the setting of errno.
PR: 174581
Reviewed by: jhb, ziaee
Approved by: mhorne (mentor)
Differential Revision: https://reviews.freebsd.org/D48955
The SO_SETFIB option can be used to set a socket's FIB number, but there
is no way to retrieve it. Rename SO_SETFIB to SO_FIB and implement a
handler for it for getsockopt(2).
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D48834
- Add fstatfs(), fchdir(), fchroot(), extattr_*_fd(), cap_*_get(),
cap_*_limit() to the list of syscalls that can take an O_PATH fd.
- Remove readlinkat() from the list, since it is already discussed
in the first few lines of the paragraph. It was originally added
to the list when readlinkat() adds support for non-dir fd with
an empty relative path (as if with AT_EMPTY_PATH), however,
such use case is also discussed in the next paragraph.
- Add funlinkat() to the list, since it accepts an extra fd
(of the file to be unlinked), which is worth extra mentioning.
- Fix a syntax issue which causes a bogus space to be rendered
before a closing parentheses.
Signed-off-by: CismonX <admin@cismon.net>
Reviewed by: markj, jhb
MFC after: 2 weeks
Pull Request: https://github.com/freebsd/freebsd-src/pull/1564
POSIX used to specify that munmap shall fail with EINVAL if the addr
argument is not a multiple of the page size, but that was changed to
may fail. Note that we conform to contemporary POSIX and include a
brief note for portable programs.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D48481
We previously claimed that non-page-aligned addresses would return
EINVAL, but the address is in fact rounded down to the page boundary.
Reported by: Harald Eilertsen <haraldei@anduin.net>
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Fixes: dabee6fecc ("kern_descrip.c: add fdshare()/fdcopy()")
Differential Revision: https://reviews.freebsd.org/D48465
- Use a typical tagged list for the open flags instead of a literal
block. This permits using markup in the flag descriptions. Also,
drop the offset to avoid indenting the entire list.
- Note that O_RESOLVE_BENEATH only applies to openat(2)
- Use a clearer description of O_CLOEXEC (what it means, not the
internal flag it sets)
- Note that exactly one permission flag is required.
- Split up a paragraph on various flags so that each flag gets its own
paragraph. Some flags already had their own paragraph, so this is
more consistent. It also makes it clearer which flag a sentence is
talking about when a flag has more than one sentence.
- Appease some errors from igor and man2ps
- In the discussion about a returned directory descriptor opened with
O_SEARCH, avoid the use of Fa fd since the descriptor in question is
a return value and not an argument to open or openat.
- Various and sundry markup and language tweaks
Reviewed by: kib, emaste
Differential Revision: https://reviews.freebsd.org/D48253
- Use consistent language to describe user values unchanged by the
kernel.
- Replace passive language with active in a few places.
- Add a history note for kqueuex() and kqueue1().
- Add an MLINK and synopsis for kqueue1().
- Various wording and markup tweaks.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D48203
The SUS doesn't mention this error code as a possible one [1]. The FreeBSD
manual page specifies a possible ECONNRESET for close(2):
[ECONNRESET] The underlying object was a stream socket that was
shut down by the peer before all pending data was
delivered.
In the past it had been EINVAL (see 21367f630d), and this EINVAL was
added as a safety measure in 623dce13c6. After conversion to
ECONNRESET it had been documented in the manual page in 78e3a7fdd5, but
I bet wasn't ever tested to actually be ever returned, cause the
tcp-testsuite[2] didn't exist back then. So documentation is incorrect
since 2006, if my bet wins. Anyway, in the modern FreeBSD the condition
described above doesn't end up with ECONNRESET error code from close(2).
The error condition is reported via SO_ERROR socket option, though. This
can be checked using the tcp-testsuite, temporarily disabling the
getsockopt(SO_ERROR) lines using sed command [3]. Most of these
getsockopt(2)s are followed by '+0.00 close(3) = 0', which will confirm
that close(2) doesn't return ECONNRESET even on a socket that has the
error stored, neither it is returned in the case described in the manual
page. The latter case is covered by multiple tests residing in tcp-
testsuite/state-event-engine/rcv-rst-*.
However, the deleted block of code could be entered in a race condition
between close(2) and processing of incoming packet, when connection had
already been half-closed with shutdown(SHUT_WR) and sits in TCPS_LAST_ACK.
This was reported in the bug 146845. With the block deleted, we will
continue into tcp_disconnect() which has proper handling of INP_DROPPED.
The race explanation follows. The connection is in TCPS_LAST_ACK. The
network input thread acquires the tcpcb lock first, sets INP_DROPPED,
acquires the socket lock in soisdisconnected() and clears SS_ISCONNECTED.
Meanwhile, the syscall thread goes through sodisconnect() which checks for
SS_ISCONNECTED locklessly(!). The check passes and the thread blocks on
the tcpcb lock in tcp_usr_disconnect(). Once input thread releases the
lock, the syscall thread observes INP_DROPPED and returns ECONNRESET.
- Thread 1: tcp_do_segment()->tcp_close()->in_pcbdrop(),soisdisconnected()
- Thread 2: sys_close()...->soclose()->sodisconnect()->tcp_usr_disconnect()
Note that the lockless operation in sodisconnect() isn't correct, but
enforcing the socket lock there will not fix the problem.
[1] https://pubs.opengroup.org/onlinepubs/9799919799/
[2] https://github.com/freebsd-net/tcp-testsuite
[3] sed -i "" -Ee '/\+0\.00 getsockopt\(3, SOL_SOCKET, SO_ERROR, \[ECONNRESET\]/d' $(grep -lr ECONNRESET tcp-testsuite)
PR: 146845
Reviewed by: tuexen, rrs, imp
Differential Revision: https://reviews.freebsd.org/D48148
This new system call allows to set all necessary credentials of
a process in one go: Effective, real and saved UIDs, effective, real and
saved GIDs, supplementary groups and the MAC label. Its advantage over
standard credential-setting system calls (such as setuid(), seteuid(),
etc.) is that it enables MAC modules, such as MAC/do, to restrict the
set of credentials some process may gain in a fine-grained manner.
Traditionally, credential changes rely on setuid binaries that call
multiple credential system calls and in a specific order (setuid() must
be last, so as to remain root for all other credential-setting calls,
which would otherwise fail with insufficient privileges). This
piecewise approach causes the process to transiently hold credentials
that are neither the original nor the final ones. For the kernel to
enforce that only certain transitions of credentials are allowed, either
these possibly non-compliant transient states have to disappear (by
setting all relevant attributes in one go), or the kernel must delay
setting or checking the new credentials. Delaying setting credentials
could be done, e.g., by having some mode where the standard system calls
contribute to building new credentials but without committing them. It
could be started and ended by a special system call. Delaying checking
could mean that, e.g., the kernel only verifies the credentials
transition at the next non-credential-setting system call (we just
mention this possibility for completeness, but are certainly not
endorsing it).
We chose the simpler approach of a new system call, as we don't expect
the set of credentials one can set to change often. It has the
advantages that the traditional system calls' code doesn't have to be
changed and that we can establish a special MAC protocol for it, by
having some cleanup function called just before returning (this is
a requirement for MAC/do), without disturbing the existing ones.
The mac_cred_check_setcred() hook is passed the flags received by
setcred() (including the version) and both the old and new kernel's
'struct ucred' instead of 'struct setcred' as this should simplify
evolving existing hooks as the 'struct setcred' structure evolves. The
mac_cred_setcred_enter() and mac_cred_setcred_exit() hooks are always
called by pairs around potential calls to mac_cred_check_setcred().
They allow MAC modules to allocate/free data they may need in their
mac_cred_check_setcred() hook, as the latter is called under the current
process' lock, rendering sleepable allocations impossible. MAC/do is
going to leverage these in a subsequent commit. A scheme where
mac_cred_check_setcred() could return ERESTART was considered but is
incompatible with proper composition of MAC modules.
While here, add missing includes and declarations for standalone
inclusion of <sys/ucred.h> both from kernel and userspace (for the
latter, it has been working thanks to <bsm/audit.h> already including
<sys/types.h>).
Reviewed by: brooks
Approved by: markj (mentor)
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47618
Note in the manpage that the 2024 edition finally added ppoll(), and
also add the appropriate declarations for the correct versions of
_POSIX_C_SOURCE (via __POSIX_VISIBLE).
Differential Revision: https://reviews.freebsd.org/D48043
I added a third value for kern.logsigexit to mean 'auto' as an abundance
of caution, but I don't know how much it matters -- that can be easily
consolidated back to boolean-ish.
This is primarily targeted towards people running test suites under CI
(e.g. buildbot, jenkins). Oftentimes tests entail segfaults that are
expected, and logs get spammed -- this can be particularly high volume
depending on the application. Per-process control of this behavior is
desirable because they may still want to be logging legitimate
segfaults, so the system-wide atomic bomb kern.logsigexit=0 is not a
great option.
This adds a process flag to disable it, controllable via
procctl(2)/proccontrol(1); the latter knows it as "sigexitlog" due to
its length, but it's referred to almost everywhere else as
"sigexit_log."
Reviewed by: kib (earlier version), pstef
Differential Revision: https://reviews.freebsd.org/D21903
Note in the manpage that the 2024 edition finally added ppoll(), and
also add the appropriate declarations for the correct versions of
_POSIX_C_SOURCE.
Differential Revision: https://reviews.freebsd.org/D48043
- Add some missing .Pp macros after the end of literal blocks and some
lists to ensure there is a blank line before the following text.
- Use an indent of Ds for nested lists to reduce excessive indentation and
make the bodies of the nested list items easier to read.
- Various and sundry rewordings and clarifications.
Reviewed by: kib, emaste
Differential Revision: https://reviews.freebsd.org/D47782
This is similar to chroot(2), but takes a file descriptor instead
of path. Same syscall exists in NetBSD and Solaris. It is part of a larger
patch to make absolute pathnames usable in Capsicum mode, but should
be useful in other contexts too.
Reviewed By: brooks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41564
Also remove some information from HISTORY that is no longer needed (and
could be confusing), now that _Fork is part of a standard.
Reported by: kib
Reviewed by: imp, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47588
With this patch, it is possible to call fchmod() on a unix socket prior
to binding it to the filesystem namespace, so that the mode is set
atomically. Without this, one has to call chmod() after bind(), leaving
a window where threads can connect to the socket with the default mode.
After bind(), fchmod() reverts to failing with EINVAL.
This interface is copied from Linux.
The behaviour of fstat() is unmodified, i.e., it continues to return the
mode as set by soo_stat().
PR: 282393
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D47361
These were reported by `mandoc -T lint ...` as warnings:
- unusual Xr order
- unusual Xr punctuation
Fixes made by script in https://github.com/Tarsnap/freebsd-doc-scripts
Signed-off-by: Graham Percival <gperciva@tarsnap.com>
Reviewed by: mhorne, Alexander Ziaee <concussious.bugzilla@runbox.com>
Sponsored by: Tarsnap Backup Inc.
Pull Request: https://github.com/freebsd/freebsd-src/pull/1464