Rework the routines to convert a native statfs structure (with fixed-size 64-bit
counters) to a Linux statfs structure (with long-sized counters) for 32-bit apps.
Instead of following Linux and return an EOVERFLOW error from statfs() family of
syscalls when actual fs stat value(s) are large enough to not fit into 32 bits,
apply scale logics used by FreeBSD to convert a 5.x statfs structure to a 4.x
statfs structure.
For more details see cc479dda.
Tested by: glebius
MFC after: 1 week
Specific to Linux AT_NO_AUTOMOUNT flag tells the kernel to not automount the
terminal component of pathname if it is a directory that is an automount point.
As it is the default for FreeBSD silencly ignore this flag.
glibc-2.34 uses this flag in the stat64 system calls which is used by i386.
Reviewed by: trasz
Differential revision: https://reviews.freebsd.org/D31524
MFC after: 2 weeks
Filter out the flags we do support; previously we would print
out the flag value verbatim.
Reviewed By: dchagin
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D30693
Without it, Qt5 apps from Focal fail to start, being unable to load
their plugins. It's also necessary for glibc 2.33, as found in recent
Arch snapshots.
PR: 254112
Reviewed By: kib
Sponsored by: The FreeBSD Foundation, EPSRC
Differential Revision: https://reviews.freebsd.org/D28192
This is a step towards facilitating jails with only Linux binaries.
Supporting emul_path adds path lookups which are completely spurious
if the binary at hand runs in a Linux-based root directory.
It defaults to on (== current behavior).
make -C /root/linux-5.3-rc8 -s -j 1 bzImage:
use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total
use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total
The reason for this is to work around an idiosyncrasy of glibc
getttynam(3) implementation: it checks whether st_dev returned for
fd 0 is the same as st_dev returned for the target of /proc/self/fd/0
symlink, and with linux chroots having their own devfs instance,
the check will fail if you chrooted into it.
PR: kern/240767
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25559
The previous behavior of leaving VI_OWEINACT vnodes on the active list without
a hold count is eliminated. Hold count is kept and inactive processing gets
explicitly deferred by setting the VI_DEFINACT flag. The syncer is then
responsible for vdrop.
Reviewed by: kib (previous version)
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D23036
devices. It's required for LTP, among other things. It's not
complete, but good enough for now.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22950
of them listed in opt_global.h which is not generated while building
modules outside of a kernel and such modules never match real cofigured
kernel.
So, we should prevent our users from building obviously defective modules.
Therefore, remove the root cause of the building of modules outside of a
kernel - the possibility of building modules with DEBUG or KTR flags.
And remove all of DEBUG printfs as it is incomplete and in threaded
programms not informative, also a half of system call does not have DEBUG
printf. For debuging Linux programms we have dtrace, ktr and ktrace ability.
PR: 222861
Reviewed by: trasz
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20178
Existing linuxulator platforms (i386, amd64) support legacy syscalls,
such as non-*at ones like open, but arm64 and other new platforms do
not.
Wrap these in #ifdef LINUX_LEGACY_SYSCALLS, #defined in the MD linux.h
files. We may need finer grained control in the future but this is
sufficient for now.
Reviewed by: andrew
Sponsored by: Turing Robotic Industries
Differential Revision: https://reviews.freebsd.org/D15237
the old encodings for the lower 16 and 32 bits and only using the
higher 32 bits for unusually large major and minor numbers. This
change breaks compatibility with the previous encoding (which was only
used in -current).
Fix truncation to (essentially) 16-bit dev_t in newnfs v3.
Any encoding of device numbers gives an ABI, so it can't be changed
without translations for compatibility. Extra bits give the much
larger complication that the translations need to compress into fewer
bits. Fortunately, more than 32 bits are rarely needed, so
compression is rarely needed except for 16-bit linux dev_t where it
was always needed but never done.
The previous encoding moved the major number into the top 32 bits.
Almost no translation code handled this, so the major number was blindly
truncated away in most 32-bit encodings. E.g., for ffs, mknod(8) with
major = 1 and minor = 2 gave dev_t = 0x10000002; ffs cannot represent
this and blindly truncated it to 2. But if this mknod was run on any
released version of FreeBSD, it gives dev_t = 0x102. ffs can represent
this, but in the previous encoding it was not decoded, giving major = 0,
minor = 0x102.
The presence of bugs was most obvious for exporting dev_t's from an
old system to -current, since bugs in newnfs augment them. I fixed
oldnfs to support 32-bit dev_t in 1996 (r16634), but this regressed
to 16-bit dev_t in newnfs, first to the old 16-bit encoding and then
further in -current. E.g., old ad0 with major = 234, minor = 0x10002
had the correct (major, minor) number on the wire, but newnfs truncated
this to (234, 2) and then the previous encoding shifted the major
number into oblivion as seen by ffs or old applications.
I first tried to fix this by translating on every ABI/API boundary, but
there are too many boundaries and too many sloppy translations by blind
truncation. So use the old encoding for the low 32 bits so that sloppy
translations work no worse than before provided the high 32 bits are
not set. Add some error checking for when bits are lost. Keep not
doing any error checking for translations for almost everything in
compat/linux.
compat/freebsd32/freebsd32_misc.c:
Optionally check for losing bits after possibly-truncating assignments as
before.
compat/linux/linux_stats.c:
Depend on the representation being compatible with Linux's (or just with
itself for local use) and spell some of the translations as assignments in
a macro that hides the details.
fs/nfsclient/nfs_clcomsubs.c:
Essentially the same fix as in 1996, except there is now no possible
truncation in makedev() itself. Also fix nearby style bugs.
kern/vfs_syscalls.c:
As for freebsd32. Also update the sysctl description to include file
numbers, and change it to describe device ids as device numbers.
sys/types.h:
Use inline functions (wrapped by macros) since the expressions are now
a bit too complicated for plain macros. Describe the encoding and
some of the reasons for it. 16-bit compatibility didn't leave many
reasonable choices for the 32-bit encoding, and 32-bit compatibility
doesn't leave many reasonable choices for the 64-bit encoding. My
choice is to put the 8 new minor bits in the low 8 bits of the top 32
bits. This minimizes discontiguities.
Reviewed by: kib (except for rewrite of the comment in linux_stats.c)
of 64-bit dev_t's (but not ones involving dev_t's).
st_size was supposed to be clamped in cvtstat() and linux's copy_stat(),
but the clamping code wasn't aware that st_size is signed, and also had
an obfuscated off-by-1 value for the unsigned limit, so its effect was
to produce a bizarre negative size instead of clamping.
Change freebsd32's copy_ostat() to be no worse than cvtstat(). It was
missing clamping and bzero()ing of padding.
Reviewed by: kib (except a final fix of the clamp to the signed maximum)
- Add macros to allow preinitialization of cap_rights_t.
- Convert most commonly used code paths to use preinitialized cap_rights_t.
A 3.6% speedup in fstat was measured with this change.
Reported by: mjg
Reviewed by: oshogbo
Approved by: sbruno
MFC after: 1 month
Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text. To avoid license proliferation switch to
the standard 2-clause FreeBSD license for those files where I have
permission from each of the listed copyright holders. Additional files
waiting on permission from others are listed in review D14210.
Approved by: kan, marcel, sos, rdivacky
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
would be useless anyway - there is no point in pretending to have block
devices; our "block" devices are in fact character ones, and can only
be accessed as such.
Discussed with: dchagin
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Right now size of the structure is 472 bytes on amd64, which is
already large and stack allocations are indesirable. With the ino64
work, MNAMELEN is increased to 1024, which will make it impossible to have
struct statfs on the stack.
Extracted from: ino64 work by gleb
Discussed with: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
"highly experimental" remove /dev/shm magic commited
in r218497 and convert tmpfs type to an expected magic number.
Differential Revision: https://reviews.freebsd.org/D1497
Reviewed by: emaste, trasz
All fields of type l_int in struct statfs are defined
as l_long on i386 and amd64.
Differential Revision: https://reviews.freebsd.org/D1064
Reviewed by: trasz
have both kern_open() and kern_openat(); change the callers to use
kern_openat().
This removes one (sometimes two) levels of indirection and
consolidates arguments checks.
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Code should just use the devtoname() function to obtain the name of a
character device. Also add const keywords to pieces of code that need it
to build properly.
MFC after: 2 weeks
kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.
Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc
to construct the full pathname. It starts to search at the default
mountpoint which is /dev/shm. If this fails it runs through fstab
and searches for shmfs and tmpfs. Whatever it finds will be
statfs()'ed to be checked for Linux' fs magic for shmfs (0x01021994).
Ideally our tmpfs should deliver this fs magic to Linux processes, but
as our tmpfs is considered to be an experimental feature we can not
assume that there is always a tmpfs available.
To make shared memory work in the Linuxulator, force the fs type of
/dev/shm (which can be a symlink) to match what Linux expects. The user
is responsible (info has to be added to the linux base ports and the docs)
to setup a suitable link for /dev/shm.
Noticed by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
Submitted by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
MFC after: 1 month
A nice thing about POSIX 2008 is that it finally standardizes a way to
obtain file access/modification/change times in sub-second precision,
namely using struct timespec, which we already have for a very long
time. Unfortunately POSIX uses different names.
This commit adds compatibility macros, so existing code should still
build properly. Also change all source code in the kernel to work
without any of the compatibility macros. This makes it all a less
ambiguous.
I am also renaming st_birthtime to st_birthtim, even though it was a
local extension anyway. It seems Cygwin also has a st_birthtim.
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd