* If during FUSE_CREATE, FUSE_MKDIR, etc the server returns the same
inode number for the new file as for its parent directory, reject it.
Previously this would triggers a recurse-on-non-recursive lock panic.
* If during FUSE_LINK the server returns a different inode number for
the new name as for the old one, reject it. Obviously, that can't be
a hard link.
* If during FUSE_LOOKUP the server returns the same inode number for the
new file as for its parent directory, reject it. Nothing good can
come of this.
PR: 263662
Reported by: Robert Morris <rtm@lcs.mit.edu>
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D35128
(cherry picked from commit 0bef4927ea)
In an error path, a dtrace probe could access an undefined variable.
Reported by: Coverity (CID 1471986)
Sponsored by: Axcient
(cherry picked from commit dcfa054216)
The daemon can specify fsname=XXX in its mount options. If so, the file
system should report f_mntfromname as XXX during statfs. This will show
up in the output of commands like mount and df.
Submitted by: Ali Abdallah <ali.abdallah@suse.com>
Differential Revision: https://reviews.freebsd.org/D35090
(cherry picked from commit 2f6362484c)
Prior to fuse protocol version 7.9, the fuse_entry_out structure had a
smaller size. But fuse_vnop_create did not take that into account when
working with servers that use older protocols. The bug does not matter
for servers which don't use file handles or open flags (the only fields
affected).
PR: 263625
Submitted by: Ali Abdallah <ali.abdallah@suse.com>
(cherry picked from commit 45825a12f9)
During a FUSE_WRITE, the kernel requests the server to write a certain
amount of data, and the server responds with the amount that it actually
did write. It is obviously an error for the server to write more than
it was provided, and we always treated it as such, but there were two
problems:
* If the server responded with a huge amount, greater than INT_MAX, it
would trigger an integer overflow which would cause a panic.
* When extending the file, we wrongly set the file's size before
validing the amount written.
PR: 263263
Reported by: Robert Morris <rtm@lcs.mit.edu>
Sponsored by: Axcient
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D34955
(cherry picked from commit 3a1b3c6a1e)
Formerly fusefs would pass up the stack any error value returned by the
fuse server. However, some values aren't valid for userland, but have
special meanings within the kernel. One of these, EJUSTRETURN, could
cause a kernel page fault if the server returned it in response to
FUSE_LOOKUP. Fix by validating all errors returned by the server.
Also, fix a data lifetime bug in the FUSE_DESTROY test.
PR: 263220
Reported by: Robert Morris <rtm@lcs.mit.edu>
Sponsored by: Axcient
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D34931
(cherry picked from commit 155ac516c6)
* We never send FUSE_LOOKUP for the root inode, since its inode number
is hard-coded to 1. Therefore, we should not send FUSE_FORGET for it,
lest the server see its lookup count fall below 0.
* During VOP_RECLAIM, if we are reclaiming the root inode, we must clear
the file system's vroot pointer. Otherwise it will be left pointing
at a reclaimed vnode, which will cause future VOP_LOOKUP operations to
fail. Previously we only cleared that pointer during VFS_UMOUNT. I
don't know of any real-world way to trigger this bug.
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D34753
(cherry picked from commit 3227325366)
When renaming a directory into a different parent directory, invalidate
the cached attributes of the new parent. Otherwise, stat will show the
wrong st_nlink value.
Reviewed by: ngie
Differential Revision: https://reviews.freebsd.org/D34336
(cherry picked from commit e8553be9bc)
MFC 3d721de049 ("Fix NFS exports of FUSE file systems for big
directories") missed a case of a uint64_t from HEAD that should be a
u_long in 13 due to KPI differences. Specifically, HEAD has b214fcceac
("Change VOP_READDIR's cookies argument to a **uint64_t"), but stable/13
does not.
This is a direct commit to stable/13.
The FUSE protocol does not require that a directory entry's d_off field
outlive the lifetime of its directory's file handle. Since the NFS
server must reopen the directory on every VOP_READDIR call, that means
it can't pass uio->uio_offset down to the FUSE server. Instead, it must
read the directory from 0 each time. It may need to issue multiple
FUSE_READDIR operations until it finds the d_off field that it's looking
for. That was the intention behind SVN r348209 and r297887, but a logic
bug prevented subsequent FUSE_READDIR operations from ever being issued,
rendering large directories incompletely browseable.
Reviewed by: rmacklem
(cherry picked from commit d088dc76e1)
fusefs: optimize NFS readdir for FUSE_NO_OPENDIR_SUPPORT
In its lowest common denominator, FUSE does not require that a directory
entry's d_off field is valid outside of the lifetime of the directory's
FUSE file handle. But since NFS is stateless, it must reopen the
directory on every call to VOP_READDIR. That means reading the
directory all the way from the first entry. Not only does this create
an O(n^2) condition for large directories, but it can also result in
incorrect behavior if either:
* The file system _does_ change the d_off field for the last directory
entry previously seen by NFS, or
* The file system deletes the last directory entry previously seen by
NFS.
Handily, for file systems that set FUSE_NO_OPENDIR_SUPPORT d_off is
guaranteed to be valid for the lifetime of the directory entry, there is
no need to read the directory from the start.
Reviewed by: rmacklem
(cherry picked from commit 4a6526d84a)
fusefs: require FUSE_NO_OPENDIR_SUPPORT for NFS exporting
FUSE file systems that do not set FUSE_NO_OPENDIR_SUPPORT do not
guarantee that d_off will be valid after closing and reopening a
directory. That conflicts with NFS's statelessness, that results in
unresolvable bugs when NFS reads large directories, if:
* The file system _does_ change the d_off field for the last directory
entry previously returned by VOP_READDIR, or
* The file system deletes the last directory entry previously seen by
NFS.
Rather than doing a poor job of exporting such file systems, it's better
just to refuse.
Even though this is technically a breaking change, 13.0-RELEASE's
NFS-FUSE support was bad enough that an MFC should be allowed.
Reviewed by: rmacklem
Differential Revision: https://reviews.freebsd.org/D33726
(cherry picked from commit 00134a0789)
fusefs: fix the build without INVARIANTS after 00134a0789
MFC with: 00134a0789
Reported by: se
(cherry picked from commit 18ed2ce77a)
Now posix_fallocate will be correctly forwarded to fuse file system
servers, for those that support it.
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33389
(cherry picked from commit 398c88c758)
In an earlier version of the revision that created that sysctl (D20519)
the sysctl was gated by INVARIANTS, so the test had to check for it.
But in the committed version it is always available.
(cherry picked from commit 19ab361045)
fusefs: move common code from forget.cc to utils.cc
(cherry picked from commit 8d99a6b91b)
fusefs: fix .. lookups when the parent has been reclaimed.
By default, FUSE file systems are assumed not to support lookups for "."
and "..". They must opt-in to that. To cope with this limitation, the
fusefs kernel module caches every fuse vnode's parent's inode number,
and uses that during VOP_LOOKUP for "..". But if the parent's vnode has
been reclaimed that won't be possible. Previously we paniced in this
situation. Now, we'll return ESTALE instead. Or, if the file system
has opted into ".." lookups, we'll just do that instead.
This commit also fixes VOP_LOOKUP to respect the cache timeout for ".."
lookups, if the FUSE file system specified a finite timeout.
PR: 259974
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33239
(cherry picked from commit 1613087a81)
If FUSE_COPY_FILE_RANGE returns successfully, update the atime of the
source and the mtime and ctime of the destination.
Reviewers: pfg
Differential Revision: https://reviews.freebsd.org/D33159
(cherry picked from commit 5169832c96)
VOPs like VOP_SETATTR can change a file's size, with the vnode
exclusively locked. But VOPs like VOP_LOOKUP look up the file size from
the server without the vnode locked. So a race is possible. For
example:
1) One thread calls VOP_SETATTR to truncate a file. It locks the vnode
and sends FUSE_SETATTR to the server.
2) A second thread calls VOP_LOOKUP and fetches the file's attributes from
the server. Then it blocks trying to acquire the vnode lock.
3) FUSE_SETATTR returns and the first thread releases the vnode lock.
4) The second thread acquires the vnode lock and caches the file's
attributes, which are now out-of-date.
Fix this race by recording a timestamp in the vnode of the last time
that its filesize was modified. Check that timestamp during VOP_LOOKUP
and VFS_VGET. If it's newer than the time at which FUSE_LOOKUP was
issued to the server, ignore the attributes returned by FUSE_LOOKUP.
PR: 259071
Reported by: Agata <chogata@moosefs.pro>
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33158
(cherry picked from commit 13d593a5b0)
FUSE_COPY_FILE_RANGE instructs the server to write data to a file.
fusefs must invalidate any cached data within the written range.
PR: 260242
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33280
(cherry picked from commit 41ae9f9e64)
This function was always confusing, because it created an H-shaped
callgraph: two functions called in and left via different paths based on
which which called.
(cherry picked from commit dc433e1530)
Correctly handle the situation where a FUSE server unlinks a file, then
creates a new file of a different type but with the same inode number.
Previously fuse_vnop_lookup in this situation would return EAGAIN. But
since it didn't call vgone(), the vnode couldn't be reused right away.
Fix this by immediately calling vgone() and reallocating a new vnode.
This problem can occur in three code paths, during VOP_LOOKUP,
VOP_SETATTR, or following FUSE_GETATTR, which usually happens during
VOP_GETATTR but can occur during other vops, too. Note that the correct
response actually doesn't depend on whether the entry cache has expired.
In fact, during VOP_LOOKUP, we can't even tell. Either it has expired
already, or else the vnode got reclaimed by vnlru.
Also, correct the error code during the VOP_SETATTR path.
PR: 258022
Reported by: chogata@moosefs.pro
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33283
(cherry picked from commit 25927e068f)
With various firmware files used by graphics and wireless drivers
we are exceeding the current 32 character module name (file path
in kldxref) length.
In order to overcome this issue bump it to the maximum path length
for the next version.
To be able to MFC provide backward compat support for another version
of the struct as the offsets for the second half change due to the
array size increase.
MAXMODNAME being defined to MAXPATHLEN needs param.h to be
included first. With only 7 modules (or LinuxKPI module.h) not
doing that adjust them rather than including param.h in module.h [1].
Reported by: Greg V (greg unrelenting.technology)
Sponsored by: The FreeBSD Foundation
Suggested by: imp [1]
Reviewed by: imp (and others to different level)
Differential Revision: https://reviews.freebsd.org/D32383
(cherry picked from commit df38ada293)
When using cached attributes, whether or not the data cache is enabled,
fusefs must update a file's atime whenever it reads from it, so long as
it wasn't mounted with -o noatime. Update it in-kernel, and flush it to
the server on close or during the next setattr operation.
The downside is that close() will now frequently trigger a FUSE_SETATTR
upcall. But if you care about performance, you should be using
-o noatime anyway.
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D33145
(cherry picked from commit 91972cfcdd)
fusefs: fix 32-bit build of the tests after 91972cfcdd
(cherry picked from commit d109559ddb)
When copy_file_range extends a file, it must update the cached file
size.
Reviewed by: rmacklem, pfg
Differential Revision: https://reviews.freebsd.org/D33151
(cherry picked from commit 65d70b3bae)
If the FUSE server tells the kernel that a file's size has changed, then
the kernel must invalidate any portion of that file in cache. But the
kernel can't do that during VOP_STRATEGY, because the file's buffers are
already locked. Instead, proceed with the write.
PR: 256937
Reported by: Agata <chogata@moosefs.pro>
Tested by: Agata <chogata@moosefs.pro>
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D32332
(cherry picked from commit 032a5bd55b)
fuse_vnop_bmap needs to know the file's size in order to calculate the
optimum amount of readahead. If the file's size is unknown, it must ask
the FUSE server. But if the file's data was previously cached and the
server reports that its size has shrunk, fusefs must invalidate the
cached data. That's not possible during VOP_BMAP because the buffer
object is already locked.
Fix the panic by not querying the FUSE server for the file's size during
VOP_BMAP if we don't need it. That's also a a slight performance
optimization.
PR: 256937
Reported by: Agata <chogata@moosefs.pro>
Tested by: Agata <chogata@moosefs.pro>
(cherry picked from commit 7430017b99)
If the FUSE server does something that would make our cache incoherent,
we should print a warning to the user. However, we previously warned in
some situations when we shouldn't, such as if the file's size changed on
the server _after_ our own attribute cache had expired. This change
suppresses the warning in cases like that. It also moves the warning
logic to a single place within the code.
PR: 256936
Reported by: Agata <chogata@moosefs.pro>
Tested by: Agata <chogata@moosefs.pro>, jSML4ThWwBID69YC@protonmail.com
(cherry picked from commit 5d94aaacb5)
For file systems that allow it, fusefs will skip FUSE_OPEN,
FUSE_RELEASE, FUSE_OPENDIR, and FUSE_RELEASEDIR operations, a minor
optimization.
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D32141
(cherry picked from commit 7124d2bc3a)
Synchronize formatting and documentation in fuse_kernel.h with upstream
sources.
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D32141
(cherry picked from commit a3a1ce3794)
During VOP_GETPAGES, fusefs needs to determine the file's length, which
could require a FUSE_GETATTR operation. If that fails, it's better to
SIGBUS than panic.
Sponsored by: Axcient
Reviewed by: markj, kib
Differential Revision: https://reviews.freebsd.org/D31994
(cherry picked from commit 4f917847c9)
During FUSE_SETLK, the owner field should uniquely identify the calling
process. The fusefs module now sets it to the process's pid.
Previously, it expected the calling process to set it directly, which
was wrong.
libfuse also apparently expects the owner field to be set during
FUSE_GETLK, though I'm not sure why.
PR: 256005
Reported by: Agata <chogata@moosefs.pro>
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D30622
(cherry picked from commit 18b19f8c6e)
Every FUSE operation has a unique value in its header. As the name
implies, these values are supposed to be unique among all outstanding
operations. And since FUSE_INTERRUPT is asynchronous and racy, it is
desirable that the unique values be unique among all operations that are
"close in time".
Ensure that they are actually unique by incrementing them whenever we
reuse a fuse_dispatcher object, for example during fsync, write, and
listextattr.
PR: 244686
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D30810
(cherry picked from commit 5403f2c163)
/dev/fuse is always ready for writing, so it's kind of dumb to poll it.
But some applications do it anyway. Better to return ready than EINVAL.
Reviewed by: emaste, pfg
Differential Revision: https://reviews.freebsd.org/D30784
(cherry picked from commit 7b8622fa22)
Simplify fuse_device_filt_write
It always returns 1, so why bother having a variable.
Pull Request: https://github.com/freebsd/freebsd-src/pull/478
(cherry picked from commit 9b876fbd50)
The fusefs driver will print warning messages about FUSE servers that
commit protocol violations. Previously it would print those warnings on
every violation, but that could spam the console. Now it will print
each warning no more than once per lifetime of the mount. There is also
now a dtrace probe for each violation.
Sponsored by: Axcient
Reviewed by: emaste, pfg
Differential Revision: https://reviews.freebsd.org/D30780
(cherry picked from commit 0b9a5c6fa1)
kevans actually caught this in the original review and I fixed it, but
then I committed an older copy of the branch. Whoops.
Reported by: kevans
Differential Revision: https://reviews.freebsd.org/D29031
(cherry picked from commit 9c5aac8f2e)
1) F_SETLKW (blocking) operations would be sent to the FUSE server as
F_SETLK (non-blocking).
2) Release operations, F_SETLK with lk_type = F_UNLCK, would simply
return EINVAL.
PR: 253500
Reported by: John Millikin <jmillikin@gmail.com>
(cherry picked from commit 929acdb19a)
This allows d_off to be used with lseek to position the file so that
getdirentries(2) will return the next entry. It is not used by
readdir(3).
PR: 253411
Reported by: John Millikin <jmillikin@gmail.com>
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D28605
(cherry picked from commit 71befc3506)
Must lock the vnode before accessing the fufh table. Also, check for
invalid parameters earlier. Bug introduced by r346170.
MFC after: 2 weeks
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27936
This updates the FUSE protocol to 7.28, though most of the new features
are optional and are not yet implemented.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27818
FUSE_LSEEK reports holes on fuse file systems, and is used for example
by bsdtar.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27804
The original fusefs GSoC project seems to have envisioned exchanging two
types of messages with FUSE servers. Perhaps vectored and non-vectored?
But in practice only one type has ever been used. Delete the other type.
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27770
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
No functional change intended.
Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).
__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.
Reviewed by: kib
Discussed with: markj, mjg
Differential Revision: https://reviews.freebsd.org/D27037
from a Linux binary. Should come handy for AppImages.
Reviewed by: asomers
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26959