Skip to content

Conversation

@allisonkarlitskaya
Copy link
Contributor

This is going to need a lot of love. Feedback extremely welcome!

@allisonkarlitskaya
Copy link
Contributor Author

Some notes:

  • unfortunately you need to be real root in order to use this because of a (host) CAP_SYS_ADMIN check. Something related to visibility of files in lsof. It looks like they want to relax that restriction upstream
  • According to the commit message torvalds/linux@4435025

The FUSE server should call FUSE_DEV_IOC_BACKING_CLOSE ioctl to close the backing file by its id.

This can be done at any time, but if an open reply with FOPEN_PASSTHROUGH flag is still in progress, the open may fail if the backing file is closed before the fuse file was opened.

This leaves a little bit to be desired in terms of clarity about exactly when it's OK, so I figured that I'd document that you should close it at Filesystem::release() time. It might be possible to also punt it to a background task that does it after a second or two, but honestly, that sort of thing sounds racy (even with long durations, on a very heavily loaded system, who knows...)

  • the max stack thing is kinda brutal. You get a maximum of 2!! I'm running in a container so that's already my first layer gone, so I had to pick (the maximum value of) 2 for the example.
  • There's nothing preventing you from using a BackingID from one channel on another, which feels a bit like a safety bug (although it won't actually produce "UB" in the usual sense in userspace, nor in the kernel). I'm not sure that it's reasonably possible to prevent that statically. We could probably add an assert. It also seems kinda unrealistic to happen in practice: fuse filesystems seem to be a one-fs-per-process sort of deal.
  • the ABI stuff is hurting me. I'm going to need a lot of help with that. Should we fill in all of the missing flags and structure fields for all of the ABI versions that happened in the meantime, or is it OK to add those only as-needed?
  • the handling of the flags2 field in particular is real pain. This is, conceptually, a single u64 flags field now. I'd like to handle that by just declaring that flags are 64bit now and dealing with the splitting at the kernel level when we send the init reply. But: changing the types of the consts of the existing flags is an API break, right? How do we handle stuff like that?

@cberner
Copy link
Owner

cberner commented May 2, 2025

the ABI stuff is hurting me. I'm going to need a lot of help with that. Should we fill in all of the missing flags and structure fields for all of the ABI versions that happened in the meantime, or is it OK to add those only as-needed?

Yes, you'll need to add all the fields, or at least reserve space for them, so that fuser parses the right number of bytes

the handling of the flags2 field in particular is real pain. This is, conceptually, a single u64 flags field now. I'd like to handle that by just declaring that flags are 64bit now and dealing with the splitting at the kernel level when we send the init reply. But: changing the types of the consts of the existing flags is an API break, right? How do we handle stuff like that?

Breaking the API is fine, if it's necessary for supporting new FUSE features. That's why we're still on version zero :)

@rarensu
Copy link
Contributor

rarensu commented May 2, 2025

Breaking the API is fine, if it's necessary for supporting new FUSE features. That's why we're still on version zero :)

music to my ears

@allisonkarlitskaya
Copy link
Contributor Author

Some discussion here about the CAP_SYS_ADMIN thing... https://lore.kernel.org/linux-fsdevel/ec87f7f4-5c12-4e71-952c-861f67dc4603@bsbernd.com/T/#u

Obviously this doesn't block this PR in any way whatsoever... but it might be useful to have something to write in the doc comment...

Recent upstream kernel releases have added more than 32 bits worth of
flags for the init message by adding 'flags2' fields on the message
types.  In order to prepare for this change, let's first change the
types of our existing init flags to u64.

This is an API break, but most users won't notice, unless they were
explicitly specifying the type of the flags somewhere.  None of the
examples are impacted.

This change is limited to the high-level API that we expose to users.
At the kernel ABI level, we currently just convert u32 to u64 and back.
That will change in the following commits.
@allisonkarlitskaya allisonkarlitskaya marked this pull request as ready for review May 5, 2025 11:33
@allisonkarlitskaya
Copy link
Contributor Author

This should be vaguely reviewable now...

Kernel commit 53db28933e95 ("fuse: extend init flags") added a `flags2`
field to the 'in' and 'out' init messages.  Add support for those,
ourselves.

This introduces a feature flag for ABI version 7.36, but skips the ABI
versions between.
@rarensu
Copy link
Contributor

rarensu commented May 5, 2025

I read it, but I don't feel qualified to judge it.

I think the hash maps need a //comment to explain what is being mapped to what. (inode to fh? something else?). the type isn't giving enough information.

Copy link
Owner

@cberner cberner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a test for this feature. I haven't taken the time to create a nice way to test features like this, and mostly have relied on the xfstests and pjdfstests. But maybe a shell script like I used in this other project or a new integration test in tests/?

I had a few other comments, but overall it looks reasonable to me

src/lib.rs Outdated
///
/// The kernel currently has a hard maximum value of 2. Anything higher won't work.
#[cfg(feature = "abi-7-40")]
pub fn set_max_stack_depth(&mut self, value: u32) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return a Result like some of the other methods do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea. I'm not sure what I'd return here, though. We could check for >2 and return Err(2), of course, but we don't have any actual way of know that via kernel interaction, as far as I know....

That said, it is defined in a public header, <linux/fs.h>:

/*
 * Maximum number of layers of fs stack.  Needs to be limited to
 * prevent kernel stack overflow
 */
#define FILESYSTEM_MAX_STACK_DEPTH 2

so maybe we just ought to expose that and use it to do the check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my concern here is if it's ever raised we might be inappropriately enforcing a restriction that's no longer true...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erm, ya. Sorry. It's not in <linux/fs.h> but rather in linux/include/linux/fs.h, which is not installed. That's sort of exactly my problem with enforcing an actual limit here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote to fsdevel asking for clarification...

https://lore.kernel.org/linux-fsdevel/CAOYeF9V_n93OEF_uf0Gwtd=+da0ReX8N2aaT6RfEJ9DPvs8O2w@mail.gmail.com/T/#u

Let's see what they say.

@allisonkarlitskaya
Copy link
Contributor Author

btw: I'm kinda a rust newb. Can anyone help me with why cargo build --examples only builds some of the examples?

@rarensu
Copy link
Contributor

rarensu commented May 6, 2025

Do you have the correct --feature enabled? Some of the examples are feature depandant.

@cberner
Copy link
Owner

cberner commented May 6, 2025

btw: I'm kinda a rust newb. Can anyone help me with why cargo build --examples only builds some of the examples?

You need to pass cargo build --examples --features=<whatever ABI you need>

@allisonkarlitskaya
Copy link
Contributor Author

You need to pass cargo build --examples --features=

Seems unfortunate. It would be nice if it worked the other way around (each example pulled in its required feature set)...

It worked, though!

This largely corresponds to changes introduced in 7dc4e97a4f9a ("fuse:
introduce FUSE_PASSTHROUGH capability").

Add a new 'abi-7-40' feature flag and make some changes if it's set:

 - the `padding` field on `fuse_open_out` is renamed to `backing_id`

 - add a new `max_stack_depth field` to `fuse_init_out`

 - new `FUSE_PASSTHROUGH` init flag

 - new `FOPEN_PASSTHROUGH` `fuse_open_out` flag

Add some asserts to the existing reply APIs to ensure that the user
can't use the FUSE_PASSTHROUGH flag: we will soon introduce a version of
these APIs that operate with passthrough fds.

Add an API for setting max_stack_depth to KernelConfig as well.
Add a new BackingId wrapper API in a new file (passhthrough.rs) to
support creating and sending "Backing IDs" for passthrough fds.

Doing this requires performing ioctls: we enable the feature on the nix
crate if our ABI version is 7.40 or greater.

Closes cberner#297
This is copied from the same template as the other examples.  It
includes an example of how a filesystem might choose to cache and reuse
BackingIds.
Copy link
Owner

@cberner cberner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but see my comment on the passthrough test

@allisonkarlitskaya
Copy link
Contributor Author

allisonkarlitskaya commented May 6, 2025

thread 'main' panicked at examples/passthrough.rs:145:59:
called `Result::unwrap()` on an `Err` value: 137438953472
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

That's the line:

     config.add_capabilities(consts::FUSE_PASSTHROUGH).unwrap();

and 137438953472 is 0x2000000000, which is the value of consts::FUSE_PASSTHROUGH.

I guess that means that the kernel on the GitHub runner doesn't support passthrough. That's not surprising:

https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md says

Kernel Version: 6.8.0-1027-azure

The version on the 24.04 OS image:

Kernel Version: 6.11.0-1012-azure

and apparently passthrough went in around 6.9. Should we update the image for the workflow?

@allisonkarlitskaya
Copy link
Contributor Author

tests/test_passthrough.sh: 43: cannot open {mnt}/passthrough: No such file

Oops. Typo. Once more around :)

...and run it from CI via `make test_passthrough`.

We need to upgrade to ubuntu-24.04 in order to do this because we need a
kernel with passthrough support.
@cberner cberner merged commit 4b9e23a into cberner:master May 6, 2025
9 checks passed
@cberner
Copy link
Owner

cberner commented May 6, 2025

Merged. Thanks!

@rarensu
Copy link
Contributor

rarensu commented Jun 24, 2025

I need help adapting this feature to work with the return result paradigm (PR #345).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants