
WIP: Rough draft for updated generic OCI sealing #226

Draft
cgwalters wants to merge 34 commits into composefs:main from cgwalters:sealing-impl

Conversation

@cgwalters

This is just some rough draft raw material that builds on:

@cgwalters cgwalters force-pushed the sealing-impl branch 2 times, most recently from 1ce192a to 063ff54 Compare February 12, 2026 16:49
composefs_oci::signing::FsVeritySigningKey::from_pem(&cert_pem, &key_pem)?;

// Build subject descriptor from the source image's manifest
let manifest_json = img.manifest().to_string()?;

Hmm, we actually need to operate on the raw original representation; we can't rely on to_string() always giving us the same bytes.
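
To illustrate the concern, here is a minimal std-only sketch. `RawManifest` and `toy_digest` are hypothetical names, not part of composefs_oci, and FNV-1a stands in for SHA-256 only to avoid external crates. The point is that size and digest for the subject descriptor come from the bytes as stored, never from a re-serialization.

```rust
/// Stand-in digest (FNV-1a, 64-bit). A real implementation would use SHA-256;
/// this is an assumption made only to keep the sketch dependency-free.
fn toy_digest(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hypothetical holder that keeps the manifest exactly as it was received.
struct RawManifest {
    bytes: Vec<u8>,
}

impl RawManifest {
    fn new(bytes: &[u8]) -> Self {
        Self { bytes: bytes.to_vec() }
    }

    /// (size, digest) for a subject descriptor, taken from the original bytes.
    fn descriptor_fields(&self) -> (usize, u64) {
        (self.bytes.len(), toy_digest(&self.bytes))
    }
}
```

Two JSON documents that parse to the same value (for example with keys in a different order) hash differently here, which is exactly why a parse-then-`to_string()` round trip is not a safe input for the digest.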

/// Image reference (tag name)
image: String,
/// Path to the OCI layout directory (must already exist)
oci_layout_path: PathBuf,

I think we can use clap(value_parser) into an ocidir directly or so

/// the container to be mounted with integrity protection.
///
/// Returns a tuple of (sha256 content hash, fs-verity hash value) for the updated configuration.
pub fn seal<ObjectID: FsVerityHashValue>(

Might be cleaner if we do a prep commit that removes the old sealing as we know we're not going to do it anymore.

/// # Returns
///
/// The number of referrer artifacts exported.
pub fn export_referrers_to_oci_layout<ObjectID: FsVerityHashValue>(

Something like this could land as a prep commit

use std::fs;
use std::io::Write;

let blobs_dir = oci_layout_path.join("blobs").join("sha256");

Use ocidir

format!("{seed:02x}").repeat(32)
}

fn sample_subject() -> Descriptor {

Let's unify this stuff with shared infra to generate an ocidir with known content

@cgwalters cgwalters force-pushed the sealing-impl branch 3 times, most recently from 361eeb7 to 2f93e4a Compare March 6, 2026 12:24
@cgwalters

This one will need to logically depend on #225 because that one has a lot of hardening for the EROFS parser

Add a validated Algorithm type that wraps the fsverity-<hash>-<lg_blocksize>
string format (e.g. 'fsverity-sha512-12'). Implements FromStr for parsing
with proper error types and Display for serialization, so it can be used as
a clap value_parser argument. Includes for_hash::<H>() constructor to
derive from FsVerityHashValue types at compile time.
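
A rough sketch of what such a type could look like, using only std. The type and variant names (`HashKind`, `AlgorithmParseError`) are assumptions for illustration; the real type in the branch may differ, and `for_hash::<H>()` is omitted because it depends on `FsVerityHashValue`.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HashKind {
    Sha256,
    Sha512,
}

/// Parsed form of `fsverity-<hash>-<lg_blocksize>`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Algorithm {
    hash: HashKind,
    lg_blocksize: u8,
}

#[derive(Debug, PartialEq, Eq)]
enum AlgorithmParseError {
    MissingPrefix,
    MissingBlockSize,
    UnknownHash(String),
    InvalidBlockSize(String),
}

impl fmt::Display for AlgorithmParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingPrefix => write!(f, "expected 'fsverity-' prefix"),
            Self::MissingBlockSize => write!(f, "expected '-<lg_blocksize>' suffix"),
            Self::UnknownHash(h) => write!(f, "unknown hash '{h}'"),
            Self::InvalidBlockSize(b) => write!(f, "invalid lg_blocksize '{b}'"),
        }
    }
}

impl std::error::Error for AlgorithmParseError {}

impl FromStr for Algorithm {
    type Err = AlgorithmParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let rest = s
            .strip_prefix("fsverity-")
            .ok_or(AlgorithmParseError::MissingPrefix)?;
        let (hash, lg) = rest
            .split_once('-')
            .ok_or(AlgorithmParseError::MissingBlockSize)?;
        let hash = match hash {
            "sha256" => HashKind::Sha256,
            "sha512" => HashKind::Sha512,
            other => return Err(AlgorithmParseError::UnknownHash(other.to_string())),
        };
        let lg_blocksize = lg
            .parse::<u8>()
            .map_err(|_| AlgorithmParseError::InvalidBlockSize(lg.to_string()))?;
        Ok(Algorithm { hash, lg_blocksize })
    }
}

impl fmt::Display for Algorithm {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self.hash {
            HashKind::Sha256 => "sha256",
            HashKind::Sha512 => "sha512",
        };
        write!(f, "fsverity-{name}-{}", self.lg_blocksize)
    }
}
```

Because the error type implements `std::error::Error` and the parser goes through `FromStr`, a type shaped like this should slot into a clap `value_parser` without a custom adapter.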

Prep for repository metadata support.

Assisted-by: OpenCode (Claude Opus 4)
Prep for repository metadata (meta.json) serialization.

Assisted-by: OpenCode (Claude Opus 4)
Add a meta.json file to the repository format that records the digest
algorithm, format version, and feature flags, so tools can auto-detect
the configuration instead of requiring --hash on every invocation.

The versioning model is inspired by Linux filesystem superblocks
(ext4, XFS, EROFS): a base version integer for fundamental layout
changes, plus three tiers of feature flags for finer-grained
evolution:

  - compatible: old tools can safely ignore
  - read-only-compatible: old tools may read but must not write
  - incompatible: old tools must refuse the repository entirely

Because creating a repo is no longer just `mkdir`, add
'cfsctl init --algorithm=fsverity-sha512-12 [path]'.
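
The three-tier flag model can be sketched as a small decision function. This is only an illustration: `RepoFeatures`, `Access`, and `allowed_access` are hypothetical names, and the real meta.json schema is not shown in this PR description.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Access {
    ReadWrite,
    ReadOnly,
    Refuse,
}

/// Feature flags as they might be recorded in meta.json (assumed shape).
struct RepoFeatures<'a> {
    compatible: &'a [&'a str],
    ro_compatible: &'a [&'a str],
    incompatible: &'a [&'a str],
}

/// True if any flag in `flags` is not in the tool's `known` list.
fn has_unknown(flags: &[&str], known: &[&str]) -> bool {
    flags.iter().any(|f| !known.contains(f))
}

/// Decide what a tool that understands `known` flags may do with the repo.
fn allowed_access(repo: &RepoFeatures, known: &[&str]) -> Access {
    if has_unknown(repo.incompatible, known) {
        // Unknown incompatible flag: refuse the repository entirely.
        Access::Refuse
    } else if has_unknown(repo.ro_compatible, known) {
        // Unknown read-only-compatible flag: reading is fine, writing is not.
        Access::ReadOnly
    } else {
        // Unknown compatible flags (repo.compatible) are safe to ignore.
        let _ = repo.compatible;
        Access::ReadWrite
    }
}
```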

Closes: composefs#181

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Move Debug impls for format types and EROFS structures to the top
of the file (before ImageVisitor), extract hexdump helper, and
add missing_debug_implementations allows. Pure reorganization,
no functional changes.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Convert the assert_eq! in ImageVisitor::note() to return an error
instead of panicking when a corrupt image has the same offset visited
as two different segment types. Found by the debug_image fuzz target.
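
A minimal sketch of the pattern, assuming a visitor that tracks which segment type was recorded at each offset. The `Visitor`, `SegmentType`, and `SegmentConflict` names here are illustrative and do not mirror the actual ImageVisitor fields.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SegmentType {
    Inode,
    XAttr,
    DirectoryBlock,
}

#[derive(Debug, PartialEq, Eq)]
struct SegmentConflict {
    offset: u64,
    first: SegmentType,
    second: SegmentType,
}

#[derive(Default)]
struct Visitor {
    seen: BTreeMap<u64, SegmentType>,
}

impl Visitor {
    /// Records a visit. Returns Ok(true) if the offset was already visited
    /// with the same type, Ok(false) on first visit, and an error (instead of
    /// a panic) if the offset was previously seen as a different type.
    fn note(&mut self, offset: u64, kind: SegmentType) -> Result<bool, SegmentConflict> {
        match self.seen.get(&offset).copied() {
            Some(first) if first != kind => Err(SegmentConflict {
                offset,
                first,
                second: kind,
            }),
            Some(_) => Ok(true),
            None => {
                self.seen.insert(offset, kind);
                Ok(false)
            }
        }
    }
}
```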

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Fix arithmetic operations that could overflow, underflow, or cause
resource exhaustion when processing malformed EROFS images:

- Use checked_mul instead of unchecked << for block address
  calculations in debug.rs
- Use checked_add for block range end computation in reader.rs to
  prevent u64 overflow
- Use usize::BITS instead of hardcoded 64 for blkszbits validation
  (correct on 32-bit platforms)
- Use usize::try_from instead of 'as usize' casts for inode size,
  inode ID, and block ID to avoid silent truncation on 32-bit
- Cap Vec allocation against image length to prevent OOM from crafted
  size fields
- Use saturating_sub for debug display calculations
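
The checks listed above roughly correspond to helpers like the following. This is a std-only sketch; the function names and the `RangeError` type are made up for illustration and are not the ones in debug.rs or reader.rs.

```rust
#[derive(Debug, PartialEq, Eq)]
enum RangeError {
    Overflow,
    TooLargeForPlatform,
    BeyondImage,
}

/// Byte offset of a block, with overflow reported instead of wrapping.
fn block_offset(block_id: u64, blkszbits: u8) -> Result<u64, RangeError> {
    // Reject shift amounts that do not fit the platform word (usize::BITS,
    // not a hardcoded 64, so 32-bit targets are handled too).
    if u32::from(blkszbits) >= usize::BITS {
        return Err(RangeError::Overflow);
    }
    let block_size = 1u64
        .checked_shl(u32::from(blkszbits))
        .ok_or(RangeError::Overflow)?;
    block_id.checked_mul(block_size).ok_or(RangeError::Overflow)
}

/// End of a block range, guarding against u64 overflow.
fn block_range_end(start: u64, count: u64) -> Result<u64, RangeError> {
    start.checked_add(count).ok_or(RangeError::Overflow)
}

/// Converts an on-disk size to usize without silent truncation.
fn size_as_usize(size: u64) -> Result<usize, RangeError> {
    usize::try_from(size).map_err(|_| RangeError::TooLargeForPlatform)
}

/// Allocates a buffer only if the requested size is plausible for the image.
fn alloc_capped(requested: usize, image_len: usize) -> Result<Vec<u8>, RangeError> {
    if requested > image_len {
        return Err(RangeError::BeyondImage);
    }
    Ok(vec![0u8; requested])
}
```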

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Replace direct slice indexing with .get() where the bounds come from
image content: XAttr::suffix/value/padding, Inode::inline, and
debug_img's unassigned-region slicing. This prevents panics on
malformed images where field values are inconsistent with actual data
lengths.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
…pers

Change XAttr::suffix(), value(), and padding() to return
Result<&[u8], ErofsReaderError> instead of silently returning empty
slices on out-of-bounds access. This ensures corrupt xattr data
is properly reported rather than silently swallowed.
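
A sketch of the accessor shape described above, using `.get()` on a range and surfacing the failure. The `XAttr` layout here is simplified and hypothetical (just two length fields and a data slice), and `ReaderError` stands in for the real ErofsReaderError.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReaderError {
    OutOfBounds { start: usize, len: usize, available: usize },
}

/// Simplified xattr view: `data` holds suffix bytes followed by value bytes.
struct XAttr<'a> {
    name_len: usize,
    value_len: usize,
    data: &'a [u8],
}

impl<'a> XAttr<'a> {
    /// Bounds-checked subslice; lengths come from untrusted image content.
    fn slice(&self, start: usize, len: usize) -> Result<&'a [u8], ReaderError> {
        let err = ReaderError::OutOfBounds {
            start,
            len,
            available: self.data.len(),
        };
        let end = start.checked_add(len).ok_or(err)?;
        self.data.get(start..end).ok_or(err)
    }

    fn suffix(&self) -> Result<&'a [u8], ReaderError> {
        self.slice(0, self.name_len)
    }

    fn value(&self) -> Result<&'a [u8], ReaderError> {
        self.slice(self.name_len, self.value_len)
    }
}
```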

Also deduplicate is_whiteout() (moved to InodeHeader trait method)
and find_child_nid() (moved to Image method), and remove the
redundant entry_nid() test helper in favor of DirectoryEntry::nid().

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Add fuzz testing infrastructure under crates/composefs/fuzz/ with two
targets: read_image (exercises the full reader API surface including
inode traversal, xattr parsing, and object collection) and debug_image
(runs the debug_img dump on arbitrary input). Includes a seed corpus
generator that creates valid EROFS images exercising various code paths.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
…verflow

A crafted EROFS image with directory cycles can cause unbounded recursion
in populate_directory(), leading to a stack overflow. Add a depth parameter
and enforce a maximum of PATH_MAX / 2 (2048) levels, matching the
theoretical limit for valid filesystem paths.
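
The depth guard can be sketched on a toy directory table. Here `dirs[nid]` lists the child directory nids of directory `nid`; this data model and the `WalkError` type are assumptions for illustration, not the real populate_directory() signature.

```rust
const PATH_MAX: usize = 4096;
/// Each path component needs at least one character plus a separator.
const MAX_DIRECTORY_DEPTH: usize = PATH_MAX / 2;

#[derive(Debug, PartialEq, Eq)]
enum WalkError {
    DepthExceeded,
    BadNid(usize),
}

/// Walks the directory graph and returns the number of directories visited.
/// A cycle in a corrupt image hits the depth limit instead of overflowing
/// the stack.
fn populate_directory(dirs: &[Vec<usize>], nid: usize, depth: usize) -> Result<usize, WalkError> {
    if depth > MAX_DIRECTORY_DEPTH {
        return Err(WalkError::DepthExceeded);
    }
    let children = dirs.get(nid).ok_or(WalkError::BadNid(nid))?;
    let mut count = 1;
    for &child in children {
        count += populate_directory(dirs, child, depth + 1)?;
    }
    Ok(count)
}
```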

Found by cargo-fuzz.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
The cargo-fuzz targets found multiple panics within seconds of fuzzing.
Convert all remaining .unwrap() calls and assert!() macros in non-test
reader code to return Result, and propagate errors at all call sites.

Key changes:
- data_layout() returns Result instead of unwrapping TryInto
- XAttr::from_prefix(), xattrs(), shared(), local() return Result
- DirectoryBlock::n_entries/entries/get_entry_header return Result
- DirectoryEntries iterator yields Result<DirectoryEntry>
- XAttrIter yields Result<&XAttr>
- All callers in reader.rs, debug.rs, and fuzz targets updated

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
This got introduced in a CI refactoring and wasn't
intentional. Our fuzzing had far too short a timeout.
If a CI job is actually stuck we'll figure that out when
it happens.

Signed-off-by: Colin Walters <walters@verbum.org>
…ks()

The fuzzer found a crafted EROFS image where an ExtendedInodeHeader has
an enormous size field (~63 petabytes), causing blocks() to return a
range of ~15.5 trillion block IDs. Iterating this range caused a timeout.

Change our flow so that we pass the image (including its size)
when iterating blocks, so we can validate those.

Also add a default 1 GiB maximum image size in Image::open(), since
composefs images are metadata-only and should never approach that.
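
Roughly, validating the block range against the image could look like the sketch below. The `Image` struct here is a hypothetical stand-in that only tracks a length and block size, and the signature of `blocks()` is an assumption; only the 1 GiB default comes from the commit message.

```rust
use std::ops::Range;

/// Default cap from the commit message: composefs images are metadata-only.
const DEFAULT_MAX_IMAGE_SIZE: u64 = 1 << 30;

#[derive(Debug, PartialEq, Eq)]
enum ImageError {
    TooLarge,
    BlocksBeyondImage,
}

#[derive(Debug)]
struct Image {
    len: u64,
    blkszbits: u8,
}

impl Image {
    fn open(len: u64, blkszbits: u8) -> Result<Self, ImageError> {
        if len > DEFAULT_MAX_IMAGE_SIZE {
            return Err(ImageError::TooLarge);
        }
        Ok(Self { len, blkszbits })
    }

    /// Block IDs covering `size` bytes starting at `start_block`, rejected
    /// up front if the range would extend past the end of the image.
    fn blocks(&self, start_block: u64, size: u64) -> Result<Range<u64>, ImageError> {
        let block_size = 1u64
            .checked_shl(u32::from(self.blkszbits))
            .ok_or(ImageError::BlocksBeyondImage)?;
        let n_blocks = size.div_ceil(block_size);
        let end = start_block
            .checked_add(n_blocks)
            .ok_or(ImageError::BlocksBeyondImage)?;
        if end > self.len.div_ceil(block_size) {
            return Err(ImageError::BlocksBeyondImage);
        }
        Ok(start_block..end)
    }
}
```

With the check done before the range is returned, a crafted size field fails immediately rather than producing an iterator over trillions of block IDs.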

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
Composefs images are metadata-only EROFS images with well-known
structural constraints. Add an opt-in restriction mode that enforces:

- blkszbits must be 12 (4096-byte blocks)
- For non-ChunkBased inodes (directories, inline files, symlinks,
  devices), size must not exceed the image size, since their data
  is stored within the image itself.  ChunkBased (external) files
  are exempt because their size reflects the real file on the
  underlying filesystem.

The high-level collect_objects() and erofs_to_filesystem() APIs now
enable this by default.  Lower-level callers using Image::open()
directly can opt in via .restrict_to_composefs().
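
The two restrictions can be sketched as standalone checks. The `DataLayout` enum is a simplified assumption (the real reader has more variants), and the function names are illustrative rather than the actual `.restrict_to_composefs()` internals.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataLayout {
    FlatPlain,
    FlatInline,
    ChunkBased,
}

#[derive(Debug, PartialEq, Eq)]
enum RestrictError {
    BlockSizeBits(u8),
    InodeLargerThanImage { size: u64, image_len: u64 },
}

/// composefs images always use 4096-byte blocks (blkszbits == 12).
fn check_blkszbits(blkszbits: u8) -> Result<(), RestrictError> {
    if blkszbits == 12 {
        Ok(())
    } else {
        Err(RestrictError::BlockSizeBits(blkszbits))
    }
}

/// Inodes whose data lives inside the image cannot be larger than the image.
/// ChunkBased (external) files are exempt: their size describes the backing
/// file on the underlying filesystem.
fn check_inode_size(layout: DataLayout, size: u64, image_len: u64) -> Result<(), RestrictError> {
    match layout {
        DataLayout::ChunkBased => Ok(()),
        _ if size > image_len => Err(RestrictError::InodeLargerThanImage { size, image_len }),
        _ => Ok(()),
    }
}
```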

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
…nd metacopy checks

Validate composefs header magic and EROFS format version, superblock
magic, enforce the INLINE_CONTENT_MAX (64 byte) limit on inline regular
files, and reject malformed trusted.overlay.metacopy xattrs instead of
silently ignoring them.

The composefs header version field is validated but composefs_version is
not, since the C mkcomposefs writes version 0 while the Rust writer
uses version 2.

Previously, a malformed metacopy xattr would be silently ignored,
causing the file to be treated as inline rather than external. In
composefs-restricted mode this is now an error with a detailed
diagnostic message.

Cap the proptest inline file data strategy at INLINE_CONTENT_MAX to
match the composefs invariant that files > 64 bytes are external.

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
The dumpfile parser accepted inline content up to 5000 bytes, which is
far beyond any reasonable composefs inline file size. Reduce to 512
bytes as a safety bound while still allowing room for future increases
to the inline-vs-external threshold (see composefs#107 for discussion of
adjusting INLINE_CONTENT_MAX per hash algorithm).

Update the special.dump test data to use 63/64/256-byte inline files
instead of the previous 4095/4096/4097-byte entries that exceeded the
new limit.

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
…NTENT

Rename the writer's inline threshold to INLINE_CONTENT_MAX_V0 to make
it clear that changing this value is effectively a format break: it
determines which files get fs-verity checksums vs. stored inline, so
images from different thresholds aren't interchangeable. A future
composefs format version will need to encode this in the header.

Add MAX_INLINE_CONTENT (512 bytes) in lib.rs as the shared parsing
safety bound for untrusted input. Both the dumpfile parser and the
EROFS reader (in composefs-restricted mode) use this limit. It is
intentionally higher than V0 to allow for future threshold increases
per issue composefs#107.
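
The relationship between the two limits can be sketched as follows. The constant names and values are taken from the commit message; the `Storage` enum and helper functions are hypothetical.

```rust
/// Writer threshold: files up to this size are stored inline in the image.
/// Changing it is effectively a format break.
const INLINE_CONTENT_MAX_V0: usize = 64;

/// Parsing safety bound for untrusted input; deliberately higher than V0.
const MAX_INLINE_CONTENT: usize = 512;

#[derive(Debug, PartialEq, Eq)]
enum Storage {
    Inline,
    External,
}

/// What the writer does with a regular file of `len` bytes.
fn writer_storage(len: usize) -> Storage {
    if len <= INLINE_CONTENT_MAX_V0 {
        Storage::Inline
    } else {
        Storage::External
    }
}

/// Whether a parser should accept `len` bytes of inline content at all.
fn parser_accepts_inline(len: usize) -> bool {
    len <= MAX_INLINE_CONTENT
}
```

So a 256-byte inline file is something the current writer would never produce, but the parsers still accept it, leaving headroom for a future threshold increase.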

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
EROFS is a complex format supporting compression, metabox inodes,
and more. For composefs, however, we only use it as a metadata
format, and our custom writer is conservative in which features
it uses.

Add currently known EROFS feature_compat and feature_incompat flag
constants in format.rs. When we're in `restrict_to_composefs()` mode,
we filter these up front.

This should drastically cut down on the attack surface exposed
by malicious EROFS images when mounted directly by the Linux kernel.
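
An allow-list filter over the superblock flag words might look like this sketch. The bit values and constant names below are illustrative placeholders, not the constants added in format.rs, and `check_features` is a hypothetical helper.

```rust
// Placeholder bit assignments for illustration only.
const COMPAT_FLAG_A: u32 = 0x1;
const COMPAT_FLAG_B: u32 = 0x2;
const INCOMPAT_FLAG_USED_BY_WRITER: u32 = 0x4;

/// Flags the conservative composefs writer is assumed to emit.
const ALLOWED_COMPAT: u32 = COMPAT_FLAG_A | COMPAT_FLAG_B;
const ALLOWED_INCOMPAT: u32 = INCOMPAT_FLAG_USED_BY_WRITER;

#[derive(Debug, PartialEq, Eq)]
enum FeatureError {
    UnsupportedCompat(u32),
    UnsupportedIncompat(u32),
}

/// Rejects any image that sets a flag outside the allow-list, before any
/// further parsing takes place. The error carries the offending bits.
fn check_features(compat: u32, incompat: u32) -> Result<(), FeatureError> {
    let bad_incompat = incompat & !ALLOWED_INCOMPAT;
    if bad_incompat != 0 {
        return Err(FeatureError::UnsupportedIncompat(bad_incompat));
    }
    let bad_compat = compat & !ALLOWED_COMPAT;
    if bad_compat != 0 {
        return Err(FeatureError::UnsupportedCompat(bad_compat));
    }
    Ok(())
}
```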

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
Add cap-std and cap-tempfile as dev-dependencies to composefs and
composefs-oci for capability-scoped filesystem manipulation in tests.

Add TestRepo::path() for accessing the repository's filesystem path,
and TestRepo::dir() for getting a cap_std::fs::Dir handle scoped to
the repository root (preventing accidental path traversal in tests).

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Implement `cfsctl fsck` and `cfsctl oci fsck` commands for verifying
composefs repository integrity at multiple levels.

Repository-level fsck validates:
- Object fsverity digests match path-derived identifiers
- Stream and image symlinks resolve to existing objects
- Refs resolve through the full symlink chain
- Splitstream headers parse correctly and referenced objects exist
- EROFS images are structurally valid (composefs-restricted parsing)
  and all objects referenced via overlay.metacopy xattrs exist

OCI-level fsck additionally validates:
- Manifest and config content sha256 digests
- Layer diff_id references and stream entries
- Seal image existence for sealed images
- Artifact layer reference consistency

Both commands support `--json` for machine-readable output that always
exits 0 (only process-level failures cause non-zero exit). Without
`--json`, non-zero exit indicates corruption was found.
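
The exit-status rule can be written down as a tiny function. `FsckReport` is a hypothetical type, and using 1 as the specific non-zero code is an assumption; the commit message only says "non-zero".

```rust
/// Hypothetical fsck result: one entry per corruption finding.
struct FsckReport {
    errors: Vec<String>,
}

/// Exit status for a fsck run that completed (process-level failures are
/// handled elsewhere and always exit non-zero).
fn exit_code(report: &FsckReport, json: bool) -> i32 {
    if json || report.errors.is_empty() {
        // With --json the findings are in the output, so the exit is 0.
        0
    } else {
        // Assumed value; any non-zero code signals corruption.
        1
    }
}
```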

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
The old sealing approach stored an fsverity digest in OCI config labels
(containers.composefs.fsverity) and provided seal()/mount() functions to
write and consume it. This is being replaced by EROFS image refs stored
directly in config splitstreams, which integrates with the GC model and
avoids mutating the OCI config.

Remove the seal() and mount() library functions, the seal_digest() and
is_sealed() methods on OciImage, the "sealed" field from ImageInfo, and
the corresponding Seal/Mount CLI subcommands and SEALED table column.
Also remove the now-obsolete implementation plan doc.

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Describe the current OCI storage model: naming conventions for
manifest/config/layer/blob splitstreams, how tags map to refs under
streams/refs/oci/, the named_ref chains (manifest→config+layers,
config→layers), and how the GC walks from tags to objects.

Also notes the current gap: EROFS images derived from OCI content are
not referenced by any splitstream, so their lifecycle must be managed
separately.

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Add test utilities for creating multi-layer OCI images from composefs
dumpfile strings. This uses the real dumpfile format parsed by
dumpfile_to_filesystem(), then walks the resulting FileSystem tree to
emit tar bytes for import_layer().

Two convenience builders with versioned boot content:
- create_base_image: 5-layer busybox-like app image
- create_bootable_image(version): 20-layer bootable OS with kernel and UKI

v1 and v2 share userspace layers (busybox, libs, systemd, configs) but
differ in kernel version (6.1.0 vs 6.2.0), initramfs, modules, and UKI.
When both are pulled into the same repo the shared layers deduplicate,
exercising GC correctness with content referenced by multiple images.

Prep for adding boot image management API.

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Motivation: For bootc I want to store *both* EROFS images automatically
without needing to manually hold references.

More generally, for generic OCI container images we want a clean
model where a tag points to a manifest, which in turn references
everything else automatically.

With this change, when pulling an OCI container image we default
to generating the EROFS image and referencing it from the splitstream
for the config.

The next step is bootable images, where the config can be rewritten with
additional refs (e.g. "composefs.image.boot").

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Bootc needs both a plain EROFS image (for composefs mounts) and a
boot-transformed EROFS (with /boot emptied, SELinux labels applied).
This commit adds the bootable variant as a second named ref on the
config splitstream, using BOOT_IMAGE_REF_KEY ("composefs.image.boot")
alongside the existing IMAGE_REF_KEY ("composefs.image").

The same cascade rewrite pattern applies: adding a boot EROFS ref
rewrites config -> manifest -> tag, and GC keeps the boot EROFS
alive through the config ref chain.

CLI:
- cfsctl oci pull --bootable
- cfsctl oci mount --bootable

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
The biggest goal here is support for Linux kernel-native fsverity
signatures to be attached to layers, which enables integration with
IPE.

Add support for a fully separate OCI "composefs signature" artifact
which can be attached to an image.

Drop the -impl.md doc...it's not useful to try to write this
stuff in markdown. The spec has some implementation considerations,
but it's easier to look at implementation side from a code draft.

Add standardized-erofs-meta.md as a placeholder document outlining the
goal of standardizing composefs EROFS serialization across implementations
(canonical model: tar -> dumpfile -> EROFS).

Assisted-by: OpenCode (Claude Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
Implement end-to-end support for cryptographically signing composefs
OCI images using PKCS#7/fsverity detached signatures, stored as OCI
referrer artifacts following the 'composefs erofs-alongside' spec.

Core signing infrastructure (composefs crate):
- Add fsverity algorithm constants and ComposeFsAlgorithm type
- Add formatted_digest module for kernel-compatible fsverity digest
  construction (the 12-byte header + raw hash used by the kernel's
  FS_IOC_ENABLE_VERITY ioctl)
- Add kernel keyring support via composefs-ioctls keyring module
  (inject X.509 certs into .fs-verity keyring for kernel-level
  signature enforcement)
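
For reference, the 12-byte header matches the kernel's `struct fsverity_formatted_digest`: an 8-byte magic "FSVerity", a little-endian u16 algorithm number, and a little-endian u16 digest size, followed by the raw digest. A minimal sketch of building it is below; the function name is hypothetical and not necessarily what the formatted_digest module exposes.

```rust
/// Kernel UAPI hash algorithm numbers for fs-verity.
const FS_VERITY_HASH_ALG_SHA256: u16 = 1;
const FS_VERITY_HASH_ALG_SHA512: u16 = 2;

/// Builds the byte string that an fs-verity signature is computed over.
/// Returns None if the digest is too long to describe in a u16.
fn formatted_digest(alg: u16, digest: &[u8]) -> Option<Vec<u8>> {
    let size = u16::try_from(digest.len()).ok()?;
    let mut out = Vec::with_capacity(12 + digest.len());
    out.extend_from_slice(b"FSVerity"); // magic, 8 bytes
    out.extend_from_slice(&alg.to_le_bytes()); // digest_algorithm, __le16
    out.extend_from_slice(&size.to_le_bytes()); // digest_size, __le16
    out.extend_from_slice(digest); // raw fs-verity digest
    Some(out)
}
```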

OCI signing library (composefs-oci crate):
- signing.rs: FsVeritySigningKey (sign) and FsVeritySignatureVerifier
  (verify) using openssl PKCS#7 with DETACHED|BINARY|NOATTR flags,
  compatible with Linux kernel fsverity builtin signature verification
- signature.rs: OCI artifact manifest builder/parser for the
  'application/vnd.composefs.erofs-alongside.v1' artifact type,
  storing per-layer and merged EROFS images alongside their PKCS#7
  signatures as typed layers with composefs.* annotations
- image.rs: compute_per_layer_digests() and compute_merged_digest()
  for deterministic EROFS image generation from OCI layer stacks
- oci_image.rs: seal_image() to compute and embed the composefs
  fsverity digest into the OCI config, export/import to OCI layout
  directories (migrated to ocidir crate for atomic I/O), referrer
  index management

CLI commands (cfsctl):
- 'oci seal <image>' — compute composefs EROFS, embed fsverity digest
- 'oci sign <image> --cert --key' — create signature artifact
- 'oci verify <image> [--cert]' — verify signatures (digest-only
  without --cert, full PKCS#7 with --cert)
- 'oci mount <name> <mountpoint> [--require-signature --trust-cert]'
  — verify signatures before kernel mount
- 'oci pull ... --require-signature --trust-cert' — verify after pull
- 'oci push <image> <dest> [--signatures]' — export to OCI layout
- 'oci export-signatures <image> <dest>' — export just artifacts
- 'oci inspect' — show referrer info in JSON output
- 'keyring add-cert <pem>' — inject cert into kernel keyring

The mount and pull --require-signature paths share a common
verify_image_signatures() helper that recomputes expected EROFS
digests and verifies each PKCS#7 signature blob against the trusted
certificate.

The mount command now also resolves tag names (via OciImage::open_ref)
instead of requiring raw config digests, consistent with seal/sign/
verify.

Integration tests:
- signing.rs: 17 unprivileged tests covering sign, verify, wrong cert,
  export, seal+sign roundtrip, artifact structure, --require-signature
  on pull and mount
- privileged.rs: 7 tests for real fsverity enforcement, kernel keyring
  injection, kernel signature acceptance/rejection
- podman.rs: 3 tests building real container images via podman
- cli.rs: updated for richer OCI test layout (4 entries) and new
  oci push/roundtrip tests
- test-oci-sign-verify.sh: standalone shell-based integration tests

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Skopeo (containers-image-proxy) doesn't support the OCI Referrers API,
so when pulling a sealed+signed image, the composefs signature artifacts
are never fetched and --require-signature always fails.

Add a new optional 'oci-client' feature to composefs-oci that, when
enabled, queries the registry directly via the oci-client crate's
pull_referrers() API after the skopeo pull completes. Any composefs
erofs-alongside artifacts referencing the pulled manifest are fetched
and imported into the local repository.

Key design decisions:
- Gated behind an optional feature flag to avoid pulling in reqwest/
  rustls for users who don't need registry referrer support
- Non-fatal: referrer fetch failures are logged as warnings but don't
  fail the pull (the image is still usable without signatures)
- Only attempted for Transport::Registry (not local sources)
- Uses anonymous auth for now (auth integration deferred)
- Bridges oci-spec 0.9 (oci-client) to 0.8 (composefs) via JSON
  serde round-tripping since the schema is identical

Assisted-by: OpenCode (Claude Opus 4)
…ld to ImageInfo

These were part of the sealing-impl branch but were lost during
conflict resolution of PR composefs#263 rebase. The oci mount subcommand
needs signature verification flags for the sealed app container
workflow, and ImageInfo needs the sealed field for oci images --json.

Assisted-by: OpenCode (Claude Opus 4)
When export_image_to_oci_layout() reconstitutes layers as uncompressed
tars, the new manifest has a different digest than the original. But
the referrer artifacts' subject.digest still pointed to the original
manifest digest, so tools like 'oras cp -r' couldn't discover the
referrer relationship and wouldn't copy signature artifacts.

Fix by capturing the new manifest digest from insert_manifest() and
passing it to export_referrers_to_oci_layout() which rewrites the
subject descriptor in each artifact manifest before serialization.

Assisted-by: OpenCode (Claude Opus 4)
When the OCI Referrers API fails (GHCR returns 303/404), fall back to
the OCI 1.1 referrers tag scheme where referrer artifacts are stored
under a tag named 'sha256-<hex>' in the same repository. This tag
contains an OCI Image Index listing all referrer manifests.
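
Deriving the fallback tag from a manifest digest is a simple string transform. The helper below is a hypothetical sketch, not the function used in the oci-client integration.

```rust
/// Maps a digest like "sha256:<hex>" to the fallback tag "sha256-<hex>".
/// Returns None for input that does not look like an `<alg>:<hex>` digest.
fn referrers_tag(digest: &str) -> Option<String> {
    let (alg, hex) = digest.split_once(':')?;
    if alg.is_empty() || hex.is_empty() {
        return None;
    }
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!("{alg}-{hex}"))
}
```

The resulting tag is then fetched like any other tag in the same repository and parsed as an OCI Image Index.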

This is necessary because GHCR doesn't properly implement the Referrers
API, but oras cp -r stores referrers using the tag scheme fallback when
pushing to such registries.

Assisted-by: OpenCode (Claude Opus 4)
…gistry one

After pulling an image, ensure_oci_composefs_erofs rewrites the
config+manifest (adding EROFS refs), producing a new manifest digest
that differs from the original registry digest. The referrer fetch
was registering referrers against the registry digest, but
list_referrers/verify looks them up by the local (rewritten) digest.

Fix by passing both digests to fetch_and_import_referrers: the registry
digest for the Referrers API query, and the local digest for
add_referrer registration.

Assisted-by: OpenCode (Claude Opus 4)
Build cfsctl in release mode with pre-6.15 and oci-client features,
then push the binary to GHCR as an OCI artifact tagged by branch name
and short SHA. This lets the sealed demo Containerfile fetch a
pre-built binary via oras instead of building from source.

Assisted-by: OpenCode (Claude Opus 4)
CentOS Stream 10's kernel 6.12 doesn't support direct file-backed
EROFS mounts (added post-6.12 in mainline). The rhel9 feature enables
loopback device creation for EROFS images, which is needed for the
composefs overlay mount to work.

Assisted-by: OpenCode (Claude Opus 4)