Various patches motivated by https://github.com/bootc-dev/bootc/pull/2073 #19

Merged: cgwalters merged 3 commits into composefs:main from cgwalters:various-pax-gnu-bits on Mar 17, 2026

Conversation

@cgwalters
Collaborator

Motivated by bootc-dev/bootc#2073, where Go's archive/tar (used by
Docker/BuildKit) emits PAX path headers for non-ASCII filenames like
Főtanúsítvány.pem (valid UTF-8, but non-ASCII). PAX headers take
precedence over basic tar headers per POSIX, so code that remaps
paths by rewriting the basic header must also update or strip PAX
path/linkpath records.
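
To illustrate the "strip" option, here is a minimal std-only sketch that drops `path` and `linkpath` records from a PAX extended-header payload. It relies only on the POSIX record layout (`"<len> <key>=<value>\n"`, where `<len>` counts the whole record including its own digits). The function name `strip_path_records` is hypothetical and is not part of tar-core or bootc.

```rust
/// Copy every PAX record except `path` and `linkpath` into a new payload.
/// Records are copied byte for byte, so non-UTF-8 values survive untouched.
/// Returns `None` if the payload is malformed.
/// (Hypothetical helper for illustration; not a tar-core or bootc API.)
fn strip_path_records(mut data: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(data.len());
    while !data.is_empty() {
        // The decimal length field runs up to the first space.
        let space = data.iter().position(|&b| b == b' ')?;
        let len: usize = std::str::from_utf8(&data[..space]).ok()?.parse().ok()?;
        // A record needs at least "<len> =\n", must fit in the payload,
        // and must end with a newline.
        if len < space + 3 || len > data.len() || data[len - 1] != b'\n' {
            return None;
        }
        let (record, rest) = data.split_at(len);
        let body = &record[space + 1..len - 1];
        let eq = body.iter().position(|&b| b == b'=')?;
        let key = &body[..eq];
        if key != b"path" && key != b"linkpath" {
            out.extend_from_slice(record);
        }
        data = rest;
    }
    Some(out)
}
```

A caller that rewrites the payload this way would also need to update the size field of the 'x' header and re-pad the data to a 512-byte boundary, or drop the extension entry entirely if no records remain.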

tar-core already handles non-UTF-8 PAX path values correctly (raw
`&[u8]` throughout, matching Go archive/tar and Rust tar crate),
but this was untested. Add tests covering: parser acceptance of
non-UTF-8 PAX path bytes, lossy conversion, builder->parser roundtrip
with a >100 byte path (to actually trigger PAX emission), linkpath
preservation, and PaxExtension value_bytes() vs value() behavior.
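
For readers unfamiliar with the record format, the following std-only sketch shows what a roundtrip test of this kind has to produce: a PAX `path` record carrying raw bytes longer than the 100-byte ustar name field, plus a lossy conversion for display. The helper names `encode_pax_record` and `display_path` are hypothetical; this is not the tar-core builder, only an illustration of the POSIX record layout.

```rust
/// Encode one PAX record as `"<len> <key>=<value>\n"`.
/// `<len>` counts the whole record including its own digits, so it is
/// found by iterating until the value stops changing.
/// (Hypothetical helper for illustration; not the tar-core builder.)
fn encode_pax_record(key: &str, value: &[u8]) -> Vec<u8> {
    // Bytes other than the length digits: space, key, '=', value, newline.
    let base = key.len() + value.len() + 3;
    let mut len = base + 1;
    loop {
        let total = base + len.to_string().len();
        if total == len {
            break;
        }
        len = total;
    }
    let mut out = format!("{} {}=", len, key).into_bytes();
    out.extend_from_slice(value);
    out.push(b'\n');
    out
}

/// Render a raw path for display only. Invalid UTF-8 sequences are
/// replaced with U+FFFD, so the result must not be used to address a file.
fn display_path(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw).into_owned()
}
```

Keeping the value as `&[u8]` until the last moment means the lossy step is confined to diagnostics, while the bytes written to the archive stay exactly as supplied.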

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>

Test the PAX 'x' -> GNU 'L' -> real entry ordering, which is what
tar-rs's builder produces when you call append_pax_extensions() followed
by append_data() with a long path. This matters for ecosystem
compatibility -- bootc's copy_entry (bootc-dev/bootc#2073) generates
exactly this layout when filtering PAX extensions during path remapping.

The parser already handles this correctly via PendingMetadata
accumulation across recursive parse_header calls, but the reversed
ordering was untested. Also test that PAX path still wins over GNU
long name regardless of which comes first in the byte stream.
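
The precedence rule can be sketched with a toy model. The types below (`Pending`, `Entry`, `effective_paths`) are hypothetical stand-ins and do not reflect tar-core's actual `PendingMetadata` or `parse_header` internals; they only show why accumulating extension data until the real entry arrives makes the outcome independent of ordering.

```rust
/// Metadata collected from extension entries that precede a real entry.
/// (Hypothetical stand-in; not tar-core's `PendingMetadata`.)
#[derive(Default)]
struct Pending {
    pax_path: Option<Vec<u8>>,
    gnu_long_name: Option<Vec<u8>>,
}

/// Simplified view of the entries in a tar stream.
enum Entry<'a> {
    /// PAX 'x' extended header carrying a `path` record.
    PaxPath(&'a [u8]),
    /// GNU 'L' long-name entry.
    GnuLongName(&'a [u8]),
    /// A real entry, with the name taken from its basic header.
    Real(&'a [u8]),
}

/// Resolve the effective path of each real entry: the PAX path wins,
/// then the GNU long name, then the basic header name.
fn effective_paths(entries: &[Entry<'_>]) -> Vec<Vec<u8>> {
    let mut pending = Pending::default();
    let mut out = Vec::new();
    for entry in entries {
        match entry {
            Entry::PaxPath(p) => pending.pax_path = Some(p.to_vec()),
            Entry::GnuLongName(n) => pending.gnu_long_name = Some(n.to_vec()),
            Entry::Real(name) => {
                // Taking the state resets it for the next real entry.
                let Pending { pax_path, gnu_long_name } = std::mem::take(&mut pending);
                out.push(pax_path.or(gnu_long_name).unwrap_or_else(|| name.to_vec()));
            }
        }
    }
    out
}
```

Because the choice is made only when the real entry is reached, 'x' then 'L' and 'L' then 'x' resolve to the same path.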

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>

Previously, add_pax() accepted calls regardless of ExtensionMode but
finish() only emitted PAX records in Pax mode -- silently discarding
any PAX data added in Gnu mode. The doc comment incorrectly claimed
PAX extensions would be emitted regardless of mode.

Return HeaderError::IncompatibleMode instead, so callers get a clear
error rather than quietly losing xattrs or other PAX metadata.

This is a breaking API change: add_pax() now returns Result<&mut Self>
instead of &mut Self.
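
The shape of the change can be illustrated with a standalone toy builder. The names `add_pax`, `ExtensionMode`, and `HeaderError::IncompatibleMode` are taken from the description above, but the struct layout, parameter types, and `Builder::new` are assumptions made for this sketch and are not tar-core's real definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExtensionMode {
    Gnu,
    Pax,
}

#[derive(Debug, PartialEq, Eq)]
enum HeaderError {
    IncompatibleMode,
}

/// Toy builder; the fields and constructor are assumptions for illustration.
struct Builder {
    mode: ExtensionMode,
    pax: Vec<(String, Vec<u8>)>,
}

impl Builder {
    fn new(mode: ExtensionMode) -> Self {
        Builder { mode, pax: Vec::new() }
    }

    /// Reject PAX records up front when the mode means they would never
    /// be emitted, instead of accepting and silently dropping them.
    fn add_pax(&mut self, key: &str, value: &[u8]) -> Result<&mut Self, HeaderError> {
        if self.mode != ExtensionMode::Pax {
            return Err(HeaderError::IncompatibleMode);
        }
        self.pax.push((key.to_owned(), value.to_vec()));
        Ok(self)
    }
}
```

Under this sketch, call sites that previously chained `add_pax(..)` directly would need `?` or explicit error handling between calls, which is where the breaking change shows up for callers.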

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>

@cgwalters merged commit b3ff471 into composefs:main on Mar 17, 2026
5 checks passed