desync tar gives up on strangely ordered tar, but gives successful exit code #210

@dominics

Setup

I have two .tar files. Both were produced by GNU tar 1.34 and have similar contents (NodeJS node_modules directories). After some testing, the only difference I can find is the order in which the directories were passed to tar when the archives were created:

  • tar -c --hard-dereference -f works.tar ./node_modules ./a/b/node_modules
  • tar -c --hard-dereference -f breaks.tar ./a/b/node_modules ./node_modules

The two tar files end up the same size; the only difference between them is the order of their entries:

$ gtar -tf breaks.tar | head
./a/b/node_modules/
./a/b/node_modules/.bin/
./a/b/node_modules/.bin/gql-gen                 # these are symlinks to ../../node_modules/<path>
./a/b/node_modules/.bin/graphql-codegen
./a/b/node_modules/.bin/graphql-code-generator
./node_modules/
./node_modules/indexes-of/
./node_modules/indexes-of/.npmignore
./node_modules/indexes-of/test.js
./node_modules/indexes-of/LICENSE
$ gtar -tf works.tar | head
./node_modules/
./node_modules/indexes-of/
./node_modules/indexes-of/.npmignore
./node_modules/indexes-of/test.js
./node_modules/indexes-of/LICENSE
./node_modules/indexes-of/index.js
./node_modules/indexes-of/README.md
./node_modules/indexes-of/package.json
./node_modules/pako/
./node_modules/pako/LICENSE
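
For anyone trying to reproduce this, the claim that the archives differ only in entry order can be checked by comparing sorted and unsorted member listings. This is a minimal sketch, assuming a POSIX shell and a `tar` binary on the PATH; the function names `same_members` and `same_order` are hypothetical helpers, not part of tar or desync.

```shell
# Hypothetical helpers for comparing two tar archives by member listing.
# TAR can be overridden, e.g. TAR=gtar on systems where GNU tar has that name.
TAR=${TAR:-tar}

# same_members A B: succeed if both archives list the same member names,
# ignoring the order in which they appear.
same_members() {
  [ -r "$1" ] && [ -r "$2" ] || return 2
  [ "$("$TAR" -tf "$1" | LC_ALL=C sort)" = "$("$TAR" -tf "$2" | LC_ALL=C sort)" ]
}

# same_order A B: succeed if both archives list their members in the same order.
same_order() {
  [ -r "$1" ] && [ -r "$2" ] || return 2
  [ "$("$TAR" -tf "$1")" = "$("$TAR" -tf "$2")" ]
}
```

For the two archives above, `same_members works.tar breaks.tar` would be expected to succeed and `same_order works.tar breaks.tar` to fail. This compares names only, not file contents or metadata.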

Behavior

When given to desync tar, both files produce an exit code of zero, but breaks.tar is not actually processed:

$ time desync tar --config /tmp/config --verbose --tar-add-root --input-format tar --index --store /tmp/store breaks.caidx breaks.tar && echo "Works!"

real	0m0.090s
user	0m0.011s
sys	0m0.020s
Works!

But in the case of breaks.tar it's obvious that no work was done: the .caidx file is only 144 bytes long, the command finished almost instantly, and if the .tar file is piped in, desync doesn't even finish reading stdin (the writing side gets EPIPE).
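
Until the exit status reflects the failure, a wrapper script could at least flag an index that looks empty. The sketch below is a heuristic under stated assumptions: the 144-byte default is only the size observed in this report, not a documented constant, and `index_size` and `index_suspiciously_small` are hypothetical helper names.

```shell
# index_size FILE: print the size of FILE in bytes.
index_size() {
  wc -c < "$1" | tr -d ' '
}

# index_suspiciously_small FILE [LIMIT]: succeed when FILE is missing or is
# at most LIMIT bytes long. LIMIT defaults to 144, the size of the index
# produced for breaks.tar in this report (an assumption, not a desync constant).
index_suspiciously_small() {
  limit=${2:-144}
  if [ ! -f "$1" ]; then
    return 0
  fi
  [ "$(index_size "$1")" -le "$limit" ]
}

# Possible use after running desync tar (commented out so the block only
# defines functions):
#   if index_suspiciously_small breaks.caidx; then
#     echo "breaks.caidx looks empty; desync tar may have silently failed" >&2
#     exit 1
#   fi
```

A size check like this can misfire on a legitimately tiny archive, so it is a stopgap rather than a substitute for a proper error exit status.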

At first it seemed I might be able to work around this with GNU tar's --sort=name, but that's not enough: tar still follows the order of the paths given on the command line or in the file list, so I would also need to make sure that list is itself in a suitable order.

  1. Is there something I might be doing wrong?
  2. Does this seem like a reasonable restriction on the input tar format, or more like unexpected behavior?
  3. Is there some extra logging we might enable to make it easier to see errors like this? (Ideally there would be an error exit status, but more logs than I'm getting now would probably be sufficient.)

I'd be fine with just working around this myself (i.e. if the answer to question 1 is: "yes, it's a reasonable restriction, get a better .tar file!") if it weren't for the successful exit status on failure. That seemed concerning enough to open a bug for.
