Add file de-duplication #29

@fcasson

Description

Some workflows upload, as input, files that were outputs of other workflows. This enables the parent/child detection feature to work.

However, identical objects are then stored more than once. This could be improved, a bit like forks on GitHub and GitLab, but a lot simpler since there is only a single server.
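To make the idea concrete, here is a minimal sketch of checksum-based de-duplication, assuming files are keyed by their SHA-256 digest so that identical content is stored only once. The names `DedupStore`, `file_checksum` and `demo` are hypothetical and are not part of the simdb API; the real server may organise storage and checksums differently.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class DedupStore:
    """Hypothetical content-addressed store: one blob per unique checksum."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        # Maps (simulation id, file name) -> checksum of the stored blob.
        self.index = {}

    def add(self, sim_id: str, source: Path) -> bool:
        """Register a file for a simulation; return True if bytes were copied."""
        checksum = file_checksum(source)
        blob = self.root / checksum
        copied = False
        if not blob.exists():
            # First time this content is seen: store it under its checksum.
            shutil.copyfile(source, blob)
            copied = True
        # Either way, the simulation only records a reference to the blob.
        self.index[(sim_id, source.name)] = checksum
        return copied

    def blob_count(self) -> int:
        """Number of distinct blobs physically stored."""
        return sum(1 for p in self.root.iterdir() if p.is_file())


def demo():
    """Upload the same file for a parent and a child simulation."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        output = tmp_path / "output.dat"
        output.write_bytes(b"result of parent workflow")
        store = DedupStore(tmp_path / "store")
        first = store.add("parent", output)   # content is new, so it is copied
        second = store.add("child", output)   # same content, only a reference
        return first, second, store.blob_count()
```

In this sketch the second upload adds an index entry but no new blob, so the parent and child simulations share one stored object. Deleting a simulation would then need reference counting (or a scan of the index) before removing a blob, which is not shown here.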

An attempt was made at this, but I don't think it was merged:

https://git.iter.org/projects/IMEX/repos/simdb/commits/19f391fdfeaea65356b1029f5b576a90925a5e06#[src/simdb/remote/api.py]

A similar approach could be used for IMAS data, but it might need to be based on IMAS checksums rather than file checksums.
