Limit size of archives put to HSI #2

@kc9jud

According to the NERSC HPSS documentation (https://docs.nersc.gov/filesystems/archive/#avoid-very-large-files), files larger than 2 TB are handled inefficiently when put to HPSS. The documentation recommends breaking files that exceed this limit into 500 GB chunks.

The hsi handler mcscript.task.archive_handler_hsi() should:

  • Inspect the size of an archive file to put.
  • Put the file directly using hsi if it is smaller than the threshold, or
  • use split to break the file into smaller segments and put each segment with hsi.
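
As a rough illustration of the steps above, here is a minimal sketch, not the actual mcscript implementation. Everything named here is hypothetical: put_archive(), split_file(), hsi_put(), the two size constants, and the ".partNNN" segment suffix. The splitting is done in pure Python as a stand-in for the split utility so that the example is self-contained, and the hsi call assumes the "put local : remote" form is accepted as separate command-line arguments.

```python
import os
import subprocess

# Assumed values taken from the figures quoted in this issue (decimal units).
SPLIT_THRESHOLD = 2 * 1000**4   # 2 TB: files at or above this size are split
SEGMENT_SIZE = 500 * 1000**3    # 500 GB: maximum size of each segment


def split_file(path, segment_size, buffer_size=64 * 1024**2):
    """Split path into path.part000, path.part001, ... and return the segment paths.

    Pure-Python stand-in for the split utility; copies in bounded-size reads.
    """
    segment_paths = []
    index = 0
    with open(path, "rb") as source:
        while True:
            data = source.read(min(buffer_size, segment_size))
            if not data:
                break  # end of input: no further segments
            segment_path = f"{path}.part{index:03d}"  # hypothetical naming scheme
            written = 0
            with open(segment_path, "wb") as segment:
                while data:
                    segment.write(data)
                    written += len(data)
                    remaining = segment_size - written
                    if remaining == 0:
                        break  # this segment is full
                    data = source.read(min(buffer_size, remaining))
            segment_paths.append(segment_path)
            index += 1
    return segment_paths


def hsi_put(local_path, hpss_path):
    """Put one file with hsi (assumes the 'put local : remote' command form)."""
    subprocess.run(["hsi", "put", local_path, ":", hpss_path], check=True)


def put_archive(local_path, hpss_path, put=hsi_put,
                threshold=SPLIT_THRESHOLD, segment_size=SEGMENT_SIZE):
    """Put an archive directly if it is small, otherwise as numbered segments.

    Returns the list of remote paths that were written.
    """
    if os.path.getsize(local_path) < threshold:
        put(local_path, hpss_path)
        return [hpss_path]
    remote_paths = []
    for segment_path in split_file(local_path, segment_size):
        # Carry the local ".partNNN" suffix over to the remote name.
        remote_path = hpss_path + segment_path[len(local_path):]
        put(segment_path, remote_path)
        os.remove(segment_path)  # segment is no longer needed locally
        remote_paths.append(remote_path)
    return remote_paths
```

The put argument is injectable so the logic can be exercised without hsi. Note that this sketch writes all segments to disk before uploading any of them, so it temporarily needs free space equal to the archive size; a real implementation might split and upload one segment at a time.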

In addition (so that consumers of archives don't need to be aware of this splitting behavior):

  • mcscript should also provide a wrapper function to fetch and reassemble archives.
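
One possible shape for such a wrapper is sketched below, under stated assumptions rather than as the mcscript API. fetch_archive() and hsi_get() are hypothetical names, segments are assumed to be stored as "<archive>.part000", "<archive>.part001", and so on, and hsi is assumed to exit with a nonzero status when the requested file does not exist.

```python
import os
import shutil
import subprocess


def hsi_get(local_path, hpss_path):
    """Fetch one file with hsi; return True on success.

    Assumes the 'get local : remote' command form and a nonzero exit
    status when the remote file is missing.
    """
    result = subprocess.run(["hsi", "get", local_path, ":", hpss_path])
    return result.returncode == 0


def fetch_archive(hpss_path, local_path, get=hsi_get):
    """Fetch an archive, reassembling it from segments if it was split.

    Returns 0 if the archive was stored whole, otherwise the number of
    segments that were concatenated into local_path.
    """
    # Case 1: the archive was small enough to be stored as a single file.
    if get(local_path, hpss_path):
        return 0

    # Case 2: fetch numbered segments in order until one is missing,
    # appending each to the output and deleting it to limit disk usage.
    count = 0
    with open(local_path, "wb") as target:
        while True:
            suffix = f".part{count:03d}"  # hypothetical naming scheme
            segment_path = local_path + suffix
            if not get(segment_path, hpss_path + suffix):
                break
            with open(segment_path, "rb") as segment:
                shutil.copyfileobj(segment, target)
            os.remove(segment_path)
            count += 1

    if count == 0:
        os.remove(local_path)  # nothing was fetched; drop the empty output file
        raise FileNotFoundError(f"no archive or segments found at {hpss_path}")
    return count
```

With this shape a consumer calls fetch_archive("/path/in/hpss/run0001.tar", "run0001.tar") and receives a single local file either way. Probing for segments by attempting a get is a simplification; a listing command or a manifest stored alongside the segments would make a truncated segment set detectable.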
