Description
It may be useful to explore DataLad for pulling and versioning large datasets. I am not sure it aligns perfectly with our use case, but I think it is still worth a look.
This site gives an overview of how DataLad works with git-annex, and this section of the site in particular explains how to "publish a dataset on GitHub with publicly-accessible annexed files". The key point is that these files are not downloaded locally automatically. We would still need somewhere to store the files, but this may ease the process for large datasets.
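To make the "not downloaded automatically" point concrete, here is a rough sketch of what the workflow might look like from our side. It only wraps the two DataLad CLI commands `datalad clone` and `datalad get`; the repository URL, directory name, file path, and the helper names `build_commands` and `fetch_one_file` are all hypothetical placeholders, and I have not tried this against a real dataset of ours.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(source_url, dataset_dir, wanted_file):
    """Return the two DataLad CLI invocations as argument lists."""
    # `datalad clone` fetches the dataset's git history and file pointers,
    # but not the content of the annexed (large) files.
    clone_cmd = ["datalad", "clone", source_url, str(dataset_dir)]
    # `datalad get` downloads the content of one specific file on demand.
    get_cmd = ["datalad", "get", wanted_file]
    return clone_cmd, get_cmd


def fetch_one_file(source_url, dataset_dir, wanted_file, dry_run=True):
    """Clone a dataset, then retrieve a single annexed file.

    With dry_run=True (or when the datalad executable is not installed),
    nothing is executed and the commands are only returned.
    """
    clone_cmd, get_cmd = build_commands(source_url, dataset_dir, wanted_file)
    if dry_run or shutil.which("datalad") is None:
        return [clone_cmd, get_cmd]
    subprocess.run(clone_cmd, check=True)
    # Run `get` from inside the freshly cloned dataset.
    subprocess.run(get_cmd, cwd=dataset_dir, check=True)
    return [clone_cmd, get_cmd]


if __name__ == "__main__":
    # Hypothetical URL and file path, for illustration only.
    commands = fetch_one_file(
        "https://github.com/example-org/example-dataset",
        Path("example-dataset"),
        "data/large_file.bin",
    )
    for cmd in commands:
        print(" ".join(cmd))
```

If this fits our use case, a user would clone once and then call `datalad get` only for the files they actually need, instead of pulling the whole dataset up front.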
More information on DataLad can be found here:
Website: https://www.datalad.org/
GitHub: https://github.com/datalad/datalad
Documentation: http://handbook.datalad.org/en/latest/index.html
Introduction presentation: https://training.westdri.ca/materials/datalad_for_hpc_1_1.pdf