Tutorial Outline
- 2:00 p.m. Introduction
- 2:30 p.m. DataLad version control
- 3:15 p.m. DataLad reproducibility
- 4:00 p.m. Coffee Break
- 4:30 p.m. DataLad in HPC with SLURM
- 5:15 p.m. Outlook on additional and advanced features
- 5:45 p.m. Wrap Up
- 6:00 p.m. End of tutorial
Tutorial Outline - Part I
2:00 p.m. Introduction
- The Git ecosystem, including Git forges
- Why is standard Git not good for binary files?
- FAIR research data management and reproducibility in science
2:30 p.m. DataLad version control
- The git-annex extension and external storage for large data
- The DataLad tool on top of Git and its subcommands
- Hands-on: Get to know the tutorial repository
- Hands-on: Add new data to the tutorial repository
3:15 p.m. DataLad reproducibility
- The DataLad subcommands for machine-actionable reproducibility
- The YODA principles for data repositories
- Hands-on: Use the DataLad run subcommand
- Hands-on: Reproduce somebody else’s result with DataLad rerun
4:00 p.m. Coffee Break
Tutorial Outline - Part II
4:30 p.m. DataLad in HPC with SLURM
- The complication with DataLad run and SLURM batch processing
- The DataLad batch scheduling extension
- Hands-on: Run many reproducible batch jobs at a time with DataLad
- Hands-on: Migrate results to another HPC cluster and continue there
5:15 p.m. Outlook on additional and advanced features
- Considerations for parallel HPC filesystems
- DataLad simplifies hierarchical Git submodules
- Containerized computations with DataLad
- Outlook on integrated metadata management
5:45 p.m. Wrap Up
- Summary and pointers to further resources
6:00 p.m. End of tutorial