2024-01-10 17:38:36 +00:00 · 2024-01-08 13:42:25 +00:00 · 2024-01-08 13:43:46 +00:00 · 2024-01-10 11:32:32 +00:00 · 2024-01-08 13:21:12 +00:00 · 2024-01-08 13:22:43 +00:00
13 changed files with 468 additions and 244 deletions
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -1,2 +1,2 @@
-Sphinx >= 7.0, < 8.0
-furo==2023.5.20
+Sphinx == 7.2.*
+furo == 2023.9.10
--- a/docs/source/admin.rst
+++ b/docs/source/admin.rst
@@ -1,152 +1,27 @@
 Administrator docs
 ==================

-The INM-ICF Utilities `Github repository`_ provides a set of
-executable Python scripts which automate generation of deposits in the
-ICF archive. To simplify deployment, these scripts and all their
-dependencies are packaged as a `Singularity`_ v3 container
-(`download`_).
-
-.. _github repository: https://github.com/psychoinformatics-de/inm-icf-utilities
-.. _singularity: https://docs.sylabs.io/guides/main/user-guide/
-.. _download: https://ci.appveyor.com/api/projects/mih/inm-icf-utilities/artifacts/icf.sif
-
-Archive generation
------------------
-
-Containerized execution
-^^^^^^^^^^^^^^^^^^^^^^^
-
-With the Singilarity image, ``icf.sif``, all scripts are made directly
-available, either through ``singularity run``:
-
-.. code-block:: console
-
-   $ singularity run <singularity options> icf.sif <script name> <script options>
-
-or by making the image file executable.
-
-The singularity image can also be installed as if it was a system
-package. For this, fill in the placeholders in the following script,
-and save it as ``icf-utils``:
-
-.. code-block:: sh
-
-   #!/bin/sh
-   set -e -u
-   singularity run -B <absolute-path-to-data> <absolute-path-to-icf.sif-file> "$@" > icf-utils
-
-The ``-B`` defines a bind path, making it accessible from within the
-container.
-
-Afterwards, install it under ``/usr/bin`` to make all functionality
-available under an ``icf-utils`` command.
-
-.. code-block::
-
-   $ sudo install -t /usr/bin icf-utils
-
 Archival workflow
-^^^^^^^^^^^^^^^^^
+-----------------

 The main part of visit archival is the creation of a TAR file.

-The DataLad dataset can be generated and placed alongside the tarballs
-without affecting them. Placement in the study folder guarantees the
-same access permissions (authenticated https). The datasets are
-generated based on file metadata -- the TAR archive remains the only
-data source -- so storage overhead is minimal.
+Optionally, the DataLad dataset can be generated and placed alongside
+the tarballs without affecting them. Placement in the study folder
+guarantees the same access permissions (authenticated https). The
+datasets are generated based on file metadata -- the TAR archive
+remains the only data source -- so storage overhead is minimal.

 Four scripts, executed in the given order, capture the archival
-process.
+process. See :ref:`scripts` for usage details and :ref:`container` for
+recommended deployment of the tools.

-Script listing
-^^^^^^^^^^^^^^
+- ``make_studyvisit_archive``
+- ``deposit_visit_metadata`` (optional)
+- ``deposit_visit_dataset`` (optional)
+- ``catalogify_studyvisit_from_meta`` (optional)

-``make_studyvisit_archive``
-"""""""""""""""""""""""""""
-
-This utility generates a TAR archive from a directory containing DICOM files.
-
-The input directory can have any number of files, with any organization or
-naming. However, the DICOM files are assumed to come from a single "visit"
-(i.e., the time between a person or sample entering and then leaving a
-scanner). The input directory's content is copied into a TAR archive verbatim,
-with no changes to filenames or organization.
-
-In order to generate reproducible TAR archives, the file order, recorded
-permissions and ownership, and modification times are standardized. All files
-in the TAR archive are declared to be owned by root/root (uid/gid: 0/0) with
-0644 permissions. The modification time of any DICOM file is determined
-by its contained DICOM `StudyDate/StudyTime` timestamps. The modification time
-for any non-DICOM file is set to the latest timestamp across all DICOM files.
-
-.. code-block:: console
-
-   $ icf-utils make_studyvisit_archive --help
-   usage: make_studyvisit_archive [-h] [-o PATH] --id STUDY-ID VISIT-ID <input-dir>
-
-``deposit_visit_metadata``
-""""""""""""""""""""""""""
-
-This command locates the DICOM tarball for a particular visit in a
-study (given by their respective identifiers) in the data store, and
-extracts a minimal set of metadata tags for each DICOM image, and the
-TAR archive as a whole. These metadata are then deposited in two
-files, in JSON format, in the study directory:
-
- ``{visit_id}_metadata_tarball.json``
-
-  JSON object with basic properties of the archive, such as 'size', and
-  'md5'.
-
- ``{visit_id}_metadata_dicoms.json``
-
-  JSON array with essential properties for each DICOM image file, such as
-  'path' (relative path inside the TAR archive), 'md5' (MD5 checksum of
-  the DICOM file), 'size' (in bytes), and select standard DICOM tags,
-  such as "SeriesDescription", "SeriesNumber", "Modality",
-  "MRAcquisitionType", "ProtocolName", "PulseSequenceName". The latter
-  enable a rough, technical characterization of the images in the TAR
-  archive.
-
-.. code-block:: console
-
-  $ icf-utils getmeta_studyvisit -h
-  usage: getmeta_studyvisit [-h] [-o PATH] --id STUDY-ID VISIT-ID
-
-``deposit_visit_dataset``
-"""""""""""""""""""""""""
-
-This command reads the metadata deposit from
-``deposit_visit_metadata`` for a visit in a study (given by their
-respective identifiers) from the data store, and generates a DataLad
-dataset from it. This DataLad dataset provides versioned access to the
-visit's DICOM data, up to single-image granularity.  Moreover, all
-DICOM files are annotated with basic DICOM tags that enable on-demand
-dataset views for particular applications (e.g., DICOMs sorted by
-image series and protocol name). The DataLad dataset is deposited in
-two files in the study directory:
-
- ``{visit_id}_XDLRA--refs``
- ``{visit_id}_XDLRA--repo-export``
-
-where the former enables `datalad/git clone` operations, and the latter
-represents the actual dataset as a compressed archive.
-
-.. code-block:: console
-
-   $ icf-utils dataladify_studyvisit_from_meta -h
-   usage: dataladify_studyvisit_from_meta [-h] [-o PATH] --id STUDY-ID VISIT-ID
-
-``catalogify_studyvisit_from_meta``
-"""""""""""""""""""""""""""""""""""
-
-This command creates or updates a DataLad catalog -- a user-facing
-html rendering of dataset contents. It is placed in the ``catalog``
-folder in the study directory.
-
-.. code-block:: console
-
-  $ icf-utils dataladify_studyvisit_from_meta --help
-  usage: dataladify_studyvisit_from_meta [-h] [-o PATH] --id STUDY-ID VISIT-ID
+Creation of the TAR file needs to be done by the ICF. The remaining
+three steps can be done by the ICF (with results deposited alongside
+the TAR file), or by the ICF users who can access the data (on their
+own infrastructure), and for this reason are marked as optional.
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -16,6 +16,7 @@ individuals.
   :caption: Contents:

   user/index
+   reference/index
   admin
   developer

--- a/docs/source/reference/container.rst
+++ b/docs/source/reference/container.rst
@@ -0,0 +1,40 @@
+.. _container:
+
+Containerized execution
+-----------------------
+
+To simplify deployment, ICF utilities scripts and all their
+dependencies are packaged as a `Singularity`_ v3 container
+(`download`_).
+
+.. _singularity: https://docs.sylabs.io/guides/main/user-guide/
+.. _download: https://ci.appveyor.com/api/projects/mih/inm-icf-utilities/artifacts/icf.sif
+
+With the Singularity image, ``icf.sif``, all scripts are made directly
+available, either through ``singularity run``:
+
+.. code-block:: console
+
+   $ singularity run <singularity options> icf.sif <script name> <script options>
+
+or by making the image file executable.
+
+The Singularity image can also be installed as if it were a system
+package. For this, fill in the placeholders in the following script,
+and save it as ``icf-utils``:
+
+.. code-block:: sh
+
+   #!/bin/sh
+   set -e -u
+   singularity run -B <absolute-path-to-data> <absolute-path-to-icf.sif-file> "$@"
+
+The ``-B`` option defines a bind path, making that path accessible from
+within the container.
+
+Afterwards, install it under ``/usr/bin`` to make all functionality
+available under an ``icf-utils`` command.
+
+.. code-block::
+
+   $ sudo install -t /usr/bin icf-utils
--- a/docs/source/reference/index.rst
+++ b/docs/source/reference/index.rst
@@ -0,0 +1,19 @@
+Reference
+=========
+
+The INM-ICF Utilities `GitHub repository`_ provides a set of
+executable Python scripts which automate generation of deposits in the
+ICF archive. To simplify deployment, these scripts and all their
+dependencies are packaged as a `Singularity`_ v3 container
+(`download`_).
+
+.. _github repository: https://github.com/psychoinformatics-de/inm-icf-utilities
+.. _singularity: https://docs.sylabs.io/guides/main/user-guide/
+.. _download: https://ci.appveyor.com/api/projects/mih/inm-icf-utilities/artifacts/icf.sif
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+
+   container
+   scripts
--- a/docs/source/reference/scripts.rst
+++ b/docs/source/reference/scripts.rst
@@ -0,0 +1,92 @@
+.. _scripts:
+
+Script listing
+--------------
+
+``make_studyvisit_archive``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This utility generates a TAR archive from a directory containing DICOM files.
+
+The input directory can have any number of files, with any organization or
+naming. However, the DICOM files are assumed to come from a single "visit"
+(i.e., the time between a person or sample entering and then leaving a
+scanner). The input directory's content is copied into a TAR archive verbatim,
+with no changes to filenames or organization.
+
+In order to generate reproducible TAR archives, the file order, recorded
+permissions and ownership, and modification times are standardized. All files
+in the TAR archive are declared to be owned by root/root (uid/gid: 0/0) with
+0644 permissions. The modification time of any DICOM file is determined
+by its contained DICOM ``StudyDate/StudyTime`` timestamps. The modification time
+for any non-DICOM file is set to the latest timestamp across all DICOM files.
+
+.. code-block:: console
+
+   $ icf-utils make_studyvisit_archive --help
+   usage: make_studyvisit_archive [-h] [-o PATH] --id STUDY-ID VISIT-ID <input-dir>
+
+``deposit_visit_metadata``
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This command locates the DICOM tarball for a particular visit in a
+study (given by their respective identifiers) in the data store, and
+extracts a minimal set of metadata tags for each DICOM image, and the
+TAR archive as a whole. These metadata are then deposited in two
+files, in JSON format, in the study directory:
+
+- ``{visit_id}_metadata_tarball.json``
+
+  JSON object with basic properties of the archive, such as 'size' and
+  'md5'.
+
+- ``{visit_id}_metadata_dicoms.json``
+
+  JSON array with essential properties for each DICOM image file, such as
+  'path' (relative path inside the TAR archive), 'md5' (MD5 checksum of
+  the DICOM file), 'size' (in bytes), and select standard DICOM tags,
+  such as "SeriesDescription", "SeriesNumber", "Modality",
+  "MRAcquisitionType", "ProtocolName", "PulseSequenceName". The latter
+  enable a rough, technical characterization of the images in the TAR
+  archive.
+
+.. code-block:: console
+
+  $ icf-utils deposit_visit_metadata -h
+  usage: deposit_visit_metadata [-h] [-o PATH] --id STUDY-ID VISIT-ID
+
+``deposit_visit_dataset``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This command reads the metadata deposit from
+``deposit_visit_metadata`` for a visit in a study (given by their
+respective identifiers) from the data store, and generates a DataLad
+dataset from it. This DataLad dataset provides versioned access to the
+visit's DICOM data, up to single-image granularity.  Moreover, all
+DICOM files are annotated with basic DICOM tags that enable on-demand
+dataset views for particular applications (e.g., DICOMs sorted by
+image series and protocol name). The DataLad dataset is deposited in
+two files in the study directory:
+
+- ``{visit_id}_XDLRA--refs``
+- ``{visit_id}_XDLRA--repo-export``
+
+where the former enables ``datalad/git clone`` operations, and the latter
+represents the actual dataset as a compressed archive.
+
+.. code-block:: console
+
+   $ icf-utils deposit_visit_dataset -h
+   usage: deposit_visit_dataset [-h] --id STUDY-ID VISIT-ID [-o PATH] [--store-url URL]
+
+``catalogify_studyvisit_from_meta``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This command creates or updates a DataLad catalog -- a user-facing
+html rendering of dataset contents. It is placed in the ``catalog``
+folder in the study directory.
+
+.. code-block:: console
+
+  $ icf-utils catalogify_studyvisit_from_meta --help
+  usage: catalogify_studyvisit_from_meta [-h] [-o PATH] --id STUDY-ID VISIT-ID
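The reproducibility rules described above for ``make_studyvisit_archive`` (stable file order, root/root ownership, 0644 permissions, standardized modification times) can be illustrated with a minimal Python sketch. This is not the script's actual implementation: the function name ``make_reproducible_tar`` is hypothetical, and a single ``mtime`` is passed in for every member as a simplifying assumption, whereas the real tool derives it per file from the DICOM timestamps.

```python
import io
import tarfile
from pathlib import Path


def make_reproducible_tar(input_dir: Path, mtime: int) -> bytes:
    """Pack all files under input_dir into a TAR with normalized metadata.

    Hypothetical helper for illustration only. Every member gets the same
    mtime here; deriving it from DICOM StudyDate/StudyTime is left out.
    """
    files = [p for p in input_dir.rglob("*") if p.is_file()]
    # Sort by relative POSIX path so the member order never depends on
    # the order in which the file system lists directory entries.
    files.sort(key=lambda p: p.relative_to(input_dir).as_posix())

    buffer = io.BytesIO()
    # Pin the TAR format so the result does not depend on Python's default.
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in files:
            info = tarfile.TarInfo(name=path.relative_to(input_dir).as_posix())
            info.size = path.stat().st_size
            info.mode = 0o644                 # fixed permissions
            info.uid = info.gid = 0           # owned by root/root
            info.uname = info.gname = "root"
            info.mtime = mtime                # standardized timestamp
            with path.open("rb") as f:
                tar.addfile(info, f)
    return buffer.getvalue()
```

Because nothing from the local file system other than names and content enters the archive, packing the same directory twice yields byte-identical output, which is what makes checksums of the TAR file comparable across runs.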
--- a/docs/source/user/browser.rst
+++ b/docs/source/user/browser.rst
@@ -24,10 +24,10 @@ following:
 Catalog-based browsing
 ======================

-By entering the ``datalad_catalog`` directory, users will be able to
+If a catalog has been generated for a given study, users will be able to
 browse through the directory tree with additional annotations
 of available metadata, and search for acquisitions based on keywords
-or name.
+or name, by entering the ``datalad_catalog`` directory.

 Downloads
 =========
--- a/docs/source/user/datalad-access.rst
+++ b/docs/source/user/datalad-access.rst
@@ -0,0 +1,85 @@
+.. _dl-access:
+
+Access data with DataLad
+------------------------
+
+This section describes accessing the ICF data by cloning DataLad
+datasets which have already been created and made available, most
+likely on local infrastructure. Dataset generation is described in
+the previous section, :ref:`dl-generate`.
+
+This workflow uses DataLad with the DataLad-Next extension (see
+:ref:`dl-requirements`). DataLad datasets index data in their original
+(ICF) location. Obtaining data hosted in the ICF store requires access
+credentials for a given study, issued by the ICF. DataLad acts only as
+client software. See :ref:`dl-credentials` for details.
+
+Clone & get
+^^^^^^^^^^^
+
+If a visit dataset has been prepared and placed in an accessible
+location, it can be cloned with DataLad from a URL containing the
+following components:
+
+* a set of configuration parameters, always constant
+* store base URL (e.g., ``file:///data/group/groupname/local_dicom_store``) [1]_
+* study ID (e.g., ``my-study``)
+* visit ID (e.g., ``P000123``)
+* a file name suffix / template, ``_{{annex_key}}`` (verbatim), always constant
+
+The pattern for the URL is::
+
+    'datalad-annex::?type=external&externaltype=uncurl&encryption=none&url=<store base URL>/<study ID>/<visit ID>_{{annex_key}}'
+
+Given the example values above, the pattern would expand to:
+
+.. code-block::
+
+    'datalad-annex::?type=external&externaltype=uncurl&encryption=none&url=file:///data/group/groupname/local_dicom_store/my-study/P000123_{{annex_key}}'
+
+A full ``datalad clone`` command could then look like this:
+
+.. code-block::
+
+    datalad clone 'datalad-annex::?type=external&externaltype=uncurl&encryption=none&url=file:///data/group/groupname/local_dicom_store/my-study/P000123_{{annex_key}}' my_clone
+
+.. note::
+
+   The clone command will not fail if the ``datalad-annex::`` URL
+   points to a nonexistent target. If you see the following warning:
+
+   .. code-block:: none
+
+      [WARNING] You appear to have cloned an empty repository.
+      [WARNING] Cloned /path/to/my_clone but could not find a branch with commits
+
+   it is likely that the provided URL is mistyped or otherwise not correct.
+
+
+.. note:: The URL is arguably a bit clunky. A convenient shortcut can be provided via the configuration item ``datalad.clone.url-substitute.<label>`` and a substitution rule based on regular expressions. For example, clone URLs can be shortened to require only the store base URL (here, ``file:///data/group/groupname/local_dicom_store``), study ID, and visit ID (``<store base URL>/<study-ID>/<visit-ID>``) with the following configuration:
+
+   .. code-block::
+
+      git config --global datalad.clone.url-substitute.inm-icf ',^file:///data/group/groupname/local_dicom_store/([^/]+)/(.*)$,datalad-annex::?type=external&externaltype=uncurl&encryption=none&url=file:///data/group/groupname/local_dicom_store/\1/\2_{{annex_key}}'
+
+   This configuration allows DataLad to take any URL of the form ``file:///data/group/groupname/local_dicom_store/<study-ID>/<visit-ID>`` and assemble the required ``datalad-annex::...`` URL on its own, so that a clone call shortens to ``datalad clone file:///data/group/groupname/local_dicom_store/my-study/P000123``.
+   You are free to adapt this configuration to your needs and preferences.
+   Further documentation on it can be found in the `DataLad Docs`_.
+
+
+.. _DataLad Docs: http://docs.datalad.org/en/stable/design/url_substitution.html
+
+Cloning will retrieve a lightweight dataset, which does not (yet)
+contain file content. File content can be retrieved with ``datalad
+get``. DataLad will handle download and unpacking of the tar file.
+Take a look at the section :ref:`dl-advanced` to learn about useful
+convenience features DataLad adds on top of this.
+
+
+.. rubric:: Footnotes
+
+.. [1] Examples use ``file://`` URLs, given that the datasets are most
+       likely to be generated on institute-local infrastructure. Other
+       protocols (e.g. ``https://`` or ``ssh://``) can be substituted
+       depending on the particular setup, without affecting the URL
+       structure.
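The clone URL pattern and the ``url-substitute`` rule documented above can be checked with a short Python sketch. The helper names ``clone_url`` and ``apply_rule`` are hypothetical, and ``apply_rule`` only emulates the documented behaviour (the first character of the rule is the delimiter between the match and the substitution expression) with ``re.sub``; it is not DataLad's own code.

```python
import re

# Constant parts of the clone URL, copied from the pattern documented above.
PREFIX = "datalad-annex::?type=external&externaltype=uncurl&encryption=none&url="
SUFFIX = "_{{annex_key}}"

# The substitution rule from the ``git config`` call above, as a raw string.
RULE = (
    r",^file:///data/group/groupname/local_dicom_store/([^/]+)/(.*)$"
    r",datalad-annex::?type=external&externaltype=uncurl&encryption=none"
    r"&url=file:///data/group/groupname/local_dicom_store/\1/\2_{{annex_key}}"
)


def clone_url(store_base_url: str, study_id: str, visit_id: str) -> str:
    """Assemble the full datalad-annex clone URL from its components."""
    base = store_base_url.rstrip("/")
    return f"{PREFIX}{base}/{study_id}/{visit_id}{SUFFIX}"


def apply_rule(url: str, rule: str) -> str:
    """Emulate a url-substitute rule (assumed semantics, see lead-in)."""
    delimiter = rule[0]
    pattern, replacement = rule[1:].split(delimiter, 1)
    # URLs that do not match the pattern are returned unchanged.
    return re.sub(pattern, replacement, url)


if __name__ == "__main__":
    store = "file:///data/group/groupname/local_dicom_store"
    print(clone_url(store, "my-study", "P000123"))
    print(apply_rule(f"{store}/my-study/P000123", RULE))
```

Both calls print the same expanded URL, which is a quick way to confirm that a customized substitution rule still produces the URL structure the store expects.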
--- a/docs/source/user/datalad-credentials.rst
+++ b/docs/source/user/datalad-credentials.rst
@@ -0,0 +1,28 @@
+.. _dl-credentials:
+
+Manage DataLad credentials
+--------------------------
+
+The ICF store is not publicly available, and ICF administrators will
+provide user names and passwords on a per-study basis.  DataLad will
+store or retrieve these credentials using your operating system's
+keyring service. In general, the first time you use DataLad to access
+a project directory, you will be prompted for your credentials. If
+content retrieval succeeds, you will be offered the option to save the
+credential, to be reused the next time you access a URL from the same
+realm.
+
+If you have access to multiple projects, you can have different sets
+of credentials. You can use the `datalad credentials`_ command from
+DataLad Next to manage (e.g. query, set or remove) credentials known
+to DataLad.
+
+.. admonition:: DataLad usage in the context of GDPR
+
+   DataLad is client-side software. Using DataLad with the ICF store
+   is technically equivalent to downloading tar archives with ``wget``
+   or with a web browser click-to-download: in either case, data
+   access happens over https, and the authorisation is performed by
+   the ICF server, not by the clients.
+
+.. _datalad credentials: http://docs.datalad.org/projects/next/en/latest/generated/man/datalad-credentials.html
--- a/docs/source/user/datalad-generate.rst
+++ b/docs/source/user/datalad-generate.rst
@@ -0,0 +1,142 @@
+.. _dl-generate:
+
+Generate DataLad datasets
+-------------------------
+
+The ICF archive for a given project contains DICOM files packaged in
+tar archives (DICOM tarballs). In this section we describe creating
+DataLad datasets, which index content and location of these tarballs,
+for DataLad-based access on institute-local infrastructure.
+
+In principle, such datasets are *lightweight*, meaning that they only
+index the content that can be retrieved from the ICF archive (all
+access restrictions apply). Using DataLad can simplify local access,
+allow raw data versioning, integrate with existing workflows, and
+enable logical transformations of the DICOM folder structure - see
+:ref:`dl-advanced` for examples of the latter.
+
+The workflow described below uses DataLad with the DataLad-Next extension
+for initial DICOM download and the INM-ICF tools packaged as a
+Singularity container for subsequent steps (see
+:ref:`dl-requirements`). ICF access credentials are required (see
+:ref:`dl-credentials`).
+
+Obtain the tarball
+^^^^^^^^^^^^^^^^^^
+
+First, create an empty directory to be the local dataset store. The
+last path component must be the ``project-ID`` used by the ICF store,
+because the following commands use project and visit IDs to determine
+paths.
+
+.. code-block:: bash
+
+   mkdir -p local_dicom_store/<project-ID>
+
+Download the visit tarball, keeping the same relative path:
+
+.. code-block:: bash
+
+   datalad download "https://data.inm-icf.de/<project-ID>/<visit-ID>_dicom.tar local_dicom_store/<project-ID>/<visit-ID>_dicom.tar"
+
+The local copy of the tarball is required to index its contents. It
+can be removed afterwards -- datasets will use the ICF store as the
+content source.
+
+Using ``datalad download`` for downloading the file has the benefit of
+using DataLad's credential management. If this is the first time you
+use DataLad to access the project directory, you will be asked to
+provide your ICF credentials. See :ref:`dl-credentials` for details.
+
+For the following steps, the ICF utility scripts packaged as a
+Singularity container will be used, and executed with ``singularity
+run`` (see :ref:`container` for download and usage details). The
+*absolute path* to the local DICOM store will be represented by
+``$STORE_DIR``:
+
+.. code-block:: bash
+
+   export STORE_DIR=$PWD/local_dicom_store
+
+Deposit visit metadata alongside tarball
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Information required to create a DataLad dataset needs to be extracted
+from the tarball:
+
+.. code-block:: bash
+
+   singularity run -B $STORE_DIR icf.sif deposit_visit_metadata \
+     --store-dir $STORE_DIR --id <project-ID> <visit ID>
+
+This will generate two files, ``<visit ID>_metadata_dicoms.json`` and
+``<visit ID>_metadata_tarball.json``, and place them alongside the
+tarball. The former contains metadata describing individual files
+within the tarball (relative path, MD5 checksum, size, and a small
+subset of DICOM headers describing acquisition type), and the latter
+describes the tarball itself.
+
+Deposit dataset representation alongside tarball
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The next step is to create a lightweight, clone-able representation of
+a dataset in the local dataset store. This step relies on the metadata
+extracted with the previous command. Additionally, the base URL of the
+ICF store needs to be provided (here represented by ``<ICF STORE
+URL>``; this base URL should not contain the study or visit ID). The URL,
+combined with respective IDs, will be registered in the dataset as the
+source of the DICOM tarball, and used for retrieval by dataset clones.
+
+.. code-block:: bash
+
+   singularity run -B $STORE_DIR icf.sif deposit_visit_dataset \
+     --store-dir $STORE_DIR --store-url <ICF STORE URL> --id <project-ID> <visit ID>
+
+This will produce two files, ``<visit ID>_XDLRA--refs`` and ``<visit
+ID>_XDLRA--repo-export`` (text file and zip archive
+respectively). Together, they are a representation of a (lightweight)
+DataLad dataset, and contain the information necessary to retrieve the
+data content with DataLad (but do not contain the data content
+itself).
+
+Create a catalog view (optional)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A catalog page (html+JS rendering of dataset contents generated with
+`DataLad catalog`_) can be created for the visit dataset. This is
+mostly useful when providing (internal) https access to the datasets.
+
+The following command will create the catalog (or update its content)
+and place it in the ``catalog`` folder in the study directory.
+
+.. _DataLad catalog: https://docs.datalad.org/projects/catalog
+
+.. code-block:: bash
+
+   singularity run -B $STORE_DIR icf.sif catalogify_studyvisit_from_meta \
+     --store-dir $STORE_DIR --id <project-ID> <visit ID>
+
+This catalog needs to be subsequently served; a simple (possibly
+local) http server is enough. See the generated README file in the
+``catalog`` folder for details.
+
+Remove the tarball
+^^^^^^^^^^^^^^^^^^
+
+Finally, the DICOM tarball can be safely removed.
+
+.. code-block:: bash
+
+   rm $STORE_DIR/<project-ID>/<visit ID>_dicom.tar
+
+Metadata files can be removed, too, leaving only the dataset
+representation in ``*XDLRA*`` files.
+
+.. code-block:: bash
+
+   rm $STORE_DIR/<project-ID>/<visit ID>_metadata_*.json
+
+
+The local store can be used as a DataLad entry point for obtaining the
+DICOM files from the ICF store (which would serve as the data source
+for dataset clones); see :ref:`dl-access`.
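To make the "Deposit visit metadata" step above more concrete, here is a minimal Python sketch of how the two documented tarball properties, 'size' and 'md5', could be computed. The function name ``tarball_metadata`` and the chunked-reading approach are assumptions for illustration; the actual ``deposit_visit_metadata`` script may record additional properties and work differently.

```python
import hashlib
from pathlib import Path


def tarball_metadata(tarball: Path, chunk_size: int = 1 << 20) -> dict:
    """Return the size in bytes and the MD5 checksum of a file.

    Hypothetical helper: only the keys 'size' and 'md5' are taken from the
    documentation; everything else is an assumption of this sketch.
    """
    md5 = hashlib.md5()
    size = 0
    with tarball.open("rb") as f:
        # Read in chunks so that large archives need not fit in memory.
        while chunk := f.read(chunk_size):
            md5.update(chunk)
            size += len(chunk)
    return {"size": size, "md5": md5.hexdigest()}
```

Serializing the returned dictionary with ``json.dumps`` would give a JSON object of the same general shape as the one described for ``<visit ID>_metadata_tarball.json``.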
--- a/docs/source/user/datalad-requirements.rst
+++ b/docs/source/user/datalad-requirements.rst
@@ -0,0 +1,35 @@
+.. _dl-requirements:
+
+DataLad requirements
+--------------------
+
+Accessing the ICF store contents and cloning datasets generated with
+the ICF tooling requires `DataLad`_ with the `DataLad-Next`_ extension
+installed.  You can find instructions for installing DataLad on your
+operating system in the `DataLad Handbook`_.  `DataLad-Next`_ can be
+installed with `pip`_ [1]_.
+
+Generating DataLad datasets based on the DICOM files in the ICF store
+additionally requires the INM-ICF tools, which are packaged as a
+`Singularity`_ container; see :ref:`container`. The tools are not
+required for accessing already existing DataLad datasets.
+
+Obtaining data hosted in the ICF store requires access credentials for
+a given study, issued by the ICF. DataLad acts only as client
+software. See :ref:`dl-credentials` for details.
+
+.. rubric:: Footnotes
+
+.. [1] To install software with pip, run a call such as the one below
+       in your favourite `virtual environment`_:
+
+       .. code-block:: bash
+
+          python -m pip install datalad-next
+
+.. _datalad: https://www.datalad.org/
+.. _datalad-next: https://docs.datalad.org/projects/next
+.. _datalad handbook: https://handbook.datalad.org/intro/installation.html
+.. _pip: https://pip.pypa.io/en/stable/
+.. _virtual environment: https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/
+.. _singularity: https://docs.sylabs.io/guides/main/user-guide/
--- a/docs/source/user/datalad.rst
+++ b/docs/source/user/datalad.rst
@@ -1,96 +0,0 @@
-DataLad-based access
--------------------
-
-Software requirements
-^^^^^^^^^^^^^^^^^^^^^
-
-Accessing the ICF store requires `DataLad`_ with `Datalad-Next`_
-extension installed.
-You can find instructions for installing DataLad on your operating
-system in the `DataLad Handbook`_.
-`Datalad-Next`_ can be installed with `pip`_ [1]_.
-
-.. _datalad: https://www.datalad.org/
-.. _datalad-next: https://docs.datalad.org/projects/next
-.. _datalad handbook: https://handbook.datalad.org/intro/installation.html
-.. _pip: https://pip.pypa.io/en/stable/
-
-Credentials
-^^^^^^^^^^^
-
-The ICF store is not publicly available, and ICF administrators will provide user names and passwords on a per-study basis.
-DataLad will store or retrieve these credentials using your
-operating system's keyring service. In general, the first time you use
-DataLad to access a project directory, you will be prompted for your
-credentials. If content retrieval succeeds, the credential will be
-saved, and reused the next time you access a URL from the same realm.
-
-If you have access to multiple projects, you can have different sets
-of credentials. You can use the `datalad credentials`_ command from
-DataLad Next to manage (e.g. query, set or remove) credentials known
-to DataLad.
-
-.. admonition:: DataLad usage in the context of GDPR
-
-   DataLad is a client-side software. Usage of DataLad with ICF store
-   is technically equivalent to downloading tar archives with ``wget``
-   or with a web browser click-to-download: in either case, data
-   access happens over https, and the authorisation is performed by
-   the ICF server, not by the clients.
-
-.. _datalad credentials: http://docs.datalad.org/projects/next/en/latest/generated/man/datalad-credentials.html
-
-
-Clone & get
-^^^^^^^^^^^
-
-A visit dataset can be cloned with DataLad from a URL containing the
-following components:
-
-* store base URL (e.g., ``https://data.inm-icf.de``)
-* study ID (e.g., ``my-study``)
-* visit ID (e.g., ``P000123``)
-* a set of additional parameters, always constant
-
-The pattern for the URL is::
-
-    'datalad-annex::?type=external&externaltype=uncurl&url=<store base URL>/<study ID>/<visit ID>_{{annex_key}}&encryption=none'
-
-Given the exemplary values above, the pattern would expand to
-
-.. code-block::
-
-    'datalad-annex::?type=external&externaltype=uncurl&url=https://data.inm-icf.de/my-study/P000123_{{annex_key}}&encryption=none'
-
-.. note:: The URL is arguably a bit clunky. A convenience short cut can be provided via configuration item ``datalad.clone.url-substitute.<label>`` and a substitution rule based on regular expressions. For example, clone URLs can be shortened to require only an identifier (here, ``https://data.inm-icf.de``), study ID, and visit ID (``inm-icf/<study-ID>/<visit-ID>``) with the following configuration:
-
-   .. code-block::
-
-      git config --global datalad.clone.url-substitute.inm-icf ',^https://data.inm-icf.de/([^/]+)/(.*)$,datalad-annex::?type=external&externaltype=uncurl&url=https://data.inm-icf.de/\1/\2_{{annex_key}}&encryption=none'
-
-   This configuration allows DataLad to take any URL of the form ``https://data.inm-icf.de/<study-ID>/<visit-ID>`` and assemble the required ``datalad-annex::...`` URL on its own, and a clone call shortens into ``datalad clone https://data.inm-icf.de/my-study/P000123``.
-   You are free to adjust this configuration custom to your needs and preferences.
-   Further documentation on it can be found in the `DataLad Docs`_.
-
-.. _DataLad Docs: http://docs.datalad.org/en/stable/design/url_substitution.html
-
-Cloning will retrieve a lightweight dataset, which does not (yet)
-contain file content. File content can be retrieved with `datalad
-get`. DataLad will handle download and unpacking of the tar file.
-Take a look at the section :ref:`dl-advanced` to learn about
-useful convenience features DataLad adds on top of this.
-
-Catalog-based clone URLs
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-Instead of crafting clone URLs by hand, the ``datalad_catalog``
-directory in the data store displays a copy-paste URL for cloning when
-clicking the "Download with DataLad" button on each individual visit ID.
-
-
-.. rubric:: Footnotes
-
-.. [1] To install software with pip, run a call such as the one below
-       in your favourite `virtual environment <https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/>`_::
-
-              python -m pip install datalad-next
--- a/docs/source/user/index.rst
+++ b/docs/source/user/index.rst
@@ -15,5 +15,8 @@ Please contact `ICF personnel`_ to get access and for any authentication-related
   :caption: Contents:

   browser
-   datalad
+   datalad-requirements
+   datalad-credentials
+   datalad-generate
+   datalad-access
   datalad-advanced