IGWN Data Distribution¶
This page describes how IGWN data are distributed in various domains for use on remote platforms including the IGWN Grid.
Low-latency data distribution is described elsewhere
This page describes the high-latency distribution of data (minutes to hours).
For the low-latency distribution of data (seconds), see Low-latency data.
Data replication¶
For replication of data from one storage centre to another, IGWN uses Rucio.
Replication is used mainly to maintain redundant data archives for long-term custodial storage of the primary data outputs of the experiments.
Data are not replicated to support distributed computing applications; for that, see below.
Data publication with the Open Science Data Federation¶
To support distributed computing, IGWN publishes data into the Open Science Data Federation (OSDF), an instance of a Pelican federation.
With this system, data are replicated from IGWN archival storage systems to an OSDF Origin server, from which they are published into the appropriate OSDF 'namespace'.
Data for each namespace are then available both directly from the origin server and from any registered OSDF Cache server that supports IGWN data.
The globally distributed network of OSDF Cache servers enables scalable access to IGWN data from any computing centre without the need to replicate those data in advance (see the OSDF map).
IGWN OSDF namespaces¶
IGWN currently manages origin servers that provide data in the following namespaces:
Namespace | Purpose |
---|---|
/gwdata | Public data curated by GWOSC, see Public data. |
/igwn/kagra | Primary data from the KAGRA observatory, see Aggregated strain data. |
/igwn/ligo | Primary data from the LIGO observatories, see Aggregated strain data. |
/igwn/virgo | Primary data from the Virgo observatory, see Aggregated strain data. |
/igwn/shared | Centrally-curated secondary data generated by IGWN groups, see Shared data. |
/igwn/cit | User-curated data published from the LDAS system at LIGO-CIT, see User scratch data. |
Accessing data from OSDF¶
Access to OSDF data is enabled via the Pelican client.
Installing the OSDF client¶
This tool can be installed from the official OSDF Yum repositories, or from conda-forge:
conda install -c conda-forge pelicanplatform
The pelican package is available from the HTCondor Yum repositories:
dnf -y install pelican
Downloading data¶
Downloading data using the OSDF client
pelican object get osdf:///igwn/ligo/README README
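The command above can be wrapped in a small helper when several objects need to be fetched. The following is only a sketch: the function names `osdf_url` and `osdf_get` are hypothetical, and it assumes that the `pelican` client is on `PATH` and that object paths are given as absolute namespace paths such as those in the table above.

```shell
# Build an osdf:// URL from an absolute namespace path, e.g. /igwn/ligo/README.
# (Hypothetical helper, not part of the Pelican client.)
osdf_url() {
    case "$1" in
        /*) printf 'osdf://%s\n' "$1" ;;
        *)  printf 'osdf_url: path must start with "/": %s\n' "$1" >&2
            return 1 ;;
    esac
}

# Download one object into a directory (default: current directory),
# keeping the base name of the remote object.
# Assumes the `pelican` client is installed and on PATH.
osdf_get() {
    src=$(osdf_url "$1") || return 1
    dest_dir=${2:-.}
    mkdir -p "$dest_dir" &&
        pelican object get "$src" "$dest_dir/$(basename "$1")"
}

# Example usage (not run here):
#   osdf_get /igwn/ligo/README ./data
```

Defining the helpers does not contact any server; a download only happens when `osdf_get` is called.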
Namespaces under /igwn are restricted
Access to files under any of the /igwn namespace paths is restricted to IGWN members.
Users should acquire a SciToken with the appropriate scope before attempting access.
The OSDF client automatically discovers a valid SciToken stored in a standard location, or in the file referred to by the BEARER_TOKEN_FILE environment variable.
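When the token is supplied through the environment variable, it can help to confirm that the variable points at a usable file before starting a download. The sketch below makes several assumptions: the function name `token_file_ready` and the token path in the usage comment are hypothetical, and the check only covers the `BEARER_TOKEN_FILE` case (not the standard locations the client also searches). It tests that the file exists and is non-empty, not that the token is valid or carries the right scope.

```shell
# Return success if BEARER_TOKEN_FILE is set and names a readable,
# non-empty file. This does NOT validate the token itself.
# (Hypothetical helper, not part of the OSDF client.)
token_file_ready() {
    [ -n "${BEARER_TOKEN_FILE:-}" ] &&
        [ -r "$BEARER_TOKEN_FILE" ] &&
        [ -s "$BEARER_TOKEN_FILE" ]
}

# Example usage (not run here; the token path is a placeholder):
#   export BEARER_TOKEN_FILE="$HOME/my-scitoken"
#   token_file_ready && pelican object get osdf:///igwn/ligo/README README
```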
Downloading data in HTCondor jobs¶
HTCondor jobs can declare OSDF paths as input files via the transfer_input_files submit command.
See Download data using OSDF for full details.
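As a rough illustration, the snippet below writes a minimal submit description that lists an OSDF URL in `transfer_input_files`. It is a sketch under assumptions: the file name `osdf_job.sub`, the choice of `/bin/cat` as the executable, and the resource requests are illustrative only, and the credential settings needed for restricted `/igwn` namespaces are deliberately left out because they are covered in the page referenced above.

```shell
# Write an illustrative HTCondor submit description to osdf_job.sub.
# The job prints the README that HTCondor transfers in from OSDF.
cat > osdf_job.sub <<'EOF'
# Illustrative job: print the transferred file
executable = /bin/cat
transfer_executable = false
arguments = README

# OSDF object to transfer into the job sandbox before the job starts
should_transfer_files = YES
transfer_input_files = osdf:///igwn/ligo/README

# Token/credential settings for restricted namespaces are omitted here

output = osdf_job.out
error = osdf_job.err
log = osdf_job.log

request_cpus = 1
request_memory = 100MB
request_disk = 100MB

queue
EOF
```

The resulting file could then be submitted with `condor_submit osdf_job.sub` once the appropriate credential configuration has been added.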