IGWN Data Distribution

This page describes how IGWN data are distributed in various domains for use on remote platforms including the IGWN Grid.

Low-latency data distribution is described elsewhere

This page describes high-latency data distribution (minutes to hours).

For details of low-latency data distribution (seconds), please see Low-latency data.

Data replication

For replication of data from one storage centre to another, IGWN uses Rucio.

Replication is mainly used to establish redundant data archives for long-term custodial storage of the primary data outputs of the experiments.

Data are not replicated to support distributed computing applications; for that, see below.

Data publication with the Open Science Data Federation

To support distributed computing, IGWN publishes data into the Open Science Data Federation (OSDF), an instance of a Pelican federation.

With this system, data are replicated from IGWN archival storage systems to an OSDF Origin server, from which they are published into the appropriate OSDF 'namespace'.

Data for each namespace are then available both directly from the origin server and from any registered OSDF Cache server that supports IGWN data.

The globally distributed network of OSDF Cache servers enables scalable access to IGWN data from any computing centre without the need to replicate those data in advance (see the OSDF map).

IGWN OSDF namespaces

IGWN currently manages origin servers that provide data in the following namespaces:

Namespace       Purpose
/gwdata         Public data curated by GWOSC, see Public data.
/igwn/kagra     Primary data from the KAGRA observatory, see Aggregated strain data.
/igwn/ligo      Primary data from the LIGO observatories, see Aggregated strain data.
/igwn/virgo     Primary data from the Virgo observatory, see Aggregated strain data.
/igwn/shared    Centrally-curated secondary data generated by IGWN groups, see Shared data.
/igwn/cit       User-curated data published from the LDAS system at LIGO-CIT, see User scratch data.

Accessing data from OSDF

Access to OSDF data is enabled via the OSDF Client.

Installing the OSDF client

This tool can be installed from conda-forge or from the official OSDF Yum repositories. To install from conda-forge:

conda install -c conda-forge stashcp

The stashcp package is available from the OSDF Yum repositories:

dnf -y install stashcp

Downloading data

Downloading data using the OSDF client

stashcp /igwn/ligo/README README

Namespaces under /igwn are restricted

Access to files under any of the /igwn namespace paths is restricted to IGWN members.

Users should acquire a SciToken with the appropriate scope before attempting access.

The OSDF client will automatically discover a valid SciToken stored in a standard location or in the file named by the BEARER_TOKEN_FILE environment variable.
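
As a minimal sketch of the token discovery described above, the following points the client at a token file explicitly before downloading a restricted file. The token path is a hypothetical placeholder, and acquiring the SciToken itself is not shown.

```shell
# Hypothetical token location (an assumption for illustration);
# replace it with the file where your SciToken is actually stored.
export BEARER_TOKEN_FILE="${HOME}/.igwn-scitoken"

# Attempt the download only if the client is installed and the token is readable.
if command -v stashcp >/dev/null 2>&1 && [ -r "${BEARER_TOKEN_FILE}" ]; then
    stashcp /igwn/ligo/README README
else
    echo "stashcp or a readable token file is missing; skipping download" >&2
fi
```

If the token already sits in one of the standard locations, the export should be unnecessary and the stashcp command can be run on its own.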

Downloading data in HTCondor jobs

HTCondor jobs can declare OSDF paths as input files for a job via the transfer_input_files submit command.

See Download data using OSDF for full details.
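
As a rough sketch of the transfer_input_files approach, the script below writes a small submit file that requests an OSDF path as a job input. The `osdf://` URL form, the file names, and the choice of /bin/cat as the executable are assumptions for illustration. Credential handling for restricted /igwn namespaces is not shown; the linked page is the authoritative reference.

```shell
# Write a hypothetical submit file; every file name here is illustrative.
cat > osdf_example.sub <<'EOF'
# Print the transferred file so that its contents appear in job.out.
executable              = /bin/cat
arguments               = README

# Assumed URL form for an OSDF path; check the linked page for the
# scheme and token settings that your access point expects.
transfer_input_files    = osdf:///igwn/ligo/README
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

output = job.out
error  = job.err
log    = job.log
queue
EOF

echo "Wrote osdf_example.sub; submit it with: condor_submit osdf_example.sub"
```

Writing the submit file from a script keeps the example self-contained; in practice the submit file would normally be maintained directly.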