
Accessing IGWN data

This page describes which IGWN data are available, and how they can be discovered and accessed in various regimes.

For details on how these data are transferred between computing centres, please see the pages on Bulk data distribution and Low-latency distribution.

Gravitational-wave Frame files (GWF)

Each of the current generation of IGWN detectors continuously produces data that are archived in Gravitational-Wave Frame (GWF, .gwf) files. This is a custom binary file format that allows for extremely efficient storage of a large quantity of heterogeneous data.

For details on the GWF format, see LIGO-T970130.

Data discovery

The GWF files are stored in a number of locations, and which data are available depends on which grid computing centre you want to use. Additionally, data may be available for remote use via NDS2 (see below).

Local data discovery

Local data files are discoverable using gwdatafind, a Python library and command-line interface backed by an index of all available files.

Info

For more details, and better examples, see the gwdatafind documentation.

Data are indexed using the following metadata parameters:

  • observatory - the single-character prefix for the observatory

    Prefix  Name
    G       GEO600
    K       KAGRA
    L       LIGO Livingston
    H       LIGO Hanford
    V       Virgo
  • frametype - the dataset name (see Available datasets below)

  • and the GPS [start, stop) interval of the contained data
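The GPS interval is half-open: a file contains data from its start time up to, but not including, its stop time, so consecutive files abut without overlapping. The following is a minimal sketch of the overlap test that this convention implies. The function name `overlaps` is hypothetical (it is not part of gwdatafind), and the GPS times are illustrative values only.

```python
def overlaps(file_start, file_stop, query_start, query_stop):
    """Return True if [file_start, file_stop) intersects [query_start, query_stop)."""
    return file_start < query_stop and query_start < file_stop


# two consecutive 4096-second files (illustrative GPS times)
first = (1187008512, 1187012608)
second = (1187012608, 1187016704)

# a 32-second query that falls entirely inside the first file
query = (1187008866, 1187008898)

print(overlaps(*first, *query))
print(overlaps(*second, *query))

# the stop time of the first file belongs to the second file, not the first
print(overlaps(*first, 1187012608, 1187012609))
print(overlaps(*second, 1187012608, 1187012609))
```

With half-open intervals, a query that starts exactly at a file boundary matches only the later file, which avoids double-counting the boundary second.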

Discovering datasets

Example

To discover which datasets are available at a given location for an observatory:

```python
from gwdatafind import find_types

# list the dataset names (frametypes) indexed for LIGO Livingston
print(find_types("L"))
```

```shell
python -m gwdatafind -o L --show-types
```

Discovering file URLs

Example

To find the file URLs (or paths) for a dataset:

```python
from gwdatafind import find_urls

# arguments: observatory, frametype, GPS start, GPS stop
print(find_urls("L", "L1_HOFT_C00", 1187008866, 1187008898))
```

```shell
python -m gwdatafind -o L -t L1_HOFT_C00 -s 1187008866 -e 1187008898
```

CVMFS data discovery

In a typical configuration gwdatafind is configured (via the LIGO_DATAFIND_SERVER environment variable) to connect to a local server that reads an index of local files.

However, a special server has been configured at datafind.ligo.org:443 to return URLs that point to locations in a CVMFS file system, allowing remote files to be accessed as if they were local.

Any authorised user can query that server to discover file URLs, and needs only to configure CVMFS on their system to read the files as if they were local:

Configuring CVMFS for IGWN data access

See Accessing proprietary IGWN data via CVMFS for details on how to configure CVMFS on a Linux host (or container) to enable IGWN data access.

Example

To query for files using a custom server:

```python
from gwdatafind import find_urls

# the host keyword overrides the default server
print(find_urls("L", "L1_HOFT_C00", 1187008866, 1187008898, host="datafind.ligo.org:443"))
```

```shell
python -m gwdatafind -o L -t L1_HOFT_C00 -s 1187008866 -e 1187008898 -r datafind.ligo.org:443
```

will return (at time of writing):

'file://localhost/cvmfs/oasis.opensciencegrid.org/ligo/frames/O2/hoft/L1/L-L1_HOFT_C00-11870/L-L1_HOFT_C00-1187008512-4096.gwf'
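A returned URL can be turned into a local path, and its file name decoded, using only the Python standard library. The sketch below assumes the file name follows the `<observatory>-<frametype>-<GPS start>-<duration>.gwf` pattern seen in the URL above, with integer GPS start and duration. The helper name `parse_gwf_url` is hypothetical and is not part of gwdatafind.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def parse_gwf_url(url):
    """Split a file:// URL for a GWF file into its path and name fields.

    Assumes the name is <observatory>-<frametype>-<GPS start>-<duration>.gwf
    with integer GPS start and duration.
    """
    path = PurePosixPath(urlparse(url).path)
    observatory, frametype, start, duration = path.stem.rsplit("-", 3)
    gps_start = int(start)
    return {
        "path": str(path),
        "observatory": observatory,
        "frametype": frametype,
        "gps_start": gps_start,
        # half-open interval: the file covers [gps_start, gps_stop)
        "gps_stop": gps_start + int(duration),
    }


url = (
    "file://localhost/cvmfs/oasis.opensciencegrid.org/ligo/frames/O2/hoft/L1/"
    "L-L1_HOFT_C00-11870/L-L1_HOFT_C00-1187008512-4096.gwf"
)
info = parse_gwf_url(url)
print(info["path"])
print(info["frametype"], info["gps_start"], info["gps_stop"])
```

The resulting path can be opened directly on any host where the CVMFS repository is mounted, and the decoded interval shows that this single file spans the whole queried range.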

Remote data discovery with NDS

The LIGO Laboratory supports a tool called Network Data Server (NDS), which enables remote access to observatory data. Version 2 of the NDS protocol supports remote, authenticated access for collaboration members, from anywhere in the world.

Info

For full details, see the NDS2 client documentation.

In contrast to gwdatafind, which locates the files that contain a specific dataset, NDS2 operates solely on the data channel name, and directly returns the data for that channel regardless of which dataset contains it. This can be extremely valuable when you are not sure which dataset contains a specific channel of interest.

Example

To download data for a specific data channel:

```python
import nds2

# open a connection to the NDS2 server
conn = nds2.connection("nds.ligo.caltech.edu")

# arguments: GPS start, GPS stop, list of channel names
print(conn.fetch(1187008866, 1187008898, ["H1:GDS-CALIB_STRAIN"]))
```

Available datasets

The following is an incomplete, but representative, reference for the datasets that may be available.

Warning

Not all datasets are available at each grid computing centre.

Dataset (frametype)    Description
H1_R                   All 'raw' data channels, stored at the native sampling rate
H1_T                   Second trends of all 'raw' channels, including .mean, .min, and .max
H1_M                   Minute trends of all 'raw' channels, including .mean, .min, and .max
H1_HOFT_C00            Strain h(t) and metadata generated using the real-time calibration pipeline
H1_HOFT_CXY            Strain h(t) and metadata generated using the off-line calibration pipeline at version XY
H1_GWOSC_O2_4KHZ_R1    4 kHz strain h(t) and metadata as released by the Gravitational-Wave Open Science Centre (GWOSC) for the O2 data release
H1_GWOSC_O2_16KHZ_R1   16 kHz strain h(t) and metadata as released by the Gravitational-Wave Open Science Centre (GWOSC) for the O2 data release