SciTokens

SciTokens is a capability-based authorisation system whereby tokens are issued that grant access to perform specific actions on specific services (for example to query for segment information from the Segment Database).

What is a SciToken?

A SciToken is a special implementation of a JSON Web Token designed for distributed scientific computing. For full details on the SciTokens project and the implementation of the tokens, see

https://scitokens.org/

Key SciTokens concepts

Key to the IGWN usage of SciTokens are the following concepts:

Issuer

The Issuer of a token (encoded in the iss claim) is the address of the service that issued the token. Most services that accept tokens are configured to only accept tokens issued by specific issuers.

For all IGWN usage, the token issuer should be

https://cilogon.org/igwn

Audience

The Audience of a token (the aud claim) is the service that a token is authorised to access.

This should typically be the fully-qualified URI of the target service, e.g. https://gracedb.ligo.org.

The Audience can also have the special value ANY, which should allow it to be used with any service. Usage of ANY tokens is discouraged for security reasons, except in the case of authorising with CVMFS.

Scope

The Scope of a token (the scope claim) is a space-separated list of capabilities that the token should authorise. The valid scopes are typically specific to a given service. See Token-aware services below for details on the IGWN-specific scopes.
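To make the three claims concrete, the following is a minimal Python sketch that reads iss, aud, and scope from the payload of a token. It uses only the standard library and does not verify the signature, so it is suitable for inspection only. The demo token is an unsigned dummy built for illustration; the helper names are hypothetical and not part of any IGWN tool.

```python
import base64
import json


def decode_claims(token: str) -> dict:
    """Return the (unverified) claims from the payload segment of a JWT."""
    # A JWT is three base64url segments: header.payload.signature
    payload = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Build a dummy, unsigned token purely for demonstration
demo_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({
        "iss": "https://cilogon.org/igwn",
        "aud": "https://gracedb.ligo.org",
        "scope": "gracedb.read",
    }),
    "",
])

claims = decode_claims(demo_token)
print(claims["iss"])            # the issuer
print(claims["aud"])            # the audience
print(claims["scope"].split())  # the scopes, as a list
```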

Installing the SciTokens tools

There are a few different tools to install based on usage:

Installing htgettoken

htgettoken is a tool for generating SciTokens via the IGWN Vault server.

The htgettoken tool can be installed using your preferred package manager on a number of systems:

conda install --channel conda-forge htgettoken

htgettoken is available for Debian in the IGWN Debian repositories:

apt-get install htgettoken

htgettoken is available for RHEL in the Open Science Grid yum repositories:

yum install htgettoken

Installing igwn-auth-utils

igwn-auth-utils is a Python library for discovering SciTokens and using them with HTTP requests.

Most IGWN service APIs use igwn-auth-utils automatically

The client API libraries for most IGWN services (including GraceDB and GWDataFind) automatically install and use igwn-auth-utils for handling SciTokens, so you probably don't need to install it manually.

igwn-auth-utils can be installed using your preferred package manager on a number of systems:

conda install --channel conda-forge igwn-auth-utils

python3-igwn-auth-utils is available for Debian in the IGWN Debian repositories:

apt-get install python3-igwn-auth-utils

igwn-auth-utils can also be installed using pip:

python -m pip install igwn-auth-utils

python3-igwn-auth-utils is available for RHEL and derivatives in the IGWN RHEL repositories for Scientific Linux 7, Rocky Linux 8, and Rocky Linux 9:

yum install python3-igwn-auth-utils

How to generate a SciToken

Generating a default token

LIGO.ORG and KAGRA identity holders can generate tokens using htgettoken:

htgettoken -a vault.ligo.org -i igwn

The token generated by htgettoken can be inspected using the httokendecode utility provided by the same software package:

$ httokendecode
{
  "sub": "marie.curie@ligo.org",
  "aud": "ANY",
  "ver": "scitoken:2.0",
  "nbf": 1681817886,
  "scope": "read:/frames gwdatafind.read gracedb.read dqsegdb.read",
  "iss": "https://cilogon.org/igwn",
  "exp": 1681828691,
  "iat": 1681817891,
  "jti": "https://cilogon.org/oauth2/adfghy549ij7gfd599453718c902af46?type=accessToken&ts=1681817891006&version=v2.0&lifetime=10800000"
}

By default this command will generate a token with the following claims:

Claim Value
aud "ANY"
scope dqsegdb.read gracedb.read gwdatafind.read read:/frames

Widely-scoped tokens are insecure

The default token is widely scoped: it is valid with any service and carries multiple capabilities. This presents a security risk, because an intruder in possession of the token could retrieve multiple forms of protected data.

Generating a secure token

To generate a secure token, you should specify the --audience and --scopes as narrowly as possible, to generate a token with a single capability that is accepted by a single service.

For example, to generate a token that allows the bearer to read data from https://datafind.igwn.org only:

htgettoken \
  -a vault.ligo.org \
  --audience https://datafind.igwn.org \
  --scopes gwdatafind.read

This generates a token with only the required claims, and nothing else.

$ httokendecode -H
{
  "sub": "marie.curie@ligo.org",
  "aud": "https://datafind.igwn.org",
  "ver": "scitoken:2.0",
  "nbf": "Tue 18 Apr 2023 12:38:06 PM BST",
  "scope": "gwdatafind.read",
  "iss": "https://cilogon.org/igwn",
  "exp": "Tue 18 Apr 2023 03:38:11 PM BST",
  "iat": "Tue 18 Apr 2023 12:38:11 PM BST",
  "jti": "https://cilogon.org/oauth2/adfghy549ij7gfd599453718c902af46?type=accessToken&ts=1681817891006&version=v2.0&lifetime=10800000"
}
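The nbf, iat, and exp claims are Unix timestamps, which httokendecode -H renders as local times. As a worked example, this short Python sketch converts the numeric values from the default token shown earlier and computes the token lifetime. The function names are illustrative only.

```python
from datetime import datetime, timezone

# Timestamps copied from the decoded token shown earlier
claims = {"nbf": 1681817886, "iat": 1681817891, "exp": 1681828691}


def as_utc(timestamp: int) -> datetime:
    """Convert a Unix timestamp into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def is_valid_at(claims: dict, when: datetime) -> bool:
    """Check that 'when' is not before nbf and is before exp."""
    return as_utc(claims["nbf"]) <= when < as_utc(claims["exp"])


issued = as_utc(claims["iat"])
expires = as_utc(claims["exp"])
lifetime = expires - issued

print(issued.isoformat())   # 2023-04-18T11:38:11+00:00
print(expires.isoformat())  # 2023-04-18T14:38:11+00:00
print(lifetime)             # 3:00:00
```

The three-hour lifetime agrees with the lifetime=10800000 (milliseconds) value embedded in the jti claim, and the UTC times are one hour behind the BST times printed by httokendecode -H.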

Generating a token using a robot Kerberos keytab

To support automated analyses, the IGWN Computing group can issue a robot credential that can be used to generate SciTokens without requiring interactive authorisation, or use of a personal identity.

To request a robot credential to support SciTokens, please go to https://robots.ligo.org and select option 4. Apply for a SciToken keytab.

Once the keytab has been issued, you can use it to generate a SciToken by calling htgettoken with some extra arguments specific to your robot:

htgettoken \
  -a vault.ligo.org \
  --audience https://gracedb.ligo.org \
  --scopes gracedb.read \
  --role <my-robot-name> \
  --credkey <my-robot-credkey>

Here the values for the --role and --credkey arguments will be given to you when the robot keytab is issued.
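In an automated workflow it can be convenient to assemble the htgettoken call in code. The following is a minimal Python sketch that builds the command using only the options shown above. The function name and the role and credkey values are placeholders, and the call that would actually run the command is left commented out.

```python
import shlex


def robot_htgettoken_command(role, credkey, audience, scopes,
                             vaultserver="vault.ligo.org"):
    """Build the htgettoken argument list for a robot credential."""
    return [
        "htgettoken",
        "-a", vaultserver,
        "--audience", audience,
        "--scopes", scopes,
        "--role", role,
        "--credkey", credkey,
    ]


cmd = robot_htgettoken_command(
    role="my-robot-name",        # placeholder, issued with the keytab
    credkey="my-robot-credkey",  # placeholder, issued with the keytab
    audience="https://gracedb.ligo.org",
    scopes="gracedb.read",
)
print(shlex.join(cmd))

# To run the command for real (requires htgettoken to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```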

The first use requires an interactive response

The first time a robot credential is used to generate a token, an interactive user must complete the OAuth 2.0 workflow.

This will generate a 'vault token', stored on the user's machine, that can be used (automatically) to get new/refreshed tokens without requiring the interactive OAuth workflow to be completed again.

Setting default htgettoken options

To simplify the default operations for htgettoken, you can set the HTGETTOKENOPTS environment variable to encode the options you always give:

For bash and other POSIX-style shells:

export HTGETTOKENOPTS="--vaultserver vault.ligo.org --issuer igwn"

For csh and tcsh:

setenv HTGETTOKENOPTS "--vaultserver vault.ligo.org --issuer igwn"

For PowerShell:

$Env:HTGETTOKENOPTS = "--vaultserver vault.ligo.org --issuer igwn"

Token-aware services

Proprietary IGWN data

Proprietary IGWN data are made available using the Open Science Data Federation with one 'origin' for each collaboration's h(t) data and one for shared derived/auxiliary data.

These data are then available over HTTP using the OSDF Client or via CVMFS.

IGWN members can access these data using authorisation tokens with the appropriate scope according to which collaboration's data are being accessed:

Collaboration OSDF path CVMFS repository SciToken scope
Legacy /user/ligo /cvmfs/ligo.osgstorage.org, /cvmfs/oasis.opensciencegrid.org/ligo read:/frames
KAGRA /igwn/kagra /cvmfs/kagra.storage.igwn.org read:/kagra
LIGO /igwn/ligo /cvmfs/ligo.storage.igwn.org read:/ligo
Virgo /igwn/virgo /cvmfs/virgo.storage.igwn.org read:/virgo
IGWN /igwn/shared /cvmfs/shared.storage.igwn.org read:/shared

OSDF and CVMFS require ANY

Because of the distributed nature of the OSDF and CVMFS system, the special Audience value of ANY should be used when authorising with these services.
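The table above can be read as a lookup from OSDF path prefix to the scope a token needs. A minimal Python sketch of that lookup follows; the function name and the example file path are hypothetical, while the prefixes and scopes are copied from the table.

```python
# OSDF path prefix -> SciToken scope, copied from the table above
OSDF_SCOPES = {
    "/user/ligo": "read:/frames",
    "/igwn/kagra": "read:/kagra",
    "/igwn/ligo": "read:/ligo",
    "/igwn/virgo": "read:/virgo",
    "/igwn/shared": "read:/shared",
}

# OSDF and CVMFS tokens use the special ANY audience
OSDF_AUDIENCE = "ANY"


def scope_for_path(path: str) -> str:
    """Return the scope needed to read the given OSDF path."""
    for prefix, scope in OSDF_SCOPES.items():
        # Match the prefix itself or anything below it
        if path == prefix or path.startswith(prefix + "/"):
            return scope
    raise ValueError(f"no known IGWN scope for OSDF path {path!r}")


# Hypothetical file path, for illustration only
print(scope_for_path("/igwn/ligo/some/data/file.gwf"))  # read:/ligo
```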

DQSegDB

DQSegDB accepts the following token scopes:

Scope Purpose
dqsegdb.read Authorise the bearer for DQSegDB read access

All DQSegDB servers expect https://segments.ligo.org as the audience

All DQSegDB instances require incoming tokens to declare the aud (audience) claim as https://segments.ligo.org, and not the fully-qualified URL of the actual instance.
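To illustrate why the audience must match exactly, here is a minimal Python sketch of the kind of check a token-aware service might perform. It is not the DQSegDB implementation. The function name and the example.org instance URL are hypothetical, and scopes are compared by exact match only, which ignores the path-prefix rules that apply to scopes such as read:/frames.

```python
def token_authorises(claims: dict, audience: str, scope: str) -> bool:
    """Check that the claims permit 'scope' for the service 'audience'."""
    aud = claims.get("aud", [])
    # The aud claim may be a single string or a list of strings
    audiences = [aud] if isinstance(aud, str) else list(aud)
    audience_ok = "ANY" in audiences or audience in audiences
    # The scope claim is a space-separated list
    scope_ok = scope in claims.get("scope", "").split()
    return audience_ok and scope_ok


expected_audience = "https://segments.ligo.org"

good = {"aud": "https://segments.ligo.org", "scope": "dqsegdb.read"}
# Hypothetical instance URL used as the audience by mistake
bad = {"aud": "https://dqsegdb.example.org", "scope": "dqsegdb.read"}

print(token_authorises(good, expected_audience, "dqsegdb.read"))  # True
print(token_authorises(bad, expected_audience, "dqsegdb.read"))   # False
```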

GraceDB

GraceDB understands the following token scopes:

Scope Purpose
gracedb.read Authorise the bearer for GraceDB access

The gracedb.read scope is used as a proxy for an identification token - GraceDB uses its own internal authorisation system to assign capabilities to the subject (owner) of the token.

GWDataFind

GWDataFind accepts the following token scopes:

Scope Purpose
gwdatafind.read Retrieve information from a GWDataFind server

Using tokens

Interactive use

For all interactive usage, tokens should be generated using htgettoken.

Once a token has been generated, the client tools for most token-aware services (including all of those listed above) will be able to automatically discover it by following the WLCG Bearer Token Discovery specification.
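The discovery order in that specification can be sketched in a few lines of Python. This is a simplified illustration, not the igwn-auth-utils implementation: it checks the BEARER_TOKEN variable, then the file named by BEARER_TOKEN_FILE, then bt_u<uid> under XDG_RUNTIME_DIR, then /tmp/bt_u<uid>. It skips missing files instead of treating them as errors, and it assumes a POSIX system. The function name and its parameters are hypothetical.

```python
import os
from pathlib import Path


def find_bearer_token(environ=None, uid=None):
    """Return the first token found in the WLCG discovery order, or None."""
    environ = os.environ if environ is None else environ
    uid = os.geteuid() if uid is None else uid

    # 1. The token itself in the BEARER_TOKEN variable
    token = environ.get("BEARER_TOKEN", "").strip()
    if token:
        return token

    candidates = []
    # 2. A file named by the BEARER_TOKEN_FILE variable
    if environ.get("BEARER_TOKEN_FILE"):
        candidates.append(Path(environ["BEARER_TOKEN_FILE"]))
    # 3. A per-user file under XDG_RUNTIME_DIR
    if environ.get("XDG_RUNTIME_DIR"):
        candidates.append(Path(environ["XDG_RUNTIME_DIR"]) / f"bt_u{uid}")
    # 4. A per-user file under /tmp
    candidates.append(Path("/tmp") / f"bt_u{uid}")

    for path in candidates:
        if path.is_file():
            return path.read_text().strip()
    return None


# Demonstration with an explicit environment and a dummy token
print(find_bearer_token({"BEARER_TOKEN": "dummy-token"}, uid=0))
```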

Using a token with the GWDataFind client

htgettoken --audience https://datafind.igwn.org --scopes gwdatafind.read
python3 -m gwdatafind -r datafind.igwn.org -o H -t H1_HOFT_C00 --latest
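At the HTTP level, a discovered token is presented to the service as a bearer token in the Authorization header. The sketch below shows only that mechanism, using the Python standard library; it is not how the gwdatafind client is implemented, it sends nothing over the network, and the token value is a dummy.

```python
import urllib.request


def authorised_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the token as a bearer."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )


request = authorised_request("https://datafind.igwn.org/", "dummy-token")
print(request.get_header("Authorization"))  # Bearer dummy-token
```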

HTCondor

For details on using SciTokens with HTCondor workflows, please see Using IGWN credentials with HTCondor.