Using IGWN credentials with HTCondor

If your workflow jobs require access to restricted IGWN services, you may need to configure your job to include an authorisation credential.

There are two types of credentials you can use with HTCondor jobs.

Kerberos

Work in progress

Documenting best-practice for using Kerberos credentials in an HTCondor workflow is a work in progress.

Please consider contributing to help us complete this section.

SciTokens

SciTokens are capability tokens that inform services that the bearer of the token should be allowed access to a specific capability (e.g. read) on a specific service. See SciTokens for more details on what SciTokens are and how to use them in general.

For full details on specifying credentials in a job, please see

https://htcondor.readthedocs.io/en/lts/users-manual/submitting-a-job.html#jobs-that-require-credentials

SciToken issuers

There are two main ways to generate tokens for use within HTCondor: an access-point issuer, and the IGWN issuer, which will invoke htgettoken on your behalf. In the following documentation we will attempt to capture the various differences between these two approaches, but at a high level, an access-point issuer (AP issuer) is simpler to use but supports much less customization.

Each access point (submit node) supports only one issuer

At present, HTCondor only permits an access point ("submit node") to support one type of issuer: either the AP issuer or the IGWN (vault) issuer. As the features and syntax differ depending on the type of issuer, you should consider carefully which features are needed for your workflow, and be aware that a submit file that is valid on one submit node will not work on another submit node that uses the other type of issuer.

Note that even on an access point using the AP issuer, the htgettoken command still exists and may be used if SciTokens are required for reasons other than their use in HTCondor. It is simply not the mechanism that HTCondor uses to acquire and manage tokens on a submitter's behalf.
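
For reference, a typical interactive htgettoken invocation outside of HTCondor might look like the following; the vault server name is taken from vault.ligo.org mentioned further down this page, and the exact options recommended for IGWN are documented in the SciTokens how-to linked above:

# request an IGWN SciToken directly, outside of any HTCondor job
# (sketch only; -a names the vault server, -i the token issuer)
htgettoken -a vault.ligo.org -i igwn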

At present, the only production access point supporting the AP issuer is the host ldas-osg3.caltech.ligo.edu at CIT.

Differences in the syntax and behavior of the two issuers are covered in the topics below, but for reference the following two tables summarize the differences in features and syntax that are explained with more context further down this page.

Services supported by each type of issuer

| Service                                   | AP Issuer | IGWN Issuer |
|-------------------------------------------|-----------|-------------|
| GraceDB                                   | ❌        | ✅          |
| Segment Database                          | ✅        | ✅          |
| GWDataFind                                | ✅        | ✅          |
| OSDF                                      | ✅        | ✅          |
| CVMFS                                     | ✅        | ✅          |
| Access /igwn/cit from single user account | ✅        | ✅          |
| Access /igwn/cit from shared account      | ✅        | ❌          |

Since GraceDB does not support the AP token issuer, if your workflow needs to use that issuer for other reasons and you also require access to GraceDB, you should set up the workflow to use X.509 credentials for GraceDB access.
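
As a sketch of what that combination looks like in a submit file (the rest of the submit file will depend on your workflow), such a job would request an AP-issued token for the other services and pick up a manually-generated X.509 proxy for GraceDB:

# AP-issued SciToken for services that support it (see the table above)
use_oauth_services = scitokens

# manually-generated X.509 proxy for GraceDB access (see the X.509 section below)
use_x509userproxy = true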

Syntax and commands for each type of issuer

| Syntax or command                                       | AP Issuer                                        | IGWN Issuer                                 |
|---------------------------------------------------------|--------------------------------------------------|---------------------------------------------|
| Submit file: use_oauth_services =                       | scitokens                                        | igwn                                        |
| Custom "audience" via <issuer>_oauth_resource           | ❌                                               | ✅                                          |
| Custom "scopes" via <issuer>_oauth_permissions          | ❌                                               | ✅                                          |
| Format for transfer_input_files                         | osdf://                                          | igwn+osdf://                                |
| BEARER_TOKEN_FILE for CVMFS access                      | $$(CondorScratchDir)/.condor_creds/scitokens.use | $$(CondorScratchDir)/.condor_creds/igwn.use |
| Multiple tokens using "handles"                         | ❌                                               | ✅                                          |
| Requires interactive response                           | ❌                                               | ✅                                          |
| Must call condor_vault_storer before condor_submit_dag  | ❌                                               | ✅                                          |
| Robots require Kerberos and HTGETTOKENOPTS              | ❌                                               | ✅                                          |
| Generate token outside of HTCondor                      | Not possible                                     | With htgettoken                             |

The basic usage valid for the majority of IGWN contexts is as follows:

Using a token in a job

To use a SciToken in an HTCondor job, add the commands appropriate to your access point's issuer to your HTCondor submit instructions.

AP issuer:

use_oauth_services = scitokens

IGWN issuer:

use_oauth_services = <issuer>
<issuer>_oauth_resource_<handle> = <service-url>
<issuer>_oauth_permissions_<handle> = <capability>

Where

  • <issuer> is the name of the token issuer to use. For a submit node running the AP issuer, this must be scitokens. For a submit node supporting the IGWN issuer, this will be igwn for all production workflows; for some testing applications you can use igwn-test.

  • <service-url> is the fully-qualified URL of the service to access. This is also referred to as the 'audience' (aud) of the token.

    Note that this submit option is not available on a submit node configured for AP issued tokens, and should not be provided. On IGWN issuer submit nodes, it is optional, and will default to

    igwn_oauth_resource = ANY
    

    OSDF and CVMFS do not support audience URLs

    When using SciTokens to access protected data via OSDF or CVMFS, there is no URL for the service to use in the aud claim. For these use cases, just use ANY.

  • <capability> is the access level that is needed. This is also referred to as the 'scope' (scope) of the token. This can be a space-separated list of multiple scopes.

    Note that this submit option is not available on a submit node configured for AP issued tokens, and should not be provided. On IGWN issuer submit nodes, it is optional and will default to all scopes enabled for the user's 'role' (as defined by vault.ligo.org).

  • <handle> is a string of your choice which is used by HTCondor to distinguish between tokens from the IGWN Token Issuer which have different capabilities and services. See Token handles for more information.

    Handles are required for IGWN-issued tokens

    If you specify the resource and permissions for a token from the IGWN token issuer, you must provide a handle for that token. This feature is designed to mitigate name collisions between tokens with different capabilities.

For example, to enable queries to the GWDataFind service located at https://datafind.igwn.org you would use:

use_oauth_services = igwn
igwn_oauth_resource_datafind = https://datafind.igwn.org
igwn_oauth_permissions_datafind = gwdatafind.read

If doing this from a submit node that uses the AP issuer, you would have instead:

use_oauth_services = scitokens
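
For context, a fuller sketch of a complete submit file built around the IGWN-issuer snippet above might look like the following; the executable and the output/error/log file names are placeholders, not part of the original example:

universe = vanilla
# placeholder job script that performs the GWDataFind query
executable = query_datafind.sh
use_oauth_services = igwn
igwn_oauth_resource_datafind = https://datafind.igwn.org
igwn_oauth_permissions_datafind = gwdatafind.read
output = datafind.out
error = datafind.err
log = datafind.log
queue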

The precise steps triggered by these instructions will depend upon which type of token issuer the access point supports.

  • If using an AP issuer, the token will be managed directly by condor itself, and it will not call external programs on behalf of the user.
  • If using the IGWN issuer, condor will delegate management of the token(s) to an executable named condor_vault_storer, which will trigger a call to htgettoken (see How to generate a SciToken for more details on htgettoken). htgettoken will then execute its 'normal' workflow, including possibly prompting the user to complete the OIDC workflow through a web browser.

    htgettoken may require an interactive response

    If htgettoken requires OIDC authorisation, condor_submit will be forced to wait until htgettoken receives the response it needs.

    This means that users submitting jobs that use tokens from a submit node using the IGWN issuer should not run condor_submit ... and immediately walk away. Please ensure that condor_submit completes (successfully).

Token management by HTCondor

If HTCondor can successfully get a token, it will transfer it to the execute point into the .condor_creds directory. This is typically a subdirectory of the job scratch directory, but the actual path is stored in the $_CONDOR_CREDS environment variable.

For a single-token job like the one above, the token filename will be <issuer>.use, and the token will be generated on the execute machine as follows.

AP issuer:

$_CONDOR_CREDS/scitokens.use

which is typically the same as

$_CONDOR_SCRATCH_DIR/.condor_creds/scitokens.use

IGWN issuer:

$_CONDOR_CREDS/igwn.use

which is typically the same as

$_CONDOR_SCRATCH_DIR/.condor_creds/igwn.use

Most IGWN scitoken clients should be able to automatically discover the appropriate token file inside the $_CONDOR_CREDS directory, so you shouldn't actually need to care where the token file exists at the Execute Point.

To use this token with client tools that do not support discovering tokens inside $_CONDOR_CREDS (such as the CVMFS client), you can set the environment variable BEARER_TOKEN_FILE in your condor submit file:

environment = "BEARER_TOKEN_FILE=$$(CondorScratchDir)/.condor_creds/scitokens.use"
environment = "BEARER_TOKEN_FILE=$$(CondorScratchDir)/.condor_creds/igwn.use"

IGWN tokens are shared across all jobs for a user

Tokens are generated and stored on an Access Point using the IGWN issuer independently of the jobs that request them, so multiple concurrent or consecutive jobs cannot use different token permissions without special consideration. This does not apply to Access Points using the AP issuer, because no customization of permissions or services is possible for those tokens, and the token that condor generates in that case is never placed where other applications can see it.

Submitting a second job that requires different token permissions from an existing job may result in a submission failure that looks something like this:

$ condor_submit science.sub
Submitting job(s)
condor_vault_storer: Credentials exist that do not match the request.
They can be removed by
  condor_store_cred delete-oauth -s igwn
but make sure no other job is using them.
  More details might be available by running
    condor_vault_storer -v "igwn&scopes=gwdatafind.read&audience=https://datafind.igwn.org"

ERROR: (0) invoking /usr/bin/condor_vault_storer
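
If you are certain that no other running job is using the existing token, the commands suggested in the error message above will clear it so that the new submission can proceed (a recovery sketch only; using token handles, described next, avoids the clash in the first place):

# remove the stored igwn credentials and re-submit
condor_store_cred delete-oauth -s igwn
condor_submit science.sub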

The solution for this is to use token handles.

Token handles

Tokens may be given a 'handle' to allow HTCondor to distinguish between different sets of permissions and resources.

Token handles unsupported on AP Issuer

Because the AP Issuer does not support any form of token customization, it also does not support the use of token handles as described in this subsection. The rest of this subsection assumes submission from an Access Point using the IGWN issuer.

Handles are specified as a suffix to the <issuer>_oauth_resource and <issuer>_oauth_permissions commands, as follows:

use_oauth_services = igwn
igwn_oauth_resource_gracedb = https://gracedb.ligo.org
igwn_oauth_permissions_gracedb = gracedb.read

In the above example the handle is gracedb.

Tokens with handles are stored using the filename <issuer>_<handle>.use. In the above example, the token will be generated on the Access Point as

igwn_gracedb.use

and made available on the Execute Point as

.condor_creds/igwn_gracedb.use

Using handles for tokens enables submitting multiple different jobs with different token capabilities without clashes.

Using handles for tokens also enables submitting a job that requires multiple tokens.
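
To illustrate the first point, here is a sketch (using hypothetical submit file names) of two independent jobs submitted from the same account that hold differently-scoped tokens side by side, because their handles give the token files distinct names:

# analysis_a.sub: token for GWDataFind only
use_oauth_services = igwn
igwn_oauth_resource_datafind = https://datafind.igwn.org
igwn_oauth_permissions_datafind = gwdatafind.read

# analysis_b.sub: token for GraceDB only
use_oauth_services = igwn
igwn_oauth_resource_gracedb = https://gracedb.ligo.org
igwn_oauth_permissions_gracedb = gracedb.read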

Multiple tokens

To use multiple SciTokens in an HTCondor job, specify multiple tokens with unique handles in the same set of submit commands.

Multiple tokens unsupported on AP issuer

As a reminder, the AP issuer does not support token customization, including handles, and therefore cannot support multiple tokens. The rest of this subsection assumes submission from an Access Point using the IGWN issuer.

On a submit node using the IGWN issuer:

use_oauth_services = igwn
igwn_oauth_resource_token1 = <service-url-1>
igwn_oauth_permissions_token1 = <capability-1>
igwn_oauth_resource_token2 = <service-url-2>
igwn_oauth_permissions_token2 = <capability-2>

This will generate multiple token files with the following names

.condor_creds/igwn_token1.use
.condor_creds/igwn_token2.use

For example, to enable queries to GWDataFind at https://datafind.igwn.org and to GraceDB at https://gracedb.ligo.org in the same job:

use_oauth_services = igwn
igwn_oauth_resource_gwdatafind = https://datafind.igwn.org
igwn_oauth_permissions_gwdatafind = gwdatafind.read
igwn_oauth_resource_gracedb = https://gracedb.ligo.org
igwn_oauth_permissions_gracedb = gracedb.read

Simpler to use a single, multi-capability token

While using multiple tokens is valid usage, it is probably simpler to use a single token with multiple capabilities.

Refactoring the above example:

use_oauth_services = igwn
igwn_oauth_resource = https://datafind.igwn.org https://gracedb.ligo.org
igwn_oauth_permissions = gwdatafind.read gracedb.read
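
Since this refactored job requests a single token with no handle, the token will appear in the job under the default name described earlier:

.condor_creds/igwn.use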

Using tokens in a DAG

All of the above describes how to configure one (type of) job to use one or more SciTokens. Using tokens in an HTCondor DAG workflow is more complex because the DAG doesn't really know that the job(s) inside it may need tokens.

DAGs require no special considerations with the AP Issuer

The above caveat does not apply to an Access Point that manages tokens with the AP issuer: no customization of these tokens is possible, multiple tokens are not supported, and there is never a need for user interaction (such as an OIDC workflow), so the rest of this subsection does not apply. On such an Access Point, all that is needed is to have

use_oauth_services = scitokens

in the submit file of each job in the DAG, and to run condor_submit_dag as you otherwise would.

On an Access Point supporting the IGWN issuer, the DAG does not know which jobs need which tokens, so when DAGMan (the DAG manager process) starts submitting the actual jobs, the condor_submit process that it spawns will run condor_vault_storer, which in turn runs htgettoken, which may prompt the user to complete an OIDC workflow (as described above). However, as part of a DAG, this prompt is hidden inside a DAGMan log file and will (usually) not be seen by a real person, causing the job submission to fail.

In order to successfully use SciTokens in jobs inside a DAG on an Access Point that uses the IGWN Token Issuer, you need to pre-emptively run condor_vault_storer to create the necessary 'parent' token that HTCondor can use to send individual tokens to jobs:

condor_vault_storer <issuer>

where <issuer> must match the value of the use_oauth_services command in the HTCondor job submit instructions, e.g:

condor_vault_storer igwn

This execution will prompt as necessary for you to complete an OIDC workflow in a web browser, rather than having it hidden inside a DAGMan log file.

Then you can proceed to run

condor_submit_dag my.dag

Using tokens in long-running or automated ('robot') workflows

For many automated or long-running applications in IGWN, analysis groups may be issued a 'robot' Kerberos identity that is authorised to generate SciTokens. That Kerberos identity is encoded in a Kerberos keytab file that should be securely stored on the HTCondor Access Point. This identity can then be used to generate SciTokens from the IGWN issuer. The AP issuer does not rely on a Kerberos identity, including for shared accounts.

Generating SciTokens using a robot Kerberos identity requires passing extra arguments to the htgettoken tool; for details see Generating a token using a robot Kerberos keytab. However, with HTCondor (both jobs and DAG workflows) the call to htgettoken is handled internally by condor_vault_storer, so these extra arguments need to be configured via the HTGETTOKENOPTS environment variable, e.g.:

# get Kerberos ticket-granting ticket (TGT)
kinit <robot-principal>@LIGO.ORG -k -t <robot-keytab>

# set HTGETTOKENOPTS specific to this robot
export HTGETTOKENOPTS="--role <robot-role> --credkey <robot-credkey>"

# get initial token
condor_vault_storer <issuer>

# submit (including propagation of $HTGETTOKENOPTS)
condor_submit_dag robot.dag -include_env HTGETTOKENOPTS

Where <robot-principal>, <robot-keytab>, <robot-role>, and <robot-credkey> should be exactly as assigned by the Security, Identity, and Access Management team that issued the robot keytab.

HTCondor 23.4.0 makes this easier

HTCondor 23.4.0 will include a new <issuer>_oauth_options submit command that will make setting custom htgettoken options much easier.

When that is available on the Access Point you can configure your jobs with

igwn_oauth_options = --role <robot-role> --credkey <robot-credkey>

instead of setting the HTGETTOKENOPTS environment variable. For example:

use_oauth_services = igwn
igwn_oauth_permissions = gwdatafind.read gracedb.read read:/ligo read:/virgo
igwn_oauth_options = --role <robot-role> --credkey <robot-credkey>

You can check the version of HTCondor that is available by running

condor_version

The AP Issuer does not require kerberos

As already noted, all of the considerations above about Kerberos and HTGETTOKENOPTS apply only to the IGWN issuer. The AP issuer does not interact with Kerberos or call htgettoken at all, so on Access Points that use the AP issuer nothing is required beyond

use_oauth_services = scitokens

This is also true for shared accounts. However, if your workflow also requires an X.509 credential (for example, to access GraceDB), that credential may require renewal.

Using tokens with transfer_input_files

When running a job that needs access to protected data managed using HTCondor file transfer, the SciToken should be configured as above and the issuer added as a prefix to the URL scheme in the transfer_input_files argument(s). The prefix should be omitted for the AP Issuer, because the file-transfer plugin assumes the scitokens issuer when no issuer is specified before osdf://.

AP issuer:

use_oauth_services = scitokens
transfer_input_files = osdf:///igwn/...

E.g:

use_oauth_services = scitokens
transfer_input_files = osdf:///igwn/ligo/README

IGWN issuer:

use_oauth_services = <issuer>
transfer_input_files = <issuer>+osdf:///igwn/...

E.g:

use_oauth_services = igwn
transfer_input_files = igwn+osdf:///igwn/ligo/README

When using token handles, the handle should be included in the 'scheme' of the URL separated from the <issuer> name by a . (period):

use_oauth_services = <issuer>
<issuer>_oauth_permissions_<handle> = <scopes>
transfer_input_files = <issuer>.<handle>+osdf:///igwn/...

E.g:

use_oauth_services = igwn
igwn_oauth_permissions_ligo = read:/ligo
transfer_input_files = igwn.ligo+osdf:///igwn/ligo/README

This tells the relevant file-transfer plugin which access token to use when attempting to download the data. See Examples below for IGWN-specific examples.

See also transfer_input_files for more details.

Examples:

1. Downloading proprietary IGWN data via OSDF

The following HTCondor submit commands can be used to configure a job with the necessary permissions to transfer IGWN proprietary h(t) data from OSDF to a job:

AP issuer:

use_oauth_services = scitokens
should_transfer_files = yes
transfer_input_files = osdf:///igwn/ligo/frames/O4/hoft_C00/H1/H-H1_HOFT_C00-137/H-H1_HOFT_C00-1373577216-4096.gwf

IGWN issuer:

use_oauth_services = igwn
igwn_oauth_permissions_frames = read:/kagra read:/ligo read:/virgo
should_transfer_files = yes
transfer_input_files = igwn.frames+osdf:///igwn/ligo/frames/O4/hoft_C00/H1/H-H1_HOFT_C00-137/H-H1_HOFT_C00-1373577216-4096.gwf

2. Reading proprietary IGWN data from CVMFS

Accessing data via OSDF is recommended over CVMFS

Accessing data using transfer_input_files and osdf:// URLs is recommended instead of using CVMFS - the former is more reliable and significantly easier to debug when things go wrong.

By construction, all IGWN data in CVMFS must also be available via OSDF, and both methods use the same distributed cache infrastructure, so there should be no notable performance difference between the two (when things are working).

The following condor submit commands can be used to configure a job with the necessary permissions to read IGWN proprietary h(t) data from CVMFS:

AP issuer:

use_oauth_services = scitokens
environment = "BEARER_TOKEN_FILE=$$(CondorScratchDir)/.condor_creds/scitokens.use"

IGWN issuer:

use_oauth_services = igwn
igwn_oauth_permissions_frames = read:/kagra read:/ligo read:/virgo
environment = "BEARER_TOKEN_FILE=$$(CondorScratchDir)/.condor_creds/igwn_frames.use"

Authenticated CVMFS requires $BEARER_TOKEN_FILE

The helper tool that does credential handling for CVMFS does not know to look into the $_CONDOR_CREDS directory of an HTCondor job, so it is required to set the BEARER_TOKEN_FILE environment variable to enable CVMFS to use the token transferred with the job.
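
As an optional sanity check (a sketch, not something CVMFS itself requires), the job script can confirm that the transferred token is readable at the path BEARER_TOKEN_FILE points to before attempting any reads from authenticated CVMFS:

# confirm the token transferred with the job is where CVMFS will look for it
test -r "${BEARER_TOKEN_FILE}" && echo "token readable at ${BEARER_TOKEN_FILE}"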

X.509

X.509 is no longer fully supported

Identity-based X.509 credentials are deprecated in favour of capability-based SciTokens in almost all cases, so please consider using the instructions for tokens above.

For details on the timescale on which support for X.509 certificates will be fully dropped, please see

https://git.ligo.org/groups/computing/-/epics/25

For details on which use cases still require X.509 over SciTokens, please contact the Computing and Software Working Group (compsoft@ligo.org).

X.509 is a credential standard used to encode an identity so that a service can authenticate a request and enable capabilities based on its own records of what users should be allowed to do.

See X.509 for more details on what X.509 is and how to use it in general.

Using an X.509 credential in a job

Generate the X.509 credential manually

Using X.509 with HTCondor requires manually generating the credential before submitting the job.

Please see How to generate a credential for documentation on how to generate an X.509 credential.

To use an X.509 credential file in an HTCondor job, add one of the following commands to your submit instructions:

To automatically discover the credential file based on your environment:

use_x509userproxy = true

To manually specify the path of the credential file:

x509userproxy = /path/to/myproxy.pem

In either case, the credential will be transferred onto the execute machine with your job and its path encoded in the $X509_USER_PROXY environment variable.
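
As a minimal sketch (the executable and output/error/log file names are placeholders), a complete submit file using an automatically discovered X.509 proxy might look like:

universe = vanilla
# placeholder executable that talks to an X.509-authenticated service
executable = my_x509_script.sh
use_x509userproxy = true
output = x509_job.out
error = x509_job.err
log = x509_job.log
queue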