
Job submission

Accounting tags are required

Jobs without accounting tags which are submitted from IGWN hosts will fail instantly. See: Accounting.
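Accounting information is passed to HTCondor through the accounting_group and accounting_group_user submit file commands. As a minimal sketch, where both values below are placeholders and valid tags are listed in the Accounting documentation:

accounting_group = your.accounting.tag
accounting_group_user = albert.einstein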

Collections of resources where you can run jobs in HTCondor are called "pools".

In this section, we discuss some special requirements and options to target the pool and resources in which your jobs run.

Submission hosts

Workflows can be submitted to the IGWN Grid only from specific machines.

For IGWN members, those machines are:

Hostname Location Notes
ldas-osg.ligo.caltech.edu Caltech (LIGO) Local & IGWN pools available
ldas-osg.ligo-wa.caltech.edu Hanford (LIGO) Local & IGWN pools available
ldas-osg.ligo-la.caltech.edu Livingston (LIGO) Local & IGWN pools available
stro.nikhef.nl Nikhef IGWN Grid-only, limited /home space

Login follows the same directions used to connect to generic collaboration resources, so both ssh and gsissh access are supported.

Each submit host is configured to connect to the underlying HTCondor workload manager. Any computing task you wish to run on the IGWN pool should be submitted from one of these hosts.

IGWN Grid job requirements

Jobs submitted from IGWN Grid submit hosts will run on the IGWN Grid without any additional entries in the submit file requirements.
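As an illustrative sketch (the executable path and accounting tag below are placeholders), a minimal submit file for an IGWN Grid job might look like:

universe = Vanilla
executable = /path/to/some_executable
accounting_group = your.accounting.tag
log = example.log
error = example.err
output = example.out
queue 1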

IGWN Grid jobs run through the glide-in model for distributed computing: special "pilot" jobs run on the native batch workflow management system at a computing site and provision slots for your HTCondor jobs to run on.

At locations that offer a local pool in addition to the IGWN pool, you may find it useful for testing purposes to require that your jobs run through a glidein, preventing them from using any available resources in the local pool:

requirements = (IS_GLIDEIN=?=True)

This is not a prerequisite for using the IGWN Grid, however, and jobs submitted from IGWN Grid hosts should behave identically regardless of the pool they land in, provided they have been configured as described in this documentation.

Other requirements may be given to tell HTCondor about additional resources needed by your job(s). The following table summarises some relevant requirement expressions and what they mean for the target execute host:

Requirement Description
(HAS_LIGO_FRAMES=?=True) GWF data files are available via CVMFS (paths as returned from datafind.ligo.org:443, see CVMFS data discovery)
(HAS_SINGULARITY=?=True) jobs can run within Singularity containers

Specifying multiple requirements to HTCondor

Multiple requirements should be combined using the logical AND operator (&&):

requirements = (HAS_LIGO_FRAMES=?=True) && (HAS_SINGULARITY=?=True)
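For context, a sketch of a complete submit file using this requirements line (reusing the placeholder executable from the examples further down this page) might look like:

universe = Vanilla
executable = /lalapps_somejob
requirements = (HAS_LIGO_FRAMES=?=True) && (HAS_SINGULARITY=?=True)
log = example.log
error = example.err
output = example.out
queue 1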

Restricting sites

You shouldn't need to do this for most use-cases

If requirements are set correctly, there should be no need to restrict the site at which jobs run. If site-specific problems do occur, the first action should be to notify the IGWN computing community via the help-desk. Restricting sites is a stop-gap solution which may help urgent workflows complete or facilitate special-case data-access patterns.

To opt in to specific sites, use the +DESIRED_Sites HTCondor directive in your submit file.

Restricting the sites jobs run at

Restrict jobs to a specific site, using double-quotes:

+DESIRED_Sites = "LIGO-CIT"

Multiple desired sites can be declared as a comma-separated list:

+DESIRED_Sites = "LIGO-CIT,GATech"

To opt out of (blacklist) a site, mark it as undesired:

+UNDESIRED_Sites = "GATech"
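As a sketch of how this fits into a complete submit file (again using the placeholder executable from the local-pool examples below), a job restricted to two sites might be described as:

universe = Vanilla
executable = /lalapps_somejob
+DESIRED_Sites = "LIGO-CIT,GATech"
log = example.log
error = example.err
output = example.out
queue 1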

Available sites include (but are not necessarily limited to):

Label Location
BNL Brookhaven National Lab. (USA)
CCIN2P3 IN2P3, Lyon (France)
CNAF INFN (Italy)
GATech Georgia Tech (USA)
KISTI KISTI (Korea)
LIGO-CIT Caltech (USA)
LIGO-WA LIGO Hanford Observatory (USA)
LIGO-LA LIGO Livingston Observatory (USA)
NIKHEF Nikhef (Netherlands)
QB2 LSU (USA)
RAL RAL (UK)
SDSC-PRP Pacific Research Platform (Global, mostly USA)
SU-ITS Syracuse (USA)
SuperMIC LSU (USA)
UChicago Univ. of Chicago (USA)
UCSD UCSD (USA)

Using a local pool

Some IGWN Grid submit hosts provide access to both the IGWN Grid pool and a local cluster pool. In the case of ldas-osg.ligo.caltech.edu for example, this local pool is the "usual" CIT cluster accessed through ldas-grid.ligo.caltech.edu and, as such, allows access to resources like the shared filesystem, including /home directories.

Usage of the local pool operates under an opt-in model, where jobs must "flock" from the IGWN Grid to the local pool. Jobs which are not able to run on IGWN Grid resources must be further restricted to only run in the local pool.

HTCondor submit file for jobs in a local pool

Allow jobs to run in a local pool using +flock_local:

universe = Vanilla
executable = /lalapps_somejob
+flock_local = True
log = example.log
error = example.err
output = example.out
queue 1

Restrict jobs to the local pool by enabling flocking and disallowing all external sites:

universe = Vanilla
executable = /lalapps_somejob
+flock_local = True
+DESIRED_Sites = "none"
log = example.log
error = example.err
output = example.out
queue 1

Finally, to restrict jobs to the local pool and leverage a shared filesystem, disable file transfers:

universe = Vanilla
executable = /lalapps_somejob
should_transfer_files = NO
+flock_local = True
+DESIRED_Sites = "none"
log = example.log
error = example.err
output = example.out
queue 1

Using the local pool can be a powerful tool for, e.g.:

  • pre-processing jobs that extract data for further distribution via HTCondor file transfer, increasing the number of potential sites at which you can run
  • short-duration post-processing jobs such as output data aggregation or web page generation.