
HTCondor .sub generation tutorial

Your computing task can be uniquely described by its input(s), the executable to run, the parameters of that executable, and the produced output(s).

Following this schema it is trivial to create an HTCondor-specific job descriptor (namely a .sub file, or submit file), which can be used to launch a specific task on the IGWN pool.

Every example reported in the following sections is available for download on GitHub. Feel free to fork and clone this repository.

How to describe a job

A skeleton of a valid submit file is:

universe = vanilla

transfer_input_files =
executable =
arguments =
transfer_output_files =

log = std.log
output = std.out
error = std.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

accounting_group = something.somethingelse.somethingmore
accounting_group_user = albert.einstein

queue 1

where the transfer_input_files, executable, arguments and transfer_output_files fields have to be filled in, and the log, output and error fields can be modified to redirect logging, output and error messages to other files. The should_transfer_files and when_to_transfer_output settings are related to the job submission mechanism and shouldn't be changed. The accounting_group and accounting_group_user fields should be filled using the tags generated from this site and (typically) your own name in the form albert.einstein. The GitHub repository contains two tools, set_IGWN_group.sh and set_IGWN_user.sh, which are provided to modify the accounting information of the submit files in bulk. Use them according to the information reported above (e.g. ./set_IGWN_user.sh albert.einstein). The final queue command is what triggers the enqueuing of the task.
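Once the submit file is complete, the job can be enqueued and monitored from the submit node. A minimal workflow (assuming the file above is saved as my_job.sub) is:

```
# Submit the job described in my_job.sub; HTCondor prints the assigned cluster ID
condor_submit my_job.sub

# Check the status of your jobs in the queue
condor_q

# Once the job has left the queue, inspect the collected output
cat std.out
```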

1. Input/Output -less job

In order to figure out how to fill in the transfer and executable fields, let's think about a simple example where one simply wants to run the following bash command on the worker node:

ls -lrt .

which will list the contents of the working directory of the job on the HTCondor worker node. No input is needed for this operation, nor is an output file created: the only interesting output ends up in the std.out file, where the output of ls will be streamed.

In this case the submit file becomes:

universe = vanilla

executable = /bin/ls
arguments = -lrt .

log = std.log
output = std.out
error = std.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

accounting_group = something.somethingelse.somethingmore
accounting_group_user = albert.einstein

queue 1

As you can see the executable is /bin/ls, while all additional arguments are put in the arguments field. The executable path should not depend on the working directory, hence it should be absolute.

2. Input-only job

Let's now describe a job which requires a file as input and which prints the file contents to std.out.

Let's assume a file something_to_print.txt is present alongside the .sub file.

The command to print the file content would be:

cat ./something_to_print.txt

In this case the submit file becomes:

universe = vanilla

transfer_input_files = ./something_to_print.txt
executable = /bin/cat
arguments = ./something_to_print.txt

log = std.log
output = std.out
error = std.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

accounting_group = something.somethingelse.somethingmore
accounting_group_user = albert.einstein

queue 1

The path given in the transfer_input_files field is relative to the directory where the job submission happens on the submit host. The main executable is /bin/cat and the argument is the path of the transferred input file, relative to the working directory of the job on the worker node.

The inputs are transferred at job submission automatically.

3. Output-only job

A job which takes no input and generates a file or a directory can be represented by the following command:

touch my_test_output_file.txt

which creates an empty file named my_test_output_file.txt.

In this case the submit file becomes:

universe = vanilla

executable = /bin/touch
arguments = my_test_output_file.txt
transfer_output_files = ./my_test_output_file.txt

log = std.log
output = std.out
error = std.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

accounting_group = accounting_group_goes_here
accounting_group_user = albert.einstein

queue 1

which should be of trivial interpretation. Please note that, should the arguments field contain literal double quotes, they have to be escaped with HTCondor syntax, which requires doubled double quotes ("") to make them interpreted as the " character.
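As a hypothetical illustration of that escaping rule (not part of this example), an arguments field containing literal double quotes would be written with the doubled-quote syntax:

```
# Hypothetical example: each literal " inside the quoted argument list is written as ""
executable = /bin/echo
arguments = "Hello ""quoted"" world"
```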

4. Input/Output job

For this example let's assume one wants to perform a byte copy of an input file, creating a new file as output. Since shell redirection cannot be expressed in the arguments field, the copy is done with cp. The command to run is:

cp ./my_input.txt ./my_output.txt

The submit file becomes:

universe = vanilla

transfer_input_files = ./my_input.txt
executable = /bin/cp
arguments = my_input.txt my_output.txt
transfer_output_files = my_output.txt

log = std.log
output = std.out
error = std.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

accounting_group = accounting_group_goes_here
accounting_group_user = albert.einstein

queue 1

which should be trivial to understand given the previous examples.
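The copy step can be reproduced locally on the submit node as a sanity check before submitting:

```shell
# Create a sample input file
echo "some test content" > my_input.txt

# This is exactly what the worker node will execute
cp my_input.txt my_output.txt

# Verify the copy is byte-identical to the input
cmp my_input.txt my_output.txt && echo "files match"
```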

5. Script-executing job

Let's assume a job runs a bash script named script.sh available on the submit node. In this case the custom script is set as the executable; since HTCondor transfers the executable to the worker node automatically, no transfer_input_files entry is needed.
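The script itself is not part of this tutorial; a hypothetical script.sh consistent with this example (taking one argument and producing a surprise.txt output file) could look like:

```shell
#!/bin/bash
# Hypothetical script.sh: writes its first argument into surprise.txt,
# which is then retrieved via transfer_output_files
echo "Argument received: $1" > surprise.txt
```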

The submit file becomes:

universe = vanilla

executable = ./script.sh
arguments = test_argument
transfer_output_files = ./surprise.txt

log = std.log
output = std.out
error = std.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

accounting_group = accounting_group_goes_here
accounting_group_user = albert.einstein

queue 1

HTCondor goodies

HTCondor job IDs

To better organize the output files collected on the submit node, it might be useful to append or prepend the job ID to the .out, .log or .err file names. To do so it is possible to use, inside the .sub file, a plethora of automatically expanded macros. For example $(CLUSTER) and $(PROCESS) are expanded to the cluster and process IDs of the job. The .sub file of the last example would become:

universe = vanilla

transfer_input_files = ./script.sh
executable = ./script.sh
arguments = <put your arguments here>
transfer_output_files = <register here the outputs you want to retrieve>

log = std-$(CLUSTER)-$(PROCESS).log
output = std-$(CLUSTER)-$(PROCESS).out
error = std-$(CLUSTER)-$(PROCESS).err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

accounting_group = something.somethingelse.somethingmore
accounting_group_user = albert.einstein

queue 1

where the log, output and error file names will be expanded to include the job and sub-job IDs.
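These macros become especially useful when enqueuing several instances of the same task, since each sub-job then writes to its own files (e.g. std-1234-0.out, std-1234-1.out, and so on, where 1234 is a hypothetical cluster ID):

```
# Enqueue five sub-jobs sharing one cluster ID; $(PROCESS) expands to 0..4
queue 5
```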

Using X.509 credentials in a job

In case some of the operations you need to run in the HTCondor job require a valid GRID or VOMS proxy, add the following line to your submit file:

use_x509userproxy = true

Always remember to create a valid proxy on the submit node before calling condor_submit, via:

voms-proxy-init --voms virgo:virgo/virgo [--valid HH:MM]

or via:

ligo-proxy-init albert.einstein

where HH and MM express the required proxy duration (capped at 144 hours, or 6 days).

Please note that the use_x509userproxy = true setting looks for the $X509_USER_PROXY environment variable. In order to point to a specific proxy other than the default location, one can use the following equivalent approach:

x509userproxy = <path-to-custom-proxy>

How to run on a specific site

In order to run on a specific site it is enough to add the following line to your submit file:

+OpenScienceGrid = True
Requirements = (IS_GLIDEIN=?=True)
+DESIRED_Sites = "SITE-NAME"

where SITE-NAME is one (or a comma-separated list) of the sites listed here.

For example, to run exclusively at CNAF, add:

+OpenScienceGrid = True
Requirements = (IS_GLIDEIN=?=True)
+DESIRED_Sites = "CNAF"

Avoid executable transfer

By default HTCondor transfers the executable from the submit node to the worker nodes. To avoid that behaviour (e.g. when you are sure the executable is already installed on the worker node, for instance via CVMFS) a specific flag can be added to the submit file:

transfer_executable = False
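For instance, assuming a hypothetical tool already distributed via CVMFS (the path below is purely illustrative), the relevant submit file lines would be:

```
# The executable is resolved on the worker node itself, not copied from the submit node
executable = /cvmfs/software.igwn.org/path/to/my_tool
transfer_executable = False
```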

Propagate your env to the worker nodes

To propagate the environment variables from the submit node to the worker nodes add the following line anywhere in the .sub file:

environment = "KEY=VALUE KEY2=VALUE2"
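where each KEY=VALUE pair defines one variable in the job's environment. To copy the whole environment of the submitting shell instead of listing variables explicitly, HTCondor also provides the getenv command:

```
# Copy the full environment of the shell running condor_submit into the job
getenv = True
```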