Array Jobs

A common requirement is to run the same job many times with different input parameters. Whilst this could be done by submitting many individual jobs, an array job is more efficient and robust. Using an array job also lets you avoid the maximum jobs per user limit and manage the submission process more cleanly.

Arrays can be thought of as a for loop:

for NUM in 1 2 3
do
    echo $NUM
done

This loop is equivalent to the following array job:

#!/bin/bash
#$ -cwd                     # Run the code from the current directory
#$ -j y                     # Merge the standard output and standard error
#$ -l h_rt=24:00:00         # Limit each task to 24 hrs
#$ -t 1-3                   # Run tasks 1 to 3
echo "$SGE_TASK_ID"

Here the -t flag configures the number of iterations in your qsub script and the counter (the equivalent of $NUM in the for loop example) is $SGE_TASK_ID.

To run an array job, use the -t option to specify the range of tasks to run. When the job runs, the script is executed once for each value specified by -t, with $SGE_TASK_ID set to that value. The range does not have to be a simple 1-N: -t accepts the form start-end:step, so it can start from any number, end on any larger number and move in increments larger than 1. For instance, -t 20-30:5 produces the task IDs 20, 25 and 30.
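To preview which task IDs a given range produces before submitting, the same start, end and step can be passed to seq. This is a minimal local sketch that does not talk to the scheduler; FIRST, LAST and STEP are illustrative names, not Grid Engine variables.

```shell
#!/bin/bash
# Preview the task IDs that "-t 20-30:5" would generate.
# FIRST, LAST and STEP are illustrative names, not scheduler variables.
FIRST=20
LAST=30
STEP=5

# seq takes FIRST INCREMENT LAST, mirroring the -t FIRST-LAST:STEP format.
# paste joins the lines of output into one space-separated line.
TASK_IDS=$(seq "$FIRST" "$STEP" "$LAST" | paste -sd' ' -)
echo "Task IDs: $TASK_IDS"
```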

The only difference between the individual tasks is the value of the $SGE_TASK_ID environment variable. This value can be used to reference different parameter sets etc. from within a job.
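One way to reference different parameter sets is to keep them in a text file with one set per line, and let each task read the line that matches its $SGE_TASK_ID. The sketch below assumes a hypothetical params.txt with two values per line (named RATE and SEED here purely for illustration); it falls back to task 2 when run outside the scheduler.

```shell
#!/bin/bash
# Sketch: pick one parameter set per task from a text file.
# params.txt, RATE and SEED are hypothetical; line N holds the values for task N.
WORK_DIR=$(mktemp -d)
cat > "$WORK_DIR/params.txt" <<'EOF'
0.1 42
0.5 43
0.9 44
EOF

# Inside an array job the scheduler sets SGE_TASK_ID; default to 2 so the
# sketch can also be tried outside the scheduler.
SGE_TASK_ID=${SGE_TASK_ID:-2}

# Select line number SGE_TASK_ID, then split it into two variables.
PARAMS=$(sed -n "${SGE_TASK_ID}p" "$WORK_DIR/params.txt")
read -r RATE SEED <<EOF
$PARAMS
EOF
echo "Task $SGE_TASK_ID: rate=$RATE seed=$SEED"
```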

Processing Files

If you need to process lots of files, you can build a list of them using ls -1. For example, if your files are all named EN<something>.txt:

ls -1 EN*.txt > list_of_files.txt

Now find out how many files there are:

$ wc -l list_of_files.txt
35 list_of_files.txt

Then set the -t value to the appropriate number:

#$ -t 1-35
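Rather than copying the count by hand, the range can be computed at submission time, since -t may also be given on the qsub command line. The sketch below only prints the command it would run; array_job.sh is a hypothetical script name, and the three EN files are stand-ins created for the example.

```shell
#!/bin/bash
# Sketch: build the qsub command from the length of the file list.
# array_job.sh is a hypothetical name for your submission script.
WORK_DIR=$(mktemp -d)
cd "$WORK_DIR" || exit 1
touch EN001.txt EN002.txt EN003.txt   # stand-ins for real input files
ls -1 EN*.txt > list_of_files.txt

# Redirecting into wc prints only the number, without the file name.
NUM_FILES=$(wc -l < list_of_files.txt)
# Some wc implementations pad the number with spaces; arithmetic
# expansion leaves just the digits.
NUM_FILES=$((NUM_FILES))

SUBMIT_CMD="qsub -t 1-${NUM_FILES} array_job.sh"
echo "$SUBMIT_CMD"
```

If you take this approach, it is probably simplest to leave the -t line out of the script itself so that the range is defined in one place only.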

You can then use sed to select the correct line of the file for each iteration:

INPUT_FILE=$(sed -n "${SGE_TASK_ID}p" list_of_files.txt)

Which results in the final script:

#!/bin/bash
#$ -cwd             # Run the code from the current directory
#$ -j y             # Merge the standard output and standard error
#$ -l h_rt=24:00:00 # Limit each task to 24 hrs
#$ -t 1-35

INPUT_FILE=$(sed -n "${SGE_TASK_ID}p" list_of_files.txt)
example-program < "$INPUT_FILE"
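If the -t range is larger than the number of lines in the list, sed prints nothing and INPUT_FILE ends up empty. A small guard can catch this before the program is started. The following is a sketch only; select_input is a hypothetical helper name, and the two-line list is created just for the example.

```shell
#!/bin/bash
# Sketch: stop early if the task ID does not match a line in the list.
WORK_DIR=$(mktemp -d)
printf 'EN001.txt\nEN002.txt\n' > "$WORK_DIR/list_of_files.txt"

select_input() {
    # $1 = task ID, $2 = list file; prints the matching line, fails if none.
    line=$(sed -n "${1}p" "$2")
    [ -n "$line" ] || return 1
    printf '%s\n' "$line"
}

# Task 2 has a matching line in the list.
if INPUT_FILE=$(select_input 2 "$WORK_DIR/list_of_files.txt"); then
    echo "Task 2 would process $INPUT_FILE"
fi

# Task 5 is beyond the end of the list.
if ! select_input 5 "$WORK_DIR/list_of_files.txt" > /dev/null; then
    echo "Task 5 has no input file; a real task would exit here" >&2
fi
```

In a real job script the task ID argument would be "$SGE_TASK_ID" rather than a fixed number.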

Processing Directories

Consider processing the contents of a collection of 1000 directories, called test1 to test1000.

#!/bin/bash
#$ -cwd            # Run the code from the current directory
#$ -j y            # Merge the standard output and standard error
#$ -l h_rt=1:00:00 # Limit each task to 1 hr
#$ -t 1-1000
cd "test${SGE_TASK_ID}" || exit 1   # Stop if the directory is missing
./program < input

Tasks are started in order of the array index.
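When each directory takes only a short time to process, the step form of -t can be used so that one task handles a block of directories. The sketch below assumes the job was submitted with -t 1-1000:10, in which case Grid Engine sets $SGE_TASK_ID to 1, 11, 21 and so on, and $SGE_TASK_STEPSIZE to 10. TOTAL_DIRS is a hypothetical variable, and the defaults exist only so the sketch runs outside the scheduler.

```shell
#!/bin/bash
# Sketch: one task handles a block of directories rather than a single one.
# Assumes submission with "#$ -t 1-1000:10". The defaults below are for
# trying the script locally; in a real job the scheduler sets both values.
SGE_TASK_ID=${SGE_TASK_ID:-11}
SGE_TASK_STEPSIZE=${SGE_TASK_STEPSIZE:-10}
TOTAL_DIRS=1000   # hypothetical: number of testN directories

FIRST_DIR=$SGE_TASK_ID
LAST_DIR=$((SGE_TASK_ID + SGE_TASK_STEPSIZE - 1))
# Do not run past the last directory.
if [ "$LAST_DIR" -gt "$TOTAL_DIRS" ]; then
    LAST_DIR=$TOTAL_DIRS
fi

DIRS=""
for NUM in $(seq "$FIRST_DIR" "$LAST_DIR"); do
    DIRS="$DIRS test${NUM}"
    # A real task would do the work here, for example:
    #   (cd "test${NUM}" && ./program < input)
done
echo "This task covers:$DIRS"
```

Remember to scale h_rt accordingly, since each task now does the work of several.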

Task concurrency

Task concurrency (-tc N) is the maximum number of array tasks allowed to run at the same time. It can be used to limit the number of running tasks for larger jobs, and for jobs that may impact storage performance.

If your tasks may read from or write to the same files on the filesystem, you may need to use this option to avoid filesystem blocking. Also, large numbers of tasks starting or finishing at the same moment put extra load on the scheduler; the -tc throttle can limit this.

#!/bin/bash
#$ -cwd            # Run the code from the current directory
#$ -j y            # Merge the standard output and standard error
#$ -l h_rt=1:00:00 # Limit each task to 1 hr
#$ -t 1-1000
#$ -tc 5
cd "test${SGE_TASK_ID}" || exit 1   # Stop if the directory is missing
./program < input
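Throttling trades throughput for lower load, so it is worth estimating how long the whole array may take. As a rough sketch using the numbers from the script above, and ignoring any time spent waiting in the queue:

```shell
#!/bin/bash
# Sketch: rough upper bound on total run time when tasks are throttled.
NUM_TASKS=1000      # from -t 1-1000
CONCURRENCY=5       # from -tc 5
HOURS_PER_TASK=1    # from h_rt=1:00:00

# Number of "waves" of tasks, rounded up to a whole number.
WAVES=$(( (NUM_TASKS + CONCURRENCY - 1) / CONCURRENCY ))
MAX_HOURS=$(( WAVES * HOURS_PER_TASK ))
echo "At most $WAVES waves, up to $MAX_HOURS hours once tasks start running"
```

With only 5 tasks running at once, 1000 one-hour tasks could take up to 200 hours, so choose the -tc value with the overall turnaround in mind.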

Concurrency default value

If a -tc value is not supplied, we apply a default value of 100 to array jobs. This is to avoid accidental impact on shared resources such as storage. You may set a higher concurrency value than this, but please watch for any issues your job might cause, such as every array task writing to a single file.

You can alter the tc value while the job is running with qalter. For example, to change the concurrency of an array job to a value of ten:

qalter -tc 10 <jobid>

Deleting specific tasks from a queued array job

If you only want to delete certain tasks from an array job (for example, tasks 1-10 are running but you want to delete tasks 20-60), use the -t option of qdel (see the man page).

For example, if the job ID is 3388 and you want to delete tasks 20-60 from the array while leaving the rest running, run:

qdel 3388 -t 20-60

Need help?

If you need help writing or using array job submission scripts, please see Getting Help.