Array jobs

A common requirement is to be able to run the same job a large number of times, with different input parameters. Whilst this could be done by submitting lots of individual jobs, a more efficient and robust way is to use an array job. Using an array job also allows you to circumvent the maximum jobs per user limitation, and manage the submission process more elegantly.

Arrays can be thought of as a for loop:

for NUM in 1 2 3
do
    echo $NUM
done

This is equivalent to the following array job script:

#!/bin/bash
#$ -cwd
#$ -pe smp 1
#$ -l h_vmem=1G
#$ -j y
#$ -l h_rt=1:0:0
#$ -t 1-3
echo ${SGE_TASK_ID}

Here the -t flag configures the number of iterations in your qsub script and the counter (the equivalent of $NUM in the for loop example) is $SGE_TASK_ID.

To run an array job, use the -t option to specify the range of tasks to run. When the job runs, the script is executed once for each value specified by -t, with $SGE_TASK_ID set to that value. The values for -t can be any integer range, with an optional step size. In the following example, -t 20-30:5 produces the values 20, 25 and 30, and so runs 3 tasks.

#!/bin/bash
#$ -cwd
#$ -pe smp 1
#$ -l h_vmem=1G
#$ -j y
#$ -l h_rt=1:0:0
#$ -t 20-30:5
echo "Sleeping for ${SGE_TASK_ID} seconds"
sleep ${SGE_TASK_ID}
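To check what a range expands to before submitting, the start, end and step can be walked through by hand. The following is a minimal sketch in plain shell that mimics how -t 20-30:5 is expanded. The variable names first, last and step are illustrative only and are not part of the scheduler.

```shell
# Mimic the expansion of "-t 20-30:5" (illustrative names, not scheduler variables)
first=20
last=30
step=5

task_ids=""
count=0
id=$first
while [ "$id" -le "$last" ]; do
    # Append the ID, adding a separating space only after the first entry
    task_ids="${task_ids:+$task_ids }$id"
    count=$((count + 1))
    id=$((id + step))
done

echo "$count tasks: $task_ids"
```

Run as written, this prints "3 tasks: 20 25 30", matching the three tasks described above.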

The only difference between the individual tasks is the value of the $SGE_TASK_ID environment variable. This value can be used to reference different parameter sets etc. from within a job.
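One common way to use the task ID, sketched here under assumptions, is to map it onto a grid of parameter combinations. The grid width N_COLS and the names row and col are hypothetical. The default value given to SGE_TASK_ID exists only so the snippet can be tried outside the scheduler.

```shell
# Default used only when trying this outside the scheduler (assumption)
SGE_TASK_ID=${SGE_TASK_ID:-7}

# Hypothetical parameter grid with 4 columns; task IDs start at 1
N_COLS=4
index=$((SGE_TASK_ID - 1))
row=$((index / N_COLS))   # zero-based row in the grid
col=$((index % N_COLS))   # zero-based column in the grid

echo "Task ${SGE_TASK_ID} handles row ${row}, column ${col}"
```

For a grid of 3 rows by 4 columns, the matching range in the job script would be -t 1-12.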

Output files for array jobs include the task ID to differentiate the output from each task, e.g.

testarray.o123.1
testarray.o123.2
testarray.o123.3

Running a single task from an array job

If you need to run a single task from an array job (for example, the 5th task of 100 hit the execution time limit and you wish to run it again), you can pass the task number with the -t option when submitting: qsub -t 5 array_job.sh

Email notifications and large arrays

Please ensure that job email notifications are not enabled in job scripts for arrays with lots of tasks, as the sending of a large number of email messages causes problems with the receiving mail servers, and even service disruption.

Processing files

If you need to process lots of files, you can build a list of them using ls -1. For example, if your files are all named EN<something>.txt:

ls -1 EN*.txt > list_of_files.txt

Now find out how many files there are:

$ wc -l list_of_files.txt
35 list_of_files.txt

Then set the -t value to the appropriate number:

#$ -t 1-35

You can then use sed to select the correct line of the file for each iteration:

INPUT_FILE=$(sed -n "${SGE_TASK_ID}p" list_of_files.txt)

Which results in the final script:

#!/bin/bash
#$ -cwd
#$ -pe smp 1
#$ -l h_vmem=1G
#$ -j y
#$ -l h_rt=1:0:0
#$ -t 1-35

INPUT_FILE=$(sed -n "${SGE_TASK_ID}p" list_of_files.txt)
example-program < "$INPUT_FILE"
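If the -t range and the list get out of step (for example, the list is regenerated with fewer files), sed prints nothing and INPUT_FILE is left empty. The following is a minimal defensive sketch, not a required part of the script. The temporary directory, the three file names and the default SGE_TASK_ID are stand-ins so that it can be run outside the scheduler.

```shell
# Stand-ins so the sketch runs outside the scheduler (assumptions)
workdir=$(mktemp -d)
printf '%s\n' EN_a.txt EN_b.txt EN_c.txt > "$workdir/list_of_files.txt"
SGE_TASK_ID=${SGE_TASK_ID:-2}

# Count the lines in the list; $(( )) strips any padding that wc adds
n_files=$(( $(wc -l < "$workdir/list_of_files.txt") ))

# Stop early if this task has no corresponding line in the list
if [ "$SGE_TASK_ID" -gt "$n_files" ]; then
    echo "Task ${SGE_TASK_ID} has no matching line (only ${n_files} files listed)" >&2
    exit 1
fi

INPUT_FILE=$(sed -n "${SGE_TASK_ID}p" "$workdir/list_of_files.txt")
echo "Task ${SGE_TASK_ID} would process ${INPUT_FILE}"
```

In a real job script, the check and the sed line would read list_of_files.txt from the submission directory, followed by the call to the program.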

Processing directories

Consider processing the contents of a collection of 1000 directories, called test1 to test1000.

#!/bin/bash
#$ -cwd
#$ -pe smp 1
#$ -l h_vmem=1G
#$ -j y
#$ -l h_rt=1:0:0
#$ -t 1-1000
cd "test${SGE_TASK_ID}" || exit 1   # stop if the directory is missing
./program < input

Tasks are started in order of the array index.

Passing arguments to an application

The following example runs an application with differing arguments obtained from a text file:

$ cat list_of_args.txt
-i 50 52 54 -s 10
-i 60 62 64 -s 20
-i 70 72 74 -s 30

#!/bin/bash
#$ -cwd
#$ -pe smp 1
#$ -l h_vmem=1G
#$ -j y
#$ -l h_rt=1:0:0
#$ -t 1-3
INPUT_ARGS=$(sed -n "${SGE_TASK_ID}p" list_of_args.txt)
./program $INPUT_ARGS

This results in 3 tasks being submitted, each using a different set of input arguments taken from the corresponding line of the text file.
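The script relies on $INPUT_ARGS being left unquoted, so that the shell splits the line into separate arguments. The sketch below illustrates that splitting using set -- in place of the real program. The temporary directory, the default SGE_TASK_ID and the name n_args are assumptions made so it can be run outside the scheduler.

```shell
# Stand-ins so the sketch runs outside the scheduler (assumptions)
workdir=$(mktemp -d)
cat > "$workdir/list_of_args.txt" <<'EOF'
-i 50 52 54 -s 10
-i 60 62 64 -s 20
-i 70 72 74 -s 30
EOF
SGE_TASK_ID=${SGE_TASK_ID:-2}

# Pick the line belonging to this task
INPUT_ARGS=$(sed -n "${SGE_TASK_ID}p" "$workdir/list_of_args.txt")

# Unquoted on purpose: word splitting turns the line into separate arguments,
# which is what ./program would receive in the job script
set -- $INPUT_ARGS
n_args=$#

echo "Task ${SGE_TASK_ID} passes ${n_args} arguments: $*"
```

Because the splitting happens on whitespace, this approach does not suit arguments that themselves contain spaces.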

Task concurrency

Task concurrency (-tc N) is the maximum number of array tasks allowed to run at the same time. It can be used to limit the number of running tasks for larger jobs, and for jobs that may impact storage performance.

If your code may read from or write to the same files on the filesystem, you may need to use this option to avoid filesystem blocking. Also, large numbers of jobs starting or finishing at the same moment put extra load on the scheduler; using the tc throttle can limit this.

#!/bin/bash
#$ -cwd            # Run the code from the current directory
#$ -pe smp 1
#$ -l h_vmem=1G
#$ -j y            # Merge the standard output and standard error
#$ -l h_rt=1:0:0   # Limit each task to 1 hr
#$ -t 1-1000
#$ -tc 5
cd "test${SGE_TASK_ID}" || exit 1   # stop if the directory is missing
./program < input
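A low tc value trades throughput for reduced load, so it can help to estimate the effect before submitting. The arithmetic below is a rough sketch for the example above. It assumes every task uses its full 1 hour limit and that a free slot is always available, neither of which is guaranteed in practice. The variable names are illustrative.

```shell
# Values taken from the example script above
n_tasks=1000
tc=5
hours_per_task=1   # assumption: every task runs for its full h_rt limit

# Number of successive batches of tc tasks, rounded up
waves=$(( (n_tasks + tc - 1) / tc ))
max_hours=$(( waves * hours_per_task ))

echo "${waves} waves of up to ${tc} tasks, at most about ${max_hours} hours in total"
```

With these numbers the estimate is 200 waves and up to about 200 hours, which may suggest choosing a larger tc value if the storage impact allows it.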

Concurrency default value

If a tc value is not supplied, we apply a default value of 100 to array jobs. This is to avoid accidental impact on shared resources such as storage. We allow you to set a higher concurrency value than this, but please watch for any potential issues your job might cause, such as every array task writing to a single file.

You can alter the tc value while the job is running with qalter. For example, to change the concurrency of an array job to a value of ten:

qalter -tc 10 <jobid>

Holding specific tasks from a queued array job

The qhold command (see the man page) temporarily places a hold on queued jobs to stop them from starting. A hold has no effect on a job that is already running: it will continue to run and will not be halted. This can be useful if you notice that a lot of your tasks are failing, since holding the remaining queued tasks in the same array job stops them from failing in the same way.

For example, if your jobid is 3388, to hold tasks 20-60 to stop them from starting, run:

qhold 3388.20-60

After applying a hold, these tasks will not run until they are released with qrls (see the man page). Held jobs and tasks are displayed in the hqw state when running qstat.

Deleting specific tasks from a queued array job

To delete certain tasks from an array, use the -t option for qdel (see the man page).

For example, if your jobid is 3388, to delete tasks 20-60, run:

qdel 3388 -t 20-60

This will delete the tasks regardless of whether they are running, queued or held.

Re-submitting specific tasks from an array job

If specific tasks in your array job did not complete successfully or were prematurely deleted before they could run, you may want to re-submit those tasks; rather than re-submitting the entire array, you can re-submit specific tasks after correcting the original issue.

For example, to re-submit tasks 2-5, 17 and 35-36 from the original 60-task array, either modify the -t parameter in your job script for each task range and re-submit, or override the -t parameter on the qsub line, as shown below:

qsub -t 2-5 example.sh
qsub -t 17 example.sh
qsub -t 35-36 example.sh

The task ID range given to the -t option may be a single number or a single range (see the man page), which is why non-contiguous tasks need separate qsub commands.
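When there are many ranges to re-submit, the separate qsub commands can be generated with a loop. The following is a minimal dry-run sketch that only prints the commands rather than submitting anything. The ranges and the script name example.sh are taken from the example above, and the variable name commands is illustrative.

```shell
# Build one qsub command per range; nothing is submitted in this dry run
commands=$(
    for range in 2-5 17 35-36; do
        echo "qsub -t ${range} example.sh"
    done
)

# Review the commands before running them for real
echo "$commands"
```

Once the printed commands look correct, replacing the echo inside the loop with the qsub call itself would submit each range in turn.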

Need help?

If you need help writing or using array job submission scripts, please see Getting Help.