Using $TMPDIR

There is temporary space available on the nodes that can be used when you submit a job to the cluster. The size of this storage per node type is listed on the node types page.

As this storage is physically located on the nodes, it is not shared between them, but it provides much better performance for read/write (I/O) intensive tasks on a single node than networked storage. To use it, however, you will need to copy files from networked storage into the scratch space first, and if a job fails then any intermediate files created there may be lost.
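Since the amount of scratch space varies by node type, it can be worth checking how much is actually free before staging large files. A minimal sketch (outside a running job $TMPDIR is typically unset, so this falls back to /tmp purely for illustration):

```shell
#!/bin/sh
# Report free space in the per-job scratch area.
# $TMPDIR is set by the scheduler inside a job; /tmp is a fallback for illustration.
SCRATCH="${TMPDIR:-/tmp}"
df -h "$SCRATCH"
```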

If your job performs a lot of I/O on large files, it may therefore improve performance to:

  • copy files from your home directory into the temporary folder
  • run your job in the temporary folder
  • copy files back from the temporary folder to your home directory if needed
  • delete them from the temporary folder as soon as they're no longer needed
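In outline, those steps look like the following (a minimal sketch; data.in, results.out and the sort command are placeholders for your real files and workload):

```shell
#!/bin/sh
# Sketch of the stage-in / compute / stage-out pattern.
# $TMPDIR is set by the scheduler; the file names here are placeholders.
cp "$HOME/project/data.in" "$TMPDIR/"   # copy input to local scratch
cd "$TMPDIR"                            # work in the temporary folder
sort data.in > results.out              # stand-in for the real workload
cp results.out "$HOME/project/"         # copy results back if needed
rm -f data.in results.out               # delete when no longer needed
```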

Basic Example

The following job runs a shell script, ./runcode.sh, in a data folder beneath a user's home directory. At this point the data is held on networked storage.

#!/bin/sh
#$ -cwd
#$ -V
#$ -pe smp 12                  # Request 12 CPU cores
#$ -l h_rt=24:0:0              # Request 24 hour runtime
#$ -l h_vmem=2G                # Request 2GB RAM / core, i.e. 24GB total
#$ -m be                       # Email at start and end of job
#$ -M <username>@qmul.ac.uk    # Email address

cd $HOME/project
./runcode.sh

On any node, the temporary scratch directory is accessed using the $TMPDIR environment variable. If specific, known files are needed for your processing, you can copy your data into that space before working on it.

The following job:

  • copies data.file from the project directory to the temporary area
  • sets the current working directory to the temporary area
  • runs the appropriate code
  • copies the output file results.data back to the project directory

This is the equivalent of the previous example, but using the temporary storage.

#!/bin/sh
#$ -cwd
#$ -V
#$ -pe smp 12                  # Request 12 CPU cores
#$ -l h_rt=24:0:0              # Request 24 hour runtime
#$ -l h_vmem=2G                # Request 2GB RAM / core, i.e. 24GB total
#$ -m be                       # Email at start and end of job
#$ -M <username>@qmul.ac.uk    # Email address

# Copy data.file from the project directory to the temporary scratch space
cp $HOME/project/data.file $TMPDIR

# Move into the temporary scratch space where your data now is
cd $TMPDIR

# Do the processing - as runcode.sh is a small shell script, it is run directly from networked storage
$HOME/project/runcode.sh

# Copy results.data back to the project directory from the temporary scratch space
cp $TMPDIR/results.data $HOME/project/

If you do not know, or cannot list, all of the output files that you would like to move back to your home directory, you can use rsync to copy only changed and new files back at the end of the job. This saves time and avoids unnecessary copying.

The following job:

  • copies files to the temporary scratch area
  • runs the shell script ./runcode.sh on the local copy
  • copies the results back to networked storage

#!/bin/sh
#$ -cwd
#$ -V
#$ -pe smp 12                  # Request 12 CPU cores
#$ -l h_rt=24:0:0              # Request 24 hour runtime
#$ -l h_vmem=2G                # Request 2GB RAM / core, i.e. 24GB total
#$ -m be                       # Email at start and end of job
#$ -M <username>@qmul.ac.uk    # Email address

# Source folder for data
DATADIR=$HOME/project

# Copy data (inc. subfolders) to temporary storage
rsync -rltv $DATADIR/ $TMPDIR/

# Run job from temporary folder
cd $TMPDIR
./runcode.sh

# Copy changed files back
rsync -rltv $TMPDIR/ $DATADIR/

Viewing Temporary Files

To view temporary files while the job is running (for example, to check that the job is behaving correctly), you can ssh to the node the job is running on.

The path of the scratch directory is made up of the job id, task id and queue name, in the form /tmp/<jobid>.<taskid>.<queuename>.

$ qstat
3672630 5.00638 tempFilejob abc123 r 04/08/2016 14:20:57 serial.q@dn24 4

$ ssh dn24

$ ls /tmp/3672630.1.serial.q
temp_file1  temp_file2
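As a sketch, the directory name can be reconstructed from those three components (the values below are taken from the qstat output above):

```shell
#!/bin/sh
# Build the scratch path in the /tmp/<jobid>.<taskid>.<queue> form shown above.
JOB_ID=3672630
TASK_ID=1
QUEUE=serial.q
echo "/tmp/${JOB_ID}.${TASK_ID}.${QUEUE}"   # prints /tmp/3672630.1.serial.q
```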

SSH Connections

As per the Usage Policy, SSH sessions on compute nodes should be limited to monitoring jobs.


Advanced Example

This advanced example uses rsync for speed and a signal trap to ensure that cleanup happens at the end of the job, or when the job hits the soft runtime limit (at which point the scheduler sends SIGUSR1).

#!/bin/bash
#$ -cwd               # Use current working directory
#$ -o job.out         # Output file
#$ -j y               # Join output and error streams
#$ -l h_rt=0:10:0     # Request 10 minute runtime (up to 240 hours)
#$ -l s_rt=0:9:0      # Soft limit: trigger cleanup after 9 minutes

function Cleanup ()
{
    trap "" SIGUSR1 EXIT # Disable trap now we're in it
    # Clean up task
    rsync -rltv $TMPDIR/ $DATADIR/
    exit 0
}

DATADIR=$(pwd)
trap Cleanup SIGUSR1 EXIT # Enable trap

cd $TMPDIR
rsync -rltv $DATADIR/ $TMPDIR/

# Job
./runcode.sh

echo "End"
exit 0
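The trap mechanism can be tried out locally without the scheduler; this sketch simulates the soft-limit signal by sending SIGUSR1 to the script itself, at which point the Cleanup function runs once and exits:

```shell
#!/bin/bash
# Standalone demonstration of the cleanup-on-signal pattern used above.
# Sending SIGUSR1 (as the scheduler does at the s_rt soft limit) fires the trap.
Cleanup ()
{
    trap "" USR1 EXIT        # disable the trap now that we are in it
    echo "cleanup ran"       # stand-in for the rsync copy-back
    exit 0
}

trap Cleanup USR1 EXIT

kill -USR1 $$                # simulate the soft runtime limit
echo "never reached"         # not printed: the trap exits first
```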