Using $TMPDIR

There is temporary space available on the nodes that can be used when you submit a job to the cluster. The size of this storage per node type is listed on the node types page.

As this storage is physically located on the nodes, it is not shared between them, but it provides much better performance for read/write (I/O) intensive tasks on a single node than networked storage. To use it, however, you will need to copy files from networked storage into the scratch space first, and if a job fails then any intermediate files created there may be lost.
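Since the amount of scratch space varies by node type, it can be worth checking how much is actually free before staging large files. A minimal sketch (outside a running job $TMPDIR is typically unset, so this falls back to /tmp purely for illustration):

```shell
#!/bin/sh
# Report free space in the per-job scratch area.
# $TMPDIR is set by the scheduler inside a job; /tmp is a fallback for illustration.
SCRATCH="${TMPDIR:-/tmp}"
df -h "$SCRATCH"
```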

If your job performs a lot of I/O on large files, it may therefore improve performance to:

  • copy files from your home directory into the temporary folder
  • run your job in the temporary folder
  • copy files back from the temporary folder to your home directory if needed
  • delete them from the temporary folder as soon as they're no longer needed
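In outline, those steps look like the following (a minimal sketch; data.in, results.out and the sort command are placeholders for your real files and workload):

```shell
#!/bin/sh
# Sketch of the stage-in / compute / stage-out pattern.
# $TMPDIR is set by the scheduler; the file names here are placeholders.
cp "$HOME/project/data.in" "$TMPDIR/"   # copy input to local scratch
cd "$TMPDIR"                            # work in the temporary folder
sort data.in > results.out              # stand-in for the real workload
cp results.out "$HOME/project/"         # copy results back if needed
rm -f data.in results.out               # delete when no longer needed
```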

Basic Example

The following job runs a shell script, ./runcode.sh, in a data folder beneath a user's home directory. At this point the data is held on networked storage.

#!/bin/sh
#$ -cwd
#$ -V
#$ -pe smp 12                  # Request 12 CPU cores
#$ -l h_rt=24:0:0              # Request 24 hour runtime
#$ -l h_vmem=2G                # Request 2GB RAM / core, i.e. 24GB total
#$ -m be                       # Email at start and end of job
#$ -M <username>@qmul.ac.uk    # Email address

cd $HOME/project
./runcode.sh

On any node, the temporary scratch directory is accessed using the $TMPDIR environment variable. If specific, known files are needed for your processing, you can copy your data into that space before working on it.

The following job:

  • copies data.file from the project directory to the temporary area
  • sets the current working directory to the temporary area
  • runs the appropriate code
  • copies the output file results.data back to the project directory

This is the equivalent of the previous example, but using the temporary storage.

#!/bin/sh
#$ -cwd
#$ -V
#$ -pe smp 12                  # Request 12 CPU cores
#$ -l h_rt=24:0:0              # Request 24 hour runtime
#$ -l h_vmem=2G                # Request 2GB RAM / core, i.e. 24GB total
#$ -m be                       # Email at start and end of job
#$ -M <username>@qmul.ac.uk    # Email address

# Copy data.file from the project directory to the temporary scratch space
cp $HOME/project/data.file $TMPDIR

# Move into the temporary scratch space where your data now is
cd $TMPDIR

# Do the processing - as runcode.sh is a small shell script, it is run directly from networked storage
$HOME/project/runcode.sh

# Copy results.data back to the project directory from the temporary scratch space
cp $TMPDIR/results.data $HOME/project/

If you do not know, or cannot list, all of the output files that you would like to move back to your home directory, you can use rsync to copy only changed and new files back at the end of the job. This saves time and avoids unnecessary copying.

The following job:

  • copies files to the temporary scratch area
  • runs the shell script ./runcode.sh on the local copy
  • copies the results back to networked storage

#!/bin/sh
#$ -cwd
#$ -V
#$ -pe smp 12                  # Request 12 CPU cores
#$ -l h_rt=24:0:0              # Request 24 hour runtime
#$ -l h_vmem=2G                # Request 2GB RAM / core, i.e. 24GB total
#$ -m be                       # Email at start and end of job
#$ -M <username>@qmul.ac.uk    # Email address

# Source folder for data
DATADIR=$HOME/project

# Copy data (inc. subfolders) to temporary storage
rsync -rltv $DATADIR/ $TMPDIR/

# Run job from temporary folder
cd $TMPDIR
./runcode.sh

# Copy changed files back
rsync -rltv $TMPDIR/ $DATADIR/

Viewing Temporary Files

To view temporary files while the job is running (for example, to check that the job is behaving correctly), you can ssh to the node the job is running on.

The path of the scratch directory is made up of the job id, task id and queue name, in the form /tmp/<jobid>.<taskid>.<queuename>.

$ qstat
3672630 5.00638 tempFilejob abc123 r 04/08/2016 14:20:57 serial.q@dn24 4

$ ssh dn24

$ ls /tmp/3672630.1.serial.q
temp_file1  temp_file2
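As a sketch, the directory name can be reconstructed from those three components (the values below are taken from the qstat output above):

```shell
#!/bin/sh
# Build the scratch path in the /tmp/<jobid>.<taskid>.<queue> form shown above.
JOB_ID=3672630
TASK_ID=1
QUEUE=serial.q
echo "/tmp/${JOB_ID}.${TASK_ID}.${QUEUE}"   # prints /tmp/3672630.1.serial.q
```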

SSH Connections

As per the Usage Policy, SSH sessions on compute nodes should be limited to monitoring jobs.


Advanced Example

This advanced example uses rsync for speed and a signal trap to ensure that cleanup happens at the end of the job, or when the job hits the soft runtime limit (at which point the scheduler sends SIGUSR1).

#!/bin/bash
#$ -cwd               # Use current working directory
#$ -o job.out         # Output file
#$ -j y               # Join output and error streams
#$ -l h_rt=0:10:0     # Request 10 minute runtime (up to 240 hours)
#$ -l s_rt=0:9:0      # Soft limit: trigger cleanup after 9 minutes

function Cleanup ()
{
    trap "" SIGUSR1 EXIT # Disable trap now we're in it
    # Clean up task
    rsync -rltv $TMPDIR/ $DATADIR/
    exit 0
}

DATADIR=$(pwd)
trap Cleanup SIGUSR1 EXIT # Enable trap

cd $TMPDIR
rsync -rltv $DATADIR/ $TMPDIR/

# Job
./runcode.sh

echo "End"
exit 0
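The trap mechanism can be tried out locally without the scheduler; this sketch simulates the soft-limit signal by sending SIGUSR1 to the script itself, at which point the Cleanup function runs once and exits:

```shell
#!/bin/bash
# Standalone demonstration of the cleanup-on-signal pattern used above.
# Sending SIGUSR1 (as the scheduler does at the s_rt soft limit) fires the trap.
Cleanup ()
{
    trap "" USR1 EXIT        # disable the trap now that we are in it
    echo "cleanup ran"       # stand-in for the rsync copy-back
    exit 0
}

trap Cleanup USR1 EXIT

kill -USR1 $$                # simulate the soft runtime limit
echo "never reached"         # not printed: the trap exits first
```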