Deleting files

Individual files can be deleted using the rm command. For large numbers of files, it is faster to use rsync instead.

This table compares the speed of deletion using three different methods for a directory in $TMPDIR containing 500000 files:

Node type   find exec rm   find delete   rsync delete
nxn         14m36.137s     0m38.187s     0m13.212s
dn          12m31.660s     7m20.576s     0m37.492s
nxv         9m8.274s       0m17.378s     0m12.313s
sm          21m29.185s     0m45.506s     0m40.768s

As the table shows, rsync was the fastest method on every node type, and far faster than running rm on each file through find.
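The comparison in the table above can be tried on a small scale with a sketch like the following. The exact commands used to produce the table are not given, so the three invocations here (find with -exec rm, find with -delete, and rsync with --delete) are assumptions based on the column headings. N_FILES, SCRATCH and make_files are illustrative names, and the default of 1000 files is far smaller than the 500000 used for the table.

```shell
#!/bin/bash
# Sketch only: times three deletion methods on a small scratch directory.
# N_FILES, SCRATCH and make_files are illustrative names, not from the docs.
N_FILES=${N_FILES:-1000}
SCRATCH=$(mktemp -d "${TMPDIR:-/tmp}/delete_test.XXXXXX")

make_files() {
    # (Re)create the test directory and fill it with N_FILES empty files
    mkdir -p "$SCRATCH/target"
    (cd "$SCRATCH/target" && seq 1 "$N_FILES" | xargs touch)
}

make_files
echo "find exec rm:"
# Assumed form: one rm process is started per file
time find "$SCRATCH/target" -type f -exec rm {} \;

make_files
echo "find delete:"
# Assumed form: find unlinks the files itself
time find "$SCRATCH/target" -type f -delete

make_files
mkdir "$SCRATCH/empty"
echo "rsync delete:"
# Sync an empty directory onto the target, removing everything in it
time rsync -a --delete "$SCRATCH/empty/" "$SCRATCH/target/"

# Remove the scratch area used for the test
rm -r "$SCRATCH"
```

Timings from a run this small will be dominated by noise. Raising N_FILES gives a fairer picture, but such a test should be run inside a job rather than on a frontend node.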

Using rsync to delete files

Files in a directory can be deleted via rsync with the following commands:

# Create an empty directory
mkdir empty_dir
# Sync the empty directory over the target directory, deleting the target's contents
rsync -a --delete empty_dir/ <target_dir>/
# Clean up by removing both directories
rmdir <target_dir> empty_dir
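The commands above could be wrapped in a small shell function, sketched here under a few assumptions. The function name delete_dir_with_rsync is hypothetical, and using mktemp -d for the empty directory (rather than a fixed name such as empty_dir) is a design choice made here to avoid clashing with an existing directory.

```shell
#!/bin/bash
# Sketch of a reusable wrapper; delete_dir_with_rsync is an illustrative name.
delete_dir_with_rsync() {
    local target=$1
    # Refuse to run on an empty argument or something that is not a directory
    if [ -z "$target" ] || [ ! -d "$target" ]; then
        echo "error: '$target' is not a directory" >&2
        return 1
    fi
    # mktemp -d creates a fresh, empty directory with a unique name
    local empty
    empty=$(mktemp -d) || return 1
    # Sync the empty directory onto the target, deleting everything inside it
    rsync -a --delete "$empty/" "$target/" || { rmdir "$empty"; return 1; }
    # Both directories are now empty, so rmdir can remove them
    rmdir "$target" "$empty"
}
```

It would be called as delete_dir_with_rsync <target_dir>. The check at the top guards against an unset or mistyped argument before anything is deleted.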

This should be submitted as a job so the frontend nodes are not overloaded:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

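# Directory whose contents should be deleted
TARGET_DIR=example_directory

# Create an empty directory to sync from
mkdir empty
# Sync the empty directory over the target and report the time taken
time rsync -a --delete empty/ "${TARGET_DIR}/"
# Remove both directories, which are now empty
rmdir empty "${TARGET_DIR}"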