RepeatMasker (Module)

RepeatMasker screens DNA sequences for interspersed repeats and low complexity DNA sequences.

RepeatMasker 4.0.7 is available as a module on Apocrita.

To run a newer version of RepeatMasker, we recommend installing it by following the Anaconda installation instructions.

Usage

To run the default installed version of RepeatMasker, simply load the repeatmasker module:

module load repeatmasker

For usage documentation, run RepeatMasker with no arguments:

$ RepeatMasker
RepeatMasker version X.Y.Z
No query sequence file indicated

NAME
    RepeatMasker - Mask repetitive DNA

SYNOPSIS
      RepeatMasker [-options] <seqfiles(s) in fasta format>

A more detailed help document can be viewed by running RepeatMasker -help.

Example jobs

Serial jobs

Here is an example job running on 1 core and 1GB of memory:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load repeatmasker

RepeatMasker [-options] seqfiles.fa

Here is an example job running on 4 cores and 4GB of memory in total (1GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load repeatmasker

# By default, RepeatMasker will start 2 threads for every slot requested,
# resulting in badly overloaded jobs. To avoid this, set a variable to half
# of the allocated slots and pass it to the -pa option
REPCORES=$((NSLOTS / 2))

RepeatMasker [-options] -pa ${REPCORES} seqfiles.fa

To request a different number of slots, change the core request (the X in -pe smp X); no other changes are required. Request an even number of slots so that the REPCORES variable is set correctly: the shell's integer division rounds an odd count down, and a single slot would set REPCORES to 0.
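
The REPCORES arithmetic from the job script can be sketched as a small helper that also guards against a zero result. The function name repcores_for_slots and the floor of 1 are assumptions made for illustration; they are not part of RepeatMasker or the repeatmasker module. With a floor of 1, a single-slot job would still start two threads according to the comment in the job script, so an even slot count remains the safer choice.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): halve the slot count using
# integer division, but never return less than 1.
repcores_for_slots() {
    slots=$1
    half=$(( ${slots} / 2 ))   # integer division: 5 / 2 gives 2
    if [ "${half}" -lt 1 ]; then
        half=1                 # assumed floor, so -pa is never given 0
    fi
    echo "${half}"
}

# NSLOTS is set by the scheduler inside a job; fall back to 1 elsewhere.
REPCORES=$(repcores_for_slots "${NSLOTS:-1}")
echo "RepeatMasker would be run with -pa ${REPCORES}"
```

Inside a job script, the REPCORES value computed this way would be passed to RepeatMasker exactly as in the example job, via -pa ${REPCORES}.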

References