RepeatMasker
RepeatMasker screens DNA sequences for interspersed repeats and low-complexity DNA sequences.
RepeatMasker is available to install from the Bioconda Anaconda channel.
Installation
Load the default Miniforge module:
module load miniforge
If required, create a new Conda environment:
mamba create -n rm_env
Activate your Conda environment:
mamba activate rm_env
In your activated environment, install RepeatMasker from the Bioconda channel, also specifying the Conda Forge channel so that any required dependencies can be resolved:
mamba install -c bioconda -c conda-forge repeatmasker
Usage
To run the installed version of RepeatMasker, load the miniforge module and activate your Conda environment:
module load miniforge
mamba activate rm_env
For usage documentation, run RepeatMasker with no arguments:
(rm_env) $ RepeatMasker
RepeatMasker version X.Y.Z
No query sequence file indicated
NAME
RepeatMasker - Mask repetitive DNA
SYNOPSIS
RepeatMasker [-options] <seqfiles(s) in fasta format>
A detailed help document can also be viewed by running RepeatMasker -help.
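Because the synopsis expects sequence files in FASTA format, it can be worth checking an input file before submitting a job. The following is a minimal sketch only: looks_like_fasta is a hypothetical helper, not part of RepeatMasker, and it only tests that the first non-empty line begins with ">", so it will not catch every malformed file.

```shell
#!/bin/bash
# Hypothetical helper (not part of RepeatMasker): succeeds if the first
# non-empty line of the file starts with '>', as FASTA header lines do.
looks_like_fasta() {
    local file="$1" line
    # Fail early if the file is missing or unreadable
    [ -r "$file" ] || return 1
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip empty lines before the first record
        [ -z "$line" ] && continue
        case "$line" in
            '>'*) return 0 ;;
            *)    return 1 ;;
        esac
    done < "$file"
    # No non-empty lines were found
    return 1
}
```

In a job script, this could guard the call, for example: looks_like_fasta seqfiles.fa && RepeatMasker seqfiles.fa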
Example jobs
Serial jobs
Here is an example job running on 1 core and 1GB of memory:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load miniforge
mamba activate rm_env
RepeatMasker seqfiles.fa
Here is an example job running on 4 cores and 4GB of memory in total (1GB per core):
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load miniforge
mamba activate rm_env
# RepeatMasker starts 2 threads for each parallel job requested with -pa,
# so passing the full slot count would badly overload the job. To avoid
# this, set a variable equal to half of the allocated slots
REPCORES=$((NSLOTS / 2))
RepeatMasker -pa ${REPCORES} seqfiles.fa
To request a different number of slots, change the core request (smp) value; no additional changes are required. Requesting an even number of slots ensures that the REPCORES variable is set correctly.
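The integer division used for REPCORES rounds down, so an odd slot count leaves one slot unused and a single slot yields 0. As a sketch of one way to guard against that, the hypothetical half_slots function below halves the slot count but never returns less than 1. It assumes the two-threads-per-job behaviour described in the example above; a one-slot job given -pa 1 would still be oversubscribed under that behaviour, which is why the serial example omits -pa altogether.

```shell
#!/bin/bash
# Hypothetical helper: halve the slot count, rounding down, but never
# return less than 1 so that -pa always receives a usable value.
half_slots() {
    local slots="${1:-1}"
    local half=$(( slots / 2 ))
    if [ "$half" -lt 1 ]; then
        half=1
    fi
    echo "$half"
}

# NSLOTS is set by the scheduler inside a job; default to 1 elsewhere
REPCORES=$(half_slots "${NSLOTS:-1}")
echo "RepeatMasker would be started with -pa ${REPCORES}"
```

In the job script, REPCORES=$(half_slots "${NSLOTS}") could replace the arithmetic line, with the RepeatMasker call left unchanged.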