RepeatMasker (Anaconda)
RepeatMasker screens DNA sequences for interspersed repeats and low-complexity DNA sequences.
RepeatMasker is available to install from the Bioconda Anaconda channel.
Installation
Load the default Anaconda module:
module load anaconda3
If required, create a new Conda environment:
mamba create -n myenv
Activate your Conda environment:
mamba activate myenv
In your activated environment, install RepeatMasker from the Bioconda Anaconda channel, also specifying the Conda Forge channel so that any required dependencies can be resolved:
mamba install -c bioconda -c conda-forge repeatmasker
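After installation, it can be useful to confirm that the RepeatMasker executable is on your PATH in the activated environment. The following is a minimal sketch; the helper name check_tool is hypothetical and not part of RepeatMasker or Conda:

```shell
#!/bin/bash
# check_tool is a hypothetical helper: it reports whether a command
# can be found on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# Run this with the Conda environment activated.
check_tool RepeatMasker || echo "Activate the environment containing RepeatMasker and try again."
```

If the tool is reported as missing, check that the anaconda3 module is loaded and that the correct environment is active.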
Usage
To run the installed version of RepeatMasker, load the anaconda3 module and activate your Conda environment:
module load anaconda3
mamba activate myenv
For usage documentation, run RepeatMasker with no arguments:
(myenv) $ RepeatMasker
RepeatMasker version X.Y.Z
No query sequence file indicated
NAME
RepeatMasker - Mask repetitive DNA
SYNOPSIS
RepeatMasker [-options] <seqfiles(s) in fasta format>
A more detailed help document can be viewed by running RepeatMasker -help.
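As an illustration of how the [-options] placeholder might be filled in, the sketch below builds a command using the -species, -xsmall (soft-mask repeats in lowercase) and -dir (output directory) options. Treat these as assumptions and confirm them against the help output of your installed version. The function name mask_repeats, the input file genome.fa and the output directory rm_out are hypothetical. By default the function only prints the command; set DRY_RUN=0 to execute it.

```shell
#!/bin/bash
# mask_repeats is a hypothetical wrapper, not part of RepeatMasker.
# Usage: mask_repeats <input.fa> <output_dir> [species]
mask_repeats() {
    input="$1"
    outdir="$2"
    species="${3:-human}"

    # Rebuild the positional parameters so each argument stays one word.
    set -- RepeatMasker -species "${species}" -xsmall -dir "${outdir}" "${input}"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: show the command without executing it.
        echo "$@"
    else
        mkdir -p "${outdir}" && "$@"
    fi
}

# Hypothetical input file and output directory.
mask_repeats genome.fa rm_out
```

Printing the command first makes it easier to review the options before committing a job to the queue.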
Example jobs
Serial jobs
Here is an example job running on 1 core and 1GB of memory:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load anaconda3
mamba activate myenv
RepeatMasker [-options] seqfiles.fa
Here is an example job running on 4 cores and 4GB of memory (1GB per core):
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load anaconda3
mamba activate myenv
# By default, RepeatMasker starts 2 threads for each parallel job given to
# -pa, so passing the full slot count would badly overload the job. To avoid
# this, set a variable to half of the allocated slots.
REPCORES=$((NSLOTS / 2))
RepeatMasker [-options] -pa ${REPCORES} seqfiles.fa
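To request a different number of slots, change the core request (the smp X value); no other changes are required. Request an even number of slots so that the REPCORES variable is set correctly: the shell's integer division rounds down, so an odd request leaves one slot unused, and a request of 1 slot sets REPCORES to 0.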
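The halving arithmetic can be sketched as a small function to show how different slot counts behave. This is an illustrative variant rather than a required change to the job script: the function name repcores_for is hypothetical, and the floor of 1 is an added assumption to avoid passing 0 to -pa (with a single slot, the job would then still start more threads than slots).

```shell
#!/bin/bash
# repcores_for is a hypothetical helper: it halves the slot count using
# integer division and never returns less than 1.
repcores_for() {
    slots="$1"
    cores=$((slots / 2))
    if [ "${cores}" -lt 1 ]; then
        cores=1
    fi
    echo "${cores}"
}

# NSLOTS is set by the scheduler inside a job; default to 1 elsewhere.
REPCORES=$(repcores_for "${NSLOTS:-1}")
echo "Using -pa ${REPCORES}"
```

For example, repcores_for 4 gives 2 and repcores_for 5 also gives 2, which is why an even slot request avoids leaving a slot idle.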