
RepeatModeler (Module)

RepeatModeler is a de novo transposable element family identification and modelling package.

RepeatModeler 2.0.1 is available as a module which wraps an Apptainer container on Apocrita.

To run newer versions of RepeatModeler, we recommend installing it by following the Anaconda installation instructions.

Usage

To run the default installed version of RepeatModeler, simply load the repeatmodeler module:

module load repeatmodeler

For usage documentation, run RepeatModeler -help:

$ RepeatModeler -help
No database indicated

NAME
    RepeatModeler - Model repetitive DNA

SYNOPSIS
      RepeatModeler [-options] -database <XDF Database>

Example jobs

Serial jobs

Here is an example job that builds a RepeatModeler database, running on 1 core and 1GB of memory:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load repeatmodeler

# create a database for RepeatModeler
BuildDatabase -name DB_NAME INPUT.fa

Here is an example job that runs RepeatModeler on 4 cores and 4GB of memory (1GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load repeatmodeler

# By default, RepeatModeler will start 2 threads for every slot requested,
# resulting in badly overloaded jobs. To avoid this, set a variable to half
# the number of allocated slots and pass it to -pa
REPCORES=$((NSLOTS / 2))

RepeatModeler -database DB_NAME -pa ${REPCORES}

To request a different number of slots, change the core request (the X in -pe smp X); no other changes are required. Request an even number of slots so that REPCORES is set correctly: the division is integer division, so an odd request rounds down and a request of 1 slot gives a value of 0.
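
The rounding behaviour can be made explicit with a small guard. The following is a minimal sketch, assuming 2 threads per parallel job as described in the job script comments; pa_from_slots is a hypothetical helper name used for illustration and is not part of RepeatModeler or the module.

```shell
#!/bin/bash
# Hypothetical helper: derive a value for -pa from the allocated slots,
# assuming RepeatModeler starts 2 threads per parallel job.
pa_from_slots() {
    local slots=$1
    local pa=$(( slots / 2 ))   # integer division rounds down (5 -> 2)
    if [ "$pa" -lt 1 ]; then
        pa=1                    # never pass 0 to -pa
    fi
    echo "$pa"
}

# NSLOTS is set by the scheduler inside a job; fall back to 1 elsewhere
REPCORES=$(pa_from_slots "${NSLOTS:-1}")
echo "Using -pa ${REPCORES}"

# In a real job, this would be followed by:
# RepeatModeler -database DB_NAME -pa ${REPCORES}
```

Note that with a single slot the guard still yields 1, which under the stated assumption means 2 threads on 1 core, so requesting at least 2 slots remains the safer choice.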
