MaSuRCA

MaSuRCA is a genome assembler for both PacBio and Illumina data that combines the benefits of the de Bruijn graph and Overlap-Layout-Consensus assembly approaches.

MaSuRCA is available as a module on Apocrita.

Usage

To run the default installed version of MaSuRCA, simply load the masurca module:

$ module load masurca
$ masurca

USAGE: masurca <config_file>

For usage documentation, run masurca --help.

Example job

Selecting the number of threads

To prevent overloading a compute node, you should include the NUM_THREADS=X parameter in your configuration file, where X is equal to the number of cores requested.
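
One way to catch a mismatch before the assembly starts is to compare the value in the configuration file with the number of cores granted to the job. The following is a minimal sketch, not part of MaSuRCA: check_threads is a hypothetical helper name, and it assumes the configuration file contains a single NUM_THREADS=<integer> line.

```bash
#!/bin/bash
# check_threads is a hypothetical helper, not part of MaSuRCA.
# Usage: check_threads <config_file> <cores_requested>
# Returns 0 if NUM_THREADS in the config equals the requested core count,
# 1 if the values differ, and 2 if NUM_THREADS is not set at all.
check_threads() {
    local cfg="$1"
    local slots="$2"
    local threads

    # Pull the integer from the first NUM_THREADS=<n> line,
    # tolerating spaces around the "=" sign.
    threads=$(sed -n 's/^[[:space:]]*NUM_THREADS[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$cfg" | head -n 1)

    if [ -z "$threads" ]; then
        echo "NUM_THREADS is not set in $cfg" >&2
        return 2
    fi

    if [ "$threads" -ne "$slots" ]; then
        echo "NUM_THREADS=$threads but $slots cores were requested" >&2
        return 1
    fi

    return 0
}
```

In a job script, this could be called as check_threads example.cfg "$NSLOTS" || exit 1 before running masurca, assuming the scheduler sets the NSLOTS variable to the number of cores requested with -pe smp.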

Serial job

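Here is an example job running on 2 cores with 4GB of memory in total (2GB per core). The masurca command generates the assemble.sh script from the configuration file, and running assemble.sh performs the assembly: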

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 2
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G

module load masurca

masurca example.cfg
./assemble.sh

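Here is the supporting example.cfg file. In the PE line, the fields are a two-letter prefix for the reads, the mean and standard deviation of the fragment size, and the path to the reads: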

DATA
PE= pe 180 20 /path/to/example.fastq
END

PARAMETERS
GRAPH_KMER_SIZE=auto
NUM_THREADS=2
JF_SIZE=200000000
END
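
To keep NUM_THREADS in step with the job's core request, the configuration file could also be written from inside the job script instead of being edited by hand. The following is a minimal sketch under that assumption: write_config is a hypothetical helper, not part of MaSuRCA, and the values it writes are simply those of the example above.

```bash
#!/bin/bash
# write_config is a hypothetical helper, not part of MaSuRCA.
# Usage: write_config <output_file> <reads_path> <threads>
# Writes a configuration file matching the example above, substituting
# the reads path and the thread count supplied by the caller.
write_config() {
    local out="$1"
    local reads="$2"
    local threads="$3"

    # Unquoted heredoc delimiter so that ${reads} and ${threads} expand.
    cat > "$out" <<EOF
DATA
PE= pe 180 20 ${reads}
END

PARAMETERS
GRAPH_KMER_SIZE=auto
NUM_THREADS=${threads}
JF_SIZE=200000000
END
EOF
}
```

In a job script, write_config example.cfg /path/to/example.fastq "$NSLOTS" would then precede masurca example.cfg, assuming the scheduler sets NSLOTS to the number of cores requested with -pe smp.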

Reference