SPAdes
SPAdes is an assembly toolkit containing various assembly pipelines.
SPAdes is available as a module on Apocrita.
Usage
To run the default installed version of SPAdes, simply load the spades
module:
$ module load spades
$ spades.py --help
SPAdes genome assembler <VERSION>

Usage: /share/apps/centos7/spades/<VERSION>/bin/spades.py [options] -o <output_dir>

Basic options:
-o <output_dir>         directory to store all the resulting files (required)
--sc                    this flag is required for MDA (single-cell) data
--meta                  this flag is required for metagenomic sample data
--rna                   this flag is required for RNA-Seq data
--plasmid               runs plasmidSPAdes pipeline for plasmid detection
--iontorrent            this flag is required for IonTorrent data
--test                  runs SPAdes on toy dataset
-h/--help               prints this usage message
-v/--version            prints version

Advanced options:
--dataset <filename>    file with dataset description in YAML format
-t/--threads <int>      number of threads
                        [default: 16]
-m/--memory <int>       RAM limit for SPAdes in Gb (terminates if exceeded)
                        [default: 250]
--tmp-dir <dirname>     directory for temporary files
                        [default: <output_dir>/tmp]
-k <int,int,...>        comma-separated list of k-mer sizes (must be odd and
                        less than 128) [default: 'auto']
For full usage documentation, run spades.py --help.
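The help text above states that values passed to -k must be odd and less than 128. As a minimal sketch, a job script could check a k-mer list before starting a long run. The valid_kmers function below is a hypothetical helper written for illustration; it is not part of SPAdes and checks only the two constraints quoted in the help text.

```shell
#!/bin/bash
# Hypothetical helper (not part of SPAdes): succeeds only if every value in a
# comma-separated list is an odd integer below 128.
valid_kmers() {
    local k
    local -a sizes
    # Split the argument on commas into an array.
    IFS=',' read -r -a sizes <<< "$1"
    # An empty list is rejected.
    [ "${#sizes[@]}" -gt 0 ] || return 1
    for k in "${sizes[@]}"; do
        # Each entry must consist of digits only.
        [[ "$k" =~ ^[0-9]+$ ]] || return 1
        # Force base 10, then require an odd value below 128.
        (( 10#$k % 2 == 1 && 10#$k < 128 )) || return 1
    done
    return 0
}

# Example list chosen for illustration only.
if valid_kmers "21,33,55"; then
    echo "k-mer list OK"
fi
```

If the check passes, the same list could then be passed to spades.py with the -k option.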
Example job
Selecting the number of threads and memory
By default, SPAdes runs multi-threaded on 16 cores with a memory limit of
250GB (or all available memory on nodes with less than 250GB). To avoid
overloading a compute node, override these defaults by passing the
--threads parameter with the value of ${NSLOTS} and the --memory parameter
with the value of ${SGE_HGR_m_mem_free%.*}.
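The %.* suffix in ${SGE_HGR_m_mem_free%.*} is standard shell parameter expansion: it removes the shortest trailing match of the pattern .* (a dot followed by anything), leaving a whole number that --memory accepts. The sketch below assumes the scheduler sets the variable to a value of the form 1.000G; that value is illustrative and not taken from a real job.

```shell
#!/bin/bash
# Illustrative value only (an assumption); inside a real job the scheduler
# sets this variable.
SGE_HGR_m_mem_free="1.000G"

# %.* strips the shortest suffix beginning with a dot, keeping the integer part.
mem_gb="${SGE_HGR_m_mem_free%.*}"

echo "--memory ${mem_gb}"
```

With the assumed value, the expansion yields 1, so SPAdes would be given a 1GB memory limit.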
Serial job
Here is an example job running on 1 core and 1GB of memory:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load spades
spades.py -o <output_dir> \
          -1 example1.fastq \
          -2 example2.fastq \
          --threads ${NSLOTS} \
          --memory ${SGE_HGR_m_mem_free%.*}