SPAdes
SPAdes is an assembly toolkit containing various assembly pipelines.
SPAdes is available as a module on Apocrita.
Usage
To run the latest installed version of SPAdes, simply load the spades
module:
$ module load spades
$ spades.py --help
SPAdes genome assembler <VERSION>

Usage: /share/apps/centos7/spades/<VERSION>/bin/spades.py [options] -o <output_dir>

Basic options:
-o <output_dir>       directory to store all the resulting files (required)
--sc                  this flag is required for MDA (single-cell) data
--meta                this flag is required for metagenomic sample data
--rna                 this flag is required for RNA-Seq data
--plasmid             runs plasmidSPAdes pipeline for plasmid detection
--iontorrent          this flag is required for IonTorrent data
--test                runs SPAdes on toy dataset
-h/--help             prints this usage message
-v/--version          prints version

Advanced options:
--dataset <filename>  file with dataset description in YAML format
-t/--threads <int>    number of threads [default: 16]
-m/--memory <int>     RAM limit for SPAdes in Gb (terminates if exceeded) [default: 250]
--tmp-dir <dirname>   directory for temporary files [default: <output_dir>/tmp]
-k <int,int,...>      comma-separated list of k-mer sizes (must be odd and less than 128) [default: 'auto']
For usage documentation, run spades.py --help.
Example job
Selecting the number of threads and memory
By default, SPAdes runs multi-threaded on 16 cores with a memory limit of
250GB (or all available memory on nodes with less than 250GB). To prevent
overloading a compute node, override these defaults by passing the --threads
parameter with the value of ${NSLOTS} and the --memory parameter with the
value of ${SGE_HGR_m_mem_free%.*}.
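The %.* part of ${SGE_HGR_m_mem_free%.*} is a Bash parameter expansion that removes the shortest trailing match of the pattern .*, that is, the last dot and everything after it. The sketch below shows the effect on a made-up value. The sample string 1.000G is only an assumption about how the scheduler formats this variable. Inside a real job the scheduler sets it for you.

```shell
#!/bin/bash
# Hypothetical value for illustration; the scheduler sets this inside a job.
SGE_HGR_m_mem_free="1.000G"

# %.* strips the shortest suffix matching ".*" (the last dot and what follows),
# leaving a whole number suitable for --memory.
mem_gb="${SGE_HGR_m_mem_free%.*}"

echo "${mem_gb}"   # prints 1
```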
Here is an example job running on 1 core and 1GB of memory:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load spades

spades.py -o <output_dir> \
          -1 example1.fastq \
          -2 example2.fastq \
          --threads ${NSLOTS} \
          --memory ${SGE_HGR_m_mem_free%.*}
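When a job script is tried out outside the scheduler, NSLOTS and SGE_HGR_m_mem_free are unset, so the --threads and --memory options would receive no value. One possible guard is to use Bash default expansions, as sketched below. The variable names threads, mem_raw and mem_gb, the fallback values, and the output_dir and FASTQ file names are all hypothetical and chosen purely for illustration.

```shell
#!/bin/bash
# Fall back to 1 thread when NSLOTS is unset or empty (illustrative default).
threads="${NSLOTS:-1}"

# Use the scheduler's memory value if present, otherwise assume "1G"
# (illustrative default), then reduce it to a whole number for --memory.
mem_raw="${SGE_HGR_m_mem_free:-1G}"
mem_gb="${mem_raw%.*}"   # drops ".000G" from a value such as "1.000G"
mem_gb="${mem_gb%G}"     # drops a trailing "G" from a value such as "1G"

# Print the command rather than running it, so the sketch works without SPAdes.
echo spades.py -o output_dir -1 example1.fastq -2 example2.fastq \
    --threads "${threads}" --memory "${mem_gb}"
```

Removing the leading echo turns the last line into the real invocation once the values look right.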