Stacks
Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.
Stacks is available as a module on Apocrita.
Usage
To run the default installed version of Stacks, load the stacks module:
$ module load stacks
Usage: <command> [options] [--help] [--version]
The following commands are available:
process_radtags Examines raw reads from an Illumina sequencing run,
checks that the barcode and the RAD cutsite are intact, and
demultiplexes the data.
process_shortreads Performs the same task as process_radtags for fast
cleaning of randomly sheared genomic or transcriptomic data, not for RAD data.
clone_filter Designed to identify PCR clones.
kmer_filter Allows paired or single-end reads to be filtered according
to the number of rare or abundant kmers they contain.
ustacks Takes as input a set of short-read sequences and aligns
them into exactly-matching stacks (or putative alleles).
cstacks Builds a catalog from any set of samples processed by the
ustacks or pstacks programs.
sstacks Sets of stacks, i.e. putative loci, constructed by the
ustacks program can be searched against a catalog produced by cstacks.
tsv2bam Transpose data so that it is oriented by locus, instead
of by sample.
gstacks Examines a RAD data set one locus at a time, looking at
all individuals in the metapopulation for that locus.
populations Analyze a population of individual samples computing a
number of population genetics statistics as well as exporting a variety of
standard output formats.
The following scripts are included in the Stacks package and allow preset pipelines to be run:
denovo_map.pl
ref_map.pl
For full usage documentation, run <command> -h.
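The components listed above can also be chained by hand instead of through denovo_map.pl. The following is a minimal sketch that only prints the commands for a de novo analysis rather than running them. The function name print_denovo_steps, the directory layout, the sample names and the parameter values are illustrative assumptions, and the thread flags (-p or -t) follow the Stacks 2 manual examples and may differ between versions, so check each program's -h output first.

```shell
#!/bin/bash
# Hypothetical helper (not part of Stacks): print one command per pipeline step.
# Arguments: samples directory, output directory, popmap file, thread count,
# followed by the sample names (assumed to be stored as <sample>.fq.gz).
print_denovo_steps() {
    local samples_dir="$1" out_dir="$2" popmap="$3" threads="$4"
    shift 4
    local id=1 sample
    # ustacks runs once per sample; each sample gets its own integer ID (-i).
    for sample in "$@"; do
        echo "ustacks -f ${samples_dir}/${sample}.fq.gz -o ${out_dir} -i ${id} -M 4 -p ${threads}"
        id=$((id + 1))
    done
    # The remaining programs run once over all samples named in the popmap.
    echo "cstacks -P ${out_dir} -M ${popmap} -n 4 -p ${threads}"
    echo "sstacks -P ${out_dir} -M ${popmap} -p ${threads}"
    echo "tsv2bam -P ${out_dir} -M ${popmap} -t ${threads}"
    echo "gstacks -P ${out_dir} -M ${popmap} -t ${threads}"
    echo "populations -P ${out_dir} -M ${popmap} -t ${threads}"
}
```

Calling print_denovo_steps ./samples ./stacks ./popmaps/popmap 2 sample_01 sample_02 prints seven lines, which can be reviewed before being pasted into a job script.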
Example job
Serial job
The simplest way to execute the entire Stacks pipeline is to run it via the denovo_map.pl program.
Here is an example job running on 2 cores with 8GB of memory in total (4GB per core):
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 2
#$ -l h_rt=1:0:0
#$ -l h_vmem=4G
module load stacks
denovo_map.pl -T ${NSLOTS} -M 4 -n 4 -o ./stacks/ \
--samples ./samples --popmap ./popmaps/popmap
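The job above reads a population map through --popmap. In Stacks this is a tab-separated text file with one line per sample: the sample name (the file name without its extension) followed by a population label. Below is a minimal sketch of generating one from the samples directory. The helper name make_popmap is hypothetical, and it assumes that reads are stored as <sample>.fq.gz (or <sample>.1.fq.gz and <sample>.2.fq.gz for paired-end data) and that every sample belongs to the same population.

```shell
#!/bin/bash
# Hypothetical helper (not part of Stacks): write "<sample><TAB><population>"
# to standard output for every *.fq.gz file in a directory.
make_popmap() {
    local samples_dir="$1" population="$2"
    local f name
    for f in "$samples_dir"/*.fq.gz; do
        [ -e "$f" ] || continue          # no matching files: print nothing
        name=$(basename "$f" .fq.gz)
        case "$name" in
            *.2) continue ;;             # reverse reads: sample is listed via its .1 file
            *.1) name=${name%.1} ;;      # forward reads: strip the read-number suffix
        esac
        printf '%s\t%s\n' "$name" "$population"
    done
}
```

For the layout used in the job script, this could be run as make_popmap ./samples pop1 > ./popmaps/popmap, after which the population labels can be edited by hand if the samples come from more than one population.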