DIAMOND

DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high-performance analysis of large sequence datasets.

Key features include:

  • Pairwise alignment of proteins and translated DNA at 500x-20,000x the speed of BLAST.
  • Frameshift alignments for long read analysis.
  • Low HPC resource requirements.
  • Various output formats, including BLAST pairwise, tabular, XML and taxonomic classification.

DIAMOND is available as a module on Apocrita.

Usage

AVX2 Instruction set required

DIAMOND requires the AVX2 instruction set to run. To ensure a job runs only on nodes that support AVX2, pass the -l avx2 parameter in the job script. See the node types page for a table of supported CPU instruction sets per node.
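
If you are unsure whether the node you are on supports AVX2, the CPU flags can be inspected directly. The following is a minimal sketch, assuming a Linux node that exposes /proc/cpuinfo; has_avx2 is a hypothetical helper name and is not part of DIAMOND or the scheduler.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a cpuinfo-style file lists the avx2 flag.
# The file argument defaults to /proc/cpuinfo, which exists on Linux only.
has_avx2() {
    grep -qw avx2 "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_avx2; then
    echo "AVX2 is supported on ${HOSTNAME:-this node}"
else
    echo "AVX2 is not supported on ${HOSTNAME:-this node}" >&2
fi
```

This only reports on the node where it is run; the -l avx2 request is still needed so that the scheduler places the job on a suitable node.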

To run the latest installed version of DIAMOND, simply load the diamond module:

$ module load diamond
$ diamond help

Syntax: diamond COMMAND [OPTIONS]

Commands:
makedb  Build DIAMOND database from a FASTA file
blastp  Align amino acid query sequences against a protein reference database
blastx  Align DNA query sequences against a protein reference database
view    View DIAMOND alignment archive (DAA) formatted file
help    Produce help message
version Display version information
getseq  Retrieve sequences from a DIAMOND database file
dbinfo  Print information about a DIAMOND database file

For usage documentation, run diamond help.

Example job

Selecting the number of threads

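By default, DIAMOND runs multi-threaded on all available cores, regardless of how many cores the job requested. To avoid overloading a compute node, override this by passing the --threads parameter with the value of ${NSLOTS}, which the scheduler sets to the number of cores allocated to the job.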
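
When the same commands are also run outside a scheduled job, ${NSLOTS} may be unset. One way to handle this is sketched below, assuming that a single thread is an acceptable fallback; diamond_threads is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the thread count to pass to DIAMOND.
# Uses the scheduler-provided NSLOTS when set, otherwise falls back to 1.
diamond_threads() {
    echo "${NSLOTS:-1}"
}

echo "DIAMOND would run with $(diamond_threads) thread(s)"

# Example use inside a job script (left commented out so this sketch runs
# without DIAMOND installed):
# diamond blastx --db example --query query.fna --out matches \
#                --threads "$(diamond_threads)"
```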

Here is an example job running on 1 core with 1 GB of memory:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
#$ -l avx2

module load diamond

# Create a binary DIAMOND database
diamond makedb --db example \
               --in example.fa \
               --threads ${NSLOTS}

# Run the alignment task
diamond blastx --db example \
               --out matches \
               --query query.fna \
               --threads ${NSLOTS}
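
Once the job has finished, the matches file can be post-processed with standard tools. The sketch below assumes the alignment was written in BLAST tabular format (DIAMOND's --outfmt 6), in which column 11 is the e-value and column 12 is the bit score. The helper name filter_hits and the default cutoff of 1e-5 are illustrative choices, not DIAMOND settings.

```shell
#!/bin/bash
# Hypothetical helper: keep hits with an e-value at or below a cutoff and
# list them with the highest bit score first.
# Assumes tab-separated, 12-column BLAST tabular input.
filter_hits() {
    local file="$1"
    local max_evalue="${2:-1e-5}"
    awk -F'\t' -v max="$max_evalue" '($11 + 0) <= (max + 0)' "$file" |
        sort -t$'\t' -k12,12gr
}

# Example use (commented out so this sketch runs without a matches file):
# filter_hits matches 1e-10 > matches.filtered
```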
