VCFtools

VCFtools is a set of utilities for working with genetic variation data stored in Variant Call Format (VCF) files.

VCFtools is available as a module on Apocrita.

Usage

To run the default installed version of VCFtools, load the vcftools module:

$ module load vcftools
$ vcftools

VCFtools (X.Y.Z)

Process Variant Call Format files

For a list of options, please go to:
        https://vcftools.github.io/man_latest.html

Alternatively, a man page is available, type:
        man vcftools

Example jobs

VCFtools is mostly a single-threaded application

The only tool within the VCFtools suite that supports multi-threading is vcf-sort, which must be given the --parallel ${NSLOTS} option when a job requests multiple cores. All other tools are single-threaded, so jobs using them must request 1 core.
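
A multi-core vcf-sort job might look like the sketch below. It assumes that vcf-sort is provided by the same vcftools module and reuses the --parallel ${NSLOTS} option described above. The core count, memory request and file names are placeholders chosen for illustration; confirm the option spelling with vcf-sort --help on your installation.

```shell
#!/bin/bash
#$ -cwd
#$ -j y
# Illustrative request of 4 cores; adjust to your needs.
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
module load vcftools
# NSLOTS is set by the scheduler to the number of cores requested above.
# input_data.vcf and sorted_data.vcf are placeholder file names.
cat input_data.vcf | vcf-sort --parallel "${NSLOTS}" > sorted_data.vcf
```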

Serial jobs

Here is an example job requesting 1 core and 2GB of memory that reports the number of variants and individuals in the given input file:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
module load vcftools
vcftools --vcf input_data.vcf
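
The counts reported by this job can be cross-checked with standard shell tools, since variants are the non-header lines of a VCF file and individuals are the columns after the nine fixed fields on the #CHROM line. The sketch below does this on a small, made-up file (example.vcf is a hypothetical name, and its contents are invented purely for illustration):

```shell
#!/bin/bash
# Build a tiny, made-up VCF with three variants and two samples.
printf '##fileformat=VCFv4.2\n' > example.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' >> example.vcf
printf '1\t1500000\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0\n' >> example.vcf
printf '1\t2500000\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\t0/1\n' >> example.vcf
printf '2\t1500000\t.\tG\tA\t.\tPASS\t.\tGT\t0/0\t0/1\n' >> example.vcf

# Variants: every line that does not start with '#'.
n_sites=$(grep -vc '^#' example.vcf)

# Individuals: columns on the #CHROM line beyond the nine fixed fields.
n_samples=$(awk -F'\t' '/^#CHROM/ {print NF - 9; exit}' example.vcf)

echo "sites=${n_sites} individuals=${n_samples}"
```

Running the same commands against input_data.vcf should give figures that match the totals in the VCFtools log when no filters are applied.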

Here is an example job requesting 1 core and 2GB of memory that keeps only the variants on chromosome 1 between positions 1000000 and 2000000. Because no output option is given, VCFtools only reports how many sites pass the filter:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
module load vcftools
vcftools --vcf input_data.vcf \
         --chr 1 \
         --from-bp 1000000 \
         --to-bp 2000000
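
To keep the filtered records rather than just a count, VCFtools needs an output option. The sketch below extends the same job with --recode, --recode-INFO-all and --out, which are standard VCFtools options; the output prefix chr1_region is a placeholder chosen for illustration.

```shell
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
module load vcftools
# Write the retained sites to a new VCF, keeping all INFO fields.
# "chr1_region" is a placeholder output prefix.
vcftools --vcf input_data.vcf \
         --chr 1 \
         --from-bp 1000000 \
         --to-bp 2000000 \
         --recode \
         --recode-INFO-all \
         --out chr1_region
```

With these options the filtered file is written to chr1_region.recode.vcf and the run log to chr1_region.log.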
