VCFtools
VCFtools is a set of utilities for working with genetic variation data stored in Variant Call Format (VCF) files.
VCFtools is available as a module on Apocrita.
Usage
To run the default installed version of VCFtools, simply load the vcftools module:
$ module load vcftools
$ vcftools
VCFtools (X.Y.Z)
Process Variant Call Format files
For a list of options, please go to:
https://vcftools.github.io/man_latest.html
Alternatively, a man page is available, type:
man vcftools
Example jobs
VCFtools is mostly a single-threaded application
The only tool within the VCFtools suite that supports multi-threading is vcf-sort, where the --parallel ${NSLOTS} option must be used if requesting multiple cores. All other tools are single-threaded and therefore jobs using these tools must request 1 core.
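As a rough sketch of the multi-core case described above, the job below requests 4 cores and passes the scheduler-provided NSLOTS value to vcf-sort. It uses -p, which is assumed here to be the short form of the parallel option (worth confirming with vcf-sort -h on the installed version). The core count, the sorted_data.vcf output name, and the guards that fall back to printing the command when module or vcf-sort is unavailable are likewise assumptions for illustration.

```shell
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G

# Load the module only where the module system exists (assumed guard for dry runs)
if command -v module >/dev/null 2>&1; then
    module load vcftools
fi

# NSLOTS is set by the scheduler to the number of cores granted via -pe smp;
# fall back to 1 when the script is run outside a job.
# -p is assumed to be the short form of the parallel option.
sort_cmd=(vcf-sort -p "${NSLOTS:-1}")

if command -v vcf-sort >/dev/null 2>&1; then
    # Read the VCF on stdin; sorted_data.vcf is a hypothetical output name
    "${sort_cmd[@]}" < input_data.vcf > sorted_data.vcf
else
    # Dry run: show the command that would be executed
    echo "${sort_cmd[*]} < input_data.vcf > sorted_data.vcf"
fi
```

Because the sort concurrency follows NSLOTS, changing the -pe smp request is enough to change the number of concurrent sorts without editing the command itself.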
Serial jobs
Here is an example job running on 1 core with 2GB of memory that reports the number of variants and individuals in the given input file:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
module load vcftools
vcftools --vcf input_data.vcf
Here is an example job running on 1 core with 2GB of memory that filters variants by position, keeping only sites on chromosome 1 between 1,000,000 and 2,000,000 bp:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
module load vcftools
vcftools --vcf input_data.vcf \
--chr 1 \
--from-bp 1000000 \
--to-bp 2000000
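Run without an output option, the filter command only reports how many individuals and sites are kept. A minimal sketch of writing the retained records to a new VCF file follows, using the --recode, --recode-INFO-all and --out options of vcftools. The filtered_data output prefix is a hypothetical name, and the guards that skip execution when module or vcftools is unavailable are assumptions added so the script can be dry-run outside the cluster.

```shell
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G

# Load the module only where the module system exists (assumed guard for dry runs)
if command -v module >/dev/null 2>&1; then
    module load vcftools
fi

filter_args=(
    --vcf input_data.vcf
    --chr 1
    --from-bp 1000000
    --to-bp 2000000
    --recode             # write the sites that pass the filter as a new VCF
    --recode-INFO-all    # keep all INFO fields in the recoded output
    --out filtered_data  # hypothetical prefix; output is filtered_data.recode.vcf
)

if command -v vcftools >/dev/null 2>&1; then
    vcftools "${filter_args[@]}"
else
    # Dry run: show the command that would be executed
    echo "vcftools ${filter_args[*]}"
fi
```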