Public / Shared Data available on Apocrita

What is this?

To prevent duplication of data and to save valuable research time, we provide local copies of some widely used public datasets. Please direct corrections, refresh requests, and requests for new datasets to the Research Consultants, who monitor how widely requested each dataset is.

Datasets available

Name | Version | Description | Location
Blast databases | 08/2016 | Standard set of databases for BLAST (Basic Local Alignment Search Tool) | /data/PublicDataSets/blast/db
GATK Bundle | 2.8/hg19 | Standard files for working with human resequencing data with the GATK | /data/PublicDataSets/GATKbundle
Illumina Genomes | HomoSapiens/build37.2, Rattus_Norvegicus/Rnor_5.0 | Ready-To-Use Reference Sequences and Annotations | /data/PublicDataSets/genomes
CDD | 2014 | The Conserved Domain Database is a resource for the annotation of functional units in proteins | /data/PublicDataSets/CDD
NCBI WGS | 2014 | Whole Genome Shotgun projects are genome assemblies of incomplete genomes | /data/PublicDataSets/NCBI
Galaxy hg datasets | hg (UCSC), hg_g1k_v37 (1000Genomes) | Reference genomes for use with Galaxy | /data/PublicDataSets/galaxy
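
Before pointing a job at one of these locations, it can help to confirm that the directory is visible from the node you are on. The following is a minimal sketch in Python, not an official tool: the paths are copied from the table above, while the names DATASETS and check_datasets are made up for this example.

```python
from pathlib import Path

# Locations copied from the dataset table; the keys are just labels
# chosen for this sketch, not official identifiers.
DATASETS = {
    "Blast databases": "/data/PublicDataSets/blast/db",
    "GATK Bundle": "/data/PublicDataSets/GATKbundle",
    "Illumina Genomes": "/data/PublicDataSets/genomes",
    "CDD": "/data/PublicDataSets/CDD",
    "NCBI WGS": "/data/PublicDataSets/NCBI",
    "Galaxy hg datasets": "/data/PublicDataSets/galaxy",
}


def check_datasets(datasets=DATASETS):
    """Return {name: bool}, True where the location is an existing directory."""
    return {name: Path(location).is_dir() for name, location in datasets.items()}


if __name__ == "__main__":
    # Print one line per dataset with its path and whether it was found.
    for name, present in check_datasets().items():
        status = "found" if present else "not visible from this node"
        print(f"{name:20s} {DATASETS[name]:35s} {status}")
```

If a location is reported as not visible, that only means the directory could not be seen from the current machine; it says nothing about whether the dataset itself has been moved or retired.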