Faster, wider access to the 1000 Genomes Mirror

As part of a program to make large reference data sets more widely available, QCIF recently exposed the complete 1000 Genomes data set to the Euramoo and FlashLite compute clusters.

In the final phase of the seven-year 1000 Genomes Project, the genomes of 2,504 people across five continental regions were sequenced. The result is a global reference and a comprehensive resource on human genetic variation. The data sets total about 260 TB and comprise more than 250,000 publicly accessible files.

The QCIF 1000 Genomes Project Mirror is a complete copy of the data sets for use by Australian researchers. Because the data is now available on QCIF’s compute clusters, researchers can perform high-end analysis on the data without having to copy it.

Anyone with a QRIScloud account automatically has command-line access to the 1000 Genomes Mirror. For more information, please visit the QRIScloud website.

Image via IGSR: Map of the population sites of the 1000 Genomes Project.