GQT: Making the most of genomic big data
UCGD investigator Aaron Quinlan and his team have released GQT, a software tool for exploring and querying large data sets of thousands to millions of genomes.
ABSTRACT: Genotype Query Tools (GQT) is an indexing strategy that expedites analyses of genome-variation data sets in Variant Call Format based on sample genotypes, phenotypes and relationships. GQT's compressed genotype index minimizes decompression for analysis, and its performance relative to that of existing methods improves with cohort size. We show substantial (up to 443-fold) gains in performance over existing methods and demonstrate GQT's utility for exploring massive data sets involving thousands to millions of genomes. GQT can be accessed at https://github.com/ryanlayer/gqt.
Nat Methods. 2015 Nov 9. doi: 10.1038/nmeth.3654. [Epub ahead of print]