ABSTRACT

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

An integrated map of structural variation in 2,504 human genomes

Peter H. Sudmant, Tobias Rausch, Eugene J. Gardner, Robert E. Handsaker, Alexej Abyzov, John Huddleston, Yan Zhang, Kai Ye, Goo Jun, Markus Hsi-Yang Fritz, Miriam K. Konkel, Ankit Malhotra, Adrian M. Stütz, Xinghua Shi, Francesco Paolo Casale, Jieming Chen, Fereydoun Hormozdiari, Gargi Dayama, Ken Chen, Maika Malig, Mark J. P. Chaisson, Klaudia Walter, Sascha Meiers, Seva Kashin, Erik Garrison, Adam Auton, Hugo Y. K. Lam, Xinmeng Jasmine Mu, Can Alkan, Danny Antaki, Taejeong Bae, Eliza Cerveira, Peter Chines, Zechen Chong, Laura Clarke, Elif Dal, Li Ding, Sarah Emery, Xian Fan, Madhusudan Gujral, Fatma Kahveci, Jeffrey M. Kidd, Yu Kong, Eric-Wubbo Lameijer, Shane McCarthy, Paul Flicek, Richard A. Gibbs, Gabor Marth, Christopher E. Mason, Androniki Menelaou, Donna M. Muzny, Bradley J. Nelson, Amina Noor, Nicholas F. Parrish, Matthew Pendleton, Andrew Quitadamo, Benjamin Raeder, Eric E. Schadt, Mallory Romanovitch, Andreas Schlattl, Robert Sebra, Andrey A. Shabalin, Andreas Untergasser, Jerilyn A. Walker, Min Wang, Fuli Yu, Chengsheng Zhang, Jing Zhang, Xiangqun Zheng-Bradley, Wanding Zhou, Thomas Zichner, Jonathan Sebat, Mark A. Batzer, Steven A. McCarroll, The 1000 Genomes Project Consortium, Ryan E. Mills, Mark B. Gerstein, Ali Bashir, Oliver Stegle, Scott E. Devine, Charles Lee, Evan E. Eichler & Jan O. Korbel

Nature 526, 75–81 (01 October 2015) doi:10.1038/nature15394

PMCID: PMC4617611