Nat Commun. 2019 Apr 16;10(1):1784. doi: 10.1038/s41467-018-08148-z.

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

Author information

1. Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA.
2. Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA.
3. European Molecular Biology Laboratory, Genome Biology Unit, 69117, Heidelberg, Germany.
4. Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA.
5. Center for Genomic Medicine, Massachusetts General Hospital, Department of Neurology, Harvard Medical School, Boston, MA, 02114, USA.
6. The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
7. European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, AV, NL-9713, The Netherlands.
8. Center for Bioinformatics, Saarland University and the Max Planck Institute for Informatics, 66123, Saarbrücken, Germany.
9. Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA.
10. Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
11. The School of Life Science and Technology of Xi'an Jiaotong University, 710049, Xi'an, China.
12. MOE Key Lab for Intelligent Networks & Networks Security, School of Electronics and Information Engineering, Xi'an Jiaotong University, 710049, Xi'an, China.
13. Ye-Lab For Omics and Omics Informatics, Xi'an Jiaotong University, 710049, Xi'an, China.
14. Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, 02115, USA.
15. Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA.
16. Department of Bioinformatics and Genomics, College of Computing and Informatics, The University of North Carolina at Charlotte, Charlotte, NC, 28223, USA.
17. Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA.
18. The Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
19. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
20. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom.
21. Yale University Medical School, Computational Biology and Bioinformatics Program, New Haven, CT, 06520, USA.
22. Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT, 06520, USA.
23. Biochemistry and Molecular Medicine, University of California Davis, Davis, CA, 95616, USA.
24. UC Davis Genome Center, University of California, Davis, Davis, CA, 95616, USA.
25. USTAR Center for Genetic Discovery and Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT, 84112, USA.
26. Pacific Biosciences, Menlo Park, CA, 94025, USA.
27. Bionano Genomics, San Diego, CA, 92121, USA.
28. Beyster Center for Genomics of Psychiatric Diseases, Department of Psychiatry University of California San Diego, La Jolla, CA, 92093, USA.
29. 10X Genomics, Pleasanton, CA, 94566, USA.
30. Illumina Clinical Services Laboratory, Illumina, Inc., 5200 Illumina Way, San Diego, CA, 92122, USA.
31. Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, 92093, USA.
32. Ludwig Institute for Cancer Research, La Jolla, CA, 92093, USA.
33. Department of Graduate Studies - Life Sciences, Ewha Womans University, 52, Ewhayeodae-gil, Seodaemun-gu, Seoul, 03760, South Korea.
34. DNA Link, Seodaemun-gu, Seoul, South Korea.
35. TreeCode Sdn Bhd, Bandar Botanic, 41200, Klang, Malaysia.
36. Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA.
37. School of Biomedical Engineering, Drexel University, Philadelphia, PA, 19104, USA.
38. Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77225, USA.
39. Department of Medicine, McDonnell Genome Institute, Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, 63108, USA.
40. High Impact Research, University of Malaya, 50603, Kuala Lumpur, Malaysia.
41. Department of Computer Science, Yale University, 266 Whitney Avenue, New Haven, CT, 06520, USA.
42. Department of Statistics and Data Science, Yale University, 266 Whitney Avenue, New Haven, CT, 06520, USA.
43. Institute for Human Genetics, University of California-San Francisco, San Francisco, CA, 94143, USA.
44. Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, V5Z 1L3, Canada.
45. Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
46. Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA.
47. The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, China.
48. Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
49. Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA.
50. European Molecular Biology Laboratory, Genome Biology Unit, 69117, Heidelberg, Germany. jan.korbel@embl.de.
51. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom. jan.korbel@embl.de.
52. Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA. eee@gs.washington.edu.
53. Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA. eee@gs.washington.edu.
54. The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA. charles.lee@jax.org.
55. Department of Graduate Studies - Life Sciences, Ewha Womans University, 52, Ewhayeodae-gil, Seodaemun-gu, Seoul, 03760, South Korea. charles.lee@jax.org.

Abstract

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios and define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome, 58 of which intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three- to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community, allowing us to make recommendations for maximizing structural variation sensitivity in future genome sequencing studies.

PMID: 30992455
PMCID: PMC6467913
DOI: 10.1038/s41467-018-08148-z
[Indexed for MEDLINE]
Free PMC Article
