Cell. 2019 Jan 24;176(3):663-675.e19. doi: 10.1016/j.cell.2018.12.019. Epub 2019 Jan 17.

Characterizing the Major Structural Variant Alleles of the Human Genome.

Author information

1. Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA.
2. McDonnell Genome Institute, Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA.
3. Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637, USA.
4. Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH 43205, USA; The Ohio State University College of Medicine, Columbus, OH 43210, USA.
5. Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH 43205, USA.
6. Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA; Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
7. Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA. Electronic address: eee@gs.washington.edu.

Abstract

In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.

KEYWORDS:

gap closure; human reference genome; major allele; single-molecule, real-time (SMRT) sequencing; structural variation; whole-genome sequence and assembly

PMID: 30661756
DOI: 10.1016/j.cell.2018.12.019
