Nat Methods. 2010 May;7(5):365-71.

Characterization of missing human genome sequences and copy-number polymorphic insertions.

Author information

  • Department of Genome Sciences, University of Washington School of Medicine, Seattle, USA.

Abstract

The extent of human genomic structural variation suggests that there must be portions of the genome yet to be discovered, annotated and characterized at the sequence level. We present a resource and analysis of 2,363 new insertion sequences corresponding to 720 genomic loci. We found that a substantial fraction of these sequences are either missing, fragmented or misassigned when compared to recent de novo sequence assemblies from short-read next-generation sequence data. We determined that 18-37% of these new insertions are copy-number polymorphic, including loci that show extensive population stratification among Europeans, Asians and Africans. Complete sequencing of 156 of these insertions identified new exons and conserved noncoding sequences not yet represented in the reference genome. We developed a method to accurately genotype these new insertions by mapping next-generation sequencing datasets to the breakpoint, thereby providing a means to characterize copy-number status for regions previously inaccessible to single-nucleotide polymorphism microarrays.

PMID: 20440878 [PubMed - indexed for MEDLINE]
PMCID: PMC2875995 (free PMC article)

