Nat Biotechnol. 2014 Mar;32(3):261-266. doi: 10.1038/nbt.2833. Epub 2014 Feb 23.

Whole-genome haplotyping using long reads and statistical methods.

Author information

1 Department of Computer Science, Stanford University, Stanford, CA 94305, USA.
2 Illumina, Inc., 5200 Illumina Way, San Diego, CA 92199, USA.
3 Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA.
# Contributed equally

Abstract

The rapid growth of sequencing technologies has greatly contributed to our understanding of human genetics. Yet, despite this growth, mainstream technologies have not been fully able to resolve the diploid nature of the human genome. Here we describe statistically aided, long-read haplotyping (SLRH), a rapid, accurate method that uses a statistical algorithm to take advantage of the partially phased information contained in long genomic fragments analyzed by short-read sequencing. For a human sample, as little as 30 Gbp of additional sequencing data are needed to phase genotypes identified by 50× coverage whole-genome sequencing. Using SLRH, we phase 99% of single-nucleotide variants in three human genomes into long haplotype blocks 0.2-1 Mbp in length. We apply our method to determine allele-specific methylation patterns in a human genome and identify hundreds of differentially methylated regions that were previously unknown. SLRH should facilitate population-scale haplotyping of human genomes.
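
To put the sequencing figures above in perspective, the following minimal sketch converts a sequencing yield into approximate fold coverage. The haploid human genome size of about 3.1 Gbp is an assumption not stated in the abstract, and the function and constant names are hypothetical; the result is a rough estimate, not a figure reported by the authors.

```python
# Rough coverage arithmetic for the figures quoted in the abstract.
# ASSUMPTION: haploid human genome size of ~3.1 Gbp (not given in the abstract).
ASSUMED_GENOME_SIZE_BP = 3.1e9

def fold_coverage(yield_bp: float, genome_size_bp: float = ASSUMED_GENOME_SIZE_BP) -> float:
    """Return approximate fold coverage for a given sequencing yield in base pairs."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return yield_bp / genome_size_bp

# 30 Gbp of additional data, as quoted in the abstract.
extra_coverage = fold_coverage(30e9)

# Relative cost of phasing compared with the 50x genotyping run from the abstract.
relative_to_wgs = extra_coverage / 50.0

print(f"additional coverage: about {extra_coverage:.1f}x")
print(f"fraction of the 50x run: about {relative_to_wgs:.0%}")
```

Under that assumed genome size, the extra 30 Gbp corresponds to roughly one fifth of the data already collected for 50× whole-genome sequencing.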

PMID: 24561555
PMCID: PMC4073643
DOI: 10.1038/nbt.2833
[Indexed for MEDLINE]
Free PMC Article
