Nat Methods. 2016 Dec;13(12):1050-1054. doi: 10.1038/nmeth.4035. Epub 2016 Oct 17.

Phased diploid genome assembly with single-molecule real-time sequencing.

Author information

1 Pacific Biosciences, Menlo Park, California, USA.
2 Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA.
3 Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA.
4 DOE Joint Genome Institute, Walnut Creek, California, USA.
5 Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, California, USA.
6 Department of Viticulture and Enology, University of California Davis, Davis, California, USA.
7 Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada, USA.
8 Dipartimento di Biotecnologie, Università degli Studi di Verona, Verona, Italy.
9 Department of Biology, Johns Hopkins University, Baltimore, Maryland, USA.

Abstract

While genome assembly projects have been successful in many haploid and inbred species, the assembly of noninbred or rearranged heterozygous genomes remains a major challenge. To address this challenge, we introduce the open-source FALCON and FALCON-Unzip algorithms (https://github.com/PacificBiosciences/FALCON/) to assemble long-read sequencing data into highly accurate, contiguous, and correctly phased diploid genomes. We generate new reference sequences for heterozygous samples including an F1 hybrid of Arabidopsis thaliana, the widely cultivated Vitis vinifera cv. Cabernet Sauvignon, and the coral fungus Clavicorona pyxidata, samples that have challenged short-read assembly approaches. The FALCON-based assemblies are substantially more contiguous and complete than alternate short- or long-read approaches. The phased diploid assembly enabled the study of haplotype structure and heterozygosities between homologous chromosomes, including the identification of widespread heterozygous structural variation within coding sequences.

PMID: 27749838
PMCID: PMC5503144
DOI: 10.1038/nmeth.4035
[Indexed for MEDLINE]
Free PMC Article
