Nat Commun. 2014 Dec 17;5:5695. doi: 10.1038/ncomms6695.

High-quality genome (re)assembly using chromosomal contact data.

Author information

1. [1] Institut Pasteur, Department of Genomes and Genetics, Groupe Régulation Spatiale des Génomes, 75015 Paris, France [2] CNRS, UMR 3525, 75015 Paris, France [3] Institut Pasteur, Unité Imagerie et Modélisation, 75015 Paris, France [4] CNRS, URA 2582, 75015 Paris, France [5] Sorbonne Universités, UPMC Univ Paris06, IFD, 4 place Jussieu, 75252 Paris, France.
2. [1] Institut Pasteur, Department of Genomes and Genetics, Groupe Régulation Spatiale des Génomes, 75015 Paris, France [2] CNRS, UMR 3525, 75015 Paris, France.
3. Max Planck Institute for Dynamics and Self-Organization, Group Biological Physics and Evolutionary Dynamics, Bunsenstr. 10, 37073 Göttingen, Germany.
4. Institute for Research on Cancer and Ageing of Nice (IRCAN), CNRS UMR 7284-INSERM U108, Université de Nice Sophia Antipolis, 06107 Nice, France.
5. [1] Sorbonne Universités, UPMC Univ Paris06, IFD, 4 place Jussieu, 75252 Paris, France [2] IFP Energies Nouvelles, 1 et 4 avenue de Bois-Préau, 92852 Rueil-Malmaison, France.
6. Institut Pasteur, Unité Cell Biology of Parasitism, 75015 Paris, France.
7. IFP Energies Nouvelles, 1 et 4 avenue de Bois-Préau, 92852 Rueil-Malmaison, France.
8. [1] Institut Pasteur, Unité Imagerie et Modélisation, 75015 Paris, France [2] CNRS, URA 2582, 75015 Paris, France.

Abstract

Closing gaps in draft genome assemblies can be costly and time-consuming, and published genomes are therefore often left 'unfinished.' Here we show that genome-wide chromosome conformation capture (3C) data can be used to overcome these limitations, and present a computational approach rooted in polymer physics that determines the most likely genome structure using chromosomal contact data. This algorithm, named GRAAL, generates high-quality assemblies of genomes in which repeated and duplicated regions are accurately represented and offers a direct probabilistic interpretation of the computed structures. We first validated GRAAL on the reference genome of Saccharomyces cerevisiae, as well as other yeast isolates, where GRAAL recovered both known and unknown complex chromosomal structural variations. We then applied GRAAL to the finishing of the assembly of Trichoderma reesei and obtained a number of contigs congruent with the known karyotype of this species. Finally, we showed that GRAAL can accurately reconstruct human chromosomes from either fragments generated in silico or contigs obtained from de novo assembly. In all these applications, GRAAL compared favourably to recently published programmes implementing related approaches.
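
The idea of choosing the "most likely genome structure" from contact data can be illustrated with a toy example. The sketch below is not GRAAL's implementation. It assumes, purely for illustration, that the expected number of contacts between two fragments decays as a power law of their separation along the chromosome and that observed counts are Poisson-distributed. The function names, the `scale` and `alpha` parameters, and the synthetic counts are all hypothetical. It scores every ordering of four fragments and keeps the one with the highest log-likelihood.

```python
import math
from itertools import permutations


def expected_contacts(distance, scale=100.0, alpha=1.0):
    """Hypothetical power-law decay of contact frequency with separation."""
    return scale * distance ** (-alpha)


def log_likelihood(order, counts, scale=100.0, alpha=1.0):
    """Poisson log-likelihood of pairwise contact counts for a fragment order."""
    position = {frag: idx for idx, frag in enumerate(order)}
    total = 0.0
    for (a, b), n in counts.items():
        lam = expected_contacts(abs(position[a] - position[b]), scale, alpha)
        # log of the Poisson probability mass: n*log(lam) - lam - log(n!)
        total += n * math.log(lam) - lam - math.lgamma(n + 1)
    return total


# Synthetic, noise-free counts generated from the "true" order A, B, C, D.
true_order = ["A", "B", "C", "D"]
counts = {}
for i in range(len(true_order)):
    for j in range(i + 1, len(true_order)):
        counts[(true_order[i], true_order[j])] = round(expected_contacts(j - i))

# Exhaustive search is only feasible for a handful of fragments.
best = max(permutations(true_order), key=lambda o: log_likelihood(o, counts))
print("best order:", best)
```

An ordering and its mirror image give identical scores, since only pairwise separations enter the model. Exhaustive enumeration grows factorially with the number of fragments, so any practical method has to explore candidate structures by some form of sampling or heuristic search instead.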

PMID: 25517223
PMCID: PMC4284522
DOI: 10.1038/ncomms6695
[Indexed for MEDLINE]
Free PMC Article
