Rapid genome mapping in nanochannel arrays for highly complete and accurate de novo sequence assembly of the complex Aegilops tauschii genome

PLoS One. 2013;8(2):e55864. doi: 10.1371/journal.pone.0055864. Epub 2013 Feb 6.

Abstract

Next-generation sequencing (NGS) technologies have enabled high-throughput and low-cost generation of sequence data; however, de novo genome assembly remains a great challenge, particularly for large genomes. NGS short reads are often insufficient to create large contigs that span repeat sequences and to facilitate unambiguous assembly. Plant genomes are notorious for containing high quantities of repetitive elements, which, combined with huge genome sizes, have so far made accurate assembly of these large and complex genomes intractable. Using two-color genome mapping of tiling bacterial artificial chromosome (BAC) clones on nanochannel arrays, we completed high-confidence assembly of a 2.1-Mb, highly repetitive region in the large and complex genome of Aegilops tauschii, the D-genome donor of hexaploid wheat (Triticum aestivum). Genome mapping is based on direct visualization of sequence motifs on single DNA molecules hundreds of kilobases in length. With the genome map as a scaffold, we anchored unplaced sequence contigs, validated the initial draft assembly, and resolved instances of misassembly, some involving contigs <2 kb long, to dramatically improve the assembly from 75% to 95% complete.
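The idea of anchoring a sequence contig to a motif-based genome map can be illustrated with a minimal sketch. It is not the authors' pipeline: the motif `GCTCTTC`, the function names, and the 10% size tolerance are all assumptions chosen for illustration, and real map alignment must also handle missing or extra labels and sizing error. The sketch locates motif sites in a contig in silico, converts them to inter-site distances, and looks for a run of matching distances in a map.

```python
def motif_sites(seq, motif):
    """Return sorted 0-based start positions of `motif` on either strand.

    The motif is a hypothetical label site, not one taken from the abstract.
    """
    complement = str.maketrans("ACGT", "TGCA")
    reverse_complement = motif.translate(complement)[::-1]
    seq = seq.upper()
    sites = set()
    for pattern in (motif, reverse_complement):
        start = seq.find(pattern)
        while start != -1:
            sites.add(start)
            start = seq.find(pattern, start + 1)
    return sorted(sites)


def intervals(sites):
    """Distances between consecutive label sites."""
    return [b - a for a, b in zip(sites, sites[1:])]


def anchor(contig_gaps, map_gaps, tol=0.1):
    """Return map offsets where every contig interval matches the map
    interval within a relative tolerance `tol` (an assumed value)."""
    n = len(contig_gaps)
    hits = []
    for i in range(len(map_gaps) - n + 1):
        window = map_gaps[i:i + n]
        if all(abs(c - m) <= tol * m for c, m in zip(contig_gaps, window)):
            hits.append(i)
    return hits
```

A contig whose interval pattern matches at exactly one offset can be placed on the map at that position; several matches or none would leave it ambiguous or unplaced under this simplified scheme.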

MeSH terms

  • Chromosome Mapping*
  • Chromosomes, Artificial, Bacterial
  • Chromosomes, Plant / genetics*
  • Genes, Plant / genetics*
  • Genome, Plant / genetics*
  • High-Throughput Nucleotide Sequencing
  • Molecular Sequence Data
  • Nanotechnology / instrumentation*
  • Sequence Analysis, DNA
  • Triticum / genetics*

Associated data

  • GENBANK/JX295577