BMC Bioinformatics. 2014 Sep 15;15:302. doi: 10.1186/1471-2105-15-302.

SAGE: String-overlap Assembly of GEnomes.

Author information

Department of Computer Science, University of Western Ontario, N6A 5B7 London, Ontario, Canada. ilie@csd.uwo.ca.

Abstract

BACKGROUND:

De novo genome assembly of next-generation sequencing data is one of the most important current problems in bioinformatics and is essential to many biological applications. Despite a significant amount of work in this area, better solutions are still needed.

RESULTS:

We present a new program, SAGE, for de novo genome assembly. Unlike most assemblers, which are based on de Bruijn graphs, SAGE uses the string-overlap graph. SAGE builds on existing work on string-overlap graphs and maximum likelihood assembly and contributes several new ideas: efficient computation of the transitive reduction of the string-overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and improved use of mate pairs and min-cost flow to support edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers.
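
To make the string-overlap graph and its transitive reduction concrete, here is a minimal Python sketch. It is not SAGE's implementation: it uses a naive all-pairs overlap search and a simple cubic-time reduction, not the efficient computation the abstract refers to. The function names (`overlap_len`, `build_overlap_graph`, `transitive_reduction`, `spell_path`), the toy reads, and the `min_len` threshold are all assumptions made for illustration, and the reads are assumed to be distinct and error-free.

```python
from itertools import permutations


def overlap_len(a, b, min_len):
    """Longest proper suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def build_overlap_graph(reads, min_len):
    """Naive all-pairs construction: graph[a][b] = overlap length of edge a -> b."""
    graph = {r: {} for r in reads}
    for a, b in permutations(reads, 2):
        k = overlap_len(a, b, min_len)
        if k:
            graph[a][b] = k
    return graph


def transitive_reduction(graph):
    """Drop edge u -> w when a path u -> v -> w spells the same extension."""
    def ext(u, v):
        # Number of bases that v adds beyond the end of u.
        return len(v) - graph[u][v]

    reduced = {u: dict(nbrs) for u, nbrs in graph.items()}
    for u, nbrs in graph.items():
        for v in nbrs:
            for w in graph[v]:
                if w in nbrs and ext(u, v) + ext(v, w) == ext(u, w):
                    reduced[u].pop(w, None)
    return reduced


def spell_path(graph, start):
    """Follow unambiguous edges from start and concatenate the non-overlapping tails."""
    contig, node, seen = start, start, {start}
    while len(graph[node]) == 1:
        (nxt, k), = graph[node].items()
        if nxt in seen:  # stop on a cycle
            break
        contig += nxt[k:]
        seen.add(nxt)
        node = nxt
    return contig


# Toy reads of length 6 sampled every 2 bases from "ATGCGTACCTGA" (hypothetical data).
reads = ["ATGCGT", "GCGTAC", "GTACCT", "ACCTGA"]
graph = build_overlap_graph(reads, min_len=2)
reduced = transitive_reduction(graph)
print(spell_path(reduced, "ATGCGT"))  # ATGCGTACCTGA
```

In this toy case the full graph contains two transitive edges (for example, `ATGCGT -> GTACCT` is implied by the path through `GCGTAC`). Removing them leaves a single chain of reads, which spells the original sequence.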

CONCLUSIONS:

SAGE benefits from innovations in almost every aspect of the assembly process: error correction of input reads, string-overlap graph construction, read copy count estimation, overlap graph analysis and reduction, contig extraction, and scaffolding. We hope that these new ideas will help advance the state of the art in an essential area of genomics research.

PMID: 25225118
PMCID: PMC4174676
DOI: 10.1186/1471-2105-15-302
[Indexed for MEDLINE]
Free PMC Article