Nucleic Acids Res. 2008 Apr;36(7):2284-94. doi: 10.1093/nar/gkn064. Epub 2008 Feb 20.

Empirical comparison of ab initio repeat finding programs.

Author information

  • Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS 39762, USA.

Abstract

Identification of dispersed repetitive elements can be difficult, especially when elements share little or no homology with previously described repeats. Consequently, a growing number of computational tools have been designed to identify repetitive elements in an ab initio manner, i.e. without using prior sequence data. Here we present the results of side-by-side evaluations of six of the most widely used ab initio repeat finding programs. Using sequence from rice chromosome 12, tools were compared with regard to time requirements, ability to find known repeats, utility in identifying potential novel repeats, number and types of repeat elements recognized and compactness of family descriptions. The study reveals profound differences in the utility of the tools with some identifying virtually their entire substrate as repetitive, others making reasonable estimates of repetition, and some missing almost all repeats. Of note, even when tools recognized similar numbers of repeats they often showed marked differences in the nature and number of repeat families identified. Within the context of this comparative study, ReAS and RepeatScout showed the most promise in analysis of sequence reads and assembled genomic regions, respectively. Our results should help biologists identify the program(s), if any, that is best suited for their needs.

PMID: 18287116
PMCID: PMC2367713
DOI: 10.1093/nar/gkn064
[PubMed - indexed for MEDLINE]
Free PMC Article