    PLoS Biol. 2005 Jan;3(1):e10. Epub 2005 Jan 4.

    A model of the statistical power of comparative genome sequence analysis.

    Source

    Howard Hughes Medical Institute and Department of Genetics, Washington University School of Medicine, Saint Louis, Missouri, United States of America. eddy@genetics.wustl.edu

    Abstract

    Comparative genome sequence analysis is powerful, but sequencing genomes is expensive. It is desirable to be able to predict how many genomes are needed for comparative genomics, and at what evolutionary distances. Here I describe a simple mathematical model for the common problem of identifying conserved sequences. The model leads to some useful rules of thumb. For a given evolutionary distance, the number of comparative genomes needed for a constant level of statistical stringency in identifying conserved regions scales inversely with the size of the conserved feature to be detected. At short evolutionary distances, the number of comparative genomes required also scales inversely with distance. These scaling behaviors provide some intuition for future comparative genome sequencing needs, such as the proposed use of "phylogenetic shadowing" methods using closely related comparative genomes, and the feasibility of high-resolution detection of small conserved features.
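
    The two rules of thumb stated in the abstract can be illustrated with a short sketch. This is not the paper's model, which the abstract does not give. It only encodes the two stated proportionalities (genome count inversely proportional to feature size, and inversely proportional to evolutionary distance at short distances) by rescaling from a reference point. The function name `genomes_needed`, its parameters, and all numbers in the usage lines are hypothetical and chosen purely for illustration.

```python
def genomes_needed(n_ref, feature_len_ref, feature_len, distance_ref, distance):
    """Rescale a reference genome count using two inverse-scaling rules.

    Assumption (not from the paper): a pure proportional form
        N = n_ref * (feature_len_ref / feature_len) * (distance_ref / distance)
    The distance term is only meant to apply at short evolutionary distances.
    """
    values = (n_ref, feature_len_ref, feature_len, distance_ref, distance)
    if min(values) <= 0:
        raise ValueError("all arguments must be positive")
    return n_ref * (feature_len_ref / feature_len) * (distance_ref / distance)


# Hypothetical reference point: 10 genomes suffice for a 50-site feature
# at distance 0.1 (arbitrary units). These numbers are illustrative only.
half_size = genomes_needed(10, 50, 25, 0.1, 0.1)       # feature half as long
half_distance = genomes_needed(10, 50, 50, 0.1, 0.05)  # genomes half as distant
print(half_size, half_distance)
```

    Under these assumptions, halving the feature size or halving the evolutionary distance each doubles the number of comparative genomes required, which is the intuition behind the cost of high-resolution detection and of using closely related genomes.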

    PMID: 15660152
    [PubMed - indexed for MEDLINE]
    PMCID: PMC539325
    Free PMC Article