    Nat Methods. 2009 Sep;6(9):673-6. Epub 2009 Aug 2.

    Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated Markov models.

    Source

    Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA. abrady@umiacs.umd.edu

    Abstract

    Metagenomics projects collect DNA from uncharacterized environments that may contain thousands of species per sample. One main challenge facing metagenomic analysis is the phylogenetic classification of raw sequence reads into groups representing the same or similar taxa, a prerequisite for genome assembly and for analyzing the biological diversity of a sample. New sequencing technologies have made metagenomics both easier, by making sequencing faster, and more difficult, by producing shorter reads than previous technologies. Classification of reads as short as 100 base pairs has until now been relatively inaccurate, requiring researchers to use older, long-read technologies. We present Phymm, a classifier for metagenomic data that has been trained on 539 complete, curated genomes and can accurately classify reads as short as 100 base pairs, a substantial improvement over previous composition-based classification methods. We also describe how combining Phymm with sequence alignment algorithms improves accuracy.
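
    The general idea behind composition-based classification can be illustrated with a minimal sketch. This is not Phymm itself: Phymm uses interpolated Markov models, whereas the sketch below assumes a simpler fixed-order Markov model with add-one smoothing, one model per reference genome, and assigns a read to the genome whose model gives it the highest log-likelihood. The function names, the order of 3, and the toy genomes in the usage note are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov_model(genome, order=3):
    """Build a fixed-order Markov model from one genome string.

    Returns a function giving log P(base | preceding `order` bases),
    estimated from counts with add-one smoothing. This is a stand-in
    for, not an implementation of, an interpolated Markov model.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(order, len(genome)):
        context = genome[i - order:i]
        counts[context][genome[i]] += 1

    def log_prob(context, base):
        seen = counts.get(context, {})  # unseen contexts fall back to uniform
        total = sum(seen.values())
        return math.log((seen.get(base, 0) + 1) / (total + len(ALPHABET)))

    return log_prob


def score_read(log_prob, read, order=3):
    """Sum the log-probabilities of each base in the read given its context."""
    return sum(
        log_prob(read[i - order:i], read[i]) for i in range(order, len(read))
    )


def classify_read(models, read, order=3):
    """Return the label of the model that scores the read highest."""
    return max(models, key=lambda label: score_read(models[label], read, order))
```

    With two toy models, for example `models = {"at_rich": train_markov_model("AT" * 50), "gc_rich": train_markov_model("GC" * 50)}`, calling `classify_read(models, "ATATATATAT")` selects the `at_rich` label. Real reference genomes and reads are far less regular, which is one reason short reads are hard to classify by composition alone.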

    PMID: 19648916 [PubMed - indexed for MEDLINE]
    PMCID: PMC2762791