Genes (Basel). 2018 Jun 20;9(6). pii: E313. doi: 10.3390/genes9060313.

CoreProbe: A Novel Algorithm for Estimating Relative Abundance Based on Metagenomic Reads.

Author information

School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China.
Sinotech Genomics, Shanghai 200120, China.
Department of Medicine, Stanford University School of Medicine, 269 Campus Dr., Stanford, CA 94305, USA.


With the rapid development of high-throughput sequencing technology, the analysis of metagenomic sequencing data and the accurate, efficient estimation of relative microbial abundance have become important ways to explore the composition and function of microbial communities. The accuracy and efficiency of relative abundance estimation depend closely on the algorithm and on the choice of reference sequences used for sequence alignment. We introduced the microbial core genome as the reference sequence for potential microbes in a metagenomic sample, constructed a finite mixture model and a latent Dirichlet model, and used the Gibbs sampling algorithm to estimate the relative abundance of microorganisms. Simulation results showed that our approach improves efficiency while maintaining high accuracy, and is better suited to high-throughput metagenomic data. The new approach is implemented in our CoreProbe package, which provides a pipeline for accurate and efficient estimation of the relative abundance of microbes in a community. The tool is available free of charge from the CoreProbe website. The Docker image can be obtained with the following instruction: sudo docker pull panhongfei/coreprobe:1.0.
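To illustrate the general idea of estimating relative abundance with a finite mixture model, a Dirichlet prior, and Gibbs sampling, the following is a minimal sketch and not the CoreProbe implementation. It assumes that each read already has a likelihood under each candidate taxon (for example, derived from alignment to core genomes), and that every read has a positive likelihood for at least one taxon. The function names `sample_dirichlet` and `gibbs_abundance`, the parameter values, and the toy likelihood matrix are all hypothetical.

```python
import random

def sample_dirichlet(params, rng):
    # Draw from a Dirichlet distribution by normalizing independent gamma draws.
    draws = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(draws)
    return [d / total for d in draws]

def gibbs_abundance(lik, alpha=1.0, n_iter=500, burn_in=100, seed=0):
    # lik[i][k]: assumed likelihood of read i given taxon k (hypothetical input).
    # Returns the posterior mean of the mixture weights (relative abundances).
    rng = random.Random(seed)
    n_taxa = len(lik[0])
    pi = [1.0 / n_taxa] * n_taxa          # start from uniform abundances
    mean_pi = [0.0] * n_taxa
    kept = n_iter - burn_in
    for it in range(n_iter):
        counts = [0] * n_taxa
        # Step 1: sample a latent taxon assignment for every read.
        for row in lik:
            weights = [p * l for p, l in zip(pi, row)]
            z = rng.choices(range(n_taxa), weights=weights)[0]
            counts[z] += 1
        # Step 2: sample abundances from the Dirichlet posterior given the counts.
        pi = sample_dirichlet([alpha + c for c in counts], rng)
        # Accumulate the running average after the burn-in period.
        if it >= burn_in:
            for k in range(n_taxa):
                mean_pi[k] += pi[k] / kept
    return mean_pi

# Toy data: 100 reads over 3 taxa, each read strongly favoring one taxon.
toy_lik = ([[0.90, 0.05, 0.05]] * 60 +
           [[0.05, 0.90, 0.05]] * 30 +
           [[0.05, 0.05, 0.90]] * 10)
est = gibbs_abundance(toy_lik)
print([round(x, 3) for x in est])
```

In this sketch, reads that match several taxa are assigned probabilistically in each iteration, so ambiguous reads contribute to each taxon in proportion to the current abundance estimate rather than being discarded or counted multiple times.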


Dirichlet model; Gibbs sampling; core genome; metagenomics; relative abundance estimation
