Microb Ecol Health Dis. 2015 May 29;26:27663. doi: 10.3402/mehd.v26.27663. eCollection 2015.

Analysis of composition of microbiomes: a novel method for studying microbial composition.

Author information

1 Department of Genes and Environment, Norwegian Institute of Public Health, Oslo, Norway.
2 Department of Microbiology and Immunology, Stanford University, Stanford, CA, USA.
3 Department of Health Statistics, Norwegian Institute of Public Health, Oslo, Norway.
4 Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
5 Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA.
6 Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Durham, NC, USA; peddada@niehs.nih.gov.

Abstract

BACKGROUND:

Understanding the factors regulating our microbiota is important but requires appropriate statistical methodology. When comparing two or more populations, most existing approaches either ignore the underlying compositional structure of microbiome data or rely on probability models, such as the multinomial and Dirichlet-multinomial distributions, which may impose a correlation structure not suitable for microbiome data.
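To make the compositional constraint concrete, here is a minimal Python sketch. The abundances are invented for illustration and do not come from the paper. Only taxon 0 changes in absolute terms, yet the relative abundances of all three taxa shift because they must sum to one, while the log-ratio between the two unchanged taxa stays the same.

```python
import math

def relative_abundance(counts):
    """Convert absolute abundances to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def log_ratio(values, i, j):
    """Natural log of the ratio between taxon i and taxon j."""
    return math.log(values[i] / values[j])

# Hypothetical absolute abundances for three taxa in two populations.
# Only taxon 0 differs (tenfold increase); taxa 1 and 2 are unchanged.
pop_a = [100.0, 200.0, 700.0]
pop_b = [1000.0, 200.0, 700.0]

rel_a = relative_abundance(pop_a)
rel_b = relative_abundance(pop_b)

# Every relative abundance shifts, including those of the unchanged taxa.
shifted = [abs(rel_a[k] - rel_b[k]) > 1e-9 for k in range(3)]

# The log-ratio of taxon 1 to taxon 2 is the same in both populations,
# because the unknown total cancels out of the ratio.
lr_a = log_ratio(rel_a, 1, 2)
lr_b = log_ratio(rel_b, 1, 2)

print("relative A:", rel_a)
print("relative B:", rel_b)
print("log-ratio taxon1/taxon2:", lr_a, lr_b)
```

A test applied taxon by taxon to the proportions would therefore tend to flag taxa 1 and 2 as well, which is the kind of false discovery the abstract describes.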

OBJECTIVE:

To develop a methodology that accounts for compositional constraints to reduce false discoveries in detecting differentially abundant taxa at an ecosystem level, while maintaining high statistical power.

METHODS:

We introduced a novel statistical framework called analysis of composition of microbiomes (ANCOM). ANCOM accounts for the underlying structure in the data and can be used for comparing the composition of microbiomes in two or more populations. ANCOM makes no distributional assumptions and can be implemented in a linear model framework to adjust for covariates as well as model longitudinal data. ANCOM also scales well to compare samples involving thousands of taxa.
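The abstract does not spell out the ANCOM procedure, so the following Python sketch only illustrates the general idea of a distribution-free comparison on log-ratios. It should not be read as the authors' algorithm. Everything specific in it is an assumption made for illustration: the exact permutation test, the pseudocount of 1, the 0.05 threshold, the absence of any multiple-testing correction, the function names, and the counts.

```python
import math
from itertools import combinations

def permutation_pvalue(a, b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = a + b
    n_a = len(a)
    idx = range(len(pooled))

    def mean_diff(group_a_idx):
        chosen = set(group_a_idx)
        in_a = [pooled[k] for k in idx if k in chosen]
        in_b = [pooled[k] for k in idx if k not in chosen]
        return sum(in_a) / len(in_a) - sum(in_b) / len(in_b)

    observed = abs(mean_diff(range(n_a)))
    splits = list(combinations(idx, n_a))
    # Small tolerance guards against floating-point ties.
    extreme = sum(1 for s in splits if abs(mean_diff(s)) >= observed - 1e-12)
    return extreme / len(splits)

def log_ratio_rejection_counts(group_a, group_b, alpha=0.05, pseudocount=1.0):
    """For each taxon, count the other taxa against which its log-ratio
    differs between the two groups (hypothetical helper, no FDR control)."""
    n_taxa = len(group_a[0])
    counts = [0] * n_taxa
    for i, j in combinations(range(n_taxa), 2):
        lr_a = [math.log((s[i] + pseudocount) / (s[j] + pseudocount)) for s in group_a]
        lr_b = [math.log((s[i] + pseudocount) / (s[j] + pseudocount)) for s in group_b]
        if permutation_pvalue(lr_a, lr_b) < alpha:
            counts[i] += 1
            counts[j] += 1
    return counts

# Invented counts: rows are samples, columns are taxa 0..3.
# Taxon 0 is much more abundant in group B; the others are similar.
group_a = [[10, 50, 30, 20], [12, 40, 35, 25], [9, 55, 28, 18], [11, 45, 33, 22]]
group_b = [[80, 48, 31, 21], [95, 42, 34, 24], [70, 52, 29, 19], [88, 46, 32, 23]]

rejection_counts = log_ratio_rejection_counts(group_a, group_b)
print(rejection_counts)
```

In this toy data, taxon 0 is rejected against all three other taxa, while each remaining taxon is rejected only in its ratio with taxon 0. A taxon whose log-ratios differ against most other taxa is the natural candidate for a differentially abundant taxon. A real analysis would need a principled cutoff and error-rate control, which this sketch omits.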

RESULTS:

We compared the performance of ANCOM with that of the standard t-test and the recently published zero-inflated Gaussian (ZIG) methodology (1) for drawing inferences on mean taxa abundance in two or more populations. ANCOM controlled the false discovery rate (FDR) at the desired nominal level while also improving power, whereas the t-test and ZIG had inflated FDRs, in some instances as high as 68% for the t-test and 60% for ZIG. We illustrate the performance of ANCOM using two publicly available microbial datasets from the human gut, demonstrating its general applicability to testing hypotheses about compositional differences in microbial communities.

CONCLUSION:

Accounting for compositionality using log-ratio analysis results in significantly improved inference in microbiota survey data.

KEYWORDS:

constrained; log-ratio; relative abundance

PMID: 26028277
PMCID: PMC4450248