Am J Hum Genet. 2015 May 7;96(5):797-807. doi: 10.1016/j.ajhg.2015.04.003.

Testing in Microbiome-Profiling Studies with MiRKAT, the Microbiome Regression-Based Kernel Association Test.

Author information

1 Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
2 Division of Biomedical Statistics and Informatics and Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA. Electronic address: chen.jun2@mayo.edu.
3 Division of Gastroenterology and Hepatology, Center for Gastrointestinal Biology and Disease, University of North Carolina at Chapel Hill, Chapel Hill, NC 27516, USA.
4 Department of Maternal and Child Health, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
5 Department of Human Genetics, Emory University, Atlanta, GA 30322, USA.
6 Department of Statistics, North Carolina State University, Cary, Raleigh, NC 27695, USA.
7 Division of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ 85724, USA.
8 Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA 19014, USA.
9 Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA. Electronic address: mcwu@fhcrc.org.

Abstract

High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals' microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. "Optimal" MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for.
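
To make the kernel-based test described in the abstract more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that a pairwise distance matrix is turned into a kernel by Gower centering of the squared distances with negative eigenvalues set to zero, and that the score statistic is the quadratic form of covariate-adjusted residuals in that kernel for a continuous outcome. The abstract states that MiRKAT computes p values analytically; this sketch swaps in a simple permutation of residuals instead, which is only an approximation. All function names are hypothetical.

```python
import numpy as np


def distance_to_kernel(D):
    """Turn an n x n pairwise distance matrix into a PSD kernel matrix.

    Assumption: K = -1/2 * J * D^2 * J with J = I - 11'/n (Gower centering),
    followed by clipping negative eigenvalues to zero.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * J @ (D ** 2) @ J      # D ** 2 squares each distance elementwise
    K = (K + K.T) / 2.0              # remove floating-point asymmetry
    w, V = np.linalg.eigh(K)
    w = np.clip(w, 0.0, None)        # force positive semi-definiteness
    return (V * w) @ V.T             # V diag(w) V'


def score_statistic(y, X, K):
    """Return Q = r' K r and the residuals r of y regressed on [1, X]."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ beta
    return float(r @ K @ r), r


def permutation_pvalue(y, X, K, n_perm=999, seed=0):
    """Approximate p value obtained by permuting the adjusted residuals.

    This is a stand-in for the analytical p value mentioned in the abstract.
    """
    rng = np.random.default_rng(seed)
    q_obs, r = score_statistic(y, X, K)
    exceed = 0
    for _ in range(n_perm):
        rp = rng.permutation(r)
        if rp @ K @ rp >= q_obs:
            exceed += 1
    return (1 + exceed) / (n_perm + 1)
```

In use, `D` would be any between-sample distance matrix (for example a phylogenetic distance computed elsewhere), `y` a continuous outcome vector, and `X` a matrix of covariates to adjust for. Examining several candidate distances at once, as "Optimal" MiRKAT does, is not covered by this sketch.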

PMID: 25957468
PMCID: PMC4570290
DOI: 10.1016/j.ajhg.2015.04.003
[Indexed for MEDLINE]
Free PMC Article
