BMC Bioinformatics. 2015 Aug 28;16:272. doi: 10.1186/s12859-015-0707-9.

Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates.

Author information

1. Baylor Research Institute, 3310 Live Oak, Dallas, 75204, TX, USA. jacob.turner1@baylorhealth.edu.
2. Department of Microbiology and Immunology, Stanford University School, Stanford, 94305, CA, USA. cbolen@stanford.edu.
3. Baylor Research Institute, 3310 Live Oak, Dallas, 75204, TX, USA. DerekBl@BaylorHealth.edu.

Abstract

BACKGROUND:

Gene set analysis (GSA) of gene expression data can be highly powerful when the biological signal is weak compared to other sources of variability in the data. However, many gene set analysis approaches rely on permutation tests, which are not appropriate for complex study designs. For example, permuting samples breaks the within-subject correlation when comparing time points in a longitudinal study. Linear mixed models provide a way to analyze longitudinal studies, adjust for potential confounding factors, and account for sources of variability that are not of primary interest. Currently, there are no known gene set analysis approaches that fully account for these aspects of study design and analysis. To address this, we generalize the QuSAGE gene set analysis algorithm, denoting the generalization Q-Gen, and provide the estimation adjustments needed to incorporate linear mixed model analyses.
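
The point about within-subject correlation can be illustrated with a small numerical sketch. The code below is not Q-Gen or QuSAGE; it is a minimal, standard-library-only illustration using hypothetical expression values for a single gene measured in five subjects at two time points. It compares the standard error of the mean difference when the two time points are treated as independent groups with the standard error obtained from within-subject differences. The data and the function names (`unpaired_se`, `paired_se`) are assumptions made for illustration only.

```python
import math
import statistics

# Hypothetical expression values for one gene in five subjects.
# Position i in each list refers to the same subject.
baseline = [1.0, 3.0, 5.0, 7.0, 9.0]
followup = [1.4, 3.6, 5.5, 7.3, 9.7]


def unpaired_se(a, b):
    """Standard error of the mean difference, treating a and b as independent groups."""
    return math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))


def paired_se(a, b):
    """Standard error of the mean within-subject difference (b minus a)."""
    diffs = [y - x for x, y in zip(a, b)]
    return statistics.stdev(diffs) / math.sqrt(len(diffs))


# The estimated effect is the same under both analyses.
mean_diff = statistics.mean(followup) - statistics.mean(baseline)

# Only the uncertainty attached to the effect changes.
se_unpaired = unpaired_se(baseline, followup)
se_paired = paired_se(baseline, followup)

print(f"mean difference:    {mean_diff:.3f}")
print(f"SE ignoring pairing: {se_unpaired:.3f}")
print(f"SE using pairing:    {se_paired:.3f}")
```

In this toy data set most of the variability is between subjects, so ignoring the pairing inflates the standard error considerably. A linear mixed model with a random subject effect extends the same idea to more than two time points and to additional covariates.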

RESULTS:

We assessed the performance of our generalized method in comparison to the original QuSAGE method in settings such as longitudinal repeated measures analysis and adjustment for potential confounders. We demonstrate that the original QuSAGE method cannot control the type-I error rate when these complexities exist. In addition to statistical appropriateness, analysis of a longitudinal influenza study suggests Q-Gen can allow for greater sensitivity when exploring a large number of gene sets.

CONCLUSIONS:

Q-Gen is an extension of the QuSAGE gene set analysis method that allows linear mixed models to be applied appropriately within a gene set analysis framework. It gives GSA an added layer of flexibility that was not previously available. This flexibility allows for more appropriate statistical modeling of the complex data structures inherent to many microarray study designs and can provide more sensitivity.

PMID:
26316107
PMCID:
PMC4551517
DOI:
10.1186/s12859-015-0707-9
[Indexed for MEDLINE]
Free PMC Article
