Comput Stat Data Anal. 2017 Oct;114:105-118. doi: 10.1016/j.csda.2017.04.008. Epub 2017 Apr 29.

A parametric model to estimate the proportion from true null using a distribution for p-values.

Author information

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37232, U.S.A.
Department of Biostatistics, Yale University, New Haven, CT 06520, U.S.A.


Microarray studies generate a large number of p-values from many gene expression comparisons. Estimating the proportion of these p-values that arise under the null hypothesis is of broad interest, and the two-component mixture model is often used for this purpose. If the data are generated under the null hypothesis, the p-values follow the uniform distribution. What is the distribution of p-values when the data are sampled from the alternative hypothesis? This distribution is derived for the chi-squared test and then used to estimate the proportion of p-values sampled from the null hypothesis in a parametric framework. Simulation studies evaluate the performance of the new method in comparison with five recent methods. Even in scenarios with clusters of correlated p-values and a multicomponent or continuous mixture in the alternative, the new method performs robustly. The methods are demonstrated through an analysis of a real microarray dataset.
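The abstract does not give the formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a chi-squared test with 1 degree of freedom whose statistic is Z**2 with Z ~ N(delta, 1) under the alternative, so that the p-value density under the alternative reduces to exp(-delta^2 / 2) * cosh(delta * |z|), and it fits the two-component mixture h(p) = pi0 + (1 - pi0) * f(p; delta) by a coarse grid search over (pi0, delta). The single noncentrality value, the grid search, the simulated data, and all function names are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and quantiles


def abs_z_from_p(p):
    """Recover |z| from a two-sided p-value p = 2 * (1 - Phi(|z|))."""
    return -STD_NORMAL.inv_cdf(p / 2.0)


def alt_density(p, delta):
    """Density of p when the 1-df statistic is Z**2 with Z ~ N(delta, 1).

    This is the ratio of the noncentral to the central density at the
    observed statistic; for 1 df it equals exp(-delta^2/2) * cosh(delta*|z|).
    """
    return math.exp(-0.5 * delta * delta) * math.cosh(delta * abs_z_from_p(p))


def fit_mixture(pvalues, pi0_grid, delta_grid):
    """Grid-search maximum likelihood for h(p) = pi0 + (1 - pi0) * f(p; delta)."""
    abs_z = [abs_z_from_p(p) for p in pvalues]
    best_ll, best_pi0, best_delta = -math.inf, None, None
    for delta in delta_grid:
        scale = math.exp(-0.5 * delta * delta)
        f_alt = [scale * math.cosh(delta * t) for t in abs_z]
        for pi0 in pi0_grid:
            ll = sum(math.log(pi0 + (1.0 - pi0) * f) for f in f_alt)
            if ll > best_ll:
                best_ll, best_pi0, best_delta = ll, pi0, delta
    return best_pi0, best_delta


def simulate_pvalues(m, pi0, delta, rng):
    """Draw m two-sided p-values; a fraction pi0 (on average) comes from the null."""
    pvalues = []
    for _ in range(m):
        mean = 0.0 if rng.random() < pi0 else delta
        z = rng.gauss(mean, 1.0)
        pvalues.append(2.0 * STD_NORMAL.cdf(-abs(z)))
    return pvalues


# Hypothetical simulated data: 2000 tests, 70% true nulls, delta = 3.
rng = random.Random(2017)
pvalues = simulate_pvalues(2000, pi0=0.7, delta=3.0, rng=rng)

pi0_grid = [i / 50 for i in range(1, 50)]          # 0.02, 0.04, ..., 0.98
delta_grid = [0.5 + 0.25 * j for j in range(19)]   # 0.5, 0.75, ..., 5.0
pi0_hat, delta_hat = fit_mixture(pvalues, pi0_grid, delta_grid)
print(f"estimated pi0 = {pi0_hat:.2f}, estimated delta = {delta_hat:.2f}")
```

With delta = 0 the alternative density collapses to the uniform density, which is why the null proportion is only identifiable when the alternative is separated from the null. In practice a numerical optimizer would likely replace the grid search, and a general degrees-of-freedom version would use the noncentral chi-squared density instead of the 1-df closed form.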


Keywords: distribution of p-values; microarray studies; mixture model; proportion from the null hypothesis
