Stat Methods Med Res. 2017 Aug;26(4):1802-1823. doi: 10.1177/0962280215588569. Epub 2015 May 31.

Approaches for dealing with various sources of overdispersion in modeling count data: Scale adjustment versus modeling.

Author information

1 Department of Public Health Sciences - Biostatistics, Medical University of South Carolina, Charleston, SC, USA.
2 Health Equity and Rural Outreach Innovation Center (HEROIC), Ralph H. Johnson Department of Veterans Affairs Medical Center, Charleston, SC, USA.
3 Division of Biostatistics, Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC, USA.

Abstract

Overdispersion is a common problem in count data. It can arise from excess population heterogeneity, omission of key predictors, or outliers. Unless properly handled, it can lead to invalid inference. Our goal is to assess the differential performance of methods for dealing with overdispersion from several sources. We considered six approaches: unadjusted Poisson regression (Poisson), deviance-scale-adjusted Poisson regression (DS-Poisson), Pearson-scale-adjusted Poisson regression (PS-Poisson), negative-binomial regression (NB), and two generalized linear mixed models (GLMM) with a random intercept and log link, one with a Poisson distribution (Poisson-GLMM) and one with a negative-binomial distribution (NB-GLMM). To rank the models, we used Akaike information criterion (AIC) and Bayesian information criterion (BIC) values, standard errors, and 95% confidence-interval coverage of the parameter values. To compare these methods, we used simulated count data with overdispersion of different magnitudes from three different sources. The mean of the count response was associated with three predictors. Data from two real case studies were also analyzed. The simulation results showed that NB and NB-GLMM were preferred for dealing with overdispersion resulting from any of the sources we considered. Poisson and DS-Poisson often produced smaller standard-error estimates than expected, while PS-Poisson conversely produced larger standard-error estimates. Thus, it is good practice to compare several model options to determine the best method of modeling count data.
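The scale adjustments named in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design: the sample size, coefficients, gamma-mixing shape, and the helper name `fit_poisson_irls` are all hypothetical choices made for illustration. The sketch simulates counts with excess heterogeneity, fits an unadjusted log-link Poisson regression, and then inflates the standard errors by the square root of the Pearson and deviance dispersion estimates, which is the usual form of PS- and DS-type adjustments.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=50, tol=1e-10):
    """Fit a log-link Poisson regression by iteratively reweighted least squares.

    Hypothetical helper for illustration; assumes column 0 of X is an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu           # working response
        XtW = X.T * mu                    # IRLS weights equal mu for this model
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    mu = np.exp(X @ beta)
    cov = np.linalg.inv((X.T * mu) @ X)   # model-based covariance, scale fixed at 1
    return beta, mu, cov

# Simulated data (all values below are assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])  # three predictors
beta_true = np.array([0.5, 0.3, -0.2, 0.1])
mu_true = np.exp(X @ beta_true)

# Gamma mixing with mean 1 adds heterogeneity, so Var(y) exceeds E(y).
frailty = rng.gamma(shape=2.0, scale=0.5, size=n)
y = rng.poisson(mu_true * frailty).astype(float)

beta_hat, mu_hat, cov = fit_poisson_irls(X, y)
df_resid = n - X.shape[1]

# Pearson dispersion: Pearson chi-square divided by residual degrees of freedom.
phi_pearson = np.sum((y - mu_hat) ** 2 / mu_hat) / df_resid

# Deviance dispersion: Poisson deviance divided by residual degrees of freedom.
# The y * log(y / mu) term is defined as 0 when y == 0.
safe_y = np.where(y > 0, y, 1.0)
log_term = np.where(y > 0, y * np.log(safe_y / mu_hat), 0.0)
phi_deviance = 2.0 * np.sum(log_term - (y - mu_hat)) / df_resid

se_naive = np.sqrt(np.diag(cov))          # unadjusted Poisson
se_ps = se_naive * np.sqrt(phi_pearson)   # Pearson-scale adjusted
se_ds = se_naive * np.sqrt(phi_deviance)  # deviance-scale adjusted

print("Pearson dispersion :", round(phi_pearson, 3))
print("Deviance dispersion:", round(phi_deviance, 3))
print("SE unadjusted      :", np.round(se_naive, 4))
print("SE Pearson-scaled  :", np.round(se_ps, 4))
print("SE deviance-scaled :", np.round(se_ds, 4))
```

Both adjustments leave the coefficient estimates unchanged and only rescale the standard errors, whereas NB and the GLMMs model the extra variation directly. That distinction is the "scale adjustment versus modeling" contrast in the title.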

KEYWORDS:

Count data; Poisson; generalized linear mixed model; negative-binomial; overdispersion

PMID: 26031359
DOI: 10.1177/0962280215588569
[Indexed for MEDLINE]
