Stat Methods Med Res. 2016 Dec;25(6):3038-3056. Epub 2014 May 26.

Logistic regression for dichotomized counts.

Author information

Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA.
Department of Statistics, University of Calcutta, Kolkata, India.
Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA.
Department of Dental Ecology, University of North Carolina, Chapel Hill, NC, USA.


Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren.
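As a rough illustration of the two-part structure described above, the following sketch writes out a generic zero-altered (hurdle) Poisson probability mass function. A logistic part gives the probability that the count is positive, and a zero-truncated Poisson part describes the positive counts. This is not the authors' shared-parameter formulation, which links the two parts; the function names and the parameters `eta` (linear predictor of the logistic part) and `lam` (Poisson rate of the count part) are assumptions made for this example.

```python
import math


def expit(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def hurdle_poisson_pmf(y, eta, lam):
    """P(Y = y) under a generic hurdle Poisson model (illustrative only).

    eta : linear predictor of the logistic part, so P(Y > 0) = expit(eta)
    lam : rate of the untruncated Poisson used for the positive counts
    """
    p_positive = expit(eta)
    if y == 0:
        return 1.0 - p_positive
    # Poisson pmf computed on the log scale to avoid overflow for large y
    log_pois = y * math.log(lam) - lam - math.lgamma(y + 1)
    # Rescale by P(Poisson > 0) so the positive part sums to p_positive
    return p_positive * math.exp(log_pois) / (1.0 - math.exp(-lam))


def prob_positive(eta, lam, upper=100):
    """P(Y > 0) recovered by summing the positive part of the pmf."""
    return sum(hurdle_poisson_pmf(y, eta, lam) for y in range(1, upper + 1))
```

In this generic form the dichotomized outcome depends only on `eta`, so the logistic coefficients keep their usual log odds ratio interpretation whatever value `lam` takes. Any efficiency gain over ordinary logistic regression would have to come from tying the two parts together, as the shared-parameter model in the abstract does.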


Keywords: binary data; dental caries; excess zeros; hurdle model; zero-altered Poisson regression; zero-inflation

[Indexed for MEDLINE]
