Accid Anal Prev. 2018 Aug;117:181-195. doi: 10.1016/j.aap.2018.04.016. Epub 2018 Apr 27.

Bayesian Poisson hierarchical models for crash data analysis: Investigating the impact of model choice on site-specific predictions.

Author information

1. Uber Technologies, Inc., San Francisco, CA, 94103, United States. Electronic address: hadi@uber.edu.
2. Department of Statistics, Texas A&M University, College Station, TX, 77843-3143, United States. Electronic address: vjohnson@stat.tamu.edu.
3. Zachry Department of Civil Engineering, Texas A&M University, College Station, TX, 77843-3136, United States. Electronic address: d-lord@tamu.edu.

Abstract

The Poisson-gamma (PG) and Poisson-lognormal (PLN) regression models are among the most popular tools for motor vehicle crash data analysis. Both belong to the Poisson-hierarchical family of models. While numerous studies have compared the overall performance of alternative Bayesian Poisson-hierarchical models, little research has addressed the impact of model choice on the expected crash frequency predicted at individual sites. This paper examined whether the candidate models' predictions show systematic trends, e.g., whether one model's predictions for sites with certain conditions tend to be higher (or lower) than another's. In addition to the PG and PLN models, this research formulated a new member of the Poisson-hierarchical family: the Poisson-inverse gamma (PIGam) model. Three field datasets (from Texas, Michigan and Indiana) covering a wide range of over-dispersion characteristics were selected for analysis. The study demonstrated that model choice can be critical when the calibrated models are used for prediction at new sites, especially when the data are highly over-dispersed. For all three datasets, the PIGam model predicted the highest expected crash frequencies, followed by the PLN model and then the PG model, indicating a clear link between the models' predictions and the shape of their mixing distributions (inverse gamma, lognormal, and gamma, respectively). The thicker tails of the PIGam and PLN models (in that order) may provide an advantage when the data are highly over-dispersed. The analysis results also illustrated a major deficiency of the Deviance Information Criterion (DIC) in comparing the goodness-of-fit of hierarchical models: models with drastically different sets of coefficients (and thus different predictions for new sites) may yield similar DIC values, because the DIC accounts only for the parameters at the lowest (observation) level of the hierarchy and ignores the higher levels (regression coefficients).
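
The link between the mixing distribution and tail thickness can be illustrated with a small simulation. The sketch below is not the paper's formulation or data: it assumes a multiplicative site effect with mean 1 and a common variance for all three mixing distributions, and the names `draw_effects`, `mu`, `variance`, and `n_sites` are hypothetical choices made for illustration. It draws site effects from gamma, lognormal, and inverse gamma distributions, generates Poisson counts from them, and compares an upper quantile of the site effects.

```python
import numpy as np


def draw_effects(kind, variance, size, rng):
    """Draw multiplicative site effects with mean 1 and the given variance.

    Matching the first two moments across distributions is an assumption
    of this sketch, not necessarily the parameterization used in the paper.
    """
    if kind == "gamma":
        # Gamma(shape=1/v, scale=v) has mean 1 and variance v.
        return rng.gamma(shape=1.0 / variance, scale=variance, size=size)
    if kind == "lognormal":
        # sigma^2 = ln(1 + v) and mean = -sigma^2 / 2 give mean 1, variance v.
        sigma2 = np.log1p(variance)
        return rng.lognormal(mean=-0.5 * sigma2, sigma=np.sqrt(sigma2), size=size)
    if kind == "inverse_gamma":
        # InvGamma(shape=a, scale=a-1) with a = 2 + 1/v has mean 1, variance v.
        shape = 2.0 + 1.0 / variance
        scale = shape - 1.0
        # If X ~ Gamma(a, 1), then scale / X ~ InvGamma(a, scale).
        return scale / rng.gamma(shape=shape, scale=1.0, size=size)
    raise ValueError(f"unknown mixing distribution: {kind}")


rng = np.random.default_rng(0)
mu = 2.0             # hypothetical mean crash frequency shared by all sites
variance = 1.0       # hypothetical variance of the site effect
n_sites = 1_000_000  # large sample so the upper quantile is stable

tail_q = {}      # 99.9th percentile of the site effect, per distribution
mean_count = {}  # average simulated crash count, per distribution
for kind in ("gamma", "lognormal", "inverse_gamma"):
    eps = draw_effects(kind, variance, n_sites, rng)
    counts = rng.poisson(mu * eps)  # Poisson counts given the site effect
    tail_q[kind] = np.quantile(eps, 0.999)
    mean_count[kind] = counts.mean()
    print(f"{kind:14s} mean count = {mean_count[kind]:.3f}, "
          f"99.9% site effect = {tail_q[kind]:.2f}")
```

Under these assumptions the three mixtures share the same average count, while the extreme site effects grow from gamma to lognormal to inverse gamma, which mirrors the ordering of tail thickness described in the abstract. The sketch says nothing about fitted regression coefficients or the DIC.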

KEYWORDS:

Bayesian model; Model choice; Poisson hierarchical; Poisson-inverse gamma; Site-specific prediction

PMID: 29705601
DOI: 10.1016/j.aap.2018.04.016
[Indexed for MEDLINE]
