Bull World Health Organ. 2015 Apr 1;93(4):228-36. doi: 10.2471/BLT.14.139972. Epub 2015 Feb 27.

Data-driven methods for imputing national-level incidence in global burden of disease studies.

Author information

1. Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands.
2. Department of Virology, Parasitology and Immunology, Faculty of Veterinary Medicine, Ghent University, Salisburylaan 133, 9820 Merelbeke, Belgium.
3. Institute of Health and Society (IRSS), Université catholique de Louvain, Brussels, Belgium.
4. Centre for Statistics, Hasselt University, Diepenbeek, Belgium.
5. Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium.
6. Section of Veterinary Epidemiology, University of Zürich, Zürich, Switzerland.
7. Department of Food Science and Human Nutrition, Michigan State University, East Lansing, United States of America (USA).
8. Food Animal Production Medicine Section, School of Veterinary Medicine UW-Madison, Madison, USA.

Abstract

in English, Arabic, Chinese, French, Russian, Spanish

OBJECTIVE:

To develop transparent and reproducible methods for imputing missing data on disease incidence at the national level for the year 2005.

METHODS:

We compared several models for imputing missing country-level incidence rates for two foodborne diseases: congenital toxoplasmosis and aflatoxin-related hepatocellular carcinoma. Missing values were assumed to be missing at random. Predictor variables were selected using least absolute shrinkage and selection operator (LASSO) regression. We compared the predictive performance of naive extrapolation approaches with that of Bayesian random-effects and mixed-effects regression models. Leave-one-out cross-validation was used to evaluate model accuracy.
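As a rough illustration of the evaluation step only, the sketch below runs leave-one-out cross-validation on made-up data. It is not the authors' method: the two predictors (a global mean and a within-region mean) are simple stand-ins for the naive and hierarchical approaches, and the names `loo_cv_rmse`, `global_mean`, `region_mean`, `log_incidence` and `region` are hypothetical.

```python
import numpy as np

def loo_cv_rmse(y, regions, predict):
    """Leave-one-out CV: hold out each country once and predict it from the rest."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # mask that drops the held-out country
        pred = predict(y[keep], regions[keep], regions[i])
        errors.append(pred - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

def global_mean(y_train, regions_train, target_region):
    # Stand-in for a naive extrapolation: ignore regional grouping entirely.
    return float(np.mean(y_train))

def region_mean(y_train, regions_train, target_region):
    # Stand-in for borrowing strength from similar countries:
    # use the mean of the same region, falling back to the global mean.
    same = regions_train == target_region
    if same.any():
        return float(np.mean(y_train[same]))
    return float(np.mean(y_train))

# Hypothetical log-incidence values for six countries in two regions.
log_incidence = np.array([1.0, 1.2, 0.8, 3.0, 3.2, 2.8])
region = np.array(["A", "A", "A", "B", "B", "B"])

rmse_global = loo_cv_rmse(log_incidence, region, global_mean)
rmse_region = loo_cv_rmse(log_incidence, region, region_mean)
print(f"global mean RMSE: {rmse_global:.3f}")
print(f"region mean RMSE: {rmse_region:.3f}")
```

On this toy data the region-based predictor has the lower cross-validated error because the two regions differ strongly; with real incidence data the ranking could go either way, which is why the held-out comparison is needed.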

FINDINGS:

The predictive accuracy of the Bayesian mixed-effects models was significantly better than that of the naive extrapolation method for one of the two disease models. However, Bayesian mixed-effects models produced wider prediction intervals for both data sets.

CONCLUSION:

Several approaches are available for imputing missing data at the national level. The strengths of a hierarchical regression approach for this type of task are its ability to derive estimates from other, similar countries, its transparency, its computational efficiency and its ease of interpretation. The inclusion of informative covariates may improve model performance, but results should be appraised carefully.

PMID: 26229187
PMCID: PMC4431555
DOI: 10.2471/BLT.14.139972
[Indexed for MEDLINE]
Free PMC Article
