Med Care. 2013 Mar;51(3):251-8. doi: 10.1097/MLR.0b013e31827da594.

Improved cardiovascular risk prediction using nonparametric regression and electronic health record data.

Author information

  • VA Center for Clinical Management Research, Ann Arbor VA Health Services Research and Development Center of Excellence, University of Michigan, Ann Arbor, MI, USA.



Use of the electronic health record (EHR) is expected to increase rapidly in the near future, yet little research exists on whether analyzing internal EHR data using flexible, adaptive statistical methods could improve clinical risk prediction. Extensive implementation of EHR in the Veterans Health Administration provides an opportunity for exploration.


To compare the performance of various approaches for predicting risk of cerebrovascular and cardiovascular (CCV) death, using traditional risk predictors versus more comprehensive EHR data.


Retrospective cohort study. We identified all Veterans Health Administration patients without recent CCV events treated at 12 facilities from 2003 to 2007, and predicted risk using the Framingham risk score, logistic regression, generalized additive modeling, and gradient tree boosting.
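To make the idea of gradient tree boosting concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single numeric predictor, one-split trees ("stumps"), and a logistic loss; the function names (`fit_stump`, `boost`, `predict`), the learning rate, and the toy ages and outcomes are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Map a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Find the single split on xs that best fits the residuals (least squares)."""
    values = sorted(set(xs))
    if len(values) < 2:
        raise ValueError("need at least two distinct predictor values")
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost(xs, ys, rounds=50, learning_rate=0.5):
    """Fit an additive model of stumps to binary outcomes ys (0 or 1)."""
    base_rate = sum(ys) / len(ys)
    intercept = math.log(base_rate / (1.0 - base_rate))
    scores = [intercept] * len(xs)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the logistic loss: observed minus predicted risk.
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        threshold, left, right = fit_stump(xs, residuals)
        stumps.append((threshold, left, right))
        scores = [s + learning_rate * (left if x <= threshold else right)
                  for x, s in zip(xs, scores)]
    return intercept, stumps, learning_rate


def predict(model, x):
    """Predicted risk for one predictor value x."""
    intercept, stumps, learning_rate = model
    score = intercept + sum(learning_rate * (left if x <= t else right)
                            for t, left, right in stumps)
    return sigmoid(score)


# Hypothetical toy data: risk is high at both ends of the age range.
ages = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
died = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
model = boost(ages, died)
```

In this toy setting the boosted stumps can follow a U-shaped relationship, whereas a model that is linear in age on the logit scale cannot represent that shape without added terms. This is one sense in which such methods are described as flexible.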


The outcome was CCV-related death within 5 years. We assessed each method's predictive performance with the area under the receiver operating characteristic curve (AUC), the Hosmer-Lemeshow goodness-of-fit test, plots of estimated risk, and reclassification tables, using cross-validation to penalize overfitting.
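As an illustration of two of these measures, the sketch below computes the AUC as the proportion of (event, non-event) pairs in which the event received the higher predicted risk, and the Hosmer-Lemeshow statistic as a sum over risk-ordered groups of squared differences between observed and expected events. It is a minimal sketch under stated assumptions: the function names are hypothetical, the grouping uses equal-sized bins of sorted predictions, and the p-value (which requires a chi-square distribution) is omitted.

```python
def auc(labels, risks):
    """Probability that a random event outranks a random non-event (ties count 1/2)."""
    events = [r for y, r in zip(labels, risks) if y == 1]
    non_events = [r for y, r in zip(labels, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def hosmer_lemeshow_statistic(labels, risks, groups=10):
    """Chi-square-type statistic comparing observed and expected events per risk group."""
    pairs = sorted(zip(risks, labels))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(r for r, _ in chunk)
        mean_risk = expected / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic
```

A larger AUC indicates better discrimination, while a smaller Hosmer-Lemeshow statistic indicates closer agreement between predicted and observed event counts. To limit optimism from overfitting, both would be computed on predictions for patients held out of model fitting, as in cross-validation.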


Regression methods outperformed the Framingham risk score, even with the same predictors (AUC increased from 71% to 73% and calibration also improved). Even better performance was attained in models using additional EHR-derived predictor variables (AUC increased to 78% and net reclassification improvement was as large as 0.29). Nonparametric regression further improved calibration and discrimination compared with logistic regression.
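For readers unfamiliar with net reclassification improvement (NRI), the sketch below shows the usual categorical form: among patients who had the event, the proportion moved to a higher risk category minus the proportion moved lower, plus the reverse among patients who did not. This is a minimal sketch; the function names and the cutoffs in the usage example are hypothetical, since the abstract does not state which risk categories were used.

```python
from bisect import bisect_right


def risk_category(risk, cutoffs):
    """Index of the risk category, given ascending category cutoffs."""
    return bisect_right(cutoffs, risk)


def net_reclassification_improvement(labels, old_risks, new_risks, cutoffs):
    """Categorical NRI of a new model relative to an old one."""
    up_event = down_event = up_non_event = down_non_event = 0
    n_event = n_non_event = 0
    for y, old, new in zip(labels, old_risks, new_risks):
        move = risk_category(new, cutoffs) - risk_category(old, cutoffs)
        if y == 1:
            n_event += 1
            up_event += move > 0
            down_event += move < 0
        else:
            n_non_event += 1
            up_non_event += move > 0
            down_non_event += move < 0
    if n_event == 0 or n_non_event == 0:
        raise ValueError("need at least one event and one non-event")
    return ((up_event - down_event) / n_event
            + (down_non_event - up_non_event) / n_non_event)


# Hypothetical example with cutoffs at 10% and 20% predicted risk.
example_nri = net_reclassification_improvement(
    labels=[1, 1, 0, 0],
    old_risks=[0.05, 0.15, 0.15, 0.05],
    new_risks=[0.15, 0.15, 0.05, 0.05],
    cutoffs=[0.10, 0.20],
)
```

The value ranges from -2 to 2, with positive values meaning the new model moves patients toward the category consistent with their outcome more often than away from it.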


Despite the EHR lacking some risk factors and its imperfect data quality, health care systems may be able to substantially improve risk prediction for their patients by using internally developed EHR-derived models and flexible statistical methodology.

[PubMed - indexed for MEDLINE]