J Am Med Inform Assoc. 2019 Dec 1;26(12):1458-1465. doi: 10.1093/jamia/ocz136.

What health records data are required for accurate prediction of suicidal behavior?

Author information

Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA.
HealthPartners Institute, Minneapolis, Minnesota, USA.
Center for Health Research, Kaiser Permanente Northwest, Portland, Oregon, USA.



The study sought to evaluate how the availability of different types of health records data affects the accuracy of machine learning models predicting suicidal behavior.


Records from 7 large health systems identified 19 061 056 outpatient visits to mental health specialty or general medical providers between 2009 and 2015. Machine learning models (logistic regression with penalized LASSO [least absolute shrinkage and selection operator] variable selection) were developed to predict suicide death (n = 1240) or probable suicide attempt (n = 24 133) in the following 90 days. Base models used only historical insurance claims data and were then augmented with data regarding sociodemographic characteristics (race, ethnicity, and neighborhood characteristics), past patient-reported outcome questionnaires from electronic health records, and data (diagnoses and questionnaires) recorded during the visit.
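As an illustration of how a LASSO penalty performs variable selection in a logistic regression, the following is a minimal sketch and not the study's implementation. It assumes a simple proximal-gradient (soft-thresholding) solver, which the abstract does not specify, and a made-up two-predictor toy dataset; the names fit_lasso_logistic, X_toy, and y_toy are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    """Shrink value toward zero; values within the threshold become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, penalty, step=0.1, n_iter=500):
    """L1-penalized logistic regression fitted by proximal gradient descent.

    The intercept is left unpenalized. The solver is an assumption made for
    this sketch, not the method reported in the study.
    """
    n, p = len(X), len(X[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residual = predicted probability minus observed label, per row.
        residuals = [
            sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) - label
            for row, label in zip(X, y)
        ]
        grad_b = sum(residuals) / n
        grad_w = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)]
        intercept -= step * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal operator).
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, grad_w)
        ]
    return intercept, weights


# Made-up toy data: column 0 is associated with the outcome, column 1 is not.
X_toy = [[1, 1], [1, -1], [1, 1], [1, -1], [-1, 1], [-1, -1], [-1, 1], [-1, -1]]
y_toy = [1, 1, 1, 0, 0, 0, 0, 1]

intercept, weights = fit_lasso_logistic(X_toy, y_toy, penalty=0.05)
print(weights)  # the uninformative predictor's coefficient is shrunk to exactly 0
```

In this toy setting the penalty keeps the informative predictor and drops the uninformative one, which is the sense in which the LASSO selects variables. A larger penalty removes more predictors, and a sufficiently large one removes all of them.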


For prediction of any attempt following mental health specialty visits, a model limited to historical insurance claims data performed approximately as well (C-statistic 0.843) as a model using all available data (C-statistic 0.850). For prediction of suicide attempt following a general medical visit, addition of data recorded during the visit yielded a meaningful improvement over a model using all data up to the prior day (C-statistic 0.853 vs 0.838).
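The C-statistic reported above is the probability that a randomly chosen visit followed by an event receives a higher predicted risk than a randomly chosen visit without one. The sketch below computes it by direct pairwise comparison on made-up scores; the function name c_statistic and the toy values are hypothetical, and the study's own computation is not described in the abstract.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for event_score, non_event_score in product(events, non_events):
        if event_score > non_event_score:
            concordant += 1.0
        elif event_score == non_event_score:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Made-up predicted risks and observed outcomes (1 = event within follow-up).
toy_scores = [0.9, 0.4, 0.35, 0.6, 0.2, 0.5]
toy_outcomes = [1, 1, 0, 1, 0, 0]

print(round(c_statistic(toy_scores, toy_outcomes), 3))  # 8 of 9 pairs concordant -> 0.889
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the gap between 0.838 and 0.853 reflects a small difference in the share of correctly ranked pairs.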


Results may not generalize to settings with less comprehensive data or different patterns of care. Even the poorest-performing models were superior to brief self-report questionnaires or traditional clinical assessment.


Implementation of suicide risk prediction models in mental health specialty settings may be less technically demanding than expected. In general medical settings, however, delivery of optimal risk predictions at the point of care may require more sophisticated informatics capability.


electronic health records; insurance claims; machine learning; patient-reported outcomes; risk prediction; suicide

