PeerJ. 2018 Oct 12;6:e5765. doi: 10.7717/peerj.5765. eCollection 2018.

Building interpretable models for polypharmacy prediction in older chronic patients based on drug prescription records.

Author information

Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia.
Advanced Analytics Institute, Faculty of Engineering and IT, University of Technology, Sydney, New South Wales, Australia.
Department of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia.
Faculty of Health Sciences, University of Maribor, Maribor, Slovenia.
Institute of Physiology, Faculty of Medicine, University of Maribor, Maribor, Slovenia.
Healthcare Data Center, The National Institute of Public Health of the Republic of Slovenia, Ljubljana, Slovenia.
St Vincent's Clinical School, Faculty of Medicine, UNSW Sydney, Sydney, NSW, Australia.
Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia.



Multimorbidity is an increasingly common problem in the older population and is closely tied to polypharmacy, i.e., the concurrent use of multiple medications by one individual. Detecting polypharmacy from drug prescription records is not only relevant to multimorbidity but can also point to incorrect use of medicines. In this work, we build models for predicting polypharmacy from drug prescription records of newly diagnosed chronic patients. We evaluate the models' performance with a strong focus on the interpretability of the results.


A centrally collected nationwide dataset of prescription records was used to perform electronic phenotyping of patients for the following two chronic conditions: type 2 diabetes mellitus (T2D) and cardiovascular disease (CVD). In addition, a hospital discharge dataset was linked to the prescription records. A regularized regression model was built for 11 different experimental scenarios on two datasets, and complexity of the model was controlled with a maximum number of dimensions (MND) parameter. Performance and interpretability of the model were evaluated with AUC, AUPRC, calibration plots, and interpretation by a medical doctor.
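The two discrimination metrics named above can be illustrated with a minimal sketch. It assumes binary labels (1 = polypharmacy) and untied prediction scores; the function names and the toy data are hypothetical and are not taken from the study. AUPRC is approximated here by average precision, which is one common estimator and not necessarily the one the authors used.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Tied scores count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at the rank of each positive case.
    Assumes scores are not tied (an assumption of this sketch).
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("at least one positive case is required")
    return total / hits


# Hypothetical toy data: six patients, three with polypharmacy.
labels = [1, 0, 1, 1, 0, 0]
predicted = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

auc = roc_auc(labels, predicted)
auprc = average_precision(labels, predicted)
```

Unlike AUC, AUPRC depends on the prevalence of the positive class, so the two metrics are best read together when the outcome is imbalanced.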


For the CVD model, AUC and AUPRC values of 0.900 (95% CI [0.898-0.901]) and 0.640 (0.635-0.645) were reached, respectively, while for the T2D model the values were 0.808 (0.803-0.812) and 0.732 (0.725-0.739). Reducing the complexity of the model by 65% for CVD and 48% for T2D resulted in 3% and 4% lower AUC and 4% and 5% lower AUPRC values, respectively. Calibration plots showed that moderate calibration can be achieved while reducing the models' complexity, without significant loss of predictive performance.
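A calibration plot compares predicted probabilities with observed outcome rates. The sketch below shows one common way to compute the points of such a plot, assuming equal-width probability bins; the function name, the binning scheme, and the toy data are assumptions for illustration and are not the procedure reported in the study.

```python
def calibration_table(y_true, probs, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed event rate, count)
    tuple per non-empty bin. For a well-calibrated model the first two
    values are close in every bin.
    """
    bins = [[] for _ in range(n_bins)]
    for label, p in zip(y_true, probs):
        # A probability of exactly 1.0 is placed in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, label))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(label for _, label in members) / len(members)
        rows.append((mean_predicted, observed_rate, len(members)))
    return rows


# Hypothetical toy data: four patients, two bins.
rows = calibration_table([0, 0, 1, 1], [0.1, 0.3, 0.7, 0.9], n_bins=2)
```

Plotting the mean predicted probability against the observed rate for each row gives the calibration curve; points near the diagonal indicate good calibration.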


In this study, we found that drug prescription data can be used to build a model for polypharmacy prediction in the older population. In addition, the study showed that it is possible to balance good performance against interpretability of the model while achieving acceptable calibration.


Cardiovascular disease; Clinical interpretability; Diabetes type 2; Logistic regression; Polypharmacy prediction; Prescription data

Conflict of interest statement

The authors declare there are no competing interests.
