Toxicol Sci. 2017 Aug 1;158(2):391-400. doi: 10.1093/toxsci/kfx099.

In Silico Prediction of Drug-Induced Liver Injury Based on Adverse Drug Reaction Reports.

Author information

Department of Environmental Science, College of Resource and Environment, Qingdao Engineering Research Center for Rural Environment, Qingdao Agricultural University, Qingdao 266109, China.
Department of Computer Science and Technology, College of Science and Information, Qingdao Agricultural University, Qingdao 266109, China.


Drug-induced liver injury (DILI) is a major cause of drug attrition. Existing quantitative structure-activity relationship (QSAR) models have limited predictive capability for DILI, and their practical application has been limited by a lack of new hepatotoxicity data. In this study, we first collected and curated a novel set of 122 DILI-positive and 932 DILI-negative drugs from online adverse drug reports, using proportional reporting ratios as the signal detection method. Second, to cope with the unbalanced dataset, three strategies (under-sampling the majority class, the synthetic minority over-sampling technique, and adjusting the decision threshold) were employed to develop predictive classification models. Random forest (RF) models using CDK, MACCS, and Mold2 descriptors afforded correct classification ratios (CCR) of ∼0.77 and 0.78 with the under-sampling and over-sampling strategies, respectively. Recursive RF models based on the last strategy greatly reduced the number of modeling descriptors (by at most 95.4%, for Mold2) while evidently improving predictability, with a consensus CCR of 0.84 (sensitivity of 0.88 and specificity of 0.79). Structural analysis showed that pyrimidine derivatives, purine derivatives, and halogenated hydrocarbons were critical to drug hepatotoxicity. The reporting frequency of many drugs was gender-dependent (eg, antiviral and anti-cancer drugs for males and antibacterial drugs for females) as well as age-dependent (eg, antiviral and anti-cancer drugs for the middle age groups of 20-29, 30-39, and 40-49 years). Approximately 84% of total cases were reported during the first 6 months of administration. The curated hepatotoxicity dataset, along with the predictive classification models presented here, should provide insight for future studies of DILI.
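
The two quantities the abstract relies on can be illustrated with a minimal Python sketch. It assumes the standard 2x2 contingency-table definition of the proportional reporting ratio and takes CCR to be the mean of sensitivity and specificity, which is consistent with the reported consensus values but is not spelled out in the abstract. The function names and the report counts are hypothetical and are not taken from the study.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of adverse event reports (standard definition).

    a: reports of the drug of interest with the event (e.g. liver injury)
    b: reports of the drug of interest with any other event
    c: reports of all other drugs with the event
    d: reports of all other drugs with any other event
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for these counts")
    return (a / (a + b)) / (c / (c + d))


def correct_classification_ratio(sensitivity, specificity):
    """CCR, assumed here to be the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# Hypothetical counts, for illustration only (not data from the study).
prr = proportional_reporting_ratio(a=20, b=80, c=100, d=9800)

# Sensitivity and specificity of the consensus model as reported above.
ccr = correct_classification_ratio(0.88, 0.79)
```

Under this assumed definition, the reported sensitivity of 0.88 and specificity of 0.79 give a CCR of about 0.835, in line with the consensus CCR of 0.84. The PRR cutoff used to label a drug as DILI-positive is not given in the abstract, so none is applied here.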


KEYWORDS: QSAR; drug-induced liver injury; recursive random forest; structural alert; synthetic minority over-sampling technique; under-sampling the majority class

[Indexed for MEDLINE]
