Target Prediction Model for Natural Products Using Transfer Learning

Int J Mol Sci. 2021 Apr 28;22(9):4632. doi: 10.3390/ijms22094632.

Abstract

A large proportion of lead compounds are derived from natural products. However, most natural products have not been fully tested for their targets. To help resolve this problem, a model using transfer learning was built to predict targets for natural products. The model was pre-trained on a processed ChEMBL dataset and then fine-tuned on a natural product dataset. Benefitting from transfer learning and a data balancing technique, the model achieved an area under the receiver operating characteristic curve (AUROC) score of 0.910 despite limited task-related training samples. Embedding space analysis indicates that the difference between embedding distributions is reduced, which supports the reliability of the model's outputs for natural products. Case studies on drug datasets support the model's performance: the fine-tuned model successfully output all the targets of 62 drugs. Compared with a previous study, our model achieved better results in terms of both AUROC validation and its success rate in ranking active targets among the top predictions. The target prediction model using transfer learning can be applied in the field of natural product-based drug discovery and has the potential to find more lead compounds or to assist researchers in drug repurposing.
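
The AUROC reported above can be read as the probability that a randomly chosen active compound-target pair receives a higher score than a randomly chosen inactive one. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auroc` and the example labels and scores are hypothetical and chosen only for illustration.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (active, inactive) pairs in which the
    active sample has the higher score. Tied scores count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one active and one inactive sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = active target, 0 = inactive target.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1]
print(auroc(labels, scores))  # 5 of the 6 active/inactive pairs are ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of samples, so a rank-based implementation would be preferable for large datasets.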

Keywords: deep learning; natural product; target prediction; transfer learning.

Publication types

  • Validation Study

MeSH terms

  • Biological Products*
  • Deep Learning*
  • Drug Discovery / methods*
  • Models, Theoretical*
  • Molecular Targeted Therapy*

Substances

  • Biological Products