PETrans: De Novo Drug Design with Protein-Specific Encoding Based on Transfer Learning

Int J Mol Sci. 2023 Jan 6;24(2):1146. doi: 10.3390/ijms24021146.

Abstract

Recent years have seen tremendous success in designing novel drug molecules with deep generative models. However, existing methods generate molecules that are only drug-like and that require additional structural optimization before they can be developed into actual drugs. This study proposes a deep learning method for generating target-specific ligands, which is useful when the dataset of ligands for a given target is limited. Deep learning methods can extract and learn features (representations) in a data-driven way with little or no human participation. Generative pretraining (GPT) is used to extract the contextual features of the molecule, and three different protein-encoding methods are used to extract the physicochemical properties and amino acid information of the target protein. The protein encoding and the molecular sequence information are combined to guide molecule generation. Transfer learning is then used to fine-tune the pretrained model so that it generates molecules with better binding ability to the target protein. The model was validated on three different targets. The docking results show that it can generate new molecules with higher docking scores for the target proteins.
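
As a rough illustration of how a protein encoding and a molecular sequence might be paired as model input, the following Python sketch may help. It is not the paper's implementation: the abstract does not name its three protein-encoding methods, so one-hot amino acid encoding is used here as an assumed stand-in, and the function names (`one_hot_protein`, `tokenize_smiles`, `build_conditioned_input`), the `<bos>`/`<eos>` markers, and the `max_len` parameter are all hypothetical.

```python
# Illustrative sketch only. One-hot encoding is an ASSUMED stand-in for the
# protein encodings used in the paper, which the abstract does not name.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_protein(sequence, max_len):
    """Encode a protein sequence as a max_len x 20 one-hot matrix.

    Longer sequences are truncated; shorter ones are padded with all-zero
    rows. Residues outside the 20 standard ones also become all-zero rows.
    """
    rows = []
    for aa in sequence[:max_len].upper():
        row = [0] * len(AMINO_ACIDS)
        idx = AA_INDEX.get(aa)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    while len(rows) < max_len:
        rows.append([0] * len(AMINO_ACIDS))
    return rows


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens, keeping Cl and Br as single tokens.

    Hypothetical, simplified tokenizer: every other character is its own token.
    """
    tokens = []
    i = 0
    while i < len(smiles):
        if smiles[i:i + 2] in ("Cl", "Br"):
            tokens.append(smiles[i:i + 2])
            i += 2
        else:
            tokens.append(smiles[i])
            i += 1
    return tokens


def build_conditioned_input(protein, smiles, max_len=8):
    """Pair a protein encoding (the condition) with a tokenized molecule."""
    return {
        "condition": one_hot_protein(protein, max_len),
        "tokens": ["<bos>"] + tokenize_smiles(smiles) + ["<eos>"],
    }


example = build_conditioned_input("MKT", "CCl")
```

In a setup like this, a generative model would be pretrained on many such pairs and then fine-tuned on the smaller set of ligands known for one target, which is the transfer-learning step the abstract describes. The model and the training loop are omitted here because the abstract gives no details about them.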

Keywords: de novo drug design; deep learning; drug discovery; molecule generation; transfer learning.

MeSH terms

  • Amino Acids
  • Drug Design*
  • Ligands
  • Machine Learning
  • Molecular Structure
  • Proteins* / chemistry

Substances

  • Proteins
  • Amino Acids
  • Ligands