Recovering the true targets of specific ligands by virtual screening of the protein data bank

Proteins. 2004 Mar 1;54(4):671-80. doi: 10.1002/prot.10625.

Abstract

The Protein Data Bank (PDB) has been processed to extract a screening protein library (sc-PDB) of 2148 entries. A knowledge-based detection algorithm was applied to 18,000 PDB files to find regular expressions corresponding to protein, ion, co-factor, solvent, or ligand atoms. The sc-PDB database comprises high-resolution X-ray structures of proteins for which (i) a well-defined active site exists and (ii) the bound ligand is a small molecule of low molecular weight. The database was screened with an inverse docking tool derived from the GOLD program to recover the known targets of four unrelated ligands. Both the database and the inverse screening procedure are accurate enough to rank the true target of each of the four investigated ligands among the top 1% of scorers, a 70- to 100-fold enrichment over random screening. Applying the same screening procedure to a small, generic ligand was much less accurate, suggesting that inverse screening should be reserved for rather selective compounds.
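
The abstract does not say how the fold enrichment was computed. As a minimal sketch, assuming the common definition (the fraction of true-target entries in the top-scoring slice divided by their fraction in the whole library), the figure could be calculated as below. The function name, the entry identifiers, and the counts of true-target entries are hypothetical illustrations, not data from the paper; only the library size of 2148 and the 1% cutoff come from the abstract.

```python
def enrichment_factor(ranked_ids, true_ids, top_fraction=0.01):
    """Enrichment of true-target entries in the top slice of a ranked list,
    relative to random selection. Assumed definition, not taken from the paper."""
    n_total = len(ranked_ids)
    n_top = max(1, round(n_total * top_fraction))  # size of the top slice
    true_set = set(true_ids)
    hits_total = sum(1 for entry in ranked_ids if entry in true_set)
    if hits_total == 0:
        raise ValueError("no true-target entries found in the ranked list")
    hits_top = sum(1 for entry in ranked_ids[:n_top] if entry in true_set)
    # Hit rate in the top slice divided by hit rate in the whole library.
    return (hits_top / n_top) / (hits_total / n_total)


# Hypothetical ranking of a 2148-entry library, best score first.
ranked = [f"entry_{i:04d}" for i in range(2148)]
# Hypothetical case: 10 entries belong to the true target, 8 of them rank
# inside the top 1% (positions 0-7) and 2 rank much lower.
true_entries = ranked[:8] + [ranked[500], ranked[1500]]

ef = enrichment_factor(ranked, true_entries, top_fraction=0.01)
print(round(ef, 1))
```

Under this definition the enrichment at a 1% cutoff cannot exceed roughly 100, which is in line with the 70- to 100-fold range quoted above.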

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Binding Sites
  • Biotin / metabolism
  • Computational Biology / methods*
  • Computer Simulation*
  • Crystallography, X-Ray
  • Databases, Protein*
  • Drug Evaluation, Preclinical
  • Ligands*
  • Methotrexate / metabolism
  • Molecular Sequence Data
  • Molecular Weight
  • Proteins / chemistry*
  • Proteins / metabolism*
  • Purine Nucleosides / metabolism
  • Ribonucleosides / metabolism
  • Sensitivity and Specificity
  • Software
  • Substrate Specificity
  • Tamoxifen / analogs & derivatives*
  • Tamoxifen / metabolism

Substances

  • Ligands
  • Proteins
  • Purine Nucleosides
  • Ribonucleosides
  • Tamoxifen
  • 6-hydroxyl-1,6-dihydropurine ribonucleoside
  • afimoxifene
  • Biotin
  • Methotrexate