Logo of bioinformLink to Publisher's site
Bioinformation. 2011; 5(10): 430–439.
Published online 2011 Feb 15.
PMCID: PMC3055157

Docking study of HIV-1 reverse transcriptase with phytochemicals


Natural products are important sources of drug discovery. In this context groups of different set of phytochemicals were taken and docked into the different cavities of the Reverse transcriptase (PDB ID: 1REV) of Human immunodeficiency virus (HIV) and results were discussed. Natural compounds such as Curcumin, Geranin, Gallotannin, Tiliroside, Kaempferol-3-o-glucoside and Trachelogenin were found to very effective according to its binding energy and ligand efficiency score. Those compounds also were found to have no adverse effect as carcinogenicity and mutagenicity and favorable drug likeness score. Hence, considering the facts those compounds could use effectively for HIV-1 drug discovery.


Human immunodeficiency virus (HIV) is a lentivirus (a member of the retrovirus family) that causes acquired immunodeficiency syndrome (AIDS), [12] a condition in humans in which the immune system begins to fail, leading to life-threatening opportunistic infections. Once HIV enters the human body, its primary target is a subset of immune cells that contain a molecule called CD4. In particular, the virus attaches itself to CD4+T cells and, to a lesser extent, to macrophages. Reverse Transcriptase (RT) converts the single-stranded HIV RNA genome to a double-stranded DNA copy by catalyzing both DNA-dependent and RNA-dependent DNA polymerization as well as RNase H cleavage activity to remove the RNA template once the DNA has been synthesized. Because of its unique catalytic properties, RT has been the target enzyme for many antiviral therapeutic agents used in the treatment of AIDS, including nucleoside and nonnucleoside analogues [36].

The aim of molecular docking is to evaluate the feasible binding geometries of a putative ligand with a target whose target site is known. The binding geometries is often known as binding poses, includes, in principle, both the position of the ligand relative to the receptor and conformational state of the ligand and the receptor. There are three basic tasks any docking procedure must accomplish: (1) characterization of the binding site; (2) positioning of the ligand into the binding site (orienting); and (3) evaluating the strength of interaction for a specific ligand-receptor complex (“scoring”). The first challenge for computer-aided design is to identify one or more lead compounds that show activity in an appropriate assay. Until recently most drugs in the market come from the lead compounds discovered by screening of natural products or exploring the analogues of known structures.

There are many small molecule databases in public domain such as ZINC, Pubchem, ChemDB, ChemSpider, KEGG ligand database and DrugBank for virtual screening. The procedure of structure based virtual screening through docking has become crucial when it is necessary to test a database of thousands of compounds against one or more protein targets in a feasible time.

An increasing number of patients with HIV infection cannot use the currently approved anti-HIV drugs, including the reverse transcriptase and protease inhibitors, due to the adverse effects and the emergence of drug resistance, the search for new effective and safe as well as affordable antiHIV agents is not merely an academic curiosity but rather a necessity [7]. It is important to note that a number of promising anti-HIV natural products have made it to the clinical level and are anticipated to be available to patients very soon [7]. The following natural products can be cited as promising anti-HIV agents of plant origin: baicalin (a flavonoid) [8], calanolides (coumarins) [9], betulinic acid (a triterpene) [1011], polycitone A (an alkaloid) [12], lithospermic acid, sulphated polysaccharides, cyanovirin-N [13], pokeweed antiviral protein [14] and alpha-trichobitacin (proteins). In this context six different set of phytochemicals were taken and docked into the cavities of the Reverse transcriptase and results were discussed.


The three dimensional structure of target HIV reverse transcriptase (PDBID: 1REV) was retrieved from protein data bank at 2.6 Å RMSD resolution. Phytochemicals with anti-HIV activity such as Curcumin, Geranin, Gallotannin, Tiliroside, Kaempferol-3-o-glucoside and Trachelogenin were obtained from Dr. Duke database (http://www.arsgrin.gov/duke/), which were searched against pubchem and chemspider database for the 2D structures and then with the help of open babel [http://openbabel.org/wiki/Main_Page] these 2D structures are converted to 3D structures. The 3D structures which are obtained are minimized using Hyperchem's MM+ force field. Molegro Virtual Docker [15] was used to detect the active sites and docking was performed by moldock function, which is an implementation of evolutionary algorithms (EAs), focused on molecular docking simulations. Docking was performed with all the potential active sites detected on HIV reverse transcriptase enzyme. During Docking at first the molecules were prepared and bonds, bond orders, explicit hydrogens, charges, flexible torsions, were assigned if they were missing by the MVD program to both the protein and ligands. From the docking wizard ligands were selected and the scoring function used is Moldock score. The Ignore distant atoms option is used to ignore atoms far away from the binding site. It reduces overall computing time. The Enforce hydrogen bond directionality option is used to check if bonding between potential hydrogen bond donors and acceptors can occur. If hydrogen bonding is possible, the hydrogen bond energy contribution to the docking score is assigned a penalty based on the deviations from the ideal bonding angle. Using this option can significantly reduce the number of unlikely hydrogen bonds reported also internal electrostatic interaction, internal hydrogen bond sp2-sp2 torsions are calculated from the pose by enabling the ligand evaluation terms. The search algorithm is taken as Moldock SE and numbers of runs are taken 10 and max iterations were 2000 with population size 50 and with an energy threshold of 100 also at each step least ‘min’ torsions/translations/rotations are tested and the one giving lowest energy is chosen. If the energy is positive (i.e. because of a clash or an unfavorable electrostatic interaction) then additional ‘max’ positions will be tested. Pose clustering was done by tabu based clustering method, using this clustering technique each found solution is added to a ‘tabu list’: during the docking simulation the poses are compared to the ligands in this ‘tabu list’. If the pose being docked is closer to one of the ligands in the list than specified by the RMSD threshold, an extra penalty term (the Energy penalty) is added to the scoring function. This ensures a greater diversity of the returned solutions since the docking engine will focus its search on poses different from earlier poses found. The energy penalty was set to 100, RMSD threshold was 2.00 and RMSD calculation by atom ID (fast) were set.

After the docking simulation is over the poses which were generated were sorted by rerank score. The Rerank Score uses a weighted combination of the terms used by the MolDock score mixed with a few addition terms (the Rerank Score includes the Steric (by LJ12-6) terms which are LennardJones approximations to the steric energy ­ the MolDock score uses a piecewise linear potential to approximate the steric energy) [15]. The reranking score function is computationally more expensive than the scoring function used during the docking simulation but it is generally better than the docking score function at determining the best pose among several poses originating from the same ligand [15]. Ligand efficiency is most commonly defined as the ratio of the free energy of binding over the number of heavy atoms in a molecule [18]. Binding affinities were calculated using Molegro data modeler and MVD used to find Ligand Efficiency 1(LE 1) as Moldock score divided by Heavy Atoms count, ligand efficiency 2(LE2) as a result of binding Affinity divided by Heavy Atoms count and Ligand Efficiency 3 (LE 3) as rerank Score divided by Heavy Atoms count.

The coefficients for the binding affinity terms were derived using multiple linear regression. The model was calibrated using a data set of more than 200 structurally diverse complexes from the PDBbind database with known binding affinities (expressed in kJ/mol) [16]. The Pearson correlation coefficient was 0.60 when doing 10-fold cross validation. It is important to note that this particular model was trained only on strongly interacting ligands in their optimal conformation known from the PDB complexes. Since the binding affinity measure was trained using known binding modes only, it might sometimes assign too strong binding affinities to weakly or non-binding molecules (false positives). Therefore recommend ranking the results of a virtual screening run using the rerank score. The binding affinity measure may then be used subsequently to get a rough estimate of the highest ranked poses. The scoring function used by MolDock is derived from the piecewise linear potential (PLP) scoring functions [17]. The scoring function used by MolDock further improves these scoring functions with a new hydrogen bonding term and new charge schemes [15]. Based on evolutionary agorithms (EAs) classification moldock algorithm may be classified as an Evolutionary simulator (ES), since it employs direct ranking of the solutions and the crossover operators. MolDock showed better overall performance in docking simulations when compared with other software.

ADMET (Absorption, Distribution, Metabolism, Excretion and Toxicity) properties were predicted using PreADMET server (http://preadmet.bmdrc.org/) to know whether the phytochemicals has the potential of adverse effect in human.

Results and Discussion

A total of five cavities (Figure 1) were able to detected in reverse transcriptase enzyme (PDB ID: 1REV) by using Molegro Virtual Docker and were named cav1, cav2, cav3, cav4 and cav5, the volume and surface area details were given as (see Table 1). Details of five cavities are given in Table 4(see Table 4).

Figure 1
Five cavities detected in reverse transcriptase enzyme (PDB ID: 1REV).

Reverse Transcriptase has, there are two p66 domains and one p51 domain. p66 and p51 share a common amino terminus; p66 is 560 amino acids in length, p51 is 440 amino acids long. The larger subunit of the RT heterodimer, p66, contains the active sites for both of the enzymatic activities of RT (polymerase and RNase H); the smaller subunit plays a structural role. p66 is composed of two spatially distinct domains, polymerase and RNase H. The polymerase domain is composed of four subdomains: fingers (residues 1­85 and 118­155), palm (residues 86­117 and 156­236), thumb (237­318), and connection (319­426). The nucleicacid binding cleft is formed primarily by the p66 fingers, palm, thumb, connection, and RNase H subdomains of p66. The polymerase active site is composed of three catalytic carboxylates in the palm subdomain of p66 (D110, D185, and D186) that bind two divalent ions [19]. The cavity 5 has such residues in them and the residue shows their contributions to their inhibitors.

Curcumin is mainly found in Curcuma longa, Curcuma xanthorrhiza, Curcuma zedoaria, Costus speciosus and Zingiber officinale. Geranin is found in Phyllanthus amaru, Erythroxylum coca var. coca, Geranium thunbergii, Spondias pinnata and Terminalia catappa. Gallotannin is found in following plants Camellia sinensis, Salvia officinalis, Arctostaphylos uva-ursi, Juniperus communis, Rosmarinus officinalis, Vaccinium myrtillus, Ginkgo biloba, Prunus cerasus, Psidium guajava, Thymus vulgaris, Plantago major, Urtica dioica, Achillea millefolium, Equisetum arvense, Fragaria spp, Terminalia catappa, Cynara cardunculus subsp. Cardunculus and Glechoma hederacea. The compound tiliroside is found in Althaea officinalis, Helianthemum glomeratum, Pteridium aquilinum, Rosa spp, Tilia sp. Trachelogenin found in Arctium lappa and Cnicus benedictus. The compound kaempferol-3-o-glucoside found in Vitis vinifera, Urtica dioica and Echinacea spp. Structures of each compound is displayed in Figure 2.

Figure 2
Structure of compounds used for docking study. (a) Trachelogenin (b) Curcumin (c) Geraniin (d) Kaempferol-3-O-Glucoside (e) Gallotannin (f) Tiliroside.

All the phytochemicals were used as ligand at mentioned 5 cavities of reverse transcriptase enzyme (PDB ID: 1REV) and the results of the top ligands whose rerank score > -100 were selected and which were given in (a-e) (see Table 2) of the context along with the hydrogen bond interaction values, the hydrogen bond number and other electrostatic interaction values. Binding pose of each ligand at their highly bound cavity is showed in Figure 3. Selected phytochemicals were studied with potential Anti HIV compound 4,5,6,7-tetrahydroimidazo[4,5,1- jk][1,4]benzodiazepin-2(1H)-thione (TIBO) [20] and rerank score and moldock score were found to be good than (TIBO) (Table 5 see Table 5). A study also conducted by using similar structural analogue of ligands in all the cavities and phytochemicals showed higher binding score than structural analogue (Table 6 see Table 6).

Figure 3
Different ligands at highly bound cavity. (a) cucurmin in cavity 1, (b) geranin in cavity 2, (c)kaempferol-3-o-glucoside in cavity 5, (d) gallotannin in cavity 3, (e) tiliroside in cavity 1, (f) trachelogenin in cavity 1.

The molecules were searched for Lipinski rule of 5, lead like rule, CMC like rule, MDDR like rule, WDI like rule, Reactive functional group for drug likeness. Typical ADME prediction methods involves Caco-2cell permeability, Madin-Darby Canine Kidney Cell Permeability (MDCK), Human Intestinal Absortion (HIA), Skin Permeability, Blood-Brain Barrier (BBB) penetration were also calculated for the molecules along with the mutagenicity and carcinogenicity test based on Ames Test using PreADMET online webserver (http://preadmet.bmdrc.org/). The results of ADMET and drug-likeness are given as Table 3 (see Table 3). All the phytochemicals were bound to its target with rerank scores ranging from -100.56 to -136.23 at multiple binding sites on the enzyme. ADMET prediction resulted any carcinogenic effects in selected compounds. Trachelogenin found to be mutagenic by Ames test model of preADMET (Table 3 see Table 3). These results will probably become a lead phytochemical for further drug discovery process.

Supplementary material

Data 1:


Citation:Seal et al, Bioinformation 5(10): 430-439 (2011)


1. Weiss RA. Science. 1993;260:5112. [PubMed]
2. Douek DC, et al. Annu Rev Med. 2009;60:471. [PMC free article] [PubMed]
3. De Clercq E. Biochem Pharmacol. 1994;47:155. [PubMed]
4. De Clercq E. Clin Microbiol Rev. 1997;10:674. [PMC free article] [PubMed]
5. Goff SP. J Acquired Immune Defic Syndr. 1990;3:817. [PubMed]
6. Mitsuya H, et al. Science. 1990;249:4976. [PubMed]
7. Asres K, et al. Phytother Res. 2005;19:557. [PubMed]
8. Kitamura K, et al. Antivir Res. 1998;37:131. [PubMed]
9. Zhou P, et al. Phytochemistry. 2000;53:689. [PubMed]
10. Cichewicz RH, Kouzi SA. MedRes Rev. 2004;24:94. [PubMed]
11. Holz-Smith SL, et al. Antimicrob Agents Chemother. 2001;45:60. [PMC free article] [PubMed]
12. Loya S, et al. Biochem J. 1999;344:85. [PMC free article] [PubMed]
13. Dey B, et al. J Virol. 2000;74:4562. [PMC free article] [PubMed]
14. Uckun FM, et al. Antimicrob Agents Chemother. 1998;42:383. [PMC free article] [PubMed]
15. Thomsen R, Christensen MH. J Med Chem. 2006;49:3315. [PubMed]
16. Wang R, et al. J Med Chem. 2004;47:2977. [PubMed]
17. Yang JM, Chen CC. Proteins. 2004;55:288. [PubMed]
18. Abad-Zapatero C, Metz JT. Drug Discov Today. 2005;10:464. [PubMed]
19. Sarafianos SG, et al. J Mol Biol. 2009;385:693. [PMC free article] [PubMed]
20. Pauwels R, et al. Nature. 1990;343:6257. [PubMed]

Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group
PubReader format: click here to try


Save items

Related citations in PubMed

See reviews...See all...

Cited by other articles in PMC

See all...


  • MedGen
    Related information in MedGen
  • PubMed
    PubMed citations for these articles
  • Taxonomy
    Taxonomy records associated with the current articles through taxonomic information on related molecular database records (Nucleotide, Protein, Gene, SNP, Structure).
  • Taxonomy Tree
    Taxonomy Tree

Recent Activity

Your browsing activity is empty.

Activity recording is turned off.

Turn recording back on

See more...