Copyright © 2004 Oxford University Press

SDPMOD: an automated comparative modeling server for small disulfide-bonded proteins

1Department of Biochemistry, National University of Singapore, 8 Medical Drive, 117597, Singapore and 2Biotechnology Research Institute, Macquarie University, NSW 2109, Australia

*To whom correspondence should be addressed. Tel: +61 2 9850 6262; Fax: +61 2 9850 8313; Email: shoba/at/els.mq.edu.au

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated.

© 2004, the authors

Received February 15, 2004; Revised March 24, 2004; Accepted March 24, 2004.

Abstract

Small disulfide-bonded proteins (SDPs) are rich sources for therapeutic drugs. Designing drugs from these proteins requires three-dimensional structural information, which is only available for a subset of these proteins. SDPMOD addresses this deficit in structural information by providing a freely available automated comparative modeling service to the research community. For expert users, SDPMOD offers a manual mode that permits the selection of a desired template as well as a semi-automated mode that allows users to select the template from a suggested list. Besides the selection of templates, expert users can edit the target–template alignment, thus allowing further customization of the modeling process. Furthermore, the web service provides model stereochemical quality evaluation using PROCHECK.
SDPMOD is freely accessible to academic users via the web interface at http://proline.bic.nus.edu.sg/sdpmod.

INTRODUCTION

Small disulfide-bonded proteins (SDPs) are a special class of proteins that are relatively small in size (length ≤100 residues) and have disulfide bonds within their three-dimensional (3D) structures (1). SDPs include many secretory proteins which serve predatory, defensive or regulatory roles (such as toxins, inhibitors and hormones), and they are a rich source for therapeutic drugs (2) and pesticides (3). The 3D structures of SDPs are essential for understanding the functions of SDPs and for drug design. However, 3D structure determination through experimental methods such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy is still both time-consuming and expensive. This results in a gap between the number of known 3D structures and the number of primary sequences, a gap that could be narrowed using large-scale automated protein structure prediction. Among current structure prediction methods, comparative modeling is the most reliable method for generating 3D models.

Comparative modeling of protein structures often requires expert knowledge and proficiency in specialized methods. In the mid-1990s, Peitsch and coworkers developed the first automated modeling server SWISS-MODEL (4), which is currently the most widely used server of this genre. Recently, several other automated comparative modeling servers have also been developed, such as CPHmodels (5), 3D-JIGSAW (6), ModWeb (7) and ESyPred3D (8).

Although many automated comparative modeling servers are available, most of them do not work well on SDPs, for two reasons. First, most of the automated servers are primarily designed for globular protein domains, making it difficult to discriminate small-sized SDPs from background noise.
Taking as an example the sequence of α-conotoxin PnIA (9) (PDB id: 1PEN; 16 residues; 2 disulfide bridges in its structure), we note that both SWISS-MODEL and ModWeb report that they do not cover the modeling of sequences <25 or ≤30 amino acid residues in length, respectively, while the other three servers state that no suitable templates can be identified for this sequence.

The second reason is that SDPs have characteristics distinct from those of medium-sized and large globular proteins. They usually do not have a compact hydrophobic core, which is a major factor in stabilizing protein structure. Their side chains are more likely to be exposed to solvent and their conformations are more flexible. The 3D structures of small proteins are usually dominated by disulfide bridges, metals or ligands (according to the SCOP classification) (10) and tend to bind or interact with large molecules. In small disulfide-rich proteins, the effects of disulfide bridges and constrained residues such as prolines are more significant than sequence similarity. As such, the comparative modeling rules for such proteins are highly specific and differ from those adopted for large globular proteins. These distinct features require specific methods and datasets to be developed for the comparative modeling of SDPs.

To address these problems, we first developed special strategies and rules for large-scale automated comparative modeling of the entire family of conotoxins (L. Kong and S. Ranganathan, unpublished data). Subsequently, these rules were extended to other SDPs. Here, we present SDPMOD, a comprehensive comparative modeling server that is designed specifically for SDPs with specialized rules and datasets.

MATERIALS AND METHODS

Non-redundant SDP structure dataset

Before the modeling can proceed, a non-redundant dataset for SDPs needs to be created to serve as the template repository.
Structures containing protein chains of length <100 amino acids with at least two cysteines were retrieved from the Protein Data Bank (PDB) (11) and loaded into MySQL, a relational database management system, for flexible querying and manipulation. The redundancy in SDP structures was removed at two levels. First, for NMR structures which have multiple monomer models, the representative monomers were selected using NMRCLUST (12). Second, when multiple structures exist for the same sequence, the representative structure was chosen according to its structural quality. Structural quality is ranked by the following criteria (adopted from PDB): (i) X-ray structures over NMR structures, (ii) higher quality factor (1/resolution − R-value) for X-ray structures and more restraints per residue for NMR structures, (iii) better geometry, (iv) fewer missing atoms and non-standard residues and (v) later deposition date. Based on the above strategy, a non-redundant structure database for SDPs was generated. Currently it contains >1300 non-redundant protein chains and their coordinates. The database will be automatically updated once a month.

Modeling procedure

The SDPMOD server performs comparative modeling in four steps: (i) template selection, (ii) target–template alignment, (iii) model building and (iv) model evaluation (13) (Figure 1).
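The dataset rules described under 'Non-redundant SDP structure dataset' can be illustrated with a minimal Python sketch. It is not the server's actual implementation: the `Chain` record, its field names and the helper functions are hypothetical, and only criterion (i) and the X-ray part of criterion (ii) are modeled. Geometry, missing atoms, deposition date and NMR restraint counts are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chain:
    """Hypothetical record for one PDB chain (field names are illustrative)."""
    pdb_id: str
    sequence: str
    method: str                          # "XRAY" or "NMR"
    resolution: Optional[float] = None   # in angstroms, X-ray only
    r_value: Optional[float] = None      # X-ray only


def is_sdp_candidate(sequence: str) -> bool:
    """Selection rule from the text: length < 100 and at least two cysteines."""
    return len(sequence) < 100 and sequence.upper().count("C") >= 2


def quality_factor(resolution: float, r_value: float) -> float:
    """Quality factor as given in the text: 1/resolution - R-value."""
    return 1.0 / resolution - r_value


def rank_key(chain: Chain) -> Tuple[int, float]:
    """Sort key: X-ray before NMR, then higher quality factor first."""
    if chain.method == "XRAY":
        return (0, -quality_factor(chain.resolution, chain.r_value))
    return (1, 0.0)  # NMR ranking by restraints per residue is not modeled


def pick_representative(chains: List[Chain]) -> Chain:
    """Choose one representative among structures sharing the same sequence."""
    return min(chains, key=rank_key)
```

For example, given two X-ray entries and one NMR entry for the same toy sequence, `pick_representative` returns the X-ray entry with the larger quality factor.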
Benchmarking

A large-scale benchmarking exercise was completed using the fully automated mode of the SDPMOD server. A control set of 664 sequences (a subset of our non-redundant SDP dataset) with known structures was used to evaluate the reliability of the server. The Cα root mean square deviation (RMSD) values between the models and their experimental structures were calculated. The benchmarking results show that SDPMOD can predict 3D models with reasonable accuracy. For example, in the 40–70% sequence identity range, 64% of models have Cα RMSD values <1.5 Å. A detailed analysis of the accuracy of our modeling protocol is available from http://proline.bic.nus.edu.sg/sdpmod/accuracy.html.

WEB SERVICE

SDPMOD is freely accessible to academic or non-profit users via a web interface (shown in Figure 2).
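To make the benchmarking metric from the Benchmarking section concrete, the following minimal Python sketch computes a Cα RMSD and the fraction of models below a cutoff. It assumes that the two coordinate lists contain matched Cα atoms and that the structures have already been superimposed. The function names are illustrative, and the superposition step used by the server is not reproduced here.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(model: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSD over paired C-alpha coordinates (assumed already superimposed)."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


def fraction_below(rmsds: Sequence[float], cutoff: float = 1.5) -> float:
    """Fraction of models whose RMSD is strictly below the cutoff (angstroms)."""
    return sum(1 for r in rmsds if r < cutoff) / len(rmsds)
```

A statement such as "64% of models have Cα RMSD values <1.5 Å" corresponds to `fraction_below(rmsds, 1.5)` evaluated over the models in one sequence identity range.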
The ‘fully automated’ mode presents an easy-to-use interface. Users simply submit a target sequence together with their email address and their MODELLER license key, obtained from the MODELLER registration page http://salilab.org/modeller/registration.shtml, and the modeling is carried out automatically according to the procedure described in Figure 1. After the modeling process is completed, a link to the prediction results is returned via email. Users can follow the link to view the prediction results and download the models. The prediction results consist of (i) a summary of the selected template(s), (ii) the predicted model based on each template in PDB format and (iii) a brief report for each modeling attempt that includes the target–template alignment used in model building, a comparison of the model against the template by means of RMSD and a PROCHECK report on the stereochemical quality of the models.

ACKNOWLEDGEMENTS

We would like to thank our colleagues at the Department of Biochemistry, National University of Singapore for their helpful comments and discussions. We are especially grateful to Professor Andrej Sali for permitting us to use MODELLER as a part of the server and Dr Ben Webb for useful suggestions. L.K. and B.L. would also like to thank the National University of Singapore for the award of Agency for Science, Technology and Research, Singapore (A*STAR) scholarships that made this work possible.

REFERENCES

1. Harrison,P.M. and Sternberg,M.J. (1996) The disulphide beta-cross: from cystine geometry and clustering to classification of small disulphide-rich protein folds. J. Mol. Biol., 264, 603–623.
2. Shen,G.S., Layer,R.T. and McCabe,R.T. (2000) Conopeptides: from deadly venoms to novel therapeutics. Drug Discov. Today, 5, 98–106.
3. Richardson,M. (1977) The proteinase inhibitors of plants and micro-organisms. Phytochemistry, 16, 159–169.
4. Peitsch,M.C. (1996) ProMod and Swiss-Model: internet-based tools for automated comparative protein modelling. Biochem. Soc. Trans., 24, 274–279.
5. Lund,O., Frimand,K., Gorodkin,J., Bohr,H., Bohr,J., Hansen,J. and Brunak,S. (1997) Protein distance constraints predicted by neural networks and probability density functions. Protein Eng., 10, 1241–1248.
6. Bates,P.A., Kelley,L.A., MacCallum,R.M. and Sternberg,M.J. (2001) Enhancement of protein modeling by human intervention in applying the automatic programs 3D-JIGSAW and 3D-PSSM. Proteins, Suppl 5, 39–46.
7. Pieper,U., Eswar,N., Stuart,A.C., Ilyin,V.A. and Sali,A. (2002) MODBASE, a database of annotated comparative protein structure models. Nucleic Acids Res., 30, 255–259.
8. Lambert,C., Leonard,N., De Bolle,X. and Depiereux,E. (2002) ESyPred3D: prediction of proteins 3D structures. Bioinformatics, 18, 1250–1256.
9. Hu,S.H., Gehrmann,J., Guddat,L.W., Alewood,P.F., Craik,D.J. and Martin,J.L. (1996) The 1.1 Å crystal structure of the neuronal acetylcholine receptor antagonist, alpha-conotoxin PnIA from Conus pennaceus. Structure, 4, 417–423.
10. Murzin,A.G., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol., 247, 536–540.
11. Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
12. Kelley,L.A., Gardner,S.P. and Sutcliffe,M.J. (1996) An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally related subfamilies. Protein Eng., 9, 1063–1065.
13. Marti-Renom,M.A., Stuart,A.C., Fiser,A., Sanchez,R., Melo,F. and Sali,A. (2000) Comparative protein structure modeling of genes and genomes. Annu. Rev. Biophys. Biomol. Struct., 29, 291–325.
14. Sali,A. and Blundell,T.L. (1993) Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol., 234, 779–815.
15. Laskowski,R.A., Moss,D.S. and Thornton,J.M. (1993) Main-chain bond lengths and bond angles in protein structures. J. Mol. Biol., 231, 1049–1067.