Nucleic Acids Res. Jul 1, 2009; 37(Web Server issue): W575–W580.
Published online May 22, 2009. doi:  10.1093/nar/gkp418
PMCID: PMC2703963

RHYTHM—a server to predict the orientation of transmembrane helices in channels and membrane-coils

Abstract

RHYTHM is a web server that predicts buried versus exposed residues of helical membrane proteins. Starting from a given protein sequence, secondary and tertiary structure information is calculated by RHYTHM within only a few seconds. The prediction applies structural information from a growing database of precalculated packing files and evolutionary information from sequence patterns conserved in a representative dataset of membrane proteins (‘Pfam-domains’). The program uses two types of position-specific matrices to account for the different geometries of packing in channels and transporters (‘channels’) or other membrane proteins (‘membrane-coils’). The output provides information on the secondary structure and topology of the protein and specifically on the contact type of each residue and its conservation. This information can be downloaded as a graphical file for illustration, a text file for analysis and statistics and a PyMOL file for modeling purposes. The server can be freely accessed at http://proteinformatics.de/rhythm

INTRODUCTION

About one third of the presently mapped gene sequences encode membrane proteins, which are also major targets for pharmaceutical products (1,2). In contrast, only a minor fraction (February 2009, 1.8%) of the protein structures deposited in the Protein Data Bank (PDB) belongs to this structural class (3,4). Due to difficulties in overexpression and crystallization, their tertiary structure is often evaluated using computational methods (5–8). Homology modeling may be applied when an appropriate template structure is available (9). In other cases, ab initio or knowledge-based tertiary structure modeling comes into play. There is a high level of predictability regarding secondary structure elements (10–12). New approaches deal with the prediction of the exact lengths of the transmembrane helices (13). Finally, transmembrane topology prediction was optimized by applying consensus predictions that also identify signal peptides (14). However, tools that perform or assist in low-resolution tertiary structure modeling of helical membrane proteins are still rare (15–19).

The growing data on high-resolution structures of helical membrane proteins provide an appropriate base for structural analysis, statistics and the development of knowledge-based prediction methods (12,15–32). The type of packing of α-helices is fundamental for the stabilization and function of all helical membrane proteins (33–36). Residues involved in helix–helix interactions are therefore regularly more conserved than others and are often arranged in specific sequence motifs that reflect the type of packing (23,25,35,37). Right-handed parallel and anti-parallel interactions are typically found in channels (membrane proteins with a functional pore). These interactions are mainly accomplished by weakly polar amino acids (G > S > T > F) that preferably create contacts every fourth residue (23,37,38). Left-handed anti-parallel interactions are predominantly found in membrane-coils. There, large and polar residues (D > S > M > Q) create characteristic contacts every 3.5 residues (23,37,39).

The higher conservation of residues involved in helix–helix contacts was applied in methods predicting tertiary structure contacts (40,41). Such applications can be further improved by combining conservation criteria with amino acid propensity scales (18,24,32,42). The combination of statistical potentials with fragment-based modeling and energy minimizations was applied for de novo modeling approaches (28,43–45). There are some tools available to predict buried versus exposed regions of transmembrane helices. ProperTM (18), LIPS (16), RANTS (15) and TMX (19) depend on multiple sequence alignments to produce predictions about transmembrane helix orientations or solvent accessibility. However, the quality of prediction by these tools largely depends on the quality of the multiple sequence alignment provided by the user. Due to the small size of several transmembrane protein families, such alignments are not always at hand. Moreover, the output is not always presented in a user-friendly format and thus cannot be directly used for modeling purposes.

RHYTHM is the first server that predicts the exposure or burial of transmembrane residues incorporating the structural specificities of channels. The quality of prediction (expressed by AUC-values) of helix–helix contacts rises by 16% to an average value of 76% when the sequence motifs typical for channels are applied, compared to the same approach when a non-specific matrix is taken (23). For our web service, the position-specific matrices were updated using an enlarged data set of input structures. To optimize the sensitivity of helix–helix contact predictions at high specificity thresholds, the matrix prediction method is now combined with a prediction directly applying evolutionary information from ‘Pfam-domains’ (46,47). RHYTHM also integrates the secondary structure prediction tool HMMTOP (48). Thus, after the upload of a single sequence file and the specification of the position specific matrix type (‘channel’ or ‘membrane-coil’) the prediction for tertiary structure contacts is started.

METHODS

Matrix prediction method

The prediction of buried versus exposed residues is based on two different sets of propensity matrices derived from representative and non-redundant datasets of 21 channels and 14 membrane-coils containing 310 and 179 transmembrane helices, respectively (see website for details). The data were analyzed as described in detail in earlier analyses (23,49). Briefly, helical sections were defined by the Kabsch and Sander algorithm (50). Only those helical residues whose Cα-atoms lie between the two membrane planes were assigned to transmembrane helices. The membrane planes were calculated from the output of the TMDET algorithm (51). The type of contact of a specified residue was determined by counting the atomic contacts to residues of another helix, to the virtual membrane or to virtual water (23). Structures with helix pairs too far apart were removed after visual inspection.
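The Cα criterion above can be illustrated with a minimal sketch. It assumes the membrane is described by a mid-plane point, a normal vector and a half-thickness; these parameter names and the function `residues_between_planes` are hypothetical and are not taken from the TMDET output format or from the RHYTHM implementation.

```python
def residues_between_planes(ca_coords, normal, center, half_thickness):
    """Return indices of residues whose C-alpha atom lies between two
    parallel planes placed at +/- half_thickness around `center`.

    All arguments are illustrative assumptions: `ca_coords` is a list of
    (x, y, z) tuples, `normal` is the membrane normal, `center` is a point
    on the membrane mid-plane.
    """
    # Normalise the membrane normal so the dot product is a distance.
    length = sum(c * c for c in normal) ** 0.5
    unit = tuple(c / length for c in normal)

    selected = []
    for index, xyz in enumerate(ca_coords):
        # Signed distance of the C-alpha atom from the mid-plane.
        distance = sum((x - c) * u for x, c, u in zip(xyz, center, unit))
        if abs(distance) <= half_thickness:
            selected.append(index)
    return selected


# Toy coordinates along the z axis, membrane half-thickness of 15 units.
toy_coords = [(0, 0, -20), (0, 0, -10), (0, 0, 0), (0, 0, 14), (0, 0, 16)]
inside = residues_between_planes(toy_coords, (0, 0, 2), (0, 0, 0), 15)
```

In this toy case only the three middle atoms fall inside the slab, so `inside` holds the indices 1, 2 and 3.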

The matrices (which will be regularly updated due to the growing data set of high-resolution membrane protein structures) store the propensities of residues to contact another helix or the membrane. To account for sequence motifs, propensities of all neighboring amino acids are stored in the same matrix [see website for details or ref. (23)]. Scores are calculated by summation of the residue propensities at positions 0 to ±4 (channels) or 0 to ±7 (membrane-coils). These windows account for the different rhythm of contacts in channels and membrane-coils (23,25,38). An amino acid is predicted to be buried (step 1, see Figure 1) or exposed (step 2, see Figure 1) when the corresponding score is above a certain threshold specified by the user. The advantage of this approach is that the prediction is much less affected by variations of single amino acid propensities. However, amino acids at the helix termini are not recorded by our method.
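The window summation described above can be sketched as follows. This is only an illustration under stated assumptions: the matrix is represented as a dictionary keyed by (offset, amino acid), the propensity values are invented toy numbers, and residues whose window does not fit inside the helix are simply left unscored. The actual matrix layout used by RHYTHM is described on the website and in ref. (23) and may differ.

```python
def window_score(sequence, position, matrix, half_window):
    """Sum propensities over offsets -half_window..+half_window.

    `matrix` is a hypothetical dict mapping (offset, amino_acid) to a
    propensity; missing entries count as 0.0. Returns None when the
    window does not fit (assumed treatment of helix termini).
    """
    if position - half_window < 0 or position + half_window >= len(sequence):
        return None
    total = 0.0
    for offset in range(-half_window, half_window + 1):
        residue = sequence[position + offset]
        total += matrix.get((offset, residue), 0.0)
    return total


# Toy matrix rewarding glycine at the centre and at offsets -4 and +4,
# mimicking the "every fourth residue" pattern reported for channels.
toy_matrix = {(0, "G"): 1.0, (-4, "G"): 0.5, (4, "G"): 0.5}
helix = "GAAAGAAAG"

centre_score = window_score(helix, 4, toy_matrix, half_window=4)
terminus_score = window_score(helix, 0, toy_matrix, half_window=4)
```

With a half-window of 4 (channels) the central glycine collects contributions from both flanking glycines; a half-window of 7 would be used in the same way for membrane-coils.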

Figure 1.
Workflow of RHYTHM: the prediction is performed in three steps including (1) matrix prediction of helix–helix contacts; (2) matrix prediction of helix–membrane contacts and (3) prediction of helix–helix contacts by conservation ...

Conservation criteria

The Pfam database is an extensive set of protein domains and families currently covering 72% of known protein sequences (46). The families consist of multiple alignments of functionally or evolutionarily related protein sequences (47). These alignments also reproduce evolutionary relationships that would otherwise not be detected (9). To search the Pfam database, HMMER (version 2.3.2) is applied (52). HMMER allows for sensitive searching in a database of the consensus sequences of various protein families using Hidden Markov Models. To speed up the search, the Pfam database was restricted to the 691 membrane protein families provided in February 2009. A bonus is added to the helix–helix score of fully conserved residues, according to the finding that conserved residues are often involved in helix–helix contacts (37,53). The value of the bonus depends on the selected specificity and is optimized for highest accuracy.

Three step prediction

A three-step approach was applied to predict buried versus exposed residues (Figure 1):

  1. Matrix prediction of helix–helix contacts: Amino acids predicted by HMMTOP (48) or specified by the user to be part of a transmembrane helix are scored. The prediction matrix has to be chosen by the user. In order to do this, the user must know whether the protein of the uploaded sequence has a functional pore (channels) or not (membrane-coil). The residues above the selected specificity threshold for helix–helix contacts (medium, high, very high and highest) are predicted as helix–helix contacts. The specificities for a single contact type range from about 75% for medium to 90% for very high thresholds.
  2. Matrix prediction of helix–membrane contacts: The remaining residues are predicted analogously as helix–membrane contacts using the threshold specified by the user at the beginning of the procedure. The specificities for that prediction also range from about 75% to 90%. Taken together, at most 70% of the residues (at medium specificity) are currently recorded by the matrix prediction method.
  3. Pfam prediction: To optimize the sensitivity of helix–helix contact predictions at high specificity thresholds, residues not recorded by the matrix prediction method may be verified using conservation criteria. This means that a bonus optimized for the positive predictive value at a selected specificity threshold is added to the helix–helix score from matrix prediction. A residue is predicted as buried when the combined score is above the defined threshold. As a result, an additional 10–20% of the residues are assigned to helix–helix contacts. The specificity of prediction is not significantly affected by the Pfam prediction.
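The three steps can be read as a simple decision cascade per residue. The sketch below is an interpretation of the text, not the server's code: the function name `classify_residue`, the label strings, and all numeric thresholds and bonus values are hypothetical, and the scores are assumed to come from the matrix method described earlier.

```python
def classify_residue(hh_score, hm_score, fully_conserved,
                     hh_threshold, hm_threshold, bonus):
    """Assign a contact type following the three-step order in the text."""
    # Step 1: matrix prediction of helix-helix contacts.
    if hh_score > hh_threshold:
        return "helix-helix"
    # Step 2: remaining residues are tested for helix-membrane contacts.
    if hm_score > hm_threshold:
        return "helix-membrane"
    # Step 3: residues still unassigned may be rescued by conservation;
    # the bonus is added to the helix-helix score of conserved residues.
    if fully_conserved and hh_score + bonus > hh_threshold:
        return "helix-helix"
    return "not recorded"


# Toy thresholds and bonus, chosen only to exercise each branch.
step1 = classify_residue(2.5, 0.1, False, 2.0, 1.5, 0.6)
step2 = classify_residue(1.0, 1.8, False, 2.0, 1.5, 0.6)
step3 = classify_residue(1.6, 0.2, True, 2.0, 1.5, 0.6)
none_ = classify_residue(1.6, 0.2, False, 2.0, 1.5, 0.6)
```

Because the bonus is only consulted in the last step, a conserved residue can never lose a helix–membrane assignment made in step 2; this matches the order given above but is a design assumption of the sketch.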

RESULTS AND DISCUSSION

Performance of the combined prediction

The prediction quality of RHYTHM improved compared to our previous analysis (23). This is due to the enlarged data set of helical membrane proteins and the combination of the matrix prediction method with the prediction from evolutionary conservation. The average AUC-values (from a leave-one-out cross validation) for the prediction of helix–helix contacts are 0.72 for channels [as in our previous analysis (23)] and 0.68 for membrane-coils. The corresponding values for the prediction of helix–membrane contacts are 0.75 and 0.73. Best predictions were obtained for helix–helix contacts of the translocon channel (PDB-entry: 1rh5, AUC-value: 0.78) and for helix–membrane contacts of the ABC-transporter protein (PDB-entry: 2qi9, AUC-value: 0.86). To obtain high-quality predictions with RHYTHM, we suggest selecting the default specificity threshold ‘very high’. This threshold may then be reduced if too few contacts are predicted. Besides tertiary structure contact types, the output assigns the secondary structure and topology of the protein (51). This information is provided as a graphical file for illustration (Figure 2), a text file for analysis and statistics and a PyMOL file for modeling purposes (Figure 3).
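For readers unfamiliar with AUC-values, the following sketch shows one standard way to compute the area under the ROC curve for a single protein, using the rank interpretation (the probability that a true contact residue scores higher than a non-contact residue, with ties counted as one half). The scores and labels are invented toy data; the evaluation pipeline actually used for RHYTHM is not described at this level of detail in the text.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds True/1 for observed contact residues and False/0
    otherwise; `scores` holds the corresponding prediction scores.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Toy example: four residues, two of which are true contacts.
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In a leave-one-out setting, each structure would be scored with matrices derived from the remaining structures, and the per-protein values averaged; an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.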

Figure 2.
Example graphical output of RHYTHM: topology of the ammonium transporter predicted with HMMTOP (48). Tertiary structure contacts predicted as helix–helix contacts (red) or helix–membrane contacts (green). Highly conserved residues are ...
Figure 3.
Two high-resolution crystal structures of (A) rhodopsin, PDB-entry: 1u19 and (B) the ammonium transporter, PDB-entry: 1xqf, were colored according to the predicted contact types (green = helix–membrane, red = helix–helix) using the downloadable ...

Complexity of tertiary contact predictions

The quality of prediction will further improve as the data set of non-homologous high-resolution membrane protein structures grows. At the moment the prediction is limited for several reasons. A significant number of buried residues are close to internal cavities (37,54). Such residues are not judged in our analysis to be part of a helix–helix contact due to insufficient contacts to other residues and are thus often evaluated as false positives in our prediction. Large packing defects regularly account for structural flexibilities (36,55–57). The separate prediction of residues involved in packing defects could therefore enhance the prediction of tertiary structure contacts. Moreover, about one quarter of the residues are in contact with both another helix and the membrane. These residues are frequently not recorded at high specificity thresholds but will be predicted as buried or exposed at lower thresholds. This ambiguity clearly complicates the prediction, as does the fact that many channels are highly flexible. Residues that are buried in one functional state may become exposed in another (45,58). Finally, residues that appear to be exposed to lipid may become (and may also be predicted to be) buried in quaternary complexes (59). With more structural data on channels, a prediction of residues that are exposed or buried depending on their functional state will become possible.

Technical details

All computations are done on our server including optional prediction of membrane helix sections and searches for Pfam domains. Modern web technologies (AJAX, JavaScript, PHP, CSS) were used to create a fast and intuitively usable web application.

FUNDING

European Union (ProFIT) and the Deutsche Forschungsgemeinschaft (SFB449, SFB740). Funding for open access charge: SFB449.

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

We thank Dr Cornelius Frömmel and Dr Robert Preissner for helpful discussions and Simon Ward for reading the manuscript.

REFERENCES

1. Becker OM, Marantz Y, Shacham S, Inbal B, Heifetz A, Kalid O, Bar-Haim S, Warshaviak D, Fichman M, Noiman S. G protein-coupled receptors: in silico drug discovery in 3D. Proc. Natl Acad. Sci. USA. 2004;101:11304–11309. [PMC free article] [PubMed]
2. Civelli O. GPCR deorphanizations: the novel, the known and the unexpected transmitters. Trends Pharmacol. Sci. 2005;26:15–19. [PubMed]
3. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The protein data bank. Nucleic Acids Res. 2000;28:235–242. [PMC free article] [PubMed]
4. Tusnady GE, Dosztanyi Z, Simon I. PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank. Nucleic Acids Res. 2005;33(Database Issue):D275–D278. [PMC free article] [PubMed]
5. Bowie JU. Solving the membrane protein folding problem. Nature. 2005;438:581–589. [PubMed]
6. Lehnert U, Xia Y, Royce T, Goh C, Liu Y, Senes A, Yu H, Zhang Z, Engelman D, Gerstein M. Computational analysis of membrane proteins: genomic occurrence, structure prediction and helix interactions. Quart. Rev. Biophys. 2005;37:1–6. [PubMed]
7. Fleishman SJ, Unger VM, Ben-Tal N. Transmembrane protein structures without X-rays. Trends Biochem. Sci. 2006;31:106–113. [PubMed]
8. Punta M, Forrest LR, Bigelow H, Kernytsky A, Liu J, Rost B. Membrane protein prediction methods. Methods. 2007;41:460–474. [PMC free article] [PubMed]
9. Forrest LR, Tang CL, Honig B. On the accuracy of homology modeling and sequence alignment methods applied to membrane proteins. Biophys. J. 2006;91:508–517. [PMC free article] [PubMed]
10. Moller S, Croning MD, Apweiler R. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics. 2001;17:646–653. [PubMed]
11. Cuthbertson JM, Doyle DA, Sansom MS. Transmembrane helix prediction: a comparative evaluation and analysis. Protein Eng. Des. Sel. 2005;18:295–308. [PubMed]
12. Bernsel A, Viklund H, Falk J, Lindahl E, von Heijne G, Elofsson A. Prediction of membrane-protein topology from first principles. Proc. Natl Acad. Sci. USA. 2008;105:7177–7181. [PMC free article] [PubMed]
13. Granseth E, Viklund H, Elofsson A. ZPRED: predicting the distance to the membrane center for residues in alpha-helical membrane proteins. Bioinformatics. 2006;22:e191–e196. [PubMed]
14. Amico M, Finelli M, Rossi I, Zauli A, Elofsson A, Viklund H, von Heijne G, Jones D, Krogh A, Fariselli P, et al. PONGO: a web server for multiple predictions of all-alpha transmembrane proteins. Nucleic Acids Res. 2006;34:W169–W172. [PMC free article] [PubMed]
15. Adamian L, Liang J. Prediction of buried helices in multispan alpha helical membrane proteins. Proteins. 2006;63:1–5. [PubMed]
16. Adamian L, Liang J. Prediction of transmembrane helix orientation in polytopic membrane proteins. BMC Struct. Biol. 2006;6:13. [PMC free article] [PubMed]
17. Barth P, Schonbrun J, Baker D. Toward high-resolution prediction and design of transmembrane helical protein structures. Proc. Natl Acad. Sci. USA. 2007;104:15682–15687. [PMC free article] [PubMed]
18. Beuming T, Weinstein H. A knowledge-based scale for the analysis and prediction of buried and exposed faces of transmembrane domain proteins. Bioinformatics. 2004;20:1822–1835. [PubMed]
19. Park Y, Hayat S, Helms V. Prediction of the burial status of transmembrane residues of helical membrane proteins. BMC Bioinformatics. 2007;8:302. [PMC free article] [PubMed]
20. Park Y, Helms V. Prediction of the translocon-mediated membrane insertion free energies of protein sequences. Bioinformatics. 2008;24:1271–1277. [PubMed]
21. Zhang Y, Devries ME, Skolnick J. Structure modeling of all identified G protein-coupled receptors in the human genome. PLoS Comput. Biol. 2006;2:e13. [PMC free article] [PubMed]
22. Yarov-Yarovoy V, Schonbrun J, Baker D. Multipass membrane protein structure prediction using Rosetta. Proteins. 2006;62:1010–1025. [PMC free article] [PubMed]
23. Hildebrand PW, Lorenzen S, Goede A, Preissner R. Analysis and prediction of helix–helix interactions in membrane channels and transporters. Proteins. 2006;64:253–262. [PubMed]
24. Park Y, Helms V. How strongly do sequence conservation patterns and empirical scales correlate with exposure patterns of transmembrane helices of membrane proteins? Biopolymers. 2006;83:389–399. [PubMed]
25. Walters RF, DeGrado WF. Helix-packing motifs in membrane proteins. Proc. Natl Acad. Sci. USA. 2006;103:13658–13663. [PMC free article] [PubMed]
26. Eyre TA, Partridge L, Thornton JM. Computational analysis of α-helical membrane protein structure: implications for the prediction of 3D structural models. Protein Eng. Des. Sel. 2004;17:613–624. [PubMed]
27. Chamberlain AK, Bowie JU. Analysis of side-chain rotamers in transmembrane proteins. Biophys. J. 2004;87:3460–3469. [PMC free article] [PubMed]
28. Pellegrini-Calace M, Carotti A, Jones DT. Folding in lipid membranes (FILM): a novel method for the prediction of small membrane protein 3D structures. Proteins. 2003;50:537–545. [PubMed]
29. Eilers M, Patel AB, Liu W, Smith SO. Comparison of helix interactions in membrane and soluble alpha-bundle proteins. Biophys. J. 2002;82:2720–2736. [PMC free article] [PubMed]
30. Fleishman SJ, Ben-Tal N. A novel scoring function for predicting the conformations of tightly packed pairs of transmembrane alpha-helices. J. Mol. Biol. 2002;321:363–378. [PubMed]
31. Pappu RV, Marshall GR, Ponder JW. A potential smoothing algorithm accurately predicts transmembrane helix packing. Nat. Struct. Biol. 1999;6:50–55. [PubMed]
32. Pilpel Y, Ben-Tal N, Lancet D. kPROT: a knowledge-based scale for the propensity of residue orientation in transmembrane segments. Application to membrane protein structure prediction. J. Mol. Biol. 1999;294:921–935. [PubMed]
33. Eilers M, Shekar SC, Shieh T, Smith SO, Fleming PJ. Internal packing of helical membrane proteins. Proc. Natl Acad. Sci. USA. 2000;97:5796–5801. [PMC free article] [PubMed]
34. Adamian L, Liang J. Helix–helix packing and interfacial pairwise interactions of residues in membrane proteins. J. Mol. Biol. 2001;311:891–907. [PubMed]
35. Gimpelev M, Forrest LR, Murray D, Honig B. Helical packing patterns in membrane and soluble proteins. Biophys. J. 2004;87:4075–4086. [PMC free article] [PubMed]
36. Hildebrand PW, Rother K, Goede A, Preissner R, Frommel C. Molecular packing and packing defects in helical membrane proteins. Biophys. J. 2005;88:1970–1977. [PMC free article] [PubMed]
37. Hildebrand PW, Gunther S, Goede A, Forrest L, Frommel C, Preissner R. Hydrogen-bonding and packing features of membrane proteins: functional implications. Biophys. J. 2008;94:1945–1953. [PMC free article] [PubMed]
38. Walther D, Eisenhaber F, Argos P. Principles of helix-helix packing in proteins: the helical lattice superposition model. J. Mol. Biol. 1996;255:536–553. [PubMed]
39. Langosch D, Heringa J. Interaction of transmembrane helices by a knobs-into-holes packing characteristic of soluble coiled coils. Proteins. 1998;31:150–159. [PubMed]
40. Taylor WR, Jones DT, Green NM. A method for alpha-helical integral membrane protein fold prediction. Proteins. 1994;18:281–294. [PubMed]
41. Fuchs A, Martin-Galiano AJ, Kalman M, Fleishman S, Ben-Tal N, Frishman D. Co-evolving residues in membrane proteins. Bioinformatics. 2007;23:3312–3319. [PubMed]
42. Adamian L, Nanda V, DeGrado WF, Liang J. Empirical lipid propensities of amino acid residues in multispan alpha helical membrane proteins. Proteins. 2005;59:496–509. [PubMed]
43. MacKenzie KR, Engelman DM. Structure-based prediction of the stability of transmembrane helix–helix interactions: the sequence dependence of glycophorin A dimerization. Proc. Natl Acad. Sci. USA. 1998;95:3583–3590. [PMC free article] [PubMed]
44. Park Y, Elsner M, Staritzbichler R, Helms V. Novel scoring function for modeling structures of oligomers of transmembrane alpha-helices. Proteins. 2004;57:577–585. [PubMed]
45. Yarov-Yarovoy V, Baker D, Catterall WA. Voltage sensor conformations in the open and closed states in ROSETTA structural models of K(+) channels. Proc. Natl Acad. Sci. USA. 2006;103:7292–7297. [PMC free article] [PubMed]
46. Sammut SJ, Finn RD, Bateman A. Pfam 10 years on: 10 000 families and still growing. Brief Bioinform. 2008;9:210–219. [PubMed]
47. Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, et al. The Pfam protein families database. Nucleic Acids Res. 2008;36:D281–D288. [PMC free article] [PubMed]
48. Tusnady GE, Simon I. The HMMTOP transmembrane topology prediction server. Bioinformatics. 2001;17:849–850. [PubMed]
49. Preissner R, Goede A, Frommel C. Dictionary of interfaces in proteins (DIP). Data bank of complementary molecular surface patches. J. Mol. Biol. 1998;280:535–550. [PubMed]
50. Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22:2577–2637. [PubMed]
51. Tusnady GE, Dosztanyi Z, Simon I. TMDET: web server for detecting transmembrane regions of proteins by using their 3D coordinates. Bioinformatics. 2004;21:1276–1277. [PubMed]
52. Wistrand M, Sonnhammer EL. Improved profile HMM performance by assessment of critical algorithmic features in SAM and HMMER. BMC Bioinformatics. 2005;6:99. [PMC free article] [PubMed]
53. Jones DT, Taylor WR, Thornton JM. A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry. 1994;33:3038–3049. [PubMed]
54. Rother K, Preissner R, Goede A, Frömmel C. Inhomogeneous molecular density: reference packing densities and distribution of cavities within proteins. Bioinformatics. 2003;19:2112–2121. [PubMed]
55. Paci E, Marchi M. Intrinsic compressibility and volume compression in solvated proteins by molecular dynamics simulation at high pressure. Proc. Natl Acad. Sci. USA. 1996;93:11609–11614. [PMC free article] [PubMed]
56. Cuff AL, Martin AC. Analysis of void volumes in proteins and application to stability of the p53 tumour suppressor protein. J. Mol. Biol. 2004;344:1199–1209. [PubMed]
57. Rother K, Hildebrand PW, Goede A, Gruening B, Preissner R. Voronoia: analyzing packing in protein structures. Nucleic Acids Res. 2008;37:D393–D395. [PMC free article] [PubMed]
58. Adamian L, Liang J. Interhelical hydrogen bonds in transmembrane region are important for function and stability of Ca2+-transporting ATPase. Cell Biochem. Biophys. 2003;39:1–12. [PubMed]
59. Stevens TJ, Arkin IT. Substitution rates in alpha-helical transmembrane proteins. Protein Sci. 2001;10:2507–2517. [PMC free article] [PubMed]

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
