Genetics. 1998 May; 149(1): 445–458.
PMCID: PMC1460119

Assessing the impact of secondary structure and solvent accessibility on protein evolution.


Empirically derived models of amino acid replacement are employed to study the association between various physical features of proteins and evolution. The strengths of these associations are statistically evaluated by applying the models of protein evolution to 11 diverse sets of protein sequences. Parametric bootstrap tests indicate that the solvent accessibility status of a site has a particularly strong association with the process of amino acid replacement that it experiences. Significant association between secondary structure environment and the amino acid replacement process is also observed. Careful description of the length distribution of secondary structure elements and of the organization of secondary structure and solvent accessibility along a protein did not always significantly improve the fit of the evolutionary models to the data sets that were analyzed. As indicated by the strength of the association of both solvent accessibility and secondary structure with amino acid replacement, the process of protein evolution, both above and below the species level, will not be well understood until the physical constraints that affect protein evolution are identified and characterized.
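The general logic of a parametric bootstrap test can be illustrated with a deliberately simplified sketch. It is not the model used in the article, which fits amino acid replacement models on phylogenies. Here each site is assumed to be recorded only as changed (1) or unchanged (0), the null model assumes one change probability for all sites, and the alternative allows separate probabilities for exposed and buried sites. All function names and the toy data are hypothetical.

```python
import math
import random


def bernoulli_loglik(changed, total):
    """Maximized log-likelihood of `changed` successes in `total` trials."""
    if total == 0 or changed == 0 or changed == total:
        return 0.0  # the MLE is 0 or 1, so the maximized likelihood is 1
    p = changed / total
    return changed * math.log(p) + (total - changed) * math.log(1 - p)


def lr_statistic(exposed, buried):
    """2 * (lnL_alternative - lnL_null) for two lists of 0/1 site outcomes."""
    k1, n1 = sum(exposed), len(exposed)
    k2, n2 = sum(buried), len(buried)
    null = bernoulli_loglik(k1 + k2, n1 + n2)  # one shared probability
    alt = bernoulli_loglik(k1, n1) + bernoulli_loglik(k2, n2)  # one per class
    return 2.0 * (alt - null)


def parametric_bootstrap_pvalue(exposed, buried, replicates=1000, seed=0):
    """Compare the observed statistic with its simulated null distribution."""
    rng = random.Random(seed)
    observed = lr_statistic(exposed, buried)
    n1, n2 = len(exposed), len(buried)
    # Fit the null model to the real data, then simulate from that fit.
    p_null = (sum(exposed) + sum(buried)) / (n1 + n2)
    hits = 0
    for _ in range(replicates):
        sim_exposed = [1 if rng.random() < p_null else 0 for _ in range(n1)]
        sim_buried = [1 if rng.random() < p_null else 0 for _ in range(n2)]
        # Refit both models to each simulated data set.
        if lr_statistic(sim_exposed, sim_buried) >= observed:
            hits += 1
    # Add-one convention keeps the estimated p-value above zero.
    return (hits + 1) / (replicates + 1)


if __name__ == "__main__":
    # Toy data (invented): exposed sites change more often than buried ones.
    exposed_sites = [1] * 60 + [0] * 40
    buried_sites = [1] * 20 + [0] * 80
    print(parametric_bootstrap_pvalue(exposed_sites, buried_sites))
```

The point of simulating under the fitted null model, rather than relying on a chi-square approximation, is that the reference distribution of the test statistic is obtained directly from the model being tested. This matters when the models being compared are not nested or when the asymptotic approximation is doubtful.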



Articles from Genetics are provided here courtesy of Genetics Society of America

