Protein Sci. Sep 2003; 12(9): 2057–2062.
PMCID: PMC2324001

Contact order revisited: Influence of protein size on the folding rate


Guided by the recent success of an empirical model that predicts the folding rates of small two-state folding proteins from the relative contact order (CO) of their native structures, by a theoretical model of protein folding that predicts that the logarithm of the folding rate decreases with the protein chain length L as $L^{2/3}$, and by the finding that the folding rates of multistate folding proteins correlate strongly with their sizes and very poorly with CO, we reexamined the dependence of folding rate on CO and L in an attempt to find a structural parameter that determines folding rates for the totality of proteins. We show that Abs_CO = CO × L predicts rather accurately the folding rates of both two-state and multistate folding proteins, as well as short peptides, and that this Abs_CO scales with the protein chain length as $L^{0.70 \pm 0.07}$ for the totality of studied single-domain proteins and peptides.

Keywords: Protein folding kinetics, two-state kinetics, multistate kinetics, contact order, protein size, protein topology, rate of folding

Many proteins fold and unfold by a simple two-state transition lacking observable intermediates under any solvent conditions (Jackson 1998). Many other proteins exhibit a more complicated multistate transition; namely, they have observable folding intermediates under physiological conditions. However, the boundary between these two groups of proteins is not well defined.

It is known that some proteins can be switched from two-state to multistate folding, and vice versa, by point mutations or even by changing conditions such as the salt concentration or temperature (Jackson 1998). In addition, multistate folding is observed only far from the point of thermodynamic equilibrium between the native and denatured states, whereas, close to this point, all proteins fold without any observable intermediates (Privalov 1979; Jackson 1998; Finkelstein and Ptitsyn 2002).

Small two-state folding proteins have attracted particular attention from experimentalists and theorists. It was demonstrated that the logarithms of the in-water folding rates of these proteins correlate with a gross topological parameter called the relative contact order (CO; Plaxco et al. 1998b). The latter is defined as

$$\mathrm{CO} = \frac{1}{L \cdot N} \sum^{N} \Delta L_{ij}$$

where N is the number of contacts (within 6 Å) between nonhydrogen atoms in the protein, L is the length of the protein in amino acid residues, and ΔLij is the number of residues separating the interacting pair of nonhydrogen atoms (adjacent residues are assumed to be separated by one residue, etc.).

CO is a renormalization of the perhaps more intuitive measure, absolute contact order (Abs_CO),

$$\mathrm{Abs\_CO} = \frac{1}{N} \sum^{N} \Delta L_{ij} = \mathrm{CO} \times L$$

which, however, was found to be less correlated than CO with folding rates of the two-state folders (Plaxco et al. 1998b; Grantcharova et al. 2001).
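The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function name, the input layout (a list of heavy atoms tagged with their residue index), and the choice to skip atom pairs within the same residue are assumptions made here.

```python
import math
from itertools import combinations

def contact_orders(atoms, n_residues, cutoff=6.0):
    """Return (CO, Abs_CO) for a list of (residue_index, (x, y, z)) nonhydrogen atoms.

    cutoff is the contact distance in angstroms (6 A in the text).
    """
    n_contacts = 0
    total_separation = 0
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(atoms, 2):
        separation = abs(res_i - res_j)  # adjacent residues count as 1
        if separation == 0:
            continue  # assumption: atom pairs within one residue are not counted
        if math.dist(xyz_i, xyz_j) <= cutoff:
            n_contacts += 1
            total_separation += separation
    if n_contacts == 0:
        raise ValueError("no contacts found within the cutoff")
    abs_co = total_separation / n_contacts  # Abs_CO = (1/N) * sum of separations
    co = abs_co / n_residues                # CO = Abs_CO / L
    return co, abs_co
```

Reading coordinates from a structure file is left out on purpose; any parser that yields residue indices and heavy-atom coordinates would do.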

The CO was introduced to compare differences in topology (rather than in size) between proteins of different length. This parameter is small for proteins stabilized mainly by local interactions and is large when residues in a protein interact frequently with partners far away in the protein sequence. The latter should lead to slower folding (Plaxco et al. 1998b; Fersht 2000). Indeed, the negative correlation between the CO and the logarithm of folding rates was found to be very strong, about −0.8 (Plaxco et al. 1998b; Fersht 2000), for two-state folding proteins; this also holds for all two-state folding proteins studied to date (Fig. 1, circles).

Figure 1.
Natural logarithm of observed folding rate in water, ln(kf), versus relative contact order (CO) for various proteins and peptides: proteins having two-state folding kinetics at all the denaturant concentrations (circles), proteins having multistate folding ...

However, examining the whole set of proteins studied to date (Table 1), we see that CO, although it still gives good results for two-state folding proteins, fails to predict the folding rates of short peptides and large multistate folding proteins (Fig. 1). The reason seems to be that CO takes only topology into account and pays no explicit attention to protein size.

Table 1.
List of proteins and polypeptides a

A number of basic correlations between protein size and folding rate have been suggested (Thirumalai 1995; Gutin et al. 1996; Finkelstein and Badretdinov 1997a,b). All of them stress that, as might be expected, folding rate decreases monotonically with protein size, but all indicate different scaling laws for this decrease. It should be noted that some recent simulations of folding of off-lattice protein models with simplified potentials (Koga and Takada 2001) indicate that the logarithm of the protein folding rate decreases with the chain length as $L^{0.61 \pm 0.18}$, which is in accordance with both Finkelstein and Badretdinov’s (1997a,b) and Thirumalai’s (1995) theories.

It has been shown, however, that the protein size by itself determines folding rates of only multistate folding proteins and fails to predict those for two-state folders (Galzitskaya et al. 2003): For multistate folders, the negative correlation between $L^P$ (L being the number of residues in the chain and P a free parameter) and the logarithm of folding rates is as high as −0.80 in the broad range of power P from zero to one, whereas for two-state folders any correlation between folding rate and size is virtually absent.

This study aims to develop a general parameter for predicting the folding rates of two-state folding proteins, multistate folding proteins, and small peptides. Such a general estimate, if found, would be useful for two reasons: (1) Attribution of proteins to two-state or multistate folders is somewhat arbitrary, at least for proteins that can be switched from two-state to multistate behavior by point mutations or by changing solvent conditions, and (2) it is useful to estimate the folding rate of a protein when one does not know a priori whether it is a two-state or a multistate folding protein.

Results and Discussion

The simplest way to obtain such a parameter is to take into account both the protein topology and its size, that is, to combine a length-based theory with the empirical topology effect (Plaxco et al. 1998b). Here we describe such a combination.

Specifically, the theory of Finkelstein and Badretdinov (1997a,b) predicts that, in the vicinity of the thermodynamic midtransition, the folding rates of all single-domain proteins should decrease with their length, L, as $\exp[-C L^{2/3}]$, where the size-independent coefficient C, which ranges from 0.5 to 1.5, depends on the topology of the protein: C is close to 0.5 when a protein is stabilized mainly by local interactions, so that the semifolded protein does not contain closed loops protruding from the folding nucleus, and C is close to 1.5 when a protein has many long-range contacts, so that many closed loops protrude from the nucleus. Later it was shown (Galzitskaya et al. 2001) that the range of folding times 1/kf from $\exp(0.5L^{2/3}) \times 10$ ns to $\exp(1.5L^{2/3}) \times 10$ ns is valid for all the studied peptides and single-domain proteins of a great variety of lengths, topologies, and folding behaviors.
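As a rough numerical illustration of the range quoted from Galzitskaya et al. (2001), the sketch below evaluates the two bounds for a given chain length. It assumes that the bounds are folding times (the 10 ns prefactor has units of time); the function and parameter names are hypothetical.

```python
import math

def folding_time_bounds_seconds(length, c_min=0.5, c_max=1.5, prefactor_s=10e-9):
    """Return (fastest, slowest) folding-time bounds in seconds.

    Each bound is prefactor * exp(C * L**(2/3)), with C = 0.5 and C = 1.5
    as the topology-dependent limits quoted in the text.
    """
    scaled = length ** (2.0 / 3.0)
    fastest = prefactor_s * math.exp(c_min * scaled)
    slowest = prefactor_s * math.exp(c_max * scaled)
    return fastest, slowest

# Example: bounds for a chain of 64 residues, where L**(2/3) is about 16
fastest, slowest = folding_time_bounds_seconds(64)
print(f"{fastest:.3g} s to {slowest:.3g} s")
```

Because the coefficient sits in the exponent, the width of the window grows quickly with chain length.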

Although Finkelstein and Badretdinov did not give an algorithm to compute their coefficient, C, from protein structure, it is clear that the physical meaning of C is similar to that of the CO of Plaxco et al. Both are small for proteins with local contacts (e.g., α-helical proteins), and both are large for proteins with predominantly long-range contacts, which cannot avoid having many loops in a semifolded state. Therefore, the values of C and CO should correlate.

The simplest combination of CO and L, which seems to follow from the theories of Plaxco et al. and Finkelstein and Badretdinov, would be CO × $L^{2/3}$. However, because we observe that CO is not a chain length–independent parameter (as the value C of Finkelstein and Badretdinov should be) but anticorrelates with the chain length, L (Fig. 2), for the totality of proteins and peptides, we combine CO and L in a general parameter, the "size-modified contact order" (SMCO), as

$$\mathrm{SMCO} = \mathrm{CO} \times L^{P}$$

Figure 2.
Logarithm of relative contact order versus logarithm of chain length. See legend to Figure 1 for specification of the symbols and other details. The dashed line represents the best linear fit for two-state folders only (the correlation coefficient ...

One can see that P = 0 corresponds to SMCO = CO, whereas P = 1 corresponds to SMCO = Abs_CO.

The correlation between SMCO and ln(kf) as a function of the power P is presented in the inset in Figure 3. One can see that although any P > 0.7 results in approximately the same correlation for the totality of proteins and peptides, the best correlation is achieved at P ≈ 1, that is, when SMCO = Abs_CO. The correlation of Abs_CO and ln(kf) is presented in Figure 3.
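The scan over P described here can be sketched as follows. This is only an outline under stated assumptions: the function names are hypothetical, the Pearson coefficient is taken as the correlation measure, and the input values (CO, L, and ln(kf) per protein) must be supplied by the reader, since Table 1 is not reproduced in this text.

```python
import math

def smco(co, length, power):
    """Size-modified contact order: CO * L**P (P = 0 gives CO, P = 1 gives Abs_CO)."""
    return co * length ** power

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def scan_power(cos, lengths, ln_kfs, powers):
    """Map each power P to the correlation between SMCO(P) and ln(kf)."""
    return {
        p: pearson([smco(c, l, p) for c, l in zip(cos, lengths)], ln_kfs)
        for p in powers
    }
```

Calling `scan_power` on a grid such as `[0.0, 0.1, ..., 1.0]` gives the kind of curve that the inset of Figure 3 shows; the most negative value marks the best-performing P for the chosen set of proteins.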

Figure 3.
Logarithm of observed folding rate in water ln(kf) versus Abs_CO = CO × L. See legend to Figure 1 for specification of the symbols and other details. The dashed line represents the best linear fit for two-state folders only (the fitted ...

It should be mentioned, however, that for the two-state folders, the best correlation between ln(kf) and SMCO is achieved when P is between 0 and 0.5 rather than at P = 1 (Fig. 3, inset).

This difference between the scaling laws observed for two-state folders and the other proteins correlates, to a certain extent, with the finding (Fig. 2) that CO is independent of the chain length for the two-state folders, whereas it decreases with the chain length, L, in proportion to $L^{-0.4}$ for multistate folders; for the totality of proteins and peptides, CO decreases with chain length in proportion to $L^{-0.30 \pm 0.07}$ on average.

It is noteworthy that CO scales precisely as $L^{-0.30 \pm 0.07}$ for the totality of proteins and peptides (Fig. 2, dashed line). This means that the value Abs_CO = CO × L (which has the highest correlation with ln(kf) for the totality of proteins and peptides; Fig. 3, inset) scales with the chain length as $L^{0.70 \pm 0.07}$. This is in very good agreement with the general scaling law $L^{2/3}$ predicted by Finkelstein and Badretdinov (1997a,b) (although Thirumalai’s [1995] scaling law $L^{0.5}$ correlates only slightly worse with experiment and thus cannot be ruled out; Fig. 3, inset), and agrees with the empirical scaling $L^{0.61 \pm 0.18}$ resulting from the simplified off-lattice folding simulations of Koga and Takada (2001).
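The exponent arithmetic behind this statement can be written out explicitly. The proportionality constant a is introduced here only for illustration; multiplying by L shifts the exponent by exactly one and leaves the fitted uncertainty unchanged.

```latex
\mathrm{CO} \approx a\,L^{-0.30 \pm 0.07}
\quad\Longrightarrow\quad
\mathrm{Abs\_CO} = \mathrm{CO} \times L \approx a\,L^{\,1 - 0.30 \pm 0.07} = a\,L^{0.70 \pm 0.07}
```

The predicted exponent 2/3 ≈ 0.67 lies inside the interval 0.63 to 0.77 spanned by 0.70 ± 0.07.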


We are grateful to Blake Gillespie and Oxana Galzitskaya for discussions and some computations, and to David Thirumalai for discussions and his results on correlation of ln(kf) with CO × $L^{1/2}$. This work was supported in part by the Russian Foundation for Basic Research, by an International Research Scholar’s Award to A.V.F. from the Howard Hughes Medical Institute, and by the Institute of Theoretical Physics (Santa Barbara University, ITP work no. NSF-ITP-01-173).

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.


Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0302503.


  • Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., and Tasumi, M. 1977. The Protein Data Bank: A computer-based archival file for macromolecular structures. Eur. J. Biochem. 80 319–324. [PubMed]
  • Burns, L.L., Dalessio, P.M., and Ropson, I.J. 1998. Folding mechanism of three structurally similar β-sheet proteins. Proteins 33 107–118. [PubMed]
  • Burton, R.E., Huang, G.S., Daugherty, M.A., Fullbright, P.W., and Oas, T.G. 1996. Microsecond protein folding through a compact transition state. J. Mol. Biol. 263 311–322. [PubMed]
  • Cavagnero, S., Dyson, H.J., and Wright, P.E. 1999. Effect of H helix destabilizing mutations on the kinetic and equilibrium folding of apomyoglobin. J. Mol. Biol. 285 269–282. [PubMed]
  • Choe, S.E., Matsudaira, P.T., Osterhout, J., Wagner, G., and Shakhnovich, E.I. 1998. Folding kinetics of villin 14T, a protein domain with a central β-sheet and two hydrophobic cores. Biochemistry 37 14508–14518. [PubMed]
  • Clarke, J., Hamill, S.J., and Johnson, C.M. 1997. Folding and stability of a fibronectin type III domain of human tenascin. J. Mol. Biol. 270 771–778. [PubMed]
  • Clarke, J., Cota, E., Fowler, S.B., and Hamill, S.J. 1999. Folding studies of immunoglobulin-like β-sandwich proteins suggest that they share a common folding pathway. Struct. Fold. Des. 7 1145–1153. [PubMed]
  • Cota, E. and Clarke, J. 2000. Folding of β-sandwich proteins: Three-state transition of a fibronectin type III module. Protein Sci. 9 112–120. [PMC free article] [PubMed]
  • Dalessio, P.M. and Ropson, I.J. 2000. β-Sheet proteins with nearly identical structures have different folding intermediates. Biochemistry 39 860–871. [PubMed]
  • Ferguson, N., Capaldi, A.P., James, R., Kleanthous, C., and Radford, S.E. 1999. Rapid folding with and without populated intermediates in the homologous four-helix proteins Im7 and Im9. J. Mol. Biol. 286 1597–1608. [PubMed]
  • Fersht, A.R. 2000. Transition-state structure as a unifying basis in protein-folding mechanisms: Contact order, chain topology, stability, and the extended nucleus mechanism. Proc. Natl. Acad. Sci. 97 1525–1529. [PMC free article] [PubMed]
  • Finkelstein, A.V. and Badretdinov, A.Y. 1997a. Physical reasons for a rapid folding of stable protein structures: A solution of Levinthal’s paradox. Mol. Biol. 31 391–398.
  • ———. 1997b. Rate of protein folding near the point of thermodynamic equilibrium between the coil and the most stable chain fold. Fold Des. 2 115–121. [PubMed]
  • Finkelstein, A.V. and Ptitsyn, O.B. 2002. Protein physics. Lectures 19–21. Academic Press, New York.
  • Fowler, S.B. and Clarke, J. 2001. Mapping the folding pathway of an immunoglobulin domain: Structural detail from φ value analysis and movement of the transition state. Struct. Fold Des. 9 355–366. [PubMed]
  • Galzitskaya, O.V., Ivankov, D.N., and Finkelstein, A.V. 2001. Folding nuclei in proteins. FEBS Lett. 489 113–118. [PubMed]
  • Galzitskaya, O.V., Garbuzynskiy, S.O., Ivankov, D.N., and Finkelstein, A.V. 2003. Chain length is the main determinant of the folding rate for proteins with three-state folding kinetics. Proteins 51 162–166. [PubMed]
  • Golbik, R., Zahn, R., Harding, S.E., and Fersht, A.R. 1998. Thermodynamic stability and folding of GroEL minichaperones. J. Mol. Biol. 276 505–515. [PubMed]
  • Goldberg, M.E., Semisotnov, G.V., Friguet, B., Kuwajima, K., Ptitsyn, O.B., and Sugai, S. 1990. An early immunoreactive folding intermediate of the tryptophan synthetase β2 subunit is a "molten globule." FEBS Lett. 263 51–56. [PubMed]
  • Grantcharova, V.P. and Baker, D. 1997. Folding dynamics of the src SH3 domain. Biochemistry 36 15685–15692. [PubMed]
  • Grantcharova, V., Alm, E.J., Baker, D., and Horwich, A.L. 2001. Mechanisms of protein folding. Curr. Opin. Struct. Biol. 11 70–82. [PubMed]
  • Guerois, R. and Serrano, L. 2000. The SH3-fold family: Experimental evidence and prediction of variations in the folding pathways. J. Mol. Biol. 304 967–982. [PubMed]
  • Guijarro, J.I., Morton, C.J., Plaxco, K.W., Campbell, I.D., and Dobson, C.M. 1998. Folding kinetics of the SH3 domain of PI3 kinase by real-time NMR combined with optical spectroscopy. J. Mol. Biol. 276 657–667. [PubMed]
  • Gutin, A.M., Abkevich, V.I., and Shakhnovich, E.I. 1996. Chain length scaling of protein folding time. Phys. Rev. Lett. 77 5433–5436. [PubMed]
  • Ikura, T., Hayano, T., Takahashi, N., and Kuwajima, K. 2000. Fast folding of Escherichia coli cyclophilin A: A hypothesis of a unique hydrophobic core with a phenylalanine cluster. J. Mol. Biol. 297 791–802. [PubMed]
  • Jackson, S.E. 1998. How do small single-domain proteins fold? Fold. Des. 3 R81–R91. [PubMed]
  • Jackson, S.E. and Fersht, A.R. 1991. Folding of chymotrypsin inhibitor 2, 1: Evidence for a two-state transition. Biochemistry 30 10428–10435. [PubMed]
  • Jager, M., Nguyen, H., Crane, J.C., Kelly, J.W., and Gruebele, M. 2001. The folding mechanism of a β-sheet: The WW domain. J. Mol. Biol. 311 373–393. [PubMed]
  • Jennings, P.A., Finn, B.E., Jones, B.E., and Matthews, C.R. 1993. A reexamination of the folding mechanism of dihydrofolate reductase from Escherichia coli: Verification and refinement of a four-channel model. Biochemistry 32 3783–3789. [PubMed]
  • Khorasanizadeh, S., Peters, I.D., and Roder, H. 1996. Evidence for a three-state model of protein folding from kinetic analysis of ubiquitin variants with altered core residues. Nat. Struct. Biol. 3 193–205. [PubMed]
  • Kim, D.E., Fisher, C., and Baker, D. 2000. A breakdown of symmetry in the folding transition state of protein L. J. Mol. Biol. 298 971–984. [PubMed]
  • Koga, N. and Takada, S. 2001. Roles of native topology and chain-length scaling in protein folding: A simulation study with a Go-like model. J. Mol. Biol. 313 171–180. [PubMed]
  • Kragelund, B.B., Robinson, C.V., Knudsen, J., Dobson, C.M., and Poulsen, F.M. 1995. Folding of a four-helix bundle: Studies of acyl-coenzyme A binding protein. Biochemistry 34 7217–7224. [PubMed]
  • Kuhlman, B., Luisi, D.L., Evans, P.A., and Raleigh, D.P. 1998. Global analysis of the effects of temperature and denaturant on the folding and unfolding kinetics of the N-terminal domain of the protein L9. J. Mol. Biol. 284 1661–1670. [PubMed]
  • Laurents, D.V., Corrales, S., Elias-Arnanz, M., Sevilla, P., Rico, M., and Padmanabhan, S. 2000. Folding kinetics of phage 434 Cro protein. Biochemistry 39 13963–13973. [PubMed]
  • Main, E.R., Fulton, K.F., and Jackson, S.E. 1999. Folding pathway of FKBP12 and characterisation of the transition state. J. Mol. Biol. 291 429–444. [PubMed]
  • Matouschek, A., Kellis Jr., J.T., Serrano, L., Bycroft, M., and Fersht, A.R. 1990. Transient folding intermediates characterized by protein engineering. Nature 346 440–445. [PubMed]
  • McCallister, E.L., Alm, E., and Baker, D. 2000. Critical role of β-hairpin formation in protein G folding. Nat. Struct. Biol. 7 669–673. [PubMed]
  • Munoz, V., Lopez, E.M., Jager, M., and Serrano, L. 1994. Kinetic characterization of the chemotactic protein from Escherichia coli, CheY: Kinetic analysis of the inverse hydrophobic effect. Biochemistry 33 5858–5866. [PubMed]
  • Munoz, V., Thompson, P.A., Hofrichter, J., and Eaton, W.A. 1997. Folding dynamics and mechanism of β-hairpin formation. Nature 390 196–199. [PubMed]
  • Ogasahara, K. and Yutani, K. 1994. Unfolding-refolding kinetics of the tryptophan synthase α subunit by CD and fluorescence measurements. J. Mol. Biol. 236 1227–1240. [PubMed]
  • Otzen, D.E. and Oliveberg, M. 1999. Salt-induced detour through compact regions of the protein folding landscape. Proc. Natl. Acad. Sci. 96 11746–11751. [PMC free article] [PubMed]
  • Parker, M.J. and Marqusee, S. 1999. The cooperativity of burst phase reactions explored. J. Mol. Biol. 293 1195–1210. [PubMed]
  • Parker, M.J., Spencer, J., and Clarke, A.R. 1995. An integrated kinetic analysis of intermediates and transition states in protein folding reactions. J. Mol. Biol. 253 771–786. [PubMed]
  • Parker, M.J., Sessions, R.B., Badcoe, I.G., and Clarke, A.R. 1996. The development of tertiary interactions during the folding of a large protein. Fold. Des. 1 145–156. [PubMed]
  • Parker, M.J., Dempsey, C.E., Lorch, M., and Clarke, A.R. 1997. Acquisition of native β-strand topology during the rapid collapse phase of protein folding. Biochemistry 36 13396–13405. [PubMed]
  • Perl, D., Welker, C., Schindler, T., Schroder, K., Marahiel, M.A., Jaenicke, R., and Schmid, F.X. 1998. Conservation of rapid two-state folding in mesophilic, thermophilic and hyperthermophilic cold shock proteins. Nat. Struct. Biol. 5 229–235. [PubMed]
  • Plaxco, K.W., Spitzfaden, C., Campbell, I.D., and Dobson, C.M. 1997. A comparison of the folding kinetics and thermodynamics of two homologous fibronectin type III modules. J. Mol. Biol. 270 763–770. [PubMed]
  • Plaxco, K.W., Guijarro, J.I., Morton, C.J., Pitkeathly, M., Campbell, I.D., and Dobson, C.M. 1998a. The folding kinetics and thermodynamics of the Fyn-SH3 domain. Biochemistry 37 2529–2537. [PubMed]
  • Plaxco, K.W., Simons, K.T., and Baker, D. 1998b. Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277 985–994. [PubMed]
  • Privalov, P.L. 1979. Stability of proteins: Small globular proteins. Adv. Protein Chem. 33 167–241. [PubMed]
  • Reid, K.L., Rodriguez, H.M., Hillier, B.J., and Gregoret, L.M. 1998. Stability and folding properties of a model β-sheet protein, Escherichia coli CspA. Protein Sci. 7 470–479. [PMC free article] [PubMed]
  • Schindler, T., Herrler, M., Marahiel, M.A., and Schmid, F.X. 1995. Extremely rapid protein folding in the absence of intermediates. Nat. Struct. Biol. 2 663–673. [PubMed]
  • Schreiber, G. and Fersht, A.R. 1993. The refolding of cis- and trans-peptidylprolyl isomers of barstar. Biochemistry 32 11195–11203. [PubMed]
  • Schymkowitz, J.W., Rousseau, F., Irvine, L.R., and Itzhaki, L.S. 2000. The folding pathway of the cell-cycle regulatory protein p13suc1: Clues for the mechanism of domain swapping. Struct. Fold Des. 8 89–100. [PubMed]
  • Silow, M. and Oliveberg, M. 1997. High-energy channeling in protein folding. Biochemistry 36 7633–7637. [PubMed]
  • Spector, S. and Raleigh, D.P. 1999. Submillisecond folding of the peripheral subunit-binding domain. J. Mol. Biol. 293 763–768. [PubMed]
  • Tang, K.S., Guralnick, B.J., Wang, W.K., Fersht, A.R., and Itzhaki, L.S. 1999. Stability and folding of the tumour suppressor protein p16. J. Mol. Biol. 285 1869–1886. [PubMed]
  • Thirumalai, D. 1995. From minimal models to real proteins: Time scales for protein folding kinetics. J. Phys. 5 1457–1469.
  • Thompson, P.A., Eaton, W.A., and Hofrichter, J. 1997. Laser temperature jump study of the helix⇔coil kinetics of an alanine peptide interpreted with a "kinetic zipper" model. Biochemistry 36 9200–9210. [PubMed]
  • Van Nuland, N.A., Chiti, F., Taddei, N., Raugei, G., Ramponi, G., and Dobson, C.M. 1998a. Slow folding of muscle acylphosphatase in the absence of intermediates. J. Mol. Biol. 283 883–891. [PubMed]
  • Van Nuland, N.A., Meijberg, W., Warner, J., Forge, V., Scheek, R.M., Robillard, G.T., and Dobson, C.M. 1998b. Slow cooperative folding of a small globular protein HPr. Biochemistry 37 622–637. [PubMed]
  • Viguera, A.R., Serrano, L., and Wilmanns, M. 1996. Different folding transition states may result in the same native structure. Nat. Struct. Biol. 3 874–880. [PubMed]
  • Villegas, V., Azuaga, A., Catasus, L., Reverter, D., Mateo, P.L., Aviles, F.X., and Serrano, L. 1995. Evidence for a two-state transition in the folding process of the activation domain of human procarboxypeptidase A2. Biochemistry 34 15105–15110. [PubMed]
  • Wittung-Stafshede, P., Lee, J.C., Winkler, J.R., and Gray, H.B. 1999. Cytochrome b562 folding triggered by electron transfer: Approaching the speed limit for formation of a four-helix–bundle protein. Proc. Natl. Acad. Sci. 96 6587–6590. [PMC free article] [PubMed]

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society