Proc Natl Acad Sci U S A. 2000 Mar 14; 97(6): 2562–2566.

The retro-GCN4 leucine zipper sequence forms a stable three-dimensional structure


The question of whether a protein whose natural sequence is inverted adopts a stable fold is still under debate. We have determined the 2.1-Å crystal structure of the retro-GCN4 leucine zipper. In contrast to the two-stranded helical coiled-coil GCN4 leucine zipper, the retro-leucine zipper formed a very stable, parallel four-helix bundle, which now lends itself to further structural and functional studies.

Since the early folding experiments by Anfinsen (1), it has been accepted that structure and function of a protein are determined by its amino acid sequence as read from the N terminus to the C terminus. But how does the structure change if the amino acid sequence of the protein is inverted? Inverted sequences are occasionally found in genomic DNA, but thus far, a native retro-protein has not been detected. Physicochemical properties that are related to the amino acid composition or the hydrophobicity profile should not be affected by the inversion of the sequence, supporting the idea that retro-sequences might fold into a native-like conformation. Modeling experiments suggested that reversal of the backbone direction may result in a topological mirror image of the native structure of the protein (2) or may produce the same topology as that of the parent protein (3, 4). Thus far, no structure elucidation of a retro-L-peptide at atomic resolution has been reported.

Coiled coils, including leucine zippers, consist of two to five intertwined α-helices and are frequently found in oligomeric proteins such as transcription factors as well as motility and structural proteins (5). We used the two-stranded coiled-coil domain of the yeast transcription activator GCN4 (6) to dimerize an artificial HIV enhancer-binding peptide, an operation that resulted in increased inhibition of HIV enhancer-controlled transcription (7). Because modeling studies suggested that the retro-sequence of the GCN4 leucine zipper also seemed to form a suitable dimerization module, a 35-residue retro-GCN4 was synthesized and characterized. Oxidized retro-leucine zipper extended with Cys-Gly-Gly, previously termed (r-LZ38)2 (r-GCN4-p1′; Fig. 1A), crystallized and is now shown by x-ray structure analysis to fold into a parallel tetrameric coiled coil.

Figure 1
(A) Sequence alignment of the true 35-residue retro-leucine zipper based on the sequence of GCN4-p1, previously termed r-LZ35 (r-GCN4-p1), r-GCN4-p1′, wild-type C-terminal 33-residue leucine zipper moiety of the yeast transcription activator GCN4 ...

Materials and Methods


Sedimentation velocity experiments were performed with a Beckman–Spinco XL-A analytical ultracentrifuge. The peptide was dissolved in 20 mM Tris·HCl/80 mM NaCl adjusted to pH 5.0, and measurements were made over a 10- to 100-μM peptide concentration range. The centrifuge was operated at a speed of 50,000 rpm at 20°C. A partial specific volume, v20 = 0.74 cm³·g⁻¹, was calculated from the amino acid composition, as was an estimate of the degree of hydration, d1 = 0.47 g water/g protein. Sedimentation velocity traces were analyzed according to the method described in ref. 8, and a value of the sedimentation coefficient, sw,20 = 1.77 ± 0.5 Svedberg, was obtained by extrapolation to infinite dilution. The axial ratio was calculated as described in ref. 9.

Crystallization and Data Collection.

The r-GCN4-p1′ peptide (Fig. 1A) was synthesized by solid phase chemistry as described (10) and crystallized by hanging drop vapor diffusion. The reservoir solution consisted of 0.2 M sodium chloride/25% (vol/vol) 2-methyl-2,4-pentanediol/100 mM sodium acetate, pH 4.8. The peptide solution in water (2.5 mg/ml) was mixed with the reservoir solution in a 1:1 ratio. Crystals belonged to space group P2₁2₁2 with unit cell dimensions of a = 34.11 Å, b = 34.09 Å, and c = 56.44 Å and two peptide chains per asymmetric unit. The crystal packing was very close to the tetragonal space group P42₁2 with one molecule in the asymmetric unit. Data were collected at the Swiss-Norwegian beamline at the European Synchrotron Radiation Facility (Grenoble, France) and processed with the program xds (11). Data are given in Table 1, with the values in the last resolution shell shown in parentheses.
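As a rough consistency check on the cell parameters quoted above, the following sketch computes the volume of the orthorhombic cell and the share of that volume available to each peptide chain. The cell edges and the two chains per asymmetric unit are taken from the text; the count of four symmetry operators is the standard value for this orthorhombic space group, and the function and variable names are illustrative assumptions, not part of the original analysis.

```python
# Unit cell edges reported in the text (in Å). The cell is orthorhombic,
# so all angles are 90 degrees and the volume is simply a * b * c.
A, B, C = 34.11, 34.09, 56.44

# The orthorhombic space group used here has four symmetry operators;
# with two chains per asymmetric unit (as stated in the text) the cell
# holds eight peptide chains.
SYMMETRY_OPERATORS = 4
CHAINS_PER_ASU = 2


def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Return the volume (Å^3) of a cell whose angles are all 90 degrees."""
    return a * b * c


def volume_per_chain(cell_volume: float, n_ops: int, chains_per_asu: int) -> float:
    """Return the cell volume (Å^3) available to each peptide chain."""
    return cell_volume / (n_ops * chains_per_asu)


cell_volume = orthorhombic_cell_volume(A, B, C)
per_chain = volume_per_chain(cell_volume, SYMMETRY_OPERATORS, CHAINS_PER_ASU)

# Relative difference between a and b; a small value reflects the
# near-tetragonal packing mentioned in the text.
ab_mismatch = abs(A - B) / A

print(f"cell volume : {cell_volume:.0f} Å^3")
print(f"per chain   : {per_chain:.0f} Å^3")
print(f"|a - b| / a : {ab_mismatch:.4%}")
```

A Matthews coefficient would additionally require the molecular mass of the peptide, which is not given here, so the sketch stops at the volume per chain.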

Table 1
Crystallographic data and refinement statistics

Structure Solution and Refinement.

The structure was solved by molecular replacement (program amore; ref. 12) with the complete tetrameric GCN4-pLI structure as a search model (PDB ID code 1GCL; ref. 13). After rigid-body refinement, the electron density maps were sufficiently clear to assign the r-GCN4-p1′ sequence. During the early stages of refinement, we used the simulated annealing slow-cooling protocol (program x-plor; ref. 14) and manual model building (program o; ref. 15). Later, we applied constrained maximum-likelihood refinement as implemented in the programs refmac (16) and cns (17). The Fobs were scaled anisotropically (B11 = 0.984 Å², B22 = 0.789 Å², and B33 = −1.735 Å²). The specific buried surface (S_b) was calculated as S_b = [(z_olig × S_mono) − S_olig]/(z_olig × n_res), with z_olig being the number of α-helices in the coiled coil, S_mono and S_olig being the water-accessible surfaces in Å² of the isolated α-helix and the coiled coil, respectively (r_probe = 1.4 Å), and n_res being the number of residues in one α-helix.
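The specific buried surface defined above can be written as a small function. This is only a sketch of the formula as stated; the function name and the input numbers in the example call are hypothetical placeholders, not values measured for r-GCN4-p1′.

```python
def specific_buried_surface(z_olig: int, s_mono: float, s_olig: float, n_res: int) -> float:
    """Buried surface per residue (Å^2) on forming the coiled coil.

    z_olig : number of alpha-helices in the coiled coil
    s_mono : water-accessible surface (Å^2) of one isolated alpha-helix
    s_olig : water-accessible surface (Å^2) of the assembled coiled coil
    n_res  : number of residues in one alpha-helix
    """
    if z_olig <= 0 or n_res <= 0:
        raise ValueError("z_olig and n_res must be positive")
    # Surface of z_olig isolated helices minus the surface of the assembly,
    # normalised by the total number of residues in the assembly.
    return (z_olig * s_mono - s_olig) / (z_olig * n_res)


# Hypothetical accessible surfaces, chosen only to illustrate the call.
example = specific_buried_surface(z_olig=4, s_mono=3500.0, s_olig=8000.0, n_res=38)
print(f"specific buried surface: {example:.1f} Å^2 per residue")
```

Normalising by z_olig × n_res is what makes values for dimers and tetramers directly comparable.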

Results and Discussion

Analytical ultracentrifugation and CD spectroscopic studies showed that the true r-GCN4-p1 35-residue peptide (Fig. 1A) was monomeric and had no appreciable structure in the low micromolar concentration range (10). However, at higher concentrations (1 mM), this peptide is completely α-helical (Fig. 1B). Analytical ultracentrifugation showed that, at 250 μM, r-GCN4-p1 exists as a mixture of monomers and tetramers in solution (10). N-terminal extension of r-GCN4-p1 by the tripeptide Cys-Gly-Gly yielded a 38-residue peptide (r-GCN4-p1′) that allowed the formation of a covalently linked dimer. r-GCN4-p1′ was completely α-helical at a much lower concentration of 15 μM. Ultracentrifugation data of r-GCN4-p1′ were interpreted in terms of a prolate ellipsoid with an axial ratio of 2.3, suggesting that the tetramer is preserved even in dilute aqueous solutions (9). The CD spectra of r-GCN4-p1 at high concentrations and of r-GCN4-p1′ at low concentrations are very similar. Therefore, we conclude that the secondary structural contents of r-GCN4-p1 at 1 mM and r-GCN4-p1′ at 15 μM are also similar.

The structure was refined in the orthorhombic space group P2₁2₁2, although the crystal packing was very close to tetragonal. Processing the data in P42₁2 yielded an Rsym of 4.7%, which was similar to the value observed for P2₁2₁2 (Table 1). The decision for the lower symmetry space group was made based on the chemistry of the peptide. Mass spectroscopic analysis showed that two peptide chains were covalently linked by a disulfide bridge (data not shown). The presence of two disulfide bridges was in contradiction to the perfect 4-fold symmetry. To allow for this asymmetry, we reduced the main symmetry axis from 4- to 2-fold. The structure was modeled as a dimer of dimers (Fig. 1C). However, the Cys-Gly-Gly linker is partially disordered, and the molecule adopts an almost perfect 4-fold symmetry with an rmsd of 0.10 Å between the noncrystallographic symmetry-related peptide chains (residues 4–36), explaining the low Rsym value for the tetragonal processing.

The coiled-coil superhelix has a cylindrical shape with a diameter of 20 Å and a height of 55 Å (Fig. 2A). The observed molecular dimensions correspond with the analytical ultracentrifugation experiments. Residues 6–34 form an α-helix that makes a 77.8° superhelical twist. Because there are just 3.5 residues per α-helix turn, one superhelical turn requires 37 α-helical turns or 130 residues. There is a large temperature factor gradient along the superhelix from below 20 Å² for residues 13–21 to approximately 100 Å² near the N and C termini. Obviously the N and C termini with their increased thermal motion possess greater structural flexibility. Water molecules are concentrated around the rigid central part of the superhelix (Fig. 2A). A much less pronounced temperature factor gradient is observed in the native GCN4-p1 structure (6) and is entirely lacking in the tetrameric GCN4-pLI mutant (13). The increased rigidity in the central parts of the retro-peptide superhelix corresponds with the results that have been obtained in the retro-bombolitin III peptide entrapped in SDS micelles that has a central helical segment, as determined by NMR spectroscopy (20).
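The arithmetic behind the superhelical pitch quoted above can be reproduced in a few lines. The twist, the residue range, and the 3.5 residues per turn are taken from the text; counting the twist over the steps between residues 6 and 34 (that is, 28 steps) is an assumption made here to recover the stated figures.

```python
TWIST_DEG = 77.8               # superhelical twist reported for residues 6-34
FIRST_RES, LAST_RES = 6, 34    # helical segment reported in the text
RESIDUES_PER_HELIX_TURN = 3.5  # value used in the text for the coiled coil


def residues_per_superhelical_turn(twist_deg: float, first_res: int, last_res: int) -> float:
    """Residues needed for a full 360-degree turn of the superhelix.

    Assumption: the twist accumulates over the steps between residues,
    so the segment first_res..last_res contributes (last_res - first_res) steps.
    """
    steps = last_res - first_res
    return 360.0 / twist_deg * steps


residues = residues_per_superhelical_turn(TWIST_DEG, FIRST_RES, LAST_RES)
helix_turns = residues / RESIDUES_PER_HELIX_TURN

print(f"residues per superhelical turn     : {residues:.1f}")
print(f"alpha-helix turns per superhelical turn: {helix_turns:.1f}")
```

Rounded to whole numbers, both results agree with the 130 residues and 37 α-helical turns given in the text.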

Figure 2
(A) Backbone of the r-GCN4-p1′ tetramer colored according to temperature factor (dark blue = 20 Å²; red = 100 Å²). Water molecules are shown as red bullets. Figures were prepared with the programs molscript and bobscript (18, ...

Coiled-coil structures contain a characteristic sequence profile consisting of seven residues denoted a to g. The a and d positions are occupied by leucine, isoleucine, or valine residues whose side chains point toward the center of the superhelix. The packing of the helices is guided by the “knobs-into-holes” principle that was first proposed by Crick (21). Knobs in position a fit into holes formed by residues d, g, a#, and d# of the clockwise related α-helix (# refers to the next repeat). Knobs in position d fit into holes that are created by residues in positions a, d, e, and a#. In the r-GCN4-p1′ peptide, positions a and d are filled by leucine and valine residues (Fig. 1C). These residues form the hydrophobic core along the center of the superhelix. Harbury and coworkers (13) related the oligomerization states with the residue types in the a and d positions. The authors introduced a nomenclature for substitutions in the GCN4-p1 sequence: GCN4-pIL, -pII, -pLI, etc. reflect the amino acids in positions a and d. If positions a and d were filled by Ile(Val)/Leu, Ile/Ile, or Leu/Ile(Val) residues, then the coiled-coil structures were dimers, trimers, or tetramers, respectively. The dimeric wild-type GCN4-p1 structure and the tetrameric GCN4-pLI mutant superimpose onto the r-GCN4-p1′ mutant with rmsds of 0.37 Å and 0.50 Å, respectively. The oligomerization state of the r-GCN4-p1′ structure is in perfect agreement with these predictions, because the r-GCN4-p1′ peptide as well as the GCN4-pLI mutant form tetramers in solution. The structure-based sequence alignment (Fig. 1A) indicates that valine residues in position d are replaced by isoleucine residues in the GCN4-pLI mutant. The GCN4-pLI mutant core is more densely packed than the r-GCN4-p1′ core because of additional carbon atoms in the side chains at position d. Fig. 2B shows that there are four cavities lining up along the 4-fold axis.
The largest cavity (volume = 87 Å³) is close to the center of gravity of the r-GCN4-p1′ structure. The strong positive electron density inside this cavity was interpreted as four water molecules. Two of them form hydrogen bonds with symmetry-related Asn-21 side chains, which are the only polar side chains that point into the hydrophobic core (Fig. 2C). The positions of these water molecules are poorly defined, and in fact, this cavity seems to be too narrow to host all four water molecules simultaneously. Perhaps only two waters occupy this cavity at the same time but may fluctuate between the four symmetry-equivalent positions.
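The relation between core residues and oligomerization state summarized above (Ile(Val)/Leu, Ile/Ile, and Leu/Ile(Val) at positions a/d giving dimers, trimers, and tetramers) can be expressed as a simple lookup. This is a minimal sketch of that rule only; the table and function names are illustrative, and combinations not listed in the text deliberately return no prediction.

```python
from typing import Optional

# (residue at a, residue at d) -> oligomerization state, as summarized in
# the text from Harbury and coworkers (ref. 13). One-letter amino acid codes.
OLIGOMER_BY_CORE = {
    ("I", "L"): "dimer",
    ("V", "L"): "dimer",
    ("I", "I"): "trimer",
    ("L", "I"): "tetramer",
    ("L", "V"): "tetramer",
}


def predict_oligomer(a_residue: str, d_residue: str) -> Optional[str]:
    """Return the state expected for the given a/d core residues.

    Returns None for combinations that the rule in the text does not cover.
    """
    return OLIGOMER_BY_CORE.get((a_residue.upper(), d_residue.upper()))


# r-GCN4-p1' has leucine at a and valine at d according to the text.
print("Leu/Val core:", predict_oligomer("L", "V"))
# GCN4-pLI has leucine at a and isoleucine at d.
print("Leu/Ile core:", predict_oligomer("L", "I"))
```

Both the retro-peptide core (Leu/Val) and the GCN4-pLI core (Leu/Ile) fall into the tetramer class under this rule, which matches the observation reported in the text.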

The smaller cavities are located close to valine residues at position d of the seven-residue repeat. In GCN4-pLI, all of these cavities, except the large central cavity, are absent. No fixed water molecule positions are found in the central cavity of GCN4-pLI. Stabilizing hydrogen bonds are impossible because of the replacement of Asn-21 by isoleucine. The role of the asparagine residues in the a positions of the central seven-residue repeat of GCN4-p1 and the c-Jun leucine zippers has been discussed in detail (23–26); it is thought that the presence of this residue maintains the dimeric state, even though asparagine in the a position involves an energetic expense, because the higher order oligomers tend to be more stable than the dimer because of an increased buried surface area. Substitutions in positions e and g also lead to changes in oligomerization state (27, 28). In the r-GCN4-p1′ structure, 44 Å² per residue is buried on tetramerization compared with 25 Å² in the native GCN4-p1 dimer. Also, in the r-GCN4-p1′, the asparagine is shifted to the d position.

Residues 2–5 and residues 37–38 do not participate in the α-helix hydrogen bond network and possess extended conformations. The disordering of the N-terminal Cys-Gly-Gly linker can be attributed to the placement of the disulfide bridge. Residue Cys-1 occupies position e, which is unsuitable for a covalent interaction between symmetry-related side chains. To accommodate the disulfide bridge, a distortion of the helix hydrogen bond network is required. The side chain of Glu-5 forms a water-mediated hydrogen bond with the peptide oxygen of Arg-4 in the counter-clockwise related helix and serves as an efficient N-terminal helix cap. Although Lys-34 and Arg-36 are located close to the C terminus, neither of these residues fulfills a similar function. Further polar side chains are involved in hydrogen bonds on the surface of the superhelix. Particularly, the interactions between side chains in positions g/b (Lys-10/Arg-12 and Glu-31/Gln-33), g/e (Glu-17/Glu-15), and c/b (Tyr-20/Glu-26) create a hydrogen-bonding network that ties together adjacent α-helices (Fig. 1C). Surprisingly, there are several interactions between identically charged side chains. Presumably, the charges are sufficiently delocalized such that repulsive interactions do not interfere with folding of the retro-peptide. Interactions between identically charged side chains are occasionally found in crystal contacts (29, 30). The surface hydrogen-bond network in the r-GCN4-p1′ structure is markedly different from the network seen in the GCN4-pLI structure (Fig. 2D). Only one g/b interaction (Lys-8/Glu-10 in GCN4-pLI) is conserved in the r-GCN4-p1′ structure (Lys-10/Arg-12). Ser-23 in r-GCN4-p1′ and Ser-14 in GCN4-pLI are in position f. In both structures, the serine OH groups interact with peptide oxygens in b positions of the previous α-helical turns. Surprisingly, the r-GCN4-p1′ structure is very stable.
The dissociation constant and the free energy of unfolding for the r-GCN4-p1′ are Kd = 1.8 × 10⁻¹⁰ M and ΔGu = 55.0 kJ·mol⁻¹ (10). The occurrence of unfavorable polar interactions in the r-GCN4-p1′ structure illustrates that the stability of the GCN4 leucine zipper is dominated by the proper packing of the hydrophobic core, and polar interactions on the surface play only a minor role, which is in agreement with previous results (31).
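The dissociation constant and the unfolding free energy quoted above can be checked against each other with the standard relation ΔG = −RT ln Kd. The sketch below assumes a two-state equilibrium, a 1 M standard state, and a temperature of 298.15 K; the temperature in particular is an assumption, because the conditions of the measurement in ref. 10 are not restated here. The function name is illustrative.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ mol^-1 K^-1


def unfolding_free_energy(kd_molar: float, temperature_k: float) -> float:
    """Return the free energy (kJ/mol) for a dissociation constant in mol/L.

    Uses dG = -R * T * ln(Kd) with a 1 M standard state (an assumption).
    """
    if kd_molar <= 0 or temperature_k <= 0:
        raise ValueError("Kd and temperature must be positive")
    return -GAS_CONSTANT_KJ * temperature_k * math.log(kd_molar)


KD = 1.8e-10            # M, value quoted in the text
ASSUMED_T = 298.15      # K, assumed; not stated in the text

delta_g = unfolding_free_energy(KD, ASSUMED_T)
print(f"dG at {ASSUMED_T} K: {delta_g:.1f} kJ/mol")
```

Under these assumptions the result lies within about 1 kJ·mol⁻¹ of the reported value, so the two quoted numbers appear mutually consistent; a slightly lower temperature would narrow the gap further.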

The r-GCN4-p1′ monomer is almost identical to the native GCN4-p1 monomer, because the hydrophobicity profile contains a palindrome. The 2-fold palindrome axis intersects the r-GCN4-p1′ structure close to the central cavity between His-19 and Tyr-20 (Figs. 1A and 2B). Application of the 2-fold symmetry operation to the native GCN4-p1 sequence is equivalent to an a→d and d→a transposition within the seven-residue repeat. When residues in positions a and d are swapped, the oligomerization state changes from a dimer to a tetramer. Application of the palindrome axis to the native GCN4-p1 peptide inverts the direction of the helix dipole. Superposition of the r-GCN4-p1′ structure onto the inverted GCN4-p1 structure yields an rmsd of 0.69 Å for 31 Cα atoms. When the superposition is restricted to a smaller part of the structure, the rmsd is significantly reduced (Fig. 2E). In this superposition, the α-helix dipole moments point in opposite directions, but both structures possess exactly identical sequence profiles. Side chains in the r-GCN4-p1′ structure are aligned with the Cα hydrogens of the inverted GCN4-p1 structure. Despite this similarity and the identity of the sequence profiles, the r-GCN4-p1′ peptide does not form a dimer like the native GCN4-p1 peptide, because the orientation between side chain and main chain is critical for the proper packing of α-helices. To achieve an optimal knobs-into-holes fit, the Cα–Cβ bond in position d must be perpendicular to the Cα–Cα vector in the adjacent helix. Because the side chains are pointing in the direction of the Cα hydrogens, an r-GCN4-p1′ peptide consisting of D-amino acids could form a dimer with the helix dipole oriented in the opposite direction.
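The statement that reversing the sequence amounts to exchanging the a and d positions can be illustrated with a short sketch. Reading a heptad repeat backward turns the 3-4 spacing of core residues into a 4-3 spacing, so the register has to be relabeled to keep core residues in the core. The relabeling table below (a↔d, b↔c, e↔g, f unchanged) is derived here from that spacing argument rather than taken from the text, and the 14-residue peptide is a hypothetical toy sequence, not GCN4-p1.

```python
HEPTAD = "abcdefg"

# Relabeling applied after reversal so that core positions stay in the core.
# Derived from the heptad spacing; an assumption of this sketch.
RETRO_RELABEL = {"a": "d", "b": "c", "c": "b", "d": "a", "e": "g", "f": "f", "g": "e"}


def heptad_register(length: int, start: str = "a") -> str:
    """Return a register string of the given length beginning at `start`."""
    offset = HEPTAD.index(start)
    return "".join(HEPTAD[(offset + i) % 7] for i in range(length))


def retro_register(parent_register: str) -> str:
    """Return the register of the reversed sequence, read N to C terminus."""
    return "".join(RETRO_RELABEL[pos] for pos in reversed(parent_register))


def core_residues(sequence: str, register: str) -> list:
    """Return (position, residue) pairs for the a and d positions."""
    return [(pos, res) for res, pos in zip(sequence, register) if pos in "ad"]


# Hypothetical toy peptide with Val at a and Leu at d (a dimer-type core).
parent_seq = "VAALAAAVAALAAA"
parent_reg = heptad_register(len(parent_seq))

retro_seq = parent_seq[::-1]
retro_reg = retro_register(parent_reg)

print("parent core:", core_residues(parent_seq, parent_reg))
print("retro core :", core_residues(retro_seq, retro_reg))
```

In the toy example the parent has Val at a and Leu at d, whereas the retro-sequence carries Leu at a and Val at d, which is the core pattern the text associates with tetramers.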

Several studies on retro-peptides have been reported (32, 33), but with the exception of a 13-residue retro-D-peptide (34), detailed structural analysis has generally been hampered by the inability of the retro-peptides to form stable three-dimensional structures. This inability has led to the conclusion that retro-peptides differ considerably from their parent peptides and represent new structural entities where the correct folding is not guaranteed (35). Our results show that, although the retro sequence of many proteins might not fold, there are peptides in which the retro-sequence indeed folds and adopts a stable structure. The palindromic nature of the sequence is an important property for the existence of a stable retro-peptide structure. Although the r-GCN4-p1′ structure does not seem to be perfect, because there are several cavities in the hydrophobic core as well as interactions between identically charged side chains, it is reasonably stable. The presence of a palindrome ensures that the hydrophobicity profiles of the parent and the retro-sequences are similar. However, a palindrome in a protein can be more hidden than a palindromic DNA repeat, which makes detection extremely difficult. Although improved packing and stability might result if such a peptide were subjected to evolutionary pressure, the detection of the parent sequence would probably become impossible. The simplicity of the fold that consists only of α-helices also contributes to the stability of the r-GCN4-p1′ structure, because this architecture has the advantage that main chain hydrogen bonds are exclusively formed locally, and elements that are located in distant parts of the sequence do not have to interact. In addition, because there is just one secondary structural element, the structure is entirely free of loops. These factors relax the constraints on the sequence profile and increase the probability of the occurrence of a palindrome. 
We believe that, although r-GCN4-p1′ is an artificial construct, it will be an extremely useful tool for studying the impact of sequence directionality on protein folding.


We thank H.-R. Bosshard for fruitful discussions. Financial support by the Baugartenstiftung (CH-8022 Zürich) is acknowledged.


This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: GCN4-p1, C-terminal 33-residue leucine zipper moiety of the yeast transcription activator GCN4; r-GCN4-p1, 35-residue retro-leucine zipper based on the sequence of GCN4-p1, previously termed r-LZ35; r-GCN4-p1′, oxidized retro-leucine zipper extended with Cys-Gly-Gly, previously termed (r-LZ38)2; GCN4-pLI, GCN4-p1 mutant with leucine and isoleucine residues in positions a and d, respectively; rmsd, root-mean-square deviation.

Data deposition: The atomic coordinates have been deposited in the Protein Data Bank, www.rcsb.org (PDB ID code 1C94).


1. Anfinsen C B. Science. 1973;181:223–230.
2. Guptasarma P. FEBS Lett. 1992;310:205–210.
3. Olszewski K A, Kolinski A, Skolnick J. Protein Eng. 1996;9:5–14.
4. Witte K, Skolnick J, Wong C-H. J Am Chem Soc. 1998;120:13042–13046.
5. Beck K, Brodsky B. J Struct Biol. 1998;122:17–29.
6. O'Shea E K, Klemm J, Kim P S, Alber T. Science. 1991;254:538–544.
7. Liu N, Caderas G, Gutte B, Thomas R M. Eur Biophys J. 1997;25:399–403.
8. Philo J. Biophys J. 1997;72:435–444.
9. Laue T M, Shah B D, Ridgeway T M, Pelletier S L. In: Analytical Ultracentrifugation in Biochemistry and Polymer Science. Harding S E, Rowe A J, Horton J C, editors. Cambridge, U.K.: R. Soc. Chem.; 1992. pp. 90–125.
10. Liu N, Deillon C, Klauser S, Gutte B, Thomas R M. Protein Sci. 1998;7:1214–1220.
11. Kabsch W. J Appl Crystallogr. 1988;21:916–924.
12. Navaza J. Acta Crystallogr A. 1994;50:157–163.
13. Harbury P B, Zhang T, Kim P S, Alber T. Science. 1993;262:1401–1407.
14. Brünger A T, Kuriyan J, Karplus M. Science. 1987;235:458–460.
15. Jones T A, Zou J Y, Cowan S W, Kjeldgaard M. Acta Crystallogr A. 1991;47:110–119.
16. Murshudov G N, Lebedev A, Vagin A A, Wilson K S, Dodson E J. Acta Crystallogr D. 1999;55:247–255.
17. Brünger A T, Adams P D, Clore G M, DeLano W L, Gros P, Grosse-Kunstleve R W, Jiang J S, Kuszewski J, Nilges M, Pannu N S, et al. Acta Crystallogr D. 1998;54:905–921.
18. Kraulis P J. J Appl Crystallogr. 1991;24:946–950.
19. Esnouf R M. J Mol Graphics. 1997;15:132–134.
20. Bisello A, Sala S, Tonello A, Signor G, Melotto E, Mammi S, Peggion E. Int J Biol Macromol. 1995;17:273–282.
21. Crick F H C. Acta Crystallogr. 1953;6:689–697.
22. Nicholls A, Sharp K A, Honig B. Proteins. 1991;11:282–296.
23. Potekhin S A, Medvedkin V N, Kashparov I A, Venyaminov S Y. Protein Eng. 1994;7:1097–1101.
24. Thomas R M, Wendt H, Zampieri A, Bosshard H R. Prog Colloid Polym Sci. 1995;99:24–30.
25. Junius F K, Mackay J P, Bubb W A, Jensen S A, Weiss A S, King G F. Biochemistry. 1995;34:6164–6174.
26. Gonzalez L, Woolfson D N, Alber T. Nat Struct Biol. 1996;3:510–515.
27. Zheng X, Zhu H, Lashuel H A, Hu J C. Protein Sci. 1997;6:2218–2226.
28. Eckert D M, Malashkevich V N, Kim P S. J Mol Biol. 1998;284:859–865.
29. Privé G G, Anderson D H, Wesson L, Cascio D, Eisenberg D. Protein Sci. 1999;8:1400–1409.
30. Mittl P R E, Priestle J P, Cox D A, McMaster G, Cerletti N, Grütter M G. Protein Sci. 1996;5:1261–1271.
31. Spek E J, Bui A H, Lu M, Kallenbach N R. Protein Sci. 1998;7:2431–2437.
32. Chorev M, Goodman M. Trends Biotechnol. 1995;13:438–445.
33. Juvvadi P, Vunnam S, Merrifield R B. J Am Chem Soc. 1996;118:8989–8997.
34. McDonald J M, Fushman D, Cahill S M, Sutton B J, Cowburn D. J Am Chem Soc. 1997;119:5321–5328.
35. Lacroix E, Viguera A R, Serrano L. Folding Des. 1998;3:79–85.
