Proc Natl Acad Sci U S A. Jan 21, 2003; 100(2): 478–483.
Published online Jan 6, 2003. doi:  10.1073/pnas.0236088100
PMCID: PMC141020
Biochemistry

Visualization of coupled protein folding and binding in bacteria and purification of the heterodimeric complex

Abstract

During overexpression of recombinant proteins in Escherichia coli, misfolded proteins often aggregate and form inclusion bodies. If an aggregation-prone recombinant protein is fused upstream (as an N-terminal fusion) to GFP, aggregation of the recombinant protein domain also leads to misfolding of the downstream GFP domain, resulting in a decrease or loss of fluorescence. We investigated whether the GFP domain could fold correctly if aggregation of the upstream protein domain was prevented in vivo by a coupled protein folding and binding interaction. Such interaction has been previously shown to occur between the E. coli integration host factors α and β, and between the domains of the general transcriptional coactivator cAMP response element binding protein (CREB)-binding protein and the activator for thyroid hormone and retinoid receptors. In this study, fusion of integration host factor β or the CREB-binding protein domain upstream to GFP resulted in aggregation of the fusion protein. Coexpression of their respective partners, on the other hand, allowed soluble expression of the fusion protein and a dramatic increase in fluorescence. The study demonstrated that coupled protein folding and binding could be correlated to GFP fluorescence. A modified miniintein containing an affinity tag was inserted between the upstream protein domain and GFP to allow rapid purification and identification of the heterodimeric complex. The GFP coexpression fusion system may be used to identify novel protein–protein interactions that involve coupled folding and binding or protein partners that can solubilize aggregation-prone recombinant proteins.

Protein aggregation and inclusion body formation are frequently encountered when recombinant proteins are overexpressed in Escherichia coli (1–3). By expressing GFP as a C-terminal fusion to the protein of interest, the solubility of the fusion protein in vivo, indicative of productive protein folding, can be monitored by fluorescence (4). If the protein of interest is prone to aggregation when expressed alone, it would likely also do so in the GFP fusion (4, 5). It appears that the aggregation of the upstream protein domain causes misfolding of the downstream GFP domain. Without a correctly folded structure, GFP cannot form the chromophore essential for its fluorescence (6, 7).

It has been suggested that protein aggregation during protein overexpression in a heterologous host is the result of early events during protein synthesis and folding that lead to off-pathway association of folding intermediates (8, 9). Because the GFP variant used in the GFP fusion protein folds properly when expressed alone (10), the misfolding of the GFP domain in the GFP fusion protein is probably due to the interference of the early GFP-folding pathway by the upstream aggregation-prone protein. In a previous study (4), GFP fusion proteins produced in vivo exhibited no significant difference in fluorescence compared to the same proteins produced by in vitro transcription/translation. This lack of dependence in fluorescence on bulk concentrations of macromolecules seemed to suggest that the misfolding of the GFP domain in the fusion proteins involved intra-molecular interactions that occurred early during protein synthesis and folding. Based on these hypotheses, we reasoned that, if we introduced a second protein known to bind and fold the aggregation-prone protein during early stages of protein synthesis and folding, the nonproductive interference with the folding of the GFP domain might be prevented and the GFP domain be allowed to fold correctly. To test this idea, we set up a model system in which an aggregation-prone subunit of the E. coli integration host factor (IHF) was used as the upstream (N-terminal) domain in a GFP fusion. IHF is a heterodimer of α and β subunits (11). Separate overexpression of IHF-α and -β in E. coli resulted in unstable polypeptides and insoluble aggregates, respectively (12). Coexpression of both IHF subunits was necessary for the production of soluble and active proteins. The expressed IHF proteins were purified as a 1:1 complex of tightly bound α and β subunits (12). 
It seems that one expressed IHF subunit could not adopt a correctly folded structure until it interacted with the other subunit, a process of mutual chaperoning which probably occurs early during protein synthesis and folding. Using IHF-β as the upstream domain in a GFP fusion protein, we demonstrated in this study that the interaction between the two IHF subunits allowed the downstream GFP domain to fold correctly, resulting in a dramatic increase in fluorescence.

There have been numerous examples of unstructured proteins that adopt folded structures when binding to their biological partners (13). These “coupled folding and binding” events, often found in eukaryotes, represent a distinct class of biomolecular recognition that is quite different from the “lock and key” or “adhesive surface” model of biomolecular interaction, often observed between folded or structured macromolecules (14, 15). We chose a recently characterized interaction, named synergistic folding, between domains of the general transcriptional coactivator cAMP response element binding protein (CREB)-binding protein (CBP) and the activator for thyroid hormone and retinoid receptors (ACTRs) (16). We show here that the unstructured CBP domain fused upstream to GFP led to aggregation and inclusion body formation. Coexpression of the ACTR domain resulted in a significant increase in fluorescence and copurification of the ACTR/CBP complex. Our data suggest that the GFP fusion may be of general use in monitoring coupled protein folding and binding events in vivo.

Materials and Methods

Fusion Vector Construction.

The fusion vectors were derived from the pTEG vector, which contained an isopropyl β-d-thiogalactopyranoside-inducible T7 promoter and expressed a modified miniintein fused to a downstream GFP domain as described (5). The E. coli IHF α and β genes were amplified from E. coli chromosomal DNA by using PCR. The domain fragment of human ACTR (residues 1,018–1,088) (16) was synthesized by using overlapping oligonucleotides. The domain fragment of mouse CBP (residues 2,059–2,117) was amplified from a vector carrying a full-length mouse CBP gene (kindly provided by Stephen C. Harrison). The construct of IG (Fig. 1A) expressed a fusion protein consisting of only the modified miniintein and GFP. The DNA fragments for IHF β and the CBP domain (CBPf residues 2,059–2,117) were cloned into the multiple cloning site of pTEG to create the fusion constructs βIG and CBPfIG, respectively (Fig. 1A). For bicistronic coexpression, the DNA fragments for IHF α and the ACTR domain (ACTRf residues 1,018–1,088) were cloned upstream of the βIG and CBPfIG fusion genes, respectively, and a second bacterial ribosome-binding site was placed after the stop codons of the upstream genes, yielding α-βIG and ACTRf-CBPfIG (Fig. 1A). As a control, the E. coli maltose-binding protein (MBP) gene was cloned in the bicistronic expression vector α-βIG to replace the IHF α gene, creating the construct MBP-βIG (Fig. 1A). The ACTR domain was also tagged with six histidine residues at its C terminus to yield the bicistronic expression construct ACTRfHis6-CBPfIG (Fig. 1A). A mutant ACTRfHis6, named mACTRfHis6, was constructed by using PCR and mutagenic oligos, in which the leucine residues 1,056 and 1,071 were substituted by arginine. The gene sequence of the modified miniintein in the expression constructs of βIG and α-βIG was replaced by a short linker to generate the fusion constructs βG and α-βG, in which β was directly fused to the N terminus of the GFP domain (Fig. 1A).
For two-plasmid coexpression, a second expression vector pACCExp, derived from pACYC184 (New England Biolabs) and compatible with the pTEG derivatives, was constructed. pACCExp had a T7 promoter and cloning sites similar to those of pTEG. In the two-plasmid coexpression of α, βIG (Fig. 1B), βIG was cloned in the pTEG vector whereas the α gene was cloned in the pACCExp vector. In the two-plasmid coexpression of β, α-βIG′ (Fig. 1B), the β gene was expressed from the pTEG vector whereas the α-βIG bicistronic expression was from the pACCExp vector.

Figure 1
Schematic diagram of the coexpression fusion constructs. T7 promoter (T7) is indicated by the arrow. rbs, bacterial ribosome-binding site; I, the modified miniintein; G, green fluorescence protein; α, E. coli IHFα; β, E. coli IHFβ; ...

Cell Culture and Protein Expression.

All fusion constructs were transformed into E. coli ER 2566 (New England Biolabs, Beverly, MA) (genotype: F− λ− fhuA2 [lon] ompT lacZ::T7 gene1 gal sulA11 Δ(mcrC-mrr)114::IS10 R(mcr-73::miniTn10–TetS)2 R(zgb-210::Tn10)(TetS) endA1 [dcm]). Cells were grown at 37°C in LB medium supplemented with 100 μg/ml ampicillin in shake flasks. At OD600 = 0.05, the cultures were induced by using 0.4 mM isopropyl β-D-thiogalactoside and allowed to continue growth at 20 or 30°C for an additional 16 h. The cells were then used for fluorescence measurements or harvested by centrifugation for solubility analyses and protein purification.

Fluorescence Measurements.

For each fusion construct, fluorescence was measured in triplicate from three independently grown postinduction cultures (50 ml). For the construct IG and all fusion constructs of βIG (Fig. 3A), the cultures were induced at 20°C for 16 h. For the fusion constructs of CBPfIG (Fig. 5A), the cultures were induced at 30°C for 16 h. The cultures were diluted to OD600 1.0 and aliquots (200 μl each) were then added to wells of a microplate. The fluorescence was measured by using a Perkin–Elmer LS50B luminescence spectrometer (excitation, 395 nm; emission, 510 nm; each with 2.5-nm bandwidth). A culture containing a GFP-minus construct and grown and induced under the same conditions was used as a blank, and its fluorescence value was subtracted from those of the cultures containing the GFP fusion constructs. The fluorescence images of induced cultures (wc) and clarified cell extracts (sup) (Figs. 3B and 5B) were taken by using a digital camera. The samples were prepared in the same microplate as used for the fluorescence measurement and illuminated by a hand-held long-wave UV lamp (365 nm). All cultures were diluted to the same optical density before their fluorescence images were taken. The clarified cell extracts were obtained by using the Bugbuster HT protein extraction reagent (Novagen).
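The blank subtraction and the comparison to a reference construct described above can be sketched in a few lines of Python. This is only an illustrative calculation, not the authors' analysis procedure: the function name, the use of a simple mean over the triplicate readings, and the expression of the result as a percentage of the reference (for example, IG) are assumptions, and the numbers in the usage line are hypothetical.

```python
from statistics import mean


def relative_fluorescence(sample_reads, blank_reads, reference_reads):
    """Return blank-subtracted sample fluorescence as a percentage of a reference.

    All three arguments are sequences of raw readings taken from cultures
    diluted to the same optical density (assumed here, as in the text).
    """
    blank = mean(blank_reads)                    # GFP-minus culture
    sample = mean(sample_reads) - blank          # construct of interest
    reference = mean(reference_reads) - blank    # e.g. the IG construct
    if reference <= 0:
        raise ValueError("reference signal must exceed the blank")
    return 100.0 * sample / reference


# Hypothetical readings in arbitrary units, for illustration only.
percent_of_reference = relative_fluorescence(
    [12.0, 12.0, 12.0], [10.0, 10.0, 10.0], [210.0, 210.0, 210.0]
)
```

Subtracting the same blank from both the sample and the reference keeps cell autofluorescence and scattering from inflating the ratio when the sample signal is close to background.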

Figure 3
(A) Relative fluorescence of the induced cultures expressing different βIG constructs. Fluorescence was measured in triplicate from cultures induced at 20°C and normalized by cell density (see Materials and Methods). Relative fluorescence ...
Figure 5
(A) Relative fluorescence of the induced cultures expressing different CBPfIG constructs. Fluorescence was measured in triplicate from cultures induced at 30°C and normalized by cell density (see Materials and Methods). Relative fluorescence values ...

Solubility Analysis and Protein Purification.

The cell pellet from a 1-liter-induced culture was resuspended in HEN buffer (20 mM Hepes, pH 8.0/0.5 M NaCl) for solubility analysis and purification on chitin beads (New England Biolabs) or in KPI buffer (1 M K2HPO4, pH 8.0/20 mM imidazole) for purification on nickel-nitrilotriacetate (Ni-NTA) agarose beads (Qiagen, Chatsworth, CA). Cells were lysed by sonication and a sample was immediately taken for analysis by SDS/PAGE or Western blot as total cell protein (T). After centrifugation at 25,000 × g for 30 min, an aliquot of the supernatant (S) was analyzed for soluble protein while the pellet (P) was solubilized by boiling in 8 M urea. For purification on chitin beads, the supernatant was passed through a column packed with chitin beads at 4°C. After thoroughly washing the column with HEN buffer, an aliquot of the chitin beads was removed and treated with SDS-loading buffer (New England Biolabs) for analysis of the proteins bound to the beads. A solution of DTT (50 mM in HEN buffer) was then quickly passed through the column. The flow was stopped and the column was transferred to 23°C. After incubation for 16 h, the proteins were eluted in HEN buffer. An aliquot of the chitin beads also was removed and treated with the SDS-loading buffer for the analysis of the remaining proteins bound to the beads. For purification on Ni-NTA agarose beads (Qiagen), the supernatant was mixed at 4°C with Ni-NTA agarose beads in a centrifuge tube. The agarose beads were washed by repeated centrifugation with 20-bed volumes of KPI buffer containing 50 mM imidazole. The proteins were then eluted in KPI buffer containing 250 mM imidazole.

Western Blot Analysis.

The cell densities (OD600) of the induced cultures from all CBPfIG constructs (CBPfIG, ACTRf-CBPfIG, ACTRfHis6-CBPfIG, and mACTRfHis6-CBPfIG) were determined. The cells were then diluted to the same optical density before they were lysed. This ensured that the same amount of cells from each CBPfIG construct was used for solubility analysis (Fig. 4). Proteins from the total cell extract (T), the supernatant (S), and the pellet (P) were resolved on 12–16% Tris-Glycine gels (NOVEX, San Diego). After blotting onto nitrocellulose membranes, the proteins were visualized by probing with a mAb against the chitin-binding domain (New England Biolabs) and a polyclonal Ab against the six-histidine tag (Cell Signaling Technologies, Beverly, MA), followed by chemiluminescent detection.
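Diluting each culture to a common optical density amounts to a C1V1 = C2V2 calculation. The sketch below is a hypothetical helper, not part of the published protocol; it assumes that OD600 scales linearly with cell density over the range used, and the function name, target OD, and volumes are chosen only for illustration.

```python
def dilution_volumes(measured_od, target_od, final_volume_ml):
    """Return (culture_ml, diluent_ml) needed to reach target_od.

    Assumes OD600 is proportional to cell density (C1 * V1 = C2 * V2).
    """
    if measured_od <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if measured_od < target_od:
        raise ValueError("culture is already below the target OD")
    culture_ml = final_volume_ml * target_od / measured_od
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical example: a culture at OD600 4.0 brought to OD600 1.0 in 1 ml.
culture_ml, diluent_ml = dilution_volumes(4.0, 1.0, 1.0)
```

Applying the same target OD to every construct is what allows the band intensities in the T, S, and P fractions to be compared across lanes.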

Figure 4
Western blot analysis of the total cell extract (T), the supernatant (S), and the pellet (P) from different CBPfIG constructs (A and B). (A) Expression of CBPfIG alone (lanes 1–3) and with coexpression of ACTRf (lanes 4–6). Proteins were ...

Results

The Upstream IHFβ Caused Aggregation of the GFP Fusion Protein.

The E. coli IHFβ gene was cloned into the pTEG vector (5), in which IHFβ was fused to the N terminus of a modified miniintein, whose C terminus was in turn fused to GFP. The modified miniintein contained an insertion of a chitin-binding domain and functioned to facilitate the purification of the upstream protein domain without the use of a protease (5). The expression and solubility of the miniintein-GFP fusion protein (IG) alone was analyzed. As shown in Fig. 2A (Left), IG (50 kDa) was highly expressed in E. coli under the T7 promoter and a significant portion of the expressed proteins was soluble. Fusion of IHFβ to the N terminus of IG generated the βIG fusion protein (Fig. 1A). Although expressed at a level similar to that of IG, βIG (61 kDa) was almost completely insoluble and most of the fusion protein was found in the pellet fraction (Fig. 2A Right). These data were consistent with the previous studies showing that the E. coli IHFβ formed inclusion bodies when overexpressed (12) and that an aggregation-prone protein fused upstream to the intein-GFP moiety led to aggregation of the fusion protein (5).

Figure 2
Solubility analysis on SDS/PAGE gels of samples taken from the total cell extract (T), the supernatant (S), and the pellet (P) of different fusion constructs (A–C), and SDS/PAGE analysis of samples taken from different purification ...

Coexpression of IHFα, from the Same or a Separate Plasmid, Prevented Aggregation of the IHFβ Fusion Protein.

The IHFα gene with a bacterial ribosome-binding site was cloned upstream of βIG to create a bicistronic expression construct (α-βIG) (Fig. 1A). Expression of α and βIG was driven by a single upstream T7 promoter. Both α and βIG were highly expressed after isopropyl β-d-thiogalactopyranoside induction and a significant amount of α seemed to be in the soluble fraction (Fig. 2B Left, α-βIG). In contrast to the expression of βIG alone, coexpression of α and βIG resulted in a large portion of the expressed βIG fusion protein in the soluble fraction (Fig. 2B Left). To test whether the bicistronic expression of α was necessary, the α gene was cloned into a compatible expression vector pACCExp which also used an isopropyl β-d-thiogalactopyranoside-inducible T7 promoter. pACCExp-α was then cotransformed into cells containing βIG [i.e., two-plasmid coexpression, Fig. 1B (1), α, βIG]. As shown in Fig. 2B (α, βIG, Center), both α and βIG were highly expressed and the expression of α from a separate plasmid also led to the soluble expression of βIG, at least to an extent similar to the bicistronic expression (Fig. 2B α-βIG, Left). As a control, α in α-βIG was replaced by the E. coli MBP, a protein that was not expected to interact with β. Solubility analysis indicated that βIG was mostly insoluble in the presence of the coexpressed MBP (Fig. 2B, MBP-βIG, Right).

To verify whether the specific interaction between α and β was responsible for the soluble expression of the βIG fusion protein, a second copy of the β gene was introduced on a compatible vector (β competition). In this case, the α-bacterial ribosome-binding site-βIG fragment from the bicistronic expression construct was transferred into the pACCExp vector to generate α-βIG′. Expression of α-βIG′ showed a solubility pattern similar to that of α-βIG (Fig. 2C, lanes 1–3 and Fig. 2B Center). α-βIG′ was then cotransformed into cells containing a compatible vector expressing free β (i.e., β, α-βIG′ as in Fig. 1B). As shown in Fig. 2C (lanes 4–6), the relative amount of the soluble βIG was significantly decreased in comparison to α-βIG′ alone. The overall expression of βIG was also decreased, possibly due to the competition from free β overexpression. A significant amount of free β was found in the soluble fraction (Fig. 2C, lane 5). Because β was completely insoluble when overexpressed alone and became soluble in the presence of α (12), the data suggested a possible interaction between α and free β. In conclusion, the above data suggest that the specific interaction between α and β was the cause of the soluble expression of βIG.

IHFα Was Tightly Bound to IHFβ in the Fusion Protein and Could Be Copurified Through the Intein-Mediated Purification Process.

The supernatant from an induced culture of the bicistronic expression construct α-βIG (Fig. 2D, lane 2) was loaded onto chitin beads. After washing to remove the unbound proteins, an aliquot of the bound proteins was eluted by SDS treatment. As shown in Fig. 2D (lane 3), both βIG (61 kDa) and α (11.4 kDa) were bound to chitin beads. Because α lacked the chitin-binding domain necessary for binding to chitin beads, the data suggested that α was tightly associated with the βIG fusion protein and coeluted by the SDS treatment. A minor band, just below βIG, was the in vivo cleavage product IG, whose identity was further confirmed by Western analysis using the Ab against the chitin-binding domain (data not shown). The bound βIG on chitin beads was then induced to cleave at the N terminus of the modified miniintein in the presence of DTT, which released β from the IG moiety. As shown in the elution fraction (Fig. 2D, lane 4), α coeluted with β, indicating a tight association between the two IHF subunits as expected. The identities of α and β in the elution were further confirmed by protein N-terminal sequencing analysis (data not shown). After the DTT-induced cleavage, the remaining bound proteins on chitin beads were eluted by SDS treatment. As indicated by Fig. 2D (lane 5), most of the βIG fusion proteins were converted to IG.

Coexpression of IHFα Prevented the Downstream GFP Domain in βIG from Misfolding.

The relative fluorescence of the induced cultures from different βIG fusion constructs was compared (Fig. 3A). The fusion construct, βIG, had a low fluorescence (<1% of IG). Combined with the solubility analysis of the βIG expression (Fig. 2A Right), the data suggested that aggregation of β caused misfolding of the downstream GFP domain, which could not form the correct chromophore structure essential for its fluorescence. The fluorescence of the cells containing βIG increased significantly (>50-fold) when α was coexpressed from the same (α-βIG) or a compatible fusion vector (α, βIG) (Fig. 3A). In the control vector MBP-βIG, no significant increase in fluorescence was observed (Fig. 3A). The data suggested that coexpression of α prevented misfolding of the GFP domain in a significant portion of the expressed βIG fusion protein.

To test whether the modified miniintein played a role, the intein sequence in βIG was deleted and replaced by a short linker to create βG and α-βG. As shown in Fig. 3A, βG alone exhibited a low fluorescence (4% of IG), whereas in the presence of the coexpressed α, a significant increase (≈60-fold) in fluorescence was observed. The extent of the increase in fluorescence without the miniintein (α-βG) was comparable to that of α-βIG, suggesting that the miniintein domain was not required for the solubilizing effect of the α coexpression. The solubility analysis of βG and α-βG by SDS/PAGE was consistent with the fluorescence data (data not shown).

In the β competition experiment (Fig. 3A, α-βIG′ and β,α-βIG′), the expression of free β from a compatible vector significantly decreased the fluorescence of α-βIG′. Although this decrease could be partially explained by the overall decrease in the level of βIG protein expression (Fig. 2C), the solubility analysis indicated that a large portion of the fluorescence decrease reflected a significant decrease of the βIG solubility (Fig. 2C). The data suggested that the competition from free β prevented the expressed βIG from binding to α and led to misfolding of the upstream β domain and consequently the downstream GFP domain in βIG.

The fluorescence differences between different fusion constructs were visually recognizable from colonies grown on plates (not shown), the whole cell culture (wc), and the clarified cell extracts (sup) (Fig. 3B).

Coupled Folding and Binding Between the ACTR and CBP Domains Allowed Soluble Expression of the CBPfIG Fusion Protein.

The CBP domain (CBPf, residues 2,059–2,117) and the ACTR domain (ACTRf, residues 1,018–1,088) have been found to be unstructured when expressed alone but undergo “synergistic folding” to form a tight heterodimeric complex when coexpressed in a bicistronic format (16). CBPf was fused to the N terminus of IG to generate a fusion construct of CBPfIG. When overexpressed at 30°C, the CBPfIG fusion protein was almost exclusively found in the pellet fraction and no soluble fusion protein was detected by Western analysis (Fig. 4A, lanes 1–3). This suggested that the unstructured CBP domain caused aggregation of the fusion protein. In a bicistronic expression construct (Fig. 1A), ACTRf was coexpressed with CBPfIG. Although most CBPfIG was still found in the pellet fraction, a portion of the expressed fusion protein was detected in the soluble fraction (Fig. 4A, lanes 4–6). A second band, exhibiting a slightly smaller Mr than the full-length CBPfIG (57 kDa), was observed in all fractions (Fig. 4A). Because the Ab used in the Western analysis was against the chitin-binding domain in the modified miniintein, this lower Mr species was probably IG (50 kDa), the product of in vivo cleavage.

To test whether the specific interaction between the ACTR domain and CBP domain was responsible for the solubilization effect of the coexpression, amino acid substitutions that were expected to disrupt the intermolecular interaction were made in the ACTR domain. The leucine residues 1,056 and 1,071 of the ACTR domain are conserved in the sequence alignment and form multiple hydrophobic interactions with the residues of the CBP domain in the heterodimeric complex (16). Mutating both leucine residues to arginine is expected to destabilize the hydrophobic interaction between the two domains. To facilitate determination of the protein expression levels, a six-histidine tag was fused to the C termini of both the WT and mutant ACTR domains to generate ACTRfHis6 and mACTRfHis6. The levels of protein expression were estimated either by purifying His-tagged proteins or by Western analysis using anti-His-6 Ab. Coexpression of ACTRfHis6 resulted in soluble expression of a significant amount of CBPfIG (Fig. 4B, lanes 1–3), similar to that of ACTRf (Fig. 4A, lanes 4–6). Coexpression of mACTRfHis6, on the other hand, resulted in almost no detectable CBPfIG in the soluble fraction (Fig. 4B, lane 5). Based on the purification of His-tagged proteins (data not shown) and Western analysis (Fig. 4B, lanes 1–6), similar levels of protein expression were observed for both ACTRfHis6 and mACTRfHis6. To verify the intermolecular interaction between the ACTR and CBP domains, the soluble fractions from the coexpression cultures of ACTRfHis6-CBPfIG and mACTRfHis6-CBPfIG were loaded onto Ni-NTA agarose beads and the bound proteins were then eluted in 250 mM imidazole. As shown in Fig. 4C, CBPfIG was coeluted with ACTRfHis6 (lane 1) but not with mACTRfHis6 (lane 2), suggesting that CBPfIG was bound only to the WT ACTRfHis6 and that the mutations in mACTRfHis6 disrupted the intermolecular interaction.
In summary, the data suggest that coupled folding and binding between the ACTR domain and the CBP domain in CBPfIG prevented the aggregation of CBPfIG and allowed the fusion protein to fold in a soluble form.

Coexpression of the ACTR Domain Resulted in a Significant Increase in the GFP Fluorescence.

The relative fluorescence of various CBPfIG cultures after induction at 30°C was compared (Fig. 5A). Expression of CBPfIG alone resulted in a low fluorescence (Fig. 5A), suggesting that the unstructured CBP domain in CBPfIG caused misfolding of the downstream GFP domain. A significant increase (>10-fold) in fluorescence was observed when ACTRf or ACTRfHis6 was coexpressed (Fig. 5A). Coexpression of mACTRfHis6, however, failed to increase the fluorescence of CBPfIG (Fig. 5A), suggesting that the specific interaction between the ACTR and CBP domains was necessary to allow correct folding of the downstream GFP domain. These data, combined with the solubility analysis (Fig. 4 A and B), suggest that coupled folding and binding could be correlated to soluble protein expression and GFP fluorescence. When CBPfIG was expressed at 20°C, a significant amount of the fusion protein was soluble, resulting in a high fluorescence (Fig. 5B). Consequently, the effect of ACTRf coexpression was not obvious (data not shown).

Discussion

In addition to its wide use as a reporter system for gene expression (17), the GFP protein may also be used to monitor protein solubility and folding (4, 5). The folding pathway of the GFP domain fused to an aggregation-prone protein is of interest. After de novo synthesis of GFP, the nascent chain first folds into a compact cylindrical structure, a process that was estimated to have a t1/2 of 10 min, based on in vitro renaturation experiments (7). This early folding step is followed by a slower process of chromophore maturation (7, 18). The first folding step is required for proper chromophore formation and GFP fluorescence. The loss of fluorescence during inclusion body formation of a GFP fusion protein was not due to the failure of the folded GFP to form the chromophore but because of the failure of GFP to fold into the native structure in the early folding step (7).

In this study, expression of GFP alone (at 20°C) resulted in >90% soluble protein and a high fluorescence (data not shown). Fusion of an aggregation-prone protein β to the N terminus of GFP, however, resulted in >90% of the fusion protein in the pellet and a low fluorescence (Fig. 3A, βG). Aggregation of the GFP domain in βG was apparently caused by aggregation of the upstream β. One possible explanation is that a misfolded β domain interacted with the folding intermediates of the downstream GFP domain, resulting in the accumulation of off-pathway intermediates and subsequent protein aggregation. Because the GFP chromophore is committed to form as soon as GFP folds into its native structure, it is likely that the effect of the β domain on GFP fluorescence is the result of its interference with the early folding step of the GFP domain. It is conceivable that β would not have any effect on the fluorescence if GFP had already folded into its native structure before it had a chance to interact with misfolded β. To test this argument, β and GFP were coexpressed from two separate genes and no significant decrease in fluorescence was observed (data not shown). It seems that the covalent linkage of β and GFP allowed β to interact with GFP before it folded into its native structure and caused GFP to aggregate. The GFP domain does not need to be at the C terminus for this effect to occur. Fusion of GFP to the N terminus of β also resulted in the majority of the fusion protein (Gβ) in the pellet and a loss in fluorescence (data not shown).

During the coexpression of α, misfolding of the β domain was avoided by binding to its biological partner early during protein synthesis. The tight association with α and formation of the native structure might have prevented the β domain from unfavorably interacting with the folding intermediates of the downstream GFP domain. The importance of this coupled folding and binding interaction in preventing the GFP domain from misfolding was further illustrated in the interaction between the ACTR and CBP domains. The unstructured CBP domain caused aggregation of the CBPfIG fusion protein and misfolding of the downstream GFP domain (Figs. 4A and 5A). Coexpression of the ACTR domain resulted in not only soluble expression of the CBPfIG fusion protein but also an increase in fluorescence (Figs. 4 and 5). The effect of the ACTRf coexpression was completely abolished when mutations were introduced in the ACTR domain that disrupted its interaction with the CBP domain (Figs. 4 B and C and 5A). These data suggest that the coupled folding and binding interaction prevented the upstream CBP domain from unfavorably interfering with the GFP folding and allowed expression of the soluble and correctly folded protein.

In most of the experiments conducted in this study, a modified miniintein was inserted between the aggregation-prone protein (β or CBPf) and the GFP domain. The purpose was to facilitate purification of the fusion protein and the heterodimeric complex. Although removing the miniintein had no significant effect on soluble expression and fluorescence during the coexpression, the third miniintein domain in the fusion protein may complicate the GFP-folding pathway as proposed above. However, the fact that coexpression of α resulted in similar levels of increase in fluorescence for both βIG and βG (Fig. 3A) indicates that the possible involvement of the miniintein domain during protein folding resulted in no significant change in the fluorescence. The presence of the modified miniintein allowed isolation of the fusion proteins on chitin beads and elution of the α/β complex after the DTT-induced cleavage. Furthermore, the ability of the miniintein to catalyze the specific cleavage reaction suggested that the interaction between α and the β domain in βIG allowed the correct folding of not only the downstream GFP domain but also the miniintein domain.

This study demonstrated the correlation between GFP fluorescence and coupled protein folding and binding. The coexpression system described here may be useful for studying in vivo protein interactions of unstructured proteins (13) or “natively unfolded” proteins (19). A survey of 91 known “natively unfolded” proteins (or domains) reveals that more than one-third of the proteins (or domains) are <100 residues and more than one-half are <150 residues (19). Both the β protein and the CBPf domain are within a similar Mr range. It is not known whether the system is applicable to larger unstructured proteins. Nevertheless, we believe that the system would work effectively if the following three criteria are met. First, the GFP fusion protein containing the aggregation-prone protein can be expressed at a relatively high level. High protein expression ensures that a sufficient amount of the soluble fusion protein is expressed during the coexpression and a high fluorescence value is obtained. In this study, much higher fluorescence values were obtained during the coexpression of α-βIG than during that of ACTRf-CBPfIG (Figs. 3A and 5A). The differences in fluorescence mostly reflected the expression levels of the GFP fusion proteins rather than the extent of solubilization. Low expression levels due to expression of large eukaryotic proteins (or domains) can sometimes be overcome by optimizing codon usage or by using a stronger promoter. Second, conditions can be found under which the fusion protein is largely insoluble when expressed alone. Low solubility ensures that a low fluorescence value is obtained before the coexpression. For instance, the CBPfIG fusion protein was partially soluble at 20°C but completely insoluble at 30°C. Therefore, a higher induction temperature was used during the coexpression of the ACTR domain. Third, the protein partner binds to the aggregation-prone protein with a high affinity early during protein synthesis and folding; i.e., the binding is coupled to protein folding. It is conceivable that binding occurring after protein folding would not result in an increase in solubility and fluorescence, as the GFP domain has already folded or misfolded. When binding does occur during protein folding, the range of binding affinity necessary for a detectable increase in solubility and fluorescence has not been determined in this study. We speculate that stronger binding would result in a greater increase in solubility and fluorescence. This notion is consistent with the observation that a mutation that weakened the affinity between α and β resulted in a decrease in solubility and fluorescence (data not shown).

By fusing an aggregation-prone target protein to GFP, one may use the fusion construct to search for protein factors that can lead to soluble expression of the fusion protein. This approach may lead to the identification of unique protein–protein interactions, including coupled folding and binding interactions, and of protein partners that can assist the folding of aggregation-prone recombinant proteins.
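
As an illustration only, the readout of such a search could be scored by comparing the fluorescence of the fusion protein expressed alone with its fluorescence during coexpression of each candidate partner. The following is a minimal sketch under stated assumptions: the function name `rank_partners`, the threshold `min_fold`, and all numerical readings are hypothetical and are not data or procedures from this study.

```python
from typing import Dict, List, Tuple


def rank_partners(
    baseline: float,
    coexpressed: Dict[str, float],
    min_fold: float = 3.0,
) -> List[Tuple[str, float]]:
    """Rank candidate partners by fold increase in fluorescence.

    baseline    -- fluorescence of the GFP fusion expressed alone
    coexpressed -- fluorescence of the fusion coexpressed with each candidate
    min_fold    -- hypothetical cutoff; candidates below it are discarded
    """
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    # Fold increase relative to the fusion protein expressed alone.
    folds = [(name, value / baseline) for name, value in coexpressed.items()]
    # Keep only candidates that reach the cutoff, strongest first.
    hits = [(name, fold) for name, fold in folds if fold >= min_fold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical readings in arbitrary units; not measurements from this study.
readings = {"partner_A": 120.0, "partner_B": 11.0, "binding_mutant": 9.0}
hits = rank_partners(baseline=10.0, coexpressed=readings)
print(hits)
```

A low baseline (the second criterion above) and a high expression level (the first criterion) both widen the fold increase, which is why the sketch normalizes to the fusion protein expressed alone rather than comparing absolute fluorescence values.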

Acknowledgments

We thank Drs. Lise Raleigh, Richard Roberts, Eric Cantor, and George Tzertzinis for valuable discussions and reading of the manuscript; Dr. Stephen Harrison (Harvard University) for providing the CBP construct; Dr. Jack Benner for protein sequencing analysis; and Dr. Donald Comb for encouragement. This work is supported by National Institutes of Health Grant GM 57734 and New England Biolabs.

Abbreviations

ACTR, activator for thyroid hormone and retinoid receptor
CBP, cAMP response element binding protein (CREB)-binding protein
IHF, integration host factor
MBP, maltose-binding protein
Ni-NTA, nickel-nitrilotriacetate
sup, clarified cell extracts
CBPf, CBP domain, residues 2,059–2,117
ACTRf, ACTR domain, residues 1,018–1,088

Footnotes

This paper was submitted directly (Track II) to the PNAS office.

References

1. Marston F A. Biochem J. 1986;240:1–12.
2. Schein C H. Bio/Technology. 1989;7:1141–1149.
3. Wetzel R. In: Stability of Protein Pharmaceuticals. Ahern T J, Manning M C, editors. New York: Plenum; 1992. pp. 43–88.
4. Waldo G S, Standish B M, Berendzen J, Terwilliger T C. Nat Biotechnol. 1999;17:691–695.
5. Zhang A, Gonzalez S M, Cantor E J, Chong S. Gene. 2001;275:241–252.
6. Cody C W, Prasher D C, Westler W M, Prendergast F G, Ward W W. Biochemistry. 1993;32:1212–1218.
7. Reid B G, Flynn G C. Biochemistry. 1997;36:6786–6791.
8. Haase-Pettingell C A, King J. J Biol Chem. 1988;263:4977–4983.
9. Wetzel R. Trends Biotechnol. 1994;12:193–198.
10. Crameri A, Whitehorn E A, Tate E, Stemmer W P. Nat Biotechnol. 1996;14:315–319.
11. Nash H A, Robertson C A. J Biol Chem. 1981;256:9246–9253.
12. Nash H A, Robertson C A, Flamm E, Weisberg R A, Miller H I. J Bacteriol. 1987;169:4124–4127.
13. Dyson H J, Wright P E. Curr Opin Struct Biol. 2002;12:54–60.
14. Naray-Szabo G, Nagy P. Enzyme (Basel). 1986;36:44–53.
15. Ptashne M, Gann A. Genes and Signals. Plainview, NY: Cold Spring Harbor Lab. Press; 2002.
16. Demarest S J, Martinez-Yamout M, Chung J, Chen H, Xu W, Dyson H J, Evans R M, Wright P E. Nature. 2002;415:549–553.
17. van Roessel P, Brand A H. Nat Cell Biol. 2002;4:E15–E20.
18. Heim R, Prasher D C, Tsien R Y. Proc Natl Acad Sci USA. 1994;91:12501–12504.
19. Uversky V N, Gillespie J R, Fink A L. Proteins. 2000;41:415–427.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences