Appl Environ Microbiol. Feb 2007; 73(3): 906–912.
Published online Dec 1, 2006. doi:  10.1128/AEM.01804-06
PMCID: PMC1800768

The Presence of N-Terminal Secretion Signal Sequences Leads to Strong Stimulation of the Total Expression Levels of Three Tested Medically Important Proteins during High-Cell-Density Cultivations of Escherichia coli


Genetic optimizations to achieve high-level production of three different proteins of medical importance for humans, granulocyte-macrophage colony-stimulating factor (GM-CSF), interferon alpha 2b (IFN-α2b), and single-chain antibody variable fragment (scFv-phOx), were investigated during high-cell-density cultivations of Escherichia coli. All three proteins were poorly expressed when put under control of the strong Pm/xylS promoter/regulator system, but high volumetric yields of GM-CSF and scFv-phOx (up to 1.7 and 2.3 g/liter, respectively) were achieved when the respective genes were fused to a translocation signal sequence. The choice of signal sequence, pelB, ompA, or synthetic signal sequence CSP, displayed a high and specific impact on the total expression levels for these two proteins. Data obtained by quantitative PCR confirmed relatively high in vivo transcript levels without using a fused signal sequence, suggesting that the signal sequences mainly stimulate translation. IFN-α2b expression remained poor even when fused to a signal sequence, and an alternative IFN-α2b coding sequence that was optimized for effective expression in Escherichia coli was therefore synthesized. The total expression level of this optimized gene remained low, while high-level production (0.6 g/liter) was achieved when the gene was fused to a signal sequence. Together, our results demonstrate a critical role of signal sequences for achieving industrial level expression of three human proteins in E. coli under the conditions tested, and this effect has to our knowledge not previously been systematically investigated.

Human cytokines are proteins that promote immune responses, and they have a broad range of medical uses, such as treatment of microbial and viral infections and vaccination against cancer. In 2004, the global cytokine market was 6.5 billion U.S. dollars, and according to a Research and Markets report, the demand for existing cytokines is expected to grow significantly in the next years (http://www.researchandmarkets.com/reports/314808/). Thus, cytokines are needed in large quantities, and the development of high-cell-density cultivations (HCDC) has led to production at high volumetric yields of such pharmaceutically important proteins in heterologous hosts like Escherichia coli. Despite the many advantages of this organism, high-level heterologous expression is not routinely achieved, typically due to biased codon usage, gene product toxicity, low gene product solubility, mRNA secondary structure formation, and low mRNA stability (15, 26). The broad-host-range plasmid pJB658 harbors the inducible Pm/xylS promoter/regulator elements for recombinant expression of cloned genes in a wide range of gram-negative bacteria (3, 4). We recently modified this plasmid to express high volumetric yields of secreted recombinant single-chain antibody variable fragment (scFv-phOx) during HCDC of E. coli (26). To obtain secretion, we added (at the DNA level) a signal sequence at the N terminus of scFv-phOx, a method that is commonly used if intracellular production is undesired.

The important cytokine granulocyte-macrophage colony-stimulating factor (GM-CSF) is one of four specific glycoproteins that stimulate generation of the white blood cells granulocytes and macrophages (18). Recombinant GM-CSF has been expressed in bacterial, yeast, and mammalian cells and is now produced for clinical uses. This protein has greatly reduced the infection risk associated with bone marrow transplantation (19). GM-CSF produced recombinantly in E. coli ends up in inclusion bodies (IBs) and has certain drawbacks, including complex processing, low specific activity, and poor in vitro renaturation (2). The GM-CSF product obtained as an IB has an added methionine at the N terminus that leads to stimulation of antibody production in the human body, hence influencing the therapeutic value (8, 33). Interferon alpha 2b (IFN-α2b) belongs to the IFN family of cytokines, which can induce antiproliferative, immunomodulatory, and potent antiviral activities against a wide range of mammalian viruses (7). IFN-α2b is used to treat several diseases, including some types of cancer and hepatitis, in particular hepatitis C. Since it can increase the intensity of antigen expression on certain tumors, IFN-α2b has a potential for use in diagnostics and therapeutics (20). As for GM-CSF, IFN-α2b has also been expressed recombinantly into active form in E. coli (28).

In this report, we describe the use of the pJB658-based expression system (26) to produce GM-CSF, IFN-α2b, and scFv-phOx both intracellularly (lacking signal sequences) and through secretion (with signal sequences) during HCDC. Surprisingly, we found that the presence of signal sequences very strongly stimulated not only secretion but also the total production levels of all three of these proteins. Such effects are to our knowledge not commonly referred to in the scientific literature but should be of significant importance in this field of biotechnology.

MATERIALS AND METHODS

Strains, DNA manipulations, and growth conditions.

E. coli strains used in this study were as follows: DH5α (BRL) was used as a cloning host, while RV308 (ATCC 31608) was the standard recombinant production strain used for HCDC of E. coli. Strain BL21(DE3) (Stratagene) is deficient in the OmpT and Lon proteases and was used as an alternative test strain for recombinant protein production. BL21-CodonPlus (DE3)-RIPL (Stratagene) carries argU, ileY, and leuW genes encoding tRNA molecules that recognize rare codon triplets. ATCC 67979 and ATCC 53157 carry plasmids with the IFN-α2b and GM-CSF coding regions, respectively (ATCC). Standard cloning experiments were performed as described elsewhere (23), and recombinant E. coli strains were grown at 37°C in liquid Luria-Bertani (LB) medium or on solid LB agar plates. For small-scale production analyses, cells were grown at 30°C with shaking (225 rpm; orbital movement, 2.5-cm amplitude) in 250-ml baffled shake flasks containing 40 ml HiYe medium, which is composed as follows: Na2HPO4·2H2O, 8.6 g/liter; KH2PO4, 3 g/liter; NH4Cl, 1 g/liter; NaCl, 0.5 g/liter; glucose, 2 g/liter; glycerol, 10 g/liter; yeast extract, 10 g/liter; MgSO4, 2.5 mM; Fe(III)-citrate, 250 μM; H3BO3, 49 μM; MnCl2, 79 μM; EDTA, 23 μM; CuCl2, 9 μM; Na2MoO4, 10 μM; CoCl2, 11 μM; and Zn-acetate, 36 μM. Site-specific mutagenesis was performed by using a QuikChange site-directed mutagenesis kit from Stratagene according to the manufacturer's instructions. Oligonucleotides used in this study are listed in Table 1. The defined preculture and the main culture media were prepared as described previously (26). When appropriate, the media were supplemented with ampicillin (100 μg/ml) and chloramphenicol (10 μg/ml).
The optimized IFN-α2bS gene was designed by GenScript. A 501-bp double-stranded DNA fragment encoding this sequence was synthesized (GenScript) with unique NdeI and NotI restriction sites introduced at the 5′ and 3′ ends, respectively, and was cloned into the corresponding sites of pUC57, yielding pUC57IFN-S.

TABLE 1.
Oligonucleotides used in this study

Vector constructions.

All oligonucleotide primers used hereunder are listed in Table 1, and all expression vectors constructed were verified by DNA sequencing.

pJBphOx-271d.

Two unique restriction sites, SacI and AgeI, were introduced, flanking the xylS coding region of plasmid pJT19bla (31), by site-directed mutagenesis with the mutagenic oligonucleotide pairs sacI-F and sacI-R and ageI-F and ageI-R, respectively. The 1,025-bp SacI-AgeI fragment was subcloned into the corresponding sites of pLITMUS28 (New England Biolabs). The NcoI restriction site in the xylS gene then was specifically mutated in the latter construct with the mutagenic oligonucleotides ncoI-F and ncoI-R, yielding pTA4-NcoI. In parallel, plasmid pJB655cop271 (3) was digested with XmaI-BseRI, and the cohesive ends of the vector fragment were blunted and religated. The unique NdeI site of the trfA gene was then specifically mutated with the mutagenic oligonucleotides ndeI-F and ndeI-R, yielding plasmid pTA20. The 1,171-bp ClaI-SexA1 fragment of pTA20, including the mutated trfA gene, and the 339-bp HpaI-MunI fragment of pTA4-NcoI, including the xylS gene, were isolated and used to substitute for the corresponding fragments of vector pJBphOx-271 (26), yielding the cassette cloning and expression vector pJBphOx-271d.

pGM29.

The coding region of the mature GM-CSF gene was PCR cloned from total DNA isolated from E. coli ATCC 53157 by using primers gm-1F and gm-1R. The resulting DNA fragment was end digested with NdeI and NotI, and the 389-bp fragment was used to substitute for the corresponding scFv-phOx fragment of pJBphOx-271d, yielding pGM29 (expressing GM-CSF without signal sequence).

pGM29pelB.

The coding region of the mature GM-CSF gene was PCR cloned from E. coli ATCC 53157 total DNA with the primers gm-2F and gm2R. The resulting DNA fragment was end digested with NcoI and NotI, and the 390-bp fragment was used to substitute for the corresponding scFv-phOx fragment of plasmid pJBphOx-271d, yielding pGM29pelB (expressing GM-CSF with pelB signal sequence).

pGM29ompA and pGM29CSP.

The DNA fragments with ompA and CSP coding sequences were prepared by annealing of synthetic oligonucleotides as described previously (26) and used to substitute for the corresponding NdeI-NcoI pelB fragment of pGM29pelB, to yield pGM29ompA and pGM29CSP, respectively (expressing GM-CSF with ompA and CSP signal sequences, respectively).

pAT65.

The scFv-phOx-cmyc-his6 coding region of pJBphOx was PCR amplified by using the primers phox-1F and phox-1R and end digested with NdeI-NotI, and the resulting 770-bp fragment was used to substitute for the corresponding fragment of plasmid pGM29pelB, yielding pAT65 (expressing scFv-phOx without signal sequence).

pIFN30pelB.

The IFN-α2b coding region was PCR amplified from DNA isolated from E. coli strain ATCC 67979 by using the primers IFN-1F and IFN-1R and was end digested with NcoI-NotI, and the 504-bp fragment was ligated into the corresponding sites of pGEM-5zf (Promega). From the resulting plasmid, the 504-bp NcoI-NotI insert was isolated and used to substitute for the corresponding region of pGM29pelB, yielding pIFN30pelB (expressing IFN-α2b with pelB signal sequence).

pIFN30.

The IFN-α2b coding region was PCR amplified from DNA isolated from strain ATCC 67979 by using the primers IFN-2F and IFN-1R and was end digested with NdeI-NotI, and the 501-bp fragment was ligated into the corresponding sites of pGEM-5zf. From the resulting plasmid, the 501-bp NdeI-NotI insert was isolated and used to substitute for the corresponding fragment of pGM29pelB, yielding pIFN30 (expressing IFN-α2b without signal sequence).

pIFN30S and pIFN30SpelB.

The 501-bp NdeI-NotI fragment of plasmid pUC57IFN-S (see above) was used to substitute for the corresponding fragment of plasmid pGM29pelB, yielding pIFN30S (expressing IFN-α2bS without signal sequence). The IFN-α2bS insert was PCR amplified from pIFN30S by using primers IFN-3F and IFN-2R and was end digested with NcoI-NotI, and the 504-bp fragment was used to substitute for the corresponding fragment of pGM29pelB, yielding pIFN30SpelB (expressing IFN-α2bS with pelB signal sequence).

Production analyses in shake flask cultures.

Overnight cultures of recombinant cells were diluted into fresh prewarmed HiYe medium (100 ml) to an optical density at 600 nm (OD600) of 1, and cell growth was continued to an OD600 of 3. At this point, the cultures were induced by adding m-toluic acid (0.5 mM) and cell growth was continued for 4 h (OD600 = 10 to 15) before cell harvesting. Preparation of cell samples and product detections were performed as described below.

Production analyses in HCDC.

Preparation of fermentation inoculum and HCDC of recombinant E. coli strains was performed as described previously (26). The cell cultures were induced at an OD600 of 100, and cell growth was continued for 4 to 6 h, typically corresponding to an OD600 of 180 to 200. All quantitative protein data given from the high-yielding HCD fermentations are based on values from at least two independent cultivations.

Preparation of cell samples for protein product analyses by ELISA and Western blotting.

Cell sample preparations of medium (designated as S1 phase), soluble (designated as S2 phase), and pellet (designated as P2 phase) fractions for production analyses, and Western analyses of recombinant IFN-α2b, scFv-phOx, and GM-CSF proteins, were performed as described elsewhere (26). Recombinant soluble GM-CSF was quantified by using a human GM-CSF enzyme-linked immunosorbent assay (ELISA) set kit from BD Biosciences, in accordance with the manufacturer's instructions. Purified scFv-phOx protein served as a standard in all cases.

N-terminal sequencing of recombinant proteins.

Samples (0.2 ml) collected from the soluble fractions (see above) of the cell extracts were purified by using the Ni-nitrilotriacetic acid spin protocol from QIAGEN under denaturing conditions, in accordance with the manufacturer's instructions. About 15 mg of purified protein (judged by using the Bio-Rad method) was then subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis and concomitant Western blotting, as described above. The membrane was stained with Coomassie brilliant blue R250, and the band corresponding to the desired mass was excised and used as a template for the sequencing, which was performed by the Biotechnology Centre of Oslo, Norway (K. Sletten and S. Kjæraas).

Isolation of total RNA, reverse transcription, and quantitative PCR.

The experiments hereunder were performed essentially as described previously (9). Cell samples (0.1 ml) were harvested from the HCDC fermentations at an OD600 of 130 to 160, diluted in fresh medium to an OD600 of 10, and immediately treated with RNA Protect (QIAGEN). Total RNA was isolated by using an RNAqueous kit (Ambion) according to the manufacturer's instructions, and 3 μg of the obtained material was treated with a DNA-free kit (Ambion) and used as template for cDNA synthesis by using a first-strand cDNA synthesis kit (Amersham). The quantitative PCR profile was as follows: segment 1 (1 cycle), 95°C for 10 min; segment 2 (40 cycles), 95°C for 30 s, 55°C for 60 s, and 68°C for 30 s; and segment 3 (1 cycle), 95°C for 60 s, 55°C for 30 s, and slowly up to 95°C (dissociation curve). The primers used are listed in Table 1. We used the iTaq SYBR Green Supermix with Rox (Bio-Rad), and amplification reaction mixtures contained 5 μl of diluted cDNA templates and 3 pmol of each primer in a final volume of 20 μl. The PCR products were detected by monitoring the increase in fluorescence by using an Mx3000P cycler system (Stratagene). Thresholds (CT) were set which intersected the amplification curves in the linear region of the semilogarithmic plots, and inspection of the dissociation curves confirmed negligible levels of primer self-hybridizations. Amplification of the β-lactamase gene present on the expression vector was used for sample normalization. All experiments were performed in three parallels.
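The normalization to the β-lactamase gene described above can be illustrated with a short calculation. The following is a minimal sketch that assumes the comparative CT (2^-ΔΔCT) method and perfect doubling per cycle; the text does not state which quantification formula was used, so both are assumptions, and all CT values and names in the code are hypothetical.

```python
from statistics import mean


def relative_transcript_level(ct_target, ct_ref, ct_target_cal, ct_ref_cal,
                              efficiency=2.0):
    """Fold difference in transcript level relative to a calibrator strain.

    Assumes the comparative CT method, in which target and reference
    amplicons amplify with the same per-cycle efficiency (2.0 = doubling).
    This is an illustrative assumption, not a detail given in the text.
    """
    delta_sample = ct_target - ct_ref              # sample normalized to bla
    delta_calibrator = ct_target_cal - ct_ref_cal  # calibrator normalized to bla
    return efficiency ** -(delta_sample - delta_calibrator)


# Hypothetical CT values from three parallel reactions per strain.
with_signal = {"target": [18.0, 18.1, 18.2], "bla": [15.9, 16.0, 16.1]}
no_signal = {"target": [19.5, 19.6, 19.7], "bla": [15.9, 16.0, 16.1]}

fold = relative_transcript_level(
    mean(with_signal["target"]), mean(with_signal["bla"]),
    mean(no_signal["target"]), mean(no_signal["bla"]),
)
print(f"Transcript level relative to the no-signal strain: {fold:.2f}-fold")
```

Because the reference gene sits on the same expression vector as the target gene, this kind of normalization also corrects for differences in plasmid copy number between samples, at least under the assumptions stated above.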

In silico analysis of mRNA secondary structures.

The deduced transcripts of the gene variants were analyzed for secondary structures by using the program MFOLD (34) (http://www.bioinfo.rpi.edu/applications/mfold/old/rna/). Regions covering about 50 to 60 bp from the Pm transcription start point (22), including the translation initiation codon and its flanking regions, were used for these analyses.

RESULTS

Construction of the cloning and cassette expression vector pJBphOx-271d.

Plasmid pJBphOx-271, used previously to produce scFv-phOx, was modified (generating pJBphOx-271d; Fig. 1) to facilitate the construction of the expression cassettes required for this study. In this new vector, the gene encoding scFv-phOx can easily be substituted by any gene of interest by a one-step cloning procedure, using either the NdeI/NotI sites (no signal sequence) or the NcoI/NotI sites (DNA sequence encoding the pelB signal sequence fused in frame to the 5′-terminal end). The coupled affinity detection sequence c-myc-his6 is fused to the 3′-terminal end of the cloned gene. The complete fusion genes, with or without a secretion signal sequence at the 5′ end, are placed in frame under translational control of the ribosome binding sequence (rbs) region and under transcriptional control of the tightly controlled and inducible Pm/xylS promoter/regulator system (3, 4) of the plasmid. Recombinant genes to be expressed, the type of signal sequence (pelB, ompA, or CSP) used, or the vector copy number (see reference 26) can thus easily be changed by one-step cloning procedures.

FIG. 1.
Physical map of part of the cassette cloning and expression vector pJBphOx-271d. Relevant restriction sites useful for gene cloning with and without a fused signal sequence, as well as for shifting out pelB with ompA and CSP, are indicated (see also Materials ...

GM-CSF and IFN-α2b are not expressed at detectable levels in pJBphOx-271d in the absence of a signal sequence.

In contrast to protein scFv-phOx, GM-CSF and IFN-α2b do not contain disulfide bridges. Consequently, the translocation of these two proteins to the periplasm to obtain correct product folding was not considered necessary, and we therefore chose to express them without the use of a secretion signal sequence. The gene encoding scFv-phOx was therefore substituted with those of GM-CSF and IFN-α2b as NdeI/NotI fragments (Fig. 1), generating plasmids pGM29 and pIFN30, respectively. These plasmids were transformed into the E. coli production host strain RV308, and the resulting transformants were initially subjected to production analyses in shake flask cultures. Western blot analyses of both the soluble (S1 and S2) and the pellet (P2) fractions (see Materials and Methods) of the cells showed that no recombinant product could be detected (below 1 mg/liter) from either of the two strains. To rule out the possibility that the poor result is linked to the RV308 host strain used, we also analyzed the production in the protease-modified E. coli strain BL21(DE3) (see Materials and Methods). No detectable production was observed in this host either, indicating that the low production levels observed are not a strain-specific property.

The production levels of GM-CSF and scFv-phOx, but not of IFN-α2b, are dramatically stimulated by 5′-terminal fusion of the pelB signal sequence.

We previously demonstrated that the choice of signal sequence used, pelB, ompA, or CSP, is important to achieve both effective translocation and high total expression level of scFv-phOx, and the best overall result was obtained with pelB (26). Based on the somewhat surprising results obtained with proteins GM-CSF and IFN-α2b (see above), it was of interest to also test the impact of expressing scFv-phOx protein without the use of a secretion signal sequence. We therefore constructed vector pAT65, which has no signal sequence and has the scFv-phOx coding sequence positioned in frame with the relevant translational start codon of the vector (see Materials and Methods). Production analysis of the S1, S2, and P2 fractions of strain RV308(pAT65) showed that no recombinant product (below 1 mg/liter) could be detected. This result showed that scFv-phOx is dependent on a signal sequence to be effectively expressed under these conditions, suggesting that efficient expression of GM-CSF and IFN-α2b might also be obtained by fusing the corresponding gene sequences to the pelB sequence. Strains (RV308) containing the resulting plasmids pGM29pelB (expressing GM-CSF with pelB) and pIFN30pelB (expressing IFN-α2b with pelB) were therefore subjected to production analysis in shake flask cultures. Interestingly, a substantial level (about 50 mg/liter) of recombinant GM-CSF protein was now produced, while no IFN-α2b production could be detected. Here we show that the poor IFN-α2b production observed is partly due to unfavorable codon usage in the IFN-α2b coding region (see below). Under HCDC conditions, high-level production of GM-CSF was achieved (0.8 g/liter) and about 50% of the detected protein was present as soluble product (Fig. 2 and Table 2). This result therefore clearly indicated that efficient GM-CSF expression is (similar to scFv-phOx) dependent on a fused signal sequence under the conditions tested.

FIG. 2.
Images of sodium dodecyl sulfate-polyacrylamide gel electrophoresis (top panels) and the corresponding Western blots (bottom panels) of insoluble fractions of recombinant RV308 strains producing GM-CSF and IFN-α2b proteins, during HCDC. The type ...
TABLE 2.
Expression data of GM-CSF-producing strains during HCDC

Exchange of pelB with the alternative signal sequences CSP and ompA caused up to twofold-improved GM-CSF production levels.

To analyze the effect of using different signal sequences on the expression level of GM-CSF, the pelB coding region of plasmid pGM29pelB was substituted with the ompA and CSP coding regions to yield constructs pGM29ompA and pGM29CSP, respectively. Production analysis during HCDC with cells harboring these two constructs (Fig. 2 and Table 2) demonstrated that the highest total production level (1.7 g/liter) was obtained by using ompA. Interestingly, the rankings of signal sequences with respect to the highest total expression level achieved for GM-CSF and scFv-phOx are very different (Fig. 3). As observed with scFv-phOx, the fraction that is soluble relative to the total GM-CSF produced remained relatively constant, irrespective of the total expression level (about 50 to 60%).

FIG. 3.
Production levels of scFv-phOx and GM-CSF as a function of different signal sequences, and no signal sequence, during HCDC. The scFv-phOx production data presented here were imported from our previous study (26). Both soluble and insoluble fractions are ...

Unfavorable codon usage partly contributes to the poor expression of IFN-α2b.

The strong stimulatory effects of secretion signal sequences on the expression levels for both scFv-phOx and GM-CSF were not observed with IFN-α2b (see above), and we wanted to identify the reasons for this. First, the respective coding DNA sequences were analyzed and compared (Table 3). All three genes are relatively small (between 378 and 777 nucleotides), and the total numbers of rare codons are also similar (between 10 and 13). We also noticed that the IFN-α2b gene has a low GC content compared to those of both the GM-CSF and scFv-phOx genes, and the possible relevance of this for expression is unknown. To experimentally test the impact of the rare codons on gene expression, we transformed the vectors pIFN30 (no signal sequence) and pIFN30pelB (pelB signal sequence) into the E. coli strain BL21-CodonPlus (DE3)-RIPL, which can effectively translate the low-usage codons AGG/AGA, AUA, and CUA (see Materials and Methods). The resulting recombinant strains were subjected to shake flask production analyses as described above. Interestingly, substantial IFN-α2b production (about 30 mg/liter) was achieved with cells harboring pIFN30pelB, while low production was still obtained (about 5 mg/liter) with cells harboring pIFN30 (Table 4). These results confirm our assumption that the codon usage of IFN-α2b is one major reason for the poor expression obtained for this protein in strain RV308. They also indicate that expression of this gene, as observed for scFv-phOx and GM-CSF genes, is strongly stimulated by the pelB secretion signal sequence under these conditions.
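The kind of sequence comparison described above (gene length, rare codon count, GC content) can be sketched in a few lines. The example below counts only the four low-usage codons named in the text; the full rare-codon definition used for the comparison of the three genes is not given here and may be broader. The input sequence and all function names are hypothetical.

```python
# Low-usage codons named in the text, written in the DNA alphabet:
# AGG/AGA (Arg), AUA -> ATA (Ile), CUA -> CTA (Leu).
RARE_CODONS = ("AGA", "AGG", "ATA", "CTA")


def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def count_rare_codons(cds, rare=RARE_CODONS):
    """Return a dict mapping each rare codon to its number of occurrences."""
    counts = {codon: 0 for codon in rare}
    for codon in split_codons(cds):
        if codon in counts:
            counts[codon] += 1
    return counts


def gc_content(cds):
    """Percentage of G and C nucleotides in the sequence."""
    cds = cds.upper()
    return 100.0 * sum(base in "GC" for base in cds) / len(cds)


# Hypothetical toy coding sequence (not one of the genes in this study).
toy_cds = "ATGAGGCTAAAAATAAGATAA"
rare_counts = count_rare_codons(toy_cds)
print(rare_counts, sum(rare_counts.values()), round(gc_content(toy_cds), 1))
```

Counting in-frame codons, rather than searching for the triplets anywhere in the string, avoids false hits that span two adjacent codons.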

TABLE 3.
Comparison of coding sequences of recombinant genes used in this study
TABLE 4.
Production levels of recombinant IFN-α2b protein obtained in shake flasks by using original and synthetic genes and in different genetic backgrounds

Design and construction of a synthetic IFN-α2b gene, designated IFN-α2bS, useful for high-level IFN-α2b production.

To optimize the coding sequence of IFN-α2b for high expression in our preferred production host strain RV308 (26), the entire gene sequence was redesigned (using computer software; see Materials and Methods) and obtained by complete gene synthesis. The synthetic gene is 80.4% identical to its parental version and has 99 nucleotide substitutions affecting 77 of its codons (46% of all its codons), still maintaining the original sequence of the protein product. The GC content of the synthetic gene is 53.6% compared to 48.4% of the original gene, and it has no rare codons (Table 3). It was of interest to test if this optimized IFN-α2bS gene could be efficiently expressed without a signal sequence in RV308, and it was therefore used to substitute for IFN-α2b in plasmid pIFN30, yielding plasmid pIFN30S. Interestingly, shake flask experiments with RV308(pIFN30S) showed that the IFN-α2b production was still below the detection level (Table 4). Therefore, we constructed the analogous vector pIFN30SpelB containing pelB, and a similar analysis of RV308 harboring this plasmid showed that the IFN-α2b protein was then produced at about 40 mg/liter (Table 4). Strains RV308(pIFN30S) and RV308(pIFN30SpelB) were then analyzed during HCDC, and a total of 0.6 g/liter of IFN-α2b protein was produced with plasmid pIFN30SpelB, while no product was detected with plasmid pIFN30S (Fig. 2 and 3). Interestingly, all recombinant IFN-α2b production was present as IBs. As a control in these HCDC experiments, we also included strain RV308(pIFN30pelB) with the original IFN-α2b gene fused to pelB, and the total production level of this strain was found to be low, as expected (about 0.04 g/liter). Thus, the optimized IFN-α2bS DNA coding sequence could be used to achieve a high production level for IFN-α2b protein in E. coli RV308, but only provided that a signal sequence is fused to the 5′-terminal end of the coding region.
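The figures quoted for the synthetic gene (percent identity, number of substituted nucleotides and codons, GC content, unchanged protein sequence) are the kind of values that a simple pairwise comparison of two equal-length coding sequences yields. The sketch below shows one way to compute them; it is not the software used for the gene design, the two toy sequences are hypothetical, and the standard genetic code is assumed.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(cds):
    """Split an in-frame DNA coding sequence into codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a DNA coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[codon] for codon in codons(cds))


def gc_percent(cds):
    return 100.0 * sum(base in "GC" for base in cds) / len(cds)


def compare_genes(original, redesigned):
    """Summarize differences between two equal-length, in-frame sequences."""
    if len(original) != len(redesigned) or len(original) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    nt_diff = sum(a != b for a, b in zip(original, redesigned))
    codon_diff = sum(a != b for a, b in zip(codons(original), codons(redesigned)))
    return {
        "nt_differences": nt_diff,
        "identity_percent": 100.0 * (len(original) - nt_diff) / len(original),
        "codons_changed": codon_diff,
        "gc_original": gc_percent(original),
        "gc_redesigned": gc_percent(redesigned),
        "same_protein": translate(original) == translate(redesigned),
    }


# Hypothetical toy sequences, both encoding Met-Arg-Leu-Ile-stop.
summary = compare_genes("ATGAGGCTAATATAA", "ATGCGTCTGATCTAA")
print(summary)
```

In the toy pair, the three rare codons AGG, CTA, and ATA are replaced by synonymous codons, so the nucleotide sequence changes while the translated product stays the same, which mirrors the design goal described above on a very small scale.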

Quantitative PCR analyses suggest that both the transcript levels and translation efficiencies are higher when using signal sequences.

To increase the understanding of how the signal sequences stimulate gene expression, we analyzed a selection of our recombinant strains under relevant production conditions by using quantitative real-time PCR. The rankings of GM-CSF transcript levels in the four different GM-CSF-producing strains tested were similar to the corresponding rankings of total protein production levels (Table 2). Interestingly, the transcript level of RV308(pGM29), which has no signal sequence and produces no detectable GM-CSF protein, is only about threefold lower than the transcript levels detected when using pelB or CSP. This result implies that the low GM-CSF expression level of this strain is largely due to inefficient translation and not poor transcription. In bacteria, transcription and translation are tightly coupled in time and space, and for a given mRNA, high translational activity can protect the mRNA from degradation due to ribosome occupancy, which in turn contributes to more translational activity. We therefore believe that our data indicate that signal sequences can increase both the transcript levels and the translation efficiencies of the genes to which they have been fused.

N-terminal sequencing of recombinant proteins confirmed that the signal sequences are cleaved off in vivo.

To make sure that the secretion signal sequences were cleaved off in vivo, the soluble fractions (S1 and S2) of the fermented cultures of recombinant strains RV308(pGM29ompA) and RV308(pGM29CSP) were subjected to Western analysis and the desired proteins were excised and subjected to N-terminal sequencing. The results of these experiments confirmed that the recombinant products in both samples are equivalent to the primary sequence of the GM-CSF mature protein (data not shown), implying that the signal sequences and the initial Met residue were correctly cleaved off.

DISCUSSION

Secretion signal sequences are generally used in recombinant gene expression for the purpose of achieving translocation of the protein of interest. It is well-known that there exists no general rule guiding the choice of signal sequence that will maximize the level of translocation of any particular protein, and a trial-and-error type of approach is therefore commonly used (6). Here we demonstrate an unexpected and important role of signal sequences in strongly stimulating the levels of expression for three different proteins of human origin during HCDC in E. coli. By selecting the appropriate signal sequences, we achieved total product yields of up to 2.3 g/liter (Fig. 3), which is sufficiently high to be commercially interesting. In gram-positive Lactococcus lactis, it has been shown that protein secretion can be an effective way to increase the overall expression level of several heterologous proteins (14). It has been suggested that this effect is due to lack of proteolysis, but this hypothesis is not yet well confirmed experimentally. We have previously demonstrated that the parental vector pJB658 can be used for high-level expression of different bacterial proteins without the use of any signal sequences (3, 4, 31). Plasmid pJBphOx-271d and its derivatives constructed in this study have retained the region covering Pm and rbs of pJB658 unmodified, so the observations made here do not seem to be related to the vector system as such. Although more GM-CSF transcript is present when the gene is fused to a signal sequence, the increase is not sufficient to explain the vast difference in the final protein product produced. It therefore seems likely that the proteins are either very inefficiently translated in the absence of a signal sequence or they are immediately degraded. Proteolytic degradation cannot be completely excluded, but tests in a strain commonly used to reduce such problems did not give any indication in support of degradation of the protein product as a major reason for the low production levels.

It has been documented that secondary structures in the rbs of mRNA can lower translation initiation efficiency and that high expression levels can be transferred from an N-terminal fusion partner to a poorly expressing partner as a result of mRNA stabilization (10, 11, 12, 27, 32). Sequence alignments show that pelB, ompA, and CSP are only 22% and 16% identical at the mRNA and amino acid sequence levels, respectively (Fig. 4). We analyzed the 5′ ends of the different mRNAs, including the initiation codon and its flanking regions, by using the MFOLD software (34). Interestingly, the folding energies of all three genes tested are significantly reduced when fused to a signal sequence (dG values between −1 and −2 kcal) compared to the energy without a signal sequence (dG values between −4 and −10 kcal). As such, these data could possibly partly explain the poor translation of these mRNA molecules when not fused to a signal sequence. However, the calculated folding energy of the optimized IFN-α2bS mRNA 5′ end is also low (dG values between +1 and −4 kcal), and the latter result cannot explain why this gene is poorly expressed without being fused to pelB. Gene expression is also controlled by the degradation of mRNA, and in bacteria, transcription and translation are coupled processes (27). Possibly, the up to eightfold higher transcript levels accompanying high GM-CSF protein levels were due to a reduced mRNA degradation caused by high translation initiation efficiency and ribosomal protection from mRNA degradation, and not so much due to increased GM-CSF transcription rates.

FIG. 4.
Sequence alignments of the signal sequences pelB, ompA, and CSP. A: mRNA sequences (5′ to 3′). B: Primary (amino acid) sequences. Identical nucleotides (A) and amino acid residues (B) are indicated with asterisks.
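
The asterisk convention and the percent identity values quoted for this alignment can be illustrated with a minimal Python sketch. It assumes that a column counts as identical only when every aligned sequence carries the same non-gap character, and that identity is expressed relative to the full alignment length; the original analysis may have used a different definition. The function names and the three toy sequences are hypothetical and are not the pelB, ompA, or CSP sequences.

```python
def identity_marks(aligned):
    """Return '*' for each column where all sequences share the same
    non-gap character, and ' ' otherwise (assumed convention)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "*" if col[0] != "-" and all(c == col[0] for c in col) else " "
        for col in zip(*aligned)
    )


def percent_identity(aligned):
    """Percentage of alignment columns that are identical in all sequences."""
    marks = identity_marks(aligned)
    return 100.0 * marks.count("*") / len(marks)


# Hypothetical toy alignment ('-' marks a gap); not the real signal sequences.
toy_alignment = [
    "ATGAAATAC-CTG",
    "ATGAAAAAGACAG",
    "ATGAAAC-GACTG",
]

marks = identity_marks(toy_alignment)
identity = percent_identity(toy_alignment)

if __name__ == "__main__":
    for seq in toy_alignment:
        print(seq)
    print(marks)
    print(f"{identity:.1f}% identical columns")
```

The same functions apply unchanged to aligned amino acid sequences, since only column-wise character equality is used.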

In contrast to what was observed for scFv-phOx and GM-CSF, fusion of a signal sequence to the IFN-α2b gene alone was not sufficient to achieve high-level expression. It has been previously reported that the expression of this particular gene may be hampered by its many rare codons as well as by secondary structure formation (1, 30, 32). This was confirmed here by the studies with the strain BL21-CodonPlus (DE3)-RIPL, specially designed to deal with such problems, as well as by the use of a redesigned synthetic version of the gene. Interestingly, a signal sequence fusion was still needed in both cases in order to achieve high-level expression of the gene encoding IFN-α2b. It can therefore be concluded that all three tested genes responded similarly to the signal sequence, in spite of being unrelated in terms of their primary sequences.

The three signal sequences ompA (E. coli origin), pelB (Erwinia carotovora origin), and CSP (synthetic) target their fusion partners for translocation by the Sec translocase pathway (16). This is a posttranslational pathway in which the polypeptide is translocated in an unfolded state, and it does not depend on the signal recognition particle (13, 17). CSP was designed by us based on its amino acid sequence (26), and it was not optimized with regard to formation of mRNA secondary structures. Alignments showed that these three signal sequences share low sequence identity at both the DNA and the amino acid sequence levels (Fig. 4). Thus, general sequence similarities in the signal sequences do not seem to explain the observed effects on expression either. Certain point mutations in the pelB coding sequence have been shown to affect expression levels of recombinant genes, presumably by affecting the mRNA secondary structures (5). It has been reported that codons immediately downstream of the translation initiation codon can have strong effects on translation initiation efficiency in E. coli (21, 25). We noticed that the signal sequences pelB, ompA, and CSP all possess the AAA triplet in position +2. This particular codon in position +2 of the β-galactosidase coding region can cause high-level expression, but this positive effect was highly sensitive to sequence alterations in the upstream rbs region (29). Moreover, triplet AAA in position +2 was not optimal for high-level expression of the enterotoxin II protein (25), suggesting that a universal role for this codon in that particular location is questionable.
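
The check for a shared +2 codon can be sketched in a few lines of Python. The sketch assumes that codon positions are counted from the initiation codon as +1, so that +2 is the triplet immediately downstream of ATG. The helper name `codon_at` and the three toy coding sequences are hypothetical and serve only as an illustration; they are not the pelB, ompA, or CSP sequences.

```python
def codon_at(cds, position):
    """Return the codon at a 1-based codon position, where position 1
    is the initiation codon (assumed numbering convention)."""
    start = 3 * (position - 1)
    codon = cds[start:start + 3].upper()
    if len(codon) != 3:
        raise ValueError("coding sequence too short for requested codon")
    return codon


# Hypothetical 5' ends of coding sequences, for illustration only.
toy_signals = {
    "toy_signal_A": "ATGAAATACCTG",
    "toy_signal_B": "ATGAAAAAGACA",
    "toy_signal_C": "ATGGCTAAACTG",
}

# Collect the +2 codon of each sequence and flag those that carry AAA.
plus2 = {name: codon_at(seq, 2) for name, seq in toy_signals.items()}
has_aaa = {name: codon == "AAA" for name, codon in plus2.items()}

if __name__ == "__main__":
    for name, codon in plus2.items():
        print(name, codon, "AAA at +2" if has_aaa[name] else "other codon at +2")
```

A shared +2 codon identified this way says nothing by itself about expression, consistent with the context dependence reported in references 25 and 29.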

A very different hypothesis is that the translocation process itself affects the expression rate of the corresponding protein. The Sec translocation apparatus consists of multiple proteins, and the translocation and translation processes are closely coupled (for a review, see reference 17). The targeting of preproteins is governed by the signal sequence, and the preprotein is translocated in an unfolded state. It could therefore be hypothesized that this translocation process somehow contributes to a higher translation rate, but to our knowledge, no experimental support for such a hypothesis has been reported.


This work was financed by Alpharma AS and the Norwegian Research Council.


▿ Published ahead of print on 1 December 2006.


1. Alexandrova, R., M. Eweida, F. Georges, M. B. Dragulev, and I. Ivanov. 1995. Domains in human interferon alpha-1 gene containing tandems of arginine codons AGG play the role of translational initiators in E. coli. Int. J. Biochem. Cell Biol. 27:469-473.
2. Belew, M., Y. Zhou, S. Wang, L. E. Nystöm, and J. C. Janson. 1994. Purification of recombinant human granulocyte-macrophage colony-stimulating factor from the inclusion bodies produced by transformed Escherichia coli cells. J. Chromatogr. 679:67-83.
3. Blatny, J. M., T. Brautaset, H. C. Winther-Larsen, P. Karunararan, and S. Valla. 1997. Improved broad-host range RK2 vectors useful for high and low regulated gene expression levels in gram-negative bacteria. Plasmid 38:35-51.
4. Brautaset, T., S. B. Petersen, and S. Valla. 2000. In vitro determined kinetic properties of mutant phosphoglucomutases and their effect on sugar catabolism in Escherichia coli. Metab. Eng. 2:104-114.
5. Calvez, H. L., J. M. Green, and D. Baty. 1996. Increased efficiency of alkaline phosphatase production levels in Escherichia coli using a degenerate PelB signal sequence. Gene 170:51-55.
6. Choi, J. H., and S. Y. Lee. 2004. Secretory and extracellular production of recombinant proteins using Escherichia coli. Appl. Microbiol. Biotechnol. 64:625-635.
7. Emanuel, S. L., and S. Petska. 1993. Human interferon-αA, -α2, and -α2 (Arg) genes in genomic DNA. J. Biol. Chem. 17:12565-12569.
8. Greenberg, R., D. Lundell, and Y. Alroy. 1988. Expression of biologically active, mature human granulocyte-macrophage colony-stimulating factor with an E. coli secretory expression system. Curr. Microbiol. 17:321-322.
9. Jakobsen, Ø. M., A. Benichou, M. C. Flickinger, S. Valla, T. E. Ellingsen, and T. Brautaset. 2006. Upregulated transcription of plasmid and chromosomal RuMP pathway genes is critical for methanol assimilation and methanol tolerance level in Bacillus methanolicus. J. Bacteriol. 188:3063-3072.
10. Kane, J. F. 1995. Effects of rare codon clusters on high-level expression of heterologous proteins in Escherichia coli. Curr. Opin. Biotechnol. 6:494-500.
11. Kozak, M. 2005. Regulation of translation via mRNA secondary structure in prokaryotes and eukaryotes. Gene 361:13-37.
12. Laursen, B. S., H. P. Sorensen, and K. Mortensen. 2005. Initiation of protein synthesis in bacteria. Microbiol. Mol. Biol. Rev. 69:101-123.
13. Lee, H. C., and H. D. Bernstein. 2001. The targeting pathway of Escherichia coli presecretory and integral membrane proteins is specified by the hydrophobicity of the targeting signal. Proc. Natl. Acad. Sci. USA 98:3471-3476.
14. Le Loir, Y., V. Azevedo, S. C. Oliveira, D. A. Freitas, A. Miyoshi, L. G. Bermúdez-Humarán, S. Nouaille, L. A. Ribeiro, S. Leclercq, J. E. Gabriel, V. D. Guimaraes, M. N. Oliveira, C. Charlier, M. Gautier, and P. Langella. 2005. Protein secretion in Lactococcus lactis: an efficient way to increase the overall heterologous protein production. Microb. Cell Fact. 4:2.
15. Makrides, S. C. 1996. Strategies for achieving high-level expression of genes in Escherichia coli. Microbiol. Rev. 60:512-538.
16. Manting, E. H., and A. J. Driessen. 2000. Escherichia coli translocase: the unravelling of a molecular machine. Mol. Microbiol. 37:226-248.
17. Mergulhão, F. J. M., D. K. Summers, and G. A. Monteiro. 2005. Recombinant protein secretion in Escherichia coli. Biotechnol. Adv. 23:177-202.
18. Metcalf, D. 1985. The granulocyte-macrophage colony-stimulating factors. Science 229:16-22.
19. Metcalf, D. 1991. Control of granulocytes and macrophages: molecular, cellular, and clinical aspects. Science 254:529-533.
20. Neves, F. O., P. L. Ho, I. Raw, C. A. Pereira, C. Moreira, and A. L. T. O. Nascimento. 2004. Overexpression of a synthetic gene encoding human alpha interferon in Escherichia coli. Prot. Expr. Purif. 35:353-359.
21. Puri, N., K. B. Appa Rao, S. Menon, A. K. Panda, G. Tiwari, L. C. Garg, and S. M. Totey. 1999. Effect of the codon following the ATG start site on the expression of bovine growth hormone in Escherichia coli. Prot. Expr. Purif. 17:215-223.
22. Ramos, J. L., N. Mermod, and K. N. Timmis. 1987. Regulatory circuits controlling transcription of TOL plasmid operon encoding meta-cleavage pathway for degradation of alkylbenzoates by Pseudomonas. Mol. Microbiol. 1:293-300.
23. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
24. Schmidt, F. R. 2004. Recombinant expression systems in the pharmaceutical industry. Appl. Microbiol. Biotechnol. 65:363-372.
25. Simmons, L. C., and D. G. Yansura. 1996. Translational level is a critical factor for the secretion of heterologous proteins in Escherichia coli. Nat. Biotechnol. 14:629-634.
26. Sletta, H., A. Nedal, T. E. V. Aune, H. Hellebust, S. Hakvåg, R. Aune, T. E. Ellingsen, S. Valla, and T. Brautaset. 2004. Broad-host-range plasmid pJB658 can be used for industrial-level production of a secreted host-toxic single-chain antibody fragment in Escherichia coli. Appl. Environ. Microbiol. 70:7033-7039.
27. Sørensen, H. P., and K. K. Mortensen. 2005. Advanced genetic strategies for recombinant protein expression in Escherichia coli. J. Biotechnol. 115:113-128.
28. Srivastava, P., P. Bhattacharaya, G. Pandey, and K. J. Mukherjee. 2005. Overexpression and purification of recombinant human interferon alpha2b in Escherichia coli. Prot. Expr. Purif. 41:313-322.
29. Stenström, C. M., E. Holmgren, and L. A. Isaksson. 2001. Cooperative effects by the initiation codon and its flanking regions on translation initiation. Gene 273:259-265.
30. Valente, C. A., D. M. F. Prazeres, J. M. S. Cabral, and G. A. Monteiro. 2004. Translational features of human alpha 2b interferon production in Escherichia coli. Appl. Environ. Microbiol. 70:5033-5036.
31. Winther-Larsen, H. C., K. D. Josefson, T. Brautaset, and S. Valla. 2000. Parameters affecting gene expression from the Pm promoter in gram-negative bacteria. Metab. Eng. 2:79-91.
32. Wu, X., H. Jõrnvall, K. D. Berndt, and U. Oppermann. 2004. Codon optimization reveals critical factors for high level expression of two rare codon genes in Escherichia coli: RNA stability and secondary structure but not tRNA abundance. Biochem. Biophys. Res. Commun. 313:89-96.
33. Zhang, X. W., T. Sun, D. X. Gu, and Z. Q. Tang. 1999. Production of granulocyte-macrophage colony-stimulating factor (GM-CSF) by high cell density fermentation of secretory recombination Escherichia coli. Process Biochem. 34:55-58.
34. Zuker, M. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31:3406-3415.

Articles from Applied and Environmental Microbiology are provided here courtesy of American Society for Microbiology (ASM)