NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Riddle DL, Blumenthal T, Meyer BJ, et al., editors. C. elegans II. 2nd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 1997.

Section IV. Analysis of C. elegans Promoters

Expression patterns of several dozen C. elegans genes have now been investigated by fusing the gene sequences to a lacZ reporter construct, introducing the gene fusion into worms, and then staining for β-galactosidase activity. This approach has been greatly facilitated by a convenient series of modular vectors provided by Fire et al. (1990). Possible limitations of DNA-mediated transformation as a method to study gene control are discussed below in Summary and Conclusions (see also Mello and Fire 1995), but, by and large, this approach has worked well. The advent of new reporter genes such as GFP or “green fluorescent protein” (Chalfie et al. 1994) promises to make this approach even more useful in the future. In the present section, we consider only those studies in which promoters have been analyzed in sufficient detail to identify cis-elements in the DNA that could control transcription patterns.

A. The Vitellogenin Genes

The C. elegans vitellogenins (or yolk proteins) are encoded by a family of six genes that provide an excellent experimental system in which to investigate multiple interlocking transcriptional controls. Vitellogenin genes are expressed in a manner that is sex-specific (only in hermaphrodites), stage-specific (only in late L4 and adults), and tissue-specific (only in the intestine) (Kimble and Sharrock 1983; Sharrock et al. 1983; Blumenthal et al. 1984). Such properties, combined with the genes' massive rates of RNA production, made control of the vitellogenin genes among the first to be investigated biochemically in C. elegans.

The initial study (Spieth et al. 1988) produced transgenic nematodes by the low-copy-number integration scheme developed by Fire (1986). A fusion between the vit-2 and vit-6 genes was designed as a reporter gene, whose expression could be detected immunologically and by nuclease protection. The initial construct included 3.9 kb of vit-2 upstream sequence and 0.6 kb of vit-6 downstream sequence and clearly showed the main elements of correct regulation: Upon transformation into wild-type worms, expression of the vit-2 / vit-6 reporter was detected only in the intestine of adult hermaphrodites. The number of integrated genes was low (in the range of 1–10 copies/genome), expression levels approximately depended on the copy number, and RNA from the transforming gene was of the expected size. Although there were examples of rearranged and mutated genes, presumably introduced by the genomic integration events, the overall conclusion was that regulation was correct. Indeed, correct sex/stage/tissue regulation could be achieved with only 247 base pairs of 5′-flanking region upstream of the transcriptional start site (MacMorris et al. 1992).

Two highly conserved sequence motifs (named vitellogenin promoter elements or VPEs) were identified by comparing the sequences of the different genes in the C. elegans vitellogenin family and by comparing the C. elegans genes with the homologous vitellogenins from Caenorhabditis briggsae (Blumenthal et al. 1984; Spieth et al. 1985b, 1991a; Zucker-Aprison and Blumenthal 1989). VPE1 has the sequence TGTCAAT, and VPE2 has the sequence CTGATAA, a subset of the WGATAR sequences that are involved in gene regulation during vertebrate erythropoiesis (Weiss and Orkin 1995). MacMorris et al. (1992) focused on the importance of certain of these elements in controlling the fused vit-2/vit-6 reporter gene. The primary conclusion was that alteration of these elements could change the levels of reporter gene expression, but no alterations were identified that changed the sex/stage/tissue specificity of expression (see also MacMorris and Blumenthal 1993). Ablation of the VPE1 site closest to the TATAA box (–45 bp) inactivated the gene, but ablation of other VPE1 sites farther upstream had little effect. Ablation of one conserved VPE2 site (at –150) or of an overlapping VPE1-VPE2 site also inactivated the promoter. Independently produced transgenic strains could show a substantial quantitative variability in expression level, suggesting that the precise site of chromosome integration could influence expression level of the transgene.
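
The relationship between VPE2 and the WGATAR consensus can be checked mechanically. The following Python sketch is illustrative only and is not taken from any of the studies cited: the helper names (consensus_to_regex, reverse_complement, find_sites) are hypothetical, and W and R are assumed to follow the standard IUPAC nucleotide codes (W = A or T, R = A or G).

```python
import re

# Subset of the IUPAC nucleotide codes needed here (assumed standard meanings).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus such as WGATAR into a regex string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, consensus):
    """Return sorted (position, strand, match) tuples for a consensus.

    Positions are 0-based and always refer to the leftmost base of the
    site on the forward strand, whichever strand carries the match.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    site_len = len(consensus)
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset to a forward-strand coordinate.
        hits.append((len(seq) - m.start() - site_len, "-", m.group(1)))
    return sorted(hits)


VPE1 = "TGTCAAT"
VPE2 = "CTGATAA"

print(find_sites(VPE2, "WGATAR"))  # VPE2 carries a TGATAA core
print(find_sites(VPE1, "WGATAR"))  # VPE1 has no WGATAR match
```

Under these assumptions, VPE2 contains a single forward-strand WGATAR match (TGATAA, starting at its second base), whereas VPE1 contains none on either strand.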

A further study from the same group (MacMorris et al. 1994) retained the vit-2 / vit-6 reporter (with both wild-type and mutated 247-bp promoters) but switched to a transformation procedure that produces multicopy extrachromosomal arrays (Mello et al. 1991). Copy numbers ranged up to several hundred copies per haploid genome but expression levels showed only weak correlation with copy number. However, as assayed by the relative expression levels obtained with each promoter-mutated construct, the two transformation techniques (i.e., producing a small number of integrated copies or a large number of nonintegrated copies) did indeed lead to the same general conclusions: Mutation of some (but not all) of the conserved VPE1 and VPE2 sites greatly decreased the level of vitellogenin expression. In addition, single mutations that by themselves had little effect on expression could be combined to produce a drastic reduction in reporter gene activity.

Clearly, much remains to be done before vitellogenin control is understood. The VPE1 and VPE2 sequences are obvious candidate binding sites for transcription factors that confer specificity to vitellogenin expression. However, no such factors have yet been defined nor have the isolated elements (multimerized if necessary) been shown to be capable of directing correctly regulated expression of a “neutral” promoter. In principle, the elements could be general factor-binding sites, perhaps contributing to the massive levels of vitellogenin gene expression. The true sex/stage/tissue controlling sites may lie elsewhere, perhaps interspersed between the VPE1 and VPE2 sequences.

B. The unc-54 Gene

The unc-54 gene encodes the major myosin heavy chain expressed in the body-wall muscle and has long been the focus of genetic and biochemical analyses (MacLeod et al. 1981; Moerman et al. 1982; Miller et al. 1983; Anderson and Brenner 1984). unc-54 is also expressed in muscles of the vulva, intestine, and somatic gonad but is not expressed in muscles of the pharynx (Ardizzi and Epstein 1987; see also Moerman and Fire, this volume).

Fire and Waterston (1989) introduced cosmid DNA containing the unc-54 locus into unc-54 mutant worms, using the protocol (Fire 1986) that yields a low number of integrated gene copies. The general conclusion was that the exogenous genes could rescue wild-type movement and egg-laying ability. Monoclonal antibodies to the product of the unc-54 gene were used to demonstrate that unc-54 expression was correct, appearing in body-wall muscle cells but not in pharyngeal muscle cells. By comparing expression levels with those obtained in previous gene dosage experiments, the authors were able to conclude that the expression level of each copy of the transforming gene was within a factor of two of the expression level of the endogenous chromosomal gene (Fire and Waterston 1989).

The transcription initiation site of the unc-54 gene has been mapped approximately 80 base pairs upstream of the ATG translation initiation codon (Dibb et al. 1989; Okkema et al. 1993). There is no obvious TATAA element 25–30 base pairs farther upstream, but a polypyrimidine-polypurine sequence found nearby is also found in a comparable position in other myosin heavy-chain genes. Okkema et al. (1993) continued the transgenic analysis of the unc-54 promoter, using transformation by multicopy extrachromosomal arrays. Gene expression was most frequently assayed in the first-generation progeny of the injected worms. In such an F1 assay, expression of the transgene is mosaic, but expression patterns can still be deduced by investigating large numbers of individual animals. Two different reporter genes were used. The first reporter was the intact unc-54 gene itself; correct expression could be assayed by rescue of the paralysis and egg-laying defects of unc-54 mutants and by using unc-54-specific antibodies. The second reporter consisted of various lacZ vectors fused to unc-54 or to pieces of unc-54 placed upstream of a “neutral” promoter. A series of unidirectional 5′ deletions was constructed and introduced into worms. It was found that deletions extending to within six base pairs of the translation initiation codon (and thus removing the normal site of transcription initiation) were still able to confer phenotypic rescue; unc-54 expression now derived from multiple start sites located in the newly juxtaposed vector. (Control experiments showed that muscle specificity did not derive from the sequences in the vector.) One of the models suggested by the authors was that an “initiator element” within the body of the gene directed transcription to begin a measured number of base pairs upstream. A second model proposed the existence of a mechanism to eliminate transcripts that cannot be properly processed (Okkema et al. 1993).

Using both deletions of the unc-54 gene and deletions of an unc-54::lacZ fusion construct, Okkema et al. (1993) were able to conclude that unc-54 expression in the body-wall muscles could be directed by either of two tissue-specific control elements: The first such element is situated within 200 base pairs of the beginning of the unc-54 gene, and the second element behaves as an enhancer and is situated within the third intron. These two elements appear to be redundant, since either by itself directs body-wall muscle expression. The possibility remained that additional upstream elements control expression level but not tissue specificity.

The body-wall muscle enhancer in the third intron of the unc-54 gene was investigated in more detail by Jantsch-Plunger and Fire (1994). As in the previous study, expression of various lacZ fusion constructs was assayed in the F1 generation produced by injected worms. The basic strategy was to place mutated (or multimerized) third intron enhancers upstream of a minimal myo-2 promoter (potentially pharyngeal-specific but inactive by itself) or upstream of the postulated non-tissue-specific promoter of the glp-1 gene; appropriate enhancer-promoter pairs were then fused to lacZ reporters. The third intron enhancer was found to lie within a 90-base-pair region, which was then extensively mutated. The general result was that the unc-54 intron enhancer consists of at least four distinct subelements; not only the sequence of these subelements, but also the spacing between subelements are important for expression. Gene expression destroyed by mutations in a particular element could sometimes be restored by duplication of the mutated site. Some individual elements could act by themselves if multimerized, but other elements could not. Moreover, there was clear interplay between the elements in the unc-54 enhancer and the particular promoter that was being used. In general, the myo-2 promoter gave higher levels of expression and more clear-cut results than did the glp-1 promoter. None of the identified sites have yet been matched with the binding site of a transcription factor. Although it might have been expected that the C. elegans MyoD homolog interacts with unc-54 , the authors point out that the unc-54 gene is still expressed in body-wall muscle in an hlh-1 mutant.

C. The myo-3 and myo-2 Genes

The second major myosin heavy chain expressed in the body-wall muscle cells is encoded by the myo-3 gene. Less is known about the myo-3 promoter than about the unc-54 promoter, but several observations are pertinent. Low-copy-number integrants of the intact myo-3 gene were able to rescue the lethality of a myo-3 mutation and to recreate the wild-type staining pattern seen with an anti-MYO-3 monoclonal antibody (Fire and Waterston 1989). As with the unc-54 gene, the reintroduced copies of the myo-3 gene were estimated to be expressed at approximately the same level as the endogenous genes.

Subsequent studies assayed the ability of fragments of the myo-3 gene to direct expression of myo-2 promoter::lacZ fusion constructs into body-wall muscle cells (Okkema et al. 1993). Three separable elements with body-wall muscle enhancer activity were detected: Two elements lie within 2.5 kb upstream of the myo-3 ATG ( myo-3 is trans-spliced), and the third element lies within the first myo-3 intron.

The myo-2 gene encodes a myosin heavy chain expressed exclusively in muscle cells of the pharynx, not in muscle cells of the body wall. The 5′ end of the primary transcript has been mapped (Okkema et al. 1993) and, as with the unc-54 and myo-1 genes, no appropriately spaced TATAA element can be identified. Using transgenic lacZ fusion constructs, the major pharyngeal muscle enhancer of the myo-2 gene was located several hundred base pairs upstream of the transcription initiation region. More detailed analysis (Okkema and Fire 1994) divided the myo-2 enhancer into three overlapping fragments, each separately inactive but active when combined in pairs or when duplicated. Whereas the intact enhancer directed lacZ expression in all eight classes of pharyngeal muscle cells, enhancer subelements (when duplicated) showed selectivity for expression in only some of these cells. None of the enhancer elements showed strict lineage specificity. Furthermore, one such element appeared to lose strict muscle specificity when studied in isolation; however, the element appeared to retain specificity for cells of the pharynx, as if it were organ-specific rather than cell-type-specific.

The overall picture of the myo-2 enhancer is a collection of elements with overlapping specificity, the ultimate expression pattern arising from combinatorial interactions between the various elements. Okkema and Fire (1994) suggest that the complex interactions between different elements in such an enhancer might be necessary to coordinate the separate differentiation programs of muscle cells and other cells within the developing pharynx. One particular enhancer subelement was used to screen an expression library, thereby isolating the candidate controlling gene ceh-22 , an NK class homeobox transcription factor described above (Okkema and Fire 1994).

D. The col-19 Gene

The col-19 gene encodes a collagen protein incorporated into the cuticle of the adult; expression of col-19 is regulated by the heterochronic pathway (Ambros and Horvitz 1984; Cox and Hirsh 1985). Liu et al. (1995) showed that 2.7 kb of col-19 upstream sequence could direct expression of a lacZ reporter construct in the spatial and temporal pattern expected, namely, in hypodermal cells beginning at the L4 to adult molt. When introduced into heterochronic mutants in which the production of the adult cuticle is either “precocious” or “retarded,” expression of the col-19 ::lacZ fusion was also precocious or retarded, suggesting that the heterochronic pathway controls the downstream “effector” genes at the level of transcription. Adult-specific expression could be directed by an approximately 150-bp region located several hundred base pairs upstream of the col-19 gene, with the possibility that separate elements control expression specificity and expression level. DNA fragments from within this region were shown to bind to recombinant LIN-29 protein (Rougvie and Ambros 1995). From genetic evidence, lin-29 is an obvious candidate for controlling downstream genes in the heterochronic pathway (see Ambros, this volume), and these biochemical experiments suggest that the control might actually be direct. As described above, the lin-29 gene encodes a zinc finger protein.

E. The hlh-1 Gene

As noted above, the hlh-1 gene encodes the C. elegans homolog of MyoD, a member of the class of helix-loop-helix transcription factors that have a central role in muscle development in vertebrates (for review, see Olson 1990; Weintraub 1993). The role of hlh-1 in muscle biology is considered by Moerman and Fire (this volume).

To understand how the hlh-1 gene is controlled, the promoter has been analyzed in considerable detail by Krause et al. (1994). A construct containing 3.1 kb of hlh-1 upstream sequence and 2 kb of gene sequence (including a large first intron) fused to a lacZ reporter gene is expressed in transgenic worms in a pattern that accurately recapitulates the distribution of the endogenous HLH-1 protein. Specifically, the reporter construct is expressed in mature body-wall muscles and their clonal precursors, in a set of six glial-like cells, and transiently in the granddaughters of the MS blastomere. A series of both unidirectional and internal deletions in the 5-kb region surrounding the hlh-1 initiation codon identified a number of control elements that influence different aspects of the overall expression pattern. For example, elements that direct expression in embryonic muscle precursor cells could be distinguished from elements that function in mature body-wall muscles. The control elements associated with embryonic muscles did not seem to have exclusive control over expression in a particular lineage but rather only “favored” expression: Some elements favored expression in the C over the MS + D lineages, and others favored the reverse. A common “core” element, lying between base pairs –551 and –435 relative to the hlh-1 ATG codon, appeared to be necessary for hlh-1::lacZ expression in all aspects of the pattern, both adult and embryonic. Two different elements were identified that controlled the transient expression in the MS granddaughter cells; a further distinct element in the first intron influenced expression in the glial-like cells. The suspected enhancer sequences were introduced upstream of a neutral promoter, either that of the myo-2 gene or that of the glp-1 gene. By and large, this latter assay supported the conclusions arrived at by the unidirectional deletions.

To aid in the precise identification of control sequences, the hlh-1 homolog from the related nematode C. briggsae was cloned and sequenced (Krause et al. 1994). Several conserved sequences could be identified within the enhancers previously defined by the functional assays, supporting the potential of these regions to be transcription factor targets. A number of “E-box” sequences (CAXXTG, binding sites for helix-loop-helix proteins) were noted in these conserved regions but none has yet been demonstrated actually to interact with protein. The authors showed that the C. briggsae hlh-1 gene could rescue C. elegans hlh-1 mutants and that the C. briggsae expression pattern was highly similar to that seen in C. elegans, with the interesting exception that the transient phase of hlh-1 expression in the MS granddaughters was not detected.
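
The kind of comparison described above, looking for E-box sequences that recur in the corresponding regions of two species, can be sketched in a few lines of Python. This is an illustration only and not the procedure used by Krause et al. (1994): the function names are hypothetical, the two sequences are invented toy strings rather than real hlh-1 flanking DNA, and the comparison ignores position and alignment. Because the CAXXTG pattern is its own reverse complement, scanning one strand is sufficient.

```python
import re

# E-box consensus as written in the text: CA, any two nucleotides, then TG.
# The lookahead allows overlapping sites to be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based position, matched sequence) for each E-box in seq."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq.upper())]


def shared_eboxes(seq_a, seq_b):
    """Return the E-box sequences found in both regions, ignoring position."""
    in_a = {site for _, site in find_eboxes(seq_a)}
    in_b = {site for _, site in find_eboxes(seq_b)}
    return sorted(in_a & in_b)


# Invented toy sequences, NOT real C. elegans or C. briggsae DNA.
toy_region_1 = "TTCAGCTGAACACCTGTT"
toy_region_2 = "GGCAGCTGCCCATATGAA"

print(find_eboxes(toy_region_1))
print(shared_eboxes(toy_region_1, toy_region_2))
```

A match found this way identifies only a candidate site; as noted above, none of the conserved E-boxes has yet been shown to interact with protein.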

The above studies appear to have eliminated at least the simplest and most extreme model in which completely distinct enhancers control expression in distinct hlh-1 -expressing cell lineages. Rather, the hlh-1 promoter appears to be an array of overlapping influences. One short region just upstream of the ATG is required for all expression, but other regions influence the different spatial and temporal aspects of hlh-1 expression, in short, a complex piecemeal type of control.

F. The mec-3 Gene

mec-3 has been one of the most intensely studied genes in C. elegans. As noted above, mec-3 encodes a LIM-type homeoprotein necessary for the correct production of differentiated touch cells. The genetic studies of mec-3 are described by Driscoll and Kaplan (this volume). In this section, we review studies analyzing the mec-3 promoter.

In the original report of mec-3 cloning, Way and Chalfie (1988) showed that touch-insensitive mutants could indeed be rescued by germ-line transformation with nonintegrated multicopy arrays of mec-3 . Phenotypes usually diagnostic for touch cells (e.g., degeneration induced by a dominant allele of mec-4 ) could be identified in a few ectopic cells, presumably as a result of mec-3 overexpression or misexpression from the transforming array. A mec-3 ::lacZ fusion construct containing several kilobases of 5′-flanking DNA was expressed in a total of ten neurons. This reporter expression apparently mirrors native MEC-3 distribution, although this has not yet been verified by immunocytochemical analysis of the endogenous protein: The reporter-expressing cells include the six touch cells plus four other cells for which there is evidence both for a role in mechanosensation and for mec-3 involvement in cell function (Way and Chalfie 1989). Moreover, transgenes respond as expected to appropriate genetic backgrounds. Expression is abolished in unc-86 mutants, in which the lineages that produce the touch cells are altered, and expression is reduced or transient in mec-3 mutants, suggesting elements of autoregulation. The response of this full-length reporter construct to a variety of genetic backgrounds has been studied (Way and Chalfie 1989; Way et al. 1992; Mitani et al. 1993).

The mec-3 promoter has been analyzed both by deletions and by site-directed mutations (Way et al. 1991; Xue et al. 1992). The mec-3 control region appears to be compact, and several hundred base pairs upstream of the mec-3 ATG (the site of transcription initiation is not known) are sufficient to generate the appropriate lacZ staining patterns. Comparison of 5′-flanking DNA sequences among the mec-3 homologs from three species (C. elegans, C. briggsae, and C. remanei strain VT733 [previously called C. vulgarensis, see Fitch and Thomas, this volume]) reveals at least four highly conserved regions, and these are obvious candidates for cis-acting control sequences. Moreover, sites in the 5′-flanking region have been shown directly to bind recombinant versions of the UNC-86 and the MEC-3 proteins (Xue et al. 1992, 1993), and, by and large, the UNC-86- and MEC-3-binding sites do lie in the conserved regions.

Several key features of mec-3 regulation and touch cell specification have been illuminated by the analysis of the mec-3 promoter. Alteration of UNC-86-binding sites leads to reduced mec-3 ::lacZ expression in both the establishment and the maintenance phase. Similarly, MEC-3-binding sites appear to be directly involved in the maintenance of mec-3 expression. This in vivo analysis is made far more interesting because recombinant UNC-86 and MEC-3 proteins have been shown in vitro to form heterodimers on the DNA (Xue et al. 1992, 1993). Moreover, the possibility of synergistic UNC-86-MEC-3-binding interactions fits within the genetic framework (Mitani et al. 1993) and with evidence from in vitro transcription (Lichtsteiner and Tjian 1995). At least two issues remain to be resolved—whether there is a negative element that keeps mec-3 repressed in sister cells and whether there are cell-type-specific (or lineage-specific) subelements within the mec-3 promoter.

G. The unc-86 Gene

The unc-86 gene product is one of the founding members of the POU class of homeodomain proteins (see above). The regulation of unc-86 activity exemplifies central problems in C. elegans development: How do transcription factors become asymmetrically expressed within a lineage and how does such an asymmetric expression of a transcription factor bring about different fates of different cells (for a detailed review of unc-86 function in neurogenesis, see Ruvkun, this volume). This section summarizes studies that analyze the unc-86 promoter.

As detected by antibody staining, unc-86 is expressed in 57 of the 302 neurons in adult C. elegans (Finney and Ruvkun 1990), and in general, these are the same cells that are altered in unc-86 mutants. Besides the fact that the unc-86 -expressing cells are neurons, common features are not obvious. The cells are not spatially clustered, they do not show obvious clonal or lineage relations, and they do not have the same function. The central question is: How can a transcription factor become deployed in such a complex pattern in the developing animal? One model can probably be ruled out. In lineages that are transformed by unc-86 mutations, immunologically detectable UNC-86 protein can appear in cell nuclei shortly after cell birth, i.e., UNC-86 does not appear to be produced in a mother cell and then segregated asymmetrically into only one of the daughters.

A recent transgenic analysis has provided important new insights into unc-86 regulation (Baumeister et al. 1996). A 10-kb region of the unc-86 locus (including ~5 kb of upstream DNA and 2 kb downstream from the gene), when introduced as multicopy arrays into an unc-86 mutant, is able to rescue mutant phenotypes such as defective mechanosensation and chemotaxis. Moreover, antibody staining showed that the transgene product is expressed only in cells that normally express unc-86. Reporter constructs in which the 5 kb of unc-86 5′-flanking region was fused to lacZ or to GFP (and excluding all sequences from unc-86 mRNA) were also expressed in the correct pattern, allowing the important conclusion that the complex unc-86 expression pattern is regulated at the level of unc-86 transcription.

Like the mec-3 gene described in the previous section, unc-86 appears to have two phases of expression, one of establishment and one of (autoregulatory) maintenance; i.e., the reporter constructs in an unc-86 mutant background show correct initial expression in the appropriate cells, but this expression is transitory. Moreover, promoter elements controlling establishment can be separated from elements controlling maintenance. A series of promoter deletions was used to explore the unc-86 establishment phase. The principal result was that distinct promoter regions directed reporter expression in distinct sets of the unc-86 -expressing cells. This important finding indicates that asymmetric unc-86 expression does not result from some unitary animal-wide control mechanism applied to all cell lineages that express unc-86 . Rather, the promoter appears to be modular, although many further constructs must be investigated to determine how discrete such a promoter module can actually be.

Baumeister et al. (1996) also introduced the (intact) promoter::reporter fusion into a number of genetic backgrounds and found that certain genes ( lin-11 , ham-1 , lin-17 ) affect unc-86 expression in some cell lineages, whereas other genes ( lin-32 , vab-3 , egl-5 ) affect expression in other lineages. This response is what would be expected for a modular promoter, in which different modules respond to different upstream genes. One task for the future will be to map the action of each potential unc-86 regulatory gene onto discrete elements in the unc-86 promoter. This will be a formidable amount of work, but the problem is both general and important.

H. The ges-1 Gene

The ges-1 gene encodes a carboxylesterase enzyme expressed exclusively in the intestinal cell lineage. Esterase expression can be first detected when the developing gut has four to eight cells and expression continues throughout the life cycle (Edgar and McGhee 1986, 1988). Studies on ges-1 control have focused on the establishment of the gut-specific expression patterns during the first half of embryogenesis (up to the comma or 1.5-fold stage); during these stages, the ges-1 esterase is the only detectable esterase in the embryonic intestine. Mutations that abolish ges-1 esterase activity have been produced (McGhee et al. 1990). The availability of these (viable) nonexpressing mutants and the cloned ges-1 gene (Kennedy et al. 1993) has allowed the gene to be used as its own reporter in transformation studies aimed at understanding control of the ges-1 promoter.

The basic approach has been to introduce ges-1 constructs into the ges-1 null mutant, to produce stable transgenic lines, in which the ges-1 gene is present in multicopy nonintegrated arrays, and then to stain transformed embryos for esterase activity. The initial analyses of the ges-1 promoter used unidirectional deletions (Aamodt et al. 1991; Kennedy et al. 1993). A subsequent study (Egan et al. 1995) has used more closely spaced internal deletions and site-directed mutations and, by and large, has confirmed the main conclusions of the earlier work. The results of this latter study can be summarized as follows. Control of the ges-1 gene centers on a region lying 800–1300 base pairs upstream of the ges-1 ATG codon ( ges-1 is trans-spliced so the point of transcription initiation is not yet known). A deletion that removes 1300–1100 base pairs upstream of the ATG abolishes ges-1 expression in the embryonic intestine, but this same construct now expresses ges-1 in cells of the pharynx and the tail.

It has been a matter of concern that the “pharynx/tail” pattern of expression seen with modified ges-1 promoters could have features in common with the “ectopic pharynx/posterior intestine” expression noted with a number of transgenic promoter-reporter constructs (see, e.g., Hope 1991; Krause et al. 1994). Furthermore, a significant fraction of promoter trap lines have been found to express in the pharynx. Nevertheless, all control experiments done to date suggest that the pharynx/tail expression patterns seen with modified ges-1 constructs do indeed reflect endogenous regulatory mechanisms associated with the ges-1 gene: Pharynx/tail expression has been observed in multiple stably transformed transgenic lines (using both integrated and nonintegrated arrays), in the absence of any vector sequences, with a variety of coinjected marker genes, and with several dozen different ges-1 deletion constructs. However, probably the strongest validation of the ges-1 transgenic analysis is that deletions in the endogenous chromosomal copies of the ges-1 promoter (isolated by imprecise transposon excisions) give rise to weak but significant ges-1 expression in the embryonic pharynx (Fukushige et al. 1996).

Sequences involved in this gut-to-pharynx/tail switch in ges-1 expression have been explored in more detail. The switch centers on a 36-base-pair region that contains two WGATAR sites. Deletion of these two sites is sufficient to abolish gut expression and to activate expression in the pharynx/tail pattern. As noted above, WGATAR sequences are known to be involved in gene control during vertebrate erythropoiesis (for review, see Weiss and Orkin 1995). Furthermore, the downstream WGATAR element sits in a region that matches (13/13 base pairs) a sequence implicated in control of the (gut-specific) vitellogenin genes (see above). This tandem pair of WGATAR sites acts as an embryonic gut enhancer. Reintroduction of the sequences several hundred base pairs downstream from the normal location in the WGATAR-deleted ges-1 construct (and in either orientation) returns ges-1 expression to the gut and silences expression in the pharynx/tail. Moreover, the sites (at least when multimerized) can direct expression of a naive promoter/reporter gene construct in the embryonic intestine. No evidence could be produced that the sites were capable of repressing expression from a heat shock promoter (Egan et al. 1995).
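
A tandem arrangement of this kind, two WGATAR sites falling within a short window, can be searched for computationally. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the 36-bp window is simply borrowed from the size of the region described above, sites are accepted on either strand, and the test sequence is an invented toy string rather than ges-1 promoter DNA.

```python
import re

# WGATAR on the forward strand, and its reverse complement (YTATCW),
# which marks a WGATAR site carried on the opposite strand.
WGATAR_FWD = re.compile(r"(?=([AT]GATA[AG]))")
WGATAR_REV = re.compile(r"(?=([CT]TATC[AT]))")
SITE_LEN = 6


def wgatar_starts(seq):
    """Return sorted 0-based start positions of WGATAR sites on either strand."""
    seq = seq.upper()
    starts = {m.start() for m in WGATAR_FWD.finditer(seq)}
    starts |= {m.start() for m in WGATAR_REV.finditer(seq)}
    return sorted(starts)


def tandem_pairs(seq, window=36):
    """Return (start_a, start_b) for each pair of sites that fits in `window` bp."""
    starts = wgatar_starts(seq)
    pairs = []
    for i, a in enumerate(starts):
        for b in starts[i + 1:]:
            # Span is measured from the first base of site a to the last of site b.
            if b + SITE_LEN - a <= window:
                pairs.append((a, b))
    return pairs


# Invented toy sequence: two close sites, then a distant reverse-strand site.
toy = "CC" + "TGATAA" + "G" * 10 + "AGATAG" + "CC" + "C" * 40 + "TTATCA"

print(wgatar_starts(toy))
print(tandem_pairs(toy))
```

In the toy sequence the first two sites lie 22 bp apart end to end and are reported as a pair, while the third site is too distant to pair with either.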

Laser microsurgery experiments combined with genetic analysis have shown that the embryonic cells expressing the WGATAR deleted ges-1 construct belong to all three non-gut modules of the digestive tract: the ABa-derived anterior pharynx, the MS-derived posterior pharynx, and the ABp-derived rectum. Furthermore, this non-gut digestive tract expression of the WGATAR deleted gene is abolished by mutations in the zygotic gene pha-4 and responds appropriately to mutations in a series of maternal-effect genes that alter early blastomere fate ( skn-1 , mex-1 , pie-1 , and pop-1 ) (Fukushige et al. 1996). Thus, ges-1 appears to be regulated at the level of the entire digestive tract, not just at the level of separation of the E and MS blastomere fates as had originally been suggested (Aamodt et al. 1991; Egan et al. 1995). Furthermore, this digestive-tract-wide level of control is normally hidden, perhaps reflecting the evolutionary history of the ges-1 gene.

The transgenic studies of ges-1 expression also appear to have uncovered distinct control mechanisms within the gut lineage. Deletion of either the upstream or the downstream WGATAR site restricted ges-1 expression to the anterior gut, with no expression in the posterior gut. Deletion of a neighboring fragment (called δ4, spanning base pairs –811 to –1100) likewise restricted ges-1 expression to the anterior gut. These deletions thus accentuate the normal tendency of ges-1 staining to be stronger in the gut anterior (Edgar and McGhee 1986). Deletion of any two of the three elements (i.e., upstream WGATAR, downstream WGATAR, or the adjoining δ4 region) led to complete abolition of ges-1 expression in the gut and to activation of ges-1 in the pharynx/tail. Deletion of all three of these elements greatly reduced ges-1 expression in all sets of cells. A rather detailed molecular model attempting to condense all the above results has been proposed (Egan et al. 1995). However, inconsistencies within the model were already apparent; for example, deletion of all three promoter elements would not be predicted to cause extinction of ges-1 expression in the pharynx/tail.
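The single, double, and triple deletion outcomes described above can be tabulated compactly. The sketch below is only a lookup of the reported observations, keyed by how many of the three elements are deleted; it is not the molecular model of Egan et al. (1995), and the element labels, function name, and pattern strings are hypothetical names chosen for this example.

```python
# Hypothetical labels for the three promoter elements discussed in the text.
ELEMENTS = frozenset({"upstream_WGATAR", "downstream_WGATAR", "delta4"})

# Reported ges-1 expression pattern, keyed by the number of elements deleted.
# This records observations only; it makes no mechanistic claim.
PATTERN_BY_DELETION_COUNT = {
    0: "gut (normal pattern, stronger in anterior)",
    1: "anterior gut only",
    2: "pharynx/tail (gut expression abolished)",
    3: "greatly reduced in all cells",
}


def reported_pattern(deleted):
    """Return the reported expression pattern for a set of deleted elements."""
    deleted = frozenset(deleted)
    unknown = deleted - ELEMENTS
    if unknown:
        raise ValueError(f"unknown element(s): {sorted(unknown)}")
    return PATTERN_BY_DELETION_COUNT[len(deleted)]


print(reported_pattern({"delta4"}))
print(reported_pattern({"upstream_WGATAR", "downstream_WGATAR"}))
print(reported_pattern(ELEMENTS))
```

Laid out this way, the triple-deletion row makes the inconsistency noted above easy to see: the outcome does not continue the trend of the double deletions, since pharynx/tail expression is lost rather than retained.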

A biochemical system has been developed to study DNA-protein interactions in the early embryo (Stroeher et al. 1994). Nuclear extracts of blocked embryos (but not of oocytes) were shown to contain a factor that binds specifically to the tandem WGATAR sites. As described earlier, the tandem WGATAR sequences from the ges-1 promoter have been used to isolate the elt-2 gene, which encodes a C. elegans zinc finger “GATA-factor” and which is a candidate for direct ges-1 control (Hawkins and McGhee 1995).

I. Promoter Trapping

The above discussion concerns specific cloned genes whose expression pattern was then investigated by transformation. Hope (1991) has established a “promoter trap” approach that takes the opposite direction. By fusing random genomic fragments to lacZ reporters, expression patterns in transformed worms can be investigated without prior knowledge of the gene. This approach has revealed a number of intriguing expression patterns worthy of further study and, in the particular case of pes-1 (Hope 1994), has identified what could be an important regulatory factor in embryogenesis. A recent extension of this approach has investigated lacZ expression patterns directed by potential gene regulatory regions assigned by the genome sequencing project (Lynch et al. 1995).

J. Summary and Conclusions

From the above examples, it seems clear that transformation “works” reasonably well as an experimental tool to investigate gene expression and gene regulation in C. elegans. Yet, there are limitations and potential biases in the method that should be kept in mind. It is of crucial importance to have some independent means of detecting where a gene is normally expressed, either by endogenous enzyme activity, antibody staining, or in situ hybridization. By introducing a foreign reporter sequence into a C. elegans gene, control signals could in principle be disrupted, deleted, or misspaced; posttranscriptional regulation of both mRNA (see, e.g., Seydoux and Fire 1994; Wilkinson and Greenwald 1995) and protein could also be aberrant. Even when a gene acts as its own reporter, as in the case of ges-1, potential concerns arise about the influence of multiple gene copies, interspersed with marker genes and divorced from any long-range chromosomal context. Some of the concerns can be addressed by suitable controls and cautious interpretations, but many of the problems will not be solved until an efficient gene replacement technology is developed for C. elegans. With these cautions, there are important reasons for optimism: As one example, transformation of worms with fusions to the GFP reporter raises the possibility of watching gene activity in real time within particular cells inside the living animal.

We should also note several aspects of transformation in C. elegans, especially transformation using multicopy arrays, that are unexplained but potentially interesting. First, where it can be measured, multicopy arrays clearly lead to overexpression of the transforming genes (Kennedy et al. 1993; Egan et al. 1995). This can possibly lead to low-level ectopic expression in particular cells, but misexpression might also be due to rearrangement of some small fraction of the transforming genes. The possibility of mosaic loss of the transforming array is a constant source of uncertainty, but it is curious that even integration of the array into the genome may still not raise the expression penetrance to 100% (Krause et al. 1994; R. Baumeister and G. Ruvkun, in prep.). There have also been reports of the transforming gene somehow interfering with the function of the endogenous gene. As one example, a ceh-10::lacZ transgene apparently impaired the normal function of the CAN cell, a cell in which the transgene is expressed (Svendsen and McGhee 1995). It is not known whether this effect arises at the level of transcription (e.g., due to competition for limiting transcription factors, antisense inhibition because of unregulated transcription from the transforming array, or some ectopic pairing phenomenon between the endogenous gene and the extrachromosomal array) or at the level of protein function (e.g., the ceh-10 fusion protein improperly dimerizing with its normal ceh-10 homeodomain partner). Phenocopies have also been reported in hlh-1 and pal-1 transformants. Perhaps some of the above unexplained phenomena can be turned to advantage to produce novel insights into gene regulation.

Overall, what has transgenic analysis revealed about transcriptional control in C. elegans? It is not surprising that C. elegans genes, like those of other eukaryotes, are controlled by arrays of enhancers and repressors. Although none of these C. elegans elements or arrays of elements have yet been investigated in sufficient detail that we can say that we really understand a certain promoter, the array of controls seems to be compact. Most of the genes investigated by transformation have been “correctly” controlled by flanking regions no longer than several kilobases and, in certain cases, a few hundred base pairs seem adequate. On the one hand, this may reflect the imperfections of the current transformation assays in which long-range influences might not have been detected. On the other hand, these local controls certainly seem to be capable of producing a good first-order approximation of correct gene regulation. It also seems that compared to genes in other animals, a higher fraction of genes in C. elegans have transcriptional control elements in introns and at the 3′ end. Perhaps this is a necessary consequence of gene compactness, but it is a feature that should be considered in gene expression studies because reporter construct strategies usually do away with such elements.

Copyright © 1997, Cold Spring Harbor Laboratory Press.
Bookshelf ID: NBK20088