NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Brown TA. Genomes. 2nd edition. Oxford: Wiley-Liss; 2002.

Chapter 12. Regulation of Genome Activity

Learning outcomes:

When you have read Chapter 12, you should be able to:

  • Distinguish between differentiation and development, and outline how regulation of genome expression underlies these two processes
  • Describe, with examples, the various ways in which extracellular signaling compounds can bring about transient changes in genome activity, making clear distinction between those signaling compounds that enter the cell and those that bind to a cell surface receptor
  • Describe, with examples, the various ways in which permanent and semipermanent changes in genome activity can be brought about, making clear distinction between those processes that involve rearrangement of the genome, those that involve changes in chromatin structure, and those that involve feedback loops
  • Discuss how studies of sporulation in Bacillus subtilis, vulva development in Caenorhabditis elegans, and embryogenesis in Drosophila melanogaster have contributed to our understanding of genome regulation during development, and explain why lower organisms can act as models for development in higher eukaryotes such as humans

We have followed the pathway by which expression of the genome specifies the content of the proteome, which in turn determines the biochemical signature of the cell. In no organism is this biochemical signature entirely constant. Even the simplest unicellular organisms are able to alter their proteomes to take account of changes in the environment, so that their biochemical capabilities are continually in tune with the available nutrient supply and the prevailing physical and chemical conditions. Cells in multicellular organisms are equally responsive to changes in the extracellular environment, the only difference being that the major stimuli include hormones and growth factors as well as nutrients. The resulting transient changes in genome activity enable the proteome to be remodeled continuously to satisfy the demands that the outside world places on the cell (Figure 12.1). Other changes in genome activity are permanent or at least semipermanent, and result in the cell's biochemical signature becoming altered in a way that is not readily reversible. These changes lead to cellular differentiation, the adoption by the cell of a specialized physiological role. Differentiation pathways are known in many unicellular organisms, an example being the production of spore cells by bacteria such as Bacillus, but we more frequently associate differentiation with multicellular organisms, in which a variety of specialized cell types (over 250 types in humans) are organized into tissues and organs. Assembly of these complex multicellular structures, and of the organism as a whole, requires coordination of the activities of the genomes in different cells. This coordination involves both transient and permanent changes, and must continue over a long period of time during the development of the organism.

Figure 12.1. Two ways in which genome activity is regulated. The genes on the left are subject to transient regulation and are switched on and off in response to changes in the extracellular environment. The genes on the right have undergone a permanent or semipermanent ...

There are many steps within the expression pathways for individual genes at which regulation can be exerted (Table 12.1). Examples of the biological roles of different mechanisms were given at the appropriate places in Chapters 8–11. The objective of this chapter is not to reiterate these gene-specific control systems, but to explain how the activity of the genome as a whole is regulated. In doing this we should bear in mind that the biosphere is so diverse, and the numbers of genes in individual genomes so large, that it is reasonable to assume that any mechanism that could have evolved to regulate genome expression is likely to have done so. It is therefore no surprise that we can nominate examples of regulation for every point in the genome expression pathway. But are all these control points of equal importance in regulating the activity of the genome as a whole? Our current perception is that they are not. Our understanding may be imperfect, based as it is on investigation of just a limited number of genes in a few organisms, but it appears that the critical controls over genome expression - the decisions about which genes are switched on and which are switched off - are exerted at the level of transcription initiation. For most genes, control that is exerted at later steps serves to modulate expression but does not act as the primary determinant of whether the gene is on or off (see Figure 9.22). Most, but not all, of what we will discuss in this chapter therefore concerns control of genome activity by mechanisms that specify which genes are transcribed and which are silent. We will address two issues: the ways in which transient and permanent changes in genome activity are brought about, and the ways in which these changes are linked in time and space in developmental pathways.

Table 12.1. Examples of steps in the genome expression pathway at which regulation can be exerted.

12.1. Transient Changes in Genome Activity

Transient changes in genome activity occur predominantly in response to external stimuli. For unicellular organisms, the most important external stimuli relate to nutrient availability, because these cells live in variable environments in which the identities and relative amounts of the nutrients change over time. The genomes of unicellular organisms therefore include genes for uptake and utilization of a range of nutrients, and changes in nutrient availability are shadowed by changes in genome activity, so that at any one time only those genes needed to utilize the available nutrients are expressed. Most cells in multicellular organisms live in less variable environments, but these are environments whose maintenance requires coordination between the activities of different cells. For these cells, the major external stimuli are therefore hormones, growth factors, and related compounds that convey signals within the organism and stimulate coordinated changes in genome activity.

To exert an effect on genome activity, the nutrient, hormone, growth factor, or other extracellular compound that represents the external stimulus must influence events within the cell. There are two ways in which it can do this (Figure 12.2):

  • Directly, by acting as a signaling compound that is transported across the cell membrane and into the cell;
  • Indirectly, by binding to a cell surface receptor which transmits a signal into the cell.

Figure 12.2. Two ways in which an extracellular signaling compound can influence events occurring within a cell.

Signal transmission, by direct or indirect means, is one of the major research areas in cell biology (Lodish et al., 2000), with attention focused in particular on its relevance to the abnormal biochemical activities that underlie cancer. Many examples of signal transmission have been discovered, some of general importance in a variety of organisms and others restricted to just a few species. In the first part of this chapter we will survey the field.

12.1.1. Signal transmission by import of the extracellular signaling compound

In the direct method of signal transmission, the extracellular compound that represents the external stimulus crosses the cell membrane and enters the cell. After import into the cell, the signaling compound could influence genome activity by any one of three routes (Figure 12.3):

  • If the signaling compound is a protein, then it could act in the same way as one of the various protein factors that we met in Chapters 8–11, for example by activating or repressing assembly of the transcription initiation complex (Section 9.3), or by interacting with a splicing enhancer or silencer (Section 10.1.3).
  • The signaling compound could influence the activity of an existing protein factor. Such a signaling compound need not be a protein: it could, theoretically, be any type of compound.
  • The signaling compound could influence the activity of an existing protein factor via one or more intermediates, rather than by interacting with it directly.

Figure 12.3. Three ways in which an extracellular signaling compound could influence genome activity.

Examples of each of these three modes of action are described below.

Lactoferrin is an extracellular signaling protein which acts as a transcription activator

If the extracellular signaling compound that is imported into the cell is a protein with suitable properties then it could directly affect the activity of its target genes by acting as an activator or repressor of some stage in the genome expression pathway. This might appear to be an attractively straightforward way of regulating genome activity, but it is not a common mechanism. The reason for this is not clear but probably relates, at least partly, to the difficulty in designing a protein that combines the hydrophobic properties needed for effective transport across a membrane with the hydrophilic properties needed for migration through the aqueous cytoplasm to the protein's site of action in the nucleus or on a ribosome.

The one clear example of a signaling compound that can function in this way is provided by lactoferrin, a mammalian protein found mainly in milk and, to a lesser extent, in the bloodstream. Lactoferrin is a transcription activator (Section 9.3.2). Its specific function has been difficult to pin down, but it seems to play a role in the body's defenses against microbial attack. As its name suggests, lactoferrin is able to bind iron, and it is thought that at least part of its protective role arises from its ability to reduce free-iron levels in milk, thereby starving invading microbes of this essential cofactor. It might therefore appear unlikely that lactoferrin would have a role in genome expression, but it has been known since the early 1980s that the protein is multi-talented and, among other things, can bind to DNA. This property was linked to a second function of lactoferrin - stimulation of the blood cells involved in the immune response - when in 1992 it was shown that the protein is taken up by immune cells, enters their nuclei, and attaches to the genome (Garre et al., 1992). Subsequently the DNA binding was shown to be sequence specific and to stimulate transcription, confirming that lactoferrin is a true transcription activator (He and Furmanski, 1995).

Some imported signaling compounds directly influence the activity of pre-existing protein factors

Although only a few imported signaling compounds are able themselves to act as activators or repressors of genome expression, many have the ability to influence directly the activity of factors that are already present in the cell. We encountered one example of this type of regulation in Section 9.3.1 when we studied the lactose operon of Escherichia coli. This operon responds to extracellular levels of lactose, the latter acting as a signaling molecule which enters the cell and, after conversion to its isomer allolactose, influences the DNA-binding properties of the lactose repressor and hence determines whether or not the lactose operon is transcribed (see Figure 9.24). Many other bacterial operons coding for genes involved in sugar utilization are controlled in this way.

Direct interaction between a signaling compound and a transcription activator or repressor is also a common means of regulating genome activity in eukaryotes. A good example is provided by the control system that maintains the intracellular metal-ion content at an appropriate level. Cells need metal ions such as copper and zinc as cofactors in biochemical reactions, but these metals are toxic if they accumulate in the cell above a certain level. Their uptake therefore has to be carefully controlled so that the cell contains sufficient metal ions when the environment is lacking in metal compounds, but does not over-accumulate metal ions when the environmental concentrations are high. The strategies used are illustrated by the copper-control system of Saccharomyces cerevisiae. This yeast has two copper-dependent transcription activators, Mac1p and Ace1p. Both of these activators bind copper ions, the binding inducing a conformational change that enables the factor to stimulate expression of its target genes (Figure 12.4). For Mac1p these target genes code for copper-uptake proteins, whereas for Ace1p they are genes coding for proteins such as superoxide dismutase that are involved in copper detoxification. The metal-controlled balance between the activities of Mac1p and Ace1p ensures that the copper content of the cell remains within acceptable levels (Pena et al., 1998; Winge et al., 1998).

Figure 12.4. Copper-regulated gene expression in Saccharomyces cerevisiae. Yeast requires low amounts of copper because a few of its enzymes (e.g. cytochrome c oxidase and tyrosinase) are copper-containing metalloproteins, but too much copper is toxic for the cell.

Transcription activators are also the targets for steroid hormones such as progesterone, estrogen and glucocorticoid hormones. Steroid hormones are signaling compounds that coordinate a range of physiological activities in the cells of higher eukaryotes (Tsai and O'Malley, 1994). Steroids are hydrophobic and so easily penetrate the cell membrane. Once inside the cell, each hormone binds to a specific steroid receptor protein, which is usually located in the cytoplasm (Figure 12.5). After binding, the activated receptor migrates into the nucleus, where it attaches to a response element (see Box 9.6) upstream of a target gene. The typical hormone response element is a 15 bp sequence comprising two 6 bp half-sites arranged as an inverted repeat and separated by a 3 bp spacer, to which the steroid receptor binds via a special version of the zinc finger (see Figure 9.13). Response elements for each receptor are located upstream of 50–100 genes and, once bound, the receptor acts as a transcription activator. A steroid hormone can therefore induce a large-scale change in the biochemical properties of the cell.
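
The 6 bp + 3 bp + 6 bp layout of a hormone response element can be made concrete with a short check for inverted-repeat symmetry. The Python sketch below is illustrative only: the function names are hypothetical, and the test sequence is an assumption based on the commonly cited estrogen response element consensus AGGTCAnnnTGACCT, with an arbitrarily chosen spacer. Real response elements often deviate from perfect symmetry, so a strict test like this would miss many functional sites.

```python
# Map each base to its complement (unknown characters are left unchanged).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat_element(seq: str, half: int = 6, spacer: int = 3) -> bool:
    """True if seq is two half-sites in inverted orientation around a spacer.

    With the defaults this tests the 6 + 3 + 6 = 15 bp layout: the right
    half-site must equal the reverse complement of the left half-site.
    """
    seq = seq.upper()
    if len(seq) != 2 * half + spacer:
        return False
    left = seq[:half]
    right = seq[half + spacer:]
    return right == reverse_complement(left)


# Assumed example sequence: consensus half-sites with an arbitrary spacer.
print(is_inverted_repeat_element("AGGTCACAGTGACCT"))
```

Because the two half-sites read the same on opposite strands, a receptor dimer can contact each half-site in an equivalent way, which is the property the symmetry test captures.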

Figure 12.5. Gene activation by a steroid hormone. Estradiol is one of the estrogen steroid hormones. After entering the cell, estradiol attaches to its receptor protein and the complex enters the nucleus where it binds to the 15-bp estrogen response element (abbreviation: ...)

All steroid receptors are structurally similar, not just with regard to their DNA-binding domains but also in other parts of their protein structures (Figure 12.6). Recognition of these similarities has led to the identification of a number of putative or orphan steroid receptors whose hormonal partners and cellular functions are not yet known. The structural similarities have also shown that a second set of receptor proteins, the nuclear receptor superfamily, belongs to the same general class as steroid receptors, although the hormones that they work with are not steroids. As their name suggests, these receptors are located in the nucleus rather than the cytoplasm. They include the receptors for vitamin D3, whose roles include control of bone development, and thyroxine, which stimulates the tadpole-to-frog metamorphosis.

Figure 12.6. All steroid hormone receptor proteins have similar structures. Three receptor proteins are compared. Each one is shown as an unfolded polypeptide with the two conserved functional domains aligned. The DNA-binding domain is very similar in all steroid ...

Some imported signaling compounds influence genome activity indirectly

The link between a signaling molecule and the protein factors involved in genome expression does not have to be as direct as in the examples described in the previous section. Signaling molecules can also influence genome activity in an indirect manner via one or more intermediates. An example is provided by the catabolite repression system of bacteria. This is the means by which extracellular and intracellular glucose levels dictate whether or not operons for utilization of other sugars are switched on when those alternative sugars are present in the medium.

This phenomenon was discovered by Jacques Monod in 1941, who showed that if E. coli or Bacillus subtilis are provided with a mixture of sugars, then one sugar will be metabolized first, the bacteria turning to the second sugar only when the first has been used up. Monod used a French word to describe this: diauxie (Brock, 1990). One combination of sugars that elicits a diauxic response is glucose plus lactose, glucose being used before the lactose (Figure 12.7A). When the details of the lactose operon were worked out some 20 years later (Section 9.3.1) it became clear that the diauxie between glucose and lactose must involve a mechanism whereby the presence of glucose can override the normal inductive effect that lactose has on its operon. In the presence of lactose plus glucose, the lactose operon is switched off, even though some of the lactose in the mixture is converted into allolactose which binds to the lactose repressor, so that under normal circumstances the operon would be transcribed (Figure 12.7B).

Figure 12.7. Catabolite repression. (A) A typical diauxic growth curve, as seen when Escherichia coli is grown in a medium containing a mixture of glucose and lactose. During the first few hours the bacteria divide exponentially, using the glucose as the carbon and ...

The explanation for the diauxic response is that glucose acts as a signaling compound that represses expression of the lactose operon, as well as other sugar utilization operons, through an indirect influence on the catabolite activator protein. This protein binds to various sites in the bacterial genome and activates transcription initiation at downstream promoters. Productive initiation of transcription at these promoters is dependent on the presence of the bound protein: if the protein is absent then the genes it controls are not transcribed.

Glucose does not itself interact with the catabolite activator protein. Instead, glucose controls the level in the cell of the modified nucleotide cyclic AMP (cAMP; Figure 12.7C). It does this by inhibiting the activity of adenylate cyclase, the enzyme that synthesizes cAMP from ATP. This means that if glucose levels are high, the cAMP content of the cell is low. The catabolite activator protein can bind to its target sites only in the presence of cAMP, so when glucose is present the protein remains detached and the operons it controls are switched off. In the specific case of diauxie involving glucose plus lactose, the indirect effect of glucose on the catabolite activator protein means that the lactose operon remains inactivated, even though the lactose repressor is not bound, and so the glucose in the medium is used up first. When the glucose is gone, the cAMP level rises and the catabolite activator protein binds to its target sites, including the site upstream of the lactose operon, and transcription of the lactose genes is activated.
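
The combined effect of the lactose repressor and the catabolite activator protein can be summarized as a small truth table. The Python sketch below is an illustration only: the function name `lac_operon_transcribed` is hypothetical, and reducing glucose, lactose and cAMP levels to simple on/off values is an assumption made for clarity. Real cells respond in a graded rather than all-or-nothing fashion.

```python
def lac_operon_transcribed(glucose_present: bool, lactose_present: bool) -> bool:
    """Boolean sketch of lactose operon control (illustrative simplification)."""
    # Glucose inhibits adenylate cyclase, so cAMP is high only without glucose.
    camp_high = not glucose_present
    # The catabolite activator protein binds its target site only with cAMP.
    activator_bound = camp_high
    # Allolactose, made from lactose, detaches the repressor from the operator.
    repressor_bound = not lactose_present
    # Transcription needs the activator bound and the repressor detached.
    return activator_bound and not repressor_bound


# Print the full truth table for the two sugars.
for glucose in (True, False):
    for lactose in (True, False):
        state = "on" if lac_operon_transcribed(glucose, lactose) else "off"
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> operon {state}")
```

Only one of the four combinations, lactose without glucose, switches the operon on. This matches the diauxic growth pattern, in which lactose utilization begins only after the glucose has been exhausted.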

12.1.2. Signal transmission mediated by cell surface receptors

Many extracellular signaling compounds are unable to enter the cell because they are too hydrophilic to penetrate the lipid membrane and the cell lacks a specific transport mechanism for their uptake. In order to influence genome activity these signaling compounds must bind to cell surface receptors that carry their signals across the cell membrane. These receptors are proteins that span the membrane, with a site for binding the signaling compound on the outer surface. Binding of the signaling compound results in a conformational change in the receptor, inducing a biochemical event within the cell, often phosphorylation of an intracellular protein. This event forms the first step in the intracellular stage of the signal transduction pathway (Figure 12.8). Several types of cell surface receptor have been discovered (Table 12.2) and the intracellular events that they initiate are diverse, with many variations on each theme, not all of these specifically involved in regulating genome activity. Three examples will help us to appreciate the complexity of the system.

Figure 12.8. The role of a cell surface receptor in signal transduction. Binding of the extracellular signaling compound to the outer surface of the receptor protein causes a conformational change that results in activation of an intracellular protein, for example ...

Table 12.2. Cell surface receptor proteins involved in signal transmission into eukaryotic cells.

Signal transduction with one step between receptor and genome

With some signal transduction systems, stimulation of the cell surface receptor by attachment of the extracellular signaling compound results in the direct activation of a protein that influences genome activity. This is the simplest system by which an extracellular signal can be transduced into a genomic response.

The direct system is used by many cytokines such as interleukins and interferons, which are extracellular signaling polypeptides that control cell growth and division. Binding of these polypeptides to their cell surface receptors results in activation of a type of transcription factor called a STAT (signal transducer and activator of transcription). Activation is by phosphorylation of a single tyrosine amino acid at a position near to the C terminus of the STAT polypeptide. If the cell surface receptor is a member of the tyrosine kinase family (see Table 12.2) then it is able to activate the STAT directly (Figure 12.9A). If it is a tyrosine-kinase-associated receptor then it does not itself have the ability to phosphorylate a STAT, or any other intracellular protein, but acts through intermediaries called Janus kinases (JAKs). Binding of the signaling molecule to a tyrosine-kinase-associated receptor causes a change in the conformation of the receptor, often by inducing dimerization. This causes a JAK that is associated with the receptor to phosphorylate itself, this autoactivation being followed by phosphorylation of the STAT by the JAK (Figure 12.9B).

Figure 12.9. Signal transduction involving STATs. (A) If the receptor is a member of the tyrosine kinase family then it can activate the STAT directly. (B) If the receptor is a tyrosine-kinase-associated type then it acts via a Janus kinase (JAK), which autophosphorylates ...

Seven STATs have so far been identified in mammals (Horvath, 2000). Three of these - STATs 2, 4 and 6 - are specific for just one or two extracellular cytokines, but the others are broad spectrum and can be activated by several different interleukins and interferons. Discrimination is provided by the cell surface receptors: a particular receptor binds just one type of cytokine, and most cells have only one or a few types of receptor. Different cells therefore respond in different ways to the presence of particular cytokines, even though the internal signaling process involves only a limited number of STATs.

The consensus sequence of the DNA-binding sites for STATs has been defined as 5′-TTN(5–6)AA-3′, where N(5–6) is a run of five or six nucleotides of any type, largely by studies in which purified STATs have been tested against oligonucleotides of known sequence. The DNA-binding domain of the STAT protein is made up of three loops emerging from a barrel-shaped β-sheet structure (Becker et al., 1998). This is an unusual type of DNA-binding domain and has not been identified in precisely the same form in any other type of protein, although it has similarities with the DNA-binding domains of the NF-κB and Rel transcription activators. These similarities refer only to the tertiary structures of the DNA-binding domains because STATs, NF-κB and Rel, as a whole, have very little amino acid sequence identity. Many target genes are activated by STATs but the overall genomic response is modulated by other proteins which interact with STATs and influence which genes are switched on under a particular set of circumstances. Complexity is entirely expected because the cellular processes that STATs mediate - growth and division - are themselves complex and we anticipate that changes in these processes will require extensive remodeling of the proteome and hence large-scale alterations in genome activity.
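
A consensus of this kind (TT, then five or six nucleotides of any type, then AA) translates directly into a pattern search. The Python sketch below is a minimal illustration using the standard `re` module. The function name `find_stat_sites` and the test sequences are assumptions, and a match to the consensus indicates only a candidate site, not demonstrated STAT binding. Only the strand supplied is scanned.

```python
import re
from typing import List, Tuple

# TT, then 5 or 6 bases of any type, then AA. The lookahead lets the scan
# report candidate sites that overlap one another. The spacer quantifier is
# greedy, so if both spacer lengths fit at one start position only the
# longer match is reported.
STAT_SITE = re.compile(r"(?=(TT[ACGT]{5,6}AA))")


def find_stat_sites(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched sequence) for each candidate site."""
    return [(m.start(), m.group(1)) for m in STAT_SITE.finditer(seq.upper())]


# Assumed test sequences, chosen only to exercise both spacer lengths.
print(find_stat_sites("GGTTCCCGGAAGG"))  # five-nucleotide spacer
print(find_stat_sites("TTACGTACAA"))     # six-nucleotide spacer
```

A short, degenerate pattern like this occurs frequently by chance in a large genome, which is consistent with the point that other proteins help determine which genes a STAT actually activates.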

Signal transduction with many steps between receptor and genome

The simplicity of the system whereby the cell surface receptor activates a STAT, directly or through a JAK associated with the receptor, contrasts with the more prevalent forms of signal transduction, in which the receptor represents just the first in a series of steps that lead eventually to one or more transcription activators or repressors being switched on or off. A number of these cascade pathways have been delineated in different organisms. The following are the important ones in mammals:

  • The MAP (mitogen activated protein) kinase system (Figure 12.10) responds to many extracellular signals, including mitogens - compounds with similar effects to cytokines but that specifically stimulate cell division (Robinson and Cobb, 1997). Binding of the signaling compound causes the internal parts of the mitogen receptor to become phosphorylated. Phosphorylation stimulates attachment to the receptor, on the internal side of the membrane, of various cytoplasmic proteins, one of which is Raf, a protein kinase that is activated when it becomes membrane bound. Raf initiates a cascade of phosphorylation reactions. It phosphorylates Mek, activating this protein so that it, in turn, phosphorylates the MAP kinase. The activated MAP kinase now moves into the nucleus where it switches on, again by phosphorylation, a series of transcription activators. The MAP kinase also phosphorylates another protein kinase, this one called Rsk, which phosphorylates and activates a second set of factors. Additional flexibility is provided by the possibility of replacing one or more of the proteins in the MAP kinase pathway with related proteins, ones with slightly different specificities and so activating another suite of factors. The MAP kinase pathway is used by vertebrate cells; equivalent pathways, using intermediates similar to those identified in mammals, are known in other organisms (see Section 12.3.2 for an example).
  • The Ras system is centered around the Ras proteins, three of which are known in mammalian cells (H-, K- and N-Ras), and similar proteins such as Rac and Rho. These proteins are involved in regulation of cell growth and differentiation and, as with many proteins in this category, when dysfunctional they can give rise to cancer. The Ras family proteins are not limited to mammals, examples being known in other eukaryotes such as the fruit fly. Ras proteins are intermediates in signal transduction pathways that initiate with autophosphorylation of a tyrosine kinase receptor in response to an extracellular signal. The phosphorylated version of the receptor forms protein-protein complexes with GNRPs (guanine nucleotide releasing proteins) and GAPs (GTPase activating proteins) which activate and inactivate Ras, respectively (Figure 12.11; Schlessinger, 1993). The extracellular signals can therefore switch Ras-mediated signal transduction on or off, the choice between the two depending on the nature of the signal and the relative amounts of active GNRPs and GAPs in the cell. When activated, Ras stimulates Raf activity, so in effect Ras provides a second entry point into the MAP kinase pathway, although this is unlikely to be the only function of Ras and it probably also activates proteins involved in signal transduction by second messengers (as described in the next section).
  • The SAP (stress activated protein) kinase system is induced by stress-related signals such as ultraviolet radiation, and growth factors associated with inflammation. The pathway has not been described in detail but is similar to the MAP kinase system although targeting a different set of transcription activators.
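
The order of events in the MAP kinase cascade, and the way Ras feeds into it, can be pictured as a small directed graph. The Python sketch below is a teaching aid only: the dictionary `ACTIVATES`, the function `downstream` and the labels "set 1" and "set 2" for the two groups of transcription activators are hypothetical names introduced here, and the graph includes only the steps named in the text above.

```python
from collections import deque

# Each key activates the components listed as its value.
ACTIVATES = {
    "mitogen receptor": ["Raf"],
    "Ras": ["Raf"],
    "Raf": ["Mek"],
    "Mek": ["MAP kinase"],
    "MAP kinase": ["transcription activators (set 1)", "Rsk"],
    "Rsk": ["transcription activators (set 2)"],
}


def downstream(start: str) -> list:
    """List every component reached from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in ACTIVATES.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


print(downstream("mitogen receptor"))
print(downstream("Ras"))
```

Both entry points reach the same downstream components because they converge on Raf, which is the sense in which Ras provides a second entry point into the MAP kinase pathway.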

Figure 12.10. Signal transduction by the MAP kinase pathway. See the text for details. ‘MK’ is the MAP kinase and ‘P’ indicates a phosphate group, PO₃²⁻. Elk-1, c-Myc and SRF (serum response factor) are examples of transcription factors ...

Figure 12.11. The Ras signal transduction system. See the text for details. Abbreviations: GAP, GTPase activating protein; GNRP, guanine nucleotide releasing protein. ‘P’ indicates a phosphate group, PO₃²⁻.

Signal transduction via second messengers

Some signal transduction cascades do not involve the direct transfer of the external signal to the genome but instead utilize an indirect means of influencing transcription. The indirectness is provided by second messengers, which are less specific internal signaling compounds that transduce the signal from a cell surface receptor in several directions so that a variety of cellular activities, not just transcription, respond to the one signal.

In Section 12.1.1 we saw how glucose modulates the catabolite activator protein by influencing cAMP levels in bacteria (see Figure 12.7). Cyclic nucleotides are also important second messengers in eukaryotic cells. Some cell surface receptors have guanylate cyclase activity, and so convert GTP to cGMP, but most receptors in this family work indirectly by influencing the activity of cytoplasmic cyclases and decyclases. These cyclases and decyclases determine the cellular levels of cGMP and cAMP, which in turn control the activities of various target enzymes. The latter include protein kinase A, which is stimulated by cAMP. One of the activities of protein kinase A is to phosphorylate, and hence activate, a transcription activator called CREB. This is one of several proteins that influence the activity of a variety of genes by interacting with a second activator, p300/CBP, which is able to modify histone proteins and so affect chromatin structure and nucleosome positioning (Section 8.2.1).

As well as being activated indirectly by cAMP, p300/CBP responds to another second messenger, calcium (Chawla et al., 1998). The calcium-ion concentration is substantially lower inside than outside the cell, so proteins that open calcium channels in the cell membrane allow calcium ions to enter (Berridge et al., 2000). This can be induced by extracellular signals that activate tyrosine kinase receptors which in turn activate phospholipases that cleave phosphatidylinositol-4,5-bisphosphate, a lipid component of the inner cell membrane, into inositol-1,4,5-trisphosphate (Ins(1,4,5)P3) and 1,2-diacylglycerol (DAG). Ins(1,4,5)P3 opens calcium channels (Figure 12.12). Ins(1,4,5)P3 and DAG are themselves second messengers that can initiate other signal transduction cascades (Spiegel et al., 1996; Toker and Cantley, 1997). Both the calcium- and the lipid-induced cascades target transcription activators, but only indirectly: the primary targets are other proteins. Calcium, for example, binds to and activates the protein called calmodulin, which regulates a variety of enzyme types, including protein kinases, ATPases, phosphatases and nucleotide cyclases.

Figure 12.12. Induction of the calcium second messenger system. See the text for details. Abbreviations: DAG, 1,2-diacylglycerol; Ins(1,4,5)P3, inositol-1,4,5-trisphosphate; PtdIns(4,5)P2, phosphatidylinositol-4,5-bisphosphate.

12.2. Permanent and Semipermanent Changes in Genome Activity

Transient changes in genome activity are, by definition, readily reversible, the gene expression pattern reverting to its original state when the external stimulus is removed or replaced by a contradictory stimulus. In contrast, the permanent and semipermanent changes in genome activity that underlie cellular differentiation must persist for long periods, and ideally should be maintained even when the stimulus that originally induced them has disappeared. We therefore anticipate that the regulatory mechanisms bringing about these longer term changes will involve systems additional to the modulation of transcription activators and repressors. This expectation is correct. We will look at three mechanisms:

  • Changes resulting from physical rearrangement of the genome;
  • Changes due to chromatin structure;
  • Changes maintained by feedback loops.

12.2.1. Genome rearrangements

Changing the physical structure of the genome is an obvious, although drastic, way to bring about a permanent change in genome expression. It is not a common regulatory mechanism, but several important examples are known. These are described below.

Yeast mating types are determined by gene conversion events

Mating type is the equivalent of sex in yeasts and other eukaryotic microorganisms. Because these organisms reproduce mainly by vegetative cell division, there is the possibility that a population, being derived from just one or a few ancestral cells, will be largely or completely composed of a single mating type and so will not be able to reproduce sexually. To avoid this problem, cells are able to change sex by the process called mating-type switching. In Saccharomyces cerevisiae and some other species this switching involves a genome rearrangement called gene conversion.

The two S. cerevisiae mating types are called a and α. The mating type is specified by the MAT gene, located on chromosome III. This gene has two alleles, MATa and MATα, a haploid yeast cell displaying the mating type corresponding to whichever allele it possesses. Elsewhere on chromosome III are two additional MAT-like genes, called HMLα and HMRa (Figure 12.13). These have the same sequences as MATα and MATa respectively, but neither gene is expressed because upstream of each one is a silencer that represses transcription initiation. These two genes are called ‘silent mating-type cassettes’.

Figure 12.13. Mating-type switching in yeast. In this example, the cell begins as mating-type a. The HO endonuclease cuts the MATa locus, initiating gene conversion by the HMLα locus. The result is that the mating type switches to type α. For details (more...)

Mating-type switching is initiated by the HO endonuclease, which makes a double-stranded cut at a 24-bp sequence located within the MAT gene. This enables a gene conversion event to take place. We examine the details of gene conversion in Section 14.3.1; all that concerns us at the moment is that one of the free 3′ ends produced by the endonuclease can be extended by DNA synthesis, using one of the two silent cassettes as the template. The newly synthesized DNA subsequently replaces the DNA currently at the MAT locus. The silent cassette chosen as the template is usually the one that is different to the allele originally at MAT (Haber, 1998), so replacement with the newly synthesized strand converts the MAT gene from MATa to MATα or vice versa (see Figure 12.13). This results in mating-type switching.

The MAT gene codes for a regulatory protein that interacts with a transcription activator, MCM1, thus determining which set of genes are switched on by this factor. The MATa and MATα gene products have different effects on MCM1, and so specify different allele-specific gene expression patterns. These expression patterns are maintained in a semipermanent fashion until another MAT gene conversion occurs.
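The logic of cassette-based switching can be captured in a few lines of code. The following is a minimal sketch, not a model of the molecular mechanism: the class name `YeastChromosomeIII` and its fields are hypothetical, and it assumes that the donor is always the silent cassette that differs from the current MAT allele, whereas in the cell this is only usually the case.

```python
from dataclasses import dataclass


@dataclass
class YeastChromosomeIII:
    """Toy model of the three mating-type loci on chromosome III."""
    mat: str             # expressed locus: "a" or "alpha"
    hml: str = "alpha"   # silent cassette HMLalpha
    hmr: str = "a"       # silent cassette HMRa

    def mating_type(self) -> str:
        # Only the MAT locus is expressed; the cassettes are silenced.
        return self.mat

    def switch(self) -> None:
        """HO cut at MAT, then gene conversion from a silent cassette.

        Assumption: the donor is always the cassette that differs from
        the allele currently at MAT.
        """
        donor = self.hml if self.mat != self.hml else self.hmr
        self.mat = donor  # the cassettes are copied from, never altered


cell = YeastChromosomeIII(mat="a")
history = [cell.mating_type()]
for _ in range(3):
    cell.switch()
    history.append(cell.mating_type())
print(history)
```

Because the silent cassettes are never changed, the cell can switch back and forth indefinitely, which is why the resulting expression pattern is semipermanent rather than permanent.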

Genome rearrangements are responsible for immunoglobulin and T-cell receptor diversities

In vertebrates there are two striking examples of the use of DNA rearrangements to achieve permanent changes in genome activity. These two examples, which are very similar, are responsible for the generation of immunoglobulin and T-cell receptor diversities.

Immunoglobulins and T-cell receptors are proteins that are synthesized by B and T lymphocytes, respectively. Both types of protein become attached to the outer surfaces of their cells, and immunoglobulins are also released into the bloodstream. The proteins help to protect the body against invasion by bacteria, viruses and other unwanted substances by binding to these antigens, as they are called. During its lifetime, an organism could be exposed to any number of a vast range of antigens, which means that the immune system must be able to synthesize an equally vast range of immunoglobulin and T-cell receptor proteins. In fact, humans can make approximately 10⁸ different immunoglobulin and T-cell receptor proteins. But there are only 3.5 × 10⁴ genes in the human genome, so where do all these proteins come from?

To understand the answer we will look at the structure of a typical immunoglobulin protein. Each immunoglobulin is a tetramer of four polypeptides linked by disulfide bonds (Figure 12.14). There are two long ‘heavy’ chains and two short ‘light’ chains. When the sequences of different heavy chains are compared it becomes clear that the variability between them lies mainly in the N-terminal regions of these polypeptides, the C-terminal parts being very similar, or ‘constant’, in all heavy chains. The same is true for the light chains, except that two families, κ and λ, can be distinguished, differing in the sequences of their constant regions.

Figure 12.14. Immunoglobulin structure. Each immunoglobulin protein is made up of two heavy and two light chains, linked by disulfide bonds. Each heavy chain is 446 amino acids in length and consists of a variable region (shown in red) spanning amino acids 1–108 (more...)

In the human genome there are no complete genes for the immunoglobulin heavy and light polypeptides. Instead, these proteins are specified by gene segments. The heavy-chain segments are on chromosome 14 and comprise 11 CH gene segments, preceded by 86 VH gene segments, 30 DH gene segments and 9 JH gene segments, these last three coding for different versions of the V (variable), D (diverse) and J (joining) components of the variable part of the heavy chain (Table 12.3; Figure 12.15A). The entire heavy-chain locus stretches over several megabase pairs. A similar arrangement is seen with the light-chain loci on chromosomes 2 (κ locus) and 22 (λ locus), the only difference being that the light chains do not have D segments (Table 12.3).

Table 12.3. Immunoglobulin gene segments in the human genome.

Figure 12.15. Immunoglobulin gene segments and construction of a functional gene. (A) Organization of the human IGH locus on chromosome 14, containing gene segments for the immunoglobulin heavy chain. For a more detailed map of this region see Strachan and Read (1999). (more...)

As a B lymphocyte develops, the immunoglobulin loci in its genome undergo rearrangements. Within the heavy-chain locus, these rearrangements link one of the VH gene segments with one of the DH gene segments, then link this V-D combination with a JH gene segment, and finally attach the resulting sequence to a CH gene segment (Figure 12.15B). The end result is a complete heavy-chain gene, but one that is specific for just that one lymphocyte. A similar series of DNA rearrangements results in the lymphocyte's light-chain gene, and transcription of the two genes produces one of the 10⁸ immunoglobulins that the human body needs.
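The scale of this combinatorial mechanism can be checked with a short calculation. The sketch below multiplies the heavy-chain segment counts quoted in the text (86 VH, 30 DH and 9 JH). It assumes that any V, D and J segment can be joined with equal freedom, and it leaves out the CH segments because they code for the constant region. Light-chain segment counts are given in Table 12.3 and are not included here.

```python
from math import prod

# Heavy-chain variable-region segment counts quoted in the text.
heavy_segments = {"V_H": 86, "D_H": 30, "J_H": 9}


def combinations(segment_counts: dict) -> int:
    """One segment of each kind is chosen, so the counts multiply."""
    return prod(segment_counts.values())


heavy_vdj = combinations(heavy_segments)
print(heavy_vdj)  # 23220

# Factor by which heavy-chain V-D-J choice alone falls short of 10^8.
shortfall = 10**8 / heavy_vdj
print(round(shortfall))
```

Heavy-chain segment choice on its own therefore gives about 2.3 × 10⁴ different variable regions. Each immunoglobulin also contains an independently rearranged light chain, so the number of heavy-light combinations is the product of the two totals, which moves the figure towards the 10⁸ quoted above.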

Diversity of T-cell receptors is based on similar rearrangements which link V, D, J and C gene segments in different combinations to produce cell-specific genes. We met two small components of this system - the Tβ gene segments V28 and V29-1 - in Chapter 1 in the 50-kb segment of the human genome (see Figure 1.14) with which we began our exploration of genomes.

12.2.2. Changes in chromatin structure

Some of the effects that chromatin structure can have on gene expression were described in Section 8.2. These range from the modulation of transcription initiation at an individual promoter by nucleosome positioning, through to the silencing of large segments of DNA locked up in higher order chromatin structure. The latter is an important means of bringing about long-term changes in genome activity and is implicated in a number of regulatory events. One of these concerns the yeast mating-type loci that we looked at earlier in this section, the silencing of the HMLα and HMRa cassettes resulting mainly from these loci being buried in inaccessible chromatin in response to the influence of their upstream silencer sequences (Haber, 1998). X inactivation (Section 8.2.2) also involves the formation of inaccessible chromatin, in this case along virtually the entire length of one of the two X chromosomes in a female nucleus.

One other example of chromatin silencing merits attention. This is a system that we will meet again later in the chapter when we look at development processes in the fruit fly. It concerns the Polycomb gene family. The proteins coded by these genes bind to DNA sequences called Polycomb response elements and induce formation of heterochromatin, the condensed form of chromatin that prevents transcription of the genes that it contains (Figure 12.16). The heterochromatin nucleates around the attached Polycomb protein and then propagates along the DNA for tens of kilobases (Pirrotta, 1997). The regions that become silenced contain homeotic genes which, as we will see in Section 12.3.3, specify the development of the individual body parts of the fly. As only one body part must be specified at a particular position in the fruit fly, it is important that a cell expresses only the correct homeotic gene. This is ensured by the action of Polycomb, which permanently silences the homeotic genes that must be switched off. An important point is that the heterochromatin induced by Polycomb is heritable: after division, the two new cells retain the heterochromatin established in the parent cell. This type of regulation of genome activity is therefore permanent not only in a single cell, but also in a cell lineage.

Figure 12.16. Polycomb silences regions of the Drosophila genome by initiating heterochromatin formation.

12.2.3. Genome regulation by feedback loops

The final mechanism that we will consider for bringing about long-term changes in genome activity involves the use of a feedback loop. In this system a regulatory protein activates its own transcription so that once its gene has been switched on, it is expressed continuously (Figure 12.17). A number of examples of this type of feedback regulation are known:

  • The MyoD transcription activator, which is involved in muscle development, is one of the best understood examples of cellular differentiation in vertebrates. A cell becomes committed to becoming a muscle cell when it begins to express the myoD gene. The product of this gene is a transcription activator that targets a number of other genes coding for muscle-specific proteins, such as actin and myosin, and is also indirectly responsible for one of the key features of muscle cells - the absence of a normal cell cycle, these cells being stopped in the G1 phase (Section 13.3.1). The MyoD protein also binds upstream of myoD, ensuring that its own gene is continuously expressed. The result of this positive-feedback loop is that the cell continues to synthesize the muscle-specific proteins and remains a muscle cell. The differentiated state is heritable because cell division is accompanied by transmission of MyoD to the daughter cells, ensuring that these are also muscle cells.
  • Deformed of Drosophila is one of several proteins coded by homeotic selector genes and is responsible for specifying segment identity in the fruit fly (Section 12.3.3). The Deformed (Dfd) protein is responsible for the identity of the head segments. To perform this function, Dfd must be continuously expressed in the relevant cells. This is achieved by a feedback system, Dfd binding to an enhancer located upstream of the Dfd gene (Regulski et al., 1991). Feedback autoregulation also controls the expression of at least some homeotic selector genes of vertebrates (Popperl et al., 1995).
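The way in which a positive-feedback loop converts a transient stimulus into a lasting change can be illustrated numerically. The sketch below is a generic toy model, not a description of MyoD or Dfd: it assumes that a regulatory protein activates its own synthesis with a sigmoidal (Hill-type) dependence and is lost at a rate proportional to its level, and every parameter value is an arbitrary choice made for illustration.

```python
def simulate(feedback: bool, t_end: float = 30.0, dt: float = 0.01) -> float:
    """Return the final level of a regulatory protein after a brief stimulus.

    All parameter values are arbitrary illustrative assumptions.
    """
    beta, K, n = 1.0, 0.5, 4   # self-activation strength, threshold, steepness
    gamma = 1.0                # first-order loss (degradation or dilution)
    x = 0.0                    # protein level; the gene starts switched off
    for i in range(round(t_end / dt)):
        t = i * dt
        # The external stimulus is present only between t = 5 and t = 10.
        stimulus = 0.5 if 5.0 <= t < 10.0 else 0.0
        # With feedback, the protein drives its own synthesis.
        auto = beta * x**n / (K**n + x**n) if feedback else 0.0
        x += dt * (stimulus + auto - gamma * x)   # simple Euler step
    return x


with_loop = simulate(feedback=True)
without_loop = simulate(feedback=False)
print(f"with feedback: {with_loop:.2f}, without feedback: {without_loop:.2f}")
```

With the loop in place the protein level stays high long after the stimulus has gone, whereas without it the level falls back to zero. This is the behavior expected of a gene that, once switched on, is expressed continuously.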

Figure 12.17. Feedback regulation of gene expression.

Box 12.1. Unraveling a signal transduction pathway. A typical set of experiments for studying the functions of proteins involved in a signal transduction pathway. One of the most important extracellular signaling compounds is transforming growth factor-β (more...)

12.3. Regulation of Genome Activity During Development

The developmental pathway of a multicellular eukaryote begins with a fertilized egg cell and ends with an adult form of the organism. In between lies a complex series of genetic, cellular and physiological events that must occur in the correct order, in the correct cells, and at the appropriate times if the pathway is to reach a successful culmination. With humans, this developmental pathway results in an adult containing 10¹³ cells differentiated into approximately 250 specialized types, the activity of each individual cell coordinated with that of every other cell. Developmental processes of such complexity might appear intractable, even to the powerful investigative tools of modern molecular biology, but remarkably good progress towards understanding them has been made in recent years. The research that has underpinned this progress has been designed around three guiding principles:

  • It should be possible to describe and comprehend the genetic and biochemical events that underlie differentiation of individual cell types. This in turn means that an understanding of how specialized tissues, and even complex body parts, are constructed should be within reach.
  • The signaling processes that coordinate events in different cells should be amenable to study. We saw in Section 12.1 that a start is being made to describing these systems at the molecular level.
  • There should be similarities and parallels between developmental processes in different organisms, reflecting common evolutionary origins. This means that information relevant to human development can be obtained from studies of model organisms chosen for the relative simplicity of their developmental pathways.

Developmental biology encompasses areas of genetics, molecular biology, cell biology, physiology and biochemistry. We are concerned only with the role of the genome in development and so will not attempt a wide-ranging overview of developmental research in all its guises. Instead we will concentrate on three model systems of increasing complexity in order to investigate the types of change in genome activity that occur during development.

12.3.1. Sporulation in Bacillus

The first developmental pathway that we will examine is formation of spores by the bacterium Bacillus subtilis (Grossman, 1995; Errington, 1996; Stragier and Losick, 1996). Strictly speaking, this is not a developmental pathway, merely a type of cellular differentiation, but the process illustrates two of the fundamental issues that have to be addressed when genuine development in multicellular organisms is studied. These issues are how a series of changes in genome activity over time is controlled, and how signaling establishes coordination between events occurring in different cells. The advantages of Bacillus as a model system are that it is easy to grow in the laboratory and is amenable to study by genetic and molecular biology techniques such as analysis of mutants and sequencing of genes.

Sporulation involves coordinated activities in two distinct cell types

Bacillus is one of several genera of bacteria that produce endospores in response to unfavorable environmental conditions. These spores are highly resistant to physical and chemical abuse and can survive for decades or even centuries - the possibility of infection with anthrax spores produced by B. anthracis is taken very seriously by archaeologists excavating sites containing human and animal remains. Resistance is due partly to the specialized nature of the spore coat, which is impermeable to many chemicals, and partly to biochemical changes that retard the decay of DNA and other polymers and enable the spore to survive a prolonged period of dormancy.

In the laboratory, sporulation is usually induced by nutrient starvation. This causes the bacteria to abandon their normal vegetative mode of cell division, which involves synthesis of a septum (or cross-wall) in the center of the cell. Instead the cells construct an unusual septum, one that is thinner than normal, at one end of the cell (Figure 12.18). This produces two cellular compartments, the smaller of which is called the prespore and the larger the mother cell. As sporulation proceeds, the prespore becomes entirely engulfed by the mother cell. By now the two cells are committed to different but coordinated differentiation pathways, the prespore undergoing the biochemical changes that enable it to become dormant, and the mother cell constructing the resistant coat around the spore and eventually dying.

Figure 12.18. Sporulation in Bacillus subtilis. The top part of the diagram shows the normal vegetative mode of cell division, involving formation of a septum across the center of the bacterium and resulting in two identical daughter cells. The lower part of the diagram (more...)

Special σ subunits control genome activity during sporulation

Changes in genome activity during sporulation are controlled largely by the synthesis of special σ subunits that change the promoter specificity of the Bacillus RNA polymerase. Recall that the σ subunit is the part of the RNA polymerase that recognizes the bacterial promoter sequence, and that replacement of one σ subunit with another with a different DNA-binding specificity can result in a different set of genes being transcribed (Section 9.3.1). We have seen how this simple control system is used by E. coli in response to heat stress (see Figure 9.23). It is also the key to the changes in genome activity that occur during sporulation.

The standard B. subtilis σ subunits are called σA and σH. These subunits are synthesized in vegetative cells and enable the RNA polymerase to recognize promoters for all the genes it needs to transcribe in order to maintain normal growth and cell division. In the prespore and mother cell these subunits are replaced by σF and σE, respectively, which recognize different promoter sequences and so result in large-scale changes in gene expression patterns. The master switch from vegetative growth to spore formation is provided by a protein called SpoOA, which is present in vegetative cells but in an inactive form. This protein is activated by phosphorylation, the protein kinases that phosphorylate it responding to various extracellular signals that indicate the presence of an environmental stress such as lack of nutrients (Sonenshein, 2000). Activated SpoOA is a transcription factor that modulates the expression of various genes transcribed by the vegetative RNA polymerase and hence recognized by the regular σA and σH subunits. The genes that are switched on include those for σF and σE, resulting in the switch to prespore and mother cell differentiation (Figure 12.19).

Figure 12.19. Role of SpoOA in Bacillus sporulation. SpoOA is phosphorylated in response to extracellular signals derived from environmental stresses. It is a transcription activator with roles that include activation of the genes for the σE and σF (more...)

Initially, both σF and σE are present in each of the two differentiating cells. This is not exactly what is wanted because σF is the prespore-specific subunit and so should be active only in this cell, and σE is mother-cell specific. A means is therefore needed of activating or inactivating the appropriate subunit in the correct cell. This is thought to be achieved as follows (Figure 12.20; Errington, 1996):

  • σF is activated by release from a complex with a second protein, SpoIIAB. This is controlled by a third protein, SpoIIAA, which, when unphosphorylated, can also attach to SpoIIAB and prevent the latter from binding to σF. If SpoIIAA is unphosphorylated then σF is released and is active; when SpoIIAA is phosphorylated then σF remains bound to SpoIIAB and so is inactive. In the mother cell, SpoIIAB phosphorylates SpoIIAA and so keeps σF in its bound inactive state. But in the prespore, SpoIIAB's attempts to phosphorylate SpoIIAA are antagonized by yet another protein, SpoIIE, and so σF is released and becomes active. SpoIIE's ability to antagonize SpoIIAB in the prespore but not the mother cell derives from the fact that SpoIIE molecules are bound to the membrane on the surface of the septum. Because the prespore is much smaller than the mother cell, but the septum surface area is similar in both, the concentration of SpoIIE is greater in the prespore, and this enables it to antagonize SpoIIAB.
  • σE is activated by proteolytic cleavage of a precursor protein. The protease that carries out this cleavage is the SpoIIGA protein, which spans the septum between the prespore and mother cell. The protease domain, which is on the mother-cell side of the septum, is activated by binding of SpoIIR to a receptor domain on the prespore side. It is a typical receptor-mediated signal transduction system (Section 12.1.2). SpoIIR is one of the genes whose promoter is recognized specifically by σF, so activation of the protease, and conversion of pre-σE to active σE, occurs once σF-directed transcription is underway in the prespore.
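The two activation schemes just described reduce to a small set of logical rules, sketched below. Only the logic is taken from the text. The amount of SpoIIE, the compartment volumes and the threshold are hypothetical numbers chosen so that the smaller prespore ends up with the higher SpoIIE concentration, and the function names are invented for illustration.

```python
# All numbers below are hypothetical; only the logic follows the text.
SPOIIE_PER_SEPTUM_FACE = 100.0                    # same septum area faces both cells
VOLUME = {"prespore": 1.0, "mother cell": 10.0}   # the prespore is much smaller
SPOIIE_THRESHOLD = 50.0                           # level needed to antagonize SpoIIAB


def sigma_f_active(compartment: str) -> bool:
    """sigma-F is free (active) when SpoIIAA stays unphosphorylated."""
    spoiie_conc = SPOIIE_PER_SEPTUM_FACE / VOLUME[compartment]
    spoiiaa_phosphorylated = spoiie_conc < SPOIIE_THRESHOLD
    # Phosphorylated SpoIIAA cannot displace sigma-F from SpoIIAB.
    return not spoiiaa_phosphorylated


def sigma_e_active_in_mother_cell(prespore_sigma_f: bool) -> bool:
    """Pre-sigma-E is cleaved once SpoIIR from the prespore activates SpoIIGA."""
    spoiir_made = prespore_sigma_f        # the SpoIIR gene needs sigma-F
    spoiiga_protease_on = spoiir_made     # SpoIIR binds the receptor domain
    return spoiiga_protease_on


f_state = {c: sigma_f_active(c) for c in VOLUME}
e_mother = sigma_e_active_in_mother_cell(f_state["prespore"])
print(f_state, e_mother)
```

The sketch makes the dependency explicit: σE activation in the mother cell cannot precede σF activation in the prespore, because the signal that switches on the protease is itself a product of σF-directed transcription.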

Figure 12.20. Activation of the prespore- and mother-cell-specific σ subunits during Bacillus sporulation. (A) In the mother cell, σF is inactive because it is bound to SpoIIAB, which phosphorylates SpoIIAA and prevents the latter releasing σF. (more...)

Activation of σF and σE is just the beginning of the story. In the prespore, about 1 hour after its activation, σF responds to an unknown signal (possibly from the mother cell) which results in a slight change in genome activity in the spore. This includes transcription of a gene for another σ subunit, σG, which recognizes promoters upstream of genes whose products are required during the later stages of spore differentiation. One of these proteins is SpoIVB, which activates another septum-bound protease, SpoIVF (Figure 12.21). This protease then activates a second mother cell σ subunit, σK, which is coded by a σE-transcribed gene but retained in the mother cell in an inactive form until the signal for its activation is received from the prespore. σK directs transcription of the genes whose products are needed during the later stages of the mother-cell differentiation pathway.

Figure 12.21. Activation of σK during Bacillus sporulation. See the text for details. Note that the scheme is very similar to the procedure used to activate σE (see Figure 12.20B). Abbreviations: G, σG; K, σK; IVB, SpoIVB; IVF, SpoIVF. (more...)

To summarize, the key features of Bacillus sporulation are as follows:

  • The master protein, SpoOA, responds to external stimuli to determine if and when the switch to sporulation should occur.
  • A cascade of σ subunits in prespore and mother cell brings about time-dependent changes in genome activity in the two cells.
  • Cell-cell signaling ensures that the events occurring in prespore and mother cell are coordinated.
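The time-dependent character of the cascade follows directly from the dependencies between its components, and these can be written down as a graph. The sketch below is one possible encoding: the dependency table is a simplified reading of the text (for example, "SpoIVF" stands for the activated protease), the ASCII names are stand-ins for the σ symbols, and the standard-library `graphlib` module (Python 3.9 or later) is used to derive an order of events consistent with the dependencies.

```python
from graphlib import TopologicalSorter

# Each key can become active only after everything in its set is active.
# This table is a simplified reading of the text, not a complete model.
requires = {
    "SpoOA-P": set(),                     # phosphorylated master regulator
    "sigmaF": {"SpoOA-P"},
    "pre-sigmaE": {"SpoOA-P"},
    "SpoIIR": {"sigmaF"},                 # gene transcribed with sigma-F
    "sigmaE": {"pre-sigmaE", "SpoIIR"},   # cleavage needs the SpoIIR signal
    "sigmaG": {"sigmaF"},
    "SpoIVB": {"sigmaG"},
    "pre-sigmaK": {"sigmaE"},             # coded by a sigma-E-transcribed gene
    "SpoIVF": {"SpoIVB"},                 # protease activated by SpoIVB
    "sigmaK": {"pre-sigmaK", "SpoIVF"},
}

order = list(TopologicalSorter(requires).static_order())
print(" -> ".join(order))
```

Several orders satisfy the same dependencies, so the printed sequence is only one valid possibility. What is fixed is that σF precedes σE and σG, and that σK comes last because it needs inputs from both compartments.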

12.3.2. Vulva development in Caenorhabditis elegans

B. subtilis is a unicellular organism and, although sporulation involves the coordinated differentiation of two cell types, it can hardly be looked upon as comparable to the developmental processes that occur in multicellular organisms. Sporulation provides pointers to the general ways in which genome activity might be regulated during the development of a multicellular organism, but it does not indicate the specific events to expect. We therefore need to examine development in a simple multicellular eukaryote.

C. elegans is a model for multicellular eukaryotic development

Research with the microscopic nematode worm C. elegans (Figure 12.22) was initiated by Sydney Brenner in the 1960s with the aim of utilizing it as a simple model for multicellular eukaryotic development. C. elegans is easy to grow in the laboratory and has a short generation time, measured in days but still convenient for genetic analysis. The worm is transparent at all stages of its life cycle, so internal examination is possible without killing the animal. This is an important point because it has enabled researchers to follow the entire developmental process of the worm at the cellular level. Every cell division in the pathway from fertilized egg to adult worm has been charted, and every point at which a cell adopts a specialized role has been identified. In addition, the complete connectivity of the 302 cells that comprise the nervous system of the worm has been mapped.

Figure 12.22. The nematode worm Caenorhabditis elegans. The micrograph shows an adult hermaphrodite worm, approximately 1 mm in length. The vulva is the small projection located on the underside of the animal, about halfway along. Egg cells can be seen inside the worm's (more...)

The genome of C. elegans is relatively small, just 97 Mb (see Table 2.1), and the entire sequence is known (CESC, 1998). Analysis of the sequence, using many of the techniques described in Chapter 7, is beginning to assign functions to the unknown genes and to establish links between genome activity and the developmental pathways. The objective is a complete genetic description of development in C. elegans, a goal that is attainable in the not-too-distant future.

Determination of cell fate during development of the C. elegans vulva

A critical feature that underpins the usefulness of C. elegans as a tool for research is the fact that its development is more or less invariant: the pattern of cell division and differentiation is virtually the same in every individual. This appears to be due in large part to cell-cell signaling, which induces each cell to follow its appropriate differentiation pathway. To illustrate this we will look at development of the C. elegans vulva (Sharma-Kishore et al., 1999).

Most C. elegans worms are hermaphrodites, meaning that they have both male and female sex organs. The vulva is part of the female sex apparatus, being the tube through which sperm enter and fertilized eggs are laid. The adult vulva comprises 22 cells which are the progeny of three ancestral cells originally located in a row on the undersurface of the developing worm (Figure 12.23). Each of these ancestral cells becomes committed to the differentiation pathway that leads to production of vulva cells. The central cell, called P6.p, adopts the ‘primary vulva cell fate’ and divides to produce eight new cells. The other two cells - P5.p and P7.p - take on the ‘secondary vulva cell fate’ and divide into seven cells each. These 22 cells then reorganize their positions to construct the vulva (Sulston and Horvitz, 1977).

Figure 12.23. Cell divisions resulting in production of the vulva cells of Caenorhabditis elegans. Three ancestral cells divide in a programmed manner to produce 22 progeny cells, which re-organize their positions relative to one another to construct the vulva.

A critical aspect of vulva development is that it must occur in the correct position relative to the gonad, the structure containing the egg cells. If the vulva develops in the wrong place then the gonad will not receive sperm and the egg cells will never be fertilized. The positional information needed by the vulva progenitor cells is provided by a cell within the gonad called the anchor cell (Figure 12.24). The importance of the anchor cell has been demonstrated by experiments in which it is artificially destroyed in the embryonic worm: in the absence of the anchor cell, a vulva does not develop. The implication is that the anchor cell secretes an extracellular signaling compound that induces P5.p, P6.p and P7.p to differentiate. This signaling compound is the protein called LIN-3, coded by the lin-3 gene (Hill and Sternberg, 1992).

Figure 12.24. The postulated role of the anchor cell in determining cell fate during vulva development in Caenorhabditis elegans. It is thought that release of the signaling compound LIN-3 by the anchor cell commits P6.p (shown in blue), the cell closest to the anchor (more...)

Why does P6.p adopt the primary cell fate whereas P5.p and P7.p take on secondary cell fates? There are two possibilities. The first is that LIN-3 forms a concentration gradient and therefore has different effects on P6.p, the cell which is closest to it, and the more distant P5.p and P7.p, as shown in Figure 12.24. Evidence in favor of this idea comes from studies showing that isolated cells adopt the secondary fate when exposed to low levels of LIN-3 (Katz et al., 1995). Alternatively, the signal that commits P5.p and P7.p to their secondary fates might not come directly from the anchor cell but via P6.p in the form of a different extracellular signaling compound whose synthesis by P6.p is switched on by LIN-3 activation (Kornfeld, 1997). This hypothesis is supported by the abnormal features displayed by certain mutants in which more than three cells become committed to vulva development. With these mutants there is more than one primary cell, but each one is invariably surrounded by two secondary cells, suggesting that in the living worm adoption of the secondary cell fate is dependent on the presence of an adjacent primary cell.

There are other instructive features of vulva development in C. elegans. The first is that the signaling process that commits P6.p to its primary cell fate has many similarities with the MAP kinase signal transduction system of vertebrates (see Figure 12.10). The cell surface receptor for LIN-3 is a protein kinase called LET-23 (Aroian et al., 1990) which, when activated by binding LIN-3, initiates a series of intracellular reactions that leads to activation of a MAP-kinase-like protein, which in turn switches on a variety of transcription activators (Sternberg and Han, 1998). Unfortunately the target genes have not yet been delineated in either the primary or secondary vulva progenitor cells, but the system is open to study.

A second noteworthy feature is that as well as the activation signal provided by the anchor cell in the form of LIN-3, the vulva progenitor cells are also subject to the deactivating effects of a second signaling compound secreted by the hypodermal cell, a multinuclear sheath that surrounds most of the worm's body. This repressive signal is overcome by the positive signals that induce P5.p, P6.p and P7.p to differentiate, but prevents the unwanted differentiation of three adjacent cells, P3.p, P4.p and P8.p, each of which can become committed to vulva development if the repressive signal malfunctions, for example in a mutant worm.

In summary, the general concepts to emerge from vulva development in C. elegans are as follows:

  • In a multicellular organism, positional information is important: the correct structure must develop at the appropriate place.
  • The commitment to differentiation of a small number of progenitor cells can lead to construction of a multicellular structure.
  • Cell-cell signaling can utilize a concentration gradient to induce different responses in cells at different positions relative to the signaling cell.
  • A cell might be subject to competitive signaling, where one signal tells it to do one thing and a second signal tells it to do the opposite.

Box 12.2

The link between genome replication and sporulation in Bacillus. A combination of genetic and biochemical studies have shown how sporulation in Bacillus subtilis is coordinated with genome replication. A prerequisite for sporulation in B. subtilis is (more...)

12.3.3. Development in Drosophila melanogaster

The last organism whose development we will study is Drosophila melanogaster. The experimental history of the fruit fly dates back to 1910 when Thomas Hunt Morgan first used this organism as a model system in genetic research. For Morgan the advantages of Drosophila were its small size, enabling large numbers to be studied in a single experiment, its minimal nutritional requirements (the flies like bananas), and the presence in natural populations of occasional variants with easily recognized genetic characteristics such as unusual eye colors. Morgan was not aware of two further advantages: a small genome (180 Mb; see Table 2.1) and the fact that gene isolation is aided by the presence in the salivary glands of ‘giant’ chromosomes. These are made up of multiple copies of the same DNA molecule laid side by side, and display banding patterns that can be correlated with the physical map of each chromosome to pinpoint the positions of desired genes. But Morgan did foresee that Drosophila might become an important organism for developmental research, a topic that interested him as much as it interests us today.

The major contribution that Drosophila has made to our understanding of development has been through the insights it has provided into how an undifferentiated embryo acquires positional information that eventually results in the construction of complex body parts at the correct places in the adult organism. Although in some respects Drosophila is quite unusual in its embryonic organization (as we will see in the next section), the genetic mechanisms that specify the fly's body plan are similar to those in other organisms, including humans. Knowledge gained from Drosophila has therefore directed research into areas of human development that for a long time were thought to be inaccessible. To explore this story we must start with the events that occur in the developing Drosophila embryo.

Maternal genes establish protein gradients in the Drosophila embryo

The unusual feature of the early Drosophila embryo is that it is not made up of many cells, as in most organisms, but instead is a single syncytium comprising a mass of cytoplasm and multiple nuclei (Figure 12.25). This structure persists until successive rounds of nuclear division have produced some 1500 nuclei: only then do individual uninucleate cells start to appear around the outside of the syncytium, producing the structure called the blastoderm. Positional information begins to be established before the blastoderm stage is reached.

Figure 12.25. Early development of the Drosophila embryo. To begin with, the embryo is a single syncytium containing a gradually increasing number of nuclei. These nuclei migrate to the periphery of the embryo after about 2 hours, and within another 30 minutes cells (more...)

Initially the positional information that the embryo needs is a definition of which end is the front (anterior) and which the back (posterior), as well as similar information relating to up (dorsal) and down (ventral). This information is provided by concentration gradients of proteins that become established in the syncytium. The bulk of these proteins are not synthesized from genes in the embryo, but are translated from mRNAs injected into the embryo by the mother. To see how these maternal-effect genes work we will examine the synthesis of Bicoid, one of the four proteins involved in determining the anterior-posterior axis.

The bicoid gene is transcribed in the maternal nurse cells, which are in contact with the egg cells, and the mRNA is injected into the anterior end of the unfertilized egg. This position is defined by the orientation of the egg cell in the egg chamber. The bicoid mRNA remains in the anterior region of the egg cell, attached by its 3′ untranslated region to the cell's cytoskeleton. It is not translated immediately, probably because its poly(A) tail is too short. This is inferred because translation, which occurs after fertilization of the egg, is preceded by extension of the poly(A) tail through the combined efforts of the Cortex, Grauzone and Staufen proteins, all of which are synthesized from genes in the egg. Bicoid protein then diffuses through the syncytium, setting up a concentration gradient, highest at the anterior end and lowest at the posterior end (Figure 12.26).

Figure 12.26. Establishment of the anterior-posterior axis in a Drosophila embryo. The anterior-posterior axis is established by gradients of Bicoid, Nanos, Caudal and Hunchback proteins, as described in the text. In this diagram, the concentration gradients are indicated (more...)

Three other maternal-effect gene products are also involved in setting up the anterior-posterior gradient. These are the Hunchback, Nanos and Caudal proteins. All are injected as mRNAs into the anterior region of the unfertilized egg. The nanos mRNA is transported to the posterior part of the egg and attached to the cytoskeleton while it awaits translation. The hunchback and caudal mRNAs become distributed evenly through the cytoplasm, but their proteins subsequently form gradients through the action of Bicoid and Nanos:

  • Bicoid activates the hunchback gene in the embryonic nuclei and represses translation of the maternal caudal mRNA, increasing the concentration of the Hunchback protein in the anterior region and decreasing that of Caudal.
  • Nanos represses translation of hunchback mRNA, contributing further to the anterior-posterior gradient of the Hunchback protein.

The net result is a gradient of Bicoid and Hunchback, greater at the anterior end, and of Nanos and Caudal, greater at the posterior end (see Figure 12.26). The gradient is supplemented with Torso protein, another maternal-effect gene product, which accumulates at the extreme anterior and posterior ends. Similar events result in a dorsal-to-ventral gradient, predominantly of the protein called Dorsal.
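
The way in which these four gradients depend on one another can be sketched numerically. The following is a minimal toy model, not a quantitative description of the embryo: it assumes exponential Bicoid and Nanos profiles, simple linear repression and activation terms, and an arbitrary decay length. The function name and all parameter values are hypothetical, and Torso and the dorsal-to-ventral gradient are left out.

```python
import math


def maternal_gradients(n_points=11, decay_length=0.25):
    """Toy protein levels along the axis, from anterior (x = 0) to posterior (x = 1)."""
    profiles = []
    for i in range(n_points):
        x = i / (n_points - 1)
        # Bicoid mRNA is anchored at the anterior end, so the protein is
        # assumed to fall off exponentially toward the posterior.
        bicoid = math.exp(-x / decay_length)
        # Nanos mRNA is held at the posterior end, giving the mirror-image profile.
        nanos = math.exp(-(1.0 - x) / decay_length)
        # Hunchback: evenly distributed maternal mRNA whose translation is
        # repressed by Nanos, plus expression in the nuclei activated by Bicoid.
        hunchback = 0.5 * (1.0 - nanos) + 0.5 * bicoid
        # Caudal: evenly distributed mRNA whose translation is repressed by Bicoid.
        caudal = 1.0 - bicoid
        profiles.append({
            "x": x,
            "Bicoid": bicoid,
            "Nanos": nanos,
            "Hunchback": hunchback,
            "Caudal": caudal,
        })
    return profiles


if __name__ == "__main__":
    for row in maternal_gradients():
        print("  ".join(f"{name}={value:.2f}" for name, value in row.items()))
```

In this sketch Bicoid and Hunchback are highest at the anterior end and Nanos and Caudal are highest at the posterior end, which is the qualitative pattern described above.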

A cascade of gene expression converts positional information into a segmentation pattern

The body plan of the adult fly, as well as that of the larva, is built from a series of segments, each with a different structural role. This is clearest in the thorax, which has three segments, each carrying one pair of legs, and the abdomen, which is made up of eight segments, but is also true for the head, even though in the head the segmented structure is less visible (Figure 12.27). The objective of embryo development is therefore production of a young larva with the correct segmentation pattern.

Figure 12.27. The segmentation pattern of the adult Drosophila melanogaster. Note that the head is also segmented, but the pattern is not easily discernible from the morphology of the adult fly. Reprinted with permission from Lewis EB, Nature, 276, 565. Copyright 1978 (more...)

The gradients established in the embryo by the maternal-effect gene products are the first stage in formation of the segmentation pattern. These gradients provide the interior of the embryo with a basic amount of positional information, each point in the syncytium now having its own unique chemical signature defined by the relative amounts of the various maternal-effect gene products. This positional information is made more precise by expression of the gap genes.

Three of the anterior-posterior gradient proteins - Bicoid, Hunchback and Caudal - are transcription activators that target the gap genes in the nuclei that now line the inside of the embryo (see Figure 12.25). The identities of the gap genes expressed in a particular nucleus depend on the relative concentrations of the gradient proteins and hence on the position of the nucleus along the anterior-posterior axis. Some gap genes are activated directly by Bicoid, Hunchback and Caudal, examples being buttonhead, empty spiracles and orthodenticle, which are activated by Bicoid. Other gap genes are switched on indirectly, as is the case with huckebein and tailless, which respond to transcription activators that are switched on by Torso. There are also repressive effects (e.g. Bicoid represses expression of knirps) and the gap gene products regulate their own expression in various ways. This complex interplay results in the positional information in the embryo, now carried by the relative concentrations of the gap gene products, becoming more detailed (Figure 12.28).
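
The idea that a nucleus reads the local protein concentrations and switches on a corresponding set of gap genes can be expressed as a simple rule table. The sketch below is a deliberate oversimplification under stated assumptions: the threshold values and the function name are hypothetical, only Bicoid and Torso are considered as inputs, and the regulation of gap genes by one another is ignored.

```python
def gap_genes_expressed(bicoid, torso,
                        bicoid_activation=0.5,
                        bicoid_repression=0.2,
                        torso_activation=0.5):
    """Return the gap genes switched on in one nucleus (toy thresholds).

    bicoid and torso are relative protein levels between 0 and 1 at the
    position of the nucleus.
    """
    expressed = set()
    # Genes described in the text as being activated by Bicoid.
    if bicoid >= bicoid_activation:
        expressed.update({"buttonhead", "empty spiracles", "orthodenticle"})
    # tailless responds (indirectly) to Torso, which is found at both ends.
    if torso >= torso_activation:
        expressed.add("tailless")
    # knirps is repressed by Bicoid, so here it is on only where Bicoid is low.
    if bicoid < bicoid_repression:
        expressed.add("knirps")
    return expressed


if __name__ == "__main__":
    print(sorted(gap_genes_expressed(bicoid=0.9, torso=0.0)))
    print(sorted(gap_genes_expressed(bicoid=0.05, torso=0.0)))
```

Two nuclei with different Bicoid levels therefore express different gene sets, which is the sense in which the gap genes refine the positional information supplied by the maternal gradients.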

Figure 12.28. The role of the gap gene products in conferring positional information during embryo development in Drosophila melanogaster. As in Figure 12.26, the concentration gradient of each gap gene product is denoted by the colored bars. The parts of the embryo (more...)

The next set of genes to be activated, the pair-rule genes, establish the basic segmentation pattern. Transcription of these genes responds to the relative concentrations of the gap gene products and occurs in nuclei that have become enclosed in cells. The pair-rule gene products therefore do not diffuse through the syncytium but remain localized within the cells that express them. The result is that the embryo can now be looked upon as comprising a series of stripes, each stripe consisting of a set of cells expressing a particular pair-rule gene. In a further round of gene activation, the segment polarity genes become switched on, providing greater definition to the stripes by setting the sizes and precise locations of what will eventually be the segments of the larval fly. Gradually we have converted the imprecise positional information of the maternal-effect gradients into a sharply defined segmentation pattern.

Segment identity is determined by the homeotic selector genes

The pair-rule and segment polarity genes establish the segmentation pattern of the embryo but do not determine the identities of the individual segments. This is the job of the homeotic selector genes, which were first discovered by virtue of the extravagant effects that mutations in these genes have on the appearance of the adult fly. The antennapedia mutation, for example, transforms the head segment that usually produces an antenna into one that makes a leg, so the mutant fly has a pair of legs where its antennae should be. The early geneticists were fascinated by these monstrous homeotic mutants and many were collected during the first few decades of the 20th century.

Genetic mapping of homeotic mutations has revealed that the selector genes are clustered in two groups on chromosome 3. These clusters are called the Antennapedia complex (ANT-C), which contains genes involved in determination of the head and thorax segments, and the Bithorax complex (BX-C), which contains genes for the abdomen segments (Figure 12.29). Some additional non-selector development genes, such as bicoid, are also located in ANT-C. One interesting feature of the ANT-C and BX-C clusters, which is still not understood, is that the order of genes corresponds to the order of the segments in the fly, the first gene in ANT-C being labial, which controls the most anterior segment of the fly, and the last gene in BX-C being Abdominal B, which specifies the most posterior segment.

Figure 12.29. The Antennapedia and Bithorax gene complexes of Drosophila melanogaster. Both complexes are located on the fruit-fly chromosome 3, ANT-C upstream of BX-C. The genes are usually drawn in the order shown, although this means that they are transcribed from (more...)

The correct selector gene is expressed in each segment because the activation of each one is responsive to the positional information represented by the distributions of the gap and pair-rule gene products. The selector gene products are themselves transcription activators, each containing a homeodomain version of the helix-turn-helix DNA-binding structure (Section 9.1.4). Each selector gene product, possibly in conjunction with a coactivator such as Extradenticle, switches on the set of genes needed to initiate development of the specified segment. Maintenance of the differentiated state is ensured partly by the repressive effect that each selector gene product has on expression of the other selector genes, and partly by the work of Polycomb which, as we saw in Section 12.2.2, constructs inactive chromatin over the selector genes that are not expressed in a particular cell (Pirrotta, 1997).
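
The contribution that mutual repression makes to maintaining the differentiated state can be illustrated with a generic two-gene model. This is a sketch under strong assumptions and not a model of any particular pair of selector genes: the rate equations, the parameter values, and the function name are all hypothetical, and the role of Polycomb is not included.

```python
def simulate_mutual_repression(a_start, b_start, steps=5000, dt=0.01,
                               max_rate=4.0, hill=2):
    """Euler integration of two genes whose products repress each other.

    Each product is made at a rate that falls as the other product rises
    and is removed at a rate proportional to its own level.
    """
    a, b = a_start, b_start
    for _ in range(steps):
        change_a = max_rate / (1.0 + b ** hill) - a
        change_b = max_rate / (1.0 + a ** hill) - b
        a += dt * change_a
        b += dt * change_b
    return a, b


if __name__ == "__main__":
    # A small initial bias toward gene A, standing in for positional
    # information, is amplified into a stable "A on, B off" state.
    print(simulate_mutual_repression(1.0, 0.5))
    # The opposite bias gives the opposite stable state.
    print(simulate_mutual_repression(0.5, 1.0))
```

With these assumed parameters, whichever gene starts with the higher level ends up strongly expressed while the other is held at a low level, so that the initial choice persists after the signal that caused it has gone.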

Box 12.1

The genetic basis of flower development. Developmental processes in plants are, in most respects, very different from those of fruit flies and other animals, but at the genetic level there are certain similarities, sufficient for the knowledge gained (more...)

Homeotic selector genes are universal features of higher eukaryotic development

The homeodomains of the various Drosophila selector genes are strikingly similar. This observation led researchers in the 1980s to search for other homeotic genes by using the homeodomain as a probe in hybridization experiments. First, the Drosophila genome was searched, resulting in isolation of several previously unknown homeodomain-containing genes. These have turned out not to be selector genes but other types of gene coding for transcription activators involved in development. Examples include the pair-rule genes even-skipped and fushi tarazu, and the segment polarity gene engrailed.

The real excitement came when the genomes of other organisms were probed and it was realized that homeodomains are present in genes in a wide variety of animals, including humans. Even more unexpected was the discovery that some of the homeodomain genes in these other organisms are homeotic selector genes organized into clusters similar to ANT-C and BX-C, and that these genes have equivalent functions to the Drosophila versions, specifying construction of the body plan.

We now look on the ANT-C and BX-C clusters of selector genes in Drosophila as two parts of a single complex, usually referred to as the homeotic gene complex or HOM-C. In vertebrates there are four homeotic gene clusters, called HoxA to HoxD. When these four clusters are aligned with one another and with HOM-C (Figure 12.30) similarities are seen between the genes at equivalent positions, such that the evolutionary history of the homeotic selector gene clusters can be traced from insects through to humans (see Section 15.2.1). The genes in the vertebrate clusters specify the development of body structures and, as in Drosophila, the order of genes reflects the order of these structures in the adult body plan. This is clearly seen with the mouse HoxB cluster, which controls development of the nervous system (Figure 12.31).

Figure 12.30. Comparison between the Drosophila HOM-C gene complex and the four Hox clusters of vertebrates. Genes that code for proteins with related structures and functions are indicated by the colors. For more details on the evolution of the Hox clusters, see Section (more...)

Figure 12.31. Specification of the mouse nervous system by selector genes of the HoxB cluster. The nervous system is shown schematically and the positions specified by the individual HoxB genes (HoxB1 to HoxB9) indicated by the red bars. The components of the nervous (more...)

The remarkable conclusion is that, at this fundamental level, developmental processes in fruit flies and other ‘simple’ eukaryotes are similar to the processes occurring in humans and other ‘complex’ organisms. The discovery that studies of fruit flies are directly relevant to human development opens up vast vistas of future research possibilities.

Study Aids For Chapter 12

Key terms

Give short definitions of the following terms:

Self study questions

  1. Distinguish between the terms ‘differentiation’ and ‘development’.
  2. Outline the various ways by which transient changes in genome activity can be brought about.
  3. Give details of one example of an extracellular signaling protein that can act as a transcription activator.
  4. Describe how copper influences gene expression in Saccharomyces cerevisiae.
  5. Outline the process by which steroid hormones regulate genome activity in higher eukaryotes.
  6. Draw a series of diagrams to illustrate the catabolite repression system of bacteria.
  7. Distinguish between a STAT and a JAK and explain how the two work together to regulate genome activity.
  8. Outline the MAP kinase, Ras and SAP kinase signal transduction systems.
  9. What is a second messenger? Give two examples of signal transduction via second messengers.
  10. Describe how genome rearrangements underlie the processes involved in (a) mating-type switching in Saccharomyces cerevisiae, and (b) the generation of immunoglobulin diversity in vertebrates.
  11. How do the Polycomb proteins influence genome activity?
  12. Give two examples of feedback autoregulation of gene expression.
  13. Describe how Bacillus subtilis uses sigma factor replacement as a means of regulating genome activity during sporulation.
  14. How is cell-cell signaling used to control genome activity during sporulation in Bacillus subtilis?
  15. Write a short essay on ‘The importance of protein phosphorylation during sporulation in Bacillus subtilis’.
  16. Why is Caenorhabditis elegans a good model organism for development in higher eukaryotes?
  17. Discuss the key features of genome regulation during vulva development in C. elegans. Your answer should emphasize those features of vulva development that have parallels in developmental processes in higher eukaryotes.
  18. Why is Drosophila melanogaster a good model organism for development in higher eukaryotes?
  19. Describe how the undifferentiated fruit-fly embryo acquires positional information.
  20. Discuss the importance of homeotic selector genes in development in fruit flies and vertebrates.

Problem-based learning

  1. ‘We should bear in mind that the biosphere is so diverse, and the numbers of genes in individual genomes so large, that it is reasonable to assume that any mechanism that could have evolved to regulate genome expression is likely to have done so.’ To what extent is this statement supported by our current knowledge of genome regulation?
  2. Describe how studies of signal transduction have improved our understanding of the abnormal biochemical activities that underlie cancer.
  3. Explore the influence of signal transduction by second messengers on the regulation of genome activity.
  4. Evaluate the three ‘guiding principles’ that have underpinned research into developmental processes (as listed on page 365).
  5. Are Caenorhabditis elegans and Drosophila melanogaster good model organisms for development in higher eukaryotes?
  6. What would be the key features of an ideal model organism for development in higher eukaryotes?
  7. Is there any need for a model organism for development in higher eukaryotes?


  1. Aroian RV, Koga M, Mendel JE, Ohshima Y, Sternberg PW. The let-23 gene necessary for Caenorhabditis elegans vulval induction encodes a tyrosine kinase of the EGF receptor subfamily. Nature. (1990);348:693–699. [PubMed: 1979659]
  2. Becker S, Groner B, Müller CW. Three-dimensional structure of the Stat3b homodimer bound to DNA. Nature. (1998);394:145–151. [PubMed: 9671298]
  3. Berridge MJ, Lipp P, Bootman MD. The calcium entry pas de deux. Science. (2000);287:1604–1605. [PubMed: 10733429]
  4. Brock TD (1990) The Emergence of Bacterial Genetics. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
  5. The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. (1998);282:2012–2018. [PubMed: 9851916]
  6. Chawla S, Hardingham GE, Quinn DR, Bading H. CBP: a signal-regulated transcriptional coactivator controlled by nuclear calcium and CaM kinase IV. Science. (1998);281:1505–1509. [PubMed: 9727976]
  7. Errington J. Determination of cell fate in Bacillus subtilis. Trends Genet. (1996);12:31–34. [PubMed: 8741858]
  8. Garre C, Bianchiscarra G, Sirito M, Musso M, Ravazzolo R. Lactoferrin binding sites and nuclear localization in K562(S) cells. J. Cell Physiol. (1992);153:477–482. [PubMed: 1447310]
  9. Goodrich J, Puangsomlee P, Martin M, Long D, Meyerowitz EM, Coupland G. A Polycomb-group gene regulates homeotic gene expression in Arabidopsis. Nature. (1997);386:44–51. [PubMed: 9052779]
  10. Grossman AD. Genetic networks controlling the initiation of sporulation and the development of genetic competence in Bacillus subtilis. Ann. Rev. Genet. (1995);29:477–508. [PubMed: 8825484]
  11. Haber JE. A locus control region regulates yeast recombination. Trends Genet. (1998);14:317–321. [PubMed: 9724964]
  12. He J, Furmanski P. Sequence specificity and transcriptional activation in the binding of lactoferrin to DNA. Nature. (1995);373:721–724. [PubMed: 7854459]
  13. Hill RJ, Sternberg PW. The lin-3 gene encodes an inductive signal for vulval development in C. elegans. Nature. (1992);358:470–476. [PubMed: 1641037]
  14. Horvath CM. STAT proteins and transcriptional responses to extracellular signals. Trends Biochem. Sci. (2000);25:496–502. [PubMed: 11050435]
  15. Katz WS, Hill RJ, Clandinin TR, Sternberg PW. Different levels of the C. elegans growth factor LIN-3 promote distinct vulval precursor fates. Cell. (1995);82:297–307. [PubMed: 7628018]
  16. Kornfeld K. Vulval development in Caenorhabditis elegans. Trends Genet. (1997);13:55–61. [PubMed: 9055606]
  17. Lodish H, Berk A, Zipursky AL, Matsudaira P, Baltimore D and Darnell J (2000) Molecular Cell Biology, 4th edition. W. H. Freeman, New York.
  18. Ma H. To be, or not to be, a flower - control of floral meristem identity. Trends Genet. (1998);14:26–32. [PubMed: 9448463]
  19. Parcy F, Nilsson O, Busch MA, Lee I, Weigel D. A genetic framework for floral patterning. Nature. (1998);395:561–566. [PubMed: 9783581]
  20. Pena MMO, Koch KA, Thiele DJ. Dynamic regulation of copper uptake and detoxification genes in Saccharomyces cerevisiae. Mol. Cell. Biol. (1998);18:2514–2523. [PMC free article: PMC110631] [PubMed: 9599102]
  21. Pirrotta V. Chromatin-silencing mechanisms in Drosophila maintain patterns of gene expression. Trends Genet. (1997);13:314–318. [PubMed: 9260517]
  22. Popperl H, Bienz M, Studer M. et al. Segmental expression of HoxB-1 is controlled by a highly conserved autoregulatory loop dependent upon exd/pbx. Cell. (1995);81:1031–1042. [PubMed: 7600572]
  23. Regulski M, Dessain S, McGinnis N, McGinnis W. High affinity binding sites for the Deformed protein are required for the function of an autoregulatory enhancer of the deformed gene. Genes Devel. (1991);5:278–286. [PubMed: 1995417]
  24. Robinson MJ, Cobb MH. Mitogen-activated kinase pathways. Curr. Opin. Cell Biol. (1997);9:180–186. [PubMed: 9069255]
  25. Schlessinger J. How receptor tyrosine kinases activate Ras. Trends Biochem. Sci. (1993);18:273–275. [PubMed: 8236435]
  26. Sharma-Kishore R, White JG, Southgate E, Podbilewicz B. Formation of the vulva in Caenorhabditis elegans: a paradigm for organogenesis. Development. (1999);126:691–699. [PubMed: 9895317]
  27. Sonenshein AL. Control of sporulation initiation in Bacillus subtilis. Curr. Opin. Microbiol. (2000);3:561–566. [PubMed: 11121774]
  28. Spiegel S, Foster D, Kolesnick R. Signal transduction through lipid second messengers. Curr. Opin. Cell Biol. (1996);8:159–167. [PubMed: 8791422]
  29. Sternberg PW, Han M. Genetics of RAS signaling in C. elegans. Trends Genet. (1998);14:466–472. [PubMed: 9825675]
  30. Strachan T and Read AP (1999) Human Molecular Genetics, 2nd edition. BIOS Scientific Publishers, Oxford.
  31. Stragier P, Losick R. Molecular genetics of sporulation in Bacillus subtilis. Ann. Rev. Genet. (1996);30:297–341. [PubMed: 8982457]
  32. Sulston J, Horvitz HR. Postembryonic cell lineages of the nematode Caenorhabditis elegans. Dev. Biol. (1977);56:110–156. [PubMed: 838129]
  33. Theissen G, Saedler H. Floral quartets. Nature. (2001);409:469–471. [PubMed: 11206529]
  34. Toker A, Cantley LC. Signaling through the lipid products of phosphoinositide-3-OH kinase. Nature. (1997);387:673–676. [PubMed: 9192891]
  35. Tsai M-J, O'Malley BW. Molecular mechanisms of action of steroid/thyroid receptor superfamily members. Ann. Rev. Biochem. (1994);63:451–486. [PubMed: 7979245]
  36. Winge DR, Jensen LT, Srinivasan C. Metal ion regulation of gene expression in yeast. Curr. Opin. Chem. Biol. (1998);2:216–221. [PubMed: 9667925]

Further Reading

  1. Gehring WJ, Affolter M, Bürglin T. Homeodomain proteins. Ann. Rev. Biochem. (1994);63:487–526. Details of the Drosophila homeodomain proteins, with emphasis on the DNA-protein interactions. [PubMed: 7979246]
  2. Karin M, Hunter T. Transcriptional control by protein phosphorylation: signal transmission from the cell surface to the nucleus. Curr. Biol. (1995);5:747–757. [PubMed: 7583121]
  3. Labouesse M, Mango SE. Patterning the C. elegans embryo: moving beyond the cell lineage. Trends Genet. (1999);15:307–313. Reviews the developmental pathways of C. elegans. [PubMed: 10431192]
  4. Maconochie M, Nonchev S, Morrison A, Krumlauf R. Paralogous Hox genes: function and regulation. Ann. Rev. Genet. (1996);30:529–556. Describes homeotic selector genes in vertebrates. [PubMed: 8982464]
  5. Maruta H, Burgess AW. Regulation of the Ras signaling network. Bioessays. (1994);16:489–496. [PubMed: 7945277]
  6. Tan PBO, Kim SK. Signaling specificity: the RTK/RAS/MAP kinase pathway in metazoans. Trends Genet. (1999);15:145–149. Describes how a single signal transduction pathway can activate different genes in different cells. [PubMed: 10203824]
Copyright © 2002, Garland Science.
Bookshelf ID: NBK21127