NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Gilbert SF. Developmental Biology. 6th edition. Sunderland (MA): Sinauer Associates; 2000.

  • By agreement with the publisher, this book is accessible by the search feature, but cannot be browsed.

Differential Gene Transcription

Anatomy of the gene: Exons and introns

There are two fundamental differences distinguishing most eukaryotic genes from most prokaryotic genes. First, eukaryotic genes are contained within a complex of DNA and protein called chromatin. The protein component constitutes about half the weight of chromatin and is composed largely of nucleosomes. The nucleosome is the basic unit of chromatin structure. It is composed of an octamer of histone proteins (two molecules each of histones H2A-H2B and histones H3-H4) wrapped with two loops containing approximately 140 base pairs of DNA (Figure 5.1; Kornberg and Thomas 1974). Chromatin can thus be visualized as a string of nucleosome beads linked by ribbons of DNA. While classic geneticists have likened genes to “beads on a string,” molecular geneticists liken genes to “string on the beads.” Most of the time, the nucleosomes are themselves wound into tight “solenoids” that are stabilized by histone H1. Histone H1 is found in the 60 or so base pairs of “linker” DNA between the nucleosomes (Weintraub 1984). This H1-dependent conformation of nucleosomes inhibits the transcription of genes in somatic cells by packing adjacent nucleosomes together into tight arrays that prohibit the access of transcription factors and RNA polymerases to the genes (Thoma et al. 1979; Schlissel and Brown 1984). It is generally thought, then, that the “default” condition of chromatin is a repressed state, and that tissue-specific genes become activated by local interruption of this repression (Weintraub 1985).

Figure 5.1. Nucleosome and chromatin structure. (A) Model of nucleosome structure as seen by X-ray crystallography at a resolution of 2.8 Å. Histones H2A and H2B are yellow and red, respectively; H3 is purple, and H4 is green. The DNA helix is wound about ...


5.1 Displacing nucleosomes. Transcription can occur even in a region of nucleosomes. If the promoter is accessible to RNA polymerase and transcription factors, the presence of nucleosomes will not inhibit the elongation of the message.

The second difference is that eukaryotic genes are not co-linear with their peptide products. Rather, the single nucleic acid strand of eukaryotic mRNA comes from noncontiguous regions on the chromosome. Between the regions of DNA coding for a protein—exons—are intervening sequences—introns—that have nothing whatsoever to do with the amino acid sequence of the protein.* The structure of a typical eukaryotic gene can be represented by the human β-globin gene, shown in Figure 5.2. This gene consists of the following elements:

Figure 5.2. Nucleotide sequence of the human β-globin gene. (A) Schematic representation of the locations of the promoter region, transcription initiation (cap) site, 5´ untranslated (leader) sequence, exons, introns, and 3´ untranslated region ...


A promoter region, which is responsible for the binding of RNA polymerase and for the subsequent initiation of transcription. The promoter region of the human β-globin gene has three distinct units and extends from 95 to 26 base pairs before (“upstream from”) the transcription initiation site (i.e., from -95 to -26).


The transcription initiation site, which for human β-globin is ACATTTG. This site is often called the cap sequence because it represents the 5´ end of the RNA, which will receive a “cap” of modified nucleotides soon after it is transcribed. The specific cap sequence varies among genes.


The translation initiation site, ATG. This codon (which becomes AUG in the mRNA) is located 50 base pairs after the transcription initiation site in the human β-globin gene (although this distance differs greatly among different genes). The intervening sequence of 50 base pairs between the initiation points of transcription and translation is the 5´ untranslated region, often called the 5´ UTR or leader sequence. The 5´ UTR can determine the rate at which translation is initiated.


The first exon, which contains 90 base pairs coding for amino acids 1–30 of human β-globin.


An intron containing 130 base pairs with no coding sequences for the globin protein. The structure of this intron is important in enabling the RNA to be processed into messenger RNA and exit from the nucleus.


An exon containing 222 base pairs coding for amino acids 31–104.


A large intron—850 base pairs—having nothing to do with the globin protein structure.


An exon containing 126 base pairs coding for amino acids 105–146.


A translation termination codon, TAA. This codon becomes UAA in the mRNA. The ribosome dissociates at this codon, and the protein is released.


A 3´ untranslated region (3´ UTR) that, although transcribed, is not translated into protein. This region includes the sequence AATAAA, which is needed for polyadenylation: the placement of a “tail” of some 200 to 300 adenylate residues on the RNA transcript. This poly(A) tail (1) confers stability on the mRNA, (2) allows the mRNA to exit the nucleus, and (3) permits the mRNA to be translated into protein. The poly(A) tail is inserted into the RNA about 20 bases downstream of the AAUAAA sequence. Transcription continues beyond the AATAAA site for about 1000 nucleotides before being terminated.
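The segment lengths listed above can be checked against the amino acid ranges with a little arithmetic. The following is a minimal sketch, not part of the original text: the names SEGMENTS, EXON_RESIDUES, and the helper functions are hypothetical, and only the lengths stated in the list are used (the length of the 3´ UTR is not given, so it is left out).

```python
# Segment lengths (in base pairs) for the human beta-globin gene,
# taken from the list above. The 3' UTR is omitted because its
# length is not stated in the text.
SEGMENTS = [
    ("5' UTR", "utr", 50),
    ("exon 1", "exon", 90),
    ("intron 1", "intron", 130),
    ("exon 2", "exon", 222),
    ("intron 2", "intron", 850),
    ("exon 3", "exon", 126),
]

# Amino acid ranges (inclusive) that the text assigns to each exon.
EXON_RESIDUES = {
    "exon 1": (1, 30),
    "exon 2": (31, 104),
    "exon 3": (105, 146),
}


def coding_length(segments):
    """Total number of base pairs that lie in exons."""
    return sum(length for _, kind, length in segments if kind == "exon")


def codons_per_exon(segments):
    """Number of three-base codons contained in each exon."""
    return {name: length // 3
            for name, kind, length in segments if kind == "exon"}


def exons_match_residues(segments, residues):
    """True if each exon's codon count equals the size of its residue range."""
    for name, codons in codons_per_exon(segments).items():
        first, last = residues[name]
        if last - first + 1 != codons:
            return False
    return True


if __name__ == "__main__":
    print(coding_length(SEGMENTS))            # coding base pairs
    print(coding_length(SEGMENTS) // 3)       # codons, one per amino acid
    print(exons_match_residues(SEGMENTS, EXON_RESIDUES))
```

Under these figures the three exons hold 438 coding base pairs, or 146 codons, which agrees with the 146 amino acids assigned to them, while the two introns account for 980 base pairs that contribute nothing to the protein sequence.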

The original nuclear RNA transcript for such a gene contains the capping sequence, the 5´ untranslated region, the exons, the introns, and the 3´ untranslated region (Figure 5.3). In addition, both its ends become modified. A cap consisting of methylated guanosine is placed on the 5´ end of the RNA in opposite polarity to the RNA itself. This means that there is no free 5´ phosphate group on the nuclear RNA. The 5´ cap is necessary for the binding of mRNA to the ribosome and for subsequent translation (Shatkin 1976). The 3´ terminus is usually modified in the nucleus by the addition of a poly(A) tail. These adenylate residues are put together enzymatically and are added to the transcript; they are not part of the gene sequence. Both the 5´ and 3´ modifications may protect the RNA from exonucleases that would otherwise digest the mRNA (Sheiness and Darnell 1973; Gedamu and Dixon 1978). The modifications thus stabilize the message and its precursor.

Figure 5.3. Summary of the steps involved in the production of β-globin and hemoglobin.


5.2 Structure of the 5´ cap. The capping and methylation of the 5´ end are critical steps in the synthesis of mRNA. If the cap is missing or unmethylated, translation may fail to occur.

Anatomy of the gene: Promoters and enhancers

In addition to the protein-encoding region of the gene, there are regulatory sequences that can be on either end of the gene (or even within it). These sequences—the promoters and enhancers—are necessary for controlling where and when a particular gene is transcribed.

Promoters are the sites where RNA polymerase binds to the DNA to initiate transcription. Promoters of genes that synthesize messenger RNAs (i.e., genes that encode proteins) are typically located immediately upstream from the site where the RNA polymerase initiates transcription. Most of these promoters contain the sequence TATA, where RNA polymerase will be bound (Figure 5.4). This site, known as the TATA box, is usually about 30 base pairs upstream from the site where the first base is transcribed. Eukaryotic RNA polymerases, however, will not bind to this naked DNA sequence. Rather, they require additional protein factors to bind efficiently to the promoter. The protein-encoding genes are transcribed by RNA polymerase II, and at least six nuclear proteins have been shown to be necessary for the proper initiation of transcription by RNA polymerase II (Buratowski et al. 1989; Sopta et al. 1989). These proteins are called basal transcription factors. The first of these, TFIID,§ recognizes the TATA box through one of its subunits, TATA-binding protein (TBP). TFIID serves as the foundation of the transcription initiation complex, and it also serves to keep nucleosomes from forming in this region. Once TFIID is stabilized by TFIIA, it becomes able to bind TFIIB. Once TFIIB is in place, RNA polymerase can bind to this complex. Other transcription factors (TFIIE, F, and H) are then used to release RNA polymerase from the complex so that it can transcribe the gene, and to unwind the DNA helix so that the RNA polymerase will have a free template from which to transcribe.
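The order of assembly described in this paragraph can be written as a small dependency table. This is only an illustrative sketch of the sequence as the text states it; the names REQUIRES, assemble, and can_transcribe are hypothetical, and the model ignores everything except which factors must already be present before the next one can join.

```python
# For each component, the components that must already be in the
# complex before it can bind, following the order given in the text.
REQUIRES = {
    "TFIID": set(),                      # binds the TATA box via TBP
    "TFIIA": {"TFIID"},                  # stabilizes TFIID
    "TFIIB": {"TFIID", "TFIIA"},         # binds once TFIID is stabilized
    "RNA polymerase II": {"TFIIB"},      # binds once TFIIB is in place
    "TFIIE": {"RNA polymerase II"},      # release and unwinding factors
    "TFIIF": {"RNA polymerase II"},
    "TFIIH": {"RNA polymerase II"},
}


def assemble(order):
    """Try to add factors in the given order; return those that bound.

    A factor is skipped if its prerequisites are not yet in the complex.
    """
    bound = set()
    for factor in order:
        if REQUIRES[factor] <= bound:
            bound.add(factor)
    return bound


def can_transcribe(bound):
    """Polymerase is released only when TFIIE, F, and H have all joined."""
    return {"RNA polymerase II", "TFIIE", "TFIIF", "TFIIH"} <= bound


if __name__ == "__main__":
    full = ["TFIID", "TFIIA", "TFIIB", "RNA polymerase II",
            "TFIIE", "TFIIF", "TFIIH"]
    print(can_transcribe(assemble(full)))
    # Without TFIID as the foundation, nothing else can bind.
    print(assemble(["TFIIB", "RNA polymerase II"]))
```

The second call illustrates why TFIID is described as the foundation of the complex: in this toy model no later factor can join until it is in place.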

Figure 5.4. The formation of the active eukaryotic initiation complex. The diagrams represent the complexes formed on the TATA box by the transcription factors and RNA polymerase II. (A) The TFIID complex binds to the TATA box through its TBP subunit. (B) TFIID is ...

In addition to these basal transcription factors, which are found in each nucleus, there is also a set of transcription factors called TBP-associated factors, or TAFs (Figure 5.5; Buratowski 1997; Lee and Young 1998), which can stabilize the TBP. This function is critical for gene transcription, for if the TBP is not stabilized, it can fall off the small TATA sequence. The TAFs are bound by upstream promoter elements on the DNA. These DNA sequences are near the TATA sequence, and usually upstream from it. These TAFs need not be in every cell of the body, however. Cell-specific transcription factors (such as the Pax6 and microphthalmia proteins mentioned in Chapter 4) can also activate the gene by stabilizing the transcription initiation complex. They can do so by binding to the TAFs, by binding directly to other factors such as TFIIB, or (as we will see soon) by destabilizing nucleosomes.

Figure 5.5. Model of TAF stabilization of TBP. (A) A minimal complex near the promoter, containing TBP on the TATA box of the promoter and two upstream sites occupied by two transcription factors, Sp1 and NTF-1. TAF 250 is bound to the TBP, but this complex is not ...


5.3 Promoter structure and the mechanisms of transcription complex assembly. Getting RNA polymerase to a promoter is not an easy task. The transcriptional initiation complex is a major protein complex that must be created at each round of transcription. The delineation of the different parts of promoters is worked out via mutations and transgenes.

An enhancer is a DNA sequence that can activate the utilization of a promoter, controlling the efficiency and rate of transcription from that particular promoter. Enhancers can activate only cis-linked promoters (i.e., promoters on the same chromosome), but they can do so at great distances (some as great as 50 kilobases away from the promoter). Moreover, enhancers do not need to be on the 5´ (upstream) side of the gene. They can also be at the 3´ end, in the introns, or even on the complementary DNA strand (Maniatis et al. 1987). The human β-globin gene has an enhancer in its 3´ UTR, roughly 700 base pairs downstream from the AATAAA site. This sequence is necessary for the temporal- and tissue-specific expression of the β-globin gene in adult red blood cell precursors (Trudel and Constantini 1987). Like promoters, enhancers function by binding specific regulatory proteins called transcription factors.

Enhancers can regulate the temporal and tissue-specific expression of any differentially regulated gene, but different types of genes normally have different enhancers. In the pancreas, for instance, the exocrine protein genes (for the digestive proteins chymotrypsin, amylase, and trypsin) have enhancers different from that of the gene for the endocrine protein insulin. These enhancers both lie in the 5´ flanking sequences of their genes. Walker and colleagues (1983) created transgenes by placing flanking regions from the genes for chymotrypsin and insulin onto the gene for bacterial chloramphenicol acetyltransferase (CAT), an enzyme that is not found in mammalian cells. CAT activity is easy to assay in mammalian cells, so the bacterial CAT gene can be used as a reporter gene to tell investigators whether a particular enhancer is functioning. The researchers then transfected the transgenes into (1) ovary cells (which do not secrete either insulin or chymotrypsin), (2) an insulin-secreting cell line, and (3) a chymotrypsin-secreting cell line, and measured the activity of CAT in each of these cells. As shown in Figure 5.6, neither enhancer sequence caused the enzyme to be made in the ovarian cells. In the insulin-secreting cell, however, the 5´ flanking region from the insulin gene enabled the CAT gene to be expressed, but the 5´ flanking region of the chymotrypsin gene did not. Conversely, when the clones were placed into the exocrine pancreatic cell line, the chymotrypsin 5´ flanking sequence allowed CAT expression, while the insulin enhancer did not. The enhancers for 10 different exocrine proteins share a 20-base-pair consensus sequence, suggesting that these similar sequences play a role in activating an entire set of genes specifically expressed in the exocrine cells of the pancreas (Boulet et al. 1986). Thus, the expression of genes in exocrine and in endocrine cells of the pancreas appears to be controlled by different enhancers.
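The qualitative outcome of this reporter experiment can be laid out as a small truth table. The sketch below only restates the result described in the paragraph; it contains no measured CAT activities, and the names ACTIVE_IN, CELL_TYPES, cat_expressed, and assay_table are hypothetical.

```python
# Cell types in which each 5' flanking region (enhancer) drove CAT
# expression, according to the result described in the text.
ACTIVE_IN = {
    "insulin": {"insulin-secreting"},
    "chymotrypsin": {"chymotrypsin-secreting"},
}

CELL_TYPES = ["ovary", "insulin-secreting", "chymotrypsin-secreting"]


def cat_expressed(enhancer, cell_type):
    """True if the reporter is expressed for this enhancer and cell type."""
    return cell_type in ACTIVE_IN[enhancer]


def assay_table():
    """All enhancer and cell type combinations with their outcome."""
    return {(enhancer, cell): cat_expressed(enhancer, cell)
            for enhancer in ACTIVE_IN for cell in CELL_TYPES}


if __name__ == "__main__":
    for (enhancer, cell), expressed in assay_table().items():
        print(f"{enhancer:13s} {cell:24s} {'CAT made' if expressed else 'no CAT'}")
```

Each enhancer gives a positive result in exactly one of the three cell types, and neither gives one in ovary cells, which is the pattern that identifies the enhancers as tissue specific.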

Figure 5.6. Tissue specificity of pancreatic gene enhancers. The 5´ flanking regions of the insulin gene (open circles) and the chymotrypsin gene (shaded circles) were each inserted next to the gene for bacterial CAT. As a positive control, the enhancer from ...

By taking an enhancer from one gene and fusing it to another gene, it has been shown that enhancers can direct the expression of any gene sequence. For instance, the β-galactosidase gene from E. coli (the lacZ gene) can be used as a reporter gene and placed onto an enhancer that normally directs a particular mouse gene to become expressed in muscles. If the resulting transgene is injected into a newly fertilized mouse egg and gets incorporated into its DNA, the β-galactosidase gene will be expressed in the muscle cells. By staining for the presence of β-galactosidase, the expression pattern of that muscle-specific gene can be seen (Figure 5.7A). Similarly, if the gene for green fluorescent protein (GFP, a reporter protein that is usually made only in jellyfish) is placed on the enhancer of genes encoding the crystallin proteins of the eye lens, GFP expression is seen solely in the lens (Figure 5.7B).

Figure 5.7. The genetic elements regulating cell type-specific transcription can be identified by fusing reporter genes to the enhancer regions of the genes found in particular cell types. (A) The enhancer region of the muscle-specific protein Myf-5 is fused to a ...

Enhancers are critical in the regulation of normal development. Over the past decade, seven generalizations that emphasize their importance for differential gene expression have been made:


Most genes require enhancers for their transcription.


Enhancers are the major determinant of differential transcription in space (cell type) and time.


The ability of an enhancer to function while far from the promoter means that there can be multiple signals to determine whether a given gene is transcribed. A given gene can have several enhancer sites linked to it, and each enhancer can be bound by more than one transcription factor.


The interaction between the proteins bound to the enhancer sites and the transcription initiation complex assembled at the promoter is thought to regulate transcription. The mechanism of this association is not fully known, nor do we comprehend how the promoter integrates all these signals.


Enhancers are modular. There are various DNA elements that regulate temporal and spatial gene expression, and these can be mixed and matched. As we will see, the enhancers for endocrine hormones such as insulin and for lens-specific proteins such as crystallins both have sites that bind Pax6 protein. But Pax6 doesn't tell the lens to make insulin or the pancreas to make crystallins, because there are other transcription factor proteins that also must bind. It is the combination of transcription factors that causes particular genes to be transcribed.


A gene can have several enhancer elements, each turning it on in a different set of cells.


Enhancers can also be used to inhibit transcription. In some cases, the same transcription factors that activate the transcription of one gene can be used to repress the transcription of other genes. These “negative enhancers” are also called silencers.

Transcription factors

Transcription factors are proteins that bind to enhancer or promoter regions and interact to activate or repress the transcription of a particular gene. Most transcription factors can bind to specific DNA sequences. These proteins can be grouped together in families based on similarities in structure (Table 5.1). The transcription factors within such a family share a common framework structure in their DNA-binding sites, and slight differences in the amino acids at the binding site can alter the sequence of the DNA to which the factor binds.

Table 5.1. Some major transcription factor families and subfamilies.


5.4 Families of transcription factors. There are several families of transcription factors grouped together by their structural similarities and their mechanisms of action. Homeodomain transcription factors are important in specifying anterior-posterior axes, and hormone receptors mediate the effects of hormones on the genes.

Transcription factors have three major domains. The first is a DNA-binding domain that recognizes a particular DNA sequence. The second is a trans-activating domain that activates or suppresses the transcription of the gene whose promoter or enhancer it has bound. Usually, this trans-activating domain enables the transcription factor to interact with proteins involved in binding RNA polymerase (such as TFIIB or TFIIE; see Sauer et al. 1995). In addition, there may be a protein-protein interaction domain that allows the transcription factor's activity to be modulated by TAFs or other transcription factors.

Box: Studying DNA Regulatory Elements. Identifying DNA regulatory elements. How do we know that a particular DNA fragment binds a transcription factor? One of the simplest ways is to perform a gel mobility shift assay. The basis for this assay is gel electrophoresis. ...

Examples of transcription factors: MITF and Pax6

As examples of transcription factors, we can look at the Pax6 and microphthalmia proteins mentioned in Chapter 4. The microphthalmia (MITF) protein is necessary for the production of pigment cells and their pigments. There are three functionally important domains of the MITF protein. First, MITF has a protein-protein interaction domain that enables it to dimerize with another MITF protein (Ferré-D'Amaré et al. 1993). This homodimer (two microphthalmia proteins bound together) forms the functional protein that can bind to DNA and activate the transcription of certain genes (Figure 5.8). The second region, the DNA-binding domain, is close to the amino-terminal end of the protein and contains numerous basic amino acids that make contact with the DNA (Hemesath et al. 1994; Steingrímsson et al. 1994). This assignment was confirmed by the discovery of various human and mouse mutations that map within the DNA-binding site for MITF and prevent the attachment of the MITF protein to the DNA. Sites for MITF binding have been found in the promoter regions for three pigment-cell-specific proteins of the tyrosinase family (Bentley et al. 1994; Yasumoto et al. 1997). Without MITF, these proteins are not synthesized properly (Figure 5.9) and the melanin pigment is not made. These promoters all contain the same 11-base-pair sequence, including the core sequence CATGTG.
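Looking for a short core sequence such as CATGTG in a stretch of DNA is a simple string search, provided both strands are considered. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the input sequences in the usage lines are invented for illustration and are not real promoter sequences, and only the 6-base core named in the text is searched (the full 11-base-pair sequence is not given).

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif="CATGTG"):
    """Return (position, strand) for each occurrence of the motif.

    Positions are 0-based on the given strand. A "-" hit means the
    motif lies on the complementary strand at that position.
    """
    seq = seq.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    # Invented example sequences, for illustration only.
    print(find_motif("TTCATGTGAA"))
    print(find_motif("GGCACATGCC"))
```

Because DNA is double stranded, a site reading CACATG on one strand presents CATGTG on the other, so the search reports it as a hit on the complementary strand.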

Figure 5.8. Three-dimensional model of the MITF dimer binding to its promoter element in DNA. The amino termini are located at the bottom of the figure and form the DNA-binding domain. The protein-protein interaction domain is located immediately above it, and the ...

Figure 5.9. Mitf is required for the transcription of pigmentation genes. Serial sections of the eye in 15.5-day mouse embryos are shown. In the wild-type mouse embryo, in situ hybridization reveals the presence of Mitf (dark staining) in the retinal pigment epithelial ...

The third functional region of MITF is its trans-activating domain. This domain includes a long stretch of amino acids in the center of the protein. When the MITF dimer is bound to its target element in a promoter or enhancer, the trans-activating region is able to bind a TAF called p300/CBP. The p300/CBP protein is a histone acetyltransferase enzyme that can transfer acetyl groups to each histone in the nucleosomes (Ogryzko et al. 1996; Price et al. 1998). This acetylation of the nucleosomes destabilizes them and allows the genes for pigment-forming enzymes to be expressed. Recent discoveries have shown that numerous transcription factors operate by recruiting histone acetyltransferases. It is thought that once the nucleosomes are destabilized, other transcription factors and RNA polymerase can bind more easily to the DNA in that region.


5.5 Histone acetylation. Histone acetylation is a critical step in clearing the way for the transcription initiation complex. Derepressed chromatin is characterized by acetylated histones.

The Pax6 transcription factor, which is needed for mammalian eye, nervous system, and pancreas development, contains two potential DNA-binding domains. The major DNA-binding site of the Pax6 protein resides at its amino-terminal end, and these amino acids interact with a specific 20–26-base-pair sequence of DNA (Figure 5.10; Xu et al. 1995). Such Pax6-binding sequences have been found in the enhancers of vertebrate lens crystallin genes and in the genes expressed in the endocrine cells of the pancreas (insulin, glucagon, and somatostatin) (Cvekl and Piatigorsky 1996; Andersen 1999). When Pax6 binds to a particular site in an enhancer or promoter, it can either activate or repress that gene. The trans-activating domain of Pax6 is rich in proline, threonine, and serine. Mutations in this region cause severe nervous system, pancreatic, and optic abnormalities in humans (Glaser et al. 1994).

Figure 5.10. Stereoscopic model of Pax6 binding to its enhancer element in DNA. The DNA-binding region (the “paired domain”) is in yellow; the DNA is in blue. The red dots indicate the sites of loss-of-function mutations in the Pax6 gene that give ...

The use of Pax6 by different organs demonstrates the modular nature of transcriptional regulatory units. Figure 5.11 shows two gene regulatory regions that use Pax6. The first is that of the chick δ1 lens crystallin gene. This gene has a promoter containing a site for TBP binding and an upstream site that binds Sp1, a general transcriptional activator found in all cells. The gene also has an enhancer in its third intron that controls the time and place of crystallin gene expression. This enhancer has two Pax6-binding sites. The crystallin gene will not be expressed unless Pax6 is present in the nucleus and bound to these enhancer sites. As mentioned in Chapter 4, Pax6 is present during early development in the central nervous system and head surface ectoderm of the chick. Moreover, this enhancer has a site for another transcription factor, the Sox2 protein. Sox2 is not usually found in the outer ectoderm, but it appears in those outer ectodermal cells that will become lens by virtue of their being induced by the optic vesicle evaginating from the brain (Kamachi et al. 1998). Thus, only those cells that contain both Sox2 and Pax6 can express the lens crystallin gene. In addition, there is a third site that can bind either an activator (the δEF3 protein) or a repressor (the δEF1 protein) of transcription. It is thought that the repressor may be critical in preventing crystallin expression in the nervous system. Thus, enhancers function in a combinatorial manner, wherein several transcription factors work together to promote or inhibit transcription.
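The combinatorial rule for the δ1 crystallin enhancer can be expressed as a Boolean condition. This is a deliberately simplified sketch: the function name and the spelling "deltaEF1" are hypothetical labels, the activator δEF3 is left out, and treating the δEF1 repressor as overriding the activators is an assumption made for illustration rather than something the text establishes.

```python
def crystallin_expressed(factors):
    """Predict delta-1 crystallin expression from the factors in a nucleus.

    `factors` is a set of transcription factor names. The gene is on
    only when both Pax6 and Sox2 are present, and (by assumption) the
    deltaEF1 repressor is absent.
    """
    activators_present = {"Pax6", "Sox2"} <= factors
    repressor_present = "deltaEF1" in factors
    return activators_present and not repressor_present


if __name__ == "__main__":
    print(crystallin_expressed({"Pax6", "Sox2"}))              # induced lens ectoderm
    print(crystallin_expressed({"Pax6"}))                      # Pax6 alone
    print(crystallin_expressed({"Pax6", "Sox2", "deltaEF1"}))  # repressor present
```

Neither Pax6 nor Sox2 is restricted to the lens, yet in this model their intersection is, which is the sense in which the enhancer works combinatorially.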

Figure 5.11. Modular transcriptional regulatory regions using Pax6 as an activator. (A) Promoter and enhancer of the chick lens δ1 crystallin gene. Pax6 interacts with Sox2 and Maf to activate this gene. (B) Enhancer of the rat somatostatin gene. Pax6 activates ...

Another set of regulatory regions that use Pax6 are the enhancers regulating the transcription of the insulin, glucagon, and somatostatin genes of the pancreas (Figure 5.11B). Here, Pax6 is essential for gene expression, and it works in cooperation with other transcription factors such as Pdx1 (specific for the pancreatic region of the endoderm) and Pbx1 (Andersen et al. 1999; Hussain and Habener 1999). In the absence of Pax6 (as in the homozygous small eye mutation in mice and rats), the endocrine parts of the pancreas do not develop properly, and the production of these proteins from those cells is deficient (Sander et al. 1997). One can see that the genes for specific proteins use numerous transcription factors in various combinations. In this way, transcription factors can regulate the timing and place of gene expression.

There are other genes that are activated by Pax6 binding, and one of them is the Pax6 gene itself. Pax6 protein can bind to the Pax6 gene promoter (Plaza et al. 1993). This means that once the Pax6 gene is turned on, it will continue to be expressed, even if the signal that originally activated it is no longer given.
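The consequence of a gene activating its own promoter can be seen in a toy on/off simulation. The sketch below is an assumption-laden illustration, not a model from the text: time is divided into discrete steps, expression is all-or-none, and the function name and its feedback parameter are hypothetical.

```python
def simulate(signal, feedback=True):
    """Return the on/off state of the gene at each time step.

    `signal` is a list of booleans saying whether the external
    activating signal is present at each step. With feedback, protein
    made in the previous step keeps the gene on; without feedback the
    gene simply follows the signal.
    """
    on = False
    history = []
    for present in signal:
        on = present or (feedback and on)
        history.append(on)
    return history


if __name__ == "__main__":
    pulse = [False, True, False, False]   # a brief activating signal
    print(simulate(pulse, feedback=True))
    print(simulate(pulse, feedback=False))
```

With feedback, a single pulse of the signal is enough to leave the gene on for all later steps; without it, expression stops as soon as the signal is withdrawn.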

Within the past decade, our knowledge of transcription factors has progressed enormously, giving us a new, dynamic view of gene expression. The gene itself is no longer seen as an independent entity controlling the synthesis of proteins. Rather, the gene both directs and is directed by protein synthesis. Natalie Angier (1992) has written, “A series of new discoveries suggests that DNA is more like a certain type of politician, surrounded by a flock of protein handlers and advisers that must vigorously massage it, twist it and, on occasion, reinvent it before the grand blueprint of the body can make any sense at all.”

Cascades of transcription factors

We now know that the tissue-specific expression of a particular gene, such as the gene encoding the δ1 crystallin of the lens, is the result of the presence of a particular constellation of transcription factors in the nucleus. In the case of this lens crystallin, the Pax6 and Sox2 transcription factors are especially important. It is important to see that this combinatorial mode of operation means that none of these transcription factors has to be cell type-specific.

But how do the transcription factors themselves get to be expressed in a tissue-specific manner? In many cases, the genes for transcription factors are activated by other transcription factors. Let us look again at the Pax6 gene as an example. Not only does the Pax6 protein regulate other genes (such as the ones for insulin and crystallins); it is itself regulated. The regulatory regions of the mouse Pax6 gene were discovered by taking regions from its 5´ flanking sequence and introns and fusing them to a β-galactosidase reporter gene. This transgene was then microinjected into newly fertilized mouse pronuclei, and the resulting embryos were stained for β-galactosidase (Figure 5.15A; Kammandel et al. 1998; Williams et al. 1998). The results are summarized in Figure 5.15B. An enhancer farthest upstream from the promoter contains the regions necessary for Pax6 expression in the pancreas, while a second enhancer activates Pax6 expression in surface ectoderm (lens, cornea, and conjunctiva). A third enhancer resides in the leader sequence, and it contains the sequences that direct Pax6 expression in the neural tube. A fourth enhancer sequence, located in an intron shortly after the translation initiation sequence, determines the expression of Pax6 in the retina. The search is on now for those transcription factors that activate the gene for the Pax6 transcription factor. Wilhelm Roux (1894) described this situation eloquently in his manifesto for experimental embryology when he stated that the causal analysis of development may be the greatest problem the human intellect has attempted to solve, “since every new cause ascertained only gives rise to fresh questions concerning the cause of this cause.”

Figure 5.15. Regulatory regions of the mouse Pax6 gene. (A) A sequence from the upstream enhancer of the murine Pax6 gene directs the expression of the lacZ reporter transgene in the surface ectoderm overlying the optic cup, as shown by the dark staining in this area. ...


Some sequences act specifically to block transcription. These silencer domains are useful in restricting the transcription of a particular gene to a particular group of cells or for regulating the timing of the gene's expression. For example, the fetal mouse liver makes serum albumin, but only after a certain stage of gut development. At first, the endodermal cells that will form the liver do not transcribe this albumin gene. However, when the endodermal tube contacts the cardiac mesoderm (which is in the process of forming a heart), the heart precursors are able to instruct the endodermal tube to begin forming the liver and to start transcribing liver-specific genes (Le Douarin 1964; Gualdi et al. 1996). This contact is thought to release a transcription factor bound to a silencer region in the serum albumin gene. This silencer site is occupied prior to the contact of the endoderm with the cardiac mesoderm, but it becomes vacant in the endodermal tube immediately after contact with the heart-forming cells (Figure 5.16; Gualdi et al. 1996).

Figure 5.16. The importance of silencers in liver-specific gene transcription. (A) In the early digestive tube endoderm, most of the transcription factors are not bound to their sites on the enhancer for serum albumin. (B) As endoderm development proceeds, the sites ...

Another example of a silencer is found in certain neural genes. There is a sequence found in certain promoters that prevents the promoter's activation in any tissue except neurons. This sequence was given the name neural restrictive silencer element (NRSE), and it has been found in several mouse genes whose expression is limited to the nervous system: those for synapsin I, sodium channel type II, brain-derived neurotrophic factor, Ng-CAM, and L1. The protein that binds to the NRSE is a zinc finger transcription factor called neural restrictive silencer factor (NRSF). NRSF appears to be expressed in every cell that is not a mature neuron (Chong et al. 1995; Schoenherr and Anderson 1995). Thus, the down-regulation of NRSF seems to be a key event in allowing the expression of several genes that are critical to neural function.

To test the hypothesis that NRSE was necessary in the normal repression of neural genes in non-neural cells, lacZ transgenes were made by fusing a β-galactosidase gene (lacZ) with part of the L1 neural cell adhesion gene. (L1 is a protein whose function is critical for brain development, as we will see in later chapters.) In one case, the L1 gene, from its promoter through the fourth exon, was fused to the lacZ sequences. A second transgene was made just like the first, except that the NRSE had been deleted from the L1 promoter. The two transgenes were separately inserted into the pronuclei of fertilized oocytes, and the resulting transgenic mice were analyzed for the expression of β-galactosidase (Kallunki et al. 1995, 1997). In the embryos receiving the complete transgene (which included the NRSE), expression was seen only in the nervous system (Figure 5.17A). However, in those mice whose transgene lacked the NRSE, expression was seen in the heart, limb mesenchyme and limb ectoderm, kidney mesoderm, ventral body wall, and cephalic mesenchyme (Figure 5.17B).

Figure 5.17. Analysis of β-galactosidase staining patterns in 11.5-day embryonic mice containing (A) a transgene composed of the L1 promoter, a portion of the L1 gene, and a bacterial lacZ gene fused to the second exon (which contains the NRSE region), or …
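
The logic of the NRSE deletion experiment can be summarized as a toy model. The sketch below is only an illustration: the names `Transgene`, `nrsf_present`, and `reporter_expressed` are invented for this example, and it makes two simplifying assumptions that go beyond the data, namely that NRSF is present in every tissue outside the nervous system and that the L1 promoter would otherwise be active in each tissue listed. Under those assumptions, the reporter is silenced only when the NRSE is in the construct and NRSF is available to bind it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transgene:
    """A reporter construct; only the presence of the NRSE matters here."""
    name: str
    has_nrse: bool


def nrsf_present(tissue: str) -> bool:
    # Assumption: NRSF is found in every tissue except the nervous system.
    return tissue != "nervous system"


def reporter_expressed(transgene: Transgene, tissue: str) -> bool:
    # Silencing requires both the NRSE in the construct and NRSF in the cell.
    return not (transgene.has_nrse and nrsf_present(tissue))


# A subset of the tissues scored in the experiment described above.
TISSUES = ["nervous system", "heart", "limb mesenchyme", "kidney mesoderm"]

complete = Transgene("L1-lacZ with NRSE", has_nrse=True)
deleted = Transgene("L1-lacZ without NRSE", has_nrse=False)

for construct in (complete, deleted):
    stained = [t for t in TISSUES if reporter_expressed(construct, t)]
    print(construct.name, "->", stained)
```

In this model the complete construct stains only the nervous system, while the construct lacking the NRSE stains every listed tissue. Real embryos are less tidy, since expression also depends on the activators present in each tissue.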

Locus control regions in globin genes

Some regions of DNA, called locus control regions (LCRs), function as “super-enhancers.” These LCRs establish an “open” chromatin configuration, inhibiting the normal repression of transcription over an area spanning several genes. The mechanism by which the LCR opens up the chromatin is not yet known.

One of the best-studied LCRs is that regulating the tissue-specific expression of the β-globin family of genes in humans, mice, and chicks. In many species, including chicks and humans, the embryonic or fetal hemoglobin differs from that found in adult red blood cells (Figure 5.18). Human hemoglobin consists largely of four globin chains of two different types and four molecules of heme. Human embryonic hemoglobin has two zeta (ζ) globin chains and two epsilon (ϵ) globin chains (and four molecules of heme). During the second month of human gestation, ζ- and ϵ-globin synthesis abruptly ceases, while alpha (α) and gamma (γ) globin synthesis increases. The association of two γ-globin chains with two α-globin chains produces fetal hemoglobin (α₂γ₂). (The physiological importance of the γ-globin chain in fetal hemoglobin is examined in Chapter 14.) At 3 months gestation, the beta (β) globin and delta (δ) globin genes begin to be active, and their products slowly increase, while γ-globin levels gradually decline. This switchover is greatly accelerated after birth, and fetal hemoglobin is replaced by adult hemoglobin (α₂β₂). The normal adult hemoglobin profile is 97% α₂β₂, 2–3% α₂δ₂, and 1% α₂γ₂.

Figure 5.18. Percentages of hemoglobin chain types as a function of human developmental stage. (After Karlsson and Nienhuis 1985.)

A schematic diagram of human hemoglobin types and the genes that code for them is shown in Figure 5.19. In humans, the ζ- and α-globin genes are located on chromosome 16, but the ϵ-, γ-, δ-, and β-globin genes (known as the β-globin gene family) are linked together, in order of appearance, on chromosome 11. It appears, then, that there is a mechanism that directs the sequential switching of the chromosome 11 genes from embryonic, to fetal, to adult globins.

Figure 5.19. Diagram of the human β-globin family of genes on chromosome 11. The erythroid-specific LCR region is located 6 to 22 kilobases upstream of the ϵ-globin gene. The five DNase I-hypersensitive sites within this region are marked by arrows. …
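
One consequence of this gene arrangement is that every hemoglobin described above pairs two chains encoded on chromosome 16 with two chains encoded on chromosome 11. The short sketch below simply tabulates what the text states; the dictionary names and the label "adult (minor)" for the δ-containing hemoglobin are illustrative choices, not standard nomenclature.

```python
# Chromosome assignments for the human globin genes, as given in the text.
CHROMOSOME = {
    "zeta": 16, "alpha": 16,
    "epsilon": 11, "gamma": 11, "delta": 11, "beta": 11,
}

# Each hemoglobin is a tetramer: two chains of one type plus two of another.
HEMOGLOBINS = {
    "embryonic": ("zeta", "epsilon"),
    "fetal": ("alpha", "gamma"),
    "adult": ("alpha", "beta"),
    "adult (minor)": ("alpha", "delta"),
}


def chain_sources(name):
    """Return the chromosome that encodes each of the two chain types."""
    first, second = HEMOGLOBINS[name]
    return CHROMOSOME[first], CHROMOSOME[second]


for name in HEMOGLOBINS:
    print(name, chain_sources(name))
```

Each entry reports one chain type from chromosome 16 and one from chromosome 11, so the developmental switch on chromosome 11 (ϵ to γ to β and δ) is accompanied by a separate switch on chromosome 16 (ζ to α).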

The discovery of the human globin LCR came from studies of the genetic disease β-thalassemia. This anemic condition results from a lack of β-globin and can be caused in several ways. The usual causes of β-thalassemia involve deletions or mutations in either the coding region of the β-globin gene or its promoter. However, in some patients, there is a deletion in a region upstream from the β-globin gene family, while the genes themselves are normal. Moreover, without this upstream region, the β-globin family DNA was found to be DNase I-insensitive (van der Ploeg et al. 1980; Kioussis et al. 1983). DNase I treatment is used to see whether the DNA in chromatin is accessible to transcription factors. If the DNA in chromatin is not digested by DNase I, it means that DNase I cannot reach it, and therefore, transcription factors couldn't reach it either. Promoters are usually DNase I-sensitive in the cells where they function; and they are usually DNase I-insensitive in those cells where they are not active (Weintraub and Groudine 1976; Stalder et al. 1980). So it appeared that there was a region of DNA upstream from the β-globin gene cluster that was responsible for “opening up” the chromatin of the genes, making them accessible to transcription factors. This region of DNA was termed the β-globin locus control region.

The locus control region for the β-globin gene complex is located far upstream from its most 5´ member (ϵ). This LCR contains four sites that are DNase I-hypersensitive only in erythroid precursor cells. Sites are said to be DNase I-hypersensitive when the DNA in the chromatin can be digested there with only small amounts of DNase I. In most instances, a site is thought to be DNase I-hypersensitive when it lacks nucleosomes (Elgin 1988). These sites in the LCR are therefore within nucleosome coils in most cells' nuclei, but in the precursors of the red blood cells, this DNA is exposed. The entire LCR is necessary for activating high levels of erythroid cell-specific transcription of the entire β-globin gene family on human chromosome 11 (Grosveld et al. 1987). Deletion or mutation of the LCR causes the silencing of all these genes. Conversely, if the LCR is placed adjacent to a gene that is not usually expressed in red blood cells (such as the T cell-specific thy-1 gene) and then transfected into erythroid precursor cells, the additional gene will be expressed in the red blood cells. This effect is specific for red blood cell precursors, since only they have the appropriate transcription factors to bind to the LCR (Blom van Assendelft et al. 1989; Fiering et al. 1993). If any of the globin genes are separated from the LCR, they are repressed, even in the erythroid cells that would normally be transcribing them.

The locus control region is crammed with binding sites for transcription factors. As Gary Felsenfeld (1992) observed, “The domains look as though they were put together by an overenthusiastic student determined to construct a powerful cis-acting element.” He suggested that one of the functions of the LCR is to loop around to one of the globin promoter regions during DNA replication and bind to it in a manner that prevents nucleosomes from forming on that promoter. Indeed, the globin promoters are not DNase I-hypersensitive (i.e., accessible to transcription factors) except in the presence of the LCR.

It is not known why the genes closest to the LCR are transcribed earlier than the genes farther from it. Interestingly, experiments have shown that the distance between the LCR and the globin genes does affect their activation (Hanscombe et al. 1991; Dillon et al. 1997). When linked closely to the LCR, the human β-globin gene becomes expressed in transgenic mouse embryonic cells. Its correct activation (in adult red cells only) is restored only when it is placed farther away from the LCR. Similarly, the human γ-globin gene is repressed earlier (like the normal β-globin gene) when it is placed farther from the LCR. These findings suggest that the interaction between the LCR and the globin genes is polarized: those globin genes closest to the LCR are turned on earliest, while those genes more distal are turned on later. Presumably, there is physical contact between the LCR and the gene-specific promoters and enhancers.

LCRs have been found for other genes, such as the human growth hormone locus, the macrophage-specific lysozyme gene, and the CD2 and CD4 genes expressed in T lymphocytes (see Kioussis and Festenstein 1997; Fraser and Grosveld 1998).


5.8 Further mechanisms of transcriptional regulation. These three websites cover (1) DNase I hypersensitivity, (2) the mechanisms by which LCRs may regulate the temporal expression of tandemly linked genes, and (3) the association of active genes with the nuclear matrix.



The term exon refers to a nucleotide sequence whose RNA “exits” the nucleus. It has taken on the functional definition of a protein-encoding nucleotide sequence. Leader sequences and 3´ untranslated sequences are also derived from exons, even though they are not translated into protein.

By convention, upstream, downstream, 5´, and 3´ directions are specified in relation to the RNA. Thus, the promoter is upstream of the gene, near its 5´ end.

There are several types of RNA that do not encode proteins. These include the ribosomal RNAs and transfer RNAs (which are used in protein synthesis) and the small nuclear RNAs (which are used in RNA processing). In addition, there are regulatory RNAs (such as Xist and lin-4, which we will discuss later in this chapter) that are involved in regulating gene expression (and which do not encode proteins).


TF stands for transcription factor; II indicates that the factor was first found to be needed for RNA polymerase II; and the letter designations refer to the active fractions from the phosphocellulose columns used to purify these proteins.

Cis- and trans-regulatory elements are so named by analogy with E. coli genetics. There, cis-factors are regulatory elements that reside on the same strand of DNA (cis-, meaning “on the same side as”), while trans-elements are those that could be supplied from another chromosome (trans-, meaning “on the other side of”). The term cis-regulatory elements now refers to those DNA sequences that regulate a gene on the same stretch of DNA (i.e., the promoters and enhancers). Trans-regulatory factors are soluble molecules whose genes are located elsewhere in the genome and which bind to the cis-regulatory elements. They are usually transcription factors.

Copyright © 2000, Sinauer Associates.
Bookshelf ID: NBK10023

