NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Lodish H, Berk A, Zipursky SL, et al. Molecular Cell Biology. 4th edition. New York: W. H. Freeman; 2000.

Section 10.5 Eukaryotic Transcription Activators and Repressors

As in E. coli, the various transcription-control elements found in eukaryotic DNA are binding sites for trans-acting regulatory proteins. In this section, we discuss the identification, purification, and structures of these transcription factors, which function as activators and repressors of eukaryotic protein-coding genes. Our discussion focuses on activators, as these have been studied most extensively.

Biochemical and Genetic Techniques Have Been Used to Identify Transcription Factors

In yeast, Drosophila, and other genetically tractable eukaryotes, classical genetic studies have identified genes encoding transcription factors. However, in mammals and other vertebrates, which are less amenable to such genetic analysis, most transcription factors have been identified by biochemical purification.

Biochemical Isolation of Transcription Factors

Once a DNA regulatory element has been identified by the kinds of mutational analyses described in the previous section, it can be used to identify cognate proteins that bind specifically to it. In this approach, an extract of cell nuclei is subjected to several chromatographic steps (Figure 10-35a); fractions are assayed by DNase I footprinting or an electrophoretic mobility shift assay using DNA fragments containing the identified regulatory element (see Figures 10-6 and 10-7). Fractions containing protein that binds to the regulatory element in these assays probably contain a putative transcription factor (Figure 10-35b). A powerful technique commonly used for the final step in purifying transcription factors is sequence-specific DNA affinity chromatography, a particular type of affinity chromatography (Section 3.5). As a final test, the ability of the isolated protein to stimulate transcription of a template containing the corresponding protein-binding sites is assayed in an in vitro transcription reaction (Figure 10-36).

Figure 10-35. Transcription-factor purification. (a) Several chromatographic steps are used to purify a transcription factor (cognate protein) that binds to a specific regulatory element in DNA. ...

Figure 10-36. In vitro transcription activation by SP1, which binds to 10-bp GC-rich sequences. (a) The SV40 genome contains six copies of a GC-rich promoter-proximal element upstream of the early ...

Once a transcription factor is isolated, its partial amino acid sequence can be determined and used to clone the gene or cDNA encoding it, as outlined in Chapter 7. The isolated gene can then be used to test the ability of the encoded protein to stimulate transcription in an in vivo transfection assay (Figure 10-37).

Figure 10-37. In vivo assay for transcription factor activity. The assay system requires two plasmids. One plasmid contains the gene encoding the putative transcription factor (X protein). The ...

Genetic Identification of Genes Encoding Transcription Factors

In yeast, genes encoding transcription factors were first identified through classical genetic analysis. For example, one of the yeast genes required for growth on galactose is called GAL4. Incubation of wild-type yeast cells in galactose media results in more than a thousand-fold increase in the concentration of mRNAs encoding the enzymes catalyzing galactose metabolism. This activation of mRNA expression is not observed in gal4 mutants. (In S. cerevisiae, wild-type genes are designated with capital letters in italics, and recessive mutant alleles of the gene are indicated with lowercase letters in italics. The encoded protein is designated by the name of the gene in Roman type, with the first letter capitalized, e.g., Gal4.) Directed mutagenesis studies like those described previously identified UASs for the induced genes. Each of these UASs was found to contain one or more copies of a related 17-bp sequence called UASGAL. When a copy of UASGAL was cloned upstream of a TATA box followed by a lacZ reporter gene, expression of lacZ was activated in galactose media in wild-type cells, but not in gal4 mutants. This indicated that UASGAL is a transcription-control element activated by the Gal4 protein in galactose media.

The GAL4 gene was isolated by complementation of a gal4 mutant with a library of wild-type yeast DNA (Section 8.2). By use of recombinant DNA techniques, the Gal4 protein was expressed in E. coli and found to bind to UASGAL. Thus, the Gal4 protein binds to UASGAL sequences and activates transcription from a nearby promoter when cells are placed in galactose media.

Classical genetic studies in a number of other organisms including Drosophila, the nematode C. elegans, and higher plants have uncovered several genes encoding transcription factors. For example, many mutations that interfere with normal Drosophila development have been identified. One of these inactivates the Ultrabithorax (Ubx) gene, causing an extra pair of wings to develop from the third thoracic segment (see Figure 8-8b). The protein encoded by wild-type Ubx has been shown to function as a transcription factor. The remarkable change in phenotype observed in Ubx mutants indicates that Ubx protein influences transcription of a large number of Drosophila genes.

Transcription Activators Are Modular Proteins Composed of Distinct Functional Domains

A remarkable set of experiments with the yeast Gal4 protein demonstrated that this transcription factor is composed of separable functional domains: a DNA-binding domain, which interacts with specific DNA sequences, and an activation domain, which interacts with other proteins to stimulate transcription from a nearby promoter. In these experiments, a series of gal4 deletion mutants were tested for their ability to activate transcription of a reporter gene (lacZ) linked to UASGAL (Figure 10-38a) in an in vivo assay like that depicted in Figure 10-37. Transcription activation was measured in cells lacking the GAL4 gene, so that wild-type Gal4 protein would not interfere with the analysis. Binding of the mutant Gal4 proteins to the UASGAL sequence also was assayed. The results of these experiments, outlined in Figure 10-38b, demonstrate that Gal4 contains an N-terminal 74-amino acid DNA-binding domain and a C-terminal activation domain. When the N-terminal DNA-binding domain of Gal4 was fused directly to various C-terminal fragments, the resulting truncated proteins retained the ability to stimulate expression of the reporter gene. Thus the internal portion of the protein is not required for functioning of Gal4 as a transcription factor.

Figure 10-38. Experimental demonstration of separate functional domains in yeast Gal4 protein. (a) Diagram of DNA construct containing a lacZ reporter gene with an added TATA box ligated to UASGAL, a ...

Similar experiments with the yeast transcription factor Gcn4, which regulates genes required for synthesis of many amino acids, indicated that it contains an ≈60-aa DNA-binding domain at its C-terminus and an ≈20-aa activation domain near the middle of its sequence. Further evidence for the existence of distinct activation domains in Gal4 and Gcn4 came from experiments in which their activation domains were fused to a DNA-binding domain from an entirely unrelated E. coli repressor. Introduction of a reporter gene construct containing the cognate site for the E. coli repressor upstream from a TATA-box and lacZ, and an expression vector for the repressor DNA-binding domain fused to the coding sequence for either the Gal4 or Gcn4 activation domain, led to expression of the reporter gene in yeast cells. In this case, a fusion protein consisting of the DNA-binding domain from one transcription factor and the activation domain from a different factor was expressed in vivo and activated transcription. Thus, entirely novel transcription factors composed of prokaryotic and eukaryotic elements can be constructed.
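The logic of these domain-swap experiments can be sketched as a toy model. This is only an illustrative abstraction, not a simulation of any real assay: the `Factor` class, the `reporter_expressed` function, and the site and domain labels are all hypothetical names introduced here. The single assumption encoded is the one the experiments support: a reporter is activated only when the factor's DNA-binding domain recognizes the upstream site and the factor also carries an activation domain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Factor:
    """Toy transcription factor: one DNA-binding domain, optional activation domain."""
    binds_site: str                    # label of the DNA site the binding domain recognizes
    activation_domain: Optional[str]   # None means the activation domain was deleted


def reporter_expressed(factor: Factor, upstream_site: str) -> bool:
    """Reporter is on only if the factor binds the site AND has an activation domain."""
    return factor.binds_site == upstream_site and factor.activation_domain is not None


# Hypothetical labels; "repressor_site" stands for the cognate site of the E. coli repressor.
gal4 = Factor(binds_site="UAS_GAL", activation_domain="Gal4_AD")
gal4_dbd_only = Factor(binds_site="UAS_GAL", activation_domain=None)
fusion = Factor(binds_site="repressor_site", activation_domain="Gal4_AD")

for name, factor, site in [
    ("full-length Gal4 on UAS_GAL", gal4, "UAS_GAL"),
    ("Gal4 DNA-binding domain alone on UAS_GAL", gal4_dbd_only, "UAS_GAL"),
    ("repressor DBD + Gal4 AD on repressor site", fusion, "repressor_site"),
    ("repressor DBD + Gal4 AD on UAS_GAL", fusion, "UAS_GAL"),
]:
    print(f"{name}: {reporter_expressed(factor, site)}")
```

In this sketch the fusion protein activates only the reporter carrying the repressor's cognate site, which mirrors the observation that the DNA-binding domain determines where the factor acts and the activation domain determines whether it stimulates transcription.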

Studies such as these have now been carried out with many eukaryotic transcription factors. Activation domains in mammalian transcription factors are frequently assayed by fusing them to the Gal4 DNA-binding domain since mammalian cells do not contain an endogenous transcription factor that binds to the UASGAL sequence. The structural model of eukaryotic activators that has emerged from these studies is a modular one in which one or more activation domains are connected to a sequence-specific DNA-binding domain through relatively flexible protein domains (Figure 10-39). In some cases, amino acids included in the DNA-binding domain also contribute to transcriptional activation. As discussed in a later section, activation domains are thought to function through protein-protein interactions with transcription factors bound at the promoter. The flexible protein domains in activators, which connect the DNA-binding domains to activation domains, may explain why alterations in the spacing between control elements are so well tolerated in eukaryotic control regions. When the DNA-binding domains of neighboring transcription factors are shifted in their relative positions on the DNA, their activation domains may still be able to interact because they are attached to their DNA-binding domains through flexible protein regions.

Figure 10-39. Schematic diagrams illustrating the modular structure of eukaryotic transcription activators. These transcription factors may contain more than one activation domain but rarely contain more ...

DNA-Binding Domains Can Be Classified into Numerous Structural Types

Eukaryotic transcription factors contain a variety of structural motifs that interact with specific DNA sequences. As with most bacterial activators and repressors, α helices in the DNA-binding domain of eukaryotic transcription factors are oriented so that they lie in the major groove of DNA where protein atoms make specific hydrogen bonds and van der Waals interactions with atoms in the DNA. Interactions with sugar-phosphate backbone atoms and, in some cases, with atoms in the DNA minor groove also contribute to binding. X-ray crystallographic analyses of complexes between specific protein-binding sites in DNA and isolated transcription-factor DNA-binding domains have revealed a number of structural motifs that can present an α helix to the major groove.

Transcription factors often are classified according to the type of DNA-binding domain they contain. Most of the structural classes of DNA-binding domains have characteristic consensus amino acid sequences. Consequently, newly characterized transcription factors frequently can be classified once the corresponding genes or cDNAs are cloned and sequenced. Several common classes of DNA-binding domains whose three-dimensional structures have been determined are described and illustrated here. Many additional classes are recognized, and new classes are still being characterized. The genomes of higher eukaryotes may encode dozens of classes of DNA-binding domains and literally hundreds of transcription factors.

Homeodomain Proteins

The structure of the DNA-binding domain from the Drosophila Engrailed protein is depicted in Figure 10-40. Transcription factors with this type of DNA-binding domain are called homeodomain proteins, a name derived from a group of Drosophila genes in which the conserved sequence encoding this structural motif was first noted. Mutations in these genes, called homeotic genes, result in the transformation of one body part into another during development (Section 14.3). Two of the most-studied of these genes are designated Antennapedia (Antp) and Ultrabithorax (Ubx). The proteins encoded by these genes share a highly conserved 60-aa region; this same conserved region was subsequently identified in the proteins encoded by other homeotic genes. Because the conserved DNA sequence encoding this region was often diagrammed in a box when sequences from different genes were compared, it came to be known as the homeobox. The conserved sequence has also been found in vertebrate genes, including human genes, that have similar master control functions in development.

Figure 10-40. Homeodomain from Engrailed protein interacting with its specific DNA recognition site. The Engrailed transcription factor is expressed during Drosophila embryogenesis. Base pairs ...

Zinc-Finger Proteins

A number of different proteins have regions that fold around a central Zn2+ ion, producing a compact domain from a relatively short length of the polypeptide chain. Termed a zinc finger, this structural motif was first recognized in DNA-binding domains but now is known to occur in proteins that do not bind to DNA. We describe three of the several classes of zinc-finger motifs that have been identified.

The interaction between DNA and a transcription factor containing five C2H2 zinc-finger domains is shown in Figure 10-41a. Each C2H2 finger has the consensus sequence Tyr/Phe-X-Cys-X2–4-Cys-X3-Phe/Tyr-X5-Leu-X2-His-X3–4-His, where X is any amino acid. This sequence binds one Zn2+ ion through the two cysteine (C) and two histidine (H) side chains. The name “zinc finger” was coined because a two-dimensional diagram of the structure resembles a finger (see Figure 3-9c). When the three-dimensional structure was solved, it became clear that the binding of the Zn2+ ion by the two cysteine and two histidine residues folds the relatively short polypeptide sequence into a compact domain, which can insert its α helix into the major groove of DNA. The C2H2 zinc finger is one of the most common DNA-binding motifs in eukaryotic transcription factors. More than a thousand of these consensus sequences are in the current protein sequence database. The repeating units in these proteins can interact with successive groups of base pairs, primarily within the major groove, as the protein wraps around the DNA double helix.
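Because the C2H2 consensus is a fixed pattern of required residues and variable-length spacers, it maps directly onto a regular expression over one-letter amino acid codes. The following is a minimal sketch of how such a consensus could be used to scan a protein sequence. The function name `find_c2h2_fingers` and the test sequence are made up for illustration (the sequence is synthetic, not taken from a real protein), and a real motif search would normally use a curated profile rather than a bare regular expression.

```python
import re

# Tyr/Phe-X-Cys-X2-4-Cys-X3-Phe/Tyr-X5-Leu-X2-His-X3-4-His in one-letter codes,
# where "." stands for X (any amino acid).
C2H2_PATTERN = re.compile(r"[YF].C.{2,4}C.{3}[FY].{5}L.{2}H.{3,4}H")


def find_c2h2_fingers(protein: str):
    """Return (start_index, matched_text) for each non-overlapping consensus match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein.upper())]


# Synthetic finger assembled piece by piece so each consensus position is visible.
finger = "Y" + "K" + "C" + "PE" + "C" + "GKA" + "F" + "SRSDE" + "L" + "TR" + "H" + "QRT" + "H"
protein = "MA" + finger + "GG"

print(find_c2h2_fingers(protein))

# Replacing the first zinc-binding cysteine with alanine destroys the match.
mutant = protein.replace("YKC", "YKA", 1)
print(find_c2h2_fingers(mutant))
```

Note that `re.finditer` reports non-overlapping matches only, which is adequate here because successive fingers in a protein occupy separate stretches of the chain.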

Figure 10-41. Interaction of C2H2 and nuclear receptor (C4) zinc-finger domains with DNA (blue). (a) A five-finger C2H2 protein called GL1. This monomeric protein ...

A second type of zinc-finger structure, designated the C4 zinc finger, is found in more than 100 transcription factors. The first members of this class were identified as specific intracellular high-affinity binding proteins, or “receptors,” for steroid hormones, leading to the name steroid receptor superfamily. Because similar intracellular receptors for nonsteroid hormones subsequently were found, these transcription factors are now commonly called nuclear receptors. The DNA-binding domain of these proteins has the consensus sequence Cys-X2-Cys-X13-Cys-X2-Cys-X14–15-Cys-X5-Cys-X9-Cys-X2-Cys. The two groups of four critical cysteines in this region each bind a Zn2+ ion. Although the C4 zinc-finger motif initially was named by analogy with the C2H2 zinc-finger motif, the three-dimensional structures of these DNA-binding domains later were found to be quite distinct. A particularly important difference between the two is that C2H2 zinc-finger proteins generally contain three or more repeating finger units and bind as monomers, whereas C4 zinc-finger proteins generally contain only two finger units and bind to DNA as homodimers or heterodimers. Like bacterial homodimeric helix-turn-helix DNA-binding domains, homodimers of C4 DNA-binding domains have twofold rotational symmetry (Figure 10-41b). Consequently, homodimeric nuclear receptors bind to consensus DNA sequences that are inverted repeats, another similarity with bacterial systems. Heterodimeric nuclear receptors do not exhibit rotational symmetry; in these proteins one C4 monomer is inverted relative to the lower monomer in Figure 10-41b.
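The link between twofold rotational symmetry and inverted repeats can be made concrete: a site is an inverted repeat when its second half-site is the reverse complement of the first, so that each monomer of a symmetric dimer sees the same sequence when read 5′ to 3′ on its own strand. The sketch below checks this property. The function names, the half-site length, the spacer length, and the example sequences are illustrative assumptions, not values given in the text.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def is_inverted_repeat(site: str, half_len: int, spacer_len: int) -> bool:
    """True if site = half-site + spacer + reverse complement of that half-site."""
    site = site.upper()
    if len(site) != 2 * half_len + spacer_len:
        return False
    left = site[:half_len]
    right = site[half_len + spacer_len:]
    return right == reverse_complement(left)


# Illustrative sequences: two 6-bp half-sites separated by a 3-bp spacer.
inverted = "AGAACA" + "TTT" + "TGTTCT"   # second half is the reverse complement of the first
direct = "AGAACA" + "TTT" + "AGAACA"     # second half simply repeats the first

print(is_inverted_repeat(inverted, half_len=6, spacer_len=3))
print(is_inverted_repeat(direct, half_len=6, spacer_len=3))
```

A direct repeat fails the check because the two half-sites point the same way along the DNA, an arrangement that a dimer with twofold rotational symmetry cannot match.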

The DNA-binding domain in the yeast Gal4 protein exhibits a third type of zinc-finger motif, known as the C6 zinc finger. Proteins of this class have the consensus sequence Cys-X2-Cys-X6-Cys-X5–6-Cys-X2-Cys-X6-Cys. The six cysteines bind two Zn2+ ions, folding the region into a compact globular domain (Figure 10-42). The Gal4 protein binds DNA as a homodimer in which the monomers associate through hydrophobic interactions along one face of their α-helical regions. This type of interaction between α helices, to form a coiled coil, also occurs in dimeric leucine-zipper proteins and is discussed in more detail below.

Figure 10-42. Trace diagram of interaction between Gal4, a C6 zinc-finger protein, and DNA. This protein binds DNA as a homodimer with the monomers interacting to form a coiled coil that lies perpendicular ...

Winged-Helix (Forkhead) Proteins

The DNA-binding domains in histone H5 and several transcription factors that function during early development of Drosophila and mammals have the winged-helix motif, also called the forkhead motif. Like C2H2 zinc-finger proteins, winged-helix proteins generally bind to DNA as monomers.

Leucine-Zipper Proteins

Another structural motif present in a large class of transcription factors is exemplified by the DNA-binding domain of yeast Gcn4. The first transcription factors recognized in this class contained the hydrophobic amino acid leucine at every seventh position in the C-terminal portion of their DNA-binding domains. These proteins bind to DNA as dimers, and mutagenesis of the leucines showed that they were required for dimerization. Consequently, the name leucine zipper was coined to denote this structural motif.
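The defining feature described here, a leucine at every seventh position, is easy to test for in a sequence. The sketch below finds the longest run of leucines spaced exactly seven residues apart. The function name and the test sequence are hypothetical (the sequence is synthetic), and a run found this way is only suggestive: as the text goes on to explain, related proteins use other hydrophobic residues at these positions.

```python
def longest_heptad_leucine_run(protein: str, step: int = 7):
    """Return (start_index, count) of the longest run of Leu residues spaced `step` apart.

    Returns (-1, 0) if the sequence contains no leucine.
    """
    protein = protein.upper()
    best_start, best_count = -1, 0
    for start, residue in enumerate(protein):
        if residue != "L":
            continue
        # Walk forward in steps of seven for as long as we keep landing on leucine.
        count, pos = 0, start
        while pos < len(protein) and protein[pos] == "L":
            count += 1
            pos += step
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


# Synthetic example: four heptads, each beginning with leucine, after a 2-residue lead.
sequence = "MK" + "LAAAAAA" * 4 + "G"
print(longest_heptad_leucine_run(sequence))
```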

X-ray crystallographic analysis of complexes between DNA and the Gcn4 DNA-binding domain has shown that the dimeric protein contains two extended α helices that “grip” the DNA molecule, much like a pair of scissors, at two adjacent major grooves separated by about half a turn of the double helix (Figure 10-43). The portions of the α helices contacting the DNA include basic residues that interact with phosphates in the DNA backbone and additional residues that interact with specific bases in the major groove.

Figure 10-43. Two views of the interaction of yeast Gcn4, a homodimeric leucine-zipper protein, with DNA. The extended α-helical regions of the monomers grip the DNA at adjacent major grooves. ...

Gcn4 forms dimers via hydrophobic interactions between the C-terminal regions of the α helices, forming a coiled-coil structure. This structure is common in proteins containing amphipathic α helices in which hydrophobic amino acid residues are regularly spaced alternately three or four positions apart in the sequence. As a result of this characteristic spacing, the hydrophobic side chains form a stripe down one side of the α helix. These hydrophobic stripes make up the interacting surfaces between the α-helical monomers in a coiled-coil dimer (see Figure 3-9a).
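A short calculation shows why spacing hydrophobic residues alternately three and four positions apart places them in a stripe. The sketch assumes the standard α-helix geometry of about 3.6 residues per turn, that is, 100° of rotation per residue; this value is not stated in the text and is brought in here as an assumption. The function names are hypothetical.

```python
def alternating_positions(n: int):
    """First n residue indices spaced alternately 3 and 4 apart: 0, 3, 7, 10, 14, ..."""
    positions, pos = [0], 0
    for i in range(n - 1):
        pos += 3 if i % 2 == 0 else 4
        positions.append(pos)
    return positions


def helix_angles(positions, degrees_per_residue: float = 100.0):
    """Angular position of each residue around the helix axis, in [0, 360)."""
    return [(p * degrees_per_residue) % 360.0 for p in positions]


def arc_spread(angles):
    """Width in degrees of the arc covered, measured around 0 degrees."""
    signed = [(a + 180.0) % 360.0 - 180.0 for a in angles]
    return max(signed) - min(signed)


hydrophobic = alternating_positions(5)
print(hydrophobic)                            # residue indices
print(helix_angles(hydrophobic))              # angles around the helix axis
print(arc_spread(helix_angles(hydrophobic)))  # arc occupied by the hydrophobic residues
print(arc_spread(helix_angles(range(5))))     # arc occupied by five consecutive residues
```

Under this assumption the five hydrophobic positions fall within an 80° arc, whereas five consecutive residues are spread over most of the helix circumference. The slow drift of the stripe from one heptad to the next is consistent with the two helices winding around each other in a coiled coil.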

As noted above, the first transcription factors in this class to be analyzed contained leucine residues at every seventh position in the dimerization region and thus were named leucine-zipper proteins. However, additional DNA-binding proteins containing other hydrophobic amino acids in these positions subsequently were identified. Like leucine-zipper proteins, they form dimers containing a C-terminal coiled-coil dimerization region and N-terminal DNA-binding domain. The term basic zipper (bZip) now is frequently used to refer to all proteins with these common structural features. Many basic-zipper transcription factors are heterodimers of two different polypeptide chains, each containing one basic-zipper domain.

Helix-Loop-Helix Proteins

The DNA-binding domain of another class of dimeric transcription factors contains a structural motif very similar to the basic-zipper motif except that a nonhelical loop of the polypeptide chain separates two α-helical regions in each monomer (Figure 10-44). Termed a helix-loop-helix (HLH), this motif was predicted from the amino acid sequences of these proteins, which contain an N-terminal α helix with basic residues that interact with DNA, a middle loop region, and a C-terminal region with hydrophobic amino acids spaced at intervals characteristic of an amphipathic α helix. Because of the basic amino acids characteristic of this motif, transcription factors containing it sometimes are referred to as basic helix-loop-helix (bHLH) proteins. As with basic-zipper proteins, different helix-loop-helix proteins can form heterodimers.

Figure 10-44. Interaction of the helix-loop-helix domain in the homodimeric Max protein with DNA. The helix-loop-helix motif extends from the DNA-binding helices on the left (N-termini of the monomers) ...

Heterodimeric Transcription Factors Increase Gene-Control Options

Three types of DNA-binding proteins discussed in the previous section can form heterodimers: C4 zinc-finger proteins, basic-zipper proteins, and helix-loop-helix proteins. Other classes of transcription factors whose structures are not specifically considered here also form heterodimeric proteins. In some heterodimeric transcription factors, each monomer has a DNA-binding domain with equivalent sequence specificity. In these proteins, the formation of heterodimers does not influence DNA-binding specificity, but rather allows the activation domains associated with each monomer to be brought together in a single transcription factor. However, if the monomers have different DNA-binding specificity, the formation of heterodimers increases the number of potential DNA sequences that a family of factors can bind, as illustrated in Figure 10-45a. In addition, there are examples of inhibitory basic-zipper and helix-loop-helix proteins that block DNA binding when they dimerize with a partner polypeptide normally capable of binding DNA. When these inhibitory factors are expressed, they repress transcriptional activation by the factors with which they interact (Figure 10-45b).

Figure 10-45. Combinatorial possibilities due to formation of heterodimeric transcription factors. (a) In the hypothetical example shown, transcription factors A, B, and C can each interact with each other, ...

The rules governing the interactions of members of a transcription-factor class are complex. This combinatorial complexity expands both the number of DNA sites from which these factors can activate transcription and the ways in which they can be regulated. This is not possible for transcription factors that bind only as monomers or homodimers.
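The combinatorial gain from heterodimerization can be counted directly. The sketch below enumerates the dimers that a small family of monomers could form, under the simplifying assumption that every monomer can pair with every other member and with itself; as noted above, the real pairing rules are more complex. It also models an inhibitory partner as one whose dimers cannot bind DNA. The function names and the monomer labels are hypothetical.

```python
from itertools import combinations_with_replacement


def possible_dimers(monomers):
    """All unordered pairs, homodimers included: n(n + 1)/2 for n monomers."""
    return list(combinations_with_replacement(monomers, 2))


def dna_binding_dimers(monomers, inhibitors=frozenset()):
    """Dimers that can still bind DNA: those containing no inhibitory monomer."""
    blocked = set(inhibitors)
    return [pair for pair in possible_dimers(monomers) if not blocked & set(pair)]


family = ["A", "B", "C"]
print(possible_dimers(family))       # three homodimers plus three heterodimers
print(len(possible_dimers(family)))  # versus 3 if only homodimers could form

# Adding an inhibitory partner "I" sequesters monomers in dimers that do not bind DNA.
print(dna_binding_dimers(["A", "B", "I"], inhibitors={"I"}))
```

With n monomers the count grows as n(n + 1)/2, compared with n for factors restricted to homodimers, which is the expansion in gene-control options described in this section.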

Activation Domains Exhibit Considerable Structural Diversity

An activation domain is a polypeptide sequence that activates transcription when it is fused to a DNA-binding domain. For example, a large number of diverse peptide sequences can activate transcription in eukaryotic cells from a promoter with upstream UASGAL binding sites for the Gal4 DNA-binding domain. In one experiment, random fragments of E. coli DNA were ligated to the portion of the GAL4 gene encoding the DNA-binding domain of Gal4. Remarkably, ≈1 percent of all the resulting fusion proteins, composed of the Gal4 DNA-binding domain and random segments of E. coli proteins, activated transcription from promoters with an upstream UASGAL in yeast and mammalian cells. This finding demonstrated that a diverse group of amino acid sequences can function as activation domains, even though they evolved to perform other functions. The actual mechanism of transcription activation is considered in a later section.

Although numerous diverse amino acid sequences can function as activation domains, many activation domains have an unusually high percentage of particular amino acids. Gal4, Gcn4, and most other yeast transcription factors have activation domains that are rich in acidic amino acids (aspartic and glutamic acids). These so-called acidic activation domains generally are capable of stimulating transcription in nearly all types of eukaryotic cells — fungal, animal, and plant cells. Activation domains from some Drosophila and mammalian transcription factors are glutamine rich, and some are proline rich; still others are rich in the closely related amino acids serine and threonine, both of which have hydroxyl groups. However, some strong activation domains are not particularly rich in any specific amino acid.

Recent biophysical studies on model acidic activation domains show that they exist as unstructured, random-coil regions of polypeptide until they interact with a co-activator protein. This interaction induces the activation domain to fold into an amphipathic α helix that contacts a complementary surface of the co-activator protein. For instance, NMR spectra of the isolated activation domain in the mammalian CREB (cAMP response element–binding) protein show that it is an unstructured random coil. In response to elevated levels of cAMP, protein kinase A phosphorylates a specific serine residue in the CREB activation domain, allowing it to interact with a specific region in its co-activator CBP (CREB-binding protein). In the three-dimensional structure of the complex between these two proteins, two α helices at right angles to each other in the CREB activation domain wrap around the interacting domain of CBP (Figure 10-46).

Figure 10-46. Structure of the phosphorylated CREB acidic activation domain complexed to the interacting domain of its co-activator, CBP. The polypeptide backbone of the phosphorylated CREB acidic activation ...

In contrast to the relatively short, random-coil acidic activation domains, some activation domains are larger and more structured. For example, the ligand-binding domains of some nuclear receptors function as activation domains when they bind their specific ligand. Binding of ligand is thought to induce a large conformational change that allows the ligand-binding domain with bound hormone to interact with other proteins (Figure 10-47).

Figure 10-47. Effect of ligand binding on conformation of the ligand-binding domains in two related human nuclear receptors determined by x-ray crystallography. (a) In the absence of bound ligand (9-cis ...

Multiprotein Complexes Form on Enhancers

Enhancers generally range in length from about 50 to 200 base pairs and include binding sites for several transcription factors. The multiple transcription factors that bind to a single enhancer are thought to interact. Analysis of the enhancer that regulates expression of β-interferon, an important protein in defense against viral infections in humans, provides a good example of such transcription-factor interactions. Four control elements have been identified in this ≈70-bp enhancer by analysis of linker scanning mutations. Cognate proteins that bind to each of these four sites were identified by techniques described earlier. Once the cDNAs encoding these proteins were isolated, they were shown to activate transcription from the β-interferon enhancer in transfection experiments (see Figure 10-37).

Subsequent studies showed that these transcription factors bind to the β-interferon enhancer simultaneously. In the presence of a small, abundant protein associated with chromatin called HMGI, binding of the transcription factors is highly cooperative, similar to the binding of E. coli CAP protein and RNA polymerase to neighboring sites in the lac operon (see Figure 10-17). This cooperative binding produces a multiprotein complex on the enhancer DNA (Figure 10-48). The term enhancesome has been coined to describe such large nucleoprotein complexes that assemble from transcription factors as they bind cooperatively to their multiple binding sites in an enhancer.

Figure 10-48. Model of the enhancesome that forms on the β-interferon enhancer. Heterodimeric cJun/ATF-2, IRF-3, IRF-7, and NF-κB (a heterodimer of p50 and p65) bind to the four control ...

HMGI binds to the minor groove of DNA regardless of the sequence and, as a result, bends the DNA molecule sharply. This bending of the enhancer DNA permits the transcription factors to interact properly. The relatively weak interactions between the bound proteins are enhanced because the transcription factors are bound to neighboring sites, keeping the proteins at very high relative concentration. In studies with the enhancer from the T-cell receptor α gene, also important to the immune response, enhancesome assembly likewise was found to depend on a protein (different from HMGI) that interacts with the minor groove of DNA and introduces a bend. Such DNA-bending proteins have been referred to as architectural proteins, because they are required to build these nucleoprotein complexes.

Many Repressors Are the Functional Converse of Activators

Genetic and biochemical studies have shown that eukaryotic transcription is regulated by repressor proteins as well as the more-common activator proteins. For example, geneticists have identified mutations in yeast that result in constitutive expression of certain genes, indicating that these genes normally are regulated by a repressor. In another approach, repressor-binding sites were identified in systematic mutational analyses of eukaryotic transcription-control regions similar to the experiments depicted in Figure 10-32. While mutation of an activator-binding site leads to decreased expression of the linked reporter gene, mutation of a repressor-binding site leads to increased expression of a reporter gene. Repressor proteins that bind such sites have been purified and characterized using sequence-specific DNA affinity chromatography, as for activator proteins (see Figure 10-35).

The absence of appropriate repressor activity can have devastating consequences. For instance, the protein encoded by the Wilms’ tumor (WT1) gene is a repressor that is expressed preferentially in the developing kidney. Children who inherit mutations in both the maternal and paternal WT1 genes, so that they produce no functional WT1 protein, invariably develop kidney tumors early in life. The WT1 protein, which has a C2H2 zinc-finger DNA-binding domain, binds to the control region of the gene encoding a transcription activator called EGR-1 (Figure 10-49). This gene, like many other eukaryotic genes, is subject to both repression and activation. Binding of WT1 represses transcription of the EGR-1 gene without inhibiting binding of the two activators that normally stimulate expression of this gene. Eukaryotic transcription repressors like WT1 appear to be the functional converse of activators. They can inhibit transcription from a gene they do not normally regulate when their cognate binding sites are placed within a few hundred base pairs of the gene’s start site. The various mechanisms whereby repressor proteins exert their effects are described later.

Figure 10-49. Diagram of the control region of the gene encoding EGR-1, a transcription activator. The binding sites for WT1, a eukaryotic repressor protein, do not overlap the binding sites for SRF and ...

Like activators, many eukaryotic repressors have two functional domains: a DNA-binding domain and a repression domain. As is true for activation domains, a variety of amino acid sequences can function as repression domains. Many of these are relatively short (≈20 amino acids) and contain high proportions of hydrophobic residues. Other repression domains contain a high proportion of basic residues. In some cases, repression domains are larger, well-structured protein domains. For example, in the absence of ligand, the RXRα ligand-binding domain functions as a repression domain (see Figure 10-47a). When the same domain binds its cognate ligand, 9-cis retinoic acid, it is converted into an activation domain. As for activation domains, the diversity of structures among repression domains probably reflects several possible molecular mechanisms for regulating eukaryotic transcription. To begin to understand how activation and repression domains of eukaryotic transcription factors regulate gene expression, we must first discuss the complexities of transcription initiation by RNA polymerase II.

SUMMARY

  •  Transcription factors, which stimulate or repress transcription, bind to promoter-proximal elements and enhancers in eukaryotic DNA.
  •  Activators are generally modular proteins containing a single DNA-binding domain and one or a few activation domains; the different domains frequently are linked through flexible polypeptide regions (see Figure 10-39). This may allow activation domains in different activators to interact even when their DNA-binding domains are bound to sites separated by tens of base pairs.
  •  Enhancers generally contain multiple clustered binding sites for transcription factors. Cooperative binding of multiple activators to nearby sites in an enhancer forms a multiprotein complex called an enhancesome (see Figure 10-48). Assembly of enhancesomes often requires small proteins that bind to the DNA minor groove and bend the DNA sharply, allowing proteins on either side of the bend to interact more readily.
  •  Most eukaryotic repressors also are modular proteins. Like activators, they usually contain a single DNA-binding domain and one or a few repression domains, and they can control transcription when bound at sites hundreds to thousands of base pairs from a start site.
  •  DNA-binding domains in eukaryotic transcription factors exhibit a variety of structures. Among the most common structural motifs are the homeodomain, basic zipper (leucine zipper), helix-loop-helix, and several types of zinc finger. In general, one or more α helices in a DNA-binding domain interact with the major groove in its cognate site.
  •  The ability of some transcription factors to form heterodimers increases the number of DNA sites from which these factors can control transcription and the ways they can be controlled (see Figure 10-45).
  •  Although some activation and repression domains are rich in particular amino acids, these functional domains exhibit a variety of amino acid sequences and protein structures in different transcription factors.

Copyright © 2000, W. H. Freeman and Company.
Bookshelf ID: NBK21572