Molecular Cell Biology, 4th edition
Harvey Lodish,1 Arnold Berk,2 Lawrence Zipursky,2 Paul Matsudaira,3 David Baltimore,4 and James Darnell5
1Whitehead Institute for Biomedical Research and Massachusetts Institute of Technology
2Molecular Biology Institute, University of California, Los Angeles
3Howard Hughes Medical Institute, School of Medicine, University of California, Los Angeles
4California Institute of Technology (Caltech)
5Rockefeller University, New York
W. H. Freeman, 2000. ISBN 0-7167-3136-3
Keywords: cell biology, molecular biology

11.2 Processing of Eukaryotic mRNA

Figure 11-7. Overview of mRNA processing in eukaryotes

Shortly after RNA polymerase II initiates transcription at the first nucleotide of the first exon of a gene, the 5′ end of the nascent RNA is capped with 7-methylguanylate. Transcription by RNA polymerase II terminates at any one of multiple termination sites downstream from the poly(A) site, which is located at the 3′ end of the final exon. After the primary transcript is cleaved at the poly(A) site, a string of adenine (A) residues is added. The poly(A) tail contains ≈250 A residues in mammals, ≈150 in insects, and ≈100 in yeasts. For short primary transcripts with few introns, cleavage, polyadenylation, and splicing usually follow termination, as shown. For large genes with multiple introns, introns often are spliced out of the nascent RNA before transcription of the gene is complete. Note that the 5′ cap is retained in mature mRNAs.

As discussed in Chapter 4, the initial primary transcript synthesized by RNA polymerase II undergoes several processing steps before a functional mRNA is produced. In this section, we take a closer look at how eukaryotic cells carry out mRNA processing, which includes three major processes: 5′ capping, 3′ cleavage/polyadenylation, and RNA splicing (Figure 11-7). Processing occurs in the nucleus, and the functional mRNA produced is transported to the cytoplasm by mechanisms discussed later.

The 5′-Cap Is Added to Nascent RNAs Shortly after Initiation by RNA Polymerase II

After nascent RNA molecules produced by RNA polymerase II reach a length of 25 – 30 nucleotides, 7-methylguanosine is added to their 5′ end. This initial step in RNA processing is catalyzed by a dimeric capping enzyme, which associates with the phosphorylated carboxyl-terminal tail domain (CTD) of RNA polymerase II. Recall that the CTD becomes phosphorylated during transcription initiation (see Figure 10-50). Because the capping enzyme does not associate with polymerase I or III, capping is specific for transcripts produced by RNA polymerase II.

Figure 11-8. Capping of the 5′ end of nascent RNA transcripts with 7-methylguanylate (m7G)

The first two reactions are catalyzed by a capping enzyme that associates with the phosphorylated CTD of RNA polymerase II shortly after transcription initiation. Two different methyltransferases catalyze reactions 3 and 4. S-adenosylmethionine (S-Ado-Met) is the source of the methyl (CH3) group for the two methylation steps; the guanylate (G) is methylated first, then the 2′ hydroxyl of the first one or two nucleotides (N) in the transcript. See Figure 4-18 for structure of the resulting 5′ cap. [See S. Venkatesan and B. Moss, 1982, Proc. Nat’l. Acad. Sci. USA 79:304.]

One subunit of the capping enzyme removes the γ-phosphate from the 5′ end of the nascent RNA emerging from the surface of an RNA polymerase II (Figure 11-8). The other subunit transfers the GMP moiety from GTP to the 5′-diphosphate of the nascent transcript, creating the guanosine 5′-5′-triphosphate structure. In the final steps, separate enzymes transfer methyl groups from S-adenosylmethionine to the N7 position of the guanine and the 2′ oxygens of riboses at the 5′ end of the nascent RNA.

Pre-mRNAs Are Associated with hnRNP Proteins Containing Conserved RNA-Binding Domains

Figure 11-9. Visualization of hnRNP protein associated with nascent transcripts in an oocyte of the newt Notophthalmus viridescens

A portion of a “lampbrush” chromosome is shown. DNA at the chromosome axis stains white with the DNA-specific dye DAPI. The long red filaments are nascent transcripts bound by hnRNP proteins, which fluoresce red after staining with a monoclonal antibody against a specific hnRNP protein. [Courtesy of M. Roth and J. Gall.]

Nascent RNA transcripts from protein-coding genes and mRNA processing intermediates, collectively referred to as pre-mRNA, do not exist as free RNA molecules in the nuclei of eukaryotic cells. From the time nascent transcripts first emerge from RNA polymerase II until mature mRNAs are transported into the cytoplasm, the RNA molecules are associated with an abundant set of nuclear proteins, as numerous in growing eukaryotic cells as histones. These proteins are the major protein components of heterogeneous ribonucleoprotein particles (hnRNPs), which contain heterogeneous nuclear RNA (hnRNA), a collective term referring to pre-mRNA and other nuclear RNAs of various sizes. The proteins in these ribonucleoprotein particles can be dramatically visualized with fluorescent-labeled monoclonal antibodies (Figure 11-9).

To identify hnRNP proteins, researchers exposed cells to high-dose UV irradiation, which causes covalent cross-links to form between RNA bases and closely associated proteins. Chromatography of nuclear extracts from treated cells on an oligo-dT cellulose column, which binds RNAs with a poly(A) tail, was used to recover proteins that had become cross-linked to nuclear mRNA in living cells (i.e., hnRNP proteins). Subsequent treatment of cell extracts from unirradiated human cells with monoclonal antibodies specific for the major hnRNP proteins identified by this cross-linking technique revealed a complex set of abundant hnRNP proteins ranging in size from 34 to 120 kDa. Characterization of the mRNAs encoding these proteins has shown that some of them (e.g., A2 and B1) are related proteins derived by alternative splicing of exons from the same transcription unit.

Binding studies with purified hnRNP proteins suggest that different hnRNP proteins associate with different regions of a newly made pre-mRNA molecule as determined by the sequence of the RNA. For example, the hnRNP proteins A1, C, and D bind preferentially to the pyrimidine-rich sequences at the 3′ ends of introns, discussed in a later section. Like transcription factors, most hnRNP proteins have a modular structure. They contain one or more RNA-binding domains and at least one other domain that is thought to interact with other proteins. Several different RNA-binding motifs have been identified by constructing deletions of hnRNP proteins and testing their ability to bind RNA. Although some RNA-binding proteins contain domains with the zinc-finger motif common in DNA-binding proteins (see Figure 10-41), this motif has not yet been described in any hnRNP proteins.

Figure 11-10. Structure of complex between an RNP motif from U1A protein and RNA

(a) Diagram of the RNP motif domain. The conserved RNP1 and RNP2 regions are located in the two middle β strands. (b) Surface representation of the U1A protein-RNA complex determined by X-ray crystallography. The RNA forms a stem-loop with the single-stranded portion of the loop bound to the surface of the protein. The N- and C-termini are at the upper left and right, respectively. Acidic and basic amino acids are colored red and blue, respectively. [From K. Nagai et al., 1995, Trends Biochem. Sci. 20:235.]

The RNP motif, also called the RNA-binding domain (RBD), is the most common RNA-binding domain in hnRNP proteins. This ≈80-residue motif, which occurs in many other RNA-binding proteins, contains two highly conserved regions (RNP1 and RNP2) that allow the motif to be recognized in newly sequenced proteins. X-ray crystallographic analysis has shown that the RNP motif consists of a four-stranded β sheet flanked on one side by two α helices. The conserved RNP1 and RNP2 sequences lie side by side on the two central β strands, and their side chains make multiple contacts with a single-stranded region of RNA. The single-stranded RNA loop lies across the surface of the β sheet and fits into a groove between the protein loop connecting strands β2 and β3 and the C-terminal region (Figure 11-10).

The RGG box, another RNA-binding motif found in hnRNP proteins, contains five Arg-Gly-Gly (RGG) repeats with several interspersed aromatic amino acids. Although the structure of this motif has not yet been determined, its arginine-rich nature is similar to the RNA-binding domains of the λ-phage N and HIV Tat proteins.

The 45-residue KH motif is found in the hnRNP K protein and several other RNA-binding proteins; commonly two or more copies of the KH motif are interspersed with RGG repeats. The three-dimensional structure of a representative KH motif, determined by NMR methods (Section 3.5), is similar to that of the RNP motif but smaller, consisting of a three-stranded β sheet supported from one side by a single α helix. It is not yet clear how this motif binds RNA. Mutations in the fragile-X gene (FMR1), which encodes a protein containing the KH motif, are associated with the most common form of heritable mental retardation. Although the molecular function of the Fmr1 protein is unknown, it presumably involves RNA binding.

hnRNP Proteins May Assist in Processing and Transport of mRNAs

Figure 11-11. Hybridization of RNA molecules in vitro is accelerated by hnRNP proteins

The presence of complex secondary structures within RNA molecules inhibits hybridization between long complementary sequences in separate molecules. Association of hnRNP proteins with RNA is thought to prevent formation of RNA secondary structures, thereby facilitating base-pairing between different complementary molecules. These proteins may have a similar function in vivo. [Adapted from D. S. Portman and G. Dreyfuss, 1994, EMBO J. 13:213.]

The association of pre-mRNAs with hnRNP proteins may prevent formation of short secondary structures dependent on base-pairing of complementary regions, thereby making the pre-mRNAs accessible for interaction with other macromolecules (Figure 11-11). Moreover, pre-mRNAs associated with hnRNP proteins present a more uniform substrate for further processing steps than would free, unbound pre-mRNAs, each type of which forms a unique secondary structure dependent on its specific sequence.

The diversity of hnRNP proteins suggests that they probably have other functions as well. For example, various hnRNP proteins may interact with the RNA sequences that specify RNA splicing or cleavage/polyadenylation and contribute to the structure recognized by RNA-processing factors. Finally, cell-fusion experiments have shown that some hnRNP proteins remain localized in the nucleus, whereas others cycle in and out of the cytoplasm, suggesting that they function in the transport of mRNA (see later section).

Pre-mRNAs Are Cleaved at Specific 3′ Sites and Rapidly Polyadenylated

In animal cells, all mRNAs, except histone mRNAs, have a 3′ poly(A) tail. Early studies of pulse-labeled adenovirus and SV40 RNA demonstrated that the viral primary transcripts extend beyond the poly(A) site in the viral mRNAs. These results suggested that A residues are added to a 3′ hydroxyl generated by endonucleolytic cleavage, but the predicted downstream RNA fragments are degraded so rapidly in vivo that they cannot be detected. However, this cleavage mechanism was firmly established by detection of both predicted cleavage products in in vitro processing reactions performed with extracts of HeLa-cell nuclei.

Early sequencing of cDNA clones from animal cells showed that nearly all mRNAs contain the sequence AAUAAA 10 – 35 nucleotides upstream from the poly(A) tail. Polyadenylation of RNA transcripts from transfected genes is virtually eliminated when template DNA encoding the AAUAAA sequence is mutated to any other sequence except one encoding AUUAAA. The unprocessed RNA transcripts produced from such mutant templates do not accumulate in nuclei, but are rapidly degraded. Further mutagenesis of sequences within a few hundred bases of poly(A) sites revealed that a second signal downstream from the cleavage site is required for efficient cleavage and polyadenylation of most pre-mRNAs in animal cells. This downstream poly(A) signal is not a specific sequence but rather a GU-rich or simply a U-rich region within ≈50 nucleotides of the cleavage site.
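The two signals described above can be sketched as a simple sequence scan. The following is a minimal illustration, not a validated poly(A)-site predictor: the function names, the way distance is measured (from the 3′ end of the hexamer to the cleavage site), and the 70 percent G+U threshold for the downstream element are all assumptions made for this example.

```python
import re

# Upstream signal: AAUAAA, or the single tolerated variant AUUAAA.
# A lookahead is used so that overlapping matches are not skipped.
UPSTREAM_SIGNAL = re.compile(r"(?=(AAUAAA|AUUAAA))")


def find_poly_a_signals(rna, cleavage_site, min_dist=10, max_dist=35):
    """Return (start, hexamer, distance) for each upstream signal.

    cleavage_site is the 0-based index of the first nucleotide downstream
    of the cut. distance counts the nucleotides between the 3' end of the
    hexamer and the cleavage site (an assumption of this sketch).
    """
    hits = []
    for match in UPSTREAM_SIGNAL.finditer(rna):
        start = match.start()
        distance = cleavage_site - (start + 6)
        if min_dist <= distance <= max_dist:
            hits.append((start, match.group(1), distance))
    return hits


def has_downstream_element(rna, cleavage_site, window=50, min_gu_fraction=0.7):
    """Crude test for a GU-rich or U-rich region just 3' of the cleavage site.

    The 0.7 threshold is arbitrary and chosen only for illustration.
    """
    region = rna[cleavage_site:cleavage_site + window]
    if not region:
        return False
    gu = region.count("G") + region.count("U")
    return gu / len(region) >= min_gu_fraction


# Hypothetical transcript: hexamer, 15-nt spacer, then a GU-rich tail.
example = "GGCC" + "AAUAAA" + "C" * 15 + "GUGUUUGUUUGUGUUU"
print(find_poly_a_signals(example, cleavage_site=25))
print(has_downstream_element(example, cleavage_site=25))
```

Replacing AAUAAA in the example with any hexamer other than AUUAAA makes `find_poly_a_signals` return an empty list, mirroring the mutagenesis result described in the text.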

Figure 11-12. Model for cleavage and polyadenylation of pre-mRNAs in mammalian cells

Cleavage-and-polyadenylation specificity factor (CPSF) binds to an upstream AAUAAA polyadenylation signal. CStF interacts with a downstream GU- or U-rich sequence and with bound CPSF, forming a loop in the RNA; binding of CFI and CFII helps stabilize the complex. Binding of poly(A) polymerase (PAP) then stimulates cleavage at a poly(A) site, which usually is 10 – 35 nucleotides 3′ of the upstream polyadenylation signal. The cleavage factors are released, as is the downstream RNA cleavage product, which is rapidly degraded. Bound PAP then adds ≈12 A residues at a slow rate to the 3′-hydroxyl group generated by the cleavage reaction. Binding of poly(A)-binding protein II (PABII) to the initial short poly(A) tail accelerates the rate of addition by PAP. After 200 – 250 A residues have been added, PABII signals PAP to stop polymerization.

Identification and purification of the proteins required for cleavage and polyadenylation of pre-mRNA has led to the model shown in Figure 11-12. According to this model, a 360-kDa cleavage and polyadenylation specificity factor (CPSF), composed of four different polypeptides, first forms an unstable complex with the upstream AU-rich poly(A) signal. Then at least three additional proteins bind to the CPSF-RNA complex: a 200-kDa heterotrimer called cleavage stimulatory factor (CStF), a 150-kDa heterotrimer called cleavage factor I (CFI), and a second, as yet poorly characterized cleavage factor (CFII). Interaction between CStF and the GU- or U-rich downstream poly(A) signal stabilizes the multiprotein complex. Finally, a poly(A) polymerase (PAP) binds to the complex before cleavage can occur. This requirement for PAP binding links cleavage and polyadenylation, so that the free 3′ ends generated are rapidly polyadenylated. Assembly of this large, multiprotein cleavage-polyadenylation complex around the AU-rich poly(A) signal in a pre-mRNA is analogous in many ways to formation of the transcription-initiation complex at the AT-rich TATA box of a template DNA molecule (see Figure 10-50). In both cases, multiprotein complexes assemble cooperatively through a network of specific protein-nucleic acid and protein-protein interactions.

Following cleavage at the poly(A) site, polyadenylation proceeds in two phases. Addition of the first 12 or so A residues occurs slowly, followed by rapid addition of up to 200 – 250 more A residues. The rapid phase requires the binding of multiple copies of a poly(A)-binding protein containing the RNP motif. This protein is designated PABII to distinguish it from the poly(A)-binding protein that binds to the poly(A) tail of cytoplasmic mRNAs. PABII binds to the short A tail initially added by PAP, stimulating polymerization of additional A residues by PAP (see Figure 11-12). PABII is also responsible for signaling poly(A) polymerase to terminate polymerization when the poly(A) tail reaches a length of 200 – 250 residues, although the mechanism for measuring this length is not yet understood.

Splicing Occurs at Short, Conserved Sequences in Pre-mRNAs via Two Transesterification Reactions

Figure 11-13. Demonstration that introns are spliced out by electron microscopy of RNA-DNA hybrid between adenovirus DNA and the mRNA encoding hexon, a major viral protein

(a) Diagram of the EcoRI A fragment of adenovirus DNA, which extends from the left end of the genome to just before the end of the final exon of the hexon gene. The gene consists of three short exons and one long (≈3.5-kb) exon separated by three introns of ≈1, 2.5, and 9 kb. (b) Electron micrograph (left) and schematic drawing (right) of hybrid between an EcoRI A fragment and hexon mRNA. The loops marked A, B, and C correspond to the introns indicated in (a). Since these intron sequences in the viral genomic DNA are not present in mature hexon mRNA, they loop out between the exon sequences that hybridize to their complementary sequences in the mRNA. [Micrograph from S. M. Berget et al., 1977, Proc. Nat’l. Acad. Sci. USA 74:3171; courtesy of P. A. Sharp.]

During the final step in formation of a mature, functional mRNA, the introns are removed and exons are spliced together (see Figure 11-7). The discovery that introns are removed during splicing came from electron microscopy of RNA-DNA hybrids between adenovirus DNA and the mRNA encoding hexon, a major virion capsid protein (Figure 11-13). Similar analyses of hybrids between RNA isolated from the nuclei of infected cells and viral DNA revealed RNAs that were colinear with the viral DNA (primary transcripts) and RNAs with one or two of the introns removed (processing intermediates). These results, together with the findings that the 5′ cap and 3′ poly(A) tail of mRNA precursors are retained in mature cytoplasmic mRNAs, led to the realization that introns are removed from primary transcripts as exons are spliced together. For short transcription units, RNA splicing usually follows cleavage and polyadenylation of the 3′ end of the primary transcript. But for long transcription units containing multiple exons, splicing of exons in the nascent RNA usually begins before transcription of the gene is complete.

Figure 11-14. Consensus sequences around 5′ and 3′ splice sites in vertebrate pre-mRNAs

The only nearly invariant bases are the (5′)GU and (3′)AG of the intron, although the flanking bases indicated are found at frequencies higher than expected based on a random distribution. A pyrimidine-rich region (light blue) near the 3′ end of the intron is found in most cases. The branch-point adenosine, also invariant, usually is 20 – 50 bases from the 3′ splice site. The central region of the intron, which may range from 40 bases to 50 kilobases in length, generally is unnecessary for splicing to occur. [See R. A. Padgett et al., 1986, Ann. Rev. Biochem. 55:1119; E. B. Keller and W. A. Noon, 1984, Proc. Nat’l. Acad. Sci. USA 81:7417.]

The location of exon-intron junctions (i.e., splice sites) in a pre-mRNA can be determined by comparing the sequence of genomic DNA with that of the cDNA prepared from the corresponding mRNA. Sequences that are present in the genomic DNA but absent from the cDNA represent introns and indicate the positions of splice sites. Such analysis of a large number of different mRNAs revealed moderately conserved, short consensus sequences at intron-exon boundaries in eukaryotic pre-mRNAs; in higher organisms, a pyrimidine-rich region just upstream of the 3′ splice site also is common (Figure 11-14). The most conserved nucleotides are the (5′)GU and (3′)AG found at the ends of most introns. Deletion analyses of the center portion of introns in various pre-mRNAs have shown that generally only 30 – 40 nucleotides at each end of an intron are necessary for splicing to occur at normal rates.
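The genomic-versus-cDNA comparison can be sketched in a few lines for the simplest case. This is an illustration only, assuming a gene with exactly one intron and exon sequences that match the cDNA perfectly; real analyses use spliced-alignment programs. The function names and test sequences are hypothetical. Because bases at the exon edges often repeat at the intron edges, sequence comparison alone can leave several possible boundaries, and the GT-AG rule (GU-AG in the RNA) is used here to pick among them.

```python
def locate_single_intron(genomic, cdna):
    """Return every (start, end) such that deleting genomic[start:end]
    yields the cDNA. Assumes one intron and error-free sequences."""
    intron_len = len(genomic) - len(cdna)
    if intron_len <= 0:
        return []
    candidates = []
    for start in range(len(cdna) + 1):
        end = start + intron_len
        # Upstream flank must match the cDNA prefix and downstream flank
        # must match the cDNA suffix.
        if genomic[:start] == cdna[:start] and genomic[end:] == cdna[start:]:
            candidates.append((start, end))
    return candidates


def follows_gt_ag_rule(genomic, start, end):
    """True if the candidate intron begins with GT and ends with AG (DNA)."""
    intron = genomic[start:end]
    return intron.startswith("GT") and intron.endswith("AG")


# Hypothetical sequences chosen so that the boundary is ambiguous
# without the GT-AG rule.
exon1 = "ATGGTGCACCTGACTCCTGAGGAGAAG"
intron = "GTAAGTACTAACTCTTTCTTTCCAG"
exon2 = "GTTCTGCCGTTACTGCC"
genomic_dna = exon1 + intron + exon2
cdna_seq = exon1 + exon2

all_candidates = locate_single_intron(genomic_dna, cdna_seq)
consistent = [c for c in all_candidates if follows_gt_ag_rule(genomic_dna, *c)]
print(len(all_candidates), "candidate boundaries;", consistent, "obeys GT-AG")
```

In this example several boundary placements reproduce the cDNA equally well, but only one gives an intron that starts with GT and ends with AG.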

Recombinant DNAs containing the 5′ splice site of one transcription unit (e.g., SV40 late region) and the 3′ splice site of another (e.g., mouse β-globin gene) have been prepared and introduced into cultured cells. Spliced mRNA molecules are formed in which the two exon sequences are joined and the chimeric intron is deleted precisely. The formation of correctly spliced mRNAs in such experiments indicates that the cell’s splicing machinery can recognize 5′ and 3′ splice sites and correctly splice them together, with little influence from the intervening sequence in most cases.

Figure 11-15. Analysis of RNA products formed in an in vitro splicing reaction

A nuclear extract from HeLa cells was incubated with a 497-nucleotide radiolabeled RNA (bottom) that contained portions of two exons (orange and tan) from human β-globin mRNA separated by a 130-nucleotide intron (blue). After incubation for various times, the RNA was purified and subjected to electrophoresis and autoradiography, along with RNA markers (lane M). The number of nucleotides in the various species is indicated. Much of the slower-migrating starting RNA (497) was correctly spliced, yielding a 367-nucleotide product. The excised intron (130*) migrated slower than expected based on its molecular weight, indicating that it is not a linear molecule. Likewise, one of the reaction intermediates (339*) exhibited an anomalously slow electrophoretic mobility. Additional analysis indicated that in both cases the intron had a lariat structure resulting in the slow mobility. The 252** band, an aberrant product of the in vitro reaction, is greatly reduced in reactions in which the RNA is capped. [From B. Ruskin et al., 1984, Cell 38:317; photograph courtesy of Michael R. Green. See also R. A. Padgett et al., 1984, Science 225:898.]

Analysis of the intermediates formed during splicing of pre-mRNAs in vitro led to the conclusion that introns are removed as a lariat structure in which the 5′ G of the intron is joined in an unusual 2′,5′-phosphodiester bond to an adenosine near the 3′ end of the intron (Figure 11-15). This A residue is called the branch point because it forms an RNA branch in the lariat structure.

Figure 11-16. Splicing of exons in pre-mRNA occurs via two transesterification reactions

In the first reaction, the ester bond between the 5′ phosphorus of the intron and the 3′ oxygen (red) of exon 1 is exchanged for an ester bond with the 2′ oxygen (dark blue) of the branch-site A residue. In the second reaction, the ester bond between the 5′ phosphorus of exon 2 and the 3′ oxygen (light blue) of the intron is exchanged for an ester bond with the 3′ oxygen of exon 1, releasing the intron as a lariat structure and joining the two exons. Arrows show where the activated hydroxyl oxygens react with phosphorus atoms.

The finding that excised introns have a branched lariat structure led to the discovery that splicing of exons proceeds via two sequential transesterification reactions (Figure 11-16). In each reaction, one phosphate-ester bond is exchanged for another. Since the number of phosphate-ester bonds in the molecule is not changed in either reaction, no energy is consumed. The net result of these two transesterification reactions is that two exons are ligated and the intervening intron is released as a branched lariat structure.
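The bond bookkeeping behind this argument can be checked with simple arithmetic. The sketch below uses the RNA lengths reported in Figure 11-15 (497-nucleotide substrate, 130-nucleotide intron, 367-nucleotide product, 339-nucleotide intermediate). The exon lengths of 158 and 209 nucleotides are not stated in the text; they are inferred here on the assumption that the 339-nucleotide intermediate is the intron joined to the 3′ exon. Function names are illustrative.

```python
def phosphodiester_bonds(length, lariat=False):
    """A linear RNA of n nucleotides has n - 1 phosphodiester bonds;
    a lariat carries one extra 2',5' bond at the branch point."""
    return length - 1 + (1 if lariat else 0)


def splicing_stages(exon1_len, intron_len, exon2_len):
    """List the RNA species present at each stage as (length, is_lariat)."""
    return {
        "pre-mRNA": [(exon1_len + intron_len + exon2_len, False)],
        # Reaction 1: free 5' exon plus lariat intron still joined to 3' exon.
        "after reaction 1": [(exon1_len, False),
                             (intron_len + exon2_len, True)],
        # Reaction 2: ligated exons plus excised lariat intron.
        "after reaction 2": [(exon1_len + exon2_len, False),
                             (intron_len, True)],
    }


def total_bonds(species):
    return sum(phosphodiester_bonds(n, lariat) for n, lariat in species)


# Exon lengths are inferred, not given in the text (see lead-in).
stages = splicing_stages(exon1_len=158, intron_len=130, exon2_len=209)
for name, species in stages.items():
    print(name, [n for n, _ in species], "bonds:", total_bonds(species))
```

The total number of phosphodiester bonds is the same at every stage, which is the sense in which each transesterification exchanges one bond for another without a net gain or loss.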

Spliceosomes, Assembled from snRNPs and a Pre-mRNA, Carry Out Splicing

Even before splicing was accomplished in vitro, several observations led to the suggestion that small nuclear RNAs (snRNAs) assist in the splicing reaction. First, the short consensus sequence at the 5′ end of introns was found to be complementary to a sequence near the 5′ end of the snRNA called U1. Second, snRNAs were found associated with hnRNPs in nuclear extracts. Five U-rich snRNAs (U1, U2, U4, U5, and U6), ranging in length from 107 to 210 nucleotides, participate in RNA splicing.

In the nucleus of eukaryotic cells, snRNAs are associated with six to ten proteins in small nuclear ribonucleoprotein particles (snRNPs). Some of these proteins are common to all snRNPs, and some are specific for individual snRNPs. Experiments with a synthetic oligonucleotide that hybridizes with the 5′-end region of U1 snRNA and later studies with pre-mRNAs that were mutated in the 5′ splice-site consensus sequence provided strong evidence that base pairing between the 5′ splice site of a pre-mRNA and the 5′ region of U1 snRNA is required for RNA splicing.
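The logic of these base-pairing experiments can be illustrated by counting Watson-Crick pairs between a 5′ splice site and the 5′ region of U1 snRNA. The sequences below are illustrative assumptions, not taken from the text: a consensus-like splice site (CAG|GUAAGU) and a nine-nucleotide U1 region written without its modified bases. The compensating U1 change at the end is likewise a hypothetical example of the kind of test such experiments rely on.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_base_pairs(pre_mrna_site, snrna_region):
    """Count Watson-Crick pairs between two equal-length RNA segments.

    Both strings are written 5'->3'. Pairing is antiparallel, so the first
    nucleotide of the pre-mRNA site faces the last nucleotide of the snRNA.
    """
    if len(pre_mrna_site) != len(snrna_region):
        raise ValueError("segments must have equal length")
    return sum(
        WATSON_CRICK[a] == b
        for a, b in zip(pre_mrna_site, reversed(snrna_region))
    )


# Assumed, illustrative sequences (5'->3').
splice_site = "CAGGUAAGU"      # last 3 exon nt + first 6 intron nt
u1_region = "ACUUACCUG"        # 5' region of U1 snRNA, unmodified spelling

mutant_site = "CAGGUAAAU"      # hypothetical G->A change in the intron
compensating_u1 = "AUUUACCUG"  # hypothetical C->U change restoring pairing

print(count_base_pairs(splice_site, u1_region))
print(count_base_pairs(mutant_site, u1_region))
print(count_base_pairs(mutant_site, compensating_u1))
```

A splice-site mutation lowers the pair count, and a matching change in the snRNA restores it; observing that splicing is lost and then rescued in parallel is what makes base pairing, rather than sequence identity alone, the supported explanation.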

Figure 11-17. Diagram of interactions between pre-mRNA, U1 snRNA, and U2 snRNA early in the splicing process

The 5′ region of U1 snRNA initially base-pairs with nucleotides at the 5′ end of the intron (blue) and 3′ end of the 5′ exon (dark red) of the pre-mRNA; U2 snRNA base-pairs with a sequence that includes the branch-point A, although this residue is not base-paired. The yeast branch-point sequence is shown here. Secondary structures in the snRNAs that are not altered during splicing are shown in diagrammatic line form. The purple rectangles represent sequences that bind snRNP proteins recognized by anti-Sm antibodies. For unknown reasons, antisera from patients with the autoimmune disease systemic lupus erythematosus (SLE) contain these antibodies. Such antisera have been useful in characterizing components of the splicing reaction. [See E. J. Sontheimer and J. A. Steitz, 1993, Science 262:1989; adapted from M. J. Moore et al., 1993, in R. Gesteland and J. Atkins, eds., The RNA World, Cold Spring Harbor Press, pp. 303-357.]

Involvement of U2 snRNA in splicing initially was suspected when it was found to have an internal sequence that is largely complementary to the consensus sequence flanking the branch point in pre-mRNAs (see Figure 11-14). Mutation experiments, similar to those conducted with U1 snRNA and 5′ splice sites, demonstrated that base pairing between U2 snRNA and the branch-point sequence in pre-mRNA is critical to splicing. These studies with U1 and U2 snRNAs indicate that during splicing they base-pair with pre-mRNA as shown in Figure 11-17. Significantly, the branch-point A itself, which is not base-paired to U2 snRNA, “bulges out,” allowing its 2′ hydroxyl to participate in the first transesterification reaction of RNA splicing (see Figure 11-16).

Similar studies with other snRNAs demonstrated that RNA-RNA interactions involving them also occur during splicing. For example, an internal region of U6 snRNA initially base-pairs with the 5′ end of U4 snRNA. Rearrangements later in the splicing process result in U6 snRNA base pairing with the 5′ end of U2 snRNA, which remains base-paired to the branch-point sequence in the intron. Later in the splicing process, base pairing of U5 snRNA with four exon nucleotides adjacent to the splice sites displaces U1 snRNA from the pre-mRNA.

Figure 11-18. Electron micrograph of a spliceosome

Extracts of HeLa cells were mixed with a β-globin pre-mRNA; the reaction was interrupted before splicing was completed, so that the spliceosomes, containing snRNPs and the pre-mRNA substrate, could be purified. [From R. Reed et al., 1988, Cell 53:949; courtesy of J. Griffith.]

Figure 11-19. The spliceosomal splicing cycle

The splicing snRNPs (U1, U2, U4, U5, and U6) associate with the pre-mRNA and with each other in an ordered sequence to form the spliceosome. This large ribonucleoprotein complex then catalyzes the two transesterification reactions that result in splicing of the exons (light and dark red) and excision of the intron (blue) as a lariat structure (see Figure 11-16). Although ATP hydrolysis is not required for the transesterification reactions, it is thought to provide the energy necessary for rearrangements of the spliceosome structure that occur during the cycle. Note that the snRNP proteins in the spliceosome are distinct from the hnRNP proteins discussed earlier. In higher eukaryotes, the association of U2 snRNP with pre-mRNA is assisted by an hnRNP protein called U2AF, which binds to the pyrimidine-rich region near the 3′ splice site. U2AF also probably interacts with other proteins required for splicing through a domain containing repeats of the dipeptide serine-arginine (the SR motif). The branch-point A in pre-mRNA is indicated in boldface. [See S. W. Ruby and J. Abelson, 1991, Trends Genet. 7:79; adapted from M. J. Moore et al., 1993, in R. Gesteland and J. Atkins, eds., The RNA World, Cold Spring Harbor Press, pp. 303-357.]

Based on the results of these experiments, identification of reaction intermediates, and other biochemical analyses, the five splicing snRNPs are thought to assemble sequentially on the pre-mRNA, forming a large ribonucleoprotein complex called a spliceosome, which is roughly the size of a ribosome (Figure 11-18). According to the model depicted in Figure 11-19, assembly of a spliceosome begins with the base pairing of U1 and U2 snRNAs, as part of the U1 and U2 snRNPs, to the pre-mRNA (see Figure 11-17). Extensive base pairing between the snRNAs in the U4 and U6 snRNPs forms a complex that associates with U5 snRNP. The U4/U6/U5 complex then associates, presumably via protein-protein interactions, with the previously formed complex consisting of a pre-mRNA base-paired to U1 and U2 snRNPs to yield a spliceosome.

After formation of the spliceosome, extensive rearrangements occur in the pairing of snRNAs and the pre-mRNA, as noted previously. The rearranged spliceosome then catalyzes the two transesterification reactions that result in RNA splicing. After the second transesterification reaction, the ligated exons are released from the spliceosome while the lariat intron remains associated with the snRNPs. This final intron-snRNP complex is unstable and dissociates. The individual snRNPs released participate in a new cycle of splicing. The excised intron is rapidly degraded by a “debranching enzyme,” which hydrolyzes the 2′,5′-phosphodiester bond at the branch point, and by other nuclear RNases.

It is estimated that at least one hundred proteins are involved in RNA splicing, making this process comparable in complexity to protein synthesis and initiation of transcription. Some of these splicing factors are associated with snRNPs, but others are not. Sequencing of yeast genes encoding splicing factors has revealed that they contain domains with the RNP motif, which interacts with RNA, and the SR motif, which interacts with other proteins and may contribute to RNA binding. Some splicing factors also exhibit sequence homologies to known RNA helicases; these may be necessary for the base-pairing rearrangements that occur in snRNAs during the spliceosomal splicing cycle.

Introns whose splice sites do not conform to the standard consensus sequence were recently identified in some pre-mRNAs. This class of introns begins with AU and ends with AC rather than following the usual “GU-AG rule” (see Figure 11-14). Research on the biochemistry of splicing for this special class of introns soon identified four novel snRNPs. Together with the standard U5 snRNP, these snRNPs appear to participate in a splicing cycle analogous to that discussed above.
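The distinction between the two intron classes can be illustrated with a short sketch that inspects only the first and last two nucleotides of an intron. The function name and the example sequences are hypothetical, and real splice-site recognition depends on longer consensus sequences and snRNA base pairing, so this is a simplification rather than a splice-site predictor.

```python
def classify_intron(intron_rna: str) -> str:
    """Classify an intron by its terminal dinucleotides (toy illustration)."""
    # Accept lowercase or DNA-style input by normalizing to uppercase RNA.
    seq = intron_rna.upper().replace("T", "U")
    if len(seq) < 4:
        raise ValueError("intron too short to have separate 5' and 3' dinucleotides")
    ends = (seq[:2], seq[-2:])
    if ends == ("GU", "AG"):
        return "GU-AG"   # the usual class
    if ends == ("AU", "AC"):
        return "AU-AC"   # the nonstandard class described in the text
    return "other"


if __name__ == "__main__":
    # Invented sequences used only to exercise the function.
    for intron in ("GUAAGUCCCUAACAG", "AUAUCCUUUUCCAC", "CCCCCCCC"):
        print(intron, classify_intron(intron))
```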

Portions of Two Different RNAs Are Trans-Spliced in Some Organisms

Virtually all functional mRNAs in vertebrate and insect cells are derived from a single molecule of the corresponding pre-mRNA by removal of internal introns and splicing of exons. However, in two types of protozoa (trypanosomes and euglenoids), mRNAs are constructed by splicing together separate RNA molecules. This process, referred to as trans-splicing, is also used in the synthesis of 10-15 percent of the mRNAs in the roundworm Caenorhabditis elegans, an important model organism for studying embryonic development.

The parasitic trypanosomes produce abundant amounts of a single 140-nucleotide leader RNA from tandemly repeated transcription units. In a two-step reaction analogous to spliceosomal pre-mRNA splicing, a 39-nucleotide portion of the leader RNA, termed a mini-exon, is spliced to the 5′ end of protein-coding exons in primary transcripts, which lack internal introns. The 5′ mini-exon, present in all trypanosome mRNAs, is thought to assist in initiation of translation. Because of trans-splicing, polycistronic protein-coding transcription units in trypanosomes, which are common, yield monocistronic mRNAs from their polycistronic primary transcripts. Splicing of a 5′ mini-exon to a coding region in a primary transcript triggers cleavage and polyadenylation at the 3′ end of the exon. Consequently, trypanosomes use trans-splicing and linked cleavage and polyadenylation to combine the operon organization of polycistronic transcription units characteristic of bacteria with the monocistronic organization of mRNAs characteristic of eukaryotes.
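How trans-splicing converts one polycistronic primary transcript into several monocistronic mRNAs can be sketched as simple string assembly. The sketch below assumes, for illustration only, that the primary transcript is already given as a list of coding regions and that the mini-exon is the first 39 nucleotides of the leader RNA. The function name, the poly(A) length, and the sequences are hypothetical placeholders, not measured values.

```python
MINI_EXON_LENGTH = 39  # from the text: 39-nucleotide mini-exon


def trans_splice(leader_rna, coding_regions, poly_a_length=20):
    """Return one monocistronic mRNA per coding region (toy model).

    Each mRNA is the mini-exon, followed by the coding region, followed by
    a poly(A) tail. poly_a_length is an arbitrary illustrative value.
    """
    if len(leader_rna) < MINI_EXON_LENGTH:
        raise ValueError("leader RNA is shorter than the mini-exon")
    # Assumption: the mini-exon is taken from the 5' end of the leader RNA.
    mini_exon = leader_rna[:MINI_EXON_LENGTH]
    poly_a_tail = "A" * poly_a_length
    return [mini_exon + coding_region + poly_a_tail for coding_region in coding_regions]


if __name__ == "__main__":
    # Placeholder 140-nt leader and two invented coding regions.
    leader = "G" * 39 + "C" * 101
    for mrna in trans_splice(leader, ["AUGAAAUAA", "AUGCCCUGA"], poly_a_length=5):
        print(mrna)
```

Every mRNA returned by the sketch begins with the same mini-exon, which reflects the statement that the 5′ mini-exon is present in all trypanosome mRNAs.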

Self-Splicing Group II Introns Provide Clues to the Evolution of snRNAs

Under certain nonphysiological in vitro conditions, pure preparations of some RNA transcripts slowly splice out introns in the absence of any protein. This observation led to recognition that some introns are self-splicing. Two types of self-splicing introns have been discovered: group I introns, present in nuclear rRNA genes of protozoans, and group II introns, present in protein-coding genes and some rRNA and tRNA genes of mitochondria and chloroplasts in plants and fungi. Discovery of the catalytic activity of self-splicing introns revolutionized concepts about the functions of RNA. As discussed in Chapter 4, RNA is now thought to catalyze peptide-bond formation during protein synthesis in ribosomes. Here we discuss the probable role of group II introns, now found only in mitochondrial and chloroplast DNA, in the evolution of snRNAs; the functioning of group I introns is considered in the later section on rRNA processing.

Figure 11-20

   Schematic diagrams comparing the secondary structures of group II self-splicing introns (a) and U snRNAs present in the spliceosome (b)

The first transesterification reaction is indicated by black arrows; the second reaction, by blue arrows. The branch-point A is boldfaced. The similarity in these structures suggests that the spliceosomal snRNAs evolved from group II introns, with the trans-acting snRNAs being functionally analogous to the corresponding domains in group II introns. [Adapted from P. A. Sharp, 1991, Science 254:663.]

Even though their precise sequences are not highly conserved, all group II introns fold into a conserved, complex secondary structure containing numerous stem-loops (Figure 11-20a). Self-splicing by a group II intron occurs via two transesterification reactions, involving intermediates and products analogous to those found in nuclear pre-mRNA splicing. The mechanistic similarities between group II intron self-splicing and spliceosomal splicing led to the hypothesis that snRNAs function analogously to the stem-loops in the secondary structure of group II introns. According to this hypothesis, snRNAs interact with 5′ and 3′ splice sites of pre-mRNAs and with each other to produce an RNA structure functionally analogous to that of group II self-splicing introns (Figure 11-20b).

An extension of this hypothesis is that introns in present-day nuclear pre-mRNAs evolved from ancient group II self-splicing introns through the progressive loss of internal RNA structures, which concurrently evolved into trans-acting snRNAs that perform the same functions. In support of this kind of evolutionary model, group II intron mutants have been constructed in which domain V and part of domain I are deleted. Such mutants are defective in self-splicing, but when RNA molecules equivalent to the deleted regions are added to the in vitro reaction, self-splicing occurs. This finding demonstrates that these domains in group II introns can be trans-acting, like snRNAs.

The similarity in the mechanisms of group II intron self-splicing and spliceosomal splicing of pre-mRNAs also suggests that the splicing reaction is catalyzed by the snRNA, not the protein, components of spliceosomes. Although group II introns can self-splice in vitro at elevated temperatures and Mg²⁺ concentrations, under in vivo conditions proteins called maturases, which bind to group II intron RNA, are required for rapid splicing. Maturases, encoded by group II introns themselves, are thought to stabilize the precise three-dimensional interactions of the intron RNA required to catalyze the two splicing transesterification reactions. By analogy, snRNP proteins in spliceosomes are thought to stabilize the precise geometry of snRNAs and intron nucleotides required to catalyze pre-mRNA splicing.

The evolution of snRNAs may have been an important step in the rapid evolution of higher eukaryotes. As internal intron sequences were lost and their functions in RNA splicing supplanted by trans-acting snRNAs, the remaining intron sequences would be free to diverge. This in turn likely facilitated the evolution of new genes through exon shuffling (Section 9.3). It also permitted the increase in protein diversity that results from alternative RNA splicing and an additional level of gene control resulting from regulated RNA splicing.

One more remarkable property of group II introns deserves mention, namely, their ability to behave as mobile DNA elements in the genome. The maturases that increase the rate of self-splicing of these introns also contain a domain that is homologous to reverse transcriptase. Thus group II introns can move in the genome like other nonviral retrotransposons discussed in Chapter 9. As is generally true for mobile DNA elements, transposition of group II introns is rare. However, when a group II intron does transpose, it does not inactivate the gene into which it inserts, because the inserted intron is spliced out of the transcript produced from the target gene by self-splicing!

Most Transcription and RNA Processing Occur in a Limited Number of Domains in Mammalian Cell Nuclei

Figure 11-21

   Localization of polyadenylated RNA and RNA splicing factors in the nucleus of a mammalian fibroblast

Digital imaging microscopy was used to reconstruct a 1-μm-thick section of a stained human fibroblast nucleus. (a) Section stained with red rhodamine-labeled poly-dT to detect polyadenylated RNA (red) and with DAPI to detect DNA (blue). Polyadenylated RNA is localized to a limited number of discrete foci (speckles) between regions of chromatin, although not all regions containing low levels of DNA contained detectable polyadenylated RNA (arrow). (b) The same section shown in (a) stained to detect polyadenylated RNA (red) and the essential RNA-splicing protein SC-35, which was visualized with a green fluorescein-labeled monoclonal antibody. Regions where the stains overlap appear yellow. SC-35 is present in the center of many foci (arrow). Nu = nucleolus. [From K. C. Carter et al., 1993, Science 259:1330.]

The digital imaging micrographs in Figure 11-21 demonstrate that most of the nuclear polyadenylated RNA (including unspliced and partially spliced pre-mRNA and nuclear mRNA) occurs in discrete foci lying between dense regions of chromatin and that a required protein splicing factor (SC-35) is localized to the center of these same foci. The results of these and other studies suggest that transcription and RNA processing do not occur randomly throughout the eukaryotic nucleus; rather, the nucleus is organized into discrete domains (≈20-100 in human fibroblasts) where the bulk of transcription and RNA processing occurs.

Figure 11-22

   Transmission electron micrograph showing the nuclear matrix (skeleton) of a HeLa cell

Cells were treated with a nonionic detergent to remove membranes; digested with DNase to remove most of the DNA; and then extracted with 0.25 M ammonium sulfate to remove histones and chromatin-associated protein. A whole mount of the remaining material was prepared. [From S. Penman et al., 1982, Cold Spring Harbor Symp. Quant. Biol. 46:1013.]

This highly organized view of the nucleus implies that there is an underlying nuclear substructure. It has been known for many years that when mammalian cells are treated with a mild nonionic detergent, DNase I, and high concentrations of salt, a fibrillar network of protein and RNA remains in the region of the nucleus (Figure 11-22). This protein network has been called the nuclear matrix, or nuclear skeleton. It is composed of actin and numerous other protein components that have not been fully characterized, including components of the chromosomal scaffold that rearranges and condenses to form metaphase chromosomes during mitosis (see Figure 9-34). However, snRNPs remain associated with the nuclear matrix prepared from detergent-extracted, DNase I-treated cells. Moreover, when the nuclear matrix is prepared with a low concentration of salt, pre-mRNAs associated with the matrix undergo splicing when ATP is added. These results suggest that the RNA-processing foci observed microscopically may be associated with specific regions of the nuclear matrix.

SUMMARY

Footnotes
*

Few genes in bacterial or bacteriophage DNAs contain introns, whereas most protein-coding genes in animals and plants, as well as in DNA viruses infecting them, contain introns. Thus DNA viruses generally have an intron content similar to that of their host cells. Most yeast genes, like bacterial genes, lack introns.
