A. The Discovery of Trans-splicing
Trans-splicing, in which an identical short leader sequence, the
spliced leader (SL), is spliced onto the 5′ ends of multiple mRNAs (for
reviews, see Agabian 1990; Donelson and Zeng 1990; Bonen 1993), was first discovered in the
primitive eukaryotes, the trypanosomatids (Murphy et al. 1986; Sutton and
Boothroyd 1986), and later shown to occur also in C.
elegans and other nematodes (Krause and Hirsh 1987; for reviews, see Nilsen 1993; Davis
1996), in Euglena (Tessier et al.
1991), and in flatworms (Rajkovic
et al. 1990; Davis et al.
1994). In trypanosomes, all splicing is
trans-splicing; all mRNAs begin with the SL, and genes do not
contain introns. Transcription is polycistronic, and
trans-splicing is responsible for separating the long
polycistronic transcripts into monocistronic units. In contrast, in nematodes,
the genes do contain introns, and the pre-mRNA products of many genes are not
subject to trans-splicing.
Trans-splicing in C. elegans was first found
during molecular analysis of the actin genes (Krause and Hirsh 1987). It was discovered that mRNAs of three of the
four actin genes begin with the identical 22-nucleotide sequence, a sequence
that is not associated with the gene. Instead, the 22-nucleotide SL is donated
by a 100-nucleotide small RNA, the SL RNA, in a trans-splicing
reaction. This trans-splicing process is closely related to
cis-splicing (intron removal). A reasonable match to the
5′ splice site consensus is present on the SL RNA, and a good match to
the 3′ splice site consensus is present at the site of SL addition
(the trans-splice site) on the pre-mRNA. Furthermore, the
reaction proceeds by way of a branched intermediate similar to the lariat of
cis-splicing (Bektesh and
Hirsh 1988; Thomas et al.
1988; Hannon et al. 1990a).
Trans-splicing also requires spliceosomal components
including U2, U4, U5, and U6 snRNPs (Hannon et
al. 1991; Maroney et al.
1996; see below).
B. The Spliced Leader snRNP
The donor in the trans-splicing reaction, the SL RNA, exists in
the form of an snRNP (Bruzik et al.
1988; Thomas et al. 1988;
Van Doren and Hirsh 1988). It is
bound to the Sm proteins found associated with U1, U2, U4, and U5 RNAs, and it
has the unusual modified cap structure, trimethylguanosine (TMG), found on these
snRNAs. The secondary structure predicted for the SL RNA resembles that of other
snRNAs. It has the 5′splice site base-paired to the upstream part of
the SL sequence in a manner resembling the U1-5′splice site base
pairing. It was hypothesized that this intramolecular interaction might replace
in trans-splicing the interaction between U1 and the
5′splice site required for initiation of cis-splicing
(Bruzik et al. 1988). Subsequent
work has demonstrated that the U1 snRNP is indeed dispensable for nematode
trans-splicing in vitro (see below).
C. Trans-splicing Signals
Table 2
Comparison of cis- and trans-3′ splice sites
| Position | −7 | −6 | −5 | −4 | −3 | −2 | −1 | +1 |
|---|---|---|---|---|---|---|---|---|
| Cis | 53 | 89 | 98 | 70 | 83 | 100 | 100 | 74 |
| SL1 | 57 | 92 | 97 | 64 | 82 | 100 | 100 | 79 |
| SL2 | 66 | 66 | 84 | 66 | 78 | 100 | 100 | 75 |
The trans-splice site consensus is the same as the intron 3′ splice site
consensus (Table 2), so it is not immediately obvious how the two reactions can
be faithfully carried out. However, it is now clear that the signal for
trans-splicing is simply the presence of an intron-like sequence at the
5′ end of the mRNA with no functional 5′ splice site upstream (Conrad et al.
1991, 1993b, 1995). The region of the pre-mRNA from the 5′ end to the
trans-splice site is called the outron (Conrad et al. 1991). Genes whose
pre-mRNAs are subject to trans-splicing are distinguished from those that are
not only by the presence of an outron. Considerable experimental evidence has
been adduced to support this view. A conventional gene can be converted into a
trans-spliced gene by placing at the gene's 5′ end either an intron from another
gene or an A + T-rich synthetic sequence followed by a canonical 3′ splice site.
Furthermore, a trans-spliced gene can be converted into a conventional gene by
inserting a 5′ splice site into its outron. These experiments demonstrate that
the only difference between trans-spliced and conventional genes is the presence
of an outron, and they show that no sequence-specific recognition is involved in
the decision to trans-splice. They also show that the intron and outron
3′ splice sites are the same; the choice between cis- and trans-splicing is
based solely on the presence or absence of an upstream 5′ splice site.
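The decision rule described above can be written as a minimal sketch. The function name, the coordinate representation, and the example positions are hypothetical illustrations, not taken from the cited experiments. The sketch assumes that every listed 5′ splice site is functional and ignores all other influences on splice site choice.

```python
def splicing_mode(acceptor_pos, donor_positions):
    """Classify a 3' splice site as cis- or trans-spliced.

    acceptor_pos: position of the 3' splice site on the pre-mRNA
        (hypothetical coordinates, counted from the 5' end).
    donor_positions: positions of functional 5' splice sites on the
        same pre-mRNA.
    """
    # Rule from the text: a 3' splice site with a functional 5' splice
    # site upstream is used in cis; with none upstream, the region
    # before it is an outron and the site receives the spliced leader.
    has_upstream_donor = any(d < acceptor_pos for d in donor_positions)
    return "cis" if has_upstream_donor else "trans"


# Hypothetical pre-mRNA with a 3' splice site at position 150.
print(splicing_mode(150, []))    # no 5' splice site upstream: trans
print(splicing_mode(150, [60]))  # 5' splice site placed in the outron: cis
```

The second call mirrors the conversion experiment in the text: inserting a 5′ splice site into the outron switches the same 3′ splice site from trans- to cis-splicing.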
Because trans-splicing is a relatively efficient reaction (like
cis-splicing), it is generally impossible to isolate
outron-containing pre-mRNAs, and so very few natural outrons have been defined.
Nevertheless, in a few cases, the promoters of trans-spliced
genes or start sites of outrons have been identified (e.g., col-13 has a 64-bp outron [Park and Kramer
1990]). It might be possible to determine
trans-spliced gene start sites by deleting the
trans-splice site and analyzing the RNA product from a
transgenic strain carrying the mutated gene. However, in the few cases in which
this technique has been attempted, it has been unsuccessful because
trans-splicing occurred at an alternative site. One technique that has
proved successful is to introduce a 5′ splice site consensus
sequence into the outron. In this case, the introduced splice site splices to
the trans-splice site, and the outron length can be calculated
from the length of the 5′-untranslated region (5′ UTR). This
procedure was used to determine outron length (173 bp) for rol-6 (Conrad et al. 1993).
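One way to read this calculation is sketched below. It is an assumption about the arithmetic, not the published procedure. After the inserted 5′ splice site is joined to the trans-splice site, the measured 5′ UTR is taken to consist of the sequence from the transcription start to the inserted site plus the sequence from the trans-splice site to the start codon. All parameter names and the example numbers are hypothetical and do not reproduce the rol-6 measurement.

```python
def outron_length(utr_length, acceptor_to_aug, donor_to_acceptor):
    """Estimate outron length (bp) from the 5' UTR of the cis-spliced mRNA.

    utr_length: measured 5' UTR length of the mRNA from the mutated gene.
    acceptor_to_aug: distance from the trans-splice site to the start codon.
    donor_to_acceptor: distance in the gene from the inserted 5' splice
        site to the trans-splice site (the segment removed by cis-splicing).
    """
    # Portion of the 5' UTR that comes from the outron, upstream of the
    # inserted 5' splice site.
    start_to_donor = utr_length - acceptor_to_aug
    if start_to_donor < 0:
        raise ValueError("5' UTR is shorter than the acceptor-to-AUG distance")
    # The outron spans the transcription start to the trans-splice site.
    return start_to_donor + donor_to_acceptor


# Hypothetical numbers, for illustration only.
print(outron_length(utr_length=90, acceptor_to_aug=10, donor_to_acceptor=70))  # 150
```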
Although few natural outrons have been characterized, synthetic sequences have
been introduced into the 5′ UTR of a gene that is not normally
trans-spliced in order to determine whether they can
function as outrons. A + U-rich sequences of 51 nucleotides or longer resulted
in efficient trans-splicing, whereas sequences that were 41
nucleotides or shorter (or not A + U-rich) were much less effective (Conrad et al. 1995). Thus, the same
constraints that set the lower size limit on introns may be acting on
outrons.
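The length and base-composition constraints reported above can be expressed as a small sketch. The length thresholds (51 and 41 nucleotides) come from the text; the numerical cutoff for "A + U-rich" is not given in the text, so the `au_rich_cutoff` default of 0.7 is purely an assumption, as are the function names. Lengths between 42 and 50 nucleotides are not covered by the reported results and are labeled accordingly.

```python
def au_fraction(seq):
    """Return the fraction of A and U (or T) residues in a sequence."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        raise ValueError("sequence is empty")
    return sum(base in "AU" for base in seq) / len(seq)


def outron_prediction(seq, au_rich_cutoff=0.7):
    """Summarize the reported behavior of a synthetic outron candidate.

    au_rich_cutoff is a hypothetical threshold; the text does not define
    "A + U-rich" numerically.
    """
    length = len(seq)
    rich = au_fraction(seq) >= au_rich_cutoff
    if length >= 51 and rich:
        return "efficient"
    if length <= 41 or not rich:
        return "much less effective"
    # A + U-rich sequences of 42-50 nt fall between the reported cases.
    return "not determined"


print(outron_prediction("AU" * 30))  # 60 nt, A + U-rich
print(outron_prediction("AU" * 20))  # 40 nt, A + U-rich
print(outron_prediction("GC" * 30))  # 60 nt, not A + U-rich
```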
D. Function of Trans-splicing
Figure 5. Distance from the trans-splice site to the translation
initiation site. The survey includes 83 trans-spliced
genes, both SL1- and SL2-acceptors. Each bar represents the number of
genes in the indicated distance class (shown in bp: 0–5,
6–10, 11–15, etc.).
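The distance classes in the caption can be reproduced with a short sketch. The function names and the example distances are hypothetical; the 83-gene survey data are not given here. Note that, as the classes are written, the first one (0–5) spans six values and each later one spans five.

```python
from collections import Counter


def distance_class(distance_bp):
    """Map a distance in bp to its class: (0, 5), (6, 10), (11, 15), ..."""
    if distance_bp < 0:
        raise ValueError("distance must be non-negative")
    if distance_bp <= 5:
        return (0, 5)
    low = ((distance_bp - 1) // 5) * 5 + 1
    return (low, low + 4)


def class_counts(distances):
    """Count genes per distance class, as plotted in a bar chart."""
    return Counter(distance_class(d) for d in distances)


# Hypothetical distances (bp) from trans-splice site to the start codon.
counts = class_counts([0, 3, 5, 6, 10, 11, 27])
for low, high in sorted(counts):
    print(f"{low}-{high}: {counts[(low, high)]}")
```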
Trans-splicing occurs throughout the nematode phylum, and there
is a remarkable degree of conservation of the SL sequence (although the portions
of the SL RNAs downstream from the splice site have diverged widely) (Bektesh
and Hirsh 1988; Takacs et al. 1988; Nilsen et al. 1989; Zeng et al. 1990). In
the many free-living species, as well as animal and plant parasitic nematodes,
that have been examined, only one single-base change has been found in the SL
sequence (Ray et al. 1994). It is not known what selection pressure has kept
this sequence so stable. In fact, it is not yet known precisely what function
the SL has in the cell. In C. elegans, SL tends to be spliced very close to the
initiating methionine codon (often immediately adjacent) (Figure 5), so it seems
likely to play a part in translation initiation. The unusual cap structure,
trimethylguanosine (TMG), present at the 5′ end of the SL becomes the 5′ end of
trans-spliced mRNAs, and this cap remains on the mRNA during translation (Liou
and Blumenthal 1990; Van Doren and Hirsh 1990). A TMG cap is known to inhibit
translation in mammalian extracts (Darzynkiewicz et al. 1988), but it may
actually stimulate translation in a nematode extract, at least when it is
present at the 5′ end of the SL sequence (Maroney et al. 1995).
In Ascaris lumbricoides, an animal parasite, the SL sequence in
the DNA is needed for transcription of the SL RNA gene, which may be one reason
why it has been so highly conserved (Hannon et
al. 1990b). Although it is not known precisely what roles the SL
sequence itself may perform, trans-splicing is in fact required
for viability (Ferguson et al. 1996).
An embryonic lethal mutation in the rrs-1 gene is a deletion of all 100 tandem copies of the 1-kb sequence that
encodes both 5S ribosomal RNA and SL RNA (see below). Remarkably, the embryonic
lethality is rescued by a tandem array carrying the SL RNA gene alone
(presumably the maternal supply of 5S RNA can carry the homozygous mutants
through embryogenesis). Mutations in the SL RNA gene that eliminate the
Sm-binding site prevent rescue, so it is fair to conclude that the SL snRNP is
required for embryogenesis. Its required role could be a positive effect such as
providing a sequence needed for translation initiation, mRNA stability, or
localization, or it could be required for suppression of a negative effect such
as inhibition of translation initiation by AUG codons in the outron. At least
some of the mRNAs that normally are trans-spliced to SL1 have
been found to be trans-spliced to the alternative spliced
leader, SL2 (see below), in the rrs-1 mutant strain (Ferguson et al.
1996).