Proc Natl Acad Sci U S A. Nov 21, 2006; 103(47): 17602–17607.
Published online Nov 14, 2006. doi:  10.1073/pnas.0605476103
PMCID: PMC1693793
Eukaryotic Transposable Elements and Genome Evolution Special Feature
Research Articles, Biochemistry

RNA from the 5′ end of the R2 retrotransposon controls R2 protein binding to and cleavage of its DNA target site


Non-LTR retrotransposons insert into eukaryotic genomes by target-primed reverse transcription (TPRT), a process in which cleaved DNA targets are used to prime reverse transcription of the element's RNA transcript. Many of the steps in the integration pathway of these elements can be characterized in vitro for the R2 element because of the rigid sequence specificity of R2 for both its DNA target and its RNA template. R2 retrotransposition involves identical subunits of the R2 protein bound to different DNA sequences upstream and downstream of the insertion site. The key determinant regulating which DNA-binding conformation the protein adopts was found to be a 320-nt RNA sequence from near the 5′ end of the R2 element. In the absence of this 5′ RNA the R2 protein binds DNA sequences upstream of the insertion site, cleaves the first DNA strand, and conducts TPRT when RNA containing the 3′ untranslated region of the R2 transcript is present. In the presence of the 320-nt 5′ RNA, the R2 protein binds DNA sequences downstream of the insertion site. Cleavage of the second DNA strand by the downstream subunit does not appear to occur until after the 5′ RNA is removed from this subunit. We postulate that the removal of the 5′ RNA normally occurs during reverse transcription, and thus provides a critical temporal link to first- and second-strand DNA cleavage in the R2 retrotransposition reaction.

Keywords: endonuclease, retrotransposition, reverse transcription, RNA–protein interactions

Non-LTR retrotransposons, also referred to as long interspersed nuclear elements (LINEs), are abundant insertions in many eukaryotic genomes. For example, there are >800,000 copies of these elements in the human genome, representing 17% of our DNA (1). Although retrotransposition assays in tissue culture cells have been developed to study non-LTR retrotransposition, many questions concerning the mechanism of their integration remain unanswered (2–5).

R2 is a non-LTR retrotransposable element with rigid sequence specificity for a target site in the 28S rRNA genes of arthropods, platyhelminths, tunicates, and vertebrates (6, 7). The sequence specificity of R2 integration has enabled detailed biochemical studies of its retrotransposition reaction (Fig. 1A). We have previously shown that one R2 protein subunit of a probable dimer binds a 30-bp DNA segment upstream of the insertion site and cleaves the first strand (bottom strand, Fig. 1A) of the target DNA (8, 9). If RNA corresponding to the 3′ UTR of the R2 element is present, then this subunit primes reverse transcription of the R2 RNA transcript from the free 3′ end released by the cleavage. This process is referred to as target-primed reverse transcription (TPRT) (10). After reverse transcription, the second (top) DNA strand is cleaved by the second protein subunit, which binds a different DNA sequence downstream of the insertion site (9). We have postulated that this second R2 subunit is responsible for the synthesis of the second DNA strand and thereby completes the retrotransposition reaction (9).

Fig. 1.
Introduction to the R2 element. (A) R2 protein subunits bind both upstream and downstream of the 28S gene insertion site. The protein subunit bound upstream of the integration site cleaves the bottom DNA strand and uses the newly generated 3′ ...

One uncharacterized aspect of the R2 retrotransposition reaction has been the mechanism by which the R2 protein could adopt alternative conformations, allowing it to bind different DNA sequences upstream and downstream of the insertion site. In this report we show that the R2 protein is able to specifically bind a segment of the R2 RNA located near the 5′ end of the transcript. Association with this RNA results in an R2 protein conformation that binds to the downstream DNA sequences. Thus it is the presence or absence of bound 5′ RNA that determines what role an R2 subunit plays in the integration reaction.

Results

Our standard procedure for the purification of R2 protein from an Escherichia coli expression construct results in one predominant band on SDS/PAGE gels when visualized by protein staining (10). Recently, however, we have noted that two bands are detected when these gels are silver stained (Fig. 1C, lane 1). Only the upper band was sensitive to proteinase K (lane 3), whereas the lower band was found to be sensitive to RNase A (lane 2). The protection of this precise-length RNA from endogenous E. coli RNases and its copurification with the R2 protein through two affinity columns suggested a tight, highly specific association. The copurified RNA did not correspond to sequences from the 3′ UTR of the R2 RNA because these sequences are not present in the expression construct. The copurifying RNA was reverse transcribed by using the template-jumping activity of the R2 reverse transcriptase (11), cloned, and sequenced. The RNA was identified as a 320-nt fragment from the R2 element beginning near the start of the ORF and ending just before the highly conserved zinc-finger motif encoded by all R2 elements (12) (Fig. 1A). This region of the R2 element shows little sequence conservation among species at either the protein or the nucleic acid level (13, 14). The copurifying RNA will hereafter be referred to as 5′ RNA to differentiate it from the 3′ UTR RNA (3′ RNA), which can also be bound by the R2 protein (Fig. 1B).

Because all previous TPRT assays with the R2 protein contained the copurifying 5′ RNA, experiments were first undertaken to determine the effects of 5′ RNA removal on the DNA cleavage and reverse transcription reactions catalyzed by the R2 protein. To that end, the level of 5′ RNA was reduced by treating the isolated protein with RNase A. After RNase A treatment, excess 3′ RNA and RNaseOut inhibitor were added in standard DNA cleavage/TPRT assays. As shown in Fig. 2 A and B, the reduction in the level of 5′ RNA did not affect the ability of the R2 protein to cleave the first DNA strand or conduct the TPRT reaction. However, the ability of the R2 protein to cleave the second DNA strand was greatly reduced when the 5′ RNA had been removed (Fig. 2C), suggesting that the R2 protein was no longer able to bind the DNA target downstream of the insertion site or was allosterically inhibited from second-strand cleavage.

Fig. 2.
Removal of the copurifying 5′ RNA from the R2 protein does not affect first-strand DNA cleavage (A) and TPRT activity (B) but does eliminate second-strand DNA cleavage (C). In each reaction the purified R2 protein was pretreated with RNase A (black ...

To better understand the protein–DNA complexes formed in the presence of 5′ and 3′ RNAs, electrophoretic mobility shift assays (EMSAs) were performed with RNase A-treated protein in the presence of excess 3′ RNA and/or 5′ RNA (Fig. 3). It should be noted that in the absence of RNA, most protein–DNA complexes are retained within the well of the gel (8). An endonuclease-mutant R2 protein (15) was used in these assays to observe the complexes formed before DNA cleavage. In the presence of excess 3′ RNA (Left), there is a single shifted band, which corresponds to the previously characterized protein monomer binding upstream of the insertion site (9). The single band contrasts with earlier studies conducted with R2 protein containing the copurifying 5′ RNA, in which a second upper band was also detected (8, 9).

Fig. 3.
Electrophoretic mobility shift assay (EMSA) of the R2 protein and target DNA with 1.0 pmol of 3′ RNA (Left), 1.0 pmol of 5′ RNA (Center), or 0.5 pmol of both 3′ and 5′ RNA (Right). Each reaction contained 40 fmol of ³²P-end-labeled ...

In the presence of excess 5′ RNA (Fig. 3 Center), there is again a single shifted band, which migrates somewhat slower than the complex observed with 3′ RNA (the 5′ RNA is 70 nt longer than the 3′ RNA). In the presence of both 5′ and 3′ RNA (Right), the 5′ RNA band is present as well as a second, slower-migrating band. This second band was seen at lower levels in our earlier studies with 3′ RNA additions and was characterized as a dimer containing R2 protein subunits bound both upstream and downstream of the insertion site (8, 9). In these previous reports, we showed that the upstream subunit bound the 3′ RNA, which erroneously suggested that this was the only RNA present in the dimer complex. The gel shifts in Fig. 3 suggest that the downstream subunit binds 5′ RNA.

To directly demonstrate that the 5′ RNA is able to promote the binding of the R2 protein subunit downstream of the insertion site, DNase I footprint analyses were conducted to compare complexes formed with either 3′ RNA or 5′ RNA (Fig. 4A). Summary diagrams of the footprints are given in Fig. 4B. As reported previously (8), the 3′ RNA–protein complexes protect DNA upstream of the insertion site, from −38 to −11 on the top strand and from −42 to −8 on the bottom strand. The 5′ RNA–protein complexes protect an area downstream of the insertion site, from −2 to +18 on the top strand and from −6 to +18 on the bottom strand. Although the isolated N-terminal domain of the R2 protein has been shown to bind to these downstream sequences (12), this is the first time we have observed the full-length R2 protein binding exclusively downstream of the insertion site (i.e., in the absence of subunits bound upstream of the insertion site). The larger complex observed in the presence of both RNAs (Fig. 3 Right) protects regions of DNA that are a summation of the regions protected by the 3′ RNA– and 5′ RNA–protein complexes (Fig. 4B).
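The summation of footprints described above can be checked with simple interval arithmetic. The following is a minimal sketch, not part of the original analysis: the coordinates are those quoted in the text, whereas the names `FOOTPRINTS`, `merge_footprints`, and `combined_footprint` are hypothetical and introduced only for illustration.

```python
# Protected regions quoted in the text, as (start, end) positions relative
# to the insertion site, keyed by labeled strand and by the RNA in the complex.
FOOTPRINTS = {
    "top": {"3' RNA": (-38, -11), "5' RNA": (-2, 18)},
    "bottom": {"3' RNA": (-42, -8), "5' RNA": (-6, 18)},
}


def merge_footprints(intervals):
    """Return the union of closed intervals as a sorted list of (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def combined_footprint(strand):
    """Union of the 3' RNA and 5' RNA footprints on one strand."""
    return merge_footprints(FOOTPRINTS[strand].values())


if __name__ == "__main__":
    for strand in FOOTPRINTS:
        print(strand, combined_footprint(strand))
```

With the coordinates quoted above, the upstream and downstream regions do not overlap on either strand, so the union is simply both regions flanking the insertion site.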

Fig. 4.
DNase I footprints of DNA–protein complexes containing either 3′ or 5′ RNA. (A) The target DNA was 5′-end-labeled on either the top (Left) or the bottom (Right) strand. The complexes were formed and separated on EMSA gels ...

Having demonstrated above the effects of reducing the level of copurified 5′ RNA, we next wanted to address the effects of increasing levels of 5′ RNA on the DNA cleavage activities of the R2 protein. As shown in Fig. 5A, increasing 5′ RNA severely inhibits first-strand cleavage. This finding is in contrast to previous observations with 3′ RNA or nonspecific RNAs, in which increased RNA concentrations either have no effect or give rise to a slight increase in first-strand cleavage (8, 9, 16). The inverse relationship between first-strand cleavage and level of 5′ RNA confirms that as the 5′ RNA concentration is increased, a larger percentage of the R2 protein is driven to bind to the DNA target downstream of the insertion site and, thus, is in the wrong position to cleave the first strand.
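The qualitative argument above (more 5′ RNA leaves less protein free to bind upstream) can be illustrated with a toy calculation. This sketch assumes a simple one-site binding isotherm with RNA in excess over protein; neither that assumption nor the dissociation constant comes from the source, and the function names and `HYPOTHETICAL_KD` value are purely illustrative.

```python
def fraction_rna_bound(rna_conc, kd):
    """Fraction of R2 protein carrying 5' RNA, for a one-site binding isotherm.

    rna_conc and kd must share the same (arbitrary) concentration units.
    """
    if rna_conc < 0 or kd <= 0:
        raise ValueError("rna_conc must be >= 0 and kd must be > 0")
    return rna_conc / (rna_conc + kd)


def fraction_rna_free(rna_conc, kd):
    """Fraction of protein without 5' RNA, i.e. still free to bind upstream."""
    return 1.0 - fraction_rna_bound(rna_conc, kd)


if __name__ == "__main__":
    HYPOTHETICAL_KD = 1.0  # arbitrary units; an assumption, not a measured value
    for rna in (0.0, 0.5, 1.0, 4.0, 16.0):
        print(rna, round(fraction_rna_free(rna, HYPOTHETICAL_KD), 3))
```

Under these assumptions the RNA-free fraction falls monotonically as 5′ RNA is added, which is the direction of the trend reported for first-strand cleavage; the sketch makes no claim about the actual shape of the measured curve.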

Fig. 5.
Effect of 5′ RNA concentration on first- and second-strand DNA cleavage. Target DNA 5′-end-labeled on either the first (A) or second (B) strand was incubated with increasing amounts of 5′ RNA. Each reaction contained 100 fmol of ...

The effects of increasing 5′ RNA concentration on second-strand cleavage are shown in Fig. 5B. Increasing concentrations of 5′ RNA stimulated second-strand cleavage up to a point, but higher RNA concentrations only reduced the level of cleavage. The requirement of 5′ RNA for second-strand cleavage is consistent with the ability of this RNA to promote protein binding downstream of the target site; however, it is less apparent why higher 5′ RNA/protein ratios inhibit second-strand cleavage. One model that potentially explains this finding is that first-strand cleavage is required before second-strand cleavage. In this model, at low 5′ RNA concentrations many protein subunits are not associated with the RNA and thus bind upstream of the target site and cleave the first strand (Fig. 5A). Protein subunits that are associated with the 5′ RNA bind downstream of the same target sites and cleave the second strand. High concentrations of 5′ RNA, on the other hand, drive the binding of all subunits to the downstream site, circumventing first-strand cleavage (Fig. 5A). Although this model seems unlikely because of the 8-fold excess of DNA substrate to protein in these assays, we directly tested the ability of downstream subunits to cleave the second strand of DNA substrates precleaved on the first strand. In this assay, both the precleaved DNA substrate (2- to 12-fold) and the 5′ RNA (100-fold) were in excess, driving most R2 subunits to bind downstream of the insertion site. As shown in Fig. 6A, second-strand cleavage did not occur on DNA substrates precleaved on the first strand, which suggests that the inhibition of second-strand cleavage by excess 5′ RNA is not a result of a requirement for first-strand DNA cleavage.

Fig. 6.
Factors affecting second-strand cleavage by the downstream subunit. (A) Second-strand cleavage on DNA templates that have been precleaved on the bottom strand. The precleaved substrates were made by incubation of the DNA substrate with excess R2 protein ...

Two other models could explain the inability of the downstream subunit to cleave the second DNA strand in the presence of excess 5′ RNA. Either DNA cleavage by the downstream subunit could require protein–protein interactions with the upstream subunit, or the catalytic site of the downstream subunit is not available (masked) until after the 5′ RNA dissociates. To try to resolve these models, R2 protein was first bound to excess target DNA in the presence of a high concentration of 5′ RNA. Samples were then divided into two aliquots: one aliquot was digested with RNase A, and the second was left untreated. As shown in Fig. 6B, the level of second-strand cleavage was increased ≈2-fold by the addition of RNase A, which is consistent with the model that removal of the 5′ RNA from a bound downstream subunit can stimulate its ability to cleave the second strand. Unfortunately, reduction of the 5′ RNA by the RNase also destabilizes downstream binding by the R2 subunits, and the released protein is free to bind the upstream sites of DNA substrates in which the downstream subunit is still bound (i.e., form dimers). Because these assays were conducted in DNA substrate excess, it seems unlikely that this mechanism could account for the high level of cleavage observed (35% of the DNA bound by the R2 protein underwent cleavage). However, we cannot exclude a role for protein–protein interactions of the upstream and downstream subunits in the ability of the complex to cleave the second DNA strand.

Discussion

The experiments presented in this report indicate that an RNA segment near the 5′ end of a full-length R2 transcript regulates the role of the R2 protein in a retrotransposition reaction. In the R2 element of Bombyx mori, the system used in this study, the 5′ RNA appears to encode the beginning of the ORF (Fig. 1). However, the nucleotide sequence of this RNA is not well conserved among different R2 elements, and the location of the first methionine codon within the single R2 ORF is variable (13, 14); thus it is unclear what fraction of this RNA may actually encode protein in this and other species. The absence of conserved nucleotide sequences was also found for the 3′ RNA recognized by the R2 protein (17). Despite extensive primary sequence changes, the R2 protein from B. mori is able to use the 3′ RNA from R2 elements of distant insect species in a TPRT reaction, which is consistent with the evidence from many systems that it is an RNA's tertiary structure, not primary sequence, that is recognized by proteins (17).

The discovery of the association of R2 protein subunits with the 5′ RNA adds an important component to our working model of the R2 integration reaction (Fig. 7). In addition to its ability to bind RNA, the R2 protein contains two DNA-binding domains: an N-terminal domain containing myb and zinc-finger protein motifs that bind the DNA sequences located downstream of the insertion site, and a C-terminal domain containing unknown protein motifs that bind the DNA sequences upstream of the insertion site (Fig. 7A). In the absence of RNA, the R2 protein appears to adopt a conformation that exposes both N-terminal and C-terminal DNA-binding domains. Evidence that the two binding domains are able to bind separate DNA molecules can be found in the large network of protein–DNA complexes observed on EMSA gels in the absence of RNA (8). Although in the absence of RNA the R2 protein is able to cleave the first (bottom) DNA strand, the R2 protein binds more efficiently to the upstream DNA in the presence of 3′ RNA, perhaps by a change in protein conformation that sequesters the N-terminal DNA-binding domain. This upstream subunit is capable by itself of conducting the TPRT reaction (steps 1 and 2 in Fig. 7B) (9).

Fig. 7.
Model of R2 retrotransposition. (A) The R2 protein is composed of three domains: an N-terminal DNA-binding domain (blue shading), a central reverse transcriptase (RT) domain (green shading), and a C-terminal DNA-binding and endonuclease domain (red shading). ...

Although we have previously shown that the downstream subunit is responsible for second-strand cleavage (step 3 of the integration) (9), the experiments in this report indicate that it is the association of R2 protein with the 5′ RNA that promotes this binding, presumably by sequestering the DNA-binding motifs of the C-terminal domain. In the presence of excess 5′ RNA, downstream subunit binding can occur in the absence of upstream binding (Fig. 4), but cleavage of the second DNA strand does not occur (Fig. 5). Failure of these downstream complexes to cleave the second DNA strand suggests that cleavage requires an interaction with the upstream subunit and/or that the downstream subunit must first discharge the RNA. The stimulation of second-strand cleavage by the treatment of prebound downstream subunits with RNase A supports the latter possibility (Fig. 6B).

The model that loss of 5′ RNA by the downstream subunit allows second-strand cleavage is also more consistent with our previous data on the timing of the various steps of the retrotransposition reaction. In vitro, second-strand cleavage occurs slowly and inefficiently only after reverse transcription (10). The kinetics of this reaction can be explained if second-strand cleavage follows the slow dissociation of the copurified 5′ RNA from the downstream subunit. Within a cell, on the other hand, loss of 5′ RNA from the downstream subunit is likely the result of the reverse transcription of an RNA transcript with 5′ and 3′ RNA as part of the same molecule (step 3). Thus, the requirement to remove the 5′ RNA to enable second-strand cleavage provides temporal control for a complete integration reaction. Finally, second-strand synthesis (step 4) is the only step of the integration reaction that has not been observed in vitro. However, the R2 polymerase can efficiently use DNA templates and has the ability to displace RNA strands that are annealed to these DNA templates (A. Kurzynska-Kokorniak, A. Bibillo, and T.H.E., unpublished work). Armed with our understanding of the role of the 5′ RNA in the reaction, we hope to be able to reproduce a complete integration reaction in vitro.
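The ordering proposed in this working model can be written out as a small state sketch. It is a deliberate simplification under stated assumptions: it encodes only the RNA-masking hypothesis (a possible contribution of protein–protein interactions is not modeled), it omits second-strand synthesis because that step has not been observed in vitro, and the names `binding_site` and `IntegrationModel` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def binding_site(bound_rna: Optional[str]) -> str:
    """DNA site favored by a subunit, given the RNA it carries ("3'", "5'" or None)."""
    if bound_rna == "3'":
        return "upstream"
    if bound_rna == "5'":
        return "downstream"
    if bound_rna is None:
        # Without RNA, both DNA-binding domains appear to be exposed.
        return "either"
    raise ValueError(f"unknown RNA: {bound_rna!r}")


@dataclass
class IntegrationModel:
    """Tracks the order of steps; the downstream subunit starts with 5' RNA bound."""
    downstream_has_5p_rna: bool = True
    steps_done: List[str] = field(default_factory=list)

    def cleave_first_strand(self) -> None:
        self.steps_done.append("first-strand cleavage")

    def reverse_transcribe(self) -> None:
        if "first-strand cleavage" not in self.steps_done:
            raise RuntimeError("TPRT needs the 3' end released by first-strand cleavage")
        # Assumption of the model: copying the transcript removes the 5' RNA
        # from the downstream subunit.
        self.downstream_has_5p_rna = False
        self.steps_done.append("reverse transcription")

    def cleave_second_strand(self) -> None:
        if self.downstream_has_5p_rna:
            raise RuntimeError("second-strand cleavage is blocked while 5' RNA is bound")
        self.steps_done.append("second-strand cleavage")
```

Calling `cleave_second_strand()` before `reverse_transcribe()` raises an error, which is the temporal control the model proposes.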

Discovery of the role played by 5′ RNA in R2 retrotransposition may also explain the observation made in different species that the 5′ junctions of full-length R2 elements are more precise than the junctions of 5′ truncated elements (refs. 13 and 18, and D. Stage and T.H.E., unpublished work). These 5′ truncations are postulated to be a result of cellular degradation of the RNA transcript or the reverse transcriptase failing to reach the 5′ end of the transcript. Such truncations may have more variable 5′ junctions, because in the former case an R2 subunit may not bind to the downstream site, and in the latter case, the downstream subunit cannot cleave the second strand, leaving cellular DNA repair proteins to complete or eliminate the integration. Finally, unlike many non-LTR retrotransposons, R2 elements do not encode another ORF (ORF1) upstream of the major ORF. ORF1 proteins are known to bind RNA (19–21), possibly protecting it from cellular RNases, as well as to directly contribute to the retrotransposition reaction (2). It is interesting to speculate that the ability of the R2 protein to bind near the 5′ end of its own transcript may substitute for certain functions of the ORF1 protein.

Materials and Methods

Protein Purification and Nucleic Acid Preparations.

B. mori R2 protein was purified as described (8, 10). 5′-end-labeled DNA substrates extended from 50 bp upstream to 50 bp downstream of the insertion site (9). R2 protein binding, cleavage, and TPRT assays in 13-μl reactions were performed in 50 mM Tris·HCl (pH 8.0)/200 mM NaCl/5 mM MgCl2/1 mM DTT/11% glycerol/0.1 mg/ml BSA/0.01% Triton X-100, with or without 25 μM dNTPs as described (9). Reactions were incubated at 37°C for 30 min. For reactions in which the copurified 5′ RNA was first reduced, aliquots of the purified protein were preincubated with RNase A (3 μg/ml) for 20 min at 22°C and then 20 min at 37°C before adding 60 units of RNaseOut (Invitrogen, Carlsbad, CA) per assay, the appropriate DNA, and either 3′ or 5′ RNA. Control reactions were similarly incubated but without the RNase A. The DNA-binding reactions (400 fmol of DNA) used for the DNase I footprints contained 35–65% of the DNA substrate bound by protein. The bound DNA complexes were separated from free DNA on native polyacrylamide gels (8, 9). The DNA substrate to map the bottom strand extended from 70 bp upstream to 30 bp downstream of the insertion site.
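For readers setting up similar reactions, the amounts implied by the buffer above follow from concentration times volume. The sketch below uses only the final concentrations and the 13-μl volume stated in the text; the stock concentration in the usage example is a hypothetical value, and the percentage components (glycerol, Triton X-100) are omitted because the text does not say whether they are v/v or w/v.

```python
REACTION_VOLUME_UL = 13.0  # reaction volume stated in the text

# Final concentrations from the text, expressed in mM.
BUFFER_MM = {
    "Tris-HCl (pH 8.0)": 50.0,
    "NaCl": 200.0,
    "MgCl2": 5.0,
    "DTT": 1.0,
    "dNTPs (when included)": 0.025,  # 25 uM
}


def nmol_in_reaction(conc_mm, volume_ul=REACTION_VOLUME_UL):
    """Amount in nmol; 1 mM equals 1 nmol per microliter."""
    return conc_mm * volume_ul


def stock_volume_ul(final_mm, stock_mm, volume_ul=REACTION_VOLUME_UL):
    """Volume of stock to add (C1*V1 = C2*V2); stock_mm is supplied by the user."""
    if stock_mm < final_mm:
        raise ValueError("stock must be at least as concentrated as the final value")
    return final_mm * volume_ul / stock_mm


if __name__ == "__main__":
    for name, conc in BUFFER_MM.items():
        print(f"{name}: {nmol_in_reaction(conc):.3f} nmol")
    # Hypothetical 5 M NaCl stock, used only to show the dilution arithmetic.
    print(f"NaCl stock volume: {stock_volume_ul(200.0, 5000.0):.2f} ul")
```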

Cloning of the 5′ RNA.

The 5′ RNA copurifying with the R2 protein was further purified by excision from SDS/PAGE gels, and was cloned by using the template-jumping activity of the R2 reverse transcriptase (11). The copurified 5′ RNA was used as the acceptor molecule for primer extension of a 177-nt donor RNA (11). The cDNA jumping products were purified from a denaturing gel by elution in a buffer of 0.5 M NH4OAc, 10 mM Mg(OAc)2, 1 mM EDTA at pH 8.0, and 0.1% SDS, and were precipitated with 3 vol of ethanol and dissolved in water. The products were extended by terminal deoxynucleotidyltransferase (Promega, Madison, WI) in the presence of 2 mM dCTP and then subjected to PCR with one primer specific to the 177-nt donor RNA (AB.9 of ref. 11) and the second primer corresponding to oligo(dG). The PCR products were cloned in pBluescript vector (Stratagene, La Jolla, CA), and multiple clones were sequenced.

RNA Synthesis.

R2 3′ UTR RNA (National Center for Biotechnology Information accession no. M16558 from nucleotide 4028 to nucleotide 4275) was made by in vitro transcription as described previously (9). R2 5′ end RNA (National Center for Biotechnology Information accession no. M16558 from nucleotide 716 to nucleotide 1034) was synthesized by T7 transcription of templates made by PCR amplification of SacI/XcmI-digested pR260 (10) by using the primers 5′-GCGTAATACGACTCACTATAGGGCCGGTGTAACCCGGATGGCTG-3′ and 5′-CGCAGAACTGGCAGGTCCAACCAG-3′. RNA transcription and purification of the RNA was the same as with the 3′ RNA (9, 12).
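The coordinates and primers above can be sanity-checked with a few lines of code. This is an illustrative sketch only: the coordinates and primer sequences are copied from the text, `T7_PROMOTER` is the standard T7 promoter consensus (not quoted in the source), and the helper names are hypothetical.

```python
# Inclusive nucleotide coordinates in accession M16558, as given in the text.
RNA_COORDS = {"3' RNA": (4028, 4275), "5' RNA": (716, 1034)}

FORWARD_PRIMER = "GCGTAATACGACTCACTATAGGGCCGGTGTAACCCGGATGGCTG"
REVERSE_PRIMER = "CGCAGAACTGGCAGGTCCAACCAG"

# Core T7 promoter through the first transcribed G (standard consensus;
# an assumption in the sense that the source does not spell it out).
T7_PROMOTER = "TAATACGACTCACTATAG"


def segment_length(start, end):
    """Length of an inclusive coordinate range."""
    return end - start + 1


def split_t7_primer(primer, promoter=T7_PROMOTER):
    """Split a primer into (5' extension, promoter, remainder)."""
    index = primer.find(promoter)
    if index < 0:
        raise ValueError("no T7 promoter found in primer")
    end = index + len(promoter)
    return primer[:index], primer[index:end], primer[end:]


if __name__ == "__main__":
    for name, (start, end) in RNA_COORDS.items():
        print(name, segment_length(start, end), "nt")
    print(split_t7_primer(FORWARD_PRIMER))
```

The computed lengths (248 and 319 nt) agree with the approximate sizes given in the text: a 320-nt 5′ RNA that is about 70 nt longer than the 3′ RNA.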


We thank A. Kurzynska-Kokorniak for discussions and D. Eickbush for comments on the manuscript. This work was supported by National Institutes of Health Public Health Service Grant GM42790.


Abbreviation: TPRT, target-primed reverse transcription.


The authors declare no conflict of interest.

This article is a PNAS direct submission.


1. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al. Nature. 2001;409:860–921.
2. Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, Kazazian HH Jr. Cell. 1996;87:917–927.
3. Dewannieux M, Esnault C, Heidmann T. Nat Genet. 2003;35:41–48.
4. Takahashi H, Fujiwara H. EMBO J. 2002;21:408–417.
5. Chambeyron S, Bucheton A, Busseau I. J Biol Chem. 2002;277:17877–17882.
6. Eickbush TH. In: Mobile DNA II, Craig NL, Craigie R, Gellert M, Lambowitz AM, editors. Washington, DC: Am Soc Microbiol; 2002. pp. 813–835.
7. Kojima KK, Fujiwara H. Mol Biol Evol. 2003;21:207–217.
8. Christensen SM, Eickbush TH. J Mol Biol. 2004;336:1035–1045.
9. Christensen SM, Eickbush TH. Mol Cell Biol. 2005;25:6617–6628.
10. Luan DD, Korman MH, Jakubczak JL, Eickbush TH. Cell. 1993;72:595–605.
11. Bibillo A, Eickbush TH. J Biol Chem. 2004;279:14945–14953.
12. Christensen SM, Bibillo A, Eickbush TH. Nucleic Acids Res. 2005;33:6461–6468.
13. Burke WD, Malik HS, Jones JP, Eickbush TH. Mol Biol Evol. 1999;16:502–511.
14. George JA, Eickbush TH. Insect Mol Biol. 1999;8:3–10.
15. Yang J, Malik HS, Eickbush TH. Proc Natl Acad Sci USA. 1999;96:7847–7852.
16. Luan DD, Eickbush TH. Mol Cell Biol. 1995;15:3882–3891.
17. Ruschak AM, Mathews DH, Bibillo A, Spinelli SL, Childs JL, Eickbush TH, Turner DH. RNA. 2004;10:978–987.
18. George JA, Burke WD, Eickbush TH. Genetics. 1996;142:853–863.
19. Hohjoh H, Singer MF. EMBO J. 1997;16:6034–6043.
20. Dawson A, Hartswood E, Paterson T, Finnegan DJ. EMBO J. 1997;16:4448–4455.
21. Martin SL, Bushman FD. Mol Cell Biol. 2001;21:467–475.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences