Copyright © 2008, Cold Spring Harbor Laboratory Press

Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes

1 Department of Biochemistry and Molecular Biology and Department of Genetics, University of Georgia, Athens, Georgia 30602, USA; 2 Department of Chemistry and Biochemistry and Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida 32306, USA

3 Corresponding authors. E-MAIL mterns/at/bmb.uga.edu; FAX (706) 542-1752.
4 E-MAIL rterns/at/bmb.uga.edu; FAX (706) 542-1752.

Received September 19, 2008; Accepted October 20, 2008.

Abstract

An RNA-based gene silencing pathway that protects bacteria and archaea from viruses and other genome invaders is hypothesized to arise from guide RNAs encoded by CRISPR loci and proteins encoded by the cas genes. CRISPR loci contain multiple short invader-derived sequences separated by short repeats. The presence of virus-specific sequences within CRISPR loci of prokaryotic genomes confers resistance against corresponding viruses. The CRISPR loci are transcribed as long RNAs that must be processed to smaller guide RNAs. Here we identified Pyrococcus furiosus Cas6 as a novel endoribonuclease that cleaves CRISPR RNAs within the repeat sequences to release individual invader targeting RNAs. Cas6 interacts with a specific sequence motif in the 5′ region of the CRISPR repeat element and cleaves at a defined site within the 3′ region of the repeat. The 1.8 angstrom crystal structure of the enzyme reveals two ferredoxin-like folds that are also found in other RNA-binding proteins. The predicted active site of the enzyme is similar to that of tRNA splicing endonucleases, and concordantly, Cas6 activity is metal-independent. cas6 is one of the most widely distributed CRISPR-associated genes. Our findings indicate that Cas6 functions in the generation of CRISPR-derived guide RNAs in numerous bacteria and archaea.
Keywords: CRISPR, Cas, endoribonuclease, RNA processing, Dicer, RNAi

All genomes are potential targets of invasion by molecular parasites such as viruses and transposable elements, and organisms have evolved RNA-directed defense mechanisms to cope with the constant threat of genome invaders (Farazi et al. 2008; Girard and Hannon 2008). The well-known subpathway of RNA silencing referred to as RNAi functions in defense against viruses in eukaryotes (Ding and Voinnet 2007). The RNAi defense response is mediated by short (~22-nucleotide [nt]) RNAs termed siRNAs. The siRNAs are generated from invading viral RNAs by dsRNA-specific, RNase III-like endonucleases called Dicers (Jaskiewicz and Filipowicz 2008). The mature siRNAs are assembled with host effector proteins and target them to corresponding viral target RNAs to effect viral gene silencing via RNA destruction or other mechanisms (Farazi et al. 2008; Girard and Hannon 2008).

Compelling evidence has recently emerged for the existence of an RNA-mediated genome defense pathway in archaea and numerous bacteria that has been hypothesized to parallel the eukaryotic RNAi pathway (for reviews, see Godde and Bickerton 2006; Lillestol et al. 2006; Makarova et al. 2006; Sorek et al. 2008). Known as the CRISPR-Cas system or prokaryotic RNAi (pRNAi), the pathway is proposed to arise from two evolutionarily and often physically linked gene loci: the CRISPR (clustered regularly interspaced short palindromic repeats) locus, which encodes RNA components of the system, and the cas (CRISPR-associated) locus, which encodes proteins (Jansen et al. 2002; Makarova et al. 2002, 2006; Haft et al. 2005). The individual Cas proteins do not share significant sequence similarity with protein components of the eukaryotic RNAi machinery, but have analogous predicted functions (e.g., RNA binding, nuclease, helicase, etc.) (Makarova et al. 2006).
Unlike the siRNAs of the eukaryotic RNAi system, the effector RNAs of pRNAi are encoded in the host genome. CRISPR loci encode short (typically ~30- to 35-nt) invader-derived sequences interspersed between short (typically ~30- to 35-nt) direct repeat sequences (Bolotin et al. 2005; Mojica et al. 2005; Pourcel et al. 2005; Godde and Bickerton 2006; Lillestol et al. 2006; Makarova et al. 2006; Horvath et al. 2008; Sorek et al. 2008). Recent studies have provided clear experimental evidence that correlates the presence of virus-specific CRISPR sequences with viral immunity (Barrangou et al. 2007; Brouns et al. 2008; Deveau et al. 2008). Furthermore, viral infection has been shown to result in the appearance of new corresponding CRISPR elements in surviving strains (Barrangou et al. 2007; Deveau et al. 2008). This rapidly adapting CRISPR-based immunity acts within natural microbial populations to promote host cell fitness and to influence microbial ecology (Andersson and Banfield 2008; Tyson and Banfield 2008). The primary products of the CRISPR loci appear to be short RNAs that contain the invader targeting sequences, and are termed guide RNAs or prokaryotic silencing RNAs (psiRNAs) based on their hypothesized role in the pathway (Makarova et al. 2006; Hale et al. 2008). RNA analysis indicates that CRISPR locus transcripts are cleaved within the repeat sequences to release ~60- to 70-nt RNA intermediates that contain individual invader targeting sequences and flanking repeat fragments (Fig. 1A
The primary goal of this study was to begin to understand the biogenesis of psiRNAs through identification and characterization of the enzyme that cleaves within the repeat sequences of CRISPR RNA transcripts to liberate the many individual psiRNA species that function in defense against molecular invaders. Our results indicate that Cas6, one of the six highly conserved or “core” Cas proteins (Haft et al. 2005), functions as a CRISPR repeat RNA-specific endoribonuclease in P. furiosus and likely numerous other archaea and bacteria.

Results

The psiRNAs, which are thought to be primary agents in prokaryotic genome defense, are derived from CRISPR RNA transcripts that consist of a series of individual invader targeting sequences separated by a common repeat sequence (Fig. 1A
More than 40 CRISPR-associated genes have been identified; however, only a subset of the cas genes is found in any given genome, and no cas gene appears to be present in all organisms that possess the CRISPR-Cas system (Haft et al. 2005; Makarova et al. 2006). Cas6 is among the most widely distributed Cas proteins and is found in both bacteria and archaea (Haft et al. 2005). A distinct protein with similar activity was very recently reported in Escherichia coli (Brouns et al. 2008). This protein, Cse3 (CRISPR-Cas system subtype E. coli, also referred to as CasE), is found in some bacteria that lack Cas6 (Haft et al. 2005). Both Cas6 and Cse3 are members of the RAMP (repeat-associated mysterious protein) superfamily, as are a large number of the Cas proteins (Makarova et al. 2002, 2006). RAMP proteins contain G-rich loops and are predicted to be RNA-binding proteins (Makarova et al. 2002, 2006). Cas6 is distinguished from the many other RAMP family members by a conserved sequence motif within the predicted C-terminal G-rich loop (consensus GhGxxxxxGhG, where h is hydrophobic and xxxxx has at least one lysine or arginine) (Makarova et al. 2002; Haft et al. 2005). Nuclease activity was not predicted for Cas6 based on sequence analysis. To determine the precise PfCas6 cleavage site within the CRISPR repeat sequence, 5′-end-labeled repeat RNA was incubated with the purified enzyme and the 5′ cleavage product was mapped relative to RNase T1 (cuts after guanosines) and alkaline hydrolysis (cuts after each nucleotide) cleavage products (Fig. 3A
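The logic of mapping a cleavage site against RNase T1 and alkaline hydrolysis ladders can be illustrated with a minimal sketch. It assumes idealized partial digests of a 5′-end-labeled RNA, so that each labeled band corresponds to a fragment running from the 5′ end to a cut site. The function names and the 16-nt substrate are hypothetical and are not taken from the study.

```python
def t1_ladder(rna: str) -> list[int]:
    """Lengths of 5'-labeled fragments from a partial RNase T1 digest.

    T1 cuts 3' of each guanosine, so a G at 1-based position p gives a
    labeled fragment of length p. A cut after the final nucleotide would
    only reproduce the full-length RNA, so it is skipped.
    """
    return [i + 1 for i, base in enumerate(rna) if base == "G" and i + 1 < len(rna)]


def hydrolysis_ladder(rna: str) -> list[int]:
    """Partial alkaline hydrolysis cuts after every nucleotide: lengths 1..n-1."""
    return list(range(1, len(rna)))


def flanking_nucleotides(rna: str, product_len: int) -> tuple[str, str]:
    """Nucleotides immediately 5' and 3' of a cut leaving a product_len-nt product."""
    if not 0 < product_len < len(rna):
        raise ValueError("product length must fall inside the RNA")
    return rna[product_len - 1], rna[product_len]


# Hypothetical 16-nt substrate, for illustration only.
substrate = "GUUACGAUAGCCUAAG"
print("T1 bands (nt):", t1_ladder(substrate))
print("hydrolysis bands (nt):", hydrolysis_ladder(substrate))
print("cut after position 12 lies between:", flanking_nucleotides(substrate, 12))
```

In practice the T1 bands anchor the single-nucleotide hydrolysis ladder to known G positions, and the enzyme's 5′ product is then sized by counting rungs from the nearest G.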
We next investigated the RNA sequence requirements of Cas6 binding and endonucleolytic cleavage. To identify the RNA-binding determinants, we performed gel mobility shift assays with a series of RNAs (Fig. 4A
Further analysis indicates that the first 12 nt of the 5′ region of the CRISPR repeat play a critical role in Cas6 binding. PfCas6 binds to an RNA comprised of the first 12 nt of the repeat with similar affinity as the 5′ cleavage product (Fig. 4A While nucleotides at the 5′ end of the CRISPR repeat are sufficient for robust PfCas6 binding, cleavage appears to involve additional elements. As expected, mutations that disrupt protein binding also eliminate cleavage activity (Fig. 5
P. furiosus has seven CRISPR loci with five slightly varied repeat sequences, and the elements that we identified as most important for Cas6 recognition and cleavage map to the regions of greatest sequence conservation. Variation is observed at only one position within each of the first 12 and last 11 nt of the P. furiosus repeat sequences, consistent with the importance of these two regions in Cas6 binding and cleavage. On the other hand, variation occurs at three positions between the binding and cleavage sites (positions 14, 16, and 19), suggesting that nucleotide identities are less important in this region.

To gain a more detailed understanding of PfCas6, we obtained a crystal structure of the protein at 1.8 Å resolution (Fig. 5
The structure of PfCas6 allows us to predict the site of catalysis and catalytic mechanism of the enzyme. Several candidate catalytic residues are evident as strictly conserved residues in aligned Cas6 sequences (Supplemental Fig. S2). These include Tyr31, His46, and Lys52, which cluster within 6 Å of each other and are found in close proximity to the G-rich loop that contains the Cas6 signature motif (Fig. 6B
Discussion

The results presented here indicate that Cas6 plays a central role in the production of the psiRNAs in the emerging prokaryotic RNAi pathway. Cas6 is a novel riboendonuclease. Through direct binding and cleavage of CRISPR repeat sequences, Cas6 is capable of dicing long, single-stranded CRISPR primary transcripts into units that consist of an individual guide sequence flanked by a short (8-nt) repeat sequence at the 5′ end and by the remaining repeat sequence at the 3′ end of the RNA (Fig. 1A

Cas6 is evolutionarily, structurally, and catalytically distinct from the Dicer proteins that function in the release of individual RNAs that mediate gene silencing in eukaryotes (Hammond 2005; Jaskiewicz and Filipowicz 2008). However, Cas6 is one of three different ferredoxin fold Cas proteins recently found to possess nuclease activity. Cas2, another protein found in many of the prokaryotes that possess the CRISPR-Cas system, cleaves U-rich ssRNA (Beloglazova et al. 2008). The mechanism of action of Cas6 seems to be distinct from that of Cas2, which appears to be a metal-dependent, hydrolytic enzyme (Beloglazova et al. 2008). The role of Cas2 in the pRNAi pathway is currently unknown.

The E. coli Cse3 protein functions like Cas6 as a CRISPR repeat cleaving enzyme (Brouns et al. 2008). Cse3 also cleaves RNA in a divalent metal-independent manner (Brouns et al. 2008). The substrate RNA recognition requirements and the precise cleavage site have not yet been defined for Cse3. Interestingly, despite the lack of significant sequence homology, the Cas6 and Cse3 proteins appear to adopt similar structures to perform a common function in psiRNA biogenesis. Moreover, some bacteria with the CRISPR-Cas system do not appear to contain either a cas6 or a cse3 gene, suggesting that there is another Cas6 functional homolog among the Cas proteins, and illustrating the diversity of the CRISPR-Cas systems present in prokaryotes.
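The dicing of a CRISPR primary transcript described above can be pictured with a small simulation. This is a minimal sketch, not a model of the enzyme. It assumes a single cut in every repeat at a fixed distance (8 nt) from the repeat's 3′ end, and it uses a made-up 30-nt repeat and made-up 35-nt guide sequences as placeholders. The names `build_transcript` and `dice` are hypothetical.

```python
def build_transcript(repeat: str, guides: list[str]) -> str:
    """Assemble repeat-guide-repeat-...-repeat, as in a CRISPR primary transcript."""
    parts = [repeat]
    for guide in guides:
        parts.extend([guide, repeat])
    return "".join(parts)


def dice(transcript: str, repeat: str, tag_len: int = 8) -> list[str]:
    """Cut once in every repeat, tag_len nt upstream of the repeat's 3' end.

    Returns only the internal units bounded by two cuts; the leader and
    trailer pieces at the transcript ends are dropped.
    """
    cut_offset = len(repeat) - tag_len
    cuts = []
    start = transcript.find(repeat)
    while start != -1:
        cuts.append(start + cut_offset)
        start = transcript.find(repeat, start + len(repeat))
    return [transcript[a:b] for a, b in zip(cuts, cuts[1:])]


# Placeholder sequences for illustration only (not taken from the study).
REPEAT = "GUACCAUUAAGCCUAAAUUAGCAUUGCAAG"   # 30 nt
GUIDES = ["AC" * 17 + "A", "CU" * 17 + "C"]  # two 35-nt guide sequences

for unit in dice(build_transcript(REPEAT, GUIDES), REPEAT):
    tag, body = unit[:8], unit[8:]
    print(len(unit), "nt:", tag, "+", body)
```

With a 30-nt repeat and 35-nt guides, each released unit is 8 + 35 + 22 = 65 nt, which falls in the ~60- to 70-nt range reported for the processing intermediates.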
Materials and methods

Purification of PF1131 protein for cleavage and RNA-binding assays

N-terminal, 6x-histidine-tagged PF1131 protein (PfCas6 from P. furiosus DSM 3638 strain) was expressed in Escherichia coli BL21 codon + (DE3, Invitrogen) cells harboring a pET24d plasmid containing the appropriate gene insert (gift of Michael Adams, University of Georgia). Protein expression was induced by growing the cells to an OD600 of 0.6 and adding isopropylthio-β-D-galactoside (IPTG) to a final concentration of 1 mM. The cells were disrupted by sonication (Misonix Sonicator 3000) in buffer A (20 mM sodium phosphate [pH 7.0], 500 mM NaCl and 0.1 mM phenylmethylsulfonyl fluoride). The lysate was then cleared by centrifugation and the supernatant was incubated for 20 min at 70°C. This sample was centrifuged and the supernatant was applied to a Ni-NTA agarose column (Qiagen) that had been equilibrated with buffer A. The protein was eluted from the column with buffer A containing 350 mM imidazole. The purity of the protein was evaluated by SDS-PAGE and staining with Coomassie blue. Buffer exchange into 40 mM HEPES-KOH (pH 7.0), 500 mM KCl was carried out using Microcon PL-10 filter columns (Millipore). The protein concentration was determined by the BCA assay (Pierce).

Generation of RNA substrates

Synthetic RNAs (listed in Supplemental Table S1) and the RNA size standards (Decade Markers) were purchased from Integrated DNA Technologies (IDT) and Ambion, respectively. These RNAs were 5′-end-labeled with T4 Polynucleotide kinase (Ambion) in a 20-μL reaction containing 20 pmol of RNA, 500 μCi of [γ-32P] ATP (3000 Ci/mmol; MP Biomedicals), and 20 U of T4 kinase. The RNAs were separated by electrophoresis on denaturing (7 M urea) 15% polyacrylamide gels, and the appropriate RNA species were excised from the gel with a sterile razor blade guided by a brief autoradiographic exposure.
The RNAs were eluted from the gel slices by end-over-end rotation in 400 μL of RNA elution buffer (500 mM NH4OAc, 0.1% SDS, 0.5 mM EDTA) for 12–14 h at 4°C. The RNA was then extracted with phenol/chloroform/isoamyl alcohol (PCI, 25:24:1 at pH 5.2), and precipitated with 2.5 vol of 100% ethanol in the presence of 0.3 M sodium acetate and 20 μg of glycogen after incubation for 1 h at −20°C. All other RNAs were generated by in vitro transcription using T7 RNA polymerase (Ambion) and uniformly labeled with [α-32P] UTP (700 Ci/mmol; MP Biomedicals) as described (Baker et al. 2005). The templates used were either annealed DNA oligonucleotides or PCR products (see Supplemental Tables S1, S2), both containing the T7 promoter sequence. A typical reaction contained 200 ng of PCR product or annealed deoxyoligonucleotides, 1 mM DTT, 10 U SUPERase-IN RNase inhibitor (Ambion), 500 μM ATP, CTP, and GTP, 50 μM UTP, 30 μCi [α-32P] UTP, 1× transcription buffer (Ambion), and 40 U T7 RNA polymerase in a total volume of 20 μL.

RNA-binding and cleavage reactions

Typically, identical reaction conditions were used to assay the ability of PfCas6 protein to bind to and to cleave substrate RNAs. These reactions were initiated by incubating 0.05 pmol of 32P-radiolabeled RNAs (either uniformly or 5′-end-labeled) with up to 1 μM (as indicated in the figure legends) of PfCas6 protein in 20 mM HEPES-KOH (pH 7.0), 250 mM KCl, 0.75 mM DTT, 1.5 mM MgCl2, 5 μg of E. coli tRNA, and 10% glycerol in a 20-μL reaction volume for 30 min at 70°C. Half of the reactions were directly run on native 8% polyacrylamide gels to assay RNA binding by gel mobility shift essentially as described (Baker et al. 2005). RNA cleavage was assayed using the remaining half of the reaction by deproteinizing (PCI extraction and ethanol precipitation) the RNAs and separating them by electrophoresis on denaturing (7 M urea), 12%–15% polyacrylamide gels. Gels were dried and the radiolabeled RNAs visualized by phosphorimaging.
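The amounts quoted in the labeling and binding reactions can be converted into concentrations and molar ratios with a few lines of arithmetic. The sketch below is only a unit-conversion aid with hypothetical helper names. It assumes the stated radioactivity and specific activity apply without correction for isotope decay, and it takes the upper protein concentration (1 μM) as the example.

```python
def pmol_from_activity(activity_uci: float, specific_activity_ci_per_mmol: float) -> float:
    """Picomoles of nucleotide in a given activity (µCi) at a given specific activity."""
    mmol = (activity_uci * 1e-6) / specific_activity_ci_per_mmol  # µCi -> Ci, then Ci / (Ci/mmol)
    return mmol * 1e9                                             # mmol -> pmol


def conc_nm(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in nM; pmol/µL equals µM, and 1 µM is 1000 nM."""
    return amount_pmol / volume_ul * 1000.0


# Binding/cleavage reaction: 0.05 pmol RNA in 20 µL, protein up to 1 µM.
rna_nm = conc_nm(0.05, 20.0)
protein_nm = 1000.0
print(f"RNA: {rna_nm:.1f} nM; protein excess at 1 µM: {protein_nm / rna_nm:.0f}-fold")

# 5'-end-labeling reaction: 500 µCi of ATP at 3000 Ci/mmol with 20 pmol RNA.
atp_pmol = pmol_from_activity(500.0, 3000.0)
print(f"labeled ATP: {atp_pmol:.1f} pmol, {atp_pmol / 20.0:.1f} per RNA 5' end")
```

Under these assumptions the substrate RNA is present at low nanomolar concentration, so at the highest protein concentration the enzyme is in large molar excess over substrate.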
Cleavage site mapping

In order to map the site of RNA cleavage by Cas6, a standard cleavage reaction was set up using 5′-end-labeled repeat RNA as described above. Alkaline hydrolysis and RNase T1 (0.1 U) ladders were generated as described previously (Youssef et al. 2007). Following the reactions, the RNAs were extracted with PCI, ethanol precipitated, and separated by electrophoresis on large, denaturing (7 M urea), 15% polyacrylamide (19:1 acrylamide:bis) gels. The gels were dried and the RNAs visualized by phosphorimaging.

Purification of PfCas6 for structure determination

N-terminal polyhistidine-tagged wild-type and selenomethionine-labeled PF1131 protein was expressed in E. coli and purified from cell extract by heat-denaturation and two chromatography steps. The cells were disrupted by sonication in a buffer containing 25 mM sodium phosphate (pH 7.5), 5% (v/v) glycerol, 1 M NaCl, 5 mM β-mercaptoethanol (βME), and 0.2 mM phenylmethylsulfonyl fluoride. The cell lysate was heated at 70°C for 15 min before being pelleted. The supernatant was then directly loaded at room temperature onto a Ni-NTA (Qiagen) column equilibrated with 25 mM sodium phosphate (pH 7.5), 5% (v/v) glycerol, 1 M NaCl, and 5 mM imidazole. The column was washed with the loading buffer containing 25 mM imidazole and then the bound protein was eluted using the loading buffer containing 350 mM imidazole. Fractions containing PF1131 were pooled and loaded onto a Superdex 200 (Hiload 26/60, Pharmacia) size-exclusion column equilibrated with 20 mM Tris-HCl (pH 7.4), 500 mM KCl, 5% glycerol, 0.5 mM ethylenediaminetetraacetic acid (EDTA), and 5 mM βME. The fractions corresponding to PF1131 were pooled and concentrated to 100 mg/mL for crystallization.

Crystallization of PF1131 and selenomethionine-labeled PF1131

Both the wild-type and selenomethionine-labeled PF1131 protein were crystallized using vapor diffusion in a hanging drop at 30°C.
The droplets of PF1131 at 40 mg/mL were combined in equal volume with a well solution that contained 50 mM MES (pH 6.0), 30 mM MgCl2, and 15% (v/v) isopropanol. The crystals formed in 1–5 d with a cubic shape and to a size of ~0.4 mm × 0.4 mm × 0.4 mm.

Data collection and structure determination

Crystals were soaked briefly in a cryo-protecting solution containing the mother liquor plus 20% (w/v) polyethylene glycol 4000 before being flash frozen in a nitrogen stream at 100 K. The crystals of the native and selenomethionine-labeled PF1131 diffracted to dmin = 1.8–2.2 Å at the Southeast Regional Collaborative Access Team (SER-CAT) beamline 22ID. The space group of the crystals was determined to be P3221 and the cell dimensions are listed in Supplemental Table S3. A single wavelength data set was collected at the anomalous peak of selenium from a selenomethionine-labeled crystal. The solvent content was calculated to be 54.9% if the crystal was assumed to contain one PF1131 in one asymmetric unit. The structure of PF1131 was solved by a SAD phasing method using the automated crystallographic structure solution program SOLVE (Terwilliger and Berendzen 1999). The initial model traced by SOLVE was further improved by the program COOT (Emsley and Cowtan 2004), followed by refinement using CNS (Brunger et al. 1998) and REFMAC5 (Murshudov et al. 1997) to Rwork/Rfree of 23.6/27.3. The quality of the structure model was checked by PROCHECK (Laskowski et al. 1993) and was found to have satisfactory stereochemical properties.

Acknowledgments

We thank Caryn Hale (Terns laboratory) for contributions to the early stages of this project, Caryn Hale and Claiborne Glover (University of Georgia) for critical review of the manuscript, and Michael Adams (University of Georgia) for providing a PF1131 expression construct. This work was supported by NIH grant R01 GM54682 to M.T. and R.T., and NIH grant R01 GM66958 to H.L.
X-ray diffraction data were collected from the Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory. Supporting institutions for APS beamlines may be found at http://necat.chem.cornell.edu and http://www.ser-cat.org/members.html. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under contract no. W-31-109-Eng-38.

Footnotes

Supplemental material is available at http://www.genesdev.org.
Article is online at http://www.genesdev.org/cgi/doi/10.1101/gad.1742908.

References