Nucleic Acids Res. 2005; 33(14): 4443–4454.
Published online 2005 Aug 2. doi:  10.1093/nar/gki758
PMCID: PMC1182700

Identification and characterization of endogenous small interfering RNAs from rice


RNA silencing-mediated small interfering RNAs (siRNAs) and microRNAs (miRNAs) have diverse natural roles, ranging from regulation of gene expression and heterochromatin formation to genome defense against transposons and viruses. Unlike miRNAs, endogenous siRNAs are generally not conserved between species; consequently, their identification requires experimental approaches. Thus far, endogenous siRNAs have not been reported from rice, which is a model species for monocotyledonous plants. We identified a large set of putative endogenous siRNAs from root, shoot and inflorescence small RNA cDNA libraries of rice. Most of these siRNAs are from intergenic regions, although a substantial proportion (22%) originates from the introns and exons of protein-coding genes. Northern and RT–PCR analysis revealed that the expression of some of the siRNAs is tissue specific or developmental stage specific. A total of 25 transposons and 21 protein-coding genes were predicted to be cis-targets of some of the siRNAs. Based on sequence homology, we also predicted 111 putative trans-targets for 44 of the siRNAs. Interestingly, ∼46% of the predicted trans-targets are transposable elements, which suggests that endogenous siRNAs may play an important role in the suppression of transposon proliferation. Using RNA ligase-mediated-5′ rapid amplification of cDNA end assays, we validated three of the predicted targets and provided evidence for both cis- and trans-silencing of target genes by siRNA-guided mRNA cleavage.


In plants and animals, microRNAs (miRNAs) and small interfering RNAs (siRNAs) have emerged as important negative regulators of gene expression (1–4). miRNAs arise by endonucleolytic processing by the enzyme Dicer from hairpin-structured single-stranded precursor RNAs that are transcribed from endogenous nonprotein-coding genes (2,3). siRNAs are also processed by a Dicer, but are generated from double-stranded RNAs (dsRNAs) as a result of antisense or convergent transcription or due to the activity of one or more cellular RNA-dependent RNA polymerases (RdRPs) (4). These small RNAs serve as specificity determinants of transcriptional gene silencing (TGS) and post-transcriptional gene silencing (PTGS) (5–7).

The RNAi pathway and machinery are highly conserved throughout lower to higher eukaryotic organisms (Schizosaccharomyces pombe to metazoans). However, there are differences in siRNA-mediated RNA silencing pathways between plants and animals. For instance, siRNAs produced in Drosophila embryos (7) and in mammalian cells (8) belong to the ∼21 nt class, while siRNAs in plants and fungi fall into two distinct classes: a short (∼21 nt) and a long (∼24 nt) size class (9–15). Recently, it was shown that the ∼21 nt class of siRNAs designated as trans-acting siRNAs (tasiRNAs) from Arabidopsis are associated with post-transcriptional silencing by directing the cleavage of target mRNAs (16,17). The longer size class of siRNAs (∼24 nt) is associated with TGS involving DNA methylation and histone (H3K9) methylation (18–20). The length and functional diversity of small RNAs in plants are reflected in the multiplicity of DCL (Dicer-like) activities. Dicer is represented by one or two genes in animals, which indicates that often a single Dicer processes both miRNAs and siRNAs. In contrast, Arabidopsis and rice encode at least four and three DCL proteins, respectively (19). Arabidopsis homozygous for the weak loss-of-function allele dcl1-9 is impaired in miRNA precursor processing (15,21–23), whereas the DCL2 and DCL3 proteins are implicated in viral siRNA biogenesis and endogenous siRNA biogenesis, respectively (19). In addition to DCLs, an Argonaute (Ago-4), HEN1 and SDE4 have been implicated in siRNA accumulation (18,24–26). These studies suggest that plants have evolved multiple small RNA processing pathways with specific as well as overlapping functions.

Cloning of small RNAs is a starting point to understand their number, diversity and possible roles in any organism. Recent studies clearly indicated the importance of small RNA cloning, particularly in the identification of hitherto unknown classes of endogenous small RNAs in diverse species, such as S.pombe, Drosophila, Caenorhabditis elegans and Arabidopsis. A large number of endogenous small RNAs referred to as repeat-associated small interfering RNAs (rasiRNAs) that can be mapped to transposons of the Drosophila genome have been identified (27). Another class of small RNAs referred to as tiny non-coding RNAs (tncRNAs) has been identified in C.elegans (28). Heterochromatic siRNAs that corresponded to both DNA strands of centromeric repeats have been identified in S.pombe (29). Similarly, cloning in Arabidopsis led to the identification of a large number of endogenous siRNAs (15,16,19). The functions of these recently discovered endogenous siRNAs have not been investigated in detail, but they appear to have important regulatory roles in gene expression. For example, two recent studies have demonstrated that Arabidopsis endogenous siRNAs referred to as tasiRNAs act similarly to miRNAs and can direct the cleavage of the predicted trans-target mRNAs (16,17).

Since computational methods to predict endogenous siRNAs are currently unavailable, the identification of these siRNAs requires experimental approaches. Endogenous siRNAs have thus far been cloned only from a dicotyledonous plant (Arabidopsis), but siRNAs from monocotyledonous plants are not known. Rice is the world's most important crop, as measured by the portion of calories provided to the human diet, and is the only monocot species with a completely sequenced genome. To identify endogenous siRNAs and miRNAs from rice, we generated three small RNA cDNA libraries from different tissues. In this study, we report the first identification and characterization of a large set of putative endogenous siRNAs from rice. Most of these siRNAs map to intergenic regions. Some of the siRNAs show tissue/developmental stage-dependent expression. A total of 5 of the siRNAs are perfectly complementary to 25 different transposons and 8 of the siRNAs display perfect complementarity with 21 protein-coding genes, suggesting the possibility that these genes might be regulated in cis by endogenous siRNAs. Besides these, 111 trans-targets were predicted for 44 of the siRNAs and the predicted trans-target genes consist largely of transposable elements. This observation implies a role for endogenous siRNAs in controlling the proliferation of transposons in both cis- and trans-silencing manners. Importantly, we show that three of the predicted targets are genuine targets of endogenous siRNAs in rice and their mRNA cleavage in vivo is guided by the siRNAs. This is the first demonstration of mRNA targets of endogenous siRNAs in rice.


Cloning of endogenous siRNAs from rice

Total RNA was isolated separately from shoots and roots of 4-week-old young seedlings and inflorescences of adult rice plants (Oryza sativa ssp. japonica cv. Nipponbare) using TRIzol (Invitrogen Life Technologies, Carlsbad, CA) according to the manufacturer's instructions. Cloning of small RNAs was performed as described previously (15). In brief, small RNAs from 18 to 26 nt were size fractionated, purified and ligated sequentially to 5′ and 3′ RNA/DNA chimeric oligonucleotide adapters. Reverse transcription was performed after ligation with the adapters, followed by PCR amplification. The resulting PCR products were cloned and transformed into competent cells. Plasmids were isolated from individual colonies and sequenced.

Sequence analysis

Automated base calling of raw sequence traces and vector removal were performed with the PHRED and CROSS MATCH programs from Ewing and Green (30). To avoid loss of critical sequence information in our short RNA sequences, a two-step approach was used. First, non-insert and low-quality sequences were removed with relatively forgiving filter settings by running the CROSS MATCH program with a minimum match parameter of 15 nt and a PHRED score of 14. In a second step, the obtained insert sequences with quality scores below 20 were flagged and further inspected by eye. Candidates with ambiguous base calls were removed from the dataset. The filtered and trimmed sequences >16 nt in length were used to search the Rfam database (31) using the BLASTN program (32). This step allowed the removal of most RNA species that might be degradation products of non-coding RNAs in the dataset. Putative origins for the remaining sequences were identified by BLASTN searches against the genomic sequences of the version 2.0 annotation of O.sativa ssp. japonica from The Institute for Genomic Research (TIGR) (ftp://ftp.tigr.org/pub/data/Eukaryotic_Projects/o_sativa/). Candidates with perfect matches against these genomic datasets were designated as endogenous siRNAs when they showed no fold-back structure in secondary structure predictions with the mfold program (33).
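The second filtering step can be illustrated with a short Python sketch. It is not the PHRED/CROSS MATCH workflow itself: the function name `triage_insert` is hypothetical, and reading "quality scores below 20" as the minimum per-base score of an insert is an assumption made only for illustration.

```python
def triage_insert(seq, quals, longer_than=16, flag_below=20):
    """Classify a vector-trimmed insert as 'discard', 'inspect' or 'keep'.

    seq   -- insert sequence as read from the cDNA clone (DNA alphabet)
    quals -- per-base PHRED quality scores, same length as seq
    The thresholds mirror the text; applying flag_below to the minimum
    per-base score is an assumption, not the authors' stated rule.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists differ in length")
    # Inserts of 16 nt or shorter are not carried forward.
    if len(seq) <= longer_than:
        return "discard"
    # Any ambiguous base call (e.g. N) removes the candidate.
    if any(base not in "ACGT" for base in seq.upper()):
        return "discard"
    # Low-quality inserts are flagged for inspection by eye.
    if min(quals) < flag_below:
        return "inspect"
    return "keep"


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGTA"  # hypothetical 21-nt insert
    print(triage_insert(toy, [30] * len(toy)))
    print(triage_insert(toy, [30] * 20 + [15]))
```

Keeping a separate "inspect" outcome reflects the stated aim of not losing short sequences to an overly strict automatic filter.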

To identify putative target sequences, all predicted and cloned CDS (coding sequences) and UTR sequences from O.sativa (TIGR all.cdna set) were searched using the PatScan program (34). The following parameters were used for these pattern searches, with positions counted from the 5′ end of the siRNA in antisense orientation: no more than three mismatches in total, and no mismatch at positions 10 and 11. The retrieved siRNA/target site pairs were ranked and scored by aligning them with the Needleman–Wunsch global alignment program from EMBOSS (35).
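The mismatch rule described above can be sketched in Python as follows. This is a minimal illustration and not the PatScan search used in the study: the function names and the toy sequences are hypothetical, only Watson–Crick pairs are counted as matches (treating G:U pairs as mismatches is an assumption), and insertions or deletions are not considered.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def mismatch_positions(sirna, site):
    """Return the siRNA positions (1-based, from the siRNA 5' end) that do
    not pair with `site`, a target segment of equal length written 5'->3'."""
    perfect = reverse_complement(sirna)
    n = len(sirna)
    # Site index j (0-based) pairs with siRNA position n - j.
    return sorted(n - j for j in range(n) if site[j] != perfect[j])


def find_target_sites(sirna, mrna, max_mismatches=3, protected=(10, 11)):
    """Scan an mRNA for antisense sites with at most `max_mismatches`
    mismatches and none at the protected siRNA positions."""
    n = len(sirna)
    hits = []
    for start in range(len(mrna) - n + 1):
        mm = mismatch_positions(sirna, mrna[start:start + n])
        if len(mm) <= max_mismatches and not set(mm) & set(protected):
            hits.append((start + 1, mm))  # 1-based start on the mRNA
    return hits


if __name__ == "__main__":
    sirna = "UGACCGAUUAGCCAUGGAUCA"  # hypothetical 21-nt siRNA
    mrna = "GGGG" + reverse_complement(sirna) + "CCCC"
    for start, mm in find_target_sites(sirna, mrna):
        print(start, mm)
```

Reporting the mismatch positions alongside each hit makes it straightforward to rank candidate sites afterwards, for example by the number of mismatches in the 5′ region of the siRNA.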

RNA blot analysis

Total RNA was isolated from root, leaf and inflorescence tissues of adult plants and from 4-week-old young rice seedlings with use of Trizol. Low molecular weight RNAs were isolated by polyethylene glycol precipitation of total RNA (5). Fifty micrograms of low molecular weight RNA was loaded per lane, resolved on a denaturing 15% polyacrylamide gel and transferred electrophoretically to Hybond-N+ membranes. Hybridization and washings were performed as described previously (15). The membranes were briefly air dried and then exposed to a PhosphorImager.

PCR-based detection of candidate siRNAs

PCR-based detection of small RNAs was performed as described previously by Grad et al. (36). To PCR amplify the candidate siRNAs from the small RNA libraries, an oligonucleotide complementary to the 5′ adapter was used with a 3′ oligonucleotide complementary to the particular candidate siRNA.

5′ Rapid amplification of cDNA ends

Total RNA from 4-week-old rice seedlings was extracted with use of Trizol reagent. Poly(A)+ mRNA was purified from total RNA with use of the Poly(A) kit (Promega). RLM-5′ RACE (RNA ligase-mediated-5′ rapid amplification of cDNA ends) was carried out with use of the GeneRacer Kit (Invitrogen Life Technologies). The GeneRacer RNA Oligo adapter was directly ligated to mRNA (100 ng) without calf intestinal phosphatase and tobacco acid pyrophosphatase treatment. Gene-specific primers (9629.m00201, 9636.m02299 and 9632.m00807) were designed and used for cDNA synthesis. Initial PCR was carried out with the GeneRacer 5′ primer and gene-specific primers. Nested PCR was carried out with 1 μl of the initial PCR, GeneRacer 5′ nested primer and gene-specific internal primers. After amplification, 5′ RACE products were gel-purified and cloned, and at least 10 independent clones were sequenced.


Cloning, size distribution and 5′ nucleotide preference of rice endogenous siRNAs

We constructed three libraries of small RNAs, one each from the shoots and roots of seedlings and from the inflorescence tissues of adult rice plants (O.sativa ssp. japonica cv. Nipponbare). Small RNAs from 16 to 26 nt in length were isolated by size fractionation and ligated to 5′ and 3′ adapters, and then cloned and sequenced. A total of ∼10 000 clones were sequenced (∼1/3 from each library) and nearly half of them were either shorter than 16 nt or without cDNA inserts (i.e. self-ligation of the adapters). The remaining 4910 small cDNA sequences were between 16 and 26 nt in length. All vector-trimmed sequences longer than 16 nt were used to search the Rfam database (www.sanger.ac.uk/Software/Rfam) with BLASTN (31). This step allowed for the removal of most RNA species that might be degradation products of non-coding RNAs in the dataset (Table 1). The largest class of cloned RNAs represents fragments of abundant non-coding RNAs (rRNA, tRNA, snRNA and snoRNA) as determined by BLASTN searches against the Rfam database. Seventy percent of the sequences correspond to ribosomal RNAs. Furthermore, 26S rRNAs and 18S rRNAs together accounted for ∼40% of the total sequences (Table 1). Several clones mapped to chloroplast or mitochondrial genomes and may represent either degradation or possibly regulatory products of organellar RNAs (Table 1). Several sequences (478) did not match any nuclear or organellar sequence, suggesting that these may correspond to unfinished regions in the rice genome (Table 1). Putative origins for the remaining sequences were identified by BLASTN searches against the genomic sequences of O.sativa ssp. japonica. Candidates with perfect matches against these genomic datasets were considered endogenous small RNAs. These consist of miRNAs (124 sequences) (33) and putative endogenous siRNAs (314 sequences representing 284 unique siRNAs; all sequences are provided in Supplementary Table 1).
The putative endogenous siRNAs correspond to genomic sequences that do not have predictable fold-back structures. A large set of small RNAs with sizes between 20 and 24 nt and in the sense orientation to non-coding RNAs (rRNA, tRNA, snRNA and snoRNA) and exons of protein-coding genes were not included in this analysis. Although many of these are likely degradation products of the abundant non-coding RNAs or mRNAs, some may be siRNAs. For example, some of the endogenous siRNAs in Arabidopsis have been shown to be derived from rDNAs and exons of protein-coding genes in the sense orientation (12,19). However, these were not characterized further in this study. Ascertaining these molecules unambiguously as rice siRNAs will require their expression analysis in dcl and rdrp mutants, which are not available for this species at the present time.

Table 1
Small RNA composition of the cDNA library

The number of endogenous siRNAs cloned from the different libraries varied greatly (Figure 1A). Most of them (225) are derived from the inflorescence library, whereas the shoot and root libraries yielded a substantially lower number of siRNAs (42 and 17 siRNAs from shoot and root libraries, respectively) (Figure 1A). Fifteen of the siRNA sequences (P4-D6, P8-E2, P16-C3, P65-B7, P86-H10, P88-A8, P88-A11, P91-A10, P94-H11, P98-A9, P103-B2, P104-G7, P107-E1, P108-A7 and P108-D3) showed an overlap between the libraries. The majority of the siRNAs are 21–24 nt in size, which is the typical size range for Dicer-derived products (Figure 1B). The 24 nt size class is predominant. The identified siRNAs exhibit a biased nucleotide distribution at their 5′ end. Of the 284 unique siRNAs, 105 begin with a U and 95 with an A, regardless of their size. Thus, about 70% begin with U or A, whereas the sequences that begin with C or G are underrepresented among the cloned sequences (Supplementary Table 1). Furthermore, each of the two size classes starts with a distinct nucleotide at its 5′ end, with U predominating in the 21 nt class and A in the 24 nt class (Figure 1D).
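A tally of this kind can be reproduced with a few lines of Python. The sketch below is illustrative only: the function name and the toy sequences are hypothetical and are not taken from Supplementary Table 1. The last lines simply restate the arithmetic behind the counts given above (105 U plus 95 A out of 284 unique siRNAs).

```python
from collections import Counter, defaultdict


def five_prime_counts(sirnas):
    """Count 5'-terminal nucleotides separately for each length class."""
    by_size = defaultdict(Counter)
    for seq in sirnas:
        by_size[len(seq)][seq[0]] += 1
    return by_size


if __name__ == "__main__":
    # Hypothetical toy sequences, used only to exercise the function.
    toy = [
        "U" + "ACGU" * 5,           # 21 nt, 5' U
        "U" + "GCAU" * 5,           # 21 nt, 5' U
        "A" + "ACGU" * 5 + "ACG",   # 24 nt, 5' A
        "C" + "ACGU" * 5 + "ACG",   # 24 nt, 5' C
    ]
    for size, counts in sorted(five_prime_counts(toy).items()):
        print(size, dict(counts))

    # Share of unique siRNAs starting with U or A, from the counts in the text.
    u_count, a_count, total = 105, 95, 284
    print(round((u_count + a_count) / total, 3))
```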

Figure 1
Rice endogenous siRNAs. (A) Histogram of the number of siRNAs identified from the libraries of shoots and roots of young seedlings, and inflorescence tissues. (B) Size distribution of endogenous siRNAs. (C) The rice endogenous siRNAs correspond to sequences ...

The publicly available Arabidopsis siRNAs (http://asrp.cgrb.oregonstate.edu/) were analyzed similarly to determine whether this nucleotide preference is a common feature in the 21 and 24 nt size classes of plant siRNAs. Most of the endogenous siRNAs of the 21 nt class begin with a 5′ U, whereas the 5′ nt of the 21 nt class transgene siRNAs show an even distribution among A, U, C and G (13) (http://asrp.cgrb.oregonstate.edu/). The bias of the endogenous rice and Arabidopsis siRNAs or non-bias of the transgene siRNAs in their 5′ nt may reflect the specific activities of DCL involved in their biogenesis. Some of the putative siRNAs of the ∼21 nt size class could be miRNAs. These could be generated from atypical hairpin structures that cannot be recognized as miRNA precursors. Analysis of the 24 nt size class in Arabidopsis further confirmed that this class of endogenous siRNAs has a 5′ A preference. This observation could suggest that one of the DCLs in rice and Arabidopsis is responsible for the generation of the 24 nt class of siRNAs preferentially with A on the 5′ end of the molecule. It was hypothesized that DCL3 and RNA-dependent RNA polymerase 2 (RDR2) generate endogenous siRNAs primarily of the large size (∼24 nt) class in Arabidopsis (19). Similarly, the DCL3 ortholog in rice may be involved in the generation of the ∼24 nt class of siRNAs. Further studies are required to correlate the generation of endogenous small RNAs with the specific DCL and RdRP activities in rice.

Genomic organization of endogenous siRNAs in rice

The genomic locations of rice endogenous siRNAs identified in the present study are shown in Supplementary Table 1. The 284 siRNAs map to 942 loci on the 12 rice chromosomes (Figure 1C and Supplementary Table 1). The discrepancy between the number of identified siRNAs (284) and the number of their corresponding genomic loci (942) is due to the fact that some siRNAs have multiple loci. Among the siRNAs, 165 are encoded by single copy loci in the rice genome, whereas the remaining siRNAs have multiple loci (2–21) (Supplementary Table 1). A total of 729 loci can be mapped to intergenic regions, and 213 correspond to introns (110 in the sense polarity and 103 in the antisense polarity relative to the genes) (Supplementary Table 1 and Figure 1C). They are scattered across all chromosomes with several islands of higher density (Supplementary Figure 1). The total number of siRNA loci between the different chromosomes is highly variable. The total number of siRNA loci peaks on chromosomes 11 and 12, whereas chromosome 5 has the lowest density and number of loci (Supplementary Figure 1). However, we were unable to detect a marked density correlation between siRNAs and genes or transposons along the different chromosomes.

Some of the siRNAs display perfect complementarity with protein-coding genes and transposons (Tables 2 and 3). A total of 8 of the siRNAs (P1-C04, P7-E5, P8-G3, P56-H4, P78-E8, P83-D5, P108-B7 and P109-D12) are perfectly complementary to the mRNAs of 21 protein-coding genes (Table 2). Some of these siRNAs are complementary to multiple genes. For example, P56-H4 is complementary to 12 genes, and P1-C04 to 2. Similarly, 5 of the siRNAs (P33-D12, P40-H12, P56-H4, P90-B8 and P107-E1) are perfectly complementary to 25 of the transposons (Table 3). These include several classes of transposons (DNA transposons, retrotransposons and transposases). This observation implies that these siRNAs originate from the same locus as their respective target genes. However, some of these siRNAs appear to have multiple loci, and therefore it cannot be confirmed whether these siRNAs are derived from antisense transcription of the same locus or originate from a different locus and target the genes in trans. In summary, the endogenous siRNAs from rice largely originate from intergenic regions, but a significant proportion also arises from introns, exons and repetitive sequences (transposons and retroelements).

Table 2
siRNAs that are antisense to protein-coding genes
Table 3
siRNAs originating from antisense orientations to transposons

Rice siRNA clusters

siRNA loci often generate multiple, overlapping clusters of small RNAs. This stands in contrast to miRNA loci that generally yield a single miRNA. We have identified six clusters in intergenic regions, each with two or more siRNAs. These clustered siRNAs are spaced within an interval of 500 nt (Figure 2). Some of the clusters contain small RNAs that map to the plus and minus strands of short genomic regions (Figure 2). This observation suggests the possibility that the different siRNAs in a cluster originate from one long dsRNA.
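The grouping of neighbouring loci into clusters can be sketched as a simple single-linkage pass over sorted coordinates. This is a hedged illustration rather than the procedure used in the study: the function name, the tuple layout and the toy coordinates are hypothetical, and interpreting the 500 nt interval as the maximum distance between the start positions of adjacent siRNAs is an assumption.

```python
def cluster_loci(loci, max_gap=500):
    """Group siRNA loci into clusters of two or more members.

    loci    -- iterable of (chromosome, start, name) tuples
    max_gap -- largest allowed distance (nt) between the starts of
               neighbouring loci on the same chromosome
    Returns a list of clusters, each a list of siRNA names.
    """
    clusters = []
    current = []
    for chrom, start, name in sorted(loci):
        # Close the running group on a chromosome change or a large gap.
        if current and (chrom != current[-1][0]
                        or start - current[-1][1] > max_gap):
            if len(current) >= 2:
                clusters.append([n for _, _, n in current])
            current = []
        current.append((chrom, start, name))
    if len(current) >= 2:
        clusters.append([n for _, _, n in current])
    return clusters


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    toy = [(1, 100, "a"), (1, 400, "b"), (1, 2000, "c"),
           (2, 150, "d"), (2, 600, "e"), (2, 1101, "f")]
    print(cluster_loci(toy))
```

Strand is deliberately ignored here, so siRNAs mapping to the plus and minus strands of the same short region end up in one cluster, which matches the idea that they may derive from one long dsRNA.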

Figure 2
Endogenous siRNAs in clusters. The distance between two siRNAs is indicated. 5′ and 3′ flanking genes are shown. siRNAs are indicated by green or red arrows. For siRNAs that overlap the ORF of a gene, the siRNAs are shown on top and bottom ...

Expression pattern of rice endogenous siRNAs

The tissue- and developmental stage-specific expression of small RNAs can provide clues about their physiological function. In C.elegans, the expression of several tncRNAs was found to depend on developmental stages (28). To examine whether rice endogenous siRNAs are expressed in a developmental stage-dependent and/or tissue-specific manner, we performed northern analysis using low molecular weight RNA blots from various tissues of mature plants (root, leaf and inflorescence) as well as 4-week-old young seedlings. For expression analysis, we tried to select siRNAs that appeared relatively frequently in our sequences or are represented by more loci in the genome, as these may have relatively high expression and thus be more amenable to detection.

The northern analysis revealed that most of the 30 siRNAs tested are expressed at very low abundance and could not be detected on the small RNA blots. However, we could detect a signal at the appropriate sizes for six of the siRNAs (P16-C3, P4-D6, P88-A11, P98-A9, P108-A7 and P107-E1) (Figure 3). Two of the siRNAs are expressed relatively abundantly (P16-C3 and P107-E1). Some show tissue- or developmental stage-specific expression patterns (P88-A11, P98-A9, P108-A7 and P107-E1). P16-C3 and P4-D6 appear to be uniformly expressed in all rice tissues examined (Figure 3A and B). In addition to the expected signal at 21 nt, the P16-C3 probe also detected a weaker signal at a lower size of ∼18–19 nt in all tissues examined. The lower sized signal may represent smaller sized siRNA rather than a degradation product of P16-C3. Furthermore, this small band was stronger in the leaf tissue than in other tissues. P16-C3 has multiple loci, and thus the smaller molecule might have originated from a different locus than the larger one. P98-A9 displayed moderate expression in leaves and young seedlings (Figure 3C). The expression of P88-A11 was moderate in inflorescence and seedlings but very weak in leaf and root tissues (Figure 3D). P108-A7 was expressed in all tissues tested, but the levels were higher in inflorescence and young seedlings (Figure 3E). The expression of P107-E1 was strong in leaves and young seedlings but almost below the detection limit in the root tissue (Figure 3F).

Figure 3
Expression analysis of some of the rice endogenous siRNAs by northern hybridization. Northern blots of low molecular weight RNA isolated from different tissues were probed with labeled oligonucleotides. The tRNA and 5S rRNA bands were visualized by ethidium ...

Furthermore, using an RT–PCR approach we were able to detect the expression of 16 additional siRNAs out of 65 tested (Figure 4). Some of these siRNAs, P8-E2, P86-H10, P88-A8, P91-A10, appear to be ubiquitously expressed, while others such as P108-D3, P103-B2, P65-B7, P104-G7 and P94-H11 are preferentially expressed in the inflorescence (Figure 4). The rest were expressed in at least one of the tissues tested (Figure 4). These results show that some endogenous siRNAs in rice are differentially expressed in different tissues and/or developmental stages.

Figure 4
PCR-based detection of endogenous siRNAs. PCR amplification was performed using a 5′ primer for the 5′ adapter sequence and a 3′ primer specific for the candidate siRNAs.

Predicted targets

To identify potential trans mRNA targets for the endogenous siRNAs, we searched rice mRNAs for antisense hits with three or fewer mismatches, with no mismatch allowed at positions 10 and 11 from the 5′ end of the siRNA. Arabidopsis miRNA and tasiRNA targets are known to exhibit up to three or even four mismatches to their target sites. The number of mismatches is typically lower in the 5′ region of the small RNA (16,17,37,38). By allowing a maximum of three mismatches, we were able to predict 111 trans-targets for 44 of the siRNAs (Table 4). Targets could not be predicted for the remaining siRNAs by applying these criteria. miRNA target sites in plants are usually found in the open reading frames (ORFs) of the target genes (15,39), although some target sites are in the UTR of target genes (15). Similarly, the endogenous tasiRNAs in Arabidopsis appear to have their target sites in the ORFs of the target genes (16,17). Most (87) of our predicted trans-target genes in rice have target sites in the ORFs. Eighteen have predicted target sites in 3′-UTRs and six in 5′-UTRs.

Table 4
Predicted trans-targets of endogenous siRNAs from rice

Some of the siRNAs are perfectly complementary to the protein-coding genes and transposable elements (Tables 2 and 3). These siRNAs are possibly involved in the cis-silencing of those protein-coding genes and transposons. In addition, some transposable elements were predicted as trans-targets for some of the siRNAs. A total of 51 of the 111 predicted trans-targets are transposons, and these are targeted by 16 siRNAs (Table 4). In all cases, the target sites are located in ORFs of the transposons. These include mariner and En/Spm type transposons, retrotransposons (Ty3-gypsy, Ty1-copia), MuDR transposases and some other unclassified transposable elements.

The remaining predicted targets appear to have roles in a broad range of physiological processes (Table 4). These include genes encoding protein kinases, F-box proteins, Zn-finger proteins, disease-related proteins, lipase and another 45 proteins with unknown functions (Table 4). The predicted target of siRNA P109-D12 is 9629.m00201, a PIN1-like gene, which has two complementary sites within the ORF (Figure 5A). The two target sites correspond to positions 362–383 and 1397–1418, with zero mismatches (Figure 5A). P109-D12 has two genomic loci corresponding to the two target sites in 9629.m00201, which indicates that the P109-D12 siRNA possibly originates by antisense transcription.

Figure 5
Identification of siRNA-guided cleavage products of target mRNAs in rice. (A) mRNA 9629.m00201, (B) mRNA 9636.m02299 and (C) mRNA 9632.m00807. Mapping of cleavage sites was performed by RLM-5′ RACE. Partial mRNA sequences from target genes were ...

P1-C04 and P79-H4 are of the same size (24 nt) and differ by 1 nt, but have different origins. P1-C04 is perfectly complementary to the 3′-UTR of flavanone 3-hydroxylase. P79-H4 is predicted to target the same site but with a mismatch at position 11 from the 5′ end of the siRNA.

Endogenous rice siRNAs can direct the cleavage of target mRNAs

The cleavage of target mRNAs appears to be the predominant mode of gene regulation of miRNAs in plants. In addition to miRNAs, tasiRNAs in Arabidopsis have also been shown to direct the cleavage of the predicted target transcripts (16,17). Target mRNA fragments resulting from siRNA-guided cleavage are characterized by having a 5′ phosphate group, and cleavage occurs near the middle of the base pair interaction region. To examine whether some of the predicted siRNA targets are true targets and whether these siRNAs can direct the cleavage of predicted targets, we used an RNA ligase-mediated 5′ RACE procedure to map the cleavage site of siRNA targets. Using this procedure, we were able to detect and clone the cleaved fragments of 9629.m00201 (PIN1-like gene) targeted by P109-D12 (Figure 5A). Interestingly, this gene has two target sites in its ORF, and the cleaved fragments corresponded to both of the target sites. This observation provides an example of cis-silencing by endogenous siRNAs. We have also found evidence for siRNA-mediated trans-targeting in rice. For example, P98-A9 targets the unknown protein 9636.m02299 and P75-D3 targets the retrotransposon 9632.m00807. Both are examples of trans-targeting by siRNAs (Figure 5B and C). In both cases, the most common 5′ end of the mRNA fragment maps to the nucleotide that pairs with the 10th nucleotide of the siRNA from its 5′ end. In this study, a substantial number of endogenous siRNAs were predicted to target protein-coding transcripts, and even if a small portion of these transcripts are genuine targets, the potential of siRNA regulation could be vast.
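The relation between a target site and the expected 5′ end of the cleaved fragment can be written down explicitly. In the Python sketch below, all names and coordinates are hypothetical, and it is assumed that the site coordinates span the full siRNA pairing region with no bulges, so that siRNA position i (counted from the siRNA 5′ end) pairs with mRNA coordinate site_end − i + 1.

```python
def paired_mrna_position(site_end, sirna_position):
    """mRNA coordinate (1-based) that pairs with a given siRNA position,
    counted from the siRNA 5' end; assumes gap-free antisense pairing."""
    return site_end - sirna_position + 1


def expected_fragment_start(site_end):
    """Expected 5' end of the downstream cleavage fragment: the mRNA
    nucleotide paired with the 10th nucleotide of the siRNA."""
    return paired_mrna_position(site_end, 10)


def fraction_at_expected(clone_starts, site_end):
    """Fraction of sequenced 5' RACE clones whose 5' end falls exactly
    on the expected position."""
    expected = expected_fragment_start(site_end)
    return sum(1 for s in clone_starts if s == expected) / len(clone_starts)


if __name__ == "__main__":
    # Hypothetical 21-nt site at mRNA positions 101-121 and ten toy clones.
    site_end = 121
    clones = [112, 112, 112, 113, 112, 110, 112, 112, 112, 112]
    print(expected_fragment_start(site_end))
    print(fraction_at_expected(clones, site_end))
```

Comparing the observed clone 5′ ends with this single expected coordinate is what distinguishes siRNA-guided cleavage from random degradation, whose products would not pile up at one position.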


Endogenous siRNAs in plants can be divided into two classes on the basis of size and function: the 21 nt siRNAs that direct post-transcriptional silencing via mRNA degradation and the 24 nt siRNAs that trigger methylation of homologous DNA leading to TGS (10). Endogenous siRNAs in rice may be grouped into similar size and functional categories. A total of 284 unique putative siRNA sequences corresponding to 942 genomic loci have been identified in this study. The expression of some of the siRNAs is tissue specific or developmental stage specific. We predicted a large number of mRNA targets for the siRNAs. Importantly, we validated three of the predicted targets and provided evidence for both cis-silencing and trans-silencing of target mRNAs by rice siRNA-guided cleavage. Most of the siRNAs (225 of 284) were identified from an inflorescence library. The reason for the high number of endogenous siRNAs from inflorescence tissues is unknown, but may implicate an involvement of siRNAs in active meristematic/cell division or developmental processes. A large number of the siRNA loci (729) map to intergenic regions. It is unknown how the siRNAs mapped to intergenic regions are regulated, but they likely have their own regulatory sequences. Our results suggest that endogenous siRNAs in rice are very diverse.

The endogenous siRNAs differ in their 5′ end nucleotide preference from NOS promoter-driven transgenic siRNAs. Arabidopsis NOS promoter-expressing plants generate siRNAs of both 21 and 24 nt size classes (13). Two major differences were observed with respect to preferences for 5′ nt in rice endogenous siRNAs. Most of the endogenous siRNAs of the 21 nt size class begin with 5′ U, whereas 5′ nt of the 21 nt size class from transgenic siRNAs show an even distribution in nucleotides (13). The preference in the rice endogenous siRNAs could reflect a specific DCL activity in rice. Some siRNAs of this size class could still be miRNAs but cannot be defined as such, because they may be generated from atypical hairpin structures. With respect to the 24 nt size class, 5′ A was the predominant nucleotide for the endogenous siRNAs in rice, whereas C was the major 5′ nt in transgene-derived siRNAs. Although the 5′ A bias for the 24 nt size class was known from previous findings in Arabidopsis (9), our study confirms this striking difference in vivo in rice. It is interesting that this distinct feature is not apparent in the 24 nt size class of endogenous siRNAs from the lower plant moss (Polytrichum juniperinum) (40). This observation raises the possibility that the DCL responsible for processing the 24 nt size class with 5′ adenine may be absent in lower plants. Further studies with individual DCL knock-out mutants may verify the role of different DCLs in conferring the 5′ nt preference of endogenous siRNAs in rice.

The expression of several tncRNAs from C.elegans was shown to be constitutive and ubiquitous, whereas many other tncRNAs exhibit preferential expression in a temporal or tissue-specific manner (28). Our northern and RT–PCR results suggest that some of the rice endogenous siRNAs are expressed in a tissue-specific and/or developmental stage-dependent manner. The differential expression pattern supports a regulatory role of these siRNAs in specific tissues or in development.

Many siRNAs are expected to act on the same locus from which they originate. Presumably, these siRNAs are derived from the same loci as their targets, through antisense transcription or RdRP activity. Some of the rice endogenous siRNAs originate from the antisense strands of protein- or transposon-coding genes and might mediate the post-transcriptional silencing of these genes.
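The idea that an siRNA derived from the antisense strand of a gene is complementary to that gene's mRNA can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the target-prediction procedure used in this study: it considers only Watson–Crick pairing (no G:U wobble), and the names `reverse_complement`, `complementary_sites`, `toy_mrna` and `toy_sirna` are hypothetical.

```python
# Watson-Crick complements for RNA.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA (or DNA) sequence as RNA."""
    return rna.upper().replace("T", "U").translate(_COMPLEMENT)[::-1]


def complementary_sites(sirna, mrna, max_mismatches=0):
    """Return 0-based mRNA positions where the siRNA could pair in antisense.

    A site is reported when the mRNA window differs from the reverse
    complement of the siRNA at no more than max_mismatches positions.
    """
    site = reverse_complement(sirna)
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for start in range(len(mrna) - len(site) + 1):
        window = mrna[start:start + len(site)]
        mismatches = sum(a != b for a, b in zip(site, window))
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Made-up sequences for illustration only (not data from this study).
toy_mrna = "GGGAA" + "AUGCUUAGCCGAUACGGAUCA" + "CCCUU"
# An siRNA from the antisense strand of the 21 nt region starting at position 5.
toy_sirna = reverse_complement(toy_mrna[5:26])

if __name__ == "__main__":
    print(complementary_sites(toy_sirna, toy_mrna))
```

Because the siRNA is taken from the antisense strand of the same locus, it pairs perfectly with the sense transcript; setting `max_mismatches` above zero would extend the same scan to partially complementary sites.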

We also predicted 111 genes as potential trans-targets of 44 endogenous siRNAs in rice. Transposons represent ∼46% of the predicted trans-targets of siRNAs in the present study. Our analysis suggests that some of the retroelements that are predicted targets of siRNAs are intact. For example, of the 31 gypsy-type retrotransposons that were predicted targets of siRNAs, 13 appear to have all constituent proteins without nonsense mutations, whereas the remaining 18 are truncated. Similarly, one of the four CACTA transposons that were predicted targets of siRNAs appears to be intact, whereas the remaining three are truncated. Small RNAs corresponding to transposon sequences have also been detected in Arabidopsis (41,42). Transposable elements are DNA sequences that can move and multiply within the genome of an organism. Higher eukaryotic genomes contain a large number of transposable elements; e.g. 45% of the human genome consists of remnants of transposon sequences, and 12% of the C.elegans genome is transposon sequences. Transposable elements make up 14% of the Arabidopsis genome, in contrast to an estimated 50–80% of the maize genome (43,44). The rice genome is populated by representatives from all known transposon superfamilies, including elements that cannot be easily classified into either Class I or II (45). The present estimate of the transposon content of the O.sativa ssp. japonica genome is at least 35%. Transposon activation can have a range of effects, including alterations in gene expression, gene deletion and insertion, and chromosome rearrangements, most of which are deleterious. Recent work in plants and other organisms has revealed that both TGS and PTGS are important for the silencing of transposons. In organisms other than plants, PTGS is likely to be the primary pathway of transposon silencing (46).
Because siRNAs guide the sequence-specific RNA degradation that occurs during RNAi, transposon siRNAs might mediate the degradation of transposon mRNAs to establish transposon silencing. rasiRNAs, which are derived from every known type of transposon in the Drosophila genome, could potentially regulate transposon mobility and assembly of heterochromatin at transposon-rich regions, such as telomeres and centromeres (27). In addition to the endogenous siRNAs that correspond to transposons in antisense orientations, our analysis predicted transposons as likely trans-targets for several of the siRNAs. This result implies a role for endogenous siRNAs in controlling the proliferation of transposons in both auto- and trans-silencing manners in rice.

The remaining predicted targets appear to have roles in diverse physiological processes and include regulatory genes as well as metabolism- and cell structure-related genes (Table 4). In addition, 45 unknown proteins are predicted targets for some of the siRNAs. The validated targets in this study include 9629.m00201, which serves as an example of cis-silencing by siRNA P109-D12. Two other validated targets are examples of trans-targeting by siRNAs. Thus, our analysis suggests that endogenous siRNAs in rice can guide mRNA cleavage both in cis and in trans. Many of the siRNAs that do not have predicted mRNA targets may target DNA for TGS in rice.

The profile of naturally occurring siRNAs in rice provides an important foundation to explore the potential roles of these molecules in genome maintenance, genome expression and defense. The large number of predicted targets and our experimental evidence showing cis and trans silencing of genes by siRNA-guided mRNA cleavage suggest that endogenous siRNAs are important for the regulation of many genes in rice.


Supplementary Material is available at NAR Online.


The authors thank Rebecca Stevenson for assistance in growing the rice plants. This work was supported by National Institutes of Health grant R01GM0707501 and National Science Foundation grant IBN-0212346 to J.-K.Z. Funding to pay the Open Access publication charges for this article was provided by National Institutes of Health grant R01GM0707501.

Conflict of interest statement. None declared.


1. Carrington J.C., Ambros V. Role of microRNAs in plant and animal development. Science. 2003;301:336–338.
2. Ambros V. The functions of animal microRNAs. Nature. 2004;431:350–355.
3. Bartel D.P. MicroRNAs: genomics, biogenesis, mechanism and function. Cell. 2004;116:281–297.
4. Baulcombe D. RNA silencing in plants. Nature. 2004;431:356–363.
5. Hamilton A.J., Baulcombe D.C. A species of small antisense RNA in posttranscriptional gene silencing in plants. Science. 1999;286:950–952.
6. Hammond S.M., Bernstein E., Beach D., Hannon G.J. An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature. 2000;404:293–296.
7. Zamore P.D., Tuschl T., Sharp P.A., Bartel D.P. RNAi: double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals. Cell. 2000;101:25–33.
8. Billy E., Brondani V., Zhang H., Muller U., Filipowicz W. Specific interference with gene expression induced by long, double-stranded RNA in mouse embryonal teratocarcinoma cell lines. Proc. Natl Acad. Sci. USA. 2001;98:14428–14433.
9. Tang G., Reinhart B.J., Bartel D.P., Zamore P.D. A biochemical framework for RNA silencing in plants. Genes Dev. 2003;17:49–63.
10. Hamilton A., Voinnet O., Chappell L., Baulcombe D. Two classes of short interfering RNA in RNA silencing. EMBO J. 2002;21:4671–4679.
11. Mallory A., Reinhart B., Bartel D., Vance V., Bowman L. A viral suppressor of RNA silencing differentially regulates the accumulation of short interfering RNAs and micro-RNAs in tobacco. Proc. Natl Acad. Sci. USA. 2002;99:15228–15233.
12. Llave C., Kasschau K.D., Rector M., Carrington J.C. Endogenous and silencing-associated small RNAs in plants. Plant Cell. 2002;14:1605–1619.
13. Papp I., Mette M.F., Aufsatz W., Daxinger L., Schauer S.E., Ray A., van der Winden J., Matzke M., Matzke A.J.M. Evidence for nuclear processing of plant micro RNA and short interfering RNA precursors. Plant Physiol. 2003;132:1382–1390.
14. Nicolás F.E., Torres-Martínez S., Ruiz-Vázquez R.M. Two classes of small antisense RNAs in fungal RNA silencing triggered by non-integrative transgenes. EMBO J. 2003;22:3983–3991.
15. Sunkar R., Zhu J.K. Novel and stress regulated microRNAs and other small RNAs from Arabidopsis. Plant Cell. 2004;16:2001–2019.
16. Vazquez F., Vaucheret H., Rajagopalan R., Lepers C., Gasciolli V., Mallory A.C., Hilbert J.L., Bartel D.P., Crete P. Endogenous trans-acting siRNAs regulate the accumulation of Arabidopsis mRNAs. Mol. Cell. 2004;16:69–79.
17. Peragine A., Yoshikawa M., Wu G., Albrecht H.L., Poethig R.S. SGS3 and SGS2/SDE1/RDR6 are required for juvenile development and the production of trans-acting siRNAs in Arabidopsis. Genes Dev. 2004;18:2368–2379.
18. Zilberman D., Cao X., Jacobsen S.E. ARGONAUTE4 control of locus-specific siRNA accumulation and DNA and histone methylation. Science. 2003;299:716–719.
19. Xie Z., Kasschau K.D., Carrington J.C. Negative feedback regulation of Dicer-Like1 in Arabidopsis by microRNA-guided mRNA degradation. Curr. Biol. 2003;13:784–789.
20. Chan S.W., Zilberman D., Xie Z., Johansen L.K., Carrington J.C., Jacobsen S.E. RNA silencing genes control de novo DNA methylation. Science. 2004;303:1336.
21. Park W., Li J., Song R., Messing J., Chen X. CARPEL FACTORY, a Dicer homolog, and HEN1, a novel protein, act in microRNA metabolism in Arabidopsis thaliana. Curr. Biol. 2002;12:1484–1495.
22. Reinhart B.J., Weinstein E.G., Rhoades M.W., Bartel B., Bartel D.P. MicroRNAs in plants. Genes Dev. 2002;16:1616–1626.
23. Kurihara Y., Watanabe Y. Arabidopsis micro-RNA biogenesis through Dicer-like 1 protein functions. Proc. Natl Acad. Sci. USA. 2004;101:12753–12758.
24. Boutet S., Vazquez F., Liu J., Béclin C., Fagard M., Gratias A., Morel J.B., Crété P., Chen X., Vaucheret H. Arabidopsis HEN1: a genetic link between endogenous miRNA controlling development and siRNA controlling transgene silencing and virus resistance. Curr. Biol. 2003;13:843–848.
25. Onodera Y., Haag J.R., Ream T., Nunes P.C., Pontes O., Pikaard C.S. Plant nuclear RNA polymerase IV mediates siRNA and DNA methylation-dependent heterochromatin formation. Cell. 2005;120:613–622.
26. Herr A.J., Jensen M.B., Dalmay T., Baulcombe D.C. RNA polymerase IV directs silencing of endogenous DNA. Science. 2005;308:118–120.
27. Aravin A., Lagos-Quintana M., Yalcin A., Zavolan M., Marks D., Snyder B., Gaasterland T., Meyer J., Tuschl T. The small RNA profile during Drosophila melanogaster development. Dev. Cell. 2003;5:337–350.
28. Ambros V., Lee R.C., Lavanway A., Williams P.T., Jewell D. MicroRNAs and other tiny endogenous RNAs in C.elegans. Curr. Biol. 2003;13:807–818.
29. Reinhart B.J., Bartel D.P. Small RNAs correspond to centromere heterochromatic repeats. Science. 2002;297:1831.
30. Ewing B., Green P. Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 1998;8:186–194.
31. Griffiths-Jones S., Bateman A., Marshall M., Khanna A., Eddy S.R. Rfam: an RNA family database. Nucleic Acids Res. 2003;31:439–441.
32. Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
33. Sunkar R., Girke T., Jain P.K., Zhu J.K. Cloning and characterization of microRNAs from rice. Plant Cell. 2005;17:1397–1411.
34. Dsouza M., Larsen N., Overbeek R. Searching for patterns in genomic data. Trends Genet. 1997;13:497–498.
35. Rice P., Longden I., Bleasby A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000;16:276–277.
36. Grad Y., Aach J., Hayes G.D., Reinhart B.J., Church G.M., Ruvkun G., Kim J. Computational and experimental identification of C.elegans microRNAs. Mol. Cell. 2003;11:1253–1263.
37. Palatnik J.F., Allen E., Wu X., Schommer C., Schwab R., Carrington J.C., Weigel D. Control of leaf morphogenesis by microRNAs. Nature. 2003;425:257–263.
38. Jones-Rhoades M.W., Bartel D.P. Computational identification of plant microRNAs and their targets, including a stress-induced miRNA. Mol. Cell. 2004;14:787–799.
39. Rhoades M.W., Reinhart B.J., Lim L.P., Burge C.B., Bartel B., Bartel D.P. Prediction of plant microRNA targets. Cell. 2002;110:513–520.
40. Axtell M.J., Bartel D.P. Antiquity of microRNAs and their targets in land plants. Plant Cell. 2005;17:1658–1673.
41. Liu J., He Y., Amasino R., Chen X. siRNAs targeting an intronic transposon in the regulation of natural flowering behavior in Arabidopsis. Genes Dev. 2004;18:2873–2878.
42. Lippman Z., May B., Yordan C., Singer T., Martienssen R. Distinct mechanisms determine transposon inheritance and methylation via small interfering RNA and histone modification. PLoS Biol. 2003;1:E67.
43. Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815.
44. SanMiguel P., Bennetzen J.L. Evidence that a recent increase in maize genome size was caused by the massive amplification of intergene retrotransposons. Ann. Bot. 1998;82:37–44.
45. Temnykh S., De Clerck G., Lukashova A., Lipovich L., Cartinhour S., McCouch S. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 2001;11:1441–1452.
46. Sijen T., Plasterk R.H. Transposon silencing in the Caenorhabditis elegans germ line by natural RNAi. Nature. 2003;426:310–314.
47. Schramke V., Allshire R. Hairpin RNAs and retrotransposon LTRs effect RNAi and chromatin-based gene silencing. Science. 2003;301:1069–1074.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press