Genome Res. Oct 2007; 17(10): 1471–1477.
PMCID: PMC1987347

Mapping the C. elegans noncoding transcriptome with a whole-genome tiling microarray

Abstract

The number of annotated protein coding genes in the genome of Caenorhabditis elegans is similar to that of other animals, but the extent of its non-protein-coding transcriptome remains unknown. Expression profiling on whole-genome tiling microarrays applied to a mixed-stage C. elegans population verified the expression of 71% of all annotated exons. Only a small fraction (11%) of the polyadenylated transcription is non-annotated and appears to consist of ~3200 missed or alternative exons and 7800 small transcripts of unknown function (TUFs). Almost half (44%) of the detected transcriptional output is non-polyadenylated and probably not protein coding, and of this, 70% overlaps the boundaries of protein-coding genes in a complex manner. Specific analysis of small non-polyadenylated transcripts verified 97% of all annotated small ncRNAs and suggested that the transcriptome contains ~1200 small (<500 nt) unannotated noncoding loci. After combining overlapping transcripts, we estimate that at least 70% of the total C. elegans genome is transcribed.

In organisms previously analyzed by tiling microarrays, a substantial part of the non-annotated genome has consistently displayed transcriptional activity (Kapranov et al. 2002; Rinn et al. 2003; Yamada et al. 2003; Bertone et al. 2004; Cheng et al. 2005; Stolc et al. 2005; David et al. 2006; Li et al. 2006; Manak et al. 2006), suggesting the existence of either a larger-than-predicted number of protein-coding genes or a large number of non-protein-coding RNA (ncRNA) genes. The current annotation of the 100-Mb Caenorhabditis elegans genome estimates ~22,000 protein-coding genes and ~1,000 small ncRNA genes (Chen et al. 2005; Stricklin et al. 2005), and computational predictions have suggested the presence of an additional ~3,000 small ncRNA genes in the genome (Deng et al. 2006; Missal et al. 2006). The fact that a 1000-cell nematode appears to contain nearly as many protein-coding genes as the far more complex genomes of insects and vertebrates invites the question of whether it is equally rich in ncRNA genes.

Transcriptional analyses employing microarrays that constitute complete nonrepetitive tile paths over a genome or part of a genome, irrespective of the location of annotated genes (genomic tiling microarrays; Bertone et al. 2006), have recently been applied to a number of organisms. With the exception of a recent study of 10 human chromosomes (Cheng et al. 2005), most expression profiling studies on genomic tiling arrays have focused on the polyadenylated fraction of the transcriptome in the respective organisms (Kapranov et al. 2002; Rinn et al. 2003; Yamada et al. 2003; Bertone et al. 2004; Stolc et al. 2005; David et al. 2006; Li et al. 2006; Manak et al. 2006). In an attempt to map out a major fraction of the small noncoding transcriptome in the worm, we adapted a highly efficient protocol for small (<500 nt) ncRNA cloning and microarray sample preparation (Deng et al. 2006; He et al. 2006), and applied this to a newly released Affymetrix C. elegans whole-genome tiling array. Profiling a small non-polyadenylated (SNPA) RNA sample on this tiling array provided high sensitivity and specificity for detecting small ncRNAs; at a threshold where 97% of the known ncRNAs were detected, >80% of the array-detected, previously unknown transcripts were verifiable by reverse transcription–polymerase chain reaction (RT-PCR) or rapid amplification of cDNA ends (RACE) (Supplemental Document 1). Combining these results with those obtained from polyadenylated (PA) and non-polyadenylated (NPA) total RNA profiled on tiling arrays demonstrated several advantages of this approach with respect to the breadth and depth of the information that could be extracted.

Results

The Affymetrix C. elegans Tiling 1.0R array contains ~3.2 million 25-mer oligonucleotide probe pairs covering the Watson strand of the entire nonrepetitive genome at an average resolution (distance between the central position of adjacent probes) of 25 bp. RNA was extracted from a mixed-stage population of wild-type C. elegans strain N2 and reverse-transcribed into double-stranded cDNA samples representing either PA RNA, NPA total RNA depleted in both polyadenylated RNAs and rRNAs, or SNPA RNAs.

When hybridized to the tiling array (see Methods for details), the PA, NPA, and SNPA samples gave rise to 23.5% (736,710), 18.2% (571,347), and 2.0% (63,292) of the probes with positive signals, respectively, amounting to a total of 917,753 positive probes representing 22.7% of the C. elegans genome. As single positive probes are likely to be the result of spurious nonspecific hybridization, we defined a putatively transcribed fragment (transfrag; The ENCODE Project Consortium 2004) as at least two positive probes separated by a gap of no more than 30 bp. The three samples individually produced 108,669, 97,548, and 5738 transfrags (Fig. 1), which after removal of redundancies suggested the presence of at least 146,249 stably expressed regions with an average and median length of 156 and 103 nt, respectively. Among the nonredundant transfrags, 95,928 (65.6%) are annotated protein-coding exons, 875 overlap with known small noncoding transcripts, and 6281 correspond to tandem repeats, pseudogenes, or transposons (Fig. 1). The remaining 43,165 then represent the lowest estimate for transcripts of unknown function (TUFs; The ENCODE Project Consortium 2004) detected by the tiling microarray.
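
The text does not spell out how redundancies between the three samples were removed. One plausible reading, offered here only as an assumption, is that transfrags overlapping on the genome were pooled and merged into single regions. The minimal sketch below illustrates that idea for one chromosome; the function name `merge_transfrags` and the toy coordinates are hypothetical and do not come from the study.

```python
from typing import Iterable, List, Tuple

# (start, end) on a single chromosome, end exclusive; an assumed convention.
Interval = Tuple[int, int]

def merge_transfrags(*samples: Iterable[Interval]) -> List[Interval]:
    """Pool intervals from several samples and merge those that overlap."""
    pooled = sorted(iv for sample in samples for iv in sample)
    merged: List[Interval] = []
    for start, end in pooled:
        if merged and start < merged[-1][1]:
            # Overlaps the previous region: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Toy transfrags for the three samples (made-up coordinates).
pa = [(100, 250), (400, 475)]
npa = [(200, 300), (900, 1000)]
snpa = [(450, 500)]

regions = merge_transfrags(pa, npa, snpa)
print(regions)
```

In this toy input, five transfrags collapse into three nonredundant regions, in the same way that the 211,955 sample-level transfrags reported above reduce to 146,249 regions.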

Figure 1.
Transfrag distribution in the three different samples. “Other annotated” mainly includes tandem repeats and pseudogenes; “exons” include curated exons; “ncRNAs” include all tRNAs (Lowe and Eddy 1997), rRNAs, ...

The array detects 70% to 97% of annotated genes

To estimate the sensitivity of the tiling microarray, an annotated genomic element was regarded as detected if 30% or more of the interrogating probes were positive (Kampa et al. 2004). The highest detection rate for annotated genomic loci was observed for the SNPA sample, in which >97% of the known small ncRNA loci were detected. The detection rate depended somewhat on the ncRNA class, with tRNAs and snRNAs showing nearly 100% detection, whereas snoRNAs and uncharacterized RNAs were detected at somewhat lower levels (Table 1). In comparison, the detection rate for small ncRNAs in the NPA sample was far lower, with an average of 59% (65% for tRNAs, and 47% for other ncRNAs; Fig. 2A). In contrast, the poly(A)-tailless histone mRNAs in the NPA sample gave a detection rate of 97%, implying that random hexamer-primed reverse transcription may have biased the sample toward longer transcripts. MicroRNA precursors (pri- and pre-miRNAs) vary in length and polyadenylation status. Altogether, we detected signals corresponding to 64 out of 115 annotated miRNA precursor loci (55.6%) in the SNPA (46), PA (19), and NPA (29) data sets; the lower detection rate for miRNA precursors is probably caused by the lower stability of these transcripts (Bracht et al. 2004).
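
The detection rule stated above (an element counts as detected when at least 30% of its interrogating probes are positive) can be sketched as follows. This is an illustration only: the function names, the element names, and the probe calls are hypothetical, and the real analysis was run with the software described in Methods.

```python
from typing import Dict, Sequence

def is_detected(probe_calls: Sequence[bool], min_fraction: float = 0.30) -> bool:
    """Return True if at least min_fraction of the probes are positive."""
    if not probe_calls:
        return False  # no interrogating probes, so nothing can be detected
    return sum(probe_calls) / len(probe_calls) >= min_fraction

def detection_rate(elements: Dict[str, Sequence[bool]]) -> float:
    """Fraction of annotated elements that pass the detection rule."""
    if not elements:
        return 0.0
    detected = sum(is_detected(calls) for calls in elements.values())
    return detected / len(elements)

# Made-up probe calls for three hypothetical annotated loci.
toy_elements = {
    "tRNA-a": [True, True, False],             # 2/3 positive
    "snoRNA-b": [False, False, False, True],   # 1/4 positive
    "miRNA-c": [False, False],                 # 0/2 positive
}

rate = detection_rate(toy_elements)
print(f"{rate:.2f}")
```

Only the first toy locus passes the 30% rule, so the detection rate for this made-up set is one in three.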

Table 1.
Detection rates of annotated ncRNAs in the SNPA sample
Figure 2.
Detection rates of annotated exons and genes in the NPA and PA samples. (A) Detection rates for histone exons, tRNAs, and other small ncRNAs in the NPA sample. (B) Detection rates for exons in genes with different confirmation status. Confirmed, partially ...

The detection rate of an exon in the PA sample was on average 71% but depended on the confirmation status of its corresponding gene and varied from 94% for exons in fully confirmed genes to only 28% in predicted genes (Fig. 2B; Supplemental Fig. 1). To assess how well developmental- and environment-specific gene expression is represented in the mixed-population RNA hybridized to the tiling microarrays, genes previously reported to be expressed under a number of given conditions (Jiang et al. 2001; Wang and Kim 2003) were compared to the same genes detected in the PA sample (Fig. 2C). For most tested conditions, expression of between 90% and 97% of the previously reported genes was observed on the tiling array. The exceptions were genes predominantly expressed in males, of which only 60% were detected, most probably reflecting the low number of males present in the mixed C. elegans population used for RNA sample preparation.

Major part of non-annotated transcriptome is longer non-polyadenylated transcripts

Compared to most genomes analyzed by tiling microarray (Cheng et al. 2005; David et al. 2006; Li et al. 2006; Manak et al. 2006), only a relatively small fraction (11%) of the detected polyadenylated transcripts occurred outside annotated exons of protein-coding genes, and the majority of the detected non-protein-coding transcripts in C. elegans thus appear to be non-polyadenylated. Only a very small fraction of this transcription was detected in the SNPA sample. At a signal probe intensity cutoff of 6.1, the SNPA data contained 1222 transcripts without annotation. RT-PCR and RACE analysis confirmed 77% of a random sample of TUFs from this set (Supplemental Documents 1 and 2). Contrary to an earlier analysis of the small noncoding transcriptome that found chromosome X to be nearly devoid of small ncRNAs (Deng et al. 2006), the SNPA TUF loci are nearly equally distributed on the C. elegans chromosomes, with a slight preference for chromosome X. Two-thirds of all TUF loci are intergenic, a higher fraction than for known small ncRNA loci (55%; Deng et al. 2006); however, the intergenic SNPA TUF loci show the same tendency as known ncRNA loci to locate in relative vicinity to annotated coding genes. The SNPA TUF loci appear less conserved than known and recently cloned loci, as only 21% show some conservation (weak WABA; Kent and Zahler 2000) in C. briggsae, and none was found to be conserved outside the nematodes. Further sequence analysis suggested that ~10% (126) of the SNPA TUFs may belong to various known ncRNA classes (mainly snoRNAs, snlRNAs, and sbRNAs); thus, a far larger fraction of the SNPA loci than of hitherto cloned transcripts may represent potentially novel functional categories of short RNAs. Analysis of sequence flanking the SNPA TUFs identified three known (UM1–3; Deng et al. 2006) and one novel (UM4) upstream motif at 143 of the most strongly expressed TUF loci (for further details see J. Wang, H. He, T. Liu, G. Skogerbø, and R. Chen, in prep.).

The NPA sample produced 97,548 transfrags, all of which could potentially represent noncoding transcripts (except transcripts coding for histones). Nearly 70% of the NPA transcripts overlap with annotated exons (55.9%) or introns (17%) of coding genes, whereas 20.8% are non-annotated intergenic TUFs. The NPA signal-to-background ratio is lower than for the other samples (Supplemental Fig. 1); however, RT-PCR analysis confirmed 90% (26/29) of randomly sampled intronic and intergenic TUFs, effectively excluding the possibility that the majority of the NPA TUFs are a result of nonspecific hybridization. RT-PCRs against eight regions of low signal intensity gave no positive amplification (Supplemental Document 1), further indicating that the NPA data are real and have picked up most of the existent non-polyadenylated transcription. TUFs in the NPA sample are also fairly well conserved, with 54% showing at least some level of conservation (weak WABA; Kent and Zahler 2000) in Caenorhabditis briggsae. Although some longer NPA TUFs were observed (the longest being 3579 nt), most are generally short (mean 88 nt, median 75 nt); however, of these only 557 overlapped with the SNPA TUFs, which seems unexpectedly few, considering the high specificity of the latter. This discrepancy may stem from a lower ability of random hexamer priming used for reverse transcription of the NPA sample to capture short ncRNAs (as compared to priming from a 3′-end–ligated adapter used for the SNPA sample), and short NPA TUFs located in close proximity may actually represent longer transcripts. We first tested this by randomly selecting eight pairs of TUFs separated by <500 bp, all of which could be individually validated by RT-PCR. Subsequent RT-PCRs with one primer in each of the paired two TUFs resulted in the amplification of fragments corresponding to the genomic distance between the TUFs in five of the eight pairs. 
No amplification was observed when reverse transcriptase was omitted from the reaction, indicating that the products did not arise from genomic DNA contamination but instead from unspliced transcripts spanning the distance between the two TUFs (Supplemental Document 1). To further explore the possibility that NPA TUFs mostly represented longer fragments, we then attempted a nested 5′- and 3′-RACE approach for all 33 TUFs validated by RT-PCR (Supplemental Document 1). Amplified fragments were cloned and sequenced, and 11 of 33 yielded at least one positive 5′- or 3′-RACE sequence. Seven of the RACE fragments extended at least 30 nt beyond the TUF from which they were initiated, and in one case the RACE fragment was 1 kb longer than its corresponding TUF (see Supplemental material for details). Taken together, these data suggest that a considerable fraction of the non-polyadenylated transcripts in C. elegans are in the form of longer, unspliced RNAs.

Coding regions are a complex web of overlapping transcripts

The NPA signals overlapping genic (exonic and intronic) sequence are more difficult to interpret. These could be of the same nature as intergenic non-polyadenylated signals (i.e., independent of coding gene transcription) or, conversely, could simply represent fragments from mRNA splicing and degradation. The signal intensity distributions for NPA TUFs and exonic transfrags show little difference (Fig. 3A), and various analyses of the genic PA and NPA data favor a hypothesis that genic non-polyadenylated transcription is not principally different from the intergenic transcriptional output (for further details, see T. Liu, H. He, J. Wang, G. Skogerbø, and R. Chen, in prep.); however, the genic non-polyadenylated transcription appears at least in part to be composed of alternative, unspliced transcripts (possibly antisense) covering both exons and introns of the coding genes. There also appears to be a positive correlation between polyadenylated and non-polyadenylated activity within the same coding gene boundaries. A few annotated coding genes with evidence of both PA and NPA transcription were tested by reverse transcription with single primers in either orientation, followed by PCR. Only two out of 14 cases amplified a fragment corresponding to an antisense transcript (Supplemental Fig. 5F); thus antisense transcription is not likely to make up the bulk of non-polyadenylated transcription overlapping coding exons. Bimorphic transcripts (identical transcripts existing in both polyadenylated and non-polyadenylated form) have been indicated in the human transcriptome (Cheng et al. 2005), but our data cannot distinguish between this and other forms of transcriptional activity occurring at coding loci. Nonetheless, the strong overlap between polyadenylated and non-polyadenylated transcription in annotated protein-coding regions of the genome suggests that the transcriptional complexity in C. elegans is similar to that observed in other eukaryotes (Stolc et al. 2005; Engström et al. 2006).

Figure 3.
Signal intensity (log2) distribution for NPA TUFs and annotated transfrags (A) and PA TUFs and annotated transfrags (B).

Non-annotated polyadenylated TUFs are composed of novel exons and other transcripts

The PA data included 93,337 transfrags overlapping annotated exons and 11,925 without genomic annotation. To associate unannotated transfrags with known genes or transcripts (Manak et al. 2006), we supplemented the WormBase WS160 RefSeq annotations (v. 21) with 346,064 ESTs from GenBank. Clustering ESTs and RefSeq cDNA overlapping the TUFs (Supplemental Document 3) produced 1938 potential gene regions (PGRs) containing 3192 TUFs. Of these, 1340 TUFs appear to be additional or alternative exons of known annotated genes (Fig. 4), and the remaining TUFs may represent potential exons of unknown genes. An intriguing example of the latter is a PGR on the X chromosome containing 14 TUFs surrounding a locus annotated as noncoding transcript C53C7.5. The PGR lacks extended coding potential, contains an SL1 splicing recognition site, and is detected also by the NPA array, suggesting that this PGR may be a trans-spliced, bimorphic (Cheng et al. 2005) noncoding RNA gene (Fig. 5).

Figure 4.
Assignment of additional exons to coding genes. (A) Potential 3′-end exon is detected ~700 bp downstream of the 3′-most annotated exon in gene Y54E10BR.2 on chromosome 1. (B) Coding gene Y51A2D.18 on chromosome V has a potential ...
Figure 5.
PGRs generated near a non-protein-coding region (“C53C7.5” in WormBase; Chen et al. 2005) on chromosome X. In the “Transfrag” track, blue, green, or orange boxes represent transfrags from the PA, NPA, or SNPA arrays, respectively. ...

Among the 8733 TUFs in the PA sample that cannot be linked to RefSeq and EST data, 943 have gene prediction annotation and therefore have some protein-coding potential. This leaves 7790 TUFs with no additional information. The PA TUFs are generally short, with a median size of 75 bp and a mean size of 87 bp, considerably shorter than most C. elegans exons. These TUFs have far lower signal intensities than most PA transfrags (Fig. 3B); nonetheless, RT-PCR analysis (Supplemental Document 1) confirmed 75% (18/24) of those tested, with no difference in confirmation rate between intronic and intergenic loci (see Supplemental materials for details). Further analysis of the 24 PA TUFs by 5′- and 3′-RACE, cloning, and sequencing gave a positive 5′- and/or 3′-RACE fragment for six of these (Supplemental Table 1), three of which extended >30 nt beyond the TUF itself, suggesting that a fraction of the small PA TUFs may also represent longer, lowly expressed transcripts.

Discussion

Relative to its genome size, the transcriptional output in C. elegans appears no less complex than those of other eukaryotes subjected to full genome scans. The tiling microarray detected ~200,000 transcribed regions corresponding to 22.7% of the C. elegans genome. When transcribed introns of all annotated coding genes are included in this figure, an estimated 62.4% of the C. elegans genome could be transcribed. Including the possibility that 60% of the detected genes may also have additional transcripts (antisense, bimorphic, or other) would further increase the amount of transcriptional output per base pair genomic sequence to 70%. The very likely possibility that the intervening regions between non-annotated NPA TUFs are also transcribed might, however, further increase this figure.

This amount of transcription is comparable to what has been estimated for a number of other eukaryotes (Willingham and Gingeras 2006). There are nevertheless a number of differences that set C. elegans apart from other organisms. Most tiling microarray studies have found an amount of non-annotated polyadenylated transcription several times higher than that expected to arise from annotated genes. Cumulative transcription detected in eight human cell lines covered 10.5% of all interrogated nucleotides, which is four times the annotated 2.5% exonic sequence in the human genome (Cheng et al. 2005). In rice, 58% of the positive probes represented regions of the genome annotated as intergenic (Li et al. 2006), and in the first 24 hours of Drosophila embryo development, 30% of the polyadenylated transcription does not correspond to known exons (Manak et al. 2006). Even in yeast, where annotated genes constitute ~70% of the genome, ~20% of the polyadenylated transcription arises outside annotated exons (David et al. 2006). In comparison, only 11% of the detected C. elegans polyadenylated transcripts could not be assigned to annotated loci.

Non-polyadenylated transcription has thus far only been studied by tiling microarray analysis in the human genome (Cheng et al. 2005). The amount of C. elegans non-polyadenylated transcription was 44% of the total observed transcriptional output on the array, comparable to the almost 50% reported for 10 human chromosomes (Cheng et al. 2005). Also similar to the human data, a major fraction (70%) of the non-polyadenylated transcription falls within the limits of coding loci (i.e., overlapping either exons or introns), the majority of this at least partially overlapping exonic sequence.

Our main aim with this study was to obtain an overview of the non-annotated (and potentially noncoding) elements of the C. elegans transcriptome, and in particular the complement of small noncoding RNAs. Computational predictions based on sequence conservation of potential secondary structure had indicated the presence of ~3600 such loci in the C. elegans and C. briggsae genomes (Missal et al. 2006). To explore this set of ncRNAs, we employed a preparation procedure that enriched the hybridized sample in small non-polyadenylated RNAs. Contrary to the expectations from the computational and other estimates (Deng et al. 2006; Missal et al. 2006), the C. elegans genome appears not to encode any larger number of small non-polyadenylated RNAs. Also, of the ~1200 novel SNPA TUFs, only 4.2% overlapped or fell within close reach of the computationally predicted sites; thus, neither DNA sequence conservation nor secondary-structure potential appears to have high predictive value when it comes to identifying novel ncRNA genes. We cannot exclude the possibility that RNA samples harvested from mixed-stage worm culture are not representative of the full small noncoding transcriptome, but as judged from the polyadenylated array data there does not appear to be any major fraction of the transcriptome that is not represented in a mixed-stage culture. On the contrary, in C. elegans the bulk of the potentially noncoding RNAs seems to be either polyadenylated or in the form of longer non-polyadenylated transcripts.

Does the picture of the C. elegans transcriptome deviate from those obtained from other organisms studied by tiling arrays? The fact that the observed non-annotated polyadenylated transcription is just 11% of all the PA-detected transcripts would imply that it does, and had we not included non-polyadenylated data in our analysis, the answer to the question of Hillier et al. (2005) as to why such a small worm needs so many (coding) genes would have been that it has so few other genes. The non-polyadenylated transcription data complete the picture in the sense that, when it comes to the fraction of total transcriptional output, the worm is as rich in non-polyadenylated transcripts as is man. However, when taking into consideration that the human polyadenylated transcriptome is probably several times larger than the coding part of its genome, the worm falls short also in this respect. Thus, it may be that the worm has received a nearly full complement of protein-coding genes, but when it comes to participation in the new RNA world of regulatory complexity (Mattick 2004), its transcriptome betrays its organismal simplicity.

Methods

Sample preparation

RNA was extracted from mixed-stage wild-type N2 strain worms cultivated at 20°C according to the Trizol (Invitrogen) protocol. Small RNAs (<500 nt, SNPA sample) were isolated using a QIAGEN tip (QIAGEN), and the Poly(A)Purist MAG (Ambion) and MicrobExpress kits (Ambion) were adapted to remove remaining mRNAs and rRNAs (Deng et al. 2006). The enriched ncRNA pool was cloned using an adaptor-mediated library construction protocol. RNAs were dephosphorylated with calf intestine alkaline phosphatase (Fermentas) and then ligated to the 3′-adaptor (3AD) oligonucleotide by T4 RNA ligase (Fermentas) (He et al. 2006). Polyadenylated RNA (PA sample) was isolated from total RNA using the Poly(A)Purist MAG kit (Ambion). Non-polyadenylated RNA (NPA sample) was prepared by removing polyadenylated RNA using the Poly(A)Purist MAG kit and rRNA using the MicrobExpress kit (Supplemental Fig. 2). The PA and NPA RNA samples were reverse-transcribed (RT) using random hexamers, and the SNPA RNA sample was reverse-transcribed using a primer complementary to the 3′-adaptor (oligo 3RT). First-strand cDNA was then used for second-strand DNA synthesis; the double-stranded DNA was fractioned, labeled, and hybridized to the tiling array according to Affymetrix’s GeneChip Whole Transcript (WT) Double-Stranded Target Assay Manual (http://www.affymetrix.com). The microarrays were scanned on a 3000 7G GeneChip Scanner. Sample preparation for hybridization of the PA, NPA, and SNPA samples started from 140 μg, 140 μg, and 1 mg total RNA, respectively. Each prepared sample was hybridized once to the array, and the entire process of sample preparation and hybridization was carried out twice for each type of sample.

RT-PCR and 5′- and 3′-RACE

Total RNA digested with DNase I (Fermentas) was used as template for RT-PCR (QIAGEN OneStep RT-PCR kit). RACE for the SNPA TUFs was performed by PCR amplification of a previously prepared small ncRNA cDNA library (Deng et al. 2006), with one primer specific for the ncRNA sequence and the other primer being either 5CD or 3RT for 5′- or 3′-RACE, respectively. The PA and NPA TUF RACE reactions were carried out on the polyadenylated RNA and non-polyadenylated RNA fractions, respectively (see Supplemental Document 1).

Computational analyses

C. elegans genome annotation and sequence data and C. briggsae genome data were downloaded from WormBase (version WS140) (Harris et al. 2003). Raw data analysis and transfrag determination were performed as described by Kampa et al. (2004) with minor modifications (Supplemental Document 2). Briefly, the replicates were quantile-normalized and then scaled to a median intensity of 60. Log2[max(PM − MM, 1)] was calculated for each probe as an estimate of the expression level at each genomic position. Probes were considered significant over background if their signals were above a threshold associated with a false-positive rate of 4.6%, estimated from the negative bacterial controls on the arrays. A transfrag was defined by the signal intensity threshold, a maximum gap between positive probe pairs (maxgap = 30), and a minimum length of the stretch of positive probe pairs (minrun = 13, at least two probe pairs). The analysis was implemented with the Affymetrix Tiling Analysis Software version 1.1.02.
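
The per-probe signal and the transfrag rule described above can be sketched in a few lines. This is not the Affymetrix Tiling Analysis Software and it omits quantile normalization and scaling. It assumes that the gap is measured between the positions of consecutive positive probes and that the run length is the distance from the first to the last positive probe. The function names, the toy PM/MM intensities, and the use of 6.1 as threshold (borrowed from the SNPA cutoff quoted in Results) are illustrative choices only.

```python
import math
from typing import List, Sequence, Tuple

def probe_signal(pm: float, mm: float) -> float:
    """Log2[max(PM - MM, 1)] for one probe pair."""
    return math.log2(max(pm - mm, 1))

def call_transfrags(positions: Sequence[int], signals: Sequence[float],
                    threshold: float, maxgap: int = 30,
                    minrun: int = 13) -> List[Tuple[int, int]]:
    """Join positive probes no more than maxgap apart; keep runs >= minrun long.

    positions must be sorted genomic coordinates of the probes.
    Returns (first_position, last_position) for each transfrag.
    """
    transfrags: List[Tuple[int, int]] = []
    run: List[int] = []

    def flush() -> None:
        # Keep the current run only if it spans at least minrun bases.
        if run and run[-1] - run[0] >= minrun:
            transfrags.append((run[0], run[-1]))

    for pos, sig in zip(positions, signals):
        if sig <= threshold:
            continue  # probe not significant over background
        if run and pos - run[-1] > maxgap:
            flush()
            run = []
        run.append(pos)
    flush()
    return transfrags

# Toy data: probes every 25 bp with made-up (PM, MM) intensities.
positions = [0, 25, 50, 75, 100, 125, 150]
pairs = [(300, 100), (500, 50), (80, 90), (200, 20),
         (60, 55), (400, 100), (350, 30)]
signals = [probe_signal(pm, mm) for pm, mm in pairs]

transfrags = call_transfrags(positions, signals, threshold=6.1)
print(transfrags)
```

With probes spaced 25 bp apart, a single negative probe between two positive ones opens a 50-bp gap and splits the run, and an isolated positive probe (position 75 in the toy data) is discarded because its run has zero length.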

All data underlying the study have been made available in the Supplemental material and on our server at http://bioinfo.ibp.ac.cn/tiling_array/.

Acknowledgments

We thank Yi Zhao, Yudong Wang, and Dandan He for early discussions of the experiments. The C. elegans strain N2 used in this work was provided by the Caenorhabditis Genetics Center, which is funded by the NIH National Center for Research Resources. This work was supported by the National Sciences Foundation of China (grant 30630040) and the National Key Basic Research & Development Program 973 (grants 2002CB713805 and 2003CB715900).

Footnotes

[Supplemental material is available online at www.genome.org.]

Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6611807

References

  • Bertone P., Stolc V., Royce T.E., Rozowsky J.S., Urban A.E., Zhu X., Rinn J.L., Tongprasit W., Samanta M., Weissman S., Stolc V., Royce T.E., Rozowsky J.S., Urban A.E., Zhu X., Rinn J.L., Tongprasit W., Samanta M., Weissman S., Royce T.E., Rozowsky J.S., Urban A.E., Zhu X., Rinn J.L., Tongprasit W., Samanta M., Weissman S., Rozowsky J.S., Urban A.E., Zhu X., Rinn J.L., Tongprasit W., Samanta M., Weissman S., Urban A.E., Zhu X., Rinn J.L., Tongprasit W., Samanta M., Weissman S., Zhu X., Rinn J.L., Tongprasit W., Samanta M., Weissman S., Rinn J.L., Tongprasit W., Samanta M., Weissman S., Tongprasit W., Samanta M., Weissman S., Samanta M., Weissman S., Weissman S., et al. Global identification of human transcribed sequences with genome tiling arrays. Science. 2004;306:2242–2246. [PubMed]
  • Bertone P., Trifonov V., Rozowsky J.S., Schubert F., Emanuelsson O., Karro J., Kao M.-Y., Snyder M., Gerstein M., Trifonov V., Rozowsky J.S., Schubert F., Emanuelsson O., Karro J., Kao M.-Y., Snyder M., Gerstein M., Rozowsky J.S., Schubert F., Emanuelsson O., Karro J., Kao M.-Y., Snyder M., Gerstein M., Schubert F., Emanuelsson O., Karro J., Kao M.-Y., Snyder M., Gerstein M., Emanuelsson O., Karro J., Kao M.-Y., Snyder M., Gerstein M., Karro J., Kao M.-Y., Snyder M., Gerstein M., Kao M.-Y., Snyder M., Gerstein M., Snyder M., Gerstein M., Gerstein M. Design optimization methods for genomic DNA tiling arrays. Genome Res. 2006;16:271–281. [PMC free article] [PubMed]
  • Bracht J., Hunter S., Eachus R., Weeks P., Pasquinelli A.E., Hunter S., Eachus R., Weeks P., Pasquinelli A.E., Eachus R., Weeks P., Pasquinelli A.E., Weeks P., Pasquinelli A.E., Pasquinelli A.E. Trans-splicing and polyadenylation of let-7 microRNA primary transcripts. RNA. 2004;10:1586–1594. [PMC free article] [PubMed]
  • Chen N., Harris T.W., Antoshechkin I., Bastiani C., Bieri T., Blasiar D., Bradnam K., Canaran P., Chan J., Chen C.-K., Harris T.W., Antoshechkin I., Bastiani C., Bieri T., Blasiar D., Bradnam K., Canaran P., Chan J., Chen C.-K., Antoshechkin I., Bastiani C., Bieri T., Blasiar D., Bradnam K., Canaran P., Chan J., Chen C.-K., Bastiani C., Bieri T., Blasiar D., Bradnam K., Canaran P., Chan J., Chen C.-K., Bieri T., Blasiar D., Bradnam K., Canaran P., Chan J., Chen C.-K., Blasiar D., Bradnam K., Canaran P., Chan J., Chen C.-K., Bradnam K., Canaran P., Chan J., Chen C.-K., Canaran P., Chan J., Chen C.-K., Chan J., Chen C.-K., Chen C.-K., et al. WormBase: A comprehensive data resource for Caenorhabditis biology and genomics. Nucleic Acids Res. 2005;33:D383–D389. doi: 10.1093/nar/gki066. [PMC free article] [PubMed] [Cross Ref]
  • Cheng J., Kapranov P., Drenkow J., Dike S., Brubaker S., Patel S., Long J., Stern D., Tammana H., Helt G., Kapranov P., Drenkow J., Dike S., Brubaker S., Patel S., Long J., Stern D., Tammana H., Helt G., Drenkow J., Dike S., Brubaker S., Patel S., Long J., Stern D., Tammana H., Helt G., Dike S., Brubaker S., Patel S., Long J., Stern D., Tammana H., Helt G., Brubaker S., Patel S., Long J., Stern D., Tammana H., Helt G., Patel S., Long J., Stern D., Tammana H., Helt G., Long J., Stern D., Tammana H., Helt G., Stern D., Tammana H., Helt G., Tammana H., Helt G., Helt G., et al. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005;308:1149–1154. [PubMed]
  • David L., Huber W., Granovskaia M., Toedling J., Palm C.J., Bofkin L., Jones T., Davis R.W., Steinmetz L.M., Huber W., Granovskaia M., Toedling J., Palm C.J., Bofkin L., Jones T., Davis R.W., Steinmetz L.M., Granovskaia M., Toedling J., Palm C.J., Bofkin L., Jones T., Davis R.W., Steinmetz L.M., Toedling J., Palm C.J., Bofkin L., Jones T., Davis R.W., Steinmetz L.M., Palm C.J., Bofkin L., Jones T., Davis R.W., Steinmetz L.M., Bofkin L., Jones T., Davis R.W., Steinmetz L.M., Jones T., Davis R.W., Steinmetz L.M., Davis R.W., Steinmetz L.M., Steinmetz L.M. A high-resolution map of transcription in the yeast genome. Proc. Natl. Acad. Sci. 2006;103:5320–5325. [PMC free article] [PubMed]
  • Deng W., Zhu X., Skogerbø G., Zhao Y., Fu Z., Wang Y., He H., Cai L., Sun H., Liu C., Zhu X., Skogerbø G., Zhao Y., Fu Z., Wang Y., He H., Cai L., Sun H., Liu C., Skogerbø G., Zhao Y., Fu Z., Wang Y., He H., Cai L., Sun H., Liu C., Zhao Y., Fu Z., Wang Y., He H., Cai L., Sun H., Liu C., Fu Z., Wang Y., He H., Cai L., Sun H., Liu C., Wang Y., He H., Cai L., Sun H., Liu C., He H., Cai L., Sun H., Liu C., Cai L., Sun H., Liu C., Sun H., Liu C., Liu C., et al. Organization of the Caenorhabditis elegans small non-coding transcriptome: Genomic features, biogenesis, and expression. Genome Res. 2006;16:20–29. [PMC free article] [PubMed]
  • The ENCODE Project Consortium. The ENCODE (ENCyclopedia Of DNA Elements) project. Science. 2004;306:636–640.
  • Engström P.G., Suzuki H., Ninomiya N., Akalin A., Sessa L., Lavorgna G., Brozzi A., Luzi L., Tan S.L., Yang L., et al. Complex loci in human and mouse genomes. PLoS Genet. 2006;2:e47. doi: 10.1371/journal.pgen.0020047.
  • Griffiths-Jones S., Grocock R.J., van Dongen S., Bateman A., Enright A.J. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 2006;34:D140–D144. doi: 10.1093/nar/gkj112.
  • Harris T.W., Lee R., Schwarz E., Bradnam K., Lawson D., Chen W., Blasier D., Kenny E., Cunningham F., Kishore R., et al. WormBase: a cross-species database for comparative genomics. Nucleic Acids Res. 2003;31:133–137.
  • He H., Cai L., Skogerbø G., Deng W., Liu T., Zhu X., Wang Y., Jia D., Zhang Z., Tao Y., et al. Profiling Caenorhabditis elegans non-coding RNA expression with a combined microarray. Nucleic Acids Res. 2006;34:2976–2983. doi: 10.1093/nar/gkl371.
  • Hillier L.W., Coulson A., Murray J.I., Bao Z., Sulston J.E., Waterston R.H. Genomics in C. elegans: So many genes, such a little worm. Genome Res. 2005;15:1651–1660.
  • Jiang M., Ryu J., Kiraly M., Duke K., Reinke V., Kim S.K. Genome-wide analysis of developmental and sex-regulated gene expression profiles in Caenorhabditis elegans. Proc. Natl. Acad. Sci. 2001;98:218–223.
  • Kampa D., Cheng J., Kapranov P., Yamanaka M., Brubaker S., Cawley S., Drenkow J., Piccolboni A., Bekiranov S., Helt G., et al. Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res. 2004;14:331–342.
  • Kapranov P., Cawley S.E., Drenkow J., Bekiranov S., Strausberg R.L., Fodor S.P., Gingeras T.R. Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002;296:916–919.
  • Kent W.J., Zahler A.M. Conservation, regulation, synteny, and introns in a large-scale C. briggsae–C. elegans genomic alignment. Genome Res. 2000;10:1115–1125.
  • Li L., Wang X., Stolc V., Li X., Zhang D., Su N., Tongprasit W., Li S., Cheng Z., Wang J., et al. Genome-wide transcription analyses in rice using tiling microarrays. Nat. Genet. 2006;38:124–129.
  • Liu C., Bai B., Skogerbo G., Cai L., Deng W., Zhang Y., Bu D., Zhao Y., Chen R. NONCODE: an integrated knowledge database of non-coding RNAs. Nucleic Acids Res. 2005;33:D112–D115. doi: 10.1093/nar/gki041.
  • Lowe T.M., Eddy S.R. tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–964.
  • Manak J.R., Dike S., Sementchenko V., Kapranov P., Biemar F., Long J., Cheng J., Bell I., Ghosh S., Piccolboni A., et al. Biological function of unannotated transcription during the early development of Drosophila melanogaster. Nat. Genet. 2006;38:1151–1158.
  • Mattick J.S. RNA regulation: A new genetics? Nat. Rev. Genet. 2004;5:316–323.
  • Missal K., Zhu X., Rose D., Deng W., Skogerbø G., Chen R., Stadler P.F. Prediction of structured non-coding RNAs in the genomes of the nematodes Caenorhabditis elegans and Caenorhabditis briggsae. J. Exp. Zool. B Mol. Dev. Evol. 2006;306:379–392.
  • Pang K.C., Stephen S., Engström P.G., Tajul-Arifin K., Chen W., Wahlestedt C., Lenhard B., Hayashizaki Y., Mattick J.S. RNAdb—A comprehensive mammalian noncoding RNA database. Nucleic Acids Res. 2005;33:D125–D130. doi: 10.1093/nar/gki089.
  • Rinn J.L., Euskirchen G., Bertone P., Martone R., Luscombe N.M., Hartman S., Harrison P.M., Nelson F.K., Miller P., Gerstein M., et al. The transcriptional activity of human chromosome 22. Genes & Dev. 2003;17:529–540.
  • Stolc V., Samanta M.P., Tongprasit W., Sethi H., Liang S., Nelson D.C., Hegeman A., Nelson C., Rancour D., Bednarek S., et al. Identification of transcribed sequences in Arabidopsis thaliana by using high-resolution genome tiling arrays. Proc. Natl. Acad. Sci. 2005;102:4453–4458.
  • Stricklin S.L., Griffiths-Jones S., Eddy S.R. C. elegans noncoding RNA genes. In: The C. elegans Research Community, editor. WormBook. 2005. http://www.wormbook.org/chapters/www_noncodingRNA/noncodingRNA.html.
  • Wang J., Kim S.K. Global analysis of dauer gene expression in Caenorhabditis elegans. Development. 2003;130:1621–1634.
  • Willingham A.T., Gingeras T.R. TUF love for “junk” DNA. Cell. 2006;125:1215–1220.
  • Yamada K., Lim J., Dale J.M., Chen H., Shinn P., Palm C.J., Southwick A.M., Wu H.C., Kim C., Nguyen M., et al. Empirical analysis of transcriptional activity in the Arabidopsis genome. Science. 2003;302:842–846.
  • Zemann A., de op Bekke A., Kiefmann M., Brosius J., Schmitz J. Evolution of small nucleolar RNAs in nematodes. Nucleic Acids Res. 2006;34:2676–2685. doi: 10.1093/nar/gkl359.

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press