Identification of putative noncoding RNAs among the RIKEN mouse full-length cDNA collection

Genome Res. 2003 Jun;13(6B):1301-6. doi: 10.1101/gr.1011603.

Abstract

With the sequencing and annotation of genomes and transcriptomes of several eukaryotes, the importance of noncoding RNA (ncRNA), that is, RNA molecules that are not translated to protein products, has become more evident. A subclass of ncRNA transcripts is encoded by highly regulated, multi-exon transcriptional units, is processed like typical protein-coding mRNAs, and is increasingly implicated in the regulation of many cellular functions in eukaryotes. This study describes the identification of candidate functional ncRNAs from among the RIKEN mouse full-length cDNA collection, which contains 60,770 sequences, by using a systematic computational filtering approach. We initially searched for previously reported ncRNAs and found nine murine ncRNAs and homologs of several previously described nonmouse ncRNAs. Using our computational approach to select artifact-free clones that lack protein-coding potential, we extracted 4280 transcripts as the largest candidate set. Many clones in the set had EST hits, potential CpG islands surrounding the transcription start sites, and homologies with the human genome. This implies that many candidates are indeed transcribed in a regulated manner. Our results demonstrate that ncRNAs are a major functional subclass of processed transcripts in mammals.
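The abstract does not spell out the filtering criteria, so the following is only a minimal sketch of one common way to screen sequences for a lack of protein-coding potential: discard any clone whose longest open reading frame (ORF) reaches a length cutoff. The function names, the forward-strand-only scan, and the 100-codon cutoff are all illustrative assumptions, not the criteria used in the study.

```python
# Hypothetical sketch: an ORF-length screen for coding potential.
# The cutoff and all names here are assumptions, not the study's method.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Return the length, in codons (start included, stop excluded), of the
    longest complete ATG...stop ORF in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        run = None  # None means we are not currently inside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if run is None:
                if codon == "ATG":
                    run = 1  # open a new ORF at the first start codon
            elif codon in STOP_CODONS:
                best = max(best, run)  # ORF closed by a stop codon
                run = None
            else:
                run += 1
    return best


def filter_candidates(clones: dict[str, str], max_orf_codons: int = 100) -> list[str]:
    """Return IDs of clones whose longest ORF is shorter than the cutoff
    (an assumed threshold), in input order."""
    return [
        clone_id
        for clone_id, seq in clones.items()
        if longest_orf_codons(seq) < max_orf_codons
    ]


if __name__ == "__main__":
    # Toy sequences for illustration only; not real cDNA clones.
    toy = {
        "clone_a": "ATG" + "GCT" * 5 + "TAA",
        "clone_b": "CCCCCC",
    }
    print(filter_candidates(toy, max_orf_codons=5))
```

A real pipeline would combine a screen like this with further evidence, such as the artifact checks, EST hits, and genome homology mentioned in the abstract; ORF length alone misclassifies short proteins and long chance ORFs.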

MeSH terms

  • Animals
  • Computational Biology / methods
  • DNA, Complementary / genetics*
  • Databases, Genetic*
  • Humans
  • Mice
  • Mice, Inbred C57BL
  • RNA, Complementary / chemistry
  • RNA, Complementary / genetics
  • RNA, Untranslated / chemistry
  • RNA, Untranslated / genetics*
  • Rats
  • Sequence Homology, Nucleic Acid

Substances

  • DNA, Complementary
  • RNA, Complementary
  • RNA, Untranslated