Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications

Nucleic Acids Res. 2009 May;37(8):2461-70. doi: 10.1093/nar/gkp093. Epub 2009 Mar 2.

Abstract

Recent advances in DNA-sequencing technology have made it possible to obtain large datasets of small RNA sequences. Here we demonstrate that not all non-perfectly matched small RNA sequences are simple technological sequencing errors, but many hold valuable biological information. Analysis of three small RNA datasets originating from Oryza sativa and Arabidopsis thaliana small RNA-sequencing projects demonstrates that many single nucleotide substitution errors overlap when aligning homologous non-identical small RNA sequences. Investigating the sites and identities of substitution errors reveals that many potentially originate as a result of post-transcriptional modifications or RNA editing. Modifications include N1-methyl modified purine nucleotides in tRNA, potential deamination or base substitutions in micro RNAs, 3' micro RNA uridine extensions and 5' micro RNA deletions. Additionally, further analysis of large sequencing datasets reveals that the combined effects of 5' deletions and 3' uridine extensions can alter the specificity by which micro RNAs associate with different Argonaute proteins. Hence, we demonstrate that not all sequencing errors in small RNA datasets are technical artifacts, but that these actually often reveal valuable biological insights into the sites of post-transcriptional RNA modifications.
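
The kinds of read variation described in the abstract (5' deletions, 3' uridine extensions and single nucleotide substitutions that recur at the same site) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `classify_read` and `tally_substitution_sites`, the thresholds `max_trim`, `max_tail` and `max_subs`, and the example sequence are all assumptions introduced here for illustration, and the sketch compares a read directly against one reference sequence rather than aligning it to a genome.

```python
from collections import Counter


def classify_read(read, reference, max_trim=3, max_tail=3, max_subs=1):
    """Describe how a small RNA read differs from a reference sequence.

    Hypothetical helper: it tries every 5' deletion of up to `max_trim`
    nucleotides, requires any extra 3' nucleotides to be uridines
    (at most `max_tail`), and allows up to `max_subs` substitutions.
    3' truncations and indels inside the read are not handled.
    Returns a dict, or None if no explanation fits.
    """
    best = None
    for trim in range(max_trim + 1):
        core = reference[trim:]            # reference after the 5' deletion
        tail = len(read) - len(core)       # length of the 3' extension
        if tail < 0 or tail > max_tail:
            continue
        body, extension = read[:len(core)], read[len(core):]
        if extension != "U" * tail:        # only non-templated uridines count
            continue
        # 1-based reference position, reference base, observed base
        subs = [
            (trim + i + 1, ref_base, read_base)
            for i, (ref_base, read_base) in enumerate(zip(core, body))
            if ref_base != read_base
        ]
        if len(subs) > max_subs:
            continue
        candidate = {
            "five_prime_deletion": trim,
            "uridine_extension": tail,
            "substitutions": subs,
        }
        # Prefer the explanation with the fewest substitutions;
        # ties keep the smaller 5' deletion found first.
        if best is None or len(subs) < len(best["substitutions"]):
            best = candidate
    return best


def tally_substitution_sites(reads, reference):
    """Count how often each (position, ref base, read base) recurs."""
    sites = Counter()
    for read in reads:
        hit = classify_read(read, reference)
        if hit is not None:
            sites.update(hit["substitutions"])
    return sites


# Illustrative 20-nt reference; not taken from the datasets in the paper.
REFERENCE = "UGACAGAAGAGAGUGAGCAC"
trimmed_and_tailed = REFERENCE[1:] + "UU"             # 1-nt 5' deletion, UU tail
substituted = REFERENCE[:9] + "G" + REFERENCE[10:]    # A-to-G at position 10

print(classify_read(trimmed_and_tailed, REFERENCE))
print(tally_substitution_sites([substituted, substituted], REFERENCE))
```

A substitution that is tallied many times at the same position across independent reads is the kind of overlapping "error" the abstract argues may reflect a modification or editing event rather than random sequencing noise; the sketch itself makes no such inference and only does the counting.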

Publication types

  • Meta-Analysis
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Arabidopsis / genetics
  • Artifacts
  • Base Sequence
  • Genome, Plant
  • MicroRNAs / chemistry*
  • MicroRNAs / metabolism
  • Oryza / genetics
  • Poly U / analysis
  • RNA Editing
  • RNA Processing, Post-Transcriptional*
  • RNA, Transfer / chemistry*
  • RNA, Transfer / metabolism
  • Sequence Alignment
  • Sequence Analysis, RNA*

Substances

  • MicroRNAs
  • Poly U
  • RNA, Transfer