iMiRNA-SSF: Improving the Identification of MicroRNA Precursors by Combining Negative Sets with Different Distributions

Sci Rep. 2016 Jan 12:6:19062. doi: 10.1038/srep19062.

Abstract

The identification of microRNA precursors (pre-miRNAs) helps in understanding regulation in biological processes. The performance of computational predictors depends on their training sets, in which the negative sets play an important role. We therefore investigated the influence of benchmark datasets on the predictive performance of computational predictors for miRNA identification, and found that the choice of negative samples has a significant impact on the predictive results of various methods. We constructed a new benchmark set whose negative samples are drawn from different data distributions. Trained on this high-quality benchmark dataset, a new computational predictor called iMiRNA-SSF was proposed, which employs various features extracted from RNA sequences. Experimental results showed that iMiRNA-SSF outperforms three state-of-the-art computational methods. For practical applications, a web server for iMiRNA-SSF is available at http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/.
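
The idea of building one negative set from sources with different distributions can be illustrated with a minimal sketch. The abstract does not say how the negative samples were combined, so the equal-quota sampling below is an assumption, and the function name, pool names, and toy sequences are all hypothetical.

```python
import random


def combine_negative_sets(negative_pools, total, seed=0):
    """Draw a near-equal share from each negative pool, without replacement.

    negative_pools: dict mapping a source name to a list of RNA sequences.
    total: number of negative samples wanted in the combined set.
    Returns a shuffled list of (source_name, sequence) pairs.
    Equal quotas per source are an illustrative assumption, not the
    published procedure.
    """
    if not negative_pools:
        raise ValueError("at least one negative pool is required")
    rng = random.Random(seed)
    names = sorted(negative_pools)  # fixed order keeps runs reproducible
    base, extra = divmod(total, len(names))
    combined = []
    for i, name in enumerate(names):
        # The first `extra` sources absorb the remainder, one sample each.
        quota = base + (1 if i < extra else 0)
        pool = negative_pools[name]
        if quota > len(pool):
            raise ValueError(f"pool {name!r} has fewer than {quota} sequences")
        combined.extend((name, seq) for seq in rng.sample(pool, quota))
    rng.shuffle(combined)
    return combined


# Toy pools standing in for two negative sources (hypothetical data).
toy_pools = {
    "source_a": ["AUGCUAGC", "GGCAUCGA", "UUAGCCGA"],
    "source_b": ["CCGAUUAG", "AAUGGCUA", "GCGCAUAU"],
}
negatives = combine_negative_sets(toy_pools, total=4, seed=1)
```

Keeping the source name alongside each sequence makes it possible to check later whether a trained predictor performs differently on negatives from each distribution.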

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Databases as Topic
  • Humans
  • Internet
  • MicroRNAs / genetics*
  • MicroRNAs / metabolism
  • RNA Precursors / genetics*
  • RNA Precursors / metabolism
  • ROC Curve
  • Reproducibility of Results
  • Support Vector Machine
  • Thermodynamics

Substances

  • MicroRNAs
  • RNA Precursors