Front Genet. 2019 Feb 25;10:119. doi: 10.3389/fgene.2019.00119. eCollection 2019.

Improved Pre-miRNAs Identification Through Mutual Information of Pre-miRNA Sequences and Structures.

Author information

1. College of Information Science and Engineering, Hunan University, Changsha, China.
2. School of Mathematics and Statistics, Hainan Normal University, Haikou, China.
3. School of Computer Science, Hunan University of Technology, Zhuzhou, China.
4. Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, United States.

Abstract

MicroRNAs (miRNAs) are a family of short non-coding RNAs that play critical roles as post-transcriptional regulators and are derived from longer transcripts called precursor miRNAs (pre-miRNAs). Experimental methods for identifying pre-miRNAs are expensive and time-consuming, which creates a need for computational alternatives. In recent years, the accuracy of computational methods for predicting pre-miRNAs has increased significantly. However, several drawbacks remain. First, these methods usually consider only base frequencies or sequence information while ignoring the dependencies between bases. Second, feature extraction methods based on secondary structures usually consider only global characteristics while ignoring the mutual influence of local structures. Third, methods that integrate high-dimensional feature information are computationally inefficient. In this study, we propose a novel mutual information-based feature representation algorithm for pre-miRNA sequences and secondary structures that captures both the interactions between sequence bases and the local features of the RNA secondary structure. In addition, the feature space is smaller than that of most popular methods, which makes our method more computationally efficient than its competitors. Finally, we used these features to train a support vector machine model to predict pre-miRNAs and compared the results with those of other popular predictors. Our method outperforms the others under both 5-fold cross-validation and the Jackknife test.
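To make the idea of mutual information between bases concrete, the following is a minimal sketch and not the authors' algorithm, whose exact feature definitions are not given in this abstract. It assumes the simplest possible case: the mutual information, in bits, between each symbol and its right-hand neighbor in a single string. The function name `adjacent_mutual_information` is hypothetical. Because it works on any string, the same function could in principle be applied to a nucleotide sequence or to a secondary structure written in dot-bracket notation.

```python
from collections import Counter
from math import log2


def adjacent_mutual_information(seq: str) -> float:
    """Mutual information (bits) between a symbol and its right-hand neighbor.

    Illustrative only: the paper's actual features may be defined differently.
    """
    pairs = list(zip(seq, seq[1:]))  # all adjacent (left, right) symbol pairs
    if not pairs:
        return 0.0  # fewer than two symbols: no pairs to measure

    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (left, right) pair
    left = Counter(a for a, _ in pairs)    # marginal counts, left position
    right = Counter(b for _, b in pairs)   # marginal counts, right position

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (left[a] * right[b])
        mi += p_ab * log2(count * n / (left[a] * right[b]))
    return mi


if __name__ == "__main__":
    # A homopolymer carries no information about its neighbor.
    print(adjacent_mutual_information("AAAA"))
    # A strictly alternating sequence makes each base predictive of the next.
    print(adjacent_mutual_information("ACAC"))
```

A score near zero means that knowing one base says little about the next, while larger values indicate stronger dependence between neighboring positions, which is the kind of signal that base-frequency features alone cannot capture.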

KEYWORDS:

feature representation algorithm; mutual information; pre-miRNAs identification; structure analysis; support vector machine
