Sci Rep. 2018 Nov 6;8(1):16385. doi: 10.1038/s41598-018-34708-w.

Prediction of LncRNA Subcellular Localization with Deep Learning from Sequence Features.

Author information

1 Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA.
2 Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA. liangjw@clemson.edu.

Abstract

Long non-coding RNAs (lncRNAs) are involved in biological processes throughout the cell, including in the nucleus, chromatin and cytosol. However, most lncRNAs remain unannotated, and their functional annotation is difficult because of their low conservation and their tissue-specific and developmentally specific expression. The subcellular localization of a lncRNA is highly informative about its biological function, but it is difficult to determine because few prediction methods currently exist. While protein subcellular localization prediction is a well-established research field, lncRNA localization prediction is a novel research problem. We developed DeepLncRNA, a deep learning algorithm that predicts lncRNA subcellular localization directly from lncRNA transcript sequences. We analyzed 93 strand-specific RNA-seq samples of nuclear and cytosolic fractions from multiple cell types to identify differentially localized lncRNAs. We then extracted sequence-based features from the lncRNAs to construct our DeepLncRNA model, which achieved an accuracy of 72.4%, a sensitivity of 83%, a specificity of 62.4% and an area under the receiver operating characteristic curve of 0.787. Our results suggest that primary sequence motifs are a major driving force in the subcellular localization of lncRNAs.
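
As a rough illustration of the two ideas named in the abstract, the sketch below turns a transcript sequence into fixed-length sequence features and computes accuracy, sensitivity and specificity for a binary nuclear-versus-cytosolic classifier. The abstract does not say which sequence features DeepLncRNA uses, so normalized k-mer frequencies are an assumption here, and the function names `kmer_frequencies` and `binary_metrics`, the choice of k = 3, and the convention that label 1 is the positive class are all hypothetical. This is not the authors' implementation.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    Assumption: k-mer frequencies stand in for the unspecified
    "sequence-based features" mentioned in the abstract.
    """
    # Treat RNA and DNA spellings alike by mapping U to T.
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all 4**k k-mers so every sequence gets the same feature order.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing other characters (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity from paired labels.

    Assumption: `positive` marks one localization class; which class
    the paper treats as positive is not stated in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    n = len(pairs)
    return {
        "accuracy": (tp + tn) / n if n else 0.0,
        # Sensitivity: fraction of true positives that were recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: fraction of true negatives that were recovered.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }
```

In such a setup, the feature dictionary for each transcript would be flattened into a vector and passed to a classifier, and `binary_metrics` would be applied to held-out predictions to obtain figures comparable in kind to those reported above.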
