New Phytol. 2018 Nov;220(3):851-864. doi: 10.1111/nph.15349. Epub 2018 Jul 18.

Reproductive phasiRNAs in grasses are compositionally distinct from other classes of small RNAs.

Author information

1. Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, 19714, USA.
2. Delaware Biotechnology Institute, University of Delaware, Newark, DE, 19714, USA.
3. Donald Danforth Plant Science Center, St Louis, MO, 63132, USA.
4. Department of Computer and Information Sciences, University of Delaware, Newark, DE, 19714, USA.
5. Division of Plant Sciences, University of Missouri - Columbia, 52 Agriculture Lab, Columbia, MO, 65211, USA.

Abstract

Little is known about the characteristics and function of reproductive phased, secondary, small interfering RNAs (phasiRNAs) in the Poaceae, despite the availability of significant genomic resources, experimental data, and a growing number of computational tools. We utilized machine-learning methods to identify sequence-based and positional features that distinguish phasiRNAs in rice and maize from other small RNAs (sRNAs). We developed Random Forest classifiers that can distinguish reproductive phasiRNAs from other sRNAs in complex sets of sequencing data, using sequence-based features (k-mers) and features describing position-specific sequence biases. The classifiers attained > 80% accuracy, sensitivity, specificity, and positive predictive value. Feature selection identified important features at both ends of phasiRNAs. We demonstrated that phasiRNAs have strand specificity and position-specific nucleotide biases potentially influencing AGO sorting; we also predicted targets to infer functions of phasiRNAs, and computationally assessed their sequence characteristics relative to other sRNAs. Our results demonstrate that machine-learning methods effectively identify phasiRNAs despite the lack of characteristic features typically present in precursor loci of other small RNAs, such as sequence conservation or structural motifs. The 5'-end features we identified provide insights into AGO-phasiRNA interactions. We describe a hypothetical model of competition for AGO loading between phasiRNAs of different nucleotide compositions.
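
The two feature families named in the abstract (k-mer composition and position-specific nucleotide identity) and the four reported performance measures can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the choice of k = 2, the three-nucleotide window at each end, and the one-hot encoding are all assumptions made for illustration, and the sketch assumes each sequence is at least as long as the end window. The resulting vectors could be passed to any Random Forest implementation.

```python
from itertools import product

NUCLEOTIDES = "ACGU"


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in an RNA sequence, in a fixed k-mer order."""
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    return [counts[m] for m in kmers]


def positional_one_hot(seq, n_positions=3):
    """One-hot encode the first and last n_positions nucleotides (assumed window)."""
    ends = seq[:n_positions] + seq[-n_positions:]
    features = []
    for base in ends:
        features.extend(1 if base == nt else 0 for nt in NUCLEOTIDES)
    return features


def featurize(seq, k=2, n_positions=3):
    """Concatenate k-mer counts and end-position one-hot features (hypothetical layout)."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style reads
    return kmer_counts(seq, k) + positional_one_hot(seq, n_positions)


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and PPV for binary labels (1 = phasiRNA)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }
```

With these assumed settings each sequence maps to 16 dinucleotide counts followed by 24 positional indicators, so features at the 5' and 3' ends stay separately visible to a feature-selection step.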

KEYWORDS: P4-siRNAs; classification; feature selection; heterochromatic siRNAs; machine learning; miRNAs; plant small RNAs; reproductive phasiRNAs

PMID: 30020552
DOI: 10.1111/nph.15349
