Hum Mutat. 2019 Sep;40(9):1261-1269. doi: 10.1002/humu.23794. Epub 2019 Jun 23.

Predicting the impact of single nucleotide variants on splicing via sequence-based deep neural networks and genomic features.

Author information

1. Department of Neurology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.

Abstract

Single nucleotide mutations in exonic regions can significantly affect gene function by disrupting splicing, and various computational methods have been developed to predict the splicing-related effects of such mutations. We implemented a new ensemble learning method that combines two types of predictive models: (a) sequence-based deep neural networks (DNNs) and (b) machine learning models based on genomic attributes. We applied this method to the Massively Parallel Splicing Assay challenge of the Fifth Critical Assessment of Genome Interpretation, in which participants predicted various experimentally defined exonic splicing mutations, and it achieved a promising result. Combining different predictive models through stacked generalization led to a significant improvement in prediction performance. In addition, whereas most of the genomic features used to construct the machine learning models had been reported previously, the feature values generated with DSSP, a DNN-based splice site prediction tool, were novel and helpful for the prediction. We presume that learning the sequence patterns associated with normal splicing, together with the change in splice site probabilities caused by a mutation, helps in predicting splicing disruption.
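The stacked generalization idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the two base models are simple one-feature linear fits standing in for a sequence-based model and a genomic-feature model, and the meta-learner is a single blend weight. All names (`seq_score`, `feat_score`, `fit_blend`, and so on) are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def predict_line(model, xs):
    a, b = model
    return [a * x + b for x in xs]

def out_of_fold(xs, ys, k):
    """Predict each sample with a base model trained on the other folds."""
    n = len(xs)
    oof = [0.0] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = predict_line(model, [xs[i] for i in test])
        for i, p in zip(test, preds):
            oof[i] = p
    return oof

def fit_blend(pa, pb, ys):
    """Meta-learner: weight w minimizing squared error of w*pa + (1-w)*pb."""
    num = sum((y - b) * (a - b) for a, b, y in zip(pa, pb, ys))
    den = sum((a - b) ** 2 for a, b in zip(pa, pb))
    return num / den if den else 0.5

def mse(pred, ys):
    return sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)

# Synthetic stand-ins (assumption): one score per variant from each model family.
rng = random.Random(0)
seq_score = [rng.uniform(-1, 1) for _ in range(200)]
feat_score = [rng.uniform(-1, 1) for _ in range(200)]
target = [0.6 * s + 0.4 * f + rng.gauss(0, 0.05)
          for s, f in zip(seq_score, feat_score)]

# Level 0: out-of-fold predictions from each base model.
oof_a = out_of_fold(seq_score, target, k=5)
oof_b = out_of_fold(feat_score, target, k=5)

# Level 1: fit the blend weight on the out-of-fold predictions.
w = fit_blend(oof_a, oof_b, target)
stacked = [w * a + (1 - w) * b for a, b in zip(oof_a, oof_b)]

# Errors are measured on the same samples used to fit w, for illustration only.
print("base A MSE:", mse(oof_a, target))
print("base B MSE:", mse(oof_b, target))
print("stacked MSE:", mse(stacked, target))
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for samples the base models were fitted to, is what keeps the second level from simply rewarding whichever base model overfits most. A real evaluation would score the stacked model on a separate held-out set.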

KEYWORDS: CAGI; deep neural networks; ensemble learning; single nucleotide variant; splicing

PMID: 31090248
DOI: 10.1002/humu.23794
