Hum Mutat. 2019 Sep;40(9):1270-1279. doi: 10.1002/humu.23790. Epub 2019 Jun 18.

Using secondary structure to predict the effects of genetic variants on alternative splicing.

Author information

Department of Bioengineering, University of California, Berkeley, California.
Department of Plant and Microbial Biology, University of California, Berkeley, California.


Accurate interpretation of genomic variants that alter RNA splicing is critical to precision medicine. We present a computational framework, Prediction of variant Effect on Percent Spliced In (PEPSI), that predicts the splicing impact of coding and noncoding variants for the Fifth Critical Assessment of Genome Interpretation (CAGI5) "Vex-seq" challenge. PEPSI is a random forest regression model trained on multiple layers of features associated with sequence conservation and regulatory sequence elements. In contrast to other splicing defect prediction tools in the literature, our framework integrates secondary structure information when predicting variants that disrupt splicing regulatory elements (SREs). We applied our model to classify splice-disrupting variants among 2,094 single-nucleotide polymorphisms from the Exome Aggregation Consortium, using the model-predicted change in percent spliced in (ΔPSI) for each tested variant. Benchmarking our model against widely used state-of-the-art tools, we demonstrate that PEPSI achieves comparable sensitivity and precision. We also show that secondary structure context can help resolve several cases where changes in SRE counts do not correspond with the direction of the ΔPSI measured for tested variants.
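The quantities named in the abstract can be illustrated with a minimal sketch. It assumes PSI is expressed as a percentage computed from inclusion and exclusion read counts, and that a variant is called splice-disrupting when the magnitude of its ΔPSI meets a cutoff. The function names, the read counts, and the cutoff of 5 are hypothetical and chosen only for illustration; the abstract does not state the threshold or scoring code that PEPSI uses.

```python
from typing import Sequence, Tuple


def psi(inclusion_reads: int, exclusion_reads: int) -> float:
    """Percent spliced in: share of reads supporting exon inclusion (0-100)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads support either isoform")
    return 100.0 * inclusion_reads / total


def delta_psi(psi_variant: float, psi_reference: float) -> float:
    """Change in PSI caused by the variant; negative means more exon skipping."""
    return psi_variant - psi_reference


def is_splice_disrupting(dpsi: float, threshold: float = 5.0) -> bool:
    """Call a variant splice-disrupting if |dPSI| meets a (hypothetical) cutoff."""
    return abs(dpsi) >= threshold


def sensitivity_and_precision(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); precision = TP / (TP + FP)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    # Hypothetical read counts for one exon, reference vs. variant allele.
    reference = psi(inclusion_reads=80, exclusion_reads=20)
    variant = psi(inclusion_reads=60, exclusion_reads=40)
    change = delta_psi(variant, reference)
    print(change, is_splice_disrupting(change))
```

In the framework described above, the ΔPSI passed to the classification step would come from the regression model's prediction rather than from read counts; the thresholding and the sensitivity and precision calculation are the same either way.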


Keywords: CAGI; RNA secondary structure; alternative splicing; splice-disrupting variants; splicing regulatory elements

