HydRA: Deep-learning models for predicting RNA-binding capacity from protein interaction association context and protein sequence

Mol Cell. 2023 Jul 20;83(14):2595-2611.e11. doi: 10.1016/j.molcel.2023.06.019. Epub 2023 Jul 7.

Abstract

RNA-binding proteins (RBPs) control RNA metabolism to orchestrate gene expression and, when dysfunctional, underlie human diseases. Proteome-wide discovery efforts predict thousands of RBP candidates, many of which lack canonical RNA-binding domains (RBDs). Here, we present a hybrid ensemble RBP classifier (HydRA), which leverages information from both intermolecular protein interactions and internal protein sequence patterns to predict RNA-binding capacity with unparalleled specificity and sensitivity using support vector machines (SVMs), convolutional neural networks (CNNs), and Transformer-based protein language models. Occlusion mapping by HydRA robustly detects known RBDs and predicts hundreds of uncharacterized RNA-binding associated domains. Enhanced CLIP (eCLIP) for HydRA-predicted RBP candidates reveals transcriptome-wide RNA targets and confirms RNA-binding activity for HydRA-predicted RNA-binding associated domains. HydRA accelerates construction of a comprehensive RBP catalog and expands the diversity of RNA-binding associated domains.
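
The occlusion mapping mentioned above can be illustrated with a minimal sketch. This is not HydRA's implementation: the function names, the window size, the `X` mask character, and the toy scoring function are all assumptions made for illustration. The idea is to mask a sliding window of residues, re-score the sequence, and attribute the resulting drop in predicted score to the masked positions, so that regions whose removal lowers the score most stand out as candidate RNA-binding associated regions.

```python
from typing import Callable, List


def occlusion_map(
    seq: str,
    score: Callable[[str], float],
    window: int = 5,
    mask: str = "X",
) -> List[float]:
    """Return, per residue, the mean score drop over all windows covering it."""
    base = score(seq)
    drops = [0.0] * len(seq)
    counts = [0] * len(seq)
    for start in range(0, len(seq) - window + 1):
        # Replace the window with mask characters and re-score the sequence.
        occluded = seq[:start] + mask * window + seq[start + window:]
        delta = base - score(occluded)
        for i in range(start, start + window):
            drops[i] += delta
            counts[i] += 1
    return [d / c if c else 0.0 for d, c in zip(drops, counts)]


def toy_score(seq: str) -> float:
    # Hypothetical stand-in for a trained classifier: fraction of K/R residues.
    # A real model would return a predicted RNA-binding probability instead.
    return sum(ch in "KR" for ch in seq) / max(len(seq), 1)


if __name__ == "__main__":
    profile = occlusion_map("AAAAAKRKRKAAAAA", toy_score, window=3)
    peak = profile.index(max(profile))
    print(f"highest occlusion score at residue index {peak}")
```

With a real classifier in place of `toy_score`, the same loop yields a per-residue profile that can be thresholded to propose candidate regions. Averaging over overlapping windows is one of several possible design choices.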

Keywords: RNA-binding domains; RNA-binding proteins; machine learning; protein-protein interaction network.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Binding Sites / genetics
  • Deep Learning*
  • Humans
  • Hydra* / genetics
  • Hydra* / metabolism
  • Protein Binding
  • RNA / metabolism

Substances

  • RNA