Regulated splicing of large exons is linked to phase-separation of vertebrate transcription factors

EMBO J. 2021 Nov 15;40(22):e107485. doi: 10.15252/embj.2020107485. Epub 2021 Oct 4.

Abstract

Although large exons cannot be readily recognized by the spliceosome, many are evolutionarily conserved and constitutively spliced for inclusion in the processed transcript. Furthermore, whether large exons may be enriched in a certain subset of proteins, or mediate specific functions, has remained unclear. Here, we identify a set of nearly 3,000 SRSF3-dependent large constitutive exons (S3-LCEs) in human and mouse cells. These exons are enriched for cytidine-rich sequence motifs, which bind and recruit the splicing factors hnRNP K and SRSF3. We find that hnRNP K suppresses S3-LCE splicing, an effect that is mitigated by SRSF3 to thus achieve constitutive splicing of S3-LCEs. S3-LCEs are enriched in genes for components of transcription machineries, including mediator and BAF complexes, and frequently contain intrinsically disordered regions (IDRs). In a subset of analyzed S3-LCE-containing transcription factors, SRSF3 depletion leads to deletion of the IDRs due to S3-LCE exon skipping, thereby disrupting phase-separated assemblies of these factors. Cytidine enrichment in large exons introduces proline/serine codon bias in intrinsically disordered regions and appears to have been evolutionarily acquired in vertebrates. We propose that layered splicing regulation by hnRNP K and SRSF3 ensures proper phase-separation of these S3-LCE-containing transcription factors in vertebrates.
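The link between cytidine enrichment and proline/serine codon bias follows from the standard genetic code: all four proline codons (CCN) contain at least two cytidines, and four of the six serine codons (UCN) have cytidine at the second position. The sketch below is only a toy illustration of that relationship, not the authors' analysis pipeline. The function names and the two example sequences are hypothetical, and the reading frame is assumed to start at the first base.

```python
# Toy illustration: cytidine content vs. proline/serine codon usage.
# All names and sequences here are hypothetical, not taken from the paper.

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cytidine_fraction(seq):
    """Fraction of bases in a DNA sequence that are C."""
    seq = seq.upper()
    return seq.count("C") / len(seq)


def pro_ser_fraction(seq):
    """Fraction of in-frame codons (frame assumed to start at base 0)
    that encode proline (P) or serine (S)."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    hits = sum(1 for codon in codons if CODON_TABLE[codon] in "PS")
    return hits / len(codons)


def mean_c_per_codon():
    """Average number of C bases per codon for each amino acid."""
    totals, counts = {}, {}
    for codon, aa in CODON_TABLE.items():
        totals[aa] = totals.get(aa, 0) + codon.count("C")
        counts[aa] = counts.get(aa, 0) + 1
    return {aa: totals[aa] / counts[aa] for aa in totals}


# Two made-up 15-nt stretches: one C-rich, one C-poor.
c_rich = "CCCTCCCCATCGCCT"   # codons: CCC TCC CCA TCG CCT
c_poor = "GAAAAAGATGGTATG"   # codons: GAA AAA GAT GGT ATG

for name, seq in [("C-rich", c_rich), ("C-poor", c_poor)]:
    print(name, round(cytidine_fraction(seq), 2), pro_ser_fraction(seq))

# Amino acid whose codons carry the most C on average.
c_load = mean_c_per_codon()
print(max(c_load, key=c_load.get), c_load["P"], c_load["S"])
```

On real data, the same two measures would be computed per exon from annotated coding sequences in the correct reading frame; the toy sequences only show the direction of the effect.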

Keywords: evolution; intrinsically disordered region; large exon; splicing; transcription factors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cell Line
  • Cytidine / genetics
  • Evolution, Molecular
  • Exons*
  • Heterogeneous-Nuclear Ribonucleoprotein K / genetics
  • Heterogeneous-Nuclear Ribonucleoprotein K / metabolism
  • Humans
  • Intrinsically Disordered Proteins / genetics
  • Intrinsically Disordered Proteins / metabolism
  • Mice
  • Polyadenylation
  • RNA Splicing
  • RNA-Binding Proteins / genetics
  • Serine-Arginine Splicing Factors / genetics*
  • Serine-Arginine Splicing Factors / metabolism
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism
  • Vertebrates / genetics*

Substances

  • Heterogeneous-Nuclear Ribonucleoprotein K
  • Intrinsically Disordered Proteins
  • RNA-Binding Proteins
  • SRSF3 protein, human
  • Srsf3 protein, mouse
  • Transcription Factors
  • Serine-Arginine Splicing Factors
  • Cytidine

Associated data

  • GEO/GSE161601
  • GEO/GSE161602