SON: SON DNA and RNA binding protein [Homo sapiens (human)]

Gene ID: 6651, updated on 3-May-2025
Official Symbol
SON (provided by HGNC)
Official Full Name
SON DNA and RNA binding protein (provided by HGNC)
Primary source
HGNC:HGNC:11183
See related
Ensembl:ENSG00000159140; MIM:182465; AllianceGenome:HGNC:11183
Gene type
protein coding
RefSeq status
REVIEWED
Organism
Homo sapiens
Lineage
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
Also known as
SON3; BASS1; DBP-5; NREBP; TOKIMS; C21orf50
Summary
This gene encodes a protein that contains multiple simple repeats. The encoded protein binds RNA and promotes pre-mRNA splicing, particularly of transcripts with poor splice sites. The protein also recognizes a specific DNA sequence found in the human hepatitis B virus (HBV) and represses HBV core promoter activity. There is a pseudogene for this gene on chromosome 1. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Jul 2013]
Expression
Ubiquitous expression in bone marrow (RPKM 43.7), lymph node (RPKM 33.4) and 25 other tissues
Location:
21q22.11
Exon count:
15
Annotation release    Status               Assembly                           Chr    Location
RS_2024_08            current              GRCh38.p14 (GCF_000001405.40)      21     NC_000021.9 (33543038..33577481)
RS_2024_08            current              T2T-CHM13v2.0 (GCF_009914755.1)    21     NC_060945.1 (31924877..31959322)
RS_2024_09            previous assembly    GRCh37.p13 (GCF_000001405.25)      21     NC_000021.8 (34915344..34949787)
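
The placements listed above can be compared across assemblies by computing the span each one covers. The following is a minimal Python sketch, assuming the ranges are 1-based and inclusive (the usual GenBank-style convention). `Placement` and `parse_location` are hypothetical helper names introduced for illustration and are not part of any NCBI tool.

```python
import re
from typing import NamedTuple


class Placement(NamedTuple):
    """One genomic placement: sequence accession plus start/end coordinates."""
    accession: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive, hence the +1.
        return self.end - self.start + 1


# Matches strings such as "NC_000021.9 (33543038..33577481)".
_LOCATION = re.compile(r"([A-Z]{2}_\d+\.\d+)\s*\((\d+)\.\.(\d+)\)")


def parse_location(text: str) -> Placement:
    """Extract accession and coordinates from a location string."""
    match = _LOCATION.search(text)
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    accession, start, end = match.groups()
    return Placement(accession, int(start), int(end))


# Location strings copied from the table above, keyed by assembly name.
LOCATIONS = {
    "GRCh38.p14": "NC_000021.9 (33543038..33577481)",
    "T2T-CHM13v2.0": "NC_060945.1 (31924877..31959322)",
    "GRCh37.p13": "NC_000021.8 (34915344..34949787)",
}

for assembly, text in LOCATIONS.items():
    placement = parse_location(text)
    print(f"{assembly}: {placement.accession} spans {placement.length} bp")
```

Under the inclusive-coordinate assumption, the GRCh38.p14 and GRCh37.p13 placements cover the same number of bases, while the T2T-CHM13v2.0 placement is two bases longer.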

Chromosome 21 - NC_000021.9

Genomic context describing neighboring genes:

  • DnaJ heat shock protein family (Hsp40) member C28
  • phosphoribosylglycinamide formyltransferase, phosphoribosylglycinamide synthetase, phosphoribosylaminoimidazole synthetase
  • BRD4-independent group 4 enhancer GRCh37_chr21:34903402-34904601
  • basic transcription factor 3 pseudogene 6
  • ATAC-STARR-seq lymphoblastoid silent region 13260
  • NANOG-H3K27ac-H3K4me1 hESC enhancer GRCh37_chr21:34914753-34915484
  • ATAC-STARR-seq lymphoblastoid silent region 13261
  • microRNA 6501
  • H3K27ac hESC enhancer GRCh37_chr21:34960679-34961178
  • DNA replication fork stabilization factor DONSON
  • crystallin zeta like 1
  • ATAC-STARR-seq lymphoblastoid active region 18382
  • ATAC-STARR-seq lymphoblastoid active region 18383

  • Project title: Tissue-specific circular RNA induction during human fetal development
  • Description: 35 human fetal samples from 6 tissues (3 - 7 replicates per tissue) collected between 10 and 20 weeks gestational time were sequenced using Illumina TruSeq Stranded Total RNA
  • BioProject: PRJNA270632
  • Publication: PMID 26076956
  • Analysis date: Mon Apr 2 22:54:59 2018

GeneRIFs: Gene References Into Functions

Associated conditions

Description Tests
ZTTK syndrome
MedGen: C4310696; OMIM: 617140; GeneReviews: Not available

Copy number response

Description           Copy number response
Triplosensitivity     No evidence available (Last evaluated 2019-04-24)
Haploinsufficiency    Sufficient evidence for dosage pathogenicity (Last evaluated 2019-04-24)

Source: ClinGen Genome Curation Page

Protein interactions

Protein      Gene    Interaction                                                                                     Pubs
Pr55(Gag)    gag     HIV-1 Gag interacts with SON as demonstrated by proximity dependent biotinylation proteomics    PubMed

Markers

Clone Names

  • FLJ21099, FLJ33914, KIAA1019

Gene Ontology Provided by GOA

Function                        Evidence Code                                        Pubs
enables DNA binding             IEA (Inferred from Electronic Annotation)
enables RNA binding             HDA                                                  PubMed
enables RNA binding             IBA (Inferred from Biological aspect of Ancestor)
enables RNA binding             IDA (Inferred from Direct Assay)                     PubMed
enables RNA binding             IEA (Inferred from Electronic Annotation)
enables nucleic acid binding    IEA (Inferred from Electronic Annotation)
enables protein binding         IPI (Inferred from Physical Interaction)             PubMed

Process                                                     Evidence Code                                        Pubs
involved_in RNA splicing                                    IEA (Inferred from Electronic Annotation)
involved_in mRNA processing                                 IDA (Inferred from Direct Assay)                     PubMed
involved_in mRNA processing                                 IEA (Inferred from Electronic Annotation)
acts_upstream_of microtubule cytoskeleton organization      IMP (Inferred from Mutant Phenotype)                 PubMed
acts_upstream_of mitotic cytokinesis                        IMP (Inferred from Mutant Phenotype)                 PubMed
involved_in negative regulation of apoptotic process        IDA (Inferred from Direct Assay)                     PubMed
involved_in regulation of RNA splicing                      IEA (Inferred from Electronic Annotation)
involved_in regulation of RNA splicing                      IMP (Inferred from Mutant Phenotype)                 PubMed
involved_in regulation of cell cycle                        IEA (Inferred from Electronic Annotation)
acts_upstream_of regulation of cell cycle                   IMP (Inferred from Mutant Phenotype)                 PubMed
involved_in regulation of mRNA splicing, via spliceosome    IBA (Inferred from Biological aspect of Ancestor)
involved_in regulation of mRNA splicing, via spliceosome    IDA (Inferred from Direct Assay)                     PubMed

Component                   Evidence Code                                Pubs
located_in nuclear speck    IDA (Inferred from Direct Assay)             PubMed
located_in nuclear speck    IEA (Inferred from Electronic Annotation)
located_in nucleus          IEA (Inferred from Electronic Annotation)

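The evidence codes above record how each annotation was derived, so the same term can carry both experimental and purely computational support. The following Python sketch shows one way to list terms backed only by electronic annotation, using rows transcribed from the Function and Component tables. The grouping in `EVIDENCE_CLASS` and the function name `electronic_only` are assumptions made for illustration, not part of the Gene record or of any GO library.

```python
from collections import defaultdict

# (qualifier, term, evidence code) rows transcribed from the Function and
# Component tables above.
ANNOTATIONS = [
    ("enables", "DNA binding", "IEA"),
    ("enables", "RNA binding", "HDA"),
    ("enables", "RNA binding", "IBA"),
    ("enables", "RNA binding", "IDA"),
    ("enables", "RNA binding", "IEA"),
    ("enables", "nucleic acid binding", "IEA"),
    ("enables", "protein binding", "IPI"),
    ("located_in", "nuclear speck", "IDA"),
    ("located_in", "nuclear speck", "IEA"),
    ("located_in", "nucleus", "IEA"),
]

# Assumed grouping of the evidence codes that appear on this page.
EVIDENCE_CLASS = {
    "IDA": "experimental",
    "IMP": "experimental",
    "IPI": "experimental",
    "HDA": "high-throughput",
    "IBA": "phylogenetic",
    "IEA": "electronic",
}


def electronic_only(annotations):
    """Return the sorted terms whose every annotation is electronic (IEA)."""
    classes_by_term = defaultdict(set)
    for _qualifier, term, code in annotations:
        classes_by_term[term].add(EVIDENCE_CLASS[code])
    return sorted(
        term for term, classes in classes_by_term.items()
        if classes == {"electronic"}
    )


print(electronic_only(ANNOTATIONS))
```

For the rows listed here, RNA binding, protein binding and nuclear speck localization each have at least one non-electronic annotation, whereas DNA binding, nucleic acid binding and nucleus appear with IEA only.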
Preferred Names
protein SON
Names
Bax antagonist selected in Saccharomyces 1
NRE-binding protein
SON DNA binding protein
negative regulatory element-binding protein

RefSeqs maintained independently of Annotated Genomes

These reference sequences exist independently of genome builds.

These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by comparing the version of the RefSeq in this section to the one reported in Genomic regions, transcripts, and products above.
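
The comparison described above amounts to splitting each `accession.version` identifier and checking whether the same accession carries a different version in the two lists. A minimal Python sketch follows. The function names are hypothetical, and the `annotated` list in the example is invented for illustration because the "Genomic regions, transcripts, and products" section is not reproduced here.

```python
from __future__ import annotations


def split_accession(acc_ver: str) -> tuple[str, int]:
    """Split 'NM_138927.4' into ('NM_138927', 4)."""
    accession, sep, version = acc_ver.rpartition(".")
    if not sep or not accession or not version.isdigit():
        raise ValueError(f"expected ACCESSION.VERSION, got {acc_ver!r}")
    return accession, int(version)


def find_version_mismatches(
    curated: list[str], annotated: list[str]
) -> dict[str, tuple[int, int]]:
    """Map accession -> (curated version, annotated version) where they differ.

    Accessions present in only one of the two lists are ignored.
    """
    annotated_versions = dict(split_accession(item) for item in annotated)
    mismatches = {}
    for item in curated:
        accession, version = split_accession(item)
        other = annotated_versions.get(accession)
        if other is not None and other != version:
            mismatches[accession] = (version, other)
    return mismatches


# Curated versions as listed in this section.
curated = ["NM_001291411.2", "NM_001291412.3", "NM_032195.3", "NM_138927.4"]
# Hypothetical versions for an annotated genome build (illustrative only).
annotated = ["NM_001291412.2", "NM_138927.4"]

print(find_version_mismatches(curated, annotated))
```

Keeping the version as an integer avoids string comparisons such as "10" sorting before "9" if the result is later ordered or filtered.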

Genomic

  1. NG_052981.2 RefSeqGene

    Range
    5002..39445

mRNA and Protein(s)

  1. NM_001291411.2 → NP_001278340.2  protein SON isoform E

    Status: REVIEWED

    Description
    Transcript Variant: This variant (e) lacks multiple 3' coding exons and contains an alternate 3' exon, resulting in a distinct 3' coding region and 3' UTR, compared to variant f. It encodes isoform E which is shorter and has a distinct C-terminus, compared to isoform F.
    Source sequence(s)
    AP000303
    Consensus CDS
    CCDS74784.1
    Related
    ENSP00000371095.4, ENST00000381679.8
    Conserved Domains (4)
    PHA03247
    Location: 170..460
    PHA03247; large tegument protein UL36; Provisional
    PHA03379
    Location: 340..673
    PHA03379; EBNA-3A; Provisional
    PRK10811
    Location: 1289..1481
    rne; ribonuclease E; Reviewed
    NF000535
    Location: 730..903
    MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
  2. NM_001291412.3 → NP_001278341.1  protein SON isoform H

    Status: REVIEWED

    Description
    Transcript Variant: This variant (h) represents the allele encoded by the GRCh38 reference genome and encodes isoform (H).
    Source sequence(s)
    AP000303, AP000304
    Consensus CDS
    CCDS77624.1
    UniProtKB/TrEMBL
    A0A994J4Y9, J3QSZ5
    Related
    ENSP00000371111.2, ENST00000381692.6
    Conserved Domains (2)
    pfam01585
    Location: 333..376
    G-patch; G-patch domain
    cl00054
    Location: 398..441
    DSRM; Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, ...
  3. NM_001412132.1 → NP_001399061.1  protein SON isoform I

    Status: REVIEWED

    Description
    Transcript Variant: This variant (i) uses the same exon combination as variant h but represents the allele encoded by the T2T genome assembly. The encoded isoform (I) has a slightly different sequence in the C-terminal region compared to isoform H.
    Source sequence(s)
    CP068257
    UniProtKB/TrEMBL
    A0A994J4Y9, Q6ZRV7
  4. NM_001412133.1 → NP_001399062.1  protein SON isoform J

    Status: REVIEWED

    Description
    Transcript Variant: This variant (j) uses the same exon combination as variant f but represents the allele encoded by the T2T genome assembly. The encoded isoform (J) has a slightly different sequence in the C-terminal region compared to isoform F.
    Source sequence(s)
    CP068257
  5. NM_032195.3 → NP_115571.3  protein SON isoform B

    Status: REVIEWED

    Description
    Transcript Variant: This variant (b) lacks multiple 3' coding exons and contains an alternate 3' exon, resulting in a distinct 3' coding region and 3' UTR, compared to variant f. The encoded isoform (B) is shorter and has a distinct C-terminus, compared to isoform F.
    Source sequence(s)
    AP000303, AP000304
    Consensus CDS
    CCDS13631.1
    Related
    ENSP00000300278.2, ENST00000300278.8
    Conserved Domains (4)
    PHA03247
    Location: 170..460
    PHA03247; large tegument protein UL36; Provisional
    PHA03379
    Location: 340..673
    PHA03379; EBNA-3A; Provisional
    PRK10811
    Location: 1289..1481
    rne; ribonuclease E; Reviewed
    NF000535
    Location: 730..903
    MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
  6. NM_138927.4 → NP_620305.3  protein SON isoform F

    Status: REVIEWED

    Description
    Transcript Variant: This variant (f) represents the allele encoded by the GRCh38 reference genome and encodes isoform (F).
    Source sequence(s)
    AF380184, AK307612, AP000303
    Consensus CDS
    CCDS13629.1
    UniProtKB/Swiss-Prot
    D3DSF5, D3DSF6, E7ETE8, E7EU67, E7EVW3, E9PFQ2, O14487, O95981, P18583, Q14120, Q6PKE0, Q9H7B1, Q9P070, Q9P072, Q9UKP9, Q9UPY0
    Related
    ENSP00000348984.4, ENST00000356577.10
    Conserved Domains (6)
    PHA03247
    Location: 170..460
    PHA03247; large tegument protein UL36; Provisional
    PHA03379
    Location: 340..673
    PHA03379; EBNA-3A; Provisional
    PRK10811
    Location: 1289..1481
    rne; ribonuclease E; Reviewed
    NF000535
    Location: 730..903
    MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
    pfam01585
    Location: 2305..2349
    G-patch; G-patch domain
    cl00054
    Location: 2369..2419
    DSRM_SF; double-stranded RNA binding motif (DSRM) superfamily

RNA

  1. NR_103797.2 RNA Sequence

    Status: REVIEWED

    Description
    Transcript Variant: This variant (c) contains an alternate internal exon and uses an alternate splice site at the 3' exon, compared to variant f. This variant is represented as non-coding because the use of the 5'-most expected translational start codon, as used in variant f, renders the transcript a candidate for nonsense-mediated mRNA decay (NMD).
    Source sequence(s)
    AP000303, AP000304
    Related
    ENST00000455528.5

RefSeqs of Annotated Genomes: GCF_000001405.40-RS_2024_08

The following sections contain reference sequences that belong to a specific genome build.

Reference GRCh38.p14 Primary Assembly

Genomic

  1. NC_000021.9 Reference GRCh38.p14 Primary Assembly

    Range
    33543038..33577481

Alternate T2T-CHM13v2.0

Genomic

  1. NC_060945.1 Alternate T2T-CHM13v2.0

    Range
    31924877..31959322

Suppressed Reference Sequence(s)

The following Reference Sequences have been suppressed.

  1. NM_003103.5: Suppressed sequence

    Description
    NM_003103.5: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.
  2. NM_138925.1: Suppressed sequence

    Description
    NM_138925.1: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.
  3. NR_103796.1: Suppressed sequence

    Description
    NR_103796.1: This RefSeq was permanently suppressed because it is now thought that this transcript variant does encode a protein.
  4. NR_103798.1: Suppressed sequence

    Description
    NR_103798.1: This RefSeq was temporarily suppressed because currently there is not sufficient data to support this transcript.