Conserved domains on  [gi|7839187|ref|NP_058171.1|]

gag-pol fusion protein [Saccharomyces cerevisiae S288c]

List of domain hits

Name Accession Description Interval E-value
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1606-1742 7.74e-28

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI domains of LTR retroelements are classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because its inactivation inhibits reverse transcription.


Pssm-ID: 260004  Cd Length: 140  Bit Score: 112.18  E-value: 7.74e-28
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187  1606 VAISDASYGNQPY-YKSQIGNIFLLNGKVIGGKSTKASLTCTSTTEAEIHAVSEAIPLLNNLSHLVQEL---NKKPIIkg 1681
Cdd:cd09272    1 EGYSDADWAGDPDdRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELgipLDGPTT-- 78
                         90       100       110       120       130       140
                 ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 7839187  1682 LLTDSRSTISIIKStneEKF--RNRFFGTKAMRLRDEVSGNNLYVYYIETKKNIADVMTKPLP 1742
Cdd:cd09272   79 IYCDNQSAIALAKN---PVFhsRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLP 138
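The pairwise rows above can be summarized numerically. The following is a minimal Python sketch, not part of the CD-Search output, that counts aligned and identical columns in the first block of the cd09272 alignment. It assumes that lowercase letters mark unaligned residues and that '-' marks a gap; the function name alignment_stats is hypothetical.

```python
def alignment_stats(query: str, subject: str) -> dict:
    """Count aligned and identical columns in two equal-length aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    aligned = 0
    identical = 0
    for q, s in zip(query, subject):
        # Assumption: '-' is a gap and lowercase letters are unaligned residues,
        # so neither kind of column counts toward the aligned total.
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    identity = identical / aligned if aligned else 0.0
    return {"aligned": aligned, "identical": identical, "identity": identity}


# First block of the cd09272 alignment, copied from the rows above.
QUERY_ROW = (
    "VAISDASYGNQPY-YKSQIGNIFLLNGKVIGGKSTKASLTCTSTTEAEIHAVSEAIPLLNNLSHLVQEL---NKKPIIkg"
)
CDD_ROW = (
    "EGYSDADWAGDPDdRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELgipLDGPTT--"
)

stats = alignment_stats(QUERY_ROW, CDD_ROW)
print(f"aligned columns: {stats['aligned']}")
print(f"identical columns: {stats['identical']}")
print(f"identity over aligned columns: {stats['identity']:.1%}")
```

Identity computed this way covers only one block of the alignment; concatenating both blocks before calling the function would give a figure for the whole hit.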
TYA pfam01021
TYA transposon protein; Ty are yeast transposons. A 5.7kb transcript codes for p3 a fusion ...
17-114 8.34e-62

TYA transposon protein; Ty are yeast transposons. A 5.7 kb transcript codes for p3, a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA is cleaved to form a 46 kDa protein that can form mature virion-like particles.


Pssm-ID: 144563  Cd Length: 98  Bit Score: 207.98  E-value: 8.34e-62
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      17 ACASVTSKEVHTNQDPLDVSASKTEECEKASTKANSQQTTTPASSAVPENPHHASPQPASVPPPQNGPYPQQCMMTQNQA 96
Cdd:pfam01021    1 ACASVTSKEVHTNQDPLDVSASKLPEYDKDSTKANSQQETTPGSSAVPENHHHASPQTAQVPLPQNGPYQQQCMMTPNQA 80
                           90
                   ....*....|....*...
gi 7839187      97 NPSGWSFYGHPSMIPYTP 114
Cdd:pfam01021   81 NPSGWSVYGHPSMMPYTP 98
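Each hit carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value). The following Python sketch shows one way to read such a line and to estimate how much of the domain model the alignment spans. The dictionary keys, the helper names parse_summary and model_coverage, and the coverage definition (aligned model span divided by Cd Length) are assumptions made for illustration, not part of the CD-Search output.

```python
import re

# Pattern for the per-hit summary line; the bracketed tag is optional because
# only some hits in the listing carry one.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<tag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line: str) -> dict:
    """Turn one summary line into typed fields."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    return {
        "pssm_id": int(match["pssm_id"]),
        "tag": match["tag"],  # None when no bracketed tag is present
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


def model_coverage(first: int, last: int, cd_length: int) -> float:
    """Fraction of the domain model spanned by the aligned positions."""
    return (last - first + 1) / cd_length


# Summary line for the TYA hit, copied from the listing above.
tya = parse_summary(
    "Pssm-ID: 144563  Cd Length: 98  Bit Score: 207.98  E-value: 8.34e-62"
)
# The Cdd rows for this hit run from model position 1 to 98.
coverage = model_coverage(1, 98, tya["cd_length"])
print(tya)
print(f"model coverage: {coverage:.0%}")
```

For the TYA hit the model rows run from position 1 to 98 of a 98-residue model, so the span covers the whole model under this definition.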
RVT_2 super family cl06662
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
1265-1494 4.87e-35

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.


The actual alignment was detected with superfamily member pfam07727:

Pssm-ID: 254387  Cd Length: 246  Bit Score: 136.43  E-value: 4.87e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187    1265 NTWDtdkyydrkEIDP---KRVINSMFIFNKKR--DGT---HKARFVARGDIQHP-----DTYdtgmqSNTVHHYALMTS 1331
Cdd:pfam07727    2 KTWE--------LVPLpkgKKPIGCKWVFKIKYnsDGEierYKARLVAKGFTQKEgidydETF-----SPVAKLTTIRLL 68
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187    1332 LSLALDNNYYITQLDISSAYLYADIKEELYIRPPP---HLGMNDKLIRLKKSHYGLKQSGANWYETIKSYLiKQCGMEEV 1408
Cdd:pfam07727   69 LALAAQRGWELHQMDVKTAFLNGELEEEVYMKQPPgfeDPGKPNKVCRLKKSLYGLKQAPRAWYQKLSSFL-LKLGFKQS 147
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187    1409 RGWSCVF-KNSQ---VTICLFVDDMILFSKDLNANKKIITTLKKQYDTKiiNLGesdnEIQYdILGLEIKYQRGKYM--- 1481
Cdd:pfam07727  148 EADPCLFvKKSGggiIYLLLYVDDILIAGSNDELIDEFKEELSSEFEMK--DLG----ELKY-FLGIEIKRTSGGIFlsq 220
                          250       260
                   ....*....|....*....|..
gi 7839187    1482 ---------KLGMENSLTEKIP 1494
Cdd:pfam07727  221 rkyakkllkRFGMLDCKPVSTP 242
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
663-782 7.92e-20

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyses the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 250040  Cd Length: 119  Bit Score: 88.16  E-value: 7.92e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187     663 EPFQYLHTDIFgPVHNLPKSAPSYFISFTDETTKFRWVYPLhdRREDSILDVFTTILAFIKNQFQASVlVIQMDRGSEYT 742
Cdd:pfam00665    4 RPNELWQMDIT-PIPISSKGGKKYLLVIVDDFSRFVVAYAL--KSKTDAELVFDLLEAALERRGGKPV-TIHSDNGSEFT 79
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|
gi 7839187     743 NRTLHKFLEKNGITPCYTTTADSRAHGVAERLNRTLLDDC 782
Cdd:pfam00665   80 SKAFQELLKELGIKHSFSRPGNPQDNGVVERFNRTLKEEL 119
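The hit list is ordered by neither position nor score, so the layout of the domains along the protein is easier to see after sorting by interval start. A small Python sketch using the four intervals listed above follows; the tuple layout and variable names are illustrative assumptions.

```python
# (name, start, end) for the four hits, taken from the Interval column above.
hits = [
    ("RNase_HI_RT_Ty1", 1606, 1742),
    ("TYA", 17, 114),
    ("RVT_2", 1265, 1494),
    ("rve", 663, 782),
]

# Sort from N-terminus to C-terminus by the first residue of each interval.
ordered = sorted(hits, key=lambda hit: hit[1])

# Adjacent hits do not overlap if each interval ends before the next begins.
non_overlapping = all(
    prev[2] < nxt[1] for prev, nxt in zip(ordered, ordered[1:])
)

for name, start, end in ordered:
    print(f"{name:<16} {start:>5}-{end:<5} ({end - start + 1} residues)")
print("non-overlapping:", non_overlapping)
```

Sorted this way the order is TYA, rve, RVT_2, RNase_HI_RT_Ty1, that is, the gag-like region first, followed by integrase, reverse transcriptase, and RNase H. This appears consistent with the Ty1/Copia arrangement, in which integrase precedes reverse transcriptase.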
 
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
29-171 2.72e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 255446 [Multi-domain]  Cd Length: 768  Bit Score: 47.69  E-value: 2.72e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      29 NQDPLDVSASKTEECEKASTKANSQQTTTPASSAVPENPHHASPQPASVPPPQNGPypqqcmmtqNQANPSGwsfyghPS 108
Cdd:pfam09606  394 NQGGLGANPMQQGQPGMMSSPSPVPQVQTNQSMPQPPQPSVPSPGGPGSQPPQSVS---------GGMIPSP------PA 458
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 7839187     109 MIPYTPYQMSPMyfpPGPQSQFPQYPSSVGTPLSTPSPESGNTFTDSSSADSDMTSTK---KYVRP 171
Cdd:pfam09606  459 LMPSPSPQMSQS---PASQRTIQQDMVSPGGPLNTPGQSSVNSPANPQEEQLYREKYKqlsKYIEP 521
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
53-146 4.79e-05

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequesters a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9 bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 44.40  E-value: 4.79e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187       53 QQTTTPASSAVPenPHHASPQPASVP--PPQNGPYPQQCMMT---QNQANPSGwsfyGHPSMIPYTPYQMSPMYFPPGPQ 127
Cdd:smart00818   36 HHQIIPVSQQHP--PTHTLQPHHHIPvlPAQQPVVPQQPLMPvpgQHSMTPTQ----HHQPNLPQPAQQPFQPQPLQPPQ 109
                            90
                    ....*....|....*....
gi 7839187      128 SQFPQYPSSVGTPLSTPSP 146
Cdd:smart00818  110 PQQPMQPQPPVHPIPPLPP 128
PHA03247 PHA03247
large tegument protein UL36; Provisional
47-180 5.52e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 43.77  E-value: 5.52e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187     47 STKANSQQTTTPASSAVPENPHHASPQPASVPPPQNGPYPQQCMMTQNQANP-SGWSFYGHPSMIPYTPY--QMSPMYFP 123
Cdd:PHA03247 2894 STESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPtTDPAGAGEPSGAVPQPWlgALVPGRVA 2973
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 7839187    124 PgPQSQFPQYPSSVGTPLSTPSPESGNTFTDSSSADSDMtSTKKYVRPPPM-----LTSPND 180
Cdd:PHA03247 2974 V-PRFRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSL-ALHEETDPPPVslkqtLWPPDD 3033
 
MFMR pfam07777
G-box binding protein MFMR; This region is found to the N-terminus of the pfam00170 ...
37-167 1.07e-04

G-box binding protein MFMR; This region is found N-terminal to the pfam00170 transcription factor domain. It is between 150 and 200 amino acids in length. The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain), whereas the C-terminal half is more polar and has been called the MFMR (multifunctional mosaic region). It has been suggested that this family is composed of three sub-families called A, B and C, classified according to motif composition. It has been suggested that some of these motifs may be involved in mediating protein-protein interactions. The MFMR region contains a nuclear localisation signal in bZIP opaque and GBF-2. The MFMR also contains a transregulatory activity in TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention signals.


Pssm-ID: 254423  Cd Length: 189  Bit Score: 43.68  E-value: 1.07e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      37 ASKTEECEKASTKANSQQTTTP--------ASSAVPenPHHASPQPASVPPPQNGPY---PQQCMMTqnqanpsgwsfyg 105
Cdd:pfam07777    6 EGKPSKSSPKTSVQEDTPTPTVypdwsamqAYYGPR--PPPPYFNSSVASSPQPHPYmwgPQQPMMP------------- 70
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 7839187     106 hpsmiPY-TPYQMSPMY----------FPPGPQSQFPQYPSSVGTPLSTPspesGNTFTDS-SSADSDMTSTKK 167
Cdd:pfam07777   71 -----PYgTPPPYAAMYppggvyahpsMPPGSHPFSPYAMPSAEVPGSTP----LSMETDAkSSDNKDKGSIKK 135
YppG pfam14179
YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which ...
73-148 7.93e-03

YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 115 and 181 amino acids in length. There are two completely conserved residues (F and G) that may be functionally important.


Pssm-ID: 258379  Cd Length: 110  Bit Score: 36.64  E-value: 7.93e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 7839187      73 QPASVPPPQNGPYPQQCMMTQNQANPSgwsfyGHPSMiPYTPYQMSPMYFPPGPQSQFPQYPSSVGTPLSTPSPES 148
Cdd:pfam14179    1 NMYQQNHNPYLPYNQQQQPYQQQPYHQ-----QMPPP-PYSPPQQQQAHFMPPQPQPYPKPSPQQQQPSQFSSFMS 70
DUF1421 pfam07223
Protein of unknown function (DUF1421); This family represents a conserved region approximately ...
52-145 4.33e-05

Protein of unknown function (DUF1421); This family represents a conserved region approximately 350 residues long within a number of plant proteins of unknown function.


Pssm-ID: 254110 [Multi-domain]  Cd Length: 357  Bit Score: 46.47  E-value: 4.33e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      52 SQQTTTPASSAVPENPHHASPQPasvPPPQngPYPQQCMMTQNQANPSgwSFYGHPSMIPYTP------------YQMSP 119
Cdd:pfam07223  133 AQQPQAQQPQPPPQVPQQQQYQS---PPQQ--PQYQQNPPPQAQSAPQ--VSGLYPEESPYQPqsyppneplpssMAMQP 205
                           90       100       110
                   ....*....|....*....|....*....|
gi 7839187     120 MYFPPGPQSQF----PQYPSSVGTPLSTPS 145
Cdd:pfam07223  206 PYSGAPPSQQFygppQPSPYMYGGPGGRPN 235
DUF1421 pfam07223
Protein of unknown function (DUF1421); This family represents a conserved region approximately ...
51-182 2.05e-04

Protein of unknown function (DUF1421); This family represents a conserved region approximately 350 residues long within a number of plant proteins of unknown function.


Pssm-ID: 254110 [Multi-domain]  Cd Length: 357  Bit Score: 44.16  E-value: 2.05e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      51 NSQQTTTPASSAVPENPHHASPQPA-SVPPPQNGPYPQQCMMTQNQANPSGWSFYGHPSMIPYTPYQMSPMYfPPGPQSQ 129
Cdd:pfam07223  109 SVPQQPTPQQEPYYPPPSQPQPPPAqQPQAQQPQPPPQVPQQQQYQSPPQQPQYQQNPPPQAQSAPQVSGLY-PEESPYQ 187
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 7839187     130 FPQYPSSVGTPLSTP--SPESGNTftdSSSADSDMTSTKKYVRPPPMLTSPNDFP 182
Cdd:pfam07223  188 PQSYPPNEPLPSSMAmqPPYSGAP---PSQQFYGPPQPSPYMYGGPGGRPNSGFP 239
DUF1421 pfam07223
Protein of unknown function (DUF1421); This family represents a conserved region approximately ...
52-160 4.48e-04

Protein of unknown function (DUF1421); This family represents a conserved region approximately 350 residues long within a number of plant proteins of unknown function.


Pssm-ID: 254110 [Multi-domain]  Cd Length: 357  Bit Score: 43.01  E-value: 4.48e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      52 SQQTTTPASSA---VPENPHHASPQPASVPPPQNGPYPQQCMMTQNQANPSGwSFYGHPSMIPYT-------PYQMSPMY 121
Cdd:pfam07223  163 QQNPPPQAQSApqvSGLYPEESPYQPQSYPPNEPLPSSMAMQPPYSGAPPSQ-QFYGPPQPSPYMyggpggrPNSGFPSG 241
                           90       100       110
                   ....*....|....*....|....*....|....*....
gi 7839187     122 FPPGPQSQFPQYPSSvGTPLSTPSPESGNTFTDSSSADS 160
Cdd:pfam07223  242 QQPPPSQGQEGYGYS-GPPPSKGNHGSVASYAPQGSSQS 279
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
30-146 5.45e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 251763 [Multi-domain]  Cd Length: 979  Bit Score: 43.52  E-value: 5.45e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      30 QDPLDVSASKTEECEKASTKANSQQTTTPASSAVPENPHHASPQPASVPPPQNGPYPQQCMMTQNQANPSgwsfyGHPSM 109
Cdd:pfam03154  170 QQLLQPQGPPSIQVPPGAALAPSAPPPTPSAQAVPPQGSPIAAQPAPQPQQPSPLSLISAPSLHPQRLPS-----PHPPL 244
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 7839187     110 IPYTPYQMSPMyfPPGPQSQFPQYPSSVGTPlSTPSP 146
Cdd:pfam03154  245 QPQTASQQSPQ--PPAPSSRHPQSSHHGPGP-PMPHA 278
PHA03247 PHA03247
large tegument protein UL36; Provisional
47-180 5.52e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 43.77  E-value: 5.52e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187     47 STKANSQQTTTPASSAVPENPHHASPQPASVPPPQNGPYPQQCMMTQNQANP-SGWSFYGHPSMIPYTPY--QMSPMYFP 123
Cdd:PHA03247 2894 STESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPtTDPAGAGEPSGAVPQPWlgALVPGRVA 2973
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 7839187    124 PgPQSQFPQYPSSVGTPLSTPSPESGNTFTDSSSADSDMtSTKKYVRPPPM-----LTSPND 180
Cdd:PHA03247 2974 V-PRFRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSL-ALHEETDPPPVslkqtLWPPDD 3033
PHA02517 PHA02517
putative transposase OrfB; Reviewed
690-782 6.13e-04

putative transposase OrfB; Reviewed


Pssm-ID: 222853 [Multi-domain]  Cd Length: 277  Bit Score: 42.54  E-value: 6.13e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187    690 FTDETTKFRWVY-----PLHDRR---------EDSILdVFTTILAFIKNQFQASVLVIQMDRGSEYTNRTLHKFLEKNGI 755
Cdd:PHA02517  117 FTYVSTWQGWVYvafiiDVFARRivgwrvsssMDTDF-VLDALEQALWARGRPGGLIHHSDKGSQYVSLAYTQRLKEAGI 195
                          90       100
                  ....*....|....*....|....*..
gi 7839187    756 TPCYTTTADSRAHGVAERLNRTLLDDC 782
Cdd:PHA02517  196 RASTGSRGDSYDNAPAESINGLYKAEV 222
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
52-182 1.27e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 251763 [Multi-domain]  Cd Length: 979  Bit Score: 42.37  E-value: 1.27e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      52 SQQTTTPASSAVPENPHHASPQPASVPPPQNGPyPQQCMMTQNQANPSGWSFYGHP-SMIPYTPYQMSPMYfPPGPQsQF 130
Cdd:pfam03154  261 SRHPQSSHHGPGPPMPHALQQGPVFLQHPSSNP-PQPFGLAQSQVPPLPLPSQAQPhSHTPPSQSALQPQQ-PPREQ-PL 337
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 7839187     131 PQYP--SSVGTPLSTPSPESGN---TFTDSSSADSDMTSTKKYVRPPPMLTSPNDFP 182
Cdd:pfam03154  338 PPAPsmPHIKPPPTTPIPQLPNqshKHPPHLQGPSPFPQMPSNLPPPPALKPLSSLP 394
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
53-132 1.51e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 255543 [Multi-domain]  Cd Length: 806  Bit Score: 42.07  E-value: 1.51e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      53 QQTTTPASSAVPENPHHASPQPASvpPPQNGPYPQQCMMTQNQA---NPSGWSFYGHPSMIPYTPYQMSPMYFPPGPQSQ 129
Cdd:pfam09770  161 AQLQQRQQAPQLPQPPQQVLPQGM--PPRQAAFPQQGPPEQPPGypqPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQ 238

                   ...
gi 7839187     130 FPQ 132
Cdd:pfam09770  239 QPP 241
DUF605 pfam04652
Vta1 like; Vta1 (VPS20-associated protein 1) is a positive regulator of Vps4. Vps4 is an ...
58-170 3.25e-03

Vta1 like; Vta1 (VPS20-associated protein 1) is a positive regulator of Vps4. Vps4 is an ATPase that is required in the multivesicular body (MVB) sorting pathway to dissociate the endosomal sorting complex required for transport (ESCRT). Vta1 promotes correct assembly of Vps4 and stimulates its ATPase activity through its conserved Vta1/SBP1/LIP5 region.


Pssm-ID: 252721 [Multi-domain]  Cd Length: 312  Bit Score: 40.05  E-value: 3.25e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      58 PASSAVPENPHHASPQPASVPPPQNGPYPQQCMMTqnqanPSGWSFYGH-PSMIPYTPYQMSPmyfppgPQSQFPQYPSS 136
Cdd:pfam04652  176 PASASPSDPPSSSPGEPSFPSPPEGPDSPSDSSLP-----PAPSSFQSDtPPSSPEEPTNPSP------PPSPFAPSPPP 244
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 7839187     137 VGTPLSTPSPESGNTFTDSSSA--------DSDMTSTKKYVR 170
Cdd:pfam04652  245 QQQVPPLSTAKPSPPHTSATPApigpitpdDDAIAKAQKHAK 286
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
17-146 3.38e-03

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localisation signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 259412 [Multi-domain]  Cd Length: 237  Bit Score: 39.80  E-value: 3.38e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 7839187      17 ACASVTSKEVHTNQDPLDVSASKTEECEKASTKANSQQTTTPASSAVPENphhaSPQPASVPPPQNGPYPQQCMMTQNQA 96
Cdd:pfam15279   77 SRSPSPASTVSQSVSPGPSPSQRSSPSSSPPSPSKPLISVAPPSKLLSVN----SPPSQPHLPPKGVPPLNPQRPPGSRP 152
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|
gi 7839187      97 NPSGWSFYGHPSMIPYTPYQMSPMYFPPgPQSQFPQYPSSVGTPLSTPSP 146
Cdd:pfam15279  153 PCPASNPMHRPPLSPFLHPPSTPTMPPP-PPGPSPPPPPSGMMPGFPPLP 201
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.14
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
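
The per-hit summary lines above (Pssm-ID, Cd Length, Bit Score, E-value) can be read programmatically and checked against the E-value threshold of 0.01 listed in the preset options. The following is a minimal sketch, not part of CD-Search itself: the function names `parse_summary` and `passes_threshold` are hypothetical, the regular expression is inferred only from the summary lines shown on this page, and treating the threshold as inclusive (`<=`) is an assumption.

```python
import re

# Three summary lines copied verbatim from the hit list on this page.
SUMMARY_LINES = [
    "Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 44.40  E-value: 4.79e-05",
    "Pssm-ID: 254110 [Multi-domain]  Cd Length: 357  Bit Score: 44.16  E-value: 2.05e-04",
    "Pssm-ID: 259412 [Multi-domain]  Cd Length: 237  Bit Score: 39.80  E-value: 3.38e-03",
]

# Pattern inferred from the lines above; the bracketed tag is optional
# because some summary lines on the page omit it.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<kind>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if it does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match["pssm_id"]),
        "kind": match["kind"],  # None when no bracketed tag is present
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


def passes_threshold(hit, threshold=0.01):
    """Assumed rule: a hit is kept when its E-value does not exceed the threshold."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    for line in SUMMARY_LINES:
        hit = parse_summary(line)
        print(hit["pssm_id"], hit["e_value"], passes_threshold(hit))
```

All three sample hits fall below 0.01, which is consistent with their appearing in the list at all.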

References:

  • Marchler-Bauer A et al. (2015), "CDD: NCBI's conserved domain database.", Nucleic Acids Res. 43(D):222-6.
  • Marchler-Bauer A et al. (2011), "CDD: a Conserved Domain Database for the functional annotation of proteins.", Nucleic Acids Res. 39(D):225-9.
  • Marchler-Bauer A et al. (2009), "CDD: specific functional annotation with the Conserved Domain Database.", Nucleic Acids Res. 37(D):205-10.
  • Marchler-Bauer A, Bryant SH (2004), "CD-Search: protein domain annotations on the fly.", Nucleic Acids Res. 32(W):327-331.