Conserved domains on  [gi|445966738|ref|WP_000044593|]
serine-rich repeat glycoprotein adhesin SasA [Staphylococcus aureus]

Protein Classification

List of domain hits

Name Accession Description Interval E-value
Bact_lectin pfam18483
Bacterial lectin; This entry primarily matches to legume-like lectin domains found in ...
261-489 3.35e-56

Bacterial lectin; This entry primarily matches to legume-like lectin domains found in prokaryotes.



Pssm-ID: 465784  Cd Length: 211  Bit Score: 194.58  E-value: 3.35e-56
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   261 VNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYeGNGNGGDGIGFAFSPGv 340
Cdd:pfam18483    1 VTKDNFLDYFNLNGDATKQNYNGIVTLTPDQNGQSGAVTLKNKIDLNKDFTLKGAVNLGNKQ-SNTGGADGIGFVFHPG- 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   341 lGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSaakaNADPSNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLK 420
Cdd:pfam18483   79 -GGIGTSGGGLGIGGLPNAFGFKFDTYYNSGDSDP----NADPSQGAGGDPYGAFVTTDSNGNLTDVGSDSQTGSTQALD 153
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 445966738   421 VQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTrnisdwiaksgTTNFSLSMTASTGGATNLQQVQFGTF 489
Cdd:pfam18483  154 SSLEDGAFHPITISYDANTKTLTVTYDGNDSS-----------STKVYFGFAASTGGSTNLQQFKITSL 211
He_PIG pfam05345
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ...
664-749 6.53e-08

Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs: C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.



Pssm-ID: 398814 [Multi-domain]  Cd Length: 95  Bit Score: 52.09  E-value: 6.53e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   664 APTVTPIGDQSSEVYSPISPIKIATQDNSGN-------AVTNTVTGLPSGLTFDSTNNTISGTPTNI--GTSTISIVSTD 734
Cdd:pfam05345    1 PPVVTSPADQTATVGTPYSFTLSASGGSDPYggstvtySTTATGGALPSGLTLNSSTGTISGTPTSVqpGTYTFTVTATD 80
                           90
                   ....*....|....*
gi 445966738   735 ASGNKTTTTFKYEVT 749
Cdd:pfam05345   81 SSGLSSSTTFTLTVT 95
He_PIG pfam05345
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ...
577-659 1.55e-06

Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs: C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.



Pssm-ID: 398814 [Multi-domain]  Cd Length: 95  Bit Score: 48.24  E-value: 1.55e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   577 PTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSATNSIIGTP--TKIGQSTVTVVSTDQANNKSTTT 654
Cdd:pfam05345   10 QTATVGTPYSFTLSASGGSDPYGGSTVTYSTTATGGALPSGLTLNSSTGTISGTPtsVQPGTYTFTVTATDSSGLSSSTT 89

                   ....*
gi 445966738   655 FTINV 659
Cdd:pfam05345   90 FTLTV 94
Gram_pos_anchor pfam00746
LPXTG cell wall anchor motif;
2197-2239 3.21e-06

LPXTG cell wall anchor motif;



Pssm-ID: 366278 [Multi-domain]  Cd Length: 43  Bit Score: 45.61  E-value: 3.21e-06
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|...
gi 445966738  2197 TPAQSEKRLPDTGDSIKQNGLLGGVMTLLVGLGLMKRKKKKDE 2239
Cdd:pfam00746    1 AKKSKKKTLPKTGENSNIFLTAAGLLALLGGLLLLVKRRKKEK 43
KxYKxGKxW_sig pfam19258
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as ...
14-50 8.38e-06

KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK signal peptide.



Pssm-ID: 466014 [Multi-domain]  Cd Length: 41  Bit Score: 44.40  E-value: 8.38e-06
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 445966738    14 NEKTRVRLYKSGKNWVKSGIKEIEMFKIMGLPFISHS 50
Cdd:pfam19258    1 ERKTHYKMYKSGKHWVFAGITTLGLGLGLLGGTTAAA 37
FhaB super family cl27105
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
64-1209 2.19e-05

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


The actual alignment was detected with superfamily member COG3210:

Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 50.15  E-value: 2.19e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210   249 SSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGG 328
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210   329 NGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATG 408
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210   409 NTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGG 488
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210   489 GIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLG 568
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIAKSG 463
Cdd:COG3210   569 VLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGS 648
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  464 TTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNvdqvvTIDNQQSALTAKGYNYTSV 543
Cdd:COG3210   649 GTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGN-----TLTISTGSITVTGQIGALA 723
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  544 DSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSA 623
Cdd:COG3210   724 NANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANANGNTSAGATLDNAGAEISIDITADGT 803
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  624 TNSIIGTPTKIGQS--TVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVTNTVT 701
Cdd:COG3210   804 ITAAGTTAINVTGSggTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASITVGSGG 883
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  702 GLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSAST 781
Cdd:COG3210   884 VATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGA 963
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  782 STSGSIVVSTSASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEK 861
Cdd:COG3210   964 GDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQ 1043
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  862 SESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSTSGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSI 941
Cdd:COG3210  1044 NGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGG 1123
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  942 ANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSL 1021
Cdd:COG3210  1124 TTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDS 1203
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1022 SASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSM 1101
Cdd:COG3210  1204 TGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATG 1283
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1102 SISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTSTSDSISE 1181
Cdd:COG3210  1284 STVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTINTTAANTGLNGGNGATDSAA 1363
                        1130      1140
                  ....*....|....*....|....*...
gi 445966738 1182 AISASESTSISLSESNSTSDSESQSASA 1209
Cdd:COG3210  1364 GAGSGGAAGSLAATAGAGTVLTGAGNNT 1391
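
Each hit above carries a one-line summary of the form "Pssm-ID: ...  Cd Length: ...  Bit Score: ...  E-value: ...", optionally tagged "[Multi-domain]". The following is a minimal sketch, not an official NCBI tool, of how such a line could be read into a structured record. The names HitSummary and parse_summary are hypothetical and introduced only for illustration; the field layout is assumed from the lines shown on this page.

```python
import re
from typing import NamedTuple, Optional


class HitSummary(NamedTuple):
    """Hypothetical record for one 'Pssm-ID: ...' summary line."""
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


# Pattern assumed from the summary lines on this page; whitespace is flexible.
_SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm>\d+)\s*(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<length>\d+)\s*"
    r"Bit Score:\s*(?P<bits>[\d.]+)\s*"
    r"E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


def parse_summary(line: str) -> Optional[HitSummary]:
    """Return a HitSummary, or None if the line is not a summary line."""
    m = _SUMMARY_RE.search(line)
    if m is None:
        return None
    return HitSummary(
        pssm_id=int(m["pssm"]),
        multi_domain=m["multi"] is not None,
        cd_length=int(m["length"]),
        bit_score=float(m["bits"]),
        e_value=float(m["evalue"]),
    )


if __name__ == "__main__":
    # Summary line copied from the Bact_lectin (pfam18483) hit.
    print(parse_summary(
        "Pssm-ID: 465784  Cd Length: 211  Bit Score: 194.58  E-value: 3.35e-56"
    ))
```

Lines that do not match the assumed layout (alignment rows, descriptions, ruler lines) return None, so the function can be applied to every line of the page and the non-None results kept.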
 
Name Accession Description Interval E-value
lectin_L-type cd01951
legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate ...
264-491 2.37e-22

legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.


Pssm-ID: 173886 [Multi-domain]  Cd Length: 223  Bit Score: 97.88  E-value: 2.37e-22
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  264 DNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYegnGNGGDGIGFAFSPGVLGE 343
Cdd:cd01951    10 NNNQSNWQLNGSATLTTDSGVLRLTPDTGNQAGSAWYKTPIDLSKDFTTTFKFYLGTKG---TNGADGIAFVLQNDPAGA 86
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  344 TGLNGA--AVGIGGLSNAFGFKLDTYHNtskpnsaaKANADPSNvaggGAFGAFVTTDSYGVATTYTSSSTAdnaakLKV 421
Cdd:cd01951    87 LGGGGGggGLGYGGIGNSVAVEFDTYKN--------DDNNDPNG----NHISIDVNGNGNNTALATSLGSAS-----LPN 149
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 445966738  422 QPTNNTFQNFDITYNGDTKVMTVTYAGQ--TWTRNIS--DWIAKSGTTNFSLSMTASTGGATNLQQVQFGTFEY 491
Cdd:cd01951   150 GTGLGNEHTVRITYDPTTNTLTVYLDNGstLTSLDITipVDLIQLGPTKAYFGFTASTGGLTNLHDILNWSFTS 223
FhaB COG3210
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
64-789 9.93e-06

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 51.31  E-value: 9.93e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210    54 NAGTTASTSGGSGTAGGVGNTSASTGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTGNNTGGTT 133
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210   134 TSSTNTVTTLGGTTTGNTVLSTSGAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATAGVLANAG 213
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210   214 GGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSG 293
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210   294 DTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLT 373
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQ---PTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIA 460
Cdd:COG3210   374 TAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTtiaGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSG 453
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  461 KSGTTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNVDQVVTIDNQQSALTAKGYNY 540
Cdd:COG3210   454 TTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTG 533
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  541 TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSY 620
Cdd:COG3210   534 GDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGT 613
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  621 DSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVT--- 697
Cdd:COG3210   614 ITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTlna 693
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  698 ---NTVTGLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKA 774
Cdd:COG3210   694 atgGTLNNAGNTLTISTGSITVTGQIGALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLT 773
                         730
                  ....*....|....*
gi 445966738  775 DSQSASTSTSGSIVV 789
Cdd:COG3210   774 LANANGNTSAGATLD 788
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
530-812 1.14e-05

serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 50.77  E-value: 1.14e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  530 QSALTAKGYNY-TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGnQTIEVGKTMNPIVLTTTDNGAGTVT 608
Cdd:NF033849  237 QSAGTGYGESVgHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQS-TSESESTGQSSSVGTSESQSHGTTE 315
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  609 NTVTGLPSGLSYDSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSevyspispikiaT 688
Cdd:NF033849  316 GTSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHSTSSSVSSSESSS------------R 383
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  689 QDNSGNAVTNTVTGLPSGLTFDSTNNTISGTpTNIGTST-ISIVSTDASGNKTTTTFkyevtrNSMSDSVSTSGSTQQSQ 767
Cdd:NF033849  384 SSSSGVSGGFSGGIAGGGVTSEGLGASQGGS-EGWGSGDsVQSVSQSYGSSSSTGTS------SGHSDSSSHSTSSGQAD 456
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*
gi 445966738  768 SVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDSVSASKS 812
Cdd:NF033849  457 SVSQGTSWSEGTGTSQGQSVGTSESWSTSQSETDSVGDSTGTSES 501
KxYKxGKxW TIGR03715
KxYKxGKxW signal peptide; This model describes a novel form of signal peptide that occurs as ...
16-33 8.49e-04

KxYKxGKxW signal peptide; This model describes a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK and PEP-CTERM forms of signal peptide. This domain tends to occur on long, low-complexity (usually Serine-rich and heavily glycosylated) proteins of the Firmicutes, and (as with YSIRK) the majority of these proteins have the LPXTG cell wall-anchoring motif at the C-terminus.


Pssm-ID: 274741 [Multi-domain]  Cd Length: 23  Bit Score: 38.52  E-value: 8.49e-04
                           10
                   ....*....|....*...
gi 445966738    16 KTRVRLYKSGKNWVKSGI 33
Cdd:TIGR03715    1 KKRYKMYKSGKHWVFAGI 18
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
945-1208 3.29e-03

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 42.69  E-value: 3.29e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  945 QSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSAS 1024
Cdd:NF033849  237 QSAGTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQSTSESESTGQSSSVGTSESQSHGTTEG 316
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1025 DSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSIS 1104
Cdd:NF033849  317 TSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHSTSSSVSSSESSSRSSSSGVSGGFSGG 396
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1105 TSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLStsesdsiSESTSTSDSISEAIS 1184
Cdd:NF033849  397 IAGGGVTSEGLGASQGGSEGWGSGDSVQSVSQSYGSSSSTGTSSGHSDSSSHSTSSGQ-------ADSVSQGTSWSEGTG 469
                         250       260
                  ....*....|....*....|....
gi 445966738 1185 ASESTSISLSESNSTSDSESQSAS 1208
Cdd:NF033849  470 TSQGQSVGTSESWSTSQSETDSVG 493
LPXTG_anchor TIGR01167
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region ...
2204-2238 6.16e-03

LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]


Pssm-ID: 273478 [Multi-domain]  Cd Length: 34  Bit Score: 36.30  E-value: 6.16e-03
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 445966738  2204 RLPDTGDSIKQNGLLGGVMtLLVGLGLMKRKKKKD 2238
Cdd:TIGR01167    1 KLPKTGESGNSLLLLLGLL-LLGLGGLLLRKRKKK 34
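
The cleavage rule described for this model (sortase cuts between the Thr and Gly of LPXTG) can be sketched as a simple scan of the query row of the alignment above. This is a minimal sketch: the function name and the `LP.TG` regular expression are assumptions for illustration, and the position arithmetic assumes the fragment contains no gap characters, which holds for this row.

```python
import re

# Query row of the TIGR01167 alignment above. The lowercase "t" is an
# unaligned residue and still counts toward sequence numbering.
FRAGMENT = "RLPDTGDSIKQNGLLGGVMtLLVGLGLMKRKKKKD"
FRAGMENT_START = 2204  # residue number of the first character, from the alignment

# Hypothetical pattern for the motif: L, P, any residue, T, G.
LPXTG = re.compile(r"LP.TG")


def find_sortase_sites(fragment: str, start: int):
    """Return (motif, Thr position, Gly position) for each LPXTG match.

    Positions are 1-based residue numbers in the full-length protein.
    """
    sites = []
    for match in LPXTG.finditer(fragment.upper()):
        thr = start + match.start() + 3  # T is the fourth residue of the motif
        sites.append((match.group(0), thr, thr + 1))
    return sites


sites = find_sortase_sites(FRAGMENT, FRAGMENT_START)
for motif, thr, gly in sites:
    print(f"{motif}: predicted cleavage between Thr{thr} and Gly{gly}")
```

For this protein the scan finds a single LPDTG motif, which places the predicted cleavage between residues 2208 and 2209, near the C-terminal end of the interval reported for the hit.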
 
Name Accession Description Interval E-value
lectin_L-type cd01951
legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate ...
264-491 2.37e-22

legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate-binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalin A, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type 1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet, and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate-binding specificities differ widely.


Pssm-ID: 173886 [Multi-domain]  Cd Length: 223  Bit Score: 97.88  E-value: 2.37e-22
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  264 DNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYegnGNGGDGIGFAFSPGVLGE 343
Cdd:cd01951    10 NNNQSNWQLNGSATLTTDSGVLRLTPDTGNQAGSAWYKTPIDLSKDFTTTFKFYLGTKG---TNGADGIAFVLQNDPAGA 86
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  344 TGLNGA--AVGIGGLSNAFGFKLDTYHNtskpnsaaKANADPSNvaggGAFGAFVTTDSYGVATTYTSSSTAdnaakLKV 421
Cdd:cd01951    87 LGGGGGggGLGYGGIGNSVAVEFDTYKN--------DDNNDPNG----NHISIDVNGNGNNTALATSLGSAS-----LPN 149
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 445966738  422 QPTNNTFQNFDITYNGDTKVMTVTYAGQ--TWTRNIS--DWIAKSGTTNFSLSMTASTGGATNLQQVQFGTFEY 491
Cdd:cd01951   150 GTGLGNEHTVRITYDPTTNTLTVYLDNGstLTSLDITipVDLIQLGPTKAYFGFTASTGGLTNLHDILNWSFTS 223
He_PIG pfam05345
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ...
664-749 6.53e-08

Putative Ig domain; This alignment represents the conserved core region of a ~90-residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs: C. Yeats), so this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.


Pssm-ID: 398814 [Multi-domain]  Cd Length: 95  Bit Score: 52.09  E-value: 6.53e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   664 APTVTPIGDQSSEVYSPISPIKIATQDNSGN-------AVTNTVTGLPSGLTFDSTNNTISGTPTNI--GTSTISIVSTD 734
Cdd:pfam05345    1 PPVVTSPADQTATVGTPYSFTLSASGGSDPYggstvtySTTATGGALPSGLTLNSSTGTISGTPTSVqpGTYTFTVTATD 80
                           90
                   ....*....|....*
gi 445966738   735 ASGNKTTTTFKYEVT 749
Cdd:pfam05345   81 SSGLSSSTTFTLTVT 95
He_PIG pfam05345
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ...
577-659 1.55e-06

Putative Ig domain; This alignment represents the conserved core region of a ~90-residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs: C. Yeats), so this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.


Pssm-ID: 398814 [Multi-domain]  Cd Length: 95  Bit Score: 48.24  E-value: 1.55e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   577 PTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSATNSIIGTP--TKIGQSTVTVVSTDQANNKSTTT 654
Cdd:pfam05345   10 QTATVGTPYSFTLSASGGSDPYGGSTVTYSTTATGGALPSGLTLNSSTGTISGTPtsVQPGTYTFTVTATDSSGLSSSTT 89

                   ....*
gi 445966738   655 FTINV 659
Cdd:pfam05345   90 FTLTV 94
Gram_pos_anchor pfam00746
LPXTG cell wall anchor motif;
2197-2239 3.21e-06

LPXTG cell wall anchor motif;


Pssm-ID: 366278 [Multi-domain]  Cd Length: 43  Bit Score: 45.61  E-value: 3.21e-06
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|...
gi 445966738  2197 TPAQSEKRLPDTGDSIKQNGLLGGVMTLLVGLGLMKRKKKKDE 2239
Cdd:pfam00746    1 AKKSKKKTLPKTGENSNIFLTAAGLLALLGGLLLLVKRRKKEK 43
KxYKxGKxW_sig pfam19258
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as ...
14-50 8.38e-06

KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK signal peptide.


Pssm-ID: 466014 [Multi-domain]  Cd Length: 41  Bit Score: 44.40  E-value: 8.38e-06
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 445966738    14 NEKTRVRLYKSGKNWVKSGIKEIEMFKIMGLPFISHS 50
Cdd:pfam19258    1 ERKTHYKMYKSGKHWVFAGITTLGLGLGLLGGTTAAA 37
FhaB COG3210
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
64-789 9.93e-06

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 51.31  E-value: 9.93e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210    54 NAGTTASTSGGSGTAGGVGNTSASTGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTGNNTGGTT 133
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210   134 TSSTNTVTTLGGTTTGNTVLSTSGAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATAGVLANAG 213
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210   214 GGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSG 293
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210   294 DTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLT 373
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQ---PTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIA 460
Cdd:COG3210   374 TAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTtiaGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSG 453
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  461 KSGTTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNVDQVVTIDNQQSALTAKGYNY 540
Cdd:COG3210   454 TTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTG 533
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  541 TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSY 620
Cdd:COG3210   534 GDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGT 613
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  621 DSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVT--- 697
Cdd:COG3210   614 ITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTlna 693
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  698 ---NTVTGLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKA 774
Cdd:COG3210   694 atgGTLNNAGNTLTISTGSITVTGQIGALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLT 773
                         730
                  ....*....|....*
gi 445966738  775 DSQSASTSTSGSIVV 789
Cdd:COG3210   774 LANANGNTSAGATLD 788
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
530-812 1.14e-05

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 50.77  E-value: 1.14e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  530 QSALTAKGYNY-TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGnQTIEVGKTMNPIVLTTTDNGAGTVT 608
Cdd:NF033849  237 QSAGTGYGESVgHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQS-TSESESTGQSSSVGTSESQSHGTTE 315
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  609 NTVTGLPSGLSYDSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSevyspispikiaT 688
Cdd:NF033849  316 GTSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHSTSSSVSSSESSS------------R 383
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  689 QDNSGNAVTNTVTGLPSGLTFDSTNNTISGTpTNIGTST-ISIVSTDASGNKTTTTFkyevtrNSMSDSVSTSGSTQQSQ 767
Cdd:NF033849  384 SSSSGVSGGFSGGIAGGGVTSEGLGASQGGS-EGWGSGDsVQSVSQSYGSSSSTGTS------SGHSDSSSHSTSSGQAD 456
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*
gi 445966738  768 SVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDSVSASKS 812
Cdd:NF033849  457 SVSQGTSWSEGTGTSQGQSVGTSESWSTSQSETDSVGDSTGTSES 501
FhaB COG3210
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
64-1209 2.19e-05

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 50.15  E-value: 2.19e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210   249 SSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGG 328
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210   329 NGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATG 408
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210   409 NTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGG 488
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210   489 GIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLG 568
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIAKSG 463
Cdd:COG3210   569 VLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGS 648
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  464 TTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNvdqvvTIDNQQSALTAKGYNYTSV 543
Cdd:COG3210   649 GTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGN-----TLTISTGSITVTGQIGALA 723
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  544 DSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSA 623
Cdd:COG3210   724 NANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANANGNTSAGATLDNAGAEISIDITADGT 803
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  624 TNSIIGTPTKIGQS--TVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVTNTVT 701
Cdd:COG3210   804 ITAAGTTAINVTGSggTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASITVGSGG 883
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  702 GLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSAST 781
Cdd:COG3210   884 VATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGA 963
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  782 STSGSIVVSTSASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEK 861
Cdd:COG3210   964 GDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQ 1043
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  862 SESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSTSGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSI 941
Cdd:COG3210  1044 NGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGG 1123
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  942 ANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSL 1021
Cdd:COG3210  1124 TTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDS 1203
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1022 SASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSM 1101
Cdd:COG3210  1204 TGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATG 1283
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1102 SISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTSTSDSISE 1181
Cdd:COG3210  1284 STVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTINTTAANTGLNGGNGATDSAA 1363
                        1130      1140
                  ....*....|....*....|....*...
gi 445966738 1182 AISASESTSISLSESNSTSDSESQSASA 1209
Cdd:COG3210  1364 GAGSGGAAGSLAATAGAGTVLTGAGNNT 1391
FhaB COG3210
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
100-1209 2.10e-04

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 46.68  E-value: 2.10e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  100 LNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVSSEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSS 179
Cdd:COG3210   183 SLNVVAANPTGVTGVGGALINATAGVLANAGGGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGA 262
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  180 SDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTSTTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTI 259
Cdd:COG3210   263 GGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAG 342
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  260 TVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPG 339
Cdd:COG3210   343 LVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTTIAGNGGSAN 422
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  340 VLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADPSNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKL 419
Cdd:COG3210   423 AGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNN 502
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  420 KVQPTNNTFQNFDITYNGDTKV-------MTVTYAGQTWTRNISDWIAKSGTTNFSLSMTASTGGATNLQQVQFGTFEYT 492
Cdd:COG3210   503 AGGDANGIATGLTGITAGGGGGgnatsggTGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTA 582
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  493 ESAVTQVRYVDVTTGKDIIPPKTYSGNVDQVVTIDNQQSALTAKGYNYTSVDSSYASTYNDTNKTVKMTNAGQSVTYYFT 572
Cdd:COG3210   583 GNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTT 662
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  573 DVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGlpSGLSYDSATNSIIGTPTKIGQSTVTVVSTDQANNKST 652
Cdd:COG3210   663 GVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAG--NTLTISTGSITVTGQIGALANANGDTVTFGNLGTGAT 740
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  653 TTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKI----------ATQDNSGNAVTNTVTGLPSGLTFDSTNNTISGTPTN 722
Cdd:COG3210   741 LTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLanangntsagATLDNAGAEISIDITADGTITAAGTTAINVTGSGGT 820
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  723 IGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVS 802
Cdd:COG3210   821 ITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASITVGSGGVATSTGTANAGTLTNLG 900
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  803 LSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEKSESLSTSTSDSLRTSTSLSDS 882
Cdd:COG3210   901 TTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGAGDTGASSAAGSSAVGTS 980
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  883 LSMSTSGSLSKSQSLSTSTSGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLS 962
Cdd:COG3210   981 ANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQNGVGVNASGISGGNAAA 1060
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738  963 TSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGS 1042
Cdd:COG3210  1061 LTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGGTTTVGATGTSTASTEAA 1140
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1043 TSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASS 1122
Cdd:COG3210  1141 GAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDSTGGSTTTIGTTNVTTTT 1220
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1123 ESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSESNSTSDS 1202
Cdd:COG3210  1221 TLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATGSTVDIGSTSATSAGGSL 1300

                  ....*..
gi 445966738 1203 ESQSASA 1209
Cdd:COG3210  1301 DTTGNTA 1307
HYR pfam02494
HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin ...
572-659 5.89e-03

HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin, which is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion.


Pssm-ID: 460572 [Multi-domain]  Cd Length: 81  Bit Score: 37.75  E-value: 5.89e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738   572 TDVKAPTVTVGN---QTIEVGKTMNPIVLTTT--DNGAGTVTNTVTGLPSGLSYdsatnsiigtptKIGQSTVTVVSTDQ 646
Cdd:pfam02494    1 VDTTPPTVKCPNnivRTVELGTSTVRVFFTEPtaFDNSGQAILVSRTAQPGDFF------------PVGTTTVTYVAYDN 68
                           90
                   ....*....|...
gi 445966738   647 ANNKSTTTFTINV 659
Cdd:pfam02494   69 SGNRASCTFTVTV 81
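
For a weak hit like this one, it can help to count how many aligned columns actually carry identical residues. The sketch below is illustrative only: `aligned_identity` is a hypothetical helper, and it assumes that `-` marks a gap and that lowercase letters mark residues that are not aligned to the model, so both kinds of column are skipped. The two rows are the query and pfam02494 rows shown above, concatenated across both alignment blocks.

```python
def aligned_identity(query_row: str, subject_row: str) -> tuple[int, int]:
    """Return (identical_columns, aligned_columns) for two alignment rows.

    Assumption: a column counts as aligned only when both characters
    are uppercase residues; gaps ('-') and lowercase residues are skipped.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Query (gi 445966738, residues 572-659) and pfam02494 rows from above.
HYR_QUERY = (
    "TDVKAPTVTVGN---QTIEVGKTMNPIVLTTT--DNGAGTVTNTVTGLPSGLSY"
    "dsatnsiigtptKIGQSTVTVVSTDQ"
    "ANNKSTTTFTINV"
)
HYR_MODEL = (
    "VDTTPPTVKCPNnivRTVELGTSTVRVFFTEPtaFDNSGQAILVSRTAQPGDFF"
    "------------PVGTTTVTYVAYDN"
    "SGNRASCTFTVTV"
)

if __name__ == "__main__":
    identical, aligned = aligned_identity(HYR_QUERY, HYR_MODEL)
    print(f"{identical} identical of {aligned} aligned columns")
```

Identity computed this way is only a rough guide; the reported bit score and E-value come from the position-specific scoring matrix, not from residue identity.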
LPXTG_anchor TIGR01167
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region ...
2204-2238 6.16e-03

LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]


Pssm-ID: 273478 [Multi-domain]  Cd Length: 34  Bit Score: 36.30  E-value: 6.16e-03
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 445966738  2204 RLPDTGDSIKQNGLLGGVMtLLVGLGLMKRKKKKD 2238
Cdd:TIGR01167    1 KLPKTGESGNSLLLLLGLL-LLGLGGLLLRKRKKK 34
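
The motif and the Thr/Gly positions described above can be located in the query row with a simple pattern search. This is a minimal sketch under stated assumptions: `find_lpxtg` is a hypothetical helper, the pattern `LP.TG` treats X as any single residue, and positions are derived from the interval start (2204) shown in the alignment.

```python
import re

# LPXTG with X as any single residue (an assumption of this sketch).
LPXTG_PATTERN = re.compile(r"LP.TG")


def find_lpxtg(segment: str, start_residue: int):
    """Locate the first LPXTG motif in an alignment row.

    Returns (motif, thr_position, gly_position) in protein coordinates,
    or None when no motif is present. Gaps are removed and lowercase
    residues are upper-cased before searching.
    """
    residues = segment.replace("-", "").upper()
    match = LPXTG_PATTERN.search(residues)
    if match is None:
        return None
    motif_start = start_residue + match.start()
    # The motif is L, P, X, T, G: Thr is the 4th residue, Gly the 5th.
    return match.group(), motif_start + 3, motif_start + 4


# Query row of the TIGR01167 alignment (residues 2204-2238).
QUERY_SEGMENT = "RLPDTGDSIKQNGLLGGVMtLLVGLGLMKRKKKKD"

if __name__ == "__main__":
    print(find_lpxtg(QUERY_SEGMENT, 2204))
```

Under these assumptions the motif found is LPDTG, with the Thr at residue 2208 and the Gly at residue 2209, so the sortase cleavage described above would fall between those two positions.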
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:Database: CDSEARCH/cdd   Low complexity filter: no  Composition Based Adjustment: yes   E-value threshold: 0.01
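
The E-value threshold above explains why hits with E-values in the 1e-03 range still appear in the list. A minimal sketch of that cutoff follows; the helper `hits_within_threshold` is hypothetical, the tuples copy the accession, bit score and E-value reported for three of the hits above, and treating the cutoff as inclusive is an assumption of this sketch rather than documented CD-Search behavior.

```python
E_VALUE_THRESHOLD = 0.01  # "E-value threshold" from the preset options

# (accession, bit score, E-value) as reported in the hit list above.
HITS = [
    ("NF033849", 42.69, 3.29e-03),
    ("pfam02494", 37.75, 5.89e-03),
    ("TIGR01167", 36.30, 6.16e-03),
]


def hits_within_threshold(hits, threshold=E_VALUE_THRESHOLD):
    """Return accessions whose E-value is at or below the threshold."""
    return [accession for accession, _bits, e_value in hits if e_value <= threshold]


if __name__ == "__main__":
    print(hits_within_threshold(HITS))
    # A stricter, purely illustrative cutoff keeps fewer hits.
    print(hits_within_threshold(HITS, threshold=5e-03))
```

All three hits pass the preset cutoff of 0.01, but none by a wide margin, so they are best read as tentative annotations.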
