|
Name |
Accession |
Description |
Interval |
E-value |
| Bact_lectin |
pfam18483 |
Bacterial lectin; This entry primarily matches to legume-like lectin domains found in ... |
261-489 |
3.35e-56 |
|
Bacterial lectin; This entry primarily matches to legume-like lectin domains found in prokaryotes.
Pssm-ID: 465784 Cd Length: 211 Bit Score: 194.58 E-value: 3.35e-56
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 261 VNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYeGNGNGGDGIGFAFSPGv 340
Cdd:pfam18483 1 VTKDNFLDYFNLNGDATKQNYNGIVTLTPDQNGQSGAVTLKNKIDLNKDFTLKGAVNLGNKQ-SNTGGADGIGFVFHPG- 78
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 341 lGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSaakaNADPSNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLK 420
Cdd:pfam18483 79 -GGIGTSGGGLGIGGLPNAFGFKFDTYYNSGDSDP----NADPSQGAGGDPYGAFVTTDSNGNLTDVGSDSQTGSTQALD 153
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 445966738 421 VQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTrnisdwiaksgTTNFSLSMTASTGGATNLQQVQFGTF 489
Cdd:pfam18483 154 SSLEDGAFHPITISYDANTKTLTVTYDGNDSS-----------STKVYFGFAASTGGSTNLQQFKITSL 211
|
|
| He_PIG |
pfam05345 |
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ... |
664-749 |
6.53e-08 |
|
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.
Pssm-ID: 398814 [Multi-domain] Cd Length: 95 Bit Score: 52.09 E-value: 6.53e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 664 APTVTPIGDQSSEVYSPISPIKIATQDNSGN-------AVTNTVTGLPSGLTFDSTNNTISGTPTNI--GTSTISIVSTD 734
Cdd:pfam05345 1 PPVVTSPADQTATVGTPYSFTLSASGGSDPYggstvtySTTATGGALPSGLTLNSSTGTISGTPTSVqpGTYTFTVTATD 80
|
90
....*....|....*
gi 445966738 735 ASGNKTTTTFKYEVT 749
Cdd:pfam05345 81 SSGLSSSTTFTLTVT 95
|
|
| He_PIG |
pfam05345 |
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ... |
577-659 |
1.55e-06 |
|
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.
Pssm-ID: 398814 [Multi-domain] Cd Length: 95 Bit Score: 48.24 E-value: 1.55e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 577 PTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSATNSIIGTP--TKIGQSTVTVVSTDQANNKSTTT 654
Cdd:pfam05345 10 QTATVGTPYSFTLSASGGSDPYGGSTVTYSTTATGGALPSGLTLNSSTGTISGTPtsVQPGTYTFTVTATDSSGLSSSTT 89
|
....*
gi 445966738 655 FTINV 659
Cdd:pfam05345 90 FTLTV 94
|
|
| Gram_pos_anchor |
pfam00746 |
LPXTG cell wall anchor motif; |
2197-2239 |
3.21e-06 |
|
LPXTG cell wall anchor motif;
Pssm-ID: 366278 [Multi-domain] Cd Length: 43 Bit Score: 45.61 E-value: 3.21e-06
10 20 30 40
....*....|....*....|....*....|....*....|...
gi 445966738 2197 TPAQSEKRLPDTGDSIKQNGLLGGVMTLLVGLGLMKRKKKKDE 2239
Cdd:pfam00746 1 AKKSKKKTLPKTGENSNIFLTAAGLLALLGGLLLLVKRRKKEK 43
|
|
| KxYKxGKxW_sig |
pfam19258 |
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as ... |
14-50 |
8.38e-06 |
|
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK signal peptide.
Pssm-ID: 466014 [Multi-domain] Cd Length: 41 Bit Score: 44.40 E-value: 8.38e-06
10 20 30
....*....|....*....|....*....|....*..
gi 445966738 14 NEKTRVRLYKSGKNWVKSGIKEIEMFKIMGLPFISHS 50
Cdd:pfam19258 1 ERKTHYKMYKSGKHWVFAGITTLGLGLGLLGGTTAAA 37
|
|
| FhaB super family |
cl27105 |
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... |
64-1209 |
2.19e-05 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport]; The actual alignment was detected with superfamily member COG3210:
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 50.15 E-value: 2.19e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210 249 SSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGG 328
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210 329 NGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATG 408
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210 409 NTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGG 488
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210 489 GIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLG 568
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIAKSG 463
Cdd:COG3210 569 VLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGS 648
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 464 TTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNvdqvvTIDNQQSALTAKGYNYTSV 543
Cdd:COG3210 649 GTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGN-----TLTISTGSITVTGQIGALA 723
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 544 DSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSA 623
Cdd:COG3210 724 NANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANANGNTSAGATLDNAGAEISIDITADGT 803
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 624 TNSIIGTPTKIGQS--TVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVTNTVT 701
Cdd:COG3210 804 ITAAGTTAINVTGSggTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASITVGSGG 883
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 702 GLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSAST 781
Cdd:COG3210 884 VATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGA 963
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 782 STSGSIVVSTSASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEK 861
Cdd:COG3210 964 GDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQ 1043
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 862 SESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSTSGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSI 941
Cdd:COG3210 1044 NGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGG 1123
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 942 ANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSL 1021
Cdd:COG3210 1124 TTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDS 1203
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1022 SASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSM 1101
Cdd:COG3210 1204 TGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATG 1283
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1102 SISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTSTSDSISE 1181
Cdd:COG3210 1284 STVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTINTTAANTGLNGGNGATDSAA 1363
|
1130 1140
....*....|....*....|....*...
gi 445966738 1182 AISASESTSISLSESNSTSDSESQSASA 1209
Cdd:COG3210 1364 GAGSGGAAGSLAATAGAGTVLTGAGNNT 1391
|
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| Bact_lectin |
pfam18483 |
Bacterial lectin; This entry primarily matches to legume-like lectin domains found in ... |
261-489 |
3.35e-56 |
|
Bacterial lectin; This entry primarily matches to legume-like lectin domains found in prokaryotes.
Pssm-ID: 465784 Cd Length: 211 Bit Score: 194.58 E-value: 3.35e-56
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 261 VNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYeGNGNGGDGIGFAFSPGv 340
Cdd:pfam18483 1 VTKDNFLDYFNLNGDATKQNYNGIVTLTPDQNGQSGAVTLKNKIDLNKDFTLKGAVNLGNKQ-SNTGGADGIGFVFHPG- 78
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 341 lGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSaakaNADPSNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLK 420
Cdd:pfam18483 79 -GGIGTSGGGLGIGGLPNAFGFKFDTYYNSGDSDP----NADPSQGAGGDPYGAFVTTDSNGNLTDVGSDSQTGSTQALD 153
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 445966738 421 VQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTrnisdwiaksgTTNFSLSMTASTGGATNLQQVQFGTF 489
Cdd:pfam18483 154 SSLEDGAFHPITISYDANTKTLTVTYDGNDSS-----------STKVYFGFAASTGGSTNLQQFKITSL 211
|
|
| lectin_L-type |
cd01951 |
legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate ... |
264-491 |
2.37e-22 |
|
legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.
Pssm-ID: 173886 [Multi-domain] Cd Length: 223 Bit Score: 97.88 E-value: 2.37e-22
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 264 DNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYegnGNGGDGIGFAFSPGVLGE 343
Cdd:cd01951 10 NNNQSNWQLNGSATLTTDSGVLRLTPDTGNQAGSAWYKTPIDLSKDFTTTFKFYLGTKG---TNGADGIAFVLQNDPAGA 86
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 344 TGLNGA--AVGIGGLSNAFGFKLDTYHNtskpnsaaKANADPSNvaggGAFGAFVTTDSYGVATTYTSSSTAdnaakLKV 421
Cdd:cd01951 87 LGGGGGggGLGYGGIGNSVAVEFDTYKN--------DDNNDPNG----NHISIDVNGNGNNTALATSLGSAS-----LPN 149
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 445966738 422 QPTNNTFQNFDITYNGDTKVMTVTYAGQ--TWTRNIS--DWIAKSGTTNFSLSMTASTGGATNLQQVQFGTFEY 491
Cdd:cd01951 150 GTGLGNEHTVRITYDPTTNTLTVYLDNGstLTSLDITipVDLIQLGPTKAYFGFTASTGGLTNLHDILNWSFTS 223
|
|
| He_PIG |
pfam05345 |
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ... |
664-749 |
6.53e-08 |
|
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.
Pssm-ID: 398814 [Multi-domain] Cd Length: 95 Bit Score: 52.09 E-value: 6.53e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 664 APTVTPIGDQSSEVYSPISPIKIATQDNSGN-------AVTNTVTGLPSGLTFDSTNNTISGTPTNI--GTSTISIVSTD 734
Cdd:pfam05345 1 PPVVTSPADQTATVGTPYSFTLSASGGSDPYggstvtySTTATGGALPSGLTLNSSTGTISGTPTSVqpGTYTFTVTATD 80
|
90
....*....|....*
gi 445966738 735 ASGNKTTTTFKYEVT 749
Cdd:pfam05345 81 SSGLSSSTTFTLTVT 95
|
|
| He_PIG |
pfam05345 |
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ... |
577-659 |
1.55e-06 |
|
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.
Pssm-ID: 398814 [Multi-domain] Cd Length: 95 Bit Score: 48.24 E-value: 1.55e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 577 PTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSATNSIIGTP--TKIGQSTVTVVSTDQANNKSTTT 654
Cdd:pfam05345 10 QTATVGTPYSFTLSASGGSDPYGGSTVTYSTTATGGALPSGLTLNSSTGTISGTPtsVQPGTYTFTVTATDSSGLSSSTT 89
|
....*
gi 445966738 655 FTINV 659
Cdd:pfam05345 90 FTLTV 94
|
|
| Gram_pos_anchor |
pfam00746 |
LPXTG cell wall anchor motif; |
2197-2239 |
3.21e-06 |
|
LPXTG cell wall anchor motif;
Pssm-ID: 366278 [Multi-domain] Cd Length: 43 Bit Score: 45.61 E-value: 3.21e-06
10 20 30 40
....*....|....*....|....*....|....*....|...
gi 445966738 2197 TPAQSEKRLPDTGDSIKQNGLLGGVMTLLVGLGLMKRKKKKDE 2239
Cdd:pfam00746 1 AKKSKKKTLPKTGENSNIFLTAAGLLALLGGLLLLVKRRKKEK 43
|
|
| KxYKxGKxW_sig |
pfam19258 |
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as ... |
14-50 |
8.38e-06 |
|
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK signal peptide.
Pssm-ID: 466014 [Multi-domain] Cd Length: 41 Bit Score: 44.40 E-value: 8.38e-06
10 20 30
....*....|....*....|....*....|....*..
gi 445966738 14 NEKTRVRLYKSGKNWVKSGIKEIEMFKIMGLPFISHS 50
Cdd:pfam19258 1 ERKTHYKMYKSGKHWVFAGITTLGLGLGLLGGTTAAA 37
|
|
| FhaB |
COG3210 |
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... |
64-789 |
9.93e-06 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 51.31 E-value: 9.93e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210 54 NAGTTASTSGGSGTAGGVGNTSASTGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTGNNTGGTT 133
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210 134 TSSTNTVTTLGGTTTGNTVLSTSGAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATAGVLANAG 213
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210 214 GGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSG 293
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210 294 DTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLT 373
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQ---PTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIA 460
Cdd:COG3210 374 TAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTtiaGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSG 453
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 461 KSGTTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNVDQVVTIDNQQSALTAKGYNY 540
Cdd:COG3210 454 TTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTG 533
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 541 TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSY 620
Cdd:COG3210 534 GDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGT 613
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 621 DSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVT--- 697
Cdd:COG3210 614 ITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTlna 693
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 698 ---NTVTGLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKA 774
Cdd:COG3210 694 atgGTLNNAGNTLTISTGSITVTGQIGALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLT 773
|
730
....*....|....*
gi 445966738 775 DSQSASTSTSGSIVV 789
Cdd:COG3210 774 LANANGNTSAGATLD 788
|
|
| ser_rich_anae_1 |
NF033849 |
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ... |
530-812 |
1.14e-05 |
|
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
Pssm-ID: 468206 [Multi-domain] Cd Length: 1122 Bit Score: 50.77 E-value: 1.14e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 530 QSALTAKGYNY-TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGnQTIEVGKTMNPIVLTTTDNGAGTVT 608
Cdd:NF033849 237 QSAGTGYGESVgHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQS-TSESESTGQSSSVGTSESQSHGTTE 315
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 609 NTVTGLPSGLSYDSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSevyspispikiaT 688
Cdd:NF033849 316 GTSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHSTSSSVSSSESSS------------R 383
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 689 QDNSGNAVTNTVTGLPSGLTFDSTNNTISGTpTNIGTST-ISIVSTDASGNKTTTTFkyevtrNSMSDSVSTSGSTQQSQ 767
Cdd:NF033849 384 SSSSGVSGGFSGGIAGGGVTSEGLGASQGGS-EGWGSGDsVQSVSQSYGSSSSTGTS------SGHSDSSSHSTSSGQAD 456
|
250 260 270 280
....*....|....*....|....*....|....*....|....*
gi 445966738 768 SVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDSVSASKS 812
Cdd:NF033849 457 SVSQGTSWSEGTGTSQGQSVGTSESWSTSQSETDSVGDSTGTSES 501
|
|
| FhaB |
COG3210 |
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... |
64-1209 |
2.19e-05 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 50.15 E-value: 2.19e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210 249 SSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGG 328
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210 329 NGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATG 408
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210 409 NTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGG 488
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210 489 GIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLG 568
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIAKSG 463
Cdd:COG3210 569 VLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGS 648
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 464 TTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNvdqvvTIDNQQSALTAKGYNYTSV 543
Cdd:COG3210 649 GTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGN-----TLTISTGSITVTGQIGALA 723
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 544 DSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSA 623
Cdd:COG3210 724 NANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANANGNTSAGATLDNAGAEISIDITADGT 803
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 624 TNSIIGTPTKIGQS--TVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVTNTVT 701
Cdd:COG3210 804 ITAAGTTAINVTGSggTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASITVGSGG 883
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 702 GLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSAST 781
Cdd:COG3210 884 VATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGA 963
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 782 STSGSIVVSTSASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEK 861
Cdd:COG3210 964 GDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQ 1043
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 862 SESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSTSGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSI 941
Cdd:COG3210 1044 NGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGG 1123
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 942 ANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSL 1021
Cdd:COG3210 1124 TTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDS 1203
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1022 SASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSM 1101
Cdd:COG3210 1204 TGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATG 1283
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1102 SISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTSTSDSISE 1181
Cdd:COG3210 1284 STVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTINTTAANTGLNGGNGATDSAA 1363
|
1130 1140
....*....|....*....|....*...
gi 445966738 1182 AISASESTSISLSESNSTSDSESQSASA 1209
Cdd:COG3210 1364 GAGSGGAAGSLAATAGAGTVLTGAGNNT 1391
|
|
| KxYKxGKxW |
TIGR03715 |
KxYKxGKxW signal peptide; This model describes a novel form of signal peptide that occurs as ... |
16-33 |
8.49e-04 |
|
KxYKxGKxW signal peptide; This model describes a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK and PEP-CTERM forms of signal peptide. This domain tends to occur on long, low-complexity (usually Serine-rich and heavily glycosylated) proteins of the Firmicutes, and (as with YSIRK) the majority of these proteins have the LPXTG cell wall-anchoring motif at the C-terminus.
Pssm-ID: 274741 [Multi-domain] Cd Length: 23 Bit Score: 38.52 E-value: 8.49e-04
|
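The motif named in the KxYKxGKxW entries can be checked directly against the residues shown in the pfam19258 alignment. The following is a minimal sketch, not part of the CD-Search output: the two sequences are copied from that alignment, while the regular expressions, the helper name `find_motif`, and in particular the relaxed pattern that also accepts Arg at the first position are assumptions made for illustration.

```python
import re

# Query residues 14-50, copied from the pfam19258 alignment in this report.
QUERY_START = 14
QUERY_SEGMENT = "NEKTRVRLYKSGKNWVKSGIKEIEMFKIMGLPFISHS"

# Model consensus row from the same alignment (starts at model position 1).
CONSENSUS_START = 1
CONSENSUS_SEGMENT = "ERKTHYKMYKSGKHWVFAGITTLGLGLGLLGGTTAAA"

# Literal reading of the family name: x stands for any residue.
STRICT_MOTIF = re.compile(r"K.YK.GK.W")
# Assumption for illustration only: also accept Arg at the first position.
RELAXED_MOTIF = re.compile(r"[KR].YK.GK.W")


def find_motif(pattern, segment, start):
    """Return (matched text, 1-based position of first residue) for each hit."""
    return [(m.group(), start + m.start()) for m in pattern.finditer(segment)]


if __name__ == "__main__":
    print(find_motif(STRICT_MOTIF, CONSENSUS_SEGMENT, CONSENSUS_START))
    print(find_motif(STRICT_MOTIF, QUERY_SEGMENT, QUERY_START))
    print(find_motif(RELAXED_MOTIF, QUERY_SEGMENT, QUERY_START))
```

On these inputs the strict pattern matches the consensus (KMYKSGKHW at model position 7) but not the query, because the query carries Arg where the motif name has its first Lys. The relaxed pattern finds RLYKSGKNW starting at query residue 20, which lies inside the 16-33 interval reported for TIGR03715.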
| ser_rich_anae_1 |
NF033849 |
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ... |
945-1208 |
3.29e-03 |
|
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
Pssm-ID: 468206 [Multi-domain] Cd Length: 1122 Bit Score: 42.69 E-value: 3.29e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 945 QSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSAS 1024
Cdd:NF033849 237 QSAGTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQSTSESESTGQSSSVGTSESQSHGTTEG 316
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1025 DSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSIS 1104
Cdd:NF033849 317 TSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHSTSSSVSSSESSSRSSSSGVSGGFSGG 396
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1105 TSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLStsesdsiSESTSTSDSISEAIS 1184
Cdd:NF033849 397 IAGGGVTSEGLGASQGGSEGWGSGDSVQSVSQSYGSSSSTGTSSGHSDSSSHSTSSGQ-------ADSVSQGTSWSEGTG 469
|
250 260
....*....|....*....|....
gi 445966738 1185 ASESTSISLSESNSTSDSESQSAS 1208
Cdd:NF033849 470 TSQGQSVGTSESWSTSQSETDSVG 493
|
|
| LPXTG_anchor |
TIGR01167 |
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region ... |
2204-2238 |
6.16e-03 |
|
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]
Pssm-ID: 273478 [Multi-domain] Cd Length: 34 Bit Score: 36.30 E-value: 6.16e-03
10 20 30
....*....|....*....|....*....|....*
gi 445966738 2204 RLPDTGDSIKQNGLLGGVMtLLVGLGLMKRKKKKD 2238
Cdd:TIGR01167 1 KLPKTGESGNSLLLLLGLL-LLGLGGLLLRKRKKK 34
|
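The LPXTG description above says that sortase cleaves between the Thr and the Gly of the motif. As a rough illustration, the sketch below locates the motif in the query residues listed in the pfam00746 alignment (2197-2239) and reports the position of the Thr that would become the new C-terminus. The segment is copied from this report; the pattern `LP.TG` (x as any residue) and the helper name `find_lpxtg` are assumptions for illustration, not how CD-Search itself scores the hit.

```python
import re

# Query residues 2197-2239, copied from the pfam00746 alignment in this report.
SEGMENT_START = 2197
SEGMENT = "TPAQSEKRLPDTGDSIKQNGLLGGVMTLLVGLGLMKRKKKKDE"

# Literal reading of the motif name: x stands for any residue.
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(segment, start):
    """Return (motif text, position of Leu, position of Thr) for each match.

    Positions are 1-based in the full query; the Thr is the residue that
    precedes the Thr/Gly cleavage site described in the model text.
    """
    hits = []
    for m in LPXTG.finditer(segment):
        leu_pos = start + m.start()
        thr_pos = leu_pos + 3  # L-P-x-T: Thr is the fourth motif residue
        hits.append((m.group(), leu_pos, thr_pos))
    return hits


if __name__ == "__main__":
    print(find_lpxtg(SEGMENT, SEGMENT_START))
```

For this segment the single match is LPDTG with Leu at residue 2205 and Thr at residue 2208, which agrees with the TIGR01167 alignment starting at residue 2204 (RLPDTG).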
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| Bact_lectin |
pfam18483 |
Bacterial lectin; This entry primarily matches to legume-like lectin domains found in ... |
261-489 |
3.35e-56 |
|
Bacterial lectin; This entry primarily matches to legume-like lectin domains found in prokaryotes.
Pssm-ID: 465784 Cd Length: 211 Bit Score: 194.58 E-value: 3.35e-56
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 261 VNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYeGNGNGGDGIGFAFSPGv 340
Cdd:pfam18483 1 VTKDNFLDYFNLNGDATKQNYNGIVTLTPDQNGQSGAVTLKNKIDLNKDFTLKGAVNLGNKQ-SNTGGADGIGFVFHPG- 78
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 341 lGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSaakaNADPSNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLK 420
Cdd:pfam18483 79 -GGIGTSGGGLGIGGLPNAFGFKFDTYYNSGDSDP----NADPSQGAGGDPYGAFVTTDSNGNLTDVGSDSQTGSTQALD 153
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 445966738 421 VQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTrnisdwiaksgTTNFSLSMTASTGGATNLQQVQFGTF 489
Cdd:pfam18483 154 SSLEDGAFHPITISYDANTKTLTVTYDGNDSS-----------STKVYFGFAASTGGSTNLQQFKITSL 211
|
|
| lectin_L-type |
cd01951 |
legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate ... |
264-491 |
2.37e-22 |
|
legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.
Pssm-ID: 173886 [Multi-domain] Cd Length: 223 Bit Score: 97.88 E-value: 2.37e-22
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 264 DNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYegnGNGGDGIGFAFSPGVLGE 343
Cdd:cd01951 10 NNNQSNWQLNGSATLTTDSGVLRLTPDTGNQAGSAWYKTPIDLSKDFTTTFKFYLGTKG---TNGADGIAFVLQNDPAGA 86
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 344 TGLNGA--AVGIGGLSNAFGFKLDTYHNtskpnsaaKANADPSNvaggGAFGAFVTTDSYGVATTYTSSSTAdnaakLKV 421
Cdd:cd01951 87 LGGGGGggGLGYGGIGNSVAVEFDTYKN--------DDNNDPNG----NHISIDVNGNGNNTALATSLGSAS-----LPN 149
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 445966738 422 QPTNNTFQNFDITYNGDTKVMTVTYAGQ--TWTRNIS--DWIAKSGTTNFSLSMTASTGGATNLQQVQFGTFEY 491
Cdd:cd01951 150 GTGLGNEHTVRITYDPTTNTLTVYLDNGstLTSLDITipVDLIQLGPTKAYFGFTASTGGLTNLHDILNWSFTS 223
|
|
| He_PIG |
pfam05345 |
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ... |
664-749 |
6.53e-08 |
|
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.
Pssm-ID: 398814 [Multi-domain] Cd Length: 95 Bit Score: 52.09 E-value: 6.53e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 664 APTVTPIGDQSSEVYSPISPIKIATQDNSGN-------AVTNTVTGLPSGLTFDSTNNTISGTPTNI--GTSTISIVSTD 734
Cdd:pfam05345 1 PPVVTSPADQTATVGTPYSFTLSASGGSDPYggstvtySTTATGGALPSGLTLNSSTGTISGTPTSVqpGTYTFTVTATD 80
|
90
....*....|....*
gi 445966738 735 ASGNKTTTTFKYEVT 749
Cdd:pfam05345 81 SSGLSSSTTFTLTVT 95
|
|
| He_PIG |
pfam05345 |
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat ... |
577-659 |
1.55e-06 |
|
Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.
Pssm-ID: 398814 [Multi-domain] Cd Length: 95 Bit Score: 48.24 E-value: 1.55e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 577 PTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSATNSIIGTP--TKIGQSTVTVVSTDQANNKSTTT 654
Cdd:pfam05345 10 QTATVGTPYSFTLSASGGSDPYGGSTVTYSTTATGGALPSGLTLNSSTGTISGTPtsVQPGTYTFTVTATDSSGLSSSTT 89
|
....*
gi 445966738 655 FTINV 659
Cdd:pfam05345 90 FTLTV 94
|
|
| Gram_pos_anchor |
pfam00746 |
LPXTG cell wall anchor motif; |
2197-2239 |
3.21e-06 |
|
LPXTG cell wall anchor motif;
Pssm-ID: 366278 [Multi-domain] Cd Length: 43 Bit Score: 45.61 E-value: 3.21e-06
10 20 30 40
....*....|....*....|....*....|....*....|...
gi 445966738 2197 TPAQSEKRLPDTGDSIKQNGLLGGVMTLLVGLGLMKRKKKKDE 2239
Cdd:pfam00746 1 AKKSKKKTLPKTGENSNIFLTAAGLLALLGGLLLLVKRRKKEK 43
|
|
| KxYKxGKxW_sig |
pfam19258 |
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as ... |
14-50 |
8.38e-06 |
|
KxYKxGKxW signal peptide; This entry represents a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK signal peptide.
Pssm-ID: 466014 [Multi-domain] Cd Length: 41 Bit Score: 44.40 E-value: 8.38e-06
10 20 30
....*....|....*....|....*....|....*..
gi 445966738 14 NEKTRVRLYKSGKNWVKSGIKEIEMFKIMGLPFISHS 50
Cdd:pfam19258 1 ERKTHYKMYKSGKHWVFAGITTLGLGLGLLGGTTAAA 37
|
|
| FhaB |
COG3210 |
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... |
64-789 |
9.93e-06 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 51.31 E-value: 9.93e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210 54 NAGTTASTSGGSGTAGGVGNTSASTGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTGNNTGGTT 133
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210 134 TSSTNTVTTLGGTTTGNTVLSTSGAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATAGVLANAG 213
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210 214 GGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSG 293
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210 294 DTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLT 373
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQ---PTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIA 460
Cdd:COG3210 374 TAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTtiaGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSG 453
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 461 KSGTTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNVDQVVTIDNQQSALTAKGYNY 540
Cdd:COG3210 454 TTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTG 533
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 541 TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSY 620
Cdd:COG3210 534 GDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGT 613
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 621 DSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVT--- 697
Cdd:COG3210 614 ITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTlna 693
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 698 ---NTVTGLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKA 774
Cdd:COG3210 694 atgGTLNNAGNTLTISTGSITVTGQIGALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLT 773
|
730
....*....|....*
gi 445966738 775 DSQSASTSTSGSIVV 789
Cdd:COG3210 774 LANANGNTSAGATLD 788
|
|
| ser_rich_anae_1 |
NF033849 |
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ... |
530-812 |
1.14e-05 |
|
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
Pssm-ID: 468206 [Multi-domain] Cd Length: 1122 Bit Score: 50.77 E-value: 1.14e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 530 QSALTAKGYNY-TSVDSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGnQTIEVGKTMNPIVLTTTDNGAGTVT 608
Cdd:NF033849 237 QSAGTGYGESVgHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQS-TSESESTGQSSSVGTSESQSHGTTE 315
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 609 NTVTGLPSGLSYDSATNSIIGTPTKIGQSTVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSevyspispikiaT 688
Cdd:NF033849 316 GTSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHSTSSSVSSSESSS------------R 383
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 689 QDNSGNAVTNTVTGLPSGLTFDSTNNTISGTpTNIGTST-ISIVSTDASGNKTTTTFkyevtrNSMSDSVSTSGSTQQSQ 767
Cdd:NF033849 384 SSSSGVSGGFSGGIAGGGVTSEGLGASQGGS-EGWGSGDsVQSVSQSYGSSSSTGTS------SGHSDSSSHSTSSGQAD 456
|
250 260 270 280
....*....|....*....|....*....|....*....|....*
gi 445966738 768 SVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDSVSASKS 812
Cdd:NF033849 457 SVSQGTSWSEGTGTSQGQSVGTSESWSTSQSETDSVGDSTGTSES 501
|
|
| FhaB |
COG3210 |
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... |
64-1209 |
2.19e-05 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 50.15 E-value: 2.19e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 64 TGYGLKTTAVIGGAFTVNMLHDQQAFAASDAPLTSELNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVS 143
Cdd:COG3210 249 SSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGG 328
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 144 SEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSSSDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTS 223
Cdd:COG3210 329 NGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATG 408
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 224 TTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTITVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTR 303
Cdd:COG3210 409 NTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGG 488
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 304 IDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPGVLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADP 383
Cdd:COG3210 489 GIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLG 568
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 384 SNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKLKVQPTNNTFQNFDITYNGDTKVMTVTYAGQTWTRNISDWIAKSG 463
Cdd:COG3210 569 VLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGS 648
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 464 TTNFSLSMTASTGGATNLQQVQFGTFEYTESAVTQVRYVDVTTGKDIIPPKTYSGNvdqvvTIDNQQSALTAKGYNYTSV 543
Cdd:COG3210 649 GTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGN-----TLTISTGSITVTGQIGALA 723
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 544 DSSYASTYNDTNKTVKMTNAGQSVTYYFTDVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGLPSGLSYDSA 623
Cdd:COG3210 724 NANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANANGNTSAGATLDNAGAEISIDITADGT 803
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 624 TNSIIGTPTKIGQS--TVTVVSTDQANNKSTTTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKIATQDNSGNAVTNTVT 701
Cdd:COG3210 804 ITAAGTTAINVTGSggTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASITVGSGG 883
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 702 GLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSAST 781
Cdd:COG3210 884 VATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGA 963
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 782 STSGSIVVSTSASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEK 861
Cdd:COG3210 964 GDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQ 1043
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 862 SESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSTSGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSI 941
Cdd:COG3210 1044 NGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGG 1123
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 942 ANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSL 1021
Cdd:COG3210 1124 TTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDS 1203
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1022 SASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSM 1101
Cdd:COG3210 1204 TGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATG 1283
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1102 SISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTSTSDSISE 1181
Cdd:COG3210 1284 STVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTINTTAANTGLNGGNGATDSAA 1363
|
1130 1140
....*....|....*....|....*...
gi 445966738 1182 AISASESTSISLSESNSTSDSESQSASA 1209
Cdd:COG3210 1364 GAGSGGAAGSLAATAGAGTVLTGAGNNT 1391
|
|
| FhaB |
COG3210 |
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... |
100-1209 |
2.10e-04 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 46.68 E-value: 2.10e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 100 LNTQSETVGNQNSTTIEASTSTADSTSVTKNSSSVQTSNSDTVSSEKSEKVTSTTNSTSNQQEKLTSTSESTSSKNTTSS 179
Cdd:COG3210 183 SLNVVAANPTGVTGVGGALINATAGVLANAGGGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGA 262
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 180 SDTKSVASTSSTEQPINTSTNQSTASNNTSQSTTPSSVNLNKTSTTSTSTAPVKLRTFSRLAMSTFASAATTTAVTANTI 259
Cdd:COG3210 263 GGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAG 342
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 260 TVNKDNLKQYMTTSGNATYDQSTGIVTLTQDAYSQKGAITLGTRIDSNKSFHFSGKVNLGNKYEGNGNGGDGIGFAFSPG 339
Cdd:COG3210 343 LVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTTIAGNGGSAN 422
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 340 VLGETGLNGAAVGIGGLSNAFGFKLDTYHNTSKPNSAAKANADPSNVAGGGAFGAFVTTDSYGVATTYTSSSTADNAAKL 419
Cdd:COG3210 423 AGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNN 502
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 420 KVQPTNNTFQNFDITYNGDTKV-------MTVTYAGQTWTRNISDWIAKSGTTNFSLSMTASTGGATNLQQVQFGTFEYT 492
Cdd:COG3210 503 AGGDANGIATGLTGITAGGGGGgnatsggTGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTA 582
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 493 ESAVTQVRYVDVTTGKDIIPPKTYSGNVDQVVTIDNQQSALTAKGYNYTSVDSSYASTYNDTNKTVKMTNAGQSVTYYFT 572
Cdd:COG3210 583 GNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTT 662
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 573 DVKAPTVTVGNQTIEVGKTMNPIVLTTTDNGAGTVTNTVTGlpSGLSYDSATNSIIGTPTKIGQSTVTVVSTDQANNKST 652
Cdd:COG3210 663 GVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAG--NTLTISTGSITVTGQIGALANANGDTVTFGNLGTGAT 740
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 653 TTFTINVVDTTAPTVTPIGDQSSEVYSPISPIKI----------ATQDNSGNAVTNTVTGLPSGLTFDSTNNTISGTPTN 722
Cdd:COG3210 741 LTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLanangntsagATLDNAGAEISIDITADGTITAAGTTAINVTGSGGT 820
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 723 IGTSTISIVSTDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVS 802
Cdd:COG3210 821 ITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASITVGSGGVATSTGTANAGTLTNLG 900
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 803 LSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEKSESLSTSTSDSLRTSTSLSDS 882
Cdd:COG3210 901 TTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGAGDTGASSAAGSSAVGTS 980
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 883 LSMSTSGSLSKSQSLSTSTSGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLS 962
Cdd:COG3210 981 ANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQNGVGVNASGISGGNAAA 1060
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 963 TSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGS 1042
Cdd:COG3210 1061 LTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGGTTTVGATGTSTASTEAA 1140
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1043 TSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASS 1122
Cdd:COG3210 1141 GAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDSTGGSTTTIGTTNVTTTT 1220
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1123 ESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSESNSTSDS 1202
Cdd:COG3210 1221 TLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATGSTVDIGSTSATSAGGSL 1300
|
....*..
gi 445966738 1203 ESQSASA 1209
Cdd:COG3210 1301 DTTGNTA 1307
|
|
| KxYKxGKxW |
TIGR03715 |
KxYKxGKxW signal peptide; This model describes a novel form of signal peptide that occurs as ... |
16-33 |
8.49e-04 |
|
KxYKxGKxW signal peptide; This model describes a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK and PEP-CTERM forms of signal peptide. This domain tends to occur on long, low-complexity (usually Serine-rich and heavily glycosylated) proteins of the Firmicutes, and (as with YSIRK) the majority of these proteins have the LPXTG cell wall-anchoring motif at the C-terminus.
Pssm-ID: 274741 [Multi-domain] Cd Length: 23 Bit Score: 38.52 E-value: 8.49e-04
|
| ser_rich_anae_1 |
NF033849 |
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ... |
945-1208 |
3.29e-03 |
|
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
Pssm-ID: 468206 [Multi-domain] Cd Length: 1122 Bit Score: 42.69 E-value: 3.29e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 945 QSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSAS 1024
Cdd:NF033849 237 QSAGTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRGWSHTQSTSESESTGQSSSVGTSESQSHGTTEG 316
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1025 DSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSIS 1104
Cdd:NF033849 317 TSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSISHSESSSESTGTSVGHSTSSSVSSSESSSRSSSSGVSGGFSGG 396
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 1105 TSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLStsesdsiSESTSTSDSISEAIS 1184
Cdd:NF033849 397 IAGGGVTSEGLGASQGGSEGWGSGDSVQSVSQSYGSSSSTGTSSGHSDSSSHSTSSGQ-------ADSVSQGTSWSEGTG 469
|
250 260
....*....|....*....|....
gi 445966738 1185 ASESTSISLSESNSTSDSESQSAS 1208
Cdd:NF033849 470 TSQGQSVGTSESWSTSQSETDSVG 493
|
|
| HYR |
pfam02494 |
HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin ... |
572-659 |
5.89e-03 |
|
HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion.
Pssm-ID: 460572 [Multi-domain] Cd Length: 81 Bit Score: 37.75 E-value: 5.89e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 445966738 572 TDVKAPTVTVGN---QTIEVGKTMNPIVLTTT--DNGAGTVTNTVTGLPSGLSYdsatnsiigtptKIGQSTVTVVSTDQ 646
Cdd:pfam02494 1 VDTTPPTVKCPNnivRTVELGTSTVRVFFTEPtaFDNSGQAILVSRTAQPGDFF------------PVGTTTVTYVAYDN 68
|
90
....*....|...
gi 445966738 647 ANNKSTTTFTINV 659
Cdd:pfam02494 69 SGNRASCTFTVTV 81
|
|
| LPXTG_anchor |
TIGR01167 |
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region ... |
2204-2238 |
6.16e-03 |
|
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]
Pssm-ID: 273478 [Multi-domain] Cd Length: 34 Bit Score: 36.30 E-value: 6.16e-03
10 20 30
....*....|....*....|....*....|....*
gi 445966738 2204 RLPDTGDSIKQNGLLGGVMtLLVGLGLMKRKKKKD 2238
Cdd:TIGR01167 1 KLPKTGESGNSLLLLLGLL-LLGLGGLLLRKRKKK 34
|
|
|
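The LPXTG_anchor description above places the sortase cleavage site between the Thr and Gly of the motif. As a rough illustration only, the sketch below scans the query segment shown in the Gram_pos_anchor alignment (residues 2197-2239) for the pattern L-P-x-T-G and reports protein coordinates. The names `SEGMENT`, `SEGMENT_START`, and `find_lpxtg` are hypothetical helpers introduced here, not part of any NCBI tool, and a plain regular expression is a much cruder test than the profile models used to score these hits.

```python
import re

# Query residues 2197-2239, copied from the Gram_pos_anchor alignment above.
SEGMENT_START = 2197
SEGMENT = "TPAQSEKRLPDTGDSIKQNGLLGGVMTLLVGLGLMKRKKKKDE"

# L-P-any-T-G; a simple stand-in for the LPXTG motif (assumption: exact match only).
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(segment, start):
    """Return (motif, pos_of_L, pos_of_T, pos_of_G) per match, in 1-based protein coordinates."""
    hits = []
    for match in LPXTG.finditer(segment):
        first = start + match.start()  # coordinate of the leading Leu
        # Thr is the 4th motif residue and Gly the 5th; cleavage falls between them.
        hits.append((match.group(), first, first + 3, first + 4))
    return hits


for motif, pos_l, pos_t, pos_g in find_lpxtg(SEGMENT, SEGMENT_START):
    print(f"{motif}: Leu {pos_l}, predicted cleavage between Thr {pos_t} and Gly {pos_g}")
```

The reported Leu position can be checked by eye against the TIGR01167 alignment, whose query row begins at residue 2204 with `RLPDTG`.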