Supplementary material - File 1
The prokaryotic antecedents of the Ubiquitin signaling system
and the early evolution of ubiquitin-like β-grasp domains

Lakshminarayan M. Iyer, A. Maxwell Burroughs and L. Aravind


Presented below are the domain architectures and operon contexts of the systems
reported in the study. Each group is represented by the GI number of one component of the
operon (marked with an asterisk). Operons are usually shown next to the organism name,
where "->" indicates gene order in the 5' to 3' direction. Domain architectures are shown with
a '+' separating the domains. Also shown are the species name and the evolutionary group to which
each species belongs.
The general order of the major subgroups/operon types follows that of Table 1.
We also provide alignments of the various families described in the study.
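
The operon notation described above can be read mechanically. The following is a minimal
Python sketch, not part of the original analysis, that splits one operon string into genes.
The names parse_operon and Gene are hypothetical. Two readings are assumptions rather than
statements from the legend: a leading "<-" is taken to mean a gene on the opposite strand,
and "||" is taken to mark a break between adjacent gene groups. A gene written without any
arrow is given an unknown strand.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# One token is either a gene (optionally flanked by an arrow) or a "||" break.
TOKEN = re.compile(
    r"(?P<rev><-)?(?P<name>[A-Za-z0-9_?+*]+)(?P<fwd>->)?|(?P<brk>\|\|)"
)


@dataclass
class Gene:
    domains: Tuple[str, ...]   # domain names, split on '+'
    strand: Optional[str]      # '+' for "X->", '-' for "<-X" (assumed), None if no arrow
    is_anchor: bool            # True if marked with '*', i.e. the protein whose GI is listed
    after_break: bool          # True if the gene directly follows a "||" (assumed break)


def parse_operon(text: str) -> List[Gene]:
    """Parse an operon string such as '<-ThiE*||ThiO->ThiS->ThiG->'."""
    genes: List[Gene] = []
    pos = 0
    after_break = False
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected text at position {pos}: {text[pos:]!r}")
        pos = m.end()
        if m.group("brk"):
            after_break = True
            continue
        if m.group("rev") and m.group("fwd"):
            raise ValueError(f"gene has arrows on both sides: {m.group(0)!r}")
        name = m.group("name")
        strand = "-" if m.group("rev") else "+" if m.group("fwd") else None
        genes.append(
            Gene(
                domains=tuple(name.replace("*", "").split("+")),
                strand=strand,
                is_anchor="*" in name,
                after_break=after_break,
            )
        )
        after_break = False
    return genes


# Example: an operon containing a fused ThiD+ThiE protein marked with an asterisk.
for gene in parse_operon("ThiC->ThiD+ThiE*->ThiF->ThiS->ThiG->ThiH->"):
    print(gene.domains, gene.strand, gene.is_anchor)
```

Unknown genes written as "?" are kept as a domain named "?", so gene counts and
positions in the parsed list match the operon as printed in the tables.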
--------------------------------------------------------------------------------------------------------------
1A. Classical Thiamine biosynthesis pathway
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

# The GIs shown below are those of ThiE-, ThiD- and ThiG-like proteins (marked with an asterisk); the length of each protein is also shown
GI             LENGTH     Operon                                                                                                              ORGANISM                                                         Classification                                Protein description (if any)
13879931       222    <-ThiE*||ThiO->ThiS->ThiG->                                                                                             Mycobacterium tuberculosis CDC1551;                              actinobacteria
66869375       237    <-ThiG<-ThiS<-ThiO||ThiE*->                                                                                             Arthrobacter sp. FB24                                            actinobacteria
62424304       220    <-ThiF<-ThiG<-ThiS<-ThiO||ThiE*->                                                                                       Brevibacterium linens BL2                                        actinobacteria
13092621       235    ThiC->ThiE-><-ThiG<-ThiS<-ThiO||ThiE*->                                                                                 Mycobacterium leprae                                             actinobacteria
71915182       218    <-ThiG<-ThiS<-ThiO||?->ThiE*->                                                                                          Thermobifida fusca YX;                                           actinobacteria
41409995       223    <-ThiE*||ThiO->ThiS->ThiG->                                                                                             Mycobacterium avium subsp. paratuberculosis K-10                 actinobacteria
68263528       218    <-ThiF<-ThiG<-ThiS<-ThiO<-?<-ThiE*<-ThiC                                                                                Corynebacterium jeikeium K411                                    actinobacteria
86741756       237    <-ThiE*<-ThiS<-?<-?<-PDOR<-ThiH<-ThiG                                                                                   Frankia sp. CcI3                                                 actinobacteria
68230362       229    <-ThiE*<-ThiS<-?<-?<-PDOR<-ThiH<-ThiG                                                                                   Frankia sp. EAN1pec                                              actinobacteria
54018822       232    <-Mopterin_binding_protein<-ThiG<-ThiS<-ThiO||ThiE*->                                                                   Nocardia farcinica IFM 10152                                     actinobacteria
23493774       216    ThiE*->ThiO->ThiS->ThiG->ThiF->                                                                                         Corynebacterium efficiens YS-314                                 actinobacteria
38198930       222    ThiC->ThiE*->ThiO->ThiS->ThiG->ThiF->ThiD->                                                                             Corynebacterium diphtheriae                                      actinobacteria
13092617       279    ThiC->ThiD*-><-ThiG<-ThiS<-ThiO||ThiE->                                                                                 Mycobacterium leprae                                             actinobacteria
46191094       289    <-ThiF<-ThiG*<-ThiS                                                                                                     Bifidobacterium longum DJO10A                                    actinobacteria
5689919        264    <-PDOR||ThiO->ThiS->ThiG*->                                                                                             Streptomyces coelicolor A3(2)                                    actinobacteria
85666191       304    <-ThiG*<-ThiF<-ThiS                                                                                                     Bifidobacterium adolescentis                                     actinobacteria
71367701       197    ThiE->ThiO->ThiS->ThiG->ThiE*->ThiD->ThiC->                                                                             Nocardioides sp. JS614                                           actinobacteria
71481458       259    moaA-><-?||?->OAHSH->ThiS->ThiG*->ThiH->ThiF->                                                                          Prosthecochloris vibrioformis DSM 265                            bacteroidetes/chlorobi
34541689       259    <-ThiH<-ThiG*<-?<-ThiC<-ThiS                                                                                            Porphyromonas gingivalis W83                                     bacteroidetes/chlorobi
68550329       259    <-ThiF<-ThiH<-ThiG*<-ThiS<-OAHSH<-Cysteine_synthase<-permease                                                           Pelodictyon phaeoclathratiforme BU-1                             bacteroidetes/chlorobi
67939245       259    ThiS->ThiG*->ThiH->                                                                                                     Chlorobium phaeobacteroides BS1                                  bacteroidetes/chlorobi
21646637       259    <-ThiF<-ThiH<-ThiG*<-ThiS<-Cysteine_synthase<-OAHSH                                                                     Chlorobium tepidum TLS                                           bacteroidetes/chlorobi
67935885       259    <-ThiF<-ThiH<-ThiG*<-ThiS<-OAHSH||?-><-moaA                                                                             Chlorobium phaeobacteroides DSM 266                              bacteroidetes/chlorobi
78167321       275    OAHSH->ThiS->ThiG*->ThiH->ThiF->                                                                                        Pelodictyon luteolum DSM 273                                     bacteroidetes/chlorobi
78171432       256    OAHSH->?->ThiS->ThiG*->ThiH->?->ThiF->                                                                                  Chlorobium chlorochromatii CaD3                                  bacteroidetes/chlorobi
67919395       259    moaA-><-?||?->OAHSH->OAHSH->ThiS->ThiG*->ThiH->ThiF->                                                                   Chlorobium limicola DSM 245                                      bacteroidetes/chlorobi
60493472       204    ThiS->ThiE*->ThiG->ThiC->?->ThiH->                                                                                      Bacteroides fragilis NCTC 9343                                   bacteroidetes/chlorobi
52216685       204    ThiS->ThiE*->ThiG->ThiC->?->ThiH->                                                                                      Bacteroides fragilis YCH46                                       bacteroidetes/chlorobi
83758120       210    ThiO->ThiS->ThiG->ThiE*->ThiE->                                                                                         Salinibacter ruber DSM 13855                                     bacteroidetes/chlorobi
48855690       203    ThiS->ThiC->ThiD->ThiE*->ThiG->ThiH->                                                                                   Cytophaga hutchinsonii                                           bacteroidetes/chlorobi
83755862       290    ThiO->ThiS->ThiG->ThiE->ThiE*->                                                                                         Salinibacter ruber DSM 13855                                     bacteroidetes/chlorobi
29337956       209    <-ThiF<-ThiH<-ThiC<-ThiG<-ThiE*<-ThiS                                                                                   Bacteroides thetaiotaomicron VPI-5482                            bacteroidetes/chlorobi
33238326       346    ThiE*->ThiS->                                                                                                           Prochlorococcus marinus subsp. marinus str. CCMP1375             cyanobacteria
87125481       348    ThiE*->ThiS->                                                                                                           Synechococcus sp. RS9917                                         cyanobacteria
33639113       349    ThiE*->ThiS->                                                                                                           Synechococcus sp. WH 8102                                        cyanobacteria
86605751       257    <-ThiG*<-ThiS<-ThiO                                                                                                     Synechococcus sp. JA-3-3Ab                                       cyanobacteria
17130690       379    ThiE*->ThiS->                                                                                                           Nostoc sp. PCC 7120;                                             cyanobacteria
35210964       366    <-ThiS<-ThiE*                                                                                                           Gloeobacter violaceus PCC 7421;                                  cyanobacteria
67922607       338    ThiE*->ThiS->                                                                                                           Crocosphaera watsonii WH 8501                                    cyanobacteria
72002529       350    ThiE*->ThiS->                                                                                                           Prochlorococcus marinus str. NATL2A;                             cyanobacteria
33634552       353    <-ThiS<-ThiE*                                                                                                           Prochlorococcus marinus str. MIT 9313                            cyanobacteria
71674938       360    <-ThiS<-ThiE*                                                                                                           Trichodesmium erythraeum IMS101;                                 cyanobacteria
33640198       351    ThiE*->ThiS->                                                                                                           Prochlorococcus marinus subsp. pastoris str. CCMP1986            cyanobacteria
78713251       365    ThiE*->ThiS->                                                                                                           Prochlorococcus marinus str. MIT 9312;                           cyanobacteria
84512362       343    <-ThiS<-ThiE*                                                                                                           Prochlorococcus marinus str. MIT 9211                            cyanobacteria
56685459       343    <-ThiS<-ThiE*                                                                                                           Synechococcus elongatus PCC 6301                                 cyanobacteria
78169363       346    ThiE*->ThiS->                                                                                                           Synechococcus sp. CC9902                                         cyanobacteria
78196899       352    <-ThiS<-ThiE*                                                                                                           Synechococcus sp. CC9605                                         cyanobacteria
66797755       221    ThiC->ThiE*->ThiS->ThiG->?->ThiD->                                                                                      Deinococcus geothermalis DSM 11300                               deinococci
55772056       206    ThiE*->ThiS->ThiG->?->ThiC->?->ThiD->                                                                                   Thermus thermophilus HB8                                         deinococci
6460491        280    <-permease<-?<-?<-ThiD<-ThiG<-ThiS<-ThiE*<-ThiC                                                                         Deinococcus radiodurans R1                                       deinococci
82744798       256    Mopterin_binding_protein->?->ThiS->ThiG*->ThiH->                                                                        Clostridium beijerinckii NCIMB 8052                              firmicutes
72496362       218    <-ThiF<-ThiG<-ThiS<-ThiO<-ThiE*                                                                                         Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305     firmicutes
83590499       255    ThiS->ThiG*->ThiH->                                                                                                     Moorella thermoacetica ATCC 39073                                firmicutes
68055200       198    <-ThiF<-ThiG<-ThiS<-ThiO<-ThiE*                                                                                         Exiguobacterium sp. 255-15                                       firmicutes
15025970       195    <-ThiE*<-ThiH<-ThiG<-ThiF<-ThiS                                                                                         Clostridium acetobutylicum ATCC 824;                             firmicutes
77996134       215    ThiS->ThiG->ThiH->ThiF->ThiE*->                                                                                         Carboxydothermus hydrogenoformans Z-2901                         firmicutes
82499658       219    ThiS->ThiG->ThiH->ThiF->ThiE*->ThiC->                                                                                   Caldicellulosiruptor saccharolyticus DSM 8903                    firmicutes
2633520        205    ThiE*->ThiO->ThiS->ThiG->ThiF->ThiD->                                                                                   Bacillus subtilis subsp. subtilis str. 168;                      firmicutes
52002880       203    ThiE*->ThiO->ThiS->ThiG->ThiF->ThiD->                                                                                   Bacillus licheniformis ATCC 14580                                firmicutes
10174048       211    ThiE*->ThiS->ThiG->ThiO->ThiD->                                                                                         Bacillus halodurans C-125                                        firmicutes
68446290       197    <-ThiF<-ThiG<-ThiS<-ThiO<-ThiE*                                                                                         Staphylococcus haemolyticus JCSC1435                             firmicutes
57865486       152    ThiE*->ThiO->ThiS->ThiG->ThiF->                                                                                         Staphylococcus epidermidis RP62A                                 firmicutes
23023751       212    <-ThiG<-ThiF<-ThiS<-ThiE*                                                                                               Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293         firmicutes
56909741       209    ThiE*->ThiS->ThiG->ThiO->ThiD->                                                                                         Bacillus clausii KSM-K16                                         firmicutes
77683441       196    ThiS->ThiF->ThiG->ThiH->ThiC->ThiE*->                                                                                   Alkaliphilus metalliredigenes QYMF                               firmicutes
67875149       356    <-ThiC<-ThiE*<-ThiF<-ThiH<-ThiG<-ThiS                                                                                   Clostridium thermocellum ATCC 27405                              firmicutes
18145262       193    <-ThiE*<-ThiH<-ThiG<-ThiF<-ThiS                                                                                         Clostridium perfringens str. 13                                  firmicutes
47501161       206    Mopterin_binding_protein->?->?->ThiE*->ThiO->ThiS->ThiG->ThiF->ThiD->                                                   Bacillus anthracis str. 'Ames Ancestor'                          firmicutes
56378999       201    ThiE*->ThiO->ThiS->ThiG->ThiF->                                                                                         Geobacillus kaustophilus HTA426                                  firmicutes
19712999       206    <-ThiE<-ThiH<-ThiG<-ThiF<-ThiS<-ThiC<-ThiE*<-ThiD                                                                       Fusobacterium nucleatum subsp. nucleatum ATCC 25586;             fusobacteria
32397912       287    <-ThiG*||?-><-ThiS                                                                                                      Rhodopirellula baltica SH 1                                      planctomycetes
27354938       208    ThiO->ThiS->ThiG->ThiE*->ThiC->                                                                                         Bradyrhizobium japonicum USDA 110;                               proteobacteria>alphaproteobacteria
78495123       202    <-ThiC<-ThiE*<-ThiG<-ThiS<-ThiO                                                                                         Rhodopseudomonas palustris BisB18                                proteobacteria>alphaproteobacteria
69299787       198    ThiD->ThiO->ThiS->ThiG->ThiE*->ThiF->                                                                                   Silicibacter sp. TM1040                                          proteobacteria>alphaproteobacteria
56676713       198    ThiD->ThiO->ThiS->ThiG->ThiE*->ThiF->                                                                                   Silicibacter pomeroyi DSS-3                                      proteobacteria>alphaproteobacteria
83751112       201    <-ThiD<-ThiE*<-ThiG<-ThiS<-ThiO<-ThiC                                                                                   Bartonella bacilliformis KC583;                                  proteobacteria>alphaproteobacteria
49238087       252    <-ThiD*<-ThiE<-ThiG<-ThiS<-ThiO<-ThiC||?-><-Mopterin_binding_protein                                                    Bartonella henselae str. Houston-1                               proteobacteria>alphaproteobacteria
84501018       198    <-ThiF<-ThiE*<-ThiG<-ThiS<-ThiO<-ThiD                                                                                   Oceanicola batsensis HTCC2597                                    proteobacteria>alphaproteobacteria
85705980       198    <-ThiF<-ThiE*<-ThiG<-ThiS<-ThiO<-ThiD                                                                                   Roseovarius sp. 217                                              proteobacteria>alphaproteobacteria
83952604       203    ThiC->ThiO->ThiS->ThiG->ThiE*->ThiF->ThiD->                                                                             Roseovarius nubinhibens ISM                                      proteobacteria>alphaproteobacteria
86137738       196    ThiC->ThiO->ThiS->ThiG->ThiE*->ThiF->ThiD->                                                                             Roseobacter sp. MED193                                           proteobacteria>alphaproteobacteria
39650494       202    ThiO->ThiS->ThiG->ThiE*->ThiC->                                                                                         Rhodopseudomonas palustris CGA009                                proteobacteria>alphaproteobacteria
17983764       203    ThiD->ThiO->ThiS->ThiG->ThiE*->ThiC                                                                                     Brucella melitensis 16M;                                         proteobacteria>alphaproteobacteria
83954398       198    <-ThiF<-ThiE*<-ThiG<-ThiS<-ThiO<-ThiD                                                                                   Sulfitobacter sp. NAS-14.1                                       proteobacteria>alphaproteobacteria
71062546       257    ThiS->ThiG*->                                                                                                           Candidatus Pelagibacter ubique HTCC1062                          proteobacteria>alphaproteobacteria
69926560       208    <-ThiC<-?<-ThiE*<-ThiG<-ThiS<-ThiO                                                                                      Nitrobacter hamburgensis X14                                     proteobacteria>alphaproteobacteria
68192290       206    ThiC->ThiO->ThiS->ThiG->ThiE*->ThiD-><-OmpA                                                                             Mesorhizobium sp. BNC1                                           proteobacteria>alphaproteobacteria
14025575       201    <-ThiD<-ThiE*<-ThiG<-ThiS<-ThiO<-ThiC                                                                                   Mesorhizobium loti MAFF303099                                    proteobacteria>alphaproteobacteria
23011961       189    ThiO->ThiS->ThiG->ThiE*->                                                                                               Magnetospirillum magnetotacticum MS-1;                           proteobacteria>alphaproteobacteria
13423327       269    <-ThiG*<-ThiS                                                                                                           Caulobacter crescentus CB15                                      proteobacteria>alphaproteobacteria
84703985       259    <-phosphatidylglycerophosphate_synthase<-?<-?||ThiS->ThiG*->                                                            Parvularcula bermudensis HTCC2503                                proteobacteria>alphaproteobacteria
17741060       257    <-ThiG*<-ThiS<-ThiO<-ThiC                                                                                               Agrobacterium tumefaciens str. C58                               proteobacteria>alphaproteobacteria
58417134       266    <-ThiG*<-ThiS                                                                                                           Ehrlichia ruminantium str. Gardel                                proteobacteria>alphaproteobacteria
56416397       264    ThiS->ThiG*->                                                                                                           Anaplasma marginale str. St. Maries                              proteobacteria>alphaproteobacteria
83858498       262    <-ThiG*<-ThiS                                                                                                           Oceanicaulis alexandrii HTCC2633                                 proteobacteria>alphaproteobacteria
68538042       256    <-ThiG*<-ThiS                                                                                                           Sphingopyxis alaskensis RB2256                                   proteobacteria>alphaproteobacteria
78698311       202    <-ThiC<-ThiE*<-ThiG<-ThiS<-ThiO                                                                                         Bradyrhizobium sp. BTAi1                                         proteobacteria>alphaproteobacteria
72394551       261    <-ThiG*<-ThiS                                                                                                           Ehrlichia canis str. Jake                                        proteobacteria>alphaproteobacteria
74022860       312    ThiE-><-?||?-><-ThiE*<-ThiG<-ThiS<-ThiO                                                                                 Rhodoferax ferrireducens DSM 15236                               proteobacteria>betaproteobacteria
74019423       374    ThiO->ThiS->ThiG->ThiE*->Mopterin_binding_protein->                                                                     Burkholderia ambifaria AMMD;                                     proteobacteria>betaproteobacteria
7227331        205    ThiO->ThiE*->ThiS->ThiG->                                                                                               Neisseria meningitidis MC58                                      proteobacteria>betaproteobacteria
84713028       270    ThiC->ThiO->ThiS->ThiG->ThiE*->                                                                                         Polaromonas naphthalenivorans CJ2                                proteobacteria>betaproteobacteria
72117331       290    ThiC->ThiO->ThiS->ThiG->ThiE->?-><-?<-ThiD*||?->?->?->?->ThiS->                                                         Ralstonia eutropha JMP134                                        proteobacteria>betaproteobacteria
30138189       268    <-methylase<-ThiG*<-ThiS                                                                                                Nitrosomonas europaea ATCC 19718                                 proteobacteria>betaproteobacteria
82701205       264    <-methylase<-ThiG*<-ThiS                                                                                                Nitrosospira multiformis ATCC 25196                              proteobacteria>betaproteobacteria
71849093       260    <-methylase<-ThiG*<-ThiS<-ADH                                                                                           Dechloromonas aromatica RCB                                      proteobacteria>betaproteobacteria
68554870       276    ThiC->ThiO->ThiS->ThiG->ThiE->?-><-?<-ThiE*||?->?->?->?->ThiS->                                                         Ralstonia metallidurans CH34                                     proteobacteria>betaproteobacteria
34499221       264    <-ThiG*<-ThiS                                                                                                           Chromobacterium violaceum ATCC 12472                             proteobacteria>betaproteobacteria
47571796       176    ThiC->ThiO->ThiS->ThiG->ThiD*->                                                                                         Rubrivivax gelatinosus PM1                                       proteobacteria>betaproteobacteria
17427116       383    <-ThiE*<-ThiG||?-><-ThiS<-ThiO<-ThiC                                                                                    Ralstonia solanacearum;                                          proteobacteria>betaproteobacteria
74318144       262    <-methylase<-ThiG*<-ThiS                                                                                                Thiobacillus denitrificans ATCC 25259                            proteobacteria>betaproteobacteria
68212742       259    <-ThiG*<-ThiS                                                                                                           Methylobacillus flagellatus KT                                   proteobacteria>betaproteobacteria
77544040       206    ThiC->ThiS->ThiG->ThiH->?->ThiE*->                                                                                      Pelobacter carbinolicus DSM 2380                                 proteobacteria>deltaproteobacteria
78219006       214    ThiS->ThiG->ThiH->ThiF->ThiE*->                                                                                         Desulfovibrio desulfuricans G20;                                 proteobacteria>deltaproteobacteria
86158938       203    <-ThiE*||?-><-ThiG<-ThiS                                                                                                Anaeromyxobacter dehalogenans 2CP-C                              proteobacteria>deltaproteobacteria
71836232       223    <-ThiE*<-ThiG<-ThiS                                                                                                     Pelobacter propionicus DSM 2379                                  proteobacteria>deltaproteobacteria
71545062       222    ThiF->ThiG->ThiH->ThiS->ThiE*->?-><-?<-Mopterin_binding_protein                                                         Syntrophobacter fumaroxidans MPOB                                proteobacteria>deltaproteobacteria
85859826       229    <-ThiF<-ThiH<-ThiG<-ThiS<-ThiE*||?-><-?||?->Cysteine_synthase->                                                         Syntrophus aciditrophicus SB                                     proteobacteria>deltaproteobacteria
39982458       213    <-ThiE*<-ThiG<-ThiS                                                                                                     Geobacter sulfurreducens PCA                                     proteobacteria>deltaproteobacteria
78195386       213    ThiS->ThiG->ThiE*->                                                                                                     Geobacter metallireducens GS-15                                  proteobacteria>deltaproteobacteria
68178162       203    ThiF->ThiS->ThiG->ThiH->ThiE*->                                                                                         Desulfuromonas acetoxidans DSM 684;                              proteobacteria>deltaproteobacteria
50876628       263    ThiS->ThiG*->ThiH->                                                                                                     Desulfotalea psychrophila LSv54                                  proteobacteria>deltaproteobacteria
77544304       208    <-ThiE*<-ThiH<-ThiG<-ThiS<-ThiF                                                                                         Pelobacter carbinolicus DSM 2380                                 proteobacteria>deltaproteobacteria
46449915       226    <-ThiE*<-ThiF<-ThiH<-ThiG<-ThiS                                                                                         Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough        proteobacteria>deltaproteobacteria
57166733       201    <-ThiE*<-ThiH<-ThiG<-ThiF<-ThiS                                                                                         Campylobacter jejuni RM1221                                      proteobacteria>epsilonproteobacteria
57240558       200    ThiS->ThiF->ThiG->ThiH->ThiE*->                                                                                         Campylobacter lari RM2100                                        proteobacteria>epsilonproteobacteria
86155162       253    ThiS->ThiF->ThiG*->ThiH->                                                                                               Campylobacter fetus subsp. fetus 82-40                           proteobacteria>epsilonproteobacteria
56178885       504    ThiC->ThiD+ThiE*->ThiF->ThiS->ThiG->ThiH->                                                                              Idiomarina loihiensis L2TR;                                      proteobacteria>gammaproteobacteria
83643050       487    ThiC->ThiO->ThiS->ThiG->ThiD+ThiE*->                                                                                    Hahella chejuensis KCTC 2396;                                    proteobacteria>gammaproteobacteria
45437723       229    ThiC->ThiE*->ThiF->ThiS->ThiG->ThiH->                                                                                   Yersinia pestis biovar Medievalis str. 91001                     proteobacteria>gammaproteobacteria
51587926       215    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC                                                                                   Yersinia pseudotuberculosis IP 32953                             proteobacteria>gammaproteobacteria
12518922       211    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC                                                                                   Escherichia coli O157:H7 EDL933;                                 proteobacteria>gammaproteobacteria
49609718       213    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC                                                                                   Erwinia carotovora subsp. atroseptica SCRI1043                   proteobacteria>gammaproteobacteria
77960646       217    ThiC->ThiE*->ThiF->ThiS->ThiG->ThiH->                                                                                   Yersinia mollaretii ATCC 43969                                   proteobacteria>gammaproteobacteria
75855406       471    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC<-CcrB                                                                             Vibrio sp. Ex25;                                                 proteobacteria>gammaproteobacteria
68542221       650    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiD+ThiE*<-ThiC                                                                              Shewanella baltica OS155;                                        proteobacteria>gammaproteobacteria
29540946       479    ThiC->ThiO->ThiS->ThiG->ThiD+ThiE*->                                                                                    Coxiella burnetii RSA 493;                                       proteobacteria>gammaproteobacteria
71145222       529    ThiC->ThiO->ThiS->ThiG->ThiD+ThiE*->                                                                                    Colwellia psychrerythraea 34H;                                   proteobacteria>gammaproteobacteria
69953446       559    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiD+ThiE*<-ThiC                                                                              Shewanella frigidimarina NCIMB 400                               proteobacteria>gammaproteobacteria
53751266       488    ThiO->ThiS->ThiG->ThiD+ThiE*->ThiF->                                                                                    Legionella pneumophila str. Paris;                               proteobacteria>gammaproteobacteria
36783918       216    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC                                                                                   Photorhabdus luminescens subsp. laumondii TTO1                   proteobacteria>gammaproteobacteria
87119893       203    <-ThiE*<-ThiG<-ThiS<-ThiO<-?<-?<-?<-Mopterin_binding_protein                                                            Marinomonas sp. MED121                                           proteobacteria>gammaproteobacteria
76874359       508    ThiC->ThiO->ThiS->ThiG->ThiD+ThiE*->ThiF->                                                                              Pseudoalteromonas haloplanktis TAC125;                           proteobacteria>gammaproteobacteria
78362775       218    <-ThiE<-ThiE*<-ThiG<-ThiS<-ThiO<-ThiC                                                                                   Thiomicrospira crunogena XCL-2                                   proteobacteria>gammaproteobacteria
28808052       444    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC<-CcrB                                                                             Vibrio parahaemolyticus RIMD 2210633                             proteobacteria>gammaproteobacteria
77977810       226    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC                                                                                   Yersinia intermedia ATCC 29909                                   proteobacteria>gammaproteobacteria
77972243       226    ThiC->ThiE*->ThiF->ThiS->ThiG->ThiH->                                                                                   Yersinia frederiksenii ATCC 33641                                proteobacteria>gammaproteobacteria
68514852       525    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC                                                                                   Shewanella amazonensis SB2B                                      proteobacteria>gammaproteobacteria
77957006       216    ThiC->ThiE*->ThiF->ThiS->ThiG->ThiH->                                                                                   Yersinia bercovieri ATCC 43970 ;                                 proteobacteria>gammaproteobacteria
37200142       444    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC<-CcrB                                                                             Vibrio vulnificus YJ016                                          proteobacteria>gammaproteobacteria
69156747       613    ThiC->ThiE*->ThiF->ThiS->ThiG->ThiH->                                                                                   Shewanella denitrificans OS217                                   proteobacteria>gammaproteobacteria
84393668       430    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC<-CcrB                                                                             Vibrio splendidus 12B01                                          proteobacteria>gammaproteobacteria
78366585       581    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiD+ThiE*<-ThiC                                                                              Shewanella sp. PV-4                                              proteobacteria>gammaproteobacteria
78362774       281    <-ThiD*<-ThiE<-ThiG<-ThiS<-ThiO<-ThiC                                                                                   Thiomicrospira crunogena XCL-2                                   proteobacteria>gammaproteobacteria
9654457        440    CcrB->ThiC->ThiE*->ThiF->ThiS->ThiG->ThiH->                                                                             Vibrio cholerae O1 biovar eltor str. N16961                      proteobacteria>gammaproteobacteria
16422721       211    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiE*<-ThiC                                                                                   Salmonella typhimurium LT2                                       proteobacteria>gammaproteobacteria
24373991       526    <-ThiH<-ThiG<-ThiS<-ThiF<-ThiD+ThiE*<-ThiC                                                                              Shewanella oneidensis MR-1                                       proteobacteria>gammaproteobacteria
76793993       276    <-ThiF<-?<-ThiG*<-ThiS<-ThiO<-ThiC                                                                                      Pseudoalteromonas atlantica T6c                                  proteobacteria>gammaproteobacteria
49530466       261    <-ThiG*<-ThiS                                                                                                           Acinetobacter sp. ADP1                                           proteobacteria>gammaproteobacteria
26991780       270    <-methylase<-ThiG*<-ThiS<-?||?-><-?<-?<-Mopterin_binding_protein                                                        Pseudomonas putida KT2440                                        proteobacteria>gammaproteobacteria
71555612       264    <-methylase<-ThiG*<-ThiS<-?||?-><-?<-?<-Mopterin_binding_protein                                                        Pseudomonas syringae pv. phaseolicola 1448A                      proteobacteria>gammaproteobacteria
67677083       266    ThiS->ThiG*->methylase->                                                                                                Chromohalobacter salexigens DSM 3043                             proteobacteria>gammaproteobacteria
68347434       264    <-methylase<-ThiG*<-ThiS<-?||?-><-?<-?<-Mopterin_binding_protein                                                        Pseudomonas fluorescens Pf-5                                     proteobacteria>gammaproteobacteria
77953947       269    <-ThiG*<-ThiS                                                                                                           Marinobacter aquaeolei VT8                                       proteobacteria>gammaproteobacteria
67154906       264    <-methylase<-ThiG*<-ThiS<-?||?-><-JAB                                                                                   Azotobacter vinelandii AvOP                                      proteobacteria>gammaproteobacteria
78701989       262    ThiS->ThiG*->methylase->                                                                                                Alkalilimnicola ehrlichei MLHE-1                                 proteobacteria>gammaproteobacteria
48862780       269    ThiO->ThiS->ThiG*->?->?->Mopterin_binding_protein->                                                                     Microbulbifer degradans 2-40                                     proteobacteria>gammaproteobacteria
21109645       264    ThiS->ThiG*->methylase->                                                                                                Xanthomonas axonopodis pv. citri str. 306                        proteobacteria>gammaproteobacteria
9105679        275    ThiS->ThiG*->methylase->                                                                                                Xylella fastidiosa 9a5c                                          proteobacteria>gammaproteobacteria
71900706       275    <-methylase<-ThiG*<-ThiS                                                                                                Xylella fastidiosa Ann-1                                         proteobacteria>gammaproteobacteria

Bacterial ThiS proteins fused to ThiG (gis are of the ThiS+ThiG protein, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                              ORGANISM                                                         Classification                                Protein descriptions (if any)
68512207       252    ThiE->ThiS+ThiG*->                                                                                                      Rubrobacter xylanophilus DSM 9941;                               actinobacteria                                Thiamine monophosphate synthase [Rubrobacter xylanophilus DSM 9941]
79039407       331    ThiS+ThiG*                                                                                                              Novosphingobium aromaticivorans DSM 12444;                       proteobacteria>alphaproteobacteria            similar to Uncharacterized enzyme of thiazole biosynthesis [Novosphingobium aromaticivorans DSM 12444]
56551634       331    ThiS+ThiG*                                                                                                              Zymomonas mobilis subsp. mobilis ZM4;                            proteobacteria>alphaproteobacteria            thiazole biosynthesis protein [Zymomonas mobilis subsp. mobilis ZM4]
84788478       332    ThiS+ThiG*                                                                                                              Erythrobacter litoralis HTCC2594;                                proteobacteria>alphaproteobacteria            thiazole biosynthesis protein [Erythrobacter litoralis HTCC2594]
85709842       333    ThiS+ThiG*                                                                                                              Erythrobacter sp. NAP1;                                          proteobacteria>alphaproteobacteria            thiazole biosynthesis protein [Erythrobacter sp. NAP1]
83576681       334    ThiS+ThiG*                                                                                                              Rhodospirillum rubrum ATCC 11170;                                proteobacteria>alphaproteobacteria            ThiS, thiamine-biosynthesis [Rhodospirillum rubrum ATCC 11170]
76883424       347    ThiS+ThiG*                                                                                                              Nitrosococcus oceani ATCC 19707;                                 proteobacteria>gammaproteobacteria            ThiS, thiamine-biosynthesis [Nitrosococcus oceani ATCC 19707]
68246504       326    ThiS+ThiG*                                                                                                              Magnetococcus sp. MC-1;                                          proteobacteria                                ThiS, thiamine-biosynthesis [Magnetococcus sp. MC-1]
46202840       330    ThiS+ThiG*                                                                                                              Magnetospirillum magnetotacticum MS-1;                           proteobacteria>alphaproteobacteria            COG2022: Uncharacterized enzyme of thiazole biosynthesis [Magnetospirillum magnetotacticum MS-1]
53758359       326    ThiS+ThiG*                                                                                                              Methylococcus capsulatus str. Bath;                              proteobacteria>gammaproteobacteria            thiamine biosynthesis protein ThiS [Methylococcus capsulatus str. Bath]
82701206       162    ThiS->ThiG*->                                                                                                           Nitrosospira multiformis ATCC 25196;                             proteobacteria>betaproteobacteria             thiamine biosynthesis protein ThiS [Nitrosospira multiformis ATCC 25196]

Archaeal ThiS solos (gis are for the ThiS protein, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon (no particularly conserved operons were detected)                                                            ORGANISM (gis are of the ThiS protein)                           Classification                                Protein descriptions (if any)
48425680       77                                                                                                                             Pyrococcus furiosus DSM 3638                                     euryarchaeota                                 Chain A, Backbone Solution Structure Of Mixed Alpha/Beta Protein Pf1061
33359535       71                                                                                                                             Pyrococcus furiosus DSM 3638                                     euryarchaeota                                 sulfur carrier protein ThiS [Pyrococcus furiosus DSM 3638]
18893126       69                                                                                                                             Pyrococcus furiosus DSM 3638                                     euryarchaeota                                 hypothetical protein [Pyrococcus furiosus DSM 3638]
19916735       77                                                                                                                             Methanosarcina acetivorans C2A                                   euryarchaeota                                 predicted protein [Methanosarcina acetivorans C2A]
19916952       77                                                                                                                             Methanosarcina acetivorans C2A                                   euryarchaeota                                 predicted protein [Methanosarcina acetivorans C2A]
13540947       68                                                                                                                             Thermoplasma volcanium GSS1                                      euryarchaeota                                 hypothetical protein TVN0116 [Thermoplasma volcanium GSS1]
14324330       64                                                                                                                             Thermoplasma volcanium GSS1                                      euryarchaeota                                 hypothetical protein [Thermoplasma volcanium GSS1]
10581690       174                                                                                                                            Halobacterium sp. NRC-1                                          euryarchaeota                                 Vng2279h [Halobacterium sp. NRC-1]
18893610       73                                                                                                                             Pyrococcus furiosus DSM 3638                                     euryarchaeota                                 hypothetical protein [Pyrococcus furiosus DSM 3638]
2622875        70                                                                                                                             Methanothermobacter thermautotrophicus str. Delta H              euryarchaeota                                 unknown [Methanothermobacter thermautotrophicus str. Delta H]
19917335       70                                                                                                                             Methanosarcina acetivorans C2A                                   euryarchaeota                                 predicted protein [Methanosarcina acetivorans C2A]
21226239       70                                                                                                                             Methanosarcina mazei Go1                                         euryarchaeota                                 hypothetical protein MM0137 [Methanosarcina mazei Go1]
68211447       70                                                                                                                             Methanococcoides burtonii DSM 6242                               euryarchaeota                                 hypothetical protein MburDRAFT_0612 [Methanococcoides burtonii DSM 6242]
72398144       70                                                                                                                             Methanosarcina barkeri str. fusaro                               euryarchaeota                                 conserved hypothetical protein [Methanosarcina barkeri str. fusaro]
33356745       69                                                                                                                             Pyrococcus abyssi GE5                                            euryarchaeota                                 sulfur carrier protein ThiS [Pyrococcus abyssi GE5]
88951090       69                                                                                                                             Methanosaeta thermophila PT                                      euryarchaeota                                 conserved hypothetical protein [Methanosaeta thermophila PT]
48430257       64                                                                                                                             Picrophilus torridus DSM 9790                                    euryarchaeota                                 hypothetical protein PTO0537 [Picrophilus torridus DSM 9790]
44920975       64                                                                                                                             Methanococcus maripaludis S2                                     euryarchaeota                                 hypothetical protein [Methanococcus maripaludis S2]
10640784       67                                                                                                                             Thermoplasma acidophilum                                         euryarchaeota                                 hypothetical protein [Thermoplasma acidophilum]
11498344       67                                                                                                                             Archaeoglobus fulgidus DSM 4304                                  euryarchaeota                                 hypothetical protein AF0737 [Archaeoglobus fulgidus DSM 4304]
14591747       67                                                                                                                             Pyrococcus horikoshii OT3                                        euryarchaeota                                 sulfur carrier protein ThiS [Pyrococcus horikoshii OT3]
57159352       67                                                                                                                             Thermococcus kodakarensis KOD1                                   euryarchaeota                                 sulfur transfer protein involved in thiamine biosynthesis [Thermococcus kodakarensis KOD1]
55379215       66                                                                                                                             Haloarcula marismortui ATCC 43049                                euryarchaeota                                 hypothetical protein rrnAC2563 [Haloarcula marismortui ATCC 43049]
76801103       66                                                                                                                             Natronomonas pharaonis DSM 2160                                  euryarchaeota                                 homolog to thiamine biosynthesis protein ThiS (probable sulfur donor) [Natronomonas pharaonis DSM 2160]
84489151       66                                                                                                                             Methanosphaera stadtmanae DSM 3091                               euryarchaeota                                 hypothetical protein Msp_0330 [Methanosphaera stadtmanae DSM 3091]
68141055       64                                                                                                                             Ferroplasma acidarmanus Fer1                                     euryarchaeota                                 conserved hypothetical protein [Ferroplasma acidarmanus Fer1]
15622711       68                                                                                                                             Sulfolobus tokodaii str. 7                                       crenarchaeota                                 68aa long conserved hypothetical protein [Sulfolobus tokodaii str. 7]
68568033       68                                                                                                                             Sulfolobus acidocaldarius DSM 639                                crenarchaeota                                 conserved Archaeal protein [Sulfolobus acidocaldarius DSM 639]

1B. Variant Thiamine biosynthesis pathway (gis are for the ThiS+ThiF protein, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                              ORGANISM (gis are of the ThiS+ThiF protein)                          Classification                                 Protein descriptions (if any)
57240561       265     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Campylobacter lari RM2100                                        proteobacteria>epsilonproteobacteria          HesA/MoeB/ThiF family protein [Campylobacter lari RM2100]
57168916       266     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Campylobacter coli RM2228                                        proteobacteria>epsilonproteobacteria          HesA/MoeB/ThiF family protein [Campylobacter coli RM2228]
57166736       267     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Campylobacter jejuni RM1221                                      proteobacteria>epsilonproteobacteria          thiamine biosynthesis protein ThiF [Campylobacter jejuni RM1221]
86152451       267     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Campylobacter jejuni subsp. jejuni HB93-13                       proteobacteria>epsilonproteobacteria          thiamine biosynthesis protein ThiF [Campylobacter jejuni subsp. jejuni HB93-13]
86150511       267     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Campylobacter jejuni subsp. jejuni CF93-6                        proteobacteria>epsilonproteobacteria          thiamine biosynthesis protein ThiF [Campylobacter jejuni subsp. jejuni CF93-6]
86150854       267     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Campylobacter jejuni subsp. jejuni 260.94                        proteobacteria>epsilonproteobacteria          thiamine biosynthesis protein ThiF [Campylobacter jejuni subsp. jejuni 260.94]
87132835       267     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Campylobacter jejuni subsp. jejuni 84-25                         proteobacteria>epsilonproteobacteria          COG0476: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 [Campylobacter jejuni subsp. jejuni 84-25]
71837115       267     OAHShyd->OAHShyd->Cyssynthase->ThiS+ThiF*-> (operon gene displacement)                                                 Pelobacter propionicus DSM 2379                                  proteobacteria>deltaproteobacteria            UBA/THIF-type NAD/FAD binding fold [Pelobacter propionicus DSM 2379]
77544308       268     ThiS+ThiF*->ThiS->ThiG->ThiH->ThiE->                                                                                   Pelobacter carbinolicus DSM 2380                                 proteobacteria>deltaproteobacteria            molybdopterin biosynthesis protein MoeB [Pelobacter carbinolicus DSM 2380]
68178158       272     ThiS+ThiF*->ThiS->ThiG->ThiH->ThiE->                                                                                   Desulfuromonas acetoxidans DSM 684                               proteobacteria>deltaproteobacteria            UBA/THIF-type NAD/FAD binding fold [Desulfuromonas acetoxidans DSM 684]
18145265       269     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Clostridium perfringens str. 13                                  firmicutes                                    probable molybdopterin biosynthesis protein [Clostridium perfringens str. 13]
82748786       267     ThiS+ThiF*->ThiE->                                                                                                     Clostridium beijerinckii NCIMB 8052                              firmicutes                                    UBA/THIF-type NAD/FAD binding fold [Clostridium beijerinckii NCIMB 8052]
28203841       267     ThiD->ThiM->ThiE->ThiS+ThiF*->ThiG->ThiH->                                                                             Clostridium tetani E88                                           firmicutes                                    molybdopterin biosynthesis protein moeB [Clostridium tetani E88]
77683437       268     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiC->ThiE->                                                                             Alkaliphilus metalliredigenes QYMF                               firmicutes                                    UBA/THIF-type NAD/FAD binding fold [Alkaliphilus metalliredigenes QYMF]
15025973       266     ThiS->ThiS+ThiF*->ThiG->ThiH->ThiE->                                                                                   Clostridium acetobutylicum ATCC 824                              firmicutes                                    AE007789_11 Dinucleotide-utilizing enzyme involved in molybdopterin/thiamine biosynthesis [Clostridium acetobutylicum ATCC 824]

Thiamine biosynthesis pathways in operons with a Cys synthase (gis are for the Cys synthase (Cys syn), marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                              ORGANISM                                                         Classification                                  Protein descriptions (if any)
85859830       295    ThiF->ThiH->ThiG->ThiS<-?->?<-UbiA->Cys synthase*->                                                                     Syntrophus aciditrophicus SB                                     proteobacteria>deltaproteobacteria            cysteine synthase [Syntrophus aciditrophicus SB]
21646639       310    trans sulf->Cys synthase*->ThiS->ThiG->ThiH->                                                                           Chlorobium tepidum TLS                                           bacteroidetes/chlorobi                        cysteine synthase [Chlorobium tepidum TLS]
77545399       308    Cys syn*->OAHSH->ThiF->ThiS solo-> (probably molybdenum biosynthesis?)                                                  Pelobacter carbinolicus DSM 2380                                 proteobacteria>deltaproteobacteria            cysteine synthase [Pelobacter carbinolicus DSM 2380]

Miscellaneous pathway
67938818       328    Rrf2 (often fused to NifS)->Cys Synthase*->ThiS                                                                         Chlorobium phaeobacteroides BS1                                  bacteroidetes/chlorobi                        Cysteine synthase K/M:Cysteine synthase A [Chlorobium phaeobacteroides BS1]
-------------------------------------------------------------------------------------------------------------

2. Classical pathway: Molybdopterin cofactor biosynthesis and related pathways
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
a. Bacterial versions of the classical molybdenum cofactor (MoCo) biosynthesis pathway (the gis represent the MoaE protein, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                              ORGANISM                                                         Classification                                Protein descriptions (if any)
32033744       151     moaA->MoaC->MoaD->MoaE*->                                                                                              Actinobacillus pleuropneumoniae serovar 1 str. 4074;             proteobacteria>gammaproteobacteria            COG0314: Molybdopterin converting factor, large subunit [Actinobacillus pleuropneumoniae serovar 1 str. 4074]
75431112       159     moaA->MoaC->MoaD->MoaE*->                                                                                              Actinobacillus succinogenes 130Z                                 proteobacteria>gammaproteobacteria            molybdopterin converting factor, large subunit [Actinobacillus succinogenes 130Z]
45435690       152     <-MoaE*<-MoaD<-MoaC<-moaA                                                                                              Yersinia pestis biovar Medievalis str. 91001                     proteobacteria>gammaproteobacteria            molybdopterin [mpt] converting factor, subunit 2 [Yersinia pestis biovar Medievalis str. 91001]
71038168       184     MoeA->moaA->MoaB->?->MoaC->MoaD->MoaE*->Mo_transporter->permease->                                                     Psychrobacter arcticus 273-4                                     proteobacteria>gammaproteobacteria            probable molybdopterin converting factor, large subunit [Psychrobacter arcticus 273-4]
67156788       148     MoaC->MoaD->MoaE*->                                                                                                    Azotobacter vinelandii AvOP                                      proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Azotobacter vinelandii AvOP]
28868460       148     MoaC->MoaD->MoaE*->                                                                                                    Pseudomonas syringae pv. tomato str. DC3000                      proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Pseudomonas syringae pv. tomato str. DC3000]
21107238       146     MoaC->MoaD->MoaE*->                                                                                                    Xanthomonas axonopodis pv. citri str. 306                        proteobacteria>gammaproteobacteria            molybdopterin-converting factor chain 2 [Xanthomonas axonopodis pv. citri str. 306]
26988029       148     MoaC->MoaD->MoaE*->                                                                                                    Pseudomonas putida KT2440                                        proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Pseudomonas putida KT2440]
77381196       150     MoaC->MoaD->MoaE*->                                                                                                    Pseudomonas fluorescens PfO-1                                    proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Pseudomonas fluorescens PfO-1]
68345385       152     <-MoaB||MobA->?->?-><-MoaE*<-MoaD<-MoaC<-?<-moaA                                                                       Pseudomonas fluorescens Pf-5                                     proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Pseudomonas fluorescens Pf-5]
77382375       153     <-MoaE*<-MoaD<-MoaC<-?<-?||MoaB->MoeA->                                                                                Pseudomonas fluorescens PfO-1                                    proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Pseudomonas fluorescens PfO-1]
68342721       150     MoaC->MoaD->MoaE*->                                                                                                    Pseudomonas fluorescens Pf-5                                     proteobacteria>gammaproteobacteria            molybdopterin converting factor, subunit 2 [Pseudomonas fluorescens Pf-5]
84319277       150     <-MoeA<-MoaB<-MoaE*<-MoaD                                                                                              Pseudomonas aeruginosa C3719                                     proteobacteria>gammaproteobacteria            COG0314: Molybdopterin converting factor, large subunit [Pseudomonas aeruginosa C3719]
76791053       154     <-permease<-Mo_transporter<-MoaE*<-MoaD<-MoaC<-MoaB<-?<-moaA                                                           Pseudoalteromonas atlantica T6c                                  proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Pseudoalteromonas atlantica T6c]
36784881       150     moaA->MoaC->MoaD->MoaE*->                                                                                              Photorhabdus luminescens subsp. laumondii TTO1                   proteobacteria>gammaproteobacteria            molybdopterin [MPT] converting factor, subunit 2 (molybdenum cofactor biosynthesis protein E) (molybdopterin converting factor large subunit) [Photorhabdus luminescens subsp. laumondii TTO1]
37198129       151     moaA->MoaB->MoaC->MoaD->MoaE*->?->?->?->Mopterin_binding_protein->                                                     Vibrio vulnificus YJ016                                          proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Vibrio vulnificus YJ016]
46912748       149     moaA->MoaC->MoaD->MoaE*->                                                                                              Photobacterium profundum SS9                                     proteobacteria>gammaproteobacteria            putative molybdenum cofactor biosynthesis protein E [Photobacterium profundum SS9]
67676069       152     <-moaA<-MoaE*<-MoaD<-MoeA<-mobB<-MobA<-Mopterin_binding_protein<-permease                                              Chromohalobacter salexigens DSM 3043                             proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Chromohalobacter salexigens DSM 3043]
71144053       156     <-MoeA||moaA->?->MoaB->MoaC->MoaD->MoaE*->Mo_transporter->permease->Mopterin_binding_protein->MoeB->                   Colwellia psychrerythraea 34H                                    proteobacteria>gammaproteobacteria            molybdopterin converting factor, subunit 2 [Colwellia psychrerythraea 34H]
12720896       150     <-MoaE*<-MoaD<-MoaC<-moaA                                                                                              Pasteurella multocida subsp. multocida str. Pm70                 proteobacteria>gammaproteobacteria            MoaE [Pasteurella multocida subsp. multocida str. Pm70]
49612263       150     <-MoaE*<-MoaD<-MoaC<-MoaB<-moaA                                                                                        Erwinia carotovora subsp. atroseptica SCRI1043                   proteobacteria>gammaproteobacteria            molybdopterin converting factor subunit 2 [Erwinia carotovora subsp. atroseptica SCRI1043]
84393379       157     <-Mopterin_binding_protein<-?<-?<-?<-MoaE*<-MoaD<-MoaC<-MoaB<-moaA                                                     Vibrio splendidus 12B01                                          proteobacteria>gammaproteobacteria            Molybdenum cofactor biosynthesis protein E [Vibrio splendidus 12B01]
28807085       151     <-Mopterin_binding_protein<-?<-?<-?<-MoaE*<-MoaD<-MoaC<-MoaB<-moaA                                                     Vibrio parahaemolyticus RIMD 2210633                             proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Vibrio parahaemolyticus RIMD 2210633]
26107156       150     moaA->MoaB-><-?||MoaC->MoaD->MoaE*->                                                                                   Escherichia coli CFT073                                          proteobacteria>gammaproteobacteria            AE016757_244 Molybdopterin converting factor subunit 2 [Escherichia coli CFT073]
59711550       148     moaA->MoaC->MoaD->MoaE*->                                                                                              Vibrio fischeri ES114                                            proteobacteria>gammaproteobacteria            molybdopterin converting factor, large subunit [Vibrio fischeri ES114]
33148682       151     <-MoaE*<-MoaD<-MoaC<-moaA                                                                                              Haemophilus ducreyi 35000HP                                      proteobacteria>gammaproteobacteria            molybdopterin converting factor subunit 2 [Haemophilus ducreyi 35000HP]
1574523        150     <-MoaE*<-MoaD<-MoaC<-moaA                                                                                              Haemophilus influenzae Rd KW20                                   proteobacteria>gammaproteobacteria            molybdopterin converting factor, subunit 2 (moaE) [Haemophilus influenzae Rd KW20]
23467045       150     moaA->MoaC->MoaD->MoaE*->                                                                                              Haemophilus somnus 129PT                                         proteobacteria>gammaproteobacteria            COG0314: Molybdopterin converting factor, large subunit [Haemophilus somnus 129PT]
68545075       152     <-Mopterin_binding_protein<-permease<-Mo_transporter<-MoaE*<-MoaD<-MoaC<-MoaB<-moaA                                    Shewanella amazonensis SB2B                                      proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Shewanella amazonensis SB2B]
69158642       156     <-Mopterin_binding_protein<-permease<-Mo_transporter<-MoaE*<-MoaD<-MoaC<-moaA                                          Shewanella denitrificans OS217                                   proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Shewanella denitrificans OS217]
75819544       153     <-MoaE*<-MoaD<-MoaC<-MoaB<-moaA                                                                                        Vibrio cholerae V51                                              proteobacteria>gammaproteobacteria            COG0314: Molybdopterin converting factor, large subunit [Vibrio cholerae V51]
48861422       145     MoaC->MoaD->MoaE*->                                                                                                    Microbulbifer degradans 2-40                                     proteobacteria>gammaproteobacteria            COG0314: Molybdopterin converting factor, large subunit [Microbulbifer degradans 2-40]
52307131       161     moaA->MoaC->MoaD->MoaE*->                                                                                              Mannheimia succiniciproducens MBEL55E                            proteobacteria>gammaproteobacteria            MoaE protein [Mannheimia succiniciproducens MBEL55E]
77951749       148     MobA-><-MoaE*<-MoaD<-MoeA<-MoaB                                                                                        Marinobacter aquaeolei VT8                                       proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Marinobacter aquaeolei VT8]
87121810       151     <-MoaE*<-MoaD<-MoaC                                                                                                    Marinomonas sp. MED121                                           proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Marinomonas sp. MED121]
78362791       155     moaA->MoaD->MoaE*->MoeA->MoaC->                                                                                        Thiomicrospira crunogena XCL-2                                   proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Thiomicrospira crunogena XCL-2]
53756579       151     MoaD->MoaE*->                                                                                                          Methylococcus capsulatus str. Bath                               proteobacteria>gammaproteobacteria            molybdopterin converting factor, subunit 2 [Methylococcus capsulatus str. Bath]
24375927       155     <-Mopterin_binding_protein<-permease<-Mo_transporter<-MoaE*<-MoaD<-MoaC<-moaA                                          Shewanella oneidensis MR-1                                       proteobacteria>gammaproteobacteria            molybdenum cofactor biosynthesis protein E [Shewanella oneidensis MR-1]
69951943       172     <-Mopterin_binding_protein<-permease<-Mo_transporter<-MoaE*<-MoaD<-MoaC<-moaA                                          Shewanella frigidimarina NCIMB 400                               proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Shewanella frigidimarina NCIMB 400]
71364350       163     <-permease<-Mo_transporter<-MoaE*<-MoaD<-MoaC<-?<-MoaB<-moaA<-MoeA                                                     Psychrobacter cryohalolentis K5                                  proteobacteria>gammaproteobacteria            Molybdopterin biosynthesis MoaE [Psychrobacter cryohalolentis K5]
86154629       148     <-MoeA<-?<-?<-MoaE*<-MoaD                                                                                              Campylobacter fetus subsp. fetus 82-40                           proteobacteria>epsilonproteobacteria          molybdopterin converting factor, subunit 2 [Campylobacter fetus subsp. fetus 82-40]
78776455       145     MoaD->MoaE*->MoeA->                                                                                                    Thiomicrospira denitrificans ATCC 33889                          proteobacteria>epsilonproteobacteria          possible molybdopterin converting factor, subunit 2 [Thiomicrospira denitrificans ATCC 33889]
15645419       145     <-MoaC<-MoaB<-MoaE*<-MoaD                                                                                              Helicobacter pylori 26695                                        proteobacteria>epsilonproteobacteria          molybdopterin converting factor, subunit 2 (moaE) [Helicobacter pylori 26695]
32261594       157     MoeA->MoaD->MoaE*->mobB->MoaB->                                                                                        Helicobacter hepaticus ATCC 51449                                proteobacteria>epsilonproteobacteria          molybdopterin converting factor [Helicobacter hepaticus ATCC 51449]
57167345       147     MoaD->MoaE*->?->MoeA->                                                                                                 Campylobacter jejuni RM1221;                                     proteobacteria>epsilonproteobacteria          molybdopterin converting factor, subunit 2 [Campylobacter jejuni RM1221]
34483283       145     MoeA->MoaD->MoaE*->mobB->MoaB->MoaC->                                                                                  Wolinella succinogenes                                           proteobacteria>epsilonproteobacteria          POSSIBLE MOLYBDOPTERIN CONVERTING FACTOR, SUBUNIT 2 [Wolinella succinogenes]
57505250       151     MoaD->MoaE*->MoeA->                                                                                                    Campylobacter upsaliensis RM3195                                 proteobacteria>epsilonproteobacteria          molybdopterin converting factor, subunit 2 [Campylobacter upsaliensis RM3195]
34495640       158     <-MoaE*<-MoaD                                                                                                          Chromobacterium violaceum ATCC 12472                             proteobacteria>betaproteobacteria             molybdopterin converting factor subunit 2 [Chromobacterium violaceum ATCC 12472]
18076268       172     MoeA->MoaD->MoaE*->CcrB->                                                                                              Cupriavidus necator                                              proteobacteria>betaproteobacteria             molybdopterin synthase large subunit [Cupriavidus necator]
67907156       226     <-MoaE*||?-><-MoaD<-MoeA<-mobB<-Threonine_synthase                                                                     Polaromonas sp. JS666                                            proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoaE [Polaromonas sp. JS666]
74022613       163     <-MoaE*<-MoaD<-MoeA<-mobB<-Threonine_synthase                                                                          Rhodoferax ferrireducens DSM 15236                               proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoaE [Rhodoferax ferrireducens DSM 15236]
47573809       159     Threonine_synthase->mobB->MoeA->MoaD->?->MoaE*->CcrB->                                                                 Rubrivivax gelatinosus PM1                                       proteobacteria>betaproteobacteria             COG0314: Molybdopterin converting factor, large subunit [Rubrivivax gelatinosus PM1]
74317045       151     <-moaA||mobB->MoeA->MoaD->MoaE*->                                                                                      Thiobacillus denitrificans ATCC 25259                            proteobacteria>betaproteobacteria             molybdenum cofactor biosynthesis protein E [Thiobacillus denitrificans ATCC 25259]
83719603       166     Threonine_synthase->MoeA->MoaD->MoaE*->                                                                                Burkholderia thailandensis E264                                  proteobacteria>betaproteobacteria             molybdopterin converting factor, subunit 2 [Burkholderia thailandensis E264]
77964629       189     <-MoaD<-MoaE*<-moaA<-MoeA                                                                                              Burkholderia sp. 383                                             proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoaE [Burkholderia sp. 383]
67664216       190     MoeA->moaA->MoaE*->MoaD->                                                                                              Burkholderia cenocepacia HI2424                                  proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoaE [Burkholderia cenocepacia HI2424]
74018016       187     MoeA->moaA->MoaE*->MoaD->                                                                                              Burkholderia ambifaria AMMD;                                     proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoaE [Burkholderia ambifaria AMMD]
84713091       157     Threonine_synthase->mobB->MoeA->MoaD->MoaE*->                                                                          Polaromonas naphthalenivorans CJ2                                proteobacteria>betaproteobacteria             moaE, RSc1332; probable molybdopterin mpt converting factor (subunit 2) protein [Polaromonas naphthalenivorans CJ2]
33563746       163     ModE->moaA-><-MoeA<-MoaB<-MoaE*<-MoaD<-MoaC                                                                            Bordetella pertussis Tohama I                                    proteobacteria>betaproteobacteria             molybdopterin converting factor [Bordetella pertussis Tohama I]
56315291       161     <-MoaE*<-MoaD<-MoeA<-mobB                                                                                              Azoarcus sp. EbN1                                                proteobacteria>betaproteobacteria             Molybdenum cofactor biosynthesis protein E [Azoarcus sp. EbN1]
68212269       149     <-MoaE*<-MoaD                                                                                                          Methylobacillus flagellatus KT                                   proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoaE [Methylobacillus flagellatus KT]
68557891       163     <-CcrB<-MoaE*<-MoaD<-MoeA<-Threonine_synthase                                                                          Ralstonia metallidurans CH34                                     proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoaE [Ralstonia metallidurans CH34]
17428347       176     Threonine_synthase->?->MoeA->MoaD->MoaE*->CcrB->                                                                       Ralstonia solanacearum                                           proteobacteria>betaproteobacteria             PROBABLE MOLYBDOPTERIN MPT CONVERTING FACTOR (SUBUNIT 2) PROTEIN [Ralstonia solanacearum]
86357114       153     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease<-ADH||OmpA->                                        Rhizobium etli CFN 42                                            proteobacteria>alphaproteobacteria            molybdopterin converting factor subunit 2 protein [Rhizobium etli CFN 42]
77389070       146     <-ADH||?->?->?-><-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease<-ADH                                Rhodobacter sphaeroides 2.4.1                                    proteobacteria>alphaproteobacteria            Molybdopterin converting factor subunit 2 [Rhodobacter sphaeroides 2.4.1]
27355756       160     <-OmpA||Excinuclease->phosphatidylglycerophosphate_synthase->MoaD->MoaE*->                                             Bradyrhizobium japonicum USDA 110                                proteobacteria>alphaproteobacteria            molybdopterin converting factor large subunit [Bradyrhizobium japonicum USDA 110]
23347497       163     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease<-ADH||OmpA->                                        Brucella suis 1330                                               proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Brucella suis 1330]
39648091       155     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease||OmpA->                                             Rhodopseudomonas palustris CGA009                                proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Rhodopseudomonas palustris CGA009]
78494766       152     <-OmpA||Excinuclease->phosphatidylglycerophosphate_synthase->MoaD->MoaE*->                                             Rhodopseudomonas palustris BisB18                                proteobacteria>alphaproteobacteria            Molybdopterin biosynthesis MoaE [Rhodopseudomonas palustris BisB18]
83577061       162     MobA->MoaC->MoaD->MoaE*->                                                                                              Rhodospirillum rubrum ATCC 11170                                 proteobacteria>alphaproteobacteria            Molybdopterin biosynthesis MoaE [Rhodospirillum rubrum ATCC 11170]
85705895       147     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-?<-?<-?<-Excinuclease                                            Roseovarius sp. 217                                              proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Roseovarius sp. 217]
84705082       155     <-MoaB<-MoaE*<-MoaD<-moaA                                                                                              Parvularcula bermudensis HTCC2503                                proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Parvularcula bermudensis HTCC2503]
69936171       146     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase                                                                   Paracoccus denitrificans PD1222                                  proteobacteria>alphaproteobacteria            Molybdopterin biosynthesis MoaE [Paracoccus denitrificans PD1222]
69926308       155     <-OmpA||Excinuclease->phosphatidylglycerophosphate_synthase->MoaD->MoaE*->                                             Nitrobacter hamburgensis X14                                     proteobacteria>alphaproteobacteria            Molybdopterin biosynthesis MoaE [Nitrobacter hamburgensis X14]
13421104       150     <-MoaC<-MoaB<-MoaE*<-MoaD<-moaA                                                                                        Caulobacter crescentus CB15;(Note MoaB related to MoeA)          proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Caulobacter crescentus CB15]
84786468       156     moaA->MoaD->MoaE*->                                                                                                    Erythrobacter litoralis HTCC2594                                 proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Erythrobacter litoralis HTCC2594]
15074100       155     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease<-ADH||OmpA->                                        Sinorhizobium meliloti                                           proteobacteria>alphaproteobacteria            PROBABLE MOLYBDOPTERIN MPT CONVERTING FACTOR, SUBUNIT 2 PROTEIN [Sinorhizobium meliloti]
68538766       146     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase                                                                   Sphingopyxis alaskensis RB2256                                   proteobacteria>alphaproteobacteria            Molybdopterin biosynthesis MoaE [Sphingopyxis alaskensis RB2256]
68193705       154     <-OmpA||ADH->Excinuclease->phosphatidylglycerophosphate_synthase->MoaD->MoaE*->                                        Mesorhizobium sp. BNC1                                           proteobacteria>alphaproteobacteria            Molybdopterin biosynthesis MoaE [Mesorhizobium sp. BNC1]
14027319       159     <-OmpA||ADH->Excinuclease->phosphatidylglycerophosphate_synthase->MoaD->MoaE*->                                        Mesorhizobium loti MAFF303099                                    proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Mesorhizobium loti MAFF303099]
23016727       158     ADH->Excinuclease->phosphatidylglycerophosphate_synthase->mobB->MoeA->MoaD->MoaE*->                                    Magnetospirillum magnetotacticum MS-1                            proteobacteria>alphaproteobacteria            COG0314: Molybdopterin converting factor, large subunit [Magnetospirillum magnetotacticum MS-1]
83854897       147     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease<-ADH                                                Sulfitobacter sp. NAS-14.1                                       proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Sulfitobacter sp. NAS-14.1]
85707988       147     moaA->MoaD->MoaE*->                                                                                                    Erythrobacter sp. NAP1                                           proteobacteria>alphaproteobacteria            molybdopterin converting factor, subunit 2 [Erythrobacter sp. NAP1]
68180109       147     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease<-ADH                                                Jannaschia sp. CCS1                                              proteobacteria>alphaproteobacteria            Molybdopterin biosynthesis MoaE [Jannaschia sp. CCS1]
58001332       170     <-MoaE*<-MoaD<-MoaC<-moaA<-MoeA                                                                                        Gluconobacter oxydans 621H                                       proteobacteria>alphaproteobacteria            Molybdopterin (MPT) converting factor, subunit 2 [Gluconobacter oxydans 621H]
15156159       155     <-MoaE*<-MoaD<-phosphatidylglycerophosphate_synthase<-Excinuclease<-ADH||OmpA->                                        Agrobacterium tumefaciens str. C58;                              proteobacteria>alphaproteobacteria            AGR_C_2084p [Agrobacterium tumefaciens str. C58]
32444388       170     MoaD->MoaE*->                                                                                                          Rhodopirellula baltica SH 1                                      planctomycetes                                molybdopterin converting factor, large subunit [Rhodopirellula baltica SH 1]
28271029       133     <-Mopterin_binding_protein<-?<-?||MoaE*->MoaD->moaA->                                                                  Lactobacillus plantarum WCFS1                                    firmicutes                                    molybdopterin biosynthesis protein, E chain [Lactobacillus plantarum WCFS1]
16410446       140     <-permease||?->MoeA->mobB->MoaE*->MoaD->MoaC->moaA-><-MoaB<-MoeB                                                       Listeria monocytogenes                                           firmicutes                                    lmo1044 [Listeria monocytogenes]
56379151       155     moaA-><-?||MoeA->mobB->MoaE*->MoaD->                                                                                   Geobacillus kaustophilus HTA426                                  firmicutes                                    molybdopterin converting factor (subunit 2) [Geobacillus kaustophilus HTA426]
72494466       148     MoaB-><-MoaC||MoeA->mobB->MoaE*->MoaD->MobA->moaA->                                                                    Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305     firmicutes                                    molybdopterin converting factor large subunit [Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305]
29898351       139     <-ADH<-?<-?<-MoaD<-MoaE*<-mobB<-MoeA||MoaC-><-MoeB                                                                     Bacillus cereus ATCC 14579                                       firmicutes                                    Molybdopterin (MPT) converting factor, subunit 2 [Bacillus cereus ATCC 14579]
29895811       156     moaA->MoeB->MoeA->MoaE*->MoaD->                                                                                        Bacillus cereus ATCC 14579                                       firmicutes                                    Molybdopterin (MPT) converting factor, subunit 2 [Bacillus cereus ATCC 14579]
56908909       142     <-MoaB||moaA->MoeA->mobB->MoaE*->MoaD->                                                                                Bacillus clausii KSM-K16                                         firmicutes                                    molybdopterin converting factor subunit 2 MoaE [Bacillus clausii KSM-K16]
10175641       156     MobA-><-MoaD<-MoaE*<-mobB<-MoeA<-MoaB||MoaC->                                                                          Bacillus halodurans C-125                                        firmicutes                                    molybdopterin converting factor (subunit 2) [Bacillus halodurans C-125]
52003240       164     MobA->MoeB->MoeA->mobB->MoaE*->MoaD->                                                                                  Bacillus licheniformis ATCC 14580                                firmicutes                                    molybdopterin converting factor (subunit 2) [Bacillus licheniformis ATCC 14580]
2633801        157     MobA->MoeB->MoeA->mobB->MoaE*->MoaD->?->?->?->?->Mopterin_binding_protein->                                            Bacillus subtilis subsp. subtilis str. 168;                      firmicutes                                    molybdopterin converting factor (subunit 2) [Bacillus subtilis subsp. subtilis str. 168]
75760852       165     <-ADH<-?<-?<-MoaD<-MoaE*<-mobB<-MoeA||MoaC-><-MoeB                                                                     Bacillus thuringiensis serovar israelensis ATCC 35646            firmicutes                                    Molybdopterin converting factor, large subunit [Bacillus thuringiensis serovar israelensis ATCC 35646]
75762420       157     moaA->?->MoeB->MoeA->MoaE*->MoaD->                                                                                     Bacillus thuringiensis serovar israelensis ATCC 35646            firmicutes                                    Molybdopterin converting factor, large subunit [Bacillus thuringiensis serovar israelensis ATCC 35646]
49242615       148     <-moaA<-MobA<-MoaD<-MoaE*<-mobB<-MoeA||MoaC-><-MoaB                                                                    Staphylococcus aureus subsp. aureus MRSA252                      firmicutes                                    putative molybdopterin-synthase large subunit [Staphylococcus aureus subsp. aureus MRSA252]
3955206        150     MoaB-><-MoaC||MoeA->mobB->MoaE*->MoaD->MobA->moaA->                                                                    Staphylococcus carnosus                                          firmicutes                                    MoaE [Staphylococcus carnosus]
57867759       150     <-moaA<-MobA<-MoaD<-MoaE*<-mobB<-MoeA||MoaC-><-MoaB                                                                    Staphylococcus epidermidis RP62A                                 firmicutes                                    molybdenum cofactor biosynthesis protein E [Staphylococcus epidermidis RP62A]
68446506       149     MoaB-><-MoaC||MoeA->mobB->MoaE*->MoaD->MobA->moaA->                                                                    Staphylococcus haemolyticus JCSC1435                             firmicutes                                    molybdopterin converting factor moa [Staphylococcus haemolyticus JCSC1435]
78704014       132     MoaD->MoaE*->                                                                                                          Methanospirillum hungatei JF-1                                   euryarchaeota                                 Molybdopterin biosynthesis MoaE [Methanospirillum hungatei JF-1]
78705135       135     MoeB->MoaD->MoaE*-><-?||?->permease->Mopterin_binding_protein->                                                        Methanospirillum hungatei JF-1                                   euryarchaeota                                 Molybdopterin biosynthesis MoaE [Methanospirillum hungatei JF-1]
86604897       161     <-MoaE*<-MoaD<-?<-moaA<-MoeA                                                                                           Cyanobacteria bacterium Yellowstone A-Prime                      cyanobacteria                                 molybdopterin converting factor, subunit 2 [Cyanobacteria bacterium Yellowstone A-Prime]
35214942       149     MoaD->MoaE*->                                                                                                          Gloeobacter violaceus PCC 7421; MoaD->MoaE                       cyanobacteria                                 molybdopterin converting factor subunit 2 [Gloeobacter violaceus PCC 7421]
1001213        145     Ferr-nitrite_reductase->cyanate_lyase->MoeA->moaA->MoaC+MobA->MoaD->MoaE*->                                            Synechocystis sp. PCC 6803                                       cyanobacteria                                 molybdopterin (MPT) converting factor, subunit 2 [Synechocystis sp. PCC 6803]
33639603       142     MoaC->MoeA-><-?||?-><-MoaE*<-MoaD||MoaB->                                                                              Synechococcus sp. WH 8102                                        cyanobacteria                                 molybdenum cofactor biosynthesis protein E (molydbopterin converting factor large subunit) [Synechococcus sp. WH 8102]
78170140       148     MoaC->MoeA->sugar_epimerase-><-MoaE*<-MoaD||MoaB->                                                                     Synechococcus sp. CC9902                                         cyanobacteria                                 molybdenum cofactor biosynthesis protein E [Synechococcus sp. CC9902]
22295084       148     LysR<-MoaE*<-MoaD<-MoaC+MobA<-moaA<-MoeA                                                                               Thermosynechococcus elongatus BP-1                               cyanobacteria                                 molybdopterin (MPT) converting factor, subunit 2 [Thermosynechococcus elongatus BP-1]
76261575       137     ADH->MoaD->MoaE*->                                                                                                     Chloroflexus aurantiacus J-10-fl                                 chloroflexi                                   Molybdopterin biosynthesis MoaE [Chloroflexus aurantiacus J-10-fl]
86134371       140     <-MoaE*||?-><-MoaD                                                                                                     Tenacibaculum sp. MED152                                         bacteroidetes/chlorobi                        molybdopterin converting factor, subunit 2 [Tenacibaculum sp. MED152]
67937986       130     <-MoaE*<-MoaD<-MoeA<-MoaC+MoeA                                                                                         Chlorobium phaeobacteroides BS1;                                 bacteroidetes/chlorobi                        Molybdopterin biosynthesis MoaE [Chlorobium phaeobacteroides BS1]
86143256       142     <-moaA<-MoaC+MoeA<-MoaE*<-MoeB<-MoaD<-MobA<-ModE<-MoeA                                                                 Flavobacterium sp. MED217;                                       bacteroidetes/chlorobi                        molybdopterin converting factor, subunit 2 [Flavobacterium sp. MED217]
68553533       130     <-MoaE*<-MoaD<-MoeA<-?<-moaA                                                                                           Prosthecochloris aestuarii DSM 271                               bacteroidetes/chlorobi                        Molybdopterin biosynthesis MoaE [Prosthecochloris aestuarii DSM 271]
68562527       146     MoaD->MoaE*->                                                                                                          Rubrobacter xylanophilus DSM 9941; MoaD->MoaE                    actinobacteria                                Molybdopterin biosynthesis MoaE [Rubrobacter xylanophilus DSM 9941]
54017798       145     <-MoaD<-moaA||MoeA->?->MoaE*->                                                                                         Nocardia farcinica IFM 10152                                     actinobacteria                                putative molybdopterin biosynthesis protein [Nocardia farcinica IFM 10152]
13880439       141     MoaC->MoaB->MoaE*-><-?<-MoaD<-moaA                                                                                     Mycobacterium tuberculosis CDC1551                               actinobacteria                                molybdopterin cofactor biosynthesis protein E [Mycobacterium tuberculosis CDC1551]
62425449       140     <-MoaE*<-MoaC<-MoeA||moaA->MoaD-><-MoeB+Rhod<-MoeA                                                                     Brevibacterium linens BL2                                        actinobacteria                                COG0314: Molybdopterin converting factor, large subunit [Brevibacterium linens BL2]
25169125       155     <-MoaD<-MoaD<-MoaD<-moaA||MoeA->MoaC->MoaE*->                                                                          Arthrobacter nicotinovorans                                      actinobacteria                                molybdopterin synthase (large subunit moaE) [Arthrobacter nicotinovorans]
41406902       141     MoaC->MoaB->MoaE*-><-?<-MoaD<-moaA                                                                                     Mycobacterium avium subsp. paratuberculosis K-10                 actinobacteria                                MoaE2 [Mycobacterium avium subsp. paratuberculosis K-10]
12620120       150     moaA->MoaB->MoaC->MoaD->MoaE*->                                                                                        uncultured bacterium pCosHE1                                                                                   AF250774_5 putative molybdopterin converting factor subunit 2 [uncultured bacterium pCosHE1]
40062751       148     <-MoaB<-MoaD<-moaA<-MoaC<-MoeA||MobA-><-MoaE*                                                                          uncultured bacterium 439                                                                                       molydopterin converting factor, subunit 2 [uncultured bacterium 439]

Example of a MoaC fused to a MoaD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH    Operon                                                                                                                  ORGANISM                                                        Classification                                Protein descriptions (if any)
84319278       243    PhoH(PIN+ATPase)->MoaC+MoaD->MoaE->MoaB->MoeA                                                                           Pseudomonas aeruginosa C3719                                     proteobacteria>gammaproteobacteria            COG0315: Molybdenum cofactor biosynthesis enzyme [Pseudomonas aeruginosa C3719]
67676070       262    permease->ABC ATPase->MobA->MobB->MoeA->MoaC+MoaD->MoaE->MoaA->                                                         Chromohalobacter salexigens DSM 3043                             proteobacteria>gammaproteobacteria            Molybdopterin cofactor biosynthesis protein MoaC [Chromohalobacter salexigens DSM 3043]
Bacterial MoaDs that are fused to MoaE (Gis are for the MoaD+MoaE protein- marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH    Operon                                                                                                              ORGANISM                                                         Classification                                Protein descriptions (if any)
67927210       229    MoaD+MoaE*                                                                                                              Solibacter usitatus Ellin6076                                    fibrobacteres/acidobacteria                   Molybdopterin biosynthesis MoaE:ThiamineS [Solibacter usitatus Ellin6076]
46200249       223    MoaD+MoaE*                                                                                                              Thermus thermophilus HB27                                        deinococci                                    molybdopterin (MPT) converting factor, subunit 2 [Thermus thermophilus HB27]
66799395       273    MoaD+MoaE*                                                                                                              Deinococcus geothermalis DSM 11300                               deinococci                                    Molybdopterin converting factor, subunit 1 [Deinococcus geothermalis DSM 11300]
6460436        229    MoaD+MoaE*                                                                                                              Deinococcus radiodurans R1                                       deinococci                                    AE002090_1 molybdenum cofactor biosynthesis protein D/E [Deinococcus radiodurans R1]
51858004       230    MoaD+MoaE*                                                                                                              Symbiobacterium thermophilum IAM 14863                           actinobacteria                                molybdopterin converting factor-like protein [Symbiobacterium thermophilum IAM 14863]
13883249       221    MoaA->dehydratase-> MoaC->MoaD+MoaE*->                                                                                  Mycobacterium tuberculosis CDC1551                               actinobacteria                                (dehydratase-pterin-4-alpha-carbinolamine dehydratase) molybdopterin cofactor biosynthesis protein D/E [Mycobacterium tuberculosis CDC1551]


b.  Archaeal pathways involved in MOCO biosynthesis and related pathways (MoaD gis)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Molybdenum pathway (Basic construction with minor elaboration)
Gis are for the MoaD containing protein (marked with an asterisk)
GI             LENGTH     Operon                                                                                                              ORGANISM                                                         Classification                                Protein descriptions (if any)
15621527       236        MoaD+MoaE*->ThiD+X->HD->InPP+X->Glucosaminyltransferase->                                                           Sulfolobus tokodaii str. 7                                       crenarchaeota                                 236aa long hypothetical molybdopterin converting factor [Sulfolobus tokodaii str. 7]
68567385       235        MoaD+MoaE*->ThiD+X->HD->InPP+X->Glucosaminyltransferase->                                                           Sulfolobus acidocaldarius DSM 639                                crenarchaeota                                 molybdenum cofactor biosynthesis protein D/E [Sulfolobus acidocaldarius DSM 639]
13815697       231        MoaD+MoaE*->ThiD+X->HD->InPP+X->Glucosaminyltransferase->                                                           Sulfolobus solfataricus P2                                       crenarchaeota                                 Molybdenum cofactor biosynthesis protein E (moaE) [Sulfolobus solfataricus P2]
18159566       229        MoaD+MoaE*                                                                                                          Pyrobaculum aerophilum str. IM2                                  crenarchaeota                                 molybdenum cofactor biosynthesis protein D/E [Pyrobaculum aerophilum str. IM2]
88950646       130        MoaD*->MoaE                                                                                                         Methanosaeta thermophila PT                                      euryarchaeota                                 MoaD, archaeal [Methanosaeta thermophila PT]
88603453       92         MoaD*->MoaE                                                                                                         Methanospirillum hungatei JF-1                                   euryarchaeota                                 thiamineS [Methanospirillum hungatei JF-1]
88601825       91         MoeB->MoaD*->MoaE->                                                                                                 Methanospirillum hungatei JF-1                                   euryarchaeota                                 thiamineS [Methanospirillum hungatei JF-1]
48430776       75         MoaC->MoaB->MoaE-><-Sugar_transporter<-MoaA<-MoaD*                                                                  Picrophilus torridus DSM 9790                                    euryarchaeota                                 molybdopterin (MPT) converting factor, subunit 1 [Picrophilus torridus DSM 9790]
57160377       88         MoaD*->MoeB<-?->MoaE->                                                                                              Thermococcus kodakarensis KOD1                                   euryarchaeota                                 molybdopterin converting factor, subunit 1 [Thermococcus kodakarensis KOD1]
33356787       94         MoeA->MoaD*->                                                                                                       Pyrococcus abyssi GE5                                            euryarchaeota                                 molybdopterin converting factor, subunit 1 [Pyrococcus abyssi GE5]
5458838        89         MoeA->MoaD*->                                                                                                       Pyrococcus abyssi GE5                                            euryarchaeota                                 moaD molybdopterin synthase, small subunit [Pyrococcus abyssi GE5]
33359306       89         MoeA->MoaD*->                                                                                                       Pyrococcus horikoshii OT3                                        euryarchaeota                                 putative molybdopterin converting factor, subunit 1 [Pyrococcus horikoshii OT3]
18892532       90         MoeA->MoaD*->                                                                                                       Pyrococcus furiosus DSM 3638                                     euryarchaeota                                 molybdopterin converting factor, subunit 1; (moaD) [Pyrococcus furiosus DSM 3638]
10640334       85         MoaD*->?->TFIIB<-MoeA+PBPII<-MoeA                                                                                   Thermoplasma acidophilum                                         euryarchaeota                                 MoaD (involved in molybdopterin synthesis) related protein [Thermoplasma acidophilum]
14324783       90         MoeA->MoeA+PPBII-><-WcaG->MoaD*->                                                                                   Thermoplasma volcanium GSS1                                      euryarchaeota                                 molybdopterin converting factor subunit 1 [Thermoplasma volcanium GSS1]

Archaeal MoaD Solos (Gis are of the MoaD protein-marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                              ORGANISM                                                         Classification                                Protein descriptions (if any)
11499216       86         MoaD*                                                                                                               Archaeoglobus fulgidus DSM 4304                                  euryarchaeota                                 molybdopterin converting factor, subunit 1 (moaD) [Archaeoglobus fulgidus DSM 4304]
55378770       133        MoaD*-><-MoaD                                                                                                       Haloarcula marismortui ATCC 43049                                euryarchaeota                                 hypothetical protein rrnAC2058 [Haloarcula marismortui ATCC 43049]
55379974       92         MoaD*                                                                                                               Haloarcula marismortui ATCC 43049                                euryarchaeota                                 hypothetical protein rrnAC3439 [Haloarcula marismortui ATCC 43049]
10581293       100        MoaD*-><-MoaD                                                                                                       Halobacterium sp. NRC-1                                          euryarchaeota                                 Vng1848h [Halobacterium sp. NRC-1]
68210071       112        MoaD*->CrcB->CrcB->                                                                                                 Methanococcoides burtonii DSM 6242                               euryarchaeota                                 MoaD, archaeal [Methanococcoides burtonii DSM 6242]
19918186       97         MoaD*->MoeA->CrcB->CrcB->                                                                                           Methanosarcina acetivorans C2A                                   euryarchaeota                                 molybdopterin converting factor, subunit 1 [Methanosarcina acetivorans C2A]
21226933       97         MoaD*->MoeA->CrcB->CrcB->                                                                                           Methanosarcina mazei Go1                                         euryarchaeota                                 Molybdopterin converting factor small subunit [Methanosarcina mazei Go1]
72395205       97         MoaD*->MoeA->CrcB->CrcB->                                                                                           Methanosarcina barkeri str. fusaro                               euryarchaeota                                 molybdopterin converting factor small subunit [Methanosarcina barkeri str. fusaro]
76801893       93         MoaD*                                                                                                               Natronomonas pharaonis DSM 2160                                  euryarchaeota                                 probable molybdopterin converting factor, small subunit 2 [Natronomonas pharaonis DSM 2160]
76803138       92         MoaD*                                                                                                               Natronomonas pharaonis DSM 2160                                  euryarchaeota                                 probable molybdopterin converting factor, small subunit 1 [Natronomonas pharaonis DSM 2160]
76802608       97         MoaD*                                                                                                               Natronomonas pharaonis DSM 2160                                  euryarchaeota                                 homolog to molybdopterin converting factor, small subunit [Natronomonas pharaonis DSM 2160]
18160633       93         MoaD*                                                                                                               Pyrobaculum aerophilum str. IM2                                  crenarchaeota                                 conserved hypothetical protein [Pyrobaculum aerophilum str. IM2]
18160535       90         MoaD*                                                                                                               Pyrobaculum aerophilum str. IM2                                  crenarchaeota                                 conserved hypothetical protein [Pyrobaculum aerophilum str. IM2]
18161603       94         MoaD*                                                                                                               Pyrobaculum aerophilum str. IM2                                  crenarchaeota                                 conserved hypothetical protein [Pyrobaculum aerophilum str. IM2]
33356700       78         MoaD*->CBS->                                                                                                        Pyrococcus abyssi GE5                                            euryarchaeota                                 hypothetical protein PAB1981.1n [Pyrococcus abyssi GE5]
18893753       79         MoaD*->CBS->                                                                                                        Pyrococcus furiosus DSM 3638                                     euryarchaeota                                 hypothetical protein [Pyrococcus furiosus DSM 3638]
33359416       75         MoaD*->CBS->                                                                                                        Pyrococcus horikoshii OT3                                        euryarchaeota                                 hypothetical protein PH1595.1n [Pyrococcus horikoshii OT3]
68567124       84         MoaD*                                                                                                               Sulfolobus acidocaldarius DSM 639                                crenarchaeota                                 conserved Archaeal protein [Sulfolobus acidocaldarius DSM 639]
10640172       90         MoaD*                                                                                                               Thermoplasma acidophilum                                         euryarchaeota                                 conserved hypothetical protein [Thermoplasma acidophilum]
42557747       448        MoaD*                                                                                                               uncultured crenarchaeote                                         crenarchaeota                                 putative molybdopterin biosynthesis protein [uncultured crenarchaeote]

Archaeal operons that have the MoaE protein and do not include the MoaD protein
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Archaeal MoaEs without MoaD, including MobB+MoaE fusions (MobB: nitrogenase-like GTPase). Note that many of these genomes have a MoaD solo elsewhere.
Gis are for the MoaE protein (marked with an asterisk)
GI             LENGTH     Operon                                                                                                               ORGANISM                                                        Classification                                Protein descriptions (if any)
55377991       275        MoaC->MoaE*<-CysT<-ModA->ThiC->                                                                                      Haloarcula marismortui ATCC 43049                               euryarchaeota                                 molybdenum cofactor biosynthesis protein [Haloarcula marismortui ATCC 43049]
10579734       297        <-MobB+MoaE*<-MoeA-->MoaA<-MoeA+PPBDII                                                                               Halobacterium sp. NRC-1                                         euryarchaeota                                 (Molybd binding domain)- molybdenum cofactor biosynthesis protein; MoaE [Halobacterium sp. NRC-1]
68211451       276        <-MobB+MoaE*<-RadC                                                                                                   Methanococcoides burtonii DSM 6242                              euryarchaeota                                 Molybdopterin-guanine dinucleotide biosynthesis protein [Methanococcoides burtonii DSM 6242]
72397307       213        <-MoeA<-MobB+MoaE*                                                                                                   Methanosarcina barkeri str. fusaro                              euryarchaeota                                 molybdopterin converting factor, subunit 2 [Methanosarcina barkeri str. fusaro]
21228894       278        MobB+MoaE*->MobA<-MoeA+PPBDII                                                                                        Methanosarcina mazei Go1                                        euryarchaeota                                 Molybdopterin converting factor, subunit 2 [Methanosarcina mazei Go1]
19915893       279        <-RadC<-MobB+MoaE*->MobA->                                                                                           Methanosarcina acetivorans C2A                                  euryarchaeota                                 molybdopterin-guanine dinucleotide biosynthesis protein B/molybdopterin converting factor, large subunit [Methanosarcina acetivorans C2A]
72396955       285        MobB+MoaE*->MobA->                                                                                                   Methanosarcina barkeri str. fusaro                              euryarchaeota                                 molybdopterin converting factor, subunit 2 [Methanosarcina barkeri str. fusaro]
76801780       262        MobB+MoaE*->?->metalloprotease<-ThiL                                                                                 Natronomonas pharaonis DSM 2160                                 euryarchaeota                                 molybdopterin converting factor, large subunit [Natronomonas pharaonis DSM 2160]
5104258        249        MoaE*                                                                                                                Aeropyrum pernix K1                                             crenarchaeota                                 249aa long hypothetical molybdopterin (mpt) converting factor, subunit 2 [Aeropyrum pernix K1]
11499761       239        <-RecB<-phosphoesterase->MoaE*->                                                                                     Archaeoglobus fulgidus DSM 4304                                 euryarchaeota                                 molybdopterin converting factor, subunit 2 (moaE) [Archaeoglobus fulgidus DSM 4304]
2833554        119        MoaE*                                                                                                                Methanocaldococcus jannaschii                                   euryarchaeota                                 Y717_METJA Hypothetical protein MJ0717
2621190        143        MoaE*->HD hydrolase->Flavoprotein->                                                                                  Methanothermobacter thermautotrophicus str. Delta H             euryarchaeota                                 molybdenum cofactor biosynthesis protein MoaE [Methanothermobacter thermautotrophicus str. Delta H]
5457600        148        TPR<-MoaE*->FeS oxidoreductase->KaiC->                                                                               Pyrococcus abyssi GE5                                           euryarchaeota                                 moaE molybdopterin synthase, large chain [Pyrococcus abyssi GE5]
68139846       136        MoaE*                                                                                                                Ferroplasma acidarmanus Fer1                                    euryarchaeota                                 Molybdopterin biosynthesis MoaE [Ferroplasma acidarmanus Fer1]
45047664       141        MoaE*                                                                                                                Methanococcus maripaludis S2                                    euryarchaeota                                 Molybdopterin biosynthesis MoaE [Methanococcus maripaludis S2]
72397308       60         MoaE*                                                                                                                Methanosarcina barkeri str. fusaro                              euryarchaeota                                 hypothetical protein Mbar_A2676 [Methanosarcina barkeri str. fusaro]
18892013       145        MoaE*                                                                                                                Pyrococcus furiosus DSM 3638                                    euryarchaeota                                 molybdopterin converting factor (subunit 2) [Pyrococcus furiosus DSM 3638]
10640820       135        MoaE*                                                                                                                Thermoplasma acidophilum                                        euryarchaeota                                 molybdopterin-synthase large subunit related protein [Thermoplasma acidophilum]
14324300       137        MoaE*                                                                                                                Thermoplasma volcanium GSS1 a_b_hydrolase->MoaE                 euryarchaeota                                 molybdopterin converting factor subunit 2 [Thermoplasma volcanium GSS1]
52549594       130        MoaE*                                                                                                                uncultured archaeon GZfos28G7                                                                                 molybdopterin converting factor subunit 2 [uncultured archaeon GZfos28G7]
52550228       130        MoaE*                                                                                                                uncultured archaeon GZfos36D8                                                                                 molybdopterin converting factor large subunit [uncultured archaeon GZfos36D8]
52548569       134        MoaE*                                                                                                                uncultured archaeon GZfos17C7                                                                                 molybdopterin converting factor large subunit [uncultured archaeon GZfos17C7]

Miscellaneous pathways
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                               ORGANISM                                                        Classification                                Protein descriptions (if any)
11498162       88        MoaD->MoeB->SirA->?->SirA->                                                                                           Archaeoglobus fulgidus DSM 4304;                                euryarchaeota                                 hypothetical protein AF0552 [Archaeoglobus fulgidus DSM 4304]
68550331   **  317       ModA->ModC->Cys synthase->cystathionine gamma synthase->ThiS->ThiG->                                                  Pelodictyon phaeoclathratiforme BU-1                            bacteroidetes/chlorobi                        Cysteine synthase K/M:Cysteine synthase A [Pelodictyon phaeoclathratiforme BU-1]
--------------------------------------------------------------------------------------------------------------

3. Tungsten cofactor biosynthesis
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Abbreviations: 4Fe-S: 4Fe-S ferredoxin; AOR: Aldehyde ferredoxin oxidoreductase;
PDOR: Pyridine disulfide oxidoreductase; ADH: Alcohol dehydrogenase
These operons always have AOR and MoaD, often MoeB, and occasionally MoeA, MoaA and MoaE

a. Archaeal operons (gis are for the MoaD protein, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                               ORGANISM                                                        Classification                                Protein descriptions (if any)
5458384        84         AOR->MoaD*->MoaA->                                                                                                   Pyrococcus abyssi GE5                                           euryarchaeota                                 moaD-like molybdopterin converting factor related, subunit 1 [Pyrococcus abyssi GE5]
18892299       82         AOR->MoaD*->MoaA->                                                                                                   Pyrococcus furiosus DSM 3638                                    euryarchaeota                                 molybdopterin converting factor, subunit 1; (moaD) [Pyrococcus furiosus DSM 3638]
11497643       91         AOR->MoaD*->                                                                                                         Archaeoglobus fulgidus DSM 4304                                 euryarchaeota                                 hypothetical protein AF0022 [Archaeoglobus fulgidus DSM 4304]
19915596       94         AOR->MoaD*->                                                                                                         Methanosarcina acetivorans C2A                                  euryarchaeota                                 predicted protein [Methanosarcina acetivorans C2A]
21228746       94         AOR->MoaD*->                                                                                                         Methanosarcina mazei Go1                                        euryarchaeota                                 putative molybdopterin converting factor [Methanosarcina mazei Go1]
736275         69         AOR->MoaD*->MoaA->                                                                                                   Pyrococcus furiosus DSM 3638                                    euryarchaeota                                 unnamed protein product [Pyrococcus furiosus DSM 3638]
33359354       84         AOR->MoaD*-><-?->MoaA->                                                                                              Pyrococcus horikoshii OT3                                       euryarchaeota                                 putative molybdopterin converting factor, subunit 1 [Pyrococcus horikoshii OT3]
57159324       82         AOR->MoaD*->MoaA->                                                                                                   Thermococcus kodakarensis KOD1                                  euryarchaeota                                 molybdopterin converting factor, subunit 1 [Thermococcus kodakarensis KOD1]
14325024       91         <-MoaD*<-AOR                                                                                                         Thermoplasma volcanium GSS1                                     euryarchaeota                                 molybdopterin converting factor subunit 1 [Thermoplasma volcanium GSS1]

Possibly involved in tungsten cofactor biosynthesis (as the MoaAs typically retrieve the tungsten cofactor protein in BLAST searches)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gis are of the MoaD protein (marked with an asterisk)

GI             LENGTH     Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
11499688       89         MoaA->MoaD*->                                                                                                         Archaeoglobus fulgidus DSM 4304                                euryarchaeota                                 hypothetical protein AF2105 [Archaeoglobus fulgidus DSM 4304]
68140833       78         MoaA->MoaD*->                                                                                                         Ferroplasma acidarmanus Fer1                                   euryarchaeota                                 MoaD, archaeal [Ferroplasma acidarmanus Fer1]
76801420       145        MoaA->?->MoaA->MoaD*->                                                                                                Natronomonas pharaonis DSM 2160                                euryarchaeota                                 pterin cluster protein [Natronomonas pharaonis DSM 2160]

b. Bacterial operons (gis are for the AOR gene, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
78700229       616        4Fe-S->AOR*->PDOR->MoaD->                                                                                             Alkalilimnicola ehrlichei MLHE-1                               proteobacteria>gammaproteobacteria            aldehyde:ferredoxin oxidoreductase,tungsten-containing [Alkalilimnicola ehrlichei MLHE-1]
34482541       575        AOR*->MoaD->MoeB                                                                                                      Wolinella succinogenes                                         proteobacteria>epsilonproteobacteria          ALDEHYDE OXIDOREDUCTASE [Wolinella succinogenes]
68178064       576        Dehyd->AOR*->MoaD->MoeB->FeS_assembly?->MoaA                                                                          Desulfuromonas acetoxidans DSM 684                             proteobacteria>deltaproteobacteria            IMP dehydrogenase/GMP reductase:Aldehyde ferredoxin oxidoreductase [Desulfuromonas acetoxidans DSM 684]
77543953       576        MoeA<-dehyd->AOR*->MoaD->MoeB->dehyd                                                                                  Pelobacter carbinolicus DSM 2380                               proteobacteria>deltaproteobacteria            aldehyde ferredoxin oxidoreductase [Pelobacter carbinolicus DSM 2380]
71838535       576        MoeA<-dehyd->AOR*->MoaD->MoeB->permease->ABC ATPase                                                                   Pelobacter propionicus DSM 2379                                proteobacteria>deltaproteobacteria            Aldehyde ferredoxin oxidoreductase [Pelobacter propionicus DSM 2379]
77544154       577        AOR*->MoaD->                                                                                                          Pelobacter carbinolicus DSM 2380                               proteobacteria>deltaproteobacteria            aldehyde ferredoxin oxidoreductase [Pelobacter carbinolicus DSM 2380]
71544346       609        AOR*->MoaD->                                                                                                          Syntrophobacter fumaroxidans MPOB                              proteobacteria>deltaproteobacteria            Aldehyde ferredoxin oxidoreductase [Syntrophobacter fumaroxidans MPOB]
46449005       576        AOR*-><-MoaD                                                                                                          Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough      proteobacteria>deltaproteobacteria            aldehyde:ferredoxin oxidoreductase, tungsten-containing [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough]
78193518       576        dehyd->AOR*->MoaD->MoeB                                                                                               Geobacter metallireducens GS-15                                proteobacteria>deltaproteobacteria            Aldehyde ferredoxin oxidoreductase [Geobacter metallireducens GS-15]
78219908       577        MoaD->MoeB<-AOR*                                                                                                      Desulfovibrio desulfuricans G20                                proteobacteria>deltaproteobacteria            aldehyde:ferredoxin oxidoreductase, tungsten-containing [Desulfovibrio desulfuricans G20]
68178220       576        AOR*->MoaD->MoeB                                                                                                      Desulfuromonas acetoxidans DSM 684                             proteobacteria>deltaproteobacteria            IMP dehydrogenase/GMP reductase:Aldehyde ferredoxin oxidoreductase [Desulfuromonas acetoxidans DSM 684]
50877365       575        MoeA->MoeA+PPBPII->AOR*->MoaD                                                                                         Desulfotalea psychrophila LSv54                                proteobacteria>deltaproteobacteria            related to tungsten-containing aldehyde ferredoxin oxidoreductase (AOR) [Desulfotalea psychrophila LSv54]
39982778   **  601        4Fe-S->AOR*->PDOR->MoaD->MoeB->                                                                                       Geobacter sulfurreducens PCA                                   proteobacteria>deltaproteobacteria            aldehyde:ferredoxin oxidoreductase, tungsten-containing [Geobacter sulfurreducens PCA]
74023041       617        4Fe-S->AOR*->PDOR->MoaD->                                                                                             Rhodoferax ferrireducens DSM 15236                             proteobacteria>betaproteobacteria             Aldehyde ferredoxin oxidoreductase [Rhodoferax ferrireducens DSM 15236]
84716937       615        4Fe-S->AOR*->PDOR->MoaD->                                                                                             Polaromonas naphthalenivorans CJ2                              proteobacteria>betaproteobacteria             aldehyde:ferredoxin oxidoreductase,tungsten-containing [Polaromonas naphthalenivorans CJ2]
47572159       592        4Fe-S->AOR*->PDOR->MoaD->                                                                                             Rubrivivax gelatinosus PM1                                     proteobacteria>betaproteobacteria             COG2414: Aldehyde:ferredoxin oxidoreductase [Rubrivivax gelatinosus PM1]
56314521   **  774        AOR*+MoaD                                                                                                             Azoarcus sp. EbN1                                              proteobacteria>betaproteobacteria             putative tungsten-containing aldehyde ferredoxin oxidoreductase (AOR-1)
23015426       616        4Fe-S->AOR*->PDOR->PDOR->MoaD->                                                                                       Magnetospirillum magnetotacticum MS-1                          proteobacteria>alphaproteobacteria            COG2414: Aldehyde:ferredoxin oxidoreductase [Magnetospirillum magnetotacticum MS-1]
83589574       599        4Fe-S->AOR*->MoaD->PDOR->                                                                                             Moorella thermoacetica ATCC 39073                              firmicutes                                    Aldehyde ferredoxin oxidoreductase [Moorella thermoacetica ATCC 39073]
77996039       597        AOR*->MoaA->MoaD->MoaE->                                                                                              Carboxydothermus hydrogenoformans Z-2901                       firmicutes                                    aldehyde ferredoxin oxidoreductase, tungsten-containing [Carboxydothermus hydrogenoformans Z-2901]
76795288   **  599        AOR*->MoaA->MoaD->MoeB->MobB->PPBPII->permease->ABC ATPase->                                                          Thermoanaerobacter ethanolicus ATCC 33223                      firmicutes                                    Aldehyde ferredoxin oxidoreductase [Thermoanaerobacter ethanolicus ATCC 33223]
77995801       629        4Fe-S->AOR*->MoaD->                                                                                                   Carboxydothermus hydrogenoformans Z-2901                       firmicutes                                    aldehyde ferredoxin oxidoreductase, tungsten-containing [Carboxydothermus hydrogenoformans Z-2901]
77995423       597        ADH->AOR*->MoaD->                                                                                                     Carboxydothermus hydrogenoformans Z-2901                       firmicutes                                    aldehyde ferredoxin oxidoreductase, tungsten-containing [Carboxydothermus hydrogenoformans Z-2901]
71540750       597        MoaD->MoeB<-?->4Fe-S->AOR*->                                                                                          Syntrophomonas wolfei str. Goettingen                          firmicutes                                    Aldehyde ferredoxin oxidoreductase [Syntrophomonas wolfei str. Goettingen]
46200136       608        AOR*->MoaD->                                                                                                          Thermus thermophilus HB27                                      deinococci                                    tungsten-containing aldehyde ferredoxin oxidoreductase [Thermus thermophilus HB27]
51858106       604        AOR*->MoaD->                                                                                                          Symbiobacterium thermophilum IAM 14863                         actinobacteria                                aldehyde ferredoxin oxidoreductase [Symbiobacterium thermophilum IAM 14863]
51857711       603        AOR*->MoaD->                                                                                                          Symbiobacterium thermophilum IAM 14863                         actinobacteria                                aldehyde ferredoxin oxidoreductase [Symbiobacterium thermophilum IAM 14863]
-----------------------------------------------------------------------------------------------                                                 --------------
4. Uncharacterized operons with ThiS/ThiF+Rhodanese containing proteins (sulfur metabolism)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4a. Siderophore biosynthesis (gis are of the E1+Rhodanese protein, marked with an asterisk)
GI             LENGTH     Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
28192388      387         Hist_phosphate_NH2transferase->E1+Rhodanese*->JAB->ThiS/MoaD->Trp-dioxygenase->hydroxybenzoate hydroxylase->          Pseudomonas fluorescens                                        proteobacteria>gammaproteobacteria            QbsC [Pseudomonas fluorescens]
83645618      390         E1+Rhodanese*->JAB->ThiS/MoaD->+CaiB-like CoA transferase->AMP-acid ligase->                                          Hahella chejuensis KCTC 2396                                   proteobacteria>gammaproteobacteria            Dinucleotide-utilizing enzyme involved in molybdopterin and thiamine biosynthesis family 2 [Hahella chejuensis KCTC 2396]
82702101      390         E1+Rhodanese*->JAB->ThiS/MoaD->+CaiB-like CoA transferase->                                                           Nitrosospira multiformis ATCC 25196                            proteobacteria>betaproteobacteria             UBA/THIF-type NAD/FAD binding fold [Nitrosospira multiformis ATCC 25196]
30181075      390         E1+Rhodanese*->JAB->ThiS/MoaD->+CaiB-like CoA transferase->AMP-acid ligase->                                          Nitrosomonas europaea ATCC 19718                               proteobacteria>betaproteobacteria             Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 [Nitrosomonas europaea ATCC 19718]
83748714      389         E1+Rhodanese*->JAB->ThiS/MoaD->+CaiB-like CoA transferase->AMP-acid ligase->                                          Ralstonia solanacearum UW551                                   proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoeB protein [Ralstonia solanacearum UW551]
83748714      389         E1+Rhodanese*->JAB->ThiS/MoaD->                                                                                       Ralstonia solanacearum UW551                                   proteobacteria>betaproteobacteria             Molybdopterin biosynthesis MoeB protein [Ralstonia solanacearum UW551]
5070639       391         E1+Rhodanese*->JAB->ThiS/MoaD->+CaiB-like CoA transferase->AMP-acid ligase->                                          Pseudomonas stutzeri KC                                        proteobacteria>gammaproteobacteria            AF149851_6 MoeB-like protein [Pseudomonas stutzeri KC]
84994030      390         E1+Rhodanese(PdtF)*->JAB(PdtG)->ThiS/MoaD(PdtH)->+CaiB-like CoA transferase(PdtI)->AMP-acid ligase(PdtJ)->            Pseudomonas putida                                             proteobacteria>gammaproteobacteria            PdtF [Pseudomonas putida]
----------------------------------------------------
4b. Uncharacterized operon encoding a ThiS/MoaD, a JAB peptidase and an E1-like enzyme (gis are of E1+Rhod, marked with an asterisk)
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI             LENGTH     Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
88807869      389         JAB->E1+Rhod*->                                                                                                       Synechococcus sp. WH 7805                                      cyanobacteria                                 molybdopterin biosynthesis protein [Synechococcus sp. WH 7805]
86607093      387         JAB->ThiS/MoaD->E1+Rhod*->                                                                                            Cyanobacteria bacterium Yellowstone A-Prime                    cyanobacteria                                 putative molybdopterin biosynthesis protein MoeB [Synechococcus sp. JA-3-3Ab]
86609523      389         JAB->ThiS/MoaD->E1+Rhod*->                                                                                            Cyanobacteria bacterium Yellowstone B-Prime                    cyanobacteria                                 putative molybdopterin biosynthesis protein MoeB [Synechococcus sp. JA-2-3B'a(2-13)]
81298969      391         JAB->E1+Rhod*->                                                                                                       Synechococcus elongatus PCC 7942                               cyanobacteria                                 Rhodanese-like [Synechococcus elongatus PCC 7942]
87300927      390         JAB->E1+Rhod*->                                                                                                       Synechococcus sp. WH 5701                                      cyanobacteria                                 molybdopterin biosynthesis MoeB protein [Synechococcus sp. WH 5701]
35213984      395         JAB->ThiS/MoaD->E1+Rhod*->                                                                                            Gloeobacter violaceus PCC 7421                                 cyanobacteria                                 gll3412 [Gloeobacter violaceus PCC 7421]
78700359      142         JAB->ThiS/MoaD+Rhodanese+E1*->                                                                                        Alkalilimnicola ehrlichei MLHE-1                               proteobacteria>gammaproteobacteria            Rhodanese-like [Alkalilimnicola ehrlichei MLHE-1]
86159911      390         ThiS/MoaD->E1+Rhod*->JAB->                                                                                            Anaeromyxobacter dehalogenans 2CP-C                            proteobacteria>deltaproteobacteria            UBA/THIF-type NAD/FAD binding, MoeZ/MoeB family protein [Anaeromyxobacter dehalogenans 2CP-C]

-  JABs in operons with E1+Rhod (no Ub-like protein) (gis are of the E1-containing protein, marked with an asterisk)
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI            LENGTH      Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
75700942      390         JAB->E1+Rhod*->                                                                                                       Anabaena variabilis ATCC 29413                                 cyanobacteria                                 Rhodanese-like MoeZ/MoeB [Anabaena variabilis ATCC 29413]
17132000      390         JAB->E1+Rhod*->                                                                                                       Nostoc sp. PCC 7120                                            cyanobacteria                                 molybdopterin biosynthesis protein [Nostoc sp. PCC 7120]
56686316      391         JAB->E1+Rhod*->                                                                                                       Synechococcus elongatus PCC 6301                               cyanobacteria                                 molybdopterin biosynthesis MoeB protein [Synechococcus elongatus PCC 6301]
71676726      391         JAB->E1+Rhod*->                                                                                                       Trichodesmium erythraeum IMS101                                cyanobacteria                                 UBA/THIF-type NAD/FAD binding fold:Rhodanese-like:MoeZ/MoeB [Trichodesmium erythraeum IMS101]
23124399      390         JAB->E1+Rhod*->                                                                                                       Nostoc punctiforme PCC 73102                                   cyanobacteria                                 COG0476: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 [Nostoc punctiforme PCC 73102]
87124948      389         JAB->E1+Rhod*->                                                                                                       Synechococcus sp. RS9917                                       cyanobacteria                                 Rhodanese-like [Synechococcus sp. RS9917]
78169800      388         JAB->E1+Rhod*->                                                                                                       Synechococcus sp. CC9902                                       cyanobacteria                                 Rhodanese-like [Synechococcus sp. CC9902]
72002829      381         JAB->E1+Rhod*->                                                                                                       Prochlorococcus marinus str. NATL2A                            cyanobacteria                                 rhodanese-like [Prochlorococcus marinus str. NATL2A]
33238703      379         JAB->E1+Rhod*->                                                                                                       Prochlorococcus marinus subsp. marinus str. CCMP1375           cyanobacteria                                 Prochlorococcus marinus subsp. marinus str. CCMP1375 complete genome
33635570      409         JAB->E1+Rhod*->                                                                                                       Prochlorococcus marinus str. MIT 9313                          cyanobacteria                                 molybdopterin biosynthesis protein [Prochlorococcus marinus str. MIT 9313]
84513874      379         JAB->E1+Rhod*->                                                                                                       Prochlorococcus marinus str. MIT 9211                          cyanobacteria                                 Dinucleotide-utilizing enzyme [Prochlorococcus marinus str. MIT 9211]
78196401      378         JAB->E1+Rhod*->                                                                                                       Synechococcus sp. CC9605                                       cyanobacteria                                 Rhodanese-like [Synechococcus sp. CC9605]
33633363      377         JAB->E1+Rhod*->                                                                                                       Synechococcus sp. WH 8102                                      cyanobacteria                                 molybdopterin biosynthesis protein [Synechococcus sp. WH 8102]
76882207      257         E1*->JAB->                                                                                                            Nitrosococcus oceani ATCC 19707                                proteobacteria>gammaproteobacteria            Adenylyltransferase [Nitrosococcus oceani ATCC 19707]
----------------------------------------------------
4c. Uncharacterized operon with a ThiS/MoaD, an E1-like enzyme, a JAB and a Cysteine synthase (gis are of the Cys synthase, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI            LENGTH      Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
68563152       305        Cys synthase*->JAB->ThiS/MoaD->E1+Rhodanese->                                                                         Rubrobacter xylanophilus DSM 9941                              actinobacteria                                Cysteine synthase K/M [Rubrobacter xylanophilus DSM 9941]
83815753       317        Cys synthase*->JAB->ThiS/MoaD->E1+Rhodanese->                                                                         Salinibacter ruber DSM 13855                                   bacteroidetes/chlorobi                        cysteine synthase B [Salinibacter ruber DSM 13855]
83757147       317        Cys synthase*->JAB->ThiS/MoaD->E1+Rhodanese->                                                                         Salinibacter ruber DSM 13855                                   bacteroidetes/chlorobi                        cysteine synthase B [Salinibacter ruber DSM 13855]
76258730       308        Cys synthase*->JAB->ThiS/MoaD->E1+Rhodanese->                                                                         Chloroflexus aurantiacus J-10-fl                               chloroflexi                                   Cysteine synthase K/M [Chloroflexus aurantiacus J-10-fl]
67932284       319        Cys synthase*->JAB->ThiS/MoaD->E1+Rhodanese->                                                                         Solibacter usitatus Ellin6076                                  fibrobacteres/acidobacteria                   Cysteine synthase K/M [Solibacter usitatus Ellin6076]
78493973       304        JAB->E1+Rhodanese->Cys synthase*->                                                                                    Rhodopseudomonas palustris BisB18                              proteobacteria>alphaproteobacteria            Pyridoxal-5'-phosphate-dependent enzyme, beta subunit [Rhodopseudomonas palustris BisB18]
9948117        392        Cys synthase*->E1+Rhodanese->                                                                                         Pseudomonas aeruginosa PAO1;                                   proteobacteria>gammaproteobacteria            AE004638_1 probable molybdopterin biosynthesis protein MoeB [Pseudomonas aeruginosa PAO1]
----------------------------------------------------
4d. Uncharacterized operon with a ThiS/MoaD, JAB, Cysteine synthase and ClpS (gis are of the Cys synthase, marked with an asterisk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI            LENGTH      Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
13880986       323       ClpS->alpha_helical_domain->JAB->ThiS/MoaD->Cys synthase*->                                                            Mycobacterium tuberculosis CDC1551;                            actinobacteria                                cysteine synthase [Mycobacterium tuberculosis CDC1551]
54014566       320       ClpS->alpha_helical_domain->dmpA_peptidase->JAB->ThiS/MoaD->Cys synthase*->                                            Nocardia farcinica IFM 10152;                                  actinobacteria                                putative cysteine synthase [Nocardia farcinica IFM 10152]
29608823       316       ClpS->alpha_helical_domain->permease->JAB->ThiS/MoaD->Cys synthase*->                                                  Streptomyces avermitilis MA-4680;                              actinobacteria                                putative cysteine synthase [Streptomyces avermitilis MA-4680]
68231907       315       ClpS->alpha_helical_domain->JAB->ThiS/MoaD->Cys synthase*->                                                            Frankia sp. EAN1pec;                                           actinobacteria                                Cysteine synthase K/M [Frankia sp. EAN1pec]
86739581       315       ClpS->alpha_helical_domain->JAB->ThiS/MoaD->Cys synthase*->                                                            Frankia sp. CcI3;                                              actinobacteria                                cysteine synthases [Frankia sp. CcI3]
71916499       315       ClpS->alpha_helical_domain->JAB->ThiS/MoaD->Cys synthase*->                                                            Thermobifida fusca YX;                                         actinobacteria                                cysteine synthase K/M [Thermobifida fusca YX]
71366891       320       alpha_helical_domain->MutT->JAB->ThiS/MoaD->Cys synthase*->                                                            Nocardioides sp. JS614;                                        actinobacteria                                Cysteine synthase K/M [Nocardioides sp. JS614]
5531359        316       JAB->alpha_helical_domain->ThiS/MoaD->Cys Syn*->alpha_helical_domain<-MBL                                              Streptomyces coelicolor A3(2);                                 actinobacteria                                putative cysteine synthase [Streptomyces coelicolor A3(2)]


Alignment of the rapidly diverging alpha-helical protein

ALIGN                              -------------EE--HHHHHHHHHHHHHHHHHHH-------------------HHHHH-------HH------------------------------EEE-----------------------HHHHHHHHH--HHHHHHHHHHHHHHHHH---------------------HH-HEEE--HHHHHHHHHHHHHHHHHHHHHHH-------------------------HHHHHHHHHHHHHHHHHH---
HMM                                -----------HEEEEHHHHHHHHHHHHHHHHHHHHH---------H-------HHHHHH-------HE----------------------------EEEEEE---------------------HHHHHHHHHH-HHHHHHHHHHHHHHHHH----------------------EEEEEEE--HHHHHHHHHHHHHHHHHHHH-EEE----HHHH-----H-------HHHHHHHHHHHHHHHHHHHHHH--
FREQ                               ---HHH------HEE---HHHHHHHHHHHHHHHHHHH------------------HHHHH-------H------------------------------H-H------------------------HHHHHHHHHHHHHHHHHHHHHHHHHHH----------------H-HHH----HHEHHHHHHHHHHHHHHHHHHHHHHHHH-------------------------HHHHHHHHHHHHHHHHHHHH--
PSSM                               ------------EEEEEHHHHHHHHHHHHHHHHHHH--------------------HHHH-------H------------------------------EEEE-----------------------HHHHHHHH----HHHHHHHHHHHHHHHH------------------------EEEEE-HHHHHHHHHHH---EEEEEEE-----------------------HHHHHHHHHHHHHHHHHHHHHH---
FINAL                              ------------EEEE-HHHHHHHHHHHHHHHHHHH-------------------HHHHH-------H------------------------------EEEE----------------------HHHHHHHHHHHHHHHHHHHHHHHHHHHHH------------------------EEEEE-HHHHHHHHHHHHHHHHHHHHH------------------------HHHHHHHHHHHHHHHHHHHHHH--
NocaDRAFT_2640_Nsp._71366887       MSGFQRHRRSKLIIANFTGFEADLLRSLAGQLVELLRNEAAVPRDPV-------DPFEAM-------MDF------------------SGPTQEPEDPVLARLFPTAYPGD-------------QEAASEFRRFTEGTLRDGKAAAAVAIIDGL--------EEAGLPPELTEDGLMIDIELDEATAETWMRSFTDLRLALATRLEVEEGDDAYW-----HSLPDDDPRAQAHDIYEWVGYLQETLVQALSG
Lxx13320_Lxyl_50951464             MRPFRRTRDGT-LRARFEPDEAEILARLAAETAELAV-----------------DAA-------------------------------SGAGDPREDPAFIRLLPDAYSGD-------------AEASAEFRRFTAGGLAERKALTAQVVMETL--------GGGSG---------AIEVRLDAPQAAAWLRTLTDIRLVLAARLGIVQDGDEG-------DIHDAD-SAFRRAVYDWLAGVQESLVLALRS
BlinB01002436_Blin_62424056        --MAAIDARGDDVVLKLEDNERSLMLTVFTDLAALLAEDDNEDGRPD------SENWEARLG--------------------------LVERPRPQDPALLRLFPDVDPLDE-------------ERSREFRRLTEFDLQQAKAHNVRIVLNGL---------AKGS-----------SITLNHDEVLAWMKGLNDLRLVLAVRMGIDTEEAQEEKYAQREDL--DESEELTLTLYDFLTWIQDRLTTTLLS
clpS_Jsp._84494379                 AFARKGKGKNLRYAAKLDAVERAVVAGLMEQVHDLVAPEPEEAVATGPSGASDHDDDFAAIVSGLGGLGMGVSISAEDQVADDRPVPADARSFGDRDPALERLLPAGNRAD-------------DQVSAEFRRLTEHGLRQRKAGHLESAITSL--------RAPGS-----------GVELDERAAIDMVIALTDVRLVLGERLGLREDADVDRLEEELADVDDDDPRGHAMSVYDFLTWLQETLATAMLP
cg2770_Cglu_41326695               WKKKKGLMRQARYAVVFEPMEREVLGDLSAAVSEALIQRAQS--VPK-------DPLAEMTGMT------------------------SGHKEAPTDPALARLLPDFQHEGD---------EEYDGDNSFLRSLHEGDITRAKLENLRVINDAL--------GPDGN----------VAVTASEEEAHAWLAALNDIRLYVASG-DVRGGEAAE---------------EDRENLVQWLAYNQESLLEAMMN
_Ceff_23494252                     WKRRKALMRSARYTCVLEPMEREVLGNLSAVVLEALIHRAQD--APK-------DPLAELTGIP------------------------SGHKEAPRDPALARLLPDFQQEGD---------EEYDGDNSLLRSLHENDITRQKIANLQVINSAL--------GPDGG----------VAVSIPEEEAHAWLAGLNDIRLYLASG-ELKGGEAAE---------------EDRENLVQWLAYNQESLLEAMMG
DIP1856_Cdip_38200689              WKKKKGLFKGARYQCTLEPIEREVLGNLAANISEVLISRAQS--APK-------DELAELTGMG------------------------GGHTEAPEDPGLARLLPDFEMQGD---------EEFDGDNSLLRSLHENDITRAKLANLQTIGQAL--------GPDGS----------VFVTVTEEEAQAWVAGLNDIRLYLASSE-VQDTEDRD-------------------ALVEWLAFAQESLLTAMMG
jk0494_Cjei_68263163               WTKKNSLLRGTRFNTQLEPLEREMLGDSAVAVSDKLMERART--APK-------DELAEMTGMA------------------------SGHADAPKDPGLARLLPSFFREGD---------EEVDGDAALTRQLNETDIIKTKLSNLRFVVDYL--------GPNGS----------VNVSLTQDEVHPWLSAINDIRLYHSAQYEEFKKELL-------EGEENSDQATAAQNYLDWLGYHQDSLLSAMMG
nfa10870_Nfar_54014562             KWTRKNSLGGLKLRAEMDAHEAEVLRSLVGAVSGLLAERAQS--APE-------DELSALTGLR------------------------TGNTAPPDDPRLARLLPDFHRSEPGSPDADRA-----GLNSALRALHEPEIIDAKLAAGSVVLDTV--------PARGG-----------KIVLTPEQADAWLSALTDVRLALGTVLGIDAETP--------DQLDPDDPRAPHLDVYHWLTWMQDSLLQALAP
SCO2915_Scoe_5531364               MPGQFEPLPGGGAAVALDDVEISIIRSLAVQLLELIGPGPAED-ASD-------DPLAELFA--------------------------EGPSEPPSDPVLRRLFPDAYGDPEGAPQAREA-EEQRAHSAEFRRYTENDLRAGKRDNALAVVRTLDTLSSASAGEEGA-----------VLKLSPQESQQWLRALNDLRLAIGSRLEIADEDDTDLLYR----LPDEDPRKPMVMAYLWLGGLQESLVATLMP
SAV5160_Save_29608819              MPGHFEPLPGGGAAVALDEVEISIIRSLAVQLLELIGPGPAED-AAA-------DPLAELFA--------------------------EGPSEPPSDPVLQRLFPDAYGGPGGEGGSPEEAEEQRAHSSEFRRFTENDLRAGKRENALVVIRTL--DGMTVAGEGGA-----------VLKLSPEESRQWLGSLNDLRLAIGSRLDVVDEEDTDLLYR----LPDEDPRKPMVMAYLWLGGLQETLIETLMS
Francci3_0865_Fsp._86739578        DVADGFRRTRAGIELRLPRLEAALLIELVGQIESLLEPPP-----VE-------DPLEALVGLR------------------------DTAPPPPDDPAIARLLPDPYPDD-------------PMASGDFRRRRTDDLLARKRDAARRVLSAV--------PAPGR-----------ALLLDEEAAQDWLTTLNDLRLVLGTRLGLTDDDSTAEL----EHLDPDDSRRPLVAVYAFLTELLDDLTRALG-
Franean1DRAFT_3648_Fsp._68231910   --MNGFRRTRAGIELRLPRLESSLLTELLGQVDALLEAPP-----VD-------DPLEALVGLR------------------------DTAPPPPEDPAVARLLPDPYPDD-------------PLASGDFRRRRTDEALARKRDAARRVLAAV--------PAPGA-----------VLVLDEDAAQDWLTVLNDLRLVLGTRLGLTDDESTAELEN----LTPEDPRRPVAAVYAFLTELLDELTRALL-
KradDRAFT_2533_Krad_67987809       -MATFRRTRNGHFSLTLHAAEADLLASLAREVLELLEVPAAAPPRPV-------DPLQAELGLS----------DLPGFDTPLDDLAGDGPVAPPEDEVLRRLLPDAYGDD-------------PDASADFRRFTERGLRERKAAAASGLLAGL-----APVEGQGG-----------RVQLDADGARTWLAALNDIRLALGTRLGVSEDADPD------ADLAEDDPARWAWAVYDFTTHLQETLVRSLS-
Tfu_2371_Tfus_71916502             MTAKIRSAPHGGARITIGPDEAQLLRSMADFLLRVVEEPEQ-----Q-------DELAALVGIS-------------------------SSATQPEDPALARLFPDAYTDD-------------AEAAADFRRYTESDLRRHKRENARRVASAI--------PEWGG-----------EIVLDAEDVQAWLQTLTDVRLYLGVRLGIETEEDADAL---RAAAVRDESLAAAMHVYEWFTYVQDSLVRAVWQ
ArthDRAFT_1846_Asp._66965396       -MAKAFKYGIKGITGYLEPAERELLRSLIDDVISMLQPAES---ASE-------DPLTALIGLD-------------------------MNVREPSDRALRRLLPNVTKDD-------------DAASLEFRQLTERSLRENKIGALRAAALGL----------DTN-----------ELVLSQADARHWSQALNDVRLVLAERLDIRDDADAEHVHTMQDWSQAEDVESYLALVYNFTTWLQESLVQAMLQ
MT1374_Mtub_13880982               WKRVET-RDGPRFRSSLAPHEAALLKNLAGAMIGLLDDRDSS--SPS-------DELEEITGIK------------------------TGHAQRPGDPTLRRLLPDFYRPDDLDDDDPTAVDGSESFNAALRSLHEPEIIDAKRVAAQQLLDTV--------PDNGG-----------RLELTESDANAWIAAVNDLRLALGVMLEIGPRGP--------ERLPGNHPLAAHFNVYQWLTVLQEYLVLVLMG
MtubF_01001398_Mtub_76784817       WKRVET-RDGPRFRSSLAPHEAALLKNLAGAMIGLLDDRDSS--SPS-------DELEEITGIK------------------------TGHAQRPGDPTLRRLLPDFYRPDDLDDDDPTAVDGSESFNAALRSLHEPEIIDAKRVAAQQLLDTV--------PDNGG-----------RLELTESDANAWIAAVNDLRLALGVMLEIGPRGP--------ERLPGNHPLAAHFNVYQWLTVLQEYLVLVLMG
MAP2428c_Mavi_41408526             WKRVET-AEGPRFRSALASHEAALLKNLATAMIGLLDERESS--SPA-------DELEEITGIK------------------------TGNAQPPKDPTLRRLLPDFYRPDDNGDESPDAAE---SLNAALRSLHEPGIVNAKRVAAQRLLGTV--------PDDGG-----------RFELTEDDANAWIAAVNDIRLTLGVMLEIGPDGP--------ERLPADHPLAVHFDVYQWLTVLQEYLVLVLMG
_Mlep_466922                       WKRVET-ANGPRFRSVVAPHEVALLKHLVGALLGLLNERESS--SPL-------DELEVITGIK------------------------AGNAQRPEDPTLRRLLPDFYTPDDKDQLDPAALDAVDSLNAALRSLHEPEIVDAKRSAAQQLLDTL--------PESDG-----------RLELTEASANAWIAAVNDLRLALGVILEIDRPAP--------ERVPAGHPLSVHFDVYQWLTVLQEYLVLALMA
ML1166_Mlep_13093139               WKRVET-ANGPRFRSVVAPHEVALLKHLVGALLGLLNERESS--SPL-------DELEVITGIK------------------------AGNAQRPEDPTLRRLLPDFYTPDDKDQLDPAALDAVDSLNAALRSLHEPEIVDAKRSAAQQLLDTL--------PESDG-----------RLELTEASANAWIAAVNDLRLALGVILEIDRPAP--------ERVPAGHPLSVHFDVYQWLTVLQEYLVLALMA
consensus/100%                     ............h...h...E..hh..........h..................-.........................................D..h.RLhPs.....................s...R.bp...h...K......h...l...........s.............h.hs...s..h..shsDlRL..u................................hh.a.s..b-.L..sh..
consensus/90%                      ............h...h...E..ll.plhs.h..hl.........s........D.h..b.s...........................s....PpDPsl.RLhPs....s................su.hRphpp..l...K..sh..l..sl...........ss............l.ls...s..Wh.slsDlRLhlus.b.l....s......................hh.ahs..Q-.L..sh..
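
The uppercase letters in the two consensus rows appear to be residues shared by at least the stated fraction of the aligned sequences. The Python sketch below illustrates only that part of the calculation. It rests on several assumptions: the function name is hypothetical, gap characters count toward the total number of sequences, the threshold is above 0.5, and the lowercase class symbols (h, s, l, p and so on) are not reproduced because their definitions are not given in this file.

```python
from collections import Counter

def residue_consensus(rows, threshold=1.0, gap="-"):
    """Return one character per alignment column: the most common residue
    if its fraction reaches `threshold`, otherwise '.'.

    `rows` are aligned sequences of equal length (assumption).
    """
    n = len(rows)
    out = []
    for column in zip(*rows):
        residue, count = Counter(column).most_common(1)[0]
        # A gap is never reported as a consensus residue.
        if residue != gap and count / n >= threshold:
            out.append(residue)
        else:
            out.append(".")
    return "".join(out)

# Toy alignment (not taken from the data above).
toy = ["MEDPA", "LEDPG", "VE-PA"]
strict = residue_consensus(toy, threshold=1.0)
relaxed = residue_consensus(toy, threshold=0.6)
```

With 21 sequences, as in the alignment above, a 90% threshold would require a residue to occur in at least 19 rows.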

Species abbreviations: Asp. : Arthrobacter sp.; Blin : Brevibacterium linens; Cdip : Corynebacterium diphtheriae; Ceff : Corynebacterium efficiens; Cglu : Corynebacterium glutamicum; Cjei : Corynebacterium jeikeium; Fsp. : Frankia sp.; Jsp. : Janibacter sp.; Krad : Kineococcus radiotolerans; Lxyl : Leifsonia xyli; Mavi : Mycobacterium avium; Mlep : Mycobacterium leprae; Mtub : Mycobacterium tuberculosis; Nfar : Nocardia farcinica; Nsp. : Nocardioides sp.; Save : Streptomyces avermitilis; Scoe : Streptomyces coelicolor; Tfus : Thermobifida fusca
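
The sequence labels in the alignment appear to follow the pattern gene_abbreviation_gi, with the gene name left empty when none is available (for example _Ceff_23494252). The Python sketch below assumes that pattern, splits a label into its three parts and resolves the abbreviation. The function name is hypothetical and the dictionary holds only a subset of the abbreviations listed above.

```python
# Subset of the species abbreviations listed above (illustrative, not complete).
SPECIES = {
    "Asp.": "Arthrobacter sp.",
    "Blin": "Brevibacterium linens",
    "Ceff": "Corynebacterium efficiens",
    "Mtub": "Mycobacterium tuberculosis",
    "Scoe": "Streptomyces coelicolor",
    "Tfus": "Thermobifida fusca",
}

def parse_label(label):
    """Split 'gene_Abbr_gi' into (gene or None, species abbreviation, gi)."""
    # Split from the right because gene names may themselves contain underscores.
    gene, abbr, gi = label.rsplit("_", 2)
    return (gene or None), abbr, int(gi)

gene, abbr, gi = parse_label("Tfu_2371_Tfus_71916502")
species = SPECIES.get(abbr, "unknown abbreviation")
```

Splitting from the right keeps locus tags such as Tfu_2371 or NocaDRAFT_2640 intact.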


Miscellaneous operons
Rhodanese+E1 (no JABs in operons; gis are of the Rhodanese+E1 protein)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GI            LENGTH      Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
71898141      379        Rhodanese+E1                                                                                                           Xylella fastidiosa Ann-1                                       proteobacteria>gammaproteobacteria            UBA/THIF-type NAD/FAD binding fold:Rhodanese-like:MoeZ/MoeB [Xylella fastidiosa Ann-1]
71900908      386        Rhodanese+E1                                                                                                           Xylella fastidiosa Ann-1                                       proteobacteria>gammaproteobacteria            UBA/THIF-type NAD/FAD binding fold:MoeZ/MoeB [Xylella fastidiosa Ann-1]
9105314       379        Rhodanese+E1                                                                                                           Xylella fastidiosa 9a5c                                        proteobacteria>gammaproteobacteria            AE003897_1 molybdopterin biosynthesis protein [Xylella fastidiosa 9a5c]
77747707      379        Rhodanese+E1                                                                                                           Xylella fastidiosa Temecula1                                   proteobacteria>gammaproteobacteria            molybdopterin biosynthesis protein MoeB [Xylella fastidiosa Temecula1]
78036060      401        MoeA-><-Rhodanese+E1                                                                                                   Xanthomonas campestris pv. vesicatoria str. 85-10;             proteobacteria>gammaproteobacteria            molybdopterin biosynthesis protein MoeB [Xanthomonas campestris pv. vesicatoria str. 85-10]
58426731      472        Rhodanese+E1                                                                                                           Xanthomonas oryzae pv. oryzae KACC10331                        proteobacteria>gammaproteobacteria            molybdopterin biosynthesis protein [Xanthomonas oryzae pv. oryzae KACC10331]
21108248      380        Rhodanese+E1                                                                                                           Xanthomonas axonopodis pv. citri str. 306                      proteobacteria>gammaproteobacteria            molybdopterin biosynthesis protein [Xanthomonas axonopodis pv. citri str. 306]
84367975      379        Rhodanese+E1                                                                                                           Xanthomonas oryzae pv. oryzae MAFF 311018                      proteobacteria>gammaproteobacteria            molybdopterin biosynthesis protein [Xanthomonas oryzae pv. oryzae MAFF 311018]
21113107      378        Rhodanese+E1                                                                                                           Xanthomonas campestris pv. campestris str. ATCC 33913          proteobacteria>gammaproteobacteria            molybdopterin biosynthesis protein [Xanthomonas campestris pv. campestris str. ATCC 33913]
----------------------------------------------------
4e. Operons with genes for sulfur metabolism proteins
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Domain abbreviations:
SirA: SirA-like redox proteins; IF3C-fold; possible regulator of disulfide bond formation (note: in some instances this protein is fused to a Rhodanese)
OAHShyd: typically O-acetylhomoserine/serine sulfhydrylase/Methionine lyase; PLP dependent transferase superfamily
DsrE/H: ancient family with a conserved cysteine, found in both fused and solo versions, also in archaea;
involved in sulfur reduction; YchN-like fold, perhaps a breakaway Rossmannoid. DsrH-like proteins
are involved in oxidation of intracellular sulfur (pdb: 1l1s); solo gi: 67938822
PAPSR: Phosphoadenosine phosphosulfate reductase
ATP_sulf: ATP sulfurylase

Gis are of the SirA or ThiS/MoaD protein (marked with an asterisk)

GI            LENGTH      Operon                                                                                                                                                              ORGANISM                                                       Classification                                Protein descriptions (if any)
67938823      82          ThiS/MoaD->OAHShyd->E1 solo->JAB->DsrE/H->SirA*->                                                                                                                   Chlorobium phaeobacteroides BS1;                               bacteroidetes/chlorobi                        SirA-like [Chlorobium phaeobacteroides BS1]
68208690      80          PAPSR->ATP_sulf->Sulf_adenyltransf_large->ThiS/MoaD->E1->JAB->Sulf_reductase(Fe-S binding protein)->SirA*->                                                         Desulfitobacterium hafniense DCB-2;                            firmicutes                                    SirA-like [Desulfitobacterium hafniense DCB-2]
77996033      72          sulfite_reductase->E1->ThiS/MoaD*->Sulf_adenylyltransferase->4Fe-S->Adenylylsulfate_reductase->?->Adenylylsulfate_kinase                                            Carboxydothermus hydrogenoformans Z-2901;                      firmicutes                                    thiamine biosynthesis protein ThiS [Carboxydothermus hydrogenoformans Z-2901]
67873788      81          PAPSR->ATP_sulf->Sulf_adenyltransf_large->ThiS/MoaD->E1->JAB->Sulf_reductase(Fe-S binding protein)->SirA*->                                                         Clostridium thermocellum ATCC 27405;                           firmicutes                                    SirA-like [Clostridium thermocellum ATCC 27405]
29894496      77          SirA+Rhodanese->Hydroxyacylglutathione hydrolase->SirA*->Rhod->Rhod->                                                                                               Bacillus cereus ATCC 14579;                                    firmicutes                                    Molybdopterin biosynthesis MoeB protein [Bacillus cereus ATCC 14579]
82499134      82          ABC sulfate transporter->ThiS/MoaD->E1->JAB->Sulf_reductase(Fe-S binding protein)->SirA*->OAHShyd->Adenylylsulfreduct->Ferredoxin->ATP_sulf->PAPSR->                Caldicellulosiruptor saccharolyticus DSM 8903;                 firmicutes                                    conserved hypothetical protein [Caldicellulosiruptor saccharolyticus DSM 8903]
78194036      74          OAHShyd->ThiS/MoaD->E1solo->JAB->Sulf_reductase(Fe-S binding protein)->SirA*->                                                                                      Geobacter metallireducens GS-15;                               proteobacteria>deltaproteobacteria            conserved hypothetical protein [Geobacter metallireducens GS-15]
18160982      88          <-PAPSR<-?<-Sulfite_reductase<-?->ThiS/MoaD*->Rhod+Rhod->                                                                                                           Pyrobaculum aerophilum str. IM2;                               crenarchaeota                                 conserved hypothetical protein [Pyrobaculum aerophilum str. IM2]
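
The operon notation used in these tables ("->" and "<-" for gene direction, "+" between domains of one protein, "*" on the protein whose gi is listed) can be read mechanically. The Python sketch below is one possible reading, not a definitive parser: the function name is hypothetical, and the strand rule (a gene preceded by "<-" is taken as reverse, otherwise a gene followed by "->" is taken as forward) is an assumption that resolves ambiguous cases such as "<-?->" arbitrarily.

```python
import re

ARROWS = re.compile(r"(->|<-|\|\|)")       # gene separators; "||" is kept as a boundary
DOMAIN_SEP = re.compile(r"\+(?![^()]*\))")  # "+" between domains, but not inside parentheses

def parse_operon(operon):
    """Return a list of genes as dicts with domains, strand and marked flag."""
    parts = ARROWS.split(operon.strip())
    genes = []
    for i, part in enumerate(parts):
        name = part.strip()
        if i % 2 == 1 or not name:
            continue  # odd indices are separators; empty strings lie between adjacent arrows
        before = parts[i - 1] if i > 0 else ""
        after = parts[i + 1] if i + 1 < len(parts) else ""
        if before == "<-":
            strand = "-"   # assumed reverse orientation
        elif after == "->":
            strand = "+"   # assumed forward orientation
        else:
            strand = "?"   # orientation not recoverable from the string
        genes.append({
            "domains": DOMAIN_SEP.split(name.rstrip("*")),
            "strand": strand,
            "marked": name.endswith("*"),
        })
    return genes

genes = parse_operon("ThiS/MoaD->OAHShyd->E1 solo->JAB->DsrE/H->SirA*->")
marked = [g["domains"] for g in genes if g["marked"]]
```

For the first row of the table above, the only marked gene is the one with the single domain SirA, matching the listed gi.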

Operons lacking SirA (gis are of the ThiF/E1-like protein, marked with an asterisk)
GI            LENGTH      Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
34483109      272         PAPSR->ATP_sulf->Sulf_adenyltransf_large->ThiS/MoaD->E1*->JAB->Sulf_reductase(Fe-S binding protein)->                 Wolinella succinogenes;                                        proteobacteria>epsilonproteobacteria          MOLYBDOPTERIN BIOSYNTHESIS PROTEIN MOEB [Wolinella succinogenes]
77686500      269         CysTRNAsyn_deacylase->ThiS/MoaD->E1*->JAB->Sulf_reductase(Fe-S binding protein)->                                     Alkaliphilus metalliredigenes QYMF                             firmicutes                                    UBA/THIF-type NAD/FAD binding fold:MoeZ/MoeB [Alkaliphilus metalliredigenes QYMF]

ThiS/MoaD+Sulf_reductase-containing operon subtype (gis are of the ThiS/MoaD protein, marked with an asterisk)
GI            LENGTH      Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
71366157      648         Sulf_reductase+ThiS/MoaD*->PAPSR->                                                                                    Nocardioides sp. JS614                                         actinobacteria                                Ferredoxin--nitrite reductase [Nocardioides sp. JS614]
88931571      639         Sulf_reductase+ThiS/MoaD*->PAPSR->                                                                                    Acidothermus cellulolyticus 11B                                actinobacteria                                Ferredoxin--nitrite reductase [Acidothermus cellulolyticus 11B]
-------------------------------------------------------------------------------------------------------------

5. Phage Tail assembly associated Ub
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gp: Phage containing Ub domain; also called I-tail component; gpK-JAB
J: host specificity protein J; STF : lambda side tail fiber protein

- Operons of the type JAB+NlpC->Ub->gpJ (gis are of the JAB protein, marked with an asterisk)

GI            LENGTH      Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
38707909        194       JAB+NlpC*->Ub->gpJ->                                                                                                  Bacteriophage phi1026b                                         bacteriophages                                gp19 [Bacteriophage phi1026b]
76556246        226       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Phage BP-4795                                                  bacteriophages                                putative tail component [Phage BP-4795]
71834086        191       gpM->gpL->Ub->gpJ->                                                                                                   Bacteriophage JK06                                             bacteriophages                                hypothetical tail assembly protein I [Bacteriophage JK06]
77864688        187       gpL->JAB+NlpC*->Ub->gpJ->                                                                                             Burkholderia cepacia phage Bcep176                             bacteriophages                                gp63 [Burkholderia cepacia phage Bcep176]
80750693        190       gpL->HNH->JAB+NlpC*->Ub->gpJ->                                                                                        Bacteriophage RTP                                              bacteriophages                                putative tail assembly protein [Bacteriophage RTP]
46402106        197       gpL->JAB+NlpC*->?->Ub->gpJ+X->                                                                                        Bacteriophage phiKO2                                           bacteriophages                                Gp20 [Bacteriophage phiKO2]
17975181        194       gpL->JAB+NlpC*->Ub->gpJ->                                                                                             Bacteriophage phiE125                                          bacteriophages                                putative tail component protein [Bacteriophage phiE125]
11877308        240       gpL->JAB+NlpC*->Ub->gpJ->                                                                                             Neisseria meningitidis phage 2120                              bacteriophages                                putative protein I [Neisseria meningitidis phage 2120]
9630484         192       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Enterobacteria phage N15                                       bacteriophages                                gp20 [Bacteriophage N15]
215124          223       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Enterobacteria phage lambda                                    bacteriophages                                I (tail component;223) [bacteriophage lambda]
51773733        180       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Bacteriophage CP-1639                                          bacteriophages                                putative tail fiber component I [Bacteriophage CP-1639]
9634139         202       gpL->JAB+NlpC*->?->Ub->?->?<-?->gpJ->                                                                                 Enterobacteria phage HK022                                     bacteriophages                                gp21 [Enterobacteria phage HK022]
45686326        199       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Enterobacteria phage T1                                        bacteriophages                                putative tail assembly protein [Enterobacteria phage T1]
84357775        150       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Burkholderia cenocepacia PC184                                 proteobacteria>betaproteobacteria             COG4723: Phage-related protein, tail component [Burkholderia cenocepacia PC184]
83717443        194       gpL->JAB+NlpC*->Ub->gpJ->                                                                                             Burkholderia thailandensis E264                                proteobacteria>betaproteobacteria             Bacteriophage lambda tail assembly protein I [Burkholderia thailandensis E264]
76579036        188       gpL->JAB+NlpC*->Ub->gpJ->lysozyme->                                                                                   Burkholderia pseudomallei 1710b                                proteobacteria>betaproteobacteria             Bacteriophage lambda tail assembly protein I [Burkholderia pseudomallei 1710b]
16419562        234       JAB+NlpC*->Ub->gpJ->STF->                                                                                             Salmonella typhimurium LT2                                     proteobacteria>gammaproteobacteria            Gifsy-2 prophage probable tail assembly protein [phage Gifsy-2]
83587164        190       gpM->gpL->JAB+NlpC*->Ub->gpJ(N)->gpJ(C)->                                                                             Escherichia coli 101-1                                         proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli 101-1]
75208766        180       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli B171                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli B171]
75210818        182       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli B171                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli B171]
75208867        144       JAB+NlpC*->Ub->gpJ->                                                                                                  Escherichia coli B171                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli B171]
75211970        190       gpM->gpL->JAB+NlpC*->Ub->gpJ(N)->gpJ(C)->                                                                             Escherichia coli B171                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli B171]
75229909        190       gpL->JAB+NlpC*->Ub->gpJ(N)->gpJ(C)->                                                                                  Escherichia coli B7A                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli B7A]
26107858        210       gpL->JAB+NlpC*->Ub->gpJ(N)->gpJ(C)->                                                                                  Escherichia coli CFT073                                        proteobacteria>gammaproteobacteria            AE016759_331 Putative tail component of prophage [Escherichia coli CFT073]
26107735        204       gpL-><-?->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli CFT073                                        proteobacteria>gammaproteobacteria            AE016759_208 Putative tail assembly protein of cryptic prophage [Escherichia coli CFT073]
26109404        210       gpL<-?->JAB+NlpC*->Ub->gpJ->                                                                                          Escherichia coli CFT073                                        proteobacteria>gammaproteobacteria            AE016765_9 Putative tail component of prophage [Escherichia coli CFT073]
75239817        180       JAB+NlpC*->Ub->gpJ->                                                                                                  Escherichia coli E110019                                       proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E110019]
75235846        193       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli E110019                                       proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E110019]
16421139        215       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Salmonella typhimurium LT2                                     proteobacteria>gammaproteobacteria            Gifsy-1 prophage protein [Salmonella typhimurium LT2]
75255450        193       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli E22                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E22]
75255278        193       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli E22                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E22]
24374467        209       NlpC(fragment)->?->Ub->gpJ->                                                                                          Shewanella oneidensis MR-1                                     proteobacteria>gammaproteobacteria            prophage LambdaSo, tail assembly protein I [Shewanella oneidensis MR-1]
75258709        130       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli E22                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E22]
75257430        180       Bro-NJAB+NlpC*->Ub->gpJ(N)->gpJ(middle)->gpJ(C)->                                                                     Escherichia coli E22                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E22]
75259495        182       gpL->JAB+NlpC*->Ub->gpJ->gpM->                                                                                        Escherichia coli E22                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E22]
75175531        193       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Shigella boydii BS512                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Shigella boydii BS512]
75239568        182       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli F11                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli F11]
75239670        190       gpM->gpL->JAB+NlpC*->Ub->gpJ(N)->                                                                                     Escherichia coli F11                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli F11]
12514222        300       gpL->JAB+NlpC*->JAB+Ub->gpJ(N)->gpJ(Fn3+C)->                                                                          Escherichia coli O157:H7 EDL933                                proteobacteria>gammaproteobacteria            AE005290_12 putative tail component encoded by cryptic prophage CP-933M; partial [Escherichia coli O157:H7 EDL933]
12515098        225       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->gpM->                                                                              Escherichia coli O157:H7 EDL933                                proteobacteria>gammaproteobacteria            AE005349_14 putative tail component of prophage CP-933O [Escherichia coli O157:H7 EDL933]
12516097        178       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->gpM->                                                                              Escherichia coli O157:H7 EDL933                                proteobacteria>gammaproteobacteria            AE005420_1 putative tail fiber component I of prophage CP-933U [Escherichia coli O157:H7 EDL933]
46143649        181       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Actinobacillus pleuropneumoniae serovar 1 str. 4074            proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Actinobacillus pleuropneumoniae serovar 1 str. 4074]
82543715        204       gpL->JAB+NlpC*->Ub->gpJ->                                                                                             Shigella boydii Sb227                                          proteobacteria>gammaproteobacteria            putative tail component [Shigella boydii Sb227]
32043835        200       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Pseudomonas aeruginosa UCBPP-PA14                              proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Pseudomonas aeruginosa UCBPP-PA14]
75259293        193       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Escherichia coli E22                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E22]
75234649        193       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Escherichia coli E110019                                       proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E110019]
75176997        172       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Shigella boydii BS512                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Shigella boydii BS512]
75238944        190       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Escherichia coli E110019                                       proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E110019]
9946516         200       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Pseudomonas aeruginosa PAO1                                    proteobacteria>gammaproteobacteria            AE004499_8 probable bacteriophage protein [Pseudomonas aeruginosa PAO1]
75820383        200       gpM->gpL->JAB+NlpC*->Ub solo->gpJ->                                                                                   Vibrio cholerae V51                                            proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Vibrio cholerae V51]
74312870        210       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Shigella sonnei Ss046                                          proteobacteria>gammaproteobacteria            putative tail component of prophage [Shigella sonnei Ss046]
13361702        226       gpM->gpL->JAB+NlpC*->Ub->gpJ(N)->gpJ(C)->                                                                             Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria            putative tail assembly protein [Escherichia coli O157:H7 str. Sakai]
13361111        223       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria            tail assembly protein [Escherichia coli O157:H7 str. Sakai]
13362414        225       gpM->gpL->JAB+NlpC*->Ub->gpJ(N)->gpJ(C)->                                                                             Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria            putative tail assembly protein [Escherichia coli O157:H7 str. Sakai]
13360300        215       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria            putative tail assembly protein [Escherichia coli O157:H7 str. Sakai]
74312266        180       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Shigella sonnei Ss046                                          proteobacteria>gammaproteobacteria            putative tail component of prophage [Shigella sonnei Ss046]
84318835        200       gpM->gpL->JAB+NlpC*->Ub->gpJ(N)->                                                                                     Pseudomonas aeruginosa C3719                                   proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Pseudomonas aeruginosa C3719]
56383531        180       gpM->gpL->JAB+NlpC*->Ub->gpJ->gpM->                                                                                   Shigella flexneri 2a str. 301                                  proteobacteria>gammaproteobacteria            putative tail component [Shigella flexneri 2a str. 301]
68345404        188 **    Bro-N->KilA-N+C->Ub->gpJ->P5->                                                                                        Pseudomonas fluorescens Pf-5                                   proteobacteria>gammaproteobacteria            prophage LambdaSo, tail assembly protein I [Pseudomonas fluorescens Pf-5]
24050968        191       gpM->gpL->JAB+NlpC*->Ub->gpJ->gpM->                                                                                   Shigella flexneri 2a str. 301                                  proteobacteria>gammaproteobacteria            putative tail component [Shigella flexneri 2a str. 301]
71037999        187       gpL->JAB+NlpC*->Ub->gpJ->                                                                                             Psychrobacter arcticus 273-4                                   proteobacteria>gammaproteobacteria            probable phage protein tail protein [Psychrobacter arcticus 273-4]
52788057        195       gpM->gpL->JAB+NlpC*->Ub->STF (distinct tail fiber protein)->                                                          Yersinia pestis                                                proteobacteria>gammaproteobacteria            phage lambda tail assembly protein I [Yersinia pestis]
75254904        226       Ub->gpJ(N)->                                                                                                          Escherichia coli E22                                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E22]
2996351         183       gpM->gpL->JAB+NlpC*->Ub->host_specificity_J->                                                                         Yersinia pestis KIM                                            proteobacteria>gammaproteobacteria            unknown [Yersinia pestis KIM]
66046010        192       JAB+NlpC*<-?->Ub<-?->gpJ->                                                                                            Pseudomonas syringae pv. syringae B728a                        proteobacteria>gammaproteobacteria            Bacteriophage lambda tail assembly I [Pseudomonas syringae pv. syringae B728a]
16506034        195       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Salmonella enterica subsp. enterica serovar Typhi str. CT18    proteobacteria>gammaproteobacteria            putative phage tail protein [Salmonella enterica subsp. enterica serovar Typhi str. CT18]
62179803        168       gpM->gpL->JAB+NlpC*->Ub->gpJ->                                                                                        Salmonella enterica subsp. enterica serovar Choleraesuis str.  proteobacteria>gammaproteobacteria            Gifsy-1 prophage VtiI [Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67]
16419434        225       JAB+NlpC*->gpM->gpL->Ub-><-superoxide_dismutase->host_specificity_J->                                                 Salmonella typhimurium LT2                                     proteobacteria>gammaproteobacteria            putative Fels-1 prophage tail assembly protein [phage Fels-1]
75208698        193       gpM->gpL->JAB+NlpC*->Ub-><-superoxide_dismutase->host_specificity_J->                                                 Escherichia coli B171                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli B171]
75235151        193       gpM->gpL->JAB+NlpC*->Ub-><-superoxide_dismutase->host_specificity_J->                                                 Escherichia coli E110019                                       proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E110019]
75214996        193       gpM->gpL->JAB+NlpC*->Ub-><-superoxide_dismutase->host_specificity_J->                                                 Escherichia coli E110019                                       proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E110019]
75233804        193       JAB+NlpC*->gpM->gpL->JAB+NlpC*->Ub-><-superoxide_dismutase->host_specificity_J->                                      Escherichia coli E110019                                       proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli E110019]
13361452        226       gpM->gpL->JAB+NlpC*->Ub->?<-Superoxide_dismutase->host_specificity_J->                                                Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria            putative tail assembly protein [Escherichia coli O157:H7 str. Sakai]
13360578        226       gpM->gpL->JAB+NlpC*->Ub->?<-Superoxide_dismutase->host_specificity_J->                                                Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria            putative tail assembly protein [Escherichia coli O157:H7 str. Sakai]
84327632        60        gpH->gpL->JAB+NlpC*->Ub solo->gpJ->Lysozyme->                                                                         Pseudomonas aeruginosa 2192 fragment                           proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Pseudomonas aeruginosa 2192]

Operons with no gpJ in vicinity (Ub gis)
Gis are of the JAB or NlpC protein, marked with an asterisk

GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
13361023        94        gpM->gpL->JAB+NlpC*->Ub->YjbI->?->                                                                                    Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria            putative tail assembly protein [Escherichia coli O157:H7 str. Sakai]
82533244        194       gpL->JAB+NlpC*->Ub solo (perhaps incomplete assembly)->                                                               Burkholderia pseudomallei 1106b                                proteobacteria>betaproteobacteria             hypothetical protein Bpse110_02005448 [Burkholderia pseudomallei 1106b]
62179570        221       gpL->JAB+NlpC*->Ub->?->lambda p27(distinct tail fiber)->                                                              Salmonella enterica subsp. enterica serovar Choleraesuis str.  proteobacteria>gammaproteobacteria            Gifsy-2 prophage probable tail assembly protein [Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67]
75207839        180       NlpC*(fragmented?)->Ub->                                                                                              Escherichia coli B171                                          proteobacteria>gammaproteobacteria            COG4723: Phage-related protein, tail component [Escherichia coli B171]
17428712        200       gpL->JAB+NlpC*->Ub->                                                                                                  Ralstonia solanacearum                                         proteobacteria>betaproteobacteria             probable phage hk022 gp20-related protein [Ralstonia solanacearum]

Operons without JABs in the operon
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Gis are of the Ub protein, marked with an asterisk
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
82741527        216       Bro-N->Ub->gpJ->                                                                                                      Shewanella sp. W3-18-1;                                        proteobacteria>gammaproteobacteria            prophage LambdaSo, tail assembly protein I [Shewanella sp. W3-18-1]
82743846        238       Ub solo                                                                                                               Shewanella sp. W3-18-1;                                        proteobacteria>gammaproteobacteria            prophage LambdaSo, tail assembly protein I [Shewanella sp. W3-18-1]
15980136        206       Bro-N->ribbon->?->Ub->?->host_specificity_J->                                                                         Yersinia pestis CO92;                                          proteobacteria>gammaproteobacteria            putative phage tail assembly protein [Yersinia pestis CO92]
71558268        152       HTH->Ub->gpJ(N)->                                                                                                     Pseudomonas syringae pv. phaseolicola 1448A;                   proteobacteria>gammaproteobacteria            prophage PSPPH03, putative tail assembly protein I [Pseudomonas syringae pv. phaseolicola 1448A]
84780140        198       HNH->Ub->gpJ->lysozyme->                                                                                              Sodalis glossinidius str. 'morsitans';                         proteobacteria>gammaproteobacteria            putative phage tail assembly protein [Sodalis glossinidius str. 'morsitans']

B. Note the Domain_Z protein (Domain_Z: an all-beta domain)
(Gis are of the Ub+gpJ protein- marked with an asterisk)
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
31788497        1574      Domain_Z->NlpC->Ub+gpJ*(N+FN3+C)  [1-173 Ubl+ 173-673 (N)]->                                                          Xanthomonas campestris phage Xp10                                                                            22R [Xanthomonas oryzae bacteriophage Xp10]
84570663        1571      Domain_Z->NlpC->Ub+gpJ*(N+FN3+C)->                                                                                    Xanthomonas oryzae phage OP1                                                                                 putative tail component protein [Xanthomonas oryzae phage OP1]
23013869        775       NlpC solo->Ub+gpJ*(N)->P5->?Y-> (Note no JAB)                                                                         Magnetospirillum magnetotacticum MS-1                          proteobacteria>alphaproteobacteria            COG4733: Phage-related protein, tail component [Magnetospirillum magnetotacticum MS-1]
85716602        1267      NlpC->Ub+gpJ*(N+FN3+distinct_C)-> (Note no JAB)*********                                                              Nitrobacter sp. Nb-311A                                        proteobacteria>alphaproteobacteria            tail fiber protein, putative [Nitrobacter sp. Nb-311A]
23016384        508       Domain_Z->NlpC solo->Ub+gpJ*(N)->gpJ(fragment of C)->                                                                 Magnetospirillum magnetotacticum MS-1                          proteobacteria>alphaproteobacteria            COG0001: Glutamate-1-semialdehyde aminotransferase [Magnetospirillum magnetotacticum MS-1]
82944335        775       Domain_Z->NlpC->Ub+gpJ*(N)->P5->?Y->                                                                                  Magnetospirillum magneticum AMB-1                              proteobacteria>alphaproteobacteria            Phage-related protein [Magnetospirillum magneticum AMB-1]
82945132        775       Domain_Z->NlpC->Ub+gpJ*(N)->P5->?Y->                                                                                  Magnetospirillum magneticum AMB-1                              proteobacteria>alphaproteobacteria            Phage-related protein [Magnetospirillum magneticum AMB-1]
33568295        1268      Domain_Z->NlpC->Ub+gpJ*(N+distinct_C)->                                                                               Bordetella bronchiseptica RB50                                 proteobacteria>betaproteobacteria             phage-related hypothetical protein [Bordetella bronchiseptica RB50]
33564325        1318      NlpC_solo->Ub+gpJ* (N+FN3+C)->                                                                                        Bordetella pertussis Tohama I                                  proteobacteria>betaproteobacteria             phage-related conserved hypothetical protein [Bordetella pertussis Tohama I]
67545284        767       Domain_Z->NlpC solo->Ub+gpJ*(N)->(Note no JAB)*********                                                               Burkholderia vietnamiensis G4                                  proteobacteria>betaproteobacteria             phage-related conserved hypothetical protein [Burkholderia vietnamiensis G4]
68212786        1171      Domain_Z->NlpC->Ub+gpJ*(N)->                                                                                          Methylobacillus flagellatus KT                                 proteobacteria>betaproteobacteria             similar to Phage-related protein tail component [Methylobacillus flagellatus KT]
33576899        1318      Domain_Z->NlpC->Ub+gpJ*(N+FN3+distinct C)->                                                                           Bordetella bronchiseptica RB50                                 proteobacteria>betaproteobacteria             phage-related conserved hypothetical protein [Bordetella bronchiseptica RB50]
46449977        1346      NlpC_solo->Ub+gpJ*(N+FN3+distinct_C)->                                                                                Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough      proteobacteria>deltaproteobacteria            tail fiber protein, putative [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough]

Versions of the above without NlpC or JAB
Gis are of the Ub+gpJ protein, marked with an asterisk
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
23015894        766       ?H->Ub+gpJ(N)*->Lysozyme->?Y->                                                                                        Magnetospirillum magnetotacticum MS-1                          proteobacteria>alphaproteobacteria            COG4733: Phage-related protein, tail component [Magnetospirillum magnetotacticum MS-1]
78033450        766       Domain_Z <-PIN<-YoeB->Ub+gpJ(N)*->X->?Y-> (note toxin-antitoxin insert)                                               Magnetospirillum gryphiswaldense                               proteobacteria>alphaproteobacteria            phage-related protein [Magnetospirillum gryphiswaldense]
71548099        1644      Ub+gpJ*(distinctN+gp44+distinct_C) (NlpC in genome but not in vicinity)                                               Syntrophobacter fumaroxidans MPOB                              proteobacteria>deltaproteobacteria            similar to Phage-related protein tail component [Syntrophobacter fumaroxidans MPOB]
46916380        1294      Ub+gpJ(N)*                                                                                                            Photobacterium profundum SS9                                   proteobacteria>gammaproteobacteria            hypothetical protein [Photobacterium profundum SS9]
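
The operon strings in the tables above use a compact notation: "->" gives gene order, "+" separates the domains of a single protein, and "*" marks the protein whose gi is listed. The Python sketch below shows one possible way to split such a string into genes and domains. The function name parse_operon and the output layout are hypothetical and not part of the original study. Treating "<-" as just another gene separator is an assumption, and the sketch handles annotations such as "(N+FN3+C)", "||" and "?" only naively (a "+" inside parentheses is split like any other).

```python
import re

def parse_operon(operon: str):
    """Split an operon string such as 'gpM->gpL->JAB+NlpC*->Ub->gpJ->'
    into a list of genes. Each gene is a dict holding its domains and a
    flag saying whether it carries the asterisk that marks the listed gi."""
    genes = []
    # Both '->' and '<-' are treated as gene separators (an assumption).
    for token in re.split(r"->|<-", operon):
        token = token.strip()
        if not token:
            continue  # skip the empty piece after a trailing arrow
        marked = "*" in token
        # '+' separates domains within one protein.
        domains = [d.strip() for d in token.replace("*", "").split("+") if d.strip()]
        genes.append({"domains": domains, "marked": marked})
    return genes

example = parse_operon("gpM->gpL->JAB+NlpC*->Ub->gpJ->")
for gene in example:
    print(gene)
```

For the example string, the third gene is reported with the two domains JAB and NlpC and with the marked flag set, matching the asterisk in the table rows.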


Domain_Z alignment:
FINAL                                        -HHHHHHH-------EEEEEEEEE------------EEEEEE---EEEEE------------EEEEEEEEEEE-------------EEEEE----HHHHHHHHHH-------EEEEEEEEEE-------------EEEE---EEE--EEEEEEEEE-HHHH--------------------
ALIGN                                        ----HHHH-------EEEEHEEE-------------EEEEEE----HEHHHH----------EEEEEE-----------------EEEEEEE-----HHHHHHHHH------HEEEEEEEE--------------EEEEE---------EEEHHHHHHH----------------------
HMM                                          -HHHHHHH-------EEEEEEEEE-----E------EEEEEE--EEEEEE--H---------EEEEEE--EEEE-----------EEEEEEE---HHHHHHHHHH-------EEEEEEEEEE-----------E-EEEE---EEE--EEEEEEEEE-HHHHHH-H---EEEE---------
FREQ                                         -HHHHHHHH------EEEEEEEE-------------EEEEE-----HHHHHHH---------EEEEEEE-----------------EEEEE-----HHHHHHHHH-------EEEEEEEEE---------------EE----------HHHHHHHHHH-----------------------
PSSM                                         -HHHHHH------------EEEE-------------EEEEE----EEEEE------------EEEEEE-EEEE-------------EEEEE----HHHHHHHHH--------EEEEEEEEE------------E-EEEE---EE----EEEEEEEE-------------------------
mgI418_Mgry_78033453                         SQALKEAFASAPAGTVILDTLEIWHPTFDE------PIRVVRDHADLTARLEAGAPRDG-GKRVTFAALAFEFSPPPVDT-APVPEITVTLDNVGSDITDALEGAAV-SQQVIEITWRPYLSTDLNGPHMDPPI-TMTLTDVEAD--TMRVTGRARMLDAGNK-SFPSITYTARRFPGLAR
BB3488_Bbro_33576901                         EQALKEAYASAPQDRVVFDTLELRHPAFVDPHGEPTAVRVVLGYEDIRARLETEAPLDG-GQDVMFQAGAFRFRLPGFEE-GQVPSLLIAIDGASEQIVDHVEAAVQ-SRFPIYVTYRPYLSTDLSMPQMNPPI-TMELNKVTVT--GSSVSGTATLSDVHNW-AFPHERYVRERFPGLFR
BP3364_Bper_33564327                         EKALKEAYASAPQDRVVFDTLELRHPAFVDEHGERTAVRVVLGYEDIYARLEAEAPLDG-GKEVLFQAGAFRLRLPGFEE-GQVPSLLITIDGASEKIVDHVEAAVQ-SRYPIYATYRPYVSTDLSRPQMNPPI-TMELNKVTVT--GASVSGTATLADVHNW-AFPHQRYMRERFPGLFR
Bcep1808DRAFT_4080_Bvie_67545282             SEAIKEAYASAPSQQIILHTLELRHPAFVDEDGQQVAIRVVRDTGDLWARLESQAPLQA-GERVQFVAMGFELDLPPVDT-MPVPEITVTLDNVSREIVRHLDAAAE-SQSVIEVTYRPYLSTDLEGPQMDPPI-HLVLTEVEAD--IFRVTGRARMLDVGNK-AFPGVSYTAKTFPGLTR
amb1190_Mmag_82945130                        SQALKEAFASAPAGTVILDTLEIWHPTFIE------PIRVVRDHADLTARLEAGAPRDG-GKRVTFAALAFEFSPPPVDT-APVPEITVTLDNVGSDITDALEGAAI-SQQVIEITWRPYLSTDLNGPHMDPPI-TMTLTEVEAD--TMRVTGRARMLDAGNK-SFPSITYTARRFPGLAR
amb0393_Mmag_82944333                        SQALKEAFASAPAGTVVLDTLEIWHPTFDE------PIRVVRDHADLTARLEAGAPRDG-GKRVTFAALAFEFSPPPVDT-APVPEITVTLDNVGSDITDALEGAAI-SQQVIEITWRPYLSTDLNGPHMDPPI-TMTLTEVEAD--TMRVTGRARMLDAGNK-SFPSITYTARRFPGLAR
Magn03007629_Mmag_23013169                   SQALKEAFASAPAGTVILDTLEIWHPTFDE------PIRVVRDHADLTARLETGAPRDG-GKRVTFAALAFEFSPPPVDT-APVPEITVTLDNVGSDITDALEGAAI-SQQVIEITWRPYLSTDLNGPHMDPPI-TMALTEVEAD--TMRVTGRARMLDAGNK-SFPSITYTARRFPGLAR
Magn03010833_Mmag_46200892                   SQALKEAFASAPAGTVVLDTLEIWHPSFTT------PIRVVRDHADLTARLEAGAPRDG-GKRVTFAALAFEFSPPPVDT-APVPEITVTLDNVGSDITDALEGAAI-SQQVIEITWRPYLSTDLNGPHMDPPI-TMALTEVEAD--TMRVTGRARMLDAGNK-SFPSITYTARRFPGLAR
Magn03010336_Mmag_46201139                   ---MREAFAAAPTNTVILHTLEIWHPTFSE------PIRVVRDHADLTARLEAGAPRGG-GQKVTFIALAFDLDLPPVDT-APVPEITVTMDNVGQEIVDALEAAAI-SQDKIDIIYRPFLSTDLEGPHMDPPI-TLTLAEVEAD--TLRVTGRARMLDVGNK-AFPSITYTAKRFPGLAR
MflaDRAFT_2307_Mfla_68212788                 EEAIKEAYASNPVGEVELNTLEFRHPNFVDQNGDPSAIRVVLDNVDHYLTLEDDAPLNP-GESVLFVRMAFELTKPEVDS-VAGPAMDITLNNITPEIETQIRAATR-SPYPVIGMYRLYLLSDKTQPQNNPPM-EFQLDNVNAD--DESITARATFGNEAQR-PFPNENYTATRFPGLSR
PputDRAFT_2895_Pput_82737129                 MTALEVVYAS--GGDDIVPTLEISCPAWDK------TLYLVQDFEDFRATTEA-------GKTVTFLASAIDVALPAKDN-SGAQTLTFVIDNVTGEAQQLIDASLE-AEARVTIVYREYLYSIPGEPA-DRPY-RMTSFGGTMD--GPTIQIEAGYYDLINM-MWNRFRYTTDFAPGLTY
DaceDRAFT_2556_Dace_68177301                 TTAYKEAIAYANPETTIWEAIRITHSSWLE------SILLVNSYEVFTANL---------G---SFIPVQWSMKLPEVEA-ETRGELTLKIDLLPLSIKRTLFSGAS-KTDAMKL--YYYEYTDTTDPAGQLPA-ALEISKVEMDEDNQVTTIKALYADLVNI-VFPRRRMTTTLIPGGLV
BB1708_Ppro_46916381                         KNARINLNATT-ADEPFLILVEIHHQSFSE------PARIVADTQDITHA----------G--YRYTALPIDVTLPDEGE-GKLPQAKLIIDNVGRVLTDEIDGTRG-FEGGTCVI-MQVMRSNPS--HVEWGI-ELDVLDVSID--QLKISATLGYEDMLNK-PAVTMRFTPERSPGLF-
BB1708_Bbro_33568293                         TQAKRNVNATS-ADEPLLELIEITHPDLAV------PARFVNDTQDIQVE----------G--HAFLACRFDLSIPDDQA-EQVPGARLEVDNIGRELTQWLEYSQG-GKGAKC---RLILLLRSNPSNIELDM-TMDLTGLEIT--NFRVSGDLGFKNTLMQ-SGVAMRFDPLTAPGVF-
NB311A_12117_Nsp._85716598                   SLNFRQELFGQESGEVPILLVTITHPELPE------PIYLSTDPTERFSTDPLMYRTRS-R-GIDFLYAGIDVTLPDEQD-KSPPASKLTIANVTRGLIPLARSVS--TPPAVKIEV--VLASDPDTVE-MTWP-AMDMTNLTYD--ASFLTFDLTIDALVTE-PYPSGTFSPAYFPGLFY
RB2654_16431_Rbac_84684053                   -MPWLDAINDAETAEVVLTLVTLDHADWAA------PVRLVNDVADFEHD----------G--ETYTAAGFQVAMPDQAE-DRNAAMRWTLNDVDHDVAVLLRTTN--DVIDIEVSY--VLASDPDTVQ-AGPF-EAEIRQADLR--YGSVSGALVVYPVMEEVANASFRFSTGDFPGLI-
_BPMB78_4455819                              EAAYRRKLASNPDGEMDFITLEIYHPLLSK------RWLLVRGVKDLTATLET-------GEVVTFEGTPMEAKNAANNN-DMDQTASFSLPDVLNILDEEMDRIPYDNKELPKFIFRRYVSTDLTYP-CDGPV-VYELQTLTQE----KGVFTAETGTPMLNQRATGILMTPEEIPLLRG
_BPKS7_62327363                              EAAYRRKLASNPDGEMDFITLEIYHPLLSK------RWLLVRGADDLTATLET-------GEVVTFEGTPMEAKNAANNN-DMDQTASFSLPDVLNILDEEMDRIPYDNKELPKFIFRRYVSTDLTYP-CDGPV-VYELQTLTQE----KGVFTAETGTPMLNQRATGILMTPEEIPLLRG
D3p22_BPD3_9635614                           ATALERFYAS-DGPDLPIATIEITRPSRPH------PIFICQGFKDLTCMTED-------GRLLTFIAGAIDVSIPKRDN-SGNQNVGFAIDNVTGFAQQYIAEAID-AGEPVTLVLRIYLESDLTAPA-ERPY-RMRVKGADFE--SLTVQVEAGYYDLINT-AALRHIYNVSEFPGLKY
YintA_01000766_Yint_77979284                 MTILNRLYASG-GSEVIIQTLEIAVGDK--------TYWLTKGWEDITAVLES-------GESATFTACGIDIALPARNS-DGTQDLQFAISNIDGIVSTAIRGALD-YLSTALLTYRYYVSTDLSAPA-AKPY-TLIVKSGYWT--ATEVQITAGYMNVLDT-AWPRYRYTLPNYPGLRY
PP1578_Pput_26988310                         MSILKRLYASS-GPEIIHEVLEITDGIT--------TYWMTKGWDELTITLET-------GQVVVCTPCGMDLALPARND-DGTQDLTFALSNIDGIASGFVRAALR-DGRRMSLVYRAYTSDDLGAPA-HAPH-RFKIKGGSVT--AAQVSVTAGYFDLLDT-RWPRNTYNLNEFPGLRY
PputDRAFT_4718_Pput_82734887                 MSLIEECYASGRGE--LVDTIEARKEGGTV------SHLYCSGWEDRVCTTED-------GRTLTFVAMAMDLALPKNDN-SAFQNLVLGLDNVTGEVQEVVEEAKA-ADDRFIITFRRYLAEDLTFPQ--ERY-RMTLLSREYE--DDVAKLTAGFFDLLNT-NGLRTVLTTTLAPGLKY
_BPXp10_31788495                             SFVSNRQRLTDYSG--ILQVLEISAAYLPD------TLRLVKDVKDWTIN----------G--QDYIGLEFTITLPEDRS-GSNGVLEIKMSNVGRDVTEDLEKRPPDQMMTAVLK----LSDRETPGEFYRII-PMPIDRVSID--AQTVTLTASMDSIMRQ-QACRLRFTPFITPGLF-
_BPOP1_84570661                              SFVSNRQRLTDYSG--ILQVLEISAAYLPD------TLRLVKDVKDWTIN----------G--QDYIGLEFTITLPEDRS-GSNGVLEIKMSNVGRDVTEDLEKRPPDQMMTAVLK----LSDRQTPGEFYRII-PMPIDRVSID--AQTVTLTASMDSIMRQ-QACRLRFTPFITPGLF-
_BPXp15_66392125                             MSTFKERKQRVRDPSGLLILMELSANSFQE------TLRIANDTDNWTSN----------G--LLYYGFPFKFTGPDDSD-GSNASSKIVIDNTGRGMSDDLESLQPNEIILVKL-----MITDFYNPSA-IIR-TLYLPMMGATIRVTQMEGRCGV-DYIMRQRSVQLASSPYTAPGSY-
SfumDRAFT_2313_Sfum_71544667                 VLVREKNKLATPDPWIVLLDIELDATH---------KLYFCSNNQNVTWS----------G--RVYTAFPFLLEPTEENSKGEIPSVSLKVANVTQVIHAYLEQLDGAVGATVTI--RVVNAGYLSEDASELDM-TFTVVSTSAD--AEWIVFTLGAPNPLRR-RFPPFRFIAKHCHWEFK
DVU_2155_Dvul_46449979                       -----------------------MHPSLAA------PLRISSDPTQRTVVTDEEVVYGTVSRAETFVFVPFSISLPNDSA-EETPQTSITIDNVGREMVPTIRALTSAPEITLEMV----MASTPDVVEAVFP--GFALSSVTYD--AMSISGTLSVTEFTTE-PCPAGTFNPAEFPGMF-
consensus/100%                               ....................hph...............bhs.s................. ....h....h.h..s............h.hs.....h...h..................................h.......p.........h...s......................
consensus/80%                                ..shcc.bhss.ssp.hh.slEl.ps.h........shblsps..Dhphp......... G....a.u.shchp.P..ps.s.s..hph.lssls..l.p.lc.........h.l....hl.sc.s.s..p.sh..h.l.phphs..s.pls.ph.h.sh.pp..hspbpass..hPGL..

Species abbreviations: BPD3 : Pseudomonas phage D3; BPKS7 : Bacteriophage KS7; BPMB78 : Bacteriophage MB78; BPOP1 : Xanthomonas oryzae phage OP1; BPXp10 : Xanthomonas campestris phage Xp10; BPXp15 : Xanthomonas campestris pv. pelargonii phage Xp15; Bbro : Bordetella bronchiseptica; Bper : Bordetella pertussis; Bvie : Burkholderia vietnamiensis; Dace : Desulfuromonas acetoxidans; Dvul : Desulfovibrio vulgaris; Mfla : Methylobacillus flagellatus; Mgry : Magnetospirillum gryphiswaldense; Mmag : Magnetospirillum magneticum; Mmag : Magnetospirillum magnetotacticum; Nsp. : Nitrobacter sp.; Ppro : Photobacterium profundum; Pput : Pseudomonas putida; Rbac : Rhodobacterales bacterium; Sfum : Syntrophobacter fumaroxidans; Yint : Yersinia intermedia
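
The row labels in the alignment above appear to combine a locus tag, a species abbreviation and a gi, joined by underscores (for example amb1190_Mmag_82945130). Assuming that reading holds, the Python sketch below splits a label from the right and looks the abbreviation up in a small excerpt of the abbreviation list. The names split_label and ABBREVIATIONS are illustrative only and not part of the original study.

```python
# Excerpt of the species abbreviation list above (not exhaustive).
ABBREVIATIONS = {
    "Bbro": "Bordetella bronchiseptica",
    "Bper": "Bordetella pertussis",
    "Mgry": "Magnetospirillum gryphiswaldense",
    "BPXp10": "Xanthomonas campestris phage Xp10",
}

def split_label(label: str):
    """Return (locus_tag, abbreviation, gi) for a label such as
    'BB3488_Bbro_33576901'. Splitting from the right keeps any
    underscores inside the locus tag intact."""
    locus, abbrev, gi = label.rsplit("_", 2)
    return locus, abbrev, gi

labels = [
    "BB3488_Bbro_33576901",
    "Bcep1808DRAFT_4080_Bvie_67545282",
    "_BPXp10_31788495",  # phage entries have no locus tag
]
for label in labels:
    locus, abbrev, gi = split_label(label)
    species = ABBREVIATIONS.get(abbrev, "not in this excerpt")
    print(locus or "(none)", abbrev, gi, species)
```

Note that the abbreviation Mmag is listed for two Magnetospirillum species, so a lookup keyed on the abbreviation alone cannot distinguish them.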
-------------------------------------------------------------------------------------------------------------
6. OPERONS WITH E2-like domains

6a.  Uncharacterized operon with a triple-module protein containing E2-like, E1-like and JAB domains (metallo-beta-lactamase (MBL) neighbor)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Gis are of the E2+E1 containing protein- marked with an asterisk
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
21110358        750         MBL->E2+E1+JAB*->                                                                                                   Xanthomonas axonopodis pv. citri str. 306                      proteobacteria>gammaproteobacteria            conserved hypothetical protein [Xanthomonas axonopodis pv. citri str. 306]
48864353        735         MBL->E2+E1+JAB*->                                                                                                   Microbulbifer degradans 2-40, 48864354                         proteobacteria>gammaproteobacteria            COG0476: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 [Microbulbifer degradans 2-40]
58038271        741         MBL->E2+E1+JAB*->                                                                                                   Gluconobacter oxydans 621H,                                    proteobacteria>alphaproteobacteria            hypothetical protein GOX2518 [Gluconobacter oxydans 621H]
68246513        495         MBL->E2+E1*->                              (E2+E1) only (JAB perhaps displaced by transposon);                      Magnetococcus sp. MC-1                                         proteobacteria                                UBA/THIF-type NAD/FAD binding fold [Magnetococcus sp. MC-1]
68559822        751         MBL->E2+E1+JAB*->MBL->                                                                                              Ralstonia metallidurans CH34                                   proteobacteria>betaproteobacteria             UBA/THIF-type NAD/FAD binding fold [Ralstonia metallidurans CH34]
74421923        223         (E2 only)\  MBL->E2->E1->JAB->                                                                                      Nitrobacter winogradskyi Nb-255                                proteobacteria>alphaproteobacteria            hypothetical protein Nwi_2872 [Nitrobacter winogradskyi Nb-255]
74421925        352         (JAB only)|                                                                                                         Nitrobacter winogradskyi Nb-255                                proteobacteria>alphaproteobacteria            hypothetical protein Nwi_2874 [Nitrobacter winogradskyi Nb-255]
74421924        235         (E1 only)/                                                                                                          Nitrobacter winogradskyi Nb-255                                proteobacteria>alphaproteobacteria            hypothetical protein Nwi_2873 [Nitrobacter winogradskyi Nb-255]
77387013        601         E2+E1->JAB->                                                                                                        Rhodobacter sphaeroides 2.4.1 (E2+E1, JAB neighbor)            proteobacteria>alphaproteobacteria            ThiF family protein [Rhodobacter sphaeroides 2.4.1]
77955313        851         MBL->E2+E1+JAB*->                                                                                                   Marinobacter aquaeolei VT8                                     proteobacteria>gammaproteobacteria            conserved hypothetical protein [Marinobacter aquaeolei VT8]
77955723        725         MBL->E2+E1+JAB*->                                                                                                   Marinobacter aquaeolei VT8                                     proteobacteria>gammaproteobacteria            hypothetical protein MaquDRAFT_3270 [Marinobacter aquaeolei VT8]
84502025        761         MBL->E2+E1+JAB*->                                                                                                   Oceanicola batsensis HTCC2597                                  proteobacteria>alphaproteobacteria            hypothetical protein OB2597_18097 [Oceanicola batsensis HTCC2597]
84717800        751         MBL->E2+E1+JAB*->                                                                                                   Polaromonas naphthalenivorans CJ2                              proteobacteria>betaproteobacteria             conserved hypothetical protein [Polaromonas naphthalenivorans CJ2]
85859492        1158        MBL->E2+E1+JAB+Calcineurin*->              (C-terminal calcineurin)                                                 Syntrophus aciditrophicus SB                                   proteobacteria>deltaproteobacteria            hesA/moeB/thiF type protein [Syntrophus aciditrophicus SB]
86559649        760         MBL->E2+E1+JAB*->                                                                                                   Clostridium perfringens                                        firmicutes                                    ThiF [Clostridium perfringens]
88705878        751         E2+E1+JAB*                                                                                                          gamma proteobacterium KT 71                                    proteobacteria>gammaproteobacteria            conserved hypothetical protein [gamma proteobacterium KT 71]
90019857        735         E2+E1+JAB*                                                                                                          Saccharophagus degradans 2-40                                  proteobacteria>gammaproteobacteria            hypothetical protein Sde_0208 [Saccharophagus degradans 2-40]
90419011        746         MBL->E2+E1+JAB*                                                                                                     Aurantimonas sp. SI85-9A1                                      proteobacteria>alphaproteobacteria            conserved hypothetical protein [Aurantimonas sp. SI85-9A1]
86475921        760         MBL->E2+E1+JAB*                                                                                                     Clostridium perfringens                                        firmicutes                                    ThiF [Clostridium perfringens]

----------------------------------------------------
6b. Uncharacterized operon coding a multidomain protein with E2 and E1 domains  (This version of the JAB is closer to the E2+E1+JAB type)
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Gis are of the E2+E1 protein- marked with an asterisk
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
71038912        589         Patatin->nuct_transferase->E2+E1*->JAB->                                                                            Psychrobacter arcticus 273-4                                   proteobacteria>gammaproteobacteria            (in the vicinity of transposase)
9654584         584         Patatin->nuct_transferase->E2+E1*->JAB->                                                                            Vibrio cholerae O1 biovar eltor str. N16961                    proteobacteria>gammaproteobacteria            (transposase in vicinity)
37927532        538         Patatin->nuct_transferase->E2+E1*->JAB->                                                                            Escherichia coli                                               proteobacteria>gammaproteobacteria            (Integrative conjugative element)
84786718        558         nuct_transferase->E2+E1*->JAB->                                                                                     Erythrobacter litoralis HTCC2594                               proteobacteria>alphaproteobacteria
85706659        550         nuct_transferase->E2+E1*->                                                                                          Roseovarius sp. 217                                            proteobacteria>alphaproteobacteria            (in the vicinity of transposase)
66965723        592         E2+E1*->JAB->                                                                                                       Arthrobacter sp. FB24                                          actinobacteria                                UBA/E1-type NAD/FAD binding fold [Arthrobacter sp. FB24]
86357617        562         Nuct_transferase->E2+E1*->JAB->                                                                                     Rhizobium etli CFN 42                                          proteobacteria>alphaproteobacteria            hypothetical protein RHE_CH01997 [Rhizobium etli CFN 42]
84499281        557         Nuct_transferase->E2+E1*->JAB->                                                                                     Oceanicola batsensis HTCC2597                                  proteobacteria>alphaproteobacteria            hypothetical protein OB2597_05120 [Oceanicola batsensis HTCC2597]
86475968        567         Nuct_transferase->E2+E1*->JAB->                                                                                     Clostridium perfringens                                        firmicutes                                    conserved hypothetical protein [Clostridium perfringens]
88937743        576         Nuct_transferase->E2+E1*->JAB->                                                                                     Geobacter uraniumreducens Rf4                                  proteobacteria>deltaproteobacteria            similar to Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 [Geobacter uraniumreducens Rf4]
22726448        572         Nuct_transferase->E2+E1*->                                                                                          Ruegeria sp. PR1b                                              proteobacteria>alphaproteobacteria            RC170 [Ruegeria sp. PR1b]
78684828        575         Nuct_transferase->E2+E1*->                                                                                          Shewanella sp. ANA-3                                           proteobacteria>gammaproteobacteria            similar to Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 [Shewanella sp. ANA-3]
84701417        589         nuct_transferase->E2+E1*->                                                                                          Parvularcula bermudensis HTCC2503                              proteobacteria>alphaproteobacteria            hypothetical protein PB2503_00627 [Parvularcula bermudensis HTCC2503]
2496738         583         E2+E1*->JAB->                                                                                                       Rhizobium sp. NGR234                                           proteobacteria>alphaproteobacteria            Y4QC_RHISN Hypothetical 63.6 kDa protein y4qC
2496721         593         E2+E1*->                                                                                                            Rhizobium sp. NGR234                                           proteobacteria>alphaproteobacteria            Y4OA_RHISN Hypothetical 65.2 kDa protein y4oA
92915671        591         Patatin->nuct_transferase->E1+E2*                                                                                   Mycobacterium sp. KMS                                          actinobacteria
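
The operon strings in these tables follow the notation given at the top of this file: "->" separates genes in 5' to 3' order, "+" separates the domains of a single protein, and an asterisk marks the component whose gi is listed. The following is only a minimal sketch of how such a string could be split mechanically, assuming that reading of the notation; the function name parse_operon is hypothetical, and the sketch does not handle the "<-" or "||" symbols used in other sections or free-text notes in the operon column.

```python
def parse_operon(operon):
    """Return a list of (domains, marked) tuples, one per gene, in 5'->3' order."""
    genes = []
    for gene in operon.split("->"):
        gene = gene.strip()
        if not gene:
            continue  # a trailing "->" leaves an empty final piece
        marked = "*" in gene  # asterisk flags the protein whose gi is listed
        domains = [d.strip() for d in gene.replace("*", "").split("+")]
        genes.append((domains, marked))
    return genes


# Example row taken from the 6b table (Psychrobacter arcticus 273-4).
row = parse_operon("Patatin->nuct_transferase->E2+E1*->JAB->")
for domains, marked in row:
    print("+".join(domains), "(listed gi)" if marked else "")
```

The asterisk is recorded per gene rather than per domain because, in these tables, it marks the whole protein (for example E2+E1*) rather than a single domain.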
----------------------------------------------------
6c. Uncharacterized operon coding a distinctive multidomain protein with E2 and E1 related domains
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Gis are of the E2+E1 protein- marked with an asterisk
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
2496664         519       ?->Metal?->JAB->E2+E1->                                                                                               Rhizobium sp. NGR234                                           proteobacteria>alphaproteobacteria            Y4JF_RHISN Hypothetical 55.4 kDa protein y4jF
14025925        519       ?->Metal?->JAB->E2+E1->                                                                                               Mesorhizobium loti MAFF303099                                  proteobacteria>alphaproteobacteria            mll6192 [Mesorhizobium loti MAFF303099]
20803932        485       ?->Metal?->JAB->E2+E1-> ; note part of symbiosis island (Integrated element)                                          Mesorhizobium loti                                             proteobacteria>alphaproteobacteria            HYPOTHETICAL CONSERVED TRANSMEMBRANE PROTEIN [Mesorhizobium loti]
86359719        514       ?->Metal?->JAB->E2+E1->  (plasmid p42a)                                                                               Rhizobium etli CFN 42                                          proteobacteria>alphaproteobacteria            hypothetical protein RHE_PA00014 [Rhizobium etli CFN 42]
23011188        110       Metal?->JAB->N+E1->                                                                                                   Magnetospirillum magnetotacticum MS-1                          proteobacteria>alphaproteobacteria            hypothetical protein Magn03005843 [Magnetospirillum magnetotacticum MS-1]
77690158        455       Ubl+Ubl+Ubl->Metal?->JAB->N+E1->                                                                                      Rhodopseudomonas palustris BisB5                               proteobacteria>alphaproteobacteria            UBA/THIF-type NAD/FAD binding fold [Rhodopseudomonas palustris BisB5]


-Alignment of potential metal binding domain

FINAL                           ---HHHHHHHHHHHHH-----HHHHHHHHHHHH--EEEE----EEEEEE----------EEEEEEEE---------EEE----------------------------------HHHHHHHHHHHHH--------EEEEE---EE---------------------HHHHHHHHHHHH---HHHHHHHHHHHHH---E--------
ALIGN                           ---------HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH----EEEEE-----------EEEEE-----------EEEE-------HHHHHHHHHHH------------HHHHHHHHHHHHHHHH--------EEEEE----------------EEEE-------HHHHHHHHHHH----HHHHHHHHHHHH------------
HMM                             ----HHHHHHHHHHHHHHHHH-----HHHHHHH-HHEHH---EEEEEEE-------HHHHHEEEEEE---------EEEE--------HHHHHHHHH-----------HHHHEEEHHHHHHHHHHE--------EEEEEE--EEEE--------HHHHEE-----HHHHHHHHHHHHHHHHHHHHHHHHEEE----E--------
FREQ                            --------HHHHHH-------HHHHHHHHHHHH-EEEEE-----EEEEE-----------EEEEEE----------EE----------------------------------HHHHHHHHHHHHHHH--------EEEE----------------EEE-------HHHHHHHHHHH-----HHHHHHHHHHHH------------
PSSM                            ---HHHHHHHHHHHHHHHHHHH------HHHH---EEEE----EEEEE---------EEEEEEEEEE---------EEEE-------------------------------------------------------EEE---------------------------HHHHHHHHHHHH---HHHHHEEEEEEEE---E--------
RHE_PA00016_Retl_86359721       MTESQFVDEAVSRRKFEREVAQYRELEDSYRRRGWFLLDATFPTVLVLFVALKVTPRSLVCAVRLDFTNYDLEPPSVTFVDPSTGTALPAKSLGFKMLRLNGLKEASPETVTTLAQQQRLSVQELLQAHSPDETPFLCLPGVREYHDHPAHTGDLWLLHRRSGEGSLHFILEQIWASGINPIRMLEYQIQMNFSGFQMDAAALPR
NGR234_174_Rsp._2182463         MPELQTVDPKVSRAKFDREISRFRAYADAYRMQGCFLIEESFPSAFFIFASPKVKPRVIGAAIEIDFTNYDLRPLSVVFVDPFTRQPIARKDLPLNMLRRPQLPGTPTEMISNLIQQNAVSLTDFIQANSLEDQPFLCMAGVREYHDNPAHSGDPWLLHRGSGEGCLAFILDKIIKYGTGPAEQLQIHLQVALGGLLVPPQAIPE
msi103_Mlot_20803930            MPEIQTVDPAVSRAKFDRQIGWFQTQAGAYRAQGCFLIEARFPTAFFIFAPPKIRPQIIGAAVEIDFSNYDLRPPSVVFVDPFTRRPVARKDLLLSMLRRPHLPGTPPGMISVLMQQKALSLSDFLQANSAEHTPFLCMAGVREYHDNPAHSGDSWLLHRGSGEGCLAFILDKIIKYGTGPVEQIQYQFQISVGAMVVPPSAIPE
mll8758_Mlot_14025927           MPEIQTVDPAVSRAKFDRQIGWFQTQAGAYRAQGCFLIEARFPTAFFIFAPPKIRPQIIGAAVEIDFSNYDLRPPSVVFVDPFTRRPVARKDLLLSMLRRPHLPGTPPDMISVLMQQKALSLADFLQANSAEHTPFLCMAGVREYHDNPAHSGDSWLLHRGSGEGCLAFILDKIIKYGTGPVEQIQYQFQISVGAMVVPPSAIPE
RPDDRAFT_1997_Rpal_77690160     ------MLEALSKATFDRDIGRIDPRS--VRMYDWAIVQANYPVFDVIFNHAQVAP----LRLRLVCDDWDEIPPSIELLNK----------------EGQPLATAPPNVGNVFNG----------STHPNTGRPFVCMRGAREYHTHGSHTSDLWDNYRGQSGMDLGGIVVQLWRAWKRSVG----------------------
Magn03005843_Mmag_23011188      -------------------------------------------MLDVILGHPTAAP----LRLRFTCVDWDDLPPSVELLDAA----------------GQHLSQAPPGAGGIFHP----------SPHPVTGRMFVCMRGTREYHTHFSHVGERWDGYRGQSGLDLLGILDQIWRCWKRAVG----------------------
consensus/100%                  ......h...lS+.pF-Rplubhp.b...hR.bshhllp.paPsh.hlFs..bl.P....h.lclshssaDbbP.Sl.hls.................c...L..sss...ssh............psps.p.pPFlCh.GsREYHspsuHouD.W..aR.pu..sL..Il.blh.....sh.......................

-Alignment of domain marked with a ?:

FINAL                         -HHHHHHHHEEE-HHHHH-----------E------EEEEEEE-------------EEEE-E----EEE----EE--------E-------------EEEEEEEE----EEEE------HHHHHHHHHH--------------------EEEEE-------E--HHHHHHHHH------
mll6195_Mlot_14025928         MYRQYFRIALIDYSCEAQFQPVYLPLKSRIKEGSTDSVAYPLSFAYSRPVAPSGRLKIAG-LTSRWAQAPGAGWQATGVGQMSKDSGKGD-HGG---KIEITVVVNGQPTQVEANPNQPLHVVRAKALENTQNVAQPAENWEFKDEAGNLLDVDKKVGDFGFANIVTLFLSLKAGVAGA
msi102_Mlot_20803929          MYRQYFRIALIDYSCEAQFQPVYLPLKSRIKEGSTDSVAYPLSFAYSRPVAPSGRLKIAG-LTSGWAQAPGAGWQATGVGQMSKDSGKGD-HGGGPGKIEITVVVNGQPTQVEANPNQPLHVVRAKALENTQNVAQPAENWEFKDEAGNLLDVDKKVGDFGFANIVTLFLSLKAGVAGA
y4jI_Rsp._2496667             -------------------------------------------------MAPSGRSKTASPLTGRSAVVPWGRLASHWSMTMSKEAGKGDNHGGGPGKIEIIVVVNGQPTQVEANPNQPLHVVRTKALENTQNVAQPAENWEFKDEAGTLLDADKKIGDFGFANTGTLFLSLKAGVAGA
RHE_PA00017_Retl_86359722     --------------------------------------------------------------TLDTNRSIDGVSMAKSPNTAPEAAGK---KTGSKNKITLTIVVNGEPVSVEANVNAPLHTAIAKALEESGNVGQPPENWELKDENGTVFDASKKIEDLGITAGQKLFLSLKAGAAG-

Alignment of N-terminal domain fused to E1

FINAL                          ---HHHHHHHHHHHHH----HHHHHHH--EEEEE------HHHHHHHHHHH---EEEEEE--------EEEEE--------HHHHHHH------EEEEE------------HHHHHHHHHHHHHHHHHHH----HHHHHHHHHHHHH-----
ALIGN                          ---HHHHHHHHHHHHH----HHHHHHHHHEEEEE------HHHHHHHHHHH--EEEEEE---------EEEEE--------HHHHHHHH-----EE--------------HHHHHHHHHHHHHHHHHHHH----------HHHHHH------
HMM                            ---HHHHHHHHHHHH-----HHHHHHHH-EEEEE------HHHHHHHHHH----EEEEEEE-------EEEEEEE------HHHHHHHHH----EEEEEE----------HHHHHHHHHHHHHHHHHHHHH---HHHHHHHHHHHHH-----
FREQ                           ----HHHHHHHHHHHH---HHHHHHHH--EEEEE-------HEEHHHHHHHH---EEEE-------------------HH-HHHHHH-------EEEE-------------HHHHHHH------HHEE----HHHHHHHHHHHHHHHH----
PSSM                           ---------EEEEEEE------------EEEEEE--------HHHHHHHHH---EEEEEE--------EEEEE----------HEEH-------EEEEE--------------------HHHHHHHHHHH----HHHHHHHHHH--------
RPDDRAFT_1995_Rpal_77690158    MNKATQQNAMMLASLLGVGEAEAGERLARTVLITAAPGWKSGWAVEVGELIG-RTVQVSHQQEPTDPDLELVIGDVTPRTSARRVYADLGSEGAAASLEPVAKLAG-EPHGLYAAAAACAVSAVVVHAVIDAADLPQARLPMRLDYAQLGVP
Magn03005841_Mmag_46203362     MITPAQENARMLAAILGSDEDDASERLNRAVLVTAPPGGADAAWAAEVAALLARTVGVV-TSPAEEAQLELVIGEAAARTDLPRLHAAIDAGGATVDVRPVGRTGGPPPHPLLAAVAACPAAAATLRMLLDDPALPAVAYPLRLDFDQLGVP
----------------------------------------------------
6d. Uncharacterized operon coding a Ub-like protein, a JAB, an E1-like protein and an E2-like protein
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Gis are of the E2-like protein- marked with an asterisk

GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
84717439        265       Ub->alpha_helical_2->E2*->JAB->E1->                                                                                   Polaromonas naphthalenivorans CJ2                              proteobacteria>betaproteobacteria             conserved hypothetical protein [Polaromonas naphthalenivorans CJ2]
67910471        259       Ub->alpha_helical_2->E2*->JAB->E1->                                                                                   Polaromonas sp. JS666                                          proteobacteria>betaproteobacteria             hypothetical protein BproDRAFT_0623 [Polaromonas sp. JS666]
71847775        246       Ub->alpha_helical_2->E2*->JAB->E1->                                                                                   Dechloromonas aromatica RCB                                    proteobacteria>betaproteobacteria             conserved hypothetical protein [Dechloromonas aromatica RCB]
67543573        242       Ub->alpha_helical_2->E2*->JAB->E1->                                                                                   Burkholderia vietnamiensis G4                                  proteobacteria>betaproteobacteria             conserved hypothetical protein [Burkholderia vietnamiensis G4]
38637969        241       Ub->alpha_helical_2->E2*->JAB->E1->                                                                                   Cupriavidus necator                                            proteobacteria>betaproteobacteria             hypothetical protein PHG308 [Cupriavidus necator]
17428675        240       Ub->alpha_helical_2->E2*->JAB->E1->                                                                                   Ralstonia solanacearum                                         proteobacteria>betaproteobacteria             conserved hypothetical protein [Ralstonia solanacearum]
56315656        239       E2*->JAB->E1->                                                                                                        Azoarcus sp. EbN1                                              proteobacteria>betaproteobacteria             conserved hypothetical protein [Azoarcus sp. EbN1]
56410324        239       E2*->JAB->E1->                                                                                                        Ralstonia metallidurans CH34                                   proteobacteria>betaproteobacteria             hypothetical protein [Ralstonia metallidurans CH34]
68559357        239       E2*->JAB->E1->                                                                                                        Ralstonia metallidurans CH34                                   proteobacteria>betaproteobacteria             conserved hypothetical protein [Ralstonia metallidurans CH34]
44004435        214       E2*->alpha_helical_2->JAB->alpha_helical_2->E1->                                                                      Bacillus cereus ATCC 10987                                     firmicutes                                    hypothetical protein BCE_A0096 [Bacillus cereus ATCC 10987]
67908644        255       E2*->JAB->E1->                                                                                                        Polaromonas sp. JS666                                          proteobacteria>betaproteobacteria             hypothetical protein BproDRAFT_4305 [Polaromonas sp. JS666]
74024822        241       E2*->JAB->E1->                                                                                                        Rhodoferax ferrireducens DSM 15236                             proteobacteria>betaproteobacteria             hypothetical protein RferDRAFT_4144 [Rhodoferax ferrireducens DSM 15236]
75705484        240       E2*->JAB->E1->                                                                                                        Anabaena variabilis ATCC 29413                                 cyanobacteria                                 conserved hypothetical protein [Anabaena variabilis ATCC 29413]
17134644        240       E2*->JAB->E1->                                                                                                        Nostoc sp. PCC 7120                                            cyanobacteria                                 alr7559 [Nostoc sp. PCC 7120]
29339960        233       Ub->alpha_helical_2->E2*->E1->                                                                                        Bacteroides thetaiotaomicron VPI-5482                          bacteroidetes/chlorobi                        hypothetical protein BT_2648 [Bacteroides thetaiotaomicron VPI-5482]
71839550        243       Ub->alpha_helical_2->E2*->E1 ->SFI helicase(note connection to F-box)->JAB->                                          Pelobacter propionicus DSM 2379                                proteobacteria>deltaproteobacteria            conserved hypothetical protein [Pelobacter propionicus DSM 2379]
75758403        241       E2*->alpha_helical_2->E1->                                                                                            Bacillus thuringiensis serovar israelensis ATCC 35646          firmicutes                                    hypothetical protein RBTH_06715 [Bacillus thuringiensis serovar israelensis ATCC 35646]
75758953        253       E2*->alpha_helical_2->E1->                                                                                            Bacillus thuringiensis serovar israelensis ATCC 35646          firmicutes                                    hypothetical protein RBTH_07326 [Bacillus thuringiensis serovar israelensis ATCC 35646]

Note: all the JABs in these operons have an N-terminal domain, whose alignment is provided below:

Domain found N-terminal to the JABs (JAB-N)

FINAL                                   -HHHHHH---EEEE--------------EEEEEE----EEEE--HHHHHHHH---------E-----HHHHHHH-
ALIGN                                   -----------EE----------------EEEEE---EEEEEE---HHHEEH----------------HH-----
HMM                                     -HHHHH----EEEEE--------H----EEEEEE--HEEEEEHHHHHHHHHHH-------EEEEE--HHHHHH--
FREQ                                    -HHHHHH----EE----------------EEEEHH---EEEEE--HHHHHE----------------HHHHHHH-
PSSM                                    -HHHHHH-----E----------------EEEEE---HEHHH----HHHHHH---------------HHHH----
PproDRAFT_0259_Ppro_71839552            MDAILQEQFPTVMVPRYGD-FVPLAHNGRRFLSASDGLWLEEKNQWLHILWPLALQN--QVAMPYGSLQKKVDFL
RSc1658_Rsol_17428674                   ---KLWDSAPTVAVPKFAE-FKQLEDVGHRFLATAEGLFVEVRRPWLHVIQPVAPLNGQTVRPPYGTVKQKVDLA
BproDRAFT_0622_Psp._67910470            LDSIIQGMFPTVIMPREGT-IAPATKNGTRYVVAGDGLWREVVLPWVTVMHKIANS---DFMLPYGAAEEAVVIK
RmetDRAFT_6239_Rmet_68559358            ADMALQQSFPSVMVPRHGA-LPALEQVGERLLIAANGVFLEIVRPWLRVVRRLGEFQH-QTAIPYGDATEVTELR
RMe0063_Rmet_56410325                   --MALQQSFPSVMVPRHGA-LPALEQVGERLLIAANGVFLEIVRPWLRVVRRLGEFQH-QTAIPYGDATEVTELR
PHG307_Cnec_38637968                    ADAALQQSFPSVMVPRFGA-LAPMERSGERLLIAANGVFLEIVRPWLSVVRHLGAFQH-RTAIPYGEAAETTDLR
Bcep1808DRAFT_6254_Bvie_67543574        LDTVLQQSFPAVMVPSRET-VVPMTRSGERLLIASDGVYLEVLRPWVRVVRRIAQY-AVSIAVPYGKVEETTALL
p1B74_Asp._56315655                     RDMALQALTPTVMVPRFGC-FEPLSQPGHRFLVGQNGEWLEVRRAWMYARVQLTQP--SPVVKPYGVVTACLEWL
BproDRAFT_4306_Psp._67908645            MDAIIQSQFPTVLAPRFEA-LSPLETTGDRFILTRHQVLMEVSRPWLHAIQAISAP--FARQTPYGAGPRLGIKL
Daro_2537_Daro_71847774                 RDLALQAVCPVIAAPRFGP-LPDM-ANGQRIILAANGVFVQVKLDWLDCIQRLSPA--LPITLPYGGIEERLAFT
RferDRAFT_4145_Rfer_74024823            -DRFLATDCPVITMPHDSEVFEPLKTPGHRLIVAAGGLYKEIRRAWLHAIVHVAR-----AQTPFGELQTTLSM-
PnapDRAFT_0123_Pnap_84717438            LDQITMGVFPLLAASQTGV-LQDPEKHGVRYVAASDGMWRAIDTAWLKA--------------------------
consensus/100%                          ....l....Psl.hPp....h..h..sGpRhl.s....h.b....Wh.h...ls.........PaG.......b.
consensus/95%                           ....l....Psl.hPp....h..h..sGpRhl.s....h.b....Wh.h...ls.........PaG.......b.
consensus/90%                           ...hlb..hPslhhP+....h.sh.psGpRhlhs.pGla.El.bsWlphh..lu.........PYG.h.p...h.
consensus/85%                           ...hlb..hPslhhP+....h.sh.psGpRhlhs.pGla.El.bsWlphh..lu.........PYG.h.p...h.
consensus/80%                           .D.hLQ..hPsVhhP+.us.h.shppsGcRhlluusGlahEl.bsWlphl..lu.........PYG.hpp...h.
consensus/75%                           .D.hLQ..hPsVhhP+.us.h.shppsGcRhlluusGlahEl.bsWlphl..lu.........PYG.hpp...h.
consensus/70%                           .D.hLQpphPoVhhPRaus.hsshppsGcRhllAusGlalEl.RsWLchlbplu.........PYGshpc.s.hb

Species abbreviations
Asp. : Azoarcus sp.; Bvie : Burkholderia vietnamiensis; Cnec : Cupriavidus necator; Daro : Dechloromonas aromatica; Pnap : Polaromonas naphthalenivorans; Ppro : Pelobacter propionicus; Psp. : Polaromonas sp.; Rfer : Rhodoferax ferrireducens; Rmet : Ralstonia metallidurans; Rsol : Ralstonia solanacearum
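
The sequence labels in the alignments above appear to follow the pattern gene-name_species-abbreviation_gi, with the abbreviation resolved by the list above. The following is a minimal sketch of how a label could be split and its species looked up, assuming that pattern holds; the names parse_abbreviations and parse_label are hypothetical, and only three of the abbreviations are copied in for brevity.

```python
# Subset of the "Species abbreviations" line, copied verbatim.
ABBREVIATIONS = (
    "Asp. : Azoarcus sp.; Bvie : Burkholderia vietnamiensis; "
    "Ppro : Pelobacter propionicus"
)


def parse_abbreviations(text):
    """Turn 'Abbr : Species; Abbr : Species' into a dictionary."""
    table = {}
    for entry in text.split(";"):
        abbr, species = entry.split(":", 1)
        table[abbr.strip()] = species.strip()
    return table


def parse_label(label):
    """Split a label into (gene name, species abbreviation, gi).

    Splitting from the right keeps underscores inside the gene name intact.
    """
    gene, abbr, gi = label.rsplit("_", 2)
    return gene, abbr, gi


species_of = parse_abbreviations(ABBREVIATIONS)
gene, abbr, gi = parse_label("PproDRAFT_0259_Ppro_71839552")
print(gene, species_of[abbr], gi)
```

Splitting from the right is a deliberate choice here, since gene names such as PproDRAFT_0259 contain underscores themselves.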



Alignment of alpha helical domain-2
FINAL                                 ----------------------------------------------------------HHHHHH-HHHHHHHH--------------HHHHHHHHHHHHHHHH---------------------------------------HHHHHHH------EEEEEE------HHHHHHH----HHHHHHHHHHH----------HHHHHHHH-EEE---------HHHHHHHHH-------HHHHHHH-------------------------------------HHHHHHHHHHHH----HHHHHHHHHHHHHHHHH---------------------------------EEEEEEE---HHHHHHHHHHHHH---------EEEEEEE-----HHHHHHHHHHHHHHHHHHHHHHHHHHHHH--
Daro_2539_Daro_71847776               ------------------------------------MMALVLPRIGPQVPRSIAPGPLLAANAM-VSRFLIEAEAFDEADIPVTWSDSLDACRQALDGWLKCQIGALHCLTPRF--------ALHMVSRDGESYRYYGSQPPKDFDFNAVEASWCEYHEQEWPVGAGLEALSARLHGLGTVVLHVLCRQSAFV-YPLFTPDIACDVATYLYWCGED---DEEAALDMNCGEDEEEREAMRAEMVTKSMLEA------------------SFPAWTRRWPRGLELAQCARFLRRATNRLSDPGAKATAEDALALATLEIDDSFRP---DMEGE-----------FVGFGAVLSWRDGDVTTRIYDDLLELAHQGEYC-EHMGEVQVPLD-DPAAFGAWQQAMASRFAAIRLIDRLIHHLSAG
BproDRAFT_0624_Psp._67910472          -----------------------MKSIALERPVPKAHGTFVLPQISSEVPLVIGGESIAHQTLAKFSLAAEKCGMELPGG---DIPKLESIVQMQLQGWLDKQVGAN--------------------ARACLGGQPLISANSSEIEFFMRAVSNLELLKL----KPVIEALEAKVPGLGWYVVDVIERSNGRG-ISIYSPAAM-GYHSFSQLQGAESDEDFVKEMQAMEGEDEPSPEELAELIEQARSDYAYLPSKVLESVEGHAHLLGWASPNAKHGPKRLKTKQAAYLLKTAELPDGLKQCVTDAIALDCLYGK--DKGAYTWDNSQDEE-----------QIGAACFIAWNDAEMLFELVQHYEEDTYNSGTAMECLCRLKVATGGTPAEFEQMARLMRAYFDQWNALGNLLVHFLDQ
PnapDRAFT_0133_Pnap_84717448          MLDRYEHGREFECDPESASLVTTSVRNLGVVAQGNQGGMWVLPTFSPEIPLEISRADAEASNLADFILLAHKKGIRIPDSI---YTTTSELMTQQFANYARSQVKN--------------------VRVELPLDVSIVAIDKKIEFAANATDRFQGIYQL----KDSVERLNAASPGLGWFITDTIRKGHGVG-LTTYDPCRIANCVQLIWFDSET---DQEAAAEVLDIDEKNVTEAHIEQARDERTFM---PSDFLASVGGHKHLLSWSQTKKEKAACRSMSASRVRACIHKLKLVEADRALV-MAALEFHDSIKVRKANAAIAPNGWFEHENLHEFECLDALGSLAFIVWDDSEFAREAITHYEEYAMNGEGSHSQLVGLFVEL-DEPASWGPFIDAYKLYIKRYAAFSNFIGALPEE
RSc1660_Rsol_17428676                 ------------------------------------MTALALPRLAA-MPTRYRTRDDGAAWCTPALLGLVDADALSADDVRRDPATPAELLQHTLQRHWDEITAGARIFDW--------HLSANPSQLGWWIPTTTSKNLWLAITPHNNNRVDVPLYYL----GPTITTLENIRKGLGQTVLAVFYDALRLL-PNTLTPADTYGHASWVHWHGET---DETMAIQWLYDEGDFETMEQAAAAYDGPTREA---------------LFEYMPEWAAYPRRVLSDRQVRRIARSHPFVAKVVDAVDGIWNHVHATHATGGYADCRVDADGD-------------SITWIAIFRWHPEDLALRIADDFTEFVTQGEY-QDASTLVCVE--SESDSLARWLHQMRANGQLARLVENLVDLIAMP
BcenP_01000005_Bcen_84357756          -----------------------------------------------------------MLTIEDAQLADDHRNERELARIALTRTWQELTDAHSIFEWSLRLSSD---------------------SCGPSYYRTGDDNSVWVSIHSDGGAGTAPVRFL----RGSISHLESVMPGLGQTVLAVLYEACAHYLPSVLTPSETISIAGYMYWQGHA---NEIEALPELRMHYDDVDEATPEEFFEACSIPR------------RTEFFRDAPDWLVNPQQVLNTFDVHRAAEQDEIAALAVSACDEIYSLIAHGGPFARVDHFDSNAG--------------PGIDFSLFLLWDHDDGTGRVIDDFLEHEMQG-DALEAACAVSLS--LAGKAVGNWFARVRNTSRLALAVEHLLDVIALR
p1B76_Asp._56315657                   ------MRGRTYAVSRKLTEEFGVSGSASASIKRHPNDPLRLPR-KCLAPGAYVEHASAGLPLANLALALYEEGLITEADPDWGLAEVVKLGLMRLTEGTLGDLVFVAPVDL--------AVSSTLEGCEGFSVEESDPVPQTYWLALELTNALEPCFA-----GKRLLELEKAVPGLGKTALDVAQSAGART-TGCWSPLFVRDLSSYIYWRGAD---TQEEWLEELEASGEDPDDYGFSPKQYEEGFE--------------------VDWACSAQMELDGFALVQALDHPDPAVGDVAEKLCELMCLLNN-----RRSAFPDASPTDRE-----------SVYRGCLIRWDKNDPIEQVIDDHIEYANQGADCYTTLCSVWDVK-ITREDFSEWLKSYRLGLQLYKSLDQLLAMLHTS
PproDRAFT_0256_Ppro_71839549          -------------------------------MATSPPSFLSLPNIPKSVPRLY-EFDTASTCVANIALHLLDLGVVTESE---AIMPLQDIVKQSLNRWCRSKTKDLECFSPILMVSDTFAGIGGYAVDADTVLEQESITPETSIVALGITFDNTKCFTL----KDKCDTLHTVEPDFIEFVIQKLYH-SLCV-MFAVTPELAHDTADMFYWDCYD---DSDEEPYVTKE-----------------------------------EFYKIIPEWVANPTYKEGWVKEFDRCLEHDNENIRKIAQHILTWQDIEHTRKSDVLPYYPDQVSDDG---------CTTIQNGTWISWDENELFDRIIDDWGEYHYQTST-TDLNNFFVVP--ATKQGIEKGLTLLEYYFVRLEWADKLLRLVGKI
RmetDRAFT_6237_Rmet_68559356          --MLFDPRSFVPALDGGQPG-WSFARQHPAARHRPSHGFLTLPAIAAETPGRAFLSFGDEPDALELARAQFETGVLRASDV-VNPTSAADAFAQAMFAWLAARMPTCRRLNFSFS-------LVDLNAAKDQLMQFGWDDQVDASLYLAIDLPGDEVYFIG---KARADALRAVHPYLLYTAMSLINLASSKS-LHLRTPDVLLDLFARWHWEYDCTLANDDDAREFLKNGCGM-DEGDIARYLPSAVRP------ELAPDDVLPPFCHAYPE-SRKLKTVGSRKLYELARSQHGWLKDVCVALAELNLAVKRQRDR-----SAVADSQWAE-----------PAHSAATLAYAESDYVTQVLDDLYDGYANSGDATLFQCFIPIA--VEPKAIRQQFEDLSGMFKIIAALDRVLTLISD-
PHG309_Cnec_38637970                  --MLFDPRSFVPEVDAGQPA-WTPARQHPIARRRPAHDFLTLPAIPAGVPNSGLLTFGDEVDVLGLVRAQFATGVLRANDVS-TPTGAGDAFAQAMFAWLRARTPECKRLSFGFS-------LIDIGAAKDQLMQFGWEDEIEAPLYLAIDMPGDDVYFIG---EARASALRAVHPHLLYTAMSLINTASAKS-LFLRTPEALLDLFARWHWEYDSTLADDKNAREFLAESCDM-DEGDIERYLPSVVRP------ELAPDDVLPPACHGYPA-SSKLRAFGSRKLYELSRANNGWVKDLCVALAELNLTLKRQGDR-----SAVAGSQWAE-----------PAYSAATLAYRQSDYVTQVLDDLYDGLNCSGDATLFQCFIPIA--GEPKAIRQQFRDLDGMLKTIAALDRVLTLISD-
Bcep1808DRAFT_6252_Bvie_67543572      --MFFDPALPDSSIAAGSAARWQPPRAAP-ARRRPAADLLTLPSFSTEVPGAVRLKWREDVNLSDLVLKHFQYGPLRAGDVH-DPADAGDAFQQAFHAWTRRQYGRLSRLRFTPH-----LFDAHAVRDVLDGLGNGNNDDDPTPLFFGFGLEDEWVYSL----EGAIETLRSTHPLLFRTVMGALYRASART-MFIRLPDWFMYEFSCWYWDGDPHISDKDADEALKERFDDDT-E-TRSAYLPSVVRP------QLCPDDADPCVFSGGKWRYRSALTAPELMRLR--ARSRGMPRRVCTEVLKLRALMRRSRSRD-----LLHVNYAAN-----------PAYALCSVIVEDNQFVGDLLDCHFENESQSGDATTYSGFSRLA--STPKAIRRQYADLALAFRILTHLDRLLALVSQS
BproDRAFT_4304_Psp._67908643          --------------------------------------------------------------MAKLARALCNVHPELLDLVTLSEQDLPKSCIEIVERWQASLRSFLPKDALAI-----------QPEVTGYRSGNNPEFGGDLLTVQLFLDCPEPIYMK-------EFMKRCRNKVLAHDAAKAIDQVAYLG-LEIWAPEVIRDMYGSMNWYHCDNDADILEEFAMNHWEGEGEDIPAMKPEDFP----------YVLPSKWDAHMKKLGYKKPGPKPLASIQQLREMAKGRSQKDAALATAILKLRKVIKRGHLRCSDDEDRWGC----------------VEPSFVFLWDTDSAQLRHALDEAVEDRHNAGVSRENVLQVSVRPESALQQVEDDVRAVEHLLAMQIAVGDLHTAMKTF
consensus/100%                        ..............................................................h..h.b..........s................h............................................................h.............bp.....h...h...h.............P...........h.......s......................................................................................h...........................................h.........phhpp..-...ps....................h......h.........h.ph...h...
consensus/95%                         ..............................................................h..h.b..........s................h............................................................h.............bp.....h...h...h.............P...........h.......s......................................................................................h...........................................h.........phhpp..-...ps....................h......h.........h.ph...h...
consensus/90%                         ..............................................................h..h.b...p...b..s....s.....p.....h..a...b.....................................................hb.b.......h..Lps....L...hh..h.p.........bsP....s..s.h.ap..s...s........................................................................p.............h....................s................h...s.l.hppsp...phhsc..-...pu.......s.h.l........h...h..h......h..h.pll..h...
consensus/85%                         ..............................................................h..h.b...p...b..s....s.....p.....h..a...b.....................................................hb.b.......h..Lps....L...hh..h.p.........bsP....s..s.h.ap..s...s........................................................................p.............h....................s................h...s.l.hppsp...phhsc..-...pu.......s.h.l........h...h..h......h..h.pll..h...
----------------------------------------------------
6e. Uncharacterized operons encoding a protein with tandem repeats of a ubiquitin-like domain (polyUbl) (First evidence of polyubiquitins in bacteria)
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Abbreviations: Y: Metal_binding domain_1; X: Unknown domain
The gis shown are those of the E1-like protein (marked with an asterisk)

GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
17134589        416       Ubl+Ubl->E2l->JAB+E1*->Y->                                                                                            Nostoc sp. PCC 7120                                            cyanobacteria                                 alr7504 [Nostoc sp. PCC 7120]
38423902        471       Ubl+Ubl->E2l->JAB+E1*->Y->                                                                                            Synechocystis sp. PCC 6803                                     cyanobacteria                                 sll6053 [Synechocystis sp. PCC 6803]
67547439        468       Ubl+Ubl+E2l->JAB+E1*->                                                                                                Burkholderia vietnamiensis G4                                  proteobacteria>betaproteobacteria             UBA/THIF-type NAD/FAD binding fold [Burkholderia vietnamiensis G4]
84711629        469       Ubl+Ubl+E2l->JAB+E1*->                                                                                                Polaromonas naphthalenivorans CJ2                              proteobacteria>betaproteobacteria             unknown protein [Polaromonas naphthalenivorans CJ2]
69928900        458       Ubl+Ubl+Ubl?+E2l->JAB+E1*->                                                                                           Nitrobacter hamburgensis X14                                   proteobacteria>alphaproteobacteria            UBA/THIF-type NAD/FAD binding fold [Nitrobacter hamburgensis X14]
86159351        604       (JAB+E1+thioredoxin-like*?)                                                                                           Anaeromyxobacter dehalogenans 2CP-C                            proteobacteria>deltaproteobacteria            UBA/THIF-type NAD/FAD binding protein [Anaeromyxobacter dehalogenans 2CP-C]
86742694        476       Ub->X+E1*->Y->                                                                                                        Frankia sp. CcI3                                               actinobacteria                                UBA/THIF-type NAD/FAD binding fold [Frankia sp. CcI3]
14025879        396       Ubl+Ubl+Ubl->X+E1*->Y->                                                                                               Mesorhizobium loti MAFF303099                                  proteobacteria>alphaproteobacteria            mlr6140 [Mesorhizobium loti MAFF303099]
68554445        389       Ubl+Ubl+Ubl->X+E1*->Y->                                                                                               Ralstonia metallidurans CH34                                   proteobacteria>betaproteobacteria             conserved hypothetical protein [Ralstonia metallidurans CH34]
28806072        392       Ubl+Ubl+Ubl->X+E1*->Y->                                                                                               Vibrio parahaemolyticus RIMD 2210633                           proteobacteria>gammaproteobacteria            hypothetical protein [Vibrio parahaemolyticus RIMD 2210633]
39651044        400       Ubl+Ubl+Ubl->X+E1*->Y->                                                                                               Rhodopseudomonas palustris CGA009                              proteobacteria>alphaproteobacteria            hypothetical protein [Rhodopseudomonas palustris CGA009]
77690161        239       Ubl+Ubl+Ubl->Metal?->JAB->N+E1*->                                                                                     Rhodopseudomonas palustris BisB5                               proteobacteria>alphaproteobacteria            hypothetical protein RPDDRAFT_1998 [Rhodopseudomonas palustris BisB5]
82740919        398       X+E1*->Y->                                                                                                            Shewanella sp. W3-18-1                                         proteobacteria>gammaproteobacteria            conserved hypothetical protein [Shewanella sp. W3-18-1]
88795472        291       Ub->E1*->                                                                                                             Alteromonas macleodii 'Deep ecotype'                           proteobacteria>gammaproteobacteria            hypothetical protein MADE_08186 [Alteromonas macleodii 'Deep ecotype']

***Note on operon fusion: the polyUb operon lies next to another operon that has been cited in a different context
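
The operon strings in the table above can be read programmatically. The following is a minimal sketch, not part of the original analysis: it assumes only the notation stated at the top of this file ("->" for gene order in the 5' to 3' direction, "+" between domains of one protein, "*" on the component whose gi is listed). Treating "?" as a marker for a tentative assignment and stripping the parentheses around a single fused protein are assumptions made here for illustration. The function name parse_operon is hypothetical, and the "<-" and "||" symbols used in other tables are not handled.

```python
def parse_operon(operon):
    """Parse an operon string such as 'Ubl+Ubl->E2l->JAB+E1*->Y->'.

    Returns one dict per gene, in the order given (5' to 3').
    """
    genes = []
    # Some rows wrap a single fused protein in parentheses; drop them.
    for field in operon.strip().strip("()").split("->"):
        if not field:
            continue  # a trailing '->' leaves an empty field
        domains = [d.rstrip("*?") for d in field.split("+")]
        genes.append({
            "domains": domains,
            "gi_component": "*" in field,  # protein whose gi is listed
            "uncertain": "?" in field,     # assumed meaning of '?'
        })
    return genes


if __name__ == "__main__":
    for gene in parse_operon("Ubl+Ubl->E2l->JAB+E1*->Y->"):
        print(gene)
```

Each gene is kept as a list of domains so that fused architectures such as JAB+E1 remain distinguishable from neighbouring genes.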

Alignment of domain Y (Metal_binding domain_1)
FINAL                               HH---------EEEE-----------EEEEEE-----EEEEEE------EEEEEE--------EEEE---------------------EEEEEE--EEEEE-------
ALIGN                               HHHHH------EEEE-----------EEEEE-----EEEEEEE-------EEEEE--------EEEE----------------------EEEEE--EEEE--------
HMM                                 HH--------EEEEEE---------EEEEEEE-HH--EEEEEE------EEEEEE--------EEEEE-------EEE----------EEEEEE--EEEEE-------
FREQ                                HHHHHH-----EEEE-----------EEEEE-----EEEEEEE-------EEE----------EEEE----------------------EEEE---EEEEE-------
PSSM                                HH---------EE-------------EEEEE-------EEEE--------EEEEE-------EEEEEE--------------------EEEEEE--EEEEE-------
RES                                 LGFLKKSELTSRMVNFHPAPEEIMSGEVVIVGDRNHKKWACFRCPSGCGELILLSLNKNQHPSWRVDCDWLNRPTLHPSVRQLN-HCQCHFWIKRGVTQWCADSRHNK
sll6052_Ssp_38423901                LGFLKKSELTSRMVNFHPAPEEIMSGEVVIVGDRNHKKWACFRCPSGCGELILLSLNKNQHPSWRVDCDWLNRPTLHPSVRQLN-HCQCHFWIKRGVTQWCADSRHNK-----------------------------------------------------------
alr7505_Ana_17134590                LRFLPQPDLSARIVPTHPAPENIKPGEILVVGDAEYQKWACFRCPGGCGENILLSLNQKRHPCWAIAIDSLGRPTLNPSVRQLN-ECHCHFWVRQGVVEWCADSGQK------------------------------------------------------------
mlr6141_Mlot_14025880               -MMARVDCLTTVFVED--IPEQLDDGVLYV--SRQCHV-ALHNCACGCGEEVSTPLVPTE---YDLVMED-EGASIWPSIGNHDFPCGSHYIVKRGRIHWAGKMSREQIEAGRAYDRLLKRG--------------AQPKGLRAILAWIKRLWI-KFIG--------
Sputw3181DRAFT_3760_Ssp._82740918   ---MAVHYITPVFVEF--IPENIEQGKLYI--SETYKT-AIHKCCCGCGEEVVTPLSPAD---WQLKNGV-NTVSLYPSIGNWNYKCKSHYFINNNRIIWAPKFSPEQIQAVQVRDRVDKLNYIA-------DKNKAGPIAWSNFIGWLVKSWR-F---IRSLFSLR
RPA4125_Rpal_39651043               RKSMKLDQIKLQRVEF--MPKQLEPGILYV--SEKYRA-VAHLCACGCGAKIRTPLGITE---WAFTDNT-AGPSLWPSVGNWQQACKSHYIIDGGEIIWCGTWTPEQIMAGRRAEQARRKAHY--------DAMYVKR-------GLFNRVWQ-W---LKSLFGG-
Francci3_0886_Fsp._86566461         --MTRLDAVRHEFVEC--IPETLIQGVVYV--SIAYAT-VAHSCCCGCGNVAYTPLAPGR---WALTFDG-RSISLDPSIGNWSFPCQSHYWIERNRVHWHAAWTAEKIQKGRARTLQMI------------NKDIERTDGAKSATTAVQTRWRGWFARLRRRFK--
VP1086_Vpar_28806073                SLVLKHTHLAHKFVRS--IPKQLEPGILYV--SMEYAT-AIHSCCCGCGNQVVTPITPTD---WQLMFDG-DSISLSPSIGNWGFKCRSHYFIRKGMIVEAGQWDKKTITAGRDNDKHNKAHYYQ-------AKPKGDDNTYSHRVGLFKRVWH-WFLGKREFAKKR
RmetDRAFT_5044_Rmet_68554446        --MMRYKELEPRFVTT--VPRQLEPGVLYV--SMEYGT-VVHSCCCGCGEKVVTPLTPTD---WSITFNG-ESVSLWPSIGSWNLPCQSHYVIKGNRVLESGRWNRQMIDAEISRDNEAKAKYYKRTVSNETEPSLAHPIDIETGSQTYARQSF-WKTILSRLLR--
consensus/95%                       ........l...bV....hPcpl..G.lhl..s......sha.CssGCG..h..sls..p...a.h........ol.PSl.p....C.sHahlp.s...b.s....p............................................................
consensus/90%                       ........l...bV....hPcpl..G.lhl..s......sha.CssGCG..h..sls..p...a.h........ol.PSl.p....C.sHahlp.s...b.s....p............................................................
consensus/85%                       ...h..p.lp..hVp...hPcplb.G.lhl..sbpa...shapCssGCGp.l..sLs.sc...W.l..ss....oL.PSl.phs..CpsHahlc.s.l.bsup.s.p............................................................
consensus/80%                       ...h..p.lp..hVp...hPcplb.G.lhl..sbpa...shapCssGCGp.l..sLs.sc...W.l..ss....oL.PSl.phs..CpsHahlc.s.l.bsup.s.p............................................................
consensus/75%                       ..hh+.splp.bhVp...hPcplcsG.lYV..SbpY.s.shHpCsCGCGpblhTPLs.sc...W.lshss.pssSL.PSlGsasb.CpSHYhIcpsbl.Wsuphs.cbI...b..p...b............................h.p....b...........
consensus/70%                       ..hh+.splp.bhVp...hPcplcsG.lYV..SbpY.s.shHpCsCGCGpblhTPLs.sc...W.lshss.pssSL.PSlGsasb.CpSHYhIcpsbl.Wsuphs.cbI...b..p...b............................h.p....b...........
consensus/100%                      ........l...bV....hPcpl..G.lhl..s......sha.CssGCG..h..sls..p...a.h........ol.PSl.p....C.sHahlp.s...b.s....p............................................................
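
The consensus rows above use a symbol alphabet that this file does not define. As a rough aid to reading them, the sketch below shows one plausible interpretation of the uppercase positions only: a residue is reported when its frequency in a column reaches the stated threshold. This is an assumption, not the procedure used to generate the rows above; the lowercase class symbols are not reproduced, and the function name and the toy alignment are hypothetical.

```python
from collections import Counter


def identity_consensus(seqs, threshold):
    """Return one character per column: the most common residue if its
    frequency among all sequences reaches `threshold`, otherwise '.'."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        # Gap characters never count as a conserved residue.
        conserved = residue != "-" and count / len(seqs) >= threshold
        out.append(residue if conserved else ".")
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the data above.
    toy = ["CGCG-", "CGCGA", "CACGA", "CGCGT"]
    for t in (1.0, 0.75):
        print(f"consensus/{int(t * 100)}%", identity_consensus(toy, t))
```

With few sequences, neighbouring thresholds select the same columns, which is consistent with several consensus rows above being identical.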

Species abbreviations:
Ana : Nostoc sp.; Fsp. : Frankia sp.; Mlot : Mesorhizobium loti; Rmet : Ralstonia metallidurans; Rpal : Rhodopseudomonas palustris; Ssp : Synechocystis sp.; Ssp. : Shewanella sp.; Vpar : Vibrio parahaemolyticus
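
The sequence labels in the alignment above appear to follow the pattern locus_abbreviation_gi. A minimal sketch for splitting such labels and expanding the abbreviation is given below; the function name parse_label is hypothetical, and the pattern itself is inferred from the labels rather than stated in the source. Splitting from the right is needed because some locus names contain underscores.

```python
# Abbreviations copied from the list above; 'Ssp' and 'Ssp.' are distinct keys.
SPECIES = {
    "Ana": "Nostoc sp.",
    "Fsp.": "Frankia sp.",
    "Mlot": "Mesorhizobium loti",
    "Rmet": "Ralstonia metallidurans",
    "Rpal": "Rhodopseudomonas palustris",
    "Ssp": "Synechocystis sp.",
    "Ssp.": "Shewanella sp.",
    "Vpar": "Vibrio parahaemolyticus",
}


def parse_label(label):
    """Split 'locus_abbrev_gi' from the right, since loci may contain '_'."""
    locus, abbrev, gi = label.rsplit("_", 2)
    # Unknown abbreviations are passed through unchanged.
    return locus, SPECIES.get(abbrev, abbrev), int(gi)


if __name__ == "__main__":
    print(parse_label("Sputw3181DRAFT_3760_Ssp._82740918"))
```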



Alignment of domain X (fused to E1); perhaps a novel protease that displaced the JAB domain

FINAL                               --HHHH-----HHHHHHH--EEEEE--EEE-EEEEEEE-----EEEEEEEEEEE--------------EEEEE---------HHHHHHH-----EE------EEEEEEEE---------EE--------HHHHHHHHHHHHHHH------EE--EE--------HHHHHHHHHHHHHHHHHHHHHH---                        EEEEEE---HHHHHHHHHHH----EEEE--------------HHHHHHHHHHH--HHHHHHHHHHH-----EEEEE---HHHHHHH---EEEEEE---HHHHHHHHHHHHH----EEEEEEEE---------EEEEEEEEE-----HHHHHHHHHEE---------EEEEEEEHHHHHHHHHHHHHHHHHHHH------------EEEEE-------------
ALIGN                               -----------HHHHHH---HHHHH--EEE-EEE-------------EEEEEEEE-------------EEEE------------EEEEE------E------EEEEEEEE------------EE----EEEEEEEE-H---E--------EE--EE---------EEE------HHHHHHHHHHH----                        EEEEE-----EEEEHHH----EEEEEE------------------HHHHHHHH--HHHHHHHHHHHHHH----------H---------EEEEEEE------HEHHHHHH----EEEEE---EE------------EEEEE-------HHHH----------------HHHHHHHHHHHHHHHHHHHHHHH----E---------EEE----EE---------
HMM                                 --HHHH---HHHHHHHHH-EEEEEE--EEE-EEEEEEE-----EEEEEEEEHHHH------------EEEEEE--------HHHHHHHHH----EEEEEE-HHEEEEEEEE--------EEEEEHHEEEHHHHHHHHH--HEE------EEE--EEEE------EEEEE-----HHHHHHHHHHH----                        EEEEEE---HHHHHHHHHHHHHHHHH--------H--E-------HHHHHHHH--HHHHHHHHHHHHH-------HEHHHHHHHHH---EEEEEEEE-----EEEEEEE-----EEEEEEEEEEEE-------EEEEEEEEE----HHHHHH---E----------E--HEEEHHHHHHHHHHHHHHHHHHHH--HHHH--HEEEEEHHHHHHHH--------
FREQ                                --HHHHH----HHHHHHH---EEE---EEE-EE-----------EEEEEEEEE---------------EEEE-------------EEEE-------------EEEEEEE--------------------HHHHHHHHHHHHHH---------------------HHHHHHHHHHHH--HHHHHHHH---                        EEEEE---HHHHHHHHHHHH----EEEE----EE---HHHHHEHH-------H--HHHHHHHHHH------EEEEE--------E-----EEEE---HHHHHHHHHHHHHH----EEEE-------------EEEEEE--------HHHHHHHHHHH------------HHHHHHHHHHHHHHHHHHHHHHHH--E---------HEEE--------------
PSSM                                --HHHH-----HHHHHH---EEEE---EEE-EEE---E------EEEEEEEEEEE-------------EEEEE---------HHHHHHH-----HH-----HEEEEE----------------------EEEHHHHHH------------EE--EE-----------HHHH-H----HHHHHHHHHH--                        -EEEE-----HHHHHHHHH----EEEEE-----------------HHHHHHHH--HHHHHHHHHHH----EEEE-----HHHHHH----EEEEEE---HHHHHHHHHHHHH----EEEEE-------------E-EEEEEE------------EEEE---------EEE--------HHHHHHHHHHHHHHH---------H----EEE--------------
RPA4126_Rpal_39651044               MFQKLVSHNDDIKRLVDKGYAVGFDSNYMI-VRDIPYLDAQGSECWGAIVTKLVAT--DQGHVIQDDHQIFFAGSSPYNTDGTAIANLSDRPTALGLSEAAADVAVQRQFSNKPRIDGQLVGFNSFFDKIESYVGIISGPARAKFGSNWLTY--RSVEKVANDSVFKIHDTMTSRAEITALSAKFKDEV                        IAIIGLGGTGAYILDFMVKTPVKEIRGFDLDPFHVHNAFRSPGRFEDSEFKRS--KADVYQTRYDNFRHGLTLKAKFIDASCASDFDGVTFAFVCVDKGSSRAGIFEVLMAKGIPFIDVGMGLNRKRGP---LAGMMRATYYDPANAQAMKDKGFSELSDRPN--DEYRVNIQIGELNALNATLAVIKYKKLKGFYIETNPDFNFLFDLSDCKITRRSKIDEA
mlr6140_Mlot_14025879               MSADLISRDPHLKRLLDEGFELEMRELVLLLVHSVPYVKRDKSLGRGTLVCTLSLDTQGLTASPQTDHTMWFTGETPCHRDGAPMTNIIHNSNEATVGS---DIKVHHYFSSKPEGTGQ---YANIYDKVVTYESHLGAAARSHDKTANART-GVTLASAQDDSPFAIPDSASARYGIVAANRKLRG-R                        VAIIGLGGTGAYLLDLAAKTRVAEIHLYDDDQLLNHNLFRSPGAPEPVLAKNFPRKVDYYAALYARMHKGVKPHPTRVKADNIDEFAGYDFVFVCVDKGSSRRVIAEGLVRLGIPFVDTGIGLGLEHNT---LDGCARATFIAPGTPWAE-VATHLSFGDDDEEADVYGTEIQTAELNSLNAIMAIMRWKRWLTFYRDERNERNATYMIEGNNITNRGA----
Sputw3181DRAFT_3761_Ssp._82740919   MSSKLTVHNPSILRLIEEGFEIDIVRQHLL-VHSIPYLNQSGEVKFATLACPFVEN--GEQDTRPQDHTMWFKGEYPHDGKGRPMTEVVNSPNQHVLFD---EFGVDFYLSNKPNGQD----FSNFYDKVVHYHTLFVSQARLVDSNADGRT-GIVHGQRDESSVFCYPDTASSRAGITAITQKLEGSR                        IAIVGVGGTGSFILDLLAKTPIAEIHLFDADDFEPHNAFRAPGAASLEQLQSAPKKVDYFFDVYSAMRHGVVAHPYFLDEQNVYELDSFTFVFVAVDNGQARRVVTQHLVNRGIPFIDVGMGIEIVEDASLQLRGTCRVTLVTNEKNLHL--AQRANLHDDDDE-ALYKSNIQVADLNAMNAALAVMRWKQYMGFYLDQGQAHNLNYTLSLQSLTRDDGPEED
Francci3_0885_Fsp._86566460         MSQRLIVRSADLGRLREEGYHLETRGNVLL-VHDVPYVNPSREVLRGTLVTELELA--GDMTIQPSNHVAQFIGQTPSDSEGHPLSKLINSGAASLVG----SVHVNFTFSKKPMG-GDQ-RYRDYHHKVTTYVALLLMHAQVLDPTVAATTFPVITPDEDDDSPFEYLDTASVRAGISEVTKKLRLGP                        IAIIGLGGTGAYTLDLVAKTPVREIHLFDGDRYLQHNAFRSPGAPSIEELATVPKKVDYFAARYAKMRKKIVPHGDFVTEANVDELRGMTFVFLALDDGPARKLIVTKLEEYGIDFIDVGIGVEHVDNS---LTGLVRTTLSTVDSRKHLDADHRLPFGKANDA-NDYNRNIQIADLNALNAALAVIKWKKLAGFYLDLEREHYSAYAVNGNTLINEDLG---
VP1085_Vpar_28806072                MSLQLINLNSDLKRLRDEGYFIQVKNGFLI-MRDVPYVNSNRHVCRGTIISSLSLA--GDRTRIPDTHVVHFDGDMPCNAEGEALNAVVLQSSIFDLGR---GITAKHMFSSKPKS-G----YTDYYHKMTTYASILSGHAEVLNSGISPKV--FSTPEDEEDSVFNYTETASGRVGIGALSDLLTEES                        VAIIGLGGTGSYILDLVAKTPVREILLFDSDEFLQHNAFRAPGAPTLEALRDAEKKVEYFKSIYSNMHKRISTSSTYIDEENLELLNGVTFAFICIDAGTSKKSIVQKLEELDIPFVDVGMGVELTDGS---LGGILRVTASTSGKRQHV-HEGRVSFGGGEGN-DVYSSNIQVADLNALNAALAVIKWKKIRGFYRDLEQEHHSTYTTDGNLLLNGESCA--
RmetDRAFT_5043_Rmet_68554445        MSAALFNRNSDLKRLWDEGYRMRVEGGSLV-MLNVPYVNAKGEVKEGKIISPLLLA--GDVTQKPEPHTVHFEGEFPCDAGGKPLQAISACGVPADL-----HAVAQYYLSTKPDANG----YTDYHQKMATYAAIISGHATVLDREASPRK--VWQPLDDEESVFNYVENASGRAGIDKLTALLAGDC                        VAIIGLGGTGSYVLDFVAKTPVREIRLIDGDDFLQHNAFRAPGAPTAEQLREVPKKVDHFRSIYANMHRGIAAHAVALDASTVGLLTGVTFAFLCMDAGHGKRIAIDQLESLGVPFVDVGMGLELSNGT---LGGILRTSLSTPDCRDIA--RSTISFDEPDRD-GIYSSNIQVADLNAMNAVMAVMRWKRYRNFYRDFEGEFHSSFTTDVNMLLNGEPK---
consensus/100%                      M...L.s.sspl.RL.-cGa.h......hl.h.slPYlp.p.p..bu.lhs.h..s..s......psH.h.F.Gp.P.p..G.sh..l..pss...l......h.sp..hSpKP...s....a.shapKh.pY.s.h...Ap........p..........ppSsF.h.-shosRh.Is.hs.bh....                        lAIlGlGGTGua.LDhhsKT.l.EI..hD.D.h..HNhFRuPG..p...h.p...Ks-.a.s.Ys.h++.l..ps..lp..sh..h.uhsFsFlshD.G.u+..h.p.L..bslsFlDsGhGl...pss...L.G.hRsoh.ss.p...........h.......s.Y..pIQ.u-LNuhNA.hAlh+aKbh.sFYb-........a..p.p.l.p.......
consensus/95%                       M...L.s.sspl.RL.-cGa.h......hl.h.slPYlp.p.p..bu.lhs.h..s..s......psH.h.F.Gp.P.p..G.sh..l..pss...l......h.sp..hSpKP...s....a.shapKh.pY.s.h...Ap........p..........ppSsF.h.-shosRh.Is.hs.bh....                        lAIlGlGGTGua.LDhhsKT.l.EI..hD.D.h..HNhFRuPG..p...h.p...Ks-.a.s.Ys.h++.l..ps..lp..sh..h.uhsFsFlshD.G.u+..h.p.L..bslsFlDsGhGl...pss...L.G.hRsoh.ss.p...........h.......s.Y..pIQ.u-LNuhNA.hAlh+aKbh.sFYb-........a..p.p.l.p.......
consensus/90%                       M...L.s.sspl.RL.-cGa.h......hl.h.slPYlp.p.p..bu.lhs.h..s..s......psH.h.F.Gp.P.p..G.sh..l..pss...l......h.sp..hSpKP...s....a.shapKh.pY.s.h...Ap........p..........ppSsF.h.-shosRh.Is.hs.bh....                        lAIlGlGGTGua.LDhhsKT.l.EI..hD.D.h..HNhFRuPG..p...h.p...Ks-.a.s.Ys.h++.l..ps..lp..sh..h.uhsFsFlshD.G.u+..h.p.L..bslsFlDsGhGl...pss...L.G.hRsoh.ss.p...........h.......s.Y..pIQ.u-LNuhNA.hAlh+aKbh.sFYb-........a..p.p.l.p.......
consensus/85%                       M...L.s.sspl.RL.-cGa.h......hl.h.slPYlp.p.p..bu.lhs.h..s..s......psH.h.F.Gp.P.p..G.sh..l..pss...l......h.sp..hSpKP...s....a.shapKh.pY.s.h...Ap........p..........ppSsF.h.-shosRh.Is.hs.bh....                        lAIlGlGGTGua.LDhhsKT.l.EI..hD.D.h..HNhFRuPG..p...h.p...Ks-.a.s.Ys.h++.l..ps..lp..sh..h.uhsFsFlshD.G.u+..h.p.L..bslsFlDsGhGl...pss...L.G.hRsoh.ss.p...........h.......s.Y..pIQ.u-LNuhNA.hAlh+aKbh.sFYb-........a..p.p.l.p.......
consensus/80%                       MS.pLhs+ssclbRLb-EGa.lphc...Ll.h+slPYls.p.pl.bGslls.L.hs..Gp.s.b.psHshaF.Gp.PpsscGpshs.l..pss..sl.....ph.spabhSsKPpu.G....assaacKhsoY.ullsu.Aps.s.ssssp...h..sp.p--SsFph.-oASuRhGIs.lo.bLp...                        lAIIGLGGTGuYlLDhhAKTPV.EI+LaDsDpab.HNAFRuPGAsp.pbhbps.+KVDaa.sbYupM++.lss+s.hlc.psl.bhsGhTFsFlslD.Gpu++.lhp.L.pbGIPFlDVGhGlpb.css...LsGhhRsTh.sssp.b.h....phshscssp..s.YpsNIQlA-LNAhNAshAVh+WK+hbsFYbDbp.-ap.sas.sss.l.p.p.....

Species abbreviations:
Fsp. : Frankia sp.; Mlot : Mesorhizobium loti; Rmet : Ralstonia metallidurans; Rpal : Rhodopseudomonas palustris; Ssp. : Shewanella sp.; Vpar : Vibrio parahaemolyticus
-------------------------------------------------------------------------------------------------------------

7. Ub fused to Mut7-C (Operons uninformative)
    ^^^^^^^^^^^^^^^^^^^^
Gis are of the Ub+Mut7C protein

GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
41410171        200       Ub+Mut7C                                                                                                              Mycobacterium avium subsp. paratuberculosis K-10               actinobacteria                                hypothetical protein MAP4073 [Mycobacterium avium subsp. paratuberculosis K-10]
20520977        242       Ub+Mut7C                                                                                                              Streptomyces coelicolor A3(2)                                  actinobacteria                                conserved hypothetical protein [Streptomyces coelicolor A3(2)]
71915653        241       Ub+Mut7C                                                                                                              Thermobifida fusca YX                                          actinobacteria                                conserved hypothetical protein [Thermobifida fusca YX]
54016307        251       Ub+Mut7C                                                                                                              Nocardia farcinica IFM 10152                                   actinobacteria                                hypothetical protein [Nocardia farcinica IFM 10152]
76785598        252       Ub+Mut7C                                                                                                              Mycobacterium tuberculosis F11                                 actinobacteria                                COG1656: Uncharacterized conserved protein [Mycobacterium tuberculosis F11]
13880123        236       Ub+Mut7C                                                                                                              Mycobacterium tuberculosis CDC1551                             actinobacteria                                conserved hypothetical protein [Mycobacterium tuberculosis CDC1551]
29606942        241       Ub+Mut7C                                                                                                              Streptomyces avermitilis MA-4680                               actinobacteria                                hypothetical protein [Streptomyces avermitilis MA-4680]
53688960        250       Ub+Mut7C                                                                                                              Nostoc punctiforme PCC 73102                                   cyanobacteria                                 COG1656: Uncharacterized conserved protein [Nostoc punctiforme PCC 73102]
67930484        226       Ub+Mut7C                                                                                                              Solibacter usitatus Ellin6076                                  fibrobacteres/acidobacteria                   Protein of unknown function DUF82 [Solibacter usitatus Ellin6076]
56311907        265       Ub+Mut7C                                                                                                              Azoarcus sp. EbN1                                              proteobacteria>betaproteobacteria             conserved hypothetical protein [Azoarcus sp. EbN1]
74318176        264       Ub+Mut7C                                                                                                              Thiobacillus denitrificans ATCC 25259                          proteobacteria>betaproteobacteria             hypothetical protein Tbd_2158 [Thiobacillus denitrificans ATCC 25259]
68554875        266       Ub+Mut7C                                                                                                              Ralstonia metallidurans CH34                                   proteobacteria>betaproteobacteria             Protein of unknown function DUF82 [Ralstonia metallidurans CH34]
74019866        268       Ub+Mut7C                                                                                                              Burkholderia ambifaria AMMD                                    proteobacteria>betaproteobacteria             Protein of unknown function DUF82 [Burkholderia ambifaria AMMD]
17431582        257       Ub+Mut7C                                                                                                              Ralstonia solanacearum                                         proteobacteria>betaproteobacteria             hypothetical protein of unknown function duf82 [Ralstonia solanacearum]
77965627        254       Ub+Mut7C                                                                                                              Burkholderia sp. 383                                           proteobacteria>betaproteobacteria             protein of unknown function DUF82 [Burkholderia sp. 383]
84363527        252       Ub+Mut7C                                                                                                              Burkholderia dolosa AUO158                                     proteobacteria>betaproteobacteria             COG1656: Uncharacterized conserved protein [Burkholderia dolosa AUO158]
48782379        251       Ub+Mut7C                                                                                                              Burkholderia fungorum LB400                                    proteobacteria>betaproteobacteria             COG1656: Uncharacterized conserved protein [Burkholderia fungorum LB400]
83719003        251       Ub+Mut7C                                                                                                              Burkholderia thailandensis E264                                proteobacteria>betaproteobacteria             Protein of unknown function family [Burkholderia thailandensis E264]
67908809        251       Ub+Mut7C                                                                                                              Polaromonas sp. JS666                                          proteobacteria>betaproteobacteria             Protein of unknown function DUF82 [Polaromonas sp. JS666]
82701281        251       Ub+Mut7C                                                                                                              Nitrosospira multiformis ATCC 25196                            proteobacteria>betaproteobacteria             Protein of unknown function DUF82 [Nitrosospira multiformis ATCC 25196]
83745680        277       Ub+Mut7C                                                                                                              Ralstonia solanacearum UW551                                   proteobacteria>betaproteobacteria             Zinc finger protein [Ralstonia solanacearum UW551]
67759108        251       Ub+Mut7C                                                                                                              Burkholderia pseudomallei S13                                  proteobacteria>betaproteobacteria             hypothetical protein BpseS_02004453 [Burkholderia pseudomallei S13]
72117336        247       Ub+Mut7C                                                                                                              Ralstonia eutropha JMP134                                      proteobacteria>betaproteobacteria             Protein of unknown function DUF82 [Ralstonia eutropha JMP134]
67738583        251       Ub+Mut7C                                                                                                              Burkholderia pseudomallei 668                                  proteobacteria>betaproteobacteria             COG1656: Uncharacterized conserved protein [Burkholderia pseudomallei 668]
67666690        259       Ub+Mut7C                                                                                                              Burkholderia cenocepacia HI2424                                proteobacteria>betaproteobacteria             Protein of unknown function DUF82 [Burkholderia cenocepacia HI2424]
67669903        251       Ub+Mut7C                                                                                                              Burkholderia pseudomallei 1655                                 proteobacteria>betaproteobacteria             hypothetical protein Bpse1_02004518 [Burkholderia pseudomallei 1655]
67154055        308       Ub+Mut7C                                                                                                              Azotobacter vinelandii AvOP                                    proteobacteria>gammaproteobacteria            Protein of unknown function DUF82 [Azotobacter vinelandii AvOP]
4981308         247       Ub+Mut7C                                                                                                              Thermotoga maritima MSB8                                       thermotogae                                   AE001747_4 conserved hypothetical protein [Thermotoga maritima MSB8]

Alignment of Ubl fused to Mut7-C

                                               <------------------------------Ub-like-------------------------------------------><------NYN/PIN------------------------------------------------------------------------------------------>|
FINAL                                          ---EEEEEHHHHHH---------EEEE------HHHHHHHEE-----EEEEEEE----------------EEEE-------|---------------EEEHHHHHHHHHHHHHHH------------HHHHHHHHHH--EEEEE-HHHHHHHHH----EEEE---HHHHHHHHHHHH--H----------------------------HHHHH----EEEEE---EEEEE---HHHHHHHHHHHHH------
ALIGN                                          ----HHHHHHHHHH--------------------EEEEEEE------EEEEEEE---------------EEEEE-------|---------------EEEHHHHHHHHHHHHHHH-------------HHHHHHHHH---EEEEHHHHHHHHHHHH----EE----HHHHHHHHHHH--H-------E---------HH--HH---------------EE-----EEEEE----HHHHHHHHHHHH------
HMM                                            ----EEEEHH-HHH--------EEEEEEE-----EEEEEEEEE---EEEEEEEEE--------------EEEEEE------|---------------HEEEHHHHHHHHHHHHHHHHHHHH------HHHHHHHHH---EEEEEHHHHHHHHHHH-EEEEEE----HHHHHHHHHHH--H-HHHHHHHHHH----HHHHHHHHHHH---HHHHHEEEEEEEE---EEEEE--HHHHHHHHHHHHHHH-----
FREQ                                           ---HHHHHHHHH------------HE-------HHHHHHHH------EEEEEEE----------------EEEE-------|--------------HEEHHHHHHHHHHHHHHHH----E--------HHHHHHHHHH--EEEE--HHHHHHHHH---EEEE---HHHHHHHHHHHH--H-------------------------------------EEEEE----EEEE----HHHHHHHHHHH-------
PSSM                                           ---EEEE----------------EEE-------HHHHHH---------HHHEE-----------------EE---------|---------------EEEEHHHHHHHHHHHH----E---------HHHHHHHHH----EEEE---HHHHH------EEE-----HHHHHHHHHHH-------------------EE------------HHH----EEEEE-----EEE---HHHHHHHHHHHHH------
Tfu_1519_Tfus_71915653                         HASITLRFDPTLRPLLAPRNRTDLLHVNHDPAASLSHVVESLGVPLTEIGELRINGTTASPSQHPQPGDLIEVLTVPKPQP|---------VPFSPIRFILDVHLGTLARRLRLLGVDTVYYT-HRDDPALVQQANEEQRILLTRDRGILYRKNLRAGGHIYASNPDEQLFEVLDRY--APPLAPWTRCLTCNGPLAQVDKDNIADQLPAGTRATYDTFVQCTECRQIYWPGAHHARLTQIIEAAQKRVAAI-
SCO4976_Scoe_20520977                          GPEIHVEFAPELHLFVPRARPTGVASAATDGVSTLGHLVESLGVPLTEVGALLVDGREVPPGHIPAGGESVRVRPVRHPQR|---------VPGAPLRFLLDVHLGTLARRLRLLGVDTAYESTDLGDPALAALSAAEKRVLLSRDRGLLRRRELWAGAYVYSTRPEEQLQEVLDRF--RPALSPWTRCTACNGLLRTATKEEVAEQLEGGTRRSYDVFAQCTACGRAYWRGAHHEQLEAIVERAVSSTRDA-
SAV3291_Save_29606942                          GPEIHVAFAPELRLFVPHERRSGTTAVGTDGASTLGHVVESLGVPLPEVGALVVNGRETPVSYIPAAGDSVEVRPVERPQR|---------VPGAPLRFLLDVHLGTLARRLRLLGVDTAYESTDLGDPALAALSAAEKRVLLSRDRGLLRRRELWAGAYIYSTRPDDQLRDVLDRF--APGLAPWTRCTACNGVLEKATKEQVADQLEGGTQRSYDVFAQCEECGRAYWKGAHHDRLEAIVERALAEFGA--
Npun02000115_Npun_53688960                     MAIAYFYFHAELNHFLPRHHKQVKISHFFEEKASIKDMIESLGVPHPEVDFINVNGKYVNFSYIVSDGDAINVYPISARSV|IIPSISVFPEPLSIIHFVVDIHLGKLATSLRLLGFDTLYRN-DYEDEKLAQISSSQGRILLTRDKGLLMRSLVTHGYYVRNTNPQEQIIEVLQRFDLFKLITPFKRCLRCNGLLEWVDKQSIIEQVPEKVRSQIDQFQRCQDCDRIYWKGSHYERLQQFIDGVLNSQKGE-
TM_0779_Tmar_4981308                           EKIAFFRFFGRLNDFFRNSERIK--THRFTGFQTVKDRIEALGVPHVEVSLITLNGKPVGFDHMVEDGELFFVYPEFQNIE|IPEDWLVTPRYIGEPRFVLDIHLGKLARLLRMLGFEAVFGE-E-SDEKLCWMAVKKKAILLSRDTGLLKRKELVFGYYVRNTDPKEQLVEVVERYDLKKWMKPFTRCIECGVELEEVPKEAVKNRVPPKVYGFFNEFARCPVCGRIYWKGSHYDHMVEFIKSNINKG----
AcidDRAFT_4098_Susi_67930484                   MPDGRFYFEGDLSLFLLPSLRGREVKRTWSDTDTLMHVIESIGVPHTEV------------ARIERDGSLIRVYPRTREIL|------------QDPRFVLDQHLGRLAAYLRMLGFDVLHTV-PAPDQHLAAASSREDRVLLTRDVGLLKRKEVRRGYFVRATDPRAQLLEVLKRFGLVDAIAPFTRCFLCNTPLESVDKAVIARQLPERIADLHNHFMRCPSCGRVYWKGSHYDRMRELIEDIKKRALFD-
nfa28300_Nfar_54016307                         ASGIELRLYAELNDFLPPQDRQDALWRPVRPHQTVKDIVEAAGVPHTEIDLLLVNGESVGFEHHPRPGDRLAAYPMFESLD|ISGLTRVRPHPLREPRMLIDVNLGGLARLLRLMGQDVRCDF-DATDARLAEISAEDHRILLTRDRGLLARRIVSHGVYVRADRPFEQIVEVIGRLDLADQLAPFTRCLRCGAVLADVAKDEIVHELSPGTRENYDTFRRCTGCGRIYWAGAHQRRLDDLVTQILAAVRR--
MtubF_01000602_Mtub_76785598                   VGYVDVRAYAELNEFVELQARGLTVRRPFRSHQTVKDVLEAMGIPHTEVDLILVNGDPADFSYRPVAGDRIAAYPMFEALD|IGSTARLRPAPLRNPRFVVDVNLGQLARLLRLLGFDTRWSS-AADDPTLADISLGEQRILLTRDRGLLKRRAITHGLFVHSQHPEEQALEVLRRLDLNGRLAPLSRCLRCNGELAAVSKDEVIGQLEPLTRRYYESFSRCFGCGRIYWPGSHHARLVRLVERLRDQLTTST
RRSL_04745_Rsol_83745680                       MPTLLFTFDASLTPLLPLTQRQRPAARAWPEGATLKHAIETFGVPHTEVGAVHVDGCAAPLESLLPARGAVAVAGVQAALP|-----------QAPLHFLCDAHLGATARLLRMAGFDTAYDN-NYADATIEALADTEDWIVLSRDRELLKRRGIRRGAFVRAREPQAQMREIVARFKLAEAARPFSRCLECNAPLRLLSAEEAASSVPPRVRERQHLFSTCDVCRRVYWPGSHWARMNTALARMLAPHQEDG
RSp1109_Rsol_17431582                          MPTLLFTFDASLTPLLPVAQRERPAARAWPEGATLKHAIETFGVPHTEVGVVQVDGHAALLDALLPARGAVAVAGVRAALP|-----------DAPLHFLCDAHLGATARLLRMAGFDTAYDN-NYADATIEALADTEDWIVLSRDRELLKRRGIRRGAFIRAREPQAQMREIVARFRLAEAARPFSRCLECNAPLRLLSAEEAAASVPPRVRERQHLFSTCDVCRRVYWPGSHWARMNTSLARMLAPHPDGA
BdolA_01000029_Bdol_84363527                   MATATFRFHGELNAFVARTQRDRAFAHACARDATLKHAIEALGVPHTEIGQLTVNGAAAGLDRPVGDGDRIDVYPERAREP|--AAAPPATPRSEQWRFVADAHLGGLAQLLRLAGFDTCYDN-HYRDDEIAALAEREGRLVLTRDRELLKRRAVARGCYLHALQPADQLRELFSRLALAPYMRPFRLCLRCNAPLHALDADAAAPRVPAGVRQRHRRFVECDVCRRVFWEGSHWRRMRALVDSMRTAAVPDE
BambDRAFT_0385_Bamb_74019866                   MATATFRFHGELNAFLARAQRGCAFAHVFARDATVKHAVEALGVPHTEIGRLCVNGAPAALDRPLGDGDRVDIHPERARPA|---IESPVQPQPESWRFIADAHLGGLAQLLRLAGFDTCYDN-HYRDDELVALAAREGRIVLTRDRELLKRRAVVRGCYLHAQQPDAQLHELFARLDLAPHMRPFRLCLRCNAPLHALDAADAAPRVPAGVRQRHRRFAACDVCRRVFWEGSHWRRMRAVVDAMRALPPVAP
Bcen2424DRAFT_1951_Bcen_67666690               MATASLRVVVELNAFLASQQRDRAFAHACARDATVKHAIEALGVPHTEIGRLYVNDAPAALDRPLDDGDRVEVLPERAGPA|---ANGATGPPPAAWRFVADAHLGGLAQLLRLAGFDTCYDN-HYRDDELAALAEREQRIVLTRDRELLKRRAVVRGCYLHALQPADQLRELFERLDLAPHMRPFRLCLRCNAPLHPLDAAAAAPSVPAGVRLRHRRFAACDVCRRVFWEGSHWRRMRAVVDAMRTPSPVRR
Bcep18194_A3405_Bsp._77965627                  MATATFRFHDELNAFLPRAQRDRAFGHACARDATLKHAIEALGVPHTEIGRLCVNDAPATLDRPLDDGDRVEAFPERAQPA|---ANGATVPPSAHWRFAADAHLGGLAQLLRLAGFDTCYDN-HYRDDELAALAAREGRIVLTRDRELLKRRAVERGCYLHALQPADQLRELFERLDLAPHMRPFRLCLRCNAPLHPLDAAAAAPRVPAGVRLRHRRFAACDVCRRVFWEGSHWRRMRTVVDAMRAPPPPAP
AvinDRAFT_7917_Avin_67154055                   MVSVTFRFYEELNDFLPSERRRQAFACDCARAATVKHMIEALGVPHTEVELVLLNGESVDFSRPLHDGDRVAVYPRFEALD|IGPLLKVRDHPLRELRFIADAHLGGLASLLRMCGFDTLYDN-HYEDRQIAALAAEQRRIVLSRDRELLKRRIVTHGCYLHALKPALQLRELFERLDLAGSARPFSRCLHCNLPLHEVTVEQARPRLPPRIAALYSRFFGCDACQRLYWEGSHWRSMRSLLAPLLDDRPPER
Bpse1_02004518_Bpse_67669903                   MVTVTFRFYEELNDFLARPLRRREFAHACMRGASVKHAIEALGVPHTEVELILVNGESTPFSHVLEEGDRVAVYPSFEAID|IRPLLRVRAAPLRVTRFIADAHLGGLAQLLRLAGFDTLYDN-HYPDKLIETIAAREARIVLTRDRELLKRRTITHGCYVRALKPQAQLQELFDRLDLAGSARPFRLCLSCNAPLRRIDPAEAAGRAPQGVLQRHTRFVTCDVCRRVFWEGSHWRRMRALIEHVSQPKPPPG
Bcep02006224_Bfun_48782379                     MVTATFRFYEELNDFLARPLRRRAFTYACAPGATAKHMIEALGVPHTEVELILVNGESVGFNHPLSDGDRLAVYPKFEALD|IHPLLRVRERPLRVVRFIADAHLGGLAPLLRLAGFDTLYDN-HYPDADIEALAAAQQRIVLTRDRELLKRRNITHGCYVRTLRPREQLREVFERLDLAGSAQPFRLCLMCNVPLRRIPKEEVGTRAPDGVLERHAQFVTCDVCRRVFWEGTHWQRMRALMDSVAAAPDRSA
Tbd_2158_Tden_74318176                         MVIATFRFYEELNDFLAPDRRKREFTVPCARAATTKHMIEALGVPHTEVELILVNGESAGFDRRLQDGDRVAVYPRFEAMD|VSPLLRVRERPLRETRFVADAHLGGLAHMLRMLGFDTLYDN-HFHDDAIVAICEHDGRIVLTRDRELLKRRSVTHGCYIHALKSEAQLREVVARLDLARSARPFTRCLHCNVPLRTVDKASVLDRLPPKVREHYAHFPTCDSCGRIYWAGSHWRNMRRLLDDVLSGERDSG
Nmul_A0146_Nmul_82701281                       MVTATFRFYEELNDFLVPERRKREFSCPCARAATTKHMIEALGVPHTEVELVLVNGESVGFDRILEHGDRVAVFPKFEMVD|VAPLLRVREHPLRVTRFIADAHLGGLAHLLRMTGFDTLYDN-NYHDRQIELLAAQEKRIVLTRDRELLKRRSITHGCYVRTLKPPEQLCEIFDRLDLAHSIKPFTLCLNCNAPLRPVEKSVVLERLPPSVRERFDHFSTCDICHRVFWEGSHWQRMRTMLEECIKPNRFGG
BproDRAFT_2323_Psp._67908809                   MVMASFRFYEELNEFLAPERRGREFACPCARAATTKHMIEALGVPHTEVELVLVNGESVGFDRQLREGDRVAVYPKFEALD|VTPLLRVRGQTLRVTRFVADAHLGGLAHLLRMAGFDTLYDN-HFRDEEIERIAAEQGRIVLTRDRDLLKRRTITHGCYVHALRTELQLREIFGRLDLARSARPFTLCLHCNAPLHAIEKMRVATMLPPQVREHYQRFSACDVCHRVFWEGSHWRRMRLMLDGLLS------
ebA822_Asp._56311907                           MVTATFRFYEELNDFLAPARRRREFDAPCARAATVKHMVEALGVPHTEVELVLVNGESVDFGRLLRDGDRVAVYPKFESLD|ITPLLRVRSHPLRVMRFVADAHLGGLAHLLRMTGFDTLYDN-HFDDGEIEIIAGRDARIVLTRDRELLKRRTLTHGCYVRALKPAQQVREIFDRLDLAGSAKPFTLCLDCNAPLRPIGKAQVEDRLPPGVRASHTRFSTCDVCRRVFWEGSHWRRMRVLVDELLAGSPPLP
RmetDRAFT_5449_Rmet_68554875                   MVTATFRFYEELNDFLAPAQRRRDLSCPCARAATVKHMIEALGVPHTEVELILVNGESSPFERIVCDGDRIAVYPKFESFD|IAPLLRVREQPLREIRFVADAHLGGLAHLLRMTGFDTLYDN-HFEDCEIARIASDEKRIVLTRDRELLKRRGITHGCYVRAIRSSLQVREIFSRLDLARSARPFSLCLDCNVPLRRIGKTDVDGRVPEGVFERHEHFVTCPHCHRVFWEGSHWRKMRTLVEELMSAQADQV
Reut_A0217_Reut_72117336                       MVTATFRFYEELNDFLAPDQRRRDLSCPCARAATVKHMIEALGVPHTEVELILVNGESSGFDRMLEDGDRVSVYPKFESLD|VSPLLRVRAHPLRIMRFVADAHLGGLAHLLRMMGFDTLYDN-HFEDSEIERIAEREGRIVLTRDRELLKRRGITHGCYVRAIKSTPQVREIFQRLDLARSARPFSLCLDCNVPLQPVARDVVADRVPPAVLERHDRFVTCDGCRRVFWEGSHWRCMRALVDELVCAG----
consensus/100%                                 .....h.h..pLp.hh.................o..c.lEshGlP.sEl.....................h.h.s......|...............+hhhD.pLG..A..LRh.G.-s........D..l...s..p..llLoRD..lL.Rp.l..G.al.s.ps..Qh.-lh.Rh.....h.PhpbC..Cs..L..h....h...h..........F..C..C.bhaW.GsH..ph...h...........
consensus/95%                                  .s.h.h.h..pLp.hl....+............o..chlEshGVP.sEl..l.lss..s...........l.h.s......|...............+FlhDhpLG..A.bLRh.GhDs.a......D..l..hu..p.bllLoRDp.LLbR+.l..Ghal.s.ps..Qh.Elh.Rh.....h.PapbC..Css.L..h....h..ph..........F..C..C.RhaW.GuHa.ph..hl..h........
consensus/90%                                  .s.h.h.h..pLp.hl....R...h..s.....o..chlEuhGVP.sEl..l.lss..ss.......Gs.l.h.s......|...............+FlhDhHLG.LApbLRh.GhDshaps.ph.D..l..hu..p.bllLoRD+.LLbR+.l..Ghal.s.ps..Qh.Elh.Rh.....h.PapbCh.CNs.L..ls...h..pl...sb...p.F..C..C.RlaW.GuHa.php.hl..h........
consensus/85%                                  .s.h.hbF..cLs.Fls...R...h..sh...sThbchlEuhGVPHTEl..l.lsG..sshs....sG-.l.lhP......|...............RFlhDhHLG.LApbLRhhGhDThaps.ph.D..l..lu..p.RllLoRDR.LL+R+.l..Ghal+s.ps..Ql.Elh.Rh.Ls..h.PapbCl.CNs.Lp.ls...h..pls..sb..ap.F..Cs.C.RlaW.GuHa.php.hlp.h........
consensus/80%                                  hs.hphbF..-Ls.Fls..bR.p.hs.shs.suTlKHhlEulGVPHTEl..l.VNG.ssshsp...sG-.l.VhP.b....|......s...s....RFlhDhHLGsLAphLRhhGFDThYcs.ph.D..l..lu.pc.RIlLoRDR.LL+RR.l.+Ghal+u.pP..QlbElh.Rh.Ls..h.PFpbCLpCNssLc.ls...h..plPs.sb.pappFs.CssC.RlaW.GoHap+hp.hlc.hbs......
consensus/75%                                  hs.hphcF..ELs.FLs..bRpp.hs.shs.suTlKHhlEuLGVPHTEl.bl.VNGpssshsp...sGDbl.VhP.b....|......s...P....RFlhDhHLGsLAphLRhhGFDThY-s.ch.D.pl..lusp-.RIlLoRDR.LLKRR.l.+GhYl+u.pP..QlbElh.RhcLA..hpPFpbCLpCNssLc.ls...hhsplPs.lb.pappFspCssC.RlaW.GSHap+hp.hl-.hbs......
consensus/70%                                  MsphpFRF..ELNsFLs..bRpc.hspshspsATlKHhlEuLGVPHTEV.hl.VNGpssshs+.l.sGDbl.VaPbb....|......sp..P....RFlhDhHLGsLApLLRhhGFDThY-s.ch.D.pl..lusp-.RIlLTRDR.LLKRR.lp+GhYl+u.cP..QlbElhpRhcLA..hpPFpbCLpCNsPLc.ls...hhsplPs.lbbpappFspCssC.RlaWcGSHacRMc.ll-.hbs......

Species abbreviations:
Asp. : Azoarcus sp.; Avin : Azotobacter vinelandii; Bamb : Burkholderia ambifaria; Bcen : Burkholderia cenocepacia; Bdol : Burkholderia dolosa; Bfun : Burkholderia fungorum; Bpse : Burkholderia pseudomallei; Bsp. : Burkholderia sp.; Mtub : Mycobacterium tuberculosis; Nfar : Nocardia farcinica; Nmul : Nitrosospira multiformis; Npun : Nostoc punctiforme; Psp. : Polaromonas sp.; Reut : Ralstonia eutropha; Rmet : Ralstonia metallidurans; Rsol : Ralstonia solanacearum; Save : Streptomyces avermitilis; Scoe : Streptomyces coelicolor; Susi : Solibacter usitatus; Tden : Thiobacillus denitrificans; Tfus : Thermobifida fusca; Tmar : Thermotoga maritima
-------------------------------------------------------------------------------------------------------------

8. Uncharacterized operon encoding a Ub-like (RnfH) family protein
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Abbreviations: c/d: aromatic cyclase/dehydrase (c/d)--82
Gis are of the Ub/RnfH protein (marked with an asterisk)

GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
13362951        102          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Escherichia coli O157:H7                                       proteobacteria>gammaproteobacteria
67549235        107          <-SmpB-c/d->Ub*->                                                                                                  Burkholderia vietnamiensis G4                                  proteobacteria>betaproteobacteria
24373049        111          <-SmpB-c/d->Ub*-><-SmpA                                                                                            Shewanella oneidensis MR-1                                     proteobacteria>gammaproteobacteria
9655302         103          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Vibrio cholerae O1 biovar eltor str. N16961                    proteobacteria>gammaproteobacteria
71898356        84           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Xylella fastidiosa Ann-1                                       proteobacteria>gammaproteobacteria
45435745        94           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Yersinia pestis biovar Medievalis str. 91001                   proteobacteria>gammaproteobacteria
75857320        117          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Vibrio sp. Ex25                                                proteobacteria>gammaproteobacteria
46156590        110          -c/d->Ub*->                                                                                                        Haemophilus somnus 2336                                        proteobacteria>gammaproteobacteria
67638926        107          <-SmpB-c/d->Ub*->                                                                                                  Burkholderia mallei 10399                                      proteobacteria>betaproteobacteria
84387679        103          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Vibrio splendidus 12B01                                        proteobacteria>gammaproteobacteria
46202194        97           Ub*                                                                                                                Magnetospirillum magnetotacticum MS-1                          proteobacteria>alphaproteobacteria
16421233        96           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Salmonella typhimurium LT2                                     proteobacteria>gammaproteobacteria
77958607        94           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Yersinia bercovieri ATCC 43970                                 proteobacteria>gammaproteobacteria
7379707         92           FtsJ->FtsH->c/d->Ub*                                                                                               Neisseria meningitidis Z2491                                   proteobacteria>betaproteobacteria
58581648        91           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Xanthomonas oryzae pv. oryzae KACC10331                        proteobacteria>gammaproteobacteria
52627727        90           -c/d->Ub*-><-SmpA-                                                                                                 Legionella pneumophila subsp. pneumophila str                  proteobacteria>gammaproteobacteria
76579340        269          <-SmpB-Ub*->                                                                                                       Burkholderia pseudomallei 1710b                                proteobacteria>betaproteobacteria
68245723        165          Ub*                                                                                                                Magnetococcus sp. MC-1                                         proteobacteria
71362697        157          Ub*-><-SmpA-                                                                                                       Psychrobacter cryohalolentis K5                                proteobacteria>gammaproteobacteria
71037825        134          Ub*-><-SmpA-                                                                                                       Psychrobacter arcticus 273-4                                   proteobacteria>gammaproteobacteria
71143975        131          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Colwellia psychrerythraea 34H                                  proteobacteria>gammaproteobacteria
84717489        129          <-SmpB-c/d->Ub*->                                                                                                  Polaromonas naphthalenivorans CJ2                              proteobacteria>betaproteobacteria
56179014        125          <-SmpB-c/d->Ub*->                                                                                                  Idiomarina loihiensis L2TR                                     proteobacteria>gammaproteobacteria
68559691        121          Hjlc<-SmpB-c/d->Ub*->                                                                                              Ralstonia metallidurans CH34                                   proteobacteria>betaproteobacteria
17428441        118          Hjlc<-SmpB-c/d->Ub*->                                                                                              Ralstonia solanacearum                                         proteobacteria>betaproteobacteria
83749959        118          Hjlc<-SmpB-c/d->Ub*->                                                                                              Ralstonia solanacearum UW551                                   proteobacteria>betaproteobacteria
37197746        117          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Vibrio vulnificus YJ016                                        proteobacteria>gammaproteobacteria
67158728        117          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Azotobacter vinelandii AvOP                                    proteobacteria>gammaproteobacteria
33576043        116          <-SmpB-c/d->Ub*->                                                                                                  Bordetella bronchiseptica RB50                                 proteobacteria>betaproteobacteria
67677127        115          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Chromohalobacter salexigens DSM 3043                           proteobacteria>gammaproteobacteria
74020818        115          -c/d->Ub*->                                                                                                        Rhodoferax ferrireducens DSM 15236                             proteobacteria>betaproteobacteria
85711977        115          -c/d->Ub*-><-SmpA-SmpB->                                                                                           Idiomarina baltica OS145                                       proteobacteria>gammaproteobacteria
68545933        112          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Shewanella amazonensis SB2B                                    proteobacteria>gammaproteobacteria
74317772        112          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Thiobacillus denitrificans ATCC 25259                          proteobacteria>betaproteobacteria
69157448        111          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Shewanella denitrificans OS217                                 proteobacteria>gammaproteobacteria
69952904        111          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Shewanella frigidimarina NCIMB 400                             proteobacteria>gammaproteobacteria
72118961        111          <-SmpB-c/d->Ub*->                                                                                                  Ralstonia eutropha JMP134                                      proteobacteria>betaproteobacteria
48787671        110          Hjlc<-SmpB-c/d->Ub*->                                                                                              Burkholderia fungorum LB400                                    proteobacteria>betaproteobacteria
52306665        110          Ub*                                                                                                                Mannheimia succiniciproducens MBEL55E                          proteobacteria>gammaproteobacteria
68212386        110          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Methylobacillus flagellatus KT                                 proteobacteria>betaproteobacteria
78366485        110          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Shewanella sp. PV-4                                            proteobacteria>gammaproteobacteria
67907298        109          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Polaromonas sp. JS666                                          proteobacteria>betaproteobacteria
46912321        108          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Photobacterium profundum SS9                                   proteobacteria>gammaproteobacteria
48861843        108          -c/d->Ub*-><-SmpA-                                                                                                 Microbulbifer degradans 2-40                                   proteobacteria>gammaproteobacteria
71847580        108          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Dechloromonas aromatica RCB                                    proteobacteria>betaproteobacteria
76791575        108          <-SmpB-c/d->Ub*->                                                                                                  Pseudoalteromonas atlantica T6c                                proteobacteria>gammaproteobacteria
56315277        107          <-SmpB-c/d->Ub*->                                                                                                  Azoarcus sp. EbN1                                              proteobacteria>betaproteobacteria
76874703        107          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Pseudoalteromonas haloplanktis TAC125                          proteobacteria>gammaproteobacteria
47571785        106          <-SmpB-c/d->Ub*->                                                                                                  Rubrivivax gelatinosus PM1                                     proteobacteria>betaproteobacteria
71548925        105          Cox15-><-SmpB--c/d->Ub*->                                                                                          Nitrosomonas eutropha C71                                      proteobacteria>betaproteobacteria
28871646        104          <-SmpB--X->-c/d->Ub*->-X->-X-><-SmpA-                                                                              Pseudomonas syringae pv. tomato str. DC3000                    proteobacteria>gammaproteobacteria
66047427        104          <-SmpB--X->-c/d->Ub*-><-SmpA-                                                                                      Pseudomonas syringae pv. syringae B728a                        proteobacteria>gammaproteobacteria
68348616        104          <-SmpB--X->-c/d->Ub*-><-SmpA-                                                                                      Pseudomonas fluorescens Pf-5                                   proteobacteria>gammaproteobacteria
71558661        104          <-SmpB--X->-c/d->Ub*-><-SmpA-                                                                                      Pseudomonas syringae pv. phaseolicola 1448A                    proteobacteria>gammaproteobacteria
77380988        104          <-SmpB-X->c/d->Ub*-><-SmpA-                                                                                        Pseudomonas fluorescens PfO-1                                  proteobacteria>gammaproteobacteria
30138331        103          <-SmpB-c/d->Ub*->                                                                                                  Nitrosomonas europaea ATCC 19718                               proteobacteria>betaproteobacteria
33572188        103          <-SmpB-c/d->Ub*->                                                                                                  Bordetella pertussis Tohama I                                  proteobacteria>betaproteobacteria
49530092        103          ->Ub*-><-SmpA                                                                                                      Acinetobacter sp. ADP1                                         proteobacteria>gammaproteobacteria
59712607        103          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Vibrio fischeri ES114                                          proteobacteria>gammaproteobacteria
68057197        102          Ub*                                                                                                                Haemophilus influenzae 86-028NP                                proteobacteria>gammaproteobacteria
29541871        101          <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Coxiella burnetii RSA 493                                      proteobacteria>gammaproteobacteria
33149060        100          -c/d->Ub*->                                                                                                        Haemophilus ducreyi 35000HP                                    proteobacteria>gammaproteobacteria
34498917        100          -c/d->Ub*-><-SmpA-                                                                                                 Chromobacterium violaceum ATCC 12472                           proteobacteria>betaproteobacteria
10038928        99           <-SmpB-Ub*->                                                                                                       Buchnera aphidicola str. APS (Acyrthosipho                     proteobacteria>gammaproteobacteria
12720385        99           -c/d->Ub*->                                                                                                        Pasteurella multocida subsp. multocida str                     proteobacteria>gammaproteobacteria
76883017        99           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Nitrosococcus oceani ATCC 19707                                proteobacteria>gammaproteobacteria
87120236        99           <-SmpB--X->-c/d->Ub*-><-SmpA-                                                                                      Marinomonas sp. MED121                                         proteobacteria>gammaproteobacteria
21112537        98           -serinepeptidase-><-SmpB--c/d->Ub*-><-SmpA-                                                                        Xanthomonas campestris pv. campestris str. ATC                 proteobacteria>gammaproteobacteria
32035426        98           -c/d->Ub*->                                                                                                        Actinobacillus pleuropneumoniae serovar 1 str                  proteobacteria>gammaproteobacteria
78035544        98           -serinepeptidase-><-SmpB--c/d->Ub*-><-SmpA-                                                                        Xanthomonas campestris pv. vesicatoria str                     proteobacteria>gammaproteobacteria
36786684        97           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Photorhabdus luminescens subsp. laumondii TTO1                 proteobacteria>gammaproteobacteria
78701201        97           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Alkalilimnicola ehrlichei MLHE-1                               proteobacteria>gammaproteobacteria
82702330        96           -c/d->Ub*-><-SmpA-                                                                                                 Nitrosospira multiformis ATCC 25196                            proteobacteria>betaproteobacteria
83644081        96           <-SmpB--X->-c/d->Ub*-><-SmpA-                                                                                      Hahella chejuensis KCTC 2396                                   proteobacteria>gammaproteobacteria
21623148        95           <-SmpB--Ub*->                                                                                                      Buchnera aphidicola str. Sg (Schizaphi                         proteobacteria>gammaproteobacteria
77978427        94           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Yersinia intermedia ATCC 29909                                 proteobacteria>gammaproteobacteria
84780300        94           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Sodalis glossinidius str. 'morsitans'                          proteobacteria>gammaproteobacteria
49610307        93           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Erwinia carotovora subsp. atroseptica SCRI1043                 proteobacteria>gammaproteobacteria
77953016        92           <-SmpB--X->-c/d->Ub*-><-SmpA-                                                                                      Marinobacter aquaeolei VT8                                     proteobacteria>gammaproteobacteria
78364037        92           <-SmpB-c/d->Ub*-><-SmpA-                                                                                           Thiomicrospira crunogena XCL-2                                 proteobacteria>gammaproteobacteria
21107692        87           -serinepeptidase-><-SmpB--c/d->Ub*-><-SmpA-                                                                        Xanthomonas axonopodis pv. citri str. 306                      proteobacteria>gammaproteobacteria
27904127        86           Ub*-><-SmpA-                                                                                                       Buchnera aphidicola str. Bp (Baizongi                          proteobacteria>gammaproteobacteria
-------------------------------------------------------------------------------------------------------------

9. Mobile RnfH operon (electron transport chain--9)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Gis are of the RnfH protein (marked with an asterisk)
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
56312934        101       rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Azoarcus sp. EbN1;                                             proteobacteria>betaproteobacteria             Protein rnfH [Azoarcus sp. EbN1]
71846749        90        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Dechloromonas aromatica RCB;                                   proteobacteria>betaproteobacteria             Protein of unknown function UPF0125 [Dechloromonas aromatica RCB]
56552704        88        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Zymomonas mobilis subsp. mobilis ZM4;                          proteobacteria>alphaproteobacteria            hypothetical protein ZMO1808 [Zymomonas mobilis subsp. mobilis ZM4]
53756757        95        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Methylococcus capsulatus str. Bath;                            proteobacteria>gammaproteobacteria            electron transport complex, H subunit [Methylococcus capsulatus str. Bath]
9843879         86        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Pseudomonas stutzeri;                                          proteobacteria>gammaproteobacteria            RnfH protein [Pseudomonas stutzeri]
77389630        85        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Rhodobacter sphaeroides 2.4.1;                                 proteobacteria>alphaproteobacteria            probable rnfH protein [Rhodobacter sphaeroides 2.4.1]
67158346        86        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Azotobacter vinelandii AvOP;                                   proteobacteria>gammaproteobacteria            Protein of unknown function UPF0125 [Azotobacter vinelandii AvOP]
1905814         85        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Rhodobacter capsulatus;                                        proteobacteria>alphaproteobacteria            RnfH protein [Rhodobacter capsulatus]
46202216        84        rnfB->rnfC->rnfD->rnfG->rnfE->(Ub)rnfH*->                                                                             Magnetospirillum magnetotacticum MS-1;                         proteobacteria>alphaproteobacteria            COG2914: Uncharacterized protein conserved in bacteria [Magnetospirillum magnetotacticum MS-1]
-------------------------------------------------------------------------------------------------------------

10. Aromatic amino acid hydroxylase; Toluene/o-Xylene Monooxygenase Hydroxylase protein B
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Gis are of the TmoB/Ub protein (marked with an asterisk)
GI              LENGTH    Operon                                                                                                                ORGANISM                                                       Classification                                Protein descriptions (if any)
78693154        81        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                                Bradyrhizobium sp. BTAi1                                       proteobacteria>alphaproteobacteria            hypothetical protein BradDRAFT_6557 [Bradyrhizobium sp. BTAi1]
48094248        86        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                                Pseudomonas sp. OX1                                            proteobacteria>gammaproteobacteria            toluene o-xylene monooxygenase component [Pseudomonas stutzeri]
68556036        102       4OCDC->4OCTT->TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                  Ralstonia metallidurans CH34                                   proteobacteria>betaproteobacteria             Toluene-4-monooxygenase system B [Ralstonia metallidurans CH34]
5911739         94        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                                Rhodococcus sp. AD45                                           actinobacteria                                putative isoprene monooxygenase gamma subunit [Rhodococcus sp. AD45]
45479222        84        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                                Pseudomonas mendocina                                          proteobacteria>gammaproteobacteria            gammahydroxylase [Pseudomonas mendocina]
1754624         88        TmoA->TmoB/Ub*->TmoC->TmoD                                                                                            Pseudomonas aeruginosa                                         proteobacteria>gammaproteobacteria            bmoB [Pseudomonas aeruginosa]
71849051        88        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF->TodX                                                                          Dechloromonas aromatica RCB                                    proteobacteria>betaproteobacteria             Toluene-4-monooxygenase system B [Dechloromonas aromatica RCB]
86565792        82        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE                                                                                      Frankia sp. CcI3                                               actinobacteria                                Toluene-4-monooxygenase system B [Frankia sp. CcI3]
4210875         88        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                                Xanthobacter autotrophicus Py2                                 proteobacteria>alphaproteobacteria            oxygenase gamma subunit [Xanthobacter sp. Py2]
72122837        86        Note:phenol hydroxylase operon<-TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                Ralstonia eutropha JMP134                                      proteobacteria>betaproteobacteria             Toluene-4-monooxygenase system B [Ralstonia eutropha JMP134]
44893909        86        TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                                Ralstonia pickettii                                            proteobacteria>betaproteobacteria             gamma hydroxylase subunit [Ralstonia pickettii]
2150114         89        4OCDC->TmoA->TmoB/Ub*->TmoC->TmoD->TmoE->TmoF                                                                         Burkholderia cepacia                                           proteobacteria>betaproteobacteria             TbhB [Burkholderia cepacia]

Abbreviations:
TmoA: Toluene-4-monooxygenase hydroxylase; Ferritin-like
TmoD: hydroxylase/monooxygenase regulatory protein; Ferritin-like
TmoE: Toluene-4-monooxygenase hydroxylase
TmoB: Ubiquitin fold
TmoC: Rieske 2Fe-S protein
TmoF: NADH-ferredoxin oxidoreductase
4OCDC: 4-oxalocrotonate decarboxylase
4OCTT: 4-oxalocrotonate tautomerase
TodX: Aromatic amino acid transporter, Porin like beta-barrel
* Note: The ribonucleotide reductase large and small subunits also correspond to the TmoA/D pair
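
The operon strings in these tables can be read mechanically. The following is a minimal sketch in Python, not part of the original analysis, that splits an operon string into genes and records each gene's domains, orientation and whether it carries the asterisk. It assumes that a gene followed by "->" lies on the forward strand, that a gene preceded by "<-" lies on the reverse strand, and that "||" only marks a switch in orientation; a final gene written without an arrow is left with an unspecified strand. The names Gene and parse_operon are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Gene(NamedTuple):
    name: str               # gene/domain label with the asterisk removed
    domains: List[str]      # label split on '+', the domain separator
    strand: Optional[str]   # '+', '-', or None when no arrow is attached
    marked: bool            # True if the label carried the '*' marker


# Separators used in the operon column. Treating '||' as a plain
# orientation break is an assumption of this sketch.
_SEPARATORS = re.compile(r"(->|<-|\|\|)")


def parse_operon(operon: str) -> List[Gene]:
    """Split an operon string such as '<-A||B*->C->' into Gene records."""
    tokens = [t.strip() for t in _SEPARATORS.split(operon.strip()) if t.strip()]
    genes = []
    for i, tok in enumerate(tokens):
        if tok in ("->", "<-", "||"):
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        prev = tokens[i - 1] if i > 0 else None
        if nxt == "->":        # assumed: trailing arrow means forward strand
            strand = "+"
        elif prev == "<-":     # assumed: leading arrow means reverse strand
            strand = "-"
        else:                  # e.g. a last gene written without an arrow
            strand = None
        marked = tok.endswith("*")
        name = tok.rstrip("*")
        genes.append(Gene(name, name.split("+"), strand, marked))
    return genes


# Example using an operon string from the YukD-like table
for gene in parse_operon(
    "<-ESAT-6<-ESAT-6<-?<-FtsK||Ub+12xTM*->subtilisin-><-memb_associated"
):
    print(gene)
```

Labels such as "?" (uncharacterized gene) and free-text notes embedded in the operon column are returned unchanged, so any downstream use would need to filter them.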
-------------------------------------------------------------------------------------------------------------

11. YukD-like proteins
^^^^^^^^^^^^^^^^^^^^^^

Abbreviations:
YukD: YukD-like ubiquitin
S/TK: serine/threonine kinase
Gis are of the YukD-like Ub protein (marked with an asterisk)

GI              LENGTH    Operon                                                                                                                                                                           ORGANISM                                                       Classification                                Protein descriptions (if any)
15026816        90        <-FtsK<-S/TK<-yukD*<-?<-ESAT-6                                                                                                                                                   Clostridium acetobutylicum ATCC 824                            firmicutes                                    AE007866_9 Hypothetical protein [Clostridium acetobutylicum ATCC 824]
15022859        81        yukD*->FtsK->ESAT-6->                                                                                                                                                            Clostridium acetobutylicum ATCC 824                            firmicutes                                    E007517_5 Hypothetical protein [Clostridium acetobutylicum ATCC 824]
52004898        79        <-Mem_prot<-FtsK<-S/TK<-yukD*<-ESAT-6                                                                                                                                            Bacillus licheniformis ATCC 14580                              firmicutes                                    conserved protein YukD [Bacillus licheniformis ATCC 14580]
2635685         79        <-Mem_prot<-FtsK<-FtsK<-S/TK<-yukD*<-ESAT-6                                                                                                                                      Bacillus subtilis subsp. subtilis str. 168                     firmicutes                                    yukD [Bacillus subtilis subsp. subtilis str. 168]
56908701        79        ESAT-6->yukD*->S/TK->FtsK->Mem_prot->                                                                                                                                            Bacillus clausii KSM-K16                                       firmicutes                                    conserved hypothetical protein [Bacillus clausii KSM-K16]
10173588        80        <-Mem_prot||ESAT-6->yukD*->S/TK->FtsK->?->?->transp->                                                                                                                            Bacillus halodurans C-125                                      firmicutes                                    BH0973 [Bacillus halodurans C-125]
67875114        82        yukD*                                                                                                                                                                            Clostridium thermocellum ATCC 27405                            firmicutes                                    hypothetical protein CtheDRAFT_2497 [Clostridium thermocellum ATCC 27405]
76563722        80        <-FtsK<-S/TK<-yukD*<-?<-Mem_prot<-ESAT-6<-ESAT-6                                                                                                                                 Streptococcus agalactiae A909                                  firmicutes                                    conserved hypothetical protein [Streptococcus agalactiae A909]
88194066        93        ESAT-6->Mem_prot->?->yukD*->S/TK->FtsK->                                                                                                                                         Staphylococcus aureus subsp. aureus NCTC 8325                  firmicutes                                    hypothetical protein SAOUHSC_00260 [Staphylococcus aureus subsp. aureus NCTC 8325]
49482522        80        ESAT-6->Mem_prot->?->yukD*->S/TK->FtsK->?->?->transp->                                                                                                                           Staphylococcus aureus subsp. aureus MRSA252                    firmicutes                                    hypothetical protein SAR0282 [Staphylococcus aureus subsp. aureus MRSA252]
22776996        84        ESAT-6->Mem_prot->?->yukD*->S/TK->FtsK->?->?->transp->                                                                                                                           Oceanobacillus iheyensis HTE831                                firmicutes                                    hypothetical conserved protein [Oceanobacillus iheyensis HTE831]
16412473        83        ESAT-6->Mem_prot->?->yukD*->S/TK->FtsK->                                                                                                                                         Listeria innocua                                               firmicutes                                    lin0052 [Listeria innocua]
46906292        83        ESAT-6->Mem_prot->?->yukD*->S/TK->FtsK->                                                                                                                                         Listeria monocytogenes str. 4b F2365                           firmicutes                                    hypothetical protein LMOf2365_0070 [Listeria monocytogenes str. 4b F2365]
89203070        83        <-FtsK<-S/TK<-yukD*<-?<-Mem_prot<-ESAT-6                                                                                                                                         Bacillus cereus subsp. cytotoxis NVH 391-98                    firmicutes                                    conserved hypothetical protein [Bacillus cereus subsp. cytotoxis NVH 391-98]
49329053        83        ESAT-6->Mem_prot->?->yukD*->S/TK->FtsK->                                                                                                                                         Bacillus thuringiensis serovar konkukian str. 97-27            firmicutes                                    conserved hypothetical protein [Bacillus thuringiensis serovar konkukian str. 97-27]
13093361        503       <-FtsK<-?<-subtilisin<-Ub+12xTM*<-?<-FtsK<-memb_associated                                                                                                                       Mycobacterium leprae                                           actinobacteria                                probable membrane protein [Mycobacterium leprae]
41407608        503       memb_associated->FtsK-><-?||?->PPE_family->PPE_family->PE_family->ESAT-6->?->Ub+12xTM*->subtilisin->?->FtsK->PE_family->PPE_family->PPE_family->?->PPE_family->PPE_family->      Mycobacterium avium subsp. paratuberculosis K-10               actinobacteria                                hypothetical protein MAP1510 [Mycobacterium avium subsp. paratuberculosis K-10]
13881491        503       PPE_family->PE_family->PPE_family->?->PPE_family-><-?||PE_family->ESAT-6->?->Ub+12xTM*->subtilisin->?->FtsK->?->PPE_family-><-?||PPE_family->PPE_family->                        Mycobacterium tuberculosis CDC1551                             actinobacteria                                hypothetical protein MT1844 [Mycobacterium tuberculosis CDC1551]
31618574        503       PPE_family->PE_family->PPE_family->PPE_family->PE_family->ESAT-6->ESAT-6->?->Ub+12xTM*->subtilisin->?->FtsK->?->PPE_family->PPE_family->PPE_family-><-?<-PE_family               Mycobacterium bovis AF2122/97                                  actinobacteria                                CONSERVED HYPOTHETICAL MEMBRANE PROTEIN [Mycobacterium bovis AF2122/97]
76784314        481       PPE_family->PE_family->PPE_family->PPE_family->PE_family->ESAT-6->ESAT-6->?->Ub+12xTM*->subtilisin->?->FtsK->?->PPE_family->PPE_family->PPE_family-><-?<-?||PE_family->          Mycobacterium tuberculosis F11                                 actinobacteria                                hypothetical protein MtubF_01001866 [Mycobacterium tuberculosis F11]
41406262        509       PE_family->?-><-?||?->?->?->FtsK->Ub+12xTM*->subtilisin->?->FtsK-><-?<-?||?-><-FtsK                                                                                              Mycobacterium avium subsp. paratuberculosis K-10               actinobacteria                                hypothetical protein MAP0164 [Mycobacterium avium subsp. paratuberculosis K-10]
1944601         509       <-subtilisin<-FtsK<-?<-subtilisin<-?*<-FtsK<-?<-?<-?<-?<-PE_family<-FtsK<-memb_associated                                                                                        Mycobacterium tuberculosis H37Rv                               actinobacteria                                PROBABLE CONSERVED TRANSMEMBRANE PROTEIN [Mycobacterium tuberculosis H37Rv]
31620222        467       <-ESAT-6<-ESAT-6<-?<-FtsK||Ub+12xTM*->subtilisin-><-memb_associated||cutinase->cutinase->                                                                                        Mycobacterium bovis AF2122/97                                  actinobacteria                                PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN [Mycobacterium bovis AF2122/97]
13883386        467       <-ESAT-6<-ESAT-6<-?<-FtsK||Ub+12xTM*->subtilisin-><-memb_associated||cutinase->cutinase->                                                                                        Mycobacterium tuberculosis CDC1551                             actinobacteria                                hypothetical protein MT3554 [Mycobacterium tuberculosis CDC1551]
41410338        452       <-cutinase<-cutinase||memb_associated-><-subtilisin<-Ub+12xTM*||FtsK->?->ESAT-6->ESAT-6->                                                                                        Mycobacterium avium subsp. paratuberculosis K-10               actinobacteria                                hypothetical protein MAP4240c [Mycobacterium avium subsp. paratuberculosis K-10]
92916372        475       <-ESAT-6<-ESAT-6<-?<-FtsK||Ub+12xTM*->subtilisin-><-memb_associated                                                                                                              Mycobacterium sp. KMS                                          actinobacteria                                conserved hypothetical protein [Mycobacterium sp. KMS]
92911534        475       <-ESAT-6<-ESAT-6<-?<-FtsK||Ub+12xTM*->subtilisin-><-memb_associated||?-><-?||?->cutinase->cutinase->                                                                             Mycobacterium sp. JLS                                          actinobacteria                                conserved hypothetical protein [Mycobacterium sp. JLS]
89338189        434       <-ESAT-6<-ESAT-6<-?<-FtsK||Ub+12xTM*->subtilisin-><-memb_associated<-?||?->cutinase->cutinase->cutinase->                                                                        Mycobacterium flavescens PYR-GCK                               actinobacteria                                conserved hypothetical protein [Mycobacterium flavescens PYR-GCK]
90203437        447       Ub+12xTM*->subtilisin-><-memb_associated<-?||?->cutinase->cutinase->cutinase->                                                                                                   Mycobacterium vanbaalenii PYR-1                                actinobacteria                                conserved hypothetical protein [Mycobacterium vanbaalenii PYR-1]
92917561        472       FtsK->memb_associated->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6-><-?<-subtilisin<-Ub+12xTM*<-FtsK<-DNA_binding                                                                Mycobacterium sp. KMS                                          actinobacteria                                Protein of unknown function DUF571 [Mycobacterium sp. KMS]
13093791        485       <-subtilisin<-Ub+12xTM*<-DNA_binding<-ESAT-6<-ESAT-6<-PE_family<-FtsK<-memb_associated<-FtsK                                                                                     Mycobacterium leprae                                           actinobacteria                                conserved membrane protein [Mycobacterium leprae]
31617055        472       FtsK->memb_associated->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->DNA_binding->Ub+12xTM*->subtilisin->                                                                         Mycobacterium bovis AF2122/97                                  actinobacteria                                PROBABLE CONSERVED TRANSMEMBRANE PROTEIN [Mycobacterium bovis AF2122/97]
13879797        472       FtsK->memb_associated->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->DNA_binding->Ub+12xTM*->subtilisin->                                                                         Mycobacterium tuberculosis CDC1551                             actinobacteria                                hypothetical protein MT0303 [Mycobacterium tuberculosis CDC1551]
41409884        480       FtsK->memb_associated->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->DNA_binding->Ub+12xTM*->subtilisin->                                                                         Mycobacterium avium subsp. paratuberculosis K-10               actinobacteria                                hypothetical protein MAP3786 [Mycobacterium avium subsp. paratuberculosis K-10]
92910002        476       <-subtilisin<-Ub+12xTM*<-DNA_binding<-ESAT-6<-ESAT-6<-PPE_family<-PE_family<-FtsK<-memb_associated<-FtsK                                                                         Mycobacterium sp. JLS                                          actinobacteria                                Protein of unknown function DUF571 [Mycobacterium sp. JLS]
92915201        476       <-subtilisin<-Ub+12xTM*<-DNA_binding<-ESAT-6<-ESAT-6<-PPE_family<-PE_family<-FtsK<-memb_associated<-FtsK                                                                         Mycobacterium sp. KMS                                          actinobacteria                                Protein of unknown function DUF571 [Mycobacterium sp. KMS]
89343513        532       <-subtilisin<-Ub+12xTM*<-DNA_binding<-ESAT-6<-ESAT-6<-PPE_family                                                                                                                 Mycobacterium flavescens PYR-GCK                               actinobacteria                                Protein of unknown function DUF571 [Mycobacterium flavescens PYR-GCK]
90205295        533       <-subtilisin<-Ub+12xTM*<-DNA_binding<-ESAT-6<-ESAT-6||?-><-PPE_family<-PE_family<-FtsK<-memb_associated<-FtsK                                                                    Mycobacterium vanbaalenii PYR-1                                actinobacteria                                Protein of unknown function DUF571 [Mycobacterium vanbaalenii PYR-1]
92917997        505       <-subtilisin<-Ub+12xTM*<-FtsK<-memb_associated<-?<-ESAT-6<-?<-PPE_family<-PE_family<-FtsK                                                                                        Mycobacterium sp. KMS                                          actinobacteria                                Protein of unknown function DUF571 [Mycobacterium sp. KMS]
89338337        473       <-subtilisin<-Ub+12xTM*<-FtsK<-ESAT-6<-ESAT-6<-PPE_family<-PE_family<-FtsK<-memb_associated<-FtsK                                                                                Mycobacterium flavescens PYR-GCK                               actinobacteria                                hypothetical protein MflvDRAFT_5459 [Mycobacterium flavescens PYR-GCK]
13092444        512       subtilisin->?->?-><-Ub+12xTM*<-FtsK<-ESAT-6<-ESAT-6<-PPE_family<-FtsK<-FtsK<-memb_associated<-FtsK                                                                               Mycobacterium leprae                                           actinobacteria                                putative membrane protein [Mycobacterium leprae]
2370277         480       subtilisin->?->?->?->?-><-Ub+12xTM*<-FtsK<-ESAT-6<-ESAT-6<-PPE_family<-FtsK<-FtsK<-memb_associated<-FtsK                                                                         Mycobacterium leprae                                           actinobacteria                                hypothetical protein [Mycobacterium leprae]
90202132        508       ESAT-6->?->?->?->?-><-?<-?<-Ub+12xTM*<-FtsK<-ESAT-6<-?<-?<-PE_family<-FtsK<-FtsK<-memb_associated<-FtsK                                                                          Mycobacterium vanbaalenii PYR-1                                actinobacteria                                Protein of unknown function DUF571 [Mycobacterium vanbaalenii PYR-1]
89340379        549       FtsK->memb_associated->FtsK->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->FtsK->Ub+12xTM*->?->?-><-?<-?<-?<-?<-?<-?<-subtilisin                                                  Mycobacterium flavescens PYR-GCK                               actinobacteria                                Protein of unknown function DUF571 [Mycobacterium flavescens PYR-GCK]
92915077        509       FtsK->memb_associated->FtsK->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->FtsK->Ub+12xTM*->?-><-?<-?<-?<-subtilisin                                                              Mycobacterium sp. KMS                                          actinobacteria                                Protein of unknown function DUF571 [Mycobacterium sp. KMS]
92909344        509       FtsK->memb_associated->FtsK->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->FtsK->Ub+12xTM*->?-><-?<-?<-?<-subtilisin                                                              Mycobacterium sp. JLS                                          actinobacteria                                Protein of unknown function DUF571 [Mycobacterium sp. JLS]
2960229         511       FtsK->memb_associated->FtsK->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->FtsK->?*->?-><-?<-?<-?<-?<-subtilisin<-FtsK<-?<-subtilisin                                             Mycobacterium tuberculosis H37Rv                               actinobacteria                                PROBABLE CONSERVED TRANSMEMBRANE PROTEIN [Mycobacterium tuberculosis H37Rv]
81252663        487       FtsK->memb_associated->FtsK->FtsK->PE_family->PPE_family->ESAT-6->ESAT-6->FtsK->Ub+12xTM*->?-><-?<-?<-?<-?<-?<-subtilisin<-FtsK                                                  Mycobacterium tuberculosis C                                   actinobacteria                                COG0477: Permeases of the major facilitator superfamily [Mycobacterium tuberculosis C]
54014302        493       DNA_binding->ESAT-6->ESAT-6-><-?||memb_associated-><-FtsK||Ub+12xTM*->subtilisin->FtsK->?-><-FtsK                                                                                Nocardia farcinica IFM 10152                                   actinobacteria                                hypothetical protein [Nocardia farcinica IFM 10152]
54014325        488       memb_associated-><-?<-subtilisin<-Ub+12xTM*||FtsK->?-><-ESAT-6<-ESAT-6                                                                                                           Nocardia farcinica IFM 10152                                   actinobacteria                                hypothetical protein [Nocardia farcinica IFM 10152]
68264440        383       <-ESAT-6<-ESAT-6<-?<-FtsK||Ub+12xTM*->?-><-memb_associated                                                                                                                       Corynebacterium jeikeium K411                                  actinobacteria                                putative membrane protein [Corynebacterium jeikeium K411]
84494284        443       <-FtsK||?-><-?<-?<-?<-ESAT-6<-ESAT-6||Ub+12xTM*->                                                                                                                                Janibacter sp. HTCC2649                                        actinobacteria                                putative integral membrane protein [Janibacter sp. HTCC2649]
29831983        451       Ub+12xTM*->?->FtsK->                                                                                                                                                             Streptomyces avermitilis MA-4680                               actinobacteria                                hypothetical protein SAV5440 [Streptomyces avermitilis MA-4680]
71369935        451       <-Ub+12xTM*<-ESAT-6<-ESAT-6                                                                                                                                                      Nocardioides sp. JS614                                         actinobacteria                                hypothetical protein NocaDRAFT_4675 [Nocardioides sp. JS614]
21224082        491       ESAT-6->?->?-><-?<-?<-?<-?<-?<-?<-FtsK||Ub+12xTM*->                                                                                                                              Streptomyces coelicolor A3(2)                                  actinobacteria                                integral membrane protein [Streptomyces coelicolor A3(2)]
29829069        502       <-Ub+12xTM*||FtsK->?->?-><-?<-ESAT-6<-ESAT-6<-?<-ESAT-6                                                                                                                          Streptomyces avermitilis MA-4680                               actinobacteria                                hypothetical protein SAV2527 [Streptomyces avermitilis MA-4680]



---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

II. Comprehensive alignments of different protein families described in the study

1. ThiS/MoaD/Ubiquitin

FINAL                                                  ---EEEE------------------E-----HHHHHHHHHH----------------------EEEEEE-----------------EE------EEEEEEEE---
ALIGN                                                  -----EEE---------------EEE-------HHHHHHHH----------------------HHEEEE-----------------EE-------EEEEE-----
HMM                                                    ---EEEEE---------------EEEE------HHHHHHHH----------------------EEEEEEE----------------EEE-----EEEEEEE----
FREQ                                                   ---EEEE-----------------------HHHHHHHHHHH----------------------EEEEEE----E-------------E-------EEEEEEE---
PSSM                                                   ---------------------------------HHHHHHHH----------------------EEEEE--------------------------EEEEEEE----

FINAL                                                  --EEE-------------------EEE----HHHHHHHHHH----------------------EEEEEE-----EE------E--EEEE----EEEEEEEE---\ThiS
2633522  Bacillus_subtilis_subsp_subtilis_str_168      -MLQLNG---------------KDVKWKKDTGTIQDLLASYQLE-----------------NKIVIVERN--KEIIGKERYHE--VELCDRD-VIEIVHFVGGG|
67939265 Chlorobium_phaeobacteroides_BS1               ITITLNG----------------QQREIQEGSTVEDILSIIGAE-----------------KQRVAVVVN--ENIVYPEKRGS--VLLREKD-QVEVLSFVAGG|
13879933 Mycobacterium_tuberculosis_CDC1551            MIVVVNE----------------QQVEVDEQTTIAALLDSLGFG-----------------DRGIAVALN--FSVLPRSDWATKICELRKPV-RLEVVTAVQGG|
29609756 Streptomyces_avermitilis_MA_4680              MNISVNG----------------ERRRIAPGTALDTLVKTLTAA-----------------PSGVAAALN--ETVVPRAQWSS--TALSEGD-RVEVLTAVQGG|
57865488 Staphylococcus_epidermidis_RP62A              MKCIING----------------DLFTFDQNQSIQEVLHSLELD-----------------PKRVIVELN--KELIKQDKYEE--YTVREDD-RLELLEIVGGG|
56909742 Bacillus_clausii_KSM_K16                      MRLVVNG----------------EERIS-ESTTLSELVSEFGLA-----------------SQLVVAEVN--GTIIDRVDWEA--TSLSEGM-KIELVHFVGGG|
17130691 Nostoc_sp_PCC_7120                            ITLQVNG----------------ETHNCSSPTPLPDLLQQLGFN-----------------PRLVAVEYN--GEILHRQFWEQ--TQVQSGD-RLEVVTIVGGG|
30138190 Nitrosomonas_europaea_ATCC_19718              MQLIING----------------QQQSYDGPMNVQQLVEKLSLQ-----------------NKRFAIERN--GEIIPRSRFPE--LLLNEGD-QLEIIVAVGGG/
FINAL                                                  -HHHHHHHHHHHHHH--------EEEE---HHHHHHHHHHHHHH--HHHHHHHH-------HHHHHHH-------------------------EEEEE------\MoaD
52696120 Pyrococcus_furiosus                           VKVKVKYFARFRQLAG----VDEEEIELPEGARVRDLIEEIKKRHEKFKEEVFGEGYDE--DADVNIAVN--GRYVSWD------EELKDGD-VVGVFPPVSGG|
10640172 Thermoplasma_acidophilum                      -MVTVRYYATLRPI------TKKKEETFNGISKISELLERLKVEYGSEFTKQMYDGNNL--FKNVIILVN--GNNITSMKGLD--TEIKDDD-KIDLFPPVAGG|
19915596 Methanosarcina_acetivorans_C2A                MKIHVKFLATIREITG----KPEIELEILPGDTVGTALQALQARYGPEFKEATTGTTAGG-IPKVRFLVN--GRNTDFLDGFE--TELKAGD-VMVFVPPVAGG|
11499216 Archaeoglobus_fulgidus_DSM_4304               -MVRVKLFANFRE-------AAGVKEVEVEAGTVGEVLQELVRRFPKLESLFYEEGRL---RDYVNIMVN--GRNVRGDLN----YPLSHTD-EVAIFPPVSGG|
75855800 Vibrio_sp_Ex25                                -MIKVLFFAQTRELI-----GIDSVELDDQFETVEAIRAHLVEEGADKNGKWDLALE----PGKLLAAVN--QSIVPLD------TEVKAGD-EVAFFPPVTGG|
26107155 Escherichia_coli_CFT073                       RMINVLFFAQVRELV-----GTDATEVAADFPTVEALRQHLAAQSDRWALALE--------DGKLLAAVN--QTLVSFD------HSLTDGD-EVAFFPPVTGG|
28868459 Pseudomonas_syringae_pv_tomato_str_DC3000     MKIEVQYFARYRETL-----GIDSESVEGEFVTLEVLRQHLLQRGEAWQVLA---------EQNLMCARN--QELCKLD------EPLLDGD-EVAFFPPVTGG|
4262375  Mus_musculus                                  CQIDVLYFAKSAEIAG----VRSETISVPQEIKASELWKELEMLHPGLADV----------RNQVIFAVR--QEYVELGDQQ---LLLQPGD-EVAIIPPISGG|
30681325 Arabidopsis_thaliana                          VEIKVLLFARARELTG----VPDLTLKMPSGSTTQKCLDELVLKFPSLEEV----------RSCVVLALN--EEYTTDS------AIVQHRD-ELAIIPPISGG/
FINAL                                                  -EEEEEE----------------EEEEE------HHHHHHHHHH--------------------EEEE------EEE---HHHH-HHHH-----EEEE------\Urm1
40889046 Mus_musculus                                  VSFKITLTSDP---------RLPYKVLSVPESTPFTAVLKFAAEEFKVP------------AATSAIITND-GIGINPAQTAGN-VFLKHGS-ELRIIPRDRVG|
71074940 Giardia_lamblia_ATCC_50                       IQVKIYKGFDP---------FYTYHVFNIPEASSTEKVIRLAARAFEIP------------QLEAVLINST-GDAIVPCQTILD-TCRRFGT-TLTVAHLKPII|
68352771 Theileria_parva                               VTFKIVLASDA---------NQPYKVLSVPEQAPFSAVIKFAAEEFRLN------------PATCAIITND-GVGINPTQTAGG-VFLKYGS-NLRLIPRDRVG|
15217447 Arabidopsis_thaliana                          VSFKVTLTSDP---------KLPFKVFSVPEGAPFTAVLKFAAEEFKVP------------PQTSAIITND-GIGINPQQSAGN-VFLKHGS-ELRLIPRDRVG|
72005426 Strongylocentrotus_purp                       VTFKITLTSDP---------KLPFKVLSVPESTPFTAVLKFAAEEFRVP------------AATSAIITND-GIGINPAQSAGN-VFLKHGS-ELRLIPRDRVG|
56112391 Chlamydomonas_incerta                         VTFKVTLTSDP---------KLPFRVFSVPEEAPFTAVLKFAAEEFKVP------------AQTSAIITND-GVGINPQQTAGN-VFLKHGS-ELRLIPRDRVG|
289769   Caenorhabditis_elegans                        VTFKITLTSDP---------KLPFKVLSVPESTPFTAVLKFAAEEFKVP------------AATSAIITND-GVGVNPAQPAGN-IFLKHGS-ELRLIPRDRVG/
FINAL                                                  --EEEEEE------------EEEEEEE----HHHHHHHHHHHHHH----------------------EEEE-EE--------------------EEEE------\RnfH
56312934 Azoarcus_sp_EbN1                              MKIGVAYSEPSH--------QVWLNLEVPDGTTVGAAIERSGILAQFPHID----------LTVQKVGVF--AKVVKLD------TPLRHGD-RVEIYRPITCD|
77389630 Rhodobacter_sphaeroides_241                   MIVGVAYAKPTV--------QVWKHVDVPEGTSAREAIERSGLLAQFPEID----------LAVNKVGIF--GAICPLD------RTLAEGD-RVEIYRPIHPE|
66047427 Pseudomonas_syringae_pv_syringae_B728a        IQIEVVYASVQR--------QVLKTVDVPTGSSVRQALALSGIDKEFPELD----------LSQCAVGIF--GKVVTDPAA----RVLEAGE-RIEIYRLLVAD|
67549235 Burkholderia_vietnamiensis_G4                 LSIEVCYALPDR--------QTLIPVSLPEGATVRAAIDASGVLALHPEID----------LAQAKTGVF--GKLAPLD------APLADHD-RVEIYRPLIVD|
68245723 Magnetococcus_sp_MC_1                         MRVAVTYAQPNR--------QLLLEFEVPEGTTAQQAVERSGILSKFPDIN----------LAEQKLGIY--AKLVEND------QVLEEGD-RVEIYRPAKGK|
71846749 Dechloromonas_aromatica_RCB                   MQIGVAYSEPSQ--------QIWLNIEVPDESSVKEAIERSGILKQFPHID----------LSTQKVGVF--GRLVKLD------AALKPGD-RIEIYRGIIAD|
59712607 Vibrio_fischeri_ES114                         IHVEVVYALPTE--------QVVFKLAVKAEQTVEEIIVQSGVLERYPEID----------LKVNKVGVF--SRNVKLD------STIRDKD-RIEIYRPLLAD/
FINAL                                                  --EEEEE-----------------EEE-------HHHEEE----------------------EEEEEEE----EE------------------EEEEEE-----\TGS
5107656  Escherichia_coli                              -MPVITL-------------PDGSQRHYDHAVSPMDVALDIGPGLA---------------KACIAGRVN--GELVDAC------DLIENDA-QLSIITAKDEE|
730881   Saccharomyces_cerevisiae                      VPLKIVLK------------DGAVKEATSWETTPMDIAKGISKSLA---------------DRLCISKVN--GQLWDLD------RPFEGEA-NEEIKLELLDF|
135177   Homo_sapiens                                  KPIKVTLP------------DGKQVDAESWKTTPYQIACGISQGLA---------------DNTVIAKVN--NVVWDLD------RPLEEDC-TLELLKFEDEE|
2983390  Aquifex_aeolicus_VF5                          EEVFVFTP------------KG-DLVVLPKGSTPVDLAYKIHTEVG---------------NHCAGAKSN--GRIVPLN------YELKSGD-VVEIITNPNKS|
1710082  Shigella_flexneri                             DRVYVFTP------------KG-DVVDLPAGSTPLDFAYHIHSDVG---------------HRCIGAKIG--GRIVPFT------YQLQMGD-QIEIITQKQPN|
416555   Drosophila_melanogaster                       RLQRIYTKPKGQLPD-----YNSPVVLHNERTSIEDFCNKLHRSIAKEFKYAL--------VWGSSVKHQ--PQKVGIE------HVLNDED-VVQIVKKV---|
2120160  Methanocaldococcus_jannaschii                 GFIKIYLKPQGKKPD-----FDEPLIMR-RGATVKDVCEKLHKDFVRNFRYAQ--------VWGKSAKHP--GQRVGLD------HKLEDGD-ILTIVIKR---/
FINAL                                                  ---EEEEHHHHHH---------EEEE------HHHHHHHH----------------------EEEEEEE------------------------EEEEE------\DUF82 fusion
71915653 Thermobifida_fusca_YX                         ASITLRFDPTLRPLLAPRNRTDLLHVNHDPAASLSHVVESLGVPL----------------TEIGELRIN--GTTASPS------QHPQPGD-LIEVLTVPKPQ|
20520977 Streptomyces_coelicolor_A32                   PEIHVEFAPELHLFVPRARPTGVASAATDGVSTLGHLVESLGVPL----------------TEVGALLVD--GREVPPG------HIPAGGE-SVRVRPVRHPQ|
54016307 Nocardia_farcinica_IFM_10152                  SGIELRLYAELNDFLPPQDRQDALWRPVRPHQTVKDIVEAAGVPH----------------TEIDLLLVN--GESVGFE------HHPRPGD-RLAAYPMFESL|
4981308  Thermotoga_maritima_MSB8                      KIAFFRFFGRLNDFFRN--SERIKTHRFTGFQTVKDRIEALGVPH----------------VEVSLITLN--GKPVGFD------HMVEDGE-LFFVYPEFQNI|
76785598 Mycobacterium_tuberculosis_F11                GYVDVRAYAELNEFVELQARGLTVRRPFRSHQTVKDVLEAMGIPH----------------TEVDLILVN--GDPADFS------YRPVAGD-RIAAYPMFEAL|
68554875 Ralstonia_metallidurans_CH34                  VTATFRFYEELNDFLAPAQRRRDLSCPCARAATVKHMIEALGVPH----------------TEVELILVN--GESSPFE------RIVCDGD-RIAVYPKFESF|
67666690 Burkholderia_cenocepacia_HI2424               ATASLRVVVELNAFLASQQRDRAFAHACARDATVKHAIEALGVPH----------------TEIGRLYVN--DAPAALD------RPLDDGD-RVEVLPERAGP/
FINAL                                                  -HHHHHHEEEEEEE---------E---------HHHHHHHHH------------------------EEEE-----------------E---E-EEEEEEEEEE-\ub-like
71847777 Dechloromonas_aromatica_RCB                   MTIAVNEIRRVFRY------NGVQLPD-VPGMEPKEVRDLYSAQY----------------PELISAEIE--A------------GDVVNGV-QEYTFRKAVGT|
67908730 Polaromonas_sp_JS666                          ILVSTTVLKRVFMS------NGNPLTDPDPSMSPAAVKDFWSAMY----------------PELLNAEVQ--G------------PVSKDGE-LTYTFHRTTGT|
84357757 Burkholderia_cenocepacia_PC184                --MEIETLAREFSY------NGAKLADPAPTFTLQQIRDFYSQTY----------------PELTNAEIE--G------------PVIKGNR-NVYTFRRAVGT|
17428677 Ralstonia_solanacearum                        --MQTIQLTREFRY------NGVRLADPSPQFTLEQVRDFYANTY----------------PEILNADID--G------------PSVEGTL-QVYGFRRAVGR|
29339958 Bacteroides_thetaiotaomicron_VPI_5482         MALDIKGLKRVFILKKGN--DTLTLEDPDSRMSLSEVTDFYSMNY----------------PELTTATLH--G------------PELEEDR-AIYRFKTTIGT|
71839548 Pelobacter_propionicus_DSM_2379               --MQITTLTRTFKY------NGATLRDPDPKQTPEQVKEFYSMAY----------------PELTTAVVE--G------------PEENNGQ-LQYSFRKGAGT|
38637971 Cupriavidus_necator                           MALEIKKLLRQFSY------NGMSFVDPGPAFTPEQVRDIYSAQY----------------PELTTASVD--G------------PEVKGEV-ASFTFVRAAGA|
84717440 Polaromonas_naphthalenivorans_CJ2             MALIAKTISRTFKF------NGMTLADPSPEMDMETVKRFYANQY----------------PELLNSVVE--G------------PVTKGTV-STYTFIRAVGA/
FINAL                                                  ---EEEEE--H---------HHH-EEEEEEE--HHHHHHHHHHHHHHHHHHHH--------EEEEE-------HHHHHHHH------------EEEEEEEE---\TAPI
76556246 Phage_BP_4795                                 PLARICLHGDL---------QRFGRRLSLYVNTAAEAIRALSLQVPGFRRQMNEGW-----YQIRIAGDDT-APEAVYARLH---EPLGEGT-VIHIVPRLAGA|
215124   bacteriophage_lambda                          GMARICLYGDL---------QRFGRRIDLRVKTGAEAIRALATQLPAFRQKLSDGW-----YQVRIAGRDV-STSGLTAQLH---ETLPDGA-VIHIVPRVAGA|
11877308 Neisseria_meningitidis_phage_2120             -MITVCLYGGL---------REYGRRFVLHVETPAEALHALFTQIKGLRQRIRDGV-----YQVRFDGKDQ-SEETIGSV-----FRRPADG-VLHIVPRVQGA|
45686326 Enterobacteria_phage_T1                       DVKVIKLSGSLG--------RRFGVFHRYAVDSYPEAIRALSSQVDGFKEYMQSEVGSRSKFAIFVDGVNV-GHHEE--------EKFKCAK-EIRIVPIPTGS|
71834086 Bacteriophage_JK06                            NVIDVKLGLGLG--------RKFGKLHKLCVKTVPEAMRALSVNIPEFKEFMRSHVGQNTRFAVFVDGKNV-NEHKI--------NDLETVS-EIRIMPIPQGR|
46402106 Bacteriophage_phiKO2                          VMTRIELSGILG--------KKFGAYHERLVSTTSEGIQALCCTIDGFEKFLNNSKEKGLTFAIFKGKKNI-GKDDL--------GFPVNGD-VIRIVPVIIGS|
9634139  Enterobacteria_phage_HK022                    VMTRIELSGVLA--------KTYGRVHHRLVRTTAEAINALAKTINGFEKFLNTSKARGLTYAVYRDKKNI-GVDDL--------GFPVTGE-VIRIVPVVIGS|
17975181 Bacteriophage_phiE125                         TFRTIRLYGVLG--------GRFGRVHRLAVSSTAKAVRALSVLIPGFRAFLTSARDGGLTFAVFNGRRNL-GEDEL--------EHPVGRD-EIRIAPVIVGS|
77864688 Burkholderia_cepacia_phage_Bcep176            KLREVRLYGIAG--------TRFGRVHRLAVSSTAEAVRALSVLLPGFRKFLLEARDNGLTFAVFNGRRNL-SQDDL--------TAPVGDE-AIRIAPVIIGS/
FINAL                                                  --EEEEEE--------------EEEEEEE--HHHHHHHHHH-----------------------EEEEE----EEE----EE---EEE-----EEEEEEE----\TAPI+protein J
85716602 Nitrobacter_sp_Nb_311A                        PAATVSVYGTTHPLNAVA--GARIHCRVPAGWSITEILGEALSHKPGWHR-----------RRDLIVRIN--DHIIPEENWSR--VRVKQGA-TVTFIPRLQDG|
66392071 Xanthomonas_campestris_pv_pelargonii_phag     THQVIVSPHPVVVDD-----QKNLILAFKQGESLFEILSRSVDNFE---------------EREWVVTIN--GRRVPVEMWTK--AFPKPGH-IIEVR--GNVG|
33568295 Bordetella_bronchiseptica_RB50                MPALMVVHNPFVASEG----RKAYCAAFLPGETLGRYCERMGVALP---------------SRVVNVWHN--GRPVPLALWQR--LIPRQGD-QVVIRAKGEGG|
46449977 Desulfovibrio_vulgaris_subsp_vulgaris_str     KADVVSVTGCPHPFRP----GDRVHDVVPVGGTLESIVVRGLDDMGVPEAL----------RGCGHAFVD--GEYVPRDRWAD--VTPRAGS-TVTYRLVPAGG|
67545284 Burkholderia_vietnamiensis_G4                 QSAVVLLRNPFQP-------SQREVMVAHPTQTIRQWLGAQGIAEF---------------DQPTVCIKN--DAPVLRTDWAV--T-PIDG--VVLFITLPQGG|
23015894 Magnetospirillum_magnetotacticum_MS_1         TASVIIIANPFEPV------ASRSVHAIVAGVTVGELLLDCGIDPDRW-------------ADGPEIRIN--GNVVAAEIFAV--RVIGEDE-IISIIRWPLGG|
78033450 Magnetospirillum_gryphiswaldense              TASIVIVTNPFEPV------ASRSVHAVESGITLGGLLQACGIAEDCW-------------SDGPEILIG--GMTVPVGIYAV--RAIVDGE-VVTVIRWPQGG/
FINAL                                                  ----EEE-----------------EEE----HHHHHHHHH-------------------------EEEEE-------------------E------EEEEEE--\fusions to E1-like proteins
57168916 Campylobacter_coli_RM2228                     --MRIKFN--------------GKELDTKLSTSLDFFKSVSK-------------------NENDVWIIN--GFAT---------KENIKIH-ENDELFCIERN|
57166736 Campylobacter_jejuni_RM1221                   -MMRVKFN--------------GKELDTDFKTSLEFFENISK-------------------NENDVWIIN--GFAT---------KENIALN-EDDELFCIERN|
71837115 Pelobacter_propionicus_DSM_2379               -MIQIRLN--------------EKTIMVDDGLTLAMLAKQRR-------------------PGADVLILN--GFPA---------EDDTQIN-DGDAVFLIKRG|
77544308 Pelobacter_carbinolicus_DSM_2380              --MHIWIN--------------EQPHNISEDARLFEMRDRFK-------------------PQADVVILN--GFPV---------TSDRPLS-NGDRIVLIRRG|
68178158 Desulfuromonas_acetoxidans_DSM_684            --MIIVLN--------------ENKIQVEENQSLFDLRDQIK-------------------PEADVLICN--GLPI---------QSDRTLQ-PFDHVILIRRG|
18145265 Clostridium_perfringens_str._13               --MNIKIN--------------EKWREVKENCTVYALKNEEF-------------------PDSHVIVLN--GFPL---------VEDKKLK-DGDRIVFIKKG|
28203841 Clostridium_tetani_E88                        --MKIYVN--------------EIFLNVEEDIDVFKLKNKIK-------------------KDADIVIYN--GFPI---------NNNIVLK-PLDRIVFIKRG|
77683437 Alkaliphilus_metalliredigenes_QYMF            --MKLIVN--------------EDEMDVKKGTTAFEVRNKVK-------------------KDADVVVYN--GFII---------KEDVLLQ-EGDLITLIQRG/
FINAL                                                  -EEEEE--------------EEEEEEEE-----HHHHHHHHH---------------------EEEEEEE---EEE-------------------EEEEEEEE-\TmoB
48094248 Pseudomonas_sp_OX1                            ATFPIMSNFERD--------FVIQLVPVDTEDTMDQVAEKCAYHSINRRVHPQP-------EKILRVRRHEDGTLFPRGMI----VSDAGLR-PTETLDIIFMD|
78693154 Bradyrhizobium_sp_BTAi1                       ALFPLQANFRGD--------FVVLLVPVDDGDTMSVVADKVAQHAVGLRVAE---------KNASKCVYHN-GKELPSAIT----VAQSGIQ-PMDWIEVAYV-|
68556036 Ralstonia_metallidurans_CH34                  ALFPLSSNFEGD--------FVLQLVAVDTENTMDEVAAAAAHHSVGRRVKARP-------GHILRVRQQGSKECLPRTMK----VADSGLK-PTECVEVIWEP|
45479222 Pseudomonas_mendocina                         SAFPVHAAFEKD--------FLVQLVVVDLNDSMDQVAEKVAYHCVNRRVAPR--------EGVMRVRKHRSTELFPRDMT----IAESGLN-PTEVIDVVFEE|
71849051 Dechloromonas_aromatica_RCB                   ALFPLTSNFEGD--------FVLQLVAVDSENTMDEVAAAAAHHSVGRRVRARP-------GQILRVRRQGGEEFLPRTMR----VSESGLK-PTETVEIIWEA|
86565792 Frankia_sp_CcI3                               ALLPLSAVFEHD--------FVSLLVAVDDADTVEVVGQKIAHHVVGRRLPAS--------DAPVGIRHN--GQVLAREAR----IGEAGVG-PLDHVEAFFDE|
72122837 Ralstonia_eutropha_JMP134                     ALFPVISNFQYD--------FVLQLVAVDTENSMDEVAAAAAHHSVGRRVAPQP-------GKVVRVRRQGGDQFYPRDAR----IGDTDIK-PMESLEFIFCD/
FINAL                                                  EEEEEE-----------------EEE-------HHHHHHHH---------------------EEEEEE------EEEE--HH---HHHH-------EEEEEE--\repeat
84711628 Polaromonas_naphthalenivorans_CJ2             VVADEQLN-------------DRHLDLRDPVPTGRQILQAAEVRPVA--------------DYSIYAILPS-GEFEDLRLDE---TYDLRGR-GAERFVIFQTD|#1
69928899 Nitrobacter_hamburgensis_X14                  EVAGTDLA-------------FGPVIIRDRTPTGAQIAAAAGLTPAQ--------------DPYVLSFLPD-GELVEILASE---TVDLDE--GRRRFIVTSAD|
17134587 Nostoc_sp_PCC_7120                            KHYLVRID-------------DRSYKVDDPVITGGQLLDKASKRPVD--------------EYLIFQMLNN-GQLEEIRLDE---TVELRKP-GIERFITWRSD|
38423904 Synechocystis_sp_PCC_6803                     QQFRIQVD-------------QQQLMIPDPVPTGRQILEIAQKRPAD--------------EFLVFYLLPS-GQLEEIRLDE---TVDLRQT-GIERFITFRSD|
28806071 Vibrio_parahaemolyticus_RIMD_2210633          FFALDSLQ-------------FRSLSVQDPVPTGRQLIEIAGLDSFD--------------DYSLFAILPS-GDFEDIRLNE---TVDLRAR-GVERFIAFKTD|
68554444 Ralstonia_metallidurans_CH34                  ------LN-------------FIKIEIDDPVPLGRQVLTAAGMHGDD--------------NYSLFNILES-GDFEDVRLDE---QIDLRRP-GAERFIAFKSD|
77690161 Rhodopseudomonas_palustris_BisB5              RGMEYPVN-------------GAMAAFPDNVVNGREVLTRSGLVPAS--------------EYRLI-LVRN-GRTRLIGTDD---DVDLDKE-HGGSFRAFLSD|
39651045 Rhodopseudomonas_palustris_CGA009             LIADESFN-------------FRSFPFDDRQVTGAQIGEVFGAHPIS--------------DFVIIQQLES-LELETLRPTE---LADLRKS--VRFFV-IRGD|
14025878 Mesorhizobium_loti_MAFF303099                 TNFTFKLD-------------GRVVATNDAIISGREVRALGGLDPAS--------------DYILIQIADR-TS-RSIGLEE---AIDFREM-PHSEFLSFQGD|
77961668 Yersinia_mollaretii_ATCC_43969                LFAQENLA-------------FRAIEVNDPVPLGRQILIAAGLRAND--------------DYSLFAILET-GDFEDLRLDE---TFDLRGR-GAERFVAFQTD/
FINAL                                                  --EEEEEE---------------EEEE-----HHHHHHHH----------------------EEEEEEEE-----EEE--------EE-------EEEEE----\repeat
84711628 Polaromonas_naphthalenivorans_CJ2             RAFKFTID-------------DRQMEWGKPSISGKILKVLAGVPTD---------------TYDVYLEVRS-GGQDVLIRDTD--LIDLSKP-GIERFITLIRD|#2
69928899 Nitrobacter_hamburgensis_X14                  RSYRLTVD-------------GEQYDWPARMVTGATVRKLARVPAE---------------FL-VYLERQD-EPDRLIGNQD---IVNLGDK-GVEHFHARKQT|
67547440 Burkholderia_vietnamiensis_G4                 --YKIRID-------------KDYYVVDVPHMTGEQILGLAGKTSA---------------GY-LLSEKVH-GQMRPVAPAQ---TVDFTAH-GVERFATIPKE|
17134587 Nostoc_sp_PCC_7120                            RSFRFVID-------------GRRFEWGAPIITGLKLKELAGVDLA---------------SYGVWLELRG-AEDRPIADNE---SVDLQAP-GVERFFTGKKT|
38423904 Synechocystis_sp_PCC_6803                     RSFRFVID-------------GRRFEWGIPLISGLKLKQLAQVSPQ---------------AYGVWLEVRG-GEDRPIADHE---TVNLEAP-GVERFFTGKKT|
28806071 Vibrio_parahaemolyticus_RIMD_2210633          RDFKFSLK-------------GRQIVWGKSEIDGSDLYFLADV-AD---------------EQAIFLDVRG-GTDRLIEPDD---TVDLSEA-GIEHFVVADKP|
68554444 Ralstonia_metallidurans_CH34                  RNFKLTVN-------------GSQVVWGRPTISGADLYALSKP-AD---------------GEAVFMVVSG-GEDRQIERED---DVDLAAP-GVERFENAPKR|
77690161 Rhodopseudomonas_palustris_BisB5              RDFGFTVD-------------EVGQVWGTADMEVDEFLRIWPQHPE---------------HR-WVLERDD-EPDTVLTPGG---VLSFGPK-GVEHVVSRKDA|
39651045 Rhodopseudomonas_palustris_CGA009             ATYTFIVD-------------GLTMVWPKKTITGKAVKMLTNKDED---------------DIEVLLERED-RPDKVIGDDD---DIQLAAD-GVEKLKTRYAK|
14025878 Mesorhizobium_loti_MAFF303099                 RAFSFTVN-------------ERGWEWGSATISAADIYRYASIDED---------------LE-LIL--DS-AGDTVIPADG---AVTLGGQ-GVERIRSREAK/
FINAL                                                  -EEEEEEE------------------------HHHHHHEE---------------------EEEEEEEEE-----------------EEEEE--EEEEEEE---\repeat
14025878 Mesorhizobium_loti_MAFF303099                 KTVVIKVN-------------GRSRTVPRRKHSYREIALLAYPDA-NFEK-----------FKYTITYLKG-VHGA-EGDLVE--GENIEVKNGMVFNVRRSDK|#3
28806071 Vibrio_parahaemolyticus_RIMD_2210633          PDYIITVN-------------SREHVLDDPNVTYEQIVSFEFQYPPSNPN-----------TCYSMTYRHA-KSKPHAGELAAG-GSVIVKKKGTVFNVTATDK|
68554444 Ralstonia_metallidurans_CH34                  PKVVIIVN-------------GTKEELPAPLVTFDQLVALAYPGQPPQPG-----------ITYSITYYKV-ASYPHQGPMAP--GGSVEAKNGSIFNVGRTIQ|
39651045 Rhodopseudomonas_palustris_CGA009             TTVTIIVE-------------GTPHKWDKKKISYAEVVTLEVSDYEHHPD-----------ITYSVNFTNG-PHNRPEGDLAK--GESVKVRDGMIFSVSETGQ|
88795473 Alteromonas_macleodii_Deep_ecotype            KIFEIIVN-------------GRMKSVEDKFLTFVEIVKLAFGEFKECQN-----------QIYTMTFKRG-VGKK-EGSLVL--GDKVRIKDGVIFNVTATNK|
86566459 Frankia_sp_CcI3                               KTVEIIVN-------------GRRRTVVKGELSFDEVVALAFDPVPAGDN-----------VDFTITFRRG-HGDKPEGTLRP--GGTVKIKEGMIFDVTATDR|
69928899 Nitrobacter_hamburgensis_X14                  QNVLIEIA--------------TPTVVVAD--AMRQAGFDPAQPWHIFLKVQDQ-------TKREVAANYV-LDLRTPGIEKLR-LIPKDVNNGEACAPR----/
consensus/100%                                         ........................................................................................................
consensus/95%                                          ....................................h............................................................h......
consensus/90%                                          ....h...........................s...h............................................................h......
consensus/85%                                          ..h.h...........................s...h................................p...........................h......
consensus/80%                                          ..h.h...........................o..ph...h.........................h..ps.......................h..h......
consensus/75%                                          ..h.l.h.........................o..phh..h.........................h..ps....................p..hp.h....ss
consensus/70%                                          ..h.l.h.........................o..phhp.hs........................h..ss....................p..lp.h....ss
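
# Note on reading the alignment block above programmatically.
# In this block each sequence row carries a gi, an organism name, and a gapped
# sequence ending in '|' (or '/' on the last row of a family), optionally followed
# by a repeat tag such as '#1'. Each FINAL row gives a secondary-structure summary
# and, after the backslash, appears to name the family that follows it. The sketch
# below is one possible way to group the rows by that label; it is an illustration
# under those assumptions, not part of the original analysis. The names Row,
# parse_alignment and ungapped are hypothetical helpers introduced here.

```python
import re
from collections import OrderedDict
from typing import NamedTuple


class Row(NamedTuple):
    gi: str
    organism: str
    aligned: str   # gapped sequence without the trailing '|' or '/'
    repeat: str    # repeat tag such as '#1', or '' when absent


# Gapped sequence, then a '|' or '/' terminator, then an optional '#n' tag.
_SEQ = re.compile(r"^([A-Za-z-]+)[|/](#\d+)?$")


def parse_alignment(lines):
    """Group sequence rows under the label of the preceding FINAL row.

    Assumption: the text after the backslash on a FINAL row names the
    family that follows it, as the layout of the block suggests.
    """
    families = OrderedDict()
    current = "unlabeled"  # rows seen before any labeled FINAL row
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("consensus/"):
            continue
        if line.startswith("FINAL"):
            label = line.partition("\\")[2].strip()
            if label:
                current = label
            continue
        tokens = line.split()
        if len(tokens) < 3 or not tokens[0].isdigit():
            continue  # not a "gi organism sequence" row
        match = _SEQ.match(tokens[-1])
        if match is None:
            continue
        # Organism names written with spaces are rejoined with underscores.
        organism = "_".join(tokens[1:-1])
        row = Row(tokens[0], organism, match.group(1), match.group(2) or "")
        families.setdefault(current, []).append(row)
    return families


def ungapped(aligned):
    """Remove gap characters to recover the raw residue string."""
    return aligned.replace("-", "")
```

# Calling parse_alignment on the lines of the block returns an ordered mapping
# from family label to its rows; ungapped(row.aligned) gives the residues of one
# row with the alignment gaps removed.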

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

2. UBC/E2-like domain
^^^^^^^^^^^^^^^^^^^^^
                                                         Helix-1         Str-1                Str-2                Str-3                             Str-4  |                                  *                  *                                     Helix-2                                        Helix-3       Helix-4
Secondary Structure                              -hHHHHHHHHHHHHHh--------EEEEE----------------EEEEEEE--------------EEEEEEE---------------------------EEEE-----------------------------------------------------------HHH-------------------------------HHHHHHHHHHHHH-----------------------------------HHHHHHHH--h---hhhHHHHhhHHH
1ayzA_Ubc2_Scer_3659954                          TPARRRLMRDFKRMKE---DAPPGVSASP----------LPDNVMVWNAMII----GPADTPYEDGTFRLLLE------------FDEEYPNKPP-----HVKFLSE---------------------------------MFHPNVYAN---------GEICLDILQ----------------NRWTP------TYDVASILTSIQSLFN---------------DPNPASPAN-----------VEAATLFKDHK---SQYVKRVKETVE
1Q34A_Ubc_Cele_34810893                          TPSRRRLMRDFKKLQE---DPPAGVSGAP----------TEDNILTWEAIIF----GPQETPFEDGTFKLSLE------------FTEEYPNKPP-----TVKFISK---------------------------------MFHPNVYAD---------GSICLDILQ----------------NRWSP------TYDVAAILTSIQSLLD---------------EPNPNSPAN-----------SLAAQLYQENR---REYEKRVQQIVE
2E2C_E2-C_Ssol_4388942                           HSVSKRLQQELRTLLM---SGDPGITAFP----------DGDNLFKWVATLD----GPKDTVYESLKYKLTLE------------FPSDYPYKPP-----VVKFTTP---------------------------------CWHPNVDQS---------GNICLDILK----------------ENWTA------SYDVRTILLSLQSLLG---------------EPN-NASPL-----------NAQAADMWSNQ---TEYKKVLHEKYK
1QCQA_Ubc4_Scer_5107650                          MSSSKRIAKELSDLER---DPPTSCSAGP----------VGDDLYHWQASIM----GPADSPYAGGVFFLSIH------------FPTDYPFKPP-----KISFTTK---------------------------------IYHPNINAN---------GNICLDILK----------------DQWSP------ALTLSKVLLSICSLLT---------------DANPDDPLV-----------PEIAHIYKTDR---PKYEATAREWTK
2AAK_Ubc1_Atha_2981894                           TPARKRLMRDFKRLQQ---DPPAGISGAP----------QDNNIMLWNAVIF----GPDDTPWDGGTFKLSLQ------------FSEDYPNKPP-----TVRFVSR---------------------------------MFHPNIYAD---------GSICLDILQ----------------NQWSP------IYDVAAILTSIQSLLC---------------DPNPNSPAN-----------SEAARMYSESK---REYNRRVRDVVE
1PZVA_Ubc_Cele_34811307                          EQSSLLLKKQLADMRR---VPVDGFSAGL---------VDDNDIYKWEVLVI----GPPDTLYEGGFFKAILD------------FPRDYPQKPP-----KMKFISE---------------------------------IWHPNIDKE---------GNVCISILH---------DPPEEEEERWLP------VHTVETILLSVISMLT---------------DPNFESPAN-----------VDAAKMQRENY---AEFKKKVAQCVR
1I7KA_Ubch10_Hsap_13786748                       GPVGKRLQQELMTLMM---SGDKGISAFP----------ESDNLFKWVGTIH----GAAGTVYEDLRYKLSLE------------FPSGYPYNAP-----TVKFLTP---------------------------------CYHPNVDTQ---------GNICLDILK----------------EKWSA------LYDVRTILLSIQSLLG---------------EPN-IDSPL-----------NTHAAELWKNP---TAFKKYLQETYS
2UCZ_Ubc7_Scer_2981900                           KTAQKRLLKELQQLIK---DSPPGIVAGP---------KSENNIFIWDCLIQ----GPPDTPYADGVFNAKLE------------FPKDYPLSPP-----KLTFTPS---------------------------------ILHPNIYPN---------GEVCISILHSPGDDPNMYELAEEEEERWSP------VQSVEKILLSVMSMLS---------------EPNIESGAN-----------IDACILWRDNR---PEFERQVKLSIL
1J7DB_hUbc13_Hsap_15825811                       AGLPRRIIKETQRLLA---EPVPGIKAEP----------DESNARYFHVVIA----GPQDSPFEGGTFKLELF------------LPEEYPMAAP-----KVRFMTK---------------------------------IYHPNVDKL---------GRICLDILK----------------DKWSP------ALQIRTVLLSIQALLS---------------APNPDDPLA-----------NDVAEQWKTNE---AQAIETARAWTR
1JASA_Hsubc2b_Hsap_34809571                      TPARRRLMRDFKRLQE---DPPVGVSGAP----------SENNIMQWNAVIF----GPEGTPFEDGTFKLVIE------------FSEEYPNKPP-----TVRFLSK---------------------------------MFHPNVYAD---------GSICLDILQ----------------NRWSP------TYDVSSILTSIQSLLD---------------EPNPNSPAN-----------SQAAQLYQENK---REYEKRVSAIVE
1KPSA_Ubc9_Hsap_20150955                         GIALSRLAQERKAWRK---DHPFGFVAVP-----TKNPDGTMNLMNWECAIP----GKKGTPWEGGLFKLRML------------FKDDYPSSPP-----KCKFEPP---------------------------------LFHPNVYPS---------GTVCLSILE-----------EDDDDKDWRP------AITIKQILLGIQELLN---------------EPNIQDPAQ-----------AEAYTIYCQNR---VEYEKRVRAQAK
1KPPA_Tsg101_Hsap_21465897                       YKYRDLTVRETVNVIT------LYKDLKPVLDSYVFNDGSSRELMNLTGTIP----VPYR--GNTYNIPICLW------------LLDTYPYNPP-----ICFVKPT----------------------------SSMTIKTGKHVDAN---------GKIYLPYLH-----------------EWKHP-----QSDLLGLIQVMIVVFG---------------DEPPVFSRP----I------SASYPPYQATG---PPNTSYMPGMPG
1JATA_Ubc13_Scer_14719686                        ASLPKRIIKETEKLVS---DPVPGITAEP----------HDDNLRYFQVTIE----GPEQSPYEDGIFELELY------------LPDDYPMEAP-----KVRFLTK---------------------------------IYHPNIDRL---------GRICLDVLK----------------TNWSP------ALQIRTVLLSIQALLA---------------SPNPNDPLA-----------NDVAEDWIKNE---QGAKAKAREWTK
FINAL                                            HHHHHHHHHHHHHHH------------------------------EEEEEEE----------E----EEEEEE----------------------------EEE--------------------EE--------------EE-----------------EEEEEE-------------------------------HHHHHHHHHHHHH-----------------------------------------HHHHH----HHHH--------
_Rsp._22726448                                   TAGEARLIRECEELAS---LAAASAWLEEP-----QFGKNADGLLTWSFVLL----------AGDRRIPLRLV------------FPALFPDLPP-----FVLPADS-----------------------------SVRLSQHQYGEG----------GELCLQYRP----------------DNWHP------DCKSADVVRSAKALLE---------------ATPKDDGFS------------DVESAHPTDL---PSLLSGCSRRFM
OB2597_05120_Obat_84499281                       LVDSARLAAERRSIEQ----AAAGEWFRFA------RWTLHHGLVCVEGEIL----------AHDNTYPVRLI------------YPDQFPLVPA-----WVEPAEK------------------------------ARWSSHQYSG-----------GSLCLELRP----------------DNWIP------TATGADVLESAFNLLH-----TEDPLGEGGATAPSDHRVG------------EVQTYGDLHL---PALIGAGCLDRL
RHE_CH01997_Retl_86357617                        LNNTVRVAREKEAVEN---LATETEWFVLD------RWEIHDYKFAAIGSIV----------AHGATYPIRLV------------YPDNFPLVPA-----WVEPQDP-----------------------------EAKWSYHQYGKG----------GALCLELRP----------------DNWTS------RANGADVLRSAYGLLN----LENPLGDGEKGKVTSAHNVG------------EIQKYNWGES---PVFIGQECLTRL
y4oA_Rsp._2496721                                RLTEVNVLKRGSDQDN---WWQAYPGLYAR-----ELAAYEGHGASHRPLIQ----------QDGTLILEVLWP-----------MDSAGSIRLN-----VGYSPLH-------------------PFCRPSISAPELQLERHQNPFT----------RDLCLLTQDS---------------AQWYPH---QMVADFIAERLSQVLQVM-------------------T----------------LRRNEQWSEA---ASLEEQAPDPVT
y4qC_Rsp._2496738                                PAGRRRLAELQKLHSA------AGESLLVD-----EEAAAAGILRIEFSWPL----------NDGRTIGLRAV------------YPDTFPRLRP-----HVFLTCD----------------------------PSEYPERHCGSE-----------GALCLLGRDT---------------RYWQAN------MSLAELLDENLAHVL---------------DGT-------------------GAEDPQGEP---IEYWWNSLGQAS
ROS217_07909_Rsp._85706659                       RTAQDHSAHDFGVMDA---WERVREVLAGH-----GFTLVPGSGRDRYQGQI----------KVGSVPVSLEIE-----------IADYDFLDLP-----KVRVLKR--------------------------EALPKRLTGHIVSD-----------GTLCYADKAT---------------FLLDRY----QPDRSVVSCLEQARTTL---------------NTLLHG---------------NPSVAYMAEL---AAYWSATPYCL-
_Cper_86475968                                   -MVILILDLFNSLNSF---ENIKNVKEIKK-----NNDNFEVNYSKIYEFTL----------NIQKQNFDIIMC-----------IPEEWNLKLI-----DFYIKDY----------------------------KNIKFIPHLEEN-----------GKICLFDKEG---------------LLVEEN----LNGIAIESIERLNKVLY---------------EGLNDI----------------NKLDFINEF---DAYWNLLSTNNI
GuraDRAFT_0469_Gura_88937743                     DESLLKEALETCLLVK---SVAELHPKRLA-----EPWAKDRFVCRSYKLVI----------ELNGVPVDFYFG-----------VKKSFPLSLP-----YIFLAQW----------------------------DSFGILPHVETD-----------GYICYAQEDG---------------SVLDFD---DVAGIAQEALSRAIQVVV---------------DGISGK----------------NHQDFLDEF---GAYWDRLKKVKF
Psyc_1372_Parc_71038912                          -MMSELHQTMLSCGFK---YLKNSQRQSIS-----FFDSIPTTRPIYVKDYK----------TSEGIFNVALV------------FGDDLYTTLP-----RAQVLKK-------------------------PKKIEQVLLPHINSG-----------GYLCYVEEKE---------------ADWNPN----NLNALYRAVDEQVQNTL---------------NTAISSLQNG----------QIDQAEFEGEF---VSYWKPEQTIY-
ELI_04040_Elit_84786718                          -----FRFRMMSLADR---WRAIAATLANK-----GFTEQQGASPEFRGSIN----------VHGRAVDIELV------------IPDSKFVELP-----IVRLVDR--------------------------KQLPAGAFGHISRDDIEG-------SVVCFAPATG---------------LPLDFH----DPGGSVLRVLRQTELSL---------------EKSFAGQG---------------GAEVAAEY---QEYWIEKEPNFR
_Ecol_37927532                                   MKDGQLHQVMTGCGYR---YTRARNLPEKS-----ILHSRERGAGYYTKEYA----------TDAGNFNVALV------------IHPDPFTELP-----TAFIIEQ-------------------------PEQFKSCLMPHVALE-----------GFLCYVEQME---------------ADWDSN----DLEATYKEVDAQIHQTL---------------IDSVSAATQG----------VNDKRELEGEF---AAYWRPSETLFL
VC0180_Vcho_9654584                              -MKQELHHTLLGCGFR---YTPAKQMPKGI-----LLDTKSRRKGYYVKEYS----------TKGGVFVIALV------------LWNDPHIQLP-----FAYILQQ-------------------------PEQYKGRLLPHINFG-----------FCLCYVTQME---------------ADWNSN----DLKSTYQDVDEQIQLTL---------------DNSVASVESG----------TSNDVELEGEF---SAYWQSEEELYL
PB2503_00627_Pber_84701417                       GVISEARTALADRLGA---YLLSAFDAQPF-----SASDLQAYNGKKVDRGW----------RLPGDPPLHLL------------LDPEFPYAPP-----RIALPDE----------------------------TQRLLWPHVETA-----------GLLCVFPTQ----------------TNIDAF---EPEKVATALITDARDLIT---------------RNQSGD----------------LDEEFRKEF---QSYWTLAIDDKA
Shewana3DRAFT_3199_Ssp._78684828                 ------LERHRGHSVL---SEIKQHLINQG-----FNCTTSEVAGGERIVVE----------TTILNHGIQLML-----------VADPPYYRLP-----EFFLINP----------------------------DSIGRLAHVSVHEYAGIQI----GTVCVNAPES---------------LSVNFE---QPLLVVEESLRRHILLLE---------------KCITNPD--------------WNHSELLREF---SSEWLRICAPDS
ArthDRAFT_2172_Asp._66965723                     WERYAGLLQSEISWLQ---DLGIACRIDET-----KRDDHQTLTMELSVPET----------VTGTAPLELTAV-----------FPDFYPLVPP-----KVFAVDL--------------------------------GMPHHWNPFS---------NEVCLLGTPS---------------EEWGTN------GSLAQLLKDQLPAAL---------------KAGMSGDEH----------ADWNEKPQAEPF---GAYYNSYANSAM
FINAL                                            HHHHHHHHHHH----------EEE-------------------EEEEEEEE---------------EEEEEEEE--------EEE---------------------------------------------------------------------------EEEEE-HH---------------HHH------HHHHHHHHHHHHHHHHHH-------------------------------------------------------------
Mdeg02000735_Mdeg_48864353                       IHDVIRWLDETRSVAG--IQTVTSSDDGV------------VVATNWRVDLP--IRFESEGETESGIRSIEAV---------SWVFPWEYPLRAP-QPKLREDFPLT---------------------------------LPHINPVVEGED------ISPCIAEVDL-----------TDLLHSSGI-----EAVFGAMTHWLNNAASGEL-------------LCPVQGWE------------PVRRDNASGLI---SADTYAIREELN
SYN_01833_Saci_85859492                          AQEELREIEAASEGAF--EVLSVRFPEGD------------HRSAIAEISVT--CFDMPYAEGGIKLRDRERF---------LIYIPPDFPFDVPSVYTPHRRFSG----------------------------------NPHVQWQ-----------TYLCLYQSRN-----------TEWDASDGM-----FGFISRLELWLRRAALNQL-------------DMEGAPLH------------PPVAYPTERIT---------------
Mmc1DRAFT_1998_Masp_68246513                     ALEQVADIVAASNGTV--ELVQIDPPTSE------------GDTLLLRVSID--TSDYTFQKGGLKFRKREGF---------HIRVSSRFPIEPPIAKFTHQRFMG----------------------------------QAHVQWG-----------NQICLYLATD-----------VEWSASDGM-----FGFIKRLDQWLGDAAQDQL-------------DPDDAPLH------------PPAVYHSSDTK---FSVEIDTPELAD
pCPF5603_46_Cper_86559649                        NDDFTMFYKGLLECKN--VKNITIYKLNI------------NSVIIRLELKI--NLPSRRSLMEFDIKEFEPIK--------LLCSTNEIKYKAPLVFSDRNDFPVE--------------------------------KLPHTLAMGLNY-------SYICLHRGNI-----------DDWYIDHSV-----EDFVNRIRFWFSDAACNNL-------------IKPGDDFE------------PMINYTETGNI---VYSYNKLTKFIE
RmetDRAFT_0537_Rmet_68559822                     IADALHQLQRHRGLIR--VGEPRTTGAST------------EIEVDVAVQLP--NRSRRNGISETGVRTVETC---------VLVFGSDWPLSAP-EPFLRADFPLN---------------------------------LPHINPHRQGEL------VSPCLFEGSL-----------NELLHRFGL-----DAVVDQLIDWLHKAAAGTL-------------LDLEQGWE------------PTRRDSCPSTV---VFSAEKVAAAAP
MaquDRAFT_3270_Maqu_77955723                     HIQMLVAAILQHQRSE--DHQVTERENEL------------VLDVSWRVQLS--SRDVEVGQSGTGIKRLEPV---------RFLIPFAFPLRPP-DITLRSDFPRE--------------------------------FVPHIYPGSPGDP------VCPCIAEVGI-----------TDLMFQEGI-----SGVLRSLQAWLDRAAQGTL-------------MDPSQGWE------------PILFQNIAGSF---LDDKGSFLRGVR
Nwi_2872_Nwin_74421923                           AERFLAAALRHPECRG--GRLISVDAGGS------------RIELDLNVEMP--LAFKVDGASPNGVRVVETV---------NVRLWPSYPWSSP-SFYLRMDFPRD---------------------------------LPHVQPGPVTEP------PRPCLIDGNQ-----------REYFFQFGLVELGIFNLVHQLVLWLQRAAEGTL-------------IHHGRGWE------------PTLRCDLNDVI---ALNAEACRAVVD
XAC3952_Xaxo_21110358                            DGRMQALLRACNAHAD--INVVELRRIED------------PFIAEIIVADV--GDGAVSPGNDAGIHRIERM---------ALLYRTGARFPFE-ARPLRKTFPKA----------------------------------LHQYATGNEGP------PSLCIMEGDW-----------ELAEHRFTP-----EALLETLLAWLEKTADGTI-------------HEADRGLE------------PVFYSLGQCLM---LPPDFAEALSDP
MaquDRAFT_3597_Maqu_77955313                     NLPEPLSDLADACNDN--SDFDIVEFRRI------------SKDSYALVVDA--GDGTFDAENPVGIRRIERL---------AFVLNPNLGFPWE-VRALRSDFPVT----------------------------------MHQNHVEPNSP------RSLCLYVEPW-----------SSVERTWSP-----QSFLARALWWLRETACENL-------------HQANQPLE------------QLFFEPADQFV---LPEDYFERLTDT
PnapDRAFT_0071_Pnap_84717800                     RAKTLFDVVSRQRDYA--VVQLLQHCDDG------------TPKLECIVVEV--ECDGVPPKNGVGINYRERL----------ALCVSDDPKQLIEVLAMRKDFPVL----------------------------------MHQNQGILDAP------ASLCLYFESV-----------AAVMRTWTP-----QSFLRRIQWWLEKSARGEL-------------HPTDQPVE------------HLFFATRYELV---LPWNLSTLRKSA
OB2597_18097_Obat_84502025                       LTSSAAASFARFVDRH--AAELAAIVALR------------RGGAGELVELA--FRTGRPQQSVVPIRRTERI-----------GVRFAGGDSMPFVYVLRSDFPDT----------------------------------AHQNLTAEGSP------RAICIDDRGW-----------AEARLTWTP-----AELVQRILAWFRRAAEGAL-------------HDARQPVD------------PLMFGTGYNII---MSRALIDNANTQ
GOX2518_Goxy_58038271                            RSRLARSVIEYVCDSV--EHPYATIQEFQ------------SDGLSDIVDLE--LEIDLAQDRAVPIRHREPV---------RIVFASPDDLIAPRVLSLREDFPSG---------------------------------QVHTNLDREVDG------LCLCIWEEGW-----------HDLSRNLTG-----QALVERIRWWFAGMADGSL-------------HADDQILE------------PLVATTSDTIV---FPLGTFVGPWFI
RSP_2047_Rsph_77387013                           DEEIPDVLHPVTSLLR--IGVGPVTALEG------------WKEWRRGFFSL--PLVARVTISPGQSFPAESR-----------WHLVVSSGSYPA---DIFILPDK--------------------VAGPNLT------FPHQAAVYSRDGKEPWLNGEPCLTDPTAAFGDR------HGSRPEPIAL---ADRLIWKVERFSRWCELAAA-------------GRLHNPGD------------HFELPPLSGHT---NPMTIGFHETEG
FINAL                                            HHHHHHHHHHHHHH-------------------------------EEEEEEE---------------EEEEEE-EEEE----EE----------------E---EEE------------EEEEEEEE----------EE-----EE-------------EEEEE-----------------------H---------HHHHHH-HHHH-----------------------------------EE-HHHHHHHHH----------------
Ava_C0067_Avar_75705484                          EREGKESKYKFLSPE-----AVEKAFTSK-TAAS-------GWLSSNTIWWG---------KNPEGEAIIQFYSPQKYQIQIMGQEPEVITVPMP-----AFLFAGCSS----------RYYLWAIKGRVF-KPDTQLYKPPLPNVWED---------SSICFGG-------------------NSLS----MCSAATISQVWDLFWKSPFNKDLSQGKS-----KTHPDNIC--------------NQLIKLHESKA-KSYPSSDLVPVH
alr7559_Ana_17134644                             EREGKESKYKFLSPE-----AVEKAFTSK-TAAS-------GWLSSNTIWWG---------KNPEGEAIIQFYSPQKYQIQIMGQETEVITVPMP-----AFLFAGCGS----------RYYLWAVKGRVF-KPDAQLYKPPLPNVWED---------SSICFGG-------------------NSLS----MCSAATISQVWDLFWKSPFNKDLSQGKS-----KTHPDNIC--------------NQLIKLHESKA-KSYPSSDLVPVH
p1B75_Asp._56315656                              TVDGLRKMFDSLDPS----RSARPVFLEP-NVLS--------QGPGWLVWWM-----------KPQTRRVWFES--------KEIKLETAEVPHP-----GLVFAVTQE----------EWRVFAVQGRSRPRPGTKLYQAPYWNVWKG---------GRICAGS-------------------ARLP----SAGLQADPSGWEESFFSSR--------------FSHPNIHEKDALVKYKGG--SAKFWNAMLSGKF-KSFPQEVLVPAE
BCE_A0096_Bcer_44004435                          TFKDFYLALKEVMEQGTQDNTHYSSGVLPKGCIKH--EVLSKSGDKQAVWIE----------VPKAQWDIHFFE------------RPFQQVGFP-----RLLFRYTVYQKRVT-----NISVFAVKEDMELEEGMKLYQFPYSNVHPS---------GSVCTGR-------------------VVIP----EFRTLKDLETFHVLFFASS--------------FNHDLTHTHTEP--------VGELFKRFEN----QSFDDSILMESE
BT_2648_Bthe_29339960                            TYEFMNSLVESYTES----MSGIPHGRIPGNMLLC----DSRKGRERYIWYN-----------PPQKRKMYFQD---------GLHITDGTFNVP-----GVIYVVERE----------CMDIHAFKGA-IPEERTELYLAPFFNVAG----------ANVCLGSSS-----------------PKKPQ---DMDFLEFQEYWEKRFWMSE--------------FSHLGGNRNP----------TRSNLVSVTEHARNNPFDYSELQQSG
BproDRAFT_4305_Psp._67908644                     TLTSKNLKLLAQQAQQ---GLKQDFEVIPANVLV--------ANDSLLAWWM-----------PKGTQLMSFDVSMHELAGKSRLQGVSGNVPTP-----ALVFAMMRNRNAGGAFE--GLYVFALEKSERPTSDTSLYRAPLLNVGED---------GSVCWGD-------------------GVKP----AGKTVKDISAWQALFFSSV--------------FTHYNGTVPIVGDD------PYAFIADLMETEA-KEFPAAALKPMK
RferDRAFT_4144_Rfer_74024822                     KKDSLMAALRQLARQQ---GISDLVWVDD-QTIA--------TSSTLQVWWT-----------PAQSRWMHFQS---------QGLQLSLPAQNP-----PLVWLACGE----------CLMVFALKENIKPGPTTALHHAPLFNVFAN---------AEVCAGS-------------------MQKP-------KDGNAKEWVESFYAAT--------------FTHANPPSRRLTTYRQG---EKALWKHLMTSKKKPAFPTDKLKPFG
BproDRAFT_0623_Psp._67910471                     TEADYLAMVKVLAPQQ----RPQMEWQDH-CILA--------KGMGKMIWWT-----------PPMNRAMFFKKS---DMFGATTFSGQGICPLP-----GMVWMSDGR----------DLFVYAYRGSAMPGKETRLCQAPLFNVWAR---------GEVCVGN-------------------ASRP----DDSAKGNPQAWERFLFDSH--------------FTHPNFAQVDRLTKGVK---PAEFWKKMVAKP-AQKFPESVLVDLE
PnapDRAFT_0124_Pnap_84717439                     TQSDLNELVTGLSQSQ---SLSVPSWIDT-TMLA--------LGAGRMIWYT-----------PACQRAMFFKTS----SFTKDTFEAQGQLPTP-----GLVWLVMQG----------ALYVYAYKGSGRPDKETKLYQAPFFNVWSQ---------GKVCTGN-------------------AAMP----VGDNAAIPHMWVDAFFGSN--------------FTHPNFKEKDRLVKGVC---PIDFWKAMTEKP-LPVFPEGRLVDLP
RSc1659_Rsol_17428675                            SLGELSEFVEAAQTA-----TAYRGFIEP-HVLY--------LAPNTVAWWR-----------PAAPRTVWFSAE-------KPIGTRHGVTAHP-----PLVFIVHER----------QWYVFALAKNERPAPNTPLHVAPYFNVWER---------GEICTGN-------------------VSLP----DRPAPDALKAYETAFFDSR--------------FTHPNHARITRHKDG-----GGALWAHLLDHPEITEFPATALLPRK
RmetDRAFT_6238_Rmet_68559357                     NRMALIHAVRQVAANA----LPKGEFLTP-NVLS--------ISATTVTWWC-----------PAASRRVFFKCE--------EFGERNAIVAHP-----ALVFQASHS----------GFSVFALQGEDRPGPETALFEPPYFNTWDH---------GRICIGS-------------------AQVP----KQIDVASISGWEEGFFNSA--------------FTHPNHGGKRVAYERG----VYAFWKDMLDGKF-PDFPKQVLVPMK
PHG308_Cnec_38637969                             NRMALIHAVREVAEAS----LPNGEFLTP-NVLS--------ISPTAVTWWC-----------PAAQRRVFFDCK--------EFGKRSAVVPHP-----ALVFQASQS----------GFRVFALRGDERPVPASELCEPPYFNTWDH---------GKICIGS-------------------AHVP----KQIDVASIAGWEAGFFNSA--------------FTHPNHGSKRVTYERG----AYAFWKDMLDGQF-PDYPKQVLVPMK
Bcep1808DRAFT_6253_Bvie_67543573                 DRKVLVQTLQQLAEHV----APRAEFLPA-TVLG--------VSPEAVTWWC-----------PPAMRRVFFECE--------NLGKRSAVVPHP-----GLVFQALNQ----------GFRVFAVACSDRPVRETPLFEPPYFNTWDM---------GRICIGS-------------------AQVP----KRVDVASIDGWEAGFFDSA--------------FTHPNAGGKRIEYKDG----EYAFWRDMLDGKFGETFPLNALVPMK
Daro_2538_Daro_71847775                          TPRAAMDLAKALLKR-----AAHGGFLPE-TVLY--------MDGDLIVWWM-----------PPARRHIAFRVD-AEQAEAFGGQERGESVPHP-----GLVFAASSR----------VWRVWAVKGAGRPTPATALFQVPYFNVNVQ---------GNICHGN-------------------APVP----EGTTVEKIAAWNDAFLRSY--------------FTHPNGPGKLIRYRGG----AYTFWRDMLDGRF-QRFPERVLVDVK
PproDRAFT_0257_Ppro_71839550                     DVEMLGTLINALGRN-----VSIGGYLPP-NILS--------VGFDSMVWWV-----------KPSKRRVFFKTN------EEIIGERSEVVPHP-----GLVFGVNGSG---------VWAVCAVKGNTRPTEDTPIWQAPYFNVWSS---------GNICTGT-------------------IETP----KSVAVTETGKWEECFFSSY--------------FSHPNAHGSRQLINSRIN--PYQFWKTVLDGKY-KTFPTQKLVQTN
RBTH_06715_Bthu_75758403                         NTLFEFVQKNCYETKTNTKKLDIPVFETP-A-----------LPPGTVKYMALPDGKI-----VLFMEKKEFKHNL------TYHSTKYKQIPFP-----NLLFVFVFRPNGDKYILE-NKRCYAFRDKVF-RDTTKLYRFPFSHVQKD---------GEMCFFF---------------------LT----EMQDLAQMSSFIHNWLSAA-------------FTDHYYNLENKNKW-------GWPLRQIFSETQGQPHFNYDKLIEED
RBTH_07326_Bthu_75758953                         NTNIETIQQIFMKEQA------METPLLP-------------SQWGVVKYYRKNHYEGYVLTTPPTERVVKFDIG------RSSELPTEVTLPIP-----PMLWVFEVMTDQSGKKKLTHSMTYVIKHELL-SLKDKVFHAPFCNIGIS---------HGICWGR--------------------TLP----EVPIPKSIQSIPARFFSQPFNYDLSGNRVKPFEWTHPNGNTEDTECAVYHMMNEADKLKAAKEAGEAYSYPFDSLKPAG
FINAL                                            ---------HHHHHHH-------EEEEEE---------------EEEEE-------------------EEEEE----------------------------EEEE----------------------HHHHH-------------EE-------------EEEEEEE-------------------------  ---HHHHHHHHHHHHHHHH---
PnapDRAFT_3950_Pnap_84711628                     TEGLAALPEADQRYLD---SHGFTVEVVS----------DGPHTGVVLKQMQ----LPQGK-FNHPAADVLVI------------LPPGYPDVAP-----DMFFCNL--------------------WLTLVSAGRYPTCADQPHTFM----------GHNWQRWSRH--------------NNSWRP------GVDGLHTMIKRIEHALAEAK---------------------------------------------------------
sll6054_Ssp_38423903                             --VMTFLPESDRQYLA---NKDYTYEEIT----------EGSRKGLIFSKFP----LPNQK-YDVSEVDLLIL------------LPNGYPDIVP-----DMFYLEP--------------------AVKLVQGNRPPRATEARQQFN----------GRSWQRWSRH--------------EREWRR------GVDGIWTMLKRVEHALEVAA---------------------------------------------------------
alr7503_Ana_17134588                             --VMSFLPSNDRQYLE---NRGLPFEEVV----------DASQKGVILREFQ----LPLGR-FDTEQADILIL------------LPSGYPDAPP-----DMFYLLP--------------------WVKLVQGAKYPKAADQPHQFN----------GQKWQRWSRH--------------NNEWRP------GTDGIWTMLKRIENALEVAA---------------------------------------------------------
NhamDRAFT_1902_Nham_69928899                     PRQAFALLPVDERHLD---TMGLKWETVV----------DGGRRWLLIEGYP----VPEG--YNAAVVTLALE------------IPGPYPGAQI-----DMFYVHP--------------------ALRRLVGEEIP-ATQATETVL----------GRIFQRWSRHRGP-----------NSPWSS------RLDNVMTHLTLVDGALAKEVNQ-------------------------------------------------------
Bcep1808DRAFT_3228_Bvie_67547440                 VRADFTVMEEDAEFLN---SKGYTWEAVA----------SDAKR-IVVRGFE----PPQG--FAPTKVDMFVI------------LPQGYPDTQI-----DMVYFSP--------------------PLTRNDGKPI--RSLVTNEFE----------GKTWQGWSRHRTA-----------NSPWRQ------GIDNVGTHLMLVDDFLRAELSK-------------------------------------------------------
FINAL                                            EEHHHHHHHHHHHHHH-HHHHHHHHH-------------------EEEEE------HHHHHHHHHHHHHHH-----------------------------EEEEEEE----------------------------------EEEEEE----------------EE--------------------EE--------------HHHHH--------------------------------------------HHHHHH---HHHHHHHHHHHH
y4jF_Rsp._2496664                                AFDDQAASCAEGQATL-DLAVRLLARLYP----------------VLAILPL---DSASSFQAQALERLAKSI--------------------NPK----IGIRRSGKS------------------------------AMVCLVAGATRP-------SLRCTTFF------------------IGS-------------DGWAAKLSRT---------------DPVGSGSSLL----------PYGAGAASCFG---AANVFRTIFAAQ
mll6192_Mlot_14025925                            AFDDQAASCAEGQATL-DLAVRLLARLYP----------------VLAILPL---GSAASFQAQALERLAKSI--------------------NPK----VGIRRSGKS------------------------------ATICVVAGVTRP-------PLRCPTFF------------------MGS-------------DGWAAKLSRT---------------DPVGSGSSLL----------PYGAGAASCFG---AANVFRTIFAAQ
msi105_Mlot_20803932                             AFDDQAASCAEGQATL-DLAVRLLARLYP----------------VLAILPL---GSAASSQAQALERLAKSI--------------------NPK----VGIRRSGKS------------------------------ATICVVAGVTRP-------PLRCPTFF------------------MGS-------------DGWAAKLSRT---------------DPVGSGSSLL----------PYGAGAASCFG---AANVFRTIFAAQ
RHE_PA00014_Retl_86359719                        AFDEQACA-TEGRASL-DLLVRLVARLYP----------------TICLLPS---GEEAKKLAKNLASLARSI--------------------NED----ITIARRGSS-----------------------------ALSHCLVVGSTNP-------EISCPKFF------------------LGS-------------DGWIAKFSPE---------------EPVGTAGSNN----------AFGAGAAACIA---ASNLFRHIFRDQ

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

3. The JAB domain
^^^^^^^^^^^^^^^^^
                                        EE.HHHHHHHHHHHH.........EEEEEEEE......E..................EEEEEEEEEE..........................................................................EEEEEEEEE..........HHHHHHHH..................EEEEEEEE........................EEEEEEE.....EEE..E.EE....
                                        EE.HHHHHHHHHHHH.........EEEEEE...............................EEEEEE..........................................................................EEEEEEEE...........HHHHHHHH...................EEEEEE..........................EEEE....................
AF2198_Aful_11499780                    SRGLLKTILEAAKSA-----HPDEFIALLSGSK-----------------------DVMDELIFLPFVS-------------------------------------GSVSAVIHL-------------------DMLPIGMKVFGTVHSHPSPSC--RPSEEDLSLFTRFG---------------KYHIIVCY-----------PYDEN--------SWKCYNR----KGEEVELEVVEKD-
PH0451_Phor_14590365                    RRELLEYLLELAKSF-----YPREVAGFLRMKDG-----------------------VFEEVLIVPKGFF------------------------------------GESSVYFDL-------------------TLMPHDESIKGTFHSHPSPFP--YPSEGDLMFFSKFG---------------GIHIIAAF-----------PYDED--------SVKAFDS----EGREVELEVID---\Archaeal JAB
MK0214_Mkan_20093654                    DARLLDSLLEASDKN-----HPDEFFAMLGGSI------------------------DAETITIDSLIVVP-----------------------FEA---------SDSGAIFDL--------------------LSVHTCDVIGTFHSHPYGDP--VPSEDDLMLFKRLG---------------AVHAIAAY-----------PYTPD--------RVEFYDK----SGRNITPVVEVRYT|
MA1736_Mace_20090588                    LLYMQIKGIARDTLD-----FILEASKSMAPEEFAGLL-------------------QEQDGIITEVLILP----------------------GTES---------SNTNAVIR--------------------LYMMPNVKAVGSVHSHPGANR--RPSKADLRLFSKTG---------------NCHIIAGR-----------PYGRE--------SWTCYDR----EGNVRDLPVLDVEF|
MTH971_Mthe_15678989                    FKPVRRVVVDSEVMD-----EVLEIARRSHPHEFAALLEGRQ---------------EGEVLHVTGLIFLP-----------------------SET---------SDEGAVMDV-------------------LMLPPFTGAVGSVHSHPGPVN--LPSAADLHFFSKNG---------------LFHLIIAH-----------PYTME--------TVAAYTR----NGDPVDFEVVP---|
VNG0778C_Hasp_15789943                  GGRPSVLGIAEDALE-----FAREAAQDSHPDEYLGLLRATPASAFDLD--------ADDGYVVTDVLVIP----------------------GTET---------NPVSATFGS-------------------TQVPNDMRNVGSIHSHPNGVL--APSDADRSMFGKG----------------QLHIILGH-----------PYGPD--------CWRAFDS----EGEPRTTTVLDVDL/
Z1657_Ecol_15801143                     STRAAREWLILNMAG-----LEREEFRVLYLN-------------------------NQNQLIAGETXF-----------------------TGTINRTE------VHPREVIK--------------------RALYHNAAAVVLAHNHPSGEV--TPSKADRLITERL----------------VQALGLVDI----------RVP----------DHLIVGG----NQVFSFAEH-----\RadC
radC_Bsub_16079856                      SPEDGANLVMEDMRF-----LTQEHFVCLYLN-------------------------TKNQVIHKRTVF-----------------------IGSLNSSI------VHPREVFK--------------------EAFKRSAASFICVHNHPSGDP--TPSREDIEVTRRL----------------FECGNLIGI----------ELL----------DHLVIGD----KKFVSLKEK-----|
yfjY_Ecol_16130559                      STQAARDWLKLKMAG-----LEREEFMMLYLN-------------------------QQNQLIAHETLF-----------------------AGSISSTE------VHPREVVK--------------------RALYFNAAAVILAHNHPSGDT--TPSQADKTITQRL----------------VQALQLVDI----------RVP----------DHLIVGG----RQIYSFAEH-----|
radC_Mace_20090827                      SPKDVYALMYPRMRE-----QKKEKFITLYLD-------------------------TKNQILKEEVVS-----------------------IGSLNASI------VHPREVFK--------------------SALLESSASVIMVHNHPSGDP--SPSREDIMVTEKL----------------VEGGKLLGI----------DIL----------DHIIIGD----GRYVSLKDE-----|
radC_Ecol_6686314                       SPEMTREFLQSQLTG-----EEREIFMVIFLD-------------------------SQHRVITHRRLF-----------------------SGTLNHVE------VHPREIIR--------------------EAIKINASALILAHNHPSGCA--EPSKADKLITERI----------------IKSCQFMDL----------RVL----------DHIVIGR----GEYVSFAER-----|
RSc2620_Rsol_17547339                   SPAAVKEYLRAKLAG-----FEHEVFAVLFMD-------------------------TQHRLIEYAEMF-----------------------RGTIDGAS------VYPRELVK--------------------EALRLNAAAVIVSHNHPSGNP--EPSGADRALTQRL----------------KEALGLVDV----------RVL----------DHVIVAG----TDTTSFAER-----|
VC1786_Vcho_15641789                    RTENTTEYLRCKLAG-----YEHEIFAVLFLD-------------------------NQHRLIEFKELF-----------------------RGTVDAAS------VYPREVLK--------------------EALNVNAAAVIFAHNHPSGDP--EPSQADRRITQRL----------------KDALSLVDI----------RVL----------DHVVVGK----SS-VSFAER-----|
radC_Smel_15965481                      SWSAVIDYCHAAMAH-----ETKEQFRILFLD-------------------------KRNTLIADEVQQ-----------------------QGTIDHTP------VYPREVVK--------------------RALELSATALILVHNHPSGDP--TPSRADIDMTKLI----------------AEAAKPLGI----------ALH----------DHVIIGK----DGHVSLKGL-----|
radC_Paer_15600512                      SPQAVRDYLKARLRH-----EQHEVFACLFLD-------------------------TRHRVLSFEVLF-----------------------QGSIDGAS------VYPRQVVK--------------------RTLAHNAAALILTHNHPSGDA--RPSLADRQLTARL----------------KEALALIDV----------RVL----------DHFIIGD----GEPLSLAEY-----|
radC_Rsol_17547163                      SPQSVKDFLRLTLGH-----RPQEVFACLFLD-------------------------VRHRLIAWEELF-----------------------QGTLTEAR------VYPREIAK--------------------RALHHNASALILSHNHPTGHV--EPSESDLVLTREL----------------CRALALLDV----------RVL----------DHMIVGR----AEVYSFLEH-----|
radC_Atum_17935503                      SWSSVIDYCHAAMAH-----ETREQFRILFLD-------------------------KRNVLIADEVQG-----------------------QGTVDHTP------VYPREIVR--------------------RALELSSTALILIHNHPSGDP--TPSRADIEMTKTI----------------IDTAKPLGI----------TVH----------DHIIIGK----DGHASFKGL-----|
radC_Ssp_16331325                       SPEAAAIALSQDLMW-----QTQEHFAIVMLD-------------------------VKNRLLATKVIT-----------------------IGTATETL------IHPREIFR--------------------EVIKQGATRLIVAHNHPSGGL--EPSPEDIRLTEFL----------------LQGAQYLQI----------PVL----------DHLILGH----GKHQSLRQC-----|
radC_Cace_15894524                      SPKEAANLVMEQLRS-----FNKEHLYVIMLN-------------------------TKNIVIKISDVS-----------------------VGSLNSSI------VHPREVYV--------------------EPILKHAASIILCHNHPSGDP--KPSNEDLNITKRL----------------YECSKFIGI----------ELL----------DHIIIGD----GIYISLKEE-----|
TM1557_Tmar_15644305                    DSSVKVYKYCQEMVY-----LEREIVKVICLD-------------------------TKLNVIGENTLT-----------------------VGTSDRSL------IHPRDVFR--------------------TAIRANASGVIVVHNHPSGDP--TPSKEDRLITERL----------------KQAGEILGV----------SLV----------DHVIVSR----RGYFSFREE-----|
radC_Aae_15606726                       RNPQEAFEFLKDKFD-----ERRESLIALYLD-------------------------LSNRLLDWEVVA-----------------------IGNVNTVF------SKPKDILF--------------------KAVKLSANGIIIAHNHPQGEP--SPSNEDLNFTERL----------------KKACELLGF----------ELL----------DHLILSE----GRYFSFREE-----/
COPS5_Hsap_12654695                     SALALLKMVMHARSG-----GNLEVMGLMLGK------------------------VDGETMIIMDSFALP--------------------VEGTETRVNAQAAAYEYMAAYIENA------------------KQVGRLENAIGWYHSHPGYGC--WLSGIDVSTQMLNQQFQ------------EPFVAVVID----------PTRTI---SA---GKVNLG-----AFRTYPK-------\Euk JAB
RRI1_Scer_6319985                       SKLSCEKITHYAVRG-----GNIEIMGILMGF------------------------TLKDNIVVMDCFNLP--------------------VVGTETRVNAQLESYEYMVQYIDEMYNHNDGGDGR--------DYKGAKLNVVGWFHSHPGYDC--WLSNIDIQTQDLNQRFQ------------DPYVAIVVD----------PLKSL---ED---KILRMG-----AFRTIES-------|
PSMD14_Hsap_5031981                     SSLALLKMLKHGRAG-----VPMEVMGLMLGEF-----------------------VDDYTVRVIDVFAMP--------------------QSGTGVSVE-----AVDPVFQAKMLDML---------------KQTGRPEMVVGWYHSHPGFGC--WLSGVDINTQQSFEALS------------ERAVAVVVD----------PIQSV----K---GKVVID-----AFRLINA-------|
Rpn11_Tbru_18463065                     SSLALLKMLMHGRAG-----VPLEVMGLMIGEL-----------------------IDDYTVRVSDVFSMP--------------------QTATGQSVE-----AVDPEYQVHMLDKL---------------SVVGRPEKVVGWYHSHPGFGC--WLSGEDVMTASSYEQLT------------PRSVSVVID----------PIQSV----R---GKVVID-----AFRTTKD-------|
_Ddis_2104757                           SSLALLKMLQHARAG-----VPLEVMGLMLGEL-----------------------IDEYTIRVIDVFAMP--------------------QSGTSVSVE-----AIDPVFQTKMLDML---------------KQTGRDEIVIGWYHSHPGFGC--WLSSVDVNTQQSFEQLQ------------SRAVAVVVD----------PLQSV----R---GKVVID-----AFRTIKT-------|
ECU11_0570_Ecun_19074857                SSLALLKMLKHGRAG-----IPLEVMGLMLGEF-----------------------VDEYTVKVVDVFAMP--------------------QSGTNVTVE-----SVDPIFQMEMMSIL---------------KATGRHETVVGWYHSHPGFGC--WLSTVDISTQQSFEKLC------------KRAVAVVVD----------PIQSV----K---GKVVID-----AFRLIDN-------|
RPN11_Scer_14318526                     SSIALLKMLKHGRAG-----VPMEVMGLMLGEF-----------------------VDDYTVNVVDVFAMP--------------------QSGTGVSVE-----AVDDVFQAKMMDML---------------KQTGRDQMVVGWYHSHPGFGC--WLSSVDVNTQKSFEQLN------------SRAVAVVVD----------PIQSV----K---GKVVID-----AFRLIDT-------|
C6.1A_Hsap_1168719                      ESDAFLVCLNHALST-----EKEEVMGLCIGELNDDTRSDSKFAYTGTEMRTVAEKVDAVRIVHIHSVIIL--------------------RRSDKRKDR----VEISPEQLSAASTEAERLA-----------ELTGRPMRVVGWYHSHPHITV--WPSHVDVRTQAMYQMMD------------QGFVGLIFS----------CFIEDKNTKT---GRVLYT-----CFQSIQA-------/
Stambp_Mmus_17941277                    NLCSEFLQLASANTA-----KGIETCGVLCGKLMR----------------------NEFTITHVLIPR----------------------QNGGPD-------YCHTENEEEIFF------------------MQDDLGLLTLGWIHTHPTQTA--FLSSVDLHTHCSYQMM-------------LPESIAIVC----------SPKFQET------GFFKLT-----DYGLQEI-------\Euk JABs
SPAC19B12.10_Spom_19115685              LLKKVFLDVVKPNTK-----KNLETCGILCGKLRQ----------------------NAFFITHLVIPL----------------------QEATSD-------TCGTTDEASLFE------------------FQDKHNLLTLGWIHTHPTQTC--FMSSVDLHTHCSYQLM-------------LPEAIAIVM----------APSKNTS------GIFRLL----DPEGLQTI-------|
CG2224_Dmel_7301945                     DTMEVFLKLALANTS-----KNIETCGVLAGHLSQ----------------------NQLYITHIITPQ----------------------QQGTPD-------SCNTMHEEQIFD------------------VQDQMQLITLGWIHTHPTQTA--FLSSVDLHTHCSYQIM-------------MPEALAIVC----------APKYNTT------GFFILT----PHYGLDYI-------|
Stambpl1_Mmus_17390801                  DLCHKFLLLADSNTV-----RGIETCGILCGKLTH----------------------NEFTITHVVVPK----------------------QSAGPD-------YCDVENVEELFN------------------VQDQHGLLTLGWIHTHPTQTA--FLSSVDLHTHCSYQLM-------------LPEAIAIVC----------SPKHKDT------GIFRLT-----NAGMLEV-------|
1039_Ddis_2582351                       HGEVFQEFMRLAENNTK---RSIETCGILSGTL------------------------SNDVFRITTIIIPK--------------------QEGTTD-------TCNTIEEHEIFE------------------YQLENDLLTLGWIHTHPTQDC--FLSAVDVHTHCSYQYLL------------QEAIAVVIS----------PM-----------ANPNFG-----IFRLTDP-------/
AF2198_Aful_11499780                    SRGLLKTILEAAKSA-----HPDEFIALLSGSKD-----------------------VMDELIFLPFVS-------------------------------------GSVSAVIHL-------------------DMLPIGMKVFGTVHSHPSPSC--RPSEEDLSLFTRFG---------------KYHIIVCY-----------PYDEN--------SWKCYNR---KGEEVELEVVEKD--\Archaeal JABS-2
VNG1818a_Hasp_16554503                  TREGYDSVLDHAQAD-----TPREACGVFVGE------------------------RDGDLRRVTAVRRVP--------------------NVADAPRV------RYELDPEATLAVFD---------------EAAAVGREVVGFYHSHPVGPG--RPSATDREHAQ------------------WPDRVYVVA----------SLAARPPILD---AWLWTGE----AFER----------|
PA2102_Paer_15597298                    TEHALSVIYRHACRT-----YPRECCGFVLADA-----------------------KVKEGTNIQDELHMA-------------------DPRRYPRTAA-----NGYTFSVTDTVFLN---------------SSFKTCSPVSVIYHSHPDVGA--YFSREDIDKALYAGEPM------------LPVDYLVVD--------VAAGNVRGAKLF---AWRNGRF---ECTREFGPSSQ----|
PAE2024_Pyae_18313041                   MPKAFLEEARKKCA------PEAECVALIFGISDT-----------------------ALSWRWMKNVAA-----------------------------------SPVFFKLDPEEVYKAIV------------EAEERGEELLAIFHTHPGPP---TPSWEDVRHMRL-----------------WPVTWIIAN----------VFDWHI---S---AWRIDG-----GLKTIPL-------|
APE0681_Aper_14600889                   ASIGPLRQVLKLMAL-----AHNEEAGLVIGARR-----------------------GDTVYAYILYRTDN-------------------LKQSPEEFES-----DPWQVVQAHR-------------------AAEKLGLEVVGVYHTHTTCPP--SPSGKDVEGMKR-----------------WPGVWLIAC----------PGEVK--------AWTLEGE---TPVEIELE-------|
PH1488_Phor_3257912                     LPKNIIEEIITRSRE-----SKIEICGFIFGTK--------------------------NGERFIGKEVE-------------------FIRNRLNSSVEFE---MDPEEMINALE------------------RAERKGLEVVTIFHSHLNCPP--YPSKKDIKGMENWR---------------IPWLIVSLK----------GD-----------MKAFILR----SNNEVEEVKI----|
SSO0111_Ssol_15897071                   NRYFKINCWSRRFMD-----NLKEKCGIICNNT--------------------------FYELKNISRTE-------------------YE--------------FICDPSDFYTT------------------VKGKCSDDIQAIVHTHEESC---EPSYKDIMSMKIWN---------------IPWIIISKK----------CIKSILYLNG---SILELD----IHSLLSQELYHSLM-/
sll0864_Ssp_1652702                     SQVHQDQIYRHGERC-----YPEECCGLLLGKILIGENGH-------------------RHWQVVEVQPTENCWGDVE-----------EFQQNNHQGNKLHYFAIDPKVLLSAQK------------------DCRQKGLSIIGIFHSHPHGQP--IPSEFDRAIA-------------------WPEYIYLIA-----SGENGRFNTSR-------SWYLNEA----GNFMEVDS------
YPMT1.08c_Ypes_16082790                 MQEIYLTAIKR---------YPNEACGFLVRT---------------------------TGEKYRFMEARN---------------------------------VSENPENTFVMHADDI--------------IAAEDAGDVVAIWHSHTDESA--DASDADRAGCEATE---------------VPWLILAV-----------RKNVEGD------APFHFSE---MNVITPDGFEMPYL-
_Scoe_7479881                           TQALYDQIVAHARED-----HPDEACGVVAGPAG-----------------------EGRPERFIPMLNAA--------------------RSPTFYEFD-------SQDLLKLYR------------------EMDDRDEEPVVIYHSHTATEA--HPSRTDVTYAN------------------EPGAHYV------------LV-----------STADTDG---AGEFQFRSFRIVAG-
DR0402_Drad_15805429                    PAPLRRALWAQVRRE-----LPRECVGALGGW------------------------VRGEQVQAHALYPLP--------------------NVAADPER------EYLADPGDLLRVVR---------------AMQREGLDLVALYHSHPHGPA--APSASDRRLAA------------------YPVPYLIAD----------PAAE---------VLRAYLL---PGGEEVEV-------
_Aae_2984019                            KKEVLEKMIKQAERD-----YPYETCGLLIGK-------------------------SEGGIRIAYEAFET-------------------PNANPDRKHDRYE--IAPKDYMRAED------------------YAISKGMEIVGVYHSHPDHPD--RPSQFDLQRAFP-----------------DLSYIIFSVQ------KGKVASYR--------SWELKGD---KFEEEEV--------
RPCDRAFT_2255_Rpal_78493975             NEETLALIVRHAEQA-----YPKECCGFVYADGEVRA-------------------CVNIQDDLKSID--------------------PARYRHGATAGYTL-----SVADTLALNG-----------------SFETANPASV-IYHSHPDVGA--YFSQEDSDEALFLGTPVYP----------VDYLVVDVRR------AKALEAKL--------FVWRKAG---FFCARVFPIDQSYR-\ThiF+Rhodanese
Noc_0361_Noce_76882206                  PRPLVNQLLHQAQVK-----PQQEICGLISAR----------------------------NGLPSRCYP-------------------INNIAPEPQRHFFM-----DPQGQIAAMR-----------------RMREEGEELFGIYHSHPETAP--LPSKSDLAQAAYP----------------GALYLIISLN------TKGVLEMR--------GFRLQGE---VYEEIELQL------|
RRSL_01365_Rsol_83748715                LSELVDAVLAQARRD-----HPIETCGVIAGPV--------------------------GSDRPARLI--------------------PMRNAAQSIDAFRL-----DAQEQFQVWS-----------------EMDAREEEPIVLYHSHTGTNA--CPSRDDVRFAAEP----------------HAHYLIVSTD------PACGQAVR--------SFRIAEG---RAVEETIKVVARYQ-|
MlgDRAFT_2849_Aehr_78700360             PARERDRLARLGLAR-----WPEEACGLMLGCD---------------------------GRVRRLVL--------------------CRNVAARRADRYLV-----HARDFLRWDR-----------------AAHRLGLDILGVWHTHPDGGA--RPSGTDREQAWR-----------------GWSYLIAAVD------GRAITELR--------SWRLRGD---HFIEETLCLKPA---|ThiF+Rhodanese
NE2352_Neur_30181074                    HTKLISAMITQSLKD-----HPIETCGIIAGLA--------------------------GSNLPLRLI--------------------PMRNVAQSENFFMF-----DPQQQLQVWK-----------------EMSARHEEPVVIYHSHTGSEA--YPSRSDVELAAEP----------------QAHYVIIPTC------SPHKEEIR--------SFRIVDQ---MVIEERVQIVRQYQ-\ThiF+S (S)
Nmul_A0971_Nmul_82702100                HAKLVEAMLAQAHKD-----HPFEICGVIAGPE--------------------------KSNLPLRLI--------------------PMRNAAQSETFFKF-----DPQEQLQVWR-----------------EMEARGEEPIVIYHSHTHTPA--YPSRTDVQYASQP----------------QSHYVIVPTD------PAYGEEIR--------SFRILDG---MVTEERIRMINSYK-|ThiF+S (S)
pdtG_Pput_84994017                      TAQALEQVRHLAQAA-----HPIEACGLIAAAS--------------------------GEPLAHRVV--------------------PMRNQAASPTWFSF-----DPREQLQVWR-----------------ELDQRDEDCRVIYHSHTASEA--WPSREDIALASDP----------------QVHYLIVSTW------GEARHAAR--------SFRIIDG---RVFEEPLCVQP----|Siderophore
HCH_02850_Hche_83645617                 LSELVDAMVRQAQAE-----HPIETCGVIAGRE--------------------------GSDRPLRLI--------------------PMRNAAASSDMFMF-----DAREQLQIWR-----------------EMDANGEEPVVIYHSHTASRA--YPSKDDILCAAEP----------------HAHYVIIPTD------PEHGSDIR--------SFRIVNG---AVVEETIKAVEHYS-|Siderophore
qbsD_Pflu_28192389                      SQDIITAIFDQARQA-----HPLECCGIIAAAI--------------------------DSERATRLI--------------------PMTNSACSPVYFAF-----DPRQQLQVWR-----------------EMDARDEEPRVFYHSHTASRA--YPSATDIEFATDA----------------NAHYLIVTT-------ADYDPPLR--------SFRIAQG---CVSEEEVRVETPPY-|Siderophore
_Pstu_5070640                           KRQALGQVLAQARRD-----HPLETCGIVASSL--------------------------EAQLATRVI--------------------PMRNQAASQTFFRL-----DSQEQFQVFR-----------------SLDDRNEFQRVIYHSHTASEA--YPSREDIEYAGYP----------------EAHHLIVSTW------ENAREPAR--------CFRILRG---KVIEESISIVE----|Siderophore
SAV5162_Save_29608821                   TQALVDQIVAHARQD-----HPDEACGVVAGPE--------------------------GSGRPERFI--------------------PMLNAARSPTFYEF-----DSGDLLKLYR-----------------EMDDRDEEPVIIYHSHTATEA--YPSRTDISYANEP----------------GAHYVLVSTA------DADDAGPF--------QFRSFQI---VAGEVTEEEVKVVE-\Cys Syn ClpS
NocaDRAFT_2642_Nsp._71366889            ARATYDAIVAHARRD-----HPDEACGIVAGPE--------------------------GSDRPERLV--------------------EMVNAAGSPTFYEF-----DSTELLQLYK-----------------EMWARDEEPVVIYHSHTATEA--YPSRTDIGLASEP----------------GAHYVLVSTRHGADSRGGNNGGPV--------EFRSYRI---VDGEVTEEEVVVVD-|Cys Syn
RxylDRAFT_0217_Rxyl_68563153            GRGDVEHIHRHAREA-----YPEECAGALVGMDVGG--------------------GTKIVVDVWRA---------------------ENVHEEERSRRFLI-----EPEQIRRFER-----------------RAAERDMDVLGFYHSHPDHPA--EPSEYDRQHAWP-----------------YYSYVIVSVS------GEEIREMR--------SWRLRDD---RSGYDEEEIVG----|Cys synthase
SRU_2040_Srub_83814538                  TPDILDQIRVHGADA-----YPEEGCGFLLGTVTDD--------------------GDNRVAALHRA---------------------TNRRSEQRTRRYEL-----TADDYRAADA-----------------AAQEQGLDVVGVYHSHPDHPA--RPSATDLEEATFP----------------GFTYVIVSVR------DGAPEALT--------AWALAPD---RSEFHREDIVRPDP-|Cys
AcidDRAFT_1958_Susi_67932292            ESAAWAAMVKHAQAS-----YPNECCGAMLGDT--------------------------DGETKLVR----------------ESIALENAFEGAQAARYEL-----RPQDLLAADK-----------------AARERNMDLIGIYHSHPDCDA--YFSKTDLQNSCP-----------------WYSFVVLSIQ------KGEFHHAN--------SWLPNFD----QTEAAKEELSY---|
MT1376_Mtub_13880984                    RADLVNAMVAHARRD-----HPDEACGVLAGPE--------------------------GSDRPERHI--------------------PMTNAERSPTFYRL-----DSGEQLKVWR-----------------AMEDADEVPVVIYHSHTATEA--YPSRTDVKLATEP----------------DAHYVLVSTR------DPHRHELR--------SYRIVDG---AVTEEPVNVVEQY--|
nfa10890_Nfar_54014564                  KSDLVAAMVAHARAD-----HPDEACGVIAGPE--------------------------GSDRPERFI--------------------AMTNAERSPTFYRF-----DSGEQLKVWR-----------------EMDAADEEPVVIYHSHTATEA--YPSRTDISYASEP----------------NAHYVLISTR------DPEQHELR--------SYRILDG---VVTEEPVRVVDDYD-|
Franean1DRAFT_3647_Fsp._68231909        DRTHYEAIVAHARRD-----HPDEACGVIAGPE--------------------------GSDRPERHI--------------------PMVNAARSPTFYEF-----DPAEQIKVWN-----------------EMFDRDEDPVVIYHSHTATEA--YPSRTDISIAGYP----------------EAHYVLASTR------DPETIEFR--------SFRIADG---EVTEEPVEIL-----|ClpS
Tfu_2370_Tfus_71916501                  DRSIYDKIVAHARRD-----HPDEACGIVAGPE--------------------------GSDRPERFI--------------------EMINAERSPTFYRF-----DSLEQLKVWR-----------------EMEERGEEPVVIYHSHTSTEA--YPSRTDISYASEP----------------NAHYVLVSTR------DPETVEFR--------SYRIVDG---VVTEEPVEIID----|ClpS
Francci3_0866_Fsp._86739579             DRACYEAIVAHARRD-----HPDEACGIVAGSL--------------------------GSDRPKRFI--------------------PMENAERSPTFYRF-----DPMEQLKVWR-----------------EMDDRDEEPVIIYHSHTATEA--YPSRTDVSLAAEP----------------GAHYVLASTR------EPDVTEFR--------SYRIVDG---VVTEEPVEIV-----/ClpS
WS1005_Wsuc_34483108                    -KALFDSIIEHAQRE-----LPLEACGYVAG----------------------------VEGEVKRLF--------------------PMRNVDASPEHFSF-----DPAEQFSAFK-----------------EAQKEGLRLIGCYHSHPSTPA--RPSDEDIRLAYDS----------------SLSYLIVS--------LAKEPVLN--------SFKIKEG---VVTPENIEVI-----\Sulfite metabolism
Gmet_1569_Gmet_78194034                 -RAIHAELIAHAQAD-----APIEACGILGG----------------------------IDGAVSAIF--------------------RMANTDQSDEHFMM-----DPKEQFAVVK-----------------ELRNRGLAMLAIYHSHPETPA--RPSEEDIRLALTP----------------GVSYVIASL-------AGAEPDVK--------AFRITDG---VVEPEPIDIVE----|
Cphamn1DRAFT_2826_Cpha_67938821         CKSVYEKIIEHARRE-----TPLEACGYLGGK----------------------------GKTVIEAY--------------------CLTNIDQSREHFSF-----DPKEQFNAVL-----------------TMRSKKQLAVAVYHSHPVTPA--RPSQEDIRLAFDP----------------EIINVIVSL-------AAQEPEVN--------AFRIVKG---DVTEEPLVVIEGLC-|
CtheDRAFT_3348_Cthe_67873786            TKQQYQEILEHSRNA-----LPNEACGLLGGRI------------------------ENGVKYVEKVY--------------------LLRNIDESPEHFSM-----NPKEQFAAVK-----------------DMRNNGWELLGNFHSHPATPS--RPSEEDIRLAFDP----------------KASYLILSLK-------DDTPVLK--------SFNISSG---QATQEELSIVGEEA-|
DhafDRAFT_0037_Dhaf_68208688            TKKQMEEMLAHARQA-----LPNEACGLLGGRR------------------------DGDDRWVERVY--------------------PLNNLDQSPEHFSM-----DPREQLTAVK-----------------DMRKNGWVMLGNFHSHPATPA--RPSAEDKRLAFDP----------------SLSYLIISLA------EPQKPVCK--------SFLIKKD---GVDEEEIILKEE---|
AmetDRAFT_0932_Amet_77686499            -KENYNQIVKQAKEE-----FPLECCGLLAGVK------------------------TDDEILIKKVY--------------------ALTNIDQSSEHFSM-----DPKEQFAAIK-----------------QMRTDGDIVVGNYHSHPYTPS--RPS