Nucleic Acids Res. Sep 1, 2000; 28(17): 3278–3288.
PMCID: PMC110705

Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames

Abstract

Four years after the original sequence submission, we have re-annotated the genome of Mycoplasma pneumoniae to incorporate novel data. The total number of ORFs has been increased from 677 to 688 (10 new proteins were predicted in intergenic regions, two further were newly identified by mass spectrometry and one protein ORF was dismissed) and the number of RNAs from 39 to 42 genes. For 19 of the now 35 tRNAs and for six other functional RNAs the exact genome positions were re-annotated and two new tRNALeu and a small 200 nt RNA were identified. Sixteen protein reading frames were extended and eight shortened. For each ORF a consistent annotation vocabulary has been introduced. Annotation reasoning, annotation categories and comparisons to other published data on M.pneumoniae functional assignments are given. Experimental evidence includes 2-dimensional gel electrophoresis in combination with mass spectrometry as well as gene expression data from this study. Compared to the original annotation, we increased the number of proteins with predicted functional features from 349 to 458. The increase includes 36 new predictions and 73 protein assignments confirmed by the published literature. Furthermore, there are 23 reductions and 30 additions with respect to the previous annotation. mRNA expression data support transcription of 184 of the functionally unassigned reading frames.

INTRODUCTION

This study presents a re-annotation of the Mycoplasma pneumoniae genome, updating the original published annotation by Himmelreich et al. (1; deposited in GenBank) through further sequence analysis, incorporation of knowledge from the literature and new experimental data. There are inherent difficulties in genome annotation, even if the genome considered is small (the M.pneumoniae genome has a size of only 816 kb). In the original annotation 328 proteins (48%) from M.pneumoniae had no functional assignment. Comparisons and contradictory results with the genome annotation of the closely related Mycoplasma genitalium (2–7) illustrate that functional annotation is a continuing effort.

With these difficulties in mind, we have tried to approach the re-annotation in a more formal way. First, we re-examine gene contents and reading frame lengths (Table 1) and define the semantics used for the re-annotation (Table 3). Second, important steps in the annotation reasoning and the programs used are given, allowing reproducibility. Third, new experimental genome analysis data from M.pneumoniae support our effort.

Table 1. Identification of genes and reading frame length
Table 3. Re-annotation of protein function: the different re-annotation categories

The protein and RNA inventory of M.pneumoniae is made much more complete by the re-annotation, as shown by examples from all annotation categories discussed below.

MATERIALS AND METHODS

Computational genome and sequence analysis techniques

The complete genome of M.pneumoniae was extensively compared to available completely sequenced genomes (in particular to M.genitalium) to better assign and identify the encoded proteins. Furthermore, iterative sequence analysis searches (PSI-BLAST; 8) compared M.pneumoniae sequences to other organisms and public databases. Hits were generally reported only below a conservative expectation value (E) threshold of 10⁻⁶.
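The thresholding step can be illustrated with a minimal sketch. It assumes search hits have already been parsed into records; the `Hit` record, the identifiers and the example values are hypothetical and are not taken from the study's data.

```python
from typing import NamedTuple

E_VALUE_CUTOFF = 1e-6  # threshold stated in the text


class Hit(NamedTuple):
    query: str     # hypothetical query identifier
    subject: str   # hypothetical database sequence identifier
    evalue: float  # expectation value reported by the search program


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits whose E value is at or below the cutoff, best hit first."""
    return sorted((h for h in hits if h.evalue <= cutoff), key=lambda h: h.evalue)


# Invented example records, for illustration only
example_hits = [
    Hit("query_A", "subject_1", 1e-36),
    Hit("query_A", "subject_2", 1e-4),
    Hit("query_B", "subject_3", 1e-7),
]
kept = significant_hits(example_hits)
print([h.subject for h in kept])  # ['subject_1', 'subject_3']
```

A fixed cutoff is only a first filter; the checks described in the following paragraph are what guard against chance matches that pass it.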

To independently check and test these results, we applied not only other programs with similar function, such as HMM and fasta searches, but also complementary tools and methods, such as domain analysis, phylogenetic analysis, analysis of context and clusters of orthologous genes. This also included analysis of gene duplications, replacement by unrelated sequences (non-orthologous displacement; 9) and gene neighborhood to determine orthology (10). Furthermore, we applied the different tools using extensive sequence analysis protocols as described and reviewed previously (11). Amongst other tests, this included verification of detected similarities by reciprocal searches from identified sequences and determination of the exact region where the sequence similarity was actually found. In particular, the multidomain architecture of many proteins has been taken into account. Functional assignments were tested and confirmed, including sequence searches from sequences with experimentally determined functions (12). Significant links to experimentally determined functions were established.
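The reciprocal-search check can be sketched as follows. This is a reciprocal best-hit test, a common simplification; the full protocol is described in reference 11 and may differ in detail. The dictionaries and identifiers are hypothetical.

```python
def reciprocal_pairs(forward_best, reverse_best):
    """Return (query, subject) pairs that are confirmed in both directions.

    forward_best maps each query to its best-scoring subject;
    reverse_best maps each subject to its best-scoring query when the
    search is repeated starting from the subject sequence.
    """
    return sorted(
        (query, subject)
        for query, subject in forward_best.items()
        if reverse_best.get(subject) == query
    )


# Invented identifiers, for illustration only
forward_best = {"mp_orf_1": "other_orf_9", "mp_orf_2": "other_orf_4"}
reverse_best = {"other_orf_9": "mp_orf_1", "other_orf_4": "mp_orf_7"}
print(reciprocal_pairs(forward_best, reverse_best))  # [('mp_orf_1', 'other_orf_9')]
```

A pair that fails this test is not necessarily unrelated; for multidomain proteins the similarity may be confined to one region, which is why the text also stresses locating the exact region of similarity.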

Phylogenetic analysis was applied to analyze gene duplication events and clarify the substrate specificity of the encoded enzymes.

Detailed data for each reading frame, including annotation reasoning and programs used, are available on our web site (www.bork.embl-heidelberg.de/Annot/MP/). The updated annotation data are furthermore deposited with GenBank (update of accession no. U00089; 1).

A number of standard features are included in the web table: gene numbering (original GenBank number and new revised numbering from the putative origin of replication in accordance with widely used numbering schemes for prokaryotic genomes), GenBank identifier and accession no.; original GenBank annotation and revised annotation; where applicable and of interest, proteins with similar sequence with known 3-dimensional folds (13); metabolic pathway assignment (14); MG orthologs and MP homologs; intrinsic features (transmembrane domains, protein export signals, low complexity regions and coiled coils); domain annotations according to the SMART program suite (15); characterizing comments on reading frames.

Experimental genome analysis techniques

Mycoplasma pneumoniae culture, treatment of cells and protein extraction are described in Proft and Herrmann (16).

2-Dimensional gel electrophoresis followed standard procedures (17). The pH gradient in the first dimension was from pH 3 to pH 10 and in the second dimension vertical slab gels were used.

Protein identification by mass spectrometry (details in 18). Colloidal Coomassie Blue stained protein spots were cut out and tryptic gel digests were done. The tryptic peptides were eluted, concentrated and analysed by on-line micro-HPLC and ion trap mass spectrometry (MS/MS). Ion trap mass spectrometry permitted identification of the protein by comparing the masses of tryptic peptides and their fragmentation pattern to a protein database directly translated from the DNA sequence. An in-depth analysis of the 2-dimensional gel and mass spectrometry data for M.pneumoniae will be published elsewhere (J.T.Regula, B.Ueberle, G.Boguth, A.Görg, M.Schnölzer, R.Herrmann and R.Frank, submitted for publication).
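As an illustration of how observed peptides relate to a predicted reading frame, the sketch below performs an in silico tryptic digest and computes sequence coverage. It uses the standard trypsin rule (cleavage C-terminal to Lys or Arg, but not before Pro) and omits the mass and fragmentation matching done by the actual identification software; the sequence and the `min_length` default are invented for the example.

```python
import re


def tryptic_peptides(protein, min_length=5):
    """Cleave after K or R unless the next residue is P; drop short peptides."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_length]


def coverage(protein, observed_peptides):
    """Fraction of residues covered by at least one observed peptide."""
    covered = [False] * len(protein)
    for peptide in observed_peptides:
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


toy_protein = "MKTAYIAKQRPLEDSGHVR"  # invented sequence
print(tryptic_peptides(toy_protein))    # ['TAYIAK', 'QRPLEDSGHVR']
print(round(coverage(toy_protein, ["TAYIAK"]), 3))
```

Peptides that map upstream of an annotated start codon are the kind of evidence that supports an N-terminal extension of a reading frame.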

mRNA expression. Measurement of mRNA expression of the different M.pneumoniae genes (comparing different growth temperatures) followed standard techniques using DNA arrays (19; H.W.H.Göhlmann, C.U.Zimmermann and R.Herrmann, unpublished data).

Other techniques. Standard molecular biology techniques for genome sequencing, cloning, northern hybridization and protein analysis were applied according to Sambrook et al. (20).

RESULTS AND DISCUSSION

Identification of genes and reading frame length

RNA. Encoded RNAs and RNA genes were identified by systematic sequence comparison to orthologous RNAs from different prokaryotic and eukaryotic species, to GenBank and to available RNA databases for specific RNA types (21–25). We did not consider other completely novel or non-consensus RNA variants (26,27). Two new tRNALeu were added and the positions of 19 of the original 33 tRNA annotations were revised. Furthermore, the positions of three rRNA genes (5S rRNA, 16S rRNA and 23S rRNA) and of three other functional RNA molecules (RNase P, 10Sa RNA and 4.5S RNA) were re-annotated, and a description of a new 200 nt RNA was added. The 200 nt RNA, named MP200 RNA, was further analyzed in detail, including northern analysis. It is highly abundant. Its rich stem–loop structure and its potential to encode cysteine-rich peptides are conserved between M.pneumoniae and M.genitalium; however, its specific function is still unclear (28).

Proteins. The intergenic regions were re-analyzed by sequence comparisons to identify unrecognized reading frames (Table 1, top). This yielded a total of 12 new proteins (two unassigned short proteins identified by mass spectrometry, six hypothetical proteins and four with predicted functional features) (Fig. 1). Furthermore, one of the original reading frames was dismissed and four with sequence similarity to proteins were discarded as they contain frameshifts and are likely pseudogenes (Table 1). Apart from PSI-BLAST searches these results were checked by extensive protein family alignments and other techniques as explained in Materials and Methods. As a result, the current number of protein genes we report here is 688, an increase of 11 from the previous annotation.
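The intergenic screen described above relied on sequence comparison. A simpler first pass, shown here only as an illustration and not as the procedure used in this study, is to enumerate open reading frames in an intergenic stretch. The sketch scans the forward strand only and treats only TAA and TAG as stop codons, because TGA encodes tryptophan in Mycoplasma (translation table 4). The function name, the `min_codons` parameter and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG"}  # TGA encodes Trp in Mycoplasma (translation table 4)


def find_orfs(dna, min_codons=30):
    """Return 0-based, half-open (start, end) coordinates of ATG..stop ORFs.

    Only the three forward-strand frames are scanned; the length in codons
    counts the start codon but not the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Invented sequence: the internal TGA is read through, the ORF ends at TAA
toy_dna = "CC" + "ATG" + "GCA" * 4 + "TGA" + "GCA" * 2 + "TAA" + "GG"
print(find_orfs(toy_dna, min_codons=3))  # [(2, 29)]
```

Using the standard genetic code here would truncate reading frames at every internal TGA, which is one way short or fragmented frames can be misannotated in this organism.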

Figure 1
(A) Peptides identified by mass spectrometry of the protein MPN033(MP121) (see Materials and Methods). Those peptides matching the genome-derived sequence are shown in bold. The protein reading frame sequence not covered by these peptides is shown in ...

All protein reading frames were consistently renumbered (MPN numbers; see our web page) from the origin of replication as in other prokaryotic genome efforts. Genome identifiers for the proteins discussed in the paper, sorted according to MPN number, are summarized with their alternative identifiers in Table 2 [the new number, old identifier according to Himmelreich et al. (1), PID and ORF identifier are listed]. In the following the MP numbers according to the original numbering system after Himmelreich et al. (1) are given as subscripts in parentheses for reference to previous papers. These MP numbers are not identical to the subsequent GenBank numbering.
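For readers who want to extract the paired identifiers from the text, a minimal sketch follows. The pattern is an assumption based on the identifiers that appear in this article (three-digit MPN number, three-digit MP number with an optional ".n" suffix); it is not an official identifier grammar.

```python
import re

# Assumed format, inferred from identifiers such as MPN033(MP121) and MPN495(MP346.1)
ID_PATTERN = re.compile(r"(MPN\d{3})\((MP\d{3}(?:\.\d+)?)\)")


def parse_identifiers(text):
    """Return (new MPN identifier, original MP identifier) pairs found in text."""
    return ID_PATTERN.findall(text)


print(parse_identifiers("MPN305(MP532) and MPN304(MP533)"))
# [('MPN305', 'MP532'), ('MPN304', 'MP533')]
```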

Table 2.
Genome identifiers for the proteins discussed (sorted according to MPN)

The reading frame lengths were also re-examined. Previously unrecognized extensions of different MP proteins became apparent and are summarized in Table 1 (bottom). The eight re-annotated proteins that have been shortened at the N-terminus are already included in the SwissProt sequence database. Protein fragments and overlaps were also identified. For example, MPN305(MP532) and MPN304(MP533) are N- and C-terminal fragments of arginine deiminase. Re-sequencing suggests that the separating frameshift is real, while intact MPN560(MP282) provides arginine deiminase activity.

As a further validation of the results derived by sequence comparison and theoretical analysis, two of the predicted N-terminal extensions (Table 1, bottom) were directly confirmed using 2-dimensional gel electrophoresis and mass spectrometry. Applying this combination, 350 protein spots were resolved and analyzed in a systematic effort to study the proteome of M.pneumoniae. Figure 1A shows peptides of the protein MPN033(MP121) identified by mass spectrometry in bold. Protein reading frame sequences not covered by these peptides are shown in plain text. The other predictions are currently being examined by the same techniques. In Figure 1B mass spectrometry data for three new, short proteins in M.pneumoniae are shown. Two of these short proteins show no homology to any known sequences (nor in HMM and SMART searches), while the third reading frame has significant similarity to a small subunit of the PTS system (expected E value applying PSI-BLAST of 10⁻³⁶). This confirmed experimentally an ORF between P02_orf660 and P02_orf159 already suggested by Reizer et al. (29), as well as by our screen for proteins in previously intergenic regions (MPN495(MP346.1); Table 1). Furthermore, the hypothetical protein MPN254(MP579.1) predicted from the intergenic screen was confirmed by the same technique (Fig. 1C). The localization of the 2-dimensional gel spot for this protein before tryptic digestion for mass spectrometry is shown in Figure 1D.

Re-annotation of protein function

We considered a functional feature to be predicted for the product of a reading frame if either its molecular function could be predicted (e.g. ‘methyltransferase’) or the biological context has become clear. Thus, a transmembrane domain (predicted as an intrinsic feature) is not considered specific enough for a functional annotation; however, ‘permease’ (indicating the biological activity) is. Similarly, a non-specific description regarding an external stimulus (such as ‘glucose-inhibited protein’) was not considered to be sufficient for a functional annotation, whereas the cellular role (e.g. ‘cell division protein’) is. Different functional re-assignment categories are given together with an example for each category in Table 3. Apart from the first group of 297 proteins for which the annotation could be confirmed (‘conf’; Table 3, top; 43% of a total of 688 proteins), modifications of the original annotations were made. These included semantic modifications (mainly in the classification of hypothetical proteins) and modified functional assignments (in all protein categories).

In the following only a few examples for each re-annotation category (hypothetical, conserved hypothetical, wrong, less, more_, new_conf and new) are discussed in the order they appear in Table 3. More data are summarized in the tables and each reading frame annotation for the whole genome can be found at http://www.bork.embl-heidelberg.de/Annot/MP/

Proteins of unknown function

The original GenBank annotation of M.pneumoniae does not provide a known functional feature for 328 protein reading frames. These protein reading frames are listed in Table 4 under four different categories. Part of our effort was motivated by the goal to add functional information to these entries. For example, 42 proteins were previously assigned as ‘putative lipoproteins’ only and four putative lipoproteins were given a defined functional assignment (1). For these proteins, the prokaryotic lipoprotein motif (prosite PS00013) is present [lipobox, Met++, more or less hydrophobic leader region Leu(Ala/Ser)(Gly/Ala)Cys; the leader region is very short in MPN561(MP281) and MPN051(MP103)]. Palmitylation assays indicate that the number of proteins with lipid attachment sites in M.pneumoniae should be 25–30 (Pyrowolakis and Herrmann, unpublished results), but so far only the subunit b of the F0F1-type ATPase has been identified experimentally as a lipoprotein (30). A reliable, homology-based prediction requires the identification of a related sequence with a domain confirmed to be involved in lipid binding. This was the case for only six of the 42 putative lipoproteins. Another two were found to have a distinct function. The other 34 sequences were re-annotated more conservatively as ‘hypothetical’ or ‘conserved hypothetical’ (the next two categories in Table 3; conserved hypothetical if there was a related protein sequence in another species).
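The lipobox consensus quoted above, Leu(Ala/Ser)(Gly/Ala)Cys, can be written as a short regular expression. The sketch below searches only for that four-residue box near the N-terminus; it is a simplification of the text's description and is not the full PROSITE PS00013 pattern. The 40-residue search window and the toy sequences are assumptions made for the example.

```python
import re

# Lipobox as described in the text: Leu, then Ala or Ser, then Gly or Ala, then Cys
LIPOBOX = re.compile(r"L[AS][GA]C")


def lipobox_cysteine(protein, window=40):
    """Return the 0-based index of the lipobox Cys in the N-terminal window, or None.

    The window size is a hypothetical choice, not a value from the study.
    """
    match = LIPOBOX.search(protein[:window])
    return None if match is None else match.end() - 1


print(lipobox_cysteine("MKKLLSIFALLSACSNTE"))  # 13
print(lipobox_cysteine("MKKLLSIFALLSAVSNTE"))  # None
```

As the paragraph above argues, a motif match of this kind is weak evidence on its own, which is why most of the 42 ‘putative lipoproteins’ were re-annotated more conservatively.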

Table 4.
Proteins with unknown function

Expression of mRNA (Table 4) was confirmed by gene expression data for 184 of the (conserved) hypothetical proteins using macroarrays (19). The macroarray data are given for individual reading frames in our complete genome annotation table (see Materials and Methods; presence of an mRNA for an individual reading frame is labeled ‘mRNA expressed’ in the web table).

Re-annotation of functionally assigned proteins

Four annotations were completely replaced (wrong; example in Table 3). In several cases the original annotation was more specific than the evidence supports and a less specific one (keyword ‘less’; Table 3, middle) had to be chosen. MPN007(MP147) is an example. It was originally annotated as DNA polymerase III subunit δ′. However, there is not enough sequence similarity to confirm that functional assignment. The sequence similarities in PSI-BLAST runs to other subunits such as γ and τ have similarly high E values (ranging from 10⁻⁷ to 10⁻⁴ for each of them; protein length is well covered; similar results are apparent from phylogenetic analyses or analyzing the domain architecture) and only similarity to an unspecified subunit of DNA polymerase III is annotated by us.

New functional features compared to GenBank were annotated in 109 cases, including predictions for four completely new reading frames. Each of these adds some information to the predicted protein and enzymatic repertoire of M.pneumoniae (Table 5). We defined three categories: novel functional features, novel annotation integrating public knowledge and novel prediction (more_, new_conf and new; Table 3, bottom).

Table 5.
Re-annotated molecular functions encoded in M.pneumoniae reading frames (selected examples)

Novel functional features. In 30 cases we could add functional features to the functional annotation present (more_; Table 3). An example is MPN237(MP595), which was originally annotated as an amidase homolog. Sequence analysis using PSI-BLAST shows that one can be more precise about this finding; this sequence is similar to glutamine-tRNA amidotransferase subunit A (this is also evident from the family alignment). There is high and significant homology over the full sequence length to, for example, the recently experimentally characterized sequence from Bacillus subtilis (31). This similarity has also been included in the recent update of the homologous M.genitalium sequence by GenBank.

Novel annotation integrating public knowledge. Since the release of the original GenBank annotation (1), new data on the sequence entries have become available and the sequence analysis software has been enhanced (see for example 8). To integrate these new data in an unbiased and systematic fashion, first all sequence entries were re-analyzed with the latest sequence analysis software (see Materials and Methods). The old annotation was also extensively compared to the results from a survey of recent literature and public database updates such as the SwissProt sequence database. The complete M.genitalium sequence has been recently updated and a number of papers (see for example 32–34) have described novel predictions and experiments for many of the M.pneumoniae genes. Inconsistencies with the original annotation found by our own sequence analysis can be resolved with higher certainty by systematically retrieving and critically comparing these public data from different sources.

MPN558(MP284) and MPN557(MP285) provide typical examples (Table 5). They were originally annotated as glucose-inhibited cell division proteins B and A. Detailed sequence comparisons, including PSI-BLAST, domain architecture and complementary sequence analysis methods (such as predicting protein 3-dimensional structures based on homologous sequence searches; http://www.bork.embl-heidelberg.de), show that their actual molecular functions seem to be a methyltransferase [MPN558(MP284)] and an NADH oxidoreductase [MPN557(MP285)], including homologs with known structure (1BHJ.brk and 1FEA.brk, respectively). Specific queries for these findings revealed that this information has already been noted by others, for example regarding the latest version of clusters of orthologous genes (COG0357 and COG0445, respectively; 35). However, these novel predictions were not considered in the last GenBank update of M.genitalium and in the recent literature (see for example 36).

Novel prediction. There are 36 cases where (at least to our knowledge) the functional assignment is completely new (Table 5). An example is the protein secretion system in M.pneumoniae. The system has been well characterized in Escherichia coli (35). Cytosolic chaperones or regulators (trigger factor, SecB, DnaK, bacterial signal recognition particle and FtsY) deliver the protein to a membrane transporter (SecA). The receptor should also function as a motor to push the protein across the membrane via specific protein channels (SecY, SecG, SecE, SecD and SecF). Himmelreich et al. (1) noted that they had identified trigger factor, DnaK, SRP and FtsY as well as SecA, whereas of the channel-forming proteins only SecY could be assigned, leaving the secretion pathway incomplete.

We have now annotated protein reading frames similar to SecD, SecE and SecG, yielding a new, more complete picture of this secretory pathway in M.pneumoniae. As several pathogenicity factors (e.g. re-annotated hydrolases and lipases; Table 5) are secreted, the respective protein channels are potential drug targets.

SecE and SecG were annotated by integrating public knowledge. MPN068(MP086) is a SecE homolog (new_conf, updated COG0690; 35). MPN242, a region previously annotated as intergenic, is the missing SecG homolog. The YvaL homology has also been reported by Bellgard and Gojobori (38). YvaL has in the meantime been experimentally verified to be a SecG homolog (39).

However, MPN396(MP443), with its similarity to secD, provides an example of a novel prediction (Fig. 2). This protein had been annotated before as a conserved hypothetical protein, the MG277 homolog from M.genitalium (1,35,36 and in the SwissProt update of M.genitalium). PSI-BLAST searches indicate similarity to the secDF protein from B.subtilis after the second iteration.

Figure 2
(A) Sequence alignment of MPN280(MP555) with related secD sequences. Only the central part (140 amino acid positions) of the alignment is given. After the M.pneumoniae sequence the M.genitalium homolog is shown (MG277), aligned with secD proteins from ...

Further analysis re-tested this suggestion and showed that protein MPN396(MP443) contains a domain similar to secD and a second part (which may perhaps be another domain involved in secretion, such as a fusion with the related secF as in B.subtilis secDF). The similarity of the secD-like domain in MPN396(MP443) was confirmed by PSI-BLAST searches from established secD proteins [finding MPN396(MP443) with expected values well below 10⁻⁶ in the second iteration]. Moreover, clusters of orthologous genes and gene neighborhoods (both available using the STRING tool at http://www.bork.EMBL-Heidelberg.DE/C-GOD) back this prediction by independent methods. Detailed sequence alignment (the central portion is displayed in Fig. 2A) shows clear homology to other secD domains but also indicates that the Mycoplasma sequences are only secD-like. A phylogenetic tree of established secD and secF sequences including MPN396 and MG277 gives a similar result (Fig. 2B). We suggest that MPN396 with its secD-like domain should further complete the secretory repertoire in M.pneumoniae; however, further experiments and analyses are needed to determine its exact relation to the established members of the sec family characterized to date.

No homologous sequence has been found for SPase I in the secretory pathway in M.pneumoniae. SPase I would cleave the signal peptide before secretion. However, suitable cleavage sites have been identified for several M.pneumoniae proteins (1) and one of the proteases identified may provide this function, such as the newly annotated intracellular protease MPN386(MP542) (new_conf, COG0693).

Re-annotated molecular functions enable predictions on higher levels

The re-annotation of molecular functions may in addition provide some answers regarding higher levels of cellular interactions such as transport (several newly annotated permeases and transporters are listed in Table 5), secretion (example above) and pathogenicity factors. Metabolism, multiple substrate use and existing operons are also better described.

Metabolism. As an example, MPN547(MP295) was previously annotated as a homolog of MG369, which in the recent update of M.genitalium (December 1999) is still given as a conserved hypothetical protein. Detailed sequence analysis (see Materials and Methods) shows, for example, similarity to experimentally characterized dihydroxyacetone kinases from different bacteria and fungi in PSI-BLAST searches of the N-terminal 300 amino acids with significant E values below 10⁻⁷, also apparent from the latest COG table (35). The dihydroxyacetone kinase domain could yield ATP by transforming dihydroxyacetone phosphate and ADP into dihydroxyacetone and ATP. The predicted activity can be metabolically connected to phospholipid metabolism in M.pneumoniae and the necessary supply of dihydroxyacetone phosphate via MPN051(MP103) (glycerol 3-phosphate dehydrogenase reading frame, confirmed in re-annotation). The remaining sequence of MPN547(MP295) (total length 558 amino acids) may regulate or add further to this predicted enzyme activity.

Multiple substrates. There seem to be M.pneumoniae enzymes which can interact with several substrates, for example MPN158(MP674). As already indicated in the first annotation and in SwissProt (P22990), given its clear and high sequence similarity over the full length to biochemically well-characterized enzymes from Gram-positive homologs, the encoded enzyme can act as both a riboflavin kinase and an FMN adenylyltransferase using one substrate binding site (according to biochemical data for the Corynebacterium ammoniagenes enzyme; 40). However, considering that MPN047(MP107) is now re-annotated as nicotinate phosphoribosyltransferase (by sequence similarity, including biochemically well-characterized family members) and that MPN562(MP280) is and was annotated as an NH3-dependent NAD synthase, it is tempting to speculate that MPN158(MP674) also has nicotinate-nucleotide adenylyltransferase activity besides FMN adenylyltransferase activity. This capability would complete the synthesis of NAD from imported nicotinic acid, a pathway so far incomplete. The reaction mechanism and substrate seem to be sufficiently similar to suggest this, but, as further experimental evidence is lacking, we have kept the original annotation and suggest this further activity of the reading frame product only as a comment.

Apparent operons. The phosphate uptake system was more completely annotated. It is composed of MPN611(MP231) (new assignment, similar to phosphate-binding protein PstS, for example from E.coli, previously annotated as ‘conserved with MG412’), MPN610(MP232), MPN609(MP233) and MPN608(MP234). It is probably regulated by MPN397(MP442) (ppGpp 3′-pyrophosphorylase).

A ribulose uptake operon is apparent. Small operons were known previously for fructose (MPN078(MP077) and MPN079(MP076)) and mannitol (MPN651(MP191)–MPN653(MP189)). Ribulose is now found to be transported (MPN496(MP346), MPN494(MP347)) and channeled via d-arabinose 6-hexulose 3-phosphate synthase (MPN493(MP348)) and d-arabinose 6-hexulose 3-phosphate isomerase MPN492(MP349) into fructose 6-phosphate and glycolysis. Of these proteins, MPN496 and MPN493 were not functionally annotated before and MPN494 had been annotated as a hypothetical phosphotransferase. These new functional assignments also became apparent on integrating data from SwissProt annotations with further direct experimental data published and realized for homologous proteins. Furthermore, we have now added a description of and data on the pentitol BC subunit of the ribulose transporter (MPN495(MP346.1); see Table 1 and data in Fig. 1B), not annotated before.

Lessons for genome annotation

The re-annotation presented here is only our current interpretation of the genome sequence. A substantial fraction of proteins remains unassigned (230 of 688 or 33%) and even this prototype of a small or even minimal genome (34,41) is far from being completely understood. To reduce the level of errors, close cooperation, regular updates and deposition of the findings in databases such as SwissProt and GenBank are required. We support calls for concerted efforts in re-annotation and a consistent nomenclature (3,42,43).

Regular, well-documented further updates of genome sequences will yield a considerable gain in information. We have focused mainly on the molecular functions of the proteins because these can be directly deduced from the protein sequence and/or simple experimental tests. Furthermore, we approached the re-annotation in a more formal way, including semantics, re-annotation categories and inclusion of programs and reasoning to allow reproducibility. New experimental data were integrated, including data from this study on mRNA expression and proteome analysis. In this way, three new RNAs and 12 new proteins were identified, protein lengths (24 cases) and RNA positions (25 cases) were corrected and several new operons predicted. On the next level of re-annotation, the increase of 31% in functional assignments obtained (from 349 to 458) was not only quantitative but improved our overall knowledge regarding pathogenicity factors, secretion, transporters and metabolism of M.pneumoniae.
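The counts quoted throughout the article can be cross-checked with simple arithmetic. The short script below only restates numbers given in the text; the variable names are ours.

```python
# Protein genes: 677 originally, 12 added, 1 dismissed
proteins_old, proteins_new = 677, 688
assert proteins_old + 12 - 1 == proteins_new

# RNA genes: two new tRNALeu plus the 200 nt RNA
assert 39 + 3 == 42 and 33 + 2 == 35

# Corrected reading frame lengths and RNA positions
assert 16 + 8 == 24   # frames extended + frames shortened
assert 19 + 6 == 25   # tRNA positions + other functional RNA positions

# Functional assignments: 36 new predictions + 73 confirmed by the literature
assigned_old, assigned_new = 349, 458
assert assigned_new - assigned_old == 36 + 73 == 109
increase_pct = round(100 * (assigned_new - assigned_old) / assigned_old)

# Proteins still unassigned, and annotations confirmed unchanged
unassigned = proteins_new - assigned_new
unassigned_pct = round(100 * unassigned / proteins_new)
confirmed_pct = round(100 * 297 / proteins_new)

# Former 'putative lipoproteins': six kept, two with a distinct function, 34 downgraded
assert 6 + 2 + 34 == 42

print(increase_pct, unassigned, unassigned_pct, confirmed_pct)  # 31 230 33 43
```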

ACKNOWLEDGEMENTS

We thank Amos Bairoch for information on recent SwissProt annotation efforts on Mycoplasma and all our collaboration partners at the European Multimedia Laboratory (Heidelberg) and Lion Biosciences AG (Heidelberg), as well as Richard Roberts and Janos Posfai (New England Biolabs, Beverly, MA), Kerstin Lühn and Jürgen Brosius (University of Münster), Mitsuhiro Itaya (Mitsubishi Kasei Institute, Tokyo), Warren C. Lathe (EMBL), Ake Wieslander (Stockholm University), Robert Turner (Scripps Institute, La Jolla, CA) and Jonathan Reizer (University of California at San Diego, La Jolla, CA) for discussions, comments and making unpublished material available to us. This research was supported by the BMBF (grants 0312212 and ‘Genominfo’), DFG (SFB544/B2, He780/10-1 and SPP ‘Informatikmethoden zur Analyse grosser genomischer Datenmengen’), the Graduiertenkolleg ‘Pathogene Mikroorganismen: Molekulare Mechanismen und Genome’ and by the Fonds der Chemischen Industrie.

Notes

DDBJ/EMBL/GenBank accession no. U00089

REFERENCES

1. Himmelreich R., Hilbert,H., Plagens,H., Pirkl,E., Li,B.-C. and Herrmann,R. (1996) Nucleic Acids Res., 24, 4420–4449.
2. Himmelreich R., Plagens,H., Hilbert,H., Reiner,B. and Herrmann,R. (1997) Nucleic Acids Res., 25, 701–712.
3. Brenner S.E. (1999) Trends Genet., 15, 132–133.
4. Koonin E.V., Mushegian,A.R. and Rudd,K.E. (1996) Curr. Biol., 6, 404–416.
5. Ouzounis C., Casari,G., Valencia,A. and Sander,C. (1996) Mol. Microbiol., 20, 898–900.
6. Fraser C.M., Gocayne,J.D., White,O. et al. (1995) Science, 270, 397–403.
7. Pennisi E. (1999) Science, 286, 447–450.
8. Altschul S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402.
9. Koonin E.V., Mushegian,A.R. and Bork,P. (1996) Trends Genet., 12, 334–336.
10. Huynen M.A. and Bork,P. (1998) Proc. Natl Acad. Sci. USA, 95, 5849–5856.
11. Bork P., Dandekar,T., Diaz-Lazcoz,Y., Eisenhaber,F., Huynen,M. and Yuan,Y. (1998) J. Mol. Biol., 283, 707–725.
12. Bork P. and Gibson,T.J. (1996) Methods Enzymol., 266, 162–184.
13. Huynen M., Doerks,T., Eisenhaber,F., Orengo,C., Sunyaev,S., Yuan,Y. and Bork,P. (1998) J. Mol. Biol., 280, 323–326.
14. Dandekar T., Schuster,S., Snel,B., Huynen,M. and Bork,P. (1999) Biochem. J., 343, 115–124.
15. Schultz J., Copley,R.R., Doerks,T., Ponting,C.P. and Bork,P. (2000) Nucleic Acids Res., 28, 231–234.
16. Proft T. and Herrmann,R. (1994) Mol. Microbiol., 13, 337–348.
17. Görg A., Obermaier,C., Boguth,G., Harder,A., Scheibe,B., Wildgruber,R. and Weiss,W. (2000) Electrophoresis, 21, 1037–1053.
18. Eng J.K., McCormack,A.L. and Yates,J.R. (1994) J. Am. Soc. Mass Spectrom., 5, 976–989.
19. Southern E.M. (1996) Trends Genet., 12, 110–115.
20. Sambrook J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
21. Brown J.W. (1999) Nucleic Acids Res., 27, 314.
22. De Rijk P., Robbrecht,E., de Hoog,S., Caers,A., Van de Peer,Y. and De Wachter,R. (1999) Nucleic Acids Res., 27, 174–178.
23. Szymanski M., Barciszewska,M.Z., Barciszewski,J. and Erdmann,V.A. (1999) Nucleic Acids Res., 27, 158–160.
24. Van de Peer Y., Robbrecht,E., de Hoog,S., Caers,A., De Rijk,P. and De Wachter,R. (1999) Nucleic Acids Res., 27, 179–183.
25. Williams K.P. (1999) Nucleic Acids Res., 27, 165–166.
26. Guigo R. (1997) Comput. Chem., 21, 215–222.
27. Dandekar T., Beyer,K., Bork,P., Kenealy,M.R., Pantopoulos,K., Hentze,M., Sonntag-Buck,V., Flouriot,G., Gannon,F. and Schreiber,S. (1998) Bioinformatics, 14, 271–278.
28. Göhlmann H.W.H., Weiner,J., Schön,A. and Herrmann,R. (2000) J. Bacteriol., 182, 3281–3284.
29. Reizer J., Paulsen,I.T., Reizer,A., Titgemeyer,F., Saier,M.H.Jr (1996) Microb. Comp. Genomics, 1, 151–164.
30. Pyrowolakis G., Hoffmann,D. and Herrmann,R. (1998) J. Biol. Chem., 273, 24792–24796.
31. Curnow A.W., Hong,K.W., Yuan,R., Kim,S., Martins,O. and Winkler,W. (1997) Proc. Natl Acad. Sci. USA, 94, 11819–11826.
32. Fukuda Y., Washio,T. and Tomita,M. (1999) Nucleic Acids Res., 27, 1847–1853.
33. Aravind L. and Koonin,E.V. (1998) Trends Biochem. Sci., 23, 17–19.
34. Hutchison C.A., Peterson,S.N., Gill,S.R., Cline,R.T., White,O., Fraser,C.M., Smith,H.O. and Venter,J.C. (1999) Science, 286, 2165–2169.
35. Tatusov R.L., Galperin,M.Y., Natale,D.A. and Koonin,E.V. (2000) Nucleic Acids Res., 28, 33–36.
36. Müller A., MacCallum,R.M. and Sternberg,M.J. (1999) J. Mol. Biol., 293, 1257–1271.
37. Schatz G. and Dobberstein,B. (1996) Science, 271, 1519–1526.
38. Bellgard M.I. and Gojobori,T. (1999) Gene, 238, 33–37.
39. Van Wely K.H., Swaving,J., Brockhulzen,C.F., Rose,M., Quax,W.J. and Driessen,A.J. (1999) J. Bacteriol., 181, 1786–1792.
40. Efimov I., Kuusk,V., Zhang,X. and McIntire,W.S. (1998) Biochemistry, 37, 9716–9723.
41. Mushegian A.R. and Koonin,E.V. (1996) Proc. Natl Acad. Sci. USA, 93, 10268–10273.
42. Kyrpides N.C. and Ouzounis,C.A. (1998) Science, 281, 1457.
43. Kyrpides N.C. and Ouzounis,C.A. (1999) Mol. Microbiol., 32, 886–887.
44. Dybvig K., Sitaraman,R. and French,C.T. (1998) Proc. Natl Acad. Sci. USA, 95, 13923–13928.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press