Copyright © 2003 Oxford University Press

Visualization and interpretation of protein networks in Mycobacterium tuberculosis based on hierarchical clustering of genome-wide functional linkage maps

1Howard Hughes Medical Institute and 2UCLA-DOE Institute of Genomics and Proteomics, Molecular Biology Institute, University of California at Los Angeles, Box 951570, Los Angeles, CA 90095-1570, USA and 3Protein Pathways, 21111 Oxnard Street, Woodland Hills, CA 91367, USA

*To whom correspondence should be addressed. Tel: +1 310 206 3642; Fax: +1 310 206 3914; Email: david/at/mbi.ucla.edu

Received September 16, 2003; Revised October 21, 2003; Accepted October 21, 2003.

Abstract

Genome-wide functional linkages among proteins in cellular complexes and metabolic pathways can be inferred from high throughput experimentation, such as DNA microarrays, or from bioinformatic analyses. Here we describe a method for the visualization and interpretation of genome-wide functional linkages inferred by the Rosetta Stone, Phylogenetic Profile, Operon and Conserved Gene Neighbor computational methods. This method involves the construction of a genome-wide functional linkage map, where each significant functional linkage between a pair of proteins is displayed on a two-dimensional scatter-plot, organized according to the order of genes along the chromosome. Subsequent hierarchical clustering of the map reveals clusters of genes with similar functional linkage profiles and facilitates the inference of protein function and the discovery of functionally linked gene clusters throughout the genome. We illustrate this method by applying it to the genome of the pathogenic bacterium Mycobacterium tuberculosis, assigning cellular functions to previously uncharacterized proteins involved in cell wall biosynthesis, signal transduction, chaperone activity, energy metabolism and polysaccharide biosynthesis.
INTRODUCTION

With the development of high throughput experimental and computational procedures, the identification of functionally linked proteins has progressed at a brisk pace, while methods for the visual interpretation of these large datasets have developed more slowly. Experimental methods such as high throughput yeast two-hybrid experiments (1) and whole genome microarray experiments (2) have yielded a plethora of information about functionally related genes and proteins, both in the form of protein–protein interaction data as well as information regarding protein function. Extensive databases, such as the Database of Interacting Proteins (3), the Biomolecular Interaction Network Database (4) and the MIPS Comprehensive Yeast Genome Database (5), have also been created to catalog thousands of protein–protein interactions identified by both large and small scale experiments.

In addition to high throughput experimental procedures, several computational methods have been developed to identify functionally linked genes and proteins. Among these are the Rosetta Stone method (6), which identifies individual proteins that exist as a single fusion protein in another organism (6,7), the Phylogenetic Profile method (8), which examines the correlated occurrence of proteins in various genomes, the Operon method (9,10), which functionally links genes likely to belong to common operons based on the distance between genes in the same orientation, and the Conserved Gene Neighbor method (11,12), which identifies genes that are located in close chromosomal proximity in multiple genomes. These methods provide non-homology based approaches for identifying functionally related proteins throughout a particular genome and complement the traditional homology based tools such as BLAST (13) for the identification of protein function. In addition, these methods permit a protein’s function to be defined in the context of its cellular interactions (14).
These computational methods are useful not only for the identification of protein function, as we have demonstrated previously in the genome of Mycobacterium tuberculosis (9), but can also be employed for the reconstruction of protein networks. Traditionally, these networks have been displayed as two-dimensional graphs of nodes and edges, where edges represent functional linkages between pairs of protein nodes (1,15). Although this classical representation has its merits, we have found that an alternative representation has advantages for depicting genome-wide functional linkages in prokaryotic genomes. Specifically, we employ a single scatter plot, with both axes organized according to the order of genes along the prokaryotic chromosome. This representation has allowed us to identify characteristics of protein network architecture that have previously eluded identification with the classical node and edge representation.

Andrei Grigoriev first proposed the use of a two-dimensional matrix to indicate experimentally identified protein–protein interactions in the 56 gene genome of bacteriophage T7 (16). Here we build on this idea and plot computationally derived protein functional linkages on individual scatter plots. Thousands of functional linkages in the M.tuberculosis genome are plotted on a single scatter plot, comprising what we describe as a genome-wide functional linkage map. These maps reveal global and local features of prokaryotic genome organization and suggest protein relationships on a genome-wide basis.

MATERIALS AND METHODS

Genome-wide functional linkage maps

Functional linkage maps were generated by first identifying M.tuberculosis protein pairs that are functionally linked by the Rosetta Stone, Phylogenetic Profile, Operon (distance threshold 100 bp) and Conserved Gene Neighbor methods. Pairs of proteins that are functionally linked by two or more computational methods were then identified.
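As an illustration of the overlap step described above, the following minimal Python sketch keeps only those protein pairs that are predicted by at least two methods. The function name, the dictionary layout and the toy links are assumptions made for this example; they are not taken from the software used in this work.

```python
from collections import defaultdict


def links_supported_by_multiple_methods(links_by_method, min_methods=2):
    """Return the protein pairs predicted by at least `min_methods` methods.

    `links_by_method` maps a method name to an iterable of (gene_a, gene_b)
    pairs. Pairs are treated as unordered, so (a, b) and (b, a) count as the
    same linkage.
    """
    support = defaultdict(set)
    for method, pairs in links_by_method.items():
        for a, b in pairs:
            if a != b:  # ignore self-links
                support[frozenset((a, b))].add(method)
    return {
        tuple(sorted(pair)): sorted(methods)
        for pair, methods in support.items()
        if len(methods) >= min_methods
    }


# Toy input: the locus names follow the Rv numbering, but the links themselves
# are invented for illustration only.
toy_links = {
    "rosetta_stone": [("Rv0005", "Rv0006")],
    "phylogenetic_profile": [("Rv0006", "Rv0005"), ("Rv0001", "Rv0002")],
    "operon": [("Rv0001", "Rv0002"), ("Rv0003", "Rv0004")],
    "conserved_gene_neighbor": [("Rv0005", "Rv0006")],
}
high_confidence = links_supported_by_multiple_methods(toy_links)
for pair, methods in sorted(high_confidence.items()):
    print(pair, methods)
```

In this toy input the pair Rv0003/Rv0004 is dropped because only one method supports it.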
Protein pairs were converted to corresponding integer values (i.e. protein pair Rv0005 and Rv0006 was converted to integer pair 5,6), and a list of integer pairs was input into the graphing program SigmaPlot 2000 to create a scatter plot representing genome-wide functional linkages.

Hierarchical clustering

Bit vectors for each gene were created. The presence of a functional linkage was indicated by a bit entry of 1 in the vector at the position corresponding to the functionally linked gene. The absence of a functional linkage was indicated by a bit entry of 0. The program Cluster (17) was employed to cluster genes based on the similarity of their functional linkage profiles using a centered correlation coefficient as the comparison metric. The hierarchical clustering algorithm used was the average linkage algorithm. The program Treeview (17) was employed to visualize the clustering results. The original data table was represented graphically by coloring each cell according to the bit entry. Cells with a bit entry of 1 (corresponding to a functional linkage) were colored black, while cells with a bit entry of 0 (corresponding to absence of a functional linkage) were colored white.

Rosetta Stone method

Proteins were functionally linked by the Rosetta Stone method if individual proteins were found to be present as a single fused protein in another organism, as described by Marcotte et al. (6,18). In this case, if individual M.tuberculosis proteins have significant homology to distinct regions of a single ‘fusion’ protein in another organism then they are indicated as functionally linked by this method. A probabilistic score is calculated by estimating the likelihood of observing Rosetta Stone proteins given the number of homologs each protein has.

Phylogenetic profile method

Phylogenetic profiles were used to identify proteins that occur in a correlated fashion in numerous genomes, as described by Pellegrini et al. (8).
A phylogenetic profile for each M.tuberculosis protein was created in the form of a bit vector, by searching for the presence or absence of homologs in each of the available fully sequenced genomes. The presence of an identifiable homolog in a particular genome was indicated by the integer 1 in the bit vector at the position corresponding to that genome, while the absence of a homolog was indicated by the integer 0. Phylogenetic profiles were then clustered based on the similarity of profiles, resulting in clusters of genes with similar profiles and likely related functions.

Operon method

A series of genes was considered functionally linked by the Operon method if the nucleotide distance between genes in the same orientation was less than or equal to a specified distance threshold (9). Multiple genes were linked if a series of genes in the same orientation all had intergenic distances less than or equal to the defined distance threshold. In the case of our genome-wide functional linkage maps, a distance threshold of 100 bp was employed.

Conserved Gene Neighbor method

Functional links were established by the Conserved Gene Neighbor method where genes appear as chromosomal neighbors in multiple genomes, as described by Overbeek et al. (11) and Dandekar et al. (12). For all possible pairs of M.tuberculosis genes, the nucleotide distance between homologs of these genes in all available sequenced genomes was calculated. Genes that were in close proximity in multiple genomes were indicated as functionally linked by this method. A probabilistic score reflects the likelihood of observing the intergenic distance between a pair of genes across all sequenced genomes.

Sanger Institute functional annotations

Sanger Institute M.tuberculosis H37Rv functional annotations were obtained from the Sanger M.tuberculosis web server at http://www.sanger.ac.uk/Projects/M_tuberculosis/Gene_list/.
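The Operon criterion described in this section can be illustrated with a short Python sketch. It groups adjacent genes that share an orientation and are separated by no more than the distance threshold, then links the genes within each group. The gene tuples, the coordinates, the gap convention (start of the next gene minus end of the previous gene minus one) and the choice to link every pair within a run are assumptions made for this example, not details confirmed by the text.

```python
from itertools import combinations


def operon_links(genes, max_gap=100):
    """Link genes that lie in a run of same-strand neighbors.

    `genes` is a list of (name, start, end, strand) tuples with 1-based
    inclusive coordinates. Two neighboring genes join the same run when they
    are on the same strand and the number of bases between them is at most
    `max_gap`. Overlapping genes have a negative gap and therefore also join.
    """
    runs, current = [], []
    for gene in sorted(genes, key=lambda g: g[1]):
        if current:
            prev = current[-1]
            gap = gene[1] - prev[2] - 1  # bases between the two genes
            if gene[3] == prev[3] and gap <= max_gap:
                current.append(gene)
                continue
            runs.append(current)  # the run ended at the previous gene
        current = [gene]
    if current:
        runs.append(current)

    links = set()
    for run in runs:
        for a, b in combinations(run, 2):  # assumed: all pairs in a run
            links.add((a[0], b[0]))
    return links


# Hypothetical genes and coordinates, for illustration only.
toy_genes = [
    ("geneA", 1, 900, "+"),
    ("geneB", 950, 1800, "+"),   # 49 bp after geneA, same strand
    ("geneC", 1850, 2500, "+"),  # 49 bp after geneB, same strand
    ("geneD", 2550, 3000, "-"),  # strand change ends the run
    ("geneE", 3400, 4000, "-"),  # 399 bp after geneD, too far
]
links = operon_links(toy_genes, max_gap=100)
print(sorted(links))
```

Only geneA, geneB and geneC form a run here; geneD and geneE share a strand but are too far apart to be linked.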
Evaluation of functional linkages

The method of keyword recovery (9) was used to evaluate functional linkages represented in the genome-wide functional linkage map. The method of keyword recovery compares links between Swiss-Prot annotated proteins. For each pair of functionally linked proteins we define the first protein of the pair as ‘query protein A’ and the second protein of the pair as ‘linked protein B’. All protein pairs are reciprocal, since a link from gene 1 to gene 2 is also represented as a link from gene 2 to gene 1. The keyword recovery of all linkages was calculated as:

\[ \text{keyword recovery} = \frac{1}{X} \sum_{i=1}^{Y} \sum_{j=1}^{x} n_{ij} \]

where X is the total number of keywords in all query proteins, Y is the total number of linked gene pairs, x is the number of Swiss-Prot keywords of query protein A and n_{ij} is the number of times the query protein keyword j occurs in the annotation of the linked protein B.

Signal-to-noise was calculated as:

signal-to-noise = keyword recovery/random keyword recovery

where random keyword recovery was calculated for the same number of random pairwise Swiss-Prot annotated genes as there are computationally inferred links (mean of 100 random trials). The maximum false-positive fraction (9) was calculated as the fraction of pairwise links that do not have any Swiss-Prot keywords in common (ignoring the keywords ‘hypothetical protein’, ‘3D structure’, ‘transmembrane’ and ‘complete proteome’).

RESULTS

Validation of functional linkages

The 9766 functional linkages represented in the genome-wide functional linkage map (Fig. 1B) [...]
The method of keyword recovery allows us to evaluate a set of linkages based on known functional annotation. By comparing the Swiss-Prot keywords we can quantitatively evaluate the functional linkages represented in our genome-wide functional linkage map. The keyword recovery for linkages inferred by two or more methods is 0.55. Compared to the keyword recovery of randomly paired proteins (0.056) we have a signal-to-noise ratio of 9.8. The maximum false-positive fraction (9) also reflects the functional similarity among linked proteins. The maximum false-positive fraction is the fraction of functionally linked proteins that do not share any keywords in common. The quantity 1 – maximum false-positive fraction indicates the fraction of pairwise links that have one or more keywords in common, and therefore some function in common. The maximum false-positive fraction for the 9766 linkages inferred by two or more methods is 0.20, demonstrating that 80% [100% × (1 – maximum false-positive fraction)] of linked pairs share some function in common. This percentage (80%) may in fact represent the lower boundary, since some proteins may have incomplete annotation or may employ different vocabularies to describe similar functions. The linkage of genes by more than one method is expected in a number of cases since the computational methods we employ are inherently complementary. For example, Moreno-Hagelsieb et al. have shown that genes of common operons are more likely to occur as fusion genes (Rosetta Stone linkage), are more likely to occur as Conserved Gene Neighbors and are more likely to share similar Phylogenetic profiles than genes of different operons (19). Yanai et al. have also noted numerous instances where individual genes that constitute a fusion gene in one organism occur as operon members in another organism (20). 
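The evaluation measures used here can be sketched in a few lines of Python. The sketch assumes that keyword recovery is the number of query-protein keywords found again in the annotation of the linked protein, summed over all directed pairs and divided by the total number of query keywords. The protein names, keyword sets and function names are hypothetical, and keywords are assumed to be lower-cased.

```python
# Keywords ignored when computing the maximum false-positive fraction,
# as listed in the text (lower-cased here by assumption).
IGNORED = {"hypothetical protein", "3d structure", "transmembrane", "complete proteome"}


def keyword_recovery(pairs, keywords):
    """Fraction of query-protein keywords recovered in the linked protein.

    `pairs` holds directed (query, linked) pairs; `keywords` maps a protein
    name to its set of annotation keywords.
    """
    recovered = total = 0
    for query, linked in pairs:
        query_keywords = keywords[query]
        total += len(query_keywords)
        recovered += len(query_keywords & keywords[linked])
    return recovered / total if total else 0.0


def max_false_positive_fraction(pairs, keywords, ignored=IGNORED):
    """Fraction of pairs sharing no keyword once uninformative ones are ignored."""
    pairs = list(pairs)
    unshared = sum(
        1 for a, b in pairs
        if not ((keywords[a] - ignored) & (keywords[b] - ignored))
    )
    return unshared / len(pairs)


def signal_to_noise(observed_recovery, random_recovery):
    """Ratio of observed keyword recovery to that of random pairs."""
    return observed_recovery / random_recovery


# Hypothetical annotations; links are listed in both directions because
# every pair is reciprocal.
toy_keywords = {
    "p1": {"kinase", "atp-binding"},
    "p2": {"kinase", "transmembrane"},
    "p3": {"ribosomal protein"},
}
toy_pairs = [("p1", "p2"), ("p2", "p1"), ("p1", "p3"), ("p3", "p1")]

recovery = keyword_recovery(toy_pairs, toy_keywords)
false_positive = max_false_positive_fraction(toy_pairs, toy_keywords)
reported_ratio = signal_to_noise(0.55, 0.056)  # values reported in the text
print(recovery, false_positive, round(reported_ratio, 1))
```

With the values reported above, the ratio 0.55/0.056 rounds to 9.8, and a maximum false-positive fraction of 0.20 corresponds to 80% of linked pairs sharing at least one keyword.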
Linkages that are inferred by two or more methods have been shown to have a higher keyword recovery score and lower maximum false-positive score than linkages inferred by only a single method (9,15). The resulting lists of functionally linked genes, inferred by overlapping methods, therefore represent higher confidence functional linkages (15) and are well suited for the construction of our genome-wide functional linkage maps.

Genome-wide functional linkage maps

Figure 1 [...] Figure 1B, [...] Notice that Figure 1B [...]

Our analysis of genome-wide functional linkage maps reveals certain chromosomal regions that have genes with many functional linkages and other regions that have few predicted linkages. This non-random connectivity may relate to the scale-free nature of protein network topology (21,22), and may enable detection of clusters of both promiscuous proteins as well as highly specialized proteins throughout the M.tuberculosis genome. A downloadable file of the genome-wide functional linkage map is provided at http://www.doe-mbi.ucla.edu/~strong/map. This file enables the identification of each point of the genome-wide functional linkage map in an interactive manner.

In Figure 2A [...]
Region B of Figure 2C [...] All six of the genes in cluster B occur next to each other in the same genomic orientation (Fig. 2C), [...] Figure 2D [...]

Hierarchical clustering

Although the off-diagonal points can identify functionally linked genes and gene clusters throughout the genome, we adopt a complementary method to facilitate this process. We apply hierarchical clustering to group the genes of our genome-wide functional linkage maps based on the similarity of their functional linkage profiles. A functional linkage profile for a particular gene is equivalent to a single row in our genome-wide functional linkage map and consists of a record of all the genes to which a particular gene is functionally linked.

Figure 3 [...]
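To make the notion of a functional linkage profile concrete, the following Python sketch builds one bit vector per gene from a list of integer gene pairs and groups the genes by average-linkage clustering on a centered (Pearson) correlation. It is a plain re-implementation written for illustration, using the mean pairwise similarity between clusters, and is not the Cluster program; all function names and the toy pairs are assumptions.

```python
from math import sqrt


def linkage_profiles(pairs, genes):
    """Build one bit vector per gene: 1 where a functional linkage exists."""
    index = {gene: i for i, gene in enumerate(genes)}
    profiles = {gene: [0] * len(genes) for gene in genes}
    for a, b in pairs:
        profiles[a][index[b]] = 1  # links are symmetric, so set both rows
        profiles[b][index[a]] = 1
    return profiles


def centered_correlation(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))
    norm_u = sqrt(sum((x - mean_u) ** 2 for x in u))
    norm_v = sqrt(sum((y - mean_v) ** 2 for y in v))
    return cov / (norm_u * norm_v) if norm_u and norm_v else 0.0


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, similarity)."""
    clusters = [(gene,) for gene in profiles]
    merges = []

    def similarity(c1, c2):
        # Mean pairwise correlation between members of the two clusters.
        values = [centered_correlation(profiles[a], profiles[b])
                  for a in c1 for b in c2]
        return sum(values) / len(values)

    while len(clusters) > 1:
        score, i, j = max(
            ((similarity(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda item: item[0],
        )
        merges.append((clusters[i], clusters[j], score))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Invented integer gene pairs: genes 1 and 2 are both linked to genes 3 and 4,
# so their profiles are identical.
toy_genes = [1, 2, 3, 4, 5, 6]
toy_pairs = [(1, 3), (1, 4), (2, 3), (2, 4), (5, 6)]
profiles = linkage_profiles(toy_pairs, toy_genes)
merges = average_linkage(profiles)
for left, right, score in merges:
    print(left, right, round(score, 3))
```

Genes with identical profiles, such as genes 1 and 2 in the toy data, are joined first, which is the behavior that places genes with similar linkage profiles next to each other in the clustered map.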
Figure 4A [...]
Although in our original functional linkage map the dense clustering along the diagonal reflects operon organization and genomic clustering of functionally related genes, the diagonal in Figure 4A [...]

These resulting gene clusters are analogous to the functional modules described by Snel et al. (29), who used the Conserved Gene Neighbor method to construct traditional node and edge protein networks to identify groups of proteins involved in related cellular functions. While we employ a different method than Snel et al. for the visualization and examination of protein networks, we make related observations, such as the presence of distinct functional modules consisting of proteins involved in related cellular functions, as well as the presence of linker proteins (29), which link functional modules of varied function. These observations also correspond well with the observations of Rives et al. (30) and Ravasz et al. (31), who employed analogous methods of network clustering to investigate the modular organization of yeast protein interaction networks (30) and E.coli metabolic networks (31), respectively.

Our method of hierarchical clustering tends to cluster genes of related biological function, but we see relatively little overlap among separate clusters, as seen by the modest number of off-diagonal functional linkages connecting the varying clusters. All off-diagonal functional linkages in Figure 4A [...] Figure 4B [...] The two largest modules of Figure 4C [...]

Other examples of module linkers include a linkage between a serine-threonine kinase/phosphatase module and a cell envelope module. This linkage is mediated by the functional linkage between Rv0018c (ppp) and Rv0019c. In addition, we observe a linkage between two related ribosomal modules, mediated by the functional linkage between Rv0717 (rpsN1) and Rv0702 (rplD).

Figure 5 [...]
Figure 5E [...]

Inference of protein function

We can also use these clusters to infer function for previously uncharacterized proteins. Figure 6A–E [...]
Figure 6C [...] Finally, in Figure 6E [...]
DISCUSSION

We find that the four computational methods, the Rosetta Stone, Phylogenetic profile, Operon and Conserved Gene Neighbor methods, can be applied not only to link pairs of proteins throughout a particular genome, but also to construct complex protein networks. These methods have varied applications, ranging from the analysis of functional linkages on a genome-wide scale to the inference of protein function. We propose that genome-wide functional linkage maps may provide a useful method for the visualization and interpretation of both experimentally derived (3–5) as well as computationally inferred (33,34) functional linkage datasets on a genome-wide basis.

Although the functional linkage maps we have discussed thus far have incorporated functional linkages inferred by at least two of our computational methods, we have also constructed genome maps using linkages established by each of the four computational methods alone (Supplementary Fig. 1).

The use of individual computational methods is also likely to aid in the inference of protein function. Since our original clustered map contains only linkages inferred by the overlap of two or more methods, examination of linkages established by individual methods may provide additional information and may aid in the identification of protein function involving clusters of previously uncharacterized genes. We envision that these functional linkages may suggest potential functional roles for these proteins and may indicate potential research directions or biochemical experiments in which to investigate these proteins.

Using a combination of our genome-wide functional linkage maps and hierarchical clustering, we have been able to elucidate features of protein network architecture that have previously eluded inference.
These two types of graphical representation have allowed us to rapidly analyze genome-wide functional linkages and have enabled us to infer protein function and identify potential pathways involving previously uncharacterized proteins. While we have focused our attention here on the deadly bacterial pathogen M.tuberculosis, these methods can also be applied to any prokaryotic organism, and may even be extended to examine genome features of eukaryotes.

We have been able to assign function to a number of previously uncharacterized genes, including genes involved in cell wall metabolism, chaperone/heat shock activity, energy metabolism and polysaccharide metabolism, and suggest a potential pathway involving serine/threonine kinases and cell wall metabolism genes. Some of the proteins to which we have assigned function may, in turn, serve as potential drug targets, since a number of the pathways to which they are linked have been previously proposed as drug targets (35–37).

Mycobacterium tuberculosis genome-wide functional linkage maps, dendrograms and functional linkages are available at http://www.doe-mbi.ucla.edu/~strong/map/.

Supplementary Material is available at NAR Online.
ACKNOWLEDGEMENTS

M.S. is supported by a USPHS National Research Service Award GM07185. This work was also supported by the National Institutes of Health under grant no. P01 GM31299-20.

REFERENCES

1. Uetz,P., Giot,L., Cagney,G., Mansfield,T.A., Judson,R.S., Knight,J.R., Lockshon,D., Narayan,V., Srinivasan,M., Pochart,P. et al. (2000) A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae. Nature, 403, 623–627.
2. Brown,P.O. and Botstein,D. (1999) Exploring the new world of the genome with DNA microarrays. Nature Genet., 21, 33–37.
3. Xenarios,I., Salwinski,L., Duan,X.J., Higney,P., Kim,S.M. and Eisenberg,D. (2002) DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res., 30, 303–305.
4. Bader,G.D., Betel,D. and Hogue,C.W. (2003) BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res., 31, 248–250.
5. Mewes,H.W., Frishman,D., Guldener,U., Mannhaupt,G., Mayer,K., Mokrejs,M., Morgenstern,B., Munsterkotter,M., Rudd,S. and Weil,B. (2002) MIPS: a database for genomes and protein sequences. Nucleic Acids Res., 30, 31–34.
6. Marcotte,E.M., Pellegrini,M., Ng,H.L., Rice,D.W., Yeates,T.O. and Eisenberg,D. (1999) Detecting protein function and protein–protein interactions from genome sequences. Science, 285, 751–753.
7. Enright,A.J., Iliopoulos,I., Kyrpides,N.C. and Ouzounis,C.A. (1999) Protein interaction maps for complete genomes based on gene fusion events. Nature, 402, 86–90.
8. Pellegrini,M., Marcotte,E.M., Thompson,M.J., Eisenberg,D. and Yeates,T.O. (1999) Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA, 96, 4285–4288.
9. Strong,M., Mallick,P., Pellegrini,M., Thompson,M.J. and Eisenberg,D. (2003) Inference of protein function and protein linkages in M. tuberculosis based on prokaryotic genome organization: a combined computational approach. Genome Biol., 4, R59.1–R59.16.
10. Pellegrini,M., Thompson,M., Fierro,J. and Bowers,P. (2001) Computational method to assign microbial genes to pathways. J. Cell. Biochem., 37, 106–109.
11. Overbeek,R., Fonstein,M., D’Souza,M., Pusch,G.D. and Maltsev,N. (1999) The use of gene clusters to infer functional coupling. Proc. Natl Acad. Sci. USA, 96, 2896–2901.
12. Dandekar,T., Snel,B., Huynen,M. and Bork,P. (1998) Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci., 23, 324–328.
13. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
14. Eisenberg,D., Marcotte,E.M., Xenarios,I. and Yeates,T.O. (2000) Protein function in the post-genomic era. Nature, 405, 823–826.
15. Marcotte,E.M., Pellegrini,M., Thompson,M.J., Yeates,T.O. and Eisenberg,D. (1999) A combined algorithm for genome-wide prediction of protein function. Nature, 402, 83–86.
16. Grigoriev,A. (2001) A relationship between gene expression and protein interactions on the proteome scale: analysis of the bacteriophage T7 and the yeast Saccharomyces cerevisiae. Nucleic Acids Res., 29, 3513–3519.
17. Eisen,M.B., Spellman,P.T., Brown,P.O. and Botstein,D. (1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA, 95, 14863–14868.
18. Marcotte,C.J.V. and Marcotte,E.M. (2002) Predicting functional linkages from gene fusions with confidence. Appl. Bioinformatics, 1, 93–100.
19. Moreno-Hagelsieb,G., Trevino,V., Perez-Rueda,E., Smith,T.F. and Collado-Vides,J. (2001) Transcription unit conservation in the three domains of life: a perspective from Escherichia coli. Trends Genet., 17, 175–177.
20. Yanai,I., Wolf,Y.I. and Koonin,E.V. (2002) Evolution of gene fusions: horizontal transfer versus independent events. Genome Biol., 3, 24.1–24.13.
21. Wuchty,S. (2002) Interaction and domain networks of yeast. Proteomics, 2, 1715–1723.
22. Jeong,H., Tombor,B., Albert,R., Oltvai,Z.N. and Barabasi,A.L. (2000) The large-scale organization of metabolic networks. Nature, 407, 651–654.
23. Tatusov,R.L., Galperin,M.Y., Natale,D.A. and Koonin,E.V. (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res., 28, 33–36.
24. Aravind,L. and Koonin,E.V. (1999) DNA-binding proteins and evolution of transcription regulation in the archaea. Nucleic Acids Res., 27, 4658–4670.
25. Yeats,C., Finn,R.D. and Bateman,A. (2002) The PASTA domain: a beta-lactam-binding domain. Trends Biochem. Sci., 27, 438–440.
26. Salgado,H., Moreno-Hagelsieb,G., Smith,T. and Collado-Vides,J. (2000) Operons in Escherichia coli: genomic analysis and predictions. Proc. Natl Acad. Sci. USA, 97, 6652–6657.
27. Lodish,H., Berk,A., Zipursky,S.L., Matsudaira,P., Darnell,J. and Baltimore,D. (1995) Molecular Cell Biology, 3rd Edn. Scientific American Books, New York, NY.
28. Pallen,M., Chaudhuri,R. and Khan,A. (2002) Bacterial FHA domains: neglected players in the phospho-threonine signalling game? Trends Microbiol., 10, 556–563.
29. Snel,B., Bork,P. and Huynen,M.A. (2002) The identification of functional modules from the genomic association of genes. Proc. Natl Acad. Sci. USA, 99, 5890–5895.
30. Rives,A.W. and Galitski,T. (2003) Modular organization of cellular networks. Proc. Natl Acad. Sci. USA, 100, 1128–1133.
31. Ravasz,E., Somera,A.L., Mongru,D.A., Oltvai,Z.N. and Barabasi,A.L. (2002) Hierarchical organization of modularity in metabolic networks. Science, 297, 1551–1555.
32. Horswill,A.R. and Escalante-Semerena,J.C. (2001) In vitro conversion of propionate to pyruvate by Salmonella enterica enzymes: 2-methylcitrate dehydratase (PrpD) and aconitase enzymes catalyze the conversion of 2-methylcitrate to 2-methylisocitrate. Biochemistry, 40, 4703–4713.
33. von Mering,C., Huynen,M., Jaeggi,D., Schmidt,S., Bork,P. and Snel,B. (2003) STRING: a database of predicted functional associations between proteins. Nucleic Acids Res., 31, 258–261.
34. Mellor,J.C., Yanai,I., Clodfelter,K.H., Mintseris,J. and DeLisi,C. (2002) Predictome: a database of putative functional links between proteins. Nucleic Acids Res., 30, 306–309.
35. Rose,J.D., Maddry,J.A., Comber,R.N., Suling,W.J., Wilson,L.N. and Reynolds,R.C. (2002) Synthesis and biological evaluation of trehalose analogs as potential inhibitors of mycobacterial cell wall biosynthesis. Carbohydr. Res., 337, 105–120.
36. McKinney,J.D., Honer zu Bentrup,K., Munoz-Elias,E.J., Miczak,A., Chen,B., Chan,W.T., Swenson,D., Sacchettini,J.C., Jacobs,W.R.,Jr and Russell,D.G. (2000) Persistence of Mycobacterium tuberculosis in macrophages and mice requires the glyoxylate shunt enzyme isocitrate lyase. Nature, 406, 735–738.
37. Drews,S.J., Hung,F. and Av-Gay,Y. (2001) A protein kinase inhibitor as an antimycobacterial agent. FEMS Microbiol. Lett., 205, 369–374.