Nucleic Acids Res. 2008 Jan; 36(Database issue): D562–D571.
Published online 2007 Oct 18. doi: 10.1093/nar/gkm758
PMCID: PMC2238957

CFGP: a web-based, comparative fungal genomics platform


Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. The resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers: the basal layer, the middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools: BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP is built on a new concept of collecting data first and executing analyses afterwards, in contrast to the ‘fill-in-the-form-and-press-SUBMIT’ user interfaces utilized by most bioinformatics sites. Another novel feature of the DUI is a tool termed Favorite, which supports the management of encapsulated sequence data and provides users with a personalized data repository.


Fungi exert a far-reaching influence on the earth's biosphere (1). As recyclers of organic matter and as symbionts of most terrestrial plants, fungi are essential components of healthy ecosystems (2). For thousands of years, humans have exploited fungi for the production of many useful compounds and foods (3). In contrast to these benefits, fungi are also a major cause of plant diseases, significantly reducing crop yield (4). Fungi also pose a direct threat to human health: systemic mycoses are the most common cause of death in immunocompromised patients, such as bone marrow transplant recipients and individuals suffering from advanced HIV infection (5,6).

Studies on fungal biology have been greatly aided by rapidly accumulating genome sequence data (7). Since the completion of the genome sequencing of Saccharomyces cerevisiae (8), genomes of more than 80 fungal species have been completely sequenced or are currently being sequenced (7,9). As new high-throughput and low-cost sequencing technologies (10) become widely available, the rate of fungal genome sequencing will continue to accelerate. Currently available fungal genome sequences cover species in four of the seven fungal phyla: Ascomycota, Basidiomycota, Chytridiomycota and Microsporidia (11,12) (Table 1). These genome sequences provide novel opportunities for elucidating the evolutionary and genetic basis of many different fungal lifestyle features, such as pathogenesis, symbiosis and the ability to grow on diverse substrates (9,13,14), via the use of various functional genomic and informatic tools. A better understanding of fungal biology will not only facilitate the judicious use of beneficial fungi, but also advance our efforts to control pathogenic species (15,16).

Table 1.
List of genome sequences stored in the data warehouse of the CFGP

The abundance of sequenced species has facilitated in-depth comparative evolutionary genomic analyses across multiple fungal taxa (17–20). Because of the large amount of data involved, a cohesive, user-friendly informatics platform that links data and analysis tools is needed to efficiently support such analyses. Despite this need, the lack of data standardization has hampered the development of such platforms. The Genome Information Management System (GIMS) provided an integrated environment for archiving and visualizing genome sequences together with transcriptome, protein–protein interaction, Gene Ontology (GO) and metabolic pathway data (21). ‘eFungi’, an improvement on GIMS, stores genome sequences of 34 fungal and 2 Oomycete species (http://www.e-fungi.org.uk/). Although these systems systematically archive genomic data from multiple species, they do not support analysis of archived data with bioinformatic tools.

Heterogeneity of user interface (UI) and input/output data format in different bioinformatics tools has also complicated the integration of tools in a single platform to support multifaceted analyses of multiple genome sequences. Several systems provide multiple tools via a single platform. One example is the SNAP workbench, which supports sophisticated phylogenetic analyses through a menu-driven design (22). The iNquiry (BioTeam Inc., Wayland, MA, USA; http://web.bioteam.net/metadot/index.pl?id=2187) and European Molecular Biology Open Software Suite (EMBOSS) (23) are other examples of integrated, web-based platforms with multiple bioinformatic tools. The PLATCOM integrates a number of tools for comparative analysis of multiple genomes (24,25). These platforms, integrating data and tools, significantly shorten data analysis time by eliminating the need for visiting multiple, independent web sites to collect and analyse data. The ISYS platform, written in Java, utilizes middleware to link many different databases to data analysis tools and allows these tools to communicate without any modification (26). Although these examples illustrate major improvements in supporting integrative analyses of genome sequence data via a single platform, the efficiency and expandability of such platforms require continuous enhancement in order to adequately support utilization of rapidly increasing genome sequence data. Another area that requires improvement is the UI. Many currently available web-based bioinformatic platforms employ classical UI systems that simply display a list of functions or databases and provide a ‘paste-sequence-and-press-submit’ form (http://ausweb.scu.edu.au/aw02/papers/refereed/fitch/paper.html). Such UIs are easy to construct, but are not suitable for successively analysing sequence data with multiple tools.

To provide an effective means for analysing fungal genome sequence data through a suite of tools across multiple species, we developed the Comparative Fungal Genomics Platform (CFGP), which consists of a large-scale genomic data warehouse, bioinformatics tools useful for comparative genome analyses and a novel UI. The UI of the CFGP provides easy access to sequence data stored in the data warehouse and seamlessly supports integrative data analyses using multiple tools. The data warehouse currently houses 101 genome databases in a standardized format for rapid data exchange. Bioinformatic tools incorporated into the CFGP were wrapped by a middleware program to efficiently manage tasks and facilitate data exchange between tools.


The CFGP consists of three layers: the basal layer, middleware and the UI (Figure 1). The basal layer contains a data warehouse, which is managed using MySQL. Meta information for different types of biological data, including genome sequences, species and phenotype screening data, is placed as individual objects in this layer. The middleware connects the basal layer with the UI and supports the use of data analysis tools, including BLAST (27), ClustalW for multiple sequence alignment (28), InterProScan for predicting functional domains (29), SignalP 3.0 for predicting the presence of signal peptides (30), PSORT II for predicting subcellular localization (31) and a newly developed program named BLASTMatrix for identifying and summarizing the distribution pattern of homologous genes across the genome sequences stored in the CFGP. As a result of the standardization of data exchange, the functionality of the CFGP can be easily expanded by adding any new tools that function in the UNIX environment. The UI of the CFGP, developed with PHP (http://www.php.net), is based on a new concept termed the Data-driven User Interface (DUI). By collecting sequences to be analysed first and executing analyses later, the DUI significantly reduces the time required for analysing the same sequence data via multiple tools.

Figure 1.
Overall system architecture and data flow in the CFGP. The basal layer contains a data warehouse, Favorite (a personal data repository and management tool), and external databases, such as InterPro and GO, stored in the CFGP. The wrapper in the middle ...

The three layers of the CFGP can be manipulated and developed independently, which provides an optimal environment for maintenance and expansion of the CFGP. This was made possible by employing a standardized scheme in building each layer. In the basal layer, the functions and schemas of databases were standardized in both naming rules and basic programming style, which enhances the efficiency of database development. In the middle layer, communications between the CFGP and external programs were standardized via Perl modules. This facilitates the future expansion of functionality, because new programs can be easily incorporated into the CFGP by constructing additional Perl modules. In the DUI, most of the interface components were standardized as functions so that a developer can easily build a new UI from selected components.
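The idea of wrapping each external program behind a common interface can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the CFGP middleware uses Perl modules; the names SequenceRecord, register and run, and the two toy tools, are hypothetical and are not part of the CFGP code base. The sketch only shows how a shared record format lets a new tool be added without touching the other layers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SequenceRecord:
    """Standardized unit of exchange between layers (hypothetical fields)."""
    seq_id: str
    species: str
    sequence: str


# Every wrapper accepts standardized records and returns one result dict per record.
Wrapper = Callable[[List[SequenceRecord]], List[dict]]

_REGISTRY: Dict[str, Wrapper] = {}


def register(name: str) -> Callable[[Wrapper], Wrapper]:
    """Register a tool wrapper in the middleware under the given name."""
    def decorator(func: Wrapper) -> Wrapper:
        _REGISTRY[name] = func
        return func
    return decorator


@register("length")
def length_tool(records: List[SequenceRecord]) -> List[dict]:
    # Toy stand-in for an external program; a real wrapper would call the tool
    # and translate its output into the standardized result format.
    return [{"seq_id": r.seq_id, "length": len(r.sequence)} for r in records]


@register("met_start")
def met_start_tool(records: List[SequenceRecord]) -> List[dict]:
    # Second toy tool, added without changing any existing code.
    return [{"seq_id": r.seq_id, "starts_with_M": r.sequence.startswith("M")}
            for r in records]


def run(tool_name: str, records: List[SequenceRecord]) -> List[dict]:
    """Dispatch a request from the UI layer to the named wrapper."""
    if tool_name not in _REGISTRY:
        raise KeyError(f"unknown tool: {tool_name}")
    return _REGISTRY[tool_name](records)
```

Because the UI only calls run with a tool name and a list of records, adding a tool amounts to writing one more wrapper, which mirrors the role the text assigns to additional Perl modules.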


Data warehouse

Fungal genome sequence data in the public domain are stored in heterogeneous formats, posing a hurdle in integrating the data for comparative analysis. We retrieved these data and stored all Open Reading Frame (ORF) and contig (or chromosome) sequences of individual genomes in the data warehouse of the CFGP in a single format using MySQL. Subsequently, all sequence data were encapsulated as individual objects so that they can be easily analysed through multiple data analysis tools. The data warehouse currently houses the genome sequences of 65 fungal species, 4 Oomycete species and 27 non-fungal organisms (Table 1). The fungal genome databases cover 52 species belonging to the Ascomycota, eight species in the Basidiomycota, two species each in the Mucoromycotina and the Microsporidia and one in the Chytridiomycota (12).
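As a rough illustration of storing heterogeneous sequence files in a single format, the sketch below loads FASTA text into one table. It makes several assumptions that are not taken from the CFGP: Python's built-in sqlite3 module stands in for MySQL, the table and column names are invented for the example, and the source files are assumed to be FASTA.

```python
import sqlite3

# Hypothetical single-table layout; the real CFGP schema is not described here.
SCHEMA = """
CREATE TABLE sequence (
    seq_id   INTEGER PRIMARY KEY,
    species  TEXT NOT NULL,
    seq_type TEXT NOT NULL CHECK (seq_type IN ('ORF', 'contig')),
    name     TEXT NOT NULL,
    residues TEXT NOT NULL
)
"""


def build_warehouse():
    """Create an in-memory database with the standardized table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def parse_fasta(text):
    """Yield (name, residues) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def load_fasta(conn, species, seq_type, fasta_text):
    """Insert every record of one FASTA source; return the number of rows added."""
    rows = [(species, seq_type, name, residues)
            for name, residues in parse_fasta(fasta_text)]
    conn.executemany(
        "INSERT INTO sequence (species, seq_type, name, residues) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Once every genome passes through the same loader, each sequence can be addressed by its integer primary key regardless of where the original file came from.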

Data-driven user interface (DUI)

Most of the bioinformatics tools currently available through the web typically provide a box in the UI for pasting a query sequence. However, as the complexity of scientific inquiries increases, often requiring multiple analyses with a single query, a single analysis with multiple sequences, or a combination of both, this type of UI becomes inefficient, and a new UI design is required (32). The only current solution for analysing a large number of sequences is batch processing of data, which likely requires some level of programming knowledge by the user.

We developed the DUI to seamlessly support data management and integrative analyses using a suite of data analysis tools. It consists of two compartments: the Data Frame, which supports browsing and collection of data, and the Manipulation Frame, which supports data management (Figure 2A). Four browsing tools under the ‘SEQUENCE’ menu include Contig Browser for browsing data in the data warehouse, SequenceSet Browser for browsing data in databases such as Uniprot, MyGene Browser for browsing data in the user's own computer and NR Browser for NR and NT sequences of NCBI. The Manipulation Frame provides a mechanism for storing and organizing collected data in a personalized space in the CFGP. The collection arrow transfers selected sequence data from the Data Frame to the Manipulation Frame, where they can be analysed by any of the bioinformatic tools in the CFGP. This data management scheme significantly enhances the efficiency of data analysis, especially when large amounts of data are involved.

Figure 2.
Structure of DUI. (A) A screenshot shows the process of data acquisition from Contig Browser. On the left side, ‘Data Frame’ displays the list of Magnaporthe oryzae proteins and ‘Manipulation Frame’ on the right side shows ...

Favorite as a bioinformatic workbench

A new UI tool named Favorite was developed to provide a personalized hub for storing and managing sequences retrieved from the data warehouse (Figure 2B). By storing only the primary keys of chosen sequences, not the sequences themselves, Favorite significantly reduces the space needed for storing data. Data stored in Favorite can be analysed with one tool or a series of tools by simply clicking the appropriate analyses in the option window (Figure 2C).
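The design choice of keeping primary keys rather than sequences can be sketched as follows. This is only an illustration in Python: the Favorite class below, its method names and the two toy analysis functions are hypothetical, and a plain dictionary stands in for the data warehouse.

```python
class Favorite:
    """Personal collection that stores primary keys only (hypothetical design)."""

    def __init__(self, warehouse):
        self._warehouse = warehouse  # mapping: primary key -> sequence string
        self._keys = []              # insertion-ordered, without duplicates

    def add(self, key):
        """Collect a sequence by key; the sequence itself is not copied."""
        if key not in self._warehouse:
            raise KeyError(f"no such sequence: {key}")
        if key not in self._keys:
            self._keys.append(key)

    def keys(self):
        return list(self._keys)

    def sequences(self):
        """Resolve the stored keys against the warehouse on demand."""
        return [(k, self._warehouse[k]) for k in self._keys]

    def analyse(self, *tools):
        """Apply each tool to every collected sequence.

        Returns {tool_name: {key: result}}, so one collection can be
        passed through a series of tools in a single call.
        """
        return {
            tool.__name__: {k: tool(seq) for k, seq in self.sequences()}
            for tool in tools
        }


def length(seq):
    """Toy analysis: sequence length."""
    return len(seq)


def starts_with_met(seq):
    """Toy analysis: does the protein start with methionine?"""
    return seq.startswith("M")
```

With this layout the per-user storage grows with the number of keys, not with sequence length, and the same collection can be reused for successive analyses.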

Five external programs (BLAST, ClustalW, InterProScan, SignalP 3.0 and PSORT II) are available in Favorite. A BLAST search result can be presented in six different formats. One of them is ‘interpro view’, which displays the BLAST result annotated by InterPro to provide the functional prediction of the proteins in the BLAST output. ClustalW provides three different output formats: a multiple sequence alignment, a distance matrix and a bootstrapped phylogenetic tree. The MSA viewer and Phyloviewer aid the user in manipulating the results of multiple sequence alignments and phylogenetic trees, respectively (http://phyloviewer.riceblast.snu.ac.kr; J. Park et al., unpublished data). Results from InterProScan, SignalP and PSORT II are stored in the annotation database so that all results can be displayed in the annotation page of each query sequence. All analysis outputs provide an option of storing any sequences in the output into Favorite, offering an easy way to collect selected sequences for subsequent analyses.

To empower the personalized use of Favorite, user authentication is required. Besides supporting the management of individual users’ data, Favorite can also be used to exchange data with other researchers. In addition, Favorite retains the user's original reference data, which overcomes any discrepancies between analyses conducted at different time points due to the frequent updating of external databases, such as the NR database in NCBI.

BLASTMatrix, a novel tool for searching and visualizing potential homologs across multiple species

With the availability of a large number of completely sequenced fungal genomes, it is possible to analyse the distribution of homologous genes across fungal taxa (7,9). This task currently requires repeated BLAST searches against individual genome datasets, which is iterative and cumbersome (33). To solve this problem, a new tool named BLASTMatrix was developed and linked to the CFGP. With a query sequence, BLASTMatrix generates a table containing the best hit in each of the species, organized according to the taxonomic positions of the species (Figure 3A), and also calculates the distribution pattern of homologous genes in different taxonomic groups (Figure 3B). The output can include InterPro or GO terms, helping the prediction of putative functions of hypothetical proteins. Further analyses can then determine the orthologous relationships between the query and its homologs in individual species.
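The tabulation described above can be approximated with a short sketch. It is not the BLASTMatrix implementation: the function names, the input format (tuples of species, subject identifier and e-value, as could be parsed from tabular BLAST output) and the default e-value cutoff of 1e-5 are all assumptions made for illustration.

```python
from collections import defaultdict


def best_hits(hits):
    """Return {species: (subject_id, evalue)} keeping the lowest e-value per species.

    hits: iterable of (species, subject_id, evalue) tuples for one query.
    """
    best = {}
    for species, subject_id, evalue in hits:
        if species not in best or evalue < best[species][1]:
            best[species] = (subject_id, evalue)
    return best


def blast_matrix(hits_by_query, species_order):
    """Build one row per query and one column per species.

    A cell holds (subject_id, evalue) for the best hit, or None when the
    query has no hit in that species. species_order fixes the column order,
    for example sorted by taxonomic position.
    """
    matrix = {}
    for query, hits in hits_by_query.items():
        best = best_hits(hits)
        matrix[query] = [best.get(sp) for sp in species_order]
    return matrix


def taxon_distribution(matrix, species_order, taxon_of, max_evalue=1e-5):
    """Count, per query, the species in each taxon whose best hit passes the cutoff.

    max_evalue is an assumed threshold, not a value taken from the CFGP.
    """
    dist = {}
    for query, row in matrix.items():
        counts = defaultdict(int)
        for sp, cell in zip(species_order, row):
            if cell is not None and cell[1] <= max_evalue:
                counts[taxon_of[sp]] += 1
        dist[query] = dict(counts)
    return dist
```

A single pass over the hits replaces the repeated per-genome searches: the matrix corresponds to the table view, and the per-taxon counts correspond to the distribution view.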

Figure 3.
Format of BLASTMatrix output. An example of BLASTMatrix output generated using the aflatoxin gene cluster in Aspergillus nidulans as queries. The results are presented in a matrix format (A) and a distribution based on e-value (B). Additionally, BLASTMatrix ...


Genome sequences, along with associated functional genomics data, will continue to accumulate at an exponential rate. To efficiently utilize this inflow of data, standardization of data and efficient communication among data analysis tools are required. Enhancing the standard of communication between programs will also help future expansion by integrating more bioinformatics tools and will provide a development environment for open source projects. Additional genomic information, such as alternative splicing and expression data derived from EST, SAGE and microarray experiments, can be added to the CFGP.


This research was partially supported by grants from the Crop Functional Genomics Center (CG1141) and the Microbial Genomics and Applications Center (0462-20060021) of the 21st Century Frontier Research Program funded by the Ministry of Science and Technology and a grant from the Biogreen21 Project (20050401034629) funded by the Rural Development Administration to Y.H.L. A grant from the USDA-NRI Plant Biosecurity program (2005-35605-15393) to S.K. also supported this project. J.P. is grateful for a graduate fellowship provided by the Ministry of Education through the Brain Korea 21 Agricultural Biotechnology Project. Funding to pay the Open Access publication charges for this article was provided by the Brain Korea 21 Agricultural Biotechnology Project.

Conflict of interest statement. None declared.


1. Hawksworth DL. The fungal dimension of biodiversity: magnitude, significance, and conservation. Mycol. Res. 1991;95:641–655.
2. Van der Heijden MGA, Klironomos JN, Ursic M, Moutoglis P, Streitwolf-Engel R, Boller T, Wiemken A, Sanders IR. Mycorrhizal fungal diversity determines plant biodiversity, ecosystem variability and productivity. Nature. 1998;396:69.
3. Demain AL. Microbial biotechnology. Trends Biotechnol. 2000;18:26–31. [PubMed]
4. Agrios GN. Plant Pathology, 5th edn. San Diego: Academic Press; 2005.
5. Denning DW. Invasive aspergillosis. Clin. Infect. Dis. 1998;26:781–805. [PubMed]
6. Kwon-Chung KJ, Varma A, Howard DH. Ecology of Cryptococcus neoformans and prevalence of its two varieties in AIDS and non-AIDS associated Cryptococcosis. In: Vanden Bossche H, Mackenzie DWR, Cauwenbergh G, Van Custem J, Drouhet E, Dupont B, editors. Mycoses in AIDS Patients. New York: Plenum Press; 1990. pp. 103–113.
7. Galagan JE, Henn MR, Ma LJ, Cuomo CA, Birren B. Genomics of the fungal kingdom: insights into eukaryotic biology. Genome Res. 2005;15:1620–1631. [PubMed]
8. Goffeau A, Barrell BG, Bussey H, Davis RW, Dujon B, Feldmann H, Galibert F, Hoheisel JD, Jacq C, et al. Life with 6000 genes. Science. 1996;274:546, 563–567. [PubMed]
9. Park J, Kim H, Kim S, Kong S, Park J, Kim S, Han H, Park B, Jung K, et al. A comparative genome-wide analysis of GATA transcription factors in fungi. Genomics Inform. 2006;4:147–160.
10. Metzker ML. Emerging technologies in DNA sequencing. Genome Res. 2005;15:1767–1776. [PubMed]
11. James TY, Kauff F, Schoch CL, Matheny PB, Hofstetter V, Cox CJ, Celio G, Gueidan C, Fraker E, et al. Reconstructing the early evolution of fungi using a six-gene phylogeny. Nature. 2006;443:818–822. [PubMed]
12. Hibbett DS, Binder M, Bischoff JF, Blackwell M, Cannon PF, Eriksson OE, Huhndorf S, James T, Kirk PM, et al. A higher-level phylogenetic classification of the Fungi. Mycol. Res. 2007;111:509–547. [PubMed]
13. Fitzpatrick DA, Logue ME, Stajich JE, Butler G. A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis. BMC Evol. Biol. 2006;6:99. [PMC free article] [PubMed]
14. Kroken S, Glass NL, Taylor JW, Yoder OC, Turgeon BG. Phylogenomic analysis of type I polyketide synthase genes in pathogenic and saprobic ascomycetes. Proc. Natl Acad. Sci. USA. 2003;100:15670–15675. [PMC free article] [PubMed]
15. Kamper J, Kahmann R, Bolker M, Ma LJ, Brefort T, Saville BJ, Banuett F, Kronstad JW, Gold SE, et al. Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature. 2006;444:97–101. [PubMed]
16. Jeon J, Park SY, Chi MH, Choi J, Park J, Rho HS, Kim S, Goh J, Yoo S, et al. Genome-wide functional analysis of pathogenicity genes in the rice blast fungus. Nat. Genet. 2007;39:561–565. [PubMed]
17. Galagan JE, Calvo SE, Cuomo C, Ma LJ, Wortman JR, Batzoglou S, Lee SI, Basturkmen M, Spevak CC, et al. Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature. 2005;438:1105–1115. [PubMed]
18. Payne GA, Nierman WC, Wortman JR, Pritchard BL, Brown D, Dean RA, Bhatnagar D, Cleveland TE, Machida M, et al. Whole genome comparison of Aspergillus flavus and A. oryzae. Med. Mycol. 2006;44(Suppl.):9–11.
19. Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, Lafontaine I, De Montigny J, Marck C, Neuveglise C, et al. Genome evolution in yeasts. Nature. 2004;430:35–44. [PubMed]
20. Kellis M, Birren BW, Lander ES. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature. 2004;428:617–624. [PubMed]
21. Cornell M, Paton NW, Hedeler C, Kirby P, Delneri D, Hayes A, Oliver SG. GIMS: an integrated data storage and analysis environment for genomic and functional data. Yeast. 2003;20:1291–1306. [PubMed]
22. Price EW, Carbone I. SNAP: workbench management tool for evolutionary population genetic analysis. Bioinformatics. 2005;21:402–404. [PubMed]
23. Rice P, Longden I, Bleasby A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000;16:276–277. [PubMed]
24. Choi K, Ma Y, Choi JH, Kim S. PLATCOM: a platform for computational comparative genomics. Bioinformatics. 2005;21:2514–2516. [PubMed]
25. Lee D, Choi JH, Dalkilic MM, Kim S. COMPAM: visualization of combining pairwise alignments for multiple genomes. Bioinformatics. 2006;22:242–244. [PubMed]
26. Siepel A, Farmer A, Tolopko A, Zhuang M, Mendes P, Beavis W, Sobral B. ISYS: a decentralized, component-based approach to the integration of heterogeneous bioinformatics resources. Bioinformatics. 2001;17:83–94. [PubMed]
27. McGinnis S, Madden TL. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res. 2004;32:W20–W25. [PMC free article] [PubMed]
28. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. [PMC free article] [PubMed]
29. Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bradley P, Bork P, Bucher P, et al. InterPro, progress and status in 2005. Nucleic Acids Res. 2005;33:D201–D205. [PMC free article] [PubMed]
30. Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 2004;340:783–795. [PubMed]
31. Nakai K, Horton P. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 1999;24:34–36. [PubMed]
32. Wickware P. Next-generation biologists must straddle computation and biology. Nature. 2000;404:683–684. [PubMed]
33. Blair JE, Shah P, Hedges SB. Evolutionary sequence analysis of complete eukaryote genomes. BMC Bioinformatics. 2005;6:53. [PMC free article] [PubMed]
34. Schell MA, Karmirantzou M, Snel B, Vilanova D, Berger B, Pessi G, Zwahlen MC, Desiere F, Bork P, et al. The genome sequence of Bifidobacterium longum reflects its adaptation to the human gastrointestinal tract. Proc. Natl Acad. Sci. USA. 2002;99:14422–14427. [PMC free article] [PubMed]
35. Bentley SD, Chater KF, Cerdeno-Tarraga AM, Challis GL, Thomson NR, James KD, Harris DE, Quail MA, Kieser H, et al. Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature. 2002;417:141–147. [PubMed]
36. Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1474. [PubMed]
37. Paulsen IT, Press CM, Ravel J, Kobayashi DY, Myers GS, Mavrodi DV, DeBoy RT, Seshadri R, Ren Q, et al. Complete genome sequence of the plant commensal Pseudomonas fluorescens Pf-5. Nat. Biotechnol. 2005;23:873–878. [PubMed]
38. Douglas SE, Penny SL. The plastid genome of the cryptophyte alga, Guillardia theta: complete sequence and conserved synteny groups confirm its common ancestry with red algae. J. Mol. Evol. 1999;48:236–244. [PubMed]
39. Myler PJ, Beverley SM, Cruz AK, Dobson DE, Ivens AC, McDonagh PD, Madhubala R, Martinez-Calvillo S, Ruiz JC, et al. The Leishmania genome project: new insights into gene organization and function. Med. Microbiol. Immunol. 2001;190:9–12. [PubMed]
40. Nierman WC, Pain A, Anderson MJ, Wortman JR, Kim HS, Arroyo J, Berriman M, Abe K, Archer DB, et al. Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature. 2005;438:1151–1156. [PubMed]
41. Machida M, Asai K, Sano M, Tanaka T, Kumagai T, Terai G, Kusumoto K, Arima T, Akita O, et al. Genome sequencing and analysis of Aspergillus oryzae. Nature. 2005;438:1157–1161. [PubMed]
42. Cuomo C, Güldener U, Xu J, Trail F, Turgeon B, Di PA, Walton J, Ma L, Baker S, et al. The Fusarium graminearum genome reveals a link between localized polymorphism and pathogen specialization. Science. 2007;317:1400–1402. [PubMed]
43. Dean RA, Talbot NJ, Ebbole DJ, Farman ML, Mitchell TK, Orbach MJ, Thon M, Kulkarni R, Xu JR, et al. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature. 2005;434:980–986. [PubMed]
44. Borkovich KA, Alex LA, Yarden O, Freitag M, Turner GE, Read ND, Seiler S, Bell-Pedersen D, Paietta J, et al. Lessons from the genome sequence of Neurospora crassa: tracing the path from genomic blueprint to multicellular organism. Microbiol. Mol. Biol. Rev. 2004;68:1–108. [PMC free article] [PubMed]
45. Jones T, Federspiel NA, Chibana H, Dungan J, Kalman S, Magee BB, Newport G, Thorstenson YR, Agabian N, et al. The diploid genome sequence of Candida albicans. Proc. Natl Acad. Sci. USA. 2004;101:7329–7334. [PMC free article] [PubMed]
46. Dietrich FS, Voegeli S, Brachat S, Lerch A, Gates K, Steiner S, Mohr C, Pohlmann R, Luedi P, et al. The Ashbya gossypii genome as a tool for mapping the ancient Saccharomyces cerevisiae genome. Science. 2004;304:304–307. [PubMed]
47. Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature. 2003;423:241–254. [PubMed]
48. Cliften P, Sudarsanam P, Desikan A, Fulton L, Fulton B, Majors J, Waterston R, Cohen BA, Johnston M. Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science. 2003;301:71–76. [PubMed]
49. Jeffries TW, Grigoriev IV, Grimwood J, Laplaza JM, Aerts A, Salamov A, Schmutz J, Lindquist E, Dehal P, et al. Genome sequence of the lignocellulose-bioconverting and xylose-fermenting yeast Pichia stipitis. Nat. Biotechnol. 2007;25:319–326. [PubMed]
50. Wood V, Gwilliam R, Rajandream MA, Lyne M, Lyne R, Stewart A, Sgouros J, Peat N, Hayles J, et al. The genome sequence of Schizosaccharomyces pombe. Nature. 2002;415:871–880. [PubMed]
51. Martinez D, Larrondo LF, Putnam N, Gelpke MD, Huang K, Chapman J, Helfenbein KG, Ramaiya P, Detter JC, et al. Genome sequence of the lignocellulose degrading fungus Phanerochaete chrysosporium strain RP78. Nat. Biotechnol. 2004;22:695–700. [PubMed]
52. Loftus BJ, Fung E, Roncaglia P, Rowley D, Amedeo P, Bruno D, Vamathevan J, Miranda M, Anderson IJ, et al. The genome of the basidiomycetous yeast and human pathogen Cryptococcus neoformans. Science. 2005;307:1321–1324. [PMC free article] [PubMed]
53. Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, et al. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature. 2001;414:450–453. [PubMed]
54. Tyler BM, Tripathy S, Zhang X, Dehal P, Jiang RH, Aerts A, Arredondo FD, Baxter L, Bensasson D, et al. Phytophthora genome sequences uncover evolutionary origins and mechanisms of pathogenesis. Science. 2006;313:1261–1266. [PubMed]
55. AGI. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815. [PubMed]
56. IRGSP. The map-based sequence of the rice genome. Nature. 2005;436:793–800. [PubMed]
57. Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002;296:79–92. [PubMed]
58. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006;313:1596–1604. [PubMed]
59. Holt RA, Subramanian GM, Halpern A, Sutton GG, Charlab R, Nusskern DR, Wincker P, Clark AG, Ribeiro JM, et al. The genome sequence of the malaria mosquito Anopheles gambiae. Science. 2002;298:129–149. [PubMed]
60. Kornberg TB, Krasnow MA. The Drosophila genome sequence: implications for biology and medicine. Science. 2000;287:2218–2220. [PubMed]
61. Darling JA, Reitzel AR, Burton PM, Mazza ME, Ryan JF, Sullivan JC, Finnerty JR. Rising starlet: the starlet sea anemone, Nematostella vectensis. Bioessays. 2005;27:211–221. [PubMed]
62. CSC. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998;282:2012–2018. [PubMed]
63. Dehal P, Satou Y, Campbell RK, Chapman J, Degnan B, De Tomaso A, Davidson B, Di Gregorio A, Gelpke M, et al. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science. 2002;298:2157–2167. [PubMed]
64. Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, et al. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420:520–562. [PubMed]
65. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. [PubMed]
66. Garrity GM. Bergey's Manual of Systematic Bacteriology, 2nd edn. New York: Springer; 2001.
67. Adl SM, Simpson AG, Farmer MA, Andersen RA, Anderson OR, Barta JR, Bowser SS, Brugerolle G, Fensome RA, et al. The new higher level classification of eukaryotes with emphasis on the taxonomy of protists. J. Eukaryot. Microbiol. 2005;52:399–451. [PubMed]

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press