NIH Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Fungal Genet Biol. Author manuscript; available in PMC Mar 1, 2010.
Published in final edited form as:
PMCID: PMC2826280
NIHMSID: NIHMS166336

The 2008 update of the Aspergillus nidulans genome annotation: a community effort

Jennifer Russo Wortman,1 Jane Mabey Gilsenan,2 Vinita Joardar,3 Jennifer Deegan,4 John Clutterbuck,5 Mikael R. Andersen,6 David Archer,7 Mojca Bencina,8 Gerhard Braus,9 Pedro Coutinho,10 Hans von Döhren,11 John Doonan,12 Arnold J.M. Driessen,13,14 Pawel Durek,11 Eduardo Espeso,15 Erzsébet Fekete,16 Michel Flipphi,17 Carlos Garcia Estrada,18 Steven Geysens,19 Gustavo Goldman,20 Piet W.J. de Groot,21 Kim Hansen,22 Steven D. Harris,23 Thorsten Heinekamp,24 Kerstin Helmstaedt,9 Bernard Henrissat,10 Gerald Hofmann,6,22 Tim Homan,25 Tetsuya Horio,26 Hiroyuki Horiuchi,27 Steve James,28 Meriel Jones,29 Levente Karaffa,16 Zsolt Karányi,30 Masashi Kato,31 Nancy Keller,32 Diane E. Kelly,33 Jan A.K.W. Kiel,34,14 Jung-Mi Kim,35 Ida J. van der Klei,34,14 Frans M. Klis,21 Andriy Kovalchuk,13,14 Nada Kraševec,8 Christian P. Kubicek,36 Bo Liu,35 Andrew MacCabe,17 Vera Meyer,25,14 Pete Mirabito,37 Márton Miskei,38 Magdalena Mos,29 Jonathan Mullins,33 David R. Nelson,39 Jens Nielsen,6,40 Berl R. Oakley,26 Stephen A. Osmani,41 Tiina Pakula,42 Andrzej Paszewski,43 Ian Paulsen,44 Sebastian Pilsyk,43 István Pócsi,38 Peter J. Punt,45 Arthur F.J. Ram,25,14 Qinghu Ren,3 Xavier Robellet,46 Geoff Robson,47 Bernhard Seiboth,36 Piet van Solingen,48 Thomas Specht,49 Jibin Sun,50 Naimeh Taheri-Talesh,9 Norio Takeshita,51 Dave Ussery,52 Patricia A. vanKuyk,25 Hans Visser,53 Peter J.I. van de Vondervoort,54 Ronald P. de Vries,55 Jonathan Walton,56 Xin Xiang,57 Yi Xiong,41 An Ping Zeng,58 Bernd W. Brandt,59 Michael J. Cornell,46,60 Cees A.M.J.J. van den Hondel,25,14 Jacob Visser,25,61 Stephen G. Oliver,62 and Geoffrey Turner63

Abstract

The identification and annotation of protein-coding genes is one of the primary goals of whole-genome sequencing projects, and the accuracy of predicting the primary protein products of gene expression is vital to the interpretation of the available data and the design of downstream functional applications. Nevertheless, the comprehensive annotation of eukaryotic genomes remains a considerable challenge. Many genomes submitted to public databases, including those of major model organisms, contain significant numbers of wrong and incomplete gene predictions. We present a community-based reannotation of the Aspergillus nidulans genome with the primary goal of increasing the number and quality of protein functional assignments through the careful review of experts in the field of fungal biology.

Index descriptors: Aspergillus nidulans, aspergilli, genome, annotation, fungal community, assembly, transcription factors, CADRE

1) Introduction

Importance of A. nidulans as a reference for the eight sequenced aspergilli

Genome sequences of 8 species of the genus Aspergillus have now been determined (Table 1). However, the aspergillus research community is relatively small, and the task of detailed annotation of all of these genomes, together with presentation of the information in an easily accessible form, is enormous. We have therefore chosen to begin the task with the genetic model species Aspergillus nidulans.

Table 1
Genome sequences from species in the Aspergillus clade

i) General importance of aspergilli for fungal biology, human health, biotechnology, and agriculture

The genus Aspergillus comprises a diverse group of filamentous fungi. Despite belonging to the same genus, Aspergillus species have diverged significantly (Galagan et al., 2005), though they are sufficiently related such that orthologues can be identified for the majority of genes. A. nidulans, the most studied species, has been an important model organism for eukaryotic genetics for over 60 years (Martinelli and Kinghorn, 1994), with the advantage of having a sexual cycle (teleomorph Emericella nidulans), which is usually absent from other species of Aspergillus. In addition to a long history of classical genetic and biochemical studies, most molecular techniques for gene manipulation were first developed in A. nidulans before application to other members of the genus. Detailed laboratory protocols have recently been made easily accessible (Osmani et al., 2006; Szewczyk et al., 2006; Todd et al., 2007a; Todd et al., 2007b), and mutant strains isolated over many years are available from the Fungal Genetics Stock Centre (www.fgsc.net), together with other useful resources such as vectors and libraries.

It is important to note that different wild-type strains of A. nidulans exist (Jinks et al., 1966), but all the commonly used mutant strains are derived from a single strain, sometimes called the Glasgow strain, following its choice as a genetic model in the early 1950s (Pontecorvo et al., 1953). Our detailed understanding of the genetics and physiology of A. nidulans provides an excellent basis for extension of this knowledge to other, imperfect species of economic importance. These include the opportunistic pathogen A. fumigatus, a cause of allergies and a growing threat to immunocompromised patients; A. niger and A. oryzae, sources of industrial enzymes and other products such as citric acid; A. flavus, a plant pathogen and toxin-producing agricultural spoilage organism; and A. terreus, sometimes an opportunistic pathogen and also a source of lovastatin, one of the first of the hugely successful statins used therapeutically to inhibit cholesterol biosynthesis.

A. nidulans possesses a penicillin biosynthesis pathway similar to that found in the industrial producer Penicillium chrysogenum, and has most of the steps of the aflatoxin pathway found in A. flavus and A. parasiticus, so that it has become a key model system for studying these secondary metabolic pathways and their regulation (Bok et al., 2006b; Brakhage et al., 2004).

A. nidulans genetics has also contributed to eukaryotic cell biology beyond its close fungal relatives, examples being the discovery of γ-tubulin (Oakley and Oakley, 1989) and NudF, a homologue of the human lissencephaly protein (Xiang et al., 1995).

Since A. nidulans has played such an important role as a genetic model, Eurofung, through the Eurofungbase Project, decided to focus its community annotation efforts on A. nidulans in the first instance. We report the major findings of this exercise here; specific accounts of particular biological domains are presented in the following series of papers. The newly annotated A. nidulans genome sequence is available in the CADRE database (Mabey et al., 2004) and will also be available in the Fungi section of ENSEMBL genomes at the EBI.

ii) Annotation of aspergilli

The genome sequences of eight distinct Aspergillus species have been publicly released over the past four years. However, many of these genomes were annotated at different institutions using diverse methods over a relatively long time period, during which available tools and datasets have evolved rapidly. The inconsistency in annotation quality and completeness across these species hinders many avenues of comparative genomic research that depend on high-quality genome annotation, including evolutionary and functional studies (Wortman et al., 2006b).

The first three Aspergillus genome sequences, those for A. fumigatus, A. oryzae and A. nidulans, were published in 2005 and described in three companion papers (Galagan et al., 2005; Machida et al., 2005; Nierman et al., 2005). The A. fumigatus genome sequence (Af293) was generated through a collaboration between the Institute for Genomic Research (TIGR) and the Wellcome Trust Sanger Institute and deposited in GenBank; it has the accession AAHF00000000 (Nierman et al., 2005). The assembled genomic sequence was processed through the TIGR annotation pipeline, which subjected each sequence to a series of homology searches as well as algorithms for predicting genes (GlimmerM, Exonomy, Unveil, and GeneSplicer) (Majoros et al., 2003; Pertea et al., 2001). The gene prediction algorithms were trained with a limited dataset of A. fumigatus EST and cDNA sequences (Nierman et al., 2005) and the output of the pipeline was manually reviewed. An updated annotation for A. fumigatus, based on comparative genome data and involving targeted manual annotation, was released to GenBank in March 2007 (Fedorova et al., 2008).

The A. oryzae RIB40 genome, sequenced and analyzed by a consortium led by the National Institute of Advanced Industrial Science and Technology (AIST) in Japan, was released to DDBJ under the accession numbers AP007150 to AP007177. The AIST annotation pipeline incorporated EST and protein homology data via the gene predictors ALN (Gotoh, 2000), GeneDecoder (Asai et al., 1998) and GlimmerM. The pipeline was trained on a set of gene models which were constructed by alignment of known fungal proteins to the open reading-frames in the A. oryzae genome (Machida et al., 2005). A. oryzae annotation was last updated in DDBJ in December 2005.

The A. nidulans (Emericella nidulans) genome, for strain FGSC A4, was sequenced and annotated by the Broad Institute of Harvard and MIT and submitted to GenBank, where it has the accession AACD00000000. The genome sequence was annotated using the Calhoun annotation system, which included protein homology searches and the gene prediction algorithms FGENESH (Salamov and Solovyev, 2000), FGENESH+, and GENEWISE (Birney et al., 2004). A. nidulans EST data was not incorporated into the gene predictions, but was used separately for validation (Galagan et al., 2005).

Comparison of these first aspergillus genome sequences revealed a surprising level of genetic variability. Proteome comparisons revealed an average amino-acid identity of less than 70% between each species pair, suggesting that they are as evolutionarily distant from each other as humans are from fish (Galagan et al., 2005). Since these phylogenetic distances were so significant, it became clear that additional aspergilli would need to be sequenced to facilitate comparative analyses. More data would be needed to elucidate the specific gene differences and regulatory elements linked to the distinctive phenotypic and physiological properties important to the study of each organism. The proteome comparison also revealed the extent of gene model annotation differences between genomes, with the majority of identified orthologue groups (~80%) containing members differing in length and/or number of exons. Identified annotation problems included the merging of neighbouring loci, missed exon calls, and incorrect 5′ exons (Wortman et al., 2006b). A summary of the published genome sequences for Aspergillus species is given in Table 1, which provides information on the sequencing centres, software tools employed, and access to the sequences.

In addition to inconsistencies in gene model annotation caused by the lengthy timeframe over which the different genomes were sequenced and the diverse annotation processes employed at different sequencing centres, there are also significant differences in the functional annotation attached to gene products. While some groups will only attach a putative function based on a high stringency homology match to an experimentally characterized protein (Galagan et al., 2005), others have used more lenient criteria, employing resources such as InterPro (Mulder and Apweiler, 2007) and PFAM (Finn et al., 2008) profiles (Fedorova et al., 2008; Nierman et al., 2005) and the NCBI KOG (Tatusov et al., 2003) resource (Machida et al., 2005).

Additional attributes, such as Enzyme Commission numbers (Bairoch, 2000) or Gene Ontology associations (Ashburner et al., 2000), are also not consistently applied. As part of the original primary annotation efforts, the functional annotations of A. fumigatus (Af293) and A. niger (CBS 513.88) were manually reviewed with input from the research community, and with an emphasis on particular protein families (Fedorova et al., 2008; Nierman et al., 2005; Pel et al., 2007). Manual annotation by domain experts is a valuable approach for integrating multiple lines of computational evidence, but is a resource-intensive and time-consuming process. Thus, it is best applied to the species within a given clade for which there is most genetic and functional data and for which the largest and most active research community exists. For all of these reasons, a community annotation of the A. nidulans genome sequence appeared a high priority. Having a well-annotated reference genome for the aspergillus clade of organisms will support the transitive improvement of the annotation across orthologous genes in the other members of the genus.

Eurofungbase

Eurofungbase is a coordination action programme funded by the European Commission under contract LSSG-CT-2005-018964. It comprises a community of 32 different partner laboratories in 11 European countries, supported by an Industry Platform of 13 companies. Eurofungbase aims to facilitate the construction of an integrated data warehouse to enable comparative and functional genomic studies of filamentous fungi of scientific, medical, industrial, and agricultural importance. The Consortium has a number of different strategies for achieving its aims, but one of the most important approaches is to organise the manual annotation of the genomes of important model organisms by expert groups of researchers, and to consolidate and disseminate the results of such annotation exercises through journal publications and web facilities. It is within this context that Eurofungbase took on the task of re-annotating the Aspergillus nidulans genome with the help of colleagues from TIGR/J Craig Venter Institute and aspergillus research laboratories in the USA.

Evolution of A. nidulans genome data

The original public genome annotation of A. nidulans, described briefly above, consists of 9,541 protein-coding gene predictions (Galagan et al., 2005). Each gene was assigned a unique locus identifier with the prefix AN followed by a four digit number between 0001 and 9541 and appended with the annotation version number 2 (e.g. AN0001.2). Version 1 was internal to the Broad Institute and not released widely. As of July 2008, this is the version of the A. nidulans annotation that is still represented at GenBank, linked to accession AACD00000000, submitted in January, 2004. Of the genes overlapping EST alignments, approximately 70% were fully consistent with the EST data, while 30% showed some inconsistency (J. E. Galagan, personal communication). Comparative analysis with A. fumigatus and A. oryzae suggested that there were many neighboring loci inappropriately merged (Wortman et al., 2006b). Functional annotation was applied to gene products only when they exhibited high-identity matches to previously published, experimentally characterised, proteins within the fungal kingdom. This resulted in putative function assignments for approximately 3% of the predicted proteins.

In summer 2005, NIAID requested that the annotation group at TIGR revisit the gene structure annotation of A. nidulans in advance of microarray design being planned by the NIAID-funded Pathogen Functional Genomics Resource Center (PFGRC). This re-annotation effort focused on the automated incorporation of EST data into the existing gene models, and the manual review and correction of merged loci. A total of 32,931 EST and cDNA sequences, compiled from GenBank and provided by C. d’Enfert and G. Goldman, were aligned to the genome and compared to the existing annotation using the PASA pipeline (Haas et al., 2003). These EST sequences collapsed to 8,690 unique assemblies and were used to perform automated gene structure updates of 1,146 genes. In addition, over 2,000 genes that could not be computationally resolved with the current gene structure were manually reviewed and corrected on the basis of either protein homology or EST data. 494 loci were split into two or more distinct loci, and 214 new gene models were added. In addition, 426 gene models originally predicted by the Broad Institute, but excluded from the earlier release because they did not meet minimum length criteria, were also incorporated. The final gene set consisted of 10,701 protein-coding gene predictions, with 4,263 genes completely consistent with EST alignments. Locus identifiers were retained in all cases of one-to-one mapping (9,447), whether the sequence changed or not, but the version number was incremented to 3 for all genes. Since the gene number was now over 10,000, there was a need to create new locus identifiers with 5 numeric digits after the AN prefix (e.g. AN10002.3). Functional annotation was supplied by the Broad Institute. As of July 2008, this version 3 annotation data set is the current annotation reflected at the Broad Institute web site: http://www.broad.mit.edu/annotation/genome/aspergillus_group/.
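The locus identifier scheme described above (the AN prefix, a four- or five-digit number, and an annotation version suffix) can be handled programmatically. The following is a minimal sketch, not a tool used in the annotation project; the names `parse_locus`, `format_locus` and `Locus` are hypothetical, and the pattern is an assumption inferred from the examples AN0001.2 and AN10002.3.

```python
import re
from typing import NamedTuple

# Assumed pattern: "AN", then 4 or 5 digits, then "." and a version number.
LOCUS_RE = re.compile(r"AN(\d{4,5})\.(\d+)")


class Locus(NamedTuple):
    number: int   # numeric part of the identifier, e.g. 1 for AN0001.2
    version: int  # annotation version, e.g. 2, 3 or 4


def parse_locus(identifier: str) -> Locus:
    """Split an identifier such as 'AN10002.3' into number and version."""
    match = LOCUS_RE.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a recognised locus identifier: {identifier!r}")
    return Locus(int(match.group(1)), int(match.group(2)))


def format_locus(number: int, version: int) -> str:
    """Zero-pad to four digits; numbers of 10000 and above keep five digits."""
    return f"AN{number:04d}.{version}"
```

For instance, `format_locus(parse_locus("AN0001.2").number, 3)` rebuilds the same locus with the version suffix incremented, mirroring the way identifiers were retained across annotation versions.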

2) Eurofungbase Community Annotation

Main focus: Gene function

The primary goal of the Eurofungbase annotation effort is to increase the number of A. nidulans proteins with informative functional assignments. Experts in various aspects of fungal and, particularly, A. nidulans biology were invited to participate in an ongoing annotation effort, which started with a jamboree in Autumn 2006. Prior to this initial meeting, the version 3 protein sequences were subjected to a series of computational analyses intended to provide evidence for protein function. These included homology searches against the GenBank non-redundant database (nr), domain identification against the PFAM (Finn et al., 2008) and InterPro (Mulder and Apweiler, 2007) databanks, and the programs tmHMM (Sonnhammer et al., 1998), which predicts transmembrane domains, and SignalP (Bendtsen et al., 2004), which predicts signal peptides.

To support the annotation effort, meeting participants were granted access to a compact summary of this computational evidence through the Manatee web-based interface (Wortman et al., 2006a). Manatee is a web-based manual annotation and analysis tool that acts as an interface between an underlying database and an annotator. Manatee’s interface allows the annotator to add, delete, and edit annotations attached to the protein, such as gene product names, gene symbols, EC numbers and GO assignments. All such changes are referred back to the underlying database. Consortium members could continue to interact with the annotation database, through Manatee, on an ongoing basis for more than a year.

Over the course of the functional annotation effort, 2,626 genes were reviewed and edited, with GO terms being added where possible and appropriate. Through this concerted effort, the percentage of A. nidulans gene products with an informative name has increased from approximately 3% to 19%. For the remaining un-reviewed genes, we have provided product names by transfer of information from A. fumigatus or A. niger orthologues, further increasing the proportion of gene products with informative names to 58%. As an indication of the changes arising from this project, we have also incremented the annotation version to 4 for all locus identifiers (e.g., AN****.4).

b) Assembly updates

Assembly of supercontigs and association with chromosomes

The initial Broad Institute assembly comprised 173 contigs, linked by end-sequenced BAC and fosmid bridges to form 16 supercontigs, corresponding to the 16 chromosome arms. A further 75 contigs were unplaced. Using BLAST (Altschul et al., 1990) and Bl2seq (Tatusova and Madden, 1999), it was found that many of the Broad contigs overlapped, so that 58 previously unassigned contigs could be incorporated into supercontigs, and the number of unsequenced gaps within supercontigs was reduced from 157 to 71 (see http://www.gla.ac.uk/ibls/molgen/aspergillus/contiglinks.html and http://www.cadre-genomes.org.uk/Aspergillus_nidulans/Docs/revised_contig_overlaps.html). In five further cases, gaps are bridged by independently cloned sequences or retrotransposons with matching target-site duplications. The remaining gaps are assigned an arbitrary length of 1 kb in the revised assembly.

Supercontigs were related to chromosome arms using meiotically mapped and cloned linkage map markers (see http://www.gla.ac.uk/ibls/molgen/aspergillus/index.html). Over 185 such markers, identifiable with auto-called genes, are currently informative in this respect. With a few exceptions, most of which can be put down to reliance on inadequate linkage data, the correspondence between linkage and genome maps is excellent (Clutterbuck and Farman, 2008). Four supercontigs have telomeric simple sequence (TTAGGG) repeats at one end, allowing the orientation of the arm to be validated independently of the genetic map. Using TERMINUS (Li et al., 2005), Clutterbuck and Farman (2008) were able to identify telomeric simple sequences and subtelomeric contigs in the NCBI sequence trace archive and associate them with the remaining chromosome arms. While six different telomeres have identical telomeric contigs, the subtelomeric sequences are more varied, including a variety of simple sequences, and, in many cases, members of a specific class of telomere-linked helicases, all but one of which are truncated or degenerate (Clutterbuck and Farman, 2008). In most cases, the non-telomeric ends of supercontig arms were marked with clusters of A+T-rich, degraded transposable elements that are typical of centromeric DNA. In the revised assembly, centromeres, which have not been sequenced, are displayed as arbitrary gaps of 50 kb.

The one case in which there was a serious discrepancy between genomic and linkage maps concerned supercontig 6, associated with chromosome V. Here a telomeric link was found in the middle of the supercontig, requiring the splitting of this supercontig into 6a and 6b. New supercontig 6a was found to end in a short fragment of a ribosomal RNA repeat sequence and now spans a central region of chromosome arm V-L, between the nucleolus organizer and the centromere, the distal section of V-L being formed by supercontig 12. Supercontig 6b, starting with links to telomere T15, makes up chromosome arm V-R (see Fig 5.6 in Clutterbuck and Farman, 2008). The resulting rearranged supercontigs now correspond well with the linkage map of chromosome V, already known to include the nucleolus organizer (Brody et al., 1991; approximate length 360 kb: Ganley and Kobayashi, 2007).

c) Assembly availability

In summary, the new assembly has 248 contigs, of which 231 contigs are assigned to 17 supercontigs (with 66 unsequenced gaps) that are mapped to eight linkage groups (approx. 30.5 Mb). This assembly is displayed within the Central Aspergillus Data Repository (CADRE) (Mabey et al., 2004). However, due to the recent improvements in assembly and annotation data, the current contigs no longer reflect the annotation within GenBank although the sequence remains the same. For this reason, contigs in CADRE are labeled with source identifiers from the Broad Institute (1.1 to 1.248), which coincide with the public sequence data within GenBank (AACD01000001 to AACD01000248). The supercontigs, annotated as scaffolds within GenBank (CH236920 to CH236935), have also changed. Therefore we have labeled the supercontigs in CADRE 1 to 16, with supercontig 6 replaced by 6a and 6b. These supercontigs are assigned to the linkage groups, which remain labeled as I to VIII.
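The counts in this summary follow from the figures given for the original assembly and its revision. The short check below is only a bookkeeping sketch: the function name `assembly_summary` is hypothetical, and it assumes that the five gaps bridged by cloned sequences or retrotransposons are subtracted from the 71 gaps remaining after contig overlaps were resolved.

```python
def assembly_summary() -> dict:
    """Re-derive the revised assembly counts from the figures quoted in the text."""
    initial_placed = 173       # contigs in the original 16 supercontigs
    initial_unplaced = 75      # contigs originally unplaced
    newly_placed = 58          # unassigned contigs incorporated via overlaps
    gaps_after_overlaps = 71   # unsequenced gaps left after overlap detection
    gaps_bridged = 5           # assumed to be removed from the 71 (see lead-in)

    total_contigs = initial_placed + initial_unplaced
    assigned = initial_placed + newly_placed
    return {
        "total_contigs": total_contigs,
        "assigned_contigs": assigned,
        "unassigned_contigs": total_contigs - assigned,
        "remaining_gaps": gaps_after_overlaps - gaps_bridged,
        # 16 original supercontigs, with supercontig 6 split into 6a and 6b
        "supercontigs": 16 - 1 + 2,
    }


if __name__ == "__main__":
    for key, value in assembly_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions the derived values agree with the summary: 248 contigs in total, 231 assigned, 66 remaining gaps and 17 supercontigs.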

d) Gene structure updates

The version 3 gene models, which were annotated on the 248 original contig sequences, were mapped to the revised chromosome sequences. The genome sequences were aligned using the NUCmer utility of the MUMmer package (Kurtz et al., 2004), and gene models were transferred on the basis of the resulting coordinate mapping. In this process, 82 gene predictions could not be transferred because of changes in the underlying sequence, and approximately 200 mapped gene pairs were found to overlap significantly. 34 gene predictions were marked as putative pseudogenes, as the open reading-frames are disrupted in the reported genome sequence. The transferred gene models were the starting point for version 4 of the A. nidulans annotation.
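The idea of transferring a gene model through a coordinate mapping can be illustrated with a small sketch. This is not the procedure or file format actually used with NUCmer; it assumes a simplified, hypothetical representation in which each alignment is a gap-free, forward-strand block, and a feature is transferred only if it lies entirely within one block. The names `Block` and `transfer` are illustrative.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """A gap-free, forward-strand alignment block (hypothetical simplification)."""
    old_start: int  # 1-based inclusive start on the original contig
    old_end: int    # 1-based inclusive end on the original contig
    new_start: int  # position on the revised sequence aligned to old_start


def transfer(start: int, end: int, blocks: List[Block]) -> Optional[Tuple[int, int]]:
    """Map a feature's coordinates onto the revised sequence.

    Returns None when no single block contains the whole feature, which
    stands in for a gene model that cannot be transferred.
    """
    for block in blocks:
        if block.old_start <= start and end <= block.old_end:
            offset = block.new_start - block.old_start
            return (start + offset, end + offset)
    return None


if __name__ == "__main__":
    blocks = [Block(1, 1000, 5001), Block(1201, 2000, 6300)]
    print(transfer(100, 400, blocks))   # feature inside the first block
    print(transfer(900, 1300, blocks))  # feature spanning a break in the alignment
```

A real transfer would also need to handle reverse-strand alignments, indels within alignments and multi-exon structures, which this sketch deliberately leaves out.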

During the course of functional annotation, consortium members with expertise in specific gene families were able to identify gene structures that were likely to be incorrect. They then submitted either virtual cDNA sequences representing the corrected gene structure, or protein sequences corresponding to corrected genes. PASA (Haas et al., 2003) was used to correct genes based on the submitted cDNA sequences, processing 74 gene structure updates, including 7 additional putative pseudogenes. Genewise (Birney et al., 2004) was used to instantiate gene structures based on the submitted protein sequences. These structures were reviewed manually and used to update 95 genes, including 3 additional putative pseudogenes, and to create five new gene models. The final gene count for the version 4 annotation is 10,605, which includes 63 putative pseudogenes. Two families of genes were significantly improved through these efforts. 39 of 119 cytochrome P450 genes were updated, as were 96 of the 342 predicted Zn(II)2Cys6 transcription factors.

This dataset reflects an iterative improvement in the A. nidulans predicted gene complement, but should not be considered a final product. The gene structure annotation of A. nidulans will continue to change over time as new experimental evidence and computational approaches arise. Of particular interest are recently published gene prediction programs that use machine-learning approaches to improve performance on eukaryotic genomes (Bernal et al., 2007; DeCaprio et al., 2007). The annotation process can also benefit from the incorporation of more diverse data types and predictions. One example is found in the companion article by Wang et al. FGB-08-96, which describes a new and species-specific intron splice site predictor for aspergillus. In contrast to conventional gene finders producing entire gene structures, splice site predictors identify all candidate splice sites and assign probability values, information which can be used to correct gene structures or predict alternatively spliced genes.

New findings from the annotation

Subsequent articles in this issue do not aim to cover all the different families of genes, but rather reflect current research interests of the aspergillus community. Some of the genes included in the annotation (over 800) have been functionally characterised and/or genetically mapped prior to completion of the genome sequence, and these genes are mostly named according to the A. nidulans convention (Clutterbuck, 1973; Martinelli, 1994), in addition to the locus identifier in the AN**** format. Where genes have not been functionally characterised, the gene model name based on bioinformatics analysis is given only in the AN**** designation. For these uncharacterised genes, annotation is based on comparative genomics, and functions are suggested prior to functional analysis.

Since a genetic map of the 8 linkage groups (chromosomes) of A. nidulans was produced prior to the genome sequencing project, it was instructive to compare the genetic and physical maps. In most cases, the correlation is good, and in the case of chromosome V, the linkage map assisted in the correct assembly of the genome sequence.

a) Transposable elements

Transposable elements (TEs) were catalogued mainly by V. V. Kapitonov and J. Jurka, Genetic Information Research Institute, and can be found in Repbase (http://girinst.org/repbase/).

A total of 1,275 insertions of whole or fragmented elements were identified, representing 14 retroposons and 19 DNA transposon families. Their sequences and distribution are described elsewhere (Clutterbuck et al., 2008). A total of 306 autocalled genes are involved with TEs identified by Kapitonov and Jurka. 151 are found wholly within TEs and many of these have recognizable transposition-related features. 103 genes overlap TEs or TE fragments, and 52 others have small TEs or fragments within them, only 14 of these being wholly within introns. In many cases the TEs are affected by RIP, and their associated autocalled genes have atypical (probably unrealistic) structures, often including multiple long introns. For example, the mean length of 67 introns in 24 genes overlapping Mariner-6 elements was 152 bp, three times the modal intron length.

In only two cases have the affected genes been previously identified. First, the 3′ end of AN7818.3 (stcF) overlaps a degraded Mariner-4_AN copy by just three nucleotides. Secondly, Cultrone et al. (2007) have reported a case where a non-autonomous Helitron-N1_AN element has been created as a result of a 3′ deletion of an autonomous Helitron1-AN element and readthrough into the xanA gene (AN10081.3). The new element has transposed once to give a second copy of the xanA promoter and 5′ region, named psxA (AN11581.3). This pseudogene is transcribed, but the resulting mRNA is rapidly destroyed by nonsense-mediated decay.

b) Transcription factors

Eukaryotic transcription factors can be classified on the basis of their characteristic DNA-binding domains. The A. nidulans genome was analyzed for the presence of twelve different classes of DNA-binding transcription factors (Table 2). DNA-binding proteins without transcription factor activity were not included in the analysis. To identify putative transcription factors in the A. nidulans genome, BlastP searches were performed using functionally characterized transcription factors and/or predicted transcription factors from Saccharomyces cerevisiae and Neurospora crassa as queries. Predicted proteins that showed significant similarity to the above-mentioned transcription factors, but lacked the typical DNA-binding domain, were subsequently checked manually for alternative gene models that would include a DNA-binding domain. An overview of the number of putative transcription factors in the A. nidulans genome is given in Table 2 and further details are given as supplementary data (Supp. Tables 1–13).

Table 2
Overview of the twelve annotated classes of Transcription Factors (TF) in the A. nidulans genome.

As shown in Table 2, the family of Zn(II)2Cys6 transcription factors (MacPherson et al., 2006) is the largest, consisting of 330 proteins. The number of annotated proteins belonging to this class has been increased significantly from the initial annotation (Galagan et al., 2005) by improving the gene models manually. By addition of 5′ exons and/or changes to intron/exon boundaries, a complete conserved six-cysteine motif (CX2CX6CX5–16CX2CX6–8C), absent from a large number of the original gene models, could in many cases be included in the protein sequence. For only a small number of putative Zn(II)2Cys6 TF proteins were we unable to identify a specific DNA binding domain even by manual annotation (Suppl. Table 13).
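The six-cysteine motif CX2CX6CX5–16CX2CX6–8C can be written as a regular expression, which gives a quick way to check whether a predicted protein sequence contains a complete motif. The sketch below is illustrative only and is not the procedure used in the manual annotation; the function name `has_six_cysteine_motif` is hypothetical, the test sequences are synthetic, and X is taken to mean any residue.

```python
import re

# C-X2-C-X6-C-X5..16-C-X2-C-X6..8-C, with X taken to be any residue.
ZN2CYS6_MOTIF = re.compile(r"C.{2}C.{6}C.{5,16}C.{2}C.{6,8}C")


def has_six_cysteine_motif(protein: str) -> bool:
    """Return True if the sequence contains a complete six-cysteine motif."""
    return ZN2CYS6_MOTIF.search(protein.upper()) is not None


if __name__ == "__main__":
    # Synthetic sequences built from alanine spacers, not real proteins.
    complete = "MS" + "C" + "AA" + "C" + "A" * 6 + "C" + "A" * 5 + "C" + "AA" + "C" + "A" * 6 + "C" + "GT"
    truncated = complete.replace("C", "S", 1)  # first cysteine lost, e.g. a missed 5' exon
    print(has_six_cysteine_motif(complete))
    print(has_six_cysteine_motif(truncated))
```

A pattern match of this kind only flags candidate domains; a gene model lacking the motif may simply be missing an exon, which is why the original models were revisited manually.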

Members of an interesting subclass of Zn(II)2Cys6 transcription factors contain a C2H2 DNA binding domain in addition to the Cys6 motif. A total of 12 such transcription factors have been identified (Supp. Table 2). The function of those genes has not been studied in A. nidulans. In Colletotrichum lagenarium (Cmr1) and Magnaporthe grisea (Pig1) the double motif transcription factors are involved in the regulation of mycelial melanin biosynthesis (Tsuji et al., 2000).

It is a major challenge for future work to perform functional analysis of the members of the different families of transcription factors. High-throughput deletion strategies or overexpression approaches are useful methods for studying their function, as has proved to be the case for N. crassa (Colot et al., 2006). For A. nidulans, only 21 Zn(II)2Cys6 transcription factors have been functionally analyzed in more detail (Suppl. Table 1).

Previous work has indicated that, in various Aspergillus species, transcription factors are often clustered in the genome with their target genes (Cazelle et al., 1998; Felenbok et al., 2001; Gomi et al., 2000; Hull et al., 1989; Lamb et al., 1990; Unkles et al., 1992; Woloshuk et al., 1994; Yuan et al., 2008a). This applies, for example, to genes associated with primary and secondary metabolism and carbohydrate metabolism. Based on this finding, analysis of genes adjacent to the Zn(II)2Cys6 transcription factors might give information about their possible target genes and, hence, function (Flipphi et al., 2006; Ram and Punt, unpublished results). In addition, phylogenetic analyses of the protein sequences of Zn(II)2Cys6 transcription factors may be useful for the identification of sub-clusters with related functions. In a simple phylogenetic analysis of the 330 A. nidulans Zn(II)2Cys6 transcription factors, approximately 56 sub-clusters (defined as proteins with a reciprocal BlastP e-value of < e−20 towards each other) could be identified. One of these clusters contained both the AmyR (AN2016.4) and InuR (AN3835.4) transcription factors, which are involved in the regulation of enzymes degrading starch and inulin, respectively (Yuan et al., 2008a). The possible roles in polysaccharide catabolism of the other transcription factors in this specific sub-cluster (AN7343.4, AN7346.4, AN8596.4 and AN10423.4) await further analysis.
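The sub-cluster definition above (proteins linked by reciprocal BlastP hits below a cutoff) can be illustrated with a small grouping routine. This is a hedged sketch rather than the procedure actually used: the cutoff is read here as an e-value below 1e-20, the input is assumed to be a dictionary of pairwise e-values, and chaining linked pairs into connected components (single linkage) is an assumption, because the text does not say how pairs were merged into groups.

```python
from collections import defaultdict


def reciprocal_subclusters(evalues, cutoff=1e-20):
    """Group proteins linked by reciprocal BlastP hits below `cutoff`.

    evalues maps (query, subject) -> e-value. Two proteins are linked only
    if the hit passes the cutoff in both directions. Linked proteins are
    merged into connected components (an assumed grouping rule).
    """
    graph = defaultdict(set)
    for (query, subject), e in evalues.items():
        if query == subject or e >= cutoff:
            continue
        # Missing reverse hits are treated as non-significant.
        if evalues.get((subject, query), 1.0) < cutoff:
            graph[query].add(subject)
            graph[subject].add(query)

    clusters, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:  # depth-first walk over one component
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters
```

Proteins without any reciprocal partner are not reported, so the function returns only groups of two or more members.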

c) Polysaccharide degrading enzymes

A. nidulans is found naturally in the rhizosphere, where it is able to utilise decaying plant material by means of secreted protein- and polysaccharide-degrading enzymes. Despite significant differences in the set of putative plant polysaccharide-degrading enzymes predicted from the genome sequence of A. niger, A. nidulans, and A. oryzae, no large differences were found in their growth on cellulose, xylan, pectin or galactomannan (Coutinho et al. FGB-08-108). The apparent differences in the repertoire of enzymes may, therefore, reflect specific adaptations to the natural biotope of the individual species. The identification of transcription factor (TF) target sites within the promoter regions of genes encoding polysaccharide-degrading enzymes is rarely helpful in predicting function. Thus while there is a correlation between the presence of XlnR sites and xylose induction for A. niger genes, this is not strictly true for the other two species; it is likely that TF binding sites in the promoters vary between Aspergillus species.

d) Primary metabolism

Primary metabolism and its regulation have been a major pre-genomic research area, and the post-genomic annotation has now identified genes for most of the central pathways. Several of these pathways, including glycolysis, the pentose phosphate pathway, the TCA cycle, and the L-arabinose/D-xylose oxido-reductive pathway, exhibit multiple genes for some individual steps (Flipphi et al. FGB-08-100). Phylogenetic analysis shows that these gene duplications were early events in the evolution of the aspergilli, and it can be assumed that they have aided their proliferation in specific habitats. In addition to the pyruvate dehydrogenase and 2-oxo-glutarate dehydrogenase complexes, a third 2-oxo-acid dehydrogenase complex was discovered, and this is absent from yeast. The loci involved in this third complex are AN1726, AN3639 and AN8559. This enzyme is likely to be involved in the degradation of the branched-chain aliphatic amino acids valine and isoleucine (like the orthologous complex in humans). A gene encoding glucose oxidase was found, suggesting an oxidative catabolic pathway for glucose. However, A. nidulans appears to lack the hydrolysing enzyme needed to linearise glucono-lactone. Moreover, since simultaneous disruption of the genes for hexokinase (hxkA) and glucokinase (glkA) effectively abolishes growth on glucose (Flipphi et al., 2003), it is unlikely that glucose oxidation could lead to growth. Interestingly, A. nidulans appears distinct from all other aspergilli in lacking inositol-phosphate phosphatase, which catalyzes the last step of inositol biosynthesis and is non-essential in S. cerevisiae (Lopez et al., 1999). The encoding gene is found in synteny with its direct genetic environment in all Aspergillus species except A. nidulans. In the latter, there is no space between the neighbouring genes, AN9409 and AN9410, while in the other seven public Aspergillus genomes, the structurally well-conserved inositol biosynthesis gene resides between the orthologues of AN9409 and AN9410.

Two loci involved in central carbon metabolism were encountered in the A. nidulans genome that appear to result from horizontal gene transfer. A predicted cytosolic 2-methylcitrate dehydratase-like protein (AN1619) seems to be a prominent example (Flipphi et al. FGB-08-100). Another such event might have endowed A. nidulans with pyruvate-water dikinase (locus AN5843) (Flipphi and Kubicek, unpublished). In bacteria, this enzyme converts pyruvate into phosphoenolpyruvate – the reverse reaction of pyruvate kinase – thereby enabling futile cycling between the two ultimate intermediates of glycolysis. It is possible that the associated net conversion of ADP to AMP provides a physiological signal for general carbon metabolic control in A. nidulans.

e) Nitrogen and amino-acid metabolism

Nitrogen metabolism in A. nidulans has been the subject of extensive research over recent decades (Pontecorvo et al., 1953; Caddick, 2004). This has provided the model system of the GATA transcription factor AreA (AN8667) and further pathway-specific regulators, and thus insight into control mechanisms for gene expression, with attention now turning to post-transcriptional control (Caddick et al., 2006). As a consequence, a large number of nitrogen metabolite transport and assimilation systems could be annotated, since they had already been identified and analysed in functional detail. Similarly, the pathways of fungal amino-acid biosynthesis have been the subject of substantial previous study, including use as nutritional markers, and many could also be annotated.

Blast analysis identified signatures in several other genes which suggested putative roles in nitrogen metabolite transport, assimilation or amino-acid biosynthesis. These included a putative amino-acid permease (AN7392), a purine transporter (AN7955), an additional acetamidase (AN9138) and four possible additional members of the arginase family (AN7488, AN6869, AN7669, AN3965). Bioinformatic analysis predicted functions for AN7488 and AN6869 as agmatinases and thus roles in polyamine biosynthesis. During searches for genes with amino-acid biosynthetic function, several aromatic aminotransferases were identified (AN6338, AN5041, AN8172, AN4156), at least one of which is a candidate for the final step in tyrosine biosynthesis. In addition, homology to the histidinol dehydrogenase region of the trifunctional hisDEI (AN0797) was identified in AN2723, but the other domains of this protein were absent. Functional analysis will be required to determine whether these genes indeed have roles in nitrogen assimilation, amino-acid metabolism or perhaps in secondary metabolism.

f) Sulphur metabolism

Almost all known genes related to eukaryotic sulphur metabolism have been annotated, and assigned an appropriate function. Within the last two years, the astA gene encoding an alternative sulphur transporter has been identified and characterized. The gene’s expression is strongly regulated by sulphur metabolite repression, which suggests that AstA protein is a specific sulphate transporter. The protein is distinct from known sulphate permeases and belongs to a poorly characterized large Dal5 family of allantoate transporters. Interestingly, astA appears to be a nonfunctional gene in some strains of A. nidulans. Its orthologues are present in only a few fungal species, in particular in plant pathogenic/saprophytic fungi (Piłsyk et al., 2007).

A group of genes encoding enzymes directly or indirectly implicated in homocysteine metabolism are induced by this amino acid. Coordinated regulation of this “homocysteine regulon” (Sienko et al., 2007) may prevent homocysteine reaching toxic concentrations within the cell. Recent evidence suggests that this response to homocysteine may be part of a general reaction to stress that also involves the unfolded protein response. While the molecular mechanisms involved in regulation of the “homocysteine regulon” remain to be elucidated, it is clear that the response is not part of the sulphur metabolite repression system (SMR), which controls a number of sulphur-related genes, particularly those implicated in sulphate assimilation.

g) Secondary metabolism

Secondary metabolism is the hallmark of most filamentous ascomycetes, and one of the intriguing findings arising from the genome sequencing projects is how few orthologous genes and gene clusters are shared between the different Aspergillus species (Nierman et al., 2005). The functions of most of the new secondary metabolic clusters revealed by the genome project remain unknown, and their biosynthetic products cannot be predicted from sequence alone. Nevertheless, the availability of the genome sequence has stimulated functional analysis studies. This has already led to the discovery of a number of metabolic products not previously reported for A. nidulans, namely aspyridones A and B (Bergmann et al., 2007), terrequinone A (Bok et al., 2006a) and the emericellamides (Chiang et al., 2008). An analysis of predicted non-ribosomal peptide synthetase genes has been carried out (von Döhren FGB-08-122).

h) Cytochrome P450

The extensive metabolic capacity of A. nidulans is also reflected in the 119 cytochrome P450 (CYP) genes (including 8 pseudogenes) that have now been annotated (Kelly et al. FGB-08-98), both manually and by exploiting the query functions of the e-Fungi data warehouse (Cornell et al., 2007). The functions of 13 of these have been determined to date, including genes for ergosterol and dityrosine biosynthesis. Thirty-two of the genes were located close to known secondary metabolic genes, such as those encoding non-ribosomal peptide synthetases and polyketide synthases, probably reflecting their role in decorating secondary metabolites. Also identified were 8 putative NADPH cytochrome P450 reductase sequences. The protein encoded by AN0595 on chromosome VIII has 91% amino-acid sequence identity to the CprA protein from A. niger and 88% identity to CprA from A. fumigatus, and is the most probable candidate reductase for the majority of the CYPs.
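The observation that some CYP genes lie close to secondary metabolic genes amounts to a coordinate comparison, which can be sketched as follows. The text does not define "close", so the 20 kb window, the tuple layout and the function name are all assumptions made for illustration, and the gene names in the usage example are invented.

```python
def genes_near(targets, anchors, window=20_000):
    """Return IDs of target genes within `window` bp of any anchor gene.

    Each gene is a tuple (gene_id, chromosome, start, end). The default
    window is an arbitrary illustrative value, not taken from the paper.
    """
    near = []
    for gene_id, chrom, start, end in targets:
        for _, a_chrom, a_start, a_end in anchors:
            if chrom != a_chrom:
                continue
            # Distance between the two intervals; 0 if they overlap.
            gap = max(a_start - end, start - a_end, 0)
            if gap <= window:
                near.append(gene_id)
                break  # one nearby anchor is enough
    return near


# Invented coordinates for illustration only.
cyps = [("cyp1", "I", 1_000, 2_500),
        ("cyp2", "I", 500_000, 501_500),
        ("cyp3", "II", 1_000, 2_000)]
synthases = [("pks1", "I", 10_000, 18_000)]
print(genes_near(cyps, synthases))  # ['cyp1']
```

The same comparison would apply to the genes neighbouring a transcription factor, with the window chosen to suit the cluster sizes of interest.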

i) Peroxisomes

Fungal microbodies (peroxisomes, Woronin bodies) are inducible organelles that proliferate or are degraded (via an autophagy-related process termed pexophagy) in response to nutritional cues. Proteins involved in microbody biogenesis/proliferation are designated peroxins and are encoded by PEX genes, while genes involved in autophagy (including pexophagy) are known as ATG genes. A reappraisal of the A. nidulans genome for the presence of PEX and ATG genes has identified a number of previously missed genes. This analysis has led to the conclusion that the basic set of genes involved in microbody biogenesis, proliferation, and degradation is conserved in A. nidulans (Kiel and van der Klei FGB-08-78). The major differences between filamentous ascomycetes like A. nidulans and other organisms appear to be an aberrant RING structure on the peroxin Pex2p (part of the so-called “importomer”), the presence of a Pex16p orthologue (previously identified mainly in higher eukaryotes) and the novel protein Pex14/17p (related to the yeast-specific peroxin Pex17p), as well as multiple Pex11p paralogues (a feature seen mainly in higher eukaryotes). With respect to pexophagy, a receptor protein required to link microbodies destined for degradation to the autophagy machinery could not be identified in A. nidulans. Nevertheless, A. nidulans contains an Atg11p orthologue, a protein associated with selective autophagy, implying that selective microbody degradation occurs in this filamentous fungus. Because both microbody biogenesis/proliferation and autophagy/pexophagy have features that more closely resemble organelle formation/degradation in mammals than in yeast, filamentous fungi like A. nidulans are ideal model systems of peroxisome homeostasis in man.

j) Cell wall genes

The hyphal walls of A. nidulans consist of an internal electron-transparent layer surrounded by an electron-dense outer layer (Jeong et al., 2004). This is consistent with the notion that the internal layer mainly contains stress-bearing polysaccharides, whereas the outer layer is enriched with glycoproteins (De Groot et al., 2005). The inventory of cell wall genes reveals the presence of genes encoding polysaccharide synthases, including genes responsible for the production of the stress-bearing polysaccharides 1,3-α-/1,4-α-glucan, 1,3-β-glucan, and chitin (De Groot et al. FGB-08-116). In addition, A. nidulans contains a gene (AN8444, designated as celA) the product of which shows similarity to bacterial and plant cellulose synthases and is predicted to be responsible for the production of 1,3-β-/1,4-β-glucan. This agrees with the presence of a linear 1,3-β-/1,4-β-glucan in the alkali-insoluble fraction of the hyphal wall of A. fumigatus (Fontaine et al., 2000). Importantly, four predicted GPI-protein-encoding genes were identified that are believed to have transglucosidase activity, and to act on 1,3-α-glucan (AN3308, AN4507, AN6324) or on 1,4-β-glucan (AN2385) (van der Kaaij et al., 2007; Yuan et al., 2008b). Whereas many GPI-proteins are predominantly located in the plasma membrane, others are incorporated in the fungal cell wall (reviewed in De Groot et al., 2005). Mass spectrometric analysis of a tryptic digest obtained by ‘cell wall shaving’ (Yin et al., 2008) showed the presence of twelve proteins, including ten predicted GPI-proteins. Hydrophobins are small amphipathic proteins that are only found in mycelial species and are deposited on the cell surface of conidia/ascospores, basidiocarps and aerial hyphae to render them hydrophobic (Wösten, 2001). Six putative hydrophobin-encoding genes were identified in the A. nidulans genome, including the already known rodA and dewA.
Finally, some genes involved in spore wall maturation were identified, including some encoding chitin deacetylases, which are predicted to convert chitin into chitosan. In addition, two predicted cytosolic proteins were found that show similarity to the ScDit proteins. These synthesize precursors of the dityrosine-containing macromolecule found in the outer wall layer of the ascospores of S. cerevisiae (Briza et al., 1994).

k) Secretion

The secretion of proteins is important to the natural lifestyles of the aspergilli and becomes critical when those species are exploited as hosts for the commercial secretion of heterologous proteins. Their natural lifestyles make the aspergilli more effective secretors of hydrolytic enzymes than S. cerevisiae, suggesting that the aspergilli might have a better capacity for secretion of proteins in general. That is confirmed by the experience of the biotechnology industry, although S. cerevisiae can be improved to secrete some proteins to commercially acceptable yields. Therefore, the main question facing the annotation team was: does the genome annotation provide any clues as to why the aspergilli might be better protein secretors than the model fungus S. cerevisiae and might they be better suited to expression of proteins with human-like glycosylation? The annotation alone does not provide simple answers to these questions (Geysens et al. FGB-08-112; Tsang et al. FGB-08-121). The core facets of the chaperone cycle, formation of disulphide bonds and the unfolded protein response (UPR) show a high degree of similarity between species. Even so, differences between the aspergilli and the yeasts can be seen in the nucleotide exchange reaction within the chaperone cycle, the protein disulphide isomerase protein family and the activation mechanisms of the UPR mediated by the Hac bZip transcription factor. The functional consequences of these differences remain to be assessed.

In eukaryotes, N-glycosylation is performed via the transfer of a lipid-linked precursor structure (Glc3Man9GlcNAc2) onto a nascent polypeptide chain within the lumen of the ER. In the aspergilli, orthologues were identified for almost all genes involved in the generation of activated sugar donors as well as the synthesis and transfer of the precursor structure to a protein backbone. The genes involved in the ER-processing of protein-linked N-glycans also seem to be present. However, differences from S. cerevisiae were observed, and in some cases these make the Aspergillus system more like that of mammals, e.g. the presence of a reglucosylation enzyme (UGGT) involved in calnexin-based glycoprotein quality control. In general, the N-glycans on proteins secreted by the aspergilli are of the high-mannose type, sometimes decorated with monosaccharides such as Galf. In contrast to S. cerevisiae, the aspergilli rarely synthesize hyperglycosyl structures. Nevertheless, they possess several orthologues of yeast mannosyltransferases that are known to be involved in the process of hypermannosylation. Several orthologues of mannosidases of families 47 and 92 were identified, but it remains to be evaluated whether they act upon N-glycans inside (as in the mammalian Golgi apparatus) or outside the cell. Golgi-based demannosylation would again give the Aspergillus glycosylation pathway a similarity to that of mammals.

l) Polar growth

The signaling pathways involved in morphogenesis and calcium responses are reasonably well-conserved when compared to those of the well-studied yeast models (Harris et al. FGB-08-135). However, A. nidulans possesses many additional genes implicated in morphogenesis, development, and calcium signaling that are not found in yeast. The function of most of these genes remains a mystery, though it seems likely that they contribute to unique aspects of hyphal growth (i.e., highly polarized growth) and development (conidiation, cleistothecium formation). Notably, A. nidulans shares with migratory animal cells and neurons several genes involved in actin organization that are otherwise absent from yeast. The number of genes potentially implicated in hyphal morphogenesis is therefore rather large and may exceed initial expectations.

4. Data availability

All data generated from this project (refined assembly, gene structures and annotation) have been deposited within CADRE (http://www.cadre-genomes.org.uk). CADRE (Central Aspergillus Data Repository) is a public resource that provides web-based tools for visualising and analysing genomic features identified within aspergilli. These tools offer simple displays for viewing annotation assigned to predicted genes (e.g. gene symbol, public loci and GO terms) and to their protein products (e.g. family and domain similarity matches), as well as complex displays for viewing genes and other features (e.g. repeated sequences) in the context of an assembly. Using the customised search facilities provided by CADRE, all A. nidulans genes can be easily sought and identified by their unique public locus identifier (AN****.4).
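For readers handling these identifiers programmatically, the locus format can be parsed with a short pattern. This is a sketch based only on the identifiers that appear in this paper (for example AN2016.4, AN10423.4 and the unversioned AN1726); the function name is hypothetical, and treating the suffix after the dot as an optional annotation version is an assumption.

```python
import re

# "AN" + digits, optionally followed by ".<version>" (e.g. AN2016.4)
LOCUS_RE = re.compile(r"^AN(\d+)(?:\.(\d+))?$")


def parse_locus(identifier):
    """Return (locus_number, version or None), or None if not a locus ID."""
    m = LOCUS_RE.match(identifier.strip())
    if m is None:
        return None
    number, version = m.groups()
    return int(number), int(version) if version else None


print(parse_locus("AN2016.4"))  # (2016, 4)
print(parse_locus("AN1726"))    # (1726, None)
print(parse_locus("hxkA"))      # None
```

The number of digits is left open because both four-digit and five-digit loci occur in the text.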

Acknowledgements

We acknowledge financial support by the European Commission under contract LSSG-CT-2005-018964. MC, and the use of the e-Fungi data warehouse, was supported by a grant to SGO and others as part of the BBSRC's Bioinformatics and e-Science programme II. We wish to thank Dr Michael Anderson for his input on the assembly, whilst at The University of Manchester. We also acknowledge Todd Creasy for Manatee set-up and support, currently at IGS; Brian Haas for data management and computational support (annotation), currently at the Broad; Joshua Orvis for data management and computational support (annotation), currently at IGS; and Jonathan Crabtree for data management and computational support (orthologues), currently at IGS. Sandra M.J. Langeveld is acknowledged for her assistance in preparing part of the manuscript.

References

  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. [PubMed]
  • Asai K, Itou K, Ueno Y, Yada T. Recognition of human genes by stochastic parsing. Pacific Symposium on Biocomputing. 1998:228–239. [PubMed]
  • Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene Ontology: tool for the unification of biology. Nature Genetics. 2000;25:25–29. [PMC free article] [PubMed]
  • Bairoch A. The ENZYME database in 2000. Nucleic Acids Research. 2000;28:304–305. [PMC free article] [PubMed]
  • Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004;340:783–795. [PubMed]
  • Bergmann S, Schumann J, Scherlach K, Lange C, Brakhage AA, Hertweck C. Genomics-driven discovery of PKS-NRPS hybrid metabolites from Aspergillus nidulans. Nature Chemical Biology. 2007;3:213–217. [PubMed]
  • Bernal A, Crammer K, Hatzigeorgiou A, Pereira F. Global discriminative learning for higher-accuracy computational gene prediction. Plos Computational Biology. 2007;3:488–497. [PMC free article] [PubMed]
  • Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Research. 2004;14:988–995. [PMC free article] [PubMed]
  • Bok JW, Hoffmeister D, Maggio-Hall LA, Murillo R, Glasner JD, Keller NP. Genomic mining for Aspergillus natural products. Chemistry & Biology. 2006a;13:31–37. [PubMed]
  • Bok JW, Noordermeer D, Kale SP, Keller NP. Secondary metabolic gene cluster silencing in Aspergillus nidulans. Molecular Microbiology. 2006b;61:1636–1645. [PubMed]
  • Brakhage AA, Sprote P, Al-Abdallah Q, Gehrke A, Plattner H, Tuncher A. Regulation of penicillin biosynthesis in filamentous fungi. Molecular biotechnology of fungal beta-lactam antibiotics and related peptide synthetases. Advances in Biochemical Engineering and Biotechnology. 2004:45–90.
  • Briza P, Eckerstorfer M, Breitenbach M. The Sporulation-Specific Enzymes Encoded by the Dit1 and Dit2 Genes Catalyze a 2-Step Reaction Leading to a Soluble Ll-Dityrosine-Containing Precursor of the Yeast Spore Wall. Proceedings of the National Academy of Sciences of the United States of America. 1994;91:4524–4528. [PMC free article] [PubMed]
  • Brody H, Griffith J, Cuticchia AJ, Arnold J, Timberlake WE. Chromosome-Specific Recombinant-DNA Libraries from the Fungus Aspergillus nidulans. Nucleic Acids Research. 1991;19:3105–3109. [PMC free article] [PubMed]
  • Caddick MX. Nitrogen regulation in mycelial fungi. In: Brambl R, Marzluf GA, editors. The Mycota III Biochemistry and Molecular Biology. Berlin-Heidelberg: Springer-Verlag; 2004. pp. 349–367.
  • Caddick MX, Jones MG, van Tonder JM, Le Cordier H, Narendja F, Strauss J, Morozov IY. Opposing signals differentially regulate transcript stability in Aspergillus nidulans. Molecular Microbiology. 2006;62:509–519. [PubMed]
  • Cazelle B, Pokorska A, Hull E, Green PM, Stanway G, Scazzocchio C. Sequence, exon-intron organization, transcription and mutational analysis of prnA, the gene encoding the transcriptional activator of the prn gene cluster in Aspergillus nidulans. Molecular Microbiology. 1998;28:355–370. [PubMed]
  • Chiang YM, Szewczyk E, Nayak T, Davidson AD, Sanchez JF, Lo HC, Ho WY, Simityan H, Kuo E, Praseuth A, Watanabe K, Oakley BR, Wang CCC. Molecular genetic mining of the Aspergillus secondary metabolome: Discovery of the emericellamide biosynthetic pathway. Chemistry & Biology. 2008;15:527–532. [PMC free article] [PubMed]
  • Clutterbuck AJ. Gene symbols in Aspergillus nidulans. Genetical Research Cambridge. 1973;21:291–296. [PubMed]
  • Clutterbuck AJ, Farman M. Aspergillus nidulans linkage map and genome sequence: closing gaps and adding telomeres. In: Goldman GH, Osmani SA, editors. The Aspergilli: genomics, medical aspects, biotechnology, and research methods. Boca Raton: CRC Press; 2008. pp. 57–65.
  • Clutterbuck AJ, Kapitonov VV, Jurka J. Transposable elements and repeat-induced point mutation in Aspergillus nidulans, Aspergillus fumigatus and Aspergillus oryzae. In: Goldman GH, Osmani SA, editors. The Aspergilli: genomics, medical aspects, biotechnology, and research methods. Boca Raton: CRC Press; 2008. pp. 343–355.
  • Colot HV, Park G, Turner GE, Ringelberg C, Crew CM, Litvinkova L, Weiss RL, Borkovich KA, Dunlap JC. A high-throughput gene knockout procedure for Neurospora reveals functions for multiple transcription factors. Proceedings of the National Academy of Sciences of the United States of America. 2006;103:10352–10357. [PMC free article] [PubMed]
  • Cornell MJ, Alam I, Soanes DM, Wong HM, Hedeler C, Paton NW, Rattray M, Hubbard SJ, Talbot NJ, Oliver SG. Comparative genome analysis across a kingdom of eukaryotic organisms: Specialization and diversification in the Fungi. Genome Research. 2007;17:1809–1822. [PMC free article] [PubMed]
  • Cultrone A, Dominguez YR, Drevet C, Scazzocchio C, Fernandez-Martin R. The tightly regulated promoter of the xanA gene of Aspergillus nidulans is included in a helitron. Molecular Microbiology. 2007;63:1565–1567. [PubMed]
  • De Groot PWJ, Ram AF, Klis FM. Features and functions of covalently linked proteins in fungal cell walls. Fungal Genetics and Biology. 2005;42:657–675. [PubMed]
  • DeCaprio D, Vinson JP, Pearson MD, Montgomery P, Doherty M, Galagan JE. Conrad: Gene prediction using conditional random fields. Genome Research. 2007;17:1389–1398. [PMC free article] [PubMed]
  • Fedorova ND, Khaldi N, Joardar VS, Maiti R, Amedeo P, Anderson MJ, Crabtree J, Silva JC, Badger JH, Albarraq A, Angiuoli S, Bussey H, Bowyer P, Cotty PJ, Dyer PS, Egan A, Galens K, Fraser-Liggett CM, Haas BJ, Inman JM, Kent R, Lemieux S, Malavazi I, Orvis J, Roemer T, Ronning CM, Sundaram JP, Sutton G, Turner G, Venter JC, White OR, Whitty BR, Youngman P, Wolfe KH, Goldman GH, Wortman JR, Jiang B, Denning DW, Nierman WC. Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus. PLoS Genet. 2008;4:e1000046. [PMC free article] [PubMed]
  • Felenbok B, Flipphi M, Nikolaev I. Ethanol catabolism in Aspergillus nidulans: a model system for studying gene regulation. Prog. Nucleic Acid Res. Mol. Biol. 2001;69:149–204. [PubMed]
  • Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, Bateman A. The Pfam protein families database. Nucleic Acids Research. 2008;36:D281–D288. [PMC free article] [PubMed]
  • Flipphi M, van de Vondervoort PJI, Ruijter GJG, Visser J, Arst HN, Jr., Felenbok B. Onset of carbon catabolite repression in Aspergillus nidulans. Parallel involvement of hexokinase and glucokinase in sugar signaling. Journal of Biological Chemistry. 2003;278:11849–11857. [PubMed]
  • Flipphi M, Robellet X, Dequier E, Leschelle X, Felenbok B, Vélot C. Functional analysis of alcS, a gene of the alc cluster in Aspergillus nidulans. Fungal Genetics and Biology. 2006;43:247–260. [PubMed]
  • Fontaine T, Simenel C, Dubreucq G, Adam O, Delepierre M, Lemoine J, Vorgias CE, Diaquin M, Latge JP. Molecular organization of the alkali-insoluble fraction of Aspergillus fumigatus cell wall. Journal of Biological Chemistry. 2000;275:27594–27607. [PubMed]
  • Galagan JE, Calvo SE, Cuomo C, Ma LJ, Wortman JR, Batzoglou S, Lee SI, Basturkmen M, Spevak CC, Clutterbuck J, Kapitonov V, Jurka J, Scazzocchio C, Farman M, Butler J, Purcell S, Harris S, Braus GH, Draht O, Busch S, D'Enfert C, Bouchier C, Goldman GH, Bell-Pedersen D, Griffiths-Jones S, Doonan JH, Yu J, Vienken K, Pain A, Freitag M, Selker EU, Archer DB, Penalva MA, Oakley BR, Momany M, Tanaka T, Kumagai T, Asai K, Machida M, Nierman WC, Denning DW, Caddick M, Hynes M, Paoletti M, Fischer R, Miller B, Dyer P, Sachs MS, Osmani SA, Birren BW. Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature. 2005b;438:1105–1115. [PubMed]
  • Ganley ARD, Kobayashi T. Highly efficient concerted evolution in the ribosomal DNA repeats: Total rDNA repeat variation revealed by whole-genome shotgun sequence data. Genome Research. 2007;17:184–191. [PMC free article] [PubMed]
  • Gomi K, Akeno T, Minetoki T, Ozeki K, Kumagai C, Okazaki N, Iimura Y. Molecular cloning and characterization of a transcriptional activator gene, amyR, involved in the amylolytic gene expression in Aspergillus oryzae. Bioscience Biotechnology and Biochemistry. 2000;64:816–827. [PubMed]
  • Gotoh O. Homology-based gene structure prediction: simplified matching algorithm using a translated codon (tron) and improved accuracy by allowing for long gaps. Bioinformatics. 2000;16:190–202. [PubMed]
  • Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK, Jr., Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD, Salzberg SL, White O. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Research. 2003;31:5654–5666. [PMC free article] [PubMed]
  • Hull EP, Green PM, Arst HN, Scazzocchio C. Cloning and Physical Characterization of the L-Proline Catabolism Gene Cluster of Aspergillus nidulans. Molecular Microbiology. 1989;3:553–559. [PubMed]
  • Jeong HY, Chae KS, Whang SS. Presence of a mannoprotein, MnpAp, in the hyphal cell wall of Aspergillus nidulans. Mycologia. 2004;96:52–56. [PubMed]
  • Jinks JL, Caten CE, Simchen G, Croft JH. Heterokaryon incompatibility and variation in wild populations of Aspergillus nidulans. Heredity. 1966;21:227–239. [PubMed]
  • Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, Salzberg SL. Versatile and open software for comparing large genomes. Genome Biology. 2004;5:R12. [PMC free article] [PubMed]
  • Lamb HK, Hawkins AR, Smith M, Harvey IJ, Brown J, Turner G, Roberts CF. Spatial and Biological Characterization of the Complete Quinic Acid Utilization Gene-Cluster in Aspergillus nidulans. Molecular & General Genetics. 1990;223:17–23. [PubMed]
  • Li W, Rehmeyer CJ, Staben C, Farman ML. TERMINUS--Telomeric End-Read Mining IN Unassembled Sequences. Bioinformatics. 2005;21:1695–1698. [PubMed]
  • Lopez F, Leube M, Gil-Mascarell R, Navarro-Aviñó JP, Serrano R. The yeast inositol monophosphatase is a lithium- and sodium-sensitive enzyme encoded by a non-essential gene pair. Molecular Microbiology. 1999;31:1255–1264. [PubMed]
  • Mabey JE, Anderson MJ, Giles PF, Miller CJ, Attwood TK, Paton NW, Bornberg-Bauer E, Robson GD, Oliver SG, Denning DW. CADRE: the Central Aspergillus Data REpository. Nucleic Acids Research. 2004;32:D401–D405. [PMC free article] [PubMed]
  • MacPherson S, Larochelle M, Turcotte B. A fungal family of transcriptional regulators: the zinc cluster proteins. Microbiology and Molecular Biology Reviews. 2006;70:583–604. [PMC free article] [PubMed]
  • Machida M, Asai K, Sano M, Tanaka T, Kumagai T, Terai G, Kusumoto K, Arima T, Akita O, Kashiwagi Y, Abe K, Gomi K, Horiuchi H, Kitamoto K, Kobayashi T, Takeuchi M, Denning DW, Galagan JE, Nierman WC, Yu J, Archer DB, Bennett JW, Bhatnagar D, Cleveland TE, Fedorova ND, Gotoh O, Horikawa H, Hosoyama A, Ichinomiya M, Igarashi R, Iwashita K, Juvvadi PR, Kato M, Kato Y, Kin T, Kokubun A, Maeda H, Maeyama N, Maruyama J, Nagasaki H, Nakajima T, Oda K, Okada K, Paulsen I, Sakamoto K, Sawano T, Takahashi M, Takase K, Terabayashi Y, Wortman JR, Yamada O, Yamagata Y, Anazawa H, Hata Y, Koide Y, Komori T, Koyama Y, Minetoki T, Suharnan S, Tanaka A, Isono K, Kuhara S, Ogasawara N, Kikuchi H. Genome sequencing and analysis of Aspergillus oryzae. Nature. 2005;438:1157–1161. [PubMed]
  • Majoros WH, Pertea M, Antonescu C, Salzberg SL. GlimmerM, Exonomy and Unveil: three ab initio eukaryotic genefinders. Nucleic Acids Research. 2003;31:3601–3604. [PMC free article] [PubMed]
  • Martinelli SD. Gene symbols. In: Martinelli SD, Kinghorn JR, editors. Aspergillus: 50 years on. Amsterdam: Elsevier; 1994. pp. 825–827.
  • Martinelli SD, Kinghorn JR, editors. Aspergillus: 50 years on. Amsterdam: Elsevier; 1994.
  • Mulder N, Apweiler R. InterPro and InterProScan: tools for protein sequence classification and comparison. Methods in Molecular Biology. 2007;396:59–70. [PubMed]
  • Nierman WC, Pain A, Anderson MJ, Wortman JR, Kim HS, Arroyo J, Berriman M, Abe K, Archer DB, Bermejo C, Bennett J, Bowyer P, Chen D, Collins M, Coulsen R, Davies R, Dyer PS, Farman M, Fedorova N, Fedorova N, Feldblyum TV, Fischer R, Fosker N, Fraser A, Garcia JL, Garcia MJ, Goble A, Goldman GH, Gomi K, Griffith-Jones S, Gwilliam R, Haas B, Haas H, Harris D, Horiuchi H, Huang JQ, Humphray S, Jimenez J, Keller N, Khouri H, Kitamoto K, Kobayashi T, Konzack S, Kulkarni R, Kumagai T, Lafon A, Latge JP, Li WX, Lord A, Lu C, Majoros WH, May GS, Miller BL, Mohamoud Y, Molina M, Monod M, Mouyna I, Mulligan S, Murphy L, O'Neil S, Paulsen I, Penalva MA, Pertea M, Price C, Pritchard BL, Quail MA, Rabbinowitsch E, Rawlins N, Rajandream MA, Reichard U, Renauld H, Robson GD, de Cordoba SR, Rodriguez-Pena JM, Ronning CM, Rutter S, Salzberg SL, Sanchez M, Sanchez-Ferrero JC, Saunders D, Seeger K, Squares R, Squares S, Takeuchi M, Tekaia F, Turner G, de Aldana CRV, Weidman J, White O, Woodward J, Yu JH, Fraser C, Galagan JE, Asai K, Machida M, Hall N, Barrell B, Denning DW. Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature. 2005b;438:1151–1156. [PubMed]
  • Oakley CE, Oakley BR. Identification of gamma-tubulin, a new member of the tubulin superfamily encoded by mipA gene of Aspergillus nidulans. Nature. 1989;338:662–664. [PubMed]
  • Osmani AH, Oakley BR, Osmani SA. Identification and analysis of essential Aspergillus nidulans genes using the heterokaryon rescue technique. Nature Protocols. 2006;1:2517–2526. [PubMed]
  • Pel HJ, de Winde JH, Archer DB, Dyer PS, Hofmann G, Schaap PJ, Turner G, de Vries RP, Albang R, Albermann K, Andersen MR, Bendtsen JD, Benen JA, van den Berg M, Breestraat S, Caddick MX, Contreras R, Cornell M, Coutinho PM, Danchin EG, Debets AJ, Dekker P, van Dijck PW, van Dijk A, Dijkhuizen L, Driessen AJ, d'Enfert C, Geysens S, Goosen C, Groot GS, de Groot PW, Guillemette T, Henrissat B, Herweijer M, van den Hombergh JP, van den Hondel CA, van der Heijden RT, van der Kaaij RM, Klis FM, Kools HJ, Kubicek CP, van Kuyk PA, Lauber J, Lu X, van der Maarel MJ, Meulenberg R, Menke H, Mortimer MA, Nielsen J, Oliver SG, Olsthoorn M, Pal K, van Peij NN, Ram AF, Rinas U, Roubos JA, Sagt CM, Schmoll M, Sun J, Ussery D, Varga J, Vervecken W, van de Vondervoort PJ, Wedler H, Wosten HA, Zeng AP, van Ooyen AJ, Visser J, Stam H. Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nature Biotechnology. 2007;25:221–231. [PubMed]
  • Pertea M, Lin X, Salzberg SL. GeneSplicer: a new computational method for splice site prediction. Nucleic Acids Research. 2001;29:1185–1190. [PMC free article] [PubMed]
  • Piłsyk S, Natorff R, Paszewski A. Sulfate transport in Aspergillus nidulans: A novel gene encoding alternative sulfate transporter. Fungal Genetics and Biology. 2007;44:715–725. [PubMed]
  • Pontecorvo G, Roper JA, Hemmons LM, Macdonald KD, Bufton AWJ. The Genetics of Aspergillus nidulans. Advances in Genetics Incorporating Molecular Genetic Medicine. 1953;5:141–238. [PubMed]
  • Salamov AA, Solovyev VV. Ab initio gene finding in Drosophila genomic DNA. Genome Research. 2000;10:516–522. [PMC free article] [PubMed]
  • Sienko M, Natorff R, Zielinski Z, Hejduk A, Paszewski A. Two Aspergillus nidulans genes encoding methylenetetrahydrofolate reductases are up-regulated by homocysteine. Fungal Genetics and Biology. 2007;44:691–700. [PubMed]
  • Sonnhammer EL, von Heijne G, Krogh A. A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol. 1998;6:175–182. [PubMed]
  • Szewczyk E, Nayak T, Oakley CE, Edgerton H, Xiong Y, Taheri-Talesh N, Osmani SA, Oakley BR. Fusion PCR and gene targeting in Aspergillus nidulans. Nature Protocols. 2006;1:3111–3120. [PubMed]
  • Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4:41. [PMC free article] [PubMed]
  • Tatusova TA, Madden TL. BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett. 1999;174:247–250. [PubMed]
  • Todd RB, Davis MA, Hynes MJ. Genetic manipulation of Aspergillus nidulans: heterokaryons and diploids for dominance, complementation and haploidization analyses. Nature Protocols. 2007a;2:822–830. [PubMed]
  • Todd RB, Davis MA, Hynes MJ. Genetic manipulation of Aspergillus nidulans: meiotic progeny for genetic analysis and strain construction. Nature Protocols. 2007b;2:811–821. [PubMed]
  • Tsuji G, Kenmochi Y, Takano Y, Sweigard J, Farrall L, Furusawa I, Horino O, Kubo Y. Novel fungal transcriptional activators, Cmr1p of Colletotrichum lagenarium and Pig1p of Magnaporthe grisea, contain Cys2His2 zinc finger and Zn(II)2Cys6 binuclear cluster DNA-binding motifs and regulate transcription of melanin biosynthesis genes in a developmentally specific manner. Molecular Microbiology. 2000;38:940–954. [PubMed]
  • Unkles SE, Campbell EI, Punt PJ, Hawker KL, Contreras R, Hawkins AR, van den Hondel C, Kinghorn JR. The Aspergillus niger niaD gene encoding nitrate reductase: upstream nucleotide and amino acid sequence comparisons. Gene. 1992;111:149–155. [PubMed]
  • van der Kaaij RM, Yuan XL, Franken A, Ram AFJ, Punt PJ, van der Maarel A, Dijkhuizen L. Two novel, putatively cell wall-associated and glycosylphosphatidylinositol-anchored alpha-glucanotransferase enzymes of Aspergillus niger. Eukaryotic Cell. 2007;6:1178–1188. [PMC free article] [PubMed]
  • Woloshuk CP, Foutz KR, Brewer JF, Bhatnagar D, Cleveland TE, Payne GA. Molecular characterization of aflR, a regulatory locus for aflatoxin biosynthesis. Applied and Environmental Microbiology. 1994;60:2408–2414. [PMC free article] [PubMed]
  • Wortman JR, Fedorova N, Crabtree J, Joardar V, Maiti R, Haas BJ, Amedeo P, Lee E, Angiuoli SV, Jiang B, Anderson MJ, Denning DW, White OR, Nierman WC. Whole genome comparison of the A. fumigatus family. Medical Mycology. 2006a;44:S3–S7.
  • Wösten HAB. Hydrophobins: Multipurpose proteins. Annual Review of Microbiology. 2001;55:625–646. [PubMed]
  • Xiang X, Osmani AH, Osmani SA, Xin M, Morris NR. nudF, a nuclear migration gene in Aspergillus nidulans, is similar to the human LIS-1 gene required for neuronal migration. Molecular Biology of the Cell. 1995;6:297–310. [PMC free article] [PubMed]
  • Yin QY, de Groot PWJ, de Koster CG, Klis FM. Mass spectrometry-based proteomics of fungal wall glycoproteins. Trends in Microbiology. 2008;16:20–26. [PubMed]
  • Yuan XL, Roubos JA, van den Hondel C, Ram AFJ. Identification of InuR, a new Zn(II)2Cys6 transcriptional activator involved in the regulation of inulinolytic genes in Aspergillus niger. Molecular Genetics and Genomics. 2008a;279:11–26. [PMC free article] [PubMed]
  • Yuan XL, van der Kaaij RM, van den Hondel C, Punt PJ, van der Maarel M, Dijkhuizen L, Ram AFJ. Aspergillus niger genome-wide analysis reveals a large number of novel alpha-glucan acting enzymes with unexpected expression profiles. Molecular Genetics and Genomics. 2008b;279:545–561. [PMC free article] [PubMed]