Proc Natl Acad Sci U S A. Feb 8, 2011; 108(6): 2504–2509.
Published online Jan 24, 2011. doi:  10.1073/pnas.1011289108
PMCID: PMC3038703

Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen


In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ~30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as it colonizes a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ~100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions within the tripartite MPB/fungus/forest system.

Keywords: next generation sequencing, monoterpene, carbohydrate active enzymes, ABC transporter, forest genomics

Bark beetles and their fungal associates have inhabited conifer hosts since the Mesozoic era (1), and are the most economically and ecologically significant forest pests in the northern hemisphere. The current outbreak of the mountain pine beetle (MPB, Dendroctonus ponderosae) in western North America is the largest since the early 1900s. This beetle has killed an estimated 630 million cubic meters (~16.3 M hectares) of lodgepole pine (Pinus contorta subsp. latifolia Engelm.) forest in British Columbia (www.for.gov.bc.ca/hfp/mountain_pine_beetle/). The MPB epidemic has bypassed the natural geographic barrier of the Rocky Mountains and has the potential to spread eastward into the vast Canadian boreal pine forest. Climate change is thought to be a contributing factor to the current MPB epidemic, and the devastation of large areas of pine forest is anticipated to have major consequences that include disturbing the global balance of atmospheric carbon emission and sequestration (2).

Among the MPB-associated microbiota (3), the ascomycete Grosmannia clavigera (Gc) is a critical component of this large-scale epidemic (Fig. 1). This pathogenic fungus can kill lodgepole pine without the beetle when inoculated at a high density; however, the mechanisms by which the fungus kills trees are not fully characterized (4). The association between bark beetles and vectored fungi is symbiotic. The fungi benefit because beetles carry them through the tree bark into a new host's nutrient-rich tissues. The benefits to the beetle and its progeny are less clear, but the fungi may make nutrients available and may detoxify host-defense metabolites (5–7). Although both fungi and bark beetles must overcome physical and chemical host defenses to become established in conifers, their relative contributions to this process are poorly defined. Toxic phenolics and oleoresin terpenoids are key chemical defense components in conifers (8, 9). In lodgepole pine, phenolics are stored in specialized polyphenolic parenchyma cells in the inner bark (phloem), and oleoresin monoterpenoids and diterpene resin acids are formed and accumulate in resin ducts of the phloem and sapwood. When Gc is manually inoculated below the bark of seedlings or mature trees, as a single fungal inoculum point, it induces the formation of a phloem lesion (i.e., a dark necrotic zone of tissue) that contains high concentrations of tree oleoresins and phenolics, suggesting that the host prevents further fungal colonization. At higher inoculation densities, with inocula in multiple locations, the fungus will also invade the sapwood adjacent to the lesions and block water transport to the crown of the tree (10).

Fig. 1.
Life cycle and infection process of MPB and associated microorganisms. (A) MPBs disperse during early summer; both sexes of MPB carry blue stain fungi. Beetles bore through bark, make their galleries in the phloem, and deposit eggs along the gallery walls. ...

Gc is specifically associated with the MPB, which colonizes only pine species, suggesting that both the vector and its fungal associates may have evolved specific metabolic pathways for overcoming pine defenses. Although the virulence of Gc varies between isolates (11), little systematic characterization has been performed on the genetic variation in Gc populations and on the relation of such variation to the differences in virulence between isolates.

Identifying biochemical mechanisms by which Gc overcomes conifer defenses is a key part of understanding interactions between this fungal pathogen and its host pine. To address this knowledge gap, we first generated a draft genome sequence for Gc, primarily using next-generation sequencing data (12). Here, we report the finished 29.8-Mb Gc genome sequence, 8,314 annotated protein coding sequences, initial annotation of protein coding-sequence polymorphisms, proteins secreted in response to growth on wood, changes in the fungal transcriptome induced by exposure to lodgepole pine phloem extract (LPPE) or oleoresin terpenoids, and genes and pathways involved in the modification, transport, and metabolism of conifer defense components. These resources and results provide a solid foundation to clarify the interaction of Gc with host-tree defenses.

Results


Genome Sequence and Protein Coding Annotations in G. clavigera.

Building on the previously published draft Gc genome sequence (12), we manually finished the genome assembly of Gc [kw1407; National Center for Biotechnology Information (NCBI), Genome PID: 39837] yielding 18 supercontigs with a total length of 29.8 Mb (Table 1 and SI Appendix). Telomeric sequences suggested that the supercontigs belonged to seven chromosomes. We achieved 64× sequence coverage across 90% of the finished genome sequence (SI Appendix, Figs. S1 and S2). We validated the assembly by aligning to it 99.4% of 7,169 unique expressed sequence tag (EST) sequences (method described in ref. 12). We assembled the mitochondrial genome into a single ~90-kb circularized sequence (SI Appendix, Fig. S3).

Table 1.
Genome characteristics of G. clavigera

Before predicting gene models, we masked the assembled genome sequence for repetitive elements identified using similarity to repeat databases (repbase v.20090120) and de novo repeat detection using RepeatScout (13). In total, 10.4% of the finished genome was found to be composed of repeats or low complexity sequences. Evidence for repeat-induced point mutation (RIP) was identified using RIPCAL (14), and was found almost exclusively within transposable elements. After excluding mitochondrial DNA, we predicted 8,314 protein-coding gene models, accounting for 46% of the total genome length (SI Appendix, Dataset S1). The predicted gene models were supported and validated with EST, RNA-seq, and peptide sequences (Table 1). We annotated the translated set of sequences using public sequence databases and assigned initial functional descriptions for ~75% of the total predicted protein collection (SI Appendix, Dataset S1).
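As an illustration of the kind of signal RIP leaves in a sequence, the following Python sketch computes two dinucleotide ratios that are commonly used as RIP indices in the literature: TpA/ApT and (CpA + TpG)/(ApC + GpT). This is not the authors' RIPCAL run; the article does not state which RIPCAL mode or thresholds were applied, and the function names and toy sequence here are assumptions made for the example.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def rip_indices(seq):
    """Return (TpA/ApT, (CpA+TpG)/(ApC+GpT)) for a sequence.

    A ratio is None when its denominator is zero. Interpreting the values
    requires thresholds, which are NOT given in the article and are
    therefore left out of this sketch.
    """
    c = dinucleotide_counts(seq)
    product = c["TA"] / c["AT"] if c["AT"] else None
    denom = c["AC"] + c["GT"]
    substrate = (c["CA"] + c["TG"]) / denom if denom else None
    return product, substrate


# Hypothetical toy sequence, for demonstration only.
toy_product, toy_substrate = rip_indices("TATATACAGT")
```

In practice the indices would be computed per repeat family or in sliding windows and compared against non-repetitive control sequence, since RIP acts on duplicated DNA.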

RNA-seq Validation of Gene Models and Identification of Protein Coding Sequence Variations.

To identify SNPs in the protein-coding regions of the Gc genome and to provide additional gene model support, we assessed the genome using RNA-seq read data from a collection of seven additional Gc strains (SI Appendix, Table S1) (42 different culture-treatment combinations). For this purpose we generated cDNA from polyA+ purified total RNA and sequenced it using a paired-end read approach on the Illumina Genome Analyzer platform. We predicted 17,236 SNPs from the tag-to-genome alignments (FPR = 0.0045, FNR = 0.16) (SI Appendix, Dataset S2), of which 12,160 occurred within ~14.5 Mb of protein-coding gene-model sequence covering 92% of the total transcriptome length (~15.77 Mb). Only a small number of variants were located in predicted intron regions (741; 6%). These 741 SNPs could have resulted from incompletely spliced transcripts, alternatively spliced transcripts, or inappropriately predicted intron-exon boundaries. We found an SNP density of one variant per 1,189 bp across the predicted genes and an average minor allele frequency of 25.1%. Transitions were favored over transversions by a ratio of 3:1, and 5,689 of the predicted SNPs resulted in amino acid sequence variations.
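The two summary statistics in this paragraph, SNP density and the transition/transversion ratio, can be sketched in a few lines of Python. The helper names and the toy SNP list are assumptions for illustration; the actual calls were made in CLC Genomics Workbench with additional filtering (see Materials and Methods). Because the covered span is quoted only as "~14.5 Mb", the density computed from rounded inputs approximates, rather than reproduces, the reported one variant per 1,189 bp.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt alleles must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Transitions divided by transversions for a list of (ref, alt) pairs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio undefined")
    return ts / tv


def snp_density(span_bp, n_snps):
    """Average number of base pairs per SNP over the covered span."""
    return span_bp / n_snps


# Hypothetical toy calls: three transitions and one transversion.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
toy_ratio = ts_tv_ratio(toy_snps)

# Rounded figures from the text: ~14.5 Mb covered, 12,160 SNPs.
approx_bp_per_snp = snp_density(14_500_000, 12_160)
```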

Identification of G. clavigera Gene Orthologs.

To further validate the Gc gene models and to assess gene family variations in Gc, relative to other fungi, we used orthoMCL (15) to identify gene orthologs. We clustered ~186.4 K predicted protein sequences from 17 fungal taxa and identified 6,780 ortho-groups (groups of putative orthologous genes) that contained at least one member from Gc. Of these, 1,940 contained a representative from all taxa and 692 possessed a strict single-copy orthologous relationship (i.e., clusters contained exactly one member per species). Phylogenetic analysis was performed with these 692 genes to confirm the phylogenetic position of Gc within the class Sordariomycetes (Fig. 1C and SI Appendix). We identified the mating-type (MAT) gene, suggesting that the sequenced strain belongs to the MAT-1-2 idiomorph. The high-mobility group domain of the Gc MAT protein was similar to those in MAT loci of other filamentous ascomycetes. We detected the MAT-1-1 idiomorph α-domain in other Gc isolates, but not in the sequenced strain. This result indicates that Gc is heterothallic.

Using CAFE (16), we identified Gc gene family expansions for methyltransferases, major facilitator superfamily transporters, and serine-peptidases, whereas gene family contractions occurred for Na+/Ca2+-transporting ATPases, glycoside hydrolases (GHs), zinc-type alcohol dehydrogenases, and cytochrome P450s (CYP450s). The largest Gc gene family expansion was for O-methyltransferases, for which we identified 199 methyltransferase-like sequences (PFAM: PF08241-2). Using a phylogenetic analysis including a subset from the other fungal taxa, we observed a clade containing seven Gc O-methyltransferase sequences that showed significant support for branch-specific differences in synonymous vs. nonsynonymous substitution rates using a likelihood ratio test (P < 0.001), indicating that these methyltransferases may be under positive selection (SI Appendix).

Identification and Annotation of Genes and Proteins for Inhabiting Host Pine.

Like other sap-staining fungi that colonize conifers, Gc is unable to degrade the structural components of wood, such as lignin and cellulose (17). To identify genes that may be used by Gc to grow in the host sapwood, we isolated proteins secreted by Gc during mycelial growth on a simplified substrate, pine sawdust-supplemented agar medium. Peptide sequencing supported 214 of the Gc gene models described above (SI Appendix, Dataset S3), for which we identified enriched Gene Ontology (GO) terms. Ninety percent of these annotated genes (162 genes) belonged to metabolic processes, with the greatest enrichment occurring within “carbohydrate metabolism” (GO:0005975) and “proteolysis” (GO:0006508). We used SignalP (18) to show that the deduced protein sequences were enriched in signal peptides for secretion; we predicted such peptides in 106 (50%) of the 214 genes but in only 538 (7%) of the genome-wide set of 8,314 gene models. The predicted secretome is small relative to secretomes predicted for phylogenetically similar species (SI Appendix). We noted that this reduction occurred for all protein lengths but may be biased toward smaller protein lengths. Only a small number of secreted protein families were expanded in Gc (SI Appendix, Table S2).
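One way to put a number on the signal-peptide enrichment is a one-sided hypergeometric test, sketched below with the Python standard library. The article does not say which test, if any, was applied, so this is an assumed analysis; it also assumes that the 214 peptide-supported genes are a subset of the 8,314 gene models and that the 106 signal-peptide genes among them are counted within the genome-wide 538.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` items are sampled without replacement
    from a population containing `successes` marked items."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    # Exact big-integer arithmetic, converted to float only at the end.
    return tail / total


# Counts quoted in the text: 8,314 gene models, 538 with predicted signal
# peptides, 214 peptide-supported genes, 106 of those with signal peptides.
p_enrichment = hypergeom_upper_tail(8314, 538, 214, 106)
```

Under the null of no enrichment, about 14 of the 214 genes would be expected to carry a signal peptide (214 × 538 / 8,314), so an observed count of 106 yields a very small tail probability.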

We identified 231 carbohydrate-active enzymes in the Gc genome, using the CAZy classification system (19) (SI Appendix, Dataset S4). This number is smaller than previously reported for Neurospora crassa (277) or Magnaporthe grisea (378). The Gc genome contained 139 GHs, 17 of which were detected by peptide sequencing (described above and in SI Appendix, Table S3). These GHs included enzymes that may be involved in maintaining cell wall plasticity during growth and morphogenesis and in acquiring carbohydrates. We found that Gc secretes relatively more plant cell wall degrading enzymes that could be involved in pectin degradation of the cell wall or the tracheid bordered pit membranes, allowing this fungus to colonize the sapwood (20). However, GHs involved in degrading host ligno-cellulose structures were notably absent from both the proteome and genome data collections (e.g., GH6 cellulase). Gc has only two carbohydrate-binding modules assigned to family 1: one was attached to a GH12 plant cell wall digesting enzyme and the other was attached to a chitinase. Carbohydrate esterases were also sparse, in particular families 5 and 1. Whereas M. grisea and N. crassa, respectively, have 10 and 7 carbohydrate esterases from family 1, and 15 and 3 carbohydrate esterases from family 5, Gc has only one of each. Similarly, Gc appears to have only a single type B feruloyl esterase, which lacks a secretion signal peptide and is therefore likely not secreted. Without a secreted feruloyl esterase, Gc cannot hydrolyze the diferulate cross-links in plant cell walls. Peptide sequencing and signal peptide analysis indicated that the carbohydrate esterase 5 and carbohydrate esterase 8 enzymes were secreted during growth on the sawdust-agar medium.

We used the MEROPS database (merops.sanger.ac.uk) to identify 287 putative peptidases in Gc. Twenty of these peptidases, belonging to the A1, S8, S28, and S53 families, were also identified in the peptide-sequencing data. The top five ranked by peptide-spectra abundance are reported in SI Appendix, Table S3. We identified a lineage-specific gene expansion within the peptidase family S53 (10 genes). S53 enzymes were among the most abundant peptidases secreted during growth on the sawdust-agar medium. In addition, we identified an extracellular lipase that may be involved in using pine triglycerides, a major carbon source for this fungus. Triglycerides account for ~2% to 2.5% dry weight of lodgepole pine stems (21).

Identification and Annotation of Genes for Detoxifying Host Defense Metabolites.

In a pine host, Gc grows in an environment with high concentrations of terpenoid and phenolic defense metabolites. Growth of Gc on malt extract agar was reduced in the presence of LPPE. In the presence of terpenoids, Gc grew with an initial lag phase (24 h) followed by growth at nearly the same rate as untreated controls (SI Appendix, Fig. S4). In contrast, N. crassa growth was reduced when challenged with the LPPE treatment and completely inhibited by the terpene treatment (SI Appendix, Fig. S5). These results highlight Gc’s tolerance for terpenoids and possibly other conifer defense compounds.

To identify genes associated with mechanisms used by Gc to overcome host chemical defenses, we used Illumina expression profiling (RNA-seq) on mycelia samples collected at 12 and 36 h after LPPE or terpenoid treatments (SI Appendix, Dataset S1). In total, 4,690 gene models showed at least a twofold increase in transcript abundance in at least one of the treatments and time points sampled relative to matched untreated controls (SI Appendix, Fig. S6). P values for differential expression were highly significant for hundreds of genes (SI Appendix, Fig. S7). We plotted expression levels for genes induced by the LPPE and terpenoid treatments in 50-kb windows and noted regions with high transcriptional activity [coexpression clusters (ECs)] (see below, SI Appendix, Fig. S8, and http://bfgweb.bcgsc.ca/homepage.html).
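The two operations described here, a fold-change cutoff against matched controls and averaging expression over 50-kb windows, can be sketched as follows. The details are assumptions made for illustration: the pseudocount, the use of non-overlapping windows keyed by gene start coordinate, and the toy gene records are not taken from the article, whose exact procedure is given in the SI Appendix.

```python
from collections import defaultdict

WINDOW_BP = 50_000  # window size quoted in the text


def fold_change(treated, control, pseudocount=1.0):
    """Treated/control ratio; the pseudocount (an assumption) avoids
    division by zero for genes with no reads in the control."""
    return (treated + pseudocount) / (control + pseudocount)


def window_means(genes, window_bp=WINDOW_BP):
    """Average expression per non-overlapping window.

    `genes` is an iterable of (contig, start_bp, expression) tuples.
    Returns {(contig, window_index): mean expression}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for contig, start_bp, expression in genes:
        key = (contig, start_bp // window_bp)
        sums[key] += expression
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical toy records on a made-up contig name.
toy_genes = [
    ("contig_1", 10_000, 2.0),
    ("contig_1", 40_000, 4.0),
    ("contig_1", 60_000, 10.0),
]
toy_windows = window_means(toy_genes)
induced = fold_change(treated=3.0, control=1.0) >= 2.0
```

Windows whose mean stands well above the genome-wide background would then be candidates for the coexpression clusters discussed in the following sections.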

Response of G. clavigera to LPPE.

GOMiner (22) analysis of the 12-h LPPE gene-expression data identified enrichment of transcripts for several biological processes including carbohydrate metabolism (P < 0.001), alcohol metabolism (P < 0.001), glycolysis (P < 0.001), external encapsulating structure organization (P < 0.001), cellular protein metabolic processes (P < 0.001), and cellular aromatic compound metabolic processes (P < 0.001).

GHs can also contribute to fungal detoxification of sugar-conjugated antimicrobial compounds, such as phenolic glycosides and saponins (23, 24). Inspection of the GH gene expression activity 12 h following LPPE treatment indicated up-regulation of GHs targeting the plant cell wall (families: GH51, GH78, GH61, GH53, GH43) and up-regulation of an α-trehalase (GH37). Genes encoding proteins from families GH3, GH5, and GH39 were also induced. Although the substrate specificity of these GHs is unknown, the GH3 and GH39 proteins are likely intracellular, as they do not possess extracellular signal peptide sequences, whereas the GH5 has a secretion signal and no GPI anchor.

Gene-expression data suggest that LPPE treatment induced an oxidative stress response in Gc with induction of Mn/Fe and Cu/Zn superoxide dismutases, peroxidases, and a thioredoxin and thioredoxin reductase. Up-regulation of the eight subunits of the T-complex polypeptide involved in actin and tubulin folding, as well as the induction of the actin and tubulin genes themselves, may suggest an LPPE-induced reorganization of the Gc cytoskeleton. In addition, up-regulation of 19 Gc genes encoding proteasome and proteasome regulatory subunits may indicate induced protein turnover in response to LPPE treatment. After 36 h, many of the genes that were induced by LPPE treatment at 12 h were no longer induced. Among the genes up-regulated at 36 h, we found no evidence that transcripts for particular biological processes were enriched. However, at 36 h after exposure to LPPE, many of the highly expressed (P < 0.01) genes belonged to gene families with known roles in detoxification (e.g., oxidoreductases and CYP450s) (SI Appendix, Table S4). We also observed a large number of significantly induced transcription factors (SI Appendix, Table S4). We anticipated finding genes involved in β-ketoadipate metabolism because this pathway is commonly used for the aerobic detoxification of aromatic compounds in microorganisms. However, neither the Gc orthologs of N. crassa 3-carboxy-cis,cis-muconate cyclase (25) and of the Aspergillus nidulans phenylacetate catabolic gene cluster (26) nor genes identified in the TCA cycle were strongly induced in response to the LPPE treatment. As in A. nidulans, we noted that the genes of the phenylacetate catabolic pathway were clustered, although the cluster is expanded in Gc and included the phenylacetate 2-hydroxylase (GCSC_179: 1.126–1.135 Mb).

We investigated the genome regions surrounding the putative detoxification genes within the ECs (SI Appendix, Fig. S8). Our current expression data validated the two ECs identified by digital profiling ESTs following LPPE treatment (27). The additional gene annotation and expression data reported here allowed us to extend one of these clusters consisting of six loci by four loci to 10 gene models (GCSC_140; 1.13–1.15 Mb; Cluster I) (SI Appendix, Fig. S9 and Table S5). The region with the highest average expression levels over a 50-kb window in the LPPE-treated data (GCSC_173; 1.84–1.90 Mb; Cluster II) (SI Appendix, Table S4) contained 12 genes, all of which responded strongly to the LPPE treatment.

Response of G. clavigera to Terpenoid Treatment.

In the 12-h mycelial cultures treated with terpenoids we observed two overlapping GO clusters within the biological process hierarchy. The first cluster included genes annotated as mRNA processing (P < 0.001) and ribosome biogenesis (P < 0.001), and the second cluster included genes annotated as amino acid biosynthesis (P < 0.001). We observed the induction of genes encoding DNA repair, recombination, stability, and replication proteins, such as helix-destabilizing proteins, topoisomerases, ss-DNA binding protein, DNA repair nucleases, mismatch repair proteins, DNA ligases, a DNA glycosylase, and DNA polymerases. In addition, we observed the induction of genes encoding histones H2A, H2B, and H4, but alternate variants for H2A and H4 and histones H3 and H1 were strongly repressed. Strong induction of a putative H4 arginine methyltransferase, H3-K79 methyltransferase, ubiquitin conjugase, and SIR2-like deacetylase may implicate chromatin remodeling in the process of changing gene expression.

A marked change in gene expression was apparent at the 36-h time point following treatment with terpenoids compared with the 12-h time point. We observed among the differentially expressed genes two overlapping clusters of GO-terms within the biological process hierarchy. The clusters encompass lipid metabolic processes (P < 0.001) and alcohol metabolism (P < 0.001). The molecular function hierarchy contained several small clusters falling primarily within catalytic activity (P < 0.001) and microtubule-based processes (P < 0.001). Within these classifications, noteworthy members were oxidoreductase activity (P < 0.001), aldehyde dehydrogenase activity (P < 0.001), and electron carrier activity (P < 0.001). Encompassed within the microtubule-based processes were cytoskeleton organization and biogenesis (P < 0.001), cytoskeleton-based intracellular transport (P < 0.001), cellular localization (P < 0.001), and transport (P < 0.001). Our initial Kyoto Encyclopedia of Genes and Genomes annotations supported the GO analysis, indicating induction of the fatty acid and glyoxylate pathways. We examined the β-oxidation capacity of Gc and found that the FOX2 multifunctional β-oxidation enzyme (GLEAN_6203) was induced; however, the mitochondrial short-chain enoyl CoA hydratase (GLEAN_647) was more strongly induced. In addition, at both 12 and 36 h we observed strong induction for carnitine acyl transferase and for carnitine acetyl transferase, indicating that β-oxidation in the mitochondria may be favored over the peroxisome. We were not able to identify a peroxisomal acyl-CoA oxidase and we observed no increase in expression for peroxisomal catalases, indicating that this fungus likely uses a non-H2O2-forming pathway for peroxisomal β-oxidation, which is consistent with its phylogenetic position (28).

We identified a 100-kb EC on supercontig GCSC_108 (0.9–1.1 Mb) (Fig. 2). Within an ~85-kb core section of this genome region, 35 gene models were predicted, 18 were induced in response to the terpene treatment, 4 were repressed, and 12 were unchanged (SI Appendix, Table S6). The most strongly induced genes in this region were a flavoprotein monooxygenase (FMO), an FMO-like monooxygenase containing a lipocalin signature, and a short-chain dehydrogenase/reductase enzyme. In addition to these oxidoreductases were enzymes such as an epoxide hydrolase, alcohol dehydrogenase, and aldehyde dehydrogenase, which may be important for activating terpenoids or their intermediates for β-oxidation. Given the induction of genes in the β-oxidation pathway and genes clustered within the genome that may be involved in activating terpenoids for β-oxidation, we tested the ability of Gc to grow on the terpenoid blend as a sole carbon source. Consistent with the results of gene-expression profiling, Gc was able to grow on the terpenoid blend as a sole carbon source (SI Appendix, Fig. S10). Finally, we have begun exploring the contribution of the most strongly induced pleiotropic drug resistance transporter, GLEAN_8030, to Gc’s terpenoid tolerance. Deleting this gene using our recently developed split-marker Agrobacterium-mediated transformation system (29) prevented mycelial growth of the fungus on the terpene-supplemented media (SI Appendix, Fig. S11).

Fig. 2.
Gene expression cluster induced following terpenoid treatment. RNA-seq profiling reveals a cluster of coexpressed genes on supercontig GCSC_108. For complete details and the results of genome-wide mapping data, see SI Appendix, Fig. S6. From Top to Bottom ...

Discussion


We developed fundamental genomic and molecular resources for functional characterization of a bark beetle-symbiotic fungus and tree pathogen. Sequencing and assembly of the Gc genome involved next generation sequencing data and traditional finishing strategies. This approach resulted in a high-quality genome sequence. The genome size, repeat content, and gene collection are similar to other saprophytic and pathogenic fungi in the class Sordariomycetes. We identified evidence for the fungus-specific genome defense mechanism RIP. In N. crassa RIP occurs before or during meiosis, causing C•G to T•A mutations within duplicated sequences (30). This result indicates that Gc has sexual potential despite the Gc sexual cycle being rarely reported in field data and not yet achieved under laboratory conditions.

It is unknown if Gc relies on a limited number of modifying enzymes with broad substrate specificities, or a large number of enzymes with narrow substrate specificities for the detoxification of host defense metabolites, such as terpenoids and phenolics. The expansion of the Gc O-methyltransferase gene family may represent an opportunity for future work to test the range of substrate specificities of these enzymes and their possible role in detoxification of host metabolites. Some O-methyltransferases have known functions in plant phenolic metabolism (31) and in fungal and animal phenolic detoxification (32, 33). Noticeably, Gc had only 54 CYP450s and no obvious expansion of any CYP450 gene subfamilies, which is surprising given the potential of CYP450s to contribute to the transformation of host defense chemicals. Transcriptome sequencing identified a number of Gc CYP450s inducible by terpenoid or LPPE that warrant further investigation. These CYP450s will be explored in future work across a larger collection of different Gc strains.

Following a MPB attack or Gc inoculation, the concentration of host-defense chemicals, in particular terpenoids, increases (34). That Gc can overcome terpenoid defenses when N. crassa cannot suggests that this capability could be a critical pathogenicity factor in the MPB-Gc symbiosis. Responses to the LPPE and terpenoid treatments were substantially different; Gc growth was reduced by LPPE and delayed by terpenoids, and only 41 genes were induced by both treatments at 12 h. This set of 41 genes was enriched in general and chemical stress responders, including a putative DNA glycosylase and cytidine deaminase, suggesting that changes in DNA methylation or RNA/DNA editing may be important in early chemical stress responses.

The LPPE extract is a complex mixture of methanol-water soluble compounds (SI Appendix, Fig. S12), which contains defensive phenolic chemicals, sugars, and possibly other metabolites. Importantly, the LPPE captures a complexity of compounds encountered by the fungal propagules when deposited into the tree phloem, and this experimental treatment thus complements the more defined terpenoid treatment. When Gc was treated with LPPE the mycelia became pink, which may indicate that oxidized phenolic derivatives, such as quinones and free radicals, were generated. Consistent with this observation, genes involved in sugar utilization and response to oxidative stresses were strongly induced. The activation of genes and gene clusters that may be involved in the detoxification or degradation of host antimicrobial compounds suggests that Gc needs to detoxify its environment. The induction of transcription factors may reflect the regulatory coordination required to process the diverse collection of antimicrobial compounds in this phloem extract. In fungi, catabolic enzymes that degrade aromatic compounds can be encoded in gene clusters (26), and although we did not find substantial induction of genes known to be involved in aromatic degradation in Gc, LPPE treatment induced gene clusters. This finding suggests that Gc may use unique metabolic pathways to detoxify the host-specific pine-defense chemicals.

When Gc was treated with terpenoids we observed a lag phase in growth and indications that terpenoid treatment induced transcriptome reprogramming mediated by chromatin remodeling. We observed a gene cluster that spanned ~100 kb and that responded strongly to terpenoid treatment; functional annotation indicates that this cluster may contribute to early enzymatic steps in terpenoid metabolism. Coexpression clusters that span large genomic regions have been reported for higher eukaryotes (35), but not for fungi. Clusters may reflect selection for coordinated gene expression and reliable gene transmission (35, 36). For Gc, the origin and maintenance of such a cluster may reflect detoxification pathway optimization that minimizes accumulation of toxic metabolite intermediates. In support of this, in the prokaryote Burkholderia xenovorans nearly half of the 93 genes induced by the diterpene dehydroabietic acid occur in an ~80-kb region, and the genes in this region participate directly in dehydroabietic acid metabolism (37). At 36 h following terpenoid treatment, Gc genes involved in fatty acid metabolism were induced. This finding may indicate that subsequent steps involved in terpenoid detoxification occur through the β-oxidation pathway; consistent with this, we confirmed that Gc is able to use terpenoids as a sole carbon source. Fungi that are able to use hydrophobic substrates, such as long-chain alkanes via the β-oxidation pathway, have been described (38), and fungi cultured in the presence of small amounts of terpenoids often generate nonspecific oxidized derivatives; however, the ability to grow on alkanes, such as antimicrobial monoterpenoids, as a sole carbon source is unusual. The gene-expression data described here provide an opportunity for exploring the genes and mechanisms involved in this process.

By applying the genomic and molecular resources developed in this work we have begun to clarify the specialized mechanisms that Gc has developed, which allow it to tolerate terpenoids and grow in its pine host, an evolutionary adaptation that is an important factor in the interaction between host tree, the fungal pathogen, and its beetle vector.

Materials and Methods

Detailed materials and methods, including references, are described in the SI Appendix.


Gc strain kw1407 (NCBI Taxonomy ID: 655863) is deposited into the University of Alberta Mycological Herbarium 11150 along with the additional isolates used in this study (11151–11156).

Genome Sequence Finishing, ESTs, and Genome Annotations.

Genome sequence finishing was performed on the draft assembly described previously, with additional data: one lane of Illumina Genome Analyzer (GAii) from a 3-kb long insert library and 1,299 finishing reactions performed for filling gaps (SI Appendix). Telomeric repeats were identified using the sequence TTAGGG as a search sequence. ESTs were reported earlier (27). Gene models are a composite of ab initio and homology-based predictions generated using GLEAN (SI Appendix). Putative gene function assignments were generated from searches of the NCBI NR and Swissprot databases using BLAST and combined with PFAM domain assignments. GO annotations were assigned using Blast2GO. Predicted protein localizations were determined using SignalP, TMHMM, and WolfPsort.
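The telomere search described above can be approximated by scanning the ends of each supercontig for tandem copies of TTAGGG or its reverse complement, CCCTAA. The Python sketch below is one possible implementation, not the authors' script; the 200-bp end window, the minimum of three tandem repeats, and the function names are all assumptions chosen for the example.

```python
import re

TELOMERE_UNIT = "TTAGGG"  # search sequence named in the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def telomeric_ends(seq, end_bp=200, min_repeats=3):
    """Report whether each end of `seq` carries tandem telomeric repeats.

    Both orientations are accepted at both ends, because a supercontig may
    be stored on either strand. `end_bp` and `min_repeats` are illustrative
    defaults, not values from the article.
    """
    seq = seq.upper()
    units = (TELOMERE_UNIT, reverse_complement(TELOMERE_UNIT))
    pattern = re.compile(
        "|".join(f"(?:{unit}){{{min_repeats},}}" for unit in units)
    )
    return {
        "start": bool(pattern.search(seq[:end_bp])),
        "end": bool(pattern.search(seq[-end_bp:])),
    }


# Hypothetical toy contig with repeats at both ends.
toy_contig = "CCCTAA" * 4 + "ACGT" * 100 + "TTAGGG" * 4
toy_result = telomeric_ends(toy_contig)
```

A supercontig flagged at both ends is a candidate for a complete chromosome, while one flagged at a single end is a candidate chromosome arm; grouping such ends is how telomere evidence can constrain the chromosome count.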

Peptide Sequencing.

To obtain extracellular proteins for Gc, the fungus was grown on sawdust-agar plates overlaid with cellophane. After 3 d of growth, mycelia and cellophane were transferred to acetate buffer, centrifuged, and filtered (SI Appendix). The protein solution was concentrated and separated by 1D SDS/PAGE (SI Appendix). In-gel protein digests were performed for 16 bands cut from the 1D gel (SI Appendix). Peptide analysis was performed by tandem mass spectrometry (SI Appendix). Bioinformatic analysis was performed with custom scripts (SI Appendix and available upon request).

RNA-seq, Variant Detection, and Expression Analysis.

RNA-seq data were generated with an Illumina Genome Analyzer (GAii) from poly(A+) mRNA (SI Appendix). Sequence clusters were generated on an Illumina cluster station. Lanes were sequenced to 36 cycles. Postrun analysis was performed with the Illumina GA pipeline (v.1.0). Paired-end (PE) reads were aligned to the reference genome sequence using CLCbio’s Genomics workbench (http://www.clcbio.com/; CLCbio, DK); SNP prediction was also performed within this software package with additional postprediction filtering (SI Appendix). Culture conditions for mycelia generated for expression analysis, and for terpene and LPPE treatment preparations, are described in the SI Appendix. Treatments for transcriptome analysis were carried out using a TLC sprayer applying the treatment directly to culture surfaces with filtered nitrogen gas as the carrier.


The authors thank the Functional Genomics Group of the British Columbia Cancer Agency Genome Sciences Centre for expert technical assistance, and all of the excellent undergraduate students who have worked in the C.B. laboratory on this project. This work was funded by grants from the Natural Sciences and Engineering Research Council of Canada (to J.B. and C.B.), the British Columbia Ministry of Forests (to S.J.J., J.B., and C.B.), the Canadian Forest Service Genomics program (to R.C.H.), and funds from Genome Canada, Genome British Columbia and Genome Alberta (to J.B., C.B., R.C.H., and S.J.J.) in support of the Tria project (www.thetriaproject.ca). S.J.J. and M.A.M. are Michael Smith Distinguished Scholars. Salary support for J.B. came in part from a Natural Sciences and Engineering Research Council of Canada Steacie Fellowship and the University of British Columbia Distinguished Scholars Program. This is National Center for Biotechnology Information Genome Project 39847.


The authors declare no conflict of interest.

Data deposition: The sequences reported in this paper have been deposited in NCBI GenBank as assembly and annotations accession ACXQ00000000.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1011289108/-/DCSupplemental.


1. Seybold S, Bohlmann J, Raffa K. Biosynthesis of coniferophagous bark beetle pheromones and conifer isoprenoids: Evolutionary perspective and synthesis. Can Entomol. 2000;132:697–753.
2. Kurz WA, et al. Mountain pine beetle and forest carbon feedback to climate change. Nature. 2008;452:987–990.
3. Lee S, Kim J-J, Breuil C. Diversity of fungi associated with the mountain pine beetle Dendroctonus ponderosae and infested lodgepole pines in British Columbia. Fungal Divers. 2006;22:91–105.
4. Lee S, Kim J-J, Breuil C. Pathogenicity of Leptographium longiclavatum associated with Dendroctonus ponderosae to Pinus contorta. Can J For Res. 2006;36:2864–2872.
5. Ayres M, Wilkens R, Ruel J, Lombardero M. Nitrogen budgets of phloem-feeding bark beetles with and without symbiotic fungi. Ecology. 2000;81:2198–2210.
6. Bleiker K, Six D. Dietary benefits of fungal associates to an eruptive herbivore: Potential implications of multiple associates on host population dynamics. Environ Entomol. 2007;36:1384–1396.
7. Lieutier F, Yart A, Salle A. Stimulation of tree defences by Ophiostomatoid fungi can explain attack success of bark beetles on conifers. Ann For Sci. 2009;66:801–823.
8. Franceschi VR, Krokene P, Christiansen E, Krekling T. Anatomical and chemical defences of conifer bark against bark beetles and other pests. New Phytol. 2005;167:353–375.
9. Keeling C, Bohlmann J. Genes enzymes and chemicals of terpenoid diversity in the constitutive and induced defence of conifers against insects and pathogens. New Phytol. 2006;170:657–675.
10. Lee S, Kim JJ, Breuil C. Leptographium longiclavatum sp. nov., a new species associated with the mountain pine beetle, Dendroctonus ponderosae. Mycol Res. 2005;109:1162–1170.
11. Plattner A, Kim JJ, DiGuistini S, Breuil C. Variation in pathogenicity of a mountain pine beetle-associated blue-stain fungus, Grosmannia clavigera, on young lodgepole pine in British Columbia. Can J Plant Pathol. 2008;30(3):457–466.
12. DiGuistini S, et al. De novo genome sequence assembly of a filamentous fungus using Sanger 454 and Illumina sequence data. Genome Biol. 2009;10(9):R94.
13. Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21:i351–i358.
14. Hane JK, Oliver RP. RIPCAL: A tool for alignment-based analysis of repeat-induced point mutations in fungal genomic sequences. BMC Bioinformatics. 2008;9:478.
15. Li L, Stoeckert C, Jr., Roos D. OrthoMCL: Identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–2189.
16. De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: A computational tool for the study of gene family evolution. Bioinformatics. 2006;22:1269–1271.
17. Zabel RA, Morell JJ. Wood stains and discolorations. In: Zabel RA, Morell JJ, editors. Wood Microbiology: Decay and its Prevention. San Diego, CA: Academic Press; 1992. pp. 326–343.
18. Emanuelsson O, Brunak S, von Heijne G, Nielsen H. Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc. 2007;2:953–971.
19. Cantarel BL, et al. The Carbohydrate-Active EnZymes database (CAZy): An expert resource for Glycogenomics. Nucleic Acids Res. 2009;37:D233–D238.
20. Lieutier F, Berryman A. Preliminary histological investigations of the defence reactions of 3 pines to Ceratocystis-clavigera and two chemical elicitors. Can J For Res. 1988;18:1243–1247.
21. Gao Y, Chen T, Breuil C. Identification and quantification of nonvolatile lipophilic substances in fresh sapwood and heartwood of lodgepole pine (Pinus-contorta Dougl). Holzforschung. 1995;49(1):20–28.
22. Zeeberg BR, et al. GoMiner: A resource for biological interpretation of genomic and proteomic data. Genome Biol. 2003;4(4):R28.
23. Bouarab K, Melton R, Peart J, Baulcombe D, Osbourn A. A saponin-detoxifying enzyme mediates suppression of plant defences. Nature. 2002;418:889–892.
24. Zheng Z, Shetty K. Solid-state bioconversion of phenolics from cranberry pomace and role of Lentinus edodes β-Glucosidase. J Agric Food Chem. 2000;48:895–900.
25. Kajander T, et al. The structure of Neurospora crassa 3-carboxy-cis,cis-muconate lactonizing enzyme a β-propeller cycloisomerase. Structure. 2002;10:483–492.
26. Fernández-Cañón JM, Peñalva MA. Fungal metabolic model for human type I hereditary tyrosinaemia. Proc Natl Acad Sci USA. 1995;92:9132–9136.
27. Hesse-Orce U, et al. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera. BMC Genomics. 2010;11:536.
28. Shen Y-Q, Burger G. Plasticity of a key metabolic pathway in fungi. Funct Integr Genomics. 2009;9(2):145–151.
29. Wang Y, DiGuistini S, Wang T-CT, Bohlmann J, Breuil C. Agrobacterium-meditated gene disruption using split-marker in Grosmannia clavigera a mountain pine beetle associated pathogen. Curr Genet. 2010;56:297–307.
30. Galagan JE, Selker EU. RIP: The evolutionary cost of genome defence. Trends Genet. 2004;20:417–423.
31. Preisig CL, Matthews DE, VanEtten HD. Purification and characterization of S-adenosyl-L-methionine: 6-a-hydroxymaackiain 3-O-methyltransferase from Pisum sativum. Plant Physiol. 1989;91:559–566.
32. Männistö PT, Kaakkola S. Catechol-O-methyltransferase (COMT): Biochemistry molecular biology pharmacology and clinical efficacy of the new selective COMT inhibitors. Pharmacol Rev. 1999;51:593–628.
33. Feltrer R, Álvarez-Rodríguez ML, Barreiro C, Godio RP, Coque J-JR. Characterization of a novel 2,4,6-trichlorophenol-inducible gene encoding chlorophenol O-methyltransferase from Trichoderma longibrachiatum responsible for the formation of chloroanisoles and detoxification of chlorophenols. Fungal Genet Biol. 2010;47:458–467.
34. Raffa K, Berryman A. Physiological-aspects of lodgepole pine wound responses to a fungal symbiont of the mountain pine-beetle Dendroctonus ponderosae (Coleoptera Scolytidae). Can Entomol. 1983;115:723–734.
35. Hurst LD, Pál C, Lercher MJ. The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet. 2004;5:299–310.
36. Walton JD. Horizontal gene transfer and the evolution of secondary metabolite gene clusters in fungi: An hypothesis. Fungal Genet Biol. 2000;30(3):167–171.
37. Smith DJ, Park J, Tiedje JM, Mohn WW. A large gene cluster in Burkholderia xenovorans encoding abietane diterpenoid catabolism. J Bacteriol. 2007;189:6195–6204.
38. Thevenieau F, et al. Characterization of Yarrowia lipolytica mutants affected in hydrophobic substrate utilization. Fungal Genet Biol. 2007;44:531–542.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences

