J Bacteriol. 2010 Oct; 192(20): 5534–5548.
Published online 2010 Aug 13. doi:  10.1128/JB.00900-10
PMCID: PMC2950489

Metabolic Network Analysis of Pseudomonas aeruginosa during Chronic Cystic Fibrosis Lung Infection


System-level modeling is beginning to be used to decipher high throughput data in the context of disease. In this study, we present an integration of expression microarray data with a genome-scale metabolic reconstruction of Pseudomonas aeruginosa in the context of a chronic cystic fibrosis (CF) lung infection. A genome-scale reconstruction of P. aeruginosa metabolism was tailored to represent the metabolic states of two clonally related lineages of P. aeruginosa isolated from the lungs of a CF patient at different points over a 44-month time course, giving a mechanistic glimpse into how the bacterial metabolism adapts over time in the CF lung. Metabolic capacities were analyzed to determine how tradeoffs between growth and other important cellular processes shift during disease progression. Genes whose knockouts were either significantly growth reducing or lethal in silico were also identified for each time point and serve as hypotheses for future drug targeting efforts specific to the stages of disease progression.

The last decade has witnessed an explosion in both the quantity and the pace of biological discovery. High throughput methods have been developed and leveraged at an expanding rate, with the accumulation of high throughput data outstripping the capacity for analysis using conventional methods (16, 21). To face these new challenges, systems-focused methods have come to the forefront of biological discovery, enabling a synergistic merging of network analysis with the existing reductionist paradigms that have fueled biology for the past half-century (25, 40).

One of the most pressing applications of systems analysis is unraveling the myriad factors that combine to form human disease. This ambitious goal has motivated a surge of interest in the collection and analysis of microarray data, which has emerged as a dominant technology for gathering genome-scale data due to its relatively low cost, ubiquity, ease, and increasingly high resolution and reproducibility (42). In particular, microarrays for gene expression profiling have been used in longitudinal studies of disease, as they enable a glimpse at the internal changes cells undergo as a disease progresses. While many such studies have been published, very little model-driven analysis has been leveraged toward interpreting these data at the network level. There is a tremendous need for this next level of analysis, as a network approach promises a deeper mechanistic understanding of whole-cell phenotypes that will be crucial for determining better therapies in the future.

With the increase in life span of cystic fibrosis (CF) patients over the last several decades, bacterial infections of the thickened mucus of the lung have become the primary disease burden that must be managed in these patients today (23). The peculiarities of the CF lung mucosal environment render it a ripe environment for growth of Pseudomonas aeruginosa in particular, a notorious opportunistic pathogen that chronically infects the lungs of nearly every CF patient by an early age (32). Due to the ability of P. aeruginosa to thrive in many varied environments and its possession of a large number of regulators, it has been hypothesized that an important determinant of the virulence of this pathogen is its exceptional metabolic versatility and adaptability (37).

CF lung infections involve many adaptive stages as the bacteria respond to the host lung environment and as the lungs contemporaneously remodel based on the stresses of infection (18, 20, 35). Long-term bacterial adaptations have been studied in part through gene expression profiling, and it has been noted that a significant percentage of genes differentially expressed during chronic infection encode physiological or metabolic functions (12, 36). This finding reinforces the hypothesis that the metabolic versatility of P. aeruginosa is a large factor in its pathogenicity. As a tool in studying the metabolism of this opportunistic human pathogen, we have previously published a genome-scale reconstruction of the P. aeruginosa PAO1 strain (26). This reconstruction accounts for the functions of 1,056 genes, 883 reactions, and 760 metabolites, incorporating the functions of approximately 20% of the genes in the genome into a functional computational model that is amenable to metabolic flux-level analysis (9, 17, 31). Methods for integrating high-throughput data, including gene expression array data, with genome-scale models of metabolism in order to study tissue- or condition-dependent metabolic phenotypes are developing (1, 4-6, 22, 34). By integrating gene expression data from a longitudinal study of P. aeruginosa growth (12) with our model of P. aeruginosa metabolism (26), we are providing the first network-driven analysis of metabolic changes in P. aeruginosa growing in the CF lung. By evaluating the metabolic changes that occur in this environment, we offer a deeper understanding of how the metabolism of this pathogen adapts during a chronic CF infection and present a new way to view its evolving metabolic lifestyle.


Collection of transcriptomic data.

Microarray data were collected from a previous study of metabolic adaptations of P. aeruginosa in the CF lung environment (12) and analyzed in the context of a genome-scale metabolic model. The array data were collected from the online Gene Expression Omnibus (GEO) database, under accession number GSE10362. These data correspond to lung isolates of P. aeruginosa taken from a CF patient between February 1998 and September 2001, the year of the patient's death due to respiratory failure. The patient was chronically infected with P. aeruginosa at least as early as 1997 and was coinfected with Stenotrophomonas maltophilia during the last 7 months of life, as reported previously (13). Because no data before February 1998 are available, however, we refer herein to February 1998 as month 1 of the infection and refer to the dates at which bacteria were isolated as months 1 through 44, the last data point available. The patient died at the age of 34.

Microarray data and analysis.

Two lineages of P. aeruginosa were analyzed in the study, as identified previously by random amplification of polymorphic DNA (RAPD) typing: RAPD types 1a and 1b (13). The P. aeruginosa isolates were first collected and described in this previous publication (13), and microarray data on these isolates were collected and reported in a later publication (12). The microarray data set describes changes in expression of P. aeruginosa genes across four time points during the progression of lung infection in a single CF patient, with triplicate microarrays at each data point (12). More specifically, each set of microarray triplicates was taken for a single RAPD subtype isolated from the CF lung, after samples were extracted and grown to late logarithmic growth phase under in vitro rich medium conditions designed to roughly approximate those found in the lung (data initially published in reference 12; growth curves are shown in Fig. S1 in the supplemental material). Isolates of the 1a type were found at months 1, 8, 21, and 41 from the start of the study, while 1b samples were isolated at months 21, 40, and 44. At the month 44 time point, two samples of type 1b were isolated, and the resultant 6 microarrays were pooled together as a single time point for our analysis.

All analyses of 1a and 1b isolates were carried out separately. The data as collected from GEO have previously been robust multichip average (RMA) normalized, so we performed no further normalization. Microarray data were log2 transformed, and genes present in both the microarray and the P. aeruginosa metabolic model were kept for the remainder of the analysis (1,047 genes in all) (26).

Our aim in this study was to use these microarray data to tailor a metabolic model of P. aeruginosa to each available time point in order to characterize how the metabolism of P. aeruginosa changes over time during a CF lung infection. Several methods have been used to integrate regulatory information with genome-scale metabolic models in the past (1, 4-6, 22, 34). Most of these methods involve constraining reaction fluxes according to the up- and downregulation of metabolic genes as well as the logic associated with both isoenzymes (reaction x if gene 1 or gene 2) and multisubunit proteins (reaction x if gene 1 and gene 2).
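The isoenzyme and multisubunit logic described above can be illustrated with a minimal sketch. The nested-tuple rule encoding, the function name `reaction_is_active`, and the gene names are assumptions made for illustration only; they are not the representation used in the reconstruction or in the COBRA toolbox.

```python
from typing import Mapping, Union

# Hypothetical encoding: a rule is either a gene name or a tuple
# ("and" | "or", sub_rule, sub_rule, ...).
Rule = Union[str, tuple]


def reaction_is_active(rule: Rule, gene_on: Mapping[str, bool]) -> bool:
    """Evaluate a gene-protein-reaction rule against on/off gene states."""
    if isinstance(rule, str):
        # Genes absent from the mapping are treated as "on".
        return gene_on.get(rule, True)
    op, *operands = rule
    results = [reaction_is_active(sub, gene_on) for sub in operands]
    if op == "or":   # isoenzymes: any one gene product suffices
        return any(results)
    if op == "and":  # multisubunit protein: every subunit is required
        return all(results)
    raise ValueError(f"unknown operator: {op!r}")


# Illustrative states: gene 1 is "off", gene 2 is "on".
states = {"gene1": False, "gene2": True}
isoenzyme_rule = ("or", "gene1", "gene2")
complex_rule = ("and", "gene1", "gene2")

print(reaction_is_active(isoenzyme_rule, states))
print(reaction_is_active(complex_rule, states))
```

With gene 1 off and gene 2 on, the isoenzyme-catalyzed reaction remains available while the reaction requiring both subunits does not.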

While some methods impose arbitrary cutoffs on gene expression values to determine whether a gene is “on” or “off” for a given experimental condition or time point, instead we use a statistical test between microarray data at consecutive time points to determine genes which are significantly up- or downregulated. This approach enables us to use standard statistical cutoffs to determine significant genes, rather than imposing a single arbitrary cutoff for over 1,000 genes, when certain genes might require markedly different (and in many cases, unknown) levels of expression in order to be turned “on.”

The statistical test we used to determine significance of up- or downregulation of these genes was the significance analysis of microarrays (SAM) algorithm (39). For each set of microarrays from consecutive time points (for 1a and 1b strains), SAM was performed using a two-class, unpaired t statistic, with 720 permutations (or 800, in some cases). As a significance cutoff for gene expression changes, delta was chosen in each analysis to keep the 90th percentile false-discovery rate (FDR) at ≤0.05, and only gene changes below this threshold and at ≥2-fold were deemed significant. In the three transitions between time points available for the 1a RAPD type (transitions between months 1, 8, 21, and 41), 2, 233, and 2 genes, respectively, were found to vary significantly. In the two transitions between time points available for 1b isolates (i.e., transitions between months 21, 40, and 44), 19 and 42 genes, respectively, were found to vary significantly.
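As a rough illustration of the combined cutoff (FDR of at most 0.05 together with a change of at least 2-fold), the following sketch applies both filters to one gene. It does not implement SAM; the FDR value is assumed to be supplied by an external test, and computing the fold change from the difference of mean log2 values is an assumption of this sketch rather than a detail stated in the text.

```python
from statistics import mean


def is_significant(log2_a, log2_b, fdr, fdr_cutoff=0.05, min_fold=2.0):
    """Apply an FDR cutoff and a fold-change cutoff to one gene.

    log2_a, log2_b: replicate log2 expression values at two consecutive
    time points. fdr: value reported by an external SAM-like test.
    Returns (significant, direction).
    """
    log2_fold = mean(log2_b) - mean(log2_a)
    fold = 2.0 ** abs(log2_fold)          # back-transform to a linear ratio
    significant = fdr <= fdr_cutoff and fold >= min_fold
    direction = "up" if log2_fold > 0 else "down"
    return significant, direction


# Made-up replicate values, for illustration only.
print(is_significant([5.0, 5.1, 4.9], [6.2, 6.0, 6.1], fdr=0.01))
```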

Analysis with starting versus endpoint phenotypes.

Microarray data were also analyzed only taking the first and final time points of microarray data into account. To determine significant genes, SAM was performed between the two time points as described in the previous section, and genes with P values below a median FDR cutoff of 0.01 and with a fold expression change of ≥2 were deemed significant. This slightly more stringent cutoff was chosen to reduce false positives, because several more genes were found to be significant with this analysis than with the RAPD type-focused analyses described previously in this study. For this analysis, the two isolates from the 1b lineage at month 44 (M25 and M26) were pooled together and analyzed versus the first isolate (M1) in order to determine significant genes. These time series analyses were then further processed the same way as the 1a and 1b RAPD time series, as described in “Construction of state matrices” below.

Construction of state matrices.

Once the significantly varying genes were identified, these expression changes were used to define metabolic states for each time point of the microarray data set. The processing of microarray data to generate viable metabolic states of the computational model at each time point is described in Fig. 1. Initially, all genes were assumed to be “on” at every time point. Genes which had a significant expression change were then set to an “off” state at all time points before the change (in the case of an increase) or after the change (in the case of a decrease). In cases where expression changes conflicted due to the restriction to binary states (such as if a gene increased significantly in expression at two consecutive time points), the change with the highest P value was dropped, and this process was repeated until a viable state vector was found for the gene. Genes that were first upregulated and then later downregulated (or vice versa) were given on/off states appropriate to the timing of the changes (e.g., gene 3 in Fig. 1d). Genes in the metabolic model but not represented on the microarrays (9 genes in all) were considered to be “on” at all time points.
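The rules above for a single gene can be sketched as follows. The input format (a list of `(transition_index, direction, p_value)` tuples, where transition k lies between time points k and k+1) and the function name are assumptions for illustration. The conflict rule here drops the highest-P-value change among all of the gene's changes, which is one possible reading of the procedure.

```python
def build_state_vector(n_timepoints, changes):
    """Return a list of on/off (True/False) states for one gene.

    changes: list of (transition_index, direction, p_value), with
    direction "up" or "down"; transition k lies between time points
    k and k+1 (0-based).
    """
    changes = sorted(changes)

    def has_conflict(ch):
        # Two successive changes in the same direction cannot be
        # represented with binary on/off states.
        return any(a[1] == b[1] for a, b in zip(ch, ch[1:]))

    while has_conflict(changes):
        # Assumption: drop the least significant change and re-check.
        changes.remove(max(changes, key=lambda c: c[2]))

    if not changes:
        return [True] * n_timepoints      # no change: "on" throughout

    toggles = {k for k, _, _ in changes}
    # "On" before a first decrease, "off" before a first increase.
    state = changes[0][1] == "down"
    states = []
    for t in range(n_timepoints):
        states.append(state)
        if t in toggles:
            state = not state
    return states


# A gene upregulated at the first transition and downregulated at the last.
print(build_state_vector(4, [(0, "up", 0.01), (2, "down", 0.02)]))
```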

FIG. 1.
Schematic of microarray analysis method. The figure shows how the microarray data were analyzed and how the metabolic model was constrained based on the data. (a) We started with data from 24 normalized gene expression microarrays, corresponding to time ...

At each time point, the metabolic model was next grown in silico, and flux through the biomass production reaction was determined (see “Systems analysis” below). If biomass production was zero, an algorithm was employed to determine the least number of “off” genes that must be turned “on” at that state in order to sustain in silico growth. Namely, all genes were initially turned “on.” Then, one gene that should be “off” was chosen at random and was shut off. If this deletion prevented in silico growth, the gene was kept “on” and another gene was turned “off” at random. This process was repeated until no more deletions could be made without preventing model growth. The entire process was repeated 20 (or more) times with different random restarts, and from the collection of final states with the least number of “off” genes turned back “on” to enable in silico growth, the most common set of deleted genes was chosen as the final consensus state. All further analyses were performed using the consensus states. Genes turned on to allow model growth (despite transcriptomic evidence that they should be “off”) are termed “nixed gene knockouts.” The only isolates that needed to have genes “nixed” were those at time points 1 and 2 in lineage 1a, which each had 41 gene knockouts (KOs) nixed. All other models could produce biomass without alteration (see Fig. 2 and 3).
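The randomized procedure can be sketched as below, assuming a hypothetical callable `grows(off_genes)` that stands in for an FBA growth check. The toy growth rule, the gene names, and the function name are illustrative only. A single pass over each random ordering is used, on the assumption that switching off additional genes can never restore growth.

```python
import random
from collections import Counter


def consensus_off_set(should_be_off, grows, restarts=20, seed=0):
    """Randomized search for a viable set of "off" genes.

    should_be_off: genes the expression data mark as "off".
    grows: callable taking a set of "off" genes and returning True if
           the model can still produce biomass (stand-in for FBA).
    Returns (consensus_off_genes, nixed_genes).
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(restarts):
        order = sorted(should_be_off)     # sorted first for reproducibility
        rng.shuffle(order)
        off = set()
        for gene in order:
            if grows(off | {gene}):
                off.add(gene)             # deletion tolerated
            # otherwise the gene stays "on" (its knockout is nixed)
        outcomes.append(frozenset(off))
    # Keep the runs with the fewest nixed knockouts (largest "off" sets),
    # then take the most common set among them.
    most_off = max(len(o) for o in outcomes)
    best = [o for o in outcomes if len(o) == most_off]
    consensus, _ = Counter(best).most_common(1)[0]
    return consensus, frozenset(should_be_off) - consensus


# Toy rule: genes A and B are isoenzymes for an essential reaction,
# so growth fails only when both are off; C is dispensable.
def toy_grows(off_genes):
    return not {"A", "B"} <= off_genes


off_genes, nixed = consensus_off_set({"A", "B", "C"}, toy_grows)
print(sorted(off_genes), sorted(nixed))
```

In this toy case exactly one of the two isoenzyme knockouts must be nixed, and which one is reported depends on which outcome is most common across restarts.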

FIG. 2.
Characteristics of microarray data. (a) Table of characteristics of the two RAPD lineages. Isolate names, months, and mutS phenotypes have been reported previously (12, 13). The other fields are based on our in silico analysis, as described in Materials ...
FIG. 3.
In silico metabolic states. Distribution of genes that are differentially expressed in the in silico metabolic model, listed by metabolic system. Only genes in the metabolic model that are significantly differentially expressed are shown. Each row represents ...

Metabolic network reconstruction.

For growth simulations, we used the genome-scale metabolic network reconstruction of P. aeruginosa, as published previously (26). The reconstruction accounts for 1,056 genes and 883 reactions, with gene-protein-reaction logic connecting the genes to the reactions. One mistake in the reconstruction was corrected from the initial publication; the reaction “Rha-(a1,3)-GlcNac-pyrophosphorylundecaprenol synthesis” (RHA1GLCNACPPUNDs) was altered to produce GMP in the version described herein instead of UMP as described previously.


Previously, the composition of cystic fibrosis sputum was analyzed and used to define a synthetic cystic fibrosis medium (SCFM) (29). P. aeruginosa grown on SCFM maintained growth patterns and phenotypes similar to those of P. aeruginosa grown in sputum, confirming that the SCFM was a reasonable approximation of the environment in a CF lung (29). In order to model the composition of the CF lung environment, we developed an in silico SCFM, in which all components available in the SCFM were allowed into the model up to a fixed bound. Unless otherwise noted, all growth simulations were run on this in silico SCFM. The composition of the in silico SCFM is provided in Table S1 in the supplemental material.

Systems analysis.

For each time point, a “state” vector, which described the on/off states of every gene in the model, was generated (see “Construction of state matrices” above). Functions in the COBRA toolbox, version 1.3.1 (3), were used to constrain fluxes of reactions in the metabolic model based on the state vector (reactions that were “off” were constrained to zero flux), and flux balance analysis (FBA) simulations were run with the model constrained for each time point. For these simulations, biomass was the objective unless otherwise noted, and all simulations were done on in silico SCFM. An in silico lethal phenotype indicates a simulation in which no biomass could be produced. (For more background on FBA, see reference 17.)
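For reference, the constrained optimization solved in each of these simulations can be written in the standard FBA form. The symbols are notation introduced here and do not appear in the original text: S is the stoichiometric matrix, v the vector of reaction fluxes, c a vector selecting the objective (here the biomass reaction), and l and u the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\mathsf{T}} v \\
\text{subject to} \quad & S v = 0, \\
& l_j \le v_j \le u_j \quad \text{for every reaction } j, \\
& v_j = 0 \quad \text{for every reaction } j \text{ that is ``off'' in the state vector.}
\end{aligned}
```

An in silico lethal phenotype corresponds to an optimal objective value of zero.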

Flux variability analysis (FVA) (19) was performed using functionality in the COBRA toolbox. For determination of fluxes with a range and/or flux value of zero, a range/flux cutoff of <0.001 was used. A range cutoff of ≥800 was used for identifying fluxes with infinite ranges.
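The two cutoffs can be applied to FVA output as in the following sketch. The category labels and the function name are assumptions for illustration, and the minimum and maximum fluxes are assumed to have been computed elsewhere.

```python
ZERO_CUTOFF = 0.001       # ranges or fluxes below this are treated as zero
INFINITE_CUTOFF = 800.0   # ranges at or above this are treated as infinite


def classify_flux_range(v_min, v_max):
    """Classify one reaction from its FVA minimum and maximum flux."""
    span = v_max - v_min
    if span >= INFINITE_CUTOFF:
        return "infinite range"
    if span < ZERO_CUTOFF:
        if abs(v_min) < ZERO_CUTOFF and abs(v_max) < ZERO_CUTOFF:
            return "zero flux"
        return "fixed flux"
    return "variable"


# Made-up flux bounds, for illustration only.
for bounds in [(0.0, 0.0005), (5.0, 5.0002), (-1000.0, 1000.0), (0.0, 10.0)]:
    print(bounds, classify_flux_range(*bounds))
```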

Determination of maximum fluxes.

To determine maximum production of a given metabolite, we created “demand reactions” for each metabolite of interest. Demand reactions are created strictly for modeling purposes; they are reactions that simply drain a metabolite of interest out of the system. To create demand reactions for metabolites that are bound to some carrier molecule or cofactor, the cofactor was included as an output of the reaction to ensure that the “demand” being tested was for the metabolite of interest and not production of its cofactor. (For more details on demand reactions, refer to reference 24.)

Construction of Pareto optimum curves.

To construct a Pareto optimum curve for two reactions, the flux through one reaction (usually biomass) was fixed at different levels, and the flux through the other reaction was both maximized and minimized at each level. These upper and lower bounds gave shape to two Pareto curves—the optimal tradeoffs from maximizing and minimizing one reaction while fixing the other reaction to various points along its full possible range of fluxes—which together outline the edges of the solution space along the plane defined by the two reactions. Reactions optimized were in most cases the demand reactions for the listed metabolites, as described in “Determination of maximum fluxes” above.
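In symbols, and assuming the standard FBA constraints (S the stoichiometric matrix, v the flux vector, l and u the flux bounds), the procedure amounts to solving a pair of linear programs at each fixed level of the first reaction. The notation, including the level grid with K steps, is introduced here for illustration and is not taken from the original text.

```latex
\begin{aligned}
\beta_k &= \tfrac{k}{K}\,\beta_{\max}, \qquad k = 0, 1, \dots, K, \\
v_d^{\max}(\beta_k) &= \max_{v} \; v_d
  \quad \text{subject to} \quad S v = 0,\; l \le v \le u,\; v_{\text{bio}} = \beta_k, \\
v_d^{\min}(\beta_k) &= \min_{v} \; v_d
  \quad \text{subject to} \quad S v = 0,\; l \le v \le u,\; v_{\text{bio}} = \beta_k.
\end{aligned}
```

Here v_bio is the fixed flux (usually biomass), beta_max is its maximum attainable value, and v_d is the flux through the second reaction (usually a demand reaction). Plotting the maximum and minimum values against the fixed levels traces the two curves.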

Determination of important reactions for state transitions.

To determine the most important reactions involved in the phenotypic changes predicted in silico, we employed a three-step method. First, we identified the pool of reactions that were “on” in one state but “off” in a consecutive state of interest (as determined through our construction of the state matrices, described above). Next, all reactions from that pool were removed from the model in which they were active. Third, each reaction was added back to the model in turn, and maximal production was assessed in each case for the metabolic product of interest. The reaction that most increased production was then added to the “list” of significant reactions and was added back into the model. The third step was then repeated for the remaining pool of reactions. This process was repeated until there were two or more reactions equivalently influential for maximal production or until a significance cutoff was reached for the effects on the product of interest. In one case (discussed in Results), two reactions had to be added back concurrently to have a significant effect on maximal flux, but otherwise reactions were added back one at a time.
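The add-back procedure is a greedy selection, sketched below under stated assumptions: `max_production(active_reactions)` is a hypothetical callable standing in for an FBA optimization of the product's demand reaction, the reaction names and toy yields are made up, and the search simply stops when a tie or a sub-cutoff gain is encountered.

```python
def rank_reactions(pool, base_reactions, max_production, cutoff, tol=1e-9):
    """Greedily rank reactions by their effect on maximal production.

    pool: reactions "on" in one state but "off" in the other.
    base_reactions: reactions remaining after the pool is removed.
    max_production: callable mapping a set of active reactions to the
                    maximal flux of the product (stand-in for FBA).
    """
    active = set(base_reactions)
    remaining = set(pool)
    ranked = []
    while remaining:
        current = max_production(active)
        gains = {r: max_production(active | {r}) - current
                 for r in sorted(remaining)}
        best_gain = max(gains.values())
        if best_gain < cutoff:
            break                          # nothing left has a notable effect
        winners = [r for r, g in gains.items() if abs(g - best_gain) <= tol]
        if len(winners) > 1:
            break                          # equally influential reactions
        ranked.append(winners[0])
        active.add(winners[0])
        remaining.discard(winners[0])
    return ranked


# Toy production function with made-up contributions.
def toy_production(active):
    return 1.0 + 5.0 * ("R1" in active) + 2.0 * ("R2" in active)


print(rank_reactions({"R1", "R2", "R3"}, set(), toy_production, cutoff=0.5))
```

Because each candidate is tested individually, a pair of reactions that only has an effect when restored together would be missed by this sketch; handling such a case requires testing combinations.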

Gene essentiality.

For a given network state, gene essentiality was determined by deleting in silico all of the genes knocked out in that particular state plus each of the other genes in the model one at a time. In each case, biomass production was assessed, and genes for which a knockout yielded zero biomass were deemed essential.
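A minimal sketch of this screen follows, assuming a hypothetical callable `biomass_flux(off_genes)` in place of the FBA simulation; the toy model and gene names are illustrative only.

```python
def essential_genes(all_genes, state_knockouts, biomass_flux, tol=1e-9):
    """Single-gene deletion screen on top of a state's existing knockouts.

    biomass_flux: callable mapping a set of deleted genes to the maximal
                  biomass flux (stand-in for an FBA simulation).
    """
    background = set(state_knockouts)
    essential = []
    for gene in sorted(set(all_genes) - background):
        # Delete the state's knockouts plus one additional gene.
        if biomass_flux(background | {gene}) <= tol:
            essential.append(gene)
    return essential


# Toy model: gene E is always required; A and B are isoenzymes.
def toy_biomass(off_genes):
    if "E" in off_genes or {"A", "B"} <= off_genes:
        return 0.0
    return 1.0


genes = {"A", "B", "C", "E"}
print(essential_genes(genes, set(), toy_biomass))
print(essential_genes(genes, {"A"}, toy_biomass))
```

In the toy case, gene B becomes essential only in the state where its isoenzyme A is already knocked out, which illustrates why essentiality is evaluated separately for each network state.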

Data analysis platforms.

Microarray analysis was performed using SAM for Excel, version 3.09. Data handling was done in Excel 2007 and in Matlab R2008a. Growth simulations were performed using flux balance analysis as implemented in the COBRA toolbox, version 1.3.1 (3), using the free linear solver GLPK, the CPLEX solver as implemented in GAMS, or the solver gurobi. Simulations were done on a Dell laptop running Windows XP or on a linux server.


Integration of gene expression into the metabolic network.

Isolates of Pseudomonas aeruginosa were previously obtained from the lungs of a cystic fibrosis (CF) patient at several time points over 44 months of a chronic P. aeruginosa lung infection (13). Expression microarray data were later collected for these isolates, as grown on rich in vitro media designed to emulate conditions in the CF lung (12). We analyzed these gene expression data for significant changes over each pair of consecutive time points available for the isolates, as described in Materials and Methods. Analyses were performed separately for each of two isogenic lineages of P. aeruginosa that represent two predominant RAPD types obtained from this patient, termed 1a and 1b, as previously identified (13). The numbers of differentially expressed genes are shown in Fig. 2.

These gene expression changes were then used to constrain fluxes through reactions in a genome-scale metabolic reconstruction of P. aeruginosa, according to gene-protein-reaction associations accounted for in the reconstruction (26). Gene expression states were imposed on the metabolic model, with reactions catalyzed by downregulated genes being shut off at the appropriate time points. As a final processing step, genes for which removal was lethal in silico were “turned back on” (Fig. 1), leaving viable metabolic models capable of in silico growth at each time point. The reason for this final processing step was twofold. First, we generally assume when constructing in silico models that the bacteria are capable of producing biomass, since all viable bacteria must be capable of producing biomass in order to survive over the long term (such as the several-year time span of a chronic CF lung infection). Second, the microarray data for these isolates were collected on RNA extracted during late logarithmic growth phase in batch culture (under conditions designed to represent the lung environment), so the cells were actively growing when array data were taken.

Growth simulations were performed with the metabolic model using flux balance analysis (FBA), a computational technique that determines a metabolic flux distribution supporting optimal flux through a given reaction under steady-state conditions (17). When maximizing for a “biomass” reaction, FBA can give insight into factors such as growth viability and essentiality or auxiliary metabolic effects of putative gene knockouts. The metabolic model was “grown” in silico on synthetic cystic fibrosis medium (SCFM), which was modeled after the measured nutritional environment in the sputum of the CF lung (see Table S1 in the supplemental material for a full list of SCFM components) (29). A moderate flux of oxygen uptake was allowed into the system (approximately 1/50 what the wild-type [WT] strain would consume in silico if oxygen uptake were unlimited) to simulate microaerobic conditions (2). By allowing the model to grow on SCFM, we were able to simulate growth of P. aeruginosa at each time point in the CF lung environment from which the bacteria were isolated.

System-level adaptations of P. aeruginosa during CF lung growth.

The expression patterns of all genes that are differentially expressed in the time-point-specific metabolic models are shown in Fig. 3 (see Table S2 in the supplemental material for the specific gene names), with summary statistics listed in Fig. 2. All other genes in the model are considered “on” at each time point, as described above. The most obvious trend is the switching “on” of many genes between 8 months and 21 months in lineage 1a. This trend is found across nearly every subsystem of metabolism and is likely due to the loss of the mutS gene, as described in the next section. The subsystems that are most heavily represented with differentially expressed genes overall are “amino acid synthesis and metabolism” and “energy metabolism,” while in the 1b lineage, “carbon compound catabolism” is the subsystem most enriched for differentially expressed genes (these subsystems are enriched in differentially expressed genes by 12%, 10%, and 21%, respectively, versus their share of model genes). Interestingly, several genes are first upregulated and then downregulated again during the final time point. It is possible that this trend reflects an optimization of function and, as a consequence, a reduction in unnecessary catabolic processes as the immune system weakens in the final stages of disease (the patient died shortly after isolation of the bacteria at month 44). This is, indeed, what is observed in an analysis of the metabolic capacities of the metabolic network (see “Metabolic capacities” below). The most conspicuously absent systems in the set of differentially expressed 1b genes are cell wall and cofactor synthesis genes, which comprise a significant portion of the total metabolic genes (14%) but are not differentially expressed in the 1b lineage. It is possible that this feature is due to previous regulation in this lineage, since only late-stage isolates are available. Alternatively, this lack of differential expression could be due to the housekeeping nature of many of these genes.

Interpretation of mutS phenotype.

As reported previously (12), several of the P. aeruginosa isolates used in this study lost function of the mutS gene through a frameshift mutation, a mutation which is known to cause an increase in mutational rate (28). The status of the mutS gene for each time point is shown in Fig. 2. Unsurprisingly, the largest numbers of transcriptional changes are seen during the transition to an isolate lacking mutS, particularly in the transition from month 8 to month 21 for RAPD type 1a. This transition reflects the fact that the mutS gene was rendered nonfunctional at some point before the sample was taken, giving some time for adaptive mutations to occur. For RAPD type 1a, the final time point (month 41) contains the functional mutS gene. The relatively few gene expression changes between the 1a isolates at 21 and 41 months suggest that the isolate at 41 months (M23) is not an independent mutS+ strain that grew in the lung alongside the other RAPD type 1a strains but rather an isolate of the same strain after it regained the mutS phenotype. The retrieval of this function suggests a possible adaptive advantage to stabilizing the late-stage phenotype, since a continued high mutation rate might be maladaptive once an optimal metabolic state is reached.

Analysis of “nixed” reactions.

Because the inactivation of a significant number of genes was “nixed” in the M1 and M9 states in order to enable in silico growth (i.e., the genes were turned “on” in silico in order to enable biomass production, despite evidence from the microarray data that these genes should be “off” [Fig. 1e]), we analyzed characteristics of the “nixed” reactions to gain insight into possible roles that they might play in vivo. Functional classifications of these reactions (according to categories defined [26] based on pseudoCAP gene functions and pathway participation) were compared to the distribution of functional classes of reactions in the model as a whole, with percentages by which certain functions were enriched for nixed genes determined by subtracting the percentage of model reactions in a category from the percentage of nixed reactions in the category.
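The enrichment measure is a difference of two percentages, as in the following sketch. The category names and counts are made up for illustration and are not the values from the study; the function name is likewise hypothetical.

```python
def enrichment(nixed_counts, model_counts):
    """Percentage-point enrichment of each category among nixed reactions.

    Enrichment = (% of nixed reactions in the category)
               - (% of all model reactions in the category).
    """
    n_nixed = sum(nixed_counts.values())
    n_model = sum(model_counts.values())
    return {
        category: 100.0 * nixed_counts.get(category, 0) / n_nixed
                  - 100.0 * model_counts[category] / n_model
        for category in model_counts
    }


# Hypothetical counts, for illustration only.
model = {"category A": 10, "category B": 30, "category C": 60}
nixed = {"category A": 5, "category B": 5}

for category, value in enrichment(nixed, model).items():
    print(f"{category}: {value:+.0f}%")
```

A positive value indicates that the category is overrepresented among nixed reactions relative to its share of the model.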

By far the most enriched category among nixed reactions was the nucleotide synthesis pathway, in which 14 reactions were nixed (+17% enrichment). Other categories highly enriched in nixed reactions include lipid synthesis (+9%), nucleotide salvage pathways (+7%), and cell wall/lipopolysaccharide (LPS) synthesis pathways (+5%). The large number of nucleotide-related reactions whose knockouts were nixed is striking, and further analysis reveals that most nixed reactions in nucleotide-related categories are nucleotide phosphotransferases. Although the effects of changes in expression of these enzymes were removed from our analyses (through “nixing”), these phosphotransferases are likely candidates for drug targeting, due to their essentiality for production of biomass and their significant upregulation during in vivo growth. The reactions nixed in the M1 and M9 time points and the statistics of the pathways enriched for nixed reactions are listed in Tables S3 and S4 in the supplemental material, respectively.

Metabolic capacities.

In order to learn more about the alterations in P. aeruginosa metabolism occurring during the CF lung infection, we performed in silico simulations to determine the theoretical limits of production of a variety of important cellular products. In the 1a lineage, maximal production capacity of nearly every biomass component and virulence factor increased between the month 8 and month 21 time points. This increase is a direct result of the increased gene expression across the board in this lineage, and the timing is likely due to the mutS frameshift mutation that occurs between 8 and 21 months, enabling a rapid assimilation of adaptive changes. More surprising is the output of the 1b lineage, in which for nearly all virulence factors the capacity peaks at the month 40 time point and then declines in the final (month 44) time point. This trend can be seen in Fig. 4a, which displays the maximal theoretical production capability of several major virulence factors at each time point for both lineages. Figure 4b shows the maximal production capability of quorum-sensing molecules, which display a monotonic increase over time in both 1a and 1b lineages.

FIG. 4.
Maximum production of cellular products. (a) Maximum fluxes through production pathways for virulence factors. (b) Maximum production of quorum-sensing molecules. Biomass is also shown in both plots for comparison. Maximum fluxes are listed in relative ...

Intriguingly, optimal biomass production capacity in the 1b lineage increases monotonically with time, despite the fact that the maximal production capacity of many virulence factors decreases in the final time point. This decrease occurs for all virulence factors shown in Fig. 4a and suggests a honing of metabolism to optimize growth (and possibly other important cellular processes) at the expense of virulence factor production, an observation in good agreement with many previous studies which describe a general loss of virulence during chronic infection of the CF lung by P. aeruginosa (7).

To gain more insight into the relationship between growth rate and virulence factor production capacities at the different time states, we generated Pareto optimum curves, which denote the tradeoff between two possible cellular objectives (14). The Pareto curves define the perimeter of the feasible solution space in a plane defined by two metabolic flux vectors of interest, allowing enumeration of all possible combinations of the two fluxes that are feasible given the constraints of the system. This analysis enables a characterization of tradeoffs between metabolic processes for which the network is optimized. A typical set of tradeoffs for virulence factors is that shown in Fig. 5a, the tradeoff between rhamnolipid production and biomass. All virulence factors displayed this competitive tradeoff against biomass in Pareto analysis for both the 1a and 1b lineages (except in cases where a virulence factor could not be produced). The maximum rhamnolipid production capacity (and that of all other virulence factors examined in this study) shows a negative correlation with biomass for the entire length of the curve, indicating that production of this virulence factor competes for resources with production of biomass at all levels of production.

FIG. 5.
Cellular product tradeoff curves. Plots of tradeoffs between levels of production of competing cellular products. Tradeoffs were determined by optimizing for fluxes through the appropriate demand reactions (see “Construction of Pareto optimum ...

Even with the observed negative correlation between virulence factor maximum production and biomass production capacity, interesting dynamics in the CF lung data arise as time progresses. Figure 5b shows the tradeoff between alginate production and growth in the 1b lineage, which highlights a typical trend for virulence factors in this lineage. Consistent with our observations of maximal production capacities for alginate as well as most other virulence factors (Fig. 4a), maximal alginate production capacity first rises and then at the final time point falls below its value in the initial (month 21) time point. The 1b lineage seems able to produce virulence factors optimally at the month 40 time point but then begins honing metabolism and decreasing its ability to produce these virulence factors by the final stage of disease.

To get a better sense of how P. aeruginosa metabolism changed over time, we examined at each time point the optimal production of every individual component contributing to biomass. This set included all 20 amino acids, several cofactors, some from the breakdown of amino acid products, and nucleotides, phospholipids, peptidoglycan, and lipopolysaccharide (26) (a full list of biomass components is given in the supplemental material). Specifically, we were interested in how maximal production capacity of the different types of cellular products increased or decreased over the course of the CF infection. In the 1a lineage, maximal production capacity of every biomass component increased between the month 8 and the month 21 time points. The 1b lineage showed a more complex pattern, with some components increasing over time and others peaking at the month 40 time point and then decreasing. A noticeable trend was that the maximum production capacity of all biomass-associated cofactors either increased over time or stayed constant. This general trend of monotonic increase in maximum production over time was also seen for acetyl coenzyme A (acetyl-CoA) and succinyl-CoA, the two sole central metabolites included in biomass. On the other hand, peptidoglycan, the phospholipids, lipopolysaccharide, and glycogen maximum production capacities all peaked at the month 40 time point and then decreased at 44 months. These products could all be considered peripheral to basic cell function, serving to protect the cell against its environment or to store energy for the long term, both facets of cellular survival that might become less relevant in the final stages of a disease, when the host immune system is severely weakened and the external environment is enriched with decaying organic matter. 
The only biomass component whose maximum production decreased with time in the 1b lineage was L-glutamine; the maximum production was constant at 21 and 40 months but dropped off slightly (∼2%) at the month 44 time point. Since L-glutamine is an important metabolic cofactor participating in many metabolic reactions and pathways, this decrease in production could put limitations on some pathways that require glutamine as a nitrogen donor during the final month in which microarray samples were taken.

It is important finally to note that the results in this section all relate to maximum theoretical fluxes through the production pathways for the given cellular products, which outline the constraints of feasible fluxes at steady state but do not necessarily reflect the true fluxes through these pathways. Just because the maximum theoretical flux of a certain component increases over time does not necessarily mean that the actual flux will increase; however, an increased theoretical flux coupled with a decreased actual flux could give important insight into regulatory events controlling the pathways in question. This issue is addressed further in “Prediction of in vitro phenotypes” below.

Examination of genes contributing to phenotypic transitions.

As mentioned in the previous section, two notable transitions occur in the in silico phenotypes of the P. aeruginosa models: the 1a lineage experiences a large increase in its capacity to produce a host of metabolic products between 8 and 21 months, and the 1b lineage first increases its metabolic capabilities and then decreases its ability to produce certain products in the final time point (Fig. 4). We were interested in the genetic causes of these changes, so for each of these transitions, we identified ranked lists of reactions that had the largest effect on the maximal production capabilities of biomass and a set of virulence factors. The genes underlying these reactions constitute the most important genes for regulating metabolism of P. aeruginosa as predicted by the model.

Figure 6a shows the effects on maximal biomass production of adding reactions to the M9 model from the pool of reactions upregulated between month 8 and month 21, with reactions listed in the (descending) order in which they have the most effect when added cumulatively. The predicted increase in maximal biomass production between 8 and 21 months for the 1a lineage is most affected by gain of the NADH dehydrogenase (ubiquinone-8 and -2 protons) (NADH11) reaction, encoded by 13 genes (nuoA, nuoB, and nuoD to nuoN), of which 12 are significantly upregulated between the month 8 and month 21 time points. Addition of NADH11 to the M9 model increases the growth rate by 50%. Figure 6b lists the important reactions for increased maximal production of each virulence factor (and biomass) in descending order of importance, with the maximal fluxes (as percentages of maximal flux in the M13 model) before (top) and after (bottom) addition of the listed reactions to the M9 model. As indicated in Fig. 6b, a group of common reactions are predicted to be important for the increase in biomass production as well as production of most virulence factors between months 8 and 21. More details about these reactions are provided in Fig. 6c. The listed reactions comprise the complete set that can be added one at a time to improve production of the given cellular factor. The only exception to this rule is in the case of alginate production, for which two reactions (malonyl-ACP decarboxylase [MACPD] and phosphomannomutase [PMANM]) must be added together to make the pathway function. MACPD is essential for conversion of malonyl-ACP into acetyl-ACP, and PMANM is essential for conversion of mannose-6-phosphate into mannose-1-phosphate, both of which are used in the production of alginate. These two reactions are not important for biomass production capacity but are necessary for the production capacity of A-band O antigen and B-band O antigen in addition to alginate.
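
One way to obtain a cumulative ranking of this kind is a greedy forward selection, sketched below. This is an assumption about how such a ranking could be computed, not a description of the procedure used in this study. The function toy_max_biomass is a hypothetical stand-in for a flux balance calculation. Apart from the 50% gain for NADH11 reported above, the gains are invented, and NO_EFFECT_RXN is a made-up reaction name.

```python
# Greedy forward ranking of candidate reactions by cumulative effect on an
# objective. The objective function below is a hypothetical stand-in for an
# FBA solve; except for the NADH11 gain, its numbers are invented.

TOL = 1e-9


def rank_reactions(base, candidates, objective):
    """Add one reaction at a time, always choosing the largest gain."""
    active = set(base)
    remaining = list(candidates)
    current = objective(active)
    ranking = []
    while remaining:
        best = max(remaining, key=lambda rxn: objective(active | {rxn}))
        gain = objective(active | {best}) - current
        if gain <= TOL:
            break  # no remaining reaction improves the objective
        active.add(best)
        remaining.remove(best)
        current = objective(active)
        ranking.append((best, current))
    return ranking


def toy_max_biomass(active):
    """Invented objective: a fixed additive gain per active reaction."""
    gains = {"NADH11": 0.50, "PDH": 0.20, "CYRUBQ8": 0.15, "ENO": 0.10}
    return 1.0 + sum(gains.get(rxn, 0.0) for rxn in sorted(active))


CANDIDATES = ["ENO", "PDH", "NADH11", "CYRUBQ8", "NO_EFFECT_RXN"]
RANKING = rank_reactions(set(), CANDIDATES, toy_max_biomass)
for name, value in RANKING:
    print(f"{name}: objective after addition = {value:.2f}")
```

A purely additive objective cannot represent reactions that only help in combination, such as the MACPD and PMANM pair described above. Capturing that case would require evaluating pairs of reactions as well as single additions.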

FIG. 6.
Analysis of reactions important for M9-to-M13 transition. Results of an analysis of which reactions (rxns) contribute the most to the large increase in maximal production capabilities of biomass and virulence factors between the month 8 and month 21 ...

Reactions that displayed the largest effects on maximal production capacity of biomass and nearly all of the virulence factors included NADH11, cytochrome c reductase (CYRUBQ8), pyruvate dehydrogenase (PDH), and enolase (ENO) reactions. These reactions are highlighted in yellow in Fig. 6. Involved in oxidative phosphorylation and central metabolism (Fig. 6c), these reactions influence many processes in metabolism nonspecifically, as they generally increase the efficiency of metabolism. Reactions with a less pronounced, but still significant effect on biomass are colored orange; many of these reactions are important for production of virulence factors as well and are for the most part also involved in central processes.

In addition to reactions important for the M9-to-M13 transition, we were interested in genes underlying the decreased maximal production of virulence factors in the final time point of the 1b lineage, despite increased biomass production. To investigate this, we identified the reactions turned both on and off in the month 40-to-month 44 transition in the 1b lineage and determined which reactions had the largest effects on maximum biomass and virulence factor fluxes. The sole reaction whose activation in the final time point significantly affects biomass production is the pyruvate dehydrogenase (PDH) reaction; this reaction also accounts for increases in maximal fluxes for most other metabolic products that increase in the final time point (the other reaction that enables increased fluxes is the 2-oxoglutarate dehydrogenase [AKGD] reaction).

Competing with the activation of PDH in silico in the final time point are losses of cytochrome c oxidase (CYOO3_2), propionyl-CoA carboxylase (PPCOAC), and acyl-CoA dehydrogenase (2-methylbutanoyl-trans-2-methylbut) (ACOAD10), which either mostly or fully decrease maximal virulence factor fluxes alone (CYOO3_2 for pyocyanin, peptidoglycan, and homoserine lactone [HSL] production) or in combination (all three for rhamnolipids and A-band and B-band O-antigen production) and either mostly or fully counteract the increases from the gain of PDH. The loss of adenosine kinase (ADNK1) in the final time point is responsible for the inability of the model to produce pyocyanin but is not important for production of other virulence factors.

Possible reaction flux ranges.

To further investigate the changes in metabolism as the bacteria evolved in the CF lung, we performed a flux variability analysis (FVA) on both RAPD lineages for each time point. FVA is an extension of FBA, which explores the range of flux values for every reaction which can still sustain optimal growth. By tracking these ranges, we can investigate whether fluxes in different subsystems of metabolism become more or less rigid during the course of infection. A high rigidity for a given reaction flux indicates that flux through that reaction is crucial for sustaining optimal growth, while a lower rigidity (i.e., a larger range in allowable fluxes) indicates that, for instance, there might be alternate pathways to carry the reaction flux.
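
The idea of rigidity can be illustrated with a minimal sketch. It assumes a toy network in which two parallel routes, A and B, both feed biomass and total flux is limited by substrate uptake. The capacities and function names are hypothetical, and the flux range is computed analytically for this simple case rather than with a linear programming solver.

```python
# Flux variability on a toy network with two parallel routes (A and B) that
# both feed biomass. Capacities are hypothetical, not values from the model.


def optimal_biomass(uptake, cap_a, cap_b):
    """Biomass equals the total flux through A and B, limited by uptake."""
    return min(uptake, cap_a + cap_b)


def flux_range_route_a(uptake, cap_a, cap_b):
    """Minimum and maximum flux through route A at optimal biomass."""
    best = optimal_biomass(uptake, cap_a, cap_b)
    lowest = max(0.0, best - cap_b)  # route B carries as much as it can
    highest = min(cap_a, best)       # route A carries as much as it can
    return lowest, highest


# Route B switched off: route A is rigid (a single allowed value).
RIGID = flux_range_route_a(uptake=10.0, cap_a=10.0, cap_b=0.0)
# Route B available: route A can take a range of values at the same optimum.
FLEXIBLE = flux_range_route_a(uptake=10.0, cap_a=10.0, cap_b=6.0)
print("route B off:", RIGID)
print("route B on: ", FLEXIBLE)
```

With route B unavailable, route A must carry the full flux, so its range collapses to a single value. Once route B can carry part of the flux, the same optimal biomass is reached over a range of route A fluxes.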

Several interesting trends emerged from the analysis of flux variability. One striking result was that, in both the 1a and 1b lineages, flexibility of nitrogen metabolism increased over time. In the 1a lineage, denitrification reactions initially carried a zero flux, but at the month 21 time point, the flux expanded to cover a range of values. In the 1b lineage, an initially required high flux value, equal to the maximum flux of nitrate allowed into the cell in silico, was in the final time point (month 44) relaxed to allow lower values while still maintaining optimal growth. These changes can be seen in Fig. 7a and b, which illustrate flux tradeoffs between biomass and nitric oxide reductase (NOR), a reaction involved in denitrification, in the 1a and 1b lineages, respectively. The final time point in each lineage displays a vertical line on the right border of the plot, indicating a range of NOR flux values that sustain optimal growth. The black arrow in Fig. 7b highlights the progression of the biomass optimizing point in the 1b lineage, including the expansion in the final time point to encompass a range of NOR values. This trend in the 1b lineage suggests a lessening of the importance of the denitrification pathway for growth at the late stages of disease, while in the early stages, this pathway seems crucial for maximum growth. Therefore, although denitrification still appears important for growth in the later stages of infection, a range of fluxes through the pathway can sustain the highest level of biomass production.

FIG. 7.
Denitrification tradeoff curves. Pareto tradeoffs are depicted for reactions representing the denitrification pathway. Panels a and b show nitric oxide reductase flux versus biomass production for the 1a and 1b lineages, respectively. Black arrows are ...

To gain an appreciation for global changes in flux variability, we also performed FVA for all reactions in each in silico state of the 1a and 1b lineages. Flux ranges in general fell into four categories: (i) inflexible fluxes (range of ∼0), (ii) fluxes with bounded (but not zero) flexibility, (iii) infinitely flexible fluxes, and (iv) reactions that cannot carry flux (the last were excluded from the rest of the FVA). Infinitely flexible fluxes arise in metabolic models through the cycling of zero-sum loops, which generally do not reflect biologically significant states (38). Figure 7c displays statistics for how many reactions display fluxes with these different ranges in each functional category, for the M9 and M13 (month 8 and month 21) in silico states, when maximal biomass production was enforced. The categories that changed the most in flexibility between the M9 and M13 states are amino acid catabolism and amino acid synthesis, which both reflect a net gain in flexibility. This change reflects an increased capability to metabolize these important nutrient sources in the in vivo environment as time progressed in the infection.
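
The four categories can be assigned with a simple rule once the minimum and maximum flux of each reaction are known. The sketch below is illustrative only. The default flux bound of 1000 used to flag an "infinite" range, the numerical tolerance, and the reaction names are assumptions rather than values from this study.

```python
from collections import Counter

# Assumed conventions for this sketch (not taken from the study):
UNBOUNDED = 1000.0  # default flux bound standing in for "infinite"
TOL = 1e-9          # numerical tolerance for "zero"


def classify_range(v_min, v_max):
    """Assign one FVA range to one of the four categories in the text."""
    if abs(v_min) < TOL and abs(v_max) < TOL:
        return "cannot carry flux"
    if v_max - v_min < TOL:
        return "inflexible"
    if v_min <= -UNBOUNDED or v_max >= UNBOUNDED:
        return "infinitely flexible"
    return "bounded"


# Hypothetical (min, max) flux ranges at optimal growth.
RANGES = {
    "rxn_blocked": (0.0, 0.0),
    "rxn_fixed": (2.5, 2.5),
    "rxn_bounded": (0.0, 4.0),
    "rxn_loop": (-1000.0, 1000.0),
}
COUNTS = Counter(classify_range(lo, hi) for lo, hi in RANGES.values())
print(dict(COUNTS))
```

Counting the labels per functional category would give statistics of the kind summarized for the M9 and M13 states.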

Gene knockout analysis.

One powerful application of genome-scale metabolic models is determination of the effects of gene knockouts. If suppression of the functionality of a certain gene causes a significant reduction in the ability of a metabolic model to produce biomass, this gene is then predicted to be in silico essential given the specific growth medium. In silico gene essentiality predictions have achieved accuracies of over 85%, compared to in vitro essentiality data sets (26, 27), and are a powerful tool for determining possible novel drug candidates (reviewed in reference 25). If a gene is determined in silico to be necessary for growth under a certain condition, that gene becomes a putative target for novel antibiotic therapies. This method of determining novel antibiotic targets is one of the most direct and powerful applications of genome-scale metabolic models.

In order to investigate possible novel drug targets, the effects of gene knockouts on growth were determined at each time point. To account for possible differences in essential gene predictions that could arise from differences between our in silico formulation of SCFM and the in vivo environment experienced by P. aeruginosa, we simulated three variants of SCFM: SCFM (as used throughout the rest of this paper), anaerobic SCFM, and SCFM lacking tryptophan (since SCFM has a tryptophan concentration 1 order of magnitude lower than that of any other amino acid [29]). For all of these simulations (including those with the altered SCFM formulations), the isolate-specific models described above were used (i.e., we did not generate new isolate-specific models with the different in silico media or using any other microarray data). Results with the tryptophan-limited medium and normal (in silico) SCFM differed for only 5 genes. However, the first two time points of the 1a lineage required oxygen for growth, so the lack of oxygen reduced biomass production to zero for these time points. Of a total of 166 genes whose knockouts affected growth rate in normal SCFM by more than 5% in any of the in silico isolates, 15 genes lowered growth by 5 to 10%, 20 genes lowered growth by 10 to 30%, 0 genes lowered growth by 30 to 70%, and the remaining 131 genes lowered growth by >70% (in fact, all 131 of these knockouts were lethal). Full results of these analyses are listed in Table S5 in the supplemental material.
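
The grouping of knockouts by severity can be sketched as follows. The growth values and gene names are hypothetical. The percentage reduction is computed here as a simple relative difference from the unperturbed growth rate, and the handling of values that fall exactly on a bin boundary is also an assumption, so neither should be read as the exact procedure of this study.

```python
from collections import Counter


def growth_reduction(wild_type, knockout):
    """Percent reduction in growth relative to the unperturbed model."""
    return 100.0 * (wild_type - knockout) / wild_type


def severity_bin(reduction):
    """Map a percent reduction onto the bins used in the text."""
    if reduction > 70:
        return ">70%"
    if reduction > 30:
        return "30-70%"
    if reduction > 10:
        return "10-30%"
    if reduction > 5:
        return "5-10%"
    return "not growth reducing (<=5%)"


# Hypothetical growth rates after single-gene knockouts.
WILD_TYPE = 1.0
KNOCKOUT_GROWTH = {"geneA": 0.93, "geneB": 0.80, "geneC": 0.0, "geneD": 0.99}

BINS = {
    gene: severity_bin(growth_reduction(WILD_TYPE, growth))
    for gene, growth in KNOCKOUT_GROWTH.items()
}
print(BINS)
print(Counter(BINS.values()))
```

A knockout that abolishes growth, such as the hypothetical geneC, falls in the most severe bin, in line with the observation that all knockouts in the >70% class were lethal.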

Next, we performed in silico genome-wide knockout experiments at each time point available from the CF lung microarray data. This analysis encompasses the first integration of genome-scale metabolic modeling toward determination of optimal drug targets tailored to a specific disease environment. Figure 8a presents all gene knockouts that reduced the growth rate compared to the wild-type growth rate in normal SCFM, with results shown as percentages of wild-type growth under the given conditions. Boxes in the figure are colored based on the percentage of growth reduction associated with the given knockout (denoted by “P”) versus growth in normal SCFM at each time point, as described by the equation

equation M1

where equation M2 denotes the maximum relative biomass production rate of RAPD lineage r at time t with gene m knocked out and “SCFM” denotes growth on normal SCFM, with the model unconstrained by a state-specific gene knockout pattern. As these gene knockout predictions are based on the metabolic states derived from P. aeruginosa microarrays during a CF lung infection, they represent estimations of the P. aeruginosa metabolic genes that, if targeted with enzyme inhibitors, might be the most effective at inhibiting growth of P. aeruginosa in this specific environment. We also analyzed gene knockouts under each microarray condition in anaerobic SCFM and came up with the set of growth-reducing genes shown in Fig. 8b (time points 1 and 2 of the 1a isolate are excluded from this analysis, since these models are lethal with no genes knocked out). The five genes listed are also possible drug targets, due to the microaerobic and sometimes anoxic conditions in the CF lung (41).

FIG. 8.
Gene knockout analysis. For each gene at each time point, the percentages of growth reduction associated with the given knockouts versus growth in normal SCFM are indicated by colored boxes. The equation for determining these percentages is shown in the ...

It is interesting to note that there is little overlap between the 1a and 1b RAPD lineages in the set of genes whose knockouts are more growth reducing than the wild type in one of the isolates, as shown in Fig. 8a (with “growth reduction” being calculated by the equation given in the previous paragraph). Since the majority of growth-reducing gene knockouts were found in the 1a lineage, we also analyzed only the genes whose knockouts are growth reducing in the 1b lineage. Even for this subset of genes, only 40% of genes were also growth reducing when knocked out in any time point of the 1a lineage. This lack of congruity between good gene targets in two substrains growing even within the lungs of the same CF patient suggests how difficult it is to effectively target a microbial community even with broad antibiotic therapies. However, it is also worth recalling that data for the 1b lineage were taken from a later stage of infection than data available for the 1a lineage (Fig. 2b), so these differences in gene essentiality could be related to changes in the nutritional composition of the CF lung environment. Nonetheless, these results emphasize the value of network analysis that contextualizes high throughput data and that can be used to identify drug targets specific to the environment and regulatory state of the pathogen.

Prediction of in vitro phenotypes.

Bacterial isolates were assayed in vitro for production of alginate (mucoidy) and production of pyocyanin, as reported previously (12). These two phenotypes were previously tested over the 7 time points available for the 1a and 1b lineages (excluding two time points for which mucoidy data are not available). Four phenotypes matched between in silico and in vitro predictions, and eight phenotypes did not match (see Table S6 in the supplemental material). The nature of the mismatches is of the most interest. Five mismatches were false positives (i.e., the virulence factor could be produced in silico but was not produced in vitro), and three mismatches were false negatives. False positives do not necessarily indicate that the model is incorrect, since the model is merely testing whether the product could theoretically be produced given the current state of the system and not whether it actually is produced; it is possible for regulatory events to suppress transcription of biosynthetic genes or for certain biosynthetic genes to be inactivated through mutation. Most of the false positives were for alginate production in the later time points, which suggests that there might be some regulation suppressing alginate production at these time points, perhaps to conserve resources. It is difficult to determine the strength of this hypothesis, since the generally nonmucoid phenotype of isolates in this study is at odds with the observed increase in mucoidy in other chronic infections (15, 30). It is also possible that the in vivo mucoid phenotype is lost during isolation and culturing of the bacteria, as mucoidy is a notoriously unstable phenotype in vitro (8). The remaining false positives were for pyocyanin production of isolates M13 and M22. Again, we hypothesize that regulatory effects might play a role in suppressing pyocyanin production in these isolates.
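
The classification of each comparison can be made explicit with a short sketch. The observations listed are hypothetical placeholders and do not reproduce Table S6; only the labelling rule follows the definitions given above.

```python
from collections import Counter


def compare(in_silico, in_vitro):
    """Label one prediction against the corresponding in vitro result."""
    if in_silico == in_vitro:
        return "match"
    if in_silico:
        return "false positive"  # producible in silico, not seen in vitro
    return "false negative"      # seen in vitro, not producible in silico


# Hypothetical (isolate, phenotype) -> (in silico, in vitro) observations.
OBSERVATIONS = {
    ("isolate_1", "alginate"): (False, True),
    ("isolate_1", "pyocyanin"): (False, False),
    ("isolate_2", "alginate"): (True, False),
    ("isolate_2", "pyocyanin"): (True, True),
    ("isolate_3", "pyocyanin"): (True, False),
}

TALLY = Counter(compare(*pair) for pair in OBSERVATIONS.values())
print(dict(TALLY))
```

Because the in silico result only states whether a product could be made, a false positive is compatible with a correct model, whereas a false negative points to functionality that the model lacks.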

The three false negatives are more significant, since they indicate a failure of the in silico system to capture in vitro phenotypes. Production of pyocyanin and alginate depends on three reactions that are knocked out in the early time points of the 1a lineage: adenosylhomocysteinase (AHC), MACPD, and PMANM (Fig. 6b). The genes underlying these reactions (listed in Fig. 6c) are all significantly regulated but display fold changes with low magnitudes (the fold changes of the genes were all less than 3 although above the significance cutoff of 2). It is likely, therefore, that although these genes are expressed at a higher level in the later time points, they are expressed in the early time points at some basal level and thus allow some pyocyanin and alginate to be produced.

In addition to these phenotypes, isolates were assayed for production of 3-oxo-C12-HSL and n-butyryl-HSL (M. Hogardt, unpublished), two important quorum-sensing molecules in P. aeruginosa. As opposed to the maximum theoretical fluxes shown in Fig. 4b, which increase over time in both lineages, extraction experiments showed that all isolates in the study are lasR mutants and are negative for production of these two quorum-sensing molecules. These results are consistent with many previous studies which show a growth advantage for P. aeruginosa in the CF lung environment by loss of lasR gene functionality (7, 11, 33). Perhaps the metabolic burden of the large increase in capacity to produce quorum-sensing molecules seen in Fig. 4b is a factor contributing to the selective pressure for loss of the lasR gene in vivo.

The isolates were also assayed for auxotrophy at each time point (12). Here again, there were a large number of mismatches, as shown in Table S6 in the supplemental material, with three correct and four incorrect predictions of auxotrophy. At the first and second time points of the 1a lineage, the model predicted auxotrophy for multiple amino acids, whereas the in vitro results showed a lack of auxotrophy. Like the three false negatives for pyocyanin and alginate production, these results suggest that the state-specific models are missing some important functionality that is present in vitro. This could be due to several factors. First, our algorithm for determining time-specific metabolic states suppresses expression of genes to zero for a given time point if the genes are significantly downregulated. This binary suppression is necessary in order to reduce the number of assumptions in our model, since it is difficult to know how much the constraints for a given reaction should be reduced given downregulation of the corresponding genes. Because of this conversion of microarray data into binary states, the metabolic phenotypes revealed by our method might represent blocks of metabolism that are highly active, with other processes active but at lower levels. Alternatively, the failure of the model to capture metabolic activities that are happening in vitro could indicate the existence of metabolic pathways that are not accounted for in the model. Further, because the bacterial isolates are not guaranteed to be fully representative of the bacterial landscape at each time point (particularly as evidenced by the existence of multiple subtypes at the later time points), it is possible that the 1a and 1b lineages do not represent truly connected time series of bacterial adaptation but rather disconnected isolates that have adapted separately, which reduces the accuracy of the time-specific metabolic models. 
We partially address this concern in “Comparison of isolates from initial versus final time points” below. These are issues that are important to address in future experimental studies.

Comparison of isolates from initial versus final time points.

For all analyses presented so far in this study, bacterial isolates were split into RAPD subtypes for time course analysis. The method we used to analyze the microarray data assumes that the strains analyzed in one time course are of the same lineage, since differences in state of gene expression are determined based on significance of gene expression changes between states. As a validation of our analysis by RAPD type (RAPD analysis), we also investigated adaptations of P. aeruginosa in the CF lung environment without relying on the RAPD typing for comparison (non-RAPD analysis). We performed this analysis using only two time states, the first time point at which data were collected (month 1) and the final time point at which data were collected (month 44). Specifics of this analysis are described in “Analysis of starting versus endpoint phenotypes” in Materials and Methods.

The trends of maximum virulence factor and biomass component production between the first isolate (M1) and pooled data from the final isolates (M25 and M26) were compared between the RAPD and non-RAPD analyses. Trends were split into trivalued data, i.e., rising, falling, and no change. Two virulence factors (rhamnolipids and pyocyanin) displayed different trends between these two analyses, but every other virulence factor and biomass component displayed identical trends, giving a 97% match in phenotype between the two analyses. This high level of congruence supports the observation that general metabolic trends are conserved despite differences in ways data are analyzed using our method. This match is surprising, since there were significant differences in the M1 and M25/M26 states determined by the non-RAPD method versus the RAPD lineage method of analyzing the microarray data. Only 50% and 12% of the 251 and 116 genes turned off using either method of analysis at the M1 and M25/M26 time points, respectively, were turned off in both analyses. Therefore, the different methods were more consistent at the metabolic phenotype level than at the gene level.
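
The trend comparison can be sketched as follows. The production values are hypothetical, and the tolerance used to call "no change" is an assumption. Each product is reduced to one of three trends in each analysis, and the percentage of products with the same trend in both analyses is reported.

```python
def trend(first, last, tol=1e-6):
    """Reduce a pair of values to rising, falling, or no change."""
    if last - first > tol:
        return "rising"
    if first - last > tol:
        return "falling"
    return "no change"


def percent_match(analysis_a, analysis_b):
    """Percentage of shared products with the same trend in both analyses."""
    shared = analysis_a.keys() & analysis_b.keys()
    same = sum(
        trend(*analysis_a[name]) == trend(*analysis_b[name]) for name in shared
    )
    return 100.0 * same / len(shared)


# Hypothetical (first, final) maximal production values per product.
RAPD = {
    "alginate": (1.0, 0.8),
    "rhamnolipid": (1.0, 1.2),
    "biomass": (1.0, 1.5),
    "L-glutamine": (1.0, 1.0),
}
NON_RAPD = {
    "alginate": (1.0, 0.7),
    "rhamnolipid": (1.0, 0.9),
    "biomass": (1.0, 1.4),
    "L-glutamine": (1.0, 1.0),
}

MATCH = percent_match(RAPD, NON_RAPD)
print(f"{MATCH:.0f}% of trends agree")
```

Reducing the data to three trends means that two analyses can agree even when the underlying flux values differ, which is one reason agreement at the phenotype level can exceed agreement at the gene level.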

We also calculated conditional gene essentiality/growth reduction for the M1 and M25/M26 isolates by using the non-RAPD method (see Table S7 in the supplemental material). There are a surprisingly large number of differences in the gene essentiality study between the non-RAPD and the RAPD methods, with only 34% of genes whose knockouts are at least 5% more growth reducing than the wild type for the M1 state found to be growth reducing by both the RAPD and non-RAPD methods. For the M25/26 state, only 20% of genes that are growth reducing are conserved between the two methods. The only genes whose knockouts are determined to be growth reducing in the first and final time points by both the RAPD and the non-RAPD analysis methods are the 5 subunits of the cytochrome o ubiquinol oxidase complex. This complex is integral for transferring electrons from ubiquinol through the electron transport chain in the process of oxidative phosphorylation. Since the environment in the CF lung is thought to be microaerobic, reduction in cytochrome o ubiquinol oxidase function might not have a large effect on growth of P. aeruginosa in the in vivo environment, as oxygen uptake is already highly limited by the environment. However, this complex is worth investigating as a possible drug target for P. aeruginosa.


With the current study, model-driven systems biology is for the first time integrated with high throughput data to examine the complex processes underlying P. aeruginosa metabolic adaptations in a chronic CF lung infection. Specifically, microarray data of P. aeruginosa from different time points during a chronic CF lung infection were integrated with a genome-scale metabolic model of P. aeruginosa, which was built bottom-up from a combination of genomic and physiological data, enzyme and reaction information from various databases, and extensive literature-derived evidence. This integration has enabled a series of analyses of P. aeruginosa metabolism during the process of chronic infection, including (i) determination of changes in the ability of P. aeruginosa to produce various cellular products at different time points, (ii) analysis of tradeoffs between growth and virulence factor production, and (iii) predictions of stage-specific optimal gene targets for antibiotics.

The microarray data used in this study were analyzed previously without integration with a metabolic model (12). Standard statistical analysis methods were used to determine significant gene expression changes over time in isolates of P. aeruginosa from the CF lung, and then genes and pathways that were upregulated or downregulated over time were compared in order to categorize trends in adaptations of metabolism and other cellular networks over time. Our analysis differs from this initial study in that we are using the microarray data to establish constraints on a metabolic model, rather than overlaying the array data directly on top of a network. This is an important distinction, since an increase or decrease in expression of an enzyme is not necessarily correlated with the change in flux through it. If, for instance, one enzyme in a linear pathway is not upregulated, whereas all of the other enzymes are, the total flux through the pathway will be lowered to the maximum rate of the limiting enzyme. Our metabolic-model-driven analysis takes this characteristic into account by using the microarray data to alter the constraints on fluxes, rather than the values of the fluxes themselves. Fluxes are determined through flux balance analysis, which assumes optimal growth of the bacteria, a reasonable assumption, since the array data were taken on isolates in the late log phase of batch growth (12).
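
The point about a limiting enzyme in a linear pathway can be made concrete with a small sketch. The enzyme names and capacities are hypothetical. At steady state, the flux through a linear pathway cannot exceed the smallest capacity along it.

```python
def pathway_capacity(enzyme_capacities):
    """Maximum steady-state flux through a linear pathway."""
    return min(enzyme_capacities.values())


def limiting_enzyme(enzyme_capacities):
    """Name of the enzyme that sets the pathway capacity."""
    return min(enzyme_capacities, key=enzyme_capacities.get)


# Hypothetical capacities before and after upregulation of E1 and E3 only.
BEFORE = {"E1": 5.0, "E2": 5.0, "E3": 5.0}
AFTER = {"E1": 10.0, "E2": 5.0, "E3": 10.0}

print("before:", pathway_capacity(BEFORE))
print("after: ", pathway_capacity(AFTER), "limited by", limiting_enzyme(AFTER))
```

Two of the three enzymes double in capacity, yet the pathway capacity is unchanged because E2 still limits it. This is why expression changes are applied as constraints on fluxes rather than read directly as flux changes.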

Expression microarrays are well suited for integration with genome-scale metabolic networks. Gene expression data can be used to model the constraints on metabolic systems, and genome-scale networks are effective tools for modeling the effects of gene knockouts (1). By imposing these gene expression-derived constraints on a network reconstruction while running flux balance analysis (FBA), in silico flux distributions can be generated for a given metabolic environment. FBA predictions have been shown to become more accurate as bacteria adapt to the environment in which they live, and improvements in the predictions have been shown even within a few hundred or thousand generations (10). Since the genetic alterations occurring in the microbe on the CF lung take place on a much slower time scale than this metabolic equilibration, the assumption of quasi-steady state that underlies FBA is highly appropriate for this system, and the analysis of optimal metabolic capacities becomes particularly relevant.

Examining the adaptation of P. aeruginosa strains during the course of a CF lung infection poses a number of computational challenges, however. We assume in this study that a gene that is significantly downregulated is turned “off” and allow no flux to run through the reaction catalyzed by the associated gene product. This assumption is made in order to implement our analysis, since accurately judging the correlation between the gene expression values for over 1,000 enzymes and the maximum flux carried by the reactions they catalyze remains a significant challenge. However, this method has its drawbacks; it is difficult to model the nuanced reality of metabolism, where production of certain enzymes can be reduced but not completely eliminated. This is particularly relevant for the genes that our analysis identified as “off” at certain time points but that were kept on due to lethality of the knockouts.
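
A minimal sketch of how a binary "off" state could be translated into flux constraints is given below. The gene names, reaction names, flux bounds, and the nested-tuple encoding of gene-protein-reaction rules are all assumptions made for illustration. The protected argument stands in for reactions that are kept on because switching them off would be lethal.

```python
def rule_is_active(rule, off_genes):
    """Evaluate a gene-protein-reaction rule given a set of genes that are off.

    A rule is a gene name or a tuple ("and", ...) or ("or", ...), where "and"
    models a complex and "or" models isozymes.
    """
    if isinstance(rule, str):
        return rule not in off_genes
    operator, *parts = rule
    results = [rule_is_active(part, off_genes) for part in parts]
    return all(results) if operator == "and" else any(results)


def constrain(bounds, rules, off_genes, protected=()):
    """Set bounds to zero for reactions whose rule is inactive."""
    constrained = dict(bounds)
    for reaction, rule in rules.items():
        if reaction in protected:
            continue  # kept on, for example because its loss would be lethal
        if not rule_is_active(rule, off_genes):
            constrained[reaction] = (0.0, 0.0)
    return constrained


# Hypothetical reactions, rules, and bounds.
BOUNDS = {"RXN_COMPLEX": (0.0, 1000.0), "RXN_ISOZYME": (-1000.0, 1000.0)}
RULES = {
    "RXN_COMPLEX": ("and", "gene1", "gene2"),
    "RXN_ISOZYME": ("or", "gene3", "gene4"),
}
OFF_GENES = {"gene2", "gene3"}

CONSTRAINED = constrain(BOUNDS, RULES, OFF_GENES)
print(CONSTRAINED)
```

Losing one subunit of a complex closes the reaction, whereas losing one of two isozymes does not. A graded scheme that scales the bounds with expression would need a quantitative expression-to-capacity relationship, which the text above identifies as an open challenge.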

Also, because of the way we constructed the gene expression states for each strain, it must be noted that the in silico gene expression at each time point is a function of all of the other time points available, rather than just being a function of the expression values at that time point. Therefore, the initial in silico state of the 1b lineage might have looked quite different had expression data for earlier stages of that lineage been available. This could be problematic especially if the isolates are not actually representative of a series of changes in the population of bacteria over time but rather are more representative of the within-population variability that occurs at any given time point. We partially addressed this issue in “Comparison of isolates from initial versus final time points” above, eliminating the assumption that isolates of a given clonal lineage are descended directly from earlier isolates of the same lineage by comparing the first isolate (M1) and the last isolates (M25/M26) and ignoring all of the isolates in between. We found a strong congruence in our predictions of certain model capabilities between the two analyses (e.g., trends in the ability of the models to produce various metabolic products), although essentiality predictions varied significantly. This indicates that the method used to generate our isolate-specific models (Fig. 1) should be considered when interpreting results. These issues should be addressed further in future studies through high throughput experimentation designed specifically to be used with metabolic models. With the currently available microarray data, it is difficult to say to what degree the clonal isolates are representative of the metagenome at each time point, and it is thus difficult to know definitively how the entire population is evolving.
Perhaps an appropriate design for a future study would involve metagenomic sequencing of sputum at each time point, thus eliminating the reliance on single strains as proxies for an entire unknown bacterial milieu. Another complication is the use of the P. aeruginosa PAO1 gene chip for microarray analysis of unsequenced strains. It is possible that these pathogenic CF isolates contain genomic islands that are not represented on the chip or contain other variations that might not emerge from the microarray analysis.

Even with these drawbacks, the analysis presented here can provide novel and important insights into the evolution of P. aeruginosa in the CF lung. Without this type of modeling, optimal tradeoffs between important cellular products would be impossible to determine. Changes in optimal points as the bacteria evolve could be determined in a descriptive manner but could not be understood mechanistically without such computational context. Furthermore, it is notable that, even with the imperfections inherent in the analysis methods employed, insights from these models are still likely to be relevant. If a gene is modeled as “off” and yet in vivo the gene is downregulated and the product is still produced at some low level, the portions of metabolism increasing or decreasing in usage will likely be the same, just to different degrees. We have been careful in our analysis to focus on the qualitative characteristics of the tradeoff curves and the types of changes that happen over time rather than on specific values of fluxes. These properties are most likely to stay consistent given noise in the system or partial deactivation of certain metabolic genes.

With this study, we have incorporated a novel set of techniques, system-level network modeling, to address a highly relevant clinical problem and attempt to place existing high-throughput data into a more mechanistic context than could otherwise be achieved. The results are suggestive of the tremendous potential of these approaches for aiding mechanistic understanding of bacterial adaptations during infection. Future longitudinal studies designed with consideration of network analysis could significantly improve the effectiveness of these methods and could lead to novel insights not accessible with current approaches. This analysis may facilitate identification of effective drug targets at different stages of P. aeruginosa CF lung infection and better characterize the adaptations occurring in this pathogen in the CF lung. This integration of high-throughput data with a well-validated network model of P. aeruginosa offers tremendous potential for understanding the progression of disease in a systematic way, enabling a glimpse at some of the complex processes underlying this devastating chronic human disease.



We thank Paul Jensen and Sudhir Chowbina for helpful comments and suggestions in formulating the analyses for this study, and we thank Brian Hall for elucidating some aspects of the bacterial biology and RAPD typing. We appreciate the assistance of Jennifer Bartell with the preparation of the manuscript.

We acknowledge funding from the National Science Foundation (CAREER grant 0643548), the National Institutes of Health (an NIH biotechnology training grant to M.A.O.), and the Cystic Fibrosis Research Foundation (grant 1060 to J.A.P.). M. Hogardt was funded by the German Research Foundation (priority program SPP1316).


Published ahead of print on 13 August 2010.

Supplemental material for this article may be found at http://jb.asm.org/.


1. Akesson, M., J. Forster, and J. Nielsen. 2004. Integration of gene expression data into genome-scale metabolic models. Metab. Eng. 6:285-293.
2. Alvarez-Ortega, C., and C. S. Harwood. 2007. Responses of Pseudomonas aeruginosa to low oxygen indicate that growth in the cystic fibrosis lung is by aerobic respiration. Mol. Microbiol. 65:153-165.
3. Becker, S. A., A. M. Feist, M. L. Mo, G. Hannum, B. O. Palsson, and M. J. Herrgard. 2007. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA toolbox. Nat. Protoc. 2:727-738.
4. Becker, S. A., and B. O. Palsson. 2008. Context-specific metabolic networks are consistent with experiments. PLoS Comput. Biol. 4:e1000082.
5. Colijn, C., A. Brandes, J. Zucker, D. S. Lun, B. Weiner, M. R. Farhat, T. Y. Cheng, D. B. Moody, M. Murray, and J. E. Galagan. 2009. Interpreting expression data with metabolic flux models: predicting Mycobacterium tuberculosis mycolic acid production. PLoS Comput. Biol. 5:e1000489.
6. Covert, M. W., and B. O. Palsson. 2002. Transcriptional regulation in constraints-based metabolic models of Escherichia coli. J. Biol. Chem. 277:28058-28064.
7. D'Argenio, D. A., M. Wu, L. R. Hoffman, H. D. Kulasekara, E. Deziel, E. E. Smith, H. Nguyen, R. K. Ernst, T. J. Larson Freeman, D. H. Spencer, M. Brittnacher, H. S. Hayden, S. Selgrade, M. Klausen, D. R. Goodlett, J. L. Burns, B. W. Ramsey, and S. I. Miller. 2007. Growth phenotypes of Pseudomonas aeruginosa lasR mutants adapted to the airways of cystic fibrosis patients. Mol. Microbiol. 64:512-533.
8. DeVries, C. A., and D. E. Ohman. 1994. Mucoid-to-nonmucoid conversion in alginate-producing Pseudomonas aeruginosa often results from spontaneous mutations in algT, encoding a putative alternate sigma factor, and shows evidence for autoregulation. J. Bacteriol. 176:6677-6687.
9. Feist, A. M., M. J. Herrgard, I. Thiele, J. L. Reed, and B. O. Palsson. 2009. Reconstruction of biochemical networks in microorganisms. Nat. Rev. Microbiol. 7:129-143.
10. Fong, S. S., and B. O. Palsson. 2004. Metabolic gene-deletion strains of Escherichia coli evolve to computationally predicted growth phenotypes. Nat. Genet. 36:1056-1058.
11. Heurlier, K., V. Denervaud, and D. Haas. 2006. Impact of quorum sensing on fitness of Pseudomonas aeruginosa. Int. J. Med. Microbiol. 296:93-102.
12. Hoboth, C., R. Hoffmann, A. Eichner, C. Henke, S. Schmoldt, A. Imhof, J. Heesemann, and M. Hogardt. 2009. Dynamics of adaptive microevolution of hypermutable Pseudomonas aeruginosa during chronic pulmonary infection in patients with cystic fibrosis. J. Infect. Dis. 200:118-130.
13. Hogardt, M., C. Hoboth, S. Schmoldt, C. Henke, L. Bader, and J. Heesemann. 2007. Stage-specific adaptation of hypermutable Pseudomonas aeruginosa isolates during chronic pulmonary infection in patients with cystic fibrosis. J. Infect. Dis. 195:70-80.
14. Intriligator, M. D. 2002. Mathematical optimization and economic theory. Society for Industrial and Applied Mathematics, Philadelphia, PA.
15. Kobayashi, H. 2005. Airway biofilms: implications for pathogenesis and therapy of respiratory tract infections. Treat Respir. Med. 4:241-253.
16. Kyrpides, N. C. 2009. Fifteen years of microbial genomics: meeting the challenges and fulfilling the dream. Nat. Biotechnol. 27:627-632.
17. Lee, J. M., E. P. Gianchandani, and J. A. Papin. 2006. Flux balance analysis in the era of metabolomics. Brief. Bioinform. 7:140-150.
18. Lory, S., and J. K. Ichikawa. 2002. Pseudomonas-epithelial cell interactions dissected with DNA microarrays. Chest 121:36S-39S.
19. Mahadevan, R., and C. H. Schilling. 2003. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab. Eng. 5:264-276.
20. Mathee, K., G. Narasimhan, C. Valdes, X. Qiu, J. M. Matewish, M. Koehrsen, A. Rokas, C. N. Yandava, R. Engels, E. Zeng, R. Olavarietta, M. Doud, R. S. Smith, P. Montgomery, J. R. White, P. A. Godfrey, C. Kodira, B. Birren, J. E. Galagan, and S. Lory. 2008. Dynamics of Pseudomonas aeruginosa genome evolution. Proc. Natl. Acad. Sci. U. S. A. 105:3100-3105.
21. McPherson, J. D. 2009. Next-generation gap. Nat. Methods 6:S2-S5.
22. Moxley, J. F., M. C. Jewett, M. R. Antoniewicz, S. G. Villas-Boas, H. Alper, R. T. Wheeler, L. Tong, A. G. Hinnebusch, T. Ideker, J. Nielsen, and G. Stephanopoulos. 2009. Linking high-resolution metabolic flux phenotypes and transcriptional regulation in yeast modulated by the global regulator Gcn4p. Proc. Natl. Acad. Sci. U. S. A. 106:6477-6482.
23. Murray, T. S., M. Egan, and B. I. Kazmierczak. 2007. Pseudomonas aeruginosa chronic colonization in cystic fibrosis patients. Curr. Opin. Pediatr. 19:83-88.
24. Oberhardt, M. A., A. K. Chavali, and J. A. Papin. 2009. Flux balance analysis: interrogating genome-scale metabolic networks. Methods Mol. Biol. 500:61-80.
25. Oberhardt, M. A., B. O. Palsson, and J. A. Papin. 2009. Applications of genome-scale metabolic reconstructions. Mol. Syst. Biol. 5:320.
26. Oberhardt, M. A., J. Puchalka, K. E. Fryer, V. A. Martins dos Santos, and J. A. Papin. 2008. Genome-scale metabolic network analysis of the opportunistic pathogen Pseudomonas aeruginosa PAO1. J. Bacteriol. 190:2790-2803.
27. Oh, Y. K., B. O. Palsson, S. M. Park, C. H. Schilling, and R. Mahadevan. 2007. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J. Biol. Chem. 282:28791-28799.
28. Oliver, A., R. Canton, P. Campo, F. Baquero, and J. Blazquez. 2000. High frequency of hypermutable Pseudomonas aeruginosa in cystic fibrosis lung infection. Science 288:1251-1254.
29. Palmer, K. L., L. M. Aye, and M. Whiteley. 2007. Nutritional cues control Pseudomonas aeruginosa multicellular behavior in cystic fibrosis sputum. J. Bacteriol. 189:8079-8087.
30. Ramsey, D. M., and D. J. Wozniak. 2005. Understanding the control of Pseudomonas aeruginosa alginate synthesis and the prospects for management of chronic infections in cystic fibrosis. Mol. Microbiol. 56:309-322.
31. Reed, J. L., I. Famili, I. Thiele, and B. O. Palsson. 2006. Towards multidimensional genome annotation. Nat. Rev. Genet. 7:130-141.
32. Sadikot, R. T., T. S. Blackwell, J. W. Christman, and A. S. Prince. 2005. Pathogen-host interactions in Pseudomonas aeruginosa pneumonia. Am. J. Respir. Crit. Care Med. 171:1209-1223.
33. Sandoz, K. M., S. M. Mitzimberg, and M. Schuster. 2007. Social cheating in Pseudomonas aeruginosa quorum sensing. Proc. Natl. Acad. Sci. U. S. A. 104:15876-15881.
34. Shlomi, T., M. N. Cabili, M. J. Herrgard, B. O. Palsson, and E. Ruppin. 2008. Network-based prediction of human tissue-specific metabolism. Nat. Biotechnol. 26:1003-1010.
35. Smith, E. E., D. G. Buckley, Z. Wu, C. Saenphimmachak, L. R. Hoffman, D. A. D'Argenio, S. I. Miller, B. W. Ramsey, D. P. Speert, S. M. Moskowitz, J. L. Burns, R. Kaul, and M. V. Olson. 2006. Genetic adaptation by Pseudomonas aeruginosa to the airways of cystic fibrosis patients. Proc. Natl. Acad. Sci. U. S. A. 103:8487-8492.
36. Son, M. S., W. J. Matthews, Jr., Y. Kang, D. T. Nguyen, and T. T. Hoang. 2007. In vivo evidence of Pseudomonas aeruginosa nutrient acquisition and pathogenesis in the lungs of cystic fibrosis patients. Infect. Immun. 75:5313-5324.
37. Stover, C. K., X. Q. Pham, A. L. Erwin, S. D. Mizoguchi, P. Warrener, M. J. Hickey, F. S. Brinkman, W. O. Hufnagle, D. J. Kowalik, M. Lagrou, R. L. Garber, L. Goltry, E. Tolentino, S. Westbrock-Wadman, Y. Yuan, L. L. Brody, S. N. Coulter, K. R. Folger, A. Kas, K. Larbig, R. Lim, K. Smith, D. Spencer, G. K. Wong, Z. Wu, I. T. Paulsen, J. Reizer, M. H. Saier, R. E. Hancock, S. Lory, and M. V. Olson. 2000. Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature 406:959-964.
38. Teusink, B., A. Wiersma, D. Molenaar, C. Francke, W. M. de Vos, R. J. Siezen, and E. J. Smid. 2006. Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model. J. Biol. Chem. 281:40041-40048.
39. Tusher, V. G., R. Tibshirani, and G. Chu. 2001. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. U. S. A. 98:5116-5121.
40. Vidal, M. 2009. A unifying view of 21st century systems biology. FEBS Lett. 583:3891-3894.
41. Worlitzsch, D., R. Tarran, M. Ulrich, U. Schwab, A. Cekici, K. C. Meyer, P. Birrer, G. Bellon, J. Berger, T. Weiss, K. Botzenhart, J. R. Yankaskas, S. Randell, R. C. Boucher, and G. Doring. 2002. Effects of reduced mucus oxygen concentration in airway Pseudomonas infections of cystic fibrosis patients. J. Clin. Invest. 109:317-325.
42. Yauk, C. L., and M. L. Berndt. 2007. Review of the literature examining the correlation among DNA microarray technologies. Environ. Mol. Mutagen. 48:380-394.

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)