PLoS Comput Biol. Mar 2009; 5(3): e1000312.
Published online Mar 13, 2009. doi:  10.1371/journal.pcbi.1000312
PMCID: PMC2648898

Genome-Scale Reconstruction of Escherichia coli's Transcriptional and Translational Machinery: A Knowledge Base, Its Mathematical Formulation, and Its Functional Characterization

Christos A. Ouzounis, Editor

Abstract

Metabolic network reconstructions represent valuable scaffolds for ‘-omics’ data integration and are used to computationally interrogate network properties. However, they do not explicitly account for the synthesis of macromolecules (i.e., proteins and RNA). Here, we present the first genome-scale, fine-grained reconstruction of Escherichia coli's transcriptional and translational machinery, which produces 423 functional gene products in a sequence-specific manner and accounts for all necessary chemical transformations. Legacy data from over 500 publications and three databases were reviewed, and many pathways were considered, including stable RNA maturation and modification, protein complex formation, and iron–sulfur cluster biogenesis. This reconstruction represents the most comprehensive knowledge base for these important cellular functions in E. coli and is unique in its scope. Furthermore, it was converted into a mathematical model and used to: (1) quantitatively integrate gene expression data as reaction constraints and (2) compute functional network states, which were compared to reported experimental data. For example, the model accurately predicted ribosome production without any parameterization. Also, in silico rRNA operon deletion suggested that a high RNA polymerase density on the remaining rRNA operons is needed to reproduce the reported experimental ribosome numbers. Moreover, functional protein modules were determined, and many were found to contain gene products from multiple subsystems, highlighting the functional interaction of these proteins. This genome-scale reconstruction of E. coli's transcriptional and translational machinery represents a milestone in systems biology because it will enable quantitative integration of ‘-omics’ datasets and thus the study of the mechanistic principles underlying the genotype–phenotype relationship.

Author Summary

Systems biology aims to understand the interactions of cellular components in a systemic manner. Mathematical modeling is critical to the integration and analysis of these components on a conceptual as well as mechanistic level. To date, detailed genome-scale reconstructions of metabolism have become available for a growing number of organisms. Although metabolism has an important role in cells, other cellular functions need to be considered as well, such as signaling, regulation, and macromolecular synthesis. For instance, the cellular machinery required for RNA and protein synthesis consists of a complex set of proteins. Here, we show that one can collect all of the necessary information for a prokaryotic organism to create a gene-specific, fine-grained representation of the macromolecular synthesis machinery. E. coli was chosen as a model organism because of the wealth of available information. The explicit representation of transcription and translation in terms of a mass-balanced network enables a detailed, quantitative accounting of the protein synthesis capabilities of E. coli in silico. Hence, this study demonstrates the feasibility of constructing very large networks and also represents a critical step toward building cellular models of growth that can account for gene-specific protein production in a stoichiometric fashion on the genome scale.

Introduction

High-throughput experimental technologies enable the production of heterogeneous data, such as expression profiles and proteomic data, for almost any organism of interest. A detailed mathematical representation of the in vivo cellular network is required to obtain a holistic understanding of cellular processes from these data sets and to quantitatively integrate them into a biological context. One such approach is bottom-up network reconstruction, which manually builds networks in a brick-by-brick manner using genome annotation and component-specific information (e.g., biochemical characterization of enzymes) [1],[2]. This reconstruction procedure is well established for metabolic reaction networks and has been applied to many organisms, including human [3], Saccharomyces cerevisiae [4],[5], Leishmania major [6], Escherichia coli [7], Helicobacter pylori [8], Pseudomonas aeruginosa [9], and Pseudomonas putida [10],[11] (see http://systemsbiology.ucsd.edu/ for a continually updated table of metabolic reconstructions).

These bottom-up metabolic networks differ from other network reconstructions as they are tailored to the genomic content of the target organism and built manually using biochemical, physiological, and other experimental information in addition to the genome annotation. Hence, these reconstructions can be thought of as biochemically, genetically, and genomically structured (BiGG) knowledge bases [12]. The reconstruction and modeling procedure is a 4-step process: 1) obtaining a draft reaction list based on genome annotation and biochemical databases, 2) refining the reaction list using experimental information (e.g., from literature), 3) converting the reaction list (reconstruction) into a computable format and applying system boundaries to define condition-specific models, and 4) evaluating and validating the model content using various mathematical methods (see also [1],[2],[12],[13]). By iterating steps 2 to 4, reconstructions that are self-consistent within their defined scope can be generated.

Metabolic network reconstructions have proven useful in at least five application areas [2]: (i) biological discovery [14], (ii) phenotypic behavior [15], (iii) bacterial evolution [16], (iv) network analysis [17], and (v) metabolic engineering [18]. This wide range of applications is possible because metabolic reconstructions can be readily converted into predictive, condition-specific models. Unlike more traditional approaches to modeling metabolism, the constraint-based modeling approach (COBRA) requires few, if any, parameters [12],[19]. The stoichiometric information encoded in the reconstruction (i.e., reaction list) can be represented mathematically as a stoichiometric matrix, S, where the rows correspond to the components and the columns correspond to the reactions (Figure 1).
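As an illustration of the row/column convention just described, the following minimal sketch builds a stoichiometric matrix from a toy reaction list. The reaction and component names are hypothetical and are not taken from the reconstruction; only the layout (components as rows, reactions as columns) follows the text.

```python
# Toy reaction list (hypothetical names, not from the reconstruction).
# Negative coefficients are consumed, positive coefficients are produced.
reactions = {
    "R1": {"A": -1, "B": 1},     # A -> B
    "R2": {"B": -1, "C": 2},     # B -> 2 C
    "EX_A": {"A": 1},            # exchange reaction supplying A
}

def build_stoichiometric_matrix(reactions):
    """Return (components, reaction_ids, S), where S[i][j] is the
    coefficient of component i in reaction j."""
    reaction_ids = list(reactions)
    components = sorted({c for stoich in reactions.values() for c in stoich})
    S = [[reactions[r].get(c, 0) for r in reaction_ids] for c in components]
    return components, reaction_ids, S

components, reaction_ids, S = build_stoichiometric_matrix(reactions)
for name, row in zip(components, S):
    print(name, row)
```

Each row lists how one component is consumed or produced across all reactions, which is the form on which constraint-based methods operate.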

Figure 1
Overview of constraint-based reconstruction and analysis.

While the COBRA approach has been successfully applied to metabolic networks, the same principles and assumptions can also be employed to reconstruct and model other cellular functions, such as signaling [20]–[22], regulation [23], and protein synthesis [24]. In this study, we extended and refined earlier work by Allen et al., which proposed a stoichiometric formalism to model protein synthesis and illustrated it on some E. coli genes and operons [24]. We created a more detailed, gene-specific representation of the transcriptional and translational processes, which explicitly accounts for the sequence-specific synthesis of DNA, mRNA, and proteins. This reconstruction enables quantitative integration of high-throughput data such as gene expression, proteomic, and mRNA degradation data. Moreover, proteins are produced in high copy numbers in growing cells; thus, any quantitative mechanistic modeling and analysis of high-throughput data needs to account for the synthesis cost associated with these molecules.

Numerous studies have investigated protein synthesis using kinetic models [25]–[29]. These models are generally tailored to the questions they address, making it difficult to apply them readily to modified problems. Since stoichiometric relationships are a common requisite for any type of mechanistic modeling, organism-specific BiGG knowledge bases can be used as templates to derive problem-specific, mechanistic models (Figure 1). In fact, network stoichiometry is a dominant feature of kinetic models as well [30]. Thus, network reconstruction serves as a platform for steady-state and kinetic modeling (Figure 1).

In this study, we present a new generation of network reconstructions, which directly account for the synthesis of individual mRNA and proteins (Figure 2A). We named the mathematical representation of this reconstruction the Expression matrix, or ‘E-matrix’, since it encodes the expression of mRNA and proteins. All network reactions were formulated to account for gene-specific and E. coli-specific details, such as nucleotide composition, operon association, and sigma factor usage. Furthermore, we used information from three databases and more than 500 scientific publications to formulate mechanistically detailed and accurate reactions. This reconstruction is the first comprehensive database detailing the available information for these cellular functions and can thus be deemed a knowledge base. After conversion of the ‘E-matrix’ reconstruction into condition-specific models corresponding to different doubling times, we were able to accurately predict the ribosome production reported in the literature, without any parameterization. Furthermore, we show that the ‘E-matrix’ can be used to study the effect of rRNA operon deletion. Our results predict that a high density of RNA polymerases is required on the remaining rRNA operons to achieve the reported ribosome numbers. Finally, we show that proteins used in the ‘E-matrix’ can be grouped into functional modules, which leads to a simpler view of the network.

Figure 2
Content of the ‘E-matrix’.

Results/Discussion

The ‘central dogma’ of molecular biology was first enunciated by Crick in 1958 and dealt with the transfer of sequential information from DNA to RNA to proteins [31]. The machinery necessary to conduct this information transfer was reconstructed in this study on a genome scale, i.e., all known components in E. coli were considered. The ‘E-matrix’ encodes all known reactions that synthesize the components of the macromolecular synthesis machinery, in a mechanistically detailed fashion.

Reconstruction of the Networks and Formulation of the ‘E-Matrix’

Legacy data

The ‘E-matrix’ reconstruction was based on E. coli-specific information derived from more than 500 primary and review publications, three databases, and the revised genome annotation [32] (Figure 2B). This detailed information enabled the sequence-specific formulation of synthesis reactions, at high resolution, for every network component, namely DNA, mRNA, proteins, protein complexes, and metabolites. The reconstructed network accurately represents all known reactions required to produce the active, functional components of the transcriptional and translational machinery in E. coli (Figure 2A).

Reconstruction approach

The manual reconstruction of the ‘E-matrix’ was performed in an algorithmic manner by first identifying key components in the genome annotation (Tables S1, S15, S16, and S17). The functional roles of these key components were determined and then translated into stoichiometrically accurate reactions using multiple data sources (Figure 2B). A total of 303 components (proteins and RNA) were found to be directly involved in one or more subsystems, which represent groups of functionally related transformation pathways (Table 1 and Tables S2, S4, and S10). In this reconstruction linear transformation steps, e.g., elongation of nascent mRNA during transcription, were combined into a single reaction, while key reactions and known rate limiting steps were kept as separate reactions, e.g., transcription initiation and elongation. This representation captures key events in cellular processes and can be directly used to understand their reaction mechanisms at a high resolution.

Table 1
Reactions per subsystems.

A comprehensive, iterative quality control/quality assurance (QC/QA) procedure ensured that the resulting network had properties and capabilities similar to those of E. coli. This QC/QA procedure included gap analysis, testing for the production of every network component, and mass- and charge-balancing of more than 99% of the network reactions (Tables S7 and S8). Hence, the ‘E-matrix’ reconstruction follows the quality control standards developed for metabolic network reconstructions [1].
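The mass- and charge-balancing step of such a QC/QA procedure can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the parser only handles simple formulas without brackets, and ATP hydrolysis (with commonly used charged formulas near neutral pH) is chosen here purely as a test case.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count elements in a simple formula such as 'C10H12N5O13P3'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def imbalance(reaction, metabolites):
    """Return (element imbalance, net charge) for one reaction.
    A balanced reaction gives ({}, 0)."""
    elements = Counter()
    charge = 0
    for met, coeff in reaction.items():
        formula, met_charge = metabolites[met]
        for element, n in parse_formula(formula).items():
            elements[element] += coeff * n
        charge += coeff * met_charge
    return {e: n for e, n in elements.items() if n != 0}, charge

# Illustrative metabolite table: id -> (formula, charge).
metabolites = {
    "atp": ("C10H12N5O13P3", -4),
    "adp": ("C10H12N5O10P2", -3),
    "pi": ("HO4P", -2),
    "h2o": ("H2O", 0),
    "h": ("H", 1),
}
atp_hydrolysis = {"atp": -1, "h2o": -1, "adp": 1, "pi": 1, "h": 1}
missing_proton = {"atp": -1, "h2o": -1, "adp": 1, "pi": 1}

print(imbalance(atp_hydrolysis, metabolites))
print(imbalance(missing_proton, metabolites))
```

Running the check over every reaction and flagging any non-empty result is one straightforward way to find reactions that still need curation.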

Unique properties of the ‘E-matrix’

This reconstruction is unique in the depth and breadth of information included and represents an advance over other transcriptional and translational networks currently available [25]–[29]. It is also the largest reconstructed network to date, with 11,991 components and 13,694 reactions (Table 2 and Tables S12 and S13). The ‘E-matrix’ accounts for all known gene products necessary to produce the active components of the machinery itself, and is therefore self-contained. Furthermore, sequence-dependent synthesis reactions were carefully formulated to incorporate known reaction stoichiometry including protein-substrate complex intermediates, metallo-ions and cofactors. Two recently published large-scale datasets [33],[34] were used to assign the folding pathway to the individual polypeptides (Tables S5 and S6). Necessary modifications of stable RNA and proteins were also considered (Tables S16 and S17). Additionally, the transcription reactions were formulated in terms of transcription units rather than genes (Table S9), providing a biologically accurate representation of operon organization in bacterial genomes. These reactions can be readily extended to account for the production of other gene products such as metabolic enzymes or transcription factors. Lastly, this framework facilitates future integration of the ‘E-matrix’ reconstruction with the metabolic and regulatory network of E. coli.

Table 2
Overview of the ‘E-matrix’ content.

‘E-matrix’ versus available databases

The ‘E-matrix’ is distinguished from available online databases, such as KEGG [35] and EcoCyc [36], as all transcriptional, translational, and modification reactions were defined in a sequence-dependent manner for every included E. coli gene. This task was achieved by determining the nucleotide composition of each DNA and RNA and the amino acid composition of each protein from the genome sequence. Furthermore, we determined the elemental composition of these macromolecules and mass balanced all network reactions. In contrast, KEGG [35] and EcoCyc [36] list mainly generic reactions using gene- and organism-independent terms such as ‘DNA’, ‘protein’, and ‘RNA’. Consequently, they contain only a subset of the synthesis reactions present in the ‘E-matrix’. Furthermore, neither of these databases can be directly converted into a comprehensive, self-consistent mathematical format that permits rigorous computational characterization of network fluxes. Another difference between the ‘E-matrix’ and these databases is the extent of mechanistic detail incorporated into the ‘E-matrix’, such as rRNA and tRNA modification reactions, iron–sulfur cluster formation, chaperone-dependent protein folding and protein complex formation.
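To make the idea of a sequence-dependent reaction concrete, the sketch below derives a transcription stoichiometry from the nucleotide composition of a short coding-strand sequence. It is a deliberately simplified assumption, not the reconstruction's actual formulation: it ignores RNA polymerase, sigma factors, and initiation/elongation intermediates, and it releases one diphosphate per phosphodiester bond (n - 1 for a transcript of length n). The function name and identifiers are hypothetical.

```python
from collections import Counter

def transcription_stoichiometry(coding_strand, rna_id):
    """Build a simplified, sequence-specific transcription reaction.

    Assumption: each incorporated nucleotide comes from the matching NTP,
    and n - 1 diphosphates are released for a transcript of length n.
    """
    rna = coding_strand.upper().replace("T", "U")
    counts = Counter(rna)
    reaction = {
        f"{base.lower()}tp": -counts[base] for base in "ACGU" if counts[base]
    }
    reaction["ppi"] = len(rna) - 1
    reaction[rna_id] = 1
    return reaction

# Hypothetical six-nucleotide example sequence.
print(transcription_stoichiometry("ATGGCT", "mRNA_toy"))
```

Because the coefficients come directly from the sequence, two genes of different length or composition yield different reactions, which is what distinguishes this style from a generic ‘RNA’ synthesis reaction.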

Knowledge gaps

The transcriptional and translational machinery is essential for cellular growth. Considering the wealth of information available for E. coli, it was surprising to discover numerous knowledge gaps, or missing information, during the reconstruction process. For example, reaction mechanisms for some RNA modifications and iron–sulfur cluster biogenesis were either poorly understood or a general consensus on the mechanistic details was lacking. For instance, 15% of the included proteins had no gene annotation and their existence was suggested in the literature solely based on identification of modified proteins or stable RNA (Table S3). Furthermore, there are three metabolites with unknown metabolic transformations. One of these metabolites is preQ0, a precursor of preQ1, which is important for queuosine formation in some tRNA (position G34). This precursor is formed from GTP, and it has been suggested that two ribose units of two GTP molecules contribute to the formation of three carbons in preQ0 (C5, C6, and the cyano carbon), but further information is missing [37],[38]. The two other missing metabolites are byproducts of the formation of uridine-5-oxyacetic acid at position U34 in some tRNA. It has been suggested that chorismate acts as a precursor for this nucleotide modification; however, such a reaction would release two metabolites with formulae of C10H8O5 and C9H9O4, which have not been characterized yet [37],[38]. All of the knowledge gaps were highlighted in the reconstruction and associated with notes about currently available information (Tables S15, S16, and S17), which will hopefully promote their elucidation, as has been the case for some of the metabolic knowledge gaps in E. coli [14].

Network topology

The ‘E-matrix’ has a relatively ‘linear structure’ with only few components participating in multiple reactions since a majority of network components are only transferred from one reaction to another (Text S1, Figure D). This linearity is a dominant feature of the ‘E-matrix’ and it is less evident in metabolic reconstructions due to their much higher connectivity. Analysis of the component connectivity of the ‘E-matrix’ showed that the highest connected components are protons, water, and orthophosphate, which participate in 44%, 39%, and 32% of reactions, respectively. These compounds are also found to have the highest connectivity in metabolic networks [39]. In contrast to metabolic networks, ATP and ADP were not the next most highly connected but rather GTP and GDP, which participated in the numerous translational reactions. While the ATP requirement for cellular functions is accounted for in the biomass reaction of metabolic reconstructions, the high GTP requirement is not generally considered [7].
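The connectivity measure used here, the share of reactions in which a component takes part, can be computed directly from a reaction list. The sketch below uses a four-reaction toy network with hypothetical identifiers; the percentages it produces have no relation to the values reported for the ‘E-matrix’.

```python
def component_connectivity(reactions):
    """Fraction of reactions in which each component participates."""
    total = len(reactions)
    counts = {}
    for stoich in reactions.values():
        for component in stoich:
            counts[component] = counts.get(component, 0) + 1
    return {c: n / total for c, n in counts.items()}

# Toy network (hypothetical reactions and coefficients).
toy = {
    "r1": {"gtp": -1, "h2o": -1, "gdp": 1, "pi": 1, "h": 1},
    "r2": {"atp": -1, "h2o": -1, "adp": 1, "pi": 1, "h": 1},
    "r3": {"gtp": -1, "x": -1, "y": 1, "h": 1},
    "r4": {"y": -1, "z": 1},
}
connectivity = component_connectivity(toy)
for name, share in sorted(connectivity.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{name}: {share:.0%}")
```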

Determining Network Capabilities

The conversion of a network reconstruction into a mathematical model can be achieved, analogously to metabolic networks [1], by defining system boundaries and applying condition-dependent constraints on exchange and intracellular reactions (Figure 1) [1],[40]. Therefore, experimental data can be used to constrain the set of feasible network fluxes in a physiologically relevant manner. In the following section, we will illustrate the use of condition-specific models that were derived from the ‘E-matrix’ reconstruction.
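For orientation, the steady-state optimization problem that constraint-based models of this kind typically solve can be written as a linear program. This is the standard COBRA formulation rather than an equation taken from this article; S is the stoichiometric matrix introduced above, while the flux vector v, the objective weights c, and the bounds v_min and v_max are symbols introduced here for illustration.

```latex
\begin{aligned}
\max_{v} \quad & c^{\mathsf{T}} v \\
\text{subject to} \quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

Condition-specific data enter through v_min and v_max (for example, uptake rates or transcription initiation rates), and c selects the reaction to be optimized, such as ribosome production.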

Validation of the ‘E-matrix’ functionality—ribosome production

Cell growth is directly correlated with the protein synthesis capacity and thus with the number of active ribosomes [41]. Accordingly, we used the model's ribosome production capability as an indicator of its ability to support growth. For every growth rate, the uptake rates for NTP and amino acids as well as the transcription initiation rates of the rRNA operons were quantitatively constrained based on experimental data [42]. The in silico computed ribosome production capabilities showed very good agreement with the reported in vivo ribosome production capabilities [42] for all investigated doubling times (Figure 3), indicating that the capabilities of the reconstruction were very similar to those of an E. coli cell. This overlap between experimental data and predictions was somewhat expected as the constraints used, i.e., stable RNA transcription initiation rates as upper constraints for the rRNA operons (see Materials and Methods), were dominant (governing) constraints. Thus, these results validated the predictive capability of the reconstructed network. Moreover, our results show that: (i) the network is capable of reproducing experimentally reported ribosome numbers given the uptake constraints, and (ii) an increase in transcription initiation rate would lead to an increase in ribosome production (see also Figure 4B). This latter result implies that the regulation of rRNA synthesis, which is outside the scope of this reconstruction, plays a significant role in determining the transcription rate [43],[44].

Figure 3
Comparison of in vivo [42] and in silico maximal number of ribosomes at different doubling times.
Figure 4
rRNA operon deletion study.

The effect of in silico rRNA operon deletions on ribosome production

The E. coli genome contains seven rRNA operons, which have similar structures (16S rRNA, tRNA, 23S rRNA, tRNA, 5S rRNA, and, in some cases, tRNA). Generally, it is assumed that rRNA operon redundancy in E. coli and other species has evolved to provide high levels of ribosomes and thus to support rapid growth rates [45]. However, there is experimental evidence that rRNA operon multiplicity is instead required for rapid adaptation to changes in physiological conditions [46],[47]. In fact, it has been shown that the presence of only one rRNA operon on the chromosome is sufficient for synthesis of 56% of the wild-type rRNA concentration [48] and that the deletion of multiple rRNA operons had only a small effect on growth rate and ribosome content [46],[48],[49]. Subsequently, it was experimentally observed that the remaining rRNA operons were able to compensate for the loss by increasing the transcriptional rate [46].

Since the early days of the development and application of COBRA methods, in silico gene deletion analysis has been productively used to evaluate the consequences of gene deletions for metabolism and cellular growth [8],[50]–[52]. Here, we used the same approach to evaluate the consequences of rRNA operon multiplicity for the ribosome production capabilities of the ‘E-matrix’ by in silico operon deletion analysis. First, we set the stable RNA transcription initiation rates based on doubling time as reported in Neidhardt et al. [42], and optimized for ribosome production using linear programming. Subsequently, we created single and multiple in silico knockout mutants by deleting the rRNA operons and optimized again for ribosome production (Figure 4). Since the maximal possible rRNA transcription rates were set to the reported rates, we observed a linear decrease in ribosome production for all tested doubling times (Figure 4). This result was expected as the stable RNA transcription initiation rates were found to be the governing constraints (see above). Therefore, this simulation setup did not allow for the compensation of rRNA operon loss.
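The linear decrease can be reproduced with a toy calculation. The sketch assumes that the per-operon initiation bounds are the only binding constraints and that every transcript yields one set of rRNAs; under those assumptions the linear program's optimum is simply the sum of the bounds of the intact operons, so no solver is called. The operon names are the seven E. coli rrn operons, but the bound values (1.0 in arbitrary units) are hypothetical.

```python
OPERONS = ["rrnA", "rrnB", "rrnC", "rrnD", "rrnE", "rrnG", "rrnH"]

def max_ribosome_production(initiation_bound, deleted=()):
    """Upper limit on ribosome production when rRNA transcription
    initiation is the governing constraint (toy closed form)."""
    intact = [op for op in OPERONS if op not in deleted]
    return sum(initiation_bound[op] for op in intact)

# Hypothetical, equal upper bounds in arbitrary units.
bounds = {op: 1.0 for op in OPERONS}

production = [
    max_ribosome_production(bounds, deleted=OPERONS[:k]) for k in range(8)
]
print(production)
```

In the full model the bounds differ between operons and other constraints can become limiting, so the real curves need not be this regular.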

To simulate this compensation, we multiplied the transcription initiation rate of each rRNA operon by various scaling factors and re-computed the maximal possible ribosome production rate (see Figure 4 and Materials and Methods). Comparison with experimental data [46],[48] showed that similar compensation could be obtained in silico by using a transcriptional compensation factor. The compensation factor had to be increased in silico when multiple rRNA operons were deleted. To compare the calculated compensation factor with experimental data, we converted the measured number of RNA polymerases (RNAP) per operon in rRNA operon deficient strains [46] into compensation factors by dividing them by the reported RNAP binding frequency in the wild-type [53]. These experimental compensation factors were in good agreement with our in silico results (data not shown). Surprisingly, it was found experimentally that strains with only one intact rRNA operon can still produce 56% of wild-type rRNA [48]. This situation would correspond to an in silico compensation factor of 4 and thus to approximately 150 RNAP bound to the remaining rRNA operon. Since the average length of an rRNA operon is 5100 nucleotides, this high number of bound RNAP corresponds to one RNAP every 34 nucleotides. Such an increase in RNAP density on the operon could be achieved by increasing the transcription elongation rate and/or modulating the frequency of RNAP binding to the promoter [46]. It is not known which regulatory elements could lead to such an increase in rRNA transcription; however, Condon et al. found the ppGpp concentration, responsible for the stringent response under amino acid starvation, unaltered [46]. Gaal et al. showed that rRNA synthesis is regulated by NTPs, which stabilize the open complex of RNAP and the P1 promoter of an rRNA operon. The formation of the open complex is necessary for successful transcription initiation [54]. Feedback inhibition also controls rRNA synthesis, where an excess of ribosomes might regulate the transcriptional rate [43]. In agreement with our predictions, experimental data have shown an increase in ribosomal content for some rRNA deficient strains (Figure 4) [46]. Furthermore, different rRNA operon knockout combinations resulted in large differences in compensation due to different gene dosage depending on the positions of the various operons on the chromosome (Figure 4 and Table 3). We did not determine the growth rates of the knockout strains, as such a calculation would require assuming the same correlation between doubling time and ribosome production as is present in wild-type E. coli (Figure 2). Our results suggest that the transcriptional initiation rate, and thus ribosome production rate, will be limited by competition for precursors, especially NTPs (data not shown). This agrees with the experimental observation that an increase in rRNA operon number will reduce the overall transcription initiation rate and thus maintain a constant rRNA content in the cell [55]. However, many complex regulatory mechanisms, which are outside the scope of the current model, are known to control ribosome production [43],[54]. The incorporation of regulation into the current model should lend further insight into the nature of rRNA operon multiplicity.
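The arithmetic behind the RNAP density estimate can be checked in a few lines. The numbers 5100 nucleotides, 150 RNAP, a compensation factor of 4, and seven operons come from the text; the second function additionally assumes that all operons contribute equally in the wild type, which is a simplification (the text notes that gene dosage differs between operons). Function names are hypothetical.

```python
def rnap_spacing(operon_length_nt, rnap_count):
    """Average number of nucleotides per bound RNAP."""
    return operon_length_nt / rnap_count

def relative_rrna_output(intact_operons, compensation, total_operons=7):
    """rRNA output relative to wild type, assuming equal operons."""
    return intact_operons * compensation / total_operons

spacing = rnap_spacing(5100, 150)
fraction = relative_rrna_output(intact_operons=1, compensation=4)
print(spacing)              # nucleotides per RNAP
print(round(fraction, 2))   # fraction of wild-type rRNA output
```

Under the equal-operon assumption, one operon running at four times its normal initiation rate gives 4/7 of the wild-type output, close to the 56% reported experimentally.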

Table 3
List of rRNA transcription units and their basic characteristics.

Integration of ‘-omics’ data into ‘E-matrix’

An overall aim of this reconstruction effort was to create a stoichiometric representation of the mRNA and protein synthesis machinery that allows integration with experimental data. Interrogation of the data-constrained model would allow the investigation of the remaining network capabilities (Figure 5A). Here, we successively incorporated experimental data sets into the model as constraints, and investigated the resulting network capabilities. More specifically, we used the difference between the minimal and maximal flux rate for each reaction (flux span) as a measure of constraint stringency.
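The flux span itself is a simple quantity once the minimal and maximal flux of each reaction are known (in practice these come from minimizing and maximizing each reaction in turn under the model constraints). The sketch below only illustrates the bookkeeping; the reaction names and all flux values are hypothetical.

```python
def flux_spans(flux_ranges):
    """flux_ranges maps reaction id -> (min_flux, max_flux)."""
    return {rxn: hi - lo for rxn, (lo, hi) in flux_ranges.items()}

def total_span(flux_ranges):
    """Sum of flux spans, used here as a crude stringency measure."""
    return sum(flux_spans(flux_ranges).values())

# Hypothetical ranges before and after adding a set of constraints.
medium_only = {"tx_geneA": (0, 100), "tl_geneA": (0, 400)}
with_expression_data = {"tx_geneA": (0, 10), "tl_geneA": (0, 250)}

print(total_span(medium_only), total_span(with_expression_data))
```

A smaller total span after adding a data set indicates that the new constraints removed part of the previously feasible flux space.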

Figure 5
Integration of ‘-omics’ data into ‘E-matrix’ as reaction constraints.

We successively integrated three different datasets (Figure 5):

  • First, we constrained the upper bounds of exchange reactions in the ‘E-matrix’ to uptake rates corresponding to LB-medium conditions (Figure 5B). This set of constraints was not sufficient to eliminate biologically irrelevant solutions since, for instance, the model was able to produce up to 45,000 ribosomes while approximately 30,000 ribosomes were observed experimentally [42].
  • Second, further constraints were applied on the stable RNA transcription initiation rates based on low-throughput data [42] to exclude physiologically infeasible stable RNA transcription rates (Figure 5C). However, the maximal flux rates for synthesis reactions of most network mRNAs were still found to be too high when compared to expression data [56].
  • Finally, we used high-throughput data, namely gene expression data from LB medium [56] and mRNA half-lives [56], to further constrain the network. Numerical values for the mRNA degradation rate, specific to each mRNA sequence, were calculated based on these two data sets and applied as upper bounds on the mRNA degradation reactions in the network. This last set of constraints had a significant effect on the overall flux span, which highlights the importance of mRNA transcription constraints on the set of feasible solutions (Figure 5D).
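One plausible way to turn an expression level and a half-life into a degradation-rate bound, as in the last step above, is to assume first-order decay. This is an assumption made for illustration; the exact formula used in the study is not given in this section. Under first-order decay the rate constant is ln(2) divided by the half-life, and the degradation flux is that constant times the mRNA abundance. The function name, units, and input values are hypothetical.

```python
import math

def degradation_upper_bound(copies_per_cell, half_life_min):
    """Degradation flux (copies per cell per minute), assuming
    first-order decay with rate constant ln(2) / half-life."""
    rate_constant = math.log(2) / half_life_min
    return rate_constant * copies_per_cell

# Hypothetical transcript: 10 copies per cell, 5 minute half-life.
bound = degradation_upper_bound(copies_per_cell=10, half_life_min=5)
print(round(bound, 3))
```

At steady state the synthesis of an mRNA must match its degradation, which is why a bound on the degradation reaction also limits the corresponding transcription flux.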

A qualitative evaluation of mRNA expression in Boolean terms (on/off)—as used in metabolic modeling [52]—did not result in significant reduction of the size of the solution space (data not shown). Despite the mRNA degradation reaction constraints, many protein synthesis reactions still achieved high flux values. This result is consistent with the fact that low numbers of transcripts can be sufficient to synthesize high numbers of proteins and hence, the translation reactions can carry large flux rates. Thus, the application of quantitatively accurate proteomic data could greatly help to further constrain the set of feasible steady-state solutions.

Defining functional modules

Correlated reaction sets (co-sets) have been calculated for metabolic networks to obtain insight into the network structure and properties [15],[57]. Here, we applied the same concept to the ‘E-matrix’ to identify functional coupling between proteins. In the reconstruction, every protein is associated with a recycling reaction representing its overall utilization rate in the cell. It can be expected that proteins whose utilization rates are perfectly correlated based on stoichiometry would show similar patterns of protein expression, but not necessarily of gene expression, under different environmental conditions. A total of 14 multi-protein modules (or co-sets) were identified, accounting for 91 out of 153 proteins or protein complexes (Table S14). Interestingly, many modules contained proteins from different subsystems, which were assigned based on classical pathway designation (Figure 6). Hence, our calculations suggest that some canonical pathway assignments may not necessarily represent the functional relationships between the proteins in the cell (Figure 6). Furthermore, no direct correlation between the calculated functional modules and protein-protein interaction data [58],[59] could be observed (data not shown). In contrast, the stoichiometrically coupled changes of translation initiation factor 1 (IF-1) and ribosomes [60] observed experimentally suggest that our calculated functional modules are biologically relevant. As more accurate quantitative proteomic data become available, the functional modules reported herein should be useful in interpreting these data and may help resolve missing gene annotations.
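The notion of perfectly correlated utilization rates can be illustrated with a small grouping routine. This is a simplified stand-in, not the published co-set algorithm: it takes a handful of hypothetical flux vectors (one value per feasible state) and groups reactions whose fluxes are always in a fixed, non-zero ratio. All reaction names and numbers are invented for the example.

```python
def proportional(u, v, tol=1e-9):
    """True if v = k * u for one non-zero constant k across all states."""
    ratio = None
    for a, b in zip(u, v):
        if abs(a) < tol and abs(b) < tol:
            continue                      # both inactive in this state
        if abs(a) < tol or abs(b) < tol:
            return False                  # only one of them is active
        if ratio is None:
            ratio = b / a
        elif abs(b / a - ratio) > tol:
            return False
    return ratio is not None

def correlated_sets(samples):
    """Group reactions with proportional fluxes; keep groups of size > 1."""
    groups = []
    for rxn, fluxes in samples.items():
        for group in groups:
            if proportional(samples[group[0]], fluxes):
                group.append(rxn)
                break
        else:
            groups.append([rxn])
    return [g for g in groups if len(g) > 1]

# Hypothetical utilization fluxes in three feasible network states.
samples = {
    "use_IF1": [1.0, 2.0, 3.0],
    "use_ribosome": [2.0, 4.0, 6.0],
    "use_chaperone": [1.0, 1.0, 5.0],
}
modules = correlated_sets(samples)
print(modules)
```

Reactions that end up in the same group would be candidates for a functional module in the sense used above, regardless of which canonical subsystem their proteins belong to.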

Figure 6
Schematic representation of the calculated functional modules, the associated proteins, and their canonical assignments.

Integration with other cellular functions

The scope of the ‘E-matrix’ was limited to the reactions required for synthesis of E. coli's transcriptional and translational machinery, which can account for 50% of the dry weight in fast growing cells [53]. Consequently, the synthesis and maintenance of this machinery place significant material and energy demands on metabolism for biosynthetic precursors. In the ‘E-matrix’, these precursors are provided via exchange reactions. As a next step, one could imagine replacing these exchange reactions with the stoichiometric matrix for the metabolic network of E. coli [7] (‘M-matrix’, Figure 5A). This integration would allow the direct assessment of the metabolic demand that the transcriptional and translational machinery imposes on a cell. Moreover, integration of the transcriptional regulation of individual operons would enable a more accurate determination of the genotype–phenotype relationship (‘O-matrix’, Figure 5A). Thus, the genome-scale integrated network, or ‘OME-matrix’, would account for three major cellular processes and may capture more than 2,000 of E. coli's genes. Recently, two studies proposed approaches to integrate different cellular processes [61],[62], but no genome-scale representation is available yet.

Conclusion

In this study, we present the first mechanistically and chemically detailed, genome-scale network reconstruction of the transcriptional and translational machinery of E. coli. Biochemical components, reaction formulation, and quality control measures analogous to metabolic network reconstructions were used to incorporate bibliomic data from the last 50 years into one reconstruction (Figure 2). The corresponding knowledge base can be queried online (http://bigg.ucsd.edu/E-matrix). This stoichiometric reconstruction represents a first step towards modeling this complex cellular function, and will require iterative refinement as new data become available. By describing the stoichiometric relationships between the components involved in transcription and translation, this reconstruction enables the quantitative integration of disparate ‘-omics’ data into a computational model (Figure 5). We demonstrated that low- and high-throughput data can be readily integrated and used as constraints on model reactions, and that the subsequent reduction of the feasible set of reaction fluxes results in physiologically relevant predictions (Figure 5B–D). Furthermore, we showed that the computational model can be used to accurately predict ribosome production under different growth conditions (Figure 3). The deletion of single or multiple rRNA operons from the ‘E-matrix’ predicted that a high density of RNA polymerases is required on the remaining rRNA operons to achieve the reported ribosome numbers (Figure 4B). Computational analysis of the ‘E-matrix’ can provide further insight into the topologically local and global relationship between proteins in terms of functional modules (Figure 6).

This ‘E-matrix’ reconstruction ushers in a new generation of cellular network models that account quantitatively for mRNA and proteins. The ‘E-matrix’ offers the potential to (i) serve as a platform for integrated, numerical analysis of heterogeneous, quantitative high-throughput datasets; (ii) increase our understanding of the relationship between mRNA and protein abundance; (iii) be integrated with metabolism by extending the transcriptional and translational reactions to metabolic genes; (iv) be integrated with regulatory events by formulating regulatory rules for the genes of the ‘E-matrix’ and extending the transcriptional and translational reactions to transcription factors; and (v) enable computation of the material and energetic cost of macromolecular synthesis. These capabilities are important milestones in moving towards a more comprehensive genome-scale in silico model of all cellular processes in E. coli. Furthermore, the underlying reconstruction methodology can be readily extended and applied to other prokaryotes. Such extension could lead to further insight into conserved and unique features of the transcriptional and translational machinery of prokaryotes.

The history of E. coli metabolic reconstructions now spans more than 17 years, with numerous iterative reconstruction refinements and applications surpassing initial expectations [63]. The reconstruction of the transcriptional and translational machinery of E. coli, and of other prokaryotes, will have the same impact on systems biology, especially when integrated with metabolism, regulation, and condition-specific high-throughput data sets (Figure 5A). This work hence represents a crucial step towards the important and ambitious goal of whole-cell modeling [64].

Materials and Methods

Reconstruction Procedure

The reconstruction of the transcriptional and translational machinery of E. coli was approached by first identifying the main components from the genome annotation [32], E. coli-specific primary and review literature, and multiple databases (Figure 2B). For each of these components, the gene ID (b-number), gene position, necessary metal ions and cofactors, and protein stoichiometry were extracted. The synthesis reactions for every network component were created using template reactions, which was possible because reaction mechanisms are similar for all network components (see Text S1 for examples). These template reactions were carefully formulated and derived from primary and review literature (Tables S15, S16, S17). The template-based network reconstruction was performed using the scripting language Perl (http://www.perl.com/). Each template reaction, as well as each protein complex formation reaction, was generated manually based on legacy data (Tables S15, S16, S17, and S18). Every network reaction was mass- and charge-balanced assuming a physiological pH of 7.2 [1].
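The mass and charge balancing mentioned above can be illustrated with a minimal Python sketch. It is not the authors' Perl implementation. The component identifiers, the `COMPONENTS` table, and the helper names are hypothetical, and ATP hydrolysis is used only as a familiar test reaction with formulas and charges written for roughly physiological pH.

```python
import re
from collections import Counter

# Hypothetical component table: (elemental formula, charge near physiological pH).
COMPONENTS = {
    "atp": ("C10H12N5O13P3", -4),
    "h2o": ("H2O", 0),
    "adp": ("C10H12N5O10P2", -3),
    "pi": ("HO4P", -2),
    "h": ("H", 1),
}


def parse_formula(formula):
    """Turn a formula such as 'C10H12N5O13P3' into a Counter of element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(reaction):
    """Return (unbalanced elements, net charge) for a reaction.

    `reaction` maps component IDs to stoichiometric coefficients,
    negative for substrates and positive for products.
    """
    elements = Counter()
    charge = 0
    for component, coefficient in reaction.items():
        formula, component_charge = COMPONENTS[component]
        for element, count in parse_formula(formula).items():
            elements[element] += coefficient * count
        charge += coefficient * component_charge
    unbalanced = {e: n for e, n in elements.items() if n != 0}
    return unbalanced, charge


atp_hydrolysis = {"atp": -1, "h2o": -1, "adp": 1, "pi": 1, "h": 1}
print(imbalance(atp_hydrolysis))  # ({}, 0): balanced in mass and charge
```

An empty dictionary together with a net charge of zero indicates a balanced reaction. Dropping the proton from the product side would be reported as one missing hydrogen and one missing unit of charge.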

The basis for the reconstruction was the genome sequence, m56 [65], the most current gene coordinates [32], and the transcription unit definitions provided by EcoCyc (version 10.6, [36]). This information was also used to (i) calculate the formula and charge for each mRNA and protein species; (ii) individually adjust the template reactions for, e.g., NTP requirement; and (iii) transcribe operons rather than genes. A complete list of all transcription units can be found in Table S9. The genetic code used for this reconstruction is listed in Table S11. Network gap analysis was performed after the initial reaction list was obtained. Multiple iterations of content refinement and evaluation ensured completeness of the network within its scope by including missing components and reactions (Text S1, Figure A–c). One network gap remained: RNase PH, which is annotated as a pseudogene in Riley et al. [32].
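How a template reaction might be adjusted for the sequence-specific NTP requirement can be sketched as follows. This is a simplified illustration in Python, not the authors' Perl templates (Tables S15, S16, S17). The identifiers (`atp`, `ppi`, `mrna_toy`) and the toy sequence are hypothetical, and the sketch assumes one NTP consumed per nucleotide and one diphosphate released per phosphodiester bond formed.

```python
from collections import Counter

# Hypothetical identifiers for the four NTP precursors.
NTP_ID = {"A": "atp", "C": "ctp", "G": "gtp", "U": "utp"}


def transcription_reaction(transcript_id, dna_coding_strand):
    """Build a sequence-specific stoichiometry for one transcription unit.

    Simplifying assumption: a transcript of length L consumes L NTPs and
    releases L - 1 diphosphates (ppi), one per phosphodiester bond.
    Substrates carry negative coefficients, products positive ones.
    """
    rna = dna_coding_strand.upper().replace("T", "U")
    if not rna:
        raise ValueError("empty sequence")
    counts = Counter(rna)
    unknown = set(counts) - set(NTP_ID)
    if unknown:
        raise ValueError(f"unexpected symbols in sequence: {sorted(unknown)}")
    reaction = {NTP_ID[base]: -n for base, n in counts.items()}
    reaction[transcript_id] = 1
    reaction["ppi"] = len(rna) - 1
    return reaction


print(transcription_reaction("mrna_toy", "ATGGCATTGA"))
```

For the ten-nucleotide toy sequence, the reaction consumes three ATP, three UTP, three GTP, and one CTP and releases nine diphosphates. Applying the same function to a whole transcription unit rather than to a single gene corresponds to point (iii) above.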

The system boundaries of the ‘E-matrix’ were defined by adding 76 exchange reactions for amino acids, NTPs, and other metabolic components. Furthermore, demand reactions were added for each protein gene product (Tables S9 and S12). The ‘E-matrix’ model is available in Matlab format (Dataset S1).

Constraint-Based Modeling

The mathematical model of the ‘E-matrix’ was represented by a stoichiometric matrix, $S$ ($m$ rows × $n$ columns), where $m$ is the number of components and $n$ is the number of reactions [1]. Reactions within the network were mass-balanced and assumed to be at steady state such that $S \cdot v = 0$, where $v$ is the flux vector. Additional constraints on upper, $v_{i,\max}$, and lower, $v_{i,\min}$, bounds were applied in the form of $v_{i,\min} \le v_i \le v_{i,\max}$ on each reaction $i$. The lower limits were set to zero for irreversible reactions. The unit for each reaction flux was defined to be [equation pcbi.1000312.e006], where the doubling time ($T_D$) is given in minutes, if not stated differently.
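The steady-state condition and the flux bounds described above can be made concrete with a small check in plain Python. The three-reaction network, its bounds, and the function name `is_feasible` are hypothetical and serve only to show what it means for a flux vector to lie in the feasible set.

```python
# Toy network (hypothetical, not taken from the E-matrix):
#   EX_a: -> A      R1: A -> B      DM_b: B ->
S = [
    [1, -1, 0],   # component A
    [0, 1, -1],   # component B
]
lower = [0.0, 0.0, 0.0]      # irreversible reactions: lower bound of zero
upper = [10.0, 10.0, 10.0]   # hypothetical upper bounds


def is_feasible(S, v, lower, upper, tol=1e-9):
    """Check that S.v = 0 and lower_i <= v_i <= upper_i for every reaction i."""
    balanced = all(
        abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) <= tol for row in S
    )
    within_bounds = all(
        lo - tol <= v_j <= hi + tol for lo, v_j, hi in zip(lower, v, upper)
    )
    return balanced and within_bounds


print(is_feasible(S, [2.0, 2.0, 2.0], lower, upper))  # True
print(is_feasible(S, [2.0, 1.0, 1.0], lower, upper))  # False: component A accumulates
```

Every flux vector that passes this check is a candidate network state. Constraint-based analyses then search this set, for example by linear programming.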

Simulation Constraints

The upper bounds on exchange reactions for NTPs and amino acids were constrained for all simulation conditions, while the lower bounds remained unconstrained. The fractional contributions of NTPs and amino acids were calculated based on experimental data [53] and scaled by the RNA and protein content found at each doubling time (Text S1).

The upper bounds of stable RNA transcription initiation reactions were constrained based on experimental data [42] using the following formula: [equation pcbi.1000312.e008], where [symbol pcbi.1000312.e009] is the rRNA transcription initiation rate, [symbol pcbi.1000312.e010] is the copy number of the stable RNA gene $i$ per cell due to gene dosage (Table 3), and $T_D$ is the doubling time (see Text S1).

The mRNA degradation rates were calculated using expression data in LB medium and mRNA half-life times [56] with [equation pcbi.1000312.e011], where [symbol pcbi.1000312.e012] is the concentration of mRNA $i$ in the cell, [symbol pcbi.1000312.e013] is the half-life time of mRNA $i$ in LB medium, and [symbol pcbi.1000312.e014] is the half-life time of mRNA $i$ in M9 medium + glucose (refer to Text S1 for the detailed calculation). A total number of 4,600 mRNAs per cell at a 30 min doubling time was assumed [42]. The lower bound ($v_{i,\min}$) was set to 0. Since the expression data as well as the total mRNA number carry experimental errors, the upper bound on each reaction flux had to be relaxed by multiplying each mRNA concentration by a factor of 10.

The upper bounds on the mRNA recycling, or CONV2, reactions were constrained using the following formula: [equation pcbi.1000312.e016], where [symbol pcbi.1000312.e017] is the doubling time (s), [symbol pcbi.1000312.e018] is the length of mRNA $i$, and [symbol pcbi.1000312.e019] is the translation elongation rate at $T_D$. This latter set of reactions accounts for multiple translation rounds of an mRNA transcript between synthesis and degradation.

Ribosome Production Rate

The exchange flux rates and the transcription initiation rates of ribosomal RNA operons were constrained as described above. At each doubling time, the ribosome production rate (DM_rib_50) was chosen as the objective function, and the maximal possible production rate under the given set of constraints was calculated using linear programming.
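Maximizing a demand reaction under stoichiometric and bound constraints can be sketched with SciPy's `linprog`, used here in place of the MATLAB/TomLab setup of the original study. The six-reaction network, its coefficients, and its bounds are hypothetical. Only the structure of the problem (maximize one flux subject to $S \cdot v = 0$ and bounds) follows the description above.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical, not the E-matrix):
#   EX_ntp:   -> ntp             (exchange)
#   EX_aa:    -> aa              (exchange)
#   RRNA:     2 ntp -> rrna      (stable RNA synthesis)
#   RPROT:    3 aa -> rprot      (ribosomal protein synthesis)
#   ASSEMBLY: rrna + rprot -> rib
#   DM_rib:   rib ->             (demand reaction, used as objective)
reactions = ["EX_ntp", "EX_aa", "RRNA", "RPROT", "ASSEMBLY", "DM_rib"]
S = np.array([
    [1, 0, -2, 0, 0, 0],    # ntp
    [0, 1, 0, -3, 0, 0],    # aa
    [0, 0, 1, 0, -1, 0],    # rrna
    [0, 0, 0, 1, -1, 0],    # rprot
    [0, 0, 0, 0, 1, -1],    # rib
], dtype=float)

lower = np.zeros(len(reactions))
# Hypothetical upper bounds: precursor uptake and rRNA transcription initiation.
upper = np.array([10.0, 12.0, 4.5, 1000.0, 1000.0, 1000.0])


def maximize_flux(S, lower, upper, target):
    """Maximize the flux through reaction `target` subject to S.v = 0 and bounds."""
    c = np.zeros(S.shape[1])
    c[target] = -1.0  # linprog minimizes, so negate the objective
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=list(zip(lower, upper)))
    if result.status != 0:
        raise RuntimeError(result.message)
    return -result.fun, result.x


best, fluxes = maximize_flux(S, lower, upper, reactions.index("DM_rib"))
print(round(best, 6))
```

In this toy case the amino acid exchange is limiting: 12 units of uptake at 3 per ribosomal protein allow at most 4 units of ribosome production, below the limits set by NTP uptake (5) and rRNA initiation (4.5).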

In Silico rRNA Operon Deletion

This analysis was carried out as illustrated in Figure 4. First, the transcription initiation rates were applied as constraints to all rRNA operons for the different doubling times (as described above). Using flux balance analysis (FBA) [66],[67], we optimized for ribosome production (DM_rib_50). For the strains deficient in one rRNA operon, we deleted each operon separately by setting the maximal possible transcription initiation rate to 0 ($v_{i,\max} = 0$), which corresponds to the deletion of the reaction from the network. We then optimized again for ribosome production. For strains deficient in multiple rRNA operons, all possible combinations of rRNA operon deletions were considered (Table 3), leading to the error bars in Figure 4. The compensation factors were chosen arbitrarily (1.5, 2, 2.5, and 4), and the transcription initiation rates of all active rRNA operons in the mutant strains were multiplied by them. Note that the unit for these simulations was [equation pcbi.1000312.e021].
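The deletion procedure can be sketched in Python under strong simplifications. The network below has a single rRNA pool fed by seven operons, the initiation bounds in `INIT_MAX` are invented numbers rather than the values derived from [42], and SciPy's `linprog` stands in for the solver used in the study. The sketch only shows how setting an upper bound to zero, enumerating deletion combinations, and applying a compensation factor fit together.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

OPERONS = ["rrnA", "rrnB", "rrnC", "rrnD", "rrnE", "rrnG", "rrnH"]
# Hypothetical upper bounds on transcription initiation (arbitrary units).
INIT_MAX = np.array([12.0, 11.0, 13.0, 9.0, 12.0, 8.0, 10.0])


def max_ribosome_production(deleted=(), compensation=1.0):
    """Maximize the demand flux when some operons are deleted.

    A deleted operon gets an upper bound of 0; the bounds of the remaining
    operons are multiplied by the compensation factor.
    """
    n = len(OPERONS)
    S = np.array([[1.0] * n + [-1.0]])  # one row: rRNA made by operons, drained by demand
    upper = INIT_MAX * compensation
    for name in deleted:
        upper[OPERONS.index(name)] = 0.0
    bounds = [(0.0, ub) for ub in upper] + [(0.0, None)]
    c = np.zeros(n + 1)
    c[-1] = -1.0  # maximize the demand reaction
    result = linprog(c, A_eq=S, b_eq=[0.0], bounds=bounds)
    if result.status != 0:
        raise RuntimeError(result.message)
    return -result.fun


def deletion_range(n_deleted, compensation=1.0):
    """Smallest and largest optimum over all combinations of n_deleted operons."""
    values = [max_ribosome_production(combo, compensation)
              for combo in combinations(OPERONS, n_deleted)]
    return min(values), max(values)


print(max_ribosome_production())              # wild type
print(deletion_range(1))                      # spread over single deletions
print(deletion_range(1, compensation=1.5))    # with compensation on active operons
```

The spread returned by `deletion_range` plays the role of the error bars described above: different combinations of deleted operons leave different total initiation capacity when the per-operon bounds differ.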

Flux Variability Analysis

Flux variability analysis was performed as described by Mahadevan and Schilling [68] using linear programming. Briefly, for every network reaction, the minimal and maximal flux was determined by successively defining each network reaction as the objective function. The lower bound of the ribosome production rate (DM_rib_50) was constrained to [equation pcbi.1000312.e022].
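A minimal flux variability sketch in Python follows. It uses SciPy's `linprog` on a hypothetical four-reaction network with two parallel routes. The `fraction` argument, which sets the lower bound of the objective reaction relative to its optimum, is an assumption made for illustration and is not the constraint value used in the study.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical): two parallel routes from A to B.
#   EX_a: -> A    R1: A -> B    R2: A -> B    DM_b: B ->
reactions = ["EX_a", "R1", "R2", "DM_b"]
S = np.array([
    [1, -1, -1, 0],   # A
    [0, 1, 1, -1],    # B
], dtype=float)
bounds = [(0.0, 10.0), (0.0, 6.0), (0.0, 10.0), (0.0, 10.0)]


def optimize(S, bounds, index, sense):
    """Minimize or maximize the flux through reaction `index`."""
    c = np.zeros(S.shape[1])
    c[index] = 1.0 if sense == "min" else -1.0
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    if result.status != 0:
        raise RuntimeError(result.message)
    return result.x[index]


def flux_variability(S, bounds, objective, fraction):
    """Return (min, max) flux per reaction with the objective held near its optimum."""
    best = optimize(S, bounds, objective, "max")
    constrained = list(bounds)
    constrained[objective] = (fraction * best, bounds[objective][1])
    return [
        (optimize(S, constrained, j, "min"), optimize(S, constrained, j, "max"))
        for j in range(S.shape[1])
    ]


ranges = flux_variability(S, bounds, reactions.index("DM_b"), fraction=0.9)
for name, (vmin, vmax) in zip(reactions, ranges):
    print(name, round(vmin, 6), round(vmax, 6))
```

Each reaction requires two linear programs, so the cost grows linearly with the number of reactions. In the toy network, R1 can carry no flux at all while R2 must carry at least 3 units, which illustrates how the analysis separates dispensable routes from required ones.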

Correlation of Protein Utilization

The pairwise correlations between protein component recycling reactions (PROT_RECYCL) were determined in LB medium using linear programming. The maximal reaction flux for reaction A was determined, and its upper and lower bounds were set to this maximal flux value. The minimal and maximal reaction fluxes for reaction B were determined under this new set of constraints. The same procedure was repeated for the minimal flux rate through reaction A, and the whole approach was then repeated for reaction B with respect to reaction A. This method resulted in pairwise dependency plots for all recycling reactions. The area of feasible flux rates was determined using a convex hull algorithm [69] and scaled by the maximal flux rates for each reaction. The reaction correlation was defined to be 1 minus the area between two network reactions.
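The last two steps (scaled hull area, then 1 minus the area) can be sketched in plain Python. The corner points are taken as given here, whereas in the procedure above they would come from the linear programs for reactions A and B. Andrew's monotone chain algorithm is swapped in for the Quickhull algorithm cited in [69] because it is short enough to show in full. All function names are hypothetical.

```python
def convex_hull(points):
    """Convex hull of 2D points (Andrew's monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def reaction_correlation(points, a_max, b_max):
    """1 minus the hull area of the feasible (A, B) flux points, scaled to the unit square."""
    scaled = [(a / a_max, b / b_max) for a, b in points]
    return 1.0 - polygon_area(convex_hull(scaled))


# Fully coupled reactions: feasible points fall on a line, so the area is 0.
print(reaction_correlation([(0, 0), (2, 1), (4, 2)], a_max=4, b_max=2))          # 1.0
# Independent reactions: the feasible region fills the scaled square.
print(reaction_correlation([(0, 0), (4, 0), (0, 2), (4, 2)], a_max=4, b_max=2))  # 0.0
```

Under this definition, a correlation of 1 means that the flux through one reaction fully determines the other, and values near 0 mean the two fluxes can vary independently within their ranges.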

All calculations were performed using MatLab (The MathWorks, Inc, Natick, MA) with TomLab (TomLab Optimization, Inc, Pullman, WA) as the linear programming solver.

Availability

This knowledge base is freely available at http://bigg.ucsd.edu/E-matrix

Supporting Information

Dataset S1

Compressed Matlab file containing E-matrix model

(1.49 MB ZIP)

Figure S1

Map of proteins included in the reconstruction.

(1.40 MB PDF)

Table S1

This table lists the network protein components included in the ‘E-matrix’ reconstruction by the subsystem in which they are mainly involved.

(0.03 MB DOC)

Table S2

Reactions per subsystem

(0.01 MB PDF)

Table S3

Proteins without gene annotation

(0.05 MB DOC)

Table S4

E-matrix proteins

(0.03 MB PDF)

Table S5

DnaK-dependent protein folding

(0.01 MB PDF)

Table S6

GroEL-dependent protein folding

(0.04 MB PDF)

Table S7

Unbalanced exchange reactions

(0.01 MB PDF)

Table S8

Unbalanced internal reactions

(0.01 MB PDF)

Table S9

E-matrix transcription units

(0.02 MB PDF)

Table S10

E-matrix genes

(0.07 MB PDF)

Table S11

Used genetic code

(0.04 MB PDF)

Table S12

Complete model reaction list and flux variability (FVA) results

(1.75 MB PDF)

Table S13

Component list

(0.76 MB PDF)

Table S14

List of functional modules

(0.04 MB PDF)

Table S15

Template reactions

(0.74 MB DOC)

Table S16

Template reactions for rRNA modification

(0.40 MB DOC)

Table S17

Template reactions for tRNA modification

(0.76 MB DOC)

Table S18

References for individual network reactions

(3.92 MB DOC)

Text S1

The supplemental text describes in detail the network content, reconstruction approach, and underlying assumptions.

(1.45 MB DOC)

Acknowledgments

The authors want to thank V. Portnoy, E. M. Knight, M. Mo, and J. Schellenberger for valuable discussions. Furthermore, we want to thank Dr. De Crécy-Lagard for her review of and insight into the tRNA modification pathway.

Footnotes

The principal investigator and UCSD have a financial interest in Genomatica, Inc. Although this grant has been identified for conflict of interest management based on the overall scope of the project and its potential to benefit Genomatica, Inc., the research findings included in this publication may not necessarily directly relate to the interests of Genomatica, Inc.

This study was supported by a grant from the National Institutes of Health (R0157089).

References

1. Reed JL, Famili I, Thiele I, Palsson BO. Towards multidimensional genome annotation. Nat Rev Genet. 2006;7:130–141.
2. Feist AM, Herrgard MJ, Thiele I, Reed JL, Palsson BO. Reconstruction of biochemical networks in microorganisms. Nat Rev Microbiol. 2009;7:129–143.
3. Duarte NC, Becker SA, Jamshidi N, Thiele I, Mo ML, et al. Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proc Natl Acad Sci U S A. 2007;104:1777–1782.
4. Duarte NC, Herrgard MJ, Palsson B. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res. 2004;14:1298–1309.
5. Kuepfer L, Sauer U, Blank LM. Metabolic functions of duplicate genes in Saccharomyces cerevisiae. Genome Res. 2005;15:1421–1430.
6. Chavali AK, Whittemore JD, Eddy JA, Williams KT, Papin JA. Systems analysis of metabolism in the pathogenic trypanosomatid Leishmania major. Mol Syst Biol. 2008;4:177.
7. Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, et al. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007;3:121.
8. Thiele I, Vo TD, Price ND, Palsson B. An expanded metabolic reconstruction of Helicobacter pylori (iIT341 GSM/GPR): an in silico genome-scale characterization of single and double deletion mutants. J Bacteriol. 2005;187:5818–5830.
9. Oberhardt MA, Puchalka J, Fryer KE, Martins dos Santos VA, Papin JA. Genome-scale metabolic network analysis of the opportunistic pathogen Pseudomonas aeruginosa PAO1. J Bacteriol. 2008;190:2790–2803.
10. Nogales J, Palsson BO, Thiele I. A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory. BMC Syst Biol. 2008;2:79.
11. Puchalka J, Oberhardt MA, Godinho M, Bielecka A, Regenhardt D, et al. Genome-scale reconstruction and analysis of the Pseudomonas putida KT2440 metabolic network facilitates applications in biotechnology. PLoS Comput Biol. 2008;4:e1000210. doi:10.1371/journal.pcbi.1000210.
12. Palsson BO. Systems Biology: Properties of Reconstructed Networks. New York: Cambridge University Press; 2006.
13. Thiele I, Palsson BO. Bringing genomes to life: the use of genome-scale in silico models. In: Choi S, editor. Introduction to Systems Biology. Totowa (New Jersey): Humana Press; 2007. pp. 14–36.
14. Reed JL, Patel TR, Chen KH, Joyce AR, Applebee MK, et al. Systems approach to refining genome annotation. Proc Natl Acad Sci U S A. 2006;103:17480–17484.
15. Thiele I, Price ND, Vo TD, Palsson BO. Candidate metabolic network states in human mitochondria: Impact of diabetes, ischemia, and diet. J Biol Chem. 2005;280:11683–11695.
16. Fong SS, Palsson BO. Metabolic gene-deletion strains of Escherichia coli evolve to computationally predicted growth phenotypes. Nat Genet. 2004;36:1056–1058.
17. Almaas E, Kovacs B, Vicsek T, Oltvai ZN, Barabasi AL. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature. 2004;427:839–843.
18. Park JH, Lee KH, Kim TY, Lee SY. Metabolic engineering of Escherichia coli for the production of l-valine based on transcriptome analysis and in silico gene knockout simulation. Proc Natl Acad Sci U S A. 2007;104:7797–7802.
19. Price ND, Papin JA, Schilling CH, Palsson B. Genome-scale microbial in silico models: the constraints-based approach. Trends Biotechnol. 2003;21:162–169.
20. Papin JA, Hunter T, Palsson BO, Subramaniam S. Reconstruction of cellular signalling networks and analysis of their properties. Nat Rev Mol Cell Biol. 2005;6:99–111.
21. Li F, Thiele I, Jamshidi N, Palsson BØ. Functional assessment of the TLR receptor network. PLoS Comput Biol. In press.
22. Dasika MS, Burgard A, Maranas CD. A computational framework for the topological analysis and targeted disruption of signal transduction networks. Biophys J. 2006;91:382–398.
23. Gianchandani EP, Papin JA, Price ND, Joyce AR, Palsson BO. Matrix formalism to describe functional states of transcriptional regulatory systems. PLoS Comput Biol. 2006;2:e101. doi:10.1371/journal.pcbi.0020101.
24. Allen TE, Palsson BO. Sequenced-based analysis of metabolic demands for protein synthesis in prokaryotes. J Theor Biol. 2003;220:1–18.
25. Tadmor AD, Tlusty T. A coarse-grained biophysical model of E. coli and its application to perturbation of the rRNA operon copy number. PLoS Comput Biol. 2008;4:e1000038. doi:10.1371/journal.pcbi.1000038.
26. Suthers PF, Gourse RL, Yin J. Rapid responses of ribosomal RNA synthesis to nutrient shifts. Biotechnol Bioeng. 2007;97:1230–1245.
27. Mehra A, Lee KH, Hatzimanikatis V. Insights into the relation between mRNA and protein expression patterns: I. Theoretical considerations. Biotechnol Bioeng. 2003;84:822–833.
28. Mehra A, Hatzimanikatis V. An algorithmic framework for genome-wide modeling and analysis of translation networks. Biophys J. 2006;90:1136–1146.
29. Zouridis H, Hatzimanikatis V. A model for protein translation: polysome self-organization leads to maximum protein synthesis rates. Biophys J. 2007;92:717–730.
30. Jamshidi N, Palsson BO. Formulating genome-scale kinetic models in the post-genome era. Mol Syst Biol. 2008;4:171.
31. Crick FH. On protein synthesis. Symp Soc Exp Biol. 1958;12:138–163.
32. Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, et al. Escherichia coli K-12: a cooperatively developed annotation snapshot—2005. Nucleic Acids Res. 2006;34:1–9.
33. Deuerling E, Patzelt H, Vorderwulbecke S, Rauch T, Kramer G, et al. Trigger factor and DnaK possess overlapping substrate pools and binding specificities. Mol Microbiol. 2003;47:1317–1328.
34. Kerner MJ, Naylor DJ, Ishihama Y, Maier T, Chang HC, et al. Proteome-wide analysis of chaperonin-dependent protein folding in Escherichia coli. Cell. 2005;122:209–220.
35. Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004;32:D277–D280.
36. Karp PD, Arnaud M, Collado-Vides J, Ingraham J, Paulsen IT, et al. The E. coli EcoCyc Database: no longer just a metabolic pathway database. ASM News. 2004;70:25–30.
37. Smulson ME, Suhadolnik RJ. The biosynthesis of the 7-deazaadenine ribonucleoside, tubercidin, by Streptomyces tubercidicus. J Biol Chem. 1967;242:2872–2876.
38. Suhadolnik RJ, Uematsu T. Biosynthesis of the pyrrolopyrimidine nucleoside antibiotic, toyocamycin. VII. Origin of the pyrrole carbons and the cyano carbon. J Biol Chem. 1970;245:4365–4371.
39. Becker SA, Price ND, Palsson BO. Metabolite coupling in genome-scale metabolic networks. BMC Bioinformatics. 2006;7:111.
40. Price ND, Reed JL, Palsson BO. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat Rev Microbiol. 2004;2:886–897.
41. Nomura M. Regulation of ribosome biosynthesis in Escherichia coli and Saccharomyces cerevisiae: diversity and common principles. J Bacteriol. 1999;181:6857–6864.
42. Neidhardt FC, editor. Escherichia coli and Salmonella: cellular and molecular biology. 2nd edition. Washington, D.C.: ASM Press; 1996.
43. Nomura M, Gourse R, Baughman G. Regulation of the synthesis of ribosomes and ribosomal components. Annu Rev Biochem. 1984;53:75–117.
44. Edwards JS, Palsson BO. Significant redundancy and robustness exist in the central metabolic pathways; 12–15 October 1997. Snowbird, UT: American Society for Microbiology; 1997.
45. Nomura M, Morgan EA. Genetics of bacterial ribosomes. Annu Rev Genet. 1977;11:297–347.
46. Condon C, French S, Squires C, Squires CL. Depletion of functional ribosomal RNA operons in Escherichia coli causes increased expression of the remaining intact copies. EMBO J. 1993;12:4305–4315.
47. Stevenson BS, Schmidt TM. Life history implications of rRNA gene copy number in Escherichia coli. Appl Environ Microbiol. 2004;70:6670–6677.
48. Asai T, Condon C, Voulgaris J, Zaporojets D, Shen B, et al. Construction and initial characterization of Escherichia coli strains with few or no intact chromosomal rRNA operons. J Bacteriol. 1999;181:3803–3809.
49. Condon C, Liveris D, Squires C, Schwartz I, Squires CL. rRNA operon multiplicity in Escherichia coli and the physiological implications of rrn inactivation. J Bacteriol. 1995;177:4152–4156.
50. Edwards JS, Palsson BO. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc Natl Acad Sci U S A. 2000;97:5528–5533.
51. Forster J, Famili I, Palsson BO, Nielsen J. Large-scale evaluation of in silico gene knockouts in Saccharomyces cerevisiae. Omics. 2003;7:193–202.
52. Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO. Integrating high-throughput and computational data elucidates bacterial networks. Nature. 2004;429:92–96.
53. Neidhardt FC, Ingraham JL, Schaechter M. Physiology of the Bacterial Cell: A Molecular Approach. Sunderland (Massachusetts): Sinauer Associates; 1990.
54. Gaal T, Bartlett MS, Ross W, Turnbough CL, Jr, Gourse RL. Transcription regulation by initiating NTP concentration: rRNA synthesis in bacteria. Science. 1997;278:2092–2097.
55. Voulgaris J, French S, Gourse RL, Squires C, Squires CL. Increased rrn gene dosage causes intermittent transcription of rRNA in Escherichia coli. J Bacteriol. 1999;181:4170–4175.
56. Bernstein JA, Khodursky AB, Lin PH, Lin-Chao S, Cohen SN. Global analysis of mRNA decay and abundance in Escherichia coli at single-gene resolution using two-color fluorescent DNA microarrays. Proc Natl Acad Sci U S A. 2002;99:9697–9702.
57. Burgard AP, Nikolaev EV, Schilling CH, Maranas CD. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res. 2004;14:301–312.
58. Butland G, Peregrin-Alvarez JM, Li J, Yang W, Yang X, et al. Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature. 2005;433:531–537.
59. Arifuzzaman M, Maeda M, Itoh A, Nishikata K, Takita C, et al. Large-scale identification of protein-protein interaction of Escherichia coli K-12. Genome Res. 2006;16:686–691.
60. Cummings HS, Hershey JW. Translation initiation factor IF1 is essential for cell viability in Escherichia coli. J Bacteriol. 1994;176:198–205.
61. Covert MW, Xiao N, Chen TJ, Karr JR. Integrating metabolic, transcriptional regulatory and signal transduction models in Escherichia coli. Bioinformatics. 2008;24:2044–2050.
62. Min Lee J, Gianchandani EP, Eddy JA, Papin JA. Dynamic analysis of integrated signaling, metabolic, and regulatory networks. PLoS Comput Biol. 2008;4:e1000086. doi:10.1371/journal.pcbi.1000086.
63. Feist AM, Palsson BO. The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat Biotechnol. 2008;26:659–667.
64. Holden C. Alliance launched to model E. coli. Science. 2002;297:1459–1460.
65. Blattner FR, Plunkett G III, Bloch CA, Perna NT, Burland V, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1474.
66. Edwards JS, Covert M, Palsson B. Metabolic modeling of microbes: the flux-balance approach. Environ Microbiol. 2002;4:133–140.
67. Varma A, Palsson BO. Metabolic flux balancing: basic concepts, scientific and practical use. Nat Biotechnol. 1994;12:994–998.
68. Mahadevan R, Schilling CH. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003;5:264–276.
69. Barber CB, Dobkin DP, Huhdanpaa HT. The Quickhull Algorithm for Convex Hulls. ACM Trans Math Softw. 1996;22:469–483.
70. Sundararaj S, Guo A, Habibi-Nazhad B, Rouani M, Stothard P, et al. The CyberCell Database (CCDB): a comprehensive, self-updating, relational database to coordinate and facilitate in silico modeling of Escherichia coli. Nucleic Acids Res. 2004;32:D293–D295.
71. Sprinzl M, Vassilenko KS. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 2005;33:D139–D140.

Articles from PLoS Computational Biology are provided here courtesy of Public Library of Science
