Copyright © 2009 The Author(s)

Model-based redesign of global transcription regulation

1Instituto de Biologia Molecular y Celular de Plantas, CSIC, 2Instituto de Aplicaciones en Tecnologias de la Informacion y las Comunicaciones Avanzadas (ITACA), Universidad Politecnica de Valencia, Camino de Vera s/n, 46022 Valencia, Spain, 3Laboratoire de Biochimie, Ecole Polytechnique - CNRS, Route de Saclay, 91128 Palaiseau Cedex and 4Epigenomics Project, Universite d'Evry Val d'Essonne - Genopole - CNRS, 523 Terrasses de l'Agora, 91034 Evry Cedex, France

*To whom correspondence should be addressed. Tel: +33 1 69474444; Fax: +33 1 69474437; Email: alfonso.jaramillo/at/polytechnique.fr

The authors wish it to be known that, in their opinion, the first two authors have contributed equally to this work.

Received July 12, 2008; Revised January 2, 2009; Accepted January 7, 2009.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Synthetic biology aims at the design or redesign of biological systems. In particular, one possible goal could be the rewiring of the transcription regulation network by exchanging the endogenous promoters. To achieve this objective, we have adapted current methods to the inference of a model based on ordinary differential equations that is able to predict the network response after a major change in its topology. Our procedure uses microarray data for training. We have experimentally validated our inferred global regulatory model in Escherichia coli by predicting transcriptomic profiles under new perturbations.
We have also tested our methodology in silico by providing accurate predictions of the underlying networks from expression data generated with artificial genomes. In addition, we have shown the predictive power of our methodology by obtaining the gene profile in experimental redesigns of the E. coli genome, in which the transcriptional network was rewired by means of knockouts of master regulators or by upregulating transcription factors controlled by different promoters. Our approach is compatible with most network inference methods, allowing future genome-wide redesign experiments in synthetic biology to be explored computationally.

INTRODUCTION

Molecular regulations govern the cell response under environmental (extracellular) or genetic (intracellular) perturbations. The elucidation of these regulations with computational techniques will allow the analysis of cell behavior (1), since modeling in biology has boosted the understanding of cell mechanisms by means of systemic approaches (2). On the other hand, the design of new transcriptional networks requires a quantitative description of the transcription regulation. Thanks to new developments in inference from transcriptomic data, it is now possible to reconstruct the regulatory network with enough accuracy to predict the gene expression profile in the presence of heterologous networks. We propose a procedure that, by extending a recent methodology, could be used to redesign transcriptional networks.

The continuous developments in genome sequencing and annotation allow us to design microarrays and to identify the genes and transcription factors (TFs) of an organism. The development of microarray technology has provided high-throughput genomic measurements, where cells are subjected to several conditions or stresses to measure their gene expression profiles (3).
Large-scale cell models, such as metabolic, transcription or protein networks, are distilled from high-throughput genomic data, which poses one of the most challenging problems in biology. The construction of a deterministic model would allow the prediction of the cell response under different stimuli (4). To redesign the transcriptional regulation network, we need a quantitative model able to predict the gene dynamics. We propose to characterize such a model by using microarray data with a known transcriptional network inference method. We first infer the network topology and then estimate the corresponding kinetic parameters.

Over the last decade, there has been an enormous effort to improve the techniques aimed at inferring the connectivity of the transcription network. Clustering approaches (5–9) have been used to obtain information on regulatory networks, but with low accuracy (10). Information-theoretic inference provides more accurate networks (11–15), even from reduced expression datasets. A local significance calculation has been very fruitful in capturing the network topology (14). On the other hand, Bayesian methods (16–19) give networks with high precision but a low proportion of true recovered interactions (they introduce few regulations, with high confidence). Moreover, such methods have a higher computational cost. Herein, we propose the construction of predictive genome models in a standard format from a regulatory scaffold captured by using probabilistic methods. Other approaches, instead, directly optimized the corresponding kinetic parameters for a linear regulatory model (20,21). In addition, recent algorithms (22,23) applied sparse logistic regression (24) for gene selection in order to avoid overfitting.

METHODS

We aim to develop a methodology able to evolve a genome in silico toward a predefined transcriptional profile.
For this, we need to construct a predictive genome model of transcription, based on ordinary differential equations (ODEs), to account for global redesigns of the cellular regulatory map. Using such models we could study the evolution of gene regulations as a consequence of environmental stimuli. To construct this model, we use properly normalized microarray data as input (Figure 1).
Mathematical model

We describe the genetic regulations using a linear model for the mRNA dynamics. Here, we use as input data steady-state mRNA expression profiles derived from transcriptional perturbations. As transcriptomic data are normalized and usually represented in logarithmic scale, we take as variables the $\log_s$ of the expression values (where s can be 2 or 10). Therefore, the dynamics of the mRNA from gene i, $y_i$, is given by

$$\frac{dy_i}{dt} = \alpha_i + \sum_j \beta_{ij}\, y_j + \sum_{j<k} \gamma_{ijk}\, y_j y_k - \delta_i y_i \qquad (1)$$

where $\alpha_i$ is the constitutive transcription rate, $\beta_{ij}$ the transcription regulatory coefficient of TF j, $\gamma_{ijk}$ the cooperative transcription regulatory coefficient of TFs j and k acting on the promoter controlling the gene i, and $\delta_i$ the degradation rate. We set $\beta_{ij} = 0$ and $\gamma_{ijk} = 0$ when j and j, k are not TFs regulating the gene i. We assume that all the genes of an operon have the same expression value. We also consider that two regulators could act in a cooperative way (i.e. synergistic inductions and cooperative repressions). We do not consider cooperation between more than two TFs.

Here, we use expression values in steady state. Nevertheless, it would also be possible to extend our approach to the use of time series to enrich the experimental input (35). Hence, in the steady state we can write

$$y_i = \tilde{\alpha}_i + \sum_j \tilde{\beta}_{ij}\, y_j + \sum_{j<k} \tilde{\gamma}_{ijk}\, y_j y_k \qquad (2)$$

where $\tilde{\alpha}_i = \alpha_i/\delta_i$, $\tilde{\beta}_{ij} = \beta_{ij}/\delta_i$ and $\tilde{\gamma}_{ijk} = \gamma_{ijk}/\delta_i$. Notice that the resulting parameters are referred to the intensity scale of the microarray technology. We use a time scale such that the mRNA degradation constant is δ = 1, so that the rescaled and original parameters coincide. Using a realistic mRNA degradation constant would require translating the Affymetrix (36) data to concentration units.

Using network inference to obtain a kinetic model

To obtain a kinetic model suitable for redesign, we take advantage of recent methods aimed at inferring the topology of the global regulatory map. In particular, we have chosen one of the best performing methods, the context likelihood of relatedness (CLR) (14), although other methodologies providing a transcriptional map, such as sparse Bayesian methods (19), could also be used. Our approach consists of using multiple regressions to fit the kinetic parameters of a continuous model of the transcription regulation.

The approach for large-scale transcription inference is based on measuring the influence between the expression levels of TFs and operons across a large set of conditions. Here, we use the mutual information (MI) to estimate the correlation between a TF t and an operon p by using $I(t;p) = H(t) + H(p) - H(t,p)$, where H is the entropy of a variable. It is defined as $H(i) = -\sum_c P(y_{ic}) \log P(y_{ic})$, where $y_{ic}$ is the expression value of gene i in the condition c, and $P(y_{ic})$ the probability of reaching that value. The MI is never negative. A joint normal distribution is built from two independent variables, the MI values in row i (gene i) and in column j (TF j). Thus, the MI matrix is converted into a Z matrix, where $Z_{ij} = \sqrt{Z_i^2 + Z_j^2}$ and $Z_i$ and $Z_j$ are the z-scores of $I_{ij}$ from the marginal distributions. According to this matrix, we obtain the genomic interactions.

For completeness, we have developed an algorithm (InferOpe) to infer operons from microarray data. Since two genes from one operon share the same mRNA molecule, we would expect their transcriptomic profiles to be similar. Our operon prediction is based on the use of co-expression patterns (37), assuming that two genes, i and j, belong to the same operon if they are highly correlated.
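We evaluate this by using the Pearson correlation coefficient (we assume correlation when the coefficient is sufficiently high). Moreover, we impose that the angle of such correlation, which is related to the correlation coefficient, should stay within a fixed interval.

For each operon we compute the kinetic parameters for the TFs regulating its promoter. The experimental value of one operon is computed as the average of the expressions of all genes belonging to that operon (i.e. $y_{\mathrm{op}} = \frac{1}{n}\sum_{i=1}^{n} y_i$, where n is the number of genes of the corresponding operon). To estimate the model parameters $\alpha_i$, $\beta_{ij}$ and $\gamma_{ijk}$ (the constitutive, single-TF and cooperative coefficients of the model) we use multiple linear regression (32), which is the result of a minimization problem (least squares) defined by

$$\min_{\alpha_i,\,\beta_{ij},\,\gamma_{ijk}} \; \sum_c \Big( y_{ic} - \alpha_i - \sum_j \beta_{ij}\, y_{jc} - \sum_{j<k} \gamma_{ijk}\, y_{jc}\, y_{kc} \Big)^2$$

where $y_{ic}$ is the expression value of operon i in condition c and the sums over j and k run over the TFs regulating its promoter.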
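The least-squares estimation described above can be illustrated with a minimal Python sketch. It is not the InferGene implementation (which the text says is written in C++); the function name `fit_operon`, the symbols alpha, beta and gamma, and the synthetic data are assumptions made here for illustration only. The sketch builds a design matrix with a constant column, one column per putative regulator and one column per pair of regulators, and solves it with NumPy's `lstsq`.

```python
import numpy as np
from itertools import combinations

def fit_operon(y_op, Y_tf):
    """Fit y = alpha + sum_j beta_j*y_j + sum_{j<k} gamma_jk*y_j*y_k by least squares.

    y_op : (C,) expression of one operon across C conditions (hypothetical input)
    Y_tf : (C, R) expression of its R putative regulators across the same conditions
    """
    C, R = Y_tf.shape
    pairs = list(combinations(range(R), 2))
    columns = [np.ones(C)]                                   # constitutive term
    columns += [Y_tf[:, j] for j in range(R)]                # single-TF terms
    columns += [Y_tf[:, j] * Y_tf[:, k] for j, k in pairs]   # cooperative terms
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y_op, rcond=None)
    alpha = coef[0]
    beta = coef[1:1 + R]
    gamma = dict(zip(pairs, coef[1 + R:]))
    return alpha, beta, gamma

# Synthetic, noise-free example with two regulators (values are illustrative).
rng = np.random.default_rng(0)
Y_tf = 8.0 + rng.normal(size=(60, 2))
y_op = 1.5 + 0.7 * Y_tf[:, 0] - 0.3 * Y_tf[:, 1] + 0.05 * Y_tf[:, 0] * Y_tf[:, 1]
alpha, beta, gamma = fit_operon(y_op, Y_tf)
print(alpha, beta, gamma)
```

With noise-free data the fit recovers the generating coefficients up to numerical precision; with real microarray data the number of conditions has to exceed the number of fitted coefficients for the problem to be well posed.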
Our procedures are implemented in C++, and they run on any UNIX environment. The InferGene software, a tutorial, the corresponding files and some examples are available upon request. The software consists of different functional modules to compute first the network topology and then the corresponding kinetic parameters (see Supplementary Figure S1). Below we present the procedure implemented in InferGene:
Prediction of transcriptomic profiles

To compute the performance of our algorithm, we defined a reference network taking those genes with known transcriptional regulation. In addition, the TFs that were present in our reference set but regulated genes outside the reference set were also removed when determining the performance of the algorithm. Then, only the interactions among the genes present in that reference set were evaluated to compute the algorithm efficiency. All known interactions cataloged in RegulonDB version 4 (28) were used to construct the reference network in E. coli. However, we are still far from a complete understanding of the transcriptional regulation network of E. coli. Therefore, we designed in silico genomes with predefined regulations to validate the performance of our algorithm. For that, we did not consider: (i) operons with self-regulations; (ii) operons with constitutive promoters; and (iii) operons containing only TFs. We calculated two types of efficiencies (precision rate and sensitivity) to compare the inferred network with the reference network. We defined precision rate as the fraction of predicted interactions that are correct, $P = TP/(TP + FP)$, and sensitivity as the fraction of all known interactions that are discovered by the algorithm, $S = TP/(TP + FN)$, where $TP$ is the number of true positives, $FN$ the number of false negatives and $FP$ the number of false positives (39,40).

Designing genomes and expression data

In order to evaluate the suitability of our procedure to redesign the transcription regulation, we analyze our ability to infer the kinetic parameters. Since they are not known for any organism, this led us to develop a Generator of Artificial Genomes (GAG) to create expression profiles in silico (Figure S2). To construct such genomes, we specify the number of genes and TFs (the latter is usually taken one order of magnitude smaller than the number of genes), and optionally the ratio between inducers and repressors (we have used 2/3).
We can also specify the degree of connectivity to obtain scale-free networks [we have considered a power-law probability distribution P(k), where k is the number of regulators of an operon], and the law for the clustering distribution [we have assumed a distribution P(n), where n is the number of genes per operon]. To generate synthetic microarray data, we first obtain the steady state of the system [with an arbitrary degradation rate of 1] without taking into account cooperations between different regulators (i.e. $\gamma_{ijk} = 0$) as an approximate solution of the system (Equation 2). In fact, as the gene expressions (y) are only functions of the TF expressions, the system can be written in terms of the TF expressions alone. Subsequently, we generate a new condition by randomly choosing a set of TFs with a given size optimized for the inference (Figure S4) and perturbing their steady state values, while keeping the other TF expressions constant. The perturbations over- or under-express the TFs relative to their steady states. Hence, the perturbed values are used to recalculate the gene expressions by applying the model. Although this could be extended to more complicated conditions, where different gene categories are altered, the conditions based on TF perturbations are more revealing. Furthermore, to generate more realistic data we have added random fluctuations (which simulate noisy data) to the expression values. We have studied the efficiency (precision rate and sensitivity) of our algorithm for different noise levels. In Figure S5 (see Supplementary Data) we show that InferGene maintains high efficiency up to 10% of noise amplitude.

RESULTS

Genome-wide quantitative model of E. coli

In the present study, we have applied recently developed inference methodologies to obtain models suitable for genome redesign. We have considered the E. coli genome, which contains 4345 nonredundant genes, of which 328 are putative TFs. The genome is organized into 3333 operons, 2447 containing single genes and 886 polycistronic units.
The reference regulatory set has been constructed according to RegulonDB (28). For the inference procedure, we have used public microarray data (41) from Affymetrix normalized using RMA (42). This is a microarray compendium containing 189 experiments. From this dataset, 20 experiments were excluded in order to later predict expression profiles from unbiased data. The inferred network (Figure 2) contains 525 regulatory interactions and 566 combinatorial influences, each selected according to its z-score. InferGene predicts 3982 genes to be controlled by constitutive promoters.
To analyze those results in a biological context, we have used the EcoCyc (43) classification to group genes by biological functions and to rank those groups according to their level of prediction (see Supplementary Figure S9). We have scored each biological function as , where n is number of genes involved in the biological function, m the number of the new conditions of the set (m = 20), the predicted expression and the measured expression. The best predicted functions are involved in the metabolism, such as biosynthesis of lipoprotein, carnitine, glycolate and glycoprotein, or functions related with information transfer such as rRNA and stable RNA, ATP binding, DNA and DNA degradation. In addition, we have observed two significant correlations between the number of constitutively expressed genes and the error in expression . These genes are from biological functions involved in the location of gene products and the cell processes (see in Supplementary Figure S9). On the other hand, in Figure 2 operon, involved in metabolism of alanine biosynthesis, is regulated by with a strength of 1.428, according to InferGene. InferGene also predicts the regulation for the operon, involved in the cell structure of , where and act synergistically with . For the operon, involved in transport, InferGene proposes the combinatorial regulation ( AND ) OR ( AND ), with and . Notice that these regulations are not found in RegulonDB, but are obtained as the best experiment-fitting regulators.Furthermore, we provide in the Supplementary Data a list of the E. coli promoters classified according to their inferred regulation. An analysis of the prediction of the promoter regulation shows (see Supplementary Figure S10) that the promoters which are regulated by two TFs are better predicted. In addition, the algorithm can be used to account for nontranscriptional regulations (20). In the Supplementary Data, we have applied this to the well-known SOS pathway. 
There we show that an effective model of gene–gene interactions can improve the prediction over the pure transcriptional one (see Figures S23–S25).

Designing genomes and validating their transcription profiles

We have constructed several genomes in silico using GAG and we have compared the predefined regulations in our models with the regulations inferred by InferGene. We have constructed three types of transcription networks according to the mode of regulation of their constituent operons: (i) networks with promoters regulated by at most one TF; (ii) networks with promoters that can be regulated by more than one TF; and (iii) networks with promoters that can be combinatorially regulated including synergistic effects. We have computed the precision rate and sensitivity (see Methods section) to quantify the efficiency of InferGene (Figure 3).
We have analyzed the predictive power of InferGene by calculating a score based on the error made in predicting the expression levels, and another score, Γ, based on the error made in the prediction of the model parameters. The expression score compares the predicted expression profile with the experimental value over the n operons that are correctly inferred according to RegulonDB and the m conditions that were not used in the training set (m = 20). The score Γ compares the estimated model parameters with the model parameters from GAG over the np parameters we use to model the kinetics of the operon expression. To perform such analysis, we have generated a network using the GAG algorithm with 500 genes across 250 conditions (see Supplementary Figure S11). The median of the expression score was 0.009, and that of Γ was around 0.01. Moreover, we have validated the estimated parameters by performing linear regressions with the predefined kinetic models and obtaining correlations (Pearson coefficients) above 0.90 (see Supplementary Figure S3).

Prediction of wild-type E. coli transcriptomic profiles

Before proceeding to change the regulation of E. coli, we have calculated the ability of the inferred model to predict the steady state expression levels of the E. coli genes. For that, we have used the model together with the expression levels of all the TFs for each experimental condition to compute the global expression profile. Afterwards, we have compared the predicted expression values with the corresponding measurements. We have also determined the predictive power of the inferred model on the 20 experimental conditions excluded from the training dataset. The distribution of the expression score for the 3333 operons of E. coli is shown in Figure 4.
In Figure 5 against the experimental profiles across all conditions (189 experiments, 169 conditions from the training set and 20 new conditions for prediction). We also perform a K-fold cross-validation (we consider nine partitions, see Figures S13 and S14) to ensure that our results do not depend on the selection of the testing set. In the Supplementary Data, we provide the best predicted profiles for the distinct types of promoters. In addition, we have analyzed the profile prediction to evaluate the best predicted conditions (see Figure S12). We have found that the conditions upregulating genes , , , , , , , , and are better predicted, and the experiments with plasmids pPROEx-CAT, pET3d and T7 controllable have higher error (see more details in the Supplementary Data).
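The K-fold cross-validation mentioned above (nine partitions) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `kfold_relative_error`, the use of a plain linear model without cooperative terms, and the mean relative error as the score are all assumptions made here, and the data are synthetic.

```python
import numpy as np

def kfold_relative_error(X, y, k=9, seed=0):
    """Mean relative prediction error of a linear model under k-fold cross-validation.

    X : (C, R) regulator expression across C conditions (hypothetical input)
    y : (C,) expression of the predicted operon, assumed non-zero
    """
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for test in folds:
        train = np.setdiff1d(idx, test)
        # Fit intercept and slopes on the training conditions only.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        # Predict the held-out conditions and score them.
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        pred = A_test @ coef
        errors.append(float(np.mean(np.abs(pred - y[test]) / np.abs(y[test]))))
    return float(np.mean(errors)), errors

# Synthetic compendium with 189 conditions, split into nine partitions of 21.
rng = np.random.default_rng(1)
X = rng.normal(size=(189, 3))
y = 8.0 + X @ np.array([0.5, -0.2, 0.1]) + 0.05 * rng.normal(size=189)
mean_error, fold_errors = kfold_relative_error(X, y, k=9)
print(mean_error, len(fold_errors))
```

Comparing the per-fold errors is one way to check that a result does not hinge on which conditions happened to be held out.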
Redesign of the global transcription regulation

Finally, we have used our model to predict the expression profile under knockouts of TFs (conditions from the training set). This is a first step toward changing the transcription regulation. For that, we have solved the system of equations in steady state after removing the corresponding transcription regulation. For simplicity, here we have neglected the combinatorial terms to work with a linear model and recalculated the kinetic parameters. To account for experimentally reported interactions, we have incorporated into the model regulations between pairs of TFs according to RegulonDB. Figure 6 shows the predictions for several TF knockouts. In the Supplementary Data we also show predictions for further single knockouts and a double knockout. We show how the model is able to capture the whole transcriptomic expression due to a perturbation in the TF network (the relative expression errors, averaged over all genes, are shown in Figure 6).
Moreover, we have applied our procedure to the modification of the global transcription regulation by adding new regulations into the genomic network. This was done experimentally by Isalan et al. (44), who expressed plasmids pairing wild-type promoters with ORFs coding for TFs that were master regulators. We used our procedure to predict the gene expression under such a transcriptional perturbation for particular promoter–ORF pairings (see Figure 7).
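The knockout and rewiring predictions described in this section rest on solving the linear steady-state system after editing the regulation matrix. The sketch below illustrates that idea under stated assumptions: combinatorial terms are neglected and δ = 1, as in the text, but the function names (`steady_state`, `knockout`, `rewire`), the choice to model a knockout by zeroing the regulations exerted by the TF, and the three-gene toy network are hypothetical and not taken from InferGene.

```python
import numpy as np

def steady_state(alpha, B):
    """Solve y = alpha + B @ y, i.e. (I - B) y = alpha, with degradation rate 1."""
    n = len(alpha)
    return np.linalg.solve(np.eye(n) - B, alpha)

def knockout(B, tf):
    """Return a copy of B with every regulation exerted by gene `tf` removed."""
    B_ko = B.copy()
    B_ko[:, tf] = 0.0
    return B_ko

def rewire(B, target, tf, strength):
    """Return a copy of B with an added regulation of `target` by `tf`."""
    B_new = B.copy()
    B_new[target, tf] += strength
    return B_new

# Toy network: gene 0 is a TF that induces gene 1 and represses gene 2.
alpha = np.array([2.0, 1.0, 3.0])
B = np.zeros((3, 3))
B[1, 0] = 0.5
B[2, 0] = -0.25

y_wt = steady_state(alpha, B)                        # approximately [2.0, 2.0, 2.5]
y_ko = steady_state(alpha, knockout(B, tf=0))        # approximately [2.0, 1.0, 3.0]
y_rw = steady_state(alpha, rewire(B, target=2, tf=1, strength=0.1))
print(y_wt, y_ko, y_rw)
```

Because regulations between pairs of TFs enter the same matrix, a perturbation of one TF propagates to its indirect targets through the linear solve rather than through an explicit traversal of the network.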
DISCUSSION

We have discussed a methodology to create quantitative models of transcription regulation aimed at future genome redesign projects. We have shown how recent methodologies that infer the global topology of transcription regulation can be used to produce a kinetic model suitable for genome redesign. We have successfully applied the inferred model to predict the transcriptomic response of E. coli under experimental conditions not included in the training set. The prediction has on average an error of 1–5% relative to the experimental value (average computed across all conditions). Furthermore, we have predicted the gene expression under knockouts of TFs and genetic rewirings (44) by solving a perturbed model, showing the predictive power of the inference procedure. Such perturbations change the regulatory map of the cell, but more complex redesigns, even a whole transcription refactorization, could be explored in silico by using our model.

Our algorithm provides a global deterministic kinetic model of genetic regulations using microarray data. We show how to use this kinetic model to make predictions (23). Thus, our approach constitutes an important step toward the large-scale design of cell behaviors by providing models which are validated using in silico genomes and experimental transcription data. In this direction, we have accounted for simple transcription rewirings (44) by obtaining the gene expressions using computational methods. Such models can be used in the future to rewire the regulation of organisms without affecting their physiological behavior. The algorithm reaches high efficiencies at the topology and kinetic level; it is based on the CLR algorithm (14) to infer the network, together with an extension to include cooperations in combinatorial promoters. However, it could use other approaches such as Bayesian methods (19).
In addition, the generation of synthetic data from specified genome models has been essential to analyze the performance and limitations of InferGene. Indeed, we have shown how the precision rate is drastically improved, from 10–20% to 80–90%, by just doubling the number of perturbations in artificial genomes. Moreover, the error in the prediction of the expression value for correctly predicted regulations is of the order of magnitude of the standard errors on measured expression data, and the estimated parameters highly correlate with the predefined ones (correlation coefficient >0.9). The inaccuracies in our prediction could be explained by the lack of modeling of many dynamic variables of the cell (e.g. proteins or metabolites) or nontranscriptional regulations (e.g. protein–protein or RNAi), since these variables are not experimentally measured using microarrays. Furthermore, future work could consider confidence intervals on the model parameters to analyze the stochasticity in expression data. We provide the inferred model in a standard format, namely SBML (33), which can be used for further applications. In addition, we have used genome annotation to identify the best predicted biological functions.

Our approach can take advantage of additional sources of information. For instance, it can incorporate in the inferred model experimentally validated interactions (e.g. from functional genomics measurements or sequence analysis) as a regulatory background. In addition, knowledge of the genome sequence can help in the inference procedure, by providing information about operon structure, identification of TFs and their regulations (28,45,46). The prior knowledge about regulation provides a topology that can be added into the model and can be used to predict new interactions with high fidelity (47). The methodology can also be applied to account for nontranscriptional interactions.
In the Supplementary Data, we use the well-known SOS pathway to show that an effective model of gene–gene interactions can improve the prediction over the pure transcriptional one. Furthermore, the algorithm can be expanded in a straightforward way to input expression data from time series.

The identification of regulations is a highly time-consuming activity. The running time scales with the number of genes and the square of the number of conditions. Nonetheless, the parameter estimation is a quick process (relative to the identification of regulations). For instance, in E. coli there are 4345 genes (strain K-12) clustered in 3333 operons, and 328 TFs and 53 628 pairs of TFs (28). The whole inference process took 6 h on a Pentium M 2.00 GHz computer with 1 GB RAM (the time for parameter estimation is neglected, as it is around 2 min). However, all simulations can be run in parallel, allowing the reduction of the execution time (<5 min on a simple cluster). In this way, distributed computing provides the necessary resources to apply our methodology to infer the regulations of much larger genomes.

Our methodology provides a simple and fast way to obtain a quantitative global model of transcriptional regulation even for large networks. The incorporation of sparse Bayesian regression methods (19) provides a promising extension for further work. Such methods would provide better inference, but at an increased computational cost. The construction of genome-scale models is clearly a valuable step toward the understanding of cellular behavior (4), but it is also of interest for the emerging field of synthetic biology, where functional genetic circuits are engineered into cells while trying to minimize the impact on the host (48). Hence, InferGene provides an accurate model to predict the changes in the biological processes when perturbing the cell. In addition, this model can be applied to discover molecular targets of heterologous compounds (20,21).
SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

Spanish Ministry of Education and Science (ref. TIN 2006-12860); Structural Funds of the European Regional Development Fund; EU grants BioModularH2 (FP6-NEST contract 043340) and EMERGENCE (FP6-NEST contract 043338); ATIGE Genopole/UEVE and the MIT-France grants; Graduate fellowship from the Conselleria d'Educacio de la Generalitat Valenciana (ref. BFPI 2007/160 to G.R.) and an EMBO Short-term fellowship (ref. ASTF-343.00-2007 to G.R.). HPC-Europa programme. Funding for open access charge: EU grant BioModularH2 FP6-NEST-043340.

Conflict of interest statement. None declared.
ACKNOWLEDGEMENTS

We are indebted to M. Elati for his careful reading of the article and his comments. We also acknowledge the anonymous reviewers for their suggestions.

REFERENCES

1. Lee T, Rinaldi N, Robert F, Odom D, Bar-Joseph Z, Gerber G, Hannett N, Harbison C, Thompson C, Simon I, et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002;298:799–804. [PubMed] 2. de Jong H. Modeling and simulation of genetic regulatory systems: a literature review. J. Comp. Biol. 2002;9:67–103. 3. Hughes T, Marton M, Jones A, Roberts C, Stoughton R, Armour C, Bennett H, Coffey E, Dai H, He Y, et al. Functional discovery via a compendium of expression profiles. Cell. 2000;102:109–126. [PubMed] 4. Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO. Integrating high-throughput and computational data elucidates bacterial networks. Nature. 2004;429:92–96. [PubMed] 5. Eisen M, Spellman P, Brown P, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA. 1998;95:14863–14868. [PubMed] 6. Ben-Dor A, Shamir R, Yakhini Z. Clustering gene expression patterns. J. Comput. Biol. 1999;6:281–297. [PubMed] 7. Alon U, Barkai N, Notterman D, Gish K, Ybarra S, Mack D, Levine A. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl Acad. Sci. USA. 1999;96:6745–6750. [PubMed] 8. D'haeseleer P, Liang S, Somogyi R. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics. 2000;16:707–726. [PubMed] 9. Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. Revealing modular organization in the yeast transcriptional network. Nat. Genet. 2002;31:370–377. [PubMed] 10. Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D. How to infer gene networks from expression profiles. Mol. Syst. Biol. 2007;3:78. [PubMed] 11. Butte A, Kohane I.
Mutual information relevance networks: functional genomic clustering using pairwise entropymeasurements. Pac. Symp. Biocomp. 2000;5:415–426. 12. Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R, Califano A. Reverse engineering of regulatory networks in human B cells. Nat. Genet. 2005;37:382–390. [PubMed] 13. Margollin A, Nemenman I, Basso K, Wiggins C, Stolovitzky G, dellaFavera R, Califano A. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics. 2006;7:S7. 14. Faith J, Hayete B, Thaden J, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins J, Gardner T. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. Plos Biol. 2007;5:e8. [PubMed] 15. Meyer PE, Kontos K, Lafitte F, Bontempi G. Information-theoretic inference of large transcriptional regulatory networks. EURASIP J. Bioinf. Syst. Biol. 2007;2007:79879. 16. Yu J, Smith V, Wang P, Hartemink A, Jarvis E. Advances to bayesian network inference for generating causal networks from observational biological data. Bioinformatics. 2004;20:3594–3603. [PubMed] 17. Husmeier D. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics. 2003;19:2271–2282. [PubMed] 18. Fujita A, Sato JR, Garay-Malpartida HM, Yamaguchi R, Miyano S, Sogayar MC, Ferreira CE. Modeling gene expression regulatory networks with the sparse vector autoregressive model. BMC Syst. Biol. 2007;1:39. [PubMed] 19. Steinke F, Seeger M, Tsuda K. Experimental design for efficient identification of gene regulatory networks using sparse Bayesian models. BMC Syst. Biol. 2007;1:51. [PubMed] 20. Gardner T, diBernardo D, Lorenz D, Collins J. Inferring genetic networks and identifying compound mode of action via expression profiles. Science. 2003;301:102–105. [PubMed] 21. 
diBernardo D, Thompson M, Gardner T, Chobot S, Eastwood E, Wojtovich A, Elliott S, Schaus S, Collins J. Chemogenomic profiling on a genome-wide scale using reverse-engineered gene networks. Nat. Biotechnol. 2005;3:377–383. 22. Shevade S, Keerthi S. A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics. 2003;19:2246–2253. [PubMed] 23. Bonneau R, Reiss D, Shannon P, Facciotti M, Hood L, Baliga N, Thorsson V. The inferelator: an algorithm for learning parsimonious regulatory networks from systems-biology data sets de novo. Genome Biol. 2006;7:R36. [PubMed] 24. Tibshirani R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B. 1996;58:267–288. 25. Behrens J, vonKries J, Khl M, Bruhn L, Wedlich D, Grosschedl R, Birchmeier W. Functional interaction of bold β-catenin with the transcription factor LEF-1. Nature. 1996;328:638–642. [PubMed] 26. Stewart V, Bledsoe P. Fnr-, NarP- and Narl-dependent regulation of transcription initiation from the Haemophilus influenzae Rd napF (Periplasmic Nitrate Reductase) promoter in Escherichia coli K-12. J. Bacteriol. 2005;187:6928–6935. [PubMed] 27. Long J, Roth M. Synthetic microarray data generation with RANGE and NEMO. Bioinformatics. 2008;24:132–134. [PubMed] 28. Salgado H, Gama-Castro S, Peralta-Gil M, Diaz-Peredo E, Sanchez-Solano F, Santos-Zavaleta A, Martinez-Flores I, Jimenez-Jacinto V, Bonavides-Martinez C, Segura-Salazar J, et al. Regu-lonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions. Nucleic Acids Res. 2006;34:D394. [PubMed] 29. Gray R. Entropy and Information Theory. New York, NY, USA: Springer-Verlag; 1990. 30. Steuer R, Kurths J, Daub CO, Weise J, Selbig J. The mutual information: detecting and evaluating dependencies between variables. Bioinformatics. 2002;18:S231–S240. [PubMed] 31. Daub C, Steuer R, Selbig J, Kloska S. 
Estimating mutual information using B-spline functions – an improved similarity measure for analysing gene expression data. BMC Bioinformatics. 2004;5:118. [PubMed] 32. Cohen JPC, West S, Aiken L. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ, USA: Lawrence Erlbaum Associates; 2003. 33. Hucka M, Bolouri H, Finney A, Sauro H, Doyle JKH, Arkin A, Bornstein B, Bray D, Cornish-Bowden A, Cuellar A, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19:524–531. [PubMed] 34. Shannon P, Markiel A, Ozier O, Baliga N, Wang J, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. [PubMed] 35. Bar-Joseph Z. Analyzing time series gene expression data. Bioinformatics. 2004;20:2493–2503. [PubMed] 36. Affymetrix. Affymetrix Microarray Suite User Guide, version 4. Santa Clara, CA, USA: Affymetrix; 1999. 37. Sabatti C, Rohlin L, Oh M, Liao J. Co-expression pattern from DNA microarray experiments as a tool for operon prediction. Nucleic Acids Res. 2002;30:2886–2893. [PubMed] 38. Dongarra J, Bunch J, Moler C, Stewart P. LINPACK User's Guide. Philadelphia, PA, USA: SIAM; 1979. 39. Altman D, Bland J. Statistics notes: diagnostic tests 1: sensitivity and specificity. Br. Med. J. 1994;308:1552. [PubMed] 40. Altman D, Bland J. Statistics notes: diagnostic tests 2: predictive values. Br. Med. J. 1994;309:102. [PubMed] 41. Faith J, Driscoll M, Fusaro V, Cosgrove E, Hayete B, Juhn F, Schneider S, Gardner T. Many microbe microarrays database: uniformly normalized Affymetrix compendia with structured experimental metadata. Nucleic Acids Res. 2008;36:D866–D870. [PubMed] 42. Irizarry R, Hobbs B, Collin F, Beazer-Barclay Y, Antonellis K, Scherf U, Speed T. 
Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003;4:249–264. [PubMed] 43. Karp P, Riley M, Saier M, Paulsen I, Collado-Vides J, Paley S, Pellegrini-Toole A, Bonavides C, Gama-Castro S. The EcoCyc DataBase. Nucleic Acids Res. 2002;30:56–58. [PubMed] 44. Isalan M, Lemerle C, Michalodimitrakis K, Horn C, Beltrao P, Raineri E, Garriga-Canut M, Serrano L. Evolvability and hierarchy in rewired bacterial gene networks. Nature. 2008;452:840–845. [PubMed] 45. Price M, Huang K, Alm E, Arkin A. A novel method for accurate operon predictions in all sequenced prokaryotes. Nucleic Acids Res. 2005;33:880–892. [PubMed] 46. Reiss D, Baliga N, Bonneau R. Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks. BMC Bioinformatics. 2006;7:280. [PubMed] 47. Mordelet F, Vert J-P. SIRENE: supervised inference of regulatory networks. Bioinformatics. 2008;24:i76–i82. [PubMed] 48. Sprinzak D, Elowitz M. Reconstruction of genetic circuits. Nature. 2005;438:443–448. [PubMed] |