Copyright © 2008 Marr et al; licensee BioMed Central Ltd.

Dissecting the logical types of network control in gene expression profiles

1 Computational Systems Biology Group, Jacobs University, Campus Ring 1, 28759 Bremen, Germany
2 Molecular Genetics Group, Jacobs University, Campus Ring 1, 28759 Bremen, Germany
3 Institute for Bioinformatics and Systems Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, 85764 Neuherberg, Germany
Corresponding author. # Contributed equally.
Carsten Marr: c.marr/at/jacobs-university.de; Marcel Geertz: m.geertz/at/jacobs-university.de; Marc-Thorsten Hütt: m.huett/at/jacobs-university.de; Georgi Muskhelishvili: g.muskhelishvili/at/jacobs-university.de

Received September 25, 2007; Accepted February 19, 2008.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background
In the bacterium Escherichia coli the transcriptional regulation of gene expression involves both dedicated regulators, which bind specific DNA sites with high affinity, and global regulators: abundant DNA architectural proteins of the bacterial nucleoid that bind multiple sites with a wide range of affinities and thereby modulate the superhelical density of DNA. The first form of transcriptional regulation is predominantly pairwise and specific, representing digital control, while the second form is continuous in strength and distribution, representing analog control.

Results
Here we look at the properties of effective networks derived from significant gene expression changes under variation of the two forms of control. We find that upon limitation of one type of control (caused e.g. by mutation of a global DNA architectural factor) the other type can compensate for compromised regulation. Mutations of global regulators significantly enhance the digital control, whereas in the presence of global DNA architectural proteins regulation is mostly of the analog type, coupling spatially neighboring genomic loci. Taken together, our data suggest that the two logically distinct types of control, digital and analog, balance each other.

Conclusion
By revealing two distinct logical types of control, our approach provides basic insights into both the organizational principles of transcriptional regulation and the mechanisms buffering genetic flexibility. We anticipate that the general concept of distinguishing logical types of control will apply to many complex biological networks.

Background

One important objection to Lamarckian evolution by inheritance of acquired characteristics, emphasized by Bateson over forty years ago, is the reduction of adaptational flexibility upon progressive specialization, which necessitates genotypic changes that compensate for this limitation [1]. In unicellular organisms such as bacteria, in keeping with Bateson's prediction, the same acquired mutations that are beneficial in one environment can be restrictive in another [2]. At the same time, evolving Escherichia coli populations can demonstrate remarkable flexibility in genetic adaptation [3]. The mechanisms sustaining this flexibility remain unclear.

In order to understand this genetic flexibility it is essential to decipher the organizational logic of transcriptional control. For the classical model organism E. coli, the largest electronically accessible network integrating the data on the transcriptional regulation of genes is available [4].
The interlinked elements form a complex structure, which is essentially of digital nature. Digital refers here to the fact that the network provides static information on the connections between unique, discontinuous components [5], e.g. a particular pair of regulating and regulated genes. Notably, such pairwise connections are not necessarily reflected in genomic expression profiles [6,7], indicating that not all the interactions given in the network occur at all times. Furthermore, this type of network does not account for the analog mode of gene regulation via alterations of DNA topology, a long-known control mechanism revived by recent DNA microarray analyses [8-10]. Analog refers here to the fact that the expression of specific genes is under the control of continuous information provided by spatial distributions of supercoiling energy in the genome [11]. Indeed, transcriptional responses to alterations of DNA superhelicity reveal non-trivial spatial patterns, raising new questions on the coordination of genomic transcription [9,11], and the interplay between chromosomal organization and patterns in gene expression is now becoming the focus of computational analyses [12,13].

These considerations imply that a holistic theory of transcriptional regulation has to include the relationships between these two logically distinct types of information (digital-binary and analog-continuous) and therefore has to distinguish them in the first place. Although other mechanisms of gene regulation between the binary and continuous extremes can be considered, for understanding the organizational principles of transcriptional regulation we assume a working model in which the impacts of the two distinct logical types of control, one digital and one analog, are clearly distinguished and related to each other.
In the following, we translate the patterns in gene expression changes observed under systematic variation of the two types of control into effective networks and study their connectivity. The effective networks are derived as subnetworks of two larger (static) networks: (1) the transcriptional regulatory network based upon the action of dedicated transcription factors; (2) a network defined by the spatial proximity of two genes on the circular chromosome. We statistically compare the properties of these effective networks with those obtained by random sampling of the static networks with the same number of expression changes. The core quantity derived from these comparisons is the ratio of connected to isolated nodes (control ratio) and, furthermore, its z-score with respect to the random networks. We denote this z-score the confidence level of the particular control type (control type confidence, CTC).

Results

In this study we aim at understanding the relationships between the digital and analog types of control in transcriptional regulation by using the model system of exponentially growing E. coli cells. The rationale is to investigate transcript profiles obtained under conditions where we either modulate the analog component of regulation under constant digital control, or modulate the digital component while keeping the analog control constant.

We modulate the analog component by experimentally varying the negative superhelical density (-σ) of chromosomal DNA within the same genetic background (i.e. with a constant digital TRN). Such variation of -σ is carried out within three genetic backgrounds: the wild type E. coli and two mutant strains each lacking one of the two abundant DNA architectural proteins, either FIS or H-NS. These comparisons produce the so-called intra-strain transcript profiles [11] (see Figure 1b).
The transcriptional regulatory network (TRN) of E. coli is the basis of many recent studies on network architecture [14,15], as well as on the consistency of the network with expression profiles [6,7]. To assess the impact of digital-type control we analyze subnets of the TRN of E. coli spanned by genes with significantly changed expression in our three intra-strain transcript profiles, the effective TRNs (Figure 2).

The schematics used in Figure 1a
Discussion

A unifying approach that combines the data derived by different methodologies is essential for understanding the basic organizational principles of transcriptional regulation, especially since transcriptional sub-networks with organizationally distinct architectures have recently been described [20]. In this study we dissect the logical types of information derived by two established methodologies for studying transcriptional regulation, based either on TRN analyses or on analyses of the transcriptional supercoiling response of genomic expression patterns. We denote the information retrieved by assessing directional interactions between the genes in the TRN as digital, and the information retrieved by assessing the influence of superhelical density on expression patterns as analog. This dissection enables us to present a generic approach that both distinguishes two logically distinct types of transcriptional control and assesses the relationships between them.

Using this approach we demonstrate that variation of the analog component of regulation (changing DNA superhelicity) effectively exposes the contribution of digital-type control (represented by the TRN) to transcriptional regulation, which is significantly increased in E. coli strains lacking global DNA architectural proteins. In turn, alterations of the digital component (changing the TRN by deleting hubs) expose a substantial contribution of analog-type control (approximated by the GPN) to transcriptional regulation in wild type cells. Since the digital and analog types of control are constituents of a single transcriptional regulatory system of the cell, our data suggest that these two logically distinct types of control balance each other, such that upon limitation of one type of control (caused e.g. by mutation of a global DNA architectural factor) the other type can compensate for compromised regulation (Figure 5).
While this network is intimately involved in the spatial organization of transcription in E. coli, spatial organization of transcription is observed in both prokaryotes and eukaryotes [21,22]. In E. coli this phenomenon can be readily rationalized on the basis of topological domains of variable size underlying the organization of the bacterial chromosome [23-25]. Indeed, both FIS and H-NS have been directly implicated in the formation of topological barriers to supercoil diffusion [26]. Thus the preponderance of analog-type control in the wild type cells compared to mutants lacking FIS and H-NS (see Figure 5)

One prediction from the observed interdependence between digital and analog types of transcriptional control is that adaptive mutations in E. coli will affect the determinants of global DNA architecture. Indeed, a recent study of long-term experimental evolution in E. coli, which unmasked DNA topology as a key target for selection, identified fitness-enhancing mutations in topoisomerase and fis genes [27]. Furthermore, such "evolved" populations possess high adaptational flexibility [3]. We propose that the buffering of transcriptional regulation by the balancing effects of analog and digital types of control can counteract the reduction of adaptational flexibility caused by the accumulation of mutations in bacteria [2]. In this respect it is revealing that fis is a relatively late acquisition in bacterial evolution [28], whereas H-NS is implicated in regulating "adaptive" gene rearrangements and minimizing the cost of competitive fitness during horizontal gene transfer [19,29].

Conclusion

We believe that the general concept of distinguishing logical types of control developed in this study will apply to many complex biological networks. We also emphasize that, based on our data, a reinterpretation of the interactions contained in the E. coli TRN database RegulonDB with respect to both their digital and their analog control characteristics (for example, consideration of the supercoiling sensitivity of the genes) might be a worthwhile extension of this database.

Methods

Microarray and network data
Transcript profiling for wild type, fis and hns LZ strains was carried out using the E. coli K12 V2 OciChip™ DNA microarray. The genetically engineered E. coli LZ41 and LZ54 strains contain drug-resistant topoisomerase gene alleles that make it possible to selectively inhibit either DNA gyrase or topoisomerase IV activity and thereby induce either relaxation or high negative supercoiling, respectively [30]. The fis and hns mutants of the LZ41 and LZ54 strains were obtained by phage P1 transduction. Introduction of the fis and hns mutations in the LZ41 and LZ54 strains does not substantially alter the global supercoiling response to drug (norfloxacin) addition [11]. Each experiment was performed as two biological replicates with two technical replicates each, resulting in 28 cDNA microarray hybridisations. Scanned array images were quantified and normalized by applying a LOWESS (locally weighted scatterplot smoothing) algorithm to the data within print-tip groups using the TM4 software package [31]. A one-class t-test was applied to replicated experiments to obtain genes with significantly changed expression. For all results presented in our article, we used a significance level α = 0.05. However, we find that the results remain unaffected over a wide range of significance levels (0.05 > α > 0.02). DNA microarray data sets have been deposited in the Array Express data bank with the accession number E-TABM-86. For a detailed description and analysis of the DNA microarray data see [11].
TRN construction
Before the construction of effective TRNs, dimeric regulatory gene identifiers in the microarray data (flhC, flhD; gatR_1, gatR_2; hupA, hupB; ihfA, ihfB; rcsA, rcsB) were replaced by unique RegulonDB identifiers (flhCflhD; gatR_1gatR_2; hupAhupB; ihfAihfB; rcsArcsB). The effective TRN subnet of a DNA microarray transcript profile is the set of affected genes in the TRN together with their regulatory interactions contained in RegulonDB (see Additional file 1 for edge lists of the resulting effective TRNs). Connected components of an effective TRN emerge if both the regulating and the regulated genes are affected in the transcript profile (see subnet analyses and Figure 2).

GPN construction
Before GPN subnet construction, the inter-strain transcript profile data were split into genes with positive and genes with negative log ratios. Genes with positive log ratios have high transcript levels in the wild type background; genes with negative log ratios have high transcript levels in the fis or hns mutant background. GPN subnets of the split DNA microarray transcript profiles were generated based on the genomic position of affected genes together with the proximity threshold t, given in nucleotide bases (b). All affected genes with a spatial distance (measured here between ORF start and stop positions) below the selected proximity threshold t were considered as connected. GPN subnets were generated for a meaningful range of 1 b < t < 10 kb, resulting in connected genes within an operon scale at t ≈ 10 b, up to completely connected GPNs for t > 10 kb. Connected and unconnected subnet components were further analysed [see Additional file 2].

Subnet analyses
For each subnet, the control ratio R was calculated as the number of connected nodes $N_{\mathrm{connected}}$ (i.e. the size of the connected subnet component) over the number of isolated nodes $N_{\mathrm{isolated}}$ (i.e. the size of the unconnected subnet component), $R = N_{\mathrm{connected}} / N_{\mathrm{isolated}}$.
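The subnet analysis can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `gpn_edges`, `control_ratio` and `ctc` are hypothetical, the toy data are invented, and several simplifications are assumed. Gene positions are treated as linear coordinates (the wrap-around of the circular chromosome is ignored), the distance between two genes is taken as the gap between the stop of one ORF and the start of the next, self-loops are ignored, and the null model simply draws the same number of nodes uniformly at random from the static network, in the spirit of the random sampling used for the z-score (CTC) of the control ratio.

```python
import math
import random
import statistics


def gpn_edges(orfs, t):
    """Link affected genes whose ORFs lie closer than t bases (assumption: linear genome).

    orfs maps a gene name to its (start, stop) position in bases.
    """
    genes = sorted(orfs, key=lambda g: orfs[g][0])
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            gap = orfs[h][0] - orfs[g][1]  # assumed distance: next start minus previous stop
            if gap < t:
                edges.append((g, h))
    return edges


def control_ratio(affected, edges):
    """R = N_connected / N_isolated for the subnet spanned by the affected genes."""
    affected = set(affected)
    connected = set()
    for a, b in edges:
        # an edge counts only if both endpoints are affected (self-loops ignored)
        if a != b and a in affected and b in affected:
            connected.update((a, b))
    n_isolated = len(affected) - len(connected)
    if n_isolated == 0:
        return math.inf  # no isolated nodes: the ratio is unbounded
    return len(connected) / n_isolated


def ctc(affected, edges, nodes, runs=10000, seed=0):
    """z-score of R against randomly placed affected nodes (assumed null model)."""
    rng = random.Random(seed)
    pool = sorted(nodes)
    k = len(set(affected))
    null = [control_ratio(rng.sample(pool, k), edges) for _ in range(runs)]
    if any(math.isinf(r) for r in null):
        return math.nan  # z-score undefined if a random draw has no isolated nodes
    mean = statistics.fmean(null)
    sd = statistics.stdev(null)
    if sd == 0:
        return math.nan
    return (control_ratio(affected, edges) - mean) / sd


if __name__ == "__main__":
    # toy ORF positions, invented for illustration only
    orfs = {"x": (100, 400), "y": (450, 900), "z": (5000, 5600)}
    edges = gpn_edges(orfs, t=100)
    print(edges, control_ratio(orfs, edges))
```

The same `control_ratio` and `ctc` functions apply to a TRN if `edges` is the list of (regulator, target) pairs and `nodes` is the set of all genes in the static network; only the edge construction differs between the two network types.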
The control type confidence, CTC, is the z-score of R, calculated from the mean of R and its standard deviation obtained from 10000 runs of the corresponding null model. In the case of the digital null model, the same number of affected nodes was mapped randomly onto the TRN (see Figure 2). The robustness of the calculated ratios and CTCs was verified by replacing 10% of the data at random with data of all affected genes from the remaining DNA microarray sets (see Figure 3).

Abbreviations
CTC, control type confidence; GPN, gene proximity network; TRN, transcriptional regulatory network.

Authors' contributions
CM, MG, MTH, and GM conceived the study. CM and MG analyzed the data. CM, MG, MTH, and GM wrote the paper. All authors read and approved the final manuscript.

Additional file 1
Dataset S1. Gene identifiers and corresponding edge lists of the seven directed effective TRNs emerging from the analysis of the seven transcript profiles.

Additional file 2
Dataset S2. Gene identifiers and corresponding edge lists of the eight undirected GPNs emerging from the separate analysis of the four inter-strain transcript profiles with positive and negative log ratios, respectively.

Acknowledgements
CM was supported by a grant of the Darmstadt University of Technology. MG is supported by the DFG grant DFG-MU-2FIS.