Copyright © 2008 Axelsen et al; licensee BioMed Central Ltd.

One hub-one process: a tool based view on regulatory network topology

1 Centro de Astrobiología, Instituto Nacional de Técnica Aeroespacial, Ctra de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
2 Department of Theoretical Physics, Umeå University, 901 87 Umeå, Sweden
3 Center for Models of Life, Niels Bohr Institute, Blegdamsvej 17 DK-2100 Copenhagen Ø, Denmark

Corresponding author.
Jacob Bock Axelsen: bockaj/at/inta.es; Sebastian Bernhardsson: sebbeb/at/tp.umu.se; Kim Sneppen: sneppen/at/nbi.dk

Received August 17, 2007; Accepted March 4, 2008.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

The relationship between the regulatory design and the functionality of molecular networks is a key issue in biology. Modules and motifs have been associated with various cellular processes, thereby providing anecdotal evidence for performance-based localization on molecular networks.

Results

To quantify the structure-function relationship we investigate similarities of proteins that are close in the regulatory network of the yeast Saccharomyces cerevisiae. We find that the topology of the regulatory network shows only weak remnants of its history of network reorganizations, but strong features of co-regulated proteins associated with similar tasks. These functional correlations decrease strongly when one considers proteins separated by more than two steps in the regulatory network. The network topology primarily reflects the processes that are orchestrated by each individual hub, whereas there are nearly no remnants of the history of protein duplications.
Conclusion

Our results suggest that local topological features of regulatory networks, including broad degree distributions, emerge as an implicit result of matching a number of needed processes to a finite toolbox of proteins.

Background

Contemporary systems biology has provided us with a large amount of data on the topology of molecular networks, thereby giving us glimpses into computation and signaling in living cells. It has been found that 1) regulatory networks have broad out-degree distributions [1,2], 2) transcriptional regulatory networks contain many feed-forward motifs [3], and 3) highly connected hubs are often found on the periphery of the network [4]. These findings are elements in understanding the topology of existing molecular networks as the result of an interplay between evolution and the processes they orchestrate in the cell.

In this paper we consider properties of proteins from the perspective of how they are positioned relative to each other in the network. This is in part motivated by the existence of highly connected proteins (hubs) and their relation to soft modularity [4,5] in regulatory networks. In particular one may envision broad degree distributions and possible isolation of hubs as a reflection of a local "information horizon" [6] with partial isolation between different biological processes. We here address this problem by considering the yeast regulatory network [7] with regard to protein properties. Using the Gene Ontology (GO) Consortium annotations [8] we will show that locality in the regulatory network is primarily associated with locality in biological process, and only weakly related to the functional abilities of a protein.

Results

Figure 1
More precisely, a GO-graph is an acyclic directed graph which organizes proteins according to a predefined categorization. A lower-ranking protein in a GO-graph shares large-scale properties with higher-ranking proteins, but is more specialized. In the GO-database, proteins are categorized into three networks according to different annotations, ranking known gene products by, respectively: P) biological process, F) functional ability/design of the protein and C) cellular component where the protein is physically located. For each of these three ways of categorization we examined two distinct ways to measure GO annotation difference (see box in Fig. 1)

Figure 2
In particular Fig. 2(a) Figure 2(b) In all the panels in Fig. 2

Figure 3
From Fig. 3(a) In Fig. 3(b)

Discussion

Protein regulatory networks are highly functional information-processing systems, evolved to perform a diverse set of tasks in a close to optimal way. It is no surprise that they are not random, even in ways that can be detected without knowing much about what actually goes on in the living system they regulate. However, we do not, a priori, know much about the relative importance of function versus history: Is the topology of a network primarily governed by the processes it directs, or is its topology influenced by random gene duplications [9,10] and "link" rewirings [11]?

Concerning gene duplications [9,12-19], we detected 581 paralogous pairs among the 848 gene products in YPD (see Methods). Of these 581 pairs, only ~15% significantly retained their common regulator, and only ~0.6% of the protein pairs at distance l = 2 are detectable paralogs. Therefore the contribution from duplication events to any GO-similarity within hubs can be ignored. Our analysis in Figs. 2

In any case we emphasize that we primarily find GO-processes localized on hubs, and only weak correlations of the functional abilities between proteins involved in the same process. The idea that process similarity is associated with network localization is not new, and is implicitly behind attempts to infer gene networks from similarity in gene expression [20]. In the supplement we use gene expression from micro-arrays to re-investigate the correlation between process and locality in the regulatory network. Thereby, we provide broader support for our findings, and present a quantitative illustration of the extent to which gene-expression studies can be used to deduce co-regulation.

Support for the ubiquity of the "one hub-one process" association is also found in the fact that the likelihood that a regulatory protein is essential is nearly independent of how many proteins it regulates [2].
That is, the question of whether a null mutant of a certain protein is viable is keyed to the essentiality of the regulated process, and not to whether the process needs many or few different "tools" to be performed.

Conclusion

Overall we suggest that the topology of the yeast regulatory network is governed by processes located on hubs, each consisting of a number of tools in the form of proteins with quite different functional abilities. This is consistent with a network evolution where gene duplication occurs, but where rewiring of regulatory links plays a bigger role [14,19,21-23]. The regulatory network is designed to co-regulate processes, and its evolutionary history must include a bias towards hub-regulation of individual processes. Degree distributions are not broad because of duplication events, but because a given biological task sometimes needs many, but typically requires few, tools.

Finally, our analysis has consequences for the development of null models for network topologies, and thereby for identifying functionally important network motifs [3]. While the previous null model [4] maintains the in- and out-degrees of each protein, it ignores correlations associated with cellular process. When nearby proteins are associated with the same processes one statistically expects an increased probability for cliques [24,25]. We therefore expect that some of the many feed-forward loops in transcription networks [3] will be explained by a new type of null model: a null model where proteins contributing to a given process are forced to remain close in the randomized network.

Methods

The GO-annotations are used without any filtering. This does not preclude bias introduced from using inferred annotations. Of the 848 genes in the YPD, 52 are not annotated and were thus not included in the analysis. 142 genes have more than one molecular function, 314 genes take part in more than one cellular component and 463 genes participate in more than one biological process.
To accommodate this the analysis was carried out by choosing the annotations which minimized the mutual distance for each pair of proteins. This choice maximally resolves significant signals, since we minimize the effect of the finite size of the GO-tree, and in the case of no signal this choice introduces no bias.

Of the 848 gene products in YPD, we found 581 paralogous pairs using BLASTP with an E-value cutoff of 10^-10 [14,26]. For the YPD network 132 of these paralogous pairs are at distance l = 2. This should be compared to a null expectation of 50 ± 6 paralogous pairs at l = 2 found by randomizing the YPD network while keeping in- and out-degrees [4]. Therefore at most 132 - 50 = 82 of the paralogous pairs are in the same hub due to their history of common origin. This corresponds to 82/581 ~15% of duplicated proteins in YPD. The excess of 82 paralogous pairs at distance 2 should also be compared to the total of 13554 protein pairs that the YPD network has at distance l = 2. Thus only ~0.6% of all protein pairs at l = 2 are detectable paralogs.

As seen in our Additional file 1, we reach the same basic conclusion of hubs being functionally isolated using a completely different approach based on gene expression data. Analyzing micro-array data from 482 stress experiments from Saccharomyces Genome Database [27] and managing the false discovery rate as in [28] we indeed find localization of perturbations on our regulatory network. Thus the appendix supports the robustness of our results to an independent categorization of protein processes.

Authors' contributions

All authors contributed equally to this work. All authors read and approved the final manuscript.

Additional file 1

Correlating microarray data of stress conditions with the YPD. Using 465 microarrays of stress conditions for S. cerevisiae, from Stanford Genome Database, we perform a statistical analysis showing that functions are localized in the regulatory network.
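The paralog counts quoted in the Methods can be checked with a few lines of arithmetic. The sketch below only reuses the numbers given in the text; the function name `paralog_excess` and the variable names are illustrative assumptions and are not part of the original analysis. Note that the first ratio evaluates to roughly 14%, which the text quotes as ~15%.

```python
# Values copied from the Methods; all names are hypothetical and for illustration only.
OBSERVED_AT_L2 = 132         # paralogous pairs at distance l = 2 in the YPD network
NULL_MEAN_AT_L2 = 50         # mean over degree-preserving randomizations (50 +/- 6)
TOTAL_PARALOG_PAIRS = 581    # paralogous pairs detected with BLASTP
TOTAL_PAIRS_AT_L2 = 13554    # all protein pairs at distance l = 2


def paralog_excess(observed, null_mean, total_paralog_pairs, total_pairs):
    """Return the excess over the null expectation and the two ratios used in the text."""
    excess = observed - null_mean
    frac_of_paralogs = excess / total_paralog_pairs   # share of all paralogous pairs
    frac_of_pairs = excess / total_pairs              # share of all pairs at l = 2
    return excess, frac_of_paralogs, frac_of_pairs


excess, frac_paralogs, frac_pairs = paralog_excess(
    OBSERVED_AT_L2, NULL_MEAN_AT_L2, TOTAL_PARALOG_PAIRS, TOTAL_PAIRS_AT_L2
)
print(f"excess paralogous pairs at l = 2: {excess}")
print(f"fraction of paralogous pairs:     {100 * frac_paralogs:.1f}%")
print(f"fraction of all pairs at l = 2:   {100 * frac_pairs:.1f}%")
```

Subtracting the null expectation before forming the ratios gives an upper bound on the pairs that share a hub because of common origin, which is how the text arrives at "at most" 82 pairs.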
Acknowledgements

We acknowledge the support from the Danish National Research Foundation through "Center for Models of Life" at the Niels Bohr Institute. KS and JBA wish to thank the Lundbeck Foundation. JBA wishes to thank The Eva and Henry Frænkel Memorial Foundation.

References