Nucleic Acids Res. Jul 1, 2008; 36(Web Server issue): W452–W459.
Published online May 6, 2008. doi:  10.1093/nar/gkn230
PMCID: PMC2447774

GraphWeb: mining heterogeneous biological networks for gene modules with functional significance

Abstract

Deciphering heterogeneous cellular networks with embedded modules is a great challenge of current systems biology. Experimental and computational studies construct complex networks of molecules that describe various aspects of the cell such as transcriptional regulation, protein interactions and metabolism. Groups of interacting genes and proteins reflect network modules that potentially share regulatory mechanisms and relate to common function. Here, we present GraphWeb, a public web server for biological network analysis and module discovery. GraphWeb provides methods to: (i) integrate heterogeneous and multispecies data for constructing directed and undirected, weighted and unweighted networks; (ii) discover network modules using a variety of algorithms and topological filters and (iii) interpret modules using functional knowledge of the Gene Ontology and pathways, as well as regulatory features such as binding motifs and microRNA targets. GraphWeb is designed to analyse individual or multiple merged networks, search for conserved features across multiple species, mine large biological networks for smaller modules, discover novel candidates and connections for known pathways and compare results of high-throughput datasets. GraphWeb is available at http://biit.cs.ut.ee/graphweb/.

INTRODUCTION AND BACKGROUND

One of the greatest challenges of biomedical research is to understand the organization and function of living organisms at the molecular level. Experimental and computational data reveal complex networks that consist of genes and proteins as nodes and associations as edges (1–3). While describing different aspects of the cell, these networks appear to share universal structural properties like log-linear distribution of connections and small-world reachability (4,5). Within networks, modules of tightly interacting genes and proteins are believed to make up functional units responsible for processes in the cell (6). For instance, collections of protein–protein interactions (PPI) form networks of physically binding proteins, where modules reflect protein complexes or signalling pathways (7,8). Gene expression measures, transcription regulator binding data, cis-regulatory motif discovery and conservation information are combined to uncover transcription regulatory networks with modules of transcription factors (TFs) and target genes (9–12). From a slightly different angle, text-mining methods extract knowledge-based webs and co-occurring modules of genes and proteins from scientific literature (13).

Biological network analysis poses the following computational challenges. The strategies need to take into account the myriad of cellular interactions that may be directed (e.g. TF–gene interaction) or undirected (e.g. PPI), involve quantitative values (e.g. gene expression correlation) or appear in multiple datasets (e.g. co-expression and physical interaction) (14). Combining different cellular domains requires data integration to deal with various biomolecules and experimental measurements (15). Module detection involves algorithms that identify nodes with special topological features or search for densely connected areas (16). Biological interpretation of modules comprises functional analysis using resources such as the Gene Ontology (GO) (17) and detection of significantly enriched biological processes, functions and cellular locations (18).

The growing interest in networks and systems biology has increased the need for computational and visual methods for network analysis, and as a result, several useful tools have been published. Notable software libraries include AT&T Graphviz for visualization and C++ Boost for graph structures and algorithms, packaged into Bioconductor by Carey and colleagues (19). Cytoscape is a popular software package for visual analysis of biological networks (20). A number of plugins complement Cytoscape with analytical features such as microarray data integration, dense subgraph detection (21) and GO-term enrichment analysis (22). Osprey focuses on visualization (23), while VisANT also provides topological analysis and functional annotation of nodes (24). MATISSE is useful for mapping high-throughput datasets onto network topologies and detecting gene modules using a number of algorithms (25). BiologicalNetworks is a network retrieval, construction and visualization tool with an emphasis on microarray data (26). BioPIXIE provides a gene-based query engine and GO analysis for a precomputed heterogeneous network for Saccharomyces cerevisiae (27). NetworkBLAST allows the user to align and compare two networks of different species through user-provided sequence similarity measures to discover conserved protein complexes (28).

We have identified open questions in the field of biological network analysis. There is a lack of simple ‘point-and-click’ web servers that allow biological data integration and discovery of modules. Some of the available tools involve no biological background information and force the user to put great effort into integrating datasets, linking molecules and retrieving functional annotations, while others constrain the analysis to some pre-calculated network of a specific model organism. Module detection is frequently limited to neighbourhood search of gene lists or topological analysis such as node connectivity. Both Cytoscape and VisANT implement functionality for analysing high-throughput networks, detecting modules and enriched biological features. However, we believe that there is a need for web-based resources that analyse heterogeneous datasets with mixed collections of genes and proteins, detect various types of modules and provide a rich interface for functional annotation. Moreover, there is little support for the analysis and integration of multispecies data using automatic orthology mapping. With the development of the GraphWeb server, we wish to contribute to the network challenge and propose new solutions to the above questions.

THE GraphWeb SERVER

GraphWeb (http://biit.cs.ut.ee/graphweb, Figure 1) is a public web server for graph-based analysis of cellular networks that:

  1. analyses directed and undirected, weighted and unweighted heterogeneous networks of genes, proteins and microarray probesets for 35+ eukaryotic genomes;
  2. integrates multiple diverse datasets into global networks;
  3. incorporates multispecies data using gene orthology mapping;
  4. filters nodes and edges based on dataset support, edge weight and node annotation;
  5. detects gene modules from networks using a collection of algorithms;
  6. interprets discovered modules using GO, pathways and cis-regulatory motifs.

Figure 1.
GraphWeb user interface with data from the case study of human PPI and gene expression (see Results Section for a detailed description). The first module of 33 nodes is shown in Figure 2. User interface legend: (A) data upload, (B) module detection algorithms, ...

Networks in GraphWeb

The primary input of GraphWeb is a combined biological network of a selected species, consisting of genes, proteins or microarray probesets as nodes and corresponding associations as edges. The user may upload the input data as a file or type it into the webform. Genes, proteins and microarray probesets of various databases and platforms are automatically mapped to gene IDs of the Ensembl database (29) using the g:Profiler software (30). Unrecognized and ambiguous IDs may be optionally removed, but remain unchanged by default in order to keep the input networks intact. Associations between nodes may be represented as directed or undirected edges, and weights may be assigned to edges to convey quantitative relations between corresponding nodes. A collection of pre-defined datasets is available for immediate analysis, including PPI from IntAct (31) and HPRD (32), and the S.cerevisiae transcription regulatory network by MacIsaac et al. (33).

Data integration

GraphWeb allows the user to insert and combine different data sources and align these into a global network. Besides its native plain-text format, GraphWeb supports the import of other network files such as SIF, GML, XGMML and BioPAX through the Cytoscape BiNoM plugin (34). Labels can be used to distinguish associations of different sources, and a network score may be assigned to each label to denote the predictive power of corresponding associations. For example, TF-binding networks from ChIP-chip experiments may be combined and aligned with motif discovery results, and scored with predictive values learned from gene expression data.

The integration process first creates a global network that permits several connecting edges between a pair of nodes. This is followed by a label-wise weight normalization that makes associations of different networks comparable. Finally, a linear combination of edge weights $w_{h,i,j}$ and network scores $s_h$ for different labels $h$ is used to rank all connected nodes $i$, $j$:

$$S_{i,j} = \sum_{h} s_h \, w_{h,i,j}$$

The score $S_{i,j}$ is designed to highlight associations with strong evidence from several sources. The user may also choose to create network scores automatically and assign proportionally more power to smaller datasets. This option provides a direct measure for preferring smaller, presumably high-quality networks. GraphWeb only supports the alignment of unambiguous known IDs, since the alignment of ambiguous entities may lead to erroneous networks. Proteins or probes that map to several base gene IDs are treated as independent nodes and corresponding edges are not aligned.
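
The integration steps can be illustrated with a short Python sketch. It is not GraphWeb's implementation: the function name `integrate_networks`, the toy data and in particular the max-based normalization are assumptions, since the text does not state how weights are normalized within a label.

```python
from collections import defaultdict

def integrate_networks(networks, scores):
    """Combine labelled networks into one scored set of undirected edges.

    networks: {label: {(node_a, node_b): weight}}
    scores:   {label: network score s_h}
    Returns {frozenset({a, b}): S_ab}, the sum over labels of
    s_h times the normalized weight of the edge in that label.
    """
    combined = defaultdict(float)
    for label, edges in networks.items():
        # Assumed normalization: divide by the largest weight of the label.
        max_w = max(edges.values())
        for (a, b), w in edges.items():
            # frozenset aligns (a, b) and (b, a) onto the same edge.
            combined[frozenset((a, b))] += scores[label] * (w / max_w)
    return dict(combined)

# Toy input: an unweighted PPI network and a weighted co-expression network.
networks = {
    "ppi": {("A", "B"): 1.0, ("B", "C"): 1.0},
    "coexpr": {("B", "A"): 0.5, ("C", "D"): 1.0},
}
scores = {"ppi": 2.0, "coexpr": 1.0}
result = integrate_networks(networks, scores)
```

In this toy example the edge A–B is supported by both labels and therefore receives the highest combined score, which is the behaviour the score is meant to reward.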

Multispecies networks

GraphWeb provides means to incorporate data from different organisms in order to improve network construction. When the user selects a target organism in the GraphWeb interface, the nodes and corresponding associations of the input are automatically mapped to orthologous genes in the target. The orthology mapping information is retrieved from Ensembl via the g:Profiler software. Resulting ortholog networks can be combined with other datasets of the target organism to highlight conserved associations. Similarly to single-species data integration, GraphWeb ignores ambiguous orthologs in network alignments to avoid noise and misleading results. Such a solution retains the cleanest possible network but undoubtedly results in a certain loss of information.
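
A minimal Python sketch of such a projection is given below. The mapping table and the function name `map_to_orthologs` are hypothetical, and treating "ambiguous" as "not exactly one ortholog in the target" is an assumption made for illustration; GraphWeb itself obtains the mapping from Ensembl via g:Profiler.

```python
def map_to_orthologs(edges, orthologs):
    """Project edges onto a target species, skipping ambiguous orthologs.

    edges:     iterable of (gene_a, gene_b) in the source species
    orthologs: {source_gene: [target_gene, ...]} (hypothetical table)
    """
    mapped = set()
    for a, b in edges:
        targets_a = orthologs.get(a, [])
        targets_b = orthologs.get(b, [])
        # Keep an edge only if both ends have exactly one ortholog.
        if len(targets_a) == 1 and len(targets_b) == 1:
            mapped.add((targets_a[0], targets_b[0]))
    return mapped

# Toy mouse-to-human table; GeneX is ambiguous, GeneY is unmapped.
orthologs = {
    "Cdk1": ["CDK1"],
    "Ccnb1": ["CCNB1"],
    "GeneX": ["HUMAN1", "HUMAN2"],
}
edges = [("Cdk1", "Ccnb1"), ("Cdk1", "GeneX"), ("Ccnb1", "GeneY")]
mapped = map_to_orthologs(edges, orthologs)
```

Only the first edge survives, which mirrors the trade-off described above: the projected network is clean, but two of three associations are lost.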

Graph filtering

GraphWeb filters help the user detect network areas with strong associations. Three types of filters may be used for selecting edges: minimum number of supporting datasets (i.e. labels), lower threshold on edge weights and selection of top-ranking edges. Node filtering excludes unrecognized or ambiguous genes and proteins, while module filtering limits the result to larger modules or those with significant functional enrichments. Filtering techniques are especially useful when incorporating edges from different datasets or species.
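
The three edge filters can be sketched as follows. The edge representation (a tuple carrying a weight and a set of supporting labels) and the parameter names are assumptions chosen for this illustration.

```python
def filter_edges(edges, min_support=1, min_weight=0.0, top_k=None):
    """Select edges by dataset support, weight threshold and rank.

    edges: list of (node_a, node_b, weight, labels) tuples, where
           labels is the set of datasets supporting the edge.
    """
    kept = [
        e for e in edges
        if len(e[3]) >= min_support and e[2] >= min_weight
    ]
    # Rank the surviving edges by decreasing weight.
    kept.sort(key=lambda e: e[2], reverse=True)
    return kept if top_k is None else kept[:top_k]

edges = [
    ("A", "B", 0.9, {"ppi", "coexpr"}),
    ("B", "C", 0.4, {"ppi"}),
    ("C", "D", 0.7, {"ppi", "coexpr", "motif"}),
    ("D", "E", 0.2, {"ppi", "coexpr"}),
]
supported = filter_edges(edges, min_support=2)
strongest = filter_edges(edges, min_support=2, min_weight=0.5, top_k=1)
```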

Gene module discovery

GraphWeb provides a number of methods and algorithms for detecting gene modules in directed and undirected networks. Resulting gene modules may easily be saved for later use or redirected to input for further analysis. GraphWeb identifies the following types of modules.

Connected components

A connected component (Figure 2A) is a group of genes in which every pair of genes (g_i, g_j) is connected either directly (g_i ⌢ g_j) or indirectly via a path of length n (g_1 ⌢ g_2 ⌢ … ⌢ g_n ⌢ g_{n+1}). GraphWeb also supports two extensions of the above: a strongly connected component relates to directed networks and requires connections in both directions, and a biconnected component requires at least two non-overlapping paths. Connected component detection is the first step in studying network structure.
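
For an undirected network, connected components can be found with a breadth-first search, as in the sketch below. This is a generic textbook procedure written for illustration (the function name is an assumption), not the Boost-based code used by the server.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Return the connected components of an undirected edge list,
    largest first, each as a set of nodes."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)

components = connected_components([("A", "B"), ("B", "C"), ("D", "E")])
```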

Figure 2.
The case study: a connected component (A) detected from the combined network for protein interactions and gene expression similarity. The discovered module describes a fragment of the human cell cycle and consists of several smaller modules. Two cyclin-dependent ...

Neighbourhood modules

A neighbourhood module (Figure 2D) is based on a user-defined list of genes and proteins {G} and on a distance d. If d = 0, GraphWeb retrieves modules that consist of nodes G with internal associations inside the list. If d ≥ 1, modules consist of the initial list {G} and nodes connected to the latter via paths of maximum length d. Neighbourhood modules allow the user to study her focus list in a network context, and retrieve related nodes and associations to propose new hypotheses.
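
The following sketch shows one way to extract such a module with a depth-limited breadth-first search. The function name is hypothetical, and returning all edges among the retrieved nodes is an assumption about what the module contains.

```python
from collections import defaultdict, deque

def neighbourhood_module(edges, genes, d):
    """Return (nodes, edges) of the module around a gene list.

    d = 0 keeps only the listed genes and their internal edges;
    d >= 1 adds every node within d steps of any listed gene.
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    # Multi-source BFS: all listed genes start at distance 0.
    dist = {g: 0 for g in genes}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == d:
            continue  # do not expand beyond the distance limit
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)

    nodes = set(dist)
    module_edges = {(a, b) for a, b in edges if a in nodes and b in nodes}
    return nodes, module_edges

chain = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
nodes_d2, edges_d2 = neighbourhood_module(chain, ["A"], d=2)
nodes_d0, edges_d0 = neighbourhood_module(chain, ["A", "B", "D"], d=0)
```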

Hub-based modules

A hub-based module (Figure 2B) consists of a central hub (a node with many connections) and related genes and proteins within distance d. GraphWeb extracts a list of hub-based modules ranked by the central hub degree (number of connections). Hubs in PPI networks have been described in the context of lethality (35), and proteins linking to the same hub often refer to similar function (36). Hub-based modules may also reflect systems of TFs and target genes.
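
A hub-based module list can be sketched by ranking nodes by degree and collecting each hub's surroundings. The function name, the `top_n` parameter and the alphabetical tie-breaking are assumptions of this illustration.

```python
from collections import defaultdict

def hub_modules(edges, top_n=2, d=1):
    """Return (hub, degree, members) for the top_n highest-degree nodes,
    where members are all nodes within distance d of the hub."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    # Rank by decreasing degree; ties are broken by node name.
    hubs = sorted(adj, key=lambda n: (-len(adj[n]), n))[:top_n]
    modules = []
    for hub in hubs:
        members, frontier = {hub}, {hub}
        for _ in range(d):
            # Grow the module by one step from the current frontier.
            frontier = {nb for n in frontier for nb in adj[n]} - members
            members |= frontier
        modules.append((hub, len(adj[hub]), members))
    return modules

modules = hub_modules([("H", "A"), ("H", "B"), ("H", "C"), ("C", "D")])
```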

Cliques

A clique (Figure 2C) is a fully connected module where every pair of nodes is directly connected. Cliques in PPI networks have often been related to protein complexes and common functions (36). Fully connected modules also reflect clusters of co-expressed genes.
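
The text does not name the clique detection algorithm used by GraphWeb. As an illustration only, the sketch below enumerates maximal cliques with the classical Bron–Kerbosch algorithm with pivoting; the `min_size` parameter is an assumption.

```python
from collections import defaultdict

def maximal_cliques(edges, min_size=3):
    """Enumerate maximal cliques of an undirected graph (Bron-Kerbosch
    with pivoting), keeping those with at least min_size nodes."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)

    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already processed nodes.
        if not p and not x:
            if len(r) >= min_size:
                cliques.append(set(r))
            return
        # The pivot with most neighbours in p minimizes branching.
        pivot = max(p | x, key=lambda n: len(adj[n] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques

# Four fully connected nodes plus one pendant edge D-E.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
         ("B", "D"), ("C", "D"), ("D", "E")]
cliques = maximal_cliques(edges)
```

The worst-case running time of clique enumeration grows exponentially with network size, so dense networks are considerably more expensive to process than sparse ones.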

Cluster modules

A cluster module corresponds to a tightly connected group of nodes. GraphWeb provides two network clustering algorithms: the Markov Cluster (MCL) algorithm (37) and Betweenness Centrality Clustering (BCC) (38). These algorithms break networks down into separate modules by removing certain edges, and have been successfully applied in a number of studies, such as protein family detection (39) and essentiality assessment (40). MCL constructs modules of edges that are frequently visited during random walks, while BCC removes paths that act as bridges between separate tightly connected modules. Graph clustering is successful in integrative network analysis since it prefers associations with evidence from multiple datasets, and allows the detection of hybrid modules that combine the characteristics of different module types.
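
The idea behind MCL can be conveyed with a simplified pure-Python sketch that alternates expansion (matrix squaring) and inflation (entry-wise powering followed by column normalization). It uses a dense unweighted matrix with a fixed number of iterations and no pruning, so it is only suitable for tiny examples; the parameter defaults are assumptions and do not correspond to GraphWeb settings or to the MCL implementation by van Dongen.

```python
def mcl(nodes, edges, inflation=2.0, iterations=50):
    """Simplified Markov Cluster sketch; returns a list of node sets."""
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    # Adjacency matrix with self-loops, as is customary for MCL.
    m = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for a, b in edges:
        m[index[a]][index[b]] = m[index[b]][index[a]] = 1.0

    def normalize(mat):
        # Make every column sum to one (column-stochastic matrix).
        for j in range(size):
            total = sum(mat[i][j] for i in range(size))
            for i in range(size):
                mat[i][j] /= total
        return mat

    m = normalize(m)
    for _ in range(iterations):
        # Expansion: two steps of a random walk.
        m = [[sum(m[i][k] * m[k][j] for k in range(size))
              for j in range(size)] for i in range(size)]
        # Inflation: strengthen strong flows, weaken weak ones.
        m = normalize([[v ** inflation for v in row] for row in m])

    # Rows that retain mass are attractors; the columns they attract
    # form one cluster. Identical rows collapse via the set.
    clusters = {
        frozenset(nodes[j] for j in range(size) if m[i][j] > 1e-6)
        for i in range(size)
    }
    return [set(c) for c in clusters if c]

# Two triangles joined by a single bridge C-D.
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"), ("C", "D")]
clusters = mcl(nodes, edges)
```

On this toy graph the flow across the bridge decays with each iteration, so the two triangles are expected to emerge as separate modules.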

Empirical comparisons show that the running time of the above algorithms is generally linear in the number of edges. The NP-complete clique detection problem makes the corresponding algorithm the most computationally expensive method in GraphWeb; it is especially sensitive to dense networks, where a network of 30 nodes and 300 edges requires a computation of nearly 10 min. MCL clustering, on the other hand, takes 10 min to handle a network of nearly 8000 nodes and 300 000 edges using GraphWeb default values. Hub-based modules and connected components are detected even faster.

Module interpretation and evaluation

Interpretation and evaluation is an integral process of module detection in GraphWeb. Once a module has been identified, GraphWeb automatically assesses its biological importance through the known properties of its members using the g:Profiler software. Functional profiling of the module involves statistically enriched annotations of biological processes (bp), cellular locations (cc) and molecular functions (mf) from the GO (17), and related pathways (pw) from the Kyoto Encyclopedia of Genes and Genomes (KEGG) (41) and Reactome (42). Besides functional annotations, the analysis takes into account cis-regulatory motif enrichments from TRANSFAC (43) and miRNA target site enrichments from miRBase (44).

First, g:Profiler applies Fisher's test to evaluate the enrichments of all biological annotations in the module:

$$p_{\alpha} = \sum_{i=k}^{\min(n,K)} \frac{\binom{K}{i}\binom{N-K}{n-i}}{\binom{N}{n}}$$

The test computes the cumulative hypergeometric probability of randomly observing at least k genes with some common annotation α out of the n genes in the module, given the total number of genes N and the total number of genes having the annotation K. g:Profiler uses a 5% multiple testing threshold, g:SCS, which applies a simulation procedure to retrieve only the significant enrichments from a hierarchical annotation structure like GO (45).
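
The cumulative hypergeometric probability can be computed directly from binomial coefficients, as in the sketch below. The function name is an assumption, and the g:SCS multiple testing correction is not reproduced here.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """Probability of observing at least k annotated genes in a module
    of n genes, when K of the N genes in total carry the annotation."""
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)

# Toy numbers: 10 genes in total, 4 annotated, module of 3 genes,
# at least 2 of which are annotated.
p = enrichment_p_value(k=2, n=3, K=4, N=10)
```

For the toy numbers, the favourable outcomes are 6 × 6 + 4 × 1 = 40 out of 120 possible modules, so p equals 1/3.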

Once all enrichments for the module are known, GraphWeb computes an annotation score that sums the total significance relative to module size n:

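[Equation: annotation score of a module, computed from the significance of its enrichments relative to module size n]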

The score is designed to highlight modules with strong size-independent enrichment of functions and regulatory features.

GraphWeb executes on-the-fly functional profiling and scoring of detected modules, displaying the names and P-values of most important discovered features from all the covered functional domains (GO:bp, GO:cc, GO:mf, KEGG:pw, Reactome:pw, TRANSFAC, miRBase). Hyperlinks to g:Profiler allow the user to access related terms and pathways, ortholog mapping and expression similarity search for related genes. In addition, a hyperlink to g:Cocoa at the bottom of the GraphWeb interface sends all discovered modules to comparative functional enrichment analysis.

RESULTS: A CASE STUDY

We present an example case study that demonstrates a possible data integration and module detection pipeline. The analysis concentrates on human cellular networks and involves six high-throughput datasets comprising gene expression values and PPI from public databases. Human PPI data originate from the study by Ramani et al. (46) and the databases HPRD (32) and IntAct (31), and are interpreted as three separate networks. Human expression data are presented as an expression similarity network, computed using Multi Experiment Matrix (MEM) (Adler et al., manuscript in preparation) across nearly 3700 tumour-related samples of 89 public datasets, originating from GEO (47) and ArrayExpress (48). Besides human data, we use orthology mapping to incorporate two datasets for mouse: a MEM gene expression similarity network across 28 datasets and 1700 samples, and the PPI data from IntAct.

Unweighted PPI datasets and weighted expression similarity datasets are aligned into a global weighted network. Integration of the above datasets reveals frequently co-expressed protein complexes such as the ribosome and the proteasome. We applied a strong edge filter with a minimum dataset support of 4, and queried for connected components. The largest resulting component consists of 33 nodes and four notable submodules, is included in known pathways of Reactome and KEGG, and involves strong GO enrichments.

The module plays a significant role in the cell cycle and is well described by PPI as well as gene expression similarity. The two hubs denote cyclin-dependent kinases 1 (CDC2/CDK1) and 2 (CDK2); see Figure 2B for the former module. These kinases control the cell cycle entry to S-phase, while CDK1 also controls the entry to mitosis (49). MCM2-7 proteins form a helicase and five of these connect into a clique (Figure 2C). The neighbourhood of ORC2L and ORC5L partly reveals the origin recognition complex (ORC) (Figure 2D), which temporarily interacts with CDT1 and CDC6 and binds to the helicase to initiate replication in S-phase. Other connected proteins include cell cycle checkpoint controllers (e.g. CHEK1 kinase), inhibitors (GMNN, BIRC5) and cyclins (CCNE1, CCNE2, CCNB1).

The thorough common-knowledge description of the detected module provides support for the techniques proposed in GraphWeb. The rather strong filters applied above naturally extracted a well-studied result out of a large collection of public data. The GraphWeb case study provides a simple example of the possibilities and potential results of analysing novel data or combining it with existing public repertoires.

DISCUSSION

The core data structures and algorithms in GraphWeb render the myriad of molecular entities and corresponding relations, physical connections and regulatory events into a uniform collection of network nodes and connecting edges. On the one hand, this simplification creates an intuitive view of the cellular networks. GraphWeb analysis methods allow the researcher to approach a number of interesting tasks, for example proposing novel members of known pathways by strong ‘guilt by association’ evidence, comparing the results of multiple high-throughput datasets, or finding associations and modules of genes that are conserved in diverse species. On the other hand, looking at topological features, weighted edges and tightly connected groups of nodes may admittedly fail to deliver crucial aspects of biological systems, such as quantitative dependencies and dynamics over time. The greatest advantage of GraphWeb analysis is its relative simplicity and speed in handling complex objects as networks. We therefore believe that GraphWeb also proves useful in detailed network studies, since it allows the user to reduce the complexity of the whole network to the complexity of modules. Such a reduction may then provide access to more elaborate methods of mathematical modelling that are inapplicable to systems larger than a handful of variables.

CONCLUSION

GraphWeb is a publicly available web server for analysing and interpreting complex cellular networks. The server provides methods for integrating heterogeneous datasets into networks of interactions, means to incorporate multispecies data using gene orthology information, algorithms and methods for discovering network modules and functional enrichment analysis for biological interpretation. With the creation of the GraphWeb server, we wish to contribute to the difficult task of deciphering and understanding complex biological networks, and provide a tool with an emphasis on ease of use.

IMPLEMENTATION

The GraphWeb web server is implemented in Perl as a CGI application. Graph structures and algorithms are written in C++ and Perl and are partly based on the Boost Graph Library (http://www.boost.org/). GraphWeb applies the MCL algorithm implementation by van Dongen (37) (http://micans.org/mcl/). Visualization is provided by the AT&T Graphviz graph drawing package (http://www.graphviz.org/) and the SWOG graphical programming language (http://biit.cs.ut.ee/SWOG/).

ACKNOWLEDGEMENTS

The authors wish to thank Dr Nicholas Luscombe and the anonymous reviewers for valuable remarks on the article and software. This work has been supported by the EU FP6 grants ENFIN LSHG-CT-2005-518254 and COBRED LSHB-CT-2007-037730, and Estonian Science Foundation grant ETF7437. J.R. has received funding from the Marie Curie Biostar program and the Tiger University program of the Estonian Information Technology Foundation. Funding to pay the Open Access publication charges for this article was provided by the European Commission (COBRED) project.

Conflict of interest statement: None declared.

REFERENCES

1. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi AL. The large-scale organization of metabolic networks. Nature. 2000;407:651–654. [PubMed]
2. Oltvai ZN, Barabasi AL. Life's complexity pyramid. Science. 2002;298:763–764. [PubMed]
3. Maslov S, Sneppen K. Specificity and stability in topology of protein networks. Science. 2002;296:910–913. [PubMed]
4. Strogatz SH. Exploring complex networks. Nature. 2001;410:268–276. [PubMed]
5. Barabasi AL, Oltvai ZN. Network biology: understanding the cell's functional organization. Nat. Rev. Genet. 2004;5:101–113. [PubMed]
6. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402:C47–C52. [PubMed]
7. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P. Comparative assessment of large-scale data sets of protein–protein interactions. Nature. 2002;417:399–403. [PubMed]
8. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440:631–636. [PubMed]
9. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2003;298:799–804. [PubMed]
10. Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman N. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nature Genetics. 2003;34:166–176. [PubMed]
11. Harbison CT, Gordon DB, Lee TI, Rinaldi NJ, Macisaac KD, Danford TW, Hannett NM, Tagne JB, Reynolds DB, et al. Transcriptional regulatory code of a eukaryotic genome. Nature. 2004;431:99–104. [PMC free article] [PubMed]
12. Tanay A, Regev A, Shamir R. Conservation and evolvability in regulatory networks: the evolution of ribosomal regulation in yeast. PNAS. 2005;102:7203–7208. [PMC free article] [PubMed]
13. Jensen LJ, Saric J, Bork P. Literature mining for the biologist: from information retrieval to biological discovery. Nat. Rev. Genet. 2006;7:119–129. [PubMed]
14. Carter GW. Inferring network interactions within a cell. Brief. Bioinform. 2005;6:380–389. [PubMed]
15. Troyanskaya OG. Putting microarrays in a context: integrated analysis of diverse biological data. Brief. Bioinform. 2005;6:34–43. [PubMed]
16. Aittokallio T, Schwikowski B. Graph-based methods for analysing networks in cell biology. Brief. Bioinform. 2006;7:243–255. [PubMed]
17. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig J, et al. Gene Ontology: tool for the unification of biology. Nat. Genet. 2000;1:25–29. [PMC free article] [PubMed]
18. Khatri P, Draghici S. Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics. 2005;21:3587–3595. [PMC free article] [PubMed]
19. Carey VJ, Gentry J, Whalen E, Gentleman R. Network structures and algorithms in Bioconductor. Bioinformatics. 2003;21:135–136. [PubMed]
20. Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, et al. Integration of biological networks and gene expression data using Cytoscape. Nat. Protocols. 2007;10:2366–2382. [PMC free article] [PubMed]
21. Ideker T, Ozier O, Schwikowski B, Siegel AF. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics. 2002;18:S233–S240. [PubMed]
22. Maere S, Heymans K, Kiper M. BiNGO: a cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics. 2005;21:3448–3449. [PubMed]
23. Breitkreutz BJ, Stark C, Tyers M. Osprey: a network visualization system. Genome Biol. 2003;4:R22. [PMC free article] [PubMed]
24. Hu Z, Ng DM, Yamada T, Chen C, Kawashima S, Mellor J, Linghu B, Kanehisa M, Stuart JM, DeLisi C. VisANT 3.0: new modules for pathway visualization, editing, prediction and construction. Nucleic Acids Res. 2007;W35:W625–W632. [PMC free article] [PubMed]
25. Ulitsky I, Shamir R. Identification of functional modules using network topology and high-throughput data. BMC Systems Biol. 2007;1:8. [PMC free article] [PubMed]
26. Baitaluk M, Sedova M, Ray A, Gupta A. Biological Networks: visualization and analysis tool for systems biology. Nucleic Acids Res. 2006;W34:W466–W471. [PMC free article] [PubMed]
27. Myers CL, Robson D, Wible A, Hibbs MA, Chiriac C, Theesfeld CL, Dolinski K, Troyanskaya OG. Discovery of biological networks from diverse functional genomic data. Genome Biol. 2005;6:R114. [PMC free article] [PubMed]
28. Kalaev M, Smoot M, Ideker T, Sharan R. NetworkBLAST: comparative analysis of protein networks. Bioinformatics. 2008;4:594–596. [PubMed]
29. Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, et al. Ensembl 2007. Nucleic Acids Res. 2007;D35:D610–D617. [PMC free article] [PubMed]
30. Reimand J, Kull M, Hansen J, Peterson H, Vilo J. g:Profiler – a web-based toolset for functional profiling of gene lists from large-scale experiments. Nucleic Acids Res. 2007;W35:W193–W200. [PMC free article] [PubMed]
31. Kerrien S, Alam-Faruque Y, Aranda B, Bancarz I, Bridge A, Derow C, Dimmer E, Feuermann M, et al. IntAct – open source resource for molecular interaction data. Nucleic Acids Res. 2007;D35:D561–D565. [PMC free article] [PubMed]
32. Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK, Surendranath V, Niranjan V, Muthusamy B, Gandhi TKB, et al. Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res. 2003;13:2363–2371. [PMC free article] [PubMed]
33. MacIsaac KD, Wang T, Gordon DB, Gifford DK, Stormo GD, Fraenkel E. An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinform. 2006;7:113. [PMC free article] [PubMed]
34. Zinovyev A, Viara E, Calzone L, Barillot E. BiNoM: a cytoscape plugin for manipulating and analyzing biological networks. Bioinformatics. 2008;6:876–877. [PubMed]
35. Jeong H, Mason SP, Barabasi AL, Oltvai ZN. Lethality and centrality in protein networks. Nature. 2001;411:41–42. [PubMed]
36. Przulj N, Wigle DA, Jurisica I. Functional topology in a network of protein interactions. Bioinformatics. 2004;20:340–348. [PubMed]
37. van Dongen S. Ph.D. Thesis. University of Utrecht; 2000. Graph clustering by flow simulation.
38. Dunn R, Dudbridge F, Sanderson CM. The use of edge-betweenness clustering to investigate biological function in protein interaction networks. BMC Bioinform. 2005;6:39. [PMC free article] [PubMed]
39. Enright AJ, Van Dongen S, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 2002;30:1575–1584. [PMC free article] [PubMed]
40. Yu H, Kim PM, Sprecher E, Trifonov V, Gerstein M. The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS Comput. Biol. 2007;3:e59. [PMC free article] [PubMed]
41. Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, Katayama T, Kawashima S, Okuda S, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008;36:D480–D484. ( http://nar.oxfordjournals.org/cgi/content/abstract/36/suppl_1/D480). [PMC free article] [PubMed]
42. Vastrik I, D'Eustachio P, Schmidt E, Joshi-Tope G, Gopinath G, Croft D, de Bono B, Gillespie M, Jassal B, et al. Reactome: a knowledge base of biologic pathways and processes. Genome Biol. 2007;8:R39. [PMC free article] [PubMed]
43. Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S, Barre-Dirrie A, Reuter I, Chekmenev D, Krull M, et al. TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 2006;34:D108–110. [PMC free article] [PubMed]
44. Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 2006;34:D140–D144. [PMC free article] [PubMed]
45. Reimand J. Master's Thesis. Estonia: University of Tartu; 2006. Gene Ontology mining tool GOSt.
46. Ramani AK, Bunescu RC, Mooney RJ, Marcotte EM. Consolidating the set of known human protein–protein interactions in preparation for large-scale mapping of the human interactome. Genome Biol. 2005;6:R40. [PMC free article] [PubMed]
47. Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R. NCBI GEO: mining tens of millions of expression profiles–database and tools update. Nucleic Acids Res. 2007;D35:D760–D765. [PMC free article] [PubMed]
48. Parkinson H, Kapushesky M, Shojatalab M, Abeygunawardena N, Coulson R, Farne A, Holloway E, Kolesnykov N, Lilja P, et al. ArrayExpress – a public database of microarray experiments and gene expression profiles. Nucleic Acids Res. 2007;D35:D747–D750. [PMC free article] [PubMed]
49. Bashir T, Pagano M. Cdk1: the dominant sibling of Cdk2. Nat. Cell Biol. 2005;7:779–781. [PubMed]

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
