Nucleic Acids Res. Jul 1, 2008; 36(Web Server issue): W308–W314.
Published online May 28, 2008. doi:  10.1093/nar/gkn303
PMCID: PMC2447723

GEPAS, a web-based tool for microarray data analysis and interpretation

Abstract

Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has continuously been updated to keep pace with the state-of-the-art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module allows automating the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is currently the most cited web tool in its field. It is extensively used by researchers in many countries, and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org.

INTRODUCTION

Since their introduction in the mid 1990s (1), microarrays have revolutionized the way in which the research community addresses biological problems. Their success relies on their application to classifying types of tumours (2) and predicting disease outcome (3) or even the response to treatments (4). These practical applications of microarrays, although not free of criticism (5), have definitely fuelled the use of the methodology. In this scenario, the real bottleneck in the use of microarray technologies comes from the data analysis step (6). The web-based package Gene Expression Profile Analysis Suite (GEPAS) has been growing during the last 5 years (7–10), trying to keep pace with the state-of-the-art in algorithms for high-throughput gene expression data analysis as well as responding to the demands of the microarray community.

Although originally designed to analyse microarray data, the most important modules of GEPAS are not tied to the technology or to the microarray platforms used to extract the data on gene expression. GEPAS is rather oriented to analyse high-throughput gene expression data and to test different types of genome-scale hypotheses.

GEPAS is not a web server for a single tool; rather, it constitutes one of the largest resources for integrated microarray data analysis available over the web. GEPAS is used by researchers worldwide, as can be seen in the usage map, where all the sessions are mapped to their geographic location (http://bioinfo.cipf.es/access_map/map.html). By the end of 2007, an average of 500 experiments per day were being analysed in GEPAS. The recent release 4.0 presented here includes new modules, new tests in existing modules, technical improvements (GEPAS is now based on web services technology and includes Web 2.0 features) and a more powerful and intuitive interface which includes graphical tools to define workflows and persistent private sessions.

GENERAL OVERVIEW

GEPAS has been designed for the analysis of high-throughput gene expression data. Obviously, today this means microarray data analysis, but this situation might change in the future and the data could come from different platforms or technologies. Although some of its modules are platform dependent, the core of GEPAS aims to analyse and test hypotheses using gene expression data in a simple but rigorous way.

Many different biological questions can be addressed through gene-expression experiments. Nevertheless, there are usually three types of objectives in this context: ‘class comparison’, ‘class prediction’ and ‘class discovery’ (6). The first two objectives fall into the category of supervised methods and usually involve the application of tests to define differentially expressed genes, or the use of different procedures to predict class membership on the basis of the values observed for a number of ‘key’ genes. Clustering methods belong to the last category, also known as unsupervised analysis, because no previous information about the class structure of the data set is used in the study. Accordingly, GEPAS is composed of the following modules:

Normalization and pre-processing

GEPAS implements normalization facilities for both two-colour and Affymetrix arrays. Normalization in two-colour arrays is performed using print-tip loess (11) with a number of different options. Affymetrix CEL files are normalized using standard Bioconductor (12) tools, in particular the package affy (13). Besides a friendly web interface, we provide the user with the speed and, above all, the physical memory available in our server. In addition, the pre-processor (14) module performs some pre-processing of the data (log-transformations, standardizations, imputation of missing values, etc.).
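The pre-processing operations listed above can be illustrated with a minimal Python sketch. It is not the GEPAS pre-processor: the function names, the use of None for missing values, the treatment of non-positive intensities as missing and the choice of row-mean imputation are all assumptions made here for illustration.

```python
import math
from statistics import mean, pstdev

def log2_transform(row):
    """Log2-transform positive values; missing (None) or non-positive values become None."""
    return [math.log2(v) if v is not None and v > 0 else None for v in row]

def impute_row_mean(row):
    """Replace missing values with the mean of the observed values of the same gene."""
    observed = [v for v in row if v is not None]
    if not observed:
        raise ValueError("row has no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in row]

def standardize(row):
    """Centre the row to mean 0 and scale it to unit (population) standard deviation."""
    m = mean(row)
    s = pstdev(row)
    if s == 0:
        return [0.0 for _ in row]  # flat profile: nothing to scale
    return [(v - m) / s for v in row]

def preprocess(row):
    """Chain the three steps for one gene (one row of the expression matrix)."""
    return standardize(impute_row_mean(log2_transform(row)))
```

For a gene measured as `[1.0, 2.0, None, 4.0, 8.0]`, the log2 values are 0, 1, missing, 2 and 3; the missing value is imputed as 1.5 and the standardized profile is symmetric around zero.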

Class discovery

Clustering techniques are used for class discovery either in genes or in experiments. GEPAS includes the best performing clustering methods according to different independent benchmarks (15,16). There are obviously more methods, but among the most extensively used for gene expression data clustering we can highlight: hierarchical clustering (17), SOM (18), SOTA (19) and K-means (20). It is worth mentioning that the version of SOM implemented here can automatically find the optimal number of clusters (21). The evaluation of cluster quality, a rarely addressed issue, has been implemented here using the silhouette method (22), which performs well in noisy situations, such as microarray data (23), along with some descriptive measures for each cluster partition (average profiles, standard deviation profiles, inter- and intra-cluster distances).
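As an illustration of the silhouette method (22), the following Python sketch computes the silhouette width of every profile in a given partition. For a profile i, a(i) is the mean distance to the other members of its own cluster, b(i) is the smallest mean distance to the members of any other cluster, and the width is (b(i) − a(i)) / max(a(i), b(i)). The function names and the use of Euclidean distance are assumptions; GEPAS may use other distances.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def silhouette_widths(profiles, labels, dist=euclidean):
    """Return one silhouette width per profile, in the input order."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(profiles[i], profiles[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist(profiles[i], profiles[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        top = max(a, b)
        widths.append((b - a) / top if top > 0 else 0.0)
    return widths
```

Widths close to 1 indicate profiles that sit well inside their cluster, whereas negative widths suggest that a profile is closer to another cluster than to its own.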

Differential gene expression

GEPAS implements tests for finding genes whose expression differs significantly between two or more classes, or is related to a continuous experimental factor (e.g. the concentration of a metabolite) or to survival data. For two-class comparisons, GEPAS implements the popular t-test, the empirical Bayes test (24), the CLEAR-test that combines differential expression and variability (25), the data-adaptive test (26) and the SAM test (27). For comparisons involving more than two classes GEPAS uses the classical ANOVA. In order to find genes whose expression is significantly correlated to a continuous variable (e.g. the level of a metabolite), regression analysis and estimates of Pearson's and Spearman's correlation coefficients can be obtained. Finally, for finding genes whose expression is related to survival times GEPAS estimates a Cox proportional hazards regression model (28). Right-censored data are allowed, as are replicated survival times. The censoring variable should be provided by the researcher together with the survival times.
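The simplest of the two-class comparisons above can be sketched in a few lines of Python. The source does not state which variant of the t-test GEPAS applies, so the unequal-variance (Welch) form is assumed here; the function names and the dictionary layout of the expression matrix are likewise hypothetical. Only the statistic and its approximate degrees of freedom are computed, since a P value would require a t-distribution function from a statistics library.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom for one gene."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each class needs at least two samples")
    va, vb = variance(group_a), variance(group_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb
    if se2 == 0:
        raise ValueError("both classes have zero variance")
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

def rank_genes(matrix, classes, class_a, class_b):
    """Return (gene, t, df) tuples sorted by decreasing absolute t.

    matrix maps a gene identifier to its expression values; classes gives
    the class label of each column, in the same order.
    """
    results = []
    for gene, values in matrix.items():
        a = [v for v, c in zip(values, classes) if c == class_a]
        b = [v for v, c in zip(values, classes) if c == class_b]
        t, df = welch_t(a, b)
        results.append((gene, t, df))
    return sorted(results, key=lambda r: abs(r[1]), reverse=True)
```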

When appropriate, P values adjusted for multiple testing are provided. Three methodologies are implemented. One of them controls the FWER (family-wise error rate) (29) while the others control the FDR (false discovery rate) (30).
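The two kinds of error control mentioned above can be contrasted with a short Python sketch. The step-down procedure of Holm (29) controls the FWER. For the FDR, the source does not name the exact procedures used, so the widely known Benjamini-Hochberg step-up adjustment is shown here as an assumed stand-in; it is not necessarily one of the methods GEPAS applies. Function names are hypothetical.

```python
def holm(pvalues):
    """Holm step-down adjusted P values (FWER control), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1); a running
        # maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg step-up adjusted P values (FDR control), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        # The k-th smallest P value is multiplied by m / k; a running
        # minimum taken from the largest P value downwards keeps monotonicity.
        running_min = min(running_min, pvalues[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted
```

With the same raw P values, the FDR-adjusted values are never larger than the Holm-adjusted ones, which reflects the less stringent error criterion.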

Predictors

A new module for class prediction (31) has been implemented. The module includes different classifiers, such as diagonal linear discriminant analysis (DLDA) (32), k-nearest neighbour (KNN) (33), support vector machines (SVM) (34), SOM (18) and shrunken centroids (PAM) (35), all of which are known to perform well as class predictors using microarray data (32). Cross-validation error is calculated in such a way as to avoid the well-known selection bias problem (36). See ref. (31) for details. Once the model has been trained it can be used for further prediction of new samples. This implementation is unique among similar programmes.
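The selection bias problem (36) arises when genes are selected using all samples before cross-validation, so that the left-out samples have already influenced the predictor. The Python sketch below shows the unbiased scheme, in which gene selection is repeated inside every fold using the training samples only. It is an illustration rather than the GEPAS code: a plain nearest-centroid rule stands in for the classifiers named above, genes are ranked by the difference between class means, and all function names are hypothetical.

```python
from statistics import mean

def select_genes(samples, labels, n_genes):
    """Rank genes by the spread of the class means, using the given samples only."""
    classes = sorted(set(labels))
    scores = []
    for g in range(len(samples[0])):
        means = [mean(s[g] for s, l in zip(samples, labels) if l == c)
                 for c in classes]
        scores.append((max(means) - min(means), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:n_genes]]

def train_centroids(samples, labels, genes):
    """Compute one centroid per class over the selected genes."""
    centroids = {}
    for c in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == c]
        centroids[c] = [mean(r[g] for r in rows) for g in genes]
    return centroids

def predict(centroids, genes, sample):
    """Assign the sample to the class with the nearest centroid."""
    def sqdist(centroid):
        return sum((sample[g] - m) ** 2 for g, m in zip(genes, centroid))
    return min(centroids, key=lambda c: sqdist(centroids[c]))

def loo_cv_error(samples, labels, n_genes):
    """Leave-one-out error with gene selection redone in every fold.

    Each class needs at least two samples, so that no class disappears
    from the training set when one sample is left out.
    """
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        genes = select_genes(train_x, train_y, n_genes)  # training data only
        model = train_centroids(train_x, train_y, genes)
        if predict(model, genes, samples[i]) != labels[i]:
            errors += 1
    return errors / len(samples)
```

Moving the call to `select_genes` outside the loop would reproduce the biased estimate that the module is designed to avoid.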

Time-course and dose–response gene-expression experiments

A new module for the analysis of multi-series time-course and dose–response microarray experiments has been added. In this type of experiment, the researcher aims to study gene expression changes across time or across dosages and to evaluate trend differences between the various experimental groups (37).

This module implements and extends the maSigPro statistical approach for the study of gene expression changes along time and the specific trend differences between various experimental groups (38). The method is a two-step regression approach in which individual series are identified by dummy variables. The procedure first fits a global regression model which considers all experimental series and a maximum complexity in the time/dosage-dependent response. This first step identifies differentially expressed genes at a given false positive control rate. In the second step, a variable selection method is applied to find the best model for each gene and to analyse particular significant profile differences between series. Finally, significant genes are clustered and displayed showing these trend differences.
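To make the role of the dummy variables concrete, the Python sketch below builds the design row of such a global model for one sample. A polynomial in time describes the reference series; for every other series a dummy variable and its interactions with the time terms are added, so that each coefficient measures a difference with respect to the reference. This is only an illustration of the model structure under assumed conventions (the first series is the reference, the polynomial degree sets the maximum complexity); the function and column names are hypothetical and the regression fitting itself is not shown.

```python
def design_columns(series_names, degree=2):
    """Column names of the global model, in the order used by design_row."""
    reference, *others = series_names
    powers = ["time^%d" % d for d in range(1, degree + 1)]
    columns = ["intercept"] + powers
    for name in others:
        columns.append("%s_vs_%s" % (name, reference))
        columns.extend("%s_x_%s" % (name, p) for p in powers)
    return columns

def design_row(time, series, series_names, degree=2):
    """Design row for one sample taken at `time` in experimental series `series`."""
    if series not in series_names:
        raise ValueError("unknown series: %r" % (series,))
    _, *others = series_names
    powers = [float(time) ** d for d in range(1, degree + 1)]
    row = [1.0] + powers
    for name in others:
        dummy = 1.0 if series == name else 0.0
        row.append(dummy)                       # shift of the intercept
        row.extend(dummy * p for p in powers)   # change in the time trend
    return row
```

For two series and a quadratic response, a sample of the second series at time 2 yields the row `[1.0, 2.0, 4.0, 1.0, 2.0, 4.0]`, whereas a reference sample at the same time has zeros in the last three columns.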

Functional profiling

There are many available tools that make use of gene functional annotations to provide an interpretation for the observed global changes in gene expression in microarray experiments (39). Probably, one of the most complete packages for functional profiling analysis is the Babelomics suite (40,41). This suite of programs for functional annotation of genome-scale experiments has undergone a deep modification described in detail elsewhere (Al-Shahrour, submitted to this issue). Babelomics performs functional enrichment analysis, that is, comparing two lists of genes and testing simultaneously in order to find significant over-abundance of diverse biologically relevant terms that would define functional modules such as GO, KEGG pathways, Interpro motifs or regulatory modules such as Transfac® motifs, CisRed motifs, miRNA binding motifs or other types of modules such as the ones defined by relative abundance in tissues and bioentities extracted from PubMed. All the tests are further adjusted for multiple testing effects (42,43). Additionally, gene set enrichment analysis can be performed using different algorithms (44,45) using several sources of information (46). The Babelomics suite is fully integrated into GEPAS. Gene expression analyses resulting in lists of genes to be compared (different clusters, genes differentially expressed, etc.) can be submitted to Babelomics for functional enrichment analysis. Moreover, arrangements of genes according to, for example, differential expression or other criteria can be sent to Babelomics to be studied by gene set enrichment analysis. This allows discovering pathways or functional modules of genes that are coordinately activated or deactivated in the experiment studied.
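The comparison of two lists of genes for over-abundance of an annotation term can be sketched as a one-sided Fisher's exact test on a 2×2 table. The Python example below is a minimal illustration of this kind of test, not the Babelomics implementation; the function names are hypothetical, the two lists are assumed not to share genes, and in practice one such test per term would be followed by the multiple-testing adjustment mentioned above.

```python
from math import comb

def fisher_over_representation(a, b, c, d):
    """One-sided Fisher's exact P value for over-representation in list 1.

    a, b: genes of list 1 with / without the term
    c, d: genes of list 2 with / without the term
    """
    n1 = a + b              # size of list 1
    annotated = a + c       # genes carrying the term in both lists
    total = a + b + c + d
    denom = comb(total, n1)
    upper = min(n1, annotated)
    # Hypergeometric probability of a or more annotated genes in list 1
    tail = sum(comb(annotated, k) * comb(total - annotated, n1 - k)
               for k in range(a, upper + 1))
    return tail / denom

def enrichment_p(term_genes, list1, list2):
    """Build the 2x2 table for one term from two gene lists and test it."""
    set1, set2, term = set(list1), set(list2), set(term_genes)
    a = len(set1 & term)
    c = len(set2 & term)
    return fisher_over_representation(a, len(set1) - a, c, len(set2) - c)
```

If all three genes of the first list carry a term that none of the three genes of the second list carries, the P value is 1/20 = 0.05, the probability of drawing exactly that split by chance.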

Entry points and data formats

There are two entry points to GEPAS: platform dependent and platform independent. GEPAS accepts and normalizes different types of microarray data, which include Affymetrix CEL files and 13 different two-channel array formats including Agilent, Genepix and others. Once the files are normalized any type of analysis can be applied. On the other hand, there is a simple format by means of which data from other platforms, other technologies (e.g. SAGE) and even of a different nature (e.g. proteomics, ChIP-on-chip data) can be input into any of the GEPAS modules. For this purpose a very simple text file can be used, containing the numeric gene expression values as a tab-delimited matrix in which rows correspond to genes and columns to experiments. Information on the experiments can be stored in the first rows, each starting with a # symbol. The first column contains the gene identifiers.
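A reader for the platform-independent format described above can be sketched in Python as follows. The sketch relies only on what the text states (tab-delimited values, gene identifiers in the first column, information rows starting with #); treating empty cells as missing values and the function name are assumptions, and the content of the information rows is returned uninterpreted.

```python
def parse_expression_matrix(text):
    """Parse a tab-delimited expression matrix.

    Returns (annotations, genes, matrix): the information rows without
    their leading '#', the gene identifiers, and one list of floats per
    gene (None for empty cells).
    """
    annotations, genes, matrix = [], [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            annotations.append(line[1:].strip())
            continue
        fields = line.split("\t")
        genes.append(fields[0])
        matrix.append([float(v) if v.strip() else None for v in fields[1:]])
    if len({len(row) for row in matrix}) > 1:
        raise ValueError("rows have different numbers of experiments")
    return annotations, genes, matrix
```

A hypothetical two-gene file with one information row would be read as:

```python
example = "#experiments\tc1\tc2\ng1\t1.5\t2.5\ng2\t\t0.5\n"
annotations, genes, matrix = parse_expression_matrix(example)
# genes is ["g1", "g2"]; matrix is [[1.5, 2.5], [None, 0.5]]
```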

WHAT IS NEW IN VERSION 4.0?

The novelties added to this version have been described in more detail above, in the general overview of the programme. In summary, we have implemented a number of new tests that were absent from previous versions, as well as entirely new modules. Many more options for normalization have been added (support for 12 more formats). New tests for differential expression, such as an improved version of the CLEAR-test (25) and the popular SAM test (27), were implemented. The module for cluster visualization has also been extensively improved. Much work has been invested in implementing an improved tool for protein and gene ID conversion which includes a large number of species and databases. The converter tool now supports more than 10 species and more than 40 gene ID references for human [including single nucleotide polymorphism (SNP) and orthologous information]. In general, almost all the modules of GEPAS have undergone improvements to some extent. We have included a complete new module that allows the analysis of multi-series time-course and dose–response microarray experiments. The module is an implementation of the maSigPro statistical approach for the study of gene expression changes along time and the specific trend differences between various experimental groups (38). Another new module is the clustering by a version of SOM (21) that automatically finds the number of clusters. Babelomics has its own catalogue of novelties, which are described in an accompanying paper.

In addition, there are technical novelties such as the re-engineering to web services, the inclusion of Web 2.0 technology features, the new interface of sessions and the pipeliner, which are described below.

In terms of resources invested, the novelties included in GEPAS go far beyond the work demanded by a conventional web server that offers a single facility.

The pipeliner: a graphic module for easy implementation of workflows

Microarray data analysis consists of a series of steps that can be carried out by sequentially running different GEPAS modules (e.g. normalization + pre-processing + gene selection + functional profiling of significant genes). If some of these steps have to be repeated systematically many times (which would happen, for example, in a microarray core facility) it is easier to save the sequence of operations as a workflow and reuse it in future analyses. The possibility of saving and storing operations is also useful when a researcher uses a non-default set of parameters in the tools. The advanced ‘pipeliner’ module allows users to define workflows for repetitive tasks in a completely visual manner by choosing, dragging and dropping icons representing the different modules in the package (without the need for any scripting skills). Figure 1 shows the graphic interface that allows defining sequences of operations as well as setting the parameters used in them. The workflows defined with this Java applet can be stored in the sessions and loaded from them later.

Figure 1.
The pipeliner interface with the available modules on the left and the customization options window below. Modules can be dragged and dropped on the screen and the sequence of execution is defined by linking them. Clicking on a module brings about the ...
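The idea of a stored workflow, that is, an ordered list of modules with their parameters that can be saved and re-run on new data, can be sketched in Python as follows. This is a conceptual illustration only and says nothing about how the pipeliner applet stores its workflows: the three toy steps, the registry and the JSON layout are all hypothetical.

```python
import json
import math

def log2_step(values):
    """Toy stand-in for a transformation module."""
    return [math.log2(v) for v in values]

def center_step(values):
    """Toy stand-in for a standardization module: subtract the mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def threshold_step(values, cutoff=1.0):
    """Toy stand-in for a selection module: keep values with |v| >= cutoff."""
    return [v for v in values if abs(v) >= cutoff]

# Hypothetical registry mapping module names to functions.
STEPS = {"log2": log2_step, "center": center_step, "threshold": threshold_step}

def run_workflow(workflow, data):
    """Run the steps in order, feeding each output into the next step."""
    for step in workflow:
        func = STEPS[step["module"]]
        data = func(data, **step.get("params", {}))
    return data

# A workflow with one non-default parameter, saved as text and restored.
workflow = [
    {"module": "log2"},
    {"module": "center"},
    {"module": "threshold", "params": {"cutoff": 1.0}},
]
saved = json.dumps(workflow)
restored = json.loads(saved)
```

Because the restored workflow carries the parameter values with it, running it on a new data set repeats exactly the same sequence of operations.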

Internal re-engineering, technological improvements and the session interface

GEPAS has been completely re-engineered and is now based on SOAP web services and on new Web 2.0 technology features such as AJAX. This has facilitated the design of a new interface that allows asynchronous use, as well as the management of projects, jobs and users. Thus, users can choose between the traditional anonymous sessions without logging in (as in previous versions) and logging into the new environment with a username and password. This new environment offers persistent sessions in which data are kept stored, as well as different facilities for tracking the operations performed. Both options are free.

GEPAS is now running on a high-end cluster with 10 dedicated Intel Xeon Quad-Core CPUs at 2.0 GHz (a total of 40 cores) and a large amount of RAM (60 GB in total). In this way we can offer high computing power to end users.

An improved module for protein and gene ID conversion, covering a large number of species and databases, is used behind the scenes. This module allows importing any microarray file regardless of the IDs used in the platform. More species and gene references have been added, and the converter module now supports more than 10 species and more than 40 ID references for human (including SNP and orthologous information). This module has been implemented in Java to improve performance. Besides the web interface, a public web service Application Programming Interface is provided, allowing anyone to access the data from their own code.

Related training activities

In addition, there is a teaching programme related to GEPAS (http://bioinfo.cipf.es/docus/courses/courses.html) with on-line tutorials that can be freely used (http://bioinfo.cipf.es/docus/courses/on-line.html).

GEPAS usage

The impact on the user community has been estimated by the corresponding number of Google Scholar citations. According to the number of citations, GEPAS is by far the most popular web resource in its category with 196 citations [252 if the citations of SOTA (19) are included]. The updated citations for the web tools with a significant presence in the scientific community can be found at: http://bioinfo.cipf.es/docus/tools-citations/microarrays. GEPAS is used by a broad research community in many countries and its records indicate an average usage rate of around 500 users per day. The geographical distribution of users can be monitored in real time at: http://bioinfo.cipf.es/access_map/map.html. The web-based pipeline for microarray gene expression data, GEPAS, is available at http://www.gepas.org.

Future plans

We are working on several improvements that will be released in an upcoming version. These include normalization for one-channel Agilent arrays, for exon arrays (both Agilent and Affymetrix), for tiling arrays and for Illumina arrays. New tests for differential expression will be included. A new version of the predictor with more predictor tools and new cross-validation methods will also be implemented. ISACGH (47) for array-CGH analysis will be fully integrated in GEPAS, and interfaces to databases such as ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) or Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) will be provided.

DISCUSSION

GEPAS is a long-term, ongoing and ambitious project that aims to provide the scientific community with an advanced set of tools for high-throughput gene expression data analysis without sacrificing ease and intuitiveness of use. Since its official release in 2003 (7), GEPAS has been running uninterruptedly and has grown to include more tools to keep pace with the novelties in the microarray data analysis arena (7–9). GEPAS aims to be a consistent set of both state-of-the-art and widely established algorithms, rather than a simple collection of as many tools as possible. In fact, every new tool included in the package has been a response to a new or emerging requirement expressed by our users. As the Functional Genomics node of the Spanish Institute of Bioinformatics (INB; http://www.inab.org) and being part of the Spanish Network of Cancer (RTICC; http://www.rticcc.org) and the Network of Centres for Research in Rare Diseases (CIBERER, http://www.ciberer.es), we have direct contact with researchers, from whom we obtain much of the feedback necessary to build a useful tool. We are also integrated in the EMERALD project (http://www.microarray-quality.org/), where we will provide input on data mining methodologies such as clustering, gene selection or predictors, to assess the implications of QA/QC.

GEPAS, integrated with the Babelomics suite (40,41), offers all the methods needed to perform the most common analyses of microarray data. GEPAS has been designed to take full advantage of the properties of the web: connectivity, cross-platform functionality and remote usage. Its modular architecture based on web services allows easy implementation of new tools and facilitates the connectivity of GEPAS from and to other web-based tools.

It cannot be ruled out that the technologies and the platforms will change in the future. Such foreseeable changes can only affect the entry point and the technology-related part of GEPAS (that is, the normalization). The important contribution of GEPAS is its potential for analysing high-throughput gene expression data and for testing different types of hypotheses in this context, regardless of the technology that has produced the data.

The functional interpretation step is typically carried out by studying the enrichment in pre-defined modules of genes that are related to one another by some biological property of interest (common function, regulation, chromosomal location, etc.) as a function of some parameter derived from the experiment. Thus, functional enrichment methods (39) are used to find gene modules significantly over-represented among the relevant genes selected in the experiment. Over-representation of a given gene module means that genes with a particular property have been activated or deactivated in the experiment. Recently, gene set enrichment methods have been superseding conventional functional enrichment methods for the functional interpretation of high-throughput gene-expression data, given their higher sensitivity (39,48,49). Both families of methods, along with several definitions of modules (functional, transcriptional, text-mining based, and phenotype and tissue based), are implemented in the Babelomics module, fully integrated in GEPAS.

GEPAS is now running on a high-end cluster that offers high computing power. This allows the use of tools that are usually beyond the capabilities of the hardware available to many end users (normalization tools, for example, are highly RAM-consuming).

Although there are many alternatives for microarray data analysis, there is no other similar resource over the web with the number of possibilities offered by GEPAS.

ACKNOWLEDGEMENTS

This work is supported by grants from the Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER) ISCIII, and projects BIO2005-01078 from the Spanish Ministry of Education and Science, EMERALD from the EU and the National Institute of Bioinformatics (www.inab.org), a platform of Genoma España. Funding to pay the Open Access publication charges for this article was provided by project BIO2005-01078 from the Spanish Ministry of Education and Science.

Conflict of interest statement. None declared.

REFERENCES

1. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270:467–470. [PubMed]
2. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl Acad. Sci. USA. 2001;98:10869–10874. [PMC free article] [PubMed]
3. Beer DG, Kardia SL, Huang CC, Giordano TJ, Levin AM, Misek DE, Lin L, Chen G, Gharib TG, Thomas DG, et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 2002;8:816–824. [PubMed]
4. van ‘t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415:530–536. [PubMed]
5. Simon R. Roadmap for developing and validating therapeutically relevant genomic classifiers. J. Clin. Oncol. 2005;23:7332–7341. [PubMed]
6. Allison DB, Cui X, Page GP, Sabripour M. Microarray data analysis: from disarray to consolidation and consensus. Nat. Rev. Genet. 2006;7:55–65. [PubMed]
7. Herrero J, Al-Shahrour F, Diaz-Uriarte R, Mateos A, Vaquerizas JM, Santoyo J, Dopazo J. GEPAS: a web-based resource for microarray gene expression data analysis. Nucleic Acids Res. 2003;31:3461–3467. [PMC free article] [PubMed]
8. Herrero J, Vaquerizas JM, Al-Shahrour F, Conde L, Mateos A, Diaz-Uriarte JS, Dopazo J. New challenges in gene expression data analysis and the extended GEPAS. Nucleic Acids Res. 2004;32:W485–W491. [PMC free article] [PubMed]
9. Vaquerizas JM, Conde L, Yankilevich P, Cabezon A, Minguez P, Diaz-Uriarte R, Al-Shahrour F, Herrero J, Dopazo J. GEPAS, an experiment-oriented pipeline for the analysis of microarray gene expression data. Nucleic Acids Res. 2005;33:W616–W620. [PMC free article] [PubMed]
10. Montaner D, Tarraga J, Huerta-Cepas J, Burguet J, Vaquerizas JM, Conde L, Minguez P, Vera J, Mukherjee S, Valls J, et al. Next station in microarray data analysis: GEPAS. Nucleic Acids Res. 2006;34:W486–W491. [PMC free article] [PubMed]
11. Smyth G, Yang Y, Speed T. Statistical issues in microarray data analysis. In: Brownstein M, Khodursky A, editors. Functional Genomics: Methods and Protocols. Vol. 224. Totowa, NJ: Humana Press; 2003. pp. 111–136.
12. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80. [PMC free article] [PubMed]
13. Gautier L, Cope L, Bolstad BM, Irizarry RA. affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004;20:307–315. [PubMed]
14. Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, et al. Ensembl 2008. Nucleic Acids Res. 2008;36:D707–D714. [PMC free article] [PubMed]
15. Handl J, Knowles J, Kell DB. Computational cluster validation in post-genomic data analysis. Bioinformatics. 2005;21:3201–3212. [PubMed]
16. Datta S, Datta S. Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes. BMC Bioinformatics. 2006;7:397. [PMC free article] [PubMed]
17. Sneath P, Sokal R. Numerical Taxonomy. San Francisco: W. H. Freeman; 1973.
18. Kohonen T. Self-organizing Maps. Berlin: Springer; 1997.
19. Herrero J, Valencia A, Dopazo J. A hierarchical unsupervised growing neural network for clustering gene expression patterns. Bioinformatics. 2001;17:126–136. [PubMed]
20. Hartigan J, Wong M. A k-means clustering algorithm. Appl. Stat. 1979;28:100–108.
21. Vegas-Azcárate S, Muruzábal J. Biosignal Processing and Classification. 2005. Vol. 1. Insticc Press, Setubal, pp. 50–59.
22. Rousseeuw P. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987;20:53–65.
23. Azuaje F. A cluster validity framework for genome expression data. Bioinformatics. 2002;18:319–320. [PubMed]
24. Kendziorski CM, Newton MA, Lan H, Gould MN. On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Stat. Med. 2003;22:3899–3914. [PubMed]
25. Valls J, Grau M, Sole X, Hernandez P, Montaner D, Dopazo J, Peinado MA, Capella G, Moreno V, Pujana MA. CLEAR-test: combining inference for differential expression and variability in microarray data analysis. J. Biomed. Inform. 2007;41:33–45. [PubMed]
26. Mukherjee S, Roberts SJ, van der Laan MJ. Data-adaptive test statistics for microarray data. Bioinformatics. 2005;21(Suppl. 2):ii108–ii114. [PubMed]
27. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl Acad. Sci. USA. 2001;98:5116–5121. [PMC free article] [PubMed]
28. Klein JP, Moeschberger ML. Survival Analysis: Techniques for Censored and Truncated Data. New York: Springer; 2003.
29. Holm S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 1979;6:65–70.
30. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA. 2003;100:9440–9445. [PMC free article] [PubMed]
31. Medina I, Montaner D, Tarraga J, Dopazo J. Prophet, a web-based tool for class prediction using microarray data. Bioinformatics. 2007;23:390–391. [PubMed]
32. Dudoit S, Fridlyand J, Speed T. Comparison of discrimination methods for the classification of tumors using gene expression data. J. Am. Stat. Assoc. 2002;97:77–87.
33. Ripley B. Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press; 1996.
34. Vapnik V. Statistical Learning Theory. New York: Wiley; 1998.
35. Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc. Natl Acad. Sci. USA. 2002;99:6567–6572. [PMC free article] [PubMed]
36. Ambroise C, McLachlan GJ. Selection bias in gene extraction on the basis of microarray gene-expression data. Proc. Natl Acad. Sci. USA. 2002;99:6562–6566. [PMC free article] [PubMed]
37. Simon I, Siegfried Z, Ernst J, Bar-Joseph Z. Combined static and dynamic analysis for determining the quality of time-series expression profiles. Nat. Biotechnol. 2005;23:1503–1508. [PubMed]
38. Conesa A, Nueda MJ, Ferrer A, Talon M. maSigPro: a method to identify significantly differential expression profiles in time-course microarray experiments. Bioinformatics. 2006;22:1096–1102. [PubMed]
39. Dopazo J. Functional interpretation of microarray experiments. OMICS. 2006;10:398–410. [PubMed]
40. Al-Shahrour F, Minguez P, Tarraga J, Montaner D, Alloza E, Vaquerizas JM, Conde L, Blaschke C, Vera J, Dopazo J. BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments. Nucleic Acids Res. 2006;34:W472–W476. [PMC free article] [PubMed]
41. Al-Shahrour F, Minguez P, Vaquerizas JM, Conde L, Dopazo J. BABELOMICS: a suite of web tools for functional annotation and analysis of groups of genes in high-throughput experiments. Nucleic Acids Res. 2005;33:W460–W464. [PMC free article] [PubMed]
42. Al-Shahrour F, Diaz-Uriarte R, Dopazo J. FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics. 2004;20:578–580. [PubMed]
43. Al-Shahrour F, Minguez P, Tarraga J, Medina I, Alloza E, Montaner D, Dopazo J. FatiGO+: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments. Nucleic Acids Res. 2007;35:W91–W96. [PMC free article] [PubMed]
44. Al-Shahrour F, Diaz-Uriarte R, Dopazo J. Discovering molecular functions significantly related to phenotypes by combining gene expression data and biological information. Bioinformatics. 2005;21:2988–2993. [PubMed]
45. Al-Shahrour F, Arbiza L, Dopazo H, Huerta-Cepas J, Minguez P, Montaner D, Dopazo J. From genes to functional classes in the study of biological systems. BMC Bioinformatics. 2007;8:114. [PMC free article] [PubMed]
46. Minguez P, Al-Shahrour F, Montaner D, Dopazo J. Functional profiling of microarray experiments using text-mining derived bioentities. Bioinformatics. 2007;23:3098–3099. [PubMed]
47. Conde L, Montaner D, Burguet-Castell J, Tarraga J, Medina I, Al-Shahrour F, Dopazo J. ISACGH: a web-based environment for the analysis of Array CGH and gene expression which includes functional profiling. Nucleic Acids Res. 2007;35:W81–W85. [PMC free article] [PubMed]
48. Goeman JJ, Buhlmann P. Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics. 2007;23:980–987. [PubMed]
49. Nam D, Kim SY. Gene-set approach for expression pattern analysis. Brief. Bioinform. 2008;9:189–197. [PubMed]

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press