Bioinformatics. 2010 Oct 1; 26(19): 2474–2476.
Published online 2010 Aug 10. doi:  10.1093/bioinformatics/btq452
PMCID: PMC2944204

Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies


Summary: Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols.

Availability: http://www.sanger.ac.uk/resources/software/genevar

Contact: emmanouil.dermitzakis@unige.ch


Expression quantitative trait loci (eQTL) mapping, where gene expression profiling is treated as a phenotypic trait in genome-wide association studies (GWAS), has successfully been employed to uncover genetic variants that influence expression variation in recent studies (Dixon et al., 2007; Stranger et al., 2007a). Single-nucleotide polymorphism (SNP)–gene associations from eQTL analysis can be investigated in populations (Stranger et al., 2007b) or among tissue types (Dimas et al., 2009; Heinzen et al., 2008). In addition to genome-wide eQTL identification, combinations of eQTLs and lead SNPs identified by GWAS have been provided to interrogate the mechanisms underlying disease susceptibility at specific loci (Grundberg et al., 2009; Nica et al., 2010; Zeller et al., 2010). However, an analytical and visualization tool, together with a structured repository for multiple datasets, is still needed to facilitate the investigation of loci of interest and to share data publicly and among collaborators.

Here, we present Genevar, a database and Java tool designed to provide: (i) data warehousing; (ii) real-time computation of correlation significance; (iii) visualization of mapping results in a user-friendly interface; and (iv) an added web services platform that is implemented as a bridge between the server and multiple users. Genevar allows published data to be visually accessible in a secure fashion, without the need for users to download raw data. Through interactive analysis pipelines, researchers are able to rapidly investigate, for instance, cis-acting eQTLs at the locus of interest.

Complementing already available standalone tools (Chen et al., 2009; Ge et al., 2008), a database-centric architecture enables Genevar to perform complex queries on the fly without the high memory cost of first reading large-scale datasets into memory. Furthermore, exploiting the convenience of web-based (Wang et al., 2003; Zou et al., 2007) and web-launch (Mueller et al., 2005) tools, a Java interface was developed that connects to both database and web services. The main advantage of this system design is that users can switch between public services and local data on the same interface. Default services at the Sanger Institute currently contain gene expression profiling and genotypic data from the following two datasets: lymphoblastoid cell lines from eight HapMap3 populations (824 individuals, unpublished data); and three cell types derived from umbilical cords of 75 Geneva GenCord individuals (Dimas et al., 2009).


Genevar has two main functionalities in cis-eQTL analysis: (i) identifying eQTLs in genes of interest, and (ii) observing SNP–gene associations surrounding SNPs of interest (Fig. 1). Additional features include SNP–probe association plots and external links to three major genome browsers. Either cis- or trans-eQTLs can be plotted in the SNP–probe association plot module. Mapping results are listed in tree nodes in a structural manner, and information can be saved as PNG diagrams or exported as tab-delimited lists for further use in presentations or publications.

Fig. 1.
Results of Genevar: a scatter plot represents observed eQTLs in a 2 Mb window centering the GBP3 locus in HapMap3 CHB (A), and a line chart illustrates observed SNP–gene associations in a 2 Mb region surrounding rs13277113 SNP in eight HapMap3 ...
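As a rough illustration of how SNPs might be selected for a cis-eQTL window, the sketch below keeps SNPs on the same chromosome that lie within a window centered on a locus. It is not Genevar's code: the `Snp` class, the `snps_in_window` function, the SNP identifiers and the coordinates are all hypothetical, and it assumes that a "2 Mb window" means a total width of 2 Mb (1 Mb on each side of the center).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    rsid: str   # placeholder identifier, not a real rs number
    chrom: str
    pos: int    # base-pair coordinate


def snps_in_window(snps, chrom, center, window_bp=2_000_000):
    """Return SNPs on `chrom` inside a window of total width `window_bp`
    centered on `center` (assumption: half the window on each side)."""
    half = window_bp // 2
    return [s for s in snps if s.chrom == chrom and abs(s.pos - center) <= half]


# Hypothetical SNPs around a hypothetical locus at chr1:89,000,000.
snps = [
    Snp("rsA", "1", 88_500_000),
    Snp("rsB", "1", 89_900_000),
    Snp("rsC", "1", 91_200_000),
    Snp("rsD", "2", 89_000_000),
]

hits = snps_in_window(snps, "1", 89_000_000)
print([s.rsid for s in hits])  # prints ['rsA', 'rsB']
```

Widening `window_bp` to 5 Mb in this toy example would also pick up `rsC`, which mirrors the effect of adjusting the window size in the interface.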

Genevar is compatible with PLINK (Purcell et al., 2007) genotype data formats and any tab-delimited expression/genotyping file in our format. After uploading datasets onto the database, Genevar presents expression profiling data and individual genotypes in two cataloged management panels. Once a group of datasets is selected in the follow-up analysis pipelines, the software automatically prompts the user to choose from the available expression–genotype pairs.

Spearman's rank correlation coefficient is used to estimate the strength of the relationship between alleles and gene expression intensities; linear regression is also used to model the relationship between the two variables. To test the significance of the relationship, a t-statistic with n − 2 degrees of freedom is employed for both correlation and regression analysis (Stranger et al., 2007b). The software allows the user to adjust the window size centered on the gene/SNP of interest (e.g. 2 Mb) and to set a P-value threshold (e.g. P < 0.001) for the featured cis-eQTL analysis. In addition, non-parametric permutation P-values are provided in the subsequent association plot module to further evaluate the significance of nominal P-values. To construct a distribution of the test statistic under the null hypothesis of no SNP–probe association, expression intensities are randomly re-assigned to individuals' genotypes, the correlation coefficient and statistical significance are re-computed for the relabeled traits, and this procedure is repeated 10 000 times (Stranger et al., 2005).
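The statistics described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not Genevar's implementation. It makes several assumptions: genotypes are coded as minor-allele counts (0/1/2), tied values receive average ranks, the t-statistic takes the usual form t = rho * sqrt((n − 2) / (1 − rho²)), and the permutation P-value is two-sided with a +1 correction. All function names and data are hypothetical.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(genotypes, expression):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(genotypes), average_ranks(expression))


def t_statistic(rho, n):
    """t with n - 2 degrees of freedom; undefined when |rho| == 1."""
    return rho * math.sqrt((n - 2) / (1 - rho * rho))


def permutation_p(genotypes, expression, n_perm=10_000, seed=0):
    """Two-sided permutation P-value: shuffle expression values across
    individuals and count how often |rho| is at least the observed |rho|."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # +1 correction (an assumption)


# Hypothetical data for ten individuals at one SNP-probe pair.
genotypes = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
expression = [5.1, 5.3, 6.0, 6.2, 7.1, 6.9, 5.0, 6.1, 7.3, 5.9]

rho = spearman_rho(genotypes, expression)
print(rho, t_statistic(rho, len(genotypes)), permutation_p(genotypes, expression))
```

Converting the t-statistic to a nominal P-value requires the t distribution's cumulative distribution function, which the standard library does not provide; a statistics package would normally supply it.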

We recommend that users launch Genevar via Java Web Start from our homepage to obtain the most up-to-date version. After launching, Genevar is initially in web services mode, connected to the Sanger Institute. The user can then open another services connection to affiliated institutes, or switch to database mode, connecting directly to the user's local database. Genevar can be run completely offline in database mode, as there is then no communication between the Java interface and the Sanger server.

Future work will include modified visualization for displaying next-generation sequence data, e.g. RNA-Seq (Montgomery et al., 2010); and implementation of methylation modules to interrogate epigenomic data.


This approach to relational database design is an attempt to systematically decompose traditional flat files, which hold one record per line and have no structural relationships between records, into grouped dimension tables and to reduce data redundancy. A normalized and structured repository can warehouse data in any format regardless of file size and number of fields. Most importantly, indexing the expression and genotype fact tables keeps retrieval performance stable, at the reasonable cost of slower uploads and increased disk space. As datasets grow, the only limitation is storage space, which is the trade-off for query speed.
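The decomposition described above can be illustrated with a toy schema. The sketch below is hypothetical: the table and column names are invented for illustration and are not Genevar's actual schema, and it uses Python's built-in SQLite module as a stand-in for MySQL so that it runs without a database server. It splits flat expression records into two dimension tables and one indexed fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE individual (id INTEGER PRIMARY KEY, label TEXT UNIQUE);
CREATE TABLE probe      (id INTEGER PRIMARY KEY, name  TEXT UNIQUE);
-- Fact table: one row per (individual, probe) measurement.
CREATE TABLE expression (
    individual_id INTEGER REFERENCES individual(id),
    probe_id      INTEGER REFERENCES probe(id),
    intensity     REAL
);
-- The index speeds up per-probe retrieval but slows inserts and uses more disk.
CREATE INDEX idx_expression_probe ON expression (probe_id);
""")

# Flat-file style records: every line repeats the individual and probe names.
flat_rows = [
    ("ind1", "probeA", 7.2),
    ("ind2", "probeA", 6.8),
    ("ind1", "probeB", 9.1),
    ("ind2", "probeB", 9.4),
]


def dim_id(table, column, value):
    """Insert `value` into a dimension table if absent and return its id.
    `table` and `column` are fixed internal names, never user input."""
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = conn.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,))
    return row.fetchone()[0]


for label, probe, intensity in flat_rows:
    conn.execute(
        "INSERT INTO expression VALUES (?, ?, ?)",
        (dim_id("individual", "label", label), dim_id("probe", "name", probe), intensity),
    )


def intensities_for(probe_name):
    """Fetch (individual, intensity) pairs for one probe via the fact table."""
    rows = conn.execute(
        """SELECT i.label, e.intensity
           FROM expression e
           JOIN probe p ON p.id = e.probe_id
           JOIN individual i ON i.id = e.individual_id
           WHERE p.name = ?
           ORDER BY i.label""",
        (probe_name,),
    )
    return rows.fetchall()


print(intensities_for("probeA"))  # prints [('ind1', 7.2), ('ind2', 6.8)]
```

Each individual and probe name is stored once in its dimension table, so the fact table carries only integer keys and the measured value, which is where the reduction in redundancy comes from.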

To maximize the potential of Genevar as a platform shared among affiliations, Genevar has been extended to interact with web services protocols to enhance data security; the database schema will be deployed behind and protected by the firewall, whereas only a secure frontend webpage acting as a middle layer will be accessible to the user over the Internet.

Genevar uses the Hibernate library (http://www.hibernate.org) to map object-oriented models onto MySQL relational database tables (http://www.mysql.com) in the back-end, and uses the Apache CXF framework (http://cxf.apache.org) to wrap database queries and business logic into middle-layer services. Finally, a Tomcat server (http://tomcat.apache.org) provides the services in the front-end. For a standalone database-mode Genevar, only a MySQL database needs to be installed on the user's local machine. Association results are visualized in genomic views with the JFreeChart library (http://www.jfree.org/jfreechart/). A gene-centered scatter plot represents observed SNP–gene associations around genes of interest, and a SNP-centered line chart illustrates observed eQTLs surrounding SNPs of interest (Fig. 1).

Tested on a 1.6 GHz Pentium Centrino laptop with 1 GB of RAM, Genevar uploaded a 75 × 23k expression dataset onto the database and built its indexes in 1 min; another 23 min were required for the 75 × 400k genotype file. Once the data are uploaded, Genevar can fetch each SNP–probe pair for these 75 individuals from the database in <0.0257 s, and calculates Spearman's rho and nominal P-values for 486 SNP–probe pairs in 3 s.


We thank Guillaume Smits and Johan Rung (EMBL-EBI) for their suggestions on improving the functionalities. We also thank Richard Jeffs, James Smith, Paul Bevan (Sanger Webteam) and Andrew Bryant (Database Team) for helpful support on this project.

Funding: Wellcome Trust and Louis-Jeantet Foundation.

Conflict of Interest: none declared.


  • Chen W, et al. GWAS GUI: graphical browser for the results of whole-genome association studies with high-dimensional phenotypes. Bioinformatics. 2009;25:284–285.
  • Dimas AS, et al. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science. 2009;325:1246–1250.
  • Dixon AL, et al. A genome-wide association study of global gene expression. Nat. Genet. 2007;39:1202–1207.
  • Ge D, et al. WGAViewer: software for genomic annotation of whole genome association studies. Genome Res. 2008;18:640–643.
  • Grundberg E, et al. Population genomics in a disease targeted primary cell model. Genome Res. 2009;19:1942–1952.
  • Heinzen EL, et al. Tissue-specific genetic control of splicing: implications for the study of complex traits. PLoS Biol. 2008;6:2869–2879.
  • Montgomery SB, et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature. 2010;464:773–777.
  • Mueller M, et al. eQTL Explorer: integrated mining of combined genetic linkage and expression experiments. Bioinformatics. 2005;22:509–511.
  • Nica AC, et al. Candidate causal regulatory effects by integration of expression QTLs with complex trait genetic associations. PLoS Genet. 2010;6:e1000895.
  • Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007;81:559–575.
  • Stranger BE, et al. Genome-wide associations of gene expression variation in humans. PLoS Genet. 2005;1:e78.
  • Stranger BE, et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007a;315:848–853.
  • Stranger BE, et al. Population genomics of human gene expression. Nat. Genet. 2007b;39:1217–1224.
  • Wang J, et al. WebQTL: web-based complex trait analysis. Neuroinformatics. 2003;1:299–308.
  • Zeller T, et al. Genetics and beyond–the transcriptome of human monocytes and disease susceptibility. PLoS One. 2010;5:e10693.
  • Zou W, et al. eQTL Viewer: visualizing how sequence variation affects genome-wide transcription. BMC Bioinformatics. 2007;8:7–11.

Articles from Bioinformatics are provided here courtesy of Oxford University Press