Copyright Bergholdt et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Expression Profiling of Human Genetic and Protein Interaction Networks in Type 1 Diabetes

1 Hagedorn Research Institute and Steno Diabetes Center, Gentofte, Denmark
2 Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
3 Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
4 Broad Institute of MIT and Harvard, Seven Cambridge Center, Cambridge, Massachusetts, United States of America
5 Department of Medical Biochemistry and Genetics, University of Copenhagen, Copenhagen, Denmark
6 University of Lund/Clinical Research Centre (CRC), Malmø, Sweden
7 Department of Biomedical Science, University of Copenhagen, Copenhagen, Denmark

Adrian Vella, Editor
Mayo Clinic College of Medicine, United States of America

* E-mail: rber/at/hagedorn.dk

Conceived and designed the experiments: RB FP. Performed the experiments: RB CB KLH. Analyzed the data: RB CB KLH FP. Contributed reagents/materials/analysis tools: RB KLH JHN SB FP. Wrote the paper: RB CB KLH SB FP.

Received March 24, 2009; Accepted June 17, 2009.

Abstract

Proteins contributing to a complex disease are often members of the same functional pathways. Elucidation of such pathways may provide increased knowledge about functional mechanisms underlying disease. By combining genetic interactions in Type 1 Diabetes (T1D) with protein interaction data we have previously identified sets of genes, likely to represent distinct cellular pathways involved in T1D risk.
Here we evaluate the candidate genes involved in these putative interaction networks not only at the single gene level, but also in the context of the networks of which they form an integral part. mRNA expression levels for each gene were evaluated and profiling was performed by measuring and comparing constitutive expression in human islets versus cytokine-stimulated expression levels, and for lymphocytes by comparing expression levels between controls and T1D individuals. We identified differential regulation of several genes. In one of the networks four out of nine genes showed significant down-regulation in human pancreatic islets after cytokine exposure, supporting our prediction that the interaction network as a whole is a risk factor. In addition, we measured the enrichment of T1D-associated SNPs in each of the four interaction networks to evaluate evidence of significant association at network level. This method provided additional support, in an independent data set, that two of the interaction networks could be involved in T1D and highlights the following processes as risk factors: oxidative stress, regulation of transcription and apoptosis. To understand biological systems, integration of genetic and functional information is necessary, and the current study has used this approach to improve understanding of T1D and the underlying biological mechanisms.

Introduction

Currently, genome-wide association studies in complex diseases are producing an unprecedented amount of genetic data. Complex traits like Type 1 Diabetes (T1D) are influenced by multiple genes interacting with each other to confer susceptibility and/or protection. However, identifying the individual components can be difficult because each only contributes weakly to the pathology. Alternatively, identification of entire cellular systems involved in a particular disease could be attempted.
Such a strategy should be feasible in many different complex diseases since most genes exert their function as members of molecular machines where groups of proteins contributing to disease can be expected to be members of the same functional pathways [1], [2], [3], [4], [5], [6]. Analysis of an entire disease-related system might provide insight into the molecular etiology of the disease that would not emerge from isolated functional studies of single genes.

We have previously demonstrated statistical evidence for gene-gene interactions in a large T1D linkage data set [7]. The data set comprised data from 1,321 affected sib pairs genotyped for 298 microsatellite markers [7], [8]. By an integrative approach combining genetic data and high-confidence (human) protein interaction networks, we identified four protein interaction networks significantly enriched in proteins from the predicted genetic interactions. This supported interaction in biological pathways. For each of these networks the identified protein or proteins were viewed in a biological context [7]. However, further functional and genetic evaluation is necessary to confirm the involvement of these interactions in T1D, to elucidate the biological mechanisms of these networks and to identify the strongest risk factors amongst the network members. If several members of the same network can be shown to be likely risk factors in independent data, this would support the view that the interaction networks as such are risk factors and would serve as a validation of the genetic interactions previously identified.

In the current study we use independent approaches for evaluating interaction networks and identifying the strongest risk factors amongst network members. We have used available T1D genome-wide association scan data to evaluate whether entire interaction networks could be significantly associated with T1D. Furthermore, we performed expression profiling of identified genes.
The hypothesis behind this is that expression levels may act as intermediate phenotypes between DNA sequence variation and more complex disease phenotypes and that evaluation of the expression of candidate genes in relevant tissue and/or disease models may provide a means for identifying those with a functional implication in T1D pathogenesis.

Results and Discussion

We have evaluated expression levels of candidate genes previously identified through genetic and protein interaction analyses [7]. The selected candidate genes originate from linkage regions predicted to genetically interact, and are functionally supported by evidence for physical interaction at the level of protein complexes. Four functional interaction networks (A–D) containing 30 proteins presumed to be responsible for the genetic interactions were previously obtained [7], and these four putative pathways and their 30 members were further evaluated in the present study.

In a model of T1D, expression levels were evaluated in human islet preparations, representing the target organ, as well as in human lymphocytes, representing the effector cells in T1D. Expression profiling in human islets was performed by comparing constitutive expression with cytokine-stimulated expression levels. Gene expression levels in lymphocytes were compared between controls and T1D individuals.

Additional support for individual genes and genetic interactions in the networks comes from evidence for genetic association. The Wellcome Trust Case Control Consortium (WTCCC) has made the results of their large genome-wide association study of T1D and other diseases publicly available (www.wtccc.org) [9]. In this data set we searched for T1D-associated SNPs in the 30 candidate genes located in the four interaction networks.
To test for combined evidence for T1D association of the protein networks we measured the over-representation (enrichment) of significant SNPs associated with T1D in the four interaction networks, compared to randomly generated networks with similar properties. For each network we tested the enrichment of SNPs in the best 0.1 percentile, 1 percentile and 5 percentile of the WTCCC data for T1D. Nominal and adjusted P-values were determined for enrichment at each of those thresholds by comparison with 1,000 randomly generated networks containing an equal number of proteins, encoded by genes of similar size to the actual test genes.

Interaction network A, table 1 and figure 1
Interaction network B, table 1 and figure 2
Interaction network C contains only three genes and originates from an identified gene-gene interaction between the HLA-region and a region on chromosome 11. The genes predicted to be responsible for this interaction are the MOG (Myelin-oligodendrocyte glycoprotein precursor) gene on chromosome 6 and the APLP2 (Amyloid-like protein 2 precursor (APPH)) and NTRI (Neurotrimin precursor (hNT)) genes on chromosome 11 (table 1). No significant differences were demonstrated for these genes in either of the tissues/model systems, so the expression studies provide no functional support for the network C genes. No enrichment of T1D-associated SNPs was observed in this network either.

Interaction network D, table 1 and figure 3
In three out of four interaction networks our attempts to functionally characterize the candidate genes by expression profiling have identified novel genes for further analysis. The third interaction network, network C, did not reveal any differentially expressed genes. Genetic support, obtained by evaluating whether networks were enriched in T1D-associated SNPs, was found for network A and to a lesser degree for network D, highlighting these two interaction networks as the most important.

A recently published study evaluated changes at the proteome level after cytokine stimulation of INS-1E cells (a rat tumor beta-cell line serving as an in vitro model for T1D) [17]. Among the proteins that changed expression levels after 4 hours of stimulation with IL-1β and IFN-γ were the rat gene products of the WDR1 and NPM1 genes, and after 24 hours the PRDX1 protein was highly up-regulated. In that study a large protein interaction network containing many of the differentially expressed proteins, including WDR1, NPM1 and PRDX1, was identified [17]. Despite the use of different species and model systems and unknown dynamic differences in the transcriptome and proteome, we find it of interest that these three genes were pin-pointed as functionally relevant in the current study as well as in the study by D'Hertog et al. [17]. The likelihood of an overlap of three genes between these two studies has a P-value of <0.05, as calculated using hypergeometric statistics.

To understand biological systems, integration of genetic and functional information is necessary. This includes studies of gene-gene and protein-protein interactions and transcriptional or proteome profiling. In a previous study we identified genetic interactions observed in T1D and explained their functionality by integrating protein-protein interactions to generate protein interaction networks.
In this work we validated the discovered networks and analyzed their functionality by expression profiling in relevant target tissue and by using SNP association data. Protein interaction data are generally noisy and databases probably contain many false positives. The system used in the current study is, however, rigorously quality controlled to include only interactions that have been replicated in independent screens [18]. GO (gene ontology) terms (www.geneontology.org) for molecular function and biological processes of interaction networks A and D, and of the differentially expressed genes in particular, support that oxidative stress and regulation of transcription and apoptosis are of relevance for beta-cell destruction in T1D pathogenesis, and point directly at these pathways as the most important. Despite the overall impression from recent genome-wide association studies [9], [19], [20] that genes of importance in T1D are mainly immune system genes and not beta-cell genes, it seems that integrative genomics can also identify genes involved in the genetic disposition to e.g. cytokine-induced beta-cell death by apoptosis.

Only a few studies have addressed genetic interaction in T1D, and these have focused on interactions between classical disease loci [21], [22], [23]. In the case of high-risk HLA class II genotypes and PTPN22, a less than multiplicative association has been demonstrated in T1D and rheumatoid arthritis [16], [24], [25]. The general impression is that interactions may exist, even though they have been difficult to identify. Attempts to identify gene-gene interactions in T1D in previous studies, e.g. in the recent T1D genome-wide association studies [9], [19], [26], have not been fruitful. However, stratifying for known T1D loci while searching for dependent effects at other known or unknown loci may not be the best method.
Studies using simulated data have shown that the power to detect risk variants can be increased when allowing for epistasis in addition to single marker effects in e.g. genome-wide association studies [27]. Novel methods taking multiple loci at a time into account may offer possibilities of detecting interactions not detectable by classical methods. Evaluation of suggested interactions is necessary to support novel methods, and replication of genetic interactions is without doubt important, even though it is currently not obvious how this should be done.

In the current study we have integrated several approaches and our findings support such methods as valuable in searching for yet unidentified genetic and functional interactions involved in the pathogenetic processes of T1D. With this approach, functionality is taken into account much earlier than in classical analyses, where functional significance is typically not evaluated before the end of a study. The exact consequence of the up- and down-regulation of the proteins in the interaction networks, permanently or transiently, and in relation to T1D, remains to be resolved. Our approach of measuring the enrichment of T1D-associated SNPs in the interaction modules is a novel way of also seeking genetic support for several interacting genes eventually combined in biological pathways.

Materials and Methods

Ethics statement

Human pancreatic islets were obtained as samples from a multicenter European Union-supported program on beta-cell transplantation in diabetes directed by Professor D. Pipeleers. The program has been approved by central and local ethical committees. Studies including human lymphocytes were approved by the local ethics committee of Copenhagen (KA 94020gm).

Human islet preparations were obtained from nine donors (aged 8–57 years); six donors were male and three were female.
Each preparation was stimulated with a mixture of cytokines (TNF-α (5000 U/ml), IFN-γ (750 U/ml) and IL-1β (75 U/ml)) for 48 hours. Lymphocyte RNA was obtained from nine controls (all males, aged 15–35 years and without diabetes) and eight newly diagnosed T1D patients (all males, aged 15–30 years, with duration of T1D <20 weeks from first insulin injection and with continued insulin treatment since).

cDNA from human pancreatic islets with and without cytokine stimulation and cDNA from human lymphocytes from controls as well as newly diagnosed T1D patients was used for comparing expression levels. cDNA was prepared from total RNA by oligo-dT-primed reverse transcription, as described by the manufacturer (TaqMan RT reagents, Applied Biosystems, Foster City, CA, USA). Relative expression levels of selected genes were evaluated by use of TaqMan assays. The Low Density Array system (Applied Biosystems), containing assays for the individual genes as well as housekeeping genes, was used on the TaqMan 7900HT (Applied Biosystems). For evaluation, expression levels of genes were normalized against the average of three human housekeeping genes, GAPDH, 18S-RNA and PPIA, and evaluated using the delta-delta Ct method [28]. Relative expression levels in unstimulated versus cytokine-stimulated islet preparations were compared by use of paired t-tests. Expression levels in control and T1D lymphocyte cDNA were compared by F-test and t-test. P-values <0.05 were considered statistically significant.

SNPs were mapped to genes/proteins by identifying all SNPs categorized as tagging each gene in the Wellcome Trust Case Control Consortium (WTCCC) genome-wide association scan data [9]. We included SNPs 5 kb upstream and 1 kb downstream of each gene, since these regions have been shown to be strongly enriched for gene regulatory elements important for the function of the particular genes [29].
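
The flanking-window rule above can be illustrated with a minimal sketch. It assumes that gene coordinates and strand are available and that "upstream" is interpreted relative to the direction of transcription; the `Gene` fields, the helper names, the gene name `GENE_X` and the SNP identifiers are all hypothetical and are not taken from the WTCCC data or from the mapping procedure actually used in this study.

```python
from __future__ import annotations

from dataclasses import dataclass

UPSTREAM_BP = 5_000    # flank upstream of the gene, as stated in the text
DOWNSTREAM_BP = 1_000  # flank downstream of the gene, as stated in the text


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; the field names are illustrative assumptions."""
    name: str
    chrom: str
    start: int   # lowest genomic coordinate of the gene
    end: int     # highest genomic coordinate of the gene
    strand: str  # "+" or "-"


def snp_window(gene: Gene) -> tuple[int, int]:
    """Return the inclusive coordinate range in which SNPs are assigned to the gene."""
    if gene.strand == "+":
        low, high = gene.start - UPSTREAM_BP, gene.end + DOWNSTREAM_BP
    else:
        # On the minus strand, upstream lies at the higher coordinates.
        low, high = gene.start - DOWNSTREAM_BP, gene.end + UPSTREAM_BP
    return max(low, 1), high


def snps_for_gene(gene: Gene, snps: list[tuple[str, str, int]]) -> list[str]:
    """Select the identifiers of SNPs (id, chromosome, position) inside the window."""
    low, high = snp_window(gene)
    return [
        snp_id
        for snp_id, chrom, pos in snps
        if chrom == gene.chrom and low <= pos <= high
    ]


if __name__ == "__main__":
    gene = Gene("GENE_X", "11", 10_000, 20_000, "+")
    snps = [("snp_a", "11", 5_000), ("snp_b", "11", 21_001), ("snp_c", "6", 15_000)]
    print(snp_window(gene), snps_for_gene(gene, snps))
```

Treating the window as strand-aware is a design choice made for this sketch; a strand-agnostic window would simply extend 5 kb below the gene start and 1 kb above the gene end.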
For each gene only the SNP with the lowest p-value was used, to avoid introducing a bias towards genes with many low p-value SNPs in linkage disequilibrium with each other. For each module the significance of the enrichment of SNPs in the best 0.1, 1 and 5 percentile was compared to 1,000 randomly generated protein interaction networks with a similar number of proteins. The random networks were composed of proteins of similar size to the proteins in the actual network tested, to normalize for the fact that large genes have a higher chance of containing T1D-associated SNPs in the best percentiles of a study due to their size alone. P-values were adjusted for multiple testing using Bonferroni correction by multiplying the nominal p-values by 12, the total number of tests performed in this study.

The significance of overlap between genes identified in our analysis and in a paper by D'Hertog et al. [17] was calculated using hypergeometric statistics.

Acknowledgments

We thank Bodil Bosmann Jørgensen for excellent technical assistance. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk.

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: The study is supported by the Juvenile Diabetes Research Foundation (JDRF grant no. 33-2008-391), The Danish Medical Research Council (grants to R.B. and K.L.) and Novo Nordisk A/S. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1.
Bader JS, Chaudhuri A, Rothberg JM, Chant J. Gaining confidence in high-throughput protein interaction networks. Nature Biotechnology. 2004;22:78–85.
2. Bork P, Dandekar T, Diaz-Lazcoz Y, Eisenhaber F, Huynen M, et al. Predicting function: from genes to genomes and back. Journal of Molecular Biology. 1998;283:707–725.
3. Ewing RM, Chu P, Elisma F, Li H, Taylor P, et al. Large-scale mapping of human protein-protein interactions by mass spectrometry. Molecular Systems Biology. 2007;3:89.
4. Fraser H, Plotkin J. Using protein complexes to predict phenotypic effects of gene mutation. Genome Biology. 2007;8:R252.
5. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437:1173–1178.
6. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature. 2002;417:399–403.
7. Bergholdt R, Størling Z, Lage K, Karlberg E, Òlason P, et al. Integrative analysis for finding genes and networks involved in diabetes and other complex diseases. Genome Biology. 2007;8:R253.
8. Concannon P, Erlich H, Julier C, Morahan G, Nerup J, et al. Type 1 diabetes - Evidence for susceptibility loci from four genome-wide linkage scans in 1,435 multiplex families. Diabetes. 2005;54:2995–3001.
9. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678.
10. Allcock RJN, Williams JH, Price P. The central MHC gene, BAT1, may encode a protein that down-regulates cytokine production. Genes to Cells. 2001;6:487–494.
11. Adler HJ, Winnicki RS, Gong T-WL, Lomax MI. A Gene Upregulated in the Acoustically Damaged Chick Basilar Papilla Encodes a Novel WD40 Repeat Protein. Genomics. 1999;56:59–69.
12.
Kile BT, Panopoulos AD, Stirzaker RA, Hacking DF, Tahtamouni LH, et al. Mutations in the cofilin partner Aip1/Wdr1 cause autoinflammatory disease and macrothrombocytopenia. Blood. 2007;110:2371–2380.
13. Neumann CA, Krause DS, Carman CV, Das S, Dubey DP, et al. Essential role for the peroxiredoxin Prdx1 in erythrocyte antioxidant defence and tumour suppression. Nature. 2003;424:561–565.
14. Shau H, Butterfield L, Chiu R, Kim A. Cloning and sequence analysis of candidate human natural killer-enhancing factor genes. Immunogenetics. 1994;40:129–134.
15. Zhang Q, Wang HY, Liu X, Wasik MA. STAT5A is epigenetically silenced by the tyrosine kinase NPM1-ALK and acts as a tumor suppressor by reciprocally inhibiting NPM1-ALK expression. Nature Medicine. 2007;13:1341–1348.
16. Barrett JC, Clayton DG, Concannon P, Akolkar B, Cooper JD, et al. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nature Genetics. 2009;41:703–707.
17. D'Hertog W, Overbergh L, Lage K, Ferreira GB, Maris M, et al. Proteomics Analysis of Cytokine-induced Dysfunction and Death in Insulin-producing INS-1E Cells: New Insights into the Pathways Involved. Mol Cell Proteomics. 2007;6:2180–2199.
18. Lage K, Karlberg E, Størling Z, Olason P, Pedersen A, et al. A human phenome-interactome network of protein complexes implicated in genetic disorders. Nature Biotechnology. 2007;25:309–316.
19. Hakonarson H, Grant SFA, Bradfield JP, Marchand L, Kim CE, et al. A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature. 2007;448:591–594.
20. Todd J, Walker N, Cooper J, Smyth D, Downes K, et al. Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nature Genetics. 2007;39:857–864.
21. Cordell H, Todd J, Bennett S, Kawaguchi Y, Farrall M.
Two-locus maximum lod score analysis of a multifactorial trait: joint consideration of IDDM2 and IDDM4 with IDDM1 in type 1 diabetes. American Journal of Human Genetics. 1995;57:920–934.
22. Cordell H, Wedig G, Jacobs K, Elston R. Multilocus linkage tests based on affected relative pairs. American Journal of Human Genetics. 2000;66:1273–1286.
23. Nerup J, Pociot F. European Consortium for IDDM genome studies. Members: Holm. P JC, Kochum. I, Senee. V, Blanc. H, Papp. J, Åkesson. K, Bartsocas. C, deLeiva. A, Dahlquist. G, Rønningen. KS, Lathrop. M, Luthman. H, Pociot. F, Nerup. J. A genomewide scan for Type 1-diabetes susceptibility in Scandinavian families: Identification of new loci with evidence of interactions. American Journal of Human Genetics. 2001;69:1301–1313.
24. Källberg H, Padyukov L, Plenge RM, Rönnelid J, Gregersen PK, et al. Gene-Gene and Gene-Environment Interactions Involving HLA-DRB1, PTPN22, and Smoking in Two Subsets of Rheumatoid Arthritis. The American Journal of Human Genetics. 2007;80:867–875.
25. Smyth DJ, Cooper JD, Howson JMM, Walker NM, Plagnol V, et al. PTPN22 Trp620 Explains the Association of Chromosome 1p13 With Type 1 Diabetes and Shows a Statistical Interaction With HLA Class II Genotypes. Diabetes. 2008;57:1730–1737.
26. Cooper JD, Smyth DJ, Smiles AM, Plagnol V, Walker NM, et al. Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci. Nature Genetics. 2008;40:1399–1401.
27. Evans DM, Marchini J, Morris AP, Cardon LR. Two-Stage Two-Locus Models in Genome-Wide Association. PLoS Genetics. 2006;2:e157.
28. Livak K, Schmittgen T. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods (San Diego, Calif). 2001;25:402–408.
29. Veyrieras J-B, Kudaravalli S, Kim SY, Dermitzakis ET, Gilad Y, et al.
High-Resolution Mapping of Expression-QTLs Yields Insight into Human Gene Regulation. PLoS Genetics. 2008;4:e1000214.