Copyright © 2005 by the Genetics Society of America
A Statistical Multiprobe Model for Analyzing cis and trans Genes in Genetical Genomics Experiments With Short-Oligonucleotide Arrays
*Groningen Bioinformatics Centre (GBiC), Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, 9751 NN, Haren, The Netherlands and †Department of Cell Biology, Section of Stem Cell Biology, University Medical Centre Groningen, University of Groningen, Groningen, 9713 AV, The Netherlands
¹Corresponding author: Groningen Bioinformatics Centre, University of Groningen, Kerklaan 30, 9751 NN, Haren, The Netherlands. E-mail: r.alberts/at/rug.nl
Communicating editor: C. Haley
Received May 26, 2005; Accepted July 15, 2005.
Abstract
Short-oligonucleotide arrays typically contain multiple probes per gene. In genetical genomics applications a statistical model for the individual probe signals can help in separating “true” differential mRNA expression from “ghost” effects caused by polymorphisms, misdesigned probes, and batch effects. It can also help in detecting alternative splicing, start, or termination.
In a genetical genomics experiment, a panel of 30 genetically different recombinant inbred mice was derived from a cross between parental strains C57BL/6 (B6) and DBA/2 (D2) (Jansen and Nap 2001; Bystrykh et al. 2005). These 30 mice were profiled with Affymetrix MG-U74Av2 arrays (12,422 probe sets), using RNA isolated from hematopoietic stem cells. The observed array data were background corrected and quantile normalized (Bolstad et al. 2003; Gautier et al. 2004). Although various methods have been developed to compute a single expression value per probe set for further data analysis (e.g., Zhang et al. 2003; Wu and Irizarry 2004; Manly et al. 2005), we here develop an alternative statistical method that more fully exploits the information contained in the individual probe signals.
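The quantile normalization step mentioned above can be illustrated with a minimal sketch. It shows the general idea only (every array is forced onto the same empirical distribution while each probe keeps its rank within its array) and is not the software used in the study. The function name `quantile_normalize` and the probes-by-arrays layout are assumptions made for illustration, and ties are broken arbitrarily rather than averaged.

```python
import numpy as np

def quantile_normalize(x):
    """Illustrative quantile normalization (hypothetical helper).

    x: array of shape (n_probes, n_arrays), one column per array.
    Returns an array of the same shape in which every column has the
    same set of values, while ranks within each column are preserved.
    Ties are broken by sort order, not averaged (a simplification).
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)       # row indices that sort each column
    sorted_x = np.sort(x, axis=0)       # each column sorted ascending
    reference = sorted_x.mean(axis=1)   # mean of the k-th smallest values
    out = np.empty_like(x)
    for j in range(x.shape[1]):
        # the k-th smallest entry of column j receives the k-th reference value
        out[order[:, j], j] = reference
    return out
```

Calling `quantile_normalize` on a probes-by-arrays matrix of background-corrected intensities returns a matrix whose columns share one distribution, so that remaining differences between arrays reflect rank changes of individual probes rather than overall intensity shifts.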
Differential expression for a given gene can result from trans-regulation by other genes or from cis-regulation due to variation in the region of the gene itself (altering functional motifs in the promoter region, changing the stability of the mRNA, or modifying the gene product in such a way that the feedback loop is shifted; Jansen and Nap 2004). In either case, signal differences are expected to be rather stable across probes (Figure 1A).
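To make the notion of a difference that is stable across probes concrete, the following sketch simulates one probe set for 30 mice and 16 probes and computes the B6-versus-D2 difference for each probe separately. It is a toy illustration under stated assumptions, not the authors' model: the helper names, the log-scale signal levels, the noise level, the alternating genotype assignment, and the single "ghost" probe are all hypothetical.

```python
import numpy as np

def per_probe_differences(signals, genotype):
    """Mean D2 signal minus mean B6 signal, computed per probe.

    signals:  (n_mice, n_probes) array of log-scale probe signals.
    genotype: (n_mice,) array, 0 = B6 allele, 1 = D2 allele at a marker.
    """
    signals = np.asarray(signals, dtype=float)
    genotype = np.asarray(genotype)
    b6_mean = signals[genotype == 0].mean(axis=0)
    d2_mean = signals[genotype == 1].mean(axis=0)
    return d2_mean - b6_mean

def simulate_probe_set(rng, n_mice=30, n_probes=16, shift=1.0,
                       ghost_probe=None, noise=0.1):
    """Simulate one probe set (hypothetical toy data, not real array data).

    ghost_probe=None: the genotype shifts all probes equally, as expected
                      for a genuine expression difference.
    ghost_probe=k:    only probe k differs between genotypes, mimicking a
                      probe-specific effect such as a polymorphism.
    """
    genotype = np.arange(n_mice) % 2                 # alternate B6 / D2
    probe_level = rng.normal(8.0, 1.0, size=n_probes)  # probe affinities
    signals = probe_level + rng.normal(0.0, noise, size=(n_mice, n_probes))
    if ghost_probe is None:
        signals += shift * genotype[:, None]
    else:
        signals[:, ghost_probe] += shift * genotype
    return signals, genotype

rng = np.random.default_rng(0)
consistent, geno = simulate_probe_set(rng)
ghost, geno_ghost = simulate_probe_set(rng, ghost_probe=3)
print(np.round(per_probe_differences(consistent, geno), 2))
print(np.round(per_probe_differences(ghost, geno_ghost), 2))
```

In the first simulated probe set all 16 differences are close to the simulated shift, whereas in the second only one probe carries the difference. A single summary value per probe set would hide this distinction, which is the motivation for modeling individual probe signals.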
In a genetical genomics analysis we use molecular markers to search for a genome position where the difference in expression between mice carrying the B6 marker allele and mice carrying the D2 marker allele is (most) significant; this genome position is commonly denoted “expression quantitative trait locus” (eQTL or just QTL). In our study the 30 mice have been expression profiled by using probe sets of size 16. The 30 × 16 signals in a given probe set are decomposed by a statistical model whose terms include a batch factor and QTL effects that may be probe specific.
Genes colocalizing within 20 Mb of their QTL are termed here cis genes, and genes mapping elsewhere are termed trans genes. The cis genes show many more probe-specific QTL effects than the trans genes do (Figure 1, C and D). Our analysis separates probe sets that are “consistently” cis across probes from those that are more “probe-specific” cis and should be investigated in more detail in silico or in the lab (Figure 1C).
Some genomic regions showed up as “master” regulators in trans of many genes, but only if the factor batch was excluded from the model (particularly chromosome 4, see Figure 1E).
We have shown that current methods fail to consider various influential sources of variation (technical, molecular, and sequence) in genetical genomics studies, with worse fit, loss of relevant information, and possibly wrong conclusions as a result. Our statistical analyses on the entire probe data set are therefore currently the methods of choice in genetical genomics applications with short-oligonucleotide arrays to help in separating “true” cis and trans genes from “ghost” ones. Data are available at www.genenetwork.org (Chesler et al. 2004).
Acknowledgments
We thank J. P. Nap, D. J. de Koning, and C. S. Haley for stimulating discussions. R.A. was supported by Netherlands Organization for Scientific Research-Biomolecular Informatics grant 050-50-203.
References
Trends Genet. 2001 Jul; 17(7):388-91.
Nat Genet. 2005 Mar; 37(3):225-32.
Bioinformatics. 2003 Jan 22; 19(2):185-93.
Bioinformatics. 2004 Feb 12; 20(3):307-15.
Nat Biotechnol. 2003 Jul; 21(7):818-21.
Trends Genet. 2004 May; 20(5):223-5.
Genome Res. 2005 May; 15(5):681-91.
Physiol Genomics. 2004 Aug 11; 18(3):308-15.
Genome Res. 2005 May; 15(5):748-54.
Genome Biol. 2005; 6(6):R54.
Nat Neurosci. 2004 May; 7(5):485-6.