    Ann Hum Genet. 2008 Nov;72(Pt 6):834-47. Epub 2008 Aug 13.

    Efficient association study design via power-optimized tag SNP selection.

    Han B, Kang HM, Seo MS, Zaitlen N, Eskin E.

    Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA.

    Discovering statistical correlation between causal genetic variation and clinical traits through association studies is an important method for identifying the genetic basis of human diseases. Since fully resequencing a cohort is prohibitively costly, genetic association studies take advantage of the local correlation structure (linkage disequilibrium) between single nucleotide polymorphisms (SNPs) by selecting a subset of SNPs to be genotyped (tag SNPs). While many current association studies are performed using commercially available high-throughput genotyping products that define a set of tag SNPs, choosing tag SNPs remains an important problem both for custom follow-up studies and for designing the high-throughput genotyping products themselves. The most widely used tag SNP selection method optimizes the correlation between SNPs (r²). However, tag SNPs chosen based on an r² criterion do not necessarily maximize the statistical power of an association study. We propose a study design framework that chooses SNPs to maximize power and efficiently measures the power through empirical simulation. Empirical results based on the HapMap data show that our method gains considerable power over a widely used r²-based method, or equivalently reduces the number of tag SNPs required to attain the desired power of a study. Our power-optimized 100k whole genome tag set provides equivalent power to the Affymetrix 500k chip for the CEU population. For the design of custom follow-up studies, our method provides up to twice the power increase using the same number of tag SNPs as r²-based methods. Our method is publicly available via web server at http://design.cs.ucla.edu.
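
For readers unfamiliar with the r² criterion the abstract contrasts against, the following is a minimal sketch of pairwise r² and a common greedy tagging scheme built on it. It is not the authors' power-optimized method. The function names, the 0/1 haplotype coding, and the 0.8 threshold are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise LD r^2 between two biallelic SNPs.

    Assumes phased haplotypes with alleles coded 0/1 (an assumption of
    this sketch), one entry per haplotype.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("SNP vectors must be non-empty and of equal length")
    n = len(a)
    p_a = sum(a) / n                               # frequency of allele 1 at SNP a
    p_b = sum(b) / n                               # frequency of allele 1 at SNP b
    p_ab = sum(x & y for x, y in zip(a, b)) / n    # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                 # a monomorphic SNP has no defined r^2
    d = p_ab - p_a * p_b                           # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_tags(snps: List[Sequence[int]], threshold: float = 0.8) -> List[int]:
    """Greedy pairwise tagging: repeatedly pick the SNP that covers the most
    still-uncovered SNPs at r^2 >= threshold. Returns indices of chosen tags."""
    m = len(snps)
    covers = [
        {j for j in range(m)
         if i == j or r_squared(snps[i], snps[j]) >= threshold}
        for i in range(m)
    ]
    uncovered = set(range(m))
    tags: List[int] = []
    while uncovered:
        best = max(range(m), key=lambda i: len(covers[i] & uncovered))
        tags.append(best)
        uncovered -= covers[best]
    return tags


if __name__ == "__main__":
    # Toy data: four SNPs typed on four haplotypes (hypothetical values).
    snps = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1], [1, 1, 0, 0]]
    print(greedy_tags(snps))
```

A scheme like this maximizes coverage under a correlation threshold. The abstract's point is that such coverage does not necessarily maximize the statistical power of the resulting study.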

    PMID: 18702637 [PubMed - indexed for MEDLINE]

    PMCID: 2574965
