Display Settings:

Format

Send to:

Choose Destination
We are sorry, but NCBI web applications do not support your browser and may not function properly. More information
Hum Hered. 2012;73(3):139-47. doi: 10.1159/000337300. Epub 2012 Jun 7.

Two-stage extreme phenotype sequencing design for discovering and testing common and rare genetic variants: efficiency and power.

Author information

  • 1Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA 19104, USA.

Abstract

Next-generation sequencing technology provides an unprecedented opportunity to identify rare susceptibility variants. It is not yet financially feasible to perform whole-genome sequencing on a large number of subjects, and a two-stage design has been advocated to be a practical option. In stage I, variants are discovered by sequencing the whole genomes of a small number of carefully selected individuals. In stage II, the discovered variants of a large number of individuals are genotyped to assess associations. Individuals with extreme phenotypes are typically selected in stage I. Using simulated data for unrelated individuals, we explore two important aspects of this two-stage design: the efficiency of discovering common and rare single-nucleotide polymorphisms (SNPs) in stage I and the impact of incomplete SNP discovery in stage I on the power of testing associations in stage II. We applied a sum test and a sum of squared score test for gene-based association analyses evaluating the power of the two-stage design. We obtained the following results from extensive simulation studies and analysis of the GAW17 dataset. When individuals with trait values more extreme than the 99.7-99th quantile were included in stage I, the two-stage design could achieve the same power as or even higher than the one-stage design if the rare causal variants had large effect sizes. In such design, fewer than half of the total SNPs including more than half of the causal SNPs were discovered, which included nearly all SNPs with minor allele frequencies (MAFs) ≥5%, more than half of the SNPs with MAFs between 1% and 5%, and fewer than half of the SNPs with MAFs <1%. Although a one-stage design may be preferable to identify multiple rare variants having small to moderate effect sizes, our observations support using the two-stage design as a cost-effective option for next-generation sequencing studies.

Copyright © 2012 S. Karger AG, Basel.

PMID:
22678112
[PubMed - indexed for MEDLINE]
PMCID:
PMC3558993
Free PMC Article

Images from this publication.See all images (4)Free text

Figure 1
Figure 2
Figure 3
Figure 4
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Icon for S. Karger AG, Basel, Switzerland Icon for PubMed Central
    Loading ...
    Write to the Help Desk