Send to

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2011 Jul 15;27(14):1995-7. doi: 10.1093/bioinformatics/btr305. Epub 2011 Jun 2.

Coverage tradeoffs and power estimation in the design of whole-genome sequencing experiments for detecting association.

Author information

  • 1Department of Computer Science, Columbia University, New York, NY 10027, USA.



Whole-genome sequencing (WGS) allows direct interrogation of previously undetected uncommon or rare variants, which potentially contribute to the missing heritability of human disease. However, cost of sequencing large numbers of samples limits its application in case-control association studies. Here, we describe theoretical and empirical design considerations for such sequencing studies, aimed at maximizing the power of detecting association under the constraint of study-wide cost.


We consider two cost regimes. First, assuming cost is proportional to the total amount of base pairs to be sequenced across all samples, which is a practical model for whole-genome sequencing, we explored the tradeoff in terms of study power between increasing the number of subjects and increasing depth coverage. We demonstrate that the optimal power of detecting association is achieved at medium depth coverage under a wide range of realistic conditions for case-only sequencing designs. Second, if cost is fixed per sample, which is approximately the case in exome sequencing, we show that in a simple case+control sequencing study, the optimal design should include cases totaling 1/e of all subjects.


A web tool implementing the methods is available at

[PubMed - indexed for MEDLINE]
Free PMC Article
PubMed Commons home

PubMed Commons

How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for HighWire Icon for PubMed Central
    Loading ...
    Support Center