Genet Epidemiol. 2017 Feb;41(2):152-162. doi: 10.1002/gepi.22027. Epub 2016 Dec 26.

Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study.

Author information

1. Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America.
2. Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America.
3. Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, United States of America.
4. Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut, United States of America.
5. Massachusetts Area Veterans Epidemiology Research and Information Center (MAVERIC), VA Cooperative Studies Program, VA Boston Healthcare System, Boston, Massachusetts, United States of America.
6. Department of Medicine, Harvard University School of Medicine, Boston, Massachusetts, United States of America.
7. Department of Medicine, Yale University School of Medicine, New Haven, Connecticut, United States of America.
8. Bruce W. Carter Miami Veterans Affairs (VA) Medical Center, Miami, Florida, United States of America.
9. Department of Psychiatry, University of Miami Miller School of Medicine, Miami, Florida, United States of America.

Abstract

A key step in genomic studies is to assess high throughput measurements across millions of markers for each participant's DNA, either using microarrays or sequencing techniques. Accurate genotype calling is essential for downstream statistical analysis of genotype-phenotype associations, and next generation sequencing (NGS) has recently become a more common approach in genomic studies. How the accuracy of variant calling in NGS-based studies affects downstream association analysis has not, however, been studied using empirical data in which both microarrays and NGS were available. In this article, we investigate the impact of variant calling errors on the statistical power to identify associations between single nucleotides and disease, and on associations between multiple rare variants and disease. Both differential and nondifferential genotyping errors are considered. Our results show that the power of burden tests for rare variants is strongly influenced by the specificity in variant calling, but is rather robust with regard to sensitivity. By using the variant calling accuracies estimated from a substudy of a Cooperative Studies Program project conducted by the Department of Veterans Affairs, we show that the power of association tests is mostly retained with commonly adopted variant calling pipelines. An R package, GWAS.PC, is provided to accommodate power analysis that takes account of genotyping errors (http://zhaocenter.org/software/).
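
The contrast between sensitivity and specificity described above can be illustrated with a rough calculation. The sketch below is not the GWAS.PC package and does not reproduce the authors' method. It assumes a simplified burden test that collapses each participant to a binary "carries at least one rare variant" status and compares carrier frequencies between cases and controls with a two-proportion z-test. The function names, carrier rates, sample sizes, and error rates are all hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def called_carrier_rate(true_rate, sensitivity, specificity):
    """Probability that a participant is *called* a carrier.

    True carriers are detected with probability `sensitivity`;
    non-carriers are falsely called with probability 1 - specificity.
    """
    return sensitivity * true_rate + (1.0 - specificity) * (1.0 - true_rate)


def two_proportion_power(p_case, p_ctrl, n_case, n_ctrl, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Uses the normal approximation; both rates must lie strictly in (0, 1).
    """
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    pooled = (p_case * n_case + p_ctrl * n_ctrl) / (n_case + n_ctrl)
    se_null = sqrt(pooled * (1.0 - pooled) * (1.0 / n_case + 1.0 / n_ctrl))
    se_alt = sqrt(p_case * (1.0 - p_case) / n_case
                  + p_ctrl * (1.0 - p_ctrl) / n_ctrl)
    diff = p_case - p_ctrl
    upper = 1.0 - _STD_NORMAL.cdf((z_crit * se_null - diff) / se_alt)
    lower = _STD_NORMAL.cdf((-z_crit * se_null - diff) / se_alt)
    return upper + lower


def power_with_calling_error(p_case, p_ctrl, n_case, n_ctrl,
                             case_acc=(1.0, 1.0), ctrl_acc=(1.0, 1.0),
                             alpha=0.05):
    """Power when carrier status is observed through imperfect calling.

    `case_acc` and `ctrl_acc` are (sensitivity, specificity) pairs.
    Equal pairs model nondifferential error; unequal pairs model
    differential error.
    """
    q_case = called_carrier_rate(p_case, *case_acc)
    q_ctrl = called_carrier_rate(p_ctrl, *ctrl_acc)
    return two_proportion_power(q_case, q_ctrl, n_case, n_ctrl, alpha)


if __name__ == "__main__":
    # Hypothetical scenario: 1% of controls and 2% of cases carry a rare variant.
    args = dict(p_case=0.02, p_ctrl=0.01, n_case=2000, n_ctrl=2000)
    scenarios = {
        "perfect calling": ((1.0, 1.0), (1.0, 1.0)),
        "sensitivity 0.90": ((0.9, 1.0), (0.9, 1.0)),
        "specificity 0.90": ((1.0, 0.9), (1.0, 0.9)),
    }
    for label, (case_acc, ctrl_acc) in scenarios.items():
        power = power_with_calling_error(case_acc=case_acc,
                                         ctrl_acc=ctrl_acc, **args)
        print(f"{label}: power = {power:.3f}")
```

Under these assumptions, a 10% loss of sensitivity shrinks both carrier rates proportionally and costs little power, while a 10% loss of specificity adds false carriers at a rate far above the true carrier frequency and dilutes the signal. Passing different accuracy pairs for cases and controls gives a crude view of differential error, which can produce apparent associations even when the true carrier rates are equal.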

KEYWORDS: genome wide association test; genotyping; genotyping error; sequencing; statistical power

PMID: 28019059
PMCID: PMC5604789
DOI: 10.1002/gepi.22027
[Indexed for MEDLINE]