Missing single nucleotide polymorphisms in Genetic Risk Scores: A simulation study

PLoS One. 2018 Jul 19;13(7):e0200630. doi: 10.1371/journal.pone.0200630. eCollection 2018.

Abstract

Using a genetic risk score (GRS) to predict a phenotype in a target sample can be complicated by missing data on the single nucleotide polymorphisms (SNPs) that comprise the GRS. This is usually addressed by imputation, by omitting the missing SNPs, or by replacing them with proxy SNPs. To assess the impact of the omission and proxy approaches on effect size estimation and predictive ability of weighted and unweighted GRS with small numbers of SNPs, we simulated a dichotomous phenotype conditional on real genotype data. We considered scenarios in which the proportion of missing SNPs ranged from 20% to 70%. We assessed the impact of omitting or replacing missing SNPs on the association between the GRS and the phenotype, the corresponding statistical power, and the area under the receiver operating characteristic curve. Omission resulted in a larger bias of the effect size towards the null value, a smaller predictive ability, and a greater loss of statistical power than the proxy approaches. The predictive ability of a weighted GRS that includes SNPs with large weights depends on the availability of these large-weight SNPs.
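
To make the omission and proxy approaches concrete, the following is a minimal sketch of how a weighted or unweighted GRS might be computed for one individual when some SNPs are missing. It is not the authors' simulation code. The function name, the SNP identifiers, the weights, and the genotypes are all hypothetical, and the sketch assumes that a proxy SNP simply inherits the weight of the SNP it replaces and that genotypes are coded as risk allele counts (0, 1, or 2).

```python
from typing import Dict, List, Optional


def genetic_risk_score(
    snps: List[str],
    genotypes: Dict[str, int],
    weights: Optional[Dict[str, float]] = None,
    proxies: Optional[Dict[str, str]] = None,
) -> float:
    """Sum risk allele counts over the SNPs that comprise the GRS.

    snps      -- SNPs that define the GRS
    genotypes -- risk allele counts (0, 1, 2) for SNPs available in the target sample
    weights   -- per-SNP weights; None gives an unweighted GRS (every weight is 1)
    proxies   -- optional map from a GRS SNP to a proxy SNP (hypothetical pairing)
    """
    score = 0.0
    for snp in snps:
        weight = 1.0 if weights is None else weights[snp]
        if snp in genotypes:
            # SNP is available: use it directly.
            score += weight * genotypes[snp]
        elif proxies is not None and proxies.get(snp) in genotypes:
            # Proxy approach: substitute the proxy genotype.
            # Assumption: the proxy keeps the weight of the original SNP.
            score += weight * genotypes[proxies[snp]]
        # Otherwise: omission approach, the SNP contributes nothing.
    return score


# Hypothetical three-SNP GRS in which rs_c carries a large weight.
snps = ["rs_a", "rs_b", "rs_c"]
weights = {"rs_a": 0.25, "rs_b": 0.5, "rs_c": 1.5}

complete = {"rs_a": 2, "rs_b": 1, "rs_c": 1}        # no missing SNPs
missing_c = {"rs_a": 2, "rs_b": 1, "rs_c_proxy": 1}  # rs_c missing, proxy genotyped
proxies = {"rs_c": "rs_c_proxy"}

full_weighted = genetic_risk_score(snps, complete, weights)
omit_weighted = genetic_risk_score(snps, missing_c, weights)
proxy_weighted = genetic_risk_score(snps, missing_c, weights, proxies)

full_unweighted = genetic_risk_score(snps, complete)
omit_unweighted = genetic_risk_score(snps, missing_c)
```

In this toy example, omitting the large-weight SNP removes most of the weighted score, while the unweighted score loses only one allele count. This is only an illustration of why a weighted GRS can be sensitive to the availability of large-weight SNPs, not a reproduction of the study's results.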

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Disease / genetics*
  • Genetic Association Studies / methods*
  • Genetic Predisposition to Disease / genetics*
  • Genotype
  • Humans
  • Phenotype
  • Polymorphism, Single Nucleotide*
  • Risk Assessment / methods
  • Risk Assessment / statistics & numerical data
  • Risk Factors

Grants and funding

This work was supported by Fonds de Recherche du Québec (FRSQ), URL: http://www.frqs.gouv.qc.ca. Grant number: 32123. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.