Hum Hered. 2012;73(3):159-73. doi: 10.1159/000338943. Epub 2012 Jun 15.

Efficient adaptively weighted analysis of secondary phenotypes in case-control genome-wide association studies.

Author information

Division of Biostatistics, Department of Population Health, School of Medicine, New York University, New York, NY 10016, USA.


We propose and compare methods of analysis for detecting associations between genotypes of a single nucleotide polymorphism (SNP) and a dichotomous secondary phenotype (X), when the data arise from a case-control study of a primary dichotomous phenotype (D), which is not rare. We consider both a dichotomous genotype (G), as in recessive or dominant models, and an additive genetic model based on the number of minor alleles present. To estimate the log odds ratio β₁ relating X to G in the general population, one needs to understand the conditional distribution [D ∣ X, G] in the general population. Under the most general model for [D ∣ X, G], one needs external data on P(D = 1) to estimate β₁. We show that for this 'full model', the maximum likelihood estimator (FM) corresponds to a previously proposed weighted logistic regression (WL) approach if G is dichotomous. For the additive model, WL yields results numerically close, but not identical, to those of the maximum likelihood estimator FM. Efficiency can be gained by assuming that [D ∣ X, G] is a logistic model with no interaction between X and G (the 'reduced model'). However, the resulting maximum likelihood estimator (RM) can be misleading in the presence of interactions. We therefore propose an adaptively weighted approach (AW) that captures the efficiency of RM but is robust to the occasional SNP that might interact with the secondary phenotype to affect the risk of the primary disease. We study the robustness of FM, WL, RM and AW to misspecification of P(D = 1). In principle, one should be able to estimate β₁ without external information on P(D = 1) under the reduced model. However, our simulations show that the resulting inference is unreliable. Therefore, in practice one needs to introduce external information on P(D = 1), even in the absence of interactions between X and G.
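The weighted logistic regression idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that WL takes the common inverse-probability-of-sampling form, in which cases are weighted by P(D = 1) divided by the case fraction in the sample and controls by P(D = 0) divided by the control fraction. The function names, the simulation parameters (allele-carrier frequency 0.3, the coefficients of the D model) and the use of the simulated population prevalence as the "external" value of P(D = 1) are all hypothetical choices made for illustration.

```python
import numpy as np


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def weighted_logistic(design, y, w, n_iter=50, tol=1e-10):
    """Maximize a weighted logistic log-likelihood by Newton-Raphson."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        grad = design.T @ (w * (y - p))                      # weighted score
        hess = (design * (w * p * (1.0 - p))[:, None]).T @ design
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def sampling_weights(d, prevalence):
    """Assumed WL weights: population share over sample share, by case status."""
    d = np.asarray(d)
    frac_cases = d.mean()
    return np.where(d == 1,
                    prevalence / frac_cases,
                    (1.0 - prevalence) / (1.0 - frac_cases))


def simulate_and_fit(seed=0, n_pop=400_000, n_each=20_000, beta0=-1.0, beta1=0.5):
    """Simulate a population, draw a case-control sample, and estimate beta1."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(1, 0.3, n_pop)                          # dichotomous genotype G
    x = rng.binomial(1, expit(beta0 + beta1 * g))            # secondary phenotype X
    d = rng.binomial(1, expit(-2.0 + 1.0 * x + 0.7 * g))     # primary phenotype D, no X-G interaction
    prevalence = d.mean()                                    # stands in for external P(D = 1)

    cases = rng.choice(np.flatnonzero(d == 1), n_each, replace=False)
    controls = rng.choice(np.flatnonzero(d == 0), n_each, replace=False)
    idx = np.concatenate([cases, controls])

    design = np.column_stack([np.ones(idx.size), g[idx]])    # intercept and G
    naive = weighted_logistic(design, x[idx], np.ones(idx.size))
    wl = weighted_logistic(design, x[idx], sampling_weights(d[idx], prevalence))
    return {"true": beta1, "naive": naive[1], "weighted": wl[1]}


if __name__ == "__main__":
    print(simulate_and_fit())
```

Running the script prints the true log odds ratio alongside the unweighted and weighted estimates from one simulated case-control sample, so the effect of ignoring the sampling design can be inspected directly. Changing the prevalence passed to `sampling_weights` gives a rough way to explore sensitivity to a misspecified P(D = 1).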

