Hum Hered. 2010 Jun; 70(1): 42–54.
Published online 2010 Apr 23. doi:  10.1159/000288704
PMCID: PMC2912645

A Data-Adaptive Sum Test for Disease Association with Multiple Common or Rare Variants


Since associations between complex diseases and common variants are typically weak, and approaches to genotyping rare variants (e.g. by next-generation resequencing) multiply, there is an urgent demand to develop powerful association tests that are able to detect disease associations with both common and rare variants. In this article we present such a test. It is based on data-adaptive modifications to a so-called Sum test originally proposed for common variants, which aims to strike a balance between utilizing information on multiple markers in linkage disequilibrium and reducing the cost of large degrees of freedom or of multiple testing adjustment. When applied to multiple common or rare variants in a candidate region, the proposed test is easy to use with 1 degree of freedom and without the need for multiple testing adjustment. We show that the proposed test has high power across a wide range of scenarios with either common or rare variants, or both. In particular, in some situations the proposed test performs better than several commonly used methods.

Key Words: Genome-wide association study, Logistic regression, Multimarker analysis, Single nucleotide polymorphism


Genome-wide association studies (GWASs) have been successful in identifying many DNA variants, namely single nucleotide polymorphisms (SNPs), associated with complex and common diseases. The basic mechanism of GWASs is indirect mapping of disease loci through linkage disequilibrium (LD) among physically close DNA variants, facilitating the use of tag SNPs to survey the whole genome. Their effectiveness also depends on the validity of the common disease-common variant (CDCV) hypothesis. In spite of the success of GWASs, for most common diseases the proportion of the overall phenotypic variance explained by the discovered disease-susceptibility loci remains very small [Maher, 2008]. Among several possible explanations, we consider two. First, because the effect sizes of most discovered risk loci are quite small, and they account for only remarkably small amounts of the phenotypic variances, it is likely that many more loci with much smaller effects remain to be detected. As Flint and Mackay [2009] showed in their review, the distribution of effect sizes at 140 loci discovered so far for 20 human disease phenotypes ranged from odds ratio (OR) 1 to 2.4, with the overwhelming majority between 1 and 1.4 and with a mode of only 1.2. However, as observed in a recent multi-stage study [Willer et al., 2009] that detected six previously unknown common variants influencing body mass index (BMI), despite the huge sample sizes (n = 32,387 in stage 1 and n = 45,018 in stage 2), the eight known loci explained only 0.84% of the variance, although family and twin studies have shown that genetic factors account for 40–70% of the population variation in BMI. 
Second, an alternative to the CDCV hypothesis is the common disease-rare variant (CDRV) hypothesis, which states that for complex diseases there is extreme genetic heterogeneity and that the disease etiology is caused collectively by multiple rare variants with moderate to high penetrance, as supported by some recent studies [Azzopardi et al., 2008; Cohen et al., 2004; Ji et al., 2008; Walsh et al., 2008]. More theoretical and practical arguments supporting the CDRV hypothesis have appeared in the literature [e.g. Bodmer and Bonilla, 2008; Gorlov et al., 2008; Pritchard, 2001; Pritchard and Cox, 2002]. Although still highly debated, both hypotheses have merits, and it is likely that, for most complex diseases, the allelic architecture of susceptibility variants has a wide range of allele frequencies and effect sizes [McCarthy, 2009; Schork et al., 2009; Wen et al., 2004].

Therefore, from a methodological point of view, it would be desirable to develop statistical methods that can address both of the issues raised above. First, dealing with small effect sizes of common susceptibility variants requires the use of a statistical test with the highest possible statistical power. Due to the existence of LD, it seems more effective to use multiple variants or SNPs in an LD block or a candidate gene region. Although in theory there is no single most powerful test for multiple SNPs across all scenarios [Cox and Hinkley, 1974; Pan, 2009], some tests tend to have higher power than others in many common situations [Chapman and Whittaker, 2008; Pan, 2009]. In particular, as advocated by Wang and Elston [2007], one needs to strike a balance between using multiple SNPs and the resulting high cost of large degrees of freedom (DF) or of multiple test adjustment. A simple test called the Sum test, similar to a test proposed by Wang and Elston [2007], summarizes information across multiple SNPs in LD with only DF = 1 and needs no multiple test adjustment. It was shown to have high power in some situations, but, unfortunately, not in others [Chapman and Whittaker, 2008; Pan, 2009]. It is the first goal of this paper to modify the Sum test in a data-adaptive way so that it can maintain high power across most scenarios.

Importantly, to address the second issue raised above, the Sum test or its modifications should be applicable to detect disease association with rare variants. Note that most of the existing tests were proposed under the CDCV hypothesis, and may not be applicable to a situation with rare variants. To our knowledge, there are only three tests specifically designed under the CDRV hypothesis [Morgenthaler and Thilly, 2007; Li and Leal, 2008; Madsen and Browning, 2009]. A key idea shared by the three tests for rare variants is to group or collapse multiple variants, a feature closely related to the Sum test. It is the second goal of this paper to show the effectiveness of the modified Sum test for rare variants, and more generally, to extend it to detect disease association with both common and rare variants. In particular, we show that in some situations the modified Sum test significantly outperforms any of the three existing tests, while for others it performs similarly to the three tests.


Logistic Regression for Common Variants

Given m independent observations (Yi, Xi) with Yi = 0 or 1 as disease status and Xi = (Xi1, …, Xik)′ as genotypes at k SNPs in LD for subject i, we would like to test for any possible association between the disease and genotypes. As usual, we use the dosage coding for Xij under an additive genetic model: Xij = 0, 1 or 2, representing the copy number of one of the two alleles present in SNP j of subject i, though other genetic models can be adopted. Many multilocus association tests are based on fitting a logistic regression model

$$\text{logit}\,\Pr(Y_i = 1) = \beta_0 + \sum_{j=1}^{k} X_{ij}\beta_j. \tag{1}$$

A global test of any possible association between the disease and SNPs can be formulated as jointly testing on the multiple βj parameters with the null hypothesis H0: β = (β1, …, βk)′ = 0 by one of the three asymptotically equivalent tests: the likelihood ratio test, the Wald test and the score test. Under H0, any of the three test statistics has an asymptotic χ² distribution with DF = k. The generalized Hotelling's T² test [Fan and Knapp, 2003; Xiong et al., 2002] is closely related to the score test [Clayton et al., 2004]. A potential problem with the above tests is the power loss due to large DF for a large k.

In contrast to a global test, another extreme is to conduct a single-locus test for each SNP [Roeder et al., 2004]. Rather than including all k SNPs, we can include each SNP sequentially in a series of univariate or marginal regression models:

$$\text{logit}\,\Pr(Y_i = 1) = \beta_{M,j0} + X_{ij}\beta_{M,j}, \qquad j = 1, \ldots, k, \tag{2}$$

where we explicitly distinguish βM = (βM,1, …, βM,k)′ in marginal models (2) from β in the joint model (1). One would proceed by testing H0: βM,j = 0 for each j = 1, …, k sequentially. Each test can be done with only one DF, but a multiple test adjustment has to be made based on numerical solutions [Conneely and Boehnke, 2007], permutations or other methods. The method is equivalent to choosing the univariate test for βM,j with the minimum p value, and is hence also called UminP method.

Most of the existing methods fall into one of the two extreme categories mentioned above. The limitations of the two extremes have been well recognized and induced the development of new methods. A general strategy is to capture full information in all βj's while reducing the cost of large DF or adjustment for multiple testing, but the key is how to reconcile the two possibly conflicting goals. A weighted score test [Wang and Elston, 2007] and a Sum test [Chapman and Whittaker, 2008; Pan, 2009] are two such attempts.

The Sum Test

The Sum test can be used to strike a balance between jointly modeling with multiple SNPs and its resulting large DF: while using all the SNPs, it adopts a key and generally incorrect working assumption that all SNPs are associated with the disease with a common OR:

$$\text{logit}\,\Pr(Y_i = 1) = \beta_{c,0} + \sum_{j=1}^{k} X_{ij}\beta_c, \tag{3}$$

where βc reflects the common association strength between the disease and each SNP under the working assumption. To address the question of whether there is any association between the disease and any SNP, one only needs to test on one parameter with H0: βc = 0, which can easily be done by the likelihood ratio, Wald or score test in fitting model (3). Note that fitting model (3) is equivalent to regressing Y on a new covariate that is the sum of the genotypes of the multiple SNPs, and hence we call the resulting test the Sum test.
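The equivalence noted above (model (3) as a regression on the summed genotypes) can be illustrated with a minimal Python sketch. It assumes the standard 1-DF score test for a logistic model with an intercept; the function names `score_test_1df` and `sum_test` and the toy data are hypothetical and not taken from the paper.

```python
import math
import numpy as np

def score_test_1df(y, z):
    """Score test of H0: slope = 0 for a logistic model of y on one covariate z."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    ybar = y.mean()
    u = float(np.sum((y - ybar) * z))                              # score at the null
    v = float(ybar * (1.0 - ybar) * np.sum((z - z.mean()) ** 2))   # null variance of u
    p = math.erfc(abs(u) / math.sqrt(2.0 * v))                     # two-sided N(0, 1) p value
    return u, v, p

def sum_test(y, X):
    """Sum test: 1-DF score test on the per-subject sum of the k genotype codes."""
    return score_test_1df(y, np.asarray(X).sum(axis=1))

# Toy data (hypothetical): 200 subjects, 5 SNPs in dosage coding, no true association.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.3, size=(200, 5))
y = rng.binomial(1, 0.5, size=200)
u, v, p = sum_test(y, X)
# Recoding every SNP at once (x -> 2 - x) only flips the sign of the score.
u_flip, v_flip, p_flip = sum_test(y, 2 - X)
```

Recoding all SNPs simultaneously leaves the p value unchanged; it is the recoding of only some SNPs that changes the test, which is the sensitivity to coding discussed next.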

The main advantage of the Sum test is that, because it tests on only one parameter βc, there will be no power loss due to large DF or multiple testing adjustment. Generally, the common association parameter βc in (3) is an ‘average’ (or more precisely, a function) of the individual β1, …, βk. In linear regression, Pan [2009] showed that

$$\hat\beta_c = \sum_{j=1}^{k} w_j \hat\beta_{M,j}, \qquad w_j > 0, \tag{4}$$

where β̂c is the normal-based maximum likelihood estimate (or least-squares estimate) of βc in a linear model analogous to (3), β̂M = (β̂M,1, …, β̂M,k)′ is the maximum likelihood estimate of βM in the marginal linear models analogous to (2), and the positive coefficients wj depend only on the genotypes. Now it becomes clear that the main problem with the Sum test is its dependence on the signs of β̂M,j, or on the codings of each SNP (i.e. which allele is chosen as the reference category). If the signs of β̂M,j differ across SNPs, the terms may cancel, resulting in a small β̂c and thus in power loss. Hence, before applying the Sum test, one needs to choose the codings of the SNPs, based on some ad hoc heuristic, to maximize the number of their positive pairwise correlations [Pan, 2009]: (1) calculate pairwise Pearson's correlation coefficients between any two SNPs; (2) find the SNP s that has the largest number of negative correlation coefficients, say ns, with other SNPs; (3) if ns > k/2, then flip the coding of SNP s, and repeat the above process; otherwise, stop. Unfortunately, such an endeavor does not guarantee good performance of the Sum test. A similar issue exists with the weighted score test of Wang and Elston [2007]. Below we propose a data-adaptive Sum test that overcomes the above-mentioned limitation of the original Sum test.
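The three-step coding heuristic described above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: the function name `choose_codings`, the iteration cap and the toy data are hypothetical, and every SNP is assumed to be polymorphic so that all correlations are defined.

```python
import numpy as np

def choose_codings(X, max_iter=1000):
    """Flip SNP codings (x -> 2 - x) until no SNP has more than k/2 negative correlations."""
    X = np.array(X, dtype=float)             # work on a copy
    k = X.shape[1]
    flipped = np.zeros(k, dtype=bool)
    for _ in range(max_iter):
        r = np.corrcoef(X, rowvar=False)     # (1) pairwise Pearson correlations
        np.fill_diagonal(r, 0.0)
        n_neg = (r < 0).sum(axis=1)          # (2) number of negative correlations per SNP
        s = int(np.argmax(n_neg))
        if n_neg[s] <= k / 2:                # (3) stop once no SNP exceeds k/2
            break
        X[:, s] = 2.0 - X[:, s]              # otherwise flip SNP s and repeat
        flipped[s] = not flipped[s]
    return X, flipped

# Toy data (hypothetical): 5 positively correlated SNPs, after which SNP 2 is recoded.
rng = np.random.default_rng(1)

def haplotypes(n, k):
    shared = rng.standard_normal((n, 1))     # shared latent part induces positive LD
    return (shared + 0.5 * rng.standard_normal((n, k)) > 0).astype(int)

X_good = haplotypes(300, 5) + haplotypes(300, 5)
X_bad = X_good.copy()
X_bad[:, 2] = 2 - X_bad[:, 2]
X_fixed, flipped = choose_codings(X_bad)
```

In this toy example the only SNP negatively correlated with the others is the recoded one, so the heuristic is expected to flip it back. As the text notes, the heuristic uses no disease information and so cannot guarantee that the marginal effects end up with a common sign.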

A Data-Adaptive Sum Test

The problem with the Sum test is that, as shown in equation (4), because the estimated common effect βc is a linear combination of the components of βM with always positive coefficients, the test may have reduced power with a small βc when the components of βM have different signs. Hence, a natural approach is to choose the coding of each SNP j based on the sign of βM,j, which is data-adaptive and may lead to inflated type I error rates if no adjustment is made with the null distribution. Our proposed data-adaptive Sum (aSum) test is based on such an idea. However, critically, we need to address the following two issues. First, how to modify the null distribution of the resulting test statistic such that the type I error rate will be well controlled, and second, how to increase the power of the aSum test. Our proposed implementation below addresses both issues.

Step 1

With the original data {(Yi, Xi)}, we fit the marginal regression model (2) to each SNP j, obtaining βM,j and p value pM,j, for j = 1, …, k.

Step 2

With a chosen cutoff α0, for each j = 1, …, k, if β̂M,j < 0 and pM,j ≤ α0, we change the coding of SNP j from X.j to X*.j = 2 − X.j; otherwise, the coding is left unchanged, with X*.j = X.j. Denote the resulting data by {(Yi, X*i)}.

Step 3

With the new data {(Yi, X*i)}, we fit the common-effect model (3), obtaining the usual score statistic U under H0: βc = 0, its variance V and a p value p from the test statistic U/√V with the usual null distribution N(0, 1).

Step 4

By repeatedly permuting the disease indicators of the original data, we obtain B sets of permuted data {(Yi(b), Xi)} for b = 1, …, B. For each permuted dataset {(Yi(b), Xi)}, we repeat steps 1 to 3 above, obtaining the score statistic U(b) for H0: βc = 0 and its usual normal-based p value p(b).

A proper permutation-based p value for the aSum test, called aSum-P, is calculated as $\sum_{b=1}^{B} I(p^{(b)} < p)/B$.

Step 5

Calculate the sample mean and sample variance (or covariance matrix, in general) of U(1), …, U(B) as U0 and V0. The aSum test statistic is

$$\text{aSum} = (U - U_0)' V_0^{-1} (U - U_0),$$

which is compared to a null distribution $a\chi^2_d + b$ to obtain a p value, where d = dim(U), and a and b are chosen to match the first two moments of $a\chi^2_d + b$ to those of the empirical null distribution obtained from U(1), …, U(B).
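Steps 1 to 5 can be put together in a short Python sketch for the case d = 1. Several choices here are assumptions made for illustration rather than details confirmed by the text: marginal score tests stand in for fitted marginal regressions in step 1 (in a univariate logistic model the score has the same sign as the estimated slope), the constants a and b are matched to the mean and variance of the permutation statistics (U(b) − U0)²/V0, and all function names are hypothetical.

```python
import math
import numpy as np

def score_1df(y, z):
    """Score U and its null variance V for a logistic model of y on one covariate z."""
    ybar = y.mean()
    u = float(np.sum((y - ybar) * z))
    v = float(ybar * (1.0 - ybar) * np.sum((z - z.mean()) ** 2))
    return u, v

def norm_p(stat):
    """Two-sided p value under N(0, 1)."""
    return math.erfc(abs(stat) / math.sqrt(2.0))

def adaptive_score(y, X, alpha0):
    """Steps 1-3: recode SNPs with a negative effect and p <= alpha0, then score-test the sum."""
    Xs = np.array(X, dtype=float)
    for j in range(Xs.shape[1]):
        u, v = score_1df(y, Xs[:, j])
        if v > 0 and u < 0 and norm_p(u / math.sqrt(v)) <= alpha0:
            Xs[:, j] = 2.0 - Xs[:, j]
    u, v = score_1df(y, Xs.sum(axis=1))
    return u, norm_p(u / math.sqrt(v))        # (U, naive normal-based p value)

def asum_test(y, X, alpha0=0.1, B=100, seed=0):
    rng = np.random.default_rng(seed)
    u_obs, p_naive = adaptive_score(y, X, alpha0)
    perm = [adaptive_score(rng.permutation(y), X, alpha0) for _ in range(B)]  # step 4
    u_perm = np.array([u for u, _ in perm])
    p_perm = np.array([p for _, p in perm])
    p_asum_p = float(np.mean(p_perm < p_naive))           # aSum-P
    u0, v0 = u_perm.mean(), u_perm.var(ddof=1)            # step 5
    t_obs = (u_obs - u0) ** 2 / v0                        # aSum statistic, d = 1
    t_perm = (u_perm - u0) ** 2 / v0
    a = math.sqrt(t_perm.var(ddof=1) / 2.0)               # Var(a*chi2_1 + b) = 2*a^2
    b = float(t_perm.mean() - a)                          # E(a*chi2_1 + b) = a + b
    x = (t_obs - b) / a
    p_asum = math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0   # P(chi2_1 > x)
    return {"p_naive": p_naive, "p_aSum_P": p_asum_p, "p_aSum": p_asum, "a": a, "b": b}

# Toy data (hypothetical): 6 SNPs, three with positive and three with negative effects.
rng = np.random.default_rng(1)
X = rng.binomial(2, 0.3, size=(300, 6))
eta = -0.5 + X @ np.array([0.5, 0.5, 0.5, -0.5, -0.5, -0.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
res = asum_test(y, X, alpha0=0.1, B=100, seed=1)
```

Note that `p_naive` is returned only for comparison; it corresponds to the unadjusted normal-based p value and is not a valid p value after data-adaptive recoding.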

Remark 1

In step 2, the choice of 0 ≤ α0 ≤ 1 is related to the power of the aSum test. Trivially, if α0 = 0, the aSum test reduces to the usual Sum test. If α0 = 1, we choose the coding of SNP j based on the sign of β̂M,j, which is perhaps the most intuitive way to overcome the problem with the Sum test; however, as will be shown later, the corresponding aSum test has low power because the resulting null distribution has heavy tails. Hence, we would like to choose a relatively small α0 such that (i) under H0, we do not tend to change the coding of any SNP, and (ii) when H0 does not hold, we would like to change the coding of an SNP with negative β̂M,j. As a compromise, by default, we used α0 = 0.1 throughout the study. As will be shown, we found that α0 = 0.1 performed well across all situations considered here. Nevertheless, the optimal choice of α0 should depend on the data at hand, and hence some data-driven methods of choosing α0 will be useful. One possibility is to choose α0 to maximize the (estimated) power of the resulting test, as discussed in Pan et al. [2010]; more research is needed.

Remark 2

In step 3, we implement the aSum test with the score statistic, although the Wald test statistic can be equally applied.

Remark 3

The permuted data in step 4 are generated to estimate the effects of the data-adaptive nature in step 2, and thus accordingly to adjust its null distribution. Note that the normal-based p value p is no longer a genuine p value because the usual null distribution N(0, 1) is not valid and a necessary adjustment has to be made; as shall be shown later, the direct use of p (called naive aSum, or naSum test) will lead to dramatically inflated type I error rates.

Remark 4

Step 5 is optional in calculating the p value of the aSum test by a theoretical null distribution. Although it depends on the use of permutation, in contrast to the aSum-P, it may not require a large number of permutations, B, to reach a small p value; by default we used B = 100. However, this theoretical null distribution is only an approximation, which may or may not work well.

Remark 5

By the Central Limit Theorem with a large sample size, U should have an approximately normal distribution, and thus the test statistic aSum should have an approximately $\chi^2_d$ null distribution. However, as is to be shown later, we found empirically that the null distribution of the standardized U tends to be slightly right-skewed (due to its data-adaptive nature), and hence we used a scaled and shifted $\chi^2_d$ as a null distribution to correct for its skewness. The constants a and b are obtained by the well-known Satterthwaite [1946] approximation.
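The two-moment matching behind a and b can be written out explicitly. The following is a sketch under the assumption that the moments being matched are the sample mean $\bar{T}$ and sample variance $s_T^2$ of the permutation statistics $T^{(b)} = (U^{(b)} - U_0)' V_0^{-1} (U^{(b)} - U_0)$; the symbols $T^{(b)}$, $\bar{T}$ and $s_T^2$ are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{E}\left[a\chi^2_d + b\right] &= ad + b, &
\mathrm{Var}\left[a\chi^2_d + b\right] &= 2a^2 d, \\
\hat{a} &= \sqrt{\frac{s_T^2}{2d}}, &
\hat{b} &= \bar{T} - \hat{a}\,d .
\end{align*}
```

For d = 1, if V0 is the sample variance with divisor B − 1, then $\bar{T} = (B-1)/B$ exactly, so that $\hat{a} + \hat{b} = 0.99$ when B = 100.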

Tests for Rare Variants

To our knowledge, there are only three existing methods to test the association for rare variants for case-control data: the cohort allelic sums test (CAST) [Cohen et al., 2004; Morgenthaler and Thilly, 2007], the Combined Multivariate and Collapsing (CMC) method [Li and Leal, 2008], and a groupwise weighted Sum (w-Sum) test [Madsen and Browning, 2009]. For rare variants, neither joint modeling nor univariate regression on individual SNPs works, and some grouping is needed. Since most disease-associated rare variants discovered so far are nonsynonymous sequence variants in some candidate genes [Cohen et al., 2004], it is natural to group nonsynonymous variants in a gene. However, a recent study has identified some susceptible rare noncoding variants [Haller et al., 2009]. Madsen and Browning [2009] suggest grouping variants according to a functional element, such as gene, pathway and ultra-conserved area. In particular, the CAST works by first collapsing the genotypes across variants into a new coding: Xi,C = 1 if any Xij > 0 (i.e. any rare variant is present), and Xi,C = 0 otherwise. It then tests the association between the disease and this new Xi,C. It can be regarded as fitting a logistic regression model

$$\text{logit}\,\Pr(Y_i = 1) = \beta_{C,0} + X_{i,C}\beta_C, \tag{5}$$

and testing H0: βC = 0. Comparing models (3) and (5), it is clear that the CAST is closely related to the Sum test: both test on only one parameter representing some common effect of the variants; they differ in their use of the new genotypic coding $X_{i,C} = \bigvee_{j=1}^{k} X_{ij}$ versus $X_{i,S} = \sum_{j=1}^{k} X_{ij}$, similar to the use of a dominant genetic model versus an additive genetic model for the effects of the individual variants. Note that for rare variants, we have $X_{i,C} \approx X_{i,S}$. Hence, it seems natural to apply the Sum test to rare variants. More importantly, if the effects of the rare variants are in opposite directions (i.e. some are protective while others are deleterious), the CAST will encounter the same problem as the Sum test discussed in the context of common variants. In that case, it would be better to use the aSum test. Note that recent studies have confirmed that nonsynonymous sequence variants in certain genes are preferentially enriched at either or both of the extremes of a phenotype's population distribution, suggesting that the effects of nonsynonymous variants could be either harmful or beneficial, although most are harmful [Cohen et al., 2004]. In fact, some nonsynonymous variants in gene PCSK9 have been discovered to be associated with lower plasma levels of low-density lipoprotein cholesterol (LDL-C) (e.g. only present in low-LDL subjects), while others are associated with higher levels of LDL-C [Kotowski et al., 2006].
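The difference between the collapsed CAST coding and the summed coding can be seen on a small example. The genotype matrix below is hypothetical and chosen only to show where the two codings agree and where they differ.

```python
import numpy as np

# Hypothetical genotypes for 5 subjects at 3 rare variants (dosage coding 0/1/2).
X = np.array([
    [0, 0, 0],   # no rare allele
    [1, 0, 0],   # one rare allele
    [0, 0, 1],   # one rare allele
    [1, 1, 0],   # rare alleles at two variants
    [0, 2, 0],   # homozygous at one variant
])

x_cast = (X > 0).any(axis=1).astype(int)     # X_{i,C}: 1 if any rare variant is present
x_sum = X.sum(axis=1)                        # X_{i,S}: total count of rare alleles
agree = float(np.mean(x_cast == x_sum))      # fraction of subjects with identical codes
```

The two codings coincide for every subject who carries at most one rare allele, which is why they nearly agree when all variants are rare.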

Li and Leal [2008] modified the CAST to improve its performance with both rare and common variants. Specifically, rare mutations with minor allele frequencies (MAFs) of less than 0.01 are combined into a new group as in the CAST, while each common variant (with MAF > 0.01) forms its own group, and the generalized Hotelling's test is applied to the multiple groups thus formed. We can modify our aSum test in a similar way: we combine the rare variants into one group and combine the common variants into another group by summing over their genotypic codings (which are data-adaptively determined as before), then test on the two corresponding regression coefficients in a logistic regression model. We call the resulting test the aSumC test.
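The grouping step behind the aSumC test can be sketched as a 2-DF score test on two summed covariates. This is a simplified illustration and not the aSumC test itself: it omits the data-adaptive recoding and the permutation adjustment, it classifies variants by their sample MAF (an assumption), and it refers the statistic to a χ² distribution with 2 DF. The function name `two_group_score_test` and the toy data are hypothetical.

```python
import math
import numpy as np

def two_group_score_test(y, X, maf_cutoff=0.01):
    """Sum rare and common variants separately, then score-test both sums jointly (2 DF)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    maf = X.mean(axis=0) / 2.0
    maf = np.minimum(maf, 1.0 - maf)                  # sample minor allele frequency
    rare = maf < maf_cutoff
    Z = np.column_stack([X[:, rare].sum(axis=1),      # group 1: rare variants
                         X[:, ~rare].sum(axis=1)])    # group 2: common variants
    ybar = y.mean()
    Zc = Z - Z.mean(axis=0)
    U = Zc.T @ (y - ybar)                             # score vector for the two slopes
    V = ybar * (1.0 - ybar) * (Zc.T @ Zc)             # its null covariance matrix
    stat = float(U @ np.linalg.solve(V, U))
    p = math.exp(-stat / 2.0)                         # P(chi2 with 2 DF > stat)
    return stat, p, int(rare.sum())

# Toy data (hypothetical): 3 rare and 2 common variants, no true association.
rng = np.random.default_rng(3)
X = np.hstack([rng.binomial(2, 0.003, size=(1000, 3)),
               rng.binomial(2, 0.3, size=(1000, 2))])
y = rng.binomial(1, 0.5, size=1000)
stat, p, n_rare = two_group_score_test(y, X)
```

With only two groups, the DF stays at 2 no matter how many common variants are present, whereas a scheme that gives each common variant its own group has a DF that grows with their number.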

There are two potential advantages of the aSumC test over the CMC test, as will be shown. First, the aSumC test overcomes the problem with different association directions of the functional variants, from which both the CMC test and the w-Sum test [Madsen and Browning, 2009] suffer with possibly significant power loss. Second, with only two groups, the aSumC may have a much smaller number of DF and thus higher power than the CMC test.

Other Tests to Be Compared

We will numerically compare the proposed aSum test with several other tests: the multivariate score test, the sum of squared score (SSU) test and its weighted version (SSUw) on H0: β = 0, all based on the joint model (1), and the UminP test for the marginal models (2), as considered in Pan [2009].

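Based on model (1), we can derive the score vector and its covariance matrix as

$$U = \sum_{i=1}^{m} (Y_i - \bar{Y}) X_i, \qquad V = \bar{Y}(1 - \bar{Y}) \sum_{i=1}^{m} (X_i - \bar{X})(X_i - \bar{X})',$$

where $\bar{Y} = \sum_{i=1}^{m} Y_i/m$ and $\bar{X} = \sum_{i=1}^{m} X_i/m$. The four test statistics are

$$T_{\text{Score}} = U'V^{-1}U, \quad T_{\text{SSU}} = U'U, \quad T_{\text{SSUw}} = U'\,\text{diag}(V)^{-1}\,U, \quad T_{\text{UminP}} = \max_{1 \le j \le k} U_j^2/v_j,$$

where Uj is the j-th element of U and vj is the (j, j)-th diagonal element of V. Under H0, the first three statistics have either asymptotic or approximate χ² distributions [Pan, 2009], while the distribution of TUminP can be numerically obtained [Conneely and Boehnke, 2007].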
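The four statistics can be computed directly from the score vector. The sketch below assumes their standard forms, namely U′V⁻¹U for the score test, U′U for SSU, Σj Uj²/vj for SSUw and maxj Uj²/vj for UminP, with U = Σi (Yi − Ȳ)Xi and V = Ȳ(1 − Ȳ) Σi (Xi − X̄)(Xi − X̄)′; the function names and the toy data are hypothetical, and null distributions are not computed here.

```python
import numpy as np

def score_vector(y, X):
    """Score vector U and its null covariance V for the joint logistic model."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ybar = y.mean()
    Xc = X - X.mean(axis=0)
    U = Xc.T @ (y - ybar)                    # equals sum_i (Y_i - Ybar) X_i
    V = ybar * (1.0 - ybar) * (Xc.T @ Xc)
    return U, V

def joint_statistics(y, X):
    U, V = score_vector(y, X)
    v = np.diag(V)
    return {
        "Score": float(U @ np.linalg.solve(V, U)),   # uses the full covariance
        "SSU": float(U @ U),                         # ignores V altogether
        "SSUw": float(np.sum(U ** 2 / v)),           # uses only the diagonal of V
        "UminP": float(np.max(U ** 2 / v)),          # largest single-SNP statistic
    }

# Toy data (hypothetical): 400 subjects, 4 SNPs, no true association.
rng = np.random.default_rng(2)
X = rng.binomial(2, 0.3, size=(400, 4))
y = rng.binomial(1, 0.4, size=400)
stats = joint_statistics(y, X)
stats_one_snp = joint_statistics(y, X[:, :1])
```

With a single SNP the Score, SSUw and UminP statistics coincide, which is a convenient sanity check on an implementation.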

Next we review the w-Sum test of Madsen and Browning [2009] for rare variants. It is still based on the basic idea of grouping rare variants, but differs from the Sum test in (i) using a weighted sum, instead of a simple sum, of the rare variant scores for each subject, and (ii) comparing the ranks of the weighted sums, rather than the sums, between the case and control groups. Specifically, if A = {i : Yi = 1} and Ā = {i : Yi = 0} denote the sets of cases and controls, respectively, the w-Sum test is constructed as follows.

Step 1

Calculate weight $w_j = \sqrt{m_j q_j (1 - q_j)}$, where $q_j = (\sum_{i \in \bar{A}} X_{ij} + 1)/(2 n_j^U + 2)$, mj is the total number of subjects genotyped for variant j, and $n_j^U$ is the total number of controls genotyped for variant j.

Step 2

Calculate $\gamma_i = \sum_{j=1}^{k} X_{ij}/w_j$ and $R = \sum_{i \in A} \text{rank}(\gamma_i)$.

Step 3

Generate a permuted dataset b (by randomly shuffling the disease labels), and calculate the corresponding R(b). Repeat B times.

Step 4

The test statistic is $T_{\text{w-Sum}} = (R - \bar{R})/\sqrt{\text{Var}(R)}$, where $\bar{R}$ and Var(R) are the sample mean and variance of {R(1), …, R(B)}. A p value can be obtained by referring Tw-Sum to a standard normal distribution. Alternatively, a permutation p value can be obtained by comparing R with {R(1), …, R(B)}.
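The four steps of the w-Sum test can be sketched in Python as follows. The sketch makes several assumptions that the description above leaves open: there are no missing genotypes (so mj and the number of controls are the same for every variant), the weight is taken as the square root of mj qj (1 − qj), ties are given average ranks, weights are recomputed for each permuted dataset, and the normal-based p value is two-sided. All function names are hypothetical.

```python
import math
import numpy as np

def average_ranks(x):
    """Ranks 1..n, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(np.asarray(x, dtype=float),
                                   return_inverse=True, return_counts=True)
    highest = np.cumsum(counts)                       # highest rank within each tie group
    return (highest - (counts - 1) / 2.0)[inverse]

def wsum_rank_stat(y, X):
    """Steps 1-2: control-based weights, weighted sums, rank sum over the cases."""
    controls = (y == 0)
    m = X.shape[0]
    n_u = int(controls.sum())
    q = (X[controls].sum(axis=0) + 1.0) / (2.0 * n_u + 2.0)
    w = np.sqrt(m * q * (1.0 - q))
    gamma = (X / w).sum(axis=1)
    return float(average_ranks(gamma)[y == 1].sum())

def wsum_test(y, X, B=200, seed=0):
    """Steps 3-4: standardize R by the mean and variance of its permutation replicates."""
    rng = np.random.default_rng(seed)
    r_obs = wsum_rank_stat(y, X)
    r_perm = np.array([wsum_rank_stat(rng.permutation(y), X) for _ in range(B)])
    t = (r_obs - r_perm.mean()) / r_perm.std(ddof=1)
    return float(t), math.erfc(abs(t) / math.sqrt(2.0))

# Toy data (hypothetical): 400 subjects, 8 rare variants, no true association.
rng = np.random.default_rng(4)
X = rng.binomial(2, 0.01, size=(400, 8))
y = rng.binomial(1, 0.5, size=400)
t_wsum, p_wsum = wsum_test(y, X)
```

Because every variant enters the weighted sum with a positive weight, the statistic accumulates evidence only when the causal variants act in the same direction, which is the limitation the aSum test is designed to address.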


Common Variants

Simulated Data with Compound Symmetry Correlation Structure: Ideal for the Sum Test

We first consider an ideal situation for the Sum test: there is a common association strength between any SNP and the disease, as considered by Wang and Elston [2007] and Pan [2009]. Specifically, we simulated k = 10 marker SNPs with a sample size of n = 500 cases and n = 500 controls. The disease-causing SNP was assumed to be in the center of the marker SNPs, but was removed from the data. First, we generated a latent vector from a multivariate normal distribution with a compound symmetry covariance structure: there was an equal correlation ρ = 0.4 between any two SNPs. Second, the latent vector was dichotomized to yield a haplotype with the allele frequencies of the marker SNPs randomly between 0.2 and 0.8, while the MAF for the disease-causing SNP was fixed at 0.2. Third, we combined two haplotypes and obtained marker genotype data Xi = (Xi1, …, Xik)′ (after removing the disease-causing SNP X0i) for subject i. Fourth, the disease status Yi of subject i was generated from a logistic regression model:

$$\text{logit}\,\Pr(Y_i = 1) = \beta_0 + \log(\text{OR})\, X_{0i}, \tag{6}$$

where we chose β0 = –log 4 to give a background (i.e. not caused by the causal SNP) disease probability of 0.2, and the OR ranged from 1 (i.e. no association) to 2. Finally, following the case-control design, we sampled n cases (with Yi = 1) and n controls (with Yi = 0). We excluded the disease-causing SNP, supplying {(Yi, Xi): i = 1, 2, …, 2n} as a dataset to various statistical tests. For each setup, we simulated 1,000 datasets, from which we obtained the empirical size or power of each test as the proportion of datasets in which it incorrectly or correctly rejected its H0, respectively.
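The simulation design just described can be sketched in Python. The details below are assumptions where the text is not specific: the compound symmetry structure is produced by a shared plus an individual standard normal component, marker allele frequencies are drawn uniformly from (0.2, 0.8), and subjects are generated in batches until n cases and n controls are available. The function name `simulate_cs_dataset` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist
import numpy as np

def simulate_cs_dataset(n=500, k=10, rho=0.4, odds_ratio=1.5, seed=0):
    rng = np.random.default_rng(seed)
    causal = k // 2                                   # causal SNP in the center
    freqs = rng.uniform(0.2, 0.8, size=k + 1)         # marker allele frequencies
    freqs[causal] = 0.2                               # MAF of the causal SNP
    cuts = np.array([NormalDist().inv_cdf(1.0 - f) for f in freqs])
    beta0 = -math.log(4.0)                            # background disease probability 0.2

    def haplotypes(size):
        shared = rng.standard_normal((size, 1))
        own = rng.standard_normal((size, k + 1))
        latent = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * own  # pairwise corr = rho
        return (latent > cuts).astype(int)            # P(allele = 1) equals freqs

    cases = np.empty((0, k), dtype=int)
    controls = np.empty((0, k), dtype=int)
    while len(cases) < n or len(controls) < n:
        G = haplotypes(4 * n) + haplotypes(4 * n)     # combine two haplotypes
        eta = beta0 + math.log(odds_ratio) * G[:, causal]
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
        markers = np.delete(G, causal, axis=1)        # drop the causal SNP
        cases = np.vstack([cases, markers[y == 1]])
        controls = np.vstack([controls, markers[y == 0]])
    X = np.vstack([cases[:n], controls[:n]])
    y = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])
    return y, X

y, X = simulate_cs_dataset(n=100, seed=1)
```

Setting `odds_ratio=1.0` gives data under the null hypothesis, from which the empirical type I error rate of a test can be estimated over repeated datasets.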

The results for the sample sizes n = 500 (i.e. 500 cases and 500 controls) and n = 1,000 based on 1,000 replicates are shown in table 1. It is clear that the aSum test and its permutation-based aSum-P performed best with the properly controlled type I error rates and highest power. In particular, we note that the naSum test (which does not make any adjustment to its null distribution) yielded much-inflated type I error rates >0.50.

Table 1
Type I error and power based on 1,000 replicates for simulated data with a CS covariance structure

Figure 1 depicts the empirical distributions of the various statistics from 1,000 replicates under the null hypothesis (i.e. OR = 1) with n = 500. It is noteworthy that the usual score test statistics U/√V were centered at 0.3, not 0 as dictated by the usual null distribution N(0, 1) in the conventional situation, due to the data-adaptive choice of the SNP codings; this explains why the use of N(0, 1) as the null distribution would lead to inflated type I error rates for the naive naSum test. After an adjustment with permuted data, the distribution of (U − U0)/√V0 was close to N(0, 1), though it seemed to be slightly right-skewed (fig. 1c, d), which motivated the use of $a\chi^2_1 + b$ as a reference distribution for the aSum statistic (fig. 1e, f). If $\chi^2_1$ was used as the null distribution for the aSum test, we found that the resulting test was conservative with some power loss (results not shown). The empirical distribution of a was characterized by its three quartiles 0.927, 0.997 and 1.090, and by its mean 1.019, while that of b had three quartiles −0.100, −0.007 and 0.064, and mean −0.029. Note that 95% pointwise confidence envelopes (implemented in R package car) were drawn on the Q-Q plots (fig. 1b, d and f): there were 125, 109 and 86 points outside the confidence bounds in figure 1b, d and f, respectively. When the Kolmogorov-Smirnov goodness-of-fit test was applied to cases b, d and f, we obtained p values of 2 × 10⁻⁷, 0.068 and 0.163, respectively. Hence, in summary, the aSum test statistics had the best fit to the theoretical null distribution $a\chi^2_1 + b$.

Fig. 1
Distributions of the naive score test statistics U/√V (a), the adjusted score statistics (U − U0)/√V0 (c) and (aSum − b)/a (e), and their observed values against the quantiles from N(0.3, 1) (b), N(0, 1) (d) and χ²₁ (f), ...

Table 2 lists the results for the aSum test with various values of the cutoff α0 for n = 500. It seems that the results were not too sensitive to α0 for α0 around 0.1. However, for α0 = 0.5 or 1, there was a substantial loss of power.

Table 2
Effects of α0 on type I error and power for the aSum test based on 1,000 replicates for simulated data in table 1

HapMap Data for Gene CHI3L2

Wang and Elston [2007] and Pan [2009] conducted a simulation study based on real LD patterns within the CHI3L2 gene as observed in the HapMap data, for which the Sum test performed well. The SNPs of the CHI3L2 gene for the 90 CEU (Utah residents with ancestry from northern and western Europe) individuals from the HapMap site were downloaded in June 2008. For data processing, we followed the same procedure as the previous authors. First, we excluded SNPs with MAF ≤ 0.2, leaving 23 SNPs. Second, we did a single imputation for each of the missing genotypes by randomly drawing an observed genotype of the same SNP. Third, we used the dosage coding for the SNPs and applied the algorithm described earlier to minimize the number of negative pairwise SNP-SNP correlations. Fourth, we deleted redundant SNPs that were perfectly correlated with other SNPs. There was substantial LD among the remaining 17 SNPs as indicated by the distribution of their pairwise Pearson's correlation coefficients, which ranged from −0.388 to 0.989. Fifth, we repeatedly sampled (with replacement) subjects from the 90 CEU individuals. Finally, we chose SNP rs2182114 as disease-causing and generated disease indicators from the logistic model (6) with the same β0 and other possible values of OR.

The simulation results for the sample sizes n = 500 and n = 1,000 with 1,000 replicates are shown in table 3. Here the aSum test had a correct type I error rate. Furthermore, it had the highest power with a slight edge over the SSU and SSUw tests. In particular, it outperformed the Sum test, although the latter also worked quite well. We also noted that the naive naSum test without adjustment again had highly inflated type I error rates.

Table 3
Type I error and power based on 1,000 replicates for gene CHI3L2

HapMap Data for Gene IL21R

As shown by Chapman and Whittaker [2008] and Pan [2009], the Sum test did not perform well in the region of gene IL21R. We followed exactly the same steps as for gene CHI3L2 except that the disease-causing SNP was selected randomly and then excluded from the data in each simulation run. At the end, we had 28 SNPs.

The simulation results based on n = 500, n = 1,000 and 1,000 replicates are shown in table 4. As expected, the Sum test had low power, while the UminP test was most powerful. Nevertheless, the power of the aSum test (and its permutation-based version, aSum-P) was only slightly lower than that of the UminP test. Note that the naive naSum test had highly inflated type I error rates.

Table 4
Type I error and power based on 1,000 replicates for gene IL21R

As suggested by a reviewer, we also downloaded phased haplotype data of the 60 HapMap CEU subjects. We followed a similar procedure as before (except that there was no need to impute because there were no missing values), leading to 20 SNPs. The results for the two sample sizes based on 1,000 replicates are shown in table 5. The same conclusions can be drawn: the UminP test had the highest power, closely followed by the aSum-P and aSum tests; in particular, the aSum-P and aSum tests were much more powerful than the Sum test.

Table 5
Power based on 1,000 replicates of the phased-haplotype data for gene IL21R

Rare Variants

Simulation Setups

We independently generated k rare mutations, each with a mutation rate or MAF uniformly distributed between 0 and 0.05 with the constraint $\sum_{j=1}^{k} \text{MAF}_j = 0.05$ or 0.01. The disease outcome was generated from a joint logistic regression model (1) with sample size n = 500. Several cases were considered. (i) Ideal case: with all β1 = … = βk = log(OR); (ii) nonideal: β1 = β2 = β3 = log(ORp), β4 = β5 = β6 = log(0.5) and β7 = β8 = β9 = log(2). In addition, for each case, we also considered adding a few independent and nonfunctional (i.e. noncausal) common variants or rare variants, and investigated the corresponding power properties of the various tests.

Simulation Results

For the ideal situation with all causal associations in the same direction, the simulation results based on n = 500 and 1,000 replicates are shown in table 6. It is clear that the Sum test and the w-Sum test of Madsen and Browning [2009] performed best, closely followed by the aSum-P and aSum tests. Next came the aSumC-P and aSumC tests, which had higher power than the CMC test. Note that the naive naSum test had a hugely inflated type I error rate at 0.617.

Table 6
Type I error and power based on 1,000 replicates for an ideal case for rare variants with a common association strength OR between any of nine causal SNPs and the disease

When more nonfunctional (i.e. noncausal) common SNPs were added (table 7), the CMC test became the overall winner, closely followed by the aSumC-P and aSumC tests when the MAF of the nonfunctional SNPs was low. On the other hand, when multiple nonfunctional rare variants were added (table 8), the Sum and aSum-P tests appeared to be the winners, closely followed by the aSum, w-Sum, aSumC-P and aSumC tests. Next came the CMC test.

Table 7
Power based on 1,000 replicates for an ideal case for nine causal rare variants with a common association strength OR = 1.8 and a number of nonfunctional common variants (NF)
Table 8
Power based on 1,000 replicates for an ideal case for nine causal rare variants with a common association strength OR = 1.8 and a number of nonfunctional rare variants (NF), each with the same MAF as that of a corresponding causal variant

For the nonideal situation with both positive and negative associations between the causal rare variants and the disease, the simulation results based on n = 500 and 1,000 replicates are shown in table 9. Evidently our proposed aSum test and its permutation-based version aSum-P had higher power than the CMC test, and the gain was even larger over the w-Sum test (and the Sum test). It is interesting to note that the SSU, SSUw and score tests also performed very well here.

Table 9
Power based on 1,000 replicates for a nonideal case for rare variants: the effect sizes of the nine causal SNPs are (ORp, ORp, ORp, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5)

When two common nonfunctional variants were added (table 10), the aSumC-P and aSumC tests outperformed the aSum-P and aSum tests, but CMC had even higher power, while both the Sum and w-Sum tests performed poorly. In these situations, it was somewhat surprising that the score and SSUw tests were the winners; note that because the SNPs were not in LD, the score and SSUw tests were expected to be close. When more nonfunctional rare variants were added (table 11), the various versions of the aSum tests performed similarly, better than the CMC test, and far better than the w-Sum test, although the SSU test was an overall winner. Interestingly, for a higher sum of the MAFs for nonfunctional rare variants, the score and SSUw tests performed well; in contrast, if the sum of the MAFs was low, the score and SSUw tests had minimal power.

Table 10
Power based on 1,000 replicates for a nonideal case for nine causal rare variants and a number of nonfunctional common variants (NF)

Table 11
Power based on 1,000 replicates for a nonideal case for nine causal rare variants and a number of nonfunctional rare variants (NF), each with the same MAF as that of a corresponding causal variant


Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease leading to paralysis and death, typically within 3–5 years from onset. Despite evidence for a role for genetics, little is known about common genetic variants linked to sporadic ALS. Schymick et al. [2007] conducted a GWAS to identify genetic variants predisposing to ALS in a cohort of 276 American sporadic ALS cases and 268 neurologically normal controls. The original study assayed 555,352 unique SNPs for each subject. By testing each individual SNP separately, Schymick et al. [2007] identified a list of the 34 most significant SNPs, although none of them was statistically significant after a genome-wide multiple testing adjustment.

We randomly picked nine SNPs from the list of the 34 most significant common SNPs of Schymick et al. [2007], including the most significant one, rs4363506, which had a p value of 6.8 × 10⁻⁷ by a univariate 2-DF χ² test. For each of the nine SNPs, we extracted ten neighboring SNPs upstream and another ten downstream, then applied the default LD blocking algorithm implemented in Haploview (v4.1) [Barrett et al., 2005] to each 21-SNP region for the control group. Three SNPs were included in the LD block for SNP rs4363506, while four were included for SNP rs332389. As for any multimarker analysis, we had to choose an LD block [Gabriel et al., 2002] to analyze in a region of interest. The goal was to strike a balance between including as many informative SNPs as possible and controlling the increase in DF. Here we chose an LD block often used in haplotype analysis, although it is not clear how to do so optimally; other strategies, such as using sliding windows with various window sizes [e.g. Guo et al., 2009 and references therein], may be adopted, although issues of high computational demand and multiple testing adjustments have to be addressed.
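The window extraction described above (ten SNPs on each side of a candidate, 21 in total) can be sketched as follows. This is only an illustration: the function name and the SNP identifiers are hypothetical, the list is assumed to be ordered by chromosomal position, and truncating the window at the ends of the list is a choice of this sketch rather than something stated in the text.

```python
def snp_window(ordered_snp_ids, candidate_id, flank=10):
    """Return the candidate SNP with up to `flank` neighbors on each side.

    `ordered_snp_ids` is assumed to be sorted by chromosomal position.
    Near either end of the list the window is truncated.
    """
    center = ordered_snp_ids.index(candidate_id)
    start = max(0, center - flank)
    stop = min(len(ordered_snp_ids), center + flank + 1)
    return ordered_snp_ids[start:stop]


# Hypothetical identifiers standing in for SNPs ordered along a chromosome.
ordered = ["snp%d" % i for i in range(50)]
region = snp_window(ordered, "snp25")      # full 21-SNP region
edge_region = snp_window(ordered, "snp3")  # truncated at the start
```

Each resulting region would then be passed to an LD blocking tool; that step is not reproduced here.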

We applied various multimarker genotype-based tests to each block for each of the nine candidate SNPs. We used the general 2-DF coding for each SNP. For two candidate SNPs, rs4363506 and rs332389, the aSum test based on B = 100 gave much more significant p values than the other tests, while the differences for the other seven SNPs were small (not shown). However, by checking the Q-Q plots, we had some doubts about the goodness-of-fit of the far right tail of the proposed theoretical null distribution. Hence, we used B = 5 × 10⁶ for the aSum-P test (table 12). It is worth pointing out the dramatic p value differences between the Sum and aSum-P tests. Albeit not conclusive, the p values of the aSum-P test, which were more significant than those of the Sum test for the two SNPs, were in agreement with those of the UminP and other tests, suggesting possible follow-up studies on the two loci.
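The need for a very large number of permutations can be seen from how an empirical permutation p value is formed. The sketch below is a generic illustration, not the aSum-P implementation: `sum_score_statistic` is an assumed 1-DF score statistic for the per-subject sum of genotype codes, the data are made up, and the estimator (count + 1) / (B + 1) is one common convention. For a data-adaptive test, the adaptive step would have to be recomputed inside every permutation, which this sketch does not do.

```python
import random


def sum_score_statistic(genotypes, status):
    """1-DF score statistic for the per-subject sum of genotype codes."""
    n = len(status)
    sums = [sum(row) for row in genotypes]
    y_bar = sum(status) / n
    s_bar = sum(sums) / n
    u = sum((y - y_bar) * (s - s_bar) for y, s in zip(status, sums))
    v = y_bar * (1.0 - y_bar) * sum((s - s_bar) ** 2 for s in sums)
    return u * u / v if v > 0 else 0.0


def permutation_p_value(genotypes, status, n_perm, seed=0):
    """Empirical p value from permuting disease status n_perm times."""
    rng = random.Random(seed)
    observed = sum_score_statistic(genotypes, status)
    shuffled = list(status)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if sum_score_statistic(genotypes, shuffled) >= observed:
            exceed += 1
    # Adding 1 to both counts keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Made-up data: 10 cases carrying minor alleles, 10 controls carrying none.
toy_genotypes = [[1, 1]] * 10 + [[0, 0]] * 10
toy_status = [1] * 10 + [0] * 10
toy_p = permutation_p_value(toy_genotypes, toy_status, n_perm=999)
```

With this estimator the p value can never fall below 1/(B + 1), so resolving p values near 10⁻⁷ by permutation alone requires B on the order of several million.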

Table 12
p values from various tests on two LD blocks with 3 and 4 SNPs, respectively


We have proposed and studied a data-adaptive aSum test for genetic association with multiple common and/or rare variants. Like the Sum test, it aims to minimize the cost of high DF and eliminate the need for multiple testing adjustment while utilizing information on multiple variants; these multiple variants may be SNPs in LD in a sliding window or LD block, or may be unlinked rare variants. Its major advantages include ease of use, simplicity, and applicability to both common and rare variants. We used permutations to correct for the effect of the data-adaptive step on the null distribution of the resulting score (or Wald) test statistic. We also discussed the use of an approximate null distribution to derive p values based on a small number of permutations to save computing time; however, the theoretical approximation may not work well, and further studies are needed. Therefore, whenever feasible, e.g. in candidate gene association studies or with parallel computing resources, we recommend the use of the permutation-based aSum-P test. We conducted extensive simulations to evaluate its performance for common variants, establishing its superior performance over a wide range of scenarios. We also empirically compared its statistical power with that of other methods for rare variants, showing its potential use and improved performance in some situations. In anticipation of the future use of next-generation resequencing and other technologies to survey genetic architectures of both common and rare variants, further studies are needed to evaluate its performance for rare variants. In particular, to maximize power in practical situations, it would be desirable to incorporate automatic variant selection before applying the test.


This research was partially supported by NIH grants GM081535 and HL65462. The authors thank the reviewers and Dr Weihua Guan for helpful comments.


  • Azzopardi D, Dallosso AR, Eliason K, Hendrickson BC, Jones N, et al. Multiple rare nonsynonymous variants in the adenomatous polyposis coli gene predispose to colorectal adenomas. Cancer Res. 2008;68:358–363.
  • Barrett JC, Clayton DG, Concannon P, Akolkar B, Cooper JD, Erlich HA, Julier C, Morahan G, Nerup J, Nierras C, Plagnol V, Pociot F, Schuilenburg H, Smyth DJ, Stevens H, Todd JA, Walker NM, Rich SS, The Type 1 Diabetes Genetics Consortium. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet. Epub 2009.
  • Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008;40:695–701.
  • Chapman JM, Whittaker J. Analysis of multiple SNPs in a candidate gene or region. Genet Epidemiol. 2008;32:560–566.
  • Clayton D, Chapman J, Cooper J. Use of unphased multilocus genotype data in indirect association studies. Genet Epidemiol. 2004;27:415–428.
  • Cohen JC, Kiss RS, Pertsemlidis A, Marcel YL, McPherson R, et al. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science. 2004;305:869–872.
  • Conneely KN, Boehnke M. So many correlated tests, so little time! Rapid adjustment of p values for multiple correlated tests. Am J Hum Genet. 2007;81:1158–1168.
  • Cox DR, Hinkley DV. Theoretical Statistics. London: Chapman and Hall; 1974.
  • Daly MJ, Rioux JD, Schaffner SF, Hudson TJ, Lander ES. High-resolution haplotype structure in the human genome. Nat Genet. 2001;29:229–232.
  • Fan R, Knapp M. Genome association studies of complex diseases by case-control designs. Am J Hum Genet. 2003;72:850–868.
  • Flint J, Mackay TFC. Genetic architecture of quantitative traits in mice, flies and humans. Genome Res. 2009;19:723–733.
  • Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A, Faggart M, et al. The structure of haplotype blocks in the human genome. Science. 2002;296:2225–2229.
  • Gorlov IP, Gorlova OY, Sunyaev SR, Spitz MR, Amos CI. Shifting paradigm of association studies: Value of rare single-nucleotide polymorphisms. Am J Hum Genet. 2008;82:100–112.
  • Guo Y, Li J, Bonham A, Wang Y, Deng H. Gain in power for exhaustive analyses of haplotypes using variable-sized sliding window strategy: a comparison of association-mapping strategies. Eur J Hum Genet. 2009;17:785–792.
  • Haller G, Torgerson DG, Ober C, Thompson EE. Sequencing the IL4 locus in African Americans implicates rare noncoding variants in asthma susceptibility. J Allergy Clin Immunol. 2009;124:1204–1209.e9.
  • Ji W, Foo JN, O'Roak BJ, Zhao H, Larson MG, et al. Rare independent mutations in renal salt handling genes contribute to blood pressure variation. Nat Genet. 2008;40:592–599.
  • Kotowski I, Pertsemlidis A, Luke A, Cooper R, Vega G, Cohen J, Hobbs H. A spectrum of PCSK9 alleles contributes to plasma levels of low-density lipoprotein cholesterol. Am J Hum Genet. 2006;78:410–422.
  • Li B, Leal SM. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet. 2008;83:311–321.
  • Madsen BE, Browning SR. A groupwise association test for rare mutations using a weighted sum statistic. PLoS Genet. 2009;5:e1000384.
  • Maher B. Personal genomes: the case of the missing heritability. Nature. 2008;456:18–21.
  • McCarthy MI. Exploring the unknown: assumptions about allelic architecture and strategies for susceptibility variant discovery. Genome Med. 2009;1:66.
  • Morgenthaler S, Thilly WG. A strategy to discover genes that carry multi-allelic or mono-allelic risk for common diseases: A cohort allelic sums test (CAST). Mutat Res. 2007;615:28–56.
  • Pan W. Asymptotic tests of association with multiple SNPs in linkage disequilibrium. Genet Epidemiol. 2009;33:497–507.
  • Pan W, Han F, Shen X. Test selection with application to detecting disease association with multiple SNPs. Hum Hered. 2010;69:120–130.
  • Pritchard JK. Are rare variants responsible for susceptibility to complex diseases? Am J Hum Genet. 2001;69:124–137.
  • Pritchard JK, Cox NJ. The allelic architecture of human disease genes: common disease-common variant… or not? Hum Mol Genet. 2002;11:2417–2423.
  • Roeder K, Bacanu SA, Sonpar V, Zhang X, Devlin B. Analysis of single-locus tests to detect gene/disease associations. Genet Epidemiol. 2005;28:207–219.
  • Satterthwaite FE. An approximate distribution of estimates of variance components. Biometrics Bulletin. 1946;2:110–114.
  • Schork NJ, Murray SS, Frazer KA, Topol EJ. Common vs. rare allele hypotheses for complex diseases. Curr Opin Genet Dev. 2009;19:212–219.
  • Schymick JC, Scholz SW, Fung HC, et al. Genome-wide genotyping in amyotrophic lateral sclerosis and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol. 2007;6:322–328.
  • Walsh T, McClellan JM, McCarthy SE, Addington AM, Pierce SB, et al. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science. 2008;320:539–543.
  • Wang T, Elston RC. Improved power by use of a weighted score test for linkage disequilibrium mapping. Am J Hum Genet. 2007;80:353–360.
  • Wei Z, Li M, Rebbeck T, Li H. U-statistics-based tests for multiple genes in genetic association studies. Ann Hum Genet. 2008;72:821–833.
  • Wen G, Mahata S, Cadman P, Mahata M, Ghosh S, Mahapatra N, Rao F, Stridsberg M, Smith D, Mahboubi P, Schork NJ, O'Connor DT, Hamilton BA. Both rare and common polymorphisms contribute functional variation at CHGA, a regulator of catecholamine physiology. Am J Hum Genet. 2004;74:197–207.
  • Willer CJ, Speliotes EK, Loos RJF, Li S, Lindgren CM, Heid IM, Berndt SI, Elliott AL, Jackson AU, Lamina C, et al. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet. 2009;41:25–34.
  • Xiong M, Zhao J, Boerwinkle E. Generalized T2 test for genome association studies. Am J Hum Genet. 2002;70:1257–1268.

Articles from Human Heredity are provided here courtesy of Karger Publishers