ADELLE: A global testing method for Trans-eQTL mapping

Understanding the genetic regulatory mechanisms of gene expression is a challenging and ongoing problem. Genetic variants that are associated with expression levels are readily identified when they are proximal to the gene (i.e., cis-eQTLs), but SNPs distant from the gene whose expression levels they are associated with (i.e., trans-eQTLs) have been much more difficult to discover, even though they account for a majority of the heritability in gene expression levels. A major impediment to the identification of more trans-eQTLs is the lack of statistical methods that are powerful enough to overcome the obstacles of small effect sizes and large multiple testing burden of trans-eQTL mapping. Here, we propose ADELLE, a powerful statistical testing framework that requires only summary statistics and is designed to be most sensitive to SNPs that are associated with multiple gene expression levels, a characteristic of many trans-eQTLs. In simulations, we show that ADELLE is more powerful than other methods at detecting SNPs that are associated with 0.2–2% of the traits. We apply ADELLE to a mouse advanced intercross line data set and show its ability to find trans-eQTLs that were not significant under a standard analysis. This demonstrates that ADELLE is a powerful tool for uncovering trans regulators of gene expression.


S1 Text Covariance matrix estimation
In order to estimate Ω, the covariance matrix of Z in Eq (5) in the main text, we make additional modeling assumptions. In an eQTL mapping study in which D expression traits and M genome-wide SNPs are observed on each of N individuals, define G to be the M × N genotype matrix for the SNPs and Y to be the D × N phenotype matrix for the expression traits. Let Z denote the D × M matrix of test statistics, where Z_dm, the (d, m)th entry of Z, is the test statistic for association between trait d and SNP m, and where each Z_dm is assumed to be standard normal under the null hypothesis of no association between trait d and SNP m.
We further assume that N ≪ min(D, M), i.e., there are many more expression traits and SNPs than there are individuals in the study.

Special case: simple linear regression model
In the simplest case, suppose that there is no population structure or relatedness in the sample, and there are no confounding variables. Then under the null hypothesis that none of the SNPs are eQTLs for any of the traits, the N columns of Y are assumed to be i.i.d. draws from N_D(µ_trait, V_trait), where µ_trait is the D × 1 vector of trait means and V_trait is the D × D trait covariance matrix, and the N columns of G are assumed to be independent, each having mean vector µ_geno of length M and M × M covariance matrix V_geno. Furthermore, suppose that Z_dm is obtained as the t-statistic for testing significance of SNP m in a simple linear regression of trait d on SNP m, for 1 ≤ d ≤ D, 1 ≤ m ≤ M. Then for reasonably large N (even when N ≪ min(D, M)), the covariance structure of the elements of Z is given by

Cov(Z_{d1 m1}, Z_{d2 m2}) = (C_geno)_{m1 m2} (C_trait)_{d1 d2},    (1)

where C_geno is the correlation matrix derived from V_geno and C_trait is the correlation matrix derived from V_trait. A consequence of this model is that each row of Z has covariance matrix C_geno and each column of Z has covariance matrix C_trait.
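To make this special case concrete, the following sketch (with simulated null data and illustrative dimensions, not the paper's data) builds the D × M matrix of simple-linear-regression t-statistics in vectorized form:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M = 200, 50, 80  # individuals, traits, SNPs (illustrative sizes)

# Null data: traits and genotypes generated independently of each other
Y = rng.standard_normal((D, N))                       # D x N trait matrix
G = rng.binomial(2, 0.3, size=(M, N)).astype(float)   # M x N genotype matrix

# Center rows (equivalent to fitting the intercept in each regression)
Yc = Y - Y.mean(axis=1, keepdims=True)
Gc = G - G.mean(axis=1, keepdims=True)

# Vectorized simple linear regression of each trait on each SNP:
# slope b_dm = <y_d, g_m> / <g_m, g_m>, with the usual t-statistic.
sxx = (Gc**2).sum(axis=1)                 # length-M vector of SNP sums of squares
sxy = Yc @ Gc.T                           # D x M cross-products
b = sxy / sxx                             # slopes
syy = (Yc**2).sum(axis=1, keepdims=True)  # D x 1 trait sums of squares
rss = syy - b * sxy                       # residual sum of squares per (d, m)
se = np.sqrt(rss / (N - 2) / sxx)         # standard error of each slope
Z = b / se                                # D x M matrix of t-statistics

print(Z.shape)   # (50, 80)
print(Z.std())   # close to 1 under the null
```

Under the null, each entry of Z is a t-statistic with N − 2 degrees of freedom, which is approximately standard normal for reasonably large N.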
Since each element of Z is marginally standard normal and the elements have the correlation structure in Eq 1, it would be tempting to suppose that the joint distribution of the elements of Z must be multivariate normal, and to suppose that therefore Z must have the matrix normal distribution MN_{D,M}(0, C_trait, C_geno).
However, this is false. In fact, the distribution of Z is a mixture of matrix normal distributions, where each component of the mixture has mean 0 and variance 1 for every element of Z, but where different mixture components have different correlation matrices for vec(Z). One way to specify the distribution of Z in this case is to note that, conditional on G, Z does have a matrix normal distribution in large samples:

Z | G ∼ MN_{D,M}(0, C_trait, Ĉ_geno),    (2)

where Ĉ_geno is the M × M sample correlation matrix of the given G, which is a matrix of rank N − 1. From this formulation, it can be seen that Z is only of rank N − 1, which makes sense, because there are only N observations of each phenotype and genotype on which the entire matrix Z is based. It can also be seen that the unconditional distribution of Z will be a mixture of matrix normals, where we mix over G. For simulation purposes, it is handy to note that it is also true in this case that

Z | Y ∼ MN_{D,M}(0, Ĉ_trait, C_geno),    (3)

where Ĉ_trait is the D × D sample correlation matrix of the given Y, so Ĉ_trait is of rank N − 1.
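A short simulation sketch of the conditional formulation (taking C_trait = I and illustrative dimensions for simplicity) confirms the rank deficiency: conditional on G, every draw of Z has rank N − 1.

```python
import numpy as np

rng = np.random.default_rng(3)
N, D, M = 15, 40, 60                      # illustrative sizes, N << min(D, M)

G = rng.binomial(2, 0.4, size=(M, N)).astype(float)
C_geno_hat = np.corrcoef(G)               # M x M sample correlation, rank N - 1

# Draw Z | G ~ MN(0, I, C_geno_hat) (taking C_trait = I for simplicity):
# factor C_geno_hat = L L^T and set Z = E L^T with E iid standard normal.
w, V = np.linalg.eigh(C_geno_hat)
w = np.where(w > 1e-10, w, 0.0)           # zero out numerically null eigenvalues
L = V * np.sqrt(w)                        # low-rank factor of C_geno_hat
Z = rng.standard_normal((D, M)) @ L.T     # rows have covariance C_geno_hat

print(np.linalg.matrix_rank(C_geno_hat))  # N - 1 = 14
print(np.linalg.matrix_rank(Z))           # also N - 1 = 14
```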
In this simple setting, Ω = C_trait, so we could directly estimate C_trait by first forming the sample correlation matrix Ĉ_trait for the D traits. However, the estimate Ĉ_trait would be low rank because typically N ≪ D, so we could regularize it by eigenvalue shrinkage [1], as described below in the subsection Regularization of sample covariance matrix, to finally obtain Ω̂ as an estimate of Ω.
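For instance, with many more traits than individuals, the sample trait correlation matrix is necessarily rank-deficient, as a quick sketch with simulated values shows:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 20, 100                    # many more traits than individuals (N << D)
Y = rng.standard_normal((D, N))   # D x N trait matrix under the null

C_hat = np.corrcoef(Y)            # D x D sample trait correlation matrix
rank = np.linalg.matrix_rank(C_hat)
print(rank)                       # N - 1 = 19, far below D = 100
```

Because centering the traits removes one degree of freedom, the rank is N − 1 rather than N, so regularization is needed before C_hat can be inverted or used as a full-rank Ω̂.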

More general case
If there are important covariates, population structure, or related individuals in the model for Y (where we assume these are properly accounted for in the association test), then the simple model for Z given in Eq 2 does not hold. However, motivated by the simple case, we propose the more general null model for Z:

Z ∼ MN_{D,M}(0, Ω, C_G),    (4)

where C_G is an M × M correlation matrix that is a function of G and has rank N − 1 (or, more generally, rank N − k − 1 if, e.g., k principal components, factors, or other covariates have been regressed out of the expression traits), and Ω is a D × D correlation matrix, where rank(Ω) = rank(C_trait), which we assume is equal to D. Properties of this model are that Z is of rank N − k − 1, every column of Z has covariance matrix Ω, and every row of Z has covariance matrix C_G. In this case, Ω is no longer exactly the trait correlation matrix, so we estimate it from Z instead of directly from Y.
A simple (but low-rank) estimator would be the sample correlation matrix for the rows of Z, given by cov2cor(Z Zᵀ), where cov2cor is the function that maps a symmetric positive semi-definite matrix A with positive diagonal elements to a matrix of the same size with (i, j)th element A_ij / √(A_ii A_jj). If C_G were available, a more efficient (but still low-rank) estimator could be obtained by both decorrelating and centering the rows of Z, using Ω̂₁ = cov2cor(Z C_G⁻ Zᵀ), where C_G⁻ is the Moore-Penrose generalized inverse of the (singular) matrix C_G. More generally, suppose we use a matrix C⁻ in the estimator Ω̂₁ = cov2cor(Z C⁻ Zᵀ), where C⁻ is a symmetric, positive semi-definite matrix. To decide which choice of C⁻ to use in the estimator Ω̂₁, we consider the following simple setting for estimation of a single correlation value ρ. Suppose (X₁, Y₁), . . ., (Xₙ, Yₙ) are i.i.d. bivariate normal with mean vector (0, 0) and with 2 × 2 covariance matrix having both diagonal elements equal to 1 and both off-diagonal elements equal to ρ. Then, if we let ρ_{d1,d2} denote the (d1, d2)th entry of Ω, where d1 ≠ d2, in large samples the (d1, d2)th entry of Ω̂₁ would be approximately unbiased for ρ_{d1,d2}, with variance approximately (1 − ρ²)²/n, the classical large-sample variance of a sample correlation coefficient. However, to increase the robustness of the estimator, we usually prefer to choose a C⁻ that is orthogonal to 1_M, so that we are also centering Z. Within this class of estimators, the variance depends on the particular choice of C⁻. Possibilities we have considered in practice are C⁻ = I − M⁻¹ 1_M 1_Mᵀ and C⁻ = Ĉ_geno⁻, where Ĉ_geno is the sample correlation matrix of G; the latter would be the optimal choice among all C⁻ that are orthogonal to 1_M in the special case of simple linear regression. In the simulations and data analysis, we have used C⁻ = I − M⁻¹ 1_M 1_Mᵀ. Given Ω̂₁, which is generally of rank N − k − 1, we apply regularization to obtain a full-rank Ω̂ as follows.
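A minimal sketch of this estimator, using the centering choice C⁻ = I − M⁻¹ 1_M 1_Mᵀ (simulated Z and illustrative dimensions; the real Z would come from the association scan):

```python
import numpy as np

def cov2cor(A):
    """Map a symmetric PSD matrix with positive diagonal to a correlation matrix."""
    d = np.sqrt(np.diag(A))
    return A / np.outer(d, d)

rng = np.random.default_rng(2)
N, D, M = 30, 60, 500
Z = rng.standard_normal((D, M))            # stand-in for the D x M statistic matrix

# Centering projection, symmetric PSD and orthogonal to the all-ones vector 1_M
C_minus = np.eye(M) - np.ones((M, M)) / M

Omega1 = cov2cor(Z @ C_minus @ Z.T)        # D x D estimate of Omega
print(np.allclose(np.diag(Omega1), 1.0))   # True: unit diagonal by construction
```

Multiplying by this C⁻ is equivalent to subtracting each row's mean from Z before forming the row sample correlations.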

Regularization of sample covariance matrix
To regularize Ω̂₁, we apply a form of eigenvalue shrinkage. Suppose Ω̂₁ = P Λ Pᵀ is the eigendecomposition of Ω̂₁, where Λ is a diagonal matrix with dth diagonal element λ_d, the dth eigenvalue, where λ₁ ≥ λ₂ ≥ ⋯ ≥ λ_D ≥ 0, and since Ω̂₁ has 1's on the diagonal, we also have Σ_{d=1}^D λ_d = D. As a first step, we apply shrinkage to calculate a new set of eigenvalues λ̃₁ ≥ λ̃₂ ≥ ⋯ ≥ λ̃_D ≥ 0, and a new diagonal matrix Λ̃ whose dth diagonal element is λ̃_d, as follows: for our setting in which the number of observations N is small relative to D and M, we define γ = dim(Ω̂₁)/rank(Ω̂₁) and let λ₊ = (1 + √γ)². Let h = |{d : λ_d > λ₊}|, and call {λ₁, . . ., λ_h} the "large" eigenvalues and {λ_{h+1}, . . ., λ_D} the "small" eigenvalues. We apply a debiasing function f₁ to each of the large eigenvalues and a linear contraction f₂ to each of the small eigenvalues:

λ̃_d = f₁(λ_d) for 1 ≤ d ≤ h,  and  λ̃_d = f₂(λ_d) for h < d ≤ D,    (5)

where f₁ is the inverse bias function for the large eigenvalues in a spiked covariance model for a case when Ω̂₁ is not low-rank [2]. In that non-low-rank case, f₂(λ_i) would ordinarily be a constant function equal to 1, but in a low-rank setting we find that shrinkage to be too severe, so f₂ instead contracts the small eigenvalues only part of the way toward 1.
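The shrinkage step can be sketched as follows. This is an illustrative implementation, not the paper's exact one: the debiasing map f₁ below is the standard spiked-covariance inverse bias formula, and the contraction strength alpha = 0.5 in f₂ is an assumed placeholder value.

```python
import numpy as np

def shrink_eigenvalues(Omega1, rank):
    """Eigenvalue-shrinkage sketch: debias eigenvalues above the
    Marchenko-Pastur edge with the spiked-model inverse bias map (f1),
    and linearly contract the remaining eigenvalues toward 1 (f2).
    The contraction strength alpha is an illustrative placeholder."""
    D = Omega1.shape[0]
    gamma = D / rank                        # dim / rank, as in the text
    lam_plus = (1 + np.sqrt(gamma))**2      # upper bulk edge
    w, P = np.linalg.eigh(Omega1)
    w, P = w[::-1], P[:, ::-1]              # eigenvalues in decreasing order

    def f1(l):
        # Inverse of the spiked-model bias map l = lam * (1 + gamma/(lam - 1))
        t = l + 1 - gamma
        return (t + np.sqrt(t**2 - 4 * l)) / 2

    alpha = 0.5                             # placeholder contraction strength
    new = np.where(w > lam_plus,
                   f1(np.maximum(w, lam_plus)),   # debias the large eigenvalues
                   1 + alpha * (w - 1))           # contract small ones toward 1
    new = np.maximum(new, 0.0)
    new *= D / new.sum()                    # keep the eigenvalues summing to D
    return P @ np.diag(new) @ P.T

rng = np.random.default_rng(4)
Omega1 = np.corrcoef(rng.standard_normal((50, 20)))   # rank-19 sample correlation
Omega_hat = shrink_eigenvalues(Omega1, rank=19)
print(np.linalg.matrix_rank(Omega_hat))               # full rank: 50
```

Because the small eigenvalues are pulled toward 1 rather than set to 0, the regularized Ω̂ is full rank and invertible, which the downstream test requires.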

Beta-binomial approximation
In this subsection, we describe the beta-binomial approximation [3] used to calculate values of F^(d), the CDF of π_(d), for d = 1, . . ., qD, which is needed for obtaining the l-values used in forming the ADELLE test statistic.
From Eq (7) in the main text, we have for the l-value

F^(d)(p) = P₀(π_(d) ≤ p) = P₀(S(c_p) ≥ d),    (6)

where, for a given SNP, S(c) = Σ_{k=1}^D 1{|Z_k| ≥ c} counts the trait statistics with magnitude at least c, and c_p = −Φ⁻¹(p/2). If Ω = I, then for c ≥ 0, S(c) has the null distribution of a Binomial(D, 2Φ(−c)) random variable, and

F^(d)(p) = P(Binomial(D, p) ≥ d),

which is the CDF of a Beta(d, D + 1 − d) distribution evaluated at p. When Ω ≠ I, the null mean of S(c) will remain the same, E₀[S(c)] = 2DΦ(−c); however, the null variance is

V₀[S(c)] = 2DΦ(−c)(1 − 2Φ(−c)) + Σ_{k≠l} Cov₀(1{|Z_k| ≥ c}, 1{|Z_l| ≥ c}) ≥ 2DΦ(−c)(1 − 2Φ(−c)),    (8)

where the last inequality follows from the positive correlation of the magnitudes of any correlated, mean-zero, bivariate Gaussians.
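Under independence, the binomial-tail and beta-CDF forms of F^(d) agree exactly, as a quick numerical check confirms (the values of D, d, and p are arbitrary illustrations):

```python
from scipy.stats import binom, beta

D, d, p = 100, 5, 0.03

# P0(pi_(d) <= p) = P(Binomial(D, p) >= d) when Omega = I ...
lhs = binom.sf(d - 1, D, p)        # survival function gives P(X >= d)

# ... which equals the Beta(d, D + 1 - d) CDF evaluated at p.
rhs = beta.cdf(p, d, D + 1 - d)

print(abs(lhs - rhs) < 1e-12)      # True
```

This is the classical duality between binomial tails and the distribution of uniform order statistics.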
Following the method proposed by [3] (see also [4]), we approximate the distribution of S(c) with a beta-binomial distribution BB(D, a, b). The beta-binomial is the distribution formed from the binomial distribution when the success probability is drawn from a beta distribution. It is convenient to reparameterize the beta-binomial in terms of parameters λ = a/(a + b) and γ = 1/(a + b). If X is drawn from a BB(D, λ, γ), where λ > 0, γ > 0, and D is a positive integer, then X has the following properties: λ = E(X)/D, and

P(X = k) = (D choose k) B(k + λ/γ, D − k + (1 − λ)/γ) / B(λ/γ, (1 − λ)/γ),    (9)

where B(a, b) is the beta function.
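The reparameterization can be checked with `scipy.stats.betabinom`; the (λ, γ) values below are arbitrary illustrations:

```python
from scipy.stats import betabinom

D = 200
lam, gam = 0.05, 0.02             # lambda = a/(a+b), gamma = 1/(a+b)
a, b = lam / gam, (1 - lam) / gam # invert the reparameterization

X = betabinom(D, a, b)
print(X.mean() / D)               # recovers lambda = 0.05

# Overdispersion relative to Binomial(D, lam) whenever gamma > 0:
print(X.var() > D * lam * (1 - lam))   # True
```

The extra variance controlled by γ is exactly what lets the beta-binomial match the inflated null variance of S(c) when Ω ≠ I.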
To approximate the distribution of S(c), we choose λ and γ by the method of moments. Barnett et al. [3] derived the following:

E₀[S(c)] = 2DΦ(−c),    (10)

V₀[S(c)] = 2DΦ(−c)(1 − 2Φ(−c)) + 4ϕ(c)² Σ_{k≠l} Σ_{i=1}^∞ (Ω_kl^{2i}/(2i)!) [H_{2i−1}(c)]²,    (11)

where the H_i are the Hermite polynomials, Ω_kl is the (k, l)th entry of Ω, and ϕ and Φ are the density and distribution functions of a standard normal, respectively. Note that Eqs 9, 10, and 11 imply that once λ is set by the method of moments, knowing λ determines c, which determines V₀[S(c)] and thus determines γ. So we will drop the writing of γ (and D) below to simplify notation, and define f_λ(x) to be the beta-binomial density with parameters (D, λ, γ), where λ and γ are related by the equations λ = 2Φ(−c) and

Dλ(1 − λ)(1 + Dγ)/(1 + γ) = V₀[S(c)].

Applying this to Eq 6,

l̂_d(h) = Σ_{k=d}^D f_h(k).    (12)

Pre-computation for the ADELLE test

Our ADELLE method lends itself to a pre-computation to reduce computation time when it will be applied to a large number of SNPs and a large number of traits. Suppose we observe M SNPs along with the D traits. Then, from Eq (12), calculating M ADELLE statistics involves M × Σ_{j=1}^{qD} (D − j + 1) ≈ MqD² evaluations of a beta-binomial probability mass function f_h(k). However, we also see that computation of l̂₁(h) necessarily also computes all the l-values l̂₁(h), . . ., l̂_D(h). This motivates a computationally efficient strategy for approximating the function F^(d)(h) for arbitrary h and 1 ≤ d ≤ qD by pre-computing a qD × H matrix of l-values, where H is the number of values of h in the grid. We choose a set of pre-computation points, H = {h₁, . . ., h_H}, where 0 < h₁ < ⋯ < h_H = 1, and evaluate each l̂_d only on the points in H. For 1 ≤ d ≤ qD and h ∉ H, we linearly interpolate between l̂_d(h_i) and l̂_d(h_{i+1}) when h_i < h < h_{i+1}. For h < h₁, we compute the l-value directly from Eq (12).
In general, both the computation time and the accuracy of the approximation tend to increase with the number of points H and the density of the pre-computation grid. In the data analysis, we use H = 10⁵ and h₁ = 10⁻²⁰. A naive means of assigning the values of H would be to create an evenly spaced grid between h₁ and 1, but this devotes few points, and little resolution, to small values of h. Instead we choose H to be the geometric sequence h_i = h₁^{(H−i)/(H−1)} for 1 ≤ i ≤ H. This results in 5,000 grid points for each order of magnitude, scaled logarithmically, for our choice of h₁ and H.
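The geometric grid is straightforward to build with `numpy`, whose `geomspace` produces exactly this log-scale spacing between the chosen endpoints:

```python
import numpy as np

H, h1 = 10**5, 1e-20

# Geometric sequence h_i = h1 ** ((H - i) / (H - 1)), i = 1..H,
# i.e., H points equally spaced on the log scale from h1 up to 1.
h = np.geomspace(h1, 1.0, H)
print(h[0], h[-1])                # 1e-20 1.0

# 1e5 points spanning 20 orders of magnitude -> 5,000 points per decade
print(H / 20)                     # 5000.0
```

Pre-computed l-values on this grid can then be evaluated at an arbitrary h with ordinary linear interpolation (e.g., `np.interp`).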