# Genetic Linkage Analysis of a Dichotomous Trait Incorporating a Tightly Linked Quantitative Trait in Affected Sib Pairs

^{1}Departments of Statistics and Actuarial Science and Biostatistics, Division of Statistical Genetics, University of Iowa, Iowa City; and

^{2}Department of Preventive Medicine and Epidemiology, Loyola University Chicago Medical Center, Maywood, IL

## Abstract

Many complex diseases are usually treated as dichotomous traits but are also associated with quantitative biological markers or quantitative risk factors. Although the associated quantitative trait may not directly underlie the diagnosis of disease status, if it is also linked to the chromosomal regions linked to the dichotomous trait, then joint analysis of the dichotomous and quantitative traits should be more efficient than separate analyses. Previous studies have focused on the situation in which a dichotomous trait can be modeled by a threshold process acting on a single underlying normal liability distribution. However, for many complex disorders, including most psychiatric disorders, diagnosis is generally based on a set of binary or discrete criteria. These traits cannot be modeled on the basis of a threshold process acting on an underlying continuous trait. We propose a likelihood-based method that efficiently combines such a discrete trait and an associated quantitative trait in the analysis, using affected-sib-pair data. Our simulation studies suggest that joint analysis increases the power to detect linkage of dichotomous traits. We also apply the proposed method to an asthma genome-scan data set and incorporate the total serum immunoglobulin E level in the analysis.

## Introduction

Many complex diseases, such as asthma, autism, and schizophrenia, are usually treated as dichotomous traits but are also associated with quantitative biological markers or quantitative risk factors. For example, asthma is associated with total serum immunoglobulin E (IgE) level: children with asthma tend to have elevated total serum IgE levels. For such diseases, although the associated quantitative trait may not directly underlie the diagnosis of disease status, if it is also linked to the dichotomous trait loci or to the chromosomal regions linked to the trait loci, then including the quantitative trait in the analysis will, in general, increase our ability to map the genes that predispose to complex diseases.

There has been much work on the joint analysis of (*a*) dichotomous traits that can be modeled by a threshold process acting on an underlying normal liability distribution and (*b*) associated quantitative traits (see, e.g., Lalouel et al. ^{1985}; Moldin et al. ^{1990}; Ott ^{1995}; Almasy et al. ^{1997}; Blangero et al. ^{1997}; Williams et al. ^{1999a}, ^{1999b}). These studies have shown that the power to detect chromosomal regions influencing a disease trait can benefit considerably from joint analysis of the discretized trait and a correlated quantitative factor. For such joint analysis, the likelihood of the data can be obtained by appropriately integrating a bivariate (or multivariate) normal trait likelihood, according to the threshold values that determine the trait status. This approach therefore rests on the premise that the dichotomous trait is a discretized version of an underlying continuous trait and that the rule of discretization is known.

However, for many complex disorders, including most psychiatric disorders, diagnosis is generally based on a set of binary or discrete criteria. The disease status cannot be modeled on the basis of a threshold process acting on an underlying continuous trait. Thus, the existing approaches cited above do not apply. We propose a combined likelihood ratio (CLR) test using affected-sib-pair (ASP) data for detecting linkage of the dichotomous trait incorporating an associated quantitative trait. The proposed likelihood is efficiently constructed from two components. The first component is for the dichotomous trait (the affection status) and uses the number of alleles that are shared identical by descent (IBD), as in the maximum LOD score (MLS) statistic (Risch ^{1990}). The second component is for the quantitative trait and is based on the variance components (VC) method (see, e.g., Goldgar ^{1990}; Schork ^{1993}; Amos ^{1994}).

We conduct simulation studies to examine the finite sample behavior of the proposed likelihood ratio (LR) statistic, with respect to the type I error rate and power. The simulation results show that the proposed LR test is more powerful than the test that does not incorporate the associated quantitative trait. We also analyze the asthma data set of Wjst et al. (^{1999}) to illustrate our proposed approach.

## Methods

### The Combined Likelihood

Let *y*_{i}=(*y*_{1i},*y*_{2i}) be the associated quantitative trait values of the *i*th ASP. Let *m*_{i}=(*m*_{i1},…,*m*_{ik}) be the marker data of the *i*th ASP at *k* loci in the chromosomal region under investigation. At each locus, the marker genotype of the parents may or may not be available.

For the dichotomous trait, we do not assume any specific mode of inheritance. For the associated quantitative trait, following Haseman and Elston (^{1972}), we consider the general model

$$y_{ji} = \mu + g_{ji} + e_{ji}, \qquad j = 1, 2, \tag{1}$$

where μ is the overall mean; *g*_{ji},*j*=1,2 are the genetic effects taking values *a, d,* and −*a* for genotypes *DD, Dd,* and *dd,* respectively; and *e*_{ji},*j*=1,2 are the residual effects including polygenic and environmental effects. Denote σ^{2}_{e}=*Var*(*e*_{ji}) and σ_{e1,e2}=*Cov*(*e*_{1i},*e*_{2i}).

We note that the loci affecting the dichotomous trait and quantitative trait may not be the same. If they are far apart, then incorporating a quantitative trait into the analysis in general does not increase the power to detect linkage of the dichotomous trait. Therefore, we consider only the following scenarios: (*a*) pleiotropy, in which the same locus affects both the dichotomous and quantitative traits; and (*b*) tight coincident linkage (Almasy et al. ^{1997}), in which the genes that affect the dichotomous trait and quantitative trait are different but are tightly linked and we assume that the recombination fraction between them is zero.

Consider a chromosomal region containing, at most, one gene that predisposes to the dichotomous trait. Let *t* be the location (in cM) from the left end of the chromosome. The likelihood at locus *t* is the conditional probability of the observed quantitative trait values and the marker, given that the sib pair is affected—that is, *L*_{i}(*t*)=*p*(*y*_{i},*m*_{i}|*asp*_{i},*t*). We use this conditional likelihood because the data are ascertained on the basis of the criterion that the sibs are affected. In the absence of the quantitative trait, this form of the likelihood is also used in the MLS statistic for ASP data by Risch (^{1990}) and by Whittemore (^{1996}), for general pedigree data.

Let *s*_{i}(*t*) be the number of alleles shared IBD of the *i*th pair at locus *t*. By the assumption of pleiotropy or tight coincident linkage and by means of standard decomposition of probability, the likelihood can be written as

$$L_i(t; z, \rho) = \sum_{j=0}^{2} p\big(y_i \mid s_i(t) = j, \mathrm{asp}_i\big)\, P\big(m_i \mid s_i(t) = j\big)\, P\big(s_i(t) = j \mid \mathrm{asp}_i\big),$$

where the parameters of interest *z* and ρ are included in the argument of *L*_{i}. The overall likelihood *L* is simply the product of the *L*_{i} values over all the sib pairs.

The likelihood *L*_{i} consists of three components. The first component is the conditional distribution of the quantitative trait values of ASP given the IBD sharing *s*_{i}(*t*). On the basis of the model, we have (Haseman and Elston ^{1972}; Amos ^{1994})

$$\mathrm{Var}\big(y_{1i} \mid s_i(t) = j\big) = \mathrm{Var}\big(y_{2i} \mid s_i(t) = j\big) = \sigma^2_a + \sigma^2_d + \sigma^2_e$$

and

$$\mathrm{Cov}\big(y_{1i}, y_{2i} \mid s_i(t) = j\big) = \begin{cases} \sigma_{e1,e2}, & j = 0, \\ \tfrac{1}{2}\sigma^2_a + \sigma_{e1,e2}, & j = 1, \\ \sigma^2_a + \sigma^2_d + \sigma_{e1,e2}, & j = 2. \end{cases}$$

We assume that *y*_{i} conditional on *s*_{i}(*t*)=*j* is bivariate normal. We note that, in many situations, appropriate transformation is needed to achieve normality. This is illustrated in the analysis of asthma data, in the example given below.

For ASP data, instead of the variance components parameters, we find that it is more convenient to use the total variance and the correlation coefficients. Let σ^{2}=σ^{2}_{a}+σ^{2}_{d}+σ^{2}_{e} be the total variance. Denote the correlation coefficients by

$$\rho_j = \frac{\mathrm{Cov}\big(y_{1i}, y_{2i} \mid s_i(t) = j\big)}{\sigma^2}, \qquad j = 0, 1, 2. \tag{2}$$

Let ρ=(ρ_{0},ρ_{1},ρ_{2}). From equation (2) for ρ_{j}*,* derived from the basic quantitative trait model (1), we see that ρ satisfies the monotonicity constraints ρ_{0}≤ρ_{1}≤ρ_{2}. Furthermore, if we assume that the residual covariance σ_{e1,e2} is nonnegative, then we have

$$0 \le \rho_0 \le \rho_1 \le \rho_2. \tag{3}$$

Intuitively, the monotonicity constraints make sense, because the correlation should increase as the amount of IBD sharing at a linked locus increases. The restriction that ρ_{0}≥0 arises from the consideration that even when two sibs share 0 alleles IBD at locus *t,* they may still share the same genes at other loci, as well as similar environmental factors. Therefore, the residual correlation should be nonnegative. When there is no linkage (σ_{a}=σ_{d}=0), then ρ_{0}=ρ_{1}=ρ_{2}.
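As a minimal sketch of how the correlations follow from the variance components, the Python snippet below assumes the standard sib-pair covariances implied by the Haseman and Elston (1972) model: the residual covariance alone when no alleles are shared IBD, plus half the additive variance for one shared allele, plus the full additive and dominance variances for two. The function name `ibd_correlations` and the numeric values are illustrative assumptions, not part of the original analysis.

```python
def ibd_correlations(var_a, var_d, var_e, cov_e):
    """Return (rho_0, rho_1, rho_2), the sib correlations given IBD = 0, 1, 2.

    Assumes Cov(y1, y2 | IBD = j) equals cov_e, var_a/2 + cov_e, and
    var_a + var_d + cov_e for j = 0, 1, 2 (standard Haseman-Elston result).
    """
    total = var_a + var_d + var_e  # total variance sigma^2
    cov = (cov_e, 0.5 * var_a + cov_e, var_a + var_d + cov_e)
    return tuple(c / total for c in cov)


# Illustrative values only (not taken from the article).
rho = ibd_correlations(var_a=0.4, var_d=0.1, var_e=1.0, cov_e=0.3)
print(rho)

# With no linkage (var_a = var_d = 0) the three correlations coincide.
print(ibd_correlations(0.0, 0.0, 1.0, 0.3))
```

With a nonnegative residual covariance, the returned values are nonnegative and nondecreasing in the number of alleles shared IBD, which is the ordering the monotonicity constraints describe.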

We note that all the parameters (in particular, the additive and dominance variances and the correlation coefficients ρ) are for the ASP data. That is, they are defined with respect to the population of ASPs. They are, in general, different from the corresponding parameters in the population, as in the standard VC methods.

The second component, *w*_{ij}(*t*)=*P*(*m*_{i}|*s*_{i}(*t*)=*j*), is the conditional probability of the observed marker data, given that the *i*th pair shares *j* alleles IBD at locus *t*. It is clear that *w*_{ij}(*t*) is determined by the marker data and the location *t*. Let π_{ij}(*t*)=*P*(*s*_{i}(*t*)=*j*|*m*_{i}) be the probability that the *i*th pair shares *j* alleles IBD given the marker data. By Bayes’ theorem, *w*_{ij}(*t*)=π_{ij}(*t*)*P*(*m*_{i})/*P*(*s*_{i}(*t*)=*j*). The probability π_{ij}(*t*) can be computed using the Genehunter program (Kruglyak et al. ^{1996}). *P*(*m*_{i}) is the Mendelian probability of the marker; thus, it does not involve any parameters and can be treated as a constant. The unconditional probability *P*(*s*_{i}(*t*)=*j*) is 0.25, 0.5, and 0.25 for *j*=0, 1, and 2, respectively.

The third component, *z*_{j}=*P*(*s*_{i}(*t*)=*j*|*asp*_{i}), is the probability that an ASP shares *j* alleles IBD, for *j*=0,1,2. This is the parameter of main interest. Under the null hypothesis of no linkage between the dichotomous trait and locus *t,* *z*_{0}=0.25, *z*_{1}=0.5, and *z*_{2}=0.25. Because *z*_{0}+*z*_{1}+*z*_{2}=1, we only need to consider *z*_{0} and *z*_{1}. Let *z*=(*z*_{0},*z*_{1}). The test of linkage between the dichotomous trait and the locus *t* can be formulated in terms of whether *z* deviates from its null value.
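The three components can be combined for a single sib pair along the following lines. This is a sketch under stated assumptions rather than the authors' program: the names `pair_likelihood` and `bivariate_normal_pdf` are hypothetical, both sibs are assumed to share a common mean and variance, and the constant Mendelian probability of the marker data is dropped.

```python
import math

IBD_PRIOR = (0.25, 0.5, 0.25)  # unconditional P(s = j) for a sib pair


def bivariate_normal_pdf(y1, y2, mu, sigma2, rho):
    """Bivariate normal density with common mean mu, variance sigma2, correlation rho."""
    u = (y1 - mu) / math.sqrt(sigma2)
    v = (y2 - mu) / math.sqrt(sigma2)
    q = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * sigma2 * math.sqrt(1.0 - rho * rho))


def pair_likelihood(y, pi, z, rho, mu=0.0, sigma2=1.0):
    """Likelihood contribution of one pair, up to the constant P(m_i).

    y   : (y1, y2), quantitative trait values of the two sibs
    pi  : P(s = j | marker data) for j = 0, 1, 2
    z   : P(s = j | affected sib pair) for j = 0, 1, 2
    rho : sib correlation given s = j, for j = 0, 1, 2
    """
    total = 0.0
    for j in range(3):
        w_j = pi[j] / IBD_PRIOR[j]  # Bayes' theorem with P(m_i) dropped
        total += bivariate_normal_pdf(y[0], y[1], mu, sigma2, rho[j]) * w_j * z[j]
    return total


# Hypothetical inputs for one pair.
print(pair_likelihood((0.3, -0.2), (0.1, 0.6, 0.3), (0.2, 0.5, 0.3), (0.1, 0.25, 0.4)))
```

The overall log-likelihood would then be the sum of the logarithms of these contributions over all pairs.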

### The CLR Test

To define the LR test statistics, we first need to consider (*a*) the parameter spaces of *z* and ρ and (*b*) the relationship between the parameters *z* and ρ.

The sharing probabilities (*z*_{0},*z*_{1}) must lie in a closed triangle bounded by *z*_{0}=0, *z*_{1}=0.5, and *z*_{1}=2*z*_{0} (Holmans ^{1993}). We denote this triangle by “Δ” below. Under pleiotropy or tight coincident linkage, the parameters *z* and ρ, which describe linkage of the dichotomous and quantitative traits, should be related. In particular, pleiotropy or tight coincident linkage implies that a locus is linked to the dichotomous trait if and only if it is also linked to the quantitative trait. Since our main interest is in detecting linkage of the dichotomous trait, the parameter *z* is of primary interest. Therefore, we use the following model to take into account pleiotropy and tight coincident linkage in the likelihood:

These equations ensure that, if the dichotomous trait is not linked to a locus (*z*_{0}=0.25), then neither is the quantitative trait.
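The triangle constraint on the sharing probabilities is easy to check numerically. A small sketch follows; the function name `in_possible_triangle` and the tolerance argument are assumptions made for illustration.

```python
def in_possible_triangle(z0, z1, tol=1e-12):
    """True if (z0, z1) lies in the closed triangle bounded by
    z0 = 0, z1 = 0.5, and z1 = 2 * z0."""
    return z0 >= -tol and z1 <= 0.5 + tol and z1 >= 2.0 * z0 - tol


print(in_possible_triangle(0.25, 0.5))  # null value, a vertex of the triangle
print(in_possible_triangle(0.30, 0.5))  # excess sharing of zero alleles, outside
```

A constrained maximizer would only search over points for which this check succeeds.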

To ensure that ρ_{0}, ρ_{1}, and ρ_{2} satisfy the monotonicity constraints, we must restrict the parameter (β_{0}, β_{1}, β_{2}) in the space

We note that the above expressions provide one way for incorporating pleiotropy and coincident linkage explicitly into the likelihood. There are probably other parameterizations that can achieve the same goal. We choose the above expressions because the constraints on ρ are automatically satisfied with the restriction on β_{0}, β_{1}, and β_{2}. We write the overall likelihood *L* in terms of (*z*,β) below.

With this parameterization, the parameter space is Θ={(*z*,β):*z*∈Δ,β∈*B*}, where *B* denotes the restricted space for (β_{0},β_{1},β_{2}) introduced above. Under the null hypothesis of no linkage, β_{1} and β_{2} disappear from model (4). The corresponding parameter space is Θ_{0}={(*z*,β_{0}):*z*=(0.25,0.5),β_{0}≥0}.

In general, the hypotheses can be stated in terms of the null parameter space versus everything outside the null space. The proposed CLR test statistic at locus *t* corresponding to the hypotheses Θ_{0} versus Θ−Θ_{0} is

$$\mathrm{CLR}(t) = 2\log\frac{L(\hat{z}, \hat{\beta})}{L(z_0, \tilde{\beta}_0)},$$

where $(\hat{z}, \hat{\beta})$ is the maximum likelihood estimator (MLE) obtained in Θ and where *z*_{0}=(0.25,0.5) and $\tilde{\beta}_0$ is the MLE of β_{0} obtained in Θ_{0}.

However, by considering the whole parameter space Θ, the degrees of freedom are relatively high, which may reduce the power of the test. To reduce the degrees of freedom, we can consider a reasonable subspace of Θ. For complex disorders, the dominance variance is, in general, small (Risch ^{1990}). Thus, in computing the LR test statistic, we can restrict *z*_{1}=0.5. This is equivalent to assuming that the dominance variance of the dichotomous trait is 0 in the analysis. For parameters β_{1} and β_{2}, we can also put further constraints on them. For example, a simple restriction is to set β_{2}=2β_{1}. This restriction ensures that ρ automatically satisfies equation (3). We note that the above restrictions do not necessarily correspond to the true underlying model. However, the LR test with these restrictions is valid in the sense that the size of the test is correct. With this restriction, the exact asymptotic null distribution is unknown. A conservative null distribution of the CLR statistic is 0.25χ^{2}_{0}+0.5χ^{2}_{1}+0.25χ^{2}_{2}, where χ^{2}_{0} denotes the degenerate distribution that puts probability 1 at 0. If ρ and *z* are independent parameters not constrained by model (4), then the asymptotic null distribution is 0.25χ^{2}_{0}+0.5χ^{2}_{1}+0.25χ^{2}_{2} (Self and Liang ^{1987}). With the constraints defined in model (4), this distribution is stochastically greater than the correct asymptotic null distribution. Our simulation studies (see the “Simulation Results” section, below) show that this distribution is much too conservative. On the basis of our simulation results, a better but still conservative approximation is to increase the value of the CLR statistic by 0.8 (on the χ^{2} scale) and then use the distribution 0.25χ^{2}_{0}+0.5χ^{2}_{1}+0.25χ^{2}_{2} to calculate the *P* value. We will use this approximation to calculate the *P* values of the CLR statistics in the analysis of an asthma data set below. A computer program for computing the CLR statistic can be found at J.H.'s Web site.
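The conservative *P* value approximation can be sketched in a few lines of Python. The sketch uses the closed-form survival functions of the χ^{2} distributions with 1 and 2 df, and it assumes that a LOD score is the χ^{2}-scale statistic divided by 2 ln 10; the function names are hypothetical.

```python
import math


def mixture_pvalue(x, weights=(0.25, 0.5, 0.25)):
    """P(X >= x) for a mixture of chi-square distributions with 0, 1, and 2 df."""
    if x <= 0.0:
        return 1.0
    sf1 = math.erfc(math.sqrt(x / 2.0))  # survival function of chi-square, 1 df
    sf2 = math.exp(-x / 2.0)             # survival function of chi-square, 2 df
    return weights[1] * sf1 + weights[2] * sf2


def clr_pvalue_from_lod(lod, shift=0.8):
    """Conservative P value: add `shift` on the chi-square scale, then use the mixture.

    Assumes LOD = (chi-square-scale statistic) / (2 ln 10).
    """
    chi2 = 2.0 * math.log(10.0) * lod
    return mixture_pvalue(chi2 + shift)


print(clr_pvalue_from_lod(2.9))
```

Under these assumptions, a CLR LOD score of 2.9 corresponds to a *P* value of roughly .0003. The same `mixture_pvalue` function with weights (0.5, 0.5, 0) gives the *P* value for a statistic whose null distribution is 0.5χ^{2}_{0}+0.5χ^{2}_{1}.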

## Simulation Results

We conduct simulation studies to evaluate the null distribution of the CLR statistic and compare its power with the MLS test for ASP data (Risch ^{1990}).

In the simulations, we assume that the parents and the sibs in a family are genotyped. We also assume that the trait locus has two alleles, *D* and *d,* with allele frequencies *p* and *q*=1-*p*, respectively. The marker used in the simulation has 10 alleles with equal allele frequencies. The polymorphism information content of the marker is 0.89. We also assume that the same locus affects both the dichotomous and continuous traits. For the dichotomous trait, we consider three genetic models: recessive, dominant, and additive. For the continuous trait, we consider the model *y*_{j}=μ+*g*_{j}+*u*+*e*_{j}, for *j*=1, 2, where μ is the overall mean, *g*_{j} is the genetic effect, *u* is the common environmental effect, and *e*_{j} is the residual. The genetic effect *g*_{j} takes the values *a, d,* and *−a* for genotypes *DD, Dd,* and *dd,* respectively. We assume that *u*~*N*(0,σ^{2}_{u}) and *e*_{j}~*N*(0,σ^{2}_{e}). Let the broad-sense heritability be (Lynch and Walsh ^{1998})

However, the *H*^{2} here is not the population heritability as it is usually understood. This *H*^{2} is specifically the heritability among the ASPs. If we assume an additive model (*d*=0) on the genetic effect *g*_{j}*,* then *a* is given by

We fix σ^{2}_{u}=σ^{2}_{e}=1. Thus, when the trait allele frequency is given, *a* is determined by the heritability. This fact is used in determining the generating values of *a* for a given *H*^{2} and *p*.
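To make the generating model for the continuous trait concrete, the sketch below simulates one sib pair at a time: parental genotypes are drawn under Hardy-Weinberg proportions, each sib inherits one allele from each parent, and the trait is built as μ+*g*_{j}+*u*+*e*_{j}. It is an illustration only; the function name and defaults are assumptions, and it does not condition on both sibs being affected, which the actual simulations require.

```python
import random


def simulate_sib_pair(p, a, d=0.0, mu=0.0, sd_u=1.0, sd_e=1.0, rng=random):
    """Simulate one sib pair; return ((y1, y2), ibd) with ibd in {0, 1, 2}."""
    # Parental genotypes: each allele is D (True) with probability p.
    father = [rng.random() < p for _ in range(2)]
    mother = [rng.random() < p for _ in range(2)]
    # Each sib receives one paternal and one maternal allele, chosen at random.
    picks = [(rng.randrange(2), rng.randrange(2)) for _ in range(2)]
    # Alleles are IBD when both sibs received the same parental copy.
    ibd = int(picks[0][0] == picks[1][0]) + int(picks[0][1] == picks[1][1])
    u = rng.gauss(0.0, sd_u)        # common environmental effect shared by the sibs
    effect = {2: a, 1: d, 0: -a}    # genetic value by number of D alleles
    ys = []
    for fa, mo in picks:
        n_d = int(father[fa]) + int(mother[mo])
        ys.append(mu + effect[n_d] + u + rng.gauss(0.0, sd_e))
    return tuple(ys), ibd


rng = random.Random(2003)  # arbitrary seed for reproducibility
print(simulate_sib_pair(p=0.1, a=1.0, rng=rng))
```

Restricting such draws to pairs in which both sibs are affected under a chosen penetrance model would give data of the kind used in the power study.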

### Simulation under *H*_{0}

A conservative estimate of the asymptotic null distribution of the CLR statistic is a mixture of χ^{2} distributions: 0.25χ^{2}_{0}+0.5χ^{2}_{1}+0.25χ^{2}_{2}. However, the exact asymptotic null distribution of the CLR statistic is unknown. Therefore, we performed simulations to determine the critical value for a given test size and to gauge how conservative the mixture χ^{2} distribution is. To accurately determine the critical value for the power calculation, we perform the simulation using the same generating models (recessive, dominant, and additive) as in the power simulation (see below), except that the recombination fraction between the marker and trait loci is set to be 0.5, and the heritability for the quantitative trait is set to be 0. The sample size is *n*=100, which is also the sample size in the power simulation. Under the null hypothesis, the trait model (i.e., recessive, dominant, or additive) does not affect the distribution of the CLR statistic. Therefore we combined the results based on 30,000 replications from the three generating models to determine the empirical critical value for a given test size.

To compare the critical values of the CLR statistic and the MLS statistic on the basis of *p*(*m*_{i}|*asp*_{i},*t*) (Risch ^{1990}) without incorporating the quantitative trait and assuming that the dominance effect is zero, we also computed the theoretical and empirical critical values of the latter, which has an asymptotic null distribution 0.5χ^{2}_{0}+0.5χ^{2}_{1}.
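For reference, the MLS statistic under the zero-dominance restriction (*z*_{1}=0.5) reduces to a one-parameter maximization over *z*_{0}. The sketch below is a hypothetical implementation, not the code used in the article: it takes the IBD probabilities given the marker data for each pair and uses a ternary search, which is adequate here because the log-likelihood is concave in *z*_{0}.

```python
import math


def mls_lod(pis, tol=1e-9):
    """Maximum LOD score over z0 in [0, 0.25] with z1 = 0.5 and z2 = 0.5 - z0.

    pis: list of (pi0, pi1, pi2), the IBD probabilities given the marker data.
    Returns (lod, z0_hat).
    """
    prior = (0.25, 0.5, 0.25)

    def log10_lr(z0):
        z = (z0, 0.5, 0.5 - z0)
        # Per-pair likelihood ratio: sum_j pi_j * z_j / prior_j (equals 1 under the null).
        return sum(
            math.log10(sum(p * zj / pr for p, zj, pr in zip(pi, z, prior)))
            for pi in pis
        )

    lo, hi = 0.0, 0.25
    while hi - lo > tol:  # ternary search on a concave function
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log10_lr(m1) < log10_lr(m2):
            lo = m1
        else:
            hi = m2
    z0_hat = 0.5 * (lo + hi)
    return log10_lr(z0_hat), z0_hat


# Hypothetical IBD probabilities for four pairs.
pairs = [(0.05, 0.35, 0.60), (0.10, 0.50, 0.40), (0.30, 0.50, 0.20), (0.02, 0.28, 0.70)]
print(mls_lod(pairs))
```

Multiplying the resulting LOD score by 2 ln 10 puts it on the χ^{2} scale, where it can be compared with critical values from 0.5χ^{2}_{0}+0.5χ^{2}_{1}.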

Table 1 gives the simulated critical values for α=0.05, 0.01, and 0.001 and the critical values based on the asymptotic distribution. The simulated critical values of the MLS statistic and the theoretical values are similar. However, the simulated critical values of the CLR statistic are less than those based on the conservative asymptotic null distribution. In all of our simulation results, the differences between the simulated critical values and those based on the conservative asymptotic null distribution are ~1 but always >0.8. This suggests that the distribution 0.25χ^{2}_{0}+0.5χ^{2}_{1}+0.25χ^{2}_{2} is too conservative. A better but still conservative approximation is to add 0.8 to the CLR statistic (on the χ^{2} scale) and then use this distribution to calculate the *P* value, as described above. In the power simulation below, we use the simulated critical values for the CLR statistic and the theoretical values for the MLS statistic.
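The theoretical critical values of the two mixture distributions can be obtained by solving for the point at which the upper-tail probability equals α. The following sketch does this by bisection; it is illustrative only, the function names are hypothetical, and it does not reproduce the simulated critical values of table 1.

```python
import math


def mixture_sf(x, weights):
    """P(X > x), for x > 0, with weights on chi-square distributions with 0, 1, 2 df."""
    return weights[1] * math.erfc(math.sqrt(x / 2.0)) + weights[2] * math.exp(-x / 2.0)


def critical_value(alpha, weights, hi=100.0, tol=1e-10):
    """Solve mixture_sf(c) = alpha by bisection; the tail probability decreases in c."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_sf(mid, weights) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for alpha in (0.05, 0.01, 0.001):
    clr_bound = critical_value(alpha, (0.25, 0.5, 0.25))  # conservative CLR bound
    mls = critical_value(alpha, (0.5, 0.5, 0.0))          # MLS with z1 = 0.5
    print(f"alpha={alpha}: CLR bound {clr_bound:.2f}, MLS {mls:.2f}")
```

For the MLS mixture, the critical value at level α is simply the upper 2α point of the χ^{2} distribution with 1 df, which gives a quick check on the bisection.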

### Power Simulation

The purpose of the power simulation is to evaluate whether incorporation of the associated quantitative trait increases the power to detect linkage of the dichotomous trait under pleiotropy and tight coincident linkage. Therefore, we compare the power of the proposed CLR test and the MLS test for ASP data (which does not include the associated quantitative trait in the likelihood).

We consider three generating models: recessive, dominant, and additive. We assume that the recombination fraction between the marker and disease loci is 0. For each generating model, we assume that the phenocopy rate is 0.01 (the probability of being affected given the genotype at the trait locus is *dd*). We use two genotypic relative risks (GRRs), 4 and 8, in the simulation. The disease allele frequency for all the models considered is fixed at *p*=0.1. For each genetic model, the results were based on 10,000 replications. Six heritability values, from 0 to 0.5 at increments of 0.1, are considered.
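To see how strongly each generating model shifts IBD sharing among affected sib pairs, the expected sharing probabilities at the trait locus can be computed from the allele frequency and penetrances, using the standard single-locus expressions in terms of recurrence risk ratios (Risch 1990). The sketch below rests on assumptions not spelled out in the text: the GRR is taken as the ratio of the *DD* penetrance to the *dd* penetrance, the heterozygote penetrance equals the *dd* value (recessive), the *DD* value (dominant), or their midpoint (additive), and the function names are hypothetical.

```python
def penetrances(model, grr, phenocopy=0.01):
    """Return (f_dd, f_Dd, f_DD); assumes f_DD = grr * f_dd."""
    f_dd, f_DD = phenocopy, grr * phenocopy
    f_Dd = {
        "recessive": f_dd,
        "dominant": f_DD,
        "additive": 0.5 * (f_dd + f_DD),
    }[model]
    return f_dd, f_Dd, f_DD


def asp_ibd_sharing(p, f_dd, f_Dd, f_DD):
    """Expected (z0, z1, z2) for affected sib pairs at the trait locus."""
    q = 1.0 - p
    k = q * q * f_dd + 2.0 * p * q * f_Dd + p * p * f_DD           # prevalence
    v_a = 2.0 * p * q * (p * (f_DD - f_Dd) + q * (f_Dd - f_dd)) ** 2  # additive variance
    v_d = (p * q * (f_DD - 2.0 * f_Dd + f_dd)) ** 2                # dominance variance
    lam_s = 1.0 + (0.5 * v_a + 0.25 * v_d) / k ** 2  # sib recurrence risk ratio
    lam_o = 1.0 + 0.5 * v_a / k ** 2                 # parent-offspring ratio
    lam_m = 1.0 + (v_a + v_d) / k ** 2               # monozygotic-twin ratio
    return 0.25 / lam_s, 0.5 * lam_o / lam_s, 0.25 * lam_m / lam_s


for model in ("recessive", "dominant", "additive"):
    for grr in (4, 8):
        z = asp_ibd_sharing(0.1, *penetrances(model, grr))
        print(model, grr, tuple(round(v, 4) for v in z))
```

The further the resulting sharing probabilities lie from (0.25, 0.5, 0.25), the more linkage information the affection status alone carries.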

Table 2 gives the power simulation results for the test size 0.01 for recessive, dominant, and additive models. The results show that, for all the simulation models considered, the CLR test has higher power than the MLS test when *H*^{2}≥10%, except for both the dominant and additive models when *GRR*=8 and *H*^{2}=10%. When *H*^{2}=0, the MLS test tends to have higher power than the CLR test, except for the recessive model with *GRR*=4. This is to be expected, because if the locus is not linked to the quantitative trait, then incorporation of the quantitative trait in the analysis only introduces random noise and does not increase the power for detecting linkage to the dichotomous trait. However, we note that, for the generating models considered, when the heritability is small (10%), both the CLR and MLS tests have little power to detect linkage, except when *GRR*=8 in the additive model.


A clear trend in the simulation results is that the power of the CLR test to detect linkage of the dichotomous trait increases with the heritability of the associated quantitative trait. In contrast, the power of the MLS test basically remains constant for different values of the heritability. This is because, under pleiotropy or tight coincident linkage, the quantitative trait provides linkage information, in addition to the dichotomous trait. Therefore, incorporation of an associated quantitative trait into the analysis of a dichotomous trait in general increases the power to detect linkage under the pleiotropy or the tight coincident linkage model.

## Application to an Asthma Data Set

Asthma is a common chronic inflammatory disorder of the airways, characterized by airway hyperresponsiveness, epithelial damage, and airway smooth-muscle hypertrophy (Sheffer ^{1995}). Although several environmental factors have been identified that increase the risk of asthma, previous segregation analyses suggest that genetics may also play an important role (Litwin ^{1978}; Dold et al. ^{1992}).

There has been much interest in searching for genes that predispose to asthma in recent years. Several groups have conducted genome-scan and candidate-gene studies of asthma, as well as genome scans using quantitative traits associated with asthma, such as total serum IgE level (Marsh et al. ^{1994}; Meyers et al. ^{1994}; Daniels et al. ^{1996}; Collaborative Study on the Genetics of Asthma [CSGA] ^{1997}; Palmer et al. ^{1998}; Wilkinson et al. ^{1998}; Wjst et al. ^{1999}; Xu et al. ^{2001}), and have identified several regions that could be linked to asthma. All of these studies considered either asthma status only or the asthma-related quantitative traits in the linkage analysis. When both asthma status and the asthma-related quantitative traits are available, genome scans are performed separately.

We illustrate the proposed method with the asthma data set (Wjst et al. ^{1999}). In this data set, there are 91 families contributing two children with asthma and 6 families contributing three children with asthma. A total of 331 markers with an average intermarker distance of 10.7 cM on 22 autosomal chromosomes are available for linkage analysis. (For a detailed description of the data, see Wjst et al. ^{1999}.)

Because our method assumes normality of the quantitative trait, we first examine the distribution of the observed total IgE levels. Figure 1*a* gives the histogram of the standardized total IgE levels; that is, the total IgE levels are centered by the sample mean and divided by the sample SD. The density curve of the standard normal distribution is also given in the plot. The distribution of the total IgE is highly skewed to the right. For such skewed data, two commonly used transformations are the logarithm and square root. Figures 1*b* and 1*c* display the histograms of the standardized logarithm and square root of the total IgE levels, respectively. It can be seen that, although the distributions of the transformed data are less skewed than the raw data, they are still far from being normally distributed. Therefore, we use a nonparametric normal quantile transformation that guarantees that the marginal distribution of the data is approximately normal. This transformation is defined as follows. First, denote the (modified) empirical distribution function of the total IgE levels (*x*_{i1}*,x*_{i2}) of all the sib pairs by

where *n*=95. Then the transformed data is defined as *y*_{ij}=Φ^{-1}[*F*_{n}(*x*_{ij})], for *j*=1, 2 and *i*=1,…,*n*. Since *F*_{n} converges to *F,* the unknown marginal distribution of *x*_{ij}*,* this transformation always results in approximately normally distributed observations *y*_{ij}*,* regardless of the form of *F*. We call this transformation a “quantile transformation.” Figure 1*d* shows the histogram of the transformed data. We can see that the distribution of the transformed data is approximately normal.
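A quantile transformation of this kind can be sketched with the Python standard library. The exact form of the modified empirical distribution function is an assumption here: the sketch divides the count of pooled values not exceeding *x* by the pooled sample size plus 1, which keeps the result strictly below 1 so that Φ^{-1} stays finite. The function name and the example values are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist


def quantile_transform(pairs):
    """Map pooled sib-pair values to normal scores via a modified EDF.

    Assumption: F_n(x) = (number of pooled values <= x) / (N + 1),
    where N is the total number of pooled observations.
    """
    pooled = sorted(x for pair in pairs for x in pair)
    n_total = len(pooled)
    inv_cdf = NormalDist().inv_cdf  # standard normal quantile function

    def f_n(x):
        return bisect_right(pooled, x) / (n_total + 1)

    return [tuple(inv_cdf(f_n(x)) for x in pair) for pair in pairs]


# Hypothetical right-skewed values for three sib pairs.
data = [(12.0, 340.0), (55.0, 1800.0), (90.0, 27.0)]
for raw, score in zip(data, quantile_transform(data)):
    print(raw, tuple(round(s, 3) for s in score))
```

Because the transformation depends only on ranks, it preserves the ordering of the observations while removing the skewness of the raw values.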

Figure 1. Histograms of total IgE levels. The *Y*-axis in each panel represents density. *a,* Standardized total IgE levels. *b,* Standardized logarithm of total IgE levels. *c,* Standardized square root of total IgE levels. *d,* Transformed total …

Thus, we apply the proposed CLR test to the German asthma data with the quantile transformation of the total IgE levels. In computing the likelihood defined in equation (5), we restricted the parameter space so that *z*_{1}=0.5 and β_{2}=2β_{1}. As a comparison, we also computed the MLS statistic for ASP data under the assumption that *z*_{1}=0.5 (Risch ^{1990}), as well as the new Haseman-Elston statistic, using only the IgE level (Elston et al. ^{2000}). The LOD scores along the 22 autosomal chromosomes are displayed in figures 2 and 3, in which the solid lines are the LOD scores obtained on the basis of the proposed CLR statistic, the dotted lines are the LOD scores based on the MLS statistic, and the dashed lines are the LOD scores based on the H-E statistic. Because the H-E statistic is a *t* statistic, to make it comparable in scale, its corresponding (approximate) LOD score is calculated as the base-10 logarithm of the square of the maximum of the H-E statistic and zero.

As we can see from the LOD score plots, the H-E statistic that uses only IgE levels does not give any significant linkage signal in the 22 autosomal chromosomes. Therefore, we consider only the results from the CLR and MLS statistics below. The CLR and MLS statistics suggest that there are several interesting regions. The CLR score has an overall trend similar to the MLS score, but the CLR score is always higher. However, since the CLR statistic and the MLS score have different degrees of freedom, the scales of the two LOD scores are not strictly comparable. In table 3, we list the results of seven loci at which the maximum LOD scores based on the CLR statistic are >1.2.

From table 3, the strongest linkage signal comes from chromosome 6, where the CLR LOD score is 2.9, with *P* value .0003, and where the MLS LOD score is 1.76, with *P* value .0022. The second-highest LOD score is on chromosome 9, where the CLR LOD score is 2.2, with *P* value .0015, and where the MLS LOD score is 1.69, with *P* value .0027. For the remaining five locations, all the LOD scores are <2, and the CLR LOD scores have values that are slightly higher but mostly similar to the MLS LOD scores. However, at four locations, the MLS LOD scores have smaller *P* values than the CLR LOD scores. For example, on chromosome 15, the CLR LOD score is 1.99, with *P* value .0025, and the MLS LOD score is 1.83, with *P* value .0019. This illustrates that a bigger CLR score may not necessarily be more significant than a smaller MLS score.

We note that several previous studies have suggested linkage of asthma or total IgE level to various chromosomes, including the ones listed in table 3 (Marsh et al. ^{1994}; Meyers et al. ^{1994}; Daniels et al. ^{1996}; CSGA ^{1997}; Wjst et al. ^{1999}; Yokouchi et al. ^{2000}).

## Discussion

We proposed a likelihood-based approach for joint analysis of dichotomous and quantitative traits in sib-pair data. We have demonstrated that, for the models considered in our simulations, under pleiotropy or tight coincident linkage, the power to detect linkage of the dichotomous trait is increased when it is analyzed jointly with an associated quantitative trait.

The proposed CLR test is designed to deal with the situation in which the dichotomous trait cannot be modeled as the discretized version of an underlying continuous factor. Indeed, such dichotomous traits are truly *discrete* ones. It is perhaps more appropriate to call the qualitative traits obtained from an underlying continuous factor *discretized* traits. As has been pointed out before, when a continuous phenotype itself is available, there is no need to discretize it. Indeed, doing so will reduce the power to detect linkage (Williams et al. ^{1999a}).

As we mentioned earlier, many complex disorders—in particular, psychiatric disorders such as autism—are dichotomous traits, and there is no single underlying continuous risk factor. It has been well established that genetics play an important role in these disorders. However, it has proved difficult to localize the genes that predispose to these disorders by standard linkage methods. An emerging strategy is to include the information on the associated quantitative trait in the analysis, which may help with detection of the genes that affect these disorders (Almasy and Blangero ^{2001}; Piven ^{2001}). The proposed approach provides an efficient method for analytically implementing such a strategy.

In formulating the likelihood for the dichotomous and quantitative traits, we made the assumption of either pleiotropy or tight coincident linkage. These assumptions are necessary; if the dichotomous and quantitative traits are linked to loci that are far apart or unlinked, then combined analysis may not have higher power than the univariate analysis. This is because combined analysis increases the degrees of freedom, so, for a given size of the test, a larger critical value is required. This increase in degrees of freedom may not be compensated for by the modest increase in information in the joint analysis if the loci responsible for dichotomous and quantitative traits are far apart.

To incorporate pleiotropy and tight coincident linkage into the proposed likelihood, we used the expressions given in model (4). These expressions relate the variance-components parameters that describe linkage of the quantitative trait and the IBD-sharing parameters that describe linkage of the dichotomous trait. In addition to incorporating pleiotropy and tight coincident linkage into the analysis, these expressions serve two further purposes. The first is to make it explicit in the analysis that the primary purpose of the linkage analysis is to detect the genes that predispose to the dichotomous trait. This is achieved by defining the IBD-sharing parameters as the independent parameters, while the correlation parameters (for the quantitative trait) depend on the IBD-sharing parameters. We note that, if the primary interest is in mapping the loci affecting the quantitative trait, then we can reverse the roles of the IBD-sharing and VC parameters, so that the former become the dependent parameters and the latter the independent parameters, although different expressions than those given in model (4) should be considered. The second purpose is to reduce the degrees of freedom. This is achieved because the parameter space under the constraints of model (4) is smaller and the correlation parameters disappear under the null hypothesis. Although the asymptotic null distribution of the CLR statistic is complicated and unknown under the constraints of model (4), a simple mixture of χ^{2} distributions can be used to give an upper bound of the *P* value, as illustrated in the analysis of the asthma data example.

We note that the problem of joint analysis of dichotomous and quantitative traits considered in this study is different from that of incorporating covariates in ASP data (Greenwood and Bull ^{1999}; Olson ^{1999}; Gauderman and Siegmund ^{2001}; Goddard et al. ^{2001}; Devlin et al. ^{2002}). Specifically, here we are interested in an associated trait that is in pleiotropy or tight coincident linkage with the dichotomous trait of interest, and it is modeled as such in the likelihood analysis. However, in modeling covariates in ASP data, no such modeling consideration is required in the analysis. For example, the usual variables, such as sex and ethnicity, can be considered as covariates but are not appropriate to be modeled as associated traits that are in pleiotropy or tight coincident linkage with the trait of interest. Thus, it is important to distinguish these two types of variables. We can also consider the usual covariates in the proposed joint analysis of dichotomous and quantitative traits. For example, we can let the IBD-sharing parameters depend on the covariates to model locus heterogeneity or gene-environment interaction. We can also use a regression model to remove the covariate effects on the quantitative trait, as in standard VC methods (Amos ^{1994}).

In this study, we considered only ASP data with an associated quantitative trait. We are considering extending the present approach to general pedigrees that include both affected and unaffected individuals. It is also of interest to incorporate multiple associated quantitative traits into the analysis of a dichotomous trait, on the basis of the genetic covariance structure for multivariate traits (Lange and Boehnke 1983; Lange 1997). We hope to communicate the results of these extensions in future articles.

## Acknowledgments

This work is supported, in part, by National Institute of Mental Health grants K01-01541 and R01-52481 (both to J.H.; principal investigator for grant R01-52481: Veronica Vieland) and by National Heart, Lung, and Blood Institute grants HL53353 and HL65702 (both to Y.J.; principal investigator: Richard S. Cooper). The authors thank Dr. Matthias Wjst for permission to use the asthma data set and Drs. Veronica Vieland and Kai Wang for helpful discussions on the topic of this article. The authors also thank two anonymous reviewers for their constructive comments, which helped us clarify several points in the article.

## Electronic-Database Information

The URL for data presented herein is as follows:

## References

*a*) Joint multipoint linkage analysis of multivariate qualitative and quantitative traits. I. Likelihood formulation and simulation studies. Am J Hum Genet 65:1134–1147

*b*) Joint multipoint linkage analysis of multivariate qualitative and quantitative traits. II. Alcoholism and event-related potentials. Am J Hum Genet 65:1148–1160

**American Society of Human Genetics**


Genetic Linkage Analysis of a Dichotomous Trait Incorporating a Tightly Linked Quantitative Trait in Affected Sib Pairs. American Journal of Human Genetics. Apr 2003; 72(4):949
