Genetics. Sep 2006; 174(1): 439–453.
PMCID: PMC1569811

Modeling Extent and Distribution of Zygotic Disequilibrium: Implications for a Multigenerational Canine Pedigree

Abstract

Unlike gametic linkage disequilibrium, which is defined for a random-mating population, zygotic disequilibrium describes the nonrandom association between different loci in a nonequilibrium population that deviates from Hardy–Weinberg equilibrium. Zygotic disequilibrium simultaneously specifies five types of disequilibria: (1) Hardy–Weinberg disequilibria at each locus, (2) gametic disequilibrium (involving two alleles in the same gamete, each from a different locus), (3) nongametic disequilibrium (involving two alleles in different gametes, each from a different locus), (4) trigenic disequilibrium (involving a zygote at one locus and an allele at the other), and (5) quadrigenic disequilibrium (involving two zygotes, each from a different locus). However, because the phase of the double heterozygote is uncertain, gametic and nongametic disequilibria need to be combined into a composite digenic disequilibrium, which in turn defines, together with the quadrigenic disequilibrium, a composite quadrigenic disequilibrium. To investigate the extent and distribution of zygotic disequilibrium across the canine genome, a total of 148 dogs were genotyped at 247 microsatellite markers located on 39 pairs of chromosomes for an outbred multigenerational pedigree, initiated with a limited number of unrelated founders. A major portion of zygotic disequilibrium was contributed by the composite digenic and quadrigenic disequilibria, whose values and numbers of significant marker pairs are both greater than those of trigenic disequilibrium. All types of disequilibrium are extensive in the canine genome, although their values tend to decrease with increasing map distance, with a steeper slope for trigenic disequilibrium than for the other types. Considerable variation in the pattern of disequilibrium reduction was observed among different chromosomes. The results from this study provide guidance for determining the number of markers needed for whole-genome association studies.

The extent and distribution of nonrandom associations between genes at different loci, i.e., linkage disequilibria, throughout the genome have often been used as a criterion to infer past demographic and genetic events of a population, such as population history and the evolutionary forces governing the loci. Because of its relation with the recombination fraction, the extent of association has provided a foundation for fine-scale mapping of quantitative trait loci (QTL) that control complex diseases in humans (Ardlie et al. 2002) or economic and adaptive traits in livestock (Farnir et al. 2000; McRae et al. 2002) and plants (Remington et al. 2001). Emerging as an important model system for human health research, canines have recently attracted renewed interest as a means of unraveling the mysteries of mammalian genomes using linkage disequilibrium (LD) analysis (Hyun et al. 2003; Lou et al. 2003; Sutter and Ostrander 2004; Sutter et al. 2004; Lindblad-Toh et al. 2005). In a canine mapping study aimed at detecting QTL affecting canine hip dysplasia in a multihierarchic outbred pedigree, we analyzed how the extent of pairwise linkage disequilibrium changes over genetic distance, using a set of 240 microsatellite markers genotyped across the entire canine genome (Lou et al. 2003).

As in many comparable studies, the measure of the extent of linkage disequilibrium between different loci in our canine genetic study was based on multilocus disequilibrium at the gametic level (Weir 1996). Although such a gametic disequilibrium analysis is mathematically simple, it relies on the fundamental assumption that the population under study is at Hardy–Weinberg equilibrium (HWE), in which individuals are assumed to mate randomly to produce the next generation. In such an HWE population, nonrandom associations of alleles at different loci occur only within gametes rather than between gametes. The random-mating assumption may be violated in the canine pedigree used for our earlier study because, although multiple founders were used, the offspring are related to each other to varying degrees.

For a nonequilibrium population at Hardy–Weinberg disequilibrium (HWD), zygotic disequilibria, which can characterize nonrandom associations at both the gametic and zygotic levels (Weir 1996), may be more relevant. Earlier studies have documented possible genetic and evolutionary causes for zygotic associations in a nonequilibrium population (Haldane 1949; Bennett and Binet 1956; Charlesworth 1991; Barton and Gale 1993). In this article, we revisit our outbred canine pedigree by estimating the extent of zygotic disequilibria throughout the canine genome. Although zygotic disequilibria have been developed theoretically in the literature (see Weir 1996 for an excellent description), these measures have not yet, to the best of our knowledge, been applied to study the structure of a genome extensively in a case study. Recently, Yang (2000, 2002) proposed a multilocus zygotic measure for association studies in a nonequilibrium population. Yang's two articles present the most thoughtful survey of zygotic disequilibrium analysis. The incorporation of zygotic disequilibrium analysis into genomic research is a necessary first step toward the formulation of an optimal strategy for characterizing genome structure and organization.

ESTIMATION OF ZYGOTIC DISEQUILIBRIUM

Genotype, allele, gamete, and nongamete frequencies:

Suppose that there is a natural or experimental population with two codominant markers: A, with two alleles A and a, and B, with two alleles B and b. Let pA and pa (pA + pa = 1) as well as pB and pb (pB + pb = 1) be the corresponding allele frequencies. At each of the two loci, four different formations of zygotic genotypes lead to three distinguishable genotypes, i.e., AA, Aa, and aa for marker A and BB, Bb, and bb for marker B. The two markers form 10 genotypic configurations, but only 9 can be genetically distinguished from each other, because genotypic configurations equation M1 and equation M2 have the same genotype AaBb. Let P, subscripted and superscripted by the genotype notation, be the genotypic configuration frequencies that are individually tabulated in Table 1. One-marker genotype frequencies can be obtained from two-marker genotypic configuration frequencies by

equation M3
(1)

for marker A and

equation M4
(2)

for marker B and estimate the allele frequencies from the one-marker genotype frequencies by

equation M5
(3)
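As a minimal sketch of Equations 1–3, the marginal calculations can be written for the nine observable genotypes rather than the 10 configurations (the two double-heterozygote configurations are summed into a single AaBb cell). The function name and the 3 × 3 layout `P[u][v]`, indexed by the number of copies of allele A (u) and allele B (v), are assumptions made for illustration and are not notation from Table 1.

```python
def marginal_frequencies(P):
    """P[u][v]: joint frequency of the genotype carrying u copies of allele A
    and v copies of allele B (u, v = 0, 1, 2). Hypothetical layout."""
    # One-marker genotype frequencies are the row and column sums of the table.
    geno_A = [sum(P[u]) for u in range(3)]
    geno_B = [sum(P[u][v] for u in range(3)) for v in range(3)]
    # Allele frequency = homozygote frequency + half the heterozygote frequency.
    p_A = geno_A[2] + 0.5 * geno_A[1]
    p_B = geno_B[2] + 0.5 * geno_B[1]
    return geno_A, geno_B, p_A, p_B
```

For a table in which all nine genotypes are equally frequent, each one-marker genotype has frequency 1/3 and both allele frequencies are 1/2.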

The two markers form four gametes, AB, Ab, aB, and ab, whose frequencies can be estimated from genotypic configuration frequencies by

equation M6
(4)

Similarly, the frequencies of nonalleles from different gametes can be estimated by

equation M7
(5)

The frequencies of triple alleles from different markers are estimated as

equation M8
(6)
TABLE 1
Frequencies and observations of marker genotypes

Complete disequilibrium parameters:

The zygotic disequilibrium is defined as the deviation of two-locus genotype frequencies from products of single-locus genotype frequencies and, thus, is composed of all nonallelic genic disequilibria at the two loci (Weir 1996). Assume that the population considered above is at HWD. This population thus has no desirable property of an equilibrium population, such as independence of different allele frequencies at the same locus (Lynch and Walsh 1998). The HWD attempts to test for two alleles at the same locus, but on different gametes, whereas (gametic) linkage disequilibrium describes two alleles on the same gametes, but at different loci. For the zygotic disequilibrium, however, there is a third test, i.e., two alleles on different gametes and at different loci.

Since the population is not in HWE, two alleles at each marker are not independent, with the coefficients of Hardy–Weinberg disequilibrium defined as

$$D_A = P_{AA} - p_A^2$$
(7)

for marker A and

$$D_B = P_{BB} - p_B^2$$
(8)

for marker B, respectively. The coefficient of digenic gametic linkage disequilibrium between the two markers is defined as

$$D_{ab} = p_{AB} - p_A p_B$$
(9)

For the nonequilibrium population, digenic linkage disequilibrium that occurs between nonalleles at different gametes is defined as

$$D_{a/b} = p_{A/B} - p_A p_B$$
(10)

The trigenic disequilibrium between two alleles from marker A and one allele from marker B is defined as

equation M13
(11)

The trigenic disequilibrium between one allele from marker A and two alleles from marker B is defined as

equation M14
(12)

With genotypic configuration frequencies, allele frequencies, HWD, gametic and nongametic disequilibria, and trigenic disequilibria, we can estimate the quadrigenic disequilibrium (DAB) between two alleles from marker A and two alleles from marker B using the formulas given in Table 2 (see Weir 1996). Note that we use lower- and uppercase letters to denote gametic and zygotic disequilibria, respectively. From Table 2, we can see that each of the genotypic configuration frequencies can be expressed in terms of the allele frequencies (pA, pa and pB, pb), HWD coefficients (DA and DB), and gametic (Dab) and nongametic disequilibria of different orders (Da/b, DAb, DaB, and DAB).

TABLE 2
Expressions of quadrigenic disequilibrium DAB in terms of genotypic configuration frequencies, allele frequencies, and lower-order disequilibrium coefficients

Composite zygotic disequilibria:

It can be seen that the 10 genotypic configurations have nine independent frequencies, which are defined by two allele frequencies (one for each marker) and the seven disequilibrium parameters defined above. But since the two configurations of the double heterozygote cannot be separated in practice, it is not possible to estimate all these frequencies and disequilibrium parameters. To solve this problem, Weir (1996) suggested a set of composite disequilibrium coefficients. These include the digenic disequilibrium measured by the sum of the gametic and nongametic coefficients, i.e.,

$$\Delta_{ab} = D_{ab} + D_{a/b}$$
(13)

As shown by Equations 9 and 10, Δab includes the sum of the gamete (pAB) and nongamete (pA/B) frequencies. On the basis of the definitions of these two frequencies (Equations 4 and 5), Δab ultimately requires only the sum of the two configuration frequencies (equation M16 and equation M17) of the double heterozygote. Thus, Δab can be estimated directly from observable genotype frequencies. Weir (1996) also defined a composite quadrigenic disequilibrium measured by

equation M18
(14)

which can be finally measured from genotype frequencies.
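To illustrate why the composite digenic coefficient needs only observable genotype frequencies, the following sketch computes Δab from a 3 × 3 genotype table. It assumes that Equations 9 and 10 take the usual forms Dab = pAB − pApB and Da/b = pA/B − pApB (Weir 1996), so that Δab = pAB + pA/B − 2pApB. The function name and the table layout `P[u][v]` (u, v = number of copies of alleles A and B) are hypothetical.

```python
def composite_digenic(P):
    """Composite digenic disequilibrium from genotype frequencies P[u][v],
    where u and v count copies of alleles A and B (0, 1, or 2)."""
    p_A = sum(P[2]) + 0.5 * sum(P[1])
    p_B = sum(P[u][2] for u in range(3)) + 0.5 * sum(P[u][1] for u in range(3))
    # Sum of the gametic and nongametic AB frequencies. The double heterozygote
    # contributes one half in total whatever its (unobserved) phase, which is
    # why the two configurations never need to be separated.
    p_sum = 2 * P[2][2] + P[2][1] + P[1][2] + 0.5 * P[1][1]
    return p_sum - 2 * p_A * p_B
```

When the two markers are independent at the genotype level the function returns zero, and for a population consisting of equal numbers of AABB and aabb individuals it returns 0.5 (0.25 from the gametic and 0.25 from the nongametic component).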

The two composite digenic and quadrigenic disequilibria can make it possible to estimate the parameters on the basis of observable genotype frequencies rather than unobservable configuration frequencies. Table 3 tabulates the compositions of the composite quadrigenic disequilibrium in terms of genotype and allele frequencies and the coefficients of disequilibria with lower orders (see also Weir and Cockerham 1989).

TABLE 3
Expressions of composite quadrigenic disequilibrium ΔAB in terms of genotypic and allele frequencies and lower-order disequilibrium coefficients

Estimates and tests:

Two markers A and B are observed for a population of size n with nine genotypes listed in Table 1. Let u and v denote the marker genotypes, u = 2 for AA, 1 for Aa, and 0 for aa and v = 2 for BB, 1 for Bb, and 0 for bb. Writing n_uv for the observed number of individuals with genotype (u, v), the multinomial log-likelihood of the genotype frequencies {P_uv} given marker observations is written as

$$\log L(\{P_{uv}\} \mid \{n_{uv}\}) = \text{constant} + \sum_{u=0}^{2} \sum_{v=0}^{2} n_{uv} \log P_{uv}$$
(15)

which gives the MLEs of the genotype frequencies as

$$\hat{P}_{uv} = \frac{n_{uv}}{n}$$
(16)
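The estimation step in Equations 15 and 16 can be sketched as follows. The function names and the `counts[u][v]` layout are assumptions for illustration; the log-likelihood is evaluated only up to the additive multinomial constant, which does not affect the estimates.

```python
import math

def genotype_mle(counts):
    """Maximum-likelihood genotype frequencies: observed proportions."""
    n = sum(sum(row) for row in counts)
    return [[c / n for c in row] for row in counts]

def log_likelihood(counts, freqs):
    """Multinomial log-likelihood without the constant term.
    Empty cells contribute nothing, so they are skipped."""
    return sum(
        c * math.log(f)
        for count_row, freq_row in zip(counts, freqs)
        for c, f in zip(count_row, freq_row)
        if c > 0
    )
```

Evaluating `log_likelihood` at the output of `genotype_mle` gives a value at least as large as at any other set of genotype frequencies, which is what makes the observed proportions the MLEs.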

On the basis of the estimated genotype frequencies, the allele frequencies for the two markers (pA and pB), the HWD coefficients (DA and DB), the composite digenic disequilibrium (Δab), two trigenic disequilibria (DAb and DaB), and the composite quadrigenic disequilibrium (ΔAB) can be estimated.

Each of these disequilibria should be tested for its significance. The hypotheses for testing HWD are formulated by

$$H_0: D_A = 0 \quad \text{vs.} \quad H_1: D_A \neq 0$$
(17)
$$H_0: D_B = 0 \quad \text{vs.} \quad H_1: D_B \neq 0$$
(18)

for two different markers, respectively. The hypotheses for testing each of the zygotic disequilibria between the two markers are given as

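$$H_0: \Delta_{ab} = 0 \quad \text{vs.} \quad H_1: \Delta_{ab} \neq 0$$
(19)
$$H_0: D_{Ab} = 0 \quad \text{vs.} \quad H_1: D_{Ab} \neq 0$$
(20)
$$H_0: D_{aB} = 0 \quad \text{vs.} \quad H_1: D_{aB} \neq 0$$
(21)
$$H_0: \Delta_{AB} = 0 \quad \text{vs.} \quad H_1: \Delta_{AB} \neq 0$$
(22)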

For these hypotheses (17–22), we calculate the likelihoods under H0 and H1, respectively, from which the log-likelihood ratio (LR) is calculated. The resulting LR test statistic asymptotically follows a χ²-distribution with 1 d.f.

The likelihoods for testing HWD on the basis of hypotheses (17) and (18) can be calculated from marginal totals of one-marker genotype frequencies and observations separately for markers A and B, respectively. For these two hypotheses, allele frequencies under H0 can be estimated with a closed form and, thus, no EM algorithm is needed for computation. However, for the tests of hypotheses (19–22), parameter estimation under H0 needs the implementation of numerical algorithms, like the Newton–Raphson method, because the number of unknown parameters to be estimated is less than the number of genotype frequencies. It is also possible to test whether all the disequilibrium coefficients are together equal to zero. The parameters that need to be estimated under H0: Δab = DAb = DaB = ΔAB = 0, include allele frequencies and HWD coefficients that can be estimated with a closed form. The LR value for this hypothesis should asymptotically follow the χ²-distribution with 4 d.f.
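For the HWD hypotheses, where the null estimates have a closed form, the likelihood-ratio test can be sketched in a few lines. This is an illustrative sketch under the assumption that the HWD coefficient is zero under H0 (so that genotype frequencies are p², 2p(1 − p), and (1 − p)²), not the code used for the analysis; the function name is hypothetical. The 1-d.f. χ² tail probability is obtained from the complementary error function, so no external library is needed.

```python
import math

def hwd_lr_test(n_AA, n_Aa, n_aa):
    """Likelihood-ratio test of Hardy-Weinberg equilibrium at one biallelic marker."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # closed-form allele frequency under H0
    expected = [n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2]
    observed = [n_AA, n_Aa, n_aa]
    # LR = 2 * (log L under H1 - log L under H0); empty classes contribute zero.
    lr = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    # For 1 d.f., P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2))
    return lr, p_value
```

Counts in exact Hardy–Weinberg proportions, such as (25, 50, 25), give an LR of zero, whereas a sample with no heterozygotes gives a very large LR.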

Alternatively, hypotheses (17–22) for a given disequilibrium can be tested by calculating test statistics

$$X^2 = \frac{\hat{D}^2}{\widehat{\mathrm{Var}}(\hat{D})}$$

where $\hat{D}$ denotes the estimate of the disequilibrium coefficient and $\widehat{\mathrm{Var}}(\hat{D})$ is the sampling variance of the estimate, calculated by formulas given in Weir (1996). This test statistic is asymptotically χ²-distributed with 1 d.f.
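As one concrete instance of this statistic, the sketch below applies it to the HWD coefficient of a single marker, using the large-sample variance under H0, p²(1 − p)²/n, which reproduces the familiar goodness-of-fit χ² for Hardy–Weinberg proportions. The variance formulas for the other coefficients are given in Weir (1996) and are not reproduced here; the function name is hypothetical, and the code assumes that both alleles are present in the sample.

```python
import math

def hwd_chi_square(n_AA, n_Aa, n_aa):
    """Squared estimate over its null variance for the HWD coefficient."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    d_hat = n_AA / n - p * p          # estimated HWD coefficient
    var_h0 = (p * (1 - p)) ** 2 / n   # large-sample variance when D_A = 0
    x2 = d_hat ** 2 / var_h0
    p_value = math.erfc(math.sqrt(x2 / 2))  # 1-d.f. chi-square tail
    return d_hat, x2, p_value
```

For a sample of 100 individuals with no heterozygotes and equal allele frequencies, the statistic equals the sample size, its maximum possible value.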

Bounds and normalization:

To make zygotic disequilibria comparable between different studies, the estimates of disequilibria should be normalized. Lewontin (1964) proposed a standardized approach by expressing linkage disequilibrium as a proportion of its most extreme possible value. Thus, the new measure from this approach lies between 0 (for linkage equilibrium) and ±1 (for complete linkage disequilibrium). A similar idea was used by Weir and Cockerham (1989) to derive bounds for trigenic and quadrigenic disequilibria for zygotic nonequilibrium analysis. More recently, Zaykin (2004) and Hamilton and Cole (2004) independently proposed algebraically equivalent bounds for a composite measure of gametic linkage disequilibrium. The bound for the composite zygotic disequilibrium has not been provided thus far. In the appendix, we provide bounds and normalized measures for all six disequilibria, DA, DB, Δab, DAb, DaB, and ΔAB, for zygotic disequilibrium analysis. The bounds for the first five disequilibria are consistent with those published in Weir and Cockerham (1989), Zaykin (2004), and Hamilton and Cole (2004).
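The idea behind the normalization can be illustrated with Lewontin's classical measure for gametic linkage disequilibrium, in which the coefficient is divided by the largest magnitude it can attain given the allele frequencies. The sketch below covers only this classical gametic case and is not the set of bounds derived in the appendix for the zygotic coefficients; the function name is hypothetical.

```python
def lewontin_d_prime(d, p_A, p_B):
    """Normalize a gametic disequilibrium coefficient d = p_AB - p_A * p_B
    by its most extreme possible value of the same sign."""
    p_a, p_b = 1 - p_A, 1 - p_B
    if d >= 0:
        d_max = min(p_A * p_b, p_a * p_B)
    else:
        d_max = min(p_A * p_B, p_a * p_b)
    # A monomorphic marker leaves no room for association.
    return d / d_max if d_max > 0 else 0.0
```

The returned value keeps the sign of the coefficient, so it ranges from −1 to +1 with 0 at linkage equilibrium.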

MATERIALS

A canine pedigree was developed to map QTL responsible for canine hip dysplasia (CHD) using molecular markers. Seven founding greyhounds and six founding Labrador retrievers were intercrossed, followed by backcrossing F1's to the greyhounds and Labrador retrievers and intercrossing the F1's. A series of subsequent intercrosses among the progeny at different generation levels led to a complex network pedigree structure (Figure 1), which maximized phenotypic ranges in CHD-related quantitative traits and the chance to detect substantial linkage disequilibria (Todhunter et al. 1999, 2003a,b; Bliss et al. 2002). A total of 148 dogs from this structured pedigree were chosen for genetic analyses. This set of samples would not be appropriate for traditional gametic linkage disequilibrium analysis because the population is not randomly mating. Lou et al. (2003) estimated gametic linkage disequilibria for this pedigree on a critical foundation that the pedigree was originally derived from multiple unrelated founders. But although the resulting conclusions are consistent with the evolutionary history of dogs, Lou et al.'s analysis can be improved by estimating and testing the chromosomal distribution of zygotic disequilibria as will be done in this study.

Figure 1.
Diagram of an outbred pedigree in dog. Squares and circles represent males and females, respectively. Solid and open portions of each symbol represent the proportion of greyhound and Labrador retriever alleles, respectively, possessed by that dog.

For the sampled dogs from the structured pedigree, 247 microsatellite markers distributed on 38 pairs of autosomes and 1 pair of sex chromosomes were genotyped to construct a linkage map for the canine genome, which displays good coverage of each chromosome (Mellersh et al. 1997, 2000; Breen et al. 2001; Richman et al. 2001). The recombination fractions between different markers were estimated for segregating families and converted to genetic distances in centimorgans on the basis of a map function. The average genetic distances between two adjacent markers on each chromosome are listed in Table 4 (Breen et al. 2001).
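The text does not state which map function was used for this conversion. As an illustration only, the two most common choices, the Haldane and Kosambi map functions, can be sketched as follows; both function names are hypothetical, and both are valid for recombination fractions 0 ≤ r < 0.5.

```python
import math

def haldane_cm(r):
    """Haldane map distance in centimorgans (assumes no crossover interference)."""
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi map distance in centimorgans (allows for some interference)."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))
```

For small recombination fractions both functions give approximately 100r cM; for larger values the Kosambi distance is shorter than the Haldane distance.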

TABLE 4
The percentages and distributions of significant HWD and gametic and zygotic disequilibria through 39 chromosomes in the canine pedigree

RESULTS

The microsatellite markers genotyped display high heterozygosity in the dog pedigree, with the number of alleles at a marker ranging from 2 to 11 (Todhunter et al. 2003b). The multiple alleles of each microsatellite marker are collapsed into two categories: the most frequent allele vs. all remaining alleles pooled. Thus, the simple biallelic model developed above can be used directly to analyze the extent and distribution of zygotic disequilibria throughout the canine genome.
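The collapsing step can be sketched as follows. The input format (a list of allele pairs per dog for one marker) and the function name are assumptions for illustration, and the allele sizes in the usage note are made up. The output uses the 2/1/0 genotype coding introduced for the estimation procedure.

```python
from collections import Counter

def collapse_to_biallelic(genotypes):
    """genotypes: list of (allele1, allele2) tuples for one marker.
    Returns the most frequent allele and, for each individual, the number
    of copies of that allele (2, 1, or 0); all other alleles are pooled."""
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    major = allele_counts.most_common(1)[0][0]
    return major, [sum(allele == major for allele in pair) for pair in genotypes]
```

For example, with hypothetical allele sizes `[(152, 156), (152, 152), (160, 156), (152, 160)]`, allele 152 is the most frequent and the four dogs are coded 1, 2, 0, and 1.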

The zygotic disequilibria that describe the association between two different markers in a nonequilibrium population, such as the canine pedigree used in this study, were estimated and tested for each pair of markers located on the same chromosome. The zygotic associations were partitioned into Hardy–Weinberg disequilibria at each locus (DA and DB), composite digenic disequilibrium including two alleles each from a different locus (Δab), trigenic disequilibria including a zygote at one locus and an allele at the other (DAb or DaB), and composite quadrigenic disequilibrium including two zygotes each from a different locus (ΔAB). All these disequilibrium coefficients were normalized using a procedure described in the appendix. All the comparisons are based on the normalized coefficients.

Overall, 28% of the markers genotyped were observed to deviate from HWE, but this percentage showed considerable interchromosomal variation, ranging from 0 (chromosomes 26, 29, 34, 36, and 38) to 100% (sex chromosome) (Table 4). Of the four types of dilocus disequilibria, Δab has the greatest impact on zygotic associations because its estimates are generally much larger than those of the other disequilibrium types. Furthermore, this disequilibrium, as well as the composite quadrigenic disequilibrium, has larger normalized values than the other types (Figure 2). Overall, the largest percentage of marker pairs is significant for Δab (61%), followed by trigenic disequilibria DAb (23%) and DaB (19%) and composite quadrigenic disequilibrium ΔAB (22%). The percentages of marker pairs that exhibit significant associations vary among different chromosomes (Table 4).

Figure 2.
Distributions of gametic and zygotic disequilibria values observed between syntenic marker pairs as a function of genetic distance in centimorgans.

Figure 2 illustrates the patterns of the relationship between zygotic disequilibria, Δab, DAb, DaB, and ΔAB, and genetic distances, all exhibiting a trend of decay with increased map distance. All the types of zygotic disequilibria occur more frequently between pairs of markers separated by <40 cM than between those separated by >40 cM. As compared with DAb and DaB, Δab and ΔAB tend to extend within a broader region of the canine genome. Both Δab and ΔAB decay with map distance, to a greater extent for the former than for the latter.

Each of the four types of zygotic association was plotted against the map distance separately for individual chromosomes (Figures 3–6). Although the data are sparse, a general trend can be observed for the extent of zygotic disequilibria; i.e., whereas the distributions of DAb and DaB follow a similar pattern among different chromosomes, there is substantial interchromosomal variation in the extent and distribution of Δab and ΔAB over the canine genome.

Figure 3.
Interchromosomal heterogeneity in the extent and distribution of digenic linkage disequilibrium Δab among 39 chromosomes.
Figure 4.
Interchromosomal heterogeneity in the extent and distribution of trigenic linkage disequilibrium DAb among 39 chromosomes.
Figure 5.
Interchromosomal heterogeneity in the extent and distribution of trigenic linkage disequilibrium DaB among 39 chromosomes.
Figure 6.
Interchromosomal heterogeneity in the extent and distribution of quadrigenic linkage disequilibrium ΔAB among 39 chromosomes.

MONTE CARLO SIMULATION

To the best of our knowledge, this is the first study of the distribution of zygotic disequilibrium across the genome in a nonequilibrium population. Because most current linkage disequilibrium analyses are based on gametic associations without a test for zygotic disequilibria, we performed a reciprocal simulation study to examine the influence of such analyses on the power of the disequilibrium test in a nonequilibrium population. In this reciprocal simulation study, data are simulated under the zygotic and gametic disequilibrium models, respectively, and each simulated data set is then analyzed separately by both models.

Simulated data by the zygotic model:

Table 5 lists four simulation designs, in each of which all types of associations occur for an assumed nonequilibrium population. The four designs differ in the allocation pattern of zygotic associations. In designs 1 and 2, a large composite digenic disequilibrium is contributed mainly by gametic or nongametic disequilibrium, respectively. Designs 3 and 4 are intended to have a large trigenic and a large quadrigenic disequilibrium, respectively. The sample size is 150, mimicking the canine example used above. The simulated data are analyzed by both the gametic and the zygotic disequilibrium models. The simulation under each design is repeated 200 times to calculate the precision of parameter estimation and the statistical power of disequilibrium detection. The results from this simulation study (Table 6) are summarized as follows:

  1. The zygotic disequilibrium model provides reasonable estimation of any type of disequilibria and shows a great power to detect disequilibria for a nonequilibrium population under simulation.
  2. As expected, the gametic linkage disequilibrium model can estimate only gametic linkage disequilibrium, but when used to estimate a nonequilibrium population, its estimation of this parameter is largely biased. Actually, the gametic model tends to estimate the composite gametic and nongametic disequilibrium when both exist, but its estimation precision is very poor. If the composite digenic disequilibrium is mainly due to the nongametic disequilibrium (design 2), the gametic disequilibrium model cannot be used, given its large estimation error.
  3. The gametic disequilibrium model can accurately estimate allele frequencies, but cannot provide precise estimation of these parameters. The second and third findings indicate that gametic disequilibrium analysis should never be used for a nonequilibrium population and that the test for zygotic disequilibrium is always crucial before gametic disequilibrium analysis is used.
TABLE 5
Given parameter values for simulation under different designs
TABLE 6
Maximum-likelihood estimates of parameters and the square roots of their mean square errors (in parentheses) estimated by the zygotic- and gametic-LD models for the data simulated under the zygotic-LD model of different designs
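The structure of such a simulation can be sketched for a single parameter. The code below draws multinomial genotype samples of size 150 from a given genotype frequency table, repeats this 200 times, and reports the mean estimate and the square root of the mean square error for the composite digenic disequilibrium. It is a simplified illustration, not the simulation code behind Tables 5–7: the function names, the table layout `P[u][v]`, the random seed, and any frequency table passed to it are assumptions, and the parameter values of Table 5 are not reproduced.

```python
import random

def composite_digenic(P):
    """Composite digenic disequilibrium from genotype frequencies P[u][v],
    assuming the usual form p_AB + p_A/B - 2 * p_A * p_B."""
    p_A = sum(P[2]) + 0.5 * sum(P[1])
    p_B = sum(P[u][2] for u in range(3)) + 0.5 * sum(P[u][1] for u in range(3))
    return 2 * P[2][2] + P[2][1] + P[1][2] + 0.5 * P[1][1] - 2 * p_A * p_B

def simulate_rmse(P_true, n=150, reps=200, seed=1):
    """Monte Carlo mean and root mean square error of the estimate."""
    rng = random.Random(seed)
    cells = [(u, v) for u in range(3) for v in range(3)]
    weights = [P_true[u][v] for u, v in cells]
    true_value = composite_digenic(P_true)
    estimates = []
    for _ in range(reps):
        # Draw n genotypes and tabulate them into a 3 x 3 count table.
        counts = [[0] * 3 for _ in range(3)]
        for u, v in rng.choices(cells, weights=weights, k=n):
            counts[u][v] += 1
        estimates.append(composite_digenic([[c / n for c in row] for row in counts]))
    mean = sum(estimates) / reps
    rmse = (sum((e - true_value) ** 2 for e in estimates) / reps) ** 0.5
    return true_value, mean, rmse
```

Calling `simulate_rmse` with a hypothetical genotype table returns the true value implied by that table together with the two summary statistics reported in parentheses in Tables 6 and 7.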

Simulated data by gametic model:

As a follow-up, we simulated data for an equilibrium population by a gametic linkage disequilibrium model. The simulated data were analyzed by both the zygotic and the gametic models (Table 7). It can be seen that the zygotic model estimates the coefficient of linkage disequilibrium as precisely as the gametic model. This result indicates that the zygotic model can also reliably estimate the degree of linkage disequilibrium for an equilibrium population. In conjunction with the results from the simulation by the zygotic disequilibrium model, we conclude that the zygotic model is more general than the gametic model.

TABLE 7
Maximum-likelihood estimates of parameters and the square roots of their mean square errors (in parentheses) estimated by the zygotic- and gametic-LD models for the data simulated under the gametic-LD model

DISCUSSION

The characterization of the architecture of linkage disequilibrium in the genome is an area of explosive recent growth (Farnir et al. 2000; Remington et al. 2001; Ardlie et al. 2002; Hyun et al. 2003; Lou et al. 2003; Sutter and Ostrander 2004; Sutter et al. 2004; Lindblad-Toh et al. 2005) because the positional cloning of genes underlying common complex diseases relies on the identification of linkage disequilibrium between genetic markers and disease. Traditional linkage disequilibrium is defined as the nonrandom association between alleles at different loci in gametes or haplotypes. The estimation of such gametic linkage disequilibrium between different loci requires the assumption that the population under consideration is randomly mating, following HWE. However, for many nonequilibrium populations that are founded by a small number of ancestors and/or are frequently under evolutionary pressure, such as mutation, genetic drift, and population admixture and structure, or under artificial selection (Lynch and Walsh 1998), HWE may be violated and, therefore, a new analysis that relaxes the random-mating assumption should be formulated. Weir (1996) introduced the concept of zygotic association or zygotic disequilibrium that can characterize the disequilibria between different loci in a nonequilibrium population. Recently, Yang (2000, 2002) proposed a multilocus statistic to examine zygotic associations in nonequilibrium populations. Different disequilibria due to a single locus or multiple loci can be summarized in such a statistic.

In a multigenerational canine pedigree constructed by several founders (Todhunter et al. 1999), individual dogs are related to each other and, thus, sampled dogs from this pedigree violate the HWE assumption due to inbreeding. For this reason, zygotic disequilibrium should be more appropriate for this related pedigree to investigate the extent and distribution of associations throughout the canine genome. We found extensive linkage disequilibria in a broad region of chromosomes (≥40 cM), as compared with the human genome, even for the most isolated human populations (Hall et al. 2002; Varilo et al. 2003; Tenesa et al. 2004). This finding seems to be comparable with those of earlier linkage disequilibrium studies of purebred dogs (Hyun et al. 2003; Sutter et al. 2004). The extent of linkage disequilibrium across the chromosomes was also investigated for the same data set by the gametic linkage disequilibrium model (Lou et al. 2003). Although the results of the two models are broadly in agreement, the linkage disequilibrium detected by the zygotic model seems to be distributed more extensively over the genome than that detected previously by the gametic model. Given the finding from the simulation, the gametic model tends to estimate a combined gametic and nongametic linkage disequilibrium, i.e., composite digenic disequilibrium, and, therefore, to provide a biased estimate of gametic linkage disequilibrium especially when a large nongametic linkage disequilibrium exists. The extensive distribution of linkage disequilibrium in the canine genome detected by the zygotic model suggests that a relatively small number of markers will be required for whole-genome association mapping in dogs. However, an optimal number of markers should be determined separately for individual chromosomes, because the extent of linkage disequilibrium shows substantial interchromosomal variation. 
Historically, different degrees of selection pressure may have been operational on various chromosomes, which causes interchromosomal differentiation in linkage disequilibrium extent (Sutter and Ostrander 2004; Ostrander and Wayne 2005; Parker and Ostrander 2005).

The most significant contribution of this article may lie in the first systematic use of a zygotic disequilibrium analysis to characterize the extent of disequilibrium for a nonequilibrium population of canines, although the conclusions obtained from our analysis may apply only to the specific canine pedigree used, in which individual dogs are related to different extents. The simulation analyses suggest that the concept of zygotic disequilibrium can be readily applied to other population genetic studies. They also indicate that the popular gametic linkage disequilibrium analysis should be used with caution when employed to understand the genetic structure of a population at HWD, because its results may be misleading. The zygotic disequilibrium model, which does not rely on the assumption of random mating, has great power to detect various types of disequilibrium of different orders. Therefore, it is safe to say that the zygotic disequilibrium model subsumes the gametic disequilibrium model in practical population genetic studies.

In this study, the zygotic disequilibrium model, mostly modified from Weir (1996), was proposed on the basis of biallelic markers, although the data from the canine genetic project are multiallelic microsatellites. Given the modest sample size used, it is more reasonable to collapse multiple alleles into two classes than to directly use the multiallelic zygotic model, because collapsing reduces the number of parameters to be estimated. Also, with the development of high-throughput technologies for single-nucleotide polymorphism (SNP) markers, the biallelic model will be useful for analyzing the genetic architecture of zygotic disequilibria over the entire genome for any nonequilibrium or isolated population, including humans and agriculturally important species. However, when the sample size is sufficiently large, the multiallelic model, in which the number of disequilibrium parameters increases rapidly with the number of alleles, will be more informative than the biallelic model based on the collapsing of alleles. Technically, it is straightforward, although tedious, to model zygotic disequilibria with multiallelic markers. For example, consider two triallelic markers that each form six distinguishable genotypes. The 35 independent genotype frequencies for these two markers comprise four allele frequencies, six HWD coefficients, four composite digenic disequilibria, 12 trigenic disequilibria, and nine composite quadrigenic disequilibria. Also, our zygotic model can be readily extended to handle three biallelic markers at the same time, as seen in Yang (2000, 2002). With these extensions and modifications, the zygotic disequilibrium analysis will provide a routine tool for obtaining an overall picture of disequilibria across the genome. The results obtained from the zygotic disequilibrium model, like those for canine genetics in this study, will have important implications for the gene mapping of complex traits.
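The parameter count in the triallelic example can be checked with a short calculation. The counting rule below is our reading of that example rather than a formula stated in the text: a marker with k alleles contributes k − 1 independent allele frequencies and k(k − 1)/2 HWD coefficients, and the two-marker terms are products of these one-marker counts. The function name is hypothetical.

```python
def zygotic_parameter_counts(k_A, k_B):
    """Number of parameters of each type for two markers with k_A and k_B alleles."""
    a_A, a_B = k_A - 1, k_B - 1                             # independent allele frequencies
    h_A, h_B = k_A * (k_A - 1) // 2, k_B * (k_B - 1) // 2   # HWD coefficients per marker
    counts = {
        "allele frequencies": a_A + a_B,
        "HWD coefficients": h_A + h_B,
        "composite digenic": a_A * a_B,            # allele x allele
        "trigenic": h_A * a_B + a_A * h_B,         # zygote x allele, both directions
        "composite quadrigenic": h_A * h_B,        # zygote x zygote
    }
    # Independent genotype frequencies: all two-marker genotypes minus one.
    n_genotypes = (k_A * (k_A + 1) // 2) * (k_B * (k_B + 1) // 2)
    return counts, n_genotypes - 1
```

For two triallelic markers the counts are 4, 6, 4, 12, and 9, which sum to the 35 independent genotype frequencies; for two biallelic markers they are 2, 2, 1, 2, and 1, which sum to the 8 independent frequencies of the nine observable genotypes.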

Acknowledgments

We thank Dmitri Zaykin and an anonymous reviewer for clarifying the concept of zygotic association and providing other constructive comments. The preparation of this manuscript was supported by a grant from the Morris Animal Foundation, National Institutes of Health (NIH) AR36554, the Consolidated Research Grant Program, the Cornell Advanced Technology Biotechnology Program, Nestle Purina, Marshfield Medical Research Foundation (Marshfield, WI), and Cornell University College of Veterinary Medicine unrestricted alumni funds, NIH R01 NS041670 and National Science Foundation 0540745.

APPENDIX

In what follows, we derive the ranges of the disequilibrium parameters for a nonequilibrium population and define the normalized zygotic disequilibria in the same way as for gametic LD (Lewontin 1964, 1988). On the basis of Equations 7 and 8, the ranges of the HWD coefficients are expressed as

equation M103

for marker A, and

equation M104

for marker B.
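For orientation, the sketch below computes the range of the HWD coefficient for a single biallelic marker under the standard parameterization, in which the two homozygote frequencies are p² + D and q² + D and the heterozygote frequency is 2pq − 2D. It is assumed, but not verified here, that this matches the form of Equations 7 and 8. The helper names `hwd_bounds` and `normalized_hwd` are hypothetical, and the normalization simply divides the coefficient by the bound of matching sign, in the spirit of Lewontin's normalized gametic measure.

```python
def hwd_bounds(p):
    """Range of the HWD coefficient D for a biallelic marker.

    p is the frequency of one allele and q = 1 - p that of the other.
    All three genotype frequencies (p^2 + D, 2pq - 2D, q^2 + D) must be
    nonnegative, which gives max(-p^2, -q^2) <= D <= pq.
    """
    q = 1.0 - p
    lower = max(-p * p, -q * q)
    upper = p * q
    return lower, upper


def normalized_hwd(d, p):
    """Scale D by the attainable bound of the same sign, giving a value in [-1, 1]."""
    lower, upper = hwd_bounds(p)
    if d >= 0:
        return d / upper if upper > 0 else 0.0
    return d / -lower if lower < 0 else 0.0
```

With p = 0.5 the coefficient can range from −0.25 (all heterozygotes) to 0.25 (no heterozygotes), so a value of 0.25 normalizes to 1 and a value of −0.25 to −1.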

For the composite gametic disequilibrium, the range is derived, on the basis of Equations 9, 10, and 13, as

equation M105

where $A = 2p_A p_b$, $B = 2p_a p_b$, $C = p_A^2 p_b + p_a^2 p_B + p_A p_a$, equation M106, $E = 2p_A p_B$, $F = 2p_a p_b$, equation M107, equation M108, equation M109, and equation M110. The normalized $\Delta_{ab}$ is defined as

equation M111

where

equation M112

On the basis of Equations 11 and 12, two trigenic disequilibria have the ranges expressed, respectively, as

equation M113

where $A = 2p_A p_a p_b$, equation M114, equation M115, equation M116, equation M117, equation M118, equation M119, $H = p_A p_B$, $I = p_a p_B$, equation M120, equation M121, equation M122, equation M123, equation M124, equation M125, equation M126, equation M127, equation M128, equation M129, equation M130, equation M131, equation M132, equation M133, equation M134, equation M135, equation M136, equation M137, equation M138, and

equation M139

where $A' = 2p_a p_B p_b$, equation M140, equation M141, equation M142, equation M143, equation M144, equation M145, equation M146, equation M147, equation M148, equation M149, equation M150, equation M151, equation M152, equation M153, equation M154, equation M155, equation M156, equation M157, equation M158, equation M159, equation M160, equation M161, equation M162, equation M163, equation M164, equation M165, and equation M166. The normalized $D_{Ab}$ and $D_{aB}$ are defined, respectively, as

equation M167

where

equation M168

and

equation M169

where

equation M170

On the basis of Table 3, the range of the composite quadrigenic disequilibrium is expressed as

equation M171

where equation M172, equation M173, equation M174, equation M175, equation M176, equation M177, equation M178, equation M179, and equation M180. The normalized Δab is defined as

equation M181

where

equation M182

References

  • Ardlie, K. G., L. Kruglyak and M. Seielstad, 2002. Patterns of linkage disequilibrium in the human genome. Nat. Rev. Genet. 3: 299–309.
  • Barton, N. H., and K. S. Gale, 1993. Genetic analysis of hybrid zones, pp. 13–45 in Hybrid Zones and the Evolutionary Process, edited by R. G. Harrison. Oxford University Press, Oxford.
  • Bennett, J. H., and F. E. Binet, 1956. Association between Mendelian factors with mixed selfing and random mating. Heredity 10: 51–55.
  • Bliss, S., R. J. Todhunter, R. Quaas, G. Casella, R. L. Wu et al., 2002. Quantitative genetics of traits associated with hip dysplasia in a canine pedigree constructed by mating dysplastic Labrador retrievers with unaffected greyhounds. Am. J. Vet. Res. 63: 1029–1035.
  • Breen, M., S. Jouquand, C. Renier, C. S. Mellersh, C. Hitte et al., 2001. Chromosome-specific single-locus FISH probes allow anchorage of an 1800-marker integrated radiation-hybrid/linkage map of the domestic dog genome to all chromosomes. Genome Res. 11: 1784–1795.
  • Charlesworth, B., 1991. The evolution of sex chromosomes. Science 251: 1030–1033.
  • Farnir, F., W. Coppieters, J. J. Arranz, P. Berzi, N. Cambisano et al., 2000. Extensive genome-wide linkage disequilibrium in cattle. Genome Res. 10: 220–227.
  • Haldane, J. B. S., 1949. The association of characters as a result of inbreeding and linkage. Ann. Eugen. 15: 15–23.
  • Hall, D., E. M. Wijsman, J. L. Roos, J. A. Gogos and M. Karayiorgou, 2002. Extended intermarker linkage disequilibrium in the Afrikaners. Genome Res. 12: 956–961.
  • Hamilton, D. C., and D. E. Cole, 2004. Standardizing a composite measure of linkage disequilibrium. Ann. Hum. Genet. 68: 234–239.
  • Hyun, C., L. J. Filippich, R. A. Lea, G. Shepherd, I. P. Hughes et al., 2003. Prospects for whole genome linkage disequilibrium mapping in domestic dog breeds. Mamm. Genome 14: 640–649.
  • Lewontin, R. C., 1964. The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49: 49–67.
  • Lewontin, R. C., 1988. On measures of gametic disequilibrium. Genetics 120: 849–852.
  • Lindblad-Toh, K., C. M. Wade, T. S. Mikkelsen, E. K. Karlsson, D. B. Jaffe et al., 2005. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 438: 803–819.
  • Lou, X.-Y., R. J. Todhunter, M. Lin, Q. Lu, T. Liu et al., 2003. The extent and distribution of linkage disequilibrium in canine. Mamm. Genome 14: 555–564.
  • Lynch, M., and B. Walsh, 1998. Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.
  • McRae, A. F., J. C. McEwan, K. G. Dodds, T. Wilson, A. M. Crawford et al., 2002. Linkage disequilibrium in domestic sheep. Genetics 160: 1113–1122.
  • Mellersh, C. S., A. A. Langston, G. M. Acland, M. A. Fleming, K. Ray et al., 1997. A linkage map of the canine genome. Genomics 46: 326–336.
  • Mellersh, C. S., C. Hitte, M. Richman, F. Vignaux, C. Priat et al., 2000. An integrated linkage-radiation hybrid map of the canine genome. Mamm. Genome 11: 120–130.
  • Ostrander, E. A., and R. K. Wayne, 2005. The canine genome. Genome Res. 15: 1706–1716.
  • Parker, H. G., and E. A. Ostrander, 2005. Canine genomics and genetics: running with the pack. PLoS Genet. 1(5): e58.
  • Remington, D. L., J. M. Thornsberry, Y. Matsuoka, L. M. Wilson, S. R. Whitt et al., 2001. Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc. Natl. Acad. Sci. USA 98: 11479–11484.
  • Richman, M., C. S. Mellersh, C. Andre, F. Galibert and E. A. Ostrander, 2001. Characterization of a minimal screening set of 172 microsatellite markers for genome-wide screens of the canine genome. J. Biochem. Biophys. Methods 47: 137–149.
  • Sutter, N. B., and E. A. Ostrander, 2004. Dog star rising: the canine genetic system. Nat. Rev. Genet. 5: 900–910.
  • Sutter, N. B., M. A. Eberle, H. G. Parker, B. J. Pullar, E. F. Kirkness et al., 2004. Extensive and breed-specific linkage disequilibrium in Canis familiaris. Genome Res. 14: 2388–2396.
  • Tenesa, A., A. F. Wright, S. A. Knott, A. D. Carothers, C. Hayward et al., 2004. Extent of linkage disequilibrium in a Sardinian sub-isolate: sampling and methodological considerations. Hum. Mol. Genet. 13: 25–33.
  • Todhunter, R. J., G. M. Acland, M. Olivier, A. J. Williams, M. Vernier-Singer et al., 1999. An outcrossed canine pedigree for linkage analysis of hip dysplasia. J. Hered. 90: 83–92.
  • Todhunter, R. J., G. Casella, S. P. Bliss, G. Lust, A. J. Williams et al., 2003a. Power of a dysplastic Labrador retriever-greyhound pedigree for linkage analysis of hip dysplasia. Am. J. Vet. Res. 64: 418–424.
  • Todhunter, R. J., S. R. Bliss, S. R. Quaas, G. Lust, G. Casella et al., 2003b. Genetic structure of susceptibility traits for hip dysplasia and microsatellite informativeness of an outcrossed canine pedigree. J. Hered. 94: 39–48.
  • Varilo, T., T. Paunio, A. Parker, M. Perola, J. Meyer et al., 2003. The interval of linkage disequilibrium (LD) detected with microsatellite and SNP markers in chromosomes of Finnish populations with different histories. Hum. Mol. Genet. 12: 51–59.
  • Weir, B. S., 1996. Genetic Data Analysis II. Sinauer Associates, Sunderland, MA.
  • Weir, B. S., and C. C. Cockerham, 1989. Complete characterization of disequilibrium at two loci, pp. 86–110 in Mathematical Evolutionary Theory, edited by M. W. Feldman. Princeton University Press, Princeton, NJ.
  • Yang, R.-C., 2000. Zygotic associations and multilocus statistics in a nonequilibrium diploid population. Genetics 155: 1449–1458.
  • Yang, R.-C., 2002. Analysis of multilocus zygotic associations. Genetics 161: 435–445.
  • Zaykin, D. V., 2004. Bounds and normalization of the composite linkage disequilibrium coefficient. Genet. Epidemiol. 27: 252–257.

Articles from Genetics are provided here courtesy of Genetics Society of America
