• We are sorry, but NCBI web applications do not support your browser and may not function properly. More information
Logo of geneticsGeneticsCurrent IssueInformation for AuthorsEditorial BoardSubscribeSubmit a Manuscript
Genetics. Apr 2006; 172(4): 2647–2663.
PMCID: PMC1456384

The Hitchhiking Effect on Linkage Disequilibrium Between Linked Neutral Loci


We analyzed a three-locus model of genetic hitchhiking with one locus experiencing positive directional selection and two partially linked neutral loci. Following the original hitchhiking approach by Maynard Smith and Haigh, our analysis is purely deterministic. In the first half of the selected phase after a favored mutation has entered the population, hitchhiking may lead to a strong increase of linkage disequilibrium (LD) between the two neutral sites if both are <0.1s away from the selected site (where s is the selection coefficient). In the second half of the selected phase, the main effect of hitchhiking is to destroy LD. This occurs very quickly (before the end of the selected phase) when the selected site is between both neutral loci. This pattern cannot be attributed to the well-known variation-reducing effect of hitchhiking but is a consequence of secondary hitchhiking effects on the recombinants created in the selected phase. When the selected site is outside the neutral loci (which are, say, <0.1s apart), however, a fast decay of LD is observed only if the selected site is in the immediate neighborhood of one of the neutral sites (i.e., if the recombination rate r between the selected site and one of the neutral sites satisfies equation M1). If the selected site is far away from the neutral sites (say, r > 0.3s), the decay rate of LD approaches that of neutrality. Averaging over a uniform distribution of initial gamete frequencies shows that the expected LD at the end of the hitchhiking phase is driven toward zero, while the variance is increased when the selected site is well outside the two neutral sites. When the direction of LD is polarized with respect to the more common allele at each neutral site, hitchhiking creates more positive than negative linkage disequilibrium. Thus, hitchhiking may have a distinctively patterned LD-reducing effect, in particular near the target of selection.

GENETIC drift is recognized as a fundamental stochastic force shaping the polymorphism within populations and divergence between species of both neutral and selected variants at a locus (Fisher 1930; Wright 1931; Kimura 1983). Remarkably similar patterns to those caused by genetic drift are predicted by theoretical models incorporating stochastically varying selection (Gillespie 1994). In 1974 Maynard Smith and Haigh analyzed a simple and obvious extension of the genetic drift models, incorporating the stochastic coupling due to linkage with another locus undergoing strong directional selection. Addressing the apparent uniformity of allozyme polymorphism across species, Maynard Smith and Haigh (1974) focused on the “hitchhiking effect” on allelic frequencies and heterozygosity.

Since this seminal work the hitchhiking effect associated with positive directional selection has been extended to address observations in the emerging field of molecular population genetics (Aguadé et al. 1989; Stephan and Langley 1989; Begun and Aquadro 1992). These studies revealed low levels of DNA sequence polymorphism in Drosophila in genomic regions of low crossing over and led to theoretical analyses of the hitchhiking effect on nucleotide diversity (Kaplan et al. 1989) and on the frequency spectrum of polymorphisms (Braverman et al. 1995). Whether formulated in terms of gamete frequencies or in the coalescent framework, the genetic models dealt with a pair of loci: a selected and a linked neutral locus with a defined rate of recombination between them. Independent of whether single or recurrent hitchhiking events were analyzed (as a stochastic process or a deterministic approximation), the conclusion has consistently been that hitchhiking should have profound effects on expected heterozygosity and allele frequencies at the neutral locus, if selection is strong and linkage is tight.

In 1977 Thomson considered the impact of hitchhiking on linkage disequilibrium (LD). Most of her analysis focused on the association between the selected alleles and those at a single neutral locus. Thomson also considered the impact of hitchhiking (of a heterotic polymorphism) on the LD between alleles at two linked neutral loci. On the basis of numerical examples she concluded that hitchhiking creates LD between neutral loci within the same genomic domain in which it affects levels of heterozygosity. The impact of simple directional selection of rare variants rapidly going to fixation was not considered by Thomson (1977). Except for this initial study and a few rather targeted applications (Robinson et al. 1991; Grote et al. 1998), no attempts have been made to extend this simple two-locus hitchhiking model of Maynard Smith and Haigh (1974) to multiple loci. This is curious, as the hitchhiking effect has played a major role in molecular population genetics for >15 years.

On the basis of the analyses of Thomson and her colleagues, it has been believed that hitchhiking affects not only polymorphisms at individual sites (or loci), but also the association between polymorphisms. Using a three-locus model with one selected and two neutral loci, she showed that hitchhiking that has a strong effect on heterozygosity can also generate strong LD between the neutral loci. Without paying close attention to Thomson's writing, the important role of hitchhiking in generating LD has been reiterated in textbooks and publications by many authors. It was not until recently that a few authors reported quite the opposite results (Gillespie 1997; Kim and Stephan 2002; Kim and Nielsen 2004). Analyzing “shift” models that are similar to the hitchhiking model described above, Gillespie (1997, p. 297) concluded that “linked selection can reduce variation without building up high levels of linkage disequilibrium, contrary to our intuition.” These latter studies focused on average effects observable in simulated data. In small-sample coalescent simulations, Kim and Nielsen (2004) found increased LD between alleles at two neutral loci on the same side of the selected locus at the time of fixation and reduced LD across the site of selection. Furthermore, they provide heuristic arguments to explain this pattern. These different and somewhat contradictory views of the relationship between genetic hitchhiking and LD motivated us to pursue an analytic investigation of the question. We followed the deterministic approach of Maynard Smith and Haigh (1974), extending their model to three loci as did Thomson (1977). To analyze this model, we used the framework of Barton and Turelli (1991), which provides a natural setting for the investigation of the directional hitchhiking, yielding transparent mathematical expressions that illuminate the rather surprising dynamics of LD under the hitchhiking effect.


We consider a three-locus model with two neutral loci and one selected locus. For each locus, we assume that there are only two allele types, denoted by 0 and 1. The selected locus may be between the two neutral ones or on either side of them (see Figure 1). We denote by L and R the left and right neutral loci, respectively, and by S the selected locus. The corresponding recombination fractions between loci are rLR, etc. We assume positive directional selection according to the following fitness scheme:

equation M2

where s is the selection coefficient of the selected allele (type 1) and h the dominance coefficient. Note that we follow here the notation of Maynard Smith and Haigh (1974), not the definition of Kaplan et al. (1989). Effective population size Ne is assumed to be very large (equation M3), such that a deterministic analysis of the model is appropriate.

Figure 1.
Three possible configurations. The selected locus is denoted by S, whereas the left and the right neutral loci are denoted by L and R, respectively. The recombination rate rLR between L and R is given by adding or subtracting rLS and rRS, depending on ...

The system of full recursions:

To derive the recursions for the marginal (type 1) allele frequencies pL, pR, and pS at the three loci and the LDs (central moments) of the second (CLR, CLS, CRS) and third (CLRS) orders measured with respect to type 1 alleles, we follow the approach of Barton and Turelli (1991). In that approach, coefficients that appear in the recursions are recombination rates and generalized selection coefficients, the latter denoted by equation M4 and equation M5, where X and Y are nonempty subsets of loci. (Generalized selection coefficients are defined as coefficients that appear in expressing the relative fitness in terms of certain quantities related to LDs.) In our case, nonzero selection coefficients that appear in the recursions at time t are

equation M6

For h = equation M7, these simplify to

equation M8

Following Barton and Turelli (1991), we define rLRS = rLS,R + rRS,L + rLR,S and use rX,Y to denote the rate of recombination events that partition the loci into two nonempty sets X and Y. In our work, we ignore double-crossover recombination events, so we define rX,Y = 0 if the partition X, Y corresponds to a double-crossover event. For example, if the selected locus S is between the neutral loci, then rLS,R = rRS, rRS,L = rLS, and rLR,S = 0. Further, we assume that recombination rates are additive. The recombination rate rLR is therefore given by adding or subtracting rLS and rRS, depending on the configuration (see Figure 1).

To simplify notation, we hereafter omit writing the dependence on time t. We define qS = 1 − pS, qL = 1 − pL, and qR = 1 − pR. Marginal allele frequencies satisfy the following recursions:

equation M9
equation M10
equation M11

The LDs satisfy the recursions

equation M12
equation M13
equation M14
equation M15


equation M16
equation M17
equation M18
equation M19


equation M20

The above general recursions apply to all three configurations shown in Figure 1. Depending on the particular configuration being considered, recombination rates rX,Y need to be defined appropriately.

The system of truncated recursions:

We explore the behavior of the recursion Equations 17 in the region equation M21 and equation M22 (see Maynard Smith and Haigh 1974). Here, r may be any recombination parameter appearing in the Equations 17. Keeping only the terms linear in s or r leads to the following set of recursions:

equation M23
equation M24
equation M25
equation M26
equation M27
equation M28
equation M29

These equations agree to first order in r and s with those of Thomson (1977) (compare her Equations 30iii, 30iv, and 31). Since the hitchhiking effect can be best observed when equation M30 (such that simultaneously equation M31 holds), we may approximate the truncated recursions by the following ordinary differential equations (ODEs; see Maynard Smith and Haigh 1974):

equation M32
equation M33
equation M34
equation M35
equation M36
equation M37
equation M38

Here we have introduced time t (in generations) into the equation for pS and parameterized the other quantities by pS (which is a monotonically increasing function of t).

Structure of equations:

Several features of the dynamical system become apparent from these two sets of equations. Most importantly, selection acts on the alleles at the neutral loci indirectly and in a strictly hierarchical fashion: on the marginal neutral allele frequencies via the pairwise LDs CLS and CRS, on the LD between the neutral sites by the third moment, and on the latter by a fourth-order term (i.e., the product of the two pairwise LDs CLS and CRS).

Analytical solutions when the selected locus is between the neutral loci:

The ODEs for the LDs are first-order linear differential equations. The ODEs for the pairwise LDs CLS and CRS are homogeneous. The equations for CLR and CLRS contain the higher-order moments as inhomogeneous terms that act as “driving forces” of the dynamics. However, except for this impact of the higher-order terms, the equations are decoupled and can be solved successively. We have the following results.

The frequency of the selected allele at locus S is

equation M39

whereas marginal allele frequencies at the neutral loci are

equation M40
equation M41


equation M42

which corresponds to the heterozygosity at the selected locus. The LDs CLS(t) and CRS(t) can be written as

equation M43
equation M44

Given these solutions for CLS(t) and CRS(t), the coupled ODEs (24) and (25) admit simple exact solutions when the selected locus is between the two neutral loci. More specifically, the third-order LD is given by

equation M45

and the LD between the neutral loci can be written as

equation M46

where rLR = rLRS = rLS + rRS.

Analytical solutions when the selected locus is outside the neutral loci:

Solutions to the ODEs (19)–(23) do not depend on whether the selected locus is inside or outside the two neutral loci. For example, the allele frequency pS(t) and the LDs CLS(t) and CRS(t) are given by (26), (30), and (31), respectively, in all cases. However, the dynamics of the ODEs (24) and (25) for CLR(t) and CLRS(t), respectively, depend crucially on the position of the selected locus S with respect to the neutral loci L and R. As we elaborate presently, the dynamics of CLR(t) when S is between L and R exhibit radically different behavior than when S is outside.

In what follows, suppose that S is to the right of R, which implies rLS = rLRS = rLR + rRS. The case where S is to the left of L can be handled in a similar vein, with rRS replaced with rLS. Now, the ODE (25) for the third-order LD CLRS(t) does not admit a closed-form solution; a general solution can be obtained in terms of the incomplete beta function B(z; x, y), defined as equation M47. However, noting that B(z; 1 − 2rRS/s, 1 + 2rRS/s) ≈ z if equation M48, we obtain the following simple approximate solution:

equation M49

Further, using this solution and the approximation B[z; 2 − 2rRS/s, 1 + 2rRS/s] ≈ z2 for equation M50, we obtain the following approximate solution to (24):

equation M51

Note the striking resemblance of (34) and (35) to (32) and (33), respectively. For rRS = 0, the two sets of equations agree exactly, and hence our solutions for different regions form one continuous solution for the entire domain. The only difference between the two sets of equations is that, in (34) and (35), an extra factor equation M52 appears together with [pS(t) − pS(0)]/HS(0). This simple difference leads to important observable differences in the dynamics of the LDs.

Comparison of approximate analytic solutions with numerical solutions to the full recursions:

We have written a computer program to solve the full recursions (1)–(11) numerically. Comparison of our analytic solutions (33) and (35) with numerical solutions to the full recursions are shown in Tables 1 and and2,2, respectively. As these tables show, our analytic solutions are a good approximation to the exact dynamics. In obtaining our analytic solution (35) for the case in which the selected locus is outside the neutral loci, recall that we assumed equation M53 to approximate the incomplete beta function. As expected, Table 2 shows that (35) becomes less accurate as rRS increases, but it is a good approximation as long as equation M54.

Comparison of the exact CLR(t) (obtained by solving the full recursions numerically) with the analytic approximation (33) when the selected locus is between the two neutral loci
Comparison of the exact CLR(t) with the analytic approximation (35) when the selected locus is to the right of locus R


Here we consider the dynamics of the LD between the two neutral loci. We utilize our analytic solutions from the previous section to study several important aspects of the dynamics.

Vanishing LD:

In the domain equation M55, the term −sCLSCRS dominates the recursion for the third moment [see (18)] and hence influences also the LD between the neutral sites. If the selected site is between the two neutral sites and linkage is sufficiently tight (rLR < 0.1s), |CLR| quickly increases after the favored mutation has entered the population and, after transiently reaching a peak, rapidly decays to zero before the selected phase ends. To show this, define tf as the time satisfying pS(tf) = 1 − pS(0). Henceforward, we loosely refer to this time as the fixation time. Using (26), one can show that

equation M56

We wish to show that, if the selected locus is between the two neutral loci, then CLR(tf) ≈ 0 for all possible initial conditions of interest. Common to all initial conditions is that pS(0) = 1/(2N), with N being the population size. For N = ~104–106, equation M57, and therefore

equation M58

which implies that (33) at tf can be written approximately as follows:

equation M59

We use {000, 001, 010, 011, 100, 101, 110, 111} to denote gametic types. Their frequencies are denoted by {f000, f001, f010, f011, f100, f101, f110, f111}, which are related to the marginal frequencies and the LDs as follows:

equation M60

Using these relations, we obtain

equation M61

At t = 0, a new favored mutation occurs on only one of the following gametic types: 000, 001, 100, or 101. For instance, if the mutation occurs on a gamete of type 101, f010(0) = f011(0) = f110(0) = 0 and f111(0) = pS(0) = 1/(2N). More generally, only one of f010(0), f111(0), f011(0), f110(0) is supposed to be nonzero. This implies that the right-hand side of (39) must be zero. Hence, the coefficient of exp(−rLRtf) in (38) is exactly zero. If the approximation (37) were not used, then the coefficient of exp(−rLRtf) in CLR(tf) would not be exactly zero, but would still be very small. Note that exp(−rLRtf) may not be very small. For example, exp(−rLRtf) = 0.452813 for pS(0) = 0.00005, rLR = 0.00002, and s = 0.001. This shows that, for small recombination rates (equation M62), selection rather than recombination is the dominant force that causes CLR(t) to vanish before the fixation time. For large recombination rates, the contribution exp(−rLRt) from recombination should dominate over selection effects.

This behavior may be explained as follows. Under the scenario of tight linkage and strong selection, a low-frequency gamete on which the favored mutation landed is quickly dragged into intermediate to high frequency. If the recombination rates are nonzero, this gamete may undergo recombination, thereby creating the two types of single recombinants that also carry the selected allele and thus increase in frequency. This reduces the LD between L and R created by the hitchhiking effect in the first half of the selected phase. This hitchhiking effect on the recombinants is stronger, the greater the linkage between the selected site and the two neutral sites is, and thus also the product CLSCRS.

Figure 2a shows another important observation: LD may vanish very quickly in the selected phase, while relative heterozygosity approaches a finite (i.e., nonzero) equilibrium value. Thus, LD does not vanish because of the variation-reducing effect of hitchhiking per se, but as a consequence of secondary hitchhiking effects on the recombinants created in the selected phase (described above).

Figure 2.
Example trajectories of CLR(n) are shown on the left-hand side, and those of normalized heterozygosity Het(n)/Het(0) are shown on the right-hand side, where solid (resp. dashed) lines correspond to locus L (resp. R). (a) The selected locus is at the midpoint ...

A selected mutation occurring outside the two neutral sites on a low-frequency gamete may also lead to a transient peak of CLR, if both neutral polymorphic sites are <0.1s recombination distances away from the selected site (see Figure 2, b and c). Although this peak vanishes faster than under neutral conditions (i.e., with the selected site far away from the neutral sites, as in Figure 2d), the decay rate is not as high as when the favored mutation occurs between the neutral sites (Figure 2a). We analyze this behavior in more detail below.

Shown in Figure 3 is a plot of CLR(n) for varying position of the selected locus. As in Figure 2, the distance between the two neutral loci is fixed at rLR = 0.0002, and the same set of initial conditions is used. Note that this plot is symmetric about the plane r = 0; we return to this point later in the article. We stress that our conclusions described above do not depend on the particular values of neutral marginal allele frequencies pL(0) and pR(0) used for illustration. Even for low values of pL(0) and pR(0), for example, the same conclusions hold.

Figure 3.
The LD CLR between the neutral loci as a function of the position of the selected locus, for rLR = 0.0002 and s = 0.01. Here, r is the position of the selected locus S, and the value r = 0 corresponds to the midpoint between the ...

An alternative illustration of the above discussion is provided in Figure 4, which shows pairwise LD plots for a region containing 100 neutral loci and a single selected locus. The two plots shown correspond to two different time points. The selected locus is located in the middle of the region and LD values below a cutoff value are not plotted.

Figure 4.
Truncated pairwise LD plots for a region consisting of 100 neutral loci and 1 selected locus located in the middle. The recombination rate between any two adjacent loci is 0.4s/100, and hence the recombination rate between the leftmost and the rightmost ...

The maximum LD at tf:

Consider the case in which the selected locus S is to the right of locus R. Viewing CLR(tf) as a function of rRS, whether a local optimum in the domain equation M63 exists depends on initial conditions. The example shown in Figure 3 has a local maximum at rRS/s ≈ 0.039. Differentiating the analytic solution (35), it is possible to determine whether there is a critical point rRS = r*RS that satisfies

equation M64

Suppose that, at t = 0, the new gamete carrying a selected allele (of type 1) at locus S is of type ij1, with i being the allele at locus L and j the allele at locus R. Then, given that [δi1pL(0)][δj1pR(0)] > CLR(0), where δab is 1 if a = b or 0 if ab, we obtain

equation M65


equation M66

The value of r*RS in (41) may be very large for some initial conditions. In such a case, as our analytic solution (35) is valid only in the domain equation M67, all we can say for sure is that CLR(tf) has no critical point in the domain equation M68 [i.e., CLR(tf) is either a monotonically increasing or a monotonically decreasing function of rRS in that domain]. Further, if [δi1pL(0)][δj1pR(0)] ≤ CLR(0), there is no real-valued r*RS such that our approximate analytic solution (35) satisfies (40). Hence, noting that CLR(tf) approaches equation M69 as rRS/s increases, we conclude that, in the domain equation M70,

equation M71


equation M72

and equation M73. Note that the maximum possible value of X is equation M74. The critical point rLS = r*LS for the case in which the selected locus is to the left of locus L is also given by (41).

Invariance of CLR and CLRS when the selected locus is between the two neutral loci:

When the favored mutation occurs between the two neutral sites, the dynamics of the system of truncated recursions for CLR and CLRS do not depend on the position of the selected locus. This can be seen immediately from Equations 17 and 18, which depend only on the sum rLS + rRS and not on any individual recombination parameter. We show in the appendix that this invariance also (nearly) holds for the system of full recursions.


Here we assume that the selected locus is to the right of locus R. For such a case, recall that the LD (measured with respect to type 1 alleles) between the neutral loci is given by (35). We use ijk to denote gametic types, with i being for locus L, j for locus R, and k for locus S. By the gamete of origin, we mean the new gamete at t = 0 carrying a selected allele (of type 1) at locus S. We include ij in superscript [i.e., we write equation M75] if the gamete of origin is of type ij1. Using (35), we obtain

equation M76

where, as before, δab is 1 if a = b or 0 if ab, and α(t, y) is defined as

equation M77

Recall that marginal (type 1) allele frequencies pL(t) and pR(t) at the neutral loci are given by (27) and (28), respectively. For equation M78 and equation M79, we can use the approximation

equation M80

from which it follows that

equation M81

where α(tf, rRS/s) is defined as in (44). Similar to equation M82, we use equation M83 and equation M84 to denote pL(t) and pR(t), respectively, if the gamete of origin is of type ij1. It is straightforward to show that

equation M85
equation M86

Frequency-averaged LD and the range of the hitchhiking effect:

In what follows, we use xijk to denote the frequency of the gametic type ijk at time t = 0 and define xij· = xij0 + xij1. The type of the gamete of origin could be any of 001, 011, 101, and 111. Suppose that the probability of the gamete of origin being of type ij1 is equal to the frequency of the gametic type ij0 just before time t = 0 (note that this frequency is equal to xij·). Then, the average value of CLR(t) with respect to this probability is given by equation M87, which we call a frequency-averaged LD. We show below that, contrary to people's common intuition, the effect of selection on such an averaged LD does not depend on haplotype diversity xij· at t = 0.

Using (43), we can show that equation M88 is given by

equation M89

If CLR(0) = 0, then equation M90 for all t. For CLR(0) ≠ 0, we define

equation M91

Note that rLR need not be much smaller than s for our analytic solution (35) to be valid (recall that only equation M92 is required). Assuming equation M93, we can ignore genetic drift and regard equation M94 as the behavior of LD under neutrality. Thus, A(t, rRS/s) can be viewed as the ratio of the frequency-averaged LD in the presence of selection to that in the absence of selection. At the time tf of fixation, pS(tf) = 1 − pS(0) and therefore

equation M95

For given rRS/s, A(tf, rRS/s) depends only on pS(0) = 1/(2N); it has no dependence on other initial conditions. A plot of A(tf, rRS/s) is shown in Figure 5 for pS(0) = 0.00005.

Figure 5.
A plot of A(tf, rRS/s) for pS(0) = 0.00005. The function A(tf, rRS/s), defined in (47), captures the effect of selection on both relative frequency-averaged LD and relative frequency-averaged heterozygosity [see (47) and (48)].

We now compare A(tf, rRS/s) with relative frequency-averaged heterozygosity. Let us focus on the right neutral locus R and define equation M96. For HR(0) = 2pR(0)[1 − pR(0)] ≠ 0, one can use (46) to show that

equation M97

For pS(0) = 1/(2N), this is equation M98, which is equivalent to Equation 14d of Stephan et al. (1992) (the factor 2 in the exponent of that formula needs to be replaced by 4 because of the different definition of the selection coefficient). In the absence of genetic drift, equation M99(tf) = HR(0) for s = 0. Therefore, (48) can be regarded as the ratio of the frequency-averaged heterozygosity in the presence of selection to that in the absence of selection. Surprisingly, this ratio is exactly equal to the analogous ratio for LD shown in (47). The function A(tf, rRS/s) plays a special role in the sense that it encodes the effect of selection on two different frequency-averaged quantities.

For site heterozygosity, the hitchhiking effect is generally profound only when, provided that equation M100, the recombination distance r between the selected and neutral sites satisfies r < 0.1s (Maynard Smith and Haigh 1974). The term determining this effect is (2N)−4r/s, assuming that the initial frequency pS(0) of the selected allele is 1/(2N). Our above analysis shows that the range of a substantial reduction of LD due to hitchhiking [determined by equation M101] is exactly equal to that for variation [determined by (2N)−4r/s] (see Figure 5).

Characterization of equivalent initial conditions:

We now find the set of all initial conditions that lead to the same value of CLR at the time of fixation; i.e., CLR(tf) = c, where c is some fixed constant.

To be concrete, suppose that the gamete of origin is of type 001, in which case x101 = x011 = x111 = 0 and x001 = pS(0) = 1/(2N). First, note that x000 + x010 + x100 + x110 + pS(0) = 1 implies

equation M102

which defines a tetrahedron Δ as depicted in Figure 6a. Second, using (43), one can show that equation M103 implies

equation M104

which defines a surface Ξ in a three-dimensional Euclidean space with (x110, x010, x100) as coordinates. The intersection of surface Ξ with tetrahedron Δ, illustrated in Figure 6b, corresponds to the set of initial conditions such that equation M105. A case in which the gamete of origin is of type other than 001 can be handled in a similar vein.

Figure 6.
Illustration of initial conditions leading to the same equation M152. (a) The illustrated tetrahedron Δ corresponds to the set of all possible initial conditions, given that the gamete of origin is of type 001. Δ is defined by (49), and a point in ...

Probability distributions of CLR(tf):

Recall that CLR(t) for t > 0 depends on initial conditions. In what follows, we regard initial gametic frequencies as being random and consider the probability distribution of CLR(tf). The squared correlation coefficient R2(tf) is addressed later in the discussion. We assume that all initial gametic frequency configurations x000, x010, x100, x110 are equally likely and satisfy x000 + x010 + x100 + x110 + pS(0) = 1. Under this assumption of uniform distribution, it is possible to compute the probability distribution P[CLR(tf) < c] for fixed rLR/s and rRS/s. The key idea is to utilize the characterization of equivalent initial conditions described above. More precisely, as c changes, the surface defined by CLR(tf) = c changes in a smooth fashion, sweeping out a region in three dimensions. The probability P[CLR(tf) < c] is equal to the volume of the region corresponding to CLR(tf) < c inside Δ, normalized by the total volume of Δ.

Our main result, illustrated in Figure 7, is

equation M106

where equation M107 is defined below. Let

equation M108

Cases with c > 0 and c < 0 are treated separately below.

Figure 7.
Probability distributions of equation M154 for uniformly distributed initial gametic frequency configurations. These plots were generated using our analytic formulas for P[equation M155 < c], with rLR/s = 0.02 and 1/(2N) = 0.00005. As rRS/s increases, ...

For c > 0:

If cb/4, equation M109, and cab(1 − a), then

equation M110

If cb/4, equation M111, and c > ab(1 − a), then

equation M112

If either c > b/4 or equation M113, then P[C < c] = 1.

For c = −|c| < 0:

If |c| ≤ (1 − a)2b/4, then

equation M114

If |c| > (1 − a)2b/4, then P[C < −|c|] = 0.


Polarized LDs are measured with respect to major alleles. To determine the polarized LD Cω(tf) between the neutral loci, we compute

equation M115

the main point being that Cω(tf) = CLR(tf) if σ > 0 and Cω(tf) = −CLR(tf) if σ < 0. [Recall that CLR(t) is measured with respect to type 1 alleles.]

First, for rLR = rRS = 0, note that

equation M116

Second, for fixed initial marginal frequencies, we need to determine for what values of rLR and rRS the sign of σ changes. Using (45) and (46), we can obtain the following results:

  • For i = 0, σ ≈ 0 if pL(0) > equation M117 and
    equation M118
  • For i = 1, σ ≈ 0 if qL(0) > equation M119 and
    equation M120
  • For j = 0, σ ≈ 0 if pR(0) > equation M121 and
    equation M122
  • For j = 1, σ ≈ 0 if qR(0) > equation M123 and
    equation M124

Combined with (52), these equations determine completely whether Cω(tf) = CLR(tf) or Cω(tf) = −CLR(tf) for given parameter values. For example, suppose that (i, j) = (0, 0). If pL(0) < equation M125 and pR(0) < equation M126, then there is no real-valued solution to the condition σ = 0, and therefore Cω(tf) = CLR(tf) for all values of rLR and rRS. If pL(0) > equation M127 and pR(0) < equation M128, or if pL(0) < equation M129 and pR(0) > equation M130, then σ changes sign as illustrated in Figure 8, a and b, where

equation M131


equation M132

Note that u takes its minimum value at pL(0) = 1 and that equation M133 as equation M134. Similarly, v takes its minimum value at pR(0) = 1 and equation M135 as equation M136. If both pL(0) > equation M137 and pR(0) > equation M138, then there are three possibilities, depicted in Figure 8, c–e.

Figure 8.
The sign of σ = [equation M161(tf) − equation M162(tf)] × [equation M163(tf) − equation M164(tf)] for (i, j) = (0, 0). The polarized LD Cω(tf) between the neutral loci is CLR(tf) if σ > 0 or −CLR(tf) if σ < ...

Regions of positive Cω(tf):

To determine the regions of positive Cω(tf), we need to know how the sign of equation M139 depends on rRS; (43) implies that the sign of equation M140 does not depend on rLR. For concreteness, suppose that (i, j) = (0, 0), in which case we can obtain the following results from using (43):

  1. If CLR(0) ≥ 0, then equation M141 ≥ 0 for all rRS, and therefore the sign of Cω(tf) is completely determined by that of σ.
  2. If CLR(0) < 0, then equation M142 changes sign as illustrated in Figure 9, where
    equation M143
    Figure 9.
    The sign of equation M174 for CLR(0) < 0. If CLR(0) ≥ 0, then equation M175 is nonnegative for all rRS and rLR. If CLR(0) < 0, however, the sign of equation M176 can change at rRS = w, where equation M177. In general, equation M178 tends to be positive near (rLR, rRS) = (0, 0). ...

Note that w takes its minimum value of zero at |CLR(0)| = pL(0)pR(0) and that it increases monotonically as |CLR(0)|/[pL(0)pR(0)] decreases. The polarized LD Cω(tf) is positive if and only if equation M144 and σ are either both positive or both negative. Examples are shown in Figure 10.

Figure 10.
Examples of the sign of the polarized LD Cω(tf) when the gamete of origin is of type 001 and CLR(0) < 0. Note that Cω(tf) is positive if and only if equation M179 and σ are either both positive or both negative. In general, Cω ...

As shown in Figure 8, σ tends to be positive in the neighborhood of (rLR, rRS) = (0, 0). The size and shape of this neighborhood depend on u and v. Likewise, as shown in Figure 9, equation M145 tends to be positive in the neighborhood of (rLR, rRS) = (0, 0), with the size of the neighborhood depending on w. As a consequence, the polarized LD Cω(tf) also tends to be positive near (rLR, rRS) = (0, 0).

More generally, if the gamete of origin is of type ij1, the sign of equation M146 (respectively, σ) can be analyzed using (43) [respectively, (52)–(56)]. For all ij, the polarized LD Cω(tf) tends to be positive near (rLR, rRS) = (0, 0).

An exact symmetry when the selected locus is outside the two neutral loci:

Suppose that the selected locus is outside the two neutral loci and that geometric configuration and recombination fractions are fixed. Let {pS(0), pL(0), pR(0), CLS(0), CRS(0), CLR(0), CLRS(0)} and {pS(0), pL(0), pR(0), CLS(0), CRS(0), CLR(0), CLRS(0)} denote two different sets of initial conditions. At generation n > 1, we use “prime” to refer to the marginal allele frequencies and LDs obtained using the second set of initial conditions. In the appendix, we show that if CLS(0) = CRS(0) and CRS(0) = CLS(0), while pS(0) = pS(0), CLR(0) = CLR(0), and CLRS(0) = CLRS(0), then the system of full recursions (1)–(11) implies

equation M147

for all n ≥ 1. This is an exact symmetry result that holds for an arbitrary dominance coefficient h.

An application of this general result is the explanation of the symmetry of Figure 3 with respect to reflection about the r = 0 plane, for those regions corresponding to the selected locus being outside the neutral loci. Note that what is depicted in Figure 3 is different from the obviously symmetric case in which initial conditions CLS(0) and CRS(0) get exchanged when locus S is reflected about r = 0. In that figure, initial conditions remain fixed, while the geometric configuration of the loci and recombination fractions change upon reflection. That situation is related to changing initial conditions as described above, while keeping the geometric configuration and recombination fractions fixed.


To understand the forces that shape genomic variation in natural populations and the divergence between species, observed patterns must be compared to predictions of the models that faithfully represent the mechanisms through which such forces may work. While much of the natural selection of organismic phenotypes may be effectively approximated by deterministic single-locus equations, interactive and stochastic forces are thought to play a significant role. Until recently genetic drift has been considered the primary stochastic process determining the temporal, geographic, and genomic distribution of the vast majority of DNA sequence polymorphism and divergence. Gillespie has repeatedly demonstrated and emphasized fundamental differences between constant-fitness models and stochastically varying selection, despite their superficial similarities (Gillespie 1994). Recently emerging results of surveys of genomic regions of low crossing over per physical length indicate that linked selection rather than genetic drift can dominate the levels of polymorphism within populations (Aguadé et al. 1989; Stephan and Langley 1989; Begun and Aquadro 1992). The hitchhiking effect not only reduces the average level of heterozygosity in the surrounding genomic regions, but it also leaves a skewed frequency spectrum (Braverman et al. 1995). The early study by Thomson (1977) indicated that linked selection can create linkage disequilibrium. Several subsequent articles have addressed specific cases (Robinson et al. 1991; Grote et al. 1998) or noted some temporal and spatial patterns (Kim and Nielsen 2004). Here we have demonstrated that the hitchhiking effect involves a number of strong and surprisingly distinct dynamics and patterns of linkage disequilibrium. We believe that the approach we have taken to address the impact of selection can be extended further to address more complex selection schemes and genetic interactions.

The technological capacity of molecular population genomics is increasing rapidly. For example, the HapMap Project (International HapMap Consortium 2005) provides extensive genotypic survey results on >1 million SNPs in almost 300 individual humans. At this scale of observation one can anticipate much more powerful inferences about the role of direct selection, linked selection, crossing over, gene conversion, mutation, and geographic demography. Indeed, on the basis of such new data, genomic variation in the rate of crossing over has been proposed as the primary determinant of the patterns of linkage disequilibrium in human populations (McVean et al. 2004).

Several representations/notations have been developed to analyze the dynamics of multilocus systems (Bürger 2000). Through a series of articles Barton and Turelli have elaborated and applied their method on the basis of the explicit representations of the moments of allele frequencies (Barton and Turelli 1987, 1991; Turelli and Barton 1990, 1994). Their representation proved surprisingly tractable and transparent in the analysis of the hitchhiking effect on linkage disequilibrium.

Our analysis begins with the full representation of the three-locus dynamics using the notation of Barton and Turelli. These equations suggest the familiar approximation, “truncated equations,” in which equation M148 and small higher-order terms can be dropped. The truncated equations immediately expose much of the fundamental structure and their differential analogs, ordinary differential equations, allow approximate analytic solutions. Comparisons of these ODE dynamics with those of the Barton and Turelli representation indicate that the approximations remain quite accurate as long as equation M149 (see Tables 1 and and2).2). Particularly fortuitous and important is the role of the three-locus LD CLRS in driving the dynamics of the LD CLR between the two linked neutral loci and the dependence of CLRS on the product of the two-locus LDs CLS and CRS.

This systematic investigation of the dynamics of LD under hitchhiking reveals four important features. First, and quite generally, hitchhiking indeed generates LD during the initial half of the hitchhiking time course. As Figure 3 shows LD (positive in this instance) reaches a maximum shortly before the originally rare selected allele reaches 0.5. This result is consistent with Thomson's analysis of hitchhiking caused by the dynamics of an initially rare allele under balancing selection in that its frequency reaches an equilibrium closer to 0.5 than to 1.0. But what is truly surprising is that from several important perspectives the hitchhiking effect on LD is one of reduction. In Figure 3 it is obvious that in the second half of the hitchhiking period the large peak of LD (positive in this case) decreases rapidly. Figure 2 shows several configurations of initial conditions and demonstrates that the decline in the magnitude of LD is not attributable to decline in the heterozygosity at the two neutral loci. A second and striking result is that preexisting LD is completely destroyed when the selected locus is situated between the neutral sites. This geometric relationship produces a striking pattern when all pairwise associations are plotted together as in Figure 4. This is probably the mechanism behind the pattern noted by Kim and Nielsen (2004). This LD-reducing effect of hitchhiking is also evident when the selected site is outside the neutral pair since much of the LD generated during the initial phase is destroyed in the latter phase. A third unexpected property of the hitchhiking on LD is that the averaging over the frequencies of the gametes with which the rare selected variant can be associated indicates that the net effect of hitchhiking would be to reduce preexisting average LD. This is despite the fact that hitchhiking does tend to increase the variance in LD (see below). Note in Figure 3 that there is considerably increased LD in both regions flanking the two neutral sites (i.e., when the selected site is outside and is close to the two neutral sites). When the rare favored allele appears on two of the other three haplotypes (10 or 01) the final LD is strongly negative. Thus the average (weighted by the frequencies of the four gametes) will remain at zero if there is no LD and tend toward zero if initially different from zero. The rate of approach to zero is greater than or equal to that expected in the absence of hitchhiking.

The fourth notable LD hitchhiking effect is on the expected LD when this association is polarized by the marginal allele frequencies. Langley and Crow (1974) noted that with molecular polymorphism data the sign of LD is typically arbitrary. They proposed to orient LD such that it reflected the deviation for the expected most frequent gametic type and demonstrated that under quadratic stabilizing selection this measure of LD, denoted Cω, is negative. Under hitchhiking the average Cω tends to be positive. This can be understood as the consequence of the fact that the neutral alleles at each site on the initially selected haplotype tend to rise to frequencies >0.5 and the LD between those alleles is positive. A bias in the distribution of Cω either regionally or across the genome could be interpreted as evidence that hitchhiking is shaping LD. Thus the frequency-averaged hitchhiking effect on LD is to drive it to zero. But as shown in Figure 11a there is a bias with respect to marginal frequencies at the two neutral sites; Cω(tf) tends to be positive for small r/s. And, of course, there is a broad range of r in which the variance of LD is increased when the selected site is outside the two neutral sites. Figure 11b shows that the projection of the probability distribution of the squared correlation coefficient R2(tf) also has a peak for small r/s, near 0.02.

Figure 11.
Probability distributions of the polarized LD Cω(tf) and the squared correlation coefficient R2(tf), obtained from numerical simulations using rLR/s = 0.02 and 1/(2N) = 0.00005. The polarized LD Cω(tf) tends to be positive ...

The genomic scale over which hitchhiking has a significant effect on heterozygosity and the frequency spectrum has been considered previously. Beyond the obvious equation M150 inherent in the approximation, Stephan et al. (1992) showed that the reduction in heterozygosity was approximately proportional to 1 − (2N)−4r/s. Simulations of Braverman et al. (1995) indicated a similar scale and shape to the skewness in the frequency spectrum as measured by Tajima's D (also see Durrett and Schweinsberg 2005). In studying the expectation of the linkage disequilibrium caused by hitchhiking we note a striking common function A(t, r/s) that relates the averages of both heterozygosity and LD to what they would be in a large population in the absence of hitchhiking. We are tempted to speculate that this simple function may be fundamental to average dynamics of other moments of allele frequencies under the hitchhiking scenario.

While the effect of hitchhiking on the average CLR is to drive it toward zero, this is clearly not expected for equation M151, or the absolute value of CLR. We have not obtained an analytic expression for such expectations but simulated results such as those shown in Figures 3, ,4,4, and and1111 indicate that the hitchhiking effect on magnitude of CLR between neutral sites on the same side of the selected site can be substantial.

Our results were derived using a deterministic three-locus model of hitchhiking. Similar results hold for the pseudo-hitchhiking model (Gillespie 2000 and J. Gillespie, unpublished results). We have compared both models. The recursion equations of the pseudo-hitchhiking model are a good approximation of the dynamics of the three-locus model if the selected locus is outside the two neutral loci and the distance between the selected locus and either one of the neutral loci is much larger than the distance between the two neutral loci. In this parameter region, LD predicted by both models decays more quickly than under neutrality. How might these conclusions about the theoretical hitchhiking dynamics of LD influence the interpretation of population genomic polymorphism and divergence? Certainly it seems to inform any effort to identify regions in the genomes of natural populations in which there has been very recent selected substitution of newly arising mutations or otherwise rare variants. Hitchhiking may not increase LD in the neighborhood of a selected site as it has been widely thought; rather it can decrease it especially when the neutral sites are on opposite sides of the selected locus (see Figure 4). More generally LD that is built up by hitchhiking shortly after the occurrence of a favored mutation is quickly destroyed (even before fixation is reached). As a consequence, genomic regions around targets of recent positive directional selection are expected to exhibit a lack of LD, which is not simply due to the variation-reducing force of hitchhiking. This local dip in the magnitude of LD may be of use in the localization of targets of positive selection in the genome. Given the current debate of how various variation-reducing forces can be distinguished (in particular, bottlenecks from selective sweeps; Glinka et al. 2003; Haddrill et al. 2005), there is merit in attempting to include the specific pattern of LD predicted by these analyses into the methods for identifying targets of selection by selective sweeps (e.g., Kim and Stephan 2002; Kim and Nielsen 2004). Because populations that have undergone population size bottlenecks should show elevated genomewide levels of LD, regions lacking LD around targets of selection may be more easily distinguishable from the rest of the loci than when statistics that are solely based on the reduction of variation are used.

We have not attempted to extend our results to situations in which recurring and genomically randomly distributed hitchhiking events occur. The significant impediment to the analysis of the effect of such recurrent hitchhiking on heterozygosity may be the impact of simultaneous events within the same genomic region. But if selection is strong and events are sufficiently rare such occurrences may be negligible (Kaplan et al. 1989; Durrett and Schweinsberg 2005). While this issue of the dynamic interaction of simultaneous linked hitchhiking events may well remain for the analysis of the impact of hitchhiking on LD, there is clearly a second considerable issue. While in large populations the heterozygosity does not change in between hitchhiking events, that is not true of LD, which, of course, decays in magnitude at rate r. If the rate of the recurrent and randomly distributed hitchhiking events were sufficiently rare and there were no other force causing LD, the results given above are applicable, since LD would decay to zero throughout the genomic region before the next event. Given that LD is, in fact, commonly present on some scale in the various studied species, further analysis and/or simulations are warranted to make a general prediction of the genomic pattern.


We thank Michael Turelli for teaching us the multilocus formalism and John Gillespie for supplying us with an unpublished manuscript on pseudo-hitchhiking. We acknowledge support from the following sources: the Volkswagen-Foundation grant I/78 815 (W.S.), the National Science Foundation grants ELA-0220154 (Y.S.S. and C.H.L.) and IIS-0513910 (Y.S.S. and C.H.L.), and the National Human Genome Research Institute (National Institutes of Health) grants 5R01HG002107-03 and 5R01HG002942-02 (C.H.L.).


Quasi-invariance (embedded selected locus case):

Here we examine in more detail the case where the selected locus is between the neutral loci. More exactly, we wish to keep the sum rLS + rRS fixed to some value, say ρ, and consider varying rLS and rRS while satisfying that condition. To avoid being long-winded, we call this kind of translation of the selected locus a constrained S-translation. We wish to show that the dynamics of certain linkage disequilibria are quasi-invariant, as we clarify presently, under the constrained S-translation.

First, note that the dynamics of pS do not depend at all on the position of the selected locus. Then, as rLR = rLS + rRS for the case under consideration, recursions (6) and (10) do not change under the constrained S-translation. Since rLS,R = rRS and rRS,L = rLS, we have rLS,R + rRS,L = rRS + rLS and rLRS = rLR,S + rRS + rLS. If no double crossovers are allowed, then rLR,S is identically zero. Note that equation M184 and that the sum g(rLS) + g(rRS) does not change under the constrained S-translation. Therefore, recursions (7) and (11) do not change under the constrained S-translation, as long as rLR,S does not depend on where in between the neutral loci the selected locus is located.

We now turn to the product CLSCRS. For ease of notation, we define

equation M185

Then, (4), (5), (8), and (9) imply

equation M186

The product CLSCRS satisfies the recursion

equation M187

The quantity f(rLS; k)f(rRS; k) can be written as

equation M188


equation M189

Under the restriction that rLS + rRS = ρ, the maximum value of rLSrRS is ρ2/4, whereas the minimum is 0. Since γ(k) is positive definite, the maximum variation of CLS(n)CRS(n), as the selected locus moves between the neutral loci, can be obtained by comparing CLS(n)CRS(n) at rLSrRS = 0 with that at rLSrRS = ρ2/4. Define maximal relative variation epsilon(n) as

equation M190

It is straightforward to show that

equation M191

where “…” represents terms proportional to ρm, m ≥ 4. In the case of directional selection, γ(k)/(α(k) + ρβ(k)) is of order 1 for all values of s, h, pS(k), and ρ. Therefore, epsilon(n) = O2n), and we conclude that relative variation increases as time passes.

The dynamics of CLR and CLRS are almost (or quasi-) invariant under the constrained S-translation in the following sense: the range of ρ in which selection has observable influence on the dynamics of CLR and CLRS is where equation M192. In that case, it is possible to maintain equation M193 throughout the entire period from the initial generation to the fixation generation. We would then observe almost no variation in CLR or CLRS as the location of the selected locus is varied between the neutral loci. For large ρ, selection has little influence on CLR and CLRS, so their dynamics should be approximately invariant under translation of the selected locus.

An exact symmetry:

Suppose that the selected locus is outside the two neutral loci and that recombination fractions rLS, rRS, rLR, rLRS, rLR,S, rRS,L, and rLS,R appearing in the system of full recursions (1)–(11) are fixed. In what follows, the dominance coefficient h is assumed to be arbitrary. Let {pS(0), pL(0), pR(0), CLS(0), CRS(0), CLR(0), CLRS(0)} and {pS(0), pL(0), pR(0), CLS(0), CRS(0), CLR(0), CLRS(0)} denote two different sets of initial conditions. At generation n > 1, we use “prime” to refer to the allele frequencies and LDs obtained using the second set of initial conditions.

We first consider the second-order LDs involving the selected locus.

Lemma 1. Suppose that CLS(0) = CRS(0) and CRS(0) = CLS(0). Then, for all n ≥ 1,

equation M194

Proof. This result follows from induction on n. Recall that

equation M195

where the function f(r, k) is defined as in (A1). Similarly,

equation M196

If CLS(0) = CRS(0) and CRS(0) = CLS(0), then

equation M197

Suppose that the claim is true for all 1 ≤ nk. Then, for n = k + 1,

equation M198

where the third line follows from the induction hypothesis. [filled square]

Using the above lemma, we can obtain the following result regarding the third-order LD and the LD between the neutral loci:

Proposition 1. Suppose that pS(0) = pS(0), CLS(0) = CRS(0), CRS(0) = CLS(0), CLR(0) = CLR(0), and CLRS(0) = CLRS(0). Then,

equation M199

for all n ≥ 1.

Proof. First, note that pS(0) = pS(0) implies pS(n) = pS(n), equation M200, and equation M201 for all n ≥ 1. Therefore, since CLSCRS = CLSCRS by Lemma 1, we obtain

equation M202

which is equivalent to (10), and

equation M203

which is equivalent to (6). Similarly,

equation M204


equation M205

are equivalent to (11) and (7), respectively. Hence, CLR and CLRS satisfy exactly the same set of recursions as do CLR and CLRS. Since CLR(0) = CLR(0) and CLRS(0) = CLRS(0), it thus follows that CLR(n) = CLR(n) and CLRS(n) = CLRS(n) for all n ≥ 1. [filled square]


  • Aguadé, M., N. Miyashita and C. H. Langley, 1989. Reduced variation in the yellow-achaete-scute region in natural populations of Drosophila melanogaster. Genetics 122: 607–615. [PMC free article] [PubMed]
  • Barton, N. H., and M. Turelli, 1987. Adaptive landscapes, genetic distance and evolution of quantitative characters. Genet. Res. 49: 157–173. [PubMed]
  • Barton, N. H., and M. Turelli, 1991. Natural and sexual selection on many loci. Genetics 127: 229–255. [PMC free article] [PubMed]
  • Begun, D. J., and C. F. Aquadro, 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356: 519–520. [PubMed]
  • Braverman, J. M., R. R. Hudson, N. L. Kaplan, C. H. Langley and W. Stephan, 1995. The hitchhiking effect on the site-frequency spectrum of DNA polymorphism. Genetics 140: 783–796. [PMC free article] [PubMed]
  • Bürger, R., 2000. The Mathematical Theory of Selection, Recombination, and Mutation. John Wiley & Sons, Chichester, UK.
  • Durrett, R., and J. Schweinsberg, 2005. A coalescent model for the effect of advantageous mutations on the genealogy of a population. Stoch. Proc. Appl. 115: 1628–1657.
  • Fisher, R. A., 1930. The Genetical Theory of Natural Selection. Clarendon Press, Oxford.
  • Gillespie, J. H., 1994. The Causes of Molecular Evolution. Oxford University Press, Oxford.
  • Gillespie, J. H., 1997. Junk ain't what junk does: neutral alleles in a selected context. Gene 205: 291–299. [PubMed]
  • Gillespie, J. H., 2000. Genetic drift in an infinite population: the pseudo-hitchhiking model. Genetics 155: 909–919. [PMC free article] [PubMed]
  • Glinka, S., L. Ometto, S. Mousset, W. Stephan and D. De Lorenzo, 2003. Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multilocus approach. Genetics 165: 1269–1278. [PMC free article] [PubMed]
  • Grote, M., W. Klitz and G. Thomson, 1998. Constrained disequilibrium values and hitchhiking in a three-locus system. Genetics 150: 1295–1307. [PMC free article] [PubMed]
  • Haddrill, P. R., K. R. Thornton, B. Charlesworth and P. Andolfatto, 2005. Multilocus patterns of nucleotide variability and the demographic and selection history of Drosophila melanogaster populations. Genome Res. 15: 790–799. [PMC free article] [PubMed]
  • International HapMap Consortium, 2005. A haplotype map of the human genome. Nature 437: 1299–1320. [PMC free article] [PubMed]
  • Kaplan, N. L., R. R. Hudson and C. H. Langley, 1989. The “hitchhiking effect” revisited. Genetics 123: 887–899. [PMC free article] [PubMed]
  • Kim, Y., and R. Nielsen, 2004. Linkage disequilibrium as a signature of selective sweeps. Genetics 167: 1513–1524. [PMC free article] [PubMed]
  • Kim, Y., and W. Stephan, 2002. Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160: 765–777. [PMC free article] [PubMed]
  • Kimura, M., 1983. The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.
  • Langley, C. H., and J. F. Crow, 1974. The direction of linkage disequilibrium. Genetics 78: 937–941. [PMC free article] [PubMed]
  • McVean, G. A. T., S. R. Myers, S. Hunt, P. Deloukas, D. R. Bentley et al., 2004. The fine-scale structure of recombination rate variation in the human genome. Science 304: 581–584. [PubMed]
  • Robinson, W. P., A. Cambon-Thomsen, N. Borot, W. Klitz and G. Thomson, 1991. Selection, hitchhiking and disequilibrium analysis at three linked loci with application to HLA data. Genetics 129: 931–948. [PMC free article] [PubMed]
  • Maynard Smith, J., and J. Haigh, 1974. The hitch-hiking effect of a favourable gene. Genet. Res. 23: 23–35. [PubMed]
  • Stephan, W., and C. H. Langley, 1989. Molecular genetic variation in the centromeric region of the X chromosome in three Drosophila ananassae populations. I. Contrasts between the vermilion and forked loci. Genetics 121: 89–99. [PMC free article] [PubMed]
  • Stephan, W., T. H. E. Wiehe and M. W. Lenz, 1992. The effect of strongly selected substitutions on neutral polymorphism: analytical results based on diffusion theory. Theor. Popul. Biol. 41: 237–254.
  • Thomson, G., 1977. The effect of a selected locus on linked neutral loci. Genetics 85: 753–788. [PMC free article] [PubMed]
  • Turelli, M., and N. H. Barton, 1990. Dynamics of polygenic characters under selection. Theor. Popul. Biol. 38: 1–57.
  • Turelli, M., and N. H. Barton, 1994. Genetic and statistical analyses of strong selection on polygenic traits. What, me normal? Genetics 138: 913–941. [PMC free article] [PubMed]
  • Wright, S., 1931. Evolution in Mendelian populations. Genetics 16: 97–159. [PMC free article] [PubMed]

Articles from Genetics are provided here courtesy of Genetics Society of America
PubReader format: click here to try


Related citations in PubMed

See reviews...See all...

Cited by other articles in PMC

See all...


Recent Activity

Your browsing activity is empty.

Activity recording is turned off.

Turn recording back on

See more...