Copyright © 2007 by the Genetics Society of America
Joint Estimates of Quantitative Trait Locus Effect and Frequency Using Synthetic Recombinant Populations of Drosophila melanogaster
*Department of Ecology and Evolutionary Biology and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas 66045 and †Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
1Corresponding author: Department of Ecology and Evolutionary Biology, University of Kansas, 1030 Haworth Hall, 1200 Sunnyside Ave., Lawrence, KS 66045. E-mail: sjmac/at/ku.edu
Communicating editor: G. C. Gibson
Received December 13, 2006; Accepted April 10, 2007.
Abstract
We develop and implement a strategy to map QTL in two synthetic populations of Drosophila melanogaster each initiated with eight inbred founder strains. These recombinant populations allow simultaneous estimates of QTL location, effect, and frequency. Five X-linked QTL influencing bristle number were resolved to intervals of ~1.3 cM. We confirm previous observations of bristle number QTL distal to 4A at the tip of the chromosome and identify two novel QTL in 7F–8C, an interval that does not include any classic bristle number candidate genes. If QTL at the tip of the X are biallelic they appear to be intermediate in frequency, although there is evidence that these QTL may reside in multiallelic haplotypes. Conversely, the two QTL mapping to the middle of the X chromosome are likely rare: in each case the minor allele is observed in only 1 of the 16 founders. Assuming additivity and biallelism we estimate that identified QTL contribute 1.0 and 8.7%, respectively, to total phenotypic variation in male abdominal and sternopleural bristle number in nature. Models that seek to explain the maintenance of genetic variation make different predictions about the population frequency of QTL alleles. Thus, mapping QTL in eight-way recombinant populations can distinguish between these models.
VARIATION in quantitative, or complex, traits is influenced by numerous genetic loci and by environmental factors. For many complex traits we have estimates of the fraction of phenotypic variation that is due to genetic factors, but we do not have a general understanding of the number, effect, and frequency of the alleles that contribute to phenotypic variation. Are alleles at quantitative trait loci (QTL) generally of large effect, but low in frequency, consistent with models of mutation–selection balance (MSB; reviewed by Johnson and Barton 2005)? Alternatively, is the bulk of standing genetic variation for complex traits due to modest-effect intermediate frequency alleles maintained by some form of balancing selection (reviewed by Barton and Turelli 1989; Barton and Keightley 2002)? In the human genetics community the idea that complex trait variation is due to intermediate frequency polymorphisms has been termed the common disease–common variant (CDCV) hypothesis (Cargill et al. 1999). The distinction between MSB models and balancing selection/CDCV models not only is important for understanding how genetic variation is maintained in populations, but also will affect the power of current population-based approaches to identify risk alleles for human disease (Wang et al. 2005). The most effective way to clarify the contribution of MSB and CDCV forces in maintaining phenotypic variation is to experimentally identify and characterize the underlying molecular genetic basis of several QTL. With this ultimate goal in mind, two non-mutually exclusive experimental programs are predominant in the literature: QTL mapping and association or linkage disequilibrium (LD) mapping. In its simplest form QTL mapping involves crossing a pair of lines that are differentially fixed for alleles at a genomewide set of marker loci and at QTL contributing to the phenotype. 
Genotyping and phenotyping a large number of recombinant progeny from this cross identifies genetic intervals that harbor factors contributing to segregating variation in the cross. Since the publication of influential articles by Paterson et al. (1988) and Lander and Botstein (1989), the community has enjoyed considerable success mapping QTL for a wide range of traits in a diverse set of genetic systems. Typically QTL are resolved to broad intervals of ~10 cM (Mackay 2001), which may represent millions of base pairs. This lack of resolution has hindered identification of the molecular variants involved, particularly in QTL mapping studies of intraspecific variation where QTL can have subtle effects. Physically close genetic factors also pose a problem for QTL mapping, as it may be impossible to accurately estimate the effects and locations of linked QTL, and the number of QTL may be underestimated (Wright and Kong 1997; Cornforth and Long 2003). Additionally, since recombinant individuals for QTL mapping are generally derived from a pair of inbred parental lines, only QTL that segregate between the parents can be identified. As a result there is no way to know the population frequency of mapped QTL. Association mapping is a population-based genetic mapping strategy. The approach involves genotyping a large number of single nucleotide polymorphisms (SNPs) in a large sample of individuals and at each marker testing for an association between genotype and phenotype. A strong association signal at a SNP suggests either that the SNP itself contributes to trait variation or that the causal site is in strong LD with the SNP marker genotyped. Instead of relying on meiotic recombination in experimental crosses, association mapping utilizes the pattern of historical recombination in a panel of natural chromosomes. 
Thus, association mapping has the potential for much higher resolution than QTL mapping, and in principle the actual quantitative trait nucleotide (QTN) can be identified and its effect and frequency estimated directly. In practice, association mapping has met with modest success, and the literature is rife with failures to replicate published associations (although see Todd 2006 for a positive view of the future). This reflects a variety of factors, such as cryptic population structure, different patterns of LD or genetic heterogeneity in different populations, or simply insufficient power to detect variants with only subtle effects (Kruglyak 1999; Long and Langley 1999). Association mapping can be effective only when the density of genotyped SNPs is sufficiently high that real associations are not missed (Risch and Merikangas 1996). Since powerful genomewide association studies are tremendously difficult to carry out, even in humans where resources are considerable (Hirschhorn and Daly 2005; Wang et al. 2005), researchers have elected to carry out localized mapping on candidate gene regions (e.g., Genissel et al. 2004; Palsson and Gibson 2004; Macdonald et al. 2005a). Such a strategy will fail if the presumed candidate does not actually contribute to trait variation (e.g., Florez et al. 2006). Finally, an aspect of association mapping that is often overlooked is that if much of the genetic variation underlying complex traits is due to rare variants of large effect (as predicted by MSB models) the association mapping paradigm is not very powerful at all, and is almost guaranteed to fail (Weiss and Terwilliger 2000; Pritchard 2001; Reich and Lander 2001; Pritchard and Cox 2002). It is quite clear that both QTL and association mapping approaches, while powerful in many respects, suffer from distinct drawbacks that prevent the routine identification and characterization of QTN. 
To make the dissection of complex traits more routine we require a methodology that has some of the resolution of association mapping, combined with the power of QTL mapping to identify factors on a genomewide scale. To determine if standing variation is generally consistent with MSB or CDCV models a method allowing for direct estimation of the population frequency of mapped factors is highly desirable. An ideal methodology would also provide some mechanism with which to identify the precise molecular variants involved. In this study we describe a mapping scheme that allows joint estimates of QTL effects and frequencies from a recombinant panel derived from multiple founder chromosomes. Conceptually, our approach is similar to the mouse “Collaborative Cross” scheme envisioned by the Complex Trait Consortium (Threadgill et al. 2002; Churchill et al. 2004) and has parallels with the “heterogeneous stock” strategy (Talbot et al. 1999; Mott et al. 2000; Demarest et al. 2001) most recently used by Valdar et al. (2006b) to map QTL for 97 traits in mice. We take two independent sets of eight inbred Drosophila melanogaster lines, and from each set initiate a recombinant population. The genetic material for each synthetic population is thus derived from just eight founders, and after multiple generations of maintenance the genome of each recombinant individual is a mosaic of the founder chromosomes (Figure 1
MATERIALS AND METHODS
D. melanogaster stocks: All 16 wild-type D. melanogaster lines used to found the synthetic populations (Table 1) have been examined for both PM and IR dysgenesis and were shown to be MI (Kidwell et al. 1983). We also made use of the strain of D. melanogaster used for genome sequencing, the “sequenced strain” (Bloomington Drosophila Stock Center no. 2057, Adams et al. 2000; Celniker et al. 2002), which has the M cytotype. We further verified that all lines are free of P elements using a PCR-based transposon-display assay (details available on request). After founding the synthetic populations we found that lines A7 and B8 (Table 1) were genetically indistinguishable on the basis of the X-linked markers we describe here. This likely represents an error that occurred at the stock center.
Synthetic recombinant populations: Four synthetic populations were created: population A replicates 1 and 2 (pAr1, pAr2) were initiated from lines A1–A8, and population B replicates 1 and 2 (pBr1, pBr2) were initiated from lines B1–B8 (Table 1 and Figure 1 Experimental flies: Figure 2
Coarse mapping: Virgin females were collected from each of the four synthetic populations and aged in groups of 50 in vials for 2–5 days. Twelve aged virgin females from a given population were crossed to 12 males from the sequenced strain in vials. Multiple replicate vials were created, and from each vial 4 male and 4 female offspring were used for phenotyping and genotyping. The experimental flies are thus F1 progeny of a recombinant female and an isogenic male. The coarse-mapping experiment was split into four blocks: in block 1 (generation G16 of the synthetic populations) 24 vials were set up for each of the four populations (pAr1, pAr2, pBr1, and pBr2), and in blocks 2–4 (generations G17–G19) 36 vials were set up for each population. This resulted in a total of 528 male and female experimental flies collected for each population. For each fly two phenotypic measurements were taken: sternopleural bristle number (SBN) is the sum of the number of macro- and microchaetae on the left and right sternopleural plates, and abdominal bristle number (ABN) is the number of microchaetae on the most posterior sternite, corresponding to segment six of females and segment five of males. A subset of the coarse-mapping experimental flies was tested for the presence of P elements using a transposon-display assay. All flies should be P free. We found that flies derived from population pAr2 showed P elements, implying that pAr2 was contaminated. This population was destroyed, and experimental flies from this population are not considered further. Fine mapping: Virgin females were collected from synthetic populations pAr1 and pBr1 and aged as for the coarse-mapping experiment. Multiple vial crosses were set up between 10 aged virgin females and 10 sequenced-strain males, and from each vial 4 male offspring were used for phenotyping and genotyping. The fine-mapping experiment was split into two blocks. 
In block 1 (generation G55) 144 vials were set up for each of the two populations pAr1 and pBr1, and in block 2 (generation G56) 120 vials were set up for each population. This resulted in a total of 1056 male experimental progeny collected for the pAr1 and pBr1 synthetic populations. Populations pAr1 and pBr1 were shown to be free of P elements at generation G52, just prior to beginning the fine-mapping experiment. Molecular marker development: We sought to identify 1-kb sequence fragments harboring several polymorphisms that collectively distinguish the founders (Figure 2
Genotyping: Following phenotyping, experimental flies were deposited directly into 96-well plates on ice. We also collected 12 female flies from each of the 16 lines used to found the synthetic populations and multiple females from the sequenced strain. Subsequently, DNA from all flies was extracted in 96-well format (described in Gruber et al. 2007), and diluted DNA was aliquoted into 384-well plates and dried down in preparation for PCR. Together with blanks and various control samples, the coarse-mapping DNA panel consisted of 12 384-well plates, and the fine-mapping DNA panel consisted of 6 384-well plates. The entire coarse-mapping (fine-mapping) DNA panel was PCR amplified for the appropriate 12 (17) 1-kb amplicons in standard 5-μl PCR reactions. These PCR products were pooled in groups of two or three and used as a template for multiplex genotyping of SNPs contained within the fragments. Macdonald et al. (2005b) provides full details of this genotyping methodology. The genotype data were processed using custom routines implemented in the statistical programming language R (http://www.R-project.org). First, we ensured that none of the SNPs genotyped segregated within the sequenced strain. Next, for each of the experimental flies we found the maternally inherited haplotype from the synthetic recombinant population. No change to the genotyping data from males is required, since all SNPs are X linked and Drosophila males have a hemizygous X. Experimental females have both a paternally inherited sequenced-strain X and a maternally inherited recombinant X. Because the sequenced strain is isogenic, the haplotype of the recombinant chromosome for each experimental female can be obtained by subtraction. For example, if the sequenced strain is abc, and we observe an experimental female genotype of aaBbCc, we know the inherited recombinant maternal chromosome is aBC. Thus, the maternally inherited recombinant haplotype can always be unambiguously defined. 
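The subtraction step for experimental females can be illustrated with a minimal Python sketch. It is not the authors' R routine: the function name `maternal_haplotype` and the string encoding of genotypes (two allele characters per locus, in locus order) are assumptions made only for this illustration.

```python
def maternal_haplotype(female_genotype, paternal_haplotype):
    """Recover the maternally inherited haplotype by subtraction.

    female_genotype: unphased diploid genotype, two characters per locus
                     (hypothetical encoding, e.g. "aaBbCc").
    paternal_haplotype: haplotype of the isogenic sequenced strain,
                     one character per locus (e.g. "abc").
    """
    if len(female_genotype) != 2 * len(paternal_haplotype):
        raise ValueError("expected two alleles per locus")
    maternal = []
    for locus, paternal_allele in enumerate(paternal_haplotype):
        pair = list(female_genotype[2 * locus: 2 * locus + 2])
        if paternal_allele not in pair:
            raise ValueError(f"paternal allele missing at locus {locus}")
        # Remove one copy of the paternal allele; what remains is maternal.
        pair.remove(paternal_allele)
        maternal.append(pair[0])
    return "".join(maternal)


# The example from the text: sequenced strain abc, female genotype aaBbCc.
print(maternal_haplotype("aaBbCc", "abc"))  # aBC
```

Because the paternal X is known exactly, the remaining allele at each locus is determined even when the female is homozygous at that locus.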
The next step is to transform the haplotype data from the experimental individuals into a three-dimensional matrix, G, where $G_{imk}$ takes a binary value describing whether the observed maternal haplotype for individual i at marker m is consistent with the haplotype of founder k (k = 1, 2, …, 8); i.e., $G_{imk} = 1$ if the haplotype is compatible with that of the kth founder, and $G_{imk} = 0$ otherwise. Using the data from the 12 females genotyped for each founder line, we can list all of the multilocus haplotypes present for each founder and marker. Generally the founder lines are completely inbred, although there is some residual heterozygosity and more than one haplotype can be present within a line at a given marker. Also, founders are not always unique at every marker, and missing data are unavoidable with a project on this scale. Typically we find that markers are not fully informative and fail to distinguish all eight possible founder chromosomes for one or both synthetic populations. Each test individual/marker combination is coded as follows. Consider that the marker haplotypes for the eight founder lines are (1) ABC, (2) AbC, (3) ABc, (4) aBC, (5) AbC, (6) aBc, (7) ABC, and (8) Abc (in this example founders 1 and 7, and 2 and 5, are indistinguishable). If an experimental fly is aBc it must have the chromosome from founder 6 and is coded as $2^{(6-1)} = 32$. Alternatively, if the experimental individual is found to be ABC it might equally be derived from founders 1 or 7 and is assigned the value $2^{(1-1)} + 2^{(7-1)} = 65$. Finally, a haplotype with missing data, ?B?, is compatible with founders 1, 3, 4, 6, and 7, and is assigned the value $2^{(1-1)} + 2^{(3-1)} + 2^{(4-1)} + 2^{(6-1)} + 2^{(7-1)} = 109$. By extension, an experimental individual is assigned a value of 1–255 for each marker, precisely defining the potential ancestry of the chromosomal segment. 
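The coding scheme amounts to an 8-bit mask with one bit per founder. The following Python sketch reproduces the three worked values above; the names `FOUNDERS`, `ancestry_code`, and `decode` are hypothetical, and "?" is assumed to mark a missing SNP call.

```python
# Founder haplotypes from the worked example (founders 1..8, in order).
FOUNDERS = ["ABC", "AbC", "ABc", "aBC", "AbC", "aBc", "ABC", "Abc"]


def compatible(observed, founder):
    """True if every non-missing allele matches the founder haplotype."""
    return all(o == "?" or o == f for o, f in zip(observed, founder))


def ancestry_code(observed, founders=FOUNDERS):
    """Sum 2**(k-1) over every founder k compatible with the observation."""
    return sum(2 ** idx for idx, f in enumerate(founders)
               if compatible(observed, f))


def decode(code, n_founders=8):
    """Return the 1-based founders whose bit is set in the code."""
    return [idx + 1 for idx in range(n_founders) if (code >> idx) & 1]


print(ancestry_code("aBc"))   # 32
print(ancestry_code("ABC"))   # 65
print(ancestry_code("?B?"))   # 109
print(decode(109))            # [1, 3, 4, 6, 7]
```

In this sketch a haplotype that matches no founder would receive the code 0, which falls outside the 1–255 range described in the text and could be used to flag likely genotyping errors.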
Using this coding scheme the raw three-dimensional data matrix, G, can be alternatively represented as a two-dimensional matrix, C, with $C_{ij}$ (the code for the ith individual at the jth position) taking an integer value between 1 and 255. We provide C, along with the corresponding bristle phenotypes, as supplemental material on the Genetics website (http://www.genetics.org/supplemental).
Statistical platform: Data analysis consists of three steps, and the statistical machinery is implemented as a series of functions in the statistical programming language R, expanding on the R/qtl package (http://www.rqtl.org; Broman et al. 2003). First, we consider a 1-cM grid along the chromosome and calculate the probability $p_{ijk}$ that individual i carries founder allele k at position j, given the available genotype data, G. This is done using the standard hidden Markov model (HMM) technology of Baum et al. (1970), first applied in a genetics context by Lander and Green (1987) and adapted to allow for genotyping errors by Lincoln and Lander (1992). The observed data, G, are viewed as marker “phenotypes” that are possibly subject to error. The true underlying genotypes are assumed to follow a Markov chain, with each of the eight possible founder alleles being equally likely. For any two positions, the probability of a transition from founder allele $k_1$ to founder allele $k_2$ is $r/7$ if $k_1 \neq k_2$ (recombination in the interval) and $1 - r$ if $k_1 = k_2$ (no recombination). Here, r is analogous to the recombination fraction for the interval, but represents recombination events from multiple generations and is estimated from the data. The observed marker genotype at a locus is assumed to be compatible with the true underlying genotype with probability $1 - \varepsilon$, where ε is the genotyping error rate. A readable tutorial on implementing the HMM is provided by Broman (2006). 
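A forward-backward pass of this kind can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the authors' R/qtl-based code: posteriors are computed at marker positions only (not on the 1-cM grid), the r values are supplied rather than estimated, and the emission term is taken to be 1 − ε for founders whose bit is set in the ancestry code and ε otherwise. The function name `founder_posteriors` is hypothetical.

```python
def founder_posteriors(codes, rec_fracs, eps=0.01, n_founders=8):
    """Posterior founder probabilities at each marker for one chromosome.

    codes: ancestry bitmask (1-255) per marker.
    rec_fracs: r for each interval between adjacent markers
               (length len(codes) - 1); assumed known here.
    """
    K = n_founders

    def emit(code):
        # Assumed emission model: compatible founders get 1 - eps.
        return [(1 - eps) if (code >> k) & 1 else eps for k in range(K)]

    def step(vec, r):
        # Stay with probability 1 - r, move to each other founder with r/7.
        # The transition matrix is symmetric, so the same update serves
        # the forward and the backward recursion.
        total = sum(vec)
        return [(1 - r) * v + (r / (K - 1)) * (total - v) for v in vec]

    def normalise(vec):
        s = sum(vec)
        return [v / s for v in vec]

    M = len(codes)
    fwd = [None] * M
    fwd[0] = normalise([e / K for e in emit(codes[0])])  # uniform prior
    for m in range(1, M):
        pred = step(fwd[m - 1], rec_fracs[m - 1])
        fwd[m] = normalise([p * e for p, e in zip(pred, emit(codes[m]))])

    bwd = [None] * M
    bwd[M - 1] = [1.0] * K
    for m in range(M - 2, -1, -1):
        weighted = [b * e for b, e in zip(bwd[m + 1], emit(codes[m + 1]))]
        bwd[m] = normalise(step(weighted, rec_fracs[m]))

    return [normalise([f * b for f, b in zip(fwd[m], bwd[m])])
            for m in range(M)]


# A missing marker (code 255) flanked by two markers pointing to founder 6.
post = founder_posteriors([32, 255, 32], [0.1, 0.1])
print([round(p, 3) for p in post[1]])
```

In the usage example the uninformative middle marker inherits most of its probability mass from the flanking markers, which is the behaviour that lets partially informative markers still resolve ancestry.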
The information content of the available marker genotype data may be measured by the proportion of missing information, which we take to be $H_j = -\sum_i \sum_k p_{ijk} \log p_{ijk} / (n \log 8)$, where n is the number of individuals. The second step is to fit a model relating phenotype to genotype. Initially, at the jth position we calculate the average phenotype by founder genotype (with the ith individual's phenotypic contribution to the mean of the kth founder chromosome weighted by the $p_{ij}$'s) and sort these eight means from smallest to largest. We then fit a maximum of seven linear models to the data at each position: model 1 tests the difference between founder material with the smallest mean against all others, model 2 tests the difference between the pair of founders with the two smallest means against all others, and so on. For each model, we create a regressor variable for individual i at position j that is the sum of the elements of $p_{ij}$ associated with these contrasts. The test is accomplished by regressing phenotypes on this regressor variable, with the additional constraint that the sum (over individuals) of the regressor variable must be >50. The resulting LOD score at position j uses a model of all eight founders having the same mean as a null and accepts the above contrast with the maximal likelihood as the alternate. Implicit in this analysis is the idea that there is a single biallelic QTL at some position on the chromosome that is segregating among the eight founder chromosomes and that some optimal partitioning of the founders can be used to identify that QTL. We note that the LOD scores resulting from our approach are strongly correlated with the F-statistic obtained from a multiple regression of phenotype onto the $p_j$'s at each position over the X chromosome. 
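The information measure and the ordered-contrast scan at a single position can be sketched as below. Several details are assumptions made for illustration: the LOD is computed as (n/2) log10(RSS0/RSS1) from a simple linear regression, the regressor sums the probabilities of the founders in the low-mean group, and the function names are hypothetical. The entropy is written with a leading minus sign so that H runs from 0 (complete information) to 1 (no information).

```python
import math


def missing_information(P):
    """Proportion of missing information at one position.

    P: list of per-individual founder probability vectors (rows sum to 1).
    """
    n, K = len(P), len(P[0])
    entropy = -sum(p * math.log(p) for row in P for p in row if p > 0)
    return entropy / (n * math.log(K))


def best_contrast(P, y, min_weight=50.0):
    """Return (LOD, low_group) for the best ordered partition of founders.

    low_group holds 0-based founder indices assigned to the low class.
    """
    n, K = len(y), len(P[0])
    # Probability-weighted mean phenotype for each founder.
    means = []
    for k in range(K):
        w = sum(row[k] for row in P)
        wy = sum(row[k] * yi for row, yi in zip(P, y))
        means.append(wy / w if w > 0 else float("inf"))
    order = sorted(range(K), key=lambda k: means[k])

    ybar = sum(y) / n
    rss0 = sum((yi - ybar) ** 2 for yi in y)  # null: one common mean
    best_lod, best_group = 0.0, None
    if rss0 == 0:
        return best_lod, best_group
    for m in range(1, K):  # at most seven contrasts
        low = order[:m]
        x = [sum(row[k] for k in low) for row in P]
        if sum(x) <= min_weight:  # constraint stated in the text
            continue
        xbar = sum(x) / n
        sxx = sum((xi - xbar) ** 2 for xi in x)
        if sxx == 0:
            continue
        sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        rss1 = rss0 - sxy ** 2 / sxx
        lod = float("inf") if rss1 <= 0 else (n / 2) * math.log10(rss0 / rss1)
        if lod > best_lod:
            best_lod, best_group = lod, low
    return best_lod, best_group
```

Sorting the founder means first reduces the search from all 127 possible two-group partitions of eight founders to at most seven ordered ones, which is the design choice the text describes.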
In the simulations the correlation between the LOD scores and F-statistics is generally >99%, and across all of the experimental panels (both sexes, both traits, both synthetic populations, and both the coarse and fine mapping) the correlation is 97.2%. The third step of the data analysis is then to estimate the probability that each of the eight founder chromosomes harbors the high, or Q, QTL allele at position j ($p_{Qk}$'s) for the model implied by the best partitioning of the founders. This is simply the probability of observing each of the eight founder means given the estimated slope and intercept of that model, conditional on each founder harboring the high QTL. After all three steps are complete we obtain LOD scores and phenotypic effects at J positions in the genome and J corresponding $p_Q$'s. Our conservative estimate of the frequency of a QTL located at a local maximum in the LOD profile is the number of elements of $p_Q \geq 0.95$ divided by the number of elements of $p_Q \geq 0.95$ or $p_Q \leq 0.05$ (i.e., we ignore founder lines that do not allow for an accurate estimation of “phase”).
Variation due to QTL: Estimates of QTL effect and frequency can be derived from eight-way synthetic populations, and we can use these values to estimate the fraction of segregating variation, $V_a$, due to identified QTL. We can estimate this both in our (effectively haploid) mapping population as $V_a = pq\alpha^2$, and in a natural, outbred diploid population under additivity as $V_a = 2pq\alpha^2$, where p and q are the allele frequencies and α is the effect of the QTL (Falconer and Mackay 1996, p. 126). In both our mapping population and a natural population, male QTL on the hemizygous X chromosome have $V_a = pq\alpha^2$. We can place a 95% confidence interval on $V_a$ using Monte Carlo simulation. For α this is accomplished by drawing 10,000 random samples from a normal distribution with mean equal to the observed effect of the QTL and standard deviation equal to the observed standard error on the QTL effect. 
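The frequency estimator and the variance calculation can be illustrated with a short Python sketch. It is a simplified illustration, not the authors' code: the function names are hypothetical, and the Monte Carlo interval treats the allele frequency p as fixed and resamples only α, taking the 2.5th and 97.5th percentiles of the sorted draws.

```python
import random


def qtl_frequency(pQ, hi=0.95, lo=0.05):
    """Fraction of confidently phased founders carrying the high allele."""
    high = sum(1 for p in pQ if p >= hi)
    low = sum(1 for p in pQ if p <= lo)
    if high + low == 0:
        raise ValueError("no founder could be phased")
    return high / (high + low)


def additive_variance(p, alpha, diploid=False):
    """V_a = p*q*alpha^2 (haploid or hemizygous), doubled for diploids."""
    va = p * (1.0 - p) * alpha ** 2
    return 2.0 * va if diploid else va


def va_interval(p, alpha, alpha_se, n_draws=10_000, seed=1):
    """95% interval on V_a from normal draws of alpha (p held fixed)."""
    rng = random.Random(seed)
    draws = sorted(additive_variance(p, rng.gauss(alpha, alpha_se))
                   for _ in range(n_draws))
    lower = draws[n_draws * 25 // 1000 - 1]    # 2.5th percentile
    upper = draws[n_draws * 975 // 1000 - 1]   # 97.5th percentile
    return lower, upper


# Two founders confidently high, five confidently low, one ambiguous.
print(qtl_frequency([0.99, 0.98, 0.5, 0.01, 0.02, 0.0, 0.03, 0.04]))
print(additive_variance(0.5, 1.0))                 # 0.25
print(additive_variance(0.5, 1.0, diploid=True))   # 0.5
```

Dividing any of these variance values by the observed phenotypic variance would express them as a fraction of total variation.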
We estimate the allele frequency, p, differently depending on whether we wish to estimate the variance due to the QTL within our mapping population, or in a natural population. Allele frequency, p, in the mapping population is simply the observed QTL frequency. To estimate allele frequencies of mapped factors in natural populations we draw samples from an allele frequency distribution, whose derivation is conditional on the fact that we observe i copies of a QTL allele among N founder chromosomes. Under neutrality the distribution of allele frequencies is described by Wright–Fisher sampling as […] 1, and therefore θ has little effect on the shape of pr(x;i,N), and second, for large N, and i not close to one or N, pr(x;i,N) is approximately a binomial distribution, and the “prior” assumption of neutrality has little weight. In a natural population, for any given QTL, we assume D. melanogaster θ = 0.006 (averaged over 98 loci collated in Presgraves 2005) and use “rejection sampling” (Press et al. 1996) to draw 10,000 random deviates from pr(x;i,N) to represent allele frequencies. For each pair of simulated α/p estimates we calculate $V_a$ as above. The 95% confidence interval on $V_a$ is taken as the 250th and 9750th elements of the sorted vector of $V_a$ estimates. These values can be transformed to a percentage of the total bristle number variation explained by the QTL by dividing by the observed phenotypic variance.
RESULTS
We develop synthetic recombinant populations, each derived from eight inbred lines of D. melanogaster allowed to recombine at large population size for many generations. We use these populations to map bristle number QTL segregating on the D. melanogaster X chromosome. The mapping strategy we employ relies on the ability to take a recombinant individual, and specify which of the eight founders contributed each segment of the genome. 
Since we require haplotypic information for the recombinant chromosomes, all experimental individuals are the progeny of crosses between recombinant females and males from the isogenic sequenced strain of D. melanogaster (Figure 2 Simulations: We carried out simulations to assess our ability to accurately map QTL and jointly estimate their effect and frequency, and used parameters (chromosome size, marker density, marker informativeness) that realistically mimic the experimental data we collected. We sampled 1152 recombinants from an eight-way synthetic population 16 generations after founding to simulate the chromosome scan, and 56 generations after founding to simulate fine mapping. In each case we assume recombination occurs only in females. At the test generation (G16 or G56) recombinant individuals were created by concatenating chromosomal fragments derived from each of the eight founders with equal probability. Fragment lengths were drawn from an exponential distribution with mean 100/(16/2) or 100/(56/2) cM for the coarse and fine mapping, respectively. For the coarse mapping we simulated 12 partially informative markers equally spaced along a 66 cM chromosome and 5% missing data. For the fine-mapping simulation the 12 markers were placed in a more focused 10 cM region. For simplicity we assume the same level of informativeness at each marker, with four segregating haplotypes that group the founder lines as follows: haplotype 1 (three founders), haplotype 2 (two founders), haplotype 3 (two founders), haplotype 4 (single founder). The separation of founders into different haplotypes was random across markers. Finally, we place a biallelic QTL accounting for 5% of the total phenotypic variation at a random position within the mapping region, with the number of founders having the Q allele varied between one and four out of eight. Five-hundred realizations of each simulation were performed. 
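The way a simulated recombinant chromosome is assembled can be sketched in Python as follows. This is a minimal illustration of the fragment-concatenation step only (no markers, missing data, or QTL); the function names are hypothetical, and founders are labelled 1 to 8.

```python
import random


def simulate_mosaic(generations, length_cm=66.0, n_founders=8, rng=None):
    """Build one recombinant chromosome as (start, end, founder) segments.

    Fragment lengths are exponential with mean 100 / (generations / 2) cM,
    and each fragment is assigned to a founder with equal probability.
    """
    rng = rng or random.Random()
    mean_len = 100.0 / (generations / 2.0)
    segments, pos = [], 0.0
    while pos < length_cm:
        end = min(pos + rng.expovariate(1.0 / mean_len), length_cm)
        segments.append((pos, end, rng.randrange(n_founders) + 1))
        pos = end
    return segments


def founder_at(segments, position_cm):
    """Founder of origin at a map position on a simulated chromosome."""
    for start, end, founder in segments:
        if start <= position_cm <= end:
            return founder
    raise ValueError("position outside the simulated chromosome")


rng = random.Random(0)
coarse = simulate_mosaic(16, rng=rng)  # mean fragment 12.5 cM
fine = simulate_mosaic(56, rng=rng)    # mean fragment about 3.6 cM
print(len(coarse), len(fine))
```

The shorter mean fragment length at generation 56 is what expands the genetic map and narrows the mapped intervals in the fine-mapping simulations.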
The probability of observing a peak in the LOD score >4 is ≥99%, with an expected maximum LOD score of ~9.4 and ~11.4 for the coarse- and fine-mapping simulations, respectively. For those peaks associated with a LOD score >4, a 2.5-LOD drop from the maximum includes the simulated position of the QTL >99% of the time. On average, a 2.5-LOD drop maps a significant QTL to a 13.2 cM window with a standard deviation of 6.1 cM (coarse mapping) or a 2.3 cM window with a standard deviation of 0.9 cM (fine mapping). When the LOD score is >4, in no case do we incorrectly infer the “phase” of the QTL, and phase is assigned for an average of 7.8/8 founders. The simulated frequency of the QTL does not appear to affect the probability of inferring the allelic state of the QTL, the power to map a QTL, the average maximum LOD score, or the accuracy in localizing QTL. This is perhaps not surprising given that the simulations hold the proportion of variance attributable to the QTL constant at 5% (Long and Langley 1999). With the same simulations, but no QTL, the false positive rate at a LOD of four is 2 and 1.6% for the coarse- and fine-mapping simulations, respectively. With our current recombinant panel, marker density, and marker informativeness we can map QTL to the eight founder chromosomes in each of the synthetic recombinant populations. Additional simulations suggest that reducing sample size, marker density, or marker informativeness is detrimental. Marker informativeness: Ideally, every marker (a 1-kb fragment genotyped for several SNPs) would completely distinguish among all eight founders in both the pA and pB synthetic populations. In our experimental data this is typically not the case and markers are not fully informative. In fact, it is frequently not possible to distinguish among the eight founders within either population based on the DNA sequence of the entire 1-kb marker amplicon. 
For those 11 markers for which we had access to sequence from all founders, the average number of distinguishable haplotypes is 6.5/8. This is likely an overestimate of the number of distinguishable founder haplotypes for any arbitrary 1-kb region of the Drosophila genome, as a number of potential markers were sequenced and discarded due to a lack of polymorphism (data not shown). As with any “haplotype tagging” strategy, the SNP genotyping approach we employ further reduces the number of distinguishable haplotypes, both because we do not genotype all available SNPs, and because a proportion of the developed genotyping assays failed ( The inbred founder lines used to derive the synthetic populations are not isogenic, and 28/384 (7.3%) independent marker/founder combinations show heterozygosity. The heterozygosity is not localized to any particular marker as 17/24 markers show at least one heterozygous line. Half of the 16 founders show no evidence for heterozygosity, while 3 of the lines (A1, B3, and B7) are heterozygous at multiple amplicons. This trio of lines collectively contributes to 23/28 (82.1%) of the heterozygous marker/founder combinations, implying they are less well inbred than the remaining 13 lines. It is of interest that all 16 founder lines were maintained in stock centers at small effective population sizes for >40 years (without being contaminated by P-element-harboring flies). The observation that these lines are not completely homozygous suggests a relatively high rate of tightly linked deleterious alleles in trans. The HMM employs the genotype data to infer (for every individual and every position) the probability that the chromosomal segment is derived from each of the eight founders. Founder assignment becomes more accurate as the information level in the genotype data increases. 
We can visualize spatial variation in the information level by color coding (by founder of origin) those chromosomal segments inferred to come from a single founder with a probability >75%. Figure 3
We can examine marker informativeness more quantitatively using the measure H to estimate the proportion of missing genotypic information (H = 0, complete information; H = 1, no information). Figure 4, E and F
Phenotypes of synthetic populations: We scored two bristle phenotypes per experimental fly—abdominal bristle number (ABN) and sternopleural bristle number (SBN). Within each population (pAr1, pBr1, and pBr2), mapping generation (coarse and fine mapping), sex, and phenotype the bristle count distributions are approximately normal, similar to those measured in large outbred cohorts of flies sampled directly from nature (Genissel et al. 2004; Macdonald and Long 2004; Macdonald et al. 2005a). Table 3 presents phenotype means and standard deviations for all sets of flies examined in this study. We note that panels pBr1 and pBr2 are very similar for both sexes and bristle counts, and that flies from pAr1 have more abdominal and sternopleural bristles than flies from either pB population. On average, pAr1 flies have 0.5–1.1 more bristles than pB flies (Table 3). A difference in body size between the pA and pB panels may contribute to this pattern. The within-population phenotype means, and more importantly variances, do not change over time, and values are consistent between the coarse- and fine-mapping studies. Finally, we note that the within-panel/sex/trait phenotypic variances we observe are lower than variances observed for the same traits in two wild-caught D. melanogaster cohorts (Genissel et al. 2004; Macdonald and Long 2004; Macdonald et al. 2005a). This is presumably because each of the phenotyped flies in this study harbors a common set of isogenic, paternally derived chromosomes, and flies were reared under a controlled laboratory environment.
Position and effect of X-linked bristle number QTL: Coarse scan of the X chromosome: Initially we conducted a coarse scan of the entire X chromosome for QTL for two bristle traits for both sexes. For the coarse mapping we collected ~500 experimental flies of each sex from the populations pAr1, pBr1, and pBr2 (population pAr2 became contaminated during maintenance and was destroyed). Experimental individuals from the replicate populations pBr1 and pBr2 were pooled, and we refer to this pooled sample as pBr1+2. Comparison of the data from pBr1 and pBr2 alone with that from the pooled sample does not reveal any obvious inconsistencies. Since the sample size of population pAr1 is around half the size used in our simulations, we likely have reduced power to detect QTL in the pAr1 coarse-mapping sample. We only consider QTL to be present when the peak in the likelihood profile is >4-LOD. The likelihood profiles for the coarse-mapping samples shown in Figure 4 (A–D)
The largest bristle number QTL identified in the coarse-mapping scan is for male ABN in pBr1+2 (Table 4, Figure 4D). The coarse-mapped QTL are resolved to intervals averaging 8.3 cM (~4 Mb). We elected to fine map two interesting QTL regions (Figure 4, C and D).
X-tip fine mapping: The QTL for male SBN coarse mapped to the tip of the X chromosome in population pAr1 replicates in the fine-mapping experiment (QTL1 in Figure 5A). The two best bristle number candidate genes in the fine-mapped X-tip region are the achaete-scute complex, ASC, at cytological position 1A6, and Notch at 3C7-3C9. In Figure 5 (A and C)
X-middle fine mapping: The coarse mapping revealed a strong QTL for male ABN in the middle of the X chromosome in the pBr1+2 population and a suggestive peak (LOD < 4) for male SBN in a similar position (Figure 4D).
Frequency of X-linked bristle number QTL: Since the synthetic recombinant populations we employ are derived from multiple inbred lines, it is possible to estimate the phenotypic mean for each founder at every position along the chromosome. In turn, under the assumption that an identified QTL is biallelic, founders can be probabilistically assigned to “high” or “low” QTL allele classes. This permits an estimate of the frequency of the QTL. Figure 6
There are marked similarities in the overall pattern of estimated founder phenotype means in the coarse- and fine-mapping experiments. For instance, for QTL1 (Figure 6A Of all the QTL, QTL4 for male ABN was fine mapped to the smallest region (0.9 cM), and for this QTL the pattern of founder means changes between the coarse and fine mapping (Figure 6D). Both QTL4 and QTL5 appear rare in population pB, with the minor allele present in Given that QTL4 and QTL5 reside in very small, overlapping intervals, one might conclude that we have mapped a single pleiotropic QTL contributing to variation in both male ABN and male SBN. Figure 6 (D and E)
DISCUSSION
Capturing experimental reality by simulation: We performed simulations to examine our ability to map and characterize QTL in eight-way recombinant populations. Our intent was not to fully explore the parameter space, but rather to inform our experimental work to ensure we carried out a study of sufficient power. Results suggest that we have considerable power to detect QTL contributing 5% to variation in phenotype with the sample sizes and scale of genotyping we eventually employed. Furthermore, the false positive rate is very low with the critical LOD threshold applied. As with any simulation approach, we make various simplifying assumptions. Of potential concern is that we simulated just one QTL on the chromosome. In reality, there could be interference from linked QTL that may affect both our ability to detect QTL and to estimate founder phenotype means. Many QTL mapping algorithms have agreeable properties in the absence of “traffic” from nearby QTL, but are prone to errors in inference with linked QTL (Wright and Kong 1997; Cornforth and Long 2003). An important feature of the recombinant populations we employ is that the negative effects of “traffic” on mapping inference are evaded by genetic map expansion rather than by some form of statistical correction.
Fine-mapping QTL should eliminate any problems associated with other nearby factors, implying that our method can ultimately cope with problems arising from linked QTL. Nevertheless, one could envisage scenarios under which linked factors might prevent initial QTL detection in a coarse scan of the genome. In the simulations we also assume that the recombinant population is not subject to drift or selection, and that the expected frequency of genetic material derived from each founder at every point along the chromosome is
Information content of markers: Each marker is composed of a set of genotyped SNPs within a 1-kb PCR amplicon and has the potential to completely distinguish among a set of eight chromosomes. In practice, we find that developed markers are not completely informative. This is the combined result of marker sequence identity among two or more founders, genotyping only a subset of the available SNPs, genotyping assay failure, and residual segregating variation within founders. Although the markers are not fully informative, we have power to detect QTL because the HMM employed incorporates data from linked markers (Broman 2005). Unlinked markers only provide information on the specific marked segment of the chromosome, whereas a set of linked markers provides information across the linkage group. The level of information increases with marker density (relative to the average distance between recombination breakpoints) even if the markers remain only partially informative. By extension, instead of attempting to develop highly informative markers, it is possible to apply the HMM to a relatively dense genomewide set of genotyped biallelic SNPs. Future studies of eight-way recombinant Drosophila populations could take advantage of this possibility, but such an approach awaits the development of a genomewide bank of intermediate-frequency SNPs for D. melanogaster, as well as some means of inexpensively genotyping those SNPs.
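To make concrete why linked, partially informative markers can still resolve founder ancestry, the following is a minimal sketch of a forward-backward calculation on a single haploid chromosome. It is not the HMM of Broman (2005) used in the study: the uniform founder prior, the simple "stay or switch" transition model, the genotyping error rate `err`, and all function and variable names are illustrative assumptions.

```python
def founder_posteriors(founder_alleles, observed, recomb, err=0.01):
    """Posterior probability of each founder at each marker (haploid sketch).

    founder_alleles[m][f]: allele carried by founder f at marker m
    observed[m]: allele seen in the recombinant fly (None = failed assay)
    recomb[m]: assumed probability of an ancestry switch between
               marker m and marker m + 1 (hypothetical parameter)
    """
    n_markers = len(observed)
    n_founders = len(founder_alleles[0])

    def emit(m, f):
        # Founders sharing the observed allele are equally supported.
        if observed[m] is None:
            return 1.0
        return 1.0 - err if founder_alleles[m][f] == observed[m] else err

    def trans(m, a, b):
        r = recomb[m]
        return 1.0 - r if a == b else r / (n_founders - 1)

    # Forward pass (unscaled; adequate for a handful of markers).
    fwd = [[emit(0, f) / n_founders for f in range(n_founders)]]
    for m in range(1, n_markers):
        prev = fwd[-1]
        fwd.append([
            emit(m, b) * sum(prev[a] * trans(m - 1, a, b)
                             for a in range(n_founders))
            for b in range(n_founders)
        ])

    # Backward pass.
    bwd = [[1.0] * n_founders for _ in range(n_markers)]
    for m in range(n_markers - 2, -1, -1):
        bwd[m] = [
            sum(trans(m, a, b) * emit(m + 1, b) * bwd[m + 1][b]
                for b in range(n_founders))
            for a in range(n_founders)
        ]

    # Combine and normalise at each marker.
    post = []
    for m in range(n_markers):
        w = [fwd[m][f] * bwd[m][f] for f in range(n_founders)]
        total = sum(w)
        post.append([x / total for x in w])
    return post


# Toy example with four founders: each marker alone separates the
# founders into two pairs, but two tightly linked markers together
# point to a single founder.
alleles = [[0, 0, 1, 1], [0, 1, 0, 1]]
one_marker = founder_posteriors(alleles[:1], [0], [])
two_markers = founder_posteriors(alleles, [0, 0], [0.01])
print(one_marker[0])
print(two_markers[0])
```

In the toy example each marker on its own leaves two founders equally likely, whereas the pair of linked markers concentrates most of the posterior on one founder, which is the behaviour the paragraph above describes.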
X-linked bristle number QTL: Drosophila bristle number is arguably the best studied quantitative trait, which, coupled with its easy and accurate scoring, permitted a rigorous test of our mapping methodology. A strong expectation was that we would identify bristle number QTL at the distal tip of the X chromosome, as factors influencing both sternopleural and abdominal bristle number have been identified in this region in previous studies (Long et al. 1995; Gurganus et al. 1998, 1999; Nuzhdin et al. 1999; Dilda and Mackay 2002). In a coarse-mapping experiment we identified QTL at the tip of the X for sternopleural bristle number (SBN) for both sexes in both synthetic populations, and for female abdominal bristle number (ABN) in just the pB population (Figure 4 On average, fine-mapped QTL were resolved to 1.3 cM, with the large male pB ABN QTL resolved to just 0.9 cM. These intervals implicate genetically tractable physical distances, and suggest a handful of genes for further study. The best bristle number candidate genes at the tip of the X chromosome are the achaete-scute complex (ASC) and Notch. Associations between polymorphisms at ASC and bristle number variation were first seen by Mackay and Langley (1990), extended and confirmed by Long et al. (2000), and more fully explored by Gruber et al. (2007). ASC is located under QTL peaks QTL1 and QTL2, and segregating loci at ASC might plausibly be involved in the expression of these QTL. Unfortunately, the very tip of the X chromosome in Drosophila has a markedly reduced crossover rate relative to physical distance compared to the rest of the chromosome, and LD extends over large physical distances (Aguadé et al. 1989). Thus, the prospect for identifying the actual causal locus, rather than a locus in strong LD with the causal site, contributing to QTL1 and QTL2 is somewhat bleak.
The Notch pathway is involved in the cell fate decisions that lead to bristle specification, and mutations of the component genes alter bristle patterning and spacing (reviewed by Artavanis-Tsakonas et al. 1999; Lai 2004). Thus, Notch is considered a viable candidate gene for bristle number variation, although no formal association mapping-style experiment has been performed across the region. The fine-mapping experiment presented here suggests that Notch is unlikely to contribute to segregating variation for male ABN, but we cannot completely rule out an effect of Notch on SBN. The two QTL mapped to the middle of the X chromosome are particularly interesting as we could find no good evidence for similar QTL in other studies that have scanned the X chromosome (Long et al. 1995; Gurganus et al. 1998, 1999; Nuzhdin et al. 1999; Dilda and Mackay 2002). This is probably because for both QTL the minor allele is rare in our experiment ( One of the purported advantages of the bristle number paradigm is that we have a good idea of the likely candidate genes underlying the phenotype (Mackay 1995). Clear QTL in regions without such candidates might appear to cast some doubt on this assumption. However, aside from ASC and Notch, many (if not most) of the best bristle number candidate genes reside on the autosomes (e.g., Suppressor of Hairless, daughterless, and scabrous on chromosome 2, and extra macrochaetae, quemao, hairy, Delta, Hairless, and Enhancer of split on chromosome 3), and there is some evidence for quantitative effects on bristle number residing at scabrous (Lai et al. 1994; Lyman et al. 1999), hairy (Robin et al. 2002), and Delta (Long et al. 1998; Lyman and Mackay 1998). The frequency of bristle number QTL: There has been a long-running debate in the quantitative genetics community over the mechanisms by which genetic variation is maintained in natural populations (see Lewontin 1974). 
Many traits are under either apparent or actual stabilizing selection (e.g., bristle number, Linney et al. 1971; Nuzhdin et al. 1995; García-Dorado and González 1996), yet paradoxically there is substantial genetic variation segregating for these traits. Two broad types of model that attempt to explain this paradox are MSB models and balancing selection models (reviewed by Barton and Turelli 1989; Barton and Keightley 2002; Johnson and Barton 2005). MSB models predict that the bulk of standing genetic variation is due to rare alleles of large effect that are unconditionally deleterious (Johnson and Barton 2005). In contrast, balancing selection models suggest that variation is due to intermediate frequency variants of more modest effect. These balanced polymorphisms might be maintained by heterozygote advantage (overdominance), variation in allelic effects via genotype-by-environment interaction (Gillespie and Turelli 1989; Turelli and Barton 2004), frequency-dependent selection (Hedrick 1972), or antagonistic pleiotropy (Rose 1982). Since mutations obviously occur, at least a portion of the segregating variation we see must be due to the effects of MSB. The relevant question then becomes what proportion of segregating variation is due to intermediate-frequency sites of modest effect. QTL mapping is routinely used for the genetic analysis of complex traits within and between various species. These works have yielded a staggering number of QTL, yet we have accumulated almost no information regarding the molecular genetic architecture of alleles at QTL or their population frequencies. There have been some attempts to combine the results of different mapping studies for the same trait, carried out in panels derived from different genetic material (e.g., Gurganus et al. 1999): overlap in the QTL identified across experiments can be taken as a loose surrogate for QTL frequency. From this Gurganus et al. 
(1999) suggest that some Drosophila bristle number QTL may be at intermediate frequency. A difficulty with such a “meta-analysis” approach, aside from the obvious lack of multiple QTL studies for the majority of traits, is that it may not be trivial to compare likelihood profiles generated across different genetic maps. Furthermore, the low resolution of most QTL mapping experiments will not allow confidence in the assertion that different QTL represent the same segregating factor. A much better approach to estimate QTL frequency is to utilize a mapping population that encompasses more than two haploid genomes. Nuzhdin et al. (2005) use a large set of inbred lines derived from a pair of heterozygous flies, such that the panel segregates for four haplotypes (three for the X chromosome), and use the data to assess the effect on mortality of lower-frequency alleles. Here, we take this idea further by taking a much larger sample of haplotypic variation, allowing us to generate a more robust estimate of the frequency for each mapped QTL. The three X-tip QTL appear to be somewhat frequent, with minor QTL frequencies between 0.25–0.4, although these QTL do not obviously appear to be biallelic (Figure 6 There are two important points to be made concerning our estimates of the amount of variation explained by the rare QTL4 and QTL5. First, rather than being rare, naturally occurring alleles, these QTL may represent mutations that arose in the founder lines in the laboratory. Identifying such mutations, rather than naturally segregating allelic variation, is a general concern with inbred line QTL mapping approaches. The advantage of our strategy is that such mutations will always be identified as singletons (one founder having a different QTL allele from all others), and a researcher can weigh the costs of further characterization of the causal locus against the possibility that the site may not contribute to natural variation of the trait. 
Second, our estimates of QTL effect/frequency are derived in a panel of D. melanogaster lines of worldwide distribution. Thus, our estimates of the contribution of each QTL to natural bristle number variation are worldwide estimates. If the frequency (or even the effect) of a QTL differs across populations, our worldwide estimate may underestimate the variance explained in some populations while overestimating it in others. In general there does not appear to be a great deal of population structure in D. melanogaster (Kreitman and Aguadé 1986; Hale and Singh 1991; Begun and Aquadro 1993), although there are some cases of strong geographic variation in allele frequency (e.g., clinal variation at Adh, Berry and Kreitman 1993), and an apparent population-specific effect has been observed at an identified wing-shape QTN (Palsson et al. 2005). We do not yet know the extent of among-population heterogeneity in the genetic control of complex traits in Drosophila. The five QTL we fine map were identified on the hemizygous X in males and, as such, contribute no dominance genetic variance. Our detection of these QTL was not predicated on the particular allele harbored by the sequenced strain of D. melanogaster, since none of the experimental males receive a sequenced strain X. We note that a fully recessive autosomal or female-specific X-linked QTL segregating in our worldwide sample of lines could be detected only if the sequenced strain harbors the recessive allele. The power to detect nonadditive autosomal or female X-linked QTL will depend both on the magnitude of the departure from additivity, and on the particular allele present in the isogenic standard line. Some combinations will increase power, and some will decrease power relative to detecting a fully additive QTL. From a strategic point of view it would be advantageous to be able to accurately estimate QTL frequency from coarse-mapping data alone.
We demonstrate that the presence of coarse-mapped QTL is preserved on fine mapping, but it is not clear that founder means are always similar between the coarse and fine-mapping studies. At least in one case (QTL4, Figure 6D
Amount of natural variation in bristle number explained: The eventual goal of our work is to identify all loci that contribute to natural variation in bristle number. If we assume additivity among mapped QTL, assume that the QTL represent naturally segregating biallelic polymorphisms (i.e., are not the result of mutation accumulation in the founder lines) and further assume our estimated effects translate to nature, we can estimate the fraction of the variation explained by the X chromosome QTL mapped here. In nature, ABN variation in males is 5.50 and SBN variation in males is 4.67 (Macdonald et al. 2005a). For male ABN, we need only consider QTL4 and estimate that we have explained 1.0% of the total phenotypic variation in ABN with this single QTL (95% confidence interval, 0.03–3.41%). The situation is more difficult for male SBN as we identify multiple QTL for this trait (QTL1, QTL2, QTL3, and QTL5). QTL1 (identified in pAr1) and QTL2 (identified in pBr1) may be equivalent, but this is unclear. To be conservative we ignore QTL2 as it barely achieves our LOD threshold. If QTL1 and QTL2 represent the same segregating factor, only considering QTL1 increases the Monte Carlo variance in allele frequency (as frequency is estimated from 8 rather than 16 alleles), and if the QTL are indeed independent we ignore the contribution of one of them. Similarly, the allele frequency estimate of QTL3 is taken just from pBr1, as although we do not formally identify an equivalent QTL in pAr1, the likelihood curve is only slightly less than our LOD threshold at the equivalent position in pAr1.
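The arithmetic behind such a variance-explained estimate can be sketched as follows. For a biallelic QTL on the hemizygous male X there are only two genotype classes, so the variance it contributes is p(1 - p)a², where p is the allele frequency and a is the difference between the two allele class means. The sketch treats the 5.50 quoted above as the total phenotypic variance, sums contributions across QTL under the additional assumption of no linkage disequilibrium among them, and uses a purely hypothetical effect size of one bristle. The function names are illustrative, and the output is not the estimate reported in the text.

```python
def qtl_variance_hemizygous(p, effect):
    """Variance contributed by a biallelic QTL in hemizygous males.

    p: frequency of one QTL allele (0..1)
    effect: difference in phenotype between the two allele classes
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return p * (1.0 - p) * effect ** 2


def fraction_explained(qtls, total_variance):
    """Fraction of total variance explained by a list of (p, effect) QTL,
    assuming additivity and no linkage disequilibrium among them."""
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return sum(qtl_variance_hemizygous(p, a) for p, a in qtls) / total_variance


# A QTL whose minor allele is carried by 1 of 16 founders, with a
# hypothetical effect of 1.0 bristle, against a total variance of 5.50.
example = fraction_explained([(1 / 16, 1.0)], 5.50)
print(f"{100 * example:.2f}% of phenotypic variance")
```

Because the contribution scales with p(1 - p), a rare allele needs a large effect to explain much variance, which is why the frequency estimate matters as much as the effect estimate.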
If we consider that QTL1, QTL3, and QTL5 contribute to male SBN variation, the amount of total natural variation in SBN they collectively explain is 8.7% (95% confidence interval, 3.54–16.04%). Together with the other assumptions made, the caveat with the SBN variance calculation is that we include potentially nonbiallelic QTL at the tip of the X chromosome. This may have unpredictable effects on the accuracy of our estimate. Despite the suite of potential difficulties, mapping in synthetic populations derived from several founders allows for estimates of the total variance explained in nature by identified QTL. Prospects to resolve QTN from QTL mapped in eight-way populations: The methodology we outline provides an integrated system with which to map QTL to ~1 cM, estimate their effects, and identify the most likely allelic configuration at the QTL across the founder lines. Ultimately we wish to identify the underlying QTN, but even our most finely mapped QTL interval (QTL4) encompasses 204 kb of sequence. Since mapping resolution does not increase linearly with each additional generation of recombination (Valdar et al. 2006a), only a very large number of extra maintenance generations would provide a marked increase in mapping resolution. This advantage might well be outweighed by the impact of drift on the founder composition of the recombinant population. One way to confirm the presence of a QTL is to conduct some form of association study across the implicated QTL interval. Assuming one had access to a genomewide database of D. melanogaster polymorphisms, the obvious strategy would be to genotype every common SNP across the entire QTL region, either directly or indirectly via strong LD with a genotyped site. However, extrapolating from previous high-power association mapping work in Drosophila (Palsson and Gibson 2004; Macdonald et al. 2005a), this would entail genotyping several hundred to a few thousand SNPs. 
It would be possible to reduce the genotyping effort by focusing on likely candidate genes present in the QTL interval. However, as in the case of QTL4 and QTL5, or for those traits that are less well understood than bristle number, no clear a priori candidates may be present. A considerable reduction in genotyping effort could also be achieved by genotyping a subset of the available SNPs expected to be enriched for functional polymorphisms. For instance, one could genotype only nonsynonymous coding SNPs or only those sites in sequence regions tagged as nonneutrally evolving (Boffelli et al. 2003; Boffelli et al. 2004; Macdonald and Long 2005). Unfortunately, given so little is known about the nucleotide-level genetic control of complex traits, it is not clear if a strategy based on genotyping specific sets of putatively functional SNPs will work. QTL information derived from an eight-way synthetic population provides a different mechanism for identifying likely causal SNPs that is independent of their sequence context. The set of founder means allows us to estimate the QTL allele present in each founder: if the sequence for the QTL region was available from all 16 founders, the most likely causal SNPs are those completely in phase with the predicted QTL allele configuration. We have carried out coalescent simulations that show that, for a common QTL, the number of SNPs in phase with the QTL alleles is expected to be very small, even for large regions of the Drosophila genome (data not shown). Yalcin et al. (2005) explore a similar strategy that, in combination with a statistic reflecting the between-species conservation of the sequence surrounding each SNP, also dramatically reduces the number of segregating sites that may represent the QTN. Testing the small set of implicated sites by association mapping would be relatively trivial. All that is required is to sequence a QTL interval of perhaps 200 kb in 16 lines. 
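The filtering step described above, keeping only SNPs that are completely in phase with the predicted QTL allele configuration, can be sketched as below. This is a minimal illustration under stated assumptions: the QTL configuration is coded as a list of 0/1 classes (one per founder), SNP genotypes are given as one allele per inbred founder, and the function name, SNP labels, and toy data are all hypothetical.

```python
def snps_in_phase(qtl_config, snp_genotypes):
    """Return SNP names whose alleles partition the founders exactly
    as the predicted QTL allele classes do (in either orientation).

    qtl_config: list of 0/1 QTL allele classes, one per founder
    snp_genotypes: dict mapping SNP name -> sequence of alleles,
                   one per founder, in the same founder order
    """
    if len(set(qtl_config)) != 2:
        raise ValueError("QTL configuration must contain both allele classes")
    matches = []
    for name, alleles in snp_genotypes.items():
        if len(alleles) != len(qtl_config):
            raise ValueError(f"{name}: wrong number of founders")
        if len(set(alleles)) != 2:
            continue  # monomorphic or multiallelic in this panel
        # With two alleles and two classes, exactly two distinct
        # (allele, class) pairs means a one-to-one correspondence.
        pairs = set(zip(alleles, qtl_config))
        if len(pairs) == 2:
            matches.append(name)
    return matches


# Toy panel of four founders in which only the first carries the
# "high" QTL allele.
config = [1, 0, 0, 0]
panel = {
    "snp_a": "GAAA",  # in phase
    "snp_b": "AAGG",  # out of phase
    "snp_c": "AGGG",  # in phase, opposite orientation
    "snp_d": "AAAA",  # monomorphic
}
print(snps_in_phase(config, panel))
```

A real analysis would also need to handle missing calls and residual heterozygosity within founders, which this sketch ignores.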
Although such an experiment remains prohibitively costly today, emerging technologies suggest that just such a strategy could make sense in a few years (Frazer et al. 2004; Hinds et al. 2005; Margulies et al. 2005; Shendure et al. 2005; reviewed by Shendure et al. 2004; Metzker 2005).
Acknowledgments
We thank V. Bauer DuMont and C. F. Aquadro for access to prepublication Notch resequencing data and L. Ometto for providing DNA sequence alignments. All data from this work are available from http://www.people.ku.edu/~sjmac/pubs.html. This work was supported by National Science Foundation grant DEB-0614429 to A.D.L.