Genetics. Mar 2008; 178(3): 1639–1652.
PMCID: PMC2278092

Selection in the Making: A Worldwide Survey of Haplotypic Diversity Around a Causative Mutation in Porcine IGF2


Domestic species allow us to study dramatic evolutionary changes at an accelerated rate due to the effectiveness of modern breeding techniques and the availability of breeds that have undergone distinct selection pressures. We present a worldwide survey of haplotype variability around a known causative mutation in porcine gene IGF2, which increases lean content. We genotyped 34 SNPs spanning 27 kb in 237 domestic pigs and 162 wild boars. Although the selective process had wiped out variability for at least 27 kb in the haplotypes carrying the mutation, there was no indication of an overall reduction in genetic variability of international vs. European local breeds; there was also no evidence of a reduction in variability caused by domestication. The haplotype structure and a plot of Tajima's D against the frequency of the causative mutation across breeds suggested a temporal pattern, where each breed corresponded to a different selective stage. This was observed comparing the haplotype neighbor-joining (NJ) trees of breeds that have undergone increasing selection pressures for leanness, e.g., European local breeds vs. Pietrain. These results anticipate that comparing current domestic breeds will decisively help to recover the genetic history of domestication and contemporary selective processes.

A major goal of current genetics research, in livestock as in plants or humans, is to identify the polymorphisms responsible for the variability in complex traits, i.e., traits affected by the environment as well as by more than one locus. This endeavor has proved to be difficult. In livestock, despite the large number of chromosome regions associated with phenotypes of economic interest (QTL), very few causative polymorphisms have been convincingly identified. The number of published QTL amounts to hundreds in pigs (http://www.animalgenome.org/cgi-bin/QTLdb/SS/summary) (Rothschild et al. 2007), but <10 causative mutations have been reported so far in this species. A comparable picture exists in all species. To accelerate causative gene discovery, traditional QTL studies are usually pursued with gene or genomewide association studies. A complementary approach is to infer the action of selection at specific loci from their nucleotide variability, the so-called selection footprint. To date, several works have shown the usefulness of this approach in humans and in other species (e.g., DuMont and Aquadro 2005; Nielsen et al. 2005; Wright et al. 2005; Caicedo et al. 2007). However, different genomewide scans have picked up different regions as affected by selection (see reviews for humans in Nielsen et al. 2007; Thornton et al. 2007). Possibly, one of the reasons for conflicting results is that disentangling selective from purely demographic forces is very challenging. One of the additional difficulties is that the target of selection cannot always be identified, even when the observed nucleotide variability pattern is not explained by demography alone and selection is the most plausible explanation.

Domestic plant and animal species offer underexploited genetic resources, which are extremely valuable to disentangle demographic from selective processes. Modern breeding and artificial selection techniques allow us to study dramatic evolutionary changes at an accelerated rate. Domestic species have several advantages over natural or human species: the global phenotypic variability across breeds is often larger than in the wild species, the target of artificial selection is known and can differ between breeds or lines, their demographic history and origin are relatively well-documented, the domestication process can be studied with unprecedented accuracy if the wild ancestor is available and, finally, they are easy to sample. So far, however, relatively little is known of livestock fine haplotype structure and of the effects of artificial selection on haplotype variability.

Here, we present the first worldwide study of haplotype variability in a porcine autosomal locus. To characterize the footprint of selection in a known selective target, we chose the IGF2 region. This gene, located in a telomeric position on pig chromosome 2, harbors a paternally expressed mutation that increases muscle growth and leanness (Van Laere et al. 2003). The causative mutation (intron.3-g.3072G>A) occurs in a CpG island of intron 3, which has a regulatory role; pigs receiving the A allele from their sire have a threefold increase of IGF2 mRNA in muscle. The mutation has a considerable effect, explaining ~10–30% of the total phenotypic variability for these traits and has been confirmed in several independent studies (Jungerius et al. 2004; Estelle et al. 2005).



A panel of 399 Sus scrofa animals comprising 237 domestic or feral pigs and 162 wild boars was genotyped, together with one bearded pig (S. barbatus) and one babirusa (Babyrousa babyrussa) as outgroups. The domestic breeds pertained to 40 breeds from Europe, Asia (China, Korea, and Vietnam), the Americas (USA, Mexico, Costa Rica, Bolivia, and Argentina), and Africa (Kenya and Zimbabwe) and there were wild boars sampled from Europe, North Africa, and Asia (17 countries represented). The complete list of breeds and countries is in Table 1. More details about pig breeds are available in Porter (1993) or in the online resource http://www.ansi.okstate.edu/breeds/swine/.

Main statistics for the populations analyzed

We divided the panel into seven main groups: international breeds, Asian local breeds, European/USA local breeds, African local breeds, creole and American feral breeds, Asian wild boar, and European/Maghreb wild boar (Table 1). Representatives of two hybrid terminal sire lines are also in the list (Primus and Maximus). International breeds are those that are primarily used worldwide, representing most of the genetics stock employed by multinational companies. Pietrain and Hampshire are among the leanest animals, followed by Large White, which are used primarily in terminal sire lines. Landrace is valued for its good reproductive and growth abilities, while Duroc is known for a better meat quality. Local Asian breeds were sampled primarily in China. Although local, it is well-documented that Chinese pigs were imported into Europe during the 19th century and earlier and contributed significantly to some modern breeds like Large White (Porter 1993; Giuffra et al. 2000). Non-widely distributed European breeds are within the European local breeds label. They fall broadly into two groups, Mediterranean black pigs and British breeds. The former comprise southern Italian breeds (Casertana, Sicilian, etc.) and Iberian pigs, subdivided into different lines (e.g., Retinto, red; Lampiño, black hairless); Portuguese Porco Alentejano is closely related and genetic interchange between these breeds has occurred frequently during history. It is currently accepted that Mediterranean breeds have not been crossed to Asian animals (Alves et al. 2003), in contrast to British pigs, which were actively crossed to Chinese pigs during the 18th and 19th centuries. The primary purpose was to increase prolificacy and, now paradoxically, fatness.

Polymorphisms genotyped:

The SNPs were identified after aligning the 15 published sequences of ~28 kb in length (Van Laere et al. 2003). Initially, the goal was to select uniformly spaced SNPs that were identified as tag-SNPs by Haploview (Barrett et al. 2005) plus all coding SNPs. In practice, the SNPs chosen were conditioned by the genotyping platform software to genotype simultaneously as many SNPs as possible. Eventually, we were able to genotype 33 SNPs in two plexes using the MassARRAY SNP genotyping system (Sequenom, San Diego), following the manufacturer's instructions. The principles of this method are detailed elsewhere (Buetow et al. 2001). The causative IGF2 mutation was also genotyped using the pyrosequencing protocol as described (Van Laere et al. 2003) because it could not be genotyped with MassARRAY technology. The distance between the first and last SNPs was ~27 kb, and thus the average spacing was 790 bp, although the maximum gap between consecutive SNPs, 28 and 29, was 5 kb, while the minimum was 15 bp (Table 2).

SNP characteristics

Data analysis:

Phases were reconstructed with Phase v2.1.1 (Li and Stephens 2003) using default options except that the program was run 5 times and the last iteration was 10 times longer, as suggested by the authors. We retained only those phases known with high probability (P > 0.8) for further analyses. Several parameters were estimated with DnaSP v4.10 (Rozas et al. 2003): the mean number of pairwise differences across loci (πN), Tajima's D, and Fu and Li's D using babirusa and S. barbatus as outgroups to distinguish between ancestral and derived alleles. These indexes are reported only for those populations where more than six haplotypes were available. Tajima's D measures the discrepancy between the average number of pairwise differences between haplotypes and the scaled number of segregating sites. Under the neutral null model, the expected values of the D statistics are zero. Directional selection causes negative D values, whereas balancing selection causes positive values. Demographic events like admixture also result in positive D's. It should also be considered in interpreting Tajima's D here that an upward bias is expected because SNP data and not full resequencing are employed (Kelley et al. 2006). Coalescence simulations under the neutral model were also carried out with DnaSP to obtain the probability of the number of haplotypes conditional on the observed number of segregating sites; 1000 replicates were run with either moderate linkage (4Ner = 10) or completely linked sites. The ancestral allele could not be determined for 4 of the 34 SNPs and thus some information is lost when applying the Fu and Li's D test (Table 1). NJ phylogenetic trees with the p-distance (percentage of differences) were drawn using MEGA3 (Kumar et al. 2004), with standard errors obtained from 1000 bootstrap replicates. The population-scaled recombination rate (ρ = 4Ner) was estimated using Hudson's composite-likelihood method implemented in LDhat [http://www.stats.ox.ac.uk/~mcvean/LDhat/ (McVean et al. 2002)], which assumes a finite-sites mutation model. This we did separately for European wild boar, Asian wild boar, local European breeds, Asian local breeds, and international breeds. The Haploview v3.32 program (Barrett et al. 2005) was used to compute disequilibrium measures (r2 and D′) and to identify haplotype blocks.
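To make the definition of Tajima's D concrete, the following is a minimal sketch of the standard formula (Tajima 1989) applied to phased haplotypes coded as strings of alleles. It is not the DnaSP implementation; the function name tajimas_d and the example sample are assumptions made for illustration. The example is an idealized configuration of 90 haplotypes with 11 segregating sites, in which 89 haplotypes are identical and one differs at every site.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotypes (sequences of allele codes)."""
    n = len(haplotypes)
    n_pairs = n * (n - 1) / 2
    seg_sites = 0
    diff_pairs = 0.0
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Number of haplotype pairs that differ at this site.
            diff_pairs += (n * n - sum(c * c for c in counts.values())) / 2
    if seg_sites == 0:
        return float("nan")  # D is undefined without segregating sites

    pi = diff_pairs / n_pairs  # mean number of pairwise differences
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical sample: 89 identical haplotypes plus one differing at 11 sites.
sample = ["0" * 11] * 89 + ["1" * 11]
print(round(tajimas_d(sample), 2))
```

For this configuration the statistic is strongly negative (about −2.35), because all segregating sites are singletons and the mean pairwise difference is far below S/a1.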


SNPs genotyped as proxies for complete variability:

We first investigated how representative the 34 SNPs were of the true relationship between haplotypes (Van Laere et al. 2003). To do so, we compared the NJ trees obtained from the 34 SNPs (Figure 1, a and b) and from the complete sequences published by Van Laere et al. (2003) (Figure 1c). Except for sequence AY242110, which pertains to a recombinant Hampshire haplotype that carried the causative mutation, the two topologies were identical. Figure 1 also shows the main phylogenetic clades, E, C, J, A, and M. The rationale for this arrangement is discussed below (Haplotype phylogenies section).

Figure 1.
(a) Alignment showing the haplotypes from the 34 SNPs genotyped here pertaining to the 15 sequences published (Van Laere et al. 2003). The last three haplotypes (JXBLACK06741, IB01591, and Sus barbatus) were found here; they represent clade A (Jianxi ...

Causative allele frequency:

The causative mutation was segregating in 13 breeds and absent in 23 breeds. A notable trend emerges simply by comparing the frequency of the selected allele between international breeds (PA = 0.86) and Asian (PA = 0.06) or European (PA = 0.03) local breeds; the mutation was not found in wild boars. We defer the analysis of African and American populations because they are derived populations and their history is more complicated. This pattern suggests that the mutation is recent (after domestication) but has nonetheless spread across many breeds around the globe. Due to modern selection emphasis on lean content and the introgression of Asian genes into European breeds, its frequency has dramatically increased in international breeds. We found that the derived allele was fixed in two breeds (Hampshire and Pietrain). All these animals shared an identical haplotype, the same found by Van Laere et al. (2003) (Figure 1a). In Duroc, 89 of 90 haplotypes were identical; the only heterozygous animal was a Spanish Duroc that differed in 11 positions.

We found the mutation in the Licha Black breed (Shandong Province, China), in agreement with previous results (Yang et al. 2006). Interestingly, this is one of the leanest breeds from China. Yang et al. (2006) also reported the mutation in the Erhualian breed in the nearby province of Jiangsu but at a lower frequency (5%). We report for the first time the presence of the mutation in the Korean native pig at a rather high frequency (25%) but not in any Asian wild boar. Although the Asian wild boars have not been extensively sampled, these findings, together with the fact that the A allele is present in a single haplotype, would suggest that the mutation occurred in Eastern Asia after domestication and has a unique origin. To resolve this question definitively, however, more SNPs on SSC2 and a larger number of Asian samples should be genotyped.

The mutation in Mediterranean local populations was very rare. The A allele was absent in the Iberian breed except in Andalusian spotted (Manchado de Jabugo), which is a synthetic strain made up of crossing purebred Iberian to Berkshire and Large White (García Dory et al. 1990). It is also the only line that harbored Asian haplotypes (see below). The A allele was also present in Mukota's pig (a Zimbabwean local breed with influence from European and Asian lines), Argentinean feral pigs, Mexican hairless pigs (Pelón), and Costa Rican creole pigs.

Nucleotide variability:

The number of segregating sites (S), the mean number of pairwise differences across loci (πN), Tajima's D, and Fu and Li's D are in Table 1. Overall, genetic variability was much larger in Asian than European local breeds, both for domestic pigs and for wild boar. This is in agreement with analysis of mtDNA that uncovered a bottleneck/expansion demographic process that was stronger in European than Asian pig populations (Larson et al. 2005; Fang and Andersson 2006). But a relevant observation was that, in contrast to what has been observed in other species, e.g., in maize (Wright et al. 2005), domestication has not produced a detectable decrease in variability. In fact, the lowest variability was found in European wild boar, in agreement with results at the FABP4 gene (Ojeda et al. 2006), while some of the most variable breeds are the endangered British Tamworth breed or local Zimbabwean Mukota.

In parallel, selection has wiped out variability almost completely for this region in some breeds, like Pietrain, Hampshire, or Duroc, a clear signal of selective sweep. These breeds are typically selected for growth and leanness and are used as sire lines. A single haplotype was found in 34 Pietrain sequences; similarly, all haplotypes except one were identical in 90 Duroc animals. Coalescent simulations showed that it is highly unlikely (P < 10−6) to get a single haplotype conditionally on the observed number of segregating sites (S = 11 in Duroc), and thus the neutral model can be rejected. Nucleotide variabilities were comparable in the two main porcine breeds, Landrace and Large White.
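The logic of this test can be illustrated with a small simulation. The sketch below uses a standard neutral coalescent without recombination (the completely linked case), places a fixed number S of mutations on the genealogy with probability proportional to branch length, and counts the resulting distinct haplotypes. It is an illustration under these assumptions and not the DnaSP procedure; the function names, the number of replicates, and the seed are hypothetical choices.

```python
import random


def haplotype_count_given_s(n, s, rng):
    """Number of distinct haplotypes in one neutral genealogy carrying s mutations."""
    # Each active lineage is [set of descendant samples, current branch length].
    active = [[frozenset([i]), 0.0] for i in range(n)]
    finished = []
    while len(active) > 1:
        k = len(active)
        # Waiting time to the next coalescence among k lineages.
        wait = rng.expovariate(k * (k - 1) / 2.0)
        for lineage in active:
            lineage[1] += wait
        i, j = rng.sample(range(k), 2)
        first, second = active[i], active[j]
        finished.extend([first, second])
        active = [lin for idx, lin in enumerate(active) if idx not in (i, j)]
        active.append([first[0] | second[0], 0.0])

    clades = [branch[0] for branch in finished]
    lengths = [branch[1] for branch in finished]
    # Condition on S: drop exactly s mutations, weighted by branch length.
    mutated = rng.choices(clades, weights=lengths, k=s)
    haplotypes = {tuple(i in clade for clade in mutated) for i in range(n)}
    return len(haplotypes)


def prob_at_most(n, s, max_haplotypes, replicates=1000, seed=1):
    """Monte Carlo estimate of P(number of haplotypes <= max_haplotypes | S = s)."""
    rng = random.Random(seed)
    hits = sum(
        haplotype_count_given_s(n, s, rng) <= max_haplotypes
        for _ in range(replicates)
    )
    return hits / replicates


# Duroc-like setting from the text: 90 haplotypes, S = 11, two haplotypes observed.
print(prob_at_most(90, 11, 2))
```

With only 1000 replicates the estimate cannot resolve very small probabilities; it merely shows that samples with so few haplotypes are rare under neutrality when S = 11.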

We found a large variability in Tajima's or Fu and Li's D (Table 1), ranging from highly positive (e.g., Calabrese, Korean wild boar) to strong negative values (Japanese wild boar, Duroc). Thus, in addition to the effects of selection caused by the causative mutation, there must be strong demographic forces affecting nucleotide variability. A genomewide analysis is required to disentangle the two phenomena, though. To elucidate the effect of directional selection for the causative mutation, we plotted the frequency of the selected allele against Tajima's D (Figure 2). The observed pattern was illuminating. Before the appearance of the mutation (PA = 0), Tajima's D was highly variable, likely the result of demographic and/or sampling effects. We assumed that the only nonneutral polymorphism is the intron 3 causative allele. In stark contrast, as the frequency of the A allele increases, Tajima's D first increases and declines very rapidly after PA > 0.5. A similar, although more ragged pattern was observed with Fu and Li's D. Although the behavior of Tajima's D with genotypic data requires further study, especially its dynamics over the selection process, several authors (Jensen et al. 2005; Kelley et al. 2006) report that this statistic is biased upward and can even take positive values under directional selection and with partial selective sweeps, especially with population structure. Yet in breeds where PA was close to 1, i.e., when the mutation was almost fixed, Tajima's D took highly negative values as expected in a classical selective sweep.

Figure 2.
Relationship between Tajima's D and frequency of the causative mutation (PA).

Haplotype phylogenies:

A broad view of the porcine genetic landscape before modern selection in China and the Mediterranean (Iberian and Italian peninsulas) can be seen through the NJ trees in Figures 3, a and b, and supplemental Figure S1 at http://www.genetics.org/supplemental/, respectively. In light of these trees, we selected the five most divergent and frequent haplotypes as clade representatives: “C” for causative, is the typical haplotype of Pietrain that carries the derived allele; “E” for European, is at high frequencies in European wild boar and local European populations but also found in Asian populations; “J” for Japanese was described in a Japanese wild boar (Van Laere et al. 2003), but is also present in continental Asian populations. Two additional clades were not described previously by Van Laere et al. (2003): “A” for Asian and “M” for Mediterranean; the latter was frequent in European local populations and absent from Asian pigs. All the haplotypes and the NJ trees are in Figure 1, a and b, and outlined by colored rectangles. Note that the clade names are rather conventional because Asian populations harbored all haplotypes except M, whereas the M haplotype was at high frequency in Mediterranean populations but present also in other populations. All these haplotypes, marked with colored rectangles, are included in the NJ trees to facilitate visual comparisons between populations.

Figure 3.
NJ trees for different populations using the p-distance. Encircled letters point at the clade representative (Figure 2). Different colors correspond to different breeds. The individual code is composed of four letters, five numbers, and a G or an A. The ...

As expected from the fact that the species S. scrofa evolved initially in Eastern Asia, this region has maintained higher levels of variability than the European subspecies, as can be intuitively seen by deeper clades at intermediate frequencies and corroborated by higher πN in Asian vs. Mediterranean pigs (Table 1). Chinese pigs harbored all main clades, except M, at intermediate frequencies (Figure 3a).

As mentioned, one of the large advantages of studying domestic breeds over natural populations is that different breeds/lines may represent different stages of the domestication and artificial selection processes. Moreover, because the target of selection is often well-documented, interpretation of the results is also far easier than in natural populations. This temporal pattern can be appreciated by comparing the NJ trees of British local breeds, Landrace and Large White (Figures 3, c, d, and f). The set of local British/USA breeds would represent the stage just before modern selection for leanness but after introgression of Asian germplasm during the 19th century. They follow a similar pattern to that of Mediterranean local breeds although with some key differences: the M haplotype is absent and a more pronounced Asian influence is observed (Figure 3c). The global frequency of the A allele (clade C) was very low; it was at high frequency only in Chester White, a breed related to Large White. Next, the effect of modern selection can be observed by comparing the clinal NJ trees of Landrace (Figure 3d), Large White (Figure 3f), and Duroc. The pattern of the Landrace breed is an intermediate stage where the C clade represents ~50% of haplotypes; the fact that this clade is rather divergent from the preexisting clades (clade E predominantly) makes its Tajima's D become positive (DT = 1.8). In Large White's NJ tree (Figure 3f), clade C was already predominant at the expense of clade E and results in a negative Tajima's D overall. Note, however, that the frequency of clade C is intermediate (0.5) in the Yorkshire breed, a “primitive” USA Large White representative, and Tajima's D was positive here. The next most extreme case was the Duroc breed (PA = 0.99 and DT = −2.35), i.e., just before fixation. 
This pattern was not caused by demography alone because we found a highly positive DT in a subset of this Duroc population for fatty acid binding protein 5 (FABP5, our unpublished data). The selective sweep is accomplished in Pietrain and Hampshire, where clade C is fixed and genetic variation is removed for at least ~27 kb.

No haplotype was specific to a particular breed; each breed consisted of a mosaic of different haplotypes shared across breeds. This was observed throughout breeds and populations. It suggests that breed effective sizes are larger than what could be suspected a priori. For instance, the Iberian pig is considered a single breed with different lines (Lampiño, hairless; Retinto, red; Torbiscal, Guadyerbas, etc.) that tend to be bred separately. But even highly inbred and isolated lines like the Iberian Guadyerbas (inbreeding coefficient > 0.3) (Toro et al. 2000) harbored haplotypes in all main clades of the Mediterranean breeds. The Italian NJ tree was again very similar, and there was no correlation between clade and breed. The two most extreme cases were the British endangered breed Tamworth and Zimbabwe's Mukota. Tamworth is included in the British Rare Breeds Survival Trust protection program (http://www.rbst.org.uk/). Yet, it was one of the most diverse breeds (Table 1), with highly distant haplotypes in clades E, J, and A (Figure 3c). The six Mukota haplotypes pertained to three clades, A, C, and E, confirming that Mukota has an important Asian influence.

A variety of situations was found in the derived American and African creole/feral populations (supplemental Figure S1). We did not find the mutation in Bolivian pigs, collected in the Santa Cruz valley. Costa Rican animals were clearly influenced by modern breeds, as inferred from the high frequency of the causative allele, but also from Mediterranean pigs. Mexican hairless had some Asian influence, evidenced by the presence of the A haplotype. This result is in agreement with the report of Asian mtDNA haplotypes for Mexican hairless (Larson et al. 2005). The mutation was segregating even in Argentinean feral pigs. Thus, the genetic structure of derived American populations is complex. No clear pattern emerges and more extensive sampling is required to reconcile history with phylogeny in these populations.

Linkage disequilibrium:

Five haplotype blocks were inferred from the algorithm implemented in Haploview (Barrett et al. 2005), spanning 1, 2, 9, 4, and 0.8 kb, respectively (Figure 4a). The third block was the largest and contained the causative mutation. According to the tagger option, 17 SNPs would tag all 34 SNPs with r2 = 1, while 10 SNPs would capture all markers with r2 > 0.8. The SNPs most strongly associated with the causative mutation were 14 and 15, with r2 = 0.91 and 0.95, respectively. The linkage disequilibrium pattern looks overall quite complex. To simplify the data set, we also analyzed the Landrace and Large White breeds separately, as selecting for the causative mutation in these breeds is more relevant than in local breeds (Figure 4b). As expected, the disequilibrium pattern was simpler, with a single block that nevertheless spanned 14 kb, i.e., only about half of the whole region analyzed. The causative mutation was at high disequilibrium again with SNPs 14 and 15 but also with the contiguous genotyped SNP 22. In summary, the causative SNP was in high disequilibrium with very few of the SNPs analyzed and thus marker assisted selection can be implemented effectively only because the causative SNP has been discovered. There was no relation between physical distance and any disequilibrium measure; this occurred between all pairs of markers (Figure 4c) and in particular between the causative mutation and the rest of the SNPs (Figure 4d). Note that the behavior of r2 was completely different from that of D′, as these statistics quantify different aspects of linkage disequilibrium (Ardlie et al. 2002). The index D′ was one for a large percentage of pairs, denoting the absence of recombination, while r2 values tended to cluster around zero because the allele frequencies at the SNP pairs were rather different (Figure 4, c and d). 
Although a flat line is the pattern expected if all markers are in the same haplotype block, the picture here was more complex, as many pairs “escaped” complete disequilibrium. But even for these pairs there was no observable trend between distance and disequilibrium.
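The contrast between D′ and r2 can be seen in a small numerical example. The sketch below computes both measures from the four two-locus haplotype counts using their standard definitions; the function name ld_from_counts and the counts are hypothetical and are not taken from the data set. The example pair has no recombinant haplotype but very different allele frequencies at the two SNPs.

```python
def ld_from_counts(n11, n10, n01, n00):
    """Return (D, D', r2) from counts of haplotypes AB, Ab, aB, and ab."""
    total = n11 + n10 + n01 + n00
    p_a = (n11 + n10) / total  # frequency of allele A at the first SNP
    p_b = (n11 + n01) / total  # frequency of allele B at the second SNP
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both SNPs must be polymorphic")

    d = n11 / total - p_a * p_b
    # D' rescales |D| by its maximum given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical pair: A is rare (5%) and always found with B (50%); Ab is absent.
d, d_prime, r2 = ld_from_counts(5, 0, 45, 50)
print(round(d_prime, 3), round(r2, 3))
```

Here one of the four haplotypes is missing, so D′ equals 1, yet r2 is only about 0.05 because the allele frequencies at the two SNPs differ so much.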

Figure 4.
Haplotype structure and relationship between distance and disequilibrium measures. (a) r2 plot between pairs of loci and all individuals; haplotype blocks are underlined; the arrow points at the causative SNP. (b) r2 plot between pairs of loci for Landrace ...


Selection in the making:

Artificial selection in livestock species allows us to study dramatic genetic changes in the making. Different evolutionary stages can be scrutinized by comparing breeds that have undergone very different selection pressures. Knowing that SNP 23 (intron.3-g.3072G>A) is a causative, selected mutation has clearly facilitated the interpretation of the observed pattern of nucleotide and haplotype diversity. Certainly, an ongoing challenge is how to infer selection from the pattern of linkage disequilibrium and nucleotide variability alone, when the target of selection is not known. The fact that artificial selection is much more intense and effective than natural selection, together with the presence of highly structured populations and the availability of the wild ancestor in the pig, should make this task easier than in human or natural populations. Our work provides data that can be used to validate theoretical models for selective sweeps in structured populations.

The criteria of selection in international breeds vary but are primarily leanness, growth, and, to a lesser extent, reproductive traits. For most of the remaining breeds, no modern selection and breeding schemes have been set up and thus they tend to be much fatter and of higher meat quality than international breeds. The overall frequencies of the causative allele are in complete agreement with the expectations: high frequencies in international lean breeds and very low frequency in local breeds. The presence of the mutation in Iberian Andalusian spotted is explained by the well-known fact that it was created by crossing Iberian to British breeds (García Dory et al. 1990; Alves et al. 2003); the presence of the derived allele in other local European breeds (Mangalitza and Casertana) is also due to introgression in all likelihood. Some authors (Porter 1993) have suggested a small influence of Asian breeds in Casertana. But neither in Casertana nor in Mangalitza have Asian mtDNA haplotypes been reported (Angiolillo et al. 2001; Alves et al. 2003; Larson et al. 2005), so this would suggest that the introgression of the mutant allele was male mediated. Mexican hairless is thought to be descended from Iberian animals brought by the Spaniards (Lemus-Flores et al. 2001), but mtDNA analysis (Larson et al. 2005) and the presence of the mutation indicate that interbreeding with Asian germplasm has also occurred. Similarly, the Asian influence in Zimbabwe's Mukota (Figure 3e) is in agreement with mtDNA results (Ramírez et al. 2006).

The D statistics computed here (Table 1) were obtained with genotypes rather than the complete set of polymorphisms, and thus there is an ascertainment bias that causes an upward bias in Tajima's D (Kelley et al. 2006). Nevertheless, we also expect a positive correlation between D values computed from sequence and from very dense genotyping (Carlson et al. 2005; Kelley et al. 2006), which allows us to compare different populations. Directional selection causes an abundance of the haplotype carrying the selected mutation and a relative excess of rare variants that results in a negative Tajima's D. But this is the final stage of a selective sweep. When a mutation occurs and increases in frequency, there will be a moment when the selected haplotype will be at intermediate frequency. At this moment, Tajima's D becomes positive due to an apparent excess of alleles at intermediate frequencies. The pattern observed in Figure 2 is precisely the expected pattern when a selected haplotype replaces the existing ones. The comparison of the NJ trees of local British breeds, Landrace, and Large White is particularly illuminating. The mutation was at much higher frequency in Large White (PA = 0.78) than Landrace (PA = 0.55). Probably, the reason why the mutation has not become fixed in Landrace is that this breed is used in both paternal and maternal lines, and the mutation has detrimental effects on prolificacy because an excess of leanness diminishes reproductive performance in the sow (Buys et al. 2006). In contrast, Large White is mostly used in sire lines in the European market, where the primary interest is to increase growth and leanness. Thus, the indirect selection pressure has been higher in Large White than Landrace. Interestingly, both Large White and Landrace are used as maternal lines in China, and the allele frequencies were similar (Yang et al. 2006). 
Certainly, selective sweeps occur instantaneously (in the evolutionary scale) in natural populations and only the final stage is observed, i.e., here the Pietrain, Hampshire, or Duroc populations. The study of a diverse collection of breeds and populations allowed us to carry out a spatial study that mimicked a longitudinal (temporal) process.
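The change of sign of Tajima's D during a sweep can be reproduced with a deliberately simplified calculation. The sketch below assumes a sample made of only two haplotype classes that differ at all S segregating sites (a selected haplotype replacing a single divergent preexisting one) and evaluates Tajima's D as the number of copies k of one class changes. The two-class model, the function name two_class_tajimas_d, and the values n = 90 and S = 11 are assumptions for illustration; real breeds carry several preexisting haplotypes.

```python
from math import sqrt


def two_class_tajimas_d(n, s, k):
    """Tajima's D when k of n haplotypes differ from the other n - k at all s sites."""
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n so that sites segregate")
    # Every site splits the sample k versus n - k, so pi has a closed form.
    pi = s * 2 * k * (n - k) / (n * (n - 1))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Copies of the selected haplotype in a sample of 90, from rare to nearly fixed.
for k in (5, 22, 45, 68, 85, 89):
    print(k, round(two_class_tajimas_d(90, 11, k), 2))
```

In this idealized setting D is positive while the selected haplotype is at intermediate frequency and becomes strongly negative as it approaches fixation. Because only two classes are modeled, the value depends on the minor count alone, so the same negative values also appear when the selected haplotype is very rare.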

Hard or soft sweeps?

There is currently much interest in understanding the effects of directional selection on standing genetic variation, the so-called “soft” sweeps (Innan and Kim 2004; Hermisson and Pennings 2005; Przeworski et al. 2005; Teshima et al. 2006). This scenario is particularly relevant here because the selective advantage of a given allele can dramatically change after a domestication process. Simulation results have shown (Innan and Kim 2004; Przeworski et al. 2005) that the reduction in nucleotide diversity around the selected site can be much smaller than anticipated when the selected allele is already segregating in the population. The reduction will be minimal when the allele is at intermediate frequencies and indistinguishable from neutral variation. Our results and those of Yang et al. (2006) demonstrate that the mutation was segregating in East Asian populations before being selected due to modern emphasis on lean meat, i.e., we would be expecting a soft sweep behavior and a mild reduction in nucleotide diversity. However, the nucleotide pattern is that of a hard sweep, and genetic variability was wiped out for at least 27 kb. The most likely explanation is that the selection process was associated with a strong bottleneck, as probably very few copies of the allele were introgressed in European populations. More extensive sampling of Asian haplotypes harboring the mutation should be carried out to estimate the age of the allele, but the causative mutation seems to be quite recent, after domestication. All in all, the demographic model of modern pig breeding is far more complex than the single-population model studied so far in domestication (Innan and Kim 2004; Przeworski et al. 2005).

The classical hitchhiking effect predicts that a selective sweep will reduce genetic variability around the selected target. One of the striking observations from our study is that the footprint of this phenomenon is clearly visible across different international breeds, which in principle behave as semi-isolated, distinct populations. The cases of Pietrain and Duroc are particularly notable: a single and identical haplotype spanning at least 27 kb was found in all 17 Pietrain animals sampled from Spain, Germany, France, and the UK and belonging to several pig breeding companies. As for Duroc, all 90 haplotypes except 1 (a Spanish Duroc differing in 11 SNPs) were identical at all positions except for the last SNP. Duroc was sampled from Spain, the USA, Mexico, Denmark, the UK, Hungary, and France. Coalescent simulations conditional on the number of segregating sites suggest that it is virtually impossible to get just two haplotypes with S = 11 and n = 90 in a neutral model. These results are in stark contrast with results at FABP5 on chromosome 4, where there seems to be evidence of balancing selection in Duroc (our unpublished results). At the very least in Duroc, the lack of genetic diversity cannot be explained by bottlenecks alone. The picture is more complex in the two most popular international breeds, Large White and Landrace. While British, French, and Finnish Large White haplotypes were again identical to that in Pietrain, Spanish Landraces were far more variable; the frequency of the derived allele was only 36% and no extreme Tajima's D showed up.
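The kind of calculation behind the coalescent statement above can be illustrated with a minimal sketch. It is not the procedure actually used in this study: it assumes a single panmictic population of constant size, no recombination, and the infinite-sites model, and it places exactly S mutations on each simulated genealogy with probability proportional to branch length. The function names (simulate_haplotype_count, prob_at_most) and the number of replicates are hypothetical choices made for illustration only.

```python
import random


def simulate_haplotype_count(n, s, rng):
    """Simulate one neutral genealogy for n >= 2 chromosomes (no recombination),
    place exactly s mutations on it, and return the number of distinct haplotypes."""
    # Each active lineage: (set of sampled chromosomes below it, time it appeared).
    lineages = [(frozenset([i]), 0.0) for i in range(n)]
    branches = []  # (descendant set, branch length) for every branch of the tree
    t = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time to the next coalescence, in coalescent time units.
        t += rng.expovariate(k * (k - 1) / 2.0)
        i, j = rng.sample(range(k), 2)
        for idx in (i, j):
            descendants, born = lineages[idx]
            branches.append((descendants, t - born))
        merged = lineages[i][0] | lineages[j][0]
        lineages = [lin for idx, lin in enumerate(lineages) if idx not in (i, j)]
        lineages.append((merged, t))
    # Conditioning on S: each mutation falls on a branch with probability
    # proportional to its length (only relative lengths matter here).
    sets, lengths = zip(*branches)
    carriers = rng.choices(sets, weights=lengths, k=s)
    # A haplotype is the vector of derived/ancestral states over the s sites.
    haplotypes = {tuple(c in d for d in carriers) for c in range(n)}
    return len(haplotypes)


def prob_at_most(n, s, max_haps, replicates, seed=1):
    """Monte Carlo estimate of P(number of haplotypes <= max_haps | S = s)."""
    rng = random.Random(seed)
    hits = sum(
        simulate_haplotype_count(n, s, rng) <= max_haps for _ in range(replicates)
    )
    return hits / replicates


if __name__ == "__main__":
    # Duroc-like configuration from the text: n = 90 haplotypes, S = 11 sites.
    print(prob_at_most(n=90, s=11, max_haps=2, replicates=2000))
```

Under these assumptions the estimate is the fraction of neutral genealogies in which all S mutations fall on branches defining the same split of the sample. Any demographic structure, recombination, or ascertainment of the SNPs would change the result, so the sketch only illustrates the logic of conditioning on S.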

Domestication and modern breeding are complex processes:

We found no apparent reduction in nucleotide variability after domestication, as can be seen in the diversity indices of Table 1. To contrast the diversity indices πN, we also estimated the parameter ρ = 4Nₑr for several populations separately: international breeds (ρ = 8.2), Mediterranean local breeds (Italian and Iberian peninsulas, ρ = 8.2), British local breeds (ρ = 4.1), European wild boar (ρ = 6.1), Asian local breeds (ρ = 13.3), and Asian wild boar (ρ = 11.2). Although these estimates are subject to large sampling errors, they do suggest that (1) domestication has not decreased genetic variability in the porcine species, at least for this region, in agreement with the opinion that domestication, at least in animals, was a complex process and cannot be fully explained by a simple bottleneck (Wong et al. 2004; Larson et al. 2005; Vila et al. 2005; Fang and Andersson 2006); and (2) despite intense selection in international breeds targeting the IGF2 region, their effective sizes still seem comparable to those of local unselected breeds. This is likely the result of two balancing forces: the introgression of Asian genes vs. the selective sweep process. Interestingly, we previously reported a very low nucleotide diversity in European wild boar for the FABP4 gene as compared to domestic breeds (Ojeda et al. 2006). The low nucleotide variability of European wild boar has also been reported for mtDNA in several studies and is consistent with a bottleneck followed by recent expansion that occurred prior to domestication (Larson et al. 2005; Fang and Andersson 2006).
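As a rough reading of the ρ estimates above, and assuming (this is not stated in the text) that the recombination rate r of the region is the same in all populations, the ratio of two ρ estimates equals the ratio of the corresponding effective sizes. The subscripts below are labels introduced here for illustration only.

```latex
\[
\frac{\hat{\rho}_{\mathrm{Mediterranean}}}{\hat{\rho}_{\mathrm{international}}}
  = \frac{4 N_{e,\mathrm{Med}}\, r}{4 N_{e,\mathrm{int}}\, r}
  = \frac{N_{e,\mathrm{Med}}}{N_{e,\mathrm{int}}}
  = \frac{8.2}{8.2} = 1,
\qquad
\frac{\hat{\rho}_{\mathrm{British}}}{\hat{\rho}_{\mathrm{international}}}
  = \frac{4.1}{8.2} = 0.5,
\qquad
\frac{\hat{\rho}_{\mathrm{Asian\ local}}}{\hat{\rho}_{\mathrm{international}}}
  = \frac{13.3}{8.2} \approx 1.6 .
\]
```

Under this assumption, international breeds would have an effective size similar to that of Mediterranean local breeds, about twice that of British local breeds, and somewhat below that of Asian local breeds, although the large sampling errors of the estimates preclude any finer comparison.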

Given that the mutation seems to have a unique origin, it is remarkable how widespread it was across breeds and continents. This indicates that porcine breeds have regularly interchanged genetic material. This regular interchange would also explain the high heterozygosity observed within breeds, which consist of a mosaic of distinct haplotypes at different frequencies (Figure 3 and supplemental Figure S1). Thus, it is very rare that a particular haplotype is specific to a single breed, however isolated the breed is thought to be. Conversely, even highly inbred lines carried haplotypes from several clades. The extreme case is the endangered British breed Tamworth: we found it to be one of the most variable breeds, and despite being listed among vulnerable breeds by the Rare Breeds Survival Trust (http://www.rbst.org.uk/watch-list/pigs.php), it held haplotypes in clades J, A, and E. It is also noteworthy that Large White pig LWES0429 was heterozygous at 9 positions, harboring the two specific Asian haplotypes A and J (Figure 3f). This individual belonged to a highly inbred, very fat, and primitive Large White line imported to Spain in 1931 and kept in a closed herd until 1992, when the animals were slaughtered (Rodrigáñez et al. 1998). It was homozygous for the wild-type causative allele, though. The Iberian line Guadyerbas has an average inbreeding coefficient of ~30% and has been isolated for >50 years now (Toro et al. 2000), yet it also harbored haplotypes in all main clades, M and E (pink squares in Figure 3b). All in all, the genetic difference between porcine breeds is very tenuous and contrasts clearly with other species, e.g., the dog, where breed barriers seem to have been enforced more effectively.

In conclusion, we show that (1) selection can be observed and analyzed in the making by comparing different breeds that represent distinct stages of the selective process; (2) there is no evidence that, overall, domestication reduced genetic variability in the IGF2 region with respect to current wild ancestors in the pig (although a complete selective sweep is found in some very lean breeds such as Pietrain); and (3) there seems to be considerable gene flow between porcine breeds, with the result that common haplotypes are shared across breeds and few (if any) haplotypes are specific to a single breed.


Thanks go to J. Rozas, S. Ramos-Onsins, and J. van der Made for comments and discussions. We thank the following persons and institutions for samples: L. Silió and M. C. Valdovinos (Instituto Nacional de Investigaciones Agrarias, Spain), E. Martínez (Madrid Zoo), N. Okumura (Society for the Techno-Innovation of Agriculture, Forestry, and Fisheries Institute, Japan), G. Rohrer (United States Department of Agriculture), Z. Bosze (Agricultural Biotechnology Center, Hungary), M. Cumbreras (Diputación de Huelva, Spain), E. von Eckardt (Sweden), M. Kovac (Slovenia), J. Reixach (Batallé, Spain), O. Hanotte (International Livestock Research Institute, Kenya), J. García-Cascos (National Center for Animal Breeding and Reproduction, Spain), G. Caja (Universitat Autònoma Barcelona, Spain), E. González (Seporsa, Spain), P. Martínez (Universidad de Santiago de Compostela, Spain), D. Vidal and C. Gortázar (Instituto de Investigación en Recursos Cinegéticos, Spain), S. Casu (Istituto Zootecnico e Caseario per la Sardegna, Italy), M. Lukaszewicz (Academy of Sciences, Poland), C. Renard (Institut National Recherche Agronomique, France), E. Grindflek (The Norwegian Pig Breeders Association, Norway), M. A. Revidatti (Universidad Nacional del Nordeste, Argentina), A. Clop (Genesis Faraday, UK), R. Silva (Portugal), J. W. Ma (Jiangxi Agricultural University, China), Instituto de Capacitación del Oriente (Bolivia), Veterinarios Sin Fronteras (Spain), Junta de Castilla–La Mancha (Spain), London Zoo, Copaga Sociedad Cooperative (Spain), Degesa (Spain), Semen Porcino Andalucía (Spain), and Kaliningrad Zoo (Russia). MassARRAY genotyping was carried out at the Spanish National Genotyping Center (http://www.cegen.org) facilities and was subsidized by Fundación Genoma España. A.O. is the recipient of a Ph.D. grant (Ministerio de Educación y Ciencia, Spain). Work was funded by projects AGL2004-0103, AECI/5267/06 (Spain), and the National Natural Science Foundation of China (grant 30425045).


  • Alves, E., C. Ovilo, M. C. Rodriguez and L. Silio, 2003. Mitochondrial DNA sequence variation and phylogenetic relationships among Iberian pigs and other domestic and wild pig populations. Anim. Genet. 34 319–324. [PubMed]
  • Angiolillo, A., F. Pilla, D. Matassino, A. Clop and A. Sánchez, 2001. Genetic characterization of Italian local breeds of pigs by mtDNA analysis. EAAP Meeting Proceedings, Budapest, Hungary, p. 54.
  • Ardlie, K. G., L. Kruglyak and M. Seielstad, 2002. Patterns of linkage disequilibrium in the human genome. Nat. Rev. Genet. 3 299–309. [PubMed]
  • Barrett, J. C., B. Fry, J. Maller and M. J. Daly, 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21 263–265. [PubMed]
  • Buetow, K. H., M. Edmonson, R. MacDonald, R. Clifford, P. Yip et al., 2001. High-throughput development and characterization of a genomewide collection of gene-based single nucleotide polymorphism markers by chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Proc. Natl. Acad. Sci. USA 98 581–584. [PMC free article] [PubMed]
  • Buys, N., G. Van den Abeele, A. Stinckens, J. Deley and M. Georges, 2006. Effect of the IGF2-intron3–G3072A mutation on prolificacy on sows. Proceedings of the World Congress on Genetics Applied to Livestock Production, Belo Horizonte, Brazil.
  • Caicedo, A. L., S. H. Williamson, R. D. Hernandez, A. Boyko, A. Fledel-Alon et al., 2007. Genome-wide patterns of nucleotide polymorphism in domesticated rice. PLoS Genet. 3 e163. [PMC free article] [PubMed]
  • Carlson, C. S., D. J. Thomas, M. A. Eberle, J. E. Swanson, R. J. Livingston et al., 2005. Genomic regions exhibiting positive selection identified from dense genotype data. Genome Res. 15 1553–1565. [PMC free article] [PubMed]
  • DuMont, V. B., and C. F. Aquadro, 2005. Multiple signatures of positive selection downstream of Notch on the X chromosome in Drosophila melanogaster. Genetics 171 639–653. [PMC free article] [PubMed]
  • Estelle, J., A. Mercade, J. L. Noguera, M. Perez-Enciso, C. Ovilo et al., 2005. Effect of the porcine IGF2-intron3–G3072A substitution in an outbred Large White population and in an Iberian × Landrace cross. J. Anim. Sci. 83 2723–2728. [PubMed]
  • Fang, M., and L. Andersson, 2006. Mitochondrial diversity in European and Chinese pigs is consistent with population expansions that occurred prior to domestication. Proc. Biol. Sci. 273 1803–1810. [PMC free article] [PubMed]
  • García Dory, M. A., S. Martínez and F. Orozco, 1990. Guía de Campo de las Razas Autóctonas Españolas. Alianza Editorial, Madrid.
  • Giuffra, E., J. M. Kijas, V. Amarger, O. Carlborg, J. T. Jeon et al., 2000. The origin of the domestic pig: independent domestication and subsequent introgression. Genetics 154 1785–1791. [PMC free article] [PubMed]
  • Hermisson, J., and P. S. Pennings, 2005. Soft sweeps: molecular population genetics of adaptation from standing genetic variation. Genetics 169 2335–2352. [PMC free article] [PubMed]
  • Innan, H., and Y. Kim, 2004. Pattern of polymorphism after strong artificial selection in a domestication event. Proc. Natl. Acad. Sci. USA 101 10667–10672. [PMC free article] [PubMed]
  • Jensen, J. D., Y. Kim, V. B. DuMont, C. F. Aquadro and C. D. Bustamante, 2005. Distinguishing between selective sweeps and demography using DNA polymorphism data. Genetics 170 1401–1410. [PMC free article] [PubMed]
  • Jungerius, B. J., A. S. Van Laere, M. F. Te Pas, B. A. van Oost, L. Andersson et al., 2004. The IGF2-intron3–G3072A substitution explains a major imprinted QTL effect on backfat thickness in a Meishan × European white pig intercross. Genet. Res. 84 95–101. [PubMed]
  • Kelley, J. L., J. Madeoy, J. C. Calhoun, W. Swanson and J. M. Akey, 2006. Genomic signatures of positive selection in humans and the limits of outlier approaches. Genome Res. 16 980–989. [PMC free article] [PubMed]
  • Kumar, S., K. Tamura and M. Nei, 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinformatics 5 150–163. [PubMed]
  • Larson, G., K. Dobney, U. Albarella, M. Fang, E. Matisoo-Smith et al., 2005. Worldwide phylogeography of wild boar reveals multiple centers of pig domestication. Science 307 1618–1621. [PubMed]
  • Lemus-Flores, C., R. Ulloa-Arvizu, M. Ramos-Kuri, F. J. Estrada and R. A. Alonso, 2001. Genetic analysis of Mexican hairless pig populations. J. Anim. Sci. 79 3021–3026. [PubMed]
  • Li, N., and M. Stephens, 2003. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165 2213–2233. [PMC free article] [PubMed]
  • McVean, G., P. Awadalla and P. Fearnhead, 2002. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160 1231–1241. [PMC free article] [PubMed]
  • Nielsen, R., C. Bustamante, A. G. Clark, S. Glanowski, T. B. Sackton et al., 2005. A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 3 e170. [PMC free article] [PubMed]
  • Nielsen, R., I. Hellmann, M. Hubisz, C. Bustamante and A. G. Clark, 2007. Recent and ongoing selection in the human genome. Nat. Rev. Genet. 8 857–868. [PMC free article] [PubMed]
  • Ojeda, A., J. Rozas, J. M. Folch and M. Perez-Enciso, 2006. Unexpected high polymorphism at the FABP4 gene unveils a complex history for pig populations. Genetics 174 2119–2127. [PMC free article] [PubMed]
  • Porter, V., 1993. Pigs: A Handbook to the Breeds of the World. Helm Information, Mountfield, UK.
  • Przeworski, M., G. Coop and J. D. Wall, 2005. The signature of positive selection on standing genetic variation. Evolution 59 2312–2323. [PubMed]
  • Ramírez, O., A. Tomás, A. Clop, O. Galmanomitogun, S. M. Makuza et al., 2006. Microsatellite and chromosome Y sequence analysis of wild boar and autochthonous pig breeds from Asia, Europe, South America and Africa. ISAG Meeting Proceedings, Porto Seguro, Brazil.
  • Rodrigáñez, J., M. Toro, M. Rodríguez and L. Silió, 1998. Effect of founder allele survival and inbreeding depression on litter size in a closed line of Large White pigs. Anim. Sci. 67 573–582.
  • Rothschild, M. F., Z. L. Hu and Z. Jiang, 2007. Advances in QTL mapping in pigs. Int. J. Biol. Sci. 3 192–197. [PMC free article] [PubMed]
  • Rozas, J., J. C. Sanchez-DelBarrio, X. Messeguer and R. Rozas, 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19 2496–2497. [PubMed]
  • Teshima, K. M., G. Coop and M. Przeworski, 2006. How reliable are empirical genomic scans for selective sweeps? Genome Res. 16 702–712. [PMC free article] [PubMed]
  • Thornton, K. R., J. D. Jensen, C. Becquet and P. Andolfatto, 2007. Progress and prospects in mapping recent selection in the genome. Heredity 98 340–348. [PubMed]
  • Toro, M., J. Rodrigáñez, L. Silió and M. Rodríguez, 2000. Genealogical analysis of a closed herd of black hairless Iberian pigs. Conserv. Biol. 14 1843–1851.
  • Van Laere, A. S., M. Nguyen, M. Braunschweig, C. Nezer, C. Collette et al., 2003. A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig. Nature 425 832–836. [PubMed]
  • Vila, C., J. Seddon and H. Ellegren, 2005. Genes of domestic mammals augmented by backcrossing with wild ancestors. Trends Genet. 21 214–218. [PubMed]
  • Wong, G. K., B. Liu, J. Wang, Y. Zhang, X. Yang et al., 2004. A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms. Nature 432 717–722. [PMC free article] [PubMed]
  • Wright, S. I., I. V. Bi, S. G. Schroeder, M. Yamasaki, J. F. Doebley et al., 2005. The effects of artificial selection on the maize genome. Science 308 1310–1314. [PubMed]
  • Yang, G. C., J. Ren, Y. M. Guo, N. S. Ding, C. Y. Chen et al., 2006. Genetic evidence for the origin of an IGF2 quantitative trait nucleotide in Chinese pigs. Anim. Genet. 37 179–180. [PubMed]

Articles from Genetics are provided here courtesy of Genetics Society of America