Spatial Genetic Structure and Demographic History of the Wild Boar in the Qinling Mountains, China

Simple Summary The wild boar is native to the temperate region of Eurasia, which is now one of the most widely distributed mammals worldwide. The recent expansion in the wild boar population has attracted a lot of attention, which may cause great damage to ecosystems. Elucidating the patterns of the population structure, genetic diversity, population origin, and colonization route of wild boar is very helpful in the conservation and management of wild populations. Phylogeographic analysis has proven to be a powerful tool. Here, 82 samples of wild boars in 16 sampling locations were collected in Qinling Mountains (QM). Genetic analysis was conducted based on the mitochondrial control region and nuclear genes. The level of genetic diversity of wild boars in QM was lower than the total population in East Asia, but higher than European population. No obvious phylogeographic pattern were found. The effective population size was under demographic equilibrium in the past. Abstract Species dispersal patterns and population genetic structure can be influenced by geographical features. Qinling Mountains (QM) provide an excellent area for phylogeographic study. The phylogeography of Asian-wide wild boars revealed the colonization route. However, the impact of the QM on genetic diversity, genetic structure and population origin is still poorly understood. In this study, genetic analysis of wild boar in the QM was conducted based on the mitochondrial control region (943 bp) and twelve microsatellite loci of 82 individuals in 16 sampling locations. Overall genetic haplotype diversity was 0.86, and the nucleotide diversity was 0.0079. A total of 17 new haplotypes were detected. The level of genetic diversity of wild boars in QM was lower than in East Asia, but higher than in Europe. Phylogenetic analysis showed the weak genetic divergence in QM. Mismatch analysis, neutrality tests, and Bayesian Skyline Plot (BSP) results revealed that the estimates of effective population size were under demographic equilibrium in the past. Spatial analysis of molecular variance indicated no obvious phylogeographic structure.


Introduction
The Qinling Mountains (QM) are an important geographic and ecological barriers between northern and southern China, which were formed during the Mesozoic and Cenozoic periods extending over 1600 km from east to west in the southern Shaanxi province of China [1]. With limited Late Pleistocene ice-cover, QM provides a good research area for assessing the effect of Pleistocene climatic fluctuations and geologic events on population genetic structure [2]. Because of its diverse habitats and complex topography, the QMs contribute to the biodiversity of the Eastern Asian species by harboring many Animals 2021, 11, 346 2 of 13 threatened and endangered animals, such as the golden monkey and giant panda [3,4]. In recent years, many phylogeographical studies discovered that the QM probably served as both a major geographical and ecological barrier for plants and animals [5][6][7]. However, some other results suggested that the QM have no obvious impact on genetic structure [8]. More cases with phylogeographic patterns are needed to make generalizations about the QM impacts on the genetic structure and demographic history of species.
The wild boar (Sus scrofa) is a temperate species with diverse groups, and a widespread mammal currently distributed in Europe, North Africa and Asia [9]. In the European wild boar, a latitudinal gradient pattern of intraspecific genetic diversity was observed, despite the fact that human disturbance is common in this species [10][11][12][13]. The current distribution of the mitochondrial genetic pattern of European wild boar was highly influenced by past climatic fluctuations, especially in the Last Glacial Maximum [14,15]. The Southern areas acted as genetic reservoirs in glacial times, and northern areas were mainly recolonized from multi-refugia. The phylogeographical pattern of mtDNA in the European wild boar may result from leading-edge expansion and density-dependent migration processes [11,15].
In East Asia, many previous studies have focused on the origin and the recolonization history of domestic pigs [16,17]. The phylogeography of the Asia-wide wild boar revealed that the wild boar migrated from South-East Asia to South Asia, followed by migration to East and West Asia [18]. Geographical barriers, such as the complex topography of mountains and islands in east and central China, long-term gene flow and population history, play an important role in driving the formation of complex spatial genetic structure of wild boar in East Asia [19]. On a local scale, both Japanese and Korean wild boars are well studied, which has revealed the diverse pattern of genetic diversity and genetic differentiation among wild boar [20,21]. The genetic structure from the QM differed from those from other localities in terms of mitochondrial marker [19]. To date, we have no information on the genetic divergence or population structure of East Asian wild boars in QM, and the population origin, population genetic structure and the degree of admixture have not been well studied.
In this study, we used multiple molecular markers, including the maternally inherited sequences (the complete mtDNA control region, with 943 bp in length), and bi-parentally inherited microsatellite loci (twelve loci), to clarify the spatial genetic structure and historical demography of the wild boar distributed in historically glaciated regions in the QM. Our objectives are as follows: I. to evaluate the genetic status and population genetic diversity in QM; II. to examine the genetic structure of Sus scrofa in QM area; III. to discuss the possible origin of wild boar populations in QM by combining our data with the East Asian wild boar mtDNA data from GenBank.

Sampling and DNA Extraction
Our experimental procedures complied with the current laws on animal welfare and research in China, and were specifically approved by Nanjing Normal University's Animal Care and Use Committee (Permit # IACUC-20201206). A total of 82 samples (muscle or ear samples) were collected from 16 localities from 2015 to 2017. According to their geographical distances, topologies and previous phylogeography studies in QM, the sampling localities were grouped into five geographical regions (Table 1, Figure 1).   Table 1 for details).
Samples were stored at −80 • C in our laboratory at School of Life Science, Nanjing Normal University. Total genomic DNA was extracted using commercial DNA isolation kit (QIAGEN, Hilden, Germany) based on the manufacturer's protocol.

DNA Sequencing and Microsatellite Genotyping
Almost the entire control region (CR) was amplified by polymerase chain reaction (PCR) using two primers [22]. The detailed PCR methods are introduced in Hu et al. (2020). PCR products were then purified and sequenced using the forward primer on an ABI 377 sequencer. Sequences were aligned and checked in MEGA X [23]. Twelve microsatellite loci (Sw632, S0355, S0101, Sw72, IGF, S0005, S0226, S0143, S0225, S0026, S0155, and S0227) were selected, which were previously developed by the International Society of Animal Genetics (ISAG) and Food and Agriculture Organization of the United Nations (FAO) for swine biodiversity studies [24]. PCR products were amplified with fluorescently labeled forward primers (FAM, HEX, TAMRA) and PCR conditions and annealing temperatures followed the sets in the original publications. Amplified products were visualized on an ABI 3730 semiautomated sequencer (PE Applied Biosystems) with internal size marker GeneScan-500 ROX (Applied Biosystems), and the allele data were scored by GeneMarker 2.2.0 (SoftGenetics, LLC., State College, PA, USA).

Genetic Diversity
For mitochondrial DNA data, haplotype diversity (h), nucleotide diversity (π), and the number of haplotypes were calculated using the program DnaSP 6.12.03 [25].

Phylogenetic Analysis
Two datasets were assembled for phylogenetic analyses: (1) The first dataset used for study of the population genetic structure in QM. The newly sequenced CR (943 bp) were obtained in this study. (2) The second dataset, combined with 680 sequences downloaded from the previous study [19] with 511 bp in length to infer the possible origin of the QM wild boar. The sequences were aligned by using Clustal W implemented in MEGA X [23]. Data format conversion was performed using DnaSP 6.12.03 [25]. The most appropriate model of molecular evolution was selected using MrModeltest 2.3 [29]. For phylogenetic analyses, Bayesian inference (BI) was performed by using the MrBayes 3.2.1 [30]. BI were carried out using the GTR + G model of sequence evolution and two independent runs of four Karkov chains (one cold and three heated) over 5 × 10 7 generations, and every 1000 generations were sampled. The first 25% of sampled trees and estimated parameters were discarded as burn-in. The remaining trees were used to calculate 50% majority rule consensus tree and Bayesian posterior probabilities.

Population Structure and Demographic History
Fu's Fs and Tajima's D neutrality tests were carried out in order to detect evidence of recent population expansion. Fu's Fs [31] and Tajima's D [32] statistics were performed in Arlequin 3.5 [27]. Significance of deviation from neutrality by both Fs and D indices were assessed by 1000 coalescent simulations.
To infer detailed characteristics of demographic history, we used the Bayesian Skyline Plot (BSP) model by using BEAST 1.8.4 [33]. The molecular clock of 3% substitutions per site per million years was used, which was obtained from the evolutionary rates of mtDNA estimated in previous studies [34,35]. We performed runs with a Markov Chain Monte-Carlo (MCMC) chain length of 1 × 10 8 generations, and trees were sampled every 1000 generations. This analysis was run in the following parameters: GTR + G substitution model without site heterogeneity, relaxed log-normal molecular clock model and tree prior: coalescent Bayesian skyline, with 10 groups and a piecewise-constant skyline model. The first 25% of the generations were discarded as burn-in. All analyses were run to achieve an effective sample size (ESS) at least 200 for all estimated parameters. The effective population size for the posterior distribution of the estimated parameter values was determined using Bayesian skyline plot analysis with the stepwise (constant) model, using TRACER ver. 1.7.1 [36]. STRUCTURE 2.3.2 was used to identify genetically distinct clusters by microsatellite genotypes data, following the admixture model without prior population information [37]. Ten independent runs were carried for different values of K from 1 and 10, the MCMC were run for a total of 5 × 10 8 iterations with discarding the first 5 × 10 7 iterations as burn-in. Individual assignment probability, Ln P(D) and convergence between runs were used to assess the most likely value of K. The most likely number of clusters was estimated according to Evanno et al. [38].

Genetic Diversity
A total of 78 novel mitochondrial CR sequences were sequenced in this study. After deleting the repeat sequences, we obtained a sequence of 943 base pairs (bp) of the CR, which contained seven singleton variable sites and 21 parsimony informative sites. There are 25 transition sites and three transversion sites, and no indels were observed. The base frequencies were 0.342, 0.263, 0.140 and 0.255 for A, C, G and T, respectively. In total, 17 haplotypes were identified (GenBank accession no. MW417125-MW417141), in which eight haplotypes were only from the single sampling site and nine were shared between sampling sites. Hap3 was the most common haplotype that shared by 24 individuals and exhibiting the widest geographical distribution in nine sampling sites (Table 1). There are three haplotypes (Hap 1, 4, and 7) were shared by five sampling sites, and three (Hap 9, 11, and 8) were shared by two sampling sites.
Diversity indices, haplotype diversity (h) and nucleotide diversity (π), are shown in Table 2. For the wild boar in QM, h was 0.860 (±0.026), and ranged from 0.20 (MQL) to 0.86 (Bashan), and π was 0.0079 (±0.0007), and ranged from 0.0006 (MQL) to 0.0076 (Micang), suggesting a moderate haplotype and low nucleotide diversity. For microsatellite data, 82 individuals were genotyped at 12 loci. Genetic variation at these loci was relatively high. The number of alleles per locus ranged from 8 to 14, with a mean of 10.59. The mean observed heterozygosity was 0.53, and the mean expected heterozygosity was 0.67 (Table 3).

Phylogeography and Population Genetic Structure
For the first dataset, we obtained 17 haplotypes with 943 bp in length. The phylogenetic tree was estimated with the best-fit model GTR + G (−lnL = 1950, p < 0.001, BIC = 4298) based on the mtDNA CR with two outgroups (S. barbatus and Phacochoerus africanus). BI analyses produced the topologies with two clades corresponding to Clade I and Clade II, based on mitochondrial haplotypes (Figure 2a). However, the posterior probability of Clade I was relatively lower, with only 0.72. Accordingly, there is no clear genetic divergence in this phylogenetic tree. Haplotypes 6 and 9, located in the basal position of the phylogenetic tree, were from the Clade I, while the basal Haplotypes 3, 5 were distributed in the Clade II (Figure 2a). The median-joining haplotype networks demonstrated a similar phylogeographical pattern to the Bayesian phylogenetic tree (Figure 2b).  Table 1 for definitions for EQL, MQL, WQL, Micang and Bashan.
For the second dataset, we obtained 174 haplotypes with 511 bp in length. The newly sequenced 78 sequences produced 13 haplotypes in this dataset, and only two more haplotypes (Hap 173 and Hap 174) obtained than previous studies (Figure 3) [19]. The sampling localities were grouped into six geographical regions according to their geographical distances and topologies in East Asia (Figure 3c). The results revealed that the phylogenetic tree is defined by a general clade and large polytomy clades (Figure 3). The haplotypes from QM were mainly distributed in the Clade B and general clade, with only one haplotype distributed in Clade E (Figure 3). lotypes (Hap 173 and Hap 174) obtained than previous studies (Figure 3) [19]. The sampling localities were grouped into six geographical regions according to their geographical distances and topologies in East Asia (Figure 3c). The results revealed that the phylogenetic tree is defined by a general clade and large polytomy clades (Figure 3). The haplotypes from QM were mainly distributed in the Clade B and general clade, with only one haplotype distributed in Clade E (Figure 3).   For the microsatellite data, we performed the Bayesian assignment analysis to detect population genetic structure. Although no obvious maximum log likelihood of posterior probability was found [LnP (X/K) = −4085.17] (Figure 4a), the entire dataset yielding the highest ∆K statistics output showed a clear maximum at K = 2 with ∆K = 68 (Figure 4b). When K = 2, the diagram showed that some individuals were assigned to two groups with the membership coefficient 40-60% probability (red or green), indicating moderate admixture (Figure 4c). The samples revealed unclear genetic divergence in the microsatellite data.  Table 1 for definitions of EQL, MQL, WQL, Micang and Bashan.

Historical Demography
We analyzed neutrality tests and goodness of fit to a simulated population expansion and raggedness index of the Clade I, II and the total samples with the mitochondrial DNA data (Table 4). Non-significant negative Tajima's D and Fu's Fs values were obtained for these three sampling pools ( Table 4). The significant raggedness index and the mismatch distribution deviate significantly from the expected distribution under a sudden expansion model, which indicated no demographic expansion.
The BSP based on coalescent theory simulated the fluctuation in populations and showed detailed demographic histories for both clades ( Figure 5). ESSs were generally high (>200) for all parameters, indicating good MCMC mixing in the combined chains. For the total samples, the result revealed that the estimates of effective population size were under demographic equilibrium in the past, with a slight increase, which was then followed by an apparent decline in population size dating from approximately 350 years ago. n, number of individuals; N, number of haplotypes; h, haplotype diversity; π, nucleotide diversity. SSD, sum of square deviation (goodness-of-fit to a simulated population expansion); Raggedness, raggedness index. (* p < 0.05, *** p < 0.001).

Discussion
In this study, a genetic analysis of wild boar in the QM was conducted based on the mitochondrial DNA control region (943 bp) and twelve microsatellite loci of 82 individuals in 16 sampling locations. Overall genetic haplotype diversity is 0.86, and the nucleotide diversity is 0.0079. A total of 17 haplotypes were detected, and the genetic diversity was moderate in East Asia. For mitochondrial data, phylogenetic analysis showed a low genetic divergence of wild boar in QM, and there is no obvious phylogeographic pattern in those areas. Mismatch analysis, neutrality tests, and BSP results revealed that the estimates of effective population size were under demographic equilibrium in the past. For microsatellite data, STRUCTURE identified that most individuals showed admixed genetic structure and revealed no obvious genetic divergence in the microsatellite data.
QM plays on important geographical barriers for the species with low dispersal ability. For amphibians, the study of stream salamander Batrachuperus tibetanus revealed three lineages corresponding to the northwestern, eastern and western lineages, which was likely caused by orogenesis of the QM during the late Cenozoic [7]. For reptiles, similarly, the orogeny launched the independent lineage divergence between the northern and southern lineages of endemic Chinese gecko, Gekko swinhonis in the QM [5]. For mammals, bayesian analysis of the golden snub-nosed monkey based on 20 microsatellite loci revealed three major clusters which strongly coincide with the major topographical ridge (Main ridge and Huanglongliang ridge) features in the QM, suggesting that gene flow between clusters may be restricted by geographical barriers [3]. However, the study of the golden takin populations based on mitochondrial DNA suggested that no significant geographic genetic diversity across different populations within the QM [8].
In present study, the complete control region of mitochondrial DNA data was used to analysis the genetic diversity and structure of the wild boar in QM. The mtDNA genetic diversity was obvious lower than total of East Asian wild boar, but higher than European wild boar [13,19]. For mitochondrial data, phylogenetics analysis supported that the wild boars in QM may have originated from multiple different genetic clades (Figures 2 and 3), and there is no clear geographical pattern among them ( Figure S1). The moderate haplotype diversity and the low nucleotide diversity of the wild boar in QM may be attributed to the relative recent demographic histories. Wild boar populations possibly originated in the island of Southeast Asia, dispersed to East Asia, then radiated into the Indian subcontinent, and finally, spread across East China into North Asia [39]. The colonize route explains the high degree of genetic diversity in the Southeast Asia. The genetic diversities of most sampling pools in East Asia was higher than the European wild boar [19].
Here, we may infer that the wild boar in QM derives from the complicated population history and population expansions in East Asia ( Figure 3). In East Asia, environmental changes seemed to be moderate in subsequent climate oscillations [40]. The wild boar has shown successful survival in a wide variety of habitats, with the possible existence of multirefugia in the inland areas during the Late Pleistocene. Wild boars are widely distributed in East Asia during interglacial period. During glacial episodes, wild boar retreated to refugia. However, the QM are located in an important geographical position with complex topography and variable climates and habitats, and considered to be the center of glaciation in the QM, with limited Late Pleistocene ice-cover [41]. It is likely that wild boar in this area persisted at a relatively stable population size throughout glaciation, and the ecological plasticity may have enhanced its survival [42]. After subsequent warming, the population from refugia dispersed into Qinling areas. The wild boar of East Asian expanded during the interglacial period meet with native populations at QM. The possible gene flow among local populations resulted in the admixture of genetic structure and no obvious phylogeographic structure (Figures 3 and 4; Figure S1). No significant geographic genetic divergence was found across different sampling localities within the QM (Figures 2 and 4). We infer that population admixture and the population history of founder effect played important roles in shaping the spatial genetic structure within wild boars in QM.
We used 12 microsatellite loci to study the genetic diversity and population genetic structure of wild boar in QM. We revealed a moderate level of genetic diversity and a relatively large effective population size, suggesting a relatively high gene flow between populations in QM. In this study, the observed heterozygosity was obviously lower than the expected heterozygosity. Such a lack of heterozygosity may be attributable to the admixture of gene pools from different sources [43]. For microsatellite data, Bayesian assignment analysis revealed the admixed genetic clusters in QM, and the region-specific clusters were not clear. Bayesian assignment analysis and spatial genetic analysis in this study revealed that multiple distinct genetic reservoirs exist in Qinling wild boar, and there may be at least two genetically divergent ancestral sources in QM. The Bayesian spatial genetic structure analysis indicated that the clusters in each sampling sites has changed gradually. Some individuals were assigned to two groups with the membership coefficient 40-60% probability (red or green), indicating moderate admixture (Figure 4). These results indicate the gradual changes in the degree of genetic admixture of the multiple ancestral sources. There is no significant phylogeographic pattern across sampling localities within the QM.

Conclusions
Our results indicated that the genetic diversity of wild boars in both mitochondrial and microsatellite data of Qinling Mountains was moderate in East Asia, but higher than European wild boar. Phylogenetic analysis showed that there is no obvious phylogeographic structure. The effective population size was under demographic equilibrium in the past. We infer that population admixture and two-times colonization history of founder effect played important roles in shaping the spatial genetic structure of wild boars in QM. This study could be used as a case model for the research into the genetic diversity, genetic origin, and admixture genetic structure of wild animals.
The numbers indicate locality codes (see Table 1 for details). Pie charts represent proportions of each of the two mtDNA clades (I and II) in each sampling site (the phylogenetic clades are given in Figure 2a).  Institutional Review Board Statement: The study was approved by the Nanjing Normal University's Animal Care and Use Committee (protocol code #IACUC-20201206; 2020-01-10).

Informed Consent Statement: Not applicable.
Data Availability Statement: Mitochondrial DNA sequence data have been submitted to GenBank (MW417125-MW417141). Microsatellite data presented in this study are available on request from the corresponding author.