Phylogeography and genetic diversity of the widespread katydid Ducetia japonica (Thunberg, 1815) across China

Abstract Habitat fragmentation can lower migration rates and genetic connectivity among remaining populations of native species. Ducetia japonica is one of the most widespread katydids in China, but little is known about its genetic structure and phylogeographic distribution. We combined the five‐prime region of cytochrome c oxidase subunit I (COI‐5P), 11 newly developed microsatellite loci coupled with an ecological niche model (ENM) to examine the genetic diversity and population structure of D. japonica in China and beyond to Laos and Singapore. Both Bayesian inference (BI) and haplotype network methods revealed six mitochondrial COI‐5P lineages. The distribution of COI‐5P haplotypes may not demonstrate significant phylogeographic structure (N ST > G ST, p > .05). The STRUCTURE analysis based on microsatellite data also revealed six genetic clusters, but discordant with those obtained from COI‐5P haplotypes. For both COI‐5P and microsatellite data, Mantel tests revealed a significant positive correlation between geographic and genetic distances in mainland China. Bayesian skyline plot (BSP) analyses indicated that the population size of D. japonica's three major mitochondrial COI‐5P lineages were seemingly not affected by last glacial maximum (LGM, 0.015–0.025 Mya). The ecological niche models showed that the current distribution of D. japonica was similar to the species’ distribution during the LGM period and only slightly extended in northern China. Further phylogeographic studies based on more extensive sampling are needed to identify specific locations of glacial refugia in northern China.

I (COI-5P), 11 newly developed microsatellite loci coupled with an ecological niche model (ENM) to examine the genetic diversity and population structure of D. japonica in China and beyond to Laos and Singapore. Both Bayesian inference (BI) and haplotype network methods revealed six mitochondrial COI-5P lineages. The distribution of COI-5P haplotypes may not demonstrate significant phylogeographic structure (N ST > G ST , p > .05). The STRUCTURE analysis based on microsatellite data also revealed six genetic clusters, but discordant with those obtained from COI-5P haplotypes. For both COI-5P and microsatellite data, Mantel tests revealed a significant positive correlation between geographic and genetic distances in mainland China.
Bayesian skyline plot (BSP) analyses indicated that the population size of D. japonica's three major mitochondrial COI-5P lineages were seemingly not affected by last glacial maximum (LGM, 0.015-0.025 Mya). The ecological niche models showed that the current distribution of D. japonica was similar to the species' distribution during the LGM period and only slightly extended in northern China. Further phylogeographic studies based on more extensive sampling are needed to identify specific locations of glacial refugia in northern China.

K E Y W O R D S
Ducetia japonica, ecological niche modeling, genetic diversity, microsatellites loci, mitochondrial COI-5P, phylogeography barriers play an important role in reducing gene flow and promoting genetic differentiation of extant species (Zhang et al., 2017).
Higher dispersal propensity typically results in more gene flow and thus lower genetic structure between populations (Chamberland et al., 2020). Suitable environments with relaxed selective pressure can accommodate more genotypes and the population possesses higher genetic diversity (Jin & Liu, 2008). Habitat loss and fragmentation can lower migration rates and genetic connectivity among remaining populations of native species (Vandergast et al., 2007). Therefore, it is crucial to characterize the genetic lineages that may reflect species' adaptive potential.
Climatic oscillations during the Quaternary Period had a strong effect on the genetic diversity and distribution of extant species.
During the Pleistocene glaciations, Europe and most regions of North America were covered with ice sheets. In the same period, there were no large and continuous ice sheets in China, except in some high mountains. The Hengduan Mountains are the most important refugial region in southwest China (Li et al., 2011;Liu et al., 2012;Meng et al., 2015;Yang et al., 2008;Zhang, 2002). Recent phylogeographic studies indicated that extant species' genetic patterns at large scales are often shaped by past climate-driven range dynamics.
Most species retracted in small ice-free refugial areas in the temperate zone of the northern hemisphere during the last glacial maximum (LGM, 0.015-0.025 Mya), and following a geographic expansion during the post-LGM period (Yang et al., 2016). In southwest China, mountains form a series of parallel alpine areas over 5,000 m above sea level, and altitude drop between mountaintops and valleys exceeds 2000 m. The complex topography of southwest China led to long-term geographic isolation. Several widely distributed insects exhibit high genetic variability and strong phylogeographic structure in China (Li et al., 2019;Yang et al., 2016).
Multiple molecular markers (e.g., single-copy nuclear gene sequences, internal transcribed spacer, single nucleotide polymorphism, microsatellite loci, mitochondrial and chloroplast DNA) have been used for phylogeographic studies (Card et al., 2016;Garrick et al., 2015;Wang, Yang, Lu, Zhou, & Wu, 2017). Since single locus datasets can fail to adequately represent the overall and real lineage history of the species, increasingly more phylogeographic studies, therefore, have combined multiple molecular markers (Lu et al., 2016;Myers et al., 2013). Over the past two decades, phylogeographic datasets have become progressively larger in size (Garrick et al., 2015). The five-prime region of cytochrome c oxidase subunit I (COI-5P) has been widely used for distinguishing species, revealing cryptic species, and phylogeographic study (Lait & Hebert, 2018;Li et al., 2019;Mari-Mena et al., 2016). The use of mitochondrial DNA has decreased considerably over the years, but still continues to represent an important component of the phylogeographic toolbox (Garrick et al., 2015). Microsatellites may provide higher resolution of differentiation than mitochondrial DNA, owing to their higher mutation rates. Thus, microsatellites are usually expected to be more informative in describing the consequences of the most recent population events (Royo et al., 2007).
While phylogeographic structure of many vertebrates has been examined in China (Dai et al., 2011;Dufresnes et al., 2016;Qiu et al., 2016), insects have received much less attention despite their central ecological roles (Jiang et al., 2016;Li et al., 2019). A better understanding of current patterns of genetic structure and geographic distributions of insects across China requires more phylogeographic studies. Few phylogeographic studies have reassessed the distribution of orthopteran insects combining mitochondrial and nuclear genetic markers on a broad geographic scale (Alfaro et al., 2018;Allegrucci et al., 2005Allegrucci et al., , 2014Ma et al., 2012). The Ducetia japonica group originated somewhere in East Asia and spread from there into the North (Japan), West (India), and South (Australia) (Heller et al., 2017). In China, D. japonica group encompasses D. japonica and D. strelkovi (Kang et al., 2014), which are mainly distinguished by the venation of forewings and, especially, by a long ventral ridge of the male cerci (Heller et al., 2017). A recent study based on acoustic data and stridulatory file variation suggested that D. japonica is a complex of several distinct species (Heller et al., 2017). D. japonica is a common katydid on Ilex chinensis in urban parks. The juveniles of D.
japonica regularly visit and consume the pollinia and anther caps of Habenaria sagittifera (Suetsugu & Tanaka, 2014). D. japonica is widely distributed in China, which represents an ideal model system for the study of orthopteran population structuring on a broad geographic scale. To our knowledge, very little is known about the phylogeographic structure of D. japonica and other broad-winged katydids.
Herein, mitochondrial COI-5P and 11 microsatellite loci were used together to study D. japonica genetic diversity and phylogeographic structure based on range-wide sampling in China. Our aims are to (a) determine the phylogeographic structure and possible influential factors; (b) study the demographic history and gene flow among populations; and (c) identify the locations of putative refugia.

| Samples
In total, 796 individuals were collected from 81 populations of D. japonica ( Figure 1, Table S1). We collected from 1 to 60 individuals from each sampling site and identified them based on morphology (Kang et al., 2014). Specimens were preserved in absolute ethanol and stored at −20℃ until DNA extraction.
These populations were initially defined as 10 groups according to the zoogeographic divisions of China (Zhang, 2002)
We developed microsatellite loci specifically for D. japonica via 454 sequencing. Based on the initial analysis of amplification success rate and polymorphism, 11 microsatellite loci were amplified using fluorescent-labeled primers ( Table 1). The PCR system contained 7.5 μl 2 × Premix Taq™ (Takara Bio), 1.5 μl each primer (10 μM), 1 μl genomic DNA (~50 ng), and deionized water up to 15 μl. The amplification protocol of microsatellite loci included an initial denaturation at 94°C for 3 min, followed by 34 cycles of denaturation at 94°C for 30 s, annealing at 53-62°C (Table 1) for 40 s, extension at 72°C for 45 s, and a final extension at 72°C for 9 min. The amplification products were subjected to electrophoresis in a 2% agarose gel in TAE buffer to check whether the amplification reactions were successful. Amplified products were run on an ABI 3,730 XL Genetic Analyzer (Genewiz, China). Each reaction received 0.5 μl PCR products, 0.5 μl GeneScan™ 500 LIZ ® dye Size Standard (Thermo Fisher Scientific), and 9 μl Hi-DiTM formamide (Thermo Fisher Scientific).
Microsatellite alleles were genotyped by size using GeneMapper 4.0 (ABI) and checked manually to reduce scoring error.
For microsatellite data, the null allele frequency at each locus was calculated using GENEPOP 4.7 (Rousset, 2008). GenAlEx 6.5 (Peakall & Smouse, 2012) (Rousset, 2008 TA B L E 1 Primers used for microsatellite loci amplification deficiency and heterozygote excess. Bonferroni corrections were used to adjust critical probability values for the above multiple tests (Rice, 1989).
Analysis of molecular variance (AMOVA) and population differentiation (F ST ) were all implemented in Arlequin 3.5 (Excoffier & Lischer, 2010). AMOVAs were conducted to calculate the partitioning of genetic variation with 10,000 permutations. The pairwise genetic differences (F ST ) range from 0 to 1, where "0" indicates complete panmixia and "1" suggests a lack of migration (Chabot et al., 2015).
The conventional population F ST comparisons were measured with 1,000 permutations at a significance level of 0.05. The spatial genetic pattern was examined by spatial analysis of molecular variance (SAMOVA) using SAMOVA 2.0 (Dupanloup et al., 2002). The SAMOVA was calculated for K = 2 to 15, and the initial condition was set to 100 with 10,000 iterations. The point at which the F CT curve begins to plateau defines the final configuration of groups (K) (Heuertz et al., 2004).
Kuwayamaea brachyptera (accession number HQ609356) was used as out-group. Stationarity was considered to be reached when the average standard deviation of split frequencies was below 0.01.
The initial 25% of trees were discarded as burn-in, and the remaining trees were used to construct the Bayesian majority-rule consensus tree. Haplotype networks were derived from PopArt 1.7 using MJ algorithm (Leigh & Bryant, 2015). PERMUT 2.0 (Pons & Petit, 1996) was applied to test if the distribution of COI-5P haplotypes demonstrated significant phylogeographic structure, the estimate of genetic differentiation for phylogenetically ordered alleles (N ST ) significantly higher than population differentiation (G ST ). As PERMUT software needed at least three individuals per population, we excluded the populations with sample size less than 5 individuals.
Divergence times were estimated using COI-5P haplotype dataset with BEAST v1.10.4 . Because no fossil record was available, we assumed a divergence rate of 3.54% per million years for mitochondrial COI-5P (twice the substitution rate along one species/lineage), as previously estimated across insects for this locus (Papadopoulou et al., 2010). The divergence rate is twice the substitution rate along one species/lineage (Bensch et al., 2013). Clock.rate parameter was set as a normal prior with an initial value = 0.0177, mean = 0.0177, and SD = 0.004. The BEAST analysis was run 100 million generations with HKY + G + I model, strict molecular clock, random starting tree, the Yule process, and sampling every 10,000 steps. The first 10% of the trees were discarded as burn-in. The maximum clade credibility tree was produced in TreeAnnotator v1.10.4.
This analysis quantifies smoothness of the observed mismatch distribution, and a nonsignificant result indicates population growth (Harpending, 1994).
Bayesian skyline plots (BSPs) were used to evaluate the population size dynamics of D. japonica three major mitochondrial COI-5P lineages over time using BEAST v1.10.4 . We focus on three major COI-5P lineages (I, II, and III) because other lineages (IV, V, and VI) only contain 1 or 2 haplotypes. HKY + G + I model was determined to be the best substitution model using PhyloSuite (Zhang et al., 2020). Analyses were run using a strict molecular clock, assuming a divergence rate of 3.54% per million years (Clock.rate

| Population structure based on microsatellite data
Bayesian clustering analysis was implemented in Structure 2.3.4 (Pritchard et al., 2000) without prior structure information. All the possibilities were considered by dividing 81 populations into 10 groups, so we think the optimal number of clusters (K) will not exceed 15. Twenty runs for K = 1 to 15 were analyzed under the admixture model, correlated allele frequencies, and a burn-in of 250,000 followed by 1,000,000 Markov chain Monte Carlo (MCMC) iterations.
Structure Harvester 0.6.93 (Earl & Vonholdt, 2012) was applied to choose the optimal K-value based on the Delta K method. The 20 replicates for the chosen K-value were merged using CLUMPP 1.1.2 (Jakobsson & Rosenberg, 2007), and the final plots were generated using DISTRUCT 1.1 (Rosenberg, 2004 (bio15), and precipitation of coldest quarter (bio19). Model validation was performed using the same settings.
The accuracy of each model prediction was evaluated using AUC scores, where range from 0.5 (randomness) to 1 (exact match), and above 0.7 and 0.9 indicates good and excellent predictive power, respectively (Fielding & Bell, 1997;Ye et al., 2014). has more descendant haplotypes which also suggest it is most likely to be the ancestral haplotype (Castelloe & Templeton, 1994). The second most frequent haplotype H2 (n = 90) was seen in 8 populations from NEC and NC groups. The third most frequent haplotype, H12, was shared by 50 samples and occurred in 3 populations from NC and CC regions.

| RE SULTS
BI analysis revealed six mitochondrial COI-5P lineages ( Figure S1). The COI-5P haplotype network (Figure 2)  Mya, which lasted until recently. Moreover, it showed that none of the lineages was adversely affected by LGM (Figure 4b).

| Genetic differentiation and Mantel test on isolation by geographic distance
In this study, the maximum K2P distance of COI-5P sequences is 1.41%. The mean gene diversity (H S = 0.548 ± 0.043) was significantly lower than total gene diversity (H T = 0.961 ± 0.014).
The SAMOVA based on mitochondrial COI-5P dataset showed that F CT value clearly increased from K = 2 to 4 and then slightly   Results of the AMOVA test on mitochondrial COI-5P and microsatellite data in different populations and regional groups are shown in Table 4. For mitochondrial COI-5P, global AMOVA revealed that 21.54% genetic variation was found within populations, whereas 78.46% genetic variation was explained by differences among populations. Based on SAMOVA results, the most genetic variation was found among 4 groups (73.54%), followed by the genetic variance within populations (14.29%) and among populations within groups (12.17%). Regarding microsatellite data, global AMOVA revealed that 14.83% genetic variation was explained by differences among populations and up to 85.17% genetic variation was found within populations. Based on STRUCTURE clustering results, the most genetic variation was found within populations (84.09%), followed by the genetic variance among 6 groups (8.52%) and among populations within groups (7.40%). All fixation indices are significant (p < .001).
In both mitochondrial COI-5P and microsatellite data, population differentiation was significant (p < .05, 10,100 replicates) for the majority of pairwise comparisons. Pairwise F ST between populations based on mitochondrial COI-5P (Kimura-2P) and microsatellite data (number of different alleles, F ST -like) ranged from 0 to 1 and from 0.013 to 0.311, respectively (Table S2)

| Distribution modeling
The average test AUC value for the replicate runs is 0.883 ± 0.042 for current and 0.871 ± 0.052 for LGM distributional predictions.
These mean values showed that our models were distinct from ran-  Morphologically, five species within the D. japonica group are very similar and differ mainly in song (Heller et al., 2017). D. malayana was described from Singapore based on song unit of calling song F I G U R E 6 Predicted distribution area of Ducetia japonica under LGM and current climatic conditions (Heller et al., 2017). The Singapore population had two unique haplotypes (H158 and H159), which connected with adjacent haplotype by more than 5 mutational steps. Two closely related haplotypes (H9 and H136) were found in three populations of Taiwan island.

Source of variation
Dominant haplotype (H9) was shared with NEC and CC groups. The SAMOVAs also supported this situation.
The COI-5P haplotype distribution may not demonstrate a phylogeographic structure (N ST > G ST , p > .05). The high nucleotide diversity (π = 0.023) of DDH population can be explained by the existence of H117 and H119. High Hd and low π were observed in most populations, which probably resulted from rapid population growth of ancestral populations (Avise, 2000;Yu et al., 2014). The high genetic diversity might reflect a long evolutionary history and wide distribution of species (Li et al., 2019). Biogeographic barriers (such as rivers, mountains, and deserts) separate populations, impede gene flow, drive genetic differentiation, and eventually lead to allopatric speciation. The Hengduan Mountains are the most important refugial region in China (Liu et al., 2012).
D. japonica is a long-winged species and one of the most widespread katydids found from Japan to Australia (~5,000 km) and Pakistan to Solomon Islands (~10,000 km) (Heller et al., 2017). Mantel tests revealed a significant positive correlation between genetic differentiation and geographic distance based on both mitochondrial COI-5P and microsatellites. Population differentiation tends to be closely related to geographic distances (Wu et al., 2019). Dispersal ability has an important impact on genetic diversity of the species.
With its long fore and hind wings, D. japonica group occurred on numerous islands, which is evidence to support its longer distance dispersal capabilities. The distribution on islands may be the result of different circumstances, such as the species' arrival occurred when islands and mainland were not yet separated or accidental introduced by humans. Island populations were generally characterized as having lower genetic variation (Liu et al., 2020). Taiwan (Wares, 2010).
Three major lineages of D. japonica were formed before the LGM.
Lineage I showed a stable population size trend over time. Both lineage II (0.09 Mya) and lineage III (0.05 Mya) expansion may have occurred earlier than the LGM. Three major lineages were nearly at a stable population size during LGM, and it does not appear that any lineages were adversely affected by climatic changes. High levels of lineage diversity and signals of greater demographic stability suggest a lower impact of glaciations (Alfaro et al., 2018).
Many extant species persisted in small ice-free refugial areas during LGM in the temperate zone of the northern hemisphere (Yang et al., 2016). As the average test AUC value was higher than 0.85 for both current and LGM climate conditions, we assumed that the models generated reliable predictions. Hence, we be- factors and its host plants (Fresia et al., 2013;Li et al., 2019).
Climatic and habitat changes induced by multiple ice age cycles during the Pleistocene in Europe must have had an impact on the distribution of the grasshopper Oedaleus decorus (Kindler et al., 2012).
Given that the fossil record for D. japonica is unknown, phylogeographic methods based on divergence rates of genetic loci were used to date the lineage divergences and investigate possible signs of demographic events. Small population numbers, uneven sampling, and a single genetic marker-based analysis hampered the interpretation of the results. However, genome-wide single nucleotide polymorphisms (SNPs) and larger population sample sizes could potentially reveal more fine-scaled patterns. In summary, our results indicated that D. japonica in mainland China remained almost stable over long periods of time. Although our present study provides some insight for the D. japonica genetic patterns, the study for D. japonica does not stop here. Further phylogeographic studies based on more extensive sampling would be needed to identify specific locations of glacial refugia.

ACK N OWLED G M ENTS
This study was supported by Natural Science Foundation of Hebei Province (C2018201128). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. We would like to thank Elsevier (https://cn.websh op.elsev ier.com) for English-language editing. We would also like to thank Ming Kai Tan (National University of Singapore) and Powei Chen (National Taiwan University) for helping us collect specimens.

CO N FLI C T O F I NTE R E S T S
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential competing interests.

AUTH O R CO NTR I B UTI O N S
ZZJ: Conception of the ideas. ZYX, GB, ML, and WWJ: Conduction of the fieldwork and collection of the data. ZZJ, ZYX, and GB: Analysis of the data. ZZJ and ZYX: Preparation of the manuscript.

DATA AVA I L A B I L I T Y S TAT E M E N T
Mitochondrial COI-5P haplotype sequences are available at