Rapid host switching of Wolbachia and even more rapid turnover of their phages and incompatibility-causing loci

About half of all insect species carry maternally inherited Wolbachia alphaproteobacteria, making Wolbachia the most common endosymbionts known in nature. Often Wolbachia spread to high frequencies within populations due to cytoplasmic incompatibility (CI), a Wolbachia-induced sperm modification caused by prophage-associated genes (cifs) that kill embryos without Wolbachia. Several Wolbachia variants also block viruses, including wMel from Drosophila melanogaster when transinfected into the mosquito Aedes aegypti. CI enables the establishment and stable maintenance of pathogen-blocking wMel in natural Ae. aegypti populations. These transinfections are reducing dengue disease incidence on multiple continents. While it has long been known that closely related Wolbachia occupy distantly related hosts, the timing of Wolbachia host switching and molecular evolution has not been widely quantified. We provide a new, conservative calibration for Wolbachia chronograms based on examples of co-divergence of Wolbachia and their insect hosts. Synthesizing publicly available and new genomic data, we use our calibration to demonstrate that wMel-like variants separated by only about 370,000 years have naturally colonized holometabolous dipteran and hymenopteran insects that diverged approximately 350 million years ago. Data from Wolbachia variants closely related to those currently dominant in D. melanogaster and D. simulans illustrate that cifs are rapidly acquired and lost among Wolbachia genomes, on a time scale of 104–105 years. This turnover occurs with and without the Wovirus prophages that contain them, with closely related cifs found in distantly related phages and distantly related cifs found in closely related phages. We present evidence for purifying selection on CI rescue function and on particular Cif protein domains. Our results quantify the tempo and mode of rapid host switching and horizontal gene transfer that underlie the spread and diversity of Wolbachia sampled from diverse host species. The wMel variants we highlight from hosts in different climates may offer new options for broadening Wolbachia-based biocontrol of diseases and pests.


Recombination
We detected little evidence for recombination among our 20 wMel-like Wolbachia presented in Fig. 1A (GARD: 0/411 genes, PHI test: 0/411 = 0%, Max χ 2 test: 3/411 = 0.7%, Neighbour Similarity Score: 0/411 = 0%).To test for recombination among more diverged variants, we completed a separate analysis that included wMel and wRi, three other group-A variants (Wolbachia associated with Andrena hattorfiana, Anoplius nigerrimus, and Apoderus coryli), and B-group wMau (Meany et al. 2019;Vancaester and Blaxter 2023).Consistent with previous analyses of recombination among distantly related Wolbachia variants (Werren and Bartos 2001;Jiggins et al. 2001;Baldo et al. 2006), GARD found evidence of recombination in 728 genes, with recombination in more than half of these genes also supported by the tests implemented in PhiPack (PHI: 418/728, Max χ 2 : 485/728, NSS: 520/728).Recombination was detected by all three tests in 353 genes.These 353 genes include 245 with a gene region where wMel and wRi are sister.Of these, 7 have a region where wMel and wMau are sister, 25 have a region where wRi and wMau are sister, 37 have a region where wMel and A. nigerrimus Wolbachia are sister, and 38 have a region where wRi and A. nigerrimus Wolbachia are sister.Our new analyses generalize prior results that used only a few genes and found support for pervasive recombination between relatively diverged Wolbachia (Werren and Bartos 2001;Jiggins et al. 2001;Baldo et al. 2006), while confirming that our criteria for selecting genes to produce phylograms indirectly selects for genes with little evidence for recombination.
Recombination between group-A and group-B Wolbachia could potentially influence estimated phylograms and chronograms.To test this hypothesis, we searched for genes with no evidence of recombination using the 15 Wolbachia we previously analyzed (Meany et al. 2019).Of 168 single-copy and equal-length genes, we discarded 14 that were shorter than 300 bp.All three statistical tests found no evidence for recombination in 51 of the remaining 154 genes.We used these 51 genes to re-estimate a phylogram and absolute chronogram with RevBayes 1.1.1.Four independent runs of each were performed and all agreed.With the exception of the placement of wAlbB (Fig. S1A), our phylogram estimated using only the 51 genes with no evidence for recombination recapitulates the topology estimated by Meany et al. (2019) (Fig. S1B).Our estimate of A-B group divergence time using only these 51 genes (~7-38 MYA) also overlaps with our previous relaxed-clock analyses with branch-rate priors Γ(7,7) (~8-46 MYA) and Γ(2,2) (~6-36 MYA), as well as with our strict-clock analysis (~12-64 MYA) (Meany et al. 2019).As expected, using our new calibration the estimated divergence times are longer for both the 51 gene set (~32-165 MYA) and the original Meany et al. (2019) gene set (~42-196 MYA); but again divergence estimates are overlapping.This confirms that our previously estimated phylograms and chronograms are insensitive to the inclusion of genes with evidence of recombination.

Inferences from chronograms
Inferences from these chronograms are limited with only one available Wolbachia sequence from most of these hosts.For instance, we cannot determine which Wolbachia infection was acquired first, whether two sister tips on our Wolbachia chronograms correspond to one of those hosts acquiring its Wolbachia (by either introgression or non-sexual horizontal transmission) from the host of the sister Wolbachia variantor whether one or both Wolbachia were recently acquired from an unsampled intermediate host.Turelli et al. (2018) previously proposed that wRi in D. simulans was acquired horizontally from D. ananassae, because multiple Wolbachia sequences were available from each species, and the wRi sequences formed a clade nested within a paraphyletic cluster containing the Wolbachia from D. ananassae.As noted by Richardson et al. (2012), divergence times for extant Wolbachia variants in a host species may be much shorter than the duration of its current Wolbachia variant, because selective sweeps and genetic drift will eliminate sequence variation that indicates the duration of associations.Fig. S2.illustrates the limited inferences about Wolbachia divergence and acquisition possible from individual Wolbachia sequences from various hosts.We consider four hypothetical host species with horizontally acquired Wolbachia.Host 1 acquires its Wolbachia, denoted w1, at time T1, and we denote by Ti(j » i) the time at which host i acquires its Wolbachia, denoted wi, from host j.If we postulate T1 > T2(1 » 2) > T3(1 » 3) > T4(2 » 4), the resulting chronogram is Fig.S2.
The chronogram provides no clue that variant w1 was acquired first or that w2 was acquired from host 1.Thus, we can infer only that all but one of the variants shown in Fig. 1A were acquired in the last 263-1.2MY and that a transfer between a dipteran and a hymenopteran host occurred within the last 184-817 KY, possibly mediated by intermediate hosts.Moreover, the chronogram Fig. 1A is consistent with many of these hosts, for which we do not have multiple Wolbachia sequences, acquiring their current Wolbachia very recently, followed by spatial spread as observed for Wolbachia in several hosts (Turelli et al. 2018).

Correction of Turelli et al. (2018) concerning the acquisition of wSpc
The analyses presented in our text deal with only a single Wolbachia sequence from each host species.As noted in Fig. S2, single sequences do not allow us to determine the direction or temporal order of Wolbachia acquisitions.In contrast, Turelli et al. (2018) had multiple Wolbachia sequences from several species.Specifically, the wRi sequences from D. simulans formed a clade nested within a paraphyletic cluster of wAna sequences from D. ananassae.
Given that D. ananassae and D. simulans span the D. melanogaster species group, which diverged on the order of 47 MYA (see Fig. 1C), introgression is impossible.Hence we conjectured that horizontal non-sexual transmission of Wolbachia occurred between D. ananassae and D. simulans, with D. ananassae as a plausible source.Turelli et al. (2018, Fig. 1) also analyzed eight Wolbachia sequences from D. suzukii (wSuz) and one D. subpulchrella (wSpc).Two copies of wSuz from Asia (the other wSuz samples came from North and South America) formed a clade with the single wSpc, sister to the other six wSuz sequences.We used these data to conjecture that D. suzukii was a plausible source of the Wolbachia in D. subpulchrella.However, the two putative Asian wSuz sequences, obtained from GenBank, were in fact derived from D. subpulchrella specimens (China-AA27 and Korea-AA7).This misidentification was first kindly pointed out to us by Mathieu Gaultier, then confirmed by Joanna Chiu.With this correction, the data in Fig. 1 of Turelli et al. (2018) produce sister clades of six wSuz and three wSpc, providing no information on the direction of transfer, as illustrated in Fig. S2.
First, sr2WO Wovirus form two clades: a clade observed in all wRi-like Wolbachia, and a clade observed in the wMel-like wSYTZ Wolbachia clade (MRCA 54-353 KYA) and in distantly related wMel-like wTris.While the history of sr2WO cannot be fully resolved from these data, we hypothesize that sr2WO in the wRi-like clade, the wSYTZ clade, and wTris represent independent acquisition events that occurred after wMel-like and wRi-like Wolbachia divergence (MRCA 263-1.2MY).The absence of sr2WO in wRec-sister to wTris-implies gain or loss of srWO2 may have occurred as recently as 101-762 KYA. Figure S5A presents these patterns.
Second, each wRi-like strain also contains closely related sr3WO Wovirus, with duplications observed in in wRi, wAna, and wSuz Wolbachia genomes.These sr3WO Wovirus observed in wRi-like Wolbachia form a clade that is sister to an sr3WO clade observed in several wMel-like variants: the four wSYTZ strains; sister (wSeg,wMal); a clade of three strains (wSbr, (wAra, wSpa)); as well as wBocq, wBor, and wDal.The absence of this haplotype in wAch indicates gain or loss since its divergence from wBocq in the last 14-106KY.Similarly, absence from wAu, and wTro indicates gain and/or loss since their divergence from wBor in the last 179-790KY.We observe an additional sr3WO haplotype that is absent from wBor but present in its 7 closest relatives, implying loss of this haplotype in the last 179-790KY.Fig. S5B presents these patterns, including our observation of another closely related copy of sr3WO Wovirus present in all wMel-like Wolbachia (MRCA 263-1.2MYA).
cif turnover among phages Cooper et al. (2019) found that a single sr3WO Wovirus in wSTY contain both Type 1 and Type IV cif operons, whereas closely related wMel does not contain a Type IV cif.This discovery has been confirmed with more complete wSTY assemblies (Baião et al. 2021).Thus cifs clearly move in and out of Woviruses, precluding concordant cif and Wovirus phylogenies.We used our estimated phylogenies for sr3WO haplotypes (sr3 alleles) and cifA[T1] alleles from Fig. 4 to assess how commonly cifs move among Woviruses.Horizontal cif movement among phages will produce discordance between the phylogenies estimated from sr3 alleles versus cifs.The SH and AU tests assess discordance by determining whether data from one set of markers is compatible with the phylogeny estimated from the other.As shown in Fig. 3, there is clear discordance in the topologies of cifA[T1] and sr3WO among the wMel-like variants.The SH and AU tests assign P < 10 -6 to the fit of the more resolved sr3WO data to the less resolved cifA[T1] tree in Fig. 3.

Origin and movement of cif operons in wSYT
The primary mode of cif movement is difficult to ascertain without a comprehensive comparative analysis of fully contiguous genomes and associated plasmids.However due to cifs' imperfect association with phages, and their proximity to IS elements, cif horizontal transfer may plausibly occur via phage-mediated and phage-independent mechanisms.Cooper et al. (2019) hypothesized that the Type IV cifs in wYak (and more broadly, wSYT) could have entered a phage that already carried Type I cifs via homologous recombination with extrachromosomal DNA excised by IS elements.This proposal was disputed by Baião et al. (2021).Using better assemblies, they proposed that phage probably delivered Type IV cifs into the ancestor of the wSYT clade and wMel that were then subsequently deleted in wMel.Their proposal is based on the phylogenetic relationship of octomom WD0513 homologs.Their hypothesis is implausible for several reasons that become obvious when assessing their assemblies.
First, the Type IV cifs in wSYT and divergent B-group wPip are < 3% diverged, while the surrounding phage genes in wYak and wPip are ~15% diverged (see Cooper et al. 2019, Figure 5).This points to a common donor for the Type IV cifs in wYak and wPip that is distinct from the phages that currently contain them.The highly diverged (~15%) phage sequences between wYak and wMel following the cifAB genes Cooper et al. (2019, Fig. 5) clearly indicate that this transfer did not occur in the ancestor of wSYTM followed by deletion from wMel, as proposed by Baião et al. (2021).Baião et al. (2021) note that the divergence of the WD0513 gene between wMel and wSYT is "much higher than most other parts of the genomes" (~15%) but still conclude that this Octomom associated gene is orthologous to the copy in wSYT.This proposal is unreasonable, as most of the wMel and wSYT genomes are <1% diverged.Given that the 15% divergence for WD0513 between wMel and wSYT is comparable to the divergence for the phage-associated genes flanking the Type IV cifs in wSYT, it is most plausible that WD0513 was transferred with these phage genes and the Type IV cifs from an unknown donor.Incidentally, WD0513 is immediately flanked by a transposon Baião et al. (2021, Fig. 2A) which, once again, makes it difficult to ascertain whether the mechanism providing extrachromosomal DNA for homologous recombination is phage-, transposon-, or plasmid-mediated.Our finding of discordance between sr3 and associated cifA[T1] phylograms (Fig. 3) generalizes our initial discovery of phage-independent cif transfer (Cooper et al. 2019).S7.

Fig. S1 .
Fig. S1.Recombination does not influence estimated relationships or divergence times.(A) An estimated phylogram for group-A (red) and group-B (blue) Wolbachia strains using only the 51 genes with no evidence of recombination.For this analysis, we follow the approach of Meany et al. (2019) as described in our Supplementary Methods.In the main text we also report the divergence of group-A and B Wolbachia estimated using our new calibration.Long branches separate the group-A and B clades, with significant variation in the substitution rates across branches.The topology here, with the exception of wAlbB placement, fully overlaps with the plylogram presented in Meany et al. (2019) that did not consider recombination.Meany et al. (2019) placed wAlbB outgroup to all other group-B Wolbachia.All nodes have Bayesian posterior of one, with the exception of the node distinguishing wNfa and wNLeu (0.99) and the node distinguishing wAlbB from (wVitB,(wPip_Pel, wPip_Mol)) (0.98).Bayesian support less than 1 likely reflects real uncertainty in the placement of these strains.(B) An estimated chronogram that includes the same strains.Again, all nodes have Bayesian posterior of 1, with the exception of the node distinguishing wAlbB from (wVitB,(wPip_Pel, wPip_Mol)) (0.998) and the node distinguishing wNfa and wNLeu (0.999).Point estimates for divergence times and their confidence intervals are presented for each node.Estimated divergence times broadly overlap with those estimated by Meany et al. (2019) without excluding genes with evidence for recombination.

Table S1 .
Wolbachia strains, genomes, reproductive phenotypes, and cifs.Table S2.Cif sequence and TM score similarity matrices.Pairwise percent identity of all (A) CifA and (B) CifB proteins reported in this study.Pairwise identity of (C) CifA and (D) CifB proteins included in the analysis of structure and sequence similarity.(A-D) Muscle5 was used to