Phylogenomic analysis for Campylobacter fetus ocurring in Argentina

Background and Aim: Campylobacter fetus is one of the most important pathogens that severely affects livestock industry worldwide. C. fetus mediated bovine genital campylobacteriosis infection in cattle has been associated with significant economic losses in livestock production in the Pampas region, the most productive area of Argentina. The present study aimed to establish the genomic relationships between C. fetus strains, isolated from the Pampas region, at local and global levels. The study also explored the utility of multi-locus sequence typing (MLST) as a typing technique for C. fetus. Materials and Methods: For pangenome and phylogenetic analysis, whole genome sequences for 34 C. fetus strains, isolated from cattle in Argentina were downloaded from GenBank. A local maximum likelihood (ML) tree was constructed and linked to a Microreact project. In silico analysis based on MLST was used to obtain information regarding sequence type (ST) for each strain. For global phylogenetic analysis, a core genome ML-tree was constructed using genomic dataset for 265 C. fetus strains, isolated from various sources obtained from 20 countries. Results: The local core genome phylogenetic tree analysis described the presence of two major clusters (A and B) and one minor cluster (C). The occurrence of 82% of the strains in these three clusters suggested a clonal population structure for C. fetus. The MLST analysis for the local strains revealed that 31 strains were ST4 type and one strain was ST5 type. In addition, a new variant was identified that was assigned a novel ST, ST70. In the present case, ST4 was homogenously distributed across all the regions and clusters. The global analysis showed that most of the local strains clustered in the phylogenetic groups that comprised exclusively of the strains isolated from Argentina. Interestingly, three strains showed a close genetic relationship with bovine strains obtained from Uruguay and Brazil. The ST5 strain grouped in a distant cluster, with strains obtained from different sources from various geographic locations worldwide. Two local strains clustered in a phylogenetic group comprising intercontinental Campylobacter fetus venerealis strains. Conclusion: The results of the study suggested active movement of animals, probably due to economic trade between different regions of the country as well as with neighboring countries. MLST results were partially concordant with phylogenetic analysis. Thus, this method did not qualify as a reliable subtyping method to assess C. fetus diversity in Argentina. The present study provided a basic platform to conduct future research on C. fetus, both at local and international levels.


Introduction
Argentina is the sixth-largest exporter of meat in the world (http://www.worldstopexports.com/ top-beef-exporting-countries/). Since ancient times, livestock farming has been an important traditional activity in Argentina and it is majorly practiced in the fertile pastures of the Pampas (Buenos Aires province and its surrounding area, Argentina). Livestock production is one of the most important contributors of the economy in Argentina. Recent times have witnessed an increase in local and international demands for animal protein, demanding a substantial improvement in livestock productivity. Bovine reproductive infections are one of the major challenges faced by livestock industry. Such infections account for significant financial losses every year. Campylobacter fetus infection has been identified as the main cause of bovine abortions in the Pampas [1,2]. Among the various subspecies of C. fetus, Campylobacter fetus fetus (Cff) and Campylobacter fetus venerealis (Cfv) are two most important subspecies that are associated with poor reproductive health in cattle. In particular, these two Available at www.veterinaryworld.org/Vol.14/May-2021/15.pdf subspecies have been reported to significantly affect herd reproductive parameters. Cfv is linked to bovine genital campylobacteriosis, a venereal disease that is primarily associated with infertility. C. fetus biovar intermedius, a Cfv variant that shares intermediate biochemical traits with Cfv and Cff, has been found to be frequently associated with late abortions in cattle. Several previous studies have reported the presence of Cfvi in cattle in Argentina. The diagnosis of the disease can be done by evaluating genital secretions of cows and bulls. In addition to this, aborted fetus and placental tissues can also be used for the diagnosis. Despite the advances in the molecular methods used for the identification and differentiation of subspecies, inconsistencies in the outcome poses a great challenge [3,4]. In South America, direct immunofluorescence-based screening assay is the method of choice for the identification of the pathogen and elimination of the infected animals from endemic herds [5]. This method involves direct detection of C. fetus with the aid of hyperimmune sera that are raised using total antigens of the bacterium. However, the inability of this method to differentiate between various subspecies limits its application. A third subspecies, C. fetus subsp. testudinum, has been identified in reptiles and humans. In humans, it has been isolated from various sources including feces, blood, pleural, and bile [6]. There is no evidence for the presence of this subspecies in Argentina; however, this could be attributed to limited study of the samples. This subspecies has not been identified in cattle so far. Despite the absence of official epidemiological data, C. fetus has been found to be associated with bacteremia in immunocompromised patients in Argentina [7,8]. However, no information is currently available regarding the subspecies of C. fetus responsible for human infections in Argentina, probably due to the lack of reliable and sophisticated tools. This highlights the need for the immediate development of accurate methods for the identification of C. fetus subspecies.
Since 1990, the Laboratory of Bacteriology of EEA-INTA Balcarce, Argentina has isolated C. fetus from various veterinary samples. The organization has provided a differential diagnosis for the identification of this pathogen. They have collected 250 strains of the pathogen in the past 30 years. According to the data provided by the Laboratory of Bacteriology, Cff and Cfv are the most prevalent subspecies that are responsible for bovine abortion. In a recent study by our group, whole-genome sequencing of C. fetus was performed utilizing the services provided by the Genomic Unit of INTA. The study aimed to characterize C. fetus strains and thus contribute to the bulk of genomic sequences of C. fetus strains found in Argentina, that are currently available in public databases.
In the present study, phylogenomic analysis was performed to gain better insights into the global and local patterns for the spread of this pathogen, with a view to improve its surveillance. The present study aimed to extend the currently available knowledge regarding the different strains of C. fetus found in Argentina and promotes collaborative research between various groups from animal as well as human health sectors. The results of the study will provide a better and wholistic understanding regarding the local and regional epidemiological scenario involving this bacterium.

Ethical approval
This study doesn´t need ethical approval. This is a genomic dataset-based study.

Study period and location
The study was conducted from October 2019 to August 2020 at the research units of the National Agricultural Technology Institute (INTA), Argentina.

Whole genome sequences and pangenome analysis
For phylogenomic analysis, 34 freely available complete genome sequences for bovine C. fetus strains, isolated from the Pampas region, were downloaded from GenBank (last access to the database: August 2020). The present study included strains from four provinces of the Pampas region, namely, Buenos Aires (n=27), Santa Fe (n=2), La Pampa (n=2), and Córdoba (n=2) (Supplementary Table-1). The origin of one strain remained unknown. These strains were isolated over a period of 26 years (1989-2015) from various sources, including prepuce, placenta, vaginal mucus, and fetus. For analysis, genomic sequences were assembled using SPAdes 3.11.1 [9]. Data filtering for contigs <200 bp and the ones with low coverage (<10) resulted in 96 contigs per genome on average.

Phylogenomic analysis
MAFFT was used for core genome multiple sequence alignment. A maximum likelihood (ML) phylogenetic tree (ML tree) was generated using IQ-Tree 1.6.12 [13], which was further visualized with the aid of iTol v5 [14]. The node support was evaluated with 1000 bootstrap pseudoreplications. Subsequently, a heatmap was generated in R using the 'Matrix' package software (https://CRAN.R-project. org/package=Matrix) for pairwise comparison of all the strains included in the study.

Multi-locus sequence typing (MLST)
In silico MLST (https://github.com/tseemann/ mlst) was performed to obtain the sequence type (ST) of each strain obtained from Argentina. Each genome was scanned against the traditional PubMLST typing scheme (Campylobacter non jejuni/coli PubMLST database). In the cases where inconclusive results were obtained, pipeline MLST 2.0 (https://cge.cbs.dtu.dk/ services/MLST/) was also employed for analysis. The novel allelic variants and their respective ST were deposited in the PubMLST database (https//:www. pubMLST.org).

Visualization of phylogenetic tree linked to metadata using Microreact
Microreact software [15] was used to visualize the local phylogenetic tree in a spatial and temporal context. A public project was created through the Microreact homepage. Metadata were collected from historical data collection of INTA (host, source, year of isolation, biochemical, and molecular traits of the strains) (Supplementary Table-1). Geno-and phenotyping of local strains were performed according to the previously published protocols [10,16]. Geodata were obtained with the aid of Google Maps (https//:maps.google.com).

Pangenome analysis of local C. fetus strains
The present study aimed to get a better understanding about C. fetus strains found in Argentina and their counterparts occurring in different parts of the world. For pangenome analysis, whole genome sequences for 34 strains of C. fetus, isolated from the cattle in the Pampas region, were obtained from GenBank. In general, the core and the soft-core genes provide information regarding the evolutionary history. In comparison to this, the shell and the cloud genes (which constitute the accessory genome) are involved in lifestyle and adaptation of the organism to different niches. The pangenome analysis for 34 strains showed that 1462 and 1748 genes belonged to the pool of core-genes and accessory genes, respectively (Figure-1a). Figure-1b displays the gene profiles shared between the isolates.
A heatmap representing the percentage of shared genes was generated using 'Matrix' in R Studio. As shown in Figure-2, the strains Cff 04-554 (Buenos Aires), Cfvi 02-298 (Córdoba), and Cfv 97-608 (La Pampa) were characterized by 84.5%, 85.5%, and 86.1%, respectively, of shared genes that were lowest as compared to the rest of the strains. For in-depth analysis of heatmap data, the average percentage of shared genes was calculated by grouping the strains as Cff, Cfv, and Cfvi. Cff, Cfvi and Cfv strains shared individually on average 91.1%, 90.6%, and 88.1% of genes with the complete set of genes of the other two variants, respectively.
No significant differences were detected between the core genomic constitution of subspecies and variants, and these were characterized by significant overall similarity.

Phylogenetic analysis based on the core genome of C. fetus: Local and global analysis
The core genome (1462 genes), built out of 34 C. fetus genomes was used to construct a maximum likelihood phylogenetic tree (ML tree) using IQ-Tree. Two major clusters (Cluster A and Cluster B) and a minor cluster (Cluster C) were identified from the ML tree ( Figure-2). Among the strains isolated from Argentina, 28 strains were included in the major clusters (A, n=11 and B, n=17), whereas two were grouped into the minor Cluster C. Four strains branched separately from clusters A, B, and C. Interestingly, no clustering was observed among these strains as well. Each cluster included strains from different provinces. Cluster A included strains from Buenos Aires, Santa Fe, and La Pampa. Cluster B included strains isolated from Buenos Aires and Córdoba. Cluster C was associated with strains obtained from Buenos Aires and Santa Fe. Interestingly, Cff 04-554 was found to have lower phylogenetic relationships and it diverged away from the rest of the isolates (Figure-2).
The geographical and temporal distribution of the data for clustering analysis was visualized using Microreact tool. No significant temporal or geographical associations were recorded within the phylogenetic groups. The clustering and metadata (Supplementary Table-1) are available in the following link: https://microreact.org/project/ nL5XD1qdA7ALMzgwMXeqiZ.
To investigate the phylogenetic relationships between the strains isolated from Argentina and other parts of the world, a phylogenetic study was conducted using a wide panel of genomic sequences of C. fetus strains having distinct origin and isolation source. To ensure consistency in the analysis, the parameters and models used for the local tree were used for global analysis as well. The C. fetus core genome built out 265 genomic sequences of strains from 20 countries, encompassed 1143 genes.
The ML tree generated from the analysis of 1143 core genes identified eight well-supported clusters (Figure-3). The major clusters were not geographically defined. A similar phylogenetic tree has been previously reported for a different dataset by Iraola et al. [17]. The reptile C. fetus testudinum strains diverged from the mammalian C. fetus strains (Cluster 8). Among the mammalian clusters (which were the largest one), Cluster 1 corresponded to the "cattle lineage", a term previously described by Iraola et al. [17]. This cluster is exclusive for the strains isolated from cattle. In comparison to this, Clusters 2-7 corresponded to the so called "human lineage" as these were predominated by human isolated strains.
Among 34 strains isolated from the Pampas, 33 strains belonged to Cluster 1. There was one exception, the strain Cff 04-554, which was described as a phylogenetically distant strain in the local analysis. This  was further confirmed by the global analysis, wherein Cff 04-554 clustered with nine bovine strains isolated from the United Kingdom and Germany, one ovine strain from Uruguay, and two human strains obtained from Turkey and Canada. Cff 04-554 clustered within Cluster 6, which was nested within the human lineage. Various subclusters could be identified within the cattle-specific Cluster 1. Further analysis revealed the occurrence of certain geographical associations. Most of the strains from Argentina (28/34) exclusively clustered with the strains of same origin. Among the six exceptions, the strains Cff 08-421 and Cff 11-477 shared a minor cluster with a strain isolated from Brazil, while the strain Cfvi 06-195 shared a minor cluster with a strain obtained from Uruguay. Similar to local ML tree analysis, the strain Cff 11-360 did not cluster with any of the strains in global analysis as well.
The remaining two strains, Cfv 97-608 and Cfv 98-25, clustered in a different branch of the subcluster of the strains isolated from Argentina. This minor phylogenetic group included 31 strains having intercontinental distribution. The strain Cfv 97-608 clustered with the strains isolated from Canada and the USA, and the strain Cfv 98-25 formed a singulete. Interestingly, this subcluster included strains identified as Cfv using different typing techniques (Table-1) [10,[17][18][19].

MLST for the strains isolated from Argentina
In silico MLST was performed to obtain the ST for each local C. fetus strain. Among 34 local strains, 31 strains were subtyped as ST4. According to the local phylogenetic analysis, ST4 was distributed homogenously among the phylogenetic groups ( Figure-2). In accordance with the results obtained for local and global phylogenetic analyses, the strain Cff 04-554 ("human lineage") was the only strain that showed ST5 (Figures-2 and 3).
Initially, the aforementioned in silico approach failed to establish ST for Cff 11-427 and Cff 07-485. In case of the strain Cff 07-485, the uncA locus showed low coverage of 85.1%, while the rest of the loci shared 100% nucleotide identity with ST4. The genome sequence analysis for Cff 11-427 showed the presence of a C-to-T transition at position 293 of the uncA allele. This new allele and its respective ST were deposited in the PubMLST database (http://pubmlst.org). This strain was assigned uncA allelic variant 15 and ST70 subtype. This ST70 strain clustered with a ST4 strain (Figure-2). This variant has been reported for the 1 st time in the present study.

Discussion
In the present study, local phylogenetic analysis of 34 C. fetus strains, isolated from the Pampas region,  The heatmap colors scale from blue (higher percentage of shared genes) to dark red (lower percentage of shared genes). Province codes: BA=Buenos Aires, SF=Santa Fe, CO=Córdoba, LP=La Pampa. n.a: not available.
Available at www.veterinaryworld.org/Vol.14/May-2021/15.pdf was performed. The study allowed a critical evaluation of the existence of regional variants to describe the circulating C. fetus strains in Argentina. The results of the analysis showed clustering of most of the strains into few clusters, suggesting a clonal population structure where two major overlapping clusters were identified in Argentina. These two major clusters included 82% of the local strains. The use of Microreact software allowed an interactive visualization of the dynamics of isolation in terms of year, frequency of geographical distribution, and source of isolation for each of the C. fetus strains included in the local study. The clustering data showed no association with the source of the samples, date of isolation, origin, and biochemical tests. The two strains that were included in the minor cluster were isolated at two different time points, 1997 and 2008. In fact, these strains were isolated from two distinct provinces, Buenos Aires and Santa Fe, from localities that were 500 km away. On the other hand, the second strain isolated from Santa Fe belonged to the major Cluster A. To establish the existence of a putative third cluster for the distribution of the strains throughout the region, the analysis must include a large sample pool. Interestingly, Córdoba Province was under represented and only two strains from this region were studied. These strains formed a subcluster within the major Cluster B. Similarly, two strains isolated in La Pampa Province were included in the study. However, one of these strains clustered in the major Cluster A, while the other one was found to be phylogenetically distant.
The clonal nature of C. fetus might be attributed to higher genetic stability of this pathogen as compared to other Campylobacter species [20,21]. Thus, there is a significant possibility for the continuous circulation of few genotypes of C. fetus in this endemic region of Argentina. Such circulation might be indicative of the movement of cattle over time or trade of livestock material between different regions. In addition, different variants were observed within the dominant clones. This might be attributed to the use of different herd management practices (like vaccination) or different cattle breeds, which could have driven the selection of the strains.
The global phylogenetic analysis provided a broad overview of the relationships between the strains obtained from Argentina and their global counterparts that were isolated from different hosts in different countries. In a recent study, Iraola et al. [17] described the phylogeny of C. fetus. The study proposed two major lineages for C. fetus strains, human, and cattle lineage . In the present study, 33 out of 34 strains, isolated from Argentina, belonged to the cattle lineage. The global ML tree was consistent with the findings of the local ML tree. It was successful in clarifying the position of the phylogenetically distant strains. The bovine strain Cff 04-554, which belonged to the human lineage, showed higher genetic distance in the clustering. In addition to this, it was associated with significant differences at the core genome level. The genome sequence of this strain was manually checked to avoid any issues arising due to chimeras or other assembly artifacts. The comparative analyses of the genomes showed the presence of large number of polymorphisms in this strain, including SNPs and insertions in some of the genes.
Cff 04-554 strain was isolated from a 7.5-monthold aborted fetus. This fetus belonged to a herd where both artificial and natural insemination was practiced in cows. Among the various strains obtained from Argentina, this strain was unique and it was assigned to ST5 according to the PubMLST database. ST5 subtype has been previously reported in cattle and human strains of C. fetus isolated from different countries such as the United States of America, Belgium, Germany, and the United Kingdom [20]. Interestingly, Cff 04-554 clustered with these previously reported strains. An explanation regarding the presence of this strain in the livestock productive system of Argentina remains elusive; however, the existence of international trade for livestock (semen, embryos, or even animals) in the past could not be ignored. In addition, this strain also clustered with one ovine strain obtained from Uruguay [22]. Thus, there is possibility for the circulation of this genotype in different hosts in South America. More studies are required to test the prevalence and relevance of this genotype.
Several previous studies have proposed the suitability of Pulsed Field Gel Electrophoresis (PFGE), Amplified Fragment Length Polymorphism (AFLP), and MLST as genotyping tools for subtyping of C. fetus strains. PFGE and AFLP have been successfully utilized for subspecies differentiation but have not been extensively studied [23,24]. In comparison to these, MLST is an unambiguous and less complex procedure, which is based on the sequencing of housekeeping genes. This technique is robust and the sequencing data can be compared with the help of open access database. Thus, MLST has gained wide acceptance for the evaluation of C. fetus diversity in the past few years promoted in large part by the sequencing costs reduction. In a previous study, MLST showed low inter-ST genetic diversity and the two subspecies of C. fetus were described to have close genetic relation. The study suggested the suitability of MLST for long-term epidemiological and phylogenetic analyses [20]. In another study, MLST results were in concordance with the core genome clustering. These results further suggested that the loci included in the MLST scheme represent a suitable subset of genes of the core genome [18].
In concordance with the phylogenetic analysis, MLST results revealed low diversity among C. fetus strains isolated from the Pampas region. ST4 was found to be most common subtype, with inclusion of 31 strains. One strain belonged to ST5, while another one was identified as a new variant and was designated a new subtype, ST70. However, this technique failed to efficiently discriminate between the strains located in the major clusters. In addition to this, ST4 was associated with all the clusters. Interestingly, ST5 and ST70 strains were grouped outside the major clusters. Initially, ST4 was first found to be exclusively associated with cattle Cfv strains [20]. Thus, ST4 was proposed to be cattle-associated genotype. However, later, Iraola et al. [25] identified an ST4 Cff strain in a rural worker, representing a probable case of zoonotic transmission. The study also reported inconsistencies between MLST and whole-genome typing outcomes. In concordance with the findings of Iraola et al. [25], MLST analysis in the present study was found to be partially concordant with phylogenetic analysis. Thus, all these observations suggested the limited utility of MLST as a tool to evaluate the genetic diversity of circulating C. fetus strains in Argentina.
NGS and phylogenetic studies have provided significant information about this pathogen; however, subspecies assignment and their differential diagnosis remain a great challenge. In the present study, an approach similar to the one used by Iraola et al. [17] was followed, which resulted in same tree topology and clustering. Interestingly, a particular sub-branch of 31 strains with common characteristics was identified within the cattle lineage. In a previous study, Farace et al. reported the use of a Polymerase chain reaction (PCR)-based testing method to evaluate the presence of a cysteine transporter operon (L-Cys transporter) linked to hydrogen sulfide production in C. fetus strains [10]. All the strains from this subcluster were tested by in silico-PCR and were identified as non-producers for hydrogen sulfide, a trait typical of Cfv. Additionally, t his analysis was consistent with the biochemical results obtained for the strains included in this subcluster. Most of these strains were isolated from preputial samples (71%). Bulls' preputial crypts have been previously described as the main niche for the existence of this subspecies [26]. In addition to this, most of these strains shared molecular traits and were identified as Cfv through molecular typing methods. Among these, the Spanish strain C7 was the only exception. This strain has been molecularly typed as Cff by Iraola et al. [17]. In comparison to this, Farace et al. described it as Cfv on the basis of PCR-based testing for L-Cys transporter [10]. Cfv 98-25, isolated from Argentina, has been associated with conflicting biochemical classifications in different labs [10]. However, in the present study, this strain was biochemically typed as Cfv, which was consistent with the results for L-Cys transporter-PCR, conventional molecular typing [16], and phylogenetic analysis. Interestingly, the rest of the strains within the cattle lineage were found to be hydrogen sulfide-producing strains, a characteristic common to both Cff and Cfv biovar intermedius strains.
In a previous study, van der Graaf-van Bloois et al. [18] performed core genome phylogenetic analysis for 21 C. fetus genome sequences. The clustering results for the study were found to be inconsistent with the phenotype of the strains. The discrepancies in the results for the present study and the study by van der Graaf-van Bloois et al. [18] could be attributed to the composition of the genomic dataset. Significantly different sample size was used in both studies, which might have a significant impact on the core genome constitution and clustering. It is important to mention that no evidences have been reported for any correlation between hydrogen sulfide production and virulence of the strain. Thus, the results of the present study should be interpreted with caution. For future analyses, well-characterized virulence markers must be used. There are many questions that still remain unanswered, particularly regarding the importance of each subspecies in cattle health and differential diagnosis. Several interesting points, like whether these subspecies are actually genetically distinct and do enough evidence exist to confirm (or deny) the importance of one subspecies over the other, should be discussed. The subspecies assignment should be upheld by both phenotypic and molecular evidences and must also be supported by phylogenetic analysis. In the present study, phylogenetic analysis was successful in differentiating a subset of strains that shared Cfv phenotypic and genotypic traits, and future studies should further evaluate its relevance in cattle health.
Despite the use of low number of samples from Argentina, the present study provided basic information that would assist in the designing of future epidemiological studies to get better insights into C. fetus related infections in the country. Argentina (2,780,400 km 2 ) contains different phytogeographic regions where livestock is less important as compared to the Pampas region . However, livestock production is relevant for the local economies and reproductive diseases are also prevalent. In these regions, the diagnosis is frequently based on techniques that do not involve isolation of C. fetus. This has particularly delayed the evaluation of circulating strains in Argentina. Further phylogenetic studies, particularly focused on the inclusion of under sampled regions and augmentation of overall number of samples, might be helpful in providing a more realistic overview regarding the phylogenetic relationships of C. fetus strains isolated from Argentina. The present work focused largely on the analysis of genomic data of C. fetus strains isolated from cattle in the Pampas region, the most productive area of Argentina. However, environmental samples and human strains must also be studied for better understanding. The outcomes of the present study encourage the sharing of data for strains isolated from different sources to expand the knowledge for this pathogen in Argentina.

Conclusion
Local and global phylogenomic analyses revealed the circulation of a limited number of C. fetus strains in Argentina over the years. The results also suggested an active movement of animals, probably due to economic trade between the different regions of the country as well as with neighboring countries such as Brazil and Uruguay. Although the results for MLST showed partial concordance with the phylogenetic analysis, MLST failed to qualify as a reliable subtyping method to assess C. fetus diversity in Argentina. The study provided significant background genomic information and updated metadata, which can be further, used as a platform for future surveillance and tracking of C. fetus distribution in Argentina.

Authors' Contributions
AKG and AFA conceived and supervised the study. PDF and JMI designed the study, collected and analyzed genomic data. FAP, CGM, MAM and JAG collected metadata. AKG, PDF, FAP, CGM, and JAG interpreted the results. AKG and PDF drafted the manuscript. All the authors read and approved the final version of the manuscript.