Accessory Genomes Drive Independent Spread of Carbapenem-Resistant Klebsiella pneumoniae Clonal Groups 258 and 307 in Houston, TX

ABSTRACT Carbapenem-resistant Klebsiella pneumoniae (CRKp) is an urgent public health threat. Worldwide dissemination of CRKp has been largely attributed to clonal group (CG) 258. However, recent evidence indicates the global emergence of a CRKp CG307 lineage. Houston, TX, is the first large city in the United States with detected cocirculation of both CRKp CG307 and CG258. We sought to characterize the genomic and clinical factors contributing to the parallel endemic spread of CG258 and CG307. CRKp isolates were collected as part of the prospective, Consortium on Resistance against Carbapenems in Klebsiella and other Enterobacterales 2 (CRACKLE-2) study. Hybrid short-read and long-read genome assemblies were generated from 119 CRKp isolates (95 originated from Houston hospitals). A comprehensive characterization of phylogenies, gene transfer, and plasmid content with pan-genome analysis was performed on all CRKp isolates. Plasmid mating experiments were performed with CG307 and CG258 isolates of interest. Dissection of the accessory genomes suggested independent evolution and limited horizontal gene transfer between CG307 and CG258 lineages. CG307 contained a diverse repertoire of mobile genetic elements, which were shared with other non-CG258 K. pneumoniae isolates. Three unique clades of Houston CG307 isolates clustered distinctly from other global CG307 isolates, indicating potential selective adaptation of particular CG307 lineages to their respective geographical niches. CG307 strains were often isolated from the urine of hospitalized patients, likely serving as important reservoirs for genes encoding carbapenemases and extended-spectrum β-lactamases. Our findings suggest parallel cocirculation of high-risk lineages with potentially divergent evolution.

IMPORTANCE The prevalence of carbapenem-resistant Klebsiella pneumoniae (CRKp) infections in nosocomial settings remains a public health challenge. High-risk clones such as clonal group 258 (CG258) are particularly concerning due to their association with blaKPC carriage, which can severely complicate antimicrobial treatments. There is a recent emergence of clonal group 307 (CG307) worldwide with little understanding of how this successful clone has been able to adapt while cocirculating with CG258. We provide the first evidence of potentially divergent evolution between CG258 and CG307 with limited sharing of adaptive genes. Houston, TX, is home to the largest medical center in the world, with a large influx of domestic and international patients. Thus, our unique geographical setting, where two pandemic strains of CRKp are circulating, provides an indication of how differential accessory genome content can drive stable, endemic populations of CRKp. Pan-genomic analyses such as these can reveal unique signatures of successful CRKp dissemination, such as the CG307-associated plasmid (pCG307_HTX), and provide invaluable insights into the surveillance of local carbapenem-resistant Enterobacterales (CRE) epidemiology.

KEYWORDS CG258, CG307, carbapenem-resistant Klebsiella pneumoniae, divergent evolution, genomic surveillance, mobile genetic elements

Carbapenem-resistant Klebsiella pneumoniae (CRKp) causes significant worldwide morbidity and mortality and is of paramount concern in nosocomial settings (1). Since the identification of K. pneumoniae isolates harboring the carbapenem-hydrolyzing enzyme K. pneumoniae carbapenemase (KPC) in 1996, global dissemination of CRKp has occurred (2). CRKp now accounts for the majority of carbapenem-resistant Enterobacterales (CRE) infections in the United States (3,4). Infections caused by CRKp are difficult to treat since these isolates often harbor resistance to multiple antibiotics, limiting therapeutic options (5).
Global dissemination of CRKp has been largely attributed to the clonal expansion of a genetic lineage of K. pneumoniae designated clonal group 258 (CG258), first identified in the United States in 2008 (6)(7)(8). Spread of CG258 has been associated with carriage of genes encoding the KPC enzyme (6,9), most notably on F-type pKpQIL plasmids (10). More recently, another CRKp genetic lineage, designated CG307, has emerged and appears to be spreading in countries such as Italy, Pakistan, Colombia, and the United States (11)(12)(13). Isolates in the K. pneumoniae CG307 lineage carry distinct genomic features that may confer virulence and colonization advantages, such as F-type plasmid-borne glycogen synthesis gene clusters and urea transport systems (12). These potential virulence determinants are present in conjunction with established pathogenic factors, such as the mrk gene cluster (encoding type 3 pili) and extracellular polysaccharides (capsule and lipopolysaccharides), found across all multidrug-resistant (MDR) K. pneumoniae lineages (14). Carriage of blaKPC on pKpQIL and N-type plasmids in CG307 isolates has been documented (12), although not at the same prevalence as in CG258 (15,16). There is a strong association of the CG307 lineage with the carriage of blaCTX-M-15, a gene encoding an extended-spectrum β-lactamase (ESBL) (11). Of interest, a molecular dating analysis estimated that CG307 arose in 1994 (11), near the estimated year of CG258 emergence (6), suggesting that these two high-risk genetic lineages have evolved in parallel.
A recent large U.S. cohort study of patients infected or colonized with CRE (Consortium on Resistance against Carbapenems in Klebsiella and other Enterobacterales 2 [CRACKLE-2] study) identified K. pneumoniae CG307 as the second most common lineage of CRKp after CG258 (7% versus 64% out of 593 isolates, respectively) (3). Of note, in that study, the majority (72%) of the CG307 K. pneumoniae isolates were found in the Houston, TX, metropolitan region. This observation is in agreement with previous reports describing the molecular epidemiology of multidrug-resistant K. pneumoniae in a hospital network in Houston, TX (17). Furthermore, the CRACKLE-2 results suggested that Houston had become the first large city in the United States with high endemicity of CRKp detected where K. pneumoniae CG307 and CG258 appear to be cocirculating and undergoing parallel expansion. Nevertheless, the molecular epidemiology of these two predominant lineages cocirculating in the same geographical locale, in particular, the extent to which they share adaptive traits found in their accessory genome, remains poorly understood.
To dissect the factors that drove the parallel cocirculation of CG307 and CG258, as well as the high endemicity of CRKp in the Houston area, we generated complete assemblies of K. pneumoniae sensu stricto recovered from patients enrolled in the CRACKLE-2 cohort in Houston and other sites in the United States. We assessed the potential intra- and interclade transmission of vectors responsible for carbapenem resistance between CG258 and CG307 and their respective correlated gene content and characterized clinical outcome differences in patients colonized or infected with CRKp.

RESULTS
Patients with CG307 colonization or infection may have lower 30-day mortality than those with CG258. Clinical epidemiological features stratified by CG for the Houston, TX, isolates are presented in Table 1. The crude 30-day mortality for the full Houston cohort (n = 73) was 17.8% (13/73; 95% confidence interval [CI], 7.8 to 22.6%). When comparing clinical features across the CG258 and CG307 groups, patients with CG258 exhibited a statistically significantly higher 30-day mortality than those infected/colonized with CG307, albeit with a small number of events. Patients with CG307 infection/colonization exhibited a higher proportion of samples isolated from urine (65.6%) than those with CG258 (37.0%), but this association did not reach statistical significance (P = 0.068). Conversely, patients with CG258 infection/colonization had a higher proportion of isolates from blood (14.8% versus 0%) and respiratory cultures (25.9% versus 12.5%) than the CG307 patient group.
Molecular epidemiological features of the Houston K. pneumoniae isolates are shown in Table 2. The median genome size of isolates belonging to CG307 was smaller than that of CG258 and other heterogeneous sequence types, although there were no significant differences in chromosome sizes between the three CGs. The smaller genome size of CG307 was due to a smaller number of plasmids compared to CG258 or other CGs (Table 2). The mean number of acquired antimicrobial resistance (AMR) genes per genome, i.e., the average number of nonintrinsic AMR genes found per CRKp isolate, was 7.8 ± 5.0 genes (Fig. S2A at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files). The median number of acquired AMR genes stratified by group was similar among CG258 (5; interquartile range [IQR] = 6), CG307 (7; IQR = 5), and other CGs (9; IQR = 10.5). We did not find a statistically significant difference across these subsets in the numbers of genes encoding AMR determinants of different classes (i.e., number of antibiotic classes with at least one resistance determinant per genome) (Fig. S2B). We also analyzed AMR and virulence determinants using the Kleborate composite resistance scoring metric (18) along with the categorical virulence score (see Materials and Methods). We found no statistically significant difference in Kleborate resistance scores between CG258 and CG307 (adjusted [adj.] P value = 0.3). However, CG258 isolates had a statistically significantly higher resistance score than the group of other heterogeneous sequence types (adj. P value = 0.007). The Kleborate composite virulence scores in Table 1 reflect that the Houston CG307 isolates lacked genes encoding acquired/nonintrinsic siderophores (e.g., yersiniabactin, salmochelin, aerobactin), the genotoxin colibactin gene cluster, and hypermucoidity genes (rmpA and rmpA2), which are commonly found virulence determinants in hypervirulent lineages of K. pneumoniae sensu stricto (19,20).
Multiple integrative conjugative elements (ICEKp) carrying virulence factors integrated in the chromosome of K. pneumoniae lineages have been previously described (19). These mobile genetic elements largely contribute to hypervirulent, high-risk strains of K. pneumoniae commonly observed in Asian countries (5). Of the Houston, TX, CG258 isolates, 35.1% (13/37) carried a chromosomally inserted ICEKp with an associated yersiniabactin gene (ybt) cluster. The ICEKp in most of these CG258 isolates harbored an ICEKp10-ybt17 sequence type except for one Houston CG258 isolate (C592) that carried a novel ybt sequence type. All Houston CG258 isolates harboring ICEKp also contained the ICEKp-associated colibactin gene cluster from the clb3 lineage. This  (19). In contrast, all Houston CG307 isolates lacked ICEKp integration. Only one CG307 isolate among the non-Houston isolates (C4693, recovered from a patient in Georgia) harbored a chromosomal ICEKp, which belongs to the ICEKp12 lineage and encodes the ybt16 sequence type. An important genomic epidemiological observation relating to the population structure of K. pneumoniae is the rare convergence of multidrug-resistant (MDR) and hypervirulence genetic determinants (5,21). We identified two CRKp isolates with predicted resistance to >3 antimicrobial classes, C308 (ST23) and C346 (ST231), that harbored aerobactin-encoding iuc genes along with blaKPC-2 and blaOXA-232, respectively, suggesting that the convergence of MDR and hypervirulence genetic determinants is occurring in our study population.
The population structure in cocirculating CG258 and CG307 indicates nested subgroups within each lineage. The pangenome of the full cohort (n = 121), using 94 CRKp Houston isolates with 12 CG307 and 12 CG258 isolates collected from other CRACKLE-2 U.S. hospital sites (n = 48 sites) plus 3 references (see Materials and Methods), consisted of 13,049 genes, of which 3,908 (29.9%) made up the core genome, defined as gene groups included in ≥99% of the full cohort. The overall species median nucleotide divergence, a measure of genetic variation within a population based on normalized polymorphism counts, was 0.6% (median pairwise single-nucleotide polymorphism [SNP] difference = 22,417 SNPs), suggesting that less than 1% of the core genome nucleotide sites were variant sites. This nucleotide diversity is comparable to previously shown measures of genetic variation within the K. pneumoniae sensu stricto phylogroup based on core gene alignment (21).
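The divergence figures quoted above can be illustrated with a minimal Python sketch. It assumes that divergence is the median pairwise SNP count divided by the length of the core alignment; the exact normalization used in the study is described in its Materials and Methods, not here. Under that assumption, 22,417 SNPs at 0.6% would correspond to an alignment of roughly 3.7 Mb. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from statistics import median


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences, ignoring gaps and Ns."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    skip = {"-", "N"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )


def median_divergence(alignment: dict[str, str]) -> tuple[float, float]:
    """Return (median pairwise SNP count, that median as a fraction of alignment length)."""
    length = len(next(iter(alignment.values())))
    distances = [
        snp_distance(alignment[i], alignment[j])
        for i, j in combinations(sorted(alignment), 2)
    ]
    med = median(distances)
    return med, med / length


# Toy core alignment (hypothetical isolates, 10 aligned sites).
toy_alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACGAACGTAT",
    "isolate_D": "ACGAACGGAT",
}
med_snps, divergence = median_divergence(toy_alignment)
print(f"median pairwise SNPs = {med_snps}, divergence = {divergence:.1%}")
```

Applying the same calculation to a subset of isolates (for example, only CG258 or only CG307) would give intragroup values of the kind reported for each clonal group.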
To dissect lineage-specific population structures, recombination-free, reference-based core SNP maximum-likelihood phylogenetic trees were created for both CG258 and CG307 (Fig. 2). The intergroup median nucleotide divergence of CG258 and CG307 was 0.6% (median pairwise SNP difference = 23,304 SNPs), which was comparable to overall species divergence. The intragroup median nucleotide divergence calculated for the CG258 group was 0.013% (median pairwise SNP difference = 508 SNPs), a core genome nucleotide diversity expected for a clonal group showing a more homogeneous core genome at the clonal group level, compared to the overall species level. We identified two clades within the CG258 lineage (Fig. 2A), split largely by the capsule synthesis locus (cps), as previously described (7,22). Clade 1 (wzi 29/KL106) and clade 2 (wzi 154/KL107) encompassed 45% and 51% of CG258 K. pneumoniae isolates, respectively. The remaining 4% of CG258 isolates had unique capsular synthesis loci and/or a large ISKpn26-mediated deletion in this region (a region known to have a high rate of recombination). There were nested population structures within each clade that segregated by geographical site (Houston versus other U.S. CRACKLE-2 sites), with a clearer delineation within clade 1. There was a strong association of isolates harboring ICEKp10 (primarily the ybt17 lineage) with clade 2 isolates (18/25; 72%) not observed in clade 1 isolates (Fisher exact P value = 5e-6). The CG307 group (Fig. 2B) was less divergent than CG258, with a median nucleotide divergence of 0.004% (median pairwise SNP difference = 145 SNPs). There was a marked geographical split of the CG307 group, correlating with the predicted four hierarchical clusters within the core population structure. A majority of CG307 isolates had the same K and O loci (wzi173 allele associated with the KL102 locus and the O2v1 [3/47] or O2v2 [33/47] loci, respectively), with the exception of one isolate (C291) which had a 28,813-bp ISKpn26-associated deletion in the cps locus.
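The Fisher exact test used above for the association between ICEKp10 carriage and CG258 clade membership can be sketched in plain Python as follows. This is an illustrative implementation of the standard two-sided test, not the authors' code. The clade 2 counts (18 of 25) come from the text, but the clade 1 counts are not stated in this section, so the values used below are placeholders and the result is not expected to reproduce the reported P value.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1)) if p <= observed * (1 + 1e-9)
    )


# Rows: clade 2, clade 1. Columns: ICEKp10 present, ICEKp10 absent.
# Clade 2 counts are from the text; the clade 1 row (0, 22) is a placeholder.
p_value = fisher_exact_two_sided(18, 7, 0, 22)
print(f"two-sided P = {p_value:.2e}")
```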
To determine the overall population structure of CG307 isolates in the Houston region, relative to historical isolates detected globally, a reference-based alignment maximum-likelihood inferred phylogeny was generated with CG307 isolates (n = 798), using C234 (the first CG307 isolate in the Houston population) as a reference (Fig. 3). Four hierarchical population structures were detected, with a prominent distinction between three Houston-based ST307 clades (clades I, III, and IV) and the worldwide disseminated CG307 clade (clade II) (Fig. 3). Houston-based clade III isolates shared a common ancestor with clade IV isolates not shared with the distinct clade I and clade II isolates. Thus, phylogenetic reconstruction suggests that three lineages of CG307 (clades I, III, and IV) distinct from CG307 found in other parts of the world (clade II) are currently circulating in Houston hospitals.
Dissection of the accessory genome suggests independent evolution and limited horizontal gene transfer between CG307 and CG258 lineages. We sought to determine the extent of accessory genome sharing within our CRKp group as a measure of potential interclade horizontal gene transfer of antimicrobial resistance and virulence determinants. We took this approach to understand the dynamics of circulation of these genes between high-risk clones that might explain how they cocirculate in the Houston area. Thus, using the full cohort described in the previous section, we sought to determine the geographical clustering of the accessory genomes of all K. pneumoniae isolates, including genes encoding carbapenemases. Interestingly, both t-distributed stochastic neighbor embedding (t-SNE) analysis and principal-component analysis (PCA) indicated a distinct separation of the CG258 accessory genome from the rest of the isolates (Fig. 4A and B; Fig. S3 at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files), suggesting that CG258 has limited accessory genome sharing with other lineages. Fig. 4A shows that, apart from one Georgia isolate (C4688), CG307 isolates from Houston cluster together and are distinct from non-Houston CG307. In contrast to the unique accessory genome components in CG307 that segregated by region, we were unable to identify accessory genome clustering by geographical location of CG258 isolates.
We then identified the minimized genomic distance between each isolate to determine which isolates were more likely to share similar accessory genome content based on their predicted cluster assignment. Fig. 4B shows that when cluster groups were identified by sequence similarity (pairwise binary distances between the 121 isolates), there was a split in CG258 that largely segregated by cps-associated clades (cluster 1 versus cluster 2), with four exceptions (C293, C295, C259, and C4688). CG307 isolates shared a cluster assignment (cluster 3, Fig. 4B) with 10 other isolates belonging to non-CG307. Furthermore, CG307 accessory genomes appeared to cluster with CG147 isolates in our cohort. PCA of the accessory genome indicated that 90.7% of the variance in the data set could be explained in the first two components of the accessory genome with noted separation of CG258 from CG307, as well as the other clonal groups (Fig. S3 at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files). The PCA in conjunction with t-SNE overall supports a clear separation between CG258 and non-CG258 accessory gene content.
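To make the notion of a pairwise binary distance concrete, the following Python sketch computes it from accessory gene presence/absence sets. It assumes the binary distance is the Jaccard-type measure (the proportion of genes present in only one isolate among genes present in at least one), which is how R's dist(method = "binary") defines it; whether the study used exactly this definition is an assumption. Isolate and gene labels are hypothetical.

```python
from itertools import combinations


def binary_distance(genes_a: set[str], genes_b: set[str]) -> float:
    """Jaccard-type binary distance between two accessory gene sets."""
    union = genes_a | genes_b
    if not union:
        return 0.0  # two empty gene sets are treated as identical
    return 1.0 - len(genes_a & genes_b) / len(union)


def distance_table(accessory: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Pairwise distances for every unordered pair of isolates."""
    return {
        (i, j): binary_distance(accessory[i], accessory[j])
        for i, j in combinations(sorted(accessory), 2)
    }


# Toy accessory genomes (hypothetical isolates and gene group labels).
toy_accessory = {
    "iso1": {"g1", "g2", "g3"},
    "iso2": {"g1", "g2", "g3", "g4"},
    "iso3": {"g5", "g6", "g7"},
}
distances = distance_table(toy_accessory)
for pair, dist in distances.items():
    print(pair, round(dist, 2))
```

A distance matrix of this kind is the usual input to hierarchical clustering or to an embedding such as t-SNE; isolates with small distances share most of their accessory genes.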
To further support the divergence of CG258 from CG307 and the lack of genomic sharing, we analyzed a subset of the accessory genome (n = 2,943) excluding low- (≤5%) and high-frequency (≥95%) gene groups that are less indicative of horizontal gene transfer within the full cohort and determined distribution differences by Cluster of Orthologous Genes (COGs) functional categories (Table S1 at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files). Figure 4C shows the overall distribution of accessory genome content annotated by functional group across all clonal groups found. When focusing on COG functional group proportions of CG258 versus CG307 isolates, we found statistically significant differences in relative frequency proportions of each respective COG functional group with all but 5 of the 18 characterized COG functional groups (Fig. 4D). In contrast, there were fewer statistically significant differences observed in the proportion of COG functional groups within the accessory genomes for both CG258 and CG307 compared to each respective other clonal group found in the cohort (Fig. S4).
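The frequency filter described above (dropping gene groups found in ≤5% or ≥95% of isolates) can be sketched as follows. This is a minimal illustration with hypothetical names and toy data, not the pipeline used in the study; integer arithmetic is used for the thresholds so that groups sitting exactly at 5% or 95% are excluded without floating-point ambiguity.

```python
def filter_intermediate_frequency(
    presence: dict[str, set[str]],
    n_isolates: int,
    low_pct: int = 5,
    high_pct: int = 95,
) -> dict[str, set[str]]:
    """Keep gene groups carried by more than low_pct% and fewer than high_pct% of isolates.

    `presence` maps each gene group to the set of isolates carrying it.
    """
    kept = {}
    for gene_group, carriers in presence.items():
        count = len(carriers)
        # Equivalent to low_pct/100 < count/n_isolates < high_pct/100.
        if low_pct * n_isolates < 100 * count < high_pct * n_isolates:
            kept[gene_group] = carriers
    return kept


# Toy cohort of 20 hypothetical isolates.
isolates = [f"iso{i:02d}" for i in range(20)]
toy_presence = {
    "group_rare": set(isolates[:1]),        # 5% of isolates, excluded
    "group_low": set(isolates[:2]),         # 10%, kept
    "group_mid": set(isolates[:10]),        # 50%, kept
    "group_high": set(isolates[:18]),       # 90%, kept
    "group_near_core": set(isolates[:19]),  # 95%, excluded
}
kept = filter_intermediate_frequency(toy_presence, len(isolates))
print(sorted(kept))
```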
The largest statistically significant differences in COG functional groups between CG258 and CG307 were in predicted carbohydrate metabolism and transport mechanisms (G), cell membrane structure/biosynthesis (M), and replication/recombination/repair (L) genes (Fig. 4D). The larger proportion of replication, recombination, and repair genes (L) in CG258 is likely due to a higher mean number of plasmids per genome observed in CG258 than in CG307 isolates. Noted differences include a previously described chromosomal, 13-kb p-fimbriae gene cluster (12), conserved in all CG307 isolates (n = 48) and absent from all CG258 (n = 51) (adj. P value = 8e-33) (Fig. S5 at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files). The fimbrial gene cluster was also present in all four CG147 isolates. A second, previously described (12), chromosomal capsular synthesis cluster with 12 genes (Cp2) was present in all 48 CG307 isolates and absent from all others (adj. P value = 9e-34) (Fig. S5). Conversely, there was a carbohydrate metabolism operon (designated ydjEFGHIJ) found exclusively in all CG258 and CG147 isolates and absent in the CG307 isolates (adj. P value = 5e-34). Carriage of unique phages between CG307 and CG258 was a primary driver of accessory genome differences. Indeed, three intact phages detected in CG307 were absent in CG258 (Fig. S5).
We further subset the accessory genome based on the histogram of gene sharing (Fig. S6 at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files) to accessory genes shared in less than 80% of the population to observe gene sharing within and between each respective clonal group that does not include any potential "soft-core" genes. The median number of accessory gene groups shared within CG258 strains (640 [IQR: 557, 723]) was significantly higher than the number of gene groups shared between CG307 and non-CG258 strains (Fig. S7A). Similarly, a conservation of intragroup accessory genome sharing was observed within the CG307 strains (485 [IQR: 420, 550]) relative to intergroup accessory genome sharing with non-CG307 strains (Fig. S7B). Taken together, these results indicate that CG258 and CG307 each contain highly conserved and distinct accessory genomes, which are likely driving independent endemic spread of each lineage in the Houston region.
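The within-group versus between-group comparison above can be sketched as a count of shared accessory gene groups per isolate pair, summarized by the median. This is an illustrative sketch under the assumption that "sharing" means the size of the intersection of two isolates' accessory gene sets; the function names, group labels, and toy data are hypothetical.

```python
from itertools import combinations, product
from statistics import median


def shared_within(group: dict[str, set[str]]) -> list[int]:
    """Shared accessory gene counts for every pair of isolates inside one group."""
    return [len(group[i] & group[j]) for i, j in combinations(sorted(group), 2)]


def shared_between(group_a: dict[str, set[str]], group_b: dict[str, set[str]]) -> list[int]:
    """Shared accessory gene counts for every pair drawn across two groups."""
    return [
        len(group_a[i] & group_b[j])
        for i, j in product(sorted(group_a), sorted(group_b))
    ]


# Toy accessory genomes for two hypothetical clonal groups.
group_x = {
    "x1": {"g1", "g2", "g3", "g4"},
    "x2": {"g1", "g2", "g3", "g5"},
    "x3": {"g1", "g2", "g3", "g4", "g5"},
}
group_y = {
    "y1": {"g1", "g6", "g7", "g8"},
    "y2": {"g6", "g7", "g8"},
}
within_x = median(shared_within(group_x))
between_xy = median(shared_between(group_x, group_y))
print(f"median shared within X = {within_x}, between X and Y = {between_xy}")
```

A large gap between the within-group and between-group medians, as in this toy case, is the pattern the text interprets as conserved, lineage-specific accessory content.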
Except for a CG307 Georgia isolate (C862) which shared similar genomic characteristics with CG307 from Houston, all other CRACKLE-2 CG307 recovered outside Houston exhibited distinct accessory genome features (Fig. 2B), supported by previous phylogenetic analyses (see above). Indeed, a majority of non-Houston CG307 (10/12, 83%) harbored a pKPN3-like plasmid, with 6 having blaCTX-M-15 as part of an ISEcp1 element. Furthermore, 3 of the 8 CG307 non-Houston isolates that were blaKPC positive carried blaKPC on non-Tn4401a transposons (compared to all Houston blaKPC-positive CG307 isolates, which carried blaKPC on Tn4401a; Table S2 at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files). Additionally, non-Houston CG307s harbored A/C-type plasmids (n = 2) that carried blaKPC, a feature absent in Houston isolates (Fig. 5). These results, in conjunction with our phylogenetic analysis, support that a distinct epidemic lineage of CG307 is currently circulating in tertiary care hospitals in Houston, TX.
c Outer membrane porin gene changes are counted if a truncation or nonsense mutation is detected in either the ompK35 or ompK36 gene.
d Aminoglycoside, tetracycline, sulfonamide, and phenicol (e.g., chloramphenicol) predicted resistance based on the presence of at least one gene that confers resistance to the respective antibiotic class (e.g., aac3-Ia confers aminoglycoside resistance).
e Fluoroquinolone and colistin predicted resistance based on chromosomal mutations that confer resistance to each respective antibiotic class (e.g., truncations in mgrB/pmrB confer resistance to colistin). A small proportion of isolates (6/95; 6.3%) also carry quinolone resistance genes (i.e., qnr variants).
CG258 and CG307 isolates harbor unique plasmid content. To determine the full complement of plasmid vectors assembled in our CRKp cohort, we characterized the full plasmid content (i.e., the plasmidome) to assess potential clustering and sharing of plasmids by clonal group. The plasmidome of our CRKp isolates was highly diverse (Fig. 5A), with an average number of plasmid structures of 3.5 per isolate (minimum, 1; maximum, 7). Additionally, we analyzed the clustering of individual plasmids as the unit of analysis (Fig. 5B). Plasmid content clustered by core-defined clonal groups (Fig. 5A), with a majority of CG258 isolates clustering within their own distinct group (Fig. 5A, top right). In particular, there were three plasmid types associated with the CG258 lineage, including X3, FII(K)/FIB(K), and FII(K)/FIB (pKpQIL) plasmids (Fig. 5B).
There was a particular rep-3 family replication initiator protein gene (52/415; 12.5%) detected within plasmids found in nearly all Houston CG307 isolates (31/35; 88.8%). These CG307 plasmids associated with the aforementioned repA gene (here referred to as pCG307_HTX) primarily clustered with F-type and R-type plasmids (Fig. 5B, outlined with red dotted square). A total of 30/52 (57.7%) pCG307_HTX plasmids were predicted to be nonmobilizable, with the rest (22/52; 42.3%) predicted to be conjugative and/or mobilizable, harboring tra operons and/or mob-encoding genes with essential oriT sites. Importantly, 10/31 (32.3%) pCG307_HTX plasmids carried Tn4401a-blaKPC (Fig. 5B), suggesting that this plasmid may have been important for carbapenemase dissemination within the Houston CG307 isolates. Collectively, our results indicate that the CG258 plasmidome is conserved and largely segregated from other cocirculating CRKp isolates, including CG307, in agreement with the pan-genome analyses.
The novel pCG307_HTX plasmid carrying Tn4401a conjugates at similar rates compared to the CG258 pKpQIL plasmid. To evaluate the potential for horizontal transfer of carbapenemase genes, we chose 4 plasmid candidates of interest based on their relative frequencies found in Houston CG307 (i.e., pCG307_HTX) isolates, non-Houston CG307 (i.e., pKPN3-307_typeA) isolates, and CG258 (i.e., pKpQIL) isolates. In particular, we assessed the relative conjugation transfer frequencies of two blaKPC-2-carrying plasmids associated with a rep-3 family repA gene (i.e., pCG307_HTX plasmids, one each predicted to be conjugative [pC299_2] or not conjugative [pC711_1], respectively). Conjugation efficiencies were compared using a comparable-size pKpQIL plasmid (pC344_1, CG258) that shared a similar tra operon and cargo gene region (Fig. S8 at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files). Additionally, we chose a pKPN3-307_typeA-like plasmid (pC763_2) primarily found in our non-Houston CG307 isolates to compare its conjugative frequency with the other aforementioned F-type plasmids (Fig. S8). Three of the four plasmids of interest had detectable, positive transconjugants in a proportion of conjugation transfer assay experiments (Fig. S9), with various efficiencies. pKpQIL (pC344_1) and pCG307_HTX (pC299_2) plasmids had comparable average transfer frequencies of 4.1 × 10⁻⁷ and 2.9 × 10⁻⁶, respectively. The pKPN3-307_typeA-like F-type plasmid (pC763_2) had a greater transfer frequency (6.3 × 10⁻⁵). However, this did not reach a statistically significant difference (P = 0.06). As expected from in silico prediction, pC711_1 yielded no transconjugants in any of the three conjugation transfer assays. Overall, our conjugation experiments validated in silico predictions of mobilization and transmission of the novel pCG307_HTX plasmid associated with the Houston CG307.
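For readers unfamiliar with how transfer frequencies of this kind are typically derived, the sketch below computes a frequency as transconjugant colony counts divided by donor colony counts and averages it over replicate assays. The choice of denominator (per donor rather than per recipient) is an assumption made for illustration; the definition used in this study is given in its Materials and Methods. All colony counts below are hypothetical and are not data from the study.

```python
from statistics import mean


def transfer_frequency(transconjugant_cfu_per_ml: float, donor_cfu_per_ml: float) -> float:
    """Conjugation transfer frequency, expressed as transconjugants per donor cell."""
    if donor_cfu_per_ml <= 0:
        raise ValueError("donor count must be positive")
    return transconjugant_cfu_per_ml / donor_cfu_per_ml


def mean_transfer_frequency(replicates: list[tuple[float, float]]) -> float:
    """Average frequency over replicate assays given (transconjugant, donor) CFU/ml pairs.

    Replicates with no transconjugants contribute a frequency of zero.
    """
    return mean(transfer_frequency(t, d) for t, d in replicates)


# Hypothetical replicate assays for one plasmid: (transconjugant, donor) CFU/ml.
toy_replicates = [(40.0, 1e8), (0.0, 1e8), (80.0, 1e8)]
avg_frequency = mean_transfer_frequency(toy_replicates)
print(f"mean transfer frequency = {avg_frequency:.1e}")
```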

DISCUSSION
Although K. pneumoniae CG258 has been considered the major genetic lineage responsible for endemicity of carbapenem resistance, a new lineage designated K. pneumoniae CG307 has emerged in recent years (11)(12)(13). Since the first report of CG307 in the Netherlands in 2008, it has been identified in different parts of the world, including Colombia (23), Italy (24,25), South Africa (26,27), Pakistan (28), and South Korea (29), among others. Interestingly, the first detection of CG307 in the United States occurred in Houston, TX, in 2010, which was also, to our knowledge, the first detection of CG258 in Houston (30), and was subsequently followed by an outbreak in a large Houston hospital system (17). A more recent study assessing the clinical and genomic epidemiology of carbapenem-resistant Enterobacterales in the United States (3) showed that Houston was the first major city in the United States where carbapenem-resistant K. pneumoniae CG307 seemed to have established endemicity along with isolates belonging to CG258. Furthermore, a recent study suggests that CG307 may be spreading to other municipalities in South Texas and cocirculating with CG258 (31). The concomitant circulation of these major clones provided an opportunity to dissect their dynamics of dissemination, genetic relationships, and evolution using a pangenomic approach combined with clinical data. Our findings suggest that CG258 tends to have a higher median plasmid content, with a similar number of AMR determinants found across all CRKp isolates. The most striking finding of our study was that the two main multidrug-resistant K. pneumoniae lineages have evolved independently of one another and appear to be disseminating in parallel with limited evidence of interclade horizontal gene transfer between them. 
Moreover, our results support the notion that CG307 plasmids carrying KPC carbapenemases are likely to be shared with other clonal groups, except CG258, amplifying the epidemic of multidrug-resistant organisms in the same geographical area. Partitioning of the accessory genome by COG functional groups indicates large differences in distribution between CG258 and CG307 accessory genome content that are less apparent when comparing each respective clade with other clonal groups. These differences are potentially driven, in part, by CG307 having a greater proportion of carbohydrate metabolism and cell membrane synthesis determinants. Virulence factors found in our cohort, such as a separate chromosomal capsular synthesis locus (Cp2) shared across CG307 isolates and plasmid-borne glycogen synthesis clusters found on CG307 plasmids, have been previously documented in CG307 isolates found in Colombia and Italy (12).
Virulence factors such as the filamentous, extracellular organelles and type 1 fimbriae are associated with colonization and infection of the urinary tract in Escherichia coli and K. pneumoniae (32)(33)(34)(35). Interestingly, a unique p-fimbria gene cluster (12) was strongly associated with CG307 and CG147 isolates, with urine as a main source where CG307 was recovered. Furthermore, we found accessory genome intersection between CG307 and CG147 isolates by clustering analysis, with comparable proportions of COG functional groups carried by each group. Indeed, there has been recent evidence of parallel antimicrobial resistance patterns shared between CG307 and CG147 with acquisition of similar F- and R-type plasmids harboring blaCTX-M-15 along with identical gyrA and parC mutations conferring quinolone resistance, suggesting a common evolutionary pathway (12).
Our study also identified a unique plasmid (designated pCG307_HTX), found predominantly in Houston CG307 isolates, carrying a rep-3 family replication initiator protein, with potential for vertical and horizontal transmission based on predicted mobility as well as positive transconjugants in our conjugative transfer assays. A similar rep-3 family plasmid that recombined with an FII(K) plasmid had been described in a CTX-M-15-associated ST416 K. pneumoniae isolate (pKDO1; GenBank accession no. JX424423) (36). We also found that the previously identified FII(K)/FIB(K) plasmid (12), harboring blaCTX-M-15 associated with worldwide CG307 isolates, had relatively higher conjugative transfer efficiency than the Houston pCG307_HTX and CG258 pKpQIL plasmids. This FII(K)/FIB(K) plasmid harboring blaCTX-M-15 was less prevalent in the Houston CG307 cohort (n = 3), where the primary vector of ESBL transmission was stable integration of two copies of a chromosomal ISEcp1-blaCTX-M-15 transposon. We found that while CG258 may have a greater number of plasmids per genome, plasmidome analyses suggest a greater diversity of unique vectors carrying genes encoding carbapenemases for the CG307 lineage, with the potential for sharing across other circulating non-CG258 CRKp isolates in the Houston region. These genomic features (i.e., stable chromosomal integration of two blaCTX-M-15 copies together with a diversity of carbapenemase vectors), in conjunction with antimicrobial stewardship practices, may be the primary drivers of CG307 dissemination in Houston, TX.

FIG 5 Legend (Continued) hierarchical clustering using an "average" linkage. (A) Plasmidome mash distance matrix by isolate (n = 121). The legend is labeled as follows: (1) region, (2) clonal group, (3) carbapenemase. There is a noted primary clustering group of CG258 isolates (blue-labeled clade), whereas there are two Houston-based CG307 clustering groups (red-labeled clades) with diffuse clustering occurring with other CGs. (B) Plasmidome mash distance matrix by plasmid type (n = 295) with small, primarily ColE1-like plasmids excluded from analysis. The legend is labeled as follows: (1) region, (2) clonal group, (3) plasmid type, (4) carbapenemase. This analysis shows clustering of primarily CG258 X2-type plasmids, multireplicon FIIK-type plasmids, and pKpQIL plasmids indicated with blue-labeled cluster groups. This is in contrast to one primary CG307 plasmid cluster group, which includes R-type plasmids as well as the novel pCG307_HTX plasmid associated with the Houston group.
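The plasmidome comparisons in Fig. 5 rest on hierarchical clustering of a mash distance matrix with "average" linkage. The following is a minimal sketch of what average linkage does, assuming a small symmetric distance table is already available. The plasmid labels and distance values are hypothetical, and the software actually used for Fig. 5 is not stated in this section, so this is an illustration rather than the authors' pipeline.

```python
from itertools import combinations


def average_linkage(names, dist):
    """Agglomerative clustering with average linkage (UPGMA).

    names: list of unique labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Returns a list of merges as (cluster_a, cluster_b, height),
    where each cluster is a tuple of member labels.
    """
    clusters = [(name,) for name in names]

    def cluster_distance(a, b):
        # Average linkage: mean of all pairwise distances between members.
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical mash-like distances between four illustrative plasmids.
toy_names = ["pA", "pB", "pC", "pD"]
toy_dist = {
    frozenset(("pA", "pB")): 0.02,
    frozenset(("pC", "pD")): 0.04,
    frozenset(("pA", "pC")): 0.30,
    frozenset(("pA", "pD")): 0.34,
    frozenset(("pB", "pC")): 0.32,
    frozenset(("pB", "pD")): 0.36,
}

for left, right, height in average_linkage(toy_names, toy_dist):
    print(left, right, round(height, 3))
```

In this toy input the two near-identical pairs merge first, and the final merge height is the mean of the four between-pair distances, which is the behavior that produces the block structure seen in a clustered distance heat map.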
Interestingly, the propensity for disseminating multiple AMR determinants to other clonal groups was noted in a recent outbreak of CG307 in northeast Germany (37). In contrast to our study, that investigation also found potential convergence of hypervirulence and multidrug resistance in CG307 isolates due to promiscuous sharing of plasmids carrying multiple hypervirulence and AMR determinants (37). Together, our epidemiological investigation and this northeast Germany surveillance study indicate a high potential for horizontal gene transfer, along with regional variation in accessory genome content across different CG307 clades.
There are limitations in our study. Due to the nonrandom sampling conducted in our study, the extent to which CG307 is disseminated in other geographical locales could be much greater than what was inferred from our study. Nevertheless, since selection criteria were comparable across study sites, we believe direct comparisons can be made across CRACKLE-2 study sites. While we have extensive molecular epidemiological data, our limited traditional epidemiological data restrict our ability to assess potential transmission between our participants and clonal outbreaks within our data set (e.g., we could not establish whether a Georgia patient with a CG307 isolate that shared high genomic similarity to the CG307 Houston isolates had any epidemiological links with the Houston region). Caution should be exercised when inferring differences in clinical outcomes of patients with CG258 versus CG307 due to inclusion of both infection and colonizing isolates with small sample sizes, especially when considering statistically significant differences in 30-day mortality between clonal groups. Additionally, due to small sample sizes, some comparisons may be underpowered (i.e., resistance classes, virulence genes, and Kleborate scores across clonal groups), and thus we may fail to detect some differences in AMR composition. Kleborate scores only incorporate a limited number of epidemiologically relevant K. pneumoniae features, which may limit direct comparison of virulence and AMR genetic factors. Although we did not causally link the type 1 fimbriae associated with CG307 isolates to their higher prevalence in urine relative to CG258, future research should explore potential tropism of CG307 for the urinary tract.
In conclusion, our data provide a detailed dissection of parallel epidemics of carbapenem-resistant K. pneumoniae high-risk lineages, both of which are considered urgent public health issues. The genomic and clinical insights presented here are likely to provide novel understanding of the genomic epidemiology of multidrug-resistant K. pneumoniae in order to improve CRKp detection and surveillance.

MATERIALS AND METHODS
Study design. Characterization of patients and isolates was based on the CRACKLE-2 study (3), a prospective, multicenter observational cohort study performed in 49 hospitals across the continental United States. This study focused on patients recruited in Houston, TX, and their isolates. A total of 160 CRE isolates were collected from 10 hospital sites within the Houston, TX, metropolitan area from April 2016 to December 2017. Clinical isolates were identified as belonging to the Enterobacterales group, and antimicrobial susceptibility testing was performed using Etest and Vitek platforms. Inclusion criteria have been described previously (3). A total of 95/160 CRE isolates (59.4%) were identified as belonging to the K. pneumoniae species complex (KpSC) and were subjected to whole-genome sequencing. Further details of the study design are in the supplemental methods (at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files).
Whole-genome sequencing. Isolate culture, genomic DNA extraction, short-read sequencing library preparation, and Illumina short-read sequencing have been described previously (3). All isolates identified as CRKp in our sampling frame were subjected to Oxford Nanopore Technologies (ONT) long-read sequencing using the SQK-RBK004 library preparation kit and sequenced on an Oxford Nanopore GridION X5 (Oxford, UK). The hybrid assembly pipeline has been described previously (38). Genome assembly metrics are shown in Table S4 (at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files) with their respective BioSample accession and Antibacterial Resistance Leadership Group (ARLG) identification numbers. A summary of the molecular epidemiological results found using Kleborate v1.0.0 (18) can be found in Table S5. A matrix of all in silico PCR-based replicon typing found with PlasmidFinder (39) is included in Table S6. A full list of RefSeq/SRA accession numbers and CRACKLE IDs for the CG307 phylogeny (n = 798) can be found in Table S7. A full description of WGS, genome assembly, and comparative genome analysis can be found in the supplemental methods.
Conjugation transfer assays. Four isolates with plasmids of interest based on plasmidome analysis were included in conjugation transfer assays using a modified protocol (43). Details of the plasmids studied can be found in Table S8 (at https://gitlab.com/carmig_dissertations/shropshire_dissertation/crkp_supplemental_files). Easyfig was used to compare plasmid structures (44). Donors and a sodium azide (NaN3)-resistant E. coli strain, J53, were grown overnight at 37°C with mild agitation in Trypticase soy broth (TSB) supplemented with 2 μg/mL ertapenem or 10 μg/mL gentamicin for plasmid selection and 150 μg/mL NaN3 for counterselection. Overnight cultures were washed twice in 0.9% NaCl, subcultured into fresh TSB at a 1:100 dilution, and incubated at 37°C until the mid-log phase (optical density at 600 nm [OD600] of ~0.6). Broth mating was performed at a 1:10 donor-to-recipient ratio in TSB at 37°C for 20 h. Trypticase soy agar (TSA) plates supplemented with 2 μg/mL ertapenem or 10 μg/mL gentamicin and 150 μg/mL NaN3 were used to select for transconjugants. Conjugation transfer frequency was calculated as the ratio of the CFU/mL on the transconjugant plates to the CFU/mL on the donor plates. The limit of detection was calculated by taking the minimum CFU threshold detected factoring in the dilution factor (50 CFU/mL) and dividing that by donor frequency. We used a PCR protocol to confirm transconjugants by screening for blaKPC-2, aacA4, E. coli J53 (rpoB gene), and the plasmids of interest with the primers listed in Table S9.
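The transfer frequency and limit-of-detection arithmetic described above can be sketched as follows. This is an illustrative calculation, not the authors' analysis code: the function names and example counts are hypothetical, and "donor frequency" in the limit-of-detection step is assumed here to mean the donor CFU/mL.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Standard plate-count conversion: colonies x dilution / volume plated."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def transfer_frequency(transconjugant_cfu_per_ml, donor_cfu_per_ml):
    """Ratio of transconjugant CFU/mL to donor CFU/mL."""
    if donor_cfu_per_ml <= 0:
        raise ValueError("donor CFU/mL must be positive")
    return transconjugant_cfu_per_ml / donor_cfu_per_ml


def limit_of_detection(donor_cfu_per_ml, min_detectable_cfu_per_ml=50.0):
    """Smallest frequency that could be observed for a given donor count.

    The 50 CFU/mL default is the threshold stated in the text; dividing by
    the donor CFU/mL is an assumption about what "donor frequency" means.
    """
    if donor_cfu_per_ml <= 0:
        raise ValueError("donor CFU/mL must be positive")
    return min_detectable_cfu_per_ml / donor_cfu_per_ml


# Hypothetical plate counts for one mating experiment.
donor = cfu_per_ml(colonies=80, dilution_factor=1e5, plated_volume_ml=0.02)
transconjugant = cfu_per_ml(colonies=40, dilution_factor=1, plated_volume_ml=0.02)

print(f"transfer frequency: {transfer_frequency(transconjugant, donor):.2e}")
print(f"limit of detection: {limit_of_detection(donor):.2e}")
```

Reporting the limit of detection alongside each frequency makes it clear whether an apparent absence of transconjugants reflects a truly low transfer rate or simply a low donor count.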
Ethics approval and consent to participate. Original data were collected with the approval of The University of Texas Health Science Center at Houston Institutional Review Board (IRB), the Committee for Protection of Human Subjects (CPHS; protocol ID HSC-MS-16-0334; ethical approval 5/16/2016), and no further data were collected from these subjects. Collection of CRACKLE-2 data outside Houston was approved through the Duke University Health System Institutional Review Board for Clinical Investigations (DUHS IRB; protocol ID Pro00071149; ethical approval 6/29/2016), with no clinical data reported in this study. All clinical isolates were stripped of identifying information prior to analysis. All clinical data were deidentified upon receipt.
Statistics. Group-level distributions of continuous variables were evaluated using a one-way analysis of variance (ANOVA) test or the Kruskal-Wallis rank-sum test, contingent on assumptions based on the data distributions. Post hoc tests for continuous variables with significant P values were performed using Tukey's honest significance test or the Dunn test, using the "FDR" method to control for multiple comparisons. The Wilcoxon rank-sum test was used to determine distribution differences of continuous variables across two groups. Distributions of categorical data were evaluated using the Pearson chi-square or Fisher's exact test, dependent on cell counts. All statistical analysis was performed with R v4.0.0 software.
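To make the multiple-comparison step concrete, the sketch below implements the Benjamini-Hochberg adjustment, which is the procedure that the "fdr" option of R's p.adjust refers to. It is a standalone illustration in Python rather than the R code used in the study, and the input P values are hypothetical. It shows only the adjustment applied to a set of pairwise P values, not the Dunn test itself.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest P value to the smallest so that the running
    # minimum enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted P values are capped at 1
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from four pairwise post hoc comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, bh_adjust(raw)):
    print(f"raw={p:.3f} adjusted={q:.3f}")
```

Each raw P value is scaled by the number of comparisons divided by its rank, so the smallest P values are penalized most heavily while the largest is left essentially unchanged.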
Data availability. Short-read and long-read FASTQ files along with complete assemblies for all K. pneumoniae isolates sequenced, including isolates previously published (3)