High-Resolution Genomic Profiling of Carbapenem-Resistant Klebsiella pneumoniae Isolates: A Multicentric Retrospective Indian Study

Abstract Background Carbapenem-resistant Klebsiella pneumoniae (CRKP) is a threat to public health in India because of its high dissemination, mortality, and limited treatment options. Its genomic variability is reflected in the diversity of sequence types, virulence factors, and antimicrobial resistance (AMR) mechanisms. This study aims to characterize the clonal relationships and genetic mechanisms of resistance and virulence in CRKP isolates in India. Materials and Methods We characterized 344 retrospective K. pneumoniae clinical isolates collected from 8 centers across India in 2013–2019. Susceptibility to antibiotics was tested with VITEK 2. Capsular types, multilocus sequence type, virulence genes, AMR determinants, plasmid replicon types, and a single-nucleotide polymorphism phylogeny were inferred from their whole genome sequences. Results Phylogenetic analysis of the 325 Klebsiella isolates that passed quality control revealed 3 groups: K. pneumoniae sensu stricto (n = 307), K. quasipneumoniae (n = 17), and K. variicola (n = 1). Sequencing and capsular diversity analysis of the 307 K. pneumoniae sensu stricto isolates revealed 28 sequence types, 26 K-locus types, and 11 O-locus types, with ST231, KL51, and O1V2 being predominant. blaOXA-48-like and blaNDM-1/5 were present in 73.2% and 24.4% of isolates, respectively. The major plasmid replicon types associated with carbapenemase genes were IncF (51.0%) and Col group (35.0%). Conclusion Our study documents for the first time the genetic diversity of K and O antigens circulating in India. The results demonstrate the practical applicability of genomic surveillance and its utility in tracking the population dynamics of CRKP. They also highlight the urgent need for longitudinal surveillance of these transmissible lineages.

Epidemic lineages of CRKP have increasingly emerged and spread through global healthcare systems since they were first identified in 2001 [4]. The worldwide spread of these strains is alarming because they are MDR, with resistance to β-lactam antibiotics, fluoroquinolones, and aminoglycosides [5]. Their resistance is generally due to the production of carbapenemases, such as K. pneumoniae carbapenemase (KPC), a class A serine carbapenemase, and metallo-β-lactamases. Other causes include combinations of outer membrane permeability loss and extended-spectrum β-lactamase production [6].
According to the Center for Disease Dynamics, Economics and Policy, India has seen an increase in carbapenem resistance in K. pneumoniae, from 24% in 2008 to 59% in 2017, though several single-center studies have shown variable rates [7][8][9]. This dramatic increase in recent years is attributed, among other factors, to the lack of formal infection control measures and adequate medical intervention, prolonged hospitalization, the presence of comorbidities, and the overuse of antibiotics. In a single-center study from south India, the rate of mortality among patients with CRKP bloodstream infections is as high as 68% [10]. Despite the high disease burden, there are limited reports from India on the resistance mechanisms in MDR K. pneumoniae isolates, highlighting the need for comprehensive epidemiological surveillance results [10].
The expansion and dissemination of CRKP at different geographical locations throughout India necessitates a targeted analysis of the population structure, genomic mechanisms of resistance, and virulence of strains collected from the area of interest. Recent developments in understanding population structure highlight enormous genomic diversity and provide a basis for the pathogen to be tracked [11]. Whole genome sequencing (WGS) is a powerful tool for the characterization and surveillance of pathogens. It offers an unparalleled opportunity to explore genomic content and verify and evaluate the diversity of clusters and the frequency of contemporary isolates. It is now well-positioned to become the gold standard to resolve the knowledge gap [12].
Understanding of the mechanism by which this bacterium causes various infections is still basic, and most studies have limitations because of the narrow range of virulence factors being investigated. Little is known about the epidemiology of the capsule because serological and molecular typing are not commonly available, and several isolates are not typable using these methods. Phenotypic studies have identified 77 distinct capsule types (K types) and genomic studies have identified 134, but the true extent of capsule diversity remains unknown [13]. WGS can provide new insights into disease transmission, virulence, and antimicrobial resistance (AMR) dynamics when combined with epidemiological, clinical, and phenotypic microbiological information [12].
Since 2013, the Central Research Laboratory, Kempegowda Institute of Medical Sciences in India has developed a network of tertiary care hospitals, medical college hospitals, and stand-alone diagnostic laboratories across India. This network was extended for collection of retrospective isolates belonging to World Health Organization priority bacterial pathogens. In this report, we characterize the clonal relationships and genetic mechanisms of resistance and virulence in CRKP isolates in India. WGS was performed on a retrospective collection of 344 K. pneumoniae isolates to characterize their relationships, multilocus sequence type (MLST), capsular type, virulence genes, and AMR determinants.

Bacterial Isolates and Phenotypic Characterization
The study included 344 retrospective (2013-2019) invasive and noninvasive putative K. pneumoniae isolates received from 8 teaching hospitals in 6 Indian regions. They were characterized at the Central Research Laboratory, Kempegowda Institute of Medical Sciences, using the VITEK 2 Compact system (bioMérieux). The results were interpreted according to the 2019 Clinical and Laboratory Standards Institute guidelines. Isolates that tested resistant (R) or intermediate (I) to carbapenems were considered resistant (R). An isolate was designated MDR when it showed resistance to ≥ 1 agent in ≥ 3 antimicrobial categories [12].
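The two classification rules above (R or I to a carbapenem counts as resistant; MDR means resistance in at least 3 antimicrobial categories) can be sketched as follows. This is a minimal illustration, not the study's workflow: the drug-to-category mapping and all function names are hypothetical, the mapping covers only a few example agents, and counting only R (not I) for non-carbapenem agents is an assumption.

```python
# Hypothetical, non-exhaustive mapping of agents to antimicrobial categories.
CATEGORY_OF = {
    "meropenem": "carbapenems",
    "imipenem": "carbapenems",
    "ceftriaxone": "extended-spectrum cephalosporins",
    "ciprofloxacin": "fluoroquinolones",
    "amikacin": "aminoglycosides",
    "gentamicin": "aminoglycosides",
}


def normalized_call(drug, result):
    """Treat an intermediate (I) carbapenem result as resistant (R), as in the text."""
    if CATEGORY_OF.get(drug) == "carbapenems" and result == "I":
        return "R"
    return result


def resistant_categories(ast):
    """Return the set of categories with at least one resistant agent.

    `ast` maps drug name -> "S", "I", or "R". Drugs missing from the
    illustrative mapping are ignored (an assumption of this sketch).
    """
    return {
        CATEGORY_OF[drug]
        for drug, result in ast.items()
        if drug in CATEGORY_OF and normalized_call(drug, result) == "R"
    }


def is_carbapenem_resistant(ast):
    return "carbapenems" in resistant_categories(ast)


def is_mdr(ast):
    """MDR: resistance to >= 1 agent in >= 3 antimicrobial categories."""
    return len(resistant_categories(ast)) >= 3


example = {"meropenem": "I", "ciprofloxacin": "R", "amikacin": "R", "ceftriaxone": "S"}
print(is_carbapenem_resistant(example), is_mdr(example))
```

Counting categories rather than individual agents means that resistance to both amikacin and gentamicin contributes only once toward the MDR threshold.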

Sequencing and Genomic Analyses
Genomic DNA was extracted from bacterial isolates with Qiagen QIAamp DNA Mini kit, in accordance with the manufacturer's instructions. Double-stranded DNA libraries with 450 bp insert size were prepared and sequenced on the Illumina platform with 150 bp paired-end chemistry.
Bioinformatic analysis was performed using Nextflow pipelines developed as a part of Genomic Surveillance of Antimicrobial Resistance-AMR, as detailed in www.protocols.io [14]. The genomes of the 325 samples that passed sequence quality control were assembled into contigs using SPAdes v3.14 and annotated with Prokka v1.5 [15,16]. Species were identified using bactinspector, and contamination was assessed using confindr [17,18]. All quality metrics were combined using multiqc and qualifyr to generate web-based reports [19,20]. MLST, AMR determinants, and virulence factors were identified using ARIBA v2.14.4 [21] with the BIGSdb-Pasteur MLST database [22], the NCBI AMR acquired gene and PointFinder databases [20], and VFDB [23], respectively (Supplementary Table 1).
SNPs were identified for the 307 K. pneumoniae isolates by mapping reads to the NCBI reference genome of K. pneumoniae strain NTUH-K2044 (NC_006625.1) using bwa mem [24]. The called variants were filtered, and a maximum likelihood phylogeny was constructed with 1000 bootstrap replicates using IQTree [25], as implemented in the SNP phylogeny pipeline [26].
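Once reads are mapped to a common reference, each isolate can be represented as a sequence of equal length, and the number of SNP differences between two isolates is the count of positions at which they differ. The sketch below illustrates that calculation only; it is not the SNP phylogeny pipeline cited above. The function names are hypothetical, and skipping positions with an ambiguous base (N) or a gap in either sequence is an assumption.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count positions where two aligned sequences differ.

    Positions where either sequence has a character in `ignore`
    (ambiguous base or gap) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_distances(alignment):
    """Return {(name_1, name_2): distance} for every pair of isolates."""
    return {
        (i, j): snp_distance(alignment[i], alignment[j])
        for i, j in combinations(sorted(alignment), 2)
    }


# Toy alignment with hypothetical isolate names.
toy = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "ANGTTCGA"}
for pair, d in pairwise_snp_distances(toy).items():
    print(pair, d)
```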
K and O antigen types, virulence factors, and plasmid replicons specific to Klebsiella species were identified using Kleborate and Kaptive, as implemented in Pathogenwatch [27].

Summary of the Collection
The collection consisted of 325 isolates from patients aged 7 days to 96 years, of whom 60% were 50-80 years old. Most isolates were from urine (31.4%) and blood (29.8%) (Supplementary Table 2). VITEK 2 Compact identified all 325 isolates as K. pneumoniae, and sequencing reassigned them to the species K. pneumoniae (n = 307, 94.4%), K. quasipneumoniae (n = 17, 5.5%), and K. variicola (n = 1, 0.1%). This highlights the limited ability of traditional identification methods to distinguish species within the K. pneumoniae species complex. In addition, the use of genomic tools unmasks the true clinical significance of each phylogroup and their potential epidemiological specificities [28].
Clonal lineages of K. pneumoniae differ in their ability to acquire resistance and virulence genes, and in their propensity to spread within hospital and community environments [29]. Overall, the K. pneumoniae isolates sequenced in this study belonged to 28 different sequence types (STs). ST231 was the most common (n = 107, 34.8%), followed by ST147 (n = 73, 23.5%) and ST14 (n = 26, 8.5%); together, these 3 STs accounted for 67.1% of all K. pneumoniae isolates (Table 1, Supplementary Figure 1). The high prevalence of ST231 is in concordance with published data from India [30]. ST231 was the only clonal lineage found across different centers and different specimen sources in India, both in our study and in other Indian studies [10,30]. The temporal distribution of ST231 isolates throughout the study period (2014-2019) and across different regions demonstrates its regional distribution and spread across India. By contrast, ST147 and ST14 were observed in 4 study centers (in the northern, southern, and western parts of India). K. pneumoniae isolates from a hospital in the western region of India had 22 STs, revealing polyclonality within a single hospital.
ST258 is recognized as the most prevalent and extensively disseminated KPC-producing K. pneumoniae in many countries, which makes its absence from our collection noteworthy. ST11, a single-locus variant of ST258 and a prevalent clone associated with the spread of KPC in Asia (particularly in China and Taiwan), was identified in 1.3% of the K. pneumoniae isolates [31]. ST147 and ST14, described as international high-risk clones associated with extensive drug resistance, accounted for 23.5% and 8.5% of the isolates in this study, respectively [32,33]. One novel ST (ST5603) with extensive drug resistance was identified in a single K. pneumoniae isolate (Supplementary Table 2). The novel ST (gapA2-infB1-mdh1-pgi8-phoE10-rpoB4-tonB202) was a single-locus variant of ST890, varying at the tonB gene (tonB61), and possessed the KL107 and O1v1 loci.
KL51 (n = 111/307, 36.1%) and KL64 (n = 104/307, 33.8%) were the dominant K-locus types in this study, whereas KL1, KL2, KL5, KL20, KL17, KL51, KL54, KL57, and KL64 are the K-locus types reported from other Indian studies [31]. KL51 has been reported in isolates from the United States, Sweden, United Kingdom [34], Thailand [35], and Lebanon [36]. The phylogenetic tree of the 307 isolates shows the correlation between capsular locus type and sequence type, for example, KL51 with ST231 and KL64 with ST147 and ST395 (Supplementary Figure 1). Notably, no isolates were assigned to KL1, despite its high prevalence in a previous study [37].
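The single-locus variant relationship described for ST5603 can be checked by comparing allelic profiles across the 7 MLST loci: two STs are single-locus variants when exactly one locus carries a different allele number. The following is a minimal sketch with hypothetical function names. The ST5603 profile is taken from the text; the ST890 profile is an assumption, reading the text as saying that ST890 is identical except for tonB allele 61.

```python
# The 7 loci of the K. pneumoniae MLST scheme, in the order used in the text.
LOCI = ("gapA", "infB", "mdh", "pgi", "phoE", "rpoB", "tonB")


def differing_loci(profile_a, profile_b):
    """List the loci at which two allelic profiles carry different alleles."""
    return [locus for locus in LOCI if profile_a[locus] != profile_b[locus]]


def is_single_locus_variant(profile_a, profile_b):
    """True when the profiles differ at exactly one of the 7 loci."""
    return len(differing_loci(profile_a, profile_b)) == 1


# Profile reported in the text for the novel ST.
st5603 = {"gapA": 2, "infB": 1, "mdh": 1, "pgi": 8, "phoE": 10, "rpoB": 4, "tonB": 202}
# Assumed profile for ST890: same alleles except tonB 61 (inferred from the text).
st890 = dict(st5603, tonB=61)

print(differing_loci(st5603, st890), is_single_locus_variant(st5603, st890))
```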

Virulence
More than 10 virulence factors account for the pathogenesis of CRKP, and their detection helps understand the pathogenesis of different strains [38]. In the present study, a total of 33 genes belonging to 6 major virulence factors were observed. The core virulence genes encoding type I and III fimbriae, enterobactin, the AcrAB efflux pump, and regulators (RmpA, RcsAB) were identified in all 307 isolates. The acquired virulence genes coding for colibactin, nutrition factor, salmochelin, and rmpA were completely absent. Yersiniabactin, an iron uptake locus (ybtAEPQSTUX), was identified in 89.9% (276/307) of the isolates, and the regulatory gene rmpA2 was present in 5.5% (17/307). Aerobactin, a dominant siderophore, was found in 38.1% (n = 117) of the isolates, all belonging to ST231 or ST2096 (Supplementary Figure 3). The highest recorded virulence score of 4 was observed in 117 isolates (38.1%), which carried both yersiniabactin and aerobactin genes. Of the K. pneumoniae ST231-KL51 isolates, 94% had a virulence score of 4 (Supplementary Table 1). Phylogenetic analysis, using the dataset from this study and a global collection of ST231 isolates, identified a sublineage that has acquired aerobactin and yersiniabactin, as well as the OXA-232 carbapenemase [39].
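The virulence score used above appears to follow the 0 to 5 scheme documented for Kleborate, in which the score depends on the presence of yersiniabactin (ybt), colibactin (clb), and aerobactin (iuc); that attribution is an assumption, since the text does not define the score. The sketch below encodes that scheme as commonly summarized. The function name is hypothetical, and the handling of aerobactin with colibactin but without yersiniabactin is a guess, because that combination is not covered by the summary.

```python
def virulence_score(ybt, clb, iuc):
    """Score an isolate from the presence of three acquired virulence loci.

    Scheme as commonly summarized for Kleborate (assumed here):
      0 = none of ybt, clb, iuc
      1 = ybt only
      2 = clb, with or without ybt (no iuc)
      3 = iuc only
      4 = iuc with ybt (no clb)
      5 = ybt, clb, and iuc
    """
    if iuc and clb:
        # Assumption: iuc + clb without ybt is treated like the full combination.
        return 5
    if iuc and ybt:
        return 4
    if iuc:
        return 3
    if clb:
        return 2
    if ybt:
        return 1
    return 0


# An isolate carrying yersiniabactin and aerobactin but no colibactin,
# the combination described in the text for the score-4 isolates.
print(virulence_score(ybt=True, clb=False, iuc=True))
```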

Resistance Profile and Their Distribution
Accumulation of AMR in K. pneumoniae is primarily the result of horizontal gene transfer aided by plasmids and mobile genetic elements. Since the first report of CRKP in 1996, the incidence of this MDR pathogen has increased significantly. The resistance is primarily due to production of the acquired carbapenemases blaKPC, blaOXA, and blaNDM, and to the combinatorial mechanism of extended-spectrum β-lactamase (ESBL) activity with loss of outer membrane porins [40]. This has become worrisome, particularly at a time when no new promising antimicrobial agents are on the horizon [41]. For public health initiatives, understanding their emergence and distribution over a diverse geographical region is needed [11].
In the collection, 225 isolates carried blaOXA-48-like genes. We observed that 93.7% (n = 211) of them harbored a plasmid with the ColKP3 replicon sequence (Supplementary Figure 2). A similar association between the ColKP3 replicon and the blaOXA-232 gene was also observed in ST231 isolates from around the world (Supplementary Figure 5) [39].

High-Risk Clone: ST231
ST231, an emerging CRKP epidemic clone, has been reported from India, France, Singapore, Brunei Darussalam, and Switzerland [45]. In southeast Asia, this clone was found to be MDR, combining resistance to carbapenems, extended-spectrum cephalosporins, and broad-spectrum aminoglycosides [46]. In India, ST231 was first reported in 2013 in Delhi, with the isolate carrying blaOXA-232 as the predominant blaOXA-48-like carbapenemase variant [30].
The tree of 107 ST231 isolates from this study showed evidence of clonal spread of ST231 carrying blaOXA-232 and blaCTX-M-15 within 1 hospital over a period of 3 years (hospital 1, Figure 2). Forty-two isolates formed a tight cluster with a mean SNP difference of 1 (range, 0-3), and they were separated by at least 43 SNP differences from the remaining 65 ST231 isolates in this study. The isolates in this cluster were mostly from inpatients (37/42) and characterized by a median patient age of 73.5 years (range, 26-89) compared with 67 years (range, 7 days-96 years) for all 239 K. pneumoniae isolates collected by this hospital. Altogether, this reveals a persistent outbreak of carbapenem- and cephalosporin-resistant ST231 within hospital 1 and underscores the need to strengthen infection prevention and control. Importantly, the tree also shows other ST231 genomes from hospital 1 that show similar resistance and virulence profiles, but that can be clearly distinguished from this large outbreak by their clustering, highlighting the utility of WGS to rule cases out of an outbreak investigation even when other phenotypic or genotypic markers would be inconclusive.
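A tight cluster such as the one described above, with small within-cluster SNP differences and a clear gap to all other isolates, can also be recovered directly from a pairwise SNP distance matrix. The sketch below uses single-linkage grouping at a fixed threshold and then reports the mean and range of distances within a cluster. It is an illustration only: the study identified the cluster from the phylogenetic tree, the threshold value is hypothetical, and the function names and toy distances are invented for the example.

```python
from itertools import combinations
from statistics import mean


def single_linkage_clusters(names, dist, threshold):
    """Group isolates connected by chains of pairs within `threshold` SNPs.

    `dist` maps frozenset({a, b}) -> SNP distance for every pair of names.
    """
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(clusters.values(), key=len, reverse=True)


def summarize(cluster, dist):
    """Mean, minimum, and maximum pairwise SNP distance within one cluster."""
    d = [dist[frozenset(pair)] for pair in combinations(cluster, 2)]
    return {"mean": mean(d), "min": min(d), "max": max(d)}


# Toy data: three closely related isolates and one distant isolate.
names = ["i1", "i2", "i3", "i4"]
dist = {
    frozenset(("i1", "i2")): 0, frozenset(("i1", "i3")): 2, frozenset(("i2", "i3")): 3,
    frozenset(("i1", "i4")): 45, frozenset(("i2", "i4")): 44, frozenset(("i3", "i4")): 43,
}
clusters = single_linkage_clusters(names, dist, threshold=10)  # hypothetical cutoff
print(clusters[0], summarize(clusters[0], dist))
```

Single linkage is a permissive choice: one intermediate isolate can join two otherwise distant groups, so any threshold should be checked against the tree.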
We conclude that the MDR ST231 lineage carrying both important resistance and virulence determinants is a major and rapidly disseminating CRKP high-risk clone in India capable of causing nosocomial outbreaks [30]. The emergence of the ST231 clonal lineage has also recently been reported in Switzerland, France, and Thailand [48][49][50]. The presence of MDR and virulence genes poses a risk in that the lineage may be a reservoir of virulence-associated genes that can be passed on by horizontal gene transfer to other lineages. This means that a high level of vigilance and monitoring is required. As shown by other studies, WGS can be used as an outbreak detection tool, allowing the detection of widely dispersed outbreaks that might not be otherwise identified.
This study has some limitations. First, very few retrospective isolates were retrieved from 7 sentinel sites, and the majority of the isolates were from 1 hospital, which was an archival facility. Furthermore, the outbreak observed here was sampled from 1 center. However, this study represents a starting point for a deeper understanding of K. pneumoniae population structure and diversity, and we will build on these findings with prospective genomic surveillance connecting more hospitals representing each of the Indian states.

CONCLUSION
The study establishes the presence of several high-risk MDR CRKP clones in clinical samples collected across India. It represents the basis for genomic surveillance of emerging CRKP in India, providing critical information that can be used to track the emergence and dissemination, and assess the potential impact, of important variants. The lack of a structured surveillance framework and the inability to access patient clinical data have limited our interpretation. To the best of our knowledge, this is the first WGS study from India to document the genetic diversity of K and O antigens circulating in Indian CRKP isolates.