Regional spread of blaNDM-1-containing Klebsiella pneumoniae Sequence Type 147 in post-acute care facilities

Background Carbapenem-resistant Enterobacterales (CRE) harboring blaKPC have been endemic in Chicago-area healthcare networks for more than a decade. During 2016-2019, a series of regional point prevalence surveys identified increasing prevalence of blaNDM-containing CRE in multiple long-term acute care hospitals (LTACHs) and ventilator-capable skilled nursing facilities (vSNFs). We performed a genomic epidemiology investigation of blaNDM-producing CRE to understand their regional emergence and spread. Methods We performed whole-genome sequencing on NDM+ CRE isolates from four point-prevalence surveys across 35 facilities (LTACHs, vSNFs, and acute care hospital medical intensive care units) in the Chicago area and investigated the genomic relatedness and transmission dynamics of these isolates over time. Results Genomic analyses revealed that the rise of NDM+ CRE was due to the clonal dissemination of an ST147 Klebsiella pneumoniae strain harboring blaNDM-1 on an IncF plasmid. Dated phylogenetic reconstructions indicated that ST147 was introduced into the region around 2013 and likely acquired NDM around 2015. Analyzing genomic data in the context of patient transfer networks supported initial increases in prevalence due to intra-facility transmission in certain vSNFs, with evidence of subsequent inter-facility spread to connected LTACHs and vSNFs via patient transfer. Conclusions We identified a regional outbreak of blaNDM-1 ST147 that began in and disseminated across Chicago area post-acute care facilities. Our findings highlight the importance of performing genomic surveillance at post-acute care facilities to identify emerging threats.


Introduction
Carbapenem-resistant Enterobacterales (CRE) represent an urgent antibiotic resistance threat due to their resistance to first-line antibiotics and transmissibility in healthcare settings [1,2]. The emergence of epidemic lineages of CRE that are resistant to nearly all antibiotics and that cause infections with high mortality rates, such as Klebsiella pneumoniae carbapenemase (KPC) containing Klebsiella pneumoniae (KPC-Kp) sequence type (ST) 258 [3], has further escalated the need for more effective strategies to interrupt CRE transmission. Most interventions to prevent the spread of CRE and other healthcare-associated antibiotic resistance threats have been implemented at the level of individual healthcare facilities. However, there is now a multitude of evidence that the frequent movement of colonized and infected patients among regional healthcare facilities necessitates regional surveillance and infection prevention strategies [4].
Long-term acute care hospitals (LTACHs) and ventilator-capable skilled nursing facilities (vSNFs) are potentially high-impact settings for implementation of regional CRE surveillance and infection prevention interventions [5,6]. Patients in these facilities have been shown to be colonized with CRE at high rates, likely due to a combination of their chronic severe illness, long lengths of stay, and high rates of prior or on-going antibiotic exposure. Modeling and epidemiologic studies have suggested that the high CRE prevalence in LTACHs in particular has a significant impact on connected healthcare facilities with which they share patients [7,8].
Currently, less is known about the regional influence of vSNFs, although the even longer lengths of patient stay and more limited resources for infection prevention indicate that they might also be important settings in regional amplification of antibiotic resistance.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint 7 For all analyses, only the first isolate of a given ST (for K. pneumoniae isolates) or species (for all other species) was used for each patient. We used a Fisher's exact test to test for the statistical significance of the difference in NDM or KPC prevalence between the first and last surveys. Fisher's exact p-values were corrected using the Benjamini-Hochberg method.

Whole-genome sequencing
Genomic DNA was extracted from cultures derived from single sub-cultured colonies. Genomic libraries were prepared with NEBNext Ultra DNA library prep kit and sequenced at the University of Michigan Advanced Genomics Core on an Illumina NovaSeq 6000. All sequenced isolates have been deposited under BioProject PRJNA686897.

Determining patient flow between facilities
We constructed a patient transfer matrix of the Chicago metropolitan region using the Centers for Medicare and Medicaid Services' minimum data set, Medicare Provider Analysis and Review [MEDPAR] limited data set, Medicaid Analytic eXtract Data from 2010-2012. Using the patient . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)  [61], and cowplot v1.1.0 [62]. Code for analysis and visualization can be found here: https://github.com/Snitkin-Lab-Umich/ndm-st147-chicago-ms.

Ethical Review
Bacterial isolates and de-identified clinical metadata were collected under a prior surveillance project that underwent ethical review at the CDC and was determined to be a nonresearch activity (public health surveillance). The project was also evaluated independently at each participating healthcare facility and either deemed a public health assessment or human subjects research and approved by local review boards where applicable.

Results
Prevalence of NDM, but not KPC, increased over time in certain vSNFs that are not closely connected by patient transfer.
We first detected the presence of NDM+ CRE isolates in vSNFs and LTACHs during a regional point prevalence survey conducted in 2017 (survey 13; Fig 1A). To more closely monitor the potential emergence and spread of NDM in the region we performed three additional surveys for . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint CRE colonization in regional vSNFs and LTACHs, each 6 months apart (surveys 13-1, 13-2, and 13-3; Table 1). We found that, while the prevalence of KPC+ CRE remained relatively stable over time in all facilities, the prevalence of NDM+ CRE increased in three vSNFs not closely connected by patient transfer (Fisher's exact p < 0.05 for vSNFs J, K, and L; Fig 1B, S1,   S2; Table S1).
The majority of NDM+ isolates are K. pneumoniae ST147 and carry blaNDM-1 on an IncF plasmid.
To understand the molecular basis for the increase in NDM+ CRE we performed whole-genome sequencing on all CRE isolates from survey 13 and NDM+ isolates from the subsequent three follow-up surveys. We found that the presence of NDM was highly correlated with the presence of a suite of genes present on an NDM+ IncF plasmid isolated from K. pneumoniae [63] (Fig   S3). Most isolates containing the IncF plasmid were blaNDM-1 K. pneumoniae ST147; however, one blaNDM-1 Escherichia coli ST354 isolate also contained the plasmid (Fig 2, S4). While shortread sequencing data alone is insufficient to provide structural data associating NDM with the plasmid backbone, the high degree of correlation between NDM and the IncF-associated plasmid genes in concert with the phylogenetic relationships between isolates strongly suggests that these genes are co-inherited (Fig S3). Interestingly, 9 of 17 (53%) isolates with an IncF plasmid in survey 13 were KPC+/NDM-(ST14, 1 ST258, and 1 novel), but this plasmid then became present in predominantly blaNDM-1 isolates, with only 0-3 isolates with the IncF plasmid being KPC+/NDM-in the subsequent surveys (0-7%) (Fig S4). In addition to NDM, the IncF plasmid harbored a number of antibiotic resistance genes from several different resistance classes and the qacE gene, which has been shown to provide resistance to common biocides such as chlorhexidine gluconate in K. pneumoniae [64] (Fig S3).
Regional ST147 isolates are phylogenetically distinct from all public isolates.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) We investigated the phylogeography of ST147 in the Chicago area to determine whether circulating ST147 could be attributed to one or multiple importation events into the region. To this end we constructed a whole-genome phylogeny that included publicly available ST147 genomes from across the globe. Examination of the phylogenetic reconstruction revealed that all the ST147 isolates from the study form a monophyletic cluster, consistent with a single regional introduction (Fig 3). We also noted that while none of the NDM+ ST147 public isolates outside of the Chicago region harbor the IncF plasmid, the majority of ST147s from Chicago contain the plasmid. Moreover, one of the ST147 isolates in the KPC+/NDM-outgroup (from survey 13-2) contains the IncF plasmid but lacks NDM (Fig S5), suggesting that the plasmid may have been acquired in a locally circulating ST147 strain, followed by integration of a mobile  Table   S2).

Genomic evidence indicates that intra-facility transmission is driving prevalence at highprevalence vSNFs.
After determining that the increase in NDM prevalence corresponds to a clonal outbreak of blaNDM-1 ST147, we investigated the potential transmission dynamics of this clone within and between healthcare facilities. We observed a substantial clustering of isolates from certain vSNFs on the phylogeny (Fig 5A, S7), which suggests potential intra-facility transmission. To further investigate whether these clusters may represent within-facility transmission, we calculated pairwise SNV distances among all pairs of isolates and compared these distances for isolates from the same facility (intra-facility pairs) to isolates from different facilities (inter-facility pairs) across surveys (Fig 5B). Indeed, starting in survey 13-2 we observed a disproportionate . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint representation of small SNV distances (≤12 SNVs; see methods for threshold selection) consistent with intra-facility transmission in vSNFs. Of note, in survey 13-3 we observed spikes in small SNV distances for both intra-and inter-facility pairs, with closely related intra-facility pairs being primarily from vSNFs and closely related inter-facility pairs being from both vSNF-LTACH and vSNF-vSNF pairs (Fig S8). Putting these closely related inter-facility pairs in the context of the regional patient transfer network supports a potential role of patient transfer in regional blaNDM-1 ST147 spread in survey 13-3, with vSNF-LTACH and vSNF-vSNF isolate pairs less than 12 SNVs apart being from facilities with higher patient flow than isolate pairs with 12 or more SNVs (Wilcox p = 7.7e-6 and 1.4e-9, respectively; Fig S9).

Discussion
We performed genomic analyses of CRE isolates collected through serial point-prevalence surveys in the Chicago area to understand the molecular and epidemiologic basis for an increase in NDM+ CRE prevalence across a regional healthcare network. Our analysis supports the increase in NDM+ CRE being due to the clonal dissemination of a single blaNDM-1 ST147 strain of K. pneumoniae that emerged in 2015. Putting genomic analysis in the context of the regional healthcare network supports this strain first reaching high prevalence in a small number of vSNFs due to intra-facility transmission, with potential subsequent spread to other connected healthcare facilities via patient transfer.
Whole-genome sequencing showed that the majority of blaNDM-1 ST147 harbored an IncF multidrug resistance plasmid. Incorporating public data into the analysis revealed that these isolates formed a monophyletic clade, suggesting a single introduction into the region, either through importation of a pre-existing NDM+ ST147 strain or acquisition of blaNDM by a locally circulating ST147 strain. Furthermore, examination of the global phylogeny indicates that, while . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint NDM+ ST147 has evolved multiple times in different locations and sometimes resulted in clonal outbreaks, none of the global NDM+ ST147 isolates we included in our analysis harbor the IncF plasmid found in our study isolates. The rapid and widespread dissemination of this strain in the region indicates that the NDM-carrying IncF plasmid we identified here can stably associate with an ST147 strain with epidemic potential. Given the potential negative impact of epidemic NDMcarrying K. pneumoniae, this possibility warrants close monitoring. By combining regional surveillance with genomic analysis, we were able to discern that NDM initially spread in three vSNFs, likely via intra-facility transmission, with evidence of subsequent spread to connected healthcare facilities. There are several factors that likely contributed to the spread of this NDM+ ST147 clone. First, vSNF patients are a high-risk population for carriage of CRE as they are chronically ill, are usually admitted from ICUs or LTACHs, and are often exposed to antibiotics [65]. Second, patients in vSNFs generally have long lengths of stayoften much longer than patient stays at LTACHs [66] -meaning that they have a longer period of time to acquire a multi-drug resistant organism. Furthermore, multibed rooms are common and the facilities themselves are often under-resourced from a staffing and infection control perspective [6], both of which could facilitate intra-facility spread. Our findings paired with these observations indicate that vSNFs may be important healthcare facilities to detect emerging threats and potentially contain them before widespread dissemination. In the current study, we note that NDM+ isolates were uncommon in ICUs, and the outbreak of ST147 might not have been detectable until much later if sampling were restricted to ICUs.
Of note, the Chicago PROTECT intervention, a bundled infection prevention intervention designed to reduce the prevalence of CRE, began at around the same time as NDM+ ST147 began amplifying in the region. There are several potential reasons for this association. The intervention could have created a new niche for NDM+ isolates to thrive in, either due to NDM . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint itself, or other genes on the associated plasmid. One possibility is that the qacE gene harbored on the NDM IncF plasmid might have provided a selective advantage in the context of an intervention that included chlorhexidine gluconate bathing. It is also possible that the emergence of NDM during the intervention was coincidental, and that the intervention might have even prevented a larger NDM+ outbreak from occurring across more facilities. Regardless, this study exemplifies how genomic surveillance can be used to monitor emerging threats and understand how they develop and circulate through a region.
Our study has several strengths. Active surveillance of diverse types of healthcare facilities in the region allowed us to identify and investigate a potential multi-drug resistant organism threat earlier than would have been possible if serial point-prevalence surveys across several facility types were not ongoing. Furthermore, cross-sectional patient sampling within each survey allowed us to obtain a complete snapshot of CRE prevalence at a given facility at a given point in time, and to detect the increase in NDM+ isolates over time. Finally, we were able to leverage information from whole-genome sequencing to investigate the relatedness of isolates, as well as the intra-and inter-facility transmission dynamics of NDM across the healthcare network.
Our study also has several important limitations. First, we only have data from discrete points in time for each survey. This could have led to potential biases in the number of NDM+ isolates sequenced at facilities given that the patients at these facilities had different average lengths of stay, and it also precluded more nuanced examination of intra-facility transmission dynamics.
Second, we lack data from short-term acute care or community settings, particularly in the last three surveys, which limited our ability to examine the relative importance of other regional reservoirs for NDM+ ST147. However, the short-term acute care data that is available did not support their role in blaNDM-1 ST147 expansion. Another limitation is that we only had access to short-read sequencing data, which limited our ability to investigate more complex plasmid . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint dynamics. While we plan to perform long-read sequencing on a subset of these isolates in the future, we find it notable that we were able to leverage publicly available complete plasmid sequences to determine that NDM was carried on the same plasmid in the majority of isolates.
In conclusion, we were able to combine whole-genome sequencing of NDM+ isolates from serial point-prevalence surveys at vSNFs, LTACHs, and ICUs to identify and investigate the dissemination of an NDM+ clone of K. pneumoniae. These findings highlight the importance of performing regional point-prevalence surveys in not only acute care hospital ICUs, but also in post-acute care LTACHs and vSNFs, to identify potential multi-drug resistant organism threats as early as possible. Furthermore, our study identified an emerging blaNDM-1 ST147 clone of K.
pneumoniae that could pose a threat to healthcare facilities worldwide if this plasmid/strain combination continues to spread. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.   . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.  Table 1).

Assembly generation and annotation
For all study isolates, the reads were assembled using assemblage v1.  CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint quality of genomes was then assessed by the number of coding sequences and any genomes containing fewer than 4000 or greater than 8000 were excluded from the analysis.

Sequencing quality control and ST147 single nucleotide variant identification
The quality of sequencing reads was assessed with FastQC v0.11.9 [11], and Trimmomatic v0.39 [12] was used for trimming adapter sequences and low-quality bases. Single nucleotide variants (SNVs) were identified by (i) mapping filtered reads to the Genomes were removed that showed evidence of contamination as inferred by poor mapping to the reference genome or excessive numbers of uniquely filtered variants. This whole-genome masked variant alignment was used to reconstruct a maximum likelihood phylogeny with IQ-TREE v1.6.12 [41] using the general time reversible model GTR+G and ultrafast bootstrap with 1000 replicates (-bb 1000) [42].

Identifying NDM-containing plasmids
To identify the presence of beta-lactamase containing plasmids, we searched for previously sequenced, closed plasmids which were known to harbor genes of interest. First, CD-HIT v4.7 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint was used to identify orthologous genes [47]. NDM or KPC containing contigs from complete genomes were identified by the presence of a gene annotated as either of these betalactamases in the gff file, and all genes contained on these contigs were identified. The plasmid sharing the maximum number of genes with the NDM+ isolates was identified. Presence of the plasmid was determined as sharing 80% of the shared plasmid genes. A core gene alignment was created with cognac [46]. The pairwise substitution distance matrix was calculated from the alignment, which was then used to create a neighbor joining tree using ape v5. 3 [29] and midpoint rooted using phytools v0.6-99 [48]. The rows of the gene presence absence matrix were then sorted by the tip labels on the neighbor joining tree.
To confirm the presence of NDM containing plasmid, which was prevalent in the ST147 isolates, we utilized the raw sequencing data to calculate coverage across the plasmid sequence. We selected the complete genome with the best match for the NDM+ plasmid by shared gene content across the plasmid sequence as the reference sequence. The sequencing reads were aligned to the plasmid sequence using bwa mem v0.7.15 [21]. The sam files were parsed to identify reads which aligned to the NDM plasmid, and coverage was then calculated for 500 bp bins across the plasmid sequence. Antibiotic resistance genes were identified by querying the CARD database [49] with BLAST [50].

Inferring ancestral dates
We used the R package BactDating v1.0.12 [43] to infer ancestral dates on the NDM+ clade of the ST147 maximum likelihood phylogeny, as well as the phylogeny including all study ST147 isolates. A root-to-tip analysis was first performed and indicated that there is temporal signal in the data. As we do not have access to actual isolate dates, a date range for each tree tip was determined based on the year of isolation as well as the survey and facility type that the isolate came from. The maximum likelihood phylogeny and the date ranges were used to infer . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
All models had similar estimated root dates for the NDM+ clade (range: 2013.96 to 2016.24), and slightly more variation for the ST147 isolates (1996 to 2014), with varying levels of uncertainty. The best fit models were the additive relaxed clock (ARC) family of models [44], with the best fit being mixed continuous ARC (cARC) for the NDM+ clade and ARC for all ST147 study isolates. The cARC model was chosen for all subsequent analyses with the NDM+ clade, and the ARC model was chosen for all subsequent analyses with the ST147 study isolates. The effective sample size for all parameters in the best models were above 400.

Identifying phylogenetic clustering
We subset the NDM+ clade of the ST147 maximum likelihood phylogeny by survey. For each survey-specific tree, we used a method modified from [8] to identify the largest subclades that contain ≥90% of isolates from the same facility.

Calculating pairwise SNV distances
Pairwise SNV distances were calculated for the gubbins recombination-filtered ST147 variant alignment using the dist.dna function in the R package ape v5.4-1 [29], with the "N" model (count), and pairwise deletion set to true to include both core and non-core genetic variation.
We chose a pairwise SNV distance threshold based on the BactDating estimated mutation rate of ~12 SNVs per year ( Table S2). As the surveys were conducted 6 months apart, we considered closely related isolate pairs to be those with a SNV distance of ≤12 (i.e. 6 mutations per isolate). Wilcox tests were used to calculate significance, and all p-values were Bonferronicorrected.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

1e-01
Pairwise SNV distance Patient flow . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted March 25, 2021. ; https://doi.org/10.1101/2021.03.16.21253722 doi: medRxiv preprint