Fitness barriers to spread of colistin-resistant Klebsiella pneumoniae overcome by establishing niche in patient population with elevated colistin use

To combat antibiotic resistance, it is critical to improve our understanding of how new resistant strains emerge and spread. An antibiotic resistance threat of critical priority is the epidemic ST258 strain of carbapenem-resistant Klebsiella pneumoniae (CRKP). Here, we studied the spread of resistance among ST258 to an antibiotic of last resort, colistin, by tracking its evolution across 21 U.S. long-term acute care hospitals over the course of a year. Phylogenetic analysis suggested that a significant cost was associated with colistin resistance in most cases, as resistance emergence was common but resistance variants were rarely transmitted. The high cost of resistance was further supported by the observation that several of the resistance variants that were transmitted had acquired secondary variants that reverted the strain to colistin susceptibility. The exceptions to the general pattern of instability associated with colistin resistance were two large clusters of resistant strains in one ST258 clade II sublineage (clade IIB) present across 11 of the 12 sampled Southern California hospitals. Quantification of transmission fitness in the healthcare environment using time-scaled haplotypic density indicated that while resistant isolates from other clades were less fit than their susceptible counterparts, clade IIB resistant isolates were more fit. Overlaying patient clinical data suggested that the increased fitness of colistin-resistant clade IIB isolates is in part driven by a lineage defining variant that increased clade IIB's association with patient subpopulations who were more likely to be treated with colistin. These results show that a favorable genetic background and sustained selective pressure led to the emergence and spread of a colistin-resistant ST258 sublineage across a regional healthcare network. 
More broadly, these findings highlight the utility of integrating pathogen genomic and corresponding clinical data from regional healthcare networks to detect and understand the origin and dissemination of antibiotic resistance threats.


Introduction
Multidrug resistant organisms (MDROs) pose a significant threat to public health due to uncontrolled transmission and dwindling treatment options. 1 Of greatest concern are epidemic MDRO lineages that have become adapted to healthcare settings and continually gain resistance due to selective antibiotic pressure. 2,3 As resistance evolution continues to outpace our ability to develop new therapies, 3 there is a critical need to improve our understanding of the forces driving the emergence and spread of resistance so that we can maintain the efficacy of last-line antibiotics.
When de novo evolution of antibiotic resistance occurs in an individual, there is often little to no subsequent transmission to others due to the fitness cost of maintaining resistance in the absence of antibiotic exposure. 4,5 However, in some cases this barrier is overcome, as evidenced by the proliferation of epidemic resistance clones. 5,6 This may happen when the resistance variant or gene has an inherently low fitness cost and is thus maintained in the population even in the absence of antibiotic exposure. 7 Alternatively, the fitness cost of a given resistance element may depend on the genetic background in which it emerges, leading to dissemination of these elements only in certain lineages. Indeed, experimental evidence has shown that the genetic background of a strain interacts with resistance determinants to influence the fitness of the resistant strain, 8,9 highlighting the importance of historical evolutionary events in determining the potential ability of a resistant strain to spread. While the preferential spread of resistance elements in specific strains lends support to the importance of genetic background, in most instances it is unclear what the underlying epistatic interactions are that lead to these different fitness effects. 5 Deciphering how the genetic background of a clinical strain influences transmissibility can provide insight into how fit strain/resistance combinations evolve and disseminate in the healthcare setting.
Carbapenem-resistant Klebsiella pneumoniae (CRKP) is a global threat as it is resistant to nearly all available antibiotics and is associated with high mortality rates. 1 One of the few remaining treatment options for CRKP is the antibiotic colistin. 10 Concerningly, the viability of colistin as a treatment option moving forward is threatened by the frequent emergence of colistin resistance during treatment with this drug. 11,12 However, despite the accessibility of these resistance variants to CRKP, their spread to other patients is rare, with most cases even within a single healthcare setting stemming from parallel evolution. 12 Here, we document the emergence and spread of multiple colistin resistance variants in CRKP sequence type (ST) 258 across a regional healthcare network. Through integration of genomic and clinical data, we show how resistance spread was enhanced in genetic backgrounds that mitigate the fitness cost of resistance which, when combined with sustained selective pressure, enabled regional dissemination of colistin-resistant lineages.

medRxiv preprint doi: https://doi.org/10.1101/2021.06.11.21258758; this version posted June 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Results
Most resistant isolates contain variants in known resistance genes. Antimicrobial susceptibility testing revealed that 118/337 (35%) CRKP ST258 isolates were resistant to colistin, and that this resistance was found consistently across long-term acute care hospitals (LTACHs) over the yearlong study (Fig S1). To identify likely resistance variants, we first searched all isolates for mcr-containing mobile genetic elements and variants in canonical chromosomal genes known to confer resistance (pmrA, pmrB, phoP, phoQ, crrA, crrB, and mgrB). As expected due to the timeframe and locations in which the isolates were collected (July 2014 to August 2015 in the U.S.), 13 we did not find mcr genes in any of the isolates. However, the majority of colistin resistance we observed (103/118, 87%) could be explained by known and putative resistance variants in canonical resistance genes (Fig S2; Table S1; see methods for details on identification of resistance variants).
As not all resistance in our dataset could be explained by variants in known resistance genes, we next performed a genome-wide association study (GWAS) on the isolates with unknown resistance determinants to identify putative non-canonical resistance-conferring variants. We identified two additional putative resistance genes in this way: qseC and phosphotransferase system sugar transporter subunit IIB (Fig S2; Table S1). These variants explain 6/15 (40%) of the unknown colistin resistance in our dataset. Notably, qseC, which explained resistance in 4 isolates, has been shown to confer colistin resistance in an experimental evolution study using clinical isolates. 14 We were unable to identify putative colistin resistance-conferring variants for the other nine isolates. For all subsequent analyses, we defined resistance genes as the set of canonical and GWAS-identified resistance genes.
Epistatic interactions appear to influence resistance in isolates with more than one variant in resistance genes. Next, we investigated the relationship between the number of variants in resistance genes and the level of colistin resistance. While the majority of isolates with one variant in a resistance gene were resistant, the majority of isolates with two or more variants in resistance genes were susceptible (Fig 1A). To better understand how the presence of multiple variants in resistance genes influenced the extent of resistance, we explored the relationship between the number of variants in resistance genes and minimum inhibitory concentration (MIC). We found that resistant isolates with two or more variants in resistance genes tended to have higher MICs on average than those with one or fewer (median MIC of 24 vs. 8, p = 0.001; Fig 1B). In contrast to resistant isolates where having two or more variants in resistance genes was associated with increased MIC, susceptible isolates with two or more variants in resistance genes were more susceptible on average than those with one or fewer (median MIC of 1 vs. 0.5, p = 0.0002). Furthermore, in 37/45 (82%) cases where susceptible isolates harbored multiple variants in resistance genes, at least one of the variants had been previously shown to confer resistance. These findings suggest that, while variants in resistance genes often increase MIC as expected, there also exist epistatic interactions among variants in these genes that influence their impact on resistance phenotypes.
Colistin resistance exhibits patterns of de novo evolution, onward dissemination, and reversion to susceptibility. To better understand the origin and fate of colistin resistance variants, we investigated the phylogenetic relationship among colistin-resistant and susceptible isolates (Fig 2; Fig S3). Visualization of resistance on the phylogeny revealed a striking dichotomy between a clade II sublineage (defined as clade IIB) and the other ST258 sublineages (clades I and IIA). Clade IIB contained large clusters of colistin-resistant isolates, while the other two sublineages largely exhibited sporadic parallel evolution of resistance. Therefore, while 28/118 (24%) isolates acquired colistin resistance from putative de novo resistance evolution events, two clonal expansions in clade IIB across 11/12 LTACHs in Southern California (10 in the Los Angeles area and 1 in San Diego; Fig S4) accounted for over half (69/118, 58%) of resistance in this isolate collection. These two clonal expansions were associated with a nonsense mutation in mgrB (Gln30*) and a missense mutation in phoQ (Thr244Asn), respectively. Notably, these same types of variants occurred in the sporadic colistin-resistant isolates (Fig 2; Fig S5), suggesting that these specific resistance variants were likely not inherently more fit. Within these clonal expansions, many of the isolates appeared to regain susceptibility to colistin via the accumulation of additional variants in resistance genes. These putative reversion mutations usually occurred within the same gene or in a downstream gene of the molecular pathway to resistance (Fig 2; Fig S6), and were sometimes followed by re-acquisition of resistance through a different mechanism (Fig 2; Fig S5).
Resistant strains in clade IIB are more fit than their susceptible non-revertant counterparts. Next, we were interested in understanding the relative transmission fitness of susceptible, resistant, and revertant isolates in the different ST258 sublineages. In particular, we hypothesized that the resistance variants in clade IIB had a decreased associated fitness cost, and thus were able to disseminate more widely. To estimate transmission fitness in the context of the healthcare environment we applied an analytic approach to quantify the epidemic success of isolates in our patient population. In particular, we took advantage of the comprehensive nature of our sampling and used time-scaled haplotypic density (THD) to quantify the extent of relative spread of each genotype. 15 Applying the THD metric, we observed significant differences in fitness effects of resistance variants in clade IIB versus the other clades (Fig 3).
In clades I and IIA we observed evidence of a significant fitness cost for resistance variants, as resistant clusters and susceptible revertants were less fit than susceptible non-revertants from those clades (R cluster: p = 0.01; S revertant: p = 0.0003). We did not observe a fitness cost for resistant singletons that did not transmit to others (p = 0.18), which may be because these strains very recently evolved resistance and are thus still closely related to their susceptible ancestor strains. In contrast, resistance variants in clade IIB were associated with a significant fitness benefit, with resistant clusters and susceptible revertants from clade IIB being more fit than susceptible non-revertants from that clade (R cluster: p = 0.03; S revertant: p = 0.02). Taken together, these findings suggest that resistance variants in isolates from clade IIB conferred a fitness advantage, which may account for the independent emergence and spread of two different resistance alleles in this clade. In contrast, the accumulation of variants in resistance genes in the other clades appears to be associated with a transmission fitness cost, which is consistent with their limited clonal spread.
Clade IIB isolates have more colistin exposure compared to isolates from other clades. Lastly, we were interested in investigating whether colistin use in study facilities influenced the emergence and spread of stable resistance in clade IIB. Supporting the role of colistin use in the spread of colistin resistance, we found a positive association between having a clade IIB isolate and both prior exposure to colistin (15/120, 12.5% in clade IIB vs. 8/217, 3.7% in other clades; Fisher's exact p = 0.003) and treatment with colistin (41/120, 34.2% in clade IIB vs. 39/217, 18.0% in other clades; Fisher's exact p = 0.001). We hypothesize that this association between colistin use and clade IIB is mediated by two factors: our previously reported observation that clade IIB is enriched in respiratory isolates, 16 and the frequent use of colistin to treat respiratory tract infections during the study period (Fig 4A), due to limited available antibiotic options at that time. This hypothesis is supported by a slight, non-significant enrichment in colistin treatment for clade IIB respiratory isolates compared to non-respiratory isolates (31/83, 37.3% for respiratory vs. 10/37, 27.0% for non-respiratory; p = 0.3), and a significant enrichment across all clades (53/183, 29% for respiratory vs. 27/154, 17.5% for non-respiratory; p = 0.01). Furthermore, in addition to potentially influencing the spread of colistin resistance, the extensive use of colistin to treat infections in a setting of clonally disseminating colistin resistance led to 36/118 (30.5%) patients with colistin-resistant isolates being treated with this antibiotic, even though the treatment was likely ineffective (Fig 4B).
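The Fisher's exact comparisons above can be reproduced from the quoted counts alone. The following is a minimal sketch, assuming a two-sided test that sums the probabilities of all tables no more likely than the observed one; the function name `fisher_exact_two_sided` is introduced here for illustration and is not taken from the study's analysis code.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Counts quoted in the text, as (with colistin, without colistin) per group:
# clade IIB first, other clades second.
prior_exposure_p = fisher_exact_two_sided(15, 120 - 15, 8, 217 - 8)
treatment_p = fisher_exact_two_sided(41, 120 - 41, 39, 217 - 39)
print(f"prior exposure: p = {prior_exposure_p:.4f}")
print(f"treatment:      p = {treatment_p:.4f}")
```

Only the standard library is used, so the sketch does not depend on any particular statistics package; small differences from the reported p-values may arise from the choice of two-sided definition.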
In putting our past and present observations together, we noted that a defining feature of clade IIB was disruption of the putative O-antigen glycosyltransferase kfoC. 16 Thus, we hypothesize that disruption of kfoC increased the affinity of clade IIB for the respiratory tract, increased exposure to colistin, and decreased the transmission fitness cost of colistin resistance, allowing colistin-resistant clade IIB strains to spread across a regional healthcare network (Fig 5).

Discussion
The means by which colistin resistance evolves and disseminates in the healthcare setting has important implications for antibiotic stewardship and infection prevention strategies. Here, we used genomic and clinical data to track the origin and fate of colistin resistance variants across a comprehensive regional sample of clinical CRKP ST258 isolates. Our analysis identified a sublineage whose genetic background appears to decrease the fitness cost of resistance, thus allowing colistin-resistant strains to spread among regional healthcare facilities.
Examination of colistin resistance variants within the comprehensive and longitudinal context of circulating CRKP strains allowed for inferences into their functional impact and clinical significance. We discovered that resistant isolates with multiple variants in resistance genes have increased resistance, but susceptible isolates with multiple variants in resistance genes have decreased resistance. Additionally, we identified several instances where a known resistance variant was present in susceptible isolates, likely due to a suppressor variant in either the same resistance gene or a downstream resistance gene. This finding has two important implications. First, known resistance variants occurring in both resistant and susceptible isolates complicates the computational identification of novel resistance determinants using GWAS. Possible solutions include increasing sample size or removing isolates with known resistance variants. Second, the presence of a colistin resistance variant cannot necessarily be equated with in vitro phenotypic resistance. Clinically, this suggests that testing for resistance variants may not be a substitute for testing for phenotypic resistance, as harboring a resistance variant may not preclude the effectiveness of colistin treatment. In addition to being phenotypically susceptible, revertants that harbor resistance variants may in general have more difficulty becoming resistant once more, which may lend credence to treating these patients with colistin. 17
Of particular genomic and clinical interest are two independent instances of emergence and spread of colistin resistance variants in clade IIB, in stark contrast to the sporadic emergence of resistance variants in the other clades. Notably, the types of resistance variants present in clade IIB also occurred in other clades, implying that the nature of the variant itself is not what allowed the clade IIB variants to spread. Instead, our analysis of fitness using THD suggests that these colistin resistance variants were able to spread in this clade because the genetic background of the clade reduced the fitness cost of the variants. We previously showed that a defining feature of clade IIB is that the majority of isolates contain a disruption in the lipopolysaccharide (LPS) O locus gene kfoC, 16 a putative glycosyltransferase. Given the key role that LPS plays in both colistin-mediated killing and colistin resistance, 11 it is possible that inactivation of kfoC facilitates the spread of colistin resistance by altering the outer membrane in a way that reduces the fitness cost of resistance-conferring LPS modifications. Furthermore, we hypothesize that kfoC may indirectly mitigate the fitness cost of colistin resistance by increasing the association between clade IIB and patient populations in which colistin use is more frequent. In particular, clade IIB isolates are significantly enriched in both association with the respiratory tract and colistin exposure. Thus, increased colistin use in clade IIB may have both set the stage for the emergence of multiple instances of stable resistance in this clade and provided a continued selective pressure to maintain resistance. These findings highlight the importance of continued real-time monitoring of resistance to improve antibiotic stewardship and decrease the likely ineffective use of colistin to treat colistin-resistant CRKP strains.
One strength of our study is that we have comprehensive sampling of clinical CRKP isolates from LTACHs over the course of a year. This not only allowed us to capture putative transmission events of colistin resistance, but also allowed us to estimate the fitness of different ST258 sublineages. Additionally, whole-genome sequencing of study isolates permitted us to interrogate not only variants in known resistance genes, but also investigate putative resistance-conferring variants in other genes using GWAS. Furthermore, our access to clinical information about colistin use allowed us to identify a relationship between this clinical variable and different ST258 sublineages, providing us with insight into the selective pressures of colistin resistance evolution and spread in patients.
Our study also has several limitations. First, we do not have rectal surveillance cultures, which limits our ability to fully capture the population of colistin-resistant and susceptible CRKP in the LTACHs studied, and thus to fully define spread. While this could bias our clinical collection if certain strains are more likely to present clinically and be cultured at extraintestinal sites, the strains we have included are the most clinically relevant since they were identified in clinically derived samples. Also, we cannot be certain that our classification of isolates into different categories of colistin susceptibility and resistance is entirely accurate. For instance, putative de novo evolution of resistance in resistant singletons may not have occurred in the patient we sampled. However, our comprehensive sampling of clinical isolates from the facilities in the study provides some confidence that resistance in resistant singleton isolates did in fact evolve in that patient. Another limitation of our study is that we do not have information about colistin exposure prior to the patient entering the healthcare facility, and we do not have information about the dose or duration of colistin use. Even with this limitation, we were able to identify significant associations between colistin use and different ST258 sublineages.
In conclusion, we observed distinct dynamics of colistin resistance evolution in different CRKP ST258 sublineages that appear to be due to differences in the fitness cost of colistin resistance dependent on the genetic background of the strain. In particular, we identified an emerging ST258 sublineage that is more fit and more amenable to maintaining colistin resistance than other ST258 strains. This is of particular concern due to the already limited treatment options for CRKP, and therefore merits further surveillance to determine the extent of spread. Furthermore, our findings highlight the value of using high-resolution genomic analysis to inform antibiotic stewardship and the importance of surveillance and monitoring of MDROs for continued evolution and adaptation to the healthcare environment.

Study isolates and metadata
We used whole-genome sequences of clinical CRKP isolates from a prospective longitudinal study in 21 U.S. LTACHs over the course of a year (BioProject accession no. PRJNA415194). 18 Isolates and metadata were collected, and isolates were sequenced, as described in Han et al. 18 Prior patient use of colistin in the facility from which the isolate was taken (up to 30 days before the isolate collection date), as well as colistin use for empiric and definitive treatment, was extracted from the electronic health record. We only used information on colistin, not polymyxin B, for all analyses.
[...] resistance and therefore are technically considered non-resistant using this interpretation; however, the European Committee on Antimicrobial Susceptibility Testing defines isolates with an MIC ≤ 2 as susceptible. 20 For clarity, we chose to use the term susceptible for isolates with an MIC ≤ 2.
Isolate selection
Multi-locus sequence types were called using ARIBA. 21 Over 90% of CRKP isolates collected over the course of the study belonged to ST258; therefore, we focused all of our analyses on this sequence type. We ordered the isolates by resistance status followed by collection date. For all analyses except GWAS, only the first ST258 isolate from each patient in this ordered list was used, thus prioritizing resistant isolates over susceptible isolates. Sample sizes were too small to glean insights into within-host colistin resistance evolution.
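The selection rule above (order by resistance status, then collection date, and keep one isolate per patient) can be sketched as follows. This is an illustration only: the `Isolate` record, its field names, and the example data are hypothetical, and treating any MIC above 2 as resistant is an assumption based on the susceptibility definition used in this study.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Isolate:
    isolate_id: str
    patient_id: str
    collected: date
    mic: float  # colistin MIC

    @property
    def resistant(self) -> bool:
        # Assumption: MIC <= 2 is susceptible, so anything above is resistant.
        return self.mic > 2


def first_isolate_per_patient(isolates: list[Isolate]) -> list[Isolate]:
    """Order by resistance status (resistant first), then collection date,
    and keep the first isolate encountered for each patient."""
    ordered = sorted(isolates, key=lambda i: (not i.resistant, i.collected))
    chosen: dict[str, Isolate] = {}
    for iso in ordered:
        chosen.setdefault(iso.patient_id, iso)
    return list(chosen.values())


# Hypothetical example: patient P1 has an early susceptible isolate and a
# later resistant one; patient P2 has two susceptible isolates.
isolates = [
    Isolate("A1", "P1", date(2014, 8, 1), 0.5),
    Isolate("A2", "P1", date(2014, 9, 15), 8.0),
    Isolate("B1", "P2", date(2014, 10, 3), 1.0),
    Isolate("B2", "P2", date(2014, 11, 20), 2.0),
]
selected = first_isolate_per_patient(isolates)
print([i.isolate_id for i in selected])
```

Because resistant isolates sort first, a patient with any resistant isolate is represented by their earliest resistant isolate, even if a susceptible isolate was collected earlier.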
Single nucleotide variant calling, indel calling, and phylogenetic tree reconstruction
Variant calling was performed with a customized variant calling pipeline (https://github.com/Snitkin-Lab-Umich/variant_calling_pipeline/) as follows. The quality of sequencing reads was assessed with FastQC v0.11.9, 22 and Trimmomatic v0.39 23 was used for trimming adapter sequences and low-quality bases. Single nucleotide variants (SNVs) were identified by (i) mapping filtered reads to the ST258 KPNIH1 reference genome (BioProject accession no. PRJNA73191) using the Burrows-Wheeler short-read aligner (bwa v0.7.17), 24 (ii) discarding polymerase chain reaction duplicates with Picard v2.24.1, 25 and (iii) calling variants with SAMtools and bcftools v1.9. 26 Variants were filtered from raw results using VariantFiltration from GATK v4.1.9.0 27 (QUAL >100; MQ >50; >=10 reads supporting variant; and FQ < 0.025). Indels were called using the GATK HaplotypeCaller 28 with the following filters: root mean square mapping quality (MQ) > 50.0, GATK QualByDepth (QD) > 2.0, read depth (DP) > 9.0, and allele frequency (AF) > 0.9. In addition, a custom Python script was used to filter out (mask) variants in the whole-genome alignment that were: (i) SNVs fewer than 5 base pairs (bp) from an indel, (ii) in a recombinant region identified by Gubbins v2.3.4, 29 (iii) in a phage region identified by the Phaster web tool, 30 or (iv) in tandem repeats of length greater than 20 bp as determined using the exact-tandem program in MUMmer v3.23. 31 This whole-genome masked variant alignment was used to reconstruct a maximum likelihood phylogeny with IQ-TREE v1.6.12 32 using the general time reversible model GTR+G and ultrafast bootstrap with 1000 replicates (-bb 1000). 33
Insertion calling
Large insertions relative to the reference genome were called using panISa v0.1.4. 34
Variant preprocessing
We preprocessed variants to include multiallelic sites and used the major allele method for variant binarization, as described in Saund et al. 35 We used SnpEff to predict the functional impact of single nucleotide variants and indels (high, moderate, low, modifier). 36 Additionally, we considered all insertions in or upstream of genes as high impact, and those downstream of genes as moderate impact. Only high and moderate impact variants were included in downstream analyses.
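The SNV and indel hard filters quoted in the variant calling section can be written as simple predicates. The sketch below only restates those thresholds; it is not the study's pipeline, which applies them through GATK, and the record types and field names (`SnvCall`, `IndelCall`, `supporting_reads`, and so on) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnvCall:
    qual: float            # variant quality (QUAL)
    mq: float              # root mean square mapping quality (MQ)
    supporting_reads: int  # reads supporting the variant allele
    fq: float              # FQ value reported by the variant caller


@dataclass(frozen=True)
class IndelCall:
    mq: float  # root mean square mapping quality (MQ)
    qd: float  # QualByDepth (QD)
    dp: float  # read depth (DP)
    af: float  # allele frequency (AF)


def snv_passes(v: SnvCall) -> bool:
    """QUAL > 100, MQ > 50, >= 10 supporting reads, and FQ < 0.025."""
    return v.qual > 100 and v.mq > 50 and v.supporting_reads >= 10 and v.fq < 0.025


def indel_passes(v: IndelCall) -> bool:
    """MQ > 50.0, QD > 2.0, DP > 9.0, and AF > 0.9."""
    return v.mq > 50.0 and v.qd > 2.0 and v.dp > 9.0 and v.af > 0.9


# Hypothetical calls: the second SNV fails only on supporting reads.
print(snv_passes(SnvCall(qual=150, mq=60, supporting_reads=12, fq=-40.0)))
print(snv_passes(SnvCall(qual=150, mq=60, supporting_reads=9, fq=-40.0)))
print(indel_passes(IndelCall(mq=60.0, qd=25.0, dp=40.0, af=0.95)))
```

All thresholds are strict inequalities except the supporting-read count, which matches the ">=10 reads" wording in the text.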

Identification of putative resistance variants
Known resistance genes
We consider the following genes known (canonical) resistance genes: mgrB, phoP, phoQ, crrA, crrB, pmrA, and pmrB. 37 We group variants in known resistance genes into the following categories, in order of decreasing confidence:
1. Known: experimentally confirmed resistance variants, including loss-of-function variants in mgrB. 37-40
2. Known site: the variant occurs at a nucleotide site where there is an experimentally confirmed resistance variant, but that specific amino acid change has not been experimentally confirmed. 41-44
3. Putative: nonsynonymous or disruptive variants in known resistance genes where >60% of the variants are present in resistant isolates.
We define the set of all of these as resistance variants. Significant differences between the MICs of different isolates were determined using Wilcoxon tests.
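The three confidence categories can be read as an ordered set of checks, sketched below. This is one interpretation rather than the study's code: the function and argument names are hypothetical, the variant identifiers in the example are placeholders rather than real confirmed variants, and the ">60%" rule is interpreted here as the fraction of isolates carrying the variant that are resistant.

```python
from typing import Optional

KNOWN = "known"
KNOWN_SITE = "known site"
PUTATIVE = "putative"


def categorize_variant(
    gene: str,
    position: int,
    change: str,
    confirmed: set[tuple[str, int, str]],
    n_resistant_carriers: int,
    n_carriers: int,
    loss_of_function: bool = False,
    threshold: float = 0.6,
) -> Optional[str]:
    """Return the highest-confidence category that applies, or None."""
    # 1. Known: experimentally confirmed, or loss of function in mgrB.
    if (gene, position, change) in confirmed or (gene == "mgrB" and loss_of_function):
        return KNOWN
    # 2. Known site: same gene and site as a confirmed variant, different change.
    if any(g == gene and p == position for g, p, _ in confirmed):
        return KNOWN_SITE
    # 3. Putative: more than `threshold` of carriers are resistant (interpretation).
    if n_carriers > 0 and n_resistant_carriers / n_carriers > threshold:
        return PUTATIVE
    return None


# Placeholder set of "confirmed" variants, for illustration only.
confirmed = {("pmrB", 100, "placeholder_change_1")}
print(categorize_variant("pmrB", 100, "placeholder_change_1", confirmed, 3, 3))
print(categorize_variant("pmrB", 100, "placeholder_change_2", confirmed, 1, 3))
print(categorize_variant("phoQ", 50, "placeholder_change_3", confirmed, 4, 5))
print(categorize_variant("phoQ", 50, "placeholder_change_3", confirmed, 1, 5))
```

Checking the categories in order of decreasing confidence means a variant that satisfies several rules is always reported under the strongest one.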

Genome-wide association study
We performed a burden test using treeWAS v1.0, 45 a convergence-based GWAS method, as our sample size is relatively small and resistance is a highly convergent phenotype. A burden test increases power to detect resistance genes, and convergence-based methods control for the structure of the phylogeny more than mixed model methods. For GWAS, we included only isolates with no mutations in known resistance genes as we were most interested in identifying novel resistance genes, and because the entire dataset contained a number of susceptible isolates with known resistance variants that would have confounded the analysis. We used pyseer v1.3.6 46 to calculate the number of unique patterns and determine a p-value cutoff (p < 9.47e-5). We considered a gene a putative resistance gene if it was identified as significant by treeWAS, had more than one convergence event on the phylogeny, and had >60% of all of its variants in resistant isolates. Variants within these genes were considered putative resistance variants using the same definition as for putative resistance variants in known resistance genes. Using these requirements, we included four putative resistance variants from two putative resistance genes based on the GWAS results.
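A p-value cutoff based on the number of unique patterns is commonly a Bonferroni-style correction: the significance level divided by the number of distinct presence/absence patterns tested. The sketch below assumes that form and a significance level of 0.05, neither of which is stated in the text; the function names and the toy burden matrix are hypothetical. Under those assumptions, the reported cutoff of 9.47e-5 would correspond to roughly 528 unique patterns.

```python
def count_unique_patterns(burden: dict[str, tuple[int, ...]]) -> int:
    """Count distinct presence/absence patterns across genes.

    `burden` maps a gene name to a tuple of 0/1 values, one per isolate,
    indicating whether the isolate carries any variant in that gene.
    """
    return len(set(burden.values()))


def pattern_threshold(n_unique_patterns: int, alpha: float = 0.05) -> float:
    """Bonferroni-style cutoff: alpha divided by the number of unique patterns."""
    if n_unique_patterns <= 0:
        raise ValueError("n_unique_patterns must be positive")
    return alpha / n_unique_patterns


# Toy example: gene_a and gene_c share a pattern, so only two tests count.
toy_burden = {
    "gene_a": (1, 0, 0, 1),
    "gene_b": (0, 1, 0, 0),
    "gene_c": (1, 0, 0, 1),
}
n_patterns = count_unique_patterns(toy_burden)
print(n_patterns, pattern_threshold(n_patterns))

# Back-calculation under the stated assumptions (alpha = 0.05).
print(f"{pattern_threshold(528):.2e}")
```

Counting patterns rather than genes avoids over-correcting when several genes have identical distributions across isolates.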

Identification of putative suppressor variants
We define putative suppressor variants as those in resistance genes where >60% of isolates with the variant are susceptible and contain a resistance variant.
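As a concrete reading of this definition, the sketch below flags a candidate variant when more than 60% of the isolates carrying it are both susceptible and carry a resistance variant. The function name and input layout are hypothetical, introduced only to illustrate the rule.

```python
def is_putative_suppressor(
    carriers: list[tuple[bool, bool]], threshold: float = 0.6
) -> bool:
    """Apply the >60% rule to one candidate variant.

    `carriers` holds one (is_susceptible, has_resistance_variant) pair for
    each isolate that carries the candidate variant.
    """
    if not carriers:
        return False
    hits = sum(1 for susceptible, has_resistance in carriers if susceptible and has_resistance)
    return hits / len(carriers) > threshold


# Hypothetical carriers: 4 of 5 are susceptible and carry a resistance variant.
mostly_revertant = [(True, True)] * 4 + [(False, True)]
# Exactly 3 of 5 (60%) does not exceed the threshold.
borderline = [(True, True)] * 3 + [(False, True), (True, False)]
print(is_putative_suppressor(mostly_revertant))
print(is_putative_suppressor(borderline))
```

The comparison is strict, so a variant found in exactly 60% qualifying isolates is not flagged.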
Determination of isolate resistance group
We assigned each isolate to one of four categories (Fig S3):
1. Resistant singletons: resistant isolates that do not cluster on the phylogeny, or that cluster but contain distinct resistance variants.
2. Resistant clusters: resistant isolates that cluster on the phylogeny and contain the same resistance variant. Additionally, if a cluster on the phylogeny has unknown resistance variants, we also defined it as a resistant cluster.
3. Susceptible revertants: susceptible isolates that contain a putative resistance-conferring variant.
4. Susceptible non-revertants: susceptible isolates that do not contain a putative resistance-conferring variant.
For the purposes of identifying resistant clusters that include susceptible revertants, we classified an isolate as "quasi-resistant" if it contained a known resistance variant and/or was resistant. Clusters of "quasi-resistant" isolates were identified using the get_clusters function in regentrans (https://github.com/Snitkin-Lab-Umich/regentrans) with a pureness of 1. Each isolate in these clusters was then denoted as a resistant cluster or susceptible revertant isolate depending on their corresponding phenotype.
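Once phenotype, variant content, and cluster membership are known, the four categories reduce to a small decision rule. The sketch below assumes cluster membership has already been determined elsewhere (the phylogenetic clustering step is not reproduced), and the names `Group` and `assign_group` are hypothetical.

```python
from enum import Enum


class Group(Enum):
    RESISTANT_SINGLETON = "resistant singleton"
    RESISTANT_CLUSTER = "resistant cluster"
    SUSCEPTIBLE_REVERTANT = "susceptible revertant"
    SUSCEPTIBLE_NON_REVERTANT = "susceptible non-revertant"


def assign_group(
    resistant: bool,
    has_resistance_variant: bool,
    in_shared_variant_cluster: bool,
) -> Group:
    """Assign one of the four categories to an isolate.

    `in_shared_variant_cluster` is True when the isolate sits in a
    phylogenetic cluster whose members share the same resistance variant
    (or have unknown resistance variants); it is computed upstream.
    """
    if resistant:
        if in_shared_variant_cluster:
            return Group.RESISTANT_CLUSTER
        return Group.RESISTANT_SINGLETON
    if has_resistance_variant:
        return Group.SUSCEPTIBLE_REVERTANT
    return Group.SUSCEPTIBLE_NON_REVERTANT


print(assign_group(True, True, True).value)
print(assign_group(True, True, False).value)
print(assign_group(False, True, True).value)
print(assign_group(False, False, False).value)
```

Phenotype is checked first, so a susceptible isolate inside a resistant cluster is labeled a revertant rather than a cluster member, consistent with the final sentence of the definition.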
Calculation of transmission fitness using time-scaled haplotypic density
Even though there is a clear fitness cost associated with colistin resistance in the healthcare setting as evidenced by our finding that the vast majority of resistant strains do not spread, previous experimental approaches to studying transmission fitness of colistin-resistant CRKP have found no fitness cost to resistance. 47,48 Therefore, we define transmission fitness in the healthcare setting as the epidemic success of the strain based on time-scaled haplotypic density (THD). 15 We calculated the THD for each isolate with the R package thd v1.0.1 15 [...]

Figure 1 (caption fragment): On average, resistant isolates with two or more variants in resistance genes have a higher MIC than resistant isolates with one or fewer variants in resistance genes, and susceptible isolates with two or more variants in resistance genes have a lower MIC than susceptible isolates with one or fewer variants in resistance genes. Significance defined as Wilcoxon p < 0.05. MIC=minimum inhibitory concentration.
medRxiv preprint doi: https://doi.org/10.1101/2021.06.11.21258758; this version posted June 13, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Figure S2: The majority of colistin resistance can be explained by variants in known resistance genes.
