2. Abstract
Increasing resistance to third-generation cephalosporins (3GCs) threatens public health, as these antimicrobials are prescribed as empirical therapies for systemic infections caused by Gram-negative bacteria. Resistance to 3GCs in urinary tract infections (UTIs) and bacteraemia is associated with the globally disseminated, multidrug-resistant, uropathogenic Escherichia coli sequence type (ST)131. This study combines the epidemiology of E. coli blood culture surveillance with whole-genome sequencing (WGS) to investigate ST131 associated with bacteraemia in Wales between 2013 and 2014. This population-based prospective genomic analysis investigated temporal, geographic, and genomic risk factors. To identify spatial clusters and lineage diversity, we contextualised 142 genomes collected from twenty hospitals against a global ST131 population (n=181). All three major ST131 clades are represented across Wales, with clade C/H30 predominant (n=102/142, 71.8%). Consistent with global findings, Welsh strains of clade C/H30 contain β-lactamase genes from the blaCTX-M-1 group (n=65/102, 63.7%), which confer resistance to 3GCs. In Wales, the majority of clade C/H30 strains belonged to sub-clade C2/H30Rx (n=88/151, 58.3%), whereas sub-clade C1/H30R strains were less common (n=14/67, 20.9%). A sub-lineage unique to Wales was identified within the C2/H30Rx sub-clade (named GB-WLS.C2/H30Rx) and is defined by six non-recombinogenic single-nucleotide polymorphisms (SNPs), including missense variants in febE (ferric enterobactin transport protein) and fryC (fructose-like permease IIC component), and by the loss of the capsular biosynthesis genes encoding the K5 antigen. Bayesian analysis predicted that GB-WLS.C2/H30Rx diverged from a common ancestor (CA) most closely related to a Canadian strain between 1998 and 1999.
Further, our analysis suggests that a descendant of GB-WLS.C2/H30Rx was introduced to North Wales circa 2002 and has since spread and persisted in the region, causing a cluster of cases (CA emerged circa 2009) with a maximum pair-wise distance of 30 non-recombinogenic SNPs. This limited genomic diversity likely reflects local transmission within the community in North Wales. This investigation emphasises the value of genomic epidemiology, which allows detection of suspected transmission clusters and of the spread of genetically similar or identical strains in local areas. These analyses will enable targeted and timely public health interventions.
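The "maximum pair-wise distance" quoted above can be illustrated with a minimal sketch. It assumes a recombination-filtered alignment in which every isolate has the same length; the isolate names and toy sequences are hypothetical, and this is not the pipeline used in the study (that is described in the materials and methods).

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned sites where both sequences carry an unambiguous base and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")  # gaps and ambiguous calls (e.g. N) are ignored
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def max_pairwise_distance(alignment: dict[str, str]) -> int:
    """Return the largest SNP distance over all pairs of isolates."""
    return max(
        (snp_distance(alignment[x], alignment[y]) for x, y in combinations(alignment, 2)),
        default=0,
    )


# Hypothetical toy alignment; real input would be the core-genome SNP alignment.
toy = {
    "isolate_1": "ACGTACGT",
    "isolate_2": "ACGTACGA",
    "isolate_3": "ACCTANGA",
}
print(max_pairwise_distance(toy))
```

A cluster whose maximum value stays small relative to the wider sub-clade is what the text describes as limited genomic diversity.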
Impact statement
Uropathogenic Escherichia coli (UPEC) is a leading cause of bacteraemia, resulting in substantial mortality and morbidity, with rates of E. coli bacteraemia (ECB) becoming a particular concern in Wales (1). Previous genomic and multilocus sequence typing (MLST) studies have shown that ECB cases are disproportionately caused by specific groups [sequence types (ST)] of related E. coli. Previous work reports ST131 as a globally disseminated lineage associated with bacteraemia and antimicrobial resistance (AMR). Despite widespread study of ECB, the temporal and geographic patterns of key ECB clones remain an important area of study. Moreover, by gaining a detailed understanding of the population structure of key ECB clones, it should be possible to develop and improve public health measures to reduce the risk of ECB and to combat the rise of AMR. Using whole-genome sequencing, we describe the temporal and spatial relationship of a collection of E. coli ST131 bacteraemia cases sampled across Wales. High-resolution analyses of genetic variants identified a local (North Wales) cluster of strains within the highly antimicrobial-resistant sub-clade C2/H30Rx, which are characterised by resistance to nitrofurantoin and the loss of the K5 capsule. Notably, AMR stewardship guidelines in Wales recently changed to include nitrofurantoin as a first-line treatment for uncomplicated UTIs. This local cluster likely represents environmentally mediated community transmission of strains descended from a common ancestor that existed circa 2009, highlighting the need for near-real-time national genomic surveillance to track and understand the evolution of AMR in communities.
Data summary
The study sequences are available in the National Center for Biotechnology Information (NCBI) under BioProject accession number PRJNA729115. Raw Illumina sequence read data have been deposited to the NCBI sequence read archive [SRA (https://www.ncbi.nlm.nih.gov/sra)] under the accession numbers SRR14519411 to SRR14519567. A complete list of SRA accession numbers is available in Table S1 (available in the online version of this article). The high-quality draft assemblies have been deposited to GenBank under the accession numbers JAHBGJ000000000 to JAHBMG000000000, and JAHBRR000000000 to JAHBRT000000000. The programs used to analyse raw sequence reads for polymorphism discovery and whole-genome sequencing based phylogenetic reconstruction are available as described in the materials and methods. The authors confirm all supporting data, code, and protocols have been provided within the article or through supplementary data files.
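For readers who want to script retrieval of the raw reads, the SRA run range quoted above can be expanded into individual accession numbers. The sketch below assumes the range is contiguous, which the text does not state explicitly; Table S1 remains the authoritative list. The helper name `accession_range` is hypothetical.

```python
def accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range that shares one alphabetic prefix."""
    prefix = first.rstrip("0123456789")
    if prefix != last.rstrip("0123456789"):
        raise ValueError("accessions must share the same prefix")
    width = len(first) - len(prefix)  # keep any zero padding in the numeric part
    start, end = int(first[len(prefix):]), int(last[len(prefix):])
    if end < start:
        raise ValueError("last accession must not precede the first")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


# Assumption: every run between the two endpoints belongs to this study.
runs = accession_range("SRR14519411", "SRR14519567")
print(len(runs), runs[0], runs[-1])
```

The resulting list could be passed to whichever download tool is preferred, after checking it against Table S1.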
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work received funding for whole-genome sequencing from Public Health Wales NHS Trust (United Kingdom) and a Wellcome Institutional Strategic Support Fund (ISSF) award to Cardiff University (United Kingdom).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This work was undertaken on stored bacterial cultures, and no additional clinical samples were collected from any persons to facilitate this study. Patient anonymity was maintained by pseudonymising all data that left Public Health Wales.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
1.6 Abbreviations
- 3GCs: third-generation cephalosporins
- AMR: antimicrobial resistance
- BWA: Burrows–Wheeler Aligner
- CA: common ancestor
- catB: chloramphenicol-related O-acetyltransferase
- CDS: coding sequence
- CI: confidence interval
- contigs: contiguous sequences
- ECB: Escherichia coli bacteraemia
- ESBLs: extended-spectrum β-lactamases
- febE: ferric enterobactin transport protein
- fryC: fructose-like permease IIC component
- fumC: fumarate hydratase class II
- GATK: Genome Analysis Toolkit
- HPD: highest posterior density
- INDELS: insertions and deletions
- IQR: interquartile range
- IS: insertion sequences
- MCC: maximum clade credibility
- MCMC: Markov chain Monte Carlo
- mdh: malate dehydrogenase
- ML: maximum likelihood
- MLST: multilocus sequence typing
- NCBI: National Center for Biotechnology Information
- NHS: National Health Service
- NICE: National Institute for Clinical Excellence
- PHW: Public Health Wales
- RefSeq: Reference Sequence
- SNPs: single-nucleotide polymorphisms
- SACU: Specialist Antimicrobial Chemotherapy Unit
- SRA: Sequence Read Archive
- ST: sequence type
- syn: synonymous
- UK: United Kingdom
- UPEC: uropathogenic Escherichia coli
- USA: United States of America
- UTIs: urinary tract infections
- WGS: whole-genome sequencing
- XAT: xenobiotic acyltransferase