Mba, I. E. et al. Vaccine development for bacterial pathogens: advances, challenges and prospects. Trop. Med. Int. Health 28, 275–299 (2023).
Murray, C. J. L. et al. Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet 339, 629–655 (2022).
Lin, H., Paff, M. L., Molineux, I. J. & Bull, J. J. Therapeutic application of phage capsule depolymerases against K1, K5, and K30 capsulated E. coli in mice. Front. Microbiol. 8, 2257 (2017).
Ingle, D. J. et al. In silico serotyping of E. coli from short read data identifies limited novel O-loci but extensive diversity of O:H serotype combinations within and between pathogenic lineages. Microb. Genom. 2, e000064 (2016).
Whitfield, C. Structure and assembly of Escherichia coli capsules. EcoSal Plus 3, https://doi.org/10.1128/ecosalplus.4.7.3 (2009).
Hong, Y., Cunneen, M. M. & Reeves, P. R. Two extremely divergent sequence forms of the genes that define Escherichia coli group 3 capsules suggest a very long history since their common ancestor. FEMS Microbiol. Lett. 366, fnz091 (2019).
Kunduru, B. R., Nair, S. A. & Rathinavelan, T. E. K. 3D: an E. coli K antigen 3-dimensional structure database. Nucleic Acids Res. 44, D675–D681 (2016).
Wyres, K. L. et al. Identification of Klebsiella capsule synthesis loci from whole genome data. Microb. Genom. 2, e000102 (2016).
Wyres, K. L., Cahill, S. M., Holt, K. E., Hall, R. M. & Kenyon, J. J. Identification of Acinetobacter baumannii loci for capsular polysaccharide (KL) and lipooligosaccharide outer core (OCL) synthesis in genome assemblies using curated reference databases compatible with Kaptive. Microb. Genom. 6, e000339 (2020).
Yang, S. et al. Genetic diversity of K-antigen gene clusters of Escherichia coli and their molecular typing using a suspension array. Can. J. Microbiol. 64, 231–241 (2018).
Mostowy, R. J. & Holt, K. E. Diversity-generating machines: genetics of bacterial sugar-coating. Trends Microbiol. 26, 1008–1021 (2018).
Holt, K. E., Lassalle, F., Wyres, K. L., Wick, R. & Mostowy, R. J. Diversity and evolution of surface polysaccharide synthesis loci in Enterobacteriales. ISME J. 14, 1713–1730 (2020).
Lorenz, O. et al. SeroBA(v2.0) and SeroBAnk: a robust genome-based serotyping scheme and comprehensive atlas of capsular diversity in Streptococcus pneumoniae. Microb. Genom. 11, 001483 (2025).
Blackwell, G. A. et al. Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences. PLoS Biol. 19, e3001421 (2021).
Gladstone, R. A. et al. Emergence and dissemination of antimicrobial resistance in Escherichia coli causing bloodstream infections in Norway in 2002–17: a nationwide, longitudinal, microbial population genomic study. Lancet Microbe 2, e331–e341 (2021).
Arredondo-Alonso, S. et al. Plasmid-driven strategies for clone success in Escherichia coli. Nat. Commun. 16, 2921 (2025).
Pöntinen, A. K. et al. Modulation of multi-drug resistant clone success in Escherichia coli populations: a longitudinal multi-country genomic and antibiotic usage cohort study. Lancet Microbe 5, e142–e150 (2024).
Kallonen, T. et al. Systematic longitudinal survey of invasive Escherichia coli in England demonstrates a stable population structure only transiently disturbed by the emergence of ST131. Genome Res. 27, 1437–1449 (2017).
Mäklin, T. et al. Strong pathogen competition in neonatal gut colonisation. Nat. Commun. 13, 7417 (2022).
Shao, Y. et al. Primary succession of Bifidobacteria drives pathogen resistance in neonatal microbiota assembly. Nat. Microbiol. 9, 2570–2582 (2024).
Liu, C. M. et al. Using source-associated mobile genetic elements to identify zoonotic extraintestinal E. coli infections. One Health 16, 100518 (2023).
Ludden, C. et al. One Health genomic surveillance of Escherichia coli demonstrates distinct lineages and mobile genetic elements in isolates from humans versus livestock. mBio 10, e02693–18 (2019).
Dicks, J. et al. NCTC3000: a century of bacterial strain collecting leads to a rich genomic data resource. Microb. Genom. 9, mgen000976 (2023).
Zhou, Z., Charlesworth, J. & Achtman, M. HierCC: a multi-level clustering scheme for population assignments based on core genome MLST. Bioinformatics 37, 3645–3646 (2021).
Tonkin-Hill, G. et al. Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biol. 21, 180 (2020).
Lam, M. M. C., Wick, R. R., Judd, L. M., Holt, K. E. & Wyres, K. L. Kaptive 2.0: updated capsule and lipopolysaccharide locus typing for the Klebsiella pneumoniae species complex. Microb. Genom. 8, 000800 (2022).
Lipworth, S. et al. The plasmidome associated with Gram-negative bloodstream infections: a large-scale observational study using complete plasmid assemblies. Nat. Commun. 15, 1612 (2024).
Handal, N. et al. A comparative study of the molecular characteristics of human uropathogenic Escherichia coli collected from two hospitals in Norway and France in 2019. J. Antimicrob. Chemother. 80, 1707–1715 (2025).
Thänert, R. et al. Persisting uropathogenic Escherichia coli lineages show signatures of niche-specific within-host adaptation mediated by mobile genetic elements. Cell Host Microbe 30, 1034–1047.e6 (2022).
Sands, K. et al. Characterization of antimicrobial-resistant Gram-negative bacteria that cause neonatal sepsis in seven low- and middle-income countries. Nat Microbiol. 6, 512–523 (2021).
Pearse, O. et al. High diversity of Escherichia coli causing invasive disease in neonates in Malawi poses challenges for O-antigen based vaccine approach. Commun. Med. 5, 298 (2025).
Marin, J. et al. The population genomics of increased virulence and antibiotic resistance in human commensal Escherichia coli over 30 years in France. Appl. Environ. Microbiol. 88, e0066422 (2022).
Horesh, G. et al. A comprehensive and high-quality collection of Escherichia coli genomes and their genes. Microb. Genom. 7, 000499 (2021).
Whitfield, C. & Roberts, I. S. Structure, assembly and regulation of expression of capsules in Escherichia coli. Mol. Microbiol. 31, 1307–1319 (1999).
Sande, C. & Whitfield, C. Capsules and extracellular polysaccharides in Escherichia coli and Salmonella. EcoSal Plus 9, eESP00332020 (2021).
Hong, Y., Qin, J., Forga, X. B. & Totsika, M. Extensive diversity in Escherichia coli group 3 capsules is driven by recombination and plasmid transfer from multiple species. Microbiol. Spectr. 11, e0143223 (2023).
Lo, S. W. et al. Pneumococcal lineages associated with serotype replacement and antibiotic resistance in childhood invasive pneumococcal disease in the post-PCV13 era: an international whole-genome sequencing study. Lancet Infect. Dis. 19, 759–769 (2019).
Garcia Quesada, M. et al. Serotype distribution of remaining invasive pneumococcal disease after extensive use of ten-valent and 13-valent pneumococcal conjugate vaccines (the PSERENADE project): a global surveillance analysis. Lancet Infect. Dis. 25, 445–456 (2025).
Maeda, H. & Morimoto, K. Global distribution and characteristics of pneumococcal serotypes in adults. Hum. Vaccin. Immunother. 21, 2469424 (2025).
Moxon, E. R. & Kroll, J. S. The role of bacterial polysaccharide capsules as virulence factors. Curr. Top. Microbiol. Immunol. 150, 65–85 (1990).
Buckles, E. L. et al. Role of the K2 capsule in Escherichia coli urinary tract infection and serum resistance. J. Infect. Dis. 199, 1689–1697 (2009).
Cross, A. S., Kim, K. S., Wright, D. C., Sadoff, J. C. & Gemski, P. Role of lipopolysaccharide and capsule in the serum resistance of bacteremic strains of Escherichia coli. J. Infect. Dis. 154, 497–503 (1986).
Arredondo-Alonso, S. et al. Evolutionary and functional history of the Escherichia coli K1 capsule. Nat. Commun. 14, 3294 (2023).
Merino, I. et al. Molecularly defined extraintestinal pathogenic Escherichia coli status predicts virulence in a murine sepsis model better than does virotype, individual virulence genes, or clonal subset among E. coli ST131 isolates. Virulence 11, 327–336 (2020).
Mostafavi, S. K. S., Najar-Peerayeh, S., Mobarez, A. M. & Parizi, M. K. Characterization of uropathogenic E. coli O25b-B2-ST131, O15:K52:H1, and CGA: neutrophils apoptosis, serum bactericidal assay, biofilm formation, and virulence typing. J. Cell. Physiol. 234, 18272–18282 (2019).
Suerbaum, S., Friedrich, S., Leying, H. & Opferkuch, W. Expression of capsular polysaccharide determines serum resistance in Escherichia coli K92. Zentralbl. Bakteriol. 281, 146–157 (1994).
Ojala, F. et al. Basic reproduction number varies markedly between closely related pandemic Escherichia coli clones. Nat. Commun. 16, 9490 (2025).
Løchen, A., Truscott, J. E. & Croucher, N. J. Analysing pneumococcal invasiveness using Bayesian models of pathogen progression rates. PLoS Comput. Biol. 18, e1009389 (2022).
Corander, J. et al. Frequency-dependent selection in vaccine-associated pneumococcal population dynamics. Nat. Ecol. Evol. 1, 1950–1960 (2017).
Colijn, C., Corander, J. & Croucher, N. J. Designing ecologically optimized pneumococcal vaccines using population genomics. Nat. Microbiol. 5, 473–485 (2020).
Sleeman, K. L. et al. Capsular serotype-specific attack rates and duration of carriage of Streptococcus pneumoniae in a population of children. J. Infect. Dis. 194, 682–688 (2006).
Jounio, U. et al. Genotypic and phenotypic characterization of carriage and invasive disease isolates of Neisseria meningitidis in Finland. J. Clin. Microbiol. 50, 264–273 (2012).
Yazdankhah, S. P. et al. Distribution of serogroups and genotypes among disease-associated and carried isolates of Neisseria meningitidis from the Czech Republic, Greece, and Norway. J. Clin. Microbiol. 42, 5146–5153 (2004).
Brueggemann, A. B. et al. Clonal relationships between invasive and carriage Streptococcus pneumoniae and serotype- and clone-specific differences in invasive disease potential. J. Infect. Dis. 187, 1424–1432 (2003).
Shao, Y. et al. Stunted microbiota and opportunistic pathogen colonization in caesarean-section birth. Nature 574, 117–121 (2019).
Tonkin-Hill, G. et al. Enhanced metagenomics-enabled transmission inference with TRACS. Preprint at bioRxiv https://doi.org/10.1101/2024.08.19.608527 (2024).
Weissman, S. J. et al. Clonal analysis reveals high rate of structural mutations in fimbrial adhesins of extraintestinal pathogenic Escherichia coli. Mol. Microbiol. 59, 975–988 (2006).
Johnson, T. J. et al. Comparison of extraintestinal pathogenic Escherichia coli strains from human and avian sources reveals a mixed subset representing potential zoonotic pathogens. Appl. Environ. Microbiol. 74, 7043–7050 (2008).
Frasa, H. et al. Escherichia coli in bacteremia: O-acetylated K1 strains appear to be more virulent than non-O-acetylated K1 strains. J. Clin. Microbiol. 31, 3174–3178 (1993).
Mandal, C., Schwartz-Albiez, R. & Vlasak, R. in SialoGlyco Chemistry and Biology I (eds Gerardy-Schahn, R. et al.) Ch1 (Springer, 2015).
Wassenaar, T. M. Insights from 100 years of research with probiotic E. coli. Eur. J. Microbiol. Immunol. 6, 147–161 (2016).
Qiu, L. et al. Vaccines against extraintestinal pathogenic Escherichia coli (ExPEC): progress and challenges. Gut Microbes 16, 2359691 (2024).
Weerdenburg, E. et al. Global distribution of O serotypes and antibiotic resistance in extraintestinal pathogenic Escherichia coli collected from the blood of patients with bacteremia across multiple surveillance studies. Clin. Infect. Dis. 76, e1236–e1243 (2023).
Vann, W. F. et al. Serological, chemical, and structural analyses of the Escherichia coli cross-reactive capsular polysaccharides K13, K20, and K23. Infect. Immun. 39, 623–629 (1983).
Jann, B., Kochanowski, H. & Jann, K. Structure of the capsular K96 polysaccharide (K96 antigen) from Escherichia coli O77:K96:H- and comparison with the capsular K54 polysaccharide (K54 antigen) from Escherichia coli O6:K54:H10. Carbohydr. Res. 253, 323–327 (1994).
Stanton, T. D., Hetland, M. A. K., Löhr, I. H., Holt, K. E. & Wyres, K. L. Fast and accurate in silico antigen typing with Kaptive 3. Microb. Genom. 11, 001428 (2025).
Rutschmann, C. et al. DMB labelling for detection and analysis of capsular polysaccharides. Preprint at bioRxiv https://doi.org/10.1101/2025.01.29.635457 (2025).
Miravet-Verde, S. et al. In silico typing maps the natural diversity of E. coli transporter-dependent capsules. Nat. Microbiol. https://doi.org/10.1038/s41564-026-02323-5 (2026).
Mäklin, T., Gladstone, R., Shao, Y. & Wellcome Sanger Institute BBS phase 1 & phase 2 high quality E. coli bin assembled genomes. Zenodo https://doi.org/10.5281/zenodo.14000489 (2024).
Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
Schwengers, O. et al. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb. Genom. 7, 000685 (2021).
Xie, Z. & Tang, H. ISEScan: automated identification of insertion sequence elements in prokaryotic genomes. Bioinformatics 33, 3340–3347 (2017).
Gladstone, R. A. et al. Emergence and dissemination of antimicrobial resistance in Escherichia coli causing bloodstream infections: a nationwide longitudinal microbial population genomic cohort study in Norway between 2002–2017. Lancet Microbe 2, e331–e341 (2021).
Gladstone, R. rgladstone/EC-K-typing: K-typing DB update 11/2025 (v3.0.0). Zenodo https://doi.org/10.5281/zenodo.18107967 (2025).
Ondov, B. D. et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016).
Hadfield, J. et al. Phandango: an interactive viewer for bacterial population genomics. Bioinformatics 34, 292–293 (2018).
Khawaja, T. et al. Deep sequencing of Escherichia coli exposes colonisation diversity and impact of antibiotics in Punjab, Pakistan. Nat. Commun. 15, 5196 (2024).
Mäklin, T. et al. Geographical variation in the incidence of colorectal cancer and urinary tract cancer is associated with population exposure to colibactin-producing Escherichia coli. Lancet Microbe 6, 101015 (2024).
Alanko, J. N. Species-colored Themisto v3 index with 640k bacterial genomes (Themisto v3). Zenodo https://zenodo.org/records/7736981 (2023).
Alanko, J. N., Vuohtoniemi, J., Mäklin, T. & Puglisi, S. J. Themisto: a scalable colored k-mer index for sensitive pseudoalignment against hundreds of thousands of bacterial genomes. Bioinformatics 39, i260–i269 (2023).
Mäklin, T. et al. High-resolution sweep metagenomics using fast probabilistic inference. Wellcome Open Res. 5, 14 (2021).
Mäklin, T. et al. Bacterial genomic epidemiology with mixed samples. Microb. Genom. 7, 000691 (2021).
Mäklin, T. Escherichia coli lineage deconvolution indexes for Themisto, mSWEEP/mGEMS, and demix_check (v0.1.1). Zenodo https://doi.org/10.5281/zenodo.12528310 (2024).
Lees, J. A. et al. Fast and flexible bacterial genomic epidemiology with PopPUNK. Genome Res. 29, 304–316 (2019).
Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. https://doi.org/10.18637/jss.v067.i01 (2015).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Croucher, N. J. et al. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res. 43, e15 (2015).
Tonkin-Hill, G. & Lees, J. pairsnp-cpp: fast pairwise SNP distance matrices from multiple sequence alignments. GitHub https://github.com/gtonkinhill/pairsnp-cpp (2022).
Derelle, R. et al. Seamless, rapid, and accurate analyses of outbreak genomic data using split k-mer analysis. Genome Res. 34, 1661–1673 (2024).
Didelot, X., Croucher, N. J., Bentley, S. D., Harris, S. R. & Wilson, D. J. Bayesian inference of ancestral dates on bacterial phylogenetic trees. Nucleic Acids Res. 46, e134 (2018).
Argimón, S. et al. Microreact: visualizing and sharing data for genomic epidemiology and phylogeography. Microb. Genom. 2, e000093 (2016).
Gladstone, R. Supplementary Data 1. Accessions and metadata. Gladstone et al NatMicro 2026. Zenodo https://doi.org/10.5281/zenodo.18154176 (2026).
Gladstone, R. Source data for main figures 1-5 and extended data figures 1-6. Gladstone et al. NatMicro. 2026. Zenodo https://doi.org/10.5281/zenodo.18349577 (2026).