Front Microbiol. 2019 Feb 25;10:317. doi: 10.3389/fmicb.2019.00317. eCollection 2019.

Genomic Analyses of >3,100 Nasopharyngeal Pneumococci Revealed Significant Differences Between Pneumococci Recovered in Four Different Geographical Regions.

Author information

Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
Parasites and Microbes, Wellcome Sanger Institute, Hinxton, United Kingdom.
Department of Zoology, University of Oxford, Oxford, United Kingdom.
Department of Medicine, Imperial College London, London, United Kingdom.
Clinical Microbiology, University of Iceland and Landspitali University Hospital, Reykjavik, Iceland.
Institute of Infection and Global Health, University of Liverpool, Liverpool, United Kingdom.
Department of Pathology, University of Cambridge, Cambridge, United Kingdom.
Children's Hospital Iceland, Landspitali University Hospital, Reykjavik, Iceland.


Understanding the structure of a bacterial population is essential to understanding bacterial evolution. Estimating the core genome (those genes common to all, or nearly all, strains of a species) is a key component of such analyses. The size and composition of the core genome vary by dataset, but we hypothesized that the variation between different collections of the same bacterial species would be minimal. To investigate this, we analyzed the genome sequences of 3,118 pneumococci recovered from healthy individuals in Reykjavik (Iceland), Southampton (United Kingdom), Boston (United States), and Maela (Thailand). The analyses revealed a "supercore" genome (genes shared by all 3,118 pneumococci) of 558 genes, although an additional 354 core genes were shared by pneumococci from Reykjavik, Southampton, and Boston. Overall, the size and composition of the core and pan-genomes among pneumococci recovered in Reykjavik, Southampton, and Boston were similar. Maela pneumococci were distinctly different in that they had a smaller core genome and a larger pan-genome. The pan-genome of Maela pneumococci contained several >25 Kb sequence regions (flanked by pneumococcal genes) that were homologous to genomic regions found in other bacterial species. Overall, our work revealed that some subsets of the global pneumococcal population are highly heterogeneous, and our hypothesis was rejected. This is an important finding in terms of understanding genetic variation among pneumococci and is also an essential point of consideration before generalizing the findings from a single dataset to the wider pneumococcal population.
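
To make the terms in the abstract concrete, the following is a minimal sketch of how a core genome, a pan-genome, and a "supercore" could be computed from gene presence/absence data. It is not the authors' pipeline: the function names, the `threshold` parameter, the isolate labels, and the gene labels are all hypothetical, and the toy data are invented purely for illustration.

```python
from typing import Dict, Set

# Hypothetical representation: each genome is the set of gene names it carries.
Genomes = Dict[str, Set[str]]


def core_genome(genomes: Genomes, threshold: float = 1.0) -> Set[str]:
    """Return genes present in at least `threshold` (a fraction) of the genomes."""
    if not genomes:
        return set()
    counts: Dict[str, int] = {}
    for genes in genomes.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    n = len(genomes)
    return {gene for gene, c in counts.items() if c >= threshold * n}


def pan_genome(genomes: Genomes) -> Set[str]:
    """Return every gene seen in at least one genome."""
    return set().union(*genomes.values())


# Invented toy data: two collections with two isolates each.
collections: Dict[str, Genomes] = {
    "collection_1": {
        "iso1": {"geneA", "geneB", "geneC"},
        "iso2": {"geneA", "geneB", "geneC", "geneD"},
    },
    "collection_2": {
        "iso3": {"geneA", "geneC", "geneE"},
        "iso4": {"geneA", "geneE", "geneF"},
    },
}

# Core genome per collection (strict definition: present in every isolate).
per_collection_core = {name: core_genome(g) for name, g in collections.items()}

# Pool all isolates; the strict core of the pooled set plays the role of a "supercore".
all_isolates: Genomes = {
    iso: genes for g in collections.values() for iso, genes in g.items()
}
supercore = core_genome(all_isolates)
pan = pan_genome(all_isolates)

print(sorted(supercore), len(pan))
```

In this toy example the second collection has a smaller core than the first, which in turn shrinks the pooled core, mirroring in miniature how one heterogeneous collection can reduce the set of genes shared by all isolates. Setting `threshold` below 1.0 would give the "nearly all strains" variant of the definition.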


Keywords: accessory genome; bacterial population structure; core genome; next generation sequencing; pan-genome; pneumococcus
