Nucleic Acids Res. 2019 Oct 10;47(18):e112. doi: 10.1093/nar/gkz656.

Genome-wide epistasis and co-selection study using mutual information.

Author information

Department of Mathematics and Statistics, Helsinki Institute for Information Technology (HIIT), Faculty of Science, University of Helsinki, FI-00014 Helsinki, Finland.
Department of Computer Science, Aalto University, Espoo, FI-00014, Finland.
Division of Informatics, Faculty of Arts and Sciences, Harvard University, Cambridge, MA 02138, USA.
Parasites and Microbes, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK.
Department of Biostatistics, University of Oslo, Oslo, 0317, Norway.
Department of Microbiology, New York University School of Medicine, New York, NY 10016, USA.
Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK.
Bioinformatics & Systems Biology program, King Mongkut's University of Technology Thonburi, Bangkok 10150, Thailand.
Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, CB3 0ES, UK.
MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, St. Mary's Campus, Imperial College London, London, W2 1PG, UK.


Covariance-based discovery of polymorphisms under co-selective pressure or epistasis has received considerable recent attention in population genomics. Both statistical modeling of the population-level covariation of alleles across the chromosome and model-free testing of dependencies between pairs of polymorphisms have been shown to successfully uncover patterns of selection in bacterial populations. Here we introduce a model-free method, SpydrPick, whose computational efficiency enables analysis at the scale of pan-genomes of many bacteria. SpydrPick incorporates an efficient correction for population structure, which adjusts for the phylogenetic signal in the data without requiring an explicit phylogenetic tree. We also introduce a new type of visualization of the results, similar to the Manhattan plots used in genome-wide association studies, which enables rapid exploration of the identified signals of co-evolution. Simulations demonstrate the usefulness of our method and give some insight into when this type of analysis is most likely to be successful. Application of the method to large population genomic datasets of two major human pathogens, Streptococcus pneumoniae and Neisseria meningitidis, revealed both previously identified and novel putative targets of co-selection related to virulence and antibiotic resistance, highlighting the potential of this approach to drive molecular discoveries, even in the absence of phenotypic data.
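As a rough illustration of the model-free pairwise testing mentioned above, the following Python sketch computes the plain empirical mutual information between two alignment columns. It is not the SpydrPick implementation: the function name `mutual_information` and the toy columns are hypothetical, and the sketch assumes unweighted samples, so it omits the population-structure correction the abstract describes.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Empirical (plug-in) mutual information, in nats, between two
    equal-length sequences of alleles observed across the same isolates.

    Assumption: every isolate counts equally (no population-structure weights).
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same isolates")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")

    joint = Counter(zip(col_a, col_b))  # counts of allele pairs (a, b)
    count_a = Counter(col_a)            # marginal allele counts at site A
    count_b = Counter(col_b)            # marginal allele counts at site B

    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p_ab * log(p_ab / (p_a * p_b)), written in terms of raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy columns (hypothetical data): one allele per isolate at each site.
linked = mutual_information("AACC", "GGTT")       # alleles always co-occur
independent = mutual_information("AACC", "GTGT")  # no association
```

In this toy case the perfectly linked pair scores log 2 (about 0.693 nats) and the independent pair scores 0. A genome-wide scan would evaluate such a score for many site pairs and rank them, which is where computational efficiency becomes the limiting factor.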

[Indexed for MEDLINE]
Free PMC Article
