Genome Med. 2014 Nov 15;6(11):101. doi: 10.1186/s13073-014-0101-7. eCollection 2014.

A phylogeny-based sampling strategy and power calculator informs genome-wide associations study design for microbial pathogens.

Author information

1. Department of Pulmonary and Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Department of Global Health and Social Medicine, Harvard Medical School, 641 Huntington Avenue Suite 4A, Boston, MA 02115, USA.
2. Département de sciences biologiques, Université de Montréal, Montréal, QC, Canada.
3. Institute of Life Science, College of Medicine, Swansea University, Swansea, SA2 8PP, UK.
4. Department of Mathematics, Imperial College London, London, UK.
5. Department of Global Health and Social Medicine, Harvard Medical School, 641 Huntington Avenue Suite 4A, Boston, MA 02115, USA; Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA.

Abstract

Whole-genome sequencing is increasingly used to study phenotypic variation among infectious pathogens and to evaluate their relative transmissibility, virulence, and immunogenicity. To date, relatively little has been published on how pathogen strains should be selected, and how many are needed, for studies associating phenotype and genotype. Identifying genetic associations in bacteria poses specific challenges because bacteria often comprise highly structured populations. Here we consider general methodological questions related to sampling and analysis, focusing on clonal to moderately recombining pathogens. We propose that a matched sampling scheme constitutes an efficient study design, and we provide a power calculator based on phylogenetic convergence. We demonstrate this approach by applying it to genomic datasets for two microbial pathogens: Mycobacterium tuberculosis and Campylobacter species.
