Nat Biotechnol. 2015 Oct;33(10):1045-52. doi: 10.1038/nbt.3319. Epub 2015 Sep 7.

ConStrains identifies microbial strains in metagenomic datasets.

Luo C (1,2,3), Knight R (4,5), Siljander H (6,7), Knip M (6,7,8,9), Xavier RJ (1,2,3,10), Gevers D (1).

Author information

1. Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, Massachusetts, USA.
2. Gastrointestinal Unit and Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
3. Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
4. Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, Colorado, USA.
5. Howard Hughes Medical Institute, Boulder, Colorado, USA.
6. Children's Hospital, University of Helsinki and Helsinki University Hospital, Helsinki, Finland.
7. Research Programs Unit, Diabetes and Obesity, University of Helsinki, Helsinki, Finland.
8. Folkhälsan Research Center, Helsinki, Finland.
9. Department of Pediatrics, Tampere University Hospital, Tampere, Finland.
10. Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.

Abstract

An important fraction of microbial diversity is harbored in strain individuality, so identification of conspecific bacterial strains is imperative for improved understanding of microbial community functions. Limitations in bioinformatics and sequencing technologies have to date precluded strain identification owing to difficulties in phasing short reads to faithfully recover the original strain-level genotypes, which have highly similar sequences. We present ConStrains, an open-source algorithm that identifies conspecific strains from metagenomic sequence data and reconstructs the phylogeny of these strains in microbial communities. The algorithm uses single-nucleotide polymorphism (SNP) patterns in a set of universal genes to infer within-species structures that represent strains. Applying ConStrains to simulated and host-derived datasets provides insights into microbial community dynamics.
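
To make the idea of SNP patterns concrete, the following is a minimal toy sketch and not the ConStrains implementation. It assumes reads that are already aligned, without indels, to a single gene, and it only flags positions where a minor allele is common enough to suggest more than one strain. The function names, the 0.1 frequency threshold, and the example reads are all hypothetical.

```python
from collections import Counter


def allele_frequencies(reads):
    """Per-column base frequencies for equal-length, pre-aligned reads."""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must be pre-aligned to the same length")
    freqs = []
    for column in zip(*reads):
        # Ignore gap characters when counting bases in this column.
        counts = Counter(base for base in column if base != "-")
        total = sum(counts.values())
        freqs.append({b: n / total for b, n in counts.items()} if total else {})
    return freqs


def snp_positions(freqs, min_minor_freq=0.1):
    """0-based positions whose second most common base reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    positions = []
    for i, column_freqs in enumerate(freqs):
        ranked = sorted(column_freqs.values(), reverse=True)
        if len(ranked) > 1 and ranked[1] >= min_minor_freq:
            positions.append(i)
    return positions


# Hypothetical reads: three from one strain, two from another.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ATGTAA", "ATGTAA"]
freqs = allele_frequencies(reads)
snps = snp_positions(freqs)
```

In this toy input the two polymorphic columns share the same 3:2 split, which is the kind of co-varying frequency pattern that could hint at two strains. Inferring actual strain genotypes and their relative abundances, as the abstract describes, requires considerably more than this.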

PMID: 26344404
PMCID: PMC4676274
DOI: 10.1038/nbt.3319
[Indexed for MEDLINE]
Free PMC Article
