Bioinformatics. 2015 Sep 15;31(18):2963-71. doi: 10.1093/bioinformatics/btv309. Epub 2015 May 18.

IMSEQ--a fast and error aware approach to immunogenetic sequence analysis.

Author information

Berlin-Brandenburg Center for Regenerative Therapies, Charité Universitätsmedizin, Berlin; Department of Computer Science, Freie Universität, Berlin; Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany; Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel; Marien Hospital Herne, Ruhr University Bochum, Bochum; and Institute of Medical Genetics and Human Genetics, Charité Universitätsmedizin Berlin, Berlin, Germany.

Abstract

MOTIVATION:

Recombined T- and B-cell receptor repertoires are increasingly studied using next-generation sequencing (NGS) to interrogate repertoire composition and changes in the distribution of receptor clones under different physiological and disease states. This type of analysis requires efficient and unambiguous clonotype assignment for a large number of NGS reads, including identification of the incorporated V and J gene segments and the CDR3 sequence. Current tools have deficits in performance, accuracy, and the documentation of their underlying algorithms and usage.
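To make the notion of clonotype assignment concrete, the following minimal Python sketch treats a clonotype as the combination of V segment, J segment and CDR3 sequence, and tallies how many reads map to each. It is an illustration only and not IMSEQ's implementation: the `Clonotype` type, the `count_clonotypes` function and the example reads are hypothetical, and the sketch assumes that segment identification and CDR3 extraction have already been done upstream.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Clonotype(NamedTuple):
    """Hypothetical record: one read's V segment, J segment and CDR3."""
    v_gene: str
    j_gene: str
    cdr3: str


def count_clonotypes(assignments: Iterable[Clonotype]) -> Counter:
    """Tally reads per clonotype; identical (V, J, CDR3) tuples collapse."""
    return Counter(assignments)


# Illustrative values only, not data from the paper.
reads = [
    Clonotype("TRBV7-9", "TRBJ2-1", "CASSLGQGNEQFF"),
    Clonotype("TRBV7-9", "TRBJ2-1", "CASSLGQGNEQFF"),
    Clonotype("TRBV5-1", "TRBJ1-2", "CASSLDRGYGYTF"),
]

repertoire = count_clonotypes(reads)

# List clonotypes from most to least frequent.
for clonotype, n_reads in repertoire.most_common():
    print(clonotype.v_gene, clonotype.j_gene, clonotype.cdr3, n_reads)
```

Exact tuple matching like this treats every PCR or sequencing error in the CDR3 as a new clonotype, which is why error-aware handling matters for repertoire analysis.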

RESULTS:

We present IMSEQ, a method to derive clonotype repertoires from NGS data with sophisticated routines for handling errors stemming from PCR and sequencing artefacts. The application can handle different kinds of input data originating from single- or paired-end sequencing in different configurations and is generic regarding the species and gene of interest. We have carefully evaluated our method with simulated and real-world data and show that IMSEQ is superior to other tools in clonotyping, standalone error correction and runtime performance.

AVAILABILITY AND IMPLEMENTATION:

IMSEQ was implemented in C++ using the SeqAn library for efficient sequence analysis. It is freely available under the GPLv2 open source license and can be downloaded at www.imtools.org.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

CONTACT:

lkuchenb@inf.fu-berlin.de or peter.robinson@charite.de.

PMID: 25987567
DOI: 10.1093/bioinformatics/btv309
[Indexed for MEDLINE]
