BMC Genomics. 2018 Nov 28;19(1):845. doi: 10.1186/s12864-018-5264-y.

Helmsman: fast and efficient mutation signature analysis for massive sequencing datasets.

Author information

1. Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA. jedidiah@umich.edu.
2. Department of Genome Sciences, University of Washington, Seattle, WA, USA. jedidiah@umich.edu.
3. Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
4. Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA.
5. Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
6. Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA.

Abstract

BACKGROUND:

The spectrum of somatic single-nucleotide variants in cancer genomes often reflects the signatures of multiple distinct mutational processes, which can provide clinically actionable insights into cancer etiology. Existing software tools for identifying and evaluating these mutational signatures do not scale to analyze large datasets containing thousands of individuals or millions of variants.

RESULTS:

We introduce Helmsman, a program designed to perform mutation signature analysis on arbitrarily large sequencing datasets. Helmsman is up to 300 times faster than existing software. Helmsman's memory usage is independent of the number of variants, resulting in a small enough memory footprint to analyze datasets that would otherwise exceed the memory limitations of other programs.
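To illustrate how memory usage can stay independent of the number of variants, the following is a minimal sketch, not Helmsman's actual code. It assumes variants arrive as a stream of (sample, trinucleotide context, alternate base) tuples and tallies each one into the conventional 96 single-nucleotide subtypes, so that only one fixed-length row of counts per sample is ever held in memory. The names `classify` and `count_spectra`, and the input tuple layout, are hypothetical.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def subtype_labels():
    """Return the 96 subtype labels, e.g. 'A[C>T]G' (pyrimidine reference)."""
    labels = []
    for ref in "CT":
        for alt in BASES:
            if alt == ref:
                continue
            for five, three in product(BASES, repeat=2):
                labels.append(f"{five}[{ref}>{alt}]{three}")
    return labels


LABELS = subtype_labels()
INDEX = {label: i for i, label in enumerate(LABELS)}


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def classify(context, alt):
    """Map a trinucleotide context and alternate base to one of 96 subtypes."""
    # Collapse purine references (A, G) onto the opposite strand so that
    # every subtype is expressed with a C or T reference base.
    if context[1] in "AG":
        context, alt = revcomp(context), COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def count_spectra(variants):
    """Tally a stream of (sample, context, alt) tuples (assumed layout).

    Each variant is consumed and discarded, so memory grows with the
    number of samples (one row of 96 counts each), not with the number
    of variants.
    """
    counts = {}
    for sample, context, alt in variants:
        row = counts.setdefault(sample, [0] * len(LABELS))
        row[INDEX[classify(context, alt)]] += 1
    return counts
```

Because `count_spectra` accepts any iterable, a generator that reads variants lazily from a file can be passed in without first loading the whole dataset.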

CONCLUSIONS:

Helmsman is a computationally efficient tool that enables users to evaluate mutational signatures in massive sequencing datasets that are otherwise intractable with existing software. Helmsman is freely available at https://github.com/carjed/helmsman.

KEYWORDS:

Cancer genomics; Mutational signatures; Python; Single nucleotide variants
