Nat Biotechnol. 2020 Jan 6. doi: 10.1038/s41587-019-0368-8. [Epub ahead of print]

Accurate detection of mosaic variants in sequencing data without matched controls.

Author information

1. Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
2. Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA.
3. Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA.
4. Broad Institute of MIT and Harvard, Cambridge, MA, USA.
5. Harvard/MIT MD-PhD Program, Harvard Medical School, Boston, MA, USA.
6. European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK.
7. Bioinformatics and Integrative Genomics PhD program, Harvard Medical School, Boston, MA, USA.
8. Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA. peter_park@hms.harvard.edu.
9. Ludwig Center at Harvard, Boston, MA, USA. peter_park@hms.harvard.edu.

Abstract

Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80-90% of the mosaic single-nucleotide variants and 60-80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.

PMID: 31907404
DOI: 10.1038/s41587-019-0368-8
