BMC Med Genomics. 2015 Oct 15;8:64. doi: 10.1186/s12920-015-0134-9.

Scalable and cost-effective NGS genotyping in the cloud.

Author information

1. Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA, 02115, USA. yassine_souilmi@hms.harvard.edu.
2. Department of Biology, Mohamed Vth University, 4 Ibn Battouta Avenue, B.P: 1014RP, Rabat, Morocco. yassine_souilmi@hms.harvard.edu.
3. Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA, 02115, USA. alex_lancaster@hms.harvard.edu.
4. Department of Pathology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, 02215, USA. alex_lancaster@hms.harvard.edu.
5. Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA, 02115, USA. jae-yoon_jung@hms.harvard.edu.
6. Department of Electrical, Computer and Biomedical Engineering, University of Pavia, via Ferrata 1, Pavia, 27100, Italy. ettore.rizzo@unipv.it.
7. Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA, 02115, USA. jared_hawkins@hms.harvard.edu.
8. Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA, 02115, USA. rlpowles@vt.edu.
9. Department of Biology, Mohamed Vth University, 4 Ibn Battouta Avenue, B.P: 1014RP, Rabat, Morocco. amzazi@gmail.com.
10. Department of Biology, Mohamed First University, Oujda, Nador, Morocco. hassan.ghazal@fulbrightmail.org.
11. Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA, 02115, USA. peter_tonellato@hms.harvard.edu.
12. Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02215, USA. peter_tonellato@hms.harvard.edu.
13. Department of Pediatrics and Psychiatry (by courtesy), Division of Systems Medicine & Program in Biomedical Informatics, Stanford University, Stanford, CA, 94305, USA. dpwall@stanford.edu.

Abstract

BACKGROUND:

While next-generation sequencing (NGS) costs have plummeted in recent years, the cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until whole genome sequencing data can be robustly, routinely, and accurately rendered into medically actionable reports within a time window of hours and at a cost in the tens of dollars.

RESULTS:

We take a step towards addressing this challenge by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows while making optimal use of high-performance compute clusters. Here we show that the Amazon Web Services (AWS) implementation of GenomeKey via COSMOS provides fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets.

CONCLUSIONS:

Our systematic benchmarking reveals important new insights and considerations for achieving clinical turnaround of whole genome analysis through optimization and workflow management, including strategic batching of individual genomes and efficient cluster resource configuration.

PMID: 26470712
PMCID: PMC4608296
DOI: 10.1186/s12920-015-0134-9
[Indexed for MEDLINE]
Free PMC Article
