BMC Bioinformatics. 2017 Jan 20;18(1):49. doi: 10.1186/s12859-016-1454-2.

MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants.

Author information

1. Center for Informatics Sciences, Nile University, Juhayna Square, Sheikh Zayed, Giza, Egypt.
2. Department of Biology, Mohamed Vth University in Rabat, 4 Ibn Battouta Avenue, BP: 1014RP, Rabat, Morocco.
3. Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA, 02115, USA.
4. Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02215, USA.
5. Department of Pediatrics and Psychiatry (by courtesy), Division of Systems Medicine & Program in Biomedical Informatics, Stanford University, Stanford, CA, 94305, USA.
6. Center for Informatics Sciences, Nile University, Juhayna Square, Sheikh Zayed, Giza, Egypt. mabouelhoda@yahoo.com.
7. Systems and Biomedical Engineering Department, Faculty of Engineering, Cairo University, Giza, Egypt. mabouelhoda@yahoo.com.

Abstract

BACKGROUND:

Next-generation genome sequencing techniques have become affordable enough for massive sequencing efforts devoted to the clinical characterization of human diseases. However, the cost of cloud-based analysis of the mounting datasets remains a bottleneck for providing cost-effective clinical services. To address this computational problem, it is important to optimize the variant analysis workflow and the analysis tools it uses, reducing the overall processing time and, with it, the processing cost. It is also important to capitalize on recent developments in the cloud computing market, which has seen more providers competing on products and prices.

RESULTS:

In this paper, we present a new package called MC-GenomeKey (Multi-Cloud GenomeKey) that efficiently executes the variant analysis workflow for detecting and annotating mutations using cloud resources from different commercial cloud providers. Our package supports the Amazon, Google, and Azure clouds, as well as any other cloud platform based on OpenStack. It allows execution scenarios of different levels of sophistication, up to one in which a workflow is executed on a cluster whose nodes come from different clouds. MC-GenomeKey also supports scenarios that exploit the Amazon spot instance model in combination with other cloud platforms to provide significant cost reduction. To the best of our knowledge, this is the first solution that optimizes the execution of the workflow using computational resources from different cloud providers.

CONCLUSIONS:

MC-GenomeKey provides an efficient multicloud-based solution for detecting and annotating mutations. The package can run on different commercial cloud platforms, which enables the user to seize the best offers. It also provides a reliable means of using the low-cost Amazon spot instance model, as it handles the sudden termination of spot machines caused by a sudden price increase. The package has a web interface and is available free of charge for academic use.
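The abstract does not describe how the package recovers from spot terminations, so the following is only an illustrative sketch of the general idea, not MC-GenomeKey's actual mechanism. It assumes a cluster whose nodes come from several clouds and, when the spot price rises above a node's bid, moves that node's unfinished tasks to the least-loaded surviving nodes. All names here (`Node`, `reassign_after_price_change`, the provider labels, the per-chromosome task names) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A cluster node; every field name here is an illustrative assumption."""
    name: str
    provider: str                 # hypothetical label, e.g. "amazon-spot"
    is_spot: bool                 # True if the node can be reclaimed on a price rise
    bid: float = 0.0              # maximum hourly price accepted for a spot node
    tasks: List[str] = field(default_factory=list)


def reassign_after_price_change(nodes: List[Node], spot_price: float) -> Dict[str, List[str]]:
    """Return the task assignment after spot nodes with bid < spot_price are lost."""
    terminated = [n for n in nodes if n.is_spot and n.bid < spot_price]
    survivors = [n for n in nodes if not (n.is_spot and n.bid < spot_price)]
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the work")

    # Surviving nodes keep the tasks they already had.
    assignment = {n.name: list(n.tasks) for n in survivors}

    # Each orphaned task goes to the currently least-loaded survivor.
    for task in (t for n in terminated for t in n.tasks):
        target = min(survivors, key=lambda n: len(assignment[n.name]))
        assignment[target.name].append(task)
    return assignment


cluster = [
    Node("spot1", "amazon-spot", is_spot=True, bid=0.10, tasks=["chr1", "chr2"]),
    Node("gce1", "google", is_spot=False, tasks=["chr3"]),
    Node("azure1", "azure", is_spot=False),
]
after_spike = reassign_after_price_change(cluster, spot_price=0.25)
```

In this toy run the spot node is lost once the price exceeds its bid of 0.10, and its two tasks are spread over the Google and Azure nodes. A real system would also need to checkpoint partial results and react to the provider's termination notice, neither of which is modeled here.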

KEYWORDS:

Cloud computing; Multicloud; Personalized medicine; Sequence analysis; Variant analysis

PMID: 28107819
PMCID: PMC5248509
DOI: 10.1186/s12859-016-1454-2
[Indexed for MEDLINE]
Free PMC Article
