J Comput Biol. 2016 Sep;23(9):737-49. doi: 10.1089/cmb.2015.0234. Epub 2016 May 6.

cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design.

Author information

1 Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China.
2 Department of Pharmacology and Pharmaceutical Sciences, Tsinghua University, Beijing, China.
3 Department of Computer Science, Duke University, Durham, North Carolina.
4 Department of Biochemistry, Duke University Medical Center, Durham, North Carolina.

Abstract

Finding the global minimum energy conformation (GMEC) in a huge combinatorial search space is the key challenge in computational protein design (CPD). Traditional algorithms lack a scalable and efficient distributed design scheme, which prevents researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension of the widely used protein design software OSPREY, which allows the original design framework to scale to commercial cloud infrastructures. We propose several novel designs that integrate algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that were not solvable with previous approaches.
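
To make the terms in the abstract concrete, the following is a minimal Python sketch of a branch-and-bound GMEC search over a toy pairwise energy model. It is not cOSPREY or OSPREY code: the energy tables, the function names (`conformation_energy`, `lower_bound`, `search_subtree`, `find_gmec`), and the bound used are all assumptions chosen for illustration. The search is split into one subtree per rotamer at the first position, standing in for search partitioning, and the best energy found so far is passed from one subtree to the next, standing in for state sharing. Here the subtrees run sequentially; a distributed version would hand them to separate workers.

```python
"""Toy branch-and-bound GMEC search (illustrative only, not cOSPREY code)."""

# Hypothetical energy tables: EXAMPLE_SINGLE[i][r] is the self energy of
# rotamer r at position i; EXAMPLE_PAIR[(i, j)][r][s] is the pairwise energy
# between rotamer r at position i and rotamer s at position j, for i < j.
EXAMPLE_SINGLE = [[0.0, 1.0], [0.5, -0.2], [0.3, 0.1, 0.4]]
EXAMPLE_PAIR = {
    (0, 1): [[0.2, -0.5], [0.1, 0.0]],
    (0, 2): [[0.0, 0.3, -0.1], [0.2, -0.4, 0.0]],
    (1, 2): [[0.1, 0.0, 0.2], [-0.3, 0.5, 0.0]],
}


def conformation_energy(conf, single, pair):
    """Energy of an assignment to the first len(conf) positions."""
    k = len(conf)
    energy = sum(single[i][conf[i]] for i in range(k))
    energy += sum(
        pair[(i, j)][conf[i]][conf[j]] for i in range(k) for j in range(i + 1, k)
    )
    return energy


def lower_bound(partial, single, pair):
    """Admissible lower bound on any completion of a partial assignment."""
    n, k = len(single), len(partial)
    bound = conformation_energy(partial, single, pair)
    for j in range(k, n):
        # Best case for unassigned position j: its self energy, its exact
        # interaction with assigned positions, and the most favorable
        # interaction with each later unassigned position.
        bound += min(
            single[j][r]
            + sum(pair[(i, j)][partial[i]][r] for i in range(k))
            + sum(min(pair[(j, m)][r]) for m in range(j + 1, n))
            for r in range(len(single[j]))
        )
    return bound


def search_subtree(root, single, pair, incumbent):
    """Depth-first branch and bound below `root`, starting from `incumbent`."""
    n = len(single)
    best_energy, best_conf = incumbent
    stack = [root]
    while stack:
        partial = stack.pop()
        bound = lower_bound(partial, single, pair)
        if bound >= best_energy:
            continue  # prune: nothing below this node can beat the incumbent
        if len(partial) == n:
            # For a full assignment the bound equals the exact energy.
            best_energy, best_conf = bound, partial
            continue
        for r in range(len(single[len(partial)])):
            stack.append(partial + (r,))
    return best_energy, best_conf


def find_gmec(single, pair):
    """Partition by first-position rotamer and share the incumbent."""
    incumbent = (float("inf"), None)
    for r in range(len(single[0])):
        incumbent = search_subtree((r,), single, pair, incumbent)
    return incumbent


if __name__ == "__main__":
    energy, conf = find_gmec(EXAMPLE_SINGLE, EXAMPLE_PAIR)
    print(conf, energy)
```

Passing the incumbent between subtrees is what lets a bound found in one partition prune nodes in another; without it, each partition would search as if no solution were known yet.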

KEYWORDS: MapReduce; branch and bound; cloud; distributed systems; global minimum energy conformation; protein design

PMID: 27154509
PMCID: PMC5586165
DOI: 10.1089/cmb.2015.0234
[Indexed for MEDLINE]
Free PMC Article
