J Comput Chem. 2017 Jun 15;38(16):1419-1430. doi: 10.1002/jcc.24729. Epub 2017 Jan 17.

In situ data analytics and indexing of protein trajectories.

Author information

1
Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA.
2
Department of Theoretical Chemistry, University of Gdansk, 80-952, Gdańsk, Poland.
3
Department of Computer Science, University of California, Davis, CA, 95616, USA.

Abstract

The transition toward exascale computing will be accompanied by a performance dichotomy. Computational peak performance will rapidly increase; I/O performance will either grow slowly or be completely stagnant. Essentially, the rate at which data are generated will grow much faster than the rate at which data can be read from and written to the disk. Molecular dynamics (MD) simulations will soon face the I/O problem of efficiently writing to and reading from disk on the next generation of supercomputers. This article targets MD simulations at the exascale and proposes a novel technique for in situ data analysis and indexing of MD trajectories. Our technique maps individual trajectories' substructures (i.e., α-helices, β-strands) to metadata frame by frame. The metadata captures the conformational properties of the substructures. The ensemble of metadata can be used for automatic, strategic analysis within a trajectory or across trajectories, without manually identifying those portions of trajectories in which critical changes take place. We demonstrate our technique's effectiveness by applying it to 26.3k helices and 31.2k strands from 9917 PDB proteins and by providing three empirical case studies.
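
The abstract does not say how the per-frame metadata are computed; the keywords only indicate that eigenvalues are involved. As an illustration of the general idea, the sketch below assumes the metadata for a substructure is the largest eigenvalue of the pairwise distance matrix of its atoms, computed independently for each frame. That choice, the function names (`frame_metadata`, `index_trajectory`, `flag_changes`), and the threshold-based change detection are all assumptions made for this example, not the authors' published method.

```python
import numpy as np

def frame_metadata(coords):
    """Reduce one substructure in one frame to a single number.

    coords: (n_atoms, 3) array of atom positions.
    Assumption: the largest eigenvalue of the pairwise distance
    matrix serves as the conformational metadata.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)       # symmetric (n_atoms, n_atoms)
    return float(np.linalg.eigvalsh(dist)[-1])  # eigvalsh sorts ascending

def index_trajectory(frames):
    """Map every frame to its metadata value, one value per frame."""
    return np.array([frame_metadata(f) for f in frames])

def flag_changes(metadata, threshold):
    """Return indices of frames whose metadata jumps by more than
    `threshold` relative to the previous frame (hypothetical criterion)."""
    jumps = np.abs(np.diff(metadata))
    return np.flatnonzero(jumps > threshold) + 1
```

Because each frame is reduced to one scalar as it is produced, only the small metadata array would need to be kept to locate frames of interest, rather than re-reading full coordinates from disk.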

KEYWORDS:

conformational metadata; eigenvalues; exascale computing; high-performance computing; protein trajectories

PMID:
28093787
DOI:
10.1002/jcc.24729
