Precise and Parallel Pairwise Metagenomic Comparisons

J Comput Biol. 2018 Aug;25(8):841-849. doi: 10.1089/cmb.2018.0081.

Abstract

Comparing metagenomes and assessing their similarity remains an open problem. Uncultivated samples suffer from high variability, which makes it difficult for heuristic sequence comparison methods to find precise matches in reference databases. Finer methods are required to provide higher accuracy and certainty, although they come at the cost of longer computation times. In this work, we therefore present our software for the highly parallel, fine-grained pairwise alignment of metagenomes. First, we analyze the computational limitations of performing coarse-grained global alignments in parallel, and we discuss the solution adopted in our proposal. Second, we show that our software is competitive with state-of-the-art software in terms of speed and resource consumption, while achieving more accurate results. In addition, the adopted parallel scheme is tested and reaches up to 98% efficiency on up to 64 cores. Sequential optimizations are also tested and show a 9× speedup over our previous proposal.
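
For readers less familiar with the scaling figures quoted above, parallel efficiency is conventionally defined as the speedup divided by the number of cores. The sketch below only illustrates that standard definition. The function names are hypothetical and the runtimes are made-up values chosen to show the arithmetic, not measurements from the paper.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of the single-core runtime to the multi-core runtime."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("runtimes must be positive")
    return t_serial / t_parallel


def parallel_efficiency(t_serial: float, t_parallel: float, cores: int) -> float:
    """Speedup divided by core count; 1.0 means perfect linear scaling."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup(t_serial, t_parallel) / cores


# Hypothetical runtimes in seconds (assumed values, not taken from the paper).
t_1_core = 6400.0
t_64_cores = 102.0

eff = parallel_efficiency(t_1_core, t_64_cores, cores=64)
print(f"speedup: {speedup(t_1_core, t_64_cores):.1f}x, efficiency: {eff:.1%}")
```

Under this definition, an efficiency close to 1.0 on 64 cores means the runtime shrinks almost in proportion to the number of cores, which is what coarse-grained schemes with little inter-process communication aim for.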

Keywords: coarse-grained parallelism; comparative metagenomics; multiprocessing architectures; sequence comparison.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Humans
  • Metagenome*
  • Metagenomics / methods*
  • Metagenomics / standards*
  • Sequence Alignment / standards*
  • Software*