ParallelStructure: a R package to distribute parallel runs of the population genetics program STRUCTURE on multi-core computers

PLoS One. 2013 Jul 29;8(7):e70651. doi: 10.1371/journal.pone.0070651. Print 2013.

Abstract

This software package provides an R-based framework for using multi-core computers to run analyses in the population genetics program STRUCTURE. It is aimed at STRUCTURE users who run numerous, repeated analyses and would benefit from an efficient script that automatically distributes STRUCTURE jobs among multiple processors. It also includes functions to divide analyses among combinations of populations within a single data set, without the need to manually produce multiple projects, as is currently the case in STRUCTURE. The package consists of two main functions, MPI_structure() and parallel_structure(), as well as an example data file. We compared computing times for the example data on two computer architectures and showed that using these functions can yield several-fold reductions in computation time. ParallelStructure is freely available at https://r-forge.r-project.org/projects/parallstructure/.
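
The general idea of distributing independent runs across processors can be illustrated with a minimal Python sketch. This is not the package's R code or its API: the names `build_jobs`, `run_jobs`, and `dummy_runner`, as well as the job fields, are hypothetical and chosen only for illustration. The sketch assumes each job is defined by a subset of populations, a number of clusters K, and a replicate number, and that jobs are independent of one another. A thread pool is used on the assumption that each worker would simply launch an external program and wait for it, so the placeholder runner below returns a label instead of invoking STRUCTURE.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_jobs(pop_sets, k_values, n_reps):
    """Create one job per (population subset, K, replicate) combination."""
    combos = product(pop_sets, k_values, range(1, n_reps + 1))
    return [
        {"id": job_id, "pops": tuple(pops), "k": k, "rep": rep}
        for job_id, (pops, k, rep) in enumerate(combos, start=1)
    ]


def run_jobs(jobs, runner, n_workers):
    """Run independent jobs on a pool of workers; results keep job order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(runner, jobs))


def dummy_runner(job):
    """Hypothetical placeholder: a real runner would start one external run here."""
    pops = ",".join(str(p) for p in job["pops"])
    return (job["id"], "pops=%s K=%d rep=%d" % (pops, job["k"], job["rep"]))


# Two population subsets x two values of K x two replicates = eight jobs.
jobs = build_jobs(pop_sets=[(1, 2), (1, 2, 3)], k_values=[1, 2], n_reps=2)
results = run_jobs(jobs, dummy_runner, n_workers=4)
```

Because the jobs share no state, the achievable speed-up in a scheme like this is bounded by the number of available cores and by the longest single run.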

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Computer Systems
  • Genetics, Population*
  • Internet
  • Microsatellite Repeats
  • Software*

Grants and funding

This work was funded by the Norwegian Research Council. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.