Mol Biol Evol. 2014 Jul;31(7):1929-36. doi: 10.1093/molbev/msu136. Epub 2014 Apr 16.

PopGenome: an efficient Swiss army knife for population genomic analyses in R.

Author information

1. Institute for Computer Science, Heinrich Heine University, Düsseldorf, Germany.
2. Centre for Research in Agricultural Genomics, Bellaterra, Spain.
3. Institute for Computer Science, Heinrich Heine University, Düsseldorf, Germany; Cluster of Excellence on Plant Sciences, Düsseldorf, Germany. Email: lercher@cs.uni-duesseldorf.de

Abstract

Although many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large data sets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing pipelines, hindering the development and application of new population genetics and genomics approaches. Here, we present PopGenome, a population genomics package for the R software environment (a de facto standard for statistical analyses). PopGenome can efficiently process genome-scale data as well as large sets of individual loci. It reads DNA alignments and single-nucleotide polymorphism (SNP) data sets in most common formats, including those used by the HapMap, 1000 human genomes, and 1001 Arabidopsis genomes projects. PopGenome also reads associated annotation files in GFF format, enabling users to easily define regions or classify SNPs based on their annotation; all analyses can also be applied to sliding windows. PopGenome offers a wide range of diverse population genetics analyses, including neutrality tests as well as statistics for population differentiation, linkage disequilibrium, and recombination. PopGenome is linked to Hudson's MS and Ewing's MSMS programs to assess statistical significance based on coalescent simulations. PopGenome's integration in R facilitates effortless and reproducible downstream analyses as well as the production of publication-quality graphics. Developers can easily incorporate new analysis methods into the PopGenome framework. PopGenome and R are freely available from CRAN (http://cran.r-project.org/) for all major operating systems under the GNU General Public License.

KEYWORDS:

population genomics; single-nucleotide polymorphisms; software

PMID: 24739305
PMCID: PMC4069620
DOI: 10.1093/molbev/msu136
[Indexed for MEDLINE]
Free PMC Article
