Bioinformatics. 2015 Jun 1;31(11):1708-15. doi: 10.1093/bioinformatics/btv070. Epub 2015 Feb 1.

CNOGpro: detection and quantification of CNVs in prokaryotic whole-genome sequencing data.

Author information

Section for Biostatistics and Epidemiology, Norwegian University of Life Sciences (NMBU), Oslo, Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Ås and Norwegian Institute of Public Health, Division of Epidemiology, 0403 Oslo, Norway.

Abstract

MOTIVATION:

The explosion of whole-genome sequencing (WGS) as a tool for mapping and understanding genomes has been accompanied by an equally massive proliferation of tools and pipelines for the analysis of DNA copy number variation (CNV). Most currently available tools are designed specifically for human genomes, with comparatively little literature devoted to CNVs in prokaryotic organisms. However, prokaryotic WGS data have several idiosyncrasies of their own. This work proposes a step-by-step approach for the detection and quantification of copy number variants specifically aimed at prokaryotes.

RESULTS:

After aligning WGS reads to a reference genome, we count the individual reads in a sliding window and normalize these counts for bias introduced by differences in GC content. We then investigate the coverage in two fundamentally different ways: (i) by employing a Hidden Markov Model and (ii) by repeated sampling with replacement (bootstrapping) on each individual gene. The latter bypasses the complex problem of breakpoint determination. To demonstrate our method, we apply it to real and simulated WGS data and benchmark it against two popular methods for CNV detection. In some cases, the proposed methodology offers a significant improvement in accuracy over other current methods.
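The counting, GC normalization and per-gene bootstrapping steps described above can be illustrated with a minimal Python sketch. This is not the CNOGpro implementation (which is written in R), and it does not cover the Hidden Markov Model. The function names, the use of non-overlapping windows, the median-ratio GC correction and the percentile confidence interval are all assumptions chosen for illustration.

```python
import random
from statistics import mean, median


def window_counts(read_starts, genome_length, window):
    """Count 0-based read start positions in consecutive, non-overlapping windows."""
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        counts[pos // window] += 1
    return counts


def gc_normalize(counts, gc_percent):
    """Rescale each window by (global median) / (median of windows with the same GC%).

    Assumed correction scheme; gc_percent holds one integer GC percentage per window.
    """
    overall = median(counts)
    by_gc = {}
    for c, gc in zip(counts, gc_percent):
        by_gc.setdefault(gc, []).append(c)
    gc_median = {gc: median(vals) for gc, vals in by_gc.items()}
    return [
        c * overall / gc_median[gc] if gc_median[gc] > 0 else 0.0
        for c, gc in zip(counts, gc_percent)
    ]


def bootstrap_copy_number(gene_values, baseline, n_boot=1000, seed=0):
    """Estimate a gene's copy number as mean coverage / baseline coverage.

    Resamples the gene's window values with replacement and returns
    (point estimate, lower bound, upper bound) of an approximate 95% interval.
    """
    rng = random.Random(seed)
    n = len(gene_values)
    estimates = sorted(
        mean(rng.choices(gene_values, k=n)) / baseline for _ in range(n_boot)
    )
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return mean(gene_values) / baseline, lower, upper
```

Because the bootstrap works on the windows inside a known gene's coordinates, it needs no estimate of where a copy number change begins or ends, which is the sense in which the per-gene approach sidesteps breakpoint determination.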

AVAILABILITY AND IMPLEMENTATION:

CNOGpro is written entirely in the R programming language and is available from the CRAN repository (http://cran.r-project.org) under the GNU General Public License.

PMID: 25644268
DOI: 10.1093/bioinformatics/btv070
[Indexed for MEDLINE]
