PLoS Comput Biol. 2016 Apr 21;12(4):e1004873. doi: 10.1371/journal.pcbi.1004873. eCollection 2016 Apr.

CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing.

Talevich E(1,2,3), Shain AH(1,2,3), Botton T(1,2,3), Bastian BC(1,2,3).

Author information

1. Department of Dermatology, University of California, San Francisco, San Francisco, California, United States of America.
2. Department of Pathology, University of California, San Francisco, San Francisco, California, United States of America.
3. Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, California, United States of America.

Abstract

Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this approach has limitations in the case of targeted re-sequencing, which leaves gaps in coverage between the regions chosen for enrichment and introduces biases related to the efficiency of target capture and library preparation. We present a method for copy number detection, implemented in the software package CNVkit, that uses both the targeted reads and the nonspecifically captured off-target reads to infer copy number evenly across the genome. This combination achieves both exon-level resolution in targeted regions and sufficient resolution in the larger intronic and intergenic regions to identify copy number changes. In particular, we successfully inferred copy number at equivalent to 100-kilobase resolution genome-wide from a platform targeting as few as 293 genes. After normalizing read counts to a pooled reference, we evaluated and corrected for three sources of bias that explain most of the extraneous variability in the sequencing read depth: GC content, target footprint size and spacing, and repetitive sequences. We compared the performance of CNVkit to copy number changes identified by array comparative genomic hybridization. We packaged the components of CNVkit so that it is straightforward to use and provides visualizations, detailed reporting of significant features, and export options for integration into existing analysis pipelines. CNVkit is freely available from https://github.com/etal/cnvkit.
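The read-depth idea in the abstract (normalize per-bin depth against a pooled reference, then remove a systematic bias such as GC content) can be illustrated with a minimal sketch. This is not CNVkit's implementation or API: the function names, the pseudocount, the median-centering step, and the stratified-median GC correction are all assumptions chosen for brevity, and the other two biases named above (target footprint size and spacing, repetitive sequences) are not modeled.

```python
import math
from statistics import median


def centered_log2_depth(depths, pseudocount=1.0):
    """Return median-centered log2 read depths for one sample.

    The pseudocount (an assumption of this sketch) avoids log2(0)
    for bins with no reads.
    """
    logs = [math.log2(d + pseudocount) for d in depths]
    center = median(logs)
    return [x - center for x in logs]


def pooled_reference(normal_samples):
    """Per-bin median of centered log2 depths across normal samples."""
    centered = [centered_log2_depth(s) for s in normal_samples]
    return [median(column) for column in zip(*centered)]


def correct_gc_bias(log2_values, gc_fractions, n_strata=4):
    """Subtract the median log2 value of each GC-content stratum.

    A deliberately crude stand-in for a smoothed GC trend: bins are
    grouped into n_strata equal-width GC ranges.
    """
    strata = [min(int(gc * n_strata), n_strata - 1) for gc in gc_fractions]
    offsets = {
        s: median(v for v, t in zip(log2_values, strata) if t == s)
        for s in set(strata)
    }
    return [v - offsets[t] for v, t in zip(log2_values, strata)]


def log2_copy_ratios(sample_depths, normal_samples, gc_fractions):
    """Log2 ratio of a sample to the pooled reference, GC-corrected."""
    reference = pooled_reference(normal_samples)
    sample = centered_log2_depth(sample_depths)
    ratios = [s - r for s, r in zip(sample, reference)]
    return correct_gc_bias(ratios, gc_fractions)
```

In this sketch a bin with twice the reference depth comes out near a log2 ratio of 1, and a bin matching the reference comes out near 0. Note that a stratified-median correction like this one would also absorb a true copy number change that dominates a whole GC stratum, which is one reason a real tool needs a more careful model.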

PMID: 27100738
PMCID: PMC4839673
DOI: 10.1371/journal.pcbi.1004873
[Indexed for MEDLINE]
Free PMC Article
