Bioinformatics. 2012 Sep 15;28(18):2357-65. doi: 10.1093/bioinformatics/bts448. Epub 2012 Jul 13.

A regression model for estimating DNA copy number applied to capture sequencing data.

Author information

Department of Bioinformatics and Statistics, The Netherlands Cancer Institute, Amsterdam, The Netherlands.



Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus.


We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism (SNP) genotyping platform. When we compared our results with log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data.
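The contrast between regressing test coverage on control coverage and averaging log-ratios can be illustrated with a minimal sketch. This is not the published implementation: the abstract does not specify the intercept, noise model, or segmentation, so the least-squares fit through the origin, the function names, and the coverage values below are all assumptions made for illustration.

```python
import math

def slope_through_origin(control, test):
    """Least-squares slope b for test ~ b * control (no intercept; an assumption)."""
    sxx = sum(c * c for c in control)
    if sxx == 0:
        raise ValueError("control coverage is zero at every locus")
    sxy = sum(c * t for c, t in zip(control, test))
    return sxy / sxx

def mean_log2_ratio(control, test):
    """Average per-locus log2(test/control); -inf wherever test coverage is zero.

    Assumes control coverage is positive at every locus.
    """
    ratios = [math.log2(t / c) if t > 0 else float("-inf")
              for c, t in zip(control, test)]
    return sum(ratios) / len(ratios)

# Hypothetical coverage at four loci of one region (made-up numbers).
control = [100, 200, 50, 400]
single_copy_loss = [50, 100, 25, 200]   # test coverage halved
complete_deletion = [0, 0, 0, 0]        # no reads in the test sample

slope_loss = slope_through_origin(control, single_copy_loss)
slope_del = slope_through_origin(control, complete_deletion)
log_del = mean_log2_ratio(control, complete_deletion)

print(slope_loss, slope_del, log_del)
```

In this sketch the slope remains a finite number for the completely deleted region, whereas the mean log-ratio is negative infinity. The slope is also dominated by high-coverage loci, which is one way of using the total coverage that a per-locus ratio discards.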


The algorithm is implemented in C and R and the code can be downloaded from



Supplementary data are available at Bioinformatics online.

[Indexed for MEDLINE]
