Genome Res. 2012 Feb;22(2):362-74. doi: 10.1101/gr.122564.111. Epub 2011 Nov 16.

Calling amplified haplotypes in next generation tumor sequence data.

Author information

  • Department of Biomedical Informatics, Columbia University, New York, New York 10032, USA.

Abstract

During tumor initiation and progression, cancer cells acquire a selective advantage, allowing them to outcompete their normal counterparts. Identification of the genetic changes that underlie these tumor-acquired traits can provide deeper insights into the biology of tumorigenesis. Regions of copy number alteration and germline DNA variants are some of the elements subject to selection during tumor evolution. Integrated examination of inherited variation and somatic alterations holds the potential to reveal specific nucleotide alleles that a tumor "prefers" to have amplified. Next-generation sequencing of tumor and matched normal tissues provides a high-resolution platform to identify and analyze such somatic amplicons. Within an amplicon, an informative (e.g., heterozygous) site whose allele ratio deviates from 1:1 may indicate selection of the overrepresented allele. A naive approach examines the reads for each heterozygous site in isolation; however, this ignores valuable linkage information available across sites. We, therefore, present a novel hidden Markov model-based method, Haplotype Amplification in Tumor Sequences (HATS), that analyzes tumor and normal sequence data, along with training data for phasing purposes, to infer amplified alleles and haplotypes in regions of copy number gain. Our method is designed to handle rare variants and biases in read data. We assess the performance of HATS using simulated amplified regions generated from varying copy number and coverage levels, followed by amplicons in real data. We demonstrate that HATS infers the amplified alleles more accurately than does the naive approach, especially at low to intermediate coverage levels and in cases (including high coverage) possessing stromal contamination or allelic bias.
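
The naive per-site baseline described in the abstract can be illustrated with a short sketch. It is not the HATS method and not the authors' code: the abstract does not say which statistic the naive approach uses, so the exact binomial test against a 1:1 ratio, the `alpha` threshold, and the function names `binomial_two_sided_p` and `naive_call` are all assumptions made for illustration.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one (assumed test; not taken from the paper).
    """
    def pmf(i: int) -> float:
        # Work in log space so that deep coverage does not overflow.
        log_choose = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return exp(log_choose + i * log(p) + (n - i) * log(1 - p))

    observed = pmf(k) * (1 + 1e-7)  # small tolerance for float ties
    total = sum(q for q in (pmf(i) for i in range(n + 1)) if q <= observed)
    return min(1.0, total)


def naive_call(ref_reads: int, alt_reads: int, alpha: float = 0.05) -> str:
    """Call the amplified allele at one heterozygous site, in isolation.

    Returns "ref" or "alt" when the tumor read counts deviate
    significantly from 1:1, and "no_call" otherwise. Linkage to
    neighboring sites is ignored, which is the limitation the
    abstract attributes to the naive approach.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return "no_call"
    if binomial_two_sided_p(ref_reads, n) >= alpha:
        return "no_call"
    return "ref" if ref_reads > alt_reads else "alt"


if __name__ == "__main__":
    # Hypothetical read counts at three heterozygous sites in an amplicon.
    for ref, alt in [(30, 10), (6, 4), (3, 12)]:
        print(ref, alt, naive_call(ref, alt))
```

Because each site is tested on its own, a site with few reads rarely reaches significance even when its neighbors clearly favor one haplotype. That matches the low-coverage setting in which the abstract reports the largest gain from modeling linkage across sites.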

PMID: 22090379 [PubMed - indexed for MEDLINE]
PMCID: PMC3266043 (free PMC article)
