Nucleic Acids Res. 2011 Aug;39(15):e103. doi: 10.1093/nar/gkr425. Epub 2011 Jun 6.

Systematic bias in high-throughput sequencing data and its correction by BEADS.

Author information

  • The Gurdon Institute and Department of Genetics, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.

Abstract

Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina's Genome Analyser platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate due to sample-to-sample variation in bias. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be done routinely prior to data interpretation and downstream analyses.
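
The abstract names GC content and mappability as two of the bias factors but does not spell out how the correction is computed. The sketch below illustrates one generic way such corrections could be applied to read counts in fixed genomic windows. It is not the published BEADS implementation: the function names, the windowing, the GC stratification and the `min_map` cutoff are all assumptions made for illustration, and the regional-bias step is left out because the abstract gives no detail on it.

```python
from collections import defaultdict
from statistics import mean


def gc_correct(counts, gc_fractions, n_strata=10):
    """Rescale window counts so every GC stratum has the same mean count.

    Hypothetical helper: windows are grouped by GC fraction into
    n_strata equal-width strata (an assumed scheme, not from the paper).
    """
    def stratum(gc):
        return min(int(gc * n_strata), n_strata - 1)

    overall = mean(counts)
    by_stratum = defaultdict(list)
    for count, gc in zip(counts, gc_fractions):
        by_stratum[stratum(gc)].append(count)
    stratum_mean = {key: mean(vals) for key, vals in by_stratum.items()}

    corrected = []
    for count, gc in zip(counts, gc_fractions):
        m = stratum_mean[stratum(gc)]
        # A stratum with no reads at all carries no usable signal.
        corrected.append(count * overall / m if m > 0 else 0.0)
    return corrected


def mappability_correct(counts, mappability, min_map=0.25):
    """Divide each count by the mappable fraction of its window.

    Windows whose mappable fraction is below min_map (an assumed cutoff)
    are masked with None instead of being inflated.
    """
    return [
        count / m if m >= min_map else None
        for count, m in zip(counts, mappability)
    ]


if __name__ == "__main__":
    raw = [10, 20, 30, 60]            # toy read counts per window
    gc = [0.2, 0.2, 0.6, 0.6]         # toy GC fraction per window
    mapp = [1.0, 0.5, 0.1, 1.0]       # toy mappable fraction per window
    print(mappability_correct(gc_correct(raw, gc), mapp))
```

Masking poorly mappable windows rather than dividing by a very small fraction keeps a handful of reads from being blown up into an apparent peak. Whether BEADS treats such regions this way is not stated in the abstract.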

PMID: 21646344 [PubMed - indexed for MEDLINE]
PMCID: PMC3159482 (Free PMC Article)
