Nucleic Acids Res. 2014 May;42(9):e74. doi: 10.1093/nar/gku178. Epub 2014 Mar 5.

Impact of sequencing depth in ChIP-seq experiments.

Author information

  • 1 Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital & Harvard Medical School, Boston, MA 02115, USA.
  • 2 Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA.
  • 3 Division of Genetics, Brigham and Women's Hospital & Harvard Medical School, Boston, MA 02115, USA; Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA.
  • 4 Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
  • 5 Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
  • 6 Division of Genetics, Brigham and Women's Hospital & Harvard Medical School, Boston, MA 02115, USA.
  • 7 Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital & Harvard Medical School, Boston, MA 02115, USA; Informatics Program, Children's Hospital, Boston, MA 02115, USA. peter_park@harvard.edu

Abstract

In a chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiment, an important consideration in experimental design is the minimum number of sequenced reads required to obtain statistically significant results. We present an extensive evaluation of the impact of sequencing depth on identification of enriched regions for key histone modifications (H3K4me3, H3K36me3, H3K27me3 and H3K9me2/me3) using deep-sequenced datasets in human and fly. We propose to define sufficient sequencing depth as the number of reads at which detected enrichment regions increase <1% for an additional million reads. Although the required depth depends on the nature of the mark and the state of the cell in each experiment, we observe that sufficient depth is often reached at <20 million reads for fly. For human, there are no clear saturation points for the examined datasets, but our analysis suggests 40-50 million reads as a practical minimum for most marks. We also devise a mathematical model to estimate the sufficient depth and total genomic coverage of a mark. Lastly, we find that the five algorithms tested do not agree well for broad enrichment profiles, especially at lower depths. Our findings suggest that sufficient sequencing depth and an appropriate peak-calling algorithm are essential for ensuring robustness of conclusions derived from ChIP-seq data.
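The sufficient-depth criterion proposed in the abstract (enriched regions growing by less than 1% per additional million reads) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `sufficient_depth`, the piecewise estimate of growth between successive subsampled depths, and the toy numbers are hypothetical and are not taken from the paper, whose exact procedure is not described in the abstract.

```python
from typing import Optional, Sequence


def sufficient_depth(
    depths_millions: Sequence[float],
    enriched_regions: Sequence[float],
    threshold: float = 0.01,
) -> Optional[float]:
    """Return the first depth (in millions of reads) at which the relative
    gain in detected enriched regions per additional million reads drops
    below `threshold`, or None if that never happens in the given data.

    Assumption: growth is estimated piecewise between consecutive
    subsampled depths, which must be strictly increasing.
    """
    if len(depths_millions) != len(enriched_regions):
        raise ValueError("depths and region counts must have equal length")

    for i in range(len(depths_millions) - 1):
        d0, d1 = depths_millions[i], depths_millions[i + 1]
        n0, n1 = enriched_regions[i], enriched_regions[i + 1]
        if d1 <= d0:
            raise ValueError("depths must be strictly increasing")
        if n0 <= 0:
            # A relative gain is undefined when nothing was detected yet.
            continue
        # Fractional increase in enriched regions per extra million reads.
        gain_per_million = (n1 - n0) / n0 / (d1 - d0)
        if gain_per_million < threshold:
            return d0
    return None


# Toy, invented numbers (not from the paper): region counts at 5M-read steps.
depths = [5, 10, 15, 20, 25]
regions = [1000, 1500, 1700, 1760, 1790]
print(sufficient_depth(depths, regions))  # prints 15
```

In this toy series the gain between 15 and 20 million reads is about 0.7% per million, so 15 million is reported. A result of `None` corresponds to the situation the abstract describes for the human datasets, where no clear saturation point is reached within the sequenced depth.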

© The Author(s) 2014. Published by Oxford University Press.

PMID: 24598259 [PubMed - indexed for MEDLINE]
PMCID: PMC4027199 (Free PMC Article)