From reads to regions: a Bioconductor workflow to detect differential binding in ChIP-seq data

Aaron T L Lun; Gordon K Smyth

doi:10.12688/f1000research.7016.2

F1000Res. 2015 Oct 16:4:1080. doi: 10.12688/f1000research.7016.2. eCollection 2015.

Authors

Aaron T L Lun¹, Gordon K Smyth²

Affiliations

¹ The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia; Department of Medical Biology, The University of Melbourne, Melbourne, Australia.
² The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia; Department of Mathematics and Statistics, The University of Melbourne, Melbourne, Australia.

Abstract

Chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq) is widely used to identify the genomic binding sites for a protein of interest. Most conventional approaches to ChIP-seq data analysis involve the detection of the absolute presence (or absence) of a binding site. However, an alternative strategy is to identify changes in the binding intensity between two biological conditions, i.e., differential binding (DB). This may yield more relevant results than conventional analyses, as changes in binding can be associated with the biological difference being investigated. The aim of this article is to facilitate the implementation of DB analyses by comprehensively describing a computational workflow for the detection of DB regions from ChIP-seq data. The workflow is based primarily on R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, from alignment of read sequences to interpretation and visualization of putative DB regions. In particular, detection of DB regions will be conducted using the counts for sliding windows from the csaw package, with statistical modelling performed using methods in the edgeR package. Analyses will be demonstrated on real histone mark and transcription factor data sets. This will provide readers with practical usage examples that can be applied in their own studies.

Keywords: ChIP-seq; Differential binding; bioinformatics; genomics; read alignment; visualization.

Grants and funding

National Health and Medical Research Council (Program Grant 1054618 to G.K.S., Fellowship to G.K.S.); Victorian State Government Operational Infrastructure Support; Australian Government NHMRC IRIIS.