BMC Bioinformatics. 2016 Nov 2. [Epub ahead of print]

Evaluating tools for transcription factor binding site prediction.

Author information

1. Institute of Structural and Molecular Biology, Division of Biosciences, University College London, Darwin Building, Gower Street, London, WC1E 6BT, UK.
2. Institute of Structural and Molecular Biology, Division of Biosciences, University College London, Darwin Building, Gower Street, London, WC1E 6BT, UK. andrew@bioinf.org.uk.

Abstract

BACKGROUND:

Binding of transcription factors to transcription factor binding sites (TFBSs) is key to the mediation of transcriptional regulation. Information on experimentally validated functional TFBSs is limited and consequently there is a need for accurate prediction of TFBSs for gene annotation and in applications such as evaluating the effects of single nucleotide variations in causing disease. TFBSs are generally recognized by scanning a position weight matrix (PWM) against DNA using one of a number of available computer programs. Thus we set out to identify the best tools that can be used locally (and are therefore suitable for large-scale analyses) for creating PWMs from high-throughput ChIP-Seq data and for scanning them against DNA.
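To make the idea of scanning a PWM against DNA concrete, the following is a minimal Python sketch under stated assumptions. The count matrix, pseudocount, uniform background, and score threshold are all hypothetical values chosen for illustration, and the function names `build_pwm` and `scan` are not taken from any tool in the study. The sketch scores only the forward strand with plain log-odds sums and does not reproduce how FIMO or any other evaluated tool scores or ranks matches.

```python
import math

# Hypothetical base counts for a 4-bp motif, one dict per motif position.
# These numbers are made up for illustration; they are not from the paper.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights.

    Assumes a uniform background frequency for all four bases.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def scan(pwm, seq, threshold):
    """Slide the PWM along seq (forward strand only).

    Returns a list of (start, window, score) for windows scoring >= threshold.
    Windows containing characters other than A, C, G, T are skipped.
    """
    width = len(pwm)
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue
        score = sum(pwm[pos][base] for pos, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = build_pwm(COUNTS)
hits = scan(pwm, "TTACGTGGACGTAA", threshold=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

In practice the threshold matters as much as the matrix: a low cutoff reports many spurious matches in long sequences, while a high cutoff misses weaker but functional sites, which is one reason a comparative evaluation of scanning tools is useful.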

RESULTS:

Using ENCODE ChIP-Seq data, we evaluated a set of de novo motif discovery tools that could be downloaded and installed locally, and showed that rGADEM was the best-performing tool. TFBS prediction tools used to scan PWMs against DNA fall into two classes: those that predict individual TFBSs and those that identify clusters of TFBSs. Our evaluation showed that FIMO performed best in the first class and MCAST in the second.

CONCLUSIONS:

Selection of the best-performing tools for generating PWMs from ChIP-Seq data and for scanning PWMs against DNA has the potential to improve prediction of precise transcription factor binding sites within regions identified by ChIP-Seq experiments for gene finding, understanding regulation and in evaluating the effects of single nucleotide variations in causing disease.

KEYWORDS:

Motif discovery; Motif scanning tools; PWMs; Performance evaluation

PMID:
27806697
DOI:
10.1186/s12859-016-1298-9
Free full text
