Sci Rep. 2017 Jun 22;7(1):4071. doi: 10.1038/s41598-017-03199-6.

Predicting conformational ensembles and genome-wide transcription factor binding sites from DNA sequences.

Author information

National Institutes of Biomedical Innovation Health and Nutrition, 7-6-8, Saito-Asagi, Ibaraki, Osaka, 5670085, Japan.
Faculty of Biology, Medicine and Health, Michael Smith Building, The University of Manchester, Dover Street, Manchester, M13 9PT, UK.
Department of Biology, Southern University of Science and Technology of China, Shenzhen, 518055, China.
World Premier International (WPI) Immunology Frontier Research Center (IFReC), Osaka University, 3-1 Yamadaoka, Suita, 565-0871, Osaka, Japan.
Centro de Biología Molecular Severo Ochoa, CSIC/Universidad Autónoma de Madrid, 28049, Madrid, Spain.
Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, United Kingdom.
Molecular Modeling and Simulation (MMS) Group, National Institutes for Quantum and Radiological Science and Technology, 8-1-7, Umemidai, Kizugawa, Kyoto, 619-0215, Japan.
National Cancer Institute, Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick, Maryland, USA.
Department of Biochemistry and Human Genetics, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel.
School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Mehrauli Road, New Delhi, 110067, India.


DNA shape is emerging as an important determinant of transcription factor (TF) binding beyond the DNA sequence alone. DNAshape, the only tool for large-scale DNA shape estimates, was derived from Monte Carlo simulations and predicts four broad, static DNA shape features: Propeller twist, Helical twist, Minor groove width and Roll. The contributions of other shape features, such as Shift, Slide and Opening, cannot be evaluated using DNAshape. Here, we report a novel method, DynaSeq, which predicts molecular dynamics-derived ensembles of a more exhaustive set of DNA shape features. We compared the DNAshape and DynaSeq predictions for the common features and applied both to predict the genome-wide binding sites of 1312 TFs available from protein interaction quantification (PIQ) data. The results indicate good agreement between the two methods for the common shape features and point to advantages in using DynaSeq. Predictive models employing ensembles from individual conformational parameters revealed that base-pair opening, known to be important in strand separation, was the best predictor of transcription factor binding sites (TFBS), followed by the features employed by DNAshape. Of note, TFBS could be predicted not only from the features at the target motif sites, but also from those as far as 200 nucleotides away from the motif.
