Bioinformatics. 2019 Aug 30. pii: btz679. doi: 10.1093/bioinformatics/btz679. [Epub ahead of print]

Analysis of several key factors influencing deep learning-based inter-residue contact prediction.

Author information

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA.
Department of Mathematics and Computer Science, University of Missouri, St. Louis, MO, USA.



Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated.


We analyzed the results of our three deep learning-based contact prediction methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT, and MULTICOM-NOVEL) in the CASP13 experiment and identified several key factors (i.e. deep learning technique, multiple sequence alignment, distance distribution prediction, and domain-based contact integration) that influenced contact prediction accuracy. We compared our convolutional neural network (CNN)-based contact prediction methods with three co-evolution-based methods on 75 CASP13 targets consisting of 108 domains. We demonstrated that the CNN-based multi-distance approach was able to leverage global co-evolutionary coupling patterns composed of multiple correlated contacts, yielding more accurate contact prediction than the local co-evolution-based methods and a substantial increase in precision of 19.2 percentage points. We also tested different alignment methods and domain-based contact prediction with the deep learning contact predictors. The comparison of the three methods showed that deeper sequence alignments and the integration of domain-based contact prediction with full-length contact prediction improved the performance of contact prediction. Moreover, we demonstrated that domain-based contact prediction, based on a novel ab initio approach that parses domains from multiple sequence alignments alone without using known protein structures, was a simple, fast way to improve contact prediction. Finally, we showed that predicting the distribution of inter-residue distances over multiple distance intervals could capture more structural information and improve binary contact prediction.
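The last point, turning a predicted distance distribution into a binary contact score and then measuring precision, can be illustrated with a minimal sketch. It is not the MULTICOM implementation: the function names, the bin edges, and the toy scores are hypothetical, and it assumes only the conventional CASP definitions (a contact is a residue pair closer than 8 Å, and a long-range pair has a sequence separation of at least 24).

```python
from typing import Dict, Sequence, Set, Tuple

Pair = Tuple[int, int]


def contact_prob_from_bins(bin_probs: Sequence[float],
                           bin_upper_edges: Sequence[float],
                           cutoff: float = 8.0) -> float:
    """Collapse a per-pair distance distribution into one contact probability.

    Sums the probability mass of every distance bin whose upper edge is at
    or below the cutoff (8 angstroms by convention). Bin edges are an
    assumption of this sketch, not taken from the paper.
    """
    return sum(p for p, edge in zip(bin_probs, bin_upper_edges) if edge <= cutoff)


def top_k_precision(scores: Dict[Pair, float],
                    true_contacts: Set[Pair],
                    k: int,
                    min_separation: int = 24) -> float:
    """Precision of the k highest-scoring pairs with |i - j| >= min_separation."""
    # Keep only pairs that are far enough apart in sequence.
    candidates = [(prob, pair) for pair, prob in scores.items()
                  if abs(pair[0] - pair[1]) >= min_separation]
    # Highest predicted probability first.
    candidates.sort(reverse=True)
    top = candidates[:k]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in true_contacts)
    return hits / len(top)


# Hypothetical distribution over four bins: <=6, 6-8, 8-12, >12 angstroms.
example_prob = contact_prob_from_bins([0.1, 0.3, 0.4, 0.2],
                                      [6.0, 8.0, 12.0, float("inf")])

# Toy scores: pair (2, 5) is short-range and is excluded from the ranking.
toy_scores = {(0, 30): 0.9, (1, 40): 0.8, (2, 5): 0.95, (3, 50): 0.1}
toy_truth = {(0, 30), (3, 50)}
example_precision = top_k_precision(toy_scores, toy_truth, k=2)
```

In CASP-style evaluation, k is usually tied to the sequence length L (for example the top L/5 long-range pairs); the fixed k=2 here only keeps the toy example small.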



Supplementary data are available at Bioinformatics online.
