Homology-based loop modeling yields more complete crystallographic protein structures

IUCrJ. 2018 Aug 8;5(Pt 5):585-594. doi: 10.1107/S2052252518010552. eCollection 2018 Sep 1.

Abstract

Inherent protein flexibility, poor or low-resolution diffraction data or poorly defined electron-density maps often inhibit the building of complete structural models during X-ray structure determination. However, recent advances in crystallographic refinement and model building often allow completion of previously missing parts. This paper presents algorithms that identify regions missing in a given model but present in homologous structures in the Protein Data Bank (PDB), and 'graft' these regions of interest. These new regions are refined and validated in a fully automated procedure. Including these developments in the PDB-REDO pipeline has enabled the building of 24 962 missing loops in the PDB. The models and the automated procedures are publicly available through the PDB-REDO databank and webserver. More complete protein structure models enable not only a higher quality public archive but also a better understanding of protein function, better comparison between homologous structures and more complete data mining in structural bioinformatics projects.
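The identification step described in the abstract can be illustrated with a minimal sketch. This is not the PDB-REDO implementation; it only shows the idea of finding residue ranges that are unmodelled in a target but modelled in a homolog. It assumes, for simplicity, that both structures already share one residue numbering (in practice a sequence alignment would be needed), and the function names and the `flank` anchor requirement are hypothetical choices made for this example. Superposition, grafting of coordinates, refinement and validation are not shown.

```python
from typing import List, Set, Tuple


def missing_segments(observed: Set[int], first: int, last: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) residue ranges in first..last absent from `observed`."""
    gaps: List[Tuple[int, int]] = []
    start = None
    for resnum in range(first, last + 1):
        if resnum not in observed:
            if start is None:
                start = resnum  # a new gap opens here
        elif start is not None:
            gaps.append((start, resnum - 1))  # the gap closed at the previous residue
            start = None
    if start is not None:
        gaps.append((start, last))  # gap runs to the end of the sequence
    return gaps


def graftable_segments(
    target_observed: Set[int],
    homolog_observed: Set[int],
    first: int,
    last: int,
    flank: int = 2,  # hypothetical: anchor residues required on each side of a gap
) -> List[Tuple[int, int]]:
    """Return target gaps that are fully modelled in the homolog.

    A gap qualifies only if `flank` residues on both sides are modelled in both
    structures, so that a graft could be anchored. Terminal gaps therefore never
    qualify in this sketch.
    """
    candidates: List[Tuple[int, int]] = []
    for start, end in missing_segments(target_observed, first, last):
        loop = range(start, end + 1)
        anchors = list(range(start - flank, start)) + list(range(end + 1, end + 1 + flank))
        loop_in_homolog = all(r in homolog_observed for r in loop)
        anchors_shared = all(r in target_observed and r in homolog_observed for r in anchors)
        if loop_in_homolog and anchors_shared:
            candidates.append((start, end))
    return candidates


# Toy data: the target lacks residues 8-10 and 15-16; the homolog lacks only residue 16.
target = set(range(1, 21)) - {8, 9, 10, 15, 16}
homolog = set(range(1, 21)) - {16}
candidates = graftable_segments(target, homolog, 1, 20)
```

With this toy data only the 8-10 gap is a grafting candidate, because residue 16 is also unmodelled in the homolog.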

Keywords: PDB-REDO; crystallography; loop building; model completion; structural re-building.

Grants and funding

This work was funded by the Netherlands Organization for Scientific Research (NWO) grant 723.013.003 to Robbie P. Joosten, and by European Commission Horizon 2020 programme grants 675858 and 653706.