Nucleic Acids Res. 2019 Nov 16. pii: gkz967. doi: 10.1093/nar/gkz967. [Epub ahead of print]

Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation.

Author information

1 Institute of Structural and Molecular Biology, UCL, Gower Street, London WC1E 6BT, UK.
2 MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.
3 Department of Biochemistry, University of Cambridge, Old Addenbrooke's Site, 80 Tennis Court Road, Cambridge CB2 0QH, UK.
4 Department of Computer Science, UCL, Gower Street, London WC1E 6BT, UK.
5 The Francis Crick Institute, 1 Midland Rd, London NW1 1AT, UK.
6 European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
7 Centre for Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK.
8 Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor 43600, Malaysia.

Abstract

Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including instant validation of data and no longer needing to synchronise releases between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements make it easier to open Genome3D up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and the annotation algorithms.

PMID: 31733063
DOI: 10.1093/nar/gkz967
