Nucleic Acids Res. 2015 Jan;43(Database issue):D227-33. doi: 10.1093/nar/gku1041. Epub 2014 Nov 20.

The SUPERFAMILY 1.75 database in 2014: a doubling of data.

Author information

1. Computer Science, University of Bristol, Bristol, BS8 1UB, UK. Matt.Oates@bristol.ac.uk
2. Computer Science, University of Bristol, Bristol, BS8 1UB, UK.
3. Computer Science, University of Bristol, Bristol, BS8 1UB, UK; Medical Research Council Clinical Sciences Centre, Faculty of Medicine, Imperial College London, Hammersmith Hospital, London, UK.
4. Computer Science, University of Bristol, Bristol, BS8 1UB, UK; e-Therapeutics plc, 17 Blenheim Office Park, Long Hanborough, Oxfordshire, OX29 8LN, UK.
5. Computer Science, University of Bristol, Bristol, BS8 1UB, UK; Bristol Centre for Complexity Sciences, University of Bristol, Bristol, UK.

Abstract

We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from the 1400 proteomes reported in 2010 to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded, daily-updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but is now updated continually as new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence and, in the case of whole genomes, with enrichment analysis against a taxonomically defined background.

PMID: 25414345
PMCID: PMC4383889
DOI: 10.1093/nar/gku1041
[Indexed for MEDLINE]
Free PMC Article
