Protein Sci. Feb 1998; 7(2): 233–242.
PMCID: PMC2143930

Domain assignment for protein structures using a consensus approach: characterization and analysis.


A consensus approach for the assignment of structural domains in proteins is presented. The approach combines a number of previously published algorithms and takes advantage of the elevated accuracy obtained when assignments from the individual algorithms are in agreement. The consensus approach is tested on a data set of 55 protein chains, for which domain assignments from four automated methods were known and for which crystallographers' assignments had been reported in the literature. Accuracy was found to increase in this test from 72% using individual algorithms to 100% when all four methods were in agreement. However, a consensus prediction using all four methods was only possible for 52% of the data set. The consensus approach [using three publicly available domain assignment algorithms (PUU, DETECTIVE, DOMAK)] was then used to make domain assignments for a data set of 787 protein chains from the Protein Data Bank. Analysis of the assignments showed that 55.7% of assignments could be made automatically and that, of these, 13.5% were multi-domain proteins. Of the remaining 44.3% that could not be assigned by the consensus procedure, 90.4% had their domain boundaries assigned correctly by at least one of the algorithms. Once identified, these domains were analyzed for trends in their size and secondary structure class. In addition, the discontinuity of each domain along the protein chain was considered.
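
The idea of accepting an assignment only when the individual algorithms agree can be illustrated with a minimal Python sketch. It is not the procedure used in the article: the abstract does not state the agreement criterion, so the residue-overlap score, the 0.85 threshold, the function names, and the example residue ranges below are all assumptions made for illustration. Each method's output is represented as a list of domains, and each domain as a set of residue numbers, which also accommodates discontinuous domains.

```python
from itertools import combinations, permutations


def agreement(a, b):
    """Fraction of residues placed in matching domains by assignments a and b.

    Each assignment is a list of frozensets of residue numbers, one per domain.
    Assignments with different domain counts score 0.0 (hypothetical criterion).
    """
    if len(a) != len(b):
        return 0.0
    total = len(set().union(*a) | set().union(*b))
    if total == 0:
        return 0.0
    # Domain order is arbitrary, so try every pairing and keep the best one.
    best = max(
        sum(len(x & y) for x, y in zip(a, perm))
        for perm in permutations(b)
    )
    return best / total


def consensus(assignments, threshold=0.85):
    """Return one assignment if every pair of methods agrees, else None.

    `assignments` maps a method name to its domain assignment.
    The threshold is an illustrative assumption, not a published value.
    """
    for first, second in combinations(assignments, 2):
        if agreement(assignments[first], assignments[second]) < threshold:
            return None  # no consensus: the chain needs manual inspection
    # All methods agree; report the first method's boundaries.
    return next(iter(assignments.values()))


# Made-up two-domain chain of 200 residues (illustrative data only).
example = {
    "PUU": [frozenset(range(1, 101)), frozenset(range(101, 201))],
    "DETECTIVE": [frozenset(range(1, 96)), frozenset(range(96, 201))],
    "DOMAK": [frozenset(range(101, 201)), frozenset(range(1, 101))],
}
agreed = consensus(example)

# Same chain, but one method reports a single domain: no consensus.
disputed = dict(example, DOMAK=[frozenset(range(1, 201))])
not_agreed = consensus(disputed)
```

In this sketch, a chain for which `consensus` returns `None` corresponds to the cases the abstract describes as not assignable by the consensus procedure. Trying every pairing of domains is only practical for chains with a handful of domains.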



Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society

