Copyright © 2008 The Author(s)

The Degradome database: mammalian proteases and diseases of proteolysis

Departamento de Bioquímica y Biología Molecular, Facultad de Medicina, Instituto Universitario de Oncología, Universidad de Oviedo, 33006-Oviedo, Spain

*To whom correspondence should be addressed. Tel: +34 985 104201; Fax: +34 985 103564; Email: clo/at/uniovi.es

Received July 28, 2008; Accepted August 21, 2008.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The degradome is defined as the complete set of proteases present in an organism. The recent availability of whole genomic sequences from multiple organisms has led us to predict the contents of the degradomes of several mammalian species. To ensure the fidelity of these predictions, our methods have included manual curation of individual sequences and, when necessary, direct cloning and sequencing experiments. The results of these studies in human, chimpanzee, mouse and rat have been incorporated into the Degradome database, which can be accessed through a web interface at http://degradome.uniovi.es. The annotations about each individual protease can be retrieved by browsing catalytic classes and families or by searching specific terms. This web site also provides detailed information about genetic diseases of proteolysis, a growing field of great importance for multiple users. Finally, the user can find additional information about protease structures, protease inhibitors, ancillary domains of proteases and differences between mammalian degradomes.

INTRODUCTION

Proteases are defined as hydrolytic enzymes acting on peptide bonds, in a process termed proteolysis.
The biological significance of proteolysis has driven the evolutionary invention of multiple, extremely diverse classes and families of proteases. Thus, different proteases are known to play key roles in multiple biological processes, including cell cycle progression, differentiation and migration, morphogenesis and tissue remodelling, neuronal outgrowth, haemostasis, wound healing, immunity, angiogenesis and apoptosis (1). The importance of proteolysis is also apparent in the numerous pathological conditions related to alterations in proteases, including cancer, arthritis, progeria and neurodegenerative and cardiovascular diseases (1–6). The extensive biological and pathological implications of this large set of proteins with a common biochemical function led to the concept of proteases as a distinct subset of the proteome. Thus, the degradome of an organism was defined as the complete set of proteases in that organism (7). The definition of degradome naturally led to the development of degradomics as a new experimental field which includes all genomic and proteomic approaches for the identification and characterization of proteases that are present in an organism.

The completion of multiple genome projects has been instrumental in the advance of degradomics by allowing researchers to extend the degradomes of several species in silico from known protease sequences. While several computer programs allow the automatic prediction of genes based on similarity, a reliable prediction still requires manual curation by trained researchers (8). The reasons for this limitation include the difficulty of detecting small or dissimilar exons as well as the occurrence of occasional sequencing errors. Indeed, the analysis of a large set of genes is likely to require manual inspection of sequencing traces and cloning and re-sequencing experiments for some of the genes.
Additionally, orthology or paralogy assignment of protease genes between human and other animal models also requires the supervision of expert curators. We have used this manual procedure to predict the degradomes of human, mouse, rat, chimpanzee and platypus (9–12). Furthermore, our continued effort in degradomics has led us to mine the literature and annotate known relationships between protease alterations and hereditary diseases. Since proteases are promising drug development targets (13–16) and clinical markers (17–19), this compilation may prove very useful to researchers in different fields of human pathology. Likewise, this information on diseases of proteolysis represents a useful resource to determine the utility and limitations of diverse animal models to recapitulate certain human diseases.

Here we report the Degradome database, which contains the results of the manual annotation of every protease gene in the genomes of human, chimpanzee, mouse and rat, along with relationships between protease alterations and hereditary diseases. This database complements existing databases devoted to proteases by providing a different focus. Namely, the ProLysED database (20) is devoted to proteases in prokaryotes, whereas our target organisms are mammals. On the other hand, CutDB (21) focuses on annotation of individual proteolytic events, both actual and predicted, rather than on the proteases themselves. Finally, MEROPS (22) is a comprehensive and excellent database which relies on large-scale experiments and automatic annotation. However, a number of entries in this database correspond to pseudogenes or sequences derived from retroviral elements which do not code for any functional proteolytic enzyme. By contrast, our Degradome database, while less comprehensive in the number of species, relies on manual annotation and exhaustive curation of genes on an individual basis.
In multiple cases, this informatic work is supported by direct cloning and sequencing experiments (23–27). Furthermore, our emphasis on diseases adds important information on the pathological relevance of some proteases, which is not directly available in other databases.

The Degradome database is aimed at researchers looking for specific information about mammalian proteases and protease families. Additionally, we have incorporated features intended to help inexperienced users who want to learn about the degradome. These features include selected publications and interactive 3D structures that can be displayed and manipulated with Adobe Reader and thus do not need specialized software.

DATABASE ACCESS

Annotation of individual proteases

The Degradome database contains information about 570 human, 568 chimpanzee, 651 mouse and 641 rat proteases. All of these proteases can be grouped into five catalytic classes, depending on the key residue for their catalytic mechanism. Accordingly, the database information is structured in five tables, containing aspartyl-, cysteine-, metallo-, serine- and threonine-proteases. Each table displays the name of the family using the MEROPS classification system, the name of each protease and the gene symbol for the protease in each species (Figure 1).
On the other hand, the protease-specific links open ‘popup’ tables with information about each protease (Figure 1).

Search engine

The Degradome database can also be queried using the specific search engine (http://degradome.uniovi.es/search.html). This option lets the user find proteases which meet user-defined criteria such as the presence or absence in selected species, the localization within a specific chromosomal locus, or the existence of mutations which lead to human hereditary diseases (Figure 1). As an example, to find which human proteases are involved in a disease and located on human chromosome 12q, we can first set the query boxes to ‘Search for proteases containing 12q in the field Locus of Human’. This will retrieve 21 proteases. Then, we can rearrange the query boxes to ‘Keep proteases mutated in a disease’, which will narrow the results to a single hit.

Hereditary diseases of proteolysis

The information about mutated proteases in hereditary diseases has been compiled into a table (http://degradome.uniovi.es/diseases.html), so that users specifically interested in this subject do not need to browse or search the individual annotations. In its present form, the table of degradome-related genetic diseases contains 77 proteases, with information about gene locus, mode of inheritance, pathologic protease alteration (gain/loss of proteolytic activity) and availability of described animal models containing the same protease anomaly (Figure 1). To our knowledge, this is the first summary of the relationships between degradomics and pathology. The large number of protease alterations related to diverse diseases highlights the importance of the degradome in maintaining a correct physiological balance. It must be noted that this summary does not include the multiple examples of non-hereditary diseases in which proteases are known to play an important role as a consequence of alterations in their spatio-temporal patterns of expression (2–6).
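
The two-step query described in the search engine section (first filter on the human locus, then keep only disease-linked proteases) can be illustrated with a minimal Python sketch. This is not the site's actual implementation, which is not described here at this level of detail; the record type, field names, helper functions and all entries below are hypothetical placeholders chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProteaseRecord:
    """Hypothetical record; field names are illustrative, not the database schema."""
    name: str
    human_locus: Optional[str]  # None when no human gene is annotated
    disease: Optional[str]      # None when no hereditary disease is annotated


def locus_contains(records: List[ProteaseRecord], text: str) -> List[ProteaseRecord]:
    """Step 1: keep proteases whose human locus contains the given text."""
    return [r for r in records if r.human_locus is not None and text in r.human_locus]


def keep_disease_linked(records: List[ProteaseRecord]) -> List[ProteaseRecord]:
    """Step 2: keep proteases annotated as mutated in a hereditary disease."""
    return [r for r in records if r.disease is not None]


# Placeholder entries (not real annotations from the Degradome database).
RECORDS = [
    ProteaseRecord("protease_A", "12q13", "example disease"),
    ProteaseRecord("protease_B", "12q24", None),
    ProteaseRecord("protease_C", "12p13", "another example disease"),
    ProteaseRecord("protease_D", None, None),
]

step1 = locus_contains(RECORDS, "12q")  # analogous to 'containing 12q in the field Locus of Human'
step2 = keep_disease_linked(step1)      # analogous to 'Keep proteases mutated in a disease'
print([r.name for r in step2])
```

Applying the filters in sequence mirrors how the query boxes narrow a result set: each step operates on the output of the previous one, so the order of the two steps does not change the final result.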
Notably, the Degradome database has also demonstrated its usefulness in the analysis of proteases associated with cancer (31).

Selected structures

Proteases represent important pharmacological targets for different human pathologies. Therefore, an important aspect of protease research has been to elucidate the mechanism of action of these enzymes by determining the 3D structure of individual proteases. In this regard, we have prepared 22 web pages showing structural features of representative members of different protease families (Figure 1). These structures have been selected to present a general view of the multiple folds that can be found in the degradome. Most of the structures include a specific inhibitor interacting with the catalytic residues. In contrast, three structures from different catalytic classes have been chosen which display proteases in their active form. In these structures, the user can display schematics explaining the putative catalytic mechanisms of cysteine-proteases (C26 family), serine-proteases (S10 family) and metalloproteases (M03 family).

Additional contents

In addition to the Degradome database, the web site also offers several summaries of the characteristics of mammalian degradomes. Thus, a static table listing human, mouse and rat protease inhibitors can be found at http://degradome.uniovi.es/inhibitors.html. A count of proteases in these species, itemized by catalytic class, is shown at http://degradome.uniovi.es/numbers.html. These numbers should not be considered definitive and are likely to be expanded as novel catalytic classes are discovered and added to the Degradome database. Additionally, since most proteases display a series of non-proteolytic domains linked to the catalytic unit, we have also prepared a figure showing the different ancillary domains present in proteases (http://degradome.uniovi.es/domains.html).
Finally, we have incorporated a figure summarizing the differences between human and mouse degradomes (http://degradome.uniovi.es/hmd.html). This figure is displayed as a static image and also as an interactive PDF file.

Implementation

The database with the annotations of individual proteases has been stored in five extensible markup language (XML) files, one for each catalytic class. These XML files are freely available upon request. Individual annotation tables access the corresponding XML files through JavaScript. Therefore, if the browser lacks JavaScript or JavaScript is blocked, the user is offered a link to a static table displaying all of the information at once. On the other hand, several interactive features, namely ‘selected structures’ and ‘human/mouse degradome differences’, are offered as PDF files. Thus, the user does not need any plugins or specific software to manipulate these figures, only Adobe Reader v7.0 or higher. The web pages contain an external link to a web page where the user can download the latest version of Adobe Reader. It must be noted that Microsoft Internet Explorer treats embedded PDF files as ActiveX content, which may be blocked in the browser. If this happens, the user can download the PDF file and access its contents locally.

CONCLUSION AND FUTURE DIRECTION

We have developed a database devoted to the degradome in several mammalian species, which is freely available through a web interface. Notably, this database contains information about the involvement of proteases in genetic diseases. The features provided by the Degradome database are useful for researchers in the degradomics field, as well as for researchers working on individual proteases and protease families. It will also be of special interest to researchers working with animal models, as this database provides a highly curated repertoire of orthologous genes between human, mouse and rat.
Likewise, the Degradome database shows differences in protease genes between these organisms due to the selective expansion of a series of protease coding genes in rodents, which might hamper the use of these animal models to study certain human proteases.

Our future plans include the extension of the database to other species. At this moment, we are in the process of adding the degradome of platypus. Other species we are currently studying include non-mammalian metazoa which may provide additional clues about the evolution of proteases. Finally, we plan to offer new features which will allow users to search for sequences and motifs in the degradome.

FUNDING

European Union (CancerDegradome-FP6 and FP7); Ministerio de Ciencia e Innovación-Spain; Fundación M Botín; Fundación Lilly; Obra Social Cajastur (to the Instituto Universitario de Oncología). Funding for open access charge: Ministerio de Ciencia e Innovación-Spain.

Conflict of interest statement. None declared.

REFERENCES

1. Lopez-Otin C, Bond JS. Proteases: multifunctional enzymes in life and disease. J. Biol. Chem. (Epub ahead of print July 23, 2008). doi:10.1074/jbc.R800035200.
2. Freije JM, Balbin M, Pendas AM, Sanchez LM, Puente XS, Lopez-Otin C. Matrix metalloproteinases and tumor progression. Adv. Exp. Med. Biol. 2003;532:91–107.
3. Murphy G, Nagase H. Reappraising metalloproteinases in rheumatoid arthritis and osteoarthritis: destruction or repair? Nat. Clin. Pract. Rheumatol. 2008;4:128–135.
4. Varela I, Cadinanos J, Pendas AM, Gutierrez-Fernandez A, Folgueras AR, Sanchez LM, Zhou Z, Rodriguez FJ, Stewart CL, Vega JA, et al. Accelerated ageing in mice deficient in Zmpste24 protease is linked to p53 signalling activation. Nature. 2005;437:564–568.
5. Nalivaeva NN, Fisk LR, Belyaev ND, Turner AJ. Amyloid-degrading enzymes as therapeutic targets in Alzheimer's disease. Curr. Alzheimer Res. 2008;5:212–224.
6. Dollery CM, Libby P. Atherosclerosis and proteinase activation. Cardiovasc. Res. 2006;69:625–635.
7. Lopez-Otin C, Overall CM. Protease degradomics: a new challenge for proteomics. Nat. Rev. Mol. Cell. Biol. 2002;3:509–519.
8. Do JH, Choi DK. Computational approaches to gene prediction. J. Microbiol. 2006;44:137–144.
9. Puente XS, Sanchez LM, Overall CM, Lopez-Otin C. Human and mouse proteases: a comparative genomic approach. Nat. Rev. Genet. 2003;4:544–558.
10. Puente XS, Lopez-Otin C. A genomic analysis of rat proteases and protease inhibitors. Genome Res. 2004;14:609–622.
11. Puente XS, Gutierrez-Fernandez A, Ordonez GR, Hillier LW, Lopez-Otin C. Comparative genomic analysis of human and chimpanzee proteases. Genomics. 2005;86:638–647.
12. Ordonez GR, LaDeana WH, Warren WC, Grützner F, Lopez-Otin C, Puente XS. Loss of genes implicated in gastric function during platypus evolution. Genome Biol. 2008;9:R81.
13. Varela I, Pereira S, Ugalde AP, Navarro CL, Suarez MF, Cau P, Cadinanos J, Osorio FG, Foray N, Cobo J, et al. Combined treatment with statins and aminobisphosphonates extends longevity in a mouse model of human premature aging. Nat. Med. 2008;14:767–772.
14. Gomperts ED, Astermark J, Gringeri A, Teitel J. From theory to practice: applying current clinical knowledge and treatment strategies to the care of hemophilia a patients with inhibitors. Blood Rev. 2008;22(Suppl. 1):S1–11.
15. Chauhan D, Hideshima T, Anderson KC. Targeting proteasomes as therapy in multiple myeloma. Adv. Exp. Med. Biol. 2008;615:251–260.
16. Turk B. Targeting proteases: successes, failures and future prospects. Nat. Rev. Drug Discov. 2006;5:785–799.
17. Schoenberger J, Bauer J, Moosbauer J, Eilles C, Grimm D. Innovative strategies in in vivo apoptosis imaging. Curr. Med. Chem. 2008;15:187–194.
18. Sloane BF, Sameni M, Podgorski I, Cavallo-Medved D, Moin K. Functional imaging of tumor proteolysis. Annu. Rev. Pharmacol. Toxicol. 2006;46:301–315.
19. Paliouras M, Borgono C, Diamandis EP. Human tissue kallikreins: the cancer biomarker family. Cancer Lett. 2007;249:61–79.
20. Firdaus Raih M, Ahmad HA, Sharum MY, Azizi N, Mohamed R. ProLysED: an integrated database and meta-server of bacterial protease systems. Appl. Bioinformatics. 2005;4:147–150.
21. Igarashi Y, Eroshkin A, Gramatikova S, Gramatikoff K, Zhang Y, Smith JW, Osterman AL, Godzik A. CutDB: a proteolytic event database. Nucleic Acids Res. 2007;35:D546–549.
22. Rawlings ND, Morton FR, Kok CY, Kong J, Barrett AJ. MEROPS: the peptidase database. Nucleic Acids Res. 2008;36:D320–325.
23. Quesada V, Diaz-Perales A, Gutierrez-Fernandez A, Garabaya C, Cal S, Lopez-Otin C. Cloning and enzymatic analysis of 22 novel human ubiquitin-specific proteases. Biochem. Biophys. Res. Commun. 2004;314:54–62.
24. Marino G, Uria JA, Puente XS, Quesada V, Bordallo J, Lopez-Otin C. Human autophagins, a family of cysteine proteinases potentially implicated in cell degradation by autophagy. J. Biol. Chem. 2003;278:3671–3678.
25. Cal S, Quesada V, Garabaya C, Lopez-Otin C. Polyserase-I, a human polyprotease with the ability to generate independent serine protease domains from a single translation product. Proc. Natl Acad. Sci. USA. 2003;100:9185–9190.
26. Diaz-Perales A, Quesada V, Peinado JR, Ugalde AP, Alvarez J, Suarez MF, Gomis-Ruth FX, Lopez-Otin C. Identification and characterization of human archaemetzincin-1 and -2, two novel members of a family of metalloproteases widely distributed in Archaea. J. Biol. Chem. 2005;280:30367–30375.
27. Cal S, Obaya AJ, Llamazares M, Garabaya C, Quesada V, Lopez-Otin C. Cloning, expression analysis and structural characterization of seven novel human ADAMTSs, a family of metalloproteinases with disintegrin and thrombospondin-1 domains. Gene. 2002;283:49–62.
28. Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, Dicuccio M, Edgar R, Federhen S, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2008;36:D13–21.
29. Birney E, Andrews TD, Bevan P, Caccamo M, Chen Y, Clarke L, Coates G, Cuff J, Curwen V, Cutts T, et al. An overview of Ensembl. Genome Res. 2004;14:925–928.
30. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005;33:D514–517.
31. Lopez-Otin C, Matrisian LM. Emerging roles of proteases in tumour suppression. Nat. Rev. Cancer. 2007;7:800–808.