J Mol Biol. 2003 Feb 21;326(3):955-78.

Functional sites in protein families uncovered via an objective and automated graph theoretic approach.

Author information

1. Department of Chemical Engineering, Indian Institute of Technology, Bombay, Powai Mumbai 400 076, India. pramodw@che.iitb.ac.in

Abstract

We report a method for detection of recurring side-chain patterns (DRESPAT) using an unbiased and automated graph theoretic approach. With each protein represented as a graph, we first list all structural patterns as sub-graphs. The patterns from proteins are compared pair-wise to detect patterns common to a protein pair based on content and geometry criteria. Recurring patterns are then detected by an automated search algorithm applied to the all-against-all pair-wise comparison data of the proteins. Intra-protein pattern comparison data are used to enable detection of patterns recurring within a protein. We also propose a method for empirical calculation of the statistical significance of a recurring pattern. The method was tested on 17 protein sets of varying size, composed of non-redundant representatives from SCOP superfamilies. Recurring patterns in serine proteases, cysteine proteases, lipases, cupredoxin, ferredoxin, ferritin, cytochrome c, aspartoyl proteases, peroxidases, phospholipase A2, endonuclease, SH3 domain, EF-hand and lectins show additional residues conserved in the vicinity of the known functional sites. On the basis of the recurring patterns in ferritin, EF-hand and lectins, we could separate proteins or domains that are structurally similar yet differ in metal ion-binding characteristics. In addition, novel recurring patterns with potential structural/functional roles were observed in glutathione-S-transferase, phospholipase A2 and ferredoxin. The results are discussed in relation to the known functional sites in each family. Between 2,000 and 50,000 patterns were enumerated from each protein, with between 10 and 500 patterns detected as common to an evolutionarily related protein pair. Our results show that unbiased extraction of a functional site pattern is not feasible from an evolutionarily related protein pair but is feasible from protein sets comprising five or more proteins. The DRESPAT method does not require a user-defined pattern, size or location of the pattern and therefore has the potential to uncover new functional sites in protein families.
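
The pair-wise comparison step described in the abstract can be illustrated with a small sketch. This is not the DRESPAT implementation: the abstract does not give the graph construction, the pattern sizes or the matching thresholds, so everything here is an assumption made for illustration. Each residue is reduced to a hypothetical side-chain centroid, a pattern is taken to be any group of three residues lying mutually within an assumed 8 Å cutoff, and two patterns are considered common when their residue types agree (content) and their inter-residue distances agree within an assumed 1 Å tolerance (geometry). The fixed pattern size is a simplification; the function names and toy coordinates are invented.

```python
from itertools import combinations, permutations
from math import dist

# A residue is (residue_type, (x, y, z)); coordinates are hypothetical
# side-chain centroids in angstroms.

def enumerate_patterns(residues, size=3, cutoff=8.0):
    """List every group of `size` residues that lie mutually within `cutoff`.

    `size` and `cutoff` are assumed values chosen for this sketch only.
    """
    patterns = []
    for combo in combinations(residues, size):
        if all(dist(a[1], b[1]) <= cutoff for a, b in combinations(combo, 2)):
            patterns.append(combo)
    return patterns

def patterns_match(p, q, tol=1.0):
    """Return True if two patterns agree in content and geometry.

    Content: both patterns hold the same multiset of residue types.
    Geometry: some type-preserving pairing of residues keeps every
    inter-residue distance within `tol` (an assumed tolerance).
    """
    if sorted(r[0] for r in p) != sorted(r[0] for r in q):
        return False
    index_pairs = list(combinations(range(len(p)), 2))
    for perm in permutations(q):
        if any(a[0] != b[0] for a, b in zip(p, perm)):
            continue  # pairing mixes residue types, skip it
        if all(
            abs(dist(p[i][1], p[j][1]) - dist(perm[i][1], perm[j][1])) <= tol
            for i, j in index_pairs
        ):
            return True
    return False

def common_patterns(protein_a, protein_b):
    """Compare all patterns of two proteins and return the matching pairs."""
    patterns_a = enumerate_patterns(protein_a)
    patterns_b = enumerate_patterns(protein_b)
    return [(p, q) for p in patterns_a for q in patterns_b if patterns_match(p, q)]

# Toy data: two invented "proteins" that share a His/Asp/Ser arrangement.
protein_a = [
    ("HIS", (0.0, 0.0, 0.0)),
    ("ASP", (5.0, 0.0, 0.0)),
    ("SER", (0.0, 5.0, 0.0)),
    ("GLY", (30.0, 30.0, 30.0)),
]
protein_b = [
    ("SER", (10.0, 15.2, 10.0)),
    ("HIS", (10.0, 10.0, 10.0)),
    ("ASP", (15.0, 10.0, 10.0)),
    ("LEU", (50.0, 50.0, 50.0)),
]

shared = common_patterns(protein_a, protein_b)
print(len(shared))
```

Extending this toy comparison to a protein set would mean running it all-against-all and then searching the pair-wise results for patterns that recur across the set, which is the stage the abstract reports as necessary for reliable extraction (five or more proteins rather than a single pair). That search and the significance calculation are not sketched here.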

PMID: 12581652
DOI: 10.1016/s0022-2836(02)01384-0
[Indexed for MEDLINE]
