Genomics. 1994 Jan 1;19(1):97-107.

Protein family classification based on searching a database of blocks.

Author information

Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington 98104.


The most highly conserved regions of proteins can be represented as "blocks" of locally aligned sequence segments. Previously, an automated system was introduced to generate a database of blocks that is searched for local similarities using a sequence query. Here, we describe a method for searching this database that can also reveal significant global similarities. Local and global alignments are scored independently, so they can be used in concert to infer homology. A set of 7082 diverse sequences not represented in the database provided queries for testing this approach. The resulting distributions of scores led to guidelines for interpretation of search data and to the classification of 289 uncatalogued sequences into known groups. Thirty-eight of these relationships appear to be new discoveries. We also show how searching a database of blocks can be used to detect repeated domains and to find distinct cross-family relationships that were missed in searches of sequence databases.
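As a rough illustration of what searching a query sequence against a single block can look like, the following minimal Python sketch converts a block of equal-width aligned segments into per-column log-odds scores and slides that profile along a query to find the best ungapped local match. It is not the scoring scheme of the paper: the uniform background frequency, the pseudocount, the zero score for non-standard residues, the example block, and the function names `block_to_profile` and `best_local_hit` are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def block_to_profile(block, pseudocount=1.0):
    """Turn equal-width aligned segments into per-column log-odds scores.

    Assumption: uniform background frequencies and a flat pseudocount;
    a real system would use observed frequencies and sequence weights.
    """
    width = len(block[0])
    if any(len(segment) != width for segment in block):
        raise ValueError("all segments in a block must have the same width")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(block) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(width):
        counts = Counter(segment[col] for segment in block)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_local_hit(profile, query):
    """Slide the profile along the query; return (best_score, offset).

    Residues outside the 20 standard amino acids score 0.0 (an assumption).
    Returns (-inf, -1) if the query is shorter than the block.
    """
    width = len(profile)
    best_score, best_offset = float("-inf"), -1
    for offset in range(len(query) - width + 1):
        score = sum(
            profile[i].get(query[offset + i], 0.0) for i in range(width)
        )
        if score > best_score:
            best_score, best_offset = score, offset
    return best_score, best_offset


# Hypothetical block of three aligned segments and a made-up query.
example_block = ["GKSTL", "GKTTL", "GKSSL"]
score, offset = best_local_hit(block_to_profile(example_block), "MAAGKSTLVV")
print(f"best offset = {offset}, score = {score:.2f}")
```

A search over a whole database would repeat this for every block. Combining hits from several blocks of the same family into one global score, as the abstract describes, is not covered by this sketch.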

[Indexed for MEDLINE]
