Searching databases of conserved sequence regions by aligning protein multiple-alignments

S Pietrokovski

doi:10.1093/nar/24.19.3836

Nucleic Acids Res. 1996 Oct 1;24(19):3836-45. doi: 10.1093/nar/24.19.3836.

Author

S Pietrokovski¹

Affiliation

¹ Fred Hutchinson Cancer Research Center, Seattle, WA 98104, USA.

Abstract

A general search method for comparing multiple sequence alignments was developed to detect sequence relationships between conserved protein regions. Multiple alignments are treated as sequences of amino acid distributions and are aligned by comparing pairs of such distributions. Four different comparison measures were tested, and the Pearson correlation coefficient was chosen. The method is sensitive, detecting weak sequence relationships between protein families. Relationships are detected beyond the range of conventional sequence database searches, illustrating the potential usefulness of the method. A previously undetected relationship between flavoprotein subunits of two oxidoreductase families points to the potential active site in one of the families. The similarity between the bacterial RecA, DnaA and Rad51 protein families reveals a region in DnaA and Rad51 proteins likely to bind and unstack single-stranded DNA. Helix-turn-helix DNA-binding domains from diverse proteins are readily detected and shown to be similar to each other. Glycosylasparaginase and gamma-glutamyltransferase enzymes are found to be similar in their proteolytic cleavage sites. The method has been fully implemented on the World Wide Web at URL: http://blocks.fhcrc.org/blocks-bin/LAMAvsearch.
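
The core idea of the abstract, treating each alignment column as an amino acid distribution and scoring pairs of columns with the Pearson correlation coefficient, can be illustrated with a minimal Python sketch. This is not the published implementation: the function names, the plain frequency counts (no sequence weighting or pseudocounts), the ungapped sliding comparison, the summed-correlation score and the `min_overlap` parameter are all assumptions made for illustration, and the toy blocks are invented.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distributions(block):
    """Convert an ungapped block (equal-length sequences) into one
    20-element amino acid frequency vector per column.
    Assumption: raw frequencies, no weighting or pseudocounts."""
    n_seqs = len(block)
    profile = []
    for i in range(len(block[0])):
        counts = Counter(seq[i] for seq in block)
        profile.append([counts.get(aa, 0) / n_seqs for aa in AMINO_ACIDS])
    return profile


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors.
    Returns 0.0 when either vector has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_ungapped_alignment(p, q, min_overlap=3):
    """Slide profile q along profile p without gaps and return
    (score, offset, overlap) for the best placement, where offset is
    the column of p aligned with column 0 of q (may be negative).
    Assumption: the score is the sum of column-pair correlations."""
    best = None
    for offset in range(-(len(q) - min_overlap), len(p) - min_overlap + 1):
        start_p, start_q = max(0, offset), max(0, -offset)
        overlap = min(len(p) - start_p, len(q) - start_q)
        if overlap < min_overlap:
            continue
        score = sum(
            pearson(p[start_p + k], q[start_q + k]) for k in range(overlap)
        )
        if best is None or score > best[0]:
            best = (score, offset, overlap)
    return best


# Hypothetical toy blocks, invented for this example.
block_a = ["AGDSGKT", "LGDTGKS", "VGESGKT"]
block_b = ["GDSGK", "GDTGK", "GESGK"]

score, offset, overlap = best_ungapped_alignment(
    column_distributions(block_a), column_distributions(block_b)
)
print(f"best offset={offset}, overlap={overlap}, score={score:.3f}")
```

A database search in this spirit would repeat the comparison for a query block against every block in the database and rank the resulting scores; how the published method assesses score significance is not described in the abstract and is not modeled here.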

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Amino Acid Sequence
Aspartylglucosylaminase / chemistry
Aspartylglucosylaminase / metabolism
Catalysis
Conserved Sequence*
DNA Nucleotidyltransferases / chemistry
DNA Nucleotidyltransferases / metabolism
DNA, Single-Stranded / metabolism
DNA-Binding Proteins / chemistry
DNA-Binding Proteins / metabolism
Databases, Factual*
Flavin-Adenine Dinucleotide / metabolism
Helix-Turn-Helix Motifs
Hydrolysis
Proteins / chemistry*
Proteins / metabolism
Sequence Homology, Amino Acid*
Transposases
gamma-Glutamyltransferase / chemistry
gamma-Glutamyltransferase / metabolism

Substances

DNA, Single-Stranded
DNA-Binding Proteins
Proteins
Flavin-Adenine Dinucleotide
gamma-Glutamyltransferase
DNA Nucleotidyltransferases
Transposases
Aspartylglucosylaminase

Grants and funding

GM29009/GM/NIGMS NIH HHS/United States