Searching databases of conserved sequence regions by aligning protein multiple-alignments

Nucleic Acids Res. 1996 Oct 1;24(19):3836-45. doi: 10.1093/nar/24.19.3836.

Abstract

A general searching method for comparing multiple sequence alignments was developed to detect sequence relationships between conserved protein regions. Multiple alignments are treated as sequences of amino acid distributions and aligned by comparing pairs of such distributions. Four different comparison measures were tested and the Pearson correlation coefficient chosen. The method is sensitive, detecting weak sequence relationships between protein families. Relationships are detected beyond the range of conventional sequence database searches, illustrating the potential usefulness of the method. The previously undetected relation between flavoprotein subunits of two oxidoreductase families points to the potential active site in one of the families. The similarity between the bacterial RecA, DnaA and Rad51 protein families reveals a region in DnaA and Rad51 proteins likely to bind and unstack single-stranded DNA. Helix-turn-helix DNA binding domains from diverse proteins are readily detected and shown to be similar to each other. Glycosylasparaginase and gamma-glutamyltransferase enzymes are found to be similar in their proteolytic cleavage sites. The method has been fully implemented on the World Wide Web at URL: http://blocks.fhcrc.org/blocks-bin/LAMAvsearch.
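
The core idea of the abstract, treating each alignment column as an amino acid distribution and scoring pairs of columns with the Pearson correlation coefficient, can be illustrated with a minimal Python sketch. The function names, the representation of a block as a list of column strings, the plain (unweighted) residue frequencies, and the gap-free sliding search that averages column scores are all assumptions made for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def column_distribution(column):
    """Return unweighted amino acid frequencies for one alignment column.

    `column` is a string holding the residue from each sequence at that
    position (hypothetical input format). Non-standard symbols are ignored.
    """
    residues = [r for r in column.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)

def best_ungapped_alignment(block_a, block_b, min_overlap=2):
    """Slide block_b along block_a without gaps (an assumed scheme).

    Returns (score, offset), where column j of block_b is paired with
    column j + offset of block_a and score is the mean column correlation.
    """
    dists_a = [column_distribution(c) for c in block_a]
    dists_b = [column_distribution(c) for c in block_b]
    best = None
    first = -(len(dists_b) - min_overlap)
    last = len(dists_a) - min_overlap
    for offset in range(first, last + 1):
        pairs = [(i, i - offset) for i in range(len(dists_a))
                 if 0 <= i - offset < len(dists_b)]
        if len(pairs) < min_overlap:
            continue
        score = sum(pearson(dists_a[i], dists_b[j]) for i, j in pairs) / len(pairs)
        if best is None or score > best[0]:
            best = (score, offset)
    return best

# Toy blocks: each string is one alignment column across four sequences.
block_a = ["LLLL", "KKKK", "DDDD", "GGGG"]
block_b = ["KKKK", "DDDD"]
score, offset = best_ungapped_alignment(block_a, block_b)
```

In this toy case the two columns of `block_b` match the second and third columns of `block_a`, so the search should prefer the offset that pairs them. A real comparison would also need a way to judge whether a score is significant, which this sketch does not attempt.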

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Aspartylglucosylaminase / chemistry
  • Aspartylglucosylaminase / metabolism
  • Catalysis
  • Conserved Sequence*
  • DNA Nucleotidyltransferases / chemistry
  • DNA Nucleotidyltransferases / metabolism
  • DNA, Single-Stranded / metabolism
  • DNA-Binding Proteins / chemistry
  • DNA-Binding Proteins / metabolism
  • Databases, Factual*
  • Flavin-Adenine Dinucleotide / metabolism
  • Helix-Turn-Helix Motifs
  • Hydrolysis
  • Proteins / chemistry*
  • Proteins / metabolism
  • Sequence Homology, Amino Acid*
  • Transposases
  • gamma-Glutamyltransferase / chemistry
  • gamma-Glutamyltransferase / metabolism

Substances

  • DNA, Single-Stranded
  • DNA-Binding Proteins
  • Proteins
  • Flavin-Adenine Dinucleotide
  • gamma-Glutamyltransferase
  • DNA Nucleotidyltransferases
  • Transposases
  • Aspartylglucosylaminase