Proc Natl Acad Sci U S A. Feb 1, 1994; 91(3): 1059–1063.
PMCID: PMC521453

Hidden Markov models of biological primary sequence information.

Abstract

Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences.
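
The linear dependence on K can be illustrated with a minimal sketch. It is not the training algorithm described in the paper; it only shows the alignment step, under the assumption that a model is already trained and that each sequence is aligned to it independently by finding its most probable state path (the standard Viterbi algorithm). The two-state model, the two-letter alphabet, and all names below (`viterbi`, `STATES`, `LOG_START`, `LOG_TRANS`, `LOG_EMIT`, `family`) are hypothetical and chosen only for illustration.

```python
import math

def viterbi(seq, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for seq under a discrete HMM."""
    # score[s] = best log-probability of any state path ending in state s
    score = {s: log_start[s] + log_emit[s][seq[0]] for s in states}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for symbol in seq[1:]:
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][symbol]
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state to recover the path
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {key: math.log(p) for key, p in table.items()}

# Hypothetical toy model: two states, alphabet {"x", "y"}
STATES = ["A", "B"]
LOG_START = logs({"A": 0.5, "B": 0.5})
LOG_TRANS = {"A": logs({"A": 0.8, "B": 0.2}), "B": logs({"A": 0.2, "B": 0.8})}
LOG_EMIT = {"A": logs({"x": 0.9, "y": 0.1}), "B": logs({"x": 0.1, "y": 0.9})}

# One independent Viterbi pass per sequence: total work grows linearly in K
family = ["xxyy", "xxxy", "xyyy"]
paths = [viterbi(s, STATES, LOG_START, LOG_TRANS, LOG_EMIT)[1] for s in family]
```

Because every sequence is mapped onto the states of the same model, the K state paths together induce a multiple alignment without any pairwise comparison between sequences. This generic version costs on the order of N·S^2 per sequence for S states with dense transitions. If the model has a number of states proportional to N and only a constant number of transitions per state, as is assumed here for the kind of architecture used with sequence families, the per-sequence cost becomes O(N^2), consistent with the O(KN^2) total quoted above.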



Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
