Proc Natl Acad Sci U S A. Mar 15, 1992; 89(6): 2002–2006.
PMCID: PMC48584

Methods and algorithms for statistical analysis of protein sequences.

Abstract

We describe several protein sequence statistics designed to evaluate distinctive attributes of residue content and arrangement in primary structure. Considered are global compositional biases, local clustering of different residue types (e.g., charged residues, hydrophobic residues, Ser/Thr), long runs of charged or uncharged residues, periodic patterns, counts and distribution of homooligopeptides, and unusual spacings between particular residue types. The computer program SAPS (statistical analysis of protein sequences) calculates all the statistics for any individual protein sequence input and is available for the UNIX environment through electronic mail on request to V.B. (volker/genomic@stanford.edu).
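As an illustration only, three of the statistics named in the abstract (global composition, long runs of charged residues, and homooligopeptide counts) can be sketched in a few lines of Python. This is not the SAPS program and makes no attempt at its significance tests. The function names, the choice of K, R, D, and E as the charged set, and the minimum homooligopeptide length of 3 are assumptions made for the example.

```python
from collections import Counter
from itertools import groupby

# Assumption for this sketch: Lys, Arg, Asp, and Glu are the charged residues.
CHARGED = set("KRDE")


def composition(seq):
    """Return the fraction of each residue type present in seq."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: c / n for aa, c in counts.items()}


def longest_run(seq, residues):
    """Return the length of the longest uninterrupted run of residues in the set."""
    best = 0
    for in_set, group in groupby(seq, key=lambda aa: aa in residues):
        if in_set:
            best = max(best, sum(1 for _ in group))
    return best


def homooligopeptides(seq, min_len=3):
    """Return (residue, start, length) for each run of one identical residue.

    Only runs of at least min_len residues are reported; start is 0-based.
    """
    runs = []
    pos = 0
    for aa, group in groupby(seq):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((aa, pos, length))
        pos += length
    return runs


if __name__ == "__main__":
    example = "MKKKEDRAAAAGST"  # made-up sequence, not taken from the paper
    print(composition(example))
    print(longest_run(example, CHARGED))
    print(homooligopeptides(example))
```

Deciding whether a run or a compositional bias is unusual requires a null model for the sequence, which the sketch leaves out; it reports raw counts only.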



Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
