Proc Natl Acad Sci U S A. Oct 1, 1993; 90(19): 8777–8781.
PMCID: PMC47443

Weighting in sequence space: a comparison of methods in terms of generalized sequences.

Abstract

Four methods for weighting aligned biological sequences have recently appeared that differ mathematically, philosophically, and in their results. Thus, while there is consensus about the need to weight sequences, the method to use is contentious. A geometric analysis based on a continuous sequence space is presented that provides a common framework in which to compare the methods. It is concluded that there are two "best" methods. When the sequences are known to be phylogenetically related and a tree can be generated without introducing excessive stress into the data, the method of Altschul et al. [Altschul, S. F., Carroll, R. J. & Lipman, D. J. (1989) J. Mol. Biol. 207, 647-653] is appropriate. When the sequences are not known to be phylogenetically related or a tree cannot be produced without unduly distorting the distances between the sequences, a modification of the method of Sibbald and Argos [Sibbald, P. R. & Argos, P. (1990) J. Mol. Biol. 216, 813-818] is preferable.
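To make the idea of weighting sequences by the region of sequence space they occupy more concrete, here is a minimal Python sketch of a Monte Carlo, Voronoi-style weighting scheme of the general kind associated with Sibbald and Argos. It is an illustration under stated assumptions, not the modified method the abstract recommends: the function name `voronoi_weights`, the use of Hamming distance, the sampling of probe sequences from the residues observed in each column, and the equal splitting of ties are all assumptions made for this example.

```python
import random


def voronoi_weights(alignment, n_samples=10000, seed=0):
    """Estimate sequence weights by Monte Carlo sampling (illustrative sketch).

    Assumptions (not taken from the article): Hamming distance, probe
    sequences built from residues observed per column, ties split equally.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    rng = random.Random(seed)
    # Residues observed at each alignment column.
    columns = [sorted(set(seq[i] for seq in alignment)) for i in range(length)]
    credit = [0.0] * len(alignment)

    for _ in range(n_samples):
        # Draw a random probe sequence from the observed residues.
        probe = [rng.choice(col) for col in columns]
        # Hamming distance from every aligned sequence to the probe.
        dists = [sum(a != b for a, b in zip(seq, probe)) for seq in alignment]
        best = min(dists)
        nearest = [k for k, d in enumerate(dists) if d == best]
        # The nearest sequence gets the credit; ties share it equally.
        for k in nearest:
            credit[k] += 1.0 / len(nearest)

    # Normalize so the weights sum to 1.
    return [c / n_samples for c in credit]


if __name__ == "__main__":
    # Two identical sequences share a region, so each is down-weighted.
    print(voronoi_weights(["AAAA", "AAAA", "CCCC"]))
```

In this toy alignment the duplicated sequence is over-represented, so each copy should receive a smaller weight than the single distinct sequence, which is the correction for unequal representation that weighting is meant to provide.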

Selected References

These references are in PubMed. This may not be the complete list of references from this article.
  • Felsenstein J. Maximum-likelihood estimation of evolutionary trees from continuous characters. Am J Hum Genet. 1973 Sep;25(5):471–492.
  • Hogeweg P, Hesper B. The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J Mol Evol. 1984;20(2):175–186.
  • Lipman DJ, Altschul SF, Kececioglu JD. A tool for multiple sequence alignment. Proc Natl Acad Sci U S A. 1989 Jun;86(12):4412–4415.
  • Gribskov M, McLachlan AD, Eisenberg D. Profile analysis: detection of distantly related proteins. Proc Natl Acad Sci U S A. 1987 Jul;84(13):4355–4358.
  • Zvelebil MJ, Barton GJ, Taylor WR, Sternberg MJ. Prediction of protein secondary structure and active sites using the alignment of homologous sequences. J Mol Biol. 1987 Jun 20;195(4):957–961.
  • Barton GJ, Sternberg MJ. A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons. J Mol Biol. 1987 Nov 20;198(2):327–337.
  • Higgins DG, Sharp PM. Fast and sensitive multiple sequence alignments on a microcomputer. Comput Appl Biosci. 1989 Apr;5(2):151–153.
  • Vingron M, Argos P. A fast and sensitive multiple sequence alignment algorithm. Comput Appl Biosci. 1989 Apr;5(2):115–121.
  • Taylor WR. A flexible method to align large numbers of biological sequences. J Mol Evol. 1988 Dec;28(1-2):161–169.
  • Corpet F. Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res. 1988 Nov 25;16(22):10881–10890.
  • Sibbald PR, Argos P. Weighting aligned protein or nucleic acid sequences to correct for unequal representation. J Mol Biol. 1990 Dec 20;216(4):813–818.
  • Altschul SF, Carroll RJ, Lipman DJ. Weights for data related by a tree. J Mol Biol. 1989 Jun 20;207(4):647–653.
  • Sander C, Schneider R. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins. 1991;9(1):56–68.
  • Eigen M, Winkler-Oswatitsch R, Dress A. Statistical geometry in sequence space: a method of quantitative comparative sequence analysis. Proc Natl Acad Sci U S A. 1988 Aug;85(16):5913–5917.
  • van Heel M. A new family of powerful multivariate statistical sequence analysis techniques. J Mol Biol. 1991 Aug 20;220(4):877–887.
  • Higgins DG. Sequence ordinations: a multivariate analysis approach to analysing large sequence data sets. Comput Appl Biosci. 1992 Feb;8(1):15–22.
  • Altschul SF, Lipman DJ. Equal animals. Nature. 1990 Dec 6;348(6301):493–494.
  • Bandelt HJ, Dress AW. Weak hierarchies associated with similarity measures--an additive clustering technique. Bull Math Biol. 1989;51(1):133–166.
  • Li WH. Simple method for constructing phylogenetic trees from distance matrices. Proc Natl Acad Sci U S A. 1981 Feb;78(2):1085–1089.
  • Hein J. Unified approach to alignment and phylogenies. Methods Enzymol. 1990;183:626–645.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
