Proc Natl Acad Sci U S A. 2014 Jul 22;111(29):10556-61. doi: 10.1073/pnas.1405628111. Epub 2014 Jul 7.

Simple chained guide trees give high-quality protein multiple sequence alignments.

Author information

1 Conway Institute of Biomolecular and Biomedical Research, and UCD School of Medicine and Medical Science, University College Dublin, Dublin 4, Ireland.
2 Conway Institute of Biomolecular and Biomedical Research, and UCD School of Medicine and Medical Science, University College Dublin, Dublin 4, Ireland. des.higgins@ucd.ie

Abstract

Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random.
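To make the idea of a chained guide tree concrete, the following is a minimal sketch, not the authors' implementation. It assumes the tree is written in Newick format and that sequence names contain no Newick special characters; the function name `chained_guide_tree` and its parameters are hypothetical. A chained tree adds one sequence at a time to a single growing cluster, so it needs no pairwise distances, and the order can be the input order or a random shuffle.

```python
import random


def chained_guide_tree(names, shuffle=False, seed=None):
    """Return a Newick string for a fully chained (caterpillar) tree.

    Hypothetical helper for illustration: the first two names form a
    pair, and every further name is joined to the growing cluster.
    """
    if len(names) < 2:
        raise ValueError("need at least two sequence names")

    order = list(names)
    if shuffle:
        # Random order, optionally reproducible through the seed.
        random.Random(seed).shuffle(order)

    # Build iteratively to avoid recursion limits on large inputs.
    tree = f"({order[0]},{order[1]})"
    for name in order[2:]:
        tree = f"({tree},{name})"
    return tree + ";"


if __name__ == "__main__":
    print(chained_guide_tree(["seqA", "seqB", "seqC", "seqD"]))
    print(chained_guide_tree(["seqA", "seqB", "seqC", "seqD"], shuffle=True, seed=1))
```

The construction is linear in the number of sequences, which is consistent with the abstract's point that these are the fastest and simplest guide trees to build. How such a tree is supplied to a particular alignment package is outside what the abstract describes.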

KEYWORDS:

Clustal; Mafft; Muscle; PFAM

PMID: 25002495
PMCID: PMC4115562
DOI: 10.1073/pnas.1405628111
[Indexed for MEDLINE]
Free PMC Article