Display Settings:

Format

Send to:

Choose Destination
We are sorry, but NCBI web applications do not support your browser and may not function properly. More information
Bioinformatics. 2007 Jul 1;23(13):1588-98. Epub 2007 Apr 25.

Murlet: a practical multiple alignment tool for structural RNA sequences.

Author information

  • 1Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan. kiryu-h@aist.go.jp

Abstract

MOTIVATION:

Structural RNA genes exhibit unique evolutionary patterns that are designed to conserve their secondary structures; these patterns should be taken into account while constructing accurate multiple alignments of RNA genes. The Sankoff algorithm is a natural alignment algorithm that includes the effect of base-pair covariation in the alignment model. However, the extremely high computational cost of the Sankoff algorithm precludes its application to most RNA sequences.

RESULTS:

We propose an efficient algorithm for the multiple alignment of structural RNA sequences. Our algorithm is a variant of the Sankoff algorithm, and it uses an efficient scoring system that reduces the time and space requirements considerably without compromising on the alignment quality. First, our algorithm computes the match probability matrix that measures the alignability of each position pair between sequences as well as the base pairing probability matrix for each sequence. These probabilities are then combined to score the alignment using the Sankoff algorithm. By itself, our algorithm does not predict the consensus secondary structure of the alignment but uses external programs for the prediction. We demonstrate that both the alignment quality and the accuracy of the consensus secondary structure prediction from our alignment are the highest among the other programs examined. We also demonstrate that our algorithm can align relatively long RNA sequences such as the eukaryotic-type signal recognition particle RNA that is approximately 300 nt in length; multiple alignment of such sequences has not been possible by using other Sankoff-based algorithms. The algorithm is implemented in the software named 'Murlet'.

AVAILABILITY:

The C++ source code of the Murlet software and the test dataset used in this study are available at http://www.ncrna.org/papers/Murlet/.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID:
17459961
[PubMed - indexed for MEDLINE]
Free full text
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Icon for HighWire
    Loading ...
    Write to the Help Desk