Format

Send to

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2003;19 Suppl 1:i232-40.

Pair hidden Markov models on tree structures.

Author information

1
Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan. yasu@bio.keio.ac.jp

Abstract

MOTIVATION:

Computationally identifying non-coding RNA regions on the genome has much scope for investigation and is essentially harder than gene-finding problems for protein-coding regions. Since comparative sequence analysis is effective for non-coding RNA detection, efficient computational methods are expected for structural alignments of RNA sequences. On the other hand, Hidden Markov Models (HMMs) have played important roles for modeling and analysing biological sequences. Especially, the concept of Pair HMMs (PHMMs) have been examined extensively as mathematical models for alignments and gene finding.

RESULTS:

We propose the pair HMMs on tree structures (PHMMTSs), which is an extension of PHMMs defined on alignments of trees and provides a unifying framework and an automata-theoretic model for alignments of trees, structural alignments and pair stochastic context-free grammars. By structural alignment, we mean a pairwise alignment to align an unfolded RNA sequence into an RNA sequence of known secondary structure. First, we extend the notion of PHMMs defined on alignments of 'linear' sequences to pair stochastic tree automata, called PHMMTSs, defined on alignments of 'trees'. The PHMMTSs provide various types of alignments of trees such as affine-gap alignments of trees and an automata-theoretic model for alignment of trees. Second, based on the observation that a secondary structure of RNA can be represented by a tree, we apply PHMMTSs to the problem of structural alignments of RNAs. We modify PHMMTSs so that it takes as input a pair of a 'linear' sequence and a 'tree' representing a secondary structure of RNA to produce a structural alignment. Further, the PHMMTSs with input of a pair of two linear sequences is mathematically equal to the pair stochastic context-free grammars. We demonstrate some computational experiments to show the effectiveness of our method for structural alignments, and discuss a complexity issue of PHMMTSs.

PMID:
12855464
[Indexed for MEDLINE]
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Loading ...
    Support Center