J Comput Biol. 2009 Jan;16(1):43-66. doi: 10.1089/cmb.2008.0028.

Detecting alternative gene structures from spliced ESTs: a computational approach.

Author information

  • Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milano, Italy. bonizzoni@disco.unimib.it

Abstract

Alternative splicing (AS) is currently considered one of the main mechanisms able to explain the large gap between the number of predicted genes and the high complexity of the human proteome. The rapid growth of Expressed Sequence Tag (EST) data has encouraged the development of computational methods that predict alternative splicing by analyzing alignments of ESTs to genome sequences. EST data are also a valuable source for reconstructing the different transcript isoforms that derive from the same gene structure as a consequence of AS, since EST sequences are obtained by fragmenting mRNAs from the same gene. Recent studies on the detection of alternative splice sites have shown that this is a challenging computational problem that is far from solved. In this paper, we investigate the main computational issues in detecting alternative splicing and analyze algorithmic solutions. We first formalize an optimization problem related to the prediction of constitutive and alternative splice sites from EST sequences, the Minimum Exons ESTs Factorization problem (MEF for short), and show that it is NP-hard, even for restricted instances. This problem leads us to define sets of spliced ESTs, that is, sets of ESTs factorized into their constitutive exons with respect to a gene. We then investigate the computational problem of predicting transcript isoforms from spliced EST sequences, and propose a graph algorithm for it that is linear in the number of predicted isoforms and the size of the graph. Finally, we perform an experimental analysis of the method to assess the reliability of its predictions.
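
The idea of predicting isoforms from spliced ESTs with a graph can be illustrated with a small sketch. This is not the algorithm from the paper, which the abstract does not describe in detail. It is a generic illustration under stated assumptions: each spliced EST is given as a list of exons in genomic order, each exon is a hypothetical (start, end) coordinate pair, an edge joins two exons that are adjacent in at least one EST, and every source-to-sink path is reported as a candidate isoform. The function names and the example coordinates are made up for this illustration.

```python
from collections import defaultdict


def build_splicing_graph(spliced_ests):
    """Build a directed graph over exons.

    Assumption: each EST is a list of (start, end) exons in genomic order,
    so every edge points downstream and the graph is acyclic.
    """
    successors = defaultdict(set)
    nodes = set()
    has_predecessor = set()
    for exons in spliced_ests:
        nodes.update(exons)
        # Exons adjacent in an EST are joined by an edge.
        for left, right in zip(exons, exons[1:]):
            successors[left].add(right)
            has_predecessor.add(right)
    # Sources are exons that never follow another exon in any EST.
    sources = sorted(nodes - has_predecessor)
    return successors, sources


def enumerate_isoforms(spliced_ests):
    """List every source-to-sink path as a candidate isoform."""
    successors, sources = build_splicing_graph(spliced_ests)
    isoforms = []

    def extend(path):
        next_exons = sorted(successors.get(path[-1], ()))
        if not next_exons:
            # No outgoing edge: the path ends at a sink exon.
            isoforms.append(tuple(path))
            return
        for exon in next_exons:
            path.append(exon)
            extend(path)
            path.pop()

    for source in sources:
        extend([source])
    return isoforms


# Hypothetical exons and ESTs, invented for this example.
A, B, C, D = (100, 200), (300, 400), (500, 600), (700, 800)
example_ests = [[A, B, C], [A, C], [C, D]]

for isoform in enumerate_isoforms(example_ests):
    print(isoform)
```

In this toy input, one EST joins exon A directly to exon C while another passes through B, so the enumeration yields one candidate that includes B and one that skips it. Because every exon without an outgoing edge ends a path, the traversal never reaches a dead end, and the work done is proportional to the total length of the reported paths.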

PMID: 19119993 [PubMed - indexed for MEDLINE]