Format

Send to

Choose Destination
Bioinformatics. 2011 Apr 1;27(7):946-52. doi: 10.1093/bioinformatics/btr037. Epub 2011 Jan 25.

Classifying short gene expression time-courses with Bayesian estimation of piecewise constant functions.

Author information

1
Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany. hafemeis@molgen.mpg.de

Abstract

MOTIVATION:

Analyzing short time-courses is a frequent and relevant problem in molecular biology, as, for example, 90% of gene expression time-course experiments span at most nine time-points. The biological or clinical questions addressed are elucidating gene regulation by identification of co-expressed genes, predicting response to treatment in clinical, trial-like settings or classifying novel toxic compounds based on similarity of gene expression time-courses to those of known toxic compounds. The latter problem is characterized by irregular and infrequent sample times and a total lack of prior assumptions about the incoming query, which comes in stark contrast to clinical settings and requires to implicitly perform a local, gapped alignment of time series. The current state-of-the-art method (SCOW) uses a variant of dynamic time warping and models time series as higher order polynomials (splines).

RESULTS:

We suggest to model time-courses monitoring response to toxins by piecewise constant functions, which are modeled as left-right Hidden Markov Models. A Bayesian approach to parameter estimation and inference helps to cope with the short, but highly multivariate time-courses. We improve prediction accuracy by 7% and 4%, respectively, when classifying toxicology and stress response data. We also reduce running times by at least a factor of 140; note that reasonable running times are crucial when classifying response to toxins. In conclusion, we have demonstrated that appropriate reduction of model complexity can result in substantial improvements both in classification performance and running time.

AVAILABILITY:

A Python package implementing the methods described is freely available under the GPL from http://bioinformatics.rutgers.edu/Software/MVQueries/.

PMID:
21266444
DOI:
10.1093/bioinformatics/btr037
[Indexed for MEDLINE]

Supplemental Content

Full text links

Icon for Silverchair Information Systems
Loading ...
Support Center