Bioinformatics. 1999 Dec;15(12):987-93.

Modeling and predicting transcriptional units of Escherichia coli genes using hidden Markov models.

Author information

  • Genomic Sciences Center, RIKEN, c/o Laboratory of Genome Database, Human Genome Center, Institute of Medical Science, University of Tokyo, Japan.



The hidden Markov model (HMM) is a valuable technique for gene-finding, particularly because its flexibility allows various sequence features to be included. Recent bacterial gene-finding programs exploit this by incorporating ribosomal binding site (RBS) information to improve the recognition accuracy of the start codon. Here we report our attempt to extend the model to the entire transcriptional unit, enabling the prediction of operon structures.
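
As an illustration of how an HMM assigns labels to a sequence, the following is a minimal Python sketch of Viterbi decoding over a toy two-state model. The state names, all probabilities, and the function `viterbi` are hypothetical choices made for this example; they are not the models reported here, which also cover the RBS, its spacer, and several CDS classes.

```python
import math

# Toy two-state HMM. All parameters are hypothetical, chosen only so that
# GC-rich stretches favour "coding" and AT-rich stretches favour "noncoding".
STATES = ("coding", "noncoding")
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "noncoding": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(seq):
    """Return the most probable state path for seq under the toy HMM."""
    if not seq:
        return []
    # score[s] = best log-probability of any path that ends in state s
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for sym in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][sym])
            )
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Calling `viterbi("GCGCGCGCGCATATATATAT")` labels each base with one of the two states. Working in log space avoids numerical underflow on long sequences, which matters at genome scale.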


First, we improved the prediction accuracy of coding sequences (CDSs) by employing models of 'typical', 'atypical' and 'negative (false-positive)' classes, as well as models of the RBS and its downstream spacer. In an objective test, the sensitivity of exactly predicting the 204 experimentally confirmed CDSs reached 90.2%. Based on the CDS predictions, the positions of promoters and terminators were then predicted. Our model exactly recognized 60% of 390 known transcriptional units. Thus, this prediction problem remains far from trivial in both accuracy and significance. We propose it as an open theme in bioinformatics, because ongoing and planned post-sequencing projects will produce much data for future improvements.
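
To make the notion of exact prediction concrete, here is a small Python sketch of an exact-match sensitivity calculation. It assumes that a feature counts as recovered only when its start, end, and strand all agree with the reference; that criterion, the function name `exact_match_sensitivity`, and the coordinates in the example are assumptions made for illustration, not details taken from the evaluation above.

```python
def exact_match_sensitivity(known, predicted):
    """Fraction of known (start, end, strand) features reproduced exactly."""
    known = set(known)
    if not known:
        raise ValueError("no known features to evaluate against")
    # A prediction counts only if every field matches a known feature.
    return len(known & set(predicted)) / len(known)


# Hypothetical coordinates, for illustration only.
known_cds = [(100, 400, "+"), (500, 900, "-"), (1000, 1300, "+"), (1500, 1800, "+")]
predicted_cds = [(100, 400, "+"), (503, 900, "-"), (1000, 1300, "+")]
sensitivity = exact_match_sensitivity(known_cds, predicted_cds)
```

In this example the second prediction differs from the reference by three bases at one end, so it does not count, and the fourth known feature is not predicted at all. The same measure applies to transcriptional units if each unit is represented by its boundary coordinates.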

[PubMed - indexed for MEDLINE]