J Mol Biol. 2000 Mar 31;297(3):599-606.

Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel context analysis approach.

Author information

1
Institute of Mammalian Genetics, GSF-National Research Center for Environment and Health, Ingolstädter Landstrasse 1, Neuherberg, D-85758, Germany.

Abstract

We present a new algorithm called PromoterInspector to locate eukaryotic polymerase II promoter regions in large genomic sequences with a high degree of specificity. PromoterInspector focuses on the genetic context of promoters, rather than their exact location. Application of PromoterInspector can serve as a crucial pre-processing step for other methods that locate promoters exactly or analyze them. PromoterInspector does not depend on heuristics, because it is purely based on libraries of IUPAC words extracted from training sequences by an unsupervised learning approach. We compared PromoterInspector to in silico promoter prediction tools using the sequences from the review by J.W. Fickett. PromoterInspector compared favourably on Fickett's evaluation scheme. A true positive to false positive ratio of 2.3 was obtained, surpassing the best ratio of 0.6, reported for TSSG. The application of our method to several large genomic sequences of over 1.3 million base-pairs in total resulted in even more specific predictions. The coverage of annotated promoters was comparable to other in silico promoter prediction methods, while the true positive predictions increased by up to 100% of total matches. PromoterInspector scans 100 kb in less than one minute on a workstation, and thus is especially applicable for large genome analysis. The method is available at http://genomatix.gsf.de/cgi-bin/promoterinspector/promoterinspector.pl.
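The abstract does not spell out how the IUPAC word libraries are built or scored, so the following is only a minimal sketch of what matching IUPAC words against a sequence can look like. It is not PromoterInspector's algorithm. The word list, window size, step size, and function names (`iupac_to_regex`, `count_hits`, `window_scores`) are all hypothetical choices made for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def iupac_to_regex(word):
    """Compile an IUPAC word into a lookahead regex so overlapping hits count."""
    parts = []
    for code in word.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("(?=" + "".join(parts) + ")")


def count_hits(sequence, words):
    """Count occurrences (overlaps included) of every word in the sequence."""
    seq = sequence.upper()
    return sum(len(iupac_to_regex(w).findall(seq)) for w in words)


def window_scores(sequence, words, window=100, step=50):
    """Return (start, hit_count) for each sliding window (hypothetical sizes)."""
    last_start = max(len(sequence) - window, 0)
    return [
        (start, count_hits(sequence[start:start + window], words))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical one-word "library"; W stands for A or T.
    library = ["TATAWA"]
    for start, hits in window_scores("TATAAA" + "C" * 6, library, window=6, step=6):
        print(start, hits)
```

A real context-based predictor would need word libraries learned from training sequences and a decision rule over the window scores, neither of which is described in the abstract.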

PMID: 10731414
DOI: 10.1006/jmbi.2000.3589
[Indexed for MEDLINE]
