J Mol Biol. 1995 Jun 23;249(5):923-32.

Predicting Pol II promoter sequences using transcription factor binding sites.

Author information

Molecular Biology Computing Center, University of Minnesota, St Paul 55108, USA.

Abstract

A computer program, PROMOTER SCAN, has been developed to recognize a high percentage of Pol II promoter sequences while allowing only a small rate of false positives. A total of 167 primate Pol II promoter sequences, obtained from the Eukaryotic Promoter Database, and 999 primate non-promoter sequences, obtained from the GenBank sequence databank, were used in the analysis. Both promoter and non-promoter sequences were analyzed for the comparative density of each unique mammalian transcription factor binding site listed in the Ghosh Transcription Factor Database. The density of each of these binding sites was then used to derive a ratio of density of each transcriptional element in promoter compared to non-promoter sequences. The combined individual density ratios of all binding sites were then collectively used to build a scoring profile called the Promoter Recognition Profile. This profile, used in combination with a weighted matrix for scoring a TATA box, was then used by the PROMOTER SCAN program to test the prediction of promoter sequences and the ability of the computer program to discriminate them from non-promoter sequences. When the promoter cutoff score was set so that 70% of promoters were recognized correctly by the program, a false positive rate of about 1/5600 bases was observed in the non-promoter sequence set. PROMOTER SCAN is now being developed for public distribution.
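The density-ratio idea described in the abstract can be illustrated with a minimal sketch. It is not the PROMOTER SCAN implementation: the function names, the two toy motifs, the pseudocount, exact string matching, and the choice to sum ratios into a window score are all assumptions made for illustration, and the TATA box weight matrix is omitted entirely.

```python
from collections import Counter


def count_sites(sequences, motifs):
    """Count (overlapping) exact matches of each motif and the total bases scanned."""
    counts = Counter()
    total_bases = 0
    for seq in sequences:
        seq = seq.upper()
        total_bases += len(seq)
        for name, pattern in motifs.items():
            start = seq.find(pattern)
            while start != -1:
                counts[name] += 1
                start = seq.find(pattern, start + 1)
    return counts, total_bases


def density_ratios(promoters, non_promoters, motifs, pseudocount=1.0):
    """Ratio of per-base site density in promoters vs. non-promoters.

    The pseudocount is an assumption added here to avoid division by zero;
    the abstract does not say how zero counts were handled.
    """
    p_counts, p_total = count_sites(promoters, motifs)
    n_counts, n_total = count_sites(non_promoters, motifs)
    ratios = {}
    for name in motifs:
        p_density = (p_counts[name] + pseudocount) / p_total
        n_density = (n_counts[name] + pseudocount) / n_total
        ratios[name] = p_density / n_density
    return ratios


def score_window(window, motifs, ratios):
    """Score a window by summing the ratios of motifs it contains (assumed rule)."""
    window = window.upper()
    return sum(ratios[name] for name, pattern in motifs.items() if pattern in window)


# Hypothetical toy data; real input would be curated promoter and
# non-promoter sets plus a full binding-site list.
motifs = {"GC_box": "GGGCGG", "CCAAT_box": "CCAAT"}
promoters = ["TTGGGCGGAACCAATTT", "GGGCGGTTTTGGGCGGA"]
non_promoters = ["ATATATATCCAATATAT", "TTTTTTTTTTTTTTTTT"]

profile = density_ratios(promoters, non_promoters, motifs)
print(profile)
print(score_window("AAGGGCGGAA", motifs, profile))
```

A window would then be called a promoter when its score exceeds a cutoff, with the cutoff chosen to trade sensitivity against the false positive rate, as the abstract describes for the 70% recognition setting.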

PMID: 7791218
DOI: 10.1006/jmbi.1995.0349
[Indexed for MEDLINE]
