Nucleic Acids Res. Jan 25, 1991; 19(2): 313–318.
PMCID: PMC333596

Training back-propagation neural networks to define and detect DNA-binding sites.

Abstract

A three-layered back-propagation neural network was trained to recognize E. coli promoters of the 17-base spacing class. To this end, the network was presented with 39 promoter sequences and derivatives of those sequences as positive inputs; random sequences of 60% A + T content and sequences containing two promoter-down point mutations were used as negative inputs. The entire promoter sequence of 58 bases, approximately -50 to +8, was entered as input. The network was asked to associate an output of 1.0 with promoter sequence input and 0.0 with non-promoter input. Generally, after 100,000 input cycles, the network was virtually perfect in classifying the training set. A trained network was about 80% effective in recognizing 'new' promoters that were not in the training set, with a false positive rate below 0.1%. Network searches on pBR322 and on the lambda genome were also performed. Overall, the results were somewhat better than those of the best rule-based procedures. The trained network can be analyzed for both its choice of base and its relative weighting, positive and negative, at each position of the sequence. This method, which requires only appropriate input/output training pairs, can be used to define and search for any DNA regulatory sequence for which there are sufficient exemplars.
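The training setup described in the abstract can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the abstract gives only the 58-base window, the 1.0/0.0 targets, and the 60% A + T negatives. The one-hot encoding, hidden layer size, learning rate, the hexamer offsets, and the names `ThreeLayerNet`, `toy_promoter`, `demo`, and `scan` are all assumptions made here for illustration, and the training data are synthetic, not the 39 promoters of the paper.

```python
import math
import random

BASES = "ACGT"
WINDOW = 58  # input window length stated in the abstract


def encode(seq):
    """One-hot encode a DNA string, 4 inputs per base (assumed encoding)."""
    return [1.0 if base == b else 0.0 for base in seq.upper() for b in BASES]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input layer -> sigmoid hidden layer -> single sigmoid output unit."""

    def __init__(self, n_in, n_hidden=4, seed=0):
        rng = random.Random(seed)
        # The last entry of every weight row is the bias term.
        self.w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden))
                      + self.w_out[-1])
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One back-propagation update on squared error for one example."""
        hidden, out = self.forward(x)
        delta_out = (target - out) * out * (1.0 - out)
        # Hidden deltas are computed before the output weights change.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] += lr * delta_out * h
        self.w_out[-1] += lr * delta_out
        for row, d in zip(self.w_hidden, delta_hidden):
            for i, v in enumerate(x):
                if v:  # only active one-hot inputs contribute
                    row[i] += lr * d
            row[-1] += lr * d

    def score(self, seq):
        return self.forward(encode(seq))[1]


def random_background(rng):
    """Random 58-mer with 60% A+T, like the abstract's negative inputs."""
    return "".join(rng.choices(BASES, weights=[0.3, 0.2, 0.2, 0.3], k=WINDOW))


def toy_promoter(rng):
    """Synthetic positive: consensus hexamers 17 bases apart (toy data)."""
    s = list(random_background(rng))
    s[15:21] = "TTGACA"  # -35 hexamer; the offset is an illustrative guess
    s[38:44] = "TATAAT"  # -10 hexamer, after a 17-base spacer
    return "".join(s)


def demo(epochs=100, n_each=10, seed=1):
    """Train on synthetic data; return the net and mean scores per class."""
    rng = random.Random(seed)
    positives = [toy_promoter(rng) for _ in range(n_each)]
    negatives = [random_background(rng) for _ in range(n_each)]
    data = ([(encode(s), 1.0) for s in positives]
            + [(encode(s), 0.0) for s in negatives])
    net = ThreeLayerNet(4 * WINDOW)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, target in data:
            net.train_step(x, target)
    mean_pos = sum(net.score(s) for s in positives) / n_each
    mean_neg = sum(net.score(s) for s in negatives) / n_each
    return net, mean_pos, mean_neg


def scan(net, sequence, threshold=0.9):
    """Slide the 58-base window along a longer sequence; return hit offsets."""
    return [i for i in range(len(sequence) - WINDOW + 1)
            if net.score(sequence[i:i + WINDOW]) >= threshold]
```

Calling `demo()` trains the toy network and returns it with the mean output for each class; `scan(net, sequence)` then mimics a genome search by scoring every 58-base window, with the 0.9 threshold again being an arbitrary choice rather than a value from the paper.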



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
