Biosystems. 1993;29(2-3):87-104.

A linguistic representation of the regulation of transcription initiation. I. An ordered array of complex symbols with distinctive features.

Author information

Centro de Investigación sobre Fijación de Nitrógeno, Universidad Nacional Autónoma de México, Cuernavaca, Morelos.


The inadequacy of context-free grammars for describing the regulatory information contained in DNA provided the formal justification for a linguistic approach to the study of gene regulation. Building on that result, we have begun a linguistic formalization of the regulatory arrays of 107 sigma 70 E. coli promoters. The complete sequences of promoters (Pr), operators (Op) and activator binding sites (I) were previously identified as the smallest elements, or categories, for a combinatorial analysis of the range of transcription initiation of sigma 70 promoters. These categories are conceptually equivalent to the phonemes of natural language. A complete description of the regulatory arrays of promoters requires several features associated with these categories, so the properties pertinent to describing such regulatory regions must be selected and represented appropriately. In this paper we define distinctive features of regulatory regions based on the following criteria: identification of subclasses of substitutable elements, simplicity, selection of the most directly related information, and distinction of one array from the whole set of promoters. Alternative ways to represent the distances between regulatory sites are discussed; together with a principle of precedence, these permit the identification of an ordered set of complex symbols as a unique representation of a promoter and its associated regulatory sites. Additional distinctive features of promoters and regulatory sites are identified in the accompanying paper.
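As a rough illustration of what "an ordered set of complex symbols" could look like as a data structure, the Python sketch below pairs each category (Pr, Op, I) with a position and a bundle of features, then sorts the symbols into a single canonical order. The abstract does not specify the feature set, the distance representation or the precedence rule, so the `position` field, the `features` tuple, the sorting rule and the example coordinates are all assumptions made for illustration, not the authors' formalism.

```python
from dataclasses import dataclass

# The three categories named in the abstract.
CATEGORIES = {"Pr", "Op", "I"}


@dataclass(frozen=True)
class ComplexSymbol:
    category: str         # "Pr", "Op" or "I"
    position: int         # hypothetical: site position relative to transcription start
    features: tuple = ()  # hypothetical distinctive features as (name, value) pairs

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


def ordered_array(symbols):
    """Return the symbols in one canonical order.

    Assumed precedence rule (not from the paper): upstream sites come
    first, and ties are broken alphabetically by category name.
    """
    return tuple(sorted(symbols, key=lambda s: (s.position, s.category)))


def as_string(array):
    """Render an ordered array as a compact string of symbols."""
    return " ".join(f"{s.category}[{s.position:+d}]" for s in array)


# Illustrative coordinates only; they are not taken from any real promoter.
example = [
    ComplexSymbol("Op", 5),
    ComplexSymbol("Pr", -20),
    ComplexSymbol("I", -60),
]
print(as_string(ordered_array(example)))  # I[-60] Pr[-20] Op[+5]
```

Because the ordering depends only on the symbols themselves, any listing of the same sites yields the same array, which is the sense in which such a representation can be unique for a given promoter.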

[Indexed for MEDLINE]
