Pharmacogenomics. 2001 Feb;2(1):25-36.

Cluster analysis and promoter modelling as bioinformatics tools for the identification of target genes from expression array data.

Author information

Genomatix Software GmbH, Karlstrasse 55, D-80333 Munich, Germany.


Expression arrays yield enormous amounts of data linking genes, via their cDNA sequences, to gene expression patterns. This now allows the characterisation of gene expression in normal and diseased tissues, as well as the response of tissues to the application of therapeutic reagents. Expression array data can be analysed with respect to the underlying protein sequences, which facilitates the precise determination of when and where certain groups of genes are expressed. More recent developments of clustering algorithms take additional parameters of the experimental set-up into account, focusing more directly on co-regulated sets of genes. However, the information concerning the transcriptional regulatory networks responsible for the observed expression patterns is not contained within the cDNA sequences used to generate the arrays. Regulation of expression is determined to a large extent by the promoter sequences of the individual genes (and/or enhancers). The complete sequence of the human genome now provides the molecular basis for the identification of many regulatory regions. Promoter sequences for specific cDNAs can be obtained reliably from genomic sequences by exon mapping. In the many cases in which cDNAs are 5'-incomplete, high-quality promoter prediction tools can be used to locate promoters directly in the genomic sequence. Once sufficient numbers of promoter sequences have been obtained, a comparative promoter analysis of the co-regulated genes and groups of genes can be applied in order to generate models describing the higher order levels of transcription factor binding site organisation within these promoter regions. Such modules represent the molecular mechanisms through which regulatory networks influence gene expression, and candidates can be determined solely by bioinformatics.
This approach also provides a powerful alternative for elucidating the functional features of genes with no detectable sequence similarity, by linking them to other genes on the basis of their common promoter structures.
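
The two-step idea described above (group co-regulated genes, then look for shared binding site organisation in their promoters) can be illustrated with a minimal Python sketch. This is not the authors' method or software: the greedy correlation grouping, the function names, the expression values, the promoter sequences and the 50 bp spacing limit are all assumptions made for illustration, and the simplified consensus patterns stand in for proper weight-matrix searches.

```python
import re
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpressed_groups(profiles, threshold=0.9):
    """Greedy grouping (an assumed stand-in for a real clustering method):
    a gene joins the first group whose seed profile it correlates with at
    or above `threshold`; otherwise it seeds a new group."""
    groups = []  # list of (seed_gene, members)
    for gene, profile in profiles.items():
        for seed, members in groups:
            if pearson(profiles[seed], profile) >= threshold:
                members.append(gene)
                break
        else:
            groups.append((gene, [gene]))
    return [members for _, members in groups]


def site_positions(sequence, motifs):
    """Map each motif name to every start position where its pattern matches.
    The lookahead wrapper lets overlapping matches be reported too."""
    return {name: [m.start() for m in re.finditer(f"(?={pattern})", sequence)]
            for name, pattern in motifs.items()}


def common_modules(promoters, motifs, max_gap=50):
    """Return ordered motif pairs (a, b) present in every promoter, with b
    starting 1..max_gap bases downstream of the start of a."""
    shared = None
    for seq in promoters.values():
        pos = site_positions(seq, motifs)
        found = {(a, b) for a, b in permutations(motifs, 2)
                 if any(0 < q - p <= max_gap for p in pos[a] for q in pos[b])}
        shared = found if shared is None else shared & found
    return shared or set()


# Hypothetical expression profiles (one value per array experiment).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
# Hypothetical promoter sequences, invented for this example.
promoters = {
    "geneA": "AATGACTCATTTTGGGCGGAA",
    "geneB": "CCTGAGTCACCGGGCGGTT",
    "geneC": "TTTTTTTTTTTTTTTTTTTT",
}
# Simplified consensus patterns, used here only as illustrative motifs.
motifs = {"AP1": "TGA[CG]TCA", "SP1": "GGGCGG"}

groups = coexpressed_groups(profiles)
first_group = {gene: promoters[gene] for gene in groups[0]}
modules = common_modules(first_group, motifs)
print(groups)
print(modules)
```

In this toy data set, geneA and geneB share both an expression pattern and an ordered pair of sites, which is the kind of candidate module the abstract refers to. Real analyses would use statistically evaluated matrix matches and account for strand orientation and spacing ranges.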

[Indexed for MEDLINE]
