Knowledge and Information Systems Division, QinetiQ Ltd., St. Andrews Road, Malvern, Worcestershire, UK. dhoward@qinetiq.com
This paper develops an evolutionary method that learns inductively to recognize the makeup and position of very short consensus sequences, the cis-acting sites that are a typical feature of promoters in genomes. The method combines a Finite State Automaton (FSA) with Genetic Programming (GP) to discover candidate promoter sequences in primary sequence data. An experiment measures the success of the method at promoter prediction in the human genome. Methods of this class can jump across many base pairs at a time, which may enable them to process very long genomic sequences and so discover gene-specific cis-acting sites and genes that are regulated together.
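To make the automaton component concrete, the following is a minimal sketch of a deterministic finite state automaton that scans primary sequence data for one fixed short consensus sequence. It is an illustration only and not the method developed in this paper: the function names `build_motif_dfa` and `find_motif` are hypothetical, the motif is supplied by hand rather than learned, and the GP component that would evolve the automaton is not shown. The example motif TATAAA is the commonly cited TATA-box consensus.

```python
def build_motif_dfa(motif, alphabet="ACGT"):
    """Build a transition table for a fixed motif (hypothetical helper).

    State k means that the last k bases read equal motif[:k].
    """
    n = len(motif)
    table = []
    for state in range(n + 1):
        row = {}
        for base in alphabet:
            # Next state: the longest prefix of the motif that is also
            # a suffix of the text matched so far plus the new base.
            candidate = motif[:state] + base
            k = min(n, len(candidate))
            while k > 0 and not candidate.endswith(motif[:k]):
                k -= 1
            row[base] = k
        table.append(row)
    return table


def find_motif(sequence, motif):
    """Return the 0-based start positions of every occurrence of motif."""
    table = build_motif_dfa(motif)
    state = 0
    hits = []
    for i, base in enumerate(sequence.upper()):
        # Symbols outside the alphabet (for example N) reset the automaton.
        state = table[state].get(base, 0)
        if state == len(motif):
            hits.append(i - len(motif) + 1)
    return hits


if __name__ == "__main__":
    print(find_motif("GGTATAAAGGCTATAAAT", "TATAAA"))
```

The automaton reads each base exactly once, so the scan is linear in the sequence length. Overlapping occurrences are reported as well.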