Bioinformatics. 2000 Oct;16(10):932-40.

Finding pathogenicity islands and gene transfer events in genome data.

Author information

Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.



There is a growing literature on wavelet theory and wavelet methods showing improvements over more classical techniques, especially in smoothing and in extracting the fundamental components of signals. G+C patterns occur at different lengths (scales), and for this reason G+C plots are usually difficult to interpret. Current methods for genome analysis choose a window size and compute a χ² statistic on the average value of each window with respect to the whole genome.
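The fixed-window approach described above can be sketched roughly as follows. This is only an illustration: the abstract does not give the exact form of the statistic, so the two-category (G/C versus other) goodness-of-fit test, the non-overlapping windows, and the function names `gc_fraction` and `window_chi2` are all assumptions.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (other symbols count as non-G/C)."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def window_chi2(seq, window):
    """Chi-square of each non-overlapping window against the genome-wide G+C.

    Assumed statistic: a two-category goodness-of-fit test whose expected
    counts come from the whole-sequence G+C fraction.
    Returns a list of (start, window_gc_fraction, chi2) tuples.
    """
    seq = seq.upper()
    p = gc_fraction(seq)
    if p in (0.0, 1.0):
        raise ValueError("sequence has no G+C variation to test against")
    exp_gc = p * window          # expected G/C count per window
    exp_at = (1.0 - p) * window  # expected non-G/C count per window
    results = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        obs_gc = sum(1 for base in chunk if base in "GC")
        obs_at = window - obs_gc
        chi2 = (obs_gc - exp_gc) ** 2 / exp_gc + (obs_at - exp_at) ** 2 / exp_at
        results.append((start, obs_gc / window, chi2))
    return results


if __name__ == "__main__":
    # Toy sequence: an A+T background with a short G+C-rich insert.
    toy = "ATAT" * 25 + "GCGC" * 5 + "ATAT" * 25
    for start, gc, chi2 in window_chi2(toy, 20):
        print(start, round(gc, 2), round(chi2, 2))
```

The result depends directly on the `window` argument: an anomalous region much shorter or much longer than the window is diluted or split, which is the limitation the abstract points to.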


First, wavelets are used to smooth G+C profiles in order to locate characteristic patterns in genome sequences. The method is based on computing a χ² statistic on the wavelet coefficients of a profile; because the smoothing occurs at a set of different scales, no fixed window size has to be chosen. Second, a wavelet scalogram is used as a measure for comparing sequence profiles; this tool is very general and can be applied to other sequence profiles commonly used in genome analysis. We show applications to the analysis of Deinococcus radiodurans chromosome I, two strains of Helicobacter pylori (26695 and J99) and two strains of Neisseria meningitidis (serogroup B strain MC58 and serogroup A strain Z2491). We report a list of loci whose G+C content differs from that of nearby regions; the analysis of N. meningitidis serogroup B reveals two new large regions with low G+C content that are putative pathogenicity islands.
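As a rough illustration of the multi-scale idea, the sketch below decomposes a G+C profile with a Haar wavelet and summarizes it as a scalogram, taken here to mean the energy of the detail coefficients at each scale. The abstract does not name the wavelet family or define the comparison measure, so the Haar basis, the Euclidean distance between scalograms, and the names `haar_dwt`, `scalogram` and `scalogram_distance` are assumptions made for illustration; the statistical test on the coefficients is not reproduced.

```python
import math


def haar_dwt(profile):
    """Full orthonormal Haar decomposition of a profile.

    The profile length must be a power of two. Returns (approx, details),
    where details[0] holds the finest-scale coefficients and details[-1]
    the coarsest.
    """
    n = len(profile)
    if n == 0 or n & (n - 1):
        raise ValueError("profile length must be a power of two")
    root2 = math.sqrt(2.0)
    approx = [float(x) for x in profile]
    details = []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        detail = [(approx[i] - approx[i + 1]) / root2 for i in pairs]
        approx = [(approx[i] + approx[i + 1]) / root2 for i in pairs]
        details.append(detail)
    return approx, details


def scalogram(profile):
    """Energy (sum of squared detail coefficients) at each scale, finest first."""
    _, details = haar_dwt(profile)
    return [sum(c * c for c in level) for level in details]


def scalogram_distance(profile_a, profile_b):
    """Euclidean distance between two scalograms (an assumed comparison measure)."""
    sa, sb = scalogram(profile_a), scalogram(profile_b)
    if len(sa) != len(sb):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sa, sb)))


if __name__ == "__main__":
    # Toy G+C profile: one value per window, with a drop in the second half.
    profile = [0.5, 0.5, 0.5, 0.5, 0.3, 0.3, 0.3, 0.3]
    print(scalogram(profile))
```

In the toy profile, all of the detail energy falls at the coarsest scale, because the only change in G+C content spans half of the profile. A short anomalous region would instead show up at the finer scales, without a window size having been chosen in advance.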


Software and numerical results (profiles, scalograms, high and low frequency components) for all the genome sequences analyzed are available upon request from the authors.

[Indexed for MEDLINE]
