Bioinformatics. 2000 Oct;16(10):915-22.

CAST: an iterative algorithm for the complexity analysis of sequence tracts. Complexity analysis of sequence tracts.

Author information

Department of Cell Biology and Biophysics, Faculty of Biology, University of Athens, Athens GR-15701, Greece.



Sensitive detection and masking of low-complexity regions in protein sequences. Filtered sequences can be used in sequence comparison without the risk of matching compositionally biased regions. The main advantage of the method over similar approaches is the selective masking of single residue types without affecting other, possibly important, regions.


A novel algorithm for low-complexity region detection and selective masking. The algorithm is based on multiple-pass Smith-Waterman comparison of the query sequence against twenty homopolymers with infinite gap penalties. The output of the algorithm comprises both the masked query sequence for further analysis (e.g. database searches) and the regions of low complexity. The detection of low-complexity regions is highly specific for single residue types. It is shown that this approach is sufficient for masking database query sequences without generating false positives. The algorithm is benchmarked against widely available algorithms using the 210 genes of Plasmodium falciparum chromosome 2, a dataset known to contain a large number of low-complexity regions.
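
The iterative scheme described above can be illustrated with a minimal Python sketch. It is not the CAST implementation. It rests on one observation: with infinite gap penalties, a Smith-Waterman alignment against a homopolymer reduces to finding the maximum-scoring contiguous segment of the query. The match/mismatch scores, the threshold, the mask character and the function names are all assumptions chosen for illustration; a real implementation would presumably score with a substitution matrix.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty homopolymer residue types


def best_segment(seq, residue, match=2, mismatch=-1):
    """Gapless local alignment of seq against a homopolymer of `residue`.

    With gaps forbidden this is a maximum-scoring-segment search.
    Scores are toy values (assumption), not a real substitution matrix.
    Returns (score, start, end) with `end` exclusive.
    """
    best = (0, 0, 0)
    run_score, run_start = 0, 0
    for i, ch in enumerate(seq):
        if run_score <= 0:  # local alignment: restart when the run is not positive
            run_score, run_start = 0, i
        run_score += match if ch == residue else mismatch
        if run_score > best[0]:
            best = (run_score, run_start, i + 1)
    return best


def mask_low_complexity(seq, threshold=10, mask_char="X"):
    """Repeatedly mask the best-scoring single-residue region.

    `threshold` (hypothetical value, must be positive) stops the passes.
    Only residues of the detected type are masked inside each region.
    """
    seq = list(seq)
    regions = []
    while True:
        score, start, end, residue = max(
            (best_segment(seq, aa) + (aa,) for aa in AMINO_ACIDS),
            key=lambda hit: hit[0],
        )
        if score < threshold:
            break
        regions.append((residue, start, end, score))
        for i in range(start, end):
            if seq[i] == residue:  # selective masking: other residues survive
                seq[i] = mask_char
    return "".join(seq), regions


masked, regions = mask_low_complexity("MKTNNNNNANNNNNLVD")
print(masked)   # the alanine inside the asparagine tract is left unmasked
print(regions)
```

In this sketch each pass masks at least one residue, so the loop terminates once no homopolymer comparison reaches the threshold. The returned pair mirrors the two outputs named in the abstract: the masked query and the list of low-complexity regions.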


CAST (version 1.0) executable binaries are available to academic users free of charge under license. Web site entry point, server and additional material:

[Indexed for MEDLINE]
