J Comput Biol. 2008 Jul-Aug;15(6):593-609. doi: 10.1089/cmb.2008.0010.

Efficiently identifying max-gap clusters in pairwise genome comparison.

Author information

1
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA. xuling@uiuc.edu

Erratum in

  • J Comput Biol. 2008 Oct;15(8):1132. Han, Jaiwei [corrected to Han, Jiawei].

Abstract

The spatial clustering of genes across different genomes has been used to study important problems in comparative genomics, from the identification of operons to the detection of homologous regions. A set of formal models and algorithms for so-called max-gap clusters has been proposed recently. These algorithms guarantee the completeness of the results, and the simplicity of the model enables a rigorous statistical test of significance. These features overcome the weakness of many previous methods, which are often heuristic in nature. We developed an efficient algorithm to compute max-gap clusters in pairwise genome comparison. Under a number of different settings, our algorithm is an order of magnitude faster than the previous algorithm based on the same model. In our evaluation on two bacterial genomes, we showed that our method could identify known operons as well as some novel structures in the genome. We also demonstrated, through a comparison of the human and mouse genomes, that the current framework for conserved spatial clustering of genes can be used to detect homologous regions in higher organisms.
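The abstract does not define the max-gap model, so the following Python sketch only illustrates the general idea as it is commonly stated: a set of shared genes is a candidate cluster if, in each genome, consecutive members are separated by at most a fixed number of intervening genes. The function names, the dictionary representation of gene positions, and the convention of counting the gap as the number of intervening genes are all assumptions made for illustration. This is a naive check, not the efficient algorithm the paper describes, and it does not test maximality.

```python
from typing import Dict, Iterable, List


def split_into_chains(positions: Iterable[int], max_gap: int) -> List[List[int]]:
    """Split gene positions into runs in which consecutive positions
    have at most `max_gap` intervening genes (assumed gap convention)."""
    chains: List[List[int]] = []
    for pos in sorted(positions):
        # Number of genes strictly between this position and the previous one.
        if chains and pos - chains[-1][-1] - 1 <= max_gap:
            chains[-1].append(pos)
        else:
            chains.append([pos])
    return chains


def is_chain_in_both(
    genes: Iterable[str],
    pos_a: Dict[str, int],
    pos_b: Dict[str, int],
    max_gap: int,
) -> bool:
    """Return True if `genes` forms a single max-gap chain in both genomes.

    `pos_a` and `pos_b` are hypothetical maps from gene name to gene index
    in genome A and genome B. Maximality of the set is NOT checked.
    """
    gene_list = list(genes)
    if not gene_list:
        return False
    for index in (pos_a, pos_b):
        if any(g not in index for g in gene_list):
            return False
        if len(split_into_chains((index[g] for g in gene_list), max_gap)) != 1:
            return False
    return True


# Toy data, invented purely for illustration.
genome_a = {"a": 0, "b": 2, "c": 3, "d": 9}
genome_b = {"a": 5, "b": 4, "c": 7, "d": 8}

abc_ok = is_chain_in_both(["a", "b", "c"], genome_a, genome_b, max_gap=1)
abcd_ok = is_chain_in_both(["a", "b", "c", "d"], genome_a, genome_b, max_gap=1)
```

In this toy input, genes a, b, and c stay within one intervening gene of each other in both genomes, whereas adding d breaks the chain in genome A because five genes separate it from c.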

PMID: 18631023
DOI: 10.1089/cmb.2008.0010
[Indexed for MEDLINE]
