Analysis of homologous gene clusters in Caenorhabditis elegans reveals striking regional cluster domains

Genetics. 2006 Jan;172(1):127-43. doi: 10.1534/genetics.104.040030. Epub 2005 Nov 15.

Abstract

An algorithm for detecting local clusters of homologous genes was applied to the genome of Caenorhabditis elegans. Clusters of two or more homologous genes are abundant, totaling 1391 clusters containing 4607 genes, over one-fifth of all genes in C. elegans. Cluster genes are distributed unevenly in the genome, with the large majority located on autosomal chromosome arms, regions characterized by higher genetic recombination and more repeat sequences than autosomal centers and the X chromosome. Cluster genes are transcribed at much lower levels than average and very few have gross phenotypes as assayed by RNAi-mediated reduction of function. The molecular identity of cluster genes is unusual, with a preponderance of nematode-specific gene families that encode putative secreted and transmembrane proteins, and enrichment for genes implicated in xenobiotic detoxification and innate immunity. Gene clustering in Drosophila melanogaster is also substantial and the molecular identity of clustered genes follows a similar pattern. I hypothesize that autosomal chromosome arms in C. elegans undergo frequent local gene duplication and that these duplications support gene diversification and rapid evolution in response to environmental challenges. Although specific gene clusters have been documented in C. elegans, their abundance, genomic distribution, and unusual molecular identities were previously unrecognized.
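
The abstract does not spell out how the detection algorithm works. As a rough illustration of what "local clusters of homologous genes" could mean in practice, the sketch below groups genes that share a homology label and lie close together on the same chromosome. The `Gene` record, the `family` label, and the `max_gap` and `min_size` parameters are all hypothetical choices made for this example; they are not taken from the paper, and the actual method may differ substantially.

```python
from collections import defaultdict
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chromosome: str
    order: int    # rank of the gene along its chromosome (assumed input)
    family: str   # hypothetical homology-group label (assumed input)


def find_clusters(genes, max_gap=5, min_size=2):
    """Group same-family genes on one chromosome whose neighbouring
    members are separated by at most `max_gap` intervening genes.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    by_key = defaultdict(list)
    for g in genes:
        by_key[(g.chromosome, g.family)].append(g)

    clusters = []
    for members in by_key.values():
        members.sort(key=lambda g: g.order)
        current = [members[0]]
        for g in members[1:]:
            # Count the genes lying strictly between the two family members.
            if g.order - current[-1].order - 1 <= max_gap:
                current.append(g)
            else:
                if len(current) >= min_size:
                    clusters.append(current)
                current = [g]
        if len(current) >= min_size:
            clusters.append(current)
    return clusters


# Toy input: two nearby famA genes on V, a distant third one, and a famB pair on X.
genes = [
    Gene("a1", "V", 10, "famA"),
    Gene("a2", "V", 12, "famA"),
    Gene("a3", "V", 40, "famA"),
    Gene("b1", "X", 3, "famB"),
    Gene("b2", "X", 4, "famB"),
    Gene("a4", "X", 5, "famA"),
]
clusters = find_clusters(genes, max_gap=5)
cluster_names = [[g.name for g in c] for c in clusters]
```

In this toy input, `a1` and `a2` form one cluster and `b1` and `b2` another, while `a3` and `a4` stay unclustered because no second family member lies within the gap limit. A minimum size of two matches the abstract's "clusters of two or more homologous genes"; the gap limit is purely a placeholder.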

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Caenorhabditis elegans / genetics*
  • Cluster Analysis
  • Codon / genetics*
  • Conserved Sequence
  • Evolution, Molecular
  • Gene Duplication
  • Molecular Sequence Data
  • Multigene Family*
  • Phylogeny*
  • Protein Structure, Tertiary
  • Sequence Homology, Amino Acid

Substances

  • Codon