Send to

Choose Destination
PLoS One. 2009 Jun 1;4(6):e5748. doi: 10.1371/journal.pone.0005748.

Gene classification based on amino acid motifs and residues: the DLX (distal-less) test case.

Author information

Instituto de Biologia Molecular e Celular (IBMC), University of Porto, Porto, Portugal.



Comparative studies using hundreds of sequences can give a detailed picture of the evolution of a given gene family. Nevertheless, retrieving only the sequences of interest from public databases can be difficult, in particular, when working with highly divergent sequences. The difficulty increases substantially when one wants to include in the study sequences from many (or less well studied) species whose genomes are non-annotated or incompletely annotated.


In this work we evaluate the usefulness of different approaches of gene retrieval and classification, using the distal-less (DLX) gene family as a test case. Furthermore, we evaluate whether the use of a large number of gene sequences from a wide range of animal species, the use of multiple alternative alignments, and the use of amino acids aligned with high confidence only, is enough to recover the accepted DLX evolutionary history.


The canonical DLX homeobox gene sequence here derived, together with the characteristic amino acid variants here identified in the DLX homeodomain region, can be used to retrieve and classify DLX genes in a simple and efficient way. A program is made available that allows the easy retrieval of synteny information that can be used to classify gene sequences. Maximum likelihood trees using hundreds of sequences can be used for gene identification. Nevertheless, for the DLX case, the proposed DLX evolutionary is not recovered even when multiple alignment algorithms are used.

[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Public Library of Science Icon for PubMed Central
Loading ...
Support Center