Format

Send to

Choose Destination
PLoS One. 2013;8(1):e52854. doi: 10.1371/journal.pone.0052854. Epub 2013 Jan 14.

Detection of genomic idiosyncrasies using fuzzy phylogenetic profiles.

Author information

1
Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece.

Abstract

Phylogenetic profiles express the presence or absence of genes and their homologs across a number of reference genomes. They have emerged as an elegant representation framework for comparative genomics and have been used for the genome-wide inference and discovery of functionally linked genes or metabolic pathways. As the number of reference genomes grows, there is an acute need for faster and more accurate methods for phylogenetic profile analysis with increased performance in speed and quality. We propose a novel, efficient method for the detection of genomic idiosyncrasies, i.e. sets of genes found in a specific genome with peculiar phylogenetic properties, such as intra-genome correlations or inter-genome relationships. Our algorithm is a four-step process where genome profiles are first defined as fuzzy vectors, then discretized to binary vectors, followed by a de-noising step, and finally a comparison step to generate intra- and inter-genome distances for each gene profile. The method is validated with a carefully selected benchmark set of five reference genomes, using a range of approaches regarding similarity metrics and pre-processing stages for noise reduction. We demonstrate that the fuzzy profile method consistently identifies the actual phylogenetic relationship and origin of the genes under consideration for the majority of the cases, while the detected outliers are found to be particular genes with peculiar phylogenetic patterns. The proposed method provides a time-efficient and highly scalable approach for phylogenetic stratification, with the detected groups of genes being either similar to their own genome profile or different from it, thus revealing atypical evolutionary histories.

PMID:
23341912
PMCID:
PMC3544837
DOI:
10.1371/journal.pone.0052854
[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Public Library of Science Icon for PubMed Central
Loading ...
Support Center