Genome Biol Evol. 2016 Jan 18;8(2):446-57. doi: 10.1093/gbe/evw005.

Ortholog-Finder: A Tool for Constructing an Ortholog Data Set.

Author information

Department of Biological and Environmental Science, Shizuoka University, Japan.
The Genome Institute, Japanese Foundation of Cancer Research, Tokyo, Japan.
Department of Economics, Chiba University of Commerce, Ichikawa, Japan.
Research Center for Aquatic Genomics, National Research Institute of Fisheries Science, Fisheries Research Agency, Kanagawa, Japan.
School of New Sciences, Daegu Gyoungbook Institute of Science and Technology, Daegu, Republic of Korea.


Orthologs are widely used for phylogenetic analysis of species; however, identifying genuine orthologs among distantly related species is challenging, because genes obtained through horizontal gene transfer (HGT) and out-paralogs derived from gene duplication before speciation are often present among the predicted orthologs. We developed a program, "Ortholog-Finder," that builds ortholog data sets for phylogenetic analysis from the complete open-reading frame data of the species under study. The program includes five processes for minimizing the effects of HGT and out-paralogs in phylogeny construction: 1) HGT filtering: Genes derived from HGT are detected by examining their base compositions and deleted from the initial sequence data set. 2) Out-paralog filtering: Out-paralogs are detected and deleted from the data set based on sequence similarity. 3) Classification of phylogenetic trees: Phylogenetic trees generated for ortholog candidates are classified as monophyletic or polyphyletic trees. 4) Tree splitting: Polyphyletic trees are bisected to obtain monophyletic trees and remove HGT genes and out-paralogs. 5) Threshold changing: Out-paralogs are further excluded from the data set based on the difference in the similarity scores of genuine orthologs and out-paralogs. Using simulated data, we examined how out-paralogs and HGT genes affected species trees constructed from ortholog data sets obtained by Ortholog-Finder, and we determined the effects of confounding factors. We then used Ortholog-Finder to construct a phylogeny for 12 Gram-positive bacteria from two phyla and validated each node of the constructed tree by comparison with individually constructed ortholog trees.
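
The abstract states only that HGT-derived genes are detected "by examining their base compositions" and does not give the exact criterion. As a minimal illustration of that general idea, and not the procedure implemented in Ortholog-Finder, the Python sketch below flags genes whose GC content deviates strongly from the genome-wide mean. The function names, the z-score criterion, and the cutoff of 2.0 are all assumptions made for this example.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_genes(genes: dict[str, str], z_cutoff: float = 2.0) -> set[str]:
    """Return IDs of genes whose GC content is an outlier for the genome.

    Hypothetical criterion: a gene is flagged when its GC content lies more
    than `z_cutoff` population standard deviations from the mean GC content
    of all genes supplied. The cutoff is an illustrative assumption.
    """
    gc = {gene_id: gc_content(seq) for gene_id, seq in genes.items()}
    mu = mean(gc.values())
    sigma = pstdev(gc.values())
    if sigma == 0:
        # All genes have identical composition, so nothing stands out.
        return set()
    return {gene_id for gene_id, value in gc.items()
            if abs(value - mu) / sigma > z_cutoff}


# Toy genome: nine genes with 50% GC and one gene with 100% GC.
toy_genes = {f"gene_{i}": "ATGC" * 10 for i in range(9)}
toy_genes["gene_foreign"] = "GGCC" * 10
candidates = flag_atypical_genes(toy_genes)
```

In this toy input only `gene_foreign` is flagged. Composition-based screens of this kind tend to miss transfers between genomes of similar composition, which is consistent with the abstract describing later tree-based steps that also remove HGT genes.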


Keywords: eukaryote; horizontal gene transfer; ortholog; out-paralog; phylogenetic analysis; prokaryote

[Indexed for MEDLINE]
Free PMC Article
