Copyright © 2008 The Authors

The Apicomplexan Whole-Genome Phylogeny: An Analysis of Incongruence among Gene Trees

*Department of Genetics, University of Georgia
†Center for Tropical and Emerging Global Diseases, University of Georgia
‡Institute of Bioinformatics, University of Georgia
Corresponding author. E-mail: chkuo/at/email.arizona.edu.
1Present address: Department of Ecology and Evolutionary Biology, University of Arizona
Hervé Philippe, Associate Editor
Accepted September 18, 2008.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The protistan phylum Apicomplexa contains many important pathogens and is the subject of intense genome sequencing efforts. Based upon the genome sequences from seven apicomplexan species and a ciliate outgroup, we identified 268 single-copy genes suitable for phylogenetic inference. Both concatenation and consensus approaches inferred the same species tree topology. This topology is consistent with most prior conceptions of apicomplexan evolution based upon ultrastructural and developmental characters, that is, the piroplasm genera Theileria and Babesia form the sister group to the Plasmodium species, the coccidian genera Eimeria and Toxoplasma are monophyletic and are the sister group to the Plasmodium species and piroplasm genera, and Cryptosporidium forms the sister group to the above mentioned with the ciliate Tetrahymena as the outgroup. The level of incongruence among gene trees appears to be high at first glance; only 19% of the genes support the species tree, and a total of 48 different gene-tree topologies are observed.
Detailed investigations suggest that the low signal-to-noise ratio in many genes may be the main source of incongruence. The probability of being consistent with the species tree increases as a function of the minimum bootstrap support observed at tree nodes for a given gene tree. Moreover, gene sequences that generate high bootstrap support are robust to the changes in alignment parameters or phylogenetic method used. However, caution should be taken in that some genes can infer a “wrong” tree with strong support because of paralogy, model violations, or other causes. The importance of examining multiple, unlinked genes that possess a strong phylogenetic signal cannot be overstated. Keywords: Apicomplexa, genome scale, phylogeny, bootstrap, long-branch attraction, taxon sampling Introduction The protistan phylum Apicomplexa contains many important pathogens (Levine 1988). The most infamous members of this phylum are the causative agents of malaria from the genus Plasmodium, which causes more than one million human deaths per year globally (WHO and UNICEF 2005). Other important lineages include Babesia, which causes babesiosis in ruminants and humans (Brayton et al. 2007); Cryptosporidium, which causes cryptosporidiosis in humans and animals (Abrahamsen et al. 2004); Theileria, which causes tropical theileriosis and East Coast fever in cattle (Gardner et al. 2005; Pain et al. 2005); and Toxoplasma, which causes toxoplasmosis in immunocompromised patients and congenitally infected fetuses (Montoya and Liesenfeld 2004). These pathogens have been subjected to intense genome sequencing efforts in the hope of facilitating biomedical research (Tarleton and Kissinger 2001; Carlton 2003). The recent availability of fully annotated genome sequences from multiple species within this phylum provides a new and exciting opportunity for us to better understand the phylogeny of these important pathogens. 
The use of genome sequences for phylogenetic inference has only recently become possible. The large number of characters derived from genomic data allows robust inference of organismal phylogeny (Delsuc et al. 2005; Philippe, Delsuc, et al. 2005; Rokas 2006), even when the level of incomplete lineage sorting is high (Pollard et al. 2006). Initially, it was thought that use of genomic data would bring an end to the incongruence commonly observed in multigene molecular phylogenetic inference (Gee 2003; Rokas et al. 2003). However, further investigations suggest that the results from genome-scale phylogenetic inference should be interpreted with caution (Soltis et al. 2004; Jeffroy et al. 2006; Nishihara et al. 2007). Although genomic data can effectively suppress stochastic noise in shorter molecular sequences, the large amount of data can actually strengthen systematic biases when present (Phillips et al. 2004; Rodriguez-Ezpeleta et al. 2007). Previous studies that examined factors such as poor taxon sampling (Soltis et al. 2004; Philippe, Lartillot, and Brinkmann 2005), inappropriate choices of phylogenetic method (Phillips et al. 2004; Jeffroy et al. 2006), nucleotide or amino acid composition bias and deviation from compositional equilibrium (Phillips et al. 2004; Collins et al. 2005), and variation of evolutionary rates among or within sites (Dopazo H and Dopazo J 2005; Nishihara et al. 2007; Rodriguez-Ezpeleta et al. 2007), all found that systematic biases can lead to incorrect trees with strong support. Several approaches that can detect and remove systematic biases in genome-scale phylogenetic inference have been proposed, including modification of taxon sampling (Rodriguez-Ezpeleta et al. 2007), examination of model violations (Rodriguez-Ezpeleta et al. 2007), recoding of molecular sequences (Phillips et al. 2004; Rodriguez-Ezpeleta et al. 2007), removal of the fast-evolving sites (Nishihara et al. 2007; Rodriguez-Ezpeleta et al. 
2007), and utilizing rare genomic changes (Delsuc et al. 2005). Among the approaches that have been developed to address the systematic biases in genome-scale analyses, examination of incongruence among individual genes is directly relevant to the design and interpretation of multigene analyses that are fundamental in molecular phylogenetics (Huelsenbeck et al. 1996; Taylor and Piel 2004; Jeffroy et al. 2006). Unfortunately, investigations of incongruence among gene trees at the genome-scale have been limited to a few selected groups such as gamma-Proteobacteria (Lerat et al. 2003), yeast (Taylor and Piel 2004; Gatesy and Baker 2005; Jeffroy et al. 2006), and Drosophila (Pollard et al. 2006) due to the limitation of data availability. In this study, we present the first genome-scale phylogenetic analysis in the phylum Apicomplexa. Because of the ancient origin of this phylum, estimated at approximately 700–900 Myr (Douzery et al. 2004), we perform our genome-scale phylogenetic inference at the protein level. The robust inference of the organismal phylogeny based on genomic data provides a solid foundation for comparative studies that improve our knowledge of apicomplexan evolution. In addition to facilitating the planning of future phylogenetic studies that involve other closely related pathogens, our systematic investigation of incongruence among gene trees can improve our understanding of multigene phylogenetic inference in general. Materials and Methods Data Sources and Ortholog Identification Our data set contains seven apicomplexan species that have fully annotated genome sequence available, including Babesia bovis (Brayton et al. 2007) from GenBank (GenBank accession numbers AAXT01000001–AAXT01000013), Cryptosporidium parvum (Abrahamsen et al. 2004) from CryptoDB.org (Heiges et al. 2006), Eimeria tenella from GeneDB.org (Hertz-Fowler et al. 2004), Plasmodium falciparum (Gardner et al. 2002) and Plasmodium vivax from PlasmoDB.org (Bahl et al. 
2003), Theileria annulata (Pain et al. 2005) from GeneDB.org (Hertz-Fowler et al. 2004), and Toxoplasma gondii from Toxo-DB.org (Gajria et al. 2008). A free-living ciliate, Tetrahymena thermophila (Eisen et al. 2006), is included as the outgroup. For each species, we obtained all annotated proteins in the genome for ortholog identification. The data sources and protein-encoding gene counts are summarized in table 1.
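The marker genes drawn from these proteomes are selected by a simple filter on ortholog clusters: a cluster is kept only if each species in the data set contributes exactly one gene to it. A minimal Python sketch of that filter follows. The input layout (a dict mapping a cluster ID to (species, gene) pairs), the function name, and the toy identifiers are assumptions made for illustration; they do not reproduce OrthoMCL's actual output format.

```python
from collections import Counter


def single_copy_shared(clusters, species):
    """Keep clusters in which every listed species has exactly one gene.

    clusters: dict of cluster_id -> list of (species, gene_id) pairs
              (hypothetical layout, not OrthoMCL's native format).
    species:  list of species labels that must all be represented.
    """
    selected = {}
    for cluster_id, members in clusters.items():
        counts = Counter(sp for sp, _gene in members)
        # A missing species counts as 0 and a paralog pushes the count above 1;
        # either case disqualifies the cluster.
        if all(counts[sp] == 1 for sp in species):
            selected[cluster_id] = {sp: gene for sp, gene in members if sp in species}
    return selected


# Hypothetical toy input with three species instead of eight.
species = ["Pfal", "Tgon", "Tthe"]
clusters = {
    "c1": [("Pfal", "p1"), ("Tgon", "t1"), ("Tthe", "x1")],                 # kept
    "c2": [("Pfal", "p2"), ("Pfal", "p3"), ("Tgon", "t2"), ("Tthe", "x2")],  # paralogs
    "c3": [("Pfal", "p4"), ("Tgon", "t3")],                                  # species missing
}
print(sorted(single_copy_shared(clusters, species)))
```

Dropping every cluster with a duplicated species is deliberately conservative: it discards some usable genes in exchange for not having to decide which paralog is the true ortholog.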
Orthologous genes were identified using OrthoMCL (Li et al. 2003) (version 1.3) with BLASTP (Altschul et al. 1990) and an E value cutoff of 1 × 10^−30. The ortholog identification process in OrthoMCL is largely based on the popular criterion of reciprocal best hits but also involves an additional step of Markov Clustering (van Dongen 2000) to improve sensitivity and specificity. A benchmarking study has found that this algorithm performed well among available methods for ortholog identification (Hulsen et al. 2006). We selected the orthologous genes that are shared by all eight species to infer the gene tree. Orthologous gene clusters that contain more than one gene from any given species were removed to avoid the complications introduced by paralogous genes in phylogenetic inference.

Phylogenetic Inference

The program ClustalW (Thompson et al. 1994) (version 1.83) was used for multiple sequence alignment. The “tossgaps” option was enabled to ignore gaps when constructing the guide tree, and all other parameters were set to the default values unless specifically stated otherwise. The alignments produced by ClustalW were filtered by GBLOCKS (Castresana 2000) (version 0.91b) using default settings to remove regions that contain gaps or are highly divergent. The resulting amino acid alignment for each gene (provided in supplementary data file 1, Supplementary Material online) was used in the main phylogenetic analysis as described below; a codon-based nucleotide alignment for each gene was generated by PAL2NAL (Suyama et al. 2006) and is provided in supplementary data file 2 (Supplementary Material online). Three phylogenetic methods, maximum likelihood (ML), maximum parsimony (MP), and Neighbor-Joining (NJ), were used to infer the gene tree for each individual gene. ML inferences were performed using PHYML (Guindon and Gascuel 2003).
The proportion of invariant sites and the gamma-distribution parameter with eight substitution categories were estimated from the data set. The substitution model was set to JTT (Jones et al. 1992), and we enabled the optimization options for tree topology, branch lengths, and rate parameters. MP trees were constructed using PROTPARS in the PHYLIP package (Felsenstein 1989) (version 3.65) with 100 randomizations of input order. When more than one equally parsimonious tree was found for a given gene, the strict consensus tree of all equally parsimonious trees was used as the MP tree of this gene. NJ trees were constructed using NEIGHBOR in the PHYLIP package with species input order randomization enabled. The distance matrices were calculated by Tree-Puzzle (Schmidt et al. 2002) (version 5.2). The parameters used in Tree-Puzzle were set to the JTT substitution model, the mixed model of rate heterogeneity with one invariant and eight gamma rate categories, and the exact and slow parameter estimation. The level of bootstrap support for each gene was inferred by 100 resamplings of the alignment using SEQBOOT in the PHYLIP package followed by ML inference. To investigate the sensitivity of a gene to the multiple sequence alignment parameter, we varied the gap opening penalty by 2-fold in both directions (i.e., increased the default cost from 10 to 20 or decreased it to 5) and inferred the gene tree under each setting. Individual genes are classified into three categories including robust, intermediate, and sensitive based on the ML gene-tree topologies from the three gap opening penalties examined. A gene is classified as robust if all three settings generated the same topology, intermediate if two out of the three settings generated the same topology, or sensitive if each setting generated a different topology. 
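The robust / intermediate / sensitive classification described above amounts to counting how many distinct topologies the three gap opening penalties produce. The Python sketch below illustrates this under stated assumptions: each unrooted topology is represented as a set of internal bipartitions, and the function names, taxon labels, and example trees are hypothetical rather than taken from the study.

```python
def canonical_topology(splits, taxa, reference):
    """Return a hashable form of an unrooted topology.

    splits:    iterable of sets, each giving one side of an internal bipartition.
    taxa:      all taxon labels in the tree.
    reference: a taxon used to pick a consistent side of every bipartition,
               so the same split written from either side compares equal.
    """
    all_taxa = set(taxa)
    normalized = set()
    for side in splits:
        side = set(side)
        if reference in side:
            side = all_taxa - side  # always store the side without the reference
        normalized.add(frozenset(side))
    return frozenset(normalized)


def classify_gap_sensitivity(topologies):
    """Classify a gene from its topologies under three gap opening penalties."""
    if len(topologies) != 3:
        raise ValueError("expected one topology per gap opening penalty (3 total)")
    labels = {1: "robust", 2: "intermediate", 3: "sensitive"}
    return labels[len(set(topologies))]


# Hypothetical five-taxon example; "E" plays the role of the reference taxon.
taxa = ["A", "B", "C", "D", "E"]
t_default = canonical_topology([{"A", "B"}, {"A", "B", "C"}], taxa, "E")
t_low = canonical_topology([{"C", "D", "E"}, {"D", "E"}], taxa, "E")   # same tree, other sides
t_high = canonical_topology([{"A", "C"}, {"A", "B", "C"}], taxa, "E")  # different tree

print(classify_gap_sensitivity([t_default, t_low, t_high]))
```

Comparing bipartition sets rather than Newick strings avoids treating two trees as different merely because their branches were written in a different order.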
To investigate the effect of the substitution model used on the resulting gene-tree topology, we performed ML inference for each gene using two additional substitution models, including LG (Le and Gascuel 2008) and WAG (Whelan and Goldman 2001). The resulting gene trees are compared with the topology obtained using the JTT model (Jones et al. 1992). Inference of the Species Tree The species tree was inferred using two different approaches. The first approach was based on the consensus of individual gene trees. The consensus tree was inferred by the CONSENSE program in the PHYLIP package using extended majority rule. Gene trees inferred by different phylogenetic methods (i.e., ML, MP, and NJ) were analyzed separately. The second approach was based on the concatenated alignment of all individual genes following the phylogenetic inference procedures as described above. Characterization of Gene Trees The topology distance between each gene tree and the species tree was calculated based on the symmetric difference (Robinson and Foulds 1981) as implemented in TREEDIST in the PHYLIP package. For genes that inferred a topology that is different from the species tree, we performed the approximately unbiased (AU) test (Shimodaira 2002) and the Shimodaira–Hasegawa (SH) test (Shimodaira and Hasegawa 1999) using the CONSEL package (Shimodaira and Hasegawa 2001) to test if the species tree topology is significantly rejected by a gene. Taxon Removal Tests To evaluate the potential influence of long-branch attraction (LBA), we removed either of the two taxa that have a long terminal branch (i.e., the outgroup T. thermophila and the ingroup C. parvum) and repeated the phylogenetic inference for each gene. Our procedure is conceptually similar to the taxon jackknife method (Siddall 1995) but contains one important distinction. The traditional taxon jackknife method removes a taxon after multiple sequence alignment and prior to tree reconstruction. 
However, the taxon being removed still affects the alignment and thus can influence the resulting tree. We chose to perform the taxon removal prior to multiple sequence alignment to eliminate any effect on the phylogenetic inference from the taxon being removed. Results and Discussion Ortholog Identification From the seven apicomplexans and the one ciliate examined, we identified 268 single-copy genes that are shared by all eight species. These genes represent less than 10% of the annotated genes from the smallest genome (table 1), indicating that these organisms are highly divergent in their gene content. The long evolutionary distance between ciliates and apicomplexans only partially explains this observation. When the outgroup is not considered, the seven apicomplexans share 508 orthologous genes (of which 433 are single copy in all species). One of our previous studies that examined a different set of apicomplexan species produced similar results and suggested that 28–45% of the genes in an apicomplexan genome are genus-specific (Kuo and Kissinger 2008). This high level of divergence in gene content is consistent with the ancient origin of the phylum. The divergence time between apicomplexans and ciliates was estimated to be in the range of 700–900 Myr based on 129 genes from 36 eukaryotes (Douzery et al. 2004). For the purpose of phylogenetic analysis, we focus on the 268 single-copy genes shared by all eight species. Many of these genes are responsible for basic cellular processes (e.g., DNA replication, transcription, translation, etc.), as noted in our previous study (Kuo and Kissinger 2008). The sequence identity and annotation information of these genes are provided in supplementary table S1 (Supplementary Material online). The Apicomplexan Species Tree The species tree was inferred using two different approaches. 
The first approach calculated the consensus tree among the 268 individual gene trees, and the second approach utilized a concatenated alignment of 71,830 amino acid sites. Both approaches resulted in the same species tree topology (fig. 1).
This tree topology is consistent with most of our prior understanding of apicomplexan evolution based on morphology and development (Perkins et al. 2000), rDNA analyses (Escalante and Ayala 1995; Morrison and Ellis 1997), and multigene phylogenies (Douzery et al. 2004; Philippe et al. 2004; Kuo and Kissinger 2008). The piroplasmids (represented by B. bovis and T. annulata) form a sister group to the haemosporidians (represented by the Plasmodium lineage) with the cyst-forming coccidia (represented by E. tenella and T. gondii) as the next most closely related group. Although the Cryptosporidium lineage was classified as a coccidian in early taxonomy work (Levine 1984), our result provides further support to the growing consensus that this lineage is basal to other apicomplexans and separate from other coccidia (Carreno et al. 1999; Zhu et al. 2000; Leander et al. 2003).

The Distribution of Gene Trees

Examination of individual genes revealed a seemingly high degree of incongruence among gene trees. Of the 268 gene trees examined, we observed a total of 48 topologies based on ML analysis (fig. 2).
Despite the seemingly high level of incongruence among gene trees, only 16 genes significantly reject the putative species tree topology in the AU test (Shimodaira 2002). When using the more conservative SH test (Shimodaira and Hasegawa 1999), only two genes significantly reject the putative species tree. The first gene is annotated as a hypothetical protein in P. falciparum (gene ID: PF14_0326) and exhibits a high level of length variation among the species examined (i.e., varied from 2,452 amino acids in E. tenella to 8,094 amino acids in P. falciparum). The conserved regions that can be reliably aligned only account for 3% of the alignment. The second gene is annotated as a putative RNA-binding protein in P. falciparum (gene ID: PF08_0086) and also exhibits a high level of length variation (i.e., varied from 271 amino acids in B. bovis to 1,076 amino acids in P. vivax). The protein alignment obtained after GBLOCKS filtering only contains 29 sites. Based on the pattern of sequence length variation, we suspect that the gene annotations may be problematic in some of the species. For this reason, further analysis of these two genes was not pursued. The finding of a high level of topological incongruence among gene trees that lack statistical significance has been reported in previous genome-scale phylogenetic studies. Lerat et al. (2003) examined 205 single-copy genes shared by 13 gamma-Proteobacteria species and found only two significantly rejected the putative species tree in the SH test. In both cases, the discordance between the gene tree and the putative species tree can be explained by a single lateral gene transfer (LGT) event. Similarly, examinations of the 106 single-copy genes shared by a group of Saccharomyces spp. showed that the majority of bipartition conflicts among genes have low bootstrap support (Taylor and Piel 2004; Jeffroy et al. 2006). 
One possible hypothesis to explain the rare occurrences of a gene significantly rejecting the species tree is that single-copy genes are unlikely to be involved in LGT events (Daubin et al. 2002, 2003). Under this hypothesis, these genes have been confined in the organismal phylogeny throughout their evolutionary history, so the gene-tree topology is unlikely to be radically different from the species tree. By focusing on a small subset of genes that are highly conserved across all apicomplexan lineages examined, our methodology for orthologous gene selection may have effectively excluded genes that experienced LGT since the ciliate–apicomplexan divergence. Although LGT does not appear to influence our phylogenetic inference as presented here, caution should be taken in future studies because several previous studies suggest that LGT is an important evolutionary force in apicomplexans (Huang, Mullapudi, Lancto, et al. 2004; Huang, Mullapudi, Sicheritz-Ponten, and Kissinger 2004; Striepen et al. 2004; Nagamune and Sibley 2006) and other protists (Gogarten 2003; Richards et al. 2003; Andersson 2005). Evaluation of Phylogenetic Signal by Bootstrap Support To test if the observed topological incongruence among gene trees can be explained by a low resolving power for certain clades in some genes, we used the minimum bootstrap value observed in a gene tree to identify genes that possess strong phylogenetic signals. The results indicate that the percentage of genes that support the putative species tree increases as a function of the bootstrap cutoff used (table 2). In the most extreme example, when only the genes with a minimum bootstrap value of 90% at any node are examined, all five genes that meet this cutoff support the putative species tree topology. 
Even when the selection stringency is relaxed to a 70% bootstrap support, a cutoff that is commonly used in phylogenetic inference (Hillis and Bull 1993), 47% of these genes are consistent with the putative species tree and the two short internal branches received at least 60% of the consensus support. Curiously, we did not find any significant correlation between bootstrap support and alignment length, average pairwise protein distance, or other attributes of genes (supplementary table S1, Supplementary Material online).
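The bootstrap filter used in this section can be summarized in a few lines: a gene's score is the lowest bootstrap value at any node of its tree, and for each cutoff one counts the genes that pass and the share of them that match the species tree. The Python sketch below shows the calculation under stated assumptions; the function names and the toy values are hypothetical and are not the values reported in table 2.

```python
def min_bootstrap(node_supports):
    """Lowest bootstrap percentage observed at any internal node of a gene tree."""
    return min(node_supports)


def consistency_by_cutoff(genes, cutoffs):
    """Tabulate agreement with the species tree at each minimum-bootstrap cutoff.

    genes:   iterable of (node_supports, matches_species_tree) pairs, where
             node_supports is a list of bootstrap percentages and
             matches_species_tree is a bool.
    cutoffs: iterable of minimum bootstrap thresholds (percent).
    Returns a list of (cutoff, genes_passing, percent_matching) tuples;
    percent_matching is None when no gene passes the cutoff.
    """
    genes = list(genes)
    rows = []
    for cutoff in cutoffs:
        kept = [matches for supports, matches in genes
                if min_bootstrap(supports) >= cutoff]
        n = len(kept)
        pct = 100.0 * sum(kept) / n if n else None
        rows.append((cutoff, n, pct))
    return rows


# Hypothetical toy data: five genes, three internal nodes each.
toy_genes = [
    ([95, 92, 98], True),
    ([75, 88, 71], True),
    ([72, 90, 80], False),
    ([40, 85, 60], False),
    ([55, 65, 90], True),
]
for cutoff, n, pct in consistency_by_cutoff(toy_genes, (50, 70, 90)):
    print(cutoff, n, pct)
```

Using the minimum rather than the mean node support means a single poorly resolved branch is enough to exclude a gene, which is the stricter reading of "strong phylogenetic signal" adopted in the text.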
In addition to being consistent with the putative species tree, genes with strong bootstrap support are often insensitive to changes in alignment parameter (table 3), substitution model (table 4), or the phylogenetic method used (table 5). In these tests, we are interested in investigating if a gene could infer the same gene-tree topology across a range of settings used in the phylogenetic inference process; the agreement between the gene-tree topology and the putative species tree is not considered. At 70% minimum bootstrap cutoff, we found that 90% of these genes are robust to a 4-fold change in the gap opening penalty (table 3), 93% of the genes are insensitive to the choice of substitution model (table 4), and 57% of the genes behave consistently across different phylogenetic methods (table 5). Although the use of methodological concordance as a criterion for selecting genes for phylogenetic inference was criticized (Grant and Kluge 2003), our results suggest that a gene is more likely to behave consistently across different phylogenetic methods when it contains a strong phylogenetic signal.
Removal of the Long Branches

In addition to the low signal-to-noise ratio in some genes, another possible source of incongruence among gene trees is the LBA problem that resulted from our nonideal taxon sampling. Several observations support this hypothesis. First, when a gene behaved inconsistently across different phylogenetic methods, ML and NJ often result in an identical gene-tree topology that is different from MP (table 5). In addition, the outgroup T. thermophila and the ingroup C. parvum both have a long evolutionary distance to the other taxa (fig. 1). The issue of nonideal taxon sampling reflects a limitation that is often faced by genome-scale phylogenetic inferences (Soltis et al. 2004). To circumvent this limitation, we utilized two other commonly suggested approaches to address the LBA problem (Bergsten 2005). First, all sites that contain gaps or are highly divergent were removed from the alignment prior to phylogenetic inference by GBLOCKS (see Materials and Methods). Second, we removed either the outgroup T. thermophila or the ingroup C. parvum prior to sequence alignment and repeated the phylogenetic inference. When the outgroup is removed from the data set, we observed a large increase in the consensus support for the Plasmodium–Babesia–Theileria clade (table 6). Two alternative bipartitions, as shown in panels E and F of figure 3
Conclusion

The recent availability of genome sequences allowed us to infer an organismal phylogeny that includes several important apicomplexan pathogens with high confidence. This robust species tree provides a solid foundation for future comparative studies that can improve our understanding of apicomplexan evolution and parasite biology. Although the level of incongruence among gene trees appears to be high at first glance, further investigation indicates that most of the observed conflict does not have strong statistical support. Interestingly, the minimum bootstrap support observed in a gene tree appears to be a useful predictor of phylogenetic performance. Genes that produce strong bootstrap support for all internal branches are more likely to be consistent with the species tree and robust to changes in the alignment parameter or the phylogenetic method used. Nevertheless, examination of multiple unlinked genes with strong phylogenetic signals is important for accurate phylogenetic inference because any single gene can have a different evolutionary history from the organismal phylogeny. Our systematic investigation provides a list of phylogenetically informative genes in the phylum Apicomplexa. These genes are good candidates for future sequencing efforts that aim at improving taxon sampling in this group of important pathogens.

Supplementary Material

Supplementary data files 1 and 2 and table S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments C.-H.K. was supported by a National Institutes of Health (NIH) Training Grant (GM07103), the Kirby and Jan Alton Graduate Fellowship, and a Dissertation Completion Assistantship at the University of Georgia. Funding for this work was provided by NIH R01 AI068908 to J.C.K. P. Brunk, F. Chen, J. Felsenstein, M. Heiges, A. Oliveira, E. Robinson, and H. Wang provided valuable assistance on the use of computer hardware and software. We thank the J. Craig Venter Institute for providing prepublication access to the genome sequence data of P. vivax and T. gondii. The associate editor, Dr Hervé Philippe, and three anonymous reviewers provided constructive comments that greatly improved this manuscript. References