Proc Natl Acad Sci U S A. Oct 23, 2012; 109(43): 17519–17524.
Published online Oct 8, 2012. doi:  10.1073/pnas.1205818109
PMCID: PMC3491498
Evolution

Phylogenomics and a posteriori data partitioning resolve the Cretaceous angiosperm radiation Malpighiales

Abstract

The angiosperm order Malpighiales includes ~16,000 species and constitutes up to 40% of the understory tree diversity in tropical rain forests. Despite remarkable progress in angiosperm systematics during the last 20 y, relationships within Malpighiales remain poorly resolved, possibly owing to its rapid rise during the mid-Cretaceous. Using phylogenomic approaches, including analyses of 82 plastid genes from 58 species, we identified 12 additional clades in Malpighiales and substantially increased resolution along the backbone. This greatly improved phylogeny revealed a dynamic history of shifts in net diversification rates across Malpighiales, with bursts of diversification noted in the Barbados cherries (Malpighiaceae), cocas (Erythroxylaceae), and passion flowers (Passifloraceae). We found that commonly used a priori approaches for partitioning concatenated data in maximum likelihood analyses, by gene or by codon position, performed poorly relative to the use of partitions identified a posteriori using a Bayesian mixture model. We also found better branch support in trees inferred from a taxon-rich, data-sparse matrix, which deeply sampled only the phylogenetically critical placeholders, than in trees inferred from a taxon-sparse matrix with little missing data. Although this matrix has more missing data, our a posteriori partitioning strategy reduced the possibility of producing multiple distinct but equally optimal topologies and increased phylogenetic decisiveness, compared with the strategy of partitioning by gene. These approaches are likely to help improve phylogenetic resolution in other poorly resolved major clades of angiosperms and to be more broadly useful in studies across the Tree of Life.

Malpighiales are one of the most surprising clades discovered in broad molecular phylogenetic studies of the flowering plants (1–3). The order contains ~16,000 species and 42 families (2, 3) that exhibit remarkable morphological and ecological diversity. A few examples include cactus-like succulents (Euphorbiaceae), epiphytes (Clusiaceae), holoparasites (Rafflesiaceae), submerged aquatics (Podostemaceae), and wind-pollinated trees (temperate Salicaceae). The order is ecologically important: species in Malpighiales constitute up to 40% of the understory tree diversity in tropical rain forests worldwide (4). They also include many economically important species, such as Barbados nut (Jatropha curcas L., Euphorbiaceae), cassava (Manihot esculenta Crantz, Euphorbiaceae), castor bean (Ricinus communis L., Euphorbiaceae), coca (Erythroxylum coca Lam., Erythroxylaceae), flax (Linum usitatissimum L., Linaceae), the poplars (Populus spp., Salicaceae), and the rubber tree (Hevea brasiliensis Müll. Arg., Euphorbiaceae). Partially for this reason, genomic resources for Malpighiales are growing at a rapid pace and include whole-genome sequencing projects completed or near completion for Barbados nut (5), cassava, castor bean (6), flax, and poplar (7). Thus, a resolved phylogeny of Malpighiales is critical not only for evolutionary, ecological, developmental, and genomic investigations of flowering plants, but also for crop improvement.

Despite substantial progress in resolving the angiosperm Tree of Life during the last 20 y (1, 8–12), phylogenetic relationships within Malpighiales remain poorly resolved. Molecular studies (1, 4) using multiple gene regions from the plastid, mitochondrial, and nuclear genomes have confirmed the monophyly of Malpighiales and its component families with a high degree of confidence but have identified only a handful of well-supported multifamily clades. The most recent analysis by Wurdack and Davis (3) included 13 genes, totaling 15,604 characters, sampled across all three genomes from 144 Malpighiales. Their results indicated that all families are monophyletic, but interrelationships among the 16 major subclades remained unresolved. The difficulty in determining these deep relationships may result from the rapid rise of the order during the mid-Cretaceous (4).

We used phylogenomic approaches to resolve relationships within Malpighiales to provide a framework for studying their tempo and mode of diversification. Our core data set included 82 genes sampled from the plastomes of 58 species, 48 of which were newly sequenced for this study. We combined this core data set with the previously described taxon-rich data set of Wurdack and Davis (3). Our results greatly improve phylogenetic resolution within Malpighiales, highlight the value of a unique partitioning strategy for phylogenomic analyses, and reveal a dynamic history of shifts in net diversification rates across the order.

Results and Discussion

Taxon and Gene Sampling.

Our core data set, the 82-gene matrix, included 58 taxa (48 are newly sequenced; SI Appendix, Table S2) and 82 plastid genes common to most angiosperms (72,828 characters; 17% of the cells in the matrix were gaps or missing data; each taxon was represented by an average of 86% of the 82 genes; SI Appendix, Tables S2 and S3). The taxa were carefully selected to capture the basal nodes within deeply diverged families, such as Centroplacaceae and Euphorbiaceae (4); they represent 39 of the 42 families of Malpighiales (excluding Lophopyxidaceae, Malesherbiaceae, and Rafflesiaceae; SI Appendix, SI Materials and Methods) and relevant outgroups. To obtain the most comprehensive phylogenetic tree for the order, we used the 82-gene matrix as a scaffold to which we added the existing taxon-rich but character-sparse 13-gene matrix (186 taxa; 15,574 characters; 15% missing data) (3) to create our combined-incomplete matrix (Table 1). This matrix included 191 taxa and 91 genes (82 plastid genes, six mitochondrial genes, and three nuclear genes; 81,259 characters; 64% missing data). We also created a combined-complete matrix by reducing the taxon sampling in the combined-incomplete matrix to match the taxon sampling of the 82-gene matrix. This greatly reduced the percentage of missing data cells in our alignment from 64% to 12%. The combined-complete matrix included 58 taxa and 91 genes (81,117 characters). Finally, we reanalyzed the 13-gene matrix alone to determine the phylogenetic impact of adding the 82-gene matrix. Each of the four matrices was analyzed using four different data partitioning strategies that are described below.
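As a rough illustration of how the percentage of gap or missing cells in a matrix can be tallied, the following Python sketch counts such cells in a toy alignment. The function name, the toy taxa, and the choice of symbols treated as missing (here `-` and `?`) are assumptions made for illustration; they are not taken from the analyses described above.

```python
def missing_fraction(alignment, missing_chars="-?"):
    """Return the fraction of matrix cells that are gaps or missing data.

    alignment: dict mapping taxon name -> aligned sequence string.
    missing_chars: symbols counted as gap/missing (an assumption here).
    """
    total = 0
    missing = 0
    for seq in alignment.values():
        total += len(seq)
        missing += sum(1 for ch in seq if ch in missing_chars)
    return missing / total if total else 0.0


# Hypothetical toy matrix: 3 taxa x 8 characters = 24 cells, 6 of them empty.
toy = {
    "taxonA": "ACGTACGT",
    "taxonB": "ACGT----",  # second "gene" not sampled for this taxon
    "taxonC": "AC??ACGT",
}

print(f"{100 * missing_fraction(toy):.0f}% missing")
```

Adding many taxa that are sampled for only a few genes raises this percentage quickly, which is the trade-off the combined-incomplete matrix accepts in exchange for denser taxon sampling.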

Table 1.
Characteristics of the four matrices and statistics of the best-scoring ML trees inferred from each of the four partitioning strategies

Relationships in Malpighiales.

Our analyses produced a well-resolved phylogeny of Malpighiales (Fig. 1; relationships of outgroups provided in SI Appendix, Fig. S1). The maximum likelihood (ML) and Bayesian trees inferred from the combined-incomplete matrix are congruent with trees from the remaining three matrices (i.e., 82-gene, combined-complete, and 13-gene; SI Appendix, Figs. S1–S17 and S26–S28), using 75 ML bootstrap percentage (BP; as calculated using the standard bootstrap option in RAxML) and 0.95 Bayesian posterior probability (PP) thresholds. The 16 subclades of Malpighiales whose interrelationships were previously unresolved with respect to one another are resolved into three well-supported (>80 BP, 1.0 PP) clades. Moreover, we find comparable or greatly improved support for previously identified clades (3) and moderate to strong support for the 12 additional clades we identified (Fig. 1). Six of these clades were supported with >80 BP, 1.0 PP; two with ≥70 BP, >0.60 PP; and one with >60 BP, >0.95 PP. Importantly, each of the 12 clades is also united by morphological features (summarized in Table 2).

Fig. 1.
ML 50% majority-rule bootstrap consensus tree of Malpighiales inferred from the combined-incomplete matrix using the MixtPart partitioning strategy. ML BPs/Bayesian PPs are indicated above each branch; a hyphen indicates that the node is not present in ...
Table 2.
Morphological features for the 12 additional clades we identified in Malpighiales

Clade 1 (85 BP, 1.0 PP) includes two major subclades: the euphorbioids (clade 4) and Humiriaceae + the parietal clade sensu Wurdack and Davis (3) (clade 7). Surprisingly, the euphorbioid clade (64 BP, 0.61 PP) reunites most of the former Euphorbiaceae (including Euphorbiaceae, Peraceae, Phyllanthaceae, and Picrodendraceae but excluding Putranjivaceae) (13, 14) with the well-supported (96 BP, 1.0 PP) linoid clade (clade 6; Ixonanthaceae + Linaceae) we identified. Within the euphorbioids, the linoids are well-supported (clade 5; 84 BP, 1.0 PP) as sister to the phyllanthoids (Phyllanthaceae + Picrodendraceae; 100 BP, 1.0 PP). The second major subclade identified here, clade 7 (62 BP, 0.79 PP), includes Humiriaceae and the parietal clade (100 BP, 1.0 PP). Within the parietal clade, Goupiaceae is sister to Violaceae (clade 9; 75 BP, 0.62 PP). Also within the parietal clade, (Malesherbiaceae (Passifloraceae + Turneraceae)) is sister to the salicoids [clade 8; (Lacistemataceae (Samydaceae (Salicaceae + Scyphostegiaceae))); 96 BP, 1.0 PP].

Clade 2 (83 BP, 1.0 PP) includes three subclades in a trichotomy. Its first major subclade, clade 10 (70 BP, 0.81 PP), includes the previously identified (6, 15) clusioid clade [((Bonnetiaceae + Clusiaceae) (Calophyllaceae (Hypericaceae + Podostemaceae))); 100 BP, 1.0 PP] plus their sister group the ochnoids [(Ochnaceae (Medusagynaceae + Quiinaceae)); 100 BP, 1.0 PP]. The second subclade in clade 2 is the recently identified (3) rhizophoroids [(Ctenolophonaceae (Erythroxylaceae + Rhizophoraceae)); 100 BP, 1.0 PP]. The third subclade is the pandoids (clade 11; Irvingiaceae + Pandaceae; 64 BP, 0.97 PP).

Clade 3 (81 BP, 1.0 PP) consists of four subclades, three of which have been previously identified (3), in a polytomy. These four subclades are the chrysobalanoids [(Balanopaceae ((Chrysobalanaceae + Euphroniaceae) (Dichapetalaceae + Trigoniaceae))); 100 BP, 1.0 PP], the malpighioids [clade 12; (Centroplacaceae (Elatinaceae + Malpighiaceae)); 63 BP, 0.51 PP], the putranjivoids (Lophopyxidaceae + Putranjivaceae; 100 BP, 1.0 PP), and Caryocaraceae.

Improved Phylogenetic Resolution Results from a Posteriori Data Partitioning and Better Taxon Sampling.

Several previous phylogenomic studies of angiosperms have applied a single substitution matrix in ML analyses to multiple-gene concatenated data sets (OnePart; e.g., refs. 8, 16, and 17). More recently, to better accommodate evolutionary rate heterogeneity across different characters, alignments have been partitioned a priori by gene (GenePart; e.g., refs. 11 and 12) or by codon position (CodonPart; e.g., refs. 11 and 18). The GenePart approach creates a partition for each gene and estimates the substitution rate matrix parameters separately for each partition, resulting in up to 83 partitions for many plastid data sets. The CodonPart approach partitions characters according to codon position, with a fourth partition added for noncoding regions (if present). These partitioning strategies are somewhat arbitrary, assuming for example that all third codon positions evolve rapidly or that gene boundaries define a class of sites that are expected to share a similar model of molecular evolution. As an alternative, we explored the use of an a posteriori partitioning strategy for ML analyses based on the partitions inferred from Bayesian searches of the matrix using a mixture model approach (19). Using a reversible-jump implementation, the Bayesian mixture model estimates the number of substitution rate matrices that best fit an alignment by allowing the fitting of multiple rate matrices to each character separately (20). We used this approach to find the optimal number of partitions for each matrix and then defined the characters in each class as a partition for subsequent ML analyses (MixtPart).

Using MixtPart, we found that the optimal number of partitions was 13 for the 82-gene matrix, 15 for the combined-complete matrix, and 20 for the combined-incomplete matrix. Thus, in all cases, defining the partitions on the basis of the mixture model search reduced the number of partitions substantially (from 82 for the 82-gene and from 91 for the two combined matrices using GenePart). Notably, our results show that using MixtPart substantially improved the likelihood of the best-scoring ML tree as measured by the corrected Akaike information criterion (AICc) (21) for all four matrices (Table 1). For example, compared with the GenePart approach, the MixtPart approach increased the AICc values by 7–12%. MixtPart also outperformed the OnePart, GenePart, and CodonPart approaches with respect to improving the branch support as measured by BP values. To compare these BP values among trees with different taxon sets, the bipartition trees inferred from the combined-incomplete and 13-gene matrices (SI Appendix, Figs. S10–S17) were pruned to match the taxon sampling of the 82-gene and combined-complete matrices (SI Appendix, Figs. S18–S25). This revealed that the use of MixtPart resulted in an increase in mean BP values by 5–11% (Fig. 2 and SI Appendix, Table S4) and most strikingly a mean increase in BP values by 20–49% for the 12 clades we identified (Fig. 3). It should be noted that the addition of our 82-gene matrix alone was insufficient to resolve the deeper nodes of Malpighiales. Although it was helpful [e.g., mean BP values increased by 13% when comparing between the 82- vs. 13-gene MixtPart analyses (Fig. 2)], only 4 of these 12 clades were supported with >50 BP using OnePart, GenePart, and CodonPart, vs. 10 clades that were resolved with >50 BP using MixtPart (Fig. 3). This indicates that the use of MixtPart results in substantial improvement.
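For readers unfamiliar with the criterion, the small-sample corrected AIC of Hurvich and Tsai (21) is commonly written as AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1), where L is the maximized likelihood, k the number of free parameters, and n the sample size. The Python sketch below evaluates this expression; the log-likelihoods and parameter counts are invented, and treating the number of alignment characters as n is an assumption, because the text does not state how k and n were counted.

```python
def aicc(log_likelihood, n_params, n_sites):
    """Corrected Akaike information criterion.

    log_likelihood: maximized ln L of the model.
    n_params: number of free parameters (k).
    n_sites: sample size (n); using alignment length is an assumption.
    """
    k, n = n_params, n_sites
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


# Hypothetical comparison of two partitioning schemes on the same matrix.
n_characters = 72828  # length of the 82-gene matrix given in the text
scheme_few = aicc(log_likelihood=-500000.0, n_params=150, n_sites=n_characters)
scheme_many = aicc(log_likelihood=-499900.0, n_params=800, n_sites=n_characters)

print(f"few partitions:  {scheme_few:.1f}")
print(f"many partitions: {scheme_many:.1f}")
```

Under the usual convention, the scheme with the lower AICc is preferred, so a scheme with fewer partitions can win even when its raw likelihood is slightly worse, because the penalty terms grow with k.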

Fig. 2.
Mean ML BPs of the bipartition trees inferred using ML for each of the four matrices and four partitioning strategies. SEs around the means are indicated, and the MixtPart partitioning strategy is highlighted in gray. The bipartition trees inferred from ...
Fig. 3.
ML BPs of the 12 additional clades we identified in Malpighiales (Fig. 1) inferred from three matrices and four partitioning strategies. The MixtPart partitioning strategy is highlighted in gray.

Our results also suggest that for the commonly used partitioning strategies, particularly for OnePart and GenePart, increased taxon sampling improves branch support, regardless of the increase in missing data. For example, despite its much higher percentage of missing data (64% vs. 12%), analyses of the 191-taxon combined-incomplete matrix yielded a better-supported phylogeny than the 58-taxon combined-complete matrix: the mean BP values increased by 3% and 4% for OnePart and GenePart, respectively (Fig. 2). Although this improvement might seem relatively small when comparing mean BP values, it is much more impressive for the 12 clades we identified, which showed an average increase of BP values by 34% and 36% for the OnePart and GenePart analyses, respectively (Fig. 3). These results provide empirical support for conclusions that increased taxon sampling improves phylogenetic accuracy (22–24), even when the amount of missing data increases (25–27). Theoretical studies (e.g., refs. 28 and 29) have shown that it is the number of complete characters rather than the number of empty cells that determines the impact of missing data on phylogenetic accuracy. The improved branch support we observed in trees from the combined-incomplete matrix likely results from our strategic scaffold approach, in which we ensured that critical nodes were deeply sampled for most characters. A similar scaffold approach was advocated by Wiens et al. (30) and more recently applied using large amounts of genomic data to successfully resolve relationships of butterflies and moths (31).

Despite the apparent successes of the scaffold approach, recent studies (32, 33) have shown that partial taxon coverage (whereby sequences from some partitions are missing for some taxa) can result in a vast terrace of phylogenetic trees that have different topologies but the same optimality score. In cases where complete taxon coverage for a partition is achieved the data set is expected to be decisive for all trees (32), and under these circumstances the problem of terraces does not arise (33). This is likely to be rare for large phylogenomic data sets, however, which sacrifice completeness for the additional taxa and characters. This problem was most clearly illustrated in the recent analysis of a 298-taxon grass data set with 34% missing data that produced a terrace including 61 million optimal trees (33). We found that different partitioning strategies induced different patterns of taxon coverage. Notably, the use of GenePart reduced taxon coverage density in all cases, and in the case of the combined-incomplete matrix it resulted in a pattern of taxon coverage that was indecisive and the best-scoring ML tree was on a terrace of 14,025 trees, whereas the use of MixtPart was decisive for all trees (Table 1). Despite this lack of decisiveness, the BUILD tree (i.e., the Adams consensus of the 14,025 trees) includes only four polytomies, all of which are restricted to subfamily relationships (SI Appendix, Fig. S29). Thus, our scaffold approach yielded a matrix that is resilient to reduced coverage density. Together our results suggest that there may be cases, depending on the patterns of taxon coverage, in which GenePart would be a poor choice for partitioning concatenated matrices.
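The link between partitioning and taxon coverage can be sketched with a toy example. The Python code below builds a taxon-by-partition coverage table, reports coverage density (the fraction of taxon-partition cells that contain any data), and checks the condition mentioned above, namely whether at least one partition has data for every taxon. All names and the toy alignment are hypothetical, and this check is only the simple sufficient condition described in the text, not a full decisiveness or terrace analysis.

```python
def coverage(alignment, partitions, missing_chars="-?"):
    """Return {partition: {taxon: bool}} marking taxa with any data.

    alignment: dict taxon -> aligned sequence.
    partitions: dict partition name -> list of 0-based site indices.
    """
    table = {}
    for name, sites in partitions.items():
        table[name] = {
            taxon: any(seq[i] not in missing_chars for i in sites)
            for taxon, seq in alignment.items()
        }
    return table


def coverage_density(table):
    """Fraction of taxon-by-partition cells that contain data."""
    cells = [present for row in table.values() for present in row.values()]
    return sum(cells) / len(cells)


def has_complete_partition(table):
    """True if some partition has data for every taxon."""
    return any(all(row.values()) for row in table.values())


# Hypothetical matrix: two 4-site genes, each missing for one taxon.
toy = {"A": "ACGTACGT", "B": "----ACGT", "C": "ACGT----"}
by_gene = {"gene1": [0, 1, 2, 3], "gene2": [4, 5, 6, 7]}
# A site-based scheme that mixes characters from both genes.
by_site_class = {"class1": [0, 1, 4, 5], "class2": [2, 3, 6, 7]}

for label, scheme in (("by gene", by_gene), ("by site class", by_site_class)):
    t = coverage(toy, scheme)
    print(label, round(coverage_density(t), 2), has_complete_partition(t))
```

In this toy case the by-gene scheme leaves each gene partition lacking one taxon, whereas partitions that draw sites from several genes contain data for every taxon, which mirrors the contrast between GenePart and MixtPart reported here.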

Patterns of Species Diversification in Malpighiales.

Studies of diversification patterns across angiosperms have not previously detected shifts in net species diversification rates (speciation minus extinction) in Malpighiales (34, 35), possibly because a well-resolved, taxon-dense phylogeny was not available for the order. We used our 191-taxon, combined-incomplete matrix to test the hypothesis that net diversification rates have been constant throughout the history of Malpighiales. This matrix was originally constructed to include the deepest phylogenetic splits within each family (3, 4) and is an excellent foundation for exploring the tempo of evolution in the order. We first used the approach implemented in MEDUSA (36) to detect shifts of diversification rate using a time-calibrated Malpighiales phylogeny (SI Appendix, Fig. S30) that accounts for unsampled taxonomic diversity (SI Appendix, Fig. S31). This method sequentially adds break points to a multirate birth-and-death model fitting the given branch lengths and terminal diversities until subsequent break points do not improve the AICc values. Using MEDUSA we found significant decelerations in five clades (Goupiaceae, Lophopyxidaceae, Medusagynaceae, Scyphostegiaceae, and Irvingiaceae + Pandaceae) and acceleration in one clade (Passifloraceae + Turneraceae) (Fig. 1 and SI Appendix, Fig. S31).

Additionally, we used a method that models diversification as a stochastic, time-homogeneous birth-and-death process (34). This method does not use the phylogeny directly but considers stem or crown group ages within clades of interest and the survival of each lineage to the present. The results were similar to those from the phylogeny-based MEDUSA analysis, with the main difference being the detection of an additional four rate decelerations and four accelerations. Assuming a constant birth-and-death model, eight clades (Balanopaceae, Centroplacaceae, Ctenolophonaceae, Euphroniaceae, Goupiaceae, Lophopyxidaceae, Medusagynaceae, and Scyphostegiaceae) experienced decelerations, and five clades (Dichapetalaceae, Erythroxylaceae, Malpighiaceae, Passifloraceae, and Putranjivaceae) experienced accelerations (Fig. 1 and SI Appendix, Fig. S32).
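To make the age-based approach concrete, the Python sketch below implements the net diversification rate estimators commonly attributed to Magallón and Sanderson (34) for stem and crown group ages, where n is the extant species number, t the clade age, and ε the relative extinction rate (extinction divided by speciation). The clade size and age used in the example are invented, and the code is an illustration of the estimators only; it does not reproduce the confidence-interval tests used to flag accelerations and decelerations.

```python
import math


def stem_rate(n, t, eps=0.0):
    """Net diversification rate from a stem group age.

    r = ln[n(1 - eps) + eps] / t
    """
    if n < 1 or t <= 0 or not 0.0 <= eps < 1.0:
        raise ValueError("need n >= 1, t > 0, and 0 <= eps < 1")
    return math.log(n * (1.0 - eps) + eps) / t


def crown_rate(n, t, eps=0.0):
    """Net diversification rate from a crown group age.

    Reduces to (ln n - ln 2) / t when eps = 0.
    """
    if n < 2 or t <= 0 or not 0.0 <= eps < 1.0:
        raise ValueError("need n >= 2, t > 0, and 0 <= eps < 1")
    inner = n * (n * eps ** 2 - 8.0 * eps + 2.0 * n * eps + n)
    term = (0.5 * n * (1.0 - eps ** 2) + 2.0 * eps
            + 0.5 * (1.0 - eps) * math.sqrt(inner))
    return (math.log(term) - math.log(2.0)) / t


# Hypothetical clade: 1,000 extant species, 60 million years old.
for eps in (0.0, 0.9):
    print(f"eps={eps}: stem {stem_rate(1000, 60.0, eps):.4f}, "
          f"crown {crown_rate(1000, 60.0, eps):.4f} per My")
```

Evaluating the estimators at both a low and a high relative extinction rate, as in the loop above, brackets the plausible range of net rates for a clade of a given size and age.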

These overlapping results, together with a well-resolved phylogeny, provide an improved foundation for exploring the mechanisms that have led to such substantial diversity within Malpighiales. In some cases (e.g., Malpighiaceae and Passifloraceae), specialized plant–pollinator mutualisms (37–39) may account for all or part of their exceptional diversification rates. These and other hypotheses can now be tested in more detailed studies of phylogeny, morphology, ecology, and biogeography.

Conclusions

Our phylogeny of Malpighiales provides a critical context for future comparative studies of plant species that are economically and ecologically important. Although the increasing ease of genome-scale sampling may render moot the long-standing argument about whether it is better to add taxa or characters (40), the question remains important. Given the amount of biodiversity remaining to be discovered, described, and classified, the goal should be to maximize taxonomic sampling for phylogenetic study, but to do so in the most effective way possible. Our analyses confirm that one efficient and economical way to resolve difficult clades is to construct a scaffold using phylogenetically critical placeholders sampled for many characters augmented by many more taxa sampled for a modest number of characters. Most importantly, our analyses indicate that searching with a Bayesian mixture model leads to an optimal, a posteriori data partitioning strategy, which not only improves the branch support of phylogenetic trees but also minimizes the impact of missing data on phylogenetic decisiveness. Its use is likely to help resolve several remaining poorly resolved, major clades of angiosperms (e.g., Euasterids I and II and Ericales) (12) and to be more broadly useful in studies across the Tree of Life.

Materials and Methods

See SI Appendix, SI Materials and Methods for details on plastome sequencing, sequence alignment, and analyses of phylogenetic decisiveness, divergence time, and species diversification.

Phylogenetic Analyses.

Bayesian and ML analyses were performed on four matrices (Table 1) as described above. The Bayesian analyses were implemented with the parallel version of BayesPhylogenies v2.0 (19) using a reversible-jump implementation of the mixture model as described by Venditti et al. (20). This approach allows the fitting of multiple models of sequence evolution to each character in an alignment without a priori partitioning. Two independent Markov chain Monte Carlo (MCMC) analyses were performed, and the consistency of stationary-phase likelihood values and estimated parameter values was determined using Tracer v1.5. We ran each MCMC analysis for 10 million generations, sampling trees and parameters every 1,000 generations. Bayesian PPs were determined by building a 50% majority-rule consensus tree from two MCMC analyses after discarding the 20% burn-in generations (Fig. 1 and SI Appendix, Figs. S1 and S26–S28).
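The burn-in and consensus step can be sketched as follows. With 10 million generations sampled every 1,000 generations, each run yields on the order of 10,000 samples, so a 20% burn-in discards roughly the first 2,000. The Python code below pools the post-burn-in samples of several runs and reports the frequency of each clade, keeping those above 50%. Representing each sampled tree as a set of clades, and all function names, are simplifying assumptions for illustration; this is not the software used in the study.

```python
from collections import Counter


def discard_burnin(samples, burnin_fraction=0.2):
    """Drop the first burnin_fraction of an ordered list of samples."""
    n_discard = int(len(samples) * burnin_fraction)
    return samples[n_discard:]


def clade_posteriors(runs, burnin_fraction=0.2):
    """Pool post-burn-in samples from all runs and return clade frequencies.

    runs: list of runs; each run is a list of trees; each tree is a set of
    clades, and each clade is a frozenset of taxon names.
    """
    kept = []
    for run in runs:
        kept.extend(discard_burnin(run, burnin_fraction))
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: count / len(kept) for clade, count in counts.items()}


def majority_rule(posteriors, threshold=0.5):
    """Keep clades whose frequency exceeds the threshold."""
    return {c: p for c, p in posteriors.items() if p > threshold}


# Hypothetical toy runs of 10 samples each over taxa A, B, C.
AB = frozenset({"A", "B"})
BC = frozenset({"B", "C"})
run1 = [{BC}] * 2 + [{AB}] * 8
run2 = [{BC}] * 2 + [{AB}] * 6 + [{BC}] * 2

pp = clade_posteriors([run1, run2])
for clade, p in sorted(majority_rule(pp).items(), key=lambda kv: -kv[1]):
    print(sorted(clade), p)
```

In the toy data the early samples of both runs favor a clade that later disappears from most samples, which is the kind of pre-stationary signal the burn-in is meant to exclude.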

The ML analyses were conducted using RAxML v7.2.8 (41) with the GTR+Γ model. The best-scoring ML tree was obtained for each matrix using the rapid hill-climbing algorithm (41), and 1,000 bootstrap replicates were estimated using the standard bootstrap option. The BPs were summarized from all 1,000 bootstrap trees, and the bipartition tree was obtained by mapping these BPs to the best-scoring ML tree (SI Appendix, Figs. S2–S17) (42). We used the four partitioning strategies described above in Results and Discussion: OnePart (single data partition), GenePart (partitioned by gene), CodonPart (partitioned by codon position), and MixtPart. For the MixtPart approach, the data partitions identified in the Bayesian analyses were extracted from the output using a custom Perl script (SI Appendix, SI Script). This script selected the partition with the highest probability for each character. The matrices were then partitioned accordingly in RAxML.
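The assignment step described above (choosing, for each character, the partition with the highest probability) can be sketched in Python as below. This is not the authors' Perl script: the input layout, a list of per-character probability vectors, is a hypothetical stand-in for the Bayesian output, and the partition names are invented. The output lines follow the general shape of a RAxML partition file (data type, partition name, then site ranges), with every entry written as an explicit range.

```python
def assign_sites(site_probs):
    """Return, for each character, the index of its most probable class.

    site_probs: list with one probability vector per alignment character.
    Ties go to the lowest class index.
    """
    return [max(range(len(p)), key=p.__getitem__) for p in site_probs]


def to_ranges(sites):
    """Collapse sorted 1-based site numbers into 'start-end' ranges."""
    ranges = []
    start = prev = sites[0]
    for s in sites[1:]:
        if s != prev + 1:
            ranges.append((start, prev))
            start = s
        prev = s
    ranges.append((start, prev))
    return ", ".join(f"{a}-{b}" for a, b in ranges)


def partition_lines(site_probs, datatype="DNA", prefix="mixt"):
    """Build one partition-file line per class that received characters."""
    by_class = {}
    for site, cls in enumerate(assign_sites(site_probs), start=1):
        by_class.setdefault(cls, []).append(site)
    return [
        f"{datatype}, {prefix}{cls + 1} = {to_ranges(by_class[cls])}"
        for cls in sorted(by_class)
    ]


# Hypothetical probabilities for 6 characters under 2 rate-matrix classes.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7],
         [0.6, 0.4], [0.2, 0.8], [0.1, 0.9]]

for line in partition_lines(probs):
    print(line)
```

Because characters are grouped by inferred class rather than by position, the resulting partitions are typically non-contiguous, which is why each line lists several ranges.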

Acknowledgments

We thank D. Barua, J. Beaulieu, M. Clements, R. Cronn, M. Ethier, D. Goldman, M. Guisinger-Bellian, R. Jansen, M. Kent, M. McMahon, A. Meade, M. Moore, M. Sanderson, A. Stamatakis, and members of the C.C.D. and S.M. laboratories for technical assistance. This work was supported by Brazil Conselho Nacional de Desenvolvimento Científico e Tecnológico Grant 563548/10-0 (to A.M.A.), Swiss National Science Foundation Grant 129804 (to P.K.E.), US National Science Foundation (NSF) Assembling the Tree of Life Grants DEB-0622764 and DEB-1120243 (to C.C.D.), and NSF Doctoral Dissertation Enhancement Project Grant OISE-0936076 (to C.C.D. and B.R.R.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. R.K.J. is a guest editor invited by the Editorial Board.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. JX661767–JX665032).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1205818109/-/DCSupplemental.

References

1. Chase MW, et al. Phylogenetics of seed plants: An analysis of nucleotide sequences from the plastid gene rbcL. Ann Mo Bot Gard. 1993;80:528–580.
2. APG. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG II. Bot J Linn Soc. 2003;141:399–436.
3. Wurdack KJ, Davis CC. Malpighiales phylogenetics: Gaining ground on one of the most recalcitrant clades in the angiosperm tree of life. Am J Bot. 2009;96(8):1551–1570.
4. Davis CC, Webb CO, Wurdack KJ, Jaramillo CA, Donoghue MJ. Explosive radiation of Malpighiales supports a mid-cretaceous origin of modern tropical rain forests. Am Nat. 2005;165(3):E36–E65.
5. Sato S, et al. Sequence analysis of the genome of an oil-bearing tree, Jatropha curcas L. DNA Res. 2011;18(1):65–76.
6. Chan AP, et al. Draft genome sequence of the oilseed species Ricinus communis. Nat Biotechnol. 2010;28(9):951–956.
7. Tuskan GA, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006;313(5793):1596–1604.
8. Jansen RK, et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc Natl Acad Sci USA. 2007;104(49):19369–19374.
9. Moore MJ, Bell CD, Soltis PS, Soltis DE. Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. Proc Natl Acad Sci USA. 2007;104(49):19363–19368.
10. Wang H, et al. Rosid radiation and the rapid rise of angiosperm-dominated forests. Proc Natl Acad Sci USA. 2009;106(10):3853–3858.
11. Moore MJ, Soltis PS, Bell CD, Burleigh JG, Soltis DE. Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots. Proc Natl Acad Sci USA. 2010;107(10):4623–4628.
12. Soltis DE, et al. Angiosperm phylogeny: 17 genes, 640 taxa. Am J Bot. 2011;98(4):704–730.
13. Cronquist A. The Evolution and Classification of Flowering Plants. 2nd Ed. Bronx, NY: New York Botanical Garden; 1988.
14. Webster GL. Classification of the Euphorbiaceae. Ann Mo Bot Gard. 1994;81:3–32.
15. Ruhfel BR, et al. Phylogeny of the clusioid clade (Malpighiales): Evidence from the plastid and mitochondrial genomes. Am J Bot. 2011;98(2):306–325.
16. Cai ZQ, et al. Complete plastid genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogenetic relationships of magnoliids. BMC Evol Biol. 2006;6:77.
17. Hansen DR, et al. Phylogenetic and evolutionary implications of complete chloroplast genome sequences of four early-diverging angiosperms: Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae). Mol Phylogenet Evol. 2007;45(2):547–563.
18. Moore MJ, et al. Phylogenetic analysis of the plastid inverted repeat for 244 species: Insights into deeper-level angiosperm relationships from a long, slowly evolving sequence region. Int J Plant Sci. 2011;172:541–558.
19. Pagel M, Meade A. A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Syst Biol. 2004;53(4):571–581.
20. Venditti C, Meade A, Pagel M. Phylogenetic mixture models can reduce node-density artifacts. Syst Biol. 2008;57(2):286–293.
21. Hurvich CM, Tsai CL. Regression and time series model selection in small samples. Biometrika. 1989;76:297–307.
22. Pollock DD, Zwickl DJ, McGuire JA, Hillis DM. Increased taxon sampling is advantageous for phylogenetic inference. Syst Biol. 2002;51(4):664–671.
23. Zwickl DJ, Hillis DM. Increased taxon sampling greatly reduces phylogenetic error. Syst Biol. 2002;51(4):588–598.
24. Hedtke SM, Townsend TM, Hillis DM. Resolution of phylogenetic conflict in large data sets by increased taxon sampling. Syst Biol. 2006;55(3):522–529.
25. McMahon MM, Sanderson MJ. Phylogenetic supermatrix analysis of GenBank sequences from 2228 papilionoid legumes. Syst Biol. 2006;55(5):818–836.
26. Heath TA, Hedtke SM, Hillis DM. Taxon sampling and the accuracy of phylogenetic analyses. J Syst Evol. 2008;46:239–257.
27. Burleigh JG, Hilu KW, Soltis DE. Inferring phylogenies with incomplete data sets: a 5-gene, 567-taxon analysis of angiosperms. BMC Evol Biol. 2009;9:61.
28. Wiens JJ. Missing data, incomplete taxa, and phylogenetic accuracy. Syst Biol. 2003;52(4):528–538.
29. Wiens JJ. Can incomplete taxa rescue phylogenetic analyses from long-branch attraction? Syst Biol. 2005;54(5):731–742.
30. Wiens JJ, Fetzner JW, Parkinson CL, Reeder TW. Hylid frog phylogeny and sampling strategies for speciose clades. Syst Biol. 2005;54(5):778–807.
31. Cho S, et al. Can deliberately incomplete gene sample augmentation improve a phylogeny estimate for the advanced moths and butterflies (Hexapoda: Lepidoptera)? Syst Biol. 2011;60(6):782–796.
32. Sanderson MJ, McMahon MM, Steel M. Phylogenomics with incomplete taxon coverage: The limits to inference. BMC Evol Biol. 2010;10:155.
33. Sanderson MJ, McMahon MM, Steel M. Terraces in phylogenetic tree space. Science. 2011;333(6041):448–450.
34. Magallón S, Sanderson MJ. Absolute diversification rates in angiosperm clades. Evolution. 2001;55(9):1762–1780.
35. Smith SA, Beaulieu JM, Stamatakis A, Donoghue MJ. Understanding angiosperm diversification using small and large phylogenetic trees. Am J Bot. 2011;98(3):404–414.
36. Alfaro ME, et al. Nine exceptional radiations plus high turnover explain species diversity in jawed vertebrates. Proc Natl Acad Sci USA. 2009;106(32):13410–13414.
37. Anderson WR. Floral conservatism in neotropical Malpighiaceae. Biotropica. 1979;11:219–223.
38. Neff JL. The passionflower bee: Anthemurgus passiflorae. J Newsl Passiflora Soc Int. 2003;13:7–9.
39. Zhang W, Kramer EM, Davis CC. Floral symmetry genes and the origin and maintenance of zygomorphy in a plant-pollinator mutualism. Proc Natl Acad Sci USA. 2010;107(14):6388–6393.
40. Graybeal A. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst Biol. 1998;47(1):9–17.
41. Stamatakis A. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22(21):2688–2690.
42. Stamatakis A, Hoover P, Rougemont J. A rapid bootstrap algorithm for the RAxML Web servers. Syst Biol. 2008;57(5):758–771.
