Copyright © 2006 Kramer et al; licensee BioMed Central Ltd.

A simplified explanation for the frameshift mutation that created a novel C-terminal motif in the APETALA3 gene lineage

1 Dept. of Organismic and Evolutionary Biology, Harvard University, Cambridge MA 02138, USA
2 Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei, Taiwan

Corresponding author. Elena M Kramer: ekramer/at/oeb.harvard.edu; Huei-Jiun Su: r90226014/at/ntu.edu.tw; Cheng-Chiang Wu: cwu/at/fas.harvard.edu; Jer-Ming Hu: jmhu/at/ntu.edu.tw

Received December 13, 2005; Accepted March 24, 2006.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

The evolution of type II MADS box genes has been extensively studied in angiosperms. One of the best-understood subfamilies is that of the Arabidopsis gene APETALA3 (AP3). Previous work has demonstrated that the ancestral paleoAP3 lineage was duplicated at some point within the basal eudicots to give rise to the paralogous TM6 and euAP3 lineages. This event was followed in euAP3 orthologs by the replacement of the C-terminal paleoAP3 motif with the derived euAP3 motif. It has been suggested that the new motif was created by an eight-nucleotide insertion that produced a translational frameshift.

Results

The addition of 25 eudicot AP3 homologs to the existing dataset has allowed us to clarify the process by which the euAP3 motif evolved. Phylogenetic analysis indicates that the euAP3/TM6 duplication maps very close to the base of the core eudicots, associated with the families Trochodendraceae and Buxaceae.
We demonstrate that although the transformation of paleoAP3 into euAP3 was due to a frameshift mutation, this was the result of a single nucleotide deletion. The use of ancestral character state reconstructions has allowed us to demonstrate that the frameshift was accompanied by few other nucleotide changes. We further confirm that the sequence is evolving as a coding region.

Conclusion

This study demonstrates that the simplest of genetic changes can result in the remodeling of protein sequence to produce a kind of molecular 'hopeful monster.' Moreover, such a novel protein motif can become conserved almost immediately on the basis of what appears to be a rapidly generated new function. Given that the existing data on the function of such C-terminal motifs are somewhat disparate and contradictory, we have sought to synthesize previous findings within the context of the current analysis and thereby highlight specific hypotheses that require further investigation before the significance of the euAP3 frameshift event can be fully understood.

Background

An increasing body of research has demonstrated that changes in gene regulation play a major role in the evolution of morphological form (reviewed [1-3]). That is not to say, however, that the evolution of coding sequence does not also contribute. Multiple examples from both plants and animals demonstrate that even minor changes in coding sequence can impact both biochemical and developmental functions (e.g., [4-7]). Interestingly, a common theme among many of these examples is gene duplication, which serves to release resultant paralogs from the selective pressures experienced by the single ancestral locus. In order to begin to understand the process by which non-synonymous mutation leads to changes in gene function, we need to be able to isolate such changes and characterize the pattern of sequence evolution in detail.
This is facilitated by a thorough understanding of taxonomic and gene lineage evolution as well as a relatively recent evolutionary timescale. All of these criteria are met by the APETALA3 (AP3) lineage of type II MADS box genes. Members of the type II MADS box family control many important aspects of plant development (reviewed [8]). Extensive phylogenetic analyses have identified multiple subfamilies, which are particularly well understood in the seed plants (reviewed [9]). This interest was largely triggered by the central role that type II MADS box genes play in the genetic program controlling floral organ identity. The so-called ABC model [10] describes how floral organ identity is determined by an overlapping set of three gene activities that produce distinct combinatorial codes: A class genes code for first whorl sepals; A+B, for second whorl petals; B+C, for third whorl stamens; and C alone, for fourth whorl carpels. Subsequent studies have identified additional critical gene classes, including the "E" class that acts in all floral whorls to facilitate the function of A, B and C class genes [11,12]. All but one of the ABCE class loci are type II MADS box genes [13], which are also known as MIKC MADS box genes due to the canonical structure displayed by the members. Starting at the N-terminal end of the gene, the 'M' or MADS domain is highly conserved across eukaryotes, and mediates DNA binding and protein dimerization [14,15]. The next two regions, referred to as I and K, are primarily involved with protein dimerization [14], while the last, the C domain, has been associated with a number of different functions. These include mediating higher-order interactions among MADS protein dimers [16,17], transcriptional activation [18,19], and post-translational modification [20]. 
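The combinatorial logic of the ABC model described above can be written down as a small lookup table. The following is only an illustrative sketch: the names `ABC_CODE`, `organ_identity` and `WHORLS` are invented here, and the table covers just the four combinations listed in the text, ignoring the E class and any regulatory interactions between classes.

```python
# Illustrative sketch of the ABC combinatorial code (names are hypothetical).
# Keys are the sets of gene classes active in a whorl; values are organ types.
ABC_CODE = {
    frozenset("A"): "sepal",
    frozenset("AB"): "petal",
    frozenset("BC"): "stamen",
    frozenset("C"): "carpel",
}


def organ_identity(active_classes):
    """Return the organ specified by a combination of A, B and C activities."""
    return ABC_CODE.get(frozenset(active_classes), "undefined in the basic model")


# Gene classes active in whorls 1 to 4 according to the model as stated above.
WHORLS = [("A",), ("A", "B"), ("B", "C"), ("C",)]

wild_type = [organ_identity(w) for w in WHORLS]

# Reading the same table with B activity removed from every whorl.
without_b = [organ_identity(set(w) - {"B"}) for w in WHORLS]

print(wild_type)
print(without_b)
```

Because B class activity (AP3 and PI in Arabidopsis) contributes only to the second and third whorls in this table, removing it changes the identity of exactly those two whorls.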
A notable feature of the C-terminal domain is that although it shows a lower degree of overall sequence conservation than the other regions, each of the major MIKC subfamilies possesses short, highly conserved diagnostic motifs at their C-terminal end (reviewed [21,22]). In the majority of cases, the specific function of these motifs remains unknown. As our understanding of the evolution of MIKC MADS box genes has grown, it has become increasingly clear that their evolutionary history is one of frequent gene duplication across all phylogenetic levels (reviewed [9,23]). One subfamily that demonstrates this phenomenon especially well is defined by the APETALA3 (AP3) and PISTILLATA (PI) gene lineages, which include the Arabidopsis petal and stamen identity genes of the same names. These two lineages are sister groups within the larger MIKC MADS gene family [24] and are the product of a gene duplication event that predated the diversification of the angiosperms [25-27]. Early studies recognized that there were, in fact, two paralogous lineages of AP3-like genes in the core eudicots: one termed euAP3 that contains AP3 itself and the other named TM6, which lacks a representative in Arabidopsis but has been identified in many other core eudicot taxa [28,29]. Although clearly related, the euAP3 and TM6 lineages have a number of distinct features, the most striking of which is their C-terminal motifs. In the TM6 and ancestral paleoAP3 lineages, the C-terminal motif has the consensus YGxHDLRLA (x indicating a variable site) [28]. This sequence, the paleoAP3 motif, is conserved throughout angiosperms and is recognizable in gymnosperm AP3/PI ancestors as well as the even more distantly related Bsister lineage [30,31]. In the euAP3 lineage, however, the paleoAP3 motif is completely absent and in its place is the so-called euAP3 motif with the consensus SDLTTFALLE [28]. 
The differences in this region and other sites reveal euAP3 to be a divergent paralogous lineage relative to both its ancestral and sister lineages. The patterns of sequence evolution associated with the euAP3/TM6 duplication raise questions regarding the functional significance of the C-terminal motifs in general and the euAP3 divergence in particular. From the biochemical standpoint, we can say with certainty that the euAP3 motif is important for proper AP3 function in vivo, and that the paleoAP3 and euAP3 motifs are not functionally equivalent [6,32]. In terms of the genes' developmental roles, the suggestion has been made that following the euAP3/TM6 duplication, the euAP3 lineage acquired a new role in petal development [6]. The evidence to support this conclusion is diverse, and includes: 1) the fact that the expression patterns of paleoAP3 orthologs in the petals of non-core eudicots are much more variable than those observed for euAP3 representatives within the core eudicots [29,33]; 2) that a chimeric AP3 bearing a paleoAP3 motif is especially poor at promoting petal identity in Arabidopsis [6]; and 3) that the sole TM6 ortholog to be functionally characterized, PhTM6 from Petunia, only contributes to stamen identity ([34], Vandenbussche and Gerats, pers. comm). On the other hand, paleoAP3 orthologs are almost always expressed in petaloid organs (e.g., [35-37]) and appear to function in the identity of petal-derived organs in the grasses [38,39]. One explanation that could encompass all of the current evidence is to posit that although paleoAP3 members play variable roles in petal identity, this function was canalized at the base of the core eudicots in conjunction with changes in biochemical aspects of euAP3 function and subsequent subfunctionalization in the TM6 lineage [40,41]. 
With regard to the evolution of the euAP3 motif itself, it was recently recognized that a frameshift event in the coding sequence of the paleoAP3 motif could generate components of the euAP3 motif [22]. The model of Vandenbussche et al. proposes that an eight-nucleotide insertion contributed to the evolution of the euAP3 motif both by the addition of novel sequence and by causing a frameshift mutation. In the current study, we have sought to better establish the timing of the euAP3/TM6 duplication event and the nature of the evolution of the euAP3 motif. The addition of 25 new AP3 homologs has particularly provided insight into the latter issue by demonstrating that the derivation of the euAP3 motif was even simpler than previously suggested. We conclude that a single nucleotide deletion transformed the ancestral paleoAP3 motif into the euAP3 motif with relatively few associated nucleotide changes. Furthermore, we provide evidence that the region is being conserved at the amino acid level, suggesting that the almost immediate conservation of the euAP3 motif was due to a new function of the novel protein sequence.

Results and discussion

Characterization and phylogenetic analysis of AP3 homologs

In an effort to better understand the evolution of the AP3 lineage in the eudicots, we used RT-PCR to isolate AP3 homologs from five taxa representing every lineage of the basal eudicots as well as eight taxa drawn from core eudicot lineages that had been poorly sampled (Fig. 1).
We performed phylogenetic analysis using maximum likelihood (ML) on a nucleotide dataset (Additional file 3) containing all of the new loci in addition to previously identified basal and core eudicot sequences, with magnoliid dicot, monocot and ANITA grade AP3 homologs serving as outgroups to the eudicot sequences (Fig. 2).
The major departure of the current phylogeny from previous studies is the position of the Pachysandra AP3 homologs, representing sampling from two species, which are placed as sister to the euAP3 lineage s.s. after the duplication event. This position is somewhat surprising given that none of the Pachysandra loci contain euAP3 motifs, which have previously been considered diagnostic for the euAP3 lineage. However, in the I and K regions of the protein sequence (Additional file 3), the Pachysandra AP3 homologs share other character states that have been identified as euAP3 lineage synapomorphies [28]. It should be noted that in maximum parsimony (MP) analyses, the Pachysandra loci sometimes are placed as an earlier branch, just before the euAP3/TM6 duplication event (data not shown), underscoring the poorly supported position of these loci. This analysis does allow us to draw some conclusions regarding the timing of the euAP3/TM6 duplication event. The duplication clearly occurred before the last common ancestor of all core eudicots, including the family Gunneraceae, which has been identified as sister to the traditionally defined core eudicot clade [42]. It seems likely that the duplication occurred after the early lineages of the basal eudicots, including the Ranunculales, Proteales and Sabiaceae. Based on the current analysis, we cannot determine with certainty how the timing of the duplication event related to the origin of the Trochodendraceae and Buxaceae lineages. Similarly, recent phylogenetic studies of the eudicots place these two families as sister to the core eudicots including Gunneraceae without strong support for their exact branching order (Fig. 1).

Evidence for a single nucleotide frameshift event at the base of the euAP3 clade

What is interesting about the current dataset is that all of the paleoAP3 lineage members and the Pachysandra AP3 homologs possess fairly normal paleoAP3 motifs with no clear sign of intermediates with the highly diverged euAP3 motif (Additional file 2). The explanation for this lack of 'missing links' has recently become apparent. In the course of characterizing the AP3 representatives from Platanus [37], we noticed that while the first reading frame encoded a perfect paleoAP3 motif, the second frame in the same region had the potential to encode an amino acid sequence with strong similarity to the euAP3 motif (Fig. 3A).
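The relationship between a single nucleotide deletion and the second reading frame can be illustrated with a short sketch. The sequence `ANCESTRAL` below is invented for this example and is not a real AP3 sequence; it was chosen only so that its first frame ends in a peptide matching the paleoAP3 consensus YGxHDLRLA. The frameshifted product is arbitrary and is not meant to resemble the euAP3 motif. The helper names are likewise hypothetical; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate complete codons from the start of seq ('*' marks a stop)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def delete_base(seq, index):
    """Return seq with the single nucleotide at the given index removed."""
    return seq[:index] + seq[index + 1:]


# Invented toy sequence (NOT a real AP3 sequence): three upstream codons,
# a toy motif matching the consensus YGxHDLRLA, and a stop codon.
ANCESTRAL = "GAATTTGCATATGGTAGCCATGATCTTCGTCTTGCTTGA"

# Hypothetical deletion of one nucleotide upstream of the motif.
DERIVED = delete_base(ANCESTRAL, 3)

print(translate(ANCESTRAL))  # first frame of the toy ancestral sequence
print(translate(DERIVED))    # same sequence read after the deletion

# Downstream of the deletion, the derived sequence is read in what was
# the second frame of the ancestral sequence.
assert translate(DERIVED[3:]) == translate(ANCESTRAL[4:])
```

In this toy case every residue downstream of the deletion changes and the original stop codon is no longer in frame, even though only one nucleotide differs between the two sequences.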
The phylogenetically-structured nature of euAP3/paleoAP3 frameshift potential suggests that it is dependent on patterns of codon usage and, therefore, that this region is behaving as a normal coding region. This conclusion is significant since one possible explanation for the observed phenomenon is that the region is conserved at the nucleotide level rather than at the amino acid level, as would be the case for a microRNA binding site, for example. The prediction of this scenario, however, is that the sequence should not evolve in a pattern typical of a coding region, where the first and second codon positions exhibit lower nucleotide diversity than the third positions. An alternative model is that the region is subject to programmed translational frameshift, a phenomenon previously observed in fungal, prokaryotic, plastid and viral genomes (reviewed [45]). This process is associated with perturbations in the expected pattern of sequence evolution such that substitutions are concentrated in the third positions of the original reading frame rather than in the third positions of the new frame. In addition, the encoded amino acid sequence of the original frame is conserved (e.g., [46,47]). Thus, under the first hypothesis, the paleoAP3 sequence would be conserved at the nucleotide level and would not bear the hallmarks of coding sequence evolution, while under the second hypothesis, the sequence should evolve like coding sequence but in the original reading frame. Our general observations, as well as those of others [22], are not consistent with these models but we wanted to test this further by directly analyzing patterns of nucleotide diversity in the region.

[Figure 4]
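The logic of this test can be sketched in a few lines. The code below is an illustration under stated assumptions, not the DnaSP implementation used in the study: it computes per-site nucleotide diversity as the fraction of sequence pairs that differ at each column of a gap-free alignment, then averages the values by codon position for a chosen reading frame. The alignment `TOY_ALIGNMENT` and all function names are invented for the example.

```python
from collections import Counter


def site_diversity(column):
    """Fraction of sequence pairs that differ at one alignment column."""
    n = len(column)
    if n < 2:
        return 0.0
    same_pairs = sum(c * (c - 1) // 2 for c in Counter(column).values())
    total_pairs = n * (n - 1) // 2
    return 1.0 - same_pairs / total_pairs


def per_site_diversity(alignment):
    """Per-column diversity for a gap-free alignment of equal-length strings."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same length")
    return [site_diversity(col) for col in zip(*alignment)]


def mean_by_codon_position(pi_values, frame_offset=0):
    """Mean diversity at codon positions 1, 2 and 3 for a given frame offset."""
    sums, counts = [0.0, 0.0, 0.0], [0, 0, 0]
    for i, pi in enumerate(pi_values):
        pos = (i - frame_offset) % 3
        sums[pos] += pi
        counts[pos] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Invented toy alignment (two codons, four sequences); variation was placed
# at the third position of each codon in the first reading frame.
TOY_ALIGNMENT = ["GATCTT", "GACCTC", "GATCTA", "GACCTT"]

pi = per_site_diversity(TOY_ALIGNMENT)
frame_one = mean_by_codon_position(pi, frame_offset=0)
frame_two = mean_by_codon_position(pi, frame_offset=1)

print(frame_one)  # diversity concentrated at the third codon position
print(frame_two)  # the same sites fall at the second position in this frame
```

Under the reasoning in the text, the frame in which diversity is concentrated at third positions is the one that is behaving as the translated coding frame.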
As shown in Fig. 3F,
Evidence for independent frameshift events in the AP3 lineage

The euAP3 frameshift event seems so extraordinary that it naturally raises the question of how often this sort of thing happens. Similar events have been described in other MADS box gene lineages [22,48] as well as vertebrate gene families [49]. We examined the larger AP3 dataset for additional examples and found three (Fig. 6).
Molecular 'hopeful monsters'

The term 'hopeful monster' was coined by Goldschmidt [52] to describe new species that arise abruptly by macromutation. Very rarely, he argued, such profound mutations could be beneficial and allow the organism to rapidly adapt to a new mode of life. On the molecular level, the impact of a frameshift mutation on protein sequence is similarly drastic, replacing most, if not all, of the ancestral amino acids with new residues. It seems very likely that the vast majority of such mutations will not be retained, but the euAP3/TM6 example, as well as others [22,49], demonstrates that there are isolated cases in which frameshifts have become conserved. Although this phenomenon would seem to be so unlikely as to be vanishingly rare, the role of gene duplication in this process means that it is essentially a matter of numbers, particularly in plants. It has been suggested that plants are especially subject to frequent gene duplications [53], due to everything from genome-scale events to single locus tandem duplications. In particular, loci involved in transcriptional regulation and signal transduction appear to be preferentially retained [54,55]. Phylogenetic analyses of multiple gene families bear out this impression, displaying evidence of duplications at every phylogenetic level (e.g., [27,56-58]). The lower eudicots appear to be a particularly active period for MADS box gene duplication (reviewed [23,59]), leading to the suggestion that at least one genome duplication occurred during this period [60]. Given what may be a relatively high rate of paralog generation, even very rare events such as the appearance of an adaptive frameshift mutation will occur at low frequency. Once such a frameshifted allele appears, it will be subject to the usual microevolutionary forces and may be fixed due to selection or neutral processes.
Along these lines, it has been suggested that periods of paralog maintenance due to neutral forces or subfunctionalization may eventually facilitate neofunctionalization [61,62]. Of course, it is only the evolutionarily successful events, or the fairly recent ones, that can be easily detected. Many such molecular 'monsters' may have come and gone over the course of plant evolution. This is not to say that frameshift-based evolution is restricted to plants, since it has also been identified in vertebrates [49]. In these cases, the presence of differentially spliced transcripts is associated with frameshift sequence remodeling. It remains to be seen whether duplication-related frameshift will also be uncovered in animals or if the variable transcript phenomenon will predominate. Other instances of clustered non-synonymous nucleotide changes have been identified [63], which demonstrate that such events can be maintained by selection. These examples may also provide candidates to be re-examined for evidence of frameshift mutation since the failure to recognize a frameshift mutation would result in a nucleotide alignment with the signature of successive non-synonymous substitutions. It is important to note, however, that the 'hopeful monster' analogy only applies to the evolutionary pattern of the protein sequence. At the nucleotide level, the sequence changes are, in fact, quite gradual.

Implications for the evolution of the AP3 lineage and the ABC program

The rapid generation and fixation of the euAP3 motif raises obvious questions regarding its biochemical function and its evolutionary significance. In order to consider these issues, we must first outline our basic knowledge of B gene function in model species. In Arabidopsis, AP3 and PI function as obligate heterodimers to promote petal and stamen identity [14,64].
All aspects of their function appear to be interconnected since their heterodimerization through the I and K domains is a requirement for protein stability [65,66], nuclear localization [67], DNA binding [14,68] and the maintenance of gene expression [69,70]. The contribution of the C-terminal motifs to these functions is not well understood. As mentioned previously, it has been demonstrated that the euAP3 motif is required for proper AP3 function and that the paleoAP3 motif is not biochemically equivalent to the euAP3 in Arabidopsis [6,32]. The study of Lamb and Irish further determined that the euAP3 motif is capable of conferring AP3-specific function to PI. This result is particularly intriguing since it suggests that dimers between the endogenous PI and chimeric PIcAP3 proteins were stabilized when one of the PI proteins possessed a euAP3 motif. Although indirect, this is the best evidence we have to support a role for the euAP3 motif in mediating protein-protein interactions. As to the paleoAP3 motif, a study in Lilium has argued that this region contributes to the novel homodimerization capacity of the paleoAP3 homolog and, further, that the Lilium paleoAP3 motif is sufficient to confer homodimerization capability on AP3 itself [71]. These findings are highly surprising given that all previous studies have shown that the C domain as a whole plays no role in AP3/PI dimerization [14,16,72]. Additionally, other analyses of both TM6 and paleoAP3 orthologs have not recovered any evidence of homodimerization [34,36,73,74]. Despite the conflicting nature of this set of results, it remains true that all specific investigations of AP3 motif function have indicated that it plays a role in mediating protein-protein interactions. Following from this statement, it is natural to now consider the known interaction partners of AP3. 
The current model of ABCE gene function holds that AP3/PI dimers form higher order complexes with other type II MADS box proteins from the A, C and E classes. In Arabidopsis, these genes are represented by APETALA1 (AP1) in the A class, AGAMOUS (AG) in the C class and the SEPALLATA1-4 loci in the E class (reviewed [75]). Therefore, in petals AP3/PI would interact with AP1/SEP dimers and in the stamens, with AG/SEP dimers [76]. This model is assumed to essentially hold for all other core eudicots, with supporting evidence in Antirrhinum and Petunia [16,77-80]. Unfortunately, the broader findings concerning the functions of C-terminal motifs within the context of these higher order complexes tend to be somewhat contradictory. On the one hand, complete deletion of the motifs does not generally affect complex formation in yeast three- or four-hybrid analyses [16,19] but, on the other hand, a separate yeast three-hybrid study recovered mutations in the C-terminal PI motif that did affect interactions with SEP proteins [17]. Similarly, the ability of PIcAP3 to rescue AP3 function may suggest a role for the euAP3 motif in higher order interactions [6]. Since the C-terminus is not required for AP3/PI dimerization [14], the apparent stabilization of the PI/PIcAP3 dimer is unlikely to be due to a direct interaction between the euAP3 motif and PI. It is more probable that the presence of the euAP3 motif allows the weakly associated dimer to interact with other proteins, thereby stabilizing the whole complex. One explanation for this diverse set of results is that there are other proteins participating in complex formation in planta that are not represented in the yeast experiments and it is these co-factors that are the targets of C-terminal motif interactions. Alternatively, it may simply be that the yeast system is not always sensitive enough to detect alterations in interaction strength that are significant in vivo. 
Given that our current understanding of C-terminal motif functions is confusing at best, it is also useful to consider the evolutionary histories of the loci thought to interact with AP3. In the case of PI, there is currently no clear evidence for a coincident gene duplication. Moreover, although there are sequence synapomorphies for core eudicot PI homologs, none of these map to the C-terminus and the MIK-associated residues do not represent obvious candidates for co-evolutionary changes (Kramer and Hu, unpublished data; [28]). Interestingly, the AG and SEP1/4 lineages both duplicated close to the base of the core eudicots [81,82]. However, AG has been shown to be unable to interact with AP3/PI on its own [19] and neither AG nor SEP1 underwent any major sequence remodeling in association with their basal eudicot duplications [81,82]. In contrast, the gene lineage containing AP1 is of particular interest given that it exhibits an evolutionary pattern which closely parallels that of AP3 [48]. Specifically, this lineage duplicated close to the base of the core eudicots to produce the paralogous euAP1 and euFUL lineages. Similar to euAP3, the euAP1 genes are divergent in sequence relative to both euFUL and the ancestral FUL-like lineage. Perhaps most surprising is that the remodeling of the euAP1 C-terminus also involved a frameshift mutation, although the exact extent of this phenomenon remains unclear [22,48]. In the case of euAP1, the single ancestral FUL-like motif was lost and two new conserved motifs evolved: one being involved in transcriptional activation (termed the euAP1 motif) and the other a site of post-translational farnesylation [18,20]. No clear data exist, however, regarding the function of the ancestral FUL-like motif or to suggest that the euAP1 motifs play a role in higher order complex formation. 
Although it has been proposed that the appearance of the euAP3 and euAP1 motifs may have been a co-evolutionary phenomenon [22], there are at least two variations on this theme that could fit the data. These two hypotheses yield sets of opposing and, most importantly, testable predictions. One possibility is that the new motifs promote interaction with each other in a manner that their ancestors did not. This theory is consistent with the idea that euAP1 and euAP3 acquired their common role in petal identity at the base of the core eudicots [6,22]. Supporting evidence includes the fact that AP1 orthologs can interact with AP3/PI heterodimers on their own, although this does not appear to be dependent on their C-terminal motifs [16,19]. Also, as opposed to the equivocal situation with euAP3 homologs [41], significant data exist to suggest that the role of euAP1 in petal identity is specific to the core eudicots [35,48]. A second scenario is that it was the ancestral FUL-like and paleoAP3 motifs that directly interacted and that, following the gene duplications, the loss of one of these motifs released the other from selection and allowed it to diverge to new function. This theory is more consistent with the lack of data indicating a protein interaction function for the euAP1 motifs. It is interesting to note that the FUL-like motif is strongly similar to the C-terminal motif of the SEP lineages [48,81], which are found within the same subfamily as AP1/FUL [8]. It may be that the loss of the FUL-like motif in euAP1 could be compensated by its conservation in the SEP proteins, which are thought to participate in the same complex. In terms of testable hypotheses, analyses of protein interactions among pre-duplication taxa could help to distinguish between the two models. 
On the whole, we are left with an intense sense of coincidence: that the AP3 and AP1/FUL lineages both duplicated and experienced C-terminal frameshift mutation in the same approximate phylogenetic vicinity. Understanding the full significance of this coincidence awaits the definitive establishment of the functions of the C-terminal motifs.

Conclusion

Phylogenetic analysis of an expanded set of AP3 homolog sequences indicates that the euAP3/TM6 duplication event occurred very close to the base of the core eudicots in association with the Trochodendraceae and Buxaceae lineages. The current dataset also reveals that the transition from the ancestral paleoAP3 motif to the derived euAP3 motif was primarily mediated by a single nucleotide deletion. The new motif appears to have become conserved with relatively little additional change, a somewhat extraordinary finding highlighting the potential for 'punctuated equilibrium' [83] to act at the molecular level as well as the morphological. It seems likely that the existence of a conserved second paralog facilitated the maintenance of the frameshift mutation. This finding fits with original models of gene duplication as a major source for genetic and biochemical diversification [84]. Current evidence regarding the biochemical functions of these C-terminal motifs is largely indirect and often contradictory, underscoring the importance of targeting these regions for further analysis.

Methods

Characterization of APETALA3 homologs

Homologs of AP3 were cloned from select taxa (see Fig. 1)

Phylogenetic analyses

In addition to the 20 new loci obtained in the current study, 61 other core eudicot, basal eudicot, magnoliid, monocot and ANITA grade AP3 homologs were identified based on previously published analyses and BLAST searches [85] (see Additional file 1 for references and accession numbers). In cases where GenBank contained nearly identical sequences from the same taxon, only one representative sequence was included.
Full-length nucleotide alignments of the loci were initially compiled using ClustalW. ClustalW multiple alignment parameters were gap penalty 8 and gap extension penalty 2, with transitions weighted for the nucleotide alignment. The alignments were then refined by hand using MacClade 4.06 [86]. The hypothesized single nucleotide deletion in the C-terminus of euAP3 lineage members was incorporated into the alignment (see Additional file 3 for complete alignment in NEXUS format). Maximum likelihood (ML) phylogenetic analyses were performed using PAUP* [87]. We used Modeltest [88] with the standard Akaike Information Criterion (AIC) to determine the simplest and most appropriate evolutionary model for our dataset. The model selected was a general time-reversible model (GTR) with a proportion of invariable sites (I) and a gamma approximation to the rate of variation among sites (Γ). The ML analysis used a single heuristic search with 100 random addition replicates, TBR branch swapping, MULPARS, and the steepest descent options. Branch support was estimated by performing 100 replicates of nonparametric bootstrapping using the same parameters as the original analysis. We also performed maximum parsimony (MP) analysis on the dataset using a heuristic tree search with 1000 random addition sequence replicates and TBR branch swapping. Support was estimated by performing 1000 bootstrap support replicates each with 10 random sequence addition replicates. The MP phylogeny is not shown (see text).

Analysis of nucleotide diversity and ancestral character state reconstructions

The program DnaSP [89] was used to determine the position-by-position nucleotide diversity of two small alignments derived from the full-length nucleotide dataset. The first alignment contains the C-terminal paleoAP3 motif-encoding region of loci from the TM6 lineage and the paleoAP3 lineage of basal eudicots. All indels were removed from the DnaSP alignment (see Additional file 4).
The second alignment contains the C-terminal euAP3 motif-encoding region of loci from the euAP3 lineage (all core eudicots). All indels were removed from the DnaSP alignment except for the single nucleotide deletion that produced the euAP3 motif (see Additional file 5). The DNA Polymorphism function was used to determine the nucleotide diversity (π, [90]) for each position in the two alignments. Ancestral nucleotide character state reconstructions were performed using both MP and ML methods. For these analyses, we used the complete nucleotide alignment and the ML phylogeny. MP reconstructions were performed using the accelerated transitions (ACCTRAN) and delayed transitions (DELTRAN) options as they are implemented in MacClade 4.0 [86]. ML reconstructions were performed using the approach of Yang et al. [91] that is implemented in PAML [92]. As has been found in other cases where changes are relatively rare ([93] and references therein), the MP and ML reconstructions were identical. Given the fact that the relevant nodes have poor support, we also performed ancestral character state reconstructions with alternative topologies. Specifically, we tested a phylogeny where the Pachysandra loci are placed before the euAP3/TM6 duplication (see Fig. 5B).

Authors' contributions

EK characterized AP3 homologs from Ilex, Kalanchoe, Saxifraga, Corylopsis, Pachysandra, Phytolacca, Paeonia and Vitis; conducted the phylogenetic analyses and ancestral state reconstructions; and drafted the manuscript. HJS and JMH characterized AP3 homologs from Loranthus and Trochodendron, and helped draft the manuscript. CCW characterized the AP3 homolog from Nelumbo in the laboratory of JMH. All authors read and approved the final manuscript.

Additional file 1

Table with locus information. Taxa of origin, GenBank accession numbers and reference information for all loci included in the alignment (sorted alphabetically by taxon).
(110K, doc)

Additional file 2

Alignment of C-terminal regions of predicted proteins of paleoAP3, TM6 and euAP3 representatives. Phylogenetic affinities of the taxa are indicated by the bars on the left (BE = Basal Eudicot; based on [42, 95]). The PI Motif-derived region is boxed in green; paleoAP3 motifs, in blue; and euAP3 motifs, in purple. Residues showing chemical conservation with the consensus for each of these regions [28] are shaded in grey. Red arrows at the right indicate the loci that appear to have experienced independent frameshift mutations. (506K, eps)

Additional file 3

APETALA3 nucleotide alignment. NEXUS format file of the complete APETALA3 nucleotide alignment used in the current phylogenetic analyses. (90K, nex)

Additional file 6

Comparison of position-by-position nucleotide diversity values for paleoAP3 and euAP3 motif-containing loci. Complete dataset of nucleotide diversity values of paleoAP3 and euAP3 motif-containing loci. The region spans the entire C-terminal motif. The yellow bars indicate the values for a dataset including all TM6 lineage members and basal eudicot paleoAP3 loci. The codon positions of each nucleotide are indicated by vertical hash marks and the corresponding amino acids are shown immediately below the chart. Note that the last four nucleotides in the paleoAP3 alignment are 3' UTR. The blue bars indicate the values for a dataset including all euAP3 lineage members. The codon positions of each nucleotide are indicated by vertical hash marks and the corresponding amino acids are shown at the bottom. The position of the euAP3 frameshift is represented by a dash mark. n/a = not applicable. (343K, eps)

Additional file 4

PaleoAP3 alignment for nucleotide diversity calculation. Alignment of paleoAP3 motif-encoding regions of Pachysandra loci, TM6 orthologs and basal eudicot paleoAP3 representatives. Indels were removed from the alignment.
(2.7K, nex)

Additional file 5

EuAP3 alignment for nucleotide diversity calculation. Alignment of euAP3 motif-encoding regions of euAP3 lineage members. All indels were removed except for the single nucleotide deletion corresponding to the euAP3 motif frameshift. (1.8K, nex)

Acknowledgements

EK wishes to thank Eric Wehrenberg-Klee, Stefan Vanderweil and Heather Watchel for help with screening clones and prepping plasmid DNA; Sarah Mathews for the use of computer equipment and many helpful conversations; and the Queitsch, Mathews and Kramer labs for comments on the manuscript. JMH is supported by a grant from the National Science Council, Taiwan (NSC92-2621-B-002-022). The authors would also like to thank two anonymous reviewers for their comments on the manuscript.

References