PLoS Comput Biol. Nov 2008; 4(11): e1000206.
Published online Nov 7, 2008. doi:  10.1371/journal.pcbi.1000206
PMCID: PMC2563028

Facilitated Variation: How Evolution Learns from Past Environments To Generalize to New Environments

Gary Stormo, Editor

Abstract

One of the striking features of evolution is the appearance of novel structures in organisms. Recently, Kirschner and Gerhart have integrated discoveries in evolution, genetics, and developmental biology to form a theory of facilitated variation (FV). The key observation is that organisms are designed such that random genetic changes are channeled in phenotypic directions that are potentially useful. An open question is how FV spontaneously emerges during evolution. Here, we address this by means of computer simulations of two well-studied model systems, logic circuits and RNA secondary structure. We find that evolution of FV is enhanced in environments that change from time to time in a systematic way: the varying environments are made of the same set of subgoals but in different combinations. We find that organisms that evolve under such varying goals not only remember their history but also generalize to future environments, exhibiting high adaptability to novel goals. Rapid adaptation is seen to goals composed of the same subgoals in novel combinations, and to goals where one of the subgoals was never seen in the history of the organism. The mechanisms for such enhanced generation of novelty (generalization) are analyzed, as is the way that organisms store information in their genomes about their past environments. Elements of facilitated variation theory, such as weak regulatory linkage, modularity, and reduced pleiotropy of mutations, evolve spontaneously under these conditions. Thus, environments that change in a systematic, modular fashion seem to promote facilitated variation and allow evolution to generalize to novel conditions.

Author Summary

One of the striking features of evolution is the appearance of novel structures in organisms. The origin of the ability to generate novelty is one of the main mysteries in evolutionary theory. The molecular mechanisms that enhance the evolution of novelty were recently integrated by Kirschner and Gerhart in their theory of facilitated variation. This theory suggests that organisms have a design that makes it more likely that random genetic changes will result in organisms with novel shapes that can survive. Here we demonstrate how facilitated variation can arise in computer simulations of evolution. We propose a quantitative approach for studying facilitated variation in computational model systems. We find that the evolution of facilitated variation is enhanced in environments that change from time to time in a systematic way: the varying environments are made of the same set of subgoals, but in different combinations. Under such varying conditions, the simulated organisms store information about past environments in their genome, and develop a special modular design that can readily generate novel modules.

Introduction

The origin of the ability to generate novelty is one of the main mysteries in evolution. Pioneers of evolutionary theory, including Baldwin [1], Simpson [2], and Waddington [3],[4], suggested how useful novelty might be enhanced by physiological adaptations and by the robustness of the developmental process. These early theories were limited by a lack of knowledge of the molecular mechanisms of development.

Recent decades saw breakthroughs in the depth of understanding of molecular and developmental biology. Many of these findings were unified in the theory of facilitated variation [5], presented by Kirschner and Gerhart, that addresses the following question: how can small, random genetic changes be converted into complex useful innovations? In order to understand novelty in evolution, Kirschner and Gerhart integrated observations on molecular mechanisms to show how the current design of an organism helps to determine the nature and the degree of future variation. The key observation is that the organism, by its intrinsic construction, biases both the type and the amount of its phenotypic variation in response to random genetic mutation [3], [4], [6]–[10]. In other words, the organism seems to be built in such a way that small genetic mutations have a high chance of yielding a large phenotypic payoff.

To understand FV, it is important to compare it to the related concept of evolvability. A biological system is evolvable if it can readily acquire novel functions through genetic changes that help the organism survive and reproduce in future environments [11]. Evolvability is composed of two aspects: (1) variability, the capacity to generate new phenotypes; and (2) fitness, the fitness of the new phenotypes in future environments. Most studies of evolvability have focused on the first aspect, variability. Such studies measured the range and diversity of the phenotypic variation that can be generated by a given mutation, usually without distinguishing between potentially useful phenotypes and non-useful ones [12]–[16] (for an interesting exception see Ciliberti et al. [17]). FV theory adds to previous considerations by focusing on the nature of the generated variation, and specifically on the organism's ability to generate novel phenotypes which are potentially useful.

Facilitated variation (FV) is made possible by certain features of biological design. One of these is the existence of ‘weak regulatory linkage’ [5],[10],[18], where general and non-instructive signals can trigger large pre-prepared responses. For example, changes in growth hormone concentration at a localized position (limb bud in an embryo) can trigger large useful changes in the shape of the limb, driven by the conserved mechanisms for growth of bones, muscles, blood vessels, and nerves [19]. A good example is the ease of changing beak shapes with any of many possible mutations that affect the concentration of a single morphogenic factor [20] (Figure 1A). In weak regulatory linkage, the information about the output is pre-built into the regulated system without instruction from the regulator, which only selects between states. Such regulatory organization reduces the constraints for evolving new regulations and for generating complex potentially useful phenotypes.

Figure 1
A small number of mutations evokes large useful phenotypic adaptation in systems showing facilitated variation.

An additional feature that is important for FV is modular design [21]–[24], seen, for example, in the highly conserved body-plan of the embryo [25],[26] and in the compartmental organization of gene regulation and signaling networks [27]. Modularity helps to relieve the concern that a mutation might interfere with many different parts of the organism. With properly designed modularity, variation within each module can be generated without harming other modules [28]–[31].

Facilitated variation can be in principle studied experimentally, for example by generating mutants and scanning the types of phenotypes generated. For example, a study on mutants of the lac regulatory region indicated that the shape of the gene input function is channeled in directions of AND-like and OR-like functions, rather than other possibilities [32].

An open question is how FV spontaneously evolves. It is not clear how selection in a present environment can lead to designs that increase the probability of useful changes in future environments. How does evolutionary theory account for the emergence of special designs that make it easy to generate novel and useful variation?

The key point in our study is the observation that environments in nature do not vary randomly, but rather seem to have common rules or regularities [33]–[35]. Specifically, environmental goals faced by organisms or molecules may be thought of as composed of a combination of subgoals [33]. When environments change, the organisms encounter a new goal that is still made of the same or similar subgoals. For example, on the level of the organism, the same subgoals, such as digesting food, avoiding predation, and reproducing, must be fulfilled in each new environment but with different nuances and combinations. On the level of cells, the same subgoals such as adhesion and signaling must be fulfilled in each tissue type but with different input and output signals. On the level of proteins, the same subgoals, such as enzymatic activity, binding to other proteins, regulatory input domains, etc., are shared by many proteins but with different combinations in each case.

One may thus propose that in many cases, the different possible environments share a language of modularity, in the sense that they are all made of certain combinations of a set of subgoals. We thus test the possibility that under such patterned varying environments, the organism can learn over many generations the language common to the environments encountered in its past. We ask whether FV arises in such systematically varying environments, by measuring the ability of simple model systems to adapt to new, previously unseen goals, which are in the same language as past goals.

We employ two well-studied model systems: combinatorial logic circuits [33],[34] and RNA secondary structure [12]. We find that the standard experiment of setting a goal which remains constant over time leads to highly optimized systems that show little FV. In contrast, FV is readily generated under modularly varying goals (MVG), in which goals change over time but share the same subgoals [33]. We find that MVG evolution enhances the ability to generate novel phenotypes as long as novelty is modular: phenotypes with novel modules or novel combinations of modules. We show that organisms under MVG store information about past goals in their genomes, and evolve weak linkage that allows small genetic changes to unleash large phenotypic responses that do not ruin the modular structure of the organism. Our study thus suggests that environments that change in a systematic fashion promote the evolution of facilitated variation, and leave an imprint on the evolvability properties of the organisms, allowing them to generalize to new conditions that are in the same language as past conditions.

Results

Description of the Model Systems

Combinatorial logic circuit model

The first model system in this study is circuits made of logic gates, evolved toward a desired Boolean function G. The circuits are composed of NAND gates (NOT-AND function), have several input ports and a single output port. The fitness of the circuit is the fraction of times it computes the desired output, G, when evaluated over all possible combinations of the Boolean values of the inputs. The wiring of the gates is coded in a genome (string of bits). Starting with a population of random genomes, mutations are made and high fitness individuals are selected by means of a standard genetic algorithm (see Methods). The present results hold both in the presence and absence of recombination.
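The fitness evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a simplified feed-forward encoding in which each NAND gate is a pair of indices into the list of earlier signals, and it skips the decoding of a bit-string genome into wiring. The names `nand_circuit_output`, `fitness`, and `xor_wiring` are hypothetical.

```python
from itertools import product


def nand_circuit_output(wiring, inputs):
    """Evaluate a feed-forward NAND circuit.

    wiring: list of (a, b) index pairs. Each index refers to an earlier
    signal: the primary inputs come first, then the outputs of preceding
    gates. The output of the last gate is the circuit output.
    (Assumed encoding, chosen for brevity.)
    """
    signals = list(inputs)
    for a, b in wiring:
        signals.append(1 - (signals[a] & signals[b]))  # NAND gate
    return signals[-1]


def fitness(wiring, goal, n_inputs):
    """Fraction of all input combinations on which the circuit equals the goal."""
    combos = list(product((0, 1), repeat=n_inputs))
    hits = sum(nand_circuit_output(wiring, c) == goal(*c) for c in combos)
    return hits / len(combos)


def xor_goal(x, y):
    return x ^ y


# Hand-wired XOR built from four NAND gates (inputs are signals 0 and 1).
xor_wiring = [(0, 1), (0, 2), (1, 2), (3, 4)]

print(fitness(xor_wiring, xor_goal, 2))   # perfect XOR circuit
print(fitness([(0, 1)], xor_goal, 2))     # a single NAND gate scored against XOR
```

A single NAND gate already matches XOR on three of the four input rows, which illustrates why fitness under this definition starts well above zero for many random circuits.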

We compared evolution of circuits under a goal that is constant over time (called here fixed-goal or FG) to circuits evolved under goals which change from time to time in modular fashion (called modularly varying goals, denoted MVG). In FG evolution, the goal is a Boolean function such as

G1 = (x XOR y) OR (w XOR z)
(1)

where XOR is the exclusive-or function. The resulting circuits have a non-modular design, as previously found [33]. The structure is non-modular despite the fact that the goals, such as G1, can be decomposed into subgoals (two XORs and one OR operations) (Figure 2A).

Figure 2
Schematic view of evolutionary goals and phenotypes in the two model systems.

In contrast, under MVG, instead of keeping the goal fixed, we switched the goal every E = 20 generations. These are rapid changes in comparison to the length of the simulations, 10^5 generations. A wide range of switching times E gives similar results.

Importantly, all goals presented along MVG evolution shared the same subgoals but in different combinations (Figure 2B). For example, we evolved the circuits toward G1 for 20 generations and then switched the goal to a similar function G2, in which one of the XORs is replaced by an EQ (the EQUAL function).

equation image
(2)

and then back to G1 and so on. Similar findings were obtained with three goals, with probabilistic transitions between G1, G2 and a third modularly related goal:

equation image
(3)

Similar findings are also found when OR is changed to AND, for example G2 = (x XOR y) AND (w XOR z). The specific examples were chosen because XOR and EQ are the most difficult two-input Boolean functions to implement with NAND-gate circuits. Contrary to FG evolution, the circuits evolved under MVG are found to have a modular structure: they display a structural module for each of the computational subgoals [33] (e.g. two modules that rapidly rewire by mutations to serve as a XOR or EQ according to the present goal, and a third module that performs an OR operation) (Figure 2B).
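The goal-switching schedule can be sketched as below. The sketch takes G1 to be the OR of two XORs over inputs x, y, w, z, as the stated decomposition into two XORs and one OR implies. Which of the two XORs is replaced by EQ in the second goal is not stated in the text, so replacing the first one is an assumption, as are the names `goal_at` and `truth_table`. Only deterministic alternation is shown, not the probabilistic three-goal variant.

```python
def xor(a, b):
    return a ^ b


def eq(a, b):
    return 1 - (a ^ b)


def g1(x, y, w, z):
    return xor(x, y) | xor(w, z)


def g2(x, y, w, z):
    # Assumption: the first XOR is the one replaced by EQ.
    return eq(x, y) | xor(w, z)


def goal_at(generation, goals=(g1, g2), epoch=20):
    """Goal in force at a given generation, switching every `epoch` generations."""
    return goals[(generation // epoch) % len(goals)]


def truth_table(goal):
    """Outputs of a 4-input goal over all 16 input combinations."""
    return [goal(x, y, w, z)
            for x in (0, 1) for y in (0, 1)
            for w in (0, 1) for z in (0, 1)]


# Number of truth-table rows on which the two goals disagree.
changed = sum(a != b for a, b in zip(truth_table(g1), truth_table(g2)))
print(changed, "of 16 rows differ between the two goals")
```

Under these assumptions the two goals differ on half of the truth table, so each switch demands a large phenotypic change even though only one subgoal is altered.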

RNA secondary structure model

In addition to logic circuits, we studied RNA secondary structures. Here, genomes are RNA nucleotide sequences, and the goal is given by a desired secondary structure. A standard RNA folding algorithm was used to determine the secondary structure of each genome sequence [36]. Fitness was based on the most stable shape (minimum free energy, denoted MFE) corresponding to the genome sequence [12]. The fitness of the sequence is then defined as 1-d/B, where d is the structural distance to the goal and B is the length of the sequence [34].
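The fitness definition 1-d/B can be illustrated with a short sketch. It assumes structures are given in dot-bracket notation and uses the Hamming distance between the two strings as the structural distance d; the distance measure actually used in the simulations may differ. The folding step that produces the MFE structure from a sequence is assumed to happen elsewhere, and the function names are hypothetical.

```python
def structural_distance(structure, goal):
    """Hamming distance between two dot-bracket strings of equal length.

    Assumption: Hamming distance stands in for the structural distance d.
    """
    if len(structure) != len(goal):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(structure, goal))


def rna_fitness(structure, goal):
    """Fitness 1 - d/B, where B is the sequence (and structure) length."""
    return 1 - structural_distance(structure, goal) / len(goal)


hairpin_goal = "((((....))))"
unfolded = "............"

print(rna_fitness(hairpin_goal, hairpin_goal))  # structure equals the goal
print(rna_fitness(unfolded, hairpin_goal))      # fully open chain against a hairpin goal
```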

We evolved an initially random population of RNA sequences toward predefined secondary structure using a standard genetic algorithm. We present in detail the example of a ‘clover leaf’ tRNA structure [12], but other structures gave similar conclusions, see Text S1 section 1.2. This clover leaf has three structural modules, two hairpin loops and one hairpin loop with a bulge. In FG simulations, the goal remained constant along evolution. In the MVG scenario, we switched between goals in a modular way in the sense that the different goal structures shared the same library of structural modules (such as hairpin loops and open loops) but in different combinations (Figure 2C) [34].

MVG Genotypes Adapt Rapidly When Goals Change

In the following, we mainly focus on two representative problems, logic circuits evolved towards combinations of XOR and EQ goals, and RNA molecules evolved towards cloverleaf-like RNA structure. Similar conclusions were found for all six Boolean goals studied and five other RNA structures tested, as detailed in Text S1 sections 1.1 and 1.2.

Under MVG evolution, the evolving circuits or RNA molecules were exposed to a series of goals that are related to each other by their shared set of subgoals. We find that within a few thousand generations, genomes evolve that are able to adapt rapidly, often within a single generation, to each new goal (Figure 1B and 1C). Despite the fact that the phenotypic adaptation is large (e.g. an entire hairpin changes to an unstructured open loop, or a change in about half the bits in the truth table of a circuit goal, see Methods), the adaptation is associated with a very small genetic change, usually only 1–2 mutations.

In contrast, adaptation of organisms evolved under FG is slow when the goal is suddenly switched, even if the switch is to a goal with the same subgoals as the previous goal. FG-organisms take a dozen times more generations to satisfy the new goal (Figure 3A), and require about five times more mutations on average, than organisms evolved under MVG. The same is true for the other goals tested in Text S1. Thus, the response of FG-evolved organisms to changing goals is significantly slower than the response of MVG-evolved organisms to previously seen goals (Figure 3A).

Figure 3
MVG-evolved organisms adapt faster than fixed-goal organisms when goals change.

High-Fitness Phenotypes for Past Goals Are Found within MVG Phenotypic Neighborhood

We next asked what is special about the design of MVG-evolved organisms that facilitates their response to changing goals. For this purpose, we considered the phenotypic neighborhood [37]–[39], defined as the set of phenotypes that are accessible from a given genotype by a single point mutation.
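As a minimal sketch of this definition, the code below enumerates the single point mutants of a bit-string genome and collects their phenotypes. The genotype-to-phenotype map is passed in as a function; the toy map used in the example (counting ones) is purely illustrative and is not one of the paper's model systems. For RNA genomes each position would have three alternative nucleotides rather than one flipped bit.

```python
def point_mutants(genome):
    """Yield every genome that differs from `genome` by one flipped bit."""
    for i, bit in enumerate(genome):
        flipped = "1" if bit == "0" else "0"
        yield genome[:i] + flipped + genome[i + 1:]


def phenotypic_neighborhood(genome, phenotype_of):
    """Set of phenotypes reachable from `genome` by a single point mutation."""
    return {phenotype_of(mutant) for mutant in point_mutants(genome)}


def count_ones(genome):
    # Toy genotype-to-phenotype map, used only for illustration.
    return genome.count("1")


print(phenotypic_neighborhood("0110", count_ones))
```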

We find that the phenotypic neighborhood of MVG-evolved genomes includes phenotypes that have high fitness to the past goals seen in their history (Figure 3B). This indicates that the evolved organism effectively remembers its past goals by storing information about it in its genome. In contrast, in genomes evolved under constant conditions (FG), the fitness of the neighborhood for new goals is significantly lower.

FG populations are known to evolve toward the center of the neutral network, defined as the set of all genotypes with the same phenotype that are connected by neutral mutations [40]–[42]. Thus the FG organisms are more robust to genetic mutations and their phenotypic neighborhood exhibits a lower degree of variation than the MVG organisms. These features are also found in the present study (Text S1, section 4.1). In contrast, MVG organisms seem to be located at the edge of the neutral network that is closest to the neutral networks of the previously seen goals. This implies that temporally varying environments push populations towards special regions of the neutral network.

In addition to genetic mutations, one can also study thermal fluctuations that give rise to alternative structures encoded by a single genotype [12]. Thus, in the RNA model, we considered in addition to the genetic neighborhood also the thermodynamic neighborhood: the set of structures for a given genome that have a free energy that is within 5kT of the minimal free energy (MFE state) and are therefore accessible with a non-negligible probability by thermal fluctuations [43]. We find that the thermodynamic neighborhoods of MVG-evolved genomes include structures that have high fitness for previously seen goals. The FG-evolved genomes we have tested have a thermal neighborhood whose fitness for new goals is significantly lower. In this respect, the thermodynamic neighborhood is similar to the genetic neighborhood (Figure 3C, and Text S1 section 5.2), a phenomenon called ‘plastogenetic congruence’ [12] (Text S1 section 4.1).

The Adaptation to Previously Seen Goals Is Facilitated by Genetic Triggers

We find that the rapid adaptation to previously seen goals in MVG organisms is facilitated by key positions in the genome that can stabilize a desired sub-structure or module among other potential outcomes. We term these positions ‘genetic triggers’, since they can trigger a large and prepared phenotypic response.

To detect genetic triggers one must search for genomic positions that vary in a way that is highly correlated to the change in the goals. This means that triggers carry high information content about the current goal. The genetic triggers can thus be detected by evaluating the mutual information between the environment (goal) and the genomic content at each position (see Methods). Since mutual information measures how much the knowledge of one variable reduces the uncertainty regarding the other, the trigger positions are characterized by high mutual information with the environment (Figure 4A). Trigger positions were readily detected for all MVG cases tested. In the RNA model, we find that mutual information is spread amongst more genomic positions than in the logic circuit model. Triggers can still be clearly detected at sites with much higher mutual information than the background. We find that these trigger nucleotides are positioned within the module that they affect, usually in the stem of a hairpin (Figure 4C and 4D). In this respect, the hairpins evolved in MVG differ from hairpins evolved in FG in that a single change in the trigger can cause a flip between an open loop and a closed hairpin.
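The trigger-detection idea can be sketched as a per-position mutual information profile. The sketch assumes the data are pairs of (genome, goal in force when the genome was sampled); how the samples are collected in the actual Methods is not described here, and the names `mutual_information` and `trigger_profile` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long sequences of labels."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def trigger_profile(genomes, goals):
    """Mutual information between the goal and the symbol at each genomic position."""
    length = len(genomes[0])
    return [
        mutual_information([g[i] for g in genomes], goals)
        for i in range(length)
    ]


# Toy sample: only the last position tracks the goal.
genomes = ["0010", "0011", "0110", "0111"]
goals = ["G1", "G2", "G1", "G2"]
print(trigger_profile(genomes, goals))
```

In the toy sample, positions that are constant or vary independently of the goal score zero, while the position that flips with the goal carries one full bit, which is the signature expected of a trigger.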

Figure 4
Evolution of genetic triggers.

Over time, under MVG conditions, the mutual information between genomes and goals (i.e. environments) gradually becomes focused on a few trigger positions, allowing rapid adaptation when the environment changes (Figure 4B). Since trigger positions are small variations that lead to a sizable switch between pre-designed states, they may be considered a simple example of weak regulatory linkage.

Evolution of Novelty within the MVG ‘Modularity Language’

So far, we analyzed the adaptation to previously seen goals introduced along MVG evolution history, which highlighted the ability of MVG organisms to remember their past. We now turn to novel, previously unseen goals, where we test the ability to generalize based on the past.

The main problem is to define what kind of novel goals might be encountered in future environments that are in the same context as the previous environments. Indeed, adaptation of MVG-organisms toward a randomly picked goal results in evolution that is as slow, or even slower, than FG-organisms (Text S1 section 6.5). But a randomly picked goal has no correlation with the past. To address this, MVG evolution offers the possibility of presenting a previously unseen goal which is in the same ‘language’ as previous history.

This language, in the present case of logic circuits, is defined as the set of all goals that can be decomposed in the following way u(x,y,w,z) = f(g(x,y),h(w,z)), Figure 5A. In other words, the goals in the language are made of a hierarchy of three functions f,g and h, such that g responds to x and y, and h responds to the other two inputs w and z, and f responds to g and h. In the case of the RNA model, the language can be defined as the set of all secondary structures with independent structural modules (e.g., hairpin loops, open loops etc.) that correspond in their genomic positions to the modules of the MVG goals (see Methods and Figure 5B).
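Membership in this language can be checked mechanically for a 4-input Boolean function. The sketch below assumes g and h are single-bit functions, and uses the fact that u(x,y,w,z) = f(g(x,y),h(w,z)) holds exactly when the 4×4 truth table, with rows indexed by (x,y) and columns by (w,z), has at most two distinct rows and at most two distinct columns. The function name is hypothetical. Note that constant functions pass this test, although the text later treats such trivial functions as non-useful.

```python
from itertools import product


def is_decomposable(u):
    """True if u(x, y, w, z) can be written as f(g(x, y), h(w, z)).

    Assumes g and h each output a single bit. A row of the truth table
    can then depend only on g(x, y), so at most two distinct rows may
    occur, and likewise at most two distinct columns.
    """
    pairs = list(product((0, 1), repeat=2))
    rows = {tuple(u(x, y, w, z) for (w, z) in pairs) for (x, y) in pairs}
    cols = {tuple(u(x, y, w, z) for (x, y) in pairs) for (w, z) in pairs}
    return len(rows) <= 2 and len(cols) <= 2


def modular_goal(x, y, w, z):
    return (x ^ y) | (w ^ z)


def crossed_goal(x, y, w, z):
    # Mixes inputs across the (x, y) / (w, z) split, so it is not in the language.
    return (x & w) | (y & z)


print(is_decomposable(modular_goal), is_decomposable(crossed_goal))
```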

Figure 5
Schematic representation of MVG ‘modularity language’.

Within this language, we defined two classes of possible future goals which are novel: (a) New-comb is a goal that presents previously seen subgoals but in a new combination (Figure 6A); (b) Novel-module refers to goals where one of the subgoals is a previously unseen one, while the other subgoals are kept unchanged (Figure 6C). The latter represents a novelty that is restricted to one of the modules of the goal.

Figure 6
Adaptation towards novel modular goals is more rapid in MVG organisms.

We tested evolution under these two classes of novelty. We find that for both logic circuit and RNA models, MVG populations adapted faster than FG populations when introduced to new-comb goals (Figure 6B). We also performed competition experiments in which initial populations were composed of 50% FG-evolved and 50% MVG-evolved genomes. When new-comb goals were presented, the descendants of MVG-evolved genomes took over the population in about 68% of the RNA model runs (Figure 6B inset). Logic circuits showed similar behavior, where MVG-genomes took over the population in about 75% of the runs (Text S1 section 6.3).

We also tested novel-module goals. Here, the RNA model did not show a significant difference between FG and MVG genomes. However, in the logic circuit model, MVG-populations adapted significantly faster also to novel-module goals (Figure 6D). We tested 20 different novel-module goals. For example, a novel goal is generated by replacing a XOR module by a previously unseen 2-input Boolean function, such as AND or NOR defined by its truth table (Figure 6C). We find that MVG's outperformance occurred only toward goals within the modularity language. MVG adaptation toward non-modular goals was not significantly different from FG's (Figure 6E).

In competition experiments [44] between FG and MVG genomes toward novel-module goals, populations were taken over by MVG-genomes in about 70% of the runs (Figure 6D inset). In experiments toward randomly chosen goals, populations had equal chance to be taken over by either FG or MVG genomes (Figure 6E inset). We further find that the harder the novel-module goal (the more generations needed to solve it ‘from scratch’), the more MVG organisms out-perform FG organisms (see Text S1 section 6.4). These results imply that temporally patterned environments not only lead to a memory of the past goals, but also to generalization: the population learned a language of its history of environments (conditions) that share the same common rules.

Mechanisms for Enhanced Evolution of Novelty

To examine the mechanisms for enhanced evolution of novelty within the MVG language, we tested three mechanisms proposed in the theory of FV [5]: (a) mutations have a large effect on their own module, which reduces the number of steps to novelty; (b) mutations have a small effect on other modules, a property also called reduced pleiotropy [45],[46]; and (c) mutations have reduced lethality, increasing viable genetic variance in the population and allowing access to a higher diversity of potential phenotypes. We quantified the effects of mutations according to these suggestions. The results demonstrate that MVG organisms in the present study follow the first two mechanisms, but not the third.

We begin with the first two mechanisms, and treat the third in the next section. To quantify the effect of mutations on their own module and on other modules, we mutated each of the genome positions that correspond to a given module in the phenotype, and tested its phenotypic effect on its own module and on the other modules. The effect of the mutation was quantified as phenotypic distance: Hamming distance between the structures of subsequences in the case of RNA, and between the series of outputs of the gates (over all input combinations) within each module in the case of logic circuits (see Methods).

The results are summarized in Table 1. Significantly enhanced intra-module change and reduced pleiotropy were found in most cases. The two models differed in the extent of these mechanisms: logic circuits showed more reduced pleiotropy, and RNA structures primarily showed more enhanced intra-module change.

Table 1
Intra- and inter-modular effects of mutations.

MVG Evolution Reduces the Genetic Variance of the Population

We now turn to the third mechanism for novel adaptation proposed by FV theory, associated with an increase in the genetic variance of the populations. We evaluated the genetic variance in a population given its current goal by measuring the conditional genomic entropy (Methods, Text S1 section 11, [47]). In contrast to the suggested mechanism of FV theory, we find that MVG populations display lower genetic variance than FG populations (Figure 7A). The reduction in genetic variance indicates that the rapid adaptation of MVG populations in this study is not due to population diversity but rather to useful potential variation within each individual.
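One simple way to estimate such a quantity is sketched below. It is an assumption-laden illustration rather than the measure defined in the Methods: positions are treated as independent, the entropy is averaged over positions, and the conditioning on the goal is done by weighting the entropy of each goal's population sample by its size. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def position_entropy(column):
    """Shannon entropy (bits) of the symbols observed at one genomic position."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def genomic_entropy(population):
    """Mean per-position entropy of a population of equal-length genomes.

    Assumption: positions are treated as independent.
    """
    length = len(population[0])
    return sum(
        position_entropy([g[i] for g in population]) for i in range(length)
    ) / length


def conditional_genomic_entropy(samples):
    """Entropy given the goal: size-weighted mean over per-goal populations.

    samples: dict mapping a goal label to the genomes sampled under that goal.
    """
    total = sum(len(pop) for pop in samples.values())
    return sum(
        len(pop) / total * genomic_entropy(pop) for pop in samples.values()
    )


samples = {"G1": ["AC", "CA"], "G2": ["AA", "AA"]}
print(conditional_genomic_entropy(samples))
```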

Figure 7
Reduction in genetic variance in MVG evolution.

Why do MVG-populations show a lower genetic variance? One possibility is that they evolve to store information about past environments in their genome, placing constraints on the sequence (strong stabilizing selection). To test this, we studied the effect of increasing the number of goals introduced over time in MVG. We find that the more goals (or more precisely, the higher the information content in the environment), the lower the genetic variance in the population (Figure 7B). Organisms evolved in a constant environment seem to store less information and have higher genomic entropy (Figure 7A).

An additional way to understand the low variance in MVG genomes compared to FG genomes is to consider that the latter are more robust to genomic mutations (see Text S1 section 4.1). Hence, they display more positions in the genome that can be varied without affecting the phenotype. Robustness to mutations thus allows higher genetic variance in the population [48], and conversely, strong constraints on the genome lead to lower genetic variance and sensitivity to mutations in MVG organisms.

As an example of storage of information in the genome and its effect on genetic variance, consider Figure 7C and 7D. Populations of RNA molecules that evolve toward a fixed secondary structure G1 that contains an open loop are found to show high variance in the genomic positions that form the open loop. This is because forming a loop is relatively easy, as there are few constraints for base-pairing. In contrast, populations evolved under MVG environments in which the goal repeatedly switched between G1 and G2 (Figure 7C) show lower variance in the corresponding “loop region”. The evolved MVG loop carries information about its past, and is ready to become a stem by a single ‘trigger’ mutation. The information acquired by the loop is reflected in the pronounced decrease in the variance of that genomic region in MVG populations (Figure 7D).

We note that an increase in variance might be expected in more complex models, especially when spatial heterogeneity can allow several metapopulations to exist by using recombination as an efficient adaptation mechanism. High variance may also occur if the genomes cannot store the required information (see Text S1 section 10.2 for an example).

The Phenotypic Neighborhood of MVG Genotype Is Enriched with Novel ‘Useful’ Phenotypes

We find an additional property of MVG-evolved genomes which helps to overcome barriers to novelty and further reflects the ability to generalize, in the case of logic circuits. In a preceding section, we showed that the MVG phenotypic neighborhood is enriched with phenotypes that are close to previously seen goals. We now turn to possible future goals. We scanned the phenotypic neighborhoods for goals within the same modularity language as previous goals. We find, in the case of logic circuits, that the phenotypic neighborhood of an MVG-circuit is enriched with modular circuits that compute decomposable (modular) functions of the form u(x,y,w,z) = f(g(x,y),h(w,z)) (Figure 8). In contrast, the neighborhood of an FG-circuit includes more functions that are not decomposable and thus are not within this “modularity language” (see Text S1 section 7). This property was not found for the RNA model.

Figure 8
Enrichment of phenotypic neighborhood of logic circuits with novel ‘useful’ phenotypes.

Quantitative Measure of Facilitated Variation Shows That It Is Enhanced during MVG Evolution

Finally, we aimed to define a quantitative measure for facilitated variation. A desired measure should capture the two main components of biased variation: (a) the quantity component [41], namely the enrichment of the phenotypic neighborhood with potentially useful phenotypes that are novel; and (b) the quality component, namely access to as many different potentially useful novel phenotypes as possible, which are as far as possible in phenotypic distance from the wild-type [49].

We chose a simple FV measure, among other possible choices, which is the product of these two components (see Text S1 section 8.1). The ‘quantity’ component is the probability of forming a potentially useful phenotype which is novel by a single point mutation; the ‘quality’ component is the average phenotypic distance between the wild-type and the potentially useful phenotypes within its phenotypic neighborhood. This measure is then normalized for its corresponding value with respect to non-useful neighboring phenotypes.

FV = (N_useful × <d(P_useful, P_0)>) / (N_non-useful × <d(P_non-useful, P_0)>)

Here, useful phenotypes correspond to phenotypes with the same modular structure as the goals in which the organism has previously evolved. N_useful is the number of neighbors which have a modular phenotype (useful) and are different from the wild-type phenotype (novel), and <d(P_useful, P_0)> is the mean distance between novel and useful phenotypes and the wild-type phenotype P_0. Similar definitions apply for the denominator, where non-useful means phenotypes that do not have the modular structure of previous goals (in the logic circuit model this includes either trivial functions such as an output of all ones or all zeros, or non-decomposable Boolean functions).

According to the formula, an organism with high FV has a high likelihood of forming potentially useful variation and a relatively low probability of varying towards non-useful phenotypic directions (see Methods and Figure 9A).
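
As an illustration of how such a measure could be computed from a sampled phenotypic neighborhood, the following Python sketch multiplies the count of novel useful neighbors by their mean distance from the wild-type and divides by the same product for non-useful neighbors. The function name `fv_measure` and the callables `is_useful` and `distance` are hypothetical, and the ratio form of the normalization is an assumption based on the description above rather than the authors' implementation. Because numerator and denominator are taken over the same neighborhood, using counts instead of probabilities gives the same ratio.

```python
from statistics import mean


def fv_measure(wild_type, neighbors, is_useful, distance):
    """Sketch of an FV-style ratio for one genome's phenotypic neighborhood.

    neighbors: phenotypes reached by single point mutations (hypothetical input).
    is_useful: predicate marking phenotypes with the modular structure of past goals.
    distance:  phenotypic distance function d(P, P0).
    """
    # Novel phenotypes are those that differ from the wild-type phenotype.
    novel = [p for p in neighbors if p != wild_type]
    useful = [p for p in novel if is_useful(p)]
    non_useful = [p for p in novel if not is_useful(p)]
    if not useful:
        return 0.0
    if not non_useful:
        return float("inf")  # no non-useful variation to normalize by
    numerator = len(useful) * mean(distance(p, wild_type) for p in useful)
    denominator = len(non_useful) * mean(distance(p, wild_type) for p in non_useful)
    return numerator / denominator


def hamming_fraction(a, b):
    """Fraction of positions at which two equal-length output vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy usage: phenotypes are 4-bit output vectors, "useful" is an arbitrary toy rule.
wild = (0, 0, 0, 0)
neigh = [(0, 0, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0), (1, 1, 1, 1)]
fv = fv_measure(wild, neigh, lambda p: sum(p) == 2, hamming_fraction)
```

In this toy neighborhood one useful neighbor lies at distance 0.5, while two non-useful neighbors lie at a mean distance of 0.625, so the ratio is below one.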

Figure 9
Dynamics of facilitated variation.

We find that the FV measure increases with generations under both FG and MVG evolution (Figure 9B and 9C, Text S1 section 8.2). However, it increases significantly more under MVG. The increase under FG evolution seems to result from the increase in robustness (increased probability of generating wild-type phenotype or close to wild-type phenotypes). Finally, we performed experiments in which the initial population consisted of genomes with high FV that were evolved by MVG. We then placed this population under a fixed goal, corresponding to their last seen goal, but presented constantly over time. We find that FV decreased rapidly within a few tens of generations provided there is even a slight selection pressure for small circuit size (Figure 9D, following [33]). This result demonstrates the role of modularly varying goals in preserving facilitated phenotypic variation in the face of more optimal, low FV circuits.

Discussion

This study quantitatively examined facilitated variation in model systems and demonstrated that it is enhanced in modularly varying environments as compared to constant environments. When the environment varies in a modular fashion (or, more generally, in a systematic manner), it is possible to define feasible future environments that belong to the same ‘language’ as past environments. Hence, one can define a context-specific evolvability: the extent to which organisms can generalize and generate novelty that is useful in the context of feasible future environments.

The present results suggest that adaptation to new goals in MVG relies on the evolvability properties of each individual [50]. The evolved organisms are intrinsically designed for a certain class of changes. Organisms that evolve under MVG develop weak linkage implemented by ‘trigger’ genomic positions that elicit a large phenotypic payoff upon minimal genetic investment. The triggers elicit substantial changes in one module and have low effect on other modules (low pleiotropy). The genomes are such that their genomic neighborhood is enriched with a wide range of potentially useful phenotypes – useful in the context of the previous goals ‘learned’ by the organism. Thus, the evolved genomes carry information about past goals. This information effectively prepares the organism for the future, provided that future goals are related to past goals.

The evolution of facilitated variation is time-scale dependent: if goals switched very rarely, it would be equivalent to a succession of FGs. On the other hand, if goals switched too fast, there would not be sufficient time for the required information to be assimilated. We find that in the case of the present models, the rate of environmental switching that gives rise to evolvable organisms spans several orders of magnitude [33],[34].

This study employed two different models to study facilitated variation, logic circuits and RNA structures. Importantly, these two models differ in the type of modularity in their goals. RNA goals contained explicit structural modules (e.g. hairpin loops). Every RNA structure that satisfies such goals is modular by definition. In contrast, the modularity in logic circuit goals is implicit. Circuits that satisfy a modular goal can have either a modular circuit structure or a non-modular one. Modular circuit structures are in fact much rarer, and tend to evolve only under MVG, where switching between goals with shared modules constrains the circuits to evolve structural modules [33]. This difference between the RNA and logic circuit models may underlie the fact that logic circuits showed a very strong enhancement of facilitated variation in MVG compared to fixed goals, whereas the RNA model showed a more modest enhancement. These two models are approximations to different aspects of biological design: cell signaling and regulation networks that compute responses to signals are more analogous to the circuit model, whereas molecular structures are akin to the RNA model.

What happens if goals vary over time but in a non-modular fashion? We find that an environment that varies between randomly chosen goals typically causes confusion, where no good solution is found that can rapidly adapt to both goals. It is possible, however, to find pairs of goals which are not modular and yet have solutions that are only a few mutations away from each other; in other words, goals whose neutral networks happen to come very close at a certain point. Here, genomes evolve that show rapid adaptation each time the goal switches, but do not have modular phenotypes. However, it is hard to define facilitated variation towards novel goals in this case, since one cannot define the future goals that are in the same ‘language’. Adaptation to novel goals is generally very poor (see Text S1 sections 2.1 and 6.5). In summary, evolution under non-modular varying environments might lead in certain cases to memory but not to generalization.

Modularly varying goals seem to enhance facilitated variation because of two main effects: (i) they greatly improve the chances that solutions for the different goals exist close to each other in genetic space (because the same modules need only be rewired by a few mutations); and (ii) they offer the possibility not only of learning past goals but also of generalizing to future goals, as long as these are made of the same subgoals or have the same division into modules as previous goals. Finally, we note that facilitated variation comes with a cost: organisms are less optimal to the current goal than they might have been. For example, logic circuits that have high FV are usually composed of more logic gates than the optimal circuits that evolve if this goal is kept constant for very long times. Modularity, genetic triggers, and storage of information about the past in the genome seem to demand more genes than is absolutely required to solve the problem. Extreme optimality to present environments is sacrificed to provide readiness to future ones.

Organisms or molecules that are under constant conditions [51],[52] are predicted by the present theory to lose their FV design, and become less evolvable. One may test this prediction by comparing organisms that evolved in varying and relatively constant environments [51],[52]. A further prediction is that any fluctuation in the system (such as molecular noise [53] or thermal fluctuation) would result in an output that is channeled in potentially useful directions.

In summary, the present study aimed at studying facilitated variation in simple model systems. Populations evolved under systematically varying conditions were found to exhibit not only a memory of past goals but were also able to generalize to new conditions that are in the same language as previous conditions. Adaptation to useful novel goals was enhanced by organisms that have learned the shared subgoals that existed in past environments and are therefore likely to be encountered in future environments. Several elements of facilitated variation theory, such as genetic triggers, modularity, and reduced pleiotropy of mutations seem to evolve spontaneously under these conditions. It would be interesting to study the evolution of additional FV mechanisms such as exploratory behavior and body-plan compartmentalization using more elaborate models with hierarchical designs and developmental programs.

Methods

Genetic Algorithm

We used a standard genetic algorithm [54],[55] to evolve combinatorial logic circuits and a structural model of RNA. The settings of the algorithm were as follows: a population of Npop individuals was initialized to random binary genomes of length B bits (random nucleotide sequences of length B bases in the case of RNA; in the main examples B = 76 for the RNA and B = 104 for logic circuits). In each generation, Npop individuals were selected with repeats from the previous generation according to a probability that scales exponentially with their fitness (selection strategy, see Text S1 section 1). Pairs of genomes from the selected individuals were recombined, using crossover probability Pc (Pc = 0.5 for the circuits model; Pc = 0 for the RNA model), and then each genome was randomly mutated (mutation probability Pm = 0.7/B per locus per genome). The present conclusions for the logic circuit model are generally valid also in the absence of recombination (Pc = 0). The present results were based on simulations of a population of size Npop = 5000 evolved for L = 10^5 generations for the circuit model, and a population of Npop = 500 evolved for L = 10^5 generations for the RNA model. These population sizes were empirically found to serve as minimal values for many of the presented effects, which seem to apply also for larger population sizes. For statistical analyses we considered only simulations that ended with maximal fitness of 1 within the predefined generation limit L. Similar conclusions were found when analyzing all runs.
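
The generation step described above can be sketched as follows for binary genomes. This is a minimal illustration under stated assumptions, not the authors' code: the selection-strength parameter `beta`, the use of single-point crossover, and the function name `evolve_one_generation` are all hypothetical choices, since the text specifies only that selection probability scales exponentially with fitness and that pairs are recombined with probability Pc.

```python
import math
import random


def evolve_one_generation(population, fitness, rng, pc=0.5, pm_total=0.7, beta=10.0):
    """One generation: exponential-fitness selection, crossover, point mutation.

    population: list of equal-length lists of 0/1 bits.
    beta:       hypothetical selection strength (not given in the text).
    pm_total:   expected mutations per genome; per-locus rate is pm_total / B.
    """
    n_pop = len(population)
    genome_len = len(population[0])
    # Select Npop parents with repeats; weights grow exponentially with fitness.
    weights = [math.exp(beta * fitness(g)) for g in population]
    parents = rng.choices(population, weights=weights, k=n_pop)
    offspring = []
    for i in range(0, n_pop, 2):
        a = list(parents[i])
        b = list(parents[(i + 1) % n_pop])
        # Assumed single-point crossover, applied with probability pc.
        if rng.random() < pc:
            cut = rng.randrange(1, genome_len)
            a[cut:], b[cut:] = b[cut:], a[cut:]
        offspring.extend([a, b])
    offspring = offspring[:n_pop]
    # Flip each bit independently with probability Pm = pm_total / B.
    pm = pm_total / genome_len
    for genome in offspring:
        for locus in range(genome_len):
            if rng.random() < pm:
                genome[locus] ^= 1
    return offspring


# Toy usage with a placeholder fitness (fraction of ones), not a circuit goal.
rng = random.Random(0)
pop = [[rng.randint(0, 1) for _ in range(16)] for _ in range(20)]
next_pop = evolve_one_generation(pop, lambda g: sum(g) / len(g), rng)
```

Setting `pc=0.0` reproduces the no-recombination setting used for the RNA model, although an RNA genome would need a four-letter mutation operator instead of a bit flip.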

Logic Circuits Evolution (Model 1)

Circuits were composed of up to twelve 2-input NAND gates. The binary genome coded for the circuit wiring as described in [33],[34],[54],[55]. Self loops and feedback loops were allowed. Goals were 4-input 1-output Boolean functions composed of XOR, EQ, AND, and OR operations. The goals were of the form u(x,y,w,z) = f(g(x,y),h(w,z)), where g and h were 2-input XOR or EQ functions, and f was an AND or an OR function [34]. Each Boolean function can be represented as a truth table, where each row represents a different combination of input values (0 or 1), and the relevant output value (again 0 or 1). Thus each goal can be uniquely defined by the output column vector. The fitness of each circuit was defined as the fraction of correct outputs over all possible inputs. In the MVG simulations the goals were modularly related by changing the functions f, g, or h. The goal changed over time in a probabilistic manner every E = 20 generations.
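
To make the goal representation concrete, the sketch below builds the 16-entry output column of a goal u(x,y,w,z) = f(g(x,y),h(w,z)) and scores a candidate output column by the fraction of matching entries. The helper names `goal_outputs` and `fitness` are illustrative, and the row ordering of the truth table is an arbitrary choice made here.

```python
from itertools import product


def XOR(a, b):
    return a ^ b


def EQ(a, b):
    return 1 - (a ^ b)


def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


# All 16 input combinations (x, y, w, z), in a fixed but arbitrary order.
INPUTS = list(product((0, 1), repeat=4))


def goal_outputs(f, g, h):
    """Output column of the goal u(x,y,w,z) = f(g(x,y), h(w,z))."""
    return tuple(f(g(x, y), h(w, z)) for x, y, w, z in INPUTS)


def fitness(circuit_outputs, goal):
    """Fraction of truth-table rows on which the circuit matches the goal."""
    matches = sum(c == t for c, t in zip(circuit_outputs, goal))
    return matches / len(goal)


# Two modularly related goals: same subgoals g and h, different combiner f.
goal_and = goal_outputs(AND, XOR, XOR)
goal_or = goal_outputs(OR, XOR, XOR)
score = fitness(goal_and, goal_or)
```

A circuit that perfectly solves the AND-combined goal agrees with the OR-combined goal on half of the rows, which illustrates why a goal switch lowers fitness without resetting it to the random level.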

RNA Secondary Structure (Model 2)

We followed the work of Schuster [42] and Ancel and Fontana [12] and used standard tools for structure prediction available at http://www.tbi.univie.ac.at/RNA/, and the “tree edit” structural distance [56]. The goals were secondary structures of length 60–90 nucleotides such as the Saccharomyces cerevisiae phenylalanine tRNA and synthetic secondary structures composed of three hairpins (for the full list of structures see Text S1 section 1.2). In MVG, the modular changes were applied by modifying a single hairpin at a time (such as changing the shape of the hairpin to an open loop). Goals changed every E = 20 generations (unless otherwise noted).

Normalized Fitness

Normalized fitness in Figures 3A, 6B, 6D, and 6E is defined as $F_n = (F - F_r)/(1 - F_r)$, where F is the maximal fitness in the population and Fr is the average maximal fitness of a population of Npop random genomes. Normalized fitness Fn = 1 means a perfect solution to the goal, and Fn = 0 means a solution that is as good as expected in a random population of the same size. For the purposes of computing the best fitness X of a genetic neighborhood of a given system with phenotype P, as in Figure 3B and 3C, we used a normalization in which Fr is the value of X averaged over Npop samples taken from genomes with the same phenotype P. In the case of logic circuits, genomes with the phenotype P were obtained by a simulated annealing optimization algorithm, which produced genomes that satisfy the desired goal. In the case of RNA structures, genomes with phenotype P were generated using a standard inverse fold algorithm [36]. The normalized fitness of the genetic neighborhood is $X_n = (X - F_r)/(1 - F_r)$.

Quantitative Measure of Genetic Variance

Following Adami et al. [48], genetic variance was measured using entropy H computed as follows. In an RNA genome of length B, each position can hold one of the 4 possible nucleotides with probabilities $P_{i,j}$, where i = 1..B and j ∈ {C,G,A,U}. The entropy of position i is $H_i = -\sum_j P_{i,j} \log(P_{i,j})$. The maximal entropy per position (using logarithm of base 4) is 1, which occurs when the nucleotide distribution at that site is uniform. Perfectly conserved positions have zero entropy, meaning that they contain maximal information (see Text S1 section 11.1). The nucleotide probabilities for each genomic position were computed from the population genomes. The genetic entropy is the sum of the entropies of all positions. We note that this is only an approximation of the full genomic entropy since we ignore the epistatic relations between positions. It is also important to note that this measure is not the marginal genomic entropy but the conditional entropy of the genome given its current environment (for FG, the two measures coincide). For an example, see Text S1 section 10.1.
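
The per-site entropy and its sum over positions can be sketched in a few lines of Python. The function names are illustrative, and the population is assumed here to be a list of equal-length nucleotide strings; as in the text, positions are treated independently, so epistatic relations are ignored.

```python
import math
from collections import Counter


def site_entropy(column, alphabet_size=4):
    """Entropy of one genomic position, using a logarithm of base alphabet_size."""
    total = len(column)
    counts = Counter(column)
    # Nucleotides absent from the column contribute nothing (0 * log 0 := 0).
    return -sum((c / total) * math.log(c / total, alphabet_size)
                for c in counts.values())


def genetic_entropy(population):
    """Sum of per-site entropies over all positions of the aligned genomes."""
    return sum(site_entropy(column) for column in zip(*population))


# Toy population: position 0 is perfectly conserved, position 1 is uniform.
toy_population = ["AC", "AG", "AA", "AU"]
total_entropy = genetic_entropy(toy_population)
```

In the toy population the conserved position contributes zero and the uniform position contributes one, so the summed entropy equals one.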

Detection of Genetic Triggers

In order to detect the genetic triggers in a genome, we computed the mutual information I between target goal T and specific genomic site i, Xi, as I(Xi,T) = H(Xi)−H(Xi |T) where H is the entropy per site as described above [see Text S1 sections 10 and 11]. Triggers are defined by the positions with the highest mutual information (I) between goal and genomic contents.
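
A minimal sketch of this trigger detection is given below. It assumes that each sampled genome is paired with the goal under which it was sampled, which is one plausible way to estimate the conditional entropy and is not stated in this form in the text; the names `site_goal_mutual_information` and `rank_triggers` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(values, base=4):
    """Entropy of a list of symbols, using a logarithm of the given base."""
    total = len(values)
    return -sum((c / total) * math.log(c / total, base)
                for c in Counter(values).values())


def site_goal_mutual_information(site_values, goals, base=4):
    """I(Xi, T) = H(Xi) - H(Xi | T) for one genomic site."""
    by_goal = defaultdict(list)
    for value, goal in zip(site_values, goals):
        by_goal[goal].append(value)
    n = len(site_values)
    # H(Xi | T) is the goal-frequency-weighted average of the per-goal entropies.
    conditional = sum(len(vals) / n * entropy(vals, base)
                      for vals in by_goal.values())
    return entropy(site_values, base) - conditional


def rank_triggers(genomes, goals, top=3):
    """Indices of the sites with the highest mutual information with the goal."""
    scores = [site_goal_mutual_information(column, goals)
              for column in zip(*genomes)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top]


# Toy data: site 1 switches between C and G together with the goal.
toy_genomes = ["AC", "AC", "AG", "AG"]
toy_goals = ["T1", "T1", "T2", "T2"]
best_sites = rank_triggers(toy_genomes, toy_goals, top=1)
```

A site that is conserved across goals carries no information about the goal, whereas a site whose content tracks the goal exactly has conditional entropy zero and therefore mutual information equal to its marginal entropy.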

Intra- and Inter-Modular Effects of Mutations

To define the effects of mutations on phenotype modules, we first computed the modules in each phenotype. For RNA this was based on the modular partition of the structure (into hairpin loops etc.), and in logic circuits, modules were defined using the Newman-Girvan algorithm [57]. We then measured the effects of each possible genomic mutation on the phenotype of its own module, and on the phenotype of all other modules. In the RNA model, the effect of mutations on the phenotype of each module was evaluated by the distance d between the wild-type and the mutant structure in each module (Hamming distance between the string representations of the secondary structure [58]). In the case of logic circuits, the output series of each gate was evaluated, and the Hamming distance d between the mutant and wild-type was evaluated for each gate. The intra-module effect of a mutation was the mean of all changes in the same module as the mutated gate, and the inter-module effect of a mutation was the mean effect on the output of all gates in all other modules. The physical ranges of those effects were estimated by analyzing samples from the solution space (obtained by optimization algorithms).
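
For the logic circuit case, the averaging step can be sketched as follows, assuming that the per-gate Hamming distances and the module assignment have already been computed. The function name `module_effects` and its inputs are hypothetical, and including the mutated gate itself in the intra-module mean is an assumption, since the text does not say whether it is excluded.

```python
from statistics import mean


def module_effects(gate_changes, gate_module, mutated_gate):
    """Mean effect of one mutation inside and outside the mutated gate's module.

    gate_changes: gate -> Hamming distance between mutant and wild-type outputs.
    gate_module:  gate -> module label (e.g. from a community detection step).
    """
    own_module = gate_module[mutated_gate]
    intra = [d for gate, d in gate_changes.items() if gate_module[gate] == own_module]
    inter = [d for gate, d in gate_changes.items() if gate_module[gate] != own_module]
    intra_effect = mean(intra) if intra else 0.0
    inter_effect = mean(inter) if inter else 0.0
    return intra_effect, inter_effect


# Toy circuit: gates 0-1 form module A, gates 2-3 form module B.
changes = {0: 4, 1: 2, 2: 0, 3: 0}
modules = {0: "A", 1: "A", 2: "B", 3: "B"}
intra_effect, inter_effect = module_effects(changes, modules, mutated_gate=0)
```

A mutation with a large intra-module effect and a near-zero inter-module effect, as in this toy case, corresponds to the low-pleiotropy pattern discussed in the main text.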

Logic Circuits Modularity

To quantify the modularity of a network we used the normalized Qm measure of Kashtan et al. [33],[51].

Definition of Phenotypic Distance

Logic circuit model

A logic circuit computes a Boolean function of its inputs, thus the phenotype can be described as a truth table (in our model, the goal function was 4-input, 1-output). We define the phenotypic distance between two circuits as the Hamming distance between the corresponding output columns of the two truth tables, i.e. the fraction of entries that differ between the two circuits. In cases in which the output of the gate/circuit was time-dependent (oscillatory), we simulated the output of the gate/circuit over a window of 20 time-points. The final phenotypic distance was obtained by averaging the truth-table distances over all time-points, and taking the best result out of all possible frames with a sliding window of 1 time-point.

RNA secondary structure model

The phenotype of an RNA sequence is a secondary structure that can be represented as a string of left and right parentheses [36]. We used the ‘tree-edit’ distance [56] to compute the phenotypic distance between two legal structures (i.e. structures with balanced left and right parentheses, where the number of left parentheses is always larger than or equal to the number of right parentheses when reading the string from left to right). When measuring the phenotypic change in a certain module, the ‘tree-edit’ distance cannot be applied (since it operates on two legal structures, and a sub-structure in a mutant genome is not necessarily legal). In such cases, we measured the Hamming distance between the two parenthesis sub-strings.
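
The legality condition and the fallback Hamming distance can be sketched as below. The sketch assumes the usual Vienna dot-bracket convention in which unpaired positions are written as dots; the function names are illustrative, and the tree-edit distance itself is not implemented here.

```python
def is_legal_structure(structure):
    """True if parentheses are balanced and no prefix closes more than it opens."""
    depth = 0
    for symbol in structure:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                return False  # a right parenthesis with no matching left one
    return depth == 0


def hamming_distance(a, b):
    """Number of positions at which two equal-length structure strings differ."""
    if len(a) != len(b):
        raise ValueError("sub-strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


# A module sub-string cut from a mutant may be illegal on its own,
# in which case only the Hamming distance is applicable.
wild_type_module = "((..))"
mutant_module = "(....)"
module_change = hamming_distance(wild_type_module, mutant_module)
```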

Definition of Potentially Useful Phenotypes

Logic circuit model

A potentially useful phenotype in the present context is a decomposable (i.e. modular) Boolean function of the form u(x,y,w,z) = f(g(x,y), h(w,z)), where f, g, and h correspond to any 2-input, 1-output Boolean function, such as AND, NAND, OR, XOR, or EQ. Trivial cases such as u(x,y,w,z) = 0 or u(x,y,w,z) = x were not considered.
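
One way to test whether a given 16-entry output column is decomposable in this sense is to enumerate every combination of 2-input functions f, g, and h, as in the sketch below. The exact set of excluded trivial functions is not spelled out in the text; excluding constant outputs and outputs identical to a single input is an assumption made here, and all names are hypothetical.

```python
from itertools import product

# All 16 input combinations (x, y, w, z), and all 16 two-input Boolean
# functions written as truth tables indexed by 2*a + b.
INPUTS = list(product((0, 1), repeat=4))
TWO_INPUT_TABLES = list(product((0, 1), repeat=4))


def apply2(table, a, b):
    """Evaluate a 2-input function given as a 4-entry truth table."""
    return table[2 * a + b]


def decomposable_functions():
    """Output columns of every u(x,y,w,z) = f(g(x,y), h(w,z))."""
    found = set()
    for f, g, h in product(TWO_INPUT_TABLES, repeat=3):
        found.add(tuple(apply2(f, apply2(g, x, y), apply2(h, w, z))
                        for x, y, w, z in INPUTS))
    return found


def is_trivial(u):
    """Assumed triviality rule: constant output, or output equal to one input."""
    if len(set(u)) == 1:
        return True
    return any(u == tuple(row[k] for row in INPUTS) for k in range(4))


DECOMPOSABLE = decomposable_functions()


def is_potentially_useful(u):
    """True if u is decomposable and not trivial under the assumed rule."""
    u = tuple(u)
    return u in DECOMPOSABLE and not is_trivial(u)


# (x XOR y) AND (w XOR z) is decomposable; (x AND w) OR (y AND z) mixes the
# two input pairs and cannot be written as f(g(x,y), h(w,z)).
modular_u = tuple((x ^ y) & (w ^ z) for x, y, w, z in INPUTS)
mixed_u = tuple((x & w) | (y & z) for x, y, w, z in INPUTS)
```

The enumeration covers only 16 × 16 × 16 combinations, so precomputing the full set once and testing membership is cheap when scanning many neighboring phenotypes.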

RNA secondary structure model

A potentially useful neighboring structure in the present context is a structure with independent structural modules that correspond in their genomic positions to the wild-type modules. To define this, consider a phenotype P′ in the phenotypic neighborhood of sequence S0, with MFE structure P0 (the wild-type structure). We say that P′ is a viable phenotype if: (i) P′ has legal sub-structures (legal parenthesis strings) at the genetic positions that correspond to P0's modules, and (ii) the genomic positions that correspond to distinct inter-module locations in P0 (for example, positions between modules 1–2 and positions between modules 3–4) are not base-paired with each other in P′.

Supporting Information

Text S1

Supporting Information. Includes additional detailed examples and analysis.

(3.02 MB DOC)

Acknowledgments

We thank A. E. Mayo, M. Kirschner, R. Milo, E. Noor, S. Itzkovitz, E. Dekel, S. Kaplan, and W. Fontana for comments and discussions.

Footnotes

The authors have declared that no competing interests exist.

NIH and the Kahn Family.

References

1. Baldwin M. A New Factor in Evolution. The American Naturalist. 1896;30:441–451.
2. Simpson G. The Baldwin effect. Evolution. 1953;7:110–117.
3. Waddington CH. Canalization of development and genetic assimilation of acquired characters. Nature. 1959;183:1654–1655.
4. Waddington CH. Genetic assimilation. Adv Genet. 1961;10:257–293.
5. Kirschner M, Gerhart JC. The Plausibility of Life. London (United Kingdom): Yale University Press; 2005.
6. West-Eberhard MJ. Developmental plasticity and the origin of species differences. Proc Natl Acad Sci U S A. 2005;102(Supplement 1):6543–6549.
7. Gould SJ. Darwinism and the expansion of evolutionary theory. Science. 1982;216:380–387.
8. West-Eberhard MJ. Phenotypic accommodation: adaptive innovation due to developmental plasticity. J Exp Zoolog B Mol Dev Evol. 2005;304:610–618.
9. Rutherford SL, Lindquist S. Hsp90 as a capacitor for morphological evolution. Nature. 1998;396:336–342.
10. Gerhart J, Kirschner M. Cells, Embryos, and Evolution: Toward a Cellular and Developmental Understanding of Phenotypic Variation and Evolutionary Adaptability. Oxford: Blackwell Publishers; 1997.
11. Wagner A. Robustness, evolvability, and neutrality. FEBS Lett. 2005;579:1772–1778.
12. Ancel LW, Fontana W. Plasticity, evolvability, and modularity in RNA. J Exp Zool. 2000;288:242–283.
13. Gardner A, Zuidema W. Is evolvability involved in the origin of modular variation? Evolution Int J Org Evolution. 2003;57:1448–1450.
14. Hansen TF. Is modularity necessary for evolvability? Remarks on the relationship between pleiotropy and evolvability. Biosystems. 2003;69:83–94.
15. Wagner A. Robustness and evolvability: a paradox resolved. Proc Biol Sci 2007
16. Draghi J, Wagner GP. Evolution of evolvability in a developmental model. Evolution Int J Org Evolution. 2008;62:301–315.
17. Ciliberti S, Martin OC, Wagner A. Innovation and robustness in complex regulatory gene networks. Proc Natl Acad Sci U S A. 2007;104:13591–13596.
18. Conrad M. The Geometry of Evolution. Rivista di Biologia/Biology Forum. 1996;89:21–54.
19. Cohn MJ, Patel K, Krumlauf R, Wilkinson DG, Clarke JD, et al. Hox9 genes and vertebrate limb specification. Nature. 1997;387:97–101.
20. Abzhanov A, Protas M, Grant BR, Grant PR, Tabin CJ. Bmp4 and morphological variation of beaks in Darwin's finches. Science. 2004;305:1462–1465.
21. Schlosser G, Wagner G. Modularity in Development and Evolution. Chicago: Chicago University Press; 2004.
22. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402:C47–52.
23. Variano EA, McCoy JH, Lipson H. Networks, dynamics, and modularity. Phys Rev Lett. 2004;92:188701.
24. Alon U. An Introduction to Systems Biology: Design Principles of Biological circuits. Chapman & Hall/CRC; 2006.
25. Gerhart J, Lowe C, Kirschner M. Hemichordates and the origin of chordates. Curr Opin Genet Dev. 2005;15:461–467.
26. Gerhart J. Evolution of the organizer and the chordate body plan. Int J Dev Biol. 2001;45:133–153.
27. Kirschner M, Gerhart J. Evolvability. Proc Natl Acad Sci U S A. 1998;95:8420–8427.
28. Wagner GP, Altenberg L. Complex Adaptations and the Evolution of Evolvability. Evolution. 1996;50:967–976.
29. Wagner GP, Pavlicev M, Cheverud JM. The road to modularity. Nat Rev Genet. 2007;8:921–931.
30. Winther RG. Varieties of modules: kinds, levels, origins, and behaviors. J Exp Zool. 2001;291:116–129.
31. Schlosser G, Thieffry D. Modularity in development and evolution. Bioessays. 2000;22:1043–1045.
32. Mayo AE, Setty Y, Shavit S, Zaslaver A, Alon U. Plasticity of the cis-regulatory input function of a gene. PLoS Biol. 2006;4:e45. doi:10.1371/journal.pbio.0040045.
33. Kashtan N, Alon U. Spontaneous evolution of modularity and network motifs. Proc Natl Acad Sci U S A. 2005;102:13773–13778.
34. Kashtan N, Noor E, Alon U. Varying environments can speed up evolution. Proc Natl Acad Sci U S A. 2007;104:13711–13716.
35. Tagkopoulos I, Liu YC, Tavazoie S. Predictive behavior within microbial genetic networks. Science. 2008;320:1313–1317.
36. Hofacker IL. Vienna RNA secondary structure server. Nucleic Acids Res. 2003;31:3429–3431.
37. Dichtel-Danjoy ML, Felix MA. Phenotypic neighborhood and micro-evolvability. Trends Genet. 2004;20:268–276.
38. Fontana W, Schuster P. Continuity in evolution: on the nature of transitions. Science. 1998;280:1451–1455.
39. Stadler BM, Stadler PF, Wagner GP, Fontana W. The topology of the possible: formal spaces underlying patterns of evolutionary change. J Theor Biol. 2001;213:241–274.
40. Reidys C, Stadler PF, Schuster P. Generic properties of combinatory maps: neutral networks of RNA secondary structures. Bull Math Biol. 1997;59:339–397.
41. Meyers LA, Ancel FD, Lachmann M. Evolution of genetic potential. PLoS Comput Biol. 2005;1:236–243. doi:10.1371/journal.pcbi.0010032.
42. Schuster P, Fontana W, Stadler PF, Hofacker IL. From sequences to shapes and back: a case study in RNA secondary structures. Proc Biol Sci. 1994;255:279–284.
43. Wuchty S, Fontana W, Hofacker IL, Schuster P. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers. 1999;49:145–165.
44. Wilke CO, Wang JL, Ofria C, Lenski RE, Adami C. Evolution of digital organisms at high mutation rates leads to survival of the flattest. Nature. 2001;412:331–333.
45. Griswold CK. Pleiotropic mutation, modularity and evolvability. Evol Dev. 2006;8:81–93.
46. Flatt T. The evolutionary genetics of canalization. Q Rev Biol. 2005;80:287–316.
47. Miller JG. Living systems: basic concepts. Behav Sci. 1965;10:193–237.
48. Adami C, Ofria C, Collier TC. Evolution of biological complexity. Proc Natl Acad Sci U S A. 2000;97:4463–4468.
49. Sumedha, Martin OC, Wagner A. New structural variation in evolutionary searches of RNA neutral networks. Biosystems. 2007;90:475–485.
50. Meyers LA, Bull JJ. Fighting change with change: adaptive variation in an uncertain world. Trends in Ecology & Evolution. 2002;17:551–557.
51. Parter M, Kashtan N, Alon U. Environmental variability and modularity of bacterial metabolic networks. BMC Evol Biol. 2007;7:169.
52. Kreimer A, Borenstein E, Gophna U, Ruppin E. The evolution of modularity in bacterial metabolic networks. Proc Natl Acad Sci U S A. 2008;105:6976–6981.
53. Kaern M, Elston TC, Blake WJ, Collins JJ. Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet. 2005;6:451–464.
54. Mitchell M. An Introduction to Genetic Algorithms. Cambridge (Massachusetts): MIT Press; 1996.
55. Goldberg D. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing Company; 1989.
56. Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, et al. Fast folding and comparison of RNA secondary structures. Monatsh Chem. 1994;125:167–188.
57. Newman MEJ. Fast algorithm for detecting community structure in networks. Phys Rev E. 2004;69:066133.
58. Jiang T, Lin G, Ma B, Zhang K. A general edit distance between RNA structures. J Comput Biol. 2002;9:371–388.

Articles from PLoS Computational Biology are provided here courtesy of Public Library of Science
