NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Institute of Medicine (US) Forum on Microbial Threats. The Science and Applications of Synthetic and Systems Biology: Workshop Summary. Washington (DC): National Academies Press (US); 2011.

The Science and Applications of Synthetic and Systems Biology: Workshop Summary.

A3 THE GENOME AS THE UNIT OF ENGINEERING

and 27.

Introduction

Recent years have seen a dramatic increase in our capabilities to sequence and synthesize nucleic acids. Ten years ago the first human genome was sequenced, and at the time it was a monumental undertaking requiring billions of dollars in funding and legions of dedicated researchers. Today, the same genome sequence would cost a fraction of the price, and soon personal genome sequencing will be commonplace. Although our understanding of genomes has not yet caught up with our ability to generate data, there is no question that the technology revolution has transformed and will continue to drive the biological sciences, in particular the new integrative science known as systems biology.

The same revolution is occurring in DNA synthesis, as prices for DNA drop and technologies are developed for large-scale (and ever longer) syntheses. As with sequencing, the technology revolution will lead understanding, and it can be argued that, although we can synthesize a microbial genome (Gibson et al., 2010), we don’t necessarily understand how to design one. That’s fine; we can learn by doing, and as synthetic genomes become easier to synthesize, we will undoubtedly generate customized genomes that will test the simple hypothesis of whether they function as intended.

Together, these emerging issues are the essence of what can be seen as the holy grail of both systems and synthetic biology, the genome as the unit of engineering. Synthetic biology can, in many circumstances, be defined as the abstraction of biology to the point where it can be easily engineered. This is first and foremost an operational definition, not an intellectual one. Indeed, there hides in the background of synthetic biology the usually unstated hypothesis that biology was made to be engineered by engineers (a hypothesis we take issue with below). Parallels are often drawn between electrical engineering and synthetic biology, where genetic information like promoters and proteins can be compared to components of a circuit board such as transistors or capacitors. In the view of the new breed of synthetic biologists, standardized components can almost always be used to rationally design genetic elements to perform desired tasks. This view has much merit, since it has already yielded interesting products such as biofuels, pharmaceuticals, and biomaterials. For example, Bayer and Voigt coaxed yeast into making valuable methyl halides from biomass (Bayer et al., 2009), while the Keasling lab reengineered the common lab microbe, Escherichia coli, to produce large quantities of amorphadiene, a precursor to antimalarial and anticancer drugs (Martin et al., 2003). It is not unreasonable to suspect that synthetic circuits will be introduced in human hosts at some point; for example, Fussenegger has created a synthetic circuit composed of a bacterial uric acid sensor and a fungal urate oxidase (which converts uric acid into a more tolerable compound) that can be used to control uric acid levels in mammalian hosts and thus ameliorate chronic disease states such as gout (Kemmer et al., 2010).

Nonetheless, many of these systems must still be optimized before they function as intended, in large part because the constituent parts are not truly modular and their function is not fully predictable in new contexts. The question thus becomes whether biology is really meant to be engineered the same way a circuit board is, whether engineers will learn to make biology a circuit board, or whether some composite view is more akin to reality. Or, stated another way, the question is whether genetic tinkering (which existed well before synthetic biology) has somehow entered a new, more robust phase or has just been relabeled.

Systems biology has taken a different approach to how to tinker with systems. This approach proceeds from an understanding of the system as a whole instead of as an amalgam of component parts. It is top down rather than bottom up. By integrating literally millions of points of data from genomes, transcriptomes, and proteomes across the phylogenetic spectra, systems biologists can draw remarkable conclusions, up to and including the identification of nonobvious and evolutionarily repurposed subsystems. It is this sort of understanding that allows us to realize that the same interactions that govern resistance to antifungals in yeast also govern blood vessel formation in higher organisms (McGary et al., 2010). While a comparable synthetic biology approach might be to repurpose a tractable signaling pathway for blood vessel formation, this approach would require extensive empirical testing. As technology continues to develop for high-throughput analysis of DNA, RNA, and proteins, systems biologists will have the benefit of several billion years of empirical testing and, thus, will hold the intellectual high ground for understanding how organisms truly work. Quantitative modeling of the connectedness of extant systems will, in the end, be more likely to allow us to build a functional genome from scratch than the untested engineering hypothesis that organisms should work like we want them to. In greater detail, this should inevitably lead to the following discussion.

Systems Biology Eats Synthetic Biology

The fiction of synthetic biology is that it is possible to engineer biological systems in modular, composable, scalable, and programmable ways using “parts” to build circuits and eventually systems (and ecosystems) (Bromley et al., 2008; Win et al., 2009). Both the field and the fiction have emerged largely not from scientific research but from the International Genetically Engineered Machines competition, iGEM, which is constrained to rely on “BioBricks” for the construction of student projects (Smolke, 2009). In fact, long before iGEM there were numerous biological engineers, and these biological engineers would, with different degrees of success, use genes (not then known as biological parts) to construct pathways (not then known as circuits) that had particular functions.

What has changed between the long period during which biological engineering has been maturing (indeed, one could argue that biological engineering in the form of selective breeding greatly predated man’s understanding of biology itself) and today to make biological engineering into synthetic biology, besides nomenclature? It could be argued that there are two important milestones. First, many researchers in other engineering disciplines were somehow shut out of biology because it lacked an engineering flair and was more dissective than synthetic. In this view, the influx of engineers into biology is assisted by recasting biology in terms more familiar to engineers, and this is why we often see biological circuits represented (incorrectly, as I will argue later) as electronic circuits (Khalil and Collins, 2010).

Second, there is not just a quantitative but rather a qualitative or foundational difference in being able to make large amounts of DNA. The ability to remake a whole genome (Gibson et al., 2010) is so great relative to the ability to make a mutation in that genome (or in an episome sharing a cell with that genome) that there must now be a new engineering discipline that approaches genome construction in a wholly different way than traditional biological engineering would have been capable of doing. I think there is some merit to this explanation, although it is a bit like saying that genomics is different than genetics. While we would never argue that we do not have much greater understanding of molecular detail today than we did in Mendel’s day, our fantastic increase in knowledge does not invalidate the views of Mendel; rather, this knowledge merely adds depth to the fundamental concept of inheritance. Genomics further explains genetics; it does not remake it (although individual concepts in genetics, such as the role of environment in inheritance, have certainly been remade radically). Nonetheless, it is true that techniques for the rational manipulation of whole genomes will prove to be of enormous value into the future. And while many of us who wish to reconstruct living systems will continue to have “Venter envy,” a new generation of techniques such as multiplex automated genome engineering (Wang et al., 2009) are beginning to emerge that will make such manipulations possible even for individual investigators.

In the absence of the synthetic biology “revolution” (redefinition), how might biological engineers who were attempting to manipulate larger and larger amalgams of DNA have managed? Where would they have turned for knowledge and inspiration? I contend that it would not have been to electrical engineering (the frequent muse to whom many who call themselves synthetic biologists appeal) but rather to systems biology. Systems biology attempts to understand the interrelationships between all of the molecular and cellular parts of a system in a quantitative way, and it has as one of its ultimate goals the modeling of biological organisms down to the molecular level. If we had the same knowledge of organisms that we do of the hardware and software accompanying a computer, the entire organism would not just be a “chassis” for the installation of a synthetic circuit; it would be the unit of engineering itself. Fortunately for both traditional biological engineers and synthetic biologists, our understanding of systems has increased dramatically in recent years, and thus there will come a point where systems biology will not only completely inform biological engineering but will also overtake the utility of orthogonal, add-on circuits that were largely meant to operate independently of the systems in which they were embedded. Systems biology will eat synthetic biology.

This change in perspective and approach will be absolutely essential as biological engineering moves forward. The utility of orthogonal synthetic circuits is limited for a variety of reasons. First, by failing to take into account the unity of the system, the predictability of circuits must be limited. This can best be seen by thinking about a very simple synthetic circuit that has been around for a long time: a plasmid that is used for protein overproduction. You can in fact embed a plasmid in a strain and induce protein production. From this vantage, the synthetic circuit has worked. However, it is difficult if not impossible to predict protein yield at the outset with any degree of certainty. Different proteins will express to different extents, will form aggregates to different extents, and will hamper the growth of the organism to different extents, to mention just a few variables that ultimately affect yield. This variability has nothing to do with whether a given “part” has been well characterized or not, and everything to do with the interaction of the part with the system as a whole. Protein overexpression interfaces with the cell’s machinery for transcription, translation, and degradation in myriad ways, and in the absence of a complete understanding of how the circuit interfaces with the system it will be difficult to ultimately predict how the circuit will work. Outcomes will range from making huge amounts of a protein to killing the cell outright. Of course the circuit can always be tinkered with to make more protein, but that was true before it was called a circuit and was merely a plasmid. What has synthetic biology brought to the modular, composable, scalable, and programmable overexpression of proteins that was not previously known? Nothing, because the tinkering that is now possible following the invention of the discipline is the same tinkering that was possible prior to its invention. 
Collins has pointed out that tinkering is really all that is necessary for engineering to prosper in biology, and he is of course correct. However, understanding and progress in this area will come from an increasingly detailed understanding of systems biology and integrated models of metabolism.

In discussions with Erik Winfree, a more charitable interpretation of the coexplosion of both systems and synthetic biology emerges, which is that these are different means to the same end—a predictable biology. The more complex system, the organism, is at present less predictable, and thus nominally orthogonal subsystems as built by synthetic biologists may allow us to initially develop better models. As those models become more sophisticated, they will of necessity take into account the discoveries and models developed through the study of systems biology.

A second, more important objection is that, by treating systems as though they were mere amalgams of components, synthetic biology ignores what I would argue is the central tenet of biology: organisms evolve. Given that an orthogonal circuit is really just an abstraction, as such circuits draw upon the metabolic, transcription, translation, and other resources of a cell, the circuit must of necessity change the fitness of a cell. To the extent that any synthetic circuit is used in the context of an evolutionary machine—a replicating cell—the possibility exists that mutations will arise that change both the cell and its added circuitry, for better or worse. While one alternative is just to not have cells replicate and to let circuits execute in preset bags of enzymes, this denies one of the great possibilities inherent in biology: the ability to generate larger amounts of complex matter from simpler substrates. An understanding of how to engineer biological systems will therefore proceed in large measure from a better understanding of the costs and benefits of circuits to the system as a whole, again a province of systems (and evolutionary) biology.

Remaking Organismal Operating Systems

It is with this somewhat different perspective that we can move forward from thinking about how to make orthogonal circuits to thinking about how to manipulate the operating systems of organisms. Ultimately, synthetic biology is not modular, composable, scalable, or programmable because the operating system of biology does not support these features. The operating system of biology is a kludge, the result of billions of years of happenstance and compromise, and it cannot at this juncture be remade by the simple expedient of saying it’s not so.

But this does not mean that the operating system cannot be remade.

In looking at the operating system of biology, there is really only one component that is remotely akin to the synthetic biology dream of being modular, composable, scalable, and programmable, and that is DNA (or, more generally, nucleic acids). The simplicity of Watson-Crick base pairing and its implementation in a regular biopolymer meets all of these requirements. Unfortunately, this simplicity is destroyed by translation, which turns a regular biopolymer into many irregular ones. Obviously, from the point of view of organismal fitness, this is a good thing.

Thus, the question becomes, how can we remake the operating system of an organism so that the features of DNA are the features of not just one component but of all the components? My answer to this question is a limited one, but I think it suggests new directions. It seemed to me that, since it is possible to have the base-pairing properties of DNA with different backbones, such as those seen in RNA, locked nucleic acids (LNAs), and especially peptide nucleic acids (PNAs), it should be possible to extend base pairing to other components in biological operating systems. In particular, it should be possible to allow proteins to base pair in much the same way that PNAs would.

To this end, my lab has embarked on a journey to expand the genetic code to include four monomers that have nucleobase amino acids with side chains that have pairing properties equivalent to G, A, T, and C. Given the exploits and insights of the Schultz lab and others, this is an engineering feat that should be possible. The punctuation of the genetic code, stop codons, can be reprogrammed to code for amino acids (although ultimately with a systems-level cost), and four base codons (and now even a four base codon–reading ribosome) allow a (still uncertain) expansion of the code and the production of genetically augmented proteins.
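The idea of reprogramming a stop codon can be illustrated with a toy translator. This is only a sketch under loud assumptions: the codon table is a tiny subset of the standard genetic code, the residue label "nbG" (a nucleobase amino acid pairing like G) is a hypothetical placeholder, and the function names are invented for illustration. It says nothing about the tRNA and synthetase engineering that such a reassignment would actually require.

```python
# Toy illustration of stop-codon reassignment.
# Assumptions: a deliberately tiny codon table, and "nbG" as a
# hypothetical label for a nucleobase amino acid.

STANDARD_SUBSET = {
    "AUG": "M",   # methionine (start)
    "UUU": "F",   # phenylalanine
    "GGU": "G",   # glycine
    "AAA": "K",   # lysine
    "UAA": None,  # stop
    "UAG": None,  # stop (the codon reassigned below)
    "UGA": None,  # stop
}


def translate(mrna, table):
    """Translate an mRNA codon by codon until a stop codon is reached."""
    residues = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        residue = table[mrna[i:i + 3]]  # KeyError outside the toy table
        if residue is None:
            break
        residues.append(residue)
    return residues


def reassign(table, codon, residue):
    """Return a copy of the table in which one codon codes for a residue."""
    expanded = dict(table)
    expanded[codon] = residue
    return expanded


message = "AUGUUUUAGGGUAAAUAA"
wild_type = translate(message, STANDARD_SUBSET)
augmented = translate(message, reassign(STANDARD_SUBSET, "UAG", "nbG"))
```

With the standard table, translation of `message` halts at UAG; with the reassigned table, the same message reads through to the next stop codon and the placeholder residue is inserted at that position. The systems-level cost mentioned above (every native UAG in the genome is also read through) is exactly what this toy model leaves out.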

Imposing Rationality on Biology

Whereas computers were built from the bottom up, with all of the rationality that engineers could bring to bear on both the hardware and software, biology and its operating systems were presented to us as a fait accompli, and thus have to be engineered from the top down, in a way that can best be described as irrational, despite the fiction of synthetic biology. Biological systems evolved as kludges, and are engineered as kludges. While the underpinnings of synthetic biology seem to be to ignore these kludges, the hope of systems biology is to understand these kludges well enough that engineering short circuits and patches seem seamless.

As a small step, I envision that the addition of nucleobase amino acids to the genetic codes will allow us to begin to engineer biology in a truly rational manner. In my vision, there will be several substantive benefits to this fundamental remaking of a biological operating system. First, engineering protein–protein interactions will become more predictable, as it may be possible to uniquely code for such interactions via strings of nucleobase amino acids. Second, the same benefit will obviously accrue for protein–nucleic acid interactions. Transcription factors and RNA-binding proteins will no longer have semiprogrammable interaction domains, such as zinc fingers, but will instead be fully programmable. Third, it should be possible to rationally design cellular interactions and architectures via membrane proteins that again are tailed with nucleobase amino acids. And finally, this fundamental change in the operating system of a biological system should greatly speed the course of evolution. This conclusion comes from thinking about how transcription factors evolve to acquire new specificities. In general, there are multiple amino acid changes that must occur in order to bind a new DNA sequence, and these changes are not one-to-one, since the “code” for protein–nucleic acid interactions is three-dimensional rather than linear. By switching out the three-dimensional code for a nominally linear one, the number of sequence changes that must be acquired to bind adjacent to a promoter (or to form a new protein–protein interaction, or to make a connection to a new cell) will be proportionately reduced, and thus the acceptance of these changes (evolution) can potentially occur much more quickly.
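The last point can be made concrete with a small counting sketch. It assumes an idealized linear code in which each nucleobase residue recognizes exactly one target base by Watson-Crick pairing; that one-to-one assumption, the function names, and the example sequences are all illustrative and not drawn from any experiment. Strand orientation is ignored for simplicity.

```python
# Under an idealized linear recognition code, the number of residue
# changes needed to retarget a binder equals the number of positions
# at which the old and new target sites differ (a Hamming distance).

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def recognition_string(target):
    """Residues that would pair with a target site, position by position."""
    return "".join(PAIR[base] for base in target)


def changes_needed(old_target, new_target):
    """Count residue substitutions required to switch targets."""
    if len(old_target) != len(new_target):
        raise ValueError("targets must have the same length")
    old = recognition_string(old_target)
    new = recognition_string(new_target)
    return sum(a != b for a, b in zip(old, new))


one_change = changes_needed("GATTACA", "GACTACA")
two_changes = changes_needed("GATTACA", "CATTAGA")
```

In this picture each changed base in the target calls for exactly one changed residue in the binder. No comparably simple count exists for a three-dimensional recognition code, where a single base change in the target may require several coordinated amino acid substitutions.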

For the longer term, I think there is the possibility that the operating systems of organisms can be changed in an even more fundamental manner. Merely adding nucleobase amino acids to the repertoire of the genetic code simplifies interactions from three-dimensional and irrational to linear and rational, but it does nothing about the way in which the associated machinery actually performs computations. Signal-transduction pathways will still pass information from the outside world to the inside of cells via a series of just-so scaffolds and complex conformational changes that will remain subject to relatively irrational engineering (and, given how I have tried to define rationality at a global level, this statement takes nothing away from the extraordinary successes of Lim and others in engineering signal transduction) (Bashor et al., 2008).

Where, then, can we find a more rational operating system for biology? In parallel with the “revolution” of synthetic biology, a true revolution in nucleic acid programming has been brewing at Caltech, led by Erik Winfree, Niles Pierce, Peng Yin, and others (Seelig et al., 2006; Yin et al., 2008; Zhang et al., 2007). These pioneers have taken the seminal work by Adleman (1994), showing that nucleic acids can be used for computation, and further transformed it, showing that nucleic acids can be computers. To my mind, the distinction is subtle but important. An algorithm can be embedded in nucleic acids, as Adleman showed, but nucleic acids may not be the best platform for executing that algorithm (electronics beats biology, at least in terms of speed, on most computations). When instead we attempt to identify how nucleic acids can best compute—how their intrinsic properties can be exploited to do computation—then we begin to make nucleic acid computers of the sort that have emerged from the Caltech researchers. These computers rely on a very simple reaction, strand exchange, in which one nucleic acid binds to and forms a duplex with another. Duplex formation can in turn lead to the displacement of a previous duplex or the alteration of a programmed secondary structure. The displacement can be transient, with new binding sites (so-called toeholds) being revealed and paving the way to the formation of additional and/or different duplexes. Overall, extraordinarily complex executable circuits can be built in a fully rational manner, and these circuits have now been adapted to a variety of interesting computations, from amplification to determining square roots. 
The comparison with synthetic biology is instructive: while Weiss has pointed out that it has proven difficult to make synthetic circuits with more than six or so layers (Purnick and Weiss, 2009), Winfree has now readily generated a DNA circuit with this many layers and the obvious capability to scale much further (Qian and Winfree, 2011).
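The strand exchange reaction described above can be sketched as a simple all-or-nothing model. This is a minimal illustration under stated assumptions, not the kinetic or thermodynamic models used by the Caltech groups: the substrate is written 5' to 3' as a toehold domain followed by a branch domain, pairing is perfect or absent, and the function names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the Watson-Crick partner of a strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def strand_displacement(substrate, toehold_len, incumbent, invader):
    """Idealized toehold-mediated strand displacement.

    substrate : strand written 5'->3' as toehold domain + branch domain
    incumbent : strand initially paired with the branch domain only
    invader   : candidate input strand
    Returns the strand paired with the substrate afterwards and the
    strand released into solution (None if nothing is displaced).
    """
    if toehold_len < 1 or toehold_len >= len(substrate):
        raise ValueError("toehold must be a proper, non-empty prefix")
    toehold, branch = substrate[:toehold_len], substrate[toehold_len:]
    if incumbent != reverse_complement(branch):
        raise ValueError("incumbent must pair with the branch domain")

    # Step 1: the invader docks on the exposed single-stranded toehold.
    docks = invader.endswith(reverse_complement(toehold))
    # Step 2: branch migration completes only if the rest of the invader
    # pairs with the branch domain, replacing the incumbent.
    rest = invader[:len(invader) - toehold_len]
    migrates = rest == reverse_complement(branch)

    if docks and migrates:
        return {"paired": invader, "released": incumbent}
    return {"paired": incumbent, "released": None}


substrate = "TCTCCA" + "GATTACAGGCATTC"  # toehold + branch (hypothetical)
incumbent = reverse_complement(substrate[6:])
invader = reverse_complement(substrate)

with_toehold = strand_displacement(substrate, 6, incumbent, invader)
without_toehold = strand_displacement(substrate, 6, incumbent, incumbent)
```

An invader that carries the toehold complement releases the incumbent, whereas a strand lacking it leaves the gate unchanged. In a cascade, the released strand would serve as the invader for a downstream gate, which is how layers of such reactions can be composed.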

I therefore fantasize about a biological system that is based on a rational biological computer, akin to the strand exchange circuits. As tempting as it is to suggest we should tear down biology and start afresh, this is well beyond my capabilities, and thus even my fantasies must be instantiated incrementally. It is possible to envision how strand-exchange RNA circuits could be embedded in organisms. The outcome of such circuitry could be the production of particular miRNAs that would feed back on the system, in much the same way that Benenson and Weiss have nicely shown that logical miRNAs can be embedded in the current biological operating system (Rinaudo et al., 2007). Coupled with a fully realized suite of genetically augmented proteins containing nucleobase amino acids, the possibilities for superimposing a rational operating system on the current irrational one are manifest (albeit still incremental).

In moving from current biological operating systems to a fanciful new operating system built on genetically augmented proteins and strand exchange reactions, I have always been sensitive to the nature of the operating system that we already have (even while attempting to remake it). This “feeling for the organism” (to steal a phrase from McClintock) is not stealth vitalism but rather a sense of how engineering must proceed from the materials at hand. In this regard, while engineers can certainly become biologists and vice versa, each brings their preconceptions to the table in a manner that should be explicitly recognized. From the point of view of a biologist, biological circuits are utterly unlike electronic circuits, irrespective of the analogies that can be drawn (Simpson et al., 2009). Electronic circuits operate at light speed, with spatially defined interconnectivities. Biological circuits operate via chemical diffusion, with molecular recognition and reactivity defining interconnectivity. While software is independent of hardware, and while both types of circuits can make, say, an oscillator, that’s no reason to suspect that both types of circuits will be able to operate in the same time regimes or with the same fidelity, since these features will stem in part from capabilities of the materials from which they are constructed. Thus, it may not be unreasonable to suggest that in engineering biology we must build from the capabilities of the materials involved, rather than trying to superimpose a foreign mindset—the current and likely transient Zeitgeist of synthetic biology—on those materials.

References

  • Adleman LM. Molecular computation of solutions to combinatorial problems. Science. 1994;266:1021–1024. [PubMed: 7973651]
  • Bashor CJ, Helman NC, Yan S, Lim WA. Using engineered scaffold interactions to reshape MAP kinase pathway signaling dynamics. Science. 2008;319:1539–1543. [PubMed: 18339942]
  • Bayer TS, Widmaier DM, Temme K, Mirsky EA, Santi DV, Voigt CA. Synthesis of methyl halides from biomass using engineered microbes. Journal of the American Chemical Society. 2009;131:6508–6515. [PubMed: 19378995]
  • Bromley EH, Channon K, Moutevelis E, Woolfson DN. Peptide and protein building blocks for synthetic biology: From programming biomolecules to self-organized biomolecular systems. ACS Chemical Biology. 2008;3:38–50. [PubMed: 18205291]
  • Gibson DG, Glass JI, Lartigue C, Noskov VN, Chuang RY, Algire MA, Benders GA, Montague MG, Ma L, Moodie MM, Merryman C, Vashee S, Krishnakumar R, Assad-Garcia N, Andrews-Pfannkoch C, Denisova EA, Young L, Qi ZQ, Segall-Shapiro TH, Calvey CH, Parmar PP, Hutchison CA 3rd, Smith HO, Venter JC. Creation of a bacterial cell controlled by a chemically synthesized genome. Science. 2010;329:52–56. [PubMed: 20488990]
  • Kemmer C, Gitzinger M, Daoud-El Baba M, Djonov V, Stelling J, Fussenegger M. Self-sufficient control of urate homeostasis in mice by a synthetic circuit. Nature Biotechnology. 2010;28:355–360. [PubMed: 20351688]
  • Khalil AS, Collins JJ. Synthetic biology: Applications come of age. Nature Reviews Genetics. 2010;11:367–379. [PMC free article: PMC2896386] [PubMed: 20395970]
  • Martin VJ, Pitera DJ, Withers ST, Newman JD, Keasling JD. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nature Biotechnology. 2003;21:796–802. [PubMed: 12778056]
  • McGary KL, Park TJ, Woods JO, Cha HJ, Wallingford JB, Marcotte EM. Systematic discovery of nonobvious human disease models through orthologous phenotypes. Proceedings of the National Academy of Sciences of the United States of America. 2010;107:6544–6549. [PMC free article: PMC2851946] [PubMed: 20308572]
  • Purnick PE, Weiss R. The second wave of synthetic biology: From modules to systems. Nature Reviews Molecular Cell Biology. 2009;10:410–422. [PubMed: 19461664]
  • Qian L, Winfree E. Scaling up digital circuit computation with DNA strand displacement cascades. Science. 2011;332:1196–1201. [PubMed: 21636773]
  • Rinaudo K, Bleris L, Maddamsetti R, Subramanian S, Weiss R, Benenson Y. A universal RNAi-based logic evaluator that operates in mammalian cells. Nature Biotechnology. 2007;25:795–801. [PubMed: 17515909]
  • Seelig G, Soloveichik D, Zhang DY, Winfree E. Enzyme-free nucleic acid logic circuits. Science. 2006;314:1585–1588. [PubMed: 17158324]
  • Simpson ZB, Tsai TL, Nguyen N, Chen X, Ellington AD. Modelling amorphous computations with transcription networks. Journal of the Royal Society Interface. 2009;6(Suppl 4):S523–S533. [PMC free article: PMC2843957] [PubMed: 19474083]
  • Smolke CD. Building outside of the box: iGEM and the BioBricks Foundation. Nature Biotechnology. 2009;27:1099–1102. [PubMed: 20010584]
  • Wang HH, Isaacs FJ, Carr PA, Sun ZZ, Xu G, Forest CR, Church GM. Programming cells by multiplex genome engineering and accelerated evolution. Nature. 2009;460:894–898. [PMC free article: PMC4590770] [PubMed: 19633652]
  • Win MN, Liang JC, Smolke CD. Frameworks for programming biological function through RNA parts and devices. Chemistry & Biology. 2009;16:298–310. [PMC free article: PMC2713350] [PubMed: 19318211]
  • Yin P, Choi HM, Calvert CR, Pierce NA. Programming biomolecular self-assembly pathways. Nature. 2008;451:318–322. [PubMed: 18202654]
  • Zhang DY, Turberfield AJ, Yurke B, Winfree E. Engineering entropy-driven reactions and networks catalyzed by DNA. Science. 2007;318:1121–1125. [PubMed: 18006742]
27 Center for Systems and Synthetic Biology, University of Texas at Austin.

Copyright © 2011, National Academy of Sciences.
Bookshelf ID: NBK84454
