The dawn of evolutionary genome engineering

Nature Reviews Genetics | AOP, published online 28 May 2014; doi:10.1038/nrg3746
© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES

OPINION

Csaba Pál, Balázs Papp and György Pósfai

Abstract | Genome engineering strategies (such as genome editing, reduction and shuffling, and de novo genome synthesis) enable the modification of specific genomic locations in a directed and combinatorial manner. These approaches offer an unprecedented opportunity to study central evolutionary issues in which natural genetic variation is limited or biased, which sheds light on the evolutionary forces driving complex and extremely slowly evolving traits; the selective constraints on genome architecture; and the reconstruction of ancestral states of cellular structures and networks.

Laboratory evolution experiments coupled with whole-genome sequence analyses offer extremely powerful tools for the investigation of evolution in real time1. Microbial populations are particularly amenable to such investigations. As the generation times of many bacterial species range from an hour to ten hours2, phenotypic and molecular changes during the course of laboratory evolution can be monitored over thousands of generations. Studies in microbial populations have offered insights into key conceptual problems, such as the extent of convergent evolution, the origin of key innovations and the mechanisms that influence evolvability3. Through precise control of population size and selection pressure, experimental evolution enables the proper testing of theories of evolution. This approach is immensely successful, not least because the molecular mechanisms that underpin evolution of laboratory and natural populations are related to each other. However, there are several important issues that cannot be readily addressed by microbial experimental evolution.
The major limitations of microbial experimental evolution are largely due to the shortage of natural variation in the laboratory, the limited timescale of such experiments and the lack of appropriate control of mutational processes. Laboratory evolution experiments typically last around 200–1,000 generations, which results in the accumulation of 4–20 independent mutations per population4. An Escherichia coli strain adapted to glucose minimal medium in the laboratory acquired only 45 mutations over 20,000 generations of evolution5. As a consequence, several molecular innovations lack the intra-population genetic variation on which selection could act. For example, the capacity of E. coli to exploit citrate as a carbon source took 33,000 generations of laboratory evolution (that is, more than 14 years)6. Clearly, researchers focused on the evolution of a specific molecular pathway cannot wait years for the fortuitous occurrence of such extremely rare events in the laboratory.

Moreover, given the limited timescale of microbial laboratory evolution experiments, it is difficult to compare the results to macroevolutionary trends of genome evolution. For example, large-scale reduction of bacterial genomes occurs readily in nature but extremely slowly in experimental evolution settings7. Therefore, the driving evolutionary forces and the consequences of massive genomic rearrangements on cellular viability and adaptation to novel conditions remain a terra incognita.

Most microbial experimental evolution studies focus on complex traits, such as adaptation to nutrient limitation and heat stress3, in which mutations in hundreds of genes across the genome contribute to fitness. In these cases, it is difficult to disentangle beneficial mutations from neutral ones.
The genetic basis of adaptation can only be deciphered in a tedious manner by individual and combined insertions of the observed mutations into the ancestral genome.

However, the goal is frequently to study the evolution of a particular cellular subsystem. As long as a single gene is concerned, the standard ‘toolbox’ of directed protein evolution provides an adequate solution8. Indeed, conceptual and technical advancements in this research field have led to a better understanding of how proteins evolve in nature8,9. However, when larger genetic circuits, enzymatic pathways or complex subcellular structures are concerned, the generation of mutant libraries of sufficient size for laboratory evolution remains a cumbersome exercise.

Genome engineering, the targeted sequence modification of at least two distinct genomic regions (reviewed in REFS 10–12), provides a complementary approach to study some of the notoriously difficult evolutionary problems for two reasons: first, it allows the generation of large mutant libraries across many predefined loci; second, it enables the construction of genomic alterations that do not occur spontaneously in the laboratory (FIG. 1). Genome engineering can also facilitate the generation of modifications that have never been explored in nature. Genome engineering tools, including oligonucleotide-mediated recombineering (that is, recombination engineering), engineered nucleases and de novo genome synthesis, enable the rapid editing of multiple genomic segments13–15, the reduction of microbial genomes16, combinatorial shuffling of small DNA segments or complete genomes17,18, and chemical synthesis and integration of large DNA segments or even complete genomes into a host organism19 (TABLE 1).

In this Opinion article, we highlight the potential of genome engineering for the study of evolution and focus on microbial systems.
We present key studies with evolutionary implications, and discuss the prospects and problems of the emerging field of evolutionary genome engineering. Methodological details of genome engineering10–12, details of microbial experimental evolution3 or the genomic mechanisms that drive natural evolution20 are not discussed, as these topics have been reviewed in detail elsewhere.

Figure 1 | Genome editing approaches for altering the genetic code on a genome-wide scale in E. coli. a | Multiplex automated genome engineering (MAGE) allows production of multiple targeted, small mutations through oligonucleotide-mediated allelic replacement in an iterative manner, which results in a large number of allelic combinations. Coloured boxes indicate engineered mutations. Conjugative assembly genome engineering (CAGE) allows step-wise transfer of individually engineered, marked genomic modules into a single genome. Full transfer of a genomic segment is controlled by inserting oriT conjugational start sites and selection markers at appropriate positions (not shown). b | MAGE and CAGE can be used to construct a recoded Escherichia coli genome with an expanded genetic code. All occurrences of the UAG codon have been removed, and translation termination at UAG has been eliminated by deleting prfA, which is the open reading frame (ORF) encoding release factor 1 (RF1). The now blank UAG codon has been reintroduced, along with an orthogonal set of aminoacyl-tRNA synthetase and tRNA, to encode a non-standard amino acid. Part b from Lajoie, M. J. et al. Genomically recoded organisms expand biological functions. Science 342, 357–360 (2013). Reprinted with permission from AAAS.

Evolution of the genetic code

The evolution of the genetic code is particularly difficult to study, as most mutations that alter the genetic code have fatal consequences. Indeed, natural alterations to the standard genetic code rarely occur across the phylogenetic tree21, probably because such modifications would require prior reorganization of codon usage across the whole genome. Nevertheless, it has become clear that minor deviations from the universal genetic code occur sporadically in bacteria and archaea, as well as in both nuclear and mitochondrial genomes in eukaryotes21. Moreover, beyond the 20 canonical amino acids, at least 2 additional genetically encoded amino acids contribute to the proteome of several extant species22: selenocysteine (which is found in bacteria, archaea and eukaryotes as a component of selenoproteins) and pyrrolysine (which is mostly found in methanogenic archaea)22. These observations show that even the most fundamental features of the genetic system are evolvable.

What are the evolutionary forces that drive the secondary evolution of the genetic code, and what constraints limit its reassignment in nature? Until recently, these issues were studied using either theoretical or comparative methods21. Now, reconstruction of altered genetic codes with reassigned codons on a genome-wide scale has emerged as a strategy to investigate these questions.

Codon reassignment.
Recent genome engineering efforts have shown the feasibility of recoding the genetic information of an organism while simultaneously expanding the coding capacity of its genetic code. In a pioneering series of studies14,15, an in vivo genome editing approach (FIG. 1) was used to remove all 321 TAG trinucleotides (which encode the amber stop codon) from E. coli DNA (by converting them to TAA, which encodes the ochre stop codon) as well as the release factor that recognizes them. For this purpose, the researchers identified all 314 E. coli genes that contain TAG codons. Reassigning hundreds of codons requires highly efficient methods to simultaneously modify multiple genomic locations. Multiplex automated genome engineering (MAGE) is well suited to rapidly edit multiple sites by oligonucleotide-mediated allelic replacement and was used to introduce subsets of TAG-to-TAA codon changes into 32 independent strains15. Next, partially recoded strains that contained distinct sets of codon modifications were merged into a single strain using a technique called hierarchical conjugative assembly genome engineering (CAGE)14,15 (FIG. 1a). The elementary step of CAGE involves the transfer of a targeted genomic region of one strain into a second strain through conjugation. Iterative assembly of pairs of partially recoded strains in a hierarchical manner resulted in a single fully recoded genome. Finally, this procedure yielded a blank codon, which was then reassigned to encode a non-standard amino acid. An extension of this approach showed that numerous sense codons might also be amenable to removal23. However, conversion of certain codons to their synonymous counterparts was constrained in various ways23. Failed replacements were likely to be caused by the disruption of endogenous regulatory mechanisms, by codon bias that affects gene expression or by the perturbation of expression as a result of separating overlapping genes.
Nevertheless, all occurrences of 13 codons in a panel of essential genes could eventually be changed, which indicates that genome-wide recoding is feasible23.

Table 1 | The role of microbial genome engineering in evolutionary research

Engineering strategy: Genome editing
  Goals:
    • Elucidate genotype–phenotype correlation
    • Improve the metabolic efficiency and/or robustness of industrial producer strains
  Tools:
    • Oligonucleotide-mediated recombineering57
    • Engineered nucleases (ZFNs58, TALENs59 and RNA-guided nucleases60 based on the CRISPR–Cas system)
    • Mobile group II introns and Cre–loxP-mediated recombination61
  Recent advances and major achievements:
    • Development of highly efficient methods that allow multiple, parallel and combinatorial alterations at specific loci (MAGE13 and TRMR42)
    • Improved biomolecule production13
    • Altered use of the genetic code15
  Evolutionary implications:
    • Systematic exploration of adaptive landscapes
    • Evolution of the genetic code
    • Evolutionary optimization of metabolic pathways and complex subcellular structures

Engineering strategy: Genome reduction
  Goals:
    • Identify minimal gene sets
    • Create a simplified and programmable cell
  Tools:
    • λ-Red-mediated recombineering
    • Suicide plasmid recombination55
    • Bacteriophage P1 transduction
    • Meganucleases55
  Recent advances and major achievements:
    • Targeted streamlining based on gene essentiality and comparative genomics31
    • Organisms with reduced or core genomes16
    • Genetically stabilized cells56
    • Improved production hosts16
  Evolutionary implications:
    • Evolution towards minimal genomes
    • Role of mobile genetic elements and prophages in evolution

Engineering strategy: De novo genome synthesis
  Goals:
    • Create characterized libraries of regulatory elements for predictable performance
    • Assemble modular pathways
    • Synthesize microorganisms with flexible genomes
  Tools:
    • DNA synthesis using chemical methods
    • In vitro and in vivo DNA assembly methods
    • Genome transplantation19,62
    • Cre–loxP-mediated recombination37
  Recent advances and major achievements:
    • Refactoring of pathways and genomes63
    • Synthetic device libraries64 (such as promoters and regulatory ‘switches’)
    • Chemical synthesis of a full genome62
    • Synthesis of yeast chromosomes with rearrangeable architecture37
  Evolutionary implications:
    • Combinatorial evolution of transcriptional regulatory networks
    • Evolution of gene order

Engineering strategy: Genome merging or shuffling
  Goals:
    • Create and rapidly improve complex phenotypes by exploiting the diversity of variation across genomes
  Tools:
    • Targeted recombination65
    • CAGE14
    • Genome mass transfer
    • Protoplast fusion18
  Recent advances and major achievements:
    • Construction of a hybrid bacterial genome65
    • Whole-genome recoding15
    • Improved producer strain18
  Evolutionary implications:
    • Origin of evolutionary novelties
    • Evolution of symbiotic genomes
    • Role of large-scale gene transfer in evolution

CAGE, conjugative assembly genome engineering; CRISPR–Cas, clustered regularly interspaced short palindromic repeat–CRISPR-associated protein; MAGE, multiplex automated genome engineering; TALEN, transcription activator-like effector nuclease; TRMR, trackable multiplex recombineering; ZFN, zinc-finger nuclease.

These methodological developments are expected to provide new insights into both the recent history of genetic-code evolution and the origin of the canonical genetic code (FIG. 1b). It will become feasible to directly test evolutionary scenarios for codon reassignment by constructing intermediate code variants that are unseen in nature and that would be too slow to evolve in the laboratory. For example, the codon capture hypothesis proposes that mutation bias that affects the genomic guanine and cytosine content can drive the extinction of certain codons in a neutral process21. In a second stage, a vanished codon can reappear and be recognized by a tRNA charged with a different amino acid21. By engineering genomes with the aim of completely eliminating a particular codon, we are now in the position to empirically study which codons are amenable to removal in the first stage and to which amino acids they can be reassigned in the second stage of codon capture.
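The first stage described above, eliminating every occurrence of one codon, can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the authors' pipeline: the three short sequences are invented, and the helper names `codons`, `recode` and `codon_census` are hypothetical. Only in-frame codons are rewritten, so a TAG trinucleotide that straddles two codons is left untouched.

```python
from collections import Counter


def codons(orf):
    """Split an in-frame coding sequence into its codons."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def recode(orf, old="TAG", new="TAA"):
    """Replace every in-frame occurrence of codon `old` with codon `new`."""
    return "".join(new if codon == old else codon for codon in codons(orf))


def codon_census(orfs):
    """Count how often each codon is used across a set of ORFs."""
    counts = Counter()
    for orf in orfs:
        counts.update(codons(orf))
    return counts


# Invented toy ORFs (assumption): two end in the amber stop codon TAG,
# one ends in TGA. The first also contains an out-of-frame "TAG".
toy_orfs = ["ATGGCTAGCTAG", "ATGAAATAG", "ATGCCCTGA"]

before = codon_census(toy_orfs)
recoded = [recode(orf) for orf in toy_orfs]
after = codon_census(recoded)

print("TAG codons before recoding:", before["TAG"])
print("TAG codons after recoding:", after["TAG"])
```

Once the census reports zero uses of a codon, that codon is "blank" in the sense used in the text and could in principle be reassigned; which codons tolerate such removal in a real genome is exactly the empirical question raised above.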
Other experimental work suggests that codon ambiguity is an effective mechanism that could drive alterations in the genetic code24.

The amino acid repertoire. Not only the structure of codon reassignments but also the range of encoded amino acids demands an evolutionary explanation: what evolutionary forces have acted on the size and content of the canonical amino acid ‘alphabet’? The facts that numerous non-standard amino acids have been successfully engineered into natural proteins and that the protein biosynthetic machinery can be expanded to translate extra amino acids in vivo indicate a lack of fundamental barriers against a markedly expanded alphabet25. According to one hypothesis, the amino acid repertoire was selected for its biochemical diversity26. However, investigating the effect of engineered alphabets on single proteins gave controversial results: whereas an expanded code that contained a non-standard amino acid yielded superior antibodies in a directed evolution experiment27, it was also possible to create a functional enzyme using a markedly reduced alphabet with only nine amino acids28. The availability of genome engineering tools to construct alternative genetic codes paves the way to systematically probe the effect of both expanded and reduced alphabets on the evolvability of complete proteomes. A recent study used an E. coli host strain that was engineered to incorporate 3‑iodotyrosine at amber stop codons to augment the genetic code available for an evolving bacteriophage with a non-standard amino acid29. Expanding the alphabet increased the evolvability of the phage by enabling access to a new beneficial mutation, which was only possible owing to incorporation of 3‑iodotyrosine into a phage protein involved in host cell lysis.

Evolution of genome size

The realization of the vast differences in genome sizes across bacterial species promoted a growing interest in the concept of minimal genomes.
Organisms with a nearly minimal number of genes occur in nature and are often obligate host-associated bacteria30. For example, the endosymbiotic bacteria Buchnera spp. are relatives of E. coli. Since the split of the two lineages 200 million years ago, the free-living ancestor lost 75% of its genome, including mobile genetic elements30. Buchnera spp. now contain ~580 genes, which shows that such a small number of genes is sufficient to maintain cellular life under a constant intracellular environment provided by the host.

The extent of observed genome reduction in the laboratory is generally small. For example, in the bacterium Salmonella enterica, the rate of DNA loss during laboratory evolution was 0.05–2.50 bp per chromosome per generation7. Given such a low rate, genome engineering could be a more viable alternative to ‘replay’ the evolution of massive genome reduction in the laboratory.

The development of basic gene deletion methods enabled a broad range of genome reduction projects, which resulted in substantially smaller and increasingly stable, streamlined bacterial genomes31 (FIG. 2a). These studies showed that microorganisms are amenable to such large-scale gene deletions, and that a large proportion of the genome can be deleted without any major growth defects. However, the effects of genome reduction on the transcriptomic profile, metabolic capacity and stress tolerance across a range of relevant conditions are mostly unknown, and these should be the subject of further studies.

Microorganisms with streamlined genomes can also be used to elucidate some basic processes in evolution. For example, the E. coli laboratory strain MDS42 was designed for the elimination of most horizontally derived genomic islands, and deletions that amounted to 15.3% of the genome did not interfere with beneficial growth characteristics16. These results are consistent with prior arguments suggesting that horizontally derived genes only have important roles under special environmental settings32. Furthermore, as all transposable genetic elements were removed from E. coli MDS42 (REF. 16), this strain is especially well suited to study the role of these elements during evolution and the implications of their absence from several ancient endosymbionts. Interestingly, genetic adaptation to a toxic plasmid was delayed in E. coli MDS42 (REF. 33). The issue of evolvability was further investigated by reinserting a single, highly active insertion sequence element IS1 into the E. coli MDS42 genome that was devoid of all other mobile genetic elements34. Subsequent laboratory evolution experiments revealed that insertion sequence elements increase mutational supply and occasionally generate variants with especially large phenotypic effects, but these elements have a smaller impact on adaptive evolution than other mutation-promoting mechanisms34.

Glossary

Amino acid ‘alphabet’. The set of amino acids used to build genetically encoded proteins.

Antagonistic pleiotropy. Pleiotropy occurs when a single gene influences multiple phenotypic traits that are seemingly unrelated. In the case of antagonistic pleiotropy, expression of the pleiotropic gene has mixed, competing effects; some of these are beneficial but others are detrimental to the organism.

Codon ambiguity. An extreme form of mistranslation in which a codon can be translated as two different amino acids.

Combinatorial explosion. A fundamental problem in evolutionary optimization and computing. As the size of the investigated system and the number of corresponding parameters increase, the number of combinations that one has to examine grows exponentially, which requires an intolerable amount of time to examine them.

Convergent evolution. Evolution of similar phenotypes in different populations or species as a result of adaptation to similar environments or ecological niches.

Directed protein evolution. A protein engineering method to evolve proteins with desirable properties. It mimics and accelerates natural evolutionary processes by applying in vitro diversification–selection–amplification cycles.

Epistatic interactions. Interactions between two mutations whereby the phenotypic effect of one mutation depends on the presence of another mutation.

Genome editing. Modification of the genetic information encoded by the genome using in vivo, directed modification (such as replacement, removal or insertion of DNA bases) of a single locus or multiple loci. It uses synthetic oligonucleotides and a range of accessory tools, including engineered nucleases, and DNA repair and recombination enzymes.

Leading DNA strand. The strand of nascent DNA that is being ‘read’ by the DNA polymerase in the same direction as the replication fork proceeds. It is being synthesized continuously, as opposed to the lagging strand.

Minimal genomes. Genomes that carry only the minimal genetic information necessary for life in a given environmental condition. Reduction towards a minimal essential gene set can occur either naturally (for example, in symbionts) or by genetic engineering.

Multiplex automated genome engineering (MAGE). A highly efficient genome editing method that can generate a large and heterogeneous population of mutant bacterial genomes within days. Using oligonucleotide-mediated allelic replacement technology in a cyclic and automated manner, MAGE can simultaneously target and modify multiple genomic locations across a large population of cells.

Site-specific recombineering. A recombination engineering system that allows efficient manipulation of genomic DNA at predetermined locations. It does not require extensive sequence similarity and relies on site-specific recombinases that catalyse reciprocal recombination of DNA at short sequences.

Synthetic chromosome. An artificial chromosome synthesized from simple chemical building blocks. Owing to limitations in the length of DNA that is amenable to direct chemical synthesis, construction of synthetic chromosomes is a hierarchical process, in which synthetic oligonucleotides are assembled into larger DNA segments in a step-wise manner using in vitro and in vivo assembly methods.

Figure 2 | Examples of large-scale genome architecture restructuring. a | Systematic reduction of the Escherichia coli genome generates simplified cells. Non-essential genomic segments are individually deleted using λ-Red recombineering, tested for fitness and then transferred sequentially to the final acceptor strain by bacteriophage P1 transduction. A and B represent arbitrary genomic segments. PM and NM are positive and negative selection markers, respectively. Black dashed lines indicate the sites of ‘scarless’ deletions. b | Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) generates complex genotypes and a broad range of phenotypes by massive restructuring of the yeast genome. Transient induction of the Cre recombinase causes recombination between loxP sites inserted in the synthetic genome.

Figure 3 | Optimization of complex phenotypic traits by identifying relevant genes and by searching for optimal combinations of mutations within these genes. In the first step, genetic loci that contribute to the desired phenotype are identified by screening genome-wide mutant libraries (left panel), by systems-biology modelling of network perturbations43 (middle panel) or by laboratory evolution followed by whole-genome sequencing (right panel). Computational models in systems biology enable the performance of large cellular subsystems to be predicted with clear links to changes in the dosage of individual genes in the network. Laboratory evolution identifies adaptive mutations that accumulated during selection for the phenotype of interest. In the second step, combinatorial mutation libraries are constructed by targeted modifications (represented by coloured boxes) of the genes identified in the first step using multiplex recombineering methods, such as multiplex automated genome engineering (MAGE). Finally, the resulting libraries can be selected to identify superior combinations of mutations. E. coli, Escherichia coli.

Future studies should aim to develop automated genome reduction technologies, which would enable experiments to be carried out on larger scales and with higher precision and speed. With such techniques in hand, researchers would be able to test several crucial evolutionary issues.
For example, computational models indicate that several different minimal genomes have identical fitness in environmental settings35. Hence, differences in gene content between intracellular bacteria may partially reflect alternative solutions to reach the same goal rather than lineage-specific adaptations35. It would be exceptionally interesting to investigate this theory under controlled laboratory conditions, not least because it has important implications for the role of historical contingencies in genome evolution.

Evolution of genome architecture

It is increasingly evident that not only the genome content but also the large-scale structural organization of the genome is influenced by selection in bacteria, archaea and eukaryotes alike20,36. Several genomic features show nonrandom patterns and strong signs of conservation across species. For example, co‑expressed genes tend to be clustered in eukaryotic genomes and remain linked more than expected in related species on the basis of a neutral null model36. Similarly, selection for increased co‑regulation has shaped both the operonic organization of functionally associated genes and the ‘uberoperonic’ contiguity between related operons in bacteria and archaea20. However, the potential adaptive values of other aspects of genome organization have remained more elusive. For example, it remains unclear why essential genes are preferentially located on the leading DNA strand in bacteria20. It is also uncertain why eukaryotes show great diversity in the number, size and shape of their chromosomes, and why they differ in intron content. As intra-population natural variation in such genomic traits is either often scarce or associated with further genetic differences between individuals, it is generally difficult to directly investigate the fitness effect of genome rearrangements.
The development of powerful experimental tools to rearrange microbial genomes, such as site-specific recombineering and synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE), holds promise to shed new light on the evolution of genome architecture37.

Site-specific recombineering. A recent study used site-specific recombineering to alter the genome architecture of fission yeast without changing its coding sequence38. A panel of ten engineered strains, each containing a particular chromosomal inversion or translocation, was measured for meiotic viability and for mitotic fitness under various environmental conditions. Although altered chromosome structures had a deleterious effect during sexual reproduction, some strains showed a growth advantage during mitosis, which indicates the presence of antagonistic pleiotropy. This finding could potentially provide an explanation for the way that variation in chromosome structure is maintained in natural populations.

Synthetic chromosome rearrangement and modification by loxP-mediated evolution. The combinatorial mutagenesis method SCRaMbLE was developed in budding yeast and is especially well suited for large-scale restructuring of the eukaryotic genome37,39. This method relies on constructing a synthetic chromosome on the basis of the directed modification of the native sequence (FIG. 2b). Most notably, by inserting loxP sites after the stop codons of each non-essential gene and at major genomic landmarks, the synthetic chromosome includes a Cre recombinase-inducible evolution system, which allows the formation of many structurally distinct genomes on demand. In a pioneering study, the feasibility of this approach was shown by the construction of two synthetic chromosome arms in budding yeast and the generation of substantial genetic and phenotypic heterogeneity by inducing site-specific recombination in these strains37.
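The way that induced recombination between loxP sites generates many structurally distinct genomes can be mimicked in a toy simulation. The sketch below is only an illustration under simplifying assumptions: a loxP site is assumed at every boundary between segments, only deletions and inversions are modelled (translocations are left out), both event types are equally likely, and the gene names, the `scramble` and `is_viable` helpers and the choice of essential genes are all hypothetical.

```python
import random


def scramble(chromosome, n_events, rng):
    """Apply random deletions or inversions between pairs of loxP sites.

    `chromosome` is a list of (segment_name, orientation) tuples, with a
    loxP site assumed at every boundary between segments.
    """
    genome = list(chromosome)
    for _ in range(n_events):
        if len(genome) < 2:
            break
        # Two distinct loxP positions delimit the segments genome[i:j].
        i, j = sorted(rng.sample(range(len(genome) + 1), 2))
        if rng.random() < 0.5:
            # Deletion: everything between the two sites is lost.
            genome = genome[:i] + genome[j:]
        else:
            # Inversion: segment order reverses and orientations flip.
            flipped = [(name, -strand) for name, strand in reversed(genome[i:j])]
            genome = genome[:i] + flipped + genome[j:]
    return genome


def is_viable(genome, essential):
    """A genome is scored as viable if it retains every essential segment."""
    return essential <= {name for name, _ in genome}


rng = random.Random(0)
wild_type = [(f"gene{k}", +1) for k in range(8)]
essential = {"gene0", "gene5"}  # hypothetical choice

library = [scramble(wild_type, 3, rng) for _ in range(200)]
survivors = [g for g in library if is_viable(g, essential)]
distinct = {tuple(g) for g in survivors}

print(len(survivors), "viable genomes,", len(distinct), "distinct architectures")
```

Filtering on retained essential segments stands in for selection on viability; a real screen would measure fitness under the conditions of interest rather than apply a yes/no rule.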
In principle, alterations in chromosome number, structure, ploidy and content can all be achieved using SCRaMbLE, which makes it a promising tool for mapping the fitness landscape of eukaryotic genome architecture. Furthermore, the recent construction of a fully synthetic Saccharomyces cerevisiae chromosome III that is devoid of introns shows that introns on this chromosome do not contribute substantially to fitness39.

Future directions

Several factors will influence the success of future applications. The power of genome engineering is mostly exemplified by studies that concentrate on the design of relatively simple or small-scale genetic manipulations. Moreover, the scale of generated variants is generally modest. These limitations will soon be overcome with the rapid technical advancement of the field (TABLE 1). Why is this important for future evolution studies? Identifying the forces by which complex cellular features (such as linear metabolic pathways or multimeric protein complexes) emerge is one of the major problems of evolutionary cell biology40. Many such complex adaptations require the simultaneous acquisition of multiple, specific and rare mutations in a single lineage, all of which have little or no beneficial effects individually40. Thus, the time needed for the establishment of such adaptations is expected to be very long in both natural and laboratory settings. MAGE could potentially overcome this problem13, as it can generate >4.3 billion combinatorial genomic variants per day at selected loci, thus accelerating the laboratory evolution of such complex adaptations (FIG. 1a). A technical challenge associated with MAGE and related genome editing protocols is that their efficiency is greatly enhanced by the inactivation of the endogenous methyl-directed mismatch repair system, which in turn leads to a markedly increased genomic mutation rate and the consequent accumulation of undesired off-target mutations15.
For example, in one study, 355 fortuitous mutations were detected in addition to the 321 modifications that were actually being targeted15. Clearly, more precise genome editing approaches are required to achieve increased genome stability during engineering. Multiple strategies have been offered to ameliorate this problem, including the transient suppression of DNA repair during mismatch-carrying oligonucleotide integration41.

Box 1 | Experimental perturbation of genetic circuit architecture
One of the most striking discoveries of comparative genomics has perhaps been the high versatility of gene regulatory networks across related bacterial species47. Specifically, transcription factors are typically less conserved than their target genes and evolve independently of each other. Moreover, despite extensive changes in regulatory mechanisms, the logical output of the overall circuits frequently remains.

Why is a specific network structure preferred for a given cellular task when alternative circuits could potentially deliver identical outcomes? In a seminal paper, the researchers constructed 598 recombinations of promoters (including regulatory regions) with different regulatory genes in Escherichia coli49. Strikingly, 95% of the new links were neutral or even beneficial under certain stressful conditions. Another study suggested that, although the population-level behaviour of many alternative circuits is similar, they show large differences across individual cells. In the case of the Bacillus subtilis circuit that regulates differentiation into the competent state, natural evolution specifically selected the circuit with larger output noise50.

Engineering of protein networks can also reveal the relative importance of different mutational mechanisms during evolution. For example, evolution of signalling pathways may, in principle, proceed through multiple genetic mechanisms, including point mutations, duplications and recombination of protein modules. To estimate the potential importance of the last of these forces, the phenotypic diversity of a signalling response that results from domain recombination was analysed52. The investigators selected 11 proteins in the yeast mating pathway and constructed a library of 66 chimeric domain recombinants. Novel linkages between pre-existing domains had a major impact on signalling phenotype. At least under laboratory conditions, recombination of protein domains led to strains that mate more efficiently than wild-type strains.

Box 2 | Impact of evolution on the design and maintenance of synthetic circuits
Combining directed evolution and synthetic design
The construction of artificial genetic circuits generally requires ‘fine-tuning’ for proper function, as the biological parts are frequently ill-defined or incompatible with each other51. This is especially problematic in the case of engineered or hybrid promoters. As a consequence, the initial circuit constructs show unpredictable behaviour or undesired cell-to-cell variability, which demands optimization of many ill-defined parameters (for example, transcription factor binding affinities). This optimization is frequently achieved through a slow process of trial and error. A promising complementary approach is to subject the elements that constitute the genetic circuit to directed evolution53. To achieve this, the desired output of the circuit needs to be screened and selected in a high-throughput manner, and the accumulation of mutations needs to gradually improve circuit performance. If the number of elements in the circuit is fairly large, and/or if the researcher plans to consider combinations of selected mutations only, then standard protocols of genome editing are especially suitable for generating variation on which laboratory selection can act.

Reducing evolvability for maintenance of synthetic devices
Another practical problem is that artificial circuits frequently impose a fitness cost on the host organism. This burden is often due to energy costs related to synthesis of the elements that are unnecessary for host survival and to interference of the construct with native cellular processes54. As inactivating mutations are favourable for the host, spontaneous evolution rapidly breaks the function of synthetic constructs. Indeed, prior studies indicate that the reliability of simple synthetic devices lasts only for 100 microbial generations55. Multiple strategies have been suggested to resolve this problem54, including reduction of host mutation rate33,56, application of elements that are less prone to deactivation and increasing mutational robustness of the circuit (for example, through increasing copy number of key elements). These approaches may reduce the speed by which the constructs are inactivated, but they are unlikely to offer a global solution, as they do not completely eliminate the fitness advantage of inactivating mutations. We suspect that the most reliable solution will rely on functional coupling of the output of the synthetic construct with an essential cellular function54.

Further advances in the field could permit more efficient optimization of complex traits that are important for biotechnological applications. However, the problem of combinatorial explosion remains an important issue; finding a desired function by very rare mutational combinations requires not only the availability of large mutant libraries but also efficient search strategies to navigate the genotypic space, such as those borrowed from the field of directed evolution8.
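
As a rough illustration of this combinatorial explosion, the following minimal Python sketch counts how many genotypes exist when a set of targeted loci can each carry one of several alleles, and how many of those genotypes differ from the wild type at exactly k loci. It is not taken from any of the cited studies: the function names and the example numbers of loci and alleles are hypothetical, and loci are assumed to vary independently.

```python
from math import comb


def total_genotypes(n_loci: int, alleles_per_locus: int = 2) -> int:
    """Number of distinct genotypes when each locus independently
    carries one of `alleles_per_locus` alleles (illustrative assumption)."""
    return alleles_per_locus ** n_loci


def genotypes_with_k_mutations(n_loci: int, k: int, alleles_per_locus: int = 2) -> int:
    """Number of genotypes that differ from the wild type at exactly k loci."""
    # Choose which k loci are mutated, then choose one of the
    # (alleles_per_locus - 1) non-wild-type alleles at each of them.
    return comb(n_loci, k) * (alleles_per_locus - 1) ** k


if __name__ == "__main__":
    # Hypothetical library sizes for 10, 20 and 30 biallelic target loci.
    for n in (10, 20, 30):
        print(n, total_genotypes(n), genotypes_with_k_mutations(n, 3))
```

Even with only two alleles per locus, the full genotype space doubles with every additional targeted locus, whereas the subset carrying a specific small number of mutations grows far more slowly. This is one way to see why ranking and pre-selecting loci before combinatorial optimization can matter.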
A framework has been proposed12,42 that includes steps of generating diversity throughout the genome, ranking relevant genetic modifications and combinatorial optimization on selected loci (FIG. 3). The extent of epistatic relationships among the targeted genes has a key influence on this search strategy. There are at least two potential complementary approaches to identify the gene set that is relevant for subsequent optimization. As epistatic interactions can be predicted using detailed systems-biology modelling43, future studies should attempt to extend this framework by computational modelling of the cellular subsystem studied. Alternatively, researchers could use standard protocols of microbial laboratory evolution coupled with whole-genome sequencing to identify adaptive mutations that accumulated during selection for the phenotype of interest (FIG. 3).

Another serious limitation of genome engineering is that most protocols are currently applicable to only a few microbial species, most of which are laboratory model organisms10. One major issue is the variety of DNA repair mechanisms present in other species, which could potentially render genome engineering inefficient in other organisms. The extension of the applicability of, for example, genome editing protocols to a wide range of microbial species would enable researchers to systematically investigate and compare the effect of specific mutations and their combinations across species. Such comparative analyses of mutational effects may illuminate several unresolved issues, such as the mechanisms that drive variation in genetic pleiotropy across taxa44.

With these conceptual and technical advancements in hand, we expect major breakthroughs in various areas.

Reconstruction of ancestral subcellular subsystems. Ancestral protein and even large-scale genomic sequences can be inferred using phylogenetic methods.
Reconstruction of these ancestral sequences through gene synthesis followed by integration into native genomes allows functional characterization. Successful examples so far include the reconstruction of enzymes45, highly conserved proteins46 and protein complexes47. Among others, these studies delivered insights into the ecological niches of ancestral species48 and the mechanisms underlying evolutionary innovations through gene duplication46.

The next steps will be to use genome engineering to reconstruct larger cellular subsystems of ancestral species with the aim of rendering phenotypes that depend on the interplay of multiple genes and to investigate the emergence of complex pathways. This can be achieved in two fundamentally different ways. The genome of an existing organism could be edited at specific loci using multiplex recombineering or related techniques. However, if the ancestral and the edited genomes are substantially different (that is, if the number of loci that must be targeted is large), then this approach becomes extremely tedious. In such cases, complete de novo synthesis of the ancestral genome is expected to be a more viable strategy.

Exploring the space of alternative genetic circuits. A series of studies indicated that alternative gene regulatory circuits can have similar logical outcomes, albeit not necessarily identical performance49,50 (BOX 1). Combining network design with laboratory evolution experiments could potentially elucidate the extent to which different topologies can be ‘fine-tuned’, which would provide insights into why specific network structures have been preferred during evolution. In the future, establishment of libraries of regulatory ‘switches’, promoters and other standard biological parts will allow high-throughput automated design and laboratory selection of such circuits.

Conclusions
The examples above show that evolutionary biology can greatly benefit from concepts and methods of genome engineering. However, the reverse is also true: evolutionary thinking can feed back on engineering principles (BOX 2). We emphasize that the roles of genome engineering and laboratory evolution in elucidating difficult evolutionary problems should be complementary to each other. By constructing rare genomic alterations or specific combinations of mutations, genome engineering could facilitate complex alterations of cellular phenotypes, which could later be fine-tuned by standard protocols of laboratory evolution. Combination of rational and evolutionary design strategies is important both for understanding natural systems and for the construction of genetic regulatory circuits for biotechnological purposes51.

Csaba Pál, Balázs Papp and György Pósfai are at the Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, H-6726, Hungary.
Correspondence to C.P. e-mail: [email protected]
doi:10.1038/nrg3746
Published online 28 May 2014

1. Barrick, J. E. & Lenski, R. E. Genome dynamics during experimental evolution. Nature Rev. Genet. 14, 827–839 (2013).
2. Vieira-Silva, S. & Rocha, E. P. The systemic imprint of growth and its uses in ecological (meta)genomics. PLoS Genet. 6, e1000808 (2010).
3. Elena, S. F. & Lenski, R. E. Evolution experiments with microorganisms: the dynamics and genetic bases of adaptation. Nature Rev. Genet. 4, 457–469 (2003).
4. Dettman, J. R. et al. Evolutionary insight from whole-genome sequencing of experimentally evolved microbes. Mol. Ecol. 21, 2058–2077 (2012).
5. Barrick, J. E. et al. Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461, 1243–1247 (2009).
6. Blount, Z. D., Borland, C. Z. & Lenski, R. E. Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc. Natl Acad. Sci. USA 105, 7899–7906 (2008).
7. Nilsson, A. I. et al. Bacterial genome size reduction by experimental evolution. Proc. Natl Acad. Sci. USA 102, 12112–12116 (2005).
8. Romero, P. A. & Arnold, F. H. Exploring protein fitness landscapes by directed evolution. Nature Rev. Mol. Cell Biol. 10, 866–876 (2009).
9. Peisajovich, S. G. & Tawfik, D. S. Protein engineers turned evolutionists. Nature Methods 4, 991–994 (2007).
10. Esvelt, K. M. & Wang, H. H. Genome-scale engineering for systems and synthetic biology. Mol. Systems Biol. 9, 641 (2013).
11. Carr, P. A. & Church, G. M. Genome engineering. Nature Biotech. 27, 1151–1162 (2009).
12. Woodruff, L. B. & Gill, R. T. Engineering genomes in multiplex. Curr. Opin. Biotechnol. 22, 576–583 (2011).
13. Wang, H. H. et al. Programming cells by multiplex genome engineering and accelerated evolution. Nature 460, 894–898 (2009).
14. Lajoie, M. J. et al. Genomically recoded organisms expand biological functions. Science 342, 357–360 (2013).
15. Isaacs, F. J. et al. Precise manipulation of chromosomes in vivo enables genome-wide codon replacement. Science 333, 348–353 (2011).
16. Pósfai, G. et al. Emergent properties of reduced-genome Escherichia coli. Science 312, 1044–1046 (2006).
17. Patnaik, R. et al. Genome shuffling of Lactobacillus for improved acid tolerance. Nature Biotech. 20, 707–712 (2002).
18. Zhang, Y. X. et al. Genome shuffling leads to rapid phenotypic improvement in bacteria. Nature 415, 644–646 (2002).
19. Lartigue, C. et al. Genome transplantation in bacteria: changing one species to another. Science 317, 632–638 (2007).
20. Rocha, E. P. The organization of the bacterial genome. Annu. Rev. Genet. 42, 211–233 (2008).
21. Knight, R. D., Freeland, S. J. & Landweber, L. F. Rewiring the keyboard: evolvability of the genetic code. Nature Rev. Genet. 2, 49–58 (2001).
22. Ambrogelly, A., Palioura, S. & Söll, D. Natural expansion of the genetic code. Nature Chem. Biol. 3, 29–35 (2007).
23. Lajoie, M. J. et al. Probing the limits of genetic recoding in essential genes. Science 342, 361–363 (2013).
24. Bezerra, A. R. et al. Reversion of a fungal genetic code alteration links proteome instability with genomic and phenotypic diversification. Proc. Natl Acad. Sci. USA 110, 11079–11084 (2013).
25. Liu, C. C. & Schultz, P. G. Adding new chemistries to the genetic code. Annu. Rev. Biochem. 79, 413–444 (2010).
26. Philip, G. K. & Freeland, S. J. Did evolution select a nonrandom “alphabet” of amino acids? Astrobiology 11, 235–240 (2011).
27. Liu, C. C. et al. Protein evolution with an expanded genetic code. Proc. Natl Acad. Sci. USA 105, 17688–17693 (2008).
28. Walter, K. U., Vamvaca, K. & Hilvert, D. An active enzyme constructed from a 9-amino acid alphabet. J. Biol. Chem. 280, 37742–37746 (2005).
29. Hammerling, M. J. et al. Bacteriophages use an expanded genetic code on evolutionary paths to higher fitness. Nature Chem. Biol. 10, 178 (2014).
30. Wernegreen, J. J. Genome evolution in bacterial endosymbionts of insects. Nature Rev. Genet. 3, 850–861 (2002).
31. Fehér, T., Papp, B., Pál, C. & Pósfai, G. Systematic genome reductions: theoretical and experimental approaches. Chem. Rev. 107, 3498–3513 (2007).
32. Pál, C., Papp, B. & Lercher, M. J. Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nature Genet. 37, 1372–1375 (2005).
33. Umenhoffer, K. et al. Reduced evolvability of Escherichia coli MDS42, an IS-less cellular chassis for molecular and synthetic biology applications. Microb. Cell Fact 9, 38 (2010).
34. Fehér, T. et al. Competition between transposable elements and mutator genes in bacteria. Mol. Biol. Evol. 29, 3153–3159 (2012).
35. Pál, C. et al. Chance and necessity in the evolution of minimal metabolic networks. Nature 440, 667–670 (2006).
36. Hurst, L. D., Pál, C. & Lercher, M. J. The evolutionary dynamics of eukaryotic gene order. Nature Rev. Genet. 5, 299–310 (2004).
37. Dymond, J. S. et al. Synthetic chromosome arms function in yeast and generate phenotypic diversity by design. Nature 477, 471–476 (2011).
38. Teresa Avelar, A., Perfeito, L., Gordo, I. & Godinho Ferreira, M. Genome architecture is a selectable trait that can be maintained by antagonistic pleiotropy. Nature Commun. 4, 2235 (2013).
39. Annaluru, N. et al. Total synthesis of a functional designer eukaryotic chromosome. Science 344, 55–58 (2014).
40. Lynch, M. & Abegg, A. The rate of establishment of complex adaptations. Mol. Biol. Evol. 27, 1404–1414 (2010).
41. Nyerges, A. et al. Conditional DNA repair mutants enable highly precise genome engineering. Nucleic Acids Res. 42, e62 (2014).
42. Warner, J. R., Reeder, P. J., Karimpour-Fard, A., Woodruff, L. B. & Gill, R. T. Rapid profiling of a microbial genome using mixtures of barcoded oligonucleotides. Nature Biotech. 28, 856–862 (2010).
43. Papp, B., Notebaart, R. A. & Pál, C. Systems-biology approaches for predicting genomic evolution. Nature Rev. Genet. 12, 591–602 (2011).
44. Wang, Z., Liao, B. Y. & Zhang, J. Genomic patterns of pleiotropy and the evolution of complexity. Proc. Natl Acad. Sci. USA 107, 18034–18039 (2010).
45. Benner, S. A., Sassi, S. O. & Gaucher, E. A. Molecular paleoscience: systems biology from the past. Adv. Enzymol. Relat. Areas Mol. Biol. 75, 1–132 (2007).
46. Voordeckers, K. et al. Reconstruction of ancestral metabolic enzymes reveals molecular mechanisms underlying evolutionary innovation through gene duplication. PLoS Biol. 10, e1001446 (2012).
47. Finnigan, G. C., Hanson-Smith, V., Stevens, T. H. & Thornton, J. W. Evolution of increased complexity in a molecular machine. Nature 481, 360–364 (2012).
48. Gaucher, E. A., Thomson, J. M., Burgan, M. F. & Benner, S. A. Inferring the palaeoenvironment of ancient bacteria on the basis of resurrected proteins. Nature 425, 285–288 (2003).
49. Isalan, M. et al. Evolvability and hierarchy in rewired bacterial gene networks. Nature 452, 840–845 (2008).
50. Cagatay, T., Turcotte, M., Elowitz, M. B., Garcia-Ojalvo, J. & Süel, G. M. Architecture-dependent noise discriminates functionally analogous differentiation circuits. Cell 139, 512–522 (2009).
51. Haseltine, E. L. & Arnold, F. H. Synthetic gene circuits: design with directed evolution. Annu. Rev. Biophys. Biomol. Struct. 36, 1–19 (2007).
52. Peisajovich, S. G., Garbarino, J. E., Wei, P. & Lim, W. A. Rapid diversification of cell signaling phenotypes by modular domain recombination. Science 328, 368–372 (2010).
53. Yokobayashi, Y., Weiss, R. & Arnold, F. H. Directed evolution of a genetic circuit. Proc. Natl Acad. Sci. USA 99, 16587–16591 (2002).
54. Renda, B. A., Hammerling, M. J. & Barrick, J. E. Engineering reduced evolutionary potential for synthetic biology. Mol. Biosyst. http://dx.doi.org/10.1039/C3MB70606K (2014).
55. Fehér, T. et al. Scarless engineering of the Escherichia coli genome. Methods Mol. Biol. 416, 251–259 (2008).
56. Csörgo, B., Fehér, T., Tímár, E., Blattner, F. R. & Pósfai, G. Low-mutation-rate, reduced-genome Escherichia coli: an improved host for faithful maintenance of engineered genetic constructs. Microb. Cell Fact 11, 11 (2012).
57. Ellis, H. M., Yu, D., DiTizio, T. & Court, D. L. High efficiency mutagenesis, repair, and engineering of chromosomal DNA using single-stranded oligonucleotides. Proc. Natl Acad. Sci. USA 98, 6742–6746 (2001).
58. Soskine, M. & Tawfik, D. S. Mutational effects and the evolution of new protein functions. Nature Rev. Genet. 11, 572–582 (2010).
59. Bogdanove, A. J. & Voytas, D. F. TAL effectors: customizable proteins for DNA targeting. Science 333, 1843–1846 (2011).
60. Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR–Cas systems. Nature Biotech. 31, 233–239 (2013).
61. Enyeart, P. J. et al. Generalized bacterial genome editing using mobile group II introns and Cre–lox. Mol. Syst. Biol. 9, 685 (2013).
62. Gibson, D. G. et al. Creation of a bacterial cell controlled by a chemically synthesized genome. Science 329, 52–56 (2010).
63. Chan, L. Y., Kosuri, S. & Endy, D. Refactoring bacteriophage T7. Mol. Syst. Biol. 1, 2005.0018 (2005).
64. Rhodius, V. A. et al. Design of orthogonal genetic switches based on a crosstalk map of σs, anti-σs, and promoters. Mol. Syst. Biol. 9, 702 (2013).
65. Itaya, M., Tsuge, K., Koizumi, M. & Fujita, K. Combining two genomes in one cell: stable cloning of the Synechocystis PCC6803 genome in the Bacillus subtilis 168 genome. Proc. Natl Acad. Sci. USA 102, 15971–15976 (2005).

Acknowledgements
The authors thank the anonymous reviewers for suggestions on the manuscript. C.P. and B.P. thank the Wellcome Trust and the Lendulet Programme of the Hungarian Academy of Sciences for supporting this work; G.P. thanks the Hungarian Research Council (OTKA) for supporting this work. B. Kintses, A. Nyerges and B. Csorgo gave comments on an earlier version of the manuscript.

Competing interests statement
The authors declare no competing interests.

FURTHER INFORMATION
MAGE: http://wyss.harvard.edu/viewpage/330/
Synthetic Yeast 2.0: http://syntheticyeast.org/

© 2014 Macmillan Publishers Limited. All rights reserved
