The International Journal of Biochemistry & Cell Biology 41 (2009) 254–265

journal homepage: www.elsevier.com/locate/biocel

Review

Evolutionary origins and directed evolution of RNA

Andrew D. Ellington ∗, Xi Chen, Michael Robertson, Angel Syrett

Department of Chemistry and Biochemistry, Institute for Cell and Molecular Biology, University of Texas at Austin, Austin, TX 78712, United States

Article info

Article history: Available online 19 August 2008

Keywords: Origins; In vitro selection; SELEX; Aptamer; Ribozyme; Polymerase; Translation; Fitness landscape

Abstract

In vitro selection experiments show first and foremost that it is possible that functional nucleic acids can arise from random sequence libraries. Indeed, even simple sequence and structural motifs can prove to be robust binding species and catalysts, indicating that it may have been possible to transition from even the earliest self-replicators to a nascent, RNA-catalyzed metabolism. Because of the diversity of aptamers and ribozymes that can be selected, it is possible to construct a ‘fossil record’ of the evolution of the RNA world, with in vitro selected catalysts filling in as doppelgangers for molecules long gone. In this way a plausible pathway from simple oligonucleotide replicators to genomic polymerases can be imagined, as can a pathway from basal ribozyme activities to the ribosome. Most importantly, though, in vitro selection experiments can give a true and quantitative idea of the likelihood that these scenarios could have played out in the RNA world. Simple binding species and catalysts could have evolved into other structures and functions. As replicating sequences grew longer, new, more complex functions or faster catalytic activities could have been accessed.
Some activities may have been isolated in sequence space, but others could have been approached along large, interconnected neutral networks. As the number, type, and length of ribozymes increased, RNA genomes would have evolved and eventually there would have been no area in a fitness landscape that would have been inaccessible. Self-replication would have inexorably led to life.

© 2008 Elsevier Ltd. All rights reserved.

Contents

1. Background on the RNA world hypothesis and in vitro selection
2. What in vitro selection tells us about functional RNAs
2.1. Binding species
2.2. Catalysts
2.3. Cofactors
2.4. Structural complexity
3. Doppelgangers for recapitulating the RNA world
3.1. Recapitulation of the evolution of self-replication
3.2. Recapitulation of the development of translation
4. What in vitro selection reveals about fitness landscapes
4.1. Tyranny of short motifs
4.2. Diversity in sequence space
4.3. Acquisition of new functions
5. Conclusions
References

∗ Corresponding author. Tel.: +1 512 471 6445; fax: +1 512 471 7014. E-mail address: [email protected] (A.D. Ellington).
1357-2725/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.biocel.2008.08.015

A.D. Ellington et al. / The International Journal of Biochemistry & Cell Biology 41 (2009) 254–265

1. Background on the RNA world hypothesis and in vitro selection

It is generally accepted that life on Earth underwent a phase in which nucleic acids rather than proteins were the primary functional biopolymers in a cell, the so-called RNA world (Gilbert, 1986). It is presumed that at some point in the evolution of the RNA world template-directed polymerization arose, allowing the transfer of genetic information between generations of replicators. This would have in turn engendered the cycle of Darwinian evolution that is still playing out today. Assuming that the RNA world immediately preceded the advent of translation and the evolution of the modern cell in which proteins are the functional biopolymers, it seems likely that the RNA world contained a large number of pathways and catalysts that would have preceded or mimicked those we see in modern metabolism.

The process by which ancient catalysts may have arisen can be roughly reproduced in the laboratory. Random sequence RNA pools can be generated by a combination of chemical and enzymatic synthesis, and molecules that perform a particular function (binding, catalysis) can be sieved from the pools.
Selected binding species (aptamers) or catalysts (ribozymes) can then be amplified by a combination of reverse transcription, PCR, and in vitro transcription. After multiple cycles of selection and amplification that mimic natural selection, the aptamers and ribozymes generated by directed evolution are typically highly fit molecules. In this regard, they can be considered to be doppelgangers of the molecules that may have existed in the early RNA world. By studying the properties of these selected RNAs, we may also garner some insight into the nature and propensities of the RNA world.

2. What in vitro selection tells us about functional RNAs

2.1. Binding species

Aptamers have now been selected against a wide variety of target molecules, including ions, metabolites, and proteins (reviewed in Stoltenburg et al., 2007). Aptamers typically bind their targets with Kd values in the low nanomolar to micromolar range, similar to the binding constants of many natural protein binding species and catalysts. Aptamers can discriminate between substrates as well as monoclonal antibodies can. For example, aptamers selected to bind to theophylline can discriminate against caffeine, which differs by a single methyl group, by 10,000-fold, while aptamers selected to bind to ATP can discriminate against dATP by a similar value (reviewed in Ellington, 1994). Thus, it seems likely that binding pockets could have readily formed in the RNA world, and would have functioned in a manner similar to those found in modern, protein-based life.

2.2. Catalysts

Of course, binding pockets in RNA molecules would have primarily been useful for promoting substrate binding and thus assisting in catalytic transformations. Just as binding species can be selected from random sequence populations, catalysts can be directly selected as well.
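The cycle of selection and amplification described above can be illustrated numerically. The following is a minimal sketch rather than a model taken from the article: it assumes that each round retains functional molecules `enrichment` times more efficiently than non-functional ones, and the starting fraction (1e-10) and per-round enrichment (50) are hypothetical values chosen only for illustration.

```python
def enrich(fraction: float, enrichment: float) -> float:
    """Fraction of functional molecules after one round of selection and amplification.

    Assumption: functional molecules survive the selection step `enrichment`
    times more often than non-functional ones; amplification preserves ratios.
    """
    kept_functional = fraction * enrichment
    kept_other = 1.0 - fraction
    return kept_functional / (kept_functional + kept_other)


def rounds_to_majority(start_fraction: float, enrichment: float, max_rounds: int = 50) -> int:
    """Number of rounds needed before functional molecules exceed half the pool."""
    fraction = start_fraction
    for round_number in range(1, max_rounds + 1):
        fraction = enrich(fraction, enrichment)
        if fraction > 0.5:
            return round_number
    raise ValueError("pool did not become majority-functional within max_rounds")


if __name__ == "__main__":
    # Hypothetical numbers: one functional molecule in 10^10, 50-fold enrichment per round.
    print(rounds_to_majority(1e-10, 50.0))
```

Under these assumed values, rare functional sequences come to dominate the pool within a handful of rounds, which is consistent with the multiple cycles of selection and amplification mentioned in the text.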
Rather than selecting for partitioning between binding and non-binding species, selections for ribozymes typically involve selecting for single turnover reactions that modify the catalytic species itself (Chen et al., 2007). For example, ribozyme ligases that can append a primer sequence to themselves have been selected (Bartel and Szostak, 1993), as have ribozyme cleavases that cut themselves away from solid supports (reviewed in Breaker and Joyce, 1994). Variations on this theme have led to the selection of alkyl transferases, kinases, phosphatases, esterases, and ribozymes capable of carbon–carbon bond synthesis (reviewed in Ellington and Robertson, 1999; Lilley, 2003; Strobel and Cochrane, 2007). Many of the basic reactions that support modern metabolism have already been shown to be catalyzed by selected ribozymes, again supporting the notion that such ribozymes could have also been present in an RNA world. As examples, below we show how both self-replication and translation might have arisen in an RNA world, based on the results of selection experiments.

However, there is a distinct difference between the results of selections that generate aptamers and selections that generate ribozymes. Irrespective of the stringency of the selection, ribozymes always prove to be roughly 1000-fold or more slower than their protein counterparts.

2.3. Cofactors

Beyond binding substrates, early ribozymes could have also bound cofactors to augment their catalytic activities. For example, a number of aptamers have now been isolated that bind to adenosine or its analogues. The first anti-ATP aptamer isolated by Sassanfar and Szostak (1993) contains an asymmetric internal loop flanked by two double-stranded RNA regions and has a binding constant of around 10 μM. The three-dimensional structure of this aptamer has been determined (Dieckmann et al., 1996; Jiang et al., 1996).
The structure forms a GNRA tetraloop-like motif with AMP contributing the A; binding primarily involves the nucleobase. Aptamers later isolated against NAD+ (Burgstaller and Famulok, 1994) and SAM (S-adenosyl methionine) (Burke and Gold, 1997) share the same sequence and structural motif for adenine binding, just as many different ATP binding proteins share the Rossmann fold. Indeed, a closer structural examination reveals many similarities between the ways in which RNA molecules and proteins bind adenosine (Marshall et al., 1997). In each case, hydrogen bonding and stacking are the two dominant factors for affinity and discrimination.

While there is a simple and common motif for adenine binding, other sequences and structures are also possible (just as proteins can interact with adenosine in ways other than the Rossmann fold). An anti-CoA aptamer isolated by Burke and Hoffman (1998) also primarily recognizes adenosine, but its binding sequence differs from that of the anti-ATP aptamer. Even though both aptamers bind adenosine, the anti-ATP aptamer does not recognize CoA, likely because this aptamer cannot accommodate the 3′ phosphate of CoA. Aptamers have also been isolated that specifically target adenine (as opposed to adenosine), and again have a different sequence and structural motif than the anti-ATP aptamer. Aptamers selected to bind cAMP (Koizumi and Breaker, 2000) or to the triphosphate of ATP (Sazani et al., 2004) also have sequence and structural motifs that differ from those of the anti-ATP aptamer. While it might be supposed that these sequence differences reflect specific contacts with the sugar or phosphate, in fact the anti-cAMP aptamer seems to primarily recognize adenine but forms a binding pocket that is sterically compatible with the cyclic phosphate. In contrast, the anti-triphosphate aptamer truly does recognize the beta and gamma phosphates, and can also bind to GTP, UTP, and CTP.
Fig. 1. Evolution from ATP-binding aptamer to ribozyme kinase. (a) Consensus sequence and rough 3D representation of the anti-ATP aptamer (see also Fig. 2a). The ‘GAA’ tri-nucleotide that forms a GAAA tetraloop-like structure with AMP is shown in purple as in Fig. 2a. Watson–Crick base-pairs are shown as green lines, whereas the non-Watson–Crick base-pair is shown as in Leontis and Westhof (2001). Residues known to be conserved amongst different variants are shown in upper case. (b) Putative ATP-binding pockets in 4 classes of selected kinase ribozymes. Residues that vary from the conserved residues in the anti-ATP aptamer are highlighted in gray. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

While adenine is clearly a preferred target, other cofactor chemistries are also highly compatible with nucleic acid binding pockets. Anti-FMN and -FAD aptamers were first isolated by Burgstaller and Famulok (1994), while Lauhon and Szostak (1995) later isolated additional flavin-binding aptamers. An additional class of anti-FAD aptamers was isolated by the Burke lab (Roychowdhury-Saha et al., 2002). These aptamers all recognize riboflavin with Kd values ranging from 0.5 μM to 50 μM but have different sequences and adopt different secondary and tertiary structures: the anti-FMN and -FAD binding aptamers found by the Famulok lab have simple loop structures, the anti-riboflavin aptamers isolated by the Szostak lab fold into intramolecular G-quartets, and the anti-FAD aptamer isolated by the Burke lab contains three helices interrupted by several internal bulges. While the first anti-NAD aptamer also turned out to be an adenosine binder, as mentioned above, anti-NMN aptamers were subsequently identified (Lauhon and Szostak, 1995).
The anti-NMN aptamers were able to discriminate between NAD and NADH in solution by over an order of magnitude (Kd values of 2.5 μM versus 37 μM). This result is encouraging, in that it implies that ribozymes using a similar domain might assist hydride acceptance or transfer during cofactor-mediated ribozyme catalysis.

Given that ribozymes used nucleotide cofactors to invent metabolism, one interesting question is how they used their cofactors. There are two basic choices: non-covalent binding or covalent attachment. In order to assess these mechanistic possibilities, directed evolution has been used to reinvent RNA doppelgangers of protein-catalyzed reactions. Both mechanisms have now been successfully demonstrated.

Aptamer domains have been cobbled to ribozymes in hopes of generating non-covalent cofactor-binding domains. For example, Lorsch and Szostak (1994) attempted to select ribozymes that utilized ATP-γS as a substrate and that could phosphorylate themselves. Initially, the anti-adenosine aptamer motif described above (Figs. 1 and 2a) was embedded in the random pool to increase the chance of identifying self-kinases that could utilize ATP. The transfer of the gamma thiophosphate allowed capture of any self-kinases on a sulfhydryl affinity column. Among the resultant 7 classes of selected ribozymes, 5 classes acted as 5′ kinases while the other 2 classes catalyzed the phosphorylation of an internal 2′ hydroxyl. More recently, Burke and his co-workers have obtained similar results, again selecting ribozymes that could predominantly phosphorylate themselves on 2′ hydroxyls (Saran et al., 2005). When the sequences selected by Lorsch were analyzed, the anti-ATP aptamer motif was found to be conserved in only 2 classes (Class IV and Class V in Fig. 1b), and only one of these had detectable ATP affinity. In other words, the ribozyme actually eliminated the pre-formed ability to bind ATP in order to make a functional kinase.
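To make the anti-NMN aptamer Kd values quoted above (2.5 μM versus 37 μM) more concrete, the sketch below converts them into fractional occupancies with the standard single-site binding isotherm. It assumes that the ligand is in large excess over the RNA, that 2.5 μM refers to NAD and 37 μM to NADH (the order in which the text lists them), and that the ligand concentration of 10 μM is a hypothetical value chosen for illustration.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Single-site occupancy, [L] / (Kd + [L]), with ligand in excess over RNA."""
    return ligand_um / (kd_um + ligand_um)


KD_NAD_UM = 2.5    # from the text; assignment to NAD is an assumption
KD_NADH_UM = 37.0  # from the text; assignment to NADH is an assumption

if __name__ == "__main__":
    ligand_um = 10.0  # hypothetical ligand concentration
    print("Kd ratio:", KD_NADH_UM / KD_NAD_UM)
    print("NAD occupancy:", fraction_bound(ligand_um, KD_NAD_UM))
    print("NADH occupancy:", fraction_bound(ligand_um, KD_NADH_UM))
```

The Kd ratio of roughly 15 is the "over an order of magnitude" discrimination mentioned in the text; at the assumed ligand concentration it corresponds to about 80% versus about 21% occupancy.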
This may be consistent with what is known about the relative robustness of functional sequence landscapes, described below.

Redox ribozymes that utilize NAD+ as a cofactor have also been selected (Tsukiji et al., 2003). In this selection scheme, RNA transcription was initiated by guanosine-5′-monophosphorothioate (GMPS) and the thiol group was then used as a handle for the conjugation of a benzyl alcohol derivative. The conjugated pool was incubated with biotin-hydrazide in the presence of NAD+. If the alcohol group was oxidized by NAD+, the resultant aldehyde could be coupled to biotin and the self-tagging catalysts separated by streptavidin capture. Based on the estimated background rate, the selected ribozyme achieves a rate enhancement of >10⁷-fold. Subsequently, the same ribozyme was shown to catalyze the reverse reaction – reduction of benzyl aldehyde by NADH (Tsukiji et al., 2004). Moreover, the oxidation of alcohol still progressed when the reaction was supplemented with NADH and FAD, relying on the uncatalyzed hydride transfer from NADH to FAD. These results suggest that redox couples would have easily arisen in the earliest metabolic pathways.

Cofactors can also serve as covalent carriers of functional groups. Acyl-CoA synthesizing ribozymes, which convert high-energy adenylate intermediates to slightly less energetic thioesters, were independently selected by two groups, using slightly different methods (Jadhav and Yarus, 2002; Coleman and Huang, 2002). In both selections, activated biotin (biotin-AMP) was the acyl source and served as an affinity tag to capture self-biotinylated ribozymes. In the selection done by the Yarus group, CoA was incorporated at the 5′ termini of a pool of RNA using a trans-acting version of the previously described capping enzyme and the 3′ phosphate of the AMP portion of CoA as a nucleophile. The most abundant sequence from the selection, acs1, catalyzed the CoA acylation reaction. Researchers in the Huang lab instead utilized a 3′-dephosphorylated CoA as the initiating nucleotide for transcription (Huang, 2003). It is possible that appended functional groups not only augmented catalysis, but may also have participated in replication, as described below.

Fig. 2. Structural complexity of aptamers. (a) The binding pocket for AMP (shown in green) in an anti-ATP aptamer. The ‘GAA’ tri-nucleotide (shown in purple) and its interaction with bound AMP resemble a stable GAAA tetraloop structural motif (b). (c) A selected anti-theophylline aptamer. Theophylline is shown in green. Residues making contacts with theophylline are shown in purple. (d) An expanded view of the binding pocket for theophylline. (Structures are from PDB ID: 1RAW for (a), 1ZIF for (b) and 1O15 for (c)).

2.4. Structural complexity

The steady accumulation of in vitro selected aptamers and ribozymes has allowed a deeper understanding of how RNA achieves its functionality through the study of their three-dimensional structures. Because of the relative scarcity of naturally occurring examples of functional RNAs, in vitro selection has been an invaluable source of candidates for structural analysis. Aside from a limited sample size, the examples of naturally occurring functional RNAs often present formidable technical challenges for structural studies. For example, the ribosome and the spliceosome are enormous multi-component molecular machines that are difficult to isolate in the homogeneous quantities necessary for structural techniques such as X-ray crystallography. Conversely, in vitro selected RNAs are typically much smaller in size and are amenable to preparation and manipulation in vitro using the same techniques by which they were created in the first place.
As a consequence, the contributions of aptamers and ribozymes to the structural biology of RNA have been at least as important as that of natural RNAs. Much like proteins, structurally complex RNAs form their global structures from secondary structural elements – A-form double helices – assembled into distinct tertiary structures through an assortment of smaller motifs and special interactions. These simple structural elements, such as pseudoknots, tetraloops, and uridine turns, among others, often represent the simplest, most energetically efficient solution to a particular structural requirement and are found time and again in diverse RNAs irrespective of their source. Different arrangements of smaller, local motifs within larger molecules lead to a virtually unlimited palette of global structures and functionalities. For example, GNRA tetraloops are commonly observed in large RNAs such as rRNA but also occur in numerous structured RNAs, including those created with in vitro selection. They use a compact and efficient combination of base stacking and hydrogen bonding to reverse the directionality of an RNA chain in base-paired helices that terminate in stem-loop structures. As an example of their frequency, the ∼1500 nucleotide E. coli 16S rRNA contains 9 GNRA tetraloops (Woese et al., 1990). This distinct structural motif may be further utilized as a handle to form long-range tertiary contacts with a partner motif called a tetraloop receptor or may be recognized by a separate protein binding partner. As mentioned previously, the ATP aptamer uses a GNRA-like motif to bind its ligand by docking the ATP into the “A” position of the motif, albeit with the ATP in a slightly modified configuration (Jiang et al., 1996) (Fig. 2a and b). This general mode of ligand binding – either using the ligand to complete or fit into a common motif or to tie different structural elements together – is a common theme. 
Fig. 3. Crystal structure of the L1 ribozyme ligase. (a) Overall fold of the L1 ligase. U71 and G1 span the ligation junction and are shown in red. A reverse Watson–Crick base-pair formed by A51 and U38 mediates inter-helical contacts and is shown in purple. A conserved 5 nucleotide motif that serves as the hinge region is shown in yellow. (b) Catalytic core of the L1 ligase. U71-p-G1 and the A51:U38 base-pair are shown as in (a). The backbone phosphates of G52 (pairing with U71), A39 and G40 coordinate the active-site Mg2+ (shown as a cyan sphere) and are shown in green. (c) Top view of the G1:A51:U38 base-triple in the catalytic core. (PDB ID: 2OIU).

The theophylline aptamer is an exquisite example of the type of structural precision that RNA can achieve in a small efficient structure (Jenison et al., 1994; Zimmermann et al., 1997). The 33 nucleotide aptamer binds theophylline with a Kd of ∼400 nM but can discriminate against caffeine, which differs by the presence of a single additional methyl group at the N7 position, by a factor of 10,000-fold. This remarkable specificity is achieved primarily by a network of hydrogen bonds in which the ligand completes a base-triple interaction. The presence of the additional methyl group on caffeine disrupts one of the hydrogen bonds and causes a steric clash in the binding pocket that greatly diminishes the binding affinity. The binding pocket is further stabilized with extensive base stacking as the ligand-containing base-triple is sandwiched above and below by two other base-triples (Fig. 2c). A closer view (Fig. 2d) demonstrates how theophylline fits into this well-ordered binding pocket.
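The 10,000-fold discrimination quoted above can be restated as a difference in binding free energy. The sketch below uses the standard relation ΔΔG = RT ln(ratio); it assumes that the discrimination factor equals the ratio of the two dissociation constants and that the temperature is 25 °C (298.15 K), neither of which is stated in the text.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def discrimination_free_energy(fold: float, temperature_kelvin: float = 298.15) -> float:
    """Binding free energy difference (kcal/mol) for a given fold discrimination."""
    return GAS_CONSTANT_KCAL * temperature_kelvin * math.log(fold)


def implied_kd(kd_preferred_molar: float, fold: float) -> float:
    """Kd of the disfavored ligand, assuming fold discrimination is the Kd ratio."""
    return kd_preferred_molar * fold


if __name__ == "__main__":
    print(round(discrimination_free_energy(1e4), 2), "kcal/mol")
    print(implied_kd(400e-9, 1e4), "M")  # theophylline Kd of ~400 nM from the text
```

Under these assumptions the single methyl group costs roughly 5.5 kcal/mol of binding energy, and the implied Kd for caffeine is on the order of 4 mM.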
With knowledge of the three-dimensional structure, this small aptamer has been further reduced in size to a minimal 13 nucleotide construct that preserves all of the essential interactions with the ligand and retains discrimination against caffeine (Anderson and Mecozzi, 2005).

RNA catalysis, in particular RNA-catalyzed RNA replication, is central to ideas of how life on Earth was able to evolve to its current complexity. Since examples of RNA replicases are unknown in nature, in vitro selection has been used to assess whether ribozymes are capable of this activity. RNA ligase ribozymes that catalyze the bond-forming nucleophilic attack of a 3′ hydroxyl on the 5′ triphosphate of a separate RNA oligonucleotide – the identical chemistry performed by modern RNA polymerase enzymes – were first isolated using in vitro selection from a large population of random sequence RNAs (Bartel and Szostak, 1993). Subsequently, the RNA ligase selection has been successfully repeated by several investigators under various conditions with different starting populations, indicating that this activity is relatively abundant in RNA sequence space. With the notable exception of the Bartel Class I ligase (see also below), most of these selected ligases are remarkably simple in terms of their secondary structures and information content. The X-ray crystal structure of one of these smaller ligases, the L1 ligase (Fig. 3a), confirms that only a handful of “special” residues coupled with an arrangement of simple helices and tetraloops are required to form the catalytically active structure (Robertson and Scott, 2007). The majority of the structure, greater than 75%, is involved in Watson–Crick base-paired helices and ordinary tetraloops, and only 7 residues form the heart of the tertiary interaction that creates a catalytic pocket to position a magnesium ion at the ligation site by the juxtaposition of two stem elements (Fig. 3b and c).
A few other residues in the hinge region (shown in yellow in Fig. 3a) appear to play a role in further stabilizing the active conformation, but overall this simple structure is formed with the efficient use of relatively few residues directing the arrangement of standard structural elements into a compact fold. This observation is also relevant to a general appreciation of how the structures of selected RNA molecules impact our thoughts about an RNA world: selections can generate simple, functional sequences, these simple, functional sequences fold into compact structures, and the compact structures can potentially be appended to one another to make catalysts of ever-increasing complexity.

3. Doppelgangers for recapitulating the RNA world

3.1. Recapitulation of the evolution of self-replication

An amazing body of work shows that there are plausible prebiotic routes to nucleosides, nucleotides, and oligonucleotides (reviewed in Orgel, 2004). The non-directed assembly of oligonucleotides would have provided fodder for self-replicators to arise, and these could have further elaborated themselves through the well-known cycle of Darwinian evolution. While mechanistic descriptions of many steps in this scenario remain unknown, we have previously postulated what we believe is a plausible pathway from prebiotic chemistry to the origin and elaboration of self-replication (Levy and Ellington, 2001; Fig. 4). In this scenario, prebiotic oligonucleotides would have served as the basis for template-directed ligation and amplification, similar to the parabolic replicators previously demonstrated by both von Kiedrowski and Orgel (Zielinski and Orgel, 1987; Sievers and von Kiedrowski, 1994). Such primitive replication cycles would have served as the fodder for creating ever longer oligonucleotides, which would in turn have been the basis for selecting for more complex catalysts that were capable of more efficient self-replication.
For example, beyond accelerating catalysis via simple templating it may have been possible for ligase ribozymes to greatly accelerate the catalysis of phosphodiester bond formation. A number of different ligase ribozymes have now been selected from random sequence pools. The first and perhaps still the most robust of these ligases were selected by Bartel and Szostak (1993) from a pool that encompassed 220 random sequence positions. A wide variety of ligase ribozymes were identified, including some motifs that were found more than once, indicating that there may be motifs that were common enough in sequence space that these motifs would have been found in an ancient RNA world as well as in modern selection experiments.

Fig. 4. The march of progress through the RNA world. This is a hypothetical representation of different stages of catalytic complexity leading from simple replicators through simple ribozymes to complex ribozymes and ultimately to the takeover of the RNA world by DNA genomes and protein machines. At many of these stages, in vitro selection experiments have provided potential doppelgangers of the molecules that may have once existed. This diagram originally appeared in Levy and Ellington (2001).

While it may have been possible to select for the catalysis of ligation, such ligases would not necessarily have provided a fitness benefit to a nascent replicator on their own. In order to couple ligation with fitness a given ligase would have had to somehow reform itself. This initially seems like an exceedingly implausible notion, given that it is difficult to imagine a ribozyme whose complement was also catalytically active. However, Joyce and coworkers showed that cross-catalytic ligation of oligonucleotides was indeed possible.
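The scale of the sequence space behind a pool with 220 random positions can be estimated with a few lines of arithmetic. This is a rough sketch: the pool size of 10^15 molecules is a hypothetical figure used for illustration and is not taken from the text.

```python
import math


def log10_sequence_space(length: int, alphabet: int = 4) -> float:
    """log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet)


def max_fully_sampled_length(pool_size: int, alphabet: int = 4) -> int:
    """Longest random region for which a pool could contain every sequence once."""
    length = 0
    while alphabet ** (length + 1) <= pool_size:
        length += 1
    return length


ASSUMED_POOL_SIZE = 10 ** 15  # hypothetical number of molecules in the pool

if __name__ == "__main__":
    print("log10(4^220) =", round(log10_sequence_space(220), 1))
    print("fully sampled length:", max_fully_sampled_length(ASSUMED_POOL_SIZE))
```

With these assumptions, a pool can exhaustively cover only random regions of about two dozen nucleotides, whereas 220 positions span roughly 10^132 sequences. Motifs recovered more than once from such a sparse sample must therefore be comparatively common in sequence space, which is the inference drawn in the text.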
In their scheme, a selected ribozyme ligase was broken into two pieces, and engineered so that one ribozyme could act upon the complementary junction of another, leading to its activation (Kim and Joyce, 2004; Fig. 5a). Mixing the four ribozyme substrates led to the formation of new catalysts, but their accumulation was sub-exponential. The autocatalytic feedback cycle was partially inhibited by product binding and by the formation of alternative products through promiscuous reactions. A similar demonstration has been carried out by the Lehman lab, based on dividing the Group I self-splicing intron into four pieces and then catalyzing reassembly by ribozyme-mediated ligation (Hayden and Lehman, 2006). Interestingly, a functional ribozyme can non-covalently assemble based on base-pairing interactions, and then catalyze the assembly of covalently joined ribozymes. While there were particular sequence requirements for the ligation junctions, these requirements could be relaxed, resulting in the formation of functional ribozymes from a larger possible set of initial sequences (Draper et al., 2008). In addition to being limited by kinetics, these or similar systems would ultimately have been limited by the fact that the ribozyme templates were not perfectly complementary to one another; the catalytic cores of the ribozymes fell outside the region of complementarity. To address this limitation, Kuhns and Joyce (2003) have shown that perfectly complementary nucleic acid enzymes can be engineered, although they were not the same nucleic acid enzymes with the same ligase activity. How two ribozymes might act upon one another to engender full cross-catalytic replication is still a mystery. It seems likely that at some point during the evolution of organisms a ribozyme polymerase would have arisen that could catalyze the reproduction of templates, rather than just of itself. 
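The difference between exponential and sub-exponential accumulation can be sketched with two idealized rate laws. The parabolic case below uses the square-root law, dc/dt = k·√c, which is commonly used to describe template replicators limited by product inhibition; applying it here is an assumption, since the text does not give a rate law for the cross-catalytic system, and the rate constants are hypothetical.

```python
import math


def exponential_growth(c0: float, k: float, t: float) -> float:
    """Solution of dc/dt = k * c."""
    return c0 * math.exp(k * t)


def parabolic_growth(c0: float, k: float, t: float) -> float:
    """Solution of dc/dt = k * sqrt(c), the idealized product-inhibited case."""
    return (math.sqrt(c0) + 0.5 * k * t) ** 2


def parabolic_doubling_time(c: float, k: float) -> float:
    """Time for a parabolic replicator to grow from c to 2c."""
    return 2.0 * (math.sqrt(2.0 * c) - math.sqrt(c)) / k


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0, 3.0):
        print(t, exponential_growth(1.0, 2.0, t), parabolic_growth(1.0, 2.0, t))
```

In the exponential case the doubling time is constant, whereas in the parabolic case each doubling takes longer than the previous one, so faster replicators cannot outgrow slower ones as decisively.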
Fig. 5. Continuous amplifications of ligase ribozymes without (a) and with (b) proteinaceous enzymes. (a) Cross-catalytic replication of the R3C ligase, after Kim and Joyce (2004). A′ and B′ are half ribozyme substrates that can pair with a full ribozyme T. Ligation leads to the production of a complementary template, T′, that can in turn align the half ribozyme substrates A and B. Interactions between T and T′ form a dead-end complex. (b) Scheme for the continuous evolution of the Bartel Class I ligase by Wright and Joyce (1997a). Ligation of a chimeric RNA:DNA oligonucleotide followed by reverse transcription leads to the production of a template for T7 RNA polymerase, which can in turn be transcribed to recreate the original ribozyme.

In the original selection carried out by Bartel and Szostak there was one ligase that was unique, and that was also very fast and structurally complex, the Bartel Class I ligase. This ribozyme was further engineered to function as a modest polymerase, capable of adding up to 6 nucleotides in a 4-day incubation. While the first engineered polymerase acted in cis, an additional variant was engineered and selected to act in trans, and could extend an exogenous template by up to 14 nucleotides in 1 day. Unfortunately, the amount of information that had to be added to the Bartel ligase in order to bind a trans template was much larger than 14 nucleotides, implying that the more complex ribozyme was even less capable of eventually becoming a self-replicator. However, McGinness and Joyce (2002) were able to adapt a different, much smaller ribozyme ligase (the hc ligase, which was derived by selection from the Group I self-splicing intron) to act on a separable helix.
Further evolution of the hc ligase led to variants that were capable of catalyzing the joining of adjacent oligonucleotide substrates on an external template with few sequence restrictions. This work sets the stage for the evolution of ligases that can reproduce themselves by oligonucleotide polymerization. In an additional attempt to recapitulate the dynamics of how early replicators may have evolved, Wright and Joyce (1997b) further evolved the Bartel ligase for greater catalytic efficiency. An adaptation to the ligase was introduced in which a functional promoter was formed only upon ligation of the oligonucleotide substrate to the ribozyme (Fig. 5b). This allowed enzymatic function and sequence amplification to be coupled at the same time and in the same test tube. In this regard, evolution now occurred almost continuously, just as in the wild, at least until the test tube ran out of ‘food’ (substrates) for ribozyme replication. To circumvent this problem, Wright merely had to regularly transfer a portion of the reaction to a new food source. While the adaptations necessary for continuous evolution severely depressed the reactivity of the ribozyme, over generations of continuous evolution the Bartel ligase accumulated multiple mutations and became almost as efficient as its parent (kcat/Km of $1 \times 10^{7}$/min, an improvement of $>10^{4}$-fold). Continuous evolution is more akin to what normally occurs in biology than what happens during in vitro selection, and while some mutations were likely fixed from an initial, heavily mutagenized population, others clearly arose during the experiment itself. Building on these results, Paegel and Joyce (2008) have described a machine that can continuously evolve the ligase. In this microfluidic device new food is fed to the rampantly replicating population not by hand, but by a series of valves controlled by a computer.
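The bookkeeping behind serial transfer can be sketched in a few lines. The example below is only an illustration of the growth-then-dilution cycle described above: the function name, the fold-growth per cycle, and the dilution factor are all assumed values, not parameters taken from Wright and Joyce or from Paegel and Joyce. Cumulative amplification is tracked on a log10 scale because the raw factor quickly exceeds what a floating-point number can hold.

```python
import math


def serial_transfer(n_transfers, growth_fold, dilution_fold):
    """Simulate repeated growth-then-dilution cycles.

    Returns (log10 of cumulative amplification, population relative to start).
    All parameter values are illustrative assumptions.
    """
    log10_amplification = 0.0
    population = 1.0
    for _ in range(n_transfers):
        # The replicators grow until the substrate ('food') is used up ...
        population *= growth_fold
        log10_amplification += math.log10(growth_fold)
        # ... and an aliquot is then carried over into fresh substrate.
        population /= dilution_fold
    return log10_amplification, population


if __name__ == "__main__":
    logs, pop = serial_transfer(n_transfers=100, growth_fold=1000.0,
                                dilution_fold=1000.0)
    print(f"cumulative amplification: 10^{logs:.0f}-fold")
    print(f"population relative to start: {pop:.2f}")
```

When growth and dilution are balanced the population in the tube stays bounded while the cumulative amplification, and hence the number of generations under selection, keeps increasing with every transfer.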
As ligated ribozymes accumulate they intercalate a dye, thiazole orange, in the reaction mixture, and this in turn gives off a fluorescent signal that can be seen by embedded sensors. Once fluorescence reaches a given level new food flows, the ribozymes (and fluorescence) are diluted, and replication continues until fluorescence again builds and the food gates are again opened. In these experiments the ribozymes were again under selection for speed, but also for their ability to hold onto their oligonucleotide substrate. The Km of the starting ribozyme was 35 µM, and substrate concentration was decreased from a limiting 1 µM to as low as 0.05 µM over the course of the continuous evolution experiment. As before with the manual regime the ribozyme responded, fixing multiple mutations that resulted in an improvement in its Km to 0.4 µM. For almost all of these schemes, activated nucleotides or oligonucleotides would have been required. However, activated leaving groups would likely not have survived for long periods of time in the prebiotic environment, or would have been present at only small, steady-state concentrations. For this reason, it is interesting to consider whether activated leaving groups may have come about due to the development of catalytic cycles that preceded a ribozyme polymerase, yet may have provided substrates for its evolution. In exploring the role of cofactors in metabolism we found that functional group transfers appeared to be relatively facile. This is especially true for cofactors and phosphoryl transfer reactions. In addition to the AMP-biotin transfer to CoA cited above, Yarus and co-workers selected a ribozyme that could catalyze the rearrangement of phosphoranhydride bonds (Huang and Yarus, 1997a; Huang et al., 2000). This reaction is mechanistically similar to those which occur during adenylation of cofactors, including FAD and NAD.
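Returning briefly to the substrate-affinity figures quoted above for the continuous evolution experiment, a short calculation shows why a lower Km matters when substrate is scarce. The sketch assumes simple Michaelis–Menten behavior, in which the fraction of maximal velocity is S/(Km + S); this is an idealization for illustration and is not claimed to be the kinetic model used in the original work. Only the Km values and substrate concentrations come from the text; the function name is hypothetical.

```python
def fraction_of_vmax(substrate_um, km_um):
    """Fraction of maximal rate under assumed Michaelis-Menten kinetics."""
    return substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    km_start, km_evolved = 35.0, 0.4   # Km values quoted in the text, in micromolar
    for substrate in (1.0, 0.05):      # substrate concentrations from the text, micromolar
        before = fraction_of_vmax(substrate, km_start)
        after = fraction_of_vmax(substrate, km_evolved)
        print(f"[S] = {substrate} uM: {before:.4f} -> {after:.4f} of Vmax")
    print(f"fold improvement in Km: {km_start / km_evolved:.1f}")
```

Under this assumption the starting ribozyme would run at well under 1% of its maximal rate at the lowest substrate concentration, whereas the evolved ribozyme would run at roughly a tenth of its maximal rate, which is consistent with selection for the ability to hold onto substrate.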
Indeed, the ribozyme showed a surprising lack of substrate specificity, and could add any of a variety of molecules that contained at least one alpha-phosphate to its 5′ end (Huang and Yarus, 1997b). A similar ‘cappase’ ribozyme has been selected by Kang and Suga (2007), although in this instance the attacking nucleophile is a 2′ hydroxyl and the substrates are a variety of purine ribotides. The activated functional groups would also have been capable of participating in phosphoryl bond transfer reactions, such as those that occur during DNA ligation and polymerization. In this regard, it is interesting to note that the Group I self-splicing intron has been shown to be capable of incorporating cofactors at its 5′ end through the same mechanism that guanosine normally uses to initiate the splicing cascade (Breaker and Joyce, 1995). Oligonucleotides terminated with cofactors/leaving groups could have served to initiate ligation events, and a ribozyme selected by Fujita et al. (2006) does just this. The initial RNA pool terminated in a nicotinamide moiety joined to the ribozyme via a 5′–5′ pyrophosphate linkage, and variants were selected that could displace 5′-phosphorylated NMN and form a phosphodiester bond. Taken together, these findings indicate that while the most obvious route to prebiotic replication, the assembly of monomers, may have been foreclosed by enantiomeric poisoning or other considerations, there would nonetheless have been a plethora of other mechanisms by which early, self-replicating catalysts might have arisen.

Fig. 6. Down-hill acyl transfer reactions in translation. Acyl-activation, hydroxyl-group acylation from an adenylate and peptide bond formation are shown as arrows 1–3, respectively. Each of these (or equivalent) reactions can be catalyzed by a ribozyme generated by in vitro selection.
Once the first self-replicator was established, the elaboration of catalysis would have been almost inevitable, with simple sequence and structural motifs being improved and elaborated. Growing self-replicators would have required and acquired additional functionalities, such as the activation and synthesis of nucleotides, leading to the establishment of an interdependent genome.

3.2. Recapitulation of the development of translation

While it is hard to know how translation arose, it is likely that many of the steps in the process today are at least chemically similar to those that would have been present initially. Amino acids would likely have been activated by forming aminoacyl adenylates, a high-energy mixed anhydride between the carboxyl of the amino acid and the phosphate of AMP. In ribosome-mediated protein synthesis, this reaction is the first step of tRNA aminoacylation catalyzed by aminoacyl-tRNA synthetase (aaRS). In ribosomes, the aminoacyl is then transferred to the 2′ or 3′ hydroxyl of the 3′ terminal nucleotide of tRNA by aaRS. Charged tRNAs bind to the ribosome, and the amino group in the A-site aa-tRNA attacks the carbonyl group of the peptidyl-tRNA (or aminoacyl-tRNA) in the P-site, forming a new peptide bond. From a purely chemical point of view, the synthesis of a peptide is essentially the down-hill transfer of the aminoacyl between different carriers: phosphate (anhydride) → hydroxyl → amino in the ribosomal pathway (Fig. 6). This putative path to translation has at some level been recapitulated in the laboratory by the directed evolution of ribozymes. The initial activation of the acyl group as a mixed anhydride starts the process. An acyl activating ribozyme has been selected (Kumar and Yarus, 2001). These experiments utilized a carboxylic acid (3-mercaptopropionic acid, 3-MPA) rather than an amino acid as the nucleophilic substrate and were conducted at lower pH in order to diminish the inherent instability of the amino adenylate product.
One selected variant, KK13, was shown to be able to use various amino acids and even water (resulting in 5′-pyrophosphatase activity) as nucleophiles. The ribozyme catalyzing the reaction that mimics the second step of aaRS was actually the first selected ribozyme capable of catalyzing aminoacyl transfer (Illangasekare et al., 1995). In this selection, an RNA pool was incubated with phenylalanyl-AMP as an amino acid donor. Some catalysts proved capable of transferring phenylalanine to a hydroxyl group. The self-modified RNA now contained an amino group and could be reacted with the N-hydroxysuccinimide (NHS) ester of naphthoxyacetic acid, dramatically increasing the hydrophobicity of the RNA and thus allowing its separation from unreacted RNA by HPLC. One clone, isolate 29, was chosen for further study and the aminoacyl acceptor of this ribozyme was shown to be the 2′ or 3′ hydroxyl of the 3′ terminal G. Another aminoacylating ribozyme was selected by the Suga group (Saito et al., 2001). In this case, though, the selection was explicitly designed to be relevant to modern translation. A randomized region was appended to the 5′ end of tRNA, and the selected ribozyme catalyzes the aminoacylation of real tRNA on its amino acid acceptor arm. Even when the ribozyme domain and tRNA were separated by RNase P cleavage, the ribozyme could still function in trans, just as protein aaRS do. The first ribozyme that was actually capable of peptide bond synthesis was selected by Eaton and co-workers (Wiegand et al., 1997), and catalyzed amide-bond formation between an amino group tethered to RNA (through a flexible PEG linker) and biotin (again from a biotin adenylate). Other selected peptide bond-forming ribozymes also utilize adenylate as an acyl source.
First, a peptide bond-forming ribozyme was selected that supposedly utilized an aminoacyl linked to the 3′ hydroxyl of AMP (Zhang and Cech, 1997), but the real substrate was later identified as a contaminating aminoacyl adenylate (Sun et al., 2002). Second, during further engineering of the previously described aminoacylating ribozyme (Illangasekare et al., 1995), a second reaction between the product, aminoacylated RNA, and free aminoacyl adenylate was observed, resulting in the formation of a dipeptide-RNA adduct (Illangasekare and Yarus, 1999). Following up on these results, Zhang and co-workers intentionally selected ribozymes that could act as general dipeptide synthesis catalysts by using aminoacyl adenylates as substrates (Sun et al., 2002). The types of group transfer reactions carried out by the ribosome are used throughout metabolism, and thus there may have been many routes to translation in an early RNA world. The group transfer reactions embodied by ribozymes that utilize CoA are one example that we have already discussed, and indeed in non-ribosomal peptide synthesis, the aminoacyl of the aminoacyl adenylate is transferred to the thiol group of pantetheine, forming a thioester bond. Similarly, while the major task of the ribosome is to form peptide bonds using the amino group of an aminoacylated tRNA as the nucleophile, the nucleophile can also be the amino group of puromycin or the hydroxyl group of hydroxypuromycin in a process called the ‘fragment reaction.’ A ribozyme catalyzing a reaction similar to the fragment reaction was first selected by Lohse and Szostak (1996). These researchers intended to select an acyl transfer ribozyme that could move an aminoacyl group at the 3′ end of a 6 nucleotide RNA oligonucleotide to a 5′, 3′, or internal 2′ hydroxyl of the ribozyme.
After 11 rounds of selection one variant predominated, which had strikingly evolved a way to juxtapose the acyl group of the RNA oligonucleotide substrate and the 5′ hydroxyl of the ribozyme by simply using a 13-nt fragment template to align and juxtapose the two functionalities. As expected, the nucleophile was identified as the 5′, template-aligned hydroxyl of the ribozyme. Astoundingly, when the 5′ hydroxyl was substituted with an amino group, the aminoacyl was transferred to the ribozyme at a rate comparable to that of the original ribozyme. The fact that amide bond formation was as competent as the selected acyl transfer activity further emphasized the role of templating in the mechanism. A similar selection was carried out by the Famulok group, using 2′-aminoacyl-AMP as the aminoacyl donor (Jenne and Famulok, 1998). However, the nucleophile on the selected ribozyme proved to be an internal 2′ hydroxyl. In this instance, though, substitution of the hydroxyl with an amino group did not result in amide bond formation, possibly due to the differing geometries of the nucleophile (C3′-endo conformation for the 2′ hydroxyl vs. C2′-endo conformation for the 2′ amine). Acyl transfer activity was also achieved by the previously described ribozyme selected by Suga and co-workers (Lee et al., 2006; Saito et al., 2001). The ribozyme could use various aminoacyl donors, including adenylates, thioesters, and oxygen esters. As we saw in discussing the evolution of replication, it may be that reactions that were either evolved for another purpose or that were semi-specific in terms of substrates and leaving groups could have provided the basis for the evolution of new activities, in this case the formation of peptide bonds.

4. What in vitro selection reveals about fitness landscapes

At some level, all speculations about the RNA world are just that: speculations.
Absent a more complete representation of phylogenetic descent prior to the takeover of the RNA world by translation (or a time machine) there are numerous possible options for what may have actually happened at origins and in the RNA world. It is because there is no clear historical trail to origins that directed evolution experiments are most useful in understanding what may have happened. In vitro selection provides a snapshot of fitness landscapes: how sequence maps to function (and, therefore, at least in part to fitness; reviewed in Jhaveri et al., 1997; Lehman, 2004). By determining how likely it is that a given ribozyme activity can arise in a test tube, extrapolations regarding the likelihood of that ribozyme having arisen in the RNA world can also be made. Based on selection experiments, a number of conclusions can be drawn about the relative likelihoods of different paths to and from origins.

4.1. Tyranny of short motifs

To the extent that there was a period in Earth’s history where new sequences were invented from randomly assembled and replicated mono- and oligonucleotides, it is not unreasonable that in vitro selection directly mimics processes that occurred in the RNA world. That said, the earliest replicators would likely have been short, and their functionality would therefore have been minimal. This does not argue against nucleic acid origins, but rather in favor of it: short nucleic acids can more readily assume structure than short peptides. The fact that helices can form readily based on simple Watson–Crick pairing means that cavities for binding or catalysis will also form. The earliest replicating, functional nucleic acids would have been elongated and in the process would have had the opportunities to assume more complex structure and to refine and acquire additional functionalities.
However, the original sequence and functionality would still have been embedded in the expanded sequence, and may therefore have guided or restricted the evolution of the longer binding species and catalysts. To the extent that this hypothesis is true, we might expect in vitro selection experiments to yield at least some binding species and catalysts that have short, compact sequence and structural motifs. In fact, in vitro selection not only yields short functional motifs, but these motifs often predominate in a selection. This is of course because shorter sequence motifs are numerically overrepresented relative to longer sequence motifs, and will therefore predominate unless the longer motifs are much more functional. This predominance has been called the ‘tyranny of short motifs’, and we have already seen it in action during the selection of the Bartel ligase, in which only one long, complex, and highly functional variant arose, while many other short, less complex, but still quite functional ribozymes were also found. Similarly, undirected selections of ribozyme cleavases turned up a surprising number of variants of the well-known but simple hammerhead ribozyme motif (Salehi-Ashtiani and Szostak, 2001), as well as other simple, less active motifs (Tang and Breaker, 2000). A similar set of selections that was started from pools biased in favor of the hammerhead also showed that the ribozyme had a relatively low information content (Tang and Breaker, 1997). McManus and Li (2007) have found that similar, simple deoxyribozyme kinases emerged from a selection starting from a completely random pool. A further implication of the tyranny of short motifs is that the short sequence motifs we see today may be similar or identical to short sequence motifs that would have been seen in the RNA world.
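The numerical overrepresentation of short motifs can be made concrete with a rough estimate. The sketch below assumes uniformly random sequences, treats a motif as a single exact string of n fully specified positions (ignoring structure, degeneracy, and modular arrangements), and uses a Poisson approximation for the chance of finding at least one copy. The pool size of 10^15 molecules and the motif lengths are illustrative assumptions; only the 220 random positions come from the text.

```python
import math


def expected_copies(pool_size, random_length, motif_length):
    """Expected occurrences of one exact motif in a pool of random sequences."""
    if motif_length > random_length:
        return 0.0
    windows = random_length - motif_length + 1   # possible start positions
    return pool_size * windows * 4.0 ** (-motif_length)


def probability_present(pool_size, random_length, motif_length):
    """Poisson approximation to P(at least one copy in the pool)."""
    return 1.0 - math.exp(-expected_copies(pool_size, random_length, motif_length))


if __name__ == "__main__":
    pool = 1e15       # assumed pool size, for illustration only
    length = 220      # random positions, as in the pool described in the text
    for n in (10, 20, 30):
        copies = expected_copies(pool, length, n)
        present = probability_present(pool, length, n)
        print(f"{n}-nt motif: expected copies {copies:.3g}, P(present) {present:.3g}")
```

Each additional specified position costs a factor of four in abundance, so under these assumptions a motif defined by ten positions is present in vast excess while one defined by thirty positions may well be absent from the pool altogether.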
For example, the fact that similar short, adenosine-binding motifs have arisen in selections against many different adenosine-containing cofactors may imply that it was in fact this motif that was present in early ATP-binding species or ribozymes. Similarly, if an early cleavase was important in the RNA world it would almost certainly have been akin to the hammerhead ribozyme sequence that is seen in the modern world. Not only do short, simple, functional sequences exist, but also it appears that such species can potentially be selected from even sparse representations of sequence space (Knight and Yarus, 2003).

4.2. Diversity in sequence space

While it is true that short motifs can dominate a selection, many selection experiments reveal that the fitness landscape surrounding a given function is quite diverse. As we have already seen, the original selection by Bartel and Szostak (1993) produced many short ribozymes as well as the complex Class I ligase. A more thorough analysis has been carried out by Li and his co-workers. Schlosser and Li (2005) followed the selection of deoxyribozyme cleavases through multiple generations, and saw the rise and fall of many different variants as the selection progressed. While many of these intermediate variants may be less fit relative to the population, they can be individually ‘rescued’ by letting them explore the surrounding sequence space and fitness landscape. For example, when deoxyribozymes predicted to form three-arm junctions were further evolved, they acquired additional helical elements, and became more complex five-arm junctions (Chiuman and Li, 2006). However, such variants are only available if individual variants are protected from competition. By increasing the selection pressure on the population, the diversity of the selected sequences winnowed quickly, with the population collapsing on a previously discovered deoxyribozyme, 8–17 (Schlosser and Li, 2004).

4.3. Acquisition of new functions

Stephen Gould has asserted that if the tape of life were replayed the results would be different. However, the tyranny of short motifs implies that at least some of the molecular events in any tape that involves RNA may be quite deterministic. Both ideas can be simultaneously true. While short sequence motifs can readily predominate, elongation and elaboration of function may occur in a variety of different ways, and which of those ways was chosen during evolution will likely forever remain a mystery. In this regard, nucleic acids have proven to be surprisingly adept at moving between different functionalities, leaving behind their original, tyrannical motifs and adopting new sequences and structures. Burke and co-workers started with an anti-flavin aptamer and from it selected an anti-guanosine aptamer (Held et al., 2003). In at least one instance an anti-flavin and anti-guanosine aptamer were separated by only seven sequence substitutions. By synthesizing predicted intermediates, a neutral path between the two aptamers could be found, although two of the variants in this path had reduced binding abilities for both ligands (Fig. 7).

Fig. 7. Anticipated path between anti-FAD and anti-GMP aptamer variants. Structures were generated using the NUPACK algorithm of Dirks and Pierce (2004), based on sequences identified by Held et al. (2003). As sequence substitutions accumulate in the anti-FAD aptamer (left) the structure is predicted to be destabilized, quickly assuming a more open format. Additional sequence substitutions restabilize a new conformation, which can bind guanosine (right).

Schultes and Bartel (2000) performed a similar feat with two ribozymes, starting with a selected ligase and engineering it sequentially to be a known, natural cleavase (the HDV ribozyme).
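The size of the search implied by "seven sequence substitutions" between two aptamers can be counted directly. The sketch below uses two made-up 12-nt sequences that differ at seven positions; they are placeholders and not the aptamer sequences of Held et al. (2003). It enumerates every intermediate carrying some but not all of the substitutions and counts the orderings in which the substitutions could be introduced one at a time.

```python
from itertools import combinations
from math import factorial


def differing_positions(a, b):
    """Indices at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def intermediates(a, b):
    """All sequences carrying a non-empty, proper subset of the substitutions."""
    diffs = differing_positions(a, b)
    found = set()
    for k in range(1, len(diffs)):
        for subset in combinations(diffs, k):
            seq = list(a)
            for i in subset:
                seq[i] = b[i]
            found.add("".join(seq))
    return found


def count_direct_paths(a, b):
    """Number of orderings in which the substitutions can be made one by one."""
    return factorial(len(differing_positions(a, b)))


if __name__ == "__main__":
    start = "GACGUACGUAGC"   # hypothetical placeholder sequence
    end = "CAAGGAUGAAUG"     # hypothetical placeholder, seven substitutions away
    print("substitutions:", len(differing_positions(start, end)))
    print("intermediates:", len(intermediates(start, end)))
    print("direct paths: ", count_direct_paths(start, end))
```

For seven substitutions there are 126 possible intermediates and 5040 shortest paths, so identifying even one path along which every step retains function is a meaningful statement about how the two activities are connected in sequence space.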
Interestingly, while the intermediates showed greatly reduced activity there were at least some variants that could catalyze both types of reactions. In both of these instances, the aptamers and ribozymes assumed completely different secondary structures as a result of the accumulation of point mutations. Given the joint activities at the ‘intersection sequences’ that separated them, this means that at least some functional nucleic acids may be conformationally mobile, and capable of multiple different activities depending on which conformation they assume. Such degenerate mapping of sequence to structure to function might have greatly accelerated the acquisition of novel functions at or near origins. Not all nucleic acids may be able to move readily through sequence or conformational spaces, however. By using microarrays to directly examine the sequence space surrounding an anti-IgE aptamer Katilius et al. (2007) showed that most of the nearest neighbor mutations were deleterious to binding. Similarly, although Bartel and Szostak (1993) showed that many of the sequences and structures of some of the ligases that emerged from a ‘deep random’ selection were clearly driven by the tyranny of short motifs, the aforementioned Class I Bartel ligase was not. Indeed, partial randomization and re-selection indicated that the Class I ligase had an extremely high information content, and given the pool size used would likely have been selected only once every 10,000 times the experiment was carried out (Ekland et al., 1995). This is usually taken to mean not that Bartel was extremely lucky, but rather that there are many equally functional, equally complex variants scattered through sequence space. A similar analysis of the natural HDV ribozyme indicated that it is also relatively rare, and unlike its counterpart the hammerhead may have arisen relatively rarely (Nehdi and Perreault, 2006). 
These results agree with the fact that the HDV ribozyme has been found only once in nature, while the hammerhead ribozyme appears to have multiple, independent origins. The fact that the Class I Bartel ligase is complex and rare does not necessarily mean that it is isolated in sequence space; it might easily be part of a large neutral network of functional ribozymes. However, by and large this does not appear to be the case. The Bartel ligase has been subjected to numerous different directed evolution experiments, and has in general proved extremely recalcitrant to sequence change (Levy et al., 2005). For example, Schmitt and Lehman (1999; see also Lehman, 2004) have shown that multiple directed evolution experiments with the ligase that started from parallel, partially randomized populations yielded the same selected variants. Similarly, despite the grinding intensity of the $10^{500}$-fold selection and amplification of the ligase that was carried out via the automated methods described above (Paegel and Joyce, 2008), no real improvement in the speed of the ligase (kcat) was observed. Finally, despite heroic efforts that involved selection in a liter of water-in-oil emulsion, Zaher and Unrau (2007) improved the polymerization speed of a variant of the Bartel Class I ligase by only about 75-fold. Since there is not of necessity an inverse correlation between complexity and neutral movement on a fitness landscape, the relative isolation of the Bartel ligase is curious and demands further explanation. It may be that since the ligase was born through the unnatural process of simultaneously competing against multiple, unrelated points in sequence space, rather than the more natural process of accumulating single mutations and blocks of sequence, there is no reason to expect that it should sit on a neutral network and remain functional following even small mutational moves. As a counterexample, a number of variants of the Group I self-splicing intron have been evolved.
For example, the enzyme has been evolved to new tasks (the cleavage of DNA; Tsang and Joyce, 1994) and to function in new ways (with calcium rather than magnesium as a catalytic metal; Lehman and Joyce, 1993). During the evolution of these phenotypes, numerous different pathways or trajectories can be taken (Hanczyc and Dorit, 2000; Lehman et al., 2000). The fact that the Group I self-splicing intron was born of natural processes, and thus is already well-familiar with evolution in the context of single base changes, may be one factor accounting for the seeming difference between its readily traversable fitness landscape and the restricted peak of the Bartel ligase. While different functional nucleic acids would presumably have moved along fitness landscapes in very similar ways, the ability to fully traverse the landscape remains an open question. Some areas of the landscape may be sparsely populated and have isolated fitness peaks, while others may have large neutral networks that can be navigated by point mutation. Novel functionalities could potentially be acquired by re-folding or by recombination. The role of recombination in augmenting RNA functionality is an active area of research and it is still unclear to what extent it would have been a selective advantage at or near origins.

5. Conclusions

In vitro selection experiments show first and foremost that it is possible that functional nucleic acids can arise from random sequence libraries. Indeed, even simple sequence and structural motifs can prove to be robust binding species and catalysts, indicating that it may have been possible to transition from even the earliest self-replicators to a nascent, RNA-catalyzed metabolism.
Because of the diversity of aptamers and ribozymes that can be selected, it is possible to construct a ‘fossil record’ of the evolution of the RNA world, with in vitro selected catalysts filling in as doppelgangers for molecules long gone. In this way a plausible pathway from simple oligonucleotide replicators to genomic polymerases can be imagined, as can a pathway from basal ribozyme activities to the ribosome. Most importantly, though, in vitro selection experiments can give a true and quantitative idea of the likelihood that these scenarios could have played out in the RNA world (although into the future new methods for the synthesis of microarrays allow huge numbers of sequence variants to be prepared and assessed simultaneously, and such large screens are becoming increasingly viable alternatives to blind selection processes; Katilius et al., 2007). Simple binding species and catalysts could have evolved into other structures and functions. As replicating sequences grew longer, new, more complex functions or faster catalytic activities could have been accessed. Some activities may have been isolated in sequence space, but others could have been approached along large, interconnected neutral networks. As the number, type, and length of ribozymes increased, RNA genomes would have evolved and eventually there would have been no area in a fitness landscape that would have been inaccessible. Self-replication would have inexorably led to life. Interestingly, this field of research has application not only to understanding our origins, but also to numerous issues in biotechnology. Attempts to understand the origins of replication have led to the synthesis of orthogonal base-pairs that can augment the information content of DNA. Attempts to understand replication have led to novel amplification assays. 
Into the future, we suspect this cross-hybridization will continue, and that the first self-replicating nucleic acids will not only provide insights into what may have occurred billions of years ago, but will also form the basis for self-repairing nanotechnologies.

References

Anderson PC, Mecozzi S. Unusually short RNA sequences: design of a 13-mer RNA that selectively binds and recognizes theophylline. J Am Chem Soc 2005;127:5290–1.
Bartel DP, Szostak JW. Isolation of new ribozymes from a large pool of random sequences. Science 1993;261:1411–8.
Breaker RR, Joyce GF. A DNA enzyme that cleaves RNA. Chem Biol 1994;1:223–9.
Breaker RR, Joyce GF. Self-incorporation of coenzymes by ribozymes. J Mol Evol 1995;40(6):551–8.
Burgstaller P, Famulok M. Isolation of RNA aptamers for biological cofactors by in vitro selection. Angew Chem Int Ed Engl 1994;33:1084–7.
Burke DH, Gold L. RNA aptamers to the adenosine moiety of S-adenosyl methionine: structural inferences from variations on a theme and the reproducibility of SELEX. Nucleic Acids Res 1997;25:2020–4.
Burke DH, Hoffman DC. A novel acidophilic RNA motif that recognizes coenzyme A. Biochemistry 1998;37:4653–63.
Chen X, Li N, Ellington AD. Ribozyme catalysis of metabolism in the RNA world. Chem Biodivers 2007;4:633–55.
Chiuman W, Li Y. Revitalization of six abandoned catalytic DNA species reveals a common three-way junction framework and diverse catalytic cores. J Mol Biol 2006;357:748–54.
Coleman TM, Huang F. RNA-catalyzed thioester synthesis. Chem Biol 2002;9:1227–36.
Dieckmann T, Suzuki E, Nakamura GK, Feigon J. Solution structure of an ATP-binding RNA aptamer reveals a novel fold. RNA 1996;2:628–40.
Dirks RM, Pierce NA. An algorithm for computing nucleic acid base-pairing probabilities including pseudoknots. J Comput Chem 2004;25:1295–304.
Draper WE, Hayden EJ, Lehman N. Mechanisms of covalent self-assembly of the Azoarcus ribozyme from four fragment oligonucleotides. Nucleic Acids Res 2008;36:520–31.
Ekland EH, Szostak JW, Bartel DP. Structurally complex and highly active RNA ligases derived from random RNA sequences. Science 1995;269:364–70.
Ellington AD. RNA selection. Aptamers achieve the desired recognition. Curr Biol 1994;4:427–9.
Ellington AD, Robertson MP. Ribozyme selection. In: Barton D, Nakanishi K, Meth-Cohn O, editors. Comprehensive natural products chemistry, vol. 6. New York: Elsevier; 1999. p. 115–48.
Fujita Y, Furuta H, Ikawa Y. Construction of an artificial ribozyme which ligates an RNA fragment activated by nicotinamide mononucleotide. Nucleic Acids Symp Ser (Oxf) 2006;50:231–2.
Gilbert W. Evolution of antibodies. The road not taken. Nature 1986;320:485–6.
Hanczyc MM, Dorit RL. Replicability and recurrence in the experimental evolution of a group I ribozyme. Mol Biol Evol 2000;17:1050–60.
Hayden EJ, Lehman N. Self-assembly of a group I intron from inactive oligonucleotide fragments. Chem Biol 2006;13:909–18.
Held DM, Greathouse ST, Agrawal A, Burke DH. Evolutionary landscapes for the acquisition of new ligand recognition by RNA aptamers. J Mol Evol 2003;57:299–308.
Huang F, Yarus M. A calcium-metalloribozyme with autodecapping and pyrophosphatase activities. Biochemistry 1997a;36:14107–19.
Huang F, Yarus M. Versatile 5′ phosphoryl coupling of small and large molecules to an RNA. Proc Natl Acad Sci USA 1997b;94:8965–9.
Huang F, Bugg CW, Yarus M. RNA-Catalyzed CoA, NAD, and FAD synthesis from phosphopantetheine, NMN, and FMN. Biochemistry 2000;39:15548–55.
Huang F. Efficient incorporation of CoA, NAD and FAD into RNA by in vitro transcription. Nucleic Acids Res 2003;31:e8.
Illangasekare M, Sanchez G, Nickles T, Yarus M. Aminoacyl-RNA synthesis catalyzed by an RNA. Science 1995;267:643–7.
Illangasekare M, Yarus M. A tiny RNA that catalyzes both aminoacyl-RNA and peptidyl-RNA synthesis. RNA 1999;5:1482–9.
Jadhav VR, Yarus M. Acyl-CoAs from coenzyme ribozymes. Biochemistry 2002;41:723–9.
Jenison RD, Gill SC, Pardi A, Polisky B. High-resolution molecular discrimination by RNA. Science 1994;263:1425–9.
Jenne A, Famulok M. A novel ribozyme with ester transferase activity. Chem Biol 1998;5:23–34.
Jhaveri S, Hirao I, Bell SD, Uphoff K, Ellington AD. Landscapes for molecular evolution: lessons from in vitro selection experiments with nucleic acids. Comb Chem Mol Diver 1997;5:169–91.
Jiang F, Kumar RA, Jones RA, Patel DJ. Structural basis of RNA folding and recognition in an AMP-RNA aptamer complex. Nature 1996;382:183–6.
Katilius E, Flores C, Woodbury NW. Exploring the sequence space of a DNA aptamer using microarrays. Nucleic Acids Res 2007;35:7626–35.
Kang TJ, Suga H. In vitro selection of a 5′-purine nucleotide transferase ribozyme. Nucleic Acids Symp Ser (Oxf) 2007;50:379–80.
Kim DE, Joyce GF. Cross-catalytic replication of an RNA ligase ribozyme. Chem Biol 2004;11:1505–12.
Knight R, Yarus M. Finding specific RNA motifs: function in a zeptomole world? RNA 2003;9:218–30.
Koizumi M, Breaker RR. Molecular recognition of cAMP by an RNA aptamer. Biochemistry 2000;39:8983–92.
Kuhns ST, Joyce GF. Perfectly complementary nucleic acid enzymes. J Mol Evol 2003;56:711–7.
Kumar RK, Yarus M. RNA-catalyzed amino acid activation. Biochemistry 2001;40:6998–7004.
Lauhon CT, Szostak JW. RNA aptamers that bind flavin and nicotinamide redox cofactors. J Am Chem Soc 1995;117:1246–57.
Lee JF, Stovall GM, Ellington AD. Aptamer therapeutics advance. Curr Opin Chem Biol 2006;10:282–9.
Lehman N, Joyce GF. Evolution in vitro: analysis of a lineage of ribozymes. Curr Biol 1993;3:723–34.
Lehman N, Donne MD, West M, Dewey TG. The genotypic landscape during in vitro evolution of a catalytic RNA: implications for phenotypic buffering. J Mol Evol 2000;50:481–90.
Lehman N. Assessing the likelihood of recurrence during RNA evolution in vitro. Artif Life 2004;10:1–22.
Leontis NB, Westhof E. Geometric nomenclature and classification of RNA base pairs. RNA 2001;7:499–512.
Levy M, Ellington AD. The descent of polymerization. Nat Struct Biol 2001;8:580–2.
Levy M, Cater SF, Ellington AD. Quantum-dot aptamer beacons for the detection of proteins. Chembiochem 2005;6:2163–6.
Lilley DM. The origins of RNA catalysis in ribozymes. Trends Biochem Sci 2003;28:495–501.
Lohse PA, Szostak JW. Ribozyme-catalysed amino-acid transfer reactions. Nature 1996;381:442–4.
Lorsch JR, Szostak JW. In vitro evolution of new ribozymes with polynucleotide kinase activity. Nature 1994;371:31–6.
Marshall KA, Robertson MP, Ellington AD. A biopolymer by any other name would bind as well: a comparison of the ligand-binding pockets of nucleic acids and proteins. Structure 1997;5:729–34.
McManus SA, Li Y. Multiple occurrences of an efficient self-phosphorylating deoxyribozyme motif. Biochemistry 2007;46:2198–204.
McGinness KE, Joyce GF. RNA-catalyzed RNA ligation on an external RNA template. Chem Biol 2002;9:297–307.
Nehdi A, Perreault JP. Unbiased in vitro selection reveals the unique character of the self-cleaving antigenomic HDV RNA sequence. Nucleic Acids Res 2006;34:584–92.
Orgel LE. Prebiotic chemistry and the origin of the RNA world. Crit Rev Biochem Mol Biol 2004;39:99–123.
Paegel BM, Joyce GF. Darwinian evolution on a chip. PLoS Biol 2008;6:e85.
Robertson MP, Scott WG. The structural basis of ribozyme-catalyzed RNA assembly. Science 2007;315:1549–53.
Roychowdhury-Saha M, Lato SM, Shank ED, Burke DH. Flavin recognition by an RNA aptamer targeted toward FAD. Biochemistry 2002;41:2492–9.
Saito H, Kourouklis D, Suga H. An in vitro evolved precursor tRNA with aminoacylation activity. EMBO J 2001;20:1797–806.
Salehi-Ashtiani K, Szostak JW. In vitro evolution suggests multiple origins for the hammerhead ribozyme. Nature 2001;414:82–4.
Saran D, Nickens DG, Burke DH. A trans acting ribozyme that phosphorylates exogenous RNA. Biochemistry 2005;44:15007–16.
Sassanfar M, Szostak JW. An RNA motif that binds ATP. Nature 1993;364:550–3.
Sazani PL, Larralde R, Szostak JW. A small aptamer with strong and specific recognition of the triphosphate of ATP. J Am Chem Soc 2004;126:8370–1.
Schlosser K, Li Y. Tracing sequence diversity change of RNA-cleaving deoxyribozymes under increasing selection pressure during in vitro selection. Biochemistry 2004;43:9695–707.
Schlosser K, Li Y. Diverse evolutionary trajectories characterize a community of RNA-cleaving deoxyribozymes: a case study into the population dynamics of in vitro selection. J Mol Evol 2005;61:192–206.
Schmitt T, Lehman N. Non-unity molecular heritability demonstrated by continuous evolution in vitro. Chem Biol 1999;6:857–69.
Schultes EA, Bartel DP. One sequence, two ribozymes: implications for the emergence of new ribozyme folds. Science 2000;289:448–52.
Sievers D, von Kiedrowski G. Self-replication of complementary nucleotide-based oligomers. Nature 1994;369:221–4.
Stoltenburg R, Reinemann C, Strehlitz B. SELEX–a (r)evolutionary method to generate high-affinity nucleic acid ligands. Biomol Eng 2007;24:381–403.
Strobel SA, Cochrane JC. RNA catalysis: ribozymes, ribosomes, and riboswitches. Curr Opin Chem Biol 2007;11:636–43.
Sun L, Cui Z, Gottlieb RL, Zhang B. A selected ribozyme catalyzing diverse dipeptide synthesis. Chem Biol 2002;9:619–28.
Tang J, Breaker RR. Examination of the catalytic fitness of the hammerhead ribozyme by in vitro selection. RNA 1997;3:914–25.
Tang J, Breaker RR. Structural diversity of self-cleaving ribozymes. Proc Natl Acad Sci USA 2000;97:5784–9.
Tsang J, Joyce GF. Evolutionary optimization of the catalytic properties of a DNA-cleaving ribozyme. Biochemistry 1994;33:5966–73.
Tsukiji S, Pattnaik SB, Suga H. An alcohol dehydrogenase ribozyme. Nat Struct Biol 2003;10:713–7.
Tsukiji S, Pattnaik SB, Suga H. Reduction of an aldehyde by a NADH/Zn2+-dependent redox active ribozyme. J Am Chem Soc 2004;126:5044–5.
Wiegand TW, Janssen RC, Eaton BE. Selection of RNA amide synthases. Chem Biol 1997;4:675–83.
Woese CR, Winker S, Gutell RR. Architecture of ribosomal RNA: constraints on the sequence of “tetra-loops”. Proc Natl Acad Sci USA 1990;87:8467–71.
Wright MC, Joyce GF. Continuous in vitro evolution of catalytic function. Science 1997a;276:614–7.
Wright MC, Joyce GF. Continuous in vitro evolution of catalytic function. Science 1997b;276:546–7.
Zaher HS, Unrau PJ. Selection of an improved RNA polymerase ribozyme with superior extension and fidelity. RNA 2007;13:1017–26.
Zhang B, Cech TR. Peptide bond formation by in vitro selected ribozymes. Nature 1997;390:96–100.
Zielinski WS, Orgel LE. Autocatalytic synthesis of a tetranucleotide analogue. Nature 1987;327:346–7.
Zimmermann GR, Jenison RD, Wick CL, Simorre JP, Pardi A. Interlocking structural motifs mediate molecular discrimination by a theophylline-binding RNA. Nat Struct Biol 1997;4:644–9.