The International Journal of Biochemistry & Cell Biology 41 (2009) 254–265
journal homepage: www.elsevier.com/locate/biocel
Review
Evolutionary origins and directed evolution of RNA
Andrew D. Ellington ∗ , Xi Chen, Michael Robertson, Angel Syrett
Department of Chemistry and Biochemistry, Institute for Cell and Molecular Biology,
University of Texas at Austin, Austin, TX 78712, United States
Article info
Article history:
Available online 19 August 2008
Keywords:
Origins
In vitro selection
SELEX
Aptamer
Ribozyme
Polymerase
Translation
Fitness landscape
Abstract
In vitro selection experiments show first and foremost that functional nucleic acids can
arise from random sequence libraries. Indeed, even simple sequence and structural motifs can prove to
be robust binding species and catalysts, indicating that it may have been possible to transition from even
the earliest self-replicators to a nascent, RNA-catalyzed metabolism. Because of the diversity of aptamers
and ribozymes that can be selected, it is possible to construct a ‘fossil record’ of the evolution of the RNA
world, with in vitro selected catalysts filling in as doppelgangers for molecules long gone. In this way a
plausible pathway from simple oligonucleotide replicators to genomic polymerases can be imagined, as
can a pathway from basal ribozyme activities to the ribosome. Most importantly, though, in vitro selection
experiments can give a true and quantitative idea of the likelihood that these scenarios could have played
out in the RNA world. Simple binding species and catalysts could have evolved into other structures and
functions. As replicating sequences grew longer, new, more complex functions or faster catalytic activities
could have been accessed. Some activities may have been isolated in sequence space, but others could
have been approached along large, interconnected neutral networks. As the number, type, and length
of ribozymes increased, RNA genomes would have evolved and eventually there would have been no
area in a fitness landscape that would have been inaccessible. Self-replication would have inexorably led
to life.
© 2008 Elsevier Ltd. All rights reserved.
Contents
1. Background on the RNA world hypothesis and in vitro selection
2. What in vitro selection tells us about functional RNAs
   2.1. Binding species
   2.2. Catalysts
   2.3. Cofactors
   2.4. Structural complexity
3. Doppelgangers for recapitulating the RNA world
   3.1. Recapitulation of the evolution of self-replication
   3.2. Recapitulation of the development of translation
4. What in vitro selection reveals about fitness landscapes
   4.1. Tyranny of short motifs
   4.2. Diversity in sequence space
   4.3. Acquisition of new functions
5. Conclusions
References
∗ Corresponding author. Tel.: +1 512 471 6445; fax: +1 512 4717014.
E-mail address: [email protected] (A.D. Ellington).
1357-2725/$ – see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.biocel.2008.08.015
A.D. Ellington et al. / The International Journal of Biochemistry & Cell Biology 41 (2009) 254–265
1. Background on the RNA world hypothesis and in vitro
selection
It is generally accepted that life on Earth underwent a phase
in which nucleic acids rather than proteins were the primary functional biopolymers in a cell, the so-called RNA world (Gilbert, 1986).
It is presumed that at some point in the evolution of the RNA world
template-directed polymerization arose, allowing the transfer of
genetic information between generations of replicators. This would
have in turn engendered the cycle of Darwinian evolution that is
still playing out today. Assuming that the RNA world immediately
preceded the advent of translation and the evolution of the modern cell in which proteins are the functional biopolymers, it seems
likely that the RNA world contained a large number of pathways
and catalysts that would have preceded or mimicked those we see
in modern metabolism.
The process by which ancient catalysts may have arisen can
be roughly reproduced in the laboratory. Random sequence RNA
pools can be generated by a combination of chemical and enzymatic synthesis, and molecules that perform a particular function
(binding, catalysis) can be sieved from the pools. Selected binding
species (aptamers) or catalysts (ribozymes) can then be amplified
by a combination of reverse transcription, PCR, and in vitro transcription. After multiple cycles of selection and amplification that
mimic natural selection, the aptamers and ribozymes generated by
directed evolution are typically highly fit molecules. In this regard,
they can be considered to be doppelgangers of the molecules that
may have existed in the early RNA world. By studying the properties of these selected RNAs, we may also garner some insight into
the nature and propensities of the RNA world.
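The selection and amplification cycle described above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not a model of any actual experiment: the pool size, sequence length, mutation rate, the arbitrary target motif, and the scoring rule (fitness as the best partial match to that motif) are all hypothetical choices made for the example.

```python
import random

BASES = "ACGU"

def random_pool(size, length, rng):
    """Generate random RNA sequences (stand-in for a synthesized random pool)."""
    return ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(size)]

def score(seq, motif):
    """Hypothetical fitness: best count of matching positions of motif in seq."""
    best = 0
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        matches = sum(1 for a, b in zip(window, motif) if a == b)
        best = max(best, matches)
    return best

def mutate(seq, rate, rng):
    """Error-prone amplification: redraw each base with probability `rate`."""
    return "".join(rng.choice(BASES) if rng.random() < rate else b for b in seq)

def selex(pool, motif, rounds, keep_fraction, rate, rng):
    """Alternate selection (keep top scorers) and amplification (copy with errors)."""
    size = len(pool)
    history = []
    for _ in range(rounds):
        ranked = sorted(pool, key=lambda s: score(s, motif), reverse=True)
        survivors = ranked[:max(1, int(size * keep_fraction))]
        pool = [mutate(rng.choice(survivors), rate, rng) for _ in range(size)]
        history.append(sum(score(s, motif) for s in pool) / size)
    return pool, history

if __name__ == "__main__":
    rng = random.Random(0)
    pool = random_pool(size=500, length=30, rng=rng)
    # "GUACGUAC" is an arbitrary illustrative motif, not a real aptamer sequence.
    _, history = selex(pool, "GUACGUAC", rounds=8, keep_fraction=0.1, rate=0.01, rng=rng)
    print("mean score per round:", [round(h, 2) for h in history])
```

In this sketch the mean score of the pool rises over successive rounds, which is the qualitative behavior expected of iterated selection and amplification; real selections partition molecules by binding or catalysis rather than by a sequence score.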
2. What in vitro selection tells us about functional RNAs
2.1. Binding species
Aptamers have now been selected against a wide variety of target molecules, including ions, metabolites, and proteins (reviewed
in Stoltenburg et al., 2007). Aptamers typically bind their targets
with Kd values in the low nanomolar to micromolar range, similar
to the binding constants of many natural protein binding species
and catalysts. Aptamers can discriminate between closely related ligands as
well as monoclonal antibodies can. For example, aptamers selected to
bind to theophylline can discriminate against caffeine, which differs
by a single methyl group, by 10,000-fold, while aptamers selected
to bind to ATP can discriminate against dATP by a similar value
(reviewed in Ellington, 1994). Thus, it seems likely that binding
pockets could have readily formed in the RNA world, and would
have functioned in a manner similar to those found in modern,
protein-based life.
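The dissociation constants and discrimination ratios quoted above can be translated into fractional occupancy and free-energy differences using two standard relations: for simple 1:1 binding with ligand in excess, the fraction of aptamer bound is [L]/(Kd + [L]), and a difference in affinity corresponds to a free-energy difference of RT ln(ratio). The sketch below assumes a temperature of 298.15 K and uses an illustrative Kd of 1 µM and ligand concentration of 10 µM; neither value is taken from a specific aptamer.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def fraction_bound(ligand_conc, kd):
    """Fraction of aptamer occupied for 1:1 binding with ligand in excess."""
    return ligand_conc / (kd + ligand_conc)

def discrimination_energy(fold, temperature_k=298.15):
    """Free-energy difference (kcal/mol) corresponding to a fold change in Kd."""
    return R_KCAL * temperature_k * math.log(fold)

if __name__ == "__main__":
    kd_target = 1e-6               # assumed Kd for the cognate ligand (1 uM)
    kd_analog = kd_target * 1e4    # 10,000-fold weaker binding of the analog
    ligand = 1e-5                  # assumed ligand concentration (10 uM)
    print("fraction bound, target:", fraction_bound(ligand, kd_target))
    print("fraction bound, analog:", fraction_bound(ligand, kd_analog))
    print("ddG for 10,000-fold (kcal/mol):", discrimination_energy(1e4))
```

Under these assumptions a 10,000-fold discrimination corresponds to roughly 5.5 kcal/mol, on the order of a few hydrogen bonds or a steric clash.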
2.2. Catalysts
Of course, binding pockets in RNA molecules would have been useful primarily for promoting substrate binding and thus
assisting in catalytic transformations. Just as binding species can
be selected from random sequence populations, catalysts can be
directly selected as well. Rather than selecting for partitioning
between binding and non-binding species, selections for ribozymes
typically involve selecting for single turnover reactions that modify
the catalytic species itself (Chen et al., 2007). For example, ribozyme
ligases that can append a primer sequence to themselves have been
selected (Bartel and Szostak, 1993), as have ribozyme cleavases that
cut themselves away from solid supports (reviewed in Breaker and
Joyce, 1994). Variations on this theme have led to the selection of
alkyl transferases, kinases, phosphatases, esterases, and ribozymes
capable of carbon–carbon bond synthesis (reviewed in Ellington
and Robertson, 1999; Lilley, 2003; Strobel and Cochrane, 2007).
Many of the basic reactions that support modern metabolism
have already been shown to be catalyzed by selected ribozymes,
again supporting the notion that such ribozymes could have also
been present in an RNA world. As examples, below we show how both
self-replication and translation might have arisen in an RNA world,
based on the results of selection experiments. However, there is a
distinct difference between the results of selections that generate
aptamers and selections that generate ribozymes. Irrespective of
the stringency of the selection, ribozymes always prove at least
roughly 1000-fold slower than their protein counterparts.
2.3. Cofactors
Beyond binding substrates, early ribozymes could have also
bound cofactors to augment their catalytic activities. For example, a
number of aptamers have now been isolated that bind to adenosine
or its analogues. The first anti-ATP aptamer isolated by Sassanfar
and Szostak (1993) contains an asymmetric internal loop flanked
by two double-strand RNA regions and has a binding constant of
around 10 µM. The three-dimensional structure of this aptamer has
been determined (Dieckmann et al., 1996; Jiang et al., 1996). The
structure forms a GNRA tetraloop-like motif with AMP contributing
the A; binding primarily involves the nucleobase. Aptamers later
isolated against NAD+ (Burgstaller and Famulok, 1994) and SAM
(S-adenosyl methionine) (Burke and Gold, 1997) share the same
sequence and structural motif for adenine binding, just as many
different ATP-binding proteins share the Rossmann fold. Indeed, a
closer structural examination reveals many similarities between
the ways in which RNA molecules bind adenosine and proteins bind
adenosine (Marshall et al., 1997). In each case, hydrogen bonding
and stacking are the two dominant factors for affinity and discrimination.
While there is a simple and common motif for adenine binding,
other sequences and structures are also possible (just as proteins
can interact with adenosine in ways other than
the Rossmann fold). An anti-CoA aptamer isolated by Burke and
Hoffman (1998) also primarily recognizes adenosine, but the sequence responsible for binding differs from that of the anti-ATP aptamer.
Even though both aptamers bind adenosine, the anti-ATP aptamer
does not recognize CoA, likely because this aptamer cannot accommodate the 3′ phosphate of CoA. Aptamers have also been isolated
that specifically target adenine (as opposed to adenosine), and
these again have a different sequence and structural motif than the
anti-ATP aptamer. Aptamers selected to bind cAMP (Koizumi and
Breaker, 2000) or to the triphosphate of ATP (Sazani et al., 2004) also
have sequence and structural motifs that differ from those of the
anti-ATP aptamer. While it might be supposed that these sequence
differences reflect specific contacts with the sugar or phosphate, in
fact the anti-cAMP aptamer seems to primarily recognize adenine
but forms a binding pocket that is sterically compatible with the
cyclic phosphate. In contrast, the anti-triphosphate aptamer truly
does recognize the beta and gamma phosphates, and can also bind
to GTP, UTP, and CTP.
While adenine is clearly a preferred target, other cofactor
chemistries are also highly compatible with nucleic acid binding pockets. Anti-FMN and anti-FAD aptamers were first isolated
by Burgstaller and Famulok (1994), while Lauhon and Szostak
(1995) later isolated additional flavin-binding aptamers. An additional class of anti-FAD aptamers was isolated by the Burke lab
(Roychowdhury-Saha et al., 2002). These aptamers all recognize
riboflavin with Kd values ranging from 0.5 µM to 50 µM but have
different sequences and adopt different secondary and tertiary
structures: the anti-FMN and anti-FAD aptamers found by
the Famulok lab have simple loop structures, the anti-riboflavin
aptamers isolated by the Szostak lab fold into intramolecular
G-quartets, and the anti-FAD aptamer isolated by the Burke lab contains three helices interrupted by several internal bulges. While the
first anti-NAD aptamer also turned out to be an adenosine binder,
as mentioned above, anti-NMN aptamers were subsequently identified (Lauhon and Szostak, 1995). The anti-NMN aptamers were
able to discriminate between NAD and NADH in solution by over
an order of magnitude (Kd values of 2.5 µM versus 37 µM). This
result is encouraging, in that it implies that ribozymes using a
similar domain might assist hydride acceptance or transfer during
cofactor-mediated ribozyme catalysis.
Fig. 1. Evolution from ATP-binding aptamer to ribozyme kinase. (a) Consensus sequence and rough 3D representation of the anti-ATP aptamer (see also Fig. 2a). The
‘GAA’ tri-nucleotide that forms a GAAA tetraloop-like structure with AMP is shown in purple as in Fig. 2a. Watson–Crick base-pairs are shown as green lines, whereas the
non-Watson–Crick base-pair is shown as in Leontis and Westhof (2001). Residues known to be conserved amongst different variants are shown in upper case. (b) Putative ATP-binding pockets in 4 classes of selected kinase ribozymes. Residues that vary from the conserved residues in the anti-ATP aptamer are highlighted in gray. (For interpretation
of the references to color in this figure legend, the reader is referred to the web version of the article.)
Given that ribozymes used nucleotide cofactors to invent
metabolism, one interesting question is how they used their
cofactors. There are two basic choices: non-covalent
binding or covalent attachment. In order to assess these mechanistic possibilities, directed evolution has been used to reinvent RNA
doppelgangers of protein-catalyzed reactions. Both mechanisms
have now been successfully demonstrated.
Aptamer domains have been cobbled to ribozymes in hopes
of generating non-covalent cofactor-binding domains. For example, Lorsch and Szostak (1994) attempted to select ribozymes that
utilized ATP-γS as a substrate and that could phosphorylate themselves. Initially, the anti-adenosine aptamer motif described above
(Figs. 1 and 2a) was embedded in the random pool to increase
the chance of identifying self-kinases that could utilize ATP. The
transfer of the gamma thiophosphate allowed capture of any self-kinases
on a sulfhydryl affinity column. Among the resultant 7 classes
of selected ribozymes, 5 classes acted as 5′ kinases while the
other 2 classes catalyzed the phosphorylation of an internal 2′
hydroxyl. More recently, Burke and his co-workers have obtained
similar results, again selecting ribozymes that could predominantly phosphorylate themselves on 2′ hydroxyls (Saran et al.,
2005). When the sequences selected by Lorsch were analyzed, the
anti-ATP aptamer motif was found to be conserved in only 2 classes
(Class IV and Class V in Fig. 1b), and only one of these had detectable
ATP affinity. In other words, the selected ribozymes largely lost the
pre-formed ability to bind ATP in the course of becoming functional kinases.
This may be consistent with what is known about the relative
robustness of functional sequence landscapes, described below.
Redox ribozymes that utilize NAD+ as a cofactor have also been
selected (Tsukiji et al., 2003). In this selection scheme, RNA transcription was initiated by guanosine-5′-monophosphorothioate
(GMPS) and the thiol group was then used as a handle for the conjugation of a benzyl alcohol derivative. The conjugated pool was
incubated with biotin-hydrazide in the presence of NAD+. If the
alcohol group was oxidized by NAD+, the resultant aldehyde could
be coupled to biotin and the self-tagging catalysts separated by
streptavidin capture. Based on the estimated background rate, the
selected ribozyme achieves a rate enhancement of >10^7-fold. Subsequently, the same ribozyme was shown to catalyze the reverse
reaction – reduction of benzyl aldehyde by NADH (Tsukiji et al.,
2004). Moreover, the oxidation of alcohol still progressed when
the reaction was supplemented with NADH and FAD, relying on
the uncatalyzed hydride transfer from NADH to FAD. These results
suggest that redox couples would have easily arisen in the earliest
metabolic pathways.
Cofactors can also serve as covalent carriers of functional
groups. Acyl-CoA synthesizing ribozymes, which convert high-energy adenylate intermediates to slightly less energetic thioesters,
were independently selected by two groups, using slightly different methods (Jadhav and Yarus, 2002; Coleman and Huang,
2002). In both selections, activated biotin (biotin-AMP) was the
acyl source and served as an affinity tag to capture self-biotinylated
ribozymes. In the selection done by the Yarus group, CoA was incorporated at the 5′ termini of a pool of RNA using a trans-acting
version of the previously described capping enzyme and the 3′
phosphate of the AMP portion of CoA as a nucleophile. The most
abundant sequence from the selection, acs1, catalyzed the CoA acylation reaction. Researchers in the Huang lab instead utilized a
3′-dephosphorylated CoA as the initiating nucleotide for transcription (Huang, 2003). It is possible that appended functional groups
not only augmented catalysis, but may also have participated in
replication, as described below.
Fig. 2. Structural complexity of aptamers. (a) The binding pocket for AMP (shown in green) in an anti-ATP aptamer. The ‘GAA’ tri-nucleotide (shown in purple) and its
interaction with bound AMP resemble a stable GAAA tetraloop structural motif (b). (c) A selected anti-theophylline aptamer. Theophylline is shown in green. Residues making
contacts with theophylline are shown in purple. (d) An expanded view of the binding pocket for theophylline. (Structures are from PDB ID: 1RAW for (a), 1ZIF for (b) and
1O15 for (c)).
2.4. Structural complexity
The steady accumulation of in vitro selected aptamers and
ribozymes has allowed a deeper understanding of how RNA
achieves its functionality through the study of their three-dimensional structures. Because of the relative scarcity of naturally
occurring examples of functional RNAs, in vitro selection has been
an invaluable source of candidates for structural analysis. Aside
from a limited sample size, the examples of naturally occurring
functional RNAs often present formidable technical challenges
for structural studies. For example, the ribosome and the spliceosome are enormous multi-component molecular machines that are
difficult to isolate in the homogeneous quantities necessary for structural techniques such as X-ray crystallography. Conversely, in vitro
selected RNAs are typically much smaller in size and are amenable
to preparation and manipulation in vitro using the same techniques
by which they were created in the first place. As a consequence,
the contributions of aptamers and ribozymes to the structural biology of RNA have been at least as important as those of natural
RNAs.
Much like proteins, structurally complex RNAs form their global
structures from secondary structural elements – A-form double
helices – assembled into distinct tertiary structures through an
assortment of smaller motifs and special interactions. These simple
structural elements, such as pseudoknots, tetraloops, and uridine
turns, among others, often represent the simplest, most energetically efficient solution to a particular structural requirement and
are found time and again in diverse RNAs irrespective of their
source. Different arrangements of smaller, local motifs within larger
molecules lead to a virtually unlimited palette of global structures
and functionalities. For example, GNRA tetraloops are commonly
observed in large RNAs such as rRNA but also occur in numerous
structured RNAs, including those created with in vitro selection.
They use a compact and efficient combination of base stacking and
hydrogen bonding to reverse the directionality of an RNA chain in
base-paired helices that terminate in stem-loop structures. As an
example of their frequency, the ∼1500 nucleotide E. coli 16S rRNA
contains 9 GNRA tetraloops (Woese et al., 1990). This distinct structural motif may be further utilized as a handle to form long-range
tertiary contacts with a partner motif called a tetraloop receptor or
may be recognized by a separate protein binding partner. As mentioned previously, the ATP aptamer uses a GNRA-like motif to bind
its ligand by docking the ATP into the “A” position of the motif,
albeit with the ATP in a slightly modified configuration (Jiang et al.,
1996) (Fig. 2a and b). This general mode of ligand binding – either
using the ligand to complete or fit into a common motif or to tie
different structural elements together – is a common theme.
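The GNRA consensus discussed above (G, any nucleotide, a purine, then A) is simple enough to scan for directly. The sketch below is a hypothetical helper, not a structure-prediction method: a sequence match only flags a candidate, since a genuine tetraloop must cap a base-paired stem. As a crude stand-in for that requirement, the second function keeps only hits whose immediately flanking nucleotides could form a Watson–Crick or G:U pair; this filter is an assumption made for illustration.

```python
import re

# Lookahead so that overlapping candidates (e.g. "GGAAA") are all reported.
GNRA = re.compile(r"(?=(G[ACGU][AG]A))")

# Canonical and wobble pairs allowed to close the loop (illustrative filter).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def normalize(seq):
    """Upper-case the sequence and treat DNA-style T as U."""
    return seq.upper().replace("T", "U")

def find_gnra(seq):
    """Return (start, loop) for every GNRA sequence match, 0-based."""
    rna = normalize(seq)
    return [(m.start(), m.group(1)) for m in GNRA.finditer(rna)]

def candidate_tetraloops(seq):
    """Keep GNRA matches whose flanking bases could pair to close a stem."""
    rna = normalize(seq)
    hits = []
    for start, loop in find_gnra(rna):
        left, right = start - 1, start + 4
        if left >= 0 and right < len(rna) and (rna[left], rna[right]) in PAIRS:
            hits.append((start, loop))
    return hits

if __name__ == "__main__":
    example = "GGCGCAAGCC"  # made-up hairpin-like sequence for illustration
    print(find_gnra(example))
    print(candidate_tetraloops(example))
```

A scan of this kind will overcount relative to structurally verified tetraloops, so it is best read as a quick way to locate candidates for closer inspection.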
The theophylline aptamer is an exquisite example of the type
of structural precision that RNA can achieve in a small efficient
structure (Jenison et al., 1994; Zimmermann et al., 1997). The 33
nucleotide aptamer binds theophylline with a Kd of ∼400 nM but
can discriminate against caffeine, which differs by the presence of
a single additional methyl group at the N7 position, by a factor of
10,000. This remarkable specificity is achieved primarily by a
network of hydrogen bonds in which the ligand completes a base-triple interaction. The presence of the additional methyl group on
caffeine disrupts one of the hydrogen bonds and causes a steric
clash in the binding pocket that greatly diminishes the binding
affinity. The binding pocket is further stabilized with extensive
base stacking as the ligand-containing base-triple is sandwiched
above and below by two other base-triples (Fig. 2c). A closer view
(Fig. 2d) shows how theophylline fits into this well-ordered binding pocket. With knowledge of the three-dimensional
structure, this small aptamer has been further reduced in size to
a minimal 13 nucleotide construct that preserves all of the essential interactions with the ligand and retains discrimination against
caffeine (Anderson and Mecozzi, 2005).
Fig. 3. Crystal structure of the L1 ribozyme ligase. (a) Overall fold of the L1 ligase. U71 and G1 span the ligation junction and are shown in red. A reverse Watson–Crick
base-pair formed by A51 and U38 mediates inter-helical contacts and is shown in purple. A conserved 5 nucleotide motif that serves as the hinge region is shown in yellow.
(b) Catalytic core of the L1 ligase. U71-p-G1 and the A51:U38 base-pair are shown as in (a). The backbone phosphates of G52 (pairing with U71), A39 and G40 coordinate the
active-site Mg2+ (shown as a cyan sphere) and are shown in green. (c) Top view of the G1:A51:U38 base-triple in the catalytic core. (PDB ID: 2OIU).
RNA catalysis, in particular RNA-catalyzed RNA replication, is
central to ideas of how life on Earth was able to evolve to its current complexity. Since examples of RNA replicases are unknown in
nature, in vitro selection has been used to assess whether ribozymes
are capable of this activity. RNA ligase ribozymes that catalyze
the bond-forming nucleophilic attack of a 3′ hydroxyl on the 5′-triphosphate of a separate RNA oligonucleotide – the identical
chemistry performed by modern RNA polymerase enzymes – were
first isolated using in vitro selection from a large population of
random sequence RNAs (Bartel and Szostak, 1993). Subsequently,
the RNA ligase selection has been successfully repeated by several investigators under various conditions with different starting
populations indicating that this activity is relatively abundant in
RNA sequence space. With the notable exception of the Bartel
Class I ligase (see also below), most of these selected ligases are
remarkably simple in terms of their secondary structures and information content. The X-ray crystal structure of one of these smaller
ligases, the L1 ligase (Fig. 3a), confirms that only a handful of “special” residues coupled with an arrangement of simple helices and
tetraloops are required to form the catalytically active structure
(Robertson and Scott, 2007). The majority of the structure, greater
than 75%, is involved in Watson–Crick base-paired helices and ordinary tetraloops, and only 7 residues form the heart of the tertiary
interaction that creates a catalytic pocket to position a magnesium
ion at the ligation site by the juxtaposition of two stem elements
(Fig. 3b and c). A few other residues in the hinge region (shown in
yellow in Fig. 3a) appear to play a role in further stabilizing the
active conformation, but overall this simple structure is formed
with the efficient use of relatively few residues directing the
arrangement of standard structural elements into a compact fold.
This observation is also relevant to a general appreciation of how
the structures of selected RNA molecules impact our thoughts about
an RNA world: selections can generate simple, functional sequences,
these simple, functional sequences fold into compact structures,
and the compact structures can potentially be appended to one
another to make catalysts of ever-increasing complexity.
3. Doppelgangers for recapitulating the RNA world
3.1. Recapitulation of the evolution of self-replication
An amazing body of work shows that there are plausible prebiotic routes to nucleosides, nucleotides, and oligonucleotides
(reviewed in Orgel, 2004). The non-directed assembly of oligonucleotides would have provided fodder for self-replicators to arise,
and these could have further elaborated themselves through
the well-known cycle of Darwinian evolution. While mechanistic descriptions of many steps in this scenario remain unknown,
we have previously postulated what we believe is a plausible
pathway from prebiotic chemistry to the origin and elaboration of self-replication (Levy and Ellington, 2001; Fig. 4). In this
scenario, prebiotic oligonucleotides would have served as the
basis for template-directed ligation and amplification, similar to
the parabolic replicators previously demonstrated by both von
Kiedrowski and Orgel (Zielinski and Orgel, 1987; Sievers and von
Kiedrowski, 1994).
Fig. 4. The march of progress through the RNA world. This is a hypothetical representation of different stages of catalytic complexity leading from simple replicators through
simple ribozymes to complex ribozymes and ultimately to the takeover of the RNA world by DNA genomes and protein machines. At many of these stages, in vitro selection
experiments have provided potential doppelgangers of the molecules that may have once existed. This diagram originally appeared in Levy and Ellington (2001).
Such primitive replication cycles would have served as the
fodder for creating ever longer oligonucleotides, which would in
turn have been the basis for selecting for more complex catalysts
that were capable of more efficient self-replication. For example,
beyond accelerating ligation via simple templating, ligase ribozymes
may have greatly accelerated the catalysis of phosphodiester bond formation. A number of different ligase
ribozymes have now been selected from random sequence pools.
The first and perhaps still the most robust of these ligases were
selected by Bartel and Szostak (1993) from a pool that encompassed 220 random sequence positions. A wide variety of ligase
ribozymes were identified, including some motifs that were found
more than once, indicating that some motifs may be
common enough in sequence space that they would have
been found in an ancient RNA world as well as in modern selection
experiments.
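The scale of the sequence space behind a 220-position random pool, and the reason short motifs can still recur, can be illustrated with a back-of-the-envelope calculation. The sketch below makes two assumptions that are not taken from the text: a pool of 10^15 molecules, and independence between overlapping windows when estimating how often a specific short motif appears in a random sequence. The motif length of 10 is likewise only an example.

```python
import math

def log10_sequence_space(n_positions, alphabet=4):
    """log10 of the number of distinct sequences with n random positions."""
    return n_positions * math.log10(alphabet)

def log10_pool_coverage(pool_size, n_positions, alphabet=4):
    """log10 of the fraction of sequence space sampled by the pool (upper bound)."""
    return math.log10(pool_size) - log10_sequence_space(n_positions, alphabet)

def motif_probability(motif_len, seq_len, alphabet=4):
    """Approximate chance that a fixed motif occurs at least once in a random
    sequence, treating each window as independent (an approximation)."""
    windows = seq_len - motif_len + 1
    if windows <= 0:
        return 0.0
    p_window = float(alphabet) ** (-motif_len)
    return -math.expm1(windows * math.log1p(-p_window))

if __name__ == "__main__":
    pool_size = 1e15  # assumed pool size, for illustration only
    print("log10(sequence space, 220 nt):", log10_sequence_space(220))
    print("log10(fraction sampled):", log10_pool_coverage(pool_size, 220))
    p = motif_probability(motif_len=10, seq_len=220)
    print("P(specific 10-mer in one sequence):", p)
    print("expected pool molecules carrying it:", p * pool_size)
```

Under these assumptions the pool samples a vanishing fraction of all 220-mers (4^220 is roughly 10^132), yet any particular short motif is expected to be present in a very large number of molecules, which is consistent with the same motifs being recovered more than once.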
While it may have been possible to select for the catalysis of ligation, such ligases would not necessarily have provided a fitness
benefit to a nascent replicator on their own. In order to couple
ligation with fitness a given ligase would have had to somehow
reform itself. This initially seems like an exceedingly implausible notion, given that it is difficult to imagine a ribozyme whose
complement was also catalytically active. However, Joyce and coworkers showed that cross-catalytic ligation of oligonucleotides
was indeed possible. In their scheme, a selected ribozyme ligase
was broken into two pieces, and engineered so that one ribozyme
could act upon the complementary junction of another, leading
to its activation (Kim and Joyce, 2004; Fig. 5a). Mixing the four
ribozyme substrates led to the formation of new catalysts, but
their accumulation was sub-exponential. The autocatalytic feedback cycle was partially inhibited by product binding and by the
formation of alternative products through promiscuous reactions.
A similar demonstration has been carried out by the Lehman lab,
based on dividing the Group I self-splicing intron into four pieces
and then catalyzing reassembly by ribozyme-mediated ligation
(Hayden and Lehman, 2006). Interestingly, a functional ribozyme
can non-covalently assemble based on base-pairing interactions,
and then catalyze the assembly of covalently joined ribozymes.
While there were particular sequence requirements for the ligation junctions, these requirements could be relaxed, resulting in
the formation of functional ribozymes from a larger possible set of
initial sequences (Draper et al., 2008).
In addition to being limited by kinetics, these or similar systems
would ultimately have been limited by the fact that the ribozyme
templates were not perfectly complementary to one another; the
catalytic cores of the ribozymes fell outside the region of complementarity. To address this limitation, Kuhns and Joyce (2003) have
shown that perfectly complementary nucleic acid enzymes can be
engineered, although they were not the same nucleic acid enzymes
with the same ligase activity. How two ribozymes might act upon
one another to engender full cross-catalytic replication is still a
mystery.
It seems likely that at some point during the evolution of organisms a ribozyme polymerase would have arisen that could catalyze
the reproduction of templates, rather than just of itself. In the
original selection carried out by Bartel and Szostak there was one
ligase that was unique, and that was also very fast and structurally
complex, the Bartel Class I ligase. This ribozyme was further engineered to function as a modest polymerase, capable of adding up
to 6 nucleotides in a 4-day incubation. While the first engineered
polymerase acted in cis, an additional variant was engineered and
selected to act in trans, and could extend an exogenous template by
up to 14 nucleotides in 1 day.
Fig. 5. Continuous amplifications of ligase ribozymes without (a) and with (b) proteinaceous enzymes. (a) Cross-catalytic replication of the R3C ligase, after Kim and Joyce
(2004). A′ and B′ are half ribozyme substrates that can pair with a full ribozyme T. Ligation leads to the production of a complementary template, T′, that can in turn align the
half ribozyme substrates A and B. Interactions between T and T′ are a dead end complex. (b) Scheme for the continuous evolution of the Bartel Class I ligase by Wright and
Joyce (1997a). Ligation of a chimeric RNA:DNA oligonucleotide followed by reverse transcription leads to the production of a template for T7 RNA polymerase, which can in
turn be transcribed to recreate the original ribozyme.
Unfortunately, the amount of information
that had to be added to the Bartel ligase in order to bind a
trans template was much larger than 14 nucleotides, implying that
the more complex ribozyme was even less capable of eventually
becoming a self-replicator. However, McGinness and Joyce (2002)
were able to adapt a different, much smaller ribozyme ligase (the
hc ligase, which was derived by selection from the Group I self-splicing intron) to act on a separable helix. Further evolution of the
hc ligase led to variants that were capable of catalyzing the joining of adjacent oligonucleotide substrates on an external template
with few sequence restrictions. This work sets the stage for the evolution of ligases that can reproduce themselves by oligonucleotide
polymerization.
In an additional attempt to recapitulate the dynamics of how
early replicators may have evolved, Wright and Joyce (1997b) further evolved the Bartel ligase for greater catalytic efficiency. An
adaptation to the ligase was introduced in which a functional
promoter was formed only upon ligation of the oligonucleotide substrate to the ribozyme (Fig. 5b). This allowed enzymatic function
and sequence amplification to be coupled at the same time and in
the same test tube. In this regard, evolution now occurred almost
continuously, just as in the wild, at least until the test tube ran
out of ‘food’ (substrates) for ribozyme replication. To circumvent
this problem, Wright merely had to regularly transfer a portion of
the reaction to a new food source. While the adaptations necessary for continuous evolution severely depressed the reactivity of
the ribozyme, over generations of continuous evolution the Bartel ligase accumulated multiple mutations and became almost as
efficient as its parent (kcat/Km of 1 × 10⁷ /min, an improvement of
>10⁴-fold). Continuous evolution is more akin to what normally
occurs in biology than what happens during in vitro selection, and
while some mutations were likely fixed from an initial, heavily
mutagenized population, others clearly arose during the experiment itself.
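The serial-transfer regime described above can be illustrated with a toy calculation. This is only a sketch and not a model of the published experiment: the starting population, the growth factor per round and the fraction transferred are hypothetical values, chosen to show how repeated growth and dilution keep the population replicating while the cumulative amplification compounds.

```python
def serial_transfer(n0, growth_per_round, transfer_fraction, rounds):
    """Toy model of continuous evolution by serial transfer.

    In each round the population grows by `growth_per_round` (as it would
    until the substrate 'food' is consumed), and then a fraction of the
    reaction is carried into fresh substrate. All numbers are illustrative.
    """
    population = n0
    cumulative_amplification = 1.0
    history = []
    for _ in range(rounds):
        population *= growth_per_round            # replication in the tube
        cumulative_amplification *= growth_per_round
        population *= transfer_fraction           # transfer a portion to new food
        history.append(population)
    return history, cumulative_amplification


# Hypothetical parameters: 1000-fold growth, then a 1/1000 transfer, five times.
history, total = serial_transfer(
    n0=1e9, growth_per_round=1000.0, transfer_fraction=0.001, rounds=5
)
print("population after each transfer:", history)
print("cumulative amplification:", total)
```

With matched growth and dilution the population size stays roughly constant from tube to tube, yet the lineage as a whole has been amplified by the product of all the growth factors, which is what gives selection its opportunity to act.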
Building on these results, Paegel and Joyce (2008) have
described a machine that can continuously evolve the ligase. In
this microfluidic device new food is fed to the rampantly replicating population not by hand, but by a series of valves controlled by a
computer. As ligated ribozymes accumulate they intercalate a dye,
thiazole orange, in the reaction mixture, and this in turn gives off
a fluorescent signal that can be seen by embedded sensors. Once
fluorescence reaches a given level new food flows, the ribozymes
(and fluorescence) are diluted, and replication continues until fluorescence again builds and the food gates are again opened. In these
experiments the ribozymes were again under selection for speed,
but also for their ability to hold onto their oligonucleotide substrate.
The Km of the starting ribozyme was 35 µM, and substrate concentration was decreased from a limiting 1 µM to as low as 0.05 µM
over the course of the continuous evolution experiment. As before
with the manual regime the ribozyme responded, fixing multiple
mutations that resulted in an improvement in its Km to 0.4 µM.
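To see why a lower Km matters at these substrate concentrations, the numbers quoted above can be put through the simple Michaelis–Menten saturation term. The sketch below assumes ideal single-substrate Michaelis–Menten behaviour and an unchanged kcat; both are simplifying assumptions made for illustration and are not taken from the experiment itself. Only the Km values and substrate concentrations come from the text.

```python
def fraction_of_vmax(substrate_um, km_um):
    """Return v/Vmax = [S] / (Km + [S]) under simple Michaelis-Menten kinetics."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_um / (km_um + substrate_um)


KM_START_UM = 35.0                  # Km of the starting ribozyme (from the text)
KM_EVOLVED_UM = 0.4                 # Km after continuous evolution (from the text)
SUBSTRATE_LEVELS_UM = (1.0, 0.05)   # initial and final substrate concentrations

for s in SUBSTRATE_LEVELS_UM:
    start = fraction_of_vmax(s, KM_START_UM)
    evolved = fraction_of_vmax(s, KM_EVOLVED_UM)
    print(
        f"[S] = {s} uM: starting ribozyme at {start:.4f} of Vmax, "
        f"evolved ribozyme at {evolved:.4f} of Vmax "
        f"({evolved / start:.1f}-fold higher)"
    )
```

Under these assumptions the starting ribozyme operates far below saturation at every substrate level used, so a variant that holds its substrate more tightly gains a large rate advantage even without any change in kcat.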
For almost all of these schemes, activated nucleotides or
oligonucleotides would have been required. However, activated
leaving groups would likely not have survived for long periods of
time in the prebiotic environment, or would have been present at
only small, steady-state concentrations. For this reason, it is interesting to consider whether activated leaving groups may have come
about due to the development of catalytic cycles that preceded
a ribozyme polymerase, yet may have provided substrates for its
evolution.
In exploring the role of cofactors in metabolism we found that
functional group transfers appeared to be relatively facile. This
is especially true for cofactors and phosphoryl transfer reactions.
In addition to the AMP-biotin transfer to CoA cited above, Yarus
and co-workers selected a ribozyme that could catalyze the rearrangement of phosphoranhydride bonds (Huang and Yarus, 1997a;
Huang et al., 2000). This reaction is mechanistically similar to those
which occur during adenylation of cofactors, including FAD and
NAD. Indeed, the ribozyme showed a surprising lack of substrate
specificity, and could add any of a variety of molecules that contained at least one alpha-phosphate to its 5′ end (Huang and Yarus,
1997b). A similar ‘cappase’ ribozyme has been selected by Kang and
Suga (2007), although in this instance the attacking nucleophile is
a 2′ hydroxyl and the substrates are a variety of purine ribotides.
The activated functional groups would also have been capable of
participating in phosphoryl bond transfer reactions, such as those
that occur during DNA ligation and polymerization. In this regard,
it is interesting to note that the Group I self-splicing intron has been
shown to be capable of incorporating cofactors at its 5′ end through
the same mechanism that guanosine normally uses to initiate the
splicing cascade (Breaker and Joyce, 1995).
Fig. 6. Down-hill acyl transfer reactions in translation. Acyl-activation, hydroxy-group acylation from an adenylate and peptide bond formation are shown as arrows
1–3, respectively. Each of these (or equivalent) reactions can be catalyzed by a
ribozyme generated by in vitro selection.
Oligonucleotides terminated with cofactors/leaving groups could have served to initiate
ligation events, and a ribozyme selected by Fujita et al. (2006) does
just this. The initial RNA pool terminated in a nicotinamide moiety joined to the ribozyme via a 5′–5′ pyrophosphate linkage, and
variants were selected that could displace 5′-phosphorylated NMN
and form a phosphodiester bond.
Taken together, these findings indicate that while the most
obvious route to prebiotic replication, the assembly of monomers,
may have been foreclosed by enantiomeric poisoning or other
considerations, there would nonetheless have been a plethora of
other mechanisms by which early, self-replicating catalysts might
have arisen. Once the first self-replicator was established, the
elaboration of catalysis would have been almost inevitable, with
simple sequence and structural motifs being improved and elaborated. Growing self-replicators would have required and acquired
additional functionalities, such as the activation and synthesis of
nucleotides, leading to the establishment of an interdependent
genome.
3.2. Recapitulation of the development of translation
While it is hard to know how translation arose, it is likely that
many of the steps in the process today are at least chemically similar to those that would have been present initially. Amino acids
would likely have been activated by forming aminoacyl adenylates, a high-energy mixed anhydride between the carboxyl of the
amino acid and the phosphate of AMP. In ribosome-mediated protein synthesis, this reaction is the first step of tRNA aminoacylation
catalyzed by aminoacyl-tRNA synthetase (aaRS). In ribosomes, the
aminoacyl is then transferred to the 2′ or 3′ hydroxyl of the 3′-terminal nucleotide of tRNA by aaRS. Charged tRNAs bind to the
ribosome, and the amino group in the A-site aa-tRNA attacks the
carbonyl group of the peptidyl-tRNA (or aminoacyl-tRNA) in the P-site, forming a new peptide bond. From a purely chemical point
of view, the synthesis of a peptide is essentially the down-hill
transfer of the aminoacyl between different carriers: phosphate
(anhydride) → hydroxyl → amino in the ribosomal pathway (Fig. 6).
This putative path to translation has at some level been recapitulated in the laboratory by the directed evolution of ribozymes.
The initial activation of the acyl group as a mixed anhydride starts
the process. An acyl activating ribozyme has been selected (Kumar
and Yarus, 2001). These experiments utilized a carboxylic acid (3-mercaptopropionic acid, 3 Mpa) rather than an amino acid as the
nucleophilic substrate and were conducted at lower pH in order
to diminish the inherent instability of the amino adenylate product. One selected variant, KK13, was shown to be able to use various
amino acids and even water (resulting in 5′-pyrophosphatase activity) as nucleophiles.
The ribozyme catalyzing the reaction that mimics the second
step of aaRS was actually the first selected ribozyme capable of
catalyzing aminoacyl transfer (Illangasekare et al., 1995). In this
selection, an RNA pool was incubated with phenylalanyl-AMP as
an amino acid donor. Some catalysts proved capable of transferring phenylalanine to a hydroxyl group. The self-modified RNA
now contained an amino group and could be reacted with the
N-hydroxysuccinimide (NHS) ester of naphthoxyacetic acid, dramatically increasing the hydrophobicity of the RNA and thus
allowing its separation from unreacted RNA by HPLC. One clone,
isolate 29, was chosen for further study and the aminoacyl acceptor of this ribozyme was proved to be the 2′ or 3′ hydroxyl of the
3′-terminal G.
Another aminoacylating ribozyme was selected by the Suga
group (Saito et al., 2001). In this case, though, the selection was
explicitly designed to be relevant to modern translation. A randomized region was appended to the 5′ end of tRNA, and the selected
ribozyme catalyzes the aminoacylation of real tRNA on its amino
acid acceptor arm. Even when the ribozyme domain and tRNA were
separated by RNase P cleavage, the ribozyme could still function in
trans, just as protein aaRS do.
The first ribozyme that was actually capable of peptide bond
synthesis was selected by Eaton and co-workers (Wiegand et al.,
1997), and catalyzed amide-bond formation between an amino
group tethered to RNA (through a flexible PEG linker) and biotin
(again from a biotin adenylate). Other selected peptide bond-forming ribozymes also utilize adenylate as an acyl source. First,
a peptide bond-forming ribozyme was selected that supposedly
utilized an aminoacyl linked to the 3′ hydroxyl of AMP (Zhang
and Cech, 1997), but the real substrate was later identified as a
contaminating aminoacyl adenylate (Sun et al., 2002). Second, during further engineering of the previously described aminoacylating
ribozyme (Illangasekare et al., 1995), a second reaction between the
product, aminoacylated RNA, and free aminoacyl adenylate was
observed, resulting in the formation of a dipeptide-RNA adduct
(Illangasekare and Yarus, 1999). Following up on these results,
Zhang and co-workers intentionally selected ribozymes that could
act as general dipeptide synthesis catalysts by using aminoacyl
adenylates as substrates (Sun et al., 2002).
The types of group transfer reactions carried out by the ribosome
are used throughout metabolism, and thus there may have been
many routes to translation in an early RNA world. The group transfer
reactions embodied by ribozymes that utilize CoA are one example that we have already discussed, and indeed in non-ribosomal
peptide synthesis, the aminoacyl of the aminoacyl adenylate is
transferred to the thiol group of pantetheine, forming a thioester
bond. Similarly, while the major task of the ribosome is to form peptide bonds using the amino group of an aminoacylated tRNA as the
nucleophile, the nucleophile can also be the amino group of puromycin
or the hydroxyl group of hydroxypuromycin in a process called the
‘fragment reaction.’ A ribozyme catalyzing a reaction similar to the
fragment reaction was first selected by Lohse and Szostak (1996).
These researchers intended to select an acyl transfer ribozyme that
could move an aminoacyl group at the 3′ end of a 6 nucleotide RNA
oligonucleotide to a 5′, 3′, or internal 2′ hydroxyl of the ribozyme.
After 11 rounds of selection one variant predominated, which had
strikingly evolved a way to juxtapose the acyl group of the RNA
oligonucleotide substrate and the 5′ hydroxyl of the ribozyme by
simply using a 13-nt fragment template to align and juxtapose the
two functionalities. As expected, the nucleophile was identified as
the 5′, template-aligned hydroxyl of the ribozyme. Astoundingly,
when the 5′ hydroxyl was substituted with an amino group, the
aminoacyl was transferred to the ribozyme at a rate comparable
to that of the original ribozyme. The fact that amide bond formation was as competent as the selected acyl transfer activity further
emphasized the role of templating in the mechanism.
A similar selection was carried out by the Famulok group, using
2′-aminoacyl-AMP as the aminoacyl donor (Jenne and Famulok,
1998). However, the nucleophile on the selected ribozyme proved
to be an internal 2′ hydroxyl. In this instance, though, substitution
of the hydroxyl with an amino group did not result in amide
bond formation, possibly due to the differing geometries of the
nucleophile (C3′-endo conformation for the 2′ hydroxyl vs. C2′-endo conformation for the 2′ amine). Acyl transfer activity was also
achieved by the previously described ribozyme selected by Suga
and co-workers (Lee et al., 2006; Saito et al., 2001). The ribozyme
could use various aminoacyl donors, including adenylates,
thioesters, and oxygen esters. As we saw in discussing the evolution
of replication, it may be that reactions that were either evolved for
another purpose or that were semi-specific in terms of substrates
and leaving groups could have provided the basis for the evolution
of new activities, in this case the formation of peptide bonds.
4. What in vitro selection reveals about fitness landscapes
At some level, all speculations about the RNA world are just that:
speculations. Absent a more complete representation of phylogenetic descent prior to the takeover of the RNA world by translation
(or a time machine) there are numerous possible options for what
may have actually happened at origins and in the RNA world. It
is because there is no clear historical trail to origins that directed
evolution experiments are most useful in understanding what may
have happened. In vitro selection provides a snapshot of fitness
landscapes: how sequence maps to function (and, therefore, at least
in part to fitness; reviewed in Jhaveri et al., 1997; Lehman, 2004).
By determining how likely it is that a given ribozyme activity can
arise in a test tube, extrapolations regarding the likelihood of that
ribozyme having arisen in the RNA world can also be made. Based
on selection experiments, a number of conclusions can be drawn
about the relative likelihoods of different paths to and from origins.
4.1. Tyranny of short motifs
To the extent that there was a period in Earth’s history where
new sequences were invented from randomly assembled and replicated mono- and oligonucleotides, it is not unreasonable that in
vitro selection directly mimics processes that occurred in the RNA
world. That said, the earliest replicators would likely have been
short, and their functionality would therefore have been minimal.
This does not argue against nucleic acid origins, but rather in favor of
it: short nucleic acids can more readily assume structure than short
peptides. The fact that helices can form readily based on simple
Watson–Crick pairing means that cavities for binding or catalysis
will also form.
The earliest replicating, functional nucleic acids would have
been elongated and in the process would have had the opportunities to assume more complex structure and to refine and
acquire additional functionalities. However, the original sequence
and functionality would still have been embedded in the expanded
sequence, and may therefore have guided or restricted the evolution of the longer binding species and catalysts.
To the extent that this hypothesis is true, we might expect in
vitro selection experiments to yield at least some binding species
and catalysts that have short, compact sequence and structural
motifs. In fact, not only does in vitro selection yield short functional
motifs, but these motifs often predominate in a selection. This is of course
because shorter sequence motifs are numerically overrepresented
relative to longer sequence motifs, and will therefore predominate
unless the longer motifs are much more functional. This predominance has been called the ‘tyranny of short motifs’, and we have
already seen it in action during the selection of the Bartel ligase,
in which only one long, complex, and highly functional variant
arose, while many other short, less complex, but still quite functional
ribozymes were also found. Similarly, undirected selections
of ribozyme cleavases turned up a surprising number of variants of
the well-known but simple hammerhead ribozyme motif (Salehi-Ashtiani and Szostak, 2001), as well as other simple, less active
motifs (Tang and Breaker, 2000). A similar set of selections that was
started from pools biased in favor of the hammerhead also showed
that the ribozyme had a relatively low information content (Tang
and Breaker, 1997). McManus and Li (2007) have found that similar,
simple deoxyribozyme kinases emerged from a selection starting
from a completely random pool.
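The numerical argument behind the tyranny of short motifs can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a fully specified, contiguous motif, uniformly random bases and a hypothetical pool of 10^15 molecules; real motifs are usually discontiguous and tolerate sequence variation, so the absolute numbers are illustrative only. The 220 random positions echo the pool of Bartel and Szostak (1993) mentioned earlier; the pool size and the motif lengths are assumptions.

```python
import math


def expected_copies(pool_size, random_length, motif_length):
    """Expected occurrences of one exact motif in a pool of random sequences.

    Each of the (random_length - motif_length + 1) windows in a molecule
    matches a given motif with probability 4 ** -motif_length.
    """
    if motif_length > random_length:
        return 0.0
    windows = random_length - motif_length + 1
    return pool_size * windows / 4 ** motif_length


def probability_present(expected):
    """Poisson probability that at least one copy is present in the pool."""
    return -math.expm1(-expected)


POOL_SIZE = 1e15       # hypothetical number of molecules in the pool
RANDOM_LENGTH = 220    # random positions per molecule

for motif_length in (10, 20, 30, 40):
    copies = expected_copies(POOL_SIZE, RANDOM_LENGTH, motif_length)
    print(
        f"{motif_length}-nt motif: expected copies = {copies:.3g}, "
        f"P(at least one) = {probability_present(copies):.3g}"
    )
```

Each additional specified position divides the expected number of copies by four, so a short motif can be present many times over in a pool in which a long, information-rich motif is expected to be absent altogether.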
A further implication of the tyranny of short motifs is that the
short sequence motifs we see today may be similar or identical
to short sequence motifs that would have been seen in the RNA
world. For example, the fact that similar short, adenosine-binding
motifs have arisen in selections against many different adenosine-containing cofactors may imply that it was in fact this motif that was
present in early ATP-binding species or ribozymes. Similarly, if an
early cleavase was important in the RNA world it would almost certainly have been akin to the hammerhead ribozyme sequence that
is seen in the modern world. Not only do short, simple, functional
sequences exist, but also it appears that such species can potentially
be selected from even sparse representations of sequence space
(Knight and Yarus, 2003).
4.2. Diversity in sequence space
While it is true that short motifs can dominate a selection, many
selection experiments reveal that the fitness landscape surrounding a given function is quite diverse. As we have already seen, the
original selection by Bartel and Szostak (1993) produced many
short ribozymes as well as the complex Class I ligase. A more
thorough analysis has been carried out by Li and his co-workers.
Schlosser and Li (2005) followed the selection of deoxyribozyme
cleavases through multiple generations, and saw the rise and fall
of many different variants as the selection progressed. While many
of these intermediate variants may be less fit relative to the population, they can be individually ‘rescued’ by letting them explore
the surrounding sequence space and fitness landscape. For example, when deoxyribozymes predicted to form three-arm junctions
were further evolved, they acquired additional helical elements,
and became more complex five-arm junctions (Chiuman and Li,
2006). However, such variants are only available if individual variants are protected from competition. By increasing the selection
pressure on the population, the diversity of the selected sequences
winnowed quickly, with the population collapsing on a previously
discovered deoxyribozyme, 8–17 (Schlosser and Li, 2004).
4.3. Acquisition of new functions
Stephen Gould has asserted that if the tape of life were replayed
the results would be different. However, the tyranny of short motifs
implies that at least some of the molecular events in any tape that
involves RNA may be quite deterministic. Both ideas can be simultaneously true. While short sequence motifs can readily predominate,
elongation and elaboration of function may occur in a variety of different ways, and which of those ways was chosen during evolution
will likely forever remain a mystery.
In this regard, nucleic acids have proven to be surprisingly adept
at moving between different functionalities, leaving behind their
original, tyrannical motifs and adopting new sequences and structures. Burke and co-workers started with an anti-flavin aptamer
and from it selected an anti-guanosine aptamer (Held et al., 2003).
In at least one instance an anti-flavin and anti-guanosine aptamer
were separated by only seven sequence substitutions.
Fig. 7. Anticipated path between anti-FAD and anti-GMP aptamer variants. Structures were generated using the NUPACK algorithm of Dirks and Pierce (2004), based on
sequences identified by Held et al. (2003). As sequence substitutions accumulate in the anti-FAD aptamer (left) the structure is predicted to be destabilized, quickly assuming
a more open format. Additional sequence substitutions restabilize a new conformation, which can bind guanosine (right).
By synthesizing predicted intermediates, a neutral path between the two
aptamers could be found, although two of the variants in this path
had reduced binding abilities for both ligands (Fig. 7). Schultes and
Bartel (2000) performed a similar feat with two ribozymes, starting
with a selected ligase and engineering it sequentially to be a known,
natural cleavase (the HDV ribozyme). Interestingly, while the intermediates showed greatly reduced activity there were at least some
variants that could catalyze both types of reactions. In both of these
instances, the aptamers and ribozymes assumed completely different secondary structures as a result of the accumulation of point
mutations. Given the joint activities at the ‘intersection sequences’
that separated them, this means that at least some functional
nucleic acids may be conformationally mobile, and capable of multiple different activities depending on which conformation they
assume. Such degenerate mapping of sequence to structure to
function might have greatly accelerated the acquisition of novel
functions at or near origins.
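The idea of a mutational path between two functional sequences can be sketched in a few lines. The two sequences below are hypothetical placeholders, not the aptamers of Held et al. (2003), and the code only enumerates single-substitution steps; it makes no attempt to predict which intermediates would fold or bind.

```python
from math import factorial


def differing_positions(a, b):
    """Indices at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def one_path(a, b, order=None):
    """Walk from a to b one substitution at a time, in the given order."""
    positions = differing_positions(a, b) if order is None else list(order)
    current = list(a)
    path = [a]
    for i in positions:
        current[i] = b[i]
        path.append("".join(current))
    return path


def count_orderings(distance):
    """Number of distinct shortest paths (orders of the substitutions)."""
    return factorial(distance)


def count_intermediates(distance):
    """Number of distinct sequences strictly between the two endpoints."""
    return 2 ** distance - 2


start = "GGAUCCAUGC"   # hypothetical sequence
end = "GCAUGCAAGC"     # hypothetical sequence, three substitutions away
print(one_path(start, end))
print(count_orderings(7), count_intermediates(7))
```

For two sequences separated by seven substitutions there are 7! = 5040 possible shortest routes passing through 126 distinct intermediates, so finding even one route along which function is retained at every step says something about how connected that region of the landscape is.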
Not all nucleic acids may be able to move readily through
sequence or conformational spaces, however. By using microarrays
to directly examine the sequence space surrounding an anti-IgE
aptamer Katilius et al. (2007) showed that most of the nearest neighbor mutations were deleterious to binding. Similarly,
although Bartel and Szostak (1993) showed that many of the
sequences and structures of some of the ligases that emerged from a
‘deep random’ selection were clearly driven by the tyranny of short
motifs, the aforementioned Class I Bartel ligase was not. Indeed, partial randomization and re-selection indicated that the Class I ligase
had an extremely high information content, and given the pool size
used would likely have been selected only once every 10,000 times
the experiment was carried out (Ekland et al., 1995). This is usually
taken to mean not that Bartel was extremely lucky, but rather that
there are many equally functional, equally complex variants scattered through sequence space. A similar analysis of the natural HDV
ribozyme indicated that it is also relatively rare and, unlike its counterpart the hammerhead, may have arisen only infrequently (Nehdi
and Perreault, 2006). These results gibe with the fact that the HDV
ribozyme has been found only once in nature, while the hammerhead ribozyme appears to have multiple, independent origins.
The fact that the Class I Bartel ligase is complex and rare does
not necessarily mean that it is isolated in sequence space; it might
easily be part of a large neutral network of functional ribozymes.
However, by and large this does not appear to be the case. The
Bartel ligase has been subjected to numerous different directed
evolution experiments, and has in general proved extremely recalcitrant
to sequence change (Levy et al., 2005). For example, Schmitt
and Lehman (1999; see also Lehman, 2004) have shown that multiple directed evolution experiments with the ligase that started
from parallel, partially randomized populations yielded the same
selected variants. Similarly, despite the grinding intensity of the
10⁵⁰⁰-fold selection and amplification of the ligase that was carried
out via the automated methods described above (Paegel and Joyce,
2008), no real improvement in the speed of the ligase (kcat ) was
observed. Finally, despite heroic efforts that involved selection in
a liter of water-in-oil emulsion, Zaher and Unrau (2007) improved
the polymerization speed of a variant of the Bartel Class I ligase by
only about 75-fold.
Since there is not of necessity an inverse correlation between
complexity and neutral movement on a fitness landscape, the relative isolation of the Bartel ligase is curious and demands further
explanation. It may be that since the ligase was born through the
unnatural process of simultaneously competing against multiple,
unrelated points in sequence space, rather than the more natural
process of accumulating single mutations and blocks of sequence,
there is no reason to expect that it should sit on a neutral network
and remain functional following even small mutational moves.
As a counterexample, a number of variants of the Group I self-splicing intron have been evolved. For example, the enzyme has
been evolved to new tasks (the cleavage of DNA; Tsang and Joyce,
1994) and to function in new ways (with calcium rather than magnesium as a catalytic metal; Lehman and Joyce, 1993). During the
evolution of these phenotypes, numerous different pathways or trajectories can be taken (Hanczyc and Dorit, 2000; Lehman et al.,
2000). The fact that the Group I self-splicing intron was born of
natural processes, and thus is already well-familiar with evolution
in the context of single base changes, may be one factor accounting
for the seeming difference between its readily traversable fitness
landscape and the restricted peak of the Bartel ligase.
While different functional nucleic acids would presumably have
moved along fitness landscapes in very similar ways, the ability
to fully traverse the landscape remains an open question. Some
areas of the landscape may be sparsely populated and have isolated
fitness peaks, while others may have large neutral networks that
can be navigated by point mutation. Novel functionalities could
potentially be acquired by re-folding or by recombination. The role
of recombination in augmenting RNA functionality is an active area
of research and it is still unclear to what extent it would have been
a selective advantage at or near origins.
5. Conclusions
In vitro selection experiments show first and foremost that
functional nucleic acids can arise from random
sequence libraries. Indeed, even simple sequence and structural
motifs can prove to be robust binding species and catalysts, indicating that it may have been possible to transition from even the
earliest self-replicators to a nascent, RNA-catalyzed metabolism.
Because of the diversity of aptamers and ribozymes that can be
selected, it is possible to construct a ‘fossil record’ of the evolution
of the RNA world, with in vitro selected catalysts filling in as doppelgangers for molecules long gone. In this way a plausible pathway
from simple oligonucleotide replicators to genomic polymerases
can be imagined, as can a pathway from basal ribozyme activities
to the ribosome. Most importantly, though, in vitro selection experiments can give a true and quantitative idea of the likelihood that
these scenarios could have played out in the RNA world (although
into the future new methods for the synthesis of microarrays allow
huge numbers of sequence variants to be prepared and assessed
simultaneously, and such large screens are becoming increasingly
viable alternatives to blind selection processes; Katilius et al., 2007).
Simple binding species and catalysts could have evolved into other
structures and functions. As replicating sequences grew longer,
new, more complex functions or faster catalytic activities could
have been accessed. Some activities may have been isolated in
sequence space, but others could have been approached along large,
interconnected neutral networks. As the number, type, and length
of ribozymes increased, RNA genomes would have evolved and
eventually there would have been no area in a fitness landscape that
would have been inaccessible. Self-replication would have inexorably led to life.
Interestingly, this field of research has application not only to
understanding our origins, but also to numerous issues in biotechnology. Attempts to understand the origins of replication have led to
the synthesis of orthogonal base-pairs that can augment the information content of DNA. Attempts to understand replication have
led to novel amplification assays. In the future, we suspect this
cross-hybridization will continue, and that the first self-replicating
nucleic acids will not only provide insights into what may have
occurred billions of years ago, but will also form the basis for self-repairing nanotechnologies.
References
Anderson PC, Mecozzi S. Unusually short RNA sequences: design of a 13-mer
RNA that selectively binds and recognizes theophylline. J Am Chem Soc
2005;127:5290–1.
Bartel DP, Szostak JW. Isolation of new ribozymes from a large pool of random
sequences. Science 1993;261:1411–8.
Breaker RR, Joyce GF. A DNA enzyme that cleaves RNA. Chem Biol 1994;1:
223–9.
Breaker RR, Joyce GF. Self-incorporation of coenzymes by ribozymes. J Mol Evol
1995;40(6):551–8.
Burgstaller P, Famulok M. Isolation of RNA aptamers for biological cofactors by in
vitro selection. Angew Chem Int Ed Engl 1994;33:1084–7.
Burke DH, Gold L. RNA aptamers to the adenosine moiety of S-adenosyl methionine: structural inferences from variations on a theme and the reproducibility of
SELEX. Nucleic Acids Res 1997;25:2020–4.
Burke DH, Hoffman DC. A novel acidophilic RNA motif that recognizes coenzyme A.
Biochemistry 1998;37:4653–63.
Chen X, Li N, Ellington AD. Ribozyme catalysis of metabolism in the RNA world. Chem
Biodivers 2007;4:633–55.
Chiuman W, Li Y. Revitalization of six abandoned catalytic DNA species reveals a
common three-way junction framework and diverse catalytic cores. J Mol Biol
2006;357:748–54.
Coleman TM, Huang F. RNA-catalyzed thioester synthesis. Chem Biol
2002;9:1227–36.
Dieckmann T, Suzuki E, Nakamura GK, Feigon J. Solution structure of an ATP-binding
RNA aptamer reveals a novel fold. RNA 1996;2:628–40.
Dirks RM, Pierce NA. An algorithm for computing nucleic acid base-pairing probabilities including pseudoknots. J Comput Chem 2004;25:1295–304.
Draper WE, Hayden EJ, Lehman N. Mechanisms of covalent self-assembly of the
Azoarcus ribozyme from four fragment oligonucleotides. Nucleic Acids Res
2008;36:520–31.
Ekland EH, Szostak JW, Bartel DP. Structurally complex and highly active RNA ligases
derived from random RNA sequences. Science 1995;269:364–70.
Ellington AD. RNA selection. Aptamers achieve the desired recognition. Curr Biol
1994;4:427–9.
Ellington AD, Robertson MP. Ribozyme selection. In: Barton D, Nakanishi K, Meth-Cohn O, editors. Comprehensive natural products chemistry, vol. 6. New York:
Elsevier; 1999. p. 115–48.
Fujita Y, Furuta H, Ikawa Y. Construction of an artificial ribozyme which ligates an
RNA fragment activated by nicotinamide mononucleotide. Nucleic Acids Symp
Ser (Oxf) 2006;50:231–2.
Gilbert W. Evolution of antibodies. The road not taken. Nature 1986;320:485–6.
Hanczyc MM, Dorit RL. Replicability and recurrence in the experimental evolution
of a group I ribozyme. Mol Biol Evol 2000;17:1050–60.
Hayden EJ, Lehman N. Self-assembly of a group I intron from inactive oligonucleotide
fragments. Chem Biol 2006;13:909–18.
Held DM, Greathouse ST, Agrawal A, Burke DH. Evolutionary landscapes for
the acquisition of new ligand recognition by RNA aptamers. J Mol Evol
2003;57:299–308.
Huang F, Yarus M. A calcium-metalloribozyme with autodecapping and pyrophosphatase activities. Biochemistry 1997a;36:14107–19.
Huang F, Yarus M. Versatile 5′ phosphoryl coupling of small and large molecules to
an RNA. Proc Natl Acad Sci USA 1997b;94:8965–9.
Huang F, Bugg CW, Yarus M. RNA-Catalyzed CoA, NAD, and FAD synthesis from
phosphopantetheine, NMN, and FMN. Biochemistry 2000;39:15548–55.
Huang F. Efficient incorporation of CoA, NAD and FAD into RNA by in vitro transcription. Nucleic Acids Res 2003;31:e8.
Illangasekare M, Sanchez G, Nickles T, Yarus M. Aminoacyl-RNA synthesis catalyzed
by an RNA. Science 1995;267:643–7.
Illangasekare M, Yarus M. A tiny RNA that catalyzes both aminoacyl-RNA and
peptidyl-RNA synthesis. RNA 1999;5:1482–9.
Jadhav VR, Yarus M. Acyl-CoAs from coenzyme ribozymes. Biochemistry
2002;41:723–9.
Jenison RD, Gill SC, Pardi A, Polisky B. High-resolution molecular discrimination by
RNA. Science 1994;263:1425–9.
Jenne A, Famulok M. A novel ribozyme with ester transferase activity. Chem Biol
1998;5:23–34.
Jhaveri S, Hirao I, Bell SD, Uphoff K, Ellington AD. Landscapes for molecular evolution:
lessons from in vitro selection experiments with nucleic acids. Comb Chem Mol
Diver 1997;5:169–91.
Jiang F, Kumar RA, Jones RA, Patel DJ. Structural basis of RNA folding and recognition
in an AMP-RNA aptamer complex. Nature 1996;382:183–6.
Katilius E, Flores C, Woodbury NW. Exploring the sequence space of a DNA aptamer
using microarrays. Nucleic Acids Res 2007;35:7626–35.
Kang TJ, Suga H. In vitro selection of a 5′-purine nucleotide transferase ribozyme.
Nucleic Acids Symp Ser (Oxf) 2007;50:379–80.
Kim DE, Joyce GF. Cross-catalytic replication of an RNA ligase ribozyme. Chem Biol
2004;11:1505–12.
Knight R, Yarus M. Finding specific RNA motifs: function in a zeptomole world? RNA
2003;9:218–30.
Koizumi M, Breaker RR. Molecular recognition of cAMP by an RNA aptamer. Biochemistry 2000;39:8983–92.
Kuhns ST, Joyce GF. Perfectly complementary nucleic acid enzymes. J Mol Evol
2003;56:711–7.
Kumar RK, Yarus M. RNA-catalyzed amino acid activation. Biochemistry
2001;40:6998–7004.
Lauhon CT, Szostak JW. RNA aptamers that bind flavin and nicotinamide redox cofactors. J Am Chem Soc 1995;117:1246–57.
Lee JF, Stovall GM, Ellington AD. Aptamer therapeutics advance. Curr Opin Chem Biol
2006;10:282–9.
Lehman N, Joyce GF. Evolution in vitro: analysis of a lineage of ribozymes. Curr Biol
1993;3:723–34.
Lehman N, Donne MD, West M, Dewey TG. The genotypic landscape during in vitro
evolution of a catalytic RNA: implications for phenotypic buffering. J Mol Evol
2000;50:481–90.
Lehman N. Assessing the likelihood of recurrence during RNA evolution in vitro. Artif
Life 2004;10:1–22.
Leontis NB, Westhof E. Geometric nomenclature and classification of RNA base pairs.
RNA 2001;7:499–512.
Levy M, Ellington AD. The descent of polymerization. Nat Struct Biol 2001;8:580–2.
Levy M, Cater SF, Ellington AD. Quantum-dot aptamer beacons for the detection of
proteins. Chembiochem 2005;6:2163–6.
Lilley DM. The origins of RNA catalysis in ribozymes. Trends Biochem Sci
2003;28:495–501.
Lohse PA, Szostak JW. Ribozyme-catalysed amino-acid transfer reactions. Nature
1996;381:442–4.
Lorsch JR, Szostak JW. In vitro evolution of new ribozymes with polynucleotide
kinase activity. Nature 1994;371:31–6.
Marshall KA, Robertson MP, Ellington AD. A biopolymer by any other name would
bind as well: a comparison of the ligand-binding pockets of nucleic acids and
proteins. Structure 1997;5:729–34.
McManus SA, Li Y. Multiple occurrences of an efficient self-phosphorylating deoxyribozyme motif. Biochemistry 2007;46:2198–204.
McGinness KE, Joyce GF. RNA-catalyzed RNA ligation on an external RNA template.
Chem Biol 2002;9:297–307.
Nehdi A, Perreault JP. Unbiased in vitro selection reveals the unique character of the self-cleaving antigenomic HDV RNA sequence. Nucleic Acids Res
2006;34:584–92.
Orgel LE. Prebiotic chemistry and the origin of the RNA world. Crit Rev Biochem Mol
Biol 2004;39:99–123.
Paegel BM, Joyce GF. Darwinian evolution on a chip. PLoS Biol 2008;6:e85.
Robertson MP, Scott WG. The structural basis of ribozyme-catalyzed RNA assembly.
Science 2007;315:1549–53.
Roychowdhury-Saha M, Lato SM, Shank ED, Burke DH. Flavin recognition by an RNA
aptamer targeted toward FAD. Biochemistry 2002;41:2492–9.
Saito H, Kourouklis D, Suga H. An in vitro evolved precursor tRNA with aminoacylation activity. EMBO J 2001;20:1797–806.
Salehi-Ashtiani K, Szostak JW. In vitro evolution suggests multiple origins for the
hammerhead ribozyme. Nature 2001;414:82–4.
Saran D, Nickens DG, Burke DH. A trans acting ribozyme that phosphorylates exogenous RNA. Biochemistry 2005;44:15007–16.
Sassanfar M, Szostak JW. An RNA motif that binds ATP. Nature 1993;364:550–3.
Sazani PL, Larralde R, Szostak JW. A small aptamer with strong and specific recognition of the triphosphate of ATP. J Am Chem Soc 2004;126:8370–1.
Schlosser K, Li Y. Tracing sequence diversity change of RNA-cleaving deoxyribozymes
under increasing selection pressure during in vitro selection. Biochemistry
2004;43:9695–707.
Schlosser K, Li Y. Diverse evolutionary trajectories characterize a community of RNA-cleaving deoxyribozymes: a case study into the population dynamics of in vitro
selection. J Mol Evol 2005;61:192–206.
Schmitt T, Lehman N. Non-unity molecular heritability demonstrated by continuous
evolution in vitro. Chem Biol 1999;6:857–69.
Schultes EA, Bartel DP. One sequence, two ribozymes: implications for the emergence of new ribozyme folds. Science 2000;289:448–52.
Sievers D, von Kiedrowski G. Self-replication of complementary nucleotide-based
oligomers. Nature 1994;369:221–4.
Stoltenburg R, Reinemann C, Strehlitz B. SELEX–a (r)evolutionary method to generate
high-affinity nucleic acid ligands. Biomol Eng 2007;24:381–403.
Strobel SA, Cochrane JC. RNA catalysis: ribozymes, ribosomes, and riboswitches. Curr
Opin Chem Biol 2007;11:636–43.
Sun L, Cui Z, Gottlieb RL, Zhang B. A selected ribozyme catalyzing diverse dipeptide
synthesis. Chem Biol 2002;9:619–28.
Tang J, Breaker RR. Examination of the catalytic fitness of the hammerhead ribozyme
by in vitro selection. RNA 1997;3:914–25.
Tang J, Breaker RR. Structural diversity of self-cleaving ribozymes. Proc Natl Acad Sci
USA 2000;97:5784–9.
Tsang J, Joyce GF. Evolutionary optimization of the catalytic properties of a DNA-cleaving ribozyme. Biochemistry 1994;33:5966–73.
Tsukiji S, Pattnaik SB, Suga H. An alcohol dehydrogenase ribozyme. Nat Struct Biol
2003;10:713–7.
Tsukiji S, Pattnaik SB, Suga H. Reduction of an aldehyde by a NADH/Zn2+-dependent
redox active ribozyme. J Am Chem Soc 2004;126:5044–5.
Wiegand TW, Janssen RC, Eaton BE. Selection of RNA amide synthases. Chem Biol
1997;4:675–83.
Woese CR, Winker S, Gutell RR. Architecture of ribosomal RNA: constraints on the
sequence of “tetra-loops”. Proc Natl Acad Sci USA 1990;87:8467–71.
Wright MC, Joyce GF. Continuous in vitro evolution of catalytic function. Science
1997a;276:614–7.
Wright MC, Joyce GF. Continuous in vitro evolution of catalytic function. Science
1997b;276:546–7.
Zaher HS, Unrau PJ. Selection of an improved RNA polymerase ribozyme with superior extension and fidelity. RNA 2007;13:1017–26.
Zhang B, Cech TR. Peptide bond formation by in vitro selected ribozymes. Nature
1997;390:96–100.
Zielinski WS, Orgel LE. Autocatalytic synthesis of a tetranucleotide analogue. Nature
1987;327:346–7.
Zimmermann GR, Jenison RD, Wick CL, Simorre JP, Pardi A. Interlocking structural
motifs mediate molecular discrimination by a theophylline-binding RNA. Nat
Struct Biol 1997;4:644–9.