Cell, Vol. 111, 1–20, December 13, 2002, Copyright 2002 by Cell Press
Shifty Ciliates: Frequent Programmed
Translational Frameshifting
in Euplotids
Lawrence A. Klobutcher1,3 and Philip J. Farabaugh2
1 Department of Biochemistry
University of Connecticut Health Center
Farmington, Connecticut 06032
2 Department of Biological Sciences
University of Maryland Baltimore County
Baltimore, Maryland 21250
3 Correspondence: [email protected]
Minireview
Recent work suggests that there is a high frequency
of programmed +1 translational frameshifting in ciliates of the Euplotes genus. Frequent frameshifting
may have been potentiated by stop codon reassignment, which is also a feature of this group.
The way in which genetic information is translated into
proteins had been considered universal: all organisms
were thought to use a standard genetic code. Amazingly, the hard wiring of translation implicit in the genetic
code now appears itself to be subject to change. The
ciliated protozoa appear to have taken the greatest liberties with the so-called universal genetic code. Members
of this group have the heaviest concentration of nuclear
non-standard genetic codes recognized to date, typically using canonical stop codons to code instead for
amino acids (see Lozupone et al., 2001; Tourancheau et
al., 1995). For example, members of the genus Euplotes
decode the conventional UGA stop codon as cysteine.
In contrast, Tetrahymena and Paramecium retain a UGA
stop codon, but decode the other two stop codons as
glutamine, and it appears that this code has evolved
independently up to five times.
Changes to decoding can be subtler than the evolution of a non-standard genetic code. Often, mRNA sequences evolve to force a local change in the rules of
decoding, for example decoding a specific termination
codon as sense, or shifting the ribosome’s reading
frame. A number of reports over the last few years indicate that euplotids have also taken these types of liberties with the code. Though the absolute numbers remain
small, a significant fraction of available euplotid genes
appear to require a +1 translational frameshift to produce a functional protein. Sequence identities among
the putative frameshift sites suggest that they occur as
the ribosome encounters a termination codon, suggesting that the phenomenon of non-standard codes
and non-canonical decoding may be mechanistically
related.
Programmed translational frameshifts occur when
special sequences in mRNAs manipulate the ribosome
to cause it to change its reading frame (reviewed in
Baranov et al., 2002; Stahl et al., 2002). The outcome is
that genetic information encoded discontinuously in the
mRNA is expressed into a continuous protein product.
Many metazoan viruses encode open reading frames
(ORFs) that overlap, for example the retroviral gag and
pol genes. Ribosomes that reach the end of the gag
gene occasionally shift reading frames and continue
synthesis into pol, producing a Gag-Pol fusion protein.
The frequency of this frameshift is as much as 10,000-fold greater than the estimated rate of spontaneous
translational frameshifting. The sequence of the region
in and around the site of frameshifting stimulates this
impressive increase in “error.”
Among the simplest types of programmed frameshifts
are “+1 shifty stops” (Weiss et al., 1987). These sites
consist of a poorly recognized termination codon immediately preceded by a sequence that can allow a tRNA
to slip +1 on the mRNA while still maintaining at least
two base pairs. For example, in the prfB gene of Escherichia coli, a +1 frameshift occurs at the sequence CUU-UGA-C, shown in codons of the upstream, unshifted
ORF (reviewed in Baranov et al., 2002). The signal UGA-C
is recognized particularly poorly by peptide release factor II (RF2), itself the product of the prfB gene. RF2
recognizes UGA stop codons and triggers termination
of translation. When the concentration of RF2 falls below
the optimal concentration, recognition of the “internal”
UGA codon in prfB mRNA is slowed, causing a translational pause with the peptidyl-tRNA in the ribosomal P
site bound to the CUU codon. Frameshifting is thought
to occur by the peptidyl-tRNA slipping from CUU to the
UUU codon overlapping in the +1 reading frame. This
mechanism provides an autogenous control loop, optimizing expression of RF2, since a decrease in RF2 concentration would tend to increase its expression, but an
increase would tend to decrease it.
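As a rough illustration of the "at least two base pairs" criterion, the following Python sketch compares the codon in the P site with the codon the peptidyl-tRNA would occupy after slipping one base forward. Counting identical codon positions is a simplifying assumption made here (it ignores wobble pairing and modified bases), and the function name is hypothetical rather than taken from any cited work.

```python
def retained_pairs_after_plus1_slip(site: str) -> int:
    """Count P-site codon positions left unchanged by a +1 slip.

    `site` is the P-site codon plus the next base (4 nt), e.g. "CUUU".
    Simplifying assumption: a codon position that keeps the same base
    keeps its base pair with the anticodon (wobble is ignored).
    """
    if len(site) != 4:
        raise ValueError("expected a 4-nt window: codon plus next base")
    original, shifted = site[:3], site[1:]
    return sum(a == b for a, b in zip(original, shifted))


# prfB site CUU-UGA-C: the peptidyl-tRNA slips from CUU to UUU.
print(retained_pairs_after_plus1_slip("CUUU"))  # 2
# An arbitrary non-slippery window for contrast: GCA would become CAU.
print(retained_pairs_after_plus1_slip("GCAU"))  # 0
```

Under this crude proxy the prfB site keeps two of three positions after the slip, whereas an arbitrary codon such as GCA keeps none.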
Frequency of Frameshifting in Euplotes
The first report of a putative frameshifting event in a
euplotid gene was for an open reading frame (ORF2)
encoding a tyrosine-type recombinase in the Tec2
transposons of Euplotes crassus (Doak et al., 2003; Jahn
et al., 1993). Because a number of mobile elements were
known to require frameshifting for gene expression (reviewed in Farabaugh, 2000), this observation provided
little reason to believe that frameshifting would be common in Euplotes cellular genes. However, more recent
studies indicate that frameshifting may indeed be frequent. Evidence for frameshifting has been obtained
for genes encoding the regulatory subunit of cAMP-dependent protein kinase (PKAR) and a nuclear protein
kinase (EoNdr2) in E. octocarinatus (Tan et al., 2001a,
2001b), a La motif protein (p43) in E. aediculatus (Aigner
et al., 2000), and the telomerase reverse transcriptase
(TERT) of E. crassus (Wang et al., 2002). The GenBank
database includes sequences for the complete coding
regions of only 67 genes from various Euplotes species.
By contrast, among the ~6000 genes in the yeast Saccharomyces cerevisiae, frameshifting is required for the
expression of only two genes, encoding a subunit of
telomerase, EST3 (Morris and Lundblad, 1997), and an
actin binding protein, ABP140 (Asakura et al., 1998).
Thus, only about 0.03% of yeast genes require frameshifting, while the limited euplotid data suggest that
>5% of genes may require a frameshift for expression.
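The two percentages can be checked with simple arithmetic. The sketch below assumes that the euplotid numerator is the four cellular genes named above (PKAR, EoNdr2, p43, and TERT) out of the 67 complete coding regions; the text states only that the fraction exceeds 5%, so the exact numerator is an assumption made for illustration.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# EST3 and ABP140 among roughly 6000 yeast genes.
yeast = percent(2, 6000)
# Assumed numerator: PKAR, EoNdr2, p43, and TERT out of 67 coding regions.
euplotes = percent(4, 67)

print(f"yeast: {yeast:.2f}%")        # yeast: 0.03%
print(f"Euplotes: {euplotes:.1f}%")  # Euplotes: 6.0%
```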
Figure 1. Frameshifting in Euplotes
(A) The E. octocarinatus Eondr2 gene (black rectangle) is shown.
The frame 0 ORF begins with an ATG initiation codon and encodes
all of conserved protein kinase domain I, as well as a part of domain
II. The second ORF (frame +1) encodes a part of domain II, and the
remaining 10 conserved protein kinase domains. Start and stop
codons for the 0 and +1 frame ORFs are indicated above and below
the gene map, respectively. Figure based on data in Tan et al.
(2001b).
(B) The sequence of the E. aediculatus La motif protein mRNA in
the vicinity of the frameshift site is shown, along with the predicted
translation products from the 0 and +1 frame ORFs. The AAA-UAA-A motif is highlighted in gray. A frameshift at positions “1” or “2”
would produce amino acid sequences that conform to the consensus of the “La motif,” including a highly conserved phenylalanine
(underlined F) residue (for details, see Aigner et al., 2000).
The Evidence for Translational Frameshifts
Two types of evidence are typically necessary to support
translational frameshifting: (1) solid sequencing data for
the gene and mRNA, and (2) identification of a single
protein produced from the separate reading frames. In
all Euplotes cases, DNA sequencing revealed two separate ORFs that, if joined by translational frameshifting,
would produce a single protein product similar to homologous proteins in other species. For example, serine/
threonine protein kinases contain twelve well-conserved
domains. In the E. octocarinatus Eondr2 protein kinase
gene (Figure 1A; Tan et al., 2001b), the first domain and
part of the second are encoded by one ORF (frame 0),
while the remainder of the second domain, and the other
10 domains, are encoded by a second out-of-frame ORF
(frame +1). Producing a protein with all the characteristic domains would require shifting the reading frame
forward by one base in the region of overlap between
the 0 and +1 ORFs. In the case of the E. crassus TERT
gene, three separate ORFs exist, so two +1 frameshifts
would be required to produce the protein (Wang et al.,
2002).
With the exception of the Tec2 transposon ORF2 gene
and the E. crassus TERT gene, cDNA copies of the
mRNAs for all of the genes have been sequenced (Aigner
et al., 2000; Tan et al., 2001a, 2001b). All show the same
arrangement of ORFs, indicating that mRNA editing
does not result in the joining of the separate ORFs in
the mRNA prior to translation. An additional concern
is whether DNA sequencing errors might erroneously
indicate frameshifting, particularly since some of the
genes and mRNAs have been isolated by the polymerase chain reaction (PCR), which can introduce errors.
This seems unlikely, as multiple independent clones of
PCR products have often been analyzed, and direct
sequencing of PCR products has also been performed
(Aigner et al., 2000; Tan et al., 2001a, 2001b).
While the nucleic-acid-based evidence for frameshifting is strong, there is less information available on the
proteins. Indeed, only the La motif protein associated
with telomerase in E. aediculatus has been isolated and
analyzed (Aigner et al., 2000). A number of peptides
derived from the purified La motif protein were sequenced. One of the peptides was encoded within the
0 frame ORF, while the remainder were encoded by the
+1 frame ORF, providing a clear indication that a single
protein was produced by frameshifting.
Conserved Sequences near +1 Frameshift Sites
During programmed frameshifting, the mRNA manipulates the translational machinery to cause a shift in reading frame. Frameshift stimulatory mRNA signals can include quite distant sequences or structural features.
However, in each case a sequence of 4 to 7 nt
at the site of frameshifting is critical, as in the case
of the prfB frameshift described above. The Euplotes
frameshift genes share a common sequence motif
strongly resembling known +1 frameshift signals. In
each gene, an AAA codon, coding for lysine, immediately
precedes the stop codon of the 0 frame ORF. The stop
codon is UAA, except for the Tec2 ORF2 gene, which
has a UAG stop codon (Jahn et al., 1993). There is also
a strong tendency for the first base following the stop
codon to be an A, so that the 0 frame ORF typically
ends in the sequence 5′-AAA-UAA-A-3′. In the vicinity
of this motif, there is currently no compelling evidence
for other conserved sequence elements, or structural
motifs such as a secondary structure-forming region, a
common element of some types of frameshift sites. Tan
et al. (2001b) have noted that the sequence 5′-CAAGAA-3′ is often present within the 41 bases preceding the
AAA-UAA-A motif, but exact matches to this sequence
are not seen in all of the genes, nor is it clear how it
could influence frameshifting.
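The motif described above lends itself to a simple sequence scan. The following Python sketch walks the frame 0 codons of an mRNA and reports whether the first UAA or UAG stop is immediately preceded by AAA. The function name and the toy sequences are hypothetical, and UGA is deliberately not treated as a stop because Euplotes decodes it as cysteine.

```python
EUPLOTES_STOPS = ("UAA", "UAG")  # UGA encodes cysteine in Euplotes


def find_shifty_stop(mrna: str, orf_start: int = 0):
    """Return (index, motif) if the frame 0 stop follows an AAA codon.

    `orf_start` is the 0-based index of the first base of the frame 0 ORF.
    The motif covers AAA, the stop codon, and the next base if present.
    Returns None when the ORF ends without the motif or never terminates.
    """
    seq = mrna.upper().replace("T", "U")
    previous = None
    for i in range(orf_start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in EUPLOTES_STOPS:
            if previous == "AAA":
                return i - 3, seq[i - 3:i + 4]
            return None
        previous = codon
    return None


# Toy sequences invented for illustration, not real Euplotes genes.
print(find_shifty_stop("AUGGCUAAAUAAAGGC"))  # (6, 'AAAUAAA')
print(find_shifty_stop("AUGGCUCCCUAAAGG"))   # None
```

A scan of this kind only flags candidate sites; as the text notes, the motif has not been shown to be sufficient for frameshifting.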
The precise position of the frameshift is unknown, as
the amino acid sequence for this region has not been
determined for any of the Euplotes frameshift proteins.
However, the frameshift likely occurs near the AAA-UAA-A motif. This is best illustrated using the La motif
protein of E. aediculatus (Figure 1B). In this case, there
is an extremely small overlap between the two ORFs,
as the +1 frame ORF has a termination codon located
11 bases upstream of the termination codon of the 0
frame ORF. As a result, the frameshift must occur somewhere in this short region. Moreover, this region represents part of the conserved La motif, and a frameshift
at either the first or second codon upstream of the 0
frame stop codon would optimize amino acid sequence
conservation (Figure 1B and see Aigner et al., 2000).
Similar arguments can be made for the other genes. In
the case of the EoNdr2 protein kinase, a frameshift at the
lysine codon preceding the 0 frame stop codon would
produce a protein with maximum sequence similarity
to protein kinase domain II (Figure 1A; see Tan et al.,
2001b).

Table 1. Frequency of Stop Codon Usage in Euplotes (a)

Stop Codon-Trinucleotide
  UAA      87.9%      UAG      12.1%

Stop Codon-Tetranucleotide
  UAA-A    37.9%      UAG-A     1.7%
  UAA-G    15.5%      UAG-G     0.0%
  UAA-C     3.4%      UAG-C     5.2%
  UAA-U    31.0%      UAG-U     5.2%

(a) Based on 58 Euplotes non-transposon, protein-coding genes
available in the Transterm database (http://uther.otago.ac.nz/Transterm.html).

Figure 2. Model of the Euplotes +1 Translational Frameshift
The figure shows a ribosome encountering the AAA-UAA-A
frameshift site. See text for details.
Possible Mechanism of the +1 Frameshift
The putative frameshift signal in euplotid genes can be
considered a shifty stop (Weiss et al., 1987). A ribosome
translating up to the site would stop with the AAA codon
in the ribosomal P site and the UAA (or UAG) stop codon
in the A site (Figure 2). If a peptidyl-tRNA(Lys) were to slip
+1 from AAA to AAU then another lysyl-tRNA(Lys) could
enter the A site, accept the transfer of the peptide, and
translocate to the P site. Translation would continue in
the +1 frame.
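The model can be made concrete with a toy translation routine. The Python sketch below reads codons using a table in which UGA encodes cysteine, as described for Euplotes, and advances one extra base whenever an AAA codon is followed by a stop. Treating the shift as fully efficient is an assumption made purely for illustration, and the function name and example mRNA are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order with UGA set to C (cysteine), as the text
# describes for Euplotes; "*" marks the remaining stops UAA and UAG.
AMINO = "FFLLSSSSYY**CCCWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str, allow_shift: bool = True) -> str:
    """Translate from index 0, optionally slipping +1 at AAA-stop sites.

    Toy model: every AAA followed by a stop is assumed to shift, which
    the text does not claim; real efficiencies are not known.
    """
    protein, i = [], 0
    while i + 3 <= len(mrna):
        codon = mrna[i:i + 3]
        residue = CODE[codon]
        if residue == "*":
            break
        protein.append(residue)
        i += 3
        # P site holds AAA and the A site holds a stop: slip one base.
        if allow_shift and codon == "AAA" and CODE.get(mrna[i:i + 3]) == "*":
            i += 1
    return "".join(protein)


toy = "AUGGCUAAAUAAAGGCUGAUAG"  # invented sequence containing AAA-UAA-A
print(translate(toy, allow_shift=False))  # MAK
print(translate(toy))                     # MAKKGC
```

In the shifted reading, the first codon decoded after the slip is AAA, so a second lysine is added, which matches the step in the model where another lysyl-tRNA enters the A site.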
Two questions arise from the above model. First, why
is AAA always the last codon decoded in the 0 frame
rather than other potential “slippery” codons such as
UUU, CCC, or GGG? The finding implies that AAA, or
its cognate tRNA, has some special feature. In the yeast
Saccharomyces cerevisiae, +1 frameshifting results
from an abnormal codon·anticodon interaction at the
equivalent codon (reviewed in Stahl et al., 2002).
Whether this is true in euplotids is unclear since no
information is available about their tRNAs.
Second, why is termination at the UAA codon slow
enough to allow frameshifting? Termination codons recognized poorly by release factor (RF) can stimulate
frameshifting (reviewed in Bertram et al., 2001). In both
prokaryotes and eukaryotes, RF appears to recognize
a tetranucleotide sequence consisting of the termination
codon and its 3′ nearest neighbor nucleotide. Certain
tetranucleotides are recognized poorly by RF in vitro,
and these same signals can stimulate frameshifting in
vivo, presumably because their recognition is slow
enough to allow time for the rare stochastic tRNA slippage thought to cause the shift in frame.
The proposed model requires that UAA, and more
specifically UAA-A, is a poorly recognized termination
signal in euplotids. At first glance, this does not appear
to be the case. Based on the analysis of 58 non-transposon, protein-coding genes, 87.9% of the Euplotes open
reading frames (mainly predicted) have UAA as a terminator, and 37.9% end in UAA-A (Table 1). This raises
the following conundrum: if UAA-A is frequently used
as the “normal” termination signal, how is it capable of
stimulating translational error at frameshift sites?
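The percentages in Table 1 can be converted back to approximate gene counts as a consistency check. The counts in the Python sketch below are inferred here by rounding each percentage of 58 genes to the nearest whole number; they are not reported in the source.

```python
N_GENES = 58  # size of the Table 1 data set

# Tetranucleotide frequencies (percent) as listed in Table 1.
tetra = {
    "UAA-A": 37.9, "UAA-G": 15.5, "UAA-C": 3.4, "UAA-U": 31.0,
    "UAG-A": 1.7, "UAG-G": 0.0, "UAG-C": 5.2, "UAG-U": 5.2,
}

# Convert each percentage to the nearest whole number of genes.
counts = {k: round(v * N_GENES / 100) for k, v in tetra.items()}
uaa = sum(n for k, n in counts.items() if k.startswith("UAA"))
uag = sum(n for k, n in counts.items() if k.startswith("UAG"))

print(counts["UAA-A"])      # 22
print(uaa, uag, uaa + uag)  # 51 7 58
print(f"{100 * uaa / N_GENES:.1f}% UAA, {100 * uag / N_GENES:.1f}% UAG")
```

The inferred counts sum to 58 and reproduce the 87.9% and 12.1% trinucleotide figures, so roughly 22 of the 58 genes would end in UAA-A.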
We suggest that the reassignment of the UGA stop
codon to a cysteine codon in euplotids has also resulted
in poor recognition, and inefficient termination, for UAA
codons. Stop codon reassignment is thought to require
two steps (Osawa et al., 1990): (1) the development of
a tRNA capable of decoding one of the stop codons,
and (2) the loss of the ability of the RF to recognize the
stop codon. In regard to the second step, a single release factor (eRF1) recognizes all three conventional
stop codons in eukaryotes. Changes in the amino acid
sequence of eRF1 at positions involved in interacting
with one or more of the three nucleotides comprising a
stop codon are thought to be necessary for the loss of
recognition for particular stop codons. Indeed, a number
of workers have characterized the sequences of eRF1
proteins from ciliates that have undergone stop codon
reassignment in an attempt to determine which residues
are involved in recognizing stop codons (e.g., Inagaki
and Doolittle, 2001; Lozupone et al., 2001; Muramatsu et
al., 2001). Since the three stop codons share nucleotide
determinants, the same amino acid changes that block
recognition of one stop codon may reduce the efficiency
of recognition of one or both of the remaining stop codons. In the case of Euplotes, recognition of the UAA
stop codon might be particularly impaired, as it shares
two nucleotide positions in common with the reassigned
UGA stop codon, while the UAG stop shares only one.
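The positional comparison in the last sentence can be verified directly. This short Python sketch counts the positions at which each remaining stop codon carries the same base as the reassigned UGA codon; the function name is hypothetical.

```python
def shared_positions(codon_a: str, codon_b: str) -> int:
    """Count positions at which two codons carry the same base."""
    return sum(x == y for x, y in zip(codon_a, codon_b))


REASSIGNED = "UGA"  # decoded as cysteine in Euplotes
for stop in ("UAA", "UAG"):
    print(stop, shared_positions(stop, REASSIGNED))
# UAA 2
# UAG 1
```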
Impaired or slow recognition of stop codons at natural
termination sites may cause few problems in protein
synthesis, but in the context of a shifty codon (such as
AAA in Euplotes), it could serve to enhance the frequency of frameshifting. There is some evidence in support of this hypothesis. Seit-Nebi et al. (2002) introduced
amino acid changes into the human eRF1 protein, some
of which corresponded to residues present in the Paramecium eRF1, which recognizes only UGA stop codons.
A number of these altered human eRF1 proteins displayed greatly impaired recognition of UAA and UAG
stops in vitro, but, in addition, showed modest reductions in the recognition of UGA stop codons.
Outstanding Issues
Information on the mechanism of the +1 frameshifting in
euplotids is clearly needed. If translational frameshifting
proves to be as common as the current limited data set
indicates, the euplotids may present particular advantages for generally understanding the molecular mechanism underlying +1 programmed translational frameshifts. Further analyses of eRF1, as well as analysis of
tRNA(Lys), may provide clues as to whether changes in
stop codon usage can lead to proliferation of programmed frameshifts. It will also be important to determine if the AAA-UAA-A sequence is sufficient for frameshifting.
An additional issue is whether frameshifting is involved in regulating gene expression in euplotids. While
there is no obvious relationship between all of the Euplotes frameshift genes, two of them encode proteins
involved in telomerase function (p43 and TERT), raising
the possibility that frameshifting is involved in some
form of coordinate regulation. Frameshifting can regulate protein expression in other systems (reviewed in
Baranov et al., 2002; Farabaugh, 2000), as with the autogenous regulation of RF2 in E. coli. In other instances,
the frequency of frameshifting is thought to set an optimal ratio of protein products. This is believed to be the
case for retroviruses, where the ratio of the Gag and
Gag-Pol fusion proteins depends on the frequency at
which ribosomes frameshift. Whether Euplotes employs
either of these regulatory mechanisms is unclear. Alternatively, it is possible that frameshifting has no regulatory role in euplotids. That is, these organisms may have
evolved an efficient enough frameshifting system that
generation of a frameshift site in a gene results in a
negligible, perhaps selectively neutral, decrease in protein expression.
A final area concerns the evolution of frameshifting.
The process has been observed in E. crassus, E. aediculatus, and E. octocarinatus, and it will be of interest to
determine if other euplotids and closely related groups
also display frameshifting. There is already some indication that new frameshift sites have arisen during the
evolution of euplotids, as the E. aediculatus TERT gene
(Lingner et al., 1997) lacks the two frameshift sites present in the E. crassus TERT gene (Wang et al., 2002).
A more general evolutionary question arises from the
proposal that stop codon reassignment and frameshifting are related. That is, is the tendency to frameshifting
a necessary outcome of reassigning termination codons
to be decoded as sense? Most codon reassignments
are in fact of termination codons, perhaps because that
type of reassignment is less deleterious, but also because the reassignment may require only a small adjustment of the competition between RF and nonsense suppressor tRNA in the ribosomal A site (Lozupone et al.,
2001). Adjusting competition certainly would eventually
require restricting recognition of the reassigned termination codon, and, as we have discussed, this may reduce
recognition of one of the remaining terminators. Programmed
frameshifting frequently results when the rate
of a canonical process, such as termination, is reduced
sufficiently to allow a normally extremely unlikely noncanonical process, like frameshifting, to proceed. It may
be, therefore, that reassigning terminators will inevitably
enhance the ability to evolve programmed frameshifts.
It will be of great interest to see if shifty stop +1 frameshifting is frequent in other species that have undergone
termination codon reassignment.
Selected Reading
Aigner, S., Lingner, J., Goodrich, K.J., Grosshans, C.A., Shevchenko,
A., Mann, M., and Cech, T.R. (2000). EMBO J. 19, 6230–6239.
Asakura, T., Sasaki, T., Nagano, F., Satoh, A., Obaishi, H., Nishioka,
H., Imamura, H., Hotta, K., Tanaka, K., Nakanishi, H., and Takai, Y.
(1998). Oncogene 16, 121–130.
Baranov, P.V., Gesteland, R.F., and Atkins, J.F. (2002). Gene 286,
187–201.
Bertram, G., Innes, S., Minella, O., Richardson, J., and Stansfield, I.
(2001). Microbiology 147, 255–269.
Doak, T.G., Witherspoon, D.J., Jahn, C.L., and Herrick, G. (2003).
Eukaryotic Cell, in press.
Farabaugh, P.J. (2000). Prog. Nucleic Acid Res. Mol. Biol. 64,
131–170.
Inagaki, Y., and Doolittle, W.F. (2001). Nucleic Acids Res. 29,
921–927.
Jahn, C.L., Doktor, S.Z., Frels, J.S., Jaraczewski, J.W., and Krikau,
M.F. (1993). Gene 133, 71–78.
Lingner, J., Hughes, T.R., Shevchenko, A., Mann, M., Lundblad, V.,
and Cech, T.R. (1997). Science 276, 561–567.
Lozupone, C.A., Knight, R.D., and Landweber, L.F. (2001). Curr. Biol.
11, 65–74.
Morris, D.K., and Lundblad, V. (1997). Curr. Biol. 7, 969–976.
Muramatsu, T., Heckmann, K., Kitanaka, C., and Kuchino, Y. (2001).
FEBS Lett. 488, 105–109.
Osawa, S., Muto, A., Jukes, T.H., and Ohama, T. (1990). Proc. R.
Soc. Lond. B Biol. Sci. 241, 19–28.
Seit-Nebi, A., Frolova, L., and Kisselev, L. (2002). EMBO Rep. 3,
881–886.
Stahl, G., McCarty, G.P., and Farabaugh, P.J. (2002). Trends Biochem. Sci. 27, 178–183.
Tan, M., Heckmann, K., and Brunen-Nieweler, C. (2001a). J. Eukaryot. Microbiol. 48, 80–87.
Tan, M., Liang, A., Brunen-Nieweler, C., and Heckmann, K. (2001b).
J. Eukaryot. Microbiol. 48, 575–582.
Tourancheau, A.B., Tsao, N., Klobutcher, L.A., Pearlman, R.E., and
Adoutte, A. (1995). EMBO J. 14, 3262–3267.
Wang, L., Dean, S.R., and Shippen, D.E. (2002). Nucleic Acids Res.
30, 4032–4039.
Weiss, R.B., Dunn, D.M., Atkins, J.F., and Gesteland, R.F. (1987).
Cold Spring Harb. Symp. Quant. Biol. 52, 687–693.