Download Combinatorial mutagenesis to restrict amino acid usage in an

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Magnesium transporter wikipedia , lookup

Butyric acid wikipedia , lookup

Ancestral sequence reconstruction wikipedia , lookup

Gene wikipedia , lookup

Ribosomally synthesized and post-translationally modified peptides wikipedia , lookup

Citric acid cycle wikipedia , lookup

Fatty acid synthesis wikipedia , lookup

Deoxyribozyme wikipedia , lookup

Catalytic triad wikipedia , lookup

Enzyme wikipedia , lookup

Specialized pro-resolving mediators wikipedia , lookup

Protein wikipedia , lookup

Two-hybrid screening wikipedia , lookup

Nucleic acid analogue wikipedia , lookup

Peptide synthesis wikipedia , lookup

Metalloprotein wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Proteolysis wikipedia , lookup

Metabolism wikipedia , lookup

Hepoxilin wikipedia , lookup

Point mutation wikipedia , lookup

Amino acid synthesis wikipedia , lookup

Genetic code wikipedia , lookup

Biochemistry wikipedia , lookup

Biosynthesis wikipedia , lookup

Transcript
Combinatorial mutagenesis to restrict amino acid
usage in an enzyme to a reduced set
Satoshi Akanuma*, Takanori Kigawa*, and Shigeyuki Yokoyama*†‡§
*RIKEN Genomic Sciences Center, Tsurumi, Yokohama 230-0045, Japan; †Cellular Signaling Laboratory, RIKEN Harima Institute at SPring-8,
Sayo, Hyogo 679-5198, Japan; and ‡Department of Biophysics and Biochemistry, Graduate School of Science, University of Tokyo,
Hongo, Tokyo 113-0033, Japan
Edited by Gregory A. Petsko, Brandeis University, Waltham, MA, and approved September 5, 2002 (received for review April 24, 2002)
P
roteins are composed of 20 or nearly 20 kinds of naturally
occurring amino acids. Does a protein need the full set of 20
amino acids to encode a specific protein function? Only a few
studies have addressed this question experimentally. De novo
protein-designing experiments have demonstrated that entire
helical bundle architectures can be constructed from a reduced
amino acid alphabet (1–3). In the cases of small proteins, the
amino acid usage can be restricted with retention of their
biological functions. First, for the 57-residue, predominantly
␤-sheeted Src SH3 protein, 70% of an active variant selected
from a combinatorial library by the phage display approach was
composed of just five amino acid residues: Ala, Gly, Glu, Lys,
and Ile (4). In the cases of the 53-residue Arc repressor (5) and
the 58-residue bovine pancreatic trypsin inhibitor (6), multiple
alanine substitutions were made, and the proteins retained their
specific functions. In contrast, no study has demonstrated that
the function of a large protein can also be fabricated with a
limited set of amino acids. It is still unclear, therefore, whether
the principles deduced from the studies with the small proteins
can be applied to much larger and topologically more complex
proteins. Thus, we have developed an effective strategy to
simplify the entire sequence of a large protein with conservation
of its in vivo function.
To fabricate a simplified protein with a catalytic function, the
213-residue Escherichia coli orotate phosphoribosyltransferase
(OPRTase, EC 2.4.2.10) was used as a template for restricting
the building units to a limited set of amino acids while retaining
the metabolic function. This enzyme is the product of the pyrE
gene and catalyzes the Mg2⫹-dependent formation of orotidine
5⬘-monophosphate from orotate and ␣-D-5-phosphoribosyl-1pyrophosphate, which is involved in the de novo pyrimidine
biosynthesis as well as in the biosyntheses of histidine and
tryptophan (7). An E. coli strain carrying a lethal mutation
within the chromosomal pyrE gene is not able to grow in a
minimum medium without uracil (8). The uracil auxotrophic
phenotype is fully complemented by a plasmid carrying the
wild-type pyrE gene. This complementation by the exogenous
genes allows simple, rapid, and sensitive phenotype selection for
OPRTase variants with catalytic activity. We combined this
phenotype selection with a combinatorial mutagenesis to fabriwww.pnas.org兾cgi兾doi兾10.1073兾pnas.222243999
cate simplified amino acid sequences that are composed largely
of a reduced set of amino acids and are able to manifest the
OPRTase activity in vivo.
Experimental Procedures
Choice of Reduced Amino Acid Alphabet. Leu and Ala were added
initially to the amino acid subset because these two residues
occur most frequently in E. coli OPRTase as well as in naturally
occurring proteins (9). Because it seemed unlikely that the two
hydrophobic residue types are sufficient to achieve suitable core
packing of this relatively large enzyme, Val was also chosen for
another hydrophobic amino acid. Two Thr residues are involved
in a cluster of invariant residues, Asp124-Asp-Val-Ile-Thr-AlaGly-Thr-Ala132, among the OPRTase family (10, 11). In addition,
an unusual amino acid stretch of three contiguous Thr residues
is present at positions 79–81. These facts suggest that the Thr
residues specifically contribute to the structure and function of
the enzyme. For this reason, Thr was adopted as a reduced-set
residue. Asp, Arg, and Tyr were also added to the reduced amino
acid set as representatives of acidic, basic, and aromatic amino
acids, respectively. In addition to these seven representative
residues, Gly and Pro were added to the subset because the
flexibility of Gly and the rigidity of Pro make important contributions to the conformation of proteins. Thus, the present
simplification experiment used the nine amino acid residues,
Ala, Leu, Val, Thr, Asp, Arg, Tyr, Gly, and Pro (defined as the
reduced set of amino acids; Fig. 1).
Plasmid Construction. The pyrE gene was initially amplified from
genomic DNA of E. coli JM109 by the PCR method using a pair of
synthetic oligonucleotides containing HindIII and EcoRI restriction sites, respectively. The amplified pyrE gene was digested with
HindIII and EcoRI and cloned into pUC119. The resulting plasmid
was used to construct the combinatorial libraries.
Estimation of the Sensitivity of Phenotype Selection. To estimate
what level of activity is detectable by using the in vivo complementation of the E. coli pyrE-deficient strain RK1032 (F⫺,
thr,leu,argE3,proA2,his,pyrE60,thi), two mutant enzymes,
K73A⫹K103A and K73A⫹K100A⫹K103A, were prepared by
oligonucleotide-directed mutagenesis (12). The importance of
the three Lys residues for the catalytic activity was reported
previously in the study of the Salmonella typhimurium OPRTase
(13). It was confirmed that the uracil-auxotrophic growth
of RK1032 could be complemented by the mutated pyrE
gene encoding K73A⫹K103A but not by that encoding
K73A⫹K100A⫹K103A. The activity of the mutant enzymes
then was evaluated in vitro. Expression of the mutated genes and
purification of the products were carried out as described (14).
The purity of the enzymes used was ⬎95% as judged by SDS-gel
This paper was submitted directly (Track II) to the PNAS office.
Abbreviation: OPRTase, orotate phosphoribosyltransferase.
§To
whom correspondence should be sent at the * address. E-mail: yokoyama@biochem.
s.u-tokyo.ac.jp.
PNAS 兩 October 15, 2002 兩 vol. 99 兩 no. 21 兩 13549 –13553
BIOPHYSICS
We developed an effective strategy to restrict the amino acid usage
in a relatively large protein to a reduced set with conservation of
its in vivo function. The 213-residue Escherichia coli orotate phosphoribosyltransferase was subjected to 22 cycles of segment-wise
combinatorial mutagenesis followed by 6 cycles of site-directed
random mutagenesis, both coupled with a growth-related phenotype selection. The enzyme eventually tolerated 73 amino acid
substitutions: In the final variant, 9 amino acid types (A, D, G, L, P,
R, T, V, and Y) occupied 188 positions (88%), and none of 7 amino
acid types (C, H, I, M, N, Q, and W) appeared. Therefore, the
catalytic function associated with a relatively large protein may be
achieved with a subset of the 20 amino acid. The converged
sequence also implies simpler constituents for proteins in the early
stage of evolution.
Fig. 1. Amino acid substitutions to reduce the amino acid types contained in
E. coli OPRTase.
electrophoresis. The specific activity of each purified enzyme
was determined at 37°C as described (13). The purified
K73A⫹K103A and K73A⫹K100A⫹K103A proteins had specific activities that were 10,000- and 100,000-fold lower than that
of the wild-type enzyme, respectively. Therefore, the in vivo
complementation assay can detect activity that is ⬇1兾10,000 or
greater relative to that of the wild-type enzyme.
Design of Oligonucleotide Primers for Randomization. To generate
the amino acid substitutions shown in Fig. 1, synthetic oligonucleotides were designed and synthesized as follows: the codons for Ala
and Gly in the wild-type sequence were replaced by GSA, GSC,
GSG, or GST (S ⫽ C ⫹ G); the Thr codons were replaced by RCA,
RCC, or RCG (R ⫽ A ⫹ G); the Ser codons were replaced by DCC,
DCG, or DCT (D ⫽ A ⫹ G ⫹ T); the Cys codons were replaced
by KSC (K ⫽ G ⫹ T); the Leu and Val codons were replaced by
STG; the Ile codons were replaced by VTC or VTT (V ⫽ A ⫹
C ⫹ G); the Met codons were replaced by VTG; the Lys codons
were replaced by ARA or ARG; the Glu codons were replaced by
GAW (W ⫽ A ⫹ T) or GAS; the Asn codons were replaced by
RAC or RAT; the Gln codons were replaced by SAW or SAS; the
His codons were replaced by YRC or YRT (Y ⫽ C ⫹ T); and the
Phe codons were replaced by TWA or TWT. At the mixed
positions, the two or three bases were so mixed that their encoding
letters should appear with equal probability. The sequences with the
base mixtures were flanked by unmutated 9- to 14-base stretches so
as to ensure that the modified segments properly hybridized to the
template single-stranded DNAs. These synthetic primers were used
for the construction of semirandomly mutated plasmid DNA
libraries as shown in Fig. 2.
Selection and Sequence Analysis of Phenotypically Active Mutants. E.
coli RK1032 was transformed with the respective plasmid DNA
libraries. The transformed RK1032 then was spread on M9
medium agar plates supplemented with 0.2 mg䡠ml⫺1 glucose, 10
␮g䡠ml⫺1 thiamine, 20 ␮g䡠ml⫺1 each of threonine, leucine, arginine, proline, and histidine, and either 100 ␮g䡠ml⫺1 ampicillin or
20 ␮g䡠ml⫺1 chloramphenicol. On average 12 of the relatively
large colonies were selected from the plates after 2 days of
incubation at 37°C and cultured in 3 ml of LB medium supplemented with either 100 ␮g䡠ml⫺1 ampicillin or 20 ␮g䡠ml⫺1 chloramphenicol. The plasmid DNA variants were prepared from the
cultures and used for the subsequent PCR amplification of the
pyrE genes. The amplified PCR products were sequenced directly to determine the amino acid sequence within the region
randomized. DNA sequencing was carried out with an ABI
Prism BigDye Terminator cycle-sequencing ready-reaction kit
and an ABI Prism 310 genetic analyzer.
13550 兩 www.pnas.org兾cgi兾doi兾10.1073兾pnas.222243999
Fig. 2. Construction of a combinatorial library for the pyrE gene and selection
in E. coli strain RK1032. The entire sequence of the E. coli OPRTase gene was
simplified stepwise in 22 rounds of mutagenesis and phenotype selection. In steps
1–3, a genetic library was constructed by a strategy similar to that of Huang et al.
(15). (Step 1) A target nucleotide sequence on pUC119 was replaced with a linker
containing a SalI restriction site to eliminate wild-type DNA contamination in the
following library constructions. E. coli CJ236 was transformed with the resulting
plasmid, and then uracil-containing single-stranded DNA was isolated. A synthetic oligonucleotide designed to replace the linker sequence and generate
several types of amino acid substitutions within the target segment was annealed
to the single-stranded DNA. (Step 2) Second strands were synthesized, and the
resulting double-stranded plasmid mixture was amplified in E. coli JM109. (Step
3) In rounds 1–15, the amplified plasmid DNA variants were digested with HindIII
and EcoRI to excise the pyrE gene variants with the combinatorial sequence, and
the HindIII–EcoRI fragments were inserted into low copy-number plasmid
pSTV29. The resulting plasmids were amplified again in E. coli JM109. The
phenotype selection was through steps 4 –5. (Step 4) The amplified plasmid
mixture was used to transform the ⌬pyrE E. coli strain RK1032. (Step 4⬘) In rounds
16 –22, the pyrE gene variants were not cloned into the low copy-number plasmid. The mutated pyrE genes inserted into pUC119 were used to transform E. coli
RK1032. The RK1032 transformants were spread on M9 medium agar plates
without uracil and incubated for 2 days at 37°C. (Step 5) On average, 12 of the
largest colonies then were selected from the plates. The plasmid DNA variants
were prepared from the colonies and used for the subsequent PCR amplification
of the pyrE genes. The amplified PCR products were sequenced directly to
determine the amino acid sequence within the mutagenized region. Based on
the sequence analyses, the positions where the reduced-set residues are tolerated
were determined. Thus a simplified amino acid sequence was generated within
the targeted segment on pUC119. (Step 6) The resulting sequence variant on the
plasmid was subjected to the next round of mutagenesis.
Further Simplification of Simp-1. To simplify the Simp-1 sequence
further, the tolerance of changes of M197, N41, Q117, Q141,
Q168, and Q5 with the nine reduced-set amino acids was tested
one by one in that order. An inverse PCR method was used to
randomize each position according to ExSite mutagenesis protocol (Stratagene) with slight modifications. To amplify the pyrE
gene with base substitutions within the codon for Met-197, four
mutagenic PCR primers, each of which contained a two-base
mixture within the codon, were designed, synthesized, and mixed
such that eight amino acid residues (Ala, Asp, Leu, Met, Arg,
Thr, Val, and Tyr) could appear at this position with equal
probability. The second primer was designed such that the
mutagenic and the second primers annealed to the next sequences on opposite strands of the template plasmid. The
following program was used: step 1, 95°C, 30 sec; step 2, 95°C,
30 sec; step 3, 55°C, 1 min; step 4, 68°C, 8 min; steps 2–4 were
repeated 16 times. After amplification, the PCR products were
treated with the DpnI restriction enzyme to digest the methylAkanuma et al.
Fig. 3. Amino acid sequences of E. coli OPRTase and its simplified variants. The nonreduced set and the substituting amino acids are shown in cyan and red,
respectively. Orange shading indicates the region semiselectively randomized in each round.
ated, nonmutagenized parent DNA template and then were
purified by agarose-gel electrophoresis. The purified DNAs were
self-ligated, and the resulting ligation products were amplified in
E. coli JM109. The amplified DNAs were used to transform E.
coli RK1032, and the transformants were screened for the
residual OPRTase activity in vivo on a plate of selective medium
without uracil. After 2 days of incubation, an average of 12
relatively large colonies were selected from the plate. The
plasmid DNA variants were isolated from the colonies, and the
inserts were sequenced to identify amino acid substitutions.
Allowable amino acid substitutions at positions 41, 117, 141, 168,
and 5 also were determined in a similar manner.
Results and Discussion
Combinatorial Mutagenesis to Restrict the Amino Acid Usage. Some
BIOPHYSICS
segments of a protein tolerate extensive amino acid substitutions
without compromising their structure and function (16–21),
whereas a few amino acid substitutions can exert large effects on
protein structure (22) and function (23, 24). To select allowable
substitutions at permissive positions, we used the growth complementation of the pyrE-deficient E. coli strain RK1032 by exogenous
genes for functional variants. This phenotype selection is sensitive
enough, because a plasmid gene encoding a specific activity of
1兾10,000 of the wild-type OPRTase can allow the host cells to
undergo colony growth on a minimum medium without uracil (see
Experimental Procedures). To make reasonably small libraries, the
substitutions were restricted among similar amino acids (Fig. 1). In
addition, the entire amino acid sequence of the enzyme was divided
into 22 segments, each consisting of 8–13 residues. In the first trial,
we simplified the 22 segments independently by selecting permissive substitutions in a mutated segment of the genes in which the
Fig. 4. Avoidance of the interference between mutations at different segments by stepwise mutagenesis. (a) The wild-type enzyme structure around positions
9 and 182. (b) Schematic illustration of amino acid substitutions at positions 9 and 182.
Akanuma et al.
PNAS 兩 October 15, 2002 兩 vol. 99 兩 no. 21 兩 13551
Table 1. Sequence data of functional variants obtained by additional mutagenesis
experiments
Wild-type residue
M197
N41
Q117
Q168
Q141
Q5
Candidate amino acids in
additional mutagenesis
Amino acids observed in
functional variants*
Simp-2
A, D, L, M, R, T, V, Y†
D, N, R, T, Y‡
D, Q, R, T, Y§
D, Q, R, T, Y§
D, Q, R, T, Y§
D, Q, R, T, Y§
M(9), Y(3)
N(9), R(2), T(1)
D(2), Q(4), R(2), T(4)
D(2), Q(6), R(2), T(3)
D(1), R(5), Q(3), T(2), Y(1)
Q(8), T(4)
Y
R
T
T
R
T
*An average of 12 functional variants were analyzed. These variants were obtained by in vivo complementation
of the pyrE-defective E. coli following the site-directed random mutagenesis at the indicated positions. The
frequency of an amino acid is given in parentheses next to the amino acid.
†The codon was randomized by using four mutagenic primers that contained RCG, CKG, KAC, and RTG,
respectively (R ⫽ A ⫹ G; K ⫽ G ⫹ T).
‡The codon was randomized by using two mutagenic primers that contained DAT and ASA, respectively (D ⫽ A ⫹
G ⫹ T; S ⫽ C ⫹ G).
§Each of these codons was randomized by using three mutagenic primers that contained KAT, CRA, and ACT,
respectively (K ⫽ G ⫹ T; R ⫽ A ⫹ G).
other segments were left unchanged. Then, the resulting simplified
segments were assembled into a single molecule. The advantage of
this ‘‘independent simplification’’ strategy is that all the segments
can be simplified at one time. Indeed, a simplified SH3 variant was
generated by this strategy (4). However, in the present case,
concatenation of the independently simplified fragments failed to
produce a phenotypically active enzyme. Therefore, we used an
alternative strategy in which the 22 segments were simplified in a
stepwise manner, and the simplified sequences were fixed in the
following rounds for the remaining segments. The entire sequence
of E. coli OPRTase thus was simplified in 22 rounds of mutagenesis,
selection, and introduction of the reduced alphabet residues into the
relevant positions (Fig. 2). Fig. 3 shows the stepwise increase in the
number of substitutions along the OPRTase sequence. The amino
acid sequence of the final product, designated as Simp-1, contains
a large number of the reduced set of amino acids; 182 (85%) of the
overall 213 positions are occupied by the nine amino acid types.
Advantage of Stepwise Simplification. The accumulative simplifi-
cation is superior to the independent simplification, probably
because the interference between mutations in different segments could be avoided in the course of the accumulative
processes. A typical example is found in positions 9 and 182. The
wild-type OPRTase Ile-9 is buried completely within an ␣-helix
and interacts with Ile-182 by means of a hydrophobic interaction
(Fig. 4a). When the Ile residue is located at position 182, Val
seems to be allowed at position 9. Reasonably, the relevant
independently simplified segment (residues 2–11) had Val at this
position. However, in the case where Ile was replaced by Val at
182, the further introduction of Val at 9 was not tolerated,
presumably because of the generation of a large cavity caused by
the double Ile 3 Val substitutions. In our successful example,
Val was introduced at position 182 in round 3, and the later
combinatorial mutagenesis and phenotype selection in round 21
chose Leu for position 9; the terminal methyl group of the
introduced Leu may fill a part of the cavity produced by the
Ile-182 3 Val substitution (Fig. 4b). Thus, our success in
the simplification relied on the stepwise strategy as reported for
minimizing a protein (25).
Further Simplification. Despite the significant simplification of the
amino acid sequence, Simp-1 still has a number of amino acid
residues to be eliminated in the simplified sequence. For example, Simp-1 has a Met residue at position 197, as hydrophobic
amino acids, Leu and Val, were not tolerated at this position. The
Met-197 side chain, however, is somewhat exposed to the solvent
13552 兩 www.pnas.org兾cgi兾doi兾10.1073兾pnas.222243999
in the wild-type enzyme structure, suggesting that some hydrophilic residues may be allowed at this position. Similarly, although Asn-41, Gln-5, Gln-117, Gln-141, and Gln-168 were
resistant to replacement by Asp, it remained to be examined
whether the neutral or basic polar reduced-set residues are
allowed at these positions. Additional single-point random mutagenesis therefore was applied to positions 197, 41, 141, 168,
117, and 5. Randomization was done to generate the amino acid
mixture indicated in Table 1. Sequence data of the functional
variants also are summarized in Table 1. Based on the results, we
introduced Tyr in place of Met at position 197, Arg at 41 and 141,
and Thr at 117, 168, and 5 in place of their respective neutral
hydrophilic residues. A further simplified variant thus was
obtained and designated Simp-2.
In Fig. 5, the amino acid sequence of Simp-2 is compared with
those of the wild-type enzyme and Simp-1. Fig. 6 indicates the
positions of the simplified residues in Simp-2 superimposed on
the structure of the wild-type OPRTase (27). In total, 65 of 87
positions at which simplification was attempted are occupied by
the reduced set of nine residues. Consequently, 88% of Simp-2
is composed of the nine amino acids. More importantly, seven
residue types (Cys, His, Ile, Met, Asn, Gln, and Trp) are absent
from the simplified enzyme (the N-terminal Met is not taken
Fig. 5. Sequence alignment of the wild-type enzyme and simplified variants.
The nonreduced-set amino acid residues are highlighted with black boxes. The
N-terminal and catalytically important residues that were not subjected to
mutagenesis are indicated with open letters. The secondary structures of
the wild-type OPRTase are shown below their corresponding sequences. H,
␣-helix; S, ␤-strand.
Akanuma et al.
amino acid set protein (RASP) contains the minimal elements
required for achievement of the OPRTase activity. Our result
exemplifies that the amino acid usage of a relatively large enzyme
can also be restricted by this experimental method with retention
of its catalytic function.
Simp-2 is the most simplified variant obtained thus far, but it
may not represent the simplest possible OPRTase sequence. It
is expected that the application of the successive segment-wise
combinatorial mutagenesis reported here to the Simp-2 gene
again would further restrict the amino acid usage in the enzyme
molecule. Future attempts to generate more simplified enzymes
should be done in this direction.
Primordial Proteins. The results presented here provide some
into account). Hence, the entire sequence of Simp-2 is composed
of just 13 amino acid types. Despite this dramatic change in its
amino acid sequence, Simp-2 exhibits sufficient activity to
complement the uracil auxotrophic growth of the pyrE-deficient
E. coli strain even though the mutant’s activity is not as high as
that of the wild type. Thus, we have shown that the reduced
1. Regan, L. & DeGrado, W. F. (1988) Science 241, 976–978.
2. Kamtekar, S., Schiffer, J. M., Xiong, H., Babik, J. M. & Hecht, M. H. (1993)
Science 262, 1680–1685.
3. Schafmeister, C. E., LaPorte, S. L., Miercke, L. J. & Stroud, R. M. (1997) Nat.
Struct. Biol. 4, 1039–1046.
4. Riddle, D. S., Santiago, J. V., Bray-Hall, S. T., Doshi, N., Grantcharova, V. P.,
Yi, Q. & Baker, D. (1997) Nat. Struct. Biol. 4, 805–809.
5. Brown, B. M. & Sauer, R. T. (1999) Proc. Natl. Acad. Sci. USA 96, 1983–1988.
6. Kuroda, Y. & Kim, P. S. (2000) J. Mol. Biol. 298, 493–501.
7. Musick, W. D. (1981) Crit. Rev. Biochem. 11, 1–34.
8. Shimosaka, M., Fukuda, Y., Murata, K. & Kimura, A. (1984) J. Bacteriol. 160,
1101–1104.
9. Nakamura, Y., Gojobori, T. & Ikemura, T. (2000) Nucleic Acids Res. 28, 292.
10. Hershey, H. V. & Taylor, M. W. (1986) Gene 43, 287–293.
11. Hove-Jensen, B., Harlow, K. W., King, C. J. & Switzer, R. L. (1986) J. Biol.
Chem. 261, 6765–6771.
12. Kunkel, T. A., Roberts, J. D. & Zakour, R. A. (1987) Methods Enzymol. 154,
367–382.
13. Ozturk, D. H., Dorfman, R. H., Scapin, G., Sacchettini, J. C. & Grubmeyer C.
(1995) Biochemistry 34, 10755–10763.
14. Bhatia, M. B., Vinitsky, A. & Grubmeyer, C. (1990) Biochemistry 29, 10480–
10487.
15. Huang, W., Petrosino, J., Hirsch, M., Shenkin, P. S. & Palzkill, T. (1996) J. Mol.
Biol. 258, 688–703.
16. Bowie, J. U., Reidhaar-Olson, J. F., Lim, W. A. & Sauer, R. T. (1990) Science
247, 1306–1310.
Akanuma et al.
We thank Y. Kuroda for assistance in the preparation of molecular
graphic images. S.A. was supported by the Special Postdoctoral Research
Program of RIKEN.
17. Kim, S., Ribas de Pouplana, L. & Schimmel, P. (1994) Biochemistry 33,
11040–11045.
18. Blaber, M., Baase, W. A., Gassner, N. & Matthews, B. W. (1995) J. Mol. Biol.
246, 317–330.
19. Yu, M. H., Weissman, J. S. & Kim, P. S. (1995) J. Mol. Biol. 249,
388 –397.
20. Axe, D. D., Foster, N. W. & Fersht, A. R. (1996) Proc. Natl. Acad. Sci. USA
93, 5590–5594.
21. Silverman, J. A., Balakrishnan, R. & Harbury, P. B. (2001) Proc. Natl. Acad.
Sci. USA 98, 3092–3097.
22. Cordes, M. H., Walsh, N. P., McKnight, C. J. & Sauer, R. T. (1999) Science 284,
325–328.
23. Abrahmsen, L., Tom, J., Burnier, J., Butcher, K. A., Kossiakoff, A. & Wells,
J. A. (1991) Biochemistry 30, 4151–4159.
24. Jügens, C., Strom, A., Wegener, D., Hettwer, S., Wilmanns, M. & Sterner, R.
(2000) Proc. Natl. Acad. Sci. USA 97, 9925–9930.
25. Braisted, A. C. & Wells, J. A. (1996) Proc. Natl. Acad. Sci. USA 93,
5688 –5692.
26. Kraulis, P. J. (1991) J. Appl. Crystallogr. 24, 946–950.
27. Henriksen, A., Aghajari, N., Jensen, K. F. & Gajhede, M. (1996) Biochemistry
35, 3803–3809.
28. Osawa, S., Jukes, T. H., Watanabe, K. & Muto, A. (1992) Microbiol. Rev. 56,
229–264.
29. Baumann, U. & Oro, J. (1993) BioSystems 29, 133–141.
30. Di Giulio, M. & Medugno, M. (1999) J. Mol. Evol. 49, 1–10.
PNAS 兩 October 15, 2002 兩 vol. 99 兩 no. 21 兩 13553
BIOPHYSICS
Fig. 6. MOLSCRIPT (26) representation of a model of the monomer structure of
the simplified OPRTase variant Simp-2. Eighty-eight percent of Simp-2 is
composed of the reduced-set of nine amino acids, which are shown as a gray
backbone ribbon. The residues not simplified in Simp-2 are rendered in black.
implications for understanding early protein evolution. The
ancestor of the phosphoribosyltransferase family is thought to
have arisen early in evolution, because its members are distributed in a wide variety of organisms from bacteria to mammals.
Several authors (28–30) have proposed that early protein synthesis was much simpler and involved only 7–13 amino acids, and
that the existing system for protein synthesis has evolved progressively from the primordial one by gradually obtaining new
amino acids for the repertoire in protein synthesis. Such primitive proteins, which might have comprised a reduced set of
amino acids, are not thought to have been optimized fully in
terms of their stability and activity, because such optimizations
may be the results of a long history of evolution. However, the
primitive proteins must have had a sufficiently adequate structure for functional interactions and catalysis. Here we show that
the full amino acid alphabet set is not necessarily essential, and
only 13 amino acid types are sufficient to achieve protein
functions in vivo. This number of amino acid types is quite
consistent with the above-mentioned propositions. Accordingly,
our result supports the hypothesis that proteins in the early stage
of evolution were made from a limited set of amino acids.