Senior Comprehensive Paper:
Expanding the Genetic Code with Unnatural Amino
Acids
Kevin Cravedi
Department of Chemistry
The Catholic University of America
Class of 2010
Abstract:
Twenty L--amino acids occur in human proteins, ten are biosynthesized and ten must come from the
diet. The amino acids and their chemically modified forms cover a wide range of physicochemical properties to
serve specificity in molecular recognition in a very broad range of functions. Chemical modification of amino
acids takes place post-translationally and, thus, has been missing in recombinant DNA technology, if based on
simple expression systems in bacteria and yeast. However, a recent breakthrough provides technology for the
extension of the genetic code: It is based on the use of a unique tRNA codon pair with its corresponding
aminoacyl-tRNA synthetase that places modified or unnatural amino acids in place of one of the natural amino
acids at a specified site. An orthogonal tRNA is constructed that is not a substrate for any natural aminoacyl-tRNA
synthetase and that inserts its cognate amino acid in response to an amber nonsense codon (a stop codon); a
cognate synthetase is then created to recognize this unique tRNA and no other. The proteins can then be
expressed in E. coli. Key points of the methodology and examples for the incorporation of a variety of unnatural
amino acids into proteins with efficiency and specificity are discussed in this thesis. The two key case studies
detailed from the work of Peter G. Schultz and his group are 1) the incorporation of the photoisomerizable phenylalanine derivative AzoPhe
and the characterization and proof of its utility, and 2) the sulfation of tyrosine 63 in hirudin and its
characterization, structure in complex with human α-thrombin, and binding properties to α-thrombin.
Introduction
This thesis examines the addition of unnatural amino acids to the 20 amino acids in humans. Their use
will open the door for numerous possibilities for posttranslational modifications and the incorporation of
markers with novel physicochemical properties. Their use in research and medicine will impact therapy of
cancer, HIV and other critical diseases.
Natural Amino Acids
Amino acids play a central role as the building blocks of proteins and as intermediates in metabolism.
Although there are over 300 amino acids found in nature, only 20 amino acids are found within mammalian
cells. These convey an array of chemical versatility.1
The amino acids found in proteins are α-amino acids, which consist of a central α-carbon bonded to an amine group, a carboxyl
group, a hydrogen atom, and a distinctive R group. The side chain, or R group, gives each amino acid its
characteristics. With four different groups attached to the central carbon, the molecule is chiral, except for glycine.
Chiral amino acids exist in two mirror-image forms, the L isomer and the D isomer. The L (corresponding
predominantly to S) amino acids are constituents of proteins in most living organisms.1
The twenty amino acids vary in size, shape, charge, bonding and reactivity. All proteins in all the
species of bacteria, archaea, and eukaryotes are made from these same 20 amino acids, with only a few
exceptions. Bacteria use mostly L-amino acids, although some of their amino acids are in the D form. Furthermore, humans produce
10 of the 20 standard amino acids and must obtain the other 10 from the diet. The 10 amino acids
biosynthesized in humans are derived from intermediates of the glycolytic pathway or the citric
acid cycle. Examples of these intermediates are α-ketoglutarate, pyruvate, and oxaloacetate, which are
transaminated (glutamate is a common amino donor) to produce amino acids. Overall, the 10 amino acids that
humans produce are alanine, asparagine, aspartate, cysteine, glutamate, glutamine, glycine, proline, serine and
tyrosine, while the other 10 amino acids are essential and must be consumed. These amino acids are arginine,
histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine. The amino
acids contain a variety of functional groups, such as hydroxyl (-OH), sulfhydryl or thiol (-SH), amine (-NH2),
carboxylic acid (-COOH), amide (-CONH2) and methyl (-CH3) groups, which are shown in Figure 1. These functional
groups can form tightly networked connections through noncovalent interactions: ionic bonds, hydrogen bonds and van der Waals
interactions. These functional groups also form regions that are hydrophobic or hydrophilic. Furthermore, the
sequence of amino acids determines how proteins will fold and interact.1
Proteins
Proteins are linear polymers of amino acids. They can contact one another and other macromolecules to
form complexes, and some proteins are more rigid while others are flexible. In addition, the sequence of amino
acids determines the folding of the protein, which is necessary for the protein's function and stability. Proteins
have many roles. They are enzymes (the biological catalysts of most reactions in living cells); they transport and
store oxygen, provide mechanical support, generate movement, and control virtually all cellular processes.1
In the 1950s and 1960s, it was found that sequences of the nucleotides A (adenine), G (guanine), T (thymine),
and C (cytosine) in DNA make up the genetic code, which specifies each amino acid in a protein's sequence.2 This
sequence of nucleotides in DNA specifies a complementary sequence of nucleotides in messenger RNA
(mRNA) in the process of transcription. One or more triplets of nucleotides, called codons, encode each of
the 20 amino acids. Every codon is recognized by a complementary triplet in a tRNA, called the anticodon.
The mRNA comes in contact with the tRNA on a ribosome; the ribosome translates
the codon sequence into a polypeptide chain, forming the protein. Translation specifies the amino acid sequence,
or primary structure, of proteins.1 Proteins are formed when the α-carboxyl group of one amino
acid forms a peptide bond (amide bond) with the α-amino group of another amino acid, creating a
dipeptide, with the loss of water. A series of amino acids joined by peptide bonds is called a polypeptide. The
biosynthesis of these peptides and proteins requires an input of free energy.
Figure 1: Structures and names of the 20 L-amino acids.3
The nascent sequence of amino acids arranges into several secondary structures (α-helix, β-pleated sheet, and
numerous turns) that make up its tertiary structure. This tertiary structure, which is held together by disulfide bridges,
hydrogen bonds and hydrophobic interactions, gives the protein its function. If the tertiary structure were
disrupted in any way, the protein would become inactive or severely compromised.4
Recombinant DNA and Post-translational modification
Recombinant DNA (rDNA) technology is being used for the biosynthesis of specific proteins.
Recombinant technology begins with the isolation of a gene of interest. The gene is then cleaved by a specific
endonuclease, called a restriction enzyme, at specific base sequences. Restriction enzymes are used to cleave
DNA molecules into specific fragments that are more readily analyzed and manipulated than the entire parent
molecule.1 The gene is then inserted into a vector by a DNA ligase and cloned. A vector is a piece of DNA that
is capable of independent replication, such as a bacterial plasmid or a phage, which can include inserts ranging
from 50 kb to even several thousand kilobases. Small samples of DNA can be amplified using PCR. The vector
must be cloned to produce numerous copies. Once the vector is produced in large quantities, it can be
introduced in the desired host such as yeast or special bacterial cells for expression. The cells will synthesize the
recombinant protein in large quantities that can be isolated and purified in large amounts.1
The downside of rDNA technology is that many post-translational modifications do not take
place in bacteria. Bacteria lack enzymes needed for post-translational modifications such as specific cleavage
of polypeptides, attachment of carbohydrate units, and chemical modification.1 These modifications
increase the diversity of functional groups beyond the side chains of the 20 proteinogenic amino acids
incorporated into nascent proteins. This diversity leads to new chemistry and new recognition patterns for partner
molecules, turns enzyme activity “on” and “off”, and controls the protein's life and location in the cell.1 Many
post-translational modifications occur in nature, including oxidation, sulfation, acylation, alkylation, glycation,
and many others.5 To meet the need for such modifications in recombinant proteins, a new method has been developed to
incorporate novel amino acids during translation by expanding the genetic code.
Unnatural Amino Acids
Unnatural amino acids fulfill a variety of roles. They are important because they represent an enormous
number of diverse structural elements for the development of new leads in peptide and non-peptide compounds,
shown in Figure 2. Small-molecule combinatorial libraries containing unnatural amino acid residues have
already played an important role in the discovery process. Novel, short-chain peptide ligand mimetics with both
enhanced biological activity and resistance to proteolytic cleavage are drug candidates in today’s
pharmaceutical pipelines. Optimized and fine-tuned analogues of peptidic substrates, inhibitors or effectors are
also excellent analytical tools for investigating signal transduction pathways or gene regulation.6 This
methodology can be applied to studies of protein structure and function in vitro and in vivo, as well as the
evolution of proteins with novel properties, including therapeutic peptides, proteins and vaccines.
In Figure 2, the list of novel unnatural amino acids provides numerous possibilities, such as HIV entry
inhibitors, generation of antibodies with high affinity for viral protein targets, and photocontrol of protein
phosphorylation in vivo in a spatially defined fashion. In addition, unnatural amino acids 21, 22, 23, 32, and 35 in Figure 2
(p-amino-L-phenylalanine, p-methoxy-L-phenylalanine, p-iodo-L-phenylalanine, p-bromo-L-phenylalanine, and
L-3-(2-naphthyl)alanine) were substituted for tyrosine 66 in GFP in E. coli. This residue was targeted because
conventional mutagenesis indicated that only an aromatic group could be substituted in this position.8 However,
in order to introduce these unnatural amino acids into proteins, which can be expressed in E. coli, yeast, and
mammalian cells, a unique tRNA-codon pair with its corresponding aminoacyl-tRNA synthetase is required
that correctly codes for the unnatural amino acid.7
In 2001, the first selective co-translational incorporation of an unnatural amino acid, O-methyl-L-tyrosine,
into proteins in E. coli in response to an amber nonsense codon was reported. As of 2010, 71 unnatural amino
acids have been, or are in the process of being, incorporated into proteins in response to a unique
triplet or quadruplet codon with high fidelity.9
The biosynthetic method for unnatural amino acids
The biosynthetic method relies on a unique codon-tRNA pair and corresponding aminoacyl-tRNA
synthetase (aaRS) for each unnatural amino acid, shown in Figure 3, that does not cross-react with any of the
endogenous tRNAs, aminoacyl-tRNA synthetases, amino acids or codons in the host organism. In practice, the
codon for the amino acid of interest is replaced, by conventional oligonucleotide-directed mutagenesis, with a
nonsense codon such as the amber codon TAG (UAG in RNA), which is not recognized by any of the common tRNAs
involved in protein synthesis in E. coli. The amber stop codon TAG is the least used of the three stop codons
in E. coli and yeast; it rarely terminates essential genes and is efficiently translated by amber suppressor tRNAs
in vivo and in vitro. Thus, the use of the TAG codon does not significantly alter the growth of the
organism. Other codons that can be used to encode new amino acids are the opal stop codon TGA, rare codons such as AGG,
and codons made of four nucleotides (such as AGGA), but these will not be discussed further.9
Figure 2. Novel amino acids that have been or are being genetically encoded in prokaryotic and eukaryotic
organisms.7
As a result, translation of the corresponding mRNA proceeds with the suppression of the stop codon by
a suppressor tRNA acylated with the desired amino acid. In the in vitro version of the method, a synthetic amber
suppressor tRNA is chemically aminoacylated with the desired unnatural amino acid (by way of the dinucleotide pdCpA,
which is ligated to the tRNA using T4 RNA ligase10) and is
then added to an in vitro transcription-translation system programmed with the mutagenized DNA. This results
in the specific incorporation of the unnatural amino acid at the position corresponding to the amber mutation in
the newly synthesized protein.11
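The effect of amber suppression on translation can be illustrated with a short Python sketch. It is only a toy model: the codon table is deliberately partial, the example mRNA is invented, and the function name `translate` and the placeholder symbol `X` for the unnatural amino acid are assumptions made for illustration. Nothing here models release factor competition or suppression efficiency.

```python
# Toy model of amber suppression (illustrative assumptions only).
# Partial codon table: just the codons used in the invented example below.
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "AAA": "K", "UUU": "F", "GGC": "G",
    "UAA": "*", "UGA": "*", "UAG": "*",  # the three stop codons; UAG is amber
}


def translate(mrna, amber_suppressor=False, unnatural="X"):
    """Translate an mRNA string codon by codon.

    If amber_suppressor is True, UAG is decoded as `unnatural` (standing in
    for a charged orthogonal suppressor tRNA) instead of ending translation.
    """
    protein = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and amber_suppressor:
            protein.append(unnatural)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":  # any unsuppressed stop codon terminates the chain
            break
        protein.append(residue)
    return "".join(protein)


# Invented message: AUG GCU UAG AAA UUU GGC UAA
mrna = "AUGGCUUAGAAAUUUGGCUAA"
truncated = translate(mrna)                           # amber read as stop
full_length = translate(mrna, amber_suppressor=True)  # amber read as X
print(truncated, full_length)
```

Without the suppressor the chain ends at the amber codon, which mirrors the truncated products seen when the unnatural amino acid is left out of the medium.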
Figure 3. Genetic incorporation of unnatural amino acids in live cells. The engineered orthogonal synthetase
aminoacylates the orthogonal tRNA with the desired unnatural amino acid, and the tRNA delivers the unnatural
amino acid in response to a unique codon (such as a stop or extended codon).12
The first orthogonal tRNA-aminoacyl-tRNA synthetase pair was derived from a tyrosyl-tRNA
synthetase (TyrRS)-tRNATyr pair from the archaeon Methanococcus jannaschii (Mj) because archaeal tRNAs
have distinct aminoacyl-tRNA synthetase recognition elements relative to their E. coli counterparts.9, 13 The
MjTyrRS has a small anticodon loop-binding domain, which makes it possible to alter the anticodon loop of its
tRNA with little loss in affinity for the synthetase. In addition, it lacks the editing mechanisms that can deacylate
an unnatural amino acid. Figure 4 shows the competition between E. coli release factor 1 and acylated suppressor
tRNA, which results either in release factor 1-dependent termination of protein synthesis or in read-through of
the UAG stop codon by the suppressor tRNA. By using an E. coli S-30 extract depleted in release factor 1, read-through
of the UAG stop codon is favored, as applied to DHFR and HIV-1 protease.14
An orthogonal tRNA-synthetase pair evolved from the MjTyrRS-tRNATyr pair has been used to
incorporate more than 50 new amino acids into proteins in E. coli.9 The use of orthogonal tRNA-aminoacyl-tRNA
synthetase pairs has made it possible to genetically encode many diverse amino acids, including those with
unique chemical and photochemical activity.15
To alter the specificity of the orthogonal synthetase so that it acylates the cognate tRNA with the unnatural
amino acid, a directed evolution approach was developed in which large libraries of synthetase variants were
passed through a series of stringent positive and negative selections. The library was initially generated by
randomizing five residues in the amino acid binding site of TyrRS to identify synthetase variants specific for
unnatural amino acids. The libraries were first transferred into cells containing a chloramphenicol
acetyltransferase (CAT) gene with an amber mutation at a permissive site and grown in media containing
chloramphenicol and the unnatural amino acid. The survivors contained synthetase variants that incorporate
either the unnatural or an endogenous amino acid in response to the amber codon. Selected synthetase clones were
then transferred into cells containing a toxic barnase gene with amber mutations at permissive sites and grown
in the absence of the unnatural amino acid. All clones that charged endogenous amino acids produced full-length
barnase protein and died. Repeated rounds of positive and negative selection yielded synthetases that incorporate
the unnatural amino acid in response to the amber codon and produce no full-length barnase in its absence.9
The selection process is illustrated in Figure 5.
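The logic of the two selections can be sketched as a pair of filters in Python. This is a schematic under stated assumptions, not a model of the actual experiment: each synthetase variant is reduced to the set of amino acid classes it can charge onto the orthogonal tRNA, and the variant names and the functions `positive_selection` and `negative_selection` are hypothetical.

```python
# Schematic of the double-sieve selection (hypothetical variants and names).
# Each variant is described only by what it can charge onto the orthogonal tRNA.
library = {
    "variant_A": {"unnatural"},             # the desired specificity
    "variant_B": {"natural"},               # charges an endogenous amino acid
    "variant_C": {"unnatural", "natural"},  # promiscuous
    "variant_D": set(),                     # inactive synthetase
}


def positive_selection(variants):
    """Growth on chloramphenicol WITH the unnatural amino acid present.

    Any variant that charges the tRNA with something suppresses the amber
    codon in the CAT gene and survives; inactive variants are lost.
    """
    return {name: charged for name, charged in variants.items() if charged}


def negative_selection(variants):
    """Growth WITHOUT the unnatural amino acid, with an amber-mutant barnase gene.

    Variants that charge a natural amino acid make full-length toxic barnase
    and die; variants that need the unnatural amino acid survive.
    """
    return {name: charged for name, charged in variants.items()
            if "natural" not in charged}


after_positive = positive_selection(library)
survivors = negative_selection(after_positive)
print(sorted(after_positive), sorted(survivors))
```

Only the variant that depends on the unnatural amino acid passes both filters, which is why the positive and negative rounds are alternated.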
Figure 4: Strategy employed for studying the site-specific incorporation of unnatural amino acids into DHFR
and HIV-1 protease.14
Figure 5: Diagram of evolving aminoacyl tRNA synthetases with new specificities.16
Incorporation of photoisomerizable unnatural amino acid into proteins expressed in E. coli.
Photochromic and photocleavable groups can be used to spatially and temporally control a variety of
biological processes, either by directly regulating the activity of the enzymes, receptors, or ion channels or by
monitoring the intracellular concentrations of various signaling molecules. Such control requires a
photoresponsive chemical group such as azobenzene or a nitrobenzyl group. The generation of an orthogonal tRNA-aminoacyl-tRNA
synthetase pair allows the selective incorporation of the photoisomerizable17 amino acid phenylalanine-4'-azobenzene (AzoPhe), which is shown in Figure 6, into proteins in response to the amber codon, TAG.18
Trans-azobenzene undergoes photochemical isomerization to the cis form upon irradiation. Because the two
isomers differ in geometry and dipole moment, placement of azobenzene in close proximity to the substrate-binding
site of an enzyme, receptor, or ion channel allows it to change the activity of the protein.
Figure 6: Interconversion of trans and cis isomers of AzoPhe.19
AzoPhe was synthesized by coupling of nitrosobenzene to N-Boc-p-aminophenylalanine followed by
Boc deprotection. Boc (tert-butyloxycarbonyl) is a group used to protect amines, including those of amino acids, from undesired chemical changes.
Boc is added under aqueous or anhydrous conditions by reaction of the amine with the anhydride
Boc2O in the presence of a base. Boc is removed (deprotection of the amine) with a strong acid, such as trifluoroacetic acid, followed
by washings.20, 21
To incorporate AzoPhe at defined sites in proteins, an orthogonal tRNA-aminoacyl-tRNA
synthetase pair was evolved that uniquely specifies AzoPhe in response to the amber codon, TAG. It was
found22 that MjTyrRS and a mutant tyrosyl amber suppressor tRNA (MjtRNATyrCUA) function efficiently in
protein translation in E. coli but do not cross-react with endogenous tRNAs or synthetases. To alter the
specificity of MjTyrRS so that it selectively recognizes only AzoPhe, a library of 10^9 TyrRS mutants
was created by randomizing six residues (Tyr-32, Leu-65, Phe-108, Gln-109, Asp-158, and Leu-162) in the
binding site of TyrRS. The crystal structure of MjTyrRS showed that these residues are very close to the aryl
ring of the bound tyrosine. Positive and negative selections were then made. In the positive selection, cell survival
depends on the suppression of an amber codon introduced at a permissive site in the CAT gene when cells
cotransformed with the plasmid pBK-lib2, encoding MjTyrRS and tRNATyr, are grown in the presence of 1 mM
AzoPhe and chloramphenicol. A surviving synthetase could charge the tRNA with either a natural amino acid or AzoPhe. In
the negative selection, the synthetase clones that survived the positive selection are therefore
transformed into cells containing the orthogonal tRNA and a gene that encodes the toxic barnase
protein with amber mutations introduced at three permissive sites (the selection scheme is shown in Figure 5). These
cells are grown without AzoPhe, so any clone that utilizes an endogenous amino acid produces full-length barnase and dies.
Only clones that incorporate AzoPhe, and nothing else, in response to the amber codon survive both selections.13
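The size of a six-residue library can be put in context with a little arithmetic. The sketch below assumes, purely for illustration, that each position is randomized with an NNK degenerate codon (N = A/C/G/T, K = G/T); the text does not state which randomization scheme was used, so the DNA-level count is an assumption, while the protein-level count follows directly from six positions and 20 amino acids.

```python
# Back-of-the-envelope library size for six randomized positions.
N_POSITIONS = 6
N_AMINO_ACIDS = 20

# Protein-level diversity: every position can hold any of the 20 amino acids.
protein_variants = N_AMINO_ACIDS ** N_POSITIONS

# DNA-level diversity, ASSUMING NNK codons (4 x 4 x 2 = 32 codons per position).
NNK_CODONS = 4 * 4 * 2
dna_variants = NNK_CODONS ** N_POSITIONS

print(f"protein variants: {protein_variants:,}")
print(f"NNK DNA variants: {dna_variants:,}")
```

Under that assumption the DNA-level count comes out at roughly a billion sequences, the same order of magnitude as the library size quoted for the TyrRS mutants.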
After five rounds of selection, 10 active synthetase clones converged on a single sequence
with the mutations Tyr32Gly, Leu65Glu, Phe108Ala, Gln109Glu, Asp158Gly, and Leu162His. In
the synthetase structure, these mutations create a large binding pocket that accommodates the azobenzene group. For
example, the bulky side chains of Tyr32 and Phe108 are replaced by Gly and Ala, two of the smallest amino acids.13
To determine whether AzoPhe was incorporated into proteins with high fidelity and efficiency, the codon for residue
75 of sperm whale myoglobin, carrying a His6 tag, was replaced with the amber codon. Protein expression was then
carried out in the presence of the synthetase AzoPheRS and MjtRNATyrCUA in E. coli grown in minimal media
supplemented with 1 mM AzoPhe. As a negative control, protein synthesis was carried out in the absence of
AzoPhe.13 Protein was expressed only in the presence of AzoPhe.
Application of the AzoPhe photolabel in Catabolite Activator Protein (CAP)
To further characterize the photochemical properties of proteins containing AzoPhe, the amino acid was
introduced into the E. coli CAP. CAP is a homodimeric bacterial transcriptional activator that regulates a
number of catabolite-sensitive operons in E. coli. It regulates their expression by binding near the
operon's promoter.19 When cAMP binds to CAP, conformational changes occur in the protein, increasing its binding
affinity for the promoter and resulting in increased transcription from CAP-dependent promoters. To determine
whether CAP can be properly photo-labeled in E. coli, an amber codon was introduced at the
position of Leu125, a residue located at the dimerization interface; a C-terminal His6 tag was added, and this
mutant was expressed in rich medium in the presence of 1 mM AzoPhe, AzoPheRS, and MjtRNATyrCUA. Ni-NTA
purification, which is designed for His6-tagged recombinant proteins expressed in bacterial,
insect, and mammalian cells, was applied. Ni-NTA agarose has high affinity and selectivity for recombinant
fusion proteins that are tagged with six tandem histidine residues. About 1.5 mg of mutant CAP was obtained
per liter of culture, compared to 3 mg/L for wild-type CAP. The UV-visible spectrum of this mutant
protein shows a distinct absorbance peak at 334 nm corresponding to the trans-azobenzene
chromophore, shown in Figure 7. Irradiation of the CAP mutant at 25 °C led to a decrease of the peak at 334 nm
and an increase of the 420 nm peak, also shown in the inset in Figure 7, which is consistent with a photostationary state of 45% trans- and 55% cis-azobenzene. The photoisomerization between
cis- and trans-AzoPhe is thus reversible.18
Figure 7: Absorption spectra of mutant CAP (CAP125TAG; 50 µM) in 50 mM sodium phosphate, 300 mM
NaCl, 250 mM imidazole, pH 8.0 buffer: red, after Ni-NTA purification and prior to irradiation; blue,
irradiation at 334 nm, 40 min; black, subsequent irradiation with 420 nm light, 40 min.19
Incorporation of a sulfated tyrosine residue
Another example of the incorporation of unnatural amino acids into proteins is the sulfated tyrosine residue.
Here, sulfotyrosine, shown in Figure 8, is incorporated into proteins by genetically encoding the
modified amino acid in response to an amber nonsense codon, TAG.23
Figure 8: Structure of Sulfo-tyrosine.23
Tyrosine sulfation is a common post-translational modification in secreted and membrane-bound
proteins24 in eukaryotes. Although not much is known about the broader biological functions of sulfotyrosine, it has been observed in many protein-protein interactions. For example, tyrosine sulfation is involved
in the coagulation cascade.17 Sulfotyrosine has also been found in many clotting factors and natural thrombin
inhibitors such as the leech-secreted anticoagulant hirudin.
The obstacle in determining sulfation function is the  60 known and over 2100 predicted proteins
containing sulfotyrosine. This large number makes it difficult to synthesize selectively sulfated proteins. The
traditional methods for synthesizing sulfated proteins25 lacked generality. The new technique again uses a
selective orthogonal tRNA/aminoacyl-tRNA synthetase pair that allows efficient incorporation of
sulfotyrosine into proteins expressed in E. coli in response to the amber nonsense codon.15, 17
A library of active-site mutants of MjTyrRS, which charges an engineered nonsense suppressor
MjtRNATyrCUA that is not recognized by E. coli synthetases, was used. Again, positive and negative selections were
made.13 In this case, survival in the positive selection depends on suppression of the amber mutation in the
CAT gene in the presence of 2 mM sulfotyrosine, while survival in the negative selection is contingent upon
inadequate suppression of three amber mutations in a gene encoding the toxic barnase protein in the absence of
sulfotyrosine. Only clones that incorporate sulfotyrosine, and no endogenous amino acid, in response to the amber codon survive both selections.
After these selections, clones were identified that harbored the CAT reporter gene with the amber mutation at a
permissive site. In the absence of sulfotyrosine, the same cells did not grow on 20 µg/ml chloramphenicol,
which is consistent with efficient sulfotyrosine incorporation with little to no background from incorporation of
endogenous amino acids.23 Sequencing revealed that these synthetase clones converged upon one mutant,
StyrRS (Tyr32Leu, Leu65Pro, Asp158Gly, Ile159Cys, and Leu162Lys). A possible rationale for these
substitutions is that Lys162 forms a salt-bridge interaction with the sulfotyrosine
SO3- group, while Leu32 and Gly158 accommodate the larger SO3- group and remove affinity for endogenous tyrosine,
because Tyr32 and Asp158 are involved in hydrogen bonding to the tyrosine phenolic group in the wild-type
synthetase. The replacement of Asp158 with glycine also eliminates unfavorable electrostatic interactions with
sulfotyrosine.23
To verify the incorporation of sulfotyrosine by the selected synthetase StyrRS, an amber
mutant (residue 7) of the Z-domain protein with a C-terminal His6 tag was expressed in E. coli carrying
plasmids for the amber mutant Z-domain, MjtRNATyrCUA, and StyrRS, in medium with 2 mM sulfotyrosine. After Ni-NTA
purification, a strong band formed only in the presence of 2 mM sulfotyrosine in the medium, as shown in Figure
9. No band formed in the absence of sulfotyrosine, which confirmed the dependence of the amber suppression
on sulfotyrosine.12, 23
Figure 9: Ni-NTA purified cell lysate from cells expressing Z-domain with an amber codon at position 7 run on
a denaturing PAGE gel stained with Coomassie blue.23
For further characterization, matrix-assisted laser desorption ionization/time-of-flight (MALDI-TOF)
analysis of the purified mutant Z-domain was used. As shown in Figure 10, a predominant [M+H] peak at 7,876
Da (theoretical 7,877.5 Da) appeared, corresponding to the Z-domain containing a single sulfotyrosine and
lacking methionine. A minor peak (less than 10%) at 7,798 Da (theoretical 7,797.5 Da) corresponds to the loss of sulfate
from sulfotyrosine during MALDI-TOF analysis. This result was also confirmed by the SDS-PAGE analysis
shown in Figure 9.
Figure 10: Positive-ion MALDI-TOF spectra (generated using THAP matrix) of Ni-NTA purified cell lysate
(concentrated and dialyzed against water) showing a peak corresponding to full-length Z-domain containing a
single sulfotyrosine and lacking methionine. Also observed is a peak corresponding to loss of sulfate resulting
from mass spectral analysis conditions.23
Sulfo-Hirudin
Recombinant hirudin (desulfo-hirudin) has been of clinical interest as a thrombin inhibitor, but it is less
potent as an anticoagulant than the native protein. One of the most important applications of the incorporation
of sulfotyrosine into a protein was therefore the case of hirudin.
Sulfo-hirudin was expressed by cloning the StyrRS gene into the pSup vector backbone
containing 6 copies of MjtRNATyrCUA with optimized promoters.21 The hirudin gene with an amber codon at
position 63 and a gIII periplasmic signal sequence was synthesized and inserted into the pBAD vector. After cotransformation of DH10B E. coli cells with both plasmids, expression in liquid glycerol minimal medium
supplemented with 10 mM sulfotyrosine was carried out.23 Because hirudin is small, direction into the
periplasm effectively results in secretion from E. coli; thus, sulfo-hirudin was purified directly from the
concentrated secreted medium by fast protein liquid chromatography using a Q-matrix sepharose anion-exchange column followed by size-exclusion chromatography to yield 5 mg/L of sulfo-hirudin.
The resulting hirudin was first analyzed by SDS-PAGE, displayed in Figure 11. As shown, the band
furthest down the gel is sulfo-hirudin because of the sulfate group's negative charge. MALDI-TOF analysis,
shown in Figure 12, indicates the correct masses of sulfo-hirudin (7,059 Da; Mtheoretical, 7,059.5 Da) and
desulfo-hirudin (6,979 Da; Mtheoretical, 6,979.5 Da), with small peaks to the right of the large peaks. These small
peaks represent sodium adducts arising from the column.23
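The mass assignments quoted for the Z-domain and for hirudin can be cross-checked with simple subtraction: the sulfated and desulfated forms should differ by the mass of one SO3 group. The Python sketch below uses only the masses given in the text; the SO3 mass is computed from standard average atomic weights (S about 32.06, O about 15.999), which is an assumption brought in from outside the text, and the dictionary labels are illustrative.

```python
# Sulfated vs. desulfated masses (Da) as quoted in the text.
masses = {
    "Z-domain (theoretical)": (7877.5, 7797.5),
    "Z-domain (observed)": (7876.0, 7798.0),
    "hirudin (theoretical)": (7059.5, 6979.5),
    "hirudin (observed)": (7059.0, 6979.0),
}

# Average mass of SO3 from standard atomic weights (assumed values).
SO3_AVERAGE_MASS = 32.06 + 3 * 15.999

# Mass difference attributable to the sulfate group in each case.
deltas = {label: sulfated - desulfated
          for label, (sulfated, desulfated) in masses.items()}

for label, delta in deltas.items():
    print(f"{label}: difference {delta:.1f} Da, "
          f"deviation from SO3 {delta - SO3_AVERAGE_MASS:+.1f} Da")
```

All four differences fall within about 2 Da of the expected value, which is consistent with the assignment of the lower-mass peaks to loss of sulfate.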
Figure 11: Desulfo-hirudin (left) and sulfo-hirudin (right) migration on a denaturing PAGE gel stained with
Coomassie blue. Size of hirudin cannot be judged by molecular weight standards because of hirudin’s atypical
charge.23
Figure 12: MALDI-TOF analysis showed the correct [M+H] masses for both sulfo-hirudin (7,059 Da;
Mtheoretical, 7,059.5 Da) and desulfo-hirudin (6,979 Da; Mtheoretical, 6,979.5 Da).23
It was suspected that the high amount of truncated protein, shown in Figure 13, is due to the low
permeability of E. coli cells to anionic sulfotyrosine, resulting in a decreased amount of MjtRNATyrCUA charged
with the amino acid.23 Two possibilities were considered for the second (desulfo-hirudin) peak: either desulfo-hirudin was produced during expression, or the peak arose during the spectral
analysis. To test the first possibility, a 10% greater ionic strength was applied to the anion-exchange column
to capture sulfo-hirudin while desulfo-hirudin elutes, which allows complete separation of
desulfo-hirudin and sulfo-hirudin if both are present. The anion-exchange
column showed no peak for desulfo-hirudin, meaning that no desulfo-hirudin was produced when sulfo-hirudin was being expressed. The second experiment was a MALDI-TOF analysis of the biosynthetic products obtained in the
absence of sulfotyrosine. No full-length protein resulted, only truncated protein with a [M+H]
peak of 6,578 Da, because TAG then acts as a stop codon, as can be observed in Figure 14.
Figure 13: MALDI-TOF spectra of unpurified sulfo-hirudin media corresponding to expression in the presence
of sulfotyrosine demonstrating the peak ratio of truncated to full-length sulfo-hirudin. Good detection of crude
sample mixture is observed because of harsher application of conditions. Only the ionized form of sulfo-hirudin
is observed.2
Figure 14: MALDI-TOF spectrum of unpurified sulfo-hirudin expression media corresponding to expression in
the absence of sulfotyrosine. Only truncated proteins can be observed.23
Structure and kinetic analysis of inhibition of thrombin with sulfohirudin
The X-ray structure of the sulfohirudin-thrombin complex at 1.84 Å resolution showed the critical
interactions between the sulfate group on Tyr63 and numerous amino acids at the external binding site of human
α-thrombin, as shown in Figure 15.
Figure 15: Xray structure of the sulfohirudin-thrombin complex.17
The kinetics of human α-thrombin inhibition by the biosynthesized sulfohirudin were studied using a
fluorogenic enzyme assay based on the single progress curve method.26, 27 Either desulfo- or sulfo-hirudin
(100 pM) was mixed with 50 μM of fluorogenic substrate, and α-thrombin was then added to initiate
the reaction. Cleavage of the substrate by thrombin, which had been inhibited to different degrees by the
addition of either sulfo- or desulfo-hirudin, was observed, and the fluorescence versus time data points were
plotted in Figure 16. The exact concentrations of desulfo- and sulfo-hirudin were determined by titration
against thrombin, based on 1:1 binding. The results come from fitting equation (1) to the data:28, 10
$$P = v_s t + \frac{(1-\gamma)(v_0 - v_s)}{\lambda\gamma}\,\ln\!\left(\frac{1-\gamma e^{-\lambda t}}{1-\gamma}\right)$$
(Eq 1)
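where P is the amount of product formed at time t, and $v_0$ and $v_s$ are the initial and steady-state velocities of the
reaction. In equation 1, $v_s$, $\gamma$, and $\lambda$ depend on the kinetic constants; $\gamma$ and $\lambda$ are given by the following expressions:
$$\gamma = \frac{K_i' + E_t + I_t - Q}{K_i' + E_t + I_t + Q}, \quad \text{and}$$
$$\lambda = k_{on} Q,$$
where $K_i' = K_i\,(1 + S/K_m)$ and $Q = \left[(K_i' + E_t + I_t)^2 - 4E_tI_t\right]^{1/2}$. Here $E_t$ and $I_t$ are the total enzyme and inhibitor concentrations, $S$ is the substrate concentration, and $K_m$ is the Michaelis constant.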
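To make the roles of these quantities concrete, the progress curve can be evaluated numerically. The sketch below assumes the standard slow, tight-binding form of equation 1, P = v_s·t + [(1 − γ)(v_0 − v_s)/(λγ)]·ln[(1 − γe^(−λt))/(1 − γ)], with Q taken as the square root of (Ki' + Et + It)² − 4EtIt. The function names are hypothetical, and the values used for Km, Et, v0, and vs are illustrative placeholders rather than values from the study; only Ki, kon, and the inhibitor and substrate concentrations come from the text.

```python
import math


def apparent_ki(k_i, s, k_m):
    """Ki' = Ki * (1 + S/Km): inhibition constant corrected for substrate competition."""
    return k_i * (1.0 + s / k_m)


def gamma_and_lambda(k_i, k_on, e_t, i_t, s, k_m):
    """Return (gamma, lambda) from the expressions that accompany equation 1."""
    ki_app = apparent_ki(k_i, s, k_m)
    total = ki_app + e_t + i_t
    q = math.sqrt(total ** 2 - 4.0 * e_t * i_t)  # Q, assumed to be a square root
    return (total - q) / (total + q), k_on * q


def product_formed(t, v0, vs, gamma, lam):
    """Equation 1: product formed at time t for slow, tight-binding inhibition."""
    amplitude = (1.0 - gamma) * (v0 - vs) / (lam * gamma)
    ratio = (1.0 - gamma * math.exp(-lam * t)) / (1.0 - gamma)
    return vs * t + amplitude * math.log(ratio)


# Ki, kon, It and S follow the text; Km, Et, v0 and vs are hypothetical.
gamma, lam = gamma_and_lambda(
    k_i=26e-15, k_on=0.95e8, e_t=20e-12, i_t=100e-12, s=50e-6, k_m=5e-6
)
curve = [product_formed(t, v0=1.0, vs=0.1, gamma=gamma, lam=lam)
         for t in range(0, 601, 100)]
print(gamma, lam)
print(curve)
```

With this form, P is zero at t = 0, the initial slope equals v0, and the slope decays toward vs at a rate set by λ, which is the behavior a fit to the fluorescence data relies on to extract Ki and kon.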
These equations were used to determine Ki and kon. The Ki values are 26 fM for sulfo-hirudin and 307 fM for
desulfo-hirudin and are in agreement with the literature value for the natural form of hirudin (sulfo-hirudin).5, 28 As
expected, kon for sulfo-hirudin (0.95 × 10⁸ M⁻¹ s⁻¹) is greater than that for desulfo-hirudin (0.38 × 10⁸ M⁻¹ s⁻¹),
whereas koff for sulfo-hirudin (0.22 × 10⁻⁵ s⁻¹) is smaller than that for desulfo-hirudin (1.18 × 10⁻⁵ s⁻¹).29 The value of
koff is the product of kon and Ki.
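The relationship koff = kon × Ki can be checked directly against the reported constants. The short Python sketch below uses a hypothetical helper name and the rounded values quoted above, so small deviations from the reported koff values are to be expected.

```python
def k_off(k_on, k_i):
    """Dissociation rate constant (s^-1) as the product of kon (M^-1 s^-1) and Ki (M)."""
    return k_on * k_i


# Rounded constants quoted in the text (fM converted to M).
sulfo_koff = k_off(0.95e8, 26e-15)
desulfo_koff = k_off(0.38e8, 307e-15)

print(f"sulfo-hirudin   koff ~ {sulfo_koff:.2e} s^-1")
print(f"desulfo-hirudin koff ~ {desulfo_koff:.2e} s^-1")
```

The products come out near 0.25 × 10⁻⁵ s⁻¹ and 1.17 × 10⁻⁵ s⁻¹, close to the reported 0.22 × 10⁻⁵ s⁻¹ and 1.18 × 10⁻⁵ s⁻¹; the small difference for sulfo-hirudin presumably reflects rounding of the quoted constants.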
Figure 16: a.) Plots of thrombin inhibition by hirudin with their respective fitted progress curves superimposed
on the raw data points. b.) Magnified curves for 600 sec inhibition.23
The co-translational incorporation of sulfotyrosine into hirudin, with efficient expression in E. coli, makes
possible the biosynthesis of other sulfated proteins such as antibodies, chemokine receptor motifs, and
clotting factors. In addition, in vivo strategies for generating sulfated libraries and for phage display of
sulfated proteins provide opportunities that were not previously available through peptide synthesis, native
chemical ligation, or expressed protein ligation. This strategy could also be applied to the
expression of tyrosine-sulfated proteins in eukaryotic organisms.23
Conclusion:
These were pioneering studies in incorporating modified amino acids into proteins using molecular
genetics. In these experiments, a twenty-first amino acid was incorporated into an organism, which allowed
its abilities under growth conditions to be explored in comparison with a twenty amino acid bacterium. This
methodology has been extended to expression in yeast and mammalian cells and has been applied to multicellular organisms, including a twenty-one amino acid transgenic mouse.
A consensus-based approach has also been developed for generating new orthogonal tRNA-synthetase
pairs and has been used, together with a novel selection scheme that identifies efficient four-base codon decoding
tRNAs, to genetically encode novel amino acids in bacteria in response to four-base codons. In addition,
redundancy in the existing genetic code of E. coli has been removed so that additional amino acids can be encoded. Finally,
the application of this methodology to studies of protein structure and function in vitro and in vivo will include
the evolution of proteins with new properties, including therapeutic peptides, proteins and vaccines. Today,
because of research into unnatural amino acids, there are revolutionary applications to: break immunological
tolerance to self-proteins (using immunogenic amino acids); generate highly potent HIV entry inhibitor peptides
(based on metal ion mediated oligomerization); generate antibodies with high affinity for viral protein targets
(using phage-displayed libraries containing unnatural amino acids); generate bifunctional antibodies as
anticancer agents (modified with ligands and other proteins); generate sequence specific DNA binding/cleaving
proteins; and to photocontrol protein phosphorylation in vivo in a spatial and temporal specific fashion.15
Bibliography:
1) Berg, J. M.; Tymoczko, J. L.; Stryer, L. Biochemistry; 6th Edition; W. H. Freeman and Co: New
York, NY, 2008; pp 27.
2) Nirenberg, M. W.; Matthei, H. J. Med. World News. 1962, 3(1), 18.
3) Gomperts, D. B.; Kramer, M. I.; Tatham, P. E. R. Signal Transduction: 2nd Ed.; Elsevier Inc: San
Diego, 2009.
4) Pauling. L.; Corey, R.B.; Branson, H.R. Proc Natl Acad Sci. 1951, 37 (4), 205.
5) Walsh, C. T. Posttranslational Modification of Proteins: Expanding Nature’s Inventory; Roberts
and company: Greenwood, CO, 2006, pp 1.
6) Liu, C. C.; Schultz, P. G. Nature Biotechnology. 2006, 24, 1436.
7) Reddi, O. S. Recombinant DNA Technology: A Laboratory Manual; Allied Publishers:
Mayapuri, New Delhi, 2000; pp 1.
8) Wang, L.; Xie, J.; Deniz, A. A.; and Schultz, P. G. J. Org. Chem. 2003, 68 (1), pp 174
9) Xie, J.; Schultz, P. G. Nat. Rev. Mol. Cell Biol. 2006, 7, 775.
10) Thorson, J. S.; Cornish, V. W.; Barrett, J. E.; Cload, S. T.; Yano, T.; Schultz, P. G. Methods in
Molecular Biology. 1998, 77, 43; Young S. T; Schultz P. G. J Biol Chem. 2010, pp. 1.
11) Short, G. F., et al., Hecht, S. M. Biochemistry. (2000), 39, 8768.
12) Wang, L. Wang Lab. Salk Institute for biological studies. http://wang.salk.edu/research.php
(accessed Jan 17, 2010).
13) Ryu, Y.; Schultz, P. G. Nat. Methods. 2006, 3, 263.
14) Short, G. F.; Golovine, S. Y.; Hecht, S. M.; Biochemistry. 1999, 38, 8808.
15) Moore, K. L. J. Biol. Chem. 2003, 278, 24243.
16) Schultz, P. G. The Schultz Lab. The Scripps Research Institute.
http://schultz.scripps.edu/research.html (accessed Dec 26, 2009)
17) Liu, C. C.; Brustad, E.; Liu, W.; Schultz, P. G. J. Am. Chem. Soc. 2007, 129, 10648.
18) Robert M. Williams, Peter J. Sinclair, Duane E. DeMong, Daimo Chen, and Dongguan Zhai,
Org. Synth. 2003, 80, 18.
19) Bose, M.; Groff, D.; Xie, J.; Eric, B.; Schultz, P. G. J. Am. Chem. Soc. 2005, 128, 388.
20) Organic Chemistry Portal. Protecting Groups. http://www.organicchemistry.org/protectivegroups/amino/boc-amino.htm (accessed Dec 26, 2009).
21) D. M. Shendage, R. Fröhlich, G. Haufe. Org. Lett. 2006, 6, 3675.
22) (a)Weber, I. T.; Steitz, T. A. J. Biol. Chem. 1987, 198, 311. (b) McKay, D. B.; Weber, I. T.;
Steitz, T. A. Science 1991, 253, 1001.
23) Kehoe, J. W.; Bertozzi, C. R. Chem. Biol. 2000, 7, R57.
24) Xie, J.; Schultz, P. G. Current Opinion in Chemical Biology. 2005, 9, 548.
25) Noren, C. J.; Anthony-Cahill, S. J.; Griffith, M. C.; Schultz, P. G. Science. 1989, 244, 182.
26) Cha, S. Biochem. Pharmacol. 1976, 25, 2695.
27) Komatsu, Y.; Misawa, S.; Sukesada, A.; Ohba, Y.; & Hayashi, H. Biochem. Biophys. Res.
Commun. 1993, 196, 773.
28) Stone, S. R.; & Hofsteenge, J. Biochemistry. 1986, 25, 4622.
29) Fried, M. G.; Crothers, D. M. J. Mol. Biol. 1984, 172, 241.