77
Biomolecular chemistry
4. From amino acids to proteins
Suggested reading: Sections 14.5 to 14.8 and Sections 2.1 to 2.4 of
Mikkelsen and Cortón, Bioanalytical Chemistry
Primary Source Material
• Chapters 4 and 12 of Introduction to Genetic Analysis: Anthony J.F. Griffiths, Jeffrey H.
Miller, David T. Suzuki, Richard C. Lewontin, William M. Gelbart (courtesy of the NCBI
bookshelf).
• Chapters 4, 4 and 6 of Biochemistry: Berg, Jeremy M.; Tymoczko, John L.; and Stryer,
Lubert (courtesy of the NCBI bookshelf).
• Chapters 3 and 7 of Molecular Cell Biology: Lodish, Harvey; Berk, Arnold; Zipursky, S.
Lawrence; Matsudaira, Paul; Baltimore, David; Darnell, James E. (courtesy of the NCBI
bookshelf).
• ExPASy: online course on Principles of Protein Structure
• Many figures and the descriptions for the figures are from the educational resources
provided at the Protein Data Bank (http://www.pdb.org/)
• Most of these figures and accompanying legends have been written by David S. Goodsell
of the Scripps Research Institute and are being used with permission. I highly recommend
browsing the Molecule of the Month series at the PDB
(http://www.pdb.org/pdb/101/motm_archive.do).
Where are we and how did we get here?
78
We are
here!
• We are done with the Central Dogma and now we move into the realms of protein structure
and function. The Central Dogma only relates to the flow of genetic information, not to the
function of biological macromolecules.
Proteins come in all shapes and sizes
79
http://www.rcsb.org/pdbstatic/education_discussion/molecule_of_the_month/poster_quickref.pdf
• Proteins are diverse and versatile ‘nano’ structures and machines
• Large number of potential combinations
• There is a relatively large number of amino acids (a.a.) which you can use to
construct a protein.
• Includes 20 common a.a.’s plus numerous post-translational modifications.
• A 200-amino-acid protein could have 20^200 (20 to the 200th power) possible sequences.
• Structurally versatile
• Polypeptide backbone can adopt a variety of conformations
• Many conformers of side chains
• Secondary structural elements can pack together in a wide variety of orientations
• Various states of homo- and hetero- oligomerization
• Proteins can bind prosthetic groups or cofactors (non-protein)
• Heme
• Metal ions
• Flavins
• Structurally dynamic
• Allosteric activation
• Active and inactive forms
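As a rough illustration of the sequence-space estimate on this slide, the Python sketch below counts the possible sequences for a chain of a given length. The function name `sequence_space` is hypothetical, and the count assumes only the 20 common amino acids (post-translational modifications would make the number larger still).

```python
import math

def sequence_space(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences of `length` residues."""
    return alphabet_size ** length

# The 200-residue example from the slide.
n = sequence_space(200)
print(f"20^200 has {len(str(n))} decimal digits")       # 261 digits
print(f"log10(20^200) = {200 * math.log10(20):.1f}")    # about 260.2
```

Python integers are arbitrary precision, so the exact value can be computed directly; the point is simply that the number is far too large for nature (or anyone else) to have sampled exhaustively.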
The structure of a protein is determined by the linear sequence of amino acids (1º structure)
80
Ribonuclease
An unfolded protein can be refolded in vitro. This demonstrates that the information
needed to specify the tertiary structure is fully contained in the primary sequence.
http://www.users.csbsju.edu/~hjakubow/classes/rasmolchime/01ch331finproj/Rnase/templateprot.htm
• The classic work of Christian Anfinsen in the 1950s on the enzyme ribonuclease revealed
the relation between the amino acid sequence of a protein and its conformation. For this
work he was awarded the Nobel Prize in Chemistry in 1972. Anfinsen discovered that:
• Ribonuclease is a single polypeptide chain consisting of 124 amino acid residues
cross-linked by four disulfide bonds.
• Agents such as urea or guanidinium chloride effectively disrupt the noncovalent
bonds.
• The disulfide bonds can be cleaved reversibly by reducing them with a reagent such
as β-mercaptoethanol.
• When ribonuclease was treated with β-mercaptoethanol in 8 M urea, the product was
a fully reduced, randomly coiled polypeptide chain devoid of enzymatic activity. In
other words, ribonuclease was denatured by this treatment.
• Anfinsen then made the critical observation that the denatured ribonuclease, freed of urea
and β-mercaptoethanol by dialysis, slowly regained enzymatic activity. All the measured
physical and chemical properties of the refolded enzyme were virtually identical with those
of the native enzyme.
• These experiments showed that the information needed to specify the catalytically active
structure of ribonuclease is contained in its amino acid sequence.
Ala showing L-stereochemistry
81
http://www.neb.com/neb/sitemap/sitemap_5-1-10.html
The 20 common amino acids
• 20 different common amino acids, differing only in the side chain.
• Note that stereochemistry at Cα has not been indicated in this figure.
• All natural a.a.’s are in the L-configuration (glycine is achiral).
• A more general system of stereochemical designation is the R/S system. The L-configuration nearly always corresponds to S in the R/S system. The exception is L-cysteine, which is R.
• You might want to keep this sheet handy as a reference.
• I will often use the one letter codes and you should learn these.
• Most are easy, but I find D, E, N and Q the most tricky to remember.
• Q: Do we need to memorize the structure and names of amino acids on the test?
• A: Yes. You should know the structure, name, 3 letter abbreviation, and 1 letter code of all
the common amino acids.
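For practising the abbreviations, a small lookup table can help. The sketch below is only a study aid: the dictionary holds the standard three-letter and one-letter codes of the 20 common amino acids, while the names `THREE_TO_ONE` and `to_one_letter` are hypothetical.

```python
# Standard three-letter -> one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(residues):
    """Convert a list of three-letter codes (any case) to a one-letter string."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)

# The four codes singled out above as the trickiest to remember.
print(to_one_letter(["Asp", "Glu", "Asn", "Gln"]))  # DENQ
```

The input is read in the N to C direction, so the output string follows the usual convention for writing protein sequences.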
Amino acid classification by property
82
http://www.imb-jena.de/image_library/GENERAL/aa/mut1.jpg
http://www.imb-jena.de/image_library/GENERAL/aa/chemprop.jpg
• Various simple textbook classifications for a.a.’s
• e.g. small, nucleophilic, hydrophobic, aromatic, acidic, amide, basic
• e.g. aliphatic, non-polar, aromatic, polar, charged -ve, charged +ve
• However, no simple classification can properly capture the diversity of a.a. interactions and
properties.
• The same amino acid in different charge states can go from polar to nonpolar (H or K
for example)
• Different portions of the same amino acid can have different properties (aliphatic
chain vs. guanidinium of arginine)
• Generally find aliphatic/hydrophobic residues inside proteins and polar/charged on the
surface of proteins
• Notes:
• Cysteine is special because it is the best nucleophile, is the most easily oxidized, and can form
disulphide bonds.
• Proline has a tertiary as opposed to secondary amide nitrogen and induces a bend in
the polypeptide chain.
• Threonine and isoleucine have chiral carbons in their side chains.
Free amino acids are almost always zwitterions
83
Commentary on the topic of zwitterions: http://bip.cnrs-mrs.fr/bip10/zwitter.htm
• Amino acids in solution at neutral pH exist as dipolar ions (zwitterions).
• In the dipolar form, the amino group is protonated (-NH3+) and the carboxyl group is
deprotonated (-CO2-).
• Under almost any conceivable physiologically relevant conditions, the amino and
carboxylate group of a free amino acid will be in its charged state.
• This is also true of a polypeptide chain: the N-terminus and the C-terminus will be in the
charged states
• Possible exceptions
• Groups buried in the interior of proteins or lipid bilayers
• Proteins in the stomach
pKa values of protein functional groups
84
• Seven of the 20 amino acids have readily ionizable side chains. These 7 amino acids are able to donate or accept
protons to facilitate reactions as well as to form ionic bonds.
• The above table gives equilibria and typical pKa values for ionization of the side chains of tyrosine, cysteine,
arginine, lysine, histidine, and aspartic and glutamic acids in proteins.
• Two other groups in proteins—the terminal α-amino group and the terminal α-carboxyl group—can be ionized.
• You should know the approximate values for all of these ionizable groups. It is safe to say that all carboxylic acids
in proteins have a pKa of about 3-4.
• Q: What is so special about histidine? It has a pKa of ~6, but did you mention that it does not react with anything
much?
• A: Histidine is very good at donating and accepting protons at physiological pH. This is a very important part of
many enzyme mechanisms. I may have mentioned that histidine is not such a good nucleophile. For enzyme
mechanisms that involve a nucleophilic attack on the substrate, cysteine would be the best amino acid, followed by
lysine.
• Q. Are proteins buried in lipid bilayers charged at one terminus or not at all? If only one terminus is charged, which one is
it?
• A. The N-terminus is always positively charged and the C-terminus is always negatively charged under normal pH
conditions (near neutral). Under some circumstances, such as when the N- or C-terminus is buried in a very
hydrophobic environment, I suppose they could be uncharged. The pKa of an ionizable group is going to depend
on its environment.
• Q. Proteins in the stomach are charged at their N-termini, am I right?
• A. I believe that the stomach is very low pH, like 2-3. At such low pH, practically every group in proteins will be
protonated. It is close to the pKa for the C-terminus, so it might be partially protonated.
• Q. Will the pKa values of the amino acids be given on the test or not?
• A. They won't be provided. You should know which residues are positively and negatively charged at neutral pH.
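The reasoning in these answers (which groups are protonated at neutral pH versus stomach pH) follows from the Henderson-Hasselbalch relationship. The sketch below is a minimal illustration under stated assumptions: the pKa of 6.0 for histidine comes from the "~6" quoted above, 3.5 is simply the midpoint of the "about 3-4" range given for carboxylic acids, pH 2 is taken from the "2-3" quoted for the stomach, and the function name `fraction_protonated` is hypothetical.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of an ionizable group in its protonated form.

    From Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]),
    so [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Assumed, approximate pKa values (see the lead-in above).
PKA_HIS = 6.0        # histidine side chain
PKA_CARBOXYL = 3.5   # Asp/Glu side chain or C-terminus

for pH in (2.0, 7.0):
    his = fraction_protonated(pH, PKA_HIS)
    acid = fraction_protonated(pH, PKA_CARBOXYL)
    print(f"pH {pH}: His {his:.1%} protonated, carboxyl {acid:.1%} protonated")
```

At pH 7 the carboxyl group is almost entirely deprotonated (negatively charged), whereas at pH 2 it is mostly protonated, in line with the comment that practically every group is protonated in the stomach. Because a pKa shifts with its environment, these numbers are only indicative.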
An oligopeptide
85
• Oligopeptide: A compound made up of the condensation of a small number (typically less
than 20) of amino acids
• Polypeptide: A compound made up of the condensation of more than ~20 amino acids
• Each type of protein differs in its sequence and number of amino acids. It is the particular
sequence of the various side chains that makes each protein distinct.
• The two ends of a polypeptide chain are chemically different: the end carrying the free
amino group (NH3+, sometimes incorrectly written as NH2) is the amino, or N-terminus,
and that carrying the free carboxyl group (CO2-, sometimes incorrectly written as CO2H) is
the carboxyl, or C-terminus.
• The amino acid sequence of a protein is always presented in the N to C direction, reading
from left to right. This corresponds to the 5’ to 3’ direction in which genes are read.
The peptide bond is planar
86
• Linus Pauling and Robert Corey analyzed the geometry and dimensions of the peptide
bonds in the crystal structures of molecules containing one or a few peptide bonds.
• This analysis led Pauling to correctly predict the existence and structure of the alpha helix
and beta sheets (for which he was awarded the 1954 Nobel Prize in Chemistry).
• The take home message is that the secondary structure elements of proteins can be
predicted by looking at the structure of an individual amino acid. That is, an amino acid in
an alpha helical or beta sheet conformer is also in a minimal energy conformer because its
bonds are staggered and the peptide bond is planar.
• Note that the C-N bond length of the peptide is 10% shorter than that found in usual C-N
amine bonds. This is because the peptide bond has some double bond character (40%)
due to resonance which occurs with amides.
• As a consequence of this resonance all peptide bonds in protein structures are found to be
almost planar. This rigidity of the peptide bond reduces the degrees of freedom of the
polypeptide during folding.
• The planarity of the peptide bond is described using the angle ‘omega’ (ω). This is the dihedral
angle between the Cα-C(carbonyl) bond and the N-Cα bond, measured about the C-N peptide bond.
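To make the idea of a dihedral angle concrete, the sketch below computes the signed dihedral defined by four points using the standard atan2 formulation. The helper names and the coordinates are hypothetical (idealised points, not atoms from a real structure); for omega the four atoms would be Cα(i), C(i), N(i+1) and Cα(i+1).

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals to the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Idealised, hypothetical coordinates: the C-N bond lies along the z axis.
ca_i, c_i, n_next = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
ca_trans = (-1.0, 0.0, 1.0)   # next C-alpha on the opposite side
ca_cis = (1.0, 0.0, 1.0)      # next C-alpha on the same side

print(abs(dihedral_deg(ca_i, c_i, n_next, ca_trans)))  # magnitude 180: trans
print(abs(dihedral_deg(ca_i, c_i, n_next, ca_cis)))    # magnitude 0: cis
```

The same function can be applied to any four consecutive backbone atoms, so it serves equally for other backbone torsion angles.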
The peptide bond is almost always trans
87
All amino acids except proline
Proline
image credit: http://www.imb-jena.de/IMAGE.html.
• The omega (ω) angle is almost always 180º (trans) though sometimes (extremely rarely) it
is 0º (cis).
• Note that both the cis and trans form are planar.
• Of the cis-peptide bonds found in proteins, almost all involve proline residues.
• The overall atom geometry in cis proline is very similar to the trans-proline case.
Energetically, the trans proline structure is not markedly more favorable than its cis-proline
counterpart since much the same spatial conflicts are present in both cases.
• Approximately 1% of prolines in proteins are cis.
• A cis-peptide bond induces a very sharp kink in the polypeptide chain.
• Q. It is stated that "Approximately 1% of prolines in proteins are cis." Does it mean 99% of
prolines in proteins are trans? So, trans-proline is still more favourable than cis-proline
(Slide 87)? Also, do you mean that proline is the only amino acid that can exist in cis while
the 19 other amino acids cannot?
• A. Correct. 99% of all prolines are trans and trans is more favourable than cis. The
difference in energy for cis vs. trans is smaller than it is for any of the other amino acids,
and this is why we occasionally see cis prolines. It is extremely rare to find any of the other
19 amino acids in a cis conformation.
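One way to get a feel for "smaller difference in energy" is to convert a population ratio into a free-energy difference with ΔG = RT ln(K). The sketch below does this for the 1% cis figure quoted above. It is a back-of-the-envelope estimate only: it assumes the populations seen in folded proteins behave like a simple two-state equilibrium at 298.15 K, which is a simplification, and the function name `delta_g_kj_per_mol` is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def delta_g_kj_per_mol(minor_fraction: float, temperature_k: float = 298.15) -> float:
    """Free-energy penalty (kJ/mol) of the minor state in a two-state equilibrium."""
    ratio = (1.0 - minor_fraction) / minor_fraction   # [major]/[minor]
    return R * temperature_k * math.log(ratio) / 1000.0

# 1% cis, 99% trans, as quoted on the slide.
print(f"{delta_g_kj_per_mol(0.01):.1f} kJ/mol")
```

Under these assumptions a 99:1 ratio corresponds to roughly 11 kJ/mol; a rarer state implies a larger penalty, which is the qualitative point made for the other 19 amino acids.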
Certain combinations of φ and ψ angles are
preferred
88
Scans downloaded from: http://www.nd.edu/~aseriann/cou.html
• A polypeptide can be thought of as a series of planar units (peptide bonds) joined by
flexible hinges (Cα-atoms).
• Each Cα-atom has two rotatable bonds, the N-Cα bond (φ, phi) and the Cα-C bond (ψ, psi).
• Only certain combinations of φ and ψ angles are allowed due to steric clashes between the
adjacent residues.
The Ramachandran Plot (φ vs. ψ)
89
β-strand conformation
α-helical conformation
• A graph of φ angle vs. ψ angle vs. occurrence in proteins is called a Ramachandran plot.
• There are actually only a few conformations that are strongly preferred and these give rise
to the common elements of secondary structure.
The Ramachandran plot of a typical protein (as output by the program PROCHECK)
90
http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• The Ramachandran plot for a particular protein shows the phi-psi torsion angles for all
residues in the structure
• By looking at how well the angles match up with expected distribution, the quality of a
structure can be assessed.
• Glycine residues are separately identified by triangles as these are not restricted to the
regions of the plot appropriate to the other sidechain types.
• The coloring/shading on the plot represents the various levels of favorability: the darkest
areas (here shown in red) correspond to the "core" regions representing the most favorable
combinations of phi-psi values.
• A properly folded protein will have over 90% of the residues in these "core" regions.
• The percentage of residues in the "core" regions is one of the better guides to
stereochemical quality for assessing experimental protein structures.
• An ideal Ramachandran plot can be generated computationally using known atomic radii
and bond distances.
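The "percentage of residues in core regions" check can be sketched in a few lines. Note the assumptions: the rectangular boxes below are crude, hypothetical stand-ins for the favoured α and β regions and are NOT the regions PROCHECK actually uses, the angle list is invented for illustration, and glycine is not treated separately.

```python
# Hypothetical rectangular boxes, in degrees: ((phi_min, phi_max), (psi_min, psi_max)).
# These are rough illustrations only, not PROCHECK's core regions.
ALPHA_BOX = ((-100.0, -30.0), (-80.0, -10.0))
BETA_BOX = ((-180.0, -45.0), (90.0, 180.0))

def in_box(phi, psi, box):
    (phi_lo, phi_hi), (psi_lo, psi_hi) = box
    return phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi

def classify(phi, psi):
    """Label a (phi, psi) pair as 'alpha', 'beta' or 'other'."""
    if in_box(phi, psi, ALPHA_BOX):
        return "alpha"
    if in_box(phi, psi, BETA_BOX):
        return "beta"
    return "other"

def percent_in_boxes(angles):
    """Percentage of (phi, psi) pairs that fall in either box."""
    hits = sum(1 for phi, psi in angles if classify(phi, psi) != "other")
    return 100.0 * hits / len(angles)

# Invented example angles (degrees), not taken from a real structure.
example = [(-60.0, -50.0), (-120.0, 130.0), (60.0, 60.0), (-65.0, -45.0)]
print([classify(phi, psi) for phi, psi in example])
print(percent_in_boxes(example))
```

A real quality check would use empirically derived region boundaries and exclude or separately treat glycine and proline, but the bookkeeping is the same: classify each residue, then report the fraction in favourable regions.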
alpha-helices: an ‘island’ of preferred
conformation
91
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• As mentioned earlier, Pauling and Corey twisted models of polypeptides around to find
ways of getting the backbone into regular conformations which would agree with
experimental diffraction data (much like the way the structure of DNA was determined). The
most simple and elegant arrangement is a right-handed spiral conformation known as the
'alpha-helix'.
• The structure repeats itself every 5.4 Angstroms along the helix axis, i.e. we say that the
alpha-helix has a pitch of 5.4 Angstroms. Alpha-helices have 3.6 amino acid residues per
turn, i.e. a helix 36 amino acids long would form 10 turns. The separation of residues along
the helix axis is 5.4/3.6 or 1.5 Angstroms, i.e. the alpha-helix has a rise per residue of 1.5
Angstroms.
• Every mainchain C=O and N-H group is hydrogen-bonded to a peptide bond 4 residues
away (O(i) to N(i+4)). This gives a very regular, stable arrangement.
• The peptide planes are roughly parallel with the helix axis and the dipoles within the helix
are aligned. That is, all C=O groups point in the same direction and all N-H groups point the
other way. This alignment of C=O and N-H bonds gives the alpha-helix a permanent dipole
with a partial positive charge at the amino-terminus and a partial negative charge at the
carboxy-terminus.
• Side chains point outward from the helix axis and are generally oriented towards its amino-terminal end.
• All the amino acids have negative phi and psi angles, typical values being -60 degrees and
-50 degrees, respectively.
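The helix arithmetic above (pitch, residues per turn, rise per residue) can be checked with a short sketch. The two constants are the values given on the slide; the function names are hypothetical.

```python
PITCH_ANGSTROM = 5.4      # repeat distance along the helix axis (from the slide)
RESIDUES_PER_TURN = 3.6   # from the slide

def rise_per_residue() -> float:
    """Axial distance between consecutive residues, in Angstroms."""
    return PITCH_ANGSTROM / RESIDUES_PER_TURN

def rotation_per_residue_deg() -> float:
    """Rotation about the helix axis from one residue to the next."""
    return 360.0 / RESIDUES_PER_TURN

def helix_turns(n_residues: int) -> float:
    return n_residues / RESIDUES_PER_TURN

def helix_length_angstrom(n_residues: int) -> float:
    return n_residues * rise_per_residue()

print(f"rise per residue: {rise_per_residue():.2f} A")
print(f"rotation per residue: {rotation_per_residue_deg():.0f} degrees")
print(f"36 residues: {helix_turns(36):.1f} turns, {helix_length_angstrom(36):.1f} A long")
```

For the 36-residue example this gives 10 turns and a length of 54 Å, consistent with the figures quoted above; each residue is rotated 100 degrees relative to the previous one.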
beta-strands: another ‘island’ of preferred
conformation
92
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• In addition to the alpha helix, Pauling and Corey discovered another periodic structural
motif which they named the β-pleated sheet (β because it was the second structure that
they elucidated, the α helix being the first).
• The β-sheet differs markedly from the rodlike α-helix. A polypeptide chain, called a
β-strand, in a β-sheet is almost fully extended rather than being tightly coiled as in the
α-helix. A range of extended structures are sterically allowed. The side chains of adjacent
amino acids point in opposite directions.
• A β-sheet is formed by linking two or more β-strands by hydrogen bonds. Adjacent chains
in a β-sheet can run in opposite directions (antiparallel β-sheet) or in the same direction
(parallel β-sheet).
• In the antiparallel arrangement, the NH group and the CO group of each amino acid are
respectively hydrogen bonded to the CO group and the NH group of a partner on the
adjacent chain.
• In the parallel arrangement, for each amino acid, the NH group is hydrogen bonded to the
CO group of one amino acid on the adjacent strand, whereas the CO group is hydrogen
bonded to the NH group on the amino acid two residues farther along the chain.
• Many strands, typically 4 or 5 but as many as 10 or more, can come together in β-sheets.
Such β-sheets can be purely antiparallel, purely parallel, or mixed.
• β-sheets can be relatively flat but most adopt a somewhat twisted shape.
Turns and loops connect strands and helices
93
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• Most proteins have compact, globular shapes, requiring reversals in the direction of their polypeptide chains. Many of these reversals are accomplished by reverse turns and hairpins.
• A reverse turn is a region of the polypeptide having a hydrogen bond from one main chain carbonyl oxygen to the main chain N-H group 3 residues along the chain (i.e. O(i) to N(i+3)). Helical regions are excluded from this definition and
turns between beta-strands form a special class of turn known as the beta-hairpin.
• Reverse turns are very abundant in globular proteins and generally occur at the surface of the molecule. It has been suggested that turn regions act as nucleation centers during protein folding.
• Reverse turns are divided into classes based on the phi and psi angles of the residues at positions i+1 and i+2. Types I and II shown in the figure are the most common reverse turns, the essential difference between them being the
orientation of the peptide bond between residues at (i+1) and (i+2). The torsion angles for the residues (i+1) and (i+2) in the two types of turn lie in distinct regions of the Ramachandran plot.
• 2-residue beta-hairpin turns occur between two antiparallel beta-strands as shown in the figure.
• The residues forming these two-residue turns have torsion angles in characteristic regions of the Ramachandran plot.
• For type I' turns, residue 2 is always glycine whereas for type II' turns residue 1 is always Gly. This is because amino acids other than glycine would cause steric hindrance involving the residue's side chain and the main chain.
• In other cases, more elaborate structures are responsible for chain reversals. These structures are called loops or sometimes Ω loops (omega loops) to suggest their overall shape. Unlike α-helices and β-strands, loops do not have
regular, periodic structures. Nonetheless, loop structures are often rigid and well defined. Turns and loops invariably lie on the surfaces of proteins and thus often participate in interactions between proteins and other molecules.
• For example, a part of an antibody molecule has surface loops (shown in red) that mediate interactions with other molecules.
• Q. Do reverse turns only exist among alpha-helices? And do beta-hairpins only exist among beta-sheets?
• A. As the name implies, the beta hairpin is most commonly found as a connector between strands of an antiparallel beta sheet. The reverse turn is a bit more general and can be found in loops that connect both helices and strands.
• Q. What are the differences/relationship between reverse turns, beta-hairpin turns and omega loops?
• A. Reverse turns and beta turns do look very similar when you look at the structures on the slide. However, there are key differences in the conformations of amino acids that define each of these two types of turns. I don't expect you to
know the details of these differences. One thing you should remember is that beta turns are typically used to connect two strands of anti-parallel beta sheet. An omega loop is a larger structure that is supposed to look something like the
omega character (Ω). That is, the ends are very close together but the loop itself is large and extends out into space. The variable regions of an antibody can be described as omega loops.
• Q. In slide 93, type II looks similar to type I'. I know one is for a reverse turn and the other is for a beta-hairpin. If I have a structure like that, how can I tell whether it is a beta-hairpin or a reverse turn? I know the feature for a beta-hairpin is that residue 2 in
type I' and residue 1 in type II' should be glycine. Is there any other feature for a reverse turn? Do we need to know how to tell type I and type II reverse turns apart? Another question is the difference between reverse turns, beta-hairpins and loops.
Do we tell them apart by the number of amino acids?
• A. There are key differences in the conformations of the residues in these turns, which is the basis by which they are classified. I don't expect you to know the details of these differences. One thing you should remember is that beta turns
are typically used to connect two strands of anti-parallel beta sheet.
Proteins generally composed of α-helices and/or β-sheets connected by turns and loops
94
• The α-helical content of proteins ranges widely, from nearly none to almost 100%. For
example, about 75% of the residues in ferritin, a protein that helps store iron, are in
α-helices. Single α-helices are usually less than 45 Å long. However, two or more α-helices
can entwine to form a very stable ‘coiled coil’ structure, which can have a length of 1000 Å
(100 nm, or 0.1 μm) or more. Such α-helical ‘coiled coils’ are found in myosin and
tropomyosin in muscle, in fibrin in blood clots, and in keratin in hair. The helical cables in
these proteins serve a mechanical role in forming stiff bundles of fibers, as in porcupine
quills. The cytoskeleton (internal scaffolding) of cells is rich in so-called intermediate
filaments, which also are two-stranded α-helical coiled coils. Many proteins that span
biological membranes also contain α-helices.
• The β-sheet is an important structural element in many proteins. For example, fatty
acid-binding proteins, important for lipid metabolism, are built almost entirely from β-sheets.
Protein folding is largely driven by
hydrophobic interactions
95
Myoglobin
Hydrophobic
Hydrophilic
surface
cross section
• Myoglobin, the oxygen carrier in muscle, is a single polypeptide chain of 153 amino acids.
The capacity of myoglobin to bind oxygen depends on the presence of heme, a prosthetic
(helper) group consisting of protoporphyrin IX and a central iron atom.
• The folding of the main chain of myoglobin, like that of most other proteins, is complex and
devoid of symmetry. A unifying principle emerges from the distribution of side chains. The
striking fact is that the interior consists almost entirely of nonpolar residues such as leucine,
valine, methionine, and phenylalanine. Charged residues such as aspartate, glutamate,
lysine, and arginine are absent from the inside of myoglobin. The only polar residues inside
are two histidine residues, which play critical roles in binding iron and oxygen.
• The outside of myoglobin, on the other hand, consists of both polar and nonpolar residues.
This contrasting distribution of polar and nonpolar residues reveals a key facet of protein
architecture. In an aqueous environment, protein folding is driven by the strong tendency of
hydrophobic residues to be excluded from water.
• The polypeptide chain therefore folds so that its hydrophobic side chains are buried and its
polar, charged chains are on the surface.
• The secret of burying a segment of main chain in a hydrophobic environment is pairing all
the NH and CO groups by hydrogen bonding. This pairing is neatly accomplished in an
α-helix or β-sheet.
• The ability to predict whether or not a given polypeptide sequence will fold into a given
tertiary structure remains one of the ‘grand challenges’ of science.
• In nature, proteins fold either independently or with the help of other proteins known as
chaperones.
96
Membrane proteins have grease on the outside
K+-channel
lipid bilayer
Three views
of the same
structure
• Some proteins that span biological membranes are “the exceptions that prove the rule” regarding
the distribution of hydrophobic and hydrophilic amino acids throughout three-dimensional
structures. For example, ion channels are covered on the outside largely with hydrophobic
residues that interact with the neighbouring alkane chains. The inner channel is quite polar and
there are many specific interactions with the ion being transported.
• David S. Goodsell: The Molecule of the Month appearing at the PDB
• Potassium ions move through this channel from inside the cell to the outside. The driving force
for this movement is simply the concentration gradient. Cells concentrate potassium ions inside,
and then these ions are released when the membrane depolarizes (for example, during
transmission of signals through the nervous system). The selectivity filter is the part with the
backbone carbonyls oriented towards the ion in the centre of the channel. Only potassium (not
sodium) is perfectly coordinated by these carbonyl oxygen atoms, and so only it can pass
through the channel. It is my understanding that potassium ions are normally surrounded by 8
water molecules, whereas sodium is normally surrounded by 6.
• The 2003 Nobel Prize in Chemistry was awarded for work in the area of channels.
• Roderick MacKinnon pioneered X-ray crystallography of ion channels.
• Peter Agre discovered water channels.
• Water channels facilitate the rapid transport of water across cell membranes in response to
osmotic gradients. These channels are believed to be involved in many physiological processes
that include renal water conservation, neuro-homeostasis, digestion, regulation of body
temperature and reproduction. Members of the water channel superfamily have been found in a
range of cell types from bacteria to human.
Chaperone assisted protein folding
97
http://www.users.csbsju.edu/~hjakubow/classes/rasmolchime/02ch331finproj/GroELES/templateprotGROEL2.htm
• Folding of proteins in vitro tends to be an inefficient process, with only a minority of
unfolded molecules undergoing complete folding within a few minutes.
• More than 95 percent of the proteins present in cells are in their native conformation.
• The explanation for the cell’s remarkable efficiency in promoting protein folding probably
lies in chaperones, a family of proteins found in all organisms from bacteria to humans.
• There are two general families of chaperones: molecular chaperones, which bind and
stabilize unfolded or partially folded proteins, thereby preventing these proteins from being
degraded; and chaperonins, which directly facilitate their folding.
• Chaperonins are probably used for a specific and relatively small selection of proteins,
whereas molecular chaperones are used for most, if not all, proteins.
• All chaperones have ATPase activity, and their ability to bind and stabilize their target
proteins is specific and dependent on ATP hydrolysis.
• Molecular chaperones include the Hsp70 family of proteins. When bound to ATP, Hsp70
assumes an open form in which an exposed hydrophobic pocket transiently binds to
exposed hydrophobic regions of the unfolded target protein. Hydrolysis of the bound ATP
causes Hsp70 to assume a closed form, releasing the target protein. Molecular chaperones
are thought to bind all nascent polypeptide chains as they are being synthesized on
ribosomes.
98
More on GroEL
Hydrophobic
stripe
ATP-binding
site
Large cavity
David S. Goodsell: The Molecule of the Month appearing at the PDB
• Proper folding of a small proportion of proteins (e.g., the cytoskeletal proteins actin and
tubulin) requires additional assistance, which is provided by chaperonins.
• Shown on this slide is the bacterial chaperonin, GroEL, which contains 14 identical subunits
arranged in two stacked rings of seven (green). GroES is shown at the bottom in pink.
• The large GroEL-GroES complex is available in PDB entry 1aon. In this picture, three of the
subunits in each GroEL ring have been removed to show the interior, leaving four subunits
in each ring. On the two in back, the hydrophobic amino acids, LEU, ILE, VAL, MET, PHE,
TYR and TRP, are coloured blue.
• Notice the stripe of hydrophobic amino acids around the entry at the top. This will interact
strongly with unfolded proteins by coaxing them into the upper cavity. Once the unfolded
protein is bound, ATP and GroES bind to GroEL. This causes a conformational change that
forces the protein into the larger lower cavity that is much more hydrophilic than the upper
cavity.
• Now that the protein is in a hydrophilic environment, it will be forced to fold in order to
minimize the unfavourable interactions between its hydrophobic portions and its
hydrophilic environment.
• After the protein has folded, ATP is hydrolyzed and GroES (the lid on the cavity) is released
along with the newly folded protein.
• Q: When a chaperonin is used to help a protein fold, does GroES bind to GroEL on the large
cavity side or on the hydrophobic stripe side?
• A: I believe it can bind to both sides. Don't worry about the details.
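The colouring rule used in the figure (LEU, ILE, VAL, MET, PHE, TYR and TRP shown as hydrophobic) amounts to a simple set-membership test. The sketch below applies that same residue set to a sequence written in one-letter codes. The set comes from the slide; the function name `hydrophobic_fraction` and the example sequences are hypothetical, and other hydrophobicity classifications would give different numbers.

```python
# One-letter codes for the residues coloured as hydrophobic on the slide:
# Leu, Ile, Val, Met, Phe, Tyr, Trp.
HYDROPHOBIC = frozenset("LIVMFYW")

def hydrophobic_fraction(sequence: str) -> float:
    """Fraction of residues in `sequence` that belong to HYDROPHOBIC."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for aa in seq if aa in HYDROPHOBIC) / len(seq)

# Invented example segments, not real protein sequences.
print(hydrophobic_fraction("LIVMFYW"))  # 1.0
print(hydrophobic_fraction("DEKR"))     # 0.0
print(hydrophobic_fraction("LKVE"))     # 0.5
```

A segment with a high score is the kind of exposed hydrophobic patch that would be expected to stick to the hydrophobic stripe of GroEL when the protein is unfolded.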
Proteins often consist of multiple independent domains and have 4º structure
99
CD4
Cro
hemoglobin
Rhinovirus
http://web.mit.edu/esgbio/www/cb/virus/virus.html
Immunoglobulin (antibody)
• Some polypeptide chains fold into two or more compact regions that may be connected by a flexible segment of
polypeptide chain, rather like pearls on a string.
• These compact globular units, called domains, range in size from about 30 to 400 amino acid residues.
• For example, the extracellular part of CD4 (shown at top), the cell-surface protein on certain cells of the immune
system to which the human immunodeficiency virus (HIV) attaches itself, comprises four similar domains of
approximately 100 amino acids each. Often, proteins are found to have domains in common even if their overall
tertiary structures are different.
• Antibodies (immunoglobulins) have a distinct domain structure in addition to quaternary structure. We will be taking a
much closer look at antibody structure in the next section.
• Quaternary structure refers to the spatial arrangement of subunits and the nature of their interactions.
• The simplest sort of quaternary structure is a dimer, consisting of two identical subunits. This organization is present in
the DNA-binding protein Cro found in a bacterial virus called λ.
• More complicated quaternary structures also are common. More than one type of subunit can be present, often in
variable numbers. For example, human hemoglobin, the oxygen-carrying protein in blood, consists of two subunits of
one type (designated α) and two subunits of another type (designated β). Thus, the hemoglobin molecule exists as an
α2β2 tetramer.
• Viruses make the most of a limited amount of genetic information by forming coats that use the same kind of subunit
repetitively in a symmetric array. The coat of rhinovirus, the virus that causes the common cold, includes 60 copies
each of four subunits. The subunits come together to form a nearly spherical shell that encloses the viral genome.
• Q: The slide mentions that the coat of rhinovirus includes 60 copies each of four subunits, but in the picture I only see
three coloured subunits. What is wrong here?
• A: There is a 4th protein that is inside and not visible from the outside.
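The subunit bookkeeping in the examples above is simple arithmetic, sketched here in Python. The hemoglobin and rhinovirus numbers come from the slide. The names VP1 to VP4 are the conventional picornavirus coat protein names and are an assumption added for labelling only, as is the function name.

```python
def total_chains(stoichiometry: dict) -> int:
    """Sum the copy numbers of every subunit type in an assembly."""
    return sum(stoichiometry.values())


# Human hemoglobin: an alpha2-beta2 tetramer (from the slide).
hemoglobin = {"alpha": 2, "beta": 2}

# Rhinovirus coat: 60 copies each of four subunits (from the slide).
# The VP1-VP4 labels are an assumption; the slide does not name the subunits.
rhinovirus_coat = {"VP1": 60, "VP2": 60, "VP3": 60, "VP4": 60}

print(total_chains(hemoglobin))       # 4 polypeptide chains
print(total_chains(rhinovirus_coat))  # 240 polypeptide chains
```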
Post-translational modifications of proteins
Ubiquitylation (Lys)
Courtesy of Spencer Alford
• Many proteins are covalently modified, through the attachment of groups other than amino
acids, to augment their functions. Many proteins, especially those that are present on the
surfaces of cells or are secreted, acquire carbohydrate units on specific asparagine
residues. The addition of sugars makes the proteins more hydrophilic and able to
participate in interactions with other proteins. Conversely, the addition of a fatty acid to an
α-amino group or a cysteine sulfhydryl group produces a more hydrophobic protein that will
be tightly associated with the membrane.
• Proteins can also be reversibly modified to regulate their activity. Perhaps the single most
important of all modifications is phosphorylation and dephosphorylation of serine,
threonine, and tyrosine residues. Regulation of protein activity by phosphorylation is the basis
for intracellular signalling. The enzymes that catalyze the addition of phosphate groups
(from ATP donors) are called kinases (why kinases?). Enzymes that remove phosphate
groups are called phosphatases.
• Histones—proteins that assist in the packaging of DNA into chromosomes as well as in
gene regulation—are rapidly acetylated and deacetylated on specific lysine residues in
vivo. More heavily acetylated histones are associated with genes that are being actively
transcribed. A more permanent modification of lysines in histone proteins is methylation.
• The attachment of ubiquitin, a protein comprising 76 amino acids, is a signal that a protein
is to be destroyed, the ultimate means of regulation.
• This slide shows only a few of the common examples. A number of additional post-translational modifications are known.
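As a rough sketch of the residue preferences described above, the following Python lists the positions in a sequence that carry the right side chain for a given modification. The residue assignments follow the slide (Ser/Thr/Tyr for phosphorylation, Lys for acetylation, methylation and ubiquitylation, Asn for carbohydrate attachment). The table name, function name and peptide are hypothetical. Having the right residue is necessary but not sufficient: whether a site is really modified depends on the enzymes involved, which this sketch does not model.

```python
# Residue types targeted by each modification, as described on the slide.
PTM_TARGETS = {
    "phosphorylation": "STY",
    "acetylation": "K",
    "methylation": "K",
    "ubiquitylation": "K",
    "N-linked glycosylation": "N",
}


def candidate_sites(sequence: str, modification: str) -> list:
    """Return (position, residue) pairs, 1-based, that could carry the modification."""
    targets = PTM_TARGETS[modification]
    return [
        (index + 1, residue)
        for index, residue in enumerate(sequence.upper())
        if residue in targets
    ]


# Hypothetical peptide, used only for illustration.
peptide = "MSKNYTPK"
print(candidate_sites(peptide, "phosphorylation"))
print(candidate_sites(peptide, "ubiquitylation"))
```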
101
Post-translational modifications are catalyzed
by enzymes and often serve as binding sites
Figure 1 panels (reaction schemes with the enzymes on the arrows; domain–ligand structure at far right):
a Phosphorylation: tyrosine ⇌ phosphotyrosine. Protein kinase (ATP → ADP); phosphatase (H2O → Pi). Structure: SHC P–Tyr with GRB2 SH2 domain.
b Methylation: lysine ⇌ ε-N-monomethyllysine. Methyltransferase (SAM → SAH); amine oxidase demethylase (O2, H2O → H2O2, H2C=O). Structure: histone H3 Me–Lys with HP1 chromodomain.
c Acetylation: lysine ⇌ ε-N-monoacetyllysine. HAT (acetyl CoA → CoA); HDAC (H2O). Structure: histone H4 Ac–Lys with GCN5 bromodomain.
d Ubiquitylation: lysine ⇌ N-ubiquitinyllysine. Ubiquitin ligase (E3) (ubiquitin + ATP); deubiquitylating isopeptidase (H2O → ubiquitin). Structure: ubiquitin with Vps27 UIM.
e Hydroxylation: proline ⇌ hydroxyproline. Prolyl hydroxylase (O2); reverse enzyme marked "Dehydroxylase?". Structure: HIF1α OH–Pro with VHLβ.
Figure 1 | Example post-translational modification reactions and structures of
protein-interaction-domain–ligand complexes. Various amino-acid side chains
can be modified by, for example: phosphorylation (a); methylation (b); acetylation (c);
ubiquitylation (d); and hydroxylation (e). The enzymes that are involved in adding and
removing these post-translational modifications are shown on the reaction arrows.
The structures on the far right show examples of protein-interaction domains (pale
purple) in complex with their respective ligands (red). The structures were obtained
from the Protein Data Bank (accession codes 1JYR, 1Q3L, 1E6I, 1Q0W and 1LM8 for
parts a–e, respectively) and were manipulated using Pymol (Delano Scientific). GCN5,
general control of amino-acid-synthesis protein-5; GRB2, growth-factor-receptor-bound protein-2; HAT, histone acetyltransferase; HDAC, histone deacetylase; HIF1α,
hypoxia-inducible factor-1α; HP1, heterochromatin protein-1; Me–Lys, methylated
lysine; OH–Pro, hydroxylated proline; Pi, inorganic phosphate; P–Tyr,
phosphotyrosine; SAH, S-adenosylhomocysteine; SAM, S-adenosylmethionine; SH2,
Src-homology-2; SHC, SH2-domain-containing transforming protein; Ub, ubiquitin;
UIM, ubiquitin-interacting motif; VHLβ, von Hippel–Lindau protein-β; Vps27, vacuolar
protein sorting protein-27.
474 | JULY 2006 | VOLUME 7
One of the main functions of PTMs is to provide a binding site for a protein partner with a suitable interaction domain. In this way, the location of proteins inside of cells can be dynamically changed.
Nature Reviews Molecular Cell Biology 7, 473-483
…words and phrases, and has engendered the hypothesis that protein domains represent a basic syntactic unit of cellular organization5. In this article, we briefly discuss the common ways in which PTMs and interaction domains synergize to regulate cellular processes, and we provide specific examples that involve phosphorylation, methylation, acetylation, ubiquitylation and sumoylation (FIGS 1,2). This is not intended to be a comprehensive analysis, but rather aims to highlight the general strategies through which PTMs exert their effects.
PTM-dependent interactions: common strategies
In the following subsections we briefly discuss the common mechanisms by which PTM-dependent interactions regulate cellular processes.
PTM-induced interactions. Interaction domains often recognize short peptide motifs that are embedded in target proteins, but do not bind stably until the peptide has acquired an appropriate PTM (FIGS 1,2a). Such domains usually have a conserved binding pocket for the modified residue and a more variable surface that selectively engages the flanking amino acids, and thereby distinguishes between different peptide motifs with the same PTM6–9. Both the domains and the peptide motifs that they recognize are modular in design and can therefore, in principle, be incorporated into many different proteins.
Cooperative interactions and multi-site PTMs. PTM-dependent interactions can be cooperative, such that a signal is only generated after two or more sites on the
same protein have been modified. This can be achieved
in various ways. First, a doubly modified motif can be
recognized
in an obligatory fashion by two tandem inter474 | JULY 2006 | VOLUME 7
action domains, as is the case for the recognition of two
phosphotyrosine (pTyr)-containing motifs by the tandem
Src-homology-2 (SH2) domains of the ZAP-70 (ζ-chain
(T-cell receptor)-associated protein kinase 70 kDa)
protein tyrosine kinase10 (FIG. 2b). This can increase both
the affinity and the specificity of the interaction. Second,
a single domain can possess two binding pockets for the
modified residues. For example, the single SH2 domain of
the APS protein (adaptor molecule containing pleckstrin-homology (PH) and SH2 domains protein) binds to two
pTyr residues in the activation loop of the insulin receptor kinase; furthermore, two APS SH2 domains form
a non-covalent dimer, which potentially stabilizes the
activated receptor11. Third, a domain with a single binding pocket can bind specifically to a protein that carries
several modifications. For example, the WD40-repeat
domain of the Saccharomyces cerevisiae protein Cdc4 (cell-division cycle-4) only binds to its target, Sic1 (substrate
inhibitor of cyclin-dependent protein kinase-1),
when the target has been phosphorylated during the
G1 phase of the cell cycle on at least six Ser/Thr residues
(FIG. 2c). As Cdc4 is the substrate-targeting subunit of
an SCF (Skp1–Cul1–F-box) E3 ubiquitin ligase complex,
phosphorylation of Sic1 leads to its polyubiquitylation
and degradation by the proteasome. This, in turn, is
necessary for the onset of DNA synthesis, because Sic1
www.nature.com/reviews/molcellbio
Many proteins are assembled from multiple
interaction domains with predictable functions
102
By combining various modular
binding domains, plus a catalytic
domain, sophisticated dynamic
regulation of protein activity is
possible.
It is apparent that evolution has
shuffled and recycled these
domains many times to make the
proteins that control cell signalling.
Examples:
http://pawsonlab.mshri.on.ca
Pawson and Nash, Science 300, 445 (2003)
Modular proteins interact with each other in
complex, yet rational, pathways (circuits)
103
“…mammalian cyclin E activates cyclin-dependent kinase-2 (CDK2) to promote
the G1-to-S phase transition in the cell
cycle. The phosphorylation of cyclin E
on a Thr residue is required for its
recognition by the WD40-repeat domain
of the targeting subunit — CDC4 (cell-division cycle-4) — of an SCF (SKP1–
CUL1–F-box) E3 ubiquitin-ligase
complex. This interaction leads to the
addition of a Lys48-linked polyubiquitin
chain to cyclin E, which results in its
subsequent recruitment to the
proteasome by a ubiquitin-interacting
motif (UIM) and its destruction.”
[Figure 3 diagram labels. Cell-cycle regulation: cyclin E and CDK2, phosphorylated Thr (P), SCF E3 complex with CDC4 (WD40)8 F-box, E1, E2, polyubiquitin (Ub), UIM, proteasome. Cell signalling: SHC PTB and GRB2 SH2 bound to P–Tyr, Ras–MAPK signalling.]
CDK2 is an important kinase that needs to be activated in a cell-cycle dependent
fashion. Specifically, it needs to be active at the G1 to S transition point.
While turning on the activity of a kinase is important, so is turning it off. This is
accomplished by targeting the cyclin E activator for destruction by the proteasome.
Nature Reviews Molecular Cell Biology 7, 473-483
Figure 3 | Networks of modification-dependent interactions re
phosphorylation (P) and ubiquitin (Ub)-dependent protein interacti
induced signalling and endocytic trafficking. Inducible post-transla
highlighted by red arrows. Sequential PTM-dependent interactions
examples. In the first example, mammalian cyclin E activates cyclinphase transition in the cell cycle. The phosphorylation of cyclin E on
WD40-repeat domain of the targeting subunit — CDC4 (cell-divisio
ubiquitin-ligase complex. This interaction leads to the addition of a
results in its subsequent recruitment to the proteasome by a ubiquit
second example, epidermal growth factor (EGF) receptor autophos
can recruit the Src-homology-2 (SH2) domain of the E3 ubiquitin lig
oncogene). Cbl monoubiquitylates the receptor to provide docking
EPS15 (EGF-receptor-pathway substrate-15), which are involved in
undergoes an intramolecular interaction with a monoubiquitylated
and ‘open’ conformations). The binding of the phosphotyrosine-bin
transforming protein) to a pTyr site on the EGF receptor induces the
in turn, recruits the SH2 domain of GRB2 (growth-factor-receptor-b
The figure also shows the convergence of distinct interaction doma
SH2 and PTB domains for pTyr) and the modification of the same po
the multi-site phosphorylation and ubiquitylation of the EGF recept
activating enzyme; E2, ubiquitin-conjugating enzyme; MAPK, mitog
residues in a manner that depends on ligand phosphorylation and the identity of the flanking amino acids26
(FIG. 2a). Activated receptor tyrosine kinases (RTKs),
such as the β-platelet-derived growth factor receptor or
the epidermal growth factor receptor (EGFR), become
phosphorylated at multiple Tyr sites, and each of these
sites selectively binds the SH2 domain of one or more
cytoplasmic signalling proteins, which, in turn, activate
specific intracellular signalling pathways27–29 (FIGS 3,4a).
Among the SH2-domain-containing proteins that are
recruited to an autophosphorylated RTK such as EGFR
is the Cbl (Casitas B-lineage lymphoma proto-oncogene)
E3 ubiquitin ligase, which subsequently ubiquitylates the
Disulphide bonds are another type of post-translational modification
104
Ribonuclease
(PDB 5RSA)
• Disulphide bonds generally only form under oxidizing conditions. They are very common in
extracellular proteins such as those found in blood.
• Disulphide bonds generally do not form in the cytoplasm of living cells because this is a
reducing environment (~5 mM glutathione, a thiol)
• A protein that has disulphide bonds will tend to be very stably folded relative to proteins that
do not have disulphide bonds, since there is a covalent link holding two sections of chain together
(as opposed to only non-covalent interactions in a protein that lacks disulphide bonds). You would
expect a protein with disulphide bonds to be more stable at elevated temperatures than a
protein without disulphide bonds.
• To maintain the activity of a protein with free thiols in vitro, it is important to add reducing
agents such as β-mercaptoethanol, TCEP, or DTT.
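A small Python sketch can make the bookkeeping behind disulphide formation concrete. Each disulphide joins two cysteine thiols and releases two hydrogen atoms, so a sequence with n cysteines can form at most n // 2 disulphides, and each bond lowers the mass by about 2.016 Da. The function names and the peptide are hypothetical. The sketch says nothing about which cysteines actually pair, which depends on the folded structure.

```python
HYDROGEN_MASS = 1.008  # average atomic mass of hydrogen, in Da


def max_disulphides(sequence: str) -> int:
    """Upper bound on disulphide bonds: one bond per pair of cysteines."""
    return sequence.upper().count("C") // 2


def mass_shift_on_oxidation(n_bonds: int) -> float:
    """Mass change (Da) when n_bonds disulphides form; two H atoms are lost per bond."""
    return -2 * HYDROGEN_MASS * n_bonds


# Hypothetical peptide with four cysteines, used only for illustration.
peptide = "ACDCKCC"
bonds = max_disulphides(peptide)
print(bonds, round(mass_shift_on_oxidation(bonds), 3))
```

Adding a reducing agent such as DTT or TCEP reverses this change and returns the cysteines to free thiols.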
In vitro, reducing agents are necessary to
maintain thiols in a reduced form
105
[Structures shown: dithiothreitol (DTT), β-mercaptoethanol (2-mercaptoethanol) and tris(2-carboxyethyl)phosphine hydrochloride (TCEP·HCl); the thiol (SH) and disulphide (S–S) forms are interconverted by reduction and oxidation.]
• Don’t use β-mercaptoethanol (at least not in this building) if you can help it. It stinks.
• DTT or TCEP are much better choices. These two reducing agents are similar in terms of
effectiveness. TCEP can be slightly more expensive but it benefits from not having a free
thiol (a good nucleophile). This is advantageous if you are working with electrophiles in
solution.
106
The chromophore of green fluorescent protein
is a unique post-translational modification
[Scheme: the Ser65-Tyr66-Gly67 segment undergoes cyclization, followed by oxidation & dehydration, to give the mature chromophore.]
MSKGEELFTGVVPILVELDGDVNGQKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL
VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFYKDDGNYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKMEYNYNSHNVYIMADKPKNGIKVNFKIRHNIKDGSVQLAD
HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMILLEFVTAAGITHGMDELYK
Prasher et al. Gene (1992) 111: 229-233.
Tsien, Annu. Rev. Biochem. (1998) 67: 509-544.
• Yet another type of post-translational modification is generated by chemical
rearrangements of side chains and, sometimes, the peptide backbone. For example, the
Aequorea victoria jellyfish produces a green fluorescent protein (GFP). Surprisingly, the
chromophore is not a bound cofactor but rather a post-translationally modified sequence of
amino acids.
• The chromophore (also okay to call it a fluorophore) is formed by the protein-promoted
rearrangement and oxidation of the sequence Ser-Tyr-Gly within the center of the protein.
GFP is of tremendous utility to researchers as a marker within cells. The Nobel Prize in
Chemistry for 2008 was awarded to 3 researchers (Tsien, Chalfie, and Shimomura) who
are considered pioneers in the development of GFP as a research tool.
• The spontaneous formation of the Aequorea green fluorescent protein chromophore within
the folded beta-can protein structure must necessarily involve at least three key steps:
cyclization of the main chain, loss of a molecule of water (dehydration), and oxidation with
molecular oxygen. The exact order and mechanism of these steps is a matter of ongoing
investigation.
• Chromophore formation is spontaneous only within the context of the fluorescent protein beta-can structure, where steric constraints force the peptide into a tight turn conformation
(Branchini et al. 1998) and the side chains of highly conserved residues, such as glutamate
222 and arginine 96, are positioned to facilitate the reaction.
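The residue numbers quoted above can be checked directly against the GFP sequence printed on this slide. The Python below is a minimal sketch: the sequence is copied from the slide, while the helper names are hypothetical.

```python
# GFP sequence as printed on the slide (238 residues).
GFP = (
    "MSKGEELFTGVVPILVELDGDVNGQKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL"
    "VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFYKDDGNYKTRAEVKFEGDTLV"
    "NRIELKGIDFKEDGNILGHKMEYNYNSHNVYIMADKPKNGIKVNFKIRHNIKDGSVQLAD"
    "HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMILLEFVTAAGITHGMDELYK"
)


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in the slide's numbering."""
    return sequence[position - 1]


def find_motif(sequence: str, motif: str) -> list:
    """Return the 1-based start position of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


print(len(GFP))                 # total number of residues
print(find_motif(GFP, "SYG"))   # start of the chromophore-forming tripeptide
print(residue_at(GFP, 96), residue_at(GFP, 222))  # the conserved Arg and Glu
```

The Ser-Tyr-Gly tripeptide occurs once, starting at residue 65, and positions 96 and 222 are arginine and glutamate as stated in the text.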
Fluorescent proteins engineered to
fluoresce at different wavelengths
107
• Note that, because the GFP chromophore is generated from amino acids, we can change
the structure of the chromophore by introducing mutations in the gene that ultimately
change the amino acids present in the chromophore forming tripeptide (or its immediate
surroundings).
• For example, by changing the tyrosine of the GFP chromophore to other aromatic amino
acids, new chromophore structures can be formed that fluoresce at different wavelengths
from the wild-type protein.
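Swapping the chromophore tyrosine for another aromatic residue can be sketched as a point mutation on the sequence. The fragment below is residues 1 to 70 of the GFP sequence from the chromophore slide. The function name is hypothetical. The colour labels attached to Y66H (blue) and Y66W (cyan) are general knowledge about GFP variants rather than something stated on this slide, and practical blue and cyan variants carry additional mutations that are not modelled here.

```python
import re

# Residues 1-70 of the GFP sequence shown on the chromophore slide.
GFP_N_TERM = (
    "MSKGEELFTGVVPILVELDGDVNGQKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL"
    "VTTFSYGVQC"
)


def apply_point_mutation(sequence: str, mutation: str) -> str:
    """Apply a mutation written like 'Y66W' (wild type, 1-based position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"residue {position} is {sequence[position - 1]}, not {wild_type}")
    return sequence[: position - 1] + replacement + sequence[position:]


# Colour labels are an assumption (see the note above), not taken from the slide.
for label, mutation in [("blue FP", "Y66H"), ("cyan FP", "Y66W")]:
    variant = apply_point_mutation(GFP_N_TERM, mutation)
    print(label, mutation, variant[64:67])  # residues 65-67 of the variant
```

Checking the wild-type residue before substituting catches numbering mistakes, which are easy to make when a sequence has been truncated or tagged.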
Visualizing the central dogma in a live cell!
108
cyan fluorescence = Lac operator in nucleus and peroxisomes in cytoplasm
yellow fluorescence = RNA transcripts
An immortalized human cancer cell known as U2OS
Gene for LacI - cyan FP
Gene for MS2bp - yellow FP
Third gene: 256 x Lac operator, promoter, cyan FP (peroxisome-targeted), 24 x MS2-repeats
This cell has 3 different genes artificially introduced into it:
• LacI - cyan FP, the gene for LacI fused to the gene for a cyan fluorescent protein
• MS2bp - yellow FP, the gene for MS2bp fused to the gene for yellow fluorescent protein
• A peroxisome-targeted cyan FP followed by 24 copies of MS2 in the 3′ UTR. This gene is preceded by 256
copies of the Lac operator, which is not transcribed.
• Definitions:
• U2OS cell line: a human cell line cultivated from the bone tissue of a fifteen-year-old human female suffering from osteosarcoma (the same cancer as Terry Fox had).
• LacI: a bacterial protein that binds tightly to the section of DNA known as the Lac operator.
• Lac operator: a DNA sequence that binds to LacI.
• MS2: an RNA sequence that forms a hairpin structure. This RNA hairpin binds tightly to the viral protein that I have called MS2bp.
• MS2bp (MS2-binding protein): a viral coat protein that binds tightly to the MS2 RNA sequence.
• cyan FP: the cyan fluorescent variant of the green fluorescent protein
• yellow FP: the yellow fluorescent variant of the green fluorescent protein
• peroxisomes: small organelles in the cytoplasm of eukaryotic cells. They have a role in destroying peroxides in the cell.
• This is the work of the labs of David L. Spector and Robert H. Singer
• Q. This is about the topic of visualizing the central dogma in class today. In the example of the sequence with 256 copies of the Lac operator and 24 copies of MS2, the
figure shows the cyan and yellow fluorescence separately. Shouldn't more cyan fluorescence be observed, since the sequence has 256 copies of Lac
and 24 copies of MS2? The figure actually shows more yellow spots than cyan spots in the nucleus. Could you please explain whether there will be any
differences?
• A. All of the copies of the Lac operator DNA sequence are located in the same place and there is only one 'copy' of the 256 copies in the cell (i.e., there is only one copy of
the genome). There are many, many RNA transcripts created by transcription of this DNA sequence. As we saw earlier in the course, one DNA sequence can be read by
many RNA polymerases over and over again in order to produce many mRNA molecules. Each of these transcripts carries the 24 copies of MS2 and thus shows up as a
yellow fluorescent spot. You would expect each yellow spot (corresponding to one transcript) to be substantially dimmer than the cyan spot in the nucleus. Each transcript
has 24 fluorophores attached to it, while the DNA sequence has 256 fluorophores attached to it.
• Q. Visualizing the central dogma in a living cell. There are three arrows pointing to the nucleus. One is for LacI, one is for a DNA sequence, and another is for MS2bp. For LacI and MS2bp,
do they also stand for DNA sequences? And would they be transcribed and translated to cyan FP and yellow FP, respectively? The translated cyan FP would bind to the Lac operator
and the translated yellow FP would bind to the MS2 RNA sequence. Is that right?
• A. Correct. The cell has 3 different genes introduced into its genome. One of these is the gene for LacI-cyan FP and it would be transcribed and translated to form the LacI-cyan FP protein. Likewise the gene for MS2bp-yellow FP would be transcribed and translated to form the MS2bp-yellow FP protein. You are correct that the proteins then
bind to the Lac operator (in DNA) and the MS2 RNA sequence (in RNA), respectively.
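The layout of the third gene and the brightness argument from the answers above can be sketched in Python. The element order and the copy numbers (256 Lac operator repeats, 24 MS2 repeats) come from the slide. The sketch assumes one fluorescent protein per binding site and equal brightness for the cyan and yellow proteins; both are simplifications made for illustration, and the names are hypothetical.

```python
# Third gene from the slide, listed in order along the DNA.
CONSTRUCT = [
    "256 x Lac operator",
    "promoter",
    "cyan FP (peroxisome-targeted)",
    "24 x MS2 repeats",
]


def transcribed_elements(construct: list) -> list:
    """Return the elements after the promoter; everything before it is not transcribed."""
    return construct[construct.index("promoter") + 1:]


LAC_OPERATOR_COPIES = 256  # LacI-cyan FP binding sites at the single gene locus
MS2_REPEATS = 24           # MS2bp-yellow FP binding sites on each RNA transcript

# Assumes one fluorophore per site and equally bright fluorophores.
locus_to_transcript_ratio = LAC_OPERATOR_COPIES / MS2_REPEATS

print(transcribed_elements(CONSTRUCT))
print(round(locus_to_transcript_ratio, 1))
```

Under these assumptions the single cyan locus carries roughly ten times as many fluorophores as any one transcript, which is why each yellow spot is expected to be dimmer even though there are many more of them.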
Visualizing the central dogma in a live cell!
example 1
cyan fluorescence = Lac operator in
nucleus and peroxisomes in cytoplasm
109
yellow fluorescence =
RNA transcripts
• Imaging gene expression in single living cells Nat Rev Mol Cell Biol 5(10):855-862 (2004 October)
• Visualizing gene expression in living cells. The movie shows a cell with a stably integrated gene that also contains 256 lac operator repeats. This gene transcribes an
RNA that contains both a coding sequence for the cyan fluorescent protein (CFP) protein (with a peroxisome-targeting sequence) and a stretch of MS2 stem-loops. In
the beginning of the movie, the gene locus is visible as a result of tagging of the DNA with a CFP–lac-repressor protein. Once transcription is induced from this gene,
the locus becomes structurally open and decondenses. The RNAs produced from the gene are tagged with yellow fluorescent protein (YFP)–MS2 and can be seen
accumulating at the transcription site. The RNA is translated in the cytoplasm and at later times post-induction, CFP-labelled peroxisomes are detected. The cell was
imaged every 2.5 min for a total of 4 hr and 22.5 min.
• http://singerlab.aecom.yu.edu/supplements/natrevmcb_v5p855/movies03.htm
• Q: The lac operator is connected to the DNA sequence of the cyan FP, and when transcription happens, cyan FP will be generated and the lac operator will not be
transcribed? And the lac operator will not be replicated in the nucleus?
• A: Transcription starts from the promoter and so only things that come after the promoter, specifically the cyan FP (peroxisome targeted) and the 24 x MS2 repeats, will
be transcribed. The 256 lac operator copies come before the promoter, so they are not transcribed. The 256 lac operator copies are just there to serve as a binding site for LacI-cyan FP, so that the location of the inserted DNA sequence can be visualized by fluorescence imaging. All of the DNA will be replicated when the cell divides, but
otherwise there will be just one copy of the DNA in the cell.
• Q: The MS2 is an RNA sequence and it is connected to the DNA sequence of the yellow FP? What is the function of the MS2 here and can it be replicated?
• A: 'MS2' is an RNA sequence that forms a hairpin structure. 'MS2 binding protein' (MS2bp) is a protein that binds to the MS2 hairpin. The cell contains the gene for
MS2bp fused to yellow FP. When this fusion protein is made in the cell, it will stick to the RNA molecules that contain the 24 x MS2 repeats and allow the RNA
molecules to be visualized by fluorescence imaging.
• Q: Is one lac operator connected to one cyan FP?
• A: The lac operator isn't connected to anything. It is just a DNA sequence that is not transcribed. The Lac repressor (LacI) is a protein that binds to the lac operator. The cell
contains the gene for LacI fused to cyan FP. When this protein is made, 256 copies of it will stick to the 256 repeats of the lac operator in the DNA.
• Q: Will the content of visualizing the central dogma in a live cell in lecture 4 be included in the exam?
• A: Everything that was covered in class and/or is in assigned reading could be included in the exam.
Visualizing the central dogma in a live cell!
example 2
cyan fluorescence = Lac operator in
nucleus and peroxisomes in cytoplasm
110
yellow fluorescence =
RNA transcripts
• Dynamics of Single mRNPs in Nuclei of Living Cells Science 304(5678):1797-1800 (2004 June 18)
• 1. Detection of open gene locus by CFP-lac repressor.
• 2. Detection of cytoplasmic CFP-peroxisomes (different plane than 1).
• 3. Detection of YFP-MS2 nuclear mRNPs and YFP-MS2 accumulation at the transcription site.
• 4. Different threshold of same cell showing cytoplasmic YFP-MS2 mRNPs.
• An important conclusion from this work is that RNA transcripts are freely diffusing inside of the nucleus. This is different than
in the cytoplasm where they are actively transported.
• http://singerlab.aecom.yu.edu/supplements/science_v304p1797/movies.htm
• Q: In the example of "visualizing the central dogma in a living cell", you mentioned that we tag the MS2 RNA with the YFP.
My question is: where does the YFP come from? I found no DNA sequence that codes for the YFP.
• A: The cell also contains a gene encoding MS2bp-YFP
• Q. I have a question about cyan FP. You mentioned that it is peroxisome-targeted; what do you mean by that? Do you mean that
it will interact with the peroxisome? What will happen to it? It is still in the translated region and it should appear in the
RNA transcript, so do we have both cyan and yellow color?
• A. By genetically fusing a protein (including FPs) to specific peptide sequences, they can be targeted to different
compartments of the cell. For example, there are also specific sequences that send proteins to the nucleus and the
mitochondria. In the case of peroxisomes it is a simple SKL tripeptide that causes a protein to be targeted to this
compartment. A peroxisome is a membrane-enclosed organelle and the protein will accumulate inside of it. Adding a
targeting sequence does not affect the color of the FP. These sequences are typically added to the N- or C-terminal tails of
the protein.
• Q. If LacI-cyan FP only binds to the Lac operator DNA sequences, why does the whole nucleus appear cyan?
• A. There is an excess of LacI-cyan FP relative to the number of binding sites. It is this unbound protein that is causing the
whole nucleus to appear cyan.
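Genetic fusion of a targeting sequence, as described in the answer about peroxisome targeting, amounts to adding a few residues to one end of the protein. The Python below is a minimal sketch for the SKL tripeptide. Placing SKL at the C-terminus is an assumption based on how this signal is usually used; the slide only says that targeting sequences go on the N- or C-terminal tails. The function names are hypothetical, and the example uses the last seven residues of the GFP sequence given earlier.

```python
PEROXISOME_TAG = "SKL"  # tripeptide named in the answer above


def add_c_terminal_tag(protein: str, tag: str = PEROXISOME_TAG) -> str:
    """Append the tag to the C-terminus unless the protein already ends with it."""
    return protein if protein.endswith(tag) else protein + tag


def is_peroxisome_targeted(protein: str) -> bool:
    """Simplified check: the sequence ends with the SKL tripeptide."""
    return protein.endswith(PEROXISOME_TAG)


# Last seven residues of the GFP sequence from the chromophore slide.
fp_tail = "GMDELYK"
tagged_tail = add_c_terminal_tag(fp_tail)
print(tagged_tail, is_peroxisome_targeted(tagged_tail))
```

Because the tag is added at the end of the chain, far from the chromophore-forming residues 65 to 67, the colour of the fluorescent protein is unchanged.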