Slide 77: Biomolecular chemistry 4. From amino acids to proteins
Suggested reading: Sections 14.5 to 14.8 and Sections 2.1 to 2.4 of Mikkelsen and Cortón, Bioanalytical Chemistry
Primary Source Material
• Chapters 4 and 12 of Introduction to Genetic Analysis: Anthony J.F. Griffiths, Jeffrey H. Miller, David T. Suzuki, Richard C. Lewontin, William M. Gelbart (courtesy of the NCBI bookshelf).
• Chapters 4, 4 and 6 of Biochemistry: Berg, Jeremy M.; Tymoczko, John L.; and Stryer, Lubert (courtesy of the NCBI bookshelf).
• Chapters 3 and 7 of Molecular Cell Biology: Lodish, Harvey; Berk, Arnold; Zipursky, S. Lawrence; Matsudaira, Paul; Baltimore, David; Darnell, James E. (courtesy of the NCBI bookshelf).
• ExPASy: online course on Principles of Protein Structure
• Many figures and the descriptions for the figures are from the educational resources provided at the Protein Data Bank (http://www.pdb.org/)
• Most of these figures and accompanying legends were written by David S. Goodsell of the Scripps Research Institute and are used with permission. I highly recommend browsing the Molecule of the Month series at the PDB (http://www.pdb.org/pdb/101/motm_archive.do)
Slide 78: Where are we and how did we get here?
[Figure label: "We are here!"]
• We are done with the Central Dogma and now we move into the realms of protein structure and function. The Central Dogma only relates to the flow of genetic information, not to the function of biological macromolecules.
Slide 79: Proteins come in all shapes and sizes
http://www.rcsb.org/pdbstatic/education_discussion/molecule_of_the_month/poster_quickref.pdf
• Proteins are diverse and versatile 'nano' structures and machines
• Large number of potential combinations
  • There is a relatively large number of amino acids (a.a.) which you can use to construct a protein.
  • Includes 20 common a.a.'s plus numerous post-translational modifications.
  • A 200 amino-acid protein could have 20^200 (20 to the 200th power) possible sequences.
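To get a feel for how large 20 to the 200th power is, the count can be evaluated exactly with Python's arbitrary-precision integers. This is only an illustrative sketch; the function name `sequence_space` is hypothetical, and it counts only the 20 common amino acids (post-translational modifications are ignored).

```python
def sequence_space(length, alphabet_size=20):
    """Number of distinct sequences of `length` residues drawn from
    `alphabet_size` building blocks (20 common amino acids by default)."""
    return alphabet_size ** length


if __name__ == "__main__":
    n = sequence_space(200)
    # Python integers are exact, so the number of decimal digits can be counted directly.
    print("Decimal digits in 20^200:", len(str(n)))
```

The result has hundreds of decimal digits, so only a vanishingly small fraction of possible sequences can ever have been sampled in nature.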
• Structurally versatile
  • Polypeptide backbone can adopt a variety of conformations
  • Many conformers of side chains
  • Secondary structural elements can pack together in a wide variety of orientations
  • Various states of homo- and hetero-oligomerization
• Proteins can bind prosthetic groups or cofactors (non-protein)
  • Heme
  • Metal ions
  • Flavins
• Structurally dynamic
  • Allosteric activation
  • Active and inactive forms
Slide 80: The structure of a protein is determined by the linear sequence of amino acids (1º structure)
[Figure: Ribonuclease. An unfolded protein can be refolded in vitro. This demonstrates that the information needed to specify the tertiary structure is fully contained in the primary sequence.]
http://www.users.csbsju.edu/~hjakubow/classes/rasmolchime/01ch331finproj/Rnase/templateprot.htm
• The classic work of Christian Anfinsen in the 1950s on the enzyme ribonuclease revealed the relation between the amino acid sequence of a protein and its conformation. For this work he was awarded the Nobel Prize in Chemistry in 1972. Anfinsen discovered that:
• Ribonuclease is a single polypeptide chain consisting of 124 amino acid residues cross-linked by four disulfide bonds.
• Agents such as urea or guanidinium chloride effectively disrupt the noncovalent bonds.
• The disulfide bonds can be cleaved reversibly by reducing them with a reagent such as β-mercaptoethanol.
• When ribonuclease was treated with β-mercaptoethanol in 8 M urea, the product was a fully reduced, randomly coiled polypeptide chain devoid of enzymatic activity. In other words, ribonuclease was denatured by this treatment.
• Anfinsen then made the critical observation that the denatured ribonuclease, freed of urea and β-mercaptoethanol by dialysis, slowly regained enzymatic activity. All the measured physical and chemical properties of the refolded enzyme were virtually identical with those of the native enzyme.
• These experiments showed that the information needed to specify the catalytically active structure of ribonuclease is contained in its amino acid sequence.
Slide 81: The 20 common amino acids
[Figure: Ala showing L-stereochemistry]
http://www.neb.com/neb/sitemap/sitemap_5-1-10.html
• 20 different common amino acids, differing only in side chain
• Note that stereochemistry at Cα has not been indicated in this figure.
• All natural a.a.'s are in the L-configuration
• A more general system of stereochemical designation is the R/S system. The L-configuration nearly always corresponds to S in the R/S system. The exception is L-cysteine, which is R.
• You might want to keep this sheet handy as a reference.
• I will often use the one letter codes and you should learn these. Most are easy, but I find D, E, N and Q the most tricky to remember.
• Q: Do we need to memorize the structure and names of amino acids on the test?
• A: Yes. You should know the structure, name, 3 letter abbreviation, and 1 letter code of all the common amino acids.
Slide 82: Amino acid classification by property
http://www.imb-jena.de/image_library/GENERAL/aa/mut1.jpg
http://www.imb-jena.de/image_library/GENERAL/aa/chemprop.jpg
• Various simple textbook classifications for a.a.'s
  • e.g. small, nucleophilic, hydrophobic, aromatic, acidic, amide, basic
  • e.g. aliphatic, non-polar, aromatic, polar, charged -ve, charged +ve
• However, no simple classification can properly capture the diversity of a.a. interactions and properties.
  • The same amino acid in different charge states can go from polar to nonpolar (H or K for example)
  • Different portions of the same amino acid can have different properties (aliphatic chain vs. guanidinium of arginine)
• Generally, aliphatic/hydrophobic residues are found inside proteins and polar/charged residues on the surface of proteins
• Notes:
• Cysteine is special because it is the best nucleophile, is the most easily oxidized, and can form disulphide bonds.
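Since the three-letter abbreviations and one-letter codes mentioned above need to be memorized, a small lookup table can help with self-testing. The following is a minimal sketch in Python; the names `THREE_TO_ONE` and `to_one_letter` are illustrative and not part of the course material.

```python
# Standard three-letter abbreviations and one-letter codes of the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Convert a list of three-letter abbreviations into a one-letter string."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


if __name__ == "__main__":
    # The four codes singled out above as the trickiest to remember.
    print(to_one_letter(["Asp", "Glu", "Asn", "Gln"]))
```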
• Proline has a tertiary, as opposed to secondary, amide nitrogen and induces a bend in the polypeptide chain.
• Threonine and isoleucine have chiral carbons in their side chains
Slide 83: Free amino acids are almost always zwitterions
Commentary on the topic of zwitterions: http://bip.cnrs-mrs.fr/bip10/zwitter.htm
• Amino acids in solution at neutral pH exist as dipolar ions (zwitterions).
• In the dipolar form, the amino group is protonated (-NH3+) and the carboxyl group is deprotonated (-CO2-).
• Under almost any conceivable physiologically relevant conditions, the amino and carboxylate groups of a free amino acid will be in their charged states.
• This is also true of a polypeptide chain: the N-terminus and the C-terminus will be in the charged states
• Possible exceptions
  • Groups buried in the interior of proteins or lipid bilayers
  • Proteins in the stomach
Slide 84: pKa values of protein functional groups
• Seven of the 20 amino acids have readily ionizable side chains. These 7 amino acids are able to donate or accept protons to facilitate reactions as well as to form ionic bonds.
• The above table gives equilibria and typical pKa values for ionization of the side chains of tyrosine, cysteine, arginine, lysine, histidine, and aspartic and glutamic acids in proteins.
• Two other groups in proteins (the terminal α-amino group and the terminal α-carboxyl group) can be ionized.
• You should know the approximate values for all of these ionizable groups. It is safe to say that all carboxylic acids in proteins have a pKa of about 3-4.
• Q: What is so special about histidine? It has a pKa of ~6, but did you mention that it does not react with anything much?
• A: Histidine is very good at donating and accepting protons at physiological pH. This is a very important part of many enzyme mechanisms. I may have mentioned that histidine is not such a good nucleophile. For enzyme mechanisms that involve a nucleophilic attack on the substrate, cysteine would be the best amino acid, followed by lysine.
• Q.
Are proteins buried in lipid bilayers charged at one terminal end or not at all? If one end is charged, which one is it?
• A. The N-terminus is always positively charged and the C-terminus is always negatively charged under normal pH conditions (near neutral). Under some circumstances, such as when the N- or C-terminus is buried in a very hydrophobic environment, I suppose they could be uncharged. The pKa of an ionizable group is going to depend on its environment.
• Q. Proteins in the stomach are charged on their N-termini, am I right?
• A. I believe that the stomach is at very low pH, like 2-3. At such low pH, practically every group in proteins will be protonated. This pH is close to the pKa of the C-terminus, so it might be only partially protonated.
• Q. Will the pKa values of amino acids be given in the test or not?
• A. They won't be provided. You should know which residues are positively and negatively charged at neutral pH.
Slide 85: An oligopeptide
• Oligopeptide: A compound made up of the condensation of a small number (typically less than 20) of amino acids
• Polypeptide: A compound made up of the condensation of more than ~20 amino acids
• Each type of protein differs in its sequence and number of amino acids. It is the particular sequence of the various side chains that makes each protein distinct.
• The two ends of a polypeptide chain are chemically different: the end carrying the free amino group (NH3+, sometimes incorrectly written as NH2) is the amino, or N-terminus, and that carrying the free carboxyl group (CO2-, sometimes incorrectly written as CO2H) is the carboxyl, or C-terminus.
• The amino acid sequence of a protein is always presented in the N to C direction, reading from left to right. This corresponds to the 5' to 3' direction in which genes are read.
Slide 86: The peptide bond is planar
• Linus Pauling and Robert Corey analyzed the geometry and dimensions of the peptide bonds in the crystal structures of molecules containing one or a few peptide bonds.
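Looking back at the pKa discussion and the Q&A above (for example, the remark that at stomach pH the C-terminus is close to its pKa and so may be only partially protonated), the protonation state of an ionizable group can be estimated with the Henderson-Hasselbalch relation. This is a minimal sketch in Python; the function name and the specific pKa values used (3.5 for a carboxyl group, picked from the "about 3-4" range quoted above, and 6.0 for histidine) are illustrative assumptions, and real pKa values shift with the local environment.

```python
def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a group present in its
    deprotonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


if __name__ == "__main__":
    # Assumed, approximate pKa values for illustration only.
    groups = {"carboxyl group (assumed pKa 3.5)": 3.5,
              "histidine side chain (assumed pKa 6.0)": 6.0}
    for label, pKa in groups.items():
        for pH in (2.5, 7.0):
            frac = fraction_deprotonated(pH, pKa)
            print(f"{label}, pH {pH}: {frac:.1%} deprotonated")
```

At a pH equal to the pKa the group is half protonated; one pH unit away from the pKa the ratio of the two forms is about 10:1.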
• This analysis led Pauling to correctly predict the existence and structure of the alpha helix and beta sheets (for which he was awarded the 1954 Nobel Prize in Chemistry)
• The take home message is that the secondary structure elements of proteins can be predicted by looking at the structure of an individual amino acid. That is, an amino acid in an alpha helical or beta sheet conformer is also in a minimal energy conformer because its bonds are staggered and the peptide bond is planar.
• Note that the C-N bond length of the peptide is 10% shorter than that found in usual C-N amine bonds. This is because the peptide bond has some double bond character (40%) due to the resonance that occurs in amides.
• As a consequence of this resonance, all peptide bonds in protein structures are found to be almost planar. This rigidity of the peptide bond reduces the degrees of freedom of the polypeptide during folding.
• The planarity of the peptide bond is described using the angle 'omega' (ω). This is the dihedral angle between the Cα-carbonyl bond and the N-Cα bond.
Slide 87: The peptide bond is almost always trans
[Figure panels: All amino acids except proline; Proline. Image credit: http://www.imb-jena.de/IMAGE.html]
• The omega (ω) angle is almost always 180º (trans) though sometimes (extremely rarely) it is 0º (cis).
• Note that both the cis and trans forms are planar.
• Of the cis-peptide bonds found in proteins, almost all involve proline residues.
• The overall atom geometry in cis-proline is very similar to the trans-proline case. Energetically, the trans-proline structure is not markedly more favorable than its cis-proline counterpart since much the same spatial conflicts are present in both cases.
• Approximately 1% of prolines in proteins are cis.
• A cis-peptide bond induces a very sharp kink in the polypeptide chain.
• Q. It is stated that "Approximately 1% of prolines in proteins are cis." Does it mean 99% of prolines in proteins are trans?
So, is trans-proline still more favourable than cis-proline (Slide 87)? Also, do you mean that proline is the only amino acid that can exist in cis while the 19 other amino acids cannot?
• A. Correct. 99% of all prolines are trans and trans is more favourable than cis. The difference in energy for cis vs. trans is smaller for proline than it is for any of the other amino acids, and this is why we occasionally see cis prolines. It is extremely rare to find any of the other 19 amino acids in a cis conformation.
Slide 88: Certain combinations of φ and ψ angles are preferred
Scans downloaded from: http://www.nd.edu/~aseriann/cou.html
• A polypeptide can be thought of as a series of planar units (peptide bonds) joined by flexible hinges (Cα-atoms).
• Each Cα-atom has two rotatable bonds, the N-Cα bond (φ, phi) and the Cα-C bond (ψ, psi)
• Only certain combinations of φ and ψ angles are allowed due to steric clashes between the adjacent residues.
Slide 89: The Ramachandran Plot (φ vs. ψ)
[Figure labels: β-strand conformation; α-helical conformation]
• A graph of φ angle vs. ψ angle vs. occurrence in proteins is called a Ramachandran plot.
• There are actually only a few conformations that are strongly preferred and these give rise to the common elements of secondary structure.
Slide 90: The Ramachandran Plot of a typical protein (as output by the program PROCHECK)
http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• The Ramachandran plot for a particular protein shows the phi-psi torsion angles for all residues in the structure
• By looking at how well the angles match up with the expected distribution, the quality of a structure can be assessed.
• Glycine residues are separately identified by triangles as these are not restricted to the regions of the plot appropriate to the other sidechain types.
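The φ, ψ and ω angles used above are all dihedral (torsion) angles, each defined by four consecutive backbone atoms. As a rough illustration of how such an angle is obtained from atomic coordinates, here is a minimal sketch in plain Python. The function name `dihedral` and the toy coordinates are illustrative assumptions, not real protein coordinates, and the sign convention of the returned angle is simply whatever this particular formula gives.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond, for four points in 3D."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(x * _dot(b0, b1) for x in b1))
    w = _sub(b2, tuple(x * _dot(b2, b1) for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Toy geometry: outer atoms on the same side (cis) or opposite sides (trans).
    cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
    trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
    print("cis:", round(abs(cis)), "trans:", round(abs(trans)))
```

For ω the four atoms would be Cα, C, N and the next Cα; a planar trans peptide bond then gives a magnitude near 180º and a cis bond a value near 0º.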
• The coloring/shading on the plot represents the various levels of favorability: the darkest areas (here shown in red) correspond to the "core" regions representing the most favorable combinations of phi-psi values.
• A properly folded protein will have over 90% of the residues in these "core" regions.
• The percentage of residues in the "core" regions is one of the better guides to stereochemical quality for assessing experimental protein structures.
• An ideal Ramachandran plot can be generated computationally using known atomic radii and bond distances.
Slide 91: alpha-helices: an 'island' of preferred conformation
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• As mentioned earlier, Pauling and Corey twisted models of polypeptides around to find ways of getting the backbone into regular conformations which would agree with experimental diffraction data (much like the way the structure of DNA was determined). The simplest and most elegant arrangement is a right-handed spiral conformation known as the 'alpha-helix'.
• The structure repeats itself every 5.4 Angstroms along the helix axis, i.e. we say that the alpha-helix has a pitch of 5.4 Angstroms. Alpha-helices have 3.6 amino acid residues per turn, i.e. a helix 36 amino acids long would form 10 turns. The separation of residues along the helix axis is 5.4/3.6 or 1.5 Angstroms, i.e. the alpha-helix has a rise per residue of 1.5 Angstroms.
• Every mainchain C=O and N-H group is hydrogen-bonded to a peptide bond 4 residues away (O(i) to N(i+4)). This gives a very regular, stable arrangement.
• The peptide planes are roughly parallel with the helix axis and the dipoles within the helix are aligned. That is, all C=O groups point in the same direction and all N-H groups point the other way. This alignment of C=O and N-H bonds gives the alpha-helix a permanent dipole with a partial positive charge at the amino-terminus and a partial negative charge at the carboxy-terminus.
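The helix arithmetic quoted above (pitch 5.4 Å, 3.6 residues per turn, hence a rise of 1.5 Å per residue) can be written out as a short calculation. This is a minimal sketch in Python; the constant and function names are illustrative, and the numbers describe an ideal alpha-helix only.

```python
PITCH_ANGSTROM = 5.4       # repeat distance along the helix axis (from the text above)
RESIDUES_PER_TURN = 3.6    # residues per helical turn (from the text above)

# Rise per residue follows from the two values above: 5.4 / 3.6 = 1.5 Angstroms.
RISE_PER_RESIDUE = PITCH_ANGSTROM / RESIDUES_PER_TURN


def helix_turns(n_residues):
    """Number of turns formed by an ideal alpha-helix of n_residues."""
    return n_residues / RESIDUES_PER_TURN


def helix_length(n_residues):
    """Approximate length along the axis, in Angstroms, of an ideal alpha-helix."""
    return n_residues * RISE_PER_RESIDUE


if __name__ == "__main__":
    n = 36  # the example used in the text above
    print(f"{n} residues: {helix_turns(n):.1f} turns, {helix_length(n):.1f} Angstroms long")
```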
• Side chains point outward from the helix axis and are generally oriented towards its amino-terminal end.
• All the amino acids have negative phi and psi angles, typical values being -60 degrees and -50 degrees, respectively
Slide 92: beta-strands: another 'island' of preferred conformation
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• In addition to the alpha helix, Pauling and Corey discovered another periodic structural motif which they named the β-pleated sheet (β because it was the second structure that they elucidated, the α-helix being the first).
• The β-sheet differs markedly from the rodlike α-helix. A polypeptide chain, called a β-strand, in a β-sheet is almost fully extended rather than being tightly coiled as in the α-helix. A range of extended structures are sterically allowed. The side chains of adjacent amino acids point in opposite directions.
• A β-sheet is formed by linking two or more β-strands by hydrogen bonds. Adjacent chains in a β-sheet can run in opposite directions (antiparallel β-sheet) or in the same direction (parallel β-sheet).
• In the antiparallel arrangement, the NH group and the CO group of each amino acid are respectively hydrogen bonded to the CO group and the NH group of a partner on the adjacent chain.
• In the parallel arrangement, for each amino acid, the NH group is hydrogen bonded to the CO group of one amino acid on the adjacent strand, whereas the CO group is hydrogen bonded to the NH group on the amino acid two residues farther along the chain.
• Many strands, typically 4 or 5 but as many as 10 or more, can come together in β-sheets. Such β-sheets can be purely antiparallel, purely parallel, or mixed.
• β-sheets can be relatively flat but most adopt a somewhat twisted shape.
Slide 93: Turns and loops connect strands and helices
http://www.cryst.bbk.ac.uk/PPS2/course/section3/rama.html
• Most proteins have compact, globular shapes, requiring reversals in the direction of their polypeptide chains.
Many of these reversals are accomplished by reverse turns and hairpins.
• A reverse turn is a region of the polypeptide having a hydrogen bond from one main chain carbonyl oxygen to the main chain N-H group 3 residues along the chain (i.e. O(i) to N(i+3)). Helical regions are excluded from this definition, and turns between beta-strands form a special class of turn known as the beta-hairpin.
• Reverse turns are very abundant in globular proteins and generally occur at the surface of the molecule. It has been suggested that turn regions act as nucleation centers during protein folding.
• Reverse turns are divided into classes based on the phi and psi angles of the residues at positions i+1 and i+2. Types I and II shown in the figure are the most common reverse turns, the essential difference between them being the orientation of the peptide bond between residues at (i+1) and (i+2). The torsion angles for the residues (i+1) and (i+2) in the two types of turn lie in distinct regions of the Ramachandran plot.
• 2-residue beta-hairpin turns occur between two antiparallel beta-strands as shown in the figure.
• The residues forming these two-residue turns have torsion angles in characteristic regions of the Ramachandran plot.
• For type I' turns, residue 2 is always glycine, whereas for type II' turns residue 1 is always glycine. This is because amino acids other than glycine would cause steric hindrance involving the residue's side chain and the main chain.
• In other cases, more elaborate structures are responsible for chain reversals. These structures are called loops or sometimes Ω loops (omega loops) to suggest their overall shape. Unlike α-helices and β-strands, loops do not have regular, periodic structures. Nonetheless, loop structures are often rigid and well defined. Turns and loops invariably lie on the surfaces of proteins and thus often participate in interactions between proteins and other molecules.
• For example, a part of an antibody molecule has surface loops (shown in red) that mediate interactions with other molecules.
• Q. Do reverse turns only exist between alpha-helices? And do beta-hairpins only exist in beta-sheets?
• A. As the name implies, the beta hairpin is most commonly found as a connector between strands of an antiparallel beta sheet. The reverse turn is a bit more general and can be found in loops that connect both helices and strands.
• Q. What are the differences/relationships between reverse turns, beta-hairpin turns and omega loops?
• A. Reverse turns and beta turns do look very similar when you look at the structures on the slide. However, there are key differences in the conformations of amino acids that define each of these two types of turns. I don't expect you to know the details of these differences. One thing you should remember is that beta turns are typically used to connect two strands of anti-parallel beta sheet. An omega loop is a larger structure that is supposed to look something like the omega character (Ω). That is, the ends are very close together but the loop itself is large and extends out into space. The variable regions of an antibody can be described as omega loops.
• Q. In slide 93, type II looks similar to type I'. I know one is for a reverse turn and the other is for a beta hairpin. If I have a structure like that, how can I tell whether it is a beta-hairpin or a reverse turn? I know the feature for a beta-hairpin is that residue 2 in type I' and residue 1 in type II' should be glycine. Is there any other feature for a reverse turn? Do we need to know how to tell type I and type II reverse turns apart? Another question is the difference between reverse turns, beta-hairpins and loops. Do we tell them apart by the number of amino acids?
• A. There are key differences in the conformations of the residues in these turns, which is the basis by which they are classified. I don't expect you to know the details of these differences.
One thing you should remember is that beta turns are typically used to connect two strands of anti-parallel beta sheet.
Slide 94: Proteins are generally composed of α-helices and/or β-sheets connected by turns and loops
• The α-helical content of proteins ranges widely, from nearly none to almost 100%. For example, about 75% of the residues in ferritin, a protein that helps store iron, are in α-helices. Single α-helices are usually less than 45 Å long. However, two or more α-helices can entwine to form a very stable 'coiled coil' structure, which can have a length of 1000 Å (100 nm, or 0.1 μm) or more. Such α-helical 'coiled coils' are found in myosin and tropomyosin in muscle, in fibrin in blood clots, and in keratin in hair. The helical cables in these proteins serve a mechanical role in forming stiff bundles of fibers, as in porcupine quills. The cytoskeleton (internal scaffolding) of cells is rich in so-called intermediate filaments, which also are two-stranded α-helical coiled coils. Many proteins that span biological membranes also contain α-helices.
• The β-sheet is an important structural element in many proteins. For example, fatty acid-binding proteins, important for lipid metabolism, are built almost entirely from β-sheets.
Slide 95: Protein folding is largely driven by hydrophobic interactions
[Figure: Myoglobin, surface and cross section, coloured as hydrophobic or hydrophilic]
• Myoglobin, the oxygen carrier in muscle, is a single polypeptide chain of 153 amino acids. The capacity of myoglobin to bind oxygen depends on the presence of heme, a prosthetic (helper) group consisting of protoporphyrin IX and a central iron atom.
• The folding of the main chain of myoglobin, like that of most other proteins, is complex and devoid of symmetry. A unifying principle emerges from the distribution of side chains. The striking fact is that the interior consists almost entirely of nonpolar residues such as leucine, valine, methionine, and phenylalanine.
Charged residues such as aspartate, glutamate, lysine, and arginine are absent from the inside of myoglobin. The only polar residues inside are two histidine residues, which play critical roles in binding iron and oxygen.
• The outside of myoglobin, on the other hand, consists of both polar and nonpolar residues. This contrasting distribution of polar and nonpolar residues reveals a key facet of protein architecture. In an aqueous environment, protein folding is driven by the strong tendency of hydrophobic residues to be excluded from water.
• The polypeptide chain therefore folds so that its hydrophobic side chains are buried and its polar, charged chains are on the surface.
• The secret of burying a segment of main chain in a hydrophobic environment is pairing all the NH and CO groups by hydrogen bonding. This pairing is neatly accomplished in an α-helix or β-sheet.
• The ability to predict whether or not a given polypeptide sequence will fold into a given tertiary structure remains one of the 'grand challenges' of science.
• In nature, proteins fold either independently or with the help of other proteins known as chaperones.
Slide 96: Membrane proteins have grease on the outside
[Figure: K+-channel in a lipid bilayer. Three views of the same structure.]
• Some proteins that span biological membranes are "the exceptions that prove the rule" regarding the distribution of hydrophobic and hydrophilic amino acids throughout three-dimensional structures. For example, ion channels are covered on the outside largely with hydrophobic residues that interact with the neighbouring alkane chains. The inner channel is quite polar and there are many specific interactions with the ion being transported.
• David S. Goodsell: The Molecule of the Month appearing at the PDB
• Potassium ions move through this channel from inside the cell to the outside. The driving force for this movement is simply the concentration gradient.
Cells concentrate potassium ions inside, and then these ions are released when the membrane depolarizes (for example, during transmission of signals through the nervous system). The selectivity filter is the part with the backbone carbonyls oriented towards the ion in the centre of the channel. Only potassium (not sodium) is perfectly coordinated by these carbonyl oxygen atoms, and so only it can pass through the channel. It is my understanding that potassium ions are normally surrounded by 8 water molecules, whereas sodium is normally surrounded by 6.
• The 2003 Nobel Prize in Chemistry was awarded for work in the area of channels
  • Roderick MacKinnon pioneered x-ray crystallography of ion channels.
  • Peter Agre discovered water channels.
• Water channels facilitate the rapid transport of water across cell membranes in response to osmotic gradients. These channels are believed to be involved in many physiological processes that include renal water conservation, neuro-homeostasis, digestion, regulation of body temperature and reproduction. Members of the water channel superfamily have been found in a range of cell types from bacteria to human.
Slide 97: Chaperone assisted protein folding
http://www.users.csbsju.edu/~hjakubow/classes/rasmolchime/02ch331finproj/GroELES/templateprotGROEL2.htm
• Folding of proteins in vitro tends to be an inefficient process, with only a minority of unfolded molecules undergoing complete folding within a few minutes.
• More than 95 percent of the proteins present in cells are in their native conformation.
• The explanation for the cell's remarkable efficiency in promoting protein folding probably lies in chaperones, a family of proteins found in all organisms from bacteria to humans.
• There are two general families of chaperones: molecular chaperones, which bind and stabilize unfolded or partially folded proteins, thereby preventing these proteins from being degraded; and chaperonins, which directly facilitate their folding.
• Chaperonins are probably used for a specific and relatively small selection of proteins, whereas molecular chaperones are used for most, if not all, proteins.
• All chaperones have ATPase activity, and their ability to bind and stabilize their target proteins is specific and dependent on ATP hydrolysis.
• Molecular chaperones include the Hsp70 family of proteins. When bound to ATP, Hsp70 assumes an open form in which an exposed hydrophobic pocket transiently binds to exposed hydrophobic regions of the unfolded target protein. Hydrolysis of the bound ATP causes Hsp70 to assume a closed form, releasing the target protein. Molecular chaperones are thought to bind all nascent polypeptide chains as they are being synthesized on ribosomes.
Slide 98: More on GroEL
[Figure labels: Hydrophobic stripe; ATP-binding site; Large cavity. David S. Goodsell: The Molecule of the Month appearing at the PDB]
• Proper folding of a small proportion of proteins (e.g., the cytoskeletal proteins actin and tubulin) requires additional assistance, which is provided by chaperonins.
• Shown on this slide is the bacterial chaperonin, GroEL, which contains 14 identical subunits arranged in two stacked rings of seven (green). GroES is shown at the bottom in pink.
• The large GroEL-GroES complex is available in PDB entry 1aon. In this picture, three of the subunits in each GroEL ring have been removed to show the interior, leaving four subunits in each ring. On the two in back, the hydrophobic amino acids, LEU, ILE, VAL, MET, PHE, TYR and TRP, are coloured blue.
• Notice the stripe of hydrophobic amino acids around the entry at the top. This stripe interacts strongly with unfolded proteins, coaxing them into the upper cavity. Once the unfolded protein is bound, ATP and GroES bind to GroEL. This causes a conformational change that forces the protein into the larger lower cavity that is much more hydrophilic than the upper cavity.
• Now that the protein is in a hydrophilic environment, it will be forced to fold in order to minimize the unfavourable interactions between its hydrophobic portions and its hydrophilic environment.
• After the protein has folded, ATP is hydrolyzed and GroES (the lid on the cavity) is released along with the newly folded protein.
• Q: When a chaperonin is used to help proteins fold, does GroES bind to GroEL on the large cavity side or the hydrophobic stripe side?
• A: I believe it can bind to both sides. Don't worry about the details.
Slide 99: Proteins often consist of multiple independent domains and have 4º structure
[Figure panels: CD4; Cro; hemoglobin; Rhinovirus (http://web.mit.edu/esgbio/www/cb/virus/virus.html); Immunoglobulin (antibody)]
• Some polypeptide chains fold into two or more compact regions that may be connected by a flexible segment of polypeptide chain, rather like pearls on a string.
• These compact globular units, called domains, range in size from about 30 to 400 amino acid residues.
• For example, the extracellular part of CD4 (shown at top), the cell-surface protein on certain cells of the immune system to which the human immunodeficiency virus (HIV) attaches itself, comprises four similar domains of approximately 100 amino acids each. Often, proteins are found to have domains in common even if their overall tertiary structures are different.
• Antibodies (immunoglobulins) have a distinct domain structure in addition to quaternary structure. We will be taking a much closer look at antibody structure in the next section.
• Quaternary structure refers to the spatial arrangement of subunits and the nature of their interactions.
• The simplest sort of quaternary structure is a dimer, consisting of two identical subunits. This organization is present in the DNA-binding protein Cro found in a bacterial virus called λ.
• More complicated quaternary structures also are common. More than one type of subunit can be present, often in variable numbers.
For example, human hemoglobin, the oxygen-carrying protein in blood, consists of two subunits of one type (designated α) and two subunits of another type (designated β). Thus, the hemoglobin molecule exists as an α2β2 tetramer.
• Viruses make the most of a limited amount of genetic information by forming coats that use the same kind of subunit repetitively in a symmetric array. The coat of rhinovirus, the virus that causes the common cold, includes 60 copies each of four subunits. The subunits come together to form a nearly spherical shell that encloses the viral genome.
• Q: The slide mentions that the coat of rhinovirus includes 60 copies each of four subunits, but in the picture I only see three coloured subunits. What is wrong here?
• A: There is a 4th protein that is inside and not visible from the outside.
Post-translational modifications of proteins
[Figure label: Ubiquitylation (Lys). Courtesy of Spencer Alford]
• Many proteins are covalently modified, through the attachment of groups other than amino acids, to augment their functions. Many proteins, especially those that are present on the surfaces of cells or are secreted, acquire carbohydrate units on specific asparagine residues. The addition of sugars makes the proteins more hydrophilic and able to participate in interactions with other proteins. Conversely, the addition of a fatty acid to an α-amino group or a cysteine sulfhydryl group produces a more hydrophobic protein that will be tightly associated with the membrane.
• Proteins can also be reversibly modified to regulate their activity. Perhaps the single most important of all modifications is phosphorylation and dephosphorylation of serine, threonine, and tyrosine residues. Regulation of protein activity by phosphorylation is the basis for intracellular signalling. The enzymes that catalyze the addition of phosphate groups (from ATP donors) are called kinases (why kinases?). Enzymes that remove phosphate groups are called phosphatases.
• Histones, proteins that assist in the packaging of DNA into chromosomes as well as in gene regulation, are rapidly acetylated and deacetylated on specific lysine residues in vivo. More heavily acetylated histones are associated with genes that are being actively transcribed. A more permanent modification of lysines in histone proteins is methylation.
• The attachment of ubiquitin, a protein comprising 76 amino acids, is a signal that a protein is to be destroyed, the ultimate means of regulation.
• This slide shows only a few of the common examples. A number of additional post-translational modifications are known.
101 Post-translational modifications are catalyzed by enzymes and often serve as binding sites
[Figure panels, each pairing a reaction scheme with the structure of a domain bound to the modified residue:
a Phosphorylation: tyrosine → phosphotyrosine (protein kinase, ATP → ADP; reversed by phosphatase, H2O → Pi); SHC P–Tyr bound to the GRB2 SH2 domain.
b Methylation: lysine → ε-N-monomethyllysine (SAM → SAH; the reverse arrow is labelled with O2, H2O2 and H2C=O); histone H3 Me–Lys bound to the HP1 chromodomain.
c Acetylation: lysine → ε-N-monoacetyllysine (HAT, acetyl CoA → CoA; reversed by HDAC, H2O); histone H4 Ac–Lys bound to the GCN5 bromodomain.
d Ubiquitylation: lysine → N-ubiquitinyllysine (ubiquitin + ATP; reverse arrow labelled with H2O and ubiquitin); ubiquitin bound to the Vps27 UIM.
e Hydroxylation: proline → hydroxyproline (prolyl hydroxylase, O2; reverse arrow labelled "Dehydroxylase?"); HIF1α OH–Pro bound to VHLβ.]
Figure 1 | Example post-translational modification reactions and structures of protein-interaction-domain–ligand complexes.
Various amino-acid side chains can be modified by, for example: phosphorylation (a); methylation (b); acetylation (c); ubiquitylation (d); and hydroxylation (e). The enzymes that are involved in adding and removing these post-translational modifications are shown on the reaction arrows. The structures on the far right show examples of protein-interaction domains (pale purple) in complex with their respective ligands (red). The structures were obtained from the Protein Data Bank (accession codes 1JYR, 1Q3L, 1E6I, 1Q0W and 1LM8 for parts a–e, respectively) and were manipulated using Pymol (Delano Scientific). GCN5, general control of amino-acid-synthesis protein-5; GRB2, growth-factor-receptor-bound protein-2; HAT, histone acetyltransferase; HDAC, histone deacetylase; HIF1α, hypoxia-inducible factor-1α; HP1, heterochromatin protein-1; Me–Lys, methylated lysine; OH–Pro, hydroxylated proline; Pi, inorganic phosphate; P–Tyr, phosphotyrosine; SAH, S-adenosylhomocysteine; SAM, S-adenosylmethionine; SH2, Src-homology-2; SHC, SH2-domain-containing transforming protein; Ub, ubiquitin; UIM, ubiquitin-interacting motif; VHLβ, von Hippel–Lindau protein-β; Vps27, vacuolar protein sorting protein-27.
[Enzyme labels on the reaction arrows: methyltransferase and amine oxidase demethylase (b); ubiquitin ligase (E3) and deubiquitylating isopeptidase (d); prolyl hydroxylase and "Dehydroxylase?" (e)]
• One of the main functions of PTMs is to provide a binding site for a protein partner with a suitable interaction domain. In this way, the location of proteins inside of cells can be dynamically changed.
Nature Reviews Molecular Cell Biology 7, 473-483 (July 2006)
… words and phrases, and has engendered the hypothesis that protein domains represent a basic syntactic unit of cellular organization (ref. 5). In this article, we briefly discuss the common ways in which PTMs and interaction domains synergize to regulate cellular processes, and we provide specific examples that involve phosphorylation, methylation, acetylation, ubiquitylation and sumoylation (FIGS 1,2). This is not intended to be a comprehensive analysis, but rather aims to highlight the general strategies through which PTMs exert their effects.
PTM-dependent interactions: common strategies
In the following subsections we briefly discuss the common mechanisms by which PTM-dependent interactions regulate cellular processes.
PTM-induced interactions. Interaction domains often recognize short peptide motifs that are embedded in target proteins, but do not bind stably until the peptide has acquired an appropriate PTM (FIGS 1,2a). Such domains usually have a conserved binding pocket for the modified residue and a more variable surface that selectively engages the flanking amino acids, and thereby distinguishes between different peptide motifs with the same PTM (refs 6–9). Both the domains and the peptide motifs that they recognize are modular in design and can therefore, in principle, be incorporated into many different proteins.
Cooperative interactions and multi-site PTMs. PTM-dependent interactions can be cooperative, such that a signal is only generated after two or more sites on the same protein have been modified. This can be achieved in various ways. First, a doubly modified motif can be recognized in an obligatory fashion by two tandem interaction domains, as is the case for the recognition of two phosphotyrosine (pTyr)-containing motifs by the tandem Src-homology-2 (SH2) domains of the ZAP-70 (ζ-chain (T-cell receptor)-associated protein kinase 70 kDa) protein tyrosine kinase (ref. 10) (FIG. 2b). This can increase both the affinity and the specificity of the interaction. Second, a single domain can possess two binding pockets for the modified residues.
For example, the single SH2 domain of the APS protein (adaptor molecule containing pleckstrin-homology (PH) and SH2 domains protein) binds to two pTyr residues in the activation loop of the insulin receptor kinase; furthermore, two APS SH2 domains form a non-covalent dimer, which potentially stabilizes the activated receptor (ref. 11). Third, a domain with a single binding pocket can bind specifically to a protein that carries several modifications. For example, the WD40-repeat domain of the Saccharomyces cerevisiae protein Cdc4 (cell-division cycle-4) only binds to its target, Sic1 (substrate inhibitor of cyclin-dependent protein kinase-1), when the target has been phosphorylated during the G1 phase of the cell cycle on at least six Ser/Thr residues (FIG. 2c). As Cdc4 is the substrate-targeting subunit of an SCF (Skp1–Cul1–F-box) E3 ubiquitin ligase complex, phosphorylation of Sic1 leads to its polyubiquitylation and degradation by the proteasome. This, in turn, is necessary for the onset of DNA synthesis, because Sic1 …
102 Many proteins are assembled from multiple interaction domains with predictable functions
By combining various modular binding domains, plus a catalytic domain, sophisticated dynamic regulation of protein activity is possible. It is apparent that evolution has shuffled and recycled these domains many times to make the proteins that control cell signalling.
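The five modifications of Figure 1 can be collected into a small lookup table, which makes the "enzyme adds it, enzyme removes it, domain reads it" pattern easy to query. The following is only an illustrative Python sketch: the class name `Modification`, its field names and the helper `readers_for` are hypothetical labels chosen here, while the residues, enzymes, domains and PDB codes are those given in the figure and its legend. The figure marks the enzyme that would reverse hydroxylation only as "Dehydroxylase?", so it is stored as `None`.

```python
from typing import Dict, List, NamedTuple, Optional


class Modification(NamedTuple):
    residue: str            # amino acid that carries the modification
    adds: str               # enzyme on the forward reaction arrow
    removes: Optional[str]  # enzyme on the reverse arrow (None if uncertain)
    reader: str             # interaction domain shown bound to the modified residue
    pdb: str                # PDB accession code from the figure legend


# Entries are kept in the order of figure parts a-e.
FIGURE_1: Dict[str, Modification] = {
    "phosphorylation": Modification(
        "Tyr", "protein kinase", "phosphatase", "GRB2 SH2 domain", "1JYR"),
    "methylation": Modification(
        "Lys", "methyltransferase", "amine oxidase demethylase",
        "HP1 chromodomain", "1Q3L"),
    "acetylation": Modification(
        "Lys", "HAT", "HDAC", "GCN5 bromodomain", "1E6I"),
    "ubiquitylation": Modification(
        "Lys", "ubiquitin ligase (E3)", "deubiquitylating isopeptidase",
        "Vps27 UIM", "1Q0W"),
    # The figure labels the reverse enzyme "Dehydroxylase?", i.e. not established.
    "hydroxylation": Modification(
        "Pro", "prolyl hydroxylase", None, "VHLβ", "1LM8"),
}


def readers_for(residue: str) -> List[str]:
    """Return the reader domains in Figure 1 that bind a modified form of `residue`."""
    return [m.reader for m in FIGURE_1.values() if m.residue == residue]


if __name__ == "__main__":
    for name, m in FIGURE_1.items():
        print(f"{name:16s} {m.residue}  read by {m.reader} (PDB {m.pdb})")
    print("Readers of modified lysine:", readers_for("Lys"))
```

Querying by residue shows that a single amino acid (lysine) can carry three different marks, each read by a different domain, which is the modularity the slide describes.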
Examples: http://pawsonlab.mshri.on.ca
Pawson and Nash, Science 300, 445 (2003)
103 Modular proteins interact with each other in complex, yet rational, pathways (circuits)
“…mammalian cyclin E activates cyclin-dependent kinase-2 (CDK2) to promote the G1-to-S phase transition in the cell cycle. The phosphorylation of cyclin E on a Thr residue is required for its recognition by the WD40-repeat domain of the targeting subunit, CDC4 (cell-division cycle-4), of an SCF (SKP1–CUL1–F-box) E3 ubiquitin-ligase complex. This interaction leads to the addition of a Lys48-linked polyubiquitin chain to cyclin E, which results in its subsequent recruitment to the proteasome by a ubiquitin-interacting motif (UIM) and its destruction.”
[Figure labels: SCF E3 complex; E1; E2; CDC4 (WD40)8 F-box; Cyclin E; CDK2; P–Thr; Ub; UIM; Proteasome; SHC PTB; P–Tyr; GRB2 SH2. Panel titles: Cell-cycle regulation; Cell signalling; Ras–MAPK signalling]
CDK2 is an important kinase that needs to be activated in a cell-cycle-dependent fashion. Specifically, it needs to be active at the G1 to S transition point.
While turning on the activity of a kinase is important, so is turning it off. This is accomplished by targeting the cyclin E activator for destruction by the proteasome.
Nature Reviews Molecular Cell Biology 7, 473-483
Figure 3 | Networks of modification-dependent interactions re… phosphorylation (P) and ubiquitin (Ub)-dependent protein interacti… induced signalling and endocytic trafficking. Inducible post-transla… highlighted by red arrows. Sequential PTM-dependent interactions … examples. In the first example, mammalian cyclin E activates cyclin-… phase transition in the cell cycle. The phosphorylation of cyclin E on … WD40-repeat domain of the targeting subunit, CDC4 (cell-divisio… ubiquitin-ligase complex. This interaction leads to the addition of a … results in its subsequent recruitment to the proteasome by a ubiquit… second example, epidermal growth factor (EGF) receptor autophos… can recruit the Src-homology-2 (SH2) domain of the E3 ubiquitin lig… oncogene). Cbl monoubiquitylates the receptor to provide docking … EPS15 (EGF-receptor-pathway substrate-15), which are involved in … undergoes an intramolecular interaction with a monoubiquitylated … and ‘open’ conformations). The binding of the phosphotyrosine-bin… transforming protein) to a pTyr site on the EGF receptor induces the … in turn, recruits the SH2 domain of GRB2 (growth-factor-receptor-b… The figure also shows the convergence of distinct interaction doma… SH2 and PTB domains for pTyr) and the modification of the same po… the multi-site phosphorylation and ubiquitylation of the EGF recept… activating enzyme; E2, ubiquitin-conjugating enzyme; MAPK, mitog…
… residues in a manner that depends on ligand phosphorylation and the identity of the flanking amino acids (ref. 26) (FIG. 2a). Activated receptor tyrosine kinases (RTKs), such as the β-platelet-derived growth factor receptor or the epidermal growth factor receptor (EGFR), become phosphorylated at multiple Tyr sites, and each of these sites selectively binds the SH2 domain of one or more cytoplasmic signalling proteins, which, in turn, activate specific intracellular signalling pathways (refs 27–29) (FIGS 3,4a). Among the SH2-domain-containing proteins that are recruited to an autophosphorylated RTK such as EGFR is the Cbl (Casitas B-lineage lymphoma proto-oncogene) E3 ubiquitin ligase, which subsequently ubiquitylates the …
104 Disulphide bonds are another type of post-translational modification
[Figure: Ribonuclease (PDB 5RSA)]
• Disulphide bonds generally only form under oxidizing conditions. They are very common in extracellular proteins such as those found in blood.
• Disulphide bonds generally do not form in the cytoplasm of living cells because this is a reducing environment (~5 mM glutathione, a thiol).
• A protein that has disulphide bonds will tend to be very stably folded relative to proteins that do not have disulphide bonds, since there is a covalent link holding two sections of chain together (as opposed to only non-covalent interactions in a protein that lacks disulphide bonds). You would expect a protein with disulphide bonds to be more stable at elevated temperatures than a protein without disulphide bonds.
• To maintain the activity of a protein with free thiols in vitro, it is important to add reducing agents such as β-mercaptoethanol, TCEP, or DTT.
105 In vitro, reducing agents are necessary to maintain thiols in a reduced form
[Figure: structures of dithiothreitol (DTT), β-mercaptoethanol (2-mercaptoethanol) and tris(2-carboxyethyl)phosphine hydrochloride (TCEP·HCl), with arrows labelled reduction and oxidation]
• Don’t use β-mercaptoethanol (at least not in this building) if you can help it. It stinks.
• DTT or TCEP are much better choices. These two reducing agents are similar in terms of effectiveness. TCEP can be slightly more expensive but it benefits from not having a free thiol (a good nucleophile). This is advantageous if you are working with electrophiles in solution.
106 The chromophore of green fluorescent protein is a unique post-translational modification
[Scheme: Ser65-Tyr66-Gly67 tripeptide → cyclization → oxidation & dehydration → mature chromophore]
MSKGEELFTGVVPILVELDGDVNGQKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL
VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFYKDDGNYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKMEYNYNSHNVYIMADKPKNGIKVNFKIRHNIKDGSVQLAD
HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMILLEFVTAAGITHGMDELYK
Prasher et al. Gene (1992) 111: 229-233.
Tsien, Annu. Rev. Biochem. (1998) 67: 509-544.
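The residue numbers on this slide can be checked directly against the sequence printed above. The short Python sketch below simply indexes into that sequence; the helper names `residue` and `segment` are hypothetical and chosen here for illustration. It looks up the chromophore-forming tripeptide at positions 65 to 67 and two conserved residues named for this protein, arginine 96 and glutamate 222.

```python
# GFP sequence exactly as printed on the slide (one-letter amino-acid codes).
GFP = (
    "MSKGEELFTGVVPILVELDGDVNGQKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL"
    "VTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFYKDDGNYKTRAEVKFEGDTLV"
    "NRIELKGIDFKEDGNILGHKMEYNYNSHNVYIMADKPKNGIKVNFKIRHNIKDGSVQLAD"
    "HYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMILLEFVTAAGITHGMDELYK"
)


def residue(seq: str, position: int) -> str:
    """Return the one-letter code at a 1-based sequence position."""
    return seq[position - 1]


def segment(seq: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive)."""
    return seq[first - 1:last]


if __name__ == "__main__":
    print("length:", len(GFP))
    print("residues 65-67:", segment(GFP, 65, 67))  # S, Y, G = Ser-Tyr-Gly
    print("residue 96:", residue(GFP, 96))          # R = arginine
    print("residue 222:", residue(GFP, 222))        # E = glutamate
```

Biochemists number residues from 1, whereas Python strings are indexed from 0, which is why both helpers subtract one from the starting position.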
• Yet another type of post-translational modification is one generated by chemical rearrangement of side chains and, sometimes, the peptide backbone. For example, the Aequorea victoria jellyfish produces a green fluorescent protein (GFP). Surprisingly, the chromophore is not a bound cofactor but rather a post-translationally modified sequence of amino acids.
• The chromophore (also okay to call it a fluorophore) is formed by the protein-promoted rearrangement and oxidation of the sequence Ser-Tyr-Gly within the center of the protein. GFP is of tremendous utility to researchers as a marker within cells. The Nobel Prize in Chemistry for 2008 was awarded to 3 researchers (Tsien, Chalfie, and Shimomura) who are considered pioneers in the development of GFP as a research tool.
• The spontaneous formation of the Aequorea green fluorescent protein chromophore within the folded beta-can protein structure must necessarily involve at least three key steps: cyclization of the main chain, loss of a molecule of water (dehydration), and oxidation with molecular oxygen. The exact order and mechanism of these steps are a matter of ongoing investigation.
• Chromophore formation is spontaneous only within the context of the fluorescent protein can structure, where steric constraints force the peptide into a tight turn conformation (Branchini et al. 1998) and the side chains of highly conserved residues, such as glutamate 222 and arginine 96, are positioned to facilitate the reaction.
107 Fluorescent proteins engineered to fluoresce at different wavelengths
• Note that, because the GFP chromophore is generated from amino acids, we can change the structure of the chromophore by introducing mutations in the gene that ultimately change the amino acids present in the chromophore-forming tripeptide (or its immediate surroundings).
• For example, by changing the tyrosine of the GFP chromophore to other aromatic amino acids, new chromophore structures can be formed that fluoresce at different wavelengths from the wild-type protein.
108 Visualizing the central dogma in a live cell!
[Figure labels: cyan fluorescence = Lac operator in nucleus and peroxisomes in cytoplasm; yellow fluorescence = RNA transcripts; an immortalized human cancer cell known as U2OS; gene for LacI - cyan FP; gene for MS2bp - yellow FP; 256 x Lac operator, promoter, cyan FP (peroxisome-targeted), 24 x MS2-repeats]
This cell has 3 different genes artificially introduced into it:
• LacI - cyan FP, the gene for LacI fused to the gene for a cyan fluorescent protein
• MS2bp - yellow FP, the gene for MS2bp fused to the gene for yellow fluorescent protein
• A peroxisome-targeted cyan FP followed by 24 copies of MS2 in the 3′ UTR. This gene is preceded by 256 copies of the Lac operator, which are not transcribed.
• Definitions:
• U2OS cell line: a human cell line cultivated from the bone tissue of a fifteen-year-old human female suffering from osteosarcoma (the same cancer as Terry Fox had).
• LacI: a bacterial protein that binds tightly to the section of DNA known as the Lac operator.
• Lac operator: a DNA sequence that binds to LacI.
• MS2: an RNA sequence that forms a hairpin structure. This RNA hairpin binds tightly to the viral protein that I have called MS2bp.
• MS2bp (MS2-binding protein): a viral coat protein that binds tightly to the MS2 RNA sequence.
• cyan FP: the cyan fluorescent variant of the green fluorescent protein
• yellow FP: the yellow fluorescent variant of the green fluorescent protein
• peroxisomes: small organelles in the cytoplasm of eukaryotic cells. They have a role in destroying peroxides in the cell.
• This is the work of the labs of David L. Spector and Robert H. Singer.
• Q. This is about the topic of visualizing the central dogma in the class today.
In the example of the sequence having 256 copies of the Lac operator and 24 copies of MS2, the figure shows both the cyan and yellow fluorescence separately. But will more of the cyan fluorescence actually be noticed, since the sequence has 256 copies of Lac and 24 copies of MS2? The figure actually shows a larger number of yellow spots than cyan spots in the nucleus. Could you please explain whether there will be any differences?
• A. All of the copies of the Lac operator DNA sequence are located in the same place and there is only one 'copy' of the 256 copies in the cell (i.e., there is only one copy of the genome). There are many, many RNA transcripts created by transcription of this DNA sequence. As we saw earlier in the course, one DNA sequence can be read by many RNA polymerases over and over again in order to produce many mRNA molecules. Each of these transcripts carries the 24 copies of MS2 and thus shows up as a yellow fluorescent spot. You would expect each yellow spot (corresponding to one transcript) to be substantially dimmer than the cyan spot in the nucleus. Each transcript has 24 fluorophores attached to it, while the DNA sequence has 256 fluorophores attached to it.
• Q. Visualizing the central dogma in a living cell. There are three arrows pointing to the nucleus. One is for LacI, one is for a DNA sequence, and another is for MS2bp. Do LacI and MS2bp also stand for DNA sequences? And would they be transcribed and translated to cyan FP and yellow FP, respectively? The translated cyan FP would bind to the Lac operator and the translated yellow FP would bind to the MS2 RNA sequence. Is that right?
• A. Correct. The cell has 3 different genes introduced into its genome. One of these is the gene for LacI-cyan FP and it would be transcribed and translated to form the LacI-cyan FP protein. Likewise the gene for MS2bp-yellow FP would be transcribed and translated to form the MS2bp-yellow FP protein.
You are correct that the proteins then bind to the Lac operator (in DNA) and the MS2 RNA sequence (in RNA), respectively.
109 Visualizing the central dogma in a live cell! example 1
[Figure labels: cyan fluorescence = Lac operator in nucleus and peroxisomes in cytoplasm; yellow fluorescence = RNA transcripts]
• Imaging gene expression in single living cells. Nat Rev Mol Cell Biol 5(10):855-862 (2004 October)
• Visualizing gene expression in living cells. The movie shows a cell with a stably integrated gene that also contains 256 lac operator repeats. This gene transcribes an RNA that contains both a coding sequence for the cyan fluorescent protein (CFP) (with a peroxisome-targeting sequence) and a stretch of MS2 stem-loops. In the beginning of the movie, the gene locus is visible as a result of tagging of the DNA with a CFP–lac-repressor protein. Once transcription is induced from this gene, the locus becomes structurally open and decondenses. The RNAs produced from the gene are tagged with yellow fluorescent protein (YFP)–MS2 and can be seen accumulating at the transcription site. The RNA is translated in the cytoplasm and, at later times post-induction, CFP-labelled peroxisomes are detected. The cell was imaged every 2.5 min for a total of 4 hr and 22.5 min.
• http://singerlab.aecom.yu.edu/supplements/natrevmcb_v5p855/movies03.htm
• Q: The lac operator is connected to the DNA sequence of the cyan FP, and when transcription happens, cyan FP will be generated and the lac operator will not be transcribed? And the lac operator will not be replicated in the nucleus?
• A: Transcription starts from the promoter and so only things that come after the promoter, specifically the cyan FP (peroxisome targeted) and the 24 x MS2 repeats, will be transcribed. The 256 lac operator copies come before the promoter, so they are not transcribed.
The 256 lac operators are just there to serve as a binding site for LacI-cyan FP, so that the location of the inserted DNA sequence can be visualized by fluorescence imaging. All of the DNA will be replicated when the cell divides, but otherwise there will be just one copy of the DNA in the cell.
• Q: The MS2 is an RNA sequence and it is connected to the DNA sequence of the yellow FP? What is the function of the MS2 here and can it be replicated?
• A: 'MS2' is an RNA sequence that forms a hairpin structure. 'MS2 binding protein' (MS2bp) is a protein that binds to the MS2 hairpin. The cell contains the gene for MS2bp fused to yellow FP. When this fusion protein is made in the cell, it will stick to the RNA molecules that contain the 24 x MS2 repeats and allow the RNA molecules to be visualized by fluorescence imaging.
• Q: Is one lac operator connected to one cyan FP?
• A: The lac operator isn't connected to anything. It is just a DNA sequence that is not transcribed. The Lac repressor (LacI) is a protein that binds to the lac operator. The cell contains the gene for LacI fused to cyan FP. When this protein is made, 256 copies of it will stick to the 256 repeats of the lac operator in the DNA.
• Q: Will the content of visualizing the central dogma in a live cell in lecture 4 be included in the exam?
• A: Everything that was covered in class and/or is in assigned reading could be included in the exam.
110 Visualizing the central dogma in a live cell! example 2
[Figure labels: cyan fluorescence = Lac operator in nucleus and peroxisomes in cytoplasm; yellow fluorescence = RNA transcripts; panels 1 to 4]
• Dynamics of Single mRNPs in Nuclei of Living Cells. Science 304(5678):1797-1800 (2004 June 18)
• 1. Detection of open gene locus by CFP-lac repressor.
• 2. Detection of cytoplasmic CFP-peroxisomes (different plane than 1).
• 3. Detection of YFP-MS2 nuclear mRNPs and YFP-MS2 accumulation at the transcription site.
• 4. Different threshold of same cell showing cytoplasmic YFP-MS2 mRNPs.
• An important conclusion from this work is that RNA transcripts diffuse freely inside the nucleus. This is different from the cytoplasm, where they are actively transported.
• http://singerlab.aecom.yu.edu/supplements/science_v304p1797/movies.htm
• Q: In the example of "visualizing the central dogma in a living cell", you mentioned that we tag the MS2 RNA with the YFP. My question is: where does the YFP come from? I found no DNA sequence that codes for the YFP.
• A: The cell also contains a gene encoding MS2bp-YFP.
• Q. I have a question about cyan FP. You mentioned that it is peroxisome-targeted; what do you mean by that? Do you mean that it will interact with the peroxisome? What will happen to it? It is still in the translated region and it should appear in the RNA transcript, so do we have both cyan and yellow colour?
• A. By genetically fusing a protein (including FPs) to specific peptide sequences, it can be targeted to different compartments of the cell. For example, there are also specific sequences that send proteins to the nucleus and the mitochondria. In the case of peroxisomes it is a simple SKL tripeptide that causes a protein to be targeted to this compartment. A peroxisome is a membrane-enclosed organelle and the protein will accumulate inside of it. Adding a targeting sequence does not affect the colour of the FP. These sequences are typically added to the N- or C-terminal tails of the protein.
• Q. If LacI-cyan FP only binds to the Lac operator DNA sequences, why does the whole nucleus appear cyan?
• A. There is an excess of LacI-cyan FP relative to the number of binding sites. It is this unbound protein that is causing the whole nucleus to appear cyan.
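The brightness comparison made in the earlier answer (a single transcript versus the DNA locus) is simple arithmetic, sketched here in Python. This is an illustration under stated assumptions rather than a model of the experiment: it assumes one fluorescent protein per binding site and equal brightness for cyan FP and yellow FP, and the constant and function names are hypothetical.

```python
import math

LAC_OPERATOR_REPEATS = 256  # binding sites for LacI-cyan FP at the DNA locus
MS2_REPEATS = 24            # binding sites for MS2bp-yellow FP on each transcript


def fluorophores(binding_sites: int, fp_per_site: int = 1) -> int:
    """Fluorophores on a fully occupied array (assumes fp_per_site FPs per site)."""
    return binding_sites * fp_per_site


def locus_to_transcript_ratio() -> float:
    """How many times more fluorophores sit on the DNA locus than on one transcript."""
    return fluorophores(LAC_OPERATOR_REPEATS) / fluorophores(MS2_REPEATS)


def transcripts_to_match_locus() -> int:
    """Smallest number of co-localized transcripts carrying at least as many FPs as the locus."""
    return math.ceil(locus_to_transcript_ratio())


if __name__ == "__main__":
    print(f"locus / transcript fluorophore ratio: {locus_to_transcript_ratio():.1f}")
    print("transcripts needed to match the locus:", transcripts_to_match_locus())
```

Under these assumptions a single transcript carries roughly a tenth of the fluorophores found at the locus, which is consistent with each yellow spot being substantially dimmer than the cyan spot, while transcripts accumulating at the transcription site can still add up to a bright yellow signal.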