The MOLECULES of LIFE
Physical and Chemical Principles
Solutions Manual
Prepared by James Fraser and Samuel Leachman

Chapter 5
Evolutionary Variation in Proteins

Problems and Solutions

True/False and Multiple Choice

1. The BLOSUM scoring matrix gives a measure of how conservative a mutation is. For substitutions of aspartic acid (Asp, D), which of the following orderings correctly places the amino acids from most conservative to least conservative?
a. K, L, A, C, E, S
b. E, S, K, A, C, L
c. L, C, A, K, S, E
d. A, C, E, K, L, S

2. An environment profile in the 3D-1D profile method compares:
I. the stability of the amino acid in varying solvents
II. the burial of each amino acid in the structure
III. the hydrophobicity surrounding each amino acid
IV. the type of secondary structure element containing each amino acid
a. I, II, III, and IV
b. I and IV
c. II, III, and IV
d. II and IV

3. Protein domains can be assembled together in many different ways, because surface sidechains can be mutated easily without losing protein stability.
True/False

4. In contrast to ribonuclease, some proteins cannot fold without the assistance of proteins known as molecular chaperones. This means the thermodynamic hypothesis of protein folding does not apply to these proteins.
True/False

5. Two proteins that share more than 50% sequence identity over a 100-residue stretch are likely to have the same three-dimensional fold.
True/False

Fill in the Blank

6. Two common chemical denaturants of proteins are guanidinium and _______.
Answer: urea

7. The core of a protein generally contains residues from the ________ class of amino acids.
Answer: hydrophobic/nonpolar

8. Globin proteins bind the iron-containing ________ cofactor.
Answer: heme

9. A disulfide bond links two _______ residues.
Answer: cysteine

10. According to the BLOSUM substitution matrix, the most conservative mutation from tryptophan (W), other than to itself, is to ______, which has a score of ______.
Answer: tyrosine, 2

11. Many soluble human proteins can be expressed in E. coli bacteria or using an in vitro translation system. How can these proteins fold without the cellular machinery present in human cells?
Answer: The “thermodynamic hypothesis” states that proteins adopt native structures that optimize thermodynamic properties. Since sequence determines structure and most proteins do not require posttranslational modifications, external templates, or specific molecular chaperones to fold, many proteins can fold after translation by a different organism or even a cell-free (in vitro) translation system.

12. What level of activity (1, 10, or 100%) is predicted for ribonuclease-A when it is subject to each of the following stepwise procedures?
a. i. denatured, then ii. reduced, then iii. exposed to oxygen, then iv. refolded by removing urea
b. i. denatured, then ii. reduced, then iii. refolded by removing urea while exposed to oxygen
c. i. denatured, then ii. refolded by removing urea
Rationalize the predictions based on the effect of denaturation and reduction–oxidation of the ribonuclease-A cysteine residues. Assume that all possible unfolded conformations are equally likely and that folding is faster than oxidation.
Answer:
a. Low activity (~1%). (i) The protein is first denatured, forming mostly random structures; however, all disulfide bonds remain intact. (ii) Reduction causes all disulfide bonds to be broken. (iii) Exposure to oxygen causes the disulfide bonds to reform. They reform with random pairings, since the protein is unfolded and populating many random structures. (iv) The protein refolds, but only those molecules in which the disulfides formed correctly are active.
b. High activity (~100%). (i) The protein is first denatured, forming mostly random structures; however, all disulfide bonds remain intact. (ii) Reduction causes all disulfide bonds to be broken. (iii) Exposure to oxygen causes the disulfide bonds to reform, but since folding is faster than oxidation, the disulfide bonds reform correctly.
c. High activity (~100%). (i) The protein is first denatured, forming mostly random structures; however, all disulfide bonds remain intact. (ii) The protein refolds and, since the disulfide bonds are already intact, full activity is restored.

13. A folded protein structure contains six ion pairs between lysine and glutamate. There are no other possible ion pairs in the protein. A chemical crosslinker forms a covalent bond between ion-paired lysine and glutamate sidechains. By analogy to the Anfinsen experiment, the following experiments are done:
a. folded protein → unfold with urea → remove urea to refold → add crosslinker → remove excess crosslinker → measure activity
b. folded protein → unfold with urea → add crosslinker → remove excess crosslinker → remove urea to refold → measure activity
The crosslinker does not by itself alter the activity of the protein when the correct ion pairs are formed. A protein with incorrect ion pairs crosslinked would be inactive. The activity measured at the beginning and end of experiment (a) is 100%. What percentage of the activity do you expect to observe at the end of experiment (b)? Assume that all unfolded conformations are equally likely.
Answer:
There are six ways of picking the first lysine and six ways of picking the first glutamate (6 × 6 = 36).
There are five ways of picking the second lysine and five ways of picking the second glutamate (5 × 5 = 25).
There are four ways of picking the third lysine and four ways of picking the third glutamate (4 × 4 = 16).
There are three ways of picking the fourth lysine and three ways of picking the fourth glutamate (3 × 3 = 9).
There are two ways of picking the fifth lysine and two ways of picking the fifth glutamate (2 × 2 = 4).
There is one way of picking the last lysine and one way of picking the last glutamate (1 × 1 = 1).
The total number of ways to pick is: 36 × 25 × 16 × 9 × 4 × 1 = 518,400.
However, we must correct for the order in which the pairs are picked: it does not matter which lysine–glutamate pair is picked first, second, etc., just that the correct pairs are picked. This means we need to divide the total number by 6! (6 pairs to choose first × 5 pairs to choose second, etc.):
518,400/(6 × 5 × 4 × 3 × 2 × 1) = 720.
Of the 720 random structures populated during crosslinking in experiment (b), only one will be correct. Thus the activity expected is 1/720 = 0.0014, or 0.14%.

14. Use the BLOSUM substitution matrix (Figure 5.11) to compute the sum of the substitution scores (Sij) and the overall likelihood ratio (L) of the following short alignments:
a. PADKTN
   PEEKSA
b. KFLASV
   ATWDPE
Answer:
a. Sequence: P A D K T N
   Sequence: P E E K S A
   Score: (7) + (–1) + (2) + (5) + (1) + (–2)
   Sum of Scores = 12
   Likelihood = 2^(Score/2) = 2^(12/2) = 64
b. Sequence: K F L A S V
   Sequence: A T W D P E
   Score: (–1) + (–2) + (–2) + (–2) + (–1) + (–2)
   Sum of Scores = –10
   Likelihood = 2^(Score/2) = 2^(–10/2) = 1/32 = 0.03125

15. Based on the BLOSUM matrix, how much more likely is it that:
a. tryptophan is substituted by a tyrosine than a tryptophan is substituted by a cysteine?
b. sequence (i) DPKRFL is related to sequence (ii) EPKRFI than sequence (i) is related to sequence (iii) KGKRYA?
To answer this question, you must calculate the ratio of the likelihood ratios for each case. Explain the significance of higher likelihood.
Answer:
a. The score for a W → Y substitution = 2; the likelihood = 2^(Score/2) = 2^(2/2) = 2. The score for a W → C substitution = –2; the likelihood = 2^(Score/2) = 2^(–2/2) = 0.5. Lij/Lik = 2/0.5 = 4 times higher likelihood. This means that a W → Y substitution is more conservative than a W → C substitution.
b. The sum of scores for (i) → (ii) = 27; the likelihood = 11,585.2. The sum of scores for (i) → (iii) = 9; the likelihood = 22.6.
Lij/Lik = 11,585.2/22.6 = 512 times higher likelihood. This means that (i) is more likely to be related to (ii) than to (iii).

16. Proteins known as cyclophilins catalyze proline cis-trans isomerization. A catalytic arginine residue is invariant in all cyclophilins. All other positions change residue identity in different cyclophilins despite the fact that the variant proteins have the same overall fold and general catalytic activity. Explain, given the relationship between protein sequence and structure, how catalytic activity is retained even though most residues can change.
Answer: The relationship between sequence and structure is asymmetric. Many proteins can have the same structure despite different (degenerate) sequences. However, sequences fold into only one structure (following the thermodynamic hypothesis). Only residues that perform specific functions, such as the catalytic arginine in cyclophilins and the histidine in helix F in globins, are invariant.

17. Why is the sequence similarity generally higher when comparing two globins from mammals than when comparing a globin from a mammal and a globin from a plant?
Answer: It is likely that both the mammalian and plant globins derived from a single ancestral globin and have expanded through gene duplication. The mammalian globins share a common ancestor with each other that is more recent than the ancestor they share with plant globins.

18. How might the tolerated variation in the hydrophobic core of the lambda repressor change if the hydrophobic core of wild-type lambda repressor were more tightly packed?
Answer: The protein would likely be less tolerant of mutations. Proteins undergo fluctuations, even in the hydrophobic core. They are not perfectly packed like interlocking pieces of a puzzle. This property allows for accommodation of differently shaped sidechains and for toleration of mutation.
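
The likelihood arithmetic in Problems 14 and 15 can be checked with a short script. This is a minimal sketch and not part of the original solutions. The dictionary holds only the residue pairs these problems need, with values taken from the standard BLOSUM62 matrix on the assumption that it matches Figure 5.11. The function names are illustrative.

```python
# Subset of substitution scores, keyed by alphabetically sorted residue pairs.
# Assumption: values are from standard BLOSUM62 and agree with Figure 5.11.
SCORES = {
    ("P", "P"): 7, ("A", "E"): -1, ("D", "E"): 2, ("K", "K"): 5,
    ("S", "T"): 1, ("A", "N"): -2, ("A", "K"): -1, ("F", "T"): -2,
    ("L", "W"): -2, ("A", "D"): -2, ("P", "S"): -1, ("E", "V"): -2,
    ("W", "Y"): 2, ("C", "W"): -2, ("R", "R"): 5, ("F", "F"): 6,
    ("I", "L"): 2, ("D", "K"): -1, ("G", "P"): -2, ("F", "Y"): 3,
    ("A", "L"): -1,
}


def pair_score(a, b):
    """Look up the score for one aligned residue pair (order-independent)."""
    return SCORES[tuple(sorted((a, b)))]


def alignment_score(seq1, seq2):
    """Sum the substitution scores over an ungapped alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


def likelihood_ratio(score):
    """Convert a summed score into a likelihood ratio, L = 2^(score/2)."""
    return 2 ** (score / 2)


# Problem 14
for top, bottom in [("PADKTN", "PEEKSA"), ("KFLASV", "ATWDPE")]:
    s = alignment_score(top, bottom)
    print(top, bottom, s, likelihood_ratio(s))

# Problem 15b: ratio of the two likelihood ratios
s_ii = alignment_score("DPKRFL", "EPKRFI")
s_iii = alignment_score("DPKRFL", "KGKRYA")
print(s_ii, s_iii, likelihood_ratio(s_ii) / likelihood_ratio(s_iii))
```

Because the likelihood is an exponential of the score, the ratio in Problem 15b depends only on the difference between the two sums: 2^((27 – 9)/2) = 2^9 = 512.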
19. What characteristics define a protein domain?
Answer: Domains have a distinct topology and a self-contained hydrophobic core. They generally contain 50–200 residues.

20. The diagram below shows the size distribution for globular proteins produced by the bacterium E. coli.
[Figure Q5.10: frequency of occurrence versus number of residues in the protein, with peaks at 1x, 2x, and 3x]
Explain why the distribution of protein sizes has the periodicity that is seen in the diagram and estimate a value for x.
Answer: A reasonable value for x is approximately 100–150. This is the average size of a protein domain (domains normally contain 50–200 residues). The periodicity is observed because proteins are modular and are expanded by addition of different domains. The peaks at 1x, 2x, and 3x derive from proteins containing 1, 2, or 3 domains.

21. The number of distinct protein folds is limited. Why might this be so? Approximately how many folds are there (hundreds, thousands, millions, or billions)?
Answer: There are likely to be only thousands of folds because natural selection will favor protein sequences that will fold into stable structures. In order to fold into a stable structure, it is necessary to be built up of secondary structure elements and have a hydrophobic core. It is probable that there are a finite number of orientations and combinations of secondary structure elements that will produce a hydrophobic core of acceptable geometry to enable protein folding.

22. How many folds are represented below? Describe what CATH class each fold belongs to and how the secondary structure is arranged in each fold.
[Figure Q5.12: six protein structures, labeled A through F]
Answer: There are four folds. A and E have the same Rossmann-like mixed α/β fold. B has a unique mixed α/β fold. C has an all-β-sheet structure arranged in a barrel. D and F have a four-helix bundle.

23. A threading program is used and it gives two possible predicted folds for a particular sequence.
They differ mainly in the placement of a single helix. In predicted fold (A), the helix is entirely within the hydrophobic core. In predicted fold (B), the residues of the helix are mostly exposed to solvent. The sequence of the helix is LIVFLAIL. Explain how the 3D-1D profile method could be used to distinguish between the two possible folds.
Answer: All of the residues in the helix are hydrophobic and have positive scores as α helical buried positions. In contrast, these residues have negative scores for exposed α helical positions. Thus, fold A correctly predicts that the α helix should be part of the hydrophobic core of the protein.

24. What structural features of the Rossmann domain enable it to bind nucleotides?
Answer: The negatively charged phosphate groups of the nucleotides interact with the P loop at the positive pole of the helix dipole generated by the first α helix of the Rossmann fold.

25. Both thioredoxin reductase and glutathione reductase use fused FAD- and NADPH-binding domains and dimerization to accomplish their cellular functions. Which structural feature likely evolved first, the fused domains or dimerization?
Answer: The fused FAD and NADPH domain structure is conserved between both structures. However, the relationship between the individual subunits of the dimers is different for each protein (see Figure 5.43). This implies that domain structure evolved before the split of individual thioredoxin and glutathione reductase lineages and that dimerization evolved later.
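
As a numerical check on the counting arguments in Problems 12 and 13, the number of possible random pairings can be computed directly. This is a minimal sketch with illustrative function names. The cysteine count for ribonuclease A (eight cysteines forming four disulfide bonds) is a known property of the protein but is not stated in the problem text, so treat it here as an assumption.

```python
from itertools import permutations
from math import factorial


def count_bipartite_pairings(n):
    """Count the ways to pair n lysines with n glutamates by brute force.

    Fixing the order of the lysines, each pairing corresponds to one
    ordering (permutation) of the glutamates.
    """
    return sum(1 for _ in permutations(range(n)))


def count_perfect_matchings(n_items):
    """Count the ways to split n_items identical-type residues into pairs."""
    if n_items % 2 != 0:
        raise ValueError("n_items must be even")
    if n_items == 0:
        return 1
    # The first residue can pair with any of the other n_items - 1 residues;
    # the remaining n_items - 2 residues are then paired recursively.
    return (n_items - 1) * count_perfect_matchings(n_items - 2)


# Problem 13: six lysine-glutamate ion pairs
ion_pairings = count_bipartite_pairings(6)
assert ion_pairings == 518_400 // factorial(6)
print(ion_pairings, 100 / ion_pairings)  # number of pairings, percent active

# Problem 12a: eight cysteines (assumed count for ribonuclease A)
disulfide_pairings = count_perfect_matchings(8)
print(disulfide_pairings, 100 / disulfide_pairings)
```

The brute-force count for Problem 13 agrees with 518,400/6! = 720. Under the stated assumption of eight cysteines, random disulfide pairing gives 105 possibilities, which is consistent with the roughly 1% activity quoted in the answer to Problem 12a.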