Biophysical Society On-line Textbook
PROTEINS
CHAPTER 1. PROTEIN STRUCTURE
Section 1. Primary structure, secondary motifs, tertiary architecture, and quaternary organization

Jannette Carey* and Vanessa Hanley^
*Department of Chemistry
^Department of Chemical Engineering
Princeton University
Princeton, NJ 08544-1009
*corresponding author
(609) 258-1631 phone
(609) 258-6746 FAX
[email protected]

1. The amino acid building blocks

Proteins are polymeric chains built from monomers called amino acids. All structural and functional properties of proteins derive from the chemical properties of the polypeptide chain. There are four levels of protein structural organization: primary (1°), secondary (2°), tertiary (3°), and quaternary (4°). Primary structure is defined as the linear sequence of amino acids in a polypeptide chain. Secondary structure refers to certain regular geometric figures of the chain. Tertiary structure results from long-range contacts within the chain. Quaternary structure is the organization of protein subunits, that is, of two or more independent polypeptide chains.

Amino acids are the chemical constituents of proteins, and are characterized by a central alpha carbon atom. The alpha indicates the priority position from which the numbering follows for all subordinate groups. Four substituents are connected to this Cα: one substituent is the alpha proton (-H), another is the side chain (-R) that gives rise to the chemical variety of the amino acids, the third is the carboxylic acid functional group (-COOH), and the fourth is the amino functional group (-NH2). The α carbon is the asymmetric center of the molecule for all 20 amino acids except glycine, which has only a proton as its side chain. The configuration about the α carbon center must be the L-isomer for proteins synthesized on the ribosome.
This is probably an accident of chemical evolution, in which the L-isomer happened to be the one chosen for early prebiotic systems and was fixed into evolutionary history.

Figure 1

The side chains have a wide chemical variety, which is vital for the unique functions of biological proteins (Figure 1). These side chains can be grouped into three categories: nonpolar, uncharged polar, and charged polar.

The simplest amino acid is glycine. Alanine, valine, leucine, isoleucine, and proline are amino acids whose side chains are entirely aliphatic. Alanine has a methyl group as its side chain. Valine has two methyl groups connected to the β carbon, and this residue is said to be β-branched. Leucine has one more carbon atom in the side chain than valine, so that two methyl groups are attached to Cγ. Leucine and isoleucine are isomers whose only difference in structure is the position of the methyl groups. Isoleucine is a β-branched amino acid and has a second asymmetric center at the β-carbon. Proline contains an aliphatic side chain that is covalently bonded to the nitrogen atom of the α-amino group, forming an imide bond and leading to a constrained 5-membered ring. Side chains that are generally nonpolar have low solubility in water because they can form only van der Waals interactions with water molecules.

The rest of the amino acids, on the other hand, contain heteroatoms in their side chains, opening many bonding possibilities. The uncharged members of this group include serine, threonine, asparagine, glutamine, tyrosine, and tryptophan. Serine, threonine, and tyrosine contain hydroxyl groups, so they can function as both hydrogen bond donors and acceptors; threonine also has a methyl group, making it β-branched. The benzene ring of tyrosine permits stabilization of the anionic phenolate form upon loss of the hydroxyl proton, which has a pKa near 10. Serine and threonine cannot be deprotonated at ordinary pH values.
Asparagine and glutamine side chains are relatively polar in that they can both donate and accept hydrogen bonds. The nitrogen and proton of the tryptophan indole side chain can also participate in hydrogen bond interactions.

Another group of polar residues can bear a full, formal charge depending on pH, but their pKa values are such that the charged form is largely populated near neutral pH. These include lysine, arginine, histidine, aspartic acid, glutamic acid, and cysteine. Lysine and arginine are two basic amino acids that can bear a positive charge at the end of their side chain. The lysine ε-amino group has a pKa value near 10, while the arginine guanidino group has a pKa value of ~12. Histidine is another basic residue, with its side chain organized into a closed ring structure that contains two nitrogen atoms. One of these nitrogens already has a proton on it, but the other has an available position that can take up an extra proton and form a positively charged histidine group with a pKa of about 6. Aspartic acid and glutamic acid differ only in the number of methylene (-CH2-) groups in the side chain, with one and two methylene groups, respectively. Their carboxylate groups are extremely polar and can both donate and accept hydrogen bonds, and have pKa values near 4.5. The sulfhydryl (thiol) group of cysteine can ionize at slightly alkaline pH values, with a pKa near 9. The thiolate form can react with a second sulfhydryl to form a disulfide bond that is reversible by reduction.

Methionine has a long alkyl side chain that also contains a sulfur heteroatom but is hydrophobic. This sulfur atom is relatively inert as a hydrogen bond acceptor.

Three amino acids have aromatic side chains: phenylalanine, tryptophan, and tyrosine. The aromatic ring of phenylalanine is like that of benzene or toluene. It is very hydrophobic and chemically reactive only under extreme conditions, though its ring electrons are readily polarized.
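The side-chain pKa values quoted above can be turned into rough estimates of how much of each group is charged at a given pH by using the Henderson-Hasselbalch relation. The following is a minimal sketch under stated assumptions: the pKa numbers are the approximate values from the text, each group is treated as an isolated, independent site (no influence from neighboring charges), and the names `SIDE_CHAIN_PKA` and `charged_fraction` are illustrative only.

```python
# Approximate side-chain pKa values as quoted in the text.
# "acid" groups are charged when deprotonated; "base" groups when protonated.
# Dictionary layout and names are assumptions made for this illustration.
SIDE_CHAIN_PKA = {
    "Asp": (4.5, "acid"),
    "Glu": (4.5, "acid"),
    "His": (6.0, "base"),
    "Cys": (9.0, "acid"),
    "Lys": (10.0, "base"),
    "Arg": (12.0, "base"),
}


def charged_fraction(pka, kind, ph):
    """Fraction of an isolated group in its charged form at the given pH."""
    if kind == "acid":
        # Deprotonated (anionic) fraction: 1 / (1 + 10^(pKa - pH))
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    if kind == "base":
        # Protonated (cationic) fraction: 1 / (1 + 10^(pH - pKa))
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")


if __name__ == "__main__":
    for name, (pka, kind) in SIDE_CHAIN_PKA.items():
        frac = charged_fraction(pka, kind, 7.0)
        print(f"{name}: about {100.0 * frac:.1f}% charged at pH 7")
```

With these approximate inputs, aspartate, glutamate, lysine, and arginine come out almost fully charged at pH 7, whereas histidine and cysteine are only partly charged, so their state is the most sensitive to small shifts in pH or local environment.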
The side chain of histidine is arguably considered aromatic; it meets the electron rule in one of its protonation states, but does not have the characteristic strong near-UV absorption of the other three aromatic amino acids. The UV spectra of the three aromatic groups are distinctive, as are their extinction coefficients, and these properties are reflected in the electronic spectra of polypeptides.

Finally, the α-amino and α-carboxylate groups of amino acids can also ionize, with pKa values of 6.8-7.9 and 3.5-4.3, respectively, for the aliphatic amino acids; nearby charged side chains can alter the pKa values of these groups. Each amino acid incorporated into a polypeptide chain is referred to as a residue. Thus, only the amino- and carboxyl-terminal residues possess available α-amino and α-carboxylate groups, respectively.

2. The polypeptide chain

To form the amino acid monomers into a polymeric chain, amino acids are condensed with one another through dehydration synthesis. This reaction occurs when water is lost between the carboxylic functional group of one amino acid and the amino functional group of the next to form a C-N bond. These polymerization reactions are not spontaneous; however, they can be arranged to occur through the energy-driven action of the ribosome. Ribosomes are complexes of proteins and RNA that translate a gene sequence in the form of mRNA into a protein sequence. The 20 amino acids listed above are encoded by the genes and incorporated by the ribosomal machinery during protein synthesis. Other minor amino acids are not incorporated by ribosomes, but are derived by post-translational modification.

The reverse reaction, involving hydrolysis of the peptide bond, is not spontaneous either. It can be accomplished chemically, but only under very vigorous conditions. For example, treatment with very strong acid (1 molar HCl) and boiling at 100°C overnight can hydrolyze the peptide bonds.
So, the reverse hydrolysis reaction actually happens very slowly under normal conditions. Thus, proteins are chemically and biologically stable unless they are deliberately depolymerized. The decomposition of a polypeptide chain into individual amino acids can also be facilitated by hydrolytic enzymes.

Most proteins are heteropolymeric (i.e., they contain most or all of the different amino acids). Only rarely do regions of proteins consist of sequences composed of just a few amino acids. Any region of a typical protein will therefore have a chemically heterogeneous environment. This heterogeneity is further amplified by the higher levels of protein structure, as we will see.

3. The peptide bond

The peptide bond between two amino acids is a special case of an amide bond flanked on both sides by α-carbon atoms. Peptide bond angles and lengths are well known from many direct observations of protein and peptide structures. The peptide bond (C-N) length is observed to be 1.33 Å (Figure 2A). This is considerably shorter than the adjacent (nonpeptide) C-N bond length of 1.45 Å, but longer than the C=O bond length of 1.23 Å.

Figure 2A  Figure 2B

These bond lengths and angles reflect the distribution of electrons between atoms due to differences in polarity of the atoms, and the hybridization of their bonding orbitals. The two more electronegative atoms, O and N, can bear partial negative charges, and the two less electronegative atoms, C and H, can bear partial positive charges. The peptide group consisting of these four atoms can be thought of as a resonance structure (Figure 2B). Thus, the peptide bond has partial double bond character, accounting for its intermediate bond length.

As with any double bond, rotation about the peptide bond (angle ω) is restricted, with an energy barrier of ~3 kcal/mole between cis and trans forms. These two isomers are defined by the path of the polypeptide chain across the bond.
(Figure 3) Successive α-carbons in the chain (i, i+1) are on the same side of the bond in the cis isomer, as opposed to the staggered conformation of the trans isomer. For all amino acids but proline, the cis configuration is greatly disfavored because of steric hindrance between adjacent side chains. Ring closure in the proline side chain draws the α-carbon away from the preceding residue, leading to lower steric hindrance across the X-Pro peptide bond. In most residues, the trans to cis distribution about this bond is about 90:10, but with proline, the trans to cis distribution is about 70:30.

Figure 3

Also as with any other double bond, certain atoms are confined to a single plane about the peptide bond. The group of six atoms between successive α-carbon atoms, inclusive, lie in one plane exactly as do the six atoms of ethene. These six atoms are shown in Figure 3. In the trans configuration of the peptide bond, the combined effects of polarity and planarity result in a permanent small dipole moment across the peptide bond, with its negative end on the side of the carbonyl oxygen. The planarity of the peptide bond has additional profound consequences for polypeptide structure, as we shall see.

4. Restrictions on bond rotations

While there is restricted rotation about the peptide bond, there is free rotation about the four bonds to the α-carbon of each residue. Two of these rotations are of particular relevance for the structure of the polypeptide backbone. To fully appreciate these rotations, we must shift our perspective from the peptide-bond-centered view of Figure 3 to the Cα-centered view of Figure 4A. The bond from the α-carbon to the carbonyl carbon of that residue is given the name ψ. Similarly, the bond from the α-carbon to the amino group of that residue is given the name φ. Because Cα is one of the six planar atoms of the peptide group, rotation about φ or ψ flanking Cα rotates the entire plane of the peptide group (Figure 4B).
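The rotation angles ω, φ, and ψ described above are all dihedral (torsion) angles, each defined by four consecutive backbone atoms. The following is a minimal sketch of how such an angle can be computed from atomic coordinates. The function name `dihedral` and the test points are illustrative assumptions, not coordinates from a real structure. The sign convention used here (0° for cis, ±180° for trans) is the commonly used one; the atom quartets usually taken are C(i-1)-N-Cα-C for φ, N-Cα-C-N(i+1) for ψ, and Cα-C-N(i+1)-Cα(i+1) for ω.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    Returns 0 for a cis arrangement of p0 and p3 and +/-180 for trans.
    """
    b0 = _sub(p0, p1)          # from the central bond back to the first atom
    b1 = _sub(p2, p1)          # the central bond (rotation axis)
    b2 = _sub(p3, p2)          # from the central bond on to the last atom
    norm = math.sqrt(_dot(b1, b1))
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components along the axis, leaving the parts
    # perpendicular to the central bond.
    s0 = _dot(b0, axis)
    s2 = _dot(b2, axis)
    v = (b0[0] - s0 * axis[0], b0[1] - s0 * axis[1], b0[2] - s0 * axis[2])
    w = (b2[0] - s2 * axis[0], b2[1] - s2 * axis[1], b2[2] - s2 * axis[2])
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points: first and last atoms on the same side (cis)
    # and on opposite sides (trans) of a bond along the z axis.
    cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
    trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
    print(f"cis: {cis:.1f} degrees, trans: {abs(trans):.1f} degrees")
```

Applied to the backbone atoms of each residue in a structure, a function of this kind yields the (φ, ψ) pair for that residue, and the same calculation on the peptide bond atoms gives ω.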
Figure 4A  Figure 4B

Since the entire plane rotates on either side of Cα, certain values of the angles φ and ψ cannot be achieved due to steric occlusion. The allowed regions of φ,ψ space differ for each amino acid because some of the restriction is due to Cα and its substituents. However, even for glycine, some angles are not allowed.

Figure 5A  Figure 5B

The allowed regions of φ,ψ space for each amino acid are displayed on Ramachandran plots. The allowed regions can be defined in terms of the energetic cost that must be paid to enter a disallowed region, or in terms of the limiting so-called hard-sphere boundary when atoms clash (Figure 5A). For β-branched residues the restrictions are severe, and only a small fraction of φ,ψ space is allowed. Valine and isoleucine have access to only about 5% of all φ,ψ space. However, all residues have access to at least part of the most favorable regions of φ,ψ space in the upper and lower left of the plot. As discussed shortly, these two regions correspond to combinations of φ and ψ angles that characterize the two common regular secondary structures that can be adopted by the polypeptide backbone, the α-helix and β-strand. Note that there is an energy barrier between the α-helical region of φ,ψ space and the β-strand region. Thus, direct conversion between α- and β-structures is restricted even though most residues are allowed in both regions.

Two conclusions from recent structural analysis of proteins and peptides are relevant to this point. First, the peptide bond deviates slightly from planarity in a surprisingly large fraction of cases (10). Presumably, the observed range of peptide bond angles has the effect of slightly enlarging the allowed φ,ψ space, and perhaps reducing the α/β barrier. Second, in protein structures certain residues are overrepresented outside the allowed regions, and these tend to be the small polar residues (11).
Presumably, these can form favorable local interactions that compensate for the energetic penalty in those φ,ψ regions.

5. Secondary structures

Since the restrictions on φ,ψ space arise in part from steric hindrance between side chain and backbone, this same steric hindrance is the origin of α and β secondary structures. There is no sequence dependence on the steric restrictions of the α and β space because φ,ψ restrictions arise within each residue rather than between residues. However, a sequence of residues that all have similar allowed φ,ψ space can give rise to a chain segment that forms α or β structures. Thus, these secondary structures owe their formation to both backbone and side chain steric restrictions. This analysis provides an important insight into the origins of protein secondary structures: these structures are intrinsically favorable for the chain under all conditions, independent of considerations about bonding.

The helix structure looks like a spring. The most common shape is a right-handed α-helix defined by a repeat length of 3.6 amino acid residues and a rise of 5.4 Å per turn. Thus residues (i+3) and (i+4) are closest to residue (i) in the helix (Figure 6A). The pitch and dimensions of the helix also bring the peptide dipole moments of successive residues into proximity such that their opposite charges neutralize each other substantially in the middle of a helix (Figure 6B). At the ends, the peptide dipole cannot be neutralized by this mechanism, resulting in a net helix macrodipole of approximately one-half unit of charge at each end. This charge may be neutralized by nearby side chains.

Figure 6A  Figure 6B

The pitch and dimensions of the helix also bring the amide proton of residue (i+3) or (i+4) into proximity to the carbonyl oxygen of residue (i) such that a hydrogen bond can form. All peptide group hydrogen bond donors and acceptors are satisfied in the central part of the helical segment, but not at the ends.
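The helix parameters quoted above (3.6 residues per turn and 5.4 Å per turn) fix the rise and rotation contributed by each residue, which in turn shows why residues (i+3) and (i+4) end up nearly above residue (i). The following is a minimal sketch of that arithmetic. It assumes an ideal, perfectly regular helix, and the names `helix_offsets`, `RESIDUES_PER_TURN`, and `PITCH_ANGSTROM` are illustrative only.

```python
RESIDUES_PER_TURN = 3.6   # repeat length quoted in the text
PITCH_ANGSTROM = 5.4      # rise per full turn quoted in the text


def helix_offsets(k):
    """Position of residue i+k relative to residue i in an ideal helix.

    Returns (axial rise in Angstrom, angular offset in degrees), with the
    angle wrapped into the range [-180, 180).
    """
    rise_per_residue = PITCH_ANGSTROM / RESIDUES_PER_TURN   # about 1.5 Angstrom
    turn_per_residue = 360.0 / RESIDUES_PER_TURN            # about 100 degrees
    rise = k * rise_per_residue
    angle = (k * turn_per_residue + 180.0) % 360.0 - 180.0
    return rise, angle


if __name__ == "__main__":
    for k in range(1, 6):
        rise, angle = helix_offsets(k)
        print(f"i+{k}: rise {rise:4.1f} A, angular offset {angle:+7.1f} deg")
```

Under these numbers each residue advances about 1.5 Å along the axis and turns about 100°, so residues i+3 and i+4 lie roughly 60° behind and 40° ahead of residue i around the axis, bracketing the point one full turn (5.4 Å) above it.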
While structural evidence clearly indicates that these hydrogen bonds are highly populated in helical segments of proteins, their contribution to helix stability is less clear since donors and acceptors would be satisfied by hydrogen bonding to water in nonhelical structures. However, φ,ψ restrictions can have the effect of preorganizing the chain into a helical conformation, which may favor hydrogen bonding by enhancing the local concentration of donors and acceptors.

β strands are the other regular secondary structure that proteins form (Figure 7A). These are extended structures in which successive peptide dipole moments alternate direction along the chain. Because it is an extended structure, φ,ψ steric hindrance is reduced in the β strand, and the β region of φ,ψ space is larger than the α region. Two or more strand segments can pair by hydrogen bonding and dipolar interactions to form a β-sheet. Unlike helical segments, all peptide group hydrogen bond donors and acceptors are satisfied not within but between β-strand segments; thus individual β-strands do not have an independent existence. Also unlike a helical segment, adjacent strands of a sheet can come from sequentially distant segments of the chain; rarely, this can occur even within one strand of a sheet. β sheets can consist of either parallel or antiparallel strands, or a mixture of the two. In purely antiparallel sheets, segments that are sequentially next to each other in the primary structure often form adjacent strands. However, even when forming a hairpin from contiguous chain segments, linearly distant residues are brought into proximity at the N- and C-terminal ends of the hairpin (Figure 7B).

Figure 7A  Figure 7B

Thus, while a β-strand is a secondary structure element because of its geometrically regular features, a β-sheet can be thought of as a tertiary structural feature because it is intrinsically nonlocal.
This example illustrates that the distinctions between secondary and tertiary structural features are not entirely clear.

So-called turn structures are also classified as secondary structural elements, but unlike helices and strands, they do not have a repeating, regular geometry. Rather, they can have well-defined spatial dispositions defined by certain values of φ and ψ angles that often require specific residue types and/or sequences, as well as fixed hydrogen bonding patterns. Most turns are local in the primary structure, but omega loops (12) can have a large number of intervening residues lacking defined geometries, with the turn being defined by the conformations of residues that form the constriction that gives this turn its name (Ω).

Turns are essential for allowing the polypeptide chain to fold back upon itself to form tertiary interactions. Such interactions are generally long-range, and result in compaction of the protein into a globular, often approximately spherical, form. The turn regions are thus generally located on the outside of the globular structure, with helices and/or sheets forming its core. Turns on the surfaces of proteins have a wide range of dynamics, from quite mobile in cases where they form few interactions with the underlying protein surface to quite fixed due to extensive tertiary contacts. Thus, turns are also ambiguously classified as secondary structure elements.

6. Tertiary structures

The side chains project outward from both α-helical (Figure 6B) and β-strand (Figure 7A) structures, and are therefore available for interactions with other surfaces through hydrophobic contacts and various kinds of bonding interactions to form the tertiary structure. In a helix, the side chains project radially outward, and in a strand successive side chains project alternately up and down. Rotation about bonds in the side chain is also restricted, however.
The same steric hindrance that limits the backbone conformation also limits the side chain conformation about the Cα-Cβ bond to preferred rotamers defined by rotation angle χ1. Rotation angles beyond χ1 are restricted by side chain packing in the tertiary structure.

If secondary structural elements result from steric restrictions in φ,ψ space, it is less obvious why tertiary structures form. Proteins with highly organized tertiary structures generally have a well-developed core of hydrophobic residues contributed from most or all of the secondary structure elements in the chain. Thus, secondary and tertiary structures are in general intimately and explicitly interconnected. These buried residues do not form merely a liquid-like oily interior, but rather are usually well-packed, with extensive rotamer restrictions. In aqueous solvents, the hydrophobic effect drives the chain toward compaction to relieve unfavorable solvation of these exposed side chains, but compaction and internal organization are entropically costly due to loss of chain flexibility, and it is likely that these competing effects nearly cancel each other energetically. On the other hand, upon compaction, bonding interactions with solvent molecules are replaced by intramolecular partners, with a likely net gain in favorable energetic contributions due to several effects, including lower dielectric constant in the incipient interior. Hydrogen bonding is favored within secondary structures because these are partially preorganized by φ,ψ restrictions into configurations that permit bonding at little additional entropic cost. In the case of β-sheet formation, an additional favorable effect may result when two β-strands are brought into register, much like DNA duplex formation.

The view developed in these paragraphs suggests that protein secondary and tertiary structures are not independent of each other, but rather interdependent.
It seems likely that this interdependence is the molecular origin of the extraordinary cooperativity of protein structural stability, which is reflected in the observation that protein secondary and tertiary structures are lost concomitantly and in an all-or-none manner upon changes in environment that disfavor the folded state, such as higher temperature or solvent additives.

7. Quaternary structure

The highest level of protein structural organization is the quaternary structure. The subunits that associate may be identical or not, and their organization may or may not be symmetric. In general, quaternary structure results from association of independent tertiary structural units through surface interactions, such as formation of the hemoglobin tetramer from myoglobin-like monomers. However, an increasing number of examples illustrate that tertiary structure can also be formed concomitantly with quaternary association in some cases. A notable example is the tryptophan repressor protein, which forms a highly intertwined dimer in which essentially all tertiary contacts are satisfied only across the subunit interface, rather than within each polypeptide chain (13). Thus, subunit assembly is necessarily a step in tertiary structure formation. Another example is the cyclin/Cdk inhibitor, which like Trp repressor has a well-formed secondary structure but no intramolecular tertiary structure; rather, all tertiary interactions are formed through its contacts to the binary cyclin/Cdk complex (14).

These examples show that the co-dependence of tertiary and quaternary structures parallels the co-dependence between secondary and tertiary structures, and suggest that the distinctions among these levels of the protein structure organizational hierarchy are blurry at best, and perhaps even misleading for our understanding of protein structural stability and folding.

8. Literature cited

1. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H. Freeman and Company, New York. pg. 3.
2. ibid., pg. 5.
3. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H. Freeman and Company, New York. pg. 41.
4. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H. Freeman and Company, New York. pg. 174.
5. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H. Freeman and Company, New York. pg. 165.
6. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H. Freeman and Company, New York. pg. 7.
7. ibid., pg. 160.
8. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H. Freeman and Company, New York. pg. 256.
9. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H. Freeman and Company, New York. pg. 167.
10. MacArthur, M.W., and Thornton, J.M. (1996) J. Mol. Biol. 264, 1180-1195.
11. Gunasekaran, K., Ramakrishnan, C., and Balaram, P. (1996) J. Mol. Biol. 264, 191-198.
12. Fetrow, J. S. (1995) FASEB J. 9, 708-717.
13. Schevitz, R.W., Otwinowski, Z., Joachimiak, A., Lawson, C.L., and Sigler, P.B. (1985) Nature 317, 782-786.
14. Russo, A.A., Jeffrey, P.D., Patten, A.K., Massague, J., and Pavletich, N.P. (1996) Nature 382, 325-331.