Biophysical Society On-line Textbook
PROTEINS
CHAPTER 1. PROTEIN STRUCTURE
Section 1.
Primary structure, secondary motifs,
tertiary architecture, and quaternary organization
Jannette Carey* and Vanessa Hanley^
*Department of Chemistry
^Department of Chemical Engineering
Princeton University
Princeton, NJ 08544-1009
*corresponding author
(609) 258-1631 phone
(609) 258-6746 FAX
[email protected]
1. The amino acid building blocks
Proteins are polymeric chains that are built from monomers called amino
acids. All structural and functional properties of proteins derive from the chemical
properties of the polypeptide chain. There are four levels of protein structural
organization: primary (1°), secondary (2°), tertiary (3°), and quaternary (4°). Primary
structure is defined as the linear sequence of amino acids in a polypeptide chain.
The secondary structure refers to certain regular geometric figures of the chain.
Tertiary structure results from long-range contacts within the chain. The quaternary
structure is the organization of protein subunits, or two or more independent
polypeptide chains.
Amino acids are the chemical constituents of proteins, and are characterized
by a central alpha carbon atom. The designation alpha marks the reference position
from which the remaining carbon atoms of the side chain are labeled in sequence
(β, γ, δ, and so on). Four substituents are connected
to this Cα: one substituent is the alpha proton -H, another is the side chain -R that
gives rise to the chemical variety of the amino acids, the third is the carboxylic acid
functional group (-COOH), and the fourth is the amino functional group (-NH2). The
α carbon is the asymmetric center of the molecule for all 20 amino acids except
glycine, which has only a proton as its side chain. The configuration about the α
carbon center must be the L-isomer for proteins synthesized on the ribosome. This is
probably an accident of chemical evolution where the L-isomer happens to be the
one chosen for early prebiotic systems and fixed into evolutionary history.
Figure 1
The side chains have a wide chemical variety which is vital for the unique
functions of biological proteins (Figure 1). These side chains can be grouped into
three categories: nonpolar, uncharged polar, and charged polar. The simplest amino
acid is glycine. Alanine, valine, leucine, isoleucine, and proline are amino acids
whose side chains are entirely aliphatic. Alanine has a methyl group as its side
chain. Valine has two methyl groups connected to the β carbon, and this residue is
said to be β-branched. Leucine has one more carbon atom in the side chain than
valine, so that two methyl groups are attached to Cγ. Leucine and isoleucine are
isomers whose only difference in structure is the position of the methyl groups.
Isoleucine is a β-branched amino acid and has a second asymmetric center at the β−
carbon. Proline contains an aliphatic side chain that is covalently bonded to the
nitrogen atom of the α-amino group, forming an imide bond and leading to a
constrained 5-membered ring.
Side chains that are generally nonpolar have low solubility in water because
they can form only van der Waals interactions with water molecules. On the other
hand, the rest of the amino acids contain heteroatoms in their side chains, opening
many bonding possibilities. The uncharged members of this group include: serine,
threonine, asparagine, glutamine, tyrosine, and tryptophan. Serine, threonine, and
tyrosine contain hydroxyl groups so they can function as both hydrogen bond
donors and acceptors, and threonine also has a methyl group, making it β-branched.
The benzene ring of tyrosine permits stabilization of the anionic phenolate form
upon loss of the hydroxyl proton, which has a pKa near 10. Serine and threonine
cannot be deprotonated at ordinary pH values. Asparagine and glutamine side
chains are relatively polar in that they can both donate and accept hydrogen bonds.
The nitrogen and proton of the tryptophan indole side chain can also participate in
hydrogen bond interactions.
Another group of polar residues can bear a full, formal charge depending on
pH, but their pKa values are such that the charged form is largely populated near
neutral pH. These include lysine, arginine, histidine, aspartic acid, glutamic acid,
and cysteine. Lysine and arginine are two basic amino acids that can bear a positive
charge at the end of their side chain. The lysine ε-amino group has a pKa value near
10, while the arginine guanidino group has a pKa value of ~12. Histidine is another
basic residue with its side chain organized into a closed ring structure that contains 2
nitrogen atoms. One of these nitrogens already has a proton on it, but the other one
has an available position that can take up an extra proton and form a positively
charged histidine group with a pKa of about 6. Aspartic acid and glutamic acid differ
only in the number of methylene (-CH2 -) groups in the side chain, with one and two
methylene groups, respectively. Their carboxylate groups are extremely polar and
can both donate and accept hydrogen bonds, and have pKa values near 4.5.
The sulfhydryl (thiol) group of cysteine can ionize at slightly alkaline pH
values, with a pKa near 9. The thiolate form can react with a second sulfhydryl to
form a disulfide bond that is reversible by reduction. Methionine has a long alkyl
side chain that also contains a sulfur heteroatom but is hydrophobic. This sulfur
atom is relatively inert as a hydrogen bond acceptor.
Three amino acids have aromatic side chains:
phenylalanine, tryptophan, and tyrosine. The aromatic ring of phenylalanine is like
that of benzene or toluene. It is very hydrophobic and chemically reactive only
under extreme conditions, though its ring electrons are readily polarized. The side
chain of histidine is arguably aromatic as well; it meets the (4n+2) π-electron rule in one
of its protonation states, but does not have the characteristic strong near-UV
absorption of the other three aromatic amino acids. The UV spectra of the three
aromatic groups are distinctive, as are their extinction coefficients, and these
properties are reflected in the electronic spectra of polypeptides.
Finally, the α-amino and α-carboxylate groups of amino acids can also ionize,
with pKa's of 6.8-7.9 and 3.5-4.3, respectively, for the aliphatic amino acids; nearby
charged side chains can alter the pKa's of these groups. Each amino acid
incorporated into a polypeptide chain is referred to as a residue. Thus, only the
amino- and carboxyl- terminal residues possess available α−amino and α−
carboxylate groups, respectively.
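The pKa values quoted in this section can be turned into approximate charge populations with the Henderson-Hasselbalch relation. The following is a minimal sketch, assuming the rounded pKa values given above and ignoring any shift caused by the local environment in a folded protein; the dictionary and function names are illustrative and not part of the original text.

```python
# Fraction of each ionizable side chain that carries a charge at a given pH,
# using the approximate pKa values quoted in this section (assumed, rounded).

# (pKa, kind): "acid" groups are charged when deprotonated,
# "base" groups are charged when protonated.
SIDE_CHAIN_PKA = {
    "Asp/Glu carboxylate": (4.5, "acid"),
    "His imidazole": (6.0, "base"),
    "Cys thiol": (9.0, "acid"),
    "Tyr phenol": (10.0, "acid"),
    "Lys side-chain amino": (10.0, "base"),
    "Arg guanidino": (12.0, "base"),
}


def fraction_charged(pka, kind, ph):
    """Henderson-Hasselbalch fraction of the group in its charged form."""
    if kind == "acid":
        return 1.0 / (1.0 + 10 ** (pka - ph))  # deprotonated fraction
    if kind == "base":
        return 1.0 / (1.0 + 10 ** (ph - pka))  # protonated fraction
    raise ValueError(f"unknown kind: {kind}")


for name, (pka, kind) in SIDE_CHAIN_PKA.items():
    charged = fraction_charged(pka, kind, 7.0)
    print(f"{name:22s} pKa {pka:4.1f}  charged at pH 7: {charged:.3f}")
```

With these rounded values, the carboxylates, lysine, and arginine come out almost fully charged at pH 7, whereas histidine is only about 9% protonated and cysteine about 1% ionized, which is why the charge state of those two residues is especially sensitive to small pKa shifts.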
2. The polypeptide chain
In order to form the amino acid monomers into a polymeric chain, amino
acids are condensed with one another through dehydration synthesis. This reaction
occurs when water is lost between the carboxylic functional group of one amino acid
and the amino functional group of the next to form a C-N bond. These
polymerization reactions are not spontaneous; however they can be arranged to
occur through the energy-driven action of the ribosome. Ribosomes are complexes
of proteins and RNA that translate a gene sequence in the form of mRNA into a
protein sequence. The 20 amino acids listed above are encoded by the genes and
incorporated by the ribosomal machinery during protein synthesis. Other, minor
amino acids found in proteins are not incorporated by ribosomes, but are derived by
post-translational modifications.
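The bookkeeping of dehydration synthesis can be sketched in a few lines: joining n amino acids forms n - 1 peptide bonds and releases n - 1 water molecules. The example below is only an illustration; the three average molar masses are standard values for the free amino acids, and the function names are hypothetical.

```python
# Dehydration synthesis bookkeeping: n amino acids -> n - 1 peptide bonds,
# with one water molecule lost per bond formed.

WATER_MASS = 18.015  # g/mol

# Average molar masses (g/mol) of three free amino acids (illustrative subset).
FREE_AMINO_ACID_MASS = {
    "Gly": 75.07,
    "Ala": 89.09,
    "Ser": 105.09,
}


def peptide_bonds(n_residues):
    """Number of peptide bonds in a linear chain of n_residues."""
    return max(n_residues - 1, 0)


def linear_peptide_mass(sequence):
    """Molar mass of a linear peptide built from the listed free amino acids."""
    total = sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence)
    return total - peptide_bonds(len(sequence)) * WATER_MASS


tripeptide = ["Gly", "Ala", "Ser"]
print(peptide_bonds(len(tripeptide)), "peptide bonds")
print(round(linear_peptide_mass(tripeptide), 2), "g/mol")
```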
The reverse reaction, hydrolysis of the peptide bond, does not proceed at an
appreciable rate on its own either. It can be accomplished chemically, but only under
very vigorous conditions. For example, treatment with very strong acid (1 molar HCl)
and boiling at 100°C overnight can hydrolyze the peptide bonds. Under normal
conditions, by contrast, hydrolysis happens very slowly. Thus,
proteins are chemically and biologically stable unless they are deliberately
depolymerized. The decomposition of a polypeptide chain into individual amino
acids can also be facilitated by hydrolytic enzymes.
Most proteins are heteropolymeric (i.e., they contain most or all the different
amino acids). Only rarely do regions of proteins consist of sequences composed of
just a few amino acids. Any region of a typical protein will therefore have a
chemically heterogeneous environment. This heterogeneity is further amplified by
the higher levels of protein structure, as we will see.
3. The peptide bond
The peptide bond between two amino acids is a special case of an amide bond
flanked on both sides by α-carbon atoms. Peptide bond angles and lengths are
well-known from many direct observations of protein and peptide structures. The
peptide bond (C-N) length is observed to be 1.33Å (Figure 2A). This is considerably
shorter than the adjacent (nonpeptide) C-N bond length of 1.45Å, but longer than
the C=O bond length of 1.23Å.
Figure 2A
Figure 2B
These bond lengths and angles reflect the distribution of electrons between
atoms due to differences in polarity of the atoms, and the hybridization of their
bonding orbitals. The two more electronegative atoms, O and N, can bear partial
negative charges, and the two less electronegative atoms, C and H, can bear partial
positive charges. The peptide group consisting of these four atoms can be thought of
as a resonance structure (Figure 2B). Thus, the peptide bond has partial double bond
character, accounting for its intermediate bond length.
As with any double bond, rotation about the peptide bond (angle ω) is restricted,
with an energy barrier of ~3 kcal/mole between cis and trans forms. These two
isomers are defined by the path of the polypeptide chain across the bond (Figure 3).
Successive α−carbons in the chain (i, i+1) are on the same side of the bond in the cis
isomer as opposed to the staggered conformation of the trans isomer. For all amino
acids but proline, the cis configuration is greatly disfavored because of steric
hindrance between adjacent side chains. Ring closure in the proline side chain
draws the α−carbon away from the preceding residue, leading to lower steric
hindrance across the X-pro peptide bond. In most residues, the trans to cis
distribution about this bond is about 90 - 10, but with proline, the trans to cis
distribution is about 70 - 30.
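The trans to cis distributions quoted above can be converted into an equilibrium free-energy difference with ΔG = RT ln([trans]/[cis]). This is a sketch only: it assumes a simple two-state equilibrium at 25°C, and the quantity it computes is the difference in stability between the two isomers, which is distinct from the rotational barrier that separates them. The function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def delta_g_from_ratio(trans, cis, temperature_k=298.15):
    """Free energy of cis relative to trans (kcal/mol) implied by the populations.

    A positive value means the trans isomer is favored.
    """
    return R_KCAL * temperature_k * math.log(trans / cis)


def cis_fraction(delta_g, temperature_k=298.15):
    """Equilibrium cis fraction for a given cis-minus-trans free energy."""
    k = math.exp(-delta_g / (R_KCAL * temperature_k))  # [cis]/[trans]
    return k / (1.0 + k)


for label, trans, cis in [("most residues", 90, 10), ("X-Pro", 70, 30)]:
    dg = delta_g_from_ratio(trans, cis)
    print(f"{label:14s} trans:cis {trans}:{cis}  ->  dG = {dg:.2f} kcal/mol")
```

Under these assumptions a 90:10 distribution corresponds to roughly 1.3 kcal/mole and a 70:30 distribution to roughly 0.5 kcal/mole, so the proline ring lowers the preference for trans by somewhat less than 1 kcal/mole.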
Figure 3
As with any other double bond, certain atoms are confined to a single plane
about the peptide bond. The group of six atoms between successive α-carbon atoms,
inclusive, lies in one plane exactly as do the six atoms of ethene. These six atoms are
shown in Figure 3. In the trans configuration of the peptide bond, the combined
effects of polarity and planarity result in a permanent small dipole moment across
the peptide bond, with its negative end on the side of the carbonyl oxygen. The
planarity of the peptide bond has additional profound consequences for polypeptide
structure, as we shall see.
4. Restrictions on bond rotations
While there is restricted rotation about the peptide bond, there is free rotation
about the four bonds to the α-carbon of each residue. Two of these rotations are of
particular relevance for the structure of the polypeptide backbone. To fully
appreciate these rotations, we must shift our perspective from the peptide-bond-centered
view of Figure 3 to the Cα-centered view of Figure 4A. The rotation angle about the
bond from the α-carbon to the carbonyl carbon of that residue is given the name ψ.
Similarly, the rotation angle about the bond from the α-carbon to the amino nitrogen
of that residue is given the name φ.
Because Cα is one of the six planar atoms of the peptide group, rotation about φ or ψ
flanking Cα rotates the entire plane of the peptide group (Figure 4B).
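In practice, φ and ψ are computed as dihedral (torsion) angles from four consecutive backbone atoms: C(i-1), N, Cα, C for φ, and N, Cα, C, N(i+1) for ψ. The sketch below uses the standard atan2 form of the dihedral formula in plain Python; the helper names and the toy coordinates in the usage lines are illustrative assumptions, not data from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond, in (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals to the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """phi: rotation about the N-CA bond."""
    return dihedral_deg(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi: rotation about the CA-C bond."""
    return dihedral_deg(n, ca, c, n_next)


# Toy geometry: outer atoms on the same side (cis-like) and on opposite
# sides (trans-like) of the central bond.
print(dihedral_deg((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)))
print(dihedral_deg((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0)))
```

The same function gives ω when applied to Cα(i), C(i), N(i+1), Cα(i+1), so one routine covers all three backbone angles.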
Figure 4A
Figure 4B
Since the entire plane rotates on either side of Cα, certain values of the angles
φ and ψ cannot be achieved due to steric occlusion. The allowed regions of φ,ψ space
differ for each amino acid because some of the restriction is due to Cα and its
substituents. However, even for glycine, some angles are not allowed.
Figure 5A
Figure 5B
The allowed regions of φ,ψ space for each amino acid are displayed on
Ramachandran plots. The allowed regions can be defined in terms of the energetic
cost that must be paid to enter a disallowed region, or in terms of the limiting
so-called hard-sphere boundary when atoms clash (Figure 5A). For β-branched residues
the restrictions are severe, and only a small fraction of φ,ψ space is allowed. Valine
and isoleucine have access to only about 5% of all φ,ψ space. However, all residues
have access to at least part of the most favorable regions of φ,ψ space in the upper
and lower left of the plot. As discussed below, these two
regions correspond to combinations of φ and ψ angles that characterize the two
common regular secondary structures that can be adopted by the polypeptide
backbone, the α-helix and β-strand.
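A crude way to see how a Ramachandran plot is used is to assign a (φ, ψ) pair to the lower-left (α) or upper-left (β) region with simple rectangular boundaries. The sketch below is an assumption-laden illustration: the numerical limits are rough, hypothetical boxes chosen for the example and are not taken from this chapter or from any particular validation program, which would use contoured energy or frequency maps instead.

```python
# Rough rectangular regions of phi, psi space, in degrees.
# These limits are illustrative assumptions only.
ALPHA_BOX = {"phi": (-160.0, -20.0), "psi": (-120.0, 50.0)}   # lower left
BETA_BOX = {"phi": (-180.0, -45.0), "psi": (90.0, 180.0)}     # upper left


def _inside(box, phi, psi):
    phi_lo, phi_hi = box["phi"]
    psi_lo, psi_hi = box["psi"]
    return phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi


def classify_phi_psi(phi, psi):
    """Return 'alpha', 'beta', or 'other' for a pair of backbone angles."""
    if _inside(ALPHA_BOX, phi, psi):
        return "alpha"
    if _inside(BETA_BOX, phi, psi):
        return "beta"
    return "other"


for angles in [(-57.0, -47.0), (-120.0, 130.0), (60.0, 45.0)]:
    print(angles, classify_phi_psi(*angles))
```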
Note that there is an energy barrier between the α-helical region of φ,ψ space
and the β-strand region. Thus, direct conversion between α- and β- structures is
restricted even though most residues are allowed in both regions. Two conclusions
from recent structural analysis of proteins and peptides are relevant to this point.
First, the peptide bond deviates slightly from planarity in a surprisingly large
fraction of cases (10). Presumably, the observed range of peptide bond angles has the
effect of slightly enlarging the allowed φ,ψ space, and perhaps reducing the α/β
barrier. Second, in protein structures certain residues are overrepresented outside
the allowed regions, and these tend to be the small polar residues (11). Presumably,
these can form favorable local interactions that compensate for the energetic penalty
in those φ,ψ regions.
5. Secondary structures
Since the restrictions on φ,ψ space arise in part from steric hindrance between
side chain and backbone, this same steric hindrance is the origin of α and β
secondary structures. There is no sequence dependence on the steric restrictions of
the α and β space because φ,ψ restrictions arise within each residue rather than
between residues. However, a sequence of residues that all have similar allowed φ,ψ
space can give rise to a chain segment that forms α or β structures. Thus, these
secondary structures owe their formation to both backbone and side chain steric
restrictions. This analysis provides an important insight into the origins of protein
secondary structures: these structures are intrinsically favorable for the chain under
all conditions, independent of considerations about bonding.
The helix structure looks like a spring. The most common shape is a right
handed α−helix defined by the repeat length of 3.6 amino acid residues and a rise of
5.4 Å per turn. Thus residues (i+3) and (i+4) are closest to residue (i) in the helix
(Figure 6A). The pitch and dimensions of the helix also bring the peptide dipole
moments of successive residues into proximity such that their opposite charges
neutralize each other substantially in the middle of a helix (Figure 6B). At the ends,
the peptide dipole cannot be neutralized by this mechanism, resulting in a net helix
macrodipole of approximately one-half unit of charge at each end. This charge may
be neutralized by nearby side chains.
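The two numbers that define the α-helix, 3.6 residues and 5.4 Å per turn, fix its other dimensions by simple arithmetic. The sketch below derives the rise and rotation per residue and the angular offset between residue i and residue i+k as seen down the helix axis; the constant and function names are illustrative.

```python
RESIDUES_PER_TURN = 3.6
PITCH_ANGSTROM = 5.4  # rise per full turn

RISE_PER_RESIDUE = PITCH_ANGSTROM / RESIDUES_PER_TURN  # about 1.5 angstrom
ROTATION_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN       # about 100 degrees


def helix_length(n_residues):
    """Approximate length (angstrom) of an ideal alpha-helix along its axis."""
    return n_residues * RISE_PER_RESIDUE


def angular_offset(k):
    """Angle (degrees, in (-180, 180]) between residues i and i + k,
    viewed down the helix axis."""
    angle = (k * ROTATION_PER_RESIDUE) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


print(f"rise per residue: {RISE_PER_RESIDUE:.2f} A")
print(f"18-residue helix: {helix_length(18):.1f} A")
for k in range(1, 5):
    print(f"i+{k}: {angular_offset(k):+.0f} degrees")
```

Residues i+3 and i+4 sit about 60° and 40° away from residue i around the axis, on the same face of the helix, whereas i+1 and i+2 are 100° and 160° away; this is the arithmetic behind the statement that (i+3) and (i+4) are the closest neighbors.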
Figure 6A
Figure 6B
The pitch and dimensions of the helix also bring the amide proton of residue
(i+3) or (i+4) into proximity to the carbonyl oxygen of residue (i) such that a
hydrogen bond can form. All peptide group hydrogen bond donors and acceptors are
satisfied in the central part of the helical segment, but not at the ends. While
structural evidence clearly indicates that these hydrogen bonds are highly populated
in helical segments of proteins, their contribution to helix stability is less clear since
donors and acceptors would be satisfied by hydrogen bonding to water in nonhelical
structures. However, φ,ψ restrictions can have the effect of preorganizing the chain
into a helical conformation, which may favor hydrogen bonding by enhancing the
local concentration of donors and acceptors.
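The statement that donors and acceptors are satisfied in the middle of a helix but not at its ends follows directly from the hydrogen-bonding pattern. The sketch below enumerates the bonds for an idealized helix, assuming the carbonyl oxygen of residue i pairs with the amide proton of residue i+4 (the offset can be set to 3 for the alternative mentioned above); the function names and residue numbering are illustrative.

```python
def helix_hbonds(n_residues, offset=4):
    """(acceptor, donor) residue pairs: C=O of residue i with N-H of i + offset."""
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


def unsatisfied_groups(n_residues, offset=4):
    """Residues whose backbone N-H or C=O has no partner within the helix."""
    pairs = helix_hbonds(n_residues, offset)
    acceptors = {a for a, _ in pairs}
    donors = {d for _, d in pairs}
    residues = range(1, n_residues + 1)
    free_nh = [r for r in residues if r not in donors]
    free_co = [r for r in residues if r not in acceptors]
    return free_nh, free_co


free_nh, free_co = unsatisfied_groups(12)
print("hydrogen bonds:", helix_hbonds(12))
print("unpaired N-H at N-terminal end:", free_nh)
print("unpaired C=O at C-terminal end:", free_co)
```

For a 12-residue helix this leaves the amide protons of the first four residues and the carbonyl oxygens of the last four without partners inside the helix, which is where capping side chains or solvent must supply them.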
β strands are the other regular secondary structure that proteins form (Figure
7A). These are extended structures in which successive peptide dipole moments
alternate direction along the chain. Because it is an extended structure, φ,ψ steric
hindrance is reduced in the β strand, and the β region of φ,ψ space is larger than the
α region. Two or more strand segments can pair by hydrogen bonding and dipolar
interactions to form a β-sheet. Unlike helical segments, all peptide group hydrogen
bond donors and acceptors are satisfied not within but between β-strand segments;
thus individual β-strands do not have an independent existence.
Also unlike a helical segment, adjacent strands of a sheet can come from
sequentially distant segments of the chain; rarely, this can occur even within one
strand of a sheet. β sheets can consist of either parallel or antiparallel strands, or a
mixture of the two. In purely antiparallel sheets, segments that are sequentially next
to each other in the primary structure often form adjacent strands.
Figure 7A
Figure 7B
However, even when forming a hairpin from contiguous chain segments, linearly
distant residues are brought into proximity at the N-and C- terminal ends of the
hairpin (Figure 7B). Thus, while a β-strand is a secondary structure element because
of its geometrically regular features, a β-sheet can be thought of as a tertiary
structural feature because it is intrinsically nonlocal. This example illustrates that
the distinctions between secondary and tertiary structural features are not entirely
clear.
So-called turn structures are also classified as secondary structural elements,
but unlike helices and strands, they do not have a repeating, regular geometry.
Rather, they can have well-defined spatial dispositions defined by certain values of φ
and ψ angles that often require specific residue types and/or sequences, as well as
fixed hydrogen bonding patterns. Most turns are local in the primary structure, but
omega loops (12) can have a large number of intervening residues lacking defined
geometries, with the turn being defined by the conformations of residues that form
the constriction that gives this turn its name (Ω).
Turns are essential for allowing the polypeptide chain to fold back upon itself
to form tertiary interactions. Such interactions are generally long-range, and result
in compaction of the protein into a globular, often approximately spherical, form.
The turn regions are thus generally located on the outside of the globular structure,
with helices and/or sheets forming its core. Turns on the surfaces of proteins have a
wide range of dynamics, from quite mobile in cases where they form few
interactions with the underlying protein surface to quite fixed due to extensive
tertiary contacts. Thus, turns are also ambiguously classified as secondary structure
elements.
6. Tertiary structures
The side chains project outward from both α-helical (Figure 6B) and β-strand
(Figure 7A) structures, and are therefore available for interactions with other
surfaces through hydrophobic contacts and various kinds of bonding interactions to
form the tertiary structure. In a helix, the side chains project radially outward, and
in a strand successive side chains project alternately up and down. Rotation about
bonds in the side chain is also restricted, however. The same steric hindrance that
limits the backbone conformation also limits the side chain conformation about the
Cα-Cβ bond to preferred rotamers defined by rotation angle χ1. Rotation angles
beyond χ1 are restricted by side chain packing in the tertiary structure.
If secondary structural elements result from steric restrictions in φ,ψ space, it
is less obvious why tertiary structures form. Proteins with highly organized tertiary
structures generally have a well-developed core of hydrophobic residues contributed
from most or all of the secondary structure elements in the chain. Thus, secondary
and tertiary structures are in general intimately and explicitly interconnected. These
buried residues do not form merely a liquid-like oily interior, but rather are usually
well-packed, with extensive rotamer restrictions. In aqueous solvents, the
hydrophobic effect drives the chain toward compaction to relieve unfavorable
solvation of these exposed side chains, but compaction and internal organization are
entropically costly due to loss of chain flexibility, and it is likely that these competing
effects nearly cancel each other energetically.
On the other hand, upon compaction, bonding interactions with solvent
molecules are replaced by intramolecular partners, with a likely net gain in
favorable energetic contributions due to several effects, including lower dielectric
constant in the incipient interior. Hydrogen bonding is favored within secondary
structures because these are partially preorganized by φ,ψ restrictions into
configurations that permit bonding at little additional entropic cost. In the case of
β-sheet formation, an additional favorable effect may result when two β-strands are
brought into register, much like DNA duplex formation.
The view developed in these paragraphs suggests that protein secondary and
tertiary structures are not independent of each other, but rather interdependent. It
seems likely that this interdependence is the molecular origin of the extraordinary
cooperativity of protein structural stability, which is reflected in the observation that
protein secondary and tertiary structures are lost concomitantly and in an
all-or-none manner upon changes in environment that disfavor the folded state, such as
higher temperature or solvent additives.
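The all-or-none behavior described above is often approximated by a two-state model in which only fully folded and fully unfolded molecules are populated. The sketch below is a hedged illustration of that model, not a method given in this chapter: it neglects any heat capacity change, so that ΔG of unfolding is ΔH(1 - T/Tm), and the melting temperature and enthalpy values used are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def fraction_unfolded(temp_k, tm_k, delta_h_kcal):
    """Two-state unfolded fraction, neglecting the heat capacity change.

    Assumes dG_unfold(T) = dH * (1 - T / Tm), with dH in kcal/mol.
    """
    delta_g = delta_h_kcal * (1.0 - temp_k / tm_k)
    k_unfold = math.exp(-delta_g / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


TM = 333.15       # hypothetical melting temperature (60 C)
DELTA_H = 100.0   # hypothetical unfolding enthalpy, kcal/mol

for celsius in (40, 50, 55, 60, 65, 70, 80):
    f_u = fraction_unfolded(celsius + 273.15, TM, DELTA_H)
    print(f"{celsius:3d} C  fraction unfolded = {f_u:.3f}")
```

In this model a larger unfolding enthalpy gives a sharper transition, so the steepness of an observed melting curve is one common, if indirect, measure of how cooperative the folded structure is.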
7. Quaternary structure
The highest level of protein structural organization is the quaternary structure. The
subunits that associate may be identical or not, and their organization may or may
not be symmetric. In general, quaternary structure results from association of
independent tertiary structural units through surface interactions, such as
formation of the hemoglobin tetramer from myoglobin-like monomers. However,
an increasing number of examples illustrates that tertiary structure can also be
formed concomitantly with quaternary association in some cases. A notable example
is the tryptophan repressor protein, which forms a highly intertwined dimer in
which essentially all tertiary contacts are satisfied only across the subunit interface,
rather than within each polypeptide chain (13). Thus, subunit assembly is
necessarily a step in tertiary structure formation. Another example is the cyclin/Cdk
inhibitor, which like Trp repressor has a well-formed secondary structure but no
intramolecular tertiary structure; rather, all tertiary interactions are formed through
its contacts to the binary cyclin/Cdk complex (14). These examples show that the
co-dependence of tertiary and quaternary structures parallels the co-dependence
between secondary and tertiary structures, and suggest that the distinctions among
these levels of the protein structure organizational hierarchy are blurry at best, and
perhaps even misleading for our understanding of protein structural stability and
folding.
8. Literature cited
1. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.
Freeman and Company, New York. pg. 3.
2. ibid., pg. 5.
3. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H.
Freeman and Company, New York. pg. 41.
4. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.
Freeman and Company, New York. pg. 174.
5. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H.
Freeman and Company, New York. pg. 165.
6. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.
Freeman and Company, New York. pg. 7.
7. ibid., pg. 160.
8. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H.
Freeman and Company, New York. pg. 256.
9. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.
Freeman and Company, New York. pg. 167.
10. MacArthur, M.W., and Thornton, J.M. (1996) J. Mol. Biol. 264, 1180-1195.
11. Gunasekaran, K., Ramakrishnan, C., and Balaram, P. (1996) J. Mol. Biol. 264,
191-198.
12. Fetrow, J. S. (1995) FASEB J. 9, 708-717.
13. Schevitz, R.W., Otwinowski, Z., Joachimiak, A., Lawson, C.L., and Sigler, P.B.
(1985) Nature 317, 782-786.
14. Russo, A.A., Jeffrey, P.D., Patten, A.K., Massague, J., and Pavletich, N.P. (1996)
Nature 382, 325-331.