Basic concepts of
molecular biology and proteins I
PROTEINS
A large molecule composed of one or more
chains of amino acids in a specific
order; the order is determined by the base
sequence of nucleotides in the gene
coding for the protein. Proteins are required
for the structure, function, and
regulation of the body’s cells, tissues, and
organs, and each protein has unique
functions. Examples are hormones, enzymes,
and antibodies.
A peptide bond. This covalent bond
forms when the carbon atom from the
carboxyl group of one amino acid shares
electrons with the nitrogen atom (blue)
from the amino group of a second amino
acid. As indicated, a molecule of water is
lost in this condensation reaction
Proteins are built up by amino acids that
are linked by peptide bonds to form a
polypeptide chain.
(a) Schematic diagram of an amino acid. A central
carbon atom (Cα) is attached to an amino group
(NH2), a carboxyl group (COOH), a hydrogen atom
(H), and a side chain (R). (b) In a polypeptide chain
the carboxyl group of amino acid n has formed a
peptide bond, C-N, to the amino group of amino acid
n + 1. One water molecule is eliminated in this
process. The repeating units, which are called
residues, are divided into main-chain atoms and side
chains. The main-chain part, which is identical in all
residues, contains a central Cα atom attached to an
NH group, a C'=O group, and an H atom. The side
chain R, which is different for different residues, is
bound to the Cα atom.
Condensation reaction (a water molecule is lost).
The opposite is the hydrolysis reaction (a water molecule is added).
Let us consider a macromolecule composed of n structural units along the backbone
Ri = position vector for unit i, with components $R_i = (R_{xi}, R_{yi}, R_{zi})^T$
{ Ci } = ith conformation
{ C } = { R1, R2, R3, ..., Rn-1, Rn}
Schematic representation of a chain of n backbone units. Bonds are labeled
from 2 to n, and structural units from 1 to n. The location of the ith unit with
respect to the laboratory-fixed frame OXYZ is indicated by the position
vector Ri. R1 and R3 are explicitly shown.
Diagram showing a polypeptide chain where the
main-chain atoms are represented as rigid peptide
units, linked through the Cα atoms.
Each unit has two degrees of freedom; it can rotate around two
bonds, its Cα-C' bond and its N-Cα bond. The angle of rotation
around the N-Cα bond is called phi (φ) and that around the Cα-C' bond is called psi (ψ). The conformation of the main-chain
atoms is therefore determined by the values of these two
angles for each amino acid.
Looking down the H-Cα bond from the hydrogen atom, the L-form has the CO, R
and N substituents of Cα arranged in a clockwise direction (CORN).
Amino acids are chiral molecules (except Gly).
{ C } = { R1, R2, R3, ..., Rn-1, Rn}
If you know these variables, then you know the structure
(3n variables, since each Ri = (Rxi, Ryi, Rzi))
A change of variables from R to internal coordinates
can be used for representing the conformation of a protein
Schematic representation of a portion of the main chain of a macromolecule. li
is the bond vector between units i-1 and i, as shown. ϕi denotes the torsional
angle about bond i.
Spatial representation of the torsional mobility around the bond i+1. The torsional
angle ϕi+1 of bond i+1 determines the position of the atom Ci+2 relative to Ci-1.
C'i+2 and C"i+2 represent the locations of the atom i+2, when ϕi+1 assumes the
respective values 180° and 0°, characteristic of the trans and cis rotameric
states.
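The caption above can be sketched in code: given three consecutive backbone atoms, a bond length, a bond angle and a torsional angle, the position of the next atom follows. This is a minimal sketch, not the lecturer's method. The function name `place_next_atom` and its parameters are assumptions made for illustration, the bond angle is taken here as the angle at the third atom between the previous and the new atom, and the torsion uses the convention of the text (0° for cis, 180° for trans).

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    """Return a scaled to unit length."""
    n = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / n, a[1] / n, a[2] / n)


def place_next_atom(a, b, c, bond_length, bond_angle_deg, torsion_deg):
    """Place atom d after atoms a, b, c (hypothetical helper).

    |c-d| = bond_length, angle b-c-d = bond_angle_deg,
    torsion a-b-c-d = torsion_deg (0 = cis, 180 = trans).
    """
    theta = math.radians(bond_angle_deg)
    phi = math.radians(torsion_deg)
    bc = unit(sub(c, b))                 # direction of the central bond
    n = unit(cross(sub(b, a), bc))       # normal to the a-b-c plane
    m = cross(n, bc)                     # in-plane axis, pointing to a's side
    dx = -bond_length * math.cos(theta)
    dy = bond_length * math.sin(theta) * math.cos(phi)
    dz = bond_length * math.sin(theta) * math.sin(phi)
    return tuple(c[k] + dx * bc[k] + dy * m[k] + dz * n[k] for k in range(3))


if __name__ == "__main__":
    a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    for torsion in (0.0, 180.0):         # cis and trans positions of atom i+2
        print(torsion, place_next_atom(a, b, c, 1.0, 90.0, torsion))
```

With fixed bond lengths and bond angles, repeating this step along the chain shows why the torsional angles alone determine the backbone conformation.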
GENERALIZED COORDINATES FOR DEFINING THE INTERNAL CONFORMATION
If you eliminate translation and rotation => 3(n-2) variables will be left
li = ri - ri-1 is the bond vector connecting the units i-1 and i, pointing from i-1 to i
An alternative representation, change of coordinate system:
ri is the position vector in the new frame.
r1 = 0
(removes the translational degrees of freedom)
If you choose the first bond along the x axis and place the third unit in the xy plane,
you remove the rotational degrees of
freedom:
y2 = z2 = z3 = 0
The remaining 3n-6 coordinates { x2, x3, y3, x4, y4, z4, ..., xn, yn, zn } define the
internal configuration of the molecule
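The change of frame described above can be illustrated with a short sketch. It assumes at least three non-collinear units; the helper name `to_internal_frame` and the sample coordinates are made up for illustration and are not from the lecture. The first unit is moved to the origin, the first bond is put along x, and the third unit is put in the xy plane, so that r1 = 0 and y2 = z2 = z3 = 0.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def to_internal_frame(R):
    """Express lab-frame positions R in a frame fixed on the first three units."""
    origin = R[0]
    ex = unit(sub(R[1], origin))                 # first bond defines the x axis
    ez = unit(cross(ex, sub(R[2], origin)))      # normal to the plane of units 1-3
    ey = cross(ez, ex)                           # completes a right-handed frame
    r = []
    for Ri in R:
        d = sub(Ri, origin)                      # removes translation
        r.append((dot(d, ex), dot(d, ey), dot(d, ez)))  # removes rotation
    return r


def count_internal_coordinates(n):
    """Number of coordinates left after removing translation and rotation."""
    return 3 * n - 6


if __name__ == "__main__":
    # Arbitrary illustrative coordinates, not taken from any structure.
    R = [(1.0, 2.0, 3.0), (2.2, 2.5, 3.1), (2.9, 3.8, 2.7), (4.3, 3.9, 3.4)]
    for r in to_internal_frame(R):
        print(tuple(round(v, 3) for v in r))
    print(count_internal_coordinates(len(R)))
```

Distances between units are unchanged by this transformation, which is a quick way to check that only rigid-body motion was removed.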
The position of the ith unit with respect to the original frame can be expressed in
terms of the internal position vectors ri as
$R_i = R_1 + T_1 r_i$
T1 is the transformation matrix for the passage from the frame O1X1Y1Z1 into
the laboratory-fixed frame OXYZ. (first atom is used as a reference)
Note: cis vs trans state: Not all torsional angles are equally probable
Some torsional angles, referred to as rotational isomeric states (RIS), are more
frequent than others, these being favored by the intrinsic torsional energies of
the particular bonds.
Rotational energy as a function of dihedral angle for a threefold symmetric
torsional potential (dashed curve) and a three-state potential with a
preference for the trans isomer (ϕ = 180°) over the gauche isomers (60°
and 300°) (solid curve), and the cis (0°) state being most unfavorable.
Trans (180°), gauche+ (300°), gauche- (60°), cis (0° or 360°)
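The statement that some torsional angles are more probable than others can be made concrete with a small sketch. The functional form below (a threefold cosine term plus a onefold term that favors trans) and the barrier heights `v3` and `v1` are assumptions chosen for illustration; they are not values from the lecture. Setting `v1=0` gives a symmetric threefold potential like the dashed curve.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def torsional_energy(phi_deg, v3=3.0, v1=1.0):
    """Illustrative torsional energy in kcal/mol (hypothetical parameters).

    The threefold term has minima at 60, 180 and 300 degrees; the onefold
    term is zero at trans (180) and largest at cis (0).
    """
    phi = math.radians(phi_deg)
    threefold = 0.5 * v3 * (1.0 + math.cos(3.0 * phi))
    trans_bias = 0.5 * v1 * (1.0 + math.cos(phi))
    return threefold + trans_bias


def state_populations(states_deg, temperature=300.0):
    """Boltzmann populations of the given rotational isomeric states."""
    weights = [math.exp(-torsional_energy(p) / (R_KCAL * temperature))
               for p in states_deg]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    states = {"trans": 180.0, "gauche-": 60.0, "gauche+": 300.0}
    for name, p in zip(states, state_populations(list(states.values()))):
        print(f"{name:8s} {p:.3f}")
```

The populations depend strongly on the assumed energy difference between trans and gauche, so the printed numbers only indicate the trend.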
Rotational isomeric states for the central bond in a segment of four backbone
atoms. Large blue spheres show backbone atoms. They are indexed from 1 to 4.
The small spheres show side groups; they are labeled by the indices of the
backbone atoms to which they are affixed (with a prime sign).
The staggered conformations are the most energetically favored conformations of two
tetrahedrally coordinated carbon atoms.
(a) A view along the C-C bond in ethane (CH3CH3) showing how the two carbon atoms can
rotate so that their hydrogen atoms are either not staggered (aligned) or staggered. Three
indistinguishable staggered conformations are obtained by a rotation of 120 degrees around
the C-C bond. (b-d) Similar views as in (a) of valine. The three staggered conformations are
different for valine because the three groups attached to Cβ are different. The first staggered
conformation (b) is less crowded and energetically most favored because the two methyl
groups bound to Cβ are both close to the small H atom bound to Cα.
Ramachandran plots showing allowed combinations of the conformational angles phi and psi.
(Three panels; horizontal axis phi and vertical axis psi, each running from 0 to 360.)
Proteins may assume infinitely many conformations.
Fixed bond lengths (l)
Fixed bond angles (θ)
Variable torsional angles (φ)
0º < φ < 360º
0º < θ < 180º
But just one of them is the native conformation (tertiary structure)
This process is almost reversible (native <-> non-native)
Back to the protein science from protein structure
Refolding of a denatured protein.
Non-covalent bonds
A C-C bond has an energy of 83 kcal/mol.
Each non-covalent bond (for example the hydrogen bonds N.....H-O and O.....H-N) has an energy
of 5 kcal/mol.
Although they are very weak, many of them form to create
a strong bonding arrangement.
Disulfide bonds. This diagram illustrates how covalent disulfide bonds
form between adjacent cysteine side chains. As indicated, these cross-linkages
can join either two parts of the same polypeptide chain or two different
polypeptide chains.
Since the energy required to break one covalent bond is much larger than
the energy required to break even a whole set of noncovalent bonds,
a disulfide bond can have a major stabilizing effect on a protein.
Small proteins need more S-S bonds
The disulfide is usually the end product of air oxidation
according to the following schematic reaction scheme:
2 -CH2SH + 1/2 O2 → -CH2-S-S-CH2- + H2O
The binding of a protein to another molecule is highly selective.
Many weak bonds are needed to enable a protein to bind tightly
to a second molecule (a ligand).
The ligand must therefore fit precisely into the protein's binding site,
like a hand into a glove, so that a large number of noncovalent bonds can
be formed between the protein and the ligand
➢ Interior of proteins is hydrophobic
Hydrophobic core
Hydrophilic surface
➢ Proteins have alpha helices and beta-sheets as their secondary structures
Secondary structures: α-helices
Mostly right-handed in proteins
Linus Pauling (1951) first described them
Phi and psi angles are 120° and 110°, respectively
H bonds between the C=O of residue n and the NH of residue n+4
~10 residues long (from 4 to more than 40 residues)
All the H bonds point in the same direction,
so helices have dipoles
Negatively charged groups such as phosphate ions frequently bind to the
amino ends of α helices.
The dipole moment of an α helix as well as the possibility of hydrogen-bonding to
free NH groups at the end of the helix favors binding.
Some amino acids are favored in helices: Ala, Glu, Leu, and Met
Pro, Gly, Tyr, and Ser are strongly disfavored
The helical wheel or spiral
3.6 amino acids per turn, so the angle between two consecutive amino acids is 100°
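The helical wheel arithmetic can be sketched as follows. With 3.6 residues per turn, each residue is rotated by 360/3.6 = 100° about the helix axis relative to the previous one. The function name `helical_wheel_angles` is an assumption for illustration, and the sample string is simply the HW1 residue sequence in one-letter codes, used as input without any claim that it forms a helix.

```python
def helical_wheel_angles(sequence, degrees_per_residue=100):
    """Return (residue, angle) pairs, angle in degrees around the helix axis.

    degrees_per_residue=100 corresponds to 3.6 residues per turn.
    """
    return [(res, (i * degrees_per_residue) % 360)
            for i, res in enumerate(sequence)]


if __name__ == "__main__":
    # One-letter codes of THR VAL ALA TYR ILE ALA ILE GLY SER ASN
    for res, angle in helical_wheel_angles("TVAYIAIGSN"):
        print(res, angle)
```

Residues whose angles fall close together lie on the same face of the helix, which is how a wheel plot reveals an amphipathic pattern.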
Secondary structures: β-sheets
Proteins formed by strands are rigid
Strands are 5 to 10 residues long
Residues are fully extended
Phi and psi fall in the upper left quadrant of the Ramachandran plot
Two types of beta sheet structures.
(A) Antiparallel beta sheet
(B) Parallel beta sheet.
Both of these structures are common.
The structure of a coiled-coil.
Amphipathic: one side hydrophilic
other side hydrophobic
A collection of protein
molecules, selected to show
a range of sizes and shapes.
Each protein is shown as a
space-filling model,
represented at the same scale.
Collagen and elastin. (A) Collagen is a triple helix formed by three
extended protein chains that wrap around one another.
Many rodlike collagen molecules are cross-linked together
in the extracellular space to form unextendable collagen fibrils (top)
that have the tensile strength of steel. The striping on the collagen fibril is
caused by the regular repeating arrangement of the collagen molecules within
the fibril.
(B) Elastin polypeptide chains are cross-linked together to form rubberlike,
elastic fibers. Each elastin molecule uncoils into a more extended
conformation when the fiber is stretched and will recoil spontaneously
as soon as the stretching force is relaxed.
HW1: The coordinates of a polypeptide with the following sequence are provided as follows:
ATOM    1  N   THR    1    -7.712  14.556  16.794  1.00  17.45  N
ATOM    2  CA  THR    1    -7.046  15.510  17.660  1.00  16.13  C
ATOM    3  C   THR    1    -6.849  14.891  19.045  1.00  14.58  C
ATOM    8  N   VAL    2    -5.693  15.098  19.646  1.00  12.07  N
ATOM    9  CA  VAL    2    -5.490  14.585  21.007  1.00  10.98  C
ATOM   10  C   VAL    2    -5.851  15.665  22.008  1.00  13.48  C
ATOM   15  N   ALA    3    -6.867  15.389  22.802  1.00  10.39  N
ATOM   16  CA  ALA    3    -7.253  16.247  23.919  1.00  12.54  C
ATOM   17  C   ALA    3    -6.661  15.678  25.190  1.00  14.90  C
ATOM   20  N   TYR    4    -6.222  16.563  26.096  1.00   9.30  N
ATOM   21  CA  TYR    4    -5.723  16.132  27.380  1.00   7.77  C
ATOM   22  C   TYR    4    -6.713  16.614  28.449  1.00  13.04  C
ATOM   32  N   ILE    5    -7.246  15.673  29.198  1.00   9.28  N
ATOM   33  CA  ILE    5    -8.238  15.938  30.232  1.00  11.00  C
ATOM   34  C   ILE    5    -7.695  15.622  31.609  1.00  10.89  C
ATOM   40  N   ALA    6    -7.893  16.576  32.533  1.00   8.54  N
ATOM   41  CA  ALA    6    -7.525  16.339  33.920  1.00  11.63  C
ATOM   42  C   ALA    6    -8.755  15.757  34.632  1.00  11.09  C
ATOM   45  N   ILE    7    -8.490  14.746  35.464  1.00  11.88  N
ATOM   46  CA  ILE    7    -9.580  14.193  36.271  1.00  12.85  C
ATOM   47  C   ILE    7    -9.227  14.353  37.743  1.00  16.83  C
ATOM   53  N   AGLY   8   -10.186  14.794  38.548  0.54  14.91  N
ATOM   55  CA  AGLY   8    -9.921  14.902  39.978  0.54  17.87  C
ATOM   57  C   AGLY   8   -11.072  14.308  40.774  0.54  15.38  C
ATOM   61  N   ASER   9   -10.786  13.771  41.962  0.54  16.88  N
ATOM   63  CA  ASER   9   -11.849  13.281  42.834  0.54  18.62  C
ATOM   65  C   ASER   9   -11.365  12.948  44.236  0.54  15.13  C
ATOM   73  N   AASN  10   -12.108  13.339  45.270  0.54  20.60  N
ATOM   75  CA  AASN  10   -11.739  13.008  46.640  0.54  23.35  C
ATOM   77  C   AASN  10   -12.820  12.169  47.327  0.54  27.76  C
➢ Ignore the first two columns. The third column gives the type of the backbone
atom. The fourth column gives the residue type, the fifth column lists the residue
numbers, and the sixth to eighth columns give the x, y, z coordinates of the atom.
➢ Calculate the phi, psi and omega angles of one of the residues, as well as the theta angle of that
residue and the bond vectors.
➢ Do you think your residue is part of a helix, a beta strand or a loop? (Refer to the
Ramachandran map.)
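One possible way to approach the calculation is sketched below. It is a minimal illustration and not the course's reference solution. The helper names are assumptions, and the atom choices follow the usual definitions: phi from C(i-1), N(i), Cα(i), C(i); psi from N(i), Cα(i), C(i), N(i+1); omega from Cα(i), C(i), N(i+1), Cα(i+1). The function returns angles in the range -180° to 180° with cis at 0° and trans at 180°; taking the result modulo 360 maps it onto the 0 to 360 range used in the plots above. The coordinates are those listed for residues 1 to 3.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def dihedral(p0, p1, p2, p3):
    """Torsional angle (degrees) about the p1-p2 bond; 0 = cis, 180 = trans."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    n1 = norm(b1)
    b1u = (b1[0] / n1, b1[1] / n1, b1[2] / n1)
    # Keep only the parts of b0 and b2 perpendicular to the central bond.
    v = sub(b0, tuple(dot(b0, b1u) * c for c in b1u))
    w = sub(b2, tuple(dot(b2, b1u) * c for c in b1u))
    return math.degrees(math.atan2(dot(cross(b1u, v), w), dot(v, w)))


def bond_angle(a, b, c):
    """Angle (degrees) at atom b between bonds b-a and b-c."""
    u, v = sub(a, b), sub(c, b)
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))


if __name__ == "__main__":
    c_thr1 = (-6.849, 14.891, 19.045)
    n_val2 = (-5.693, 15.098, 19.646)
    ca_val2 = (-5.490, 14.585, 21.007)
    c_val2 = (-5.851, 15.665, 22.008)
    n_ala3 = (-6.867, 15.389, 22.802)
    ca_ala3 = (-7.253, 16.247, 23.919)

    phi = dihedral(c_thr1, n_val2, ca_val2, c_val2)
    psi = dihedral(n_val2, ca_val2, c_val2, n_ala3)
    omega = dihedral(ca_val2, c_val2, n_ala3, ca_ala3)
    print("phi, psi, omega (0-360):", phi % 360, psi % 360, omega % 360)
    print("N-CA-C bond angle:", bond_angle(n_val2, ca_val2, c_val2))
    print("bond vector N->CA:", sub(ca_val2, n_val2))
```

The bond vectors follow the definition li = ri - ri-1 used earlier, so each one is just the difference between consecutive atom positions.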
Summary
➢ Protein interiors are hydrophobic
➢ Proteins are made of secondary structures
➢ Secondary structure elements are connected to form simple motifs
➢ Protein molecules are organized in a structural hierarchy
➢ Large polypeptide chains fold into several domains
Two α helices that are connected by a short loop region in a specific
geometric arrangement constitute a helix-turn-helix motif.
Two such motifs are shown: the DNA-binding motif (a), which is further
discussed in Chapter 8, and the calcium-binding motif (b), which is present in
many proteins whose function is regulated by calcium.
Schematic diagrams of the calcium-binding motif.
The hairpin motif is very frequent in β sheets and is built up from two
adjacent β strands that are joined by a loop region.
Two examples of such motifs are shown. (a) Schematic diagram of the
structure of bovine trypsin inhibitor. The hairpin motif is colored red. (b)
Schematic diagram of the structure of the snake venom erabutoxin. The two
hairpin motifs within the β sheet are colored red and green.
The Greek key motif is found in antiparallel β
sheets when four adjacent β strands are
arranged in the pattern shown as a topology
diagram in (a).
The motif occurs in many β sheets and is
exemplified here by the enzyme
Staphylococcus nuclease (b).
STRUCTURAL
HIERARCHY
➢ Primary structure
• Arrangement of amino acids along the linear polypeptide chain
➢ Secondary structure
• helices and strands arrange themselves in simple motifs
• several motifs usually are combined to form compact globular
structures called domains
➢ Tertiary structure
• Arrangement of motifs or domains in 3D space
➢ Quaternary structure
• Arrangement of monomeric proteins with respect to each other in 3D space
Organization of polypeptide chains into domains.
Small protein molecules like the epidermal growth factor, EGF, comprise only one domain.
Others, like the serine proteinase chymotrypsin, are arranged in two domains that are required
to form a functional unit. Many of the proteins that are involved in blood coagulation and
fibrinolysis, such as urokinase, factor IX, and plasminogen, have long polypeptide chains that
comprise different combinations of domains homologous to EGF and serine proteinases and,
in addition, calcium-binding domains and Kringle domains.
➢ The fundamental unit of tertiary structure
is the domain
➢ Domain: a polypeptide chain or a part of a
polypeptide chain that can independently
fold into a stable tertiary structure.
➢ Domains are units of function
DOMAINS:
Each domain can
fold independently
Elements of secondary structure such as alpha helices and beta sheets
pack together into stable globular elements called domains.
A typical protein molecule is built from one or more domains,
often linked through relatively unstructured regions of polypeptide chain.
Different domain structures
Comparison of the conformations of two serine proteases.
The backbone conformations of elastase and chymotrypsin.
Although only those amino acids in the polypeptide chain shaded in green are
the same in the two proteins, the two conformations are very similar nearly
everywhere. The active site of each enzyme is circled in red; this is where the
peptide bonds of the proteins that serve as substrates are bound and cleaved
by hydrolysis. The serine proteases derive their name from the amino acid serine,
whose side chain is part of the active site of each enzyme and directly
participates in the cleavage reaction.
Motifs that are adjacent in the amino acid sequence are also usually adjacent
in the three-dimensional structure.
Triose-phosphate isomerase is built up from four β−α−β−α motifs that are
consecutive both in the amino acid sequence (a) and in the three-dimensional
structure (b).
Schematic diagram showing the packing of hydrophobic side chains
between the two α helices in a coiled-coil structure.
Every seventh residue in both α helices is a leucine, labeled "d." Due to the
heptad repeat, the d-residues pack against each other along the coiled-coil.
Residues labeled "a" are also usually hydrophobic and participate in
forming the hydrophobic core along the coiled-coil.
Salt bridges can stabilize coiled-coil structures and are sometimes important
for the formation of heterodimeric coiled-coil structures.
The residues labeled "e" and "g" in the heptad sequence are close to the hydrophobic
core and can form salt bridges between the two α helices of a coiled-coil structure, the
e-residue in one helix with the g-residue in the second and vice versa. (a) Schematic
view from the top of a heptad repeat. (b) Schematic view from the side of a coiled-coil
structure.
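The heptad bookkeeping described above can be sketched in a few lines. The helper names and the assumption that the first residue sits at position "a" are illustrative choices, not something stated in the lecture. Positions are labeled a to g cyclically; "a" and "d" mark the hydrophobic core and "e" and "g" mark the candidate salt-bridge positions.

```python
HEPTAD = "abcdefg"


def heptad_positions(n_residues, first_position="a"):
    """Label each residue index with its heptad position (a-g)."""
    start = HEPTAD.index(first_position)
    return [HEPTAD[(start + i) % 7] for i in range(n_residues)]


def indices_at(n_residues, positions, first_position="a"):
    """Residue indices (0-based) that fall on the given heptad positions."""
    labels = heptad_positions(n_residues, first_position)
    return [i for i, p in enumerate(labels) if p in positions]


if __name__ == "__main__":
    n = 14  # two heptads
    print("labels:     ", "".join(heptad_positions(n)))
    print("core (a, d):", indices_at(n, "ad"))
    print("salt (e, g):", indices_at(n, "eg"))
```

Because "d" positions recur every seven residues, two helices with this pattern can pack their leucines against each other along the whole coiled-coil.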
Proteins can be divided into three
main classes
➢ Alpha domain proteins
➢ Alpha/beta structures
➢ Beta domain structures
Four-helix bundles frequently occur as domains in α proteins.
Alpha-Beta proteins
In most α/β barrel structures the eight β strands of the barrel enclose a tightly
packed hydrophobic core formed by the side chains from the β strands.
α/β barrel domain of an enzyme
Hydrophilic hole
The enzyme pyruvate kinase folds into several
domains