Download Tertiary Structure

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Molecular mimicry wikipedia , lookup

Transcript
Comments
Structural motif v sequence motif
polyproline (“PXXP”) motif for SH3 binding
“RGD” motif for integrin binding
“GXXXG” motif within the TM domain of membrane protein
Most common type I’ beta turn sequences: X – (N/D/G)G – X
Most common type II’ beta turn sequences: X – G(S/T) – X
1
Putting it together
Alpha helices and beta sheets are not proteins—only marginally stable
by themselves
…
Extremely small “proteins” can’t do much
2
Tertiary structure
• Concerns with how the secondary structure units within a single
polypeptide chain associate with each other to give a threedimensional structure
• Secondary structure, super secondary structure, and loops come
together to form “domains”, the smallest tertiary structural unit
• Structural domains (“domains”) usually contain 100 – 200 amino acids
and fold stably.
• Domains may be considered to be connected units which are to
varying extents independent in terms of their structure, function and
folding behavior. Each domain can be described by its fold, i.e. how the
secondary structural elements are arranged.
• Tertiary structure also includes the way domains fit together
3
Domains are modular
•Because they are self-stabilizing, domains can be swapped both in
nature and in the laboratory
PI3 kinase
beta-barrel
GFP
Branden & Tooze
fluorescence localization experiment
4
Chimeras
Recombinant proteins are often expressed and purified
as fusion proteins (“chimeras”) with
– glutathione S-transferase
– maltose binding protein
– or peptide tags, e.g. hexa-histidine, FLAG epitope
helps with solubility, stability, and
purification
5
Structural Classification
All classifications are done at the domain level
In many cases, structural similarity implies a common evolutionary origin
– structural similarity without evolutionary relationship is possible
– but no structural similarity means no evolutionary relationship
Each domain has its corresponding “fold”, i.e. the identity and connectivity
of secondary structural elements
There appears to be a limited number of actual folds (~ 1000) utilized by
naturally occurring proteins
– Chothia, Nature 357, 543 (1992)
Nearly all proteins have structural similarities with some other proteins
6
•Two widely used protein structure classification systems are
CATH (class, architecture, topology, and homology) and SCOP
(structural classification of proteins)
http://scop.berkeley.edu/
http://www.cathdb.info/latest/index.html
7
SCOP Classification Statistics
SCOP: Structural Classification of Proteins. 1.73 release (Nov 2007)
34494 PDB Entries, 97178 Domains
(excluding nucleic acids and theoretical models)
Number of
folds
Number of
superfamilies
Number of
families
All alpha proteins
259
459
772
All beta proteins
165
331
679
Alpha and beta proteins
(a/b)
141
232
736
Alpha and beta proteins
(a+b)
334
488
897
Multi-domain proteins
53
53
74
Membrane and cell surface
proteins
50
92
104
Small proteins
85
122
202
1086
1777
3464
Class
Total
• detailed and comprehensive description of the structural and evolutionary
relationships between all proteins whose structure is known
8
CATH
•Levitt and Thornton, 261, 552 (1976)
•Automated and manual classification based on sequence and structural similarity
–If a given domain has sufficiently high sequence and structural similarity (e.g. 35%
sequence identity) with a domain that has been previously classified in CATH, the
classification is automatically inherited from the other domain
–Otherwise, the protein is manually classified
Class: mainly alpha, mainly beta, alpha+beta, little secondary structure
Architecture: overall shape of the domain structure as determined by the orientations
of the secondary structures but ignores the connectivity between the secondary
structures
Topology (Fold): both the overall shape and connectivity of the secondary structures
Homology: protein domains which are thought to share a common ancestor and can
therefore be described as homologous
9
Visual representation of the protein structure space
• Plot structural related proteins are placed
close in 3D space
• Representative proteins from each of the
SCOP family are compared to one another
• Scoring matrix Sij is computed for proteins i, j
• Use of three different folds (alpha, beta,
alpha/beta) is sufficient to describe all known
folds
three-dimensional map
of the protein universe
Shape of the fold space and the overall
distribution of the folds are influenced by
three factors
–
–
–
Secondary structure class
Chain topology
Protein domain size
Hou, et al PNAS 100, 2386 (2003)
10
Proteins fold autonomously
Anfinsen, Nobel prize in
Chemistry 1972
denatured protein
ribonuclease A
The information required to fold a protein into
a 3D structure is stored in its sequence
11
Why do proteins fold
• Unfolded (“denatured”) polypeptides have a large hydrophobic surface
exposed to the solvent
• Water molecules in the vicinity of a hydrophobic patch are highly ordered
• When protein folds, these water molecules are released from the
hydrophobic surface, vastly increasing the solvent entropy
ΔS>0
Corollary: Proteins fold to bury
hydrophobic residues in the protein
core, inaccessible to solvent molecules
12
(-) Energetic contributions to drive folding
–
–
–
–
desolvation of hydrophobic surface
intramolecular hydrogen bonds
van der Waals interactions
on the order of ~ -200 kcal/mol (free energy change for U Î N)
(+) Contributions to oppose folding
– loss of hydrogen bonds to water molecules
– entropy loss due to restricted backbone and side chain movements
– on the order of +190 kcal/mol
Net result:
stabilization by ~ -10 kcal/mol
weight of captain and boat – weight of boat = weight of captain
13
How do proteins fold
Hydrophobic collapse leads to a
folding intermediate known as
molten globule, in which the
hydrophobic side chains are
buried and hydrophilic side
chains are exposed
14
• Protein fold much more rapidly than one might expect (often in μs to ms)
• Protein does not sample every possible conformation in order to reach the
native state (i.e. the folded state)
• Otherwise, there are simply too many possibilities—e.g. 10100
– Levinthal paradox: if a protein samples every possible conformation, it’ll
never fold
folding funnel
Simulation of the folding of NK-lysin
Jones, PROTEINS. Suppl. 1, 185-191 (1997)
15
Structural characterization
Molecular weight
– analytical ultracentrifugation: larger molecules sediment more quickly
– analytical gel filtration: larger molecules take shorter time to travel
Ultra (or analytical) centrifugation
mω 2 2
P(r ) = P(r0 ) exp(−
(r − r0 2 ))
2 RT
mω 2 2
(r − r0 2 ))
n(r ) = n(r0 ) exp(−
2 RT
mω 2 2
log(n(r )) = log(n(r0 )) −
(r − r0 2 )
2 RT
= a + bi r 2
16
Gel filtration
also known as size exclusion column
17
Optical activity of protein
• Chiral compounds (including alpha helix and beta strand) absorb righthanded and left-handed polarized light to different degrees
• Circular dichroism (CD) spectroscopy measures the extent to which
two circularly polarized lights get absorbed to measure secondary
structure content
208
222
wavelength
18
High resolution structure determination
• Nuclear magnetic resonance (NMR) spectroscopy
When placed in a magnetic field, NMR active
nuclei (e.g. 1H, 13C, 15N) absorb energy at
a specific frequency, dependent on strength
of the magnetic field
The resonance frequency depends critically
on the local electronic structure
By measuring these frequency shifts (called
chemical shifts after normalized for the
strength of the magnetic field), it is possible
to determine:
- dihedral angle of a bond (hence,
conformation)
- distance between two atoms (hence, 3D
distance constraint)
Chemistry Nobel prize 1995 and 2002
19
Principles of NMR
•
•
When an atom with non-zero spin is placed in magnetic field, the up
and down states have two different energies
The transition from one to the other is accompanied by an absorption
or emission of a photon
ΔE = −2 μ i H
20
1D NMR
There are many more protons in a protein than in small molecules,
leading to crowding and degeneracy of chemical shifts
Typical protein 1D NMR
21
Multidimensional NMR
Spin labeling of protein with NMR active isotopes (e.g. 15N, 13C) makes
these atoms “visible” and allows their chemical couplings to neighboring
atoms to be examined
Backbone amide region of the 600-MHz (1H, 15N)-HSQC spectra of
chain-selectively labeled carbon monoxide-bound hemoglobin A in water
at pH 6.5 and 29 deg. Cross-peaks of a tetrameric hemoglobin
22
2D NOESY
• Spin labeled particles can interact with
each other through space
• The intensity of the interaction (nuclear
Overhauser effect, “NOE”) is ~ 1/r6, where
r is the distance between two nuclei
• Gives detectable signal up to ~ 5 Å
• NOE thus provides distance constraints
between all pairs of protons that are
important to construct the 3D structure of
a protein
• Spatial arrangements of hydrogens are
then computationally determined
23
Stereoview of 10 lysozyme structures
stereogram of a heart or a ghost of a tombstone
24
1H
15N
13C
25
X-ray crystallography
• Atomic resolution structural information
• No size limit on how large the protein and
protein complex can be
– Compare with the upper limit on the
molecular weight of the protein of ~50
kDa for NMR (although new techniques
are being developed constantly)
• Amenable to high throughput approach
– Structural GenomiX (SGX
Pharmaceuticals) seeks high throughput
structure determination of well
characterized proteins
• High quality crystals are absolute musts
RNA poymerase II
3500 amino acids or ~ 400 kDa
Cramer et al Science 292, 1863 (2001)
Chemistry Nobel prize 2006
26
• In a good protein crystal, each molecule of protein is
laid out in precisely the same way in each unit cell
• An orderly arrangement of like atoms in space
scatters (“diffracts”) light in geometric patterns
• The diffraction pattern is used to compute the
corresponding electron density map
• Need to determine the phase
• Protein backbone and side chains can be fitted into
the measured electron density
• Refinement of the structure results in 3D
coordinates of the individual atoms of a protein
Creighton, Protein Structure
27
Growing protein crystal
•
•
First the protein of interest needs to be purified to near complete
homogeneity (~ 99%)
Protein crystals can be grown by gradually increasing the protein
concentration in solution through vapor diffusion against a reservoir
28
X-ray diffraction
X-ray can be generated using an in-house
machine or at a synchrotron
Synchrotron beams are much more intense
and the wavelength can be fine-tuned
Argonne National Lab synchrotron
29
Solving the crystal structure
• Diffraction pattern corresponds to the
Fourier transformation of the electron
density
• In order to get the electron density back,
we need to perform inverse Fourier
transformation
• However, the phase information is
required
– molecular replacement
– heavy atom soaking
– multiple wavelength anomalous
dispersion (MAD)
– direct method
Hauptman, SUNY Buffalo
1985 Chemistry Nobel prize
Kevin Cowtan's Book of Fourier: http://www.ysbl.york.ac.uk/~cowtan/fourier/fourier.html
30
Fitting the peptide chain
• Once the electron density has been computed, the peptide chain can
be threaded (computationally) into the density.
• This is easier if the resolution is higher (i.e. < 2.0 Å)
• Amino acids have known connectivities, bond lengths, angles, and
stereochemistry, which aid in fitting
2.9 Å
1.2 Å
31
Structural classification (cont’d)
•
Alpha
•
Beta
•
Alpha/Beta
•
Small proteins
32
Alpha domain proteins
May or may not contain any motifs
Small and large
hemerytherin
(four helix
bundle domain)
Myoglobin (globin domain)
bovine cyclophilin
(tetratricopeptide
repeat domain)
33
importin
(Armadillo repeat)
34
Functional diversity
Structural : collagen
DNA binding : engrailed homeodomain
Catalysis : cytochrome c
engrailed homeodomain
collagen
cytochrome C
35
Beta proteins
Structurally and functionally diverse
– enzymes, transport proteins, antibodies
– cell surface proteins, viral coat proteins
Strands are often arranged in an antiparallel fashion
up-down
Greek key
36
not seen in nature
Beta barrels
• Intrinsic twist in beta sheets often leads to a barrel-like structure when
two sheets are packed against each other
• Amino acid sequence reflects beta structure
– alternating hydrophobic and hydrophilic residues
retinol binding protein
streptavidin
37
Include enzymes, antibodies, cell surface receptors, signaling molecules,
viral coal proteins
SH-3
beta helix
collagenase
(trefoil)
immunoglobulin
(beta sandwiches)
In beta barrel enzymes, the loop regions
often contain residues involved in catalysis
38
Immunoglobulin (Ig) – beta sandwich
Cell surface receptors
– Tumor necrosis factor receptor
– Interferon receptor
Antibodies
– “Y” shaped protein involved in (acquired) immune response
– Five different types (A, D, E, G, M)
– Comprises two heavy chains (4 Ig domains each) and
two light chains (2 Ig domains each)
– Each Ig domain is about 120 amino acids
– Variable domain (Fab) and constant domain (Fc)
– Binding specificity resides within loop residues
39
Sheet packing
(a, c) End and side views of two
untwisted beta sheets. (b, d) The
intrinsic twist of the beta sheet results
in the sheets forming an angle of
approximately 30° with each other
Chothia, Annu Rev Biochem 1984
Orthogonal beta sheet packing consist of beta sheets folded on themselves
The corner strands continues from one layer to the other
40
Alpha/beta proteins
Most frequent domain structure
Central beta sheet (parallel or mixed) surrounded by alpha helices
Used in enzymes and transporter molecules
1. Triose phosphate isomerase (TIM) barrel
2. Open beta sheet (e.g. Rossman fold)
3. Leucine-rich motif
41
TIM barrel
• Triose phosphate isomerase is involved in
glycolysis
• Eight copies of beta-alpha-beta motif joined
in the same orientation
– strands 1 and 8 hydrogen are bonded to
each other
– second strand of i-th motif and first strand of
(i+1)-th motif are shared
dihydroxyacetone phosphate
D-glyceraldehyde-3-phosphate
42
• Rediscovered at least 21 times during evolution
– Nagano et al, JMB, 321, 741 (2002)
– Entire database devoted to TIM barrel enzymes (“DATE”)
– http://www.mrc-lmb.cam.ac.uk/genomes/date/
• Minimum of 200 residues are required to construct the entire protein
• TIM barrel proteins are all enzymes and make up ~10% of all enzymes
• Beta strands and alpha helices (~ 160 amino acids) provide structural
framework
• Loop residues do not contribute to structural stability but instead are
involved in catalytic activity
• Branched hydrophobic side chains in the core (tightly packed)
– Val, Ile, Leu together make up 40%
43
Open twisted beta sheet
•
•
•
•
•
Alpha helices on both sides of a beta sheet green
fluorescence
Does not form a barrel (see figure of GFP)
protein
Always contains two adjacent beta strands
whose connections to the next strand are on
opposite sides of the beta sheet, creating a
crevice
Active site always at the carboxy end of the
beta sheet
Functional residues come from the loop
regions
Examples: carboxypeptidase, arabinose-binding
protein, tyrosyl-tRNA synthetase,
phosphoglycerate mutase
44
Leucine rich repeat
• Right-handed beta-alpha superhelix
• Composed of repeating 20-30 amino acid stretches
• The region between the helices and sheets is hydrophobic and is tighly
packed with recurring leucine residues
• One face of the beta sheet and one side of the helix array are exposed
to solvent and are therefore dominated by hydrophilic residues
• Diversity may be generated by mutating residues on the solvent
exposed side of beta strands
45
LRR proteins in nature
• Toll-like receptor—innate immunity in mammalian cells
• Adaptive immune system in jawless fish lamprey
• RNase inhibitor
Alder et al, Science 310, 1970 (2005)
Ribonuclease
inhibitor
46
•Listeria internalin
– Schubert et al, Cell 111, 825 (2002)
human
E-cadherin
N-terminal
domain
47
Small proteins
Often require additional interactions to make up for the absence of a
significant hydrophobic core
metals, e.g. Zn++, Fe++, Ca++
disulfide bonds
Fe++
zinc finger
protozoan pheromone
48