This material is not obligatory;
it is for students with a deeper interest in proteins.
Protein composition and structure - supplement
Attila Ambrus
Analysis of amino acid composition of
secondary structural elements
branching at the β-carbon tends to destabilize an α-helix (Val, Thr, Ile)
due to steric clashes; these aa are better suited to β-sheets, where their
side chains project away from the main chain
Ser, Asp, Asn disrupt α-helices because their H-bond donor/acceptor sites
near the main chain compete for main-chain N-H or C=O groups
Pro tends to disrupt both α-helices and β-strands due to its ring structure
Gly readily fits into all sorts of structural motifs because of its small
size
predictions of the secondary structure adopted by stretches of 6 or fewer
residues, taking the above (and other) considerations into account, proved
to be ~60-70% accurate; one reason is that the preference of an aa
for one structural motif over another is not sharply defined
tertiary interactions may push the same peptide to adopt another secondary structure in a different local environment
many sequences can adopt alternative conformations in different proteins
the VDLLKN sequence (shown in purple in the original figure) assumes a
helical conformation in one protein and a β-strand conformation in another
(3WRP.pdb and 2HLA.pdb)
Folds
protein tertiary structures are divided into five main classes according to
the secondary structure content of their domains:
all-α domains, all-β domains (β-barrels, e.g. Greek key motif), α+β domains
(irregular arrangement), α/β domains (β-α-β motifs) and “others”
each class contains many different folds further classified into families
this type of classification implies no necessary functional connection: a certain
type of function is often, but not always, restricted to a certain type of
fold (convergent evolution); the fold of a protein is “only” a scaffold to which
functions (active sites and different binding sites) are “added”
a protein of 100 aa may have 20^100 possible sequences/conformations, but an
estimate says that there are only ~1000 folds in nature (we know a few hundred of them so far); if we consider only the number of human genes (even without
splice variants) there would be more than ~30,000 individual conformations
if every protein sequence adopted a new structure
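The count quoted above is 20 to the power of 100 (20 possible residues at each of 100 positions). A minimal sketch of the arithmetic, assuming only the round numbers given in the text (the variable names are illustrative):

```python
import math

RESIDUE_TYPES = 20       # standard amino acids
CHAIN_LENGTH = 100       # residues, as in the example above
ESTIMATED_FOLDS = 1000   # rough estimate quoted in the text

# Python integers have arbitrary precision, so this value is exact.
sequence_count = RESIDUE_TYPES ** CHAIN_LENGTH

# Work with logarithms to report orders of magnitude.
log10_sequences = CHAIN_LENGTH * math.log10(RESIDUE_TYPES)
log10_per_fold = log10_sequences - math.log10(ESTIMATED_FOLDS)

print(f"possible sequences: about 10^{log10_sequences:.0f}")
print(f"sequences per fold: about 10^{log10_per_fold:.0f}")
```

Even if only a tiny fraction of these sequences fold at all, very many sequences must share each fold.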
conformation is more conserved than sequence
similar folds may have very low sequence identity, and vice versa;
nevertheless, ~30% sequence identity generally means similar structure
(the proteins may have diverged from a common ancestor; homology modeling of proteins
is based on this premise)
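Sequence identity is usually computed from a pairwise alignment. The sketch below assumes the two sequences are already aligned (equal length, `-` for gaps) and counts identity over the columns where both have a residue. That denominator is only one of several conventions in use, and the two example strings are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned fragments: 8 gap-free columns, 6 of them identical.
example_identity = percent_identity("VDLLKNAG-T", "VDILKQ-GST")
print(example_identity)
```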
modules: genetically mobile units manifested as separate protein domains,
functional units that are shared by proteins of similar functions; there are
domain (fold) families throughout the phylogenetic tree to deliver similar
functions (small differences in sequence with evolutionary conserved regions
featured in multiple sequence alignment of protein primary structures)
some folds are more favored than others as they represent a more stable
structure and some proteins may converge towards these folds over the
course of evolution; a number of folds though are found in only one group
of proteins
the CATH domain database classifies domains into approximately 800 fold
families, ten of these folds are highly populated and are referred to as
“super-folds”; super-folds are defined as folds for which there are at least
three structures without significant sequence similarity (the most populated
is the α/β-barrel super-fold)
β-barrels
large β-sheet that twists and coils to form a closed structure
first and last β-strands are H-bonded
typical antiparallel arrangement of strands
found in proteins spanning membranes (e.g. porins) and in proteins that
bind hydrophobic ligands inside the barrel (e.g. in a lipocalin fold)
CATH (Class, Architecture, Topology, Homologous) database
Similar database: SCOP (Structural Classification Of Proteins)
all known protein structures are sorted according to their folds
Superhelices
α-keratin (main component of wool and hair) consists of two right-handed
α-helices intertwined to form a left-handed superhelix called a coiled coil
(superfamily of coiled-coil proteins, ~60 proteins in humans)
2 or more α-helices can entwine and form a stable structure, even 1000 Å (0.1 µm) or
longer, found in the cytoskeleton, filaments and muscle proteins
3.5 residues/turn, heptad repeats, every 7th residue is Leu on each strand
and these two Leu interact (hydrophobic interaction); 2 Cys can also
interact (S-S), stabilizing the fiber
wool can be stretched (some interactions among helices break, S-S bonds do
not and pull the fiber back after release)
hair and wool have fewer cross-links; horn, claw and hoof have more and are hard
Collagen
most abundant protein in mammals, main fibrous component of skin, bone,
teeth, cartilage and tendon
extracellular protein, rod shape, ~3000 Å long/15 Å in diameter, 3 helical
protein chains (~1000 residues each, every 3rd residue is Gly, the
Gly-Pro-(Pro-OH) triad is frequent, Pro-OH (4-hydroxyproline) is a natural
amino acid derivative)
no H-bonds inside the helical strands, stabilization occurs via steric
repulsion between Pro and Pro-OH
~3 residues/turn, 3 helices wind into a superhelical cable that is stabilized
by H-bonds between strands (Pro-OH participates in the H-bonding network)
lack of -OH on Pro in collagen leads to the disease scurvy (vitamin C
deficiency; ascorbate reduces Fe³⁺ to Fe²⁺ in prolyl hydroxylase for its
continuous activity)
Pro rings are on the outside, Gly in every 3rd position is needed because
the superhelix is very crowded inside and there is no place for any other
bigger amino acid
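The "Gly in every 3rd position" rule can be checked on a sequence directly. This is a minimal sketch: the function names are illustrative, the test sequence is a made-up fragment, and `O` is used here as a one-letter stand-in for 4-hydroxyproline (an assumption of this example).

```python
def glycine_repeat_fraction(sequence: str, frame: int = 0) -> float:
    """Fraction of every-third positions, starting at `frame`, occupied by Gly."""
    positions = range(frame, len(sequence), 3)
    if len(positions) == 0:
        return 0.0
    hits = sum(1 for i in positions if sequence[i] == "G")
    return hits / len(positions)


def best_frame(sequence: str) -> int:
    """Reading frame (0, 1 or 2) with the highest Gly occupancy."""
    return max(range(3), key=lambda f: glycine_repeat_fraction(sequence, f))


# Hypothetical collagen-like fragment built from Gly-Pro-Hyp-type triplets.
fragment = "GPOGPOGPAGPO"
print(best_frame(fragment), glycine_repeat_fraction(fragment, 0))
```

A real collagen chain would be scanned the same way; a value close to 1.0 in one frame indicates an uninterrupted Gly-X-Y repeat.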
Denaturation of proteins
denaturing agents have chaotropic properties and disrupt the 3D structure of proteins (or DNA/RNA)
chaotropic agents interfere with intramolecular H-bonding and van der
Waals forces (hydrophobic interactions) and denature biomacromolecules
chaotropes also break down the H-bonded network of H2O allowing proteins
more structural freedom and encouraging extension and denaturation
Examples of chaotropic agents: 6-8 M urea, 2 M thiourea, 6 M guanidinium
chloride, 4.5 M LiClO4
in general, high salt concentrations can also have chaotropic effects:
they shield electric charges, preventing stabilization by salt bridges, and also
weaken H-bonds, which are more stable in less polar media (not being completely
solvated at high concentration, ions interact with dipoles of H-bonding partners,
which is more favorable than H-bonding itself); they also perturb the solubility of
proteins by taking H2O out of their hydration sphere, which leads to (reversible) precipitation/fractionation of proteins
opposite of chaotropes (disorder-maker, destabilizer) are kosmotropes
(order-maker, stabilizer): they stabilize proteins in solution, increase
structuring of water molecules
SO4²⁻, HPO4²⁻, Mg²⁺, Ca²⁺, Li⁺, Na⁺, H⁺ and OH⁻ (small ions with high
charge density) are good kosmotropes exhibiting stronger interactions
with H2O than H2O with itself and therefore capable of breaking H2O-H2O
H-bonds; non-ionic kosmotropes: trehalose, glucose, proline, tert-butanol
SCN⁻, H2PO4⁻, HSO4⁻, HCO3⁻, I⁻, Cl⁻, NO3⁻, NH4⁺, Cs⁺, K⁺, (NH2)3C⁺ (guanidinium) and (CH3)4N⁺ (tetramethylammonium) ions are rather chaotropes
proteins are most stable in solution when surrounded by fully H-bonded H2O
as H2O with spare H-bonding capacity has higher entropy and is more “aggressive”; such reactive H2O behaves in a similar way to raising T that denatures proteins
optimum stabilization of biological macromolecule by salt requires a mixture
of a kosmotropic anion with a chaotropic cation and the chaotropic ions
(with their weak aqueous interactions) should be the direct counterions to
the protein and the kosmotropic ions (with their strong aqueous interactions)
in the bulk; (NH4)2SO4 is a good salt for stabilizing protein structure/activity
when the anion and cation have similar affinities for H2O they are able to
remove H2O from each other most easily, to become ion-paired. A small ion
of high charge density plus a large counter-ion of low charge density forms
a highly soluble, solvent-separated hydrated but clustered ion pair as
the large ion cannot break through its counter-ion's hydration shell (for
example, CaI2, AgF and LiI versus CaF2, AgI)
Hofmeister series of ions precipitating proteins:
(Franz Hofmeister was also the one who proposed first in 1902 that amino acids build up
proteins via peptide bonds (even before Emil Fischer))
true when proteins are of net negative charge (pH>pI); may reverse if
pH<pI or if a different counterion or pH is present
in the original experiment a mixture of egg white proteins was used, pH was
not controlled and ovalbumin was of negative charge; the following series
was obtained:
anions: citrate³⁻ > SO4²⁻ = tartrate²⁻ > HPO4²⁻ > CrO4²⁻ > acetate⁻ > HCO3⁻ >
Cl⁻ > NO3⁻ > ClO3⁻
cations: Mg²⁺ > Li⁺ > Na⁺ = K⁺ > NH4⁺
(reversible) Salting out/precipitation of proteins
based on smaller solubility of proteins at high salt concentration
critical concentration varies for different proteins (fractionation/purification of proteins, e.g. albumins vs. globulins)
used also to concentrate proteins from dilute solutions (e.g. after gel
filtration [size-exclusion chromatography])
done generally by (solid) (NH4)2SO4 (final concentration expressed as the
% of the saturated (NH4)2SO4 solution) followed by filtration or centrifugation
dissolution in appropriate buffer and dialysis is used to remove high salt
concentrations afterwards and get the protein dissolved back again
Caution: some ions first increase the solubility of a protein (salting in)
while others may permanently denature/precipitate/poison certain proteins
or enzymes (e.g. heavy metal poisoning – irreversible complexation occurs)
protein “salting out” results from interfacial effects of strongly hydrated
anions near the protein surface so removing water molecules from the
protein solvation sphere and dehydrating the surface
protein “salting in” results from protein-counter ion binding and the consequently higher net protein charge and solvation; it occurs where the protein has little net charge near its pI primarily by weakly hydrated anions.
protein solubility is minimal at the pI (net charge is zero), below or above
charged protein molecules repel each other resulting in better solubility
precipitation is not necessarily accompanied by denaturation and vice versa
strong acids and bases can permanently destroy the H-bonding/salt-bridging network of proteins, denature and/or precipitate them; this is used
in the lab to test for protein content (TCA, sulfosalicylic acid)
ethanol or acetone can also precipitate proteins by shifting the dielectric
constant of solvent water that results in lower solubility of solute protein
heat denaturation/precipitation is of pathological relevance (high fever)
Folding/refolding of proteins
intriguing field of research for folding pathways
refolding techniques are used and optimized to increase protein yield
in heterologous protein expression and purification experiments (overexpressed excess protein may precipitate in the form of inclusion bodies
that contain protein in a (partially) denatured insoluble form)
refolding is not always spontaneous after dialysis of the denaturant; helper
materials are used to facilitate/initiate the folding process (native prosthetic groups/cofactors/substrates/ligands and e.g. PEG, arginine, CHAPS,
lauryl maltoside, glycerol, Triton X-100, BSA, etc. are good helper materials)
a redox-shuffling system (Cys-cystine, GSH-GSSG, β-mercaptoethanol, DTT) helps
resolve wrongly formed S-S bonds and find the thermodynamically most
favorable conformation
[Figure: folded-to-unfolded transition curve, with the label “half-folded ??”]
sharp transition from the folded to the unfolded state (“all or none”
cooperative process; same trend when the protein is refolded)
there are transient intermediates of folding at the atomic level (progressive
stabilization of intermediates); proteins may also get transiently
stabilized in a molten globule form that contains native-like secondary
structural elements but a rather dynamic tertiary structure (somewhere in
between the denatured and the native states)
if one part of the protein structure deteriorates (becomes thermodynamically
unstable under the given conditions), the whole structure will break down
(cooperatively), since the interactions that stabilized the rest of the
protein are lost with this (unfolded) part of the protein
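The sharp, all-or-none transition described above is often modeled as a two-state equilibrium between folded and unfolded protein. The sketch below assumes the linear extrapolation model, which is not stated in the slides: the unfolding free energy falls linearly with denaturant concentration. The parameter values (20 kJ/mol stability in water, m-value of 5 kJ/mol per M) are hypothetical and chosen only to place the midpoint at 4 M.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(denaturant_molar: float,
                      dg_water_kj: float = 20.0,   # assumed stability in water
                      m_value_kj: float = 5.0,     # assumed denaturant dependence
                      temperature_k: float = 298.15) -> float:
    """Two-state model: fraction of molecules unfolded at a given denaturant concentration."""
    dg_unfold = dg_water_kj - m_value_kj * denaturant_molar
    k_unfold = math.exp(-dg_unfold / (R_KJ * temperature_k))
    return k_unfold / (1.0 + k_unfold)


for conc in range(0, 9):
    print(f"{conc} M denaturant: fraction unfolded = {fraction_unfolded(conc):.3f}")
```

With these assumed parameters the population switches from almost fully folded to almost fully unfolded within a few molar around the midpoint, which is the sigmoidal shape expected for a cooperative process.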
What is the pathway to fold up?
unfolded protein → ? → unique conformation in the folded state
should the protein try out all the possible conformations to find the energetically most favorable one?
for a 100 aa protein that samples 3 conformations/aa, each
in 100 fs, this would take ~10^27 years (Levinthal's paradox)... not a good option!
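The estimate above can be reproduced in a few lines. This sketch uses only the numbers given in the text (3 conformations per residue, 100 residues, 100 fs per conformation); the year length of 365.25 days is an added assumption.

```python
CONFORMATIONS_PER_RESIDUE = 3
RESIDUES = 100
SECONDS_PER_SAMPLE = 100e-15               # 100 fs per conformation
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # assumed year length

# Exhaustive search: every residue independently tries each of its conformations.
total_conformations = CONFORMATIONS_PER_RESIDUE ** RESIDUES
search_time_years = total_conformations * SECONDS_PER_SAMPLE / SECONDS_PER_YEAR

print(f"{total_conformations:.2e} conformations")
print(f"{search_time_years:.1e} years for an exhaustive search")
```

The result is on the order of 10^27 years, far longer than the age of the universe, which is why a random search cannot be how proteins fold.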
Richard Dawkins in “The Blind Watchmaker” asked how long it would take for a monkey to type
out by accident Hamlet's remark to Polonius, “Methinks it is like a weasel”; the calculated answer is that it would (probably) happen in about 10^40 random keystrokes
however, if we preserve the correct keystrokes and let the monkey retype only the wrong
ones, the whole process takes only a couple of thousand trials! (cumulative
selection, partly correct intermediates are retained)
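The effect of retaining correct keystrokes can be simulated. The sketch below follows the simplified scheme described above (correct characters are kept, only wrong ones are retyped); it is an illustration under that assumption, not a reproduction of any published program. The 27-key alphabet (A-Z plus space) and the function name are choices made for this example.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "  # 27 possible keys (assumed keyboard)


def type_with_retention(target: str = TARGET, seed: int = 0):
    """Retype only the wrong characters each round; return (text, rounds, keystrokes)."""
    rng = random.Random(seed)
    current = [rng.choice(ALPHABET) for _ in target]
    keystrokes = len(target)
    rounds = 1
    while "".join(current) != target:
        for i, wanted in enumerate(target):
            if current[i] != wanted:
                current[i] = rng.choice(ALPHABET)  # retype only this wrong key
                keystrokes += 1
        rounds += 1
    return "".join(current), rounds, keystrokes


# Number of distinct 28-character strings a purely random search has to cover.
random_search_space = len(ALPHABET) ** len(TARGET)

text, rounds, keystrokes = type_with_retention()
print(f"random search space: {random_search_space:.1e} strings")
print(f"with retention: {rounds} rounds, {keystrokes} keystrokes")
```

The exact counts depend on the random seed, but the run finishes after a modest number of keystrokes, while the purely random search space is about 10^40.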
to fold correctly in a reasonable time frame, a protein has to
follow an at least partly defined folding pathway with intermediates on
the road to the folded form (nucleation-condensation model; energy-funnel
model with multiple possible pathways to the same final stable
structure at the bottom of the funnel, where going deeper into the funnel
means fewer and fewer accessible conformations)
Chaperones
intracellular proteins assist in folding/preventing misfolding or aggregation
of biomacromolecules and help assemble complex macromolecular structures,
these proteins are called chaperones or chaperonins and some of them are
even called foldases or unfoldases
some chaperones assist in correctly folding newly synthesized protein chains
as a minority of protein structures would not be able to correctly fold all by
themselves
they also assist in disassembling/unfolding of macromolecular structures
they help assemble already folded structures to higher level structures
(e.g. oligomers)
they sometimes need co-chaperones to fully exert the chaperone action
they do not convey “steric information” to fold a protein per se, they rather
prevent transformation to non-functional structures
they sometimes use ATP as an energy source for their folding action
cellular shock (e.g. heat shock) leads to higher propensity of protein aggregation and specialized proteins, so-called “heat-shock proteins (HSP)”, help
avoid this aggregation; not all chaperones are HSPs
HSPs are expressed as a response to higher T or other cellular shocks
important chaperones (found especially in the ER): calnexin, calreticulin,
different HSPs, protein disulfide isomerase, peptidyl prolyl cis/trans isomerase
Hsp60 (GroEL/GroES complex in E. coli, Group I chaperonin, GroES is a co-chaperonin) is the best characterized large (~1 MDa) chaperone complex,
also found in the mitochondrial matrix
other HSPs: HSP70 (prevents apoptosis), HSP90, HSP100, etc. (the number
denotes the approximate MW in kDa)
the mechanism of action generally requires ATP hydrolysis and major conformational changes on the chaperone's side to be able to encapsulate the
unfolded protein in the chaperone's “lumen”, where it will start folding
Protein misfolding and aggregation – pathological relevance
some infectious neurological diseases were recently revealed to be transmitted by virus-sized protein particles
such examples are bovine spongiform encephalopathy (mad cow disease)
and the analogous disease in humans, the Creutzfeldt-Jakob disease (CJD)
the agents causing these diseases are called prions
for proving the hypothesis that diseases can be transmitted purely by
proteins, Stanley Prusiner in 1997 was awarded the Nobel Prize in Physiology
or Medicine
such proteins are massive, aggregated proteins, resistant to most regular treatments, formed from a regular cellular, mostly helical, protein in the brain,
PrP (prion protein); PrPSc is insoluble and heterogeneous
evidence indicates that helical and β-turn protein content gets converted to
β-strand conformations that link to other β-strands of similar nature and
form extended β-sheets and eventually protein aggregates (amyloids)
the infectious agent in prion diseases is an aggregated form of a protein
amyloids are insoluble fibrous protein aggregates sharing specific structural traits; abnormal accumulation of amyloids in organs may lead to
amyloidosis, which plays a role in various (neurodegenerative) diseases (CJD,
Alzheimer's, Parkinson's and Huntington's diseases, atherosclerosis, diabetes mellitus type II, etc.)
[Figure labels: PrPSc nucleus, normal PrP pool, aggregation, Aβ-protein, (tau protein)]
the protein-only model for prion disease transmission
the disease can be transferred from one organism to another by transferring
the nucleus (mad cow disease outbreak in the 1990s in the UK, animals were
fed with feed of infected animal origin)
[Figure: amyloid plaques in the small intestine]
Aβ is derived from the cellular amyloid precursor protein (APP) through
specific proteases; it is prone to form insoluble aggregates and its
structure by solid-state NMR spectroscopy showed extended parallel β-sheet
arrangements