Bioch/BIMS 503 Lecture 1
Structure and Properties of Amino
Acids and the Peptide Backbone
August 26, 2008
Robert Nakamoto
Mol. Physiology & Biol Physics
Tel: 982-0279, [email protected]
Snyder 380 (Fontaine)
Major topics –
• Names, abbreviations, general structure of
amino acids
• Amino acid chemical classes (polar,
hydrophobic, acidic, basic, aromatic, S-containing)
• Amino acid structural classes/affinity
• Amino acid evolutionary classes
• pK - Henderson-Hasselbalch equation
• Structure of the peptide bond
• Proteomics – MS protein sequencing
Further Reading –
• Lehninger, Chapter 3, pp 75-86, 102-106
• MvHA, Chapter 5, pp 126-142
• Branden & Tooze, Ch. 1
• Aebersold R, Mann M. (2003) Mass spectrometry-based proteomics. Nature. 422:198-207 PMID:12634793
Hierarchies of protein structure
primary structure: MVDFYYLPGSSPCRSVIMTAKAVGVELNKK
secondary structure: α-helix, β-strand
super-secondary structure: αα, βαβ, ββ
tertiary fold: αααα (4-helix bundles), βαβαβ (Rossmann fold), βββ (β-meander), ββββ (Greek key)
tertiary structure
Does this structural hierarchy reflect the folding process?
Secondary structure first, or last?
Examples of protein folds and complexes:
Many bacterial toxin proteins undergo conformational changes that
allow them to insert into the host cell membrane: for example, Anthrax toxin
protein/Protective protein complex
From Santelli et al., 2004, Nature 430, 905-908
Currently 10,340 protein fold families in Pfam
[http://pfam.sanger.ac.uk/]
Pfam is a comprehensive collection of protein
domains and families, represented as multiple
sequence alignments and as profile hidden
Markov models.
Generally does not include membrane proteins.
What defines the 3-dimensional fold of a
protein?
Structure and properties of Amino-acids

Amino acid       3-letter   1-letter
Alanine          Ala        A
Arginine         Arg        R
Asparagine       Asn        N
Aspartic acid    Asp        D
Cysteine         Cys        C
Glutamine        Gln        Q
Glutamic acid    Glu        E
Glycine          Gly        G
Histidine        His        H
Isoleucine       Ile        I
Leucine          Leu        L
Lysine           Lys        K
Methionine       Met        M
Phenylalanine    Phe        F
Proline          Pro        P
Serine           Ser        S
Threonine        Thr        T
Tryptophan       Trp        W
Tyrosine         Tyr        Y
Valine           Val        V
Asp/Asn          Asx        B
Glu/Gln          Glx        Z
Amino-acid Chirality
L-glyceraldehyde
D-glyceraldehyde
Amino-acid Chirality: the “CORN” rule
[Figure: two views of the Cα center with the H, CO, R and N substituents labelled]
Figure 5.3: The amino acids found in proteins.
[Figure panel titles: CYCLIC AMINO ACID; AROMATIC AMINO ACIDS]
[Figure: Leucine and Alanine, with the α and β carbons, H, R, CO and N labelled]
Proline - a cyclic amino-acid
[Figure: Pro compared with Ala, with the α and β carbons, H, R, CO and N labelled]
Classifications of amino-acids
• Abundance
• Hydrophobicity
• Mutability
• Structural preference
• Charge properties
us.expasy.org/cgi-bin/protscale.pl
Molecular weight                 Number of codon(s)
Bulkiness                        Polarity / Zimmerman
Polarity / Grantham              Refractivity
Recognition factors              Hphob. / Eisenberg et al.
Hphob. OMH / Sweet et al.        Hphob. / Hopp & Woods
Hphob. / Kyte & Doolittle        Hphob. / Manavalan et al.
Hphob. / Abraham & Leo           Hphob. / Black
Hphob. / Bull & Breese           Hphob. / Fauchere et al.
Hphob. / Guy                     Hphob. / Janin
Hphob. / Miyazawa et al.         Hphob. / Rao & Argos
Hphob. / Roseman                 Hphob. / Wolfenden et al.
Hphob. / Welling & al            Hphob. HPLC / Wilson & al
Hphob. HPLC / Parker & al        Hphob. HPLC pH3.4 / Cowan
Hphob. HPLC pH7.5 / Cowan        Hphob. / Rf mobility
HPLC / HFBA retention            HPLC / TFA retention
HPLC / retention pH 2.1          HPLC / retention pH 7.4
% buried residues                % accessible residues
Hphob. / Chothia                 Hphob. / Rose & al
Ratio hetero end/side            Average area buried
Average flexibility              alpha-helix / Chou & Fasman
beta-sheet / Chou & Fasman       beta-turn / Chou & Fasman
alpha-helix / Deleage & Roux     beta-sheet / Deleage & Roux
beta-turn / Deleage & Roux       Coil / Deleage & Roux
alpha-helix / Levitt             beta-sheet / Levitt
beta-turn / Levitt               Total beta-strand
Antiparallel beta-strand         Parallel beta-strand
A.A. composition                 A.A. comp. in Swiss-Prot
Relative mutability
Amino acid frequencies in proteins

+/-  Residue  Code  Frequency
+    Ala      A     0.0780
     Arg      R     0.0512
     Asn      N     0.0448
     Asp      D     0.0536
-    Cys      C     0.0192
     Gln      Q     0.0426
+    Glu      E     0.0629
+    Gly      G     0.0737
-    His      H     0.0219
     Ile      I     0.0514
+    Leu      L     0.0901
     Lys      K     0.0574
-    Met      M     0.0224
-    Phe      F     0.0385
     Pro      P     0.0520
+    Ser      S     0.0711
     Thr      T     0.0584
-    Trp      W     0.0132
-    Tyr      Y     0.0321
+    Val      V     0.0644
Amino acid Hydropathicity/Hydrophobicity
Hopp T.P., Woods K.R. (1981) PNAS. 78:3824-3828.
Kyte J., Doolittle R.F. (1982). J. Mol. Biol. 157:105-132
D. M. Engelman, T. A. Steitz, A. Goldman, (1986) Annu. Rev. Biophys. Biophys. Chem. 15, 321

Hopp/Woods      Kyte/Doolittle    GES
Arg:   3.0      Arg:  -4.5        Arg:  12.3
Lys:   3.0      Lys:  -3.9        Asp:   9.2
Asp:   3.0      Asp:  -3.5        Lys:   8.8
Glu:   3.0      Glu:  -3.5        Glu:   8.2
Ser:   0.3      Gln:  -3.5        Asn:   4.8
Gln:   0.2      Asn:  -3.5        Gln:   4.1
Asn:   0.2      His:  -3.2        His:   3.0
Pro:   0.0      Pro:  -1.6        Tyr:   0.7
Gly:   0.0      Tyr:  -1.3        Pro:   0.2
Thr:  -0.4      Trp:  -0.9        Ser:  -0.6
His:  -0.5      Ser:  -0.8        Gly:  -1.0
Ala:  -0.5      Thr:  -0.7        Thr:  -1.2
Cys:  -1.0      Gly:  -0.4        Ala:  -1.6
Met:  -1.3      Ala:   1.8        Trp:  -1.9
Val:  -1.5      Met:   1.9        Cys:  -2.0
Leu:  -1.8      Cys:   2.5        Val:  -2.6
Ile:  -1.8      Phe:   2.8        Leu:  -2.8
Tyr:  -2.3      Leu:   3.8        Ile:  -3.1
Phe:  -2.5      Val:   4.2        Met:  -3.4
Trp:  -3.4      Ile:   4.5        Phe:  -3.7
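A scale like these is usually applied by averaging the per-residue values over a whole sequence or over a sliding window. The sketch below is only an illustration under stated assumptions: it uses the Kyte/Doolittle values from the table above, the function names are hypothetical, and the window of 9 residues is an arbitrary choice rather than something specified in the lecture.

```python
# Kyte & Doolittle (1982) hydropathy values, copied from the table above.
KYTE_DOOLITTLE = {
    "R": -4.5, "K": -3.9, "D": -3.5, "E": -3.5, "Q": -3.5, "N": -3.5,
    "H": -3.2, "P": -1.6, "Y": -1.3, "W": -0.9, "S": -0.8, "T": -0.7,
    "G": -0.4, "A": 1.8, "M": 1.9, "C": 2.5, "F": 2.8, "L": 3.8,
    "V": 4.2, "I": 4.5,
}


def mean_hydropathy(seq):
    """Average hydropathy of a sequence given in 1-letter codes."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every full window along the sequence.

    The window size is an assumption made for this example.
    """
    if window > len(seq):
        raise ValueError("window is longer than the sequence")
    return [mean_hydropathy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


if __name__ == "__main__":
    # Example sequence from the "primary structure" slide.
    seq = "MVDFYYLPGSSPCRSVIMTAKAVGVELNKK"
    print("mean hydropathy: %.2f" % mean_hydropathy(seq))
    for start, value in enumerate(hydropathy_profile(seq), start=1):
        print("window starting at residue %2d: %6.2f" % (start, value))
```

Positive window averages point to hydrophobic stretches and negative ones to polar stretches on this scale; the other two scales have the opposite sign convention.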
Amino-acid classes from
evolution/mutation
Given a set of (closely) related protein sequences...
GSTM1_HUMAN  MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
GSTM2_HUMAN  MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
GSTM4_HUMAN  MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGGAPDYDRSQWLNEKFKLGLD
GSTM5_HUMAN  MPMTLGYWDIRGLAHAIRLLLEYTDSSYVEKKYTMGDAPDYDRSQWLNEKFKLGLD
GTM1_MOUSE   MPMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKFKLGLD
GTM2_MOUSE   MPMTLGYWDIRGLAHAIRLLLEYTDTSYEDKKYTMGDAPDYDRSQWLSEKFKLGLD
GTM3_MOUSE   MPMTLGYWNTRGLTHSIRLLLEYTDSSYEEKRYVMGDAPNFDRSQWLSEKFNLGLD
GTM4_MOUSE   MSMVLGYWDIRGLAHAIRMLLEFTDTSYEEKRYICGEAPDYDRSQWLDVKFKLDLD
GTM3_RABIT   MPMTLGYWDVRGLALPIRMLLEYTDTSYEEKKYTMGDAPNYDQSKWLSEKFTLGLD
… how often is one amino-acid replaced by another?
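One naive way to put numbers on that question is to walk down each column of the alignment and count every pair of sequences that disagree. The sketch below does only that; it is an illustration with a hypothetical function name, not the Dayhoff procedure, which counts changes along a phylogenetic tree with inferred ancestors. The demo uses the first ten residues of the first three sequences above.

```python
from collections import Counter
from itertools import combinations


def count_replacements(aligned):
    """Count unordered residue pairs that differ within each alignment column.

    `aligned` is a list of equal-length sequences. Every pair of sequences is
    compared, so a single evolutionary change can be counted more than once.
    """
    counts = Counter()
    for column in zip(*aligned):
        for a, b in combinations(column, 2):
            if a != b:
                counts[tuple(sorted((a, b)))] += 1
    return counts


if __name__ == "__main__":
    fragment = [
        "MPMILGYWDI",  # start of the first sequence in the alignment
        "MPMTLGYWNI",  # start of the second
        "MPMILGYWDI",  # start of the third
    ]
    for pair, n in count_replacements(fragment).most_common():
        print("%s <-> %s: %d" % (pair[0], pair[1], n))
```

Even this crude count hints at the pattern the following slides quantify: exchanges such as I/T and D/N appear readily among closely related sequences.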
Relative mutability of amino acids (Ala=100)
Ala: 100.0
Arg:  65.0
Asn: 134.0
Asp: 106.0
Cys:  20.0
Gln:  93.0
Glu: 102.0
Gly:  49.0
His:  66.0
Ile:  96.0
Leu:  40.0
Lys:  56.0
Met:  94.0
Phe:  41.0
Pro:  56.0
Ser: 120.0
Thr:  97.0
Trp:  18.0
Tyr:  41.0
Val:  74.0
Dayhoff M.O., Schwartz R.M., Orcutt B.C. (1978) In "Atlas of Protein Sequence and Structure", Vol.5, Suppl.3
Mutation frequencies after 1% change (× 100,000)

A  98754
R     30 98974
N     23    19 98720
D     42     8   269 98954
C     11    22     7     2 99432
Q     23   125    35    20     4 98955
E     65    18    36   470     3   198 99055
G    130    99    59    95    43    19    87 99350
H      6    75    89    25    16   136     6     5 98864
I     20    12    25     6     9     5     6     3     9 98729
L     28    35    11     6    21    66     9     6    51   209 99330
K     21   376   153    15     4   170   105    16    27    12     8 99100
M     13    10     7     4     7    10     4     3     8   113    92    15 98818
F      6     2     4     2    31     2     2     2    16    35    99     2    17 99360
P     98    37     8     8     7    83     9    13    58     5    52    11     8     9 99270
S    257    69   342    41   152    37    21   137    50    27    40    32    20    63   194 98556
T    275    37   135    23    25    30    19    20    27   142    15    60   131     7    69   276 98665
W      1    18     1     1    16     3     1     8     1     1     7     1     3     8     1     5     2 99686
Y      3     6    22    15    67     8     2     3   182    10     8     3     6   171     3    20     7    23 99392
V    194    12    11    20    41    13    29    31     8   627   118     9   212    41    15    25    74    17    11 98761
       A     R     N     D     C     Q     E     G     H     I     L     K     M     F     P     S     T     W     Y     V

Jones D.T., Taylor W.R. and Thornton J.M. (1992) CABIOS 8:275-282
The PAM250 matrix
PAM: Point Accepted Mutation

Cys  12
Ser   0   2
Thr  -2   1   3
Pro  -1   1   0   6
Ala  -2   1   1   1   2
Gly  -3   1   0  -1   1   5
Asn  -4   1   0  -1   0   0   2
Asp  -5   0   0  -1   0   1   2   4
Glu  -5   0   0  -1   0   0   1   3   4
Gln  -5  -1  -1   0   0  -1   1   2   2   4
His  -3  -1  -1   0  -1  -2   2   1   1   3   6
Arg  -4   0  -1   0  -2  -3   0  -1  -1   1   2   6
Lys  -5   0   0  -1  -1  -2   1   0   0   1   0   3   5
Met  -5  -2  -1  -2  -1  -3  -2  -3  -2  -1  -2   0   0   6
Ile  -2  -1   0  -2  -1  -3  -2  -2  -2  -2  -2  -2  -2   2   5
Leu  -6  -3  -2  -3  -2  -4  -3  -4  -3  -2  -2  -3  -3   4   2   6
Val  -2  -1   0  -1   0  -1  -2  -2  -2  -2  -2  -2  -2   2   4   2   4
Phe  -4  -3  -3  -5  -4  -5  -4  -6  -5  -5  -2  -4  -5   0   1   2  -1   9
Tyr   0  -3  -3  -5  -3  -5  -2  -4  -4  -4   0  -4  -4  -2  -1  -1  -2   7  10
Trp  -8  -2  -5  -6  -6  -7  -4  -7  -7  -5  -3   2  -3  -4  -5  -2  -6   0   0  17
      C   S   T   P   A   G   N   D   E   Q   H   R   K   M   I   L   V   F   Y   W
Solvent Exposed Area (SEA)
The data for this table were calculated from 55 proteins in the
Brookhaven database, drawn from 9 molecular families: globins,
immunoglobulins, cytochromes c, serine proteases, subtilisins,
calcium binding proteins, acid proteases, toxins and virus capsid
proteins. Red entries are found on the surface of a protein in > 70%
of occurrences, and blue entries are found inside a protein in < 20%
of occurrences.
The only clear trend in this table is that some residues, such
as R and K, locate themselves so that they have access to the
solvent. The so-called hydrophobic residues, such as L and
F, show no clear trend: they are found near the solvent as
often as they are found buried.
Probability that a particular residue will be positioned in
real proteins so that its solvent exposed area meets the
particular criterion in the column's title.

Residue   > 30 Å²   < 10 Å²   30 > SEA > 10 Å²
S         0.70      0.20      0.10
T         0.71      0.16      0.13
A         0.48      0.35      0.17
G         0.51      0.36      0.13
P         0.78      0.13      0.09
C         0.32      0.54      0.14
D         0.81      0.09      0.10
E         0.93      0.04      0.03
Q         0.81      0.10      0.09
N         0.82      0.10      0.08
L         0.41      0.49      0.10
I         0.39      0.47      0.14
V         0.40      0.50      0.10
M         0.44      0.20      0.36
F         0.42      0.42      0.16
Y         0.67      0.20      0.13
W         0.49      0.44      0.07
K         0.93      0.02      0.05
R         0.84      0.05      0.11
H         0.66      0.19      0.15
http://www.cmbi.kun.nl/swift/future/aainfo/access.htm
Ionization of Amino Acids in water
For all amino acids, there are two modes of ionization depending on
the pH of the aqueous medium: (1) uncharged at low pH, –1 at high
pH (acid), or (2) +1 at low pH, uncharged at high pH (base).
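From the Henderson-Hasselbalch equation:
\[ \mathrm{pH} = \mathrm{p}K_a + \log\frac{[\text{base}]}{[\text{acid}]} \]
\[ \log\frac{[\text{base}]}{[\text{acid}]} = \mathrm{pH} - \mathrm{p}K_a \]
\[ \frac{[\text{base}]}{[\text{acid}]} = 10^{\,\mathrm{pH} - \mathrm{p}K_a} \]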
90% or 99% of the functional group is deprotonated (or protonated)
when the pH is 1 or 2 pH units above (below) the pK.
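The 90% and 99% figures follow directly from the ratio [base]/[acid] = 10^(pH − pKa): the deprotonated fraction is ratio / (1 + ratio). A minimal sketch of that arithmetic, with a hypothetical function name and a pKa of 2.4 chosen purely as an example value:

```python
def fraction_deprotonated(pH, pKa):
    """Fraction of a group in its deprotonated (base) form.

    Uses [base]/[acid] = 10**(pH - pKa) from the Henderson-Hasselbalch
    equation, then converts the ratio into a fraction of the total.
    """
    ratio = 10 ** (pH - pKa)
    return ratio / (1 + ratio)


if __name__ == "__main__":
    pKa = 2.4  # example value, assumed for illustration
    for offset in (-2, -1, 0, 1, 2):
        frac = fraction_deprotonated(pKa + offset, pKa)
        print("pH = pKa %+d: %.1f%% deprotonated" % (offset, 100 * frac))
```

One unit above the pK gives a ratio of 10:1 (about 91% deprotonated) and two units give 100:1 (about 99%), which is the rule of thumb stated above.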

The ionic properties
of amino acids
reflect the ionization
of the COO–, NH3+,
and R-groups
When subjected to changes in pH,
amino acids change from the
protonated form with net positive charge
in strongly acidic solution to the
unprotonated form with net negative
charge in strongly basic solution.
During this transition, the amino acid
will pass through a state with no net
charge. The pH at which this occurs is
the isoelectric point or pI. pI can be
calculated from pKa values. For
zwitterionic and acidic amino acids,
pI = 1/2(pK1+pK2). For basic amino
acids, pI = 1/2(pK2+pK3).
[Figure: titration of an amino acid from cation, through the zwitterion (net charge 0), to anion; pK1 = 2.3, pK2 = 9.6]
Ionic characteristics of amino-acids
[Figure: ionization states C2+, C1+, Zw and A–, with pK1 = 1.8, pK2 = 6.0 and pK3 = 9.2]

          pH=7.4, pK2=6.0   pH=7.4, pK2=7.0   pH=6.8, pK2=6.0
[C1+]     3.8%              28.2%             13.6%
[Zw]      94.7%             70.4%             86.0%

\[ \mathrm{p}I = \tfrac{1}{2}(\mathrm{p}K_2 + \mathrm{p}K_3) = \tfrac{1}{2}(6.0 + 9.2) = 7.6 \]
Overall, the aa in solution is
positively charged at pH < pI
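The percentages in this slide can be reproduced by applying the Henderson-Hasselbalch ratio to each ionization step in turn and normalizing. The sketch below assumes a group with three pK values (1.8, 6.0 and 9.2, as on the slide); the function name and layout are hypothetical.

```python
def species_fractions(pH, pKas):
    """Fractions of each protonation state, most protonated first.

    `pKas` must be in ascending order. Each successive state is related to
    the previous one by the ratio 10**(pH - pKa).
    """
    weights = [1.0]
    for pKa in pKas:
        weights.append(weights[-1] * 10 ** (pH - pKa))
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    pKas = [1.8, 6.0, 9.2]
    names = ["C2+", "C1+", "Zw", "A-"]
    for name, frac in zip(names, species_fractions(7.4, pKas)):
        print("%-3s %5.1f%%" % (name, 100 * frac))
    # Isoelectric point for a basic amino acid: mean of pK2 and pK3.
    print("pI = %.1f" % (0.5 * (pKas[1] + pKas[2])))
```

At pH 7.4 this gives roughly 3.8% C1+ and 94.7% Zw, matching the first column of the table; the remainder is almost entirely the anion.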
pKa values of common amino acids

Amino Acid       α-COOH pKa   α-NH3+ pKa   R group pKa
Alanine          2.4          9.7
Arginine         2.2          9.0          12.5
Asparagine       2.0          8.8
Aspartic Acid    2.1          9.8          3.9
Cysteine         1.7          10.8         8.3
Glutamic Acid    2.2          9.7          4.3
Glutamine        2.2          9.1
Glycine          2.3          9.6
Histidine        1.8          9.2          6.0
Isoleucine       2.4          9.7
Leucine          2.4          9.6
Lysine           2.2          9.0          10.5
Methionine       2.3          9.2
Phenylalanine    1.8          9.1
Proline          2.1          10.6
Serine           2.2          9.2          ~13
Threonine        2.4          10.4         ~13
Tryptophan       2.4          9.4
Tyrosine         2.2          9.1          10.1
Valine           2.3          9.6
The planar
nature of the
peptide bond
MvHA Fig. 5.8
MvHA Fig. 5.12
Limited rotation around the peptide
bond – cis- and trans-proline
The 19 amino-acids other than proline
strongly prefer (>99.7%) to have the Cα
carbons in the trans- configuration. Proline
shows a weaker preference, with about 5%
of Xaa-Pro in the cis- configuration.
Strategies for Protein
Sequencing (Proteomics)
• Classic “Edman”
sequencing
– PTC-conjugation to N-terminal amino-acid
– Cleave N-terminal
peptide bond
– Identify PTH amino-acid
– Repeat 20 - 30 cycles
• Sequencing with Mass Spectrometry
– isolate protein (or use
mixture of proteins)
– cleave with trypsin
(proteins don’t “fly”)
– separate on HPLC
– separate peptides in
MS(1)
– fragment peptides in
collision cell
– separate peptide
fragments in MS(2)
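The "cleave with trypsin" step is easy to mimic in silico, which is also how database search programs predict the peptides they expect to see. The sketch below assumes the commonly used rule that trypsin cuts after Lys (K) or Arg (R) unless the next residue is Pro (P); that rule and the function name are assumptions for illustration, not something stated on the slide.

```python
def trypsin_digest(protein):
    """Split a protein sequence (1-letter codes) into tryptic peptides.

    Assumed rule: cleave on the C-terminal side of K or R, except when the
    following residue is P. Missed cleavages are not modelled.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return peptides


if __name__ == "__main__":
    # Example sequence from the "primary structure" slide.
    for peptide in trypsin_digest("MVDFYYLPGSSPCRSVIMTAKAVGVELNKK"):
        print(peptide)
```

Each resulting peptide ends in K or R, apart from the protein's own C-terminal peptide.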
Protein primary structure can be
determined by chemical methods and
from gene sequences
Edman degradation
Time-of-flight mass spectrometry measures
the mass of proteins and peptides
Positive ESI-MS m/z
spectrum of lysozyme.
Most protein analysis done by Electrospray Ionisation (ESI) or
Matrix Assisted Laser Desorption Ionisation (MALDI)
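In positive-mode ESI a protein shows up as a series of peaks, one per charge state, at m/z = (M + z × 1.007276) / z, where M is the neutral mass and 1.007276 Da is the proton mass. The sketch below computes such a series and recovers M from two adjacent peaks. The function names are hypothetical, and the mass of 14305 Da is only a round illustrative value of about the size of lysozyme, not a figure taken from the slide.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz(neutral_mass, charge):
    """m/z of a molecule carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_lower_charge, mz_higher_charge):
    """Recover (charge, neutral mass) from two neighbouring peaks.

    `mz_lower_charge` is the peak with charge z (the larger m/z value) and
    `mz_higher_charge` the neighbouring peak with charge z + 1.
    """
    z = round((mz_higher_charge - PROTON_MASS)
              / (mz_lower_charge - mz_higher_charge))
    return z, z * (mz_lower_charge - PROTON_MASS)


if __name__ == "__main__":
    mass = 14305.0  # illustrative value, assumed for this example
    for z in range(8, 13):
        print("z = %2d  m/z = %9.3f" % (z, mz(mass, z)))
    print(mass_from_adjacent_peaks(mz(mass, 10), mz(mass, 11)))
```

Because neighbouring peaks differ by exactly one charge, any adjacent pair is enough to determine both the charge and the neutral mass.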
http://www.healthsystem.virginia.edu/internet/biomolec/
Figure 1 Generic mass spectrometry (MS)-based
proteomics experiment. The typical proteomics
experiment consists of five stages. In stage 1, the
proteins to be analysed are isolated from cell lysate
or tissues by biochemical fractionation or affinity
selection. This often includes a final step of one-dimensional gel electrophoresis, and defines the
'sub-proteome' to be analysed. MS of whole proteins
is less sensitive than peptide MS and the mass of the
intact protein by itself is insufficient for
identification. Therefore, proteins are degraded
enzymatically to peptides in stage 2, usually by
trypsin, leading to peptides with C-terminally
protonated amino acids, providing an advantage in
subsequent peptide sequencing. In stage 3, the
peptides are separated by one or more steps of high-pressure liquid chromatography in very fine
capillaries and eluted into an electrospray ion source
where they are nebulized in small, highly charged
droplets. After evaporation, multiply protonated
peptides enter the mass spectrometer and, in stage 4,
a mass spectrum of the peptides eluting at this time
point is taken (MS1 spectrum, or 'normal mass
spectrum'). The computer generates a prioritized list
of these peptides for fragmentation and a series of
tandem mass spectrometric or 'MS/MS' experiments
ensues (stage 5). These consist of isolation of a
given peptide ion, fragmentation by energetic
collision with gas, and recording of the tandem or
MS/MS spectrum. The MS and MS/MS spectra are
typically acquired for about one second each and
stored for matching against protein sequence
databases. The outcome of the experiment is the
identity of the peptides and therefore the proteins
making up the purified protein population.
Aebersold R, Mann M. (2003) Nature. 422:198
FIG. 3. Tandem mass (MS/MS)
spectra resulting from analysis of a
single spot on a 2D gel. The first
quadrupole selected a single
mass-to-charge ratio ( m/z) of
687.2 (A) or 592.6 (B), while the
collision cell was filled with argon
gas, and a voltage which caused
the peptide to undergo
fragmentation by CID was applied.
The third quadrupole scanned the
mass range from 50 to 1,400 m/z.
The computer program Sequest (8)
was utilized to match MS/MS
spectra to amino acid sequence by
database searching. Both spectra
matched peptides from the same
protein, S57593 (yeast hypothetical
protein YMR226C). Five other
peptides from the same analysis
were matched to the same protein.
Gygi SP, et al. (1999) Mol Cell Biol. 19:1720
Search human protein (International Protein Index) database
20242509 residues in 65082 sequences
FASTS (4.00 July 2001 (ajm)) function [MD20 matrix (18:-29)] ktup: 1
join: 58, gap-pen: -12/-2, width: 16
Scan time: 13.183
The best scores are:                               initn init1 bits E(65082)
IPI00015759.1|SP:Q07244|NP:NP_112552 Het  ( 463)    523   218  523  133 7.2e-36  3  46
IPI00063875.1|NP:NP_112553;NP_002131|TR:  ( 464)    523   218  523  133 6.2e-36  3  46
IPI00059339.2|XP:XP_062032|ENSENSP000002  ( 482)    330   135  330   67 4.1e-16  3  46
IPI00076129.1|XP:XP_087643 similar to he  ( 161)    188   188  188   50 7.7e-11  1  19

>>IPI00015759.1|SP:Q07244|NP:NP_112552 Heterogeneous nuc (463 aa)
initn: 523 init1: 218 opt: 523 bits: 132.7 E(): 7.2e-36
Smith-Waterman score: 523; 100.000% identity in 46 aa overlap (1-46:149-396)

                                             10
gi|108                               LLIHQSLAGGIIGVK--------------
                                     :::::::::::::::
IPI000 ATSQLPLESDAVECLNYQHYKGSDFDCELRLLIHQSLAGGIIGVKGAKIKELRENTQTTI
      120       130       140       150       160       170

                                       20
gi|108 -----------------------------IILDLISESPIK------------------
                                    ::::::::::::
IPI000 KLFQECCPHSTDRVVLIGGKPDRVVECIKIILDLISESPIKGRAQPYDPNFYDETYDYGG
      180       190       200       210       220       230

                           30        40
gi|108 -------------------GSYGDLGGPIITTQVTIPK
                          :::::::::::::::::::
IPI000 MAYEPQGGSGYDYSYAGGRGSYGDLGGPIITTQVTIPKDLAGSIIGKGGQRIKQIRHESG
      360       370       380       390       400       410
Figure 3 Schematic representation of methods
for stable-isotope protein labelling for
quantitative proteomics. a, Proteins are labelled
metabolically by culturing cells in media that are
isotopically enriched (for example, containing
15N salts, or 13C-labelled amino acids) or
isotopically depleted. b, Proteins are labelled at
specific sites with isotopically encoded reagents.
The reagents can also contain affinity tags,
allowing for the selective isolation of the
labelled peptides after protein digestion. The use
of chemistries of different specificity enables
selective tagging of classes of proteins
containing specific functional groups. c, Proteins
are isotopically tagged by means of enzyme-catalysed incorporation of 18O from 18O water
during proteolysis. Each peptide generated by
the enzymatic reaction carried out in heavy water
is labelled at the carboxy terminal. In each case,
labelled proteins or peptides are combined,
separated and analysed by mass spectrometry
and/or tandem mass spectrometry for the purpose
of identifying the proteins contained in the
sample and determining their relative abundance.
The patterns of isotopic mass differences
generated by each method are indicated
schematically. The mass difference of peptide
pairs generated by metabolic labelling is
dependent on the amino acid composition of the
peptide and is therefore variable. The mass
difference generated by enzymatic 18O
incorporation is either 4 Da or 2 Da, making
quantitation difficult. The mass difference
generated by chemical tagging is one or multiple
times the mass difference encoded in the reagent
used.
Aebersold R, Mann M. (2003)
Nature. 422:198-207
Correlation between Protein and mRNA
Abundance in Yeast – Conclusions
• Correlation between mRNA and protein levels
insufficient to predict protein expression levels
(but good for very abundant proteins)
• 20-fold change in protein with little change in
mRNA
• no change in protein with 30-fold change in
mRNA
• codon bias does not predict protein or mRNA
levels (but abundant proteins have biased
codons)
Review questions –
1. List the 20 amino acids, with their 1-letter and 3-letter abbreviations.
2. What are some of the most common amino-acids?
Least common?
3. Which amino acids contain hydroxyl groups that can
be phosphorylated? (Why is this important?)
4. Which amino-acids contain aromatic rings?
5. Which amino-acids are more likely to be on the
outside of proteins? On the inside? Why?
6. Which amino-acid is likely to change its charge
state with pH changes within the physiological
range (pH 6.5 – 8.0)? Why?
7. Outline the steps required for MS/MS protein
identification
8. Which MS/MS protein sequencing techniques
require a comprehensive protein sequence
database?
Questions from previous exams –
1. Pick an acidic or basic amino-acid. (a) name the amino-acid;
(b) draw the charge-structure of the amino-acid for each of the
charge-states that it can assume (the actual covalent structure
need not be correct, focus on the ionizable groups); (c) suggest
an approximate pK for each of the ionizable groups. (d)
Indicate the most abundant charge-state at pH 7.0.
2. The carboxyl group of the amino acid alanine has a pKa value of
2.4. In order to have 99% of the alanine in its COO– form, what
must the numerical relation be between the pH of the solution
and the pKa of the carboxyl group of alanine?
3. Pick 5 amino acids including some that are more common and
some that are less common. Construct a "PAM" amino-acid
similarity matrix using those 5 amino acids, using +5 or +3 for
identities, +1 for "conserved" amino acids (amino acids with
similar properties), and -2 or -5 for non-conservative amino
acids.