Primary Structure
The primary sequence is:
>DYKDDDDKEVQLQESGPSLVKPSQTLSLTCSVTGDSVTSGYWSWIRQFPGNKLDY
MGYISYRGSTYYNPSLKSRISITRDTSKNQVYLQLKSVSSEDTATYYCSYFDSDDYA
MEYWGQGTSVTVSGGGGSGGGGSGGGGSQIVLTQSPAIMSASPGEKVTLTCSASSSV
SSSHLYWYQQKPGSSPKLWIYSTSNLASGVPARFSGSGSGTSYSLTISSMEAEDAASY
FCHQWSSFPFTFGSGTKLEIKRAP
A very similar sequence is found in the PDB [1]. It is identical to this one except that it lacks the DYKDDDDK tag (Sigma's FLAG tag) at the beginning and the final P.
Appendix 1 shows the search result from the PDB, and Figure 2 of that appendix shows the sequence as presented in the referenced paper, which identifies each section of the sequence that is also found here.
The protein is a modified single-chain Fv antibody fragment, scFv6H4, that binds to methamphetamine and one of its derivatives.
The two best homology matches in Blastp 2.2.24 (Appendix 2) are:
100%, score 384: Chain A, Crystal Structures Of A Therapeutic Single Chain Antibody In Complex With Methamphetamine. Contains Sigma's FLAG tag (DYKDDDDK) and an HHHHHH string at the end, after the final P.
81%, score 241: anti BoNT/A Hc scFv antibody [synthetic construct].
The molecular weight is given as 27.4 kDa [2], which is small enough to lead to rapid renal clearance after the drug is injected to counteract certain amphetamine derivatives.
It is often found as a mixture of monomers and dimers.
It is a genetically engineered recombinant form of the heavy and light chains of murine mAb6H4.
The stretch from EVQ to SVT is the first heavy chain framework region, which has been extended with the FLAG epitope to give
DYKDDDDKEVQLQESGPSLVKPSQTLSLTCSVT.
The CDR of the heavy chain is given as
{GDSVTSGYWS}{YISYRGSTY}{SDDYAMEY}
The CDR of the Light chain is:
{SASSSVSSSHLY}{STSNLASG}{HQWSSFPFT}
The variable heavy chain (VH) framework regions are given as:
{EVQLQESGPSLVKPSQTLSLTCSVT}{WIRQFPGNKLDYMG}{YNPSLKSRISITRDTSK
NQVYLQLKSVSSEDTATYYCSY}{WGQGTSVTV}
The variable light chain (VL) framework regions are given as:
{QIVLTQSPAIMSASPGEKVTLT}{WYQQKPGSSPKLWIY}{VPARFSGSGSGTSYSLTI
SSMEAEDAASYFC}{FGSGTKLEIKRA}
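As a quick consistency check on the segments listed above, the following minimal Python sketch looks up each listed CDR and the linker in the primary sequence and reports its 1-based position. The names SEGMENTS and locate are illustrative and do not come from any tool used in this report. Numbering is assumed to start at the first residue of the FLAG tag, which appears to be the convention behind the residue numbers quoted later in the report.

```python
# Primary sequence exactly as listed in the Primary Structure section.
SEQUENCE = (
    "DYKDDDDKEVQLQESGPSLVKPSQTLSLTCSVTGDSVTSGYWSWIRQFPGNKLDY"
    "MGYISYRGSTYYNPSLKSRISITRDTSKNQVYLQLKSVSSEDTATYYCSYFDSDDYA"
    "MEYWGQGTSVTVSGGGGSGGGGSGGGGSQIVLTQSPAIMSASPGEKVTLTCSASSSV"
    "SSSHLYWYQQKPGSSPKLWIYSTSNLASGVPARFSGSGSGTSYSLTISSMEAEDAASY"
    "FCHQWSSFPFTFGSGTKLEIKRAP"
)

# Segments as given in the text (labels are illustrative).
SEGMENTS = {
    "FLAG tag": "DYKDDDDK",
    "VH CDR1": "GDSVTSGYWS",
    "VH CDR2": "YISYRGSTY",
    "VH CDR3": "SDDYAMEY",
    "linker": "SGGGGSGGGGSGGGGS",
    "VL CDR1": "SASSSVSSSHLY",
    "VL CDR2": "STSNLASG",
    "VL CDR3": "HQWSSFPFT",
}


def locate(sequence, segment):
    """Return the 1-based (start, end) of the first occurrence, or None."""
    index = sequence.find(segment)
    if index < 0:
        return None
    return index + 1, index + len(segment)


positions = {name: locate(SEQUENCE, seg) for name, seg in SEGMENTS.items()}

for name, span in positions.items():
    print(name, span)
```

Under this numbering the heavy chain CDRs fall at 34-43, 58-66 and 108-115, the linker at 125-140, and the light chain CDRs at 164-175, 191-198 and 230-238.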
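The linker is given as SGGGGSGGGGSGGGGS. Surprisingly, there is no hexahistidine (6×His) tag; the sequence ends instead in a single P. In the one-letter amino acid code P denotes proline rather than a phosphate group, so the sequence by itself does not indicate phosphorylation. Phosphorylation is critical for the function of many enzymes and can affect folding, but it would have to be established from experimental data rather than from the primary sequence.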
The pI and Mw of the protein were calculated solely from the primary amino acid sequence using an online calculator [3]. The results are shown in Appendix 3, Figure 4. The pI was calculated as 5.39 and the Mw as 26940.51 Da. Since the theoretical Mw closely matches the measured Mw of 27.4 kDa, the theoretical pI of 5.39 is assumed to lie close to the real value. Since no measured pI was readily found in the literature, the theoretical pI from the calculator will be used.
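The Mw part of that calculation can be sketched in a few lines of Python: sum the average mass of each residue and add one water for the free termini. This is a minimal sketch and not the calculator's own code; the function name molecular_weight is illustrative, the mass table holds standard average residue masses, and the result for the full sequence may differ slightly from the calculator's figure depending on the mass table used. The pI is not computed here because it depends on the particular set of pK values chosen.

```python
# Average residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01524  # one water is added for the free N- and C-termini


def molecular_weight(sequence):
    """Return the average molecular weight in daltons of an unmodified chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Example: the FLAG tag on its own.
print(round(molecular_weight("DYKDDDDK"), 2))
```

Passing the full primary sequence as a string gives the theoretical Mw of the whole fragment in the same way.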
The calculator was also used to calculate the amino acid composition, shown in Appendix 3, Figure 5. The high proportion of serine is worth noting. This could affect the kinetics of the antibody fragment binding, or provide sites for potential glycosylation if the serines are found at the surface. A high serine content also helps keep the protein soluble. Also worth noting is that there are twice as many hydrophobic amino acids as polar ones, as seen in Figure 6 of Appendix 3. This has implications for the structure, as most of the hydrophobic amino acids will be folded inward, away from the polar environment of the blood. The main structure seems to be determined more by the hydrophobic amino acids than by the structural amino acids. Glycine is the most common structural amino acid and is often found in scFv linkers for flexibility, as is the case here.
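The composition itself is just a residue count, which the following minimal Python sketch reproduces from the primary sequence. It is not the calculator's code; the names counts and percent_composition are illustrative, and the grouping into hydrophobic, polar and structural classes is left out because it depends on the calculator's own definitions.

```python
from collections import Counter

# Primary sequence exactly as listed in the Primary Structure section.
SEQUENCE = (
    "DYKDDDDKEVQLQESGPSLVKPSQTLSLTCSVTGDSVTSGYWSWIRQFPGNKLDY"
    "MGYISYRGSTYYNPSLKSRISITRDTSKNQVYLQLKSVSSEDTATYYCSYFDSDDYA"
    "MEYWGQGTSVTVSGGGGSGGGGSGGGGSQIVLTQSPAIMSASPGEKVTLTCSASSSV"
    "SSSHLYWYQQKPGSSPKLWIYSTSNLASGVPARFSGSGSGTSYSLTISSMEAEDAASY"
    "FCHQWSSFPFTFGSGTKLEIKRAP"
)


def percent_composition(sequence):
    """Return a dict mapping each residue letter to its percentage of the chain."""
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}


counts = Counter(SEQUENCE)
composition = percent_composition(SEQUENCE)

# Print residues from most to least common.
for aa, n in counts.most_common():
    print(aa, n, round(composition[aa], 1))
```

The same count shows four cysteines in the chain, which is consistent with one disulfide bond in each variable domain.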
The molecular weight and pI are critical to the protein for a few reasons. Molecular weight affects renal clearance: a sufficiently large molecule escapes filtration by the kidneys and stays in the body much longer, which greatly improves its pharmacokinetics. Many drugs are PEGylated for this reason; attaching a large molecule to the drug slows clearance by the kidneys and liver. The rate at which the kidneys filter the blood is known as the glomerular filtration rate [4].
The isoelectric point affects the solubility of the protein. A protein carries no net charge at its pI and is least soluble there, so if the pI is close to the pH of the surrounding solution the protein will be difficult to dissolve. The pH of blood is very slightly above neutral, at about 7.4. If the drug is to dissolve in blood and remain mobile, its pI has to lie away from this value so that the protein is either positively or negatively charged. With a theoretical pI of 5.39, this protein is expected to carry a net negative charge in blood.
If this protein were found in your blood, it would probably mean that you are being treated to clear methamphetamine or one of its derivatives from your blood.
Secondary Structure
The sequence was submitted to an online secondary structure prediction tool, Jpred3 [5], run by the University of Dundee. It aligns the query against known sequences and predicts the secondary structure from the best matches. The result is shown in Appendix 4, Figure 1.
The prediction consists mostly of beta strands, which cover a large part of the sequence and are given as B's in the results. No alpha helices were predicted.
One disulfide bond was also predicted in the heavy chain and one in the light chain [6].
The sequence was also analysed with NetSurfP to predict which amino acids are buried and which are exposed. The results are in Appendix 5. The amino acids from 26 to 60 are mostly buried, and this stretch includes the first two heavy chain CDRs. From 7* to 119 they are again mostly buried, again including the final heavy chain CDR. The linker is exposed. In general, the CDRs of both chains are buried, while the linker and terminal regions are exposed.
The secondary structure is very similar in both heavy and light chains.
Tertiary Structure:
The tertiary structure of the protein was modelled with the SWISS-MODEL website. A workspace was set up, the structure was modelled from the sequence, and the result was saved as Model1.PDB.
This was then viewed with Swiss-PdbViewer 4.0.1.
First, the linker was located by highlighting the glycine stretch in the middle of the sequence in the control panel. This can be seen in Appendix 6.
The three CDRs of the heavy chain are located very close together on one side of the molecule. The three CDRs of the light chain are also located close together and, more interestingly, lie in the same section of the protein as the heavy chain CDRs.
The PDB model 3GKZ was then loaded alongside the modelled structure, and after a Magic Fit and alignment the RMS deviation was calculated to be 0.28. This is very low and indicates an almost identical structure.
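For reference, an RMS deviation between two already superimposed structures is the square root of the mean squared distance between paired atoms. The Python sketch below illustrates that definition only; it is an assumption that the viewer reports this quantity over the atoms it fitted, and the sketch does not perform the superposition itself. The function name rmsd and the toy coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (made up): the second set is the first shifted by 1 along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(model, reference))
```

A value near zero means the paired atoms almost coincide after fitting, which is why a low RMS deviation is read as a near-identical structure.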
Quaternary Structure
The antigen was found bound to the CDRs. The closest amino acids are Gly40, Tyr41, Trp42, Ser43, Tyr58, Ile59, Ser60, Tyr66, Ser108, Tyr111, Met113, Glu114, Tyr115, His173, Tyr175, His230, Gln231, Trp232, Ser234, Phe235, Pro236 (which gives a strong bend in the CDR around the antigen), Phe237 and Thr238. There is a very significant aromatic presence in this area of the molecule, with the aromatic side chains of numerous amino acids pointing directly towards the antigen.
The amino acids within 5 Å of the antigen are Glu114, Tyr175, His230 and Trp232. There are two hydrogen bonds to the antigen, one from His230 and one from Glu114.
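A distance cutoff like the 5 Å selection above can be expressed as a simple filter: a residue counts as close if any of its atoms lies within the cutoff of any antigen atom. The Python sketch below is a minimal illustration under that assumption; it is not how the viewer implements its selection, and the function name residues_within, the residue labels and all coordinates are made up for the example.

```python
import math


def residues_within(residue_atoms, ligand_atoms, cutoff=5.0):
    """Return names of residues with any atom within `cutoff` of any ligand atom.

    residue_atoms maps a residue name to a list of (x, y, z) atom coordinates;
    ligand_atoms is a list of (x, y, z) coordinates. Units are angstroms.
    """
    close = []
    for name, atoms in residue_atoms.items():
        if any(math.dist(a, b) <= cutoff for a in atoms for b in ligand_atoms):
            close.append(name)
    return close


# Hypothetical coordinates, for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
residues = {
    "ResA": [(3.0, 0.0, 0.0), (9.0, 0.0, 0.0)],   # one atom 1.5 from the ligand
    "ResB": [(0.0, 8.0, 0.0)],                     # 8.0 away, outside the cutoff
    "ResC": [(1.5, 0.0, 5.0)],                     # exactly 5.0 away, included
}
print(residues_within(residues, ligand))
```

With real data the coordinates would come from the ATOM and HETATM records of the model file.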
The protein shows a very strong electrostatic potential at the site of the CDRs that may be involved in attracting the antigen, and when zoomed in it can be seen that Glu114 contributes significantly to that potential right up to the antigen binding site. The antigens for this protein are methamphetamine and 3,4-methylenedioxymethamphetamine.
This project has identified the two amino acids that hydrogen-bond to the antigen, a nearby proline that helps bend the CDR around the antigen, the number of serines around the CDR region that may be involved in the kinetics of the binding, and a strong electrostatic potential right up to the binding site itself that may be involved in attracting the drug to the binding site.
Appendix 1.
Figure 1. A search result that comes directly from the pdb main webpage search window.
Figure 2 The functions of various sections of the amino acid sequence.
Figure 3-Blast results showing up 3GKZ_A
Appendix 2: Protein pI and Mw Calculation
Figure 4 above calculating the pI and Mw.
Figure 5 showing the percentage of each amino acid.
Figure 6. Showing the breakdown of amino acids into their physicochemical and structural properties
Appendix 4
Figure 7 a/b. The secondary structure prediction showing the beta-sheets.
Figure 8. Identification of Disulphide Bonds
Appendix 5.
# Col.1: Class assignment - B for buried or E for Exposed - Threshold: 25% exposure, but not based on RSA
# Col. 2: Amino acid
# Col. 3: Amino acid No.
# Col. 5: Relative Surface Accessibility - RSA
# Col. 6: Absolute Surface Accessibility
# Col. 7: Z-fit score for RSA prediction
# Col. 8: Probability for Alpha-Helix
# Col.9: Probability for Beta-strand
# Col. 10: Probability for Coil
Type  AA  Pos  RSA    ASA      Z-Score  pA     pB     pCoil
E     D   1    0.879  126.621  0.598    0.003  0.003  0.994
E     Y   2    0.356  76.077   -1.544   0.018  0.088  0.893
B     K   3    0.208  42.744   -1.117   0.019  0.141  0.84
E     D   4    0.451  65.061   0.351    0.019  0.141  0.84
B     D   5    0.222  31.99    -2.136   0.02   0.205  0.775
B     D   6    0.099  14.237   -0.918   0.02   0.205  0.775
B     D   7    0.279  40.276   0.095    0.022  0.359  0.619
B     K   8    0.046  9.38     -0.592   0.022  0.552  0.426
B     E   9    0.155  27.166   0.059    0.023  0.655  0.322
B     V   10   0.068  10.375   0.314    0.011  0.918  0.071
E     Q   11   0.26   46.365   1.635    0.011  0.918  0.071
B     L   12   0.058  10.675   0.692    0.011  0.918  0.071
E     Q   13   0.307  54.777   0.809    0.021  0.756  0.223
B     E   14   0.201  35.167   0.423    0.004  0.514  0.481
E     S   15   0.383  44.864   -0.056   0.004  0.138  0.858
E     G   16   0.513  40.405   -1.644   0.018  0.047  0.935
E     P   17   0.551  78.229   -2.373   0.018  0.047  0.935
E     S   18   0.509  59.655   -1.822   0.019  0.141  0.84
E     L   19   0.396  72.434   -0.891   0.022  0.359  0.619
E     V   20   0.308  47.309   0.483    0.021  0.451  0.528
E     K   21   0.474  97.584   1.195    0.02   0.205  0.775
E     P   22   0.477  67.757   -0.515   0.005  0.045  0.951
E     S   23   0.606  71.047   -1.65    0.005  0.045  0.951
E     Q   24   0.426  76.012   -0.283   0.004  0.138  0.858
E     T   25   0.434  60.224   0.146    0.004  0.616  0.381
B     L   26   0.115  21.057   0.973    0.001  0.9    0.099
E     S   27   0.402  47.091   1.141    0.001  0.9    0.099
B     L   28   0.069  12.579   0.694    0.001  0.959  0.04
B     T   29   0.162  22.483   0.711    0.001  0.959  0.04
B     C   30   0.039  5.433    0.03     0.001  0.959  0.04
E     S   31   0.317  37.141   0.645    0.001  0.9    0.099
B     V   32   0.086  13.172   -0.603   0.002  0.816  0.182
B     T   33   0.179  24.869   -0.773   0.004  0.514  0.481
E     G   34   0.288  22.658   -1.614   0.004  0.138  0.858
B     D   35   0.286  41.155   -1.441   0.005  0.262  0.733
B     S   36   0.192  22.502   -0.994   0.004  0.42   0.576
B     V   37   0.13   20.027   -1.237   0.021  0.451  0.528
B     T   38   0.245  34.037   -1.213   0.022  0.359  0.619
B     S   39   0.22   25.737   -1.274   0.019  0.141  0.84
B     G   40   0.117  9.232    -0.188   0.019  0.141  0.84
B     Y   41   0.087  18.571   0.141    0.021  0.451  0.528
B     W   42   0.029  6.854    0.236    0.021  0.756  0.223
B     S   43   0.05   5.86     -0.337   0.021  0.756  0.223
B     W   44   0.032  7.696    0.21     0.018  0.846  0.136
B     I   45   0.041  7.511    0.56     0.018  0.846  0.136
B     R   46   0.12   27.434   0.339    0.023  0.655  0.322
B     Q   47   0.102  18.164   0.388    0.022  0.359  0.619
E     F   48   0.308  61.856   0.984    0.005  0.045  0.951
E     P   49   0.372  52.815   0.383    0.005  0.015  0.979
E     G   50   0.67   52.76    -2.003   0.005  0.015  0.979
E     N   51   0.426  62.352   0.448    0.018  0.019  0.964
E     K   52   0.472  97.111   -0.534   0.018  0.088  0.893
B     L   53   0.174  31.786   0.779    0.005  0.262  0.733
B     D   54   0.177  25.477   -0.072   0.004  0.616  0.381
B     Y   55   0.05   10.749   0.357    0.001  0.9    0.099
B     M   56   0.028  5.523    0.678    0.001  0.959  0.04
B     G   57   0.019  1.456    -0.038   0.001  0.959  0.04
B     Y   58   0.088  18.891   0.083    0.001  0.959  0.04
B     I   59   0.034  6.272    -0.023   0.001  0.959  0.04
B     S   60   0.147  17.217   0.524    0.004  0.616  0.381
E     Y   61   0.369  78.855   -0.952   0.004  0.197  0.799
E     R   62   0.603  137.973  -2.17    0.005  0.015  0.979
E     G   63   0.406  31.913   -2.156   0.016  0.005  0.979
E     S   64   0.509  59.608   -1.153   0.018  0.047  0.935
E     T   65   0.352  48.767   -0.554   0.019  0.141  0.84
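Statements such as "mostly buried" can be checked directly against these rows. The Python sketch below is a minimal illustration, assuming each data row holds the nine values Type, AA, Pos, RSA, ASA, Z-Score, pA, pB and pCoil separated by whitespace; the function names parse_rows and buried_fraction are illustrative, and the sample contains only the first four rows of the table.

```python
# First four data rows of the Appendix 5 table, one record per line.
SAMPLE = """\
Type AA Pos RSA ASA Z-Score pA pB pCoil
E D 1 0.879 126.621 0.598 0.003 0.003 0.994
E Y 2 0.356 76.077 -1.544 0.018 0.088 0.893
B K 3 0.208 42.744 -1.117 0.019 0.141 0.84
E D 4 0.451 65.061 0.351 0.019 0.141 0.84
"""


def parse_rows(text):
    """Parse whitespace-separated rows, skipping headers and comment lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 9 or parts[0] not in ("B", "E"):
            continue
        rows.append({
            "type": parts[0],
            "aa": parts[1],
            "pos": int(parts[2]),
            "rsa": float(parts[3]),
            "asa": float(parts[4]),
        })
    return rows


def buried_fraction(rows, start, end):
    """Fraction of residues classed B (buried) with start <= position <= end."""
    selected = [r for r in rows if start <= r["pos"] <= end]
    if not selected:
        return None
    return sum(r["type"] == "B" for r in selected) / len(selected)


rows = parse_rows(SAMPLE)
print(buried_fraction(rows, 1, 4))
```

Running the same functions over the complete table would give the buried fraction for any residue range, such as a single CDR.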
Appendix 6
Heavy Chain CDRs
Light Chain CDR’s.
Match with 3GKZ.
The antigen bound in the middle of all CDR’s
The antigen in close-up, used to find the amino acids within 5 Å; the hydrogen bonds can be seen in green.
Showing the strong electrostatic potential near the antigen.
The electrostatic potential at the active site. Glu114 is giving a strong reading in red.
References
1. Celikel R, Peterson EC, Owens SM, Varughese KI. Crystal structures of a therapeutic single chain antibody in complex with two drugs of abuse-Methamphetamine and 3,4-methylenedioxymethamphetamine. Protein Sci. 2009 Nov;18(11):2336-2345. Published online 2009 Sep 16. doi: 10.1002/pro.244.
2. Peterson EC, Laurenzana EM, Atchley WT, Hendrickson HP, Owens SM. Development and preclinical testing of a high-affinity single-chain antibody against (+)-methamphetamine. J Pharmacol Exp Ther. 2008 Apr;325(1):124-33. Epub 2008 Jan 11.
3. Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, Bairoch A. Protein Identification and Analysis Tools on the ExPASy Server. In: Walker JM (ed). The Proteomics Protocols Handbook. Humana Press; 2005.
4. Stevens LA, Coresh J, Greene T, Levey AS. Assessing kidney function--measured and estimated glomerular filtration rate. N Engl J Med. 2006 Jun;354(23):2473-83. doi:10.1056/NEJMra054415. PMID 16760447.
5. Cole C, Barber JD, Barton GJ. Nucleic Acids Res. 2008;36(suppl. 2):W197-W201.
6. Ceroni A, Passerini A, Vullo A, Frasconi P. DISULFIND: a Disulfide Bonding State and Cysteine Connectivity Prediction Server. Nucleic Acids Res. 2006;34(Web Server issue):W177-W181.