Biochemical, biophysical and
interaction studies of the stress
responsive protein hSTRAP
A thesis submitted to the University of Manchester for
the degree of PhD: Molecular Cancer studies in the
Faculty of Life Sciences
2013
Karishma Satia
Declaration
No portion of the work referred to in the thesis has been submitted in support of an
application for another degree or qualification of this or any other university or other
institute of learning;
Copyright statement
The following four notes on copyright and the ownership of intellectual property
rights must be included as written below:
i. The author of this thesis (including any appendices and/or schedules to this
thesis) owns certain copyright or related rights in it (the “Copyright”) and
s/he has given The University of Manchester certain rights to use such
Copyright, including for administrative purposes.
ii. Copies of this thesis, either in full or in extracts and whether in hard or
electronic copy, may be made only in accordance with the Copyright,
Designs and Patents Act 1988 (as amended) and regulations issued under
it or, where appropriate, in accordance with licensing agreements which
the University has from time to time. This page must form part of any
such copies made.
iii. The ownership of certain Copyright, patents, designs, trademarks and
other intellectual property (the “Intellectual Property”) and any
reproductions of copyright works in the thesis, for example graphs and
tables (“Reproductions”), which may be described in this thesis, may not
be owned by the author and may be owned by third parties. Such
Intellectual Property and Reproductions cannot and must not be made
available for use without the prior written permission of the owner(s) of
the relevant Intellectual Property and/or Reproductions.
iv. Further information on the conditions under which disclosure, publication
and commercialisation of this thesis, the Copyright and any Intellectual
Property and/or Reproductions described in it may take place is available
in the University IP Policy (see
http://www.campus.manchester.ac.uk/medialibrary/policies/intellectualproperty.pdf), in any relevant Thesis restriction declarations deposited in
the University Library, The University Library’s regulations (see
http://www.manchester.ac.uk/library/aboutus/regulations) and in The
University’s policy on presentation of Theses
Acknowledgements
Firstly I would like to thank all my supervisors, Dr Alexander Golovanov, Dr Costas
Demonacos and Dr Stephen Prince for their continual guidance throughout this PhD.
I would also like to thank my adviser Professor Andrew Doig for all his support and
advice on PhD related issues and CD experiments. Also I would like to thank Dr
Marija Kristic Demonacos for her help when carrying out the molecular cloning
aspect of this project and letting me work on her bench space, also providing me
with an hSTRAP construct (pHA1-hSTRAP(1-440)) which was used in this project.
In addition I would like to thank Mrs Sandra Taylor who helped with the molecular
biology aspect of this PhD. I am also grateful to all the associated lab members,
especially Dr Richard Turncliffe who has read my thesis and advised corrections and
for all his advice given throughout this project. Also, Mrs Ilhem Berrou, Mrs Hajar
Karimi Alavi and Mr Travis Leung for all their help given during this project,
especially Travis for providing me with MCF7 cells. I am also very thankful to Miss
Preeti Kalra and Mr John Chipperfield for giving me all their help and continual
support throughout this PhD. I would also like to thank Dr James Birtley for all his
advice regarding the molecular cloning aspect of this project. I would also like to
thank Dr Martin Read, Dr Christopher Storey and all the members of the Michel
Smith Mass spectrometry facility especially Dr David Knight and Emma-jayne
Keevill for their help regarding the mass spectrometry aspect of this PhD. I would
also like to thank Dr Jean-Marc Schwartz for his help with all the bioinformatics
aspects of this project.
I would also like to thank all my friends and all the Satia family, but especially my
brother Dr Imran Satia and my mum, Mrs Nasima Satia, as well as my nieces and
nephews who always make my day with a smile after a long day in the lab. If it
wasn’t for my mum and brother I would not be in the great position I feel I am in,
and hence I would like to dedicate my thesis to my Mum and Brother. I would also
like to thank Miss Sharmin Naaz, Daphne Chen and Preeti Kalra, who have been
such amazing friends and supported me emotionally through this PhD.
Last but not least I would like to thank the University of Manchester and BBSRC
for providing me with the funding to carry out this PhD.
Abbreviations:
A, Alanine;
ADP, Adenosine Di-phosphate
AIPL1, Aryl-hydrocarbon-interacting-protein-1;
APC, Anaphase Promoting Complex;
AR, Androgen Receptor,
Arp2/3; Actin related protein 2/3;
ATM, Ataxia Telangiectasia Mutated;
ATP, Adenosine Tri-phosphate
ATR, Ataxia-telangiectasia and Rad3-related;
Bax, Bcl2 associated X protein;
Bdp1, B Double Prime 1;
BRCA1, Breast Cancer type 1 susceptibility protein
BRCA2, Breast Cancer type 2 susceptibility protein
Brf1, B Related Factor 1;
Bub1; budding uninhibited by benzimidazole 1
BubR1; Bub1 Related
CBP, CREB-binding protein
Cdc, Cell division cycle;
CDH1, Cadherin;
CdK’s, Cyclin Dependent Kinases;
CFTR, Cystic Fibrosis Transmembrane conductance Regulator;
CHIP, Carboxy terminus of Hsc70 interacting protein;
Chk, Checkpoint kinases;
CTPR3, Consensus TPR number of repeats;
DMEM, Dulbecco's Modified Eagle Medium
DMSO, Dimethyl sulphoxide;
DNA; Deoxyribonucleic acid
DTT, Dithiothreitol;
E1A, Adenovirus early region 1A;
E6-AP, E6-associated protein
EDTA, Ethylenediaminetetraacetic acid
ESI, Electrospray ionization;
F, Phenylalanine;
FA, Fanconi Anaemia;
FANC, Fanconi Anaemia Group,
FBS, Fetal bovine Serum;
FID, Free induction Decay;
FKBP52, Rabbit FK506 Binding Protein;
FKBPs, FK506- binding proteins;
FPLC, Fast protein liquid chromatography
G, Glycine;
GABAA, Gamma-aminobutyric acid, type A
GFP, Green Fluorescent Protein;
GR, Glucocorticoid receptor;
GRIF-1, (GABAA) receptor interacting factor-1
GST, Glutathione S Transferase;
GTFs, General Transcription factors,
GTP, Guanosine-5'-triphosphate
HAT, Histone acetyltransferase;
HBP21, HSP70 binding protein 21;
HIF-1, Hypoxia Inducible Factor 1
HIP, Hsc70- interacting protein;
HIV, Human immunodeficiency virus
HOP, Hsp70-Hsp90 Organizing Protein;
HREs, Hypoxia Response Elements;
HSF1, Heat shock transcription factor 1;
Hsp, Heat Shock Proteins;
HTS, High Throughput Screening
JMY, Junction Mediating and regulatory protein;
L, Leucine;
LB, Luria Broth;
MAD, Multi-wavelength Anomalous Diffraction;
MALDI, Matrix-assisted laser desorption/ionization
MAPKinase, Mitogen-activated protein kinases
MAS, Mitochondrial assembly;
Mdm2, Mouse double minute 2;
MIR, Multiple Isomorphous Replacement;
MOM, Mitochondrial Outer Membrane
Mps1, Mono-polar spindle 1;
MR, Molecular Replacement;
MS, Mass spectrometry;
NaCl, Sodium Chloride
NADPH, Nicotinamide adenine dinucleotide phosphate (reduced form);
NMR, Nuclear Magnetic Resonance
NPFs, Nucleation Promoting Factors
NTP, Nucleotide Tri-phosphate
OGT, O-Glc-NAc-transferase;
OIP’S, OGT-interacting proteins;
P, Proline;
p300, E1A binding protein p300
p58ipk, p58 inhibitor of protein kinase;
PAGE, Poly-acrylamide-gel-electrophoresis;
PAS, Peroxisome assembly
PBD, Peroxisome Biogenesis Disorder;
PBS, Phosphate buffered Saline
PDZ, post synaptic density protein (P), Drosophila disc large tumor suppressor (D),
and zonula occludens-1 protein (Z)
PEG, Polyethylene glycol
PEX5, Peroxin 5;
PHD domain, Plant Homeo Domain;
PIC, Pre-initiation complex;
PKR, Protein kinase R;
PMSF, Phenylmethanesulfonyl Fluoride;
PP5, Protein Phosphatase 5;
PPIase, Peptidylprolyl Isomerase;
PPII, Polyproline Type II;
PRMT, Protein methyltransferase;
Psc, Pseudomonas secretion;
PTS1, Peroxisomal targeting signal;
PXR1, peroxisomal targeting signal import receptor;
RAC, Ras-related C botulinum toxin substrate;
RNA, Ribonucleic acid;
SAD, Single wavelength anomalous diffraction;
SDS, sodium dodecyl sulfate
sGC, soluble Guanylyl Cyclase;
SGT , small glutamine-rich protein;
SH, Src Homology;
SIR, Single Isomorphous Replacement;
Srb7, Suppressor of RNA polymerase B 7;
STRAP, Stress-responsive activator of p300;
SWI/SNF, SWItch/Sucrose NonFermentable;
TAE, Tris acetate EDTA;
TAFs, TBP associated factors;
TAP, Tandem affinity purification;
TBP, TATA-binding protein
TEV, Tobacco etch virus;
Tfc4, Transcription Factor Class C 4
TFIIIB, RNA polymerase III transcription factor
TFIIIC, RNA polymerase III transcription initiation factor complex
TOF, Time of flight;
Tom20, Translocase of the outer membrane;
TPR, tetratricopeptide repeat;
TSS, Transcription Start site
TTC4, tetratricopeptide repeat domain 4
UBP, Vpu binding protein;
UPP, Ubiquitin Proteasome Pathway;
W, Tryptophan;
WASP, Wiskott Aldrich Syndrome protein
WD40, Tryptophan-aspartic acid dipeptide
WH2, Wiskott Aldrich Syndrome protein (WASp)-homology2
WISp39, WAF-1/CIP1 stabilizing protein 39
XAB2, XPA (Xeroderma pigmentosum, complementation group A) binding protein 2
XAP2, hepatitis B virus X-associated protein 2;
XRCC3, X-ray repair cross-complementing protein 3;
Y, Tyrosine;
ZZ domain, Zinc finger domain;
α-SGT, a small glutamine–rich tetratricopeptide repeat containing protein alpha;
Table of Contents
DECLARATION ........................................................................................................... 2
COPYRIGHT STATEMENT ...................................................................................... 2
ACKNOWLEDGEMENTS .......................................................................................... 3
ABBREVIATIONS: ..................................................................................................... 4
TABLE OF CONTENTS ............................................................................................. 8
LIST OF FIGURES .................................................................................................... 12
LIST OF TABLES ..................................................................................................... 15
ABSTRACT ................................................................................................................ 16
1. CHAPTER ONE. INTRODUCTION ................................................................. 17
1.1 Proteins ................................................................................................................................ 17
1.1.1 Importance of protein research ......................................................................................................... 17
1.1.2 The importance of protein structure determination........................................................................... 17
1.1.3 Importance of characterizing protein-protein interactions ................................................................. 19
1.2 Various structural biology techniques .................................................................................. 19
1.2.1 X-Ray crystallography ........................................................................................................................ 19
1.2.2 NMR ................................................................................................................................................. 23
1.2.3 Circular dichroism (CD) ...................................................................................................................... 26
1.2.4 Complementarities between different Structural biology techniques ................................................. 27
1.3 Mass spectrometry ............................................................................................................... 29
1.4 Protein-protein interaction motifs........................................................................................ 34
1.4.1 Structure ........................................................................................................................................... 34
1.4.2 Structural stability and ligand specificity of protein interaction domains ............................................ 36
1.5 Functions of various TPR proteins ........................................................................................ 45
1.5.1 TPR proteins involved in transcription ............................................................................................... 45
1.5.2 TPR proteins involved in the Stress Response Pathway ...................................................................... 46
1.5.3 TPR proteins involved in Mitochondrial and Peroxisomal import........................................................ 47
1.5.4 TPR proteins involved in the progression of the cell cycle .................................................................. 48
1.5.5 TPR proteins involved in DNA Repair ................................................................................................. 50
1.5.6 TPR proteins involved in Proteolysis .................................................................................................. 51
1.5.7 TPR proteins implicated in various other aspects of cell physiology .................................................... 51
1.6 Roles of scaffolds in signaling pathways ............................................................................... 54
1.7 Hallmarks of Cancer .............................................................................................................. 56
1.8 Role and regulation of p53 in cancer .................................................................................... 57
1.9 Regulation of the actin cytoskeleton .................................................................................... 57
1.10 Cancer cell migration .......................................................................................................... 58
1.11 RNA polymerase transcription machinery .......................................................................... 59
1.12 STRAP (Stress responsive activator of p300) ....................................................................... 64
1.12.1 STRAP discovery .............................................................................................................................. 64
1.12.2 STRAP, p300 and JMY ...................................................................................................................... 65
1.12.3 STRAP function ................................................................................................................................ 67
1.13 Aims of the Project ............................................................................................................. 72
2. CHAPTER TWO. MATERIALS AND METHODS ........................................ 73
2.1 Materials .............................................................................................................................. 73
2.1.1 Chemicals and Reagents .................................................................................................................... 73
2.1.2 Enzymes and Kits............................................................................................................................... 74
2.1.3 Other consumables ........................................................................................................................... 74
2.1.4 General buffers and solutions ............................................................................................................ 74
2.1.5 Chemically competent bacterial cells ................................................................................................. 75
2.2 Mammalian Cell Culture ....................................................................................................... 76
2.2.1 Cell lines............................................................................................................................................ 76
2.2.2 Cell passage and maintenance ........................................................................................................... 76
2.2.3 Biochemical pull down assays ............................................................................................................ 77
2.3 Cloning of hSTRAP constructs ............................................................................................... 77
2.3.1 Cloning of full length hSTRAP into pET14b (His-hSTRAP(1-440)) ......................................................... 77
2.3.2 Cloning of truncated versions of hSTRAP (His-hSTRAP) into pET-14b .................................................. 77
2.3.2.2 Primer design ........................................................................................................... 78
2.3.2.3 Polymerase Chain reaction ....................................................................................... 78
2.3.2.4 PCR Purification........................................................................................................ 79
2.3.2.5 Restriction digests .................................................................................................... 79
2.3.2.6 Agarose gel electrophoresis...................................................................................... 80
2.3.2.7 Ligation .................................................................................................................... 80
2.3.3 Cloning of Full length hSTRAP into pGEX-6P1 (GST- hSTRAP(1-440)) ................................................... 80
2.3.3.2 Alkaline Phosphatase treatment ............................................................................... 81
2.3.3.3 Gel Extraction and Purification ................................................................................. 81
2.4 Transformation of plasmid DNA into competent E. coli cells ................................................. 81
2.5 DNA Mini preps .................................................................................................................... 83
2.6 Sequencing ........................................................................................................................... 83
2.7 Expression trials .................................................................................................................... 83
2.8 SDS-PAGE Gels ...................................................................................................................... 85
2.9 Large scale expression and protein purification of all hSTRAP variants ................................ 86
2.9.1 Full length His-hSTRAP and truncated constructs of hSTRAP .............................................................. 86
2.9.2 GST-hSTRAP(1-440) ........................................................................................................................... 87
2.10 Determining the concentration of protein .......................................................................... 88
2.10.1 Bradford reagent ............................................................................................................................. 88
2.10.2 Protein sample absorbance at 280nm.............................................................................................. 88
2.11 Concentration of protein to a smaller volume .................................................................... 89
2.11.1 Amicon Concentration..................................................................................................................... 89
2.11.1.1 Concentration of protein into a buffer .................................................................... 89
2.11.1.2 Concentration of protein along with buffer exchange in the amicon ....................... 89
2.11.2 Viva spin500 concentrators ............................................................................................................. 90
2.12 Gel Filtration ....................................................................................................................... 91
2.13 X-RAY Crystallography experiments ................................................................................... 91
2.14 GST tag Cleavage ................................................................................................................ 92
2.15 CD experiments .................................................................................................................. 92
2.16 NMR experiments ............................................................................................................... 92
2.16.1 Expression of 15N labelled hSTRAP protein ....................................................................................... 92
2.16.2 Acquiring of NMR spectra ................................................................................................................ 93
2.17 Mass spectrometry experiments ........................................................................................ 93
2.18 Building the hSTRAP interactome network ......................................................................... 94
3. CHAPTER THREE. RESULTS .......................................................................... 95
3.1 Expression and purification of full length and truncated forms of hSTRAP protein .............. 95
3.1.1 Cloning, expression and purification of full length hSTRAP into pET14b (His-hSTRAP(1-440)).............. 95
3.1.2 Cloning, Expression and purification of GST-hSTRAP(1-440) ............................................................. 102
3.1.3 Cloning, expression and purification of truncated variants of hSTRAP .............................................. 108
3.1.3.1. Design and sequence analysis of truncated constructs of hSTRAP ........................................ 108
3.1.3.2. Cloning of Truncated versions of hSTRAP ............................................................................. 111
3.1.3.3. Expression and Purification of truncated versions of hSTRAP ............................................... 112
3.1.3.3.1 Expression and purification of hSTRAP(1-219)............................................................... 112
3.1.3.3.2 Expression and purification of hSTRAP(220-440) ........................................................... 116
3.1.3.3.3 Expression and purification of hSTRAP(1-150)............................................................... 118
3.1.3.3.4 Expression and purification of hSTRAP(151-284) ........................................................... 121
3.1.3.3.5 Expression and purification of hSTRAP(285-440) ........................................................... 123
3.2 Identification of hSTRAP interacting partners in MCF7 breast cancer cells ......................... 127
3.2.1 Purification of hSTRAP protein variants ........................................................................................... 128
3.2.2 Pull downs using MCF7 cellular extract ............................................................................................ 130
3.2.3 hSTRAP interacting partners ............................................................................................................ 132
3.3 Biophysical and structural studies carried out using full length and truncated versions
of hSTRAP ......................................................................................................................... 138
3.3.1 Biophysical and structural studies carried out on His-hSTRAP(1-440) ............................................... 138
3.3.1.1 Circular Dichroism on His-hSTRAP(1-440).............................................................................. 138
3.3.1.2 X-ray Crystallography on His-hSTRAP(1-440) ......................................................................... 140
3.3.2 Biophysical and structural studies carried out on GST-hSTRAP(1-440) .............................................. 155
3.3.2.1 Circular Dichroism on GST-hSTRAP(1-440) ............................................................................ 155
3.3.2.2 GST tag cleavage .................................................................................................................. 156
3.3.3 Biophysical and structural studies carried out on hSTRAP(1-219) ..................................................... 157
3.3.3.1 Circular Dichroism of hSTRAP(1-219).............................................................................157
3.3.3.2 NMR studies of hSTRAP(1-219) ............................................................................................. 159
3.3.4 Biophysical and structural studies carried out on hSTRAP(1-150) ..................................................... 168
3.3.4.1 Circular Dichroism of hSTRAP(1-150) .................................................................................... 168
3.3.4.2 NMR studies of hSTRAP(1-150) ............................................................................................. 169
3.3.5 Biophysical and Structural studies carried out on hSTRAP(151-284) ................................................. 179
3.3.5.1 Circular Dichroism on hSTRAP(151-284)................................................................................ 179
3.3.5.2 NMR studies of hSTRAP(151-284) ......................................................................................... 181
3.3.6 Biophysical and structural studies carried out on hSTRAP(285-440) ................................................. 183
3.3.6.1 Circular Dichroism on hSTRAP(285-440)................................................................................ 183
4. CHAPTER FOUR. GENERAL DISCUSSION................................................185
4.1 Comparisons of hSTRAP and mSTRAP structural data ........................................................ 185
4.2 Structural characterization of hSTRAP protein fragments .................................................. 188
4.2.1 CD characterization of all hSTRAP protein variants........................................................................... 188
4.2.2 Crystallographic studies on hSTRAP(1-440) ...................................................................................... 189
4.2.3 NMR studies on hSTRAP(1-219) ....................................................................................................... 190
4.2.4 NMR studies on hSTRAP(1-150) ....................................................................................................... 191
4.2.5 NMR studies on hSTRAP(151-284) ................................................................................................... 191
4.3 Difficulties with expression of hSTRAP protein variants using the E. coli expression
system, and possible ways to overcome these in future. .................................................. 191
4.4 The hSTRAP interactome .................................................................................................... 193
5. FUTURE DIRECTION ......................................................................................200
6. REFERENCES .....................................................................................................203
7. APPENDIX ..........................................................................................................217
List of figures
Figure 1.1. NMR can be used in all aspects of the drug discovery process. ......................... 18
Figure 1.2. Phase diagram of concentration of protein to precipitant. ............................... 22
Figure 1.3. Various vapour diffusion methods. .................................................................. 23
Figure 1.4. NMR Theory..................................................................................................... 24
Figure 1.5. 2D HSQC NMR spectrum. ................................................................................. 25
Figure 1.6. Far UV spectra of different secondary structure elements. ............................... 27
Figure 1.7. The three main components of the mass spectrometer.....................................30
Figure 1.8. Pictorial representation of the Quadrupole Ion trap ......................................... 32
Figure 1.9. Pictorial representation of the Tandem mass spectrometry process (MS/MS). . 33
Figure 1.10. A ribbon presentation of the WD40 domain. .................................................. 34
Figure 1.11. A ribbon presentation of the TPR motifs of PP5. ............................................. 36
Figure 1.12. Homology modeling of various TPR proteins. ................................................. 37
Figure 1.13. Domain organization of the TPR protein, HOP. ............................................... 38
Figure 1.14. Domain organization of the TPR proteins, PEX5 and p67phox. .......................... 39
Figure 1.15. TPR protein-ligand structures. ........................................................................ 43
Figure 1.16. PDZ domain structure. ................................................................................... 44
Figure 1.17. Schematic diagram showing domain organisation of the TPR Protein Tfc4 ..... 45
Figure 1.18. Schematic diagram showing domain organisation of the TPR Protein HIP. ...... 47
Figure 1.19. Schematic Diagram showing TPR Motif organisation within the TPR Protein MAS70 ...... 48
Figure 1.20. Schematic Diagram showing TPR Motif organisation within these TPR proteins ...... 49
Figure 1.21. Regulation of p21 by p53, WISp39 and Hsp90. ............................................... 49
Figure 1.22. Schematic Diagram showing TPR Motif organisation within PP5..................... 50
Figure 1.23. Schematic diagram showing domain organisation of the TPR Protein CHIP ..... 51
Figure 1.24. Schematic diagram showing domain organisation of the TPR Protein α-SGT. .. 53
Figure 1.25. Domain organisation of Bub1, BubR1 and Knl1 .............................................. 53
Figure 1.26. Scaffold mechanisms...................................................................................... 54
Figure 1.27. The six major alterations found in most cancers. ............................................ 56
Figure 1.28. Process of Actin polymerisation ..................................................................... 58
Figure 1.29. Process of Transcription. ................................................................................ 60
Figure 1.30. Formation of the Pre-initiation complex (PIC)................................................. 61
Figure 1.31. STRAP sequence and its conservation ............................................................ 65
Figure 1.32. STRAP interaction with JMY and p300 through distinct TPR motifs ................. 67
Figure 1.33. STRAP and the p53 Response ......................................................................... 68
Figure 1.34. STRAP and the DNA damage Response pathway. ........................................... 69
Figure 1.35. STRAP and the stress response pathway ........................................................ 70
Figure 1.36. Regulation of GR by STRAP ............................................................................. 71
Figure 2.1. Bacterial growth curve ..................................................................................... 82
Figure 3.1. The pET-14b Vector ......................................................................................... 96
Figure 3.2. Expression of His-hSTRAP(1-440). .................................................................... 99
Figure 3.3. Purification of His-hSTRAP(1-440) protein ...................................................... 101
Figure 3.4. His-hSTRAP(1-440) mass spectrometry .......................................................... 102
Figure 3.5. pGEX-6P1, GST expression vector. .................................................................. 103
Figure 3.6. Cloning of the hSTRAP wild type in pGEX-6P1................................................. 104
Figure 3.7. Expression of GST- hSTRAP (1-440). ............................................................... 105
Figure 3.8. Purification of GST-hSTRAP(1-440) protein. .................................................... 107
Figure 3.9. GST-hSTRAP(1-440) mass spectrometry ......................................................... 108
Figure 3.10. Secondary structure predictions of full length hSTRAP ................................. 110
Figure 3.11. PCR products of the hSTRAP fragment cloned in pET14-b vector .................. 112
Figure 3.12. Expression of hSTRAP (1-219). ...................................................................... 114
Figure 3.13. Purification of hSTRAP(1-219) protein. ......................................................... 115
Figure 3.14. hSTRAP(1-219) mass spectrometry .............................................................. 116
Figure 3.15. Expression of hSTRAP (220-440) in BL21(DE3)pLysS ...................................... 117
Figure 3.16. Purification of hSTRAP(220-440). ................................................................. 118
12
Figure 3.17. Expression of hSTRAP (1-150)..........................................................................120
Figure 3.18. Purification of hSTRAP(1-150) protein .......................................................... 120
Figure 3.19. hSTRAP(1-150) mass spectrometry .............................................................. 121
Figure 3.20. Expression of hSTRAP (151-284). .................................................................. 122
Figure 3.21. Purification of hSTRAP(151-284). ................................................................. 123
Figure 3.22. Expression of hSTRAP (285-440). .................................................................. 125
Figure 3.23. Purification of hSTRAP(285-440). ................................................................. 126
Figure 3.24. hSTRAP variants used for biochemical binding assays................................... 128
Figure 3.25. In-gel digestion mass spectrometry analysis. ................................................ 129
Figure 3.26. hSTRAP biochemical pull down assays with MCF7 cellular extracts................131
Figure 3.27. SDS-PAGE bands isolated and submitted to mass spectrometry analysis. ..... 131
Figure 3.28. hSTRAP implication in cancer related pathways. .......................................... 137
Figure 3.29. CD experiments carried out on His-hSTRAP(1-440). ...................................... 140
Figure 3.30. Concentration of His-hSTRAP(1-440) protein in different buffers.................. 142
Figure 3.31. Long term stability of concentrated His-hSTRAP(1-440) protein. .................. 142
Figure 3.32. Concentration and stability of His-hSTRAP(1-440) protein in Buffer 1 + H-MIX.
....................................................................................................................................... 144
Figure 3.33. Gel filtration graph of concentrated His-hSTRAP(1-440) protein sample ....... 145
Figure 3.34. Crystallisation trials with 12.9 mg/ml His-hSTRAP(1-440) sample in JSCG+
screen ............................................................................................................................. 146
Figure 3.35 Crystallisation trials with 18 mg/ml His-hSTRAP(1-440) sample in the JCSG+
screen. ............................................................................................................................ 148
Figure 3.36. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440). ............................... 150
Figure 3.37. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample to
observe effects of pH on crystal grade............................................................................. 153
Figure 3.38. His-hSTRAP(1-440) Diffraction Pattern. ........................................................ 154
Figure 3.39. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample to
observe effects of further lowering ethanol concentration on crystal grade .................... 154
Figure 3.40. CD experiments carried out on GST-hSTRAP(1-440)...................................... 156
Figure 3.41. On-column GST TAG cleavage of GST- hSTRAP(1-440). ................................. 157
Figure 3.42. Dialysed hSTRAP(1-219) protein in CD buffer. .............................................. 158
Figure 3.43. CD experiments carried out on hSTRAP(1-219). ............................................ 159
Figure 3.44. hSTRAP(1-219) protein stability in gel filtration buffer. ................................. 160
Figure 3.45. Concentration and stability of hSTRAP(1-219) in NMR buffer. ...................... 162
Figure 3.46. 1D NMR spectrum of 0.2 mM hSTRAP(1-219) ............................................... 163
Figure 3.47. Expression of 15N labelled hSTRAP(1-219)..................................................... 164
Figure 3.48. Purification of 15N labelled hSTRAP(1-219 .................................................... 166
Figure 3.49. Concentration of 15N-hSTRAP(1-219) in H-MIX, pH8. .................................... 167
Figure 3.50. Dialysed hSTRAP(1-150) protein in CD buffer. .............................................. 168
Figure 3.51. CD experiments carried out on hSTRAP(1-150). ............................................ 169
Figure 3.52. Expression of 15N labelled hSTRAP(1-150) protein. ....................................... 170
Figure 3.53. Purification of 15N labelled hSTRAP(1-150). .................................................. 171
Figure 3.54. 15NhSTRAP(1-150) buffer optimisation trials................................................. 173
Figure 3.55. 2D 1H-15N correlation NMR spectra on 15N-hSTRAP(1-150). .......................... 175
Figure 3.56. 2D1H-15N correlation HSQC NMR spectra of 15NhSTRAP(1-150) in identified
optimised conditions....................................................................................................... 176
Figure 3.57. Expression of unlabelled hSTRAP(1-150) in Shuffle T7 .................................. 177
Figure 3.58. Purification of unlabelled hSTRAP(1-150) in Shuffle T7 express. ................... 178
Figure 3.59. 1D 1H Spectrum of hSTRAP (1-150) in Shuffle T7 express. ............................. 179
Figure 3.60. Dialysed hSTRAP(151-284) protein in CD buffer............................................ 180
Figure 3.61. CD experiments carried out on hSTRAP(151-284). ........................................ 181
Figure 3.62. 1D 1H NMR spectrum of hSTRAP(151-284) ................................................... 182
Figure 3.63. Dialysed hSTRAP(285-440) protein in CD buffer to be used for subsequent CD
experiments .................................................................................................................... 183
Figure 3.64. CD experiments carried out on hSTRAP(285-440). ........................................ 184
Figure 4.1. Sequence alignments of Mouse and Human STRAP ........................................ 186
Figure 4.2. Homology modelling of hSTRAP and mSTRAP structure. ................................ 187
13
Figure 4.3 Ribbon representation of the different regions of hSTRAP cloned and expressed
separately in the current study, mapped on the model structure of full-length STRAP
protein. ........................................................................................................................... 188
Figure 4.4. Proposed hSTRAP mechanism of function. ..................................................... 197
14
List of Tables
Table 2.1. Buffer compositions.
Table 2.2. Bacterial competent cells.
Table 2.3. PCR primers
Table 2.4. PCR reaction protocol.
Table 2.5. Resolving gel components.
Table 3.1. Expression trials and purification of His-hSTRAP(1-440).
Table 3.2. Estimation of His-hSTRAP(1-440) protein concentration
Table 3.3. hSTRAP truncated forms cloned in pET14-b.
Table 3.4. Estimation of hSTRAP(1-219) protein concentration.
Table 3.5. Estimation of hSTRAP(1-150) protein concentration.
Table 3.6. hSTRAP interacting partners.
Table 3.7. Function of hSTRAP interacting proteins.
Table 3.8. Estimation of soluble His-hSTRAP(1-440) protein concentration in different buffers
Table 3.9. Estimated concentration of His-hSTRAP(1-440) in Buffer 1 + H-MIX
Table 3.10. Estimated concentrations of hSTRAP(1-219) in gel filtration buffer
Table 3.11. Estimated concentration of hSTRAP(1-219) protein samples in NMR buffer
Table 3.12. Estimated concentration of 15N-hSTRAP(1-219).
Table 3.13. Estimated concentration of 15N-hSTRAP(1-150).
Table 3.14. Estimated concentration of hSTRAP(1-150) in elution fractions when expressed in Shuffle T7 express cells.
Table 4.1. Experimental TPR positions of STRAP TPR motifs
Table 4.2. CD data
Table 4.3. The advantages and disadvantages associated with the bacterial and eukaryotic expression systems.
Abstract
STRAP (Stress responsive activator of p300) is a 440 amino acid protein, predicted
to have 6 TPR (tetratricopeptide repeat) motifs, which are known to mediate
protein-protein interactions. STRAP has been shown to form a complex with the proteins p300
and JMY (junction-mediating and regulatory protein), and is implicated in the DNA damage
and heat shock response pathways, regulation of the glucocorticoid receptor and the
function of p53.
The aims of this project were to clone, express and purify full length and truncated
human STRAP (hSTRAP) variants in high quantities. Full length and shorter
hSTRAP fragments, which contain different combinations of the predicted TPR
motifs and hence cover different regions, would then be characterised by
various structural and biophysical experiments. Another important aim was to
identify interacting partners of hSTRAP in breast cancer and to map the position of
their interaction sites to different parts of the protein.
To this end, GST- and His-tagged full length hSTRAP, as well as His-tagged
truncated hSTRAP protein variants, have been successfully cloned, expressed and
purified. Independent and reproducible biochemical pull-down assays have been
carried out in MCF7 breast cancer cells, followed by mass spectrometry-based
proteomics analysis which identified 25 hSTRAP-interacting partners from various
signaling pathways such as regulation of the actin cytoskeleton and translation. In
addition, crystallization trials were carried out using pure His-hSTRAP(1-440)
protein, which were unfortunately unsuccessful. Various hSTRAP protein variants
have been characterized by CD, showing that hSTRAP(1-150), His-hSTRAP(1-440),
hSTRAP(1-219), hSTRAP(151-284) and hSTRAP(285-440) comprise α and β
structures, but the hSTRAP protein variants show no clear cooperative unfolding
transitions, suggestive of molten globule states. NMR on hSTRAP(1-219),
hSTRAP(1-150) and hSTRAP(151-284) has shown that these proteins are not folded at
a tertiary structure level.
We conclude that a protocol has been established to clone, express and purify
various hSTRAP variants and the thermal and secondary structure characteristics of
each have been determined, although the 3D structure could not be solved.
Pull-down assays followed by proteomic analysis have shown that hSTRAP is implicated
in many aspects of cellular regulation.
1. Chapter One. Introduction
1.1 Proteins
1.1.1 Importance of protein research
The term “protein” was introduced in 1839 by Gerhardus Johannes Mulder and since
then extensive research on various proteins has been undertaken [1]. Proteins are
composed of combinations of 20 amino acids, and are implicated in every aspect of
cellular function [1]. Various diseases arise from protein defects, such as
cystic fibrosis, where the most common mutation leads to the deletion of a single
amino acid residue at position 508 of the Cystic Fibrosis Transmembrane Conductance
Regulator (CFTR) protein, which severely affects this protein and its function
[1]. Alzheimer’s disease is another example, which is most commonly characterized
by high amounts of amyloid deposits and hyper-phosphorylated Tau protein [2].
Further research has implicated specific proteins in various other diseases such as
sickle cell anaemia, HIV and cancer (introduced in more detail below) [1].
Protein homeostasis is an important aspect of cellular function which needs to be
tightly regulated for the stability of the proteome [2, 3]. Various cellular stress
factors cause protein damage such as misfolding or non-physiological intracellular
localization, which can be repaired [2, 3]. If repair is unsuccessful, defective proteins
can be eliminated through homeostatic mechanisms mediated by chaperones and
quality control processes [2, 3]. Dysregulation of these cellular mechanisms can lead
to the accumulation of damaged and non-functional proteins, resulting in the
emergence of various diseases [2, 3]. All of the factors mentioned above
emphasize the importance of carrying out protein research [1].
1.1.2 The importance of protein structure determination
Protein structure, unlike that of DNA which is largely independent of its
sequence composition, depends on the amino acid sequence and the sub-cellular
environment [4]. The structure of a protein is important in carrying out its cellular
functions and can be determined by X-ray crystallography, NMR and cryo-electron
microscopy [4]. X-ray crystallography is a technique in which an X-ray beam is
directed at a protein crystal, which produces a unique diffraction pattern that
can be analyzed by various computational methods to deduce the protein structure
(discussed in more detail below) [4]. NMR (discussed in more detail
below) is used to determine the structure of small proteins in solution or in the solid state
(solid state NMR) [4], and can also be used to study the dynamics of interacting
proteins and their complexes [4]. Hence, NMR is a versatile biophysical technique
that is commonly used in many aspects of structural biology and the drug discovery
process, as shown in Figure 1.1 [5, 6]. Figure 1.1 shows that NMR can be used in the
process of initial target selection, high throughput screening (HTS) biochemical
binding assays to identify potential hits, further validation, identification of leads and
finally lead optimization through structure refinement [5, 6]. Cryo-electron
microscopy is another structural biology technique and is used to visualize
macromolecular complexes [4].
Figure 1.1. NMR can be used in all aspects of the drug discovery process. A
target, normally a protein found to be implicated in the disease of interest, is first
selected. A series of biochemical binding assays is initiated to determine hits
which bind to the target and have the desired effect on target activity. These hits are then
studied further by NMR experiments to confirm the target-hit interaction and to identify the
ligand binding site. Once a lead is identified, it is optimized through structure refinement.
Abbreviations: HTS, High throughput screening. Figure taken from [5]
The importance of determining protein structure was first realised when
it was shown that DNA forms a double helical structure, which revealed the
mechanism by which the genetic information carried on the DNA can be separated
equally between the two daughter cells obtained in the process of cell division [4, 7]. This
then led to the prediction that determining the structure of proteins can also give
insights into protein function, and the principle “function follows form” was
formulated [4, 7]. It was also realised that for the drug discovery process it is
important to determine the structure of the protein, as small molecule drugs can
then be designed to fit into certain protein pockets, which are important both in
mediating a certain cellular function and in the physiology of the disease [4, 7].
Hence, the determination of protein structure is a very important aspect that was
needed for the progression of biomedical science.
1.1.3 Importance of characterizing protein-protein interactions
Resolving the structure of proteins provided crucial information for characterizing
cellular protein functions and showed that alternative protein conformations
can lead to interactions with different binding partners, which might eventually
alter their effects with detrimental consequences for cellular physiology [8]. For
example, in Alzheimer’s disease, amyloid proteins aggregate as deposits, causing
pathogenesis [2, 9]. Another example is adult respiratory distress syndrome,
where the function of elastase is normally inhibited by its interaction
with α-1-antitrypsin [9]. Loss of this interaction leads to the activation of elastase,
which is implicated in the development of the respiratory distress syndrome [9]. These
and many more examples emphasize the importance of identifying interacting
partners of proteins of interest, in order to determine optimal drug targets.
The term “Proteomics” was first introduced in 1995 and this area of research is the
study of protein-protein interactions based on protein structure [8]. Proteomics can
be sub-categorised into two areas of research: expression proteomics and cell map
proteomics [8]. The first category is the study of the expression patterns of various
proteins, and the latter is the study of protein complexes and the protein-protein
interactions within them [8]. Proteomics research has provided an extensive
amount of protein-protein interaction data that needs to be analyzed using various
bioinformatics tools [10]. Bioinformatics is a powerful approach to link specific
protein structure abnormalities with altered protein-protein interactions and
to assign these to particular pathways leading to disease [10].
1.2 Various structural biology techniques
The two main structural biology techniques commonly used to solve the
structures of proteins are X-ray crystallography and NMR, both of which are
discussed in more detail below. Circular dichroism can also be classified as a
structural biology technique and its methodology is also described in the
following sections.
1.2.1 X-Ray crystallography
When carrying out any microscopic experiment, the resolution at which the sample
is observed primarily depends on the wavelength of the electromagnetic radiation
used [11]. For visualizing protein molecules the wavelength has to lie within the
range of 1 to 5 Å, which is true for X-rays [11]. X-rays directed towards the protein
crystal are scattered by the atoms of the protein to produce a diffraction pattern
[11]. A protein crystal consists of a large number of protein molecules packed in an
ordered 3D lattice, which amplifies the effect of X-ray scattering [11]. The scattered
waves can either interfere destructively or add constructively, ultimately giving a
diffraction pattern (reflection spots) [11]. X-ray scattering only adds constructively if
the conditions of Bragg’s law are met [11]: to obtain a diffraction pattern, the total path
difference between two scattered X-rays has to be an integer multiple of the X-ray
wavelength [11]. Bragg’s law is as follows, where the indices h, k and l define the planes
of the crystal lattice:
$n\lambda = 2d_{hkl}\sin\theta$
n; integer order of the reflection, λ; X-ray wavelength, d; lattice spacing, θ; diffraction
angle of the reflections, hkl; Miller indices (planes of the crystal lattice)
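As a rough numerical illustration of Bragg's law, the sketch below converts between lattice spacing and diffraction angle. The wavelength (1.5418 Å, a typical copper Kα value) and the 2.0 Å spacing are assumed example values, and the function names are hypothetical; none of them come from the experiments in this thesis.

```python
import math

def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Angle theta (degrees) satisfying n*lambda = 2*d*sin(theta)."""
    s = order * wavelength / (2.0 * d_spacing)
    if s > 1.0:
        # sin(theta) cannot exceed 1, so no reflection exists for this spacing
        raise ValueError("n*lambda exceeds 2d: no reflection possible")
    return math.degrees(math.asin(s))

def d_spacing_from_angle(theta_deg, wavelength, order=1):
    """Lattice spacing d (same unit as wavelength) for a reflection at theta."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

# Assumed example values: wavelength 1.5418 Å, lattice spacing 2.0 Å
theta = bragg_angle_deg(2.0, 1.5418)
print(f"theta = {theta:.2f} degrees")
print(f"d recovered = {d_spacing_from_angle(theta, 1.5418):.3f} Å")
```

The `ValueError` branch reflects the fact that the smallest spacing that can give a reflection is half the wavelength, which is why the wavelength limits the attainable resolution.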
In these diffraction experiments, the intensity of each reflection spot is measured,
which is directly proportional to the square of the structure factor amplitude (|F|) [11].
This structure factor is the sum of the atomic scattering within the unit cell from the
plane directions defined by the indices hkl [11]. The goal of these diffraction
experiments is to obtain an electron density map, and the electron density (ρ) at each
position x,y,z of the unit cell of volume V can be calculated via a Fourier transform,
as shown by this equation:
$\rho(x,y,z) = \frac{1}{V}\sum_{hkl} |F_{hkl}| \exp[-2\pi i(hx + ky + lz - \alpha_{hkl})]$
For each reflection two values are needed to calculate the electron density: the
structure factor amplitude, $|F_{hkl}|$, which is obtained through the experiment, and
the phase angle of each reflection, $\alpha_{hkl}$ [11]. This phase information is
lost in the measurement and has to be retrieved; this is referred to as the “phase problem”,
and a reasonable estimate of the phases has to be derived to solve the structure by X-ray
diffraction [11].
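The role of the phases in the Fourier synthesis can be illustrated with a toy one-dimensional "unit cell". This is a minimal sketch under stated assumptions: the two point atoms, their positions and scattering weights are invented, the phase is expressed in cycles to match the form of the equation above, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical 1D unit cell of length V = 1 containing two point atoms.
positions = np.array([0.20, 0.55])   # fractional coordinates x_j (assumed)
weights = np.array([8.0, 6.0])       # scattering power f_j (assumed)
V = 1.0
h = np.arange(-15, 16)               # reflection indices included in the sum

# Structure factors: F_h = sum_j f_j * exp(2*pi*i*h*x_j)
F = (weights * np.exp(2j * np.pi * np.outer(h, positions))).sum(axis=1)
amplitude = np.abs(F)                # |F_h|; measured intensity is ~ amplitude**2
alpha = np.angle(F) / (2 * np.pi)    # phase in cycles; not measured directly

def density(x, amp, phase):
    """rho(x) = (1/V) * sum_h |F_h| * exp[-2*pi*i*(h*x - alpha_h)]"""
    terms = amp * np.exp(-2j * np.pi * (np.outer(x, h) - phase))
    return terms.sum(axis=1).real / V

x = np.linspace(0.0, 1.0, 200, endpoint=False)
rho_true = density(x, amplitude, alpha)                      # correct phases
rho_no_phase = density(x, amplitude, np.zeros_like(alpha))   # phases discarded

peak_true = x[np.argmax(rho_true)]
peak_no_phase = x[np.argmax(rho_no_phase)]
print(f"strongest peak with phases:    x = {peak_true:.3f}")
print(f"strongest peak without phases: x = {peak_no_phase:.3f}")
```

With the correct phases the strongest density peak falls on the heavier atom, whereas the same amplitudes combined with zero phases pile the density up at the origin, which is the phase problem in miniature.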
This phase problem can be solved in a number of ways: Multiple or Single
Isomorphous Replacement (MIR or SIR), Single-wavelength Anomalous Dispersion
(SAD) and Multi-wavelength Anomalous Dispersion (MAD) [11]. All of these
experiments generally involve the incorporation of a heavy element (which acts as a strong
scattering centre) within the protein crystal, either by soaking the crystal with the
chosen heavy element or by labeling the recombinant protein through
expression in selenomethionine media [11]. The phase problem can also be solved
through Molecular Replacement (MR), which requires a reasonably accurate model of a
protein homologous to the protein being investigated [11].
Obtaining a crystal of high enough quality to diffract is the critical, difficult and
rate-limiting requirement in solving a structure by X-ray crystallography [11-12]. Many
trials have to be undertaken using various precipitants at different concentrations to
determine optimal crystallization conditions, with little theoretical support, as
the condition in which a protein will crystallize cannot be predicted [11-12].
However, phase diagrams can be determined through these trials, which are
used to facilitate the process of optimizing the conditions for protein crystal growth
[11-12]. These 2D phase diagrams plot the solubility of a protein in various
conditions, and the most common phase diagrams plot the concentration of
protein against that of precipitant, as shown in Figure 1.2 [11-12]. These phase diagrams
show different zones, which relate to different protein kinetics (phase space) (Fig. 1.2)
[11-12]. A protein crystallizes when its concentration is higher than its solubility in
that buffer solution; this region is known as the super-saturation zone [11-12]. Figure 1.2
also shows that this zone of super-saturation is divided into three further zones:
nucleation, precipitation and metastable [11-12]. Each zone refers to a certain
kinetic effect. In the nucleation zone, the concentration of
protein is sufficient for the formation of nuclei and crystals [11-12]. In the
precipitation zone, the concentration of the protein is too high; growth is too rapid
for correct crystal formation, and aggregation and precipitation occur as a result
[11-12]. In the metastable zone, the
concentration of protein is not high enough to form new nuclei but allows the
growth of crystals from existing nuclei [11-12]. Since these zones refer to protein
kinetics, the boundaries are not sharply defined and can differ [11-12]. Other factors that
can be plotted on these phase diagrams are protein purity, pH and temperature [11-12].
Figure 1.2. Phase diagram of concentration of protein to precipitant. The arrow shows the
path of a protein taken through these different zones in a vapour diffusion experiment.
Figure taken from [12]
There are a number of techniques that can be used to grow protein crystals:
vapour diffusion, free interface diffusion, batch and dialysis methods [11-12].
In this thesis the vapour diffusion technique was used, where drops were set up
with protein and precipitant, surrounded by the mother liquor (the precipitant) at a
higher concentration [11-12]. The initial starting point for this type of experiment is the
under-saturation stage (Fig. 1.2), and as the water evaporates the
concentration of both components within the drop increases, moving the protein
into the super-saturation stage [11-12]. Once it is within the nucleation zone,
the protein nucleates, followed by crystal formation and a decrease in
protein concentration [11-12].
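The concentration change during vapour diffusion can be estimated with a simple idealized calculation. The sketch below assumes that only water leaves the drop, that the reservoir is large enough to stay unchanged, and that equilibration stops when the precipitant concentration in the drop equals that of the reservoir; protein lost to crystals is ignored. The function name and the example numbers are hypothetical.

```python
def vapour_diffusion_endpoint(protein_stock, precipitant_reservoir,
                              v_protein, v_reservoir_in_drop):
    """Idealized start and end state of a vapour diffusion drop.

    protein_stock          protein concentration of the stock (e.g. mg/ml)
    precipitant_reservoir  precipitant concentration of the reservoir (e.g. M)
    v_protein              volume of protein stock added to the drop (e.g. µl)
    v_reservoir_in_drop    volume of reservoir solution added to the drop
    """
    v0 = v_protein + v_reservoir_in_drop
    protein_start = protein_stock * v_protein / v0
    precipitant_start = precipitant_reservoir * v_reservoir_in_drop / v0
    # Water leaves until the drop precipitant matches the reservoir, so the
    # drop shrinks by the ratio of starting to final precipitant concentration.
    shrink = precipitant_start / precipitant_reservoir
    return {
        "protein_start": protein_start,
        "precipitant_start": precipitant_start,
        "protein_end": protein_start / shrink,
        "precipitant_end": precipitant_reservoir,
        "final_volume": v0 * shrink,
    }

# Assumed example: 1 µl of 20 mg/ml protein mixed with 1 µl of 1.6 M precipitant
result = vapour_diffusion_endpoint(20.0, 1.6, 1.0, 1.0)
for key, value in result.items():
    print(f"{key}: {value:.2f}")
```

Under these assumptions a 1:1 drop starts at half the stock protein concentration and half the reservoir precipitant concentration, and both roughly double as the drop loses half its volume, which is the movement from under-saturation towards super-saturation described above.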
For vapour diffusion experiments two methods exist, the
sitting and hanging drop methods [11-12]. The latter involves mixing
the protein and the crystallisation trial condition on a siliconised glass cover slip, which
is then placed over a well containing the crystallization trial solution at
a higher concentration and volume than in the drop (see Fig. 1.3) [11-12].
The ratio and volume of protein to crystallization trial condition can be changed
to optimise the conditions for crystal growth [11-12]. The sitting drop method is very
similar, but the drop of
protein and crystallization trial condition sits on a support above the reservoir solution
(crystallization trial condition) rather than being suspended over it as in the
hanging drop method (Fig. 1.3) [11-12]. The advantage of this technique is that a larger
drop volume can be used [11-12].
Figure 1.3. Various vapour diffusion methods. The two types of drop methods,
hanging and sitting drop, are shown in this figure. Figure taken from [13].
A crystal that diffracts to high enough quality consists of a single, uniformly formed
crystal lattice; when this is not the case, it is normally evident from
the diffraction image obtained [11-12]. This is known as mosaicity and is seen as
alterations of spot shape and intensity [11-12]. Another possible crystal
abnormality is splitting of the crystal lattice, which gives two
overlapping diffraction patterns [11-12]. Furthermore, diffraction experiments are
carried out at 100 K, so the crystal has to be frozen without ice forming; this is
therefore done in the presence of a cryoprotectant, which prevents ice formation [11-12].
1.2.2 NMR
NMR is a powerful biophysical technique that can provide structural information on
proteins up to 35 kDa [14-22]. This technique is based on the fact that each charged
proton spins around its axis, creating a small magnetic field around itself; this
is referred to as its magnetic moment (µ) [14-22]. After
application of an external magnetic field (B0) in the Z direction, the proton, which has
a spin of ½, precesses around the Z axis in one of two orientations, Z+ or Z-, referred to
as spin up or spin down respectively [14-22]. The precession frequency depends on the
strength of the external magnetic field applied along the Z axis [14-22]; for example,
when the external magnetic field is 16.4 Tesla, protons precess at a frequency of 700 MHz.
However, the precession of the proton is also affected by its chemical environment,
i.e. which other atoms are nearby [14-22]. The resulting frequency shift in an NMR spectrum,
called the chemical shift, is small compared to the
resonance frequency [14-22]. The chemical shift is around a millionth of
the resonance frequency and is measured in parts per million (ppm); it is
the difference in frequency between the sample signal and a reference
NMR signal, normalized to the spectrometer frequency [14-22]. When using a 700 MHz
spectrometer, a chemical shift of 4 ppm indicates a frequency shift (relative to the
reference) of 2800 Hz (4*700). To measure the actual precession frequency, a radiofrequency
electromagnetic field (B1) is applied, which rotates the net magnetization vector
around the X axis. The net magnetization is thus rotated 90° to align along the Y axis
(Fig. 1.4) [14-22]. This is known as the 90° pulse, and a 180° pulse can be
obtained by doubling the time for which the B1 field is applied [14-22].
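The relationships between field strength, resonance frequency, chemical shift and pulse length can be checked numerically. The sketch below uses the proton gyromagnetic ratio (about 2.6752 × 10^8 rad s^-1 T^-1); the function names and the 25 kHz nutation frequency used for the pulse example are hypothetical illustration values.

```python
import math

# Proton gyromagnetic ratio in rad s^-1 T^-1 (rounded literature value).
GAMMA_1H = 2.6752e8

def larmor_mhz(b0_tesla):
    """Proton resonance frequency (MHz) in a static field B0 (Tesla)."""
    return GAMMA_1H * b0_tesla / (2 * math.pi) / 1e6

def field_for_mhz(freq_mhz):
    """Static field (Tesla) needed for a given proton frequency (MHz)."""
    return freq_mhz * 1e6 * 2 * math.pi / GAMMA_1H

def ppm_to_hz(shift_ppm, spectrometer_mhz):
    """Frequency offset (Hz) of a chemical shift; 1 ppm of X MHz is X Hz."""
    return shift_ppm * spectrometer_mhz

def pulse_length_us(flip_angle_deg, nutation_khz):
    """Pulse duration (µs) for a flip angle, given the B1 nutation frequency."""
    return flip_angle_deg / 360.0 / (nutation_khz * 1e3) * 1e6

print(f"Field for 700 MHz: {field_for_mhz(700.0):.2f} T")
print(f"4 ppm at 700 MHz:  {ppm_to_hz(4.0, 700.0):.0f} Hz")
# Assumed B1 strength giving a 25 kHz nutation frequency
print(f"90 degree pulse:   {pulse_length_us(90, 25):.1f} µs")
print(f"180 degree pulse:  {pulse_length_us(180, 25):.1f} µs")
```

The last two lines show that, at fixed B1 strength, the 180° pulse is simply twice as long as the 90° pulse.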
Figure 1.4. NMR Theory. Shows a 90° pulse when an external magnetic field is applied (my)
from the state of equilibrium around the Z axis (M0 position). Protons spin around the Z axis
at equilibrium state. Figure from [16]
Once the pulse has been applied, the B1 field is switched off and, as a consequence,
the magnetization eventually returns to the Z axis, i.e. to the equilibrium state it
was in before the electromagnetic field was applied [14-22]. This effect is known as
relaxation. The return of the magnetization to its equilibrium state along the Z axis
is characterised by the time constant T1 [14-22]. At equilibrium no net magnetisation
is detected in the XY plane; the decay of the magnetisation in the XY plane from its
Y orientation to zero is characterised by the time constant T2, and the decaying
signal recorded during this process is known as the free induction decay (FID)
[14-22]. The signal from the oscillating spins is detected in the XY plane as a
function of time. The NMR spectrum is obtained by Fourier transformation of this
time-dependent signal. The Fourier transform is a mathematical process whereby the
time-domain signal of the free induction decay is converted into a distribution of
spectral frequencies, giving rise to NMR peaks [14-22].
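The conversion of a free induction decay into a spectrum can be illustrated with a minimal sketch. All parameters (two resonances at 125 and 250 Hz offset, a 50 ms T2, 256 points sampled every 1 ms) are hypothetical values chosen for the illustration, and a plain discrete Fourier transform is used rather than the optimised routines found in NMR processing software.

```python
import cmath
import math


def simulate_fid(freqs_hz, t2_s, dwell_s, n_points):
    """Return a complex FID: a sum of oscillating signals decaying with time constant T2."""
    fid = []
    for k in range(n_points):
        t = k * dwell_s
        decay = math.exp(-t / t2_s)
        oscillation = sum(cmath.exp(2j * math.pi * f * t) for f in freqs_hz)
        fid.append(decay * oscillation)
    return fid


def dft(signal):
    """Plain discrete Fourier transform, O(N^2); adequate for a short illustration."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def peak_frequencies(spectrum, dwell_s, n_peaks):
    """Return the frequencies (Hz) of the n_peaks strongest local maxima."""
    n = len(spectrum)
    mag = [abs(x) for x in spectrum]
    maxima = [m for m in range(n)
              if mag[m] > mag[m - 1] and mag[m] > mag[(m + 1) % n]]
    strongest = sorted(maxima, key=lambda m: mag[m], reverse=True)[:n_peaks]
    resolution_hz = 1.0 / (n * dwell_s)
    # Bins in the upper half of the DFT correspond to negative frequencies.
    return sorted((m if m < n // 2 else m - n) * resolution_hz for m in strongest)


# Hypothetical example: two resonances, offset 125 Hz and 250 Hz from the reference.
fid = simulate_fid(freqs_hz=[125.0, 250.0], t2_s=0.05, dwell_s=0.001, n_points=256)
peaks = peak_frequencies(dft(fid), dwell_s=0.001, n_peaks=2)
print([round(p, 1) for p in peaks])  # the two peaks fall at the simulated offsets
```

A shorter T2 makes the FID decay faster and broadens the corresponding peak in the spectrum, which is one reason why large, slowly tumbling proteins give poorer spectra.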
When carrying out 1D NMR experiments, signals from nuclei with similar resonance
frequencies generally overlap, and 2D NMR experiments therefore give better peak
resolution [14-22]. These are similar in principle to 1D NMR, but two different
nuclei may be utilized, for example a proton and a carbon, and hence the information
is clearer and easier to analyse [14-22]. 2D experiments generally rely on two
phenomena: (1) through-bond coupling, which includes experiments such as HSQC, COSY
and TOCSY; and (2) through-space coupling, which includes experiments such as NOESY
[14-22]. Those techniques that only analyse data at the 1H level, such as COSY,
TOCSY and NOESY, are referred to as homonuclear NMR [14-22]. Other 2D experiments,
such as HSQC (one-bond correlation) and HMBC (long-range correlation), utilize two
different nuclei such as 1H and 15N, and are therefore referred to as heteronuclear
NMR [14-22]. For 2D NMR the nuclei are excited
by a special series of pulses, and free induction decay signals are recorded as a
function of the t1 time delays used in the pulse sequence [14-22]. After the initial
excitation there is an evolution time (t1) during which the magnetisation of the
different nuclei may affect each other; this modulates the intensity of the signal
observed during the acquisition period t2 [14-22]. A number of FIDs with different
evolution times t1 are acquired. The FID signals are first Fourier transformed along
t2 to produce peaks as in 1D spectra, and are then Fourier transformed again along
the t1 dimension. The signals appear as contours on the 2D spectrum, as shown in
Figure 1.5 [14-22]. Figure 1.5 shows the 2D HSQC spectra that can be expected of
unfolded and folded proteins.
Figure 1.5. 2D HSQC NMR spectra. Examples of 2D 1H-15N HSQC spectra of various
proteins in different folding states. A: 1H-15N HSQC spectrum of a “poor”, unfolded
protein, as shown by the low number of cross-peaks, poor sensitivity and small signal
dispersion. B: 1H-15N HSQC spectrum of an unfolded protein, as all the poorly
dispersed backbone amides lie within the random coil region of around 7–9 ppm and
many overlapping resonance signals from the side-chain amides are observed. C: 1H-15N
HSQC spectrum of a “promising” protein, as the spectrum shows the presence of both
folded and unfolded regions of the protein, with higher peak intensities. D: 1H-15N
HSQC spectrum of a folded protein, as shown by contours of uniform intensity,
well-dispersed signals and the correct peak count for the protein under
investigation. Figure taken from [23].
3D experiments are important for protein assignments, and are carried out to obtain
more information regarding protein secondary structure [14-22]. 3D and 2D NMR are
similar in principle and technique, but 3D NMR has one additional time dimension:
two evolution times (t1 and t2) are followed by acquisition during t3 [14-22]. 3D
NMR spectra may analyze three different nuclei, which give rise to three different
frequencies that are linked to each other [14-22]. These types of experiments are
called triple resonance spectroscopy [14-22].
1.2.3 Circular dichroism (CD)
To understand the theory of CD, it is important to note that plane polarized light
consists of two circularly polarized components of equal magnitude that rotate in
opposite directions: left-handed, counter-clockwise (L) and right-handed, clockwise
(R) [24-25]. Differential absorption of these two components by the protein sample
results in elliptically polarized light; this ellipticity is measured by
spectropolarimeters (CD instrumentation) in degrees as a function of wavelength
[24-25]. CD data are most commonly shown as mean residue molar ellipticity (deg
cm2 dmol-1), whereby the data are normalized against molar concentration and
pathlength [24-25]. A CD signal is observed when a chiral component is present within
the sample, whether the chirality arises from the chromophore itself, from its
surrounding environment or from the protein as a whole [24-25].
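The normalization described above can be sketched as a short calculation. This is a minimal illustration that assumes the commonly used form of the conversion, in which the observed ellipticity in millidegrees is divided by ten times the product of molar concentration, pathlength in cm and number of residues; some authors divide by the number of peptide bonds instead. The function name and the sample values are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert an observed ellipticity (mdeg) to mean residue ellipticity.

    Returns a value in deg cm^2 dmol^-1 per residue.
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, pathlength and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical measurement: -20 mdeg for a 100-residue protein at 10 uM in a 0.1 cm cell.
value = mean_residue_ellipticity(theta_mdeg=-20.0, conc_molar=10e-6,
                                 path_cm=0.1, n_residues=100)
print(round(value))
```

Expressing the data per residue allows spectra of proteins of different length and concentration to be compared on the same scale.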
Information from different regions of the CD spectrum can be analysed in parallel
and used to deduce the secondary structure composition [24-25]. Various chromophores
can be analyzed, such as the peptide bond, the side chains of the aromatic residues
and disulphide bonds [24-25]. The peptide bond absorbs between 180 and 240 nm, from
which the secondary structure can be deduced using various programs [24-25]. The
aromatic amino acids tryptophan, tyrosine and phenylalanine absorb at 290-305,
275-282 and 255-270 nm respectively [24-25]. This region cannot give very detailed
information regarding tertiary structure but can nevertheless be studied [24-25].
Disulphide bonds absorb weakly near 260 nm. CD can therefore provide information
regarding secondary structure composition from the peptide bond region, as well as
some tertiary structure information [24-25]. Figure 1.6 shows CD spectra linked to
various secondary structure elements. The stability of the protein can also be
assessed via CD by monitoring the conformational changes induced by varying factors
such as temperature and pH [24-25].
Figure 1.6. Far UV spectra of different secondary structure elements. This figure
shows typical CD spectra for α-helical (solid line), anti-parallel β-sheet (long
dashes), type I β-turn (dots) and irregular structured (dots and short dashes)
samples. Figure taken from [25]
1.2.4 Complementarities between different structural biology techniques
The area of structural proteomics involves solving the structures of proteins with a
view to understanding their function, and studying the correlation between protein
sequence, structure and function [26]. The main requirement for these types of
studies is to obtain a highly concentrated, stable, non-aggregated and soluble
recombinant protein [26]. Many experiments can be carried out to achieve these
requirements, but there is no single protocol that can be used successfully for all
proteins. In addition, only a small fraction of proteins actually meet these
requirements and can be used for structural biology experiments [26].
The most commonly used structural biology techniques for solving the structure of a
protein are X-ray crystallography and NMR spectroscopy [26]. As mentioned above,
NMR spectroscopy does have size restraints associated with it, but it can be used as
a parallel approach to X-ray crystallographic studies of small proteins [26]. A 2D
15N-HSQC NMR spectrum can show whether the protein sample is folded and gives an
idea as to whether the structure can be solved [26]. NMR spectroscopists can
therefore determine quickly whether the structure can be solved by NMR experiments,
although this does not generally indicate whether or not the structure can be solved
by X-ray crystallography [26]. If the sample is folded according to the 2D NMR
spectra, then subsequent NMR experiments can be initiated to solve the
structure, which can take approximately 6 months [26]. With X-ray crystallography,
the time needed to obtain a well-diffracting crystal may be longer, as buffers first
have to be screened to identify the optimum conditions for crystal growth;
furthermore, not all proteins can be crystallized [26]. Unlike NMR spectroscopists,
who can generally screen only a handful of conditions, protein crystallographers can
screen hundreds of conditions, as only low volumes of buffer and protein are needed
for these trials, which can be set up in 96-well plates [26]. Once a crystallization
condition has been identified, more time is required to optimize the crystal to
obtain the best diffraction pattern [26]. Overall, the time it takes to solve a
crystal structure depends mainly on the exploratory trials mentioned above; once
good diffraction is achieved, the computational steps involved in solving the
structure by X-ray crystallography are not time consuming in comparison [26].
Advances are being made in both techniques to reduce the time needed to solve
protein structures [26].
Another difference between the two techniques is that structures of the same protein
solved by NMR and by X-ray crystallography have been shown to differ, which means
that there may be differences in conformation between crystal and solution
structures [27]. This is understandable, as for X-ray crystallographic studies the
protein is “packed” into the crystal and could therefore differ in conformation from
the protein in solution, where this “packing” does not occur [27]. NMR studies
usually provide an ensemble of conformers of the protein, which are subsequently
deposited in the PDB [27]. One difference between structures solved by NMR and by
X-ray crystallography is the number of contacts per residue: NMR structures
deposited in the PDB have more inter-residue contacts than X-ray structures within
3.0 Å and between 4.5 and 6.5 Å, but fewer between 3 and 4 Å and between 6.5 and
8 Å [27]. The main chain hydrogen bonds also sometimes differ between NMR and X-ray
structures [27]. However, it has been suggested that these differences could be due
to the mathematical procedures used to process the data for the two techniques, as
after re-refinement of NMR structures using different force field parameters the
differences are much smaller (but do not disappear completely) [27].
X-ray structures are still generally considered to give a more accurate picture of
the protein structure than NMR structures, because of the lack of established
quality assessment for NMR structures [27]. However, NMR solves the structure of the
protein in solution, which is considered more relevant in a biological context than
static X-ray crystal structures [27].
Another simpler and faster technique is CD, which is most commonly used as a
preliminary experiment to X-ray and NMR studies to determine the folding state of
the protein before these experiments are initiated [26]. However, only NMR can
provide high resolution data regarding the folding state of the protein sample [26].
CD data can nevertheless be used to facilitate crystallographic and NMR studies, as
the integrity, folding, stability (by monitoring changes in structure as the
temperature is varied) and secondary structure composition of the protein can be
assessed by this technique using less protein material [24-26]. Whereas X-ray
crystallography and NMR give structural data at atomic resolution [26], CD provides
low resolution data that reflect the overall structure [24-25]. Due to its
non-destructive nature and its low sample and time requirements, CD is very useful
for preliminary structural studies [24-25]. Compared to crystallographic studies,
which require high quality diffracting crystals [26], CD is a less time-consuming,
solution-based technique [24-25]. For NMR studies, a high concentration of monomeric
protein is needed and the technique also has size restraints (less than 35 kDa)
[14-22, 26]. Furthermore, to completely solve a structure by NMR each resonance has
to be assigned, which is time consuming compared to CD, and NMR requires stable
isotope labeling, which is expensive [14-22, 26].
X-ray structure gives a static picture and does not give any data regarding dynamics
of the protein, which is a critical aspect of its function [11-12, 26]. NMR studies can
provide information regarding the dynamic nature of the protein compared to X-ray
crystallography [26]. CD can be used to assess structural changes of the sample and
the rate at which this occurs [24-25].
As mentioned previously, CD provides low resolution structural data but can give
reliable estimates of secondary structure composition, although it cannot determine
exactly which regions of the protein belong to the different secondary structural
elements [24-25]. Furthermore, CD data cannot give very detailed information
regarding tertiary and quaternary structure, but can nevertheless be used as a
complementary approach to X-ray and NMR studies [24-25].
1.3 Mass spectrometry
Mass spectrometry is a highly sensitive and high throughput technique and as a
result is used during many stages of the drug discovery process [28]. This
sensitivity is critical in identifying drug targets that are present at low
concentration within mixtures [28]. The speed with which mass spectrometry
experiments can be carried out is important for high throughput analysis of drug
libraries [28].
Despite what the name implies, this technique does not measure mass directly; it
measures the mass to charge ratio (m/z) [28]. A mass spectrum usually plots ion
abundance against m/z, expressed in Daltons (Da) per unit charge, and hence the raw
data give information regarding the relative abundances of the gas phase ions in the
sample [28]. The mass spectrometer consists of three important elements: (1) an
ionization source, (2) a mass analyzer and (3) a detector, as shown in Figure 1.7
[28].
Figure 1.7. The three main components of the mass spectrometer. The three main
components of the mass spectrometer are the ion source, the analyzer and the
detector. The ionization techniques used within the ion source are ESI (electrospray
ionization), APCI (atmospheric pressure chemical ionization) and MALDI (matrix
assisted laser desorption/ionization). The ions then travel to the analyzer;
analyzers include time of flight, quadrupole, quadrupole ion trap, FT-ICR
(Fourier-transform ion-cyclotron resonance) and sector analyzers. The ions
eventually reach the detector, and the results are subsequently viewed on the
computer. *FT-ICR does not use an electron multiplier as its detector.
Figure taken from [28].
As the name implies, the ionization source is where the analytes are converted into
gas phase ions [28]. Historically, electron ionization was used for this stage;
however, this limited the samples that could be analyzed to thermally stable, low
molecular weight compounds [28]. Advances in recent decades have made available
ionization techniques that can analyze large, non-volatile and thermally labile
compounds, two of which are electrospray ionization (ESI) and matrix assisted laser
desorption/ionization (MALDI). ESI is the process of ion formation whereby a
solvated sample is passed through a small charged capillary, resulting in the
formation of charged droplets of both solvent and analytes, which can have either a
net positive or a net negative charge [28]. As these ions pass through the
instrument to the mass analyzer, they lose the surrounding solvent [28]. ESI has no
size restraints, as multiply charged species are formed during ESI, and it is
referred to as a soft ionization technique, which allows non-covalent
bio-macromolecular complexes to be analyzed [28]. This ionization process can follow
a separation step such as HPLC or capillary electrophoresis, which allows complex
biological samples to be analyzed swiftly and with high sensitivity [28]. However,
because ESI is a solution-based technique, sample is constantly being lost, and ESI
is also very sensitive to ion suppression effects [28]. High salt concentrations can
lower analyte ion formation, and so most samples have to be desalted prior to mass
spectrometry analysis [28].
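The effect of multiple charging on the measured m/z can be illustrated with a short sketch. It assumes positive-mode ions formed by the addition of protons, so that m/z = (M + z × 1.007276) / z, and shows how the neutral mass can be recovered from two adjacent charge states. The 17000 Da mass is a hypothetical example, and the function names are illustrative.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da


def mz_positive(neutral_mass_da, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def deconvolute_adjacent_peaks(mz_high, mz_low):
    """Recover charge and neutral mass from two adjacent charge-state peaks.

    mz_high carries charge z and mz_low carries charge z + 1.
    """
    z = round((mz_low - PROTON_MASS_DA) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS_DA)
    return z, mass


# Hypothetical 17000 Da protein observed in its 10+ and 11+ charge states.
peak_10 = mz_positive(17000.0, 10)
peak_11 = mz_positive(17000.0, 11)
charge, mass = deconvolute_adjacent_peaks(peak_10, peak_11)
print(round(peak_10, 3), round(peak_11, 3), charge, round(mass, 3))
```

Because each added charge lowers the m/z value, a large protein produces a series of peaks that fall within the range of common analyzers, which is why ESI is described as having no size restraints.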
In MALDI the sample is co-crystallized with a matrix and then irradiated with a
laser, which results in the formation of gas phase ions that travel towards the mass
analyzer [28]. This process generates singly charged ions, which is an important
advantage of MALDI, although the mechanism has not been fully clarified [28].
Compared to ESI, this technique has very little sample wastage and as a consequence
is highly sensitive [28]. MALDI is also less sensitive than ESI to salts and buffers
present within the sample [28]. However, there are drawbacks to this technique: the
matrix produces a large amount of chemical noise at m/z values below 500 Da, and
hence it is difficult to analyze low molecular weight samples [28]. Research is
nevertheless progressing towards solving this problem [28].
The next component within the mass spectrometer is the mass analyzer, of which there
are many that work in different ways to analyze the ions [28]. There are five common
mass analyzers, which can be categorized into two groups: (1) beam analyzers,
whereby the ions exit the ion source in the form of a beam and travel through the
analyzer to the detector; and (2) trapping analyzers, where the ions, which have
been generated by an external ion source or within the analyzer itself, are trapped
within the analyzer [28]. One
example of a beam analyzer is time of flight (TOF), a simple mass analyzer that
separates ions according to their speed [28]. Ions are formed and a fixed potential
is then applied across the TOF drift tube, whereby the ions are accelerated and
travel through the tube to hit the detector [28]. Ions with the same charge acquire
the same kinetic energy on acceleration, so the lower the m/z value, the higher the
speed at which the ion travels, and vice versa [28]. Hence the time it takes for an
ion to travel to the detector determines its m/z value [28]. This is generally known
as linear TOF. A more advanced instrument is the reflectron TOF, where the ions
travel through two tubes, aided by an electrostatic mirror known as the reflectron,
which directs the ions along a second path and then finally to the detector [28].
The reflectron compensates for small differences in the speeds of ions with the same
m/z value, and this improves the accuracy of the spectrometer [28]. The advantage of
this analyzer is that there are no size restraints [28].
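The relationship between flight time and m/z in a linear TOF analyzer can be sketched as follows. The sketch assumes an idealised instrument in which all ions start at rest, are accelerated through a potential V (so that zeV = ½mv²) and then drift a length L without further acceleration; the 20 kV potential and 1 m tube are hypothetical values.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs
DALTON_KG = 1.66053906660e-27          # kilograms per Da


def flight_time_s(mz, accel_voltage_v, drift_length_m):
    """Drift time of an ion with the given m/z (Da per unit charge)."""
    mass_per_charge_kg = mz * DALTON_KG
    # Energy conservation: z*e*V = 0.5*m*v^2, so v = sqrt(2*e*V / (m/z)).
    velocity = math.sqrt(2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v / mass_per_charge_kg)
    return drift_length_m / velocity


def mz_from_flight_time(time_s, accel_voltage_v, drift_length_m):
    """Invert the relation above to obtain m/z from a measured flight time."""
    return (2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v * time_s ** 2
            / (drift_length_m ** 2 * DALTON_KG))


# Hypothetical instrument: 20 kV acceleration, 1 m drift tube.
for mz in (1000.0, 4000.0):
    t = flight_time_s(mz, 20000.0, 1.0)
    print(mz, round(t * 1e6, 2), "microseconds")
```

Flight time scales with the square root of m/z, so in this idealised model an ion with four times the m/z value takes twice as long to reach the detector.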
Another analyzer is the quadrupole, which is cheap and easy to use and applies a
much lower voltage (2-50 V, rather than kV) to accelerate the ions [28]. This
analyzer is based on the trajectory of an ion to the detector through four rods
subjected to a dynamic (radio frequency) electric field; this trajectory ultimately
depends on the m/z value of the ion [28]. Not all ions reach the detector: at a
given setting only ions with a single m/z value can travel through the rods with a
stable trajectory and reach the detector, whereas ions with a different m/z have
unstable trajectories and are lost [28].
Another commonly used analyzer is the quadrupole ion trap, which is a close relative
of the quadrupole but differs in how the electric field is applied (Fig.1.8) [28].
The quadrupole applies the electric field in two dimensions (x and y), and as a
result the ions travel perpendicular to the electric field, i.e. in the z direction
[28]. In the ion trap the electric field is applied in all directions, and as a
consequence the ions are trapped within this field (Fig.1.8) [28]. The mass spectrum
obtained with the quadrupole analyzer is the result of ions following a stable
trajectory to the detector, whereas in the ion trap the opposite is required: ions
must be destabilised and ejected from the trap to obtain a mass spectrum [28]. This
is achieved by increasing the radio frequency voltage applied [28].
Figure 1.8. Pictorial representation of the quadrupole ion trap. A radio frequency
voltage is applied to the ring electrode to disrupt the path of an ion, and an
electric field is applied in all directions to the ions in this analyzer, which
results in the trapping of the ions within the instrument.
Figure taken from [28].
As well as the advances in ionization techniques and instrumentation, another
technique that has been important for the use of MS with biological samples is
tandem mass spectrometry, usually denoted MS/MS [28]. As the name suggests, and as
shown in Figure 1.9, this technique involves two steps. The first step is the
separation of the ion with the desired m/z (the parent ion) from the other ions
generated by the ion source [28]. The second step is the dissociation of the parent
ion, changing its mass or charge, to produce a set of ions (product ions), which are
subsequently analyzed by the mass spectrometer [28]. An important step of
MS/MS is ion dissociation, whereby the parent ion dissociates following an increase
in its internal energy before entering the second stage, MS-II [28]. This increase
in energy is most commonly achieved through a process called collision-induced
dissociation (CID), where the parent ions collide with gas particles, resulting in
the conversion of kinetic energy into internal energy of the parent ion [28]. The
instrumentation used for MS/MS is of two types: tandem in space and tandem in time.
The first type requires a separate analyzer, usually a beam-type analyzer, for each
step of MS/MS [28]. The second type can use a single analyzer, usually a trapping
analyzer such as the quadrupole ion trap, for both steps but at different times
[28]. The latter type of MS/MS has a higher efficiency, because there is no transfer
of ions between different analyzers and more time is available for the ions to
dissociate [28]. The quadrupole ion trap is most commonly used for MS/MS experiments
owing to its low maintenance cost, simplicity and speed [28].
Figure 1.9. Pictorial representation of the tandem mass spectrometry process
(MS/MS). In the first stage of the MS/MS process (MS-I), the ion with the desired
m/z value (parent ion) is separated from the other ions. This parent ion is then
dissociated via CID (collision induced dissociation), IRMPD (infrared multiphoton
dissociation) or SID (surface induced dissociation), and a mass spectrum of the
product ions generated by this process is recorded. This is the second stage of the
MS/MS process, MS-II.
Figure taken from [28]
Mass spectrometry is a highly sensitive and high throughput technique. Furthermore,
advances in ionization techniques have allowed the analysis of many compounds, even
those present at low levels within complex mixtures, which has been critical for the
analysis of biological samples [28]. Mass spectrometry has also been used within the
area of proteomics to identify and characterize proteins, for de novo peptide
sequencing and to determine post-translational modification states [28].
1.4 Protein-protein interaction motifs
1.4.1 Structure
Proteins can contain a string of tandem basic motifs that are important in mediating
protein interactions, among which are the WD40 [29], PDZ [30], SH3 [31] and TPR
motifs [32-43]. The WD40 domain has no intrinsic enzymatic activity and is highly
abundant in eukaryotic proteins implicated in various biological processes such as
cell division, chemotaxis, RNA processing and various signal transduction pathways
[29]. The WD40 domain acts as a scaffold to mediate protein-protein interactions and
form multi-subunit complexes [29]. Each WD40 repeat consists of 44-60 residues, and
the repeats together form a seven-bladed β-propeller fold in which each blade
consists of a four-stranded anti-parallel β sheet (Fig.1.10) [29].
Figure 1.10. A ribbon presentation of the WD40 domain. A: The seven-bladed
β-propeller fold (each blade highlighted in a different color and circled 1-7). B:
One of the seven blades, which consists of a four-stranded anti-parallel β sheet
(strands highlighted in different colors and circled A-D).
Figure adapted from [29]
PDZ domains consist of 80-100 amino acid residues and bind to the C termini of
interacting proteins, which include transmembrane receptors, channel proteins and
other PDZ domain-containing proteins [30]. These PDZ domain-mediated interactions
are implicated in localising the interacting proteins to the plasma membrane [30].
The PDZ domain consists of six β strands (βA-βF) and two α-helices (αA and αB),
which form a six β-stranded domain [30].
The SH2 domain recognizes and binds to sequences containing phosphorylated tyrosine,
whereas the SH3 domain binds to peptides containing the consensus sequence PxxP
(where x is any amino acid) [31]. SH3 motifs are approximately 60 amino acids long
and are ubiquitous in eukaryotes [31].
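A search for the PxxP consensus described above can be sketched with a regular expression. This is a minimal illustration only: the peptide sequence is a made-up example, and a PxxP match on its own does not show that a sequence is bound by an SH3 domain.

```python
import re

# Lookahead so that overlapping PxxP matches are all reported.
PXXP_PATTERN = re.compile(r"(?=(P[A-Z]{2}P))")


def find_pxxp(sequence):
    """Return (1-based start position, matched tetrapeptide) for every PxxP site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PXXP_PATTERN.finditer(seq)]


# Hypothetical peptide used purely for illustration.
print(find_pxxp("AAPPLPPRAA"))  # [(3, 'PPLP'), (4, 'PLPP')]
```

The lookahead form is used because proline-rich stretches often contain overlapping candidate sites, which a plain non-overlapping search would miss.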
Another well-known example of a protein interaction motif is the TPR motif, a 34
amino acid residue motif [32] that was first discovered in yeast in 1990 [33] and is
involved in protein-protein interactions [32, 34-35]. TPR motifs are evolutionarily
conserved from prokaryotes to eukaryotes and are found in various proteins in
different sub-cellular locations such as the nucleus, peroxisome, mitochondria and
cytoplasm [32]. The structure of the TPR motif has been characterized through X-ray
crystallography of TPR proteins such as PP5 [32, 36-37]. The secondary and tertiary
structures show that each TPR motif consists of two alpha helices forming an
anti-parallel hairpin structure [32, 34-38] (Fig.1.11). This hairpin thus consists
of two anti-parallel alpha helices, A and B, which span different parts of one TPR
motif. Helix A includes conserved residues 4, 7, 8 and 11 and helix B includes
conserved residues 20, 24, 27 and 32 (Fig.1.11) [32, 36-37]. Adjacent TPR motifs
pack parallel to each other, and there is an angle of 24° between helices A and B
[32-33, 36-37]. This packing creates a grooved surface, into which TPR side-chains
protrude to specify interactions with other polypeptides [39]. PP5 has an extra
helix, helix 7, at the C terminus of the third TPR motif (Fig.1.11); this helix is
critical for the solubility and stability of these motifs [32-33]. This helix is
also present in other TPR proteins, such as FKBP52 [33, 40]. The stability of a TPR
protein increases with the number of TPR motifs within the protein [41].
Figure 1.11. A ribbon presentation of the TPR motifs of PP5. A: The 3 TPR motifs of
PP5 and helix 7. B: The first TPR motif of PP5 highlighting the conserved TPR
residues, 4, 7, 8, 11, 20, 24, 27. The TPR motif is colored red and blue, of which
red is helix A and blue is helix B. The single letters represent the residues and
the numbers represent their location within the TPR motif.
Figure adapted from [32, 37]
TPR motifs show homology in their amino acid sequence, hydrophobicity, length and
spacing [32-36, 42-43]. Sequence alignments of different TPR proteins identified the
amino acid residues W (tryptophan), L (leucine), G (glycine), Y (tyrosine), A
(alanine), F (phenylalanine), A (alanine) and P (proline) at positions 4, 7, 8, 11,
20, 24, 27 and 32 respectively as highly conserved [32-36, 42-43].
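The consensus positions listed above can be checked against a candidate 34-residue sequence with a short sketch. The scoring function is an illustration only, not an established TPR prediction method, and the test sequence is synthetic: it is built from a filler residue with the consensus residues placed at the stated positions, and does not come from a real protein.

```python
TPR_LENGTH = 34
# Conserved residues by 1-based position within the motif, as listed in the text.
TPR_CONSENSUS = {4: "W", 7: "L", 8: "G", 11: "Y", 20: "A", 24: "F", 27: "A", 32: "P"}


def consensus_score(motif):
    """Count how many of the eight consensus positions are matched by the motif."""
    motif = motif.upper()
    if len(motif) != TPR_LENGTH:
        raise ValueError("a TPR motif is expected to be 34 residues long")
    return sum(1 for pos, res in TPR_CONSENSUS.items() if motif[pos - 1] == res)


def synthetic_motif(filler="S"):
    """Build an artificial 34-mer carrying every consensus residue (illustration only)."""
    residues = [filler] * TPR_LENGTH
    for pos, res in TPR_CONSENSUS.items():
        residues[pos - 1] = res
    return "".join(residues)


ideal = synthetic_motif()
variant = ideal[:7] + "A" + ideal[8:]  # replace the glycine at position 8
print(consensus_score(ideal), consensus_score(variant))  # 8 7
```

Real TPR motifs rarely match every consensus position, so a score of this kind would normally be combined with the hydrophobicity and spacing patterns mentioned above.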
Protein interaction motifs are simple but versatile structures, which can provide
mechanical strength to the protein, and can interact with a diverse set of proteins
varying from transcription factors to multi-meric scaffolding proteins [44]. Hence
proteins bearing these motifs are implicated in a wide range of functions within the
cells [44].
1.4.2 Structural stability and ligand specificity of protein interaction domains
As mentioned above, the TPR motif is an α-helical motif that mediates
protein-protein interactions, resulting in the formation of multi-protein complexes
implicated in various biological functions within the cell (this will be discussed
in more detail in subsequent sections) [32-38, 45]. The super-helical structure
formed by the packing of adjacent TPR motifs exhibits a concave and a convex
surface, which provide flexibility to the unit and are implicated in ligand binding
[45, 46]. TPR proteins bind to numerous ligands, which do not necessarily exhibit
sequence or structural conservation, through different binding pockets that differ
in their surface amino acid residues and serve as an interaction platform [46].
These surface residues are implicated in ligand binding specificity, as they affect
the electrostatic nature and hydrophobicity of the binding surface [46]. Solved
TPR-ligand structures show that many factors contribute to ligand binding
specificity [46]. The central groove formed by the packing of the TPR fold
accommodates the target ligand, but this is not the only mode of ligand interaction
[45]. The mechanisms for recognition of the cognate ligand, as well as the
structural re-arrangements implicated in this interaction, have been studied and
will be discussed in this section.
Comparison of the structures of various TPR proteins in the presence and absence of
ligand has shown that very little structural re-arrangement occurs in the TPR fold
upon ligand binding [37, 42, 47-50]. The structures of six proteins containing 3 TPR
motifs have been solved in the presence [37, 47] and absence of ligand [37, 42,
47-50] and compared. Figure 1.12 shows homology modeling of several of these TPR
proteins: CTPR3 (a consensus TPR protein with three repeats), PP5, and the TPR2A and
TPR1 domains of HOP, which have been co-crystallized with their peptide ligands
[42]. These structures were found to superimpose on each other when homology
modeling was carried out, even in the presence of ligand [42]. It was also shown
that free TPR domains, which are not bound to ligand, are still folded and
structurally ordered [42].
Figure 1.12. Homology modeling of various TPR proteins. Homology modelling of CTPR3
(PDB code: 1NA0); the TPR domain (residues 19–177) of PP5 (PDB code: 1A17); and the
TPR2A (PDB code: 1ELR) and TPR1 (PDB code: 1ELW) domains of Hop, which were
co-crystallized with their ligands, the C terminal peptides of Hsp90 and Hsp70
respectively.
Figure taken from [42]
HOP is an adaptor protein containing 9 TPR motifs, which are organized as 3 TPR
domain clusters (Fig.1.13): TPR1, TPR2A and TPR2B. TPR1 and TPR2A mediate the
interactions with Hsp70 and Hsp90 respectively [47]. TPR1 and TPR2A bind to the C
terminal heptapeptide of Hsp70 and pentapeptide of Hsp90 respectively; binding is
mediated by electrostatic and hydrophobic interactions with the EEVD consensus
sequence (glutamate (E), valine (V) and aspartate (D)) present on these heat shock
proteins [47]. The crystal structures of the TPR-peptide complexes have been solved
(Fig.1.12) and exhibit the common TPR fold. Homology modeling shows how
super-imposable these structures are in the presence of their respective ligands,
providing further evidence that very little structural re-arrangement is induced
upon ligand binding [42, 47]. CD spectra of TPR1 and TPR2A of HOP with and without
ligand show that the domains are folded and highly alpha helical, and display no
evidence of structural re-arrangement upon ligand binding [42].
Figure 1.13. Domain organization of the TPR protein, HOP. HOP contains three
clusters of 3 TPR motifs dispersed along its structure, named TPR1 (white boxes),
TPR2A (black boxes) and TPR2B (grey boxes). These clusters include amino acids
1-118, 232-252 and 353-477 respectively. Figure taken from [51]
Further evidence that no drastic structural re-arrangements are induced upon ligand
binding comes from the CD and NMR spectra of the three TPR motif protein UBP (Vpu
binding protein), also called SGT (small glutamine-rich protein) [42]. This protein
also binds to the C terminal peptide of Hsp70, and analysis at the residue-specific
level shows no drastic change between the NMR HSQC spectra of the free protein and
of the protein-ligand complex, indicating that no structural re-arrangements are
induced upon ligand binding [42].
Several TPR proteins have been solved by crystallography [32, 34-38, 45]; they form
the common fold but show slight variations in structure due to short amino acid
insertions that allow specific protein interactions [45]. This shows how versatile
the TPR motif is. One example is p67phox, a component of the NADPH oxidase complex,
which contains four TPR motifs important for interacting with Rac GTP (Fig.1.14)
[45]. Another is PEX5, which includes 7 TPR motifs (Fig.1.14) and is important in
interacting with the type 1 peroxisomal targeting signal (PTS1) [45].
Figure 1.14. Domain organization of the TPR proteins, PEX5 and p67phox. PEX5
contains 7 TPR motifs at its C terminus, and p67phox contains 4 TPR motifs at its N
terminus, as well as other functional domains, such as the activation, P-rich and SH3
domains, which are highlighted in peach, yellow and green boxes. The TPR motifs are
highlighted in red boxes. Figure taken from [45].
NADPH oxidase is an enzyme involved in the production of reactive oxygen
species, which is a protective mechanism against microbial infection [45]. In the
resting state within neutrophils, p67phox acts as a mediator between p40phox and
p47phox to form a trimeric complex [45]. Upon neutrophil stimulation, a
conformational re-arrangement of this trimeric complex is induced, followed by
phosphorylation of these elements. This complex then interacts with membrane
bound cytochrome components, which include the GTPase RAC [45]. Recognition
of RAC by p67phox is a critical step in the formation and activation of NADPH
oxidase, and this is mediated through the N terminal TPR domain of p67phox (Fig.1.14)
[45]. Homology modelling of the two crystal structures, inactive p67phox bound to
RAC and a longer active form of p67phox without RAC, shows that the structure of the
N terminal TPR domain is similar in both, which suggests that p67phox-RAC binding
does not cause any structural re-arrangement of the TPR domain [45]. As mentioned
above, p67phox has four TPR motifs, and between TPR3 and TPR4 a 20
amino acid insertion has been discovered that forms two anti-parallel β strands [45].
This is important because the insertion does not disrupt the general super-helical fold
of the TPR motif, yet it forms part of the RAC binding site together with the loop
that connects TPR1 and TPR3 [45]. This provides evidence that the central groove
formed by the super-helical nature of the TPR fold is not the only mode
of ligand recognition or the only ligand binding site [45].
Another TPR motif-containing protein is PEX5, which contains seven TPR motifs
followed by a 7C loop (Fig.1.14). The TPR motifs are implicated in the
translocation of newly formed peroxisomal enzymes to their correct sub-cellular
localization [45]. This is mediated through the C terminal PEX5 TPR domain,
which recognises a C terminal signal peptide, SKL or peroxisomal targeting signal
(PTS1), in its target ligand [45]. The structures of PEX5 bound and unbound to
peptide, including the PTS1 sequence, were analysed and this identified an
alternative TPR motif conformation and mode of ligand recognition [45]. It was
shown that structural re-arrangements are induced within the TPR motif upon ligand
binding mediated through this PTS1 signal [45]. Upon ligand binding, the TPR motif
switches from the open form to a closed structure with the 7C loop [45]. This 7C loop
is important in target recognition and interacts with TPR1 to form the closed
conformation [45]. PEX5 therefore retains the common fold of the TPR motif but
exhibits this non-canonical and flexible structure [45].
Contrasting evidence came from a particular PP5 TPR construct, which was
shown to exhibit differences in folding and stabilization with and without ligand [42,
52]. The TPR motif was folded and stabilized when complexed
with ligand, and the converse was detected without ligand [42, 52]. It was then
hypothesized that coupled folding and binding could be a potential mechanism for
ligand recognition [42, 52]. However, the vast majority of evidence on the
stability and structure of TPR motif proteins with and without ligand shows that no
drastic structural re-arrangements are detected upon ligand interaction, and that the
TPR acts as an individual, folded unit which presents a common binding surface
or architecture in which specific ligands are accommodated [37, 42, 47-51].
The next question is how the TPR motif determines ligand specificity. Since the
genome was fully sequenced, many proteins have been identified
which can be categorized into families based on their sequence conservation [53].
However, very little attention is given to poorly conserved sites, which are
hypothesized to be of little importance [53]. Recently, however, a paper was
published which showed through extensive statistical analysis that residues in
contact with the ligand are more variable than surface residues which are not in
contact with the ligand but are exposed to the solvent [53]. Through this statistical
analysis of TPR proteins, the authors identified that sequence variation can
determine specificity in ligand interaction. TPR proteins bind to highly diverse
ligands, and it has been shown that these “hyper-variation” sequences are implicated
in ligand binding and as a consequence can be used to predict ligand binding sites
[53]. Certain residue positions also play a role in determining TPR ligand
specificity [51], and these concepts are discussed in more detail below.
Within a protein family, the highly conserved residues are present in the
hydrophobic core and are implicated in specifying the fold of the protein [53].
The residues on the surface are generally more variable and, if mutated, have little
effect on protein structure or stability [53]. However, if these surface residues are
conserved then they are considered to be of functional importance [53]. This
information can then be used to identify ligand binding sites within proteins that
carry out the same function, but cannot be used for proteins that share a common
fold that binds to diverse ligands, such as TPR proteins [53]. It has been shown that
the ligand binding site can instead be determined through sequence hyper-variation
at positions situated proximal to the ligand, which are important in determining
ligand specificity [53]. When the ligand bound structures of the two TPR domains of
HOP, TPR1 and TPR2A, were studied, it was evident that the ligand binding face of the
TPR motif is more variable than the solvent-exposed surface residues, and it is these
residues that are predicted to determine ligand specificity [53]. This
sequence variation is termed hyper-variation [53]. Further analysis has shown that
residues 2, 5, 9, 12, 13, 33 and 34 exhibit the most sequence variability within the
TPR, and are present on the concave face of the TPR motif [53]. In the case of TPR1
and TPR2A of HOP, residues 2, 5, 6, 9, 12 and 13 are implicated in ligand binding and
specificity [52]. It is yet to be confirmed whether residues 33 and 34 are also
implicated in ligand binding and specificity [53].
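The idea of ranking TPR positions by sequence variability can be illustrated with a minimal sketch. It assumes that variability is scored as the Shannon entropy of each column of aligned, equal-length motifs; this metric is an assumption made for illustration and is not necessarily the statistic used in [53]. The function names and the toy alignment are hypothetical and are not real TPR sequences.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def positional_variability(aligned_motifs):
    """Map each 1-based motif position to its column entropy."""
    length = len(aligned_motifs[0])
    if any(len(motif) != length for motif in aligned_motifs):
        raise ValueError("all motifs must have the same length")
    return {
        i + 1: column_entropy([motif[i] for motif in aligned_motifs])
        for i in range(length)
    }


def most_variable_positions(aligned_motifs, top_n):
    """Return the top_n positions with the highest entropy, in ascending order."""
    scores = positional_variability(aligned_motifs)
    # Sort by decreasing entropy; ties are broken by position number.
    ranked = sorted(scores, key=lambda pos: (-scores[pos], pos))
    return sorted(ranked[:top_n])


# Hypothetical toy alignment (illustrative only): positions 2 and 5 vary,
# positions 1, 3 and 4 are fully conserved.
TOY_MOTIFS = ["AKLGE", "ARLGD", "AELGK", "AQLGN"]
```

Applied to an alignment of real 34-residue TPR motifs, the same ranking would be expected to highlight hyper-variable positions of the kind described above, although the exact positions returned depend on the sequences supplied.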
Hsp70 and Hsp90 are important proteins implicated in the folding of various
proteins, mediated through interaction with various TPR co-factors through their
conserved EEVD sequence [51]. As mentioned above, the TPR motif forms a central
groove, and it has been shown that it is this groove that serves as a ligand-binding
site [51]. HOP binds to its ligand in an extended conformation, which displays a
maximal TPR interaction surface and allows recognition of short amino acid
sequences (Fig.1.15) [46]. From solved crystal structures of HOP with its ligand, it
has been shown that the interaction of the TPR motifs is through the EEVD consensus
sequence [46]. In the case of TPR1 and TPR2A there are five amino acids within the
central groove that form the “two-carboxylate clamp”, which interacts with the
C terminal aspartate residue of the EEVD sequence of Hsp70 and Hsp90, and so this
clamp acts as a binding and docking site between peptide ligand and TPR motif [46, 51].
This TPR mediated interaction is important for the stability and specificity of
Hop-Hsp70 and Hop-Hsp90 complex formation [51]. The conserved EEVD sequence acts as
an anchor sequence for the TPR co-factors of the heat shock proteins; however, it is
the residues N terminal of the EEVD sequence that determine specificity to Hsp70 and
Hsp90 [51]. The contact between TPR2A and the MEEVD sequence of Hsp90 is the only
contact required for Hop-Hsp90 complex formation, and the ME (methionine and
glutamate) residues are specific for the Hsp90 interaction [51]. However, for Hop-Hsp70 complex
formation, this requires not only the interaction of TPR1 with Hsp70 through the
PTIEEVD sequence but also additional contacts [51]. PTI (proline,
threonine, isoleucine) are the residues specific for Hsp70 interaction [51]. Of the
conserved EEVD sequence, the aspartate and valine residues are the anchor
residues; the glutamate residues are critical in TPR2A-Hsp90 binding but not as
critical in TPR1-Hsp70 binding, as TPR1 has a preference for a hydrophobic amino
acid at this position [51]. Furthermore, the TPR motif has a preference for
hydrophobic aliphatic and aromatic side chains at certain positions within the
respective ligands, such as positions 4 and 6 [51]. For example, in the case of TPR1
and TPR2A, Ile-4 in Hsp70 and Met-4 in Hsp90 are important residues in determining
specificity to TPR1 and TPR2A [51].
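The anchor-plus-specificity logic described above can be summarised in a short sketch: a C terminal EEVD acts as the common anchor, and the residues immediately N terminal of it (PTI for Hsp70, M for Hsp90) suggest which HOP TPR domain is the likely partner. The function name and the returned labels are hypothetical, and the rule is a deliberate simplification of the binding determinants described in [51].

```python
def classify_eevd_tail(sequence):
    """Classify a sequence by its C terminal EEVD-containing peptide.

    Simplified rule taken from the text: EEVD is the shared anchor,
    PTIEEVD is the Hsp70-type tail and MEEVD is the Hsp90-type tail.
    """
    seq = sequence.upper()
    if not seq.endswith("EEVD"):
        return "no EEVD anchor"
    if seq.endswith("PTIEEVD"):
        return "Hsp70-like (TPR1)"
    if seq.endswith("MEEVD"):
        return "Hsp90-like (TPR2A)"
    # The anchor is present but the upstream residues match neither tail.
    return "EEVD anchor, unassigned"
```

A call such as `classify_eevd_tail("GSGPTIEEVD")` would report an Hsp70-like tail; the sketch says nothing about binding affinity, which depends on the additional contacts discussed above.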
Structures of TPR-ligand complexes can show ligand binding mediated via an extended
conformation, as with HOP (Fig.1.15); however, ligand binding can also be
mediated by a combination of helical and extended conformations, as shown by
TPR-ligand structures such as the Pseudomonas secretion (Psc) proteins, PscG-PscE
in complex with the PscF peptide, and APC6 in complex with CDC26 (Fig.1.15) [46].
Firstly, the Psc proteins are implicated in the bacterial Type III secretory pathway,
of which PscG has three TPR motifs with a C terminal helix [46]. PscG
interacts with PscE through these TPR motifs [46]. PscF consists of two sub-domains:
an extended coil and a C terminal helix, 13 and 17 amino acids long
respectively [46]. PscG and PscE together form a “cupped-hand-like structure”, whereas
PscF interacts with PscG through its C terminal helix, which binds to the concave
binding surface of PscG, and through its N terminal region, which binds to the convex
surface of PscG (Fig.1.15) [46].
Another example of a TPR-ligand structure displaying both helical and extended
conformation in ligand binding is the APC6 protein in complex with CDC26 [46].
Both are components of the Anaphase promoting complex (APC), of which APC6
has 8 TPR motifs and a C terminal helix [46]. APC6 forms a “solenoid like
structure” encompassing the full length of the 26 amino acid N terminus of CDC26
(Fig.1.15) [46]. The bound ligand, CDC26, displays 12 amino acids in an
extended conformation and the other 14 amino acids as a helix (Fig.1.15) [46].
These examples show that TPR motifs can interact with various ligands through
different TPR binding modes.
Figure 1.15. TPR protein-ligand structures. A: TPR2A domain of Hop (highlighted
in pink) in complex with Hsp90 (highlighted in green). B: PscG-PscE dimer (highlighted in
light blue and purple respectively) in complex with PscF (highlighted in pink). C: APC6
(highlighted in light green) in complex with CDC26 (highlighted in red).
Figure taken from [46].
Another protein interaction motif is the 100 amino acid PDZ domain, which as
mentioned above is composed of six β-strands and two α-helices forming a β “sandwich”
structure (Fig.1.16), unlike TPR motifs which are alpha helical [54]. The PDZ
domain binds to the C terminus of its target protein through a four amino acid
consensus sequence, X-Thr/Ser-X-Val [54]. PDZ domain containing proteins expose
a peptide binding groove situated between a β sheet and an α helix (Fig.1.16),
which binds to the consensus sequence on the target peptide in a geometry common
to PDZ domains [54, 55]. PDZ ligand binding specificity is dependent on minor
sequence variations; however, the geometry and overall fold within the binding
region are generally well conserved [55]. Ligand binding does not cause large
structural re-arrangements to the PDZ domain, as deduced from solved PDZ
domain-ligand structures [55]. Furthermore, the mechanism of ligand recognition was
elucidated from the structures of PDZ domains with and without ligand, for
example the third PDZ domain (PDZ-3) from the brain synaptic protein PSD-95
[54]. The PDZ domain recognizes the C terminus of its target ligand, and this is
mediated through a carboxylate-binding loop found in loop L1 (Fig.1.16), which
contains four important residues, Gly-Leu-Gly-Phe (Leu and Phe are the two X
residues in this case), and hydrogen bonds are formed between residues of this loop
and the carboxyl group of the ligand [54, 55]. The glycine residues provide
structural flexibility to this loop, and an arginine residue is also present in the
binding loop, which also interacts with the carboxylate group of the ligand through
hydrogen bonds [54]. A hydrophobic pocket is also present in this fold which
recognizes hydrophobic C terminal target peptides; the hydrophobic amino acids
forming this pocket can vary between different PDZ domains, but in this example
these are Leu-323, Phe-325, Ile-327 and Leu-379 (Fig.1.16) [55]. Peptide binding
induces slight, but not large, structural re-arrangements of the fold; changes
are seen in loop L1 and the αB helix, which suggests a possible mechanism that
opens up the hydrophobic pocket present in this region (Fig.1.16) [55]. The
specificity of PDZ domains for diverse ligands is dependent on the variable amino
acids within the βA and βB strands [55].
Figure 1.16. PDZ domain structure. A: Ribbon representation of the third PDZ domain
of PSD-95, which consists of two α-helices, αA and αB (highlighted in red), and 6 β-strands
(highlighted in green) forming a β barrel structure, and 6 loops that are highlighted in blue.
B: PSD-95 bound to peptide (stick representation, highlighted in orange). Atoms highlighted
in pink are part of the hydrophobic pocket.
Figure taken from [55].
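The four amino acid C terminal consensus described above for PDZ ligands, X-Thr/Ser-X-Val, can be expressed as a simple pattern match. The following is a minimal sketch only: the function name is hypothetical, X is taken to be any residue written as a single upper-case letter, and a match indicates agreement with the consensus rather than demonstrated binding.

```python
import re

# X-[Thr/Ser]-X-Val anchored at the C terminus (end of the sequence string).
PDZ_CTERM_CONSENSUS = re.compile(r"[A-Z][ST][A-Z]V$")


def has_pdz_binding_tail(sequence):
    """Return True if the last four residues fit the X-Thr/Ser-X-Val consensus."""
    return PDZ_CTERM_CONSENSUS.search(sequence.upper()) is not None
```

For instance, a hypothetical peptide ending in QTSV fits the consensus, whereas one ending in QTSA does not, because the C terminal valine is missing.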
The SH3 domain is implicated in mediating protein interaction in various
cell-signalling pathways [56]. The SH3 domain presents a hydrophobic surface (2
hydrophobic pockets) that usually contains conserved amino acids and binds to its
peptide ligand, which adopts a left-handed polyproline type II (PPII) helical structure
and includes the PxxP (x being any other amino acid) consensus sequence [56]. The
third binding pocket, also referred to as the “specificity pocket”, is negatively
charged and binds to residues flanking the consensus sequence, commonly an
arginine or lysine present in the ligand [56]. The position of this basic residue
relative to the prolines of the consensus sequence determines the structural
orientation in which the ligand binds, either Class I or Class II, if the basic
residue is present on the N or C terminal side respectively [56, 57]. This specificity
pocket is important in increasing the affinity and specificity of the SH3-ligand
interaction, which is critical in the cellular context [56]. Specificity is provided by
additional contacts that are made between the loops of the SH3 domain and residues on
the ligand flanking the consensus sequence. It was also shown from studying structures
of free and bound SH3 domains that very little structural re-arrangement is induced
upon ligand interaction [57].
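The Class I and Class II distinction can be illustrated with a short sketch that classifies a peptide by the position of the basic residue relative to the PxxP core. It assumes the commonly quoted simplified consensus patterns, +xxPxxP for Class I and PxxPx+ for Class II (where + is arginine or lysine); the exact spacing is an assumption that is not stated in the text above, and the function name is hypothetical.

```python
import re

# Simplified consensus patterns (assumed): "+" is Arg or Lys, "." is any residue.
CLASS_I = re.compile(r"[RK]..P..P")   # basic residue N terminal of the PxxP core
CLASS_II = re.compile(r"P..P.[RK]")   # basic residue C terminal of the PxxP core
PXXP_CORE = re.compile(r"P..P")


def classify_sh3_ligand(peptide):
    """Return a list of labels describing which SH3 ligand patterns are present."""
    pep = peptide.upper()
    labels = []
    if CLASS_I.search(pep):
        labels.append("Class I")
    if CLASS_II.search(pep):
        labels.append("Class II")
    if not labels and PXXP_CORE.search(pep):
        # A PxxP core without a suitably placed basic residue.
        labels.append("PxxP core only")
    return labels
```

A peptide can in principle match both patterns, which is why a list is returned; the sketch does not model the additional loop contacts that contribute to specificity.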
1.5 Functions of various TPR proteins
TPR motif-containing proteins are involved in many aspects of cellular function, and
examples of each are given in the following sections.
1.5.1 TPR proteins involved in transcription
An important step in initiating RNA polymerase III transcription is the interaction
between Tfc4 of TFIIIC and Brf1 and Bdp1 of TFIIIB to assemble this latter factor
onto the DNA [58]. This has been shown to be the rate-limiting step in
RNA polymerase III transcription [58]. Tfc4 has 11 TPR motifs, nine of which are
organized into two clusters at the N terminus (Fig.1.17) [58]. One cluster includes
TPR1-5 and the other includes TPR6-9, and the remaining two TPR repeats are found at
the C terminus (Fig.1.17) [58]. These TPR motifs are important in mediating the Tfc4
interaction with Brf1 and Bdp1 needed for the initial assembly of TFIIIB onto DNA
[58]. Gain of function mutations within TPR1-5 increase the interaction between
Brf1 and Tfc4, whereas mutations in TPR6-9 disrupt polymerase III reporter gene
transcription and impair the interaction between Brf1 and Tfc4 [58].
Figure 1.17. Schematic diagram showing domain organisation of the TPR
Protein Tfc4. Tfc4 contains a hydrophilic domain (yellow box), two tandem TPR arrays
(red boxes), which include TPR1-5 and TPR6-9 and an intervening region (IVR, green box)
in between the two domains at the N terminus. The C terminus contains another two TPR
motifs (TPR10 and TPR11). Figure taken from [58]
The protein TTC4 is also a TPR protein and was originally identified within a gene
region implicated in breast cancer [59]. TTC4 is a nucleoplasmic protein that has been
shown to interact with Hsp70 and Hsp90 and, more recently, with the
replication initiation protein Cdc6 through its TPR motif [59]. Further research
showed that certain point mutations within this protein were detected in various
melanoma samples and that these impair the TTC4 interaction with Cdc6 [59]. However, it
has not been proven that the loss of this interaction leads to cancer, as the region
to which the mutations map could be implicated in interacting with other proteins
[59]. Nevertheless, the possibility that loss of interaction with Cdc6 leads to
melanoma is plausible, as Cdc6 is an important regulator of DNA replication [59]. An
increase in the level of TTC4 was seen in these melanoma samples, and TTC4 has been
implicated in cancer progression as increased levels of TTC4 protein are detected
in various tumour cell lines [59].
1.5.2 TPR proteins involved in the Stress Response Pathway
The molecular chaperone Hsp90 is expressed ubiquitously and has many substrates,
such as steroid hormone receptors and kinases [60]. Hsp90 is known to be involved
in the folding [60-62] and degradation of proteins [62] and in interaction with steroid
receptors [63], and is known to have ATPase activity needed for its function [64-66].
Hsp90 has two conserved motifs at its C and N termini, and the two are connected by
a charged linker [66]. Hsp90 interacts with co-chaperones containing TPR
motifs through its conserved C terminus, which includes a TPR motif recognition site,
the pentapeptide MEEVD [66]. The interaction of Hsp90 with these TPR proteins is
important for Hsp90 activity [66]. Hsp70 is also ubiquitously expressed and is
involved in protein folding and the stress response pathway [67]. Hsp70 has an
ATPase domain at its N terminus, a substrate binding domain and a C terminal
domain [65], the latter of which regulates substrate binding [64, 65]. The EEVD and
PTIEEVD sequences, located at the C terminus, have been identified as TPR recognition
sites of Hsp70 [67-68].
HBP21 is a human TPR protein whose function has not been characterized [68]. It
has 3 TPR motifs implicated in the interaction with the C terminus of Hsp70 [68]
through the EEVD and PTIEEVD sequences [67-68]. Levels of HBP21 are high in
breast cancer and proliferative vitreoretinopathy (PVR) [68]. It has been
hypothesized that HBP21 could play a part in inhibiting metastasis of tumour cells
[68].
HIP, an Hsp70 interacting protein, is 369 residues long and has recently been found
to be implicated in glucocorticoid receptor (GR) signaling; it also acts as a chaperone
[69]. Steroid receptors normally exist in a heteromeric complex with Hsp90 and other
chaperones before hormone binding [32, 69]. HIP binds to the ADP form of Hsp70,
stabilises it and subsequently promotes its interaction with other binding partners
[32, 69]. HIP enhances hormone dependent activation of GR [32, 69]. HIP is
composed of an oligomerization domain, three central TPR motifs and a highly
charged region at its N terminus (Fig.1.18), where the latter two regions are
implicated in mediating the HIP interaction with Hsp70 [32, 69]. The C terminus has a
glycine, glycine, methionine and proline (GGMP) repeat motif and a p60 homology
domain (Fig.1.18) [69]. Mutations within the TPR region of HIP impair its
interaction with Hsp70 and its enhancement of glucocorticoid receptor signaling [69].
Figure 1.18. Schematic diagram showing domain organisation of the TPR
Protein HIP. HIP contains an oligomerisation domain, three TPR motifs and a highly
charged region at its N terminus. The C terminus includes the GGMP repeat motif and the
p60 homology domain. Positions of various domains are indicated. Figure adapted from [69].
1.5.3 TPR proteins involved in Mitochondrial and Peroxisomal import
TPR proteins are found in the peroxisomal import receptor complex; these are
PAS8/PAS10/PXR1, which recognize target proteins and transport them across the
peroxisomal membrane [32, 70-72]. PAS8 has 7 TPR motifs at its C terminus [72]
and PAS10 has 8 TPR motifs, of which 7 are present at the C terminus and the other
is present at the N terminus [73]. Human PXR1 is involved in peroxisome
import and has seven TPR motifs located at its C terminus [32, 70-72]. Mutations
within the TPR motifs can interrupt protein interactions, which can affect function
and cause life threatening diseases such as peroxisome biogenesis disorder (PBD)
and neonatal adrenoleukodystrophy [32, 70, 74]. The latter is a recessive disorder
due to a mutation in the TPR domain of PXR1 and PEX5 [32, 70, 74] which
affects peroxisome assembly [32, 70].
TPR proteins are also found in the mitochondrial import receptor complex; these are
MAS70 and TOM20 [32, 75-78]. These TPR proteins recognize target proteins and
transport them across the mitochondrial membrane [32, 75-78]. MAS70 has seven
TPR motifs (Fig.1.19) which are important in transporting MAS70 to the cytoplasm
from the outer membrane of the mitochondria where it is primarily localized
[32, 75-78]. Mutations within the C terminal TPR domains still allow the transport of
MAS70 to the mitochondria; however, the protein is rendered non-functional as it
cannot aid in the transport of other target proteins across the
mitochondrial membrane [77]. TOM20 has one TPR motif, which is unusual as most
TPR proteins have 3-16 TPR motifs [33].
Figure 1.19. Schematic Diagram showing TPR Motif organisation within the TPR
Protein MAS70. TPR motifs are shown as red boxes and MAS70 has 7 TPR motifs dispersed
along the sequence. Figure adapted from [37]
1.5.4 TPR proteins involved in the progression of the cell cycle
TPR proteins are also found within the multi-subunit E3 ubiquitin ligase APC
complex [32, 37, 79-80], which is part of the RING/cullin family and is involved in the
ubiquitination of proteins to be degraded by the proteasome at various stages of the
mitotic cell cycle [79]. The APC consists of 13 subunits, of which Cdc16, Cdc23 and
Cdc27 contain 10, 9 and 10 TPR motifs respectively [32, 37, 80] (Fig.1.20). All of
the TPR motifs are important for the interaction between these three components
[32, 80]. Mutations within these motifs interrupt the protein interactions between
these three components of the APC and/or their respective functions [32, 80]. A
mutation at position 6 of the seventh TPR motif of Cdc27, and also an insertion
between positions 6 and 7, interrupts the interaction with Cdc23 [32, 35]. A mutation
within Cdc23 at position 8 in TPR motifs 5 and 7 [81], and in Cdc16 at position 20 of
TPR motif 9 [82], results in cell cycle arrest at a specific point in the mitotic cell
cycle, the metaphase to anaphase transition [81, 82]. This is because positions 8 and
20 are conserved residues of the TPR motif, so these mutations most probably
interrupt protein function through a change in structural conformation [32, 37].
Figure 1.20. Schematic Diagram showing TPR Motif organisation within
these TPR proteins. The TPR motifs are shown as red boxes; Cdc16 and Cdc27
have 10 TPR motifs dispersed along the sequence. The other TPR protein, Cdc23, has 9
TPR motifs. Figure adapted from [37].
Another TPR protein that has been identified is WISp39 [83], which is involved in
inhibiting p21 degradation [60]. Different cyclin-Cdk complexes are present at
different stages of the cell cycle [83]. Both the levels of the complexes and their
composition of specific cyclins and Cdks oscillate between the different cell cycle
phases [83]. Inhibitory proteins exist that can inhibit Cdks, one of which is p21, as it
can bind to both Cdk and cyclin at specific positions [60, 84]. DNA damage induces
p53, which activates p21 and inhibits cell cycle progression [60, 85]. Transcriptional
control of p21 is regulated through p53 dependent and independent pathways
(Fig.1.21) [60]. Levels of p21 are also regulated through post translational
modification, for example phosphorylation [86], and through degradation [60, 83]. The
N terminus of WISp39 interacts with the N terminus of p21, which is implicated in p21
ubiquitination, and this suggests that WISp39 could inhibit p21 degradation [60, 83].
WISp39 has 3 TPR motifs at its C terminus which are involved in interacting with the
C terminus of Hsp90 [60, 83]. The WISp39/Hsp90/p21 trimeric complex is involved in
inhibiting p21 degradation by increasing the stability of p21 (Fig.1.21) [60, 83]. This
trimeric complex does not increase p21 protein levels but is implicated in the
stabilisation of p21 [60, 83]. Mutations within the TPR domain of WISp39 impair its
interaction with Hsp90 and consequently the formation of the
WISp39/Hsp90/p21 trimeric complex that enables correct folding of p21 and
its stabilisation, leading to p21 degradation [83].
Figure 1.21. Regulation of p21 by p53, WISp39 and Hsp90. DNA damage
activates p53 which in turn activates transcription of p21. This leads to an
increase of unstable p21 levels, which then ultimately stabilises through its
interaction with WISp39, Hsp90 and formation of the WISp39/p21/Hsp90 tri-meric
complex. Figure taken from [83]
1.5.5 TPR proteins involved in DNA Repair
PP5 is a ubiquitously expressed serine/threonine phosphatase that can bind to
Hsp90 [87-88]. PP5 has a C terminal catalytic domain and an N terminal TPR motif
(Fig.1.22) [85]. The TPR motif of PP5 interacts with a conserved sequence at the C
terminus of Hsp90 [89] and is important in interacting with other proteins such as
CDC16 and CDC27 [90]. The TPR motif has a negative impact on PP5 activity [87,
91], although the mechanism of PP5 activation is unclear at this stage. ATM kinase is
a checkpoint kinase involved in the response to double stranded breaks [92]. ATM
activation was poorly understood until it was recently shown that PP5 is
involved in ATM activation [92]. When DNA is damaged PP5 interacts with ATM, and
ATM is auto-phosphorylated on serine 1981, which then leads to the activation and
phosphorylation of downstream targets such as p53 and Rad17 [92].
Figure 1.22. Schematic Diagram showing TPR Motif organisation within
PP5. The TPR motifs are shown as red boxes and PP5 has 3 TPR motifs and a
phosphatase domain (highlighted in green). Figure adapted from [37]
Other TPR proteins are also part of the DNA repair pathway; this was identified
by studying patients with Fanconi Anaemia (FA) [93-94]. The primary cause
of this condition is a defective DNA repair pathway, and 8 key genes
have been identified in this condition, one of which is BRCA2 [93-94]. An important
step in the FA pathway is the ubiquitylation of a lysine in FANCD2, which then allows
further steps to proceed, such as the interaction of this protein with BRCA1 [93-94].
This step requires the integrity of a multi-subunit complex of 6 FANC proteins (FANCA,
-C, -E, -F, -G and -L). No domain involved in these protein interactions had been
identified in these proteins except in FANCL, which has 2 domains, a WD40 and a RING
finger domain [93-94]. Further research was done to identify the protein interaction
domain by looking at sequence homology between different organisms for the same protein
[93-94]. Human FANCG was compared to its homologs in Oryzias latipes (Japanese
rice fish) and Danio rerio (zebra fish) [93-94]. This identified seven TPR motifs
within FANCG which are dispersed throughout this protein [93-94]. As
well as being implicated in FA core complex assembly [93], FANCG has also been
implicated in the homologous recombination repair pathway [94]. It is involved in the
latter pathway as the TPR motifs of FANCG are critical for interaction with XRCC3 and
BRCA2, which are both important components of the homologous recombination
pathway [94]. Mutational analysis identified key mutations within the TPR motifs
which lead to complete or partial loss of function of this protein [93-94]. These were
at position 8 of TPR motifs 1, 2, 5 and 6, and further research showed that these TPR
motifs are important in the interaction with FANCA [93-94]. These mutations lead to
a loss of FANCG interactions with other partners [93-94].
1.5.6 TPR proteins involved in Proteolysis
Molecular chaperones and the ubiquitin proteasome pathway (UPP) work
against each other: the UPP degrades damaged proteins, whereas molecular chaperones
such as heat shock proteins are involved in the refolding of damaged proteins [95]. The
fate of a protein depends on the activities of both of these components [95]. A very
important component that helps to decide the fate of a protein is CHIP, which is a
35 kDa co-chaperone that has ubiquitin ligase activity [95]. It has 3 domains:
a TPR domain at the amino terminus, a U-box (which has ubiquitin ligase activity) at
the opposite end, and a highly charged region in between the two
(Fig.1.23) [96]. CHIP has 3 TPR motifs, which are involved in the interaction of
CHIP with other molecular chaperones such as Hsp70 and Hsp90 needed for the quality
control mechanism [97], regulation of signaling pathways [98] and proteasomal
degradation [99].
Figure 1.23. Schematic diagram showing domain organisation of the TPR
Protein CHIP. The TPR motifs (boxed in purple) are involved with the binding to
Hsp70/90. There is a coiled coil domain indicated by an orange box in the middle. The
other end is the U-Box which has ubiquitin ligase activity. CHIP is 303 amino acids (aa)
long. Figure adapted from [99]
1.5.7 TPR proteins implicated in various other aspects of cell physiology
There are different types of post-translational modification that a protein can be
subjected to, among which is β-O-linked N-acetylglucosamine (O-GlcNAc) modification,
which is carried out by the enzyme O-GlcNAc transferase (OGT) [100-101]. Many key proteins
within the cell are subjected to this modification; these include RNA polymerases
[102], transcription factors [103] and kinases [104]. The enzyme OGT is 110 kDa
and is essential for the cell, as it has been shown that its deletion leads
to embryonic lethality [100]. OGT is a TPR enzyme which has a TPR domain at its N
terminus and a catalytic domain at its C terminus [100]. There are two isoforms of this
enzyme, mitochondrial and nucleo-cytoplasmic, with 9 and 11 TPR motifs
respectively [100]. The function of the TPR motifs is to mediate protein-protein
interactions and to maintain the integrity of the enzyme [100]. This enzyme is a trimer
and the TPR motifs are involved in OGT trimerization [100]. The TPR motifs are also
involved in determining target substrates and in targeting OGT to the transcriptional
repressor complex mSin3A [100]. The TPR motifs of OGT can interact with GRIF-1 and
OIP106, the first being a GABAA receptor-associated protein [100]. This is important for
localizing OGT to the GABAA receptor, which in turn leads to the activation of the
GABA signaling pathway [100]. The TPR motifs of OGT (TPR 2-6) interact with a
member of the OIP family, OIP106, and this interaction is important for the interaction
with the RNA polymerase II complex [100].
TPR proteins are also implicated in prostate cancer; for example, the 313 amino acid
α-SGT protein (Fig.1.24) is linked to this disease. The predominant feature of this
cancer is that the androgen receptor (AR) is hyperactive even at low levels of its
ligand, androgen [105]. It is suggested that nucleocytoplasmic shuttling of this receptor
by molecular chaperones is an important factor in the progression of this
cancer [105]. Alpha-SGT is an Hsp70/90 co-chaperone which binds to AR
specifically at a hinge region in yeast and mammalian cells [105]. This interaction
results in localization of the receptor to the cytoplasm, which leads to inhibition of
its transcriptional activity [105]. In addition, α-SGT regulates how the
receptor responds to androgen and does not allow weak agonists such as
progesterone to activate the receptor at reasonable levels of ligand [105]. In prostate
cancer a high ratio of androgen receptor to α-SGT is detected, which explains the
receptor hyperactivity observed in this cancer. Alpha-SGT has two
motifs, a central TPR motif and a small glutamine-rich motif at the carboxy terminus
(Fig.1.24) [105]. The TPR motifs mediate the interaction of α-SGT with Hsp70,
Hsp90 and the hinge region of AR [105]. This interaction is important in the
folding of the hormone binding domain of steroid receptors to a high affinity
state [105]. The glutamine-rich domain has been suggested to be involved in α-SGT
dimerization [105].
Figure 1.24. Schematic diagram showing domain organisation of the TPR
Protein α-SGT. The protein α-SGT contains a central TPR repeat domain and a small
glutamine-rich motif at its carboxy terminus. Figure adapted from [105].
More recently, the two checkpoint serine/threonine kinases Bub1 and BubR1 were also
found to contain TPR motifs at their N termini. These motifs are implicated in the
interaction with the outer kinetochore protein Knl1 through its KI motifs (12 residue
motifs), which is important for the recruitment of these checkpoint kinases to the
kinetochore (Fig.1.25) [106]. The spindle assembly checkpoint ensures that cells do not
enter anaphase until all the chromosomes are correctly attached to the mitotic spindle,
thereby ensuring correct chromosome segregation [106]. Bub1 and BubR1 are both
implicated in the spindle assembly checkpoint and chromosome alignment [106]. The
recruitment of these two proteins to the kinetochores is important for their
activation and function [106]. Their localisation is mediated by
the TPR motifs found at the N termini of Bub1 and BubR1, which interact directly
with the KI1 and KI2 motifs of Knl1 (Fig.1.25) [106]. Mutations within the TPR
motifs of Bub1 impair its interaction with Knl1, which leads to chromosome
segregation defects [106].
Figure 1.25. Domain organisation of Bub1, BubR1 and Knl1. Bub1 and BubR1
consist of TPR motifs, a Bub3-BD, a KEN box and a kinase box, as highlighted in the relevant
colored boxes shown in the figure. The outer kinetochore protein Knl1 is composed of a PP1-BD,
KI1, KI2 and a Mis12-BD domain, as highlighted in the relevant colored boxes shown in the figure.
Abbreviations: TPR, tetratricopeptide repeats; Bub3-BD, Bub3-binding domain; KEN,
KEN box; PP1-BD, protein phosphatase 1-binding domain; KI1, Bub1-binding domain 1;
KI2, BubR1-binding domain 2; Mis12-BD, Mis12-binding domain. Figure adapted from
[106].
1.6 Roles of scaffolds in signaling pathways
Most of our understanding of scaffolds comes from one of the scaffold
proteins in the MAPKinase pathway, namely Ste5, which was discovered 15 years
ago [107-108]. Research since then has identified many scaffold proteins which
exhibit properties similar to Ste5 [108-109]. Scaffolding proteins can affect
signalling cascade networks through their ability to interact with various signalling
enzymes, receptors and ion channels [108-109]. They facilitate the formation of
protein complexes and act as “signal processing hubs” [108-109].
A scaffolding protein exhibits two important properties. Firstly, scaffolds maintain
the specificity of the signalling pathway and stabilise weak interactions
between its various components [108-109]. Secondly, scaffolds act
as “catalysts” in activating the proteins within the pathway; for example, in
the MAPKinase pathway the scaffold anchors the kinases in a manner
that enhances their interactions [108-109]. A scaffolding protein is considered
to co-localize a set of proteins from the same signalling cascade to a specific
location within the cell, consequently enhancing their activation [108-110].
However, as well as localizing the regulatory proteins and holding them in close
proximity, scaffolds can also change their own conformation through interaction with
their target protein, and can change the conformation of their interacting ligand
to alter its function [108]. In summary, scaffolding proteins have various
mechanisms of action, as shown in Fig.1.26. Scaffolding proteins have to be closely
regulated, as degradation of the multi-protein scaffold complex could result in the
deactivation of that signalling cascade network [108-110].
Figure 1.26. Scaffold mechanisms. Scaffolding proteins can affect signalling cascades
using different mechanisms of action. Scaffolding proteins can A: bring proteins close
together to interact with each other; B: form different signalling scaffolding complexes, one
active protein (red triangle) can be part of two different pathways using two scaffolding
signalling complexes; C: be regulated by the signalling proteins within the scaffolding
complex without the need of regulating each individual signalling component of the
pathway; D: change the conformation of the enzyme the scaffold binds to or vice versa.
Figure taken from [108].
Another class of proteins that are functionally similar to scaffolding proteins are the
adaptor proteins; however, their functional role is limited and they can only interact
with two other proteins to facilitate their specific cellular localisation [108].
Scaffolding proteins have been discovered on an individual basis, once experiments
have shown them to interact with protein kinases, ion channels or other
proteins [108]. Since scaffolding proteins assemble multi-protein complexes via
protein-protein interactions, it may be possible to identify scaffolding proteins
through their protein interaction profiles [108]. This systematic
strategy could be implemented as reliable protein
interaction databases and discovery methods emerge [108].
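The profile-based strategy suggested in [108] can be illustrated with a minimal Python sketch. It is not a published method: the interaction list, the protein names and the threshold of three partners are all hypothetical, the threshold being chosen only because adaptor proteins are described above as interacting with two other proteins.

```python
from collections import defaultdict

# Hypothetical pairwise interactions, invented purely for illustration;
# a real analysis would draw these from a curated interaction database.
INTERACTIONS = [
    ("ScaffoldX", "KinaseA"),
    ("ScaffoldX", "KinaseB"),
    ("ScaffoldX", "KinaseC"),
    ("ScaffoldX", "ReceptorR"),
    ("AdaptorY", "KinaseA"),
    ("AdaptorY", "ReceptorR"),
]


def partner_map(pairs):
    """Build the interaction profile: protein -> set of distinct partners."""
    partners = defaultdict(set)
    for a, b in pairs:
        partners[a].add(b)
        partners[b].add(a)
    return partners


def candidate_scaffolds(pairs, min_partners=3):
    """Return (protein, partner count) for proteins with at least
    `min_partners` distinct partners, most connected first.
    The default threshold is an assumption, not a published cut-off."""
    partners = partner_map(pairs)
    hits = [(p, len(s)) for p, s in partners.items() if len(s) >= min_partners]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


print(candidate_scaffolds(INTERACTIONS))
```

Partner count alone would also flag highly connected proteins that are not scaffolds, so in practice such a list could only serve as a starting point for experimental validation.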
Initial research identified scaffolds as “passive components”; however, research since
then has shown that scaffolds play a more “active” role [108]. The scaffold multi-protein
complex creates its own micro-environment by interacting with
many proteins and enriching them in a small, specific location within the cell,
which consequently increases specificity due to the selectivity of other proteins
within the complex [108]. The scaffolding complex can also recruit
positive and negative regulators, and so can create a complex and dynamic
environment [108]. Specificity can be maintained even though the number of
signalling proteins is low, because different signalling complexes can be formed
from various combinations of the same signalling proteins (Fig.1.26) [108]. Owing to
these properties, various signalling cascades can share the same signalling
proteins [108, 111].
Scaffolding proteins regulate interactions not only within but also between different
pathways [112]. The latter type of interaction is referred to as “crosstalk” [112].
Pathways were initially seen as linear; however, evidence has shown that signalling
networks are interconnected via this crosstalk mechanism [112]. This mechanism is
important because the number of existing signalling cascades is low relative to the
range of signalling functions of the cell [112]. Hence, crosstalk between
pathways can create diverse outputs for the cell, but it has to be
closely regulated and monitored; one way of achieving this is through scaffolding
proteins [112]. In relation to cancer, crosstalking proteins are affected during
tumourigenesis, resulting in the dysfunction of the signalling cascade [112].
Due to the heterogeneous nature of scaffolding proteins it seems very unlikely that
they are derived from a common ancestor [108]. Although they contain protein
interaction domains, there is no signature motif, and function cannot be
predicted from the presence of a common protein interaction domain [108].
Even the scaffolds present in the MAPKinase pathway do not share sequence
similarity, even though the kinases they bind to do [108].
1.7 Hallmarks of Cancer
Cancer is a complex disease and tumourigenesis involves a number of steps, the
primary one being the acquisition of genetic alterations within the genome of the cell
which confer a growth advantage, ultimately leading to the
progressive transition from a normal to a cancerous cell [113]. There are more than 100
types of cancer, and various disruptions of regulatory circuits implicated in cell
proliferation and homeostasis have been detected [113]. One of the critical questions
is how many regulatory disruptions a cell can endure before it becomes
cancerous. It was initially suggested that the majority of cancers are caused by
six essential alterations within the cell, which were termed the “Six Hallmarks of
Cancer” (Fig.1.27A) [113, 114]. However, further research has identified another
four hallmarks of cancer (Fig.1.27B) [114].
Figure 1.27. The six major alterations found in most cancers. A: It has been
suggested that these six major alterations are observed in all cancer cells: self-sufficiency in
growth signals, insensitivity to anti-growth signals, tissue invasion and metastasis, limitless
replicative potential, sustained angiogenesis and evading apoptosis. B: Further research has
identified another four hallmarks of cancer: avoiding immune detection, deregulation of
cellular energetics, genome instability and mutation, and tumour-promoting inflammation.
Figure taken from [113-114].
1.8 Role and regulation of p53 in cancer
The p53 protein induces a cellular response to a variety of stress signals such as
DNA damage, hypoxia and oncogene activation [115-117]. p53 is then stabilized,
binds to DNA as a tetramer and acts as a sequence-specific transcription
factor for genes implicated in DNA repair, cell cycle arrest, senescence and
apoptosis [115-117]. This is an important response for tumour suppression as shown
by various human cancer cells where p53 is mutated, commonly in its DNA binding
domain [115-117].
Mdm2 was initially found to be a negative regulator of p53, and is implicated in p53
sub-cellular localization and stability [115-117]. In human cancers, over-expression
of Mdm2 leads to cancer progression due to its effect on p53 [115-117].
Mdm2 binds to the N terminus of p53 and leads either to its poly-ubiquitination and
consequent degradation by the proteasome, or to its mono-ubiquitination, which leads to
its nuclear export [115-117]. Mdm2 is also implicated in the regulation of p53 at the
level of mRNA translation, either indirectly through various ribosomal proteins such as L26,
or by direct binding to p53 mRNA [115-117]. This direct interaction is through the C
terminal RING domain of Mdm2 and the N terminal Mdm2 binding site of p53,
and negatively regulates p53 [115-117]. The critical p53-Mdm2 interaction is
regulated by the DNA damage response pathway, which activates
various kinases such as ATM, ATR, Chk1 and Chk2. These kinases phosphorylate
p53 and/or Mdm2, which consequently reduces p53-Mdm2
binding and the negative effects of Mdm2 [115-117].
1.9 Regulation of the actin cytoskeleton
Actin, a 42 kDa ATP-binding protein, is highly conserved and exists in two
forms: globular G-actin and filamentous F-actin, into which G-actin can assemble [118].
Actin filaments are polar, as they consist of two ends which exhibit different
properties, a dynamic (barbed) end and a less active (pointed) end (Fig.1.28) [118]. When
bound to ATP, actin polymerises at the barbed end of the filament (Fig.1.28) [118].
Filament turnover is regulated by various actin-binding proteins, which can have
diverse effects such as depleting or delivering actin monomers, or can affect
filament nucleation, elongation, capping, severing or de-polymerization [118]. The
actin cytoskeleton is implicated in morphogenesis, migration, cytokinesis and
membrane transport [118]. Actin assembly is initiated from existing filaments or by the
de novo nucleation of actin monomers. These processes are inefficient, although
factors such as the Arp2/3 complex bind to the side of actin filaments and promote
actin filament formation [118].
Figure 1.28. Process of Actin polymerisation. Actin filaments are polar,
consisting of a dynamic barbed end and a less dynamic pointed end, which during actin
polymerisation are bound to ATP and ADP respectively. Figure taken from [118].
The 220 kDa Arp2/3 complex consists of seven highly conserved polypeptides, Arp2,
Arp3 and ArpC1-5, of which Arp2 and Arp3 bind ATP [118]. This ATP binding is
critical for actin nucleation, and Arp2 has been shown to hydrolyze
ATP [118]. The Arp2/3 complex is inefficient on its own and therefore
has a variety of activators [118]. These activators, called nucleation promoting
factors (NPFs), activate the Arp2/3 complex through their WH2 domains,
amphipathic connector region and acidic peptide [118].
1.10 Cancer cell migration
Cell migration is implicated in many biological processes such as embryonic
morphogenesis, immune surveillance, tissue repair and regeneration [119-121].
However, dysregulation of cell migration can lead to cancer metastasis [119-121].
The invasion of cancer cells from a primary site to a secondary site is the process of
metastasis, which is the most common cause of death in cancer patients [119-121].
This is a very complex process and the initial step involves a response to a set of
chemo-tactic signals, followed by the protrusion of the cell membrane termed the
“leading edge”, and consequent attachment to the extracellular matrix [119-121].
These invasive cells can form various protrusive structures through the process of
actin polymerization such as filopodia, lamellipodia, and invadopodia/podosomes,
which differ in appearance, structure and function [119-121]. Lamellipodia are
dynamic structures that allow cells to migrate long distances through actin polymerisation at
the leading edge [119-121]. However, there are also lamellae, which are not as
dynamic but are also implicated in cell migration and connect actin to the myosin
II-mediated contractile machinery [119-121]. It has been shown that the latter is also
important in lamellipodia extension, and this myosin II-mediated contractility is
implicated in both actin filament disassembly at the back of the lamellipodium and the
direction of migration [119-121]. Chemo-attractants bind to cell surface receptors
which then activate a cascade of intracellular signaling pathways that are implicated
in the regulation of the actin cytoskeleton [119-121]. This invasive phenotype
requires over-expression of genes implicated in cell motility such as the WASP
family of proteins, which allows the cells to respond to the chemo-tactic signals
[119-121]. Therefore, proteins implicated in cancer cell migration are potential drug
targets for cancer therapy [119-121].
1.11 RNA polymerase transcription machinery
The recognition of nuclear gene promoters and subsequent target gene transcription
is carried out by three enzymes, RNA polymerase I, II and III [122]. Each is implicated
in the transcription of a specific set of genes, and all rely on transcription factors to
recognize the promoter sequences [122]. RNA polymerase I only transcribes large
ribosomal RNA genes [122], whereas RNA polymerase II transcribes mRNA and
small nuclear RNA (snRNA) genes [122]. Gene promoters targeted by RNA polymerase II
usually contain one core and one regulatory region [122]. The core promoter
varies between different types of genes; in general, however, it includes a
TATA box, an initiator and promoter elements [122]. Not all promoters
contain a TATA box, however, and those that lack one are referred to as “TATA-less”
promoters [122]. RNA polymerase III transcribes genes generally no longer
than 400bp, which encode structural or catalytic RNAs such as the components of
protein synthesis, splicing and tRNA processing complexes [122]. The promoters of
the RNA polymerase III target genes can be divided into three main types, two that
are known as “gene internal” and generally do not contain the TATA box (type 1 and
type 2 promoters), and one known as “gene external” (type 3 promoters) which does
contain the TATA box [122].
Transcription is initiated by the coordinated action of the general transcription
factors (GTFs), which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH, and
the RNA polymerase core enzyme, which is composed of around 10-17 core
subunits [123-124]. GTFs bind to the core promoter in an ordered manner to form
the pre-initiation complex (PIC) (Fig.1.29), and facilitate the recruitment of RNA
polymerase to the promoter and the transcription start site (TSS) [123-124]. The
most studied core promoter element is the TATA element, situated 25 bp
upstream of the TSS, which has the consensus sequence TATA(A/T)A(A/T) [123]. The
first transcription factor recruited to the promoter is the TFIID complex, which is
implicated in positioning Pol II and determining the transcription start site
[123]. TFIID consists of the TATA box binding protein (TBP), which recognizes
and binds to the TATA box, and various TBP associated factors (TAFs) [123]. TFIIF
and non-phosphorylated Pol II then bind, followed by the recruitment of TFIIE and TFIIH
[123]. Once the PIC is assembled, the next step is initiated when NTPs become
available, which leads to strand separation at the TSS to give an open complex. The
large subunit of Pol II is then phosphorylated, resulting in transcription initiation,
followed by release of Pol II from the promoter [123].
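As an illustration of how the TATA consensus quoted above can be located in a DNA sequence, the following minimal Python sketch expresses the consensus as a regular expression. The promoter fragment is hypothetical and was invented for this example, and an exact consensus match is a simplification, since real TATA elements can deviate from the consensus.

```python
import re

# Consensus from the text: TATA, then A or T, then A, then A or T.
# The lookahead makes the search report overlapping matches as well.
TATA_BOX = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(seq):
    """Return (0-based start, matched 7-mer) for every consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in TATA_BOX.finditer(seq)]


# Hypothetical promoter fragment, for illustration only.
promoter = "GGCCGCTATAAAAGGCGCGCCTGACTGCA"
print(find_tata_boxes(promoter))
```

A sequence without any match returns an empty list, which corresponds to the “TATA-less” situation at the level of this simple exact-match search.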
Figure 1.29. Process of Transcription. Schematic diagram showing the process of
transcription initiation, which involves the recruitment of the general transcription factors
(TFIID, B, F, E and H) to the DNA to form the PIC (pre-initiation complex) with the
non-phosphorylated Pol II (Pol IIA). Elongation then follows in the presence of NTPs
(nucleotide triphosphates) and phosphorylated Pol II (Pol IIO). After termination Pol II is
de-phosphorylated and the whole cycle of events can be re-initiated.
Figure taken from [123].
The formation of the PIC can also occur via an “RNA Pol II holoenzyme”
intermediate, in which RNA polymerase II is complexed with various proteins,
among which are mediators and chromatin remodeling factors, and which can bind to
promoters without the ordered assembly of the general transcription factors
described above (Fig.1.30) [125].
Figure 1.30. Formation of the Pre-initiation complex (PIC). A: The GTFs (general
transcription factors) assemble on the promoter in an ordered manner to form the PIC. B:
The PIC can also be assembled through the recruitment of the RNA Pol II holoenzyme, which
includes the GTFs, Srb/Mediator proteins (Srb/Med) and chromatin re-modelling factors
(CRFs). Bent arrow denotes transcription start site. Abbreviations: INR, Initiator element.
Figure adapted from [125].
The formation of the PIC happens only once, because when RNA Pol II is
released from the promoter, a “scaffold” structure, which includes TFIID, E and H
and mediator, remains bound to the core promoter [125-126]. This mechanism of
transcription is known to direct low-level (basal) transcription; however, transcription
can be activated by sequence-specific transcriptional activators, which bind to
specific DNA sequences, usually around 6-12 bp long, found upstream of the promoter
[125-126]. Activators can interact with the proteins implicated in the transcription
process and as a consequence enhance PIC assembly, or they can facilitate any of
the processes following PIC formation [125-126]. Activators can also act at the
level of chromatin, as chromatin can restrict access of the transcriptional machinery
to the promoter, thereby preventing the formation of the pre-initiation complex [125-126].
Activators are also regulated by co-activators, which do not have any sequence-specific
binding properties, but are recruited to specific regions of promoters
through interactions with DNA-bound activators [125-126]. They are similar to
activators in that they facilitate the formation of the PIC or modify chromatin
[125-128]. Co-activators are important for the regulation of transcription since they
exert positive or negative effects on activators [125-126]. They also mediate
transcription factor target selectivity, inducing the expression of particular subsets of
the transcription factor target genes and thus giving rise to specific physiological
outcomes [125-126, 129-130].
Co-activators can be classified into two classes, those that modify chromatin and those
that interact with RNA polymerase II and other general transcription factors [127-128].
Formation of nucleosomes suppresses transcription because the transcriptional machinery
cannot access the DNA [127-128]. For gene activation this
suppression has to be relieved through alteration of chromatin structure via
acetylation, resulting in the activation of transcription [127-128]. Conversely, histone
deacetylation results in transcriptional repression [125-126]. Eukaryotic cells contain
two classes of chromatin modifying proteins, namely the ATP-dependent chromatin-remodeling
complexes, e.g. SWI/SNF complexes, and the histone-modifying enzymes,
e.g. the histone acetyltransferases p300 and CBP [127-128]. CBP, as the name
implies, interacts with CREB, and this interaction is dependent on the
phosphorylation of CREB by cAMP-dependent protein kinase A [131]. The protein
p300 was discovered through its interaction with the adenovirus protein E1A [131].
CBP and p300 are transcriptional co-activators that bridge sequence-specific
transcription factors such as CREB activating transcription factor and p53 to the
basal transcription machinery (e.g. TATA-box binding protein (TBP) and
transcription factor IIB), and hence are implicated in RNA polymerase recruitment
and transcription initiation and activation [131-132]. Both p300 and CBP have
intrinsic histone acetyltransferase (HAT) activity and both interact with multiple
transcription regulators, making them critical integrators of various signalling pathways
[131]. Apart from histones, p300 and CBP acetylate various non-histone proteins and
transcription factors, thereby regulating transcription at multiple levels [127-128].
Formation of the PIC is dependent on two complexes, TFIID and mediator [127].
The binding of TFIID to the TATA box is a critical step in the formation of the PIC,
but this alone cannot fully activate transcription; the concerted action of both TFIID
and the mediator (a multi-subunit complex of proteins) and their direct interaction are
required for efficient PIC formation [127]. The co-ordination of chromatin
remodeling with PIC formation was studied to elucidate the mechanism
involved, and it was shown that the first step is the interaction of the mediator with
p300, resulting in chromatin acetylation [127]. The interaction of p300, mediator and
template restricts access to other cofactors including TFIID, which is critical for PIC
formation [123]. p300 is then auto-acetylated and undergoes a conformational
change, which results in its subsequent dissociation from this complex with DNA
[127]. This dissociation is enhanced further by the competitive interaction of
TFIID with the mediator, and so the concerted action of auto-acetylation and
competition with TFIID results in the dissociation of p300 and a
subsequent increase in the binding of TFIID, which is a critical event for the activation of
transcription [127]. This then results in the recruitment of the other general transcription
factors (GTFs) and formation of the PIC [127]. This is the mechanism that
co-ordinates chromatin remodeling with PIC formation [127].
In summary, transcription initiation is a critical event, closely regulated by many
proteins including transcription factors, which can activate or suppress transcription,
and repressor proteins, which can inhibit either the action of transcription
factors that facilitate transcription or the basal transcriptional machinery [125-126,
132]. Gene-specific transcriptional repression is a critical aspect of gene regulation,
which is dependent on the activity of transcriptional repressors [133]. Repressors can
be categorized according to their range of repression, i.e. short range or long range
[133]. Long-range repressors make a promoter resistant
to all enhancers, a mechanism known as “promoter silencing” [133].
Short-range repressors, in contrast, inhibit the function of
activators bound locally on the DNA rather than that of distantly bound activators
[133]. An example of long-range transcriptional repression comes from a co-repressor
protein known as Tup1 (yeast homologue), which is a tetrameric complex
[133]. It has been shown that Tup1 mediates its repression through interaction with
one of the subunits of the mediator complex, a
large multi-subunit complex that interacts with the C terminus of the large subunit of
RNA Pol II and is important for transcriptional activation [133].
Various sub-complexes of proteins assemble on promoters, but on type 2 promoters
of RNA Pol III it is TFIIIC that recognizes the promoter sequences and then
recruits TFIIIB to the transcription start site of RNA polymerase III target genes,
leading to the recruitment of Pol III [122, 134]. This recruitment is initiated by the
TPR-containing subunit TFIIIC131, which interacts with the TFIIIB-related factor
TFIIIB70/Brf1, leading to the recruitment of TFIIIB followed by TFIIIC release
[134]. TFIIIC131 contains 11 TPR motifs, and a mutation has been identified
within TPR2 of the protein [134]. This mutation
increases Pol III gene transcription through an increase in the recruitment of TFIIIB70
to the TFIIIC-DNA complex [134]. This occurs through a conformational change which
exposes a TFIIIB70 binding site [134].
Research was carried out to identify the HATs implicated in Polymerase III
mediated transcription, since p300/CBP were already known to be critical for Pol II and
Pol I transcription, as described above [128]. It was therefore hypothesized
that p300 would also be implicated in Pol III transcription activation. TFIIIC also
has HAT activity; however, a role for this HAT activity
in suppressing nucleosome formation has not been clearly shown [128].
Chromatin immuno-precipitation experiments have shown that p300 is present on
the promoters of different transcribed Pol III genes in vivo [128]. p300 was indeed
found to be a co-activator for Pol III and, as initially predicted, to be implicated in
chromatin remodeling as well as formation of the PIC [128]. The latter is through its
interaction with TFIIIC and its subsequent recruitment [128]. p300 HAT activity is
critical for transcription activation at the chromatin level, but p300 is also important
for the stabilization of the TFIIIC-DNA complex and formation of the pre-initiation
complex for the transcription of histone-free DNA templates, independent of its HAT
activity [128].
Research has identified a link between cancer and the deregulation of Pol III
transcription, as elevated expression of RNA Pol III target genes has been detected in
various cancers [135]. It has been shown that TFIIIB is one of the contributors to this
link, as high levels of TFIIIB as well as TBP, BRF1, Bdp1 and Brf2 have been
detected in various cancers [135].
1.12 STRAP (Stress responsive activator of p300)
1.12.1 STRAP discovery
The focus of this investigation is the 440 amino acid protein STRAP (Fig.1.31A),
which was discovered through a yeast two-hybrid screen as one of the components
of a multi-protein complex containing junction mediating and regulatory protein (JMY)
and p300 [136]. Sequence analysis of STRAP predicted the presence of six tandem TPR
motifs distributed throughout the protein (Fig.1.31C) [136]. More detailed sequence
alignment of the STRAP TPR motifs showed that amino acid residues at certain
positions within these motifs are conserved (Fig.1.31B) [32-33, 37, 136].
(A)
01-MMADEEEEVKPILQKLQELVDQLYSFRDCYFETHSVEDAGRKQQDVQKEM-50
51-EKTLQQMEEVVGSVQGKAQVLMLTGKALNVTPDYSPKAEELLSKAVKLEP-100
101-ELVEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNKVSLQNLSMVLRQLR-150
151-TDTEDEHSHHVMDSVRQAKLAVQMDVHDGRSWYILGNSYLSLYFSTGQNP-200
201-KISQQALSAYAQAEKVDRKASSNPDLHLNRATLHKYEESYGEALEGFSRA-250
251-AALDPAWPEPRQREQQLLEFLDRLTSLLESKGKVKTKKLQSMLGSLRPAH-300
301-LGPCSDGHYQSASGQKVTLELKPLSTLQPGVNSGAVILGKVVFSLTTEEK-350
351-VPFTFGLVDSDGPCYAVMVYNIVQSWGVLIGDSVAIPEPNLRLHRIQHKG-400
401-KDYSFSSVRVETPLLLVVNGKPQGSSSQAVATVASRPQCE-440
(B)
TPR I         069-QVLMLTGKALNVTPDYSPKAEELLSKAVKLEPEL-102-hSTRAP
TPR II        103-VEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNK-136-hSTRAP
TPR III       179-GRSWYILGNSYLSLYFSTGQNPKISQQALSAYAQ-212-hSTRAP
TPR IV        224-PDLHLNRATLHKYEESYGEALEGFSRAAALDPAW-257-hSTRAP
TPR V         332-NSGAVILGKVVFSLTTEEKVPFTFGLVDSDGPCY-365-hSTRAP
TPR VI        373-VQSWGVLIGDSVAIPEPNLRLHRIQHKGKDYSFS-406-hSTRAP
TPR Consensus     ---W--LG--Y--------A---F--A----P--
(C)
Figure 1.31. STRAP sequence and its conservation. A: hSTRAP is a 440 amino acid
protein and amino acids underlined and highlighted in different colors are part of the
predicted TPR motifs. The amino acids highlighted in green, red, blue, orange, purple and
brown are part of TPR motifs 1, 2, 3, 4, 5 and 6 respectively. B: Sequence alignments of the
6 TPR motifs of the human homologue of STRAP (hSTRAP). Eight amino acids are
conserved between the TPR motifs, amino acid 4, 7, 8, 11, 20, 24, 27 and 32 and these
amino acids are highlighted in red, which correlates with the general feature of TPR motifs.
The number at the start and end of each TPR motif indicates the residue number of the start
and end of that TPR motif. C: Distribution of the six TPR motifs within
hSTRAP; the motifs are highlighted in grey and labeled I to VI. Figure adapted from [136].
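The motif boundaries and conserved positions given in Figure 1.31 can be checked directly against the printed sequence. The following minimal Python sketch does this; the sequence, motif coordinates and consensus positions are taken from the figure and its legend, whereas the function names and the scoring of matches per motif are illustrative choices and not part of the original analysis.

```python
# hSTRAP sequence as printed in Figure 1.31A.
HSTRAP = (
    "MMADEEEEVKPILQKLQELVDQLYSFRDCYFETHSVEDAGRKQQDVQKEM"
    "EKTLQQMEEVVGSVQGKAQVLMLTGKALNVTPDYSPKAEELLSKAVKLEP"
    "ELVEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNKVSLQNLSMVLRQLR"
    "TDTEDEHSHHVMDSVRQAKLAVQMDVHDGRSWYILGNSYLSLYFSTGQNP"
    "KISQQALSAYAQAEKVDRKASSNPDLHLNRATLHKYEESYGEALEGFSRA"
    "AALDPAWPEPRQREQQLLEFLDRLTSLLESKGKVKTKKLQSMLGSLRPAH"
    "LGPCSDGHYQSASGQKVTLELKPLSTLQPGVNSGAVILGKVVFSLTTEEK"
    "VPFTFGLVDSDGPCYAVMVYNIVQSWGVLIGDSVAIPEPNLRLHRIQHKG"
    "KDYSFSSVRVETPLLLVVNGKPQGSSSQAVATVASRPQCE"
)

# Start and end residues (1-based, inclusive) from Figure 1.31B.
TPR_BOUNDS = {
    "TPR I": (69, 102), "TPR II": (103, 136), "TPR III": (179, 212),
    "TPR IV": (224, 257), "TPR V": (332, 365), "TPR VI": (373, 406),
}

# Conserved positions within a 34-residue motif and the consensus residue
# at each, as listed in the figure legend.
CONSENSUS = {4: "W", 7: "L", 8: "G", 11: "Y", 20: "A", 24: "F", 27: "A", 32: "P"}


def motif(seq, start, end):
    """Extract residues start..end (1-based, inclusive)."""
    return seq[start - 1:end]


def consensus_matches(m):
    """Count how many of the eight consensus positions a motif satisfies."""
    return sum(1 for pos, aa in CONSENSUS.items() if m[pos - 1] == aa)


for name, (start, end) in TPR_BOUNDS.items():
    m = motif(HSTRAP, start, end)
    print(f"{name:<8}{start:03d}-{m}-{end}  consensus: {consensus_matches(m)}/8")
```

Counting exact matches to the consensus residue is a deliberately strict criterion; conservative substitutions at these positions are not credited by this sketch.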
1.12.2 STRAP, p300 and JMY
STRAP forms a complex with the proteins p300 and JMY (STRAP-p300-JMY) [136].
p300 is a 300 kDa phosphoprotein discovered in 1986 through its interaction with
the E1A (adenovirus early region 1A) protein [137]. Sequence analysis revealed
63% homology between the human orthologue of p300 and the CBP (CREB binding
protein) family of proteins, with higher identity in certain
regions, namely the E1A binding site region [137]. Based on this
observation it was predicted that p300 and CBP have similar functions, which was
subsequently confirmed by further experiments [137]. As mentioned above,
both CBP and p300 are transcriptional co-activators, acting as adaptor molecules
between DNA binding factors and the transcriptional machinery [137-140]. p300 is
implicated in diverse cellular functions, among which are proliferation,
differentiation, cell cycle regulation, apoptosis and the DNA damage response
pathway [137-140]. p300 is also implicated in the p53 response, and interacts with
both positive and negative regulators of p53 [138-140]. The p53 negative regulator
Mdm2 binds to p300 through residues 102-222 of Mdm2, and mutations within this
region prevent Mdm2 from binding to p300 and, as a consequence, from degrading p53
[138-140].
JMY, a 110 kDa protein, was discovered through a yeast two-hybrid assay with a
truncated version of p300 (611-2283) and was named according to its function,
“junction mediating and regulatory protein” [141]. The JMY-p300 interaction was
confirmed to be direct, and both proteins were found to exist in a multicomponent
co-activator complex [141]. JMY sequence analysis identified a number of interesting
features, including clusters of probable Cdk phosphorylation sites at its N terminus,
a proline-rich stretch at its C terminus and a central conserved adenovirus E1A CR2
homology region [141]. JMY is a 983 amino acid protein containing two p300 binding
domains, located between amino acid residues 1-119 and 469-558 [141]. The regions of
p300 involved in the interaction with JMY were mapped to residues 611-1257 and
1572-2283 [141]. To assess whether JMY could be implicated in cancer, the chromosomal
location of its gene was determined [141]. The gene maps to chromosome 5, band
5q13.2, a region implicated in various malignancies such as leukemia [141]. Further
research is needed to clarify the role of the JMY gene in these malignancies
[141].
JMY and p300 are implicated in the regulation of the p53 response: the p300-JMY
complex has been found to enhance p53-dependent transcription, provided that the
activation domain of p53 is intact [141]. The expression levels of p53 target genes
such as Bax also increased upon expression of JMY and p300 [141]. In addition, a
complex of p53-JMY-p300 was detected in U2OS cells, indicating the formation of a
ternary complex [141]. The outcome of the increase in transcription driven by this
trimeric p53/p300/JMY complex was an increase in p53-dependent apoptosis [141].
Different isoforms of JMY have different effects on the p53 response; for example, a
JMY mutant in which the proline-rich sequence is deleted (ΔP) promotes cell cycle
arrest rather than apoptosis [141].
It is known that Mdm2 targets p53 for degradation through the process of
ubiquitination, and recent research has implicated JMY in this process [142]. It has
been shown that when DNA is damaged, an increase in JMY protein is detected
followed by an increase in p53 activity; hence JMY was identified as a DNA damage
responsive protein [142]. Furthermore, Mdm2 inhibitors were shown to increase the
levels of JMY protein, suggesting that Mdm2 negatively regulates JMY [142].
Co-expression of JMY and Mdm2 results in an increase in polyubiquitinated JMY, as
Mdm2 ubiquitinates JMY for subsequent degradation by the proteasome [142]. As a
consequence, JMY can no longer activate the p53 response [142].
1.12.3 STRAP function
STRAP is a stress responsive element: under stress, the levels of STRAP increase, as
does the interaction between p300 and JMY [136]. STRAP interacts with JMY via its
N-terminal region (1-205) and with p300 through its C-terminal region (206-438)
(Fig.1.32) [136]. The role proposed for STRAP in the JMY/p300 complex is the
stabilization of the interaction between these two proteins [136].
Figure 1.32. STRAP interaction with JMY and p300 through distinct TPR
motifs. The six TPR motifs are highlighted in grey and labeled I to VI. Residues 1-205
of hSTRAP are implicated in the interaction with JMY and residues 206-438 in the
interaction with p300. Figure adapted from [136]
In addition, STRAP has been shown to increase the half-life of p53, possibly by
blocking the interaction of the tumour suppressor with MDM2, and co-activate its
transcriptional activity [136]. Furthermore STRAP has been found to interact with
PRMT5 under conditions of DNA damage thus allowing PRMT5 recruitment to p53
under these conditions (Fig.1.33) [143]. PRMT5 methylates p53 at Arg333, Arg335
and Arg337 that are located within the oligomerization, nuclear export and nuclear
import domains of p53 (Fig.1.33), thereby regulating cell cycle [143]. Under
stressful conditions, an increase in interaction between STRAP and p53 is detected
through which the level of p53 activity is maintained in this condition [136].
Figure 1.33. STRAP and the p53 Response. STRAP allows the recruitment of
PRMT5 to p53 when DNA is damaged, resulting in the methylation of three arginine
residues on p53, Arg333, Arg335 and Arg337, by PRMT5. This in turn affects the
p53 response. Figure adapted from [143]
As mentioned above, STRAP is implicated in regulating p53, and it is also implicated
in the DNA damage response pathway [143-148]. DNA can be damaged by ionizing
radiation, which leads to the activation of specific, related
phosphatidylinositol-3-OH-kinase-like kinases, the ATM and ATR protein kinases, which
activate a signaling cascade [144-148]. This signaling cascade includes many
components, such as p53, Chk1 and Chk2 [144-145, 147-148]. p53 is a critical tumour
suppressor, as 50% of human cancers carry p53 mutations [146]. When DNA is damaged,
ATM phosphorylates STRAP at Ser203, which lies within TPR3 of STRAP, resulting in
STRAP nuclear localization [144-145, 147]. Once STRAP has localized to the nucleus,
Chk2, which acts downstream of ATM, phosphorylates STRAP at Ser221 (Fig.1.34)
[144-145, 147]. This site is not within a TPR motif but in the junctional region
between TPR3 and TPR4 [144-145, 147]. This phosphorylation event leads to STRAP
stabilization, which in turn leads to the assembly of the STRAP, p300 and JMY
complex (Fig.1.34) [144-145, 147]. This results in p53 histone acetylation and
activation of the DNA damage response (Fig.1.34) [144-145, 147].
Figure 1.34. STRAP and the DNA damage Response pathway. ATM and Chk2
phosphorylate STRAP at positions 203 and 221 respectively, which leads to STRAP nuclear
localization and stabilization respectively. This then leads to the activation of the DNA
damage response pathway.
STRAP remains in the cytoplasm in two ataxia-telangiectasia (AT) cell lines tested,
both of which have non-functional ATM [145, 147]. Also a mutant form of STRAP,
which cannot be phosphorylated by ATM, also remains in the cytoplasm [145].
Translocating STRAP to the nucleus in these conditions with defective ATM
restores STRAP stabilization and the DNA damage response [145, 147]. This shows
that nuclear STRAP plays an important role in the DNA damage response pathway
through ATM [145, 147].
Cells can be exposed to a variety of environmental stresses, and many stress
response pathways exist to enable cells to survive these conditions [148-149]. One
type of stress to which cells respond is heat shock, which leads to the activation of
a set of chaperones called the heat shock proteins [148-149]. This response pathway
involves HSF1, which under normal conditions exists as a monomer in the cytoplasm
(Fig.1.35) [148-149]. When cells are heat shocked, STRAP interacts with HSF1,
resulting in HSF1 phosphorylation and trimerization [148-149]. The STRAP/HSF1/p300
complex then binds to heat shock elements (HSEs) of target heat shock protein genes,
for example Hsp70 [148-149]. The histones of Hsp70 are then acetylated by p300,
resulting in Hsp70 transcriptional activation [148-149]. Activation of Hsp70 leads
to inhibition of apoptosis through inhibition of caspase activation and cytochrome c
release (Fig.1.35) [148-149].
Figure 1.35. STRAP and the stress response pathway. Under heat shock, STRAP
interacts with HSF1, resulting in its trimerization and phosphorylation and in the
formation of the HSF1/STRAP/p300 complex. This complex then binds to the HSEs of
target genes, for example Hsp70, causing their transcriptional activation.
Abbreviations: HSF1, Heat Shock Transcription Factor 1; HSEs, Heat Shock Elements;
Hsp70, Heat Shock Protein 70.
STRAP is also implicated in the regulation, under cellular stress, of the
glucocorticoid receptor (GR), a member of the nuclear hormone receptor family [150].
GR is activated through its interaction with glucocorticoids and lipophilic
hormones under stress [150]. Once activated, GR regulates a diverse set of genes
implicated in metabolism, inflammation and the immune response, both negatively and
positively [150]. Because of its multi-functional properties, its activity is
regulated through various mechanisms: protein stability, post-translational
modifications and interactions with various co-factors [150]. Protein stability is a
critical regulator of GR, as GR is a target for degradation through the process of
ubiquitination [150]. GR interacts with various co-factors, such as p300 and heat
shock proteins, which affect chromatin architecture, and interact with the basal
transcriptional machinery [150]. The interaction between hormone and GR causes a
significant conformational change within GR, which exposes a surface that binds to
the LXLL motif of target co-factors [150]. STRAP contains six TPR motifs [136] and
one LXLL motif situated between TPR4 and TPR5 [150]. STRAP(220-440) was shown to
interact with GR in A549 cells, and the most critical motifs in this interaction are
TPR6 and the LXLL motif [150]. STRAP is an important regulator of GR stability, as
the STRAP-GR interaction inhibits GR degradation and is critical for the
stabilization of GR, as it has been shown to increase the half-life of GR [150].
STRAP is also implicated in the regulation of GR transcription (Fig.1.36) [150].
Figure 1.36. Regulation of GR by STRAP. Under stress a trimeric
GR/Hsp/TTC5(STRAP) complex is formed, resulting in GR stabilization, possibly
through Mdm2 inhibition. After the binding of glucocorticoid to GR, the receptor
translocates to the nucleus and binds to glucocorticoid response elements (GRE) of its
target genes. STRAP(TTC5), p300, JMY and Hsp are all implicated in the regulation of
GR-associated gene transcription. Figure taken from [150]
1.13 Aims of the Project
It has become evident that TPR proteins are evolutionarily conserved and implicated
in various essential cellular functions [32]. STRAP is a predicted TPR domain
protein with the rare structural characteristic that it harbors six predicted
consecutive TPR domains throughout its sequence [136]. The six similar domains within
the sequence of STRAP may be required to allow this protein to interact with several
different binding partners and thereby take part in multiple signaling pathways, in
this way matching cellular responses to specific environmental conditions. These
domains may also be necessary to amplify particular signals within the same pathway,
giving the cell the capacity to respond to similar stimuli of varying intensity.
In accord with the role of other TPR motif proteins [32] STRAP has been shown to
interact with p300 and JMY to form a trimeric complex of STRAP/p300/JMY [136],
thereby potentially being implicated in fundamental cellular signaling pathways. To
address this, biochemical pull-down assays will be carried out to identify interacting
partners of full-length hSTRAP protein in breast cancer cells, followed by mass
spectrometry and proteomic analysis. Furthermore, the interaction pattern of
truncated hSTRAP fragments in breast cancer cells will also be investigated and
compared to full-length hSTRAP protein, with a view of mapping the region of
hSTRAP involved in hSTRAP-ligand interaction. Another aim is to structurally
characterize full-length hSTRAP and its truncated versions by NMR, X-ray
crystallography and circular dichroism. For this, a protocol first has to be
established to clone, express and purify these hSTRAP constructs in high
quantity.
2. Chapter Two. Materials and Methods
2.1 Materials
2.1.1 Chemicals and Reagents
Unless otherwise stated, all chemicals and reagents were of analytical grade. PBS
(BR0014) was purchased from Oxoid (Hampshire, UK) in tablet form. DMEM
(BE12-604F), Penicillin/Streptomycin (DE17-602E) and 0.5% (v/v) trypsin with
EDTA (BE17-161E) were supplied by Lonza (UK). FBS and L-glutamine were
obtained from GIBCO, Invitrogen (GIBCO BRL, Paisley, UK). Dimethyl sulphoxide
(DMSO), glycerol (G/0650/08), SDS (S/5200/53), NaCl (S/3160/60), DTT
(BPE172-25), Na2HPO4 (S-4400/53), KH2PO4 (P/4800/53), D-Glucose (G/0500/55),
EDTA (D/0700/53), Isopropanol (BP2618-500), Ethanol (BP2818-100) and
Agarose (BPE1356) were purchased from Fisher Scientific (Leicestershire, UK).
Pre-stained Protein marker Broad range 7-175 kDa (P7708S), 100bp DNA ladder
(N3231S) and 1kb DNA ladder (N3232S) were purchased from New England
Biolabs (Hertfordshire, UK). Protease Inhibitor cocktail (11836170001) was
purchased from Roche (West Sussex, UK). DNA hyper ladder III (BIO-33043) was
purchased from Bioline (London, UK). Bug buster (70544-3) was purchased from
Novagen (UK). PMSF (P7626-1), Bromophenol blue (B0126), 2-mercaptoethanol
(M6250), APS (A3678), TEMED (T9281), L-Glutamic acid (G251), Triton X-100
(23,472-9), Imidazole (56749), L-Glutathione reduced (G4251), Thiamine (T4625)
and Ammonium bicarbonate (A6141) were purchased from Sigma Aldrich (Poole,
Dorset, UK). Pageruler Pre-stained Protein ladder 10-170 kDa (SMO671) and Glacial
acetic acid (A/0400/PB17) were purchased from Fermentas. InstantBlue (ISB1L)
was purchased from Expedeon (Cambridgeshire, UK). Acrylamide (161-0156) and
Bradford reagent (500-0006) were purchased from Biorad (Hertfordshire, UK).
L-Arginine (104995000) was supplied by Acros Organic (Leicestershire, UK).
15N labelled ammonium chloride (299251) was purchased from Isotech (Champaign,
USA). Calcium chloride (100704Y) and Magnesium sulphate (101514Y) were
purchased from AnalaR (Leicestershire, UK). Tris (TRIS01) was purchased from
Formedium (Norfolk, UK). Gel red stain (41003) was purchased from Biotium
(Cambridge, UK). NP-40 (A2239.0100) was purchased from VWR BDH Prolabo
(UK). Trypsin, mass spectrometry grade (V5280) was purchased from Promega
(Southampton, UK). Acetonitrile (51101) was purchased from Thermo-scientific
(UK).
2.1.2 Enzymes and Kits
Phusion High fidelity DNA polymerase (M530S), dNTPs (N7552), BamHI High
fidelity (R3136T), NdeI (R0111S) and T4 DNA ligase (M0202S) were purchased from
New England Biolabs (Hertfordshire, UK). XhoI (10703770001) and alkaline
phosphatase (10108138001) were supplied by Roche (West Sussex, UK). All kits,
namely the PCR purification kit (28104), Gel Extraction kit (28704) and Mini prep
Kit (27104), were purchased from Qiagen (West Sussex, UK). DNase (107k7013) and
RNase (126k760) were purchased from Sigma Aldrich (Poole, Dorset, UK).
PreScission protease (27-0843-01) was purchased from GE Healthcare (UK).
2.1.3 Other consumables
His tag superflow Talon Resin (635506) was purchased from Clontech (France).
GST Resin (17-0756-05), Superdex 200 26/60 GL (17-1070-01) and Viva-spin 500 3
kMWCO (28-9322-18) were purchased from GE Healthcare (UK). Tissue culture
flasks, 10 cm plates and Petri dishes were obtained from Falcon (Runcorn, UK).
JSCG+ screen (130920), PACT (130918), pH Clear Strategy I (130909) and II
(130910) were purchased from Qiagen (West Sussex, UK). Morpheus (MD1-46) and
MRC 96 well crystallization plates (MD11-00-100) were purchased from Molecular
Dimensions (Suffolk, UK). The 2 µm filter (FC121) used in this study was purchased
from Appleton (UK). The 25 mm 10 kMWCO membrane (RC-25-10) was purchased
from Generon (Berkshire, UK). Snakeskin pleated dialysis tubing, 3 kMWCO
(68035) was purchased from Thermo Scientific (UK). 10 and 15 well 1.5 mm combs
were purchased from Biorad (Hertfordshire, UK). The Amicon stirred ultrafiltration
cell 8010, 10 ml (5121), was purchased from Millipore (UK). The Phoenix nano-litre
pipetting robot (600-0000-00) was purchased from Art Robbins Instruments
(USA).
2.1.4 General buffers and solutions
The compositions of all the general buffers used in this study are listed in Table 2.1.
Table 2.1. Buffer compositions.
No | Buffer | Composition
1 | TNN buffer | 50 mM Tris-HCl pH 7.4, 120 mM NaCl, 5 mM EDTA, 0.5% NP-40
2 | 3x SDS sample buffer | 187 mM Tris, 30% Glycerol, 6% SDS, 15% 2-mercaptoethanol, 0.01% bromophenol blue
3 | 50x TAE buffer | 2 M Tris, 1 M Glacial acetic acid and 50 mM EDTA
4 | Upper stacking SDS PAGE buffer | 0.5 M Tris pH 6.8, 0.4% (w/v) SDS
5 | Lower resolving SDS PAGE buffer | 1.5 M Tris pH 8.8, 0.4% (w/v) SDS
6 | His tag Lysis buffer | 50 mM Tris pH 7, 300 mM NaCl, 50 mM L-Arginine, L-Glutamic acid, 0.5% (v/v) Triton X-100, 1% (v/v) each of Protease Inhibitors, PMSF, RNase and DNase
7 | His tag Wash buffer | 50 mM Tris, 300 mM NaCl, 50 mM Arginine and Glutamic acid and 5 mM Imidazole
8 | His tag Elution buffer | 50 mM Tris, 300 mM NaCl, 50 mM Arginine and Glutamic acid, 200 mM Imidazole
9 | GST tag Lysis buffer | 50 mM Tris pH 8.7, 300 mM NaCl, 50 mM Arginine and Glutamic acid, 0.5% (v/v) Triton X-100, 1% (v/v) of Protease Inhibitors, PMSF, RNase and DNase
10 | GST tag Wash buffer | 50 mM Tris pH 8.7, 300 mM NaCl, 50 mM Arginine and Glutamic acid and 1 mM DTT
11 | GST tag Elution buffer | 50 mM Tris pH 8.7, 300 mM NaCl, 50 mM Arginine and Glutamic acid, 10 mM L-Glutathione reduced
12 | Gel filtration buffer | 50 mM Sodium phosphate buffer pH 6.5, 150 mM NaCl, 50 mM Arginine, 50 mM Glutamic acid and 10 mM β-mercaptoethanol
13 | CD buffer | 20 mM Sodium phosphate buffer, pH 6.5
14 | Minimal media solution A | 88 mM Na2HPO4 and 55 mM KH2PO4, pH 7.2
15 | Minimal media solution B | 19 mM 15N labelled ammonium chloride, 22 mM D-Glucose, 180 µM Calcium chloride and 200 µM Magnesium sulphate, 650 µL Trace elements
2.1.5 Chemically competent bacterial cells
The supplier’s details, description and antibiotic resistance of each chemically
competent bacterial cell line used in this investigation are listed in Table 2.2.
Table 2.2. Bacterial competent cells.
Cells | Supplier's details | Description | Antibiotic resistance of vector and cells
DH5α | (18265-017) Invitrogen, UK | Cells used for molecular cloning processes | Ampicillin (50 µg/µL) (v)
BL21(DE3)-RIPL | (230280) Stratagene, USA | Have extra copies of the rare codon tRNA genes | Ampicillin (50 µg/µL) (v), chloramphenicol (34 µg/µL) (c) and Streptomycin (50 µg/µL) (c)
T7 express | (C2566H) New England Biolabs | T7 polymerase under the control of the lac operon rather than the lysogenic prophage (BL21(DE3) strains) | Ampicillin (50 µg/µL) (v)
BL21(DE3)pLysS | (69451-3) Novagen, UK | Encodes a natural inhibitor of T7 polymerase, the T7 lysozyme, which suppresses expression before IPTG induction | Ampicillin (50 µg/µL) (v) and chloramphenicol (34 µg/µL) (c)
BL21(DE3) single cells | (69450-3) Novagen, UK | High level expression | Ampicillin (50 µg/µL) (v)
Rosetta-gami 2(DE3) | (71351-3) Novagen, UK | Promotes disulphide bond formation and contains the tRNAs for the rare codons | Ampicillin (50 µg/µL) (v), Tetracycline (12.5 µg/µL) (c), chloramphenicol (34 µg/µL) (c) and Streptomycin (25 µg/µL) (c)
Shuffle T7 express | (C3029H) New England Biolabs, UK | Contains chaperones to assist in the folding of protein and promotes correct disulphide formation | Ampicillin (50 µg/µL) (v), Streptomycin (25 µg/µL) (c)
Shuffle T7pLysY | (C3027H) New England Biolabs, UK | Same properties as Shuffle T7 express but encodes a lysozyme, an inhibitor to suppress expression before IPTG induction | Ampicillin (50 µg/µL) (v), Streptomycin (25 µg/µL) (c) and chloramphenicol (34 µg/µL) (c)
This table lists the supplier’s details and description of each cell line, and the
antibiotic resistance of the vectors (v) and bacterial competent cells (c) used. The
vectors used in this investigation, pET-14b and pGEX-6P1, both confer resistance to
ampicillin.
2.2 Mammalian Cell Culture
2.2.1 Cell lines
For this project, MCF-7 breast cancer cells (p53+/+) were used; these were purchased
from the European Collection of Cell Cultures (ECACC). Cells were grown in
standard medium consisting of Dulbecco's Modified Eagle's medium (DMEM), 10% (v/v)
heat inactivated fetal bovine serum (FBS), 1% 10,000 U/ml penicillin and
streptomycin (P/S) and 2 mM L-Glutamine. Cells were maintained in this medium at
37 ºC, 21% O2, 5% CO2 and 74% N2.
2.2.2 Cell passage and maintenance
Cells were grown in 75 cm² vented tissue culture flasks and regularly passaged at the
desired confluence of 70-80%. Cell culture was carried out using aseptic techniques
in Class II microbiological safety cabinets. Once desired confluence was reached,
cells were sub-cultured as follows: growth media was removed and the cell
monolayer was washed with sterile PBS. Then 2 ml of 1x trypsin/EDTA, diluted in
PBS, was added to the cells, which were incubated for 2 min at 37 ºC to aid
detachment. Complete media was added to the flask to neutralise the trypsin and an
appropriate amount of the cell suspension was transferred to a new flask containing
fresh complete media. For routine culture, all cell lines were passaged at a
dilution ratio of 1:4.
Cells were seeded into 75 cm² vented tissue culture flasks for general cell
maintenance and into 10 cm cell culture plates for biochemical pull down assays.
2.2.3 Biochemical pull down assays
Once a confluency of 60-70% was achieved on the 10 cm cell culture plates, growth
media was removed and the cell monolayer was washed twice with sterile cold PBS.
Then 150 µl of cold TNN buffer was added to each plate and the cells were scraped
off. The lysate was transferred into 2 ml Eppendorf tubes and incubated at 4 ºC for
25 min on a roller. The lysate was then centrifuged at 15,871 x g for 10 min, and the
resulting supernatant was transferred to a universal tube and thoroughly mixed.
100 µl of this supernatant was added to each resin sample. The lysate-resin sample
was incubated at 4 ºC on the roller for 1 hr and then centrifuged at 15,871 x g for
5 min. The supernatant was discarded and the pellet was washed with cold PBS and
centrifuged at 15,871 x g for 2 min; this wash was repeated three times. Then 30 µl
of 3x SDS sample buffer was added to the pellet, which was heated at 100 ºC for
5 min.
2.3 Cloning of hSTRAP constructs
2.3.1 Cloning of full length hSTRAP into pET-14b (His-hSTRAP(1-440))
Full-length hSTRAP codon optimized sequence was synthesized by GENEART in
the vector pET-14b in frame using NdeI and BamHI restriction sites. Hence, the
plasmid DNA for this construct was ready to be transformed into DH5α cells with a
view to carry out mini-preps (See Section 2.4).
2.3.2 Cloning of truncated versions of hSTRAP (His-hSTRAP) into
pET-14b
For this project truncated hSTRAP protein constructs were cloned into pET-14b,
taking into consideration both predicted structured boundaries and positions of TPR
motifs as well as the pI of the protein. Five constructs were subsequently chosen
to be cloned into pET14b: hSTRAP(1-219), hSTRAP(220-440), hSTRAP(1-150),
hSTRAP(151-284) and hSTRAP(285-440).
2.3.2.2 Primer design
Primers were designed to have a melting temperature between 60-70ºC, a GC
content of 40-60% and a GC clamp (GGC) at the 5′ end (Table 2.3). Primer design
was aided by the OligoCalc program
(http://www.basic.northwestern.edu/biotools/oligocalc.html). Table 2.3 shows all the
primers used for the cloning of the truncated versions of hSTRAP. These primers
were obtained from Sigma Aldrich in HPLC purified form.
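The design criteria above can be checked programmatically. The sketch below is illustrative only: the function names are hypothetical, and the melting temperature uses the widely quoted basic formula for oligonucleotides longer than 13 nt, Tm = 64.9 + 41 x (G + C - 16.4) / N, which is assumed here and may differ from the calculator settings actually used for Table 2.3. The example sequence is the forward primer P3-L3TPR-Fwd.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def basic_tm(seq: str) -> float:
    """Basic Tm estimate (assumed formula, valid for primers > 13 nt)."""
    seq = seq.upper()
    gc = sum(base in "GC" for base in seq)
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def check_primer(seq, tm_range=(60.0, 70.0), gc_range=(40.0, 60.0), clamp="GGC"):
    """Compare one primer against the design criteria stated in the text."""
    gc = gc_content(seq)
    tm = basic_tm(seq)
    return {
        "length": len(seq),
        "gc_percent": round(gc, 1),
        "tm_c": round(tm, 1),
        "gc_ok": gc_range[0] <= gc <= gc_range[1],
        "tm_ok": tm_range[0] <= tm <= tm_range[1],
        "has_clamp": seq.upper().startswith(clamp),  # clamp at the 5' end
    }


# Forward primer for hSTRAP(220-440), taken from Table 2.3.
p3_fwd = "GGCCATATGGCAAGCAGCAATCCGGATCTG"
report = check_primer(p3_fwd)
print(report)
```

Values computed this way are estimates and need not reproduce the figures reported in Table 2.3 exactly, since different calculators apply different salt and nearest-neighbour corrections.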
Table 2.3. PCR primers
Construct | Primer Name | Primer Sequence (5’-3’)
hSTRAP(1-219) | P1-F3TPR-Fwd | GGCCATATGATGGCCGATGAAGAAGAAGAAGTT
hSTRAP(1-219) | P2-F3TPR-Rev | GGCGGATCCTCATCATTTACGATCAACTTTTTCTGCCTGT
hSTRAP(220-440) | P3-L3TPR-Fwd | GGCCATATGGCAAGCAGCAATCCGGATCTG
hSTRAP(220-440) | P4-L3TPR-Rev | GGCGGATCCTTATTATTCACACTGCGGACGG
hSTRAP(1-150) | P5-F2TPR-Fwd | GGCCATATGATGGCCGATGAAGAAGAAGAAGTT
hSTRAP(1-150) | P6-F2TPR-Rev | GGCGGATCCTTATTAACGCAGCTGACGCAGAACCAT
hSTRAP(151-284) | P7-M2TPR-Fwd | GGCCATATGACCGATACCGAAGATGAACATAG
hSTRAP(151-284) | P8-M2TPR-Rev | GGCGGATCCTTATTACACTTTACCTTTGCTTTCCAGCA
hSTRAP(285-440) | P9-E2TPR-Fwd | GGCCATATGAAAACCAAAAAACTGCAGAGCATGC
hSTRAP(285-440) | P10-E2TPR-Rv | GGCGGATCCTTATTATTCACACTGCG
GC
content
(%)
41
40
Tm
(ºC)
57
55
41
57
66
64
68
66
64
43
46
66
66
66
44
50
70
68
68
68
68
Annealing
temp
(ºC)
68
66
PCR primers used to clone the truncated versions of hSTRAP, which contain different
combinations of TPR motifs. The GC content varies from 40-57% and the melting
temperature (Tm) varies from 64-68ºC. The annealing temperature used for the PCR
reactions for each construct is shown in the last column.
2.3.2.3 Polymerase Chain reaction
Following primer design, the PCR reactions were set up on ice using Phusion High
fidelity DNA polymerase. Each 50 µl reaction consisted of 20% (v/v) GC Buffer (the
compatible buffer for this polymerase) (10 µl), 200 μM dNTPs (5 µl), 0.5 µM each of
forward and reverse primer (1 µl each of the primers shown in Table 2.3), approx
300 pg template DNA (1 µl), 1 µl Phusion Hot Start Polymerase and 31 µl of MilliQ
water. The PCR protocol that was followed is shown in Table 2.4.
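The reaction composition can be sanity-checked with simple dilution arithmetic (C1V1 = C2V2). The sketch below uses only the volumes and final concentrations stated above; the stock concentrations it prints are implied by those numbers rather than stated in the text, and the variable and function names are illustrative.

```python
TOTAL_UL = 50.0  # final reaction volume

# Volumes (µl) of each component as listed in the text.
components_ul = {
    "GC buffer": 10.0,
    "dNTPs": 5.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "template DNA": 1.0,
    "Phusion polymerase": 1.0,
    "MilliQ water": 31.0,
}


def implied_stock(final_conc, volume_ul, total_ul=TOTAL_UL):
    """Stock concentration implied by C1 * V1 = C2 * V2 (same unit as final_conc)."""
    return final_conc * total_ul / volume_ul


total_ul = sum(components_ul.values())
buffer_fraction = components_ul["GC buffer"] / TOTAL_UL
dntp_stock_um = implied_stock(200.0, components_ul["dNTPs"])          # µM
primer_stock_um = implied_stock(0.5, components_ul["forward primer"])  # µM

print(f"Sum of component volumes: {total_ul} µl")
print(f"Buffer fraction: {buffer_fraction:.0%} (v/v)")
print(f"Implied dNTP stock: {dntp_stock_um / 1000:.1f} mM")
print(f"Implied primer stock: {primer_stock_um:.0f} µM")
```

The component volumes add up to the stated 50 µl, and 10 µl of buffer in 50 µl corresponds to the stated 20% (v/v).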
Table 2.4. PCR reaction protocol.
PCR REACTION
Step | Number of Cycles | Temperature (ºC) | Time (Secs)
1 | 1 | 98 ºC | 30
2 | 35 | 98 ºC | 30
  |    | Annealing temp (64-68ºC, See Table 2.3, Column 6) | 30
  |    | 72 ºC | 90
3 | 1 | 72 ºC | 300
4 | 1 | 10 ºC |
This table shows the PCR protocol that was used to clone the truncated hSTRAP constructs.
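As a rough guide to the run time implied by Table 2.4, the programmed hold times can be summed as sketched below. This is an estimate under stated assumptions: temperature ramping between steps is ignored, and the final 10 ºC hold is excluded because no duration is given for it. The names used are illustrative.

```python
# Cycling program from Table 2.4 as (number_of_cycles, [segment durations in s]).
# Step 4 (10 ºC hold) has no stated duration and is left out.
PROGRAM = [
    (1, [30]),           # step 1: initial denaturation at 98 ºC
    (35, [30, 30, 90]),  # step 2: 98 ºC, annealing (64-68 ºC), 72 ºC extension
    (1, [300]),          # step 3: final extension at 72 ºC
]


def programmed_seconds(program):
    """Total of all programmed hold times, excluding ramping."""
    return sum(cycles * sum(segments) for cycles, segments in program)


total_s = programmed_seconds(PROGRAM)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.0f} min), excluding ramping")
```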
2.3.2.4 PCR Purification
PCR samples were purified to eliminate all impurities such as primers, nucleotides,
enzymes, mineral oil, salts and agarose, using the Qiagen PCR purification kit. The
protocol followed and the buffers mentioned below were supplied with the kit. This
kit uses silica membrane technology. DNA binds to this membrane under high salt
conditions provided by the binding buffer, and is then eluted off the membrane under
low salt conditions.
The clean-up process involved adding 5 volumes of Buffer PBI (binding buffer) to
50 µl of the PCR reaction. The color of the mixture was checked to determine its pH:
it should be yellow after addition of the buffer, indicating a pH of less than 7.5.
If the mixture is any other color, addition of 10 µl of sodium acetate pH 5.0 is
recommended; this was not needed here as the mixture turned yellow. The sample was
then applied to the middle of the QIAquick column provided with the kit and
centrifuged for 45 sec at 15,871 x g, and the flow-through was discarded. Then
750 µl of Buffer PE (with ethanol) was applied to the middle of the column to
remove the high salt, the column was centrifuged at the same speed for 45 sec and
the flow-through was again discarded. The column was then centrifuged for 1 min to
eliminate excess ethanol. The column was placed in a clean microcentrifuge tube and
30 µl of Buffer EB (low salt buffer) was added to elute the DNA off the column. The
column was left to stand for 5 min to achieve a higher concentration of DNA, and the
eluted DNA was then transferred into a clean microcentrifuge tube.
2.3.2.5 Restriction digests
The PCR products were then digested with the enzymes BamHI and NdeI; other
restriction digestions in this project were performed using the same procedure.
For each reaction, 3 µg of sample was digested in a 50 µl reaction volume (including
the 10x buffer compatible with the enzymes) with 20 U of each enzyme. All
digestions were incubated for 3 hrs at 37 ºC.
2.3.2.6 Agarose gel electrophoresis
The digested DNA was mixed with 10x DNA loading buffer (0.25% bromophenol
blue, 0.25% xylene cyanol FF, 30% glycerol in water) and was subjected to
electrophoresis on a 1% agarose gel (1g/100ml 1 x TAE, 0.1 mg/ml ethidium
bromide). The electrophoresis was carried out in Tris-acetate/EDTA (TAE) buffer
(0.04M Tris-acetate, 0.001M EDTA) at 80 V for approximately 90 min. 5 µl of the
DNA HyperLadder (Bioline) was used to determine the size of the DNA fragments.
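The running buffer concentrations quoted above follow from diluting the 50x TAE stock listed in Table 2.1 (2 M Tris, 1 M glacial acetic acid, 50 mM EDTA). A minimal sketch of that arithmetic is shown below; the 1 L batch size is an assumed example and the names are illustrative.

```python
# 50x TAE stock composition from Table 2.1, in mM.
STOCK_50X_MM = {"Tris": 2000.0, "acetic acid": 1000.0, "EDTA": 50.0}
FOLD = 50.0


def working_concentrations(stock_mm, fold):
    """Concentrations (mM) after diluting the stock 'fold' times."""
    return {name: conc / fold for name, conc in stock_mm.items()}


def stock_volume_ml(final_volume_ml, fold):
    """Volume of concentrated stock needed for a given final volume."""
    return final_volume_ml / fold


one_x = working_concentrations(STOCK_50X_MM, FOLD)
stock_ml = stock_volume_ml(1000.0, FOLD)  # assumed 1 L batch of 1x TAE

print(one_x)
print(f"{stock_ml} ml of 50x stock made up to 1000 ml gives 1x TAE")
```

The 1x values of 40 mM Tris and 1 mM EDTA agree with the 0.04 M Tris-acetate and 0.001 M EDTA stated for the running buffer.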
2.3.2.7 Ligation
DNA ligase is an enzyme that is used to join DNA fragments together by catalysing
the formation of phosphodiester bonds between a juxtaposed 5’ phosphate and a 3’
hydroxyl terminus in duplex DNA. The T4 ligase was originally purified from T4
phage-infected E. coli cells, and uses ATP to repair single-stranded nicks in duplex
DNA and also connect duplex DNA restriction fragments that have either blunt or
cohesive ends.
Insert to vector ratios of 1:1 and 5:1 were used for the ligation reactions, with
100 ng of vector. A one-tenth volume of 10x ligation buffer (660 mM Tris-HCl, 50 mM
MgCl2, 10 mM dithiothreitol, 10 mM ATP, pH 7.5) was added to the reaction together
with 1 U DNA ligase (Roche), and sterile distilled water was added to a final volume
of 10 µl. The reaction was incubated at 16 ºC overnight and 5 µl was transformed
into competent E. coli DH5α cells. A reaction containing the same vector without any
insert was used as a control.
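Assuming the ratios above are molar ratios, the mass of insert required for a given amount of vector scales with the relative lengths of the two fragments. The sketch below illustrates this; the vector length (4671 bp, assumed for pET-14b) and the insert length (657 bp, the coding length of a 219-residue fragment without added restriction sites or stop codons) are illustrative assumptions that should be checked against the actual plasmid map and PCR product.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Insert mass (ng) giving the requested insert:vector molar ratio."""
    return molar_ratio * vector_ng * insert_bp / vector_bp


VECTOR_NG = 100.0   # stated in the text
VECTOR_BP = 4671    # assumed length of pET-14b; verify against the vector map
INSERT_BP = 657     # hypothetical insert: 219 codons x 3 bp

for ratio in (1.0, 5.0):
    mass = insert_mass_ng(VECTOR_NG, VECTOR_BP, INSERT_BP, ratio)
    print(f"{ratio:.0f}:1 insert:vector -> {mass:.1f} ng insert")
```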
2.3.3 Cloning of Full length hSTRAP into pGEX-6P1 (GST-hSTRAP(1-440))
Full-length hSTRAP with non-optimized gene sequence was originally cloned into an
HA tagged vector, pHA1, by Sandra Taylor in the laboratory of Dr Marija Kristic
Demonacos. For this project full-length hSTRAP was cloned into the GST plasmid
pGEX-6P1 (GE Healthcare, 27-4597-01; See Fig.3.5) using the methods described
above, starting from the restriction digest step (section 2.3.2.5). However,
additional steps were performed between restriction digestion and ligation:
alkaline phosphatase treatment of the vector, and gel extraction and purification,
which are described below.
2.3.3.2 Alkaline Phosphatase treatment
The GST vector, pGEX-6P1, was treated with alkaline phosphatase to prevent vector
re-ligation, in a reaction containing alkaline phosphatase at 2% of the total volume
and 10% de-phosphorylation buffer. The reaction mixture was incubated at 37 ºC for
60 min. The alkaline phosphatase was inactivated by the addition of 200 mM
EDTA (10% of the total volume) and incubation at 65 ºC for 10 min. The plasmid
was then stored at -20 ºC.
2.3.3.3 Gel Extraction and Purification
DNA isolation:
The DNA bands representing the PCR product were excised under UV light and
purified using the Qiagen gel extraction kit. This protocol relies on the ability of
the column membrane to bind DNA when the buffers provide the right salt
concentration and pH, and is based on the principle that nucleic acids adsorb to the
silica surface only when the concentration of chaotropic salts is high. Adsorption
is ~95% when the pH is ≤7.5 and is dramatically reduced at higher pH. One volume of
the excised gel was dissolved in 3 volumes of buffer QG at 50 ºC. One volume of
isopropanol was then added and the sample was transferred onto a QIAquick spin
column and centrifuged for 1 min at 17,900 x g. Buffer PE was used to wash off
contaminants and the flow-through was again discarded. The bound DNA was recovered
using 30 µl of buffer EB by centrifugation for 1 min.
2.4 Transformation of plasmid DNA into competent E. coli cells
LB broth was produced by dissolving 25 g of Luria Bertani (LB) powder in 1 L of
distilled water. The LB solution was autoclaved (15 psi, 121 ºC, 30 min). For the
preparation of LB agar plates, 15 g of agar was dissolved in 1 L of distilled H2O
containing 25 g of LB powder and autoclaved under the same conditions. For
colony selection, the antibiotic ampicillin (50 µg/ml) was added (Table 2.2) to the
mixture once it had cooled to about 50 ºC. The vectors used in this investigation,
pET-14b and pGEX-6P1, confer resistance to ampicillin (Fig.3.2 and Fig.3.5
respectively).
Preparations of competent bacteria were carried out in E. coli DH5α, a derivative of
Hanahan’s strain DH5. The DH5α strain provides a transformation efficiency of
1.5 x 10⁸ transformants/µg, which is higher than that of the DH5 parent strain.
CaCl2 was used to make DH5α cells competent. CaCl2 dissociates into Ca²⁺ and Cl⁻;
the presence of Ca²⁺ outside the cell, together with the heat shock provided during
transformation, creates intense osmotic pressure between the inner and outer sides of
the cell membrane, thus allowing plasmid DNA to permeate into the cell.
DH5α glycerol stock was streaked on an LB plate. Bacterial colonies were picked
and allowed to grow to an OD600 in 5 ml LB culture (without antibiotic). Following
centrifugation at 4062g, cells were resuspended in 500 ml ice-cold 50 mM CaCl2 and
incubated on ice for 20 minutes. The bacterial suspension was then centrifuged at
4062g for 10 min at 4°C. Pellets collected from centrifugation were resuspended in
12 ml of a sterile solution of 0.53 ml 2 M CaCl2, 1.675 ml 100% glycerol and 10.09 ml
sterile dH2O. 50 µl aliquots of competent cells were dispensed into microfuge tubes
and stored at -80°C for later use.
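The final composition of the storage solution follows from the dilution equation C1V1 = C2V2. The sketch below applies it to the volumes quoted above; the function and variable names are illustrative and not part of the original protocol.

```python
def final_concentration(stock_conc, stock_vol_ml, total_vol_ml):
    """Dilution equation: C2 = C1 * V1 / V2."""
    return stock_conc * stock_vol_ml / total_vol_ml

# Volumes quoted in the text (ml).
cacl2_ml, glycerol_ml, water_ml = 0.53, 1.675, 10.09
total_ml = cacl2_ml + glycerol_ml + water_ml

# 2 M CaCl2 stock expressed in mM; 100% glycerol expressed in % (v/v).
cacl2_mM = final_concentration(2000.0, cacl2_ml, total_ml)
glycerol_pct = final_concentration(100.0, glycerol_ml, total_ml)

print(f"total volume: {total_ml:.3f} ml")
print(f"CaCl2: {cacl2_mM:.1f} mM, glycerol: {glycerol_pct:.1f}% (v/v)")
```

The same helper can be reused for any other stock dilution in this chapter.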
Aliquots of 50 µl of competent Escherichia coli DH5α were thawed on ice, mixed
with approximately 100 ng of plasmid DNA (vector only, GST-hSTRAP,
His-hSTRAP ligation mixture or the synthesized His-hSTRAP(1-440) DNA) and
incubated for 30 min on ice. Bacteria were then heat-shocked for 1 min at 42°C and
immediately returned to the ice for 2 min. 500 µl of LB media without antibiotic was
added and the bacteria were incubated in a shaker for 1 h at 37°C. A volume of 200
µl was streaked on an LB agar plate containing ampicillin. The plates with the
transformed bacteria were inverted and incubated at 37°C for 16 h to reach the
stationary phase of their growth shown in Figure 2.3. The plates were kept inverted
overnight at 37°C, as the bacterial cells are in the exponential growth phase (Fig.2.3).
Plates were then stored at 4°C and could be used for up to 4 days. The next day,
colonies were observed on these plates and were inoculated in LB with their
required antibiotics (Table 2.2) overnight at 37°C, with shaking, for expression
studies. Note that for Shuffle T7 express and Shuffle T7pLysY all
incubations were done at 30°C rather than 37°C.
Figure 2.1. Bacterial growth curve. Bacteria take time to adjust to their new growth
(lag phase) before they can start dividing. Once bacteria have adjusted, they divide
regularly and enter their exponential growth at around 10 hrs from start of bacterial
growth. Eventually they stop dividing and enter the stationary phase, and then after
7 hrs enter the death phase. Figure taken from [151]
2.5 DNA Mini preps
Colonies obtained through transformation of plasmid DNA into DH5α cells were
inoculated in LB-Amp overnight at 37°C with shaking. These cultures
were then centrifuged at 3381g for 20 mins and the supernatant was discarded. Mini
preps were carried out using the protocol and buffers provided with the Qiagen
QIAprep Mini prep Kit. This kit also uses silica membrane technology and works on
the same principle as the other Qiagen kits mentioned in this investigation.
Cell pellets were re-suspended in 250 µl of Buffer P1, and then 250 µl of Buffer P2 was
added and mixed thoroughly. Then 350 µl of Buffer N3 was added and mixed
thoroughly, and this lysate was centrifuged for 10 mins at 15871g. The supernatant
was added to the QIAprep spin column and the column was centrifuged
for 60 secs at 15871g. The flow-through was discarded, 750 µl of Buffer
PE was added to the column and the column was centrifuged again at 15871g for 60 secs. The
flow-through was discarded and the column was centrifuged at 15871g for 60 secs to
remove any residual buffer. To elute the DNA, the column was placed in a
clean eppendorf and 50 µl of Buffer EB was added to the centre of the column. The
column was left to stand for 3 mins and then centrifuged at 15871g for 60 secs. Mini
preps were then stored at -20°C.
2.6 Sequencing
Mini preps obtained from the cloning procedures were sequenced by GATC
Biotech using their own commercial primers specific for pET-14b,
which bind to the T7 promoter and terminator regions of the vector (Fig.3.2).
Sequencing data from all successful clones are shown in the Appendix. These mini
preps were then transformed into various cell lines following the procedure
described in Section 2.4, to carry out expression trials to determine the optimum
growth conditions for soluble hSTRAP protein.
2.7 Expression trials
Extensive expression trials using various expression cell lines for each protein
construct were undertaken to determine the optimum conditions for soluble hSTRAP
protein expression. This was necessary because the competent cell strains have
different properties, which affect the expression and yield of protein (Table 2.2).
For His-hSTRAP(1-440), plasmid DNA was transformed into BL21(DE3)-RIPL, T7,
BL21(DE3)pLysS, BL21(DE3) single cells and Rosetta Gami 2(DE3). Of the five
truncated constructs of hSTRAP, three were transformed only into
BL21(DE3)pLysS: hSTRAP(1-219), hSTRAP(151-284) and hSTRAP(285-440).
The other two constructs, hSTRAP(1-150) and hSTRAP(220-440), were also
transformed into Shuffle T7 express and Shuffle T7pLysY. All transformations into
the different cell lines were carried out following the procedure described in
Section 2.4. Note that for Shuffle T7 express and Shuffle T7pLysY, growth was
carried out at 30°C throughout.
For each cell line, colonies were obtained by transformation, and on that same day
one colony was inoculated into LB media (with the respective antibiotics (Table
2.2)) and grown overnight at 37°C, with shaking. The next day 500 µl of overnight
culture was transferred into 50 mls of fresh LB media containing the required
antibiotics and the OD600 was checked. Growth was started when the OD600 was
between 0.05-0.1 and the OD600 was checked at regular intervals. Once the OD600
was between 0.5-0.7, the cells were induced with varying IPTG concentrations and
temperatures. Cells were induced at this OD as it corresponds to the exponential
growth phase of bacteria and was found to be the optimal condition for bacterial
growth (Fig.2.3). In both of the vectors used in this study, pET-14b and pGEX-6P1,
expression is inducible by IPTG (Fig.3.2 and Fig.2.1 respectively). Before cells
were induced, a pre-induction sample was taken for later SDS PAGE analysis. This
involved taking a 200 µl aliquot from the growth media and centrifuging it at 15871g
for 10 mins. The supernatant was discarded and the pellet was stored at -20°C until
it was to be lysed. Every hour for up to 4 hrs, a 200 µl aliquot was taken, centrifuged
and stored like the pre-induction sample; these were named post-induction samples.
Cell pellets obtained through expression trials then had to be lysed to analyse soluble
and insoluble hSTRAP protein expression. Pellets were kept on ice during the lysis
process and lysis was carried out as follows: 5 ml of Bugbuster per g of pellet was
used, supplemented with 1% (v/v) of Protease Inhibitor, PMSF,
DNAse and RNase. The lysate was then centrifuged at 15871g for 10 mins, and the
supernatant (soluble fraction) and pellet (insoluble fraction) were stored separately.
These samples were then analyzed for expression by SDS-PAGE: 15 µl of
sample with loading buffer (with 5% of 2-mercaptoethanol) and 5 µl of molecular
marker were loaded. In this project two protein molecular markers were used,
Prestained Protein marker Broad range 7-175kDa and Pageruler Pre-stained Protein
ladder, 10-170kDa. Gels were run at 180 volts for 50 mins, stained with
InstantBlue for 20 mins and visualised on the camera.
Glycerol stocks were made of every construct in every cell line that was tested
during these expression trials. All the constructs were first grown
with their required antibiotics (Table 2.2) overnight at 37°C with shaking. The next
day, 750 µl of culture was added to 250 µl of autoclaved 100% glycerol in an
eppendorf. This was mixed thoroughly and stored at -80°C straight away. These
glycerol stocks were used for future growths, which avoided a fresh transformation
for every growth.
2.8 SDS-PAGE Gels
In this project SDS PAGE gels of various percentages were prepared; the
volumes of 30% (w/v) acrylamide, lower resolving buffer and distilled water
used for the resolving gel are shown in Table 2.5. The stacking gel always consisted
of 6.2 mls of upper stacking buffer, 1.3 mls of 30% (w/v) acrylamide and 2.5 mls of
distilled water. The resolving and stacking solutions were prepared first, but the
polymerising agents APS and TEMED were not added until all the plates had been
assembled properly. Once the spacer and short plates were assembled as explained
in the manufacturer's guide, 50 µl of 10% (w/v) APS and 20 µl of TEMED were added
and the resolving gel was pipetted between the two plates. The resolving gel usually
set within 20 mins and then the same procedure was followed with the stacking gel,
except that a 10 or 15 well 1.5 mm comb was added to the stacking gel depending on
the number of samples to be analyzed by SDS PAGE.
Table 2.5. Resolving gel components.
SDS PAGE gel (%)    30% (w/v) Acrylamide (mls)    Distilled water (mls)    Lower buffer (mls)
7.5                 2.5                           5.0                      2.5
10                  3.3                           4.2                      2.5
12                  4.0                           3.4                      2.5
15                  5.0                           2.5                      2.5
This table shows the volumes of acrylamide, distilled water and lower buffer added to make
that specific percentage of SDS PAGE gel.
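The acrylamide volumes in Table 2.5 are consistent with diluting a 30% (w/v) stock into a total gel volume of about 10 ml. A minimal sketch of that calculation is given below; the 10 ml total is an assumption inferred from the table rows, and the function name is illustrative.

```python
def resolving_gel(percent, total_ml=10.0, stock_percent=30.0, buffer_ml=2.5):
    """Return (acrylamide_ml, water_ml, buffer_ml) for one resolving gel.

    total_ml is an assumed total volume; the table does not state it.
    """
    acrylamide_ml = percent * total_ml / stock_percent
    water_ml = total_ml - buffer_ml - acrylamide_ml
    return round(acrylamide_ml, 1), round(water_ml, 1), buffer_ml

for pct in (7.5, 10, 12, 15):
    print(pct, resolving_gel(pct))
```

Under this assumption the 12% row computes to 3.5 ml of water, whereas the table lists 3.4 ml, so the tabulated values should take precedence over the sketch.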
2.9 Large scale expression and protein purification of all
hSTRAP variants
2.9.1 Full length His-hSTRAP and truncated constructs of hSTRAP
Glycerol stocks of all His-hSTRAP protein constructs in BL21(DE3)pLysS were
scraped off using a sterile tip and transferred into 25 mls of LB with the required
antibiotics (Table 2.2). These cultures were kept overnight at 37°C, with
shaking, for no more than 16 hrs, as the cells stop dividing after that. The next day
10 mls of overnight culture was pipetted into 500 mls of LB with antibiotics. As
before, the OD600 was checked and growth was not started until the OD600 was
between 0.05-0.1. Once the OD600 had reached 0.5 (exponential bacterial growth
phase, see Fig.2.3), cells were induced with 0.1 mM IPTG for 3 hrs at 37°C for
His-hSTRAP(1-440). For all the other His-hSTRAP constructs, cells were induced at
an OD600 of 0.5 with 0.1 mM IPTG for 3 hours at 25°C. When growing Shuffle T7
express and Shuffle T7pLysY cells, cells were induced at 30°C rather than 25°C.
After 3 hrs of induction, all cells were harvested at 1583g for 35 mins.
The pellet was re-suspended in His tag lysis buffer (Table 2.1) and the cells were
lysed using the Cell Disrupter. All proteins were purified following the procedure
described below with the buffer composition shown in Table 2.1; the only
difference was the pH of the purification buffers used for each hSTRAP
construct. Purification of His-hSTRAP(1-440) was done at pH 7.4, and purification
of hSTRAP(1-219), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440) was
done at pH 8.7. Purification of hSTRAP(220-440) was done at pH 8.2.
The Cell Disrupter was first washed thoroughly with water at 15 kPSI, and then
the cells were run through the Cell Disrupter at 15 kPSI and collected. The
resulting lysate was run through the Cell Disrupter until the sample was clear, after
which 1% (v/v) of Protease Inhibitors, PMSF, RNase and DNAse was added. After
that, the machine was washed with water and 20% (v/v) Ethanol at 15 kPSI until the
resulting solution was clear. The lysate was then centrifuged at 20238g for 30 mins at
4°C.
The next step was to purify hSTRAP on the Talon column, which is used to purify
proteins with a His tag. Purification was carried out at 4°C and the first step was to
add 1.5 mls of the Talon Resin to the column. The column was then washed with
10x bed volume of distilled water and 20x bed volume of His tag wash buffer. A
10 µl sample of the resin was taken (R1) as the clean resin sample and
stored at -20°C. Once the lysate had been centrifuged after the cell lysis step, a
supernatant and a pellet sample were taken, representative of the soluble and
insoluble fractions respectively, for SDS PAGE analysis. The supernatant was then
poured through the column and the flow-through was collected. The column was
washed with 50 mls of His tag wash buffer and the flow-through was collected for
analysis. Another 10 µl resin sample was then taken (R2); this is the bound resin
sample. Once the bound resin sample had been taken, the protein was eluted with His
tag elution buffer. The elution buffer was supplemented with H-MIX
for His-hSTRAP(1-440), hSTRAP(1-219) and hSTRAP(220-440); for the latter
hSTRAP protein variant, H-MIX was added to all purification buffers. Protease
inhibitor was added to all the elutions straight after they were eluted off the column.
Another 10 µl resin sample was taken (R3) and this would be the “clean resin”
sample. The column was then washed again with 10x bed volume of His tag wash
buffer and distilled water, and stored in 20% (v/v) Ethanol at 4ºC.
Elutions and the other controls taken during the purification procedure were then
analyzed by SDS PAGE. After that, elutions shown to be pure by SDS PAGE were
pooled together and dialyzed and/or concentrated down to a smaller volume.
2.9.2 GST-hSTRAP(1-440)
Glycerol stocks of GST-hSTRAP(1-440) in BL21(DE3)pLysS were scraped off using
a sterile tip and transferred into 25 mls of LB with the required antibiotics (Table
2.2). This was kept overnight at 37°C, with shaking, for no more than 16
hrs. The next day, 10 mls of overnight culture was pipetted into 500 mls of LB with
antibiotics. As before, the OD600 was checked and growth was not started until
the OD600 was between 0.05-0.1. Once the OD600 had reached 0.5, the cells were
induced with 0.1 mM IPTG for 3 hrs at 25°C, and cells were harvested at
1583g for 35 mins.
The pellet was re-suspended in GST tag lysis buffer and lysed using the Cell
Disrupter following the procedure described in Section 2.9.1. As for the
lysis of His tag proteins, the lysate was then centrifuged at 20238g for 30 mins at
4°C.
The next step was to purify GST-hSTRAP on the GST tag affinity resin, and
purification was carried out at 4°C. 1 ml of the GST Resin was added to the
column, which was then washed with 10x bed volume of distilled water and 20x bed
volume of GST tag Wash buffer. A 10 µl sample of the resin was taken (R1)
as the clean resin sample and stored at -20°C. Once the lysate had been
centrifuged after the cell lysis step, a supernatant and a pellet sample were taken for
SDS PAGE analysis. The supernatant was then poured through the column and the
flow-through was collected. The column was washed with 20x bed volume of
GST tag Wash buffer and the flow-through was collected. Another 10 µl resin
sample was then taken (R2); this is the bound resin sample. Once the bound resin
sample had been taken, the protein was eluted with 5 x 1.5 mls of GST elution buffer.
Another 10 µl resin sample was taken (R3), and this would be the “clean resin”
sample. The column was then washed again with 10x bed volume of wash buffer and
distilled water, and stored in 20% (v/v) Ethanol at 4ºC.
The elutions were then analyzed by SDS PAGE, after which elutions were
pooled together depending on the purity shown by SDS PAGE. This sample was then
dialyzed and/or concentrated down to a smaller volume.
2.10 Determining the concentration of protein
2.10.1 Bradford reagent
The concentration of protein was determined using the Bradford Reagent Assay. In a 2
ml eppendorf tube, 800 µl of distilled water was mixed with 200 µl of Bradford
reagent. A calibration curve was plotted using BSA as the standard, and from it an
equation was derived to estimate protein
concentration. If 5, 3, 1.5 or 1 µl of eluted protein was added to the 800:200 µl
water and Bradford mix, the OD595 obtained was multiplied by 15 and
divided by 5, 3, 1.5 or 1 respectively, depending on the volume of protein taken for
the initial reading. Protease inhibitor was added to the elutions, which were stored
at 4°C until SDS PAGE analysis had verified that hSTRAP protein had been
eluted.
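The Bradford estimate described above reduces to a single expression. The sketch below encodes it as stated in the text; the function name is illustrative, and the result is in whatever units the BSA calibration used, which the text does not state.

```python
def bradford_estimate(od595, sample_volume_ul, factor=15.0):
    """Estimated concentration = OD595 * 15 / sample volume (µl).

    The factor of 15 comes from the BSA calibration described in the text.
    """
    if sample_volume_ul <= 0:
        raise ValueError("sample volume must be positive")
    return od595 * factor / sample_volume_ul

# Example reading (hypothetical value, for illustration only).
print(bradford_estimate(0.5, 5))
```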
2.10.2 Protein sample absorbance at 280nm
Firstly, the theoretical extinction coefficient at 280 nm (E280) had to be determined,
which was done using the formula below:
E280 = (No. of tryptophan residues x 5500) + (No. of tyrosine residues x 1490) +
(No. of cysteine residues x 125)
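The formula above can be evaluated directly from residue counts or from a one-letter amino acid sequence. The following is a minimal sketch that follows the formula exactly as written in the text; the function names and the toy sequence are illustrative and are not the hSTRAP sequence.

```python
def e280(n_trp, n_tyr, n_cys):
    """Theoretical extinction coefficient at 280 nm, per the formula in the text."""
    return n_trp * 5500 + n_tyr * 1490 + n_cys * 125

def e280_from_sequence(sequence):
    """Count W, Y and C in a one-letter sequence and apply the formula."""
    seq = sequence.upper()
    return e280(seq.count("W"), seq.count("Y"), seq.count("C"))

# Toy sequence for illustration only.
print(e280_from_sequence("MWWYCK"))
```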
The E280 for each hSTRAP protein variant was calculated using the software at
http://www.basic.northwestern.edu/biotools/proteincalc.html, which gave the
E280 for hSTRAP(1-440), hSTRAP(1-219), hSTRAP(220-440), hSTRAP(1-150),
hSTRAP(151-284) and hSTRAP(285-440) as 47090, 27670, 19420, 16860, 19060 and
11170 respectively. For each hSTRAP protein variant the protein concentration can
then be calculated by first measuring the absorbance at 280 nm of buffer only in a
cuvette. The reading is zeroed on that value and the absorbance of the protein
sample at 280 nm is then measured. That value is then entered into this formula to
determine protein concentration:
Concentration of protein (mg/ml) = Absorbance at 280 nm / (E280 x cuvette path
length (cm))
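The calculation above is the Beer-Lambert relation. As a hedged sketch: with a molar E280 (M^-1 cm^-1) the ratio gives a molar concentration, and expressing it in mg/ml additionally needs the molecular weight of the construct. The molecular weight used below is a placeholder, since the text does not give one, and the function names are illustrative.

```python
def molar_concentration(a280, e280, path_cm=1.0):
    """Beer-Lambert: c (mol/L) = A280 / (E280 * path length)."""
    return a280 / (e280 * path_cm)

def mg_per_ml(a280, e280, mw_da, path_cm=1.0):
    """Convert the molar concentration to mg/ml (mol/L * g/mol = g/L = mg/ml)."""
    return molar_concentration(a280, e280, path_cm) * mw_da

E280_FULL_LENGTH = 47090      # value quoted in the text for hSTRAP(1-440)
PLACEHOLDER_MW_DA = 50000.0   # assumption for illustration; not from the text

print(mg_per_ml(0.4709, E280_FULL_LENGTH, PLACEHOLDER_MW_DA))
```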
2.11 Concentration of protein to a smaller volume
2.11.1 Amicon Concentration
2.11.1.1 Concentration of protein into a buffer
The elutions that contained pure hSTRAP protein, as analyzed by SDS PAGE, were
pooled together and concentrated to a smaller volume in an Amicon. The
Amicon was assembled as described in the manufacturer's guide, and for all the
constructs cloned in this project a 25 mm 10 k membrane was used. This Amicon can
hold up to 10 mls of protein sample, which can be concentrated down to 500-1000
µl. The membrane was first washed thoroughly with distilled water and then the
Amicon was connected to the nitrogen gas cylinder. Pressure was adjusted to 30 Bar
and the Amicon was placed on the magnetic stirrer. The membrane was again washed
with ample amounts of water, at an approximate flow rate of one drop per 5 secs. Once
the membrane was washed, the pooled elutions were poured into the Amicon and
subjected to the same pressure and approximate flow rate. Once the
target volume for the protein sample was reached, the pressure
was released but the Amicon was left on the stirrer for 20 mins to wash protein off
the membrane. The concentrated protein sample was then transferred to an eppendorf
and the concentration was determined using Bradford reagent (see Section 2.10.1).
2.11.1.2 Concentration of protein along with buffer exchange in the amicon
For His-hSTRAP(1-440) and hSTRAP(1-219), elution buffer was supplemented with
H-MIX, as it was shown to be a critical addition for the stability of these protein
constructs. Dialysis was not practical for these protein constructs because the
contents of H-MIX are relatively expensive in large volumes. In these cases
successive buffer exchanges were done in the Amicon. The initial steps are the same,
but the pooled elutions are poured into the Amicon and
concentrated to 1 ml, and then 9 mls of the optimised buffer for that protein
construct is added. For His-hSTRAP(1-440) this would be 50 mM Sodium
phosphate buffer, 50 mM NaCl, Arginine and Glutamic acid, 10 mM
β-mercaptoethanol and H-MIX, and for hSTRAP(1-219) this would be pure H-MIX
only, pH 8. Once the sample had been concentrated to a tenth of its volume again,
another 9 mls of this optimised buffer was added to the Amicon. This cycle of
concentration to a tenth of the volume followed by addition of the same
amount of buffer was repeated several times, after which
the sample was concentrated to a final volume of 1 ml in the Amicon. The Amicon
was left to stir for about 20 mins without any applied pressure. The
concentrated sample was then transferred to an eppendorf and the concentration was
determined with Bradford reagent (see Section 2.10.1). If the required concentration
was not achieved through this process, the protein had to be concentrated
further with Viva-spin columns as described in the section below.
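Each exchange cycle described above dilutes the original elution buffer ten-fold (1 ml retained, 9 ml added), so the fraction of original buffer remaining after n cycles is 0.1^n. The sketch below illustrates this; it assumes ideal mixing and that all buffer components pass the membrane freely, and the function names are illustrative.

```python
def residual_fraction(rounds, concentrate_ml=1.0, added_ml=9.0):
    """Fraction of the original buffer left after a number of exchange rounds."""
    dilution = concentrate_ml / (concentrate_ml + added_ml)
    return dilution ** rounds

def rounds_needed(target_fraction, concentrate_ml=1.0, added_ml=9.0):
    """Smallest number of rounds bringing the residual at or below the target."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    rounds = 0
    while residual_fraction(rounds, concentrate_ml, added_ml) > target_fraction:
        rounds += 1
    return rounds

print(residual_fraction(3))
print(rounds_needed(0.005))
```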
2.11.2 Viva spin500 concentrators
Viva spin500 concentrators can concentrate protein down to 10 µl with very high
concentrate recovery (approx. 96%). The device consists of a vertical polyethersulfone
membrane, which prevents membrane blockage, and a thin channel filtration chamber.
With these Viva-spin500 concentrators, samples cannot be completely lost as there is
a threshold of 10 µl, beyond which the sample cannot be concentrated any further.
The OD595 of the protein was first measured, and from this the volume of
concentrate needed to achieve a concentration of around 10 mg/ml was estimated,
assuming that the concentration increases linearly as the volume decreases. This
volume was noted and the sample was concentrated to that volume. Viva-spin 500
3 kMWCO was used for all constructs, which can accommodate 500 µl of sample and
can concentrate protein down to 10 µl. Hence, 500 µl of protein sample was added to
the concentrating device and spun at 9230g. The sample was checked
regularly and concentrated to the volume that was initially noted.
Once the desired concentration was reached, the samples were analyzed on an SDS
PAGE gel to check for purity.
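The target volume estimate above assumes that the amount of protein is conserved, so that concentration scales inversely with volume. A minimal sketch under that assumption follows; the function name and the example reading are illustrative.

```python
def target_volume_ul(current_conc, current_volume_ul,
                     target_conc=10.0, min_volume_ul=10.0):
    """Volume to concentrate to, assuming no protein is lost (C1*V1 = C2*V2)."""
    if current_conc <= 0 or target_conc <= 0:
        raise ValueError("concentrations must be positive")
    volume = current_conc * current_volume_ul / target_conc
    if volume < min_volume_ul:
        # The device cannot go below its dead-stop volume (10 µl in the text).
        raise ValueError("target is below the minimum volume of the device")
    return volume

# Hypothetical example: 2 mg/ml in 500 µl, aiming for about 10 mg/ml.
print(target_volume_ul(2.0, 500.0))
```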
2.12 Gel Filtration
The Superdex 200 26/60 GL gel filtration column was used in this study. The
column was connected to the FPLC AKTA system, ensuring no bubbles were
introduced. All buffers run through the column were filter sterilized and
degassed, and the pumps were washed with the appropriate buffers required in each
case. Firstly, the column was equilibrated by washing it with 2 column
volumes of MilliQ water and then 2 column volumes of gel filtration buffer.
Once the column was equilibrated, concentrated hSTRAP protein was injected onto
the column. For Superdex 200 the maximum injection volume is 500 µl and for
Superdex 75 it is around 20 mls. The AKTA system has a manual available for that
column, which was followed. Once the run was completed, peaks were observed
on the gel filtration graph and the fractions included in these peaks were
analyzed by SDS PAGE. Once it was confirmed which peak included the required
hSTRAP protein, the pooled fractions were concentrated down to a smaller
volume as described in the sections above (see Section 2.11).
2.13 X-ray Crystallography experiments
The initial broad trials were set up in MRC 96-well plates using the
Phoenix nano-litre pipetting robot. Crystallography trials were carried out when the
concentration of His-hSTRAP(1-440) was between 12-20 mg/ml. The concentrated
sample was divided into two, where one half was supplemented with graphite and
the other was not. Graphite nano-suspension was prepared by Dr Alexander
Golovanov and was added as it was hypothesized to act as a nucleating platform
for crystal growth (Dr Alexander Golovanov, personal communication). For the
trials 12 µl of protein was needed per plate, as 0.2 µl of screen condition to 0.2 µl of
protein was used per well. The first broad range sparse matrix trial
undertaken was with the JCSG+ screen; the other commercial screens
tested were PACT, Clear Strategy I, Clear Strategy II and Morpheus (see
Section 2.1.3). Once the plates had been prepared by the Phoenix nano-litre
pipetting robot, they were checked immediately for any immediate effects of adding
protein to buffer. The plates were then stored at 20ºC and checked every day for 2
weeks and then once a week for the next 3 months.
When more specific trials were undertaken, the trial was plated out manually and the
buffers were prepared in-house, as a commercial screen was not used. The drop size
was 0.5 µl of screen to 0.5 µl of protein. The same temperature was used in all trials.
2.14 GST tag Cleavage
For both on and off column GST cleavage, 200 µl of PreScission protease was added to
either the column with GST-hSTRAP(1-440) bound to the GST affinity resin in
wash buffer, or eluted GST-hSTRAP(1-440) protein respectively. Both samples were
then kept on the roller at 4°C, and resin or eluted protein samples were taken every
hour for 3 hrs and then after overnight incubation. For both on and off column
cleavage the flow-through was collected and analyzed by SDS PAGE to determine
whether cleavage was successful.
2.15 CD experiments
Pure elutions of hSTRAP protein as well as controls (GST tag only) were first
dialyzed into CD buffer at 4°C using Snakeskin pleated dialysis tubing, 3 kMWCO.
The concentration of hSTRAP protein or tag was determined using the procedure
described in Section 2.10. A far UV spectrum over a wavelength range of 260-180
nm with a 0.1 cm pathlength was recorded at 4°C for dialysed buffer only or hSTRAP
protein sample using the JASCO J-810CD Spectropolarimeter, connected to a
temperature controller and under constant nitrogen flow. The final scan
is an average of 4 scans taken under these conditions, and spectra were corrected for
potential background signal; for GST-hSTRAP, spectra were corrected for buffer
and GST tag signal. A variable temperature experiment was then done, whereby a
reading was taken every 0.2°C from 4 to 80°C at a fixed wavelength of 220 nm at
20°C/hr. Once this experiment was completed a scan was taken again at 4°C over
the wavelength range 260-180 nm.
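The variable temperature experiment above fixes the number of readings and the run time. A small sketch of that arithmetic is shown below, using only the parameters quoted in the text; the function name is illustrative.

```python
def ramp_summary(start_c=4.0, end_c=80.0, step_c=0.2, rate_c_per_hr=20.0):
    """Return (number of readings, duration in hours) for a linear ramp."""
    span = end_c - start_c
    n_points = round(span / step_c) + 1   # readings at both end points
    duration_hr = span / rate_c_per_hr
    return n_points, duration_hr

points, hours = ramp_summary()
print(points, hours)
```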
2.16 NMR experiments
2.16.1 Expression of 15N labelled hSTRAP protein
This growth is very similar to unlabelled growth; the only difference is that growth is
done in 15N labelled minimal media rather than LB media. Minimal media consists of
solutions A and B (Table 2.1), of which solution A was prepared first and then
autoclaved. Solution B was dissolved in 20 mls of MilliQ water and filter
sterilized through a 2 µm filter. This was added to solution A after it had been
autoclaved, the mixture was thoroughly mixed, and 500 ml of that media
was poured into a 2 litre conical flask using aseptic techniques. Ampicillin was then
added to the media.
The protocol for expression and growth in labelled media was the same as for
unlabelled growth of these truncated constructs of hSTRAP (see Section 2.9.1).
2.16.2 Acquiring of NMR spectra
All experiments were carried out at 30°C unless stated otherwise, on Bruker
600MHz Avance DRX spectrometers equipped with a cryoprobe. Protein samples
were supplemented with 10% D2O. 1H 1D and 2D 1H-15N heteronuclear
single-quantum coherence (HSQC) spectra were acquired using a WATERGATE pulse
sequence for water signal suppression. SOFAST-HMQC 1H-15N correlation spectra
were acquired on a Bruker DRX700 spectrometer.
2.17 Mass spectrometry experiments
The mass spectrometry experiments were carried out by the Mass spectrometry
facility at the University of Manchester. The method they implemented is as follows:
Digestion:
Bands of interest were excised from the gel and dehydrated using acetonitrile
followed by vacuum centrifugation. Dried gel pieces were reduced with 10 mM
dithiothreitol and alkylated with 55 mM iodoacetamide. Gel pieces were then
washed alternately with 25 mM ammonium bicarbonate followed by acetonitrile.
This was repeated, and the gel pieces dried by vacuum centrifugation. Samples were
digested with trypsin overnight at 37 °C.
Mass Spectrometry:
Digested samples were analysed by LC-MS/MS using an UltiMate® 3000 Rapid
Separation LC (RSLC, Dionex Corporation, Sunnyvale, CA) coupled to a LTQ
Velos Pro (Thermo Fisher Scientific, Waltham, MA) mass spectrometer.
Peptides were concentrated on a pre-column (20 mm x 180 μm i.d, Waters). The
peptides were then separated using a gradient from 99% A (0.1% FA in water) and
1% B (0.1% FA in acetonitrile) to 25% B, in 45 min at 200 nL min^-1, using a 75 mm
x 250 μm i.d. 1.7 μm BEH C18 analytical column (Waters). Peptides were selected
for fragmentation automatically by data-dependent analysis.
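The solvent composition at any time during the separation can be sketched as below. The text gives only the start and end points of the gradient, so the linear shape between them is an assumption, and the function name is illustrative.

```python
def percent_b(t_min, start_b=1.0, end_b=25.0, duration_min=45.0):
    """Percentage of solvent B at time t, assuming a linear gradient."""
    fraction = min(max(t_min / duration_min, 0.0), 1.0)  # clamp to the run
    return start_b + (end_b - start_b) * fraction

for t in (0, 15, 30, 45):
    print(t, percent_b(t), 100.0 - percent_b(t))  # time, %B, %A
```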
Data Analysis:
Data produced were searched using Mascot (Matrix Science UK), against the full
database. Data were validated using Scaffold (Proteome Software, Portland, OR).
Proteins that were not found to bind to the control (Tag only), and detected either
twice or more with 2 unique peptides (with an 80% peptide probability) and a
scaffold probability of over 95% in pull downs with hSTRAP protein variants were
identified as hSTRAP interacting proteins.
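The acceptance criteria above can be expressed as a simple filter. The sketch below reads "detected either twice or more with 2 unique peptides" as at least two detections and at least two unique peptides; that reading, the field names and the example hits are all assumptions made for illustration and are not data from this study.

```python
def is_interactor(hit, min_detections=2, min_unique_peptides=2,
                  min_peptide_prob=0.80, min_protein_prob=0.95):
    """Apply the acceptance criteria described in the text to one protein hit."""
    return (
        not hit["in_tag_only_control"]
        and hit["detections"] >= min_detections
        and hit["unique_peptides"] >= min_unique_peptides
        and hit["peptide_probability"] >= min_peptide_prob
        and hit["protein_probability"] > min_protein_prob
    )

# Hypothetical hits for illustration only.
hits = [
    {"name": "PROT_A", "in_tag_only_control": False, "detections": 2,
     "unique_peptides": 3, "peptide_probability": 0.92, "protein_probability": 0.99},
    {"name": "PROT_B", "in_tag_only_control": True, "detections": 3,
     "unique_peptides": 4, "peptide_probability": 0.95, "protein_probability": 0.99},
    {"name": "PROT_C", "in_tag_only_control": False, "detections": 1,
     "unique_peptides": 2, "peptide_probability": 0.90, "protein_probability": 0.97},
]

interactors = [h["name"] for h in hits if is_interactor(h)]
print(interactors)
```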
2.18 Building the hSTRAP interactome network
The UNIPROT ID of each hSTRAP interacting partner was submitted to the DAVID
bioinformatics software to assign these proteins to their respective
pathways. DAVID bioinformatics was found at http://www.david.abcc.ncifcrf.gov/. The
gene names of the hSTRAP interacting proteins implicated in these pathways
were then submitted to the GeneMANIA and String 9.0 bioinformatics software, found
at http://www.genemania.org/ and http://string-db.org/ respectively. These two
programs determine the interaction status between each pair of proteins shown:
whether it is a direct interaction proven by experiments, predicted, or derived from
text mining. The interaction status of every protein pair was then noted in Excel and
an interaction network was built based on these results using Cytoscape, found at
http://www.cytoscape.org/images/top_slides/cytoscapeDesktop1.png.
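The step from a table of noted interactions to a network can be sketched as follows. This is an illustrative outline only: the protein names and evidence labels are hypothetical, and the output uses the three-column "source, interaction type, target" layout of Cytoscape's simple interaction format (SIF), which is one possible way of loading such a table and is not stated in the text.

```python
from collections import defaultdict

def build_network(edges):
    """Build an undirected adjacency map {protein: {partner: evidence}}."""
    graph = defaultdict(dict)
    for source, target, evidence in edges:
        graph[source][target] = evidence
        graph[target][source] = evidence
    return dict(graph)

def to_sif(edges):
    """Format edges as tab-separated 'source, interaction type, target' lines."""
    return "\n".join(f"{s}\t{evidence}\t{t}" for s, t, evidence in edges)

# Hypothetical interaction table for illustration only.
edges = [
    ("hSTRAP", "PROT_A", "experimental"),
    ("hSTRAP", "PROT_B", "predicted"),
    ("PROT_A", "PROT_B", "text_mining"),
]

network = build_network(edges)
print(sorted(network["hSTRAP"]))
print(to_sif(edges))
```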
3. Chapter three. Results
3.1 Expression and purification of full length and truncated
forms of hSTRAP protein
The aims of this project were to identify interacting partners of full-length hSTRAP
and its truncated variants, and to map regions of hSTRAP implicated in ligand
interactions related to breast cancer. Another aim was to structurally characterize
full-length hSTRAP and its truncated versions by NMR, X-ray crystallography and
circular dichroism. To achieve both aims, purified homogeneous protein is needed,
ideally in a tagged form that enables its easy attachment to
the affinity resin. Therefore, full-length hSTRAP was cloned into two plasmids,
pET-14b and pGEX-6P1, for expression with a 6-histidine tag and a GST tag respectively.
Both protein constructs can then be structurally characterized as well as used to
identify interacting partners of hSTRAP. Interaction data on full-length hSTRAP
from two different vector systems would give an indication of the reproducibility and
reliability of the interaction data. Furthermore, truncated versions of hSTRAP
covering different regions of hSTRAP and including different TPR motif
combinations were also cloned into pET-14b. These truncated constructs will be
used for mapping the regions of hSTRAP responsible for the protein interactions
identified, and potentially for solving the structure of shorter fragments of hSTRAP.
The following sections focus on the establishment of a protocol to clone, express
and purify hSTRAP protein variants, both bound to the affinity resin and in the
elutions. The expression and protein purification protocol was extensively optimized
to obtain pure hSTRAP protein, as explained in more detail in the
following sections.
3.1.1 Cloning, expression and purification of full length hSTRAP into
pET14b (His-hSTRAP(1-440))
The codon-optimized full-length hSTRAP sequence was synthesized in frame with NdeI
and BamHI restriction sites by GENEART in the vector pET-14b (Fig.3.1). This
codon optimization step was done to ensure that no rare codons were present in the
hSTRAP sequence, as these could potentially affect expression and consequently the
yield of hSTRAP protein obtained in E. coli. The structural aspect of this project
requires a high concentration of hSTRAP protein, and hence any factors that could
potentially affect expression of protein were considered.
Figure 3.1. The pET-14b Vector. The pET-14b vector has the His tag at its N terminus,
and is ampicillin resistant. Protein expression is inducible by IPTG.
Independent plasmid DNA sequencing confirmed the identity of the plasmid
construct pET-14b-His-hSTRAP(1-440) (See Appendix) and expression trials were
then initiated.
Initial experiments showed that His-hSTRAP(1-440) protein does not express readily
in soluble form. Since large quantities, in the order of several milligrams, of pure
stable His-hSTRAP(1-440) protein would be needed to carry out structural studies,
extensive trials were undertaken to determine the optimum conditions for
soluble His-hSTRAP(1-440) protein expression. Typically, crystallization trials
require protein solutions of 10-20 mg/ml.
Several E. coli expression cell lines were tested: BL21(DE3)-RIPL,
Rosetta-gami 2(DE3), T7, BL21(DE3) Single cells and BL21(DE3)pLysS (Table
2.2). IPTG concentrations of 0.1, 0.25, 0.5, 0.75 and 1 mM and
temperatures ranging from 37 to 30ºC were screened. Induction time, cell
density (OD600 of cell media) at the time of induction, type of media and rate of
shaking (RPM) were also investigated.
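The screening space described above can be written out explicitly. The following is a minimal sketch only: it assumes a full factorial design over the five cell lines, the five IPTG concentrations and the two temperatures that head the columns of Table 3.1 (37 and 30ºC), whereas Table 3.1 suggests that not every combination was actually tested. The names CELL_LINES, IPTG_MM, TEMPS_C and screening_grid are hypothetical and are not taken from the thesis.

```python
from itertools import product

# Values taken from the text above; the full factorial layout is an assumption.
CELL_LINES = [
    "BL21(DE3)-RIPL",
    "Rosetta-gami 2(DE3)",
    "T7",
    "BL21(DE3) Single cells",
    "BL21(DE3)pLysS",
]
IPTG_MM = [0.1, 0.25, 0.5, 0.75, 1.0]  # inducer concentrations (mM)
TEMPS_C = [37, 30]                     # post-induction temperatures (deg C)


def screening_grid():
    """Return every cell line / IPTG / temperature combination as a dict."""
    return [
        {"cell_line": cell, "iptg_mM": iptg, "temp_C": temp}
        for cell, iptg, temp in product(CELL_LINES, IPTG_MM, TEMPS_C)
    ]


grid = screening_grid()
print(len(grid), "candidate conditions")
```

Further variables mentioned in the text (induction time, OD600 at induction, media and shaking rate) could be added as extra factors in the same way, at the cost of a rapidly growing number of conditions.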
The first expression cell line tested was BL21(DE3)-RIPL, which contains a plasmid
encoding tRNAs for rare codons (Table 2.2). The best soluble His-hSTRAP(1-440)
expression in this cell line was obtained by induction with 0.5 mM IPTG, followed
by 3 hrs incubation at 37ºC (Table 3.1). Purification yielded low quantities, in the
order of 0.4 mg of His-hSTRAP(1-440) protein from 1 litre of E. coli culture, with less
than 40% protein purity, as many contaminants were detected in elutions (Table 3.1).
The next cell line tested was Rosetta-gami 2(DE3), which, as well as having a
plasmid encoding tRNAs for rare codons, also contains mutations in the thioredoxin
reductase (trxB) and glutathione reductase (gor) genes to promote disulphide bond
formation (Table 2.2). The best conditions for soluble expression of His-hSTRAP(1-440)
protein in this cell line were identified as induction with 0.25 mM IPTG, followed by
3 hrs incubation at 37ºC (Table 3.1). Purification of His-hSTRAP(1-440) was
also unsuccessful in this cell line, as many contaminants were detected in elutions
(Table 3.1).
The next cell line tested was another BL21 derivative, T7, which has the T7 RNA
polymerase under the control of the lac promoter; unlike BL21(DE3), this cell
line has the T7 RNA polymerase on a lysogenic prophage, although the latter is
dormant (Table 2.2). High expression of His-hSTRAP(1-440) protein was detected
in samples representative of the soluble fraction when protein expression was
induced with 1 mM IPTG, followed by 3 hrs incubation at 37ºC (Table 3.1).
Purification of His-hSTRAP(1-440) was, however, unsuccessful, as only a cleaved
fragment of hSTRAP was purified rather than full length hSTRAP (Table 3.1).
The next cell line tested was BL21(DE3)pLysS, which also encodes the T7 RNA
polymerase but additionally contains the pLysS plasmid (Table 2.2). This plasmid encodes
an inhibitor of T7 RNA polymerase, the T7 lysozyme, which suppresses protein
expression before IPTG induction (Table 2.2); such pre-induction expression had been observed in
previous expression trials with this protein construct. The best conditions for soluble
His-hSTRAP(1-440) protein expression were identified as induction with 0.25 mM IPTG,
followed by 3 hrs incubation at 37ºC (Table 3.1). The highest yield of
His-hSTRAP(1-440) protein was obtained in this cell line (<3 mg/litre); however, to
maintain the pLysS plasmid an extra antibiotic is needed (Table 2.2), which
could potentially slow cell growth and lower the yield of His-hSTRAP(1-440)
protein. BL21(DE3) Single cells were therefore tested to determine if this was the
case. The best His-hSTRAP(1-440) protein expression conditions in this
cell line were identified as induction with 0.25 mM IPTG, followed by 3 hrs incubation at 37ºC
(Table 3.1). Purification yielded both full length hSTRAP and a cleaved fragment of hSTRAP
(Table 3.1). The yield of full length hSTRAP was higher in BL21(DE3)pLysS than in
BL21(DE3) Single cells; hence, BL21(DE3)pLysS was determined to be the
optimum cell line for expression of the highest amounts of soluble His-hSTRAP(1-440) protein.
In summary, all cell lines tested expressed His-hSTRAP(1-440) protein to varying
extents, as shown by samples analyzed by SDS PAGE, representative of soluble,
insoluble and total expression. For all expression trials, LB media and a post-induction
time of 3 hrs were found to be the best conditions. All E. coli cells were induced at an
OD600 of 0.5, as this was found to be the optimal cell density for induction of
high soluble His-hSTRAP(1-440) protein expression. The finalized optimum
conditions for soluble His-hSTRAP(1-440) protein expression were
BL21(DE3)pLysS cells, induced with 0.25 mM IPTG, followed by 3 hrs incubation
at 37ºC.
Table 3.1. Expression trials and purification of His-hSTRAP(1-440).
Cell Line
IPTG
Temperature
Temperature
concentrations 37ºC
30ºC
(mM)
Soluble
InSoluble
InFraction soluble
Fraction soluble
Fraction
Fraction
BL21(DE3)RIPL
0.25
+
+
+
0.5
+
+
Purification of
Full
length
hSTRAP
(50kDa)
LY<0.4 mg
per litre, >40%
purity
Rosetta Origami
2(DE3)
T7
BL21(DE3)pLysS
BL21 Single cells
1
0.25
0.5
0.75
1
0.25
0.5
0.75
1
0.1
0.25
++
++
++
+
+
+
++
+++
++
+++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-
+
+
+
+
+
+
+
++
++
+
+
0.5
0.75
1
0.1
0.25
0.5
0.75
1
+
+
+
++
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
MC
CF
HY<3
mg
per litre, <50%
purity
>50% purity
The different conditions used to express and purify full length hSTRAP protein are shown in
this table. The (+) (++) and (+++) signs indicate low, medium and high His-hSTRAP(1-440)
protein expression respectively. The best conditions of His-hSTRAP(1-440) protein
expression using each different cell line are highlighted in red. These latter conditions were
then used to purify full-length protein. Successful His-hSTRAP(1-440) protein purification
is indicated by (), and unsuccessful by (). Protein expression and purification trials were
repeated at least twice. Abbreviations: HY, highest yield; LY, lowest yield; MC,
many contaminants yielded; CF, cleaved fragment yielded.
Once the optimum soluble protein expression conditions had been identified through
expression trials, large scale expression and purification of His-hSTRAP(1-440) was carried out
as described in Materials and Methods, section 2.9.1. Samples
representative of the insoluble and soluble fractions, as well as total expression, at these
optimum conditions of His-hSTRAP protein expression were analyzed by SDS
PAGE (Fig.3.2). These gels confirm that hSTRAP is expressed and
accumulates over time, as shown by samples representative of total expression
(Fig.3.2B, Lanes 2-5), and that His-hSTRAP(1-440) is found mainly in the soluble
fraction of lysed cells grown in optimal conditions (Fig.3.2A, Lanes 6-9).
(A)
(B)
Figure 3.2. Expression of His-hSTRAP(1-440). 10% SDS PAGE gels showing A:
Insoluble (lanes 2-5) and soluble fractions (lanes 6-9) of BL21(DE3)pLysS cells transformed
with pET-14b-His-hSTRAP(1-440) 1 hour (lanes 3 and 7), 2 hours (lanes 4 and 8) and
3 hours (lanes 5 and 9) post IPTG induction. B: Total cell lysate of BL21(DE3)pLysS cells
transformed with pET-14b-His-hSTRAP(1-440) 1 hour (lane 3), 2 hours (lane 4) and 3 hours (lane 5)
post IPTG induction. Lane 1 in (A) and (B) contains protein markers, and lanes 2 and 6
contain pre-induction fractions. Both gels were loaded with 15 µl of sample and stained
with Instant blue.
The purification protocol had to be extensively optimised to obtain high quantities of
non-aggregated His-hSTRAP(1-440) protein in elutions from the His-tag affinity
column (Fig.3.3). In the initial purification, the cell pellet was lysed
by sonication. This resulted in an abundance of contaminants within column elutions
(Fig.3.3A, Lanes 10-12), and only approximately 15% of the protein present was
full-length His-hSTRAP(1-440) (Fig.3.3A, Lanes 10-12, indicated by an arrow). In
an attempt to decrease the amount of contaminants, another protein batch was
purified, but this time cells were lysed using the Cell Disrupter, which is deemed to
be a gentler method of lysis that is less damaging to proteins. This improved the
purity of His-hSTRAP(1-440) in elutions to approximately 30% (Fig.3.3B, Lanes 10-12),
compared to the previous 15% (Fig.3.3A, Lanes 10-12). This indicates that cell
disruption, rather than sonication, is the better method of cell lysis for purification of
His-hSTRAP(1-440). As this latter purification was carried out at
pH 8.0, the next experiment investigated the effect of pH on the
purification of His-hSTRAP(1-440) protein. The pH of the purification buffers was
lowered to pH 7.4, and this improved the purity of His-hSTRAP(1-440) in elutions
by 50% (Fig.3.3C, Lanes 10-13). This suggested that lowering the pH improves
His-hSTRAP(1-440) protein purity in elutions; however, the pH of the purification
buffers could not be lowered any further, as this is not recommended for
histidine-affinity resin purification and would also bring the pH closer to the pI of this protein
construct, which is 6. Samples of His-tag affinity resin were taken before the protein was
eluted off the resin, in order to analyse the purity of the protein bound to the resin (Fig.3.3C,
Lane 3). This gel showed that pure His-hSTRAP(1-440) protein was bound to the
resin before elutions were taken (Fig.3.3C, Lane 3), whereas a contaminant
(likely a degradation product) was detected in the elutions (Fig.3.3C, Lanes 10-13,
indicated by an arrow labelled “contaminant”). This suggested that His-hSTRAP(1-440)
was unstable in this elution buffer and that the buffer should be optimised
further. This led to the decision to add H-MIX, a newly developed proprietary protein
solubilisation mixture from Dr Alexander Golovanov, to the elution buffer. H-MIX
is a high-concentration mixture of four hydrophobic amino acids (L-Leu, L-Val,
L-Ala and L-Ile) in a defined ratio, designed to act as a stabilising
agent against proteolytic degradation and protein aggregation. Purification with the
H-MIX supplement in the elution buffer yielded pure His-hSTRAP(1-440) in elutions
(Fig.3.3D, Lanes 10-12, indicated by an arrow), indicating that the stability of
His-hSTRAP(1-440) in elutions had increased significantly.
The total quantity of pure and stable His-hSTRAP(1-440) protein
obtained from 1 L of E. coli culture was estimated at 4.73 mg, as measured by Bradford assay
(Table 3.2). Biochemical binding assays with His-hSTRAP(1-440) can also be carried out,
as pure His-hSTRAP(1-440) protein can be obtained bound to the His-tag affinity
resin (Fig.3.3D, Lane 3).
(A)
(B)
(C)
(D)
Figure 3.3. Purification of His-hSTRAP(1-440) protein. 10% SDS PAGE gels
showing A: His-hSTRAP(1-440) purification from sonicated BL21(DE3)pLysS cells using
purification buffer at pH 8.0. B: His-hSTRAP(1-440) purification from BL21(DE3)pLysS
cells lysed with the cell disrupter, using purification buffer at pH 8.0. C: His-hSTRAP(1-440)
purification from BL21(DE3)pLysS cells lysed with the cell disrupter, using purification
buffer at pH 7.4. D: Same as C but with the addition of H-MIX to the elution buffer. All
gels were loaded with 15 µl of sample and stained with Instant blue as described in the Materials
and Methods section. His-hSTRAP(1-440) migrates at 50 kDa, as indicated by the
arrow.
Table 3.2. Estimation of His-hSTRAP(1-440) protein concentration
            OD595   Concentration (mg/ml)   Volume (ml)   Protein quantity estimate (mg)
Elution 1   0.167   0.50                    7             3.50
Elution 2   0.117   0.35                    3             1.05
Elution 3   0.020   0.06                    3             0.18
The concentration of His-hSTRAP(1-440) protein measured using Bradford assay in the
elutions obtained from the purification method described in Figure 3.3D (for details see
Materials and Methods).
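The arithmetic behind Table 3.2 can be checked with a short script. This is an illustrative sketch rather than part of the original analysis: the concentration and volume figures are copied from Table 3.2, and the names ELUTIONS, protein_mass_mg and total_mg are hypothetical.

```python
# (name, OD595, concentration in mg/ml, volume in ml), copied from Table 3.2.
ELUTIONS = [
    ("Elution 1", 0.167, 0.50, 7.0),
    ("Elution 2", 0.117, 0.35, 3.0),
    ("Elution 3", 0.020, 0.06, 3.0),
]


def protein_mass_mg(conc_mg_per_ml, volume_ml):
    """Protein quantity in an elution = concentration x volume."""
    return conc_mg_per_ml * volume_ml


# Per-elution quantities, rounded to two decimals as in the table.
masses = {
    name: round(protein_mass_mg(conc, vol), 2)
    for name, _od595, conc, vol in ELUTIONS
}
total_mg = round(sum(masses.values()), 2)

for name, mass in masses.items():
    print(f"{name}: {mass:.2f} mg")
print(f"Total: {total_mg:.2f} mg")
```

The per-elution values reproduce the last column of Table 3.2, and their sum matches the total of 4.73 mg quoted in the text for 1 L of culture.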
In-gel digestion mass spectrometry was carried out to confirm the identity and
sequence coverage of the putative His-hSTRAP(1-440) 50 kDa protein band (Fig.3.3C,
indicated by the top arrow) and of the contaminant visible at 29 kDa during
His-hSTRAP(1-440) purification (Fig.3.3C, indicated by the bottom arrow). Mass
spectrometry confirmed that the protein band found in elutions at 50 kDa (Fig.3.3C,
Lanes 10-12, top arrow) was indeed hSTRAP, with coverage indicative of full
length protein (Fig.3.4A). The band detected at 29 kDa (Fig.3.3C, Lanes 10-12,
indicated by the bottom arrow) was identified as a cleaved fragment of hSTRAP,
with peptide coverage within amino acids 14-147 (Fig.3.4B), a region that includes two of the
predicted TPR motifs.
(A), (B)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
81 TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH
161 VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
241 GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE
321 LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG
401 KDYSFSSVRV ETPLLLVVNG KPQGSSSQAV ATVASRPQCE
Figure 3.4. His-hSTRAP(1-440) mass spectrometry. A: Mass spectrometry performed
on the hSTRAP band obtained at 50 kDa (top arrow in Fig.3.3C). B: Mass
spectrometry performed on the hSTRAP band obtained at 29 kDa (bottom arrow in
Fig.3.3C). The numbers of unique peptides (indicated in red) detected were 31 and 6 for the 50
kDa and 29 kDa bands shown in Figure 3.3C respectively.
3.1.2 Cloning, Expression and purification of GST-hSTRAP(1-440)
The non-optimized full-length hSTRAP gene sequence was originally cloned into the
HA-tagged vector pHA1 by Sandra Taylor in the lab of Dr Marija Kristic-Demonacos.
For this project full length hSTRAP had to be cloned into the GST plasmid,
pGEX-6P1 (Fig.3.5). This GST vector was chosen because it allows an
insert with XhoI sites at both termini to be inserted in frame (Fig.3.5), which was a
requirement in this case. Full length hSTRAP was amplified by PCR by Sandra
Taylor and sequence verified by the DNA sequencing facility at the University of
Manchester.
Figure 3.5. pGEX-6P1, GST expression vector. This 5 kbp vector encodes an N-terminal
GST tag, followed by a PreScission protease cleavage site and a multiple cloning site. Protein
expression is inducible by IPTG and the vector confers ampicillin resistance.
To clone full length wild type hSTRAP into the pGEX-6P1 vector, hSTRAP cDNA
(previously ligated into the pHA1 plasmid) and the vector pGEX-6P1 were both
digested with XhoI. The resulting fragments were then ligated and transformed into
DH5α. Plasmid DNA was extracted by mini-prep from the seven colonies obtained
through this procedure and analyzed, firstly by XhoI and BamHI restriction digest
and secondly by sequencing, to verify the desired insertion of hSTRAP DNA.
The vector pGEX-6P-1 is 5 kbp (Fig.3.5) and the hSTRAP(1-440) insert is
approximately 1.3 kbp. Fig.3.6B shows that hSTRAP has been inserted into the
vector as shown by the DNA band found at 1.3 kbp with XhoI restriction digests
(Lanes 3-9), corresponding to the hSTRAP sequence. The sequence has also been
inserted in the desired orientation as a DNA band was detected at 1.3 kbp when the
DNA mini preps are digested with BamHI (Fig.3.6B, lanes 10-16). If hSTRAP had
not been inserted in the desired orientation a DNA band at approximately 81 bp (53
bp + 18 bp) would have been detected rather than 1270 bp (Fig.3.6A), as the wild
type hSTRAP sequence contains a BamHI restriction site at its C terminus
(Fig.3.6A). These digestions confirmed that hSTRAP has been cloned into the vector
in the desired orientation.
(A)
(B)
Figure 3.6. Cloning of the hSTRAP wild type in pGEX-6P1. A: Positions of the
BamHI and XhoI cutting sites on the full length hSTRAP cDNA sequence as well as on the
pGEX-6P1 vector sequence. B: A 1% (w/v) agarose gel, stained with ethidium bromide, showing
XhoI (lanes 3-9) and BamHI (lanes 10-16) restriction digestion reactions of the seven clones
obtained from the ligation of the vector pGEX-6P1 with full length hSTRAP. The 5 kbp band
corresponds to the linearised pGEX-6P1 vector and the ~1.3 kbp band to the full length hSTRAP
cDNA (indicated by an arrow). The presence of the ~1.3 kbp fragment in the BamHI
digestion reactions (lanes 10-16) indicates that the full length hSTRAP cDNA has been
incorporated in the construct in the desired orientation.
Clone 2 was chosen at random for further investigation, and the junctions and the
full hSTRAP sequence were sequence-verified to ensure no mutation had been
incorporated. The sequencing data confirmed the latter (See Appendix), and thus the
plasmid pGEX-6P1-hSTRAP(1-440) coding for GST-tagged hSTRAP(1-440) was
obtained. Trials to express and purify the protein were then initiated.
Plasmid DNA of clone 2, pGEX-6P1-GST-hSTRAP(1-440), was transformed into
the expression cell line BL21(DE3)pLysS as a test, although this cell line is normally
used for vectors containing a T7 promoter. In this construct hSTRAP is fused to
GST, which should yield a protein of approximately 76 kDa, as the GST tag is 26
kDa and hSTRAP is approximately 50 kDa. Expression trials
defined the optimum conditions for soluble GST-hSTRAP(1-440) expression as
BL21(DE3)pLysS cells induced with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs
incubation at 25ºC in LB media. Samples representative of soluble, insoluble and
total expression under these conditions were analyzed by SDS
PAGE. The gels show that GST-hSTRAP(1-440) is present in both the soluble
and insoluble fractions (Fig.3.7). GST-hSTRAP also migrates at 78 kDa
rather than the predicted 76 kDa (Fig.3.7); this is slightly higher than expected, but it
is not uncommon for a protein to migrate on a gel at a higher apparent molecular weight
than predicted. Sequencing and mass spectrometry data confirmed this protein as
GST-hSTRAP(1-440).
(A)
(B)
Figure 3.7. Expression of GST- hSTRAP (1-440). A: 7.5% SDS PAGE gel showing
Insoluble (lanes 2-5) and soluble fractions (lanes 6-9) of BL21(DE3)pLysS cells transformed
with pGEX-6P1-GST-hSTRAP(1-440) after 1 hour (lanes 3 and 7), 2 hours (lanes 4 and 8),
and 3 hours (lanes 5 and 9) after induction with IPTG. B: 12% SDS PAGE gel showing
Total BL21(DE3)pLysS pGEX-6P1-GST-hSTRAP(1-440) transformed cell lysate 1 hour
(lane 3) 2 hours (lane 4) and 3 hours (lane 5) post IPTG induction. Lane 1 in (A) and (B)
contains protein markers, and lanes 2 and 6 contain pre-induction fractions. Both gels
were loaded with 15 µl of sample and stained with Instant blue. In this construct hSTRAP has
been fused to GST to give an N-terminally GST-tagged protein. Migration of this fusion
protein is as indicated.
Large scale expression and purification of GST-hSTRAP(1-440) was then carried
out. The initial purification was performed at pH 7.4, as purification of
His-hSTRAP(1-440) was successful at this pH (Fig.3.3C). Furthermore, cells were lysed
using the cell disrupter, as this was identified as the best method of lysis for
purification of His-hSTRAP(1-440) (Fig.3.3C). Purification of the GST-fused
hSTRAP protein (GST-hSTRAP(1-440)) at pH 7.4 yielded many contaminants
bound to the GST affinity resin (Fig.3.8A, Lane 3), and the purity of GST-hSTRAP(1-440)
in elutions was compromised as a result (Fig.3.8A, Lanes 5-8). SDS-PAGE also
indicated that GST-hSTRAP(1-440) remained bound to the resin even
after thorough washing with elution buffer (Fig.3.8A, Lane 4), although it may be
possible to improve this through optimisation of the purification protocol. The
presence of protein contaminants could be due to insufficient washing, leaving
non-specifically bound material on the GST-tag affinity resin (Fig.3.8A,
Lane 3). The purification at pH 7.4 was therefore repeated, and the wash buffer
flow-through was analyzed by SDS PAGE to determine whether the resin had been thoroughly
washed. Figure 3.8B shows that GST-hSTRAP(1-440) was purified, but
contaminants were again detected, still bound to the resin after washing with five column bed
volumes (Lane 10). Contaminants were also still detected in the sample representative of the
last wash flow-through (Lane 9), suggesting that the resin
may still not have been washed enough. The resin was washed with a further 20 ml of
wash buffer at the same pH; however, contaminants were still detected bound to the
resin (Fig.3.8C, Lane 8). The pH of the buffer was then increased to pH 8, and
samples of resin before washing, wash flow-through and resin after washing were
analyzed by SDS PAGE (Fig.3.8D). This washing condition did not improve the purity
of GST-hSTRAP(1-440) protein bound to resin (Fig.3.8D, Lane 8). However, raising
the pH to 8.7 yielded pure GST-hSTRAP(1-440) protein bound to
resin (Fig.3.8E, Lane 8). SDS-PAGE indicated that pure GST-hSTRAP(1-440)
protein can be obtained in elutions at pH 8.7 (Fig.3.8F, Lanes 2-4), although the total
yield estimated from 1 litre of E. coli culture was less than 0.1 mg.
These experiments indicated that GST-hSTRAP(1-440) protein purification is
optimal at pH 8.7; however, the total volume of growth culture would require
significant scaling up to obtain the large quantity of protein necessary for structural
studies. In-gel digestion mass spectrometry of the suspected GST-hSTRAP(1-440) band
(Fig.3.8, indicated by an arrow) confirmed the band as GST-hSTRAP(1-440),
with coverage indicative of full length hSTRAP (Fig.3.9), and so further
experiments can be carried out with this protein construct.
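The scale of the problem can be illustrated with a rough calculation. The sketch below is indicative only: the yields per litre are those reported in this chapter (4.73 mg for His-hSTRAP(1-440) and less than 0.1 mg for GST-hSTRAP(1-440)), whereas the target quantity TARGET_MG is a hypothetical figure chosen purely for illustration, and the function name litres_needed is not taken from the thesis.

```python
def litres_needed(target_mg, yield_mg_per_litre):
    """Culture volume (litres) required to obtain target_mg of protein."""
    if yield_mg_per_litre <= 0:
        raise ValueError("yield must be positive")
    return target_mg / yield_mg_per_litre


TARGET_MG = 5.0          # hypothetical target quantity, for illustration only

HIS_YIELD = 4.73         # mg per litre, His-hSTRAP(1-440) (Table 3.2)
GST_YIELD_UPPER = 0.1    # mg per litre, upper bound for GST-hSTRAP(1-440)

his_litres = litres_needed(TARGET_MG, HIS_YIELD)
# The GST yield is "less than 0.1 mg", so this is a lower bound on the volume.
gst_litres_min = litres_needed(TARGET_MG, GST_YIELD_UPPER)

print(f"His-hSTRAP(1-440): about {his_litres:.1f} L")
print(f"GST-hSTRAP(1-440): at least {gst_litres_min:.0f} L")
```

Under these assumptions the GST-tagged construct would need roughly fifty times more culture than the His-tagged construct for the same quantity of protein.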
(A)
(B)
(C)
(D)
(E)
(F)
Figure 3.8. Purification of GST-hSTRAP(1-440) protein. A: 7.5% SDS PAGE gel
showing GST-hSTRAP(1-440) purification from BL21(DE3)pLysS at pH 7.4. 12%
SDS PAGE gels showing B: GST-hSTRAP(1-440) purification from BL21(DE3)pLysS at pH
7.4 but with further washing of resin. C: As B, but with the resin washed further. D: GST-hSTRAP(1-440)
purification from BL21(DE3)pLysS at pH 8.0. E: GST-hSTRAP(1-440) purification
from BL21(DE3)pLysS at pH 8.7. F: Pure GST-hSTRAP(1-440) elutions obtained from the
purification in E. All gels were loaded with 15 µl of sample and stained with Instant blue.
Migration of GST-hSTRAP(1-440) is shown.
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71 LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
Figure 3.9. GST-hSTRAP(1-440) mass spectrometry. Mass spectrometry
performed on the GST-hSTRAP(1-440) band (predicted 76 kDa, migrating at 78 kDa) obtained
from the gel shown in Fig.3.8E, lane 8. Amino acids highlighted in red indicate peptide
coverage of the thirteen unique peptides obtained from the in-gel digestion mass
spectrometry of this band.
3.1.3 Cloning, expression and purification of truncated variants of
hSTRAP
3.1.3.1 Design and sequence analysis of truncated constructs of hSTRAP
It was decided to clone truncated variants of hSTRAP, as this would provide
another route to solving the structure of hSTRAP and would also help to determine the regions of
hSTRAP implicated in ligand interaction. The first step was to determine the
polypeptide boundaries of these truncated forms of hSTRAP. As structural data were
not available at the time, the boundaries could not be chosen on the basis of structure, and so
secondary structure predictions had to be considered. Polypeptide boundaries had to
be chosen so as not to perturb any predicted structured regions while
including the correct combinations of predicted TPR motifs. It was first decided to
create five truncated hSTRAP constructs containing the first three, last three, first two,
middle two and last two predicted TPR motifs. The amino acid sequence of full-length
hSTRAP was then submitted to two separate secondary structure prediction
programs, Jpred3 and Scratch protein predictor, to check for
consistency of predictions between different algorithms. Data from these two programs
were analyzed, and construct boundaries were chosen taking into account the
predicted positions of the TPR motifs and secondary structure elements (Fig.3.10),
in order to minimize the structural perturbation of the truncated constructs.
According to both prediction programs, the hSTRAP amino acid sequence is around
70% α-helical. This is consistent with the presence of six predicted TPR motifs
within hSTRAP, as TPR motifs are anti-parallel α-helical structures [32]. The
five truncated hSTRAP constructs proposed for cloning were as follows:
hSTRAP(1-219), hSTRAP(220-440), hSTRAP(1-150), hSTRAP(151-284) and
hSTRAP(285-440). The sequences of these five proposed polypeptides were then
analyzed in another program that predicts the pI of each construct. These
data are shown in Table 3.3A. The pI is an important factor that needed to be
considered, as ideally the pI of these proteins should not be between 6 and 7, as this
pH range is typically used for NMR studies. The pI values of the five protein
constructs mentioned above were outside the range 6 to 7 (Table 3.3A), and the
chosen domain boundaries were not expected to perturb any structured regions
according to secondary structure predictions (Fig.3.10). Therefore, it was decided to
clone these five constructs into pET14-b. A summary of the properties of
the five chosen constructs, including construct length, TPR motifs included,
molecular mass and pI, is shown in Table 3.3A. The amino acid
sequences of these polypeptides are shown in Table 3.3B.
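The pI criterion described above amounts to a simple range check. The following sketch is illustrative only: the theoretical pI values are copied from Table 3.3A, the pH 6 to 7 window is the NMR range quoted in the text, and the names THEORETICAL_PI, NMR_PH_RANGE and outside_range are hypothetical.

```python
# Theoretical pI of each construct, copied from Table 3.3A.
THEORETICAL_PI = {
    "hSTRAP(1-219)": 5.19,
    "hSTRAP(220-440)": 9.30,
    "hSTRAP(1-150)": 4.87,
    "hSTRAP(151-284)": 5.45,
    "hSTRAP(285-440)": 9.84,
}

# pH window typically used for NMR studies, as stated in the text.
NMR_PH_RANGE = (6.0, 7.0)


def outside_range(pi, ph_range=NMR_PH_RANGE):
    """True if the pI lies strictly outside the given pH window."""
    low, high = ph_range
    return pi < low or pi > high


suitable = {name: outside_range(pi) for name, pi in THEORETICAL_PI.items()}

for name, ok in suitable.items():
    status = "outside" if ok else "inside"
    print(f"{name}: pI {THEORETICAL_PI[name]:.2f} is {status} pH 6-7")
```

All five constructs pass the check, in agreement with the statement that their pI values fall outside the range 6 to 7.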
At the time when these constructs were designed, structural data for hSTRAP were not
available; however, the C-terminal part of hSTRAP, residues 262-422, has since been
solved by X-ray crystallography by the Structural Genomics Consortium
[http://www.thesgc.org/structures/details?pdbid=2XVS]. These structural data only
became available after the five constructs had been cloned and expressed. The
experimental secondary structure of this region, as given by the PDB file of the 2XVS
protein structure, has been added to Figure 3.10 in order to compare the secondary structure
predictions with the actual secondary structure composition of this solved region, and to
verify whether either of the C-terminal constructs, hSTRAP(151-284) and hSTRAP(285-440),
in reality perturbs any structured region. The 284th and 285th amino acids were found
to be situated between two helical elements (Fig.3.10), and hence this does not seem to be
the case. The solved region, hSTRAP(262-422), is 27% helical, 9% turn, 8% bend,
36% extended chain (β strand) and 20% the rest (chain).
Figure 3.10. Secondary structure predictions of full length hSTRAP. Secondary
structure predictions of full length hSTRAP obtained through Jpred3 and Scratch protein
predictor programs (Materials and Methods, Section 2.3.2.1). Finalized polypeptide
boundaries were derived using this latter information, and positions of predicted TPR motifs.
The six predicted TPR motifs are highlighted in different colors, each representing a
different TPR motif. The reliability of the Jpred3 prediction (Jpred3 Reliability) varies from
0 to 9, where 9 is the highest prediction accuracy. Jpred3 reliability greater than 4 is
highlighted in green. The structure of the peptide 262-422 (highlighted in grey) has been
solved [http://www.thesgc.org/structures/details?pdbid=2XVS], and its secondary structure
composition (from PDB file) is provided here for reference. Abbreviations: H, Alpha Helix;
G, 3-10-helix; S, Bend; C, The rest; I, pi-helix; B, Beta Bridge; E, Extended chain (Beta
strand); T, Turn; SSP, Secondary Structure Prediction; Rel, Reliability Prediction Accuracy;
Red, TPR 1; Bright Green, TPR2; Purple, TPR3; Dark Green, TPR 4;Bright TPR 5 and
Yellow TPR 6.
Table 3.3. hSTRAP truncated forms cloned in pET14-b.
(A)
Construct            Length of construct   Predicted TPR      Molecular      Theoretical
                     (amino acids)         motifs included    weight (kDa)   pI
1-hSTRAP(1-219)      219                   1-3                24.9           5.19
2-hSTRAP(220-440)    221                   4-6                24.0           9.30
3-hSTRAP(1-150)      150                   1-2                17.7           4.87
4-hSTRAP(151-284)    134                   3-4                15.2           5.45
5-hSTRAP(285-440)    156                   5-6                16.6           9.84

(B)
Construct            Amino Acid Sequence
1-hSTRAP(1-219)      MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG
                     RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
                     TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
                     AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH
                     VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP
                     KISQQALSAY AQAEKVDRK
2-hSTRAP(220-440)    ASSNPDLHLN RATLHKYEES YGEALEGFSR AAALDPAWPE
                     PRQREQQLLE FLDRLTSLLE SKGKVKTKKL QSMLGSLRPA
                     HLGPCSDGHY QSASGQKVTL ELKPLSTLQP GVNSGAVILG
                     KVVFSLTTEE KVPFTFGLVD SDGPCYAVMV YNIVQSWGVL
                     IGDSVAIPEP NLRLHRIQHK GKDYSFSSVR VETPLLLVVN
                     GKPQGSSSQA VATVASRPQC E
3-hSTRAP(1-150)      MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG
                     RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
                     TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
                     AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR
4-hSTRAP(151-284)    TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL
                     SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR
                     ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF
                     LDRLTSLLES KGKV
5-hSTRAP(285-440)    KTKKLQSMLG SLRPAHLGPC SDGHYQSASG QKVTLELKPL
                     STLQPGVNSG AVILGKVVFS LTTEEKVPFT FGLVDSDGPC
                     YAVMVYNIVQ SWGVLIGDSV AIPEPNLRLH RIQHKGKDYS
                     FSSVRVETPL LLVVNGKPQG SSSQAVATVA SRPQCE
A: Construct names, TPR motifs included, molecular weight (kDa) and calculated pI of the
hSTRAP constructs used in this study. B: Amino acid sequence of each hSTRAP construct used
in this study
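The relationship between the construct names and the sequences in Table 3.3B can be reproduced by slicing the full-length sequence at the chosen boundaries. The sketch below is illustrative: the 440-residue sequence is the one shown in Figure 3.9, the boundaries are those of Table 3.3A, and the names FULL_LENGTH, CONSTRUCTS and construct_sequence are hypothetical. Residue numbers are 1-based and inclusive, as in the construct names.

```python
# Full-length hSTRAP(1-440), written in blocks of ten residues.
FULL_LENGTH = "".join("""
MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH
VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE
LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG
KDYSFSSVRV ETPLLLVVNG KPQGSSSQAV ATVASRPQCE
""".split())

# Construct boundaries (first residue, last residue) from Table 3.3A.
CONSTRUCTS = {
    "hSTRAP(1-219)": (1, 219),
    "hSTRAP(220-440)": (220, 440),
    "hSTRAP(1-150)": (1, 150),
    "hSTRAP(151-284)": (151, 284),
    "hSTRAP(285-440)": (285, 440),
}


def construct_sequence(start, end, sequence=FULL_LENGTH):
    """Return residues start..end (1-based, inclusive) of the sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("boundaries outside the sequence")
    return sequence[start - 1:end]


sequences = {name: construct_sequence(*b) for name, b in CONSTRUCTS.items()}

for name, seq in sequences.items():
    print(f"{name}: {len(seq)} residues, starts {seq[:10]}, ends {seq[-10:]}")
```

The resulting lengths (219, 221, 150, 134 and 156 residues) agree with Table 3.3A, and the two N-terminal/C-terminal pairs, hSTRAP(1-219) with hSTRAP(220-440) and the three shorter fragments, each tile the full-length protein without gaps or overlaps.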
3.1.3.2. Cloning of Truncated versions of hSTRAP
Once the polypeptide boundaries had been chosen, the next step was to
clone the five constructs into pET-14b. PCR was carried out as described in
Materials and Methods, Section 2.3.2.2, using the primers shown in Table 2.3. In order to
check whether the PCR and the digestion reactions with BamHI and NdeI were successful,
samples of the digested PCR reactions were loaded on an agarose gel (Fig.3.11A).
DNA bands of the correct size were observed for all five constructs: 670 bp for
hSTRAP(1-219), 684 bp for hSTRAP(220-440), 462 bp for hSTRAP(1-150), 414 bp for
hSTRAP(151-284) and 486 bp for hSTRAP(285-440). Figure 3.11B shows a sample of
digested and linearised 4.7 kbp pET14-b vector DNA. Both gels confirm
that PCR and digestion were successful, so the next steps in the cloning
procedure could be followed.
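As a rough plausibility check, the band sizes quoted above can be compared with the minimum coding length of each construct, which is three base pairs per amino acid. The sketch below is illustrative only: the residue counts come from Table 3.3A and the band sizes from the text, while the interpretation of the small excess as primer-added flanking sequence (for example restriction sites or a stop codon) is an assumption, since the primer sequences of Table 2.3 are not reproduced here. The names used are hypothetical.

```python
# Construct length in amino acids (Table 3.3A).
CONSTRUCT_LENGTH_AA = {
    "hSTRAP(1-219)": 219,
    "hSTRAP(220-440)": 221,
    "hSTRAP(1-150)": 150,
    "hSTRAP(151-284)": 134,
    "hSTRAP(285-440)": 156,
}

# Digested PCR product sizes quoted in the text (bp).
REPORTED_BAND_BP = {
    "hSTRAP(1-219)": 670,
    "hSTRAP(220-440)": 684,
    "hSTRAP(1-150)": 462,
    "hSTRAP(151-284)": 414,
    "hSTRAP(285-440)": 486,
}


def coding_length_bp(n_residues):
    """Minimum coding length: three base pairs per amino acid."""
    return 3 * n_residues


# Excess of the reported size over the bare coding length (assumed to be
# flanking sequence contributed by the primers).
extra_bp = {
    name: REPORTED_BAND_BP[name] - coding_length_bp(n_aa)
    for name, n_aa in CONSTRUCT_LENGTH_AA.items()
}

for name, extra in extra_bp.items():
    coding = coding_length_bp(CONSTRUCT_LENGTH_AA[name])
    print(f"{name}: coding {coding} bp, reported {REPORTED_BAND_BP[name]} bp, excess {extra} bp")
```

Each reported size exceeds the bare coding length by only 12 to 21 bp, which is within the resolution of a 1% agarose gel and is consistent with inserts of the intended length.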
(A)
(B)
Figure 3.11. PCR products of the hSTRAP fragments cloned in pET14-b vector. PCR
products and pET14-b vector were digested with BamHI and NdeI restriction enzymes, and 15 µl
of each sample was subsequently loaded on a 1% (w/v) agarose gel. The gels
show A: Digestion reactions of the five truncated hSTRAP PCR products, which were
subsequently cloned in pET14-b vector. B: BamHI and NdeI restriction digestion of the pET14-b
vector (between 4 and 5 kbp).
3.1.3.3. Expression and Purification of truncated versions of hSTRAP
Plasmids coding for the different truncated hSTRAP constructs that had been created
were sequence verified, and this confirmed that all five truncated versions of
hSTRAP had been successfully cloned into pET14-b (See Appendix). Systematic
trials were then carried out on all protein constructs to determine optimum soluble
expression conditions for each hSTRAP protein variant.
3.1.3.3.1 Expression and purification of hSTRAP(1-219)
Expression trials involved testing expression in transformed E. coli at
incubation temperatures for cell growth ranging from 37 to 16°C, and at varying
IPTG concentrations. These trials identified the optimum conditions for hSTRAP(1-219)
protein expression as BL21(DE3)pLysS cells, induced with 0.1 mM IPTG at an
OD600 of 0.5, followed by 3 hours incubation at 25°C in LB media. Samples
representative of soluble, insoluble and total expression were loaded on 15% SDS
112
PAGE gels (Fig.3.12). These results showed that hSTRAP(1-219) is over-expressed
at these optimised conditions as shown by levels of total expression (Fig.3.12B,
Lanes 2-5). The protein, hSTRAP(1-219) is running at approximately 28 kDa, which
is higher than its expected 26 kDa (Table 3.3), even accounting for the additional
approximate 1kDa for the His tag coding sequence to the start of the fusion protein.
Sequencing data (See apendix one) and mass spectrometry (Fig.3.14) have
confirmed this to be hSTRAP(1-219). Also, in comparison to total expression, a low
quantity of protein was detected in samples representative of soluble fraction at 2
and 3 hrs after induction (Fig.3.12A, Lanes 4-5). It was noted that hSTRAP(1-219)
was precipitating in these soluble fraction samples with time in all conditions tested.
This suggested that method of lysis and sample analysis requires optimisation so
protein does not precipitate in the buffer it is present in.
Initial purifications of hSTRAP(1-219) at pH 7.4 did not yield pure hSTRAP protein,
either in elutions from the His-tag affinity resin (Fig.3.13A, Lanes 5-7) or bound
to the resin (Fig.3.13A, Lane 3). This was the starting pH of all purifications, as
purification of His-hSTRAP(1-440) was successful at this pH (Fig.3.3C). Because the
purification of hSTRAP(1-219) was unsuccessful at this pH, the pH of the buffers
was changed to investigate its effect on the purification of hSTRAP(1-219). This
was done because it had been shown for the full-length hSTRAP constructs that
optimisation of the pH could be an important factor in obtaining pure hSTRAP
protein (Fig.3.3 and 3.8). The pH of the loading and washing buffers was therefore
changed from pH 7.4 to pH 8.7, as the latter was found to be optimal for obtaining
pure GST-hSTRAP(1-440) protein (Fig.3.8). The purification carried out at pH 8.7
did yield pure hSTRAP(1-219) protein bound to the resin (Fig.3.13B, Lane 8). This
provided further evidence that pH affects the purity of protein bound to the resin,
and this information will be useful when carrying out future protein purifications.
Once pure hSTRAP(1-219) protein was bound to the resin, the elution buffer had to
be optimised further, as degradation products were visible on the gel in the
elutions obtained at pH 8.7 (Fig.3.13C, Lanes 2-7). This optimisation was achieved
by the addition of H-MIX to the elution buffer. Higher purity hSTRAP(1-219) protein
was obtained in the elutions with H-MIX (Fig.3.13D, Lanes 2-7) than in the elutions
without H-MIX (Fig.3.13C, Lanes 2-7). This supports the initial hypothesis that
H-MIX improves protein stability against proteolytic degradation and aggregation.
The quantity of hSTRAP(1-219) protein obtained from 1 L of E. coli culture was
estimated at just over 5 mg (Table 3.4). Pull downs can be carried out with this
protein, as pure hSTRAP(1-219) can be obtained bound to the His tag affinity resin
(Fig.3.13B).
The 28 kDa protein band on the gel that was expected to be hSTRAP(1-219)
(indicated by an arrow in the elutions in Figure 3.13D) was characterised by in-gel
digestion mass spectrometry to confirm its identity. Mass spectrometry confirmed
that this was indeed a fragment of hSTRAP, with coverage indicative of hSTRAP(1-219)
(Fig.3.14), so further experiments can be carried out with this protein.
Figure 3.12. Expression of hSTRAP(1-219). 15% SDS PAGE gels showing A: Soluble
(lanes 2-6) and insoluble fractions (lanes 7-11) of BL21(DE3)pLysS cells transformed with
pET-14b-His-hSTRAP(1-219), 1 hour (lanes 3 and 8), 2 hours (lanes 4 and 9), 3 hours (lanes 5
and 10) and 4 hours (lanes 6 and 11) after induction with IPTG. B: Total lysate of
BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(1-219), 1 hour (lane 3), 2 hours
(lane 4) and 3 hours (lane 5) post IPTG induction. Lane 1 in (A) and (B) represents protein
markers. Lanes 2 and 7 are pre-induction fractions. Both gels were loaded with 15 µl of sample
and stained with instant blue. hSTRAP(1-219) (TPR 1-3) migrates at approximately 28 kDa
(indicated by an arrow). Optimum conditions for soluble hSTRAP(1-219) expression in this
cell line were identified as induction with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs
incubation at 25°C in LB media.
Figure 3.13. Purification of hSTRAP(1-219) protein. 15% SDS PAGE gels showing
A: hSTRAP(1-219) purification from BL21(DE3)pLysS, at pH 7.4. B: hSTRAP(1-219)
purification from BL21(DE3)pLysS, at pH 8.7. C: hSTRAP(1-219) elutions obtained from the
purification in B. D: Elutions obtained from the purification protocol performed as in B with
the addition of H-MIX to the elution buffer. All gels were loaded with 15 µl of sample and
stained with instant blue. hSTRAP(1-219) migrates at 28 kDa (indicated by an arrow).
Table 3.4. Estimation of hSTRAP(1-219) protein concentration.
Elution   OD595   Concentration (mg/ml)   Elution volume (ml)
1         0.248   0.744                   1
2         0.436   1.308                   1
3         0.445   1.335                   1
4         0.225   0.675                   1
5         0.193   0.579                   1
6         0.171   0.513                   1
The concentration of hSTRAP(1-219) protein measured by Bradford assay in the elutions
obtained from the purification method described in Figure 3.13D and determined as
described in Materials and Methods.
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV
61 VGSVQGKAQV LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
121 AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR
181 SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
241 GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH
301 LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS
361 DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
Figure 3.14. hSTRAP(1-219) mass spectrometry. Mass spectrometry was performed on the 26
kDa hSTRAP protein obtained from the gel shown in Fig.3.13D, lane 4. Amino acids highlighted
in red indicate peptide coverage of the eight unique peptides obtained from the in-gel digestion
mass spectrometry of this 28 kDa band.
3.1.3.3.2 Expression and purification of hSTRAP(220-440)
Expression trials involved testing different cell lines, temperatures and IPTG
concentrations. The cell lines tested were Shuffle T7, Shuffle T7pLysY,
BL21(DE3) Single cells and BL21(DE3)pLysS, of which the last was found to be
optimal for expression of this protein in the soluble form. Temperatures tested
ranged from 16 to 37°C and the IPTG concentrations tested were 0.1, 0.2, 0.5 and
1 mM. These trials identified the optimum conditions for protein expression as
BL21(DE3)pLysS cells induced with 0.2 mM IPTG at an OD600 of 0.5, followed by 3 hrs
incubation at 25°C in LB media. However, the expression level of this protein was
very low, as shown by samples representative of total expression (Fig.3.15, Lanes 2-5),
although the protein appears to be mostly soluble (Fig.3.15, Lanes 6-8). There also
appears to be leaky hSTRAP(220-440) expression, which should be suppressed in this
cell line (Table 2.2); in the other cell lines tested, however, expression of
hSTRAP(220-440) was not detectable by SDS-PAGE. Despite the generally low level of
protein expression, it was therefore decided to proceed to large-scale protein
purification as described in Materials and Methods Section 2.9.1. The protein
appears to migrate at the correct molecular weight (Table 3.3), taking into account
the His tag (an addition of approximately 1 kDa).
Large scale purification of hSTRAP(220-440) was carried out (Fig.3.16). Initial
purifications were done at pH 7.4, as purification of His-hSTRAP(1-440) was
successful at this pH (Fig.3.4C). However, the hSTRAP(220-440) protein bound to the
His-tag affinity resin was not pure at this pH (Fig.3.16A, Lane 10), even after
substantial washing of the resin. The pH was then increased to 8.2
(Fig.3.16B), as it had been shown previously that pH was a critical factor in
obtaining pure hSTRAP protein bound to resin. The pH could not be increased any
further because the pI of this protein construct is 9.3, so pH 8.2 was taken as the
upper limit. At pH 8.2, after substantial washing of the resin with wash buffer,
pure hSTRAP(220-440) protein was obtained bound to resin (Fig.3.16B, Lane 12).
However, samples of resin taken 4 hours after cell lysis, following gel analysis,
showed that the protein was unstable and degraded almost immediately on the resin
at 4°C (Fig.3.16C). H-MIX was then added to all purification buffers, as the protein
seemed generally very unstable. Although H-MIX is hypothesized to act as a
stabilising agent and significantly improved the stability of other hSTRAP
proteins (Fig.3.3D and 3.13D), the hSTRAP(220-440) obtained from this purification
was neither of high purity nor stable (Fig.3.16D). Furthermore, the yield of
protein from 1 L of E. coli culture was estimated at less than 0.5 mg. These
problems, combined with the crystal structure of the C terminus of hSTRAP, residues
262-422 [2XVS], which appeared in the PDB at the time, led to the decision not to
use the hSTRAP(220-440) construct for future structural studies, and priority was
given to the other constructs. As this protein was unstable even when bound to the
resin (Fig.3.16C), it was also impractical to use it for biochemical studies.
Protein identity was not confirmed through in-gel digestion mass spectrometry, as
this protein was not used for any further experiments in this investigation.
Figure 3.15. Expression of hSTRAP(220-440) in BL21(DE3)pLysS. 12% SDS PAGE
gel showing total (lanes 2-5), soluble (lanes 6-8) and insoluble fractions (lanes 9-11) of
BL21(DE3)pLysS transformed with pET-14b-His-hSTRAP(220-440), 1 hour (lanes 3, 6
and 9), 2 hours (lanes 4, 7 and 10) and 3 hours (lanes 5, 8 and 11) post IPTG induction. Lanes 1
and 2 represent protein markers and the pre-induction fraction respectively. The gel was loaded
with 15 µl of sample in each lane and stained with instant blue. Leaky hSTRAP(220-440)
expression appears to be present (Lane 2), and hSTRAP(220-440) migrates at 25 kDa as indicated.
Figure 3.16. Purification of hSTRAP(220-440). 15% SDS PAGE gels showing A:
hSTRAP(220-440) purification from BL21(DE3)pLysS, at pH 7.4. B: hSTRAP(220-440)
purification from BL21(DE3)pLysS, at pH 8.2. C: Purified hSTRAP(220-440) truncated
protein on resin, loaded 4 hours after cell lysis. D: Purification carried out at pH 8.2
with buffers supplemented with H-MIX. The protein expression process is
described in Materials and Methods. All gels were loaded with 15 µl of sample in each lane
and stained with instant blue. Expected migration of hSTRAP(220-440) is 25 kDa as
indicated.
3.1.3.3.3 Expression and purification of hSTRAP(1-150)
Expression trials identified the optimum conditions for soluble hSTRAP(1-150)
expression as BL21(DE3)pLysS cells induced with 0.1 mM IPTG at an OD600 of 0.5,
followed by 3 hrs incubation at 25°C in LB media. Soluble and insoluble fractions
from lysed cells, as well as total expression, were analyzed by SDS-PAGE. The
hSTRAP(1-150) protein was over-expressed (Fig.3.17B, Lanes 2-6) and was mainly
found in the soluble fraction under these optimum conditions (Fig.3.17A, Lanes 7-11).
hSTRAP(1-150) migrates at the correct molecular weight of approximately 19 kDa:
17.7 kDa for the hSTRAP(1-150) sequence (Table 3.3) plus approximately 1 kDa for
the His tag.
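The expected masses quoted from Table 3.3 can be cross-checked directly from the amino acid sequence. The following is a minimal sketch, not the method used in this work: it sums standard average residue masses and adds one water molecule, using the hSTRAP sequence as printed in the mass spectrometry figures. The function name is hypothetical, and the values it returns may differ slightly from Table 3.3 depending on the mass table and tool used there.

```python
# hSTRAP sequence (440 residues) as printed in the mass spectrometry figures.
HSTRAP = "".join("""
MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV
VGSVQGKAQV LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR
SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH
LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS
DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
KPQGSSSQAV ATVASRPQCE
""".split())

# Standard average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight_da(start: int, end: int, sequence: str = HSTRAP) -> float:
    """Average mass (Da) of residues start..end (1-based, inclusive), untagged."""
    fragment = sequence[start - 1:end]
    return sum(RESIDUE_MASS[aa] for aa in fragment) + WATER


for start, end in [(1, 219), (220, 440), (1, 150), (151, 284), (285, 440)]:
    mass_kda = molecular_weight_da(start, end) / 1000
    print(f"hSTRAP({start}-{end}): {mass_kda:.1f} kDa (without His tag)")
```

The His tag contribution (approximately 1 kDa according to the text) is deliberately left out, so the printed values correspond to the untagged sequence masses.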
Large scale expression and purification was carried out (Fig.3.18). The first
purification was done at pH 7.4 (Fig.3.18A) and a high yield of hSTRAP(1-150) was
obtained, but protein purity was less than 90%, so the purification protocol had to
be optimised further. For that purpose the pH was increased to 8.7, and pure
hSTRAP(1-150) protein was obtained in elutions under these conditions (Fig.3.18B,
Lanes 2-7). Table 3.5 shows the estimated concentration of hSTRAP(1-150) protein
in the elutions from this optimised purification protocol (Fig.3.18B, Lanes 2-7).
A high concentration of protein was obtained, especially in elutions 2 and 3, which
each contained over 30 mg/ml (Table 3.5). The total estimated protein yield from
one litre of E. coli culture was very high, approximately 210 mg (Table 3.5).
The identity of hSTRAP(1-150) (indicated by an arrow in Fig.3.18) was confirmed by
in-gel digestion mass spectrometry (Fig.3.19). This figure shows that the band
found in elutions (Fig.3.18) was indeed a truncated version of hSTRAP, with peptide
coverage indicative of hSTRAP(1-150).
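Stating that peptide coverage is "indicative of hSTRAP(1-150)" amounts to checking that every identified peptide maps inside the construct boundaries. A minimal sketch of that check follows. The two peptides used are hypothetical examples picked from the N-terminal sequence for illustration, not the peptides actually identified in Figure 3.19, and the function names are likewise illustrative.

```python
# N-terminal 150 residues of hSTRAP, as printed in the mass spectrometry figures.
HSTRAP_1_150 = "".join("""
MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV
VGSVQGKAQV LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR
""".split())


def map_peptides(sequence, peptides):
    """Return the 1-based (start, end) position of each peptide in the sequence."""
    spans = []
    for peptide in peptides:
        index = sequence.find(peptide)
        if index < 0:
            raise ValueError(f"peptide {peptide} not found in sequence")
        spans.append((index + 1, index + len(peptide)))
    return spans


def coverage_fraction(sequence, spans):
    """Fraction of residues covered by at least one mapped peptide."""
    covered = set()
    for start, end in spans:
        covered.update(range(start, end + 1))
    return len(covered) / len(sequence)


# Hypothetical peptides, for illustration only (not the measured data).
example_peptides = ["AQVLMLTGK", "LEPELVEAWNQLGEVYWK"]
example_spans = map_peptides(HSTRAP_1_150, example_peptides)
print(example_spans, coverage_fraction(HSTRAP_1_150, example_spans))
```

A peptide that failed to map onto the construct sequence would raise an error here, which is the computational analogue of coverage falling outside the expected residue range.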
Figure 3.17. Expression of hSTRAP(1-150). 15% SDS PAGE gels showing A:
Soluble (lanes 2-6) and insoluble fractions (lanes 7-11) of BL21(DE3)pLysS cells
transformed with pET-14b-His-hSTRAP(1-150), 1 hour (lanes 3 and 8), 2 hours (lanes
4 and 9), 3 hours (lanes 5 and 10) and 4 hours (lanes 6 and 11) after induction with IPTG.
B: Total lysate of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(1-150), 1 hour
(lane 3), 2 hours (lane 4), 3 hours (lane 5) and 4 hours (lane 6) post IPTG induction. Lane 1
in (A) and (B) represents protein markers. Lanes 2 and 7 are pre-induction fractions.
Both gels were loaded with 15 µl of sample and stained with instant blue.
Migration of hSTRAP(1-150) is indicated by an arrow.
Figure 3.18. Purification of hSTRAP(1-150) protein. 15% SDS PAGE gels showing
A: hSTRAP(1-150) purification at pH 7.4 from BL21(DE3)pLysS cells. B: hSTRAP(1-150)
purification from BL21(DE3)pLysS at pH 8.7. Fifteen microlitres of sample were loaded on
both gels and all samples were diluted 1:40. Both gels were stained with instant blue.
Migration of hSTRAP(1-150) is indicated by an arrow.
Table 3.5. Estimation of hSTRAP(1-150) protein concentration.
Elution   OD595   Concentration (mg/ml)   Elution volume (ml)   Total protein quantity (mg)
1         0.650   19.5                    1.5                   29.25
2         1.150   34.5                    1.5                   51.75
3         1.234   37.02                   1.5                   55.53
4         0.734   22.02                   1.5                   33.03
5         0.691   20.73                   1.5                   31.10
6         0.354   10.62                   1.5                   15.93
The concentration of hSTRAP(1-150) protein measured by Bradford assay in the elutions
obtained from the purification method described in Figure 3.18B and determined as
described in Materials and Methods.
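The yield estimates quoted for hSTRAP(1-219) and hSTRAP(1-150) follow from multiplying each elution's concentration by its volume and summing over the six elutions. The sketch below reproduces that arithmetic from the values in Tables 3.4 and 3.5; the variable and function names are illustrative.

```python
# (concentration in mg/ml, elution volume in ml) for elutions 1 to 6.
TABLE_3_4 = [(0.744, 1.0), (1.308, 1.0), (1.335, 1.0),
             (0.675, 1.0), (0.579, 1.0), (0.513, 1.0)]   # hSTRAP(1-219)
TABLE_3_5 = [(19.5, 1.5), (34.5, 1.5), (37.02, 1.5),
             (22.02, 1.5), (20.73, 1.5), (10.62, 1.5)]   # hSTRAP(1-150)


def total_yield_mg(elutions):
    """Sum of concentration x volume over all elutions, in mg."""
    return sum(concentration * volume for concentration, volume in elutions)


print(f"hSTRAP(1-219): {total_yield_mg(TABLE_3_4):.2f} mg")
print(f"hSTRAP(1-150): {total_yield_mg(TABLE_3_5):.2f} mg")
```

The two sums are consistent with the "just over 5 mg" and "approximately 210 mg" yields stated in the text.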
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV
61 VGSVQGKAQV LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
121 AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR
181 SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
241 GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH
301 LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS
361 DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
Figure 3.19. hSTRAP(1-150) mass spectrometry. Mass spectrometry was performed on
the 18 kDa hSTRAP protein obtained from the gel shown in Fig.3.18B, lane 2. Amino acids
highlighted in red indicate peptide coverage of the five unique peptides obtained from the
in-gel digestion mass spectrometry of this 19 kDa band.
3.1.3.3.4 Expression and purification of hSTRAP(151-284)
Optimum conditions for the expression of soluble hSTRAP(151-284) in
BL21(DE3)pLysS cells were identified as induction with 0.1 mM IPTG at an OD600 of
0.5, followed by 3 hrs incubation at 25°C (Fig.3.20). Analysis of lysed samples by
SDS-PAGE indicated that hSTRAP(151-284) was over-expressed (Fig.3.20B, Lanes
2-6), although mainly insoluble (Fig.3.20A, Lanes 2-6). However, this could be due
to the method of lysis and sample analysis, as these samples are from small scale
expression trials. Small scale and large scale preparations differ in the method of
pellet lysis and in sample buffer conditions, which could ultimately affect the
solubility of the protein: small-scale lysis is carried out with BugBuster, whereas
for large-scale purification samples are lysed through cell disruption or
sonication, and buffers can be optimized. hSTRAP(151-284) migrates at the correct
molecular weight of around 16 kDa, as the hSTRAP(151-284) amino acid sequence
accounts for 15.2 kDa (Table 3.3) and the His tag for approximately 1 kDa.
Large scale purification of hSTRAP(151-284) was attempted initially at pH 7.4
(Fig.3.21A); however, the purity of the eluted protein was less than 70%, so the
purification protocol had to be optimised. Furthermore, hSTRAP(151-284) remained
strongly bound to the TALON His-tag affinity resin even after substantial washing
of the resin with elution buffer (Fig.3.21A, Lane 4), and was therefore not
successfully eluted (Fig.3.21B, lanes 7-8). This also suggested that the
purification protocol had to be optimised. For that reason the pH of the
purification buffers was increased to 8.7 (Fig.3.21B), which yielded pure
hSTRAP(151-284) protein bound to the resin (Fig.3.21B, Lane 9) and in elutions
(Fig.3.21C, Lanes 9-10). The estimated yield of eluted protein was low, as less
than 0.2 mg of protein was obtained from 2 litres of E. coli culture. This quantity
of hSTRAP(151-284) is insufficient for structural studies, although the culture
volume could be scaled up to increase the amount of protein obtained. Pure and
stable hSTRAP(151-284) could be readily obtained bound to the resin (Fig.3.21B,
Lane 9); hence, this construct can be used to determine its interacting partners.
Protein identity was confirmed through in-gel digestion mass spectrometry, and this
band was identified as a truncated form of hSTRAP with coverage indicative of
hSTRAP(151-284) (Fig.3.21D).
Figure 3.20. Expression of hSTRAP(151-284). 15% SDS PAGE gels showing A:
Insoluble (lanes 2-6) and soluble fractions (lanes 7-11) of BL21(DE3)pLysS cells
transformed with pET-14b-His-hSTRAP(151-284), 1 hr (lanes 3 and 8), 2 hrs (lanes 4
and 9), 3 hrs (lanes 5 and 10) and 4 hrs (lanes 6 and 11) after induction with IPTG. B: Total
lysate of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(151-284), 1 hr (lane 3),
2 hrs (lane 4), 3 hrs (lane 5) and 4 hrs (lane 6) post IPTG induction. Lane 1 in (A) and (B)
represents protein markers. Lanes 2 and 7 are pre-induction fractions. Both gels
were loaded with 15 µl of sample and stained with instant blue. hSTRAP(151-284)
migrated at approximately 16 kDa as indicated.
(D)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY
31 FETHSVEDAG RKQQDVQKEM EKTLQQMEEV
61 VGSVQGKAQV LMLTGKALNV TPDYSPKAEE
91 LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
121 AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR
151 TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR
181 SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
241 GEALEGFSRA AALDPAWPEP RQREQQLLEF
271 LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH
301 LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG
331 VNSGAVILGK VVFSLTTEEK VPFTFGLVDS
361 DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN
391 LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
Figure 3.21. Purification of hSTRAP(151-284). 15% SDS PAGE gels showing A:
hSTRAP(151-284) purification from BL21(DE3)pLysS, at pH 7.4. B: hSTRAP(151-284)
purification from BL21(DE3)pLysS, at pH 8.7. C: hSTRAP(151-284) elutions obtained
from the purification in B. D: Mass spectrometry performed on the 16 kDa hSTRAP protein
obtained from the gel shown in C, lane 10. Amino acids highlighted in red indicate peptide
coverage of the unique peptides obtained from the in-gel digestion mass spectrometry of this
16 kDa band. All gels were loaded with 15 µl of sample and stained with instant blue.
3.1.3.3.5 Expression and purification of hSTRAP(285-440)
Expression trials identified the optimum conditions for hSTRAP(285-440)
expression as BL21(DE3)pLysS cells induced with 0.1 mM IPTG at an OD600 of
0.5, followed by 4 hrs incubation at 25°C in LB media (Fig.3.22). SDS-PAGE gels
analysing samples representative of the soluble and insoluble protein fractions of lysed
cells showed that 4 hrs after induction, hSTRAP(285-440) was expressed in the
soluble form (Fig.3.22A, Lane 11). hSTRAP(285-440) migrated at the correct
molecular weight of approximately 18 kDa, taking into account 16.6 kDa
corresponding to the hSTRAP(285-440) amino acid sequence (Table 3.3) and
approximately 1 kDa corresponding to the His tag (Fig.3.1).
Large scale purification of this protein was carried out initially at pH 7.4. The protein
was mainly found in the insoluble fraction (Fig.3.23A, Lane 3) and many
contaminants were bound to the resin (Fig.3.23A, Lane 10). This suggested that the
purification protocol had to be optimised further before proceeding to elution of the
protein from the column. The pH of the purification buffers was increased to 8.7, which
improved the purity of protein bound to the resin by approximately 60% (Fig.3.23B,
Lane 8) relative to the initial purification at pH 7.4 (Fig.3.23A, Lane 10). However,
samples of the final wash flow-through of the resin still contained contaminants
(Fig.3.23B, Lane 7). Further washing of the resin with wash buffer eventually yielded pure
hSTRAP(285-440) protein bound to the resin (Fig.3.23C, Lane 7).
Pure hSTRAP(285-440) protein was obtained in all elutions at pH 8.7 (Fig.3.23D),
and the total yield from 1 L of E. coli culture was estimated at less than 0.2 mg. However,
at that point we learned that the structure of the C terminus of hSTRAP, residues 262-422,
had been solved by X-ray crystallography [2XVS], so it was decided to use
this construct primarily for biochemical pull-down studies. This was possible
because pure, stable protein bound to the His tag affinity resin could be readily
obtained (Fig.3.23C, Lane 7). However, CD experiments will still be carried out
with this construct, as these experiments would determine the folding state and
thermal stability of hSTRAP(285-440).
Protein identity was confirmed through in-gel digestion mass spectrometry of the
band found in the elutions shown in Figure 3.23D. This band was identified as hSTRAP,
and the peptide coverage was within the expected range, amino acids 285-440
(Fig.3.23E).
Figure 3.22. Expression of hSTRAP(285-440). 15% SDS PAGE gels showing A:
Insoluble (lanes 2-6) and soluble fractions (lanes 7-11) of BL21(DE3)pLysS cells
transformed with pET-14b-His-hSTRAP(285-440), 1 hour (lanes 3 and 8), 2 hours
(lanes 4 and 9), 3 hours (lanes 5 and 10) and 4 hours (lanes 6 and 11) post IPTG induction.
B: Total lysate of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(285-440), 1 hour
(lane 3), 2 hours (lane 4), 3 hours (lane 5) and 4 hours (lane 6) post IPTG induction. Lane 1
in (A) and (B) represents protein markers. Lanes 2 and 7 are pre-induction fractions. Both
gels were loaded with 15 µl of sample and stained with instant blue. Migration of
hSTRAP(285-440) is indicated by an arrow.
(E)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
81 TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH
161 VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
241 GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE
321 LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG
401 KDYSFSSVRV ETPLLLVVNG KPQGSSSQAV ATVASRPQCE
Figure 3.23. Purification of hSTRAP(285-440). 15% SDS PAGE gels showing A:
hSTRAP(285-440) purification from BL21(DE3)pLysS, at pH 7.4. B: hSTRAP(285-440)
purification from BL21(DE3)pLysS, at pH 8.7. C: As in B, but with the resin washed further.
D: Pure hSTRAP(285-440) elutions obtained from the purification in C. E: Mass spectrometry
performed on the 17 kDa hSTRAP protein obtained from the gel shown in D. Amino acids
highlighted in red indicate peptide coverage of the unique peptides obtained from the in-gel
digestion mass spectrometry of this 18 kDa band. All gels were loaded with 15 µl of sample
and stained with instant blue.
3.2 Identification of hSTRAP interacting partners in MCF7 breast
cancer cells
One aim of this project was to identify interacting partners of hSTRAP in breast cancer,
and the regions responsible for these interactions, as there is limited published data on
hSTRAP in this regard. Previously published data have implicated STRAP in
the DNA damage [136, 144-145, 147-148] and stress response [148, 149] pathways, regulation
of the glucocorticoid receptor [150] and p53 function [136, 143]. hSTRAP is hypothesized
to be involved in diverse regulatory pathways due to the presence of its six predicted
TPR motifs, which are important in mediating protein-protein interactions [136]. These
TPR motifs could potentially bridge multiple protein complexes and form extensive
protein networks. This needed to be investigated further, and the following sections
list the proteins that were identified as interacting with hSTRAP in vitro.
Biochemical pull down assays were carried out using MCF7 cellular extracts with the
hSTRAP protein variants (bait) bound to their respective affinity resin. hSTRAP
interacting proteins were then identified through in-gel digestion mass spectrometry of
whole biochemical pull down assay samples. The resulting list of potential hSTRAP
interacting proteins was then submitted to the DAVID bioinformatics software [see Materials
and Methods Section 2.18] for association network analysis and to identify potential functions
of hSTRAP.
Mass spectrometry rather than western blotting was used in order to maximize the number
of hSTRAP interacting proteins identified from diverse pathways, rather than focusing on a
specific pathway. MCF7, a human breast cancer cell line, was chosen for this
study primarily to investigate the involvement of hSTRAP in breast cancer. This cell line
was also chosen because it is routinely used in the lab and was available at the time.
The protein constructs used for these interaction studies include full length hSTRAP with
GST and with 6His tags, named GST-hSTRAP(1-440) and His-hSTRAP(1-440)
respectively. As mentioned previously, this enables interaction data for full-length
hSTRAP with two different tags to be analyzed and compared. This approach provides
an internal control for possible experimental artifacts linked to the presence of co-purified
E. coli proteins and/or interference of the tags themselves with protein-ligand interactions.
Furthermore, truncated versions of hSTRAP were also successfully cloned into pET-14b;
these will be used to narrow down the region of hSTRAP-ligand interaction and to further
enhance data reliability by providing more statistics. These truncated hSTRAP constructs
are hSTRAP(1-219), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440).
Biochemical studies will not be carried out with hSTRAP(220-440), as this protein
construct is very unstable when bound to the His tag affinity resin, which makes it
impractical for these pull down experiments.
3.2.1 Purification of hSTRAP protein variants
All hSTRAP constructs bound to their respective affinity resin for use in subsequent
pull downs were analyzed by SDS PAGE to determine their purity (Fig.3.24) and, through
in-gel digestion mass spectrometry, to reconfirm protein identity (Fig.3.25). SDS
PAGE analysis was also necessary to ensure that equal quantities of the different pure
hSTRAP protein variants were used in subsequent pull down assays (Fig.3.24). In
addition, resin samples were heavily overloaded on this gel to ensure that no obvious
contaminating proteins were present in the samples used as baits (Fig.3.24). Protein bands
representative of the GST tag and the hSTRAP variants bound to their respective resin
(Fig.3.24, Bands 1-7) were characterized through mass spectrometry to reconfirm their
identity. Protein band 1 (Fig.3.24A) was identified as GST (to be used as a negative
control in pull-downs) and Bands 2-7 (Fig.3.24B-F respectively) were identified as hSTRAP
by in-gel digest MS, with peptide coverage indicative of residues 1-440, 1-440, 285-440,
1-219, 1-150 and 151-284 respectively.
Figure 3.24. hSTRAP variants used for biochemical binding assays. hSTRAP protein
variants purified and bound to their respective resin, to be used for subsequent pull downs,
were analyzed using 15% SDS PAGE stained with coomassie. Fifty microlitres of resin sample
was loaded in each lane. The gels show resin samples of A: GST; B: GST-hSTRAP(1-440);
C: His-hSTRAP(1-440) (lane 2) and His-hSTRAP(285-440) (lane 3); D: His-hSTRAP(1-219);
E: His-hSTRAP(1-150); F: His-hSTRAP(151-284).
(A) Band 1: GST
1 MSPILGYWKI KGLVQPTRLL LEYLEEKYEE HLYERDEGDK WRNKKFELGL EFPNLPYYID GDVKLTQSMA
71 IIRYIADKHN MLGGCPKERA EISMLEGAVL DIRYGVSRIA YSKDFETLKV DFLSKLPEML KMFEDRLCHK
141 TYLNGDHVTH PDFMLYDALD VVLYMDPMCL DAFPKLVCFK KRIEAIPQID KYLKSSKYIA WPLQGWQATF
211 GGGDHPPK
(B) Band 2: GST-hSTRAP(1-440)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71 LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
(C) Band 3: His-hSTRAP(1-440)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71 LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
(D) Band 4: hSTRAP(285-440)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71 LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
(E) Band 5: hSTRAP(1-219)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71 LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
(F) Band 6: hSTRAP(1-150)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71 LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
(G) Band 7: hSTRAP(151-284)
1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71 LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
Figure 3.25. In-gel digestion mass spectrometry analysis. In-gel digestion mass
spectrometry characterisation of the resin fractions of the different protein variants (Figure 3.24)
used for subsequent pull down assays. A: GST; B: GST-hSTRAP(1-440); C: His-hSTRAP(1-440);
D: His-hSTRAP(285-440); E: His-hSTRAP(1-219); F: His-hSTRAP(1-150);
G: His-hSTRAP(151-284). Amino acids highlighted in red are the peptides detected by mass spectrometry.
3.2.2 Pull downs using MCF7 cellular extract
In principle, MS characterization of proteins in mixtures can be implemented through
two strategies. In the first, the protein mixture is separated by SDS PAGE before MS analysis,
e.g. a specific area of the gel containing the separated samples is digested with trypsin and
analysed by MS. In the second, the whole mixture of proteins is trypsin-digested and
characterized without prior separation, relying on the high peptide separating and
discriminating power of MS itself. According to the SDS PAGE gels showing the biochemical
pull down assay samples (Fig.3.26), hSTRAP appeared to have many interacting partners.
Selecting a particular region of the gel would therefore capture only a limited number of
interaction partners, and potentially miss other ligands of different molecular size.
It was therefore decided to analyse the whole pull-down mixture, to maximize the number
of hSTRAP interacting partners identified.
Before proceeding to extensive mass spectrometry characterization of whole mixtures, it
had to be confirmed that the GST protein, which is used as a negative control in pull-down
assays, and all hSTRAP protein variants were still bound to their respective affinity resins
after these biochemical pull down assays were carried out. This is important because, if
the bait were no longer bound, the data could be misinterpreted: proteins could potentially
be interacting with the resin and not with hSTRAP. SDS PAGE gel analysis of
one set of repeats shows that GST and GST-hSTRAP(1-440) are both bound to GST affinity
resin after the pull downs were carried out (Fig.3.26A, Lanes 2 and 3 respectively).
hSTRAP(1-219), hSTRAP(151-284), hSTRAP(1-150), His-hSTRAP(1-440) and
hSTRAP(285-440) are likewise all bound to the His tag affinity resin after the pull downs were
performed (Fig.3.26B, Lanes 2 and 3, Fig.3.26C and Fig.3.26D Lanes 2 and 3
respectively). This was found to be the case for the other two repeats as well (gels not
shown). Once this was confirmed, samples were prepared for mass spectrometry
characterisation of the pull-down mixtures. To follow the same standard in-gel digestion
protocol, samples containing pull-down mixtures were loaded on the gel and run very
briefly, until all the proteins had entered the gel but were not yet separated (Fig.3.27). The bands
containing the whole mixtures were then cut out of the gels, and subjected to trypsin
digestion and MS.
Figure 3.26. hSTRAP biochemical pull down assays with MCF7 cellular extracts.
Various hSTRAP protein variants were purified from BL21(DE3)pLysS cells and incubated with
MCF7 cellular extracts as described in Materials and Methods. Proteins retained on the resin after
extensive washings were extracted and submitted for SDS-PAGE analysis. A: 7.5% SDS PAGE
gel showing GST tag (lane 2) and GST-hSTRAP(1-440) (lane 3) pull down samples. 12% SDS PAGE
gels showing B: His tag (lane 2), His-hSTRAP(1-219) and His-hSTRAP(151-284); C:
His-hSTRAP(1-150); D: both His-hSTRAP(1-440) and His-hSTRAP(285-440) pull down samples. Gels
were stained with Coomassie blue and fifty microlitres of sample was loaded in each lane.
Biochemical pull down assays were repeated three times and one representative experiment is
shown in the figure.
Figure 3.27. SDS-PAGE bands isolated and submitted to mass spectrometry analysis.
Various hSTRAP protein variants as indicated were purified from BL21(DE3)pLysS cells and
incubated with MCF7 cellular extracts. Proteins retained on the resin after extensive washes were
then submitted for SDS-PAGE analysis. Electrophoresis was run just long enough for the proteins to
enter the resolving gel, then the gel was stained with Coomassie. The bands indicated within red
boxes were isolated from the polyacrylamide gel and submitted for in-gel digestion mass spectrometry
analysis as described in Materials and Methods. Each independent biochemical pull down assay was
repeated three times for each protein, and the repeat number is highlighted in blue brackets (). A:
GST tag (lane 1) and GST-hSTRAP(1-440) (lane 2). B: GST tag (lanes 1 and 3) and
GST-hSTRAP(1-440) (lanes 2 and 4). C: His-hSTRAP(1-440) (lanes 1-3) and hSTRAP(285-440) (lanes
4 and 5). D: hSTRAP(285-440) (lane 1), hSTRAP(1-219) (lane 2) and hSTRAP(151-284) (lanes 3
and 4). E: hSTRAP(1-150) (lanes 1 and 2), His tag (lanes 3, 5 and 7), hSTRAP(1-219) (lane 4) and
hSTRAP(151-284) (lane 6). F: hSTRAP(1-150) (lane 2) and hSTRAP(1-219) (lane 2). 50 µl of
sample was loaded in each lane.
3.2.3 hSTRAP interacting partners
Table 3.6 shows all the proteins identified by mass spectrometry in the pull-down
mixtures for the different constructs of hSTRAP, across all three of the independent
pull-down experiments performed. These proteins were not present in the negative
controls, which were the pull-downs performed using GST bound to the resin, and His-tag
affinity resin, as baits. The proteins identified in these negative controls were excluded
from the analysis, as their presence could be explained by non-specific interaction with the
tag and/or resin rather than with hSTRAP. As the proteins shown in Table 3.6 were not found
to interact with the negative controls, it is reasonable to assume that these proteins are
interacting with hSTRAP. All of these proteins have two or more unique peptides (80+%
probability) identified and a Scaffold probability of over 95% (see Appendix for peptide
data and Mascot scores). hSTRAP therefore appears to interact with many proteins implicated
in diverse regulatory roles, as initially hypothesized (Table 3.6 and Table 3.7).
Each of these interactions still needs to be confirmed, as the methodology implemented may
lead to the inclusion of some false positives in the dataset. Nevertheless, we decided to
proceed with further bioinformatic analysis of this dataset, to check whether the
identified potential ligands are functionally connected with each other and hence could be
parts of the same pathways.
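The selection logic described above (keep proteins that pass the identification thresholds in a bait pull-down and that never appear in a negative control) can be sketched in a few lines of Python. This is only an illustrative sketch: the `Hit` record, the function name and the example protein names are hypothetical, and the thresholds are exposed as parameters rather than taken as the exact settings used in Scaffold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One protein identification from one pull-down (hypothetical record)."""
    protein: str          # protein name or accession
    unique_peptides: int  # number of unique peptides identified
    probability: float    # protein identification probability, 0 to 1


def specific_interactors(bait_hits, control_hits,
                         min_peptides=2, min_probability=0.95):
    """Return proteins passing the thresholds in the bait pull-down
    that were not detected at all in the negative controls."""
    control_names = {hit.protein for hit in control_hits}
    return sorted({
        hit.protein
        for hit in bait_hits
        if hit.unique_peptides >= min_peptides
        and hit.probability > min_probability
        and hit.protein not in control_names
    })


# Made-up example data, for illustration only.
bait = [
    Hit("protein_A", 5, 0.99),   # passes
    Hit("protein_B", 1, 0.99),   # too few unique peptides
    Hit("protein_C", 3, 0.97),   # also found in the control
    Hit("protein_D", 4, 0.90),   # probability too low
]
control = [Hit("protein_C", 2, 0.96)]

print(specific_interactors(bait, control))  # ['protein_A']
```

Any protein seen in a control is excluded regardless of its score there, which mirrors the conservative exclusion described in the text.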
Table 3.6. hSTRAP interacting partners.
No. | Protein identified | MW (kDa) | Uniprot ID
1 | Myosin-9 | 227 | P35579
2 | cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | 82 | B4DGL0
3 | Triosephosphate isomerase | 31 | P60174
4 | Elongation factor Tu 1 (E. coli) | 43 | B1LHD9
5 | L-lactate dehydrogenase | 40 | P00338
6 | 30S ribosomal protein S5 (E. coli) | 18 | C6EGG3
7 | ATP synthase subunit beta, mitochondrial | 57 | P06576
8 | Phosphoglycerate kinase | 45 | P00558
9 | cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | 72 | B7Z4V2
10 | Ubiquitin-like modifier-activating enzyme 1 | 118 | P22314
11 | Peptidyl-prolyl cis-trans isomerase (E. coli) | 21 | A1AGN5
12 | Ribose-phosphate pyrophosphokinase (E. coli) | 37 | A1AAD4
13 | Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | 37 | A1AGE3
14 | Filamin-B | 278 | O75369
15 | Epiplakin 1 | 556 | P58107
16 | Filamin A | 280 | P21333
17 | Eukaryotic initiation factor 4A-I | 46 | P60842
18 | Tu translation elongation factor, mitochondrial precursor | 50 | P49411
19 | Myosin, heavy chain 14 isoform 1 | 228 | Q7Z406-1
20 | Tubulin | 50 | P07437
21 | DNA damage dependent protein kinase C Catalytic subunit | 469 | P78527-1
22 | Actin | 42 | P60709
23 | Elongation factor 1-alpha 1 | 50 | P68104
24 | Fatty acid synthase | 273 | P49327
25 | Alpha enolase | 47 | P06733-1
Detection columns: GST-hSTRAP(1-440); His-hSTRAP(1-440); hSTRAP(1-219); hSTRAP(1-150); hSTRAP(151-284); hSTRAP(285-440).
[Detection scores, listed in extracted order; their assignment to individual proteins and constructs is not recoverable from the text.]
First table page (proteins 1-21): + (2); +++ (5-14); +++ (2-6); +++ (2-5); +++ (5-13); ++ (3-4); +++ (6-10); ++ (2-3); + (3); + (3); ++ (2); ++ (2); + (4); +++ (2-3); ++ (2); ++ (2); + (2); + (2); ++ (2-3); + (4); ++ (2); ++ (3); ++ (5); +++ (2-7); ++ (2-3); ++ (2-5); +++ (5-9); ++ (5-14); +++ (2-15); ++ (6); ++ (2-6); +++ (2-5); ++ (4-6); ++ (2-4); + (7); +++ (2-9); ++ (4-7); +++ (3); + (2); ++ (2-3); +++ (2-5); +++ (2-3); +++ (2-15); ++ (2-10); +++ (3-4); ++ (3-5); +++ (6-20); +++ (2-5)
Second table page (proteins 22-25): +++ (3); + (7); ++ (4-5); +++ (5-6); +++ (6-18); +++ (5-8); +++ (3); +++ (2-3); ++ (4-6); +++ (4-6); +++ (5-10); +++ (3-7); +++ (3-8); +++ (4-5); ++ (2-5); +++ (13-25); +++ (2-6)
Proteins identified as hSTRAP interacting partners using in-gel digestion mass spectrometry. (+), (++) and (+++) indicate that the
interacting protein was detected in one, two or three, respectively, of the independent biochemical pull down assays carried out for the
hSTRAP protein variant indicated in the column heading. Each biochemical assay was repeated three times. The number of unique peptides
identified varied between the separate biochemical pull down assays and is shown in brackets. The molecular weight (kDa) of each of the
25 hSTRAP interacting proteins identified and its UNIPROT ID are indicated. Unless stated otherwise these proteins are the human
orthologue. All peptide data are shown in appendix one, which lists all the peptides detected in each pull down and their respective
Mascot scores.
The UNIPROT IDs of all 20 hSTRAP interacting proteins (human orthologues) were
submitted to the DAVID bioinformatics software for analysis. DAVID assigned all 20 proteins
to their respective pathways, as shown in Table 3.7. These pathways included
the regulation of the actin cytoskeleton, translation, DNA damage, the stress
response pathway, glycolysis and various other metabolic pathways. This supports
our original hypothesis that hSTRAP may be implicated in
diverse regulatory roles.
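As a small worked example of how the 20 human entries are obtained from the 25 proteins of Table 3.6, the sketch below removes the entries annotated "(E. coli)" and prints the remaining accessions as a one-per-line list of the kind that can be pasted into an ID-list submission form. The pairing of names and accessions follows their order of appearance in Table 3.6, which is an assumption of this sketch, and the function name is hypothetical.

```python
# (protein name, UniProt accession) pairs, paired by order of appearance in Table 3.6.
TABLE_3_6 = [
    ("Myosin-9", "P35579"),
    ("HSP 90-beta (cDNA FLJ53619)", "B4DGL0"),
    ("Triosephosphate isomerase", "P60174"),
    ("Elongation factor Tu 1 (E. coli)", "B1LHD9"),
    ("L-lactate dehydrogenase", "P00338"),
    ("30S ribosomal protein S5 (E. coli)", "C6EGG3"),
    ("ATP synthase subunit beta, mitochondrial", "P06576"),
    ("Phosphoglycerate kinase", "P00558"),
    ("Stress-70 protein (cDNA FLJ51907)", "B7Z4V2"),
    ("Ubiquitin-like modifier-activating enzyme 1", "P22314"),
    ("Peptidyl-prolyl cis-trans isomerase (E. coli)", "A1AGN5"),
    ("Ribose-phosphate pyrophosphokinase (E. coli)", "A1AAD4"),
    ("MreB (E. coli)", "A1AGE3"),
    ("Filamin-B", "O75369"),
    ("Epiplakin 1", "P58107"),
    ("Filamin A", "P21333"),
    ("Eukaryotic initiation factor 4A-I", "P60842"),
    ("Tu translation elongation factor, mitochondrial", "P49411"),
    ("Myosin, heavy chain 14 isoform 1", "Q7Z406-1"),
    ("Tubulin", "P07437"),
    ("DNA damage dependent protein kinase C catalytic subunit", "P78527-1"),
    ("Actin", "P60709"),
    ("Elongation factor 1-alpha 1", "P68104"),
    ("Fatty acid synthase", "P49327"),
    ("Alpha enolase", "P06733-1"),
]


def human_accessions(entries):
    """Keep accessions of entries not annotated as E. coli proteins."""
    return [acc for name, acc in entries if "(E. coli)" not in name]


human_ids = human_accessions(TABLE_3_6)
print(len(human_ids))        # 20
print("\n".join(human_ids))  # one accession per line
```

The five excluded entries are presumably carried over from the bacterial expression host, which is why only the human proteins are taken forward to pathway analysis.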
Table 3.7. Function of hSTRAP interacting proteins.
Protein | DAVID assigned pathway
Triosephosphate isomerase | Glycolysis, Metabolism of carbohydrates, Integration of energy metabolism
Phosphoglycerate kinase | Glycolysis, Metabolism of carbohydrates, Integration of energy metabolism
Alpha enolase | Glycolysis, Metabolism of carbohydrates, Integration of energy metabolism
L-lactate dehydrogenase | Glycolysis/gluconeogenesis, Metabolism of carbohydrates, Hypoxia inducible factor in the Cardiovascular system
Actin | Viral Myocarditis, Tight Junction, Focal adhesions, Chromatin Remodeling by hSWI/SNF ATP-dependent Complexes, cell morphogenesis
Myosin 9 | Viral Myocarditis, Tight Junction, Cytoskeletal regulation by Rho GTPase, microtubule cytoskeleton organisation
Myosin, heavy chain 14 isoform 1 | Viral Myocarditis, Tight junction, Cytoskeletal regulation by Rho GTPase, actin filament based processes
Filamin A | Focal adhesion, cytoskeletal organisation
Filamin B | Focal adhesion, cytoskeletal organisation
Tubulin | Cytoskeletal regulation by Rho GTPase, microtubule cytoskeleton organisation
ATP synthase subunit beta, mitochondrial | Integration of energy metabolism, oxidative phosphorylation
Fatty acid synthase | Integration of energy metabolism
Ubiquitin-like modifier-activating enzyme 1 | Ubiquitin mediated proteolysis
DNA damage dependent protein kinase C Catalytic subunit | DNA damage (Non homologous end joining), cell cycle
Eukaryotic initiation factor 4A-I | Translation
Elongation factor 1-alpha 1 | Translation
Epiplakin 1 | Cytoskeletal organisation
Tu translation elongation factor, mitochondrial precursor | Translation
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | Stress response pathway
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | Stress response pathway
This table lists all the human orthologues of hSTRAP interacting partners and their relative
implications in the cell cycle associated pathways as assigned by DAVID bioinformatics.
The UNIPROT IDs of the 20 human hSTRAP interacting proteins shown in Table 3.6
were then submitted to GeneMania and String 9.0 software (see Materials and Methods
Section 2.18) to deduce the interaction status of all these proteins. These data were then
used to build an interaction network in Cytoscape (see Materials and
Methods Section 2.18), showing the possible implication of hSTRAP in breast cancer (see
Fig.3.28). This network indicates hSTRAP relevance in breast cancer as well as a
possible functional role as a scaffolding protein. The way hSTRAP fits into the existing
known interaction network, and interacts with proteins which are functionally related,
allows the hypothesis to be formulated that hSTRAP is potentially involved in cellular
migration, glycolysis and various metabolic pathways. This hypothesis should be
tested further by more targeted studies of the individual interactions identified in Table 3.6,
confirming their presence in vitro and in vivo using independent assays. The latter will be
discussed in detail in the Discussion section of this thesis.
Figure 3.28. hSTRAP implication in cancer related pathways. This network was created
in Cytoscape as described in Materials and Methods Section 2.18. The connections in red and blue
are interactions proved by experiments performed in this investigation or from existing published
data as shown by GeneMania and String respectively. Connections in pink are connections that are
proven by experiments by either GeneMania or String bioinformatic tools. The connections in grey
are those that are predicted or from text mining from one or both of the programs GeneMania and
String. Gene names are highlighted in pink nodes. Abbreviations: TTC5, hSTRAP; MY09, Myosin
9; ACTN, Actin; FLNA, Filamin A; FLNB, Filamin B; MYH14, Myosin, heavy chain 14 isoform
1; PRKDC, DNA damage dependent protein kinase C Catalytic subunit; LDHA, L-lactate
dehydrogenase; EIF4A1, Eukaryotic initiation factor 4A-I; EEF1A, Elongation factor 1-alpha 1;
TUBB, Tubulin; EFTU, Tu translation elongation factor, mitochondrial precursor; ENO, Alpha
enolase; UBA1, Ubiquitin-like modifier-activating enzyme 1; FASYN, Fatty acid synthase; TPI,
Triosephosphate isomerase; PGK1, Phosphoglycerate kinase; ATPSYN, ATP synthase subunit
beta, mitochondrial; EPIPN1, Epiplakin 1; HSP70, cDNA FLJ51907, highly similar to Stress-70
protein, mitochondrial; HSP90AB1, cDNA FLJ53619, highly similar to Heat shock protein HSP
90-beta.
3.3 Biophysical and structural studies carried out using full length
and truncated versions of hSTRAP
The following section contains all the data obtained from circular dichroism,
crystallography and NMR experiments carried out on all hSTRAP protein constructs
cloned into the N terminal His and GST tag vectors, pET14-b and pGEX-6P-1
respectively. High resolution structural studies require a concentrated solution of pure,
stable protein as the starting point: specifically, a
minimum of 20 µl at over 10 mg/ml for crystallographic trials, and 0.5 ml at 0.5-1 mM for
NMR experiments. Circular dichroism experiments, however, require only a very low
concentration of pure protein, less than 15 µM. Previous work (Section 3.1) established the
protocols for the expression and purification of all hSTRAP protein variants. The following
sections include details of the buffer optimization trials carried out in order to concentrate
hSTRAP protein to high concentrations and improve protein stability. Section 3.1
identified four of the five truncated hSTRAP constructs with which these biophysical
studies were possible, namely hSTRAP(1-219), hSTRAP(1-150), hSTRAP(151-284) and
hSTRAP(285-440), as well as both full length constructs, His-hSTRAP(1-440) and
GST-hSTRAP(1-440). Protein construct hSTRAP(220-440) was unstable when bound to the His
tag affinity resin and in elution (Fig.3.16). Furthermore, during the course of this study the
structure of the C terminus of hSTRAP(262-422) [2XVS] was solved by another group,
and for this reason this protein was not included in the biophysical studies carried out in
this thesis.
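To put the sample requirements quoted above on a common footing, the following sketch converts them into milligrams of protein. The molecular weight of 50 kDa is an assumption made for illustration (it is the approximate size reported for His-hSTRAP(1-440) by gel filtration later in this chapter); the function names are hypothetical.

```python
def mass_mg_from_mg_per_ml(conc_mg_per_ml, volume_ul):
    """Protein mass (mg) in a sample of given concentration (mg/ml) and volume (µl)."""
    return conc_mg_per_ml * volume_ul / 1000.0


def mass_mg_from_molar(conc_mM, volume_ml, mw_kda):
    """Protein mass (mg): mM x ml gives µmol, and 1 kDa equals 1 mg per µmol."""
    return conc_mM * volume_ml * mw_kda


def mg_per_ml_from_uM(conc_uM, mw_kda):
    """Convert a micromolar concentration to mg/ml."""
    return conc_uM * mw_kda / 1000.0


MW_KDA = 50.0  # assumed approximate molecular weight, for illustration only

print(mass_mg_from_mg_per_ml(10.0, 20.0))    # crystallography: 0.2 mg
print(mass_mg_from_molar(0.5, 0.5, MW_KDA))  # NMR, lower bound: 12.5 mg
print(mass_mg_from_molar(1.0, 0.5, MW_KDA))  # NMR, upper bound: 25.0 mg
print(mg_per_ml_from_uM(15.0, MW_KDA))       # CD upper limit: 0.75 mg/ml
```

Under this assumption, NMR is by far the most demanding technique in terms of total protein, while crystallography is the most demanding in terms of concentration.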
3.3.1 Biophysical and structural studies carried out on His-hSTRAP(1-440)
3.3.1.1 Circular Dichroism on His-hSTRAP(1-440)
CD was used to estimate the secondary structure content of the protein constructs, and to
identify whether the protein was folded. Furthermore, the temperature dependence of the CD
spectra was measured to characterize the thermal stability of His-hSTRAP(1-440). The protein
concentration used for this CD experiment was 1 µM.
An initial scan was taken at 4°C to determine the folding state and secondary structure
composition of His-hSTRAP(1-440). A thermal stability experiment was then carried out,
in which readings at 220 nm were recorded every 0.2°C from 4 to 80°C. Another scan was
taken once the temperature had been returned to 4°C, to determine
whether the protein refolds.
The initial scan taken at 4°C (Fig.3.29A) showed that the sample was folded and contained
various secondary structure elements. Mean residue molar ellipticity values for this
construct were then fed into the program Dichroweb to determine the percentage of each
secondary structure element in this construct. This showed that His-hSTRAP(1-440) is
11% α-helical, 25% β, 31.2% turn and 32.8% disordered. Monitored at 220 nm, His-hSTRAP(1-440)
did not show a clear co-operative unfolding transition over the whole temperature range
tested (Fig.3.29B), and it re-folded reversibly once the temperature was reversed
(Fig.3.29C). The CD results therefore suggest that the protein is folded at the secondary
structure level; however, the absence of a distinct unfolding transition suggests that this
protein construct exists in a molten globule state.
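For readers unfamiliar with the quantities involved, the sketch below shows the commonly used conversion from a raw CD signal in millidegrees to mean residue ellipticity, and enumerates the temperature points of a melt recorded every 0.2°C from 4 to 80°C. The example signal, path length and residue count are placeholders chosen for illustration and are not values from this work; some conventions divide by the number of peptide bonds (N - 1) rather than the number of residues.

```python
import math


def mean_residue_ellipticity(theta_mdeg, conc_uM, path_cm, n_residues):
    """Mean residue ellipticity in deg cm^2 dmol^-1.

    Uses [theta] = theta(mdeg) / (10 * c(mol/l) * l(cm) * N).
    """
    conc_molar = conc_uM * 1e-6
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


def melt_temperatures(start_c=4.0, stop_c=80.0, step_c=0.2):
    """Temperature points of a thermal melt, end points included."""
    n_steps = round((stop_c - start_c) / step_c)
    return [round(start_c + i * step_c, 1) for i in range(n_steps + 1)]


# Placeholder numbers, purely illustrative.
example = mean_residue_ellipticity(theta_mdeg=-10.0, conc_uM=10.0,
                                   path_cm=0.1, n_residues=100)
print(round(example))            # -10000

temps = melt_temperatures()
print(len(temps), temps[0], temps[-1])  # 381 4.0 80.0

# The reported secondary structure fractions should sum to 100%.
print(math.isclose(11 + 25 + 31.2 + 32.8, 100.0))  # True
```

A melt of this kind therefore yields 381 readings at the single monitored wavelength.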
Figure 3.29. CD experiments carried out on His-hSTRAP(1-440). A: CD spectrum
recorded at 4°C before applying variable temperatures. B: CD signal recorded at a fixed
wavelength of 220 nm at temperatures ranging from 4 to 80°C. C: CD spectrum
recorded at 4°C, after the variable temperature experiment was carried out. D: Superimposition of
A (highlighted in blue) and C spectra (highlighted in green). One representative experiment is
shown out of three repeats. Abbreviations: MRME, Mean residue molar ellipticity.
3.3.1.2 X-ray Crystallography on His-hSTRAP(1-440)
Once it was confirmed by in-gel digestion mass spectrometry that His-hSTRAP(1-440) was
present after the first step of affinity purification, His-hSTRAP(1-440) elutions were
pooled together and dialyzed into two buffers. As the process of protein concentration can
cause aggregation and losses, the effect of salt concentration on the stability of
His-hSTRAP(1-440) within buffers was investigated. Two buffers were tested; both contained
50 mM Sodium Phosphate, Arginine and Glutamic acid and 10 mM β-mercaptoethanol; in
addition, buffer 1 contained 50 mM NaCl while buffer 2 contained 150 mM NaCl (n.b.
buffer 2 was also the buffer used for gel filtration). To solve the atomic-resolution structure
of any protein, a high concentration of soluble, stable, non-aggregated protein in the order
of 10 mg/ml is likely to be required, hence the His-hSTRAP(1-440) protein sample was
concentrated down to a small volume. A sample of concentrate in each buffer was analyzed
on a 10% SDS PAGE gel (Fig.3.30); additionally, the concentration of soluble protein in
each buffer condition was estimated using Bradford reagent (Table 3.8). The latter
measurements were also used to estimate the percentage loss of
His-hSTRAP(1-440) protein during the buffer exchange and concentration process.
In buffer 1, the concentration of soluble His-hSTRAP(1-440) protein obtained was
estimated at 10 mg/ml in 20 µl (Table 3.8), corresponding to an 83.4% loss. In buffer 2, the
concentration of soluble His-hSTRAP(1-440) protein obtained was estimated at 5 mg/ml in
20 µl (Table 3.8), corresponding to a 91.7% loss. These data suggested that lower salt
concentrations are favorable for His-hSTRAP(1-440) protein, although in this buffer the
presence of a 32 kDa degradation product was observed by SDS PAGE (Fig.3.30A). In
conclusion, the concentration of His-hSTRAP(1-440) did not increase in inverse proportion
to the volume, as would be expected if no protein were lost. The protein loss
observed could be attributed to aggregation and/or sticking of His-hSTRAP(1-440) protein
to the ultrafiltration membrane.
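The percentage losses quoted above follow from a simple mass balance: total protein is concentration multiplied by volume, and the loss is the fraction of the starting amount that is no longer in solution after concentration. The sketch below applies this to the values of Table 3.8; the function names are hypothetical. With the rounded tabulated values it gives about 83.3% for buffer 1 and 91.7% for buffer 2, so the small difference from the tabulated 83.4% presumably reflects rounding of the concentrations.

```python
def total_mg(conc_mg_per_ml, volume_ul):
    """Total protein (mg) from concentration (mg/ml) and volume (µl)."""
    return conc_mg_per_ml * volume_ul / 1000.0


def percent_loss(initial_mg, final_mg):
    """Percentage of the starting protein that was lost."""
    return 100.0 * (1.0 - final_mg / initial_mg)


initial = total_mg(0.3, 4000)    # starting sample: 1.2 mg
buffer_1 = total_mg(10.0, 20)    # concentrated in buffer 1: 0.2 mg
buffer_2 = total_mg(5.0, 20)     # concentrated in buffer 2: 0.1 mg

loss_1 = percent_loss(initial, buffer_1)
loss_2 = percent_loss(initial, buffer_2)
print(round(loss_1, 1), round(loss_2, 1))  # 83.3 91.7

# With no loss, a 200-fold volume reduction (4000 µl to 20 µl) of a
# 0.3 mg/ml sample would give 60 mg/ml.
print(round(0.3 * 4000 / 20, 1))  # 60.0
```

The last line makes the point of the paragraph explicit: the observed 5 to 10 mg/ml falls far short of the concentration expected from the volume reduction alone.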
Table 3.8. Estimation of soluble His-hSTRAP(1-440) protein concentration in different buffers
Protein sample | OD595 | Volume measurement (µl) | Concentration (mg/ml) | Volume (µl) | Percentage protein loss (%)
Initial hSTRAP Sample | 0.10 | 5.0 | 0.3 | 4000 | -
Concentrated His-hSTRAP(1-440) in Buffer 1 | 1.00 | 1.5 | 10.0 | 20 | 83.4
Concentrated His-hSTRAP(1-440) in Buffer 2 | 0.50 | 1.5 | 5.0 | 20 | 91.7
The concentration of His-hSTRAP(1-440) protein measured (by Bradford reagent, as described in
Materials and Methods) in samples before and after concentration in buffer 1 and buffer 2. Buffer
1: 50 mM Sodium Phosphate buffer, NaCl, Arginine and Glutamic acid, and 10 mM
β-mercaptoethanol. Buffer 2: 50 mM Sodium Phosphate buffer, 150 mM NaCl, 50 mM Arginine and
Glutamic acid, and 10 mM β-mercaptoethanol.
Figure 3.30. Concentration of His-hSTRAP(1-440) protein in different buffers. 10% SDS PAGE
gel comparing the difference in purity between samples of His-hSTRAP(1-440) protein in buffer
1 (lane 2) and buffer 2 (lane 3). Fifteen microlitres of sample was diluted as indicated in lane
headings and loaded in each lane. Gel was stained with Instant Blue. Buffer 1: 50 mM Sodium
Phosphate buffer, NaCl, Arginine and Glutamic acid, and 10 mM β-mercaptoethanol. Buffer 2:
50 mM Sodium Phosphate buffer, 150 mM NaCl, 50 mM Arginine and Glutamic acid and 10 mM
β-mercaptoethanol.
Following these first protein stability trials, the long-term stability of both concentrated
samples in buffers 1 and 2 was studied. Samples were incubated for four weeks at 4°C and
then analyzed by SDS PAGE. Achieving long-term His-hSTRAP(1-440) sample stability is
important for X-ray crystallization experiments. The SDS PAGE gel used to assess the long-term
stability of hSTRAP, shown in Figure 3.31, indicated that hSTRAP was not stable in either of
the trialled buffers, since many degradation products were detected in the concentrated samples
in both buffer 1 (Lane 2) and buffer 2 (Lane 3).
Figure 3.31. Long term stability of concentrated His-hSTRAP(1-440) protein. A 10% SDS PAGE
gel indicating the long-term stability of His-hSTRAP(1-440) protein (highlighted by an arrow)
solubilised in buffer 1 (lane 2) and buffer 2 (lane 3) after four weeks incubation at 4°C. 15 µl
of sample was diluted as indicated in lane headings and loaded in each lane. Gel was stained with
Instant Blue. Buffer 1: 50 mM Sodium Phosphate buffer, NaCl, Arginine and Glutamic acid and
10 mM β-mercaptoethanol. Buffer 2: 50 mM Sodium Phosphate buffer, 150 mM NaCl, 50 mM Arginine
and Glutamic acid and 10 mM β-mercaptoethanol.
As stability, purity and high concentration are critical factors for the crystallization of
proteins, a new proprietary protein solubilisation mixture developed by Dr Alexander
Golovanov (H-MIX) was used to overcome these obstacles. To study the effects of H-MIX
on the concentration process, all elution fractions, which already contained H-MIX and pure
His-hSTRAP(1-440) protein, were pooled together and concentrated down. The buffer was
then exchanged into low-salt Buffer 1, additionally containing H-MIX. Table 3.9 shows
that with H-MIX added to the previously investigated buffer 1, His-hSTRAP(1-440) could be
concentrated up to an estimated 18 mg/ml. The purity of the concentrated
sample also increased (Fig.3.32A) compared to the preparation without the addition of
H-MIX (Fig.3.31). Furthermore, the percentage protein loss was reduced to 28% after the
addition of H-MIX to the buffer (Table 3.9), compared to 83.4% without it (Table
3.8).
To determine the long-term stability of hSTRAP protein, samples were stored at 4°C for
one month and then analyzed by SDS PAGE. His-hSTRAP(1-440)
protein was stable in these buffer conditions (Fig.3.32B), which was a significant
improvement compared to the poor stability of His-hSTRAP protein in the same buffer in
the absence of H-MIX (Fig.3.31). The protein concentration, purity and stability in
H-MIX-containing buffer were now sufficient to proceed to crystallization trials.
Table 3.9. Estimated concentration of His-hSTRAP(1-440) in Buffer 1 + H-MIX.
Protein sample | OD595 | Vol Measurement (µl) | Concentration (mg/ml) | Volume (µl) | Percentage protein loss (%)
Elution 1 | 0.095 | 5 | 0.29 | 3000 | -
Elution 2 | 0.043 | 5 | 0.129 | 2000 | -
Pooled elutions 1 and 2 | 0.07 | 5 | 0.21 | 5000 | -
Concentrated protein 1 | 0.17 | 5 | 0.5 | 2500 | 0
Concentrated protein 2 | 0.607 | 0.5 | 18 | 50 | 28
The concentration of His-hSTRAP(1-440) protein in various protein samples taken during the
concentration of His-hSTRAP(1-440) in the presence of H-MIX+Buffer 1.
Figure 3.32. Concentration and stability of His-hSTRAP(1-440) protein in Buffer 1 +
H-MIX. 10% SDS PAGE of A: His-hSTRAP(1-440) (indicated by an arrow) elutions in Buffer 1
+ H-MIX, before (lanes 2 and 3) and after (lanes 4 and 5) concentration. B: Concentrated 18 mg/ml
His-hSTRAP(1-440) after storage at 4°C in buffer 1 + H-MIX, demonstrating sample stability. (A)
Lanes: 1, Molecular marker; 2, 15 µl Eluted Protein 1 (1:2 dilution); 3, 15 µl Eluted Protein 2 (1:2
dilution); 4, 15 µl of the 2.5 ml of pooled elutions (1+2) concentrate (1:3 dilution); 5, 50 µl of 18
mg/ml concentrated protein sample (1:40 dilution). (B) Lanes: 1, Molecular marker; 2, 50 µl of 18
mg/ml concentrated protein sample (1:50 dilution). Both gels were stained with Instant Blue.
Before proceeding to crystallography trials, the concentrated His-hSTRAP(1-440) sample was
analyzed by gel filtration to confirm the presence of pure His-hSTRAP(1-440) protein. For
this, a 1:100 dilution of the concentrated His-hSTRAP(1-440) protein sample was injected
onto a gel filtration column (Superdex75). One sharp peak was observed on the gel filtration
chromatogram at 144 ml (Fig.3.33A), which according to the Superdex75 calibration curve
(Fig.3.33B) corresponds to a protein of 50 kDa in size, indicating that crystallography trials
could be carried out with this sample.
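Reading a molecular weight off a gel filtration calibration curve usually amounts to fitting a straight line of log10(molecular weight) against elution volume for a set of standards, and then evaluating that line at the elution volume of the sample. The sketch below illustrates the arithmetic only. The standards are synthetic points placed exactly on a made-up line so that the code can be checked; they are not the Superdex75 calibration data of Fig.3.33B, and the function names are hypothetical.

```python
import math


def fit_calibration(standards):
    """Least-squares fit of log10(MW) = a + b * V.

    standards: list of (elution_volume_ml, molecular_weight_kda) pairs.
    Returns the intercept a and slope b.
    """
    n = len(standards)
    xs = [volume for volume, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_mw_kda(volume_ml, intercept, slope):
    """Molecular weight (kDa) predicted for a given elution volume."""
    return 10 ** (intercept + slope * volume_ml)


# Synthetic standards on an arbitrary line, purely to exercise the code.
TRUE_A, TRUE_B = 5.0, -0.025
standards = [(v, 10 ** (TRUE_A + TRUE_B * v)) for v in (100.0, 120.0, 140.0, 160.0)]

a, b = fit_calibration(standards)
print(round(a, 3), round(b, 3))  # recovers 5.0 and -0.025
print(estimate_mw_kda(130.0, a, b))
```

With real data, the fit would be made to the measured elution volumes of the calibration proteins, and the estimate is only meaningful within the fractionation range of the column.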
[Figure 3.33, panels (A) and (B); x-axis: Volume (ml).]
Figure 3.33. Gel filtration graph of concentrated His-hSTRAP(1-440) protein sample.
A: Superdex75 chromatogram (OD280) obtained after injection of a 1:100 dilution of concentrated
hSTRAP protein sample. B: Superdex75 calibration curve.
Following the establishment of a protocol for production of stable, pure and homogeneous
His-hSTRAP(1-440) protein, with a sample concentration exceeding 10 mg/ml,
crystallography trials were initiated. The first trial was carried out in the JCSG+ screen
with a 12.9 mg/ml His-hSTRAP(1-440) protein sample. Half of the concentrated protein
sample was supplemented with small suspended graphite flakes, hypothesized to act as a
nucleating platform for the protein (Dr A. Golovanov, personal communication).
From these initial trials it was concluded that the concentration of His-hSTRAP(1-440) was
too low, as more than 45% of the wells were clear after 3 months (Fig.3.34). However,
despite the low concentration of His-hSTRAP protein, spherulites were observed in 4 of the
wells, both with and without graphite particles, as shown in Figure 3.34. Since
the protein concentration was judged too low, the same trial was repeated
with a higher protein concentration (Fig.3.35).
Figure 3.34. Crystallisation trials with 12.9 mg/ml His-hSTRAP(1-440) sample in
JCSG+ screen. B6 contained 0.1 M Phosphate-Citrate Buffer, 40% (v/v) Ethanol and 5% (w/v)
PEG 1000. B11 contained 1.6 M Tri-Sodium Citrate Buffer. A: Well B6 with Graphite; B: Well B6
without Graphite; C: Well B11 with Graphite; D: Well B11 without Graphite. Spherulites are
labelled in the images.
The next crystallization trial was carried out with an 18 mg/ml His-hSTRAP(1-440) protein
sample using the JCSG+ trial screen. Due to the higher His-hSTRAP(1-440) protein
concentration, only 20% of the wells were clear after a 3 month incubation period, compared
to 45% previously, and more wells contained spherulites than in the previous trial
(Fig.3.35). After three weeks, a large spherulite or a possible micro-crystal was
obtained in well B6 (Fig.3.35A). The screen condition of well B6 was: 0.1 M Phosphate-Citrate
buffer, 40% (v/v) Ethanol and 5% (w/v) PEG 1000. This well had the largest
spherulite observed in trials so far, therefore this condition was used as a basis for more
detailed screens in which the concentrations of the components present were varied, to investigate
their effect on crystal quality. Spherulites were also obtained in B11 (Fig.3.35E and
3.35F), which contained 1.6 M Tri-sodium Citrate; hence the conditions for B6 and B11
were combined for the subsequent screen. The next trial that was carried out therefore
contained 0.1 M Tri-Sodium Citrate and 5% (w/v) PEG 1000, with varying ethanol
concentrations. Spherulites were consistently obtained in the well B6 and B11 trials with or
without graphite particles (Fig.3.34 and Fig.3.35). The spherulites obtained in these
preliminary trials could not be used for further structural experiments, as they were too
small for a diffraction experiment on a Rotating Anode source.
“Hairball”-like structures, perhaps arising from multiple nucleations, were seen in well A1, both
with graphite (Fig.3.35C) and without (Fig.3.35D). Whisker-like crystals were
observed in well G1 with graphite (Fig.3.35G), but unfortunately the well without graphite
had dried out (Fig.3.35H), so reproducibility could not be determined.
Figure 3.35. Crystallisation trials with 18 mg/ml His-hSTRAP(1-440) sample in the
JCSG+ screen. Well B6 contained 0.1 M Phosphate-Citrate buffer, 40% (v/v) Ethanol and 5%
(w/v) PEG 1000, and well A1 contained 2 M Lithium Sulfate, 0.1 M Sodium acetate pH 4.5, and
50% (w/v) PEG 400. Well B11 contained 1.6 M Tri-Sodium Citrate, and G1 contained 0.1 M
HEPES pH 7 and 30% (v/v) Jeffamine ED-2001 pH 7. A: Well B6 with graphite; B: Well B6
without graphite; C: Well A1 with graphite; D: Well A1 without graphite; E: Well B11 with
Graphite; F: Well B11 without Graphite; G: Well G1 with graphite; H: Well G1 without graphite.
Labels in the images: spherulites/microcrystal; “hairball”-like structure; whisker crystal.
The third crystallization trial attempted to optimize the conditions which produced
spherulites in earlier experiments (Fig.3.35), by using 0.1 M Tri-Sodium Citrate pH 4.2 and
5% (w/v) PEG 1000 with ethanol concentrations of 60%, 50%, 40%, 30%
and 20% (v/v), with a 20 mg/ml His-hSTRAP(1-440) protein sample. Spherulites were
again observed in this trial (Fig.3.36) and the largest one was found at 20% (v/v) ethanol
(Fig.3.36J), although in the previous trial it was found in 40% (v/v) ethanol (Fig.3.35A).
Observation from the side of the well showed the spherulite positioned at the edge of the
drop (Fig.3.36K), and as mentioned previously this could be either salt or protein. The
ultimate test for this would be to obtain a crystal of high enough quality and analyze the
diffraction pattern. This could not be done with this spherulite as it was too small for a
diffraction experiment on the Rotating Anode source available at the time.
Figure 3.36. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440). All the wells
contain 0.1 M Tri-Sodium Citrate, 5% (w/v) PEG1000 and varying ethanol concentrations. A:
60%(v/v) Ethanol with Graphite; B: 60%(v/v) Ethanol without Graphite; C: 50%(v/v) Ethanol with
Graphite; D: 50%(v/v) Ethanol without Graphite; E: 40%(v/v) Ethanol with Graphite; F: 40%(v/v)
Ethanol without Graphite; G: 30%(v/v) Ethanol with Graphite; H: 30%(v/v) Ethanol without
Graphite; I: 20%(v/v) Ethanol with Graphite; J: 20%(v/v) Ethanol without Graphite; K: Zoomed
in picture of side of well containing 20%(v/v) Ethanol without Graphite.
The next trial investigated the effect of pH on crystal grade. The pH range tested was
3.8-4.8 in increments of 0.2; pH 5.2, 6.2 and 7.2 were also checked. Many spheralites were
obtained in wells at pH 4.0-4.6, both with and without graphite (Fig.3.37C-J). However,
after a period of 6 weeks a large, tablet-shaped, crystal-like object immersed in precipitate
was obtained at pH 4.6 without graphite (Fig.3.37J). Under polarized light it changed the
plane of polarization (although it was in a plastic tray), and it measured around
150 x 40 x 40 µm3 as assessed with the microscope graticule. Owing to the large amount of
precipitate in the well the crystal was difficult to extract, but it was mounted on the X-ray
beam; however, no single-crystal diffraction pattern was obtained (Fig.3.38). The buffer in
which the crystal had formed had poor cryoprotectant properties, as ice rings formed when
the crystal was cooled in liquid nitrogen (Fig.3.38).
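The layout of this pH screen can be written out explicitly. The following is a minimal Python sketch, not the procedure used in this work: it enumerates the pH values named above (3.8-4.8 in steps of 0.2, plus 5.2, 6.2 and 7.2), each with and without graphite, at the fixed precipitant composition given in the legend of Figure 3.37. The function name and the dictionary keys are illustrative assumptions.

```python
from itertools import product


def ph_screen_conditions():
    """Enumerate the wells of the pH optimisation screen described in the text."""
    # pH 3.8-4.8 in 0.2 increments; integer steps avoid floating-point drift.
    fine_range = [x / 10 for x in range(38, 49, 2)]
    extra_values = [5.2, 6.2, 7.2]
    wells = []
    for ph, graphite in product(fine_range + extra_values, (True, False)):
        wells.append({
            "citrate_M": 0.1,        # Tri-Sodium Citrate (Figure 3.37 legend)
            "ethanol_pct_vv": 20,    # fixed at 20% (v/v) in this trial
            "peg1000_pct_wv": 5,     # fixed at 5% (w/v)
            "pH": ph,
            "graphite": graphite,
        })
    return wells


wells = ph_screen_conditions()
print(len(wells), "wells")
for well in wells:
    label = "with graphite" if well["graphite"] else "without graphite"
    print(f"pH {well['pH']:.1f} {label}")
```

Nine pH values, each with and without graphite, give 18 wells, which agrees with the 18 panels (A to R) listed for Figure 3.37.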
[Figure 3.37 image annotations: spheralite; tablet-like crystal]
Figure 3.37. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample
to observe effects of pH on crystal grade. All the wells contained 0.1 M Tri-Sodium Citrate,
20% (v/v) ethanol and 5% (w/v) PEG1000, but pH units were varied. A: 20%(v/v) Ethanol with
Graphite pH 3.8; B: 20%(v/v) Ethanol without Graphite pH 3.8; C: 20%(v/v) Ethanol with
Graphite pH 4.0; D: 20%(v/v) Ethanol without Graphite pH 4.0; E: 20%(v/v) Ethanol with
Graphite pH 4.2; F: 20%(v/v) Ethanol without Graphite pH 4.2; G: 20%(v/v) Ethanol with
Graphite pH 4.4; H: 20%(v/v) Ethanol without Graphite pH 4.4; I: 20%(v/v) Ethanol with Graphite
pH 4.6; J: 20%(v/v) Ethanol without Graphite pH 4.6; K: 20%(v/v) Ethanol with Graphite pH 4.8;
L: 20%(v/v) Ethanol without Graphite pH 4.8; M: 20%(v/v) Ethanol without Graphite pH 5.2; N:
20%(v/v) Ethanol with Graphite pH 5.2; O: 20%(v/v) Ethanol without Graphite pH 6.2; P:
20%(v/v) Ethanol with Graphite pH 6.2; Q: 20%(v/v) Ethanol without Graphite pH 7.2; R:
20%(v/v) Ethanol without Graphite pH 7.2
Figure 3.38. His-hSTRAP(1-440) Diffraction Pattern. Diffraction pattern of the crystal
obtained in 20% (v/v) ethanol, 0.1 M Tri-Sodium Citrate and 5% (w/v) PEG1000 at pH 4.6.
The sample buffer was not a good cryoprotectant, as many ice rings were observed. The
experiment was tailored for protein diffraction (the backstop was pulled out to observe
low-resolution diffraction and the rotation width was small). Bruker Microstar rotating
anode source with CCD detector.
The next set of trials tested whether lowering the ethanol concentration in the sample
buffer further would improve crystal grade; the ethanol concentrations tested were 15%, 10%
and 5% (v/v). Spheralites and rod-shaped needles were obtained at 10% ethanol (Fig.3.39C);
however, this trial did not improve the crystal grade, suggesting that these were not the
optimum conditions for crystal growth.
[Figure 3.39 image annotations: spheralite(s)]
Figure 3.39. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample
to observe effects of further lowering ethanol concentration on crystal grade. All the
wells contained 0.1 M Tri-Sodium Citrate, 5% (w/v) PEG1000 at pH 4.6 but with varying ethanol
concentrations. A: 20%(v/v) Ethanol; B: 15%(v/v) Ethanol; C: 10%(v/v) Ethanol; D: 5%(v/v)
Ethanol.
As the JCSG+ based trials did not yield crystals suitable for protein structure
determination, a number of different kits were tested, namely PACT, Clear Strategy I,
Clear Strategy II and Morpheus. These screens were not successful: mainly clear drops were
obtained, and no spheralites or crystals were detected (well images not shown).
Spheralites were consistently obtained in wells both with and without graphite, and the
addition of graphite made no significant difference to spheralite quality or to the
appearance of the wells (Fig.3.34-3.37 and Fig.3.39). This suggested that as-prepared
graphite flakes do not offer a suitable nucleating platform, disproving the initial
hypothesis.
Despite screening 500 different conditions, a crystal of high enough grade for structural
studies was not obtained. The difficulty in crystallizing this protein could potentially be
explained by structural heterogeneity, caused, for example, by the presence of
intrinsically unstructured regions in this polypeptide, or by conformational plasticity. At
this point, however, it was concluded that priority should be given to solving the
structures of shorter fragments of hSTRAP, which may be more structurally stable.
3.3.2 Biophysical and structural studies carried out on GST-hSTRAP(1-440)
3.3.2.1 Circular Dichroism on GST-hSTRAP(1-440)
CD was carried out with pure GST-hSTRAP(1-440) following the same procedure as that
described for His-hSTRAP(1-440). A scan was taken at 4°C, before the variable temperature
experiment, to determine the folding state of GST-hSTRAP(1-440). The resulting CD spectrum
was unusual (Figure 3.40) and difficult to interpret, possibly because the presence of GST
made this polypeptide too large for CD studies. As a consequence, variable temperature
experiments were not initiated, nor could the percentage of each secondary structure
element be estimated.
Figure 3.40. CD experiments carried out on GST-hSTRAP(1-440). CD spectrum of
GST-hSTRAP(1-440) recorded at 4°C. This was repeated three times and the same result was
obtained as shown in this figure. Abbreviations: MRME, Mean residue molar ellipticity.
3.3.2.2 GST tag cleavage
The next set of experiments was done to determine whether crystallography could be carried
out on hSTRAP(1-440) obtained from a GST-tagged construct. For this, the 26 kDa GST tag had
to be cleaved as described in Materials and Methods section 2.14. Cleavage on the affinity
resin was successful, as after incubation with PreScission protease two protein bands were
detected bound to the resin at 50 kDa and 26 kDa, corresponding to hSTRAP(1-440) and the
GST tag respectively (Fig.3.41A, Lane 4). However, no protein was detected in the elution
fractions collected (Fig.3.41A, Lanes 10-12). Furthermore, the post-elution resin sample
indicated that hSTRAP(1-440) and the GST tag (Fig.3.41A, Lane 4) were still bound to the
resin. For this reason, the concentration of reduced glutathione in the elution buffer was
increased from 10 to 20 mM. Samples of resin after elution with the new elution buffer were
analyzed by SDS PAGE (Fig.3.41B, Lane 4), which indicated that hSTRAP(1-440) and GST were
still bound to the resin, and no protein was detected in the elution fractions
(Fig.3.41B, Lanes 6-7). The glutathione concentration was therefore increased drastically
to 200 mM; however, both GST and hSTRAP protein remained bound to the resin
(Fig.3.41B, Lane 5). Again, no protein was detected on the gel in any elution
(Fig.3.41B, Lanes 8-9), suggesting that the hSTRAP protein cannot be eluted from the resin
once cleaved from the GST tag, possibly because the protein precipitates after cleavage.
Figure 3.41. On-column GST tag cleavage of GST-hSTRAP(1-440). 10% SDS PAGE
gels showing A: Pure GST-hSTRAP(1-440) protein is bound to the resin before cleavage (Lane 3)
but GST tag and hSTRAP(1-440) are still bound to affinity resin after cleavage (Lane 4) and no
protein is detected in elutions (Lanes 10-12). B: Even after increasing glutathione concentration to
20 mM (Lane 4) and 200 mM (Lane 5), hSTRAP(1-440) and GST are still bound to the resin and no
protein is detected in elutions either (Lanes 6-9). Gels were loaded with 15 µl of sample in each
lane and both gels were stained with Instant Blue.
The next set of experiments was performed to determine whether GST cleavage could be
carried out in solution. However, precipitation was seen once the protease was added during
the cleavage experiment (gel not shown), so cleavage of GST was not possible in solution
either. Therefore hSTRAP could not be obtained by cleavage from its GST-tagged form for
crystallographic or any other studies.
3.3.3 Biophysical and structural studies carried out on hSTRAP(1-219)
3.3.3.1 Circular Dichroism of hSTRAP(1-219)
Dialyzed hSTRAP(1-219) protein was analyzed by SDS PAGE to check its purity before
proceeding to Circular Dichroism experiments. SDS PAGE gel analysis confirmed that
dialyzed hSTRAP(1-219) protein was pure (Fig.3.42, Lane 3). The concentration of protein
used for this CD experiment was 11.4 µM.
Figure 3.42. Dialysed hSTRAP(1-219) protein in CD buffer. 15% SDS PAGE gel indicating
pure hSTRAP(1-219) protein. Fifteen microlitres of sample was loaded in each lane and the
gel was stained with Instant Blue.
An initial CD scan taken at 4°C to determine hSTRAP(1-219) folding state showed that
hSTRAP(1-219) was folded and composed of various secondary structure elements
(Fig.3.43A). Mean residue molar ellipticity values for this protein were then fed into the
software Dichroweb to determine the percentage of each secondary structure element. This
showed that hSTRAP(1-219) is 67% α-helical, 8.6% β, 11.6% turn and 12.8% is
disordered. The thermal stability experiment showed that hSTRAP(1-219) does not display
clear cooperative unfolding transition across temperatures 4 to 80°C at 220 nm
(Fig.3.43B), suggesting that the protein may exist in a molten globule state. The scan taken
at 4°C after the variable temperature experiment had been completed showed that
hSTRAP(1-219) was still folded (Fig.3.43C). Furthermore, the two scans taken at 4°C
before and after variable temperature experiment can be completely superimposed on top
of each other, and no significant difference in spectra was detected (Fig.3.43D). These
experiments were repeated 3 times and the same results were obtained. These results
suggested that the protein is folded on a secondary structure level; however the absence of
distinct temperature unfolding transition suggests that this protein construct exists in a
molten globule state.
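For readers unfamiliar with the units, the conversion from a raw CD signal to mean residue ellipticity can be sketched as below. This is an illustrative Python example and not the processing used in this work: the 11.4 µM concentration comes from the text, whereas the signal value, the 0.1 cm path length and the residue count of 219 (which ignores any tag residues) are assumptions made only for the example. Normalising by the number of peptide bonds (N - 1) is one common convention; some software divides by the number of residues instead.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity.

    Result is in deg cm^2 dmol^-1, normalised by the number of peptide bonds.
    """
    n_bonds = n_residues - 1
    return theta_mdeg / (10.0 * path_cm * conc_molar * n_bonds)


# Hypothetical reading: -20 mdeg and a 0.1 cm cell are assumed values;
# 11.4 µM is the concentration quoted in the text.
mre = mean_residue_ellipticity(
    theta_mdeg=-20.0, conc_molar=11.4e-6, path_cm=0.1, n_residues=219
)
print(f"{mre:.0f} deg cm^2 dmol^-1")
```

Values normalised in this way are independent of concentration and chain length, which is what allows spectra of different constructs to be compared and submitted to deconvolution software.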
Figure 3.43. CD experiments carried out on hSTRAP(1-219). A: CD spectrum recorded at
4°C, before the variable temperature experiment was carried out. B: CD spectrum carried out at
variable temperatures ranging from 4 to 80°C taken at a fixed wavelength of 220 nm. C: CD
spectrum recorded at 4°C, after the variable temperature experiment was carried out. D:
Superimposition of A (Highlighted in blue) and C spectra (Highlighted in green). One
representative experiment is shown out of three repeats. Abbreviations: MRME, Mean residue
molar ellipticity;
3.3.3.2 NMR studies of hSTRAP(1-219)
For NMR studies, large quantities of pure hSTRAP(1-219), at concentrations in the order of
tens of mg/ml, were needed. For that reason, a buffer optimal for protein stability and
solubility, into which hSTRAP(1-219) could be dialyzed, had to be determined. Once dialyzed,
the protein had to be concentrated to the highest concentration possible and then injected
onto a gel filtration column. SDS PAGE analysis of the dialyzed sample (in 20 mM Sodium
phosphate buffer, 150 mM NaCl, 50 mM Arginine and Glutamic acid and 10 mM
β-mercaptoethanol) before and after concentration is shown in Figure 3.44A. The
concentration of hSTRAP(1-219) achieved was not sufficient, as the protein could not be
concentrated beyond 2 mg/ml (Table 3.10) and, furthermore, it precipitated in solution. An
estimated 76.1% loss of hSTRAP(1-219) protein was detected in this buffer (Table 3.10), as
well as many contaminants (Fig.3.44A, lane 3). To carry out structural studies, the
hSTRAP(1-219) protein sample has to be stable for weeks, so the concentrated sample was
analyzed by SDS PAGE after storage for a month at 4°C to assess its long-term stability.
This showed that hSTRAP(1-219) was unstable after long-term storage at 4°C in this buffer
(Fig.3.44B, Lane 2).
Figure 3.44. hSTRAP(1-219) protein stability in gel filtration buffer. 15% SDS PAGE
gels showing- A: Dialysed hSTRAP(1-219) (indicated by an arrow) before (lane 2) and after (lane
3) concentration in gel filtration buffer. B: hSTRAP(1-219) after one month storage at 4°C in gel
filtration buffer (Lane 2). All lanes were loaded with 15 µl of sample and both gels were stained
with Instant Blue.
Table 3.10. Estimated concentrations of hSTRAP(1-219) in gel filtration buffer.
Sample                                OD595   Concentration (mg/ml)   Vol (µl)   Protein loss (%)
Dialysed hSTRAP(1-219) Protein        0.070   0.21                    10 000     -
Concentrated hSTRAP(1-219) Protein    0.675   2.01                    250        76.1
Estimated concentration (as measured by Bradford reagent) of dialyzed hSTRAP(1-219) before and
after concentration via the Amicon and Viva-spin columns.
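The protein loss in Table 3.10 can be reproduced from the tabulated concentrations and volumes. The short Python sketch below shows the arithmetic; the formula (mass before minus mass after, relative to mass before) is assumed here rather than stated in the text, and the function names are illustrative.

```python
def protein_mass_mg(conc_mg_per_ml, volume_ul):
    """Total protein (mg) from a concentration in mg/ml and a volume in µl."""
    return conc_mg_per_ml * volume_ul / 1000.0


def percent_loss(conc_before, vol_before_ul, conc_after, vol_after_ul):
    """Percentage of protein mass lost between two measurements."""
    before = protein_mass_mg(conc_before, vol_before_ul)
    after = protein_mass_mg(conc_after, vol_after_ul)
    return 100.0 * (before - after) / before


# Values from Table 3.10: 0.21 mg/ml in 10 000 µl before concentration,
# 2.01 mg/ml in 250 µl afterwards.
loss = percent_loss(0.21, 10_000, 2.01, 250)
print(f"{loss:.1f}% protein loss")
```

With these inputs the sample holds 2.1 mg before and about 0.5 mg after concentration, which gives the 76.1% loss reported in the table.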
As shown in Section 3.1, H-MIX improved hSTRAP(1-219) stability, so the effect of adding it
to the gel filtration buffer was investigated. Buffer exchange and concentration in H-MIX
were carried out in the Amicon and Viva-spin columns as described in Materials and Methods
section 2.11. The concentration of hSTRAP(1-219) was estimated from Bradford measurements
taken at OD595, which showed that with the addition of H-MIX to gel filtration buffer,
hSTRAP(1-219) can be concentrated to an estimated 7 mg/ml in 20 µl (Table 3.11); it
precipitates at 8 mg/ml. For NMR experiments, 500 µl at this concentration would initially
be needed to determine the tertiary folding state. SDS PAGE analysis of hSTRAP(1-219)
samples before and after concentration showed that approximately 95% pure hSTRAP(1-219)
protein was present in the concentrated samples (Fig.3.45A). Furthermore, these experiments
established that hSTRAP(1-219) can be concentrated at pH 8 and, intriguingly, at pH 5
(Table 3.11). Given that the pI of this protein is 5.19, this suggests that H-MIX
facilitates solubilization and stabilization even at pH values close to the pI of the
protein, where its solubility is expected to be at its minimum. With the addition of H-MIX
to the buffer, the loss of protein decreased from 76.1% (Table 3.10) to 29% (Table 3.11).
The percentage loss of protein was marginally lower at pH 5 (29%) than at pH 8 (32%)
(Table 3.11).
Concentrated hSTRAP(1-219) was analyzed by SDS PAGE to investigate the long-term
stability of hSTRAP(1-219) in the optimized NMR buffer. This confirmed that concentrated
hSTRAP(1-219) protein was relatively stable during a month of incubation at 4°C in this
optimized buffer at both pH 5 and pH 8 (Fig.3.45B). Although some degradation products were
detected between 15 and 11 kDa in both concentrated samples (Fig.3.45B, Lanes 3 and 5),
sample stability had improved significantly compared with the sample without H-MIX
(Fig.3.44B). All of these results suggest that, in principle, high protein concentrations
can be achieved and structural studies can be carried out on hSTRAP(1-219).
Figure 3.45. Concentration and stability of hSTRAP(1-219) in NMR buffer. These 15%
SDS PAGE gels show- A: hSTRAP(1-219) (indicated by an arrow) samples before and after
concentration in NMR buffer (gel filtration buffer + H-MIX), pH 5 (Lanes 2 and 3), and pH 8
(Lanes 4 and 5). B: Long term stability of hSTRAP(1-219) in NMR buffer (gel filtration buffer +
H-MIX) at both pH 5 (Lanes 2-3) and pH 8 (Lanes 4-5). All lanes were loaded with 15 µl of sample,
which were diluted as indicated in lane headings. Both gels were stained with Instant Blue.
Table 3.11. Estimated concentration of hSTRAP(1-219) protein samples in NMR buffer
Sample                                          OD595   Concentration (mg/ml)   Vol (µl)   Protein loss (%)
Eluted hSTRAP(1-219) Protein 1                  0.200   0.20                    1000       -
Concentrated hSTRAP(1-219) Protein 1 (pH 5)     0.475   7.13                    20         29
Concentrated hSTRAP(1-219) Protein 2 (pH 8)     0.455   6.83                    20         32
Estimated concentration of hSTRAP(1-219) before and after concentration in NMR buffer (gel
filtration buffer + H-MIX), pH 5 and pH 8, were determined through Bradford reagent as described
in Materials and Methods.
A 1D 1H NMR spectrum was recorded to check the folding state of hSTRAP(1-219) in this
optimized buffer (gel filtration buffer + H-MIX) before proceeding to produce the protein
in 15N-labelled form. The 1D 1H NMR spectrum of hSTRAP(1-219) recorded at 30°C shows no
evidence for unique tertiary structure (Fig.3.46). No shifted methyl resonances, which
usually signal the presence of defined structure, were observed around 0 ppm, and the peaks
are too broad for a protein of this size. The signal broadening could alternatively be
explained by protein aggregation, although this was expected to be strongly suppressed by
the presence of Arginine, Glutamic acid [152] and H-MIX in the buffer. The low intensities
of the NMR signals could be a result of conformational exchange in the absence of unique 3D
structure, as well as some aggregation (Fig.3.46). The initial NMR and CD spectra were thus
consistent with a possible lack of unique tertiary structure. However, further NMR
experiments were carried out to explore whether a change in solution conditions could
result in correctly folded hSTRAP(1-219) protein.
Figure 3.46. 1D NMR spectrum of 0.2 mM hSTRAP(1-219). Concentrated hSTRAP(1-219)
sample at 30°C in optimized NMR buffer (H-MIX + gel filtration buffer), pH 8. The large
signals, which are clipped, originate from H-MIX. The spectrum indicates that
this protein does not possess a unique folded structure.
A 15N-hSTRAP(1-219) protein sample was then produced to investigate the folding state of
this construct by observing signals in the fingerprint amide region. Optimum conditions for
15N-hSTRAP(1-219) protein expression were determined through expression trials, and were
identified as BL21(DE3)pLysS cells induced with 0.1 mM IPTG at an OD600 of 0.5, followed
by 3 hrs incubation at 30°C in minimal media supplemented with 15N-ammonium chloride.
Samples representative of total, soluble and insoluble expression under these conditions
were analyzed by SDS PAGE (Fig.3.47). 15N-hSTRAP(1-219) protein was expressed over time
(Fig.3.47A); however, it was not detected in samples representative of either the soluble
or the insoluble fraction under these growth conditions (Fig.3.47B). This had been observed
previously with unlabelled hSTRAP(1-219) (see Results Section 3.1), as hSTRAP(1-219)
was precipitating in the soluble fraction. This could be due to the lysis method, as the
methodologies of small- and large-scale protein purification differ.
Figure 3.47. Expression of 15N labelled hSTRAP(1-219). These 15% SDS PAGE gels
show A: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(1-219) transformed cell lysate 1 hour
(lane 3), 2 hours (lane 4) and 3 hours (lane 5) post IPTG induction. B: Soluble (lanes 2-5) and
insoluble fractions (lanes 6-9) of BL21(DE3)pLysS cells transformed with
pET-14b-His-hSTRAP(1-219) after 1 hour (lanes 3 and 7), 2 hours (lanes 4 and 8) and 3 hours
(lanes 5 and 9) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2
and 6 are pre-induction fractions. All lanes were loaded with 15 µl of sample and both gels
were stained with Instant Blue.
Purification of 15N-hSTRAP(1-219) was performed in the presence of H-MIX within the
elution buffers, as this was found to be critical for hSTRAP(1-219) protein stability. The
initial purification used the same conditions as for the purification of unlabelled
hSTRAP(1-219) protein: 50 mM Tris, pH 8.7, 300 mM NaCl, 50 mM Arginine and Glutamic acid,
and H-MIX. This yielded pure 15N-hSTRAP(1-219) protein bound to the His tag affinity resin
(Fig.3.48A, Lane 3); however, very little protein eluted off the resin with 200 mM
imidazole in the elution buffer (Fig.3.48A, Lanes 5-7). A high quantity of
15N-hSTRAP(1-219) was still bound to the His tag affinity resin even after elutions were
taken (Fig.3.48A, Lane 4). The imidazole concentration was increased to 400 mM with a view
to eluting 15N-hSTRAP(1-219) protein off the resin; however, this concentration of
imidazole was still not sufficient to elute the protein (Fig.3.48B). The imidazole
concentration in the elution buffer was increased again to 600 mM and very low quantities
of 15N-hSTRAP(1-219) protein were detected in the elutions (Fig.3.48C, Lanes 3-5), although
the majority of 15N-hSTRAP(1-219) was still bound to the His tag affinity resin
(Fig.3.48C, Lane 6). The imidazole concentration in the elution buffer was increased to
800 mM and again no 15N-hSTRAP(1-219) protein was detected in elutions
(Fig.3.48D, Lanes 3-5), and 15N-hSTRAP(1-219) protein was still bound to the resin after
elutions were taken (Fig.3.48D, Lane 6). The imidazole concentration was increased to 1 M,
which is the highest concentration of imidazole that can be used to release the protein
from a His-tag affinity resin. Again no 15N-hSTRAP(1-219) protein was detected in elutions
(Fig.3.48E, Lanes 3-5) and 15N-hSTRAP(1-219) protein was still bound to the resin
(Fig.3.48E, Lane 6). These results suggested that 15N-hSTRAP(1-219) protein had
precipitated, and hence was not eluting, even with 1 M imidazole.
The next 15N-hSTRAP(1-219) protein purification was carried out with H-MIX and 300 mM
imidazole in the elution buffer. This time pure 15N-hSTRAP(1-219) protein was detected in
elutions (Fig.3.48F). The estimated concentration of 15N-hSTRAP(1-219) protein in elutions
was however low, and the total quantity estimated from 1 litre of E. coli growth culture
was 0.9 mg (Table 3.12).
Figure 3.48. Purification of 15N labelled hSTRAP(1-219). All elution buffers contained
H-MIX. 15% SDS PAGE gels showing 15N-hSTRAP(1-219) purification at pH 8.7 with A: 200 mM
Imidazole; B: 400 mM Imidazole; C: 600 mM Imidazole; D: 800 mM Imidazole; E: 1 M Imidazole;
F: H-MIX and 300 mM Imidazole only. All lanes were loaded with 15 µl of sample and all gels
were stained with Instant Blue.
Table 3.12. Estimated concentration of 15N-hSTRAP(1-219).
Sample      OD595   Concentration (mg/ml)   Vol elution (ml)   Protein quantity (mg)
Elution 1   0.150   0.150                   1.5                0.225
Elution 2   0.143   0.143                   1.5                0.215
Elution 3   0.136   0.136                   1.5                0.204
Elution 4   0.112   0.112                   1.5                0.168
Elution 5   0.060   0.060                   1.5                0.09
Estimated concentration of 15N-hSTRAP(1-219) in elutions (measured with Bradford reagent)
obtained through protein purification in H-MIX and 300 mM imidazole only (Elutions shown in
Figure 3.48F).
H-MIX pH 8 was not found to be the optimal buffer condition as protein precipitated in
solution almost instantly during the concentration process and many contaminants were
detected in the concentrated sample (Fig.3.49). We could not therefore obtain sufficient
amounts of soluble 15N-hSTRAP(1-219) for detailed NMR experiments.
Figure 3.49. Concentration of 15N-hSTRAP(1-219) in H-MIX, pH 8. 15% SDS PAGE gel
showing eluted 15N-hSTRAP(1-219) protein 1 and 2 (lanes 2 and 3) and concentrated
15N-hSTRAP(1-219) (lane 4) in H-MIX only, pH 8. All lanes were loaded with 15 µl of sample
and the gel was stained with Instant Blue.
CD and NMR experiments carried out on hSTRAP(1-219) showed that this protein does
not have a unique tertiary structure (Fig.3.46) although it possesses secondary structure
(Fig.3.43). In addition, there was no clear cooperative unfolding observed as the
temperature was raised, suggesting that this construct may exist in molten globule state
(Fig.3.43B). This would explain the difficulties with expressing and purifying this protein,
its proteolytic instability and tendency to aggregate and precipitate. The molten-globule
behavior of this construct was possibly a consequence of the truncation, which perturbs the
native structure.
3.3.4 Biophysical and structural studies carried out on hSTRAP(1-150)
3.3.4.1 Circular Dichroism of hSTRAP(1-150)
Pure hSTRAP(1-150) protein was dialysed into CD buffer (20 mM Sodium Phosphate
buffer, pH 6.5) and then analyzed by SDS PAGE to determine its purity before proceeding
to Circular Dichroism experiments. This showed that dialyzed hSTRAP(1-150) protein was
pure (Fig.3.50, Lane 2). The concentration of hSTRAP(1-150) protein used for CD
experiments was 10.2 µM.
Figure 3.50. Dialysed hSTRAP(1-150) protein in
CD buffer. Fifteen microlitres of dialyzed protein
sample was loaded on the gel indicating the presence of
pure hSTRAP(1-150) protein. Gel was stained with
Instant Blue.
CD experiments were carried out to characterize the folding state and thermal stability of
hSTRAP(1-150). An initial scan taken at 4°C showed that hSTRAP(1-150) is an alpha helical
protein (Fig.3.51A). Mean residue molar ellipticity values for this construct were then
submitted to the program Dichroweb to determine the percentage of each secondary structure
element. This showed that hSTRAP(1-150) is 71% alpha helical and 29% disordered. A variable
temperature experiment between 4 and 80°C, with detection at 220 nm, showed that
hSTRAP(1-150) starts unfolding at 10°C and is completely unfolded by 70°C (Fig.3.51B), with
a melting mid-point of approximately 40°C. This showed that hSTRAP(1-150) was unstable.
Furthermore, the scan taken at 4°C after the variable temperature experiment (Fig.3.51C),
and its comparison with the scan taken at 4°C beforehand (Fig.3.51D), confirmed that
thermal unfolding of hSTRAP(1-150) was irreversible. All of these results suggest that
hSTRAP(1-150) is thermally unstable, but may retain some helical conformation between 4 and
10°C. This was repeated 3 times and the same results were obtained as shown in Figure 3.51.
This information was then used to inform the NMR experiments, which were to be carried out
below 10°C.
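A melting mid-point such as the one quoted above can be read from a fixed-wavelength melt curve by normalising the signal and finding where it crosses the half-way point. The Python sketch below illustrates this on a synthetic curve; the data are hypothetical and are not the measurements of Figure 3.51B, and the function names are illustrative. Because unfolding of hSTRAP(1-150) was irreversible, a value obtained this way should be treated as an apparent mid-point rather than a thermodynamic melting temperature.

```python
import math


def fraction_unfolded(signal):
    """Normalise a melt curve so the first point is 0 and the last point is 1."""
    folded, unfolded = signal[0], signal[-1]
    return [(s - folded) / (unfolded - folded) for s in signal]


def apparent_midpoint(temps, signal):
    """Temperature at which the normalised signal first crosses 0.5."""
    frac = fraction_unfolded(signal)
    for i in range(1, len(frac)):
        lo, hi = frac[i - 1], frac[i]
        if lo < 0.5 <= hi:
            # Linear interpolation between the two bracketing points.
            step = temps[i] - temps[i - 1]
            return temps[i - 1] + (0.5 - lo) * step / (hi - lo)
    return None


# Synthetic, hypothetical melt curve (not thesis data): a sigmoid centred on
# 40 °C, sampled every 4 °C between 4 and 80 °C.
temps = list(range(4, 81, 4))
signal = [-20.0 + 15.0 / (1.0 + math.exp(-(t - 40.0) / 6.0)) for t in temps]

tm = apparent_midpoint(temps, signal)
print(f"apparent mid-point: {tm:.1f} °C")
```

Taking the first and last points as the folded and unfolded baselines is the simplest possible choice; fitting sloping baselines would be more robust where the pre- and post-transition regions are well defined.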
Figure 3.51. CD experiments carried out on hSTRAP(1-150). A: CD spectrum recorded at
4°C, before the variable temperature experiment was carried out. B: CD spectrum carried out at
variable temperatures ranging from 4 to 80°C taken at fixed wave length of 220 nm. C: CD
spectrum recorded at 4°C, after the variable temperature experiment was carried out. D:
Superimposition of A (Highlighted in green) and C spectra (Highlighted in blue). One
representative experiment is shown out of three repeats.
3.3.4.2 NMR studies of hSTRAP(1-150)
Expression trials were carried out to determine the optimum conditions for soluble
15N hSTRAP(1-150) production. These trials identified optimal conditions in
BL21(DE3)pLysS, induced with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs
incubation at 25°C in minimal media. Samples representative of total, soluble and
insoluble expression of 15N hSTRAP(1-150) under these conditions were analyzed on 15% SDS
PAGE gels. This showed that hSTRAP(1-150) was expressed over time (Fig.3.52A), and was
mainly found in the soluble fraction under the identified optimum conditions (Fig.3.52B).
Figure 3.52. Expression of 15N labelled hSTRAP(1-150) protein. These 15% SDS PAGE
gels show A: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(1-150) transformed cell lysate 1
hour (lane 3), 2 hours (lane 4) and 3 hours (lane 5) post IPTG induction. B: Soluble (lanes 2-5) and
insoluble fractions (lanes 6-9) of BL21(DE3)pLysS cells transformed with
pET-14b-His-hSTRAP(1-150) after 1 hour (lanes 3 and 7), 2 hours (lanes 4 and 8) and 3 hours
(lanes 5 and 9) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2
and 6 are pre-induction fractions. All lanes were loaded with 15 µl of sample and both gels
were stained with Instant Blue.
15N labelled hSTRAP(1-150) purification yielded high quantities of pure 15N hSTRAP(1-150)
protein in elutions from the metal-affinity column (Fig.3.53). The estimated protein
concentration in each elution is shown in Table 3.13, and the total quantity of
15N hSTRAP(1-150) protein obtained from 1 litre of E. coli culture was 91.6 mg (Table 3.13).
The next step was to determine optimal NMR buffer conditions to obtain correctly folded
hSTRAP(1-150) protein. For that, fractions containing pure protein (as analysed by SDS
PAGE) were pooled and dialyzed into various NMR buffers, and the protein folding state was
subsequently checked.
Figure 3.53. Purification of 15N labelled hSTRAP(1-150). 15% SDS PAGE gel showing
15N-hSTRAP(1-150) purification from BL21(DE3)pLysS at pH 8.7. Fifteen microlitres of
samples (diluted 1:40) were loaded in each lane. Gel was stained with Instant Blue.
Table 3.13. Estimated concentration of 15N hSTRAP(1-150).
Sample      OD595   Concentration (mg/ml)   Vol elution (ml)   Protein quantity (mg)
Elution 1   1.000   30                      1                  30
Elution 2   0.860   25.8                    1                  25.8
Elution 3   0.530   15.9                    1                  15.9
Elution 4   0.338   10.14                   1                  10.14
Elution 5   0.215   6.45                    1                  6.45
Elution 6   0.098   2.94                    1                  2.94
Elution 7   0.011   0.33                    1                  0.33
Estimated concentration of 15N hSTRAP(1-150) in elution fractions obtained through protein
purification (Elutions shown in Figure 3.53). These concentrations were identified through
Bradford reagent as described in Materials and Methods.
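The total yields quoted for the two 15N-labelled constructs follow directly from the per-elution values in Tables 3.12 and 3.13. The Python sketch below reproduces the sums; it assumes the elution volumes are in millilitres (the table header gives no unit, but this is the only choice consistent with the tabulated quantities), and the function name is illustrative.

```python
def total_yield_mg(concentrations_mg_per_ml, elution_volume_ml):
    """Sum the protein recovered over all elutions of equal volume."""
    return sum(c * elution_volume_ml for c in concentrations_mg_per_ml)


# Concentrations (mg/ml) from Table 3.12, 1.5 ml per elution (unit assumed).
conc_219 = [0.150, 0.143, 0.136, 0.112, 0.060]
# Concentrations (mg/ml) from Table 3.13, 1 ml per elution (unit assumed).
conc_150 = [30, 25.8, 15.9, 10.14, 6.45, 2.94, 0.33]

total_219 = total_yield_mg(conc_219, 1.5)
total_150 = total_yield_mg(conc_150, 1.0)

print(f"15N-hSTRAP(1-219): {total_219:.1f} mg per litre of culture")
print(f"15N-hSTRAP(1-150): {total_150:.1f} mg per litre of culture")
print(f"ratio: about {total_150 / total_219:.0f}-fold")
```

The sums come to 0.9 mg and 91.6 mg respectively, so under these conditions the shorter construct was recovered at roughly a hundred times the yield of hSTRAP(1-219).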
The next set of experiments was performed to determine the optimal buffer by analyzing 1D
1H and 2D 1H-15N correlation NMR spectra of 15N hSTRAP(1-150) protein in each buffer.
0.5 mM 15N hSTRAP(1-150) protein of more than 95% purity (as analyzed by SDS PAGE) was
present in the dialyzed concentrated sample used for initial NMR experiments
(Fig.3.54A, Lane 3).
The initial NMR buffer tested contained only 20 mM Sodium phosphate buffer and 150
mM NaCl, pH 6.2. It was chosen for its simplicity, and because it avoids the signals that
buffer constituents such as Arginine and Glutamic acid would contribute. An initial 1D 1H
spectrum and a 2D 1H-15N correlation HSQC experiment on 15N hSTRAP(1-150) protein in this
initial buffer showed no evidence of folded protein (Fig.3.54B and 3.55A respectively).
Arginine and Glutamic acid (50 mM each) were then added to the buffer, which was
hypothesized to improve stability and reduce aggregation [152] of 15N hSTRAP(1-150)
protein. The 2D 1H-15N correlation NMR spectrum in this buffer was indeed improved
(Fig.3.55B) over the initial spectrum
(Fig.3.55A), as more signals were detected. However, the signals were not well dispersed
and were not uniform in intensity, as would be expected for a folded protein, suggesting
structural instability or aggregation. A reducing agent (DTT) was then added to this mixture
and an HSQC was recorded again; this improved the spectrum further, as the signals were
more uniform (Fig.3.55C). The pH was then increased from 6.2 to 6.5 by titrating several
microliters of dilute NaOH solution directly into the sample, monitoring the pH with a thin
3 mm electrode. The spectrum re-recorded at this slightly higher pH gave more uniform
and better dispersed signals, and was the best spectrum obtained in these trials
(Fig.3.55D). From these NMR experiments it was clear that the presence of DTT and the change
in pH had improved the appearance of the NMR spectra of 15N hSTRAP(1-150), likely by
reducing aggregation caused by intermolecular disulfide bond formation and by increasing
the total charge of the protein, respectively. This spectrum, however, still contained too
many signals (e.g., from tryptophan indoles), suggesting that a mixture of different
conformers was present. It should be noted that all these initial NMR experiments were
conducted at 30°C.
At this point it was identified through CD experiments that this protein construct seems to
retain secondary structure only between 4 and 10°C, and starts to unfold as the temperature
is raised (Fig.3.51B). This correlates with the findings of the NMR experiments carried out
at 30°C, as a mixture of folded and unfolded hSTRAP(1-150) protein would be present in the
sample at this temperature.
2D 1H-15N correlation HSQC of dialysed 0.5 mM 15N hSTRAP(1-150) protein at 4°C in 20
mM Sodium phosphate buffer, 150 mM Sodium Chloride, 50 mM Arginine and Glutamic
acid, and 10 mM DTT at pH 6.5, showed that there was a mixture of folded and unfolded
hSTRAP(1-150) protein present in the sample (Fig.3.56). However, there was more folded
hSTRAP(1-150) protein present in this sample (Fig.3.56) than in the previous sample at
30°C (Fig.3.55D). Signals were still not of uniform intensity and not dispersed enough to
suggest the presence of completely folded protein. Also, the two tryptophan indole signals
had disappeared, which could be explained by protein aggregation.
Figure 3.54. 15N-hSTRAP(1-150) buffer optimisation trials. A: 15% SDS PAGE gel
showing 15 µl of 15N-hSTRAP(1-150) protein sample in dialyzed initial NMR buffer (20 mM
Sodium phosphate buffer and 150 mM Sodium Chloride, pH 6.2), before (lane 2) and after
concentration (lane 3). Gel was stained with Instant Blue. B: 1D 1H NMR spectrum showing no
evidence of folded 15N-hSTRAP(1-150) protein in the initial NMR buffer used. C: 1D 1H NMR spectrum
of 15N-hSTRAP(1-150) in optimised buffer (20 mM Sodium phosphate buffer, 150 mM Sodium
Chloride, 50 mM Arginine and Glutamic acid, and 10 mM DTT at pH 6.5) showing evidence of
folded protein.
Figure 3.55. 2D 1H-15N correlation NMR spectra of 15N-hSTRAP(1-150). HSQC spectra
of 0.5 mM 15N-hSTRAP(1-150) taken at 30°C in: A: 20 mM Sodium phosphate buffer and 150
mM Sodium Chloride, pH 6.2; B: Same buffer as used in (A) but with 50 mM Arginine and
Glutamic acid, pH 6.2; C: Same buffer as used in (B) but with 10 mM DTT at pH 6.2; D: Same
buffer as used in (C) but at pH 6.5. This latter spectrum showed evidence of folded
15N-hSTRAP(1-150) protein, although unfolded 15N-hSTRAP(1-150) protein was also present.
Figure 3.56. 2D 1H-15N correlation HSQC NMR spectrum of 15N-hSTRAP(1-150) in the
identified optimised conditions. This 2D HSQC NMR spectrum was taken at 6°C of dialysed
15N-hSTRAP(1-150) protein in 20 mM Sodium phosphate buffer, 150 mM Sodium Chloride, 50
mM Arginine and Glutamic acid, and 10 mM DTT at pH 6.5. This spectrum does not show
evidence of completely folded protein due to the non-uniform peak intensities, suggesting
conformational instability of the protein.
From this set of experiments it was concluded that, when produced in the
BL21(DE3)pLysS cell line, hSTRAP(1-150) was not completely and stably folded.
However, these latter experiments showed that DTT helps the folding of hSTRAP(1-150)
(Fig.3.55D), which suggests that incorrectly formed disulphide bonds may be
responsible for the poor properties of this construct when expressed in this cell line. To test
whether the construct properties could be improved, the pET-14b-His-hSTRAP(1-150) plasmid DNA
was transformed into Shuffle T7 express and Shuffle T7pLysY, which both facilitate
correct disulphide bond formation and contain chaperones to assist in the
folding of protein (Table 2.2). The latter cell line also expresses an inhibitor of
transcription to suppress expression of protein prior to induction (Table 2.2).
Expression trials were then carried out to determine the optimum conditions for soluble
expression of unlabelled hSTRAP(1-150) protein in these cell lines. These trials identified
optimum expression in Shuffle T7, induced with 0.1 mM IPTG at an OD600 of 0.5, followed
by 3 hrs incubation at 30°C in LB media. Samples representative of total, soluble and
insoluble fractions of lysed cells were analyzed by SDS PAGE, which showed hSTRAP(1-150)
being expressed over time under these optimum conditions (Fig.3.57A, Lanes 2-5) and
found mainly in the soluble fraction (Fig.3.57B, Lanes 6-9). However, hSTRAP(1-150)
was not expressed in Shuffle T7pLysY (Gels not shown).
Figure 3.57. Expression of unlabelled hSTRAP(1-150) in Shuffle T7. These 15% SDS
PAGE gels show: A: Total Shuffle T7 pET-14b-His-hSTRAP(1-150) transformed cell lysate 1
hour (lane 3), 2 hours (lane 4) and 3 hours (lane 5) post IPTG induction. B: Insoluble (lanes 2-5)
and soluble (lanes 6-9) fractions of Shuffle T7 cells transformed with pET-14b-His-hSTRAP(1-150)
1 hour (lanes 3 and 7), 2 hours (lanes 4 and 8) and 3 hours (lanes 5 and 9) after induction
with IPTG. Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 6 are pre-induction. Each
lane was loaded with 15 µl of sample and both gels were stained with Instant Blue.
Protein purification showed that pure hSTRAP(1-150) can be obtained in elutions at pH 8.7
(Fig.3.58). The estimated concentration of hSTRAP(1-150) protein obtained in each elution
fraction is shown in Table 3.14. Total quantity estimate from 1 litre E. coli culture was 1.88
mg (Table 3.14), which is very low compared to 200 mg/l of unlabelled hSTRAP(1-150)
usually obtained in BL21(DE3)pLysS (See Results Section 3.1).
Figure 3.58. Purification of unlabelled hSTRAP(1-150) in Shuffle T7 express. 15%
SDS PAGE gel showing eluted hSTRAP(1-150). All lanes were loaded with 15 µl of sample and
the gel was stained with Instant Blue.
Table 3.14. Estimated concentration of hSTRAP(1-150) in elution fractions when expressed in
Shuffle T7 express cells.
Fraction     OD595    Concentration (mg/ml)    Elution volume (ml)    Total protein quantity (mg)
Elution 1    0.230    0.230                    1.5                    0.345
Elution 2    0.285    0.285                    1.5                    0.428
Elution 3    0.255    0.255                    1.5                    0.383
Elution 4    0.169    0.169                    1.5                    0.254
Elution 5    0.113    0.113                    1.5                    0.170
Elution 6    0.102    0.102                    1.5                    0.153
Elution 7    0.098    0.098                    1.5                    0.147
Estimated concentration of hSTRAP(1-150) in elutions obtained through protein purification in
Shuffle T7 express (Elutions shown in Figure 3.58) and identified through Bradford reagent as
described in Materials and Methods.
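The yield figures in Table 3.14 follow from multiplying each estimated concentration by the elution volume and summing over fractions. The short Python sketch below illustrates that arithmetic only; the variable and function names are hypothetical, the values are transcribed from the table, and the 200 mg/l reference yield is the BL21(DE3)pLysS figure quoted in the text.

```python
# Values transcribed from Table 3.14 (concentration in mg/ml, volume in ml).
concentrations_mg_per_ml = [0.230, 0.285, 0.255, 0.169, 0.113, 0.102, 0.098]
elution_volume_ml = 1.5
reference_yield_mg_per_l = 200.0  # BL21(DE3)pLysS yield quoted in the text


def fraction_yields(concentrations, volume_ml):
    """Protein mass (mg) in each fraction: concentration x volume."""
    return [c * volume_ml for c in concentrations]


def total_yield(concentrations, volume_ml):
    """Total protein mass (mg) summed over all elution fractions."""
    return sum(fraction_yields(concentrations, volume_ml))


total_mg = total_yield(concentrations_mg_per_ml, elution_volume_ml)
fold_lower = reference_yield_mg_per_l / total_mg

print(f"Total yield from 1 litre of culture: {total_mg:.2f} mg")
print(f"Fold lower than the reference yield: {fold_lower:.0f}")
```

The summed value agrees with the 1.88 mg total quoted above, which corresponds to a yield roughly two orders of magnitude below that obtained in BL21(DE3)pLysS.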
The next step was to dialyse hSTRAP(1-150) protein into the previously identified optimal NMR
buffer for labelled hSTRAP(1-150), which was 20 mM Sodium phosphate buffer, 150 mM
Sodium Chloride, 50 mM Arginine and Glutamic acid at pH 6.5. DTT was not added to the
elutions obtained with this cell line, as this cell line contains enzymes which should help
the formation of correct disulphide bonds (Table 2.2). The protein was then concentrated
in the Amicon, and then in the Viva-spin column, to 500 µl to obtain a concentration of 0.2
mM. Subsequently a 1D 1H spectrum of this concentrated sample was recorded, which
again did not show any signs of well-folded protein (Fig.3.59), or any improvement over the
previous spectra of the hSTRAP(1-150) sample obtained in BL21(DE3)pLysS cells and
recorded in the same buffer (Fig.3.56). Hence, no difference was detected in
the folding of hSTRAP(1-150) produced in this cell line, which was initially hypothesized to
improve protein folding due to the presence of both disulphide-oxidizing enzymes and
chaperones.
Figure 3.59. 1D 1H Spectrum of hSTRAP (1-150) in Shuffle T7 express. This spectrum
shows no evidence of folded hSTRAP(1-150) protein in this cell line in 20 mM Sodium phosphate
buffer, 150 mM Sodium Chloride, 50 mM Arginine and Glutamic acid at pH 6.7. Protein
concentration used for this NMR experiment was 0.2 mM
From all the biophysical studies carried out with hSTRAP(1-150) it was shown that
hSTRAP(1-150) is 71% alpha helical (Fig.3.51A) and thermally unstable, as it
completely unfolds at 70°C (Fig.3.51B) and does not refold when the temperature is lowered
again (Fig.3.51C). Furthermore, hSTRAP(1-150) exists as a conformationally heterogeneous
mixture of isoforms, and does not possess a unique tertiary fold (Fig.3.56). Possible reasons
for the structural instability of this construct include perturbation of its native
structure due to the chosen truncation.
3.3.5 Biophysical and Structural studies carried out on hSTRAP(151-284)
3.3.5.1 Circular Dichroism on hSTRAP(151-284)
Protein hSTRAP(151-284) was dialyzed into CD buffer (20 mM Sodium Phosphate buffer,
pH 6.5) and analyzed by SDS PAGE to confirm its purity before proceeding to Circular
Dichroism experiments. Pure hSTRAP(151-284) was present in dialysed sample
(Fig.3.60A, Lane 3) and the concentration of hSTRAP(151-284) protein used for these CD
experiments was 15.9 µM.
Figure 3.60. Dialysed hSTRAP(151-284)
protein in CD buffer. This 15% SDS PAGE
gel shows pure hSTRAP(151-284) (Indicated by
an arrow) was present in dialyzed sample to be
used for subsequent CD experiments. Fifteen
microlitres of sample was loaded in each lane
and the gel was stained with Instant Blue
A scan taken at 4°C to determine the folding state of hSTRAP(151-284) showed that
hSTRAP(151-284) was folded and composed of various secondary structure elements
(Fig.3.61A). Molar ellipticity values for this construct were uploaded to the program
Dichroweb to estimate the percentage of each secondary structure element in
hSTRAP(151-284), and this showed that hSTRAP(151-284) is 42% α-helical, 16.5% β-structure,
21.5% turn and 20% disordered. The variable temperature experiment was then
carried out and showed that hSTRAP(151-284) protein does not display a clear
cooperative unfolding transition across temperatures 4-80°C at 220 nm (Fig.3.61B).
Furthermore, the scan taken at 4°C after the variable temperature experiment had
completed showed that hSTRAP(151-284) had re-folded (Fig.3.61C). Figure 3.61D shows that the
scans taken at 4°C before and after the variable temperature experiment can be
superimposed, suggestive of a reversibly folded protein.
These experiments were repeated 3 times and the same result was obtained as shown in
Figure 3.61. The CD results therefore suggest that the protein is folded at the secondary
structure level; however, the absence of a distinct unfolding transition suggests that this
protein construct exists in a molten globule state.
Figure 3.61. CD experiments carried out on hSTRAP(151-284). A: CD spectrum recorded
at 4°C, before the variable temperature experiment was carried out. B: CD spectrum carried out at
variable temperatures ranging from 4 to 80°C taken at fixed wave length of 220 nm. C: CD
spectrum recorded at 4°C after the variable temperature experiment was carried out. D:
Superimposition of A (Highlighted in blue) and C spectra (Highlighted in green). One
representative experiment is shown out of three repeats. Abbreviation: MRME, Mean residue molar
ellipticity.
3.3.5.2 NMR studies of hSTRAP(151-284)
The hSTRAP(151-284) protein was dialyzed into the previously identified optimized NMR buffer
for hSTRAP(1-150): 20 mM Sodium phosphate buffer, 150 mM Sodium Chloride, 50 mM
Arginine and Glutamic acid, 10 mM DTT at pH 6.5. The protein was then
concentrated to 0.15 mM and a 1D 1H NMR spectrum of this unlabelled hSTRAP(151-284)
protein sample was recorded. This spectrum showed no evidence of folded
hSTRAP(151-284) protein (Fig.3.62); the characteristic dispersed protein signals were not
visible in the spectrum and the apparent protein concentration was too low according to the
NMR spectrum (Fig.3.62), suggestive of aggregated protein. As a consequence it was
decided not to proceed with preparing 15N labelled hSTRAP(151-284) for further structural
experiments on hSTRAP(151-284).
Figure 3.62. 1D 1H NMR spectrum of hSTRAP(151-284). This spectrum was recorded of a
0.15 mM hSTRAP(151-284) protein sample at 30°C in optimised buffer conditions, 20 mM
Sodium phosphate buffer, 150 mM Sodium Chloride, 50 mM Arginine and Glutamic acid, and 10
mM DTT at pH 6.5. This shows that hSTRAP(151-284) is unfolded.
Structural studies carried out with hSTRAP(151-284) showed that this protein was folded
at the secondary structure level and composed of various secondary structure elements
(Fig.3.61) but may exist in a molten globule state. CD detects the presence of secondary
structure, and this does not necessarily mean that the protein is folded at the tertiary
structure level, as shown by the 1D 1H NMR spectrum (Fig.3.62), which indicates that
hSTRAP(151-284) does not appear to have a unique 3D fold. It is possible
that hSTRAP(151-284) exists in a molten globule state due to perturbation of
its native structure by truncation.
3.3.6 Biophysical and structural studies carried out on hSTRAP(285-440)
3.3.6.1 Circular Dichroism on hSTRAP(285-440)
HSTRAP(285-440) was dialyzed into CD buffer (20 mM Sodium Phosphate buffer, pH
6.5) and analyzed by SDS PAGE gel to confirm its purity (Fig.3.63). Pure hSTRAP(285-440)
was present in dialyzed sample (Fig.3.63). The concentration of hSTRAP(285-440)
protein used for CD experiment was 9.8 µM.
Figure 3.63. Dialysed hSTRAP(285-440) protein in CD buffer to be used for
subsequent CD experiments. This 15% SDS PAGE gel shows pure hSTRAP(285-440)
(Indicated by an arrow) was present in dialysed sample to be used for subsequent CD
experiments. Lane was loaded with 15 µl of sample and the gel was stained with Instant Blue.
[Gel annotations: Dialysed Protein; hSTRAP(285-440) TPR 5-6; markers at 25 kDa, 15 kDa and
10 kDa; lanes 1 and 2]
A scan taken at 4°C to determine the folding state of hSTRAP(285-440) showed that this
construct includes various secondary structures and is folded at this level (Fig.3.64A).
Molar ellipticity values for this construct were uploaded to the program Dichroweb to
estimate the percentage of each secondary structure element present in hSTRAP(285-440),
and this showed that hSTRAP(285-440) is 15.7% α-helical, 29.7% β-structure, 24.6% turn
and 30% disordered. The variable temperature experiment was then carried out and
showed that hSTRAP(285-440) protein does not show a clear co-operative unfolding
transition (Fig.3.64B), possibly due to the protein existing in a molten globule state.
Furthermore, the scan taken at 4°C after the variable temperature experiment had
completed showed that hSTRAP(285-440) refolded reversibly (Fig.3.64C). Figure 3.64D
shows that the scans taken at 4°C before and after the variable temperature experiment
superimpose well. This confirmed that hSTRAP(285-440)
reversibly refolds after the temperature is decreased. This experiment was repeated 3 times
and the same results were obtained each time.
At the time when these studies were being carried out, part of the structure of hSTRAP,
amino acid residues 262-422 was solved by another group [PDB code 2XVS], and for this
reason further structural studies for hSTRAP(285-440) were not performed.
Figure 3.64. CD experiments carried out on hSTRAP(285-440). A: CD spectrum recorded
at 4°C, before the variable temperature experiment was carried out. B: CD spectrum carried out at
variable temperatures ranging from 4 to 80°C taken at fixed wave length of 220 nm. C: CD
spectrum recorded at 4°C, after the variable temperature experiment was carried out. D:
Superimposition of A (Highlighted in blue) and C spectra (Highlighted in green). One
representative experiment is shown out of three repeats. Abbreviation: MRME, Mean residue molar ellipticity.
4. Chapter four. General Discussion
4.1 Comparisons of hSTRAP and mSTRAP structural data
The structures of full length mouse mSTRAP (PDB code 4ABN) and a part of the C
terminus of human hSTRAP (residues 262-422) (PDB code 2XVS) have been solved by X-ray
crystallography and published only recently [153], when our experimental studies were
largely completed and the Thesis was being written up. The experimental structures of
mSTRAP (PDB code 4ABN) and hSTRAP fragment (PDB code 2XVS) therefore now
allow a comparison of the location of secondary structure elements within this protein with
predicted locations which were used as a basis for the current study. The secondary
structure predictions carried out in this study indicated an alpha helical content predictive
of another TPR motif present within hSTRAP amino acids 1-68 (Fig.3.10). This was found
to be the case from the solved mSTRAP structure (PDB code 4ABN) as there is one TPR
motif present between amino acids 7-61 [153 and summarized in Table 4.1], which was not
predicted from initial STRAP sequence analysis [136]. According to both secondary
structure prediction tools used in this study, hSTRAP is around 70% α-helical which would
correlate with the presence of six TPR motifs and correlates with the experimental
mSTRAP structure [153]. Indeed mSTRAP does contain 6 TPR motifs but their location is
slightly different for two of the TPR motifs predicted. TPR1, 2, 3, 4, 5 and 6 of mSTRAP
are located at positions 7-61, 68-98, 103-130, 136-174, 179-216 and 224-253 respectively
[153, Table 4.1] compared to the initially predicted positions of hSTRAP TPR1, 2, 3, 4, 5
and 6 as 69-102, 103-136, 179-212, 224-257, 332-365 and 373-406 respectively [136].
This shows that TPR motifs are present at residues 7-61 and 136-174 that were not
predicted by the initial STRAP sequence analysis. The hSTRAP region extending from 262-422 is
27% helical, 9% turn, 8% bend, 36% extended chain (β strand) and 20% chain, and an OB
fold is present in this region rather than the two TPR motifs predicted initially from the
STRAP amino acid sequence (332-365 and 373-406 [136]; PDB code 2XVS), which
suggests that the initial predictions of the location of TPR motifs using bioinformatic methods
were not reliable and were in fact misleading in this part of the protein. This once again
emphasizes the importance of experimental determination of 3D structure of proteins,
rather than just relying on bioinformatic-based predictions. Positions of four of the TPR
motifs towards the middle of the protein were predicted correctly. The experimental TPR
positional data mentioned above refers to the mouse homologue of STRAP and this thesis
is on the human homologue, however, the two structures are expected to be very similar.
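The comparison of predicted and experimental TPR positions described above can be reproduced with a short script. The following is a minimal Python sketch and was not part of the original analysis: the residue ranges are transcribed from the text, the 50% coverage threshold used to call a prediction confirmed is an arbitrary assumption, and mouse and human residue numbering are treated as equivalent.

```python
# Experimental TPR motifs in mSTRAP [153] and initially predicted hSTRAP
# motifs [136], as inclusive residue ranges transcribed from the text.
EXPERIMENTAL = {1: (7, 61), 2: (68, 98), 3: (103, 130),
                4: (136, 174), 5: (179, 216), 6: (224, 253)}
PREDICTED = {1: (69, 102), 2: (103, 136), 3: (179, 212),
             4: (224, 257), 5: (332, 365), 6: (373, 406)}

MIN_COVERAGE = 0.5  # assumed threshold, chosen only for illustration


def overlap(a, b):
    """Number of residues shared by two inclusive ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def match_predictions(predicted, experimental, min_coverage=MIN_COVERAGE):
    """Map each predicted motif to the experimental motif covering most of it."""
    matches = {}
    for pred_id, pred in predicted.items():
        exp_id, exp = max(experimental.items(),
                          key=lambda item: overlap(pred, item[1]))
        coverage = overlap(pred, exp) / (pred[1] - pred[0] + 1)
        if coverage >= min_coverage:
            matches[pred_id] = exp_id
    return matches


matches = match_predictions(PREDICTED, EXPERIMENTAL)
missed = sorted(set(EXPERIMENTAL) - set(matches.values()))

print("Predicted motif -> experimental motif:", matches)
print("Experimental motifs with no predicted counterpart:", missed)
```

With these inputs, four of the six predicted motifs map onto experimental motifs, while experimental TPR1 and TPR4 have no predicted counterpart and the two C-terminal predictions match nothing, in line with the discussion above.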
Sequence alignment of mSTRAP with hSTRAP shows that the two STRAP homologues
are highly conserved as a difference of less than 10% is observed in amino acid sequence
between the two (Fig.4.1). To obtain the structure of human STRAP, we performed
homology modeling using the published structure of mSTRAP (PDB code 4ABN) [153] as
a template and hSTRAP amino acid sequence as a target, using the SwissPDB viewer
software (http://spdbv.vital-it.ch/). Figure 4.2 shows that there are minor differences at the
C terminus of STRAP but suggests that mouse and human STRAP have similar
conformations and, given the high sequence similarity, are expected to have similar
functional properties.
Table 4.1. Experimental TPR positions of STRAP TPR motifs
TPR    TPR Helix A    TPR Helix B
1      7-28           42-61
2      68-78          86-98
3      103-116        119-130
4      136-146        154-174
5      179-195        200-216
6      224-236        240-253
This table shows STRAP TPR motif positions from solved structure; columns 2 and 3 indicate
STRAP amino acid positions.
Mouse  MMADEEEEAKHVLQKLQGLVDRLYCFRDSYFETHSVEDAGRKQQDVQEEMEKTLQQMEEVLGSAQVEAQA 70
Human  MMADEEEEVKPILQKLQELVDQLYSFRDCYFETHSVEDAGRKQQDVRKEMEKTLQQMEEVVGSVQGKAQV 70

Mouse  LMLKGKALNVTPDYSPEAEVLLSKAVKLEPELVEAWNQLGEVYWKKGDVTSAHTCFSGALTHCKNKVSLQ 140
Human  LMLTGKALNVTPDYSPKAEELLSKAVKLEPELVEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNKVSLQ 140

Mouse  NLSMVLRQLQTDSGDEHSRHVMDSVRQAKLAVQMDVLDGRSWYILGNAYLSLYFNTGQNPKISQQALSAY 210
Human  NLSMVLRQLRTDTEDEHSHHVMDSVRQAKLAVQMDVHDGRSWYILGNSYLSLYFSTGQNPKISQQALSAY 210

Mouse  AQAEKVDRKASSNPDLHLNRATLHKYEESYGEALEGFSQAAALDPAWPEPQQREQQLLEFLSRLTSLLES 280
Human  AQAEKVDRKASSNPDLHLNRATLHKYEESYGEALEGFSRAAALDPAWPEPRQREQQLLEFLDRLTSLLES 280

Mouse  KGKTKPKKLQSMLGSLRPAHLGPCGDGRYQSASGQKMTLELKPLSTLQPGVNSGTVVLGKVVFSLTTEEK 350
Human  KGKVKTKKLQSMLGSLRPAHLGPCSDGHYQSASGQKVTLELKPLSTLQPGVNSGAVILGKVVFSLTTEEK 350

Mouse  VPFTFGLVDSDGPCYAVMVYNVVQSWGVLIGDSVAIPEPNLRHHQIRHKGKDYSFSSVRVETPLLLVVNG 420
Human  VPFTFGLVDSDGPCYAVMVYNIVQSWGVLIGDSVAIPEPNLRLHRIQHKGKDYSFSSVRVETPLLLVVNG 420

Mouse  KPQNSSSQASATVASRPQCE 440
Human  KPQGSSSQAVATVASRPQCE 440
Figure 4.1. Sequence alignments of Mouse and Human STRAP. The amino acids
highlighted in yellow are conserved residues between mouse (mSTRAP) and human (hSTRAP).
Amino acids highlighted in cyan are the amino acids that are not conserved between the two
sequences.
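The degree of conservation between two aligned sequences can be quantified as percent identity, i.e. the fraction of aligned positions carrying the same residue. The sketch below is a minimal Python illustration and not the method used to produce Figure 4.1; the function name is hypothetical, it assumes a gap-free alignment of equal-length sequences, and for brevity it is applied only to the final alignment block (residues 421-440).

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical residues in two aligned,
    equal-length, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * identical / len(seq_a)


# Final alignment block from Figure 4.1 (residues 421-440).
mouse_tail = "KPQNSSSQASATVASRPQCE"
human_tail = "KPQGSSSQAVATVASRPQCE"

identity = percent_identity(mouse_tail, human_tail)
print(f"Identity over residues 421-440: {identity:.1f}%")
```

This short C-terminal block is only an illustration; the overall figure quoted in the text refers to the full 440-residue alignment, which would be passed to the same function as two complete sequences.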
Figure 4.2. Homology modelling of hSTRAP and mSTRAP structure. Homology
modeling was performed using SwissPDB viewer. A: The structure highlighted in red is mSTRAP
(from the structure published in [153], PDB code 4ABN) and brown is hSTRAP (Amino acid sequence
uploaded into the program). B: hSTRAP structure with the N terminus and C terminus highlighted
in pink and yellow respectively.
Figure 4.3 illustrates all the hSTRAP protein variants cloned in this thesis, shown on the
hSTRAP structure (obtained from homology modeling of the solved mSTRAP structure, see
Fig.4.2). As indicated in this figure, the truncation points at residues 219-220 and 284-285
do not cut through any secondary structure elements (as also shown by Table 4.1). However,
amino acids 150 and 151 lie between Helix A and Helix B of TPR4 (Table 4.1 and Fig.4.3B),
which could cause structural instability of the constructs obtained and, as a consequence,
unfolded proteins.
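The position of each truncation point relative to the experimental TPR helices can be checked directly against Table 4.1. The following Python sketch is an illustration under stated assumptions rather than part of the original work: the helix ranges are transcribed from Table 4.1, the function name and the wording of the returned labels are hypothetical, and a cut is described by the last residue of the N-terminal fragment.

```python
# Helix A and Helix B residue ranges for each TPR motif, from Table 4.1.
TPR_HELICES = {
    1: ((7, 28), (42, 61)),
    2: ((68, 78), (86, 98)),
    3: ((103, 116), (119, 130)),
    4: ((136, 146), (154, 174)),
    5: ((179, 195), (200, 216)),
    6: ((224, 236), (240, 253)),
}


def classify_cut(n_side):
    """Describe a cut made between residue n_side and residue n_side + 1."""
    c_side = n_side + 1
    for tpr, (helix_a, helix_b) in TPR_HELICES.items():
        for label, (start, end) in (("A", helix_a), ("B", helix_b)):
            # Both flanking residues lie in the same helix: the helix is split.
            if start <= n_side and c_side <= end:
                return f"inside helix {label} of TPR{tpr}"
        # The cut falls in the loop joining the two helices of one motif.
        if helix_a[1] <= n_side and c_side <= helix_b[0]:
            return f"between helices A and B of TPR{tpr}"
    return "outside any TPR motif"


for cut in (150, 219, 284):
    print(f"Cut {cut}/{cut + 1}: {classify_cut(cut)}")
```

Under these definitions the 150/151 boundary is the only one of the three that separates the two helices of a single TPR motif, which is the point made in the paragraph above.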
Figure 4.3. Ribbon representation of the different regions of hSTRAP cloned and
expressed separately in the current study, mapped on the model structure of full-length
STRAP protein. A: The regions highlighted in red and blue are hSTRAP(1-219) and
hSTRAP(220-440), respectively; B: The structural regions highlighted in cyan, green, and pink are
hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440), respectively.
All the hSTRAP protein variants cloned in this study were successfully expressed and
purified as shown in Results section 3.1.
4.2 Structural characterization of hSTRAP protein fragments
4.2.1 CD characterization of all hSTRAP protein variants
CD experiments showed that His-hSTRAP(1-440), hSTRAP(1-219), hSTRAP(151-284)
and hSTRAP(285-440) are folded at the secondary structure level, and composed of
various secondary structure elements as shown by their respective CD spectra
(Fig.3.29A, 3.43A, 3.61A, 3.64A respectively). The hSTRAP(1-150) construct is also folded, and is
more predominantly alpha helical (Fig.3.51) than the other His-hSTRAP
protein variants; however, hSTRAP(1-150) is not thermally stable and does not exhibit
reversible folding after heating (Fig.3.51 and Table 4.2), unlike the other His-hSTRAP
protein variants (Fig.3.29A, 3.43A, 3.61A, 3.64A). CD experiments on GST-hSTRAP(1-440)
were inconclusive due to the background signal of the large GST tag,
and as a consequence its secondary structure composition, folding and thermal stability
could not be determined (Fig.3.40). This correlates well with the published literature on
studying proteins by CD, as it is generally found that a large tag, such as the 26 kDa GST
tag in this case, will contribute substantially to the signals obtained from the fusion protein [25].
Further CD analysis and comparison with the published STRAP structure [153] indicated that the
smaller the protein the more reliable the CD data (Table 4.2), as the CD experimental data
for the shorter truncated constructs hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440)
correlate well with the experimental α-helical content [4ABN, 153]. Table 4.2 shows
that the α-helical content of the N terminus of hSTRAP from CD analysis is higher than
would be expected if it contained only the TPR motifs predicted from the STRAP sequence
analysis initially performed when STRAP was discovered [136]. Furthermore, CD analysis
suggests that the C terminus has a lower percentage helical content than expected if it
contained the two TPR motifs between residues 332-365 and 373-406 that were initially
predicted by that sequence analysis [136]. This all correlates with
the recently published STRAP structure, as mSTRAP (PDB code 4ABN) was found to
contain TPR motifs at 7-61 and 136-174 that were not predicted by the initial STRAP
sequence analysis. Furthermore, from hSTRAP structural data, region 262-422 [2XVS]
was shown to be 27% helical, 9% turn, 8% bend, 36% extended chain (β strand) and 20%
the rest (chain), and to contain an OB fold rather than a TPR motif [153].
Table 4.2. CD data
Protein              α-helical (%)    α-helical (%) from STRAP    Actual α-helical    β (%)    Turn (%)    Disordered (%)
                                      sequence analysis           content (%)
His-hSTRAP(1-440)    11.0             46.4                        51.6                25.0     31.2        32.8
hSTRAP(1-219)        67.0             46.6                        90.0                8.6      11.6        12.8
hSTRAP(1-150)*       71.0             45.3                        80.0                0.0      0.0         29.0
hSTRAP(151-284)      42.0             51.1                        51.1                16.5     21.5        20.0
hSTRAP(285-440)      15.7             43.9                        27.0                29.7     24.6        30.0
Estimated percentage of each secondary structure element determined by the program Dichroweb
from CD experiments carried out in this thesis (columns 2, 5, 6 and 7). The third column shows the
% α-helical content expected if each construct contained the number of TPR motifs initially
predicted from STRAP amino acid sequence analysis [136]. The fourth column shows the experimental
α-helical content as shown by recently published data on the STRAP structure [153]. Asterisk (*)
indicates the protein construct that was found to be thermally unstable by CD.
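The agreement between the CD-derived and experimental α-helical content in Table 4.2 can be summarised as an absolute difference per construct. The Python sketch below is an illustrative check only, with hypothetical names; the numbers are transcribed from Table 4.2, and it also verifies that the four CD-derived fractions for each construct sum to 100%.

```python
# Rows of Table 4.2: (CD helix %, actual helix % [153], CD beta %, CD turn %,
# CD disordered %), transcribed from the table.
TABLE_4_2 = {
    "His-hSTRAP(1-440)": (11.0, 51.6, 25.0, 31.2, 32.8),
    "hSTRAP(1-219)": (67.0, 90.0, 8.6, 11.6, 12.8),
    "hSTRAP(1-150)": (71.0, 80.0, 0.0, 0.0, 29.0),
    "hSTRAP(151-284)": (42.0, 51.1, 16.5, 21.5, 20.0),
    "hSTRAP(285-440)": (15.7, 27.0, 29.7, 24.6, 30.0),
}


def helix_deviation(row):
    """Absolute difference (percentage points) between CD and actual helix content."""
    cd_helix, actual_helix = row[0], row[1]
    return abs(cd_helix - actual_helix)


def cd_total(row):
    """Sum of the four CD-derived secondary structure fractions."""
    cd_helix, _, beta, turn, disordered = row
    return cd_helix + beta + turn + disordered


deviations = {name: round(helix_deviation(row), 1) for name, row in TABLE_4_2.items()}

for name, row in TABLE_4_2.items():
    print(f"{name}: CD total = {cd_total(row):.1f}%, "
          f"helix deviation = {deviations[name]:.1f} points")
```

On these values the three shortest constructs deviate from the experimental helical content by roughly 9 to 11 percentage points, compared with 23 points for hSTRAP(1-219) and about 41 points for the full-length protein.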
4.2.2 Crystallographic studies on hSTRAP(1-440)
Approximately 500 conditions were screened but no crystal of high enough quality was
obtained (Fig.3.34-37 and Fig.3.39), hence the structure of His-hSTRAP(1-440) could not
be solved. Priority was therefore given to solving the structures of shorter fragments of
hSTRAP, which could be more suitable for solution NMR studies, which do not require
protein crystals. The crystals used to solve the structures of full length mSTRAP and a part
of the C terminus of hSTRAP [153] were obtained in conditions that were not
tested in the trials described above, which could explain why a crystal was not obtained.
Furthermore, this could be due to the presence of flexible regions in full-length hSTRAP or
its conformational plasticity. Another reason why full length mSTRAP crystallised and
hSTRAP did not, considering their high degree of homology (Fig.4.1 and 4.2), could
be that full length hSTRAP is less stable and requires a certain posttranslational
modification to form a stable correct fold that the mouse homologue does not. It could be
that hSTRAP in complex with a ligand is more stable and would crystallise as a complex,
and so this avenue should be tested in the future. It is quite a common approach to try to
crystallise a homologue of a protein that is proving difficult to crystallise, as the homologue
may favor crystal contact formation [154].
Crystallography trials using GST-hSTRAP(1-440) could not be carried out because, in
order to do so, the tag had to be cleaved: the tag is too big, and there is a flexible linker
between the GST tag and the hSTRAP sequence (Fig.3.5). These characteristics are not
favourable for crystal growth, and GST-fusion proteins are generally hard to crystallise
[155]. On-column GST cleavage experiments were carried out but were unsuccessful, as
it seems that after cleavage both the tag and hSTRAP protein precipitate on the column
and as a consequence cannot be eluted from the GST affinity resin, even with
200 mM reduced glutathione (Fig.3.41). The same was observed with off-column
GST cleavage; therefore crystallography trials could not be carried out with
GST-hSTRAP(1-440), as the tag cannot be removed.
4.2.3 NMR studies on hSTRAP(1-219)
A 1D 1H NMR spectrum of hSTRAP(1-219) protein sample showed no evidence of well
folded protein, as no shifted methyl resonances were observed (Fig.3.46). The CD spectra
of hSTRAP(1-219) indicated the presence of secondary structure (Fig.3.43), but CD also
showed that 12.8% of the structure is disordered. Furthermore, no co-operative unfolding
was observed as the temperature was raised, which suggests that the protein exists in a molten
globule state. The latter would correlate with the NMR data and the characteristics
published for molten globule state proteins [156]. Although the truncation, according to
published data, does not cut through any helical structure [153], removal of parts of the
protein is likely to destabilize the 3D structure, leading to a molten globule state. Therefore
the construct may appear folded at the secondary structure level (in the CD spectrum)
but not folded in a unique way in 3D (in the NMR spectra). This would also explain the
difficulties with expressing and purifying this protein fragment, its proteolytic instability
and its tendency to aggregate and precipitate.
4.2.4 NMR studies on hSTRAP(1-150)
CD experiments carried out with hSTRAP(1-150) showed that hSTRAP(1-150) is 71% α-helical,
thermally unstable and does not exhibit reversible folding (Fig.3.51C). Also, there
is no clear two-state unfolding transition, suggestive of a molten globule state protein
[156], which would correlate with the NMR findings that hSTRAP(1-150) exists as a
conformationally heterogeneous mixture of isoforms and does not show a unique fold
(Fig.3.56). The structure of hSTRAP(1-150) could not be solved as this part of the protein
was found to be intrinsically unstable; in order to fold it may require the
presence of an interacting partner, or the rest of the protein which was removed in this
construct.
4.2.5 NMR studies on hSTRAP(151-284)
According to CD experiments, hSTRAP(151-284) protein is folded at the secondary
structure level but does not show clear co-operative unfolding transition (Fig.3.61).
Furthermore, the 1D 1H NMR spectrum (Fig.3.62) of hSTRAP(151-284) has shown that this
protein does not have a unique 3D fold, which together suggests that the protein may exist in a
molten globule state [156]. In the structure of full length mSTRAP the amino acid 151 is
between helix A and B of TPR4 [153, Table 4.1], which could be a reason explaining the
instability of this variant and the formation of a molten globule conformation, as the
presence of Helix B could be critical for its stability and correct folding.
4.3 Difficulties with expression of hSTRAP protein variants using the E.
coli expression system, and possible ways to overcome these in future.
The E. coli expression system was used to produce all hSTRAP protein variants, as this is
the most common and preferred expression system used in research laboratories due to the
advantages associated with the bacterial system (Table 4.3) [157-160]. However, this
expression system cannot maintain various post-translational modifications such as correct
disulphide bond formation, which could be required for the correct fold of the protein.
Correct disulphide bonds may not form within the reducing environment of the E. coli
cytoplasm, and in this case it may be necessary to use another expression system to obtain
correctly folded recombinant protein [157-160]. Quite often, over-expressed recombinant
proteins do not fold correctly and undergo proteolytic degradation or form aggregates and
consequently inclusion bodies [161]. This seems to be the case for the hSTRAP protein
variants cloned in this study. This mis-folding occurs because in the E. coli cytoplasm
transcription and translation are tightly coupled and occur at a fast rate, such that a
protein chain is formed every 35 seconds and a macromolecule concentration of 300-400 mg/ml
can be reached [161]. This makes protein folding a challenging task, and it is
generally known that small single-motif recombinant proteins can form their native
conformation within this relatively short time period and dynamic environment [161]. This
is however more challenging for multi-domain and over-expressed recombinant proteins,
like hSTRAP, that may require molecular chaperones to assist their folding [161]. Failure
of the recombinant protein to form its native conformation rapidly results in either
formation of inclusion bodies or degradation [161]. The probability of protein mis-folding
is increased by the use of strong promoters and high inducer concentrations, which can ultimately
lead to a protein yield of over 50% of the total quantity of cellular protein, and hence the
rate of protein aggregation exceeds that of correct protein folding [161].
Another commonly used expression system is yeast, a eukaryotic micro-organism that is
therefore more similar to human cells in terms of genetics than E. coli,
yet is still easier to manipulate than the mammalian
expression system [157-160]. The baculovirus (insect cell) expression system is the most
extensively used system as it can produce large amounts of protein [157-160]. The
expression level of a recombinant protein is higher in insect cells than in mammalian
cells, and most post-translational modifications are maintained in this system, which
is an advantage over bacterial expression systems [157-160]. The likelihood of mis-folding
is high, especially for polypeptides already destabilized by truncation. Therefore a
eukaryotic system should be used to express these hSTRAP protein variants, but this was
not an option at the time. Furthermore, the likelihood of structural instability for
hSTRAP(1-150) and hSTRAP(151-284) would be high even if expressed in such a system, due to truncation
between Helix A and B of the fourth TPR motif [153, Table 4.1]. New hSTRAP
protein variants could, however, be cloned with the domain boundaries for truncation constructs
chosen according to the experimental structure of mSTRAP, which became
available recently [153].
Table 4.3. The advantages and disadvantages associated with the bacterial and eukaryotic expression
systems.
Bacterial expression system:
  Low cost
  Ability to produce large amounts of protein
  High growth rates, hence smaller time frame needed to express and purify protein
  Does not maintain all post-translational modifications
  Easily transformed with low amounts of foreign DNA
Eukaryotic expression system:
  Has an improved protein folding mechanism to recognize eukaryotic protein
  Found to obtain soluble form of human protein
  Does maintain most (yeast and insect) if not all (mammalian) post-translational modifications
  Likelihood of protein degradation reduced
  More costly
  Longer growth rates
  More complex media
4.4 The hSTRAP interactome
The general workflow for identifying interacting partners by mass spectrometry is to isolate the
protein complex by pull down or immunoprecipitation experiments,
followed by SDS-PAGE for protein separation (size fractionation) [162]. The gel is then
cut into a number of pieces (regions identified by the user) and prepared for mass spectrometry
identification [162]. The disadvantage of this approach is that gel fragments obtained in this
way contain a very narrow distribution of molecular sizes, so that many potential
interacting partners of significantly different size would go undetected. Therefore, in this
study one gel fragment containing the whole mixture of proteins was used, in an attempt to
exploit the high discriminating power of mass spectrometry protein identification and maximise
the potential number of hits (Fig.3.26). As mass spectrometry is a highly sensitive method
that detects sub-picomolar amounts of protein [163], this methodology was expected to still
identify all STRAP interacting proteins within that gel slice. For these types of
experiments the tags generally used are GFP [163] and FLAG [163], which reduce the number
of contaminants obtained, although GST [164] and His tagged fusions are also used [165].
False positives can be reduced through the implementation of highly stringent purification
methods, although these come at the cost of removing low-abundance and low-affinity
interacting proteins [162]. In this thesis the number of false positives was reduced by using
tag-only controls, as well as truncated variants of hSTRAP, as bait for the pull
downs. Another method that can be used to reduce the number of false positives is
isotope labelling, which can distinguish specific from non-specific interactors, although not
all specific interactors can be identified when the signal-to-noise ratio is similar, due to the
background level provided by the contaminants [162].
Another disadvantage of these types of experiments is that the concentration of
bait generally exceeds “normal” cellular levels of the protein [163], and that an in-vitro system is being
used. Both imply that certain interactions detected via these experiments may not actually
occur under normal conditions, because that concentration of protein may not “normally”
exist in the cell and because the two proteins may not be co-localized in the cell
in-vivo. These limitations can be addressed by expressing the bait in a stable cell line or by
using antibodies to detect native endogenous bait and its interacting partners [163]. In addition,
the tags used for purification may interfere with protein function and interactions [163];
this could have been the case in this thesis, as certain proteins were detected in the
pull downs with His tagged hSTRAP variants but not with GST tagged hSTRAP. To overcome
this issue in future experiments, the tag should be placed at the N terminus and, separately, at the C terminus of
the bait, and the differences in interaction profiles between the two pull downs should
then be analyzed [163].
In this investigation it was found that hSTRAP interacts with 25 proteins (Table 3.6),
which included 20 human and 5 E. coli proteins. The peptide data list was searched against
the full database, which includes sequences from both E. coli and human proteins.
Our primary aim was to determine human hSTRAP interacting partners, hence human breast
cancer cellular extract (MCF7) was used for these pull downs. However, it cannot be
excluded that some E. coli proteins co-purified with the hSTRAP constructs
(Fig.3.24). This co-purification would, however, likely depend on the type of protein
purification procedure performed; hence with different purification tags a different set of
co-purified contaminants may be expected. Indeed, five E. coli proteins were detected and
these were in the pull downs with His tagged proteins and not GST tagged proteins (Table
3.6). This suggests that this contamination is specific for His tagged proteins and shows
that using His tagged proteins for these types of pull downs is not ideal. However, in this
case, we tried to mitigate the effects of such artefacts by using multiple repeats of
experiments, as well as using different truncation constructs of hSTRAP, full-length GST
tagged hSTRAP, and analysing the data together, looking for common interaction patterns
for constructs of the same type (i.e., containing the same regions). This reduces the number
of false positives that can be identified and increases the confidence in the data.
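The filtering logic described above (discarding anything that also binds the tag-only control, then keeping proteins seen with every construct that contains the same region) can be sketched with set operations. This is a minimal illustration only: the protein names, construct labels and hit lists are hypothetical placeholders and not data from Table 3.6, and the function names are assumptions introduced for the example.

```python
# Hypothetical sketch: all hit lists below are invented placeholders,
# not results from this thesis.

def specific_hits(bait_hits, control_hits):
    """Return bait hits that were not also found with the tag-only control."""
    return set(bait_hits) - set(control_hits)


def common_hits(hit_sets):
    """Return proteins found in every one of the supplied pull downs."""
    hit_sets = [set(h) for h in hit_sets]
    if not hit_sets:
        return set()
    return set.intersection(*hit_sets)


# Invented example: two constructs sharing the same region, plus a tag-only control.
tag_only_control = {"contaminant_1", "contaminant_2"}
construct_a = {"protein_X", "protein_Y", "contaminant_1"}
construct_b = {"protein_X", "protein_Z", "contaminant_2"}

filtered = [specific_hits(h, tag_only_control) for h in (construct_a, construct_b)]
candidates = common_hits(filtered)
print(sorted(candidates))  # ['protein_X']
```

Requiring a hit in every construct is the strictest possible rule; a looser rule (for example, a hit in most repeats) would retain more weak interactors at the cost of more false positives.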
An interacting network was created based on the evidence from our experiments and also
input from various programs such as GeneMania and String which also take into account
published data. This interacting network (Fig.3.28) includes interactions identified as part
of the work in this thesis (highlighted in red), and those identified by GeneMania and
String (highlighted in blue) as well as interactions that have been predicted or were
revealed after text mining by either or both programs (highlighted in pink and grey). Text
mining data, which have been included in the network for information, were extracted
from large datasets by these two programs [166] and have not been proven by experiments;
therefore, no firm conclusions can be reached from these data. The possible functional
implications of this interacting network are discussed in more detail below, but it should
be noted that further experimental work is necessary to confirm these interactions.
The UniProt ID of each of these 20 human proteins was also submitted to the DAVID
bioinformatics software to define their functional roles. The results of this analysis indicated
that hSTRAP interacting proteins are implicated in diverse pathways (Table 3.7) including
the regulation of the actin cytoskeleton, translation, oxidative phosphorylation, various
metabolic pathways, non-homologous end joining, glycolysis and gluconeogenesis, fatty
acid biosynthesis and the stress response pathway. This suggests that hSTRAP could
potentially be involved in diverse regulatory roles, as originally hypothesized due to the
presence of six TPR motifs in its protein sequence [136, 153].
It has recently been shown that hSTRAP contains an OB fold at its C terminus [153, PDB
code 2XVS], and it is known that OB fold containing proteins are critical for various DNA
related functions such as DNA replication, repair, transcription and translation [167]. In
this thesis an interaction of hSTRAP with the DNA-dependent protein kinase catalytic
subunit, which is implicated in DNA repair, has been suggested (Table 3.7). Furthermore,
interactions with Eukaryotic initiation factor 4A-I, Elongation factor 1-alpha 1 and Tu
translation elongation factor, all implicated in translation, have also been suggested (Table
3.7). These interactions are in accord with published observations indicating that OB fold
containing proteins [167] and STRAP are implicated in the DNA damage response
pathway [144-145, 147-148].
Since STRAP forms a complex with the protein JMY [136], its implication in the
regulation of the actin cytoskeleton seems plausible, as JMY has been shown to be
associated with the cytoskeleton [168-171]. Biochemical experiments have shown that
JMY induces actin nucleation by activating the Arp2/3 complex [168-171]. In particular,
the WH2 domain of JMY binds to monomeric actin and its central acidic region activates
the Arp2/3 complex, consequently causing actin polymerisation [168-171]. JMY also activates
actin polymerisation in an Arp2/3 independent mode, like Spire, whereby four actin
molecules are aligned in tandem to form the pointed end, to which free actin monomers
subsequently bind to form a nascent filament [168-171]. In this study actin, but none of
the components of the Arp2/3 protein complex, was identified as an hSTRAP interacting
partner, suggesting that JMY mediated actin polymerisation could potentially
involve hSTRAP.
JMY localises with actin in the cytoplasm of human neutrophils rather than the nucleus,
and upon DNA damage JMY translocates to the nucleus [168-171]. This implies that JMY
is mainly involved in cell motility under normal conditions, and under DNA damage
conditions it translocates to the nucleus thereby facilitating the p53 mediated stress
response, and concomitantly its effects on cell motility are reduced [168-171]. This could
also be true for STRAP: under normal conditions STRAP may be present in the cytoplasm,
where it acts on the cytoskeleton and cell migration as suggested by the data in this
thesis, and under DNA damage STRAP translocates to the nucleus, where it facilitates
the p53 and DNA damage response upon phosphorylation on serine 203 [144-145, 147-149].
The potential mechanism of hSTRAP function is shown in Fig.4.4.
Published reports have shown that STRAP interacts with JMY through the STRAP 1-205
region [136]. Results shown in this thesis indicate that hSTRAP 1-150 interacts with actin,
allowing the hypothesis that STRAP mediates its effects on the regulation of the actin
cytoskeleton through this region, possibly acting as a scaffolding protein either directly or
indirectly by interacting with JMY [136]. This hypothesis, however, needs to be tested
further by direct experiments in-vitro (to detect whether a direct interaction between
STRAP and actin takes place) and in-vivo (to detect whether these two proteins co-localize in
the cell). In addition, hSTRAP was shown to interact with other actin regulatory molecules,
providing further support for the notion that hSTRAP could potentially be a central
component in the regulation of the actin cytoskeleton, acting as a scaffold or adaptor protein
that clusters multiple actin regulatory proteins to initiate the desired actin response.
Furthermore, STRAP might play a role in the nuclear translocation of JMY under DNA
damage conditions [144-145, 147], thus linking the DNA damage response with cell motility
(Fig.4.4). This hypothesis also needs to be tested with more detailed studies.
Figure 4.4. Proposed hSTRAP mechanism of function. Under normal conditions, STRAP
remains in the cytoplasm regulating the function of actin cytoskeleton in complex with JMY. Then
upon DNA damage STRAP/JMY translocates to the nucleus and binds to p300 thus regulating
transcription and DNA damage response.
Recent research has implicated JMY in spindle migration, asymmetric division and
cytokinesis during mouse oocyte maturation [172]. During this process, JMY was found to
localize to the spindle microtubules as shown by its overlap with alpha tubulin and also in
the cytoplasm [172]. In this thesis hSTRAP 1-150 was found to associate with tubulin
allowing the hypothesis that the above reported JMY functions could be mediated through
STRAP, signifying a potential role of this protein in microtubule organization [136].
STRAP is phosphorylated in an ATM dependent manner on serine 203, which causes its
nuclear localisation and implicates this protein in the DNA damage response pathway
[144-145, 147-149]. In accord with these reports, this thesis provides evidence supporting the notion that
hSTRAP is a stress and DNA damage responsive protein, as this
protein was found to interact with the DNA-dependent protein kinase catalytic subunit (DNA-PKcs), Hsp90
and Hsp70. In earlier reports STRAP was shown to interact with p300 and JMY
[136], which were not detected in the biochemical binding assays carried out in this thesis,
possibly due to the difference in the experimental systems used as well as the fact that the
cells utilised in the present study were not stressed [143-145, 147-149].
Inhibition of Hsp90 delays cell migration by decreasing the interaction of this protein with
actin monomers, thereby reducing actin polymerization in breast cancer cells [173]. Since in
this thesis interactions of hSTRAP 1-150 with actin and of hSTRAP 285-440 with Hsp90 were
suggested, an attractive hypothesis is that the interaction between actin and Hsp90 is
mediated through STRAP. If that is the case, hSTRAP could be acting as an
adaptor molecule coupling the stress response pathway to the actin cytoskeleton.
Furthermore, STRAP could potentially connect the stress response, DNA damage
and the actin cytoskeleton in cancer, as hSTRAP region 1-150 was shown to interact not only
with actin but also with the DNA-dependent protein kinase catalytic subunit. Taken together, the results presented
in this thesis support the hypothesis that different regions of hSTRAP cluster with many
proteins, implicating this protein in diverse signalling pathways.
The role of STRAP as a potential scaffolding protein would correlate with published data
on TPR proteins [33] and the solved STRAP structure, as the OB fold exhibits an extended
super-helical scaffold structure, which can mediate protein-protein and protein-DNA
interactions [153]. This would correlate with the data provided in this thesis showing that
hSTRAP(285-440), which includes the OB fold, does interact with various proteins and
mediate protein-protein interactions (Table 3.6).
A point to note is that not all the interaction profiles are consistent, as some proteins such as
Myosin 9 and Phosphoglycerate kinase were detected in the pull downs with His tagged
STRAP variants, including full-length hSTRAP, but not with GST tagged full-length hSTRAP.
This is unexpected because, if the proteins in question were hSTRAP interacting proteins,
they should also be detected in the pull downs with GST tagged full-length hSTRAP.
The discrepancy could be due to a difference in the folding states of the two differently tagged
proteins, and/or to the site of interaction being occluded by the tag itself. The same may apply
to proteins that were detected in the pull downs with GST tagged hSTRAP protein
and His tagged truncated hSTRAP protein variants but not full-length hSTRAP, for
example filamin A and filamin B.
Another point to consider is the computational algorithms used to identify these proteins.
Mascot and Scaffold are probability based protein identification programs [174-175],
hence they have both benefits and limitations. Mascot first compares the
experimental data with peptide mass and fragment ion mass values calculated by applying the
appropriate cleavage rules to a sequence database [174-175]. A probability is then calculated to
determine the likelihood that the observed identification is a chance event [174-175].
The match that has the lowest probability of occurring by chance is considered the best
match [174-175]. This probability is reported as a score, -10 log10(P), where P is the
probability that the match is a random event; hence the lower this probability,
the higher the score [174-175]. A score of over 70 is generally regarded as a
significant match [174]. Scaffold validates these identifications and increases the
confidence in the data by applying various peptide and protein validation methods following
an initial database search analysis (e.g. Mascot) [174-175]. The results and scores obtained
from this initial Mascot search are converted into probabilities of
peptide identification, which are used to estimate the probability that the Mascot protein identification
is correct [174-175]. Protein probabilities are subsequently determined through use
of the “ProteinProphet” algorithm to ultimately identify the proteins present in the
sample [175]. In this thesis only identifications with a Scaffold probability of over 95% and 2
unique peptides (80+% peptide probability) were accepted, as this is the quality that is currently
considered sufficient [174-175]. However, both Mascot and Scaffold use
statistical analysis algorithms which only quote probabilities [174-175], therefore false
positives can be obtained as a result of this analysis. Nevertheless, the fact that a number of
pull downs were carried out with truncated hSTRAP variants and full-length
hSTRAP in two different vector systems, together with controls, increases the confidence in the data.
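The relationship between match probability and score, and the acceptance thresholds quoted above, can be illustrated with a short sketch. The function names, and the reading of "2 unique peptides" as a minimum, are assumptions made for illustration; this is not the Mascot or Scaffold implementation, both of which rely on their own internal models.

```python
import math


def mascot_score(probability):
    """Convert the probability that a match is a chance event into a score."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -10.0 * math.log10(probability)


def passes_filter(protein_probability, unique_peptides,
                  min_protein_probability=0.95, min_unique_peptides=2):
    """Apply the acceptance thresholds quoted in the text.

    Assumption: peptides are counted only if they already meet the 80%
    peptide-probability cut-off, and 2 peptides is treated as a minimum.
    """
    return (protein_probability > min_protein_probability
            and unique_peptides >= min_unique_peptides)


print(round(mascot_score(1e-7), 1))  # 70.0
print(passes_filter(0.99, 3))        # True
print(passes_filter(0.99, 1))        # False
```

On this scale a score of 70 corresponds to a chance probability of 1 in 10 million, which shows why a high score still expresses a probability and not a certainty.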
Our proteomic analysis performed here therefore allows us to formulate a hypothesis that
potentially hSTRAP is a critical protein involved in many aspects of cellular regulation,
and could act as a scaffolding protein which bridges various cellular proteins together.
However, due to observed inconsistencies in the pull down data as mentioned above,
individual interactions identified here need to be studied in more detail, and confirmed by
further biochemical binding and interaction assays. The value of the current study is that it
narrowed down a list of potential pathways and classes of proteins, and allowed a
hypothesis to be formulated, which should be tested in future. For example, future
experiments should be performed to confirm whether there is a direct interaction between
hSTRAP and the hSTRAP interacting proteins identified in this thesis. This can be
determined via co-localization, co-immunoprecipitation and other functional assays.
5. Future direction
The structures of full-length mouse mSTRAP and of part of the C terminus of human
hSTRAP have recently been published [153], and from this thesis it appears that the
structure of full-length human hSTRAP cannot be easily solved, due to the poor
crystallizability of the human orthologue. However, crystallization of an
hSTRAP-ligand complex may be attempted, to see whether the structure can be solved when hSTRAP is part of a complex.
This thesis has shown that hSTRAP is implicated in diverse cellular functions, and hence
future work should prioritize the hSTRAP interactome. Co-localization experiments
should be performed to confirm whether hSTRAP and its ligand are in close enough proximity in-vivo
to interact. This experiment involves labeling hSTRAP and the ligand, for example actin,
with different fluorescent probes; the resulting images can then be
acquired by microscopy and overlaid to determine co-localisation of the two proteins under
investigation [176]. Furthermore, these experiments can be performed with GFP-fused full
length STRAP, or its shorter variants expressing different TPR motifs, and their subcellular localization can also be studied. These experiments can also be performed in
different cell lines to determine differences, if any, in interaction status within these cell
lines as in this study MCF7 cells only were used. Site directed mutagenesis experiments
could also be conducted to identify critical residues important for the localization of
STRAP and interaction with its ligands. The residues to be mutated first would lie
within the regions of hSTRAP identified in this thesis; for example, actin was shown to interact
with hSTRAP through amino acids 1-150 (Table 3.6). These data, combined with the solved
STRAP structure [153] and an analysis of which residues are solvent-exposed, may guide the
selection of candidate residues to be mutated in the first instance.
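Overlaying two fluorescence images gives a visual impression of co-localisation; one common way to put a number on it is the Pearson correlation coefficient between the pixel intensities of the two channels. The sketch below assumes this measure, which is not specified in the thesis, and uses invented intensity values; the function name is likewise hypothetical.

```python
import math


def pearson_colocalisation(channel_a, channel_b):
    """Pearson correlation between two equally sized lists of pixel intensities."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    mean_a = sum(channel_a) / len(channel_a)
    mean_b = sum(channel_b) / len(channel_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Invented flattened pixel intensities for two channels of the same field of view.
strap_channel = [10, 20, 30, 40]
actin_channel = [12, 22, 33, 41]

r = pearson_colocalisation(strap_channel, actin_channel)
print(r > 0.9)  # True for these invented values
```

Values close to +1 indicate that the two signals rise and fall together, values near 0 indicate no relationship, and negative values indicate mutual exclusion. A high coefficient shows proximity at the resolution of the microscope only, not a direct interaction.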
Another method to confirm whether there is a direct interaction between the purified proteins
in-vitro is to perform isothermal titration calorimetry (ITC) and determine the
thermodynamics associated with this interaction. The binding equilibrium can be
determined by measuring the heat released or absorbed upon ligand interaction [177]. Through these
ITC experiments, the stoichiometry of the interaction (n), the association constant, the free
energy, enthalpy, entropy, and heat capacity of binding can be determined [177]. This can
be performed on any of the hSTRAP interacting proteins identified to determine the
thermodynamics associated with the ligand and hSTRAP interaction [177]. Mutagenesis of
various hSTRAP residues within the regions of hSTRAP-ligand interaction
mapped in this thesis can also be carried out to identify critical residues important for hSTRAP-ligand interaction.
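The thermodynamic quantities listed above are linked by two standard relations: the free energy follows from the association constant as ΔG = -RT ln(Ka), and the entropy follows from ΔG = ΔH - TΔS. The sketch below applies these relations to invented values; the function name, the example numbers and the assumption that Ka is expressed relative to a 1 M standard state are illustrative and are not taken from the thesis.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def binding_thermodynamics(association_constant, enthalpy_j_per_mol,
                           temperature_k=298.15):
    """Derive free energy (J/mol) and entropy (J/mol/K) of binding.

    Assumes Ka is given relative to a 1 M standard state and that the
    enthalpy was measured at the stated temperature.
    """
    if association_constant <= 0 or temperature_k <= 0:
        raise ValueError("Ka and temperature must be positive")
    delta_g = -GAS_CONSTANT * temperature_k * math.log(association_constant)
    delta_s = (enthalpy_j_per_mol - delta_g) / temperature_k
    return delta_g, delta_s


# Invented example: Ka = 1e6 per molar, enthalpy = -40 kJ/mol, 25 degrees C.
delta_g, delta_s = binding_thermodynamics(1e6, -40000.0)
print(round(delta_g / 1000.0, 1))  # -34.2 (kJ/mol)
print(delta_s < 0)                 # True: binding is entropically unfavourable here
```

In this invented case the binding is driven by enthalpy and opposed by entropy. Comparing such signatures between wild-type hSTRAP and point mutants is one way the mutagenesis proposed above could be interpreted.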
Another line of investigation would be the potential implication of hSTRAP in
cancer metastasis, as it has been suggested here that hSTRAP could be implicated in the
regulation of the actin cytoskeleton. This would include performing in-vitro scratch assays,
which involve creating a scratch in a cell monolayer with a pipette tip and capturing
images of this monolayer at regular intervals to determine the rate of cell migration [178-179].
This should be done in the presence and absence of STRAP to determine the effect of STRAP
on cellular migration. Furthermore, this assay can be coupled to microscopy so that
GFP tagged hSTRAP can be visualised and
its sub-cellular localization during cellular migration monitored live
[178].
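The rate of migration in a scratch assay is typically derived from the wound area measured in each image of the time series. A minimal sketch, assuming the wound area has already been measured from each image, is given below; the measurements are invented, the function names are hypothetical, and a least-squares fit is only one reasonable way of estimating the rate.

```python
def percent_closure(initial_area, current_area):
    """Percentage of the original wound area that has been closed."""
    if initial_area <= 0:
        raise ValueError("initial_area must be positive")
    return 100.0 * (initial_area - current_area) / initial_area


def closure_rate(times_h, wound_areas):
    """Least-squares rate of wound closure (area units per hour).

    A positive value means the wound area is shrinking over time.
    """
    n = len(times_h)
    if n != len(wound_areas) or n < 2:
        raise ValueError("need at least two matched time/area measurements")
    mean_t = sum(times_h) / n
    mean_a = sum(wound_areas) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_h, wound_areas))
    return -sxy / sxx


# Invented measurements (hours, arbitrary area units) for two conditions.
times = [0, 6, 12, 18, 24]
with_strap = [1000, 800, 600, 400, 200]
without_strap = [1000, 900, 800, 700, 600]

print(percent_closure(with_strap[0], with_strap[-1]))  # 80.0
print(closure_rate(times, with_strap) > closure_rate(times, without_strap))  # True
```

Comparing the fitted rates between conditions uses every time point and is therefore less sensitive to a single poor image than comparing the end-point closure alone.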
Alternative approaches to complement the results presented in this thesis that could be
implemented would include experiments towards identifying hSTRAP interacting proteins
via tandem affinity purification coupled with mass spectrometry based analysis [180]. This
would involve the fusion of STRAP with the TAP tag, consisting of calmodulin binding
peptide, TEV cleavage site and immunoglobulin interacting domain of protein A [180].
This hSTRAP fusion protein would be then incubated with cellular extract and then
subjected to the first affinity purification step whereby the TAP tagged fused protein, along
with hSTRAP interacting partners would bind to an IgG affinity matrix [180]. TEV protease is then
added to the mixture, which results in cleavage of the TAP tag at the TEV site. The eluate is then
subjected to another affinity purification step, using calmodulin coated beads, which again
bind hSTRAP complexed to its interacting partners [180]. The whole complex is
then eluted with ethylene glycol tetraacetic acid (EGTA). This method is quite effective in reducing
the number of contaminants identified because of this two step affinity purification process
[180], but on the other hand it may miss some of the weaker or transiently formed
complexes.
In this study, cells were not treated prior to performing pull down assays. In future experiments
the cells could be treated with various drugs such as
etoposide, a topoisomerase II inhibitor that induces double strand breaks [181].
Treatment with etoposide may be informative as STRAP is a stress responsive protein [136,
148-149], and so it is probable that differences in STRAP interaction profiles will occur
under cellular stress conditions. As in this thesis, this can be coupled to mass
spectrometry analysis to determine the STRAP interaction profile with and without this
treatment. Experiments should also be performed to determine whether the localisation of
STRAP variants changes upon different treatments, and how this affects ligand interaction.
Mammalian cell lines stably expressing hSTRAP could also be used to identify
STRAP interacting partners. In addition, the role of STRAP in cancer can be explored using
cell lines in which STRAP gene expression has been silenced by RNA interference
[182]. Cells can be monitored under the microscope to determine whether cell death occurs
upon STRAP knockdown. The effect of STRAP knockdown on cellular migration can be
studied via the scratch wound assays mentioned above. The effects of STRAP knockdown on
other pathways identified in this thesis should also be studied, for example on the
glycolysis pathway (Table 3.6 and 3.7). In glycolysis assays, colorimetric measurements can
be performed, whereby the level of L-lactate released into the culture medium is used as a
measure of glycolytic rate [183].
In summary, various experiments could be performed in the future to determine the role of
hSTRAP in cancer, in cellular migration and in the other pathways identified in this
thesis in which hSTRAP could potentially be implicated. Experiments should also be
performed to determine the sub-cellular localization of hSTRAP and of the hSTRAP interacting
partners identified in this thesis, and to further verify these interactions. This study has built
a platform for work on hSTRAP, from which directed experiments can now be designed as
described above.
6. References
1. Whitford, David. Proteins: Structure and Function: John Wiley & Sons, 2005.
2. Morawe, Tobias, Christof Hiebel, Andreas Kern, and Christian Behl. "Protein
Homeostasis, Aging and Alzheimer’s Disease." Molecular Neurobiology 46, no. 1 (2012):
41-54.
3. Koga, Hiroshi, Susmita Kaushik, and Ana Maria Cuervo. "Protein Homeostasis and
Aging: The Importance of Exquisite Quality Control." Ageing Research Reviews 10, no. 2
(2011): 205-15.
4. Committee on Intellectual Property Rights in Genomic and Protein Research, and
National research Council. Reaping the Benefits of Genomic and Proteomic Research:
Intellectual Property Rights, Innovation, and Public Health: National Academies Press,
2006.
5. Jahnke, Wolfgang., and Hansjurg. Widmer. "Protein Nmr in Biomedical Research."
Cellular and Molecular Life Sciences 61, no. 5 (2004): 580-99.
6. Klages, Jochen, Murray Coles, and Horst Kessler. "Nmr-Based Screening: A Powerful
Tool in Fragment-Based Drug Discovery." Analyst 132, no. 7 (2007): 692-705.
7. Jacobsen, Neil E. Nmr Spectroscopy Explained: Simplified Theory, Applications and
Examples for Organic Chemistry and Structural Biology. Hoboken, N.J.: Wiley-Interscience, 2007.
8. Guan, Hongtao., and Endre. Kiss-Toth. "Advanced Technologies for Studies on Protein
Interactomes." Adv Biochem Eng Biotechnol 110 (2008): 1-24.
9. Stites, Wesley E. "Protein-Protein Interactions: Interface Structure, Binding
Thermodynamics, and Mutational Analysis." Chemical Reviews 97, no. 5 (1997): 1233-50.
10. Giometti, Carol. Smith. "Proteomics and Bioinformatics." Advances in protein
chemistry 65 (2003): 353-69.
11. Rupp, Bernhard. Modern Biomolecular Crystallography: Taylor & Francis, 2009.
ISBN: 9780815340812
12. Li, Liang, and Rustem F. Ismagilov. "Protein Crystallization Using Microfluidic
Technologies Based on Valves, Droplets, and Slipchip." Annual Review of Biophysics 39,
no. 1 (2010): 139-58.
13. Weselak, Mark, Marianne G. Patch, Thomas L. Selby, Gunther Knebel, Raymond C.
Stevens, Charles W. Carter, Jr., and M. Sweet Robert. "Robotics for Automated Crystal
Formation and Analysis." In Methods in Enzymology, 45-76: Academic Press, 2003.
14. Pascal, Steven.M. Nmr Primer: An Hsqc-Based Approach with Vector Animations: IM
Publications, 2008.
15. Lorigan, Gary A., Robert E. Minto, and Wei Zhang. "Teaching the Fundamentals of
Pulsed Nmr Spectroscopy in an Undergraduate Physical Chemistry Laboratory." Journal of
Chemical Education 78, no. 7 (2001): 956.
16. Smith, William B., and Thomas W. Proulx. "Pulse Nmr - an Old Analytical Technique
Oft Neglected by the Chemist." Journal of Chemical Education 53, no. 11 (1976): 700.
17. Evans, J.N.S. Biomolecular Nmr Spectroscopy: Oxford University Press, USA, 1995.
18. Farrar, Thomas C. "Pulsed and Fourier Transform Nmr Spectroscopy." Analytical
Chemistry 42, no. 4 (1970): 109A-12A.
19. Edén, Mattias, and Lucio Frydman. "Homonuclear Nmr Correlations between Half-Integer Quadrupolar Nuclei Undergoing Magic-Angle Spinning." The Journal of Physical
Chemistry B 107, no. 51 (2003): 14598-611.
20. Bax, Adriaan. "Two-Dimensional Nmr and Protein Structure." Annual Review of
Biochemistry 58, no. 1 (1989): 223-56.
21. Li, Kuo. Bin., and Bryan. C. Sanctuary. "Cheminform Abstract: Automated Resonance
Assignment of Proteins Using Heteronuclear 3d Nmr. Part 2. Side Chain and Sequence-Specific Assignment." ChemInform 28, no. 37 (1997): no-no.
22. Braun, W., G. Wider, K. H. Lee, and K. Wüthrich. "Conformation of Glucagon in a
Lipid-Water Interphase by 1h Nuclear Magnetic Resonance." Journal of Molecular Biology
169, no. 4 (1983): 921-48.
23. Rossi, Paolo, G. V. T. Swapna, Yuanpeng J Huang, James M Aramini, Clemens
Anklin, Kenith Conover, Keith Hamilton, Rong Xiao, Thomas B Acton, Asli Ertekin, John
K Everett, and Gaetano T Montelione. "A Microscale Protein Nmr Sample Screening
Pipeline." Journal of Biomolecular NMR 46, no. 1 (2010): 11-22.
24. Kelly, Sharon M., Thomas J. Jess, and Nicholas C. Price. "How to Study Proteins by
Circular Dichroism." Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics
1751, no. 2 (2005): 119-39.
25. Kelly, Sharon. M., and Nicholas. C. Price. "The Use of Circular Dichroism in the
Investigation of Protein Structure and Function." Current protein & peptide science 1, no.
4 (2000): 349-84.
26. Yee, Adelinda A., Alexei Savchenko, Alexandr Ignachenko, Jonathan Lukin, Xiaohui
Xu, Tatiana Skarina, Elena Evdokimova, Cheng Song Liu, Anthony Semesi, Valerie
Guido, Aled M. Edwards, and Cheryl H. Arrowsmith. "Nmr and X-Ray Crystallography,
Complementary Tools in Structural Proteomics of Small Proteins." Journal of the
American Chemical Society 127, no. 47 (2005): 16512-17.
27. Garbuzynskiy, Sergiy O., Bogdan S. Melnik, Michail Yu Lobanov, Alexei V.
Finkelstein, and Oxana V. Galzitskaya. "Comparison of X-Ray and Nmr Structures: Is
There a Systematic Difference in Residue Contacts between X-Ray- and Nmr-Resolved
Protein Structures?" Proteins: Structure, Function, and Bioinformatics 60, no. 1 (2005):
139-47.
28. Glish, Gary, and Richard Vachet. "The Basics of Mass Spectrometry in the Twenty-First Century." 2, no. 2 (2003): 140-50.
29. Stirnimann, Christian U., Evangelia Petsalaki, Robert B. Russell, and Christoph W.
Müller. "Wd40 Proteins Propel Cellular Networks." Trends in Biochemical Sciences 35,
no. 10 (2010): 565-74.
30. Jeleń, Filip., Arkadiusz. Oleksy, Katarzyna. Smietana, and Jacek. Otlewski. "Pdz
Domains - Common Players in the Cell Signaling." Acta biochimica Polonica 50, no. 4
(2003): 985-1017.
31. Li, Shawn S-C. "Specificity and Versatility of Sh3 and Other Proline-Recognition
Domains: Structural Basis and Implications for Cellular Signal Transduction." Biochem. J.
390, no. 3 (2005): 641-53.
32. Blatch, Gregory L., and Michael Lässle. "The Tetratricopeptide Repeat: A Structural
Motif Mediating Protein-Protein Interactions." BioEssays 21, no. 11 (1999): 932-39.
33. D'Andrea, Luca D., and Lynne Regan. "Tpr Proteins: The Versatile Helix." Trends in
Biochemical Sciences 28, no. 12 (2003): 655-62.
34. Hirano, Tatsuya, Noriyuki Kinoshita, Kosuke Morikawa, and Mitsuhiro Yanagida.
"Snap Helix with Knob and Hole: Essential Repeats in S. Pombe Nuclear Protein Nuc2 +."
Cell 60, no. 2 (1990): 319-28.
35. Sikorski, Robert S., Mark S. Boguski, Mark Goebl, and Philip Hieter. "A Repeating
Amino Acid Motif in Cdc23 Defines a Family of Proteins and a New Relationship among
Genes Required for Mitosis and Rna Synthesis." Cell 60, no. 2 (1990): 307-17.
36. Lamb, John R., Stuart Tugendreich, and Phil Hieter. "Tetratrico Peptide Repeat
Interactions: To Tpr or Not to Tpr?" Trends in Biochemical Sciences 20, no. 7 (1995): 257-59.
37. Das, Amit K., Patricia W. Cohen, and David Barford. "The Structure of the
Tetratricopeptide Repeats of Protein Phosphatase 5: Implications for Tpr-Mediated Protein-Protein Interactions." EMBO J 17, no. 5 (1998): 1192-9.
38. Malek, Sami N., Charles H. Yang, William C. Earnshaw, Christine A. Kozak, and
Stephen Desiderio. "P150tsp, a Conserved Nuclear Phosphoprotein That Contains
Multiple Tetratricopeptide Repeats and Binds Specifically to Sh2 Domains." The Journal
of biological chemistry 271, no. 12 (1996): 6952-62.
39. Smith, David F. "Tetratricopeptide Repeat Cochaperones in Steroid Receptor
Complexes." Cell stress & chaperones 9, no. 2 (2004): 109-21.
40. Wu, Beili, Pengyun Li, Yiwei Liu, Zhiyong Lou, Yi Ding, Cuiling Shu, Sheng Ye,
Mark Bartlam, Beifen Shen, and Zihe Rao. "3d Structure of Human Fk506-Binding Protein
52: Implications for the Assembly of the Glucocorticoid Receptor/Hsp90/Immunophilin
Heterocomplex." Proceedings of the National Academy of Sciences of the United States of
America 101, no. 22 (2004): 8348-53.
41. Main, Ewan R. G., Katherine Stott, Sophie E. Jackson, and Lynne Regan. "Local and
Long-Range Stability in Tandemly Arrayed Tetratricopeptide Repeats." Proceedings of the
National Academy of Sciences of the United States of America 102, no. 16 (2005): 5721-26.
42. Cortajarena, Aitziber L., and Lynne Regan. "Ligand Binding by Tpr Domains." Protein
Science 15, no. 5 (2006): 1193-98.
43. Cliff, Matthew J., Mark A. Williams, John Brooke-Smith, David Barford, and John E.
Ladbury. "Molecular Recognition Via Coupled Folding and Binding in a Tpr Domain."
Journal of Molecular Biology 346, no. 3 (2005): 717-32.
44. Strauss, H. M., and S. Keller. "Pharmacological Interference with Protein-Protein
Interactions Mediated by Coiled-Coil Motifs." In Protein-Protein Interactions as New Drug Targets, edited by Enno Klussmann and John Scott, 461-82: Springer Berlin Heidelberg, 2008.
45. Allan, Rudi, and Thomas Ratajczak. "Versatile Tpr Domains Accommodate Different
Modes of Target Protein Recognition and Function." Cell Stress and Chaperones 16, no. 4
(2011): 353-67.
46. Zeytuni, Natalie, and Raz Zarivach. "Structural and Functional Discussion of the
Tetra-Trico-Peptide Repeat, a Protein Interaction Module." Structure 20, no. 3 (2012): 397-405.
47. Scheufler, Clemens, Achim Brinker, Gleb Bourenkov, Stefano Pegoraro, Luis
Moroder, Hans Bartunik, F. Ulrich Hartl, and Ismail Moarefi. "Structure of Tpr Domain-Peptide Complexes: Critical Elements in the Assembly of the Hsp70-Hsp90
Multichaperone Machine." Cell 101, no. 2 (2000): 199-210.
48. Taylor, Paul, Jacqueline Dornan, Amerigo Carrello, Rodney F. Minchin, Thomas
Ratajczak, and Malcolm D. Walkinshaw. "Two Structures of Cyclophilin 40: Folding and
Fidelity in the Tpr Domains." Structure 9, no. 5 (2001): 431-38.
49. Main, Ewan R. G., Yong Xiong, Melanie J. Cocco, Luca D'Andrea, and Lynne Regan.
"Design of Stable alpha-Helical Arrays from an Idealized Tpr Motif." Structure 11, no. 5
(2003): 497-508.
50. Sinars, Cindy R., Joyce Cheung-Flynn, Ronald A. Rimerman, Jonathan G. Scammell,
David F. Smith, and Jon Clardy. "Structure of the Large Fk506-Binding Protein Fkbp51, an
Hsp90-Binding Protein and a Component of Steroid Receptor Complexes." Proceedings of
the National Academy of Sciences 100, no. 3 (2003): 868-73.
51. Brinker, Achim, Clemens Scheufler, Florian von der Mülbe, Burkhard Fleckenstein,
Christian Herrmann, Günther Jung, Ismail Moarefi, and F. Ulrich Hartl. "Ligand
Discrimination by Tpr Domains." Journal of Biological Chemistry 277, no. 22 (2002):
19265-75.
52. Cliff, Matthew J., Mark A. Williams, John Brooke-Smith, David Barford, and John E.
Ladbury. "Molecular Recognition Via Coupled Folding and Binding in a Tpr Domain."
Journal of Molecular Biology 346, no. 3 (2005): 717-32.
53. Magliery, Thomas J., and Lynne Regan. "Sequence Variation in Ligand Binding Sites
in Proteins." Bmc Bioinformatics 6 (2005): 240.
54. Doyle, Declan A., Alice Lee, John Lewis, Eunjoon Kim, Morgan Sheng, and Roderick
MacKinnon. "Crystal Structures of a Complexed and Peptide-Free Membrane Protein-Binding Domain: Molecular Basis of Peptide Recognition by Pdz." Cell 85, no. 7 (1996):
1067-76.
55. De Los Rios, Paolo, Fabio Cecconi, Anna Pretre, Giovanni Dietler, Olivier Michielin,
Francesco Piazza, and Brice Juanico. "Functional Dynamics of Pdz Binding Domains: A
Normal-Mode Analysis." Biophysical Journal 89, no. 1 (2005): 14-21.
56. Aitio, Olli, Maarit Hellman, Arunas Kazlauskas, Didier F. Vingadassalom, John M.
Leong, Kalle Saksela, and Perttu Permi. "Recognition of Tandem Pxxp Motifs as a Unique
Src Homology 3-Binding Mode Triggers Pathogen-Driven Actin Assembly." Proceedings
of the National Academy of Sciences 107, no. 50 (2010): 21743-48.
57. Bauer, Finn, Kristian Schweimer, Helke Meiselbach, Silke Hoffmann, Paul Rosch,
and Heinrich Sticht. "Structural Characterization of Lyn-Sh3 Domain in Complex with a
Herpesviral Protein Reveals an Extended Recognition Motif That Enhances Binding
Affinity." Protein science : a publication of the Protein Society 14, no. 10 (2005): 2487-98.
58. Liao, Yanling, Ian M. Willis, and Robyn D. Moir. "The Brf1 and Bdp1 Subunits of
Transcription Factor Tfiiib Bind to Overlapping Sites in the Tetratricopeptide Repeats of
Tfc4." Journal of Biological Chemistry 278, no. 45 (2003): 44467-74.
59. Crevel, Gilles, Dorothy Bennett, and Sue Cotterill. "The Human Tpr Protein Ttc4 Is a
Putative Hsp90 Co-Chaperone Which Interacts with Cdc6 and Shows Alterations in
Transformed Cells." Plos One 3, no. 3 (2008): e0001737.
60. Jascur, Thomas, Howard Brickner, Isabelle Salles-Passador, Valerie Barbier,
Abdelhamid El Khissiin, Brian Smith, Rati Fotedar, and Arun Fotedar. "Regulation of
P21waf1/Cip1 Stability by Wisp39, a Hsp90 Binding Tpr Protein." Molecular Cell 17, no.
2 (2005): 237-49.
61. Jakob, Ursula, Hauke Lilie, Ines Meyer, and Johannes Buchner. "Transient Interaction
of Hsp90 with Early Unfolding Intermediates of Citrate Synthase: Implications for Heat
Shock in Vivo." Journal of Biological Chemistry 270, no. 13 (1995): 7288-94.
62. Imai, Jun, Mikako Maruya, Hideki Yashiroda, Ichiro Yahara, and Keiji Tanaka.
"The Molecular Chaperone Hsp90 Plays a Role in the Assembly and Maintenance of the
26s Proteasome." Embo Journal 22, no. 14 (2003): 3557-67.
63. Grad, Iwona, and Didier Picard. "The Glucocorticoid Responses Are Shaped by
Molecular Chaperones." Molecular and Cellular Endocrinology 275, no. 1–2 (2007): 2-12.
64. Russell, Lance C., Sherry R. Whitt, Mei-Shya Chen, and Michael Chinkers.
"Identification of Conserved Residues Required for the Binding of a Tetratricopeptide
Repeat Domain to Heat Shock Protein 90." Journal of Biological Chemistry 274, no. 29
(1999): 20060-63.
65. Whitesell, Luke, Edward G Mimnaugh, Brian De Costa, Charles E Myers, and Leonard
M Neckers. "Inhibition of Heat Shock Protein Hsp90-Pp60v-Src Heteroprotein Complex
Formation by Benzoquinone Ansamycins: Essential Role for Stress Proteins in Oncogenic
Transformation." Proceedings of the National Academy of Sciences 91, no. 18 (1994):
8324-28.
66. Pearl, Laurence H., and Chrisostomos Prodromou. "Structure and in Vivo Function of
Hsp90." Current Opinion in Structural Biology 10, no. 1 (2000): 46-51.
67. Morano, Kevin A. "New Tricks for an Old Dog." Annals of the New York Academy of
Sciences 1113, no. 1 (2007): 1-14.
68. Liu, Qinghuai, Juanyu Gao, Xi Chen, Yuxin Chen, Jie Chen, Saiqun Wang, Jin Liu,
Xiaoyi Liu, and Jianmin Li. "Hbp21: A Novel Member of Tpr Motif Family, as a Potential
Chaperone of Heat Shock Protein 70 in Proliferative Vitreoretinopathy (Pvr) and Breast
Cancer." Molecular Biotechnology 40, no. 3 (2008): 231-40.
69. Place, Sean P. "Single-Point Mutation in a Conserved Tpr Domain of Hip Disrupts
Enhancement of Glucocorticoid Receptor Signaling." Cell stress & chaperones 16, no. 4
(2011): 469-74.
70. Dodt, Gabriele, Nancy Braverman, Candice Wong, Ann Moser, Hugo W. Moser,
Paul Watkins, David Valle, and Stephen J. Gould. "Mutations in the Pts1 Receptor Gene,
Pxr1, Define Complementation Group 2 of the Peroxisome Biogenesis Disorders." Nat
Genet 9, no. 2 (1995): 115-25.
71. Brocard, C., Friedrich Kragler, M. M. Simon, T. Schuster, and A. Hartig. "The
Tetratricopeptide Repeat Domain of the Pas10 Protein of Saccharomyces Cerevisiae Is
Essential for Binding the Peroxisomal Targeting Signal -Skl." Biochemical and
Biophysical Research Communications 204, no. 3 (1994): 1016-22.
72. McCollum, Dannel, Edward Monosov, and Suresh Subramani. "The Pas8 Mutant of
Pichia Pastoris Exhibits the Peroxisomal Protein Import Deficiencies of Zellweger
Syndrome Cells--the Pas8 Protein Binds to the Cooh-Terminal Tripeptide Peroxisomal
Targeting Signal, and Is a Member of the Tpr Protein Family." The Journal of Cell Biology
121, no. 4 (1993): 761-74.
73. Vanderleij, Inge, Maartje M. Franse, Ype Elgersma, Ben Distel, and Henk F. Tabak.
"Pas10 Is a Tetratricopeptide-Repeat Protein That Is Essential for the Import of Most
Matrix Proteins into Peroxisomes of Saccharomyces-Cerevisiae." Proceedings of the
National Academy of Sciences of the United States of America 90, no. 24 (1993): 11782-86.
74. Schlüter, Agatha, Stéphane Fourcade, Enric Doménech-Estévez, Toni Gabaldón, Jaime
Huerta-Cepas, Guillaume Berthommier, Raymond Ripp, Ronald J. A. Wanders, Olivier
Poch, and Aurora Pujol. "PeroxisomeDB: A Database for the Peroxisomal Proteome,
Functional Genomics and Disease." Nucleic Acids Research 35, no. suppl 1 (2007): D815-D22.
75. Lithgow, Trevor, Benjamin S. Glick, and Gottfried Schatz. "The Protein Import
Receptor of Mitochondria." Trends in Biochemical Sciences 20, no. 3 (1995): 98-101.
76. Moczko, M, U Bömer, M Kübrich, N Zufall, A Hönlinger, and N Pfanner. "The
Intermembrane Space Domain of Mitochondrial Tom22 Functions as a Trans Binding Site
for Preproteins with N-Terminal Targeting Sequences." Molecular and Cellular Biology
17, no. 11 (1997): 6574-84.
77. Riezman, Howard, Toshiharu Hase, Adolphus P. van Loon, Leslie A. Grivell,
Kitaru Suda, and Gottfried Schatz. "Import of Proteins into Mitochondria: A 70
Kilodalton Outer Membrane Protein with a Large Carboxy-Terminal Deletion Is Still
Transported to the Outer Membrane." EMBO J 2, no. 12 (1983): 2161-8.
78. Yano, Masato, Kazutoyo Terada, and Masataka Mori. "Mitochondrial Import
Receptors Tom20 and Tom22 Have Chaperone-Like Activity." Journal of Biological
Chemistry 279, no. 11 (2004): 10808-13.
79. Thornton, Brian R., and David P. Toczyski. "Precise Destruction: An Emerging Picture
of the Apc." Genes & Development 20, no. 22 (2006): 3069-78.
80. Lamb, J. R., W. A. Michaud, R. S. Sikorski, and P. A. Hieter. "Cdc16p, Cdc23p and
Cdc27p Form a Complex Essential for Mitosis." The EMBO journal 13, no. 18 (1994):
4321-28.
81. Sikorski, Robert S, William A Michaud, and Philip Hieter. "P62cdc23 of
Saccharomyces Cerevisiae: A Nuclear Tetratricopeptide Repeat Protein with Two Mutable
Domains." Molecular and Cellular Biology 13, no. 2 (1993): 1212-21.
82. Samejima, Itaru, and Mitsuhiro Yanagida. "Bypassing Anaphase by Fission Yeast Cut9
Mutation: Requirement of Cut9+ to Initiate Anaphase." The Journal of Cell Biology 127,
no. 6 (1994): 1655-70.
83. Liu, Geng, and Guillermina Lozano. "P21 Stability: Linking Chaperones to a Cell
Cycle Checkpoint." Cancer Cell 7, no. 2 (2005): 113-14.
84. Fotedar, R., P. Fitzgerald, T. Rousselle, D. Cannella, M. Doree, H. Messier, and A.
Fotedar. "p21 Contains Independent Binding Sites for Cyclin and Cdk2: Both Sites Are
Required to Inhibit Cdk2 Kinase Activity." Oncogene 12, no. 10 (1996): 2155-64.
85. El-Deiry, Wafik S., Takashi Tokino, Victor E. Velculescu, Daniel B. Levy, Ramon
Parsons, Jeffrey M. Trent, David Lin, W. Edward Mercer, Kenneth W. Kinzler, and Bert
Vogelstein. "Waf1, a Potential Mediator of P53 Tumor Suppression." Cell 75, no. 4
(1993): 817-25.
86. Kim, Geum-Yi, Stephen E. Mercer, Daina Z. Ewton, Zhongfa Yan, Kideok Jin, and
Eileen Friedman. "The Stress-Activated Protein Kinases P38α and Jnk1 Stabilize P21cip1
by Phosphorylation." Journal of Biological Chemistry 277, no. 33 (2002): 29792-802.
87. Cliff, Matthew J., Richard Harris, David Barford, John E. Ladbury, and Mark A.
Williams. "Conformational Diversity in the Tpr Domain-Mediated Interaction of Protein
Phosphatase 5 with Hsp90." Structure 14, no. 3 (2006): 415-26.
88. Chen, Mei-Shya, Adam M. Silverstein, William B. Pratt, and Michael Chinkers. "The
Tetratricopeptide Repeat Domain of Protein Phosphatase 5 Mediates Binding to
Glucocorticoid Receptor Heterocomplexes and Acts as a Dominant Negative Mutant."
Journal of Biological Chemistry 271, no. 50 (1996): 32315-20.
89. Silverstein, Adam M., Mario D. Galigniana, Mei-Shya Chen, Janet K. Owens-Grillo,
Michael Chinkers, and William B. Pratt. "Protein Phosphatase 5 Is a Major Component of
Glucocorticoid Receptor Hsp90 Complexes with Properties of an Fk506-Binding
Immunophilin." Journal of Biological Chemistry 272, no. 26 (1997): 16224-30.
90. Ollendorff, Vincent, and Daniel J. Donoghue. "The Serine/Threonine Phosphatase Pp5
Interacts with Cdc16 and Cdc27, Two Tetratricopeptide Repeat-Containing Subunits of the
Anaphase-Promoting Complex." Journal of Biological Chemistry 272, no. 51 (1997):
32011-18.
91. Chen, Mao Xiang, and Patricia T. W. Cohen. "Activation of Protein Phosphatase 5 by
Limited Proteolysis or the Binding of Polyunsaturated Fatty Acids to the Tpr Domain."
Febs Letters 400, no. 1 (1997): 136-40.
92. Ali, Ambereen, Ji Zhang, Shideng Bao, Irene Liu, Diane Otterness, Nicholas M.
Dean, Robert T. Abraham, and Xiao-Fan Wang. "Requirement of Protein Phosphatase 5
in DNA-Damage-Induced Atm Activation." Genes Dev 18, no. 3 (2004): 249-54.
93. Blom, Eric, Henri J. van de Vrugt, Yne de Vries, Johan P. de Winter, Fré Arwert, and
Hans Joenje. "Multiple Tpr Motifs Characterize the Fanconi Anemia Fancg Protein." DNA
Repair 3, no. 1 (2004): 77-84.
94. Hussain, Shobbir, James B. Wilson, Eric Blom, Larry H. Thompson, Patrick Sung,
Susan M. Gordon, Gary M. Kupfer, Hans Joenje, Christopher G. Mathew, and Nigel J.
Jones. "Tetratricopeptide-Motif-Mediated Interaction of Fancg with Recombination
Proteins Xrcc3 and Brca2." DNA Repair 5, no. 5 (2006): 629-40.
95. Jiang, Jihong, Douglas Cyr, Roger W. Babbitt, William C. Sessa, and Cam Patterson.
"Chaperone-Dependent Regulation of Endothelial Nitric-Oxide Synthase Intracellular
Trafficking by the Co-Chaperone/Ubiquitin Ligase Chip." Journal of Biological Chemistry
278, no. 49 (2003): 49332-4
96. Ballinger, Carol A., Patrice Connell, Yaxu Wu, Zhaoyong Hu, Larry J. Thompson, Li-Yan Yin, and Cam Patterson. "Identification of Chip, a Novel Tetratricopeptide Repeat-Containing Protein That Interacts with Heat Shock Proteins and Negatively Regulates
Chaperone Functions." Molecular and Cellular Biology 19, no. 6 (1999): 4535-45.
97. Alberti, Simon, Karsten Böhse, Verena Arndt, Anton Schmitz, and Jörg Höhfeld. "The
Cochaperone Hspbp1 Inhibits the Chip Ubiquitin Ligase and Stimulates the Maturation of
the Cystic Fibrosis Transmembrane Conductance Regulator." Molecular Biology of the
Cell 15, no. 9 (2004): 4003-10.
98. Tripathi, Veenu, Amjad Ali, Rajiv Bhat, and Uttam Pati. "Chip Chaperones Wild Type
P53 Tumor Suppressor Protein." Journal of Biological Chemistry 282, no. 39 (2007):
28441-54.
99. Xia, Tian, Christiana Dimitropoulou, Jingmin Zeng, Galina N. Antonova, Connie
Snead, Richard C. Venema, David Fulton, Shuibing Qian, Cam Patterson, Andreas
Papapetropoulos, and John D. Catravas. "Chaperone-Dependent E3 Ligase Chip
Ubiquitinates and Mediates Proteasomal Degradation of Soluble Guanylyl Cyclase."
American Journal of Physiology - Heart and Circulatory Physiology 293, no. 5 (2007):
H3080-H87.
100. Iyer, Sai Prasad N., and Gerald W. Hart. "Roles of the Tetratricopeptide Repeat
Domain in O-Glcnac Transferase Targeting and Protein Substrate Specificity." Journal of
Biological Chemistry 278, no. 27 (2003): 24608-16.
101. Iyer, Sai Prasad N., Yoshihiro Akimoto, and Gerald W. Hart. "Identification and
Cloning of a Novel Family of Coiled-Coil Domain Proteins That Interact with O-Glcnac
Transferase." Journal of Biological Chemistry 278, no. 7 (2003): 5399-40
102. Kelly, William G, Michael E Dahmus, and Gerald W Hart. "Rna Polymerase Ii Is a
Glycoprotein. Modification of the Cooh-Terminal Domain by O-Glcnac." Journal of
Biological Chemistry 268, no. 14 (1993): 10416-24.
103. Jackson, Stephen P., and Robert Tjian. "O-Glycosylation of Eukaryotic Transcription
Factors: Implications for Mechanisms of Transcriptional Regulation." Cell 55, no. 1
(1988): 125-33.
104. Lubas, William A., and John A. Hanover. "Functional Expression of O-Linked Glcnac
Transferase: Domain Structure and Substrate Specificity." Journal of Biological Chemistry
275, no. 15 (2000): 10983-88.
105. Buchanan, Grant, Carmela Ricciardelli, Jonathan M. Harris, Jennifer Prescott, Zoe
Chiao-Li Yu, Li Jia, Lisa M. Butler, Villis R. Marshall, Howard I. Scher, William L.
Gerald, Gerhard A. Coetzee, and Wayne D. Tilley. "Control of Androgen Receptor
Signaling in Prostate Cancer by the Cochaperone Small Glutamine Rich Tetratricopeptide
Repeat Containing Protein Alpha." Cancer research 67, no. 20 (2007): 10087-96.
106. Krenn, Veronica, Annemarie Wehenkel, Xiaozheng Li, Stefano Santaguida, and
Andrea Musacchio. "Structural Analysis Reveals Features of the Spindle Checkpoint
Kinase Bub1-kinetochore Subunit Knl1 Interaction." The Journal of Cell Biology 196, no.
4 (2012): 451-67.
107. Choi, Kang-Yell, Brett Satterberg, David M. Lyons, and Elaine A. Elion. "Ste5
Tethers Multiple Protein Kinases in the Map Kinase Cascade Required for Mating in S.
Cerevisiae." Cell 78, no. 3 (1994): 499-51
108. Zeke, András, Melinda Lukács, Wendell A. Lim, and Attila Reményi. "Scaffolds:
Interaction Platforms for Cellular Signalling Circuits." Trends in Cell Biology 19, no. 8
(2009): 364-74.
109. Burack, W. Richard, and Andrey S. Shaw. "Signal Transduction: Hanging on a
Scaffold." Current Opinion in Cell Biology 12, no. 2 (2000): 211-16.
110. Ferrell Jr, James E., and Karlene A. Cimprich. "Enforced Proximity in the Function of
a Famous Scaffold." Molecular Cell 11, no. 2 (2003): 289-91.
111. Bhattacharyya, Roby P., Attila Remenyi, Brian J. Yeh, and Wendell A. Lim.
"Domains, Motifs, and Scaffolds: The Role of Modular Interactions in the Evolution and
Wiring of Cell Signaling Circuits." Annual Review of Biochemistry 75 (2006): 655-80.
112. Pálfy, Máté, Attila Reményi, and Tamás Korcsmáros. "Endosomal Crosstalk: Meeting
Points for Signaling Pathways." Trends in Cell Biology 22, no. 9 (2012): 447-56.
113. Hanahan, Douglas, and Robert A. Weinberg. "The Hallmarks of Cancer." Cell 100,
no. 1 (2000): 57-70.
114. Hanahan, Douglas, and Robert A Weinberg. "Hallmarks of Cancer: The Next
Generation." Cell 144, no. 5 (2011): 646-74.
115. Zhang, Yanping, Gabrielle White Wolf, Krishna Bhat, Aiwen Jin, Theresa Allio,
William A. Burkhart, and Yue Xiong. "Ribosomal Protein L11 Negatively Regulates
Oncoprotein Mdm2 and Mediates a P53-Dependent Ribosomal-Stress Checkpoint
Pathway." Molecular and Cellular Biology 23, no. 23 (2003): 8902-12.
116. Gajjar, Madhavsai, Marco M Candeias, Laurence Malbert-Colas, Anne Mazars, Jun
Fujita, Vanesa Olivares-Illana, and Robin Fåhraeus. "The P53 Mrna-Mdm2 Interaction
Controls Mdm2 Nuclear Trafficking and Is Required for P53 Activation Following DNA
Damage." Cancer Cell 21, no. 1 (2012): 25-35.
117. Vazquez, Alexei, Elisabeth E. Bond, Arnold J. Levine, and Gareth L. Bond. "The
Genetics of the P53 Pathway, Apoptosis and Cancer Therapy." Nat Rev Drug Discov 7, no.
12 (2008): 979-87.
118. Campellone, Kenneth G., and Matthew D. Welch. "A Nucleator Arms Race: Cellular
Control of Actin Assembly." Nat Rev Mol Cell Biol 11, no. 4 (2010): 237-51.
119. Ridley, Anne. "Life at the Leading Edge." Cell 145, no. 7 (2011): 1012-22.
120. Yamaguchi, Hideki, and John Condeelis. "Regulation of the Actin Cytoskeleton in
Cancer Cell Migration and Invasion." Biochimica et Biophysica Acta (BBA) - Molecular
Cell Research 1773, no. 5 (2007): 642-52.
121. Yamaguchi, Hideki, Jeffrey Wyckoff, and John Condeelis. "Cell Migration in
Tumors." Current Opinion in Cell Biology 17, no. 5 (2005): 559-64.
122. Schramm, Laura, and Nouria Hernandez. "Recruitment of Rna Polymerase III to Its
Target Promoters." Genes & Development 16, no. 20 (2002): 2593-620.
123. Nikolov, Dimitar B., and Stephen K. Burley. "Rna Polymerase II Transcription
Initiation: A Structural View." Proceedings of the National Academy of Sciences 94, no. 1
(1997): 15-22.
124. Sentenac, Andre. "Eukaryotic Rna-Polymerases." Crc Critical Reviews in
Biochemistry 18, no. 1 (1985): 31-90.
125. Gaston, Kevin, and Padma S. Jayaraman. "Transcriptional Repression in Eukaryotes:
Repressors and Repression Mechanisms." Cell Mol Life Sci 60, no. 4 (2003): 721-41.
126. Maston, Glen A., Sara K. Evans, and Michael R. Green. "Transcriptional Regulatory
Elements in the Human Genome." Annu Rev Genomics Hum Genet 7 (2006): 29-59.
127. Black, Joshua C., Janet E. Choi, Sarah R. Lombardo, and Michael Carey. "A
Mechanism for Coordinating Chromatin Modification and Preinitiation Complex
Assembly." Molecular Cell 23, no. 6 (2006): 809-18.
128. Mertens, Claudia, and Robert G. Roeder. "Different Functional Modes of P300 in
Activation of Rna Polymerase Iii Transcription from Chromatin Templates." Molecular
and Cellular Biology 28, no. 18 (2008): 5764-76.
129. Lemon, Bryan, and Robert Tjian. "Orchestrated Response: A Symphony of
Transcription Factors for Gene Control." Genes & Development 14, no. 20 (2000): 2551-69.
130. Pan, Yongping, Chung-Jung Tsai, Buyong Ma, and Ruth Nussinov. "Mechanisms of
Transcription Factor Selectivity." Trends in Genetics 26, no. 2 (2010): 75-83.
131. Silverman, Eric S, Jing Du, Amy J Williams, Raj Wadgaonkar, Jeffrey M Drazen, and
Tucker Collins. "Camp-Response-Element-Binding-Protein-Binding Protein (Cbp) and
P300 Are Transcriptional Co-Activators of Early Growth Response Factor-1 (Egr-1)."
Biochem. J. 336, no. 1 (1998): 183-89.
132. Janknecht, Ralf, and Tony Hunter. Transcription. A Growing Coactivator Network.
Vol. 383, 1996.
133. Courey, Albert J., and Songtao Jia. "Transcriptional Repression: The Long and the
Short of It." Genes Dev 15, no. 21 (2001): 2786-96.
134. Moir, Robyn D, Indra Sethy-Coraci, Karen Puglia, Monett D Librizzi, and Ian M
Willis. "A Tetratricopeptide Repeat Mutation in Yeast Transcription Factor IIIC131
(TFIIIC131) Facilitates Recruitment of TFIIB-Related Factor TFIIIB70." Molecular and
Cellular Biology 17, no. 12 (1997): 7119-25.
135. Cabarcas, Stephanie, and Laura Schramm. "Rna Polymerase III Transcription in
Cancer: The Brf2 Connection." Molecular Cancer 10, no. 1 (2011): 47.
136. Demonacos, Constantinos, Marija Krstic-Demonacos, and Nicholas B. La Thangue.
"A Tpr Motif Cofactor Contributes to P300 Activity in the P53 Response." Molecular Cell
8, no. 1 (2001): 71-84.
137. Dallas, Peter B, Peter Yaciuk, and Elizabeth Moran. "Characterization of Monoclonal
Antibodies Raised against P300: Both P300 and Cbp Are Present in Intracellular Tbp
Complexes." Journal of Virology 71, no. 2 (1997): 1726-31.
138. Yuan, L. W., and A. Giordano. "Acetyltransferase Machinery Conserved in
P300/Cbp-Family Proteins." Oncogene 21, no. 14 (2002): 2253-60.
139. Grossman, Steven R., Marco Perez, Andrew L. Kung, Michael Joseph, Claire Mansur,
Zhi-Xiong Xiao, Sushant Kumar, Peter M. Howley, and David M. Livingston.
"P300/Mdm2 Complexes Participate in Mdm2-Mediated P53 Degradation." Molecular
Cell 2, no. 4 (1998): 405-15.
140. Zhu, Qianzheng, Jihong Yao, Gulzar Wani, Manzoor A. Wani, and Altaf A. Wani.
"Mdm2 Mutant Defective in Binding P300 Promotes Ubiquitination but Not Degradation
of P53: Evidence for the Role of P300 in Integrating Ubiquitination and Proteolysis."
Journal of Biological Chemistry 276, no. 32 (2001): 29695-701.
141. Shikama, Noriko, Chang-Woo Lee, Stephen France, Laurent Delavaine, Jonathan
Lyon, Marija Krstic-Demonacos, and Nicholas B. La Thangue. "A Novel Cofactor for
P300 That Regulates the P53 Response." Molecular Cell 4, no. 3 (1999): 365-76.
142. Coutts, Amanda S., Houda Boulahbel, Anne Graham, and Nicholas B. La Thangue.
"Mdm2 Targets the P53 Transcription Cofactor Jmy for Degradation." EMBO reports 8,
no. 1 (2007): 84-90.
143. Jansson, Martin, Stephen T. Durant, Er-Chieh Cho, Sharon Sheahan, Mariola
Edelmann, Benedikt Kessler, and Nicholas B. La Thangue. "Arginine Methylation
Regulates the P53 Response." Nat Cell Biol 10, no. 12 (2008): 1431-9.
144. Demonacos, Constantinos, Marija Krstic-Demonacos, Linda Smith, Danmei Xu,
Darran P. O'Connor, Martin Jansson, and Nicholas B. La Thangue. "A New Effector
Pathway Links Atm Kinase with the DNA Damage Response." Nat Cell Biol 6, no. 10
(2004): 968-76.
145. Adams, Cassandra J., Anne L. Graham, Martin Jansson, Amanda S. Coutts,
Mariola Edelmann, Linda Smith, Benedikt Kessler, and Nicholas B. La Thangue. "Atm
and Chk2 Kinase Target the P53 Cofactor Strap." EMBO Rep 9, no. 12 (2008): 1222-9.
146. Hollstein, M., D. Sidransky, B. Vogelstein, and C. C. Harris. "P53 Mutations in
Human Cancers." Science 253, no. 5015 (1991): 49-53.
147. Smith, Linda, and Nicholas B. La Thangue. "Signalling DNA Damage by Regulating
P53 Co-Factor Activity." Cell Cycle 4, no. 1 (2005): 30-2.
148. Xu, Danmei, and Nicholas B. La Thangue. "Strap: A Versatile Transcription Co-Factor." Cell Cycle 7, no. 16 (2008): 2456-7.
149. Xu, Danmei, L. Panagiotis Zalmas, and Nicholas B. La Thangue. "A Transcription
Cofactor Required for the Heat-Shock Response." EMBO Rep 9, no. 7 (2008): 662-9.
150. Davies, Laura, Elissavet Paraskevopoulou, Malihah Sadeq, Christiana Symeou,
Constantia Pantelidou, Constantinos Demonacos, and Marija Krstic-Demonacos.
"Regulation of Glucocorticoid Receptor Activity by a Stress Responsive Transcriptional
Cofactor." Molecular endocrinology (Baltimore, Md.) 25, no. 1 (2011): 58-71.
151. http://textbookofbacteriology.net/themicrobialworld/growth.html, accessed on the
15th March 2012.
152. Golovanov, Alexander P., Guillaume M. Hautbergue, Stuart A. Wilson, and Lu-Yun
Lian. "A Simple Method for Improving Protein Solubility and Long-Term Stability."
Journal of the American Chemical Society 126, no. 29 (2004): 8933-39.
153. Adams, Cassandra J., Ashley C. W. Pike, Sandra Maniam, Timothy D. Sharpe,
Amanda S. Coutts, Stefan Knapp, Nicholas B. La Thangue, and Alex N. Bullock. "The
P53 Cofactor Strap Exhibits an Unexpected Tpr Motif and Oligonucleotide-Binding
(Ob)–fold Structure." Proceedings of the National Academy of Sciences 109, no. 10
(2012): 3778-83.
154. Dale, Glenn E., Christian Oefner, and Allan D’Arcy. "The Protein as a Variable in
Protein Crystallization." Journal of Structural Biology 142, no. 1 (2003): 88-97.
155. Smyth, Douglas R., Marek K. Mrozkiewicz, William J. McGrath, Pawel Listwan, and
Bostjan Kobe. "Crystal Structures of Fusion Proteins with Large-Affinity Tags." Protein
Science 12, no. 7 (2003): 1313-22.
156. Schulman, Brenda A., Peter S. Kim, Christopher M. Dobson, and Christina
Redfield. "A Residue-Specific Nmr View of the Non-Cooperative Unfolding of a Molten
Globule." Nat Struct Biol 4, no. 8 (1997): 630-4.
157. Chen, Rachel. "Bacterial Expression Systems for Recombinant Protein Production: E.
Coli and Beyond." Biotechnology Advances 30, no. 5 (2012): 1102-07.
158. Verma, R., E. Boleti, and A. J. T. George. "Antibody Engineering: Comparison of
Bacterial, Yeast, Insect and Mammalian Expression Systems." Journal of Immunological
Methods 216, no. 1–2 (1998): 165-81.
159. Fernandez, Joseph M., and James P. Hoeffler. Gene Expression Systems: Using
Nature for the Art of Expression: Elsevier Science, 1999.
160. Higgins, Steve J., and B. D. Hames. Protein Expression: A Practical
Approach: Oxford University Press, 1999.
161. Baneyx, Francois, and Mirna Mujacic. "Recombinant Protein Folding and
Misfolding in Escherichia Coli." Nat Biotechnol 22, no. 11 (2004): 1399-408.
162. Trinkle-Mulcahy, Laura, Severine Boulon, Yun Wah Lam, Roby Urcia, Francois-Michel Boisvert, Franck Vandermoere, Nick A Morrice, Sam Swift, Ulrich Rothbauer,
Heinrich Leonhardt, and Angus Lamond. "Identifying Specific Protein Interaction Partners
Using Quantitative Mass Spectrometry and Bead Proteomes." The Journal of Cell Biology
183, no. 2 (2008): 223-39.
163. Figeys, Daniel, Linda D. McBroom, and Michael F. Moran. "Mass Spectrometry for
the Study of Protein-Protein Interactions." Methods 24, no. 3 (2001): 230-39.
164. Brymora, Adam, Valentina A. Valova, and Phillip J. Robinson. "Protein-Protein
Interactions Identified by Pull-Down Experiments and Mass Spectrometry." In Current
Protocols in Cell Biology: John Wiley & Sons, Inc., 2001.
165. Arifuzzaman, Mohammad, Maki Maeda, Aya Itoh, Kensaku Nishikata, Chiharu
Takita, Rintaro Saito, Takeshi Ara, Kenji Nakahigashi, Hsuan-Cheng Huang, Aki Hirai,
Kohei Tsuzuki, Seira Nakamura, Mohammad Altaf-Ul-Amin, Taku Oshima, Tomoya
Baba, Natsuko Yamamoto, Tomoyo Kawamura, Tomoko Ioka-Nakamichi, Masanari
Kitagawa, Masaru Tomita, Shigehiko Kanaya, Chieko Wada, and Hirotada Mori. "Large-Scale Identification of Protein-Protein Interaction of Escherichia Coli K-12." Genome
Research 16, no. 5 (2006): 686-91.
166. Franceschini, Andrea, Damian Szklarczyk, Sune Frankild, Michael Kuhn, Milan
Simonovic, Alexander Roth, Jianyi Lin, Pablo Minguez, Peer Bork, Christian von Mering,
and Lars J. Jensen. "String V9.1: Protein-Protein Interaction Networks, with Increased
Coverage and Integration." Nucleic Acids Research 41, no. D1 (2013): D808-D15.
167. Theobald, Douglas L., Rachel M. Mitton-Fry, and Deborah S. Wuttke. "Nucleic Acid
Recognition by Ob-Fold Proteins." Annual Review of Biophysics and Biomolecular
Structure 32, no. 1 (2003): 115-33.
168. Roadcap, David W., and James E. Bear. "Double Jmy: Making Actin Fast." Nature
cell biology 11, no. 4 (2009): 375-76.
169. Coutts, Amanda S., Louise Weston, and Nicholas B. La Thangue. "A Transcription
Co-Factor Integrates Cell Adhesion and Motility with the P53 Response." Proc Natl Acad
Sci U S A 106, no. 47 (2009): 19872-7.
170. Coutts, Amanda S., Louise Weston, and Nicholas B. La Thangue. "Actin Nucleation
by a Transcription Co-Factor That Links Cytoskeletal Events with the P53 Response." Cell
Cycle 9, no. 8 (2010): 1511-5.
171. Wang, Yingqun. "Jimmy on the Stage: Linking DNA Damage with Cell Adhesion
and Motility." Cell adhesion & migration 4, no. 2 (2010): 166-68.
172. Sun, Shao-Chen, Qing-Yuan Sun, and Nam-Hyung Kim. "Jmy Is Required for
Asymmetric Division and Cytokinesis in Mouse Oocytes." Molecular human reproduction
17, no. 5 (2011): 296-304.
173. Taiyab, Aftab, and Ch Mohan Rao. "Hsp90 Modulates Actin Dynamics: Inhibition of
Hsp90 Leads to Decreased Cell Motility and Impairs Invasion." Biochimica et Biophysica
Acta (BBA) - Molecular Cell Research 1813, no. 1 (2011): 213-21.
174. Perkins, David N., Darryl J. C. Pappin, David M. Creasy, and John S. Cottrell.
"Probability-Based Protein Identification by Searching Sequence Databases Using Mass
Spectrometry Data." ELECTROPHORESIS 20, no. 18 (1999): 3551-67.
175. Searle, Brian C. "Scaffold: A Bioinformatic Tool for Validating Ms/Ms-Based
Proteomic Studies." PROTEOMICS 10, no. 6 (2010): 1265-69.
176. Scriven, David R., Ronald M. Lynch, and Edwin D. Moore. "Image Acquisition for
Colocalization Using Optical Microscopy." American Journal of Physiology. Cell
Physiology 294, no. 5 (2008): C1119-22.
177. Pierce, Michael M., C. S. Raman, and Barry T. Nall. "Isothermal Titration
Calorimetry of Protein–protein Interactions." Methods 19, no. 2 (1999): 213-21.
178. Liang, Chun-Chi, Ann Y. Park, and Jun-Lin Guan. "In Vitro Scratch Assay: A
Convenient and Inexpensive Method for Analysis of Cell Migration in Vitro." Nature
Protocols 2, no. 2 (2007): 329-33.
179. Wells, Claire M., Maddy Parsons, and Giles Cory. "Scratch-Wound Assay." In Cell
Migration, 25-30: Humana Press, 2011.
180. Xu, Xiaoli, Yuan Song, Yuhua Li, Jianfeng Chang, Hua Zhang, and Lizhe An. "The
Tandem Affinity Purification Method: An Efficient System for Protein Complex
Purification and Protein Interaction Identification." Protein Expression and Purification 72,
no. 2 (2010): 149-56.
181. Muslimović, Aida, Susanne Nyström, Yue Gao, and Ola Hammarsten. "Numerical
Analysis of Etoposide Induced DNA Breaks." PLoS ONE 4, no. 6 (2009): e5859.
182. Mocellin, S., and M. Provenzano. "RNA Interference: Learning Gene Knock-Down
from Cell Physiology." Journal of Translational Medicine 2, no. 1 (2004): 39.
183. https://www.caymanchem.com/app/template/Product.vm/catalog/600450, accessed on
the 7th January 2013.
7. Appendix
(1) His-hSTRAP(1-440)TPR1-6 sequencing data
Sequence given by GATC:
gCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATT
CTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCAC
TGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAG
CAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGATA
GCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCAC
TGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTG
CAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGC
TGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAG
TGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTC
TGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAAC
CGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCT
His-hSTRAP(1-440)TPR1-6 sequence
CATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATTCTGTTGAAGATGCAGGTCGTAAA
CAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGC
CCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGT
GCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCC
GTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAA
AAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGG
CCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACAT
CTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTT
AGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATT
CCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCA
GTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC
Alignment:
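In each block below, SEQ is the sequencing read and Full is the His-hSTRAP(1-440)TPR1-6 sequence listed above; an asterisk marks an identical base, and trailing numbers give the cumulative position in each sequence.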
GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 120
-----------------------CATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 97
*************************************************************************************************
AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA 240
AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA 217
************************************************************************************************************************
TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG
TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG
************************************************************************************************************************
TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCG
TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCG
************************************************************************************************************************
ATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCC
ATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCC
************************************************************************************************************************
TGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAA
TGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAA
************************************************************************************************************************
CCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGG
CCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGG
************************************************************************************************************************
ATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCG
ATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCG
************************************************************************************************************************
CAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTC
CAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTC
************************************************************************************************************************
CGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGC
CGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGC
************************************************************************************************************************
GTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAA
GTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAA
************************************************************************************************************************
CCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAG 1380
CCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC---------------------- 1335
**************************************
Cumulative end positions (SEQ/Full) of the nine unnumbered blocks above, in order: 360/337, 480/457, 600/577, 720/697, 840/817, 960/937, 1080/1057, 1200/1177, 1320/1297.
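The comparison shown in the alignment above can be reproduced with a few lines of Python. This is only a minimal sketch and not the program used to generate the alignment, which the appendix does not name. The function name `locate_construct`, the 12-base seed length, and the short example strings (the first bases of the read and of the construct in section (1)) are illustrative assumptions.

```python
def locate_construct(read, construct, seed_len=12):
    """Find where the construct starts in the read and how far it matches.

    Returns (offset, matched) or None if the seed is not found.
    offset  : number of read bases that precede the construct
    matched : number of consecutive identical bases from that point
    Lower-case base calls are treated like upper-case ones.
    """
    read = read.upper()
    construct = construct.upper()
    seed = construct[:seed_len]          # leading bases used to anchor the search
    offset = read.find(seed)
    if offset < 0:
        return None
    matched = 0
    for a, b in zip(read[offset:], construct):
        if a != b:                       # stop at the first mismatch
            break
        matched += 1
    return offset, matched


# Illustrative prefixes copied from section (1): the read carries extra bases
# upstream of the construct, which begins at CATATG.
upstream = "gCGGCCTGGTGCCGCGCGGCAGC"
construct_prefix = "CATATGATGGCCGATGAAGAAGAAGAAG"
read_prefix = upstream + construct_prefix

print(locate_construct(read_prefix, construct_prefix))
```

The returned offset corresponds to the run of gap characters at the start of the Full line in the first alignment block, and the matched count corresponds to the run of asterisks.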
(2) GST-hSTRAP(1-440) sequencing data
Sequence given by the University of Manchester DNA sequencing facility
TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTCCTTTCCTTT
GTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCCATCTGAATCTACCAGGCC
AAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTTGAGCTCCAGGGTCACTTTCTGCCCAGA
GGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCCCTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAA
TTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTCCCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATG
AAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGATCTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAAT
ATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCATGACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTT
TTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGCAGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAG
CTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGGAGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTG
TAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTTCCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGAT
CGGCTTGACTTCTTCCTCTTCATCAGCCATCAT
Reverse complement of the GST-hSTRAP(1-440) sequence
TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTCCTTTCCTTT
GTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCCATCTGAATCTACCAGGCC
AAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTTGAGCTCCAGGGTCACTTTCTGCCCAGA
GGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCCCTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAA
TTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTCCCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATG
AAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGATCTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAAT
ATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCATGACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTT
TTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGCAGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAG
CTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGGAGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTG
TAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTTCCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGAT
CGGCTTGACTTCTTCCTCTTCATCAGCCATCAT
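Because this read is compared against the reverse complement of the construct, it may help to show how a reverse complement is computed. The following is a minimal sketch in Python; the function name `reverse_complement` is a hypothetical helper and not part of any tool named in the thesis, and the example string is simply the last ten bases of the read above.

```python
# Complement table for the four bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence.

    Each base is swapped for its Watson-Crick partner (A<->T, C<->G),
    then the whole string is reversed. N and letter case are preserved.
    """
    return seq.translate(_COMPLEMENT)[::-1]


# Last ten bases of the read listed above.
read_tail = "CAGCCATCAT"
print(reverse_complement(read_tail))
```

Applying the function twice returns the original string, which is a quick sanity check when preparing a reference sequence for a reverse read.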
Alignment:
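In each block below, seq is the sequencing read and Full is the reverse complement of the GST-hSTRAP(1-440) sequence listed above; an asterisk marks an identical base.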
TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTC
TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTC
************************************************************************************************************************
CTTTCCTTTGTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCC
CTTTCCTTTGTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCC
************************************************************************************************************************
ATCTGAATCTACCAGGCCAAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTT
ATCTGAATCTACCAGGCCAAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTT
************************************************************************************************************************
GAGCTCCAGGGTCACTTTCTGCCCAGAGGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCC
GAGCTCCAGGGTCACTTTCTGCCCAGAGGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCC
************************************************************************************************************************
CTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAATTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTC
CTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAATTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTC
************************************************************************************************************************
CCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATGAAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGAT
CCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATGAAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGAT
************************************************************************************************************************
CTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAATATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCAT
CTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAATATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCAT
************************************************************************************************************************
Cumulative end positions (seq/Full) of the seven blocks above, in order: 120/120, 240/240, 360/360, 480/480, 600/600, 720/720, 840/840.
GACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTTTTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGC
GACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTTTTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGC
************************************************************************************************************************
AGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAGCTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGG
AGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAGCTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGG
************************************************************************************************************************
AGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTGTAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTT
AGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTGTAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTT
************************************************************************************************************************
CCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGATCGGCTTGACTTCTTCCTCTTCATCAGCCAT
CCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGATCGGCTTGACTTCTTCCTCTTCATCAGCCAT
************************************************************************************************************************
CAT 1323
CAT 1323
***
Cumulative end positions (seq/Full) of the four unnumbered blocks above, in order: 960/960, 1080/1080, 1200/1200, 1320/1320.
(3) hSTRAP (1-219)TPR 1-3 Sequencing data
Sequence given by GATC
aGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCAT
TCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCA
CTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCA
GCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGAT
AGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCA
CTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGC
CTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCACAGGACgGGTGTGGTCGCCATGATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGACTGGGc
gGCGGCCAAAGCGGTCGGACAGTGCTCCGagaACgGGTgcgcATAGaAATTgcaTCAACGCATATAGCgCTAGCAGcacgccaTaGTGACTGGCGatGCtgtnngAATGGACGa
hSTRAP(1-219)TPR1-3 sequence
AGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCAT
TCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCA
CTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCA
GCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGAT
AGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCA
CTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCC
Alignment:
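In each block below, SEQ is the sequencing read and F3 is the hSTRAP(1-219)TPR1-3 sequence listed above; an asterisk marks an identical base.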
AGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTT 120
AGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTT 120
************************************************************************************************************************
GAAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTG
GAAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTG
************************************************************************************************************************
ATGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAA
ATGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAA
************************************************************************************************************************
GTGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACC
GTGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACC
************************************************************************************************************************
GATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGC
GATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGC
************************************************************************************************************************
CTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCCGGCTGCTAACAAAGCCCGAAAGGA
CTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCC------------------------
************************************************************************************************
Cumulative end positions (SEQ/F3) of the five blocks after the first, in order: 240/240, 360/360, 480/480, 600/600, 720/696.
(4) hSTRAP(220-440)TPR4-6 sequencing data
Sequence given by GATC
AGCGGCCTGGTGCCGCGCGGCAGCCATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCA
CTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGC
CTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTG
GGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGT
GATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGT
AGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACC
CCTTGGGGCCTCTAA
hSTRAP(220-440)TPR4-6 sequence
CATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCG
CGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCG
TGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACC
ACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCG
AATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACC
GTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC
Alignment:
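In each block below, SEQ is the sequencing read and L3 is the hSTRAP(220-440)TPR4-6 sequence listed above; an asterisk marks an identical base.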
AGCGGCCTGGTGCCGCGCGGCAGCCATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGT 120
------------------------CATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGT 96
************************************************************************************************
GCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTG
GCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTG
************************************************************************************************************************
CAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCT
CAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCT
************************************************************************************************************************
GGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTG
GGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTG
************************************************************************************************************************
TATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGT
TATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGT
************************************************************************************************************************
GTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGC
GTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC---------------
*********************************************************************************************************
Cumulative end positions (SEQ/L3) of the five blocks after the first, in order: 240/216, 360/336, 480/456, 600/576, 720/681.
(5) hSTRAP (1-150)TPR 1-2 sequencing data
Sequence given by GATC
gcgGCCTgGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATT
CTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCAC
TGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAG
CAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCT
GAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCACAGGACGGGTGTGGTCGCCAT
GATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGACTGGGCGGCGGCCAAAGCGGTCGGACAGTGCTCCGAGAACGGGTGCGCATAGAAATTGCATCAACGCATATAGCGCTAGCAGCAC
GCCATAGTGACTGGCGATGCTGTCGGAATGGACGATATCCCGCAAGAGGCCCGGCAGTACCGGCATAACCAAGCCTATGCCTACAGCATCCAGGGTGACGGTGCCGAGGATGACGATGAGCgCATTGTT
AGATTTCanaCacGGTGCCTgACTGCGTTAGCAATTTAACTGTgataAACTAccGCATTaAAGCTTATCGATGataAGcTgtcAAACATgaaaATTCTTGAanacGaAAGGGCctcgtg
hSTRAP(1-150)TPR1-2 sequence
GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATT
CTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCAC
TGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAG
CAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAATAAGGATCC
Alignment:
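In each block below, SEQ is the sequencing read and F2 is the hSTRAP(1-150)TPR1-2 sequence listed above; an asterisk marks an identical base.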
GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 120
GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 120
***********************************************************************************************************************
AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA 240
AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA
************************************************************************************************************************
TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG
TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG
************************************************************************************************************************
TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAAT
TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAAT
************************************************************************************************************************
AAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAG 540
AAGGATCC---------------------------------------------------- 488
********
Cumulative end positions (SEQ/F2) of the second to fourth blocks above, in order: 240/240, 360/360, 480/480.
(6) hSTRAP (151-284)TPR 3-4 sequencing data
Sequence given by GATC
gCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATA
TTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGC
ATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAAT
TTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTT
GGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCACAGGACGGGTGTGGTCGCCATGATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGAC
TGGGCGGCGGCCAAAGCGGTCGGACAGTGCTCCGAGAACGGGTGCGCATAGAAATTGCATCAACGCATATAGCGCTAGCAGCACGCCATAGTGACTGGCGATGCTGTCGGAATGGACGATATCCCGCAA
GAGGCCCGGCAGTACCGGCATAACCAAGCCTATGCCTACAGCATCCAGGGTGACGGTGCCGAGGATGacGATGAGCGCATTGTTAGATTTCATACACGGTGCCTGACTGCGTTAGCAATTTAACTGTGA
TaAACTACCGCATTAAAGCTTATCGATGATAAGCTGTCAAACATGanaATTCTTgaagacGaAAgGGCCTcGTGAtacGCCTATTTt
hSTRAP(151-284)TPR3-4 sequence
GCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATA
TTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGC
ATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAAT
TTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCC
Alignment:
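In each block below, SEQ is the sequencing read and M2 is the hSTRAP(151-284)TPR3-4 sequence listed above; an asterisk marks an identical base.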
GCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTA
GCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTA
************************************************************************************************************************
GCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAA
GCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAA
************************************************************************************************************************
Cumulative end positions (SEQ/M2) of the two blocks above: 120/120, 240/240.
GCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGC 360
GCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGC 360
************************************************************************************************************************
GTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCT 480
GTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCC------------------------------------- 443
***********************************************************************************
(7) hSTRAP (285-440) TPR 5-6 sequencing data
Sequence given by GATC
gCGGCCTGGTGCCGCGCGGCAGCCATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCgtgTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTA
CCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAgAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTG
ATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATA
GCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCT
AACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCA
CAGGACGGGTGTGGTCGCCATGATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGACTGGGCGGCGGCCAAAGCGGTCGGACAGTGCTCCGAGAACGGGTGCGCATAGAAATTGCATCAA
CGCATATAGCGCTAGCAGCACGCCATAGTGACTGGCGATGCTGTCGGAATGGACGATATCCCGCAAGangcCCGGCAGTACCGGCATAACCAAGCCTATGCCTAnnGCATCCAgGGTGACGGTGCcann
gATGACgATgaacGCATTGTTAgatTTCAtannnGgtgCCTgaCTGcgTTaGCAATTTAACTgtgataAACTACcgcATTAAAGCTTATCGaTgataagctnnca
hSTRAP(285-440)TPR5-6 sequence
CATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCT
ACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATG
GTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAA
ACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC
Alignment:
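In each block below, SEQ is the sequencing read and E2 is the hSTRAP(285-440)TPR5-6 sequence listed above; an asterisk marks an identical base.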
GCGGCCTGGTGCCGCGCGGCAGCCATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTC
-----------------------CATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTC
*************************************************************************************************
AGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCT
AGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCT
************************************************************************************************************************
TTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATC
TTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATC
************************************************************************************************************************
GTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAA
GTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAA
************************************************************************************************************************
GCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAG 540
GCCGTCCGCAGTGTGAATAATAAGGATCC------------------------------- 486
*****************************
224
120
97
240
217
360
337
480
457
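The identity check that the `*` line of the alignment expresses can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool used to produce the alignment above: the function name `alignment_identity` and the short toy sequences are assumptions introduced for the example. It counts matching columns between two gapped, equal-length rows (such as the SEQ and E2 rows) and skips columns where either row has a gap.

```python
from typing import Tuple


def alignment_identity(seq_a: str, seq_b: str, gap: str = "-") -> Tuple[int, int]:
    """Return (matches, compared) for two aligned, equal-length sequences.

    Columns in which either sequence carries a gap character are skipped,
    so leading or trailing vector sequence does not count as a mismatch.
    Comparison is case-insensitive; ambiguous base calls such as 'n' are
    counted as mismatches unless both rows carry the same symbol.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # unaligned flank or indel column
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


if __name__ == "__main__":
    # Toy rows (hypothetical, shortened): a read with six extra 5' bases
    # aligned against a reference that starts at the NdeI site CATATG.
    read = "GCGGCCCATATGAAAACC"
    reference = "------CATATGAAAACC"
    matches, compared = alignment_identity(read, reference)
    print(f"{matches}/{compared} aligned positions identical")
```

Applied to the full SEQ and E2 rows, a result in which `matches` equals `compared` would correspond to every aligned column carrying a `*`.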
Table 7.1. Mass spectrometry peptide data
hSTRAP interacting partner | hSTRAP protein variant | Peptide Sequence | Mascot ion Score
Myosin 9 | His-hSTRAP(1-440) | AGVLAHLEEER | 65.7
Myosin 9 | His-hSTRAP(1-440) | ANLQIDQINTDLNLER | 87.5
Myosin 9 | His-hSTRAP(1-440) | ASITALEAK | 59.2
Myosin 9 | His-hSTRAP(1-440) | DELADEIANSSGK | 66.3
Myosin 9 | His-hSTRAP(1-440) | ELEDATETADAMNR | 85.8
Myosin 9 | His-hSTRAP(1-440) | IAEFTTNLTEEEEK | 55.9
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETK | 80.3
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETKER | 95.3
Myosin 9 | His-hSTRAP(1-440) | KANLQIDQINTDLNLER | 83.1
Myosin 9 | His-hSTRAP(1-440) | KQELEEICHDLEAR | 64.7
Myosin 9 | His-hSTRAP(1-440) | KVEAQLQELQVK | 67.7
Myosin 9 | His-hSTRAP(1-440) | LQVELDNVTGLLSQSDSK | 65.3
Myosin 9 | His-hSTRAP(1-440) | QLEEAEEEAQR | 74.9
Myosin 9 | His-hSTRAP(1-440) | QLLQANPILEAFGNAK | 55.3
Myosin 9 | His-hSTRAP(1-440) | ELEDATETADAMNR | 64.1
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETKER | 82.6
Myosin 9 | His-hSTRAP(1-440) | LQVELDNVTGLLSQSDSK | 80.2
Myosin 9 | His-hSTRAP(1-440) | TVGQLYKEQLAK | 53
Myosin 9 | His-hSTRAP(1-440) | VISGVLQLGNIVFK | 70.9
Myosin 9 | His-hSTRAP(1-440) | ALEEAMEQKAELER | 71.5
Myosin 9 | His-hSTRAP(1-440) | ANLQIDQINTDLNLER | 85.5
Myosin 9 | His-hSTRAP(1-440) | ELEDATETADAMNR | 57.2
Myosin 9 | His-hSTRAP(1-440) | IAEFTTNLTEEEEK | 88.2
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETKER | 77.8
Myosin 9 | His-hSTRAP(1-440) | KVEAQLQELQVK | 65.6
Myosin 9 | His-hSTRAP(1-440) | NTDQASMPDNTAAQK | 66.9
Myosin 9 | His-hSTRAP(1-440) | QLLQANPILEAFGNAK | 87.9
Myosin 9 | His-hSTRAP(1-440) | VIQYLAYVASSHK | 76.1
Myosin 9 | His-hSTRAP(285-440) | ALEEAMEQKAELER | 60.3
Myosin 9 | His-hSTRAP(285-440) | ANLQIDQINTDLNLER | 95.6
Myosin 9 | His-hSTRAP(285-440) | DELADEIANSSGK | 65.6
Myosin 9 | His-hSTRAP(285-440) | EEILAQAKENEK | 55.3
Myosin 9 | His-hSTRAP(285-440) | ELEDATETADAMNR | 66.9
Myosin 9 | His-hSTRAP(285-440) | EQADFAIEALAK | 57.7
Myosin 9 | His-hSTRAP(285-440) | HSQAVEELAEQLEQTKR | 51.1
Myosin 9 | His-hSTRAP(285-440) | IAEFTTNLTEEEEK | 97.7
Myosin 9 | His-hSTRAP(285-440) | IAQLEEELEEEQGNTELINDR | 54.4
Myosin 9 | His-hSTRAP(285-440) | IAQLEEQLDNETK | 62.1
Myosin 9 | His-hSTRAP(285-440) | KQELEEICHDLEAR | 79.7
Myosin 9 | His-hSTRAP(285-440) | ANLQIDQINTDLNLER | 105
Myosin 9 | His-hSTRAP(285-440) | IAQLEEQLDNETK | 66.6
Myosin 9 | His-hSTRAP(285-440) | KFDQLLAEEK | 60.7
Myosin 9 | His-hSTRAP(285-440) | NTDQASMPDNTAAQK | 57.3
Myosin 9 | His-hSTRAP(285-440) | VIQYLAYVASSHK | 58.6
Myosin 9 | His-hSTRAP(285-440) | AGVLAHLEEER | 62.1
Myosin 9 | His-hSTRAP(285-440) | ALELDSNLYR | 67.6
Myosin 9 | His-hSTRAP(285-440) | ANLQIDQINTDLNLER | 92.2
Myosin 9 | His-hSTRAP(285-440) | ASITALEAK | 62
Myosin 9 | His-hSTRAP(285-440) | HSQAVEELAEQLEQTKR | 65.4
Myosin 9 | His-hSTRAP(285-440) | IAQLEEQLDNETKER | 75.1
Myosin 9 | His-hSTRAP(285-440) | KANLQIDQINTDLNLER | 60.8
Myosin 9 | His-hSTRAP(285-440) | KFDQLLAEEK | 59.7
Myosin 9 | His-hSTRAP(285-440) | NTDQASMPDNTAAQK | 83
Myosin 9 | His-hSTRAP(285-440) | QLLQANPILEAFGNAK | 69.4
Myosin 9 | His-hSTRAP(285-440) | RQLEEAEEEAQR | 56
Myosin 9 | His-hSTRAP(285-440) | TRLQQELDDLLVDLDHQR | 67.6
Myosin 9 | His-hSTRAP(285-440) | VISGVLQLGNIVFKK | 78.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | GST-hSTRAP(1-440) | GVVDSEDLPLNISR | 79.9
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | GST-hSTRAP(1-440) | TLTLVDTGIGMTK | 84.4
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | ELISNASDALDK | 64.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | GVVDSEDLPLNISR | 77.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | HFSVEGQLEFR | 65.3
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | NPDDITQEEYGEFYK | 73.3
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | NPDDITQEEYGEFYK | 69.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | TTPSVVAFTADGER | 71.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | ADLINNLGTIAK | 63.7
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | ELISNASDALDKIR | 67
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | GVVDSEDLPLNISR | 77.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | NPDDITQEEYGEFYK | 71.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | SIYYITGESKEQVANSAFVER | 60.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | SLTNDWEDHLAVK | 65.9
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | ELISNASDALDKIR | 72.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | GVVDSEDLPLNISR | 76.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | NPDDITQEEYGEFYK | 78.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | ELISNASDALDK | 67.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | NPDDITQEEYGEFYK | 64.6
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | SLTNDWEDHLAVK | 73.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | TLTLVDTGIGMTK | 73.3
Triosephosphate isomerase | GST-hSTRAP(1-440) | HVFGESDELIGQK | 61.2
Triosephosphate isomerase | GST-hSTRAP(1-440) | IIYGGSVTGATCK | 70.5
Triosephosphate isomerase | GST-hSTRAP(1-440) | QSLGELIGTLNAAK | 80.9
Triosephosphate isomerase | GST-hSTRAP(1-440) | SNVSDAVAQSTR | 74.4
Triosephosphate isomerase | GST-hSTRAP(1-440) | VPADTEVVCAPPTAYIDFAR | 71.7
Triosephosphate isomerase | GST-hSTRAP(1-440) | IIYGGSVTGATCK | 74.6
Triosephosphate isomerase | GST-hSTRAP(1-440) | SNVSDAVAQSTR | 67.7
Triosephosphate isomerase | GST-hSTRAP(1-440) | KQSLGELIGTLNAAK | 68.4
Triosephosphate isomerase | GST-hSTRAP(1-440) | QSLGELIGTLNAAK | 66.8
Triosephosphate isomerase | GST-hSTRAP(1-440) | SNVSDAVAQSTR | 71
Triosephosphate isomerase | GST-hSTRAP(1-440) | VVLAYEPVWAIGTGK | 82.3
Triosephosphate isomerase | His-hSTRAP(1-219) | IIYGGSVTGATCK | 71.7
Triosephosphate isomerase | His-hSTRAP(1-219) | SNVSDAVAQSTR | 79.1
Triosephosphate isomerase | His-hSTRAP(1-219) | VVLAYEPVWAIGTGK | 61.4
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(1-440) | AGENVGVLLR | 57
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(1-440) | GITINTSHVEYDTPTR | 73.9
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(1-440) | TTLTAAITTVLAK | 76
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AFDQIDNAPEEKAR | 64.9
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ALEGDAEWEAK | 64.1
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ELLSQYDFPGDDTPIVR | 62.2
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | FESEVYILSK | 54.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | FESEVYILSKDEGGR | 59.6
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GITINTSHVEYDTPTR | 79.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ILELAGFLDSYIPEPER | 57
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TKPHVNVGTIGHVDHGK | 66.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TTLTAAITTVLAK | 81.2
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | VGEEVEIVGIK | 67.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AFDQIDNAPEEK | 58.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AFDQIDNAPEEKAR | 74.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AGENVGVLLR | 59.4
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ALEGDAEWEAK | 64.9
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GITINTSHVEYDTPTR | 71.8
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GSALKALEGDAEWEAK | 65.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TTLTAAITTVLAK | 72.8
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | VGEEVEIVGIKETQK | 61.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ALEGDAEWEAK | 67.5
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ELLSQYDFPGDDTPIVR | 71.1
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | FESEVYILSK | 59.2
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GITINTSHVEYDTPTR | 76.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TTLTAAITTVLAK | 88.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | VGEEVEIVGIK | 67.3
L-lactate dehydrogenase | GST-hSTRAP(1-440) | VIGSGCNLDSAR | 60.7
L-lactate dehydrogenase | GST-hSTRAP(1-440) | VTLTSEEEAR | 62.8
L-lactate dehydrogenase | GST-hSTRAP(1-440) | IVSGKDYNVTANSK | 74.2
L-lactate dehydrogenase | GST-hSTRAP(1-440) | LLIVSNPVDILTYVAWK | 111
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | ATIDGLENMNSPAMVAAK | 67.4
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | VFMQPASEGTGIIAGGAMR | 88.8
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | AYGSTNPINVVR | 54
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | VFMQPASEGTGIIAGGAMR | 74.1
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | AVLEVAGVHNVLAK | 61.4
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | AYGSTNPINVVR | 63.6
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | IFSFTALTVVGDGNGR | 66.3
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | VFMQPASEGTGIIAGGAMR | 77.2
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | ATIDGLENMNSPAMVAAK | 73
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | AVLEVAGVHNVLAK | 68.7
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | VFMQPASEGTGIIAGGAMR | 86.4
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | AVLEVAGVHNVLAK | 70.9
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | IFSFTALTVVGDGNGR | 105
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | AVLEVAGVHNVLAK | 64.8
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | IFSFTALTVVGDGNGR | 68.6
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | VFMQPASEGTGIIAGGAMR | 82
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | AIAELGIYPAVDPLDSTSR | 60.1
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | TIAMDGTEGLVR | 61
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | TVLIMELINNVAK | 83.8
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | LVLEVAQHLGESTVR | 74.7
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | VLDSGAPIKIPVGPETLGR | 58.2
Phosphoglycerate kinase | His-hSTRAP(1-440) | AHSSMVGVNLPQK | 46.8
Phosphoglycerate kinase | His-hSTRAP(1-440) | LGDVYVNDAFGTAHR | 73.2
Phosphoglycerate kinase | His-hSTRAP(1-440) | LGDVYVNDAFGTAHR | 82.5
Phosphoglycerate kinase | His-hSTRAP(1-440) | VLNNMEIGTSLFDEEGAK | 61.1
Phosphoglycerate kinase | His-hSTRAP(285-440) | GCITIIGGGDTATCCAK | 79.2
Phosphoglycerate kinase | His-hSTRAP(285-440) | LGDVYVNDAFGTAHR | 101
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | AQFEGIVTDLIR | 61.4
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | QAVTNPNNTFYATK | 69.5
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | QATKDAGQISGLNVLR | 49.8
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | VLENAEGAR | 66.3
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(1-440) | QATKDAGQISGLNVLR | 56
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(1-440) | TTPSVVAFTADGER | 71.5
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | ETGVDLTKDNMALQR | 54.1
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | MKETAENYLGHTAK | 62
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | TTPSVVAFTADGER | 66.1
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | DAGQISGLNVLR | 67.9
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | TTPSVVAFTADGER | 61.8
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | LAGTQPLEVLEAVQR | 58.9
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | LDQPMTEIVSR | 63.5
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | LQTSSVLVSGLR | 80.6
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | NEEDAAELVALAQAVNAR | 79.2
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | AAVATFLQSVQVPEFTPK | 66.1
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | NGSEADIDEGLYSR | 79.7
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | ALPAVQQNNLDEDLIR | 60.3
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | NGSEADIDEGLYSR | 63.2
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | ALPAVQQNNLDEDLIR | 69.3
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | LAGTQPLEVLEAVQR | 62.3
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | NEEDAAELVALAQAVNAR | 62
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | AAVATFLQSVQVPEFTPK | 69.6
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | LAGTQPLEVLEAVQR | 70.7
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | NEEDAAELVALAQAVNAR | 81.8
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(1-440) | DLVVSLAYQVR | 59.5
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(1-440) | DVFMGVDELQVGMR | 71.4
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(1-440) | FNVEVVAIR | 72.5
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(1-440) | DVFMGVDELQVGMR | 53.6
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(1-440) | VAKDLVVSLAYQVR | 59.8
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(151-284) | DLVVSLAYQVR | 65.9
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(151-284) | DVFMGVDELQVGMR | 73.5
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(151-284) | FNVEVVAIR | 66.8
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(151-284) | VPKDVFMGVDELQVGMR | 98.6
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(151-284) | FNVEVVAIR | 60.7
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(151-284) | VPKDVFMGVDELQVGMR | 58.6
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | FSDGEVSVQINENVR | 80.7
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | ISNEESISAMFEH | 91.8
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | LFAGNATPELAQR | 66.7
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | TLTLSGMLAEAIR | 67.2
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | VVADFLSSVGVDR | 76.9
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | ANVSQVMHIIGDVAGR | 64
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | FSDGEVSVQINENVR | 74.5
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | LFAGNATPELAQR | 58.1
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | TLTLSGMLAEAIR | 75.7
Ribose-phosphate pyrophosphokinase (Escherichia coli) | His-hSTRAP(285-440) | VFAYATHPIFSGNAANNLR | 57.5
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | GMVLTGGGALLR | 66.4
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | IGGDRFDEAIINYVR | 57.2
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | GMVLTGGGALLR | 66.4
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | IKHEIGSAYPGDEVR | 68.8
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | RNYGSLIGEATAER | 57.9
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | GMVLTGGGALLR | 62.4
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | GQGIVLNEPSVVAIR | 58
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | IGGDRFDEAIINYVR | 70.2
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | IKHEIGSAYPGDEVR | 78.5
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | NYGSLIGEATAER | 76.1
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | RNYGSLIGEATAER | 50.3
Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | His-hSTRAP(285-440) | SVAAVGHDAKQMLGR | 56.1
Filamin B | GST-hSTRAP(1-440) | LVSPGSANETSSILVESVTR | 109
Filamin B | GST-hSTRAP(1-440) | LGSAADFLLDISETDLSSLTASIK | 68
Filamin B | GST-hSTRAP(1-440) | IGNLQTDLSDGLR | 40
Filamin B | GST-hSTRAP(1-440) | AGPGTLSVTIEGPSK | 50
Filamin B | GST-hSTRAP(1-440) | GAGIGGLGITVEGPSESK | 40
Filamin B | GST-hSTRAP(1-440) | LVSPGSANETSSILVESVTR | 40
Filamin B | GST-hSTRAP(1-440) | VLSEDEEDVDFDIIHNANDTFTVK | 33
Filamin B | His-hSTRAP(1-219) | IGNLQTDLSDGLR | 49
Filamin B | His-hSTRAP(1-219) | AGPGTLSVTIEGPSK | 31
Filamin B | His-hSTRAP(1-219) | SPFTVGVAAPLDLSK | 35
Filamin B | His-hSTRAP(1-219) | SPFEVQVGPEAGMQK | 42
Filamin B | His-hSTRAP(1-219) | GAGIGGLGITVEGPSESK | 42
Filamin B | His-hSTRAP(1-219) | DAGEGLLAVQITDQEGKPK | 34
Filamin B | His-hSTRAP(1-219) | LVSPGSANETSSILVESVTR | 40
Filamin B | His-hSTRAP(151-284) | LVSPGSANETSSILVESVTR | 55
Filamin B | His-hSTRAP(151-284) | GLHVVEVTYDDVPIPNSPFK | 39
Filamin B | His-hSTRAP(151-284) | IGNLQTDLSDGLR | 33
Filamin B | His-hSTRAP(151-284) | AGPGTLSVTIEGPSK | 24
Filamin B | His-hSTRAP(151-284) | DAGEGLLAVQITDQEGKPK | 22
Filamin B | His-hSTRAP(151-284) | IGNLQTDLSDGLR | 22
Filamin B | His-hSTRAP(151-284) | LVSPGSANETSSILVESVTR | 55
Filamin B | His-hSTRAP(151-284) | GLHVVEVTYDDVPIPNSPFK | 39
Epiplakin | GST-hSTRAP(1-440) | AGTLTVEELGATLTSLLAQAQAQAR | 78.8
Epiplakin | GST-hSTRAP(1-440) | AVPVWDVLASGYVSR | 93.8
Epiplakin | GST-hSTRAP(1-440) | LAAVDVSAR | 61.2
Epiplakin | GST-hSTRAP(1-440) | LSVEEAVAAGVVGGEIQEK | 102
Epiplakin | GST-hSTRAP(1-440) | SQREGQGEGETQEAAAAAAAAR | 54.8
Epiplakin | GST-hSTRAP(1-440) | MSIYQAMWK | 50
Epiplakin | GST-hSTRAP(1-440) | AVPVWDVLASGYVSR | 83
Epiplakin | GST-hSTRAP(1-440) | LAAELSATLEQAAATAR | 65
Epiplakin | GST-hSTRAP(1-440) | AVPVWDVLASGYVSGAAR | 36
Epiplakin | GST-hSTRAP(1-440) | LSVEEAVAAGVVGGEIQEK | 59
Epiplakin | GST-hSTRAP(1-440) | VSAWELINSEYFSEGR | 81
Epiplakin | GST-hSTRAP(1-440) | SLEGGNFIAGVLIQGTQER | 67
Epiplakin | GST-hSTRAP(1-440) | LLEAQIATGGVIDPVHSHR | 86
Epiplakin | GST-hSTRAP(1-440) | LLEAQIATGGIIDPVHSHR | 58
Epiplakin | GST-hSTRAP(1-440) | EELLAEFGSGTLDLPALTR | 62
Epiplakin | GST-hSTRAP(1-440) | LLEAQVASGFLVDPLNNQR | 58
Epiplakin | GST-hSTRAP(1-440) | LGLLDTQTSQVLTAVDKDNK | 60
Epiplakin | GST-hSTRAP(1-440) | GFFDPNTHENLTYVQLLR | 71
Epiplakin | GST-hSTRAP(1-440) | LLEIITTTIEETETQNQGIK | 44
Epiplakin | GST-hSTRAP(1-440) | GFFDPNTHENLTYLQLLER | 66
Epiplakin | GST-hSTRAP(1-440) | AVTGYTDPYTGQQISLFQAMQK | 70
Epiplakin | GST-hSTRAP(1-440) | VALALLEAQAATGTIMDPHSPESLSVDEAVR | 59
Epiplakin | His-hSTRAP(1-219) | AGTLTVEELGATLTSLLAQAQAQAR | 50
Epiplakin | His-hSTRAP(1-219) | AVPVWDVLASGYVSR | 66
Epiplakin | His-hSTRAP(1-219) | AVPVWDVLASGYVSR | 71
Epiplakin | His-hSTRAP(1-219) | SLEGGNFIAGVLIQGTQER | 53
Epiplakin | His-hSTRAP(1-219) | LLEAQIATGGVIDPVHSHR | 84
Epiplakin | His-hSTRAP(1-219) | LLEAQIATGGIIDPVHSHR | 35
Epiplakin | His-hSTRAP(1-219) | EELLAEFGSGTLDLPALTR | 72
Epiplakin | His-hSTRAP(1-219) | GFFDPNTHENLTYVQLLR | 67
Epiplakin | His-hSTRAP(1-219) | LLEIITTTIEETETQNQGIK | 48
Epiplakin | His-hSTRAP(1-219) | GFFDPNTHENLTYLQLLER | 59
Epiplakin | His-hSTRAP(1-219) | AVTGYTDPYTGQQISLFQAMQK | 31
Epiplakin | His-hSTRAP(1-219) | LSVEEAVAAGVVGGEIQEK | 80
Epiplakin | His-hSTRAP(1-219) | QVSASELHTSGILGPETLR | 37
Epiplakin | His-hSTRAP(1-219) | LLEAQIATGGVIDPVHSHR | 68
Epiplakin | His-hSTRAP(1-219) | EELLAEFGSGTLDLPALTR | 57
Epiplakin | His-hSTRAP(1-219) | LLEIITTTIEETETQNQGIK | 39
Epiplakin | His-hSTRAP(1-150) | ALQQGLVGLELK | 41
Epiplakin | His-hSTRAP(1-150) | AVPVWDVLASGYVSR | 77
Epiplakin | His-hSTRAP(1-150) | LAAELSATLEQAAATAR | 32
Epiplakin | His-hSTRAP(1-150) | AVPVWDVLASGYVSGAAR | 33
Epiplakin | His-hSTRAP(1-150) | LSVEEAVAAGVVGGEIQEK | 70
Epiplakin | His-hSTRAP(1-150) | VSAWELINSEYFSEGR | 69
Epiplakin | His-hSTRAP(1-150) | SLEGGNFIAGVLIQGTQER | 67
Epiplakin | His-hSTRAP(1-150) | QVSASELHTSGILGPETLR | 65
Epiplakin | His-hSTRAP(1-150) | EELLAEFGSGTLDLPALTR | 69
Epiplakin | His-hSTRAP(1-150) | LLEAQVASGFLVDPLNNQR | 41
Epiplakin | His-hSTRAP(1-150) | GFFDPNTHENLTYVQLLR | 77
Epiplakin | His-hSTRAP(1-150) | LLEIITTTIEETETQNQGIK | 52
Epiplakin | His-hSTRAP(1-150) | GFFDPNTHENLTYLQLLER | 65
Epiplakin | His-hSTRAP(1-150) | AVTGYTDPYTGQQISLFQAMQK | 79
Epiplakin | His-hSTRAP(1-150) | SMGGAVSAAELLEVGILDEQAVQGLR | 63
Epiplakin | His-hSTRAP(1-150) | LSVEEAVAAGVVGGEIQEK | 61
Epiplakin | His-hSTRAP(1-150) | QVSASELHTSGILGPETLR | 24
Epiplakin | His-hSTRAP(1-150) | LTAIIEEAEEAPGARPQLQDAWR | 25
Epiplakin | His-hSTRAP(1-150) | AVPVWDVLASGYVSR | 67.6
Epiplakin | His-hSTRAP(1-150) | LSVEEAVAAGVVGGEIQEK | 69.3
Filamin A | GST-hSTRAP(1-440) | LPQLPITNFSR | 41
Filamin A | GST-hSTRAP(1-440) | YGGPYHIGGSPFK | 30
Filamin A | GST-hSTRAP(1-440) | FVPAEMGTHTVSVK | 44
Filamin A | GST-hSTRAP(1-440) | FADQHVPGSPFSVK | 23
Filamin A | GST-hSTRAP(1-440) | AEAGVPAEFSIWTR | 30
Filamin A | GST-hSTRAP(1-440) | SPFSVAVSPSLDLSK | 43
Filamin A | GST-hSTRAP(1-440) | VTYTPMAPGSYLISIK | 27
Filamin A | GST-hSTRAP(1-440) | TFSVWYVPEVTGTHK | 26
Filamin A | GST-hSTRAP(1-440) | QMQLENVSVALEFLDR | 39
Filamin A | GST-hSTRAP(1-440) | IPEISIQDMTAQVTSPSGK | 51
Filamin A | GST-hSTRAP(1-440) | LVSNHSLHETSSVFVDSLTK | 60
Filamin A | GST-hSTRAP(1-440) | YTPVQQGPVGVNVTYGGDPIPK | 44
Filamin A | GST-hSTRAP(1-440) | GAGSYTIMVLFADQATPTSPIR | 47
Filamin A | GST-hSTRAP(1-440) | SADFVVEAIGDDVGTLGFSVEGPSQAK | 57
Filamin A | GST-hSTRAP(1-440) | IANLQTDLSDGLR | 60
Filamin A | GST-hSTRAP(1-440) | AFGPGLQGGSAGSPAR | 50
Filamin A | GST-hSTRAP(1-440) | GAGTGGLGLAVEGPSEAK | 36
Filamin A | GST-hSTRAP(1-440) | VTAQGPGLEPSGNIANK | 35
Filamin A | GST-hSTRAP(1-440) | VANPSGNLTETYVQDR | 34
Filamin A | His-hSTRAP(1-219) | AWGPGLEGGVVGK | 38
Filamin A | His-hSTRAP(1-219) | IANLQTDLSDGLR | 52
Filamin A | His-hSTRAP(1-219) | AYGPGIEPTGNMVK | 23
Filamin A | His-hSTRAP(1-219) | ANLPQSFQVDTSK | 62
Filamin A | His-hSTRAP(1-219) | SPFSVAVSPSLDLSK | 24
Filamin A | His-hSTRAP(1-219) | GAGTGGLGLAVEGPSEAK | 27
Filamin A | His-hSTRAP(1-219) | EGPYSISVLYGDEEVPR | 39
Filamin A | His-hSTRAP(1-219) | DAGEGLLAVQITDPEGKPK | 39
Filamin A | His-hSTRAP(1-219) | QMQLENVSVALEFLDRESIK | 23
Filamin A | His-hSTRAP(1-219) | GLVEPVDVVDNADGTQTVNYVPSR | 23
Filamin A | His-hSTRAP(1-219) | SADFVVEAIGDDVGTLGFSVEGPSQAK | 23
Filamin A | His-hSTRAP(151-284) | LTVSSLQESGLK | 30
Filamin A | His-hSTRAP(151-284) | GKLDVQFSGLTK | 34
Filamin A | His-hSTRAP(151-284) | YGGDEIPFSPYR | 32
Filamin A | His-hSTRAP(151-284) | IANLQTDLSDGLR | 32
Filamin A | His-hSTRAP(151-284) | ANLPQSFQVDTSK | 20
Filamin A | His-hSTRAP(151-284) | FADQHVPGSPFSVK | 37
Filamin A | His-hSTRAP(151-284) | SPFSVAVSPSLDLSK | 42
Filamin A | His-hSTRAP(151-284) | EGPYSISVLYGDEEVPR | 45
Filamin A | His-hSTRAP(151-284) | YTPVQQGPVGVNVTYGGDPIPK | 59
Filamin A | His-hSTRAP(151-284) | GLVEPVDVVDNADGTQTVNYVPSR | 28
Filamin A | His-hSTRAP(151-284) | AFGPGLQGGSAGSPAR | 36
Filamin A | His-hSTRAP(151-284) | VANPSGNLTETYVQDR | 49
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | YISPDQLADLYK | 40
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | LAQANGWGVMVSHR | 49
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | VVIGMDVAASEFFR | 84
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 74
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 47
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | FTASAGIQVVGDDLTVTNPK | 80
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | DYPVVSIEDPFDQDDWGAWQK | 28
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | DATNVGDEGGFAPNILENKEGLELLK | 25
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | VLITTDLLAR | 35
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | GFKDQIYDIFQK.L | 68
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | MFVLDEADEMLSR | 84
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | LQMEAPHIIVGTPGR | 46
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | GIYAYGFEKPSAIQQR | 41
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | GIDVQQVSLVINYDLPTNR | 55
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | LNSNTQVVLLSATMPSDVLEVTK | 37
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | LQMEAPHIIVGTPGR | 45
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | GIYAYGFEKPSAIQQR | 45
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | GYDVIAQAQSGTGK | 81
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | MFVLDEADEMLSR | 84
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | LQMEAPHIIVGTPGR | 64
Eukaryotic initiation factor 4A-I | GST-hSTRAP(1-440) | GIYAYGFEKPSAIQQR | 34
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | GIYAYGFEKPSAIQQR | 28
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | GIDVQQVSLVINYDLPTNR | 78
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | LNSNTQVVLLSATMPSDVLEVTK | 36
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | MFVLDEADEMLSR | 83
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | LQMEAPHIIVGTPGR | 62
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | LNSNTQVVLLSATMPSDVLEVTK | 24
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | MFVLDEADEMLSR | 84
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | GIYAYGFEKPSAIQQR | 21
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-219) | DIETFYNTSIEEMPLNVADLI | 24
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | GIDVQQVSLVINYDLPTNR | 82
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | DIETFYNTSIEEMPLNVADLI. | 12
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | LNSNTQVVLLSATMPSDVLEVTK | 22
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | MFVLDEADEMLSR | 84
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | LQMEAPHIIVGTPGR | 74
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | GIYAYGFEKPSAIQQR | 27
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | DIETFYNTSIEEMPLNVADLI | 33
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | GYDVIAQAQSGTGK | 56
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | MFVLDEADEMLSR | 79
Eukaryotic initiation factor 4A-I | His-hSTRAP(1-150) | LQMEAPHIIVGTPGR.V | 55
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | TVVTGIEMFHK.S | 32
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | QIGVEHVVVYVNK | 43
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | LLDAVDTYIPVPAR | 72
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | GITINAAHVEYSTAAR | 70
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | TIGTGLVTNTLAMTEEEK | 61
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | DLEKPFLLPVEAVYSVPGR | 45
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | VEAQVYILSK | 56
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | AEAGDNLGALVR | 44
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | TVVTGIEMFHK | 47
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | LLDAVDTYIPVPAR | 38
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | GITINAAHVEYSTAAR | 69
Tu translation elongation factor, mitochondrial precursor | GST-hSTRAP(1-440) | K.ADAVQDSEMVELVELEIR | 53
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-219) | K.LLDAVDTYIPVPAR | 70
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-219) | R.GITINAAHVEYSTAAR | 53
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | R.TVVTGIEMFHK | 44
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | GITINAAHVEYSTAAR | 70
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | TIGTGLVTNTLAMTEEEK | 46
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | AEAGDNLGALVR | 76
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | TVVTGIEMFHK | 45
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | LLDAVDTYIPVPAR | 42
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | GITINAAHVEYSTAAR | 68
Tu translation elongation factor, mitochondrial precursor | His-hSTRAP(1-150) | TIGTGLVTNTLAMTEEEK | 63
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | QLLQANPILEAFGNAK | 75
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | LLLQVESLTTELSAER | 70
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | VIQYLAHVASSPK | 29
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | LMATLSNTNPSFVR | 45
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | QLLQANPILEAFGNAK | 63
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | INFDVAGYIVGANIETYLLEK | 27
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | LQQLFNHTMFVLEQEEYQR | 26
Myosin, heavy chain 14 isoform 1 | GST-hSTRAP(1-440) | DLGEELEALRGELEDTLDSTNAQQELR | 59
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(1-440) | AQAELENVSGALNEAESK | 93.3
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(1-440) | ELEDVTESAESMNR | 87
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(1-440) | FLTNGPSSSPGQER | 56.1
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(1-440) | LAQAEEQLEQETR | 57.9
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(1-440) | QLLQANPILEAFGNAK | 87.9
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | AQAELENVSGALNEAESK | 93.2
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | AQELQKVQELQQQSAR | 53.1
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | KFDQLLAEEK | 59.7
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | KQELELVVSELEAR | 82.7
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | LAQAEEQLEQETR | 61.3
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | LLLQVESLTTELSAER | 91.5
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | LQEELAASDR | 60.7
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | QLLQANPILEAFGNAK | 69.4
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | VAEQAANDLR | 78.6
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | AQAELENVSGALNEAESK | 86.9
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | EAQAALAEAQEDLESER | 79.9
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | GPSAGGGPGSGTSPQVEWTAR | 93.9
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | KFDQLLAEEK | 60.7
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | LAQAEEQLEQETR | 63
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | NTDQATMPDNTAAQK | 55
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | VAQLEEER | 56.5
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | AELSSLQTAR | 67.2
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | AQAELENVSGALNEAESK | 78.7
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | AQELQKVQELQQQSAR | 88.8
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | AQVTELEDELTAAEDAK | 104
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | EAQAALAEAQEDLESER | 78.4
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | ELEDVTESAESMNR | 87.7
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | EQADFALEALAK | 57.7
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | FLTNGPSSSPGQER | 63.8
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | GPSAGGGPGSGTSPQVEWTAR | 101
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | HGQALGELAEQLEQAR | 74.3
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | KQELELVVSELEAR | 86.8
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | LAEFSSQAAEEEEKVK | 58.4
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | LKEVVLQVEEER | 55.5
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | LMATLSNTNPSFVR | 85
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | NTDQATMPDNTAAQK | 77.1
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | RLELQLQEVQGR | 64
Myosin, heavy chain 14 isoform 1 | His-hSTRAP(285-440) | RQEEEAGALEAGEEAR | 58.8
Tubulin | GST-hSTRAP(1-440) | AVLVDLEPGTMDSVR | 59.2
Tubulin | GST-hSTRAP(1-440) | GHYTEGAELVDSVLDVVRK | 53.7
Tubulin | GST-hSTRAP(1-440) | ISEQFTAMFR | 73.7
Tubulin | GST-hSTRAP(1-440) | LAVNMVPFPR | 59.9
Tubulin | GST-hSTRAP(1-440) | HGRYLTVAAVFR | 55
Tubulin | GST-hSTRAP(1-440) | ISEQFTAMFR | 68.6
Tubulin | GST-hSTRAP(1-440) | ALTVPELTQQMFDAK | 66.9
Tubulin | GST-hSTRAP(1-440) | IMNTFSVVPSPK | 61.2
Tubulin | GST-hSTRAP(1-440) | ISEQFTAMFR | 74.7
Tubulin | GST-hSTRAP(1-440) | LAVNMVPFPR | 63.3
Tubulin | GST-hSTRAP(1-440) | LHFFMPGFAPLTSR | 59.5
Tubulin | His-hSTRAP(1-440) | HGRYLTVAAVFR | 54.3
Tubulin | His-hSTRAP(1-440) | ISEQFTAMFR | 68.5
Tubulin | His-hSTRAP(1-440) | EVDEQMLNVQNK | 79.6
Tubulin | His-hSTRAP(1-440) | GHYTEGAELVDSVLDVVR | 109
Tubulin | His-hSTRAP(1-440) | IMNTFSVVPSPK | 70
Tubulin | His-hSTRAP(1-440) | INVYYNEATGGKYVPR | 74.4
Tubulin | His-hSTRAP(1-440) | LAVNMVPFPR | 59
Tubulin | His-hSTRAP(1-440) | INVYYNEATGGKYVPR | 104
Tubulin | His-hSTRAP(1-440) | LAVNMVPFPR | 58.4
Tubulin | His-hSTRAP(1-219) | AVLVDLEPGTMDSVR | 59.1
Tubulin | His-hSTRAP(1-219) | ISEQFTAMFR | 69.3
Tubulin | His-hSTRAP(1-219) | MSATFIGNSTAIQELFKR | 56.8
Tubulin | His-hSTRAP(1-219) | ALTVPELTQQMFDAK | 61.6
Tubulin | His-hSTRAP(1-219) | ISEQFTAMFR | 70.1
Tubulin | His-hSTRAP(1-219) | LAVNMVPFPR | 64
Tubulin | His-hSTRAP(1-219) | YLTVAAVFR | 62
Tubulin | His-hSTRAP(1-219) | ISEQFTAMFR | 60.8
Tubulin | His-hSTRAP(1-219) | KLAVNMVPFPR | 61.8
Tubulin | His-hSTRAP(1-150) | HGRYLTVAAVFR | 61.8
Tubulin | His-hSTRAP(1-150) | ISEQFTAMFR | 69.3
Tubulin | His-hSTRAP(1-150) | YLTVAAVFR | 61.5
Tubulin | His-hSTRAP(1-150) | ISEQFTAMFR | 68.8
Tubulin | His-hSTRAP(1-150) | KLAVNMVPFPR | 71.4
Tubulin | His-hSTRAP(1-150) | YLTVAAVFR | 61.4
Tubulin | His-hSTRAP(1-150) | GHYTEGAELVDSVLDVVR | 97
Tubulin | His-hSTRAP(1-150) | ISEQFTAMFR | 69.4
Tubulin | His-hSTRAP(1-150) | SGPFGQIFRPDNFVFGQSGAGNNWAK | 72.4
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | SDPGLLTNTMDVFVK | 45
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | MDPMNIWDDIITNR | 26
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | TVGALQVLGTEAQSSLLK | 39
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | LLLQGEADQSLLTFIDK | 80
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | NLDLAVLELMQSSVDNTK | 53
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | STVLTPMFVETQASQGTLQTR | 34
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | HSSLITPLQAVAQR | 22
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | TVGALQVLGTEAQSSLLK | 40
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | LLLQGEADQSLLTFIDK | 46
DNA-dependent protein kinase catalytic subunit | GST-hSTRAP(1-440) | DVLIQGLIDENPGLQLIIR | 34
DNA-dependent protein kinase catalytic subunit | His-hSTRAP(1-150) | LGASLAFNNIYR | 29
DNA-dependent protein kinase catalytic subunit | His-hSTRAP(1-150) | NLLIFENLIDLK | 36
DNA-dependent protein kinase catalytic subunit | His-hSTRAP(1-150) | HSSLITPLQAVAQR | 35
DNA-dependent protein kinase catalytic subunit | His-hSTRAP(1-150) | TVSLLDENNVSSYLSK | 49
DNA-dependent protein kinase catalytic subunit | His-hSTRAP(1-150) | TVGALQVLGTEAQSSLLK | 62
DNA-dependent protein kinase catalytic subunit | His-hSTRAP(1-150) | LLLQGEADQSLLTFIDK | 68
DNA-dependent protein kinase catalytic subunit | His-hSTRAP(1-150) | DVLIQGLIDENPGLQLIIR | 51
Actin | GST-hSTRAP(1-440) | EITALAPSTMK | 51
Actin | GST-hSTRAP(1-440) | HQGVMVGMGQK | 36
Actin | GST-hSTRAP(1-440) | AVFPSIVGRPR | 35
Actin | GST-hSTRAP(1-440) | IWHHTFYNELR | 38
Actin | GST-hSTRAP(1-440) | QEYDESGPSIVHR | 43
Actin | GST-hSTRAP(1-440) | SYELPDGQVITIGNER | 76
Actin | GST-hSTRAP(1-440) | VAPEEHPVLLTEAPLNPK | 40
Actin | GST-hSTRAP(1-440) | DLYANTVLSGGTTMYPGIADR | 70
Actin | GST-hSTRAP(1-440) | TTGIVMDSGDGVTHTVPIYEGYALPHAILR | 64
Actin | His-hSTRAP(1-440) | EITALAPSTMK | 46
Actin | His-hSTRAP(1-440) | SYELPDGQVITIGNER | 72
Actin | His-hSTRAP(1-440) | DLYANTVLSGGTTMYPGIADR | 69
Actin | His-hSTRAP(1-440) | SYELPDGQVITIGNER | 69
Actin | His-hSTRAP(1-440) | VAPEEHPVLLTEAPLNPK | 76
Actin | His-hSTRAP(1-440) | DLYANTVLSGGTTMYPGIADR | 71
Actin | His-hSTRAP(1-440) | SYELPDGQVITIGNER | 69
Actin | His-hSTRAP(1-440) | VAPEEHPVLLTEAPLNPK | 62
Actin | His-hSTRAP(1-440) | DLYANTVLSGGTTMYPGIADR | 66
Actin | His-hSTRAP(1-219) | SYELPDGQVITIGNER | 63
Actin | His-hSTRAP(1-219) | VAPEEHPVLLTEAPLNPK | 59
Actin | His-hSTRAP(1-219) | DLYANTVLSGGTTMYPGIADR | 66
Actin | His-hSTRAP(1-219) | TTGIVMDSGDGVTHTVPIYEGYALPHAILR | 32
Actin | His-hSTRAP(1-219) | EITALAPSTMK | 46
Actin | His-hSTRAP(1-219) | AVFPSIVGRPR | 38
Actin | His-hSTRAP(1-219) | IWHHTFYNELR | 41
Actin | His-hSTRAP(1-219) | SYELPDGQVITIGNER | 72
Actin | His-hSTRAP(1-219) | VAPEEHPVLLTEAPLNPK | 31
Actin | His-hSTRAP(1-219) | DLYANTVLSGGTTMYPGIADR | 69
Actin | His-hSTRAP(1-150) | AVFPSIVGRPR | 61
Actin | His-hSTRAP(1-150) | IWHHTFYNELR | 47
Actin | His-hSTRAP(1-150) | SYELPDGQVITIGNER | 56
Actin | His-hSTRAP(1-150) | VAPEEHPVLLTEAPLNPK | 64
Actin | His-hSTRAP(1-150) | DLYANTVLSGGTTMYPGIADR | 65
Actin | His-hSTRAP(1-150) | TTGIVMDSGDGVTHTVPIYEGYALPHAILR | 111
Actin | His-hSTRAP(1-150) | SYELPDGQVITIGNER | 64
Actin | His-hSTRAP(1-150) | VAPEEHPVLLTEAPLNPK | 36
Actin | His-hSTRAP(1-150) | DLYANTVLSGGTTMYPGIADR | 55
Actin | His-hSTRAP(1-150) | EITALAPSTMK | 46
Actin | His-hSTRAP(1-150) | HQGVMVGMGQK | 50
Actin | His-hSTRAP(1-150) | AVFPSIVGRPR | 31
Actin | His-hSTRAP(1-150) | IWHHTFYNELR | 37
Actin | His-hSTRAP(1-150) | QEYDESGPSIVHR | 39
Actin | His-hSTRAP(1-150) | SYELPDGQVITIGNER | 69
Actin | His-hSTRAP(1-150) | VAPEEHPVLLTEAPLNPK | 76
Actin | His-hSTRAP(1-150) | DLYANTVLSGGTTMYPGIADR | 71
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | IGGIGTVPVGR | 60
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | EHALLAYTLGVK | 54
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | YYVTIIDAPGHR | 49
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | THINIVVIGHVDSGK | 89
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | VETGVLKPGMVVTFAPVNVTTEVK | 39
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | LPLQDVYK | 47
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | IGGIGTVPVGR | 57
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | EHALLAYTLGVK | 55
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | YYVTIIDAPGHR | 55
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | THINIVVIGHVDSGK | 83
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | VETGVLKPGMVVTFAPVNVTTEVK | 42
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | LPLQDVYK | 45
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | IGGIGTVPVGR | 55
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | EHALLAYTLGVK | 51
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | YYVTIIDAPGHR | 50
Elongation factor 1-alpha 1 | GST-hSTRAP(1-440) | VETGVLKPGMVVTFAPVNVTTEVK | 43
Elongation factor 1-alpha 1 | His-hSTRAP(1-440) | EHALLAYTLGVK | 68
Elongation factor 1-alpha 1 | His-hSTRAP(1-440) | IGGIGTVPVGR | 55
Elongation factor 1-alpha 1 | His-hSTRAP(1-440) | YYVTIIDAPGHR | 50
Elongation factor 1-alpha 1 | His-hSTRAP(1-440) | IGGIGTVPVGR | 66
Elongation factor 1-alpha 1 | His-hSTRAP(1-440) | YYVTIIDAPGHR | 67
Elongation factor 1-alpha 1 | His-hSTRAP(1-440) | EHALLAYTLGVK | 66
Elongation factor 1-alpha 1 | His-hSTRAP(1-440) | VETGVLKPGMVVTFAPVNVTTEVK | 59
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | LPLQDVYK | 42
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | IGGIGTVPVGR | 60
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | EHALLAYTLGVK | 57
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | YYVTIIDAPGHR | 55
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | THINIVVIGHVDSGK | 97
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | VETGVLKPGMVVTFAPVNVTTEVK | 27
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | IGGIGTVPVGR | 48
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | EHALLAYTLGVK | 42
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | YYVTIIDAPGHR | 58
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | VETGVLKPGMVVTFAPVNVTTEVK | 45
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | LPLQDVYK | 45
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | IGGIGTVPVGR | 55
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | EHALLAYTLGVK | 68
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | YYVTIIDAPGHR | 50
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | EHALLAYTLGVK | 49
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | YYVTIIDAPGHR | 68
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | THINIVVIGHVDSGK | 80
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | VETGVLKPGMVVTFAPVNVTTEVK | 41
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | LPLQDVYK | 43
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | IGGIGTVPVGR | 52
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | EHALLAYTLGVK | 52
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | YYVTIIDAPGHR | 60
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | LPLQDVYK | 45
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | IGGIGTVPVGR | 66
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | EHALLAYTLGVK | 56
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | YYVTIIDAPGHR | 67
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | VETGVLKPGMVVTFAPVNVTTEVK | 34
Fatty acid synthase | GST-hSTRAP(1-440) | EVWALVQAGIR | 72
Fatty acid synthase | GST-hSTRAP(1-440) | FDASFFGVHPK | 44
Fatty acid synthase | GST-hSTRAP(1-440) | LQVVDQPLPVR | 40
Fatty acid synthase | GST-hSTRAP(1-440) | DNLEFFLAGIGR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | EGGFLLLHTLLR | 43
Fatty acid synthase | GST-hSTRAP(1-440) | VLQGDLVMNVYR | 61
Fatty acid synthase | GST-hSTRAP(1-440) | SLLVNPEGPTLMR | 33
Fatty acid synthase | GST-hSTRAP(1-440) | FPQLDSTSFANSR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | VVEVLAGHGHLYSR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | VVVQVLAEEPEAVLK | 61
Fatty acid synthase | GST-hSTRAP(1-440) | GILADEDSSRPVWLK | 37
Fatty acid synthase | GST-hSTRAP(1-440) | WTSQDSLLGMEFSGR | 51
Fatty acid synthase | GST-hSTRAP(1-440) | DTSFEQHVLWHTGGK | 34
Fatty acid synthase | GST-hSTRAP(1-440) | LPEDPLLSGLLDSPALK | 106
Fatty acid synthase | GST-hSTRAP(1-440) | QGVQVQVSTSNISSLEGAR | 31
Fatty acid synthase | GST-hSTRAP(1-440) | WLSTSIPEAQWHSSLAR | 39
Fatty acid synthase | GST-hSTRAP(1-440) | RPTPQDSPIFLPVDDTSFR | 29
Fatty acid synthase | GST-hSTRAP(1-440) | DTVTISGPQAPVFEFVEQLR | 68
Fatty acid synthase | GST-hSTRAP(1-440) | LLLEVTYEAIVDGGINPDSLR | 64
Fatty acid synthase | GST-hSTRAP(1-440) | DTVTISGPQAPVFEFVEQLRK | 63
Fatty acid synthase | GST-hSTRAP(1-440) | GYDYGPHFQGILEASLEGDSGR | 103
Fatty acid synthase | GST-hSTRAP(1-440) | NVTFHGVLLDAFFNESSADWR | 36
Fatty acid synthase | GST-hSTRAP(1-440) | SLYQSAGVAPESFEYIEAHGTGTK | 45
Fatty acid synthase | GST-hSTRAP(1-440) | HFLLEEDKPEEPTAHAFVSTLTR | 73
Fatty acid synthase | GST-hSTRAP(1-440) | LQVVDQPLPVR | 50
Fatty acid synthase | GST-hSTRAP(1-440) | VLQGDLVMNVYR | 52
Fatty acid synthase | GST-hSTRAP(1-440) | SLLVNPEGPTLMR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | FPQLDSTSFANSR | 67
Fatty acid synthase | GST-hSTRAP(1-440) | EDGLAQQQTQLNLR | 60
Fatty acid synthase | GST-hSTRAP(1-440) | VVVQVLAEEPEAVLK | 47
Fatty acid synthase | GST-hSTRAP(1-440) | WTSQDSLLGMEFSGR | 48
Fatty acid synthase | GST-hSTRAP(1-440) | LPEDPLLSGLLDSPALK | 30
Fatty acid synthase | GST-hSTRAP(1-440) | SLYQSAGVAPESFEYIEAHGTGTK | 60
Fatty acid synthase | His-hSTRAP(1-219) | EGGFLLLHTLLR | 36
Fatty acid synthase | His-hSTRAP(1-219) | VVEVLAGHGHLYSR | 48
Fatty acid synthase | His-hSTRAP(1-219) | VVVQVLAEEPEAVLK | 50
Fatty acid synthase | His-hSTRAP(1-219) | WLSTSIPEAQWHSSLAR | 40
Fatty acid synthase | His-hSTRAP(1-219) | LHLSGIDANPNALFPPVEFPAPR | 30
Fatty acid synthase | His-hSTRAP(1-219) | LQVVDQPLPVR | 50
Fatty acid synthase | His-hSTRAP(1-219) | DNLEFFLAGIGR | 76
Fatty acid synthase | His-hSTRAP(1-219) | MVVPGLDGAQIPR | 40
Fatty acid synthase | His-hSTRAP(1-219) | GVDLVLNSLAEEK | 69
Fatty acid synthase | His-hSTRAP(1-219) | SLLVNPEGPTLMR | 35
Fatty acid synthase | His-hSTRAP(1-219) | FPQLDSTSFANSR | 60
Fatty acid synthase | His-hSTRAP(1-219) | EDGLAQQQTQLNLR | 48
Fatty acid synthase | His-hSTRAP(1-219) | VVVQVLAEEPEAVLK | 56
Fatty acid synthase | His-hSTRAP(1-219) | WTSQDSLLGMEFSGR | 82
Fatty acid synthase | His-hSTRAP(1-219) | DTSFEQHVLWHTGGK | 35
Fatty acid synthase | His-hSTRAP(1-219) | LPEDPLLSGLLDSPALK | 102
Fatty acid synthase | His-hSTRAP(1-219) | EQGVTFPSGDIQEQLIR | 29
Fatty acid synthase | His-hSTRAP(1-219) | QGVQVQVSTSNISSLEGAR | 61
Fatty acid synthase | His-hSTRAP(1-219) | LLLEVTYEAIVDGGINPDSLR | 42
Fatty acid synthase | His-hSTRAP(1-219) | DTVTISGPQAPVFEFVEQLRK | 26
Fatty acid synthase | His-hSTRAP(1-219) | MVVPGLDGAQIPRDPSQQELPR | 31
Fatty acid synthase | His-hSTRAP(1-219) | SLYQSAGVAPESFEYIEAHGTGTK | 90
Fatty acid synthase | His-hSTRAP(1-219) | HFLLEEDKPEEPTAHAFVSTLTR | 27
Fatty acid synthase | His-hSTRAP(151-284) | FDASFFGVHPK | 30
Fatty acid synthase | His-hSTRAP(151-284) | LQVVDQPLPVR | 50
Fatty acid synthase | His-hSTRAP(151-284) | DNLEFFLAGIGR | 67
Fatty acid synthase | His-hSTRAP(151-284) | EGGFLLLHTLLR | 37
Fatty acid synthase | His-hSTRAP(151-284) | DLVEAVAHILGIR | 48
Fatty acid synthase | His-hSTRAP(151-284) | FPQLDSTSFANSR | 46
Fatty acid synthase | His-hSTRAP(151-284) | VVEVLAGHGHLYSR | 55
Fatty acid synthase | His-hSTRAP(151-284) | VVVQVLAEEPEAVLK | 58
Fatty acid synthase | His-hSTRAP(151-284) | GILADEDSSRPVWLK | 42
Fatty acid synthase | His-hSTRAP(151-284) | DTSFEQHVLWHTGGK | 32
Fatty acid synthase | His-hSTRAP(151-284) | LPEDPLLSGLLDSPALK | 90
Fatty acid synthase | His-hSTRAP(151-284) | VTVAGGVHISGLHTESAPR | 58
Fatty acid synthase | His-hSTRAP(151-284) | WLSTSIPEAQWHSSLAR | 45
Fatty acid synthase | His-hSTRAP(151-284) | RPTPQDSPIFLPVDDTSFR | 40
Fatty acid synthase | His-hSTRAP(151-284) | IPGLLSPHPLLQLSYTATDR | 55
Fatty acid synthase | His-hSTRAP(151-284) | DTVTISGPQAPVFEFVEQLR | 33
Fatty acid synthase | His-hSTRAP(151-284) | LLLEVTYEAIVDGGINPDSLR | 71
Fatty acid synthase | His-hSTRAP(151-284) | DTVTISGPQAPVFEFVEQLRK | 79
Fatty acid synthase | His-hSTRAP(151-284) | GYDYGPHFQGILEASLEGDSGR | 99
Fatty acid synthase | His-hSTRAP(151-284) | NVTFHGVLLDAFFNESSADWR | 42
Fatty acid synthase | His-hSTRAP(151-284) | LHLSGIDANPNALFPPVEFPAPR | 72
Fatty acid synthase | His-hSTRAP(151-284) | SLYQSAGVAPESFEYIEAHGTGTK | 32
Fatty acid synthase | His-hSTRAP(151-284) | HFLLEEDKPEEPTAHAFVSTLTR | 95
Fatty acid synthase | His-hSTRAP(151-284) | ALGLGVEQLPVVFEDVVLHQATILPK | 42
Fatty acid synthase | His-hSTRAP(151-284) | LFDHPESPTPNPTEPLFLAQAEVYK | 41
Fatty acid synthase | His-hSTRAP(151-284) | FDASFFGVHPK | 38
Fatty acid synthase | His-hSTRAP(151-284) | LQVVDQPLPVR | 57
Fatty acid synthase | His-hSTRAP(151-284) | GVDLVLNSLAEEK | 49
Fatty acid synthase | His-hSTRAP(151-284) | SLLVNPEGPTLMR | 37
Fatty acid synthase | His-hSTRAP(151-284) | VVVQVLAEEPEAVLK | 62
Fatty acid synthase | His-hSTRAP(151-284) | GILADEDSSRPVWLK | 37
Fatty acid synthase | His-hSTRAP(151-284) | WTSQDSLLGMEFSGR | 76
Fatty acid synthase | His-hSTRAP(151-284) | DTSFEQHVLWHTGGK | 35
Fatty acid synthase | His-hSTRAP(151-284) | LPEDPLLSGLLDSPALK | 35
Fatty acid synthase | His-hSTRAP(151-284) | EQGVTFPSGDIQEQLIR | 51
Fatty acid synthase | His-hSTRAP(151-284) | QGVQVQVSTSNISSLEGAR | 67
Fatty acid synthase | His-hSTRAP(151-284) | GTHTGVWVGVSGSETSEALSR | 32
Fatty acid synthase | His-hSTRAP(151-284) | LLLEVTYEAIVDGGINPDSLR | 30
Fatty acid synthase | His-hSTRAP(151-284) | SDEAVKPFGLK | 43
Fatty acid synthase | His-hSTRAP(151-284) | LQVVDQPLPVR | 59
Fatty acid synthase | His-hSTRAP(151-284) | DNLEFFLAGIGR | 49
Fatty acid synthase | His-hSTRAP(151-284) | GVDLVLNSLAEEK | 68
Fatty acid synthase | His-hSTRAP(151-284) | VLQGDLVMNVYR | 66
Fatty acid synthase | His-hSTRAP(151-284) | SLLVNPEGPTLMR | 65
Fatty acid synthase | His-hSTRAP(151-284) | VVEVLAGHGHLYSR | 43
Fatty acid synthase | His-hSTRAP(151-284) | EDGLAQQQTQLNLR | 64
Fatty acid synthase | His-hSTRAP(151-284) | VVVQVLAEEPEAVLK | 60
Fatty acid synthase | His-hSTRAP(151-284) | SNMGHPEPASGLAALAK | 84
Fatty acid synthase | His-hSTRAP(151-284) | WTSQDSLLGMEFSGR | 76
Fatty acid synthase | His-hSTRAP(151-284) | DTSFEQHVLWHTGGK | 53
Fatty acid synthase | His-hSTRAP(151-284) | GNAGQSNYGFANSAMER | 62
Fatty acid synthase | His-hSTRAP(151-284) | LPEDPLLSGLLDSPALK | 98
Fatty acid synthase | His-hSTRAP(151-284) | EQGVTFPSGDIQEQLIR | 45
Fatty acid synthase | His-hSTRAP(151-284) | QGVQVQVSTSNISSLEGAR | 78
Fatty acid synthase | His-hSTRAP(151-284) | GYDYGPHFQGILEASLEGDSGR | 50
Alpha enolase | GST-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 88.7
Alpha enolase | GST-hSTRAP(1-440) | HIADLAGNSEVILPVPAFNVINGGSHAGNK | 66.4
Alpha enolase | GST-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 69.3
Alpha enolase | GST-hSTRAP(1-440) | VNQIGSVTESLQACK | 71.1
Alpha enolase | GST-hSTRAP(1-440) | YISPDQLADLYK | 66.9
Alpha enolase | GST-hSTRAP(1-440) | YISPDQLADLYK | 40
Alpha enolase | GST-hSTRAP(1-440) | LAQANGWGVMVSHR | 49
Alpha enolase | GST-hSTRAP(1-440) | VVIGMDVAASEFFR | 84
Alpha enolase | GST-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 74
Alpha enolase | GST-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 47
Alpha enolase | GST-hSTRAP(1-440) | FTASAGIQVVGDDLTVTNPK | 80
Alpha enolase | GST-hSTRAP(1-440) | YISPDQLADLYK | 45
Alpha enolase | GST-hSTRAP(1-440) | LAQANGWGVMVSHR | 70
Alpha enolase | GST-hSTRAP(1-440) | VVIGMDVAASEFFR | 112
Alpha enolase | GST-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 77
Alpha enolase | GST-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 56
Alpha enolase | GST-hSTRAP(1-440) | FTASAGIQVVGDDLTVTNPK | 86
Alpha enolase | GST-hSTRAP(1-440) | DYPVVSIEDPFDQDDWGAWQK | 69
Alpha enolase | GST-hSTRAP(1-440) | DATNVGDEGGFAPNILENKEGLELLK | 65
Alpha enolase | His-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 91.3
Alpha enolase | His-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 63.5
Alpha enolase | His-hSTRAP(1-440) | LAQANGWGVMVSHR | 51.5
Alpha enolase | His-hSTRAP(1-440) | VNQIGSVTESLQACK | 95.7
Alpha enolase | His-hSTRAP(1-440) | YNQLLRIEEELGSK | 59.8
Alpha enolase | His-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 53.4
Alpha enolase | His-hSTRAP(1-440) | VNQIGSVTESLQACK | 95.4
Alpha enolase | His-hSTRAP(1-219) | HIADLAGNSEVILPVPAFNVINGGSHAGNK | 64.9
Alpha enolase | His-hSTRAP(1-219) | TIAPALVSK | 30.1
Alpha enolase | His-hSTRAP(1-219) | VNQIGSVTESLQACK | 105
Alpha enolase | His-hSTRAP(1-219) | LMIEMDGTENK | 32
Alpha enolase | His-hSTRAP(1-219) | YISPDQLADLYK | 47
Alpha enolase | His-hSTRAP(1-219) | VVIGMDVAASEFFR | 113
Alpha enolase | His-hSTRAP(1-219) | LAMQEFMILPVGAANFR | 74
Alpha enolase | His-hSTRAP(1-219) | DATNVGDEGGFAPNILENK | 58
Alpha enolase | His-hSTRAP(1-219) | FTASAGIQVVGDDLTVTNPK | 77
Alpha enolase | His-hSTRAP(1-219) | GNPTVEVDLFTSK | 40
Alpha enolase | His-hSTRAP(1-219) | YISPDQLADLYK | 54
Alpha enolase | His-hSTRAP(1-219) | VVIGMDVAASEFFR | 70
Alpha enolase | His-hSTRAP(1-219) | AAVPSGASTGIYEALELR | 87
Alpha enolase | His-hSTRAP(1-219) | FTASAGIQVVGDDLTVTNPK | 71
Alpha enolase | His-hSTRAP(1-219) | DYPVVSIEDPFDQDDWGAWQK | 70
Alpha enolase | His-hSTRAP(1-219) | DATNVGDEGGFAPNILENKEGLELLK | 81
Alpha enolase | His-hSTRAP(1-150) | GNPTVEVDLFTSK | 77.2
Alpha enolase | His-hSTRAP(1-150) | LAMQEFMILPVGAANFR | 73
Alpha enolase | His-hSTRAP(1-150) | VNQIGSVTESLQACK | 66.7
Alpha enolase | His-hSTRAP(1-150) | IEEELGSKAK | 61.9
Alpha enolase | His-hSTRAP(1-150) | VNQIGSVTESLQACK | 88.9
Alpha enolase | His-hSTRAP(1-150) | AAVPSGASTGIYEALELR | 98.5
Alpha enolase | His-hSTRAP(1-150) | FTASAGIQVVGDDLTVTNPKR | 59.2
Alpha enolase | His-hSTRAP(1-150) | GNPTVEVDLFTSK | 81.4
Alpha enolase | His-hSTRAP(1-150) | HIADLAGNSEVILPVPAFNVINGGSHAGNK | 58.7
Alpha enolase | His-hSTRAP(1-150) | LAMQEFMILPVGAANFR | 67.8
Alpha enolase | His-hSTRAP(1-150) | YISPDQLADLYK | 63.3
This table lists all the peptide data associated with the mass spectrometry results presented in this thesis (Table 3.6). Peptides highlighted in black, red and green correspond to
the first (N=1), second (N=2) and third (N=3) repeats of the experiment, respectively, for the protein indicated in column 1 and the pull-down performed with the hSTRAP protein
variant indicated in column 2.