Biochemical, biophysical and interaction studies of the stress responsive protein hSTRAP

A thesis submitted to the University of Manchester for the degree of PhD: Molecular Cancer studies in the Faculty of Life Sciences

2013

Karishma Satia

Declaration

No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Copyright statement

The following four notes on copyright and the ownership of intellectual property rights must be included as written below:

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trademarks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://www.campus.manchester.ac.uk/medialibrary/policies/intellectualproperty.pdf), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s policy on presentation of Theses.

Acknowledgements

Firstly, I would like to thank all my supervisors, Dr Alexander Golovanov, Dr Costas Demonacos and Dr Stephen Prince, for their continual guidance throughout this PhD. I would also like to thank my adviser, Professor Andrew Doig, for all his support and advice on PhD-related issues and CD experiments. I would also like to thank Dr Marija Kristic Demonacos for her help with the molecular cloning aspect of this project, for letting me work on her bench space, and for providing me with the hSTRAP construct (pHA1-hSTRAP(1-440)) that was used in this project. In addition, I would like to thank Mrs Sandra Taylor, who helped with the molecular biology aspect of this PhD. I am also grateful to all the associated lab members, especially Dr Richard Turncliffe, who read my thesis and advised corrections, and who gave me advice throughout this project. I also thank Mrs Ilhem Berrou, Mrs Hajar Karimi Alavi and Mr Travis Leung for all their help during this project, especially Travis for providing me with MCF7 cells. I am also very thankful to Miss Preeti Kalra and Mr John Chipperfield for all their help and continual support throughout this PhD. I would also like to thank Dr James Birtley for all his advice regarding the molecular cloning aspect of this project.

I would also like to thank Dr Martin Read, Dr Christopher Storey and all the members of the Michel Smith Mass spectrometry facility, especially Dr David Knight and Emma-jayne Keevill, for their help with the mass spectrometry aspect of this PhD. I would also like to thank Dr Jean-Marc Schwartz for his help with all the bioinformatics aspects of this project. I would also like to thank all my friends and all the Satia family, but especially my brother, Dr Imran Satia, and my mum, Mrs Nasima Satia, as well as my nieces and nephews, who always make my day with a smile after a long day in the lab. If it wasn’t for my mum and brother I would not be in the great position I feel I am in, and hence I would like to dedicate my thesis to my Mum and Brother. I would also like to thank Miss Sharmin Naaz, Daphne Chen and Preeti Kalra, who have been such amazing friends and supported me emotionally through this PhD. Last but not least, I would like to thank the University of Manchester and BBSRC for providing me with the funding to carry out this PhD.
Abbreviations:

A, Alanine
ADP, Adenosine diphosphate
AIPL1, Aryl hydrocarbon receptor-interacting protein-like 1
APC, Anaphase Promoting Complex
AR, Androgen Receptor
Arp2/3, Actin-related protein 2/3
ATM, Ataxia-telangiectasia mutated
ATP, Adenosine triphosphate
ATR, Ataxia-telangiectasia and Rad3-related
Bax, Bcl-2-associated X protein
Bdp1, B Double Prime 1
BRCA1, Breast Cancer type 1 susceptibility protein
BRCA2, Breast Cancer type 2 susceptibility protein
Brf1, B Related Factor 1
Bub1, Budding uninhibited by benzimidazole 1
BubR1, Bub1 Related
CBP, CREB-binding protein
Cdc, Cell division cycle
CDH1, Cadherin
CDKs, Cyclin-dependent kinases
CFTR, Cystic Fibrosis Transmembrane conductance Regulator
CHIP, Carboxy terminus of Hsc70 interacting protein
Chk, Checkpoint kinase
CTPR3, Consensus TPR number of repeats
DMEM, Dulbecco's Modified Eagle Medium
DMSO, Dimethyl sulphoxide
DNA, Deoxyribonucleic acid
DTT, Dithiothreitol
E1A, Adenovirus early region 1A
E6-AP, E6-associated protein
EDTA, Ethylenediaminetetraacetic acid
ESI, Electrospray ionization
F, Phenylalanine
FA, Fanconi Anaemia
FANC, Fanconi Anaemia Group
FBS, Fetal bovine serum
FID, Free induction decay
FKBP52, Rabbit FK506 Binding Protein
FKBPs, FK506-binding proteins
FPLC, Fast protein liquid chromatography
G, Glycine
GABAA, Gamma-aminobutyric acid, type A
GFP, Green Fluorescent Protein
GR, Glucocorticoid receptor
GRIF-1, GABAA receptor interacting factor-1
GST, Glutathione S-transferase
GTFs, General Transcription factors
GTP, Guanosine-5'-triphosphate
HAT, Histone acetyltransferase
HBP21, HSP70 binding protein 21
HIF-1, Hypoxia Inducible Factor 1
HIP, Hsc70-interacting protein
HIV, Human immunodeficiency virus
HOP, Hsp70-Hsp90 Organizing Protein
HREs, Hypoxia Response Elements
HSF1, Heat shock transcription factor 1
Hsp, Heat Shock Proteins
HTS, High Throughput Screening
JMY, Junction Mediating and regulatory protein
L, Leucine
LB, Luria Broth
MAD, Multi-wavelength Anomalous Diffraction
MALDI, Matrix-assisted laser desorption/ionization
MAPKinase, Mitogen-activated protein kinase
MAS, Mitochondrial assembly
Mdm2, Mouse double minute 2
MIR, Multiple Isomorphous Replacement
MOM, Mitochondrial Outer Membrane
Mps1, Monopolar spindle 1
MR, Molecular Replacement
MS, Mass spectrometry
NaCl, Sodium Chloride
NADPH, Nicotinamide adenine dinucleotide phosphate
NMR, Nuclear Magnetic Resonance
NPFs, Nucleation Promoting Factors
NTP, Nucleoside triphosphate
OGT, O-GlcNAc transferase
OIPs, OGT-interacting proteins
P, Proline
p300, E1A binding protein p300
p58ipk, p58 inhibitor of protein kinase
PAGE, Polyacrylamide gel electrophoresis
PAS, Peroxisome assembly
PBD, Peroxisome Biogenesis Disorder
PBS, Phosphate buffered saline
PDZ, Post synaptic density protein (P), Drosophila disc large tumor suppressor (D), and zonula occludens-1 protein (Z)
PEG, Polyethylene glycol
PEX5, Peroxin 5
PHD domain, Plant Homeo Domain
PIC, Pre-initiation complex
PKR, Protein kinase R
PMSF, Phenylmethanesulfonyl fluoride
PP5, Protein Phosphatase 5
PPIase, Peptidylprolyl Isomerase
PPII, Polyproline Type II
PRMT, Protein arginine methyltransferase
Psc, Pseudomonas secretion
PTS1, Peroxisomal targeting signal 1
PXR1, Peroxisomal targeting signal import receptor
RAC, Ras-related C3 botulinum toxin substrate
RNA, Ribonucleic acid
SAD, Single wavelength anomalous diffraction
SDS, Sodium dodecyl sulfate
sGC, Soluble Guanylyl Cyclase
SGT, Small glutamine-rich protein
SH, Src Homology
SIR, Single Isomorphous Replacement
Srb7, Suppressor of RNA polymerase B 7
STRAP, Stress-responsive activator of p300
SWI/SNF, SWItch/Sucrose NonFermentable
TAE, Tris acetate EDTA
TAFs, TBP associated factors
TAP, Tandem affinity purification
TBP, TATA-binding protein
TEV, Tobacco etch virus
Tfc4, Transcription Factor Class C 4
TFIIIB, RNA polymerase III transcription factor
TFIIIC, RNA polymerase III transcription initiation factor complex
TOF, Time of flight
Tom20, Translocase of the outer membrane
TPR, Tetratricopeptide repeat
TSS, Transcription Start site
TTC4, Tetratricopeptide repeat domain 4
UBP, Vpu binding protein
UPP, Ubiquitin Proteasome Pathway
W, Tryptophan
WASP, Wiskott Aldrich Syndrome protein
WD40, Tryptophan-aspartic acid dipeptide
WH2, Wiskott Aldrich Syndrome protein (WASp)-homology 2
WISp39, WAF-1/CIP1 stabilizing protein 39
XAB2, XPA (Xeroderma pigmentosum, complementation group A) binding protein 2
XAP2, Hepatitis B virus X-associated protein 2
XRCC3, X-ray repair cross-complementing protein 3
Y, Tyrosine
ZZ domain, Zinc finger domain
α-SGT, Small glutamine-rich tetratricopeptide repeat containing protein alpha

Table of Contents

DECLARATION
COPYRIGHT STATEMENT
ACKNOWLEDGEMENTS
ABBREVIATIONS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
1. CHAPTER ONE. INTRODUCTION
1.1 Proteins
1.1.1 Importance of protein research
1.1.2 The importance of protein structure determination
1.1.3 Importance of characterizing protein-protein interactions
1.2 Various structural biology techniques
1.2.1 X-ray crystallography
1.2.2 NMR
1.2.3 Circular dichroism (CD)
1.2.4 Complementarities between different structural biology techniques
1.3 Mass spectrometry
1.4 Protein-protein interaction motifs
1.4.1 Structure
1.4.2 Structural stability and ligand specificity of protein interaction domains
1.5 Functions of various TPR proteins
1.5.1 TPR proteins involved in transcription
1.5.2 TPR proteins involved in the Stress Response Pathway
1.5.3 TPR proteins involved in Mitochondrial and Peroxisomal import
1.5.4 TPR proteins involved in the progression of the cell cycle
1.5.5 TPR proteins involved in DNA Repair
1.5.6 TPR proteins involved in Proteolysis
1.5.7 TPR proteins implicated in various other aspects of cell physiology
1.6 Roles of scaffolds in signaling pathways
1.7 Hallmarks of Cancer
1.8 Role and regulation of p53 in cancer
1.9 Regulation of the actin cytoskeleton
1.10 Cancer cell migration
1.11 RNA polymerase transcription machinery
1.12 STRAP (Stress responsive activator of p300)
1.12.1 STRAP discovery
1.12.2 STRAP, p300 and JMY
1.12.3 STRAP function
1.13 Aims of the Project
2. CHAPTER TWO. MATERIALS AND METHODS
2.1 Materials
2.1.1 Chemicals and Reagents
2.1.2 Enzymes and Kits
2.1.3 Other consumables
2.1.4 General buffers and solutions
2.1.5 Chemically competent bacterial cells
2.2 Mammalian Cell Culture
2.2.1 Cell lines
2.2.2 Cell passage and maintenance
2.2.3 Biochemical pull down assays
2.3 Cloning of hSTRAP constructs
2.3.1 Cloning of full length hSTRAP into pET14b (His-hSTRAP(1-440))
2.3.2 Cloning of truncated versions of hSTRAP (His-hSTRAP) into pET-14b
2.3.2.2 Primer design
2.3.2.3 Polymerase Chain reaction
2.3.2.4 PCR Purification
2.3.2.5 Restriction digests
2.3.2.6 Agarose gel electrophoresis
2.3.2.7 Ligation
2.3.3 Cloning of Full length hSTRAP into pGEX-6P1 (GST-hSTRAP(1-440))
2.3.3.2 Alkaline Phosphatase treatment
2.3.3.3 Gel Extraction and Purification
2.4 Transformation of plasmid DNA into competent E. coli cells
2.5 DNA Mini preps
2.6 Sequencing
2.7 Expression trials
2.8 SDS-PAGE Gels
2.9 Large scale expression and protein purification of all hSTRAP variants
2.9.1 Full length His-hSTRAP and truncated constructs of hSTRAP
2.9.2 GST-hSTRAP(1-440)
2.10 Determining the concentration of protein
2.10.1 Bradford reagent
2.10.2 Protein sample absorbance at 280 nm
2.11 Concentration of protein to a smaller volume
2.11.1 Amicon Concentration
2.11.1.1 Concentration of protein into a buffer
2.11.1.2 Concentration of protein along with buffer exchange in the amicon
2.11.2 Viva spin500 concentrators
2.12 Gel Filtration
2.13 X-ray Crystallography experiments
2.14 GST tag Cleavage
2.15 CD experiments
2.16 NMR experiments
2.16.1 Expression of 15N labelled hSTRAP protein
2.16.2 Acquiring of NMR spectra
2.17 Mass spectrometry experiments
2.18 Building the hSTRAP interactome network
3. CHAPTER THREE. RESULTS
3.1 Expression and purification of full length and truncated forms of hSTRAP protein
3.1.1 Cloning, expression and purification of full length hSTRAP into pET14b (His-hSTRAP(1-440))
3.1.2 Cloning, Expression and purification of GST-hSTRAP(1-440)
3.1.3 Cloning, expression and purification of truncated variants of hSTRAP
3.1.3.1 Design and sequence analysis of truncated constructs of hSTRAP
3.1.3.2 Cloning of Truncated versions of hSTRAP
3.1.3.3 Expression and Purification of truncated versions of hSTRAP
3.1.3.3.1 Expression and purification of hSTRAP(1-219)
3.1.3.3.2 Expression and purification of hSTRAP(220-440)
3.1.3.3.3 Expression and purification of hSTRAP(1-150)
3.1.3.3.4 Expression and purification of hSTRAP(151-284)
3.1.3.3.5 Expression and purification of hSTRAP(285-440)
3.2 Identification of hSTRAP interacting partners in MCF7 breast cancer cells
3.2.1 Purification of hSTRAP protein variants
3.2.2 Pull downs using MCF7 cellular extract
3.2.3 hSTRAP interacting partners
3.3 Biophysical and structural studies carried out using full length and truncated versions of hSTRAP
3.3.1 Biophysical and structural studies carried out on His-hSTRAP(1-440)
3.3.1.1 Circular Dichroism on His-hSTRAP(1-440)
3.3.1.2 X-ray Crystallography on His-hSTRAP(1-440)
3.3.2 Biophysical and structural studies carried out on GST-hSTRAP(1-440)
3.3.2.1 Circular Dichroism on GST-hSTRAP(1-440)
3.3.2.2 GST tag cleavage
3.3.3 Biophysical and structural studies carried out on hSTRAP(1-219)
3.3.3.1 Circular Dichroism of hSTRAP(1-219)
3.3.3.2 NMR studies of hSTRAP(1-219)
3.3.4 Biophysical and structural studies carried out on hSTRAP(1-150)
3.3.4.1 Circular Dichroism of hSTRAP(1-150)
3.3.4.2 NMR studies of hSTRAP(1-150)
3.3.5 Biophysical and Structural studies carried out on hSTRAP(151-284)
3.3.5.1 Circular Dichroism on hSTRAP(151-284)
3.3.5.2 NMR studies of hSTRAP(151-284)
3.3.6 Biophysical and structural studies carried out on hSTRAP(285-440)
3.3.6.1 Circular Dichroism on hSTRAP(285-440)
4. CHAPTER FOUR. GENERAL DISCUSSION
4.1 Comparisons of hSTRAP and mSTRAP structural data
4.2 Structural characterization of hSTRAP protein fragments
4.2.1 CD characterization of all hSTRAP protein variants
4.2.2 Crystallographic studies on hSTRAP(1-440)
4.2.3 NMR studies on hSTRAP(1-219)
4.2.4 NMR studies on hSTRAP(1-150)
4.2.5 NMR studies on hSTRAP(151-284)
4.3 Difficulties with expression of hSTRAP protein variants using the E. coli expression system, and possible ways to overcome these in future
4.4 The hSTRAP interactome
5. FUTURE DIRECTION
6. REFERENCES
7. APPENDIX

List of figures

Figure 1.1. NMR can be used in all aspects of the drug discovery process.
Figure 1.2. Phase diagram of concentration of protein to precipitant.
Figure 1.3. Various vapour diffusion methods.
Figure 1.4. NMR Theory.
Figure 1.5. 2D HSQC NMR spectrum.
Figure 1.6. Far UV spectra of different secondary structure elements.
Figure 1.7. The three main components of the mass spectrometer.
Figure 1.8. Pictorial representation of the Quadrupole Ion trap.
Figure 1.9. Pictorial representation of the Tandem mass spectrometry process (MS/MS).
Figure 1.10. A ribbon presentation of the WD40 domain.
Figure 1.11. A ribbon presentation of the TPR motifs of PP5.
Figure 1.12. Homology modeling of various TPR proteins.
Figure 1.13. Domain organization of the TPR protein, HOP.
Figure 1.14. Domain organization of the TPR proteins, PEX5 and p67phox.
Figure 1.15. TPR protein-ligand structures.
Figure 1.16. PDZ domain structure.
Figure 1.17. Schematic diagram showing domain organisation of the TPR Protein Tfc4.
Figure 1.18. Schematic diagram showing domain organisation of the TPR Protein HIP.
Figure 1.19. Schematic Diagram showing TPR Motif organisation within the TPR Protein MAS70.
Figure 1.20. Schematic Diagram showing TPR Motif organisation within these TPR proteins.
Figure 1.21. Regulation of p21 by p53, WISp39 and Hsp90.
Figure 1.22. Schematic Diagram showing TPR Motif organisation within PP5.
Figure 1.23. Schematic diagram showing domain organisation of the TPR Protein CHIP.
Figure 1.24. Schematic diagram showing domain organisation of the TPR Protein α-SGT.
Figure 1.25. Domain organisation of Bub1, BubR1 and Knl1.
Figure 1.26. Scaffold mechanisms.
Figure 1.27. The six major alterations found in most cancers.
Figure 1.28. Process of Actin polymerisation.
Figure 1.29. Process of Transcription.
Figure 1.30. Formation of the Pre-initiation complex (PIC).
Figure 1.31. STRAP sequence and its conservation.
Figure 1.32. STRAP interaction with JMY and p300 through distinct TPR motifs.
Figure 1.33. STRAP and the p53 Response.
Figure 1.34. STRAP and the DNA damage Response pathway.
Figure 1.35. STRAP and the stress response pathway.
Figure 1.36. Regulation of GR by STRAP.
Figure 2.1. Bacterial growth curve.
Figure 3.1. The pET-14b Vector.
Figure 3.2. Expression of His-hSTRAP(1-440).
Figure 3.3. Purification of His-hSTRAP(1-440) protein.
Figure 3.4. His-hSTRAP(1-440) mass spectrometry.
Figure 3.5. pGEX-6P1, GST expression vector.
Figure 3.6. Cloning of the hSTRAP wild type in pGEX-6P1.
Figure 3.7. Expression of GST-hSTRAP(1-440).
Figure 3.8. Purification of GST-hSTRAP(1-440) protein.
Figure 3.9. GST-hSTRAP(1-440) mass spectrometry.
Figure 3.10. Secondary structure predictions of full length hSTRAP.
Figure 3.11. PCR products of the hSTRAP fragment cloned in pET14-b vector.
Figure 3.12. Expression of hSTRAP(1-219).
Figure 3.13. Purification of hSTRAP(1-219) protein.
Figure 3.14. hSTRAP(1-219) mass spectrometry.
Figure 3.15. Expression of hSTRAP(220-440) in BL21(DE3)pLysS.
Figure 3.16. Purification of hSTRAP(220-440).
Figure 3.17. Expression of hSTRAP(1-150).
Figure 3.18. Purification of hSTRAP(1-150) protein.
Figure 3.19. hSTRAP(1-150) mass spectrometry.
Figure 3.20. Expression of hSTRAP(151-284).
Figure 3.21. Purification of hSTRAP(151-284).
Figure 3.22. Expression of hSTRAP(285-440).
Figure 3.23. Purification of hSTRAP(285-440).
Figure 3.24. hSTRAP variants used for biochemical binding assays.
Figure 3.25. In-gel digestion mass spectrometry analysis.
Figure 3.26. hSTRAP biochemical pull down assays with MCF7 cellular extracts.
Figure 3.27. SDS-PAGE bands isolated and submitted to mass spectrometry analysis.
Figure 3.28. hSTRAP implication in cancer related pathways.
Figure 3.29. CD experiments carried out on His-hSTRAP(1-440).
Figure 3.30. Concentration of His-hSTRAP(1-440) protein in different buffers.
Figure 3.31. Long term stability of concentrated His-hSTRAP(1-440) protein.
Figure 3.32. Concentration and stability of His-hSTRAP(1-440) protein in Buffer 1 + H-MIX.
Figure 3.33. Gel filtration graph of concentrated His-hSTRAP(1-440) protein sample.
Figure 3.34. Crystallisation trials with 12.9 mg/ml His-hSTRAP(1-440) sample in JCSG+ screen.
Figure 3.35. Crystallisation trials with 18 mg/ml His-hSTRAP(1-440) sample in the JCSG+ screen.
............................................................................................................................ 148 Figure 3.36. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440). ............................... 150 Figure 3.37. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample to observe effects of pH on crystal grade............................................................................. 153 Figure 3.38. His-hSTRAP(1-440) Diffraction Pattern. ........................................................ 154 Figure 3.39. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample to observe effects of further lowering ethanol concentration on crystal grade .................... 154 Figure 3.40. CD experiments carried out on GST-hSTRAP(1-440)...................................... 156 Figure 3.41. On-column GST TAG cleavage of GST- hSTRAP(1-440). ................................. 157 Figure 3.42. Dialysed hSTRAP(1-219) protein in CD buffer. .............................................. 158 Figure 3.43. CD experiments carried out on hSTRAP(1-219). ............................................ 159 Figure 3.44. hSTRAP(1-219) protein stability in gel filtration buffer. ................................. 160 Figure 3.45. Concentration and stability of hSTRAP(1-219) in NMR buffer. ...................... 162 Figure 3.46. 1D NMR spectrum of 0.2 mM hSTRAP(1-219) ............................................... 163 Figure 3.47. Expression of 15N labelled hSTRAP(1-219)..................................................... 164 Figure 3.48. Purification of 15N labelled hSTRAP(1-219 .................................................... 166 Figure 3.49. Concentration of 15N-hSTRAP(1-219) in H-MIX, pH8. .................................... 167 Figure 3.50. Dialysed hSTRAP(1-150) protein in CD buffer. .............................................. 168 Figure 3.51. CD experiments carried out on hSTRAP(1-150). 
............................................ 169 Figure 3.52. Expression of 15N labelled hSTRAP(1-150) protein. ....................................... 170 Figure 3.53. Purification of 15N labelled hSTRAP(1-150). .................................................. 171 Figure 3.54. 15NhSTRAP(1-150) buffer optimisation trials................................................. 173 Figure 3.55. 2D 1H-15N correlation NMR spectra on 15N-hSTRAP(1-150). .......................... 175 Figure 3.56. 2D1H-15N correlation HSQC NMR spectra of 15NhSTRAP(1-150) in identified optimised conditions....................................................................................................... 176 Figure 3.57. Expression of unlabelled hSTRAP(1-150) in Shuffle T7 .................................. 177 Figure 3.58. Purification of unlabelled hSTRAP(1-150) in Shuffle T7 express. ................... 178 Figure 3.59. 1D 1H Spectrum of hSTRAP (1-150) in Shuffle T7 express. ............................. 179 Figure 3.60. Dialysed hSTRAP(151-284) protein in CD buffer............................................ 180 Figure 3.61. CD experiments carried out on hSTRAP(151-284). ........................................ 181 Figure 3.62. 1D 1H NMR spectrum of hSTRAP(151-284) ................................................... 182 Figure 3.63. Dialysed hSTRAP(285-440) protein in CD buffer to be used for subsequent CD experiments .................................................................................................................... 183 Figure 3.64. CD experiments carried out on hSTRAP(285-440). ........................................ 184 Figure 4.1. Sequence alignments of Mouse and Human STRAP ........................................ 186 Figure 4.2. Homology modelling of hSTRAP and mSTRAP structure. ................................ 
187 13 Figure 4.3 Ribbon representation of the different regions of hSTRAP cloned and expressed separately in the current study, mapped on the model structure of full-length STRAP protein. ........................................................................................................................... 188 Figure 4.4. Proposed hSTRAP mechanism of function. ..................................................... 197 14 List of Tables Table 2.1. Buffer compositions. ......................................................................................... 74 Table 2.2. Bacterial competent cells. ................................................................................. 75 Table 2.3. PCR primers ...................................................................................................... 78 Table 2.4. PCR reaction protocol. ...................................................................................... 79 Table 2.5. Resolving gel components. ................................................................................ 85 Table 3.1. Expression trials and purification of His-hSTRAP(1-440). .................................... 98 Table 3.2. Estimation of His-hSTRAP(1-440) protein concentration .................................. 102 Table 3.3. hSTRAP truncated forms cloned in pET14-b. .................................................... 111 Table 3.4. Estimation of hSTRAP(1-219) protein concentration. ....................................... 116 Table 3.5. Estimation of hSTRAP(1-150) protein concentration. ....................................... 121 Table 3.6. hSTRAP interacting partners. ........................................................................... 133 Table 3.7. Function of hSTRAP interacting proteins. ........................................................ 135 Table 3.8. 
Estimation of soluble His-hSTRAP(1-440) protein concentration in different buffers ............................................................................................................................ 141 Table 3.9. Estimated concentration of His-hSTRAP(1-440) in Buffer 1 + H-MIX................. 143 Table 3.10. Estimated concentrations of hSTRAP(1-219) in gel filtration buffer................ 160 Table 3.11. Estimated concentration of hSTRAP(1-219) protein samples in NMR buffer .. 162 Table 3.12. Estimated concentration of 15N-hSTRAP(1-219). ............................................ 167 Table 3.13. Estimated concentration of 15N hSTRAP(1-150). ............................................ 171 Table 3.14. Estimated concentration of hSTRAP(1-150) in elution fractions when expressed in Shuffle T7 express cells. ............................................................................................... 178 Table 4.1. Experimental TPR positions of STRAP TPR motifs ............................................. 186 Table 4.2. CD data ........................................................................................................... 189 Table 4.3. The advantages and disadvantages associated with the bacterial and eukaryotic expression systems. ........................................................................................................ 193 15 Abstract STRAP (Stress responsive activator of p300) is a 440 amino acid protein, predicted to have 6 TPR (Tetra-Tri-Co-Peptide Repeats) motifs, known to mediate proteinprotein interactions. STRAP has been shown to form a complex with proteins p300 and JMY (Junctional Mediatory Protein), and is implicated in the DNA damage, heat shock response pathway, regulation of the Glucocorticoid receptor and in the function of p53. The aims of this project were to clone, express and purify full length and truncated human STRAP (hSTRAP) variants in high quantities. 
Full length and shorter hSTRAP fragments, which contain different combinations of the predicted TPR motifs and hence cover different regions, would then be structurally characterised by various structural and biophysical experiments. Another important aim was to identify interacting partners of hSTRAP in breast cancer and to map their interaction sites to different parts of the protein. To this end, GST- and His-tagged full length hSTRAP, as well as His-tagged truncated hSTRAP protein variants, have been successfully cloned, expressed and purified. Independent and reproducible biochemical pull-down assays have been carried out in MCF7 breast cancer cells, followed by mass spectrometry-based proteomics analysis, which identified 25 hSTRAP-interacting partners from various signaling pathways such as regulation of the actin cytoskeleton and translation. In addition, crystallization trials were carried out using pure His-hSTRAP(1-440) protein, which were unfortunately unsuccessful. Various hSTRAP protein variants have been characterized by CD, showing that hSTRAP(1-150), His-hSTRAP(1-440), hSTRAP(1-219), hSTRAP(151-284) and hSTRAP(285-440) comprise α and β structures but show no clear cooperative unfolding transitions, suggestive of molten globule states. NMR experiments on hSTRAP(1-219), hSTRAP(1-150) and hSTRAP(151-284) have shown that these proteins are not folded at the tertiary structure level. We conclude that a protocol has been established to clone, express and purify various hSTRAP variants, and that the thermal and secondary structure characteristics of each have been determined, although the 3D structure could not be solved. Pull-down assays followed by proteomic analysis have shown that hSTRAP is implicated in many aspects of cellular regulation.

1. Chapter One.
Introduction

1.1 Proteins

1.1.1 Importance of protein research

The term “Protein” was introduced in 1839 by Gerhardus Johannes Mulder, and since then extensive research on various proteins has been undertaken [1]. Proteins are composed of combinations of 20 amino acids and are implicated in every aspect of cellular function [1]. Various diseases arise from protein defects, such as cystic fibrosis, where the most common mutation leads to the deletion of a single amino acid residue at position 508 of the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein, which severely affects this protein and its function [1]. Alzheimer’s disease is another example, most commonly characterized by high amounts of amyloid deposits and hyper-phosphorylated Tau protein [2]. Further research has implicated specific proteins in various other diseases such as sickle cell anaemia, HIV and cancer (introduced in more detail below) [1]. Protein homeostasis is an important aspect of cellular function which needs to be tightly regulated for the stability of the proteome [2, 3]. Various cellular stress factors cause protein damage, such as misfolding or non-physiological intracellular localization, which can be repaired [2, 3]. If repair is unsuccessful, defective proteins can be eliminated through homeostatic mechanisms mediated by chaperones and quality control processes [2, 3]. Dysregulation of these cellular mechanisms can lead to the accumulation of damaged and non-functional proteins, resulting in the emergence of various diseases [2, 3]. All of the factors mentioned above emphasize the importance of carrying out protein research [1].

1.1.2 The importance of protein structure determination

The structure of a protein, unlike that of DNA, which is largely independent of its sequence composition, depends on the amino acid sequence and its sub-cellular environment [4].
The structure of a protein is important for carrying out its cellular functions and can be determined by X-ray crystallography, NMR and cryo-electron microscopy [4]. X-ray crystallography is a technique in which an X-ray beam is directed at a protein crystal, which then produces a unique diffraction pattern that can be analyzed by various computational methods to deduce the protein structure (discussed in more detail below) [4]. NMR (also discussed in more detail below) is used to determine the structure of small proteins in solution or in the solid state (solid state NMR) [4], and can also be used to study the dynamics of interacting proteins and their complexes [4]. Hence, NMR is a versatile biophysical technique that is commonly used in many aspects of structural biology and the drug discovery process, as shown in Figure 1.1 [5, 6]. Figure 1.1 shows that NMR can be used in initial target selection, in high throughput screening (HTS) biochemical binding assays to identify potential hits, in further validation, in the identification of leads and finally in lead optimization through structure refinement [5, 6]. Cryo-electron microscopy is another structural biology technique and is used to visualize macromolecular complexes [4].

Figure 1.1. NMR can be used in all aspects of the drug discovery process. A target, which is normally a protein found to be implicated in the disease of interest, is first selected. A series of biochemical binding assays is initiated to determine hits which bind to the target and have the desired effect on target activity. These hits are then studied further by NMR experiments to confirm the target-hit interaction and to identify the ligand binding site. Once a lead is identified, it is optimized through structure refinement. Abbreviations: HTS, High throughput screening.
Figure taken from [5].

The importance of determining protein structure was first realised when it was shown that DNA forms a double helical structure, which identified the mechanism by which the genetic information carried on the DNA can be distributed equally to the two daughter cells obtained in the process of cell division [4, 7]. This then led to the prediction that determining the structure of proteins can also give insights into protein function, and the principle “function follows form” was formulated [4, 7]. It was also realised that determining the structure of a protein is important for the drug discovery process, as small molecule drugs can then be designed to fit into certain protein pockets which are important both in mediating a certain cellular function and in the physiology of the disease [4, 7]. Hence, the determination of protein structure was a very important step needed for the progression of biomedical science.

1.1.3 Importance of characterizing protein-protein interactions

Resolving the structure of proteins provided crucial information for characterizing cellular protein functions and showed that alternative protein conformations could lead to interactions with different binding partners, which might eventually alter their effects with detrimental consequences for cellular physiology [8]. For example, in Alzheimer’s disease, amyloid proteins aggregate as deposits to cause pathogenesis [2, 9]. Another example is adult respiratory distress syndrome, where the function of elastase is normally inhibited by its interaction with α-1-antitrypsin [9]. Loss of this interaction leads to the activation of elastase, which is implicated in the development of the respiratory distress syndrome [9]. These and many more examples emphasize the importance of identifying the interacting partners of proteins of interest in order to determine optimal drug targets.
The term “Proteomics” was first introduced in 1995, and this area of research is the study of protein-protein interactions based on protein structure [8]. Proteomics can be sub-categorised into two areas of research: expression proteomics and cell map proteomics [8]. The first category is the study of the expression patterns of various proteins, and the second is the study of protein complexes and the protein-protein interactions within them [8]. Proteomics research has provided an extensive amount of protein-protein interaction data that needs to be analyzed using various bioinformatics tools [10]. Bioinformatics is a powerful approach for linking specific protein structure abnormalities with altered protein-protein interactions and assigning these to particular pathways leading to disease [10].

1.2 Various structural biology techniques

The two main structural biology techniques commonly used to solve protein structures are X-ray crystallography and NMR, both of which are discussed in more detail below. Circular dichroism can also be classified as a structural biology technique, and its methodology is also described in the following sections.

1.2.1 X-Ray crystallography

When carrying out any microscopic experiment, the resolution at which the sample is observed primarily depends on the wavelength of the electromagnetic radiation used [11]. For visualizing protein molecules the wavelength has to lie within the range of 1 to 5 Å, which is true for X-rays [11]. X-rays directed towards the protein crystal are scattered by the atoms of the protein to produce a diffraction pattern [11]. A protein crystal consists of a large number of protein molecules packed in an ordered 3D lattice, which magnifies the effect of X-ray scattering [11]. The scattered X-rays can either interfere destructively or add constructively, ultimately giving a diffraction pattern (reflection spots) [11].
X-ray scattering can only add constructively if the conditions of Bragg’s law are met [11]: to obtain a diffraction pattern, the total path difference between two scattered X-rays has to be an integer multiple of the X-ray wavelength [11]. Bragg’s law is as follows, where the indices h, k and l define the planes of the crystal lattice:

$$n\lambda = 2 d_{hkl} \sin\theta$$

where $n$ is an integer, $\lambda$ is the X-ray wavelength, $d_{hkl}$ is the lattice spacing, $\theta$ is the diffraction angle of the reflections and $hkl$ are the Miller indices (planes of the crystal lattice).

In these diffraction experiments, the intensity of each reflection spot is measured, which is directly proportional to the square of the structure factor amplitude (F) [11]. This structure factor is the sum of the atomic scattering within the unit cell from the plane directions defined by the indices hkl [11]. The goal of these diffraction experiments is to obtain an electron density map, and the electron density (ρ) at each position x, y, z in the unit cell of volume V can be calculated via a Fourier transform, as shown by this equation:

$$\rho(x,y,z) = \frac{1}{V} \sum_{hkl} |F_{hkl}| \exp\left[-2\pi i\,(hx + ky + lz - \alpha_{hkl})\right]$$

For each data point two values are needed to calculate the electron density: the structure factor amplitude, $|F_{hkl}|$, which is obtained through the experiment, and the phase angle of each reflection, $\alpha_{hkl}$ [11]. The phase information cannot be measured directly and has to be retrieved; this is referred to as the “phase problem”, and a reasonable estimate of the phases has to be derived to solve the structure by X-ray diffraction [11]. The phase problem can be solved in a number of ways: Multiple or Single Isomorphous Replacement (MIR or SIR), Single-wavelength Anomalous Dispersion (SAD) and Multi-wavelength Anomalous Diffraction (MAD) [11].
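As a brief numerical aside on the two equations above, the following is a minimal sketch rather than anything used in this thesis. It evaluates Bragg’s law for an assumed wavelength and lattice spacing, and uses a toy one-dimensional analogue of the electron density sum to show why the phases matter: the same amplitudes with different phases place the density peak at a different position. The function names, the Cu Kα wavelength, the 3.0 Å spacing and the two-reflection data set are all illustrative assumptions.

```python
import cmath
import math


def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Diffraction angle theta (degrees) from n*lambda = 2*d*sin(theta)."""
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no reflection possible: n*lambda exceeds 2*d")
    return math.degrees(math.asin(sin_theta))


def d_spacing_from_angle(theta_deg, wavelength, order=1):
    """Lattice spacing d from a diffraction angle (Bragg's law rearranged)."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


def density_1d(x, reflections, cell_length=1.0):
    """Toy 1D analogue of the electron density sum (an assumption for illustration).

    reflections maps an index h to (amplitude |F_h|, phase alpha_h in cycles);
    x is a fractional coordinate between 0 and 1.
    """
    total = sum(
        amplitude * cmath.exp(-2j * math.pi * (h * x - alpha))
        for h, (amplitude, alpha) in reflections.items()
    )
    return total.real / cell_length


# Assumed values: Cu K-alpha radiation (1.5418 Angstrom) and a 3.0 Angstrom spacing.
theta = bragg_angle_deg(3.0, 1.5418)
print(f"theta = {theta:.2f} degrees")

# Identical amplitudes, different phases: the density maximum moves from x = 0 to x = 0.25.
in_phase = {1: (1.0, 0.0), -1: (1.0, 0.0)}
shifted = {1: (1.0, 0.25), -1: (1.0, -0.25)}
print(density_1d(0.0, in_phase), density_1d(0.25, shifted))
```

The second part is only meant to make the phase problem concrete: measuring the amplitudes alone cannot distinguish between the two toy data sets, even though they describe different densities.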
All of these phasing experiments generally involve the incorporation of a heavy element (which acts as a strong scattering centre) within the protein crystal, either by soaking the crystal with the chosen heavy element or by recombinantly labeling the protein through expression in selenomethionine media [11]. The phase problem can also be solved through Molecular Replacement (MR), which requires a protein homologous to the protein being investigated [11]. An essential requirement for MR is a reasonably accurate homologous model [11].

Obtaining a crystal of high enough quality to diffract is the critical, difficult and rate limiting requirement in solving a structure by X-ray crystallography [11-12]. Many trials have to be undertaken using various precipitants at different concentrations to determine optimal crystallization conditions, with little theoretical guidance, as the condition in which a protein will crystallize cannot be predicted [11-12]. However, phase diagrams can be determined through these trials and used to facilitate the optimization of protein crystal growth conditions [11-12]. These 2D phase diagrams plot the solubility of a protein in various conditions, and the most common phase diagrams plot the concentration of protein against that of precipitant, as shown in Figure 1.2 [11-12]. These phase diagrams show the different zones, which relate to different protein kinetics (phase space) (Fig. 1.2) [11-12]. A protein crystallizes when its concentration is higher than its solubility in the buffer solution, which is known as the super-saturation zone [11-12]. Figure 1.2 also shows that this zone of super-saturation is divided into three further zones: nucleation, precipitation and metastable [11-12]. Each zone refers to a certain protein kinetic effect; for example, in the nucleation zone the concentration of protein is sufficient for the formation of nuclei and crystals [11-12].
In the precipitation zone, by contrast, the concentration of the protein is too high; growth is therefore too rapid for correct crystal formation, and aggregation and precipitation occur instead [11-12]. In the metastable zone, the concentration of protein is not high enough to form new nuclei but allows the growth of crystals from existing nuclei [11-12]. Since these zones refer to protein kinetics, their boundaries are not fixed and can differ [11-12]. Other factors that can be plotted on these phase diagrams are protein purity, pH and temperature [11-12].

Figure 1.2. Phase diagram of concentration of protein to precipitant. The arrow shows the path taken by a protein through these different zones in a vapour diffusion experiment. Figure taken from [12].

A number of techniques can be used to grow protein crystals: vapour diffusion, free interface diffusion, batch and dialysis methods [11-12]. In this thesis the vapour diffusion technique was used, where drops were set up with protein and precipitant, surrounded by the mother liquor (the precipitant) at a higher concentration [11-12]. The starting point for these types of experiment is the under-saturation stage (Fig. 1.2), and as water evaporates the concentration of both components within the drop increases until the protein is within the super-saturation stage [11-12]. Once it is within the nucleation zone, the protein nucleates, followed by crystal formation and a decrease in protein concentration [11-12]. Two types of vapour diffusion method exist, the sitting drop and the hanging drop method [11-12].
The hanging drop method involves mixing the protein and the crystallization trial condition on a siliconised glass cover slip, which is then placed over a plate well containing the crystallization trial solution at a higher concentration and volume than in the drop (see Fig. 1.3) [11-12]. The ratio and volume of protein to crystallization trial condition can be changed to optimise the conditions for crystal growth [11-12]. The sitting drop method is very similar, the difference being that the drop of protein and crystallization trial condition sits above the reservoir solution (crystallization trial condition) rather than being suspended over it as in the hanging drop method (Fig. 1.3) [11-12]. The advantage of this technique is that a larger drop volume can be used [11-12].

Figure 1.3. Various vapour diffusion methods. The two types of drop methods, hanging and sitting drop, are shown in this figure. Figure taken from [13].

A diffracting crystal of high enough quality consists of a uniformly formed single crystal lattice, and when this is not the case it is normally evident from the diffraction image obtained [11-12]. This is known as mosaicity and is recognised through alterations in spot shape and intensity [11-12]. Another possible crystal abnormality is splitting of the crystal lattice, which gives two overlapping diffraction patterns [11-12]. Furthermore, diffraction experiments are carried out at 100 K, so the crystal has to be frozen without ice being formed; freezing is therefore carried out in the presence of a cryoprotectant, which prevents ice formation [11-12].

1.2.2 NMR

NMR is a powerful biophysical technique that can provide structural information on proteins of up to 35 kDa [14-22]. This technique is based on the fact that each charged proton spins around its axis and thereby creates a small magnetic field around itself, which is characterised by its magnetic moment (µ) [14-22].
After application of an external magnetic field (B0) in the Z direction, the proton, which has a spin of ½, precesses around the Z axis in one of two orientations, Z+ or Z-, referred to as spin up and spin down respectively [14-22]. The rate of precession depends on the strength of the external magnetic field applied along the Z axis [14-22]; for example, when the external magnetic field is 16.4 Tesla, protons precess at a frequency of approximately 700 MHz. However, the precession of the proton is also affected by its chemical environment, i.e. which other atoms are nearby [14-22]. The resulting frequency shift in an NMR spectrum, called the chemical shift, is small compared to the precession frequency itself [14-22]. The chemical shift is of the order of a millionth of that frequency and is therefore measured in parts per million (ppm); it is the difference in frequency between the sample signal and a reference NMR signal, normalized to the spectrometer frequency [14-22]. When using a 700 MHz spectrometer, a chemical shift of 4 ppm indicates a frequency shift (relative to the reference) of 2800 Hz (4 × 700). To measure the actual precession rate, a radiofrequency electromagnetic field (B1) is applied, which rotates the net magnetization vector around the X axis. The net magnetization is thus rotated by 90° to align along the Y axis (Fig. 1.4) [14-22]. This is known as a 90° pulse, and a 180° pulse can be obtained by doubling the time for which the B1 field is applied [14-22].

Figure 1.4. NMR Theory. Shows the effect of a 90° pulse: when the radiofrequency field is applied, the magnetization is rotated from its equilibrium state along the Z axis (M0 position) to the Y axis (My). Protons precess around the Z axis in the equilibrium state. Figure from [16].

Once a pulse has been applied, the B1 magnetic field is switched off and, as a consequence, all the magnetization eventually returns to the Z axis, to its original equilibrium state before this electromagnetic field was applied [14-22].
This return to equilibrium is known as relaxation. The time constant with which the magnetization returns to its equilibrium state along the Z axis is referred to as T1 [14-22]. At equilibrium no net magnetisation is detected in the XY plane; the time constant with which the magnetisation in the XY plane decays to zero is referred to as T2, and the decaying signal that is detected is known as the free induction decay (FID) [14-22]. The signal from the oscillating spins is detected in the XY plane as a function of time. The NMR spectrum is obtained by Fourier transformation of this time-dependent signal. The Fourier transform is a mathematical process whereby the time-dependent signals of the free induction decays are converted into a distribution of spectral frequencies, giving rise to NMR peaks [14-22].

In 1D NMR experiments, overlapping signals are generally obtained from similar resonance frequencies, and 2D NMR experiments therefore give better peak resolution [14-22]. These are similar in principle to 1D NMR, but two different nuclei may be utilized, for example a proton and a carbon, so the information is clearer and easier to analyse [14-22]. 2D experiments generally rely on one of two phenomena: (1) through-bond coupling, used in experiments such as HSQC, COSY and TOCSY; and (2) through-space coupling, used in experiments such as NOESY [14-22]. Techniques which only correlate 1H nuclei, such as COSY, TOCSY and NOESY, are referred to as homonuclear NMR [14-22]. Other 2D experiments, such as HSQC (one-bond correlation) and HMBC (long-range correlation), utilize two different nuclei such as 1H and 15N, and are therefore referred to as heteronuclear NMR [14-22]. In 2D NMR the nuclei are excited by a special series of pulses, and free induction decay signals are observed as a function of the t1 time delays used in the pulse sequences [14-22].
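Before the details of 2D acquisition, the 1D chain described above (chemical shift in ppm, frequency offset in Hz, decaying FID, Fourier transform, peak) can be illustrated with a minimal sketch. It is an idealised simulation under stated assumptions, not a description of how the spectra in this thesis were processed: a single resonance, a hypothetical T2 of 50 ms, a hypothetical sampling interval of 1/8192 s and 512 points, with a plain discrete Fourier transform written out using only the standard library.

```python
import cmath
import math


def ppm_to_hz(shift_ppm, spectrometer_mhz):
    """Frequency offset from the reference: 1 ppm corresponds to spectrometer_mhz Hz."""
    return shift_ppm * spectrometer_mhz


def simulate_fid(offset_hz, t2_s, dwell_s, n_points):
    """Idealised complex FID of one resonance: an oscillation damped by exp(-t/T2)."""
    return [
        cmath.exp(2j * math.pi * offset_hz * k * dwell_s) * math.exp(-k * dwell_s / t2_s)
        for k in range(n_points)
    ]


def dft(signal):
    """Plain discrete Fourier transform (O(n^2), adequate for a few hundred points)."""
    n = len(signal)
    return [
        sum(s * cmath.exp(-2j * math.pi * m * k / n) for k, s in enumerate(signal))
        for m in range(n)
    ]


def peak_frequency_hz(fid, dwell_s):
    """Frequency (Hz) of the strongest peak in the Fourier-transformed FID."""
    n = len(fid)
    spectrum = dft(fid)
    m = max(range(n), key=lambda i: abs(spectrum[i]))
    if m >= n // 2:  # upper half of the bins represents negative frequencies
        m -= n
    return m / (n * dwell_s)


# The example from the text: 4 ppm on a 700 MHz spectrometer.
offset = ppm_to_hz(4.0, 700.0)
# Assumed acquisition parameters (hypothetical, chosen for a quick calculation).
dwell = 1.0 / 8192
fid = simulate_fid(offset, t2_s=0.05, dwell_s=dwell, n_points=512)
peak = peak_frequency_hz(fid, dwell)
print(f"offset = {offset:.0f} Hz, peak found at {peak:.0f} Hz")
```

A shorter T2 in this simulation damps the FID faster and broadens the peak, which is the same time-frequency relationship that the Fourier transform imposes on real spectra.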
After initial excitation, there is an evolution time during which the magnetisation of the different nuclei may affect each other; this modulates the intensity of the signal observed during the acquisition period t2 [14-22]. A number of FIDs with different evolution times t1 are acquired. The FID signals are first Fourier transformed along t2 to produce peaks as in 1D spectra, and these are then Fourier transformed again along the t1 dimension. The signals appear as contours on the 2D spectra, as shown in Figure 1.5 [14-22]. Figure 1.5 shows the various 2D HSQC spectra that can be expected of unfolded and folded protein.

Figure 1.5. 2D HSQC NMR spectrum. Examples of 2D 1H-15N HSQC spectra of various proteins in different folding states. A: 1H-15N HSQC spectrum of a “poor”, unfolded protein, as shown by the low number of cross-peaks, poor sensitivity and small signal dispersion. B: 1H-15N HSQC spectrum of an unfolded protein, as all the poorly dispersed backbone amides lie within the random coil region of around 7-9 ppm and many overlapping resonance signals from the side-chain amides are observed. C: 1H-15N HSQC spectrum of a “promising” protein, as the spectrum shows the presence of folded and unfolded regions of the protein with higher peak intensities. D: 1H-15N HSQC spectrum of a folded protein, as shown by contours of uniform intensity, well dispersed signals and the expected number of peaks for the protein under investigation. Figure taken from [23].

3D experiments are important for protein assignments, and are carried out to obtain more information regarding protein secondary structure [14-22]. 3D and 2D NMR are again similar in principle and technique, but 3D NMR has one additional time dimension, t3, compared to 2D experiments [14-22]. 3D NMR spectra may analyze three different nuclei, which give rise to three different frequencies that are linked to each other [14-22].
These types of experiments are called triple resonance spectroscopy [14-22].

1.2.3 Circular dichroism (CD)

To understand the theory of CD, it is important to appreciate that plane polarized light consists of two circularly polarized components of equal magnitude that rotate in opposite directions: left handed, counter-clockwise (L) and right handed, clockwise (R) [24-25]. Differential absorption of these two components by the protein sample results in elliptical polarization, which is measured by spectropolarimeters (CD instrumentation) in degrees as a function of wavelength [24-25]. CD data are most commonly shown as mean residue molar ellipticity (deg cm2 dmol-1), whereby the data are normalized against molar concentration and pathlength [24-25]. A CD signal is observed because of chirality within the sample: a chiral component of the sample, a chiral surrounding environment, or the protein itself being chiral [24-25]. Information from different regions of the CD spectrum can be analysed in parallel and used to deduce the secondary structure composition [24-25]. Various chromophores can be analyzed, such as the peptide bond, the side chains of the aromatic residues and disulphide bonds [24-25]. The peptide bond absorbs between 180-240 nm, from which the secondary structure can be deduced using various programs [24-25]. The aromatic amino acids tryptophan, tyrosine and phenylalanine absorb at 290-305, 275-282 and 255-270 nm respectively [24-25]. This region cannot give very detailed information regarding tertiary structure but can nevertheless be studied [24-25]. Disulphide bonds absorb weakly near 260 nm. CD can therefore provide information regarding secondary structure composition from the peptide bond region, as well as tertiary structure information [24-25]. Figure 1.6 shows CD spectra linked to various secondary structure elements.
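The normalisation to mean residue ellipticity described above can be sketched as follows. This is a minimal illustration assuming the commonly used conversion from observed ellipticity in millidegrees, molar concentration and pathlength in cm; it divides by the number of peptide bonds (residues minus one), although some authors divide by the number of residues instead. The function name and the input values are hypothetical, not data from this work.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity (deg cm^2 dmol^-1).

    Assumed convention: [theta]_MRE = theta_mdeg / (10 * c * l * n_peptide_bonds),
    with c in mol/L, l in cm and n_peptide_bonds = n_residues - 1.
    """
    if conc_molar <= 0 or path_cm <= 0:
        raise ValueError("concentration and pathlength must be positive")
    if n_residues < 2:
        raise ValueError("a protein needs at least two residues")
    n_peptide_bonds = n_residues - 1
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_peptide_bonds)


# Hypothetical measurement: -20 mdeg at 222 nm, 10 uM protein, 1 mm cell, 101 residues.
mre_222 = mean_residue_ellipticity(
    theta_mdeg=-20.0, conc_molar=10e-6, path_cm=0.1, n_residues=101
)
print(f"{mre_222:.0f} deg cm^2 dmol^-1")
```

Because the result is normalised per residue, spectra of proteins of different length and concentration can be compared on the same scale.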
The stability of the protein can also be assessed via CD by monitoring the conformational changes induced by varying factors such as temperature and pH [24-25].

Figure 1.6. Far UV CD spectra of different secondary structure elements. Typical CD spectra of α-helical (solid line), anti-parallel β-sheet (long dashes), type I turn (dots) and irregular structure (dots and short dashes) samples. Figure taken from [25].

1.2.4 Complementarities between different structural biology techniques

The area of structural proteomics involves solving the structures of proteins with a view to understanding their function, and studying the correlation between protein sequence, structure and function [26]. The main requirement for these types of studies is a highly concentrated, stable, non-aggregated and soluble recombinant protein [26]. Many experiments can be carried out to meet these requirements, but there is no single protocol that can be used successfully for all proteins. Moreover, only a small fraction of proteins actually meet these requirements and can be used for structural biology experiments [26]. The most commonly used techniques for solving the structure of a protein are X-ray crystallography and NMR spectroscopy [26]. As mentioned above, NMR spectroscopy does have size limitations, but it can be used as a parallel approach to X-ray crystallographic studies of small proteins [26]. A 2D 15N-HSQC NMR spectrum can show whether the protein sample is folded and gives an idea as to whether the structure can be solved [26]. NMR spectroscopists can therefore determine quickly whether the structure can be solved by NMR experiments, although this does not generally indicate whether or not the structure can be solved by X-ray crystallography [26].
If the sample is folded according to the 2D NMR spectra, subsequent NMR experiments can be initiated to solve the structure, which can take approximately 6 months [26]. With X-ray crystallography, the time needed to obtain a well-diffracting crystal may be longer, as buffers first have to be screened to identify the optimum conditions for crystal growth; furthermore, not all proteins can be crystallized [26]. Unlike NMR spectroscopists, who can generally screen only a handful of conditions, protein crystallographers can screen hundreds of conditions, as the trials can be set up in 96 well plates and need only small volumes of buffer and protein [26]. Once a crystallization condition has been identified, more time is required to optimize the crystal to obtain the best diffraction pattern [26]. Overall, the time it takes to solve a crystal structure depends mainly on these exploratory trials; once good diffraction is achieved, the computational aspects of solving the structure by X-ray crystallography are not time consuming in comparison [26]. Advances are being made in both techniques to reduce the time needed to solve protein structures [26]. Another difference between the two techniques is that structures of the same protein solved by NMR and by X-ray crystallography have been shown to differ, which means that there may be differences in conformation between crystal and solution structures [27]. This is understandable, as for X-ray crystallographic studies the protein is “packed” into the crystal and could therefore differ in conformation from the protein in solution, where this “packing” does not exist [27]. NMR studies usually provide an ensemble of conformers of the protein, which are subsequently deposited in the PDB [27].
One of the differences between structures solved by NMR and by X-ray crystallography is the number of contacts per residue: NMR structures deposited in the PDB have more inter-residue contacts within 3.0 Å and between 4.5 and 6.5 Å than the corresponding X-ray structures, but fewer between 3-4 Å and 6.5-8 Å [27]. The main chain hydrogen bonds also sometimes differ between NMR and X-ray structures [27]. However, it has been suggested that these differences could be due to the mathematical procedures used to process the data for the two techniques, as after re-refinement of NMR structures using different force field parameters the differences are much smaller (although they do not disappear completely) [27]. X-ray structures are still generally considered the more accurate representation of the protein structure, because of the lack of an established quality assessment for NMR structures [27]. However, NMR solves the structure of the protein in solution, which is considered more relevant in the biological context than a static X-ray crystal structure [27]. Another, simpler and faster technique is CD, which is most commonly used as a preliminary experiment to X-ray and NMR studies to determine the folding state of the protein before these experiments are initiated [26], although only NMR can provide high resolution data regarding the folding state of the protein sample [26]. CD data can nevertheless facilitate crystallographic and NMR studies, as the integrity, folding, stability (by monitoring changes in structure at varying temperatures) and secondary structure composition of the protein can be assessed using less protein material [24-26]. Compared to X-ray crystallography and NMR, which give structural data at atomic resolution [26], CD provides low resolution data that describe the overall structure [24-25].
Due to its non-destructive nature and its low sample and time requirements, CD is very useful for preliminary structural studies [24-25]. Compared to crystallographic studies, which require high quality diffracting crystals [26], CD is a less time consuming, solution based technique [24-25]. For NMR studies, a high concentration of monomeric protein is needed and the technique also has size limitations (less than 35 kDa) [14-22, 26]. Furthermore, to solve a structure completely by NMR each resonance has to be assigned, which is time consuming compared to CD, and NMR requires stable isotope labelling, which is expensive [14-22, 26]. An X-ray structure gives a static picture and does not give any data regarding the dynamics of the protein, which are a critical aspect of its function [11-12, 26]. NMR studies, in contrast, can provide information regarding the dynamic nature of the protein [26]. CD can be used to assess structural changes in the sample and the rate at which they occur [24-25]. As mentioned previously, CD provides low resolution structural data but can give reliable estimates of secondary structure composition, although it cannot determine exactly which regions of the protein belong to the different secondary structure elements [24-25]. Furthermore, CD data cannot give very detailed information regarding tertiary and quaternary structure, but can nevertheless be used as a complementary approach to X-ray and NMR studies [24-25].

1.3 Mass spectrometry

Mass spectrometry is a highly sensitive and high throughput technique and as a result is used during many stages of the drug discovery process [28]. This sensitivity is critical in identifying drug targets that are present at low concentration within mixtures [28]. The speed with which mass spectrometry experiments can be carried out is important for high throughput analysis of drug libraries [28].
This technique does not measure mass, as its name implies, but the mass to charge ratio (m/z) [28]. A mass spectrum usually plots ion abundance against m/z, expressed in Daltons (Da) per unit charge, and hence the raw data give information regarding the relative abundances of the gas phase ions in the sample [28]. The mass spectrometer consists of three important elements: (1) an ionization source, (2) a mass analyzer and (3) a detector, as shown in Figure 1.7 [28].

Figure 1.7. The three main components of the mass spectrometer. The three main components of the mass spectrometer are the ion source, analyzer and detector. The ionization techniques used within the ion source are ESI (electrospray ionization), APCI (atmospheric pressure chemical ionization) and MALDI (matrix assisted laser desorption/ionization). The ions then travel to the analyzer; analyzers include time of flight, quadrupole, quadrupole ion trap, FT-ICR (Fourier-transform ion-cyclotron resonance) and sector analyzers. The ions eventually reach the detector, and the results are viewed on the computer. *FT-ICR does not use an electron multiplier for detection. Figure taken from [28].

As the name implies, the ionization source is where the analytes are ionized into gas phase ions [28]. Historically electron ionization was used for this stage; however, this limited the samples that could be analyzed to thermally stable compounds of low molecular weight [28]. Advances in recent decades have made available ionization techniques for analyzing large, non-volatile and thermally labile compounds, two of which are electrospray ionization (ESI) and matrix assisted laser desorption/ionization (MALDI).
ESI is a process of ion formation whereby a solvated sample is passed through a small charged capillary, resulting in the formation of charged droplets of solvent and analyte, which can have either a net positive or a net negative charge [28]. As these ions pass through the instrument to the mass analyzer, they lose their surrounding solvent [28]. ESI has no size limitations, as multiply charged species are formed, and it is referred to as a soft ionization technique, which allows non-covalently associated bio-macromolecules to be analyzed [28]. This ionization process can follow a liquid-phase separation step, such as HPLC or capillary electrophoresis, which allows complex biological samples to be analyzed swiftly and with high sensitivity [28]. However, because ESI is a solution-based technique, sample is constantly being lost, and ESI is also very sensitive to ion suppression effects [28]. High salt can lower analyte ion formation, so most samples have to be desalted prior to mass spectrometry analysis [28]. In MALDI the sample is co-crystallized with a matrix and then irradiated, which results in the formation of gas phase ions that travel towards the mass analyzer [28]. This process generates singly charged ions, which is an important advantage of MALDI, although the mechanism has not been fully clarified [28]. Compared to ESI, this technique has very little sample wastage and as a consequence is highly sensitive [28]. MALDI is also less sensitive than ESI to salts and buffers present within the sample [28]. However, this technique has a drawback: the matrix produces a large amount of chemical noise at m/z values below 500 Da, and hence it is difficult to analyze low molecular weight samples [28]. Research is progressing towards solving this problem [28].
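Because the instrument reports m/z rather than mass, the multiply charged ions formed by ESI appear as a series of peaks for a single analyte. The sketch below illustrates this under stated assumptions: positive-ion mode with protons as the only charge carriers, and a hypothetical analyte of 20000 Da. The function names are illustrative and do not belong to any instrument software.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da


def mz_for_charge(neutral_mass_da, charge):
    """m/z of an [M + zH]z+ ion (assumes protonation is the only source of charge)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Charge of the higher-m/z peak, given two neighbouring peaks of the same analyte.

    Assumes the peaks carry charges z (mz_high) and z + 1 (mz_low).
    """
    return round((mz_low - PROTON_MASS_DA) / (mz_high - mz_low))


def neutral_mass(mz, charge):
    """Recover the neutral mass from one peak once its charge is known."""
    return charge * (mz - PROTON_MASS_DA)


# Hypothetical 20000 Da protein observed at charge states 10+ and 11+.
peak_10 = mz_for_charge(20000.0, 10)
peak_11 = mz_for_charge(20000.0, 11)
z = charge_from_adjacent_peaks(peak_10, peak_11)
mass = neutral_mass(peak_10, z)
print(f"m/z {peak_10:.3f} and {peak_11:.3f} -> z = {z}, M = {mass:.1f} Da")
```

This is why multiple charging removes the size limitation: a large mass is brought into the m/z range of the analyzer by a correspondingly large charge.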
The next component of the mass spectrometer is the mass analyzer, of which there are many types that analyze the ions in different ways [28]. There are five common mass analyzers, which can be categorized into two groups: (1) beam analyzers, in which the ions exit the ion source in the form of a beam and travel through the analyzer to the detector; and (2) trapping analyzers, in which the ions, generated either by an external ion source or within the analyzer itself, are trapped within the analyzer [28]. One example of a beam analyzer is time of flight (TOF), a simple mass analyzer that separates ions according to their speed [28]. After the ions are formed, they are accelerated by a fixed potential and travel through the TOF drift tube to hit the detector [28]. Ions with the same charge acquire the same kinetic energy during acceleration, so the lower the m/z value the higher the speed at which the ion travels, and vice versa [28]. Hence the time an ion takes to reach the detector determines its m/z value [28]. This arrangement is generally known as linear TOF; a more advanced instrument is the reflectron TOF, in which the ions travel along two flight paths: an electrostatic mirror, known as the reflectron, redirects the ions onto the second path and finally to the detector [28]. The reflectron compensates for small differences in the speeds of ions with the same m/z value, which improves the accuracy of the measurement [28]. The advantage of this analyzer is that there are no size limitations [28]. Another analyzer is the quadrupole, which is cheap and easy to use and applies a much lower voltage to accelerate the ions, such as 2-50 V rather than the kV range [28].
This analyzer is based on the trajectory of an ion to the detector through four rods subjected to a dynamic (radio frequency) electric field; the trajectory ultimately depends on the m/z value of the ion [28]. Not all ions reach the detector: at a given setting only ions with a single m/z value travel through the rods with a stable trajectory, whereas ions with a different m/z have unstable trajectories and are lost [28]. Another commonly used analyzer is the quadrupole ion trap, which is a close relative of the quadrupole but differs in the way the electric field is applied (Fig.1.8) [28]. The quadrupole applies the electric field in two dimensions (x and y), and as a result the ions travel perpendicular to the electric field, in the z direction [28]. In the ion trap the electric field is applied in all directions, and as a consequence the ions are trapped within this field (Fig.1.8) [28]. The mass spectrum obtained with the quadrupole analyzer is the result of ions following a stable trajectory to the detector, whereas in the ion trap the opposite is required: ions are detected when their trajectories become unstable and they are ejected from the trap [28]. This is achieved by increasing the radio frequency voltage applied [28].

Figure 1.8. Pictorial representation of the quadrupole ion trap. A radio frequency voltage is applied to the ring electrode to disrupt the path of an ion, and an electric field is applied to the ions in all directions, which results in the trapping of the ions within the analyzer. Figure taken from [28].

As well as the advances in ionization techniques and instrumentation, another technique that has been important in the use of MS for biological samples is tandem mass spectrometry, usually denoted MS/MS [28].
As the name suggests, and as shown in Figure 1.9, this technique involves two steps. The first step is the separation of the ion with the desired m/z (parent ion) from the other ions generated by the ion source [28]. The second step is the dissociation of the parent ion, which changes its mass or charge, to produce a set of ions (product ions) that are subsequently analyzed by the mass spectrometer [28]. An important part of MS/MS is ion dissociation, whereby the parent ion dissociates following an increase in its internal energy before entering the second stage, MS-II [28]. This increase in energy is most commonly achieved by collision-induced dissociation (CID), in which the parent ions collide with gas particles, converting kinetic energy into internal energy of the parent ion [28]. The instrumentation used for MS/MS is of two types, tandem in space and tandem in time. The first type requires a separate analyzer, usually a beam-type analyzer, for each step of MS/MS [28]. The second type uses a single analyzer, usually a trapping analyzer such as the quadrupole ion trap, at different times [28]. The latter type of MS/MS has a higher efficiency, because there is no transfer of ions between different analyzers and more time is given for the ions to dissociate [28]. The quadrupole ion trap is most commonly used for MS/MS experiments because of its low maintenance cost, simplicity and speed [28].

Figure 1.9. Pictorial representation of the tandem mass spectrometry process (MS/MS). In the first stage of the MS/MS process (MS-I), the ion with the desired m/z value (parent ion) is separated from the other ions.
This parent ion is then dissociated via CID (collision induced dissociation), IRMPD (infrared multiphoton dissociation) or SID (surface induced dissociation) to give a mass spectrum of the product ions generated by this process. This is the second stage of the MS/MS process, MS-II. Figure taken from [28].

Mass spectrometry is a highly sensitive and high throughput technique; furthermore, advances in ionization techniques have allowed the analysis of many compounds, even those present at low levels within complex mixtures, which has been critical for the analysis of biological samples [28]. Mass spectrometry has also been used in proteomics to identify and characterize proteins, for de novo peptide sequencing and to determine post-translational modification states [28].

1.4 Protein-protein interaction motifs

1.4.1 Structure

Proteins can contain a string of tandem basic motifs that are important in mediating protein interactions, among which are the WD40 [29], PDZ [30], SH3 [31] and TPR motifs [32-43]. The WD40 domain has no intrinsic enzymatic activity and is highly abundant in eukaryotic proteins implicated in various biological processes such as cell division, chemotaxis, RNA processing and various signal transduction pathways [29]. The WD40 domain acts as a scaffold to mediate protein-protein interactions and form multi-subunit complexes [29]. Each WD40 repeat consists of 44-60 residues, and the repeats form a seven bladed β-propeller fold in which each blade consists of a four stranded anti-parallel β sheet (Fig.1.10) [29].

Figure 1.10. A ribbon presentation of the WD40 domain. A: The seven bladed β-propeller fold (each blade highlighted in a different color and circled 1-7). B: One of the seven blades, which consists of a four stranded anti-parallel β sheet (strands highlighted in different colors and circled A-D).
Figure adapted from [29].

PDZ domains consist of 80-100 amino acid residues and bind to the C termini of interacting proteins, which include transmembrane receptors, channel proteins and other PDZ domain containing proteins [30]. These PDZ domain mediated interactions are implicated in localising the interacting proteins to the plasma membrane [30]. The PDZ domain consists of six β strands (βA-βF) and two α helices (αA and αB), which together form a six β-stranded domain [30].

The SH2 domain recognizes and binds to phosphorylated tyrosine containing sequences, whereas SH3 binds to peptides containing the consensus sequence PxxP (x being any other amino acid) [31]. SH3 motifs are approximately 60 amino acids long and are ubiquitous in eukaryotes [31].

Another known example of a protein interaction motif is the TPR motif, a 34 amino acid residue motif [32] that was first discovered in yeast in 1990 [33] as a motif involved in protein-protein interactions [32, 34-35]. TPR motifs are evolutionarily conserved from prokaryotes to eukaryotes and are found in various proteins in different sub-cellular locations such as the nucleus, peroxisome, mitochondria and cytoplasm [32]. The structure of the TPR motif has been characterized through X-ray crystallography of TPR proteins such as PP5 [32, 36-37]. The secondary and tertiary structures show that a TPR motif consists of two alpha helices forming an anti-parallel hairpin [32, 34-38] (Fig.1.11). This hairpin thus consists of two anti-parallel alpha helical domains, A and B, which span different parts of one TPR motif. Domain A includes conserved residues 4, 7, 8 and 11, and domain B includes conserved residues 20, 24, 27 and 32 (Fig.1.11) [32, 36-37]. Adjacent TPR motifs pack parallel to one another, and there is an angle of 24° between helices A and B [32-33, 36-37]. TPR motifs have side-chains that protrude into the grooved surface formed by this arrangement and specify interactions with other polypeptides [39].
PP5 has an extra helix, helix 7, at the C terminus of the third TPR motif (Fig.1.11); this helix is critical for the solubility and stability of these motifs [32-33]. This helix is also present in other TPR proteins, such as FKBP52 [33, 40]. The stability of a TPR protein increases with the number of TPR motifs within the protein [41].

Figure 1.11. A ribbon presentation of the TPR motifs of PP5. A: The three TPR motifs of PP5 and helix 7. B: The first TPR motif of PP5, highlighting the conserved TPR residues 4, 7, 8, 11, 20, 24, 27. The TPR motif is colored red and blue, of which red is helix A and blue is helix B. The single letters represent the residues and the numbers represent their location within the TPR motif. Figure adapted from [32, 37].

TPR motifs show homology in their amino acid sequence, hydrophobicity, length and spacing [32-36, 42-43]. Sequence alignments of different TPR proteins identified amino acid residues W (tryptophan), L (leucine), G (glycine), Y (tyrosine), A (alanine), F (phenylalanine), A (alanine) and P (proline) at positions 4, 7, 8, 11, 20, 24, 27 and 32 respectively to be highly conserved [32-36, 42-43]. Protein interaction motifs are simple but versatile structures, which can provide mechanical strength to the protein and can interact with a diverse set of proteins, varying from transcription factors to multi-meric scaffolding proteins [44]. Hence proteins bearing these motifs are implicated in a wide range of functions within the cell [44].

1.4.2 Structural stability and ligand specificity of protein interaction domains

As mentioned above, the TPR motif is an α-helical motif that is implicated in mediating protein-protein interactions, resulting in the formation of multi-protein complexes implicated in various biological functions within the cell (this will be discussed in more detail in subsequent sections) [32-38, 45].
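The conserved positions listed in section 1.4.1 lend themselves to a simple check of a candidate 34-residue repeat against the consensus. The sketch below is only an illustration of that idea: the consensus table is taken from the text, while the function name and the example string are assumptions made for this example (the string is an illustrative TPR-like sequence, not one analysed in this work), and real TPR detection relies on profile-based methods rather than exact matching.

```python
# Conserved residues by position within the 34-residue TPR motif (from the text).
TPR_CONSENSUS = {4: "W", 7: "L", 8: "G", 11: "Y", 20: "A", 24: "F", 27: "A", 32: "P"}
TPR_LENGTH = 34


def consensus_matches(repeat):
    """Return {position: True/False} for each conserved position (1-based numbering)."""
    if len(repeat) != TPR_LENGTH:
        raise ValueError(f"expected {TPR_LENGTH} residues, got {len(repeat)}")
    return {pos: repeat[pos - 1] == aa for pos, aa in TPR_CONSENSUS.items()}


# Illustrative 34-residue string (an assumption for this example only).
toy_repeat = "AEAWYNLGNAYYKQGDYDEAIEYYQKALELDPNN"
matches = consensus_matches(toy_repeat)
n_matched = sum(matches.values())
mismatched = [pos for pos, ok in matches.items() if not ok]
print(f"{n_matched}/{len(TPR_CONSENSUS)} consensus positions matched; mismatches at {mismatched}")
```

In this toy case position 24 carries tyrosine rather than phenylalanine, which shows why exact matching is too strict: conservation at these positions is a tendency towards similar residues rather than an absolute rule.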
The super-helical structure formed by the packing of adjacent TPR motifs exhibits a concave and a convex surface, which provide flexibility to the unit and are implicated in ligand binding [45, 46]. TPR proteins bind numerous ligands, which do not necessarily exhibit sequence or structural conservation, through different binding pockets that differ in their surface amino acid residues and serve as an interaction platform [46]. These surface residues are implicated in ligand binding specificity, as they can affect the electrostatic nature and hydrophobicity of the binding surface [46]. Solved TPR-ligand structures show that many factors contribute to ligand binding specificity [46]. The central groove formed by the packing of the TPR fold accommodates the target ligand, but this is not the only mode of ligand interaction [45]. The mechanisms for recognition of the cognate ligand, as well as the structural re-arrangements implicated in this interaction, have been studied and are discussed in this section. Comparison of the structures of various TPR proteins in the presence and absence of ligand has shown that very little structural re-arrangement occurs in the TPR fold upon ligand binding [37, 42, 47-50]. The structures of six proteins containing three TPR motifs have been solved in the presence [37, 47] and absence of ligand [37, 42, 47-50] and compared. Figure 1.12 shows homology modeling of several of these TPR motif proteins: CTPR3 (consensus TPR; the number denotes the number of repeats), PP5, and the TPR2A and TPR1 domains of HOP, which have been co-crystallized with their peptide ligands [42]. These structures were found to superimpose on each other when homology modeling was carried out, even in the presence of ligand [42]. It was also shown that free TPR domains, which are not bound to ligand, are still folded and structurally ordered [42].

Figure 1.12. Homology modeling of various TPR proteins.
Homology modelling of CTPR3 (PDB code: 1NA0), the TPR domain (residues 19-177) of PP5 (PDB code: 1A17), and the TPR2A (PDB code: 1ELR) and TPR1 (PDB code: 1ELW) domains of HOP, which were co-crystallized with their ligands, the C terminal peptides of Hsp90 and Hsp70 respectively. Figure taken from [42].

HOP is an adaptor protein containing 9 TPR motifs, which are organized as three TPR domain clusters (Fig.1.13): TPR1, TPR2A and TPR2B, of which TPR1 and TPR2A mediate the interactions with Hsp70 and Hsp90 respectively [47]. TPR1 and TPR2A bind to the C terminal heptapeptide of Hsp70 and pentapeptide of Hsp90 respectively, through electrostatic and hydrophobic interactions with the EEVD consensus sequence, consisting of glutamate (E), valine (V) and aspartate (D), present on these heat shock proteins [47]. The crystal structures of the TPR-peptide complexes have been solved (Fig.1.12) and exhibit the common TPR fold. Homology modeling shows how super-imposable the structures are in the presence of their respective ligands, providing more evidence that very little structural re-arrangement is induced upon ligand binding [42, 47]. CD spectra of TPR1 and TPR2A of HOP with and without ligand show that the domains are folded and highly alpha helical, and display no evidence of structural re-arrangement upon ligand binding [42].

Figure 1.13. Domain organization of the TPR protein HOP. HOP contains three clusters of three TPR motifs dispersed along its structure, named TPR1 (white boxes), TPR2A (black boxes) and TPR2B (grey boxes). These clusters include amino acids 1-118, 232-252 and 353-477 respectively. Figure taken from [51].

Further evidence to support the hypothesis that no drastic structural re-arrangements are induced upon ligand binding comes from CD and NMR spectra of the three TPR motif protein UBP (Vpu binding protein), also called SGT (small glutamine-rich protein) [42].
This protein also binds to the C terminal peptide of Hsp70, and residue-specific analysis of the NMR HSQC spectra shows no drastic change between the free protein and the protein-ligand complex, which means that no structural re-arrangements are induced upon ligand binding [42]. Several TPR proteins have been solved by crystallography [32, 34-38, 45] and have been shown to form the common fold, with slight variations in structure due to short amino acid insertions that allow specific protein interactions [45]. This shows how versatile the TPR motif is. One example is p67phox, a component of the NADPH oxidase complex, which contains four TPR motifs important in interacting with Rac GTP (Fig.1.14) [45]. Another is PEX5, which includes seven TPR motifs (Fig.1.14) and is important in interacting with the type 1 peroxisomal targeting signal (PTS1) [45].

Figure 1.14. Domain organization of the TPR proteins PEX5 and p67phox. PEX5 contains seven TPR motifs at its C terminus, and p67phox contains four TPR motifs at its N terminus, as well as other functional domains, such as the activation, P-rich and SH3 domains, which are highlighted in peach, yellow and green boxes. The TPR motifs are highlighted in red boxes. Figure taken from [45].

NADPH oxidase is an enzyme involved in the production of reactive oxygen species, which is a protective mechanism against microbial infection [45]. In the resting state within neutrophils, p67phox acts as a mediator between p40phox and p47phox to form a trimeric complex [45]. Upon neutrophil stimulation, a conformational re-arrangement of this trimeric complex is induced, followed by phosphorylation of these elements. This complex then interacts with membrane bound cytochrome components, which include the GTPase Rac [45]. Recognition of Rac by p67phox is a critical step in the formation and activation of NADPH oxidase, and this is mediated through the N terminal TPR domain of p67phox (Fig.1.14) [45].
Homology modelling of the two crystal structures, inactive p67phox bound to Rac and a longer active form of p67phox without Rac, shows that the structure of the N terminal TPR domain is similar in both, which suggests that p67phox-Rac binding does not cause any structural re-arrangements to the TPR domain [45]. As mentioned above, p67phox has four TPR motifs, and between TPR3 and TPR4 a 20 amino acid insertion has been discovered that forms two anti-parallel β strands [45]. This is important because the insertion does not disrupt the general super-helical fold of the TPR motif, yet it forms part of the RAC binding site, together with the loop that connects TPR1 and TPR3 [45]. This provides evidence that the central groove formed by the super-helical nature of the TPR fold is not the only mode of ligand recognition or the only ligand binding site [45].

Another TPR motif containing protein is PEX5, which contains seven TPR motifs followed by a 7C loop (Fig.1.14). The TPR motifs are implicated in the translocation of newly formed peroxisomal enzymes to their correct sub-cellular localization [45]. This is mediated through the C terminal PEX5 TPR domain, which recognises a C terminal signal peptide, SKL or peroxisomal targeting signal (PTS1), in its target ligand [45]. The structures of PEX5 bound and unbound to a peptide including the PTS1 sequence were analysed, and this identified an alternative TPR motif conformation and mode of ligand recognition [45]. It was shown that structural re-arrangements are induced within the TPR motif upon ligand binding mediated through this PTS1 signal [45]. Upon ligand binding the TPR motif reverts to a closed structure from the open form with the 7C loop [45]. This 7C loop is important in target recognition and interacts with TPR1 to form a closed conformation [45]. PEX5 still has the common fold of the TPR motif but does exhibit this non-canonical and flexible structure [45].
Contrasting evidence came from a particular PP5 TPR construct, which exhibited differences in folding and stabilization with and without ligand [42, 52]. The TPR motif was folded and stabilized when complexed with ligand, and the converse was detected without ligand [42, 52]. It was then hypothesized that coupled folding and binding could be a potential mechanism for ligand recognition [42, 52]. However, the vast majority of evidence on the stability and structure of TPR motif proteins with and without ligand shows that no drastic structural re-arrangements are detected upon ligand interaction, and that the TPR acts as an individual, folded unit which presents a common binding surface or architecture to which specific ligands are accommodated [37, 42, 47-51].

The next question is how the TPR motif determines ligand specificity. Since the genome was fully sequenced, many proteins have been identified, which can be categorized into families based on their sequence conservation [53]. However, very little attention is given to poorly conserved sites, which are hypothesized to be of little importance [53]. A recent paper, however, showed through extensive statistical analysis that residues in contact with the ligand are more variable than surface residues which are not in contact with the ligand but are exposed to the solvent [53]. Through this statistical analysis of TPR proteins, the authors identified that variable residues can determine specificity in ligand interaction. TPR proteins bind to highly diverse ligands, and it has been shown that these “hyper-variation” sequences are implicated in ligand binding and as a consequence can be used to predict ligand binding sites [53]. Certain residue positions also play a role in determining TPR ligand specificity [51], and these concepts will be discussed in more detail below.
Within a protein family, the highly conserved residues are present in the hydrophobic core and are implicated in specifying the fold of the protein [53]. The residues on the surface are generally more variable and, if mutated, have little effect on protein structure or stability [53]. However, if these surface residues are conserved then they are considered to be of functional importance [53]. This information can then be used to identify ligand binding sites within proteins that carry out the same function, but it cannot be used for proteins that expose a common fold that binds to diverse ligands, such as TPR proteins [53]. It has been shown that the ligand binding site can instead be determined through sequence hyper-variation of residues situated proximal to the ligand, which are important in determining ligand specificity [53]. When the ligand bound structures of the two TPR domains of HOP, TPR1 and TPR2A, were studied, it was evident that the ligand binding face of the TPR motif is more variable than the solvent-exposed surface residues, and it is these residues that are predicted to be implicated in determining ligand specificity [53]. This sequence variation is termed hyper-variation [53]. Further analysis has shown that residues 2, 5, 9, 12, 13, 33 and 34 exhibit the most sequence variability within the TPR, and are present on the concave face of the TPR motif [53]. In the case of TPR1 and TPR2A of HOP, residues 2, 5, 6, 9, 12 and 13 are implicated in ligand binding and specificity [52]. It is yet to be confirmed whether residues 33 and 34 are also implicated in ligand binding and specificity [53].

Hsp70 and Hsp90 are important proteins implicated in the folding of various proteins, mediated through interaction with various TPR co-factors through their conserved EEVD sequence [51]. As mentioned above, the TPR motif forms a central groove, and it has been shown that it is this groove that serves as a ligand-binding site [51].
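
The idea of sequence hyper-variation described above can be illustrated with a small calculation. The sketch below scores each column of a set of aligned motifs by its Shannon entropy, which is one common way of quantifying per-position variability; it is an assumption made here for illustration and not necessarily the statistical method used in [53]. The function names and the toy alignment are hypothetical and are not real TPR sequences.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def per_position_entropy(aligned_motifs):
    """Return one entropy value per column of equally long, pre-aligned motifs."""
    length = len(aligned_motifs[0])
    if any(len(motif) != length for motif in aligned_motifs):
        raise ValueError("all motifs must have the same aligned length")
    return [
        column_entropy([motif[i] for motif in aligned_motifs])
        for i in range(length)
    ]


# Hypothetical toy alignment, for illustration only (NOT real TPR sequences).
toy_alignment = ["AKLGW", "ARLGY", "AELGF", "ADLGW"]
entropies = per_position_entropy(toy_alignment)

# Index (0-based) of the most variable column in the toy alignment.
most_variable = max(range(len(entropies)), key=entropies.__getitem__)
```

In this toy case the fully conserved columns score zero and the column with four different residues scores highest. With a real alignment, high-scoring positions on the concave face would be the candidates for ligand contact.
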
HOP binds to its ligand in an extended conformation, which displays a maximal TPR interaction surface and allows recognition of short amino acid sequences (Fig.1.15) [46]. From solved crystal structures of HOP with its ligand, it has been shown that the interaction of the TPR motifs is through the EEVD consensus sequence [46]. In the case of TPR1 and TPR2A there are five amino acids within the central groove that form the “two-carboxylate clamp”, which interact with the aspartate residues of the EEVD sequence of Hsp70 and Hsp90, so this clamp acts as a binding and docking site between peptide ligand and TPR motif [46, 51]. This TPR mediated interaction is important for the stability and specificity of Hop-Hsp70 and Hop-Hsp90 complex formation [51]. The conserved EEVD sequence acts as an anchor sequence for the TPR co-factors of the heat shock proteins; however, it is the residues N terminal of the EEVD sequence that determine specificity for Hsp70 and Hsp90 [51]. The contact between TPR2A and the MEEVD sequence of Hsp90 is the only contact required for Hop-Hsp90 complex formation, and the ME (methionine and glutamate) residues are specific for the interaction with Hsp90 [51]. Hop-Hsp70 complex formation, however, requires not only the interaction of TPR1 with Hsp70 through the PTIEEVD sequence but also additional contacts [51]. The PTI residues (proline, threonine, isoleucine) are specific for the Hsp70 interaction [51]. Of the conserved EEVD sequence, the aspartate and valine residues are the anchor residues. The glutamate residues are critical in TPR2A-Hsp90 binding but not as critical in TPR1-Hsp70 binding, as TPR1 has a preference for a hydrophobic amino acid at this position [51]. Furthermore, the TPR motif has a preference for hydrophobic aliphatic and aromatic side chains at certain positions within the respective ligands, such as positions 4 and 6 [51].
For example, in the case of TPR1 and TPR2A, Ile-4 in Hsp70 and Met-4 in Hsp90 are important residues in determining specificity to TPR1 and TPR2A [51].

Structures of TPR-ligand complexes can show ligand binding mediated via an extended conformation, as with HOP (Fig.1.15). However, ligand binding can also be mediated by the display of both a helical and an extended conformation, as shown by TPR-ligand structures such as the Pseudomonas secretion (Psc) proteins PscG-PscE in complex with the PscF peptide, and APC6 in complex with CDC26 (Fig.1.15) [46]. The Psc proteins are implicated in the bacterial Type III secretory pathway, and PscG has three TPR motifs with a C terminal helix [46]. PscG interacts with PscE through these TPR motifs [46]. PscF consists of two subdomains: an extended coil of 13 amino acids and a C terminal helix of 17 amino acids [46]. PscG and PscE together form a “cupped-hand-like structure”, and PscF interacts with PscG through its C terminal helix, which binds the concave surface of PscG, and its N terminal region, which binds the convex surface of PscG (Fig.1.15) [46]. Another example of a TPR-ligand structure displaying both a helical and an extended conformation in ligand binding is the APC6 protein in complex with CDC26 [46]. Both are components of the Anaphase promoting complex (APC), of which APC6 has 8 TPR motifs and a C terminal helix [46]. APC6 forms a “solenoid like structure” encompassing the full length of the 26 amino acid N terminus of CDC26 (Fig.1.15) [46]. The bound ligand CDC26 displays 12 amino acids in an extended conformation and the other 14 amino acids as a helix (Fig.1.15) [46]. These examples show that TPR motifs can interact with various ligands through different TPR binding modes.

Figure 1.15. TPR protein-ligand structures. A: TPR2A domain of Hop (highlighted in pink) in complex with Hsp90 (highlighted in green).
B: PscG-PscE dimer (highlighted in light blue and purple respectively) in complex with PscF (highlighted in pink). C: APC6 (highlighted in light green) in complex with CDC26 (highlighted in red). Figure taken from [46].

Another protein interaction motif is the 100 amino acid PDZ domain, which as mentioned above is composed of six β-strands and two α-helices forming a “sandwich” structure (Fig.1.16), unlike TPR motifs which are alpha helical [54]. The PDZ domain binds to the C terminus of its target protein through a four amino acid consensus sequence, X-Thr/Ser-X-Val [54]. PDZ domain containing proteins expose a peptide binding groove situated between a β-sheet and an α-helix (Fig.1.16), which binds to the consensus sequence on the target peptide in a geometry common to PDZ domains [54, 55]. PDZ ligand binding specificity is dependent on minor sequence variations; however, the geometry and overall fold within the binding region are generally well conserved [55]. Ligand binding does not cause large structural re-arrangements to the PDZ domain, as deduced from solved PDZ domain-ligand structures [55]. Furthermore, the mechanism of ligand recognition was elucidated from the structures of PDZ domains with and without ligand, for example the third PDZ domain (PDZ-3) from the brain synaptic protein PSD-95 [54]. The PDZ domain recognizes the C terminus of its target ligand, and this is mediated through a carboxylate-binding loop found in loop L1 (Fig.1.16), which contains four important residues, Gly-Leu-Gly-Phe (Leu and Phe are the two X residues in this case). Hydrogen bonds are formed between residues of this loop and the carboxyl group of the ligand [54, 55]. The glycine residues provide structural flexibility to this loop, and an arginine residue is also present in the binding loop, which also interacts with the carboxylate group of the ligand through hydrogen bonds [54].
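
The C terminal consensus sequence X-Thr/Ser-X-Val described above can be expressed as a simple pattern match. The following is a minimal sketch only: the function name and the example peptides are hypothetical, and a match indicates agreement with the consensus, not experimentally verified PDZ binding.

```python
import re

# X-Thr/Ser-X-Val anchored at the C terminus (end of the sequence string).
# [A-Z] stands for "any residue" in one-letter code.
PDZ_CTERM_MOTIF = re.compile(r"[A-Z][ST][A-Z]V$")


def has_pdz_binding_cterminus(sequence):
    """Return True if the last four residues match X-Thr/Ser-X-Val."""
    return PDZ_CTERM_MOTIF.search(sequence.upper()) is not None


# Hypothetical peptides used only to exercise the pattern.
examples = ["KQTSV", "PTIEEVD", "AAAETSVA", "TSV"]
results = {peptide: has_pdz_binding_cterminus(peptide) for peptide in examples}
```

Only the first peptide matches. The EEVD-type C terminus recognised by TPR domains does not fit the PDZ consensus, which illustrates how the two motif families read different C terminal signals.
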
A hydrophobic pocket is also present in this fold, which recognizes hydrophobic C terminal target peptides. The hydrophobic amino acids present within this pocket can vary between different PDZ domains, but in this example they are Leu-323, Phe-325, Ile-327 and Leu-379 (Fig.1.16) [55]. Peptide binding induces only a slight structural re-arrangement of the fold, with changes seen in loop L1 and helix B, which suggests a possible mechanism that opens up the hydrophobic pocket present between these regions (Fig.1.16) [55]. The specificity of PDZ domains for diverse ligands is dependent on the variable amino acids within the A and B strands [55].

Figure 1.16. PDZ domain structure. A: Ribbon presentation of the third PDZ domain of PSD-95, which consists of two α-helices, A and B (highlighted in red), and 6 β-strands (highlighted in green) forming a barrel structure, and 6 loops that are highlighted in blue. B: PSD-95 bound to peptide (stick representation, highlighted in orange). Atoms highlighted in pink are part of the hydrophobic pocket. Figure taken from [55].

The SH3 domain is implicated in mediating protein interactions in various cell-signalling pathways [56]. The SH3 domain presents a hydrophobic surface (2 hydrophobic pockets), usually containing conserved amino acids, that binds to its peptide ligand, which forms a left-handed polyproline type II (PPII) helix and includes the PxxP consensus sequence (x being any other amino acid) [56]. A third binding pocket, also referred to as the “specificity pocket”, is negatively charged and binds to residues flanking the consensus sequence, commonly an arginine or lysine present in the ligand [56]. The position of this basic residue relative to the prolines of the consensus sequence determines the structural orientation in which the ligand binds, either Class I or Class II, depending on whether the arginine is N or C terminal to the motif respectively [56, 57].
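
The Class I and Class II distinction can be sketched as two pattern matches on the ligand sequence. This example assumes the commonly cited simplified consensus forms, [R/K]xxPxxP for Class I and PxxPx[R/K] for Class II; the exact spacing is not stated in the text above and is an assumption here. The function name and the test peptides are hypothetical.

```python
import re

# Simplified consensus patterns (assumed spacing, see lead-in):
# Class I: basic residue N terminal to the PxxP core.
CLASS_I = re.compile(r"[RK]..P..P")
# Class II: basic residue C terminal to the PxxP core.
CLASS_II = re.compile(r"P..P.[RK]")
# Bare PxxP core without a flanking basic residue in either register.
PXXP_CORE = re.compile(r"P..P")


def classify_sh3_ligand(peptide):
    """Classify a peptide by where the basic residue sits relative to PxxP."""
    peptide = peptide.upper()
    is_class_i = CLASS_I.search(peptide) is not None
    is_class_ii = CLASS_II.search(peptide) is not None
    if is_class_i and is_class_ii:
        return "ambiguous"
    if is_class_i:
        return "class I"
    if is_class_ii:
        return "class II"
    if PXXP_CORE.search(peptide):
        return "PxxP only"
    return "no PxxP motif"


# Hypothetical peptides used only to exercise the patterns.
labels = {p: classify_sh3_ligand(p) for p in ["RALPPLP", "APPLPPR", "AAPAAPAA", "GGGG"]}
```

A sequence-only classification of this kind cannot replace structural data, since the additional loop contacts that contribute to specificity are not captured by the core motif.
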
This specificity pocket is important in increasing the affinity and specificity of the SH3-ligand interaction, which is critical in the cellular context [56]. Specificity is provided by additional contacts that are made between the loops of the SH3 domain and residues on the ligand flanking the consensus sequence. It was also shown from studying free and bound SH3 domain structures that very little structural re-arrangement is induced upon ligand interaction [57].

1.5 Functions of various TPR proteins

TPR motif containing proteins are involved in many aspects of cellular function, and examples of each will be given in the following sections.

1.5.1 TPR proteins involved in transcription

An important step in initiating RNA polymerase III transcription is the interaction between Tfc4 of TFIIIC and Brf1 and Bdp1 of TFIIIB to assemble this latter factor onto the DNA [58]. This has been shown to be the rate-limiting step for the process of RNA polymerase III transcription [58]. Tfc4 has 11 TPR motifs, nine of which are organized into two clusters at the N terminus (Fig.1.17) [58]. One set includes TPR1-5 and the other set includes TPR6-9, and two further TPR repeats are found at the C terminus (Fig.1.17) [58]. These TPR motifs are important in mediating the Tfc4 interaction with Brf1 and Bdp1 needed for the initial assembly of TFIIIB onto DNA [58]. Gain of function mutations within TPR1-5 increase the interaction between Brf1 and Tfc4, whereas mutations in TPR6-9 disrupt polymerase III reporter gene transcription and impair the interaction between Brf1 and Tfc4 [58].

Figure 1.17. Schematic diagram showing domain organisation of the TPR protein Tfc4. Tfc4 contains a hydrophilic domain (yellow box), two tandem TPR arrays (red boxes), which include TPR1-5 and TPR6-9, and an intervening region (IVR, green box) in between the two arrays at the N terminus. The C terminus contains another two TPR motifs (TPR10 and TPR11).
Figure taken from [58].

The protein TTC4 is also a TPR protein and was originally identified within a gene region implicated in breast cancer [59]. TTC4 is a nucleoplasmic protein that has been shown to interact with Hsp70 and Hsp90 and, more recently, with the replication initiation protein Cdc6 through the TTC4 TPR motif [59]. Further research showed that certain point mutations within this protein were detected in various melanoma samples and that these impair the TTC4 interaction with Cdc6 [59]. However, it has not been proven that the loss of this interaction leads to cancer, as the region to which the mutations map could be implicated in interacting with other proteins [59]. The possibility that loss of interaction with Cdc6 leads to melanoma is nevertheless plausible, as Cdc6 is an important regulator of DNA replication [59]. An increase in the level of TTC4 was seen in these melanoma samples, and TTC4 has been implicated in cancer progression, as increased levels of TTC4 protein are detected in various tumour cell lines [59].

1.5.2 TPR proteins involved in the Stress Response Pathway

The molecular chaperone Hsp90 is expressed ubiquitously and has many substrates, such as steroid hormone receptors and kinases [60]. Hsp90 is known to be involved in the folding [60-62] and degradation of proteins [62] and in the interaction with steroid receptors [63], and is known to have ATPase activity needed for its function [64-66]. Hsp90 has two conserved motifs at its C and N termini, and the two are connected by a charged linker [66]. Hsp90 interacts with co-chaperones containing TPR motifs through its conserved C terminus, which includes a TPR motif recognition site, the pentapeptide MEEVD [66]. The interaction of Hsp90 with these TPR proteins is important for Hsp90 activity [66]. Hsp70 is also ubiquitously expressed and is involved in protein folding and the stress response pathway [67].
Hsp70 has an ATPase domain at its N terminus, a substrate binding domain and a C terminal domain [65], the latter of which regulates substrate binding [64, 65]. The EEVD and PTIEEVD sequences have been identified as a TPR recognition site of Hsp70 located at the C terminus [67-68]. HBP21 is a human TPR protein but its function has not been characterized [68]. It has 3 TPR motifs implicated in the interaction with the C terminus of Hsp70 [68] through the EEVD and PTIEEVD sequences [67-68]. Levels of HBP21 are high in breast cancer and proliferative vitreoretinopathy (PVR) [68]. It has been hypothesized that HBP21 could play a part in inhibiting metastasis of tumour cells [68].

HIP, an Hsp70 interacting protein, is 369 residues long and has recently been found to be implicated in glucocorticoid receptor signaling; it also acts as a chaperone [69]. Steroid receptors normally exist in a heteromeric complex with Hsp90 and other chaperones before hormone binding [32, 69]. HIP binds to the ADP form of Hsp70, stabilises it and subsequently promotes its interaction with other binding partners [32, 69]. HIP enhances hormone dependent activation of GR [32, 69]. HIP is composed of an oligomerization domain, three central TPR motifs and a highly charged region at its N terminus (Fig.1.18), where the latter two regions are implicated in mediating the HIP interaction with Hsp70 [32, 69]. The C terminus has a glycine, glycine, methionine and proline (GGMP) repeat motif and a p60 homology domain (Fig.1.18) [69]. Mutations within the TPR region of HIP impair its interaction with Hsp70 and its enhancement of glucocorticoid receptor signaling [69].

Figure 1.18. Schematic diagram showing domain organisation of the TPR protein HIP. HIP contains an oligomerisation domain, three TPR motifs and a highly charged region at its N terminus. The C terminus includes the GGMP repeat motif and the p60 homology domain. Positions of various domains are indicated.
Figure adapted from [69].

1.5.3 TPR proteins involved in Mitochondrial and Peroxisomal import

TPR proteins are found in the peroxisomal import receptor complex; these are PAS8/PAS10/PXR1, which recognize target proteins and transport them across the peroxisomal membrane [32, 70-72]. PAS8 has 7 TPR motifs at its C terminus [72], and PAS10 has 8 TPR motifs, of which 7 are present at the C terminus and the other is present at the N terminus [73]. Human PXR1 is involved in peroxisome import and has seven TPR motifs located at its C terminus [32, 70-72]. Mutations within the TPR motifs can interrupt protein interactions and so affect function, causing life threatening diseases such as peroxisome biogenesis disorder (PBD) and neonatal adrenoleukodystrophy [32, 70, 74]. The latter is a recessive disorder due to a mutation in the TPR domain of PXR1 and PEX5 [32, 70, 74], which then affects peroxisome assembly [32, 70].

TPR proteins are also found in the mitochondrial import receptor complex; these are MAS70 and TOM20 [32, 75-78]. These TPR proteins recognize target proteins and transport them across the mitochondrial membrane [32, 75-78]. MAS70 has seven TPR motifs (Fig.1.19), which are important in transporting MAS70 to the cytoplasm from the outer membrane of the mitochondria where it is primarily localized [32, 75-78]. Mutations within the C terminal TPR domains still allow MAS70 to be transported to the mitochondria, but the protein is rendered non-functional as it cannot aid in the transport of other target proteins across the mitochondrial membrane [77]. TOM20 has one TPR motif, which is unusual as most TPR proteins have 3-16 TPR motifs [33].

Figure 1.19. Schematic diagram showing TPR motif organisation within the TPR protein MAS70. TPR motifs are shown as red boxes, and MAS70 has 7 TPR motifs dispersed along the sequence.
Figure adapted from [37].

1.5.4 TPR proteins involved in the progression of the cell cycle

TPR proteins are also found within the multi-subunit E3 ubiquitin ligase APC complex [32, 37, 79-80], which is part of the RING/cullin family and is involved in the ubiquitination of proteins to be degraded by the proteasome at various stages of the mitotic cell cycle [79]. The APC consists of 13 subunits, of which Cdc16, Cdc23 and Cdc27 contain 10, 9 and 10 TPR motifs respectively [32, 37, 80] (Fig.1.20). All of the TPR motifs are important for the interaction between these three components [32, 80]. Mutations within these motifs interrupt the protein interactions between these three components of the APC and/or their respective functions [32, 80]. A mutation at position 6 of the seventh TPR motif of Cdc27, and also an insertion between positions 6 and 7, interrupts the interaction with Cdc23 [32, 35]. A mutation within Cdc23 at position 8 in TPR motifs 5 and 7 [81] and in Cdc16 at position 20 of TPR motif 9 [82] results in cell cycle arrest at a specific point in the mitotic cell cycle, the metaphase to anaphase transition [81, 82]. This is because positions 8 and 20 are conserved residues of the TPR motif, and mutating them most probably interrupts protein function through a change in structural conformation [32, 37].

Figure 1.20. Schematic diagram showing TPR motif organisation within these TPR proteins. The TPR motifs are shown as red boxes; Cdc16 and Cdc27 have 10 TPR motifs dispersed along the sequence. The other TPR protein, Cdc23, has 9 TPR motifs. Figure adapted from [37].

Another TPR protein that has been identified is WISp39 [83], which is involved in inhibiting p21 degradation [60]. Different cyclin-Cdk complexes are present at different stages of the cell cycle [83]. Both the levels of the complexes and their composition of specific cyclins and Cdks oscillate between the different cell cycle phases [83].
Inhibitory proteins exist that can inhibit Cdks, one of which is p21, as it can bind to both Cdk and cyclin in specific positions [60, 84]. DNA damage induces p53, which activates p21 and inhibits cell cycle progression [60, 85]. Transcriptional control of p21 is regulated through p53 dependent and independent pathways (Fig.1.21) [60]. Levels of p21 are regulated through post-translational modification, for example phosphorylation [86], and through degradation [60, 83]. The N terminus of WISp39 interacts with the N terminus of p21, a region which is implicated in p21 ubiquitination, and this suggests that WISp39 could inhibit p21 degradation [60, 83]. WISp39 has 3 TPR motifs at its C terminus, which are involved in interacting with the C terminus of Hsp90 [60, 83]. The WISp39/Hsp90/p21 trimeric complex is involved in inhibiting p21 degradation by increasing the stability of p21 (Fig.1.21) [60, 83]. This trimeric complex does not increase p21 protein levels but is implicated in the stabilisation of p21 [60, 83]. Mutations within the TPR domain of WISp39 impair its interaction with Hsp90 and consequently its ability to prevent p21 degradation, as well as the formation of the WISp39/Hsp90/p21 trimeric complex that enables correct folding of p21 and consequently its stabilisation [83].

Figure 1.21. Regulation of p21 by p53, WISp39 and Hsp90. DNA damage activates p53, which in turn activates transcription of p21. This leads to an increase in the level of unstable p21, which is then ultimately stabilised through its interaction with WISp39 and Hsp90 and formation of the WISp39/p21/Hsp90 trimeric complex. Figure taken from [83].

1.5.5 TPR proteins involved in DNA Repair

PP5 is a ubiquitously expressed serine threonine phosphatase and it can bind to Hsp90 [87-88]. PP5 has a C terminal catalytic domain and an N terminal TPR motif (Fig.1.22) [85]. The TPR motif of PP5 interacts with a conserved sequence at the C terminus of Hsp90 [89] and is important in interacting with other proteins like CDC16 and CDC27 [90].
The TPR motif has a negative impact on PP5 activity [87, 91], although the mechanism of PP5 activation is unclear at this stage. ATM kinase is a checkpoint kinase involved in the response to double stranded breaks [92]. ATM activation was poorly understood until it was recently shown that PP5 is involved in ATM activation [92]. When DNA is damaged, PP5 interacts with ATM and ATM is auto-phosphorylated on serine 1981, which then leads to the activation and phosphorylation of downstream targets like p53 and Rad17 [92].

Figure 1.22. Schematic diagram showing TPR motif organisation within PP5. The TPR motifs are shown as red boxes; PP5 has 3 TPR motifs and a phosphatase domain (highlighted in green). Figure adapted from [37].

Other TPR proteins are also part of the DNA repair pathway; this was identified by studying patients with Fanconi Anaemia (FA) [93-94]. The primary cause of this condition is a defective DNA repair pathway, and 8 key genes have been identified in this condition, one of which is BRCA2 [93-94]. An important step in the FA pathway is the ubiquitylation of a lysine in FANCD2, which allows further steps to proceed, such as the interaction of this protein with BRCA1 [93-94]. This step requires the integrity of a multi-subunit complex of 6 FANC proteins (FANCA, -C, -E, -F, -G and -L). No domain involved in these protein interactions has been identified in any of them besides FANCL, which has 2 domains, a WD40 domain and a ring finger domain [93-94]. Further research was done to identify the protein interaction domain by looking at the sequence homology of the same protein between different organisms [93-94]. Human FANCG was compared to its homologs in Oryzias latipes (Japanese rice fish) and Danio rerio (zebra fish) [93-94]. This identified seven TPR motifs within FANCG, which are dispersed throughout the protein [93-94].
FANCG, as well as being implicated in FA core complex assembly [93], has also been implicated in the homologous recombination repair pathway [94]. It is involved in the latter pathway because the TPR motifs of FANCG are critical for interaction with XRCC3 and BRCA2, which are both important components of the homologous recombination pathway [94]. Mutational analysis identified key mutations within the TPR motifs which lead to complete or partial loss of function of this protein [93-94]. These were at position 8 of TPR motifs 1, 2, 5 and 6, and further research showed that these TPR motifs are important in the interaction with FANCA [93-94]. These mutations lead to a loss of FANCG interactions with other partners [93-94].

1.5.6 TPR proteins involved in Proteolysis

Molecular chaperones and the ubiquitin proteasome pathway (UPP) work against each other. The UPP degrades damaged proteins, whereas molecular chaperones such as heat shock proteins are involved in the refolding of damaged proteins [95]. The fate of a protein depends on the activities of both of these components [95]. A very important component that helps to decide the fate of a protein is CHIP, a 35 kDa co-chaperone that has ubiquitin ligase activity [95]. It has 3 domains: a TPR domain at the amino terminus, a U-box (which has ubiquitin ligase activity) at the opposite end, and a highly charged region in between the two (Fig.1.23) [96]. CHIP has 3 TPR motifs, which are involved in the interaction of CHIP with other molecular chaperones like Hsp70 and Hsp90, needed for the quality control mechanism [97], regulation of signaling pathways [98] and proteasomal degradation [99].

Figure 1.23. Schematic diagram showing domain organisation of the TPR protein CHIP. The TPR motifs (boxed in purple) are involved in binding to Hsp70/90. There is a coiled coil domain, indicated by an orange box, in the middle. The other end is the U-box, which has ubiquitin ligase activity.
CHIP is 303 amino acids (aa) long. Figure adapted from [99].

1.5.7 TPR proteins implicated in various other aspects of cell physiology

There are different types of post-translational modification that a protein can be subjected to, one of which is β-O-linked N-acetylglucosamine (O-GlcNAc) modification, which is catalysed by the enzyme O-GlcNAc transferase (OGT) [100-101]. Many key proteins within the cell are subjected to this modification; these include RNA polymerases [102], transcription factors [103] and kinases [104]. OGT is a 110 kDa enzyme that is very important for the cell, as it has been shown that its deletion leads to embryonic lethality [100]. OGT is a TPR enzyme which has a TPR motif at its N terminus and a catalytic domain at its C terminus [100]. There are two isoforms of this enzyme, mitochondrial and nucleo-cytoplasmic, with 9 and 11 TPR motifs respectively [100]. The function of the TPR is to mediate protein-protein interactions and to maintain the integrity of the enzyme [100]. This enzyme is a trimer, and the TPR motifs are involved in OGT trimerization [100]. The TPR motifs are also involved in determining target substrates and in targeting OGT to the transcriptional repressor complex mSin3A [100]. The TPR motif of OGT can interact with GRIF-1 and OIP106, the first being a GABAA receptor-associated protein [100]. This is important for localizing OGT to the GABAA receptor, which in turn leads to the activation of the GABA signaling pathway [100]. The TPR motifs of OGT (TPR 2-6) interact with a member of the OIP family, OIP106, and this interaction is important for the interaction with the RNA polymerase II complex [100].

TPR proteins are also implicated in prostate cancer; for example, the 313 amino acid α-SGT protein (Fig.1.24) is linked to this disease. The predominant feature of this cancer is that the androgen receptor (AR) is hyperactive even at low levels of its ligand, androgen [105].
It is suggested that nucleoplasmic shuttling of this receptor by molecular chaperones is an important factor implicated in the progression of this cancer [105]. Alpha-SGT is an Hsp70/90 co-chaperone which binds to AR specifically at a hinge region in yeast and mammalian cells [105]. This interaction results in localization of the receptor to the cytoplasm, which leads to inhibition of the receptor's transcriptional activity [105]. In addition, it regulates how the receptor responds to androgen and does not allow weak agonists such as progesterone to activate the receptor at reasonable levels of ligand [105]. In prostate cancer a high ratio of androgen receptor to α-SGT is detected, which explains the predominant feature observed in this cancer. Alpha-SGT has two motifs, a central TPR motif and a small glutamine-rich motif at the carboxy terminus (Fig.1.24) [105]. The TPR motifs mediate the interaction of α-SGT with Hsp70, Hsp90 and the hinge region of AR [105]. This interaction is important in the folding of the hormone binding domain of steroid receptors to a high affinity state [105]. The other domain has been suggested to be involved in α-SGT dimerization [105].

Figure 1.24. Schematic diagram showing domain organisation of the TPR protein α-SGT. The protein α-SGT contains a central TPR repeat domain and a small glutamine-rich motif at its carboxy terminus. Figure adapted from [105].

More recently, two checkpoint serine/threonine kinases, Bub1 and BubR1, were also found to contain TPR motifs at their N termini. These are implicated in the interaction with the outer kinetochore protein Knl1 through its KI motifs (12 residue motifs), which is important for the recruitment of these checkpoint kinases to the kinetochore (Fig.1.25) [106]. The spindle assembly checkpoint ensures that cells do not enter anaphase until all the chromosomes are correctly attached to the mitotic spindle, to ensure correct chromosome segregation [106].
Bub1 and BubR1 are both implicated in the spindle assembly checkpoint and chromosome alignment [106]. Recruitment of these two proteins to the kinetochores is important for their activation and function [106]. They are localised through the TPR motifs found at the N termini of Bub1 and BubR1, which interact directly with the KI1 and KI2 motifs of Knl1 (Fig.1.25) [106]. Mutations within the TPR motifs of Bub1 impair its interaction with Knl1, which leads to chromosome segregation defects [106].

Figure 1.25. Domain organisation of Bub1, BubR1 and Knl1. Bub1 and BubR1 consist of TPR motifs, a Bub3-BD, a KEN box and a kinase box, as highlighted in the coloured boxes shown in the figure. The outer kinetochore protein Knl1 is composed of a PP1-BD, KI1, KI2 and a Mis12-BD, as highlighted in the coloured boxes shown in the figure. Abbreviations: TPR, tetratricopeptide repeat; Bub3-BD, Bub3-binding domain; KEN, KEN box; PP1-BD, protein phosphatase 1-binding domain; KI1, Bub1-binding domain 1; KI2, BubR1-binding domain 2; Mis12-BD, Mis12-binding domain. Figure adapted from [106].

1.6 Roles of scaffolds in signaling pathways

Most of our understanding of scaffolds comes from one scaffold protein in the MAPKinase pathway, namely Ste5, which was discovered 15 years ago [107-108]. Research since then has identified many scaffold proteins which exhibit properties similar to Ste5 [108-109]. Scaffolding proteins can affect signalling cascade networks through their ability to interact with various signalling enzymes, receptors and ion channels [108-109]. They facilitate the formation of protein complexes and act as “signal processing hubs” [108-109]. A scaffolding protein exhibits two important properties: firstly, scaffolds stabilize and maintain the specificity of the signalling pathway, and secondly, they stabilise weak interactions between the various components of the signalling pathway [108-109].
Scaffolds act as “catalysts” in activating the proteins present within the pathway; for example, in the MAPKinase pathway the scaffold anchors the kinases in a manner that enhances their interactions [108-109]. A scaffolding protein is considered to co-localize a set of proteins belonging to the same signalling cascade to a specific location within the cell, consequently enhancing their activation [108-110]. However, as well as holding the regulatory proteins in close proximity, scaffolds can change their own conformation through interaction with their target protein, and can also change the conformation of their interacting ligand to alter its function [108]. In summary, scaffolding proteins have various mechanisms of action, as shown in Fig.1.26. Scaffolding proteins have to be closely regulated, as degradation of the multi-protein scaffold complex could result in the deactivation of that signalling cascade network [108-110].

Figure 1.26. Scaffold mechanisms. Scaffolding proteins can affect signalling cascades using different mechanisms of action. Scaffolding proteins can A: bring proteins close together to interact with each other; B: form different signalling scaffolding complexes, so that one active protein (red triangle) can be part of two different pathways using two scaffolding signalling complexes; C: be regulated by the signalling proteins within the scaffolding complex without the need to regulate each individual signalling component of the pathway; D: change the conformation of the enzyme the scaffold binds to or vice versa. Figure taken from [108].

Another class of proteins that are functionally similar to scaffolding proteins are the adaptor proteins; however, their functional role is limited, and they can only interact with two other proteins to facilitate their specific cellular localisation [108].
Scaffolding proteins have been discovered on an individual basis, once experiments identified that they interact with protein kinases, ion channels or other proteins [108]. Since scaffolding proteins assemble multi-protein complexes via protein-protein interactions, it may be possible to identify scaffolding proteins through their protein interaction profiles [108]. This could be an excellent systematic strategy, made possible by the emergence of both reliable protein interaction databases and discovery methods [108]. Initial research identified scaffolds as “passive components”; however, research since then has shown that scaffolds play a more “active” role [108]. The scaffold multi-protein complex creates its own micro-environment by interacting with many proteins and enriching them in a small, specific location within the cell, which consequently increases specificity due to the selectivity of other proteins within the complex [108]. This scaffolding complex can also recruit positive and negative regulators, and so can create a complex and dynamic environment [108]. Specificity can be maintained even though the number of signalling proteins is low, because different signalling complexes can be formed with various combinations of the same signalling proteins (Fig.1.26) [108]. Due to these properties, various signalling cascades can share the same signalling proteins [108, 111]. Scaffolding proteins regulate interactions not only within but also between different pathways [112]. This latter type of interaction is referred to as “crosstalk” [112]. Initially pathways were seen as linear; however, evidence has shown that signalling networks are interconnected via this crosstalk mechanism [112]. This mechanism is important because the number of existing signalling cascades is low compared with the number of signalling functions of the cell [112].
Hence, crosstalk between pathways can create diverse outputs for the cell, but this has to be closely regulated and monitored; one way is through scaffolding proteins [112]. In relation to cancer, crosstalking proteins are affected during tumourigenesis, resulting in the dysfunction of the signalling cascade [112]. Due to the heterogeneous nature of scaffolding proteins, it seems very unlikely that they derive from a common ancestor [108]. Although they contain protein interaction domains, there is no signature motif, and function cannot be predicted from the presence of these common protein interaction domains [108]. Even the scaffolds present in the MAPKinase pathway do not share sequence similarity, even though the kinases they bind to do [108].

1.7 Hallmarks of Cancer

Cancer is a complex disease and tumourigenesis involves a number of steps, the primary one being the acquisition of genetic alterations within the genome of the cell that confer a growth advantage and ultimately lead to the progressive transition of a normal cell into a cancerous cell [113]. There are more than 100 types of cancer, and various disruptions of the regulatory circuits implicated in cell proliferation and homeostasis have been detected [113]. One of the critical questions is how many regulatory disruptions a cell can endure before it becomes cancerous. It was initially suggested that the majority of cancers are caused by six essential alterations within the cell, which were termed the “Six Hallmarks of Cancer” (Fig.1.27A) [113, 114]. However, further research has identified another four hallmarks of cancer (Fig.1.27B) [114].

(A) (B)

Figure 1.27. The six major alterations found in most cancers.
A: It has been suggested that these six major alterations are observed in all cancer cells: self-sufficiency in growth signals, insensitivity to anti-growth signals, tissue invasion and metastasis, limitless replicative potential, sustained angiogenesis and evasion of apoptosis. B: Further research has identified another four hallmarks of cancer: avoiding immune detection, deregulation of cellular energetics, genome instability and mutation, and tumour-promoting inflammation. Figure taken from [113-114].

1.8 Role and regulation of p53 in cancer

The p53 protein induces a cellular response to a variety of stress signals such as DNA damage, hypoxia and oncogene activation [115-117]. In response to these signals, p53 is stabilized, binds to DNA as a tetramer and acts as a sequence-specific transcription factor for genes implicated in DNA repair, cell cycle arrest, senescence and apoptosis [115-117]. This response is important for tumour suppression, as shown by the various human cancer cells in which p53 is mutated, commonly in its DNA binding domain [115-117]. Mdm2 was initially found to be a negative regulator of p53, and is implicated in p53 sub-cellular localization and stability [115-117]. In human cancers where Mdm2 is over-expressed, its effect on p53 leads to cancer progression [115-117]. Mdm2 binds to the N terminus of p53 and leads either to its poly-ubiquitination and consequent degradation by the proteasome, or to its mono-ubiquitination, which leads to its nuclear export [115-117]. Mdm2 is also implicated in the regulation of p53 at the level of mRNA translation, either indirectly through various ribosomal proteins such as L26, or through direct binding to p53 mRNA [115-117]. This direct interaction occurs through the C terminal RING domain of Mdm2 and the N terminal Mdm2 binding site of p53, and negatively regulates p53 [115-117].
This critical p53-Mdm2 interaction is regulated by the DNA damage response pathway, which activates various kinases such as ATM, ATR, Chk1 and Chk2. These kinases phosphorylate p53 and/or Mdm2, which reduces p53-Mdm2 binding and the negative effects of Mdm2 on p53 [115-117].

1.9 Regulation of the actin cytoskeleton

Actin, a 42 kDa ATP binding protein, is highly conserved and exists in two forms: globular G-actin, which can assemble into filamentous F-actin [118]. Actin filaments are polar, as they consist of two ends which exhibit different properties, a dynamic (barbed) end and a less active (pointed) end (Fig.1.28) [118]. When bound to ATP, actin polymerises at the barbed end of the filament (Fig.1.28) [118]. Filament turnover is regulated by various actin binding proteins, which can have diverse effects such as depleting or delivering actin monomers, or can affect filament nucleation, elongation, capping, severing or de-polymerization [118]. The actin cytoskeleton is implicated in morphogenesis, migration, cytokinesis and membrane transport [118]. Actin assembly is initiated from existing filaments or by the de novo nucleation of actin monomers. These processes are inefficient, although factors such as the Arp2/3 complex bind to the side of actin filaments and promote actin filament formation [118].

Figure 1.28. Process of actin polymerisation. Actin filaments are polar, consisting of a dynamic barbed end and a less dynamic pointed end, which during actin polymerisation are bound to ATP and ADP respectively. Figure taken from [118].

The 220 kDa Arp2/3 complex consists of seven highly conserved polypeptides, Arp2, Arp3 and ArpC1-5, of which Arp2 and Arp3 bind ATP [118]. This ATP binding is critical for actin nucleation, and Arp2 has been shown to hydrolyze ATP [118].
The Arp2/3 complex is inefficient on its own and therefore relies on a variety of activators [118]. These activators are called nucleation promoting factors (NPFs), and they activate the Arp2/3 complex through WH2 domains, an amphipathic connector region and an acidic peptide [118].

1.10 Cancer cell migration

Cell migration is implicated in many biological processes such as embryonic morphogenesis, immune surveillance, tissue repair and regeneration [119-121]. However, dysregulation of cell migration can lead to cancer metastasis [119-121]. Metastasis is the invasion of cancer cells from a primary site to a secondary site, and it is the most common cause of death in cancer patients [119-121]. This is a very complex process: the initial step involves a response to a set of chemo-tactic signals, followed by protrusion of the cell membrane, termed the “leading edge”, and consequent attachment to the extracellular matrix [119-121]. Through the process of actin polymerization, these invasive cells can form various protrusive structures such as filopodia, lamellipodia and invadopodia/podosomes, which differ in appearance, structure and function [119-121]. Lamellipodia are dynamic structures and can migrate long distances through actin polymerisation at the leading edge [119-121]. However, there are also lamella, which are not as dynamic but are also implicated in cell migration and connect actin to the myosin II-mediated contractile machinery [119-121]. It has been shown that the latter is also important in lamellipodia extension, and this myosin II-mediated contractility is implicated in both actin filament disassembly at the back of the lamellipodium and the direction of migration [119-121]. Chemo-attractants bind to cell surface receptors, which then activate a cascade of intracellular signaling pathways that are implicated in the regulation of the actin cytoskeleton [119-121].
This invasive phenotype requires over-expression of genes implicated in cell motility, such as the WASP family of proteins, which allows the cells to respond to the chemo-tactic signals [119-121]. Therefore, proteins implicated in cancer cell migration are potential drug targets for cancer therapy [119-121].

1.11 RNA polymerase transcription machinery

The recognition of nuclear gene promoters and subsequent target gene transcription is carried out by three enzymes, RNA polymerase I, II and III [122]. Each is implicated in the transcription of a specific set of genes, and all rely on transcription factors to recognize the promoter sequences [122]. RNA polymerase I only transcribes large ribosomal RNA genes [122], whereas RNA polymerase II transcribes mRNA and small nuclear RNA (snRNA) [122]. Gene promoters targeted by RNA polymerase II usually contain one core and one regulatory region [122]. The core promoter domain varies between different types of genes; in general, however, it includes a TATA box, an initiator and promoter elements [122]. Not all promoters contain the TATA box, and those that lack it are referred to as “TATA less” promoters [122]. RNA polymerase III transcribes genes that are generally no longer than 400 bp and that encode structural or catalytic RNAs, such as the components of the protein synthesis, splicing and tRNA processing complexes [122]. The promoters of the RNA polymerase III target genes can be divided into three main types: two that are known as “gene internal” and generally do not contain the TATA box (type 1 and type 2 promoters), and one known as “gene external” (type 3 promoters), which does contain the TATA box [122]. Transcription is initiated by the coordinated action of the general transcription factors (GTFs), which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH, and the RNA polymerase core enzyme, which is composed of around 10-17 core subunits [123-124].
GTFs bind to the core promoter in an ordered manner to form the pre-initiation complex (PIC) (Fig.1.29), and facilitate the recruitment of RNA polymerase to the promoter and the transcription start site (TSS) [123-124]. The most studied core promoter element is the TATA promoter element, situated 25 bp upstream of the TSS, which includes the consensus sequence TATAa/tAa/t [123]. The first transcription factor recruited to the promoter is the TFIID complex, which is implicated in the positioning of Pol II and in determining the transcription start site [123]. TFIID consists of the TATA box binding protein (TBP), which recognizes and binds to the TATA box, and various TBP associated factors (TAFs) [123]. TFIIF and non-phosphorylated Pol II then bind, followed by the recruitment of TFIIE and TFIIH [123]. Once the PIC is assembled and NTPs become available, strand separation occurs at the TSS to give an open complex. The large subunit of Pol II is then phosphorylated, resulting in transcription initiation, followed by release of Pol II from the promoter [123].

Figure 1.29. Process of transcription. Schematic diagram showing the process of transcription initiation, which involves the recruitment of the general transcription factors (TFIID, B, F, E and H) to the DNA to form the PIC (pre-initiation complex) with the non-phosphorylated Pol II (Pol IIA). Elongation then follows in the presence of NTPs (nucleotide tri-phosphates) and phosphorylated Pol II (Pol IIO). After termination Pol II is de-phosphorylated and the whole cycle of events can be re-initiated. Figure taken from [123].

The formation of the PIC can also occur via an “RNA Pol II holoenzyme” intermediate, in which RNA polymerase II is complexed with various proteins, among which are mediators and chromatin remodeling factors. This holoenzyme can bind to promoters without the ordered assembly of the general transcription factors described above (Fig.1.30) [125].

Figure 1.30.
Formation of the pre-initiation complex (PIC). A: The GTFs (general transcription factors) assemble to form the PIC on the promoter in an ordered manner. B: The PIC can also be assembled through the recruitment of the RNA Pol II holoenzyme, which includes the GTFs, Srb/Mediator proteins (Srb/Med) and chromatin re-modelling factors (CRFs). The bent arrow denotes the transcription start site. Abbreviations: INR, initiator element. Figure adapted from [125].

The formation of the PIC happens only once, because when RNA Pol II is released from the promoter, a “scaffold” structure, which includes TFIID, E and H and mediator, remains bound to the core promoter [125-126]. This mechanism of transcription is known to direct low level (basal) transcription; however, transcription can be activated by sequence-specific transcriptional activators, which bind to specific DNA sequences, usually around 6-12 bp long, found upstream of the promoter [125-126]. Activators can interact with the proteins implicated in the transcription process and as a consequence enhance PIC assembly, or they could facilitate any of the processes proceeding PIC formation [125-126]. Activators can also act at the level of chromatin, as chromatin can restrict access of the transcriptional machinery to the promoter, thereby preventing the formation of the pre-initiation complex [125-126]. Activators are also regulated by co-activators, which do not have any sequence-specific binding properties but are recruited to specific domains of the promoters through interactions with DNA bound activators [125-126]. They are similar to activators in that they facilitate the formation of the PIC or modify chromatin [125-128]. Co-activators are important for the regulation of transcription since they exert positive or negative effects on activators [125-126].
They also mediate transcription factor target selectivity, inducing the expression of particular subsets of the transcription factor target genes and thus giving rise to specific physiological outcomes [125-126, 129-130]. Co-activators can be classified into two classes: one that modifies chromatin, and another that interacts with RNA polymerase II and other general transcription factors [127-128]. Formation of nucleosomes suppresses transcription because the DNA becomes inaccessible to the transcriptional machinery [127-128]. For gene activation this suppression has to be relieved through alteration of chromatin structure via acetylation, resulting in the activation of transcription [127-128]. Histone deacetylation results in transcriptional repression [125-126]. Eukaryotic cells contain two classes of chromatin modifying proteins, namely the ATP-dependent chromatin-remodeling complexes, e.g. SWI/SNF complexes, and the histone-modifying enzymes, e.g. the histone acetyltransferases p300 and CBP [127-128]. CBP, as the name implies, interacts with CREB, and this interaction is dependent on the phosphorylation of CREB by cAMP-dependent protein kinase A [131]. The protein p300 was discovered through its interaction with the adenovirus protein E1A [131]. CBP and p300 are transcriptional co-activators that bridge sequence-specific transcription factors such as CREB activating transcription factor and p53 to the basal transcription machinery (e.g. TATA-box binding protein (TBP) and transcription factor IIB), and hence are implicated in RNA polymerase recruitment and in transcription initiation and activation [131-132]. Both p300 and CBP have intrinsic histone acetyltransferase (HAT) activity and both interact with multiple transcription regulators, making them critical integrators of various signalling pathways [131]. Apart from histones, p300 and CBP acetylate various non-histone proteins and transcription factors, thereby regulating transcription at multiple levels [127-128].
Formation of the PIC is dependent on two complexes, TFIID and mediator [127]. The binding of TFIID to the TATA box is a critical step in the formation of the PIC, but this alone cannot fully activate transcription; the concerted action of TFIID and the mediator (a multi-subunit complex of proteins), together with their direct interaction, is required for efficient PIC formation [127]. The co-ordination of chromatin remodeling with PIC formation was studied to elucidate the mechanism involved, and it was shown that the first step is the interaction of the mediator and p300, resulting in chromatin acetylation [127]. The interaction of p300, mediator and template restricts access to other cofactors including TFIID, which is critical for PIC formation [123]. P300 is auto-acetylated and undergoes a conformational change, which results in its subsequent dissociation from this complex with DNA [127]. This dissociation is enhanced further by the competitive interaction of TFIID with the mediator, and so the concerted action of auto-acetylation and competition from TFIID results in the dissociation of p300 and a subsequent increase in the binding of TFIID, which is a critical event for the activation of transcription [127]. This then results in the recruitment of the other general transcription factors (GTFs) and formation of the PIC [127]. This is the mechanism that co-ordinates chromatin remodeling with PIC formation [127].

In summary, transcription initiation is a critical event, closely regulated by many proteins. These include transcription factors, which can activate or suppress transcription, and repressor proteins, which can inhibit either the action of transcription factors that facilitate transcription or the basal transcriptional machinery [125-126, 132]. Gene-specific transcription repression is a critical aspect of gene regulation, and it depends on the activity of transcriptional repressors [133].
Repressors can be categorized according to their repression range, i.e. short range or long range [133]. Long-range repressors make the promoter resistant to all enhancers, a mechanism known as “promoter silencing” [133]. Short-range repressors, in contrast, inhibit the function of activators bound locally on the DNA rather than that of distantly bound activators [133]. An example of long-range transcriptional repression comes from the yeast co-repressor protein Tup1, which is a tetrameric complex [133]. It has been shown that Tup1 mediates its repression through interaction with one of the subunits of the mediator complex, a large multi-subunit complex that interacts with the C terminus of the large subunit of RNA Pol II and is important for transcriptional activation [133].

Various sub-complexes of proteins assemble on promoters, but on type 2 promoters of RNA Pol III it is TFIIIC that recognizes the promoter sequences and then recruits TFIIIB to the transcription start site of RNA polymerase III target genes, leading to the recruitment of Pol III [122, 134]. This recruitment is initiated by the TPR containing subunit, TFIIIC131, which interacts with the TFIIIB related factor TFIIIB70/Brf1 [134], leading to the recruitment of TFIIIB followed by TFIIIC release [134]. TFIIIC131 contains 11 TPR motifs, and a mutation has been identified within TPR2 of the protein [134]. This mutation increases Pol III gene transcription through an increase in the recruitment of TFIIIB70 to the TFIIIC-DNA complex [134], brought about by a conformational change which exposes a TFIIIB70 binding site [134]. Research was carried out to identify the HATs implicated in Pol III-mediated transcription, since p300/CBP are known to be critical for Pol II and Pol I transcription, as described above [128].
It was subsequently hypothesized that p300 would also be implicated in the activation of Pol III transcription. TFIIIC also has HAT activity; however, a role for this HAT activity in suppressing nucleosome formation has not been clearly shown [128]. Chromatin immuno-precipitation experiments have shown that p300 is present on the promoters of different transcribed Pol III genes in vivo [128]. P300 was indeed found to be a co-activator for Pol III and, as initially predicted, is implicated in chromatin remodeling as well as in formation of the PIC [128]. The latter is through its interaction with TFIIIC and its subsequent recruitment [128]. P300 HAT activity is critical for transcription activation at the chromatin level, but p300 is also important for stabilization of the TFIIIC-DNA complex and formation of the pre-initiation complex for the transcription of histone-free DNA templates, independent of its HAT activity [128]. Research has identified a link between cancer and the deregulation of Pol III transcription, as elevated expression of RNA Pol III target genes has been detected in various cancers [135]. It has been shown that TFIIIB is one of the contributors to this link, as high levels of TFIIIB as well as TBP, BRF1, Bdp1 and Brf2 have been detected in various cancers [135].

1.12 STRAP (Stress responsive activator of p300)

1.12.1 STRAP discovery

The focus of this investigation is the 440 amino acid protein STRAP (Fig.1.31A), which was discovered through a yeast two-hybrid screen as one of the components of a multi-protein complex containing junction mediating and regulatory protein (JMY) and p300 [136]. Sequence analysis of STRAP predicted the presence of six tandem TPR motifs distributed throughout the protein (Fig.1.31C) [136]. More detailed sequence alignment of the STRAP TPR motifs showed that amino acid residues at certain positions within these domains are conserved (Fig.1.31B) [32-33, 37, 136].
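The conservation described above can be checked directly from the motif sequences. The following is a minimal sketch, assuming the six 34-residue motifs and the eight consensus positions exactly as listed in Fig.1.31B; the names TPR_MOTIFS, CONSENSUS and consensus_matches are hypothetical and introduced only for illustration. It counts exact residue identity at each consensus position, which is a stricter criterion than the conservation of residue size and hydrophobicity usually applied to TPR motifs, so the counts should be read as a lower bound on conservation.

```python
# Illustrative sketch only: scores each predicted hSTRAP TPR motif against the
# consensus positions quoted in Fig.1.31B. All names here are hypothetical.

# 34-residue TPR motifs of hSTRAP as listed in Fig.1.31B: (start residue, sequence).
TPR_MOTIFS = {
    "TPR I": (69, "QVLMLTGKALNVTPDYSPKAEELLSKAVKLEPEL"),
    "TPR II": (103, "VEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNK"),
    "TPR III": (179, "GRSWYILGNSYLSLYFSTGQNPKISQQALSAYAQ"),
    "TPR IV": (224, "PDLHLNRATLHKYEESYGEALEGFSRAAALDPAW"),
    "TPR V": (332, "NSGAVILGKVVFSLTTEEKVPFTFGLVDSDGPCY"),
    "TPR VI": (373, "VQSWGVLIGDSVAIPEPNLRLHRIQHKGKDYSFS"),
}

# Consensus residue at each conserved position (1-based within the motif).
CONSENSUS = {4: "W", 7: "L", 8: "G", 11: "Y", 20: "A", 24: "F", 27: "A", 32: "P"}


def consensus_matches(motif: str) -> int:
    """Count consensus positions at which the motif carries the exact consensus residue."""
    return sum(1 for pos, residue in CONSENSUS.items() if motif[pos - 1] == residue)


if __name__ == "__main__":
    for name, (start, seq) in TPR_MOTIFS.items():
        end = start + len(seq) - 1  # last residue number of the motif
        score = consensus_matches(seq)
        print(f"{name}: residues {start}-{end}, {score}/{len(CONSENSUS)} exact consensus residues")
```

Run as a script, this prints one line per motif with its residue range and the number of exact consensus matches, which can be compared against the residues highlighted in Fig.1.31B.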
(A)
001-MMADEEEEVKPILQKLQELVDQLYSFRDCYFETHSVEDAGRKQQDVQKEM-050
051-EKTLQQMEEVVGSVQGKAQVLMLTGKALNVTPDYSPKAEELLSKAVKLEP-100
101-ELVEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNKVSLQNLSMVLRQLR-150
151-TDTEDEHSHHVMDSVRQAKLAVQMDVHDGRSWYILGNSYLSLYFSTGQNP-200
201-KISQQALSAYAQAEKVDRKASSNPDLHLNRATLHKYEESYGEALEGFSRA-250
251-AALDPAWPEPRQREQQLLEFLDRLTSLLESKGKVKTKKLQSMLGSLRPAH-300
301-LGPCSDGHYQSASGQKVTLELKPLSTLQPGVNSGAVILGKVVFSLTTEEK-350
351-VPFTFGLVDSDGPCYAVMVYNIVQSWGVLIGDSVAIPEPNLRLHRIQHKG-400
401-KDYSFSSVRVETPLLLVVNGKPQGSSSQAVATVASRPQCE-440

(B)
TPR I    069-QVLMLTGKALNVTPDYSPKAEELLSKAVKLEPEL-102-hSTRAP
TPR II   103-VEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNK-136-hSTRAP
TPR III  179-GRSWYILGNSYLSLYFSTGQNPKISQQALSAYAQ-212-hSTRAP
TPR IV   224-PDLHLNRATLHKYEESYGEALEGFSRAAALDPAW-257-hSTRAP
TPR V    332-NSGAVILGKVVFSLTTEEKVPFTFGLVDSDGPCY-365-hSTRAP
TPR VI   373-VQSWGVLIGDSVAIPEPNLRLHRIQHKGKDYSFS-406-hSTRAP
TPR Consensus ...W..LG..Y........A...F..A....P..

(C)

Figure 1.31. STRAP sequence and its conservation. A: hSTRAP is a 440 amino acid protein; the amino acids underlined and highlighted in different colours are part of the predicted TPR motifs. The amino acids highlighted in green, red, blue, orange, purple and brown are part of TPR motifs 1, 2, 3, 4, 5 and 6 respectively. B: Sequence alignment of the six TPR motifs of the human homologue of STRAP (hSTRAP). Eight amino acid positions are conserved between the TPR motifs (positions 4, 7, 8, 11, 20, 24, 27 and 32); these amino acids are highlighted in red and correlate with the general features of TPR motifs. The numbers at the start and end of each TPR motif indicate the residue numbers of the start and end of that motif. C: Distribution of the six TPR motifs within hSTRAP, highlighted in grey and labelled I to VI.
Figure adapted from [136].

1.12.2 STRAP, p300 and JMY

STRAP forms a complex with the proteins p300 and JMY (STRAP-p300-JMY) [136]. p300 is a 300 kDa phospho-protein discovered in 1986 through its interaction with the E1A (adenovirus early region 1A) protein [137]. Sequence analysis revealed 63% homology between human p300 and the CBP (CREB binding protein) family of proteins, with higher identity in certain regions, namely the E1A binding site region [137]. From this observation it was predicted that p300 and CBP have similar functions, which was then confirmed by further experiments [137]. As mentioned above, both CBP and p300 are transcriptional co-activators, acting as adaptor molecules between DNA binding factors and the transcriptional machinery [137-140]. P300 is implicated in diverse cellular functions, among which are proliferation, differentiation, cell cycle regulation, apoptosis and the DNA damage response pathway [137-140]. P300 is also implicated in the p53 response, and interacts with both positive and negative regulators of p53 [138-140]. The p53 negative regulator Mdm2 binds to p300 through residues 102-222 of Mdm2; when this region is mutated, Mdm2 cannot bind to p300 and as a consequence cannot degrade p53 [138-140].

JMY, a 110 kDa protein, was discovered through a yeast two-hybrid assay with a truncated version of p300 (611-2283) and was named according to its function, “junction mediating and regulatory protein” [141]. The JMY-p300 interaction was confirmed to be direct, and both proteins were found to exist in a multi-component co-activator complex [141]. JMY sequence analysis identified a number of interesting features, among which are clusters of probable CdK phosphorylation sites at its N terminus, a string of proline rich residues at its C terminus, and a central conserved adenovirus E1A CR2 homology region [141].
JMY is a 983 amino acid protein which contains two p300 binding domains, located between amino acid residues 1-119 and 469-558 [141]. Experiments mapped the regions of p300 that interact with JMY to residues 611-1257 and 1572-2283 [141]. To initially deduce whether JMY could be implicated in cancer, the chromosomal location of its gene had to be identified [141]. The gene was mapped to chromosome 5, band 5q13.2, a region implicated in various malignancies such as leukemia [141]. Further research is needed to clarify the role of the JMY gene in these malignancies [141]. JMY and p300 are implicated in the regulation of the p53 response, as the p300-JMY complex has been found to enhance p53-dependent transcription, but this depends on the activation domain of p53 being intact [141]. The expression levels of p53 target genes such as Bax also increased upon expression of JMY and p300 [141]. In addition, a complex of p53-JMY-p300 was detected in U2OS cells, indicating the formation of a ternary complex [141]. The outcome of the increase in p53-dependent transcription brought about by this trimeric p53/p300/JMY complex was an increase in p53 dependent apoptosis [141]. Different isoforms of JMY have different effects on the p53 response; for example, a JMY mutant in which the proline rich sequence is deleted (ΔP) promotes cell cycle arrest rather than apoptosis [141].

It is known that Mdm2 targets p53 for degradation through the process of ubiquitination, and recent research has implicated JMY in this process [142]. It has been shown that when DNA is damaged, an increase in JMY protein is detected, followed by an increase in p53 activity; hence JMY was identified as a DNA damage responsive protein [142]. Furthermore, Mdm2 inhibitors were shown to increase the levels of JMY protein, which suggested that Mdm2 negatively regulates JMY [142].
JMY and Mdm2 co-expression results in an increase in poly-ubiquitinated JMY, as Mdm2 ubiquitinates JMY for subsequent degradation through the proteasome [142]. As a consequence, JMY can no longer activate the p53 response [142].

1.12.3 STRAP function

STRAP is a stress responsive element: under stress the levels of STRAP increase, as does the interaction between p300 and JMY [136]. STRAP interacts with JMY via its N-terminal domain (1-205) and with p300 through its C-terminal region (206-438) (Fig.1.32) [136]. The role of STRAP in the JMY/p300 complex has been suggested to be the stabilization of the interaction between these two proteins [136].

Figure 1.32. STRAP interaction with JMY and p300 through distinct TPR motifs. The six TPR motifs are highlighted in grey and labeled I to VI. Residues 1-205 of hSTRAP are implicated in JMY interaction and residues 206-438 are implicated in p300 interaction. Figure adapted from [136]

In addition, STRAP has been shown to increase the half-life of p53, possibly by blocking the interaction of the tumour suppressor with MDM2, and to co-activate its transcriptional activity [136]. Furthermore, STRAP has been found to interact with PRMT5 under conditions of DNA damage, thus allowing PRMT5 recruitment to p53 under these conditions (Fig.1.33) [143]. PRMT5 methylates p53 at Arg333, Arg335 and Arg337, which are located within the oligomerization, nuclear export and nuclear import domains of p53 (Fig.1.33), thereby regulating the cell cycle [143]. Under stressful conditions, an increase in the interaction between STRAP and p53 is detected, through which the level of p53 activity is maintained [136].

Figure 1.33. STRAP and the p53 Response. STRAP allows the recruitment of PRMT5 to p53 when DNA is damaged, resulting in the methylation of three arginine residues, Arg333, 335 and 337, on p53 by PRMT5. This in turn affects the p53 response.
Figure adapted from [143]

As mentioned above, STRAP is implicated in regulating p53, and it is also implicated in the DNA damage response pathway [143-148]. DNA can be damaged by ionizing radiation, which leads to the activation of the related phosphatidylinositol-3-OH-kinase-like kinases ATM and ATR, which in turn activate a signaling cascade [144-148]. This signaling cascade includes many components such as p53, Chk1 and Chk2 [144-145, 147-148]. p53 is a tumour suppressor and a critical protein, as p53 mutations are found in about 50% of human cancers [146]. When DNA is damaged, ATM phosphorylates STRAP at position Ser203, which is within TPR3 of STRAP, resulting in STRAP nuclear localization [144-145, 147]. Once STRAP has localized to the nucleus, Chk2, which is downstream of ATM, phosphorylates STRAP at position Ser221 (Fig.1.34) [144-145, 147]. This site is not within a TPR motif but in the junctional region between TPR3 and TPR4 [144-145, 147]. This phosphorylation event leads to STRAP stabilization, which then leads to the assembly of the STRAP, p300 and JMY complex (Fig.1.34) [144-145, 147]. This results in p53 histone acetylation and activation of the DNA damage response (Fig.1.34) [144-145, 147].

Figure 1.34. STRAP and the DNA damage Response pathway. ATM and Chk2 phosphorylate STRAP at positions 203 and 221 respectively, which leads to STRAP nuclear localization and stabilization respectively. This then leads to the activation of the DNA damage response pathway.

STRAP remains in the cytoplasm in the two ataxia-telangiectasia (AT) cell lines tested, both of which have non-functional ATM [145, 147]. A mutant form of STRAP that cannot be phosphorylated by ATM also remains in the cytoplasm [145]. Translocating STRAP to the nucleus in these conditions of defective ATM restores STRAP stabilization and the DNA damage response [145, 147].
This shows that nuclear STRAP plays an important role in the DNA damage response pathway through ATM [145, 147].

Cells can be exposed to a variety of environmental stresses, and as a result many stress response pathways exist to enable cells to survive in these conditions [148-149]. One type of stress that a cell responds to is heat shock, which leads to the activation of a set of chaperones called the heat shock proteins [148-149]. This response pathway involves HSF1, which under normal conditions exists as a monomer in the cytoplasm (Fig.1.35) [148-149]. When cells are heat shocked, STRAP interacts with HSF1, resulting in HSF1 phosphorylation and trimerization [148-149]. The STRAP/HSF1/p300 complex then binds to the heat shock elements (HSEs) of target heat shock protein genes, for example Hsp70 [148-149]. The histones of Hsp70 are then acetylated by p300, resulting in Hsp70 transcriptional activation [148-149]. Activation of Hsp70 leads to inhibition of apoptosis through inhibition of caspase activation and cytochrome c release (Fig.1.35) [148-149].

Figure 1.35. STRAP and the stress response pathway. Under heat shock STRAP interacts with HSF1, resulting in its trimerization and phosphorylation and the formation of the HSF1/STRAP/p300 complex. This complex then binds to the HSEs of target genes, for example Hsp70, causing their transcriptional activation. Abbreviations: HSF1, Heat Shock Transcription Factor 1; HSEs, Heat Shock Elements; Hsp70, Heat Shock Protein 70.

STRAP is also implicated in the regulation, under cellular stress, of the glucocorticoid receptor (GR), which is a member of the nuclear hormone receptor family [150]. GR is activated through its interaction with glucocorticoids and lipophilic hormones under stress [150]. Once activated, the GR regulates a diverse set of genes implicated in metabolism, inflammation and the immune response, both negatively and positively [150].
Due to its multi-functional properties, its activity is regulated through various mechanisms: protein stability, post-translational modifications and interactions with various co-factors [150]. Protein stability is a critical regulator of GR, as GR is a target for degradation through the process of ubiquitination [150]. GR interacts with various co-factors, such as p300 and heat shock proteins, which affect chromatin architecture and interact with the basal transcriptional machinery [150]. The interaction between hormone and GR causes a significant conformational change within GR, which exposes a surface that binds to the LXLL motif of target co-factors [150]. STRAP contains six TPR motifs [136] and one LXLL motif situated between TPR4 and TPR5 [150]. STRAP(220-440) was shown to interact with GR in A549 cells [150], and the most critical motifs in this interaction are TPR6 and the LXLL motif [150]. STRAP is an important regulator of GR stability, as the STRAP-GR interaction inhibits GR degradation and is critical for the stabilization of GR; it has been shown to increase the half-life of GR [150]. STRAP is also implicated in the regulation of GR transcription (Fig.1.36) [150].

Figure 1.36. Regulation of GR by STRAP. Under stress a trimeric GR/Hsp/TTC5(STRAP) complex is formed, resulting in stabilization of GR, possibly through Mdm2 inhibition. After the binding of glucocorticoid to GR, the receptor translocates to the nucleus and binds to the glucocorticoid response elements (GRE) of its target gene. STRAP(TTC5), p300, JMY and Hsp are all implicated in the regulation of GR-associated gene transcription. Figure taken from [150]

1.13 Aims of the Project

It has become evident that TPR proteins are evolutionarily conserved and implicated in various essential cellular functions [32]. STRAP is one predicted TPR domain protein with the rare structural characteristic that it harbors six predicted consecutive TPR domains throughout its sequence [136].
The existence of six similar domains within the sequence of STRAP is possibly required to give this protein the ability to interact with several different binding partners and thereby take part in multiple signaling pathways, in this way matching cellular needs to specific environmental conditions. These domains may also be necessary to amplify particular signals within the same pathway, giving the cell the capacity to respond to similar stimuli of varying intensity. In accord with the role of other TPR motif proteins [32], STRAP has been shown to interact with p300 and JMY to form a trimeric complex of STRAP/p300/JMY [136], thereby potentially being implicated in fundamental cellular signaling pathways. To address this, biochemical pull-down assays will be carried out to identify interacting partners of full-length hSTRAP protein in breast cancer cells, followed by mass spectrometry and proteomic analysis. Furthermore, the interaction pattern of truncated hSTRAP fragments in breast cancer cells will also be investigated and compared to that of full-length hSTRAP protein, with a view to mapping the region of hSTRAP involved in hSTRAP-ligand interaction. Another aim is to structurally characterize full-length hSTRAP and its truncated versions by NMR, X-ray crystallography and circular dichroism. For this, a protocol first has to be established to clone, express and purify these various hSTRAP constructs in high quantity.

2. Chapter Two. Materials and Methods

2.1 Materials

2.1.1 Chemicals and Reagents

Unless otherwise stated, all chemicals and reagents were of analytical grade. PBS (BR0014) was purchased from Oxoid (Hampshire, UK) in tablet form. DMEM (BE12-604F), Penicillin/Streptomycin (DE17-602E) and 0.5% (v/v) trypsin with EDTA (BE17-161E) were supplied by Lonza (UK). FBS and L-glutamine were obtained from GIBCO, Invitrogen (GIBCO BRL, Paisley, UK).
Dimethyl sulphoxide (DMSO), glycerol (G/0650/08), SDS (S/5200/53), NaCl (S/3160/60), DTT (BPE172-25), Na2HPO4 (S-4400/53), KH2PO4 (P/4800/53), D-Glucose (G/0500/55), EDTA (D/0700/53), Isopropanol (BP2618-500), Ethanol (BP2818-100) and Agarose (BPE1356) were purchased from Fisher Scientific (Leicestershire, UK). Prestained Protein marker Broad range 7-175 kDa (P7708S), 100 bp DNA ladder (N3231S) and 1 kb DNA ladder (N3232S) were purchased from New England Biolabs (Hertfordshire, UK). Protease Inhibitor cocktail (11836170001) was purchased from Roche (West Sussex, UK). DNA HyperLadder III (BIO-33043) was purchased from Bioline (London, UK). BugBuster (70544-3) was purchased from Novagen (UK). PMSF (P7626-1), Bromophenol blue (B0126), 2-mercaptoethanol (M6250), APS (A3678), TEMED (T9281), L-Glutamic acid (G251), Triton X-100 (23,472-9), Imidazole (56749), L-Glutathione reduced (G4251), Thiamine (T4625) and Ammonium bicarbonate (A6141) were purchased from Sigma Aldrich (Poole, Dorset, UK). PageRuler Pre-stained Protein ladder 10-170 kDa (SMO671) and Glacial acetic acid (A/0400/PB17) were purchased from Fermentas. InstantBlue (ISB1L) was purchased from Expedeon (Cambridgeshire, UK). Acrylamide (161-0156) and Bradford reagent (500-0006) were purchased from Biorad (Hertfordshire, UK). L-Arginine (104995000) was supplied by Acros Organic (Leicestershire, UK). 15N labelled ammonium chloride (299251) was purchased from Isotech (Champaign, USA). Calcium chloride (100704Y) and Magnesium sulphate (101514Y) were purchased from AnalaR (Leicestershire, UK). Tris (TRIS01) was purchased from Formedium (Norfolk, UK). Gel red stain (41003) was purchased from Biotium (Cambridge, UK). NP-40 (A2239.0100) was purchased from VWR BDH Prolabo (UK). Trypsin, mass spectrometry grade (V5280) was purchased from Promega (Southampton, UK).
Acetonitrile (51101) was purchased from Thermo Scientific (UK).

2.1.2 Enzymes and Kits

Phusion High fidelity DNA polymerase (M530S), dNTPs (N7552), BamHI High fidelity (R3136T), NdeI (R0111S) and T4 DNA ligase (M0202S) were purchased from New England Biolabs (Hertfordshire, UK). XhoI (10703770001) and alkaline phosphatase (10108138001) were supplied by Roche (West Sussex, UK). All kits, namely the PCR purification kit (28104), Gel Extraction kit (28704) and Mini prep Kit (27104), were purchased from Qiagen (West Sussex, UK). DNAse (107k7013) and RNase (126k760) were purchased from Sigma Aldrich (Poole, Dorset, UK). PreScission protease (27-0843-01) was purchased from GE Healthcare (UK).

2.1.3 Other consumables

His tag superflow Talon Resin (635506) was purchased from Clontech (France). GST Resin (17-0756-05), Superdex 200 26/60 GL (17-1070-01) and Viva-spin 500 3 kMWCO (28-9322-18) were purchased from GE Healthcare (UK). Tissue culture flasks, 10 cm plates and Petri dishes were obtained from Falcon (Runcorn, UK). JCSG+ screen (130920), PACT (130918), pH Clear Strategy I (130909) and II (130910) were purchased from Qiagen (West Sussex, UK). Morpheus (MD1-46) and MRC 96 well crystallization plates (MD11-00-100) were purchased from Molecular Dimensions (Suffolk, UK). The 2 µm filter (FC121) used in this study was purchased from Appleton (UK). The 25 mm 10 kMWCO membrane (RC-25-10) was purchased from Generon (Berkshire, UK). Snakeskin pleated dialysis tubing, 3 kMWCO (68035) was purchased from Thermo Scientific (UK). 10 and 15 well 1.5 mm combs were purchased from Biorad (Hertfordshire, UK). The Amicon stirred ultrafiltration cell 8010, 10 ml (5121) was purchased from Millipore (UK). The Phoenix nanolitre pipetting robot (600-0000-00) was purchased from Art Robbins Instruments (USA).

2.1.4 General buffers and solutions

The compositions of all the general buffers used in this study are listed in Table 2.1.

Table 2.1. Buffer compositions.
No; Buffer; Composition
1; TNN buffer; 50 mM Tris-HCl pH 7.4, 120 mM NaCl, 5 mM EDTA, 0.5% NP-40
2; 3x SDS sample buffer; 187 mM Tris, 30% Glycerol, 6% SDS, 15% 2-mercaptoethanol, 0.01% bromophenol blue
3; 50x TAE buffer; 2 M Tris, 1 M Glacial acetic acid and 50 mM EDTA
4; Upper stacking SDS PAGE buffer; 0.5 M Tris pH 6.8, 0.4% (w/v) SDS
5; Lower resolving SDS PAGE buffer; 1.5 M Tris pH 8.8, 0.4% (w/v) SDS
6; His tag Lysis buffer; 50 mM Tris pH 7, 300 mM NaCl, 50 mM L-Arginine, L-Glutamic acid, 0.5% (v/v) Triton X-100, 1% (v/v) each of Protease Inhibitors, PMSF, RNase and DNAse
7; His tag Wash buffer; 50 mM Tris, 300 mM NaCl, 50 mM Arginine and Glutamic acid and 5 mM Imidazole
8; His tag Elution buffer; 50 mM Tris, 300 mM NaCl, 50 mM Arginine and Glutamic acid, 200 mM Imidazole
9; GST tag Lysis buffer; 50 mM Tris pH 8.7, 300 mM NaCl, 50 mM Arginine and Glutamic acid, 0.5% (v/v) Triton X-100, 1% (v/v) of Protease Inhibitors, PMSF, RNase and DNAse
10; GST tag Wash buffer; 50 mM Tris pH 8.7, 300 mM NaCl, 50 mM Arginine and Glutamic acid and 1 mM DTT
11; GST tag Elution buffer; 50 mM Tris pH 8.7, 300 mM NaCl, 50 mM Arginine and Glutamic acid, 10 mM L-Glutathione reduced
12; Gel filtration buffer; 50 mM Sodium phosphate buffer pH 6.5, 150 mM NaCl, 50 mM Arginine, 50 mM Glutamic acid and 10 mM β-mercaptoethanol
13; CD buffer; 20 mM Sodium phosphate buffer, pH 6.5
14; Minimal media solution A; 88 mM Na2HPO4 and 55 mM KH2PO4, pH 7.2
15; Minimal media solution B; 19 mM 15N labelled ammonium chloride, 22 mM D-Glucose, 180 µM Calcium chloride and 200 µM Magnesium sulphate, 650 µL Trace elements

2.1.5 Chemically competent bacterial cells

The chemically competent bacterial cell lines used in this investigation, together with the supplier's details, a description and the antibiotic resistance of each, are listed in Table 2.2.

Table 2.2. Bacterial competent cells.
Cells; Supplier's details; Description; Antibiotic resistance of vector and cells
DH5α (18265-017); Invitrogen, UK; Cells used for molecular cloning processes; Ampicillin (50 µg/µL) (v)
BL21(DE3)-RIPL (230280); Stratagene, USA; Have extra copies of the rare codon tRNA genes; Ampicillin (50 µg/µL) (v), chloramphenicol (34 µg/µL) (c) and Streptomycin (50 µg/µL) (c)
T7 express (C2566H); New England Biolabs; T7 polymerase under the control of the lac operon rather than the lysogenic prophage (BL21(DE3) strains); Ampicillin (50 µg/µL) (v)
BL21(DE3)pLysS (69451-3); Novagen, UK; Encodes a natural inhibitor of T7 polymerase, the T7 lysozyme, which suppresses expression before IPTG induction; Ampicillin (50 µg/µL) (v) and chloramphenicol (34 µg/µL) (c)
BL21(DE3) single cells (69450-3); Novagen, UK; High level expression; Ampicillin (50 µg/µL) (v)
Rosetta Gami 2(DE3) (71351-3); Novagen, UK; Promotes disulphide bond formation and contains the tRNAs for the rare codons; Ampicillin (50 µg/µL) (v), Tetracycline (12.5 µg/µL) (c), chloramphenicol (34 µg/µL) (c) and Streptomycin (25 µg/µL) (c)
Shuffle T7 express (C3029H); New England Biolabs, UK; Contains chaperones to assist in the folding of protein and promotes correct disulphide formation; Ampicillin (50 µg/µL) (v), Streptomycin (25 µg/µL) (c)
Shuffle T7pLysY (C3027H); New England Biolabs, UK; Same properties as Shuffle T7 express but encodes a lysozyme, an inhibitor that suppresses expression before IPTG induction; Ampicillin (50 µg/µL) (v), Streptomycin (25 µg/µL) (c) and chloramphenicol (34 µg/µL) (c)

This table lists the supplier's details, a description of each cell line and the antibiotic resistance of the vectors (v) and bacterial competent cells (c) used. The vectors used in this investigation, pET14-b and pGEX-6P1, both confer resistance to ampicillin.
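Because later steps repeatedly call for growth "with the required antibiotics (Table 2.2)", the table can be treated as a simple lookup. The sketch below is only an illustration of that idea: the dictionary and function names are hypothetical, the strain names and resistances are copied from Table 2.2, and concentrations are left out deliberately.

```python
# Illustrative lookup built from Table 2.2 (names below are hypothetical helpers).
# (v) resistance comes from the vector; both pET14-b and pGEX-6P1 carry ampicillin resistance.
VECTOR_RESISTANCE = ("ampicillin",)

# (c) resistance carried by the competent cells themselves, as listed in Table 2.2.
CELL_RESISTANCE = {
    "DH5α": (),
    "BL21(DE3)-RIPL": ("chloramphenicol", "streptomycin"),
    "T7 express": (),
    "BL21(DE3)pLysS": ("chloramphenicol",),
    "BL21(DE3) single cells": (),
    "Rosetta Gami 2(DE3)": ("tetracycline", "chloramphenicol", "streptomycin"),
    "Shuffle T7 express": ("streptomycin",),
    "Shuffle T7pLysY": ("streptomycin", "chloramphenicol"),
}


def selection_antibiotics(strain):
    """Return the antibiotics to add when growing a transformed strain."""
    if strain not in CELL_RESISTANCE:
        raise KeyError(f"strain not listed in Table 2.2: {strain}")
    return VECTOR_RESISTANCE + CELL_RESISTANCE[strain]


if __name__ == "__main__":
    for name in CELL_RESISTANCE:
        print(name, "->", ", ".join(selection_antibiotics(name)))
```

Keeping vector-borne and cell-borne resistance separate mirrors the (v)/(c) annotation in the table, so an untransformed strain can be looked up from `CELL_RESISTANCE` alone.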
2.2 Mammalian Cell Culture

2.2.1 Cell lines

For this project the breast cancer cell line MCF-7 (p53+/+) was used, purchased from the European Collection of Cell Cultures (ECACC). Cells were grown in standardized media of Dulbecco's Modified Eagle's medium (DMEM), 10% (v/v) heat inactivated fetal bovine serum (FBS), 1% 10,000 U/ml penicillin and streptomycin (P/S) and 2 mM L-Glutamine. Cells were maintained in this medium at 37ºC, 21% O2, 5% CO2 and 74% N2.

2.2.2 Cell passage and maintenance

Cells were grown in 75 cm2 vented tissue culture flasks and regularly passaged at the desired confluence of 70-80%. Cell culture was carried out using aseptic techniques in Class II microbiological safety cabinets. Once the desired confluence was reached, cells were sub-cultured as follows: growth media was removed and the cell monolayer was washed with sterile PBS. Then 2 ml of 1x trypsin/EDTA, diluted in PBS, was added to the cells and incubated for 2 mins at 37ºC to aid detachment. Complete media was added to the flask to neutralise the trypsin and an appropriate amount of the cell suspension was transferred to a new flask, with the addition of more fresh complete media. For routine culture, all cell lines were passaged at a dilution ratio of 1:4. Cells were seeded into 75 cm2 vented tissue culture flasks for general cell maintenance or into 10 cm cell culture plates for biochemical pull down assays.

2.2.3 Biochemical pull down assays

Once a confluency of 60-70% was achieved on the adhesive 10 cm cell culture plates, growth media was removed and the cell monolayer was washed twice with sterile cold PBS. Then 150 µl of cold TNN buffer was added to the plates and the cells were scraped off the plates. The resulting lysate was then transferred into 2 ml eppendorfs and incubated at 4ºC for 25 mins on the roller.
After this, the lysate was centrifuged at 15871g for 10 mins, and the resulting supernatant was transferred to a universal tube and thoroughly mixed. 100 µl of this supernatant was then added to each resin sample. This lysate-resin sample was incubated at 4ºC on the roller for 1 hr and then centrifuged again at 15871g for 5 mins. The supernatant was discarded and the pellet was washed with cold PBS and centrifuged at 15871g for 2 mins; this was repeated three times. Then 30 µl of 3x SDS sample buffer was added to the pellet, which was heated at 100ºC for 5 min.

2.3 Cloning of hSTRAP constructs

2.3.1 Cloning of full length hSTRAP into pET14b (His-hSTRAP(1-440))

The full-length hSTRAP codon optimized sequence was synthesized by GENEART in the vector pET-14b, in frame, using NdeI and BamHI restriction sites. Hence, the plasmid DNA for this construct was ready to be transformed into DH5α cells with a view to carrying out mini-preps (See Section 2.4).

2.3.2 Cloning of truncated versions of hSTRAP (His-hSTRAP) into pET-14b

For this project truncated hSTRAP protein constructs were cloned into pET-14b, taking into consideration both the predicted structured boundaries and the positions of the TPR motifs, as well as the pI of the protein. Five constructs were subsequently chosen to be cloned into pET14b: hSTRAP(1-219), hSTRAP(220-440), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440).

2.3.2.2 Primer design

Primers were designed to have a melting temperature between 60-70ºC, a GC content of 40-60% and a GC clamp (GGC) at the 5’ end (Table 2.3). The program used to aid primer design was http://www.basic.northwestern.edu/biotools/oligocalc.html. Table 2.3 shows all the primers used for the cloning of the truncated versions of hSTRAP. These primers were obtained from Sigma Aldrich in HPLC purified form.

Table 2.3.
PCR primers

Construct; Primer Name; Primer Sequence (5’-3’)
hSTRAP(1-219); P1-F3TPR-Fwd; GGCCATATGATGGCCGATGAAGAAGAAGAAGTT
hSTRAP(1-219); P2-F3TPR-Rev; GGCGGATCCTCATCATTTACGATCAACTTTTTCTGCCTGT
hSTRAP(220-440); P3-L3TPR-Fwd; GGCCATATGGCAAGCAGCAATCCGGATCTG
hSTRAP(220-440); P4-L3TPR-Rev; GGCGGATCCTTATTATTCACACTGCGGACGG
hSTRAP(1-150); P5-F2TPR-Fwd; GGCCATATGATGGCCGATGAAGAAGAAGAAGTT
hSTRAP(1-150); P6-F2TPR-Rev; GGCGGATCCTTATTAACGCAGCTGACGCAGAACCAT
hSTRAP(151-284); P7-M2TPR-Fwd; GGCCATATGACCGATACCGAAGATGAACATAG
hSTRAP(151-284); P8-M2TPR-Rev; GGCGGATCCTTATTACACTTTACCTTTGCTTTCCAGCA
hSTRAP(285-440); P9-E2TPR-Fwd; GGCCATATGAAAACCAAAAAACTGCAGAGCATGC
hSTRAP(285-440); P10-E2TPR-Rv; GGCGGATCCTTATTATTCACACTGCG

GC content (%) 41 40 Tm (ºC) 57 55 41 57 66 64 68 66 64 43 46 66 66 66 44 50 70 68 68 68 68 Annealing temp (ºC) 68 66

PCR primers used to clone the truncated versions of hSTRAP, which contain different combinations of TPR motifs. The GC content varies from 40-57% and the melting temperature (Tm) varies from 64-68ºC. The annealing temperature that was used for the PCR reactions for each construct is shown in the last column.

2.3.2.3 Polymerase Chain reaction

Following primer design, the PCR reactions were set up on ice. For these PCR reactions Phusion High fidelity DNA polymerase was used. Each PCR reaction was 50 µl, consisting of 20% (v/v) GC Buffer (the compatible buffer for this polymerase) (10 µl), 200 μM dNTPs (5 µl), 0.5 µM each of forward and reverse primer (1 µl each of the primers shown in Table 2.3), approx 300 pg template DNA (1 µl), 1 µl Phusion Hot Start Polymerase and 31 µl of MilliQ water. The PCR protocol that was followed is shown in Table 2.4.

Table 2.4. PCR reaction protocol.

Step; Number of Cycles; Temperature (ºC); Time (Secs)
1; 1; 98; 30
2; 35; 98; 30
2; 35; Annealing temp (64-68, See Table 2.3, Column 6); 30
2; 35; 72; 90
3; 1; 72; 300
4; 1; 10;

This table shows the PCR protocol that was used to clone the truncated hSTRAP constructs.
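As a quick consistency check on the reaction set-up and cycling protocol above, the component volumes and programmed hold times can be summed. This is a minimal sketch using only the figures quoted in Section 2.3.2.3 and Table 2.4; the function names are hypothetical, and ramping time between temperatures is ignored, so the real run on a thermocycler will be longer.

```python
# Component volumes (µl) for one 50 µl reaction, as listed in the text.
REACTION_VOLUME_UL = 50.0
COMPONENTS_UL = {
    "GC buffer": 10.0,
    "dNTPs": 5.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "template DNA": 1.0,
    "Phusion polymerase": 1.0,
}


def water_volume(total_ul, parts):
    """Volume of water needed to bring the listed parts up to total_ul."""
    water = total_ul - sum(parts.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    return water


def program_seconds(initial_s, cycles, cycle_steps_s, final_s):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


water_ul = water_volume(REACTION_VOLUME_UL, COMPONENTS_UL)
# Table 2.4: 30 s at 98ºC, then 35 cycles of 30 s / 30 s / 90 s, then 300 s at 72ºC.
run_s = program_seconds(30, 35, (30, 30, 90), 300)

if __name__ == "__main__":
    print(f"water to add: {water_ul:.0f} µl")
    print(f"buffer fraction: {COMPONENTS_UL['GC buffer'] / REACTION_VOLUME_UL:.0%}")
    print(f"programmed time: {run_s} s (about {run_s / 60:.0f} min)")
```

The water volume comes out at the 31 µl stated in the text, and the buffer fraction at the stated 20% (v/v).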
2.3.2.4 PCR Purification

PCR samples were purified to eliminate all impurities such as primers, nucleotides, enzymes, mineral oil, salts and agarose, using the Qiagen PCR purification kit. The protocol followed and the buffers mentioned below were supplied with the kit. This kit uses silica membrane technology: DNA binds to the membrane under the high salt conditions provided by the binding buffer, and is then eluted off the membrane under low salt conditions. The clean up process involved adding 5 volumes of buffer PBI (binding buffer) to 50 µl of the PCR reaction. The color of the mixture was observed to determine its pH, as it should be yellow after addition of the buffer, indicating a pH of less than 7.5. If the mixture is any other color, addition of 10 µl of sodium acetate pH 5.0 is recommended, but in this case this was not needed as the mixture turned yellow. The sample was then applied to the middle of the QIAquick column provided with the kit and centrifuged for 45 sec at 15871g, and the flow through was discarded. Then 750 µl of Buffer PE (with ethanol) was applied to the middle of the column to remove the high salt, the column was centrifuged at the same speed for 45 sec and again the flow through was discarded. The column was then centrifuged for 1 min to eliminate excess ethanol. The column was placed in a clean microcentrifuge tube and 30 µl of Buffer EB (low salt buffer) was added to elute the DNA off the column. The column was left to stand for 5 min to achieve a higher concentration of DNA, and the eluted DNA was then transferred into a clean microcentrifuge tube.

2.3.2.5 Restriction digests

The PCR products were then digested with the enzymes BamHI and NdeI; other restriction digestions in this project were performed using the same procedure. For each reaction 3 µg of sample was digested in a 50 µl reaction (including 10x Buffer compatible with the enzyme), with 20 U of each enzyme. All digestions were incubated for 3 hrs at 37ºC.
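The digest set-up described above can be turned into pipetting volumes once the DNA concentration and enzyme stock concentrations are known. The sketch below takes the quantities as 3 µg of DNA, 20 U of each of two enzymes and a 10x buffer in a 50 µl reaction; the DNA concentration (150 ng/µl) and enzyme stock (20 U/µl) in the example are assumed values for illustration only, and the function name is hypothetical.

```python
def digest_volumes(dna_ng_per_ul, enzyme_units_per_ul,
                   dna_ug=3.0, total_ul=50.0,
                   units_per_enzyme=20.0, n_enzymes=2):
    """Return pipetting volumes (µl) for a double digest.

    dna_ng_per_ul and enzyme_units_per_ul are measured or label values
    supplied by the user; the defaults follow the quantities in the text.
    """
    dna_ul = dna_ug * 1000.0 / dna_ng_per_ul          # µg -> ng, then ng / (ng/µl)
    buffer_ul = total_ul / 10.0                        # 10x buffer diluted to 1x
    enzyme_ul = units_per_enzyme / enzyme_units_per_ul
    water_ul = total_ul - dna_ul - buffer_ul - n_enzymes * enzyme_ul
    if water_ul < 0:
        raise ValueError("DNA is too dilute to fit 3 µg into the reaction volume")
    return {
        "DNA": dna_ul,
        "10x buffer": buffer_ul,
        "each enzyme": enzyme_ul,
        "water": water_ul,
    }


if __name__ == "__main__":
    # Assumed example inputs: 150 ng/µl DNA and 20 U/µl enzyme stocks.
    for name, vol in digest_volumes(150.0, 20.0).items():
        print(f"{name}: {vol:.1f} µl")
```

The `ValueError` flags the practical limit of the protocol: below roughly 70 ng/µl (with these assumed enzyme stocks), 3 µg of DNA no longer fits into 50 µl alongside buffer and enzymes.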
2.3.2.6 Agarose gel electrophoresis

The digested DNA was mixed with 10x DNA loading buffer (0.25% bromophenol blue, 0.25% xylene cyanol FF, 30% glycerol in water) and subjected to electrophoresis on a 1% agarose gel (1 g/100 ml 1x TAE, 0.1 mg/ml ethidium bromide). The electrophoresis was carried out in Tris-acetate/EDTA (TAE) buffer (0.04 M Tris-acetate, 0.001 M EDTA) at 80 V for approximately 90 min. 5 µl of the DNA HyperLadder (Bioline) was used to determine the size of the DNA fragments.

2.3.2.7 Ligation

DNA ligase is an enzyme that is used to join DNA fragments together by catalysing the formation of phosphodiester bonds between a juxtaposed 5’ phosphate and a 3’ hydroxyl terminus in duplex DNA. The T4 ligase was originally purified from T4 phage-infected E. coli cells, and uses ATP to repair single-stranded nicks in duplex DNA and also to connect duplex DNA restriction fragments that have either blunt or cohesive ends. Insert to vector ratios of 1:1 and 5:1 were used for the ligation reactions, using 100 ng of vector. 10x ligation buffer (660 mM Tris-HCl, 50 mM MgCl2, 10 mM dithiothreitol, 10 mM ATP, pH 7.5) was added to the reaction at a 1:10 dilution together with 1 U DNA ligase (Roche), and sterile distilled water was added to make up a final volume of 10 µl. The reaction was incubated at 16ºC overnight and 5 µl was transformed into competent E. coli DH5α cells. The same vector was used in a reaction without any insert as a control.

2.3.3 Cloning of Full length hSTRAP into pGEX-6P1 (GST-hSTRAP(1-440))

Full-length hSTRAP with the non-optimized gene sequence was originally cloned in an HA tagged vector, pHA1, by Sandra Taylor in the laboratory of Dr Marija Kristic Demonacos. For this project full-length hSTRAP was cloned into the GST plasmid pGEX-6P1 (GE Healthcare, 27-4597-01; See Fig.3.5) using the methods described above, with the cloning of this construct starting from section 2.3.2.5.
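For the ligation step in Section 2.3.2.7, the mass of insert that corresponds to a given insert:vector molar ratio follows from the relative lengths of the two fragments: insert mass = vector mass x (insert length / vector length) x ratio. The sketch below applies this standard relation to the 100 ng of vector and the 1:1 and 5:1 ratios quoted in the text. The fragment lengths are hypothetical round numbers chosen for illustration, not the actual sizes of the vectors or hSTRAP inserts, and the function name is an assumption.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Equal molar amounts of two DNA fragments have masses in proportion
    to their lengths, so the vector mass is scaled by insert_bp / vector_bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector


VECTOR_NG = 100.0   # from the text
VECTOR_BP = 5000    # hypothetical vector length, replace with the real plasmid size
INSERT_BP = 500     # hypothetical insert length, replace with the real fragment size

if __name__ == "__main__":
    for ratio in (1, 5):
        mass = insert_mass_ng(VECTOR_NG, VECTOR_BP, INSERT_BP, ratio)
        print(f"{ratio}:1 insert:vector -> {mass:.1f} ng insert")
```

With these placeholder lengths the 1:1 and 5:1 reactions need 10 ng and 50 ng of insert respectively; substituting the true fragment sizes gives the working values.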
However, additional steps were performed between restriction digestion and ligation: vector alkaline phosphatase treatment, and gel extraction and purification, which are described below.

2.3.3.2 Alkaline Phosphatase treatment

The GST vector, pGEX-6P1, was treated with alkaline phosphatase to prevent vector re-ligation, in a reaction containing alkaline phosphatase at 2% of the total volume and 10% de-phosphorylation buffer. The reaction mixture was then incubated at 37ºC for 60 min. The alkaline phosphatase was inactivated by the addition of 200 mM EDTA (10% of the total volume) and incubation at 65ºC for 10 min. The plasmid was then stored at -20ºC.

2.3.3.3 Gel Extraction and Purification

DNA isolation: the DNA bands representing the PCR product were excised under UV light and purified using the Qiagen gel extraction kit. This protocol utilises the ability of the column membrane to bind DNA when the buffers provide the right salt concentration and pH, and is based on the principle that the adsorption of nucleic acids onto the silica surface is only possible when the concentration of chaotropic salts is high. Adsorption is ~95% when the pH is ~7.5 and is dramatically reduced at a higher pH. One volume of the excised gel was dissolved in 3 volumes of buffer QG at 50ºC. One volume of isopropanol was then added and the sample was transferred onto a QIAquick spin column and centrifuged for 1 min at 17,900g. Buffer PE was used to wash off contaminants and the flow-through was again discarded. The bound DNA was recovered using 30 µl of buffer EB by centrifugation for 1 min.

2.4 Transformation of plasmid DNA into competent E. coli cells

LB broth was produced by dissolving 25 g of Luria Bertani (LB) powder in 1 L of distilled water. The LB solution was autoclaved (15 psi, 121ºC, 30 min). For the preparation of LB agar plates, 15 g of agar was dissolved in 1 L of distilled H2O containing 25 g of LB powder, which was autoclaved under the same conditions.
For colony selection, the antibiotic ampicillin (50 µg/ml) was added (Table 2.2) to the mixture when its temperature was about 50ºC. The vectors used in this investigation, pET-14b and pGEX-6P1, confer resistance to ampicillin (Fig.3.2 and Fig.3.5 respectively). Preparations of competent bacteria were carried out in E. coli DH5α, a derivative of Hanahan’s strain DH5. The DH5α strain provides a transformation efficiency of 1.5 x 10^8 transformants/µg, which is higher than that of the DH5 parent strain. CaCl2 was used to make DH5α cells competent. CaCl2 dissociates into Ca2+ and Cl-; the presence of Ca2+ outside the cell and the heat shock provided during transformation create intense osmotic pressure between the inner and outer sides of the cell membrane, thus allowing plasmid DNA to permeate into the cell. A DH5α glycerol stock was streaked on an LB plate. Bacterial colonies were picked and allowed to grow to an OD600 in 5 ml LB culture (without antibiotic). Following centrifugation at 4062g, cells were resuspended in 500 ml ice-cold 50 mM CaCl2 and incubated on ice for 20 minutes. The bacterial suspension was then centrifuged at 4062g for 10 min at 4ºC. Pellets collected from centrifugation were resuspended in 12 ml of a sterile solution of 0.53 ml 2 M CaCl2, 1.675 ml 100% glycerol and 10.09 ml sterile dH2O. 50 µl aliquots of competent cells were dispensed into microfuge tubes and stored at -80ºC for later use. Aliquots of 50 µl of competent Escherichia coli DH5α were thawed on ice, mixed with approximately 100 ng of plasmid DNA (vector only, GST-hSTRAP, His-hSTRAP ligation mixture or the synthesized His-hSTRAP(1-440) DNA) and incubated for 30 min on ice. Bacteria were then heat-shocked for 1 min at 42ºC and immediately returned to the ice for 2 min. 500 µl of LB media without antibiotic was added and the bacteria were incubated in a shaker for 1 h at 37ºC. A volume of 200 µl was streaked on an LB agar plate containing ampicillin.
The plates with the transformed bacteria were inverted and incubated at 37ºC for 16 h to reach the stationary phase of their growth shown in Figure 2.3. The plates were kept inverted overnight at 37ºC, as the bacterial cells are in the exponential growth phase (Fig.2.3). Plates were then stored at 4ºC and could be used for up to 4 days. The next day, colonies were observed on these plates, which were then inoculated in LB with their required antibiotics (Table 2.2) overnight at 37ºC, with shaking, for expression studies. A point to note is that for Shuffle T7 express and Shuffle T7pLysY all incubations are done at 30ºC rather than 37ºC.

Figure 2.1. Bacterial growth curve. Bacteria take time to adjust to their new growth conditions (lag phase) before they can start dividing. Once bacteria have adjusted, they divide regularly and enter their exponential growth phase at around 10 hrs from the start of bacterial growth. Eventually they stop dividing and enter the stationary phase, and then after 7 hrs enter the death phase. Figure taken from [151]

2.5 DNA Mini preps

Colonies obtained through transformation of plasmid DNA into DH5α cells were inoculated in LB-Amp overnight at 37ºC with shaking. These inoculations were then centrifuged at 3381g for 20 mins and the supernatant was discarded. Mini preps were carried out using the protocol and buffers provided with the Qiagen QIAprep Mini prep Kit. This kit again uses silica membrane technology and works on the same principle as the other Qiagen kits mentioned in this investigation. Cell pellets were re-suspended in 250 µl of Buffer P1, and then 250 µl of buffer P2 was added and mixed thoroughly. Then 350 µl of buffer N3 was added and mixed thoroughly, and this lysate was centrifuged for 10 mins at 15871g. The supernatant was then added to the QIAprep spin column and the column was centrifuged for 60 secs at 15871g.
The flow through was discarded, 750 µl of Buffer PE was added to the column and the column was centrifuged again at 15871 × g for 60 secs. The flow through was discarded and the column was centrifuged at 15871 × g for a further 60 secs to remove any residual buffer. To elute the DNA, the column was placed in a clean eppendorf and 50 µl of Buffer EB was added to the centre of the column. The column was left to stand for 3 mins and then centrifuged at 15871 × g for 60 secs. Mini preps were then stored at -20 °C.

2.6 Sequencing

Mini preps obtained from the cloning procedures were sequenced by GATC Biotech using their own commercial primers specific for pET-14b, which bind to the T7 promoter and terminator regions of the vector (Fig.3.2). Sequencing data from all successful clones are shown in the Appendix. These mini preps were then transformed into various cell lines following the procedure described in Section 2.4, to carry out expression trials to determine the optimum growth conditions for soluble hSTRAP protein.

2.7 Expression trials

Extensive expression trials using various expression cell lines were undertaken for each protein construct to determine the optimum conditions for soluble hSTRAP protein expression. This was necessary because the competent cell strains have different properties, which affect the expression and yield of protein (Table 2.2).

For His-hSTRAP(1-440), plasmid DNA was transformed into BL21(DE3)-RIPL, T7, BL21(DE3)pLysS, BL21(DE3) single cells and Rosetta-gami 2(DE3). Of the five truncated constructs of hSTRAP, three were transformed only into BL21(DE3)pLysS: hSTRAP(1-219), hSTRAP(151-284) and hSTRAP(285-440). The other two constructs, hSTRAP(1-150) and hSTRAP(220-440), were also transformed into Shuffle T7 express and Shuffle T7pLysY.
All the transformations into the different cell lines were carried out following the procedure described in Section 2.4. A point to note is that for Shuffle T7 express and Shuffle T7pLysY, growth was carried out at 30 °C throughout. For each cell line, colonies were obtained by transformation, and on that same day one colony was inoculated into LB media with the respective antibiotics (Table 2.2) and grown overnight at 37 °C, with shaking. The next day 500 µl of overnight culture was transferred into 50 ml of fresh LB media containing the required antibiotics and the OD600 was checked. Growth was started when the OD600 was between 0.05 and 0.1, and the OD600 was checked at regular intervals. Once the OD600 was between 0.5 and 0.7, the cells were induced with varying IPTG concentrations and at varying temperatures. Cells were induced at this OD because it corresponds to the exponential growth phase of bacteria and was found to be the optimal condition for bacterial growth (Fig.2.3). In both of the vectors used in this study, pET-14b and pGEX-6P1, expression is inducible by IPTG (Fig.3.2 and Fig.2.1 respectively).

Before cells were induced, a pre-induction sample was taken for later SDS PAGE analysis. This involved taking a 200 µl aliquot from the growth media and centrifuging it at 15871 × g for 10 mins. The supernatant was discarded and the pellet was stored at -20 °C until it was lysed. Every hour for up to 4 hrs after induction, a 200 µl aliquot was taken, centrifuged and stored in the same way as the pre-induction sample; these were named post-induction samples.

Cell pellets obtained through expression trials then had to be lysed to analyse soluble and insoluble hSTRAP protein expression. Pellets were kept on ice during the lysis process and lysis was carried out as follows: pellets were lysed with 5 ml of Bugbuster per gram of pellet, supplemented with 1% (v/v) of protease inhibitor, PMSF, DNase and RNase.
The lysate was then centrifuged at 15871 × g for 10 mins, and the supernatant (soluble fraction) and pellet (insoluble fraction) were stored separately. These samples were then analyzed for expression by SDS-PAGE; 15 µl of sample in loading buffer (containing 5% 2-mercaptoethanol) and 5 µl of molecular marker were loaded. In this project two protein molecular markers were used: Prestained Protein marker Broad range, 7-175 kDa, and Pageruler Pre-stained Protein ladder, 10-170 kDa. Gels were run at 180 volts for 50 mins, stained with InstantBlue for 20 mins and visualised on the camera.

Glycerol stocks were made of every construct in every cell line that was tested during these expression trials. Each construct was first grown with its required antibiotics (Table 2.2) overnight at 37 °C with shaking. The next day, 750 µl of culture was added to 250 µl of autoclaved 100% glycerol in an eppendorf. This was mixed thoroughly and stored at -80 °C straight away. These glycerol stocks were used for future growths, rather than transforming afresh for every growth.

2.8 SDS-PAGE Gels

In this project SDS PAGE gels of various percentages were prepared; the volumes of 30% (w/v) acrylamide, lower resolving buffer and distilled water used for the resolving gel are shown in Table 2.5. The stacking gel always consisted of 6.2 ml of upper stacking buffer, 1.3 ml of 30% (w/v) acrylamide and 2.5 ml of distilled water. The resolving and stacking solutions were prepared in advance, but the polymerising agents APS and TEMED were not added until all the plates were assembled properly. Once the spacer and short plates were assembled as explained in the manufacturer's guide, 50 µl of 10% (w/v) APS and 20 µl of TEMED were added and the resolving gel was pipetted between the two plates.
The resolving gel usually sets within 20 mins, and the same procedure was then followed with the stacking gel, except that a 10 or 15 well 1.5 mm comb was inserted into the stacking gel, depending on the number of samples to be analyzed by SDS PAGE.

Table 2.5. Resolving gel components.

SDS PAGE gel (%)    30% (w/v) acrylamide (ml)    Distilled water (ml)    Lower buffer (ml)
7.5                 2.5                          5.0                     2.5
10                  3.3                          4.2                     2.5
12                  4.0                          3.4                     2.5
15                  5.0                          2.5                     2.5

This table shows the volumes of acrylamide, distilled water and lower buffer added to make each percentage of SDS PAGE gel.

2.9 Large scale expression and protein purification of all hSTRAP variants

2.9.1 Full length His-hSTRAP and truncated constructs of hSTRAP

Glycerol stocks of all His-hSTRAP protein constructs in BL21(DE3)pLysS were scraped with a sterile tip and transferred into 25 ml of LB with the required antibiotics (Table 2.2). These cultures were grown overnight at 37 °C, with shaking, for no more than 16 hrs, as the cells stop dividing after that. The next day 10 ml of overnight culture was pipetted into 500 ml of LB with antibiotics. As before, the OD600 was checked and growth was not started until the OD600 was between 0.05 and 0.1. Once the OD600 had reached 0.5 (exponential bacterial growth phase, see Fig.2.3), cells were induced with 0.1 mM IPTG for 3 hrs at 37 °C for His-hSTRAP(1-440). For all the other His-hSTRAP constructs, cells were induced at an OD600 of 0.5 with 0.1 mM IPTG for 3 hrs at 25 °C. Shuffle T7 express and Shuffle T7pLysY cells were induced at 30 °C rather than 25 °C. After 3 hrs of induction, all cells were harvested at 1583 × g for 35 mins. The pellet was re-suspended in His tag lysis buffer (Table 2.1) and the cells were lysed using the Cell Disrupter. All proteins were purified following the procedure described below, with the buffer compositions shown in Table 2.1; the only difference was the pH of the purification buffers used for each hSTRAP construct.
Purification of His-hSTRAP(1-440) was done at pH 7.4, purification of hSTRAP(1-219), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440) was done at pH 8.7, and purification of hSTRAP(220-440) was done at pH 8.2.

The Cell Disrupter was first washed thoroughly with water at 15 kPSI, and the cells were then run through the Cell Disrupter at 15 kPSI and collected. The resulting lysate was run through the Cell Disrupter repeatedly until the sample was clear, after which 1% (v/v) of protease inhibitors, PMSF, RNase and DNase was added. The machine was then washed with water and 20% (v/v) ethanol at 15 kPSI until the resulting solution was clear. The lysate was then centrifuged at 20238 × g for 30 mins at 4 °C.

The next step was to purify hSTRAP on the Talon column, which is used to purify proteins with a His tag. Purification was carried out at 4 °C. First, 1.5 ml of Talon resin was added to the column, and the column was washed with 10 × bed volume of distilled water and 20 × bed volume of His tag wash buffer. A 10 µl sample of the resin was then taken (R1) as the clean resin sample and stored at -20 °C. Once the lysate from the cell lysis step had been centrifuged, a supernatant and a pellet sample were taken, representative of the soluble and insoluble fractions respectively, for SDS PAGE analysis. The supernatant was then poured through the column and the flow through was collected. The column was washed with 50 ml of His tag wash buffer and the flow through was collected for analysis. Another 10 µl resin sample was then taken (R2) as the bound resin sample. Once the bound resin sample had been taken, the protein was eluted with His tag elution buffer. The elution buffer was supplemented with H-MIX for His-hSTRAP(1-440), hSTRAP(1-219) and hSTRAP(220-440). For the latter hSTRAP protein variant, H-MIX was also added to all purification buffers.
Protease inhibitor was added to all the elutions immediately after they were eluted off the column. Another 10 µl resin sample was taken (R3) as the "clean resin" sample. The column was then washed again with 10 × bed volume of His tag wash buffer and distilled water, and stored in 20% (v/v) ethanol at 4 °C. Elutions and the various controls taken during the purification procedure were then analyzed by SDS PAGE. After that, elutions shown to be pure by SDS PAGE were pooled together and dialyzed and/or concentrated down to a smaller volume.

2.9.2 GST-hSTRAP(1-440)

Glycerol stocks of GST-hSTRAP(1-440) in BL21(DE3)pLysS were scraped with a sterile tip and transferred into 25 ml of LB with the required antibiotics (Table 2.2). This culture was grown overnight at 37 °C, with shaking, for no more than 16 hrs. The next day, 10 ml of overnight culture was pipetted into 500 ml of LB with antibiotics. As before, the OD600 was checked and growth was not started until the OD600 was between 0.05 and 0.1. Once the OD600 had reached 0.5, the cells were induced with 0.1 mM IPTG for 3 hrs at 25 °C, and the cells were harvested at 1583 × g for 35 mins. The pellet was re-suspended in GST tag lysis buffer and lysed using the Cell Disrupter, following the procedure described in Section 2.9.1. As for the His-tagged proteins, the lysate was then centrifuged at 20238 × g for 30 mins at 4 °C.

The next step was to purify GST-hSTRAP on the GST tag affinity resin, and purification was carried out at 4 °C. First, 1 ml of GST resin was added to the column, which was then washed with 10 × bed volume of distilled water and 20 × bed volume of GST tag wash buffer. A 10 µl sample of the resin was then taken (R1) as the clean resin sample and stored at -20 °C. Once the lysate from the cell lysis step had been centrifuged, a supernatant and a pellet sample were taken for SDS PAGE analysis.
The supernatant was then poured through the column and the flow through was collected. The column was washed with 20 × bed volume of GST tag wash buffer and the flow through was collected. Another 10 µl resin sample was then taken (R2) as the bound resin sample. Once the bound resin sample had been taken, the protein was eluted with 5 × 1.5 ml of GST elution buffer. Another 10 µl resin sample was taken (R3) as the "clean resin" sample. The column was then washed again with 10 × bed volume of wash buffer and distilled water, and stored in 20% (v/v) ethanol at 4 °C. The elutions were analyzed by SDS PAGE, after which elutions were pooled together according to the purity shown by SDS PAGE. This sample was then dialyzed and/or concentrated down to a smaller volume.

2.10 Determining the concentration of protein

2.10.1 Bradford reagent

The concentration of protein was determined using the Bradford Reagent Assay. In a 2 ml eppendorf tube, 800 µl of distilled water was mixed with 200 µl of Bradford reagent. A calibration curve was plotted using BSA as the standard, and from it an equation was derived to estimate protein concentration. If 5, 3, 1.5 or 1 µl of eluted protein was added to the 800:200 µl water and Bradford mix, the OD595 obtained was multiplied by 15 and divided by 5, 3, 1.5 or 1 respectively, depending on the volume of protein taken for the reading. Protease inhibitor was added to the elutions, which were stored at 4 °C until SDS PAGE gel analysis had verified that hSTRAP protein had been eluted.
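The Bradford calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used: the function name is hypothetical, the factor of 15 is the calibration factor quoted in the text, and the result is assumed to be in mg/ml because the text does not state the units explicitly.

```python
def bradford_concentration(od595: float, sample_volume_ul: float,
                           calibration_factor: float = 15.0) -> float:
    """Estimate protein concentration from a Bradford reading.

    od595              -- absorbance at 595 nm of the 800:200 µl water/Bradford mix
    sample_volume_ul   -- volume of eluted protein added (5, 3, 1.5 or 1 µl in the text)
    calibration_factor -- factor from the BSA calibration curve (15 in the text)

    The returned value is assumed to be in mg/ml (units are not stated in the text).
    """
    if sample_volume_ul <= 0:
        raise ValueError("sample volume must be positive")
    return od595 * calibration_factor / sample_volume_ul


# Example with made-up readings: the same OD595 obtained from a smaller
# sample volume implies a proportionally more concentrated protein stock.
for volume in (5.0, 3.0, 1.5, 1.0):
    print(volume, bradford_concentration(0.40, volume))
```

Dividing by the sample volume is what allows readings taken with different volumes of protein to be compared on the same scale.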
2.10.2 Protein sample absorbance at 280 nm

Firstly, the theoretical extinction coefficient at 280 nm (E280) had to be determined, which was done using the formula below:

E280 = (number of tryptophan residues × 5500) + (number of tyrosine residues × 1490) + (number of cysteine residues × 125)

The E280 for each hSTRAP protein variant was calculated using the software at http://www.basic.northwestern.edu/biotools/proteincalc.html, which gave the E280 for hSTRAP(1-440), hSTRAP(1-219), hSTRAP(220-440), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440) as 47090, 27670, 19420, 16860, 19060 and 11170 respectively. For each hSTRAP protein variant, the protein concentration can then be calculated by first measuring the absorbance at 280 nm of buffer only in the cuvette. That value is zeroed and the absorbance of the protein sample at 280 nm is then measured. That value is then entered into this formula to determine protein concentration:

Concentration of protein (M) = Absorbance at 280 nm / (E280 × cuvette path length (cm))

Because E280 calculated in this way is a molar extinction coefficient (M⁻¹ cm⁻¹), this quotient is a molar concentration; multiplying it by the molecular weight of the construct (g/mol) gives the concentration in mg/ml.

2.11 Concentration of protein to a smaller volume

2.11.1 Amicon Concentration

2.11.1.1 Concentration of protein into a buffer

The elutions that contained pure hSTRAP protein, as analyzed by SDS PAGE, were pooled together and concentrated to a smaller volume in an Amicon. The amicon was assembled as described in the manufacturer's guide, and for all the constructs cloned in this project a 25 mm 10 k membrane was used. This amicon can hold up to 10 ml of protein sample, which can be concentrated down to 500-1000 µl. The membrane was first washed thoroughly with distilled water and then the amicon was connected to the nitrogen gas cylinder. The pressure was adjusted to 30 Bar and the amicon was placed on the magnetic stirrer. The membrane was again washed with ample amounts of water, at an approximate flow rate of one drop per 5 secs. Once the membrane was washed, the pooled elutions were poured into the amicon.
The sample was subjected to the same pressure and approximate flow rate. Once the protein sample had reached the desired volume, the pressure was released but the amicon was left on the stirrer for 20 mins to wash protein off the membrane. The concentrated protein sample was then transferred to an eppendorf and the concentration was determined using Bradford reagent (see Section 2.10.1).

2.11.1.2 Concentration of protein along with buffer exchange in the amicon

For His-hSTRAP(1-440) and hSTRAP(1-219), the elution buffer was supplemented with H-MIX, as this was shown to be a critical addition for the stability of these protein constructs. Dialysis was not possible for these constructs because the contents of H-MIX are relatively expensive in large volumes. In these cases successive buffer exchanges were done in the amicon. The initial steps were the same, but the pooled elutions were poured into the amicon and concentrated to 1 ml, and then 9 ml of the optimised buffer for that protein construct was added. For His-hSTRAP(1-440) this was 50 mM sodium phosphate buffer, 50 mM NaCl, arginine and glutamic acid, 10 mM β-mercaptoethanol and H-MIX, and for hSTRAP(1-219) this was pure H-MIX only, pH 8. Once the sample had again been concentrated to a tenth of its volume, another 9 ml of the optimised buffer was added to the amicon. This cycle of concentrating to a tenth of the volume and adding the same amount of storage buffer was repeated several times, after which the sample was concentrated to a final volume of 1 ml in the amicon. The amicon was left to stir for about 20 mins without applying any pressure. The concentrated sample was then transferred to an eppendorf and the concentration was determined with Bradford reagent (see Section 2.10.1).
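To illustrate why a few concentrate-and-refill cycles suffice for buffer exchange, the following sketch estimates how much of the original elution buffer remains after each cycle. It is an idealised calculation, not part of the original protocol: it assumes perfect mixing and that small buffer components pass freely through the membrane, and the function name and default volumes (1 ml retained, 9 ml added, as in the text) are illustrative.

```python
def residual_buffer_fraction(cycles: int,
                             retained_ml: float = 1.0,
                             added_ml: float = 9.0) -> float:
    """Fraction of the original buffer left after repeated exchange cycles.

    Each cycle concentrates the sample to `retained_ml` and tops it up with
    `added_ml` of fresh buffer, so the original buffer is diluted by
    retained_ml / (retained_ml + added_ml) per cycle (1/10 with the defaults).
    Assumes ideal mixing and free passage of buffer solutes through the membrane.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    per_cycle = retained_ml / (retained_ml + added_ml)
    return per_cycle ** cycles


# The number of cycles is not specified in the text ("several times");
# the values below are purely illustrative.
for n in range(5):
    print(n, residual_buffer_fraction(n))
```

Under these assumptions each cycle reduces the original buffer tenfold, so three cycles leave roughly 0.1% of it while consuming far less of the expensive buffer than dialysis would.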
If the required concentration was not achieved through this process, the protein had to be concentrated further with Viva spin columns, as described in the section below.

2.11.2 Viva spin500 concentrators

Viva spin500 concentrators can concentrate protein down to 10 µl with very high concentrate recovery (approx. 96%). The device consists of a vertical polyethersulfone membrane, which prevents membrane blockage, and a thin channel filtration chamber. With these Viva spin500 concentrators, samples cannot be lost completely, as there is a threshold of 10 µl beyond which the sample cannot be concentrated any further.

The OD595 of the protein was first measured, and the volume of concentrate needed to achieve a concentration of around 10 mg/ml was estimated, assuming that the concentration increases linearly as the volume decreases. This volume was noted and the sample was concentrated to that volume. Viva spin500 3 kMWCO was used for all constructs; it can accommodate 500 µl of sample and can concentrate protein down to 10 µl. Hence, 500 µl of protein sample was added to the concentrating device and spun at 9230 × g. The sample was checked regularly and concentrated to the volume initially noted. Once the desired concentration was reached, the samples were analyzed on an SDS PAGE gel to check for purity.

2.12 Gel Filtration

The Superdex 200 26/60 GL gel filtration column was used in this study. The column was connected to the FPLC AKTA system, ensuring that no bubbles were introduced. All buffers run through the column were filter sterilized and degassed, and the pumps were washed with the appropriate buffers required in each case. Firstly, the column was equilibrated by washing it with 2 column volumes of MilliQ water and then 2 column volumes of gel filtration buffer. Once the column was equilibrated, concentrated hSTRAP protein was injected into the column. For Superdex 200 the maximum injection volume is 500 µl and for Superdex 75 it is around 20 ml.
The AKTA system has a manual available for that column, and this was followed. Once the run was completed, peaks were observed on the gel filtration trace and the fractions within these peaks were analyzed by SDS PAGE. Once it was confirmed which peak contained the required hSTRAP protein, the corresponding fractions were pooled and concentrated down to a smaller volume as described above (see Section 2.11).

2.13 X-RAY Crystallography experiments

The initial broad trials were set up in MRC 96-well plates using the Phoenix nano-litre pipetting robot. Crystallography trials were carried out when the concentration of His-hSTRAP(1-440) was between 12 and 20 mg/ml. The concentrated sample was divided into two halves, one of which was supplemented with graphite. The graphite nano-suspension was prepared by Dr Alexander Golovanov and was added because it was hypothesized to act as a nucleating platform for crystal growth (Dr Alexander Golovanov, personal communication). For the trials 12 µl of protein was needed per plate, as each well contained 0.2 µl of screen condition and 0.2 µl of protein. The first broad range sparse matrix trial was with the JCSG+ screen; the other commercial screens tested were PACT, Clear Strategy I, Clear Strategy II and Morpheus (see Section 2.1.3). Once the plates had been prepared by the Phoenix robot, they were checked immediately for any immediate effects of adding protein to buffer. The plates were then stored at 20 °C and checked every day for 2 weeks and then once a week for the next 3 months. When more specific trials were undertaken, the trial was plated out manually and the buffers were prepared by hand, as a commercial screen was not used. The drop size was 0.5 µl of screen to 0.5 µl of protein. The same temperature was used in all trials.
2.14 GST tag Cleavage

For on and off column GST cleavage, 200 µl of PreScission protease was added either to the column with GST-hSTRAP(1-440) bound to the GST affinity resin in wash buffer, or to eluted GST-hSTRAP(1-440) protein, respectively. Both samples were then kept on the roller at 4 °C, and resin or eluted protein samples were taken every hour for 3 hrs and then after overnight incubation. For both on and off column cleavage the flow through was collected and analyzed by SDS PAGE to determine whether cleavage was successful.

2.15 CD experiments

Pure elutions of hSTRAP protein, as well as controls (GST tag only), were first dialyzed into CD buffer at 4 °C using Snakeskin pleated dialysis tubing, 3 kMWCO. The concentration of hSTRAP protein or tag was determined using the procedure described in Section 2.10. A far UV spectrum of dialysed buffer only or of the hSTRAP protein sample was recorded at 4 °C over a wavelength range of 260-180 nm with a 0.1 cm pathlength, using the JASCO J-810 CD Spectropolarimeter connected to a temperature controller, under constant nitrogen flow. The final scan is an average of 4 scans taken under these conditions; spectra were corrected for potential background signal and, for GST-hSTRAP, for buffer and GST tag signal. A variable temperature experiment was then done, in which a reading was taken every 0.2 °C from 4 to 80 °C at a fixed wavelength of 220 nm, at a heating rate of 20 °C/hr. Once this experiment was completed, a scan was taken again at 4 °C over the wavelength range 260-180 nm.

2.16 NMR experiments

2.16.1 Expression of 15N labelled hSTRAP protein

This growth is very similar to unlabelled growth; the only difference is that growth is done in 15N labelled minimal media rather than LB media. Minimal media consists of solutions A and B (Table 2.1), of which solution A was prepared first and then autoclaved. Solution B was then dissolved in 20 ml of MilliQ water and filter sterilized through a 2 µm filter.
This was added to solution A after it had been autoclaved, the mixture was mixed thoroughly, and 500 ml of the media was poured into a 2 litre conical flask using aseptic techniques. Ampicillin was then added to the media. The protocol for expression and growth in labelled media was the same as for unlabelled growth of these truncated constructs of hSTRAP (see Section 2.9.1).

2.16.2 Acquiring of NMR spectra

All experiments were carried out at 30 °C unless stated otherwise, on Bruker 600 MHz Avance DRX spectrometers equipped with a cryoprobe. Protein samples were supplemented with 10% D2O. 1H 1D and 2D 1H-15N heteronuclear single-quantum coherence (HSQC) spectra were acquired using a WATERGATE pulse sequence for water signal suppression. SOFAST-HMQC 1H-15N correlation spectra were acquired on a Bruker DRX700 spectrometer.

2.17 Mass spectrometry experiments

The mass spectrometry experiments were carried out by the Mass spectrometry facility at the University of Manchester. The method they implemented is as follows:

Digestion: Bands of interest were excised from the gel and dehydrated using acetonitrile followed by vacuum centrifugation. Dried gel pieces were reduced with 10 mM dithiothreitol and alkylated with 55 mM iodoacetamide. Gel pieces were then washed alternately with 25 mM ammonium bicarbonate followed by acetonitrile. This was repeated, and the gel pieces were dried by vacuum centrifugation. Samples were digested with trypsin overnight at 37 °C.

Mass Spectrometry: Digested samples were analysed by LC-MS/MS using an UltiMate® 3000 Rapid Separation LC (RSLC, Dionex Corporation, Sunnyvale, CA) coupled to a LTQ Velos Pro (Thermo Fisher Scientific, Waltham, MA) mass spectrometer. Peptides were concentrated on a pre-column (20 mm x 180 μm i.d., Waters). The peptides were then separated using a gradient from 99% A (0.1% FA in water) and 1% B (0.1% FA in acetonitrile) to 25% B, in 45 min at 200 nL min⁻¹, using a 75 mm x 250 μm i.d.
1.7 μm BEH C18 analytical column (Waters). Peptides were selected for fragmentation automatically by data-dependent analysis.

Data Analysis: Data produced were searched using Mascot (Matrix Science UK) against the full database. Data were validated using Scaffold (Proteome Software, Portland, OR). Proteins were identified as hSTRAP interacting proteins if they were not found to bind to the control (tag only) and were detected twice or more in pull downs with hSTRAP protein variants, with 2 unique peptides (at an 80% peptide probability) and a Scaffold probability of over 95%.

2.18 Building the hSTRAP interactome network

The UNIPROT ID of each hSTRAP interacting partner was submitted to the DAVID bioinformatics software, found at http://www.david.abcc.ncifcrf.gov/, to assign these proteins to their respective pathways. The gene names of the hSTRAP interacting proteins implicated in these pathways were then submitted to the GeneMANIA and String 9.0 bioinformatics software, found at http://www.genemania.org/ and http://string-db.org/ respectively. These two programs determine the interaction status between the two proteins shown: whether it is a direct interaction proven by experiments, predicted, or derived from text mining. The interaction status for all proteins shown was then noted in Excel, and an interaction network was built from these results using Cytoscape, found at http://www.cytoscape.org/images/top_slides/cytoscapeDesktop1.png.

3. Chapter three. Results

3.1 Expression and purification of full length and truncated forms of hSTRAP protein

The aims of this project were to identify interacting partners of full-length hSTRAP and its truncated variants, and to map the regions of hSTRAP implicated in ligand interactions related to breast cancer. Another aim was to characterize the structure of full-length hSTRAP and its truncated versions by NMR, X-ray crystallography and circular dichroism.
To achieve both aims, purified homogeneous protein is needed, ideally in a tagged form that enables its easy attachment to the affinity resin. Therefore, full-length hSTRAP was cloned into two plasmids, pET-14b and pGEX-6P1, for expression with a 6-histidine tag and a GST tag respectively. Both protein constructs can then be structurally characterized as well as used to identify interacting partners of hSTRAP. Interaction data on full-length hSTRAP from two different vector systems would give an indication of the reproducibility and reliability of the interaction data. Furthermore, truncated versions of hSTRAP, covering different regions of hSTRAP and including different TPR motif combinations, were also cloned into pET-14b. These truncated constructs will be used for mapping the regions of hSTRAP responsible for the protein interactions identified, and potentially for solving the structure of shorter fragments of hSTRAP.

The following sections focus on establishing a protocol to clone, express and purify hSTRAP protein variants, both bound to the affinity resin and in the elutions. The expression and purification protocol was extensively optimized to obtain pure hSTRAP protein, as explained in more detail in the following sections.

3.1.1 Cloning, expression and purification of full length hSTRAP into pET14b (His-hSTRAP(1-440))

The codon optimized sequence of full-length hSTRAP was synthesized in frame with NdeI and BamHI restriction sites by GENEART in the vector pET-14b (Fig.3.1). Codon optimization was carried out to ensure that no rare codons were present in the hSTRAP sequence, as these could potentially affect expression and consequently the yield of hSTRAP protein obtained in E. coli. The structural aspect of this project requires a high concentration of hSTRAP protein, and hence any factors that could potentially affect protein expression were considered.

Figure 3.1. The pET-14b Vector.
The pET-14b vector carries the His tag at the N terminus and confers ampicillin resistance. Protein expression is inducible by IPTG.

Independent plasmid DNA sequencing confirmed the identity of the plasmid construct pET-14b-His-hSTRAP(1-440) (see Appendix), and expression trials were then initiated. Initial experiments showed that His-hSTRAP(1-440) protein does not express easily in soluble form. Since large quantities of pure, stable His-hSTRAP(1-440) protein, in the order of several milligrams, would be needed for structural studies, extensive trials were undertaken to determine the optimum conditions for soluble His-hSTRAP(1-440) protein expression. Typically, protein solutions of 10-20 mg/ml are required for crystallization trials.

Several expression cell lines of E. coli were tested, including BL21(DE3)-RIPL, Rosetta-gami 2(DE3), T7, BL21(DE3) single cells and BL21(DE3)pLysS (Table 2.2). IPTG concentrations of 0.1, 0.25, 0.5, 0.75 and 1 mM and temperatures from 37 to 30 °C were screened. Induction time, cell density (OD600 of cell media) at the time of induction, type of media and rate of shaking (RPM) were also investigated.

The first expression cell line tested was BL21(DE3)-RIPL, which contains a plasmid encoding tRNAs for rare codons (Table 2.2). The best soluble His-hSTRAP(1-440) expression in this cell line was obtained by induction with 0.5 mM IPTG, followed by 3 hrs incubation at 37 °C (Table 3.1). Purification yielded low quantities, in the order of 0.4 mg of His-hSTRAP(1-440) protein from 1 litre of E. coli culture, with less than 40% protein purity, as many contaminants were detected in the elutions (Table 3.1).

The next cell line tested was Rosetta-gami 2(DE3), which, as well as having a plasmid encoding tRNAs for rare codons, also contains mutations in the thioredoxin reductase (trxB) and glutathione reductase (gor) genes to promote disulphide bond formation (Table 2.2).
The best conditions for soluble expression of His-hSTRAP(1-440) protein in this cell line were induction with 0.25 mM IPTG, followed by 3 hrs incubation at 37 °C (Table 3.1). Purification of pure His-hSTRAP(1-440) was also unsuccessful in this cell line, as many contaminants were detected in the elutions (Table 3.1).

The next cell line tested was another BL21 derivative, T7, which has the T7 RNA polymerase under the control of the lac promoter; unlike BL21(DE3), this cell line has the T7 RNA polymerase on a lysogenic prophage, although the latter is dormant (Table 2.2). High expression of His-hSTRAP(1-440) protein was detected in samples representative of the soluble fraction when protein expression was induced with 1 mM IPTG, followed by 3 hrs incubation at 37 °C (Table 3.1). Purification of His-hSTRAP(1-440) was, however, unsuccessful, as only a cleaved fragment of hSTRAP was purified rather than full-length hSTRAP (Table 3.1).

The next cell line tested was BL21(DE3)pLysS, which also encodes the T7 RNA polymerase but additionally contains the pLysS plasmid (Table 2.2). This plasmid encodes T7 lysozyme, an inhibitor of T7 RNA polymerase, which suppresses protein expression before IPTG induction (Table 2.2); such pre-induction expression had been observed in previous expression trials with this protein construct. The best conditions for soluble His-hSTRAP(1-440) protein expression were induction with 0.25 mM IPTG, followed by 3 hrs incubation at 37 °C (Table 3.1). The highest yield of His-hSTRAP(1-440) protein was obtained in this cell line (<3 mg/litre); however, an extra antibiotic is needed to maintain the pLysS plasmid (Table 2.2), which could potentially slow cell growth and lower the yield of His-hSTRAP(1-440) protein. BL21(DE3) single cells were therefore tested to determine whether this was the case. The best His-hSTRAP(1-440) protein expression conditions in this cell line were induction with 0.25 mM IPTG, followed by 3 hrs incubation at 37 °C (Table 3.1).
Purification yielded both full length hSTRAP and a cleaved fragment of hSTRAP (Table 3.1). The yield of full length hSTRAP was higher in BL21(DE3)pLysS than in BL21(DE3) Single cells, hence BL21(DE3)pLysS was determined to be the optimum cell line for expression of the highest amounts of soluble His-hSTRAP(1-440) protein. In summary, all cell lines tested expressed His-hSTRAP(1-440) protein to varying extents, as shown by samples representative of soluble, insoluble and total expression analyzed by SDS PAGE. For all expression trials, LB media and a post-induction time of 3 hrs were found to be the best conditions. All E. coli cells were induced at an OD600 of 0.5, as this was found to be the optimal cell density for high soluble His-hSTRAP(1-440) protein expression. The finalized optimum conditions for soluble His-hSTRAP(1-440) protein expression were identified as BL21(DE3)pLysS cells, induced with 0.25 mM IPTG, followed by 3 hrs incubation at 37ºC.

Table 3.1. Expression trials and purification of His-hSTRAP(1-440).
Columns: Cell Line; IPTG concentrations (mM); Temperature 37ºC (Soluble Fraction, Insoluble Fraction); Temperature 30ºC (Soluble Fraction, Insoluble Fraction); Purification of Full length hSTRAP (50kDa).
BL21(DE3)RIPL 0.25 + + + 0.5 + + LY<0.4 mg per litre, >40% purity Rosetta Origami 2(DE3) T7 BL21(DE3)pLysS BL21 Single cells 1 0.25 0.5 0.75 1 0.25 0.5 0.75 1 0.1 0.25 ++ ++ ++ + + + ++ +++ ++ +++ + + + + + + + + + + + + + + + + + - + + + + + + + ++ ++ + + 0.5 0.75 1 0.1 0.25 0.5 0.75 1 + + + ++ + + + + + + + + + + + + + + + + + + + + + + + + + MC CF HY<3 mg per litre, <50% purity >50% purity
The different conditions used to express and purify full length hSTRAP protein are shown in this table. The (+), (++) and (+++) signs indicate low, medium and high His-hSTRAP(1-440) protein expression respectively. The best conditions of His-hSTRAP(1-440) protein expression using each different cell line are highlighted in red.
These latter conditions were then used to purify full-length protein. Successful His-hSTRAP(1-440) protein purification is indicated by (), and unsuccessful by (). Protein expression and purification trials were repeated at least twice. Abbreviations: (HY) Highest Yield; (VLY) Lowest Yield; MC, Many contaminants yielded; CF, cleaved fragment yielded.

Once the optimum soluble protein expression conditions had been identified through expression trials, large scale expression and purification of His-hSTRAP(1-440) was carried out as described in the Material and Methods section 2.9.1. Samples representative of the insoluble and soluble fractions, as well as total expression, at these optimum conditions of His-hSTRAP protein expression were analyzed by SDS PAGE (Fig.3.2). These gels confirm that hSTRAP is expressed and accumulates over time, as shown by samples representative of total expression (Fig.3.2B, Lanes 2-5), and that His-hSTRAP(1-440) is found mainly in the soluble fraction of lysed cells grown in optimal conditions (Fig.3.2A, Lanes 6-9).

Figure 3.2. Expression of His-hSTRAP(1-440). 10% SDS PAGE gels showing A: Insoluble (lanes 2-5) and soluble fractions (lanes 6-9) of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(1-440) after 1 hour (lanes 3 and 7), 2 hours (lanes 4 and 8), and 3 hours (lanes 5 and 9) post IPTG induction. B: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(1-440) transformed cell lysate 1 hour (lane 3), 2 hours (lane 4) and 3 hours (lane 5) post IPTG induction. Lane 1 in (A) and (B) represents protein markers, and lanes 2 and 6 pre-induction fractions respectively. Both gels were loaded with 15µl of sample and stained with Instant blue.

The purification protocol had to be extensively optimised to obtain high quantities of non-aggregated His-hSTRAP(1-440) protein in elutions from the His-tag affinity column (Fig.3.3). In the initial purification, the cell pellet was lysed by sonication.
This resulted in an abundance of contaminants within column elutions (Fig.3.3A, Lanes 10-12), and only approximately 15% of the protein present was full-length His-hSTRAP(1-440) (Fig.3.3A, Lanes 10-12, indicated by an arrow). In an attempt to decrease the amount of contaminants, another protein batch was purified, but this time cells were lysed using the Cell Disrupter, which is deemed to be a gentler method of lysis and less damaging for proteins. This improved the purity of His-hSTRAP(1-440) elutions to approximately 30% (Fig.3.3B, Lanes 10-12) compared to the previous 15% purity (Fig.3.3A, Lanes 10-12). This indicates that cell disruption, rather than sonication, is the better method of cell pellet lysis for purification of His-hSTRAP(1-440). As this latter purification was carried out at pH 8.0, the next experiment investigated the effects of pH on the purification of His-hSTRAP(1-440) protein. The pH of the purification buffers was lowered to pH 7.4 and this improved the purity of His-hSTRAP(1-440) in elutions by 50% (Fig.3.3C, Lanes 10-13). This suggested that lowering the pH improves His-hSTRAP(1-440) protein purity in elutions. However, the pH of the purification buffers could not be lowered any further, as this is not recommended for histidine-affinity resin purification and would also be closer to the pI of this protein construct, which is 6. Samples of His tag affinity resin were taken before protein was eluted off the resin, with a view to analysing the purity of protein bound to resin (Fig.3.3C, Lane 3). This gel showed that pure His-hSTRAP(1-440) protein was bound to the resin before elutions were taken (Fig.3.3C, Lane 3), after which a contaminant (likely degradation products) was detected in the elutions (Fig.3.3C, Lanes 10-13, indicated by an arrow labelled “contaminant”). This suggested that His-hSTRAP(1-440) was unstable in this elution buffer and that this buffer should be optimised further.
This led to the decision to add H-MIX, the newly developed proprietary protein solubilisation mixture by Dr Alexander Golovanov, to the elution buffer. H-MIX is a high-concentration mixture of four hydrophobic amino acids (L-Leu, L-Val, L-Ala and L-Ile) in a defined ratio. H-MIX was designed to act as a stabilising agent against proteolytic degradation and protein aggregation. Purification with this H-MIX supplement in the elution buffer yielded pure His-hSTRAP(1-440) in elutions, indicating that the stability of His-hSTRAP(1-440) in elutions had significantly increased (Fig.3.3D, Lanes 10-12, indicated by an arrow). The total estimate of the quantity of pure and stable His-hSTRAP(1-440) protein obtained from 1 L of E. coli culture was 4.73 mg, as measured by Bradford assay (Table 3.2). Biochemical binding assays with His-hSTRAP(1-440) can be carried out, as pure His-hSTRAP(1-440) protein can be obtained bound to the His tag affinity resin (Fig.3.3D, Lane 3).

Figure 3.3. Purification of His-hSTRAP(1-440) protein. 10% SDS PAGE gels showing A: His-hSTRAP(1-440) purification from sonicated BL21(DE3)pLysS cells using purification buffer with pH 8.0. B: His-hSTRAP(1-440) purification from BL21(DE3)pLysS cells lysed with the cell disrupter and purification buffer with pH 8.0. C: His-hSTRAP(1-440) purification from BL21(DE3)pLysS cells lysed with the cell disrupter and purification buffer with pH 7.4. D: Same as C but with the addition of H-MIX in the elution buffer. All gels were loaded with 15µl of sample and stained with Instant blue as described in Methods and Materials section. Migration of His-hSTRAP(1-440) is at 50 kDa as indicated by the arrow.

Table 3.2. Estimation of His-hSTRAP(1-440) protein concentration
Elution      OD595    Concentration (mg/ml)    Volume (mls)    Protein quantity estimate (mg)
Elution 1    0.167    0.50                     7               3.50
Elution 2    0.117    0.35                     3               1.05
Elution 3    0.020    0.06                     3               0.18
The concentration of His-hSTRAP(1-440) protein measured using Bradford assay in the elutions obtained from the purification method described in Figure 3.3D (for details see Materials and Methods).

In-gel digestion mass spectrometry was carried out to confirm protein identity and coverage of the putative His-hSTRAP(1-440) 50 kDa protein band (Fig.3.3C, indicated by the top arrow), and of the contaminant visible at 29 kDa during His-hSTRAP(1-440) purification (Fig.3.3C, indicated by the bottom arrow). Mass spectrometry confirmed that the protein band found in elutions at 50 kDa (Fig.3.3C, Lanes 10-12, top arrow) was indeed hSTRAP and coverage was indicative of full length protein (Fig.3.4A). The band detected at 29 kDa (Fig.3.3C, Lanes 10-12, indicated by the bottom arrow) was identified as a cleaved fragment of hSTRAP, with peptide coverage within amino acids 14-147 (Fig.3.4B), which includes two of the predicted TPR motifs.
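The yield arithmetic behind Table 3.2 can be reproduced with a short sketch. It assumes a linear conversion of 3 mg/ml per OD595 unit, which is inferred here from the tabulated values (0.50 / 0.167 is about 3) and is not stated in the text; the actual standard curve is the one described in Materials and Methods. Function and variable names are illustrative only.

```python
# Elution data taken from Table 3.2: (OD595, elution volume in ml).
ELUTIONS = [(0.167, 7.0), (0.117, 3.0), (0.020, 3.0)]

# Assumption: slope inferred from the table, not a value given in the thesis.
MG_PER_ML_PER_OD = 3.0


def concentration_mg_per_ml(od595, slope=MG_PER_ML_PER_OD):
    """Convert a Bradford OD595 reading to a concentration, rounded as in the table."""
    return round(od595 * slope, 2)


def quantity_mg(od595, volume_ml):
    """Protein quantity in one elution: concentration multiplied by volume."""
    return round(concentration_mg_per_ml(od595) * volume_ml, 2)


def total_yield_mg(elutions):
    """Sum the per-elution quantities to estimate the yield per litre of culture."""
    return round(sum(quantity_mg(od, vol) for od, vol in elutions), 2)


if __name__ == "__main__":
    for od, vol in ELUTIONS:
        print(od, concentration_mg_per_ml(od), vol, quantity_mg(od, vol))
    print("Total (mg):", total_yield_mg(ELUTIONS))
```

Summing the three per-elution quantities gives the 4.73 mg total quoted for 1 L of culture.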
(A) (B)
1   MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
81  TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH
161 VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
241 GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE
321 LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG
401 KDYSFSSVRV ETPLLLVVNG KPQGSSSQAV ATVASRPQCE
Figure 3.4. His-hSTRAP(1-440) mass spectrometry. A: Mass spectrometry performed on the hSTRAP band obtained at 50 kDa (top arrow in Fig.3.3C). B: Mass spectrometry performed on the hSTRAP band obtained at 29 kDa (bottom arrow in Fig.3.3C). The numbers of unique peptides (indicated in red) detected were 31 and 6 for the 50 kDa and 29 kDa bands shown in Figure 3.3C respectively.

3.1.2 Cloning, Expression and purification of GST-hSTRAP(1-440)
The non-optimized full-length hSTRAP gene sequence was originally cloned in the HA tagged vector, pHA1, by Sandra Taylor in the lab of Dr Marija Kristic-Demonacos. For this project full length hSTRAP had to be cloned into the GST plasmid, pGEX-6P1 (Fig.3.5). This GST vector was chosen because it allows an insert with XhoI sites at both termini to be inserted in frame (Fig.3.5), which was a requirement in this case.
Full length hSTRAP was amplified by PCR by Sandra Taylor and sequence verified by the DNA sequencing facility at the University of Manchester.

Figure 3.5. pGEX-6P1, GST expression vector. This 5 kbp vector has a GST tag, a PreScission protease cleavage site and a multiple cloning site at its N terminus. Protein expression is inducible by IPTG and the vector is ampicillin resistant.

To clone full length wild type hSTRAP in the pGEX-6P1 vector, hSTRAP cDNA (previously ligated into the pHA1 plasmid) and the vector pGEX-6P1 were both digested with XhoI. The resulting fragments were then ligated and transformed into DH5α. Plasmid DNA was extracted by mini-preps from the seven colonies obtained through this procedure and analyzed, firstly by XhoI and BamHI restriction digest and secondly by sequencing, to verify the desired insertion of hSTRAP DNA. The vector pGEX-6P1 is 5 kbp (Fig.3.5) and the hSTRAP(1-440) insert is approximately 1.3 kbp. Fig.3.6B shows that hSTRAP has been inserted into the vector, as shown by the DNA band found at 1.3 kbp with XhoI restriction digests (Lanes 3-9), corresponding to the hSTRAP sequence. The sequence has also been inserted in the desired orientation, as a DNA band was detected at 1.3 kbp when the DNA mini preps were digested with BamHI (Fig.3.6B, lanes 10-16). If hSTRAP had not been inserted in the desired orientation, a DNA band at approximately 81 bp (53 bp + 18 bp) would have been detected rather than 1270 bp (Fig.3.6A), as the wild type hSTRAP sequence contains a BamHI restriction site at its C terminus (Fig.3.6A). These digestions confirmed that hSTRAP has been cloned into the vector in the desired orientation.

Figure 3.6. Cloning of the hSTRAP wild type in pGEX-6P1. A: Positions of the BamHI and XhoI cutting sites on the full length hSTRAP cDNA sequence as well as the pGEX-6P1 vector sequence.
B: A 1% (w/v) agarose gel, stained with ethidium bromide, showing XhoI (lanes 3-9) and BamHI (lanes 10-16) restriction digestion reactions of the seven clones obtained from the ligation of the vector pGEX-6P1 with full length hSTRAP. The 5 kbp band corresponds to the linearised pGEX-6P1 vector and the ~1.3 kbp band to the full length hSTRAP cDNA (indicated by an arrow). The presence of the ~1.3 kbp fragment in the BamHI digestion reactions (lanes 10-16) indicates that the full length hSTRAP cDNA has been incorporated in the construct in the desired orientation.

Clone 2 was chosen at random for further investigation, and the junctions and the full hSTRAP sequence were sequence-verified to ensure no mutation had been incorporated. The sequencing data confirmed the latter (See Appendix), and thus the plasmid pGEX-6P1-hSTRAP(1-440) coding for GST-tagged hSTRAP(1-440) was obtained. Trials to express and purify the protein were then initiated. Plasmid DNA of clone 2, pGEX-6P1-GST-hSTRAP(1-440), was transformed into the expression cell line BL21(DE3)pLysS as a test, although this cell line is normally used for vectors containing the T7 promoter. In this construct hSTRAP has been fused to GST, which should yield a protein of approximately 76 kDa, as the GST tag is 26 kDa and hSTRAP is approximately 50 kDa. Expression trials defined the optimum conditions for soluble GST-hSTRAP(1-440) expression as BL21(DE3)pLysS cells induced with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs incubation at 25ºC in LB media. Samples representative of soluble, insoluble and total expression at these optimum conditions were analyzed by SDS PAGE. The gels show that GST-hSTRAP(1-440) is expressed in both the soluble and insoluble fractions (Fig.3.7).
GST-hSTRAP also migrates at 78 kDa rather than the predicted 76 kDa (Fig.3.7). This is slightly higher than expected, but it is not uncommon for a protein to migrate on a gel above its calculated molecular weight. Sequencing and mass spectrometry data confirmed this protein as GST-hSTRAP(1-440).

Figure 3.7. Expression of GST-hSTRAP(1-440). A: 7.5% SDS PAGE gel showing insoluble (lanes 2-5) and soluble fractions (lanes 6-9) of BL21(DE3)pLysS cells transformed with pGEX-6P1-GST-hSTRAP(1-440) after 1 hour (lanes 3 and 7), 2 hours (lanes 4 and 8), and 3 hours (lanes 5 and 9) after induction with IPTG. B: 12% SDS PAGE gel showing total BL21(DE3)pLysS pGEX-6P1-GST-hSTRAP(1-440) transformed cell lysate 1 hour (lane 3), 2 hours (lane 4) and 3 hours (lane 5) post IPTG induction. Lane 1 in (A) and (B) represents protein markers and lanes 2 and 6 pre-induction fractions respectively. Both gels were loaded with 15µl of sample and stained with instant blue. In this construct hSTRAP has been fused to GST to give an N terminal GST tagged protein. Migration of this fusion protein is as indicated.

Large scale expression and purification of GST-hSTRAP(1-440) was then carried out. The initial purification was performed at pH 7.4, as purification of His-hSTRAP(1-440) was successful at this pH (Fig.3.3C). Furthermore, cells were lysed using the cell disrupter, as this was identified as the best method of lysis for purification of His-hSTRAP(1-440) (Fig.3.3C). Purification of the GST fused hSTRAP protein (GST-hSTRAP(1-440)) at pH 7.4 yielded many contaminants bound to the GST affinity resin (Fig.3.8A, Lane 3), and the purity of GST-hSTRAP(1-440) in elutions was compromised as a result (Fig.3.8A, Lanes 5-8). SDS-PAGE also indicated that GST-hSTRAP(1-440) remains bound to the resin even after thorough washing with elution buffer (Fig.3.8A, Lane 4), although it may be possible to improve this through optimisation of the purification protocol.
However, the presence of protein contaminants could be due to insufficient washing, leaving material nonspecifically bound to the GST tag affinity resin (Fig.3.8A, Lane 3). Hence, the purification at pH 7.4 was repeated and the wash buffer flow-through was analyzed by SDS PAGE to determine if the resin was thoroughly washed. Figure 3.8B shows that GST-hSTRAP(1-440) was purified, but contaminants were again detected and remained bound to the resin after washing with five column bed volumes (Lane 10). Contaminants were still detected in samples representative of the last wash flow-through (Lane 9), suggesting that the resin may still not have been washed enough. The resin was washed with a further 20 mls of wash buffer at the same pH; however, contaminants were still detected bound to the resin (Fig.3.8C, Lane 8). The pH of the buffer was then increased to pH 8 and samples of resin before washing, wash flow-through and resin after washing were analyzed by SDS PAGE (Fig.3.8D). This washing condition did not improve the purity of GST-hSTRAP(1-440) protein bound to resin (Fig.3.8D, Lane 8). However, raising the pH to 8.7 yielded pure GST-hSTRAP(1-440) protein bound to resin (Fig.3.8E, Lane 8). SDS-PAGE indicated that pure GST-hSTRAP(1-440) protein can be obtained in elutions at pH 8.7 (Fig.3.8F, Lanes 2-4), although the total yield estimated from 1 litre of E. coli culture was less than 0.1 mg. These experiments showed that GST-hSTRAP(1-440) protein purification is optimal at pH 8.7; however, the total volume of growth culture would require significant scaling up to obtain the large quantity of protein necessary for structural studies. In-gel digestion mass spectrometry of the suspected GST-hSTRAP(1-440) band (Fig.3.8, indicated by an arrow) confirmed this band as GST-hSTRAP(1-440), with coverage indicative of full length hSTRAP (Fig.3.9), so further experiments can be carried out with this protein construct.
Figure 3.8. Purification of GST-hSTRAP(1-440) protein. A: 7.5% SDS PAGE gel showing GST-hSTRAP(1-440) purification from BL21(DE3)pLysS at pH 7.4. The 12% SDS PAGE gels show B: GST-hSTRAP(1-440) purification from BL21(DE3)pLysS at pH 7.4 but with further washing of resin. C: As B but with resin washed further. D: GST-hSTRAP(1-440) purification from BL21(DE3)pLysS at pH 8.0. E: GST-hSTRAP(1-440) purification from BL21(DE3)pLysS at pH 8.7. F: Pure GST-hSTRAP(1-440) elutions obtained from the purification in E. All gels were loaded with 15µl of sample and stained with instant blue. Migration of GST-hSTRAP(1-440) is shown.

1   MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV
71  LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA AAHTCFSGAL THCRNKVSLQ
141 NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP KISQQALSAY
211 AQAEKVDRKA SSNPDLHLNR ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES
281 KGKVKTKKLQ SMLGSLRPAH LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK
351 VPFTFGLVDS DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE
Figure 3.9. GST-hSTRAP(1-440) mass spectrometry. A: Mass spectrometry performed on the 76 kDa hSTRAP protein obtained from the gel shown in Fig.3.8E, lane 8. Amino acids highlighted in red indicate peptide coverage of the thirteen unique peptides obtained from the in-gel digestion mass spectrometry of this 78 kDa band.

3.1.3 Cloning, expression and purification of truncated variants of hSTRAP
3.1.3.1 Design and sequence analysis of truncated constructs of hSTRAP
It was decided to clone truncated variants of hSTRAP, as this would provide another route to solving the structure of hSTRAP and would also help to determine the region of hSTRAP implicated in ligand interaction.
The first step was to determine the polypeptide boundaries of these truncated forms of hSTRAP. As structural data were not available at the time, the boundaries could not be decided from structure and secondary structure predictions had to be considered instead. Polypeptide boundaries had to be chosen so as not to perturb any predicted structured regions and to include the correct combinations of predicted TPR motifs. It was first decided to create five truncated hSTRAP constructs containing the first three, last three, first two, middle two and last two predicted TPR motifs. The amino acid sequence of full-length hSTRAP was then submitted to two separate secondary structure prediction programs, JPred3 and Scratch protein predictor, to check for consistency of predictions using different algorithms. Data from these two programs were analyzed and construct boundaries were chosen taking into account the predicted positions of the TPR motifs and secondary structure elements (Fig.3.10), in order to minimize the structural perturbation of the truncation constructs. According to both prediction programs, the hSTRAP amino acid sequence is around 70% α-helical. This would correlate with the presence of six predicted TPR motifs within hSTRAP, as TPR motifs are anti-parallel alpha helical structures [32]. The five possible truncated hSTRAP constructs to be cloned were as follows: hSTRAP(1-219), hSTRAP(220-440), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440). The sequences of these five proposed polypeptides were then analyzed in another program, which predicts the pI of each construct. These data are shown in Table 3.3A. The pI is an important factor to consider, as ideally the pI of these proteins should not be between 6 and 7, as this range of pH is typically used for NMR studies.
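The boundary bookkeeping and the pI criterion described above can be expressed as a small check. This is only an illustrative sketch: the residue ranges are those listed in the text and the pI values are copied from Table 3.3A rather than computed, and the helper names are hypothetical.

```python
# Residue ranges from the text; theoretical pI values copied from Table 3.3A.
CONSTRUCTS = {
    "hSTRAP(1-219)": (1, 219, 5.19),
    "hSTRAP(220-440)": (220, 440, 9.30),
    "hSTRAP(1-150)": (1, 150, 4.87),
    "hSTRAP(151-284)": (151, 284, 5.45),
    "hSTRAP(285-440)": (285, 440, 9.84),
}


def construct_length(start, end):
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def pi_outside_window(pi, low=6.0, high=7.0):
    """True when the pI lies outside the pH 6 to 7 window to be avoided for NMR."""
    return not (low <= pi <= high)


def tiles_full_length(ranges, full_length=440):
    """True when consecutive ranges cover residues 1..full_length without gaps or overlaps."""
    expected_start = 1
    for start, end in sorted(ranges):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == full_length + 1


if __name__ == "__main__":
    for name, (start, end, pi) in CONSTRUCTS.items():
        print(name, construct_length(start, end), pi_outside_window(pi))
```

For example, the two-construct split (1-219, 220-440) and the three-construct split (1-150, 151-284, 285-440) each tile the full 440 residues.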
The pI values of the five protein constructs mentioned above were outside the range 6 to 7 (Table 3.3A), and the chosen domain boundaries were not expected to perturb any structured regions according to secondary structure predictions (Fig.3.10). It was therefore decided to clone these five constructs into pET14-b. A summary of the properties of these five chosen constructs, including construct length, TPR motifs included, molecular mass and pI of each construct, is shown in Table 3.3A. The amino acid sequences of these polypeptides are shown in Table 3.3B. At the time when these constructs were designed, structural data for hSTRAP were not available; however, the structure of the C terminal part of hSTRAP, residues 262-422, has since been solved by X-ray crystallography by the Structural Genomics Consortium [http://www.thesgc.org/structures/details?pdbid=2XVS]. These structural data only became available after the five constructs had been cloned and expressed. The experimental secondary structure of this region, as given by the PDB file of the 2XVS protein structure, is added to Figure 3.10 in order to compare the secondary structure predictions with the actual secondary structure composition of this solved region. It also allows verification of whether either of the C terminal constructs, hSTRAP(151-284) and hSTRAP(285-440), in reality perturbs any structured region. The 284th and 285th amino acids were found to be situated between two helical elements (Fig.3.10), and hence this does not seem to be the case. The solved region, hSTRAP(262-422), is 27% helical, 9% turn, 8% bend, 36% extended chain (β strand) and 20% the rest (chain).

Figure 3.10. Secondary structure predictions of full length hSTRAP. Secondary structure predictions of full length hSTRAP obtained through the Jpred3 and Scratch protein predictor programs (Materials and Methods, Section 2.3.2.1). Finalized polypeptide boundaries were derived using this latter information, and positions of predicted TPR motifs.
The six predicted TPR motifs are highlighted in different colors, each representing a different TPR motif. The reliability of the Jpred3 prediction (Jpred3 Reliability) varies from 0 to 9, where 9 is the highest prediction accuracy. Jpred3 reliability greater than 4 is highlighted in green. The structure of the peptide 262-422 (highlighted in grey) has been solved [http://www.thesgc.org/structures/details?pdbid=2XVS], and its secondary structure composition (from PDB file) is provided here for reference. Abbreviations: H, Alpha Helix; G, 3-10-helix; S, Bend; C, The rest; I, pi-helix; B, Beta Bridge; E, Extended chain (Beta strand); T, Turn; SSP, Secondary Structure Prediction; Rel, Reliability Prediction Accuracy; Red, TPR 1; Bright Green, TPR 2; Purple, TPR 3; Dark Green, TPR 4; Bright TPR 5 and Yellow TPR 6.

Table 3.3. hSTRAP truncated forms cloned in pET14-b.
(A)
Construct            Length of Construct (amino acids)   Predicted TPR Motifs included   Molecular weight (kDa)   Theoretical pI
1-hSTRAP(1-219)      219                                 1-3                             24.9                     5.19
2-hSTRAP(220-440)    221                                 4-6                             24.0                     9.30
3-hSTRAP(1-150)      150                                 1-2                             17.7                     4.87
4-hSTRAP(151-284)    134                                 3-4                             15.2                     5.45
5-hSTRAP(285-440)    156                                 5-6                             16.6                     9.84
(B)
Construct: Amino Acid Sequence
1-hSTRAP(1-219):
MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG
RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH
VMDSVRQAKL AVQMDVHDGR SWYILGNSYL SLYFSTGQNP
KISQQALSAY AQAEKVDRK
2-hSTRAP(220-440):
ASSNPDLHLN RATLHKYEES YGEALEGFSR AAALDPAWPE
PRQREQQLLE FLDRLTSLLE SKGKVKTKKL QSMLGSLRPA
HLGPCSDGHY QSASGQKVTL ELKPLSTLQP GVNSGAVILG
KVVFSLTTEE KVPFTFGLVD SDGPCYAVMV YNIVQSWGVL
IGDSVAIPEP NLRLHRIQHK GKDYSFSSVR VETPLLLVVN
GKPQGSSSQA VATVASRPQC E
3-hSTRAP(1-150):
MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG
RKQQDVQKEM EKTLQQMEEV VGSVQGKAQV LMLTGKALNV
TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR
4-hSTRAP(151-284):
TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR SWYILGNSYL
SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR
ATLHKYEESY GEALEGFSRA AALDPAWPEP RQREQQLLEF
LDRLTSLLES KGKV
5-hSTRAP(285-440):
KTKKLQSMLG SLRPAHLGPC SDGHYQSASG QKVTLELKPL
STLQPGVNSG AVILGKVVFS LTTEEKVPFT FGLVDSDGPC
YAVMVYNIVQ SWGVLIGDSV AIPEPNLRLH RIQHKGKDYS
FSSVRVETPL LLVVNGKPQG SSSQAVATVA SRPQCE
A: Construct names, TPR motifs included, molecular weight (kDa) and calculated pI of the hSTRAP constructs used in this study. B: Amino acid sequence of each hSTRAP construct used in this study.

3.1.3.2. Cloning of Truncated versions of hSTRAP
Once polypeptide boundaries were chosen, the next step was to proceed to the cloning of the five constructs into pET-14b. PCR was carried out as described in the Materials and Methods Section 2.3.2.2 using the primers shown in Table 2.3. In order to check whether the PCR and the digestion reactions with BamHI and NdeI were successful, samples of the digested PCR reactions were loaded on an agarose gel (Fig.3.11A). The correct sized DNA bands for all five constructs were observed: 670 bp hSTRAP(1-219), 684 bp hSTRAP(220-440), 462 bp hSTRAP(1-150), 414 bp hSTRAP(151-284), and 486 bp hSTRAP(285-440). Figure 3.11B shows a sample of digested and linearised 4.7 kbp pET14-b vector DNA. Both gels confirm that PCR and digestion were successful, so the next steps in the cloning procedure could be followed.

Figure 3.11. PCR products of the hSTRAP fragments cloned in the pET14-b vector. PCR products and pET14-b vector were digested with BamHI and NdeI restriction enzymes and 15 µl of each sample was subsequently loaded on a 1% (w/v) agarose gel. The 1% (w/v) agarose gels show A: Digestion reactions of the five truncated hSTRAP PCR products, which were subsequently cloned in the pET14-b vector. B: BamHI and NdeI restriction digestion of the pET14-b vector (between 5-4 kbp).

3.1.3.3. Expression and Purification of truncated versions of hSTRAP
Plasmids coding for the different truncated hSTRAP constructs were sequence verified, and this confirmed that all five truncated versions of hSTRAP had been successfully cloned into pET14-b (See Appendix). Systematic trials were then carried out on all protein constructs to determine the optimum soluble expression conditions for each hSTRAP protein variant.

3.1.3.3.1 Expression and purification of hSTRAP(1-219)
Expression trials involved testing expression in transformed E. coli at varying incubation temperatures for cell growth in the range of 37 to 16°C, and at varying IPTG concentrations. These trials identified the optimum conditions for hSTRAP(1-219) protein expression as BL21(DE3)pLysS cells, induced with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hours incubation at 25°C in LB media. Samples representative of soluble, insoluble and total expression were loaded on 15% SDS PAGE gels (Fig.3.12). These results showed that hSTRAP(1-219) is over-expressed at these optimised conditions, as shown by levels of total expression (Fig.3.12B, Lanes 2-5). The protein hSTRAP(1-219) is running at approximately 28 kDa, which is higher than the approximately 26 kDa expected from its calculated mass (24.9 kDa, Table 3.3) plus the approximately 1 kDa added by the His tag coding sequence at the start of the fusion protein. Sequencing data (See Appendix one) and mass spectrometry (Fig.3.14) have confirmed this to be hSTRAP(1-219). In comparison to total expression, a low quantity of protein was detected in samples representative of the soluble fraction at 2 and 3 hrs after induction (Fig.3.12A, Lanes 4-5). It was noted that hSTRAP(1-219) precipitated in these soluble fraction samples with time in all conditions tested. This suggested that the method of lysis and sample analysis requires optimisation so that the protein does not precipitate in the buffer it is present in.
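The gap between calculated and apparent molecular weight reported for the tagged proteins can be put on a common scale with a short calculation. The sketch below uses only masses quoted in the text (26 kDa GST tag plus approximately 50 kDa hSTRAP migrating at 78 kDa; 24.9 kDa hSTRAP(1-219) plus an approximately 1 kDa His tag migrating at about 28 kDa); the function names are illustrative and the percentages are a simple relative difference, not a calibrated gel analysis.

```python
def expected_mass_kda(protein_kda, tag_kda):
    """Calculated mass of a tagged fusion protein: protein plus tag."""
    return round(protein_kda + tag_kda, 1)


def migration_shift_percent(apparent_kda, expected_kda):
    """Relative difference between apparent (gel) and calculated mass, in percent."""
    return round(100.0 * (apparent_kda - expected_kda) / expected_kda, 1)


if __name__ == "__main__":
    # (construct, protein mass, tag mass, apparent mass on SDS PAGE), all in kDa.
    observations = [
        ("GST-hSTRAP(1-440)", 50.0, 26.0, 78.0),
        ("His-hSTRAP(1-219)", 24.9, 1.0, 28.0),
    ]
    for name, protein, tag, apparent in observations:
        expected = expected_mass_kda(protein, tag)
        shift = migration_shift_percent(apparent, expected)
        print(f"{name}: expected {expected} kDa, apparent {apparent} kDa, shift {shift}%")
```

On these figures both constructs run slightly above their calculated mass, with the larger relative shift for the shorter construct.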
Initial purifications of hSTRAP(1-219) at pH 7.4 did not yield pure hSTRAP protein, either in elutions from the His-tag affinity resin (Fig.3.13A, Lanes 5-7) or bound to the resin (Fig.3.13A, Lane 3). This was the starting pH of all purifications, as purification of His-hSTRAP(1-440) was successful at this pH (Fig.3.3C). Because purification of hSTRAP(1-219) was unsuccessful at this pH, the pH of the buffers was changed to investigate the effects of pH on the purification of hSTRAP(1-219). This was done because it had been shown for the full-length hSTRAP constructs that optimisation of the pH could be an important factor in obtaining pure hSTRAP protein (Fig.3.3 and 3.8). The pH of the loading and washing buffers was changed from pH 7.4 to pH 8.7, as the latter pH was found to be optimal for obtaining pure GST-hSTRAP(1-440) protein (Fig.3.8). The purification carried out at pH 8.7 did yield pure hSTRAP(1-219) protein bound to resin (Fig.3.13B, Lane 8). This provided further evidence that pH affects the purity of protein bound to resin, and this information will be useful when carrying out future protein purifications. Once pure hSTRAP(1-219) protein was bound to the resin, the elution buffer had to be optimised further, as degradation products were visible on the gel in the elutions obtained at pH 8.7 (Fig.3.13C, Lanes 2-7). This optimisation was achieved by the addition of H-MIX to the elution buffer. Higher purity hSTRAP(1-219) protein was obtained in the elutions with H-MIX (Fig.3.13D, Lanes 2-7) compared to elutions in the absence of H-MIX (Fig.3.13C, Lanes 2-7). This indicated that H-MIX does improve protein stability against proteolytic degradation and aggregation, as initially hypothesized. The quantity of hSTRAP(1-219) protein obtained from 1 L of E. coli culture was estimated at just over 5 mg (Table 3.4).
Pull downs can be carried out with this protein, as pure hSTRAP(1-219) can be obtained bound to the His tag affinity resin (Fig.3.13B). The 28 kDa protein band on the gel that was expected to be hSTRAP(1-219) (indicated by an arrow in the elutions in Figure 3.13D) was characterised by in-gel digestion mass spectrometry to confirm its identity. Mass spectrometry confirmed that this was indeed a fragment of hSTRAP, with coverage indicative of hSTRAP(1-219) (Fig.3.14), so further experiments can be carried out with this protein.

Figure 3.12. Expression of hSTRAP(1-219). 15% SDS PAGE gels showing A: Soluble (lanes 2-6) and insoluble fractions (lanes 7-11) of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(1-219) after 1 hour (lanes 3 and 8), 2 hours (lanes 4 and 9), 3 hours (lanes 5 and 10) and 4 hours (lanes 6 and 11) after induction with IPTG. B: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(1-219) transformed cell lysate 1 hour (lane 3), 2 hours (lane 4) and 3 hours (lane 5) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 7 are pre-induction fractions respectively. Both gels were loaded with 15µl of sample and stained with instant blue. hSTRAP(1-219) is migrating at approximately 28 kDa (indicated by an arrow). Optimum conditions of soluble hSTRAP(1-219) expression were identified in this cell line as induction with 0.1 mM IPTG at OD600 of 0.5, followed by 3 hrs incubation at 25°C in LB media. A: hSTRAP(1-219) TPR 1-3 (indicated by an arrow) total expression over time at these optimum conditions. B: hSTRAP(1-219) TPR 1-3 (indicated by an arrow) expression in soluble and insoluble fractions over time.

Figure 3.13. Purification of hSTRAP(1-219) protein. 15% SDS PAGE gels showing A: hSTRAP(1-219) purification from BL21(DE3)pLysS, at pH 7.4. B: hSTRAP(1-219) purification from BL21(DE3)pLysS, at pH 8.7. C: hSTRAP(1-219) elutions obtained from purification in B.
D: Elutions obtained from the purification protocol performed as in B with the addition of H-MIX in the elution buffer. All gels were loaded with 15 µl of sample and stained with Instant Blue. hSTRAP(1-219) is migrating at 28 kDa (indicated by an arrow).

Table 3.4. Estimation of hSTRAP(1-219) protein concentration.

Elution   OD595   Concentration (mg/ml)   Elution volume (ml)
1         0.248   0.744                   1
2         0.436   1.308                   1
3         0.445   1.335                   1
4         0.225   0.675                   1
5         0.193   0.579                   1
6         0.171   0.513                   1

The concentration of hSTRAP(1-219) protein measured by Bradford assay in the elutions obtained from the purification method described in Figure 3.13D and determined as described in Materials and Methods.

[Sequence panel: hSTRAP amino acid sequence, residues 1-440, with detected peptides highlighted in red.]
Figure 3.14. hSTRAP(1-219) mass spectrometry. Mass spectrometry performed on the 26 kDa hSTRAP protein obtained from the gel shown in Fig.3.13D, lane 4. Amino acids highlighted in red indicate peptide coverage of the eight unique peptides obtained from the in-gel digestion mass spectrometry of this 28 kDa band.

3.1.3.3.2 Expression and purification of hSTRAP(220-440)

Expression trials involved testing different cell lines, temperatures and IPTG concentrations. The cell lines tested were Shuffle T7, Shuffle T7pLysY, BL21(DE3) Single cells and BL21(DE3)pLysS, of which the last was found to be optimal for expression of this protein in the soluble form. Temperatures tested were in the range 16 to 37°C and IPTG concentrations tested were 0.1, 0.2, 0.5 and 1 mM.
These trials identified optimum conditions for protein expression in BL21(DE3)pLysS as induction with 0.2 mM IPTG at an OD600 of 0.5, followed by 3 hrs incubation at 25°C in LB media. However, the expression level of this protein was very low, as shown by samples representative of total expression (Fig.3.15, Lanes 2-5), though it appears to be mostly soluble (Fig.3.15, Lanes 6-8). There also seems to be leaky hSTRAP(220-440) expression, which should be suppressed in this cell line (Table 2.2). In the other cell lines tested, however, expression of hSTRAP(220-440) was not detectable by SDS-PAGE. So, despite the generally low level of protein expression, it was decided to proceed to large-scale protein purification as described in Materials and Methods section 2.9.1. The protein also seems to be migrating at the correct molecular weight (Table 3.3), taking into account the His tag coding sequence (an addition of approximately 1 kDa). Large-scale purification of hSTRAP(220-440) was carried out (Fig.3.16), and initial purifications were done at pH 7.4, as purification of His-hSTRAP(1-440) was successful at this pH (Fig.3.4C). However, hSTRAP(220-440) protein bound to the His-tag affinity resin was not pure at this pH (Fig.3.16A, Lane 10), even after substantial washing of the resin. The pH was then increased to 8.2 (Fig.3.16B), as it had been shown previously that pH was a critical factor in obtaining pure hSTRAP protein bound to resin. The pH could not be increased any further, as the pI of this protein construct is 9.3, and so pH 8.2 was the maximum threshold. At pH 8.2, after substantial washing of the resin with wash buffer, pure hSTRAP(220-440) protein was obtained bound to the resin (Fig.3.16B, Lane 12). However, resin samples taken after gel analysis, 4 hours after cell lysis, showed that the protein was unstable and degraded almost immediately on the resin at 4°C (Fig.3.16C). H-MIX was then added to all purification buffers, as the protein seemed generally very unstable.
Even though H-MIX is hypothesized to act as a stabilising agent and has significantly improved the stability of other hSTRAP proteins (Fig.3.3D and 3.13D), the hSTRAP(220-440) obtained from this purification was not of high purity and still appeared unstable (Fig.3.16D). Furthermore, the yield of protein from 1 L of E. coli growth culture was estimated at less than 0.5 mg. The problems encountered, combined with the crystal structure of the C terminus of hSTRAP (residues 262-422) [2XVS], which appeared in the PDB at the time, led to the decision that the construct hSTRAP(220-440) would not be used for future structural studies, and priority was given to the other constructs. As this protein was unstable even when bound to the resin (Fig.3.16C), it was impractical to use it for biochemical studies either. Protein identity was not confirmed through in-gel digestion mass spectrometry, as this protein will not be used for any further experiments in this investigation.

Figure 3.15. Expression of hSTRAP(220-440) in BL21(DE3)pLysS. 12% SDS PAGE gel showing total (lanes 2-5), soluble (lanes 6-8) and insoluble fractions (lanes 9-11) of BL21(DE3)pLysS transformed with pET-14b-His-hSTRAP(220-440) at 1 hour (lanes 3, 6 and 9), 2 hours (lanes 4, 7 and 10) and 3 hours (lanes 5, 8 and 11) post IPTG induction. Lanes 1 and 2 represent protein markers and pre-induction fractions respectively. The gel was loaded with 15 µl of sample in each lane and stained with Instant Blue. Leaky hSTRAP(220-440) expression seems to be observed (lane 2), and hSTRAP(220-440) migrates at 25 kDa as indicated.

(A) (B) (C) (D)
Figure 3.16. Purification of hSTRAP(220-440). 15% SDS PAGE gels showing A: hSTRAP(220-440) purification from BL21(DE3)pLysS, at pH 7.4. B: hSTRAP(220-440) purification from BL21(DE3)pLysS, at pH 8.2. C: Purified hSTRAP(220-440) truncated protein loaded on resin 4 hours after cell lysis. D: Protein purification carried out at pH 8.2 with buffers supplemented with H-MIX.
The protein expression process is described in Materials and Methods. All gels were loaded with 15 µl of sample in each lane and stained with Instant Blue. Expected migration of hSTRAP(220-440) is 25 kDa as indicated.

3.1.3.3.3 Expression and purification of hSTRAP(1-150)

Expression trials identified optimum conditions for soluble hSTRAP(1-150) protein expression in BL21(DE3)pLysS cells as induction with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs incubation at 25°C in LB media. Soluble and insoluble fractions from lysed cells, as well as total expression, were analyzed by SDS-PAGE. The hSTRAP(1-150) protein was over-expressed (Fig.3.17B, Lanes 2-6) and was mainly found in the soluble fraction at these optimum conditions (Fig.3.17A, Lanes 7-11). hSTRAP(1-150) is migrating at the correct molecular weight of approximately 19 kDa: 17.7 kDa for the hSTRAP(1-150) sequence (Table 3.3) plus approximately 1 kDa for the His tag coding sequence. Large-scale expression and purification was carried out (Fig.3.18). The first purification was done at pH 7.4 (Fig.3.18A) and gave a high yield of hSTRAP(1-150), but protein purity was less than 90%, so the purification protocol had to be optimised further. For that purpose the pH was increased to pH 8.7, and pure hSTRAP(1-150) protein was obtained in elutions under these conditions (Fig.3.18B, Lanes 2-7). Table 3.5 shows the estimated concentration of hSTRAP(1-150) protein found in elutions following this optimised purification protocol (Fig.3.18B, Lanes 2-7). A high concentration of protein was obtained, especially in elutions 2 and 3, where over 30 mg/ml of protein was obtained (Table 3.5). The total estimated protein yield from one litre of E. coli cell culture was very high, approximately 210 mg (Table 3.5). The identity of hSTRAP(1-150) (indicated by an arrow in Fig.3.18) was confirmed by in-gel digestion mass spectrometry (Fig.3.19).
Figure 3.19 shows that the band found in elutions (Fig.3.18) was indeed a truncated version of hSTRAP, with peptide coverage indicative of hSTRAP(1-150).

(A) (B)
Figure 3.17. Expression of hSTRAP(1-150). 15% SDS PAGE gels showing A: Soluble (lanes 2-6) and insoluble fractions (lanes 7-11) of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(1-150) at 1 hour (lanes 3 and 8), 2 hours (lanes 4 and 9), 3 hours (lanes 5 and 10) and 4 hours (lanes 6 and 11) after induction with IPTG. B: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(1-150) transformed cell lysate at 1 hour (lane 3), 2 hours (lane 4), 3 hours (lane 5) and 4 hours (lane 6) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 7 are pre-induction fractions. Both gels were loaded with 15 µl of sample and stained with Instant Blue. Migration of hSTRAP(1-150) is indicated by an arrow.

(A) (B)
Figure 3.18. Purification of hSTRAP(1-150) protein. 15% SDS PAGE gels showing A: hSTRAP(1-150) purification at pH 7.4 from BL21(DE3)pLysS cells. B: hSTRAP(1-150) purification from BL21(DE3)pLysS at pH 8.7. Fifteen microlitres of sample was loaded on both gels and all samples were diluted 1:40. Both gels were stained with Instant Blue. Migration of hSTRAP(1-150) is indicated by an arrow.

Table 3.5. Estimation of hSTRAP(1-150) protein concentration.
Elution   OD595   Concentration (mg/ml)   Elution volume (ml)   Total protein quantity (mg)
1         0.650   19.5                    1.5                   29.25
2         1.150   34.5                    1.5                   51.75
3         1.234   37.02                   1.5                   55.53
4         0.734   22.02                   1.5                   33.03
5         0.691   20.73                   1.5                   31.10
6         0.354   10.62                   1.5                   15.93

The concentration of hSTRAP(1-150) protein measured by Bradford assay in the elutions obtained from the purification method described in Figure 3.18B and determined as described in Materials and Methods.

  1 MMADEEEEVK PILQKLQELV DQLYSFRDCY FETHSVEDAG RKQQDVQKEM EKTLQQMEEV
 61 VGSVQGKAQV LMLTGKALNV TPDYSPKAEE LLSKAVKLEP ELVEAWNQLG EVYWKKGDVA
121 AAHTCFSGAL THCRNKVSLQ NLSMVLRQLR TDTEDEHSHH VMDSVRQAKL AVQMDVHDGR
181 SWYILGNSYL SLYFSTGQNP KISQQALSAY AQAEKVDRKA SSNPDLHLNR ATLHKYEESY
241 GEALEGFSRA AALDPAWPEP RQREQQLLEF LDRLTSLLES KGKVKTKKLQ SMLGSLRPAH
301 LGPCSDGHYQ SASGQKVTLE LKPLSTLQPG VNSGAVILGK VVFSLTTEEK VPFTFGLVDS
361 DGPCYAVMVY NIVQSWGVLI GDSVAIPEPN LRLHRIQHKG KDYSFSSVRV ETPLLLVVNG
421 KPQGSSSQAV ATVASRPQCE

Figure 3.19. hSTRAP(1-150) mass spectrometry. Mass spectrometry performed on the 18 kDa hSTRAP protein obtained from the gel shown in Fig.3.18B, lane 2. Amino acids highlighted in red indicate peptide coverage of the five unique peptides obtained from the in-gel digestion mass spectrometry of this 19 kDa band.

3.1.3.3.4 Expression and purification of hSTRAP(151-284)

Optimum conditions for the expression of soluble hSTRAP(151-284) in BL21(DE3)pLysS cells were identified as induction with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs incubation at 25°C (Fig.3.20). Analysis of lysed samples by SDS-PAGE indicated that hSTRAP(151-284) was over-expressed (Fig.3.20B, Lanes 2-6), although mainly insoluble (Fig.3.20A, Lanes 2-6). However, this could be due to the method of lysis and sample analysis, as these samples are from small-scale expression trials. Pellet lysis and sample buffer conditions differ between small-scale and large-scale work, which could ultimately affect the solubility of the protein.
Small-scale lysis is carried out with BugBuster, whereas for large-scale purification samples are lysed through cell disruption or sonication, and the buffers can be optimised. hSTRAP(151-284) is migrating at the correct molecular weight of around 16 kDa, as the hSTRAP(151-284) amino acid sequence accounts for 15.2 kDa (Table 3.3) and the His tag coding sequence for approximately 1 kDa. Large-scale purification of hSTRAP(151-284) was attempted initially at pH 7.4 (Fig.3.21A); however, the purity of the eluted protein was less than 70%, so the purification protocol had to be optimised. Furthermore, hSTRAP(151-284) remained strongly bound to the TALON His-tag affinity resin even after substantial washing of the resin with elution buffer (Fig.3.21A, Lane 4), so hSTRAP was not successfully eluted (Fig.3.21B, Lanes 7-8). This also suggested that the purification protocol had to be optimised. For that reason the pH of the purification buffers was increased to 8.7 (Fig.3.21B), which yielded pure hSTRAP(151-284) protein bound to the resin (Fig.3.21B, Lane 9) and in elutions (Fig.3.21C, Lanes 9-10). The estimated yield of eluted protein was low, as less than 0.2 mg of protein was obtained from 2 litres of E. coli culture. These quantities of hSTRAP(151-284) are insufficient for any structural studies, but the volume of growth culture could be increased to increase the quantity of protein obtained. Pure and stable hSTRAP(151-284) could be readily obtained bound to the resin (Fig.3.21B, Lane 9); hence, this protein construct can be used to determine its interacting partners. Protein identity was confirmed through in-gel digestion mass spectrometry, and this band was identified as a truncated form of hSTRAP with coverage indicative of hSTRAP(151-284) (Fig.3.21D).

(A) (B)
Figure 3.20. Expression of hSTRAP(151-284).
15% SDS PAGE gels showing A: Insoluble (lanes 2-6) and soluble fractions (lanes 7-11) of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(151-284) at 1 hr (lanes 3 and 8), 2 hrs (lanes 4 and 9), 3 hrs (lanes 5 and 10) and 4 hrs (lanes 6 and 11) after induction with IPTG. B: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(151-284) transformed cell lysate at 1 hr (lane 3), 2 hrs (lane 4), 3 hrs (lane 5) and 4 hrs (lane 6) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 7 are pre-induction fractions. Both gels were loaded with 15 µl of sample and stained with Instant Blue. Migration of hSTRAP(151-284) was at approximately 16 kDa as indicated.

(A) (B) (C) (D)
[Panel D: hSTRAP amino acid sequence, residues 1-440, with detected peptides highlighted in red.]
Figure 3.21. Purification of hSTRAP(151-284). 15% SDS PAGE gels showing A: hSTRAP(151-284) purification from BL21(DE3)pLysS, at pH 7.4. B: hSTRAP(151-284) purification from BL21(DE3)pLysS, at pH 8.7. C: hSTRAP(151-284) elutions obtained from purification in B. D: Mass spectrometry performed on the 16 kDa hSTRAP protein obtained from the gel shown in C, lane 10. Amino acids highlighted in red indicate peptide coverage of the unique peptides obtained from the in-gel digestion mass spectrometry of this 16 kDa band. All gels were loaded with 15 µl of sample and stained with Instant Blue.
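The per-construct yields quoted in this section follow from the Bradford concentrations and elution volumes in Tables 3.4 and 3.5. The short Python sketch below re-derives those totals from the tabulated values. It is an illustration only: the function name `total_yield_mg` is hypothetical and is not part of the methods used in this work, and the 1 ml and 1.5 ml volumes are taken from the volume columns of the two tables.

```python
def total_yield_mg(concentrations_mg_per_ml, volumes_ml):
    """Sum concentration x volume over all elutions (result in mg)."""
    if len(concentrations_mg_per_ml) != len(volumes_ml):
        raise ValueError("one volume is needed per elution")
    return sum(c * v for c, v in zip(concentrations_mg_per_ml, volumes_ml))


# Values copied from Table 3.4, hSTRAP(1-219): six elutions of 1 ml each.
table_3_4_conc = [0.744, 1.308, 1.335, 0.675, 0.579, 0.513]
table_3_4_vol = [1.0] * 6

# Values copied from Table 3.5, hSTRAP(1-150): six elutions of 1.5 ml each.
table_3_5_conc = [19.5, 34.5, 37.02, 22.02, 20.73, 10.62]
table_3_5_vol = [1.5] * 6

yield_1_219 = total_yield_mg(table_3_4_conc, table_3_4_vol)
yield_1_150 = total_yield_mg(table_3_5_conc, table_3_5_vol)

print(f"hSTRAP(1-219): {yield_1_219:.2f} mg")
print(f"hSTRAP(1-150): {yield_1_150:.2f} mg")
```

Summed this way, the elutions give about 5.2 mg for hSTRAP(1-219) and about 217 mg for hSTRAP(1-150), in line with the estimates of just over 5 mg and approximately 210 mg quoted in the text.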
3.1.3.3.5 Expression and purification of hSTRAP(285-440)

Expression trials identified optimum conditions for hSTRAP(285-440) protein expression in BL21(DE3)pLysS cells as induction with 0.1 mM IPTG at an OD600 of 0.5, followed by 4 hrs incubation at 25°C in LB media (Fig.3.22). SDS-PAGE gels analysing samples representative of the soluble and insoluble protein fractions of lysed cells showed that, 4 hrs after induction, hSTRAP(285-440) was expressed in the soluble form (Fig.3.22A, Lane 11). hSTRAP(285-440) was migrating at the correct molecular weight of approximately 18 kDa, taking into account 16.6 kDa corresponding to the hSTRAP(285-440) amino acid sequence (Table 3.3) and approximately 1 kDa corresponding to the His tag coding sequence (Fig.3.1). Large-scale purification of this protein was carried out initially at pH 7.4. The protein was mainly found in the insoluble fraction (Fig.3.23A, Lane 3) and many contaminants were bound to the resin (Fig.3.23A, Lane 10). This suggested that the purification protocol had to be optimised further before proceeding to elution of the protein from the column. The pH of the purification buffers was increased to 8.7, which improved the purity of protein bound to the resin by approximately 60% (Fig.3.23B, Lane 8) relative to the initial purification at pH 7.4 (Fig.3.23A, Lane 10). However, samples of the final wash flow-through of the resin contained contaminants (Fig.3.23B, Lane 7). Further washing of the resin with wash buffer eventually yielded pure hSTRAP(285-440) protein bound to the resin (Fig.3.23C, Lane 7). Pure hSTRAP(285-440) protein was obtained in all elutions at pH 8.7 (Fig.3.23D), and the total yield from 1 L of E. coli culture was estimated at less than 0.2 mg. However, at that point we learned that the structure of the C terminus of hSTRAP, residues 262-422, had been solved by X-ray crystallography [2XVS], and so it was decided to use this construct primarily for biochemical pull-down studies.
This was possible because pure, stable protein bound to the His-tag affinity resin could be readily obtained (Fig.3.23C, Lane 7). However, CD experiments will still be carried out with this protein construct, as these experiments would determine the folding state and thermal stability of hSTRAP(285-440). Protein identity was confirmed through in-gel digestion mass spectrometry of the band found in the elutions shown in Figure 3.23D. This band was identified as hSTRAP and the peptide coverage was in the expected range, amino acids 285-440 (Fig.3.23E).

(A) (B)
Figure 3.22. Expression of hSTRAP(285-440). 15% SDS PAGE gels showing A: Insoluble (lanes 2-6) and soluble fractions (lanes 7-11) of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(285-440) at 1 hour (lanes 3 and 8), 2 hours (lanes 4 and 9), 3 hours (lanes 5 and 10) and 4 hours (lanes 6 and 11) post IPTG induction. B: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(285-440) transformed cell lysate at 1 hour (lane 3), 2 hours (lane 4), 3 hours (lane 5) and 4 hours (lane 6) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 7 are pre-induction fractions. Both gels were loaded with 15 µl of sample and stained with Instant Blue. Migration of hSTRAP(285-440) is indicated by an arrow.

(A) (B) (C) (D) (E)
[Panel E: hSTRAP amino acid sequence, residues 1-440, with detected peptides highlighted in red.]
Figure 3.23. Purification of hSTRAP(285-440). 15% SDS PAGE gels showing A: hSTRAP(285-440) purification from BL21(DE3)pLysS, at pH 7.4.
B: hSTRAP(285-440) purification from BL21(DE3)pLysS, at pH 8.7. C: As B, but with the resin washed further. D: Pure hSTRAP(285-440) elutions obtained from the purification in C. E: Mass spectrometry performed on the 17 kDa hSTRAP protein obtained from the gel shown in D. Amino acids highlighted in red indicate peptide coverage of the unique peptides obtained from the in-gel digestion mass spectrometry of this 18 kDa band. All gels were loaded with 15 µl of sample and stained with Instant Blue.

3.2 Identification of hSTRAP interacting partners in MCF7 breast cancer cells

One aim of this project was to identify interacting partners of hSTRAP in breast cancer, and the regions responsible for these interactions, as there is limited published data on this. Previously published data have implicated STRAP in the DNA damage response [136, 144-145, 147-148], the stress response pathway [148, 149], regulation of the glucocorticoid receptor [150] and p53 function [136, 143]. hSTRAP is hypothesized to be implicated in diverse regulatory pathways due to the presence of the six predicted TPR motifs, which are important in mediating protein-protein interactions [136]. These TPR motifs could potentially bridge multiple protein complexes and form extensive protein networks. This needed to be investigated further, and the following sections list the proteins that were identified to interact with hSTRAP in vitro. Biochemical pull down assays were carried out using MCF7 cellular extracts with the hSTRAP protein variants (bait) bound to their respective affinity resin. hSTRAP interacting proteins were then identified through in-gel digestion mass spectrometry of whole biochemical pull down assay samples. This list of potential hSTRAP interacting proteins was then submitted to the DAVID bioinformatics software [see Materials and Methods Section 2.18] for association network analysis and to identify potential functions of hSTRAP.
Mass spectrometry rather than western blotting was used in order to maximize the number of hSTRAP interacting proteins identified from diverse pathways, rather than focusing on a specific pathway. The human breast cancer cell line MCF7 was chosen for this study primarily to investigate the involvement of hSTRAP in breast cancer. It was also chosen because it is routinely used in the lab and was available at the time. The protein constructs used for these interaction studies include full-length hSTRAP with GST and with 6His tags, named GST-hSTRAP(1-440) and His-hSTRAP(1-440) respectively. As mentioned previously, this enables interaction data for full-length hSTRAP with two different tags to be analyzed and compared. This approach provides an internal control for possible experimental artifacts linked to the presence of co-purified E. coli proteins and/or interference of the tags themselves with protein-ligand interactions. Furthermore, truncated versions of hSTRAP were also successfully cloned into pET-14b; these will be used to narrow down the region of hSTRAP-ligand interaction and to further enhance data reliability by providing more statistics. These truncated hSTRAP constructs are hSTRAP(1-219), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440). Biochemical studies will not be carried out with hSTRAP(220-440), as this protein construct is very unstable when bound to the His-tag affinity resin, which makes it impractical for these pull down experiments.

3.2.1 Purification of hSTRAP protein variants

All hSTRAP constructs bound to their respective affinity resin for use in subsequent pull downs were analyzed by SDS PAGE to determine their purity (Fig.3.24), and by in-gel digestion mass spectrometry to reconfirm protein identity (Fig.3.25). SDS PAGE analysis was also necessary to ensure that equal quantities of the different pure hSTRAP protein variants were used in subsequent pull down assays (Fig.3.24).
In addition, resin samples were heavily overloaded on this gel to ensure that no obvious contaminating proteins were present in the samples used as baits (Fig.3.24). Protein bands representative of the GST tag and the hSTRAP variants bound to their respective resin (Fig.3.24, Bands 1-7) were characterized by mass spectrometry to reconfirm their identity. Band 1 (Fig.3.24A) was identified as GST (to be used as a negative control in pull downs) and Bands 2-7 (Fig.3.24B-F) were identified as hSTRAP by in-gel digestion MS, with peptide coverage indicative of residues 1-440, 1-440, 285-440, 1-219, 1-150 and 151-284 respectively.

Figure 3.24. hSTRAP variants used for biochemical binding assays. hSTRAP protein variants were purified, bound to their respective resin for use in subsequent pull downs, and analyzed by 15% SDS PAGE stained with coomassie. Fifty microlitres of resin sample was loaded in each lane. The gels show resin samples of A: GST; B: GST-hSTRAP(1-440); C: His-hSTRAP(1-440) (lane 2) and His-hSTRAP(285-440) (lane 3); D: His-hSTRAP(1-219); E: His-hSTRAP(1-150); F: His-hSTRAP(151-284).

[Sequence panels with detected peptides highlighted in red. (A) Band 1, GST; (B) Band 2, GST-hSTRAP(1-440); (C) Band 3, His-hSTRAP(1-440); (D) Band 4, hSTRAP(285-440); (E) Band 5, hSTRAP(1-219); (F) Band 6, hSTRAP(1-150); (G) Band 7, hSTRAP(151-284).]
Figure 3.25. In-gel digestion mass spectrometry analysis. In-gel digestion mass spectrometry characterisation of the resin fractions of the different protein variants (Figure 3.24) used for subsequent pull down assays. A: GST; B: GST-hSTRAP(1-440); C: His-hSTRAP(1-440); D: His-hSTRAP(285-440); E: His-hSTRAP(1-219); F: His-hSTRAP(1-150); G: His-hSTRAP(151-284). Amino acids highlighted in red are the peptides detected by mass spectrometry.

3.2.2 Pull downs using MCF7 cellular extract

In principle, MS characterization of proteins in mixtures can be implemented through two strategies. In the first, the protein mixture is separated by SDS PAGE before MS analysis, e.g. a specific area of the gel containing the separated samples is digested with trypsin and analysed by MS. In the second, the whole mixture of proteins is trypsin-digested and characterized without prior separation, using the high peptide separating and discriminating power of MS itself. According to the SDS-PAGE gels showing the biochemical pull down assay samples (Fig.3.26), hSTRAP appears to have many interacting partners. Selecting a particular region of the gel would therefore capture only a limited number of interaction partners and potentially miss other ligands of different molecular size.
It was therefore decided to analyse the whole pull-down mixture, to maximize the number of hSTRAP interacting partners identified. Before proceeding to extensive mass spectrometry characterization of whole mixtures, it had to be confirmed that GST protein, which is used as the negative control in pull-down assays, and all hSTRAP protein variants were still bound to their respective affinity resin after the biochemical pull down assays had been carried out. This is important because, if the bait were no longer bound, the data could be misinterpreted, as proteins could be interacting with the resin and not with hSTRAP. SDS PAGE analysis of one set of repeats shows that GST and GST-hSTRAP(1-440) are both bound to the GST affinity resin after the pull downs (Fig.3.26A, Lanes 2 and 3 respectively). hSTRAP(1-219), hSTRAP(151-284), hSTRAP(1-150), His-hSTRAP(1-440) and hSTRAP(285-440) are also all bound to the His-tag affinity resin after the pull downs (Fig.3.26B, Lanes 2 and 3, Fig.3.26C and Fig.3.26D, Lanes 2 and 3 respectively). This was found to be the case for the other two repeats as well (gels not shown). Once this was confirmed, samples were prepared for mass spectrometry characterisation of the pull-down mixtures. To follow the same standard in-gel digestion protocol, samples containing the pull-down mixtures were loaded on the gel and run very briefly, until all the proteins had entered the gel but were not yet separated (Fig.3.27). The bands containing the whole mixtures were then cut out of the gels and subjected to trypsin digestion and MS.

Figure 3.26. hSTRAP biochemical pull down assays with MCF7 cellular extracts. Various hSTRAP protein variants were purified from BL21(DE3)pLysS cells and incubated with MCF7 cellular extracts as described in Materials and Methods. Proteins retained on the resin after extensive washing were extracted and submitted for SDS-PAGE analysis.
A: 7.5% SDS PAGE gel showing GST tag (lane 2) and GST-hSTRAP(1-440) (lane 3) pull down samples. 12% SDS PAGE gels showing B: His tag (lane 2), His-hSTRAP(1-219) and His-hSTRAP(151-284). C: His-hSTRAP(1-150). D: Both His-hSTRAP(1-440) and His-hSTRAP(285-440) pull down samples. Gels were stained with coomassie blue and fifty microlitres of sample was loaded in each lane. Biochemical pull down assays were repeated three times and one representative experiment is shown in the figure.

Figure 3.27. SDS-PAGE bands isolated and submitted to mass spectrometry analysis. Various hSTRAP protein variants as indicated were purified from BL21(DE3)pLysS cells and incubated with MCF7 cellular extracts. Proteins retained on the resin after extensive washes were then submitted for SDS-PAGE analysis. Electrophoresis was run just long enough for the proteins to enter the resolving gel, then the gel was stained with coomassie. The bands indicated within red boxes were isolated from the polyacrylamide gel and submitted for in-gel digestion mass spectrometry analysis as described in Materials and Methods. Each independent biochemical pull down assay was repeated three times for each protein, and the repeat number is highlighted in blue brackets (). A: GST tag (lane 1) and GST-hSTRAP(1-440) (lane 2). B: GST tag (lanes 1 and 3) and GST-hSTRAP(1-440) (lanes 2 and 4). C: His-hSTRAP(1-440) (lanes 1-3) and hSTRAP(285-440) (lanes 4 and 5). D: hSTRAP(285-440) (lane 1), hSTRAP(1-219) (lane 2) and hSTRAP(151-284) (lanes 3 and 4). E: hSTRAP(1-150) (lanes 1 and 2), His tag (lanes 3, 5 and 7), hSTRAP(1-219) (lane 4) and hSTRAP(151-284) (lane 6). F: hSTRAP(1-150) (lane 2) and hSTRAP(1-219) (lane 2). 50 µl of sample was loaded in each lane.

3.2.3 hSTRAP interacting partners

Table 3.6 shows all the proteins identified through mass spectrometry in the pull-down mixtures for the different constructs of hSTRAP, across all three independent pull-down experiments performed.
These proteins were not present in the negative controls, which were the pull-downs performed using GST bound to the resin, and His-tag affinity resin, as baits. The proteins identified in these negative controls were excluded from the analysis, as their presence could be explained by non-specific interaction with the tag and/or resin rather than with hSTRAP. As the proteins shown in Table 3.6 were not found to interact with the negative controls, it is reasonable to assume that these proteins are interacting with hSTRAP. All of these proteins have two or more unique peptides (80+% probability) identified and a Scaffold probability of over 95% (see Appendix for peptide data and Mascot scores). It is evident that hSTRAP interacts with many proteins implicated in diverse regulatory roles, as initially hypothesized (Table 3.6 and Table 3.7). Although each of these interactions still needs to be confirmed, as the methodology implemented may lead to the inclusion of some false positives in the dataset, we decided to proceed with further bioinformatic analysis of this dataset, to check whether the identified potential ligands are connected functionally with each other and hence could be parts of the same pathways.

Table 3.6. hSTRAP interacting partners.

No. | Protein identified | MW (kDa) | UniProt ID
1 | Myosin-9 | 227 | P35579
2 | cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | 82 | B4DGL0
3 | Triosephosphate isomerase | 31 | P60174
4 | Elongation factor Tu 1 (Escherichia coli) | 43 | B1LHD9
5 | L-lactate dehydrogenase | 40 | P00338
6 | 30S ribosomal protein S5 (E. coli) | 18 | C6EGG3
7 | ATP synthase subunit beta, mitochondrial | 57 | P06576
8 | Phosphoglycerate kinase | 45 | P00558
9 | cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | 72 | B7Z4V2
10 | Ubiquitin-like modifier-activating enzyme 1 | 118 | P22314
11 | Peptidyl-prolyl cis-trans isomerase (E. coli) | 21 | A1AGN5
12 | Ribose-phosphate pyrophosphokinase (E. coli) | 37 | A1AAD4
13 | Cell wall structural complex MreBCD, actin-like component MreB (E. coli) | 37 | A1AGE3
14 | Filamin-B | 278 | O75369
15 | Epiplakin 1 | 556 | P58107
16 | Filamin A | 280 | P21333
17 | Eukaryotic initiation factor 4A-I | 46 | P60842
18 | Tu translation elongation factor, mitochondrial precursor | 50 | P49411
19 | Myosin, heavy chain 14 isoform 1 | 228 | Q7Z406-1
20 | Tubulin | 50 | P07437
21 | DNA damage dependent protein kinase C Catalytic subunit | 469 | P78527-1
22 | Actin | 42 | P60709
23 | Elongation factor 1-alpha 1 | 50 | P68104
24 | Fatty acid synthase | 273 | P49327
25 | Alpha enolase | 47 | P06733-1

Detection columns: GST-hSTRAP(1-440), His-hSTRAP(1-440), hSTRAP(1-219), hSTRAP(1-150), hSTRAP(151-284), hSTRAP(285-440).

Detection entries for proteins 1-21, in printed order: + (2); +++ (5-14); +++ (2-6); +++ (2-5); +++ (5-13); ++ (3-4); +++ (6-10); ++ (2-3); + (3); + (3); ++ (2); ++ (2); + (4); +++ (2-3); ++ (2); ++ (2); + (2); + (2); ++ (2-3); + (4); ++ (2); ++ (3); ++ (5); +++ (2-7); ++ (2-3); ++ (2-5); +++ (5-9); ++ (5-14); +++ (2-15); ++ (6); ++ (2-6); +++ (2-5); ++ (4-6); ++ (2-4); + (7); +++ (2-9); ++ (4-7); +++ (3); + (2); ++ (2-3); +++ (2-5); +++ (2-3); +++ (2-15); ++ (2-10); +++ (3-4); ++ (3-5); +++ (6-20); +++ (2-5); +++ (3); + (7).

Detection entries for proteins 22-25, in printed order: ++ (4-5); +++ (5-6); +++ (6-18); +++ (5-8); +++ (3); +++ (2-3); ++ (4-6); +++ (4-6); +++ (5-10); +++ (3-7); +++ (3-8); +++ (4-5); ++ (2-5); +++ (13-25); +++ (2-6).

Proteins identified as hSTRAP interacting partners using in-gel digestion mass spectrometry. The symbols (+), (++) and (+++) indicate that the interacting protein was detected in one, two or three, respectively, of the independent biochemical pull-down assays for each hSTRAP protein variant indicated in the column headings. Each biochemical assay was repeated three times. The number of unique peptides identified in each separate biochemical pull-down assay was variable and is shown in brackets.
The molecular weight (kDa) of each of the 25 hSTRAP interacting proteins identified and its UNIPROT ID number are indicated. Unless stated otherwise these proteins are the human orthologue. All peptide data are shown in Appendix 1, which lists all the peptides detected in each pull-down and their respective Mascot scores.

The UNIPROT IDs of all 20 hSTRAP interacting proteins (human orthologues) were submitted to the DAVID bioinformatics software for analysis. DAVID assigned all 20 proteins to their respective pathways, as shown in Table 3.7. These pathways included regulation of the actin cytoskeleton, translation, DNA damage, the stress response pathway, glycolysis and various other metabolic pathways. This supports our original hypothesis that hSTRAP can potentially be implicated in various diverse regulatory roles.

Table 3.7. Function of hSTRAP interacting proteins.

Protein | DAVID assigned pathway
Triosephosphate isomerase | Glycolysis, Metabolism of carbohydrates, Integration of energy metabolism
Phosphoglycerate kinase | Glycolysis, Metabolism of carbohydrates, Integration of energy metabolism
Alpha enolase | Glycolysis, Metabolism of carbohydrates, Integration of energy metabolism
L-lactate dehydrogenase | Glycolysis/gluconeogenesis, Metabolism of carbohydrates, Hypoxia inducible factor in the Cardiovascular system
Actin | Viral Myocarditis, Tight Junction, Focal adhesions, Chromatin Remodeling by hSWI/SNF ATP-dependent Complexes, cell morphogenesis
Myosin 9 | Viral Myocarditis, Tight Junction, Cytoskeletal regulation by Rho GTPase, microtubule cytoskeleton organisation
Myosin, heavy chain 14 isoform 1 | Viral Myocarditis, Tight junction, Cytoskeletal regulation by Rho GTPase, actin filament based processes
Filamin A | Focal adhesion, cytoskeletal organisation
Filamin B | Focal adhesion, cytoskeletal organisation
Tubulin | Cytoskeletal regulation by Rho GTPase, microtubule cytoskeleton organisation
ATP synthase subunit beta, mitochondrial | Integration of energy metabolism, oxidative phosphorylation
Fatty acid synthase | Integration of energy metabolism
Ubiquitin-like modifier-activating enzyme 1 | Ubiquitin mediated proteolysis
DNA damage dependent protein kinase C Catalytic subunit | DNA damage (Non homologous end joining), cell cycle
Eukaryotic initiation factor 4A-I | Translation
Elongation factor 1-alpha 1 | Translation
Epiplakin 1 | Cytoskeletal organisation
Tu translation elongation factor, mitochondrial precursor | Translation
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | Stress response pathway
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | Stress response pathway

This table lists all the human orthologues of hSTRAP interacting partners and their relative implications in the cell cycle associated pathways as assigned by DAVID bioinformatics.

The UNIPROT IDs of the 20 human hSTRAP interacting proteins shown in Table 3.6 were then submitted to the GeneMania and String 9.0 software (see Materials and Methods Section 2.18) to deduce the interaction status of all these proteins. These data were then used to build an interaction network using Cytoscape (see Materials and Methods Section 2.18), showing the possible implication of hSTRAP in breast cancer (see Fig.3.28). This network suggests a relevance of hSTRAP in breast cancer as well as a possible functional role as a scaffolding protein. The way hSTRAP fits into the existing known interaction network, and interacts with proteins which are functionally related, allows the hypothesis to be formulated that hSTRAP may be involved in cellular migration, glycolysis and various metabolic pathways. This hypothesis should be tested further by more targeted studies of the individual interactions identified in Table 3.6, confirming their presence in vitro and in vivo using independent assays. The latter will be discussed in detail in the Discussion section of this thesis.

Figure 3.28.
hSTRAP implication in cancer related pathways. This network was created in Cytoscape as described in Materials and Methods Section 2.18. The connections in red and blue are interactions proved by experiments performed in this investigation or from existing published data as shown by GeneMania and String respectively. Connections in pink are connections that are proven by experiments according to either the GeneMania or String bioinformatic tools. The connections in grey are those that are predicted or derived from text mining by one or both of the programs GeneMania and String. Gene names are highlighted in pink nodes. Abbreviations: TTC5, hSTRAP; MY09, Myosin 9; ACTN, Actin; FLNA, Filamin A; FLNB, Filamin B; MYH14, Myosin, heavy chain 14 isoform 1; PRKDC, DNA damage dependent protein kinase C Catalytic unit; LDHA, L-lactate dehydrogenase; EIF4A1, Eukaryotic initiation factor 4A-I; EEF1A, Elongation factor 1-alpha 1; TUBB, Tubulin; EFTU, Tu translation elongation factor, mitochondrial precursor; ENO, Alpha enolase; UBA1, Ubiquitin-like modifier-activating enzyme 1; FASYN, Fatty acid synthase; TPI, Triosephosphate isomerase; PGK1, Phosphoglycerate kinase; ATPSYN, ATP synthase subunit beta, mitochondrial; EPIPN1, Epiplakin 1; HSP70, cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial; HSP90AB1, cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta.

3.3 Biophysical and structural studies carried out using full length and truncated versions of hSTRAP

The following section contains all the data obtained from circular dichroism, crystallography and NMR experiments carried out on all hSTRAP protein constructs cloned into the N-terminal His and GST tag vectors, pET14-b and pGEX-6P-1 respectively. High resolution structural studies require a concentrated solution of pure, stable protein as the starting point: specifically, a minimum of 20 µl at over 10 mg/ml for crystallographic trials, and 0.5 ml at 0.5-1 mM for NMR experiments.
However, for circular dichroism experiments a very low concentration of less than 15 µM of pure protein is required. Previous work (Section 3.1) established the protocols for the expression and purification of all hSTRAP protein variants. The following sections include details of buffer optimization trials carried out in order to concentrate hSTRAP protein to high concentrations and improve protein stability. Section 3.1 identified four out of the five truncated hSTRAP constructs with which these biophysical studies were possible, namely hSTRAP(1-219), hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440), as well as both full length constructs, His-hSTRAP(1-440) and GST-hSTRAP(1-440). Protein construct hSTRAP(220-440) was unstable when bound to the His tag affinity resin and in elution (Fig.3.16). Furthermore, during the course of this study the structure of the C terminus of hSTRAP(262-422) [2XVS] was solved by another group, and for this reason this protein was not included in the biophysical studies carried out in this thesis.

3.3.1 Biophysical and structural studies carried out on His-hSTRAP(1-440)

3.3.1.1 Circular Dichroism on His-hSTRAP(1-440)

CD was used to estimate the secondary structure content of the protein constructs, and to identify whether the protein was folded. Furthermore, the temperature dependence of the CD spectra was measured to characterize the thermal stability of His-hSTRAP(1-440). The protein concentration used for this CD experiment was 1 µM. An initial scan was taken at 4°C to determine the folding state and secondary structure composition of His-hSTRAP(1-440). A thermal stability experiment was then carried out, in which readings at 220 nm were recorded every 0.2°C from 4 to 80°C, after which another scan was taken once the temperature had been returned to 4°C to determine whether the protein refolds. The initial scan taken at 4°C (Fig.3.29A) showed that the sample was folded and contained various secondary structure elements.
Mean residue molar ellipticity values for this construct were then fed into the program Dichroweb to determine the percentage of each secondary structure element in this construct. This showed that His-hSTRAP(1-440) is 11% α-helical, 25% β, 31.2% turn and 32.8% disordered. Throughout the whole temperature range tested at 220 nm, His-hSTRAP(1-440) did not show a clear co-operative unfolding transition (Fig.3.29B), and it refolds reversibly once the temperature is reversed (Fig.3.29C). The CD results therefore suggest that the protein is folded at the secondary structure level; however, the absence of a distinct unfolding transition suggests that this protein construct exists in a molten globule state.

Figure 3.29. CD experiments carried out on His-hSTRAP(1-440). A: CD spectrum recorded at 4°C before applying variable temperatures. B: CD signal recorded at variable temperatures ranging from 4 to 80°C at a fixed wavelength of 220 nm. C: CD spectrum recorded at 4°C, after the variable temperature experiment was carried out. D: Superimposition of spectra A (highlighted in blue) and C (highlighted in green). One representative experiment is shown out of three repeats. Abbreviations: MRME, Mean residue molar ellipticity.

3.3.1.2 X-ray Crystallography on His-hSTRAP(1-440)

Once it was confirmed by in-gel digestion mass spectrometry that His-hSTRAP(1-440) was present after the first step of affinity purification, His-hSTRAP(1-440) elutions were pooled together and dialyzed into two buffers. As the process of protein concentration can cause aggregation and losses, the effect of salt concentration on the stability of His-hSTRAP(1-440) within buffers was investigated. Two buffers were tested; both contained 50 mM Sodium Phosphate, Arginine and Glutamic acid and 10 mM β-mercaptoethanol. In addition, buffer 1 contained 50 mM NaCl while buffer 2 contained 150 mM NaCl (n.b. buffer 2 was also the buffer used for gel filtration).
To solve the atomic-resolution structure of any protein, a high concentration of soluble, stable, non-aggregated protein in the order of 10 mg/ml is likely to be required; hence the His-hSTRAP(1-440) protein sample was concentrated down to a small volume. A sample of the concentrate in each buffer was analyzed on a 10% SDS PAGE gel (Fig.3.30). Additionally, the concentration of soluble protein in each buffer condition was estimated using Bradford reagent (Table 3.8). These measurements were also used to estimate the percentage loss of His-hSTRAP(1-440) protein during the buffer exchange and concentration process. In buffer 1, the concentration of soluble His-hSTRAP(1-440) protein obtained was estimated at 10 mg/ml in 20 µl (Table 3.8), corresponding to an 83.4% loss. In buffer 2, the concentration of soluble His-hSTRAP(1-440) protein obtained was estimated at 5 mg/ml in 20 µl (Table 3.8), corresponding to a 91.7% loss. These data suggested that lower salt concentrations are favorable for His-hSTRAP(1-440) protein, although in this buffer the presence of a 32 kDa degradation product was observed by SDS PAGE (Fig.3.30A). In conclusion, the concentration of His-hSTRAP(1-440) did not increase in inverse proportion to the volume, as would be expected in the absence of losses. The protein loss observed could be attributed to aggregation and/or sticking of His-hSTRAP(1-440) protein to the ultrafiltration membrane.

Table 3.8. Estimation of soluble His-hSTRAP(1-440) protein concentration in different buffers

Protein sample | OD595 | Volume measured (µl) | Concentration (mg/ml) | Volume (µl) | Percentage protein loss (%)
Initial hSTRAP sample | 0.10 | 5.0 | 0.3 | 4000 |
Concentrated His-hSTRAP(1-440) in Buffer 1 | 1.00 | 1.5 | 10.0 | 20 | 83.4
Concentrated His-hSTRAP(1-440) in Buffer 2 | 0.50 | 1.5 | 5.0 | 20 | 91.7

The concentration of His-hSTRAP(1-440) protein measured (by Bradford reagent, as described in Materials and Methods) in samples before and after concentration in buffer 1 and buffer 2.
Buffer 1: 50 mM Sodium Phosphate buffer, NaCl, Arginine and Glutamic acid, and 10 mM β-mercaptoethanol. Buffer 2: 50 mM Sodium Phosphate buffer, 150 mM NaCl, 50 mM Arginine and Glutamic acid, and 10 mM β-mercaptoethanol.

Figure 3.30. Concentration of His-hSTRAP(1-440) protein in different buffers. 10% SDS PAGE gel comparing the purity of samples of His-hSTRAP(1-440) protein in buffer 1 (lane 2) and buffer 2 (lane 3). Fifteen microlitres of sample was diluted as indicated in the lane headings and loaded in each lane. Gel was stained with Instant Blue. Buffer 1: 50 mM Sodium Phosphate buffer, NaCl, Arginine and Glutamic acid, and 10 mM β-mercaptoethanol. Buffer 2: 50 mM Sodium Phosphate buffer, 150 mM NaCl, 50 mM Arginine and Glutamic acid and 10 mM β-mercaptoethanol.

Following these first protein stability trials, the long-term stability of both concentrated samples in buffers 1 and 2 was studied. Samples were incubated for four weeks at 4°C and then analyzed by SDS PAGE. Achieving long-term His-hSTRAP(1-440) sample stability is important for X-ray crystallization experiments. The SDS PAGE gel used to determine the long-term stability of hSTRAP, shown in Figure 3.31, indicated that hSTRAP was not stable in either of the trialled buffers, since many degradation products were detected in the concentrated samples in both buffer 1 (Lane 2) and buffer 2 (Lane 3).

Figure 3.31. Long term stability of concentrated His-hSTRAP(1-440) protein. A 10% SDS PAGE gel indicating the long-term stability of His-hSTRAP(1-440) protein (highlighted by an arrow) solubilised in buffer 1 (lane 2) and buffer 2 (lane 3) after four weeks of incubation at 4°C. 15 µl of sample was diluted as indicated in the lane headings and loaded in each lane. Gel was stained with Instant Blue. Buffer 1: 50 mM Sodium Phosphate buffer, NaCl, Arginine and Glutamic acid and 10 mM β-mercaptoethanol. Buffer 2: 50 mM Sodium Phosphate buffer, 150 mM NaCl, 50 mM Arginine and Glutamic acid and 10 mM β-mercaptoethanol.
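The percentage losses quoted in Table 3.8 follow from a simple mass balance: the protein mass in a sample is its concentration multiplied by its volume, and the loss is the fraction of the starting mass that is no longer recovered. A minimal Python sketch of that arithmetic is given below. The function name `percent_loss` is introduced here purely for illustration, and the calculation assumes that the loss was computed from the tabulated concentrations and volumes alone.

```python
def percent_loss(conc_before, vol_before_ul, conc_after, vol_after_ul):
    """Percentage of protein mass lost between two steps.

    Concentrations are in mg/ml and volumes in microlitres,
    so mass (mg) = concentration * volume / 1000.
    """
    mass_before = conc_before * vol_before_ul / 1000.0
    mass_after = conc_after * vol_after_ul / 1000.0
    return 100.0 * (1.0 - mass_after / mass_before)


# Values taken from Table 3.8: (concentration in mg/ml, volume in microlitres)
initial = (0.3, 4000)
buffer_1 = (10.0, 20)
buffer_2 = (5.0, 20)

print(round(percent_loss(*initial, *buffer_1), 1))
print(round(percent_loss(*initial, *buffer_2), 1))
```

With the rounded values in the table this gives roughly 83.3% for buffer 1 and 91.7% for buffer 2. The small difference from the tabulated 83.4% presumably reflects rounding of the measured concentrations.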
As stability, purity and high concentration are critical factors for the crystallization of proteins, a new proprietary protein solubilisation mixture developed by Dr Alexander Golovanov (H-MIX) was used to overcome these obstacles. To study the effects of H-MIX on the concentration process, all elution fractions, which already contained H-MIX and pure His-hSTRAP(1-440) protein, were pooled together and concentrated down. The buffer was then exchanged into low-salt Buffer 1, additionally containing H-MIX. Table 3.9 shows that, with H-MIX added to the previously investigated buffer 1, His-hSTRAP(1-440) could be concentrated up to an estimated 18 mg/ml. The purity of the concentrated sample also increased (Fig.3.32A) compared to the preparation without the addition of H-MIX (Fig.3.31). Furthermore, the percentage of protein lost was reduced after the addition of H-MIX to the buffer: 28% (Table 3.9), compared to 83.4% without it (Table 3.8). To determine the long-term stability of hSTRAP protein, samples were stored at 4°C for one month and then analyzed by SDS PAGE. His-hSTRAP(1-440) protein was stable in these buffer conditions (Fig.3.32B), which was a significant improvement compared to the poor stability of His-hSTRAP protein in the same buffer in the absence of H-MIX (Fig.3.31). The protein concentration, purity and stability in H-MIX-containing buffer were now sufficient to proceed to crystallization trials.

Table 3.9. Estimated concentration of His-hSTRAP(1-440) in Buffer 1 + H-MIX.

Protein sample | OD595 | Volume measured (µl) | Concentration (mg/ml) | Volume (µl) | Percentage protein loss (%)
Elution 1 | 0.095 | 5 | 0.29 | 3000 |
Elution 2 | 0.043 | 5 | 0.129 | 2000 |
Pooled elutions 1 and 2 | 0.07 | 5 | 0.21 | 5000 |
Concentrated protein 1 | 0.17 | 5 | 0.5 | 2500 | 0
Concentrated protein 2 | 0.607 | 0.5 | 18 | 50 | 28

The concentration of His-hSTRAP(1-440) protein in various protein samples taken during the concentration of His-hSTRAP(1-440) in the presence of H-MIX + Buffer 1.

Figure 3.32.
Concentration and stability of His-hSTRAP(1-440) protein in Buffer 1 + H-MIX. 10% SDS PAGE of A: His-hSTRAP(1-440) (indicated by an arrow) elutions in Buffer 1 + H-MIX, before (lanes 2 and 3) and after (lanes 4 and 5) concentration. B: Concentrated 18 mg/ml His-hSTRAP(1-440) after storage at 4°C in buffer 1 + H-MIX, demonstrating sample stability. (A) Lanes: 1, Molecular marker; 2, 15 µl Eluted Protein 1 (1:2 dilution); 3, 15 µl Eluted Protein 2 (1:2 dilution); 4, 15 µl of the 2.5 ml of pooled elutions (1+2) concentrate (1:3 dilution); 5, 50 µl of 18 mg/ml concentrated protein sample (1:40 dilution). (B) Lanes: 1, Molecular marker; 2, 50 µl of 18 mg/ml concentrated protein sample (1:50 dilution). Both gels were stained with Instant Blue.

Before proceeding to crystallography trials, the concentrated His-hSTRAP(1-440) sample was analyzed by gel filtration to confirm the presence of pure His-hSTRAP(1-440) protein. For this, a 1:100 dilution of the concentrated His-hSTRAP(1-440) protein sample was injected onto a gel filtration column (Superdex75). One sharp peak is observed on the gel filtration graph at 144 ml (Fig.3.33A), which according to the Superdex75 calibration curve (Fig.3.33B) corresponds to a protein of 50 kDa in size, indicating that crystallography trials could be carried out with this sample.

Figure 3.33. Gel filtration graph of concentrated His-hSTRAP(1-440) protein sample. A: Superdex75 chromatogram (OD280 against volume in ml) obtained after injection of a 1:100 dilution of the concentrated hSTRAP protein sample. B: Superdex75 calibration curve.

Following the establishment of a protocol for the production of stable, pure and homogeneous His-hSTRAP(1-440) protein, with a sample concentration exceeding 10 mg/ml, crystallography trials were initiated. The first trial was carried out in the JCSG+ screen with a 12.9 mg/ml His-hSTRAP(1-440) protein sample.
Half of the concentrated protein sample was supplemented with small suspended graphite flakes, hypothesized to act as a nucleating platform for the protein (Dr A. Golovanov, personal communication). From these initial trials it was concluded that the concentration of His-hSTRAP(1-440) was too low, as more than 45% of the wells were clear after 3 months (Fig.3.34). However, despite the low concentration of His-hSTRAP protein, spherulites were observed in 4 of the wells, both with and without graphite particles, as shown in Figure 3.34. Since it was concluded that the protein concentration was too low, the same trial was repeated with a higher protein concentration (Fig.3.35).

Figure 3.34. Crystallisation trials with 12.9 mg/ml His-hSTRAP(1-440) sample in the JCSG+ screen. B6 contained 0.1 M Phosphate-Citrate Buffer, 40% (v/v) Ethanol and 5% (w/v) PEG 1000. B11 contained 1.6 M Tri-Sodium Citrate Buffer. A: Well B6 with Graphite; B: Well B6 without Graphite; C: Well B11 with Graphite; D: Well B11 without Graphite.

The next crystallization trial was carried out with an 18 mg/ml His-hSTRAP(1-440) protein sample using the JCSG+ trial screen. Due to the higher His-hSTRAP(1-440) protein concentration, only 20% of the wells were clear after a 3-month incubation period, compared to 45% previously, and more wells contained spherulites than in the previous trial (Fig.3.35). After three weeks, a large spherulite or a possible micro-crystal was obtained in well B6 (Fig.3.35A). The screen condition of well B6 was 0.1 M Phosphate-Citrate buffer, 40% (v/v) Ethanol and 5% (w/v) PEG 1000. This well had the largest spherulite observed in the trials so far; this condition was therefore used as the basis for more detailed screens in which the concentrations of the components present were varied, to investigate their effect on crystal quality.
Spherulites were also obtained in B11 (Fig.3.35E and 3.35F), which contained 1.6 M Tri-sodium Citrate; hence the conditions for B6 and B11 were combined for the subsequent screen. The next trial that was carried out therefore contained 0.1 M Tri-Sodium Citrate and 5% (w/v) PEG 1000, with varying ethanol concentrations. Spherulites were consistently obtained in the well B6 and B11 trials with or without graphite particles (Fig.3.34 and Fig.3.35). The spherulites obtained in these preliminary trials could not be used for further structural experiments, as they were too small for a diffraction experiment on a rotating anode source. "Hairball"-like structures, perhaps arising from multiple nucleations, were seen in well A1, both with graphite (Fig.3.35C) and without (Fig.3.35D). Whisker crystal structures were observed in well G1 with graphite (Fig.3.35G), but unfortunately the well without graphite had dried out (Fig.3.35H), so their reproducibility could not be determined.

Figure 3.35. Crystallisation trials with 18 mg/ml His-hSTRAP(1-440) sample in the JCSG+ screen. Well B6 contained 0.1 M Phosphate-Citrate buffer, 40% (v/v) Ethanol and 5% (w/v) PEG 1000, and well A1 contained 2 M Lithium Sulfate, 0.1 M Sodium acetate pH 4.5, and 50% (w/v) PEG 400. Well B11 contained 1.6 M Tri-Sodium Citrate, and G1 contained 0.1 M HEPES pH 7 and 30% (v/v) Jeffamine ED-2001 pH 7. A: Well B6 with graphite (spherulite/microcrystal); B: Well B6 without graphite; C: Well A1 with graphite ("hairball"-like structure); D: Well A1 without graphite; E: Well B11 with graphite; F: Well B11 without graphite; G: Well G1 with graphite (whisker crystal); H: Well G1 without graphite.
The third crystallization trial attempted to optimize the conditions which produced spherulites in earlier experiments (Fig.3.35), by using 0.1 M Tri-Sodium Citrate pH 4.2 and 5% (w/v) PEG 1000, with ethanol concentrations of 60%, 50%, 40%, 30% and 20% (v/v), and a 20 mg/ml His-hSTRAP(1-440) protein sample. Spherulites were again observed in this trial (Fig.3.36) and the largest one was found at 20% (v/v) ethanol (Fig.3.36J), although in the previous trial it was found in 40% (v/v) ethanol (Fig.3.35A). Observation from the side of the well showed the spherulite positioned at the edge of the drop (Fig.3.36K), and as mentioned previously this could be either salt or protein. The ultimate test for this would be to obtain a crystal of high enough quality and analyze its diffraction pattern. This could not be done with this spherulite, as it was too small for a diffraction experiment on the rotating anode source available at the time.

Figure 3.36. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440). All the wells contained 0.1 M Tri-Sodium Citrate, 5% (w/v) PEG1000 and varying ethanol concentrations. A: 60% (v/v) Ethanol with Graphite; B: 60% (v/v) Ethanol without Graphite; C: 50% (v/v) Ethanol with Graphite; D: 50% (v/v) Ethanol without Graphite; E: 40% (v/v) Ethanol with Graphite; F: 40% (v/v) Ethanol without Graphite; G: 30% (v/v) Ethanol with Graphite; H: 30% (v/v) Ethanol without Graphite; I: 20% (v/v) Ethanol with Graphite; J: 20% (v/v) Ethanol without Graphite; K: Zoomed-in picture of the side of the well containing 20% (v/v) Ethanol without Graphite.

The next trial investigated the effect of pH on crystal grade. The pH range tested was 3.8-4.8, in 0.2 increments; additionally, pH 5.2, 6.2 and 7.2 were also checked. Many spherulites were obtained in wells at pH values 4.0-4.6, both with and without graphite (Fig.3.37C-J).
However, after a period of 6 weeks a large tablet-shaped, crystal-like object immersed in precipitate was obtained at pH 4.6 without graphite (Fig.3.37J). When observed under polarized light it changed the plane of polarization (although in a plastic tray), and it measured around 150 × 40 × 40 µm³ as assessed with the microscope graticule. Due to the large amount of precipitate in the well the crystal was difficult to extract, but it was placed in the X-ray beam; however, no single crystal diffraction pattern was obtained (Fig.3.38). The buffer from which the crystal was formed had poor cryoprotectant properties, as ice rings formed when the crystal was subjected to liquid nitrogen (Fig.3.38).

Figure 3.37. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample to observe the effect of pH on crystal grade. All the wells contained 0.1 M Tri-Sodium Citrate, 20% (v/v) ethanol and 5% (w/v) PEG1000, but the pH was varied. A: 20% (v/v) Ethanol with Graphite pH 3.8; B: 20% (v/v) Ethanol without Graphite pH 3.8; C: 20% (v/v) Ethanol with Graphite pH 4.0; D: 20% (v/v) Ethanol without Graphite pH 4.0; E: 20% (v/v) Ethanol with Graphite pH 4.2; F: 20% (v/v) Ethanol without Graphite pH 4.2; G: 20% (v/v) Ethanol with Graphite pH 4.4; H: 20% (v/v) Ethanol without Graphite pH 4.4; I: 20% (v/v) Ethanol with Graphite pH 4.6; J: 20% (v/v) Ethanol without Graphite pH 4.6 (tablet-like crystal); K: 20% (v/v) Ethanol with Graphite pH 4.8; L: 20% (v/v) Ethanol without Graphite pH 4.8; M: 20% (v/v) Ethanol without Graphite pH 5.2; N: 20% (v/v) Ethanol with Graphite pH 5.2; O: 20% (v/v) Ethanol without Graphite pH 6.2; P: 20% (v/v) Ethanol with Graphite pH 6.2; Q: 20% (v/v) Ethanol without Graphite pH 7.2; R: 20% (v/v) Ethanol without Graphite pH 7.2.

Figure 3.38. His-hSTRAP(1-440) Diffraction Pattern.
Diffraction pattern of the crystal obtained in 20% (v/v) ethanol, 0.1 M Tri-sodium citrate and 5% (w/v) PEG1000 at pH 4.6. The sample buffer was not a good cryoprotectant, as many ice rings were observed. The experiment was tailored for protein diffraction (the backstop was pulled out to observe low resolution diffraction and the rotation width was small). Bruker Microstar rotating anode source with CCD detector.

The next set of trials tested the effect on crystal grade of lowering the ethanol concentration in the sample buffer further; the ethanol concentrations tested were 15%, 10% and 5% (v/v). Spherulites and rod-shaped needles were obtained at 10% ethanol (Fig.3.39C); however, this trial did not improve the crystal grade any further, and it seems that this was not the optimum condition for crystal growth.

Figure 3.39. Crystallisation trials with 20 mg/ml His-hSTRAP(1-440) protein sample to observe the effect of further lowering the ethanol concentration on crystal grade. All the wells contained 0.1 M Tri-Sodium Citrate and 5% (w/v) PEG1000 at pH 4.6, but with varying ethanol concentrations. A: 20% (v/v) Ethanol; B: 15% (v/v) Ethanol; C: 10% (v/v) Ethanol; D: 5% (v/v) Ethanol.

As the JCSG+ based trials did not yield crystals suitable for protein structure determination, a number of different kits were tested, namely PACT, Clear Strategy I, Clear Strategy II and Morpheus. These screens were not successful, as mainly clear drops were obtained and no spherulites or crystals were detected (well images not shown).

Spherulites were consistently obtained in wells both with and without graphite (Fig.3.34-3.37 and Fig.3.39), and the addition of graphite made no significant difference to spherulite quality or to the appearance of the wells (Fig.3.34-3.37 and Fig.3.39). This suggested that as-prepared graphite flakes do not offer a suitable nucleating platform, disproving the initial hypothesis.
Despite screening 500 different conditions, a crystal of sufficient grade for structural studies was not obtained. The difficulty in crystallizing this protein could potentially be explained by structural heterogeneity, caused, for example, by the presence of intrinsically unstructured regions in this polypeptide, or by conformational plasticity. At this point, however, it was concluded that priority should be given to solving the structures of shorter fragments of hSTRAP, which may be more stable structurally.

3.3.2 Biophysical and structural studies carried out on GST-hSTRAP(1-440)

3.3.2.1 Circular Dichroism on GST-hSTRAP(1-440)

CD was carried out with pure GST-hSTRAP(1-440) following the same procedure as that described for His-hSTRAP(1-440). A scan was taken at 4°C, before the variable temperature experiment, to determine the folding state of GST-hSTRAP(1-440). The resulting CD spectrum of GST-hSTRAP(1-440) was unusual (Figure 3.40) and difficult to interpret, possibly because the presence of GST in this construct made the polypeptide too large for CD studies. As a consequence, variable temperature experiments were not initiated, nor could the estimated percentage of each secondary structure element be determined.

Figure 3.40. CD experiments carried out on GST-hSTRAP(1-440). CD spectrum of GST-hSTRAP(1-440) recorded at 4°C. This was repeated three times and the same result was obtained as shown in this figure. Abbreviations: MRME, Mean residue molar ellipticity.

3.3.2.2 GST tag cleavage

The next set of experiments was done to determine the possibility of carrying out crystallography on hSTRAP(1-440) obtained from a GST-tagged construct. In order to carry out these experiments, the 26 kDa GST tag had to be cleaved as described in Materials and Methods Section 2.14.
Cleavage on the affinity resin was successful, as two protein bands, at 50 kDa and 26 kDa, were detected bound to the resin after incubation with PreScission protease (Fig.3.41), corresponding to hSTRAP(1-440) and the GST tag respectively (Fig.3.41A, Lane 4). However, no protein was detected in the elution fractions collected (Fig.3.41A, Lanes 10-12). Furthermore, the post-elution resin sample indicated that hSTRAP(1-440) and the GST tag (Fig.3.41A, Lane 4) were still bound to the resin. For this reason, the concentration of reduced glutathione in the elution buffer was increased from 10 to 20 mM. Samples of resin taken after elution with the new buffer were analyzed by SDS PAGE (Fig.3.41B, Lane 4), which indicated that hSTRAP(1-440) and GST were still bound to the resin, and no protein was detected in the elution fractions (Fig.3.41B, Lanes 6-7). The concentration of glutathione was therefore drastically increased to 200 mM; however, both GST and hSTRAP protein remained bound to the resin (Fig.3.41B, Lane 5). Furthermore, no protein was detected on the gel in any elutions (Fig.3.41B, Lanes 8-9), suggesting that the hSTRAP protein cannot be eluted from the resin once cleaved from the GST tag, possibly due to protein precipitation occurring after cleavage.

Figure 3.41. On-column GST tag cleavage of GST-hSTRAP(1-440). 10% SDS PAGE gels showing A: Pure GST-hSTRAP(1-440) protein is bound to the resin before cleavage (Lane 3), but the GST tag and hSTRAP(1-440) are still bound to the affinity resin after cleavage (Lane 4) and no protein is detected in elutions (Lanes 10-12). B: Even after increasing the glutathione concentration to 20 mM (Lane 4) and 200 mM (Lane 5), hSTRAP(1-440) and GST are still bound to the resin and no protein is detected in elutions either (Lanes 6-9). Gels were loaded with 15 µl of sample in each lane and both gels were stained with Instant Blue.
The next set of experiments was performed to determine whether GST cleavage could be carried out in solution, but precipitation was seen once the protease was added during the cleavage experiment (gel not shown); hence cleavage of GST was also not possible in solution. Therefore hSTRAP could not be obtained by cleavage from its GST-tagged form for crystallographic or any other studies.

3.3.3 Biophysical and structural studies carried out on hSTRAP(1-219)

3.3.3.1 Circular Dichroism of hSTRAP(1-219)

Dialyzed hSTRAP(1-219) protein was analyzed by SDS PAGE to check its purity before proceeding to Circular Dichroism experiments. SDS PAGE gel analysis confirmed that the dialyzed hSTRAP(1-219) protein was pure (Fig.3.42, Lane 3). The concentration of protein used for this CD experiment was 11.4 µM.

Figure 3.42. Dialysed hSTRAP(1-219) protein in CD buffer. 15% SDS PAGE gel indicating pure hSTRAP(1-219) protein. Fifteen microlitres of sample was loaded in each lane and the gel was stained with Instant Blue.

An initial CD scan taken at 4°C to determine the folding state of hSTRAP(1-219) showed that the protein was folded and composed of various secondary structure elements (Fig.3.43A). Mean residue molar ellipticity values for this protein were then fed into the software Dichroweb to determine the percentage of each secondary structure element. This showed that hSTRAP(1-219) is 67% α-helical, 8.6% β-structure, 11.6% turn and 12.8% disordered. The thermal stability experiment showed that hSTRAP(1-219) does not display a clear cooperative unfolding transition between 4 and 80°C at 220 nm (Fig.3.43B), suggesting that the protein may exist in a molten globule state. The scan taken at 4°C after the variable temperature experiment had been completed showed that hSTRAP(1-219) was still folded (Fig.3.43C).
Furthermore, the two scans taken at 4°C before and after the variable temperature experiment can be completely superimposed, and no significant difference between the spectra was detected (Fig.3.43D). These experiments were repeated three times and the same results were obtained. These results suggested that the protein is folded at the secondary structure level; however, the absence of a distinct thermal unfolding transition suggests that this protein construct exists in a molten globule state.

Figure 3.43. CD experiments carried out on hSTRAP(1-219). A: CD spectrum recorded at 4°C, before the variable temperature experiment was carried out. B: CD signal recorded at variable temperatures ranging from 4 to 80°C at a fixed wavelength of 220 nm. C: CD spectrum recorded at 4°C, after the variable temperature experiment was carried out. D: Superimposition of A (highlighted in blue) and C spectra (highlighted in green). One representative experiment is shown out of three repeats. Abbreviations: MRME, mean residue molar ellipticity.

3.3.3.2 NMR studies of hSTRAP(1-219)

For NMR studies, high quantities of pure hSTRAP(1-219), of the order of tens of mg/ml, were needed. For that reason, a buffer optimal for protein stability and solubility, into which hSTRAP(1-219) protein should be dialyzed, had to be determined. Once dialyzed, the protein had to be concentrated to the highest concentration possible and then injected into a gel filtration column. SDS PAGE gel analysis of the dialyzed sample (in 20 mM sodium phosphate buffer, 150 mM NaCl, 50 mM arginine and glutamic acid and 10 mM β-mercaptoethanol), before and after concentration, is shown in Figure 3.44A. The concentration of hSTRAP(1-219) was not sufficient, as the protein could not be concentrated beyond 2 mg/ml (Table 3.10); furthermore, it precipitated in solution.
An estimated 76.1% loss of hSTRAP(1-219) protein in this buffer was detected (Table 3.10), as well as many contaminants (Fig.3.44A, Lane 3). To carry out structural studies, the hSTRAP(1-219) protein sample has to be stable for weeks, so the concentrated protein sample was analyzed by SDS PAGE after storage for a month at 4°C to assess its long-term stability. SDS PAGE gel analysis showed that hSTRAP(1-219) was unstable after long-term storage at 4°C in this buffer (Fig.3.44B, Lane 2).

Figure 3.44. hSTRAP(1-219) protein stability in gel filtration buffer. 15% SDS PAGE gels showing A: Dialysed hSTRAP(1-219) (indicated by an arrow) before (Lane 2) and after (Lane 3) concentration in gel filtration buffer. B: hSTRAP(1-219) after one month of storage at 4°C in gel filtration buffer (Lane 2). All lanes were loaded with 15 µl of sample and both gels were stained with Instant Blue.

Table 3.10. Estimated concentrations of hSTRAP(1-219) in gel filtration buffer.

Sample                               OD595   Concentration (mg/ml)   Vol (µl)   Protein loss (%)
Dialysed hSTRAP(1-219) protein       0.070   0.21                    10 000
Concentrated hSTRAP(1-219) protein   0.675   2.01                    250        76.1

Estimated concentration (as measured by Bradford reagent) of dialyzed hSTRAP(1-219) before and after concentration via the Amicon and Viva-spin columns.

As shown in Section 3.1, H-MIX improved hSTRAP(1-219) stability, so the effect of adding it to the gel filtration buffer was investigated. Buffer exchange and concentration in H-MIX were carried out in the Amicon and Viva-spin columns as described in Materials and Methods section 2.11. The concentration of hSTRAP(1-219) protein was estimated through Bradford measurements taken at OD595, which showed that with the addition of H-MIX to the gel filtration buffer, hSTRAP(1-219) can be concentrated to an estimated 7 mg/ml in 20 µl (Table 3.11); it precipitates at 8 mg/ml.
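The protein loss reported in Table 3.10 can be reproduced from the tabulated concentrations and volumes. The following is a minimal sketch, assuming that loss is defined as the fraction of total protein mass (concentration multiplied by volume) missing after the concentration step; the function names are illustrative and are not taken from the thesis.

```python
def protein_mass_mg(conc_mg_per_ml: float, volume_ul: float) -> float:
    """Total protein mass in mg from concentration (mg/ml) and volume (µl)."""
    return conc_mg_per_ml * volume_ul / 1000.0


def percent_loss(conc_before: float, vol_before_ul: float,
                 conc_after: float, vol_after_ul: float) -> float:
    """Percentage of protein mass lost between two steps (assumed definition)."""
    before = protein_mass_mg(conc_before, vol_before_ul)
    after = protein_mass_mg(conc_after, vol_after_ul)
    return 100.0 * (before - after) / before


# Values from Table 3.10: dialysed sample vs. concentrated sample.
loss_table_3_10 = percent_loss(0.21, 10_000, 2.01, 250)
print(f"Protein loss: {loss_table_3_10:.1f}%")  # Protein loss: 76.1%
```

With the Table 3.10 values this gives 2.1 mg before and about 0.50 mg after concentration, which matches the reported loss of 76.1%.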
For NMR experiments, 500 µl at this concentration would initially be needed to determine the tertiary folding state. SDS PAGE gel analysis of hSTRAP(1-219) samples before and after concentration showed that approximately 95% pure hSTRAP(1-219) protein was present in the concentrated samples (Fig.3.45A). Furthermore, these experiments established that hSTRAP(1-219) can be concentrated at pH 8 and, intriguingly, at pH 5 (Table 3.11). Given that the pI of this protein is 5.19, this suggests that H-MIX facilitates solubilization and stabilization even at pH values close to the pI of the protein, where its solubility is expected to be at its minimum. The loss of protein decreased from 76.1% in the buffer without H-MIX (Table 3.10) to 29% after the addition of H-MIX to the buffer (Table 3.11). The percentage loss of protein was marginally lower at pH 5 (29%) than at pH 8 (32%) (Table 3.11). Concentrated hSTRAP(1-219) was analyzed by SDS PAGE to investigate its long-term stability in the optimized NMR buffer. SDS PAGE gel analysis confirmed that concentrated hSTRAP(1-219) protein was relatively stable during a month of incubation at 4°C in this optimized buffer at both pH 5 and pH 8 (Fig.3.45B). Although some degradation products were detected between 11 and 15 kDa in both concentrated samples (Fig.3.45B, Lanes 3 and 5), sample stability had significantly improved compared to the sample without H-MIX (Fig.3.44B). All of these results suggest that, in principle, high protein concentrations can be achieved and structural studies can be carried out on hSTRAP(1-219).

Figure 3.45. Concentration and stability of hSTRAP(1-219) in NMR buffer. These 15% SDS PAGE gels show A: hSTRAP(1-219) (indicated by an arrow) samples before and after concentration in NMR buffer (gel filtration buffer + H-MIX), pH 5 (Lanes 2 and 3), and pH 8 (Lanes 4 and 5).
B: Long-term stability of hSTRAP(1-219) in NMR buffer (gel filtration buffer + H-MIX) at both pH 5 (Lanes 2-3) and pH 8 (Lanes 4-5). All lanes were loaded with 15 µl of sample, diluted as indicated in the lane headings. Both gels were stained with Instant Blue.

Table 3.11. Estimated concentration of hSTRAP(1-219) protein samples in NMR buffer.

Sample                                         OD595   Concentration (mg/ml)   Vol (µl)   Protein loss (%)
Eluted hSTRAP(1-219) Protein 1                 0.200   0.20                    1000
Concentrated hSTRAP(1-219) Protein 1 (pH 5)    0.475   7.13                    20         29
Concentrated hSTRAP(1-219) Protein 2 (pH 8)    0.455   6.83                    20         32

Estimated concentrations of hSTRAP(1-219) before and after concentration in NMR buffer (gel filtration buffer + H-MIX), pH 5 and pH 8, were determined with Bradford reagent as described in Materials and Methods.

A 1D 1H NMR spectrum was recorded to check the folding state of hSTRAP(1-219) in this optimized buffer (gel filtration buffer + H-MIX) before proceeding to produce this protein in 15N-labelled form. The 1D 1H NMR spectrum of hSTRAP(1-219) recorded at 30°C shows no evidence of a unique tertiary structure (Fig.3.46). No shifted methyl resonances, which usually signal the presence of defined structure, were observed around 0 ppm, and the peaks are too broad for a protein of this size. The signal broadening could alternatively be explained by protein aggregation, which, however, was expected to be strongly suppressed by the presence of arginine, glutamic acid [152] and H-MIX in the buffer. The low intensities of the NMR signals could be a result of conformational exchange in the absence of a unique 3D structure, as well as some aggregation (Fig.3.46). The initial NMR and CD spectra were consistent with a possible lack of unique tertiary structure. However, further NMR experiments were carried out to explore whether a change in solution conditions could result in correctly folded hSTRAP(1-219) protein.

Figure 3.46. 1D NMR spectrum of 0.2 mM hSTRAP(1-219).
Concentrated hSTRAP(1-219) sample at 30°C in optimized NMR buffer (H-MIX + gel filtration buffer), pH 8. The large signals, which are clipped, originate from H-MIX. The spectrum indicates that this protein does not possess a unique folded structure.

A 15N-hSTRAP(1-219) protein sample was then produced to investigate the folding state of this construct by observing signals in the fingerprint amide region. Optimum conditions for 15N-hSTRAP(1-219) protein expression were determined through expression trials, and were identified as BL21(DE3)pLysS cells induced with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs incubation at 30°C in minimal media supplemented with 15N-ammonium chloride. Samples representative of total, soluble and insoluble expression at the identified optimal conditions of 15N-hSTRAP(1-219) protein expression were analyzed by SDS PAGE (Fig.3.47). 15N-hSTRAP(1-219) protein was expressed over time (Fig.3.47A); however, it was not detected in samples representative of either the soluble or the insoluble fraction at these growth conditions (Fig.3.47B). This has been observed previously with unlabelled hSTRAP(1-219) (see Results Section 3.1), as hSTRAP(1-219) was precipitating in the soluble fraction. This could be due to the lysis method, as the methodologies of small- and large-scale protein purification differ.

Figure 3.47. Expression of 15N-labelled hSTRAP(1-219). These 15% SDS PAGE gels show A: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(1-219) transformed cell lysate 1 hour (Lane 3), 2 hours (Lane 4) and 3 hours (Lane 5) post IPTG induction. B: Soluble (Lanes 2-5) and insoluble fractions (Lanes 6-9) of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(1-219) after 1 hour (Lanes 3 and 7), 2 hours (Lanes 4 and 8) and 3 hours (Lanes 5 and 9) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 6 are pre-induction fractions.
All lanes were loaded with 15 µl of sample and both gels were stained with Instant Blue.

Purification of 15N-hSTRAP(1-219) was performed with H-MIX in the elution buffers, as this was found to be critical for hSTRAP(1-219) protein stability. The initial purification used the same conditions as for the purification of unlabelled hSTRAP(1-219) protein: 50 mM Tris, pH 8.7, 300 mM NaCl, 50 mM arginine and glutamic acid, and H-MIX. This yielded pure 15N-hSTRAP(1-219) protein bound to the His tag affinity resin (Fig.3.48A, Lane 3); however, very little protein eluted off the resin with 200 mM imidazole in the elution buffer (Fig.3.48A, Lanes 5-7). A high quantity of 15N-hSTRAP(1-219) was still bound to the His tag affinity resin even after elutions were taken (Fig.3.48A, Lane 4). The imidazole concentration was increased to 400 mM with a view to eluting 15N-hSTRAP(1-219) protein off the resin; however, this concentration of imidazole was still not sufficient to elute the protein (Fig.3.48B). The imidazole concentration in the elution buffer was increased again to 600 mM, and very low quantities of 15N-hSTRAP(1-219) protein were detected in the elutions (Fig.3.48C, Lanes 3-5), although the majority of 15N-hSTRAP(1-219) was still bound to the His tag affinity resin (Fig.3.48C, Lane 6). The imidazole concentration in the elution buffer was increased to 800 mM, and again no 15N-hSTRAP(1-219) protein was detected in elutions (Fig.3.48D, Lanes 3-5) and 15N-hSTRAP(1-219) protein was still bound to the resin after elutions were taken (Fig.3.48D, Lane 6). The imidazole concentration was then increased to 1 M, which is the highest concentration of imidazole that can be used to release the protein from a His-tag affinity resin. Again no 15N-hSTRAP(1-219) protein was detected in elutions (Fig.3.48E, Lanes 3-5) and 15N-hSTRAP(1-219) protein was still bound to the resin (Fig.3.48E, Lane 6).
These results suggested that 15N-hSTRAP(1-219) protein had precipitated and hence was not eluting, even with 1 M imidazole. The next 15N-hSTRAP(1-219) protein purification was carried out with H-MIX and 300 mM imidazole in the elution buffer. This time pure 15N-hSTRAP(1-219) protein was detected in elutions (Fig.3.48F). The estimated concentration of 15N-hSTRAP(1-219) protein in elutions was, however, low, and the total quantity estimated from 1 litre of E. coli growth culture was 0.9 mg (Table 3.12).

Figure 3.48. Purification of 15N-labelled hSTRAP(1-219). All elution buffers contained H-MIX. 15% SDS PAGE gels showing 15N-hSTRAP(1-219) purification at pH 8.7 with A: 200 mM imidazole; B: 400 mM imidazole; C: 600 mM imidazole; D: 800 mM imidazole; E: 1 M imidazole; F: H-MIX and 300 mM imidazole only. All lanes were loaded with 15 µl of sample and all gels were stained with Instant Blue.

Table 3.12. Estimated concentration of 15N-hSTRAP(1-219).

Fraction    OD595   Concentration (mg/ml)   Elution volume (ml)   Protein quantity (mg)
Elution 1   0.150   0.150                   1.5                   0.225
Elution 2   0.143   0.143                   1.5                   0.215
Elution 3   0.136   0.136                   1.5                   0.204
Elution 4   0.112   0.112                   1.5                   0.168
Elution 5   0.060   0.060                   1.5                   0.09

Estimated concentration of 15N-hSTRAP(1-219) in elutions (measured with Bradford reagent) obtained through protein purification in H-MIX and 300 mM imidazole only (elutions shown in Figure 3.48F).

H-MIX at pH 8 was not found to be the optimal buffer condition, as the protein precipitated in solution almost instantly during the concentration process and many contaminants were detected in the concentrated sample (Fig.3.49). We could not, therefore, obtain sufficient amounts of soluble 15N-hSTRAP(1-219) for detailed NMR experiments.

Figure 3.49. Concentration of 15N-hSTRAP(1-219) in H-MIX, pH 8. 15% SDS PAGE gel showing eluted 15N-hSTRAP(1-219) protein 1 and 2 (Lanes 2 and 3) and concentrated 15N-hSTRAP(1-219) (Lane 4) in H-MIX only, pH 8.
All lanes were loaded with 15 µl of sample and the gel was stained with Instant Blue.

CD and NMR experiments carried out on hSTRAP(1-219) showed that this protein does not have a unique tertiary structure (Fig.3.46), although it possesses secondary structure (Fig.3.43). In addition, no clear cooperative unfolding was observed as the temperature was raised, suggesting that this construct may exist in a molten globule state (Fig.3.43B). This would explain the difficulties with expressing and purifying this protein, its proteolytic instability and its tendency to aggregate and precipitate. The molten-globule behavior of this construct was possibly a consequence of the truncation, which perturbs the native structure.

3.3.4 Biophysical and structural studies carried out on hSTRAP(1-150)

3.3.4.1 Circular Dichroism of hSTRAP(1-150)

Pure hSTRAP(1-150) protein was dialysed into CD buffer (20 mM sodium phosphate buffer, pH 6.5) and then analyzed by SDS PAGE to determine its purity before proceeding to Circular Dichroism experiments. This showed that the dialyzed hSTRAP(1-150) protein was pure (Fig.3.50, Lane 2). The concentration of hSTRAP(1-150) protein used for CD experiments was 10.2 µM.

Figure 3.50. Dialysed hSTRAP(1-150) protein in CD buffer. Fifteen microlitres of dialyzed protein sample was loaded on the gel, indicating the presence of pure hSTRAP(1-150) protein. The gel was stained with Instant Blue.

CD experiments were carried out to characterize the folding state and thermal stability of hSTRAP(1-150). An initial scan taken at 4°C showed that hSTRAP(1-150) is an alpha helical protein (Fig.3.51A). Mean residue molar ellipticity values for this protein construct were then entered into the program Dichroweb to determine the percentage of each secondary structure element of hSTRAP(1-150). This showed that hSTRAP(1-150) is 71% alpha helical and 29% disordered.
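Mean residue molar ellipticity is obtained by normalising the raw CD signal to concentration, path length and chain length. The following is a minimal sketch of that standard conversion; the raw ellipticity (-20 mdeg), the cuvette path length (0.1 cm) and the residue count (150, ignoring the His tag) are hypothetical values chosen for illustration and are not reported in the text. Only the 10.2 µM concentration comes from the experiment described above.

```python
def mean_residue_ellipticity(theta_mdeg: float, conc_molar: float,
                             path_cm: float, n_residues: int) -> float:
    """Convert observed ellipticity (mdeg) to mean residue molar
    ellipticity in deg cm^2 dmol^-1 per residue.

    Uses the standard relation [theta] = theta_mdeg / (10 * c * l * n),
    with c in mol/l and l in cm.
    """
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# 10.2 µM is the reported sample concentration; all other inputs are
# hypothetical placeholders for illustration only.
mre = mean_residue_ellipticity(theta_mdeg=-20.0,
                               conc_molar=10.2e-6,
                               path_cm=0.1,
                               n_residues=150)
print(f"MRME: {mre:.0f} deg cm^2 dmol^-1")
```

Some conventions divide by the number of peptide bonds (n - 1) rather than the number of residues, so the choice should match whatever the deconvolution software expects.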
A variable temperature experiment between 4 and 80°C, with detection at 220 nm, was then carried out, which showed that hSTRAP(1-150) protein starts unfolding at 10°C and is completely unfolded by 70°C (Fig.3.51B). This showed that hSTRAP(1-150) was unstable; furthermore, the scan taken at 4°C after the variable temperature experiment had been completed (Fig.3.51C), and its comparison with the scan taken at 4°C before the experiment (Fig.3.51D), confirmed that hSTRAP(1-150) thermal unfolding was irreversible. The melting mid-point of hSTRAP(1-150) is approximately 40°C. All of these results suggest that hSTRAP(1-150) is thermally unstable, but may retain some helical conformation between 4 and 10°C. This was repeated three times and the same results were obtained as shown in Figure 3.51. This information was then used to inform the NMR experiments, which were to be carried out below 10°C.

Figure 3.51. CD experiments carried out on hSTRAP(1-150). A: CD spectrum recorded at 4°C, before the variable temperature experiment was carried out. B: CD signal recorded at variable temperatures ranging from 4 to 80°C at a fixed wavelength of 220 nm. C: CD spectrum recorded at 4°C, after the variable temperature experiment was carried out. D: Superimposition of A (highlighted in green) and C spectra (highlighted in blue). One representative experiment is shown out of three repeats.

3.3.4.2 NMR studies of hSTRAP(1-150)

Expression trials were carried out to determine the optimum conditions for soluble 15N-hSTRAP(1-150) production, and these trials identified optimal conditions in BL21(DE3)pLysS, induced with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs incubation at 25°C in minimal media. Samples representative of total, soluble and insoluble expression of 15N-hSTRAP(1-150) at the optimal conditions of protein expression were analyzed on 15% SDS PAGE gels.
This showed that hSTRAP(1-150) was expressed over time (Fig.3.52A), and was mainly found in the soluble fraction at these identified optimum conditions of 15N-hSTRAP(1-150) protein expression (Fig.3.52B).

Figure 3.52. Expression of 15N-labelled hSTRAP(1-150) protein. These 15% SDS PAGE gels show A: Total BL21(DE3)pLysS pET-14b-His-hSTRAP(1-150) transformed cell lysate 1 hour (Lane 3), 2 hours (Lane 4) and 3 hours (Lane 5) post IPTG induction. B: Soluble (Lanes 2-5) and insoluble fractions (Lanes 6-9) of BL21(DE3)pLysS cells transformed with pET-14b-His-hSTRAP(1-150) after 1 hour (Lanes 3 and 7), 2 hours (Lanes 4 and 8) and 3 hours (Lanes 5 and 9) post IPTG induction. Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 6 are pre-induction fractions. All lanes were loaded with 15 µl of sample and both gels were stained with Instant Blue.

Purification of 15N-labelled hSTRAP(1-150) yielded high quantities of pure 15N-hSTRAP(1-150) protein in elutions from the metal-affinity column (Fig.3.53). The estimated protein concentration in each elution is shown in Table 3.13, and the total quantity of 15N-hSTRAP(1-150) protein obtained from 1 litre of E. coli culture was 91.6 mg (Table 3.13). The next step was to determine optimal NMR buffer conditions to obtain correctly folded hSTRAP(1-150) protein. For that, fractions containing pure protein (as analysed by SDS PAGE) were pooled and dialyzed into various NMR buffers, and the protein folding state was subsequently checked.

Figure 3.53. Purification of 15N-labelled hSTRAP(1-150). 15% SDS PAGE gel showing 15N-hSTRAP(1-150) purification from BL21(DE3)pLysS at pH 8.7. Fifteen microlitres of sample (diluted 1:40) was loaded in each lane. The gel was stained with Instant Blue.

Table 3.13. Estimated concentration of 15N hSTRAP(1-150).
Fraction    OD595   Concentration (mg/ml)   Elution volume (ml)   Protein quantity (mg)
Elution 1   1.000   30                      1                     30
Elution 2   0.860   25.8                    1                     25.8
Elution 3   0.530   15.9                    1                     15.9
Elution 4   0.338   10.14                   1                     10.14
Elution 5   0.215   6.45                    1                     6.45
Elution 6   0.098   2.94                    1                     2.94
Elution 7   0.011   0.33                    1                     0.33

Estimated concentration of 15N-hSTRAP(1-150) in elution fractions obtained through protein purification (elutions shown in Figure 3.53). These concentrations were determined with Bradford reagent as described in Materials and Methods.

The next set of experiments was performed to determine the optimal buffer by analyzing 1D 1H and 2D 1H-15N correlation NMR spectra of 15N-hSTRAP(1-150) protein in each buffer. 15N-hSTRAP(1-150) protein at 0.5 mM and more than 95% purity (as analyzed by SDS PAGE) was present in the dialyzed, concentrated sample used for the initial NMR experiments (Fig.3.54A, Lane 3). The initial NMR buffer tested contained only 20 mM sodium phosphate buffer and 150 mM NaCl, pH 6.2; it was chosen for its simplicity and because it avoids the peaks arising from buffer constituents such as arginine and glutamic acid. An initial 1D 1H spectrum and a 2D 1H-15N correlation HSQC experiment of 15N-hSTRAP(1-150) protein in this initial buffer showed no evidence of folded protein (Fig.3.54B and 3.55A respectively). Arginine and glutamic acid (50 mM each) were then added to the buffer, which was hypothesized to improve stability and reduce aggregation [152] of 15N-hSTRAP(1-150) protein. The 2D 1H-15N correlation NMR spectrum in this buffer indeed improved (Fig.3.55B) relative to the initial spectrum (Fig.3.55A), as more signals were detected. However, the signals were not well dispersed and were not uniform in intensity, as would be expected for a folded protein, suggesting structural instability or aggregation. Reducing agent (DTT) was then added to this mixture and an HSQC was recorded again; this improved the spectrum further, as the signals were more uniform (Fig.3.55C).
Then the pH was increased to 6.5 from 6.0 by titrating several microliters of diluted NaOH solution directly into the sample and monitoring the pH with a thin 3 mm electrode. The spectrum re-recorded at this slightly higher pH gave more uniform and dispersed signals, and was the best spectrum obtained so far in these trials (Fig.3.55D). From these NMR experiments it was clear that the presence of DTT and the change in pH had improved the appearance of the NMR spectra of 15N-hSTRAP(1-150), likely by reducing aggregation mediated by intermolecular disulfide bond formation and by increasing the total charge of the protein, respectively. This spectrum, however, still contained too many signals (e.g., from tryptophan indoles), suggesting that a mixture of different conformers was present. It should be noted that all of these initial NMR experiments were conducted at 30°C. At this point it was identified through CD experiments that this protein construct seems to retain secondary structure only between 4 and 10°C, and starts to unfold as the temperature is raised (Fig.3.51B). This is consistent with the NMR experiments carried out at 30°C, in which a mixture of folded and unfolded hSTRAP(1-150) protein was present in the sample. A 2D 1H-15N correlation HSQC of dialysed 0.5 mM 15N-hSTRAP(1-150) protein at 4°C in 20 mM sodium phosphate buffer, 150 mM sodium chloride, 50 mM arginine and glutamic acid, and 10 mM DTT at pH 6.5 showed that a mixture of folded and unfolded hSTRAP(1-150) protein was present in the sample (Fig.3.56). However, there was more folded hSTRAP(1-150) protein in this sample (Fig.3.56) than in the previous sample at 30°C (Fig.3.55D). Signals were still not of uniform intensity and not dispersed enough to suggest the presence of completely folded protein. Also, the two tryptophan indole signals had disappeared, which can be explained by possible protein aggregation.
Figure 3.54. 15N-hSTRAP(1-150) buffer optimisation trials. A: 15% SDS PAGE gel showing 15 µl of 15N-hSTRAP(1-150) protein sample dialyzed into the initial NMR buffer (20 mM sodium phosphate buffer and 150 mM sodium chloride, pH 6.2), before (Lane 2) and after concentration (Lane 3). The gel was stained with Instant Blue. B: 1D 1H NMR spectrum showing no evidence of folded 15N-hSTRAP(1-150) protein in the initial NMR buffer used. C: 1D 1H NMR spectrum of 15N-hSTRAP(1-150) in optimised buffer (20 mM sodium phosphate buffer, 150 mM sodium chloride, 50 mM arginine and glutamic acid, and 10 mM DTT at pH 6.5) showing evidence of folded protein.

Figure 3.55. 2D 1H-15N correlation NMR spectra of 15N-hSTRAP(1-150). HSQC spectra of 0.5 mM 15N-hSTRAP(1-150) taken at 30°C in A: 20 mM sodium phosphate buffer and 150 mM sodium chloride, pH 6.2; B: Same buffer as used in (A) but with 50 mM arginine and glutamic acid, pH 6.2; C: Same buffer as used in (B) but with 10 mM DTT at pH 6.2; D: Same buffer as used in (C) but at pH 6.5. This latter spectrum showed evidence of folded 15N-hSTRAP(1-150) protein, although unfolded 15N-hSTRAP(1-150) protein was also present.

Figure 3.56. 2D 1H-15N correlation HSQC NMR spectrum of 15N-hSTRAP(1-150) in the identified optimised conditions. This 2D HSQC NMR spectrum was taken at 6°C of dialysed 15N-hSTRAP(1-150) protein in 20 mM sodium phosphate buffer, 150 mM sodium chloride, 50 mM arginine and glutamic acid, and 10 mM DTT at pH 6.5. This spectrum does not show evidence of completely folded protein due to the non-uniform peak intensities, suggesting conformational instability of the protein.
From this set of experiments it was concluded that, when produced in the BL21(DE3)pLysS cell line, hSTRAP(1-150) was not completely and stably folded. These experiments have, however, shown that DTT helps the folding of hSTRAP(1-150) (Fig.3.55D), which suggests that incorrectly formed disulphide bonds may be responsible for the poor properties of this construct when expressed in this cell line. To test whether the construct properties could be improved, the pET-14b-His-hSTRAP(1-150) plasmid DNA was transformed into Shuffle T7 express and Shuffle T7pLysY, which both facilitate correct disulphide bond formation and contain chaperones to assist protein folding (Table 2.2). The latter cell line also expresses an inhibitor of transcription to suppress expression of protein prior to induction (Table 2.2). Expression trials were then carried out to determine the optimum conditions for soluble expression of unlabelled hSTRAP(1-150) protein in these cell lines. These trials identified optimum expression in Shuffle T7, induced with 0.1 mM IPTG at an OD600 of 0.5, followed by 3 hrs incubation at 30°C in LB media. Samples representative of total, soluble and insoluble fractions of lysed cells were analyzed by SDS PAGE, which showed hSTRAP(1-150) being expressed over time at these optimum conditions (Fig.3.57A, Lanes 2-5) and found mainly in the soluble fraction (Fig.3.57B, Lanes 6-9). However, hSTRAP(1-150) was not expressed in Shuffle T7pLysY (gels not shown).

Figure 3.57. Expression of unlabelled hSTRAP(1-150) in Shuffle T7. These 15% SDS PAGE gels show A: Total Shuffle T7 pET-14b-His-hSTRAP(1-150) transformed cell lysate 1 hour (Lane 3), 2 hours (Lane 4) and 3 hours (Lane 5) post IPTG induction. B: Insoluble (Lanes 2-5) and soluble (Lanes 6-9) fractions of Shuffle T7 cells transformed with pET-14b-His-hSTRAP(1-150) after 1 hour (Lanes 3 and 7), 2 hours (Lanes 4 and 8) and 3 hours (Lanes 5 and 9) after induction with IPTG.
Lane 1 in (A) and (B) represents protein markers. Lanes 2 and 6 are pre-induction fractions. Each lane was loaded with 15 µl of sample and both gels were stained with Instant Blue.

Protein purification showed that pure hSTRAP(1-150) can be obtained in elutions at pH 8.7 (Fig.3.58). The estimated concentration of hSTRAP(1-150) protein obtained in each elution fraction is shown in Table 3.14. The total quantity estimated from 1 litre of E. coli culture was 1.88 mg (Table 3.14), which is very low compared to the 200 mg/l of unlabelled hSTRAP(1-150) usually obtained in BL21(DE3)pLysS (see Results Section 3.1).

Figure 3.58. Purification of unlabelled hSTRAP(1-150) in Shuffle T7 express. 15% SDS PAGE gel showing eluted hSTRAP(1-150). All lanes were loaded with 15 µl of sample and the gel was stained with Instant Blue.

Table 3.14. Estimated concentration of hSTRAP(1-150) in elution fractions when expressed in Shuffle T7 express cells.

Fraction    OD595   Concentration (mg/ml)   Elution volume (ml)   Total protein quantity (mg)
Elution 1   0.230   0.230                   1.5                   0.345
Elution 2   0.285   0.285                   1.5                   0.428
Elution 3   0.255   0.255                   1.5                   0.383
Elution 4   0.169   0.169                   1.5                   0.254
Elution 5   0.113   0.113                   1.5                   0.170
Elution 6   0.102   0.102                   1.5                   0.153
Elution 7   0.098   0.098                   1.5                   0.147

Estimated concentration of hSTRAP(1-150) in elutions obtained through protein purification in Shuffle T7 express (elutions shown in Figure 3.58), determined with Bradford reagent as described in Materials and Methods.

The next step was to dialyse hSTRAP(1-150) protein into the optimal NMR buffer previously identified for labelled hSTRAP(1-150), which was 20 mM sodium phosphate buffer, 150 mM sodium chloride, 50 mM arginine and glutamic acid at pH 6.5. DTT was not added to elutions obtained with this cell line, as this cell line contains enzymes which should help the formation of correct disulphide bonds (Table 2.2). The protein was then concentrated in the Amicon, and then in the Viva-spin column, to 500 µl to obtain a concentration of 0.2 mM.
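The total yield quoted from Table 3.14 is the sum of the per-fraction protein quantities, each being concentration multiplied by elution volume. The following is a minimal sketch of that arithmetic, assuming the elution volume column is in millilitres (the unit is inferred from the tabulated quantities, not stated in the table); variable names are illustrative.

```python
# Concentrations (mg/ml) of Elutions 1-7 from Table 3.14.
concentrations_mg_per_ml = [0.230, 0.285, 0.255, 0.169, 0.113, 0.102, 0.098]
elution_volume_ml = 1.5  # assumed unit: ml per fraction

# Protein quantity per fraction and total yield from 1 litre of culture.
per_fraction_mg = [c * elution_volume_ml for c in concentrations_mg_per_ml]
total_mg = sum(per_fraction_mg)

# Fold difference relative to the ~200 mg/l reported for BL21(DE3)pLysS.
fold_lower = 200.0 / total_mg

print(f"Total yield: {total_mg:.2f} mg")  # Total yield: 1.88 mg
print(f"Approximately {fold_lower:.0f}-fold lower than in BL21(DE3)pLysS")
```

The same calculation applied to the concentrations and volumes in Tables 3.12 and 3.13 gives totals that round to the reported 0.9 mg and 91.6 mg.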
Subsequently a 1D 1H spectrum of this concentrated sample was recorded, which again did not show any signs of well-folded protein (Fig.3.59), or any improvement over previous spectra of hSTRAP(1-150) samples obtained in BL21(DE3)pLysS cells and recorded in the same buffer (Fig.3.56). Hence, no difference was detected in the folding of hSTRAP(1-150) in this cell line, which had initially been hypothesized to improve protein folding due to the presence of both disulphide-oxidizing enzymes and chaperones.

Figure 3.59. 1D 1H Spectrum of hSTRAP(1-150) in Shuffle T7 express. This spectrum shows no evidence of folded hSTRAP(1-150) protein in this cell line in 20 mM Sodium phosphate buffer, 150 mM Sodium Chloride, 50 mM Arginine and Glutamic acid at pH 6.7. Protein concentration used for this NMR experiment was 0.2 mM.

From all the biophysical studies carried out with hSTRAP(1-150) it was shown that hSTRAP(1-150) is 71% alpha helical (Fig.3.51A) and thermally unstable, as it completely unfolds at 70°C (Fig.3.51B) and does not refold when the temperature is lowered again (Fig.3.51C). Furthermore, hSTRAP(1-150) exists as a conformationally heterogeneous mixture of isoforms and does not possess a unique tertiary fold (Fig.3.56). Possible reasons for the structural instability of this construct include perturbation of its native structure due to the chosen truncation.

3.3.5 Biophysical and Structural studies carried out on hSTRAP(151-284)

3.3.5.1 Circular Dichroism on hSTRAP(151-284)

Protein hSTRAP(151-284) was dialyzed into CD buffer (20 mM Sodium Phosphate buffer, pH 6.5) and analyzed by SDS PAGE to confirm its purity before proceeding to Circular Dichroism experiments. Pure hSTRAP(151-284) was present in the dialysed sample (Fig.3.60A, Lane 3) and the concentration of hSTRAP(151-284) protein used for these CD experiments was 15.9 µM.

Figure 3.60. Dialysed hSTRAP(151-284) protein in CD buffer.
This 15% SDS PAGE gel shows that pure hSTRAP(151-284) (indicated by an arrow) was present in the dialyzed sample to be used for subsequent CD experiments. Fifteen microlitres of sample was loaded in each lane and the gel was stained with Instant Blue.

A scan taken at 4°C to determine the folding state of hSTRAP(151-284) showed that hSTRAP(151-284) was folded and composed of various secondary structure elements (Fig.3.61A). Molar ellipticity values for this construct were uploaded to the program Dichroweb to estimate the percentage of each secondary structure element in hSTRAP(151-284), and this showed that hSTRAP(151-284) is 42% α-helical, 16.5% β-structure, 21.5% turn and 20% disordered. The variable temperature experiment was then carried out, and this showed that hSTRAP(151-284) protein does not display a clear cooperative unfolding transition across temperatures 4-80°C at 220 nm (Fig.3.61B). Furthermore, the scan taken at 4°C after the variable temperature experiment had completed shows that hSTRAP(151-284) had re-folded (Fig.3.61C). The scans taken at 4°C before and after the variable temperature experiment can be superimposed on top of each other, suggestive of a reversibly folded protein (Fig.3.61D). These experiments were repeated 3 times and the same result was obtained as shown in Figure 3.61. The CD results therefore suggest that the protein is folded on a secondary structure level; however, the absence of a distinct unfolding transition suggests that this protein construct exists in a molten globule state.

Figure 3.61. CD experiments carried out on hSTRAP(151-284). A: CD spectrum recorded at 4°C, before the variable temperature experiment was carried out. B: CD spectrum carried out at variable temperatures ranging from 4 to 80°C taken at a fixed wavelength of 220 nm. C: CD spectrum recorded at 4°C after the variable temperature experiment was carried out. D: Superimposition of A (Highlighted in blue) and C spectra (Highlighted in green).
One representative experiment is shown out of three repeats. Abbreviation: MRME, Mean residue molar ellipticity.

3.3.5.2 NMR studies of hSTRAP(151-284)

HSTRAP(151-284) protein was dialyzed into the optimized NMR buffer previously identified for hSTRAP(1-150): 20 mM Sodium phosphate buffer, 150 mM Sodium Chloride, 50 mM Arginine and Glutamic acid, 10 mM DTT at pH 6.5. HSTRAP(151-284) was then concentrated to 0.15 mM and a 1D 1H NMR spectrum of this unlabelled hSTRAP(151-284) protein sample was recorded. This spectrum showed no evidence of folded hSTRAP(151-284) protein (Fig.3.62); the characteristic dispersed protein signals were not visible in the spectrum and the apparent protein concentration was too low according to the NMR spectrum (Fig.3.62), suggestive of aggregated protein. As a consequence it was decided not to proceed with preparing 15N labelled hSTRAP(151-284) for further structural experiments on hSTRAP(151-284).

Figure 3.62. 1D 1H NMR spectrum of hSTRAP(151-284). This spectrum was recorded on a 0.15 mM hSTRAP(151-284) protein sample at 30°C in optimised buffer conditions: 20 mM Sodium phosphate buffer, 150 mM Sodium Chloride, 50 mM Arginine and Glutamic acid, and 10 mM DTT at pH 6.5. This shows that hSTRAP(151-284) is unfolded.

Structural studies carried out with hSTRAP(151-284) showed that this protein was folded at the secondary structure level and composed of various secondary structure elements (Fig.3.61) but may exist in a molten globule state. CD detects the presence of secondary structure, and this does not necessarily translate to the protein being folded at a tertiary structure level, as shown by the 1D 1H NMR spectrum (Fig.3.62), which confirms that hSTRAP(151-284) does not appear to have a unique 3D fold. It is possible that hSTRAP(151-284) exists in a molten globule state due to perturbation of its native structure by truncation.
3.3.6 Biophysical and structural studies carried out on hSTRAP(285-440)

3.3.6.1 Circular Dichroism on hSTRAP(285-440)

HSTRAP(285-440) was dialyzed into CD buffer (20 mM Sodium Phosphate buffer, pH 6.5) and analyzed by SDS PAGE gel to confirm its purity (Fig.3.63). Pure hSTRAP(285-440) was present in the dialyzed sample (Fig.3.63). The concentration of hSTRAP(285-440) protein used for the CD experiment was 9.8 µM.

Figure 3.63. Dialysed hSTRAP(285-440) protein in CD buffer to be used for subsequent CD experiments. This 15% SDS PAGE gel shows that pure hSTRAP(285-440) (indicated by an arrow) was present in the dialysed sample to be used for subsequent CD experiments. The lane was loaded with 15 µl of sample and the gel was stained with Instant Blue.

A scan taken at 4°C to determine the folding state of hSTRAP(285-440) showed that this construct includes various secondary structures and is folded at this level (Fig.3.64A). Molar ellipticity values for this construct were uploaded to the program Dichroweb to determine the percentage of each secondary structure element present in hSTRAP(285-440), and this showed that hSTRAP(285-440) is 15.7% α-helical, 29.7% β-structure, 24.6% turn and 30% disordered. The variable temperature experiment was then carried out, and this showed that hSTRAP(285-440) protein does not show a clear co-operative unfolding transition (Fig.3.64B), possibly due to the protein existing in a molten globule state. Furthermore, the scan taken at 4°C after the variable temperature experiment had completed showed that hSTRAP(285-440) refolded reversibly (Fig.3.64C). Figure 3.64D shows that the scans taken at 4°C before and after the variable temperature experiment superimpose well on top of each other. This confirmed that hSTRAP(285-440) reversibly refolds after the temperature is decreased. The experiment was repeated 3 times, with the same results and conclusions.
At the time when these studies were being carried out, part of the structure of hSTRAP, amino acid residues 262-422, was solved by another group [PDB code 2XVS], and for this reason further structural studies on hSTRAP(285-440) were not performed.

Figure 3.64. CD experiments carried out on hSTRAP(285-440). A: CD spectrum recorded at 4°C, before the variable temperature experiment was carried out. B: CD spectrum carried out at variable temperatures ranging from 4 to 80°C taken at a fixed wavelength of 220 nm. C: CD spectrum recorded at 4°C, after the variable temperature experiment was carried out. D: Superimposition of A (Highlighted in blue) and C spectra (Highlighted in green). One representative experiment is shown out of three repeats. Abbreviation: MRME, Mean residue molar ellipticity.

4. Chapter four. General Discussion

4.1 Comparisons of hSTRAP and mSTRAP structural data

The structure of full length mouse mSTRAP (PDB code 4ABN) and a part of the C terminus of human hSTRAP (residues 262-422) (PDB code 2XVS) have been solved by X-ray crystallography and published only recently [153], when our experimental studies were largely completed and the Thesis was being written up. The experimental structures of mSTRAP (PDB code 4ABN) and the hSTRAP fragment (PDB code 2XVS) therefore now allow a comparison of the location of secondary structure elements within this protein with the predicted locations which were used as a basis for the current study. The secondary structure predictions carried out in this study indicated an alpha helical content predictive of another TPR motif present within hSTRAP amino acids 1-68 (Fig.3.10). This was confirmed by the solved mSTRAP structure (PDB code 4ABN), as there is one TPR motif present between amino acids 7-61 [153 and summarized in Table 4.1], which was not predicted from the initial STRAP sequence analysis [136].
According to both secondary structure prediction tools used in this study, hSTRAP is around 70% α-helical, which would correlate with the presence of six TPR motifs and correlates with the experimental mSTRAP structure [153]. Indeed mSTRAP does contain 6 TPR motifs but their location is slightly different for two of the TPR motifs predicted. TPR1, 2, 3, 4, 5 and 6 of mSTRAP are located at positions 7-61, 68-98, 103-130, 136-174, 179-216 and 224-253 respectively [153, Table 4.1], compared to the initially predicted positions of hSTRAP TPR1, 2, 3, 4, 5 and 6 at 69-102, 103-136, 179-212, 224-257, 332-365 and 373-406 respectively [136]. This shows that there are TPR motifs at 7-61 and 136-174 that were not predicted by the initial STRAP sequence analysis. The hSTRAP region extending from 262-422 is 27% helical, 9% turn, 8% bend, 36% extended chain (β strand) and 20% chain, and an OB fold is present in this region rather than the two TPR motifs predicted initially from the STRAP amino acid sequence (332-365 and 373-406, [136] PDB code 2XVS). This suggests that the initial predictions of the location of TPR motifs using bioinformatic methods were not reliable and were in fact misleading in this part of the protein. This once again emphasizes the importance of experimental determination of the 3D structure of proteins, rather than relying only on bioinformatic-based predictions. The positions of four of the TPR motifs towards the middle of the protein were predicted correctly. The experimental TPR positional data mentioned above refer to the mouse homologue of STRAP whereas this thesis is on the human homologue; however, the two structures are expected to be very similar. Sequence alignment of mSTRAP with hSTRAP shows that the two STRAP homologues are highly conserved, as a difference of less than 10% is observed in amino acid sequence between the two (Fig.4.1).
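The degree of conservation between two pre-aligned sequences can be expressed as a percentage of identical positions. The sketch below illustrates the calculation; the function name is hypothetical, it assumes a gap-free alignment of equal-length sequences, and for brevity it uses only residues 1-70 of each sequence as transcribed from Figure 4.1, so the figure it prints applies to that block alone and not to the full 440-residue alignment.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical residues in two pre-aligned sequences.

    Assumes a gap-free alignment, so both sequences must have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned and of equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Residues 1-70 of mouse and human STRAP, transcribed from Figure 4.1.
mouse_1_70 = "MMADEEEEAKHVLQKLQGLVDRLYCFRDSYFETHSVEDAGRKQQDVQEEMEKTLQQMEEVLGSAQVEAQA"
human_1_70 = "MMADEEEEVKPILQKLQELVDQLYSFRDCYFETHSVEDAGRKQQDVRKEMEKTLQQMEEVVGSVQGKAQV"

identity = percent_identity(mouse_1_70, human_1_70)
print(f"Residues 1-70: {identity:.1f}% identical, {100 - identity:.1f}% different")
```

Applying the same function to the complete sequences would give the whole-protein value; individual blocks can be more or less divergent than the overall figure.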
To obtain the structure of human STRAP, we performed homology modeling using the published structure of mSTRAP (PDB code 4ABN) [153] as a template and the hSTRAP amino acid sequence as a target, using the SwissPDB viewer software (http://spdbv.vital-it.ch/). Figure 4.2 shows that there are minor differences at the C terminus of STRAP but suggests that mouse and human STRAP have similar conformations and, due to their high sequence similarity, are expected to have similar functional properties.

Table 4.1. Experimental TPR positions of STRAP TPR motifs

TPR   TPR Helix A   TPR Helix B
1     7-28          42-61
2     68-78         86-98
3     103-116       119-130
4     136-146       154-174
5     179-195       200-216
6     224-236       240-253

This table shows STRAP TPR motif positions from the solved structure; columns 2 and 3 indicate STRAP amino acid positions.

Mouse  MMADEEEEAKHVLQKLQGLVDRLYCFRDSYFETHSVEDAGRKQQDVQEEMEKTLQQMEEVLGSAQVEAQA  70
Human  MMADEEEEVKPILQKLQELVDQLYSFRDCYFETHSVEDAGRKQQDVRKEMEKTLQQMEEVVGSVQGKAQV  70

Mouse  LMLKGKALNVTPDYSPEAEVLLSKAVKLEPELVEAWNQLGEVYWKKGDVTSAHTCFSGALTHCKNKVSLQ  140
Human  LMLTGKALNVTPDYSPKAEELLSKAVKLEPELVEAWNQLGEVYWKKGDVAAAHTCFSGALTHCRNKVSLQ  140

Mouse  NLSMVLRQLQTDSGDEHSRHVMDSVRQAKLAVQMDVLDGRSWYILGNAYLSLYFNTGQNPKISQQALSAY  210
Human  NLSMVLRQLRTDTEDEHSHHVMDSVRQAKLAVQMDVHDGRSWYILGNSYLSLYFSTGQNPKISQQALSAY  210

Mouse  AQAEKVDRKASSNPDLHLNRATLHKYEESYGEALEGFSQAAALDPAWPEPQQREQQLLEFLSRLTSLLES  280
Human  AQAEKVDRKASSNPDLHLNRATLHKYEESYGEALEGFSRAAALDPAWPEPRQREQQLLEFLDRLTSLLES  280

Mouse  KGKTKPKKLQSMLGSLRPAHLGPCGDGRYQSASGQKMTLELKPLSTLQPGVNSGTVVLGKVVFSLTTEEK  350
Human  KGKVKTKKLQSMLGSLRPAHLGPCSDGHYQSASGQKVTLELKPLSTLQPGVNSGAVILGKVVFSLTTEEK  350

Mouse  VPFTFGLVDSDGPCYAVMVYNVVQSWGVLIGDSVAIPEPNLRHHQIRHKGKDYSFSSVRVETPLLLVVNG  420
Human  VPFTFGLVDSDGPCYAVMVYNIVQSWGVLIGDSVAIPEPNLRLHRIQHKGKDYSFSSVRVETPLLLVVNG  420

Mouse  KPQNSSSQASATVASRPQCE  440
Human  KPQGSSSQAVATVASRPQCE  440

Figure 4.1. Sequence alignments of Mouse and Human STRAP. The amino acids highlighted in yellow are conserved residues between mouse (mSTRAP) and human (hSTRAP).
Amino acids highlighted in cyan are the amino acids that are not conserved between the two sequences.

Figure 4.2. Homology modelling of hSTRAP and mSTRAP structure. Homology modeling performed using SwissPDB viewer. A: Structure highlighted in red is mSTRAP (from the structure published by [153], PDB code 4ABN) and brown is hSTRAP (amino acid sequence uploaded into the program). B: hSTRAP structure with the N terminus and C terminus highlighted in pink and yellow respectively.

Figure 4.3 illustrates all the hSTRAP protein variants cloned in this thesis, shown on the hSTRAP structure (obtained from homology modeling of the solved mSTRAP structure, See Fig.4.2). As indicated in this figure, and as shown by Table 4.1, the boundaries at residues 219-220 and 284-285 do not cut through any secondary structure elements. However, amino acids 150 and 151 lie between Helix A and B of TPR4 (Table 4.1 and Fig.4.3B), which could cause structural instability of the constructs obtained and, as a consequence, unfolded proteins.

Figure 4.3. Ribbon representation of the different regions of hSTRAP cloned and expressed separately in the current study, mapped on the model structure of full-length STRAP protein. A: The regions highlighted in red and blue are hSTRAP(1-219) and hSTRAP(220-440), respectively; B: The structural regions highlighted in cyan, green, and pink are hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440), respectively.

All the hSTRAP protein variants cloned in this study were successfully expressed and purified as shown in Results section 3.1.

4.2 Structural characterization of hSTRAP protein fragments

4.2.1 CD characterization of all hSTRAP protein variants

CD experiments showed that His-hSTRAP(1-440), hSTRAP(1-219), hSTRAP(151-284) and hSTRAP(285-440) are folded on the secondary structure level, and composed of various secondary structure elements as shown by their respective CD spectra (Fig.3.29A, 3.43A, 3.61A, 3.64A respectively).
HSTRAP(1-150) is also folded, and is predominantly an alpha helical protein (Fig.3.51) compared to the other His-hSTRAP protein variants; however, hSTRAP(1-150) is not thermally stable and does not exhibit reversible folding after heating (Fig.3.51 and Table 4.2), unlike the other His-hSTRAP protein variants (Fig.3.29A, 3.43A, 3.61A, 3.64A). CD experiments on GST-hSTRAP(1-440) were inconclusive due to the presence and background signal of the large GST tag, and as a consequence its secondary structure composition, folding and thermal stability could not be determined (Fig.3.40). This correlates well with the published literature on studying proteins by CD, as it is generally found that a large tag, such as the 26 kDa GST tag in this case, will contribute largely to the signals obtained from the fusion protein [25]. Further CD analysis and correlation with the published STRAP structure [153] indicated that the smaller the protein the more reliable the CD data (Table 4.2), as the CD experimental data for the shorter truncated constructs hSTRAP(1-150), hSTRAP(151-284) and hSTRAP(285-440) correlate well with the experimental α-helical content [4ABN, 153]. Table 4.2 shows that the α-helical content of the N terminus of hSTRAP from CD analysis is higher than would be expected if it contained only the TPR motifs predicted from the STRAP sequence analysis initially performed when STRAP was discovered [136]. Furthermore, CD analysis suggests that the C terminus has a lower percentage helical content than expected if it were to contain the two TPR motifs between residues 332-365 and 373-406, initially predicted by sequence analysis when STRAP was discovered [136]. This would all correlate with the recently published STRAP structure, as mSTRAP (PDB code 4ABN) was found to contain TPR motifs at 7-61 and 136-174 that were not predicted by the initial STRAP sequence analysis.
Furthermore, from hSTRAP structural data, region 262-422 [2XVS] was shown to be 27% helical, 9% turn, 8% bend, 36% extended chain (β strand) and 20% the rest (chain), and to contain an OB fold rather than a TPR motif [153].

Table 4.2. CD data

Protein              (%) α-helical   (%) α-helical from STRAP   Actual α-helical   (%) β   (%) Turn   (%) disordered
                                     sequence analysis          content (%)
His-hSTRAP(1-440)    11.0            46.4                       51.6               25.0    31.2       32.8
hSTRAP(1-219)        67.0            46.6                       90.0               8.6     11.6       12.8
hSTRAP(1-150)*       71.0            45.3                       80.0               0.0     0.0        29.0
hSTRAP(151-284)      42.0            51.1                       51.1               16.5    21.5       20.0
hSTRAP(285-440)      15.7            43.9                       27.0               29.7    24.6       30.0

Estimated percentage of each secondary structure element determined by the program Dichroweb from CD experiments carried out in this thesis (columns 2, 5, 6 and 7). The third column shows the % α-helical content expected if each protein were to contain the number of TPR motifs initially predicted from STRAP amino acid sequence analysis [136]. The fourth column shows the experimental α-helical content as shown by recently published data on the STRAP structure [153]. Asterisk (*) indicates the protein construct that was found to be thermally unstable by CD.

4.2.2 Crystallographic studies on hSTRAP(1-440)

Approximately 500 conditions were screened but no crystal of high enough quality was obtained (Fig.3.34-37 and Fig.3.39), hence the structure of His-hSTRAP(1-440) could not be solved. Priority was therefore given to solving the structures of shorter fragments of hSTRAP, which could be more suitable for solution NMR studies, which do not require protein crystals. The crystals used to solve the structure of full length mSTRAP and a part of the C terminus of hSTRAP [153] were obtained in different conditions that were not tested in the trials described above, which could explain why a crystal was not obtained. Furthermore, this could be due to the presence of flexible regions in full-length hSTRAP or its conformational plasticity.
Another reason why full length mSTRAP crystallised and hSTRAP did not, considering their high degree of homology (Fig.4.1 and 4.2), could be that full length hSTRAP is more unstable and requires a certain post-translational modification, not required by the mouse homologue, to form a stable correct fold. It could be that hSTRAP in complex with a ligand is more stable and would crystallise as a complex, and so this avenue should be tested in the future. It is quite a common approach to try to crystallise a homologue of a protein that is proving difficult to crystallise, which may favour crystal contact formation [154]. Crystallography trials using GST-hSTRAP(1-440) could not be carried out, because in order to do so the tag had to be cleaved, as the tag is too big and there is a flexible linker between the GST tag and the hSTRAP sequence (Fig.3.5). These characteristics are not favourable for crystal growth, and GST-fusion proteins are generally hard to crystallise [155]. On-column GST cleavage experiments were carried out but were unsuccessful, as it seems that after cleavage both the tag and the hSTRAP protein precipitate on the column and, as a consequence, cannot be eluted off the GST tag affinity resin even with 200 mM reduced glutathione (Fig.3.41). The same was observed with off-column GST cleavage; therefore crystallography trials could not be carried out with GST-hSTRAP(1-440), as the tag cannot be removed.

4.2.3 NMR studies on hSTRAP(1-219)

A 1D 1H NMR spectrum of the hSTRAP(1-219) protein sample showed no evidence of well folded protein, as no shifted methyl resonances were observed (Fig.3.46). The CD spectra of hSTRAP(1-219) indicated the presence of secondary structure (Fig.3.43), but CD also showed that 12.8% of the structure is disordered. Furthermore, no co-operative unfolding was observed as the temperature was raised, which suggests that the protein exists in a molten globule state.
The latter would correlate with the NMR data and the published characteristics of molten globule state proteins [156]. Although the truncation according to published data does not cut through any helical structure [153], removal of parts of the protein is likely to destabilize the 3D structure, leading to a molten globule state. The construct may therefore appear folded at a secondary structure level (in the CD spectrum) but not folded in a unique way in 3D (in the NMR spectra). This would also explain the difficulties with expressing and purifying this protein fragment, its proteolytic instability and its tendency to aggregate and precipitate.

4.2.4 NMR studies on hSTRAP(1-150)

CD experiments carried out with hSTRAP(1-150) showed that hSTRAP(1-150) is 71% α-helical, thermally unstable and does not exhibit reversible folding (Fig.3.51C). Also, there is no clear two-state unfolding transition, which is suggestive of a molten globule state protein [156] and would correlate with the NMR findings that hSTRAP(1-150) exists as a conformationally heterogeneous mixture of isoforms and does not show a unique fold (Fig.3.56). The structure of hSTRAP(1-150) could not be solved as this part of the protein was found to be intrinsically unstable, and in order to fold it may require the presence of an interacting partner, or the rest of the protein which was removed in this construct.

4.2.5 NMR studies on hSTRAP(151-284)

According to CD experiments, hSTRAP(151-284) protein is folded at the secondary structure level but does not show a clear co-operative unfolding transition (Fig.3.61). Furthermore, the 1D 1H NMR spectrum (Fig.3.62) of hSTRAP(151-284) has shown that this protein does not have a unique 3D fold, all of which suggests that the protein may exist in a molten globule state [156].
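The position of each truncation boundary relative to the TPR helices listed in Table 4.1 can be checked with a short script. The sketch below is illustrative only: the function name and the wording of the returned labels are hypothetical, and it assumes that the mouse residue numbering of Table 4.1 applies unchanged to the human constructs, which is consistent with the gap-free alignment in Figure 4.1.

```python
# Helix A and helix B residue ranges for each TPR motif, from Table 4.1.
TPR_HELICES = {
    1: ((7, 28), (42, 61)),
    2: ((68, 78), (86, 98)),
    3: ((103, 116), (119, 130)),
    4: ((136, 146), (154, 174)),
    5: ((179, 195), (200, 216)),
    6: ((224, 236), (240, 253)),
}


def classify_cut(last_residue):
    """Classify a truncation made between last_residue and last_residue + 1."""
    next_residue = last_residue + 1
    for tpr, (helix_a, helix_b) in TPR_HELICES.items():
        # The cut lies inside a helix if both flanking residues belong to it.
        for name, (start, end) in (("A", helix_a), ("B", helix_b)):
            if start <= last_residue and next_residue <= end:
                return f"within helix {name} of TPR{tpr}"
        # The cut separates the two helices of one motif if it falls in their loop.
        if helix_a[1] <= last_residue and next_residue <= helix_b[0]:
            return f"between helix A and B of TPR{tpr}"
    return "outside TPR motifs (inter-motif or flanking region)"


# Construct boundaries used in this study: 150/151, 219/220 and 284/285.
for boundary in (150, 219, 284):
    print(f"{boundary}/{boundary + 1}: {classify_cut(boundary)}")
```

With the Table 4.1 positions, only the 150/151 boundary separates the two helices of a single motif, in agreement with the observation about TPR4 made in this chapter.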
In the structure of full length mSTRAP, amino acid 151 is between helix A and B of TPR4 [153, Table 4.1], which could explain the instability of this variant and the formation of a molten globule conformation, as the presence of Helix A could be critical for its stability and correct folding.

4.3 Difficulties with expression of hSTRAP protein variants using the E. coli expression system, and possible ways to overcome these in future.

The E. coli expression system was used to produce all hSTRAP protein variants, as this is the most common and preferred expression system used in research laboratories due to the advantages associated with the bacterial system (Table 4.3) [157-160]. However, this expression system cannot maintain various post-translational modifications such as correct disulphide bond formation, which could be required for the correct fold of the protein. Correct disulphide bonds may not form within the reducing environment of the E. coli cytoplasm, and in this case it may be necessary to use another expression system to obtain correctly folded recombinant protein [157-160]. Quite often, over-expressed recombinant proteins do not fold correctly and undergo proteolytic degradation or form aggregates and consequently inclusion bodies [161]. This seems to be the case for the hSTRAP protein variants cloned in this study. This mis-folding occurs because in the cytoplasm of E. coli transcription and translation are tightly coupled and occur at a fast rate, with a protein chain forming every 35 seconds, and a macromolecule concentration of 300-400 mg/ml can be reached [161]. This makes protein folding a challenging task, and it is generally known that small, single-motif recombinant proteins can form their native conformation within this relatively short time period and dynamic environment [161].
This is however more challenging for multi-domain and over-expressed recombinant proteins, like hSTRAP, that may require molecular chaperones to assist their folding [161]. Failure of the recombinant protein to form its native conformation rapidly results in either formation of inclusion bodies or degradation [161]. The probability of protein mis-folding is increased by the use of strong promoters and high inducer concentrations, which ultimately lead to a protein yield of over 50% of the total quantity of cellular protein, so that the rate of protein aggregation exceeds that of correct protein folding [161]. Another commonly used expression system is yeast, which is a eukaryotic microorganism and therefore more advanced and more similar to human in terms of genetics than E. coli, and yet the cells remain easy to manipulate compared to the mammalian expression system [157-160]. The baculovirus (insect cell) expression system is the most extensively used system as it can produce large amounts of protein [157-160]. The expression level of a recombinant protein is higher in insect cells than in mammalian cells, and most of the post-translational modifications are maintained in this system, which is an advantage over bacterial expression systems [157-160]. The likelihood of mis-folding is high, especially for polypeptides already destabilized by truncation. Therefore a eukaryotic system should be used to express these hSTRAP protein variants, but this was not an option at the time. Furthermore, the likelihood of structural instability for hSTRAP(1-150) and hSTRAP(151-284) would be high even if expressed in this system, due to the truncation between Helix A and B of the fourth TPR motif [153, Table 4.1]. However, new hSTRAP protein variants could be cloned in which the domain boundaries for truncation constructs would be chosen according to the experimental structure of mSTRAP, which became available recently [153].

Table 4.3.
The advantages and disadvantages associated with the bacterial and eukaryotic expression systems.

Bacterial expression system
Low cost
Ability to produce large amounts of protein
High growth rates, hence smaller time frame needed to express and purify protein
Does not maintain all post translational modifications
Easily transformed with low amounts of foreign DNA

Eukaryotic expression system
Has an improved protein folding mechanism to recognize eukaryotic protein
Found to obtain soluble form of human protein
Does maintain most (yeast and insect) if not all post translational modifications (mammalian)
Likelihood of protein degradation reduced
More costly
Longer growth rates
More complex media

4.4 The hSTRAP interactome

The general workflow for identifying interacting partners by mass spectrometry is to isolate the protein complex by pull down or immuno-precipitation experiments, followed by SDS PAGE for protein separation (size fractionation) [162]. The gel is then cut into a number of pieces (regions identified by the user) and prepared for mass spectrometry identification [162]. The disadvantage of this approach is that gel fragments obtained in this way contain a very narrow distribution of molecular sizes, and many potential interacting partners of significantly different size would go undetected. Therefore in this study one gel fragment containing the whole mixture of proteins was used, in an attempt to exploit the high discriminating power of mass spectrometry protein identification and maximise the potential number of hits (Fig.3.26). As mass spectrometry is a highly sensitive method and detects sub-picomolar amounts of protein [163], this methodology was expected to still identify all STRAP interacting proteins within that gel slice. For these types of experiments the tags generally used are GFP [163] and FLAG [163], to reduce the number of contaminants obtained, although GST [164] and His tagged fusions are also used [165].
False positives can be reduced through the implementation of highly stringent purification methods, which also come at the cost of removing low-abundance and low-affinity interacting proteins [162]. In this thesis the number of false positives has been reduced, as tag-only controls as well as truncated variants of hSTRAP were used as bait for the pull downs. Another method that can be used to reduce the number of false positives is isotope labelling, which can distinguish specific from non-specific interactors, although not all specific interactors can be identified when the signal to noise ratio is similar, due to the background level provided by the contaminants [162]. Another general disadvantage of these types of experiments is that the concentration of bait exceeds "normal" cellular levels of the protein [163], and an in-vitro system is being used. Both suggest that certain interactions detected via these experiments may not actually occur under normal conditions, because that concentration of protein may not "normally" exist in the cell and because the two proteins may not be co-localized in the cell in-vivo. This problem can be addressed by expressing the bait in a stable line or by using antibodies to detect native endogenous bait and its interacting partners [163]. The tags used for purification may also interfere with protein function and interactions [163], and that could have been the case in this thesis, as certain proteins have been detected in the pull downs with His tagged hSTRAP variants but not GST tagged hSTRAP. To overcome this issue in future experiments, the tag should be applied to both the N and C terminus of the bait, and the differences in interaction profiles between the two pull downs should then be analyzed [163]. In this investigation it was found that hSTRAP interacts with 25 proteins (Table 3.6), which included 20 human and 5 E. coli proteins.
The peptide data list was searched against the full database, which includes sequences from both E. coli and human proteins. Our primary aim was to determine human hSTRAP interacting partners, hence human breast cancer cellular extract (MCF7) was used for these pull downs; however, it cannot be excluded that some E. coli proteins might have co-purified with the hSTRAP constructs (Fig.3.24). This co-purification would however likely depend on the type of protein purification procedure performed; hence with different purification tags a different set of co-purified contaminants may be expected. Indeed, five E. coli proteins were detected, and these were in the pull downs with His tagged proteins and not GST tagged proteins (Table 3.6). This suggests that this contamination is specific to His tagged proteins and shows that using His tagged proteins for these types of pull downs is not ideal. However, in this case, we tried to mitigate the effects of such artefacts by using multiple repeats of experiments, as well as by using different truncation constructs of hSTRAP and full-length GST tagged hSTRAP, and by analysing the data together, looking for common interaction patterns for constructs of the same type (i.e., containing the same regions). This reduces the number of false positives that can be identified and increases the confidence in the data. An interacting network was created based on the evidence from our experiments and also input from programs such as GeneMania and String, which also take into account published data. This interacting network (Fig.3.28) includes interactions identified as part of the work in this thesis (highlighted in red), those identified by GeneMania and String (highlighted in blue), and interactions that have been predicted or were revealed after text mining by either or both programs (highlighted in pink and grey).
The text mining data, which have been included in the network for information, were extracted from large datasets by these two programs [166] and have not been confirmed by experiments; therefore, no firm conclusions can be drawn from them. The possible functional implications of this interacting network are discussed in more detail below, but it should be noted that further experimental work is necessary to confirm these interactions. The UNIPROT ID of each of the 20 human proteins was also submitted to the DAVID bioinformatics software to define its functional role. The results of this analysis indicated that hSTRAP interacting proteins are implicated in diverse pathways (Table 3.7), including the regulation of the actin cytoskeleton, translation, oxidative phosphorylation, various metabolic pathways, non-homologous end joining, glycolysis and gluconeogenesis, fatty acid biosynthesis and the stress response pathway. This suggests that hSTRAP could potentially be involved in diverse regulatory roles, as originally hypothesized on the basis of the six TPR motifs in its protein sequence [136, 153]. It has recently been shown that hSTRAP contains an OB fold at its C terminus [153, PDB code 2XVS], and it is known that OB fold containing proteins are critical for various DNA related functions such as DNA replication, repair, transcription and translation [167]. In this thesis an interaction of hSTRAP with the DNA-dependent protein kinase catalytic subunit, which is implicated in DNA repair, has been suggested (Table 3.7). Furthermore, interactions with Eukaryotic initiation factor 4A-I, Elongation factor 1-alpha 1 and Tu translation elongation factor, all implicated in translation, have also been suggested (Table 3.7). These interactions are in accord with published observations indicating that OB fold containing proteins [167] and STRAP are implicated in the DNA damage response pathway [144-145, 147-148].
Since STRAP forms a complex with the protein JMY [136], its implication in the regulation of the actin cytoskeleton seems plausible, as JMY has been shown to be associated with the cytoskeleton [168-171]. Biochemical experiments have shown that JMY induces actin nucleation by activating the Arp2/3 complex [168-171]. In particular, the WH2 domain of JMY binds to monomeric actin and its central acidic region activates the Arp2/3 complex, consequently causing actin polymerisation [168-171]. JMY also activates actin polymerisation in an Arp2/3 independent mode, like Spire, whereby four actin molecules are aligned in tandem to form the pointed end, to which free actin monomers subsequently bind to form a nascent filament [168-171]. In this study actin, but none of the components of the Arp2/3 protein complex, was identified as an hSTRAP interacting partner, implying that JMY mediated actin polymerisation could potentially involve hSTRAP. JMY localises with actin in the cytoplasm of human neutrophils rather than in the nucleus, and upon DNA damage JMY translocates to the nucleus [168-171]. This implies that JMY is mainly involved in cell motility under normal conditions, whereas under DNA damage conditions it translocates to the nucleus, thereby facilitating the p53 mediated stress response, and concomitantly its effects on cell motility are reduced [168-171]. The same could be true for STRAP: under normal conditions STRAP may be present in the cytoplasm, where it acts on the cytoskeleton and cell migration as suggested by the data in this thesis, and upon DNA damage STRAP translocates to the nucleus, where it facilitates the p53 and DNA damage response upon phosphorylation on serine 203 [144-145, 147-149]. The potential mechanism of hSTRAP function is shown in Fig.4.4. Published reports have shown that STRAP interacts with JMY through the STRAP 1-205 region [136].
Results shown in this thesis indicate that hSTRAP 1-150 interacts with actin, allowing the hypothesis that STRAP mediates its effects on the regulation of the actin cytoskeleton through this region, possibly acting as a scaffolding protein either directly or indirectly by interacting with JMY [136]. This hypothesis, however, needs to be tested further by direct experiments in-vitro (to detect whether a direct interaction between STRAP and actin takes place) and in-vivo (to detect whether these two proteins co-localize in the cell). In addition, hSTRAP was shown to interact with other actin regulatory molecules, providing further support for the notion that hSTRAP could be a central component in the regulation of the actin cytoskeleton, acting as a scaffold or adaptor protein that clusters multiple actin regulatory proteins to initiate the desired actin response. Furthermore, STRAP might play a role in the nuclear translocation of JMY under DNA damage conditions [144-145, 147], thus linking the DNA damage response with cell motility (Fig.4.4). This hypothesis also needs to be tested with more detailed studies.
Figure 4.4. Proposed hSTRAP mechanism of function. Under normal conditions, STRAP remains in the cytoplasm, regulating the function of the actin cytoskeleton in complex with JMY. Upon DNA damage, STRAP/JMY translocates to the nucleus and binds to p300, thus regulating transcription and the DNA damage response.
Recent research has implicated JMY in spindle migration, asymmetric division and cytokinesis during mouse oocyte maturation [172]. During this process, JMY was found to localize to the spindle microtubules, as shown by its overlap with alpha tubulin, and also in the cytoplasm [172]. In this thesis hSTRAP 1-150 was found to associate with tubulin, allowing the hypothesis that the JMY functions reported above could be mediated through STRAP, signifying a potential role of this protein in microtubule organization [136].
STRAP is phosphorylated in an ATM dependent manner on serine 203, which causes its nuclear localisation and implicates this protein in the DNA damage response pathway [144-145, 147-149]. In accord with these reports, this thesis provides evidence supporting the notion that hSTRAP is a stress and DNA damage responsive protein, as it was found to interact with the DNA-dependent protein kinase catalytic subunit (DNA-PKcs), HSP90 and HSP70. In earlier reports STRAP has been shown to interact with p300 and JMY [136]; these were not detected in the biochemical binding assays carried out in this thesis, possibly because of differences in the experimental systems used and because the cells utilised in the present study were not stressed [143-145, 147-149]. Inhibition of Hsp90 delays cell migration by decreasing the interaction of this protein with actin monomers, thereby reducing actin polymerization in breast cancer cells [173]. Since in this thesis interactions of hSTRAP 1-150 with actin and of hSTRAP 285-440 with Hsp90 were suggested, an attractive hypothesis is that the interaction between actin and Hsp90 is mediated through STRAP. If that is the case, hSTRAP could be acting as an adaptor molecule coupling the stress response pathway to the actin cytoskeleton. Furthermore, STRAP could potentially connect the stress response, DNA damage and the actin cytoskeleton in cancer, as the hSTRAP 1-150 region was shown to interact not only with actin but also with the DNA-dependent protein kinase catalytic subunit. Taken together, the results presented in this thesis support the hypothesis that different regions of hSTRAP cluster with many proteins, implicating this protein in diverse signalling pathways. The role of STRAP as a potential scaffolding protein would correlate with published data on TPR proteins [33] and with the solved STRAP structure, as the OB fold exhibits an extended super-helical scaffold structure, which can mediate protein-protein and protein-DNA interactions [153].
This would correlate with the data provided in this thesis showing that hSTRAP(285-440), which includes the OB fold, does interact with various proteins and mediate protein-protein interactions (Table 3.6). A point to note is that not all the interaction profiles are consistent, as some proteins, such as Myosin 9 and Phosphoglycerate kinase, were detected in the pull downs with His tagged STRAP variants, including full-length hSTRAP, but not with GST tagged full-length hSTRAP. This is unexpected, because if the proteins in question were hSTRAP interacting proteins they should also be detected in the pull downs with GST tagged full-length hSTRAP. This could be due to differences in the folding states of the two differently tagged proteins and/or to the site of interaction being occluded by the tag itself. The same can apply to proteins that were detected in the pull downs with GST tagged hSTRAP protein and His tagged truncated hSTRAP protein variants but not with full-length hSTRAP, for example filamin A and filamin B.
Another point to consider is the computational algorithms used to identify these proteins: Mascot and Scaffold are probability based protein identification programs [174-175] and hence have both benefits and limitations. Mascot first compares the experimental data with calculated peptide mass/fragment ion mass values obtained by applying the appropriate cleavage rules to a sequence database [174-175]. A probability is then calculated that the observed match is a chance event [174-175]. The match that has the lowest probability of occurring by chance is considered the best match [174-175]. This probability is reported as a score, -10 log10(P), which reflects the confidence in the match; hence the lower the probability that the match is a random event, the higher the score [174-175]. A score of over 70 is generally regarded as a significant match [174].
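The relationship between the chance probability and the reported score, -10 log10(P), can be illustrated with a short calculation. The following is a minimal sketch in Python; the function names are hypothetical and are not part of Mascot, and the example probabilities are arbitrary values chosen for illustration only.

```python
import math


def mascot_style_score(probability: float) -> float:
    """Convert the probability P that a match is a chance event
    into a score of the form -10 * log10(P)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -10.0 * math.log10(probability)


def probability_from_score(score: float) -> float:
    """Invert the score back to the chance probability P."""
    return 10.0 ** (-score / 10.0)


# Arbitrary example probabilities (not taken from the thesis data):
# the smaller the chance probability, the higher the score.
for p in (1e-2, 1e-5, 1e-7):
    print(f"P = {p:.0e} -> score = {mascot_style_score(p):.1f}")
```

Under this formula, the score of 70 quoted above corresponds to a chance probability of 10^-7, and every additional 10 score units lowers the chance probability by a factor of ten.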
Scaffold validates these identifications and increases the confidence in the data by applying various peptide and protein validation methods after an initial database search (e.g. Mascot) [174-175]. The results and scores obtained from the initial Mascot search are converted into peptide identification probabilities, which are used to determine the probability that a Mascot protein identification is correct [174-175]. Protein probabilities are then determined with the “Protein-Prophet” algorithm to ultimately identify the proteins present in the sample [175]. In this thesis only identifications with a Scaffold protein probability of over 95% and 2 unique peptides (80+% peptide probability) were accepted, as this is the quality currently considered sufficient [174-175]. However, both Mascot and Scaffold use statistical algorithms that only quote probabilities [174-175], so false positives can still result from this analysis. Nevertheless, the fact that a number of pull downs were carried out with truncated hSTRAP variants and with full-length hSTRAP in two different vector systems, together with controls, increases the confidence in the data.
The proteomic analysis performed here therefore allows us to formulate the hypothesis that hSTRAP is potentially a critical protein involved in many aspects of cellular regulation and could act as a scaffolding protein that bridges various cellular proteins. However, because of the inconsistencies observed in the pull down data, as mentioned above, the individual interactions identified here need to be studied in more detail and confirmed by further biochemical binding and interaction assays. The value of the current study is that it narrowed down a list of potential pathways and classes of proteins and allowed a hypothesis to be formulated, which should be tested in the future.
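The acceptance criteria quoted above (protein probability over 95%, 2 unique peptides, 80+% peptide probability) can be expressed as a simple filter. The sketch below is an illustration only: it does not use Scaffold or its file formats, the class and field names are hypothetical, the reading that at least two unique peptides must each reach the peptide threshold is an assumption, and the three identifications are made-up values rather than data from this thesis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProteinHit:
    """One protein identification; all field names are hypothetical."""
    name: str
    protein_probability: float          # protein-level probability, 0..1
    peptide_probabilities: List[float]  # one value per unique peptide, 0..1


def passes_thresholds(hit: ProteinHit,
                      min_protein_prob: float = 0.95,
                      min_peptide_prob: float = 0.80,
                      min_unique_peptides: int = 2) -> bool:
    """Accept a hit if its protein probability is above the threshold and
    enough unique peptides reach the peptide probability threshold."""
    confident_peptides = [p for p in hit.peptide_probabilities
                          if p >= min_peptide_prob]
    return (hit.protein_probability > min_protein_prob
            and len(confident_peptides) >= min_unique_peptides)


# Made-up identifications, for illustration only.
hits = [
    ProteinHit("protein_A", 0.99, [0.95, 0.88, 0.60]),  # accepted
    ProteinHit("protein_B", 0.97, [0.92]),              # only one confident peptide
    ProteinHit("protein_C", 0.90, [0.99, 0.99]),        # protein probability too low
]
accepted = [h.name for h in hits if passes_thresholds(h)]
print(accepted)
```

Keeping the thresholds as parameters makes it straightforward to check how sensitive a list of candidate interactors is to the chosen cut-offs.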
Future experiments should, for example, confirm whether there is a direct interaction between hSTRAP and the hSTRAP interacting proteins identified in this thesis. This can be determined via co-localization, co-immunoprecipitation and other functional assays.
5. Future direction
The structure of full-length mouse mSTRAP and of a part of the C terminus of human hSTRAP has recently been published [153], and from this thesis it appears that the structure of full-length human hSTRAP cannot be easily solved, owing to the poor crystallizability of the human orthologue. However, crystallization of an hSTRAP-ligand complex may be attempted, to see whether the structure can be solved when hSTRAP is part of a complex. This thesis has shown that hSTRAP is implicated in diverse cellular functions, and hence future work should prioritize the hSTRAP interactome. Co-localization experiments should be performed to confirm whether hSTRAP and a given ligand are in sufficiently close proximity in-vivo to interact. This experiment involves labeling hSTRAP and the ligand, for example actin, with different fluorescent probes; the resulting microscopy images can then be overlaid to determine co-localisation of the two proteins under investigation [176]. Furthermore, these experiments can be performed with GFP-fused full-length STRAP, or with its shorter variants containing different TPR motifs, so that their subcellular localization can also be studied. These experiments can also be performed in different cell lines to determine differences, if any, in interaction status between cell lines, as only MCF7 cells were used in this study. Site directed mutagenesis experiments could also be conducted to identify critical residues important for the localization of STRAP and for interaction with its ligands.
The residues to be mutated first would lie within the regions of hSTRAP identified in this thesis; for example, actin was shown to interact with hSTRAP through amino acids 1-150 (Table 3.6). These data, combined with the solved STRAP structure [153] and an analysis of which residues are solvent-exposed, may guide the selection of candidate residues to be mutated in the first instance. Another method to confirm whether there is a direct interaction between the purified proteins in-vitro is isothermal titration calorimetry (ITC), which determines the thermodynamics associated with the interaction. The binding equilibrium can be determined by measuring the heat produced upon ligand interaction [177]. Through these ITC experiments, the stoichiometry of the interaction (n), the association constant, the free energy, enthalpy, entropy, and heat capacity of binding can be determined [177]. This can be performed with any of the hSTRAP interacting proteins identified, to determine the thermodynamics associated with the ligand-hSTRAP interaction [177]. Mutagenesis of hSTRAP residues within the regions of hSTRAP-ligand interaction mapped in this thesis can also be carried out to identify critical residues important for the hSTRAP-ligand interaction.
Another line of investigation would be the potential implication of hSTRAP in cancer metastasis, as it has been suggested here that hSTRAP could be implicated in the regulation of the actin cytoskeleton. This would include performing in-vitro scratch assays, which involve creating a scratch in a cell monolayer with a tip and capturing images of this monolayer at regular intervals to determine the rate of cell migration [178-179]. This should be done in the presence and absence of STRAP to determine the effect of STRAP on cellular migration.
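One simple way to turn the images captured at regular intervals into a migration rate is to measure the width of the scratch at each time point and fit a straight line to width versus time. The following is a minimal sketch of that calculation, assuming the scratch widths have already been measured from the images; the function names are hypothetical, the linear fit is one possible choice rather than the method prescribed in [178-179], and the time points and widths are made-up values for illustration only.

```python
from typing import Sequence


def percent_closure(initial_width: float, current_width: float) -> float:
    """Percentage of the original scratch width that has been closed."""
    return 100.0 * (initial_width - current_width) / initial_width


def closure_rate(times_h: Sequence[float], widths_um: Sequence[float]) -> float:
    """Least-squares slope of scratch width against time, sign-flipped so
    that a closing scratch gives a positive rate (micrometres per hour)."""
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_w = sum(widths_um) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(times_h, widths_um))
    return -sxy / sxx


# Made-up measurements (hours, micrometres), not data from the thesis.
times_h = [0, 6, 12, 24]
widths_um = [500, 440, 380, 260]

print(f"closure rate: {closure_rate(times_h, widths_um):.1f} um/h")
print(f"closure at 24 h: {percent_closure(widths_um[0], widths_um[-1]):.0f} %")
```

Comparing the fitted rate between cells with and without STRAP would then give a single number per condition for the effect on migration.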
Furthermore, this assay can be coupled to microscopy, so that a fluorescently labeled protein (GFP tagged hSTRAP) can be visualised and its sub-cellular localization monitored live during cellular migration [178]. Alternative approaches to complement the results presented in this thesis would include identifying hSTRAP interacting proteins via tandem affinity purification (TAP) coupled with mass spectrometry based analysis [180]. This would involve the fusion of STRAP with the TAP tag, consisting of a calmodulin binding peptide, a TEV cleavage site and the immunoglobulin interacting domain of protein A [180]. This hSTRAP fusion protein would then be incubated with cellular extract and subjected to the first affinity purification step, whereby the TAP tagged fusion protein, along with hSTRAP interacting partners, would bind to an IgG affinity matrix [180]. TEV protease is then added to the mixture, which results in cleavage of the TAP tag. The eluate is then subjected to another affinity purification step, using calmodulin coated beads, which again bind hSTRAP complexed to its interacting partners [180]. The whole complex is then eluted with ethylene glycol tetraacetic acid (EGTA). This method is quite effective in reducing the number of contaminants identified because of the two step affinity purification process [180], but on the other hand it may miss some of the weaker or transiently formed complexes.
In this study, cells were not treated prior to performing pull down assays; in future experiments the cells could be treated with drugs such as etoposide, a topoisomerase II inhibitor that induces double strand breaks [181]. Etoposide treatment may be informative, as STRAP is a stress responsive protein [136, 148-149], and so it is probable that differences in STRAP interaction profiles will occur under cellular stress conditions.
As in this thesis, this can be coupled to mass spectrometry analysis to determine the STRAP interaction profile with and without this treatment. Experiments should be performed to determine whether differences in the localisation of STRAP variants occur upon different treatments, and their effect on ligand interaction. Mammalian cell lines stably expressing hSTRAP could also be used to identify STRAP interacting partners. In addition, the role of STRAP in cancer can be explored using cell lines in which STRAP gene expression has been silenced using RNA interference [182]. Cells can be monitored under the microscope to determine whether cell death occurs upon STRAP knockdown. The effect of STRAP knockdown on cellular migration can be studied via the scratch wound assays mentioned above. The effects of STRAP knockdown on other pathways identified in this thesis should also be studied, for example on the glycolysis pathway (Table 3.6 and 3.7). In glycolysis assays, colorimetric measurements can be performed, whereby the level of L-lactate released into the culture medium is used as a measure of the glycolytic rate [183].
In summary, various experiments could be performed in the future to determine the role of hSTRAP in cancer, in cellular migration and in the other pathways identified in this thesis in which hSTRAP could potentially be implicated. Experiments should also be performed to determine the sub-cellular localizations of hSTRAP and of the hSTRAP interacting partners identified in this thesis, and to further verify these interactions. This study has built a platform for work on hSTRAP, and directed experiments on hSTRAP can now be recommended as described above.
6. References
1. Whitford, David. Proteins: Structure and Function: John Wiley & Sons, 2005.
2. Morawe, Tobias, Christof Hiebel, Andreas Kern, and Christian Behl. "Protein Homeostasis, Aging and Alzheimer’s Disease." Molecular Neurobiology 46, no. 1 (2012): 41-54.
3. Koga, Hiroshi, Susmita Kaushik, and Ana Maria Cuervo. "Protein Homeostasis and Aging: The Importance of Exquisite Quality Control." Ageing Research Reviews 10, no. 2 (2011): 205-15.
4. Committee on Intellectual Property Rights in Genomic and Protein Research, and National Research Council. Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health: National Academies Press, 2006.
5. Jahnke, Wolfgang, and Hansjurg Widmer. "Protein Nmr in Biomedical Research." Cellular and Molecular Life Sciences 61, no. 5 (2004): 580-99.
6. Klages, Jochen, Murray Coles, and Horst Kessler. "Nmr-Based Screening: A Powerful Tool in Fragment-Based Drug Discovery." Analyst 132, no. 7 (2007): 692-705.
7. Jacobsen, Neil E. Nmr Spectroscopy Explained: Simplified Theory, Applications and Examples for Organic Chemistry and Structural Biology. Hoboken, N.J.: Wiley-Interscience, 2007.
8. Guan, Hongtao, and Endre Kiss-Toth. "Advanced Technologies for Studies on Protein Interactomes." Adv Biochem Eng Biotechnol 110 (2008): 1-24.
9. Stites, Wesley E. "Protein-Protein Interactions: Interface Structure, Binding Thermodynamics, and Mutational Analysis." Chemical Reviews 97, no. 5 (1997): 1233-50.
10. Giometti, Carol Smith. "Proteomics and Bioinformatics." Advances in protein chemistry 65 (2003): 353-69.
11. Rupp, Bernhard. Modern Biomolecular Crystallography: Taylor & Francis, 2009. ISBN: 9780815340812
12. Li, Liang, and Rustem F. Ismagilov. "Protein Crystallization Using Microfluidic Technologies Based on Valves, Droplets, and Slipchip." Annual Review of Biophysics 39, no. 1 (2010): 139-58.
13. Weselak, Mark, Marianne G. Patch, Thomas L. Selby, Gunther Knebel, Raymond C. Stevens, Charles W. Carter, Jr., and M. Sweet Robert. "Robotics for Automated Crystal Formation and Analysis." In Methods in Enzymology, 45-76: Academic Press, 2003.
14. Pascal, Steven M. Nmr Primer: An Hsqc-Based Approach with Vector Animations: IM Publications, 2008.
15. Lorigan, Gary A., Robert E. Minto, and Wei Zhang. "Teaching the Fundamentals of Pulsed Nmr Spectroscopy in an Undergraduate Physical Chemistry Laboratory." Journal of Chemical Education 78, no. 7 (2001): 956.
16. Smith, William B., and Thomas W. Proulx. "Pulse Nmr - an Old Analytical Technique Oft Neglected by the Chemist." Journal of Chemical Education 53, no. 11 (1976): 700.
17. Evans, J.N.S. Biomolecular Nmr Spectroscopy: Oxford University Press, USA, 1995.
18. Farrar, Thomas C. "Pulsed and Fourier Transform Nmr Spectroscopy." Analytical Chemistry 42, no. 4 (1970): 109A-12A.
19. Edén, Mattias, and Lucio Frydman. "Homonuclear Nmr Correlations between Half-Integer Quadrupolar Nuclei Undergoing Magic-Angle Spinning." The Journal of Physical Chemistry B 107, no. 51 (2003): 14598-611.
20. Bax, Adriaan. "Two-Dimensional Nmr and Protein Structure." Annual Review of Biochemistry 58, no. 1 (1989): 223-56.
21. Li, Kuo Bin, and Bryan C. Sanctuary. "Cheminform Abstract: Automated Resonance Assignment of Proteins Using Heteronuclear 3d Nmr. Part 2. Side Chain and Sequence-Specific Assignment." ChemInform 28, no. 37 (1997): no-no.
22. Braun, W., G. Wider, K. H. Lee, and K. Wüthrich. "Conformation of Glucagon in a Lipid-Water Interphase by 1h Nuclear Magnetic Resonance." Journal of Molecular Biology 169, no. 4 (1983): 921-48.
23. Rossi, Paolo, G. V. T. Swapna, Yuanpeng J Huang, James M Aramini, Clemens Anklin, Kenith Conover, Keith Hamilton, Rong Xiao, Thomas B Acton, Asli Ertekin, John K Everett, and Gaetano T Montelione. "A Microscale Protein Nmr Sample Screening Pipeline." Journal of Biomolecular NMR 46, no. 1 (2010): 11-22.
24. Kelly, Sharon M., Thomas J. Jess, and Nicholas C. Price. "How to Study Proteins by Circular Dichroism." Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics 1751, no. 2 (2005): 119-39.
25. Kelly, Sharon M., and Nicholas C. Price. "The Use of Circular Dichroism in the Investigation of Protein Structure and Function." Current protein & peptide science 1, no. 4 (2000): 349-84.
26.
Yee, Adelinda A., Alexei Savchenko, Alexandr Ignachenko, Jonathan Lukin, Xiaohui Xu, Tatiana Skarina, Elena Evdokimova, Cheng Song Liu, Anthony Semesi, Valerie Guido, Aled M. Edwards, and Cheryl H. Arrowsmith. "Nmr and X-Ray Crystallography, Complementary Tools in Structural Proteomics of Small Proteins." Journal of the American Chemical Society 127, no. 47 (2005): 16512-17.
27. Garbuzynskiy, Sergiy O., Bogdan S. Melnik, Michail Yu Lobanov, Alexei V. Finkelstein, and Oxana V. Galzitskaya. "Comparison of X-Ray and Nmr Structures: Is There a Systematic Difference in Residue Contacts between X-Ray- and Nmr-Resolved Protein Structures?" Proteins: Structure, Function, and Bioinformatics 60, no. 1 (2005): 139-47.
28. Glish, Gary, and Richard Vachet. "The Basics of Mass Spectrometry in the Twenty-First Century." 2, no. 2 (2003): 140-50.
29. Stirnimann, Christian U., Evangelia Petsalaki, Robert B. Russell, and Christoph W. Müller. "Wd40 Proteins Propel Cellular Networks." Trends in Biochemical Sciences 35, no. 10 (2010): 565-74.
30. Jeleń, Filip, Arkadiusz Oleksy, Katarzyna Smietana, and Jacek Otlewski. "Pdz Domains - Common Players in the Cell Signaling." Acta biochimica Polonica 50, no. 4 (2003): 985-1017.
31. Li, Shawn S-C. "Specificity and Versatility of Sh3 and Other Proline-Recognition Domains: Structural Basis and Implications for Cellular Signal Transduction." Biochem. J. 390, no. 3 (2005): 641-53.
32. Blatch, Gregory L., and Michael Lässle. "The Tetratricopeptide Repeat: A Structural Motif Mediating Protein-Protein Interactions." BioEssays 21, no. 11 (1999): 932-39.
33. D'Andrea, Luca D., and Lynne Regan. "Tpr Proteins: The Versatile Helix." Trends in Biochemical Sciences 28, no. 12 (2003): 655-62.
34. Hirano, Tatsuya, Noriyuki Kinoshita, Kosuke Morikawa, and Mitsuhiro Yanagida. "Snap Helix with Knob and Hole: Essential Repeats in S. Pombe Nuclear Protein Nuc2+." Cell 60, no. 2 (1990): 319-28.
35. Sikorski, Robert S., Mark S.
Boguski, Mark Goebl, and Philip Hieter. "A Repeating Amino Acid Motif in Cdc23 Defines a Family of Proteins and a New Relationship among Genes Required for Mitosis and Rna Synthesis." Cell 60, no. 2 (1990): 307-17.
36. Lamb, John R., Stuart Tugendreich, and Phil Hieter. "Tetratrico Peptide Repeat Interactions: To Tpr or Not to Tpr?" Trends in Biochemical Sciences 20, no. 7 (1995): 257-59.
37. Das, Amit K., Patricia W. Cohen, and David Barford. "The Structure of the Tetratricopeptide Repeats of Protein Phosphatase 5: Implications for Tpr-Mediated Protein-Protein Interactions." EMBO J 17, no. 5 (1998): 1192-9.
38. Malek, Sami N., Charles H. Yang, William C. Earnshaw, Christine A. Kozak, and Stephen Desiderio. "P150tsp, a Conserved Nuclear Phosphoprotein That Contains Multiple Tetratricopeptide Repeats and Binds Specifically to Sh2 Domains." The Journal of biological chemistry 271, no. 12 (1996): 6952-62.
39. Smith, David F. "Tetratricopeptide Repeat Cochaperones in Steroid Receptor Complexes." Cell stress & chaperones 9, no. 2 (2004): 109-21.
40. Wu, Beili, Pengyun Li, Yiwei Liu, Zhiyong Lou, Yi Ding, Cuiling Shu, Sheng Ye, Mark Bartlam, Beifen Shen, and Zihe Rao. "3d Structure of Human Fk506-Binding Protein 52: Implications for the Assembly of the Glucocorticoid Receptor/Hsp90/Immunophilin Heterocomplex." Proceedings of the National Academy of Sciences of the United States of America 101, no. 22 (2004): 8348-53.
41. Main, Ewan R. G., Katherine Stott, Sophie E. Jackson, and Lynne Regan. "Local and Long-Range Stability in Tandemly Arrayed Tetratricopeptide Repeats." Proceedings of the National Academy of Sciences of the United States of America 102, no. 16 (2005): 5721-26.
42. Cortajarena, Aitziber L., and Lynne Regan. "Ligand Binding by Tpr Domains." Protein Science 15, no. 5 (2006): 1193-98.
43. Cliff, Matthew J., Mark A. Williams, John Brooke-Smith, David Barford, and John E. Ladbury. "Molecular Recognition Via Coupled Folding and Binding in a Tpr Domain." Journal of Molecular Biology 346, no. 3 (2005): 717-32.
44. Strauss, H. M., S. Keller, Enno Klussmann, and John Scott. "Pharmacological Interference with Protein-Protein Interactions Mediated by Coiled-Coil Motifs." In Protein-Protein Interactions as New Drug Targets, 461-82: Springer Berlin Heidelberg, 2008.
45. Allan, Rudi, and Thomas Ratajczak. "Versatile Tpr Domains Accommodate Different Modes of Target Protein Recognition and Function." Cell Stress and Chaperones 16, no. 4 (2011): 353-67.
46. Zeytuni, Natalie, and Raz Zarivach. "Structural and Functional Discussion of the Tetra-Trico-Peptide Repeat, a Protein Interaction Module." Structure 20, no. 3 (2012): 397-405.
47. Scheufler, Clemens, Achim Brinker, Gleb Bourenkov, Stefano Pegoraro, Luis Moroder, Hans Bartunik, F. Ulrich Hartl, and Ismail Moarefi. "Structure of Tpr Domain-Peptide Complexes: Critical Elements in the Assembly of the Hsp70-Hsp90 Multichaperone Machine." Cell 101, no. 2 (2000): 199-210.
48. Taylor, Paul, Jacqueline Dornan, Amerigo Carrello, Rodney F. Minchin, Thomas Ratajczak, and Malcolm D. Walkinshaw. "Two Structures of Cyclophilin 40: Folding and Fidelity in the Tpr Domains." Structure 9, no. 5 (2001): 431-38.
49. Main, Ewan R. G., Yong Xiong, Melanie J. Cocco, Luca D'Andrea, and Lynne Regan. "Design of Stable alpha-Helical Arrays from an Idealized Tpr Motif." Structure 11, no. 5 (2003): 497-508.
50. Sinars, Cindy R., Joyce Cheung-Flynn, Ronald A. Rimerman, Jonathan G. Scammell, David F. Smith, and Jon Clardy. "Structure of the Large Fk506-Binding Protein Fkbp51, an Hsp90-Binding Protein and a Component of Steroid Receptor Complexes." Proceedings of the National Academy of Sciences 100, no. 3 (2003): 868-73.
51. Brinker, Achim, Clemens Scheufler, Florian von der Mülbe, Burkhard Fleckenstein, Christian Herrmann, Günther Jung, Ismail Moarefi, and F. Ulrich Hartl. "Ligand Discrimination by Tpr Domains."
Journal of Biological Chemistry 277, no. 22 (2002): 19265-75.
52. Cliff, Matthew J., Mark A. Williams, John Brooke-Smith, David Barford, and John E. Ladbury. "Molecular Recognition Via Coupled Folding and Binding in a Tpr Domain." Journal of Molecular Biology 346, no. 3 (2005): 717-32.
53. Magliery, Thomas J., and Lynne Regan. "Sequence Variation in Ligand Binding Sites in Proteins." Bmc Bioinformatics 6 (2005): 240.
54. Doyle, Declan A., Alice Lee, John Lewis, Eunjoon Kim, Morgan Sheng, and Roderick MacKinnon. "Crystal Structures of a Complexed and Peptide-Free Membrane Protein-Binding Domain: Molecular Basis of Peptide Recognition by Pdz." Cell 85, no. 7 (1996): 1067-76.
55. De Los Rios, Paolo, Fabio Cecconi, Anna Pretre, Giovanni Dietler, Olivier Michielin, Francesco Piazza, and Brice Juanico. "Functional Dynamics of Pdz Binding Domains: A Normal-Mode Analysis." Biophysical Journal 89, no. 1 (2005): 14-21.
56. Aitio, Olli, Maarit Hellman, Arunas Kazlauskas, Didier F. Vingadassalom, John M. Leong, Kalle Saksela, and Perttu Permi. "Recognition of Tandem Pxxp Motifs as a Unique Src Homology 3-Binding Mode Triggers Pathogen-Driven Actin Assembly." Proceedings of the National Academy of Sciences 107, no. 50 (2010): 21743-48.
57. Bauer, Finn, Kristian Schweimer, Helke Meiselbach, Silke Hoffmann, Paul Rosch, and Heinrich Sticht. "Structural Characterization of Lyn-Sh3 Domain in Complex with a Herpesviral Protein Reveals an Extended Recognition Motif That Enhances Binding Affinity." Protein science : a publication of the Protein Society 14, no. 10 (2005): 2487-98.
58. Liao, Yanling, Ian M. Willis, and Robyn D. Moir. "The Brf1 and Bdp1 Subunits of Transcription Factor Tfiiib Bind to Overlapping Sites in the Tetratricopeptide Repeats of Tfc4." Journal of Biological Chemistry 278, no. 45 (2003): 44467-74.
59. Crevel, Gilles, Dorothy Bennett, and Sue Cotterill. "The Human Tpr Protein Ttc4 Is a Putative Hsp90 Co-Chaperone Which Interacts with Cdc6 and Shows Alterations in Transformed Cells." Plos One 3, no. 3 (2008): e0001737.
60. Jascur, Thomas, Howard Brickner, Isabelle Salles-Passador, Valerie Barbier, Abdelhamid El Khissiin, Brian Smith, Rati Fotedar, and Arun Fotedar. "Regulation of P21waf1/Cip1 Stability by Wisp39, a Hsp90 Binding Tpr Protein." Molecular Cell 17, no. 2 (2005): 237-49.
61. Jakob, Ursula, Hauke Lilie, Ines Meyer, and Johannes Buchner. "Transient Interaction of Hsp90 with Early Unfolding Intermediates of Citrate Synthase: Implications for Heat Shock in Vivo." Journal of Biological Chemistry 270, no. 13 (1995): 7288-94.
62. Imai, Jun, Mikako Maruya, Hideki Yashiroda, Ichiro Yahara, and Keiji Tanaka. "The Molecular Chaperone Hsp90 Plays a Role in the Assembly and Maintenance of the 26s Proteasome." Embo Journal 22, no. 14 (2003): 3557-67.
63. Grad, Iwona, and Didier Picard. "The Glucocorticoid Responses Are Shaped by Molecular Chaperones." Molecular and Cellular Endocrinology 275, no. 1-2 (2007): 2-12.
64. Russell, Lance C., Sherry R. Whitt, Mei-Shya Chen, and Michael Chinkers. "Identification of Conserved Residues Required for the Binding of a Tetratricopeptide Repeat Domain to Heat Shock Protein 90." Journal of Biological Chemistry 274, no. 29 (1999): 20060-63.
65. Whitesell, Luke, Edward G Mimnaugh, Brian De Costa, Charles E Myers, and Leonard M Neckers. "Inhibition of Heat Shock Protein Hsp90-Pp60v-Src Heteroprotein Complex Formation by Benzoquinone Ansamycins: Essential Role for Stress Proteins in Oncogenic Transformation." Proceedings of the National Academy of Sciences 91, no. 18 (1994): 8324-28.
66. Pearl, Laurence H., and Chrisostomos Prodromou. "Structure and in Vivo Function of Hsp90." Current Opinion in Structural Biology 10, no. 1 (2000): 46-51.
67. Morano, Kevin A. "New Tricks for an Old Dog." Annals of the New York Academy of Sciences 1113, no. 1 (2007): 1-14.
68.
Liu, Qinghuai, Juanyu Gao, Xi Chen, Yuxin Chen, Jie Chen, Saiqun Wang, Jin Liu, Xiaoyi Liu, and Jianmin Li. "Hbp21: A Novel Member of Tpr Motif Family, as a Potential Chaperone of Heat Shock Protein 70 in Proliferative Vitreoretinopathy (Pvr) and Breast Cancer." Molecular Biotechnology 40, no. 3 (2008): 231-40. 69. Place, Sean. P. "Single-Point Mutation in a Conserved Tpr Domain of Hip Disrupts Enhancement of Glucocorticoid Receptor Signaling." Cell stress & chaperones 16, no. 4 (2011): 469-74. 70. Dodt, Gabriele., Nancy. Braverman, Candice. Wong, Ann. Moser, Hugo. W. Moser, Paul. Watkins, David. Valle, and Stephen. J. Gould. "Mutations in the Pts1 Receptor Gene, Pxr1, Define Complementation Group 2 of the Peroxisome Biogenesis Disorders." Nat Genet 9, no. 2 (1995): 115-25. 71. Brocard, C., Friedrich. Kragler, M. M. Simon, T. Schuster, and A. Hartig. "The Tetratricopeptide Repeat Domain of the Pas10 Protein of Saccharomyces Cerevisiae Is Essential for Binding the Peroxisomal Targeting Signal -Skl." Biochemical and Biophysical Research Communications 204, no. 3 (1994): 1016-22. 72. McCollum, Dannel, Edward Monosov, and Suresh Subramani. "The Pas8 Mutant of Pichia Pastoris Exhibits the Peroxisomal Protein Import Deficiencies of Zellweger Syndrome Cells--the Pas8 Protein Binds to the Cooh-Terminal Tripeptide Peroxisomal Targeting Signal, and Is a Member of the Tpr Protein Family." The Journal of Cell Biology 121, no. 4 (1993): 761-74. 73. Vanderleij, Inge., Maartje. M. Franse, Ype. Elgersma, Ben. Distel, and Henk. F. Tabak. "Pas10 Is a Tetratricopeptide-Repeat Protein That Is Essential for the Import of Most Matrix Proteins into Peroxisomes of Saccharomyces-Cerevisiae." Proceedings of the National Academy of Sciences of the United States of America 90, no. 24 (1993): 1178286. 74. Schlüter, Agatha, Stéphane Fourcade, Enric Doménech-Estévez, Toni Gabaldón, Jaime Huerta-Cepas, Guillaume Berthommier, Raymond Ripp, Ronald J. A. Wanders, Olivier Poch, and Aurora Pujol. 
"PeroxisomeDB: A Database for the Peroxisomal Proteome, Functional Genomics and Disease." Nucleic Acids Research 35, no. suppl 1 (2007): D815D22. 75. Lithgow, Trevor, Benjamin S. Glick, and Gottfried Schatz. "The Protein Import Receptor of Mitochondria." Trends in Biochemical Sciences 20, no. 3 (1995): 98-101. 76. Moczko, M, U Bömer, M Kübrich, N Zufall, A Hönlinger, and N Pfanner. "The Intermembrane Space Domain of Mitochondrial Tom22 Functions as a Trans Binding Site for Preproteins with N-Terminal Targeting Sequences." Molecular and Cellular Biology 17, no. 11 (1997): 6574-84. 77. Riezman, Howard., Toshiharu. Hase, Adolphus. P. van Loon, Leslie. A. Grivell, Kitaru. Suda, and Gottfried. Schatz. "Import of Proteins into Mitochondria: A 70 Kilodalton Outer Membrane Protein with a Large Carboxy-Terminal Deletion Is Still Transported to the Outer Membrane." EMBO J 2, no. 12 (1983): 2161-8. 208 78. Yano, Masato, Kazutoyo Terada, and Masataka Mori. "Mitochondrial Import Receptors Tom20 and Tom22 Have Chaperone-Like Activity." Journal of Biological Chemistry 279, no. 11 (2004): 10808-13. 79. Thornton, Brian R., and David P. Toczyski. "Precise Destruction: An Emerging Picture of the Apc." Genes & Development 20, no. 22 (2006): 3069-78. 80. Lamb, J. R., W. A. Michaud, R. S. Sikorski, and P. A. Hieter. "Cdc16p, Cdc23p and Cdc27p Form a Complex Essential for Mitosis." The EMBO journal 13, no. 18 (1994): 4321-28. 81. Sikorski, Robert S, William A Michaud, and Philip Hieter. "P62cdc23 of Saccharomyces Cerevisiae: A Nuclear Tetratricopeptide Repeat Protein with Two Mutable Domains." Molecular and Cellular Biology 13, no. 2 (1993): 1212-21. 82. Samejima, Itaru, and Mitsuhiro Yanagida. "Bypassing Anaphase by Fission Yeast Cut9 Mutation: Requirement of Cut9+ to Initiate Anaphase." The Journal of Cell Biology 127, no. 6 (1994): 1655-70. 83. Liu, Geng, and Guillermina Lozano. "P21 Stability: Linking Chaperones to a Cell Cycle Checkpoint." Cancer Cell 7, no. 2 (2005): 113-14. 84. 
Fotedar, R., P. Fitzgerald, T. Rousselle, D. Cannella, M. Doree, H. Messier, and A. Fotedar. "p21 Contains Independent Binding Sites for Cyclin and Cdk2: Both Sites Are Required to Inhibit Cdk2 Kinase Activity." Oncogene 12, no. 10 (1996): 2155-64. 85. El-Deiry, Wafik S., Takashi Tokino, Victor E. Velculescu, Daniel B. Levy, Ramon Parsons, Jeffrey M. Trent, David Lin, W. Edward Mercer, Kenneth W. Kinzler, and Bert Vogelstein. "Waf1, a Potential Mediator of P53 Tumor Suppression." Cell 75, no. 4 (1993): 817-25. 86. Kim, Geum-Yi, Stephen E. Mercer, Daina Z. Ewton, Zhongfa Yan, Kideok Jin, and Eileen Friedman. "The Stress-Activated Protein Kinases P38α and Jnk1 Stabilize P21cip1 by Phosphorylation." Journal of Biological Chemistry 277, no. 33 (2002): 29792-802. 87. Cliff, Matthew J., Richard Harris, David Barford, John E. Ladbury, and Mark A. Williams. "Conformational Diversity in the Tpr Domain-Mediated Interaction of Protein Phosphatase 5 with Hsp90." Structure 14, no. 3 (2006): 415-26. 88. Chen, Mei-Shya, Adam M. Silverstein, William B. Pratt, and Michael Chinkers. "The Tetratricopeptide Repeat Domain of Protein Phosphatase 5 Mediates Binding to Glucocorticoid Receptor Heterocomplexes and Acts as a Dominant Negative Mutant." Journal of Biological Chemistry 271, no. 50 (1996): 32315-20. 89. Silverstein, Adam M., Mario D. Galigniana, Mei-Shya Chen, Janet K. Owens-Grillo, Michael Chinkers, and William B. Pratt. "Protein Phosphatase 5 Is a Major Component of Glucocorticoid Receptor Hsp90 Complexes with Properties of an Fk506-Binding Immunophilin." Journal of Biological Chemistry 272, no. 26 (1997): 16224-30. 90. Ollendorff, Vincent, and Daniel J. Donoghue. "The Serine/Threonine Phosphatase Pp5 Interacts with Cdc16 and Cdc27, Two Tetratricopeptide Repeat-Containing Subunits of the Anaphase-Promoting Complex." Journal of Biological Chemistry 272, no. 51 (1997): 32011-18. 209 91. Chen, Mao Xiang, and Patricia T. W. Cohen. 
"Activation of Protein Phosphatase 5 by Limited Proteolysis or the Binding of Polyunsaturated Fatty Acids to the Tpr Domain." Febs Letters 400, no. 1 (1997): 136-40. 92. Ali, Ambereen., Ji. Zhang, Shideng. Bao, Irene. Liu, Diane. Otterness, Nicholas. M. Dean, Robert. T. Abraham, and Xian-Fan. Wang. "Requirement of Protein Phosphatase 5 in DNA-Damage-Induced Atm Activation." Genes Dev 18, no. 3 (2004): 249-54 93. Blom, Eric, Henri J. van de Vrugt, Yne de Vries, Johan P. de Winter, Fré Arwert, and Hans Joenje. "Multiple Tpr Motifs Characterize the Fanconi Anemia Fancg Protein." DNA Repair 3, no. 1 (2004): 77-84. 94. Hussain, Shobbir, James B. Wilson, Eric Blom, Larry H. Thompson, Patrick Sung, Susan M. Gordon, Gary M. Kupfer, Hans Joenje, Christopher G. Mathew, and Nigel J. Jones. "Tetratricopeptide-Motif-Mediated Interaction of Fancg with Recombination Proteins Xrcc3 and Brca2." DNA Repair 5, no. 5 (2006): 629-40. 95. Jiang, Jihong, Douglas Cyr, Roger W. Babbitt, William C. Sessa, and Cam Patterson. "Chaperone-Dependent Regulation of Endothelial Nitric-Oxide Synthase Intracellular Trafficking by the Co-Chaperone/Ubiquitin Ligase Chip." Journal of Biological Chemistry 278, no. 49 (2003): 49332-4 96. Ballinger, Carol A., Patrice Connell, Yaxu Wu, Zhaoyong Hu, Larry J. Thompson, LiYan Yin, and Cam Patterson. "Identification of Chip, a Novel Tetratricopeptide RepeatContaining Protein That Interacts with Heat Shock Proteins and Negatively Regulates Chaperone Functions." Molecular and Cellular Biology 19, no. 6 (1999): 4535-45. 97. Alberti, Simon, Karsten Böhse, Verena Arndt, Anton Schmitz, and Jörg Höhfeld. "The Cochaperone Hspbp1 Inhibits the Chip Ubiquitin Ligase and Stimulates the Maturation of the Cystic Fibrosis Transmembrane Conductance Regulator." Molecular Biology of the Cell 15, no. 9 (2004): 4003-10. 98. Tripathi, Veenu, Amjad Ali, Rajiv Bhat, and Uttam Pati. "Chip Chaperones Wild Type P53 Tumor Suppressor Protein." Journal of Biological Chemistry 282, no. 
39 (2007): 28441-54. 99. Xia, Tian, Christiana Dimitropoulou, Jingmin Zeng, Galina N. Antonova, Connie Snead, Richard C. Venema, David Fulton, Shuibing Qian, Cam Patterson, Andreas Papapetropoulos, and John D. Catravas. "Chaperone-Dependent E3 Ligase Chip Ubiquitinates and Mediates Proteasomal Degradation of Soluble Guanylyl Cyclase." American Journal of Physiology - Heart and Circulatory Physiology 293, no. 5 (2007): H3080-H87. 100. Iyer, Sai Prasad N., and Gerald W. Hart. "Roles of the Tetratricopeptide Repeat Domain in O-Glcnac Transferase Targeting and Protein Substrate Specificity." Journal of Biological Chemistry 278, no. 27 (2003): 24608-16. 101. Iyer, Sai Prasad N., Yoshihiro Akimoto, and Gerald W. Hart. "Identification and Cloning of a Novel Family of Coiled-Coil Domain Proteins That Interact with O-Glcnac Transferase." Journal of Biological Chemistry 278, no. 7 (2003): 5399-40 210 102. Kelly, William G, Michael E Dahmus, and Gerald W Hart. "Rna Polymerase Ii Is a Glycoprotein. Modification of the Cooh-Terminal Domain by O-Glcnac." Journal of Biological Chemistry 268, no. 14 (1993): 10416-24. 103. Jackson, Stephen P., and Robert Tjian. "O-Glycosylation of Eukaryotic Transcription Factors: Implications for Mechanisms of Transcriptional Regulation." Cell 55, no. 1 (1988): 125-33. 104. Lubas, William A., and John A. Hanover. "Functional Expression of O-Linked Glcnac Transferase: Domain Structure and Substrate Specificity." Journal of Biological Chemistry 275, no. 15 (2000): 10983-88. 105. Buchanan, Grant., Carmela. Ricciardelli, Jonathan. M. Harris, Jennifer. Prescott, Zoe. Chiao-Li. Yu, Li. Jia, Lisa. M. Butler, Villis. R. Marshall, Howard. I. Scher, William. L. Gerald, Gerhard. A. Coetzee, and Wayne. D. Tilley. "Control of Androgen Receptor Signaling in Prostate Cancer by the Cochaperone Small Glutamine Rich Tetratricopeptide Repeat Containing Protein Alpha." Cancer research 67, no. 20 (2007): 10087-96. 106. 
Krenn, Veronica, Annemarie Wehenkel, Xiaozheng Li, Stefano Santaguida, and Andrea Musacchio. "Structural Analysis Reveals Features of the Spindle Checkpoint Kinase Bub1-kinetochore Subunit Knl1 Interaction." The Journal of Cell Biology 196, no. 4 (2012): 451-67. 107. Chol, Kang-Yell, Brett Satterberg, David M. Lyons, and Elaine A. Elion. "Ste5 Tethers Multiple Protein Kinases in the Map Kinase Cascade Required for Mating in S. Cerevisiae." Cell 78, no. 3 (1994): 499-51 108. Zeke, András, Melinda Lukács, Wendell A. Lim, and Attila Reményi. "Scaffolds: Interaction Platforms for Cellular Signalling Circuits." Trends in Cell Biology 19, no. 8 (2009): 364-74. 109. Burack, W. Richard, and Andrey S. Shaw. "Signal Transduction: Hanging on a Scaffold." Current Opinion in Cell Biology 12, no. 2 (2000): 211-16. 110. Ferrell Jr, James E., and Karlene A. Cimprich. "Enforced Proximity in the Function of a Famous Scaffold." Molecular Cell 11, no. 2 (2003): 289-91. 111. Bhattacharyya, Roby. P., Attila. Remenyi, Brian. J. Yeh, and Wendell. A. Lim. "Domains, Motifs, and Scaffolds: The Role of Modular Interactions in the Evolution and Wiring of Cell Signaling Circuits." Annual Review of Biochemistry 75 (2006): 655-80. 112. Pálfy, Máté, Attila Reményi, and Tamás Korcsmáros. "Endosomal Crosstalk: Meeting Points for Signaling Pathways." Trends in Cell Biology 22, no. 9 (2012): 447-56. 113. Hanahan, Douglas, and Robert A. Weinberg. "The Hallmarks of Cancer." Cell 100, no. 1 (2000): 57-70. 114. Hanahan, Douglas, and Robert¬†A Weinberg. "Hallmarks of Cancer: The Next Generation." Cell 144, no. 5 (2011): 646-74. 115. Zhang, Yanping, Gabrielle White Wolf, Krishna Bhat, Aiwen Jin, Theresa Allio, William A. Burkhart, and Yue Xiong. "Ribosomal Protein L11 Negatively Regulates 211 Oncoprotein Mdm2 and Mediates a P53-Dependent Ribosomal-Stress Checkpoint Pathway." Molecular and Cellular Biology 23, no. 23 (2003): 8902-12 116. 
Gajjar, Madhavsai, Marco M Candeias, Laurence Malbert-Colas, Anne Mazars, Jun Fujita, Vanesa Olivares-Illana, and Robin Fåhraeus. "The P53 Mrna-Mdm2 Interaction Controls Mdm2 Nuclear Trafficking and Is Required for P53 Activation Following DNA Damage." Cancer Cell 21, no. 1 (2012): 25-35 117. Vazquez, Alexi., Elisabeth. E. Bond, Arnold. J. Levine, and G. Levine. Bond. "The Genetics of the P53 Pathway, Apoptosis and Cancer Therapy." Nat Rev Drug Discov 7, no. 12 (2008): 979-87. 118. Campellone, Kenneth. G., and Mathew. D. Welch. "A Nucleator Arms Race: Cellular Control of Actin Assembly." Nat Rev Mol Cell Biol 11, no. 4 (2010): 237-51. 119. Ridley, Anne. "Life at the Leading Edge." Cell 145, no. 7 (2011): 1012-22. 120. Yamaguchi, Hideki, and John Condeelis. "Regulation of the Actin Cytoskeleton in Cancer Cell Migration and Invasion." Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1773, no. 5 (2007): 642-52. 121. Yamaguchi, Hideki, Jeffrey Wyckoff, and John Condeelis. "Cell Migration in Tumors." Current Opinion in Cell Biology 17, no. 5 (2005): 559-64. 122. Schramm, Laura, and Nouria Hernandez. "Recruitment of Rna Polymerase III to Its Target Promoters." Genes & Development 16, no. 20 (2002): 2593-620. 123. Nikolov, Dimitar.‚ B., and Stephen.‚ K. Burley. "Rna Polymerase II Transcription Initiation: A Structural view." Proceedings of the National Academy of Sciences 94, no. 1 (1997): 15-22. 124. Sentenac, Andre. "Eukaryotic Rna-Polymerases." Crc Critical Reviews in Biochemistry 18, no. 1 (1985): 31-90. 125. Gaston, Kevin., and Padma. S. Jayaraman. "Transcriptional Repression in Eukaryotes: Repressors and Repression Mechanisms." Cell Mol Life Sci 60, no. 4 (2003): 721-41. 126. Maston, Glen A., Sara. K. Evans, and Michael. R. Green. "Transcriptional Regulatory Elements in the Human Genome." Annu Rev Genomics Hum Genet 7 (2006): 29-59. 127. Black, Joshua C., Janet E. Choi, Sarah R. Lombardo, and Michael Carey. 
"A Mechanism for Coordinating Chromatin Modification and Preinitiation Complex Assembly." Molecular Cell 23, no. 6 (2006): 809-18 128. Mertens, Claudia., and Robert. G. Roeder. "Different Functional Modes of P300 in Activation of Rna Polymerase Iii Transcription from Chromatin Templates." Molecular and Cellular Biology 28, no. 18 (2008): 5764-76. 129. Lemon, Bryan, and Robert Tjian. "Orchestrated Response: A Symphony of Transcription Factors for Gene Control." Genes & Development 14, no. 20 (2000): 255169. 130. Pan, Yongping, Chung-Jung Tsai, Buyong Ma, and Ruth Nussinov. "Mechanisms of 212 Transcription Factor Selectivity." Trends in Genetics 26, no. 2 (2010): 75-83. 131. Silverman, Eric S, Jing Du, Amy J Williams, Raj Wadgaonkar, Jeffrey M Drazen, and Tucker Collins. "Camp-Response-Element-Binding-Protein-Binding Protein (Cbp) and P300 Are Transcriptional Co-Activators of Early Growth Response Factor-1 (Egr-1)." Biochem. J. 336, no. 1 (1998): 183-89. 132. Janknecht, Ralf, and Tony Hunter. Transcription. A Growing Coactivator Network. Vol. 383, 1996. 133. Courey, Albert. J., and Songtao. Jia. "Transcriptional Repression: The Long and the Short of It." Genes Dev 15, no. 21 (2001): 2786-96. 134. Moir, Robyn D, Indra Sethy-Coraci, Karen Puglia, Monett D Librizzi, and Ian M Willis. "A Tetratricopeptide Repeat Mutation in Yeast Transcription Factor IIIC131 (TFIIC131) Facilitates Recruitment of TfIIb-Related Factor TfIIIb70." Molecular and Cellular Biology 17, no. 12 (1997): 7119-25. 135. Cabarcas, Stephanie, and Laura Schramm. "Rna Polymerase III Transcription in Cancer: The Brf2 Connection." Molecular Cancer C7 - 47 10, no. 1 (2011): 1-10. 136. Demonacos, Constantinos, Marija Kristic-Demonacos, and Nicholas B. La Thangue. "A Tpr Motif Cofactor Contributes to P300 Activity in the P53 Response." Molecular Cell 8, no. 1 (2001): 71-84. 137. Dallas, Peter B, Peter Yaciuk, and Elizabeth Moran. 
"Characterization of Monoclonal Antibodies Raised against P300: Both P300 and Cbp Are Present in Intracellular Tbp Complexes." Journal of Virology 71, no. 2 (1997): 1726-31. 138. Yuan, L. W., and A. Giordano. "Acetyltransferase Machinery Conserved in P300/Cbp-Family Proteins." Oncogene 21, no. 14 (2002): 2253-60. 139. Grossman, Steven R., Marco Perez, Andrew L. Kung, Michael Joseph, Claire Mansur, Zhi-Xiong Xiao, Sushant Kumar, Peter M. Howley, and David M. Livingston. "P300/Mdm2 Complexes Participate in Mdm2-Mediated P53 Degradation." Molecular Cell 2, no. 4 (1998): 405-15. 140. Zhu, Qianzheng, Jihong Yao, Gulzar Wani, Manzoor A. Wani, and Altaf A. Wani. "Mdm2 Mutant Defective in Binding P300 Promotes Ubiquitination but Not Degradation of P53: Evidence for the Role of P300 in Integrating Ubiquitination and Proteolysis." Journal of Biological Chemistry 276, no. 32 (2001): 29695-701. 141. Shikama, Noriko, Chang-Woo Lee, Stephen France, Laurent Delavaine, Jonathan Lyon, Marija Krstic-Demonacos, and Nicholas B. La Thangue. "A Novel Cofactor for P300 That Regulates the P53 Response." Molecular Cell 4, no. 3 (1999): 365-76. 142. Coutts, Amanda. S., Houda. Boulahbel, Anne. Graham, and Nicolas. B. La Thangue. "Mdm2 Targets the P53 Transcription Cofactor Jmy for Degradation." EMBO reports 8, no. 1 (2007): 84-90. 143. Jansson, Martin., Stephen. T. Durant, Er-Chieh. Cho, Sharon. Sheahan, Mariola. Edelmann, Benedict. Kessler, and Nicolas. B. La Thangue. "Arginine Methylation Regulates the P53 Response." Nat Cell Biol 10, no. 12 (2008): 1431-9. 213 144. Demonacos, Constantinos., Marija. Kristic-Demonacos, Linda. Smith, Danmei. Xu, Darran. P. O'Connor, Martin. Jansson, and Nicolas. B. La Thangue. "A New Effector Pathway Links Atm Kinase with the DNA Damage Response." Nat Cell Biol 6, no. 10 (2004): 968-76. 145. Adams, Cassandra. J., Anne. L. Graham, Martin. Jansson, Amanda. S. Coutts, Mariola. Edelmann, Linda. Smith, Benedikt. Kessler, and Nicolas. B. La Thangue. 
"Atm and Chk2 Kinase Target the P53 Cofactor Strap." EMBO Rep 9, no. 12 (2008): 1222-9. 146. Hollstein, M., D. Sidransky, B. Vogelstein, and C. C. Harris. "P53 Mutations in Human Cancers." Science 253, no. 5015 (1991): 49-53. 147. Smith, Linda., and Nicolas. B. La Thangue. "Signalling DNA Damage by Regulating P53 Co-Factor Activity." Cell Cycle 4, no. 1 (2005): 30-2. 148. Xu, Danmei., and Nicolas. B. La Thangue. "Strap: A Versatile Transcription CoFactor." Cell Cycle 7, no. 16 (2008): 2456-7. 149. Xu, Danmei., L. Panagiotis. Zalmas, and Nicolas. B. La Thangue. "A Transcription Cofactor Required for the Heat-Shock Response." EMBO Rep 9, no. 7 (2008): 662-9. 150. Davies, Laura., Elissavet. Paraskevopoulou, Malihah. Sadeq, Christiana. Symeou, Constantia. Pantelidou, Constantinos. Demonacos, and Marija. Krstic-Demonacos. "Regulation of Glucocorticoid Receptor Activity by a Stress Responsive Transcriptional Cofactor." Molecular endocrinology (Baltimore, Md.) 25, no. 1 (2011): 58-71. 151. http://textbookofbacteriology.net/themicrobialworld/growth.html- Accessed on the 15th March 2012. 152. Golovanov, Alexander. P., Guillaume. M. Hautbergue, Stuart. A. Wilson, and Lu. Yun. Lian. "A Simple Method for Improving Protein Solubility and Long-Term Stability." Journal of the American Chemical Society 126, no. 29 (2004): 8933-39. 153. Adams, Cassandra J., Ashley C. W. Pike, Sandra Maniam, Timothy D. Sharpe, Amanda S. Coutts, Stefan Knapp, Nicholas B. La Thangue, and Alex N. Bullock. "The P53 Cofactor Strap Exhibits an Unexpected Tpr Motif and Oligonucleotide-Binding (Ob)‚Äìfold Structure." Proceedings of the National Academy of Sciences 109, no. 10 (2012): 3778-83. 154. Dale, Glenn E., Christian Oefner, and Allan D‚ÄôArcy. "The Protein as a Variable in Protein Crystallization." Journal of Structural Biology 142, no. 1 (2003): 88-97. 155. Smyth, Douglas R., Marek K. Mrozkiewicz, William J. McGrath, Pawel Listwan, and Bostjan Kobe. 
"Crystal Structures of Fusion Proteins with Large-Affinity Tags." Protein Science 12, no. 7 (2003): 1313-22. 156. Schulman, Brenda. A., Peter. S. Kim, Christopher. M. Dobson, and Christina. Redfield. "A Residue-Specific Nmr View of the Non-Cooperative Unfolding of a Molten Globule." Nat Struct Biol 4, no. 8 (1997): 630-4. 157. Chen, Rachel. "Bacterial Expression Systems for Recombinant Protein Production: E. Coli and Beyond." Biotechnology Advances 30, no. 5 (2012): 1102-07. 214 158. Verma, R., E. Boleti, and A. J. T. George. "Antibody Engineering: Comparison of Bacterial, Yeast, Insect and Mammalian Expression Systems." Journal of Immunological Methods 216, no. 1‚Äì2 (1998): 165-81. 159. Fernandez, Joseph.M., and James.P. Hoeffler. Gene Expression Systems: Using Nature for the Art of Expression: Elsevier Science, 1999. 160. Higgins, Steve.J., and Steve.J.H.B.D. Hames. Protein Expression: A Practical Approach: Oxford University Press, 1999. 161. Baneyx, Francois., and Mirna. Mujacic. "Recombinant Protein Folding and Misfolding in Escherichia Coli." Nat Biotechnol 22, no. 11 (2004): 1399-408. 162. Trinkle-Mulcahy, Laura, Severine Boulon, Yun Wah Lam, Roby Urcia, FrancoisMichel Boisvert, Franck Vandermoere, Nick A Morrice, Sam Swift, Ulrich Rothbauer, Heinrich Leonhardt, and Angus Lamond. "Identifying Specific Protein Interaction Partners Using Quantitative Mass Spectrometry and Bead Proteomes." The Journal of Cell Biology 183, no. 2 (2008): 223-39. 163. Figeys, Daniel, Linda D. McBroom, and Michael F. Moran. "Mass Spectrometry for the Study of Protein-Protein Interactions." Methods 24, no. 3 (2001): 230-39. 164. Brymora, Adam, Valentina A. Valova, and Phillip J. Robinson. "Protein-Protein Interactions Identified by Pull-Down Experiments and Mass Spectrometry." In Current Protocols in Cell Biology: John Wiley & Sons, Inc., 2001. 165. 
Arifuzzaman, Mohammad, Maki Maeda, Aya Itoh, Kensaku Nishikata, Chiharu Takita, Rintaro Saito, Takeshi Ara, Kenji Nakahigashi, Hsuan-Cheng Huang, Aki Hirai, Kohei Tsuzuki, Seira Nakamura, Mohammad Altaf-Ul-Amin, Taku Oshima, Tomoya Baba, Natsuko Yamamoto, Tomoyo Kawamura, Tomoko Ioka-Nakamichi, Masanari Kitagawa, Masaru Tomita, Shigehiko Kanaya, Chieko Wada, and Hirotada Mori. "LargeScale Identification of Protein-protein Interaction of Escherichia Coli K-12." Genome Research 16, no. 5 (2006): 686-91. 166. Franceschini, Andrea, Damian Szklarczyk, Sune Frankild, Michael Kuhn, Milan Simonovic, Alexander Roth, Jianyi Lin, Pablo Minguez, Peer Bork, Christian von Mering, and Lars J. Jensen. "String V9.1: Protein-Protein Interaction Networks, with Increased Coverage and Integration." Nucleic Acids Research 41, no. D1 (2013): D808-D15. 167. Theobald, Douglas L., Rachel M. Mitton-Fry, and Deborah S. Wuttke. "Nucleic Acid Recognition by Ob-Fold Proteins." Annual Review of Biophysics and Biomolecular Structure 32, no. 1 (2003): 115-33. 168. Roadcap, David. W., and James. E. Bear. "Double Jmy: Making Actin Fast." Nature cell biology 11, no. 4 (2009): 375-76. 169. Coutts, Amanda. S., Louise. Weston, and Nicolas. B. La Thangue. "A Transcription Co-Factor Integrates Cell Adhesion and Motility with the P53 Response." Proc Natl Acad Sci U S A 106, no. 47 (2009): 19872-7. 215 170. Coutts, Amanda. S., Louise. Weston, and Nicolas. B. La Thangue. "Actin Nucleation by a Transcription Co-Factor That Links Cytoskeletal Events with the P53 Response." Cell Cycle 9, no. 8 (2010): 1511-5. 171. Wang, Yinggun. "Jimmy on the Stage: Linking DNA Damage with Cell Adhesion and Motility." Cell adhesion & migration 4, no. 2 (2010): 166-68. 172. Sun, Shao-Chen, Qing-Yuan Sun, and Nam-Hyung Kim. "Jmy Is Required for Asymmetric Division and Cytokinesis in Mouse Oocytes." Molecular human reproduction 17, no. 5 (2011): 296-304. 173. Taiyab, Aftab, and Ch Mohan Rao. 
"Hsp90 Modulates Actin Dynamics: Inhibition of Hsp90 Leads to Decreased Cell Motility and Impairs Invasion." Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1813, no. 1 (2011): 213-21. 174. Perkins, David N., Darryl J. C. Pappin, David M. Creasy, and John S. Cottrell. "Probability-Based Protein Identification by Searching Sequence Databases Using Mass Spectrometry Data." ELECTROPHORESIS 20, no. 18 (1999): 3551-67. 175. Searle, Brian C. "Scaffold: A Bioinformatic Tool for Validating Ms/Ms-Based Proteomic Studies." PROTEOMICS 10, no. 6 (2010): 1265-69. 176. Scriven, David. R., Ronald. M. Lynch, and Edwin. D. Moore. "Image Acquisition for Colocalization Using Optical Microscopy." American journal of physiology. Cell physiology 294, no. 5 (2008): C1119-22. 177. Pierce, Michael M., C. S. Raman, and Barry T. Nall. "Isothermal Titration Calorimetry of Protein‚Äìprotein Interactions." Methods 19, no. 2 (1999): 213-21. 178. Liang, Chun- Chi., Ann. Y. Park, and Jun-Lin. Guan. "In Vitro Scratch Assay: A Convenient and Inexpensive Method for Analysis of Cell Migration in Vitro." Nature protocols 2, no. 2 (2007): 329-33. 179. Wells, Claire M., Maddy Parsons, and Giles Cory. "Scratch-Wound Assay." In Cell Migration, 25-30: Humana Press, 2011. 180. Xu, Xiaoli, Yuan Song, Yuhua Li, Jianfeng Chang, Hua zhang, and Lizhe An. "The Tandem Affinity Purification Method: An Efficient System for Protein Complex Purification and Protein Interaction Identification." Protein Expression and Purification 72, no. 2 (2010): 149-56. 181.Muslimović, Aida, Susanne Nyström, Yue Gao, and Ola Hammarsten. "Numerical Analysis of Etoposide Induced DNA Breaks." Plos One 4, no. 6 (2009): e5859. 182. Mocellin, S., and M. Provenzano. "Rna Interference: Learning Gene Knock-Down from Cell Physiology." Journal of translational medicine 2, no. 1 (2004): 39. 183. https://www.caymanchem.com/app/template/Product.vm/catalog/600450, accessed on the 7th January 2013. 216 7. 
Appendix

(1) His-hSTRAP(1-440)TPR1-6

Sequence given by GATC:
gCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATT
CTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCAC
TGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAG
CAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGATA
GCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCAC
TGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTG
CAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGC
TGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAG
TGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTC
TGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAAC
CGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCT

His-hSTRAP(1-440)TPR1-6 sequence
CATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATTCTGTTGAAGATGCAGGTCGTAAA
CAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGC
CCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGT
GCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCC
GTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAA
AAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGG
CCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACAT
CTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTT
AGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATT
CCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCA
GTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC

Alignment:

SEQ   GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 120
Full  -----------------------CATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 97
                             *************************************************************************************************

SEQ   AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA 240
Full  AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA 217
      ************************************************************************************************************************

SEQ   TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG 360
Full  TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG 337
      ************************************************************************************************************************

SEQ   TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCG 480
Full  TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCG 457
      ************************************************************************************************************************

SEQ   ATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCC 600
Full  ATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCC 577
      ************************************************************************************************************************

SEQ   TGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAA 720
Full  TGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAA 697
      ************************************************************************************************************************

SEQ   CCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGG 840
Full  CCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGG 817
      ************************************************************************************************************************

SEQ   ATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCG 960
Full  ATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCG 937
      ************************************************************************************************************************

SEQ   CAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTC 1080
Full  CAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTC 1057
      ************************************************************************************************************************

SEQ   CGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGC 1200
Full  CGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGC 1177
      ************************************************************************************************************************

SEQ   GTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAA 1320
Full  GTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAA 1297
      ************************************************************************************************************************

SEQ   CCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAG 1380
Full  CCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC---------------------- 1335
      **************************************

(2) GST-hSTRAP(1-440) sequencing data

Sequence given by the University of Manchester DNA sequencing facility
TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTCCTTTCCTTT
GTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCCATCTGAATCTACCAGGCC
AAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTTGAGCTCCAGGGTCACTTTCTGCCCAGA
GGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCCCTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAA TTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTCCCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATG AAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGATCTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAAT ATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCATGACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTT 218 TTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGCAGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAG CTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGGAGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTG TAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTTCCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGAT CGGCTTGACTTCTTCCTCTTCATCAGCCATCAT Reverse complement of the GST-hSTRAP(1-440) sequence TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTCCTTTCCTTT GTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCCATCTGAATCTACCAGGCC AAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTTGAGCTCCAGGGTCACTTTCTGCCCAGA GGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCCCTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAA TTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTCCCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATG AAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGATCTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAAT ATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCATGACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTT 
TTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGCAGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAG CTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGGAGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTG TAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTTCCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGAT CGGCTTGACTTCTTCCTCTTCATCAGCCATCAT Alignment: seq Full seq Full seq Full seq Full seq Full seq Full seq Full TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTC TCATTCACACTGTGGTCGCGATGCCACTGTGGCAACAGCCTGGCTGCTGGATCCCTGAGGCTTCCCATTCACCACTAGCAGGAGGGGCGTCTCCACTCGAACACTGGAAAAGGAATAGTC ************************************************************************************************************************ CTTTCCTTTGTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCC CTTTCCTTTGTGCTGAATTCGGTGAAGCCGCAGGTTGGGCTCAGGAATGGCTACAGAGTCTCCAATGAGCACTCCCCAGCTCTGCACTATATTGTACACCATCACTGCATAGCAAGGTCC ************************************************************************************************************************ ATCTGAATCTACCAGGCCAAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTT ATCTGAATCTACCAGGCCAAATGTAAAGGGGACTTTCTCCTCTGTGGTGAGGCTAAATACCACCTTTCCCAGGATGACGGCACCGCTGTTCACCCCAGGCTGAAGCGTACTCAGTGGCTT ************************************************************************************************************************ GAGCTCCAGGGTCACTTTCTGCCCAGAGGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCC GAGCTCCAGGGTCACTTTCTGCCCAGAGGCTGACTGATAGTGCCCATCACTGCAAGGGCCTAGATGGGCTGGGCGCAAGCTTCCCAGCATGCTCTGCAGCTTTTTGGTCTTCACCTTTCC ************************************************************************************************************************ 
CTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAATTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTC CTTACTCTCAAGGAGGCTGGTTAATCTATCCAGGAATTCCAGAAGTTGTTGCTCTCGTTGCCGGGGCTCTGGCCAGGCAGGGTCCAGGGCTGCAGCCCGAGAGAAGCCCTCCAGGGCCTC ************************************************************************************************************************ CCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATGAAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGAT CCCATAACTCTCTTCATATTTATGCAACGTCGCCCTGTTCAGATGAAGGTCAGGATTGCTAGAAGCTTTTCTGTCAACTTTCTCTGCTTGGGCATAGGCACTGAGGGCTTGCTGGGAGAT ************************************************************************************************************************ CTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAATATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCAT CTTAGGGTTCTGGCCAGTAGAGAAGTAAAGGGAAAGATATGAATTCCCAAGAATATACCAGGAGCGGCCATCATGGACATCCATCTGAACAGCCAACTTAGCCTGTCGGACACTGTCCAT ************************************************************************************************************************ 219 120 120 240 240 360 360 480 480 600 600 720 720 840 840 seq Full seq Full seq Full seq Full seq Full GACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTTTTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGC GACATGGTGAGAATGTTCATCTTCAGTGTCAGTCCGCAGCTGACGAAGCACCATTGACAGGTTTTGCAAGGAGACTTTGTTCCTGCAATGGGTGAGGGCTCCTGAGAAGCAGGTGTGGGC ************************************************************************************************************************ AGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAGCTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGG AGCTGCAACATCCCCTTTTTTCCAGTACACCTCACCCAGCTGGTTCCAGGCTTCCACCAGCTCGGGCTCCAGCTTCACAGCCTTTGACAGAAGCTCCTCAGCCTTAGGGCTATAGTCAGG ************************************************************************************************************************ 
AGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTGTAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTT AGTCACATTTAGTGCTTTCCCAGTTAGCATTAGAACTTGTGCCTTGCCCTGGACAGAACCCACTACTTCTTCCATCTGCTGTAGGGTTTTCTCCATCTCCTTCTGCACATCCTGTTGCTT ************************************************************************************************************************ CCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGATCGGCTTGACTTCTTCCTCTTCATCAGCCAT CCTCCCAGCATCCTCAACACTATGTGTCTCGAAATAGCAGTCTCGAAATGAGTAGAGCTGATCCACGAGTTCCTGCAATTTCTGCAAGATCGGCTTGACTTCTTCCTCTTCATCAGCCAT ************************************************************************************************************************ CAT 1323 CAT 1323 *** 960 960 1080 1080 1200 1200 1320 1320 (3) hSTRAP (1-219)TPR 1-3 Sequencing data Sequence given from GATC aGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCAT TCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCA CTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCA GCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGAT AGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCA CTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGC CTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCACAGGACgGGTGTGGTCGCCATGATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGACTGGGc gGCGGCCAAAGCGGTCGGACAGTGCTCCGagaACgGGTgcgcATAGaAATTgcaTCAACGCATATAGCgCTAGCAGcacgccaTaGTGACTGGCGatGCtgtnngAATGGACGa hSTRAP(1-219)TPR1-3 sequence 
AGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCAT TCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCA CTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCA GCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACCGATACCGAAGATGAACATAGCCATCATGTTATGGAT AGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCA CTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCC Alignment: SEQ F3 AGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTT 120 AGCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTT 120 220 SEQ F3 SEQ F3 SEQ F3 SEQ F3 SEQ F3 ************************************************************************************************************************ GAAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTG GAAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTG ************************************************************************************************************************ ATGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAA ATGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAA ************************************************************************************************************************ GTGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACC 
GTGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTACC ************************************************************************************************************************ GATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGC GATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATATTCTGGGTAATAGCTATCTGAGC ************************************************************************************************************************ CTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCCGGCTGCTAACAAAGCCCGAAAGGA CTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAATGATGAGGATCC-----------------------************************************************************************************************ 240 240 360 360 480 480 600 600 720 696 (4) hSTRAP(220-440)TPR4-6 sequencing data Sequence given by GATC AGCGGCCTGGTGCCGCGCGGCAGCCATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCA CTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGC CTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTG GGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGT GATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGT AGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACC CCTTGGGGCCTCTAA hSTRAP(220-440)TPR4-6 sequence 
CATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCG CGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCG TGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACC ACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCG AATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACC GTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC Alignment: SEQ L3 AGCGGCCTGGTGCCGCGCGGCAGCCATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGT 120 ------------------------CATATGGCAAGCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGT 96 ************************************************************************************************ 221 SEQ L3 SEQ L3 SEQ L3 SEQ L3 SEQ L3 GCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTG GCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGAAAACCAAAAAACTG ************************************************************************************************************************ CAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCT CAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCT ************************************************************************************************************************ GGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTG 
GGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTG ************************************************************************************************************************ TATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGT TATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGT ************************************************************************************************************************ GTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGC GTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC--------------********************************************************************************************************* 240 216 360 336 480 456 600 576 720 681 (5) hSTRAP (1-150)TPR 1-2 sequencing data Sequence given by GATC gcgGCCTgGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATT CTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCAC TGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAG CAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCT GAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCACAGGACGGGTGTGGTCGCCAT GATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGACTGGGCGGCGGCCAAAGCGGTCGGACAGTGCTCCGAGAACGGGTGCGCATAGAAATTGCATCAACGCATATAGCGCTAGCAGCAC GCCATAGTGACTGGCGATGCTGTCGGAATGGACGATATCCCGCAAGAGGCCCGGCAGTACCGGCATAACCAAGCCTATGCCTACAGCATCCAGGGTGACGGTGCCGAGGATGACGATGAGCgCATTGTT 
AGATTTCanaCacGGTGCCTgACTGCGTTAGCAATTTAACTGTgataAACTAccGCATTaAAGCTTATCGATGataAGcTgtcAAACATgaaaATTCTTGAanacGaAAGGGCctcgtg hSTRAP(1-150)TPR1-2 sequence GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTGAAACCCATT CTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGATGCTGACCGGTAAAGCAC TGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAGTGTATTGGAAAAAAGGTGATGTTGCAG CAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAATAAGGATCC Alignment: SEQ F2 SEQ GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 120 GCGGCCTGGTGCCGCGCGGCAGCCATATGATGGCCGATGAAGAAGAAGAAGTTAAACCGATTCTGCAGAAACTGCAGGAACTGGTTGATCAGCTGTATAGCTTTCGCGATTGCTATTTTG 120 *********************************************************************************************************************** AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA 240 222 f2 SEQ f2 SEQ f2 SEQ f2 AAACCCATTCTGTTGAAGATGCAGGTCGTAAACAGCAGGATGTTCGCAAAGAAATGGAAAAAACCCTGCAGCAGATGGAAGAAGTTGTTGGTAGCGTTCAGGGTAAAGCACAGGTTCTGA ************************************************************************************************************************ TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG TGCTGACCGGTAAAGCACTGAATGTTACACCGGATTATAGCCCGAAAGCAGAAGAACTGCTGTCTAAAGCAGTTAAACTGGAACCGGAACTGGTGGAAGCATGGAATCAGCTGGGTGAAG ************************************************************************************************************************ TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAAT 
TGTATTGGAAAAAAGGTGATGTTGCAGCAGCACATACCTGTTTTAGCGGTGCACTGACCCATTGTCGTAATAAAGTTAGCCTGCAGAATCTGAGCATGGTTCTGCGTCAGCTGCGTTAAT ************************************************************************************************************************ AAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAG 540 AAGGATCC---------------------------------------------------- 488 ******** 240 360 360 480 480 (6) hSTRAP (151-284)TPR 3-4 sequencing data Sequence given by GATC gCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATA TTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGC ATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAAT TTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTT GGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCACAGGACGGGTGTGGTCGCCATGATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGAC TGGGCGGCGGCCAAAGCGGTCGGACAGTGCTCCGAGAACGGGTGCGCATAGAAATTGCATCAACGCATATAGCGCTAGCAGCACGCCATAGTGACTGGCGATGCTGTCGGAATGGACGATATCCCGCAA GAGGCCCGGCAGTACCGGCATAACCAAGCCTATGCCTACAGCATCCAGGGTGACGGTGCCGAGGATGacGATGAGCGCATTGTTAGATTTCATACACGGTGCCTGACTGCGTTAGCAATTTAACTGTGA TaAACTACCGCATTAAAGCTTATCGATGATAAGCTGTCAAACATGanaATTCTTgaagacGaAAgGGCCTcGTGAtacGCCTATTTt hSTRAP(151-284)TPR3-4 sequence GCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTAGCTGGTATA TTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAAGCAGCAATCCGGATCTGC ATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGCGTCAGCGTGAACAACAATTACTGGAAT TTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCC Alignment: SEQ M2 SEQ M2 SEQ 
GCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTA GCGGCCTGGTGCCGCGCGGCAGCCATATGACCGATACCGAAGATGAACATAGCCATCATGTTATGGATAGCGTTCGTCAGGCAAAACTGGCCGTTCAGATGGATGTTCATGATGGTCGTA ************************************************************************************************************************ GCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAA GCTGGTATATTCTGGGTAATAGCTATCTGAGCCTGTATTTTAGCACCGGTCAGAATCCGAAAATTAGCCAGCAGGCACTGAGCGCATACGCACAGGCAGAAAAAGTTGATCGTAAAGCAA ************************************************************************************************************************ GCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGC 223 120 120 240 240 360 M2 SEQ M2 GCAGCAATCCGGATCTGCATCTGAATCGTGCAACCCTGCATAAATATGAAGAAAGCTATGGTGAAGCACTGGAAGGTTTTAGCCGTGCAGCAGCACTGGACCCTGCATGGCCTGAACCGC 360 ************************************************************************************************************************ GTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCT 480 GTCAGCGTGAACAACAATTACTGGAATTTCTGGATCGTCTGACCAGCCTGCTGGAAAGCAAAGGTAAAGTGTAATAAGGATCC------------------------------------- 443 *********************************************************************************** (7) hSTRAP (285-440) TPR 5-6 Sequence given by GATC gCGGCCTGGTGCCGCGCGGCAGCCATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCgtgTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTA CCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAgAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTG ATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATA 
GCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCT AACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATATCCA CAGGACGGGTGTGGTCGCCATGATCGCGTAGTCGATAGTGGCTCCAAGTAGCGAAGCGAGCAGGACTGGGCGGCGGCCAAAGCGGTCGGACAGTGCTCCGAGAACGGGTGCGCATAGAAATTGCATCAA CGCATATAGCGCTAGCAGCACGCCATAGTGACTGGCGATGCTGTCGGAATGGACGATATCCCGCAAGangcCCGGCAGTACCGGCATAACCAAGCCTATGCCTAnnGCATCCAgGGTGACGGTGCcann gATGACgATgaacGCATTGTTAgatTTCAtannnGgtgCCTgaCTGcgTTaGCAATTTAACTgtgataAACTACcgcATTAAAGCTTATCGaTgataagctnnca hSTRAP(285-440)TPR5-6 sequence CATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTCAGAAAGTTACCCTGGAACTGAAACCGCTGTCT ACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCTTTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATG GTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATCGTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAA ACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAAGCCGTCCGCAGTGTGAATAATAAGGATCC Alignment: SEQ E2 SEQ E2 SEQ E2 SEQ E2 SEQ E2 GCGGCCTGGTGCCGCGCGGCAGCCATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTC -----------------------CATATGAAAACCAAAAAACTGCAGAGCATGCTGGGTAGCCTGCGTCCGGCACATCTGGGTCCGTGTTCTGATGGTCATTATCAGAGCGCAAGCGGTC ************************************************************************************************* AGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCT AGAAAGTTACCCTGGAACTGAAACCGCTGTCTACCCTGCAGCCTGGTGTTAATAGCGGTGCAGTGATTCTGGGTAAAGTTGTTTTTAGCCTGACCACCGAAGAAAAAGTTCCGTTTACCT ************************************************************************************************************************ 
SEQ TTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATC 360
E2  TTGGTCTGGTTGATTCTGATGGTCCGTGTTATGCCGTGATGGTGTATAATATTGTTCAGAGCTGGGGTGTTCTGATTGGTGATAGCGTTGCAATTCCGGAACCGAATCTGCGTCTGCATC 337
    ************************************************************************************************************************
SEQ GTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAA 480
E2  GTATTCAGCATAAAGGTAAAGATTATAGCTTTAGCAGCGTTCGTGTTGAAACACCGCTGCTGCTGGTTGTTAATGGTAAACCGCAGGGTAGCAGCAGCCAGGCAGTTGCAACCGTTGCAA 457
    ************************************************************************************************************************
SEQ GCCGTCCGCAGTGTGAATAATAAGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAG 540
E2  GCCGTCCGCAGTGTGAATAATAAGGATCC------------------------------- 486
    *****************************
End positions of the first two alignment blocks: SEQ 120, E2 97; SEQ 240, E2 217.

Table 7.1. Mass spectrometry peptide data

hSTRAP interacting partner | hSTRAP protein variant | Peptide sequence | Mascot ion score
Myosin 9 | His-hSTRAP(1-440) | AGVLAHLEEER | 65.7
Myosin 9 | His-hSTRAP(1-440) | ANLQIDQINTDLNLER | 87.5
Myosin 9 | His-hSTRAP(1-440) | ASITALEAK | 59.2
Myosin 9 | His-hSTRAP(1-440) | DELADEIANSSGK | 66.3
Myosin 9 | His-hSTRAP(1-440) | ELEDATETADAMNR | 85.8
Myosin 9 | His-hSTRAP(1-440) | IAEFTTNLTEEEEK | 55.9
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETK | 80.3
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETKER | 95.3
Myosin 9 | His-hSTRAP(1-440) | KANLQIDQINTDLNLER | 83.1
Myosin 9 | His-hSTRAP(1-440) | KQELEEICHDLEAR | 64.7
Myosin 9 | His-hSTRAP(1-440) | KVEAQLQELQVK | 67.7
Myosin 9 | His-hSTRAP(1-440) | LQVELDNVTGLLSQSDSK | 65.3
Myosin 9 | His-hSTRAP(1-440) | QLEEAEEEAQR | 74.9
Myosin 9 | His-hSTRAP(1-440) | QLLQANPILEAFGNAK | 55.3
Myosin 9 | His-hSTRAP(1-440) | ELEDATETADAMNR | 64.1
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETKER | 82.6
Myosin 9 | His-hSTRAP(1-440) | LQVELDNVTGLLSQSDSK | 80.2
Myosin 9 | His-hSTRAP(1-440) | TVGQLYKEQLAK | 53
Myosin 9 | His-hSTRAP(1-440) | VISGVLQLGNIVFK | 70.9
Myosin 9 | His-hSTRAP(1-440) | ALEEAMEQKAELER | 71.5
Myosin 9 | His-hSTRAP(1-440) | ANLQIDQINTDLNLER | 85.5
Myosin 9 | His-hSTRAP(1-440) | ELEDATETADAMNR | 57.2
Myosin 9 | His-hSTRAP(1-440) | IAEFTTNLTEEEEK | 88.2
Myosin 9 | His-hSTRAP(1-440) | IAQLEEQLDNETKER | 77.8
Myosin 9 | His-hSTRAP(1-440) | KVEAQLQELQVK | 65.6
Myosin 9 | His-hSTRAP(1-440) | NTDQASMPDNTAAQK | 66.9
Myosin 9 | His-hSTRAP(1-440) | QLLQANPILEAFGNAK | 87.9
Myosin 9 | His-hSTRAP(1-440) | VIQYLAYVASSHK | 76.1
Myosin 9 | His-hSTRAP(285-440) | ALEEAMEQKAELER | 60.3
Myosin 9 | His-hSTRAP(285-440) | ANLQIDQINTDLNLER | 95.6
Myosin 9 | His-hSTRAP(285-440) | DELADEIANSSGK | 65.6
Myosin 9 | His-hSTRAP(285-440) | EEILAQAKENEK | 55.3
Myosin 9 | His-hSTRAP(285-440) | ELEDATETADAMNR | 66.9
Myosin 9 | His-hSTRAP(285-440) | EQADFAIEALAK | 57.7
Myosin 9 | His-hSTRAP(285-440) | HSQAVEELAEQLEQTKR | 51.1
Myosin 9 | His-hSTRAP(285-440) | IAEFTTNLTEEEEK | 97.7
Myosin 9 | His-hSTRAP(285-440) | IAQLEEELEEEQGNTELINDR | 54.4
Myosin 9 | His-hSTRAP(285-440) | IAQLEEQLDNETK | 62.1
Myosin 9 | His-hSTRAP(285-440) | KQELEEICHDLEAR | 79.7
Myosin 9 | His-hSTRAP(285-440) | ANLQIDQINTDLNLER | 105
Myosin 9 | His-hSTRAP(285-440) | IAQLEEQLDNETK | 66.6
Myosin 9 | His-hSTRAP(285-440) | KFDQLLAEEK | 60.7
Myosin 9 | His-hSTRAP(285-440) | NTDQASMPDNTAAQK | 57.3
Myosin 9 | His-hSTRAP(285-440) | VIQYLAYVASSHK | 58.6
Myosin 9 | His-hSTRAP(285-440) | AGVLAHLEEER | 62.1
Myosin 9 | His-hSTRAP(285-440) | ALELDSNLYR | 67.6
Myosin 9 | His-hSTRAP(285-440) | ANLQIDQINTDLNLER | 92.2
Myosin 9 | His-hSTRAP(285-440) | ASITALEAK | 62
Myosin 9 | His-hSTRAP(285-440) | HSQAVEELAEQLEQTKR | 65.4
Myosin 9 | His-hSTRAP(285-440) | IAQLEEQLDNETKER | 75.1
Myosin 9 | His-hSTRAP(285-440) | KANLQIDQINTDLNLER | 60.8
Myosin 9 | His-hSTRAP(285-440) | KFDQLLAEEK | 59.7
Myosin 9 | His-hSTRAP(285-440) | NTDQASMPDNTAAQK | 83
Myosin 9 | His-hSTRAP(285-440) | QLLQANPILEAFGNAK | 69.4
Myosin 9 | His-hSTRAP(285-440) | RQLEEAEEEAQR | 56
Myosin 9 | His-hSTRAP(285-440) | TRLQQELDDLLVDLDHQR | 67.6
Myosin 9 | His-hSTRAP(285-440) | VISGVLQLGNIVFKK | 78.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | GST-hSTRAP(1-440) | GVVDSEDLPLNISR | 79.9
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | GST-hSTRAP(1-440) | TLTLVDTGIGMTK | 84.4
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | ELISNASDALDK | 64.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | GVVDSEDLPLNISR | 77.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | HFSVEGQLEFR | 65.3
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | NPDDITQEEYGEFYK | 73.3
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | NPDDITQEEYGEFYK | 69.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | TTPSVVAFTADGER | 71.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | ADLINNLGTIAK | 63.7
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | ELISNASDALDKIR | 67
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | GVVDSEDLPLNISR | 77.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | NPDDITQEEYGEFYK | 71.2
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | SIYYITGESKEQVANSAFVER | 60.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(1-440) | SLTNDWEDHLAVK | 65.9
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | ELISNASDALDKIR | 72.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | GVVDSEDLPLNISR | 76.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | NPDDITQEEYGEFYK | 78.8
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | ELISNASDALDK | 67.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | NPDDITQEEYGEFYK | 64.6
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | SLTNDWEDHLAVK | 73.5
cDNA FLJ53619, highly similar to Heat shock protein HSP 90-beta | His-hSTRAP(285-440) | TLTLVDTGIGMTK | 73.3
Triosephosphate isomerase | GST-hSTRAP(1-440) | HVFGESDELIGQK | 61.2
Triosephosphate isomerase | GST-hSTRAP(1-440) | IIYGGSVTGATCK | 70.5
Triosephosphate isomerase | GST-hSTRAP(1-440) | QSLGELIGTLNAAK | 80.9
Triosephosphate isomerase | GST-hSTRAP(1-440) | SNVSDAVAQSTR | 74.4
Triosephosphate isomerase | GST-hSTRAP(1-440) | VPADTEVVCAPPTAYIDFAR | 71.7
Triosephosphate isomerase | GST-hSTRAP(1-440) | IIYGGSVTGATCK | 74.6
Triosephosphate isomerase | GST-hSTRAP(1-440) | SNVSDAVAQSTR | 67.7
Triosephosphate isomerase | GST-hSTRAP(1-440) | KQSLGELIGTLNAAK | 68.4
Triosephosphate isomerase | GST-hSTRAP(1-440) | QSLGELIGTLNAAK | 66.8
Triosephosphate isomerase | GST-hSTRAP(1-440) | SNVSDAVAQSTR | 71
Triosephosphate isomerase | GST-hSTRAP(1-440) | VVLAYEPVWAIGTGK | 82.3
Triosephosphate isomerase | His-hSTRAP(1-219) | IIYGGSVTGATCK | 71.7
Triosephosphate isomerase | His-hSTRAP(1-219) | SNVSDAVAQSTR | 79.1
Triosephosphate isomerase | His-hSTRAP(1-219) | VVLAYEPVWAIGTGK | 61.4
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(1-440) | AGENVGVLLR | 57
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(1-440) | GITINTSHVEYDTPTR | 73.9
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(1-440) | TTLTAAITTVLAK | 76
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AFDQIDNAPEEKAR | 64.9
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ALEGDAEWEAK | 64.1
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ELLSQYDFPGDDTPIVR | 62.2
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | FESEVYILSK | 54.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | FESEVYILSKDEGGR | 59.6
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GITINTSHVEYDTPTR | 79.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ILELAGFLDSYIPEPER | 57
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TKPHVNVGTIGHVDHGK | 66.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TTLTAAITTVLAK | 81.2
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | VGEEVEIVGIK | 67.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AFDQIDNAPEEK | 58.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AFDQIDNAPEEKAR | 74.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | AGENVGVLLR | 59.4
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ALEGDAEWEAK | 64.9
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GITINTSHVEYDTPTR | 71.8
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GSALKALEGDAEWEAK | 65.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TTLTAAITTVLAK | 72.8
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | VGEEVEIVGIKETQK | 61.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ALEGDAEWEAK | 67.5
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | ELLSQYDFPGDDTPIVR | 71.1
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | FESEVYILSK | 59.2
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | GITINTSHVEYDTPTR | 76.3
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | TTLTAAITTVLAK | 88.7
Elongation factor Tu 1 (Escherichia coli) | His-hSTRAP(285-440) | VGEEVEIVGIK | 67.3
L-lactate dehydrogenase | GST-hSTRAP(1-440) | VIGSGCNLDSAR | 60.7
L-lactate dehydrogenase | GST-hSTRAP(1-440) | VTLTSEEEAR | 62.8
L-lactate dehydrogenase | GST-hSTRAP(1-440) | IVSGKDYNVTANSK | 74.2
L-lactate dehydrogenase | GST-hSTRAP(1-440) | LLIVSNPVDILTYVAWK | 111
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | ATIDGLENMNSPAMVAAK | 67.4
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | VFMQPASEGTGIIAGGAMR | 88.8
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | AYGSTNPINVVR | 54
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-440) | VFMQPASEGTGIIAGGAMR | 74.1
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | AVLEVAGVHNVLAK | 61.4
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | AYGSTNPINVVR | 63.6
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | IFSFTALTVVGDGNGR | 66.3
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(1-219) | VFMQPASEGTGIIAGGAMR | 77.2
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | ATIDGLENMNSPAMVAAK | 73
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | AVLEVAGVHNVLAK | 68.7
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | VFMQPASEGTGIIAGGAMR | 86.4
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | AVLEVAGVHNVLAK | 70.9
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | IFSFTALTVVGDGNGR | 105
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | AVLEVAGVHNVLAK | 64.8
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | IFSFTALTVVGDGNGR | 68.6
30S ribosomal protein S5 (Escherichia coli) | His-hSTRAP(151-284) | VFMQPASEGTGIIAGGAMR | 82
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | AIAELGIYPAVDPLDSTSR | 60.1
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | TIAMDGTEGLVR | 61
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | TVLIMELINNVAK | 83.8
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | LVLEVAQHLGESTVR | 74.7
ATP synthase subunit beta, mitochondrial | His-hSTRAP(285-440) | VLDSGAPIKIPVGPETLGR | 58.2
Phosphoglycerate kinase | His-hSTRAP(1-440) | AHSSMVGVNLPQK | 46.8
Phosphoglycerate kinase | His-hSTRAP(1-440) | LGDVYVNDAFGTAHR | 73.2
Phosphoglycerate kinase | His-hSTRAP(1-440) | LGDVYVNDAFGTAHR | 82.5
Phosphoglycerate kinase | His-hSTRAP(1-440) | VLNNMEIGTSLFDEEGAK | 61.1
Phosphoglycerate kinase | His-hSTRAP(285-440) | GCITIIGGGDTATCCAK | 79.2
Phosphoglycerate kinase | His-hSTRAP(285-440) | LGDVYVNDAFGTAHR | 101
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | AQFEGIVTDLIR | 61.4
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | QAVTNPNNTFYATK | 69.5
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | QATKDAGQISGLNVLR | 49.8
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | GST-hSTRAP(1-440) | VLENAEGAR | 66.3
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(1-440) | QATKDAGQISGLNVLR | 56
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(1-440) | TTPSVVAFTADGER | 71.5
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | ETGVDLTKDNMALQR | 54.1
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | MKETAENYLGHTAK | 62
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | TTPSVVAFTADGER | 66.1
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | DAGQISGLNVLR | 67.9
cDNA FLJ51907, highly similar to Stress-70 protein, mitochondrial | His-hSTRAP(285-440) | TTPSVVAFTADGER | 61.8
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | LAGTQPLEVLEAVQR | 58.9
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | LDQPMTEIVSR | 63.5
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | LQTSSVLVSGLR | 80.6
Ubiquitin-like modifier-activating enzyme 1 | GST-hSTRAP(1-440) | NEEDAAELVALAQAVNAR | 79.2
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | AAVATFLQSVQVPEFTPK | 66.1
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | NGSEADIDEGLYSR | 79.7
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | ALPAVQQNNLDEDLIR | 60.3
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(1-440) | NGSEADIDEGLYSR | 63.2
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | ALPAVQQNNLDEDLIR | 69.3
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | LAGTQPLEVLEAVQR | 62.3
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | NEEDAAELVALAQAVNAR | 62
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | AAVATFLQSVQVPEFTPK | 69.6
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | LAGTQPLEVLEAVQR | 70.7
Ubiquitin-like modifier-activating enzyme 1 | His-hSTRAP(285-440) | NEEDAAELVALAQAVNAR | 81.8
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) | His-hSTRAP(1-440) | DLVVSLAYQVR | 59.5
Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase
(Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Peptidyl-prolyl cis-trans isomerase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Ribose-phosphate pyrophosphokinase (Escherichia coli) Cell wall structural complex MreBCD, actin-like component MreB (E. coli) Cell wall structural complex MreBCD, actin-like component MreB (E. coli) Cell wall structural complex MreBCD, actin-like component MreB (E.coli) Cell wall structural complex MreBCD, actin-like component MreB (E.coli) Cell wall structural complex MreBCD, actin-like component MreB (E.coli) Cell wall structural complex MreBCD, actin-like component MreB (E.coli) Cell wall structural complex MreBCD, actin-like component MreB (E. coli) Cell wall structural complex MreBCD, actin-like component MreB (E. coli) Cell wall structural complex MreBCD, actin-like component MreB (E. coli) Cell wall structural complex MreBCD, actin-like component MreB (E. coli) Cell wall structural complex MreBCD, actin-like component MreB (E. coli) Cell wall structural complex MreBCD, actin-like component MreB (E. 
coli) Filamin B Filamin B Filamin B Filamin B His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) 230 DVFMGVDELQVGMR FNVEVVAIR DVFMGVDELQVGMR VAKDLVVSLAYQVR DLVVSLAYQVR DVFMGVDELQVGMR FNVEVVAIR VPKDVFMGVDELQVGMR FNVEVVAIR VPKDVFMGVDELQVGMR FSDGEVSVQINENVR ISNEESISAMFEH LFAGNATPELAQR TLTLSGMLAEAIR VVADFLSSVGVDR ANVSQVMHIIGDVAGR FSDGEVSVQINENVR LFAGNATPELAQR TLTLSGMLAEAIR VFAYATHPIFSGNAANNLR GMVLTGGGALLR IGGDRFDEAIINYVR GMVLTGGGALLR IKHEIGSAYPGDEVR RNYGSLIGEATAER GMVLTGGGALLR GQGIVLNEPSVVAIR IGGDRFDEAIINYVR IKHEIGSAYPGDEVR NYGSLIGEATAER RNYGSLIGEATAER SVAAVGHDAKQMLGR LVSPGSANETSSILVESVTR LGSAADFLLDISETDLSSLTASIK IGNLQTDLSDGLR AGPGTLSVTIEGPSK 71.4 72.5 53.6 59.8 65.9 73.5 66.8 98.6 60.7 58.6 80.7 91.8 66.7 67.2 76.9 64 74.5 58.1 75.7 57.5 66.4 57.2 66.4 68.8 57.9 62.4 58 70.2 78.5 76.1 50.3 56.1 109 68 40 50 Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Filamin B Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) 
His-hSTRAP(1-219) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) 231 GAGIGGLGITVEGPSESK LVSPGSANETSSILVESVTR VLSEDEEDVDFDIIHNANDTFTVK IGNLQTDLSDGLR AGPGTLSVTIEGPSK SPFTVGVAAPLDLSK SPFEVQVGPEAGMQK GAGIGGLGITVEGPSESK DAGEGLLAVQITDQEGKPK LVSPGSANETSSILVESVTR LVSPGSANETSSILVESVTR GLHVVEVTYDDVPIPNSPFK IGNLQTDLSDGLR AGPGTLSVTIEGPSK DAGEGLLAVQITDQEGKPK IGNLQTDLSDGLR LVSPGSANETSSILVESVTR GLHVVEVTYDDVPIPNSPFK AGTLTVEELGATLTSLLAQAQAQAR AVPVWDVLASGYVSR LAAVDVSAR LSVEEAVAAGVVGGEIQEK SQREGQGEGETQEAAAAAAAAR MSIYQAMWK AVPVWDVLASGYVSR LAAELSATLEQAAATAR AVPVWDVLASGYVSGAAR LSVEEAVAAGVVGGEIQEK VSAWELINSEYFSEGR SLEGGNFIAGVLIQGTQER LLEAQIATGGVIDPVHSHR LLEAQIATGGIIDPVHSHR EELLAEFGSGTLDLPALTR LLEAQVASGFLVDPLNNQR LGLLDTQTSQVLTAVDKDNK GFFDPNTHENLTYVQLLR 40 40 33 49 31 35 42 42 34 40 55 39 33 24 22 22 55 39 78.8 93.8 61.2 102 54.8 50 83 65 36 59 81 67 86 58 62 58 60 71 Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin Epiplakin GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) 
His-hSTRAP(1-219) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) 232 LLEIITTTIEETETQNQGIK GFFDPNTHENLTYLQLLER AVTGYTDPYTGQQISLFQAMQK VALALLEAQAATGTIMDPHSPESLSVDEAVR AGTLTVEELGATLTSLLAQAQAQAR AVPVWDVLASGYVSR AVPVWDVLASGYVSR SLEGGNFIAGVLIQGTQER LLEAQIATGGVIDPVHSHR LLEAQIATGGIIDPVHSHR EELLAEFGSGTLDLPALTR GFFDPNTHENLTYVQLLR LLEIITTTIEETETQNQGIK GFFDPNTHENLTYLQLLER AVTGYTDPYTGQQISLFQAMQK LSVEEAVAAGVVGGEIQEK QVSASELHTSGILGPETLR LLEAQIATGGVIDPVHSHR EELLAEFGSGTLDLPALTR LLEIITTTIEETETQNQGIK ALQQGLVGLELK AVPVWDVLASGYVSR LAAELSATLEQAAATAR AVPVWDVLASGYVSGAAR LSVEEAVAAGVVGGEIQEK VSAWELINSEYFSEGR SLEGGNFIAGVLIQGTQER QVSASELHTSGILGPETLR EELLAEFGSGTLDLPALTR LLEAQVASGFLVDPLNNQR GFFDPNTHENLTYVQLLR LLEIITTTIEETETQNQGIK GFFDPNTHENLTYLQLLER AVTGYTDPYTGQQISLFQAMQK SMGGAVSAAELLEVGILDEQAVQGLR LSVEEAVAAGVVGGEIQEK 44 66 70 59 50 66 71 53 84 35 72 67 48 59 31 80 37 68 57 39 41 77 32 33 70 69 67 65 69 41 77 52 65 79 63 61 Epiplakin Epiplakin Epiplakin Epiplakin Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) 
His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(151-284) His-hSTRAP(151-284) 233 QVSASELHTSGILGPETLR LTAIIEEAEEAPGARPQLQDAWR AVPVWDVLASGYVSR LSVEEAVAAGVVGGEIQEK LPQLPITNFSR YGGPYHIGGSPFK FVPAEMGTHTVSVK FADQHVPGSPFSVK AEAGVPAEFSIWTR SPFSVAVSPSLDLSK VTYTPMAPGSYLISIK TFSVWYVPEVTGTHK QMQLENVSVALEFLDR IPEISIQDMTAQVTSPSGK LVSNHSLHETSSVFVDSLTK YTPVQQGPVGVNVTYGGDPIPK GAGSYTIMVLFADQATPTSPIR SADFVVEAIGDDVGTLGFSVEGPSQAK IANLQTDLSDGLR AFGPGLQGGSAGSPAR GAGTGGLGLAVEGPSEAK VTAQGPGLEPSGNIANK VANPSGNLTETYVQDR AWGPGLEGGVVGK IANLQTDLSDGLR AYGPGIEPTGNMVK ANLPQSFQVDTSK SPFSVAVSPSLDLSK GAGTGGLGLAVEGPSEAK EGPYSISVLYGDEEVPR DAGEGLLAVQITDPEGKPK QMQLENVSVALEFLDRESIK GLVEPVDVVDNADGTQTVNYVPSR SADFVVEAIGDDVGTLGFSVEGPSQAK LTVSSLQESGLK GKLDVQFSGLTK 24 25 67.6 69.3 41 30 44 23 30 43 27 26 39 51 60 44 47 57 60 50 36 35 34 38 52 23 62 24 27 39 39 23 23 23 30 34 Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Filamin A Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) 
His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) His-hSTRAP(151-284) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) 234 YGGDEIPFSPYR IANLQTDLSDGLR ANLPQSFQVDTSK FADQHVPGSPFSVK SPFSVAVSPSLDLSK EGPYSISVLYGDEEVPR YTPVQQGPVGVNVTYGGDPIPK GLVEPVDVVDNADGTQTVNYVPSR AFGPGLQGGSAGSPAR VANPSGNLTETYVQDR YISPDQLADLYK LAQANGWGVMVSHR VVIGMDVAASEFFR AAVPSGASTGIYEALELR LAMQEFMILPVGAANFR FTASAGIQVVGDDLTVTNPK DYPVVSIEDPFDQDDWGAWQK DATNVGDEGGFAPNILENKEGLELLK VLITTDLLAR GFKDQIYDIFQK.L MFVLDEADEMLSR LQMEAPHIIVGTPGR GIYAYGFEKPSAIQQR GIDVQQVSLVINYDLPTNR LNSNTQVVLLSATMPSDVLEVTK LQMEAPHIIVGTPGR GIYAYGFEKPSAIQQR GYDVIAQAQSGTGK MFVLDEADEMLSR LQMEAPHIIVGTPGR GIYAYGFEKPSAIQQR GIYAYGFEKPSAIQQR GIDVQQVSLVINYDLPTNR LNSNTQVVLLSATMPSDVLEVTK MFVLDEADEMLSR LQMEAPHIIVGTPGR 32 32 20 37 42 45 59 28 36 49 40 49 84 74 47 80 28 25 35 68 84 46 41 55 37 45 45 81 84 64 34 28 78 36 83 62 Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Eukaryotic initiation factor 4A-I Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation 
elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor Tu translation elongation factor, mitochondrial precursor His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) 235 LNSNTQVVLLSATMPSDVLEVTK MFVLDEADEMLSR GIYAYGFEKPSAIQQR DIETFYNTSIEEMPLNVADLI GIDVQQVSLVINYDLPTNR DIETFYNTSIEEMPLNVADLI. 
LNSNTQVVLLSATMPSDVLEVTK MFVLDEADEMLSR LQMEAPHIIVGTPGR GIYAYGFEKPSAIQQR DIETFYNTSIEEMPLNVADLI GYDVIAQAQSGTGK MFVLDEADEMLSR LQMEAPHIIVGTPGR.V TVVTGIEMFHK.S QIGVEHVVVYVNK LLDAVDTYIPVPAR GITINAAHVEYSTAAR TIGTGLVTNTLAMTEEEK DLEKPFLLPVEAVYSVPGR VEAQVYILSK AEAGDNLGALVR TVVTGIEMFHK LLDAVDTYIPVPAR GITINAAHVEYSTAAR K.ADAVQDSEMVELVELEIR K.LLDAVDTYIPVPAR R.GITINAAHVEYSTAAR R.TVVTGIEMFHK GITINAAHVEYSTAAR TIGTGLVTNTLAMTEEEK AEAGDNLGALVR TVVTGIEMFHK LLDAVDTYIPVPAR GITINAAHVEYSTAAR TIGTGLVTNTLAMTEEEK 24 84 21 24 82 12 22 84 74 27 33 56 79 55 32 43 72 70 61 45 56 44 47 38 69 53 70 53 44 70 46 76 45 42 68 63 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) 
His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) 236 QLLQANPILEAFGNAK LLLQVESLTTELSAER VIQYLAHVASSPK LMATLSNTNPSFVR QLLQANPILEAFGNAK INFDVAGYIVGANIETYLLEK LQQLFNHTMFVLEQEEYQR DLGEELEALRGELEDTLDSTNAQQELR AQAELENVSGALNEAESK ELEDVTESAESMNR FLTNGPSSSPGQER LAQAEEQLEQETR QLLQANPILEAFGNAK AQAELENVSGALNEAESK AQELQKVQELQQQSAR KFDQLLAEEK KQELELVVSELEAR LAQAEEQLEQETR LLLQVESLTTELSAER LQEELAASDR QLLQANPILEAFGNAK VAEQAANDLR AQAELENVSGALNEAESK EAQAALAEAQEDLESER GPSAGGGPGSGTSPQVEWTAR KFDQLLAEEK LAQAEEQLEQETR NTDQATMPDNTAAQK VAQLEEER AELSSLQTAR AQAELENVSGALNEAESK AQELQKVQELQQQSAR AQVTELEDELTAAEDAK EAQAALAEAQEDLESER ELEDVTESAESMNR EQADFALEALAK 75 70 29 45 63 27 26 59 93.3 87 56.1 57.9 87.9 93.2 53.1 59.7 82.7 61.3 91.5 60.7 69.4 78.6 86.9 79.9 93.9 60.7 63 55 56.5 67.2 78.7 88.8 104 78.4 87.7 57.7 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Myosin, heavy chain 14 isoform 1 Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) His-hSTRAP(285-440) GST-hSTRAP(1-440) 
GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) 237 FLTNGPSSSPGQER GPSAGGGPGSGTSPQVEWTAR HGQALGELAEQLEQAR KQELELVVSELEAR LAEFSSQAAEEEEKVK LKEVVLQVEEER LMATLSNTNPSFVR NTDQATMPDNTAAQK RLELQLQEVQGR RQEEEAGALEAGEEAR AVLVDLEPGTMDSVR GHYTEGAELVDSVLDVVRK ISEQFTAMFR LAVNMVPFPR HGRYLTVAAVFR ISEQFTAMFR ALTVPELTQQMFDAK IMNTFSVVPSPK ISEQFTAMFR LAVNMVPFPR LHFFMPGFAPLTSR HGRYLTVAAVFR ISEQFTAMFR EVDEQMLNVQNK GHYTEGAELVDSVLDVVR IMNTFSVVPSPK INVYYNEATGGKYVPR LAVNMVPFPR INVYYNEATGGKYVPR LAVNMVPFPR AVLVDLEPGTMDSVR ISEQFTAMFR MSATFIGNSTAIQELFKR ALTVPELTQQMFDAK ISEQFTAMFR LAVNMVPFPR 63.8 101 74.3 86.8 58.4 55.5 85 77.1 64 58.8 59.2 53.7 73.7 59.9 55 68.6 66.9 61.2 74.7 63.3 59.5 54.3 68.5 79.6 109 70 74.4 59 104 58.4 59.1 69.3 56.8 61.6 70.1 64 Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin Tubulin DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent protein kinase catalytic subunit DNA-dependent 
protein kinase catalytic subunit Actin Actin Actin Actin Actin Actin Actin His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) 238 YLTVAAVFR ISEQFTAMFR KLAVNMVPFPR HGRYLTVAAVFR ISEQFTAMFR YLTVAAVFR ISEQFTAMFR KLAVNMVPFPR YLTVAAVFR GHYTEGAELVDSVLDVVR ISEQFTAMFR SGPFGQIFRPDNFVFGQSGAGNNWAK SDPGLLTNTMDVFVK MDPMNIWDDIITNR TVGALQVLGTEAQSSLLK LLLQGEADQSLLTFIDK NLDLAVLELMQSSVDNTK STVLTPMFVETQASQGTLQTR HSSLITPLQAVAQR TVGALQVLGTEAQSSLLK LLLQGEADQSLLTFIDK DVLIQGLIDENPGLQLIIR LGASLAFNNIYR NLLIFENLIDLK HSSLITPLQAVAQR TVSLLDENNVSSYLSK TVGALQVLGTEAQSSLLK LLLQGEADQSLLTFIDK DVLIQGLIDENPGLQLIIR EITALAPSTMK HQGVMVGMGQK AVFPSIVGRPR IWHHTFYNELR QEYDESGPSIVHR SYELPDGQVITIGNER VAPEEHPVLLTEAPLNPK 62 60.8 61.8 61.8 69.3 61.5 68.8 71.4 61.4 97 69.4 72.4 45 26 39 80 53 34 22 40 46 34 29 36 35 49 62 68 51 51 36 35 38 43 76 40 Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin Actin GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) 
His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) His-hSTRAP(1-150) 239 DLYANTVLSGGTTMYPGIADR TTGIVMDSGDGVTHTVPIYEGYALPHAILR EITALAPSTMK SYELPDGQVITIGNER DLYANTVLSGGTTMYPGIADR SYELPDGQVITIGNER VAPEEHPVLLTEAPLNPK DLYANTVLSGGTTMYPGIADR SYELPDGQVITIGNER VAPEEHPVLLTEAPLNPK DLYANTVLSGGTTMYPGIADR SYELPDGQVITIGNER VAPEEHPVLLTEAPLNPK DLYANTVLSGGTTMYPGIADR TTGIVMDSGDGVTHTVPIYEGYALPHAILR EITALAPSTMK AVFPSIVGRPR IWHHTFYNELR SYELPDGQVITIGNER VAPEEHPVLLTEAPLNPK DLYANTVLSGGTTMYPGIADR AVFPSIVGRPR IWHHTFYNELR SYELPDGQVITIGNER VAPEEHPVLLTEAPLNPK DLYANTVLSGGTTMYPGIADR TTGIVMDSGDGVTHTVPIYEGYALPHAILR SYELPDGQVITIGNER VAPEEHPVLLTEAPLNPK DLYANTVLSGGTTMYPGIADR EITALAPSTMK HQGVMVGMGQK AVFPSIVGRPR IWHHTFYNELR QEYDESGPSIVHR SYELPDGQVITIGNER 70 64 46 72 69 69 76 71 69 62 66 63 59 66 32 46 38 41 72 31 69 61 47 56 64 65 111 64 36 55 46 50 31 37 39 69 Actin Actin Elongation factor 1 alpha 1 Elongation factor 1 alpha 1 Elongation factor 1 alpha 1 Elongation factor 1 alpha 1 Elongation factor 1 alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 Elongation factor 1-alpha 1 
His-hSTRAP(1-150) His-hSTRAP(1-150) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) GST-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-440) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) His-hSTRAP(1-219) 240 VAPEEHPVLLTEAPLNPK DLYANTVLSGGTTMYPGIADR IGGIGTVPVGR EHALLAYTLGVK YYVTIIDAPGHR THINIVVIGHVDSGK VETGVLKPGMVVTFAPVNVTTEVK LPLQDVYK IGGIGTVPVGR EHALLAYTLGVK YYVTIIDAPGHR THINIVVIGHVDSGK VETGVLKPGMVVTFAPVNVTTEVK LPLQDVYK IGGIGTVPVGR EHALLAYTLGVK K.YYVTIIDAPGHR VETGVLKPGMVVTFAPVNVTTEVK EHALLAYTLGVK IGGIGTVPVGR YYVTIIDAPGHR IGGIGTVPVGR YYVTIIDAPGHR EHALLAYTLGVK VETGVLKPGMVVTFAPVNVTTEVK LPLQDVYK IGGIGTVPVGR EHALLAYTLGVK YYVTIIDAPGHR THINIVVIGHVDSGK VETGVLKPGMVVTFAPVNVTTEVK IGGIGTVPVGR EHALLAYTLGVK. 
YYVTIIDAPGHR VETGVLKPGMVVTFAPVNVTTEVK LPLQDVYK 76 71 60 54 49 89 39 47 57 55 55 83 42 45 55 51 50 43 68 55 50 66 67 66 59 42 60 57 55 97 27 48 42 58 45 45

Protein | hSTRAP variant | Peptide | Score
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | IGGIGTVPVGR | 55
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | EHALLAYTLGVK | 68
Elongation factor 1-alpha 1 | His-hSTRAP(1-219) | YYVTIIDAPGHR | 50
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | EHALLAYTLGVK | 49
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | YYVTIIDAPGHR | 68
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | THINIVVIGHVDSGK | 80
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | VETGVLKPGMVVTFAPVNVTTEVK | 41
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | LPLQDVYK | 43
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | IGGIGTVPVGR | 52
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | EHALLAYTLGVK | 52
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | YYVTIIDAPGHR | 60
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | LPLQDVYK | 45
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | IGGIGTVPVGR | 66
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | EHALLAYTLGVK | 56
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | YYVTIIDAPGHR | 67
Elongation factor 1-alpha 1 | His-hSTRAP(1-150) | VETGVLKPGMVVTFAPVNVTTEVK | 34
Fatty acid synthase | GST-hSTRAP(1-440) | EVWALVQAGIR | 72
Fatty acid synthase | GST-hSTRAP(1-440) | FDASFFGVHPK | 44
Fatty acid synthase | GST-hSTRAP(1-440) | LQVVDQPLPVR | 40
Fatty acid synthase | GST-hSTRAP(1-440) | DNLEFFLAGIGR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | EGGFLLLHTLLR | 43
Fatty acid synthase | GST-hSTRAP(1-440) | VLQGDLVMNVYR | 61
Fatty acid synthase | GST-hSTRAP(1-440) | SLLVNPEGPTLMR | 33
Fatty acid synthase | GST-hSTRAP(1-440) | FPQLDSTSFANSR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | VVEVLAGHGHLYSR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | VVVQVLAEEPEAVLK | 61
Fatty acid synthase | GST-hSTRAP(1-440) | GILADEDSSRPVWLK | 37
Fatty acid synthase | GST-hSTRAP(1-440) | WTSQDSLLGMEFSGR | 51
Fatty acid synthase | GST-hSTRAP(1-440) | DTSFEQHVLWHTGGK | 34
Fatty acid synthase | GST-hSTRAP(1-440) | LPEDPLLSGLLDSPALK | 106
Fatty acid synthase | GST-hSTRAP(1-440) | QGVQVQVSTSNISSLEGAR | 31
Fatty acid synthase | GST-hSTRAP(1-440) | WLSTSIPEAQWHSSLAR | 39
Fatty acid synthase | GST-hSTRAP(1-440) | RPTPQDSPIFLPVDDTSFR | 29
Fatty acid synthase | GST-hSTRAP(1-440) | DTVTISGPQAPVFEFVEQLR | 68
Fatty acid synthase | GST-hSTRAP(1-440) | LLLEVTYEAIVDGGINPDSLR | 64
Fatty acid synthase | GST-hSTRAP(1-440) | DTVTISGPQAPVFEFVEQLRK | 63
Fatty acid synthase | GST-hSTRAP(1-440) | GYDYGPHFQGILEASLEGDSGR | 103
Fatty acid synthase | GST-hSTRAP(1-440) | NVTFHGVLLDAFFNESSADWR | 36
Fatty acid synthase | GST-hSTRAP(1-440) | SLYQSAGVAPESFEYIEAHGTGTK | 45
Fatty acid synthase | GST-hSTRAP(1-440) | HFLLEEDKPEEPTAHAFVSTLTR | 73
Fatty acid synthase | GST-hSTRAP(1-440) | LQVVDQPLPVR | 50
Fatty acid synthase | GST-hSTRAP(1-440) | VLQGDLVMNVYR | 52
Fatty acid synthase | GST-hSTRAP(1-440) | SLLVNPEGPTLMR | 62
Fatty acid synthase | GST-hSTRAP(1-440) | FPQLDSTSFANSR | 67
Fatty acid synthase | GST-hSTRAP(1-440) | EDGLAQQQTQLNLR | 60
Fatty acid synthase | GST-hSTRAP(1-440) | VVVQVLAEEPEAVLK | 47
Fatty acid synthase | GST-hSTRAP(1-440) | WTSQDSLLGMEFSGR | 48
Fatty acid synthase | GST-hSTRAP(1-440) | LPEDPLLSGLLDSPALK | 30
Fatty acid synthase | GST-hSTRAP(1-440) | SLYQSAGVAPESFEYIEAHGTGTK | 60
Fatty acid synthase | His-hSTRAP(1-219) | EGGFLLLHTLLR | 36
Fatty acid synthase | His-hSTRAP(1-219) | VVEVLAGHGHLYSR | 48
Fatty acid synthase | His-hSTRAP(1-219) | VVVQVLAEEPEAVLK | 50
Fatty acid synthase | His-hSTRAP(1-219) | WLSTSIPEAQWHSSLAR | 40
Fatty acid synthase | His-hSTRAP(1-219) | LHLSGIDANPNALFPPVEFPAPR | 30
Fatty acid synthase | His-hSTRAP(1-219) | LQVVDQPLPVR | 50
Fatty acid synthase | His-hSTRAP(1-219) | DNLEFFLAGIGR | 76
Fatty acid synthase | His-hSTRAP(1-219) | MVVPGLDGAQIPR | 40
Fatty acid synthase | His-hSTRAP(1-219) | GVDLVLNSLAEEK | 69
Fatty acid synthase | His-hSTRAP(1-219) | SLLVNPEGPTLMR | 35
Fatty acid synthase | His-hSTRAP(1-219) | FPQLDSTSFANSR | 60
Fatty acid synthase | His-hSTRAP(1-219) | EDGLAQQQTQLNLR | 48
Fatty acid synthase | His-hSTRAP(1-219) | VVVQVLAEEPEAVLK | 56
Fatty acid synthase | His-hSTRAP(1-219) | WTSQDSLLGMEFSGR | 82
Fatty acid synthase | His-hSTRAP(1-219) | DTSFEQHVLWHTGGK | 35
Fatty acid synthase | His-hSTRAP(1-219) | LPEDPLLSGLLDSPALK | 102
Fatty acid synthase | His-hSTRAP(1-219) | EQGVTFPSGDIQEQLIR | 29
Fatty acid synthase | His-hSTRAP(1-219) | QGVQVQVSTSNISSLEGAR | 61
Fatty acid synthase | His-hSTRAP(1-219) | LLLEVTYEAIVDGGINPDSLR | 42
Fatty acid synthase | His-hSTRAP(1-219) | DTVTISGPQAPVFEFVEQLRK | 26
Fatty acid synthase | His-hSTRAP(1-219) | MVVPGLDGAQIPRDPSQQELPR | 31
Fatty acid synthase | His-hSTRAP(1-219) | SLYQSAGVAPESFEYIEAHGTGTK | 90
Fatty acid synthase | His-hSTRAP(1-219) | HFLLEEDKPEEPTAHAFVSTLTR | 27
Fatty acid synthase | His-hSTRAP(151-284) | FDASFFGVHPK | 30
Fatty acid synthase | His-hSTRAP(151-284) | LQVVDQPLPVR | 50
Fatty acid synthase | His-hSTRAP(151-284) | DNLEFFLAGIGR | 67
Fatty acid synthase | His-hSTRAP(151-284) | EGGFLLLHTLLR | 37
Fatty acid synthase | His-hSTRAP(151-284) | DLVEAVAHILGIR | 48
Fatty acid synthase | His-hSTRAP(151-284) | FPQLDSTSFANSR | 46
Fatty acid synthase | His-hSTRAP(151-284) | VVEVLAGHGHLYSR | 55
Fatty acid synthase | His-hSTRAP(151-284) | VVVQVLAEEPEAVLK | 58
Fatty acid synthase | His-hSTRAP(151-284) | GILADEDSSRPVWLK | 42
Fatty acid synthase | His-hSTRAP(151-284) | DTSFEQHVLWHTGGK | 32
Fatty acid synthase | His-hSTRAP(151-284) | LPEDPLLSGLLDSPALK | 90
Fatty acid synthase | His-hSTRAP(151-284) | VTVAGGVHISGLHTESAPR | 58
Fatty acid synthase | His-hSTRAP(151-284) | WLSTSIPEAQWHSSLAR | 45
Fatty acid synthase | His-hSTRAP(151-284) | RPTPQDSPIFLPVDDTSFR | 40
Fatty acid synthase | His-hSTRAP(151-284) | IPGLLSPHPLLQLSYTATDR | 55
Fatty acid synthase | His-hSTRAP(151-284) | DTVTISGPQAPVFEFVEQLR | 33
Fatty acid synthase | His-hSTRAP(151-284) | LLLEVTYEAIVDGGINPDSLR | 71
Fatty acid synthase | His-hSTRAP(151-284) | DTVTISGPQAPVFEFVEQLRK | 79
Fatty acid synthase | His-hSTRAP(151-284) | GYDYGPHFQGILEASLEGDSGR | 99
Fatty acid synthase | His-hSTRAP(151-284) | NVTFHGVLLDAFFNESSADWR | 42
Fatty acid synthase | His-hSTRAP(151-284) | LHLSGIDANPNALFPPVEFPAPR | 72
Fatty acid synthase | His-hSTRAP(151-284) | SLYQSAGVAPESFEYIEAHGTGTK | 32
Fatty acid synthase | His-hSTRAP(151-284) | HFLLEEDKPEEPTAHAFVSTLTR | 95
Fatty acid synthase | His-hSTRAP(151-284) | ALGLGVEQLPVVFEDVVLHQATILPK | 42
Fatty acid synthase | His-hSTRAP(151-284) | LFDHPESPTPNPTEPLFLAQAEVYK | 41
Fatty acid synthase | His-hSTRAP(151-284) | FDASFFGVHPK | 38
Fatty acid synthase | His-hSTRAP(151-284) | LQVVDQPLPVR | 57
Fatty acid synthase | His-hSTRAP(151-284) | GVDLVLNSLAEEK | 49
Fatty acid synthase | His-hSTRAP(151-284) | SLLVNPEGPTLMR | 37
Fatty acid synthase | His-hSTRAP(151-284) | VVVQVLAEEPEAVLK | 62
Fatty acid synthase | His-hSTRAP(151-284) | GILADEDSSRPVWLK | 37
Fatty acid synthase | His-hSTRAP(151-284) | WTSQDSLLGMEFSGR | 76
Fatty acid synthase | His-hSTRAP(151-284) | DTSFEQHVLWHTGGK | 35
Fatty acid synthase | His-hSTRAP(151-284) | LPEDPLLSGLLDSPALK | 35
Fatty acid synthase | His-hSTRAP(151-284) | EQGVTFPSGDIQEQLIR | 51
Fatty acid synthase | His-hSTRAP(151-284) | QGVQVQVSTSNISSLEGAR | 67
Fatty acid synthase | His-hSTRAP(151-284) | GTHTGVWVGVSGSETSEALSR | 32
Fatty acid synthase | His-hSTRAP(151-284) | LLLEVTYEAIVDGGINPDSLR | 30
Fatty acid synthase | His-hSTRAP(151-284) | SDEAVKPFGLK | 43
Fatty acid synthase | His-hSTRAP(151-284) | LQVVDQPLPVR | 59
Fatty acid synthase | His-hSTRAP(151-284) | DNLEFFLAGIGR | 49
Fatty acid synthase | His-hSTRAP(151-284) | GVDLVLNSLAEEK | 68
Fatty acid synthase | His-hSTRAP(151-284) | VLQGDLVMNVYR | 66
Fatty acid synthase | His-hSTRAP(151-284) | SLLVNPEGPTLMR | 65
Fatty acid synthase | His-hSTRAP(151-284) | VVEVLAGHGHLYSR | 43
Fatty acid synthase | His-hSTRAP(151-284) | EDGLAQQQTQLNLR | 64
Fatty acid synthase | His-hSTRAP(151-284) | VVVQVLAEEPEAVLK | 60
Fatty acid synthase | His-hSTRAP(151-284) | SNMGHPEPASGLAALAK | 84
Fatty acid synthase | His-hSTRAP(151-284) | WTSQDSLLGMEFSGR | 76
Fatty acid synthase | His-hSTRAP(151-284) | DTSFEQHVLWHTGGK | 53
Fatty acid synthase | His-hSTRAP(151-284) | GNAGQSNYGFANSAMER | 62
Fatty acid synthase | His-hSTRAP(151-284) | LPEDPLLSGLLDSPALK | 98
Fatty acid synthase | His-hSTRAP(151-284) | EQGVTFPSGDIQEQLIR | 45
Fatty acid synthase | His-hSTRAP(151-284) | QGVQVQVSTSNISSLEGAR | 78
Fatty acid synthase | His-hSTRAP(151-284) | GYDYGPHFQGILEASLEGDSGR | 50
Alpha enolase | GST-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 88.7
Alpha enolase | GST-hSTRAP(1-440) | HIADLAGNSEVILPVPAFNVINGGSHAGNK | 66.4
Alpha enolase | GST-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 69.3
Alpha enolase | GST-hSTRAP(1-440) | VNQIGSVTESLQACK | 71.1
Alpha enolase | GST-hSTRAP(1-440) | YISPDQLADLYK | 66.9
Alpha enolase | GST-hSTRAP(1-440) | YISPDQLADLYK | 40
Alpha enolase | GST-hSTRAP(1-440) | LAQANGWGVMVSHR | 49
Alpha enolase | GST-hSTRAP(1-440) | VVIGMDVAASEFFR | 84
Alpha enolase | GST-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 74
Alpha enolase | GST-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 47
Alpha enolase | GST-hSTRAP(1-440) | FTASAGIQVVGDDLTVTNPK | 80
Alpha enolase | GST-hSTRAP(1-440) | YISPDQLADLYK | 45
Alpha enolase | GST-hSTRAP(1-440) | LAQANGWGVMVSHR | 70
Alpha enolase | GST-hSTRAP(1-440) | VVIGMDVAASEFFR | 112
Alpha enolase | GST-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 77
Alpha enolase | GST-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 56
Alpha enolase | GST-hSTRAP(1-440) | FTASAGIQVVGDDLTVTNPK | 86
Alpha enolase | GST-hSTRAP(1-440) | DYPVVSIEDPFDQDDWGAWQK | 69
Alpha enolase | GST-hSTRAP(1-440) | DATNVGDEGGFAPNILENKEGLELLK | 65
Alpha enolase | His-hSTRAP(1-440) | AAVPSGASTGIYEALELR | 91.3
Alpha enolase | His-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 63.5
Alpha enolase | His-hSTRAP(1-440) | LAQANGWGVMVSHR | 51.5
Alpha enolase | His-hSTRAP(1-440) | VNQIGSVTESLQACK | 95.7
Alpha enolase | His-hSTRAP(1-440) | YNQLLRIEEELGSK | 59.8
Alpha enolase | His-hSTRAP(1-440) | LAMQEFMILPVGAANFR | 53.4
Alpha enolase | His-hSTRAP(1-440) | VNQIGSVTESLQACK | 95.4
Alpha enolase | His-hSTRAP(1-219) | HIADLAGNSEVILPVPAFNVINGGSHAGNK | 64.9
Alpha enolase | His-hSTRAP(1-219) | TIAPALVSK | 30.1
Alpha enolase | His-hSTRAP(1-219) | VNQIGSVTESLQACK | 105
Alpha enolase | His-hSTRAP(1-219) | LMIEMDGTENK | 32
Alpha enolase | His-hSTRAP(1-219) | YISPDQLADLYK | 47
Alpha enolase | His-hSTRAP(1-219) | VVIGMDVAASEFFR | 113
Alpha enolase | His-hSTRAP(1-219) | LAMQEFMILPVGAANFR | 74
Alpha enolase | His-hSTRAP(1-219) | DATNVGDEGGFAPNILENK | 58
Alpha enolase | His-hSTRAP(1-219) | FTASAGIQVVGDDLTVTNPK | 77
Alpha enolase | His-hSTRAP(1-219) | GNPTVEVDLFTSK | 40
Alpha enolase | His-hSTRAP(1-219) | YISPDQLADLYK | 54
Alpha enolase | His-hSTRAP(1-219) | VVIGMDVAASEFFR | 70
Alpha enolase | His-hSTRAP(1-219) | AAVPSGASTGIYEALELR | 87
Alpha enolase | His-hSTRAP(1-219) | FTASAGIQVVGDDLTVTNPK | 71
Alpha enolase | His-hSTRAP(1-219) | DYPVVSIEDPFDQDDWGAWQK | 70
Alpha enolase | His-hSTRAP(1-219) | DATNVGDEGGFAPNILENKEGLELLK | 81
Alpha enolase | His-hSTRAP(1-150) | GNPTVEVDLFTSK | 77.2
Alpha enolase | His-hSTRAP(1-150) | LAMQEFMILPVGAANFR | 73
Alpha enolase | His-hSTRAP(1-150) | VNQIGSVTESLQACK | 66.7
Alpha enolase | His-hSTRAP(1-150) | IEEELGSKAK | 61.9
Alpha enolase | His-hSTRAP(1-150) | VNQIGSVTESLQACK | 88.9
Alpha enolase | His-hSTRAP(1-150) | AAVPSGASTGIYEALELR | 98.5
Alpha enolase | His-hSTRAP(1-150) | FTASAGIQVVGDDLTVTNPKR | 59.2
Alpha enolase | His-hSTRAP(1-150) | GNPTVEVDLFTSK | 81.4
Alpha enolase | His-hSTRAP(1-150) | HIADLAGNSEVILPVPAFNVINGGSHAGNK | 58.7
Alpha enolase | His-hSTRAP(1-150) | LAMQEFMILPVGAANFR | 67.8
Alpha enolase | His-hSTRAP(1-150) | YISPDQLADLYK | 63.3

This table lists all the peptide data associated with the mass spectrometry experiments carried out in this thesis (Table 3.6). Peptides highlighted in black, red and green correspond to the first (N=1), second (N=2) and third (N=3) experimental repeats, respectively, for the protein indicated in column 1 and the pull-down performed with the hSTRAP protein variant indicated in column 2.