University of Iowa
Iowa Research Online
Theses and Dissertations
2010

Insights into interactions between poly(ethylene glycol) and proteins from molecular dynamics simulations
Eli Dylan Musselman
University of Iowa

Copyright 2010 Eli Dylan Musselman
This thesis is available at Iowa Research Online: http://ir.uiowa.edu/etd/1164

Recommended Citation
Musselman, Eli Dylan. "Insights into interactions between poly(ethylene glycol) and proteins from molecular dynamics simulations." MS (Master of Science) thesis, University of Iowa, 2010. http://ir.uiowa.edu/etd/1164.

Follow this and additional works at: http://ir.uiowa.edu/etd
Part of the Biochemistry Commons

INSIGHTS INTO INTERACTIONS BETWEEN POLY(ETHYLENE GLYCOL) AND PROTEINS FROM MOLECULAR DYNAMICS SIMULATIONS
by Eli Dylan Musselman

A thesis submitted in partial fulfillment of the requirements for the Master of Science degree in Biochemistry in the Graduate College of The University of Iowa
July 2010
Thesis Supervisor: Associate Professor Adrian H. Elcock

Graduate College
The University of Iowa
Iowa City, Iowa

CERTIFICATE OF APPROVAL
MASTER'S THESIS
This is to certify that the Master's thesis of Eli Dylan Musselman has been approved by the Examining Committee for the thesis requirement for the Master of Science degree in Biochemistry at the July 2010 graduation.
Thesis Committee:
Peter A. Rubenstein, Thesis Chair
M. Todd Washington
Ernesto J. Fuentes

To Meggan

ACKNOWLEDGMENTS
There are several individuals I would like to thank for my graduate school experience. First, I would like to sincerely thank my advisor, Adrian Elcock, for being not only an excellent companion in our trip through the simulation of polymers, but also a friend.
From our detailed discussions and revising manuscripts to teaching the art and science of analyzing computer simulations, Adrian has always put his students first, contributing to their scientific and professional lives. I would also like to thank the members of the Elcock lab: Shun Zhu for always being behind me (literally), Andy Thomas for his forever jovial attitude, Tam Frembegen Kenser for her constant supply of chocolates and Sue Travis for her continuously engaging social opinions. I thank the members of my committee, Peter Rubenstein, Todd Washington and Ernie Fuentes, for their suggestions, personal time and commitment. Finally, I would like to thank my wife, Meggan Musselman, not only for putting up with a scientist, but for always keeping me on an even keel.

TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER
I. INTRODUCTION
    Macromolecular Crowding
        Crowding Agents
        Effects of Crowding
        The Use of PEG
    Overview of Thesis
II. METHODS
    Molecular Dynamics Simulation Details
    Setting PEG Parameters
    MD Systems Examined
    Determining Preferential Interactions
III. RESULTS
    Parameterization of Poly(ethylene glycol)
    Simulation of PEG and Various Proteins
        Choice of Proteins
        Structural Stability of the Systems
        Preferential Interactions between PEG and Proteins
        Preferential Interactions between PEG and Individual Residue Types
        Preferential Interactions between Ions and Proteins
        Preferential Interactions between Ions and Individual Residue Types
        Comparison to NMR Experimental Results
IV. DISCUSSION
    Protein – PEG Interactions
        Comparison With Experimental Second Virial Coefficients
        Atomic Level Details
    Protein – Ion Interactions
        A System View
        Ions in More Detail
    Future Directions
REFERENCES

LIST OF TABLES
Table
3.1. Residue-specific analysis of cyt c: simulation vs NMR

LIST OF FIGURES
Figure
1.1. Polymers used as macromolecular crowding agents
1.2. Similarities between dimethoxyethane and poly(ethylene glycol)
2.1. Conformer populations of DME in solution
2.2. RDF analysis of HEWL
3.1. Static and dynamic properties of PEG
3.2. Sample snapshots of HEWL with varying concentrations of PEG
3.3. RMSD analysis of all protein systems
3.4. Preferential interaction coefficient values for PEG – protein interactions
3.5. Preferential interaction coefficient values between PEG and charged residues
3.6. Preferential interaction coefficient values between PEG and amino acid residue groups
3.7. Preferential interaction coefficient values for protein – ion interactions
3.8. Preferential interaction coefficient values between chloride ions and charged residues
3.9. Two lysine residues of TEM coordinated with a chloride ion
3.10. Preferential interaction coefficient values between sodium ions and charged residues
3.11. Coordination of arginine residues of BLIP with positive ions
3.12. Preferential interaction coefficient values between chloride ions and amino acid residue groups
3.13. Preferential interaction coefficient values between sodium ions and amino acid residue groups
3.14. Coordination of a phenylalanine residue of cyt c with a sodium ion
4.1. Coordination of PEG around lysine residues in experiment and simulation

CHAPTER I
INTRODUCTION

Macromolecular Crowding
While a great deal of knowledge has been gathered about the chemical processes that occur within organisms from in vitro biochemistry studies, one aspect that is often overlooked is the environment in which these processes naturally occur.
Conventional biochemical experiments usually employ a low concentration (less than 1 mg/ml) of a single protein; however, in vivo, proteins are immersed in a cellular milieu that contains many biomolecules at a total concentration up to 400 times higher than that used in vitro12. The total concentration of biomolecules is sufficiently high that up to 40% of the volume of the cell is physically occupied and unavailable to other molecules3-4. These very high concentrations are thought to cause substantial differences between the thermodynamics and kinetics of macromolecular behavior observed in vivo and in vitro. These differences are usually termed ‘macromolecular crowding effects’, an umbrella term that describes all effects that occur when a system is crowded with additional macromolecules that are inert to the solutes of interest3,5-6; ‘inert’ in this case means that the only interaction between the crowder and the solute(s) is an excluded-volume or steric interaction. Under this definition, any other nonspecific interactions (e.g. electrostatic or hydrophobic), even though they may be important in vivo, would not normally be considered ‘macromolecular crowding effects’.

Crowding Agents. To mimic the effects of high macromolecular concentrations in vitro, a number of synthetic polymers have been used as ‘macromolecular crowding agents’. The four main examples are poly(ethylene glycol) (PEG), polyvinylpyrrolidone (PVP), dextran and Ficoll (Figure 1.1). PEG is a linear polymer that is used both as a crowding agent and to enhance protein crystallization7. PVP is a linear polymer, relatively new to crowding studies, which has been used recently because of its claimed inertness with protein surfaces8-9. Dextran is a mostly linear polymer composed of glucose monomers linked via α-(1,6)-D-glycosidic bonds (approximately 95%) with an occasional α-(1,3)-D-glycosidic linkage10-12.
Finally, Ficoll is a highly branched polymer of sucrose monomers that are cross-linked by epichlorohydrin; it is thought to take on a spherical shape13-16. All four crowders have been extensively studied, both experimentally and theoretically. In addition to their use in crowding studies (see below), some of these polymers have also been used in biotechnological applications; it has been shown, for example, that protein – PEG conjugates can stabilize proteins against degradation or immune responses17-18.

Effects of Crowding. The above macromolecular crowding agents have been shown to have a number of effects on proteins. Studies have shown that both PEG and dextran accelerate the polymerization of actin by increasing the association rates of actin monomers19. Ficoll has been shown to decrease the half-time for the self-assembly of the HIV capsid protein20, to increase the rate of fibrillation of α- and β-synuclein21-22, and to enhance the refolding activity of the GroEL chaperone1,23, an effect that can even be recapitulated in silico24. In addition to effects on association rates, macromolecular crowding agents have also been shown to increase protein stability; for example, the free energy of unfolding of FKBP was increased by the addition of Ficoll and dextran25-26. Finally, there is also increasing evidence that macromolecular crowding can have significant effects on solute diffusion, upon which processes such as metabolism, transport, cell motility and signaling rely27. The general explanation for these effects is size-dependent excluded volume: adding crowding agents decreases the volume available for the species to sample in solution, which favors association and compaction reactions28. Even though a great deal of experimental work has been accomplished and the general interest in interactions between proteins in crowded environments is high, very few studies have directly explored the interactions between polymers and proteins at an atomic level.
If macromolecular crowding agents are truly inert in terms of their interactions with proteins, then statistical mechanical methods can be used to determine their potential thermodynamic effects on protein-protein associations and protein folding3,5. While very simple models are sometimes capable of producing quantitative agreement with experiment29, these quantitative fits often result in unrealistic parameters. For example, the stability of cytochrome b562 under crowded conditions can be matched with experiment30, but only when the radius of the PEG 20,000 molecule is assumed to be unrealistically small (i.e. 20 Å versus > 40 Å). The above studies rely on a very basic model which treats polymers as hard spheres and only accounts for excluded volume interactions between the polymer and the protein1,31; the unrealistic fit parameters therefore probably reflect the fact that crowding agents, such as PEG and dextran, are not spheres and are better thought of as random-coil polymers32-33. Like the experimental studies above, these theoretical studies do not provide insight at the atomic level.

The Use of PEG. Besides its use in macromolecular crowding, PEG is used as a crystallizing agent in structural biology34. In both settings, PEG is often assumed to interact with proteins via excluded volume effects alone (also known as ‘depletion interactions’)6,31,35. Significant questions still remain, however, about how PEG interacts with proteins and with salts36. In particular, a number of studies have indicated that PEG is not a truly inert crowder and that it displays other interactions besides steric interactions6. In fact, as long ago as 1983, A.P. Minton mentioned the idea that the interaction between proteins and PEG is “more complex” than the excluded-volume interaction37. One quantity that can be used to characterize two-body interactions thermodynamically is the osmotic second virial coefficient (B22)38-40.
B22 is an experimentally measurable quantity (by methods such as light scattering and small-angle neutron scattering) related to osmotic compressibility: positive values indicate an osmotic pressure greater than that of an ideal solution, and negative values indicate an osmotic pressure less than that of an ideal solution39. It can be inferred that positive B22 values reflect repulsive interactions between the two measured species and negative B22 values reflect attractive interactions between them. It has been found that the second virial coefficient (B22) for the PEG – lysozyme interaction is lower (i.e. more favorable) than would be expected if the interaction were purely steric41-42. Such studies do not, however, provide the atomic level detail needed to determine the origins of the favorable interaction between protein and PEG.

Two experimental studies have tried to look deeper than the system level, providing atomic and near-atomic level detail of the interactions between protein and PEG. One brief study, which analyzed six protein crystal structures determined under conditions where PEG was used as a crystallizing agent, provided anecdotal views of protein – PEG interactions at an atomic level43. The crystallographic structures showed coordination between PEG and protein and between PEG and salts; however, the author was careful to point out that care must be taken when interpreting the electron density of PEG43. In a second, more detailed study, chemical shift changes in the backbone NH resonances of cytochrome c were measured as a function of increasing concentrations of PEG44. Interestingly, the largest chemical shift changes found on cytochrome c mapped to the interface of the cytochrome c – cytochrome c-peroxidase complex. We are currently unaware of any other experimental studies that have explicitly probed protein – polymer interactions at a similar level of detail.
Since the above studies all hint at an attractive interaction between PEG and proteins, PEG makes an excellent candidate to study possible specific interactions of protein – polymer systems at the atomic level. Such studies would be beneficial in obtaining a more thorough understanding of PEG-protein interactions and might help to explain PEG's effects on protein behavior when used as a macromolecular crowding agent, and possibly also its effectiveness as a protein crystallization agent.

Overview of Thesis
The focus of this study is on PEG – protein interactions. It is based on two concepts: (a) that an atomic level of detail will be required if the true effects of polymers on protein behavior are to be completely elucidated and (b) that in place of direct experimental advances, the best way to approach this problem is through the use of molecular simulation techniques. The use of molecular dynamics (MD) simulations allows the physicochemical nature of molecular interactions to be observed at a level of resolution that is currently beyond reach experimentally45-47. Modeling the aqueous solution explicitly – meaning that all water molecules and ions, along with the protein and PEG, are present in the system – allows for the simultaneous study of interactions between protein, PEG, water and ions all treated on the same level. PEG is a water-soluble, straight chain polymer composed of repeating ethylene oxide monomeric units (CH2 – CH2 – O) (Figure 1.1). Previous molecular modeling studies applied parameters derived for dimethoxyethane (DME) to PEG, as both molecules share the ethylene oxide monomer (Figure 1.2)48-49. As outlined in this thesis, we have followed this same approach by modeling PEG with parameters derived for DME. We have performed MD simulations of PEG molecules in an aqueous environment; direct comparisons with experimental data have then been performed to ensure that the observed behaviors of PEG are indeed realistic.
Large-scale MD simulations with a variety of proteins in aqueous polymer solutions were then performed and used to obtain atomic-level views of protein – PEG interactions. A total of six different protein systems were studied. Each protein was sampled in 4 different concentrations of PEG for 500 ns, giving a total of 12 µs of simulation time for the protein solutions. As detailed in Results, we conclude that PEG does indeed undergo favorable interactions with proteins. The simulations are the first to provide an atomic level view of interactions between proteins and a polymer widely used in studies of crowding effects.

Figure 1.1. Polymers used as macromolecular crowding agents. Shown here are the chemical descriptions of four widely used polymers in macromolecular crowding studies. (a) Poly(ethylene glycol), a straight chain polymer made of repeating ethylene oxide subunits. (b) Polyvinylpyrrolidone, another straight chain polymer composed of N-vinylpyrrolidone subunits. (c) Dextran, a predominantly linear polymer made of glucose monomers. (d) Ficoll, a highly branched polymer of sucrose monomers cross-linked by epichlorohydrin, which is thought to be primarily spherical in shape; the three preferred sites of attachment of epichlorohydrin are marked by arrows50. Images adapted from Sigma-Aldrich51.

Figure 1.2. Similarities between dimethoxyethane and poly(ethylene glycol). United-atom representation of dimethoxyethane (DME) with all dihedrals trans, top, and poly(ethylene glycol) (PEG) with 4 monomeric units and all dihedrals trans, bottom, to highlight similarities between the C-C-O subunits. All atoms are represented in the CPK scheme.

CHAPTER II
METHODS

Molecular Dynamics Simulation Details
All molecular dynamics (MD) simulations were carried out using GROMACS 4 software52.
In each simulated system, water was modeled with the TIP3P model53, and proteins were modeled with the OPLS-AA forcefield54 implemented in GROMACS; each system also included 100 mM NaCl, with literature parameters used to describe the ions5556, at a system pH of 7.0. Na+ and Cl- were used because they were the predominant ions used in the experimental studies41,44. The system size of each simulation was dependent on the specific volume of each protein, determined by HYDROPRO57; the final volume of the box was adjusted so that each system contained 100, 200 or 300 mg/mL PEG 3350 with 6, 12 and 18 PEG 3350 molecules, respectively. All simulations used a dodecahedral box with an image distance (the distance from the center of one system to the next) of approximately 80 Å, with periodic boundary conditions applied. All systems were first energy minimized using the steepest descent algorithm for 1000 steps and then incrementally heated to 298 K over 500 ps in the NPT ensemble; during this initial period, a pressure of 1 atm was maintained with the Berendsen barostat58, and temperature was maintained with the Berendsen thermostat58. Directly following, each system was equilibrated for a further 10.5 ns in the NPT ensemble, switching to the canonical coupling schemes of the Parrinello-Rahman barostat59 at 1 atm pressure and the Nosé-Hoover thermostat60-61 at 298 K. Each production simulation lasted 500 ns in the canonical coupling scheme. Cutoffs for both the short-range electrostatic and van der Waals interactions were set to 10 Å, and the long-range electrostatic interactions were treated with the Particle Mesh Ewald (PME) method62. A 2.5 fs time step was employed with all covalent bonds constrained using the LINCS algorithm63. All atomic coordinates (i.e. system ‘snapshots’) were saved at 1 ps intervals for subsequent analysis.
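As a rough consistency check on the numbers quoted above, the following sketch estimates the solution volume needed to hold a given number of PEG 3350 molecules at a target mass concentration, and the corresponding image distance of a rhombic dodecahedral box. It is only an illustration under stated assumptions: the molar mass is taken as the nominal 3350 g/mol, the box is assumed to be a rhombic dodecahedron with volume (√2/2)d³, the volume occupied by the protein is ignored, and the function names are hypothetical rather than taken from the thesis software.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole
A3_PER_ML = 1.0e24        # cubic angstroms in one millilitre (1 cm^3)


def box_volume_A3(n_molecules, molar_mass_g_mol, conc_mg_per_ml):
    """Volume (in cubic angstroms) that holds n_molecules at the given mass concentration.

    Assumption: the concentration is defined over the whole box volume,
    i.e. any volume taken up by a protein is ignored.
    """
    mass_g = n_molecules * molar_mass_g_mol / AVOGADRO
    volume_ml = mass_g / (conc_mg_per_ml * 1.0e-3)  # mg/mL -> g/mL
    return volume_ml * A3_PER_ML


def dodecahedron_image_distance_A(volume_A3):
    """Image distance d of a rhombic dodecahedron, from V = (sqrt(2)/2) * d**3."""
    return (volume_A3 * math.sqrt(2.0)) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Nominal molar mass of PEG 3350 (assumption: 3350 g/mol).
    for n_peg, conc in [(6, 100.0), (12, 200.0), (18, 300.0)]:
        volume = box_volume_A3(n_peg, 3350.0, conc)
        distance = dodecahedron_image_distance_A(volume)
        print(n_peg, conc, round(volume), round(distance, 1))
```

Under these assumptions the three PEG loadings correspond to the same box volume, and the implied image distance comes out slightly below 80 Å, which is in line with the "approximately 80 Å" quoted for the simulations; protein-containing boxes would be somewhat larger.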
Setting PEG Parameters
Previously, several groups48-49,64 modeled PEG using parameters derived for dimethoxyethane (DME), as both molecules share a common backbone of C – O – C and C – C – O bonds. These previous DME parameters were optimized to reproduce the experimentally observed backbone torsional conformations in aqueous solution. We chose to use the Fischer et al. TraPPE-UA parameters48 as they had previously been used in GROMACS. To test the implementation of the parameters, several 100 ns MD simulations of different concentrations of DME (0.3, 0.6 and 1 mole fraction) in TIP3P water were set up, using the same equilibration scheme as above, with the intention of determining whether the principal conformers of DME observed using Raman spectroscopy65 could be matched. The parameters did indeed reproduce the experimentally observed conformational populations of DME (Figure 2.1) for all solutions. The same DME parameters were therefore applied to our PEG model (see below). We treated the hydrogen atoms at the beginning and end of the chain like methyl/methylene groups, with a +0.25 electric charge and the equivalent non-bonded parameters for a hydroxyl hydrogen from the TraPPE-UA forcefield66. Lorentz-Berthelot mixing rules67 were applied to any atom that interacted with a TraPPE-UA forcefield atom; otherwise, standard OPLS mixing rules68 were used for the interactions between OPLS-AA forcefield atoms, water and salts.

MD Systems Examined
This study involved performing simulations of PEG 3350, alone in aqueous solution and with several proteins at various concentrations of PEG. The PEG 3350 model was composed of 76 connected monomers of ethylene oxide. First, simulations with polymer only were performed in 100 mM NaCl at 4 different concentrations of PEG 3350: one molecule of PEG at a nominal concentration of 16.67 mg/mL, 6 molecules of PEG at 100 mg/mL, 12 molecules of PEG at 200 mg/mL and 18 molecules of PEG at 300 mg/mL.
The 1-molecule system was simulated for 200 ns; all other systems were simulated for 500 ns. Six different proteins were simulated; each was simulated alone in aqueous solution and with three different concentrations of PEG 3350 (100, 200 and 300 mg/mL, again corresponding to 6, 12 and 18 molecules of PEG 3350, respectively). All protein structures were obtained from the Protein Data Bank69. The entry codes for the proteins are as follows: Hen Egg White Lysozyme (HEWL): 1HEL70, Bovine α-Lactalbumin (BLA): 1HFZ71, Saccharomyces cerevisiae Cytochrome c (cyt c): 1YCC72, and the β-lactamase TEM-1 – β-lactamase TEM-1 inhibitor protein complex (TEM-BLIP): 1S0W73. In order to simulate TEM and BLIP separately, the TEM-BLIP complex was divided into separate molecules. At the time our simulations were begun, the structure of apo BLIP had not been deposited in the PDB, although one had been reported in the literature74. Recently, from the data collected in 199474, a structure for apo BLIP was solved by the Strynadka group and deposited in the PDB75; this, however, was too late for the current simulation study. For purposes of comparison with NMR experimental data, a mutation, C102T44,76-77, was incorporated into the structure of reduced cyt c using the side-chain modeling program SCWRL478. Since the OPLS-AA force field does not contain parameters for the heme cofactor, AMBER force field parameters for heme79 were ported to GROMACS using ffAMBER80 and amb2gmx.pl81. While it is usually a bad idea to mix parameter sets in this way, the results in this case appear to justify the approach (see Results).

Determining Preferential Interactions
One method to quantify the interactions of cosolutes with proteins is by use of the preferential interaction coefficient82-83.
In the context of proteins, the preferential interaction or binding coefficient, Γ, is a thermodynamic quantity that describes the relative surplus or deficit of cosolutes within a certain distance from the surface of a protein compared to the bulk or system environment82. As far as we are aware, the preferential interaction formalism was first applied to computational simulations by Baynes and Trout83. The preferential interaction coefficient for PEG is given by:

\[ \Gamma_{\mathrm{PEG}} = \left\langle \mathrm{PEG}_{\mathrm{local}} - \mathrm{OW}_{\mathrm{local}} \left( \frac{\mathrm{PEG}_{\mathrm{bulk}}}{\mathrm{OW}_{\mathrm{bulk}}} \right) \right\rangle \tag{1} \]

where PEGlocal and OWlocal are, respectively, the number of atoms of PEG and water within a specified distance of the surface of the protein and PEGbulk or OWbulk are the total number of PEG or water atoms outside the immediate surrounding area of the protein, as previously specified. The angled brackets indicate an ensemble average: Γ is therefore to be calculated as an average over many simulation snapshots. The same approach can be used to find the preferential interaction coefficient of any cosolute, such as sodium (equation 2) and chloride (equation 3), which would be written as follows:

\[ \Gamma_{\mathrm{Na^{+}}} = \left\langle \mathrm{Na^{+}}_{\mathrm{local}} - \mathrm{OW}_{\mathrm{local}} \left( \frac{\mathrm{Na^{+}}_{\mathrm{bulk}}}{\mathrm{OW}_{\mathrm{bulk}}} \right) \right\rangle \tag{2} \]

\[ \Gamma_{\mathrm{Cl^{-}}} = \left\langle \mathrm{Cl^{-}}_{\mathrm{local}} - \mathrm{OW}_{\mathrm{local}} \left( \frac{\mathrm{Cl^{-}}_{\mathrm{bulk}}}{\mathrm{OW}_{\mathrm{bulk}}} \right) \right\rangle \tag{3} \]

Again, Na+local and Cl-local are the number of sodium and chloride ions within a specified distance of the surface of the protein, and Na+bulk and Cl-bulk are the total number of ions within the whole system. When there is an excess of cosolute in the local environment of the protein, as compared to the bulk solution, the preferential interaction coefficient, Γ, is positive. Alternatively, when there is a deficit of cosolute in the local environment of the protein, as compared to the bulk solution, the preferential interaction coefficient, Γ, is negative. A Γ value of zero indicates that there is no preference for the solute of interest. To determine the preferential interaction coefficient, one needs to define a separation distance between the local and bulk domains. This cutoff was determined separately for each type of cosolute (i.e.
PEG, Na+ and Cl-) from the corresponding radial distribution function (RDF) of the cosolute and the protein67. The distance at which the RDF reached a minimum or an inflection point after its first peak was chosen as the cutoff distance. Figure 2.2 shows the RDFs for PEG and protein, from which the cutoff was determined to be 5.0 Å from the nearest atom of the protein (the entire calculation of PEG Γ values was repeated with cutoffs of 6 and 7 Å from the protein and gave qualitatively identical results; data not shown). Figure 2.2 also shows the RDFs for Na+ and protein, and Cl- and protein. The cutoff for Na+ and protein was set to 3.0 Å, the first global minimum for Na+ and protein. The cutoff for Cl- and protein was set to 3.5 Å, the first global minimum for Cl- and protein. Separate analyses were performed to determine the Γ values for protein – PEG and protein – ion interactions. Additional calculations were performed to obtain Γ values for individual residues and residue types interacting with PEG and ions. The following groups of amino acid types were tested for preferential interaction coefficients: (a) positive residues (arginine and lysine), (b) negative residues (aspartate and glutamate), (c) aromatic residues (phenylalanine, tryptophan, and tyrosine), (d) aliphatic residues (alanine, isoleucine, leucine, proline and valine) and (e) polar residues (asparagine, glutamine, histidine, serine and threonine). The charged amino acids (arginine, lysine, aspartate and glutamate) were also examined individually. All calculations of preferential interaction coefficients were conducted using software written in-house by myself together with Dr. Adrian H. Elcock.

Figure 2.1. Conformer populations of DME in solution. Calibration of the DME force field parameters48 to match Raman spectroscopy measurements of principal DME conformers at three different mole fractions of DME (1, 0.6, & 0.3)65.
DME contains three key dihedral angles, each of which can in principle adopt trans (T) or gauche (G) conformations; GG' is used to denote a pair of gauche dihedral angles of opposite signs. The three dihedral angles are listed in the order C-O-C-C, O-C-C-O, and C-C-O-C.

Figure 2.2. RDF analysis of HEWL. Cutoffs for the calculation of the preferential interaction coefficient (denoted by the dashed vertical lines) were determined from radial distribution functions (RDF) of lysozyme under various PEG concentrations for: (a) PEG, (b) sodium and (c) chloride.

CHAPTER III
RESULTS

Parameterization of Poly(ethylene glycol)
As described in the Methods, poly(ethylene glycol) (PEG) parameters were set using the Fischer et al. TraPPE-UA modified parameters48 for dimethoxyethane (DME). DME was used because it shares the same continuous backbone monomer, ethylene oxide, as PEG (C – C – O). Qualitatively, the principal conformers of DME in aqueous solutions at mole fractions of 0.3, 0.6 and 1.0 were in good agreement with previous experimental results (Figure 2.1). As done by others48-49, DME parameters were then applied to our PEG model and used throughout the rest of the simulations. To confirm that the model of PEG 3350 performed properly in solution, other physical aspects of PEG were examined. The radius of gyration of PEG 3350 was determined from four MD simulations at various concentrations in TIP3P water with 100 mM NaCl. These data were then compared with neutron scattering results reported by the Hjelm group for the radius of gyration of PEG 3400 (this has, on average, one more connected ethylene oxide monomer than PEG 3350, i.e. 77 versus 76, respectively). According to the Hjelm group's analysis, PEG 3400 has a radius of gyration of 19.1 ± 1.0 Å84. In a previous computational study, the radius of gyration of a very similar molecule, poly(ethylene oxide) – with an identical number of monomers to our PEG model (76) – was determined to be 20.4 ± 0.8 Å85.
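For reference, the radius of gyration compared here is conventionally computed per snapshot as the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below shows that standard definition only; it is not the analysis code used in this work, the function name is hypothetical, and for a united-atom PEG model the masses would be those of the CH2 and O sites (an assumption about how the input is prepared).

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one snapshot.

    coords: sequence of (x, y, z) positions in angstroms
    masses: sequence of site masses, same length as coords
    """
    total_mass = sum(masses)
    # Center of mass, one Cartesian component at a time.
    com = [
        sum(m * r[k] for m, r in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    msd = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    ) / total_mass
    return math.sqrt(msd)
```

Averaging this quantity over all snapshots of a trajectory, and over all PEG molecules in the box, gives a mean value of the kind that is compared with the scattering measurement.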
In our simulations, the average radius of gyration of PEG 3350 over all concentrations was 19.4 ± 0.3 Å, which is in very good agreement with these previous experimental and computational results. As shown in a histogram (Figure 3.1a), the radius of gyration of PEG 3350 does not change with increasing PEG concentration. Interestingly, although the mean radius of gyration is unchanged, the decay rate of the end-to-end autocorrelation function decreases with increasing concentration of PEG (Figure 3.1b). This indicates that although static quantities are unaffected by increasing PEG concentration, dynamic quantities are likely to be significantly affected.

Simulation of PEG and Various Proteins

Choice of Proteins. The protein systems were chosen because of the availability of experimental data such as second virial coefficients41-42, diffusion coefficients7, protein-protein association kinetics and thermodynamics7,32 and NMR chemical shifts44. Bovine α-lactalbumin (BLA), however, was chosen because it is an anionic analog of the positively charged lysozyme and therefore makes it possible to compare the effects of net charge on protein behavior with PEG. The six proteins have a range of net charges (from -8 to +8). All six proteins were simulated at four different PEG concentrations (0, 100, 200 and 300 mg/mL PEG 3350) with 100 mM NaCl in TIP3P water. Each condition was simulated for 500 ns of production runtime, giving a total of 12 µs of simulation. Figure 3.2 shows a snapshot of HEWL at the four different concentrations of PEG 3350.

Structural Stability of the Systems. For all protein systems, root-mean-square deviations (RMSD) of the backbone atoms (C, Cα, N) from the initial crystallographic structure were calculated throughout the time course, as shown in Figure 3.3.
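A backbone RMSD of this kind is usually computed after optimally superimposing each frame onto the reference structure. The sketch below assumes the Kabsch algorithm for the superposition; whether the thesis analysis used this particular fitting procedure is not stated, and the function name and test coordinates are hypothetical. The inputs are expected to be arrays holding only the selected backbone atoms, in the same order for both structures.

```python
import numpy as np

def kabsch_rmsd(mobile, reference):
    """RMSD between two (N, 3) coordinate sets after optimal rigid-body fit."""
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(reference, dtype=float)
    # Remove translation by centering both coordinate sets
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_fit = P @ R.T
    return float(np.sqrt(np.mean(np.sum((P_fit - Q) ** 2, axis=1))))

# Toy check (hypothetical data): a rotated and translated copy of the
# reference should give an RMSD of essentially zero.
ref = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 1]], dtype=float)
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = ref @ rot_z.T + np.array([5.0, -3.0, 2.0])
rmsd_identical = kabsch_rmsd(moved, ref)
print(f"RMSD of rigidly moved copy: {rmsd_identical:.2e}")
```

Applying the function to each trajectory frame against the crystallographic coordinates gives an RMSD time course of the kind plotted in Figure 3.3.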
To ensure that the RMSDs were not unduly influenced by unfolded (frayed) segments at the N and C termini, these RMSD calculations excluded terminal residues that were not assigned to secondary structure in the PDB file (e.g. residues 1 through 3 might be excluded). All of the RMSDs seem reasonable: structures with backbone RMSDs under 4 Å are usually considered to be 'roughly stable'86. The one possible exception is BLIP, which has an average RMSD of 3.75 Å (Figure 3.3). The probable explanation for this is that the BLIP structure was actually excised from the structure of the TEM-BLIP complex and so does not represent a true 'unbound' structure; however, further MD simulations starting from the recently deposited unbound BLIP structure would need to be performed to substantiate this idea. Interestingly, the RMSDs of all proteins in 300 mg/mL PEG 3350 are either lower than or equal to those in solutions without PEG 3350. PEG 3350 therefore not only appears not to unfold the proteins, but in three of the six cases (HEWL, BLIP, and TEM-BLIP) it actually decreased the RMSD of the protein backbone.

Preferential Interactions between PEG and Proteins. As explained in the Methods, the preferential interaction coefficient, Γ, describes the preference of a molecule for one of two environments, the local and the bulk. Interestingly, every protein at 100 and 200 mg/mL PEG 3350 shows a net attraction to PEG (a positive Γ), as shown in Figure 3.4. In only two cases at 300 mg/mL PEG 3350 is the preferential interaction coefficient less than zero (TEM-BLIP and BLA), indicating a repulsive interaction. Looking at the six proteins as a whole, one can see that the preferential interaction coefficient of PEG decreases as the concentration of PEG increases.
The single exception to this trend is provided by the TEM-BLIP complex. The most obvious potential explanation is that the interactions between PEG and the TEM-BLIP complex were insufficiently sampled because the complex is so much larger than the other proteins; sampling problems can be a limitation in MD simulations47. For this reason, the TEM-BLIP complex was excluded from further analysis.

Preferential Interactions between PEG and Individual Residue Types. Figure 3.5 displays the preferential interaction coefficients between PEG 3350 and the charged amino acids of the five proteins. For the negatively charged residues aspartate and glutamate, the PEG preferential interaction coefficients are negative at essentially all PEG concentrations (Figure 3.5a, b); this indicates that, on the whole, PEG prefers not to associate with negatively charged amino acids. As was the case with the PEG-protein interactions, the preferential interaction coefficients of the negative residues all decrease with increasing PEG concentration. Compared to the negatively charged amino acids, the preferential interaction coefficients for the positively charged amino acid residues, arginine and lysine, are much more complicated to interpret (Figure 3.5c, d). Different proteins can show very different values for positively charged residues, both in magnitude and in sign. A case in point is arginine, where the values are extremely negative for BLA and extremely positive for HEWL; these differences may well be connected with the fact that the net charge of BLA is -5, while that of HEWL is +8. Thus, it is important to keep in mind that trends can also be affected by the net charge of the protein itself. Lysine, on the other hand, generally has positive preferential interaction coefficients with PEG.
Interestingly, with both arginine and lysine, the general trend of decreasing preferential interaction coefficient with increasing concentration of PEG is less pronounced. In addition to the preferential interactions between charged amino acids and PEG 3350, preferential interactions of five different groups of amino acids were examined. The groups were aliphatic, aromatic, polar, positive, and negative, as shown in Figure 3.6 (the residues in each group are listed in the Methods). On average, all groups except the negative amino acids show a positive preferential interaction with PEG. Again, the preferential interaction coefficient decreases with increasing concentration of PEG 3350. The scale of the Γ values should, however, be noted, as the interactions with charged residues are 3 to 5 times stronger than those with aliphatic, aromatic or polar residues. The aromatic group of residues is the only group to have Γ values that do not always decrease with increasing concentrations of PEG; however, this is probably again due to sampling issues: there are comparatively few aromatic residues in the six proteins (notice the larger error bars in Figure 3.6b).

Preferential Interactions between Ions and Proteins. Next, the preferential interaction coefficients between the proteins and the ions in solution, chloride and sodium, were calculated (Figure 3.7). Interestingly, in most cases, increasing the concentration of PEG actually increases the preference for ions over water, in stark contrast to the PEG – protein relationship. As one might expect, the preferential interaction coefficient between chloride ions and protein is more favorable for those proteins that have a positive net charge, as can be seen in Figure 3.7a. The opposite is true with sodium ions: as the net charge on the protein becomes more positive, the preferential interaction coefficient decreases (Figure 3.7b).

Preferential Interactions between Ions and Individual Residue Types.
As done previously for the protein – PEG interactions, the preferential interaction coefficients between ions and charged amino acid residues were examined. Figure 3.8 shows the results for interactions with Cl– ions. In Figures 3.8a & b, negative amino acid residues are shown to have negative preferential interaction coefficients with chloride ions. The absolute values are clearly much smaller than those for the entire protein: this is a consequence of the form of the equation (Equation 1), which gives larger effects when more atoms are contained within the binding domain. Figures 3.8c & d show that positive residues have a positive preferential interaction coefficient with chloride ions. Again the effects are very slight compared to the whole protein, on the order of 10 times smaller (compare Figures 3.7a and 3.8c & d). One inconsistent result appears in Figure 3.8d, where 300 mg/mL PEG with the protein TEM induces a much larger preferential interaction coefficient between lysine and chloride ions than is seen at the other concentrations of PEG. Inspection of the simulation trajectories shows that a chloride ion diffused into an exposed pocket of TEM within 3.5 Å of two lysine residues, as shown in Figure 3.9. This explains the high value for the lysine – chloride preferential interaction with TEM in 300 mg/mL PEG 3350; a similar interaction was not observed at other concentrations of PEG. Figure 3.10 shows the preferential interaction coefficients between charged residues and sodium ions. As expected, the results are essentially opposite to those obtained with chloride ions. The negatively charged amino acids, aspartate and glutamate, show a positive preferential interaction coefficient towards sodium ions (Figure 3.10a & b). The positively charged amino acids, arginine and lysine, show a negative preferential interaction coefficient towards sodium ions, with one surprising exception (Figure 3.10c & d).
This conflicting result occurs with the protein BLIP and the interaction between sodium ions and arginine; the anomaly occurs at all concentrations of PEG 3350 (Figure 3.10c). During the simulations, a sodium ion is located within 3.0 Å of the backbone carbonyl group of arginine 144 in BLIP, as shown in Figure 3.11a. Interestingly, the same interaction occurs in the crystal structure of the TEM-BLIP complex (PDB ID: 1S0W), albeit with a calcium ion (Figure 3.11b). This explains the positive preferential interaction coefficients between arginine residues and sodium ions for BLIP, whereas for all other proteins the preferential interaction coefficients are negative. Increasing the PEG concentration generally increases the preferential interaction coefficient between sodium and positive residues; however, this relationship is not as clear as that seen between sodium and the whole protein, most likely due to the small sample size (~ 20 sodium ions per simulation). The opposite can be said for negative residues and the sodium ions: increasing the PEG concentration generally decreases the preferential interaction coefficient, although the small sample size makes these trends less clear. The preferential interaction coefficients between ions and the grouped amino acid residues of the proteins were also determined. As presented in Figure 3.12, chloride ions have negative preferential interaction coefficients with all groups of amino acids (aliphatic, aromatic, polar and negative) except for positive amino acids (Figure 3.12d). Compared to whole protein – chloride interactions (Figure 3.7a), these interactions are very small, on the order of 50 to 200 times smaller. A similar analysis was also run for sodium – amino acid group interactions, which gave positive results between sodium and negative amino acids, and results roughly equal to zero for all other amino acid groups, i.e. aliphatic, aromatic, polar and positive (Figure 3.13).
Again, compared to protein – sodium interactions (Figure 3.7b), the sodium – amino acid group interactions are very weak. An anomalously positive preferential interaction coefficient is found between sodium ions and the aromatic amino acid group for cytochrome c in 300 mg/mL PEG (Figure 3.13b). This interaction can be explained by Figure 3.14, where a sodium ion is within 3.0 Å of the carbonyl oxygen of a phenylalanine residue of cytochrome c. There also appears to be coordination with the carboxylate oxygens of the terminal glutamate residue, seen to the immediate right of the sodium ion in Figure 3.14.

Comparison to NMR Experimental Results. To determine whether our simulations could match the previous NMR studies on cytochrome c with PEG44, a per-residue preferential interaction coefficient was calculated for cyt c – PEG and compared with the chemical shift changes (Table 3.1). Interestingly, the two residues with the largest experimental chemical shift changes, Gln16 and Lys79, are ranked #1 and #5, respectively, in terms of their preferential interaction coefficient with PEG. The third ranked experimental residue, Val28, was ranked #6 by our PEG – residue preferential interaction coefficient. Interestingly, two residues unresolved (not assigned) by NMR were #2 and #4 in our ranking. In total, four of the top ten residues with the largest chemical shift changes are included in the top ten PEG – residue preferential interaction coefficients; two residues are unresolved, leaving four residues unaccounted for. Three of these four residues are clear misses, which might be because we are comparing the chemical shifts of a backbone nitrogen and hydrogen with the preferential interaction coefficient of the whole residue. As a whole, however, these results suggest that it may be possible to obtain a meaningful atomic level view of protein – PEG interactions through the use of MD simulations.

Figure 3.1. Static and dynamic properties of PEG.
(a) Although the concentration of PEG varies from 100 to 300 mg/mL, no change is observed in the calculated radius of gyration. (b) The decay rate of the end-to-end autocorrelation function decreases with increasing concentration of PEG.

Figure 3.2. Sample snapshots of HEWL with varying concentrations of PEG. Shown here is HEWL under (a) 0, (b) 100, (c) 200, (d) 300 mg/mL PEG 3350.

Figure 3.3. RMSD analysis of all protein systems. These RMSD figures compare 0 mg/mL (gray) to 300 mg/mL (red) PEG 3350. (a) HEWL, (b) TEM-BLIP, (c) BLIP, (d) BLA, (e) cyt c, (f) TEM. With the addition of PEG 3350 in all cases, the RMSDs either remain stable or decrease.

Figure 3.4. Preferential interaction coefficient values for PEG – protein interactions. Here, all simulated proteins are compared via the preferential interaction coefficient, Γ. Proteins are listed from left to right, with increasing net charge. Increasing the concentration of PEG 3350 generally decreases the preferential interaction coefficient of the protein – PEG interaction. In only two cases, with TEM and BLA at 300 mg/mL PEG, does the Γ value fall below zero, which indicates a preference for water. If PEG were truly an inert macromolecular crowding agent, we would expect all Γ values to be close to zero.

Figure 3.5. Preferential interaction coefficient values between PEG and charged residues. Shown here are the Γ values between PEG and the charged residues: (a) aspartate, (b) glutamate, (c) arginine, and (d) lysine. As observed at the whole protein level, increasing the concentration of PEG decreases the preferential interaction coefficient of the negative amino acids, aspartate and glutamate (a & b).
The negative amino acids have a preference for water over PEG in every scenario except HEWL at 100 mg/mL PEG. The positive amino acids, arginine and lysine, display a much greater variation (c & d).

Figure 3.6. Preferential interaction coefficient values between PEG and amino acid residue groups. Further analysis was performed between PEG and amino acid residue groups, as explained in Methods, for: (a) aliphatic residues, (b) aromatic residues, (c) polar residues, (d) positive residues and (e) negative residues. As a whole, all groups of amino acid residues prefer to interact with PEG 3350 rather than with water, with the exception of the negative amino acid residues. In general, increasing the concentration of PEG decreases the preferential interaction coefficient for all groups.

Figure 3.7. Preferential interaction coefficient values for protein – ion interactions. The preference of the proteins for ions over water was also determined as a function of increasing concentration of PEG 3350, with (a) chloride or (b) sodium ions. In contrast to the interactions between PEG and protein, increasing the concentration of PEG actually increases the preferential interaction coefficient for both chloride and sodium ions with most proteins.

Figure 3.8. Preferential interaction coefficient values between chloride ions and charged residues. Preferential interaction coefficients were determined for charged amino acids with chloride ions: (a) aspartate, (b) glutamate, (c) arginine, and (d) lysine. Increasing the concentration of PEG decreases the preference of chloride ions for the negative residues (a & b), but increases the preference for positive residues (c & d).
The anomalous spike for the chloride – lysine interaction in TEM with 300 mg/mL PEG in (d) is explained in Figure 3.9.

Figure 3.9. Two lysine residues of TEM coordinated with a chloride ion. The anomalous Γ value between the lysine residues of TEM and chloride ions is explained here, as two lysine residues form a stable interaction with chloride. TEM is represented as CPK sticks, lysine residues as CPK and the chloride ion is green. For purposes of clarity, water is not shown.

Figure 3.10. Preferential interaction coefficient values between sodium ions and charged residues. Preferential interaction coefficients were determined for charged amino acids with sodium ions: (a) aspartate, (b) glutamate, (c) arginine, and (d) lysine. The negatively charged amino acids (a & b) preferred sodium ions to water. The positively charged amino acids (c & d), on the other hand, preferred water over sodium ions, with the exception of the arginine residues in BLIP (explained in Figure 3.11).

Figure 3.11. Coordination of arginine residues of BLIP with positive ions. The anomalous Γ value between arginine and sodium is explained here. (a) Arginine residue 144 of BLIP is coordinated with a sodium ion. The arginine residue is shown in CPK, the rest of the BLIP protein is shown in CPK sticks and the sodium ion is purple. (b) Arginine residue 144 of the TEM-BLIP complex (PDB ID: 1S0W) is coordinated with a calcium ion. The arginine residue is shown in CPK, the rest of BLIP is shown in CPK sticks, TEM is shown in gray sticks and the calcium ion is green.

Figure 3.12. Preferential interaction coefficient values between chloride ions and amino acid residue groups.
Further analysis was performed between chloride ions and amino acid residue groups for: (a) aliphatic residues, (b) aromatic residues, (c) polar residues, (d) positive residues and (e) negative residues. In general, all residue groups preferred water over chloride, except for the positively charged residues.

Figure 3.13. Preferential interaction coefficient values between sodium ions and amino acid residue groups. Further analysis was performed between sodium ions and amino acid residue groups for: (a) aliphatic residues, (b) aromatic residues, (c) polar residues, (d) positive residues and (e) negative residues. Generally, amino acids showed no strong preference between sodium and water, except for negatively charged amino acids (e). The spike in (d) with BLIP is explained in Figure 3.11. The anomalous change (panel b) in the preference of cyt c's aromatic groups for sodium ions over water in 300 mg/mL PEG is explained in Figure 3.14.

Figure 3.14. Coordination of a phenylalanine residue of cyt c with a sodium ion. The high value of the preferential interaction coefficient between aromatic groups of cyt c and sodium ions in 300 mg/mL PEG 3350 is explained here. The carbonyl oxygen of a phenylalanine residue interacts with a sodium ion, as do the carboxylate oxygens of the glutamate that forms the C-terminal tail (immediately to the right of the sodium ion). Phenylalanine is shown in CPK, the rest of cyt c is shown as CPK sticks, the sodium ion is purple and the iron ion in heme is green.

Table 3.1. Residue-specific analysis of cyt c: simulation vs NMR.
Rank Order   Simulation: Largest Γ values   NMR Experiment44: Greatest Chemical Shift (Δδ)
1            Gln 16                         Lys 79 (4)
2            Lys 4 *                        Gln 16 (1)
3            Lys 100                        Val 28 (6)
4            Lys 11 *                       Gly 77 (28)
5            Lys 79                         Thr 12 (9)
6            Val 28                         Lys 87 (104)
7            Lys 72                         Phe 36 (80)
8            Ala 81                         Gly 23 (36)
9            Thr 12                         Val 57 (74)
10           Arg 13                         Lys 27 (22)

Notes: An asterisk (*) in the second column indicates residues that were not assigned experimentally. Numbers in parentheses in the third column indicate the rank order of the same residues in the simulation results.

CHAPTER IV DISCUSSION

Protein – PEG Interactions

Poly(ethylene glycol) is a straight chain polymer that is frequently used as a macromolecular crowding agent3,29. In macromolecular crowding studies, polymers are meant to reproduce in vitro the crowded environment that would otherwise be absent, i.e. to exclude volume6. Recently, it has been suggested that PEG does not play by the rules defined by the excluded volume theory, i.e. that PEG may act through more than purely steric interactions41-42,87. To further understand why PEG is not the most suitable crowding agent and why it participates in more than excluded-volume effects, 12 µs of MD simulations were examined with several concentrations of PEG 3350. These MD simulations provided not only a system level view, but also an atomic level view. In these simulations, not only could the interactions between protein and PEG be examined in detail, but the interactions between all species could be observed, be they PEG, protein, solvent or ions. We are not aware of any other simulation studies in the literature that have directly examined interactions between protein and PEG.

Comparison with Experimental Second Virial Coefficients. PEG has been reported to interact favorably with lysozyme on the basis of second virial coefficient measurements41-42. In our simulations, at every concentration of PEG 3350, lysozyme had positive Γ values with PEG, consistent with previous experimental results41-42.
The simulation results therefore reiterate the experimental finding that PEG has an attractive interaction with protein. Interestingly, at 100 mg/mL PEG lysozyme's preference for PEG over water is almost twice as strong as at 300 mg/mL PEG (i.e. 151, compared with 79); for comparison, the Γ value between lysozyme and a 1 m solution of urea was determined to be roughly 10 (ref. 82). From an examination of PEG's interactions with the four proteins studied here that are of similar size (i.e. BLA, BLIP, cyt c, and HEWL), it appears that net charge plays a role in determining the preferential interaction coefficient between PEG and protein: as shown in Figure 3.4, the Γ value for HEWL (net charge +8) is approximately three times larger than that for BLA (net charge -5). Note that this relationship breaks down when TEM-BLIP and TEM are included, but only because these proteins are considerably larger than the others: it was noted earlier that the equation used for calculating Γ values is predisposed to giving larger values for larger proteins.

Atomic Level Details. We have also examined the interactions of PEG with proteins at the level of individual amino acid residues. To our knowledge, this type of detailed information has never been reported in the literature. Clear trends were visible for most groups of amino acids; for example, increasing the PEG concentration tended to decrease the preference for PEG over water (Figure 3.6). This trend did not, however, hold for the aromatic group, but this is probably due to sampling issues, as there are relatively few aromatic residues in these six proteins. In several cases, the Γ value for grouped residues became negative as the PEG concentration reached 300 mg/mL, indicating a more favorable interaction with water. Positive amino acids, such as lysine, clearly prefer to interact with PEG over water; the origins of this effect are explained below.
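The size dependence of Γ noted above can be illustrated with a small numerical sketch. Equation 1 of the Methods is not reproduced in this chapter, so the code below assumes the widely used two-domain form, Γ = ⟨n_cosolute(local) − n_water(local) × n_cosolute(bulk) / n_water(bulk)⟩, which may differ in detail from the expression actually used. All function names and counts are hypothetical and chosen only for illustration.

```python
def gamma_two_domain(n_cos_local, n_wat_local, n_cos_bulk, n_wat_bulk):
    """Per-frame two-domain estimate (assumed form, see lead-in).

    Positive values mean the cosolute (PEG or an ion) is enriched in the
    local domain relative to the bulk composition; negative values mean
    it is depleted, i.e. water is preferred.
    """
    return n_cos_local - n_wat_local * (n_cos_bulk / n_wat_bulk)

def mean_gamma(frames):
    """Average the per-frame estimate over a list of count tuples."""
    values = [gamma_two_domain(*frame) for frame in frames]
    return sum(values) / len(values)

# Hypothetical counts: (cosolute local, water local, cosolute bulk, water bulk).
# The "large" solute has twice as many particles in its local domain but
# the same local enrichment per particle as the "small" one.
small_solute = [(10, 100, 50, 1000), (12, 100, 50, 1000)]
large_solute = [(20, 200, 50, 1000), (24, 200, 50, 1000)]

gamma_small = mean_gamma(small_solute)
gamma_large = mean_gamma(large_solute)
print(gamma_small, gamma_large)
```

Under this assumed form, doubling the number of particles in the local domain at fixed composition doubles Γ, which is why values for the larger proteins are not directly comparable with those for the smaller ones.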
We also dissected the positive and negative groups into their constituent amino acids, arginine, lysine, aspartate and glutamate (Figure 3.5). With the negative amino acids, there is a clear preference for water over PEG with all proteins (Figure 3.5a & b). This is likely to be caused by repulsion between the negative partial charges on the oxygen atoms of aspartate and glutamate and those on the oxygen atoms of PEG. With the positive amino acids, arginine displayed erratic trends between the different proteins. The net charge seems to affect the preferential interaction coefficient between PEG and arginine residues; for HEWL, arginine residues always prefer PEG over water, but for the anionic analog, BLA, arginine residues prefer water over PEG (Figure 3.5). Lysine, on the other hand, displays positive Γ values for all proteins. We have uncovered the origins of the favorable preferential interaction between PEG and lysine residues by conducting an atomic-level analysis: all proteins show a common crown ether-like arrangement of PEG oxygen atoms around exposed lysine groups (see Figure 4.1a for an example taken from the simulation of HEWL). This is consistent with arrangements found in crystal structures where electron density due to PEG atoms is resolved43. For example, in Figure 4.1b, the purine repressor from Bacillus subtilis, PurR (PDB ID: 1O57)88, is shown to have a PEG molecule coordinated around a lysine. The striking similarities between the arrangements shown in Figures 4.1a and b provide a general indication that the simulations behave in a reasonable manner. Another atomic-level view can be gained from comparison of experimental NMR data of cyt c with simulations. Through the calculation of preferential interaction coefficients, we determined that four of the ten residues with the largest chemical shifts in the NMR study of Crowley et al.44 were listed in the top ten residues arranged according to their PEG – protein Γ values.
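The overlap count quoted above can be checked directly from the two rankings in Table 3.1. The short sketch below transcribes the residue names from that table; the variable names are illustrative, and the comparison is a plain set intersection rather than any statistical test.

```python
# Top-ten residues by simulated PEG - residue Γ value (Table 3.1, column 2)
simulation_top10 = ["Gln16", "Lys4", "Lys100", "Lys11", "Lys79",
                    "Val28", "Lys72", "Ala81", "Thr12", "Arg13"]

# Top-ten residues by NMR chemical shift change (Table 3.1, column 3)
nmr_top10 = ["Lys79", "Gln16", "Val28", "Gly77", "Thr12",
             "Lys87", "Phe36", "Gly23", "Val57", "Lys27"]

# Residues marked with an asterisk in Table 3.1 (not assigned by NMR)
unassigned_by_nmr = {"Lys4", "Lys11"}

nmr_set = set(nmr_top10)
shared = [res for res in simulation_top10 if res in nmr_set]
unaccounted = [res for res in simulation_top10
               if res not in nmr_set and res not in unassigned_by_nmr]

print("In both top tens:", shared)
print("Simulation only, assigned by NMR:", unaccounted)
```

The two lists reproduce the bookkeeping given in the text: four shared residues, two unassigned by NMR, and four simulation-ranked residues with no counterpart in the experimental top ten.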
Additionally, two of the residues in our top ten, #2 and #4, were unresolved in the NMR study. This result is mixed: although four of the ten residues were captured, another four residues identified in the Crowley et al. study are clearly missed by our analysis. Again, it is important to note that we are comparing preferential interaction coefficients of whole residues with changes in the chemical shifts of the backbone nitrogen and hydrogen, so part of the discrepancy between simulation and experiment may arise because the two quantities are not directly comparable. In the future, it might be possible to explicitly calculate chemical shifts of the NH groups using structural snapshots extracted from the MD simulations. It would also be useful to have more NMR experimental data for different proteins in crowded PEG solutions; such data would allow us to (a) see if our results are reproduced in experiment and (b) see if we can, in the future, realistically predict which residues might display interactions with PEG.

Protein – Ion Interactions

A System View. In addition to the PEG – protein interactions, our simulations also showed that protein – ion interactions are dependent on the PEG concentration. In fact, the Γ values generally increased with increasing concentrations of PEG, which, it will be noted, is the exact opposite of what happened to the protein – PEG Γ values. This is likely caused by the fact that, as the PEG concentration increases, the water concentration decreases: the number of water molecules ranged from 11221 for the 0 mg/mL PEG system to 8121 for the 300 mg/mL PEG system for lysozyme. The roughly thirty percent decrease in the number of water molecules between the lowest and highest PEG concentrations (0 to 300 mg/mL PEG) suggests that if an ion does indeed interact favorably with the protein at high PEG concentration, it is much less likely to be outcompeted by water molecules.

Ions in More Detail.
Positively and negatively charged amino acids showed trends in the preferential interaction coefficient with PEG consistent with the charge of the ions. In the case of chloride, as expected, the negatively charged residues, aspartate and glutamate, did not produce favorable Γ values, while the positively charged residues, arginine and lysine, gave slightly favorable Γ values. Again, with increasing concentrations of PEG, fewer water molecules are available to interact with the protein, and the Γ values for protein – chloride tend to increase in magnitude. For aspartate and glutamate, this means that the preferential interaction coefficients with chloride become less favorable and more negative. The positively charged residues, arginine and lysine, increasingly favor chloride over water with increasing concentrations of PEG. In addition, Figure 3.9 shows the origins of the anomalous result for the TEM lysine – chloride interaction in Figure 3.8d. A general trend with sodium ions also appears. On a system level, the more positive the net charge of a protein, the more likely the preferential interaction coefficient between sodium and protein is to be negative. Also, an increase in the PEG concentration increases the preferential interaction coefficients between protein and sodium, but on an amino acid level, the results are less clear (Figure 3.7b versus Figure 3.10). For aspartate, three of the five proteins have a slightly favorable interaction with sodium ions, while cyt c and HEWL show no preference between sodium ions and water molecules. For glutamate, all proteins except BLIP show a preference for sodium ions over water molecules; however, these values fluctuate. The trend seen for the ions in general, that increasing the PEG concentration increases the strength of the preferential interaction coefficient, does not appear as strong for sodium ions.
Inconsistencies in sampling are most likely the root of these trends, since on average (depending on the protein net charge) there are only 20 sodium ions in each simulation. With arginine and lysine, the trends are increasingly negative Γ values with increasing PEG concentration, with the exception of the arginine – sodium interaction with BLIP (Figure 3.10c). This anomaly is explained by the similar interaction in the crystal structure with a calcium ion (Figure 3.11). Interactions between the amino acid groups (i.e. aliphatic, aromatic, polar, positive and negative) and the ions generally follow the same trend as specific amino acids: increasing the concentration of PEG increases the magnitude of the preferential interaction coefficient. Again, some of these trends are not solid due to the lack of sampling, and the sampling of sodium seems worse than that of chloride (Figures 3.12 and 3.13).

Future Directions

The results reported here show that PEG clearly does not act as an inert polymer suitable for macromolecular crowding studies. Any Γ values between PEG and protein that are not close to zero suggest that PEG does affect the protein. Not only do we show that PEG interacts with proteins, but it does so through a favorable interaction seen previously in experiment. PEG molecules in solution also affect other properties, such as the interactions between the proteins and salts. We would therefore suggest that PEG not be used in future crowding studies if a purely inert crowding agent is wanted. The characterizations of polymer – protein interactions within this thesis provide a framework for answering further questions on crowding effects. The logical first step is to increase the system size and run the simulations for a longer period of time to decrease the chance of poor sampling. In the future, it will be possible to probe other polymers that are considered more inert, such as polyvinylpyrrolidone, dextran and Ficoll, to determine if any protein – polymer interactions occur.
This will require building more atomic models, which is not a trivial task, especially in the case of the highly branched Ficoll, and reproducing basic experimental data for each polymer in aqueous solution. Further questions include probing the thermodynamics of biomolecular interactions for small-molecule systems with polymers, such as acetate – methylammonium and methane – methane systems. These questions have been addressed using molecular dynamics in pure aqueous solution (ref 89), but have yet to be addressed for polymeric systems. With this study, the groundwork for simulating protein – polymer systems has been laid and can be used to answer these questions.

Figure 4.1. Coordination of PEG around lysine residues in experiment and simulation. (a) Coordination observed with a lysine residue of HEWL in our MD simulations. (b) Similar coordination of PEG around a lysine residue in the PurR repressor in a crystallographic structure (PDB ID: 1O57; ref 88). Lysine residues are shown as CPK, proteins as grey cartoons, PEG coordinated with lysine as CPK sticks and the rest of PEG as green sticks.

REFERENCES

(1) Minton, A. P. J. Biol. Chem. 2001, 276, 10577.
(2) Fulton, A. B. Cell 1982, 30, 345.
(3) Hall, D.; Minton, A. P. Biochim. Biophys. Acta, Proteins Proteomics 2003, 1649, 127.
(4) Ellis, R. J. Trends Biochem. Sci. 2001, 26, 597.
(5) Zimmerman, S. B.; Minton, A. P. Annu. Rev. Biophys. Biomol. Struct. 1993, 22, 27.
(6) Zhou, H. X.; Rivas, G. N.; Minton, A. P. Annu. Rev. Biophys. 2008, 37, 375.
(7) Kozer, N.; Schreiber, G. J. Mol. Biol. 2004, 336, 763.
(8) Charlton, L. M.; Barnes, C. O.; Li, C. G.; Orans, J.; Young, G. B.; Pielak, G. J. J. Am. Chem. Soc. 2008, 130, 6826.
(9) Li, C. G.; Wang, Y. Q.; Pielak, G. J. J. Phys. Chem. B 2009, 113, 13390.
(10) Rankin, J. C.; Jeanes, A. J. Am. Chem. Soc. 1954, 76, 4435.
(11) Dimler, R. J.; Wolff, I. A.; Sloan, J. W.; Rist, C. E. J. Am. Chem. Soc. 1955, 77, 6568.
(12) Van Cleve, J. W.; Schaefer, W. C.; Rist, C. E. J. Am. Chem. Soc. 1956, 78, 4435.
(13) Ohlson, M.; Sorensson, J.; Lindstrom, K.; Blom, A. M.; Fries, E.; Haraldsson, B. Am. J. Physiol. Renal Physiol. 2001, 281, F103.
(14) Ohlson, M.; Sorensson, J.; Haraldsson, B. Am. J. Physiol. Renal Physiol. 2001, 280, F396.
(15) Oliver, J. D.; Anderson, S.; Troy, J. L.; Brenner, B. M.; Deen, W. M. J. Am. Soc. Nephrol. 1992, 3, 214.
(16) Venturoli, D.; Rippe, B. Am. J. Physiol. Renal Physiol. 2005, 288, F605.
(17) Vicent, M. J.; Duncan, R. Trends Biotechnol. 2006, 24, 39.
(18) Pasut, G.; Veronese, F. M. Adv. Drug Delivery Rev. 2009, 61, 1177.
(19) Lindner, R. A.; Ralston, G. B. Biophys. Chem. 1997, 66, 57.
(20) del Alamo, M.; Rivas, G.; Mateu, M. G. J. Virol. 2005, 79, 14271.
(21) Munishkina, L. A.; Cooper, E. M.; Uversky, V. N.; Fink, A. L. J. Mol. Recognit. 2004, 17, 456.
(22) Yamin, G.; Munishkina, L. A.; Karymov, M. A.; Lyubchenko, Y. L.; Uversky, V. N.; Fink, A. L. Biochemistry 2005, 44, 9096.
(23) Martin, J.; Hartl, F. U. Proc. Natl. Acad. Sci. U. S. A. 1997, 94, 1107.
(24) Elcock, A. H. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 2340.
(25) Spencer, D. S.; Xu, K.; Logan, T. M.; Zhou, H. X. J. Mol. Biol. 2005, 351, 219.
(26) Batra, J.; Xu, K.; Zhou, H. X. Proteins: Struct., Funct., Bioinf. 2009, 77, 133.
(27) Dix, J. A.; Verkman, A. S. Annu. Rev. Biophys. 2008, 37, 247.
(28) Minton, A. P. J. Pharm. Sci. 2005, 94, 1668.
(29) Minton, A. P. Biophys. J. 2000, 78, 101.
(30) Qin, S.; Zhou, H. X. Biophys. J. 2009, 97, 12.
(31) Asakura, S.; Oosawa, F. J. Chem. Phys. 1954, 22, 1255.
(32) Kozer, N.; Kuttner, Y. Y.; Haran, G.; Schreiber, G. Biophys. J. 2007, 92, 2139.
(33) Ioan, C. E.; Aberle, T.; Burchard, W. Macromolecules 2001, 34, 3765.
(34) McPherson, A. Methods Enzymol. 1985, 114, 120.
(35) Atha, D. H.; Ingham, K. C. J. Biol. Chem. 1981, 256, 12108.
(36) Dumetz, A. C.; Lewus, R. A.; Lenhoff, A. M.; Kaler, E. W. Langmuir 2008, 24, 10345.
(37) Minton, A. P. Mol. Cell. Biochem. 1983, 55, 119.
(38) Neal, B. L.; Asthagiri, D.; Velev, O. D.; Lenhoff, A. M.; Kaler, E. W. J. Cryst. Growth 1999, 196, 377.
(39) Neal, B. L.; Asthagiri, D.; Lenhoff, A. M. Biophys. J. 1998, 75, 2469.
(40) Valente, J. J.; Verma, K. S.; Manning, M. C.; Wilson, W. W.; Henry, C. S. Biophys. J. 2005, 89, 4211.
(41) Bloustine, J.; Virmani, T.; Thurston, G. M.; Fraden, S. Phys. Rev. Lett. 2006, 96.
(42) Kulkarni, A. M.; Chatterjee, A. P.; Schweizer, K. S.; Zukoski, C. F. J. Chem. Phys. 2000, 113, 9863.
(43) Hasek, J. Z. Kristallogr. 2006, 613.
(44) Crowley, P. B.; Brett, K.; Muldoon, J. ChemBioChem 2008, 9, 685.
(45) Adcock, S. A.; McCammon, J. A. Chem. Rev. 2006, 106, 1589.
(46) Karplus, M.; McCammon, J. A. Nat. Struct. Biol. 2002, 9, 646.
(47) van Gunsteren, W. F.; Bakowies, D.; Baron, R.; Chandrasekhar, I.; Christen, M.; Daura, X.; Gee, P.; Geerke, D. P.; Glattli, A.; Hunenberger, P. H.; Kastenholz, M. A.; Oostenbrink, C.; Schenk, M.; Trzesniak, D.; van der Vegt, N. F.; Yu, H. B. Angew. Chem. Int. Ed. Engl. 2006, 45, 4064.
(48) Fischer, J.; Paschek, D.; Geiger, A.; Sadowski, G. J. Phys. Chem. B 2008, 112, 2388.
(49) Lee, H.; Venable, R. M.; MacKerell, A. D.; Pastor, R. W. Biophys. J. 2008, 95, 1590.
(50) Houdier, S.; Perez, S. J. Carbohydr. Chem. 1995, 14, 1117.
(51) Aldrich Chemical Company. Sigma-Aldrich Library of rare chemicals. Structure index; Aldrich Chemical Co.: Milwaukee, WI, 1993.
(52) Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. J. Chem. Theory Comput. 2008, 4, 435.
(53) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 1983, 79, 926.
(54) Jorgensen, W. L.; Maxwell, D. S.; Tirado-Rives, J. J. Am. Chem. Soc. 1996, 118, 11225.
(55) Aqvist, J. J. Phys. Chem. 1990, 94, 8021.
(56) Chandrasekhar, J.; Spellmeyer, D. C.; Jorgensen, W. L. J. Am. Chem. Soc. 1984, 106, 903.
(57) de la Torre, J. G.; Huertas, M. L.; Carrasco, B. Biophys. J. 2000, 78, 719.
(58) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. J. Chem. Phys. 1984, 81, 3684.
(59) Parrinello, M.; Rahman, A. J. Appl. Phys. 1981, 52, 7182.
(60) Nose, S. J. Chem. Phys. 1984, 81, 511.
(61) Hoover, W. G. Phys. Rev. A 1985, 31, 1695.
(62) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. J. Chem. Phys. 1995, 103, 8577.
(63) Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. J. Comput. Chem. 1997, 18, 1463.
(64) Fischer, J.; Paschek, D.; Geiger, A.; Sadowski, G. J. Phys. Chem. B 2008, 112, 13561.
(65) Goutev, N.; Ohno, K.; Matsuura, H. J. Phys. Chem. A 2000, 104, 9226.
(66) Chen, B.; Potoff, J. J.; Siepmann, J. I. J. Phys. Chem. B 2001, 105, 3093.
(67) Allen, M. P.; Tildesley, D. J. Computer Simulation of Liquids; Oxford Science Publications: Oxford, 1987.
(68) Jorgensen, W. L. J. Phys. Chem. 1986, 90, 1276.
(69) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res. 2000, 28, 235.
(70) Wilson, K. P.; Malcolm, B. A.; Matthews, B. W. J. Biol. Chem. 1992, 267, 10842.
(71) Pike, A. C. W.; Brew, K.; Acharya, K. R. Structure 1996, 4, 691.
(72) Louie, G. V.; Brayer, G. D. J. Mol. Biol. 1990, 214, 527.
(73) Reichmann, D.; Rahat, O.; Albeck, S.; Meged, R.; Dym, O.; Schreiber, G. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 57.
(74) Strynadka, N. C. J.; Jensen, S. E.; Johns, K.; Blanchard, H.; Page, M.; Matagne, A.; Frere, J. M.; James, M. N. G. Nature 1994, 368, 657.
(75) Gretes, M.; Lim, D. C.; de Castro, L.; Jensen, S. E.; Kang, S. G.; Lee, K. J.; Strynadka, N. C. J. J. Mol. Biol. 2009, 389, 289.
(76) Worrall, J. A. R.; Kolczak, U.; Canters, G. W.; Ubbink, M. Biochemistry 2001, 40, 7069.
(77) Morar, A. S.; Kakouras, D.; Young, G. B.; Boyd, J.; Pielak, G. J. J. Biol. Inorg. Chem. 1999, 4, 220.
(78) Krivov, G. G.; Shapovalov, M. V.; Dunbrack, R. L. Proteins: Struct., Funct., Bioinf. 2009, 77, 778.
(79) Autenrieth, F.; Tajkhorshid, E.; Baudry, J.; Luthey-Schulten, Z. J. Comput. Chem. 2004, 25, 1613.
(80) Sorin, E. J.; Pande, V. S. Biophys. J. 2005, 88, 2472.
(81) Mobley, D. L.; Chodera, J. D.; Dill, K. A. J. Chem. Phys. 2006, 125.
(82) Shukla, D.; Shinde, C.; Trout, B. L. J. Phys. Chem. B 2009, 113, 12546.
(83) Baynes, B. M.; Trout, B. L. J. Phys. Chem. B 2003, 107, 14058.
(84) Thiyagarajan, P.; Chaiko, D. J.; Hjelm, R. P. Macromolecules 1995, 28, 7730.
(85) Lee, H.; de Vries, A. H.; Marrink, S. J.; Pastor, R. W. J. Phys. Chem. B 2009, 113, 13186.
(86) Pitera, J. W.; Falta, M.; van Gunsteren, W. F. Biophys. J. 2001, 80, 2546.
(87) Kulkarni, A. M.; Chatterjee, A. P.; Schweizer, K. S.; Zukoski, C. F. J. Phys.: Condens. Matter 2000, 12, A301.
(88) Sinha, S. C.; Krahn, J.; Shin, B. S.; Tomchick, D. R.; Zalkin, H.; Smith, J. L. J. Bacteriol. 2003, 185, 4087.
(89) Thomas, A. S.; Elcock, A. H. J. Am. Chem. Soc. 2006, 128, 7796.