UCR Undergraduate Research Journal

Physicochemical Analysis of the Interaction between Epstein-Barr Virus Glycoprotein gp350 and Complement Receptor 2 using AESOP (Analysis of Electrostatic Similarities Of Proteins)

Aaron Nichols, Dimitrios Morikis, Ronald D. Gorham Jr.
Department of Bioengineering
University of California, Riverside

Abstract

Epstein-Barr Virus (EBV) infects a large percentage of the world’s population and is responsible for infectious mononucleosis and, in rare cases, Burkitt’s lymphoma and nasopharyngeal carcinoma. EBV’s primary means of infection is the association of the viral surface glycoprotein gp350 with Complement Receptor 2 (CR2) of the immune system. Various mutagenesis studies have identified key residues on both gp350 and CR2 necessary for binding. These mutagenesis studies have recently been used to derive constraints for a computational docking study in order to generate a putative three-dimensional structure for the gp350-CR2 complex, using the soft-docking program HADDOCK (High-Ambiguity Driven biomolecular DOCKing). We have applied our own AESOP (Analysis of Electrostatic Similarities Of Proteins) protocol to analyze the electrostatic contributions to complex formation, using the HADDOCK-derived structure of the gp350-CR2 complex. Our atomic-detail studies using AESOP suggest that the original HADDOCK structure may not be optimized and warrant a re-evaluation of the docking process.

Mentors

Faculty Mentor: Dimitrios Morikis
Graduate Student Mentor: Ronald D. Gorham Jr.
Department of Bioengineering

Aaron has worked in the Biomolecular Modeling and Design Laboratory (BioMoDeL) for nearly two years, gaining experience through a number of different research projects. His initial research involved evaluation of parameter selection in Poisson-Boltzmann electrostatic calculations through comparison of computed and experimentally determined free energy values for association of protein complexes. Subsequently, Aaron has worked on examining the interaction between Epstein-Barr Virus glycoprotein 350 (gp350) and immune protein complement receptor 2 (CR2), aiming to better understand the molecular mechanisms underlying viral immune system evasion. The first project is now published in the major research journal Biopolymers, as part of a larger study led by graduate student Ronald Gorham. The work of the second project is reported here, and is also co-authored by Ronald Gorham, who contributed by providing research guidance. Aaron has proved himself as an independent researcher, taking on a challenging project involving docking of the gp350-CR2 protein structures in light of his previous parametrization results. Aaron has presented his work at the UCR Symposium for Undergraduate Research, Scholarship, and Creative Activity in spring 2010, and at the Southern California Conference for Undergraduate Research at Pepperdine University in fall 2010. In addition to his research at UCR, Aaron participated in the Amgen Scholars program at UCSF during summer 2010.

Author

Aaron Nichols
Bioengineering

Aaron Alan Nichols is a graduating senior majoring in Bioengineering. He is a member of the Medical Scholars Program and the Tau Beta Pi Honors Engineering Society, and participated in the Amgen Scholars Program at UCSF this past summer. Aaron joined the BioMoDeL laboratory, led by Dr. Dimitrios Morikis, the summer prior to his junior year after discovering an avid fascination with computer science and biology. His research focuses on the electrostatic interactions of proteins involved in our immune system. He studied the infection of the immune system by the Epstein-Barr Virus through the systematic computational mutation of various amino acids constituting the infection-mediating proteins. He plans on using the skills he developed in research towards the pursuit of a medical degree.
Aaron will be attending UC Riverside next quarter as a Master’s student in the 5-year BS/MS Bioengineering program.

Introduction

The Epstein-Barr Virus (EBV) is a herpesvirus that occurs worldwide and affects 95% of adults between the ages of 35 and 40. EBV infections are known to cause infectious mononucleosis in 35-50% of adolescents, leading to symptoms such as fever, sore throat, swollen lymph glands, and even spleen or liver complications. The virus also establishes a lifelong infection of the body’s immune system and, in rare cases, causes Burkitt’s lymphoma and nasopharyngeal carcinoma [1].

Infection by EBV is achieved by the association of the viral surface glycoprotein gp350 with Complement Receptor 2 (CR2), located on the surface of B-lymphocytes. Viral gp350 is a large (907-residue) protein, with a large percentage of its surface being glycosylated, that is, covered by covalently attached sugars. The three-dimensional structure of a truncated form of gp350 (440 residues) has been experimentally determined. It consists of three domains (D1, D2, and D3) dominated by beta-sheets, linked by short polypeptides and arranged into an L-shape (see Figure 1). A distinct patch on the surface of one of the N-terminal domains is not glycosylated and coincides with a negative “hotspot,” or aggregation of negatively charged residues. Experimental deglycosylation of gp350 has been shown to have negligible effects on its ability to bind its ligand, suggesting that this “naked” patch is a possible binding site for CR2 [2].

CR2 is a cell receptor involved in the complement portion of the immune system. It is a regulator of complement activation and is characterized by the presence of repeating modules known as short consensus repeats, or SCRs.
CR2 comprises 15 to 16 of these modules and spans the cell membrane. Flow cytometry experiments suggest that gp350 interacts with the first two modules of CR2, SCR1 and SCR2. The three-dimensional structures of these modules have been solved (see Figure 1); when crystallized, the modules form a tight V-conformation, although X-ray scattering suggests that the functional CR2 may open up [1].

In a study conducted by Hannan and coworkers (2008) [1], an extensive amount of mutagenesis data targeting the putative binding site on both gp350 and CR2 was collected and used to determine a potential three-dimensional structure of the bound complex. To accomplish this, the HADDOCK (High Ambiguity Driven biomolecular DOCKing) program was used to dock the two proteins. HADDOCK is distinctive among docking programs in that it uses a wide array of experimental data as restraints to find an optimal structure for the bound complex, and thus provides a crucial link between experimental and computational biology.

Figure 1. (a) The three-dimensional structure of gp350 in a truncated, glycosylated state. Domains are labeled and colored according to electrostatic potential (red for negative, blue for positive). The region of concentrated negative amino acids (circled) coincides with the unglycosylated region [2]. (b) SCR1-SCR2 of CR2 colored by Coulombic potential. (c) HADDOCK-generated structure of the complex with domains and SCRs labeled.

Methods and Materials

AESOP - Since the interaction between gp350 and CR2 involves charged residues on each protein, an extensive study of their electrostatic nature is warranted. To this end we have applied our own AESOP (Analysis of Electrostatic Similarities Of Proteins) protocol, which provides the framework to rapidly analyze and quantify the electrostatic make-up of a protein or protein complex.
AESOP was used to evaluate the HADDOCK structure generated by Hannan and coworkers (2008) [1]. AESOP allows the user to conduct computational alanine scans, apply Poisson-Boltzmann electrostatic calculations to determine electrostatic potentials, determine computational free energies of binding, and cluster and analyze the results using a metric of the user’s choosing. Collecting these tools in one place allows the user to efficiently study the contribution of electrostatics to the function of the protein [3,4,5].

The application of AESOP to the gp350-CR2 complex began by retrieving the HADDOCK structure generated by Hannan and coworkers (2008) [1]. The coordinate file was then cleaned by removing the header, leaving only the ATOM lines necessary to fully describe the complex’s three-dimensional structure. Once cleaned, a PQR file for the parent complex was generated using the PDB2PQR web server [7]. The PQR file is similar to the coordinate file except that the occupancy and temperature-factor columns are replaced by a per-atom charge and a per-atom radius, respectively. Next, the parent complex PQR was used to generate computational mutants with an in-house R script. The script locates aspartates, glutamates, arginines, histidines, and lysines (all of which may hold a charge at physiological pH) and truncates their side chains to the beta carbon. The gamma carbon is replaced with hydrogen and the bond length is shortened. This process converts each of these ionizable amino acids into alanine, and effectively perturbs the electrostatic potential of the protein without directly affecting global structure.
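The cleaning and truncation steps described above can be sketched in a few lines. The authors' in-house R script is not shown in the text, so the Python below is only an illustration under stated assumptions: the `Atom` record, the function names, the set of retained atom names, and the 1.09 Å C-H bond length are all hypothetical choices, and per-atom charges and radii would still need to be reassigned for each mutant.

```python
import math
from dataclasses import dataclass, replace
from typing import List, Tuple

# Residues treated as ionizable in the text.
IONIZABLE = {"ASP", "GLU", "ARG", "HIS", "LYS"}
# Atoms kept after truncation (assumed naming: backbone, CB and its hydrogens).
KEEP = {"N", "H", "H1", "H2", "H3", "CA", "HA", "C", "O", "OXT",
        "CB", "HB2", "HB3"}
C_H_BOND = 1.09  # Angstrom; assumed typical C-H bond length


@dataclass(frozen=True)
class Atom:
    name: str
    resname: str
    chain: str
    resseq: int
    xyz: Tuple[float, float, float]


def keep_atom_records(lines: List[str]) -> List[str]:
    """Drop header and remark lines, keeping only ATOM records."""
    return [ln for ln in lines if ln.startswith("ATOM")]


def ionizable_sites(atoms: List[Atom]) -> List[Tuple[str, int, str]]:
    """List (chain, resseq, resname) of scan candidates in file order."""
    seen, sites = set(), []
    for a in atoms:
        key = (a.chain, a.resseq, a.resname)
        if a.resname in IONIZABLE and key not in seen:
            seen.add(key)
            sites.append(key)
    return sites


def mutate_to_alanine(atoms: List[Atom], chain: str, resseq: int) -> List[Atom]:
    """Return a copy of the structure with one residue truncated to alanine."""
    cb = next((a for a in atoms
               if a.chain == chain and a.resseq == resseq and a.name == "CB"),
              None)
    out = []
    for a in atoms:
        if a.chain != chain or a.resseq != resseq:
            out.append(a)  # other residues are left untouched
        elif a.name in KEEP:
            out.append(replace(a, resname="ALA"))
        elif a.name == "CG" and cb is not None:
            # Turn the gamma carbon into a hydrogen placed along the
            # CB->CG direction at the shorter C-H bond length.
            d = [g - b for g, b in zip(a.xyz, cb.xyz)]
            norm = math.sqrt(sum(c * c for c in d))
            xyz = tuple(b + C_H_BOND * c / norm for b, c in zip(cb.xyz, d))
            out.append(replace(a, name="HB1", resname="ALA", xyz=xyz))
        # all remaining side-chain atoms are dropped
    return out
```

Calling `mutate_to_alanine` once per entry of `ionizable_sites` would yield one single-residue mutant per ionizable residue, which is the shape of the scan described in the text.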
In total, 95 mutant structures were generated: 72 single-residue mutants belonging to gp350 and 23 to CR2.

After the computational mutants were generated, their respective electrostatic potentials were calculated using a local installation of the Adaptive Poisson-Boltzmann Solver, or APBS [6]. APBS numerically solves the Poisson-Boltzmann equation, here in its simplified linear form (Equation 1), for the electrostatic potential φ(r). In this equation, ε is the position-dependent dielectric coefficient, κ captures the implicit effect of solvation by water and ions on the proteins, and Qi is the fixed charge of the ith protein atom at its position r. Calculations were conducted for each mutant at both 0 mM and 150 mM ionic strength, with a protein dielectric of 20.00 and a solvent dielectric of 78.50.

Visualizing the calculated electrostatic potential around a mutant protein of interest is useful for qualitatively understanding the contributions of particular ionizable residues to the protein. This can be accomplished simply by loading the data points into a common visualization software package such as Chimera or VMD. A quantitative analysis, however, can be achieved by calculating the free energy associated with the mutation as it affects the complex (Equation 2), where qi is the charge of atom i and φi is the electrostatic potential calculated by APBS at that atom.

We use a theoretical thermodynamic cycle (Figure 2) to account for both the free energy changes during association and the energy of solvating the proteins, individually and in complex. The free energies of this cycle (Equations 3-5) are used to derive the solvation free energy of association, which is used to quantitatively compare the effects of each computational mutant on the complex. Each ΔG term represents moving either vertically or horizontally in the thermodynamic cycle shown in Figure 2. ΔΔGsolvation, or the association free energy of solvation, is the difference between the free energies of the two horizontal (association) processes; because the cycle is closed, it equals the difference between the vertical (solvation) free energies of the complex and of the free proteins.
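The numbered equations referred to above did not survive in this copy of the text. As a sketch only, the standard forms consistent with the symbol definitions given here are written out below. The unit convention and prefactors in (1), and the superscripts A, B, and AB (taken here to mean the two free proteins and the complex), are assumptions and may differ from the original typesetting.

```latex
\begin{align}
% (1) Linearized Poisson-Boltzmann equation; prefactors depend on the unit convention
-\nabla \cdot \left[ \varepsilon(\mathbf{r}) \nabla \varphi(\mathbf{r}) \right]
  + \varepsilon(\mathbf{r})\, \kappa^{2}(\mathbf{r})\, \varphi(\mathbf{r})
  &= 4\pi \sum_{i} Q_i \, \delta(\mathbf{r} - \mathbf{r}_i) \\
% (2) Electrostatic free energy from atomic charges and potentials
G &= \frac{1}{2} \sum_{i} q_i \varphi_i \\
% (3) Association in the low-dielectric reference state (top of the cycle)
\Delta G_{\mathrm{ref}}
  &= G^{AB}_{\mathrm{ref}} - G^{A}_{\mathrm{ref}} - G^{B}_{\mathrm{ref}} \\
% (4) Association in the solvated state (bottom of the cycle)
\Delta G_{\mathrm{solu}}
  &= G^{AB}_{\mathrm{solu}} - G^{A}_{\mathrm{solu}} - G^{B}_{\mathrm{solu}} \\
% (5) Association free energy of solvation
\Delta\Delta G_{\mathrm{solvation}}
  &= \Delta G_{\mathrm{solu}} - \Delta G_{\mathrm{ref}}
   = \Delta G^{AB}_{\mathrm{solvation}}
     - \Delta G^{A}_{\mathrm{solvation}}
     - \Delta G^{B}_{\mathrm{solvation}}
\end{align}
```

The second equality in (5) follows from closing the cycle, with each vertical term defined as the solvated-state energy minus the reference-state energy of the same species.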
To simultaneously account for the energy change associated with solvation and association, we use the ΔΔGsolvation measure, which can then be compared across mutants. Furthermore, the electrostatic potentials surrounding each mutant can be compared using a comparative metric of our choosing. For this study we used the Average Normalized Distance measure described in [4] (Equation 6). This comparison can be presented in a hierarchical dendrogram, in which mutants that have similar electrostatic potentials “cluster,” or group together. It is our hypothesis that mutants that cluster together will behave similarly in their physiological function. The ultimate goal of AESOP, then, is to provide researchers with a computational tool for screening mutants that ideally correlates well with experimental data.

Figure 2. Thermodynamic cycle used to calculate free energies. The top process represents protein association in a low-dielectric reference state, with a free energy change of ΔGref. The bottom process is identical, except that it occurs in a more realistic solvated state, with a low-dielectric protein interior and high-dielectric solvent, and is captured by ΔGsolu. Finally, the vertical processes represent the free energy change in moving each of the proteins from the ideal environment to a solvated one, and are captured by the ΔGsolvation terms.

Results

Using AESOP, clustering dendrograms and free energy plots of association were generated (only the dendrogram and free energy plot for CR2 at 0 mM are shown). Each line in the dendrogram represents a single mutant of the parent protein and is colored according to the physiological charge of the mutated amino acid: blue for positively charged residues and red for negatively charged residues. The lines are terminated with circles that indicate the distance of the particular residue from the interaction interface. The dendrogram indicates that AESOP successfully “clustered” mutants of similar charge; that is, basic residues cluster separately from acidic residues. Additionally, the mutants that are located closer to the interaction interface also cluster together, suggesting that these mutations have a similar effect on the global electrostatic potential of the parent protein.

Although clustering dendrograms indicate the similarity between the various mutants, they do not provide any information about the effect of each mutation on the ability of the proteins to bind. The free energy diagrams generated by AESOP indicate the effect each mutation had on the gp350-CR2 complex. Basic mutants are located energetically below the parent protein and indicate an unfavorable mutation, while all the acidic mutants are located energetically above the parent protein and indicate a favorable mutation. Also included are the experimental “crosses” that indicate the deleterious effect the particular mutation had on the ability of gp350 to bind CR2.

Discussion

It is readily noticeable that the mutation computationally predicted to have the greatest deleterious effect on binding, lysine 67 to alanine (K67A), is reported experimentally to reduce the binding affinity of the proteins by a mere 30%. This discrepancy was further studied by inspecting the three-dimensional structure of the HADDOCK complex, which places K67 at the interface of gp350 and CR2.
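A structural inspection of this kind can be approximated with a simple distance screen between charged side-chain atoms on the two proteins. The sketch below is not the authors' procedure: the 4.0 Å cutoff is a commonly used convention assumed here, the `Atom` record and function names are hypothetical, and the coordinates are made-up toy values rather than atoms from the HADDOCK model.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    resname: str   # three-letter residue name, e.g. "LYS"
    resseq: int    # residue number
    name: str      # atom name, e.g. "NZ"
    xyz: Tuple[float, float, float]


# Side-chain atoms that carry the formal charge (histidine omitted for brevity).
CATIONIC = {"LYS": {"NZ"}, "ARG": {"NE", "NH1", "NH2"}}
ANIONIC = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
CUTOFF = 4.0  # Angstrom; assumed salt-bridge distance criterion


def salt_bridges(cation_side: List[Atom], anion_side: List[Atom],
                 cutoff: float = CUTOFF) -> List[Tuple[str, str, float]]:
    """Closest N-O contact per residue pair, for pairs within the cutoff."""
    best = {}
    for cat in cation_side:
        if cat.name not in CATIONIC.get(cat.resname, ()):
            continue
        for an in anion_side:
            if an.name not in ANIONIC.get(an.resname, ()):
                continue
            d = math.dist(cat.xyz, an.xyz)
            if d <= cutoff:
                key = (f"{cat.resname}{cat.resseq}", f"{an.resname}{an.resseq}")
                best[key] = min(d, best.get(key, d))
    return sorted((c, a, round(d, 2)) for (c, a), d in best.items())


# Toy coordinates (hypothetical, for illustration only).
toy_cr2 = [Atom("LYS", 67, "NZ", (0.0, 0.0, 0.0)),
           Atom("LYS", 67, "CE", (1.5, 0.0, 0.0))]
toy_gp350 = [Atom("ASP", 18, "OD1", (3.0, 0.0, 0.0)),
             Atom("ASP", 18, "OD2", (5.0, 0.0, 0.0)),
             Atom("ASP", 19, "OD1", (0.0, 0.0, 10.0)),
             Atom("GLU", 152, "OE1", (0.0, 3.5, 0.0))]
toy_pairs = salt_bridges(toy_cr2, toy_gp350)
```

To screen for cationic residues on the second protein as well, the function would be called a second time with the two arguments swapped.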
In the HADDOCK conformation, K67 is positioned to potentially form three very strong interfacial salt bridges with D18, D19, and E152 of gp350. The presence of these potential interactions suggests that the mutation of K67 to alanine should have a much greater deleterious effect on binding than has been experimentally shown, thus supporting our hypothesis that the HADDOCK-generated structure may not be optimal.

Figure 3. AESOP results for calculations at 0 mM ionic strength. The top panel shows the dendrogram generated for the CR2 mutants. Each line represents a mutant form of the parent protein and is colored according to the charge of the mutated amino acid: blue for basic residues and red for acidic residues. Additionally, each line is terminated in a color that indicates how far the residue is located from the interaction interface. The bottom panel is the corresponding free energy diagram for CR2 at 0 mM. Each point on the graph is located directly below its corresponding line in the dendrogram to ease the analysis of the energetics of each cluster. Crosses indicate experimental data from Hannan and coworkers (2008) [1], expressed as percentage of activity compared to the parent: +++, 89.9–70%; ++, 69.9–40%; +, 39.9–20%; –, 19–0.0%.

The original HADDOCK structure was driven in part by mutants that had a relatively minor effect on binding. In the study by Hannan and coworkers (2008) [1], the K67A mutation was rated +++, indicating a mere 30% decrease in binding in the ELISA experiments. This mutation data was used to perform gp350-CR2 docking in conjunction with other residues that had been shown to nearly abolish binding.
During the docking procedure, it is likely that K67 became “trapped” by the sheer number of favorable Coulombic interactions, and thus did not explore its full range of conformational space. Additionally, most of the restraints used to dock gp350 and CR2 were ionizable residues. Ideally, non-polar or hydrophobic residues should be used as restraints for HADDOCK, since it is well understood that these residues act over very short distances compared to residues that interact electrostatically. The mutation of a hydrophobic residue that abolishes binding is much more likely to reside at the protein-protein interface than a charged residue with a similar deleterious effect on binding. The interactions of polar and charged residues can be both short- and long-range, and thus their mutation can significantly affect binding despite being located away from the interface. The inclusion of polar and charged residues as restraints in HADDOCK, then, is likely to introduce error in the generation of a putative structure.

Figure 4. On the left, the HADDOCK structure with experimental mutants highlighted on the structure of CR2. Mutants are colored according to their experimentally determined deleterious effect on binding upon mutation, with red residues having the greatest effect and green residues the least. K67 is circled. AESOP calculations suggest that K67 participates in a multitude of strong interfacial Coulombic interactions, shown on the right.

Additionally, it must be considered that mutations of hydrophobic residues are likely to introduce perturbations in the global structure of the protein. These perturbations can inhibit the proteins from binding despite not being located at the complex interface.
This can be addressed with the use of spectroscopic methods as a standardized control. In the absence of a structure for the protein complex, it is arguable whether ionizable residues should be considered “active” restraints in the HADDOCK docking process. Additionally, no reasonable cutoff for inhibition of binding exists when choosing the active residues for each docking process; therefore, the use of mutagenesis data that does not abolish binding may lead to the generation of non-optimal complex structures.

Summary

A computationally derived structure for the gp350-CR2 complex was determined by Hannan and coworkers (2008) [1] with the use of the HADDOCK program. HADDOCK utilizes experimental mutagenesis data to drive the docking process and thus provides a crucial link between experimental and computational biology. We studied the electrostatic characteristics of the suggested complex structure using our own in-house protocol, AESOP. Using AESOP, we conducted computational single-mutant alanine scans of the complex and generated free energies of association for each mutation. The free energy data generated by AESOP suggest that the mutation of K67 of CR2 will cause a significant decrease in the binding ability of gp350 and CR2; however, the experimental mutagenesis data do not agree. We will construct a new HADDOCK structure of the gp350-CR2 complex with an alternative set of restraints, specifically omitting K67. We will then analyze the newly generated structures with our AESOP protocol in order to optimize the potential structure of the gp350-CR2 complex.

References

1. Young KA, Herbert AP, Barlow PN, Holers VM, Hannan JP, Molecular Basis of the Interaction between Complement Receptor 2 (CR2/CD21) and Epstein Barr Virus Glycoprotein gp350. Journal of Virology 82:1217–1227, 2008.

2. Szakonyi G, Klein MG, Hannan JP, Young KA, Ma RZ, Asokan R, Holers VM, Chen XS, Structure of the Epstein-Barr virus Major Envelope Glycoprotein. Nature Structural and Molecular Biology 13:996–1001, 2006.

3. Kieslich CA, Yang J, Gunopulos D, Morikis D, Automated computational framework for the analysis of electrostatic similarities of proteins. Biotechnology Progress 27:316–325, 2011.

4. Gorham RD Jr, Kieslich CA, Morikis D, Electrostatic clustering and free energy calculations provide a foundation for protein design and optimization. Annals of Biomedical Engineering 39:1252–1263, 2011.

5. Kieslich CA, Gorham RD Jr, Morikis D, Is the rigid-body assumption reasonable? Insights into the effects of dynamics on the electrostatic analysis of barnase-barstar. Journal of Non-Crystalline Solids 357:707–716, 2011.

6. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA, Electrostatics of nanosystems: application to microtubules and the ribosome. Proceedings of the National Academy of Sciences (USA) 98:10037–10041, 2001.

7. Dolinsky TJ, Nielsen JE, McCammon JA, Baker NA, PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations. Nucleic Acids Research 32:W665–W667, 2004.