X-ray Diffraction
Stephen J Everse, Fall 2004

Obtaining images of macromolecules

For an object to be visible under magnification, the wavelength (λ) of the light must be, roughly speaking, no larger than the object. Visible light (400-700 nm) cannot produce an image of protein molecules, in which bonded atoms are about 1.5 Å (0.15 nm) apart. Electromagnetic radiation with a wavelength of this size falls in the X-ray range. (S. Doublié '00)

X-rays

X-rays are just another form of electromagnetic radiation.

Electromagnetic spectrum, from X-rays through ultraviolet, visible light, infrared and microwave to radio:
Energy: high -> low
Frequency: high -> low
Wavelength: short (~1 Å = 0.1 nm for X-rays; ~400 nm at the edge of the visible range) -> long
Resolving power (ability to see detail): high (atomic resolution) -> low

Electrons (~500 keV, as in an electron microscope) are not a form of electromagnetic radiation, but they still have wave-like character (de Broglie wavelength ~0.01 Å). Unlike photons (EM radiation), electrons are charged, so they fry the specimen faster. (M. Rould '02)

X-ray Crystallography

• A method for studying the three-dimensional, atomic structure of molecules. In this course we will concentrate on applications for biological macromolecules.
• A protein crystal is placed in the X-ray beam.
• The X-rays are diffracted by the electron clouds around the atoms.
• The atomic structure can be deduced from the data.

Why can't we visualize molecules directly?

A single molecule is a very weak scatterer of X-rays. Most of the X-rays pass through the molecule without being diffracted, and the diffracted rays are too weak to be detected.
Solution: Analyze diffraction from crystals instead of single molecules. A crystal is a three-dimensional repeat of ordered molecules (about 10^14) whose signals reinforce each other. The resulting diffracted rays are strong enough to be detected.

Unlike visible light, X-rays cannot be focused by lenses. The refractive index of X-rays in all materials is very close to 1.0.
Solution: Use a computer to simulate an image-reconstructing lens.
In short, the computer plays the part of the objective lens, computing the image of the object and then displaying it on a screen. (Sylvie Doublié © 2000)

The nature of crystals

Under certain circumstances, macromolecules (protein, DNA, RNA) can form crystals. The resulting crystal is a three-dimensional array of ordered molecules held together by noncovalent interactions.

Evidence that solution and crystal structures are similar:
1. NMR and X-ray crystallography have been used to determine the structure of the same molecule. The two methods produce similar models.
2. Many macromolecules are still functional in the crystalline state.

Most protein crystals contain 50-70% solvent. (S. Doublié '02)

What is a Crystal?

An object formed by stacking a basic unit, the unit cell, in all 3 dimensions. (M. Rould '02)

The Ideal Crystal
The ordered disposition of molecules such that there exists a regular repetition of a pattern in 3-D space, where this repetition extends over a distance equal to or greater than thousands of molecular dimensions.

The Real Crystal
A crystal with less than perfect periodicity. Imperfections are often caused by impurities and the effects of non-zero temperatures.

The Protein Crystal
The crystal contains a high proportion of solvent, meaning that some of the molecules present are not in the crystalline state but in the liquid state, which creates disorder.

Supersaturation

A solution is supersaturated when it contains more of a dissolved substance than can normally be dissolved. This is a thermodynamically unstable state, achieved most often in protein crystallography by vapor diffusion or slow evaporation techniques.

Zone 1 - Metastable zone. The solution may not nucleate for a long time, but this zone will sustain growth. It is frequently necessary to add a seed crystal.
Zone 2 - Nucleation zone. Protein crystals nucleate and grow.
Zone 3 - Precipitation zone. Proteins do not nucleate but precipitate out of solution.
(Diagram from the website for The University of Reading, Course FS460 Investigating Protein Structure and Function.)

Nucleation

The phenomenon whereby a "nucleus", such as a dust particle, a tiny seed crystal or, commonly in protein crystallography, a small protein aggregate, starts a crystallization process.

Common difficulties:
1. If supersaturation is too high, too many nuclei form, hence an overabundance of tiny crystals.
2. In supersaturated solutions that don't experience spontaneous nucleation, crystal growth often occurs only in the presence of added nuclei or "seeds".

Cessation of growth
Caused by the development of growth defects or the approach of the solution to equilibrium.

Mother liquor
The solution in which the crystal exists. This is often not the same as the original crystallization screening solution, but is instead the solution that exists after some degree of vapor diffusion, equilibration through dialysis, or evaporation.

Factors that affect crystallization
1) Purity of proteins
2) Protein concentration
3) Starting conditions (make-up of the protein solution)
4) Precipitating agent (precipitant)
5) Temperature
6) pH
7) Additives: detergents, reducing agents, substrates, cofactors, etc.

Hanging/Sitting Drop Vapor Diffusion

Most popular method among protein crystallographers.
1. The crystal screen buffer is the well solution (0.5-1 mL).
2. The drop (on a siliconized glass cover slip) is 1/2 protein solution, 1/2 crystal screen buffer (0.5-4 µL). So the concentration of precipitant in the drop is 1/2 the concentration in the well.
3. The cover slip is inverted over the top of the well and sealed with vacuum grease (airtight).
4. The precipitant concentration in the drop will equilibrate with the precipitant concentration in the well via vapor diffusion.
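The drop arithmetic in steps 2 and 4 can be sketched numerically. This is a minimal illustration, not a protocol: it assumes an idealized drop in which only water moves through the vapor phase, the precipitant and protein stay in the drop, and the well is large enough that its concentration does not change. The function name `hanging_drop` and its parameters are hypothetical.

```python
def hanging_drop(protein_stock, well_precipitant, protein_ul=1.0, buffer_ul=1.0):
    """Idealized hanging-drop bookkeeping (assumptions stated in the text above).

    protein_stock    : protein concentration of the stock solution (e.g. mg/mL)
    well_precipitant : precipitant concentration of the well solution (any unit)
    protein_ul       : volume of protein solution added to the drop (µL)
    buffer_ul        : volume of well solution added to the drop (µL)
    """
    start_volume = protein_ul + buffer_ul
    # Mixing dilutes both components.
    start_precipitant = well_precipitant * buffer_ul / start_volume
    start_protein = protein_stock * protein_ul / start_volume
    # Water leaves the drop until its precipitant concentration matches the well.
    # The amount of precipitant in the drop is conserved, so the volume shrinks.
    final_volume = start_volume * start_precipitant / well_precipitant
    final_protein = start_protein * start_volume / final_volume
    return {
        "start_precipitant": start_precipitant,
        "start_protein": start_protein,
        "final_volume_ul": final_volume,
        "final_precipitant": well_precipitant,
        "final_protein": final_protein,
    }


result = hanging_drop(protein_stock=10.0, well_precipitant=2.0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions a 1:1 drop loses half its volume, so the precipitant concentration doubles and the protein returns to its stock concentration. That slow concentration of both components is what moves the drop toward supersaturation.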
Interpreting the Results of the Crystallization Experiment

The Hampton Crystal Gallery: http://www.hamptonresearch.com/stuff/gallery.html

Experimental Set-Up

[Figure: experimental set-up. Labels: X-ray source, monochromator or mirrors, X-ray beam, goniometer, crystal, cryostream, beam stop, detector. Photographs: Rigaku rotating copper anode (in-house source); European Synchrotron Radiation Facility, Grenoble, France. S. Cates '02]

How are X-rays produced?

X-rays in the useful range for crystallography (around 1 Å) can be produced by bombarding a metal target (most commonly copper or molybdenum) with electrons produced by a heated filament and accelerated by an electric field. A high-energy electron collides with and displaces an electron from a low-lying orbital in a target metal atom. Then an electron from a higher orbital drops into the resulting vacancy, emitting its excess energy as an X-ray photon. (S. Doublié © 2000)

X-ray Generators - The Rotating Anode
(Rigaku rotating copper anode, in-house source)

X-rays are generated by bombarding a rotating copper anode with electrons. This creates X-ray radiation consisting of two wavelengths characteristic of copper sources, 1.54 Å (Kα radiation) and 1.39 Å (Kβ radiation). Crystallographers usually use Kα radiation (the intensity is greater).

X-ray Generators - The Synchrotron
(European Synchrotron Radiation Facility, Grenoble, France)

Electrons (or positrons) are released from a particle accelerator into a storage ring. The trajectory of the particles is determined by their energy and the local magnetic field. Magnets of various types are used to manipulate the particle trajectory. When the particle beam is "bent" by the magnets, the electrons (or positrons) are accelerated toward the center of the ring. Accelerated charged particles emit electromagnetic radiation, and when they are moving at relativistic speeds, the radiation emitted includes high-energy X-rays.
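To relate the wavelengths quoted for the copper anode to photon energies, the relation E = hc/λ can be evaluated directly. The sketch below is only a unit conversion; the constant is hc expressed in keV·Å, and the function name `photon_energy_kev` is an illustrative choice, not part of the original slides.

```python
# h*c expressed in keV * Angstrom, so that E[keV] = HC_KEV_ANGSTROM / wavelength[Angstrom]
HC_KEV_ANGSTROM = 12.398419843


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in Angstrom (E = h*c / lambda)."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


for label, wavelength in [("Cu K-alpha", 1.54), ("Cu K-beta", 1.39), ("violet light", 4000.0)]:
    print(f"{label}: {wavelength} A -> {photon_energy_kev(wavelength):.4f} keV")
```

The copper lines come out near 8 to 9 keV, while visible light at 400 nm (4000 Å) carries only a few eV per photon.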
The oscillation equipment

Rotates the crystal about an axis (φ) perpendicular to the X-ray beam (and normal to the goniometer). The diffraction pattern from a crystal is a 3-D pattern, and the crystal must be rotated in order to observe all the diffraction spots.

Check out Bernhard Rupp's Crystallography 101 website: http://www-structure.llnl.gov/Xray/101index.html

Detectors

1. Photographic film
Not much used anymore because of the availability of far more sensitive detectors. Superior resolution due to its fine grain, but limited dynamic range.

2. Image plates
Image plates are coated with a layer of inorganic storage phosphor. X-ray photons excite electrons in the material to higher energy levels. Part of the energy is emitted as fluorescence, but an appreciable amount of energy is retained in the material. The stored energy is released upon illumination with a red laser. Blue light is emitted and measured with a photomultiplier. The light emitted is proportional to the number of X-ray photons absorbed. Ten times more sensitive than film, with a dynamic range of 1:10^4-10^5. (S. Doublié © 2000)

Diffraction

A characteristic of wave phenomena: whenever a wavefront encounters an obstruction that alters the amplitude or phase of part of the wavefront, diffraction occurs. The components of the wavefront, both the unaffected and the altered, interfere with one another, causing an observable energy-density distribution referred to as the diffraction pattern.

Interactions between X-rays and atoms

X-rays are scattered almost exclusively by the electrons in the atoms, not by the nuclei. The incident electromagnetic wave exerts a force on the electrons. This causes the electrons to oscillate with the same frequency as the incident radiation. The oscillating electrons act as radiation scatterers and emit radiation at the same frequency as the incident radiation. (S. Doublié © 2000)

When an incident X-ray beam hits a scatterer, scattered X-rays are emitted in all directions.
Most of the scattered wavefronts are out of phase and interfere destructively. Some sets of wavefronts are in phase and interfere constructively. A crystal is composed of many repeating unit cells in 3 dimensions and therefore acts like a 3-dimensional diffraction grating. The constructive interference from a diffracting crystal is observed as a pattern of points on the detector. The relative positions of these points are related mathematically to the crystal's unit cell dimensions.

[Figures: destructive interference; constructive interference; diffraction gratings and their diffraction patterns.]

Notice: when the diffraction grating spacing gets smaller, the pattern spacing gets larger (inverse relationship).

Bragg's Law

$$2d \sin\theta = n\lambda$$

where $\lambda$ = wavelength of the incident X-rays, $\theta$ = angle of incidence, $d$ = lattice spacing, $n$ = integer.

Spots are observed when the following conditions are met:
1. The angle of incidence = angle of scattering.
2. The path difference between rays scattered from adjacent lattice planes, $2d \sin\theta$, is equal to an integer number of wavelengths.

The Ewald Sphere

A tool to visualize the conditions under which Bragg's law is satisfied and therefore a reflection (diffraction spot) will be observable. This occurs when the surface of a sphere centered on the crystal with radius = 1/λ intersects a point of the reciprocal lattice.

[Movie from An Interactive Course on Symmetry and Analysis of Crystal Structure by Diffraction, by Gervais Chapuis and Wes Hardaker: http://perch.cimr.cam.ac.uk/Course/Adv_diff2/Diffraction2.html#Ewald]

Unit Cell

A crystal's unit cell dimensions are defined by six numbers: the lengths of the 3 axes, a, b and c, and the three interaxial angles, α, β and γ. The convention for designating the reciprocal lattice defines its axes as a*, b* and c*, and its interaxial angles as α*, β* and γ*.

Asymmetric unit

Recall that the unit cell of a crystal is the smallest 3-D geometric figure that can be stacked without rotation to form the lattice.
The asymmetric unit is the smallest part of a crystal structure from which the complete structure can be built using space group symmetry. The asymmetric unit may consist of only a part of a molecule, or it can contain more than one molecule, if the molecules are not related by symmetry.

Symmetry

"An object has a particular symmetry if the object looks exactly the same after applying the corresponding symmetry operation."

Types of Symmetry Operations:
• Translation
• n-fold rotation [figure: 4-fold rotation]
• Mirror operation [figure: mirror plane]
• Inversion operation [figure: inversion center]
• Combination symmetries:
  • Screw axis (translation + rotation)
  • Glide plane (translation + mirror)
  • Roto-inversion axis

Note that mirror and inversion operations change the hand. I.e., if an object possesses this symmetry, either both enantiomers must be present, or the object must be achiral. (M. Rould '02)

Symmetry

Can natural proteins have mirror or inversion symmetry?
No. Proteins are chiral: only L-amino acids are present. [Figure: L-alanine and D-alanine, with D-alanine crossed out.]

How about nucleic acids (DNA, RNA)?
No. (Deoxy)ribose is chiral: only the D-stereoisomers are present.

Of the 230 crystallographic space groups, only 65 are possible for crystals containing enantiomorphic specimens such as most biological macromolecules. (M. Rould '02)

X-Ray Scattering from a Crystal

[Figure: a typical image of X-rays scattered by a crystal (X-ray diffraction pattern). Dark spots are the scattered X-rays.]

When X-rays scatter from a crystal we see discrete spots: reflections. Why? (M. Rould '02)

X-Ray Diffraction from a Crystal

• Electromagnetic radiation is wave-like. [Figure: electric field oscillating between + and - along the direction of motion of the X-ray photon.]
• Waves can add constructively or destructively. [Figure: sums of in-phase and out-of-phase waves.] (M. Rould '02)

Structure Factor - F(hkl)

Each reflection in the diffraction pattern is the result of diffractive contributions from all the atoms in the unit cell.
$$F(hkl) = f_1 e^{i\phi_1} + f_2 e^{i\phi_2} + f_3 e^{i\phi_3} + \dots + f_N e^{i\phi_N} + f_{1'} e^{i\phi_{1'}} + f_{2'} e^{i\phi_{2'}} + f_{3'} e^{i\phi_{3'}} + \dots$$

or,

$$F(hkl) = \sum_j f_j e^{i\phi_j}$$

The term $f_j$ describing the diffractive contribution of each atom is called the atomic scattering factor of atom $j$. The scattering factor essentially describes the amplitude of the scattering contributed by a particular species of atom.

Structure Factor - F(hkl) cont'd

$F(hkl)$, as a complex number, can be expressed in terms of its real and imaginary components:

$$F(hkl) = A(hkl) + i\,B(hkl)$$

where

$$A = \sum_j f_j \cos\phi_j = f_{\text{resultant}} \cos\phi_{\text{resultant}}, \qquad B = \sum_j f_j \sin\phi_j = f_{\text{resultant}} \sin\phi_{\text{resultant}},$$

$f_j$ are the atomic scattering factors and $\phi_j$ are the phase angles of the waves scattered from individual atoms. This is just an alternate, mathematically equivalent representation of the structure factor that sometimes proves useful.

Fourier Methods in Diffraction Theory

For each point in a diffraction pattern, there is a corresponding spatial frequency. Therefore, the distribution of a far-field diffraction pattern is the Fourier transform of the aperture function. (Aperture: an opening, often adjustable, that controls the amount of light reaching the lens of a camera or other optical instrument.)

In our case, the aperture function is the regularly periodic (due to the repetition of the unit cell in the lattice) electron density distribution within our crystals. The electron density is the inverse Fourier transform of the diffraction pattern, expressed as follows:

$$\rho(x, y, z) = \frac{1}{V_{\text{unit cell}}} \sum_h \sum_k \sum_l F(hkl)\, e^{-2\pi i (hx + ky + lz)}$$

where $V_{\text{unit cell}}$ = volume of one unit cell and $F(hkl)$ is called the structure factor for a particular set of Miller indices $h$, $k$ and $l$. We can do a summation here, instead of integrating, because we know we will only have reflections at integer values of $h$, $k$ and $l$.
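The two formulas above can be tried out on a toy one-dimensional "crystal". The sketch below is an illustration under simplifying assumptions, not a crystallographic program: atoms are treated as point scatterers with constant scattering factors, the unit cell has length 1, and only one index h is used instead of h, k, l. The atom positions, the scattering factors and the function names are invented for the example.

```python
import numpy as np


def structure_factors(f, x, h):
    """F(h) = sum_j f_j * exp(+2*pi*i*h*x_j) for fractional coordinates x_j."""
    phase = 2j * np.pi * np.outer(h, x)          # shape (n_reflections, n_atoms)
    return (f * np.exp(phase)).sum(axis=1)


def electron_density(F, h, grid, cell_length=1.0):
    """rho(x) = (1/V) * sum_h F(h) * exp(-2*pi*i*h*x), evaluated on a grid."""
    phase = -2j * np.pi * np.outer(grid, h)      # shape (n_grid, n_reflections)
    return (np.exp(phase) @ F) / cell_length


# Hypothetical three-atom cell: scattering factors and fractional positions.
f = np.array([6.0, 8.0, 16.0])
x = np.array([0.10, 0.40, 0.75])

h = np.arange(-20, 21)                           # reflections h = -20 ... 20
F = structure_factors(f, x, h)

grid = np.linspace(0.0, 1.0, 200, endpoint=False)
rho = electron_density(F, h, grid)

# With amplitudes AND phases available, the density peaks at the atom positions.
print("largest peak at x =", grid[np.argmax(rho.real)])
print("largest imaginary part:", np.abs(rho.imag).max())
```

Because the sum runs over both h and -h, the imaginary parts cancel and the density is real. Truncating the sum at |h| = 20 plays the role of a resolution limit: the peaks are broadened and surrounded by ripples rather than being infinitely sharp.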
Electron Density

Electron density distribution:

$$\rho(x, y, z) = \frac{1}{V_{\text{unit cell}}} \sum_h \sum_k \sum_l F(hkl)\, e^{-2\pi i (hx + ky + lz)}$$

For convenience, let us substitute $\phi = 2\pi(hx + ky + lz)$ in the future.

The amplitude of the structure factor $F(hkl)$ for any given reflection is proportional to the square root of the intensity of the diffracted beam, or:

$$|F(hkl)|^2 \propto I$$

Therefore, we can deduce $|F(hkl)|$, the magnitude of $F$, directly from our data, but not its phase. We have all the information we need, except the phase.

Why worry about the phase?

[Figure: on the top are photographs of Jerome Karle (left) and Herb Hauptman (right), who won the Nobel Prize for their work on solving the phase problem for small molecule crystals.]

We can treat the photographs as density maps and calculate their Fourier transforms to get amplitudes and phases. If we combine the phases from the picture of Hauptman with the amplitudes from the picture of Karle, we get the picture on the bottom left. The bottom right picture combines the phases of Karle with the amplitudes of Hauptman.

The pursuit of phases

Although $|F_{hkl}|$ can be derived from the recorded intensities $I_{hkl}$, the phase angle $\alpha_{hkl}$ cannot be derived straightforwardly from the diffraction pattern. Several methods have been developed to solve this problem.

• Multiple isomorphous replacement (MIR): free of model bias, but noisy due to lack of isomorphism.
• Multiwavelength anomalous diffraction (MAD): most reliable source of phases; isomorphism is nearly perfect.
• Molecular replacement (MR): widely used; errors due to model bias are variable and difficult to detect and correct.

MIR

Basic principle:
• Add a heavy atom compound (Hg, Pt, Au, etc.) to the crystal.
• Collect diffraction data from this derivatized crystal.
• Hopefully the heavy metal will bind to just a few sites.
• It is relatively easy to determine the positions of these few really big atoms.
-> Knowing the positions of the heavy atoms, we can calculate their effect on the intensity and phase of each reflection.
Caveat: MIR only works if the heavy atom does not change the conformation of the protein or the crystal lattice in any way. The only differences allowed are the presence of the heavy atom in the crystal and the resulting change in intensity and phase of the scattered X-rays. (S. Doublié '02)

Protein phase angles

A - Single isomorphous replacement

$$F_{PH} = F_P + F_H$$

where $F_P$ = protein structure factor, $F_H$ = heavy atom structure factor, $F_{PH}$ = structure factor for the derivatized crystal.

[Figure: Harker construction in the complex plane. Labels: real axis, imaginary axis, origin O, $F_P$, $-F_H$, $F_{PH}$, points H and G. S. Doublié '02]

Harker construction for phase determination by the method of single isomorphous replacement: the vectors OH and OG represent two possibilities for $F_P$.

Harker construction for MIR

[Figure: Harker construction with a second derivative. Labels: real axis, imaginary axis, origin O, $F_P$, $-F_H$, $-F_{H2}$, $F_{PH}$, $F_{PH2}$, points H and G. S. Doublié '02]

The addition of another derivative breaks the phase ambiguity: $F_P$ is given unequivocally by the vector OH.

MAD

MAD depends on the presence of sufficiently strong anomalously scattering atoms in the protein. Anomalous scattering occurs if the electrons in an atom cannot be regarded as free electrons. An anomalous scatterer absorbs X-rays of a specific wavelength. As a result of this absorption, Friedel's law does not hold, i.e., the reflections hkl and -h-k-l are not equal in intensity. This inequality of symmetry-related reflections is called anomalous scattering.

Structures of metalloproteins (e.g., containing Fe) have been solved with MAD. Proteins that do not naturally contain anomalous scatterers can be expressed in E. coli in a defined medium with selenomethionine. The selenium atoms serve as anomalously scattering heavy atoms.

Caveat: MAD requires a tunable wavelength, so data collection can only be done at synchrotron radiation facilities (Brookhaven, Stanford, APS, etc.). (S. Doublié '02)

Molecular replacement

Prerequisite: The protein of interest should have a structural homologue in the PDB, so that the related protein can be used as a phasing model.
Molecular replacement entails calculating initial phases by placing the model of a known protein in the unit cell of the unknown protein.

Caveat: Errors due to model bias are variable and difficult to detect and correct. (S. Doublié '02)

R factor

Measure of the crystallographic residual; indicates the correctness of a model:

$$R = \frac{\sum \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum |F_{obs}|}$$

Variations that can prove confusing to the novice:
• Rmerge: measurement of the quality of a merged data set
• Rsym: measurement of the variation between symmetry-related reflections
• Rfree: R factor for a test set of unique reflections that have been omitted from the refinement process (unbiased)

Rfree
R factor for a test set of unique reflections that have been omitted from the refinement process (unbiased):

$$R_{free} = \frac{\sum_{hkl \in T} \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum_{hkl \in T} |F_{obs}|}$$

where $hkl \in T$ designates all reflections belonging to a test set $T$ of randomly selected, unique reflections. The size of the test set is commonly 10% of the data set.

Rmerge
Measurement of the quality of a merged data set:

$$R_{merge} = \frac{\sum_{hkl} \sum_{j=1}^{N} \big|\, |F_{hkl}| - |F_{hkl}(j)| \,\big|}{\sum_{hkl} N \times |F_{hkl}|}$$

where $|F_{hkl}|$ is the final value of the structure factor amplitude for that reflection and $N$ = total number of data sets (or images) merged.

Rsym
Measurement of the variation between symmetry-related reflections:

$$R_{sym} = \frac{\sum_{hkl} \sum_{i} \big|\, |F^{(i)}_{hkl}| - |F_{hkl}| \,\big|}{\sum_{hkl} \sum_{i} |F^{(i)}_{hkl}|}$$

for $i$ observations of each symmetry-related reflection, where $|F_{hkl}|$ is the average value of the structure factor amplitude over the $i$ observations of a given reflection.
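The R factor and Rfree definitions above amount to a few lines of arithmetic. The sketch below assumes the amplitudes are already available as arrays; the function names, the random selection with a fixed seed and the example numbers are illustrative choices. In a real refinement the test set is chosen once, before refinement, and kept out of it, which this sketch does not model.

```python
import numpy as np


def r_factor(f_obs, f_calc):
    """R = sum| |Fobs| - |Fcalc| | / sum |Fobs| over the reflections supplied."""
    f_obs = np.abs(np.asarray(f_obs))
    f_calc = np.abs(np.asarray(f_calc))
    return float(np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs))


def r_work_and_free(f_obs, f_calc, test_fraction=0.10, seed=0):
    """Split reflections into a working set and a random test set T, return (R_work, R_free)."""
    f_obs = np.asarray(f_obs)
    f_calc = np.asarray(f_calc)
    n = f_obs.size
    n_test = max(1, int(round(test_fraction * n)))      # about 10% by default
    rng = np.random.default_rng(seed)
    is_test = np.zeros(n, dtype=bool)
    is_test[rng.permutation(n)[:n_test]] = True
    return (r_factor(f_obs[~is_test], f_calc[~is_test]),
            r_factor(f_obs[is_test], f_calc[is_test]))


# Three made-up reflections: |10-9| + |20-22| + |30-30| = 3, divided by 60.
print(r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
```

The same `r_factor` function serves for both sets; only the choice of reflections entering the sums differs between R and Rfree.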
Refinement Target

Refinement searches for a global minimum of a target energy function similar to the one below:

$$E_{total} = w_{xray} E_{xray} + E_{conformation} + E_{nonbonded}$$

where
$w_{xray}$ = weight for the X-ray energy term,
$E_{xray}$ = X-ray energy term,
$E_{conformation}$ = conformational energy terms (bonds, angles), and
$E_{nonbonded}$ = nonbonded energy terms (van der Waals, electrostatic).

Rigid Body Refinement

Reduces the conformational freedom within the model to improve the ratio of observables to parameters in the early stages of refinement. The entire model can be treated as a rigid body, or it can be regarded as linked, rigid groups. For each group of atoms specified by the user as a rigid body, the 3 rotational and 3 translational degrees of freedom are refined. [Figure labels: rigid body 1, rigid body 2.]

Positional Refinement

The atomic position parameters x, y and z are refined for each atom.

Difficulties in protein crystallography:
• There is a large number of parameters to fit.
• Macromolecular crystals diffract weakly, producing a poor parameters-to-observations ratio.

The geometrical constraints introduced by the conformational energy terms greatly reduce the number of parameters to be refined. Least-squares optimization or conjugate gradient minimization techniques are commonly used for finding the best fit of the model to the data.

B-factor (temperature factor) refinement

B-factors are indicators of atomic mobility. High values correspond to low electron density, indicating a dynamic or disordered region, or a possible error in position. The B-factor enters as an exponential term applied to the scattering factor; it relates to the thermal motion of the scattering atom and the decrease in scattering intensity that results from thermal motion:

$$f\, e^{-B \sin^2\theta / \lambda^2}$$

The X-ray energy term in the target energy function is revised so that $F_{calc}$ is replaced by

$$F_{calc}\, e^{-s^2 B / 4}, \qquad s = 2\sin\theta / \lambda$$

Occupancy Refinement

The occupancy factor is used to describe disorder in the model.
An atom with a partial occupancy factor can be thought of as an atom that does not occupy that position 100% of the time (e.g., ions, water, cofactors). Some refinement programs do not require that the occupancy factor be ≤ 1, so it is up to the crystallographer to remember that 1 is the upper limit on the occupancy factor for a given atom in a given position.

Simulated Annealing

Simulated annealing is a molecular dynamics (MD) refinement technique that involves control of the temperature, which is related to the kinetic energy (KE) of the MD simulation by:

$$T_{current} = \frac{2\, KE}{3 n k_B}$$

for $n$ = number of atoms (3n degrees of freedom) and $k_B$ = Boltzmann constant.

Gradient descent minimization and least-squares optimization methods are prone to getting "stuck" in local minima when applied to the vast problem of solving the structure of a biological macromolecule. In these cases, it is often necessary to overcome an energy barrier between the local minimum and the global minimum. Therefore, to reach the global minimum, an algorithm must be applied that can go energetically "uphill".

Model Building

Starting model:
• Molecular replacement model: the initial model is the search model that has been positioned in the unit cell by the rotation and translation functions.
• MAD/isomorphous model: electron density is calculated using the heavy atom phases, then the model has to be built into the electron density.

Maps

Electron density distribution:

$$\rho(x, y, z) = \frac{1}{V_{\text{unit cell}}} \sum_h \sum_k \sum_l F(hkl)\, e^{-i\phi}, \qquad \phi = 2\pi(hx + ky + lz)$$

The first map is an approximation to the true electron density, derived from the observed structure factor amplitudes ($F_{obs}$) and the estimated phases (MR, MAD, or MIR phases). (Remember our illustrations: the correctness of the model image depends more on having the correct phase information than on having the correct amplitudes.)
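The remark about phases mattering more than amplitudes can be checked numerically, in the spirit of the Karle/Hauptman photograph swap. The sketch below is an assumption-laden stand-in: instead of photographs it uses two random 64 x 64 arrays as "density maps", combines the Fourier amplitudes of one with the phases of the other using NumPy's FFT routines, and measures which original the hybrid resembles. The array size, the seed and the function names are arbitrary.

```python
import numpy as np


def combine(amplitude_source, phase_source):
    """Map built from the Fourier amplitudes of one image and the phases of another."""
    amplitudes = np.abs(np.fft.fft2(amplitude_source))
    phases = np.angle(np.fft.fft2(phase_source))
    hybrid = np.fft.ifft2(amplitudes * np.exp(1j * phases))
    return hybrid.real   # imaginary part is rounding noise for real-valued inputs


def correlation(a, b):
    """Pearson correlation coefficient between two maps of the same shape."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


rng = np.random.default_rng(0)
map_a = rng.random((64, 64))   # stand-in for the first "photograph"
map_b = rng.random((64, 64))   # stand-in for the second "photograph"

hybrid = combine(amplitude_source=map_b, phase_source=map_a)

corr_with_phase_source = correlation(hybrid, map_a)
corr_with_amplitude_source = correlation(hybrid, map_b)
print("correlation with the map that supplied the phases:    ", round(corr_with_phase_source, 3))
print("correlation with the map that supplied the amplitudes:", round(corr_with_amplitude_source, 3))
```

For independent maps like these, the hybrid correlates strongly with the map that supplied the phases and hardly at all with the one that supplied the amplitudes, which is why poor initial phases (and model bias in MR phases) are such a concern when interpreting the first map.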
Maps cont'd

Both tryptophans are from the same 1.7 Å crystal structure, but the map in Figure 1 is the first map calculated using the initial MR phases and the map in Figure 2 is the final map calculated using the refined phases. [Figures 1 and 2.]

Resolution limits

Resolution     What can be determined
6.0-4.5 Å      Placement of secondary structures
3.0 Å          Chain tracing
2.5 Å          Side chain orientation
1.8 Å          Alternate side chain orientations
1.2 Å          Hydrogen atoms

Map types

2Fo - Fc Maps
Fo = observed structure factors
Fc = calculated structure factors
Subtracting Fc from 2Fo exaggerates the areas where Fo differs from Fc. Where Fo is greater than Fc, the net structure factor amplitude is intensified, and where Fo is less than Fc, the net structure factor amplitude is decreased.

Fo - Fc Maps (Difference Maps)
Produce "positive" or "negative" peaks in areas where Fo differs from Fc. This map is usually contoured at a high level (3σ or 4σ), so all the crystallographer views are the large difference peaks (not likely to be just noise).

Atomic Model Deposition - The Protein Data Bank

You've solved your 1.2 Å crystal structure with an R-factor of 15.4% and an R-free of 16.2%. It's time to share your hard-won scientific knowledge with the rest of the world. When you publish your paper, most journals will request that you provide your PDB accession number, indicating you have deposited your coordinates for the betterment of mankind. So, you type the following URL into your browser: http://www.rcsb.org/pdb/ and wind up here:

"Welcome to the PDB, the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data."

PDB Validation Suite

PROCHECK, NUCHECK, SFCHECK

PROCHECK
Assesses the geometry of the residues in a given protein structure, as compared with stereochemical parameters derived from well-refined, high-resolution structures.
Unusual regions highlighted by PROCHECK are not necessarily errors, but may be unusual features for which there is a reasonable explanation (e.g., distortions due to ligand binding in the protein's active site). Nevertheless, they are regions that should be checked carefully. The only input required for PROCHECK is the PDB file holding the coordinates of the structure of interest.

Practical Considerations - generalizations
(that means, of course, that there are always exceptions)

Resolution:
  Good: < 3 Å
  For sidechain conformations: ≤ 2 Å
R-factor:
  Good upper limit for ~2 Å data: 20-23%
R-free:
  within 10% of R (closer for hi res)
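The resolution figures above can be tied back to Bragg's law, 2d sin θ = nλ: the finest lattice spacing d that is recorded depends on the largest scattering angle collected. The sketch below simply rearranges that formula. The function names are illustrative, and the 1.54 Å default is the Cu Kα wavelength quoted earlier for an in-house source.

```python
import math

CU_K_ALPHA = 1.54  # Angstrom, Cu K-alpha wavelength quoted for the rotating anode


def bragg_angle_deg(d, wavelength=CU_K_ALPHA, order=1):
    """Angle theta (degrees) at which planes of spacing d (Angstrom) satisfy Bragg's law."""
    s = order * wavelength / (2.0 * d)
    if not 0.0 < s <= 1.0:
        raise ValueError("Bragg's law cannot be satisfied: n*lambda/(2d) must lie in (0, 1]")
    return math.degrees(math.asin(s))


def d_spacing(theta_deg, wavelength=CU_K_ALPHA, order=1):
    """Lattice spacing d (Angstrom) that diffracts at angle theta (degrees)."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


for d in (6.0, 3.0, 2.5, 1.8, 1.2):
    theta = bragg_angle_deg(d)
    print(f"d = {d:3.1f} A  ->  theta = {theta:5.2f} deg, 2*theta = {2 * theta:5.2f} deg")
```

Higher resolution (smaller d) means reflections at larger angles, so the detector geometry limits the resolution that can be recorded. With θ capped at 90 degrees, the smallest spacing reachable at this wavelength is λ/2, about 0.77 Å.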