A new way of looking at globular proteins

Looking for "layer structures"

• Globular proteins can be represented as built from "layers" of peptide backbone:
  – The layers can be represented by the plane of the H-bonds in β-sheets, or by a series of parallel α-helices packed against the sheets
• The layers can be flat, curved or cylindrical
• Hydrophobic residues are buried between the layers
• In soluble proteins the exterior consists mainly of polar residues, which interact favourably with the solvent
• It is striking that the majority of protein structures can be classified in this way: only a few structures cannot

Figure: Simplified layer structure of proteins. Layers of secondary structure (β green; α red) are combined to make globular protein domains. The β-sheets are represented as bars and circles, as they would appear when viewed looking along their component strands. Each sheet has a left-handed twist between the strands (not depicted) onto which can be added curl and stagger. Redrawn with permission from Nature, 2002, 416, 657.

Examples of protein domains with different numbers of backbone layers (hydrophobic residues in yellow)

• Two layers of α-helix
• An unusual 5-layer structure: one layer of β-strands sandwiched between 4 layers of α-helices
• One layer of β-sheet between two layers of helix, 3 layers in all
• Concentric layers of β-strands on the inside and α-helices on the outside (a structure also known as an α-β barrel)

MOST FREQUENT FOLD TYPES

Protein folds and function

In nature there are perhaps 10,000 different folds, that is, shapes in which proteins are found, each with a characteristic arrangement of its secondary-structure elements. In practice only a few hundred are truly widespread, while the others are rare or species-specific.
Nature economizes its effort and does not create new shapes when small modifications of pre-existing structures are enough to obtain different functions (Coulson, Proteins 2002).

The concept of superfolds, mesofolds and unifolds

• SUPERFOLDS: a few hundred (perhaps 400), present in many proteins with many functions
• MESOFOLDS: a middle ground; folds of intermediate frequency
• UNIFOLDS: present in a single kind of protein. Perhaps the majority of folds, and species-specific. There may be more than 10,000

Protein folds and function: superfolds, mesofolds, unifolds

The superfolds

TIM barrel fold

The structure consists of an eightfold repeat of beta/alpha units: eight parallel beta strands on the inside are covered by eight alpha helices on the outside. The fold was first seen in triose phosphate isomerase. All known TIM barrel structures are enzymes, except for the narbonin family. Many of these enzymes are glycosyl hydrolases (EC 3.2.x.x). The fold is highly versatile, being found in single-domain monomeric enzymes and as the catalytic domain of larger enzymes. The active site is found at the C-terminal end of the barrel in a series of loops, hence it is very easy to alter the function and/or specificity without altering the core structure.

Alpha/beta hydrolase fold

The structure is an eight-stranded, mostly parallel alpha/beta structure. The fold is very plastic and tolerates large insertions. All proteins known so far to contain this fold are enzymes. The catalytic activity of this fold arises from a catalytic triad of a nucleophile, an acid and a histidine residue. The nucleophile is found in a "nucleophilic elbow" turn located just after the fifth beta strand.

NAD binding domain (Rossmann fold)

This is a double beta-alpha-beta-alpha-beta motif, and is a common structural motif of enzymes binding NAD, NADP and other related cofactors; for example, NAD is found in dehydrogenases as the hydrogen acceptor. The domain is found as a common core unit in many structures, with other structural units at the periphery.
P-loop NTP hydrolase fold

This fold consists of alpha/beta/alpha layers, with parallel or mixed beta sheets of variable size. The fold binds the phosphate of ATP or GTP and is found in ATP- and GTP-binding proteins such as adenylate kinase. The P-loop is a phosphate-binding loop that binds the phosphate groups of ATP and GTP; it is a glycine-rich sequence with the consensus (A,G)xxxxGK(T,S).

Figure: The P-loop residues shown in detail (left) in guanylate kinase.

Ferredoxin-like fold

This fold consists of an alpha/beta sandwich with an antiparallel beta sheet. The ferredoxin-like fold is associated predominantly with non-enzymatic ferredoxins. Ferredoxins are iron-sulphur proteins involved in electron transport, and often form part of multisubunit assemblies.

DISTRIBUTION OF FUNCTION AMONG FOLDS

Structural classes and enzymatic function (Hegyi & Gerstein, JMB 1999)

1) alpha/beta (α/β): proteins with clearly distinct alpha and beta structure. They are predominantly enzymes, especially transferases and hydrolases
2) All-alpha (α) and small folds: associated with non-enzymatic proteins

These observations, however, do not allow a function to be assigned unambiguously to a structural class on the basis of secondary-structure content alone.

Figure: Swiss-Prot proteins matched against SCOP folds using BLAST and arranged as 92 functional categories (grouped into 6 supergroups) by row against 229 folds by column. The first row holds the non-enzymes. There are 21,068 (= 92 × 229) possible combinations, but only 331 have been observed (filled squares).

Calculating interaction forces

What is a force field?

• A force field is a set of equations and parameters which, when evaluated for a molecular system, yields an energy
• A force field is a specific type of vector field where the value of a given force is defined at each point in space.
Examples include gravitational fields and electrostatic fields
• The space around a radiating body within which its electromagnetic oscillations can exert force on another similar body not in contact with it

It is all about time versus accuracy

• Quantum chemistry
• Approximations
• Force fields
• Hybrid methods
• Molecular dynamics and energy calculations
• Minimizers

Quantum chemistry is accurate, but slow

The largest system for which the Schrödinger equation can realistically be solved exactly is the hydrogen atom. Other exactly solvable cases, such as the particle in a box, are mainly of theoretical importance. Pure quantum chemistry therefore cannot be applied in our (protein) world. Approximations can make quantum chemistry software faster, but at the cost of accuracy.

The basic functional form of a force field encapsulates both:

1. bonded terms, relating to atoms that are linked by covalent bonds
2. nonbonded (also called "noncovalent") terms, describing the long-range electrostatic and van der Waals forces

The specific decomposition of the terms depends on the force field, but a general form for the total energy in an additive force field can be written as

$$E_{total} = E_{bonded} + E_{nonbonded}$$

where the components of the covalent and noncovalent contributions are given by the following summations:

$$E_{bonded} = E_{bond} + E_{angle} + E_{dihedral}$$

$$E_{nonbonded} = E_{electrostatic} + E_{van\,der\,Waals}$$

The bond and angle terms are usually modeled as harmonic oscillators in force fields that do not allow bond breaking. The functional form for the rest of the bonded terms is highly variable. Proper dihedral potentials are usually included. Additionally, "improper torsional" terms may be added to enforce the planarity of aromatic rings and other conjugated systems, as may "cross-terms" that describe the coupling of different internal variables, such as angles and bond lengths. Some force fields also include explicit terms for hydrogen bonds.

The nonbonded terms are the most computationally intensive because they include many more interactions per atom.
A popular choice is to limit interactions to pairwise energies. The van der Waals term is usually computed with a Lennard-Jones potential and the electrostatic term with Coulomb's law, although both can be buffered or scaled by a constant factor to account for electronic polarizability and produce better agreement with experimental observations.

Covalent bonds

In simple terms, a covalent bond exists between two atoms if they share electrons between them. In contrast, an ionic bond is formed if electrons are transferred between atoms (e.g., in sodium chloride an electron is given up by the sodium atom to form a Na+ ion and accepted by the chlorine atom to form a chloride ion, Cl-). A single bond is formed when one pair of electrons is involved and a double bond when two pairs are involved.

In quantum chemical terms such a picture is overly simplistic: although a bonding orbital results in an increase in electron density between the atoms, it also spreads over the rest of the molecule. This is particularly so in the case of delocalized bonding. The "classical" example of delocalized bonding is the benzene molecule, which can be described as a resonance hybrid between a number of alternate structures.

The standard way to approximate the potential energy for a bond in a protein, and in most other molecules, is to use a Hooke's law term

$$E_{bond} = K_r (r - r_{eq})^2$$

where r is the length of the bond (i.e., the distance between the two nuclei of the atoms between which the bond acts), r_eq is the equilibrium bond length and K_r is a spring constant (in the AMBER convention the factor of 1/2 is absorbed into K_r). This basically represents the bond as a spring linking the two atoms.

Figure: Graph of the potential energy dependence for a C=O bond.

The shape of the potential energy well is parabolic and the motion will therefore tend to be harmonic. Note that when the bond is at its equilibrium length, i.e., r = r_eq, the potential energy is assigned to be zero, and when r approaches large values (i.e., the bond breaks) the energy goes to infinity.
This kind of approach does not attempt to reflect the energy of formation of the bond; it only seeks to reflect the energy difference for a small motion about the equilibrium value.

Typical values for bond constants and equilibrium bond lengths, taken from the AMBER potential energy function:

Atom pair   r_eq (Å)   K_r (kcal/(mol·Å²))
C=O         1.229      570
C-C2        1.522      317
C-N         1.335      490
C2-N        1.499      337
N-H         1.010      434

Atom types:
• O: a carbonyl oxygen
• C: an sp2 carbon (such as that attached to an O)
• N: a main chain nitrogen atom
• H: a hydrogen atom attached to the N
• C2: a "united atom" group (CH2). United atom means that instead of representing the carbon atom and the two hydrogen atoms bonded to it separately, only one centre is considered

Figure: Oxygen atoms are shown in red, nitrogens in blue, hydrogens in white and carbon atoms in light green. Only the hydrogen atoms bonded to an oxygen or nitrogen are shown.

This is a fairly routine approximation called the "united-atom" or "extended-atom" representation, which is an option in many potential energy functions. The hydrogen atoms which should be attached to carbons are not explicitly represented; instead a small adjustment is made to the van der Waals parameters of the carbon atom.

Bond angles

A bond angle θ between atoms A-B-C is defined as the angle between the bonds A-B and B-C. As bond angles are found (experimentally and theoretically) to vary around a single value, it is sufficient in most applications to use a harmonic representation, in a similar manner to the bond potential:

$$E_{angle} = K_\theta (\theta - \theta_{eq})^2$$

Angle     θ_eq (degrees)   K_θ (kcal/(mol·rad²))
C-N-H     119.8            35.0
C2-N-C    121.9            50.0
C2-N-H    118.4            38.0
C-C2-N    110.3            80.0
C2-C-O    120.4            80.0
C2-C-N    116.6            70.0
O-C-N     122.9            80.0

These are the bond angle parameters necessary to represent a glycine residue and its connections to neighbouring residues.
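The harmonic bond and angle terms described above can be sketched in a few lines of Python. This is a minimal illustration, not the AMBER implementation: it assumes the energy has the form K(x - x_eq)^2 with no factor of 1/2, and it assumes the angle force constants are expressed per radian squared (the usual AMBER convention). The function and dictionary names are hypothetical; the numbers are copied from the two tables above.

```python
import math

# Parameters copied from the bond and angle tables in this section.
BOND_PARAMS = {   # atom pair: (r_eq in angstrom, K_r in kcal/(mol*angstrom^2))
    "C=O": (1.229, 570.0),
    "N-H": (1.010, 434.0),
}
ANGLE_PARAMS = {  # angle: (theta_eq in degrees, K_theta in kcal/(mol*rad^2), assumed unit)
    "C-N-H": (119.8, 35.0),
    "O-C-N": (122.9, 80.0),
}


def bond_energy(r, r_eq, k_r):
    """Harmonic (Hooke's law) bond energy in kcal/mol; zero at r = r_eq."""
    return k_r * (r - r_eq) ** 2


def angle_energy(theta_deg, theta_eq_deg, k_theta):
    """Harmonic angle energy in kcal/mol; the deviation is converted to radians."""
    delta = math.radians(theta_deg - theta_eq_deg)
    return k_theta * delta ** 2


if __name__ == "__main__":
    # Stretching a C=O bond by 0.1 angstrom costs several kcal/mol.
    print(bond_energy(1.329, *BOND_PARAMS["C=O"]))
    # Opening the C-N-H angle by 10 degrees costs about 1 kcal/mol.
    print(angle_energy(129.8, *ANGLE_PARAMS["C-N-H"]))
```

The comparison shows why bond lengths are much stiffer than bond angles: a 0.1 Å stretch is penalized several times more than a 10 degree bend.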
A bond angle around 109 degrees means that the central atom is tetrahedral (with four other atoms bonded to it).

Figure: The angles around the C beta atom of an alanine residue, showing the three hydrogen atoms bonded to the carbon.

In contrast, an angle around 120 degrees indicates a flat (sp2) central atom with three other atoms bonded to it.

Figure: The angles made around a main chain nitrogen atom are all approximately equal to 120 degrees; consequently the group is planar.

The source of bond angle parameters is the same as for bonds: high resolution small molecule X-ray structures for equilibrium values, and either spectroscopic data or ab initio calculations for force constants.

Figure: Potential energy curve for the N-C-O bond angle.

Dihedral angles

The standard functional form for representing the potential energy for a torsional rotation was introduced by Pitzer (1951):

$$E_{dihedral} = \sum_n \frac{V_n}{2} \left[ 1 + \cos(n\phi - \gamma) \right]$$

V_n gives the energy barrier to rotation, n the number of maxima (or minima) in one full rotation, and γ determines the angular offset. The use of the sum allows for complex angular variation of the potential energy. Barriers for dihedral angle rotation can be attributed to the exchange interaction of electrons in adjacent bonds. Steric effects can also be important.

Figure: Potential energy curve for the omega dihedral angle, with parameters:

n    V_n (kcal/mol)   γ (degrees)
1    1.3              0
2    5.0              180
3    0.0              -

NON-BONDED INTERACTIONS

As the name implies, non-bonded interactions act between atoms which are not linked by covalent bonds. Like most things, this is simple to state but can be confusing to apply in practice! In most approaches, atoms which are involved in the same bond angle are also not regarded as having a non-bonded interaction.
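As a brief aside on the dihedral term introduced earlier in this section, the cosine series can be evaluated directly. The sketch below assumes the common form E = sum over n of (V_n/2)[1 + cos(n·phi - gamma)] and uses the omega parameters tabulated above (the n = 3 term has V_n = 0 and is omitted). The names `OMEGA_TERMS` and `torsion_energy` are hypothetical.

```python
import math

# (n, V_n in kcal/mol, gamma in degrees) for the omega dihedral, from the table above.
OMEGA_TERMS = [
    (1, 1.3, 0.0),
    (2, 5.0, 180.0),
]


def torsion_energy(phi_deg, terms=OMEGA_TERMS):
    """Torsional energy in kcal/mol, assuming E = sum (V_n/2) * [1 + cos(n*phi - gamma)]."""
    phi = math.radians(phi_deg)
    return sum(
        0.5 * v_n * (1.0 + math.cos(n * phi - math.radians(gamma_deg)))
        for n, v_n, gamma_deg in terms
    )


if __name__ == "__main__":
    for phi in (0.0, 90.0, 180.0):
        print(phi, round(torsion_energy(phi), 3))
```

Under these assumptions the trans conformation (180 degrees) is the minimum, the cis conformation (0 degrees) lies 1.3 kcal/mol higher, and the barrier between them sits near 90 degrees, which is consistent with the peptide bond strongly preferring planarity.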
1-4 interactions (those between the end atoms involved in a dihedral angle) are sometimes given an additional, scaled-down non-bonded interaction.

Electrostatic interactions

Electromagnetic interactions dominate on the molecular scale and provide the fundamental basis for all the different bonded and non-bonded interactions discussed here. This is clearest in the case of electrostatic interactions, where charges on nuclei and electrons interact according to Coulomb's law

$$E_{elec} = \frac{q_i q_j}{4 \pi \varepsilon_0 \varepsilon_r r_{ij}}$$

where q_i and q_j are the magnitudes of the charges, r_ij is their separation, ε_0 is the electric constant (the permittivity of vacuum) and ε_r is the relative dielectric constant of the medium in which the charges are placed.

The source of dielectric effects is that the electric field polarizes the material involved. Suppose that we have two charges interacting in a vacuum. We can draw the electric field lines (the direction in which a positive charge would be forced to move). These charges are then placed in a dielectric medium, which can be thought of as being composed of a large number of microscopic dipoles (each a little rod with positive charge at one end and negative charge at the other). Every dipole lines up so that its positive end points toward the negative charge and vice versa. This means that the electric field caused by the dipoles opposes the original electric field everywhere. This reduction in field causes a reduction in electric potential and thus a reduction in the interaction energy.
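The screening effect of a dielectric can be made concrete with a short Python sketch of Coulomb's law. It is an illustration under stated assumptions: charges are in units of the electron charge, distances in Å, and the prefactor 1/(4πε_0) is taken as approximately 332.06 kcal·Å/(mol·e²), a conversion commonly used in molecular mechanics. The function name `coulomb_energy` is hypothetical, and the dielectric constant 80.3 is the value for water at 20 C listed in this section.

```python
# 1/(4*pi*eps0) expressed in kcal*angstrom/(mol*e^2); a standard conversion factor.
COULOMB_CONSTANT = 332.0636


def coulomb_energy(q_i, q_j, r_ij, eps_r=1.0):
    """Coulomb interaction energy in kcal/mol.

    q_i, q_j: charges in units of the electron charge
    r_ij:     separation in angstrom (must be positive)
    eps_r:    relative dielectric constant of the medium (1.0 = vacuum)
    """
    if r_ij <= 0.0:
        raise ValueError("separation must be positive")
    return COULOMB_CONSTANT * q_i * q_j / (eps_r * r_ij)


if __name__ == "__main__":
    # A +1/-1 charge pair 4 angstrom apart, in vacuum and in bulk water.
    in_vacuum = coulomb_energy(+1.0, -1.0, 4.0)
    in_water = coulomb_energy(+1.0, -1.0, 4.0, eps_r=80.3)
    print(in_vacuum, in_water)
```

With these inputs the attraction is roughly 83 kcal/mol in vacuum but only about 1 kcal/mol in water, which illustrates how strongly a polar solvent weakens charge-charge interactions.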
Note that the electric field between charges permeates the whole of space: it does not depend only on what is immediately in between the charges.

The dielectric constant of selected materials

Material                   Dielectric constant
Water (20 C)               80.3
Water (0 C)                87.7
Ice (-10 C)                ~98
Methanol                   33.6
Liquid H2S (-85.5 C)       9.3
Beeswax                    2.9
Paraffin                   2.0-2.5
Liquid argon (-191 C)      1.5
Vacuum                     1.0 (by definition)

The strictly correct way to use the law would be to consider every nucleus and electron separately, plug them into the Schrödinger equation and apply quantum chemical methods to solve the equation for the spatial configuration of nuclei we are interested in. As already mentioned, this is completely impractical for biomolecular systems. So instead we wish to develop a useful model for the interactions between nuclear centres (commonly called "atoms") without having to deal explicitly with the electrons in a system.

The simplest approach is to consider just the formal charges of the protein. Formal charges show whether chemical groups are ionized, i.e., whether an atom or set of atoms has lost or gained an electron. Isolated amino acids (in neutral solution) are zwitterionic: although the molecule has no overall charge, it carries both a negatively charged group and a positively charged group.

SALT BRIDGE

In practice salt bridges are relatively rare in proteins, and they normally occur on the surface rather than internally. An exception is when an internal salt bridge is involved in the catalytic mechanism of an enzyme, as in the Asp-His-Ser triad of serine proteases (a classic example of the structural basis of enzyme activity).

The reason for this is that although an internal salt bridge is a strong interaction in comparison to having the isolated residues widely separated in a vacuum, it is normally destabilizing for a protein.
This apparent paradox is due to the fact that when considering the effect of an interaction one must consider the difference in (free) energy between the folded state and the unfolded but solvated state. In the unfolded state the residues involved in a salt bridge would be widely separated, but each would make very favourable interactions with water molecules (there is an entropic contribution to this). These interactions are lost when the same residues are buried in the largely hydrophobic core of the protein. Similar arguments apply to practically any attempt to elucidate the energetic contributions to protein folding or ligand binding: normally a small overall free energy advantage arises from the balance between large but cancelling contributions.

Hydrogen bonds

Distance: 2.8 Å. Energy: 6 kcal/mol.

The electrostatic interactions between groups which carry no formal overall electrical charge.

Partial charges

We have seen that electrostatic interactions are of fundamental importance to proteins. We shall now briefly examine the manner in which they are normally treated in computational studies.

1. The most common approach is to place a partial charge at each atomic centre (nucleus).
2. These charges then interact by Coulomb's law.
3. The charge can be a fraction of an electron charge and can be positive or negative.
4. Charges on adjacent atoms (joined by one or two covalent bonds) are normally made invisible to one another, the interactions between these atoms being dealt with by the covalent terms.

Note that the concept of a partial charge is only a convenient abstraction of reality. In practice many electrons and nuclei come together to form a molecule; partial charges give a crude representation of what a neighbouring atom will on average "see" due to this collection.

1.
The standard modern way to calculate partial charges is to perform a (reasonably high level) quantum chemical calculation for a small molecule which is representative of the group of interest (e.g., phenol is considered for tyrosine).
2. The electrostatic potential is then calculated from the resulting orbitals at many points on the molecular surface. A least squares fitting procedure is then used to produce a set of partial charges whose potential values are most consistent with the quantum calculations.

INDUCTION

The normal treatment for partial charges is to assume they are fixed. In practice the electric field caused by other atoms and molecules will polarize an atom, affecting its electron distribution and thus its partial charge. In turn the partial charge produces an electric field which affects neighbouring charges and thus fields. The process of polarization has an energetic effect. In practice it is difficult to find adequate parameters to treat systems as complex as proteins. Induction effects can be shown to decay as r^-6, so they can normally be regarded as implicitly corrected for when the dispersion term is fitted.

DISPERSION (London forces)

The London dispersion force is the weakest intermolecular force. It is a temporary attractive force that results when the electrons in two adjacent atoms occupy positions that make the atoms form temporary dipoles. It relies on induction.

Imagine that we have an atom of argon. It can be considered to be like a large spherical jelly with a golf ball embedded at the centre. The golf ball is the nucleus, carrying a large positive charge, and the jelly represents the clouds of electrons whizzing about it. At a point external to the atom the net average field will be zero, because the field of the positively charged nucleus is exactly balanced by that of the electron clouds. However, atoms vibrate (even at 0 K), so at any instant the electron cloud is likely to be slightly off centre.
This disparity creates an "instantaneous dipole". The dispersion interaction can be shown to vary according to the inverse sixth power of the distance between the two atoms:

$$E_{disp} = -\frac{B_{ij}}{r_{ij}^{6}}$$

The factor B_ij depends on the nature of the pair of atoms interacting (in particular their polarizability). It is normal to parameterize the dispersion empirically using structural and energetic data from crystals of small molecules. It is not possible to use simple quantum chemical calculations to find the parameters. In such calculations each electron is solved for independently, keeping the other orbitals frozen, and the procedure is iterated to self-consistency. This effectively means that electrons only experience a time-averaged picture of the other electrons, so that dispersion cannot come into effect. More advanced methods in quantum chemistry treat "electron correlation" to avoid this.

REPULSION

When two atoms are brought increasingly close together there is a large energetic cost as the orbitals start to overlap. In the limit where the atomic nuclei were coincident, the electrons of the two atoms would have to share the same orbital system. The Pauli exclusion principle states that no two electrons can share the same state, so in effect half the electrons of the system would have to go into orbitals with an energy higher than the valence state. For this reason the repulsive core is sometimes termed a "Pauli exclusion interaction".

The Lennard-Jones potential and van der Waals radii

The dispersion and repulsion terms discussed above are commonly grouped together into the Lennard-Jones or 6-12 potential

$$E_{LJ} = \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}$$

where A_ij is the corresponding coefficient for the repulsive term. The equation can be rewritten in an equivalent, more instructive form (choosing the case of an interaction between two atoms of the same type):

$$E_{LJ} = E^{*} \left[ \left( \frac{2R^{*}}{r} \right)^{12} - 2 \left( \frac{2R^{*}}{r} \right)^{6} \right]$$

The minimum of the function is at r = 2R* and has an energy of -E*.
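The equivalence of the two ways of writing the 6-12 potential can be checked with a short derivation. This is a sketch under the assumption that the potential has the usual form A/r^12 - B/r^6 for a pair of identical atoms, with A the repulsion coefficient and B the dispersion coefficient; the symbols A and B without indices are shorthand introduced here.

```latex
% Assumed starting form for two atoms of the same type:
%   E(r) = A / r^{12} - B / r^{6}
\begin{align*}
\frac{dE}{dr} &= -\frac{12A}{r^{13}} + \frac{6B}{r^{7}} = 0
  \quad\Longrightarrow\quad r_{min}^{6} = \frac{2A}{B} \\
% Identify the position of the minimum with twice the van der Waals radius:
r_{min} &= 2R^{*} \\
% Energy at the minimum defines the well depth:
E(r_{min}) &= \frac{A}{r_{min}^{12}} - \frac{B}{r_{min}^{6}}
  = -\frac{B^{2}}{4A} = -E^{*} \\
% Solving for the coefficients:
A &= E^{*}\,(2R^{*})^{12}, \qquad B = 2E^{*}\,(2R^{*})^{6} \\
% Substituting back gives the rewritten form:
E(r) &= E^{*}\left[\left(\frac{2R^{*}}{r}\right)^{12}
  - 2\left(\frac{2R^{*}}{r}\right)^{6}\right]
\end{align*}
```

Read this way, R* and E* are just a more interpretable reparameterization of A and B: one fixes where the minimum lies, the other how deep it is.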
The distance R* is known as the van der Waals radius for an atom and E* is its van der Waals well depth.

Atom type       van der Waals radius (Å)   van der Waals well depth (kcal/mol)
C (aliphatic)   1.85                       0.12
O               1.60                       0.20
H               1.00                       0.02
N               1.75                       0.16
P               2.10                       0.20
S               2.00                       0.20

It is important to note that the Lennard-Jones interaction between uncharged atoms (such as CH3 groups) is less attractive than that between charged groups such as oxygens. The difference is that for charged groups the contribution from electrostatics will dominate the L-J interaction. In cases where uncharged groups form compact structures, van der Waals energies are often cited as stabilizing the conformation. Although this is partly true, very often the major contribution comes rather from hydrophobic exclusion.
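To make the role of R* and E* concrete, the sketch below evaluates the 6-12 potential for a pair of identical atoms using the values in the table above. It assumes the form E = E*[(2R*/r)^12 - 2(2R*/r)^6], for which the minimum lies at r = 2R* with depth -E*. The names `VDW_PARAMS` and `lj_energy` are hypothetical, and mixed atom pairs (which need a combination rule) are deliberately left out.

```python
# atom type: (van der Waals radius R* in angstrom, well depth E* in kcal/mol),
# copied from the table above.
VDW_PARAMS = {
    "C": (1.85, 0.12),
    "O": (1.60, 0.20),
    "H": (1.00, 0.02),
    "N": (1.75, 0.16),
    "P": (2.10, 0.20),
    "S": (2.00, 0.20),
}


def lj_energy(r, r_star, e_star):
    """Lennard-Jones energy (kcal/mol) for two identical atoms at separation r (angstrom).

    Assumes E = E* * [(2R*/r)^12 - 2 * (2R*/r)^6].
    """
    if r <= 0.0:
        raise ValueError("separation must be positive")
    ratio6 = (2.0 * r_star / r) ** 6
    return e_star * (ratio6 * ratio6 - 2.0 * ratio6)


if __name__ == "__main__":
    r_star, e_star = VDW_PARAMS["C"]
    for r in (3.0, 3.7, 5.0, 10.0):
        print(r, round(lj_energy(r, r_star, e_star), 4))
```

For two aliphatic carbons the energy is strongly repulsive at 3.0 Å, reaches its minimum of -0.12 kcal/mol at 3.7 Å (twice the van der Waals radius) and decays towards zero at larger separations.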