Protein-Protein Interactions: Stability, Function and Landscape

Structural Aspects of Protein-Protein Interactions

Agenda
• Understand the importance of studying protein-protein interactions at the structural level
• Classify the various types of interactions
• Look at one structure-based method for predicting protein-protein interactions

Protein interaction
• Definition: Specific interactions between two or more proteins.
• Examples: Enzyme-inhibitor complexes; antibody-antigen complexes; receptor-ligand interactions; multiprotein complexes such as ribosomes or RNA polymerases.

Homocomplexes are usually permanent and optimized (e.g., the homodimer cytochrome c′ (1)) (Fig. 1a). Heterocomplexes can also have such properties, or they can be non-obligatory: made and broken according to the environment or external factors, and involving proteins that must also exist independently [e.g., the enzyme–inhibitor complex trypsin with the inhibitor from bitter gourd (2) (Fig. 1b) and the antibody–protein complex HYHEL-5 with lysozyme (3) (Fig. 1c)]. It is important to distinguish between the different types of complexes when analyzing the intermolecular interfaces that occur within them.

Characteristics
Classification: Protein-protein interactions can be classified, somewhat arbitrarily, by the proteins involved (structural or functional groups) or by their physical properties (weak and transient, "non-obligate", vs. strong and permanent). Protein interactions are usually mediated by defined domains, so interactions can also be classified by the underlying domains.
Universality: Protein-protein interactions underlie virtually all of molecular biology (Alberts et al. 2002, Lodish et al. 2000).
Protein-protein interactions affect all processes in a cell: structural proteins need to interact in order to shape organelles and the whole cell, molecular machines such as ribosomes or RNA polymerases are held together by protein-protein interactions, and the same is true for multi-subunit channels or receptors in membranes. Specificity distinguishes such interactions from random collisions caused by Brownian motion in the aqueous solutions inside and outside of cells. Note that many proteins are known to interact even though it remains unclear whether particular interactions have any physiological relevance.
Number of interactions: Even in a simple single-celled organism such as yeast, the roughly 6000 proteins are estimated to take part in at least 3 interactions each, i.e. a total of 20,000 interactions or more. By extrapolation, there may be on the order of 100,000 interactions in the human body.

[Figure: The protein-protein interaction network in yeast. An interaction map of the yeast proteome assembled from published interactions. The map contains 1,548 proteins (boxes) and 2,358 interactions (connecting lines).]

Homo- and hetero-oligomeric complexes
Protein-protein interactions (PPIs) occur between identical or non-identical chains (i.e. homo- or hetero-oligomers). Oligomers of identical or homologous protein units can be organized in an isologous or heterologous way (Monod et al., 1965) with structural symmetry (Goodsell and Olson, 2000). An isologous association involves the same surface on both monomers (e.g. Arc repressor and lysin; Figure 1A and C), related by a 2-fold symmetry axis. An isologous association can only oligomerize further by using a different interface (e.g. to form a dimer of dimers with three 2-fold axes of symmetry). Heterologous assemblies, in contrast, use different interfaces that, without a closed (cyclic) symmetry, can lead to infinite aggregation.
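As a quick check on the yeast interaction map quoted above (1,548 proteins, 2,358 interactions), the average number of partners per protein can be worked out directly. The sketch below assumes the map is an undirected graph with no self-interactions, so each interaction contributes to the count of two proteins. The function name `mean_degree` is illustrative only.

```python
def mean_degree(n_proteins: int, n_interactions: int) -> float:
    """Average number of interaction partners per protein.

    Assumes an undirected interaction map: every interaction (edge)
    is shared by two proteins (nodes), hence the factor of 2.
    """
    if n_proteins <= 0:
        raise ValueError("n_proteins must be positive")
    return 2 * n_interactions / n_proteins


# Figures quoted for the published yeast interaction map.
yeast_map_degree = mean_degree(1548, 2358)
print(f"mean partners per protein: {yeast_map_degree:.2f}")
```

Under these assumptions the map averages roughly three partners per protein, in line with the per-protein figure quoted for yeast.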
Non-obligate and obligate complexes
As well as by composition, complexes can be distinguished by whether they are obligate or non-obligate. In an obligate PPI, the protomers are not found as stable structures on their own in vivo. Such complexes are generally also functionally obligate; for example, the Arc repressor dimer (Figure 1A) is essential for DNA binding. Many of the hetero-oligomeric structures in the Protein Data Bank involve non-obligate interactions of protomers that exist independently, such as intracellular signalling complexes (e.g. RhoA–RhoGAP; Figure 1D) and antibody–antigen, receptor–ligand and enzyme–inhibitor (e.g. thrombin–rodniin; Figure 1E) complexes. The components of such protein–protein complexes are often initially not co-localized and thus need to be independently stable. However, some homo-oligomers, which by definition are co-localized, can also form non-obligate assemblies (e.g. sperm lysin; Figure 1C).

Transient and permanent complexes
PPIs can also be distinguished by the lifetime of the complex. A permanent interaction is usually very stable and thus exists only in its complexed form, whereas a transient interaction associates and dissociates in vivo. We distinguish weak transient interactions, which feature a dynamic oligomeric equilibrium in solution where the interaction is broken and formed continuously (e.g. lysin; Figure 1C), from strong transient associations, which require a molecular trigger to shift the oligomeric equilibrium. For example, the heterotrimeric G protein (Figure 1F) dissociates into the Gα and Gβγ subunits upon guanosine triphosphate (GTP) binding, but forms a stable trimer with guanosine diphosphate (GDP) bound. Structurally or functionally obligate interactions are usually permanent, whereas non-obligate interactions may be transient or permanent.
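The weak/strong distinction is usually quantified through the dissociation constant Kd = [A][B] / [AB]. The sketch below converts a Kd into a standard binding free energy with the textbook relation ΔG° = RT ln(Kd / 1 M) and sorts it into the affinity bands named in this deck's classification (weak: mM-µM, intermediate: µM-nM, strong: nM-fM). The exact cut-offs at 1e-6 M and 1e-9 M, the 298.15 K default and the function names are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol: dG = RT * ln(Kd / 1 M)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def affinity_band(kd_molar: float) -> str:
    """Assign a Kd (in mol/L) to a band; boundary values are assumed cut-offs."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    if kd_molar > 1e-6:      # weaker than micromolar
        return "weak"
    if kd_molar > 1e-9:      # micromolar down to nanomolar
        return "intermediate"
    return "strong"          # nanomolar and tighter


for kd in (1e-4, 1e-8, 1e-12):
    print(f"Kd = {kd:.0e} M: {affinity_band(kd)}, "
          f"dG = {binding_free_energy(kd):.1f} kcal/mol")
```

A tighter complex has a smaller Kd and therefore a more negative ΔG°; each tenfold change in Kd shifts ΔG° by RT ln 10.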
Types of protein-protein interactions (PPI)
• Obligate PPI: the protomers are not found as stable structures on their own in vivo
  - Obligate homodimer: P22 Arc repressor (DNA-binding)
  - Obligate heterodimer: human cathepsin D (1LYB)
• Non-obligate PPI
  - Non-obligate homodimer: sperm lysin
  - Non-obligate heterodimer: RhoA and RhoGAP signaling complex

Types of protein-protein interactions (PPI), continued
• Obligate PPI: usually permanent; the protomers are not found as stable structures on their own in vivo
  - Example: obligate heterodimer, human cathepsin D
• Non-obligate PPI: transient or permanent
  - Permanent (many enzyme-inhibitor complexes); dissociation constant Kd = [A][B] / [AB], 10^-7 to 10^-13 M
    Example: non-obligate permanent heterodimer, thrombin and rodniin inhibitor
  - Transient, weak (electron transport complexes), Kd mM-µM
    Example: non-obligate transient homodimer, sperm lysin (interaction is broken and formed continuously)
  - Transient, intermediate (antibody-antigen, TCR-MHC-peptide, signal transduction PPI), Kd µM-nM
  - Transient, strong (requires a molecular trigger to shift the oligomeric equilibrium), Kd nM-fM
    Example: bovine G protein dissociates into Gα and Gβγ subunits upon GTP binding, but forms a stable trimer with GDP bound

Structural features of protein-interaction sites
• The contact area between two proteins is almost always
bigger than 1100 Å², with each of the interacting partners contributing at least 550 Å² of complementary surface. On average each partner loses about 800 Å² of solvent-accessible surface upon contact, contributed by some 20 amino acid residues of each partner, i.e. the average interface residue covers some 40 Å².
• NACCESS: The accessible surface area (ASA) of the complexes is calculated using an implementation of the Lee and Richards (1971) algorithm developed by Hubbard (1992). With a probe sphere of radius 1.4 Å, the ASA is defined as the surface traced by the centre of the probe as it is rolled around the van der Waals surface of the protein. The program is used to calculate the ASA of each protomer in the complex and then of the complete complex. The ASA shown in the results table is for a single subunit (chain 1 as designated by the user on the submission form; this subunit is indicated at the top of the table and coloured purple).
• Forces that mediate protein-protein interactions include electrostatic interactions, hydrogen bonds, the van der Waals attraction and hydrophobic effects.
• The average protein-protein interface is not less polar or more hydrophobic than the surface remaining in contact with the solvent. Water is usually excluded from the contact region. Non-obligate complexes tend to be more hydrophilic in comparison, as each component has to exist independently in the cell. It has been proposed that hydrophobic forces drive protein-protein interactions, while hydrogen bonds and salt bridges confer specificity.
• Shape: Independent studies showed that 83-84% of interfaces are more or less flat. With few exceptions, the interfaces are approximately circular areas on the protein surface in both permanent and non-obligate complexes. Interfaces in permanent associations tend to be larger, less planar, more highly segmented (in terms of sequence), and closer packed than interfaces in non-obligate associations.
Complementarity: This can be measured in terms of "fitting surface shape". Interfaces in homodimers, enzyme-inhibitor complexes, and permanent heterocomplexes are the most complementary, whilst the antibody-antigen complexes and the non-obligate heterocomplexes are the least complementary.
Secondary structure: In one study the loop interactions contributed, on average, 40% of the interface contacts. In another study (involving 28 homodimers), 53% of the interface residues were α-helical, 22% β-sheet, and 12% αβ, with the rest being coils.
Amino acid composition: Interfaces have been shown to be more hydrophobic than the exterior but less hydrophobic than the interior of a protein. In one study, 47% of interface residues were hydrophobic, 31% polar and 22% charged. Permanent complexes have interfaces that contain hydrophobic residues, whilst the interfaces in 5 non-obligate complexes favour the more polar residues. Site-directed mutagenesis showed that in many cases a large majority (i.e. > 50%) of interface residues can be mutated to alanine with little effect on Kd, i.e. the functional epitope is a subset of the structural epitope.

Clinical relevance and applications of protein-protein interaction analysis
Biologically active proteins such as peptide hormones or antibodies act by interacting with other proteins such as receptors or antigens, respectively. Knowing their interaction sites makes it possible to modify the activity of such proteins or to change their specificity. In addition, small molecules may be designed that block interactions such as the binding of virus coat proteins to their cellular receptors, thereby blocking infection. Proteins and their interactions are therefore potential drug targets. Sometimes protein-protein interactions are disadvantageous, as in insulin, which tends to form dimers and hexamers that are less active than monomers. Genetically engineered insulin molecules retain biological activity without oligomerizing.
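The surface-area figures quoted in the structural-features section come down to simple bookkeeping on ASA values: the area buried in an interface is the ASA of the two isolated protomers minus the ASA of the complex. The sketch below illustrates that arithmetic. The function names and the three input ASA values are hypothetical placeholders, not output from NACCESS or any real structure.

```python
def buried_surface_area(asa_a: float, asa_b: float, asa_complex: float) -> float:
    """Total solvent-accessible area (in square angstroms) lost on complex formation."""
    buried = asa_a + asa_b - asa_complex
    if buried < 0:
        raise ValueError("complex ASA exceeds the sum of the protomer ASAs")
    return buried


def area_per_interface_residue(buried_per_partner: float, n_residues: int) -> float:
    """Average buried area contributed by each interface residue of one partner."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return buried_per_partner / n_residues


# Hypothetical ASA values chosen to reproduce the averages quoted in the text.
total = buried_surface_area(asa_a=6000.0, asa_b=5500.0, asa_complex=9900.0)
per_partner = total / 2  # assumes both partners bury about the same area
print(total, per_partner, area_per_interface_residue(per_partner, 20))
```

With about 800 Å² lost per partner and some 20 interface residues each, the per-residue figure comes out at the 40 Å² quoted above.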
What Is the Preferred Way for Proteins to Interact?
• An ultimate goal in molecular and cellular biology is to predict the preferred mode of protein associations
  - Similar protein structures can associate in different ways
  - Different protein structures can associate in similar ways

Binding Is Still Not Entirely Understood!
Possible reasons:
• We usually observe one or two interaction sites; however, a large portion of the surface is probably involved in binding
• Some associations are stable; others are low affinity
• Binding reactions are often cooperative events
• Binding strength is condition-dependent

Interfaces Are Variable
• Different relative contributions of the hydrophobic effect versus electrostatic interactions
• Wide range of motifs, with no prevailing architectures

A Dataset of Protein-Protein Interfaces
• A nonredundant dataset provides diversity
• The clusters allow studies of
  - interface structures vs function
  - residue conservation

Definition Of Interfaces:
• An interface is the region between two polypeptide chains that are not covalently linked
• Residue selection is based on how close a residue is to residues of the second chain. If two residues (one from each chain) are in contact, they are interacting residues
• Residues in the vicinity of interacting residues are nearby residues. They provide the structural scaffold of the interfaces

[Figure: A protein complex forming an interface. Magenta: interacting residues. Cyan: nearby residues.]

Generation of the Dataset of Interfaces
• We started the generation of the dataset by extracting the interfaces between chains from the PDB coordinates
• On July 18, 2002, there were 18,687 entries in the PDB, which included 35,112 single chains, counting all individual chains in dimers, trimers and so on.
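The interface definition given earlier (interacting residues in contact across chains, plus nearby residues that scaffold them) can be sketched as follows. This is a minimal illustration, not the procedure used to build the dataset. It assumes each residue is reduced to a single point such as its C-alpha, and both distance cut-offs (6.0 Å) are placeholder values chosen for the example. The names `interface_residues`, `contact_cutoff` and `nearby_cutoff` are hypothetical.

```python
import math
from typing import Dict, Set, Tuple

Point = Tuple[float, float, float]
Chain = Dict[int, Point]  # residue number -> representative coordinate


def interface_residues(
    chain_a: Chain,
    chain_b: Chain,
    contact_cutoff: float = 6.0,  # assumed cut-off, in angstroms
    nearby_cutoff: float = 6.0,   # assumed cut-off, in angstroms
) -> Dict[str, Dict[str, Set[int]]]:
    """Split each chain's residues into 'interacting' and 'nearby' sets."""

    def contacts(own: Chain, other: Chain) -> Set[int]:
        # A residue is interacting if it lies within the cut-off of any
        # residue of the partner chain.
        return {
            i for i, p in own.items()
            if any(math.dist(p, q) <= contact_cutoff for q in other.values())
        }

    def nearby(own: Chain, interacting: Set[int]) -> Set[int]:
        # A residue is nearby if it is not interacting itself but lies close
        # to an interacting residue of its own chain.
        return {
            i for i, p in own.items()
            if i not in interacting
            and any(math.dist(p, own[j]) <= nearby_cutoff for j in interacting)
        }

    inter_a, inter_b = contacts(chain_a, chain_b), contacts(chain_b, chain_a)
    return {
        "A": {"interacting": inter_a, "nearby": nearby(chain_a, inter_a)},
        "B": {"interacting": inter_b, "nearby": nearby(chain_b, inter_b)},
    }


# Toy coordinates, not taken from any PDB entry.
toy_a = {1: (0.0, 0.0, 0.0), 2: (4.0, 0.0, 0.0), 3: (20.0, 0.0, 0.0)}
toy_b = {1: (0.0, 5.0, 0.0), 2: (0.0, 30.0, 0.0)}
print(interface_residues(toy_a, toy_b))
```

The all-against-all distance scan is quadratic in chain length, which is acceptable for a sketch; a spatial grid or k-d tree would be the usual choice when processing thousands of PDB entries.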
The dataset of interfaces contains 21,686 two-chain interfaces.

An Interface Between Two Chains
Interface Composition: Example
• The interface between the two chains: 'nearby' residues in green and contact residues in blue in chain A; 'nearby' residues in red and contact residues in magenta in chain B.
• A magnification of the interface: balls depict C-alpha atoms. The numbers refer to the residue positions. Green and blue are nearby and contact C-alphas in chain A. Red and magenta are nearby and contact C-alphas in chain B.

Residue Order Independence
• Similar arrangement in space; different sequential order

Representation of Proteins As Sets of Points in the Three Dimensional Space
• Each ball is a C-alpha

Hot Spots In The Interfaces
• Experimentally, a hot spot is a residue that, when mutated to alanine, gives rise to a distinct drop in the binding constant (tenfold or more)
• All data are deposited in http://www.asedb.org
DeLano, W.L., Unraveling hot spots in binding interfaces: progress and challenges. Curr. Opin. Struct. Biol. 2002

Computationally, Hot Spots Distinguish Between Binding Sites and Exposed Protein Surfaces
(B. Ma, T. Elkayam, H. Wolfson, R. Nussinov PNAS | May 13, 2003 | vol. 100 | no.
10 | 5772-5777)

Multiple Structure Alignment (MUSTA)
• Align all structures in each cluster
• Find structurally conserved residues (hot spots)
[Figure: Conserved hot spots. Panels: interface; exposed surface.]

Clustering
• Clustering is iterative
• At each iteration strict criteria are used
• At the end of the first cycle, the number of interface clusters decreased from 21,686 to 16,446
• Members in each cluster share a connectivity score of at least 0.9 and at least 90% sequence identity
• All have exactly the same number of interface residues on their interfaces

The parameters used during the clustering of the interfaces are:
Cycle | Number of interfaces | Minimal connectivity score | Minimal % amino acid identity | Maximal amino acid size difference between interfaces
A | 21686 → 16446 | 0.9 | 90 | 0
B | 16446 → 9637 | 0.9 | 80 | 3
C | 9637 → 6647 | 0.8 | 50 | 10
D | 6647 → 5332 | 0.7 | 25 | 20
E | 5332 → 4429 | 0.6 | 10 | 40
F | 4429 → 3799 | 0.5 | 0 | 50

Further Filtering
• Each cluster should have at least 5 members
• None of the members should share a sequence similarity score of 50% or higher (sequence alignments are done with CLUSTALW)

Cluster Categories
• This filtering reduced the number of clusters from 3799 to 103
  - Library construction carried out through pair-wise comparison
• Based on multiple structure alignment, the clusters are divided into two categories
1. Category I interfaces are clusters which share only ONE similar side. These clusters allow us to address the problem of how a given binding site can bind somewhat different protein surfaces

Cluster Category II
2.
Interface clusters which share TWO similar sides
  - Type I: Clusters with similar interfaces and similar functions
  - Type II: Clusters with similar interfaces but dissimilar functions

Sample List of Some of the Two-chain Interface Clusters
The dataset contains functional dimers, and others such as receptor/ligands, antibody/antigens, enzyme/inhibitors, coat/capsid proteins.

Family name (from SCOP database) | Members of the cluster (protein complexes in the cluster) | Aligned residues | # of members
TRANSFERASES: Glutathione S-transferases, C-terminal domain | 10gsAB, 1axdAB, 1b48AB, 1c72AB, 1f2eAB, 1gnwAB, 1gwcBC, 1jlvAB, 1ljrAB, 1pd212 | 67 | 10
ANTIBODIES: Immunoglobulin antibody variable domain like | 1cd0AB, 1a2yAB, 1a6uLH, 1ac6AB, 1akjDE, 1ao7DE, 1a14HL, 1d9kAB, 1fo0AB, 1tvdAB | 33 | 10
APOPTOSIS PROTEINS (Superfamily: TNF-like, Family: TNF-like) | 1jh5AB, 1iqaAB, 1d0gAB, 1cdaAB, 1c28AC, 1bziBC, 1a8mAB | 61 | 7
DNA clamp Family1: DNA polymerase processivity factor & Microbial ribonucleases | 1b77AC, 1axcAC, 1axcAE, 1b77AB, 1a2pBC | 18 | 5
VIRAL COAT & CAPSID PROTEINS | 1al223, 1aym23, 1b35BC, 1bev23, 1tme23 | 93 | 5
VIRAL COAT & CAPSID PROTEINS | 1al212, 1aym12, 1bev12, 1cov12, 1hri12 | 110 | 5
HISTONE-FOLD PROTEINS | 1aoiAB, 1aoiCD, 1b67AB, 1bh8AB, 1jfiAB, 1tafAB | 84 | 6
NAD(P)-BINDING PROTEINS: Rossmann-fold domains: Tyrosine-dependent oxidoreductases | 1cydAD, 1e3sAC, 1e92AC, 1hdcAD, 1i01AB | 111 | 5
SERPINS | 1as4AB, 1c8oAB, 1d5sAB, 1hleAB, 1jjoC, 1paiAB | 67 | 6
SM-LIKE RIBONUCLEOPROTEINS, SNRNP | 1d3bAB, 1i4k12, 1i8fAG, 1d3bAB, 1i4kZ1, 1i4k12 | 41 | 6
SH3-domain proteins | 1a0nAB, 1aboAC, 1azeAB, 1gcqAC, 1io6AB, 1jegAB | 26 | 6

First Type: Interface Clusters With Similar Interfaces; Similar Functions
Glutathione S-transferases:
• 10gsAB: Human glutathione S-transferase p1-complex with ter117
• 1b48AB: Crystal structure of mgsta4-4 in complex with GSH conjugate of 4-hydroxynonenal in one subunit and GSH in the other

Type II: Structure and Function
• A well known paradigm states that proteins with similar structures can have different
functions
• The type II interface clusters similarly illustrate that interfaces sharing the same cluster can belong to functionally different families

Extending the Structure-Function Paradigm
• The clusters extend and generalize this striking structure-function paradigm
• Not only does it apply to monomers, it also applies to protein-protein interfaces

Extending the Sequence-Structure Postulate
• For monomers it is well known that different amino acid sequences can fold into similar structures; since the sequences are different, it is not surprising that the function can also be different
• The clusters illustrate that in all such cases of similar interfaces with different functions, the structures of the monomers are also different

Examples of Cases of Similar Interfaces and Different Functions
In all such cases the monomer structures are different.

Interface Clusters With Similar Interfaces and Dissimilar Functions-1
• 1dz1AB: Chromatin structure; mouse hp1 (m31) C terminal (shadow chromo) domain
• 1f05AB: Transferase; structure of human transaldolase

Interface Clusters With Similar Interfaces and Dissimilar Functions-2
• 1axcAC: Complex (DNA-binding protein/DNA); human PCNA
• 1a2pBC: Ribonuclease; barnase wildtype structure

Interface Clusters With Similar Interfaces and Dissimilar Functions-3
• 1eboAB: Virus/viral protein; structure of the Ebola Virus Membrane-fusion
• 1ic2CD: Contractile protein; tropomyosin molecule

Similar Interfaces; Different Functions
• The combination of similar interfaces and different functions can be rationalized:
  - Just as in monomer structures, evolution has utilized "good" favorable motifs for many (different!) functions
  - Hence, of all the combinatorially possible ways for different monomer structures to associate, they still prefer to interact in similar ways to yield preferred interface architectures

How Can We Use the Dataset of Interfaces for Prediction of Binding Sites?
Hot Spots In The Interfaces
• Experimentally, a hot spot is a residue that, when mutated to alanine, gives rise to a distinct drop in the binding constant (tenfold or more)
• All data are deposited in http://www.asedb.org
DeLano, W.L., Unraveling hot spots in binding interfaces: progress and challenges. Curr. Opin. Struct. Biol. 2002

Computationally, Hot Spots Distinguish Between Binding Sites and Exposed Protein Surfaces
(B. Ma, T. Elkayam, H. Wolfson, R. Nussinov PNAS | May 13, 2003 | vol. 100 | no. 10 | 5772-5777)

Multiple Structure Alignment (MUSTA)
• Align all structures in each cluster
• Find structurally conserved residues (hot spots)
[Figure: Conserved hot spots. Panels: interface; exposed surface.]
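The experimental hot-spot criterion above can be restated in energy terms: a tenfold drop in the binding constant is a tenfold rise in Kd, and the corresponding change in binding free energy is ΔΔG = RT ln(Kd,mutant / Kd,wild-type). The sketch below applies that standard relation. The function names, the 298.15 K default and the example Kd values are assumptions for illustration and are not taken from ASEdb.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd(kd_wild_type: float, kd_mutant: float,
                temperature_k: float = 298.15) -> float:
    """Change in binding free energy (kcal/mol) caused by a mutation.

    Both Kd values must be in the same unit; only their ratio matters.
    Positive values mean the mutant binds more weakly.
    """
    if kd_wild_type <= 0 or kd_mutant <= 0:
        raise ValueError("Kd values must be positive")
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


def is_hot_spot(kd_wild_type: float, kd_mutant: float,
                fold_threshold: float = 10.0) -> bool:
    """Apply the tenfold-or-more criterion to an alanine mutant."""
    return kd_mutant / kd_wild_type >= fold_threshold


# Hypothetical alanine-scan results, Kd in nM.
for label, wt, mut in [("mutant 1", 1.0, 50.0), ("mutant 2", 1.0, 2.0)]:
    print(label, is_hot_spot(wt, mut), f"{ddg_from_kd(wt, mut):.2f} kcal/mol")
```

At room temperature the tenfold threshold corresponds to a ΔΔG of about 1.4 kcal/mol, so an energy cut-off and a fold-change cut-off are two ways of stating the same criterion.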