Protein-Protein Interactions:
Stability, Function and Landscape
Structural Aspects of Protein-Protein Interactions
Agenda
• Understand the importance of studying protein-protein interactions at the structural level
• Classify the various types of interactions
• Look at one structure-based method for predicting protein-protein interactions
Protein interaction
• Definition: Specific interactions between two or more proteins.
• Examples: Enzyme-inhibitor complex; antibody-antigen complex; receptor-ligand interactions; multiprotein complexes such as ribosomes or RNA polymerases.
Homocomplexes are usually permanent and optimized (e.g., the homodimer cytochrome c′ (1)) (Fig. 1a).
Heterocomplexes can also have such properties, or they can be non-obligatory, being made and broken
according to the environment or external factors, and involve proteins that must also exist
independently [e.g., the enzyme–inhibitor complex trypsin with the inhibitor from bitter gourd (2)
(Fig. 1b) and the antibody–protein complex HYHEL-5 with lysozyme (3) (Fig. 1c)].
It is important to distinguish between the different types of complexes when analyzing the
intermolecular interfaces that occur within them.
Characteristics
Classification: Protein-protein interactions can be arbitrarily classified based on the
proteins involved (structural or functional groups) or based on their physical
properties (weak and transient, “non-obligate” vs. strong and permanent). Protein
interactions are usually mediated by defined domains, hence interactions can also be
classified based on the underlying domains.
Universality: All of molecular biology is about protein-protein interactions (Alberts et
al. 2002, Lodish et al. 2000). Protein-protein interactions affect all processes in a cell:
structural proteins need to interact in order to shape organelles and the whole cell,
molecular machines such as ribosomes or RNA polymerases are held together by
protein-protein interactions, and the same is true for multi-subunit channels or
receptors in membranes.
Specificity distinguishes such interactions from random collisions that happen by
Brownian motion in the aqueous solutions inside and outside of cells. Note that many
proteins are known to interact although it remains unclear whether certain interactions
have any physiological relevance.
Number of interactions: It is estimated that even in simple single-celled organisms such
as yeast, each of the roughly 6000 proteins takes part in at least 3 interactions,
i.e. a total of 20,000 interactions or more. By extrapolation, there may be on the order
of 100,000 interactions in the human body.
The protein-protein interaction network in yeast.
An interaction map of the yeast proteome assembled from published interactions. The
map contains 1,548 proteins (boxes) and 2,358 interactions (connecting lines).
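As a rough consistency check on the estimate of about 3 interactions per protein, the numbers quoted for the yeast map can be converted into an average number of partners per protein. The sketch below assumes the map can be treated as an undirected graph without self-interactions; the function name `average_degree` is illustrative and not taken from the original slides.

```python
def average_degree(n_proteins: int, n_interactions: int) -> float:
    """Mean number of interaction partners per protein in an undirected map.

    Assumption: every interaction links two different proteins, so each
    connecting line adds one partner to each of its two endpoints.
    """
    if n_proteins <= 0:
        raise ValueError("n_proteins must be positive")
    return 2 * n_interactions / n_proteins


# Numbers quoted for the yeast interaction map above.
yeast_avg = average_degree(1548, 2358)
print(f"{yeast_avg:.2f} interactions per protein")
```

For the published map this gives roughly three partners per protein, in line with the estimate quoted earlier.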
Homo- and hetero-oligomeric
complexes
Protein-protein interactions (PPIs) occur
between identical or non-identical
chains (i.e. homo- or hetero-oligomers). (A-B)
Oligomers of identical or homologous
protein units can be organized in an
isologous or heterologous way (Monod
et al., 1965) with structural symmetry
(Goodsell and Olson, 2000).
An isologous association involves the
same surface on both monomers (e.g.
Arc repressor and lysin; Figure 1A and
C), related by a 2-fold symmetry axis.
In contrast to an isologous association
that can only further oligomerize using
a different interface (e.g. form a dimer
of dimers with three 2-fold axes of
symmetry), heterologous assemblies
use different interfaces that, without a
closed (cyclic) symmetry, can lead to
infinite aggregation.
Non-obligate and obligate complexes
As well as composition, two different types of complexes
can be distinguished on the basis of whether a complex is
obligate or non-obligate.
In an obligate PPI, the protomers
are not found as stable structures on their own in vivo.
Such complexes are generally also functionally obligate;
for example, the Arc repressor dimer (Figure 1A) is
essential for DNA binding.
Many of the hetero-oligomeric structures in the
Protein Data Bank involve non-obligate
interactions of protomers that exist independently, such as
intracellular signalling complexes (e.g. RhoA–RhoGAP;
Figure 1D) and antibody–antigen, receptor–ligand and
enzyme–inhibitor (e.g. thrombin–rodniin; Figure 1E) complexes.
The components of such protein–protein complexes
are often initially not co-localized and thus need to
be independently stable. However, some homo-oligomers,
which by definition are co-localized, can also form non-obligate
assemblies (e.g. sperm lysin; Figure 1C).
Transient and permanent complexes
PPIs can also be distinguished based on the lifetime of the
complex. In contrast to a permanent interaction that is
usually very stable and thus only exists in its complexed
form, a transient interaction associates and dissociates
in vivo. We distinguish weak transient interactions that
feature a dynamic oligomeric equilibrium in solution,
where the interaction is broken and formed continuously
(e.g. lysin; Figure 1C), and strong transient associations
that require a molecular trigger to shift the oligomeric
equilibrium. For example, the heterotrimeric G protein
(Figure 1F) dissociates into the Gα and Gβγ subunits
upon guanosine triphosphate (GTP) binding, but forms a
stable trimer with guanosine diphosphate (GDP) bound.
Structurally or functionally obligate interactions are
usually permanent, whereas non-obligate interactions
may be transient or permanent.
Types of protein-protein interactions (PPI)
• Obligate PPI: the protomers are not found as stable structures on their own in vivo
- Obligate homodimer: P22 Arc repressor (DNA-binding)
- Obligate heterodimer: Human cathepsin D (1LYB)
• Non-obligate PPI
- Non-obligate homodimer: Sperm lysin
- Non-obligate heterodimer: RhoA and RhoGAP signaling complex
Types of protein-protein interactions (PPI)
• Obligate PPI (the protomers are not found as stable structures on their own in vivo): usually permanent
- Example: obligate heterodimer, human cathepsin D
• Non-obligate PPI: transient or permanent
- Permanent (many enzyme-inhibitor complexes); dissociation constant Kd = [A][B] / [AB], 10^-7 to 10^-13 M
- Example: non-obligate permanent heterodimer, thrombin and rodniin inhibitor
- Transient, weak (electron transport complexes): Kd mM-µM
- Example: non-obligate transient homodimer, sperm lysin (interaction is broken and formed continuously)
- Transient, intermediate (antibody-antigen, TCR-MHC-peptide, signal transduction PPI): Kd µM-nM
- Transient, strong (require a molecular trigger to shift the oligomeric equilibrium): Kd nM-fM
- Example: bovine G protein dissociates into Gα and Gβγ subunits upon GTP binding, but forms a stable trimer with GDP bound
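The dissociation constant and the Kd ranges quoted for transient interactions can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, the range boundaries are read literally from the slide (mM = 1e-3 M, µM = 1e-6 M, nM = 1e-9 M, fM = 1e-15 M), and assigning a value that sits exactly on a boundary to the tighter class is an assumption.

```python
def dissociation_constant(a_free: float, b_free: float, ab_complex: float) -> float:
    """Kd = [A][B] / [AB], all concentrations in mol/L at equilibrium."""
    if ab_complex <= 0:
        raise ValueError("complex concentration must be positive")
    return a_free * b_free / ab_complex


# (label, lower bound, upper bound) in mol/L, taken from the slide's ranges.
TRANSIENT_RANGES = [
    ("weak", 1e-6, 1e-3),          # mM-µM
    ("intermediate", 1e-9, 1e-6),  # µM-nM
    ("strong", 1e-15, 1e-9),       # nM-fM
]


def affinity_class(kd_molar: float) -> str:
    """Map a Kd onto the weak / intermediate / strong bins listed above."""
    for label, low, high in TRANSIENT_RANGES:
        if low < kd_molar <= high:
            return label
    return "outside the listed ranges"


# Hypothetical equilibrium concentrations: 1 µM of each free partner, 100 µM complex.
kd = dissociation_constant(1e-6, 1e-6, 1e-4)
print(affinity_class(kd))
```

A lower Kd means tighter binding, which is why the strong class covers the smallest values.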
Structural features of protein-interaction sites
• The contact area between two proteins is almost always bigger than 1100 Å² with each of the interacting partners contributing at least 550 Å² of complementary surface.
• On average each partner loses about 800 Å² of solvent-accessible surface upon contact, contributed by some 20 amino acid residues of each partner, i.e. the average interface residue covers some 40 Å².
• NACCESS
• The accessible surface area (ASA) of the complexes is calculated using an implementation
of the Lee and Richards (1971) algorithm developed by Hubbard (1992). With a probe
sphere of radius 1.4 angstroms, the ASA is defined as the surface mapped out by the
centre of the probe as if it were rolled around the van der Waals surface of the protein. The
program is used to calculate the ASA of each protomer in the complex and then of the
complete complex. The ASA shown in the results table is for a single subunit (chain 1, as
designated by the user on the submission form; this subunit is indicated at the top of the
table and coloured purple).
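The way the per-protomer and whole-complex ASA values combine into an interface area can be sketched as follows. This is not NACCESS itself, only the bookkeeping applied to its output; the function name and the example ASA values are hypothetical, chosen so that the result matches the averages quoted above, and splitting the buried area equally between the partners is a simplifying assumption.

```python
def buried_surface_area(asa_a: float, asa_b: float, asa_complex: float):
    """Return (total buried area, mean area lost per partner) in Å^2.

    asa_a, asa_b: ASA of each protomer computed in isolation.
    asa_complex:  ASA of the assembled complex.
    """
    total = asa_a + asa_b - asa_complex
    if total < 0:
        raise ValueError("complex ASA exceeds the sum of the protomer ASAs")
    # Assumption: both partners lose the same amount of accessible surface.
    return total, total / 2


# Hypothetical ASA values (Å^2) for two protomers and their complex.
total, per_partner = buried_surface_area(7000.0, 6500.0, 11900.0)
area_per_residue = per_partner / 20  # about 20 interface residues per partner
print(total, per_partner, area_per_residue)
```

With these illustrative inputs each partner loses 800 Å², i.e. 40 Å² per interface residue when 20 residues contribute.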
• Forces that mediate protein-protein interactions include electrostatic interactions, hydrogen bonds, the van der Waals attraction and hydrophobic effects.
• The average protein-protein interface is not less polar or more hydrophobic than the surface remaining in contact with the solvent. Water is usually excluded from the contact region.
• Non-obligate complexes tend to be more hydrophilic in comparison, as each component has to exist independently in the cell.
• It has been proposed that hydrophobic forces drive protein-protein interactions and hydrogen bonds and salt bridges confer specificity.
Shape: Independent studies showed that 83-84% of interfaces are more or less flat. With few exceptions, the interfaces
are approximately circular areas on the protein surface in both permanent and non-obligate complexes. Interfaces in
permanent associations tend to be larger, less planar, more highly segmented (in terms of sequence), and more closely
packed than interfaces in non-obligate associations.
Complementarity: can be measured in terms of “fitting surface shape”. Interfaces in homodimers, enzyme-inhibitor
complexes, and permanent heterocomplexes are the most complementary, whilst the antibody-antigen complexes
and the non-obligate heterocomplexes are the least complementary.
Secondary structure: In one study the loop interactions contributed, on average, 40% of the interface contacts. In
another study (involving 28 homodimers), 53% of the interface residues were α-helical, 22% β-sheet, and 12%
αβ, with the rest being coils.
Amino acid composition: Interfaces have been shown to be more hydrophobic than
the exterior but less hydrophobic than the interior of a protein. In one study, 47% of
interface residues were hydrophobic, 31% polar and 22% charged. Permanent
complexes have interfaces that contain hydrophobic residues, whilst the interfaces in
non-obligate complexes favour the more polar residues. Site-directed mutagenesis
showed that in many cases a large majority (i.e. > 50%) of interface residues can be
mutated to alanine with little effect on Kd: i.e. the functional epitope is a subset of the
structural epitope.
Clinical relevance and applications of protein-protein interaction analysis
Biologically active proteins such as peptide hormones or antibodies act by interacting
with other proteins such as receptors or antigens, respectively. Knowing their
interaction sites allows the modification of the activity of such proteins or changing
their specificity. In addition, small molecules may be designed that block interactions
such as the binding of virus coat proteins to their cellular receptors, thereby blocking
infection. Proteins and their interactions are therefore potential drug targets.
Sometimes, protein-protein interactions are disadvantageous, as in insulin, which
tends to form dimers and hexamers that are less active than monomers. Genetically
engineered insulin molecules retain biological activity without oligomerizing.
What Is the Preferred Way for
Proteins to Interact?
• An ultimate goal in molecular and cellular
biology is to predict the preferred mode of
protein associations
- Similar protein structures can associate in
different ways
- Different protein structures can associate
in similar ways
Binding Is Still Not
Entirely Understood!
Possible reasons:
• We usually observe one or two interaction sites; however, a large portion of the surface is probably involved in binding
• Some associations are stable; others are low affinity
• Binding reactions are often cooperative events
• Binding strength is condition-dependent
Interfaces Are Variable
• Different relative contributions of the
hydrophobic effect versus electrostatic
interactions
• Wide range of motifs, with no prevailing
architectures
A Dataset of
Protein-Protein Interfaces
• A nonredundant dataset provides diversity
• The clusters allow studies of
- interface structures vs function
- residue conservation
Definition Of Interfaces:
• An interface is the region between two polypeptide chains not
covalently linked
• Residue selection is based on how close this residue is to
residues of the second chain. If two residues (one from each
chain) are in contact, they are interacting residues
• Residues in the vicinity of interacting residues are nearby
residues. They provide the structural scaffold of the interfaces
Figure: A protein complex forming an interface (magenta: interacting residues; cyan: nearby residues)
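The two-level definition above (interacting residues, then nearby residues around them) can be sketched in Python. The slides give no numeric cutoffs, so both distance thresholds below are hypothetical placeholders, and reducing each residue to a single C-alpha coordinate is a simplification made for brevity; the function and variable names are likewise illustrative.

```python
import math

CONTACT_CUTOFF = 5.0  # Å, hypothetical threshold for an inter-chain contact
NEARBY_CUTOFF = 6.0   # Å, hypothetical threshold for "in the vicinity"


def interface_residues(chain_a, chain_b,
                       contact_cutoff=CONTACT_CUTOFF,
                       nearby_cutoff=NEARBY_CUTOFF):
    """chain_a, chain_b: dict mapping residue id -> (x, y, z) C-alpha position.

    Returns ((interacting_a, nearby_a), (interacting_b, nearby_b)) as sets.
    """
    def interacting(own, other):
        # A residue interacts if it is close to any residue of the other chain.
        return {r for r, p in own.items()
                if any(math.dist(p, q) <= contact_cutoff for q in other.values())}

    def nearby(own, contacts):
        # Nearby residues are same-chain neighbours of interacting residues.
        return {r for r, p in own.items()
                if r not in contacts
                and any(math.dist(p, own[c]) <= nearby_cutoff for c in contacts)}

    inter_a = interacting(chain_a, chain_b)
    inter_b = interacting(chain_b, chain_a)
    return (inter_a, nearby(chain_a, inter_a)), (inter_b, nearby(chain_b, inter_b))


# Toy coordinates (Å), not taken from any real structure.
chain_a = {1: (0.0, 0.0, 0.0), 2: (5.0, 0.0, 0.0), 3: (10.0, 0.0, 0.0), 4: (20.0, 0.0, 0.0)}
chain_b = {101: (0.0, 4.0, 0.0), 102: (0.0, 30.0, 0.0)}
(inter_a, near_a), (inter_b, near_b) = interface_residues(chain_a, chain_b)
print(inter_a, near_a, inter_b, near_b)
```

In the toy example only residue 1 of chain A touches chain B, and residue 2 is picked up as part of the surrounding scaffold.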
Generation of the Dataset of
Interfaces
• We started the generation of
the dataset by extracting the
interfaces between chains
from the PDB coordinates
• On July 18, 2002, there were
18,687 entries in the PDB
which included 35,112 single
chains including all individual
chains in dimers, trimers and
so on. The dataset of
interfaces contains 21,686
two-chain interfaces
An Interface Between Two Chains
Interface Composition: Example
The interface between the two chains:
In green 'nearby' residues and in blue contact residues in chain A.
In red 'nearby' residues and in magenta contact residues in chain B.
A magnification of the interface:
Balls depict C-alpha. The numbers refer
to the residue positions. Green and blue
are nearby and contact C-alpha's in
chain A. Red and magenta are nearby
and contact C-alpha's in chain B.
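Residue Order Independence
Similar arrangement in space; different sequential order
[Figure: two interfaces whose residues, labelled A to E, occupy similar positions in space but appear in a different order along the sequence]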
Representation of Proteins As Sets of
Points in the Three Dimensional Space
Each ball is
a C-alpha
Hot Spots In The Interfaces
• Experimentally, a hot spot is a residue that, when mutated to alanine, gives rise to a distinct drop in the binding constant (tenfold or more)
• All data are deposited in http://www.asedb.org
DeLano, W.L., Unraveling hot spots in binding interfaces: progress and challenges. Curr. Opin. Struct. Biol. 2002
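The tenfold criterion can be expressed as a change in binding free energy using the standard relation ΔΔG = RT ln(Kd,mutant / Kd,wild type). The sketch below is illustrative only: the function names are hypothetical, the temperature of 298.15 K is an assumption, and a tenfold drop in the binding constant is read as a tenfold increase in Kd.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_binding(kd_wt: float, kd_mut: float, temp_k: float = 298.15) -> float:
    """Change in binding free energy (kcal/mol) caused by the mutation."""
    return R_KCAL * temp_k * math.log(kd_mut / kd_wt)


def is_hot_spot(kd_wt: float, kd_mut: float, fold: float = 10.0) -> bool:
    """True if the alanine mutant binds at least `fold` times more weakly."""
    return kd_mut / kd_wt >= fold


# Hypothetical alanine-scanning result: Kd rises from 1 nM to 50 nM.
print(is_hot_spot(1e-9, 5e-8), round(ddg_binding(1e-9, 5e-8), 2))
```

Under these assumptions a tenfold loss of affinity corresponds to roughly 1.4 kcal/mol at room temperature.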
Computationally, Hot Spots Distinguish
Between Binding Sites and Exposed
Protein Surfaces
(B. Ma, T. Elkayam, H. Wolfson, R. Nussinov
PNAS | May 13, 2003 | vol. 100 | no. 10 | 5772-5777)
Multiple Structure Alignment (MUSTA)
• Align all structures in each cluster
• Find structurally conserved residues (hot spot)
Conserved
hot spots
Interface
Exposed surface
Clustering
• Clustering is iterative
• At each iteration strict criteria are used
• At the end of the first cycle, the number of interface clusters
decreased from 21,686 to 16,446
• Members in each cluster share a connectivity score of at least 0.9 and at least 90% sequence identity
• All have exactly the same number of interface residues
The parameters used during the clustering of the interfaces are:
Cycle | Number of interfaces | Minimal connectivity score | Minimal % amino acid identity | Maximal amino acid size difference between interfaces
A | 21686 → 16446 | 0.9 | 90 | 0
B | 16446 → 9637 | 0.9 | 80 | 3
C | 9637 → 6647 | 0.8 | 50 | 10
D | 6647 → 5332 | 0.7 | 25 | 20
E | 5332 → 4429 | 0.6 | 10 | 40
F | 4429 → 3799 | 0.5 | 0 | 50
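The progressively relaxed thresholds in the table can be illustrated with a small Python sketch. It is not the authors' algorithm: a greedy scheme that merges each interface into the first matching representative is assumed here for illustration, the connectivity score is not defined in the slides, and the `compare` and `size` callables are hypothetical stand-ins for the structural comparison.

```python
# (minimal connectivity score, minimal % identity, maximal size difference)
# for cycles A to F, copied from the table above.
SCHEDULE = [(0.9, 90, 0), (0.9, 80, 3), (0.8, 50, 10),
            (0.7, 25, 20), (0.6, 10, 40), (0.5, 0, 50)]


def one_cycle(representatives, compare, size, thresholds):
    """Greedy pass: keep an interface only if it matches no kept representative."""
    min_score, min_identity, max_size_diff = thresholds
    kept = []
    for item in representatives:
        for rep in kept:
            score, identity = compare(rep, item)
            if (score >= min_score and identity >= min_identity
                    and abs(size(rep) - size(item)) <= max_size_diff):
                break  # item joins the cluster represented by rep
        else:
            kept.append(item)
    return kept


def iterative_clustering(interfaces, compare, size, schedule=SCHEDULE):
    """Run all cycles; return final representatives and the count after each cycle."""
    reps = list(interfaces)
    counts = [len(reps)]
    for thresholds in schedule:
        reps = one_cycle(reps, compare, size, thresholds)
        counts.append(len(reps))
    return reps, counts


# Toy data: (label, number of interface residues). The toy comparison treats
# interfaces whose labels start with the same letter as identical.
toy = [("a1", 20), ("a2", 20), ("a3", 22), ("b1", 40)]
toy_compare = lambda x, y: (1.0, 100.0) if x[0][0] == y[0][0] else (0.0, 0.0)
reps, counts = iterative_clustering(toy, toy_compare, size=lambda x: x[1])
print([r[0] for r in reps], counts)
```

In the toy run the strict first cycle merges only interfaces of identical size, and the second cycle, which tolerates a size difference of 3, merges the remaining related interface.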
Further Filtering
• Each cluster should at least have 5 members
• None of the members should share a
sequence similarity score of 50% or higher
(Sequence alignments are done with CLUSTALW)
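The two filtering rules can be sketched as follows. The order of operations is an assumption: redundant members are pruned first and the five-member minimum is applied afterwards. The `identity` callable is a hypothetical stand-in for a CLUSTALW-derived similarity score, and the toy identity function below is for illustration only.

```python
def filter_clusters(clusters, identity, min_members=5, max_identity=50.0):
    """Keep clusters that retain at least `min_members` non-redundant members.

    clusters: list of lists of members.
    identity: callable returning a % similarity score for two members.
    """
    kept = []
    for members in clusters:
        nonredundant = []
        for m in members:
            # Drop a member that is 50% or more similar to one already kept.
            if all(identity(m, other) < max_identity for other in nonredundant):
                nonredundant.append(m)
        if len(nonredundant) >= min_members:
            kept.append(nonredundant)
    return kept


def toy_identity(s1: str, s2: str) -> float:
    """Percent of matching positions between two equal-length toy sequences."""
    return 100.0 * sum(a == b for a, b in zip(s1, s2)) / len(s1)


cluster_1 = ["AAAA", "CCCC", "GGGG", "TTTT", "ACGT"]  # all pairs at 25% or less
cluster_2 = ["AAAA", "AAAC", "CCCC", "GGGG", "TTTT"]  # AAAC is 75% similar to AAAA
filtered = filter_clusters([cluster_1, cluster_2], toy_identity)
print(filtered)
```

Only the first toy cluster survives, because the second falls to four members once its redundant sequence is removed.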
Cluster Categories
• This filtering reduced the number of clusters from 3799 to 103
- Library construction carried out through pair-wise comparison
• Based on multiple structure alignment, the clusters are divided into two categories
1. Category I: interface clusters which share only ONE similar side. These clusters allow us to address the problem of how a given binding site can bind somewhat different protein surfaces
2. Category II: interface clusters which share TWO similar sides
- Type I: Clusters with similar interfaces and similar functions
- Type II: Clusters with similar interfaces but dissimilar functions
Sample List of Some of the Two-chain Interface Clusters
The dataset contains functional dimers and other complexes such as receptor/ligand,
antibody/antigen, enzyme/inhibitor and coat/capsid protein complexes
Family name (from SCOP database) | Members of the cluster (protein complexes in the cluster) | Aligned residues | # of members
TRANSFERASES: Glutathione S-transferases, C-terminal domain | 10gsAB, 1axdAB, 1b48AB, 1c72AB, 1f2eAB, 1gnwAB, 1gwcBC, 1jlvAB, 1ljrAB, 1pd212 | 67 | 10
ANTIBODIES: Immunoglobulin antibody variable domain like | 1cd0AB, 1a2yAB, 1a6uLH, 1ac6AB, 1akjDE, 1ao7DE, 1a14HL, 1d9kAB, 1fo0AB, 1tvdAB | 33 | 10
APOPTOSIS PROTEINS (Superfamily: TNF-like, Family: TNF-like) | 1jh5AB, 1iqaAB, 1d0gAB, 1cdaAB, 1c28AC, 1bziBC, 1a8mAB | 61 | 7
DNA clamp: Family1: DNA polymerase processivity factor & Microbial ribonucleases | 1b77AC, 1axcAC, 1axcAE, 1b77AB, 1a2pBC | 18 | 5
VIRAL COAT & CAPSID PROTEINS | 1al223, 1aym23, 1b35BC, 1bev23, 1tme23 | 93 | 5
VIRAL COAT & CAPSID PROTEINS | 1al212, 1aym12, 1bev12, 1cov12, 1hri12 | 110 | 5
HISTONE-FOLD PROTEINS | 1aoiAB, 1aoiCD, 1b67AB, 1bh8AB, 1jfiAB, 1tafAB | 84 | 6
NAD(P)-BINDING PROTEINS: Rossmann-fold domains: Tyrosine-dependent oxidoreductases | 1cydAD, 1e3sAC, 1e92AC, 1hdcAD, 1i01AB | 111 | 5
SERPINS | 1as4AB, 1c8oAB, 1d5sAB, 1hleAB, 1jjoC, 1paiAB | 67 | 6
SM-LIKE RIBONUCLEOPROTEINS, SNRNP | 1d3bAB, 1i4k12, 1i8fAG, 1d3bAB, 1i4kZ1, 1i4k12 | 41 | 6
SH3-domain proteins | 1a0nAB, 1aboAC, 1azeAB, 1gcqAC, 1io6AB, 1jegAB | 26 | 6
First Type: Interface Clusters With Similar Interfaces; Similar Functions
Glutathione S-transferases (chains A and B shown for each structure)
- 10gsAB: Human glutathione S-transferase p1-complex with ter117
- 1b48AB: Crystal structure of mgsta4-4 in complex with GSH conjugate of 4-hydroxynonenal in one subunit and GSH in the other
Type II: Structure and Function
• A well known paradigm states that proteins with
similar structures can have different functions
• The type II interface clusters similarly illustrate that
interfaces sharing the same cluster can belong to
functionally different families
Extending the
Structure-Function Paradigm
• The clusters extend and generalize this
striking structure-function paradigm
• Not only does it apply to monomers, it
further applies to protein-protein
interfaces
Extending the Sequence-Structure Postulate
• For monomers it has been well known that
different amino acid sequences can fold into
similar structures; Since the sequences are
different, it is not surprising that the function
can also be different
• The clusters illustrate that in all such cases of similar
interfaces with different functions, the
structures of the monomers are also different
Examples of Cases of Similar
Interfaces and Different Functions
In all such cases
the monomer structures are different
Interface Clusters With Similar Interfaces and Dissimilar Functions-1
- 1dz1AB (chains A and B): Chromatin structure; Mouse hp1 (m31) C terminal (shadow chromo) domain
- 1f05AB (chains A and B): Transferase; Structure of human transaldolase
Interface Clusters With Similar Interfaces and Dissimilar Functions-2
- 1axcAC (chains A and C): Complex (DNA-binding protein/DNA); Human PCNA
- 1a2pBC (chains B and C): Ribonuclease; Barnase wildtype structure
Interface Clusters With Similar Interfaces and Dissimilar Functions-3
- 1eboAB (chains A and B): Virus/viral protein; Structure of the Ebola Virus Membrane-fusion
- 1ic2CD (chains C and D): Contractile protein; Tropomyosin molecule
Similar Interfaces; Different Functions
• The similar interfaces - different function can be
rationalized:
- Just as in monomer structures, evolution has
utilized "good" favorable motifs for many
(different!) functions
- Hence, of all the combinatorially possible ways
for different monomer structures to associate,
they still prefer to interact in similar ways to yield
preferred interface architectures
How Can We Use the
Dataset of Interfaces for
Prediction of Binding Sites?