Introduction

Computational Molecular Biology
Actin binding surface
Profilin binding surface
Impact on drug development
Large pharma productivity from 2005–2010
Combined FDA-approved NMEs versus
R&D spending for nine large
pharmaceutical companies (AstraZeneca,
Bristol-Myers Squibb, Eli Lilly,
GlaxoSmithKline, Merck, Novartis, Pfizer,
Roche and Sanofi-Aventis). Figures
shown are in millions of US dollars.
Source: FDA CDER; Bernstein
Pharmaceutical industry has not been anywhere near good
enough at selecting biological targets that influence disease.
M.E. Bunnage, Nature Chemical Biology 7: p. 335 – 339 (2011)
It currently takes more than 10 years and requires an investment of over
$1B to bring a single innovative drug to market.
Molecular Medicine (Pauling)
•  From Wikipedia: Molecular
medicine is a broad field,
where physical, chemical,
biological and medical
techniques are used to
describe molecular
structures and mechanisms,
identify fundamental
molecular and genetic errors
of disease, and to develop
molecular interventions to
correct them…. It is a new
scientific discipline in
European universities
Summary
•  Molecular medicine requires
computational molecular biology
•  Tools use both physics and biology
concepts, which are orthogonal and complementary
•  Only integration with systems biology leads
to an understanding of drug action
The cell
•  Eukaryotic cell composition by weight: water 70%, proteins 20%, DNA/RNA 5%, lipids 3%, polysaccharides 2%
Eukaryote vs prokaryote:
Cell volume: ≈10,000 µm³ vs 1 µm³
Number of proteins: ≈4 × 10^10 vs 4 × 10^6
[proteins]: ≈0.1 pM vs ≈1 nM
Genome size: 3 × 10^9 bp and 30,000 genes vs 4.6 × 10^6 bp and 4,500 genes
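The concentration values quoted above are close to what one obtains for a single molecule in each cell volume. The following minimal sketch checks that arithmetic. The one-copy-per-cell reading is an assumption made here for illustration, not something the slide states, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_to_molar(volume_um3: float, copies: int = 1) -> float:
    """Molar concentration of `copies` molecules in a volume given in cubic micrometres."""
    litres = volume_um3 * 1e-15  # 1 µm^3 = 1e-18 m^3 = 1e-15 L
    return copies / (AVOGADRO * litres)


# Assumption: one copy of a protein per cell, using the cell volumes from the slide.
eukaryote_molar = copies_to_molar(10_000.0)
prokaryote_molar = copies_to_molar(1.0)

print(f"one copy in a eukaryotic cell:  {eukaryote_molar * 1e12:.2f} pM")
print(f"one copy in a prokaryotic cell: {prokaryote_molar * 1e9:.2f} nM")
```

Under this reading, the same copy number corresponds to a concentration four orders of magnitude lower in the larger cell, simply because of the volume ratio.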
The medium: The cytoplasm (saline solution)
Compartmentalization: The membrane (made up by lipids)
[Figure: a bacterial cell and a eukaryotic cell drawn to scale. Labels: ribosomes (factory), cytoplasm, cell membrane (wall), nucleus (information), endoplasmic reticulum (factory), chromatin, lysosome (garbage disposal), mitochondria (power generator)]
•  The cell stores its own set of instructions for carrying out its functions
(force or transport, metabolism, protein synthesis, control
functions, production of new cells)
•  Self-contained and self-maintaining. Structures potentially
assemble, perform elaborate biochemical functions, and vanish
effortlessly when their work is done
•  Communication: protein A finds protein B in a sea of other
interactors (hand-glove model)
A modern view of cell function
§  Cellular functions – the decisions to grow and divide, to die by
programmed cell death, or to stay static – ultimately lie with
macromolecules encoded by DNA.
§  Proteins and RNA directly control the cell through the reactions they
perform, the conformations they adopt, and the interactions that they
make.
§  A modern, mechanistic understanding of cells requires knowledge of:
(i)  Molecular recognition events. Macromolecules are inherently nearsighted. Stable macromolecular interfaces involve forces that
typically are only effective at short ranges that can be measured in
Ångstroms. Although one can model these changes with simplified
approaches, one has to remember that they are only the sum of
short-range interactions between atoms. Thus, integration of
macromolecules into pathways requires investigation of molecular
recognition events.
(ii)  Free energy landscape: Biological processes (e.g. signaling) involve
almost isoenergetic free energy minima triggered by small-molecule
binding, allostery, chemical modifications (such as phosphorylation,
methylation, and ubiquitination), functional modification or complex
formation to propagate conformational changes through long
distances → adaptable molecular recognition
(iii)  Dynamics!
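To make "almost isoenergetic free energy minima" concrete, the sketch below computes Boltzmann populations for a two-state system at 300 K. The free energy differences used (0, 0.5 and 1 kcal/mol) are hypothetical values chosen for illustration. The point is only that a perturbation of about 1 kcal/mol, comparable to what a binding or modification event might contribute, is enough to shift which conformation dominates.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def two_state_populations(delta_g_kcal: float, temperature_k: float = 300.0):
    """Equilibrium populations (p_A, p_B) when state B lies delta_g_kcal above state A."""
    weight_b = math.exp(-delta_g_kcal / (R_KCAL * temperature_k))
    p_a = 1.0 / (1.0 + weight_b)
    return p_a, 1.0 - p_a


# Hypothetical free energy differences in kcal/mol; negative means B is now favoured.
for delta_g in (0.0, 0.5, 1.0, -1.0):
    p_a, p_b = two_state_populations(delta_g)
    print(f"dG = {delta_g:+.1f} kcal/mol -> p_A = {p_a:.2f}, p_B = {p_b:.2f}")
```

Two minima separated by a fraction of a kcal/mol are both populated at physiological temperature, which is why small triggers can switch the system from one to the other.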
Molecular recognition in biology
§  Understanding dynamics and (often
transient) interactions between
biomolecules is one of the great
scientific challenges of our time.
§  Progress underpins our ability to cure
many diseases in a rational way.
§  We need data!
§  We need mathematical/computational
models!
Computational Molecular Biology: the
challenges…
•  It uses molecular recognition (ANY biological process,
pharmacological applications). Recognition involves releasing some or
all of the water (or other) molecules normally in contact with the
interacting surfaces. It may also involve changes in the
conformation of the molecules.
•  Energetics of biological processes are within about 1 eV. Entropy
plays a major role
•  Biomolecules do not act alone
•  Very heterogeneous systems (cytoplasm as a molecular
soup, cell membrane)
•  Role of cellular biology and regulation
…the simplifications….
•  Key role of bioinformatics: biology is the result of a historical process. An H atom
is no different from what it was 10^9 years ago, but biological systems (≈3.5 × 10^9 years old) are.
Darwinian evolution (i.e. survival of the fittest at each new generation of each
organism) modifies existing mechanisms rather than inventing new ones. By
investigating (structural, functional, cellular) patterns, we can find clues about
these very complicated systems.
•  Biological systems are similar (e.g. plant and animal proteins share striking
similarities) and they are robust with respect to small changes, provided these
are not crucial for the biological function.
•  Biology occurs at T ≈ 300 K, P ≈ 1 atm.
•  Symmetry concepts (e.g. viruses).
•  Enormous diversity of biological systems, associated with designs that tend to
be preserved and reused over and over in all possible combinations before a
new design is attempted: functions are accomplished by co-opting and
combining functions that emerged in completely different contexts.
•  New protein structures reveal motifs that already exist in the data banks and that
have been used over and over again in related and sometimes even unrelated
tasks.
…and a strikingly rich outcome!
•  Insights on mechanisms: what is the molecular basis of
e.g. enzymatic reactions or in receptor signaling?
•  To understand aberrant processes (e.g. fibrillation) and
eventually try to stop them
•  To intervene on mechanisms: How can we improve drug
affinity/selectivity? These are physically well-defined
concepts and can therefore be predicted, just as we
can predict structures with bioinformatics.
•  To deliver drugs (nano-biotechnology): e.g. delivering
RNA
•  For biotechnological applications: effects of mutations.
•  Food and agriculture industry: Many issues related to
ligand/target interactions (smell, taste)
The two faces of computational
molecular medicine
•  Physics-based approaches
•  Biology-based approaches
•  If used together, of course, they will provide much more
than the sum of the two single components!
Computational Biomedicine: Strategy
Biology-based
approaches
Physics-based
approaches
Pre-Biomolecular Simulation Era (up to the 1960s):
Beyond Dirac's Paradigm
Theory
Derive the largest number of phenomena
from the minimum number of principles:
Mechanics laws (+ QM, relativity):
Include only biologically relevant degrees
of complexity
Statistics (Boltzmann)
Enormous predictive power in principle, but the equations
and models for biological systems are complex
→ Not even qualitative behaviour: little or no
real prediction for molecular recognition-based
processes
Experiments
Investigating molecular recognition in biology
today
Theory
Experiment
More and more:
Computer
simulation
Computational Physics: Real predictions but
(i) simplifying assumptions
(ii) biological model
(iii) Thermodynamic equilibrium
IBM Roadrunner (2008, Los Alamos): 1.7 petaflops (peta = 10^15)
ENIAC (1943, UPenn):
- 5,000 additions/subtractions per second
- 27 tons
- Six people did the programming by
manipulating switches and cables
Investigating molecular recognition in biology
today
Theory
Experiment
More and more:
Computer
simulation
Computational Chemistry: Real predictions if
(i) simplifying assumptions
(ii) biological model
Carefully chosen
(i) Simplifying assumptions
•  Atoms evolve according to the laws of
Newtonian physics (molecular dynamics):
$F_i = m_i a_i \quad (i = 1, \dots, N)$
•  Basic ingredient: the energy function E
$F_i = -\partial E / \partial r_i$
Classical Molecular Dynamics
•  Produces a sequence of configurations, very
much like frames in a movie
•  Atoms move through time in a series of
discrete timesteps ($dt = 10^{-15}$ s)
•  $R(t + dt) = R(t) + v\,dt$
New position is old position + distance
traveled during the timestep
Chemical bonds as harmonic
oscillators
$F_{\mathrm{stretch}} = -K_1 r$
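The timestep update and the harmonic bond force can be combined into a very small simulation. The sketch below is illustrative only: it uses the velocity Verlet scheme, a common integrator that refines the simple "old position + distance traveled" rule, and it works in hypothetical reduced units (spring constant and mass both set to 1) rather than with real bond parameters.

```python
import math


def harmonic_force(r: float, k: float) -> float:
    """Restoring force F = -k r for a displacement r from the equilibrium length."""
    return -k * r


def total_energy(r: float, v: float, k: float, m: float) -> float:
    """Kinetic plus harmonic potential energy."""
    return 0.5 * m * v * v + 0.5 * k * r * r


def velocity_verlet(r: float, v: float, k: float, m: float, dt: float, n_steps: int):
    """Integrate Newton's equation F = m a and return the frames [(r, v), ...]."""
    frames = [(r, v)]
    f = harmonic_force(r, k)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / m   # half-step velocity update
        r = r + dt * v_half             # new position = old position + distance traveled
        f = harmonic_force(r, k)        # force at the new position
        v = v_half + 0.5 * dt * f / m   # complete the velocity update
        frames.append((r, v))
    return frames


# Hypothetical reduced units: k = m = 1, so the exact solution is r(t) = cos(t).
trajectory = velocity_verlet(r=1.0, v=0.0, k=1.0, m=1.0, dt=0.01, n_steps=1000)
e_start = total_energy(*trajectory[0], k=1.0, m=1.0)
e_end = total_energy(*trajectory[-1], k=1.0, m=1.0)
r_end = trajectory[-1][0]

print(f"energy drift over 1000 steps: {e_end - e_start:.2e}")
print(f"r(t=10) simulated {r_end:.4f}, exact {math.cos(10.0):.4f}")
```

Each stored frame is one "movie frame" of the trajectory. Checking that the total energy stays nearly constant is a standard sanity test for the chosen timestep.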
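The potential energy E
[Figure: atoms i, j, k, l illustrating the bonded terms, and effective point charges q_o, q_p illustrating the effective point charge interaction]
Standard biomolecular force field:
$$E = \sum_{b} \frac{1}{2} k_b (r_{ij} - b_0)^2 + \sum_{\theta} \frac{1}{2} k_\theta (\theta_{ijk} - \theta_0)^2 + \sum_{\phi} \sum_{n} k_n \left[ 1 + \cos(n\phi_{ijkh} - \phi_0) \right] + \sum_{lm} \frac{q_l q_m}{4\pi\varepsilon_0 r_{lm}} + \sum_{op} 4\varepsilon \left[ \left( \frac{\sigma}{r_{op}} \right)^{12} - \left( \frac{\sigma}{r_{op}} \right)^{6} \right]$$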
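Each term of the force field energy is a simple function of a few geometric variables. The sketch below writes one function per term so that the functional forms can be inspected individually. The unit system (kcal/mol, Å, elementary charges) and the approximate Coulomb prefactor are assumptions made here; the slide does not specify units, and real force fields supply their own parameter sets.

```python
import math

# Assumption: energies in kcal/mol, distances in Å, charges in units of e.
# In these units 1/(4*pi*eps0) is approximately 332.06 kcal·Å/(mol·e^2).
COULOMB_CONSTANT = 332.0637


def bond_energy(r: float, k_b: float, b0: float) -> float:
    """Harmonic bond stretching: (1/2) k_b (r - b0)^2."""
    return 0.5 * k_b * (r - b0) ** 2


def angle_energy(theta: float, k_theta: float, theta0: float) -> float:
    """Harmonic angle bending: (1/2) k_theta (theta - theta0)^2, angles in radians."""
    return 0.5 * k_theta * (theta - theta0) ** 2


def dihedral_energy(phi: float, terms) -> float:
    """Torsion: sum over (k_n, n, phi0) of k_n [1 + cos(n phi - phi0)]."""
    return sum(k_n * (1.0 + math.cos(n * phi - phi0)) for k_n, n, phi0 in terms)


def coulomb_energy(q_l: float, q_m: float, r: float) -> float:
    """Point-charge electrostatics: q_l q_m / (4 pi eps0 r)."""
    return COULOMB_CONSTANT * q_l * q_m / r


def lennard_jones_energy(r: float, epsilon: float, sigma: float) -> float:
    """Van der Waals term: 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


# Hypothetical parameters, for illustration only.
r_min = 2.0 ** (1.0 / 6.0) * 3.4  # location of the Lennard-Jones minimum for sigma = 3.4
print("LJ energy at the minimum:", lennard_jones_energy(r_min, epsilon=0.2, sigma=3.4))
print("bond energy 0.1 Å from equilibrium:", bond_energy(1.1, k_b=600.0, b0=1.0))
```

The total energy E is the sum of these contributions over all bonds, angles, dihedrals and non-bonded pairs in the system.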
From water ….
To proteins..
Found in the muscle cells of animals.
It functions as an oxygen-storage unit,
providing molecular oxygen to the
working muscles.
Oxygen accessibility surface in
myoglobin
Fluctuations are now included
Karplus et al. 1986
Limits of MD
•  Force fields do not quite have the required
accuracy (polarization effects): Bustamante’s
claims
•  Formation and breaking of chemical bonds is
not allowed
•  Atom motion is described by classical
mechanics
•  Electrons are implicitly confined to remain in
the ground state
•  Size and time scale are severely restricted.
Size matters…
Proteins in water or embedded in membrane
(1000-100000 atoms)
DNA, RNA
membranes
Cells:
Accessible Time-scales
[Figure: time axis from fs to s (ticks: fs, 100 fs-ps, ns, ms, 10 ms-s) locating electronic excitations, vibrations, rotations, conformational changes, e-transfer reactions (ps-s), enzymatic reactions (ms-s), peptide folding and protein folding]
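The time-scale restriction follows directly from the femtosecond timestep quoted earlier: the number of integration steps grows linearly with the simulated time. The short sketch below does that arithmetic for a few round durations. The durations are generic examples, not values read off the figure.

```python
TIMESTEP_S = 1e-15  # the 1 fs MD timestep quoted earlier in the slides


def steps_required(duration_s: float, dt_s: float = TIMESTEP_S) -> int:
    """Number of MD steps needed to cover `duration_s` of simulated time."""
    return round(duration_s / dt_s)


# Generic example durations, chosen for illustration.
durations = {"1 ps": 1e-12, "1 ns": 1e-9, "1 µs": 1e-6, "1 ms": 1e-3, "1 s": 1.0}
for label, seconds in durations.items():
    print(f"{label:>5}: {steps_required(seconds):.0e} steps")
```

A millisecond of simulated time already requires on the order of 10^12 force evaluations, each over every interacting pair of atoms, which is why the slower processes in the figure are hard to reach.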
Comparison with experimental
data (e.g. NMR) whenever
possible
(ii) The model
•  Can Dirac's famous statement be used in biology?
•  "The underlying physical laws necessary for the
mathematical theory of a large part of physics and the
whole of chemistry are thus completely known, and the
difficulty is only that the exact application of these laws
leads to equations much too complicated to be soluble."
•  Biologists worry about the relevance of in vitro
experiments
•  Environment: mapping proteins and nucleic acids, in vivo
and in vitro, onto a model system
•  Select the subsystem of interest (e.g. membrane and
cytoplasmic domains): size matters here too!
(ii) The model
•  The culmination of the great analytical effort at fractionating, isolating, and
purifying the various parts that make up the cell will not immediately lead
to a qualitative jump in our understanding of how a cell works.
•  The complexity of biological systems is determined not so much by the
number of parts they use to carry out their functions, as by the number of
interactions involved in the regulation of these functions.
•  Thus, although eukaryotes have generally larger genomes than
prokaryotes, genome sizes are not correlated with the complexity of the
organism.
•  Unicellular eukaryotes have genome sizes that vary 200,000-fold, and the
genome of the amoeba is about 200 times larger than that of humans.
•  Interrelationships and regulation
Feynman's point of view
•  Certainly no subject or field is making more
progress on so many fronts at the present moment,
than biology, and if we were to name the most
powerful assumption of all, which leads one on
and on in an attempt to understand life, it is that
all things are made of atoms, and that everything
that living things do can be understood in terms
of the jigglings and wigglings of atoms
•  From The Feynman Lectures on Physics, 1963
Biologist's point of view
Computational Physical tools may help, e.g.
•  To have insights on mechanisms: what is the molecular
basis of e.g. enzymatic reactions or in receptor signaling?
•  To understand aberrant processes (e.g. fibrillation) and
eventually try to stop them
•  To intervene on mechanisms: How can we improve drug
affinity/selectivity? These are physically well-defined
concepts and can therefore be predicted, just as we can
predict structures with bioinformatics.
•  To deliver drugs (nano-biotechnology): e.g. delivering RNA
•  For biotechnological applications: effects of mutations.
•  Food and agriculture industry: Many issues related to
ligand/target interactions (smell, taste)
The two faces of computational
molecular biology
•  Physics-based approaches
•  Biology-based approaches
•  If used together, of course, they will provide much more
than the sum of the two single components!
Bioinformatics
•  Proteins are the product of natural
selection superimposed on random
variation
•  Stability and reactivity are optimized for
the biological environment
•  Fundamental features of proteins are
not shared by small molecules
•  Look for biological patterns (e.g.
conserved groups in proteins may
be functional; mutations linked to genetic
diseases, etc.)
•  Proteins evolving from a common
ancestor maintain similar core 3D
structures. This allows structural models to be
built for proteins (targets) homologous to other
proteins whose 3D structure is known
(templates).
Proteins that have evolved from a common ancestor may
exhibit the same fold but different sequences, except at
functionally important residues
From DNA...
ACTTGTAAATTTAGT…
ACTTGTGATAAATTTAGT…
ACTTGTAATAAATTTAGT…
...to the protein
A C D E F G
A C N E F G
A C E F G -
From DNA
ACTTGTCAAAATAATTTAGT…
ACTTGTCAAAATAAATTTAGT…
ACTTGTAATAAATTTAGT…
...to the protein
A C - D E F G
A C - N E F G
A C - E F G
A C Q D E F G
A C Q D - - - - N L
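The aligned rows above, with "-" marking gaps, are the kind of output a sequence alignment algorithm produces. As a minimal sketch, the code below implements Needleman-Wunsch global alignment with a simple match/mismatch/gap score. The scoring values are hypothetical, and practical tools use substitution matrices and more refined gap penalties. The toy sequences are taken from the slide.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of the alignment length."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


top, bottom, total = needleman_wunsch("ACDEFG", "ACEFG")
print(top)
print(bottom)
print(f"score = {total}, identity = {percent_identity(top, bottom):.1f}%")
```

For the pair ACDEFG and ACEFG, the algorithm places the gap opposite D, which matches the alignment drawn on the slide.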
Genomics - Proteomics
Mapping Sequence to Protein Structure and Dynamics
Primary Sequence
MNGTEGPNFY VPFSNKTGVV RSPFEAPQYY LAEPWQFSML AAYMFLLIML GFPINFLTLY
VTVQHKKLRT PLNYILLNLA VADLFMVFGG FTTTLYTSLH GYFVFGPTGC NLEGFFATLG
GEIALWSLVV LAIERYVVVC KPMSNFRFGE NHAIMGVAFT WVMALACAAP PLVGWSRYIP
EGMQCSCGID YYTPHEETNN ESFVIYMFVV HFIIPLIVIF FCYGQLVFTV KEAAAQQQES
ATTQKAEKEV TRMVIIMVIA FLICWLPYAG VAFYIFTHQG SDFGPIFMTI PAFFAKTSAV YNPVIYIMMN
KQFRNCMVTT LCCGKNPLGD DEASTTVSKT ETSQVAPA
Folding
3D Structure
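One very simple example of extracting structural hints from a primary sequence is a sliding-window hydropathy profile. This is a classic heuristic brought in here for illustration, not the method the slides describe. The sketch uses the Kyte-Doolittle hydropathy scale and a 19-residue window, both assumptions, and applies them to the first 60 residues of the sequence shown above.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence: str, window: int = 19):
    """Mean hydropathy of every full window along the sequence."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# First 60 residues of the primary sequence from the slide, spaces removed.
fragment = ("MNGTEGPNFY" "VPFSNKTGVV" "RSPFEAPQYY"
            "LAEPWQFSML" "AAYMFLLIML" "GFPINFLTLY")

profile = window_hydropathy(fragment)
best_start = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic 19-residue window starts at residue {best_start + 1}: "
      f"{fragment[best_start:best_start + 19]} (mean {profile[best_start]:.2f})")
```

Stretches with a high average hydropathy are candidates for buried or membrane-spanning segments, which gives a first, rough link between the one-dimensional sequence and the folded structure.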
•  focus on one
element…
In vitro
In vivo
•  But processes in real life involve
very complicated pathways in which
biomolecules do not act alone!
Adapted from U. Rothlisberger
Connecting Computational Molecular
Biology to Systems Biology
•  View of living organisms as
molecular circuitry:
–  Molecular circuitry =
biochemical processes,
that form and recycle
molecules in a coordinated
and balanced fashion
–  intended modes of
operation = healthy state
–  aberrant modes of
operation = disease state
•  Diagnosis:
–  identify the molecular
basis of disease
•  Therapy:
–  guide biochemical
circuitry back to healthy
state
Information Sources
•  New technology generates
massive amounts of data (often
stored in publicly accessible
databases): Genomics and
Proteomics
–  Protein and DNA sequences /
Whole genome sequences
–  Protein structure data
–  Protein pathways and
networks
–  Protein interaction data
–  Expression data
Genomics - Proteomics
Mapping Sequence to Protein Structure, Dynamics and Function
From computational biophysics to systems biology via
bioinformatics
Primary Sequence
MNGTEGPNFY VPFSNKTGVV RSPFEAPQYY LAEPWQFSML AAYMFLLIML GFPINFLTLY
VTVQHKKLRT PLNYILLNLA VADLFMVFGG FTTTLYTSLH GYFVFGPTGC NLEGFFATLG
GEIALWSLVV LAIERYVVVC KPMSNFRFGE NHAIMGVAFT WVMALACAAP PLVGWSRYIP
EGMQCSCGID YYTPHEETNN ESFVIYMFVV HFIIPLIVIF FCYGQLVFTV KEAAAQQQES
ATTQKAEKEV TRMVIIMVIA FLICWLPYAG VAFYIFTHQG SDFGPIFMTI PAFFAKTSAV YNPVIYIMMN
KQFRNCMVTT LCCGKNPLGD DEASTTVSKT ETSQVAPA
Folding
3D Structure
Complex function within
network of proteins
Disease
Challenges:
Which protein is a drug target? How can we model protein/protein
interactions?
Challenges (continued):
Drug action, efficacy and side effects?
Drug Target:
Summary
Key Challenge: Mapping the relationship between
genome sequence and protein structures, dynamics and
functions in complex cellular environments.
Computational molecular medicine is a key tool along with
systems biology
Outline
Structural Biology
Structural Bioinformatics
Molecular Docking