Computational Molecular Biology
(Figure labels: actin binding surface, profilin binding surface)

Impact on drug development
Large pharma productivity from 2005–2010
[Figure: combined FDA-approved NMEs versus R&D spending for nine large pharmaceutical companies (AstraZeneca, Bristol-Myers Squibb, Eli Lilly, GlaxoSmithKline, Merck, Novartis, Pfizer, Roche and Sanofi-Aventis). Figures shown are in millions of US dollars. Source: FDA CDER; Bernstein]
• Pharmaceutical industry has not been anywhere near good enough at selecting biological targets that influence disease. M.E. Bunnage, Nature Chemical Biology 7: 335–339 (2011)
• It currently takes more than 10 years and an investment of over $1B to bring a single innovative drug to market

Molecular Medicine (Pauling)
• From Wikipedia: Molecular medicine is a broad field, where physical, chemical, biological and medical techniques are used to describe molecular structures and mechanisms, identify fundamental molecular and genetic errors of disease, and to develop molecular interventions to correct them…. It is a new scientific discipline in European universities

Summary
• Molecular medicine requires computational molecular biology
• Tools use both physics and biology concepts.
Orthogonal and complementary
• Only integration with systems biology leads to understanding of drug action

The cell
• Eukaryotic cell: water 70% of weight, proteins 20%, DNA/RNA 5%, lipids 3%, polysaccharides 2%
• Cell volume (eukaryote vs prokaryote): ≈ 10,000 µm³ vs 1 µm³
• Number of proteins: ≈ 4 × 10¹⁰ vs 4 × 10⁶
• [proteins]: ≈ 0.1 pM vs ≈ 1 nM
• Size of genome: 3 × 10⁹ bp and 30,000 genes vs 4.6 × 10⁶ bp and 4,500 genes
• The medium: the cytoplasm (saline solution)
• Compartmentalization: the membrane (made up of lipids)
[Figure: bacterial cell and eukaryotic cell to scale. Labels: ribosomes (factory), cytoplasm, cell membrane (wall), nucleus (information), endoplasmic reticulum (factory), chromatin, lysosome (garbage disposal), mitochondria (power generator)]
• Cell stores its own set of instructions for carrying out its functions (force or transport, metabolism, protein synthesis, control functions, production of new cells)
• Self-contained and self-maintaining. Structures potentially assemble, perform elaborate biochemical functions, vanish effortlessly when their work is done
• Communication: protein A finds protein B in a sea of other interactors (hand-glove model)

A modern view of cell function
• Cellular functions – the decisions to grow and divide, to die by programmed cell death, or to stay static – ultimately lie with macromolecules encoded by DNA.
• Proteins and RNA directly control the cell through the reactions they perform, the conformations they adopt, and the interactions that they make.
• A modern, mechanistic understanding of cells requires knowledge of:
(i) Molecular recognition events. Macromolecules are inherently nearsighted.
Stable macromolecular interfaces involve forces that typically are effective only over short ranges, measured in Ångstroms. Although one can model these changes with simplified approaches, one has to remember that they are only the sum of short-range interactions between atoms. Thus, integration of macromolecules into pathways requires investigation of molecular recognition events.
(ii) Free energy landscape: Biological processes (e.g. signaling) involve almost isoenergetic free energy minima triggered by small-molecule binding, allostery, chemical modifications (such as phosphorylation, methylation, and ubiquitination), functional modification or complex formation to propagate conformational changes through long distances → adaptable molecular recognition
(iii) Dynamics!

Molecular recognition in biology
• Understanding dynamics and (often transient) interactions between biomolecules is one of the great scientific challenges of our time.
• Progress underpins our ability to cure many diseases in a rational way.
• We need data!
• We need mathematical/computational models!

Computational Molecular Biology: the challenges…
• It uses molecular recognition (ANY biological process, pharmacological applications). It involves releasing some or all of the water (or other) molecules normally in contact with the interacting surfaces. It may also involve changes in the conformation of the molecules.
• Energetics of biological processes within about 1 eV. Entropy plays a major role
• Biomolecules do not act alone
• Very heterogeneous systems (cytoplasm as molecular soup, cell membrane)
• Role of cellular biology and regulation

…the simplifications….
• Key role of bioinformatics: biology is the result of a historical process: an H atom was no different 10⁹ years ago, but biological systems (≈ 3.5 × 10⁹ years old) were.
• Darwinian evolution (i.e. survival of the fittest at each new generation of each organism) modifies existing mechanisms rather than inventing new ones: by investigating (structural, functional, cellular) patterns, we can have clues about these very complicated systems.
• Biological systems are similar (e.g.
plants and animal proteins share striking similarities) and they are robust with respect to small changes, if the latter are not crucial for the biological function
• Biology occurs at T ≈ 300 K, P ≈ 1 atm
• Symmetry concepts (e.g. viruses)
• Enormous diversity of biological systems associated with designs that tend to be preserved and utilized over and over in all possible combinations, before a new design is attempted: functions are accomplished by co-opting and by combining functions that emerged in completely different contexts. New protein structures reveal motifs that already exist in the data banks and that have been used over and over again in related and sometimes even unrelated tasks.

…and a strikingly rich outcome!
• Insights on mechanisms: what is the molecular basis of e.g. enzymatic reactions or of receptor signaling?
• To understand aberrant processes (e.g. fibrillation) and eventually try to stop them
• To intervene on mechanisms: How can we improve drug affinity/selectivity? These are physically well-defined concepts and can therefore be predicted, as we can predict structures with bioinformatics.
• To deliver drugs (nano-biotechnology): e.g. delivering RNA
• For biotechnological applications: effects of mutations.
• Food and agriculture industry: many issues related to ligand/target interactions (smell, taste)

The two faces of computational molecular medicine
• Physics-based approaches
• Biology-based approaches
• If used together, of course, they will provide much more than the sum of the two single components!
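As a worked number for the energy scale quoted on the challenges slide (processes within about 1 eV, at T ≈ 300 K), the sketch below compares the thermal energy k_B T with 1 eV and evaluates a Boltzmann population ratio for a few energy gaps. The function names and the example gaps are illustrative assumptions, not values from the slides.

```python
import math

# Boltzmann constant in eV/K (CODATA value).
K_B_EV_PER_K = 8.617333262e-5


def thermal_energy_ev(temperature_k: float) -> float:
    """Thermal energy k_B * T in eV."""
    return K_B_EV_PER_K * temperature_k


def boltzmann_ratio(delta_e_ev: float, temperature_k: float = 300.0) -> float:
    """Relative population of a state lying delta_e_ev above another one."""
    return math.exp(-delta_e_ev / thermal_energy_ev(temperature_k))


if __name__ == "__main__":
    kt = thermal_energy_ev(300.0)
    print(f"k_B T at 300 K = {kt:.4f} eV, so 1 eV is about {1.0 / kt:.0f} k_B T")
    # Hypothetical energy gaps, chosen only to span the 0-1 eV window.
    for gap_ev in (0.01, 0.1, 1.0):
        print(f"gap {gap_ev:5.2f} eV -> population ratio {boltzmann_ratio(gap_ev):.3e}")
```

Gaps of a few k_B T are crossed routinely by thermal fluctuations, which is one way to read the remark that entropy plays a major role.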
Computational Biomedicine: Strategy
• Biology-based approaches
• Physics-based approaches

Pre-Biomolecular Simulation Era (up to the 60's): Beyond Dirac's Paradigm
• Theory: derive the largest number of phenomena from the minimum number of principles: laws of mechanics (+QM, relativity) and statistics (Boltzmann). Include only biologically relevant degrees of complexity
• Enormous virtual prediction, but complex equations and models for biological systems → not even qualitative behaviour: little or no real prediction for molecular recognition-based processes
• Experiments

Investigating molecular recognition in biology today
• Theory
• Experiment
• More and more: computer simulation
• Computational physics: real predictions but (i) simplifying assumptions (ii) biological model (iii) thermodynamic equilibrium
• IBM Roadrunner, 2008, Los Alamos: 1.7 peta (10¹⁵) flops
• ENIAC, 1943, UPenn: 5,000 additions/subtractions per second; 27 tons; six people did the programming by manipulating switches and cables

Investigating molecular recognition in biology today
• Theory
• Experiment
• More and more: computer simulation
• Computational chemistry: real predictions if (i) simplifying assumptions and (ii) biological model are carefully chosen

(i) Simplifying assumptions
• Atoms evolve according to the laws of Newtonian physics (molecular dynamics): $\mathbf{F}_i = m_i \mathbf{a}_i$ ($i = 1, \dots, N$)
• Basic ingredient: the energy function E: $\mathbf{F}_i = -\partial E / \partial \mathbf{r}_i$

Classical Molecular Dynamics
• Produces a sequence of configurations, very much like frames in a movie
• Atoms move through time in a series of discrete timesteps ($dt = 10^{-15}$ s)
• $R(t + dt) = R(t) + v\,dt$: new position is old position + distance traveled during the timestep

Chemical bonds as harmonic oscillators
$F_{\mathrm{STRETCH}} = -K_1 r$

The potential energy E (standard biomolecular force field, with effective point charge interactions)
$$
E = \sum_{b} \frac{1}{2} k_b \left(r_{ij} - b_0\right)^2
  + \sum_{\theta} \frac{1}{2} k_\theta \left(\theta_{ijk} - \theta_0\right)^2
  + \sum_{\phi} \sum_{n} k_n \left[1 + \cos\left(n\phi_{ijkh} - \phi_0\right)\right]
  + \sum_{lm} \frac{q_l q_m}{4\pi\varepsilon_0 r_{lm}}
  + \sum_{op} 4\varepsilon \left[\left(\frac{\sigma}{r_{op}}\right)^{12} - \left(\frac{\sigma}{r_{op}}\right)^{6}\right]
$$
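The ingredients listed above (Newton's equations, forces from the energy function, discrete timesteps of 1 fs, bonds as harmonic oscillators) can be put together in a minimal one-dimensional sketch. It keeps only the bond-stretch term of the force field, and it uses the velocity Verlet scheme rather than the simple position update written on the slide, because velocity Verlet is a common choice in MD codes. The force constant, equilibrium length and reduced mass are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# 1 kcal/mol expressed in amu * Angstrom^2 / fs^2 (the internal energy unit here).
KCAL_PER_MOL = 4.184e-4


@dataclass
class HarmonicBond:
    k: float   # force constant in kcal/mol/Angstrom^2 (illustrative value)
    b0: float  # equilibrium bond length in Angstrom (illustrative value)

    def energy(self, r: float) -> float:
        """Bond-stretch energy E = 1/2 k (r - b0)^2 in kcal/mol."""
        return 0.5 * self.k * (r - self.b0) ** 2

    def force(self, r: float) -> float:
        """F = -dE/dr in kcal/mol/Angstrom."""
        return -self.k * (r - self.b0)


def velocity_verlet(bond, r, v, mass, dt, n_steps):
    """Integrate one bond coordinate; returns a list of (r, v, total energy).

    r in Angstrom, v in Angstrom/fs, mass in amu, dt in fs,
    total energy in kcal/mol.
    """
    trajectory = []
    a = bond.force(r) * KCAL_PER_MOL / mass  # acceleration in Angstrom/fs^2
    for _ in range(n_steps):
        r = r + v * dt + 0.5 * a * dt * dt           # position update
        a_new = bond.force(r) * KCAL_PER_MOL / mass  # force at the new position
        v = v + 0.5 * (a + a_new) * dt               # velocity update
        a = a_new
        kinetic = 0.5 * mass * v * v / KCAL_PER_MOL
        trajectory.append((r, v, kinetic + bond.energy(r)))
    return trajectory


if __name__ == "__main__":
    # Hypothetical bond: k = 600 kcal/mol/A^2, b0 = 1.53 A, reduced mass 6 amu.
    bond = HarmonicBond(k=600.0, b0=1.53)
    frames = velocity_verlet(bond, r=1.63, v=0.0, mass=6.0, dt=1.0, n_steps=100)
    for step, (r, v, e_total) in enumerate(frames[:5], start=1):
        print(f"step {step}: r = {r:.4f} A, E_total = {e_total:.4f} kcal/mol")
```

Each returned tuple is one "frame of the movie". With these assumed parameters the bond oscillates about b0 and the total energy stays close to its initial value, which is the practical reason for choosing a timestep well below the vibrational period.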
From water …. to proteins..
• Myoglobin: found in the muscle cells of animals. It functions as an oxygen-storage unit, providing molecular oxygen to the working muscles.
• Oxygen accessibility surface in myoglobin
• Fluctuations are now included (Karplus et al. 1986)

Limits of MD
• Force fields do not quite have the required accuracy (polarization effects): Bustamante's claims
• Formation and breaking of chemical bonds is not allowed
• Atom motion is described by classical mechanics
• Electrons are implicitly confined to remain in the ground state
• Size and time scale are severely restricted.

Size matters…
• Proteins in water or embedded in membrane (1,000–100,000 atoms)
• DNA, RNA, membranes
• Cells

Accessible time-scales
[Figure: accessible time-scales, from fs to s, for electronic excitations, vibrations, rotations, conformational changes, e-transfer reactions, enzymatic reactions, peptide folding and protein folding]
• Comparison with experimental data (e.g. NMR) whenever possible

Investigating molecular recognition in biology today
• Theory
• Experiment
• More and more: computer simulation
• Computational chemistry: real predictions if (i) simplifying assumptions and (ii) biological model are carefully chosen

(ii) The model
• Can Dirac's famous statement be used in biology?
• "The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble"
• Biologists worry about the relevance of in vitro experiments
• Environment: mapping proteins, nucleic acids in vivo and in vitro onto a model system
• Select subsystem of interest (e.g. membrane and cytoplasmic domains): size matters also here!

(ii) The model
• The culmination of the great analytical effort at fractionating, isolating, and purifying the various parts that make up the cell will not immediately lead to a qualitative jump in our understanding of how a cell works.
• The complexity of biological systems is determined not so much by the number of parts they use to carry out their functions, as by the number of interactions involved in the regulation of these functions.
• Thus, although eukaryotes have generally larger genomes than prokaryotes, genome sizes are not correlated with the complexity of the organism.
• Unicellular eukaryotes have genome sizes that vary 200,000-fold, and the genome of the amoeba is about 200 times greater than that of humans.
• Interrelationships and regulation

Feynman's point of view
• "Certainly no subject or field is making more progress on so many fronts at the present moment, than biology, and if we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms, and that everything that living things do can be understood in terms of the jigglings and wigglings of atoms"
• From Feynman's Lectures on Physics, 1963

Biologist's point of view
Computational physical tools may help, e.g.
• To have insights on mechanisms: what is the molecular basis of e.g. enzymatic reactions or of receptor signaling?
• To understand aberrant processes (e.g. fibrillation) and eventually try to stop them
• To intervene on mechanisms: How can we improve drug affinity/selectivity? These are physically well-defined concepts and can therefore be predicted, as we can predict structures with bioinformatics.
• To deliver drugs (nano-biotechnology): e.g. delivering RNA
• For biotechnological applications: effects of mutations.
• Food and agriculture industry: many issues related to ligand/target interactions (smell, taste)

The two faces of computational molecular biology
• Physics-based approaches
• Biology-based approaches
• If used together, of course, they will provide much more than the sum of the two single components!
Bioinformatics
• Proteins are the product of natural selection superimposed on random variation
• Stability and reactivity optimized for the biological environment
• Fundamental features of proteins are not shared by small molecules
• Look for biological patterns (e.g. conserved groups in proteins may be functional, mutations for genetic diseases, etc.)
• Proteins evolving from a common ancestor maintained similar core 3D structures. Structural models of proteins (targets) can be built when they are homologous to other proteins whose 3D structure is known (templates). Proteins that have evolved from a common ancestor may exhibit the same fold and different sequences, except for functionally important residues

From DNA...
ACTTGTAAATTTAGT…
ACTTGTGATAAATTTAGT…
ACTTGTAATAAATTTAGT…
...to the protein
A C D E F G
A C N E F G
A C E F G -

From DNA...
ACTTGTCAAAATAATTTAGT…
ACTTGTCAAAATAAATTTAGT…
ACTTGTAATAAATTTAGT…
...to the protein
A C - D E F G
A C - N E F G
A C - E F G
A C Q D E F G
A C Q D - - - - N L

Genomics - Proteomics
Mapping Sequence to Protein Structure and Dynamics
Primary Sequence
MNGTEGPNFY VPFSNKTGVV RSPFEAPQYY LAEPWQFSML AAYMFLLIML
GFPINFLTLY VTVQHKKLRT PLNYILLNLA VADLFMVFGG FTTTLYTSLH
GYFVFGPTGC NLEGFFATLG GEIALWSLVV LAIERYVVVC KPMSNFRFGE
NHAIMGVAFT WVMALACAAP PLVGWSRYIP EGMQCSCGID YYTPHEETNN
ESFVIYMFVV HFIIPLIVIF FCYGQLVFTV KEAAAQQQES ATTQKAEKEV
TRMVIIMVIA FLICWLPYAG VAFYIFTHQG SDFGPIFMTI PAFFAKTSAV
YNPVIYIMMN KQFRNCMVTT LCCGKNPLGD DEASTTVSKT ETSQVAPA
Folding
3D Structure
• Focus on one element…
(Figure labels: in vitro, in vivo)
• But processes in real life involve very complicated pathways in which biomolecules do not act alone!
Adapted from U.
Rothlisberger

Connecting Computational Molecular Biology to Systems Biology
• View of living organisms as molecular circuitry:
  – Molecular circuitry = biochemical processes that form and recycle molecules in a coordinated and balanced fashion
  – Intended modes of operation = healthy state
  – Aberrant modes of operation = disease state
• Diagnosis:
  – Identify the molecular basis of disease
• Therapy:
  – Guide biochemical circuitry back to healthy state

Information Sources
• New technology generates massive amounts of data (often stored in publicly accessible databases): genomics and proteomics
  – Protein and DNA sequences / whole genome sequences
  – Protein structure data
  – Protein pathways and networks
  – Protein interaction data
  – Expression data

Genomics - Proteomics
Mapping Sequence to Protein Structure, Dynamics and Function
Primary Sequence (same sequence as on the earlier "Mapping Sequence to Protein Structure and Dynamics" slide)
Folding
3D Structure

From computational biophysics to systems biology via bioinformatics
Primary Sequence (same sequence as on the earlier "Mapping Sequence to Protein Structure and Dynamics" slide)
Folding
3D Structure
Complex function within network of proteins
Disease

Challenges:
• Which protein is a drug target?
• How can we model protein/protein interactions?

Challenges (continued):
• Drug action, efficacy and side effects?
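The gapped protein rows on the "From DNA... to the protein" slides are the kind of output a sequence alignment produces. As a minimal sketch, the code below implements a basic Needleman-Wunsch global alignment with a flat scoring scheme. The scoring values and the function name `global_align` are assumptions made for illustration; the slides do not say which alignment method or scoring was used, and real tools rely on substitution matrices and affine gap penalties.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment with a flat, illustrative scoring scheme.

    Returns (aligned_a, aligned_b, score), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Dynamic-programming table: score[i][j] is the best score for a[:i] vs b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Traceback from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    # Toy peptides taken from the slide's schematic rows.
    for other in ("ACNEFG", "ACEFG"):
        top, bottom, s = global_align("ACDEFG", other)
        print(top, bottom, s)
```

With these default scores, aligning ACDEFG against ACEFG places the gap opposite D and returns AC-EFG for the shorter sequence, so the conserved residues line up column by column.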
Drug Target: Summary
• Key challenge: mapping the relationship between genome sequence and protein structures, dynamics and functions in complex cellular environments.
• Computational molecular medicine is a key tool along with systems biology

Outline
• Structural Biology
• Structural Bioinformatics
• Molecular Docking