Essential Bioinformatics and Biocomputing (LSM2104: Section I)
Biological Databases and Bioinformatics Software

Prof. Chen Yu Zong
Tel: 6874-6877
Email: [email protected]
http://xin.cz3.nus.edu.sg
Room 07-24, level 7, SOC1, NUS
January 2003

Lecture 4: Bioinformatics software

Outline:
– Examples of bioinformatics software usage
  • Questions and strategies for dealing with biomedical problems
  • SARS as an example
    – SARS genome
    – How to figure out the function of SARS genes?
    – Use of info of SARS genes in drug design
– Bioinformatics software and research

Bioinformatics software

Examples of bioinformatics software usage:
After the discovery of a new gene or protein related to a disease, these questions are usually asked:
• What is its function and structure?
  – Is it similar in sequence to a known gene or protein? (sequence similarity search)
  – Does it contain a sequence pattern similar to that of a group of known genes or proteins? (motif identification)
  – Is this protein “similar” to proteins with known 3D structure? If so, derive a 3D model (protein homology modeling)
• Has any research been done on the disease or similar genes/proteins? (literature search, data mining)
• How to design a new drug to control it?
  – If the 3D structure is known or can be derived, structure-based drug design can be conducted (molecular modeling, docking)
  – If the 3D structure is not known, are there any molecules known to interact with this protein (literature search, data mining)?
    If yes, predict the likely drug candidates based on statistical analysis of these molecules (QSAR, machine learning)

SARS as an example
A novel coronavirus identified as the cause of severe acute respiratory syndrome (SARS)

SARS as an example
How the SARS coronavirus enters a cell and reproduces

SARS as an example
Another illustration of SARS coronavirus organization

SARS Coronavirus Genome

How Is the SARS Coronavirus Identified?
Is the SARS pathogen related to an existing virus or bacterium?
• Key SARS proteins must be similar to the proteins of that virus or bacterium, so that they have similar organization, metabolic machinery, etc.
• Protein similarity = Sequence similarity (based on the principles of evolution and the sequence-structure-function relationship)
• What has been found:
  – Key SARS proteins are similar to those of coronavirus
  – Some SARS proteins are not similar to any existing proteins
• Conclusion: the SARS pathogen is a novel coronavirus

Sequence Comparison as a Mathematical Problem
Example:
Sequence a: ATTCTTGC
Sequence b: ATCCTATTCTAGC

Best Alignment (one gap marked):
ATTCTTGC
ATCCTATTCTAGC

Bad Alignment (two gaps marked):
AT TCTT GC
ATCCTATTCTAGC

Many alignments can be constructed => which is the best?

How to rate an alignment?
• Match: +8 (w(x, y) = 8, if x = y)
• Mismatch: -5 (w(x, y) = -5, if x ≠ y)
• Each gap symbol: -3 (w(-, x) = w(x, -) = -3)

 C   -   -   -   T   T   A   A   C   T
 C   G   G   A   T   C   A   -   -   T
+8  -3  -3  -3  +8  -5  +8  -3  -3  +8  = +12 (alignment score)
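The scoring rule on this slide (match +8, mismatch -5, each gap symbol -3) can be sketched as a short Python function. This is only an illustrative sketch: the names `column_score` and `alignment_score` are assumptions and do not come from the lecture, and the aligned rows are assumed to be given as equal-length strings with "-" as the gap symbol.

```python
# Scoring scheme taken from the slide: match +8, mismatch -5, gap symbol -3.
MATCH, MISMATCH, GAP = 8, -5, -3


def column_score(x: str, y: str) -> int:
    """Score one aligned column, w(x, y); '-' marks a gap symbol."""
    if x == "-" or y == "-":
        return GAP
    return MATCH if x == y else MISMATCH


def alignment_score(row_a: str, row_b: str) -> int:
    """Sum the column scores of two equal-length gapped rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return sum(column_score(x, y) for x, y in zip(row_a, row_b))


# The alignment shown on the slide.
print(alignment_score("C---TTAACT", "CGGATCA--T"))  # 12
```

Summing per-column scores keeps the rating independent of how the alignment was produced, so the same function can be used to compare any two candidate alignments of the same pair of sequences.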
Alignment example:
Sequence a: AGTACAT-A
Sequence b: A-CACACTA

Pair      Status      Scoring function
(A, A)    Match       w = 8
(G, -)    Gap         w = -3
(T, C)    Mismatch    w = -5
(A, A)    Match       w = 8
(C, C)    Match       w = 8
(A, A)    Match       w = 8
(T, C)    Mismatch    w = -5
(-, T)    Gap         w = -3
(A, A)    Match       w = 8
Total score: S = 24

From Math Algorithm to Computer Program
Example: A program that calls each computer in your lab and expects it to respond by sending a message: “hello from processor n”

How to design an anti-SARS drug?
Mechanism of Drug Action: A drug interferes with the function of a disease protein by binding to it. This interference stops the disease process.
Drug Design: The structure of the disease protein is very useful.

Modeling the SARS-CoV Protein Structure
Strategy: Use the structure of another coronavirus as a template. Assuming the backbone structure is fixed, gradually mutate the sequence from that of the other coronavirus to that of SARS-CoV.

Modeling the protein motions
Movie Show: Drug-binding-induced conformation change in protein

Modeling the protein motions
Movie Show: Protein transient opening for ligand or drug binding and dissociation

Drug Design by Computer
Ligand-Protein Docking
Computer search for a stable drug-protein complex
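Returning to the alignment problem posed earlier in the lecture ("which is the best?"), one way to turn that mathematical question into a computer program is dynamic programming over all prefixes of the two sequences. The sketch below uses the standard global alignment recurrence (commonly known as Needleman-Wunsch) with the lecture's scoring scheme; the slides do not name this algorithm, so treat it as an assumed, illustrative choice, and the function name `best_global_alignment` as hypothetical.

```python
# Scoring scheme from the lecture: match +8, mismatch -5, gap symbol -3.
MATCH, MISMATCH, GAP = 8, -5, -3


def best_global_alignment(a: str, b: str):
    """Return (score, row_a, row_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * GAP  # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            score[i][j] = max(
                score[i - 1][j - 1] + w,  # align a[i-1] with b[j-1]
                score[i - 1][j] + GAP,    # a[i-1] against a gap
                score[i][j - 1] + GAP,    # b[j-1] against a gap
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            w = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            if score[i][j] == score[i - 1][j - 1] + w:
                row_a.append(a[i - 1])
                row_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + GAP:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


# The two sequences from the "Sequence Comparison" slide.
best, row_a, row_b = best_global_alignment("ATTCTTGC", "ATCCTATTCTAGC")
print(best)  # 36
print(row_a)
print(row_b)
```

When several alignments share the top score, the traceback returns only one of them (it prefers the diagonal step), so the printed rows are one optimal alignment rather than the only one.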

Anti-SARS Drug Design

Design of anti-SARS drug without knowledge of protein structure
Log(Biological activity of drug candidate) => Size of a side chain of drug candidate (=> Size and shape of protein cavity)

General Principles
Log(Biological activity) = f(structural, physico-chemical parameters)
Example: Log(Biological activity) => Combination of hydrophobic group and charged groups (=> Chemical complementarity to protein cavity)

Construction of Drug Framework
Superimposing the common molecular interaction field contours results in the identification of the consensus framework.
The Consensus Framework:
• purple = hydrophobic area
• green = electron-deficient aromatic system
• red = electronegative heteroatoms
• pink = protonated nitrogen
• blue = large planar ring system

Bioinformatics software
Its role in research:
• Hypothesis-driven research cycle in biology (From Kitano H. Systems biology: a brief overview. Science 2002, 295:1662-4)
• Cyclical refinement of predictive computer models used to define further biological experiments, including the optimization step. (From Brusic et al. 2001, Efficient discovery of immune response targets by cyclical refinement of QSAR models of peptide binding. J. Mol. Graph. Model. 19:405-11, 467)
• By combining computational methods with experimental biology, major discoveries can be made faster and more efficiently.
• Today, every large molecular or systems biology project has a bioinformatics component.
• Use of biological software allows biologists to extend their set of skills for more efficient and more effective analysis of their data, and for planning of experiments.

Summary of Today’s lecture
• What is bioinformatics software, and why use it?
• SARS research as an example:
  – SARS genome
  – Figure out the function of SARS proteins
  – Drug design