Essential Bioinformatics and Biocomputing
(LSM2104: Section I)
Biological Databases and
Bioinformatics Software
Prof. Chen Yu Zong
Tel: 6874-6877
http://xin.cz3.nus.edu.sg
Room 07-24, level 7, SOC1, NUS
January 2003
Lecture 4: Bioinformatics software
Outline:
– Examples of Bioinformatics software usage
• Questions and strategies for dealing with biomedical
problems
• SARS as an example
– SARS genome
– How to figure out the function of SARS genes?
– Use of info of SARS genes in drug design
– Bioinformatics software and research
Bioinformatics software
Examples of bioinformatics software usage:
After the discovery of a new gene or protein related to a
disease, these questions are usually asked:
• What is its function and structure?
– Is it similar in sequence to a known gene or protein? (sequence
similarity search)
– Does it contain a sequence pattern similar to that of a group of known
genes or proteins? (motif identification)
– Is this protein “similar” to proteins with known 3D structure? If so,
derive a 3D model (protein homology modeling)
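The motif-identification question can be illustrated with a simple pattern scan. The sketch below is only an illustration under stated assumptions: it uses the N-glycosylation site pattern N-{P}-[ST]-{P} as the motif, the function name find_motif is invented here, and the protein fragment is a made-up string, not a SARS protein.

```python
import re

# PROSITE-style pattern N-{P}-[ST]-{P} (N-glycosylation site) written as a regex.
# The lookahead allows overlapping occurrences to be reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")

def find_motif(sequence, pattern=N_GLYCOSYLATION):
    """Return (1-based position, matched residues) for each motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]

# Hypothetical protein fragment, for illustration only.
fragment = "MKNVSAANPTLNGSW"
print(find_motif(fragment))  # [(3, 'NVSA'), (12, 'NGSW')]
```

The candidate at position 8 (NPTL) is rejected because a proline follows the asparagine, which is exactly what the {P} element of the pattern excludes.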
Bioinformatics software
After the discovery of a new gene or protein related to a
disease, these questions are usually asked:
• Has any research been done on the disease or similar
genes/proteins? (literature search, data mining)
• How to design a new drug to control it?
– If the 3D structure is known or can be derived, structure-based drug
design can be conducted (molecular modeling, docking)
– If the 3D structure is not known, are any molecules known to interact
with this protein (literature search, data mining)? If yes, predict the
likely drug candidates based on statistical analysis of these molecules
(QSAR, machine learning)
SARS as an example
A novel coronavirus identified as the cause of severe acute respiratory
syndrome (SARS)
SARS as an example
How the SARS coronavirus enters a cell and reproduces
SARS as an example
Another illustration of SARS coronavirus organization
SARS Coronavirus Genome
How Is the SARS Coronavirus Identified?
Is the SARS pathogen related to an existing virus or bacterium?
• Key SARS proteins must be similar to the proteins of that virus or
bacterium so that they have similar organization, metabolic
machinery etc.
• Protein similarity = Sequence similarity (based on the principles of
evolution and the sequence-structure-function relationship)
• What has been found:
• Key SARS proteins are similar to those of coronavirus
• Some SARS proteins are not similar to any existing proteins
• Conclusion: the SARS pathogen is a novel coronavirus
Sequence Comparison as a Mathematical Problem:
Example:
Sequence a: ATTCTTGC
Sequence b: ATCCTATTCTAGC
Best alignment (one gap, at the start of sequence a):
-----ATTCTTGC
ATCCTATTCTAGC
Bad alignment (two gaps inside sequence a):
AT---TCTT--GC
ATCCTATTCTAGC
Many alignments can be constructed => which is the best?
How to rate an alignment?
• Match: +8 (w(x, y) = 8, if x = y)
• Mismatch: -5 (w(x, y) = -5, if x ≠ y)
• Each gap symbol: -3 (w(-,x)=w(x,-)=-3)
 C   -   -   -   T   T   A   A   C   T
 C   G   G   A   T   C   A   -   -   T
+8  -3  -3  -3  +8  -5  +8  -3  -3  +8   = +12 (alignment score)
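The scoring rule above can be written as a short function. This is a minimal sketch: the names w and alignment_score are chosen here for illustration, and a column with a gap symbol on either side is charged the gap penalty.

```python
def w(x, y, match=8, mismatch=-5, gap=-3):
    """Score one alignment column using the slide's scoring scheme."""
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch

def alignment_score(row_a, row_b):
    """Sum the column scores of two gapped rows of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return sum(w(x, y) for x, y in zip(row_a, row_b))

# The example alignment from the slide.
print(alignment_score("C---TTAACT", "CGGATCA--T"))  # 12
```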
How to rate an alignment?
Alignment example:
Sequence a: AGTACAT-A
Sequence b: A-CACACTA
Pair      Status     Scoring function
(A, A)    Match      w = 8
(G, -)    Gap        w = -3
(T, C)    Mismatch   w = -5
(A, A)    Match      w = 8
(C, C)    Match      w = 8
(A, A)    Match      w = 8
(T, C)    Mismatch   w = -5
(-, T)    Gap        w = -3
(A, A)    Match      w = 8
Total score: S = 24
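Rating one alignment is easy; finding the best-rated one among all possible alignments needs a search. The slides do not name an algorithm, so the sketch below swaps in a standard choice, Needleman-Wunsch global alignment by dynamic programming, using the same match, mismatch, and gap scores. The function name best_global_alignment is an assumption made for this example.

```python
def best_global_alignment(a, b, match=8, mismatch=-5, gap=-3):
    """Needleman-Wunsch: return (best score, gapped a, gapped b)."""
    n, m = len(a), len(b)

    def pair(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # letter against letter
                score[i - 1][j] + gap,             # a[i-1] against a gap
                score[i][j - 1] + gap,             # b[j-1] against a gap
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))

best, row_a, row_b = best_global_alignment("ATTCTTGC", "ATCCTATTCTAGC")
print(best)   # 36 under this scoring scheme
print(row_a)
print(row_b)
```

Several alignments can share the best score, so the rows printed depend on the tie-breaking order in the traceback; only the score is unique.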
From Math Algorithm to Computer Program
Example:
A program that calls each computer in your lab and expects each one to
respond by sending a message:
“hello from processor n”
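A minimal sketch of such a program follows. As an assumption made so the example runs on a single machine, worker threads stand in for the lab computers; a real version would send messages between machines with a message-passing library. The names respond and call_all are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def respond(n):
    """What each (simulated) processor sends back when called."""
    return f"hello from processor {n}"

def call_all(num_processors):
    """Call every processor and collect the replies in processor order."""
    with ThreadPoolExecutor(max_workers=num_processors) as pool:
        return list(pool.map(respond, range(num_processors)))

for message in call_all(4):
    print(message)
```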
How to design an anti-SARS drug?
Mechanism of Drug Action:
A drug interferes with the function of a disease protein by binding to it.
This interference stops the disease process.
Drug Design:
The structure of the disease protein is very useful.
Modeling the SARS-CoV Protein Structure
Strategy:
Use the structure from another coronavirus as a template.
Assuming the backbone structure is fixed, gradually mutate the sequence
from that of the other coronavirus to that of SARS-CoV.
Modeling the protein motions:
Movie show: conformational change in a protein induced by drug binding
Modeling the protein motions:
Movie show: transient opening of a protein for ligand or drug binding and
dissociation
Drug Design by Computer
Ligand-Protein Docking
Computer search for stable drug-protein complex:
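The idea of a computer search for a stable complex can be shown with a toy model. Everything here is an assumption made for illustration, not the method behind the slide: the "drug" is a single probe atom, the "protein" is three fixed atoms with made-up coordinates, stability is measured with a Lennard-Jones 12-6 energy in reduced units, and the search is a plain grid scan. Real docking programs also search ligand orientation and flexibility and use richer scoring functions.

```python
import itertools
import math

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 pair energy; the minimum is -epsilon at r = 2**(1/6) * sigma."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)

def interaction_energy(probe, protein_atoms):
    """Total probe-protein energy as a sum over protein atoms."""
    return sum(lennard_jones(math.dist(probe, atom)) for atom in protein_atoms)

def dock_probe(protein_atoms, lo=-3.0, hi=3.0, steps=25):
    """Scan a cubic grid and return (lowest energy, probe position)."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    best = None
    for pos in itertools.product(grid, repeat=3):
        # Skip grid points that coincide with a protein atom (r = 0).
        if any(math.dist(pos, atom) < 1e-9 for atom in protein_atoms):
            continue
        energy = interaction_energy(pos, protein_atoms)
        if best is None or energy < best[0]:
            best = (energy, pos)
    return best

# Hypothetical three-atom "binding pocket", for illustration only.
pocket = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
energy, position = dock_probe(pocket)
print(energy, position)
```

The lower (more negative) the energy, the more stable the modeled complex; the grid spacing limits how close the scan gets to the true minimum.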
Anti-SARS Drug Design
Design of an anti-SARS drug without knowledge of the protein structure
Log(Biological activity of drug candidate) =>
Size of a side chain of drug candidate
(=> Size and shape of protein cavity)
General Principles
Log(Biological activity) = f(structural, physico-chemical parameters)
Example:
Log(Biological activity) => Combination of hydrophobic group and charged groups
(=> Chemical complementarity to protein cavity)
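The simplest form of f is a straight line in one descriptor. The sketch below fits log(activity) = slope × size + intercept by ordinary least squares. The side-chain sizes and activities are invented numbers, and the function names are assumptions made for this example; a real QSAR model would use measured data and usually several descriptors.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Predicted log(activity) for a new descriptor value."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data: side-chain size vs. log(biological activity).
side_chain_size = [1.0, 2.0, 3.0, 4.0]
log_activity = [0.9, 2.1, 2.9, 4.1]

model = fit_line(side_chain_size, log_activity)
print(model)
print(predict(model, 5.0))  # predicted log(activity) of an untested candidate
```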
Construction of Drug Framework
Superimposing the common molecular interaction field contours
identified the consensus framework.
The Consensus Framework:
• purple = hydrophobic area
• green = electron-deficient aromatic system
• red = electronegative heteroatoms
• pink = protonated nitrogen
• blue = large planar ring system
Bioinformatics software
Its role in research:
• Hypothesis-driven research cycle in biology
(From Kitano H. Systems biology: a brief overview. Science 2002, 295:1662-4)
Bioinformatics software
• Cyclical refinement of predictive computer models used to define further
biological experiments, including the optimization step.
(From Brusic et al. 2001, Efficient discovery of immune response targets by
cyclical refinement of QSAR models of peptide binding. J. Mol. Graph. Model.
19:405-11, 467.)
Bioinformatics software
• By combining computational methods with experimental
biology, major discoveries can be made faster and more
efficiently.
• Today, every large molecular or systems biology project
has a bioinformatics component.
• Use of biological software allows biologists to extend
their set of skills for more efficient and more effective
analysis of their data, and for planning of experiments.
Bioinformatics software
Summary of Today’s lecture
• What is bioinformatics software, and why use it?
• SARS research as an example:
– SARS genome
– Figure out the function of SARS proteins
– Drug design