Computer-aided Vaccine and Drug Discovery
G.P.S. Raghava
• Understanding immune system
• Annotation of genomes
• Breaking complex problem
• Searching drug targets
• Adaptive immunity
• Properties of drug molecules
• Innate immunity
• Protein-chemical interaction
• Vaccine delivery system
• Prediction of drug-like molecules
• ADMET of peptides
Bioinformatics Centre, IMTECH, Chandigarh
[Figure: vaccine design overview. Panels: Adaptive Immunity, Innate Immunity, Protective Antigens, Vaccine Delivery. Vaccine types: Whole Organism (Attenuated), Purified Antigen, Epitopes (Subunit Vaccine, T cell epitope).]
Limitations of methods of subunit vaccine design
– Methods for one or two MHC alleles
– Do not consider pathways of antigen processing
– Limited to T-cell epitopes
Initiatives taken by our group
– Understand complete mechanism of antigen processing
– Develop better and comprehensive methods
– Promiscuous MHC binders
Disease Causing Agents (Pathogens/Invaders)
[Figure labels: ER, TAP]
Prediction of CTL Epitopes (Cell-mediated immunity)
MHCBN: A database of MHC/TAP binders and T-cell epitopes
Distributed by EBI, UK
Reference database in T-cell epitopes
Highly cited (~70 citations)
Bhasin et al. (2003) Bioinformatics 19: 665
Bhasin et al. (2004) NAR (Online)
Prediction of MHC II Epitopes (T-helper Epitopes)
• Propred: Promiscuous binders for 51 MHC class II alleles
– Virtual matrices
– Singh and Raghava (2001) Bioinformatics 17:1236
• HLADR4pred: Prediction of HLA-DRB1*0401 binding peptides
– Dominating MHC class II allele
– ANN and SVM techniques
– Bhasin and Raghava (2004) Bioinformatics 12:421
• MHC2Pred: Prediction of MHC class II binders for 41 alleles
– Human and mouse
– Support vector machine (SVM) technique
– Extension of HLADR4pred
• MMBpred: Prediction of Mutated MHC Binders
– Mutations required to increase affinity
– Mutations required to make a binder promiscuous
– Bhasin and Raghava (2003) Hybrid Hybridomics 22:229
• MOT: Matrix optimization technique for binding core
• MHCBench: Benchmarking of methods for MHC binders
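The matrix-based idea behind promiscuous-binder prediction can be sketched as follows. This is a minimal illustration, not the Propred implementation: the matrices, allele names, threshold and the 3-residue core are hypothetical toy values (real matrices are allele-specific and cover a longer binding core). Each window of the antigen is scored by summing per-position weights, and a window counts as promiscuous when it passes the threshold for several alleles.

```python
from typing import Dict, List

# One {residue: weight} dict per position of the binding core.
Matrix = List[Dict[str, float]]

# HYPOTHETICAL toy matrices, invented for illustration only.
TOY_MATRICES: Dict[str, Matrix] = {
    "allele_A": [{"L": 2.0, "F": 1.5}, {"A": 0.5}, {"K": 1.0}],
    "allele_B": [{"L": 1.0, "Y": 2.0}, {"A": 1.0}, {"K": 0.5, "R": 1.0}],
}


def score_window(window: str, matrix: Matrix) -> float:
    """Sum the position-specific weights; unlisted residues contribute 0."""
    return sum(weights.get(res, 0.0) for res, weights in zip(window, matrix))


def predicted_binders(protein: str, matrix: Matrix, threshold: float) -> List[int]:
    """Start indices of all windows whose score reaches the threshold."""
    k = len(matrix)
    return [i for i in range(len(protein) - k + 1)
            if score_window(protein[i:i + k], matrix) >= threshold]


def promiscuous_binders(protein: str, matrices: Dict[str, Matrix],
                        threshold: float, min_alleles: int) -> Dict[int, List[str]]:
    """Map window start -> alleles it is predicted to bind, keeping only
    windows predicted for at least min_alleles alleles."""
    hits: Dict[int, List[str]] = {}
    for allele, matrix in matrices.items():
        for start in predicted_binders(protein, matrix, threshold):
            hits.setdefault(start, []).append(allele)
    return {s: a for s, a in hits.items() if len(a) >= min_alleles}


result = promiscuous_binders("GLAKYARLAK", TOY_MATRICES, threshold=2.0, min_alleles=2)
```

In practice the threshold would be chosen per allele from benchmark data rather than fixed globally as it is here.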
Prediction of MHC I binders and CTL Epitopes
Propred1: Promiscuous binders for 47 MHC class I alleles
– Cleavage site at C-terminal
– Singh and Raghava (2003) Bioinformatics 19:1109
nHLApred: Promiscuous binders for 67 alleles using ANN and quantitative matrices (QM)
– Bhasin and Raghava (2007) J. Biosci. 32:31-42
TAPpred: Analysis and prediction of TAP binders
– Bhasin and Raghava (2004) Protein Science 13:596
Pcleavage: Proteasome and immunoproteasome cleavage sites
– Trained and tested on in vitro and in vivo data
– Bhasin and Raghava (2005) Nucleic Acids Research 33:W202-7
CTLpred: Direct method for predicting CTL epitopes
– Bhasin and Raghava (2004) Vaccine 22:3195
BCIPEP: A database of B-cell epitopes
Saha et al.(2005) BMC Genomics 6:79.
Saha et al. (2006) NAR (Online)
Prediction of B-Cell Epitopes
• BCEpred: Prediction of continuous B-cell epitopes
– Benchmarking of existing methods
– Evaluation of physico-chemical properties
– Poor performance, only slightly better than random
– Combining all properties achieves accuracy of around 58%
– Saha and Raghava (2004) ICARIS 197-204
• ABCpred: ANN-based method for B-cell epitope prediction
– All epitopes extracted from BCIPEP (around 2400)
– 700 non-redundant epitopes used for testing and training
– Recurrent neural network
– Accuracy of 66% achieved
– Saha and Raghava (2006) Proteins 65:40-48
• ALGpred: Mapping and prediction of allergenic epitopes
– Allergenic proteins
– IgE epitopes and mapping
– Saha and Raghava (2006) Nucleic Acids Research 34:W202-W209
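Property-based prediction of continuous epitopes, as evaluated for BCEpred above, is commonly done by smoothing a per-residue propensity scale over a sliding window and flagging residues above a cutoff. The sketch below assumes that general scheme; the scale values, window width and threshold are hypothetical placeholders, not the published parameters. Combining properties is shown as a plain average of the smoothed profiles, which is only one of several possible ways to combine them.

```python
from typing import Dict, List, Sequence

# HYPOTHETICAL per-residue scales, invented for illustration; published
# scales assign a value to each of the 20 amino acids.
TOY_HYDROPHILICITY: Dict[str, float] = {"K": 3.0, "D": 3.0, "S": 0.3, "L": -1.8, "V": -1.5}
TOY_FLEXIBILITY: Dict[str, float] = {"K": 1.1, "D": 1.0, "S": 1.2, "L": 0.9, "V": 0.9}


def window_average(values: Sequence[float], width: int) -> List[float]:
    """Centered moving average; windows are truncated at the sequence ends."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def combined_profile(seq: str, scales: Sequence[Dict[str, float]], width: int) -> List[float]:
    """Average the smoothed profiles of several scales (unknown residues score 0)."""
    profiles = [window_average([s.get(r, 0.0) for r in seq], width) for s in scales]
    return [sum(col) / len(col) for col in zip(*profiles)]


def epitope_mask(seq: str, scales: Sequence[Dict[str, float]],
                 width: int = 7, threshold: float = 1.0) -> List[bool]:
    """True for residues whose combined smoothed score reaches the threshold."""
    return [v >= threshold for v in combined_profile(seq, scales, width)]


mask = epitope_mask("LVLVKDSKDKLVL", [TOY_HYDROPHILICITY, TOY_FLEXIBILITY])
```

Because the scales have different ranges, a real pipeline would normalize each one before averaging; that step is omitted to keep the sketch short.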
HaptenDB: A database of hapten molecules
PRRDB is a database of pattern recognition receptors and their ligands
~500 Pattern-recognition Receptors
228 ligands (PAMPs)
77 distinct organisms
720 entries
Major Challenges in Vaccine Design
• ADMET of peptides and proteins
• Activate innate and adaptive immunity
• Prediction of carrier molecules
• Avoid cross reactivity (autoimmunity)
• Prediction of allergic epitopes
• Solubility and degradability
• Absorption and distribution
• Glycosylated epitopes
[Figure: drug discovery overview. Searching and analyzing druggable targets: Nucleotide (Genome Annotation), Protein (Proteome Annotation, Protein Structure), Target Proteins. Drug Informatics. Prediction and analysis of drug molecules: Biologicals (Protein Drugs), Chemicals, Protein-drug Interaction, Drug-like Molecules.]
• FTGpred: Prediction of prokaryotic genes
– Ab initio method for gene prediction using FFT technique
– Issac et al. (2002) Bioinformatics 18:197
• EGpred: Prediction of eukaryotic genes
– BLASTX against RefSeq and BLASTN against intron database
– NNSPLICE program is used to reassign splicing signal site positions
– Issac and Raghava (2004) Genome Research 14:1756
• GeneBench: Benchmarking of gene finders
– Collection of different datasets
– Tools for evaluating a method
– Creation of own datasets
• SRF: Spectral Repeat Finder
– FFT-based repeat finder
– Sharma et al. (2004) Bioinformatics 20:1405
Work in Progress
• Prediction of polyadenylation signal (PAS) in human coding DNA
• Understanding DICER cutter sites and siRNA/miRNA efficacy
• Prediction of transcription factor binding sites in DNA sequences
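Fourier-based gene and repeat finders typically rest on one observation: periodic structure in a DNA sequence shows up as a peak in its power spectrum (protein-coding regions tend to peak at period 3, and a tandem repeat peaks at its repeat length). The sketch below illustrates that general principle under these assumptions and is not the FTGpred or SRF code. It evaluates the discrete Fourier transform at a single frequency with the standard library instead of running a full FFT; the function names are hypothetical.

```python
import cmath
from typing import Iterable


def period_power(seq: str, period: int) -> float:
    """Spectral power at frequency 1/period, summed over the four base
    indicator sequences and normalized by sequence length."""
    total = 0.0
    for base in "ACGT":
        # DFT coefficient of the 0/1 indicator sequence for this base.
        coeff = sum(cmath.exp(-2j * cmath.pi * i / period)
                    for i, b in enumerate(seq) if b == base)
        total += abs(coeff) ** 2
    return total / len(seq)


def dominant_period(seq: str, candidates: Iterable[int]) -> int:
    """Candidate period with the highest spectral power."""
    return max(candidates, key=lambda p: period_power(seq, p))


codon_like = "ATG" * 20        # perfectly 3-periodic toy sequence
homopolymer = "A" * 60         # no period-3 signal
tandem_repeat = "ACGTT" * 12   # repeat unit of length 5

p3_signal = period_power(codon_like, 3)
p3_background = period_power(homopolymer, 3)
repeat_length = dominant_period(tandem_repeat, range(2, 11))
```

On a genome, the same measure would be computed in a sliding window so that peaks can be localized along the sequence.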
Comparative genomics
• GWFASTA: Genome-wide FASTA search
– Analysis of FASTA search for comparative genomics
– Biotechniques 2002, 33:548
• GWBLAST: Genome-wide BLAST search
• COPID: Composition-based similarity search
• LGEpred: Expression of a gene from its amino acid sequence
– BMC Bioinformatics 2005, 6:59
• ECGpred: Expression from its nucleotide sequence
Subcellular Localization Methods
PSLpred: Subcellular localization of prokaryotic proteins
– 5 major subcellular localizations
– Bioinformatics 2005, 21:2522
ESLpred: Subcellular localization of eukaryotic proteins
– SVM-based method
– Amino acid, dipeptide and property composition
– Sequence profile (PSIBLAST)
– Nucleic Acids Research 2004, 32:W414-9
HSLpred: Subcellular localization of human proteins
– Need to develop organism-specific methods
– 84% accuracy for human proteins
– Journal of Biological Chemistry 2005, 280:14427-32
MITpred: Prediction of mitochondrial proteins
– Exclusive mitochondrial domain and SVM
– J Biol Chem. 2005, 281:5357-63
Work in Progress: Subcellular localization of M.Tb. and malaria
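The amino acid and dipeptide composition features mentioned for the SVM-based methods above turn a protein of any length into a fixed-length numeric vector. A minimal sketch of that encoding is given below; the function names are hypothetical, percentages are used as the unit by assumption, and the classifier itself is left out.

```python
from collections import Counter
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so vectors are comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq: str) -> List[float]:
    """Percentage of each of the 20 standard residues (20 values)."""
    return [100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq: str) -> List[float]:
    """Percentage of each adjacent residue pair (400 values)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return [100.0 * pairs[dp] / total for dp in DIPEPTIDES]


def feature_vector(seq: str) -> List[float]:
    """Concatenated 420-dimensional vector for a downstream classifier."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


features = feature_vector("MKVLAAGIKK")
```

Non-standard residue codes are counted in the sequence length but have no feature of their own, so sequences should be cleaned before encoding. Dipeptide composition keeps some local order information that plain amino acid composition discards.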
Regular Secondary Structure Prediction (α-helix, β-sheet)
• APSSP2: Highly accurate method for secondary structure prediction
– Competed in EVA, CAFASP and CASP (in top 5 methods)
Irregular secondary structure prediction methods (tight turns)
• Betatpred: Consensus method for β-turn prediction
– Statistical methods combined
– Kaur and Raghava (2001) Bioinformatics
• Bteval: Benchmarking of β-turn prediction
– Kaur and Raghava (2002) J. Bioinformatics and Computational Biology 1:495-504
• BetaTpred2: Highly accurate method for predicting β-turns (ANN, SS, MA)
– Multiple alignment and secondary structure information
– Kaur and Raghava (2003) Protein Sci 12:627-34
• BetaTurns: Prediction of β-turn types in proteins
– Kaur and Raghava (2004) Bioinformatics 20:2751-8
• AlphaPred: Prediction of α-turns in proteins
– Kaur and Raghava (2004) Proteins: Structure, Function, and Genetics 55:83-90
• GammaPred: Prediction of γ-turns in proteins
Supersecondary Structure
BhairPred: Prediction of beta hairpins
– Secondary structure and surface accessibility used as input
– Manish et al. (2005) Nucleic Acids Research 33:W154-9
TBBpred: Prediction of outer membrane proteins
– Prediction of transmembrane beta barrel proteins
– Application of ANN and SVM with evolutionary information
– Natt et al. (2004) Proteins 56:11-8
ARNHpred: Analysis and prediction of side chain-backbone interactions
– Prediction of aromatic NH interactions
– Kaur and Raghava (2004) FEBS Letters 564:47-57
Chpredict: Prediction of C-H...O and C-H...π interactions
– Kaur and Raghava (2006) In Silico Biology 6:0011
SARpred: Prediction of surface accessibility (real accessibility)
– Multiple alignment (PSIBLAST) and secondary structure information
– Garg et al. (2005) Proteins 61:318-24
Secondary to Tertiary Structure
PepStr: Prediction of tertiary structure of bioactive peptides
– Kaur et al. (2007) Protein Pept Lett. (In Press)
Nrpred: Classification of nuclear receptors
– BLAST fails in classification of NR proteins
– Uses composition of amino acids
– Journal of Biological Chemistry 2004, 279:23262
GPCRpred: Prediction of G-protein-coupled receptors
– Predicts GPCR proteins and their class
– > 80% in Class A, further classified
– Nucleic Acids Research 2004, 32:W383
GPCRsclass: Amine type of GPCR
– Major drug targets, 4 classes
– Accuracy 96.4%
– Nucleic Acids Research 2005, 33:W172
VGIchan: Voltage-gated ion channels
– Genomics Proteomics & Bioinformatics 2007, 4:253-8
Pprint: RNA-interacting residues in proteins
– Proteins: Structure, Function and Bioinformatics (In Press)
GSTpred: Glutathione S-transferase proteins
– Protein Pept Lett. 2007, 6:575-80
Antibp: Analysis and prediction of antibacterial peptides
• Searching and mapping of antibacterial peptides
• BMC Bioinformatics 2007, 8:263
ALGpred: Prediction of allergens
• Using allergen representative peptides
• Nucleic Acids Research 2006, 34:W202-9
BTXpred: Prediction of bacterial toxins
• Classification of toxins into exotoxins and endotoxins
• Classification of exotoxins into seven classes
• In Silico Biology 2007, 7:0028
NTXpred: Prediction of neurotoxins
• Classification based on source
• Classification based on function (ion channel blockers, acetylcholine receptor blockers, etc.)
• In Silico Biology 2007, 7:0025
Work in Progress (Future Plan)
1. Prediction of solubility of proteins and peptides
2. Understanding drug delivery systems for proteins
3. Degradation of proteins
4. Improving thermal stability of a protein (Protein Science 12:2118-2120)
5. Analysis and prediction of druggable proteins/peptides
MELTpred: Prediction of melting point of chemical compounds
• Around 4300 compounds were analyzed to derive rules
• Successfully predicted melting point of 277 drug-like molecules
Future Plan
1. QSAR models for ADMET
2. QSAR + docking for ADMET
3. Prediction of drug-like molecules
4. Open access in chemoinformatics
Understanding Protein-Chemical Interaction
Prediction of Kinase Targets and Off-Targets
• Kinase inhibitors were analyzed
• Model built to predict inhibitors against kinases
• Cross-specificity was checked
• Useful for predicting targets and off-targets
Future Plan
• Classification of proteins based on chemical interaction
• Clustering drug molecules based on interaction with proteins