Learning Metabolic Network Inhibition using Abductive Stochastic Logic Programming
Jianzhong Chen, Stephen Muggleton, José Santos
Imperial College, London
Florence, 2nd August 2007
Summary
• Background: Metabolic networks, ILP and SLPs
• Problem: Learning Metabolic Network Inhibition
• Modeling: Abductive SLPs
• Results: Improvement by extracting probabilistic examples
• Conclusions and Discussion
Metabolic Network
Metabolites interact with each other in a complex
metabolic network with enzymes catalyzing the
transformation of one metabolite into another.
Metabolites are the nodes and enzymes are the
arcs of this graph.
The metabolic network data was taken from the Kyoto Encyclopedia of
Genes and Genomes (KEGG).
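The graph view described above (metabolites as nodes, enzymes as arcs) can be sketched in Python. This is only an illustration: the six triples are copied from the partial network listed later in this deck, while `build_graph` and `GRAPH` are names invented for the sketch.

```python
from collections import defaultdict

# A few reactionnode(Metabolite, Enzyme, Metabolite) facts from the deck.
REACTIONS = [
    ("2-oxo-glutarate", "1.1.1.42", "isocitrate"),
    ("isocitrate", "1.1.1.42", "2-oxo-glutarate"),
    ("isocitrate", "4.2.1.3", "citrate"),
    ("citrate", "4.2.1.3", "isocitrate"),
    ("succinate", "1.3.99.1", "fumarate"),
    ("fumarate", "1.3.99.1", "succinate"),
]


def build_graph(reactions):
    """Map each metabolite (node) to its outgoing (enzyme, neighbour) arcs."""
    graph = defaultdict(list)
    for src, enzyme, dst in reactions:
        graph[src].append((enzyme, dst))
    return dict(graph)


GRAPH = build_graph(REACTIONS)

# List the arcs leaving one node.
for enzyme, neighbour in GRAPH["isocitrate"]:
    print(f"isocitrate --{enzyme}--> {neighbour}")
```

Each reversible reaction appears as two arcs, one per direction, which mirrors how the reactionnode/3 facts are written.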
[Figure: Excerpt of the rat metabolic network]
Introducing ILP
Inductive Logic Programming (ILP) is an
expressive machine learning technique
that receives a description of a problem
as a logic program.
This description is divided into background
knowledge (i.e. facts, rules and constraints
about the domain) and observations.
Induction and Abduction
Induction: Inference of a general theory that
entails as many observations as possible
(the observations are causes)
Abduction: Inference of the hypothesis that
best explains the available observations (the
observations are consequences)
Problem
Build a model to predict enzyme inhibition
due to a toxin (hydrazine) injection.
Motivation
Predicting the inhibitory effects of a drug is crucial in
drug development to understand the possible side
effects of the drug in the metabolic system of the
recipient.
Experiment setting
Our dataset consists of two groups of 10
rats each. The control group was injected
with a placebo and the case group was
injected with 30mg of hydrazine.
The injection of hydrazine changes the
metabolite concentrations in the rats. The
level of change was measured relative
to the control rats.
Important point
It is not possible to observe the inhibitory effects
of the drug directly (i.e. which enzymes it inhibited). We can,
however, measure the metabolite concentrations at
certain hours after the drug injection.
We abduce the inhibited status by
observing the metabolite concentrations.
Background knowledge
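% X goes down when the reaction producing it (Y to X) is inhibited ...
concentration(X, down, T) :- reactionnode(X, Enz, Y),
    inhibited(Enz, Y, X, T).
% ... or when a non-inhibited reaction feeds it from a Y that is itself down.
concentration(X, down, T) :- reactionnode(X, Enz, Y),
    noninhibited(Enz, Y, X, T),
    concentration(Y, down, T).
% X goes up when the reaction consuming it (X to Y) is inhibited ...
concentration(X, up, T) :- reactionnode(Y, Enz, X),
    inhibited(Enz, X, Y, T).
% ... or when a non-inhibited reaction feeds it from a Y that is itself up.
concentration(X, up, T) :- reactionnode(X, Enz, Y),
    noninhibited(Enz, Y, X, T),
    concentration(Y, up, T).
% Integrity constraints: no metabolite is both up and down at time T,
% and no enzyme direction is both inhibited and noninhibited.
:- concentration(M, up, T), concentration(M, down, T).
:- inhibited(Enz, From, To, T), noninhibited(Enz, From, To, T).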
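To make the meaning of these clauses concrete, here is a minimal Python sketch that forward-chains the four concentration rules to a fixpoint and then checks the two integrity constraints. It is an illustration, not the authors' system. Assumptions: the time argument T is dropped, the names `derive` and `consistent` are invented, the reactionnode triples are copied from the partial network in this deck, and the single noninhibited fact is hypothetical, included only to exercise the propagation rules.

```python
# reactionnode(X, Enz, Y) facts copied from the partial network in this deck.
REACTIONS = {
    ("l-2-aminoadipate", "2.6.1.39", "2-oxo-glutarate"),
    ("2-oxo-glutarate", "2.6.1.39", "l-2-aminoadipate"),
    ("2-oxo-glutarate", "1.1.1.42", "isocitrate"),
    ("isocitrate", "1.1.1.42", "2-oxo-glutarate"),
}
# inhibited(Enz, From, To): listed among the discovered abducibles in the deck.
INHIBITED = {("2.6.1.39", "l-2-aminoadipate", "2-oxo-glutarate")}
# noninhibited(Enz, From, To): hypothetical, for illustration only.
NONINHIBITED = {("1.1.1.42", "2-oxo-glutarate", "isocitrate")}


def derive(reactions, inhibited, noninhibited):
    """Return every (metabolite, direction) pair entailed by the rules."""
    facts = set()
    while True:
        new = set()
        for x, enz, y in reactions:
            if (enz, y, x) in inhibited:
                new.add((x, "down"))  # rule 1: product of a blocked reaction
                new.add((y, "up"))    # rule 3: substrate of a blocked reaction
            if (enz, y, x) in noninhibited:
                for direction in ("down", "up"):  # rules 2 and 4
                    if (y, direction) in facts:
                        new.add((x, direction))
        if new <= facts:  # nothing new was derived: fixpoint reached
            return facts
        facts |= new


def consistent(facts, inhibited, noninhibited):
    """Check both integrity constraints."""
    clash = any((m, "up") in facts and (m, "down") in facts for m, _ in facts)
    return not clash and not (inhibited & noninhibited)


FACTS = derive(REACTIONS, INHIBITED, NONINHIBITED)
print(sorted(FACTS), consistent(FACTS, INHIBITED, NONINHIBITED))
```

Note that an enzyme direction that is neither in `INHIBITED` nor in `NONINHIBITED` propagates nothing: both predicates are abducibles, so neither is assumed by default.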
Background knowledge: partial network
reactionnode('l-2-aminoadipate','2.6.1.39', '2-oxo-glutarate').
reactionnode('2-oxo-glutarate','2.6.1.39', 'l-2-aminoadipate').
reactionnode('2-oxo-glutarate', '1.1.1.42', 'isocitrate').
reactionnode('isocitrate', '1.1.1.42', '2-oxo-glutarate').
reactionnode('2-oxo-glutarate','2.3.1.61', 'succinate').
reactionnode('succinate','2.3.1.61', '2-oxo-glutarate').
reactionnode('isocitrate','4.2.1.3', 'citrate').
reactionnode('citrate','4.2.1.3', 'isocitrate').
reactionnode('isocitrate','4.2.1.3', 'trans-aconitate').
reactionnode('trans-aconitate','4.2.1.3', 'isocitrate').
reactionnode('citrate','4.2.1.2', 'fumarate').
reactionnode('fumarate','4.2.1.2', 'citrate').
reactionnode('succinate','1.3.99.1', 'fumarate').
reactionnode('fumarate','1.3.99.1', 'succinate').
reactionnode('succinate','1.13.11.16', 'hippurate').
reactionnode('hippurate','1.13.11.16', 'succinate').
reactionnode('citrate','2.6.1.-', 'taurine').
reactionnode('taurine','2.6.1.-', 'citrate').
Observations 8 hours after injection
concentration('citrate',up,8).
concentration('2-oxo-glutarate',down,8).
concentration('succinate',up,8).
concentration('l-2-aminoadipate',up,8).
concentration('creatine',down,8).
concentration('creatinine',down,8).
concentration('hippurate',up,8).
concentration('beta-alanine',down,8).
concentration('lactate',up,8).
concentration('methylamine',up,8).
concentration('trans-aconitate',down,8).
concentration('formate',down,8).
concentration('taurine',up,8).
concentration('acetate',down,8).
concentration('nmna',down,8).
concentration('nmnd',up,8).
concentration('tmao',down,8).
concentration('fumarate',up,8).
concentration('l-as',up,8).
concentration('glucose',down,8).
Discovered abducibles
noninhibited('1.13.11.16',succinate,hippurate,8).
noninhibited('2.1.1.1',fumarate,nmnd,8).
noninhibited('2.1.1.7',fumarate,nmna,8).
noninhibited('1.1.99.8',formaldehyde,formate,8).
inhibited('2.6.1.39','l-2-aminoadipate','2-oxo-glutarate',8).
inhibited('4.2.1.3',isocitrate,'trans-aconitate',8).
inhibited('4.2.1.2',fumarate,citrate,8).
inhibited('1.3.99.1',fumarate,succinate,8).
inhibited('2.6.1.-',taurine,citrate,8).
inhibited('4.3.2.1','l-as',arginine,8).
inhibited('2.1.1.2',ornithine,creatine,8).
inhibited('3.5.1.59',sarcosine,creatinine,8).
inhibited('1.4.99.3',methylamine,formaldehyde,8).
inhibited('4.1.2.32',tmao,formaldehyde,8).
inhibited('4.2.1.54',lactate,'acryloyl-coA',8).
inhibited('4.3.1.6','beta-alanine','acryloyl-coA',8).
inhibited('2.7.1.69',glucose,pyruvate,8).
inhibited('6.2.1.1',acetate,acetylCoA,8).
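The abductive step can be illustrated on a two-enzyme fragment of the network. The Python sketch below uses exhaustive search as a stand-in for the abductive ILP system, which these slides do not name. It tries every labelling of the four enzyme directions (inhibited, noninhibited, or unlabelled), forward-chains the concentration rules from the Background knowledge slide, and keeps the smallest labelling that entails both observations without violating the up/down constraint. The time argument is dropped and all function names are invented for the sketch.

```python
from itertools import product

# reactionnode(X, Enz, Y) facts copied from the partial network in this deck.
REACTIONS = [
    ("l-2-aminoadipate", "2.6.1.39", "2-oxo-glutarate"),
    ("2-oxo-glutarate", "2.6.1.39", "l-2-aminoadipate"),
    ("2-oxo-glutarate", "1.1.1.42", "isocitrate"),
    ("isocitrate", "1.1.1.42", "2-oxo-glutarate"),
]
# Two of the observed concentration changes at 8 hours.
OBSERVED = {("l-2-aminoadipate", "up"), ("2-oxo-glutarate", "down")}


def derive(status):
    """Forward-chain the rules; status maps (Enz, From, To) to a label."""
    facts = set()
    while True:
        new = set(facts)
        for x, enz, y in REACTIONS:
            label = status.get((enz, y, x))
            if label == "inhibited":
                new |= {(x, "down"), (y, "up")}
            elif label == "noninhibited":
                new |= {(x, d) for (m, d) in facts if m == y}
        if new == facts:
            return facts
        facts = new


def explanations():
    """Yield every labelling that entails OBSERVED and stays consistent."""
    arcs = [(enz, x, y) for x, enz, y in REACTIONS]
    for labels in product((None, "inhibited", "noninhibited"), repeat=len(arcs)):
        status = {a: lab for a, lab in zip(arcs, labels) if lab is not None}
        facts = derive(status)
        clash = any((m, "up") in facts and (m, "down") in facts for m, _ in facts)
        if OBSERVED <= facts and not clash:
            yield status


BEST = min(explanations(), key=len)  # fewest abduced facts
print(BEST)
```

On this fragment the smallest explanation is the single fact inhibited('2.6.1.39', 'l-2-aminoadipate', '2-oxo-glutarate'), which also appears in the list above. Exhaustive search grows as 3 to the power of the number of enzyme directions, so it is only practical for toy fragments like this one.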
Introducing SLPs
ILP has great modeling power, but
more can be done if probabilities are
attached to the rules of the background
knowledge.
Attaching probabilities to rules is, in brief,
the idea behind Stochastic Logic Programs
(SLPs).
Reformulating the problem
In our SLP modeling we started with the same
logic program as in the ILP case, including the
abducibles as part of the model.
The difference is that probabilities were
attached to the inhibited/4 and noninhibited/4
predicates as well as to the concentration/3
rules.
The SLP system has to discover the probability
values that maximize the likelihood of
observing the given concentrations.
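The idea of choosing clause probabilities by maximum likelihood can be shown on a deliberately tiny case. The sketch below assumes a two-clause toy SLP, `p : status(inhibited).` and `1-p : status(noninhibited).`, and assumes that each sample reveals which clause was used. The counts are hypothetical. The real task is harder because the derivations behind the observed concentrations are hidden, so this shows only the principle, not the learning algorithm used by the authors.

```python
import math


def log_likelihood(p, n_inhibited, n_total):
    """Log-likelihood of the counts when clause 1 has label p and clause 2 has 1 - p."""
    return n_inhibited * math.log(p) + (n_total - n_inhibited) * math.log(1.0 - p)


# Hypothetical counts: 7 of 10 samples used the 'inhibited' clause.
N_INHIBITED, N_TOTAL = 7, 10

# Search a grid of candidate labels for the one with the highest likelihood.
GRID = [i / 100 for i in range(1, 100)]
BEST_P = max(GRID, key=lambda p: log_likelihood(p, N_INHIBITED, N_TOTAL))

# In this fully observed case the maximum is just the relative frequency.
CLOSED_FORM = N_INHIBITED / N_TOTAL

print(BEST_P, CLOSED_FORM)
```

The grid search and the relative frequency agree, which is the expected behaviour when every derivation is observed.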
Significance of this approach
In real-world problems the status of an entity is rarely
just two-valued (on/off). Most problems are better
modeled if the modeling system allows for a
certain degree of fuzziness.
Also, the previously discovered rules may not all be
equally important, and we leave to the SLP system
the responsibility of determining the best relative
weighting.
Novelty of our work
We divided SLP modeling into two variants, Categorical
SLP (CSLP) and Probabilistic SLP (PSLP).
The only difference between the two is that
the latter learns from probabilistic examples
(a confidence of being up or down) rather than
categorical ones (totally up or totally down).
The problem now is how to derive these confidences from
the dataset.
Extracting Probabilistic Examples from
Scientific Data
Pnorm(1.72, 0, 1) = 0.9573
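The value above is the cumulative distribution function of a standard normal distribution evaluated at 1.72. The short Python sketch below reproduces it with the standard library only. The slide does not say how the statistic 1.72 is obtained from the rat measurements, so the sketch takes it as given. Turning the result into a pair of weights for an observed direction and its opposite is our reading of "probabilistic example", not something the slide states.

```python
import math


def pnorm(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


confidence = pnorm(1.72, 0, 1)

# Assumed representation of a probabilistic example: one weight for the
# observed direction and the complementary weight for the opposite one.
example = {"observed": round(confidence, 4), "opposite": round(1 - confidence, 4)}
print(example)
```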
Metabolite       Concentration  Empirical Prob  CSLP   PSLP
citrate          down           0.9843          0.69   0.686
2-og             down           1.000           0.568  0.69
succinate        down           0.9368          0.259  0.297
l-2-aa           up             0.9962          0.658  0.828
creatine         down           0.5052          0.307  0.443
creatinine       down           0.5798          0.322  0.493
hippurate        down           0.7136          0.303  0.166
beta-alanine     up             0.9659          0.567  0.686
lactate          up             0.9503          0.54   0.516
methylamine      up             1.000           0.301  0.525
trans-aconitate  down           0.6488          0.392  0.441
formate          down           0.9368          0.392  0.423
taurine          up             0.7362          0.65   0.81
acetate          up             0.6727          0.556  0.539
nmna             up             0.5239          0.489  0.492
nmnd             up             0.6414          0.489  0.499
tmao             up             0.5166          0.31   0.112
fumarate         up             0.697           0.297  0.502
l-as             up             0.6748          0.504  0.507
glucose          up             0.8096          0.557  0.531
Prediction accuracy of CSLP vs PSLP
Average accuracy: CSLP 68.34%, PSLP 72.75%
P-value: 4.12%
Default accuracy and ILP accuracy = 60%
Abductive SLP model
[Figure: Learned metabolic network with Probabilistic SLP]
Conclusions and Discussion
• This kind of problem is almost impossible to
model without a rich machine learning framework
such as logic programs.
• SLPs produce a richer description of the
underlying biological reality than ILP.
• Additionally, learning SLPs from probabilistic
examples leads to a statistically significant
improvement in predictive accuracy.