Causal Structure Learning in Process Engineering Using Bayes Nets and Soft Interventions
Christian Kühnert, Thomas Bernard, Christian Frey
INDIN 2011
© Fraunhofer IOSB
Motivation
CRISP-DM (Cross-Industry Standard Process for Data Mining)
• Production processes are steadily increasing in complexity
• Increasing automation makes more sensor and process data available for analysis
• Machine learning can be used to find hidden knowledge in the data
• It is not possible to conclude how applying the found knowledge will affect the process
• Cause-effect relationships need to be detected
Introduction
Probabilistic causation
• Learning the relationship between cause and effect using probability theory
• When only two variables are observed, no cause-effect relationship can be learned
• Based on the calculation of conditional independencies
• The cause raises the probability that a certain effect occurs
• A cause-effect direction can be identified if:
  1. V-structures are detected in the data, or
  2. the cause has been brought about on purpose (intervention)
[Figures: V-structure; Intervention]
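The role of V-structures can be illustrated with a small simulation. The sketch below is a toy example and not the procedure used in the talk: it assumes linear-Gaussian data for a collider A → C ← B (the variable names, coefficients, and the partial-correlation test are chosen only for illustration). A and B are uncorrelated on their own but become dependent once C is conditioned on, which is the signature that allows both edges to be directed towards C.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    return cov / math.sqrt(vx * vy)

def partial_corr(x, y, z):
    """Correlation of x and y given z (first-order partial correlation)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)
n = 5000
a = [rng.gauss(0, 1) for _ in range(n)]  # independent cause A
b = [rng.gauss(0, 1) for _ in range(n)]  # independent cause B
# Assumed collider: C is a noisy sum of A and B.
c = [ai + bi + 0.5 * rng.gauss(0, 1) for ai, bi in zip(a, b)]

marginal = pearson(a, b)             # close to zero: A and B are independent
conditional = partial_corr(a, b, c)  # clearly non-zero: dependent given C
print(f"corr(A, B)     = {marginal:+.3f}")
print(f"corr(A, B | C) = {conditional:+.3f}")
```

With only the pair (A, B), or only (A, C), no such asymmetry exists, which matches the statement that two variables alone do not reveal a direction.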
Chemical stirred tank reactor
• Continuous process: fed with educt cA0, resulting in product cA
• Unknown reaction with resulting byproducts cB and cC
• Simulation of several production runs
• Variation of the initial values
Learning the graph of the underlying differential equations
• Constraint-based approach
• Markov blanket: all variables that shield a node from the rest of the network (its parents, its children, and the other parents of its children)
• Redundancy: each edge can be detected through two Markov blankets
• The amount of measurement data needed depends on the size of the Markov blankets, not on the topology
• Detection of V-structures in the graph
[Figure: ground truth]
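The Markov blanket used in the constraint-based approach can be made concrete on a known graph. The following sketch reads the blanket off a given DAG. In structure learning the blanket is instead estimated from data through independence tests, which is not shown here. The graph and the function name `markov_blanket` are hypothetical.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG given as {child: set of parents}.

    The blanket consists of the node's parents, its children, and the
    other parents of its children.
    """
    children = {child for child, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}

# Hypothetical DAG: A -> C <- B, C -> D <- E
dag = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C", "E"}, "E": set()}

blanket_c = markov_blanket(dag, "C")
blanket_a = markov_blanket(dag, "A")
print(sorted(blanket_c))
print(sorted(blanket_a))

# Redundancy: the edge A -> C shows up in both blankets.
edge_seen_twice = "C" in blanket_a and "A" in blanket_c
```

Because every edge appears in the blankets of both of its endpoints, a missed detection in one blanket can be caught by the other.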
• Scoring-based approach
• Edge directions from the constraint-based approach are taken as fixed
• Calculation of the most relevant Bayesian network
• The network that maximizes the marginal likelihood is selected
• The marginal likelihood of a network is calculated from the marginal likelihood of each family (a node together with its parents)
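The family-wise calculation of the marginal likelihood can be sketched as follows. The slides do not state which prior is used, so this example assumes discrete variables with a uniform Dirichlet prior (the K2 metric). The data set, variable names, and function names are hypothetical.

```python
import math
from collections import Counter

def family_log_ml(data, node, parents, arity):
    """Log marginal likelihood of one family (node plus parents), K2 metric."""
    r = arity[node]
    # Counts per (parent configuration, node value) and per parent configuration.
    joint = Counter((tuple(row[p] for p in parents), row[node]) for row in data)
    per_config = Counter()
    for (config, _), count in joint.items():
        per_config[config] += count
    # Unobserved parent configurations contribute zero and can be skipped.
    score = sum(math.lgamma(r) - math.lgamma(n_ij + r) for n_ij in per_config.values())
    score += sum(math.lgamma(n_ijk + 1) for n_ijk in joint.values())
    return score

def network_log_ml(data, structure, arity):
    """The network score is the sum of the family scores."""
    return sum(family_log_ml(data, node, sorted(ps), arity)
               for node, ps in structure.items())

# Hypothetical binary data in which A and B are strongly dependent.
pairs = [(0, 0)] * 40 + [(1, 1)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10
data = [{"A": a, "B": b} for a, b in pairs]
arity = {"A": 2, "B": 2}

with_edge = network_log_ml(data, {"A": set(), "B": {"A"}}, arity)
without_edge = network_log_ml(data, {"A": set(), "B": set()}, arity)
print(f"A -> B : {with_edge:.2f}")
print(f"no edge: {without_edge:.2f}")
```

Since the score decomposes into families, changing one edge only requires rescoring the families whose parent sets changed.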
• In theory, causal structures can only be learned up to their Markov equivalence class
• All conditional independencies are detected
• The resulting Bayesian network is the most probable network
• Intervening in the process
• Intervening means clamping a process variable to a fixed value
• This is not always possible, as it could harm the process
• It can also be physically impossible (e.g. stopping a chemical reaction)
• Alternative: using soft interventions
[Figure: ground truth]
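The limitation to the Markov equivalence class can be checked numerically in the simplest case. The sketch below assumes two discrete variables and hypothetical counts. It shows that the graphs A → B and B → A reach exactly the same maximized log-likelihood on observational data, so the data alone cannot decide between them.

```python
import math
from collections import Counter

def max_log_likelihood(data, structure):
    """Maximized log-likelihood of discrete data under a DAG {node: parents}."""
    total = 0.0
    for node, parents in structure.items():
        joint = Counter((tuple(row[p] for p in parents), row[node]) for row in data)
        config = Counter()
        for (cfg, _), n in joint.items():
            config[cfg] += n
        # Maximum likelihood estimate of each conditional probability: n / config[cfg].
        total += sum(n * math.log(n / config[cfg]) for (cfg, _), n in joint.items())
    return total

# Hypothetical observational counts for two binary variables.
pairs = [(0, 0)] * 35 + [(0, 1)] * 15 + [(1, 0)] * 5 + [(1, 1)] * 45
data = [{"A": a, "B": b} for a, b in pairs]

a_to_b = max_log_likelihood(data, {"A": (), "B": ("A",)})
b_to_a = max_log_likelihood(data, {"B": (), "A": ("B",)})
print(f"A -> B: {a_to_b:.4f}")
print(f"B -> A: {b_to_a:.4f}")
```

Both directions factorize the same joint distribution, which is why additional information such as an intervention is needed to direct the edge.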
Soft intervention
• Soft intervention: ω describes the strength of an intervention on a node
• There has been an intervention on the node
• The maximum likelihood for the selected node is calculated in both the observed and the intervened case
• The node is set to a fixed value
• The maximum likelihood of the node is calculated using the observations only, not the intervened case
• In the intervened case the node has a fixed value (its parents have no influence)
How to select ω:
• Further knowledge of the process is needed
• ω is selected such that known edges of the node are most likely detected, and known non-edges most likely not detected
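The difference between a fixed-value intervention and a soft intervention can be sketched in the likelihood of a single family. The slides do not give the formula, so the linear weight (1 − ω) used below is purely an illustrative assumption and not the authors' formulation. Only the limiting case ω = 1, where samples that clamp the node are left out of that node's likelihood, corresponds to the fixed-value case described above. The data and names are hypothetical.

```python
import math
from collections import defaultdict

def family_log_likelihood(samples, node, parents, omega):
    """Maximized, weighted log-likelihood of `node` given `parents`.

    `samples` is a list of (values, target) pairs, where `target` names the
    intervened node or is None. Samples that intervened on `node` itself get
    the assumed weight (1 - omega): omega = 1 ignores them (fixed value),
    omega = 0 treats them like ordinary observations.
    """
    joint = defaultdict(float)   # weighted count per (parent configuration, value)
    config = defaultdict(float)  # weighted count per parent configuration
    for values, target in samples:
        weight = 1.0 - omega if target == node else 1.0
        cfg = tuple(values[p] for p in parents)
        joint[(cfg, values[node])] += weight
        config[cfg] += weight
    return sum(w * math.log(w / config[cfg])
               for (cfg, _), w in joint.items() if w > 0)

pairs = [(0, 0)] * 45 + [(1, 1)] * 45 + [(0, 1)] * 5 + [(1, 0)] * 5
observed = [({"A": a, "B": b}, None) for a, b in pairs]
# Hypothetical intervention: B is clamped to 1 regardless of A.
intervened = [({"A": a, "B": 1}, "B") for a in [0, 1] * 20]
samples = observed + intervened

hard = family_log_likelihood(samples, "B", ("A",), omega=1.0)
soft = family_log_likelihood(samples, "B", ("A",), omega=0.5)
as_observation = family_log_likelihood(samples, "B", ("A",), omega=0.0)
obs_only = family_log_likelihood(observed, "B", ("A",), omega=1.0)
print(hard, soft, as_observation)
```

With ω = 1 the result equals the likelihood computed from the observations alone. Families of other nodes still use all samples, because the intervention on B does not change how those nodes depend on their parents.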
Chemical stirred tank reactor
[Figure: edge error for observational data (learning from observations) and interventional data (learning with intervention); axis labels: ω, measurements]
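The edge error shown in the result plots compares a learned graph with the ground truth. Its exact definition is not given on the slides. The sketch below assumes one plausible choice: every pair of nodes whose edge is missing, additional, or reversed counts as one error. The graphs and the function name `edge_error` are hypothetical.

```python
def edge_error(learned, truth):
    """Number of node pairs whose edge differs between two directed graphs.

    A missing, an additional, and a reversed edge each count as one error.
    """
    def by_pair(edges):
        # Group directed edges by the unordered pair of nodes they connect.
        grouped = {}
        for src, dst in edges:
            grouped.setdefault(frozenset((src, dst)), set()).add((src, dst))
        return grouped

    learned_pairs, truth_pairs = by_pair(learned), by_pair(truth)
    return sum(learned_pairs.get(pair) != truth_pairs.get(pair)
               for pair in learned_pairs.keys() | truth_pairs.keys())

truth = [("A", "B"), ("B", "C")]
# Learned graph with one reversed edge (B -> A) and one additional edge (A -> C).
learned = [("B", "A"), ("B", "C"), ("A", "C")]
errors = edge_error(learned, truth)
print(errors)
```

A measure of this kind can be evaluated for several values of ω against the edges that are known in advance.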
Disintegration process
• Intervention on node B
• Assumption: the edge between A and B is known
[Figure: edge error for observational and interventional data; axis labels: ω, measurements]
Laboratory plant
• Liquid is pumped around in a cycle through containers
• Pumping power is kept constant
• The process is in feed-forward control
• No collisions in the process
Prior knowledge
• No edge can be directed towards the pump
• The edge between flow and pressure is assumed as a priori knowledge
[Figure: reduction of water flow]
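Prior knowledge such as "no edge can be directed towards the pump" can be encoded as a restriction on the candidate edges before learning starts. The sketch below is a minimal illustration. The variable list and the function name `allowed_edges` are assumptions, and the slides do not say how the restriction was implemented.

```python
from itertools import permutations

def allowed_edges(variables, no_incoming):
    """All directed candidate edges except those pointing into `no_incoming` nodes."""
    return [(src, dst) for src, dst in permutations(variables, 2)
            if dst not in no_incoming]

# Hypothetical variable names for the laboratory plant.
variables = ["pump", "flow", "pressure"]
candidates = allowed_edges(variables, no_incoming={"pump"})
for src, dst in candidates:
    print(f"{src} -> {dst}")
```

Restricting the candidates shrinks the search space and prevents directions that contradict the known process layout.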
[Figure: edge error for observational and interventional data; variables: flow, pressure; axis labels: ω, measurements]
Glass forming process
• Production of 80 mm and 89 mm bars
• During start-up, the process runs in feed-forward control
[Figure panels: ground truth; feature extraction; first set-point; first and second set-point]
Conclusion
• Cause-effect relationships can be detected in measurement data based on probabilistic measures
• Soft interventions can be used to validate the learned structure and to direct more edges
• Performance has been shown on simulated reactors, an experimental station, and a well-known industrial glass forming process
• The true value of ω is hard to determine
Future research
• Using deterministic causality to derive causal relationships based on fault propagation
• Dynamic Bayesian networks