Learning Bayesian Metanetworks from Data with Multilevel Uncertainty

Vagan Terziyan, Oleksandra Vitko
University of Jyväskylä, Kharkov National University of Radioelectronics
AIAI-2004 (WCC 2004), Toulouse, France, 24 August 2004

Contents
• Bayesian Metanetworks
• Metanetworks for managing conditional dependencies
• Metanetworks for managing feature relevance
• Learning Bayesian Metanetworks from Data
• Conclusions

This presentation: http://www.cs.jyu.fi/ai/AIAI-2004.ppt

Oleksandra Vitko
Department of Artificial Intelligence, Kharkov National University of Radioelectronics (Ukraine)
http://www.cs.jyu.fi/ai/oleksandra

Vagan Terziyan
Industrial Ontologies Group, Department of Mathematical Information Technologies, University of Jyväskylä (Finland)
http://www.cs.jyu.fi/ai/vagan

Bayesian Metanetworks

Bayesian Metanetwork
Definition. A Bayesian Metanetwork is a set of Bayesian networks layered on top of each other in such a way that the elements (nodes or conditional dependencies) of each network depend on the local probability distributions associated with the nodes of the network at the next level.
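To make the definition concrete, here is a minimal Python sketch of one possible two-level reading of it. A contextual attribute Z carries a distribution over candidate conditional tables p_k(Y|X), and the predictive-level marginal P(Y) is obtained by averaging over X, over the candidate tables, and over Z. All names and numbers below (P_Z, P_TABLE_GIVEN_Z, TABLES, and so on) are invented for illustration and are not taken from the slides.

```python
# Hypothetical two-level example; every number here is an assumption.

# Contextual level: distribution over the contextual attribute Z.
P_Z = {"z1": 0.5, "z2": 0.5}

# Interlevel link: P(P(Y|X) = p_k | Z = z_m), one distribution per context value.
P_TABLE_GIVEN_Z = {
    "z1": {"p1": 0.7, "p2": 0.3},
    "z2": {"p1": 0.2, "p2": 0.8},
}

# Predictive level: prior over X and the candidate tables p_k(Y | X).
P_X = {"x1": 0.5, "x2": 0.5}
TABLES = {
    "p1": {"x1": {"y1": 0.9, "y2": 0.1}, "x2": {"y1": 0.4, "y2": 0.6}},
    "p2": {"x1": {"y1": 0.2, "y2": 0.8}, "x2": {"y1": 0.5, "y2": 0.5}},
}


def table_weights(p_z, p_table_given_z):
    """Marginal weight of each candidate table: sum_m P(z_m) * P(p_k | z_m)."""
    weights = {}
    for z, pz in p_z.items():
        for k, pk in p_table_given_z[z].items():
            weights[k] = weights.get(k, 0.0) + pz * pk
    return weights


def marginal_y(p_x, tables, p_z, p_table_given_z):
    """P(Y) averaged over X, the candidate tables, and the context Z."""
    weights = table_weights(p_z, p_table_given_z)
    result = {}
    for k, table in tables.items():
        for x, px in p_x.items():
            for y, pyx in table[x].items():
                result[y] = result.get(y, 0.0) + pyx * px * weights[k]
    return result


P_Y = marginal_y(P_X, TABLES, P_Z, P_TABLE_GIVEN_Z)
print(P_Y)
```

Precomputing the table weights separately keeps the contextual level and the predictive level visibly distinct, which mirrors the layered structure in the definition.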
Two-level Bayesian C-Metanetwork for Managing Conditional Dependencies
[Figure: two-level network with a contextual level above a predictive level.]

Contextual and Predictive Attributes
[Figure: a machine in its environment observed by sensors; labels: air pressure, dust, humidity, temperature, emission.]
The attribute vector X = <x1, x2, x3, x4, x5, x6, x7> is partitioned into predictive attributes and contextual attributes.

Contextual Effect on Conditional Probability (1)
[Figure: attribute vector X split into predictive and contextual attributes; predictive attributes xk and xr and contextual attribute xt are highlighted.]
• Assume a conditional dependence between predictive attributes (a causal relation between physical quantities).
• Some contextual attribute may directly affect the conditional dependence between predictive attributes, but not the attributes themselves.

Contextual Effect on Conditional Probability (2)
• X = {x1, x2, …, xn}: predictive attribute with n values;
• Z = {z1, z2, …, zq}: contextual attribute with q values;
• P(Y|X) = {p1(Y|X), p2(Y|X), …, pr(Y|X)}: conditional dependence attribute (random variable) between X and Y with r possible values;
• P(P(Y|X)|Z): conditional dependence between attribute Z and attribute P(Y|X).

$$P(Y = y_j) = \sum_{k=1}^{r} \sum_{i=1}^{n} \Big\{ p_k(Y = y_j \mid X = x_i) \cdot P(X = x_i) \cdot \sum_{m=1}^{q} \big[ P(Z = z_m) \cdot P(P(Y|X) = p_k(Y|X) \mid Z = z_m) \big] \Big\}$$

Contextual Effect on Conditional Probability (3)
Xt (context): Xt1: I am in Paris; Xt2: I am in Moscow
Xk (Order present): Xk1: order flowers; Xk2: order wine
Xr (Make a visit): Xr1: visit football match; Xr2: visit girlfriend

P1(Xr|Xk)   Xk1   Xk2
Xr1         0.3   0.9
Xr2         0.4   0.5

P2(Xr|Xk)   Xk1   Xk2
Xr1         0.1   0.2
Xr2         0.8   0.7

Contextual Effect on Conditional Probability (4)
Xt1: I am in Paris; Xt2: I am in Moscow

P(P(Xr|Xk)|Xt)   Xt1   Xt2
P1(Xr|Xk)        0.7   0.2
P2(Xr|Xk)        0.3   0.8

The tables P1(Xr|Xk) and P2(Xr|Xk) are the same as on the previous slide.

Contextual Effect on Unconditional Probability (1)
[Figure: attribute vector X split into predictive and contextual attributes; predictive attribute xk with distribution P(X) and contextual attribute xt are highlighted.]
• Assume some predictive attribute is a random variable with an appropriate probability distribution over its values.
• Some contextual attribute may directly affect the probability distribution of the predictive attribute.

Contextual Effect on Unconditional Probability (2)
• X = {x1, x2, …, xn}: predictive attribute with n values;
• Z = {z1, z2, …, zq}: contextual attribute with q values, and P(Z): probability distribution over the values of Z;
• P(X) = {p1(X), p2(X), …, pr(X)}: probability distribution attribute for X (random variable) with r possible values (the different possible probability distributions for X), and P(P(X)): probability distribution over the values of attribute P(X);
• P(Y|X): conditional probability distribution of Y given X;
• P(P(X)|Z): conditional probability distribution of attribute P(X) given Z.

$$P(Y = y_j) = \sum_{k=1}^{r} \sum_{i=1}^{n} \Big\{ P(Y = y_j \mid X = x_i) \cdot p_k(X = x_i) \cdot \sum_{m=1}^{q} \big[ P(Z = z_m) \cdot P(P(X) = p_k(X) \mid Z = z_m) \big] \Big\}$$

Contextual Effect on Unconditional Probability (3)
Xt1: I am in Paris; Xt2: I am in Moscow
Xk (Order present): Xk1: order flowers; Xk2: order wine

P(P(Xk)|Xt)   Xt1   Xt2
P1(Xk)        0.4   0.9
P2(Xk)        0.6   0.1

[Figure: bar charts of the candidate distributions P1(Xk) and P2(Xk) over Xk1 and Xk2; chart values 0.7, 0.5, 0.3, 0.2.]

Causal Relation between Conditional Probabilities
[Figure: attributes xm, xn with P(Xn|Xm) taking candidate values P1(Xn|Xm), P2(Xn|Xm), P3(Xn|Xm) and distribution P(P(Xn|Xm)); attributes xk, xr with P(Xr|Xk) taking candidate values P1(Xr|Xk), P2(Xr|Xk) and distribution P(P(Xr|Xk)); the two are linked by P(P(Xr|Xk)|P(Xn|Xm)).]
There might be a causal relationship between two pairs of conditional probabilities.

Two-level Bayesian C-Metanetwork for managing conditional dependencies
[Figure: predictive level with nodes A, B, X, Y; contextual level with nodes P(B|A) and P(Y|X).]

Example of Bayesian C-Metanetwork
The nodes of the 2nd-level network correspond to the conditional probabilities of the 1st-level network, P(B|A) and P(Y|X). The arc in the 2nd-level network corresponds to the conditional probability P(P(Y|X)|P(B|A)).

$$P(Y = y_j) = \sum_{k} \sum_{i} \Big\{ p_k(Y = y_j \mid X = x_i) \cdot P(X = x_i) \cdot \sum_{r} \big[ P(P(Y|X) = p_k(Y|X) \mid P(B|A) = p_r(B|A)) \cdot P(P(B|A) = p_r(B|A)) \big] \Big\}$$

Two-level Bayesian R-Metanetwork for Modelling Relevant Features' Selection
[Figure: two-level network with a contextual level above a predictive level.]

Feature relevance modelling (1)
We consider relevance as the probability that a variable is important to the inference of the target attribute in the given context. Under this definition, relevance inherits all the properties of a probability.
• The model X → Y, with P(X) and P(Y|X), holds with probability P(ψ(X) = "yes") = ψ_X.
• The model with Y alone, with P0(Y), holds with probability P(ψ(X) = "no") = 1 − ψ_X.

Feature relevance modelling (2)

$$P(Y) = \frac{1}{nx} \sum_{X} P(Y \mid X) \cdot \big[ nx \cdot \psi_X \cdot P(X) + (1 - \psi_X) \big]$$

General Case of Managing Relevance (1)
Predictive attributes:
• X1 with values {x11, x12, …, x1nx1};
• X2 with values {x21, x22, …, x2nx2};
• …
• XN with values {xN1, xN2, …, xNnxN}.
Target attribute: Y with values {y1, y2, …, yny}.
Probabilities: P(X1), P(X2), …, P(XN); P(Y|X1, X2, …, XN).
Relevancies: ψ_X1 = P(ψ(X1) = "yes"); ψ_X2 = P(ψ(X2) = "yes"); …; ψ_XN = P(ψ(XN) = "yes").
Goal: to estimate P(Y).

General Case of Managing Relevance (2)

$$P(Y) = \frac{1}{\prod_{s=1}^{N} nx_s} \sum_{X_1} \sum_{X_2} \cdots \sum_{X_N} \Big[ P(Y \mid X_1, X_2, \ldots, X_N) \cdot \prod_{\forall r\,(\psi(X_r) = \text{"yes"})} nx_r \cdot \psi_{X_r} \cdot P(X_r) \cdot \prod_{\forall q\,(\psi(X_q) = \text{"no"})} (1 - \psi_{X_q}) \Big]$$

Example of Relevance Bayesian Metanetwork (1)
Conditional relevance:

$$P(Y) = \frac{1}{nx} \sum_{X} \Big\{ P(Y \mid X) \cdot \Big[ nx \cdot P(X) \cdot \sum_{A} P(\psi_X \mid A) \cdot P(A) + (1 - \psi_X) \Big] \Big\}$$

Example of Relevance Bayesian Metanetwork (2)
[Figure]

Learning Bayesian Metanetworks from Data

Learning Task
Given: a training set D of training examples <X1, X2, …, Xn, Y>.
The goal is to restore:
• the set of levels of the Bayesian Metanetwork {l1, l2, …, lL}, each level being a Bayesian network;
• the interlevel links for each pair of successive levels {lr, lr+1};
• the network structure and parameters at each level, in particular the probabilities P(vi) and P(vi|parents(vi)) for each variable vi.

Sample training vectors (?? marks a missing value):
<9.7, 0.6, 8, 14, 18>
<0.2, 1.3, 5, ??, ??>
<1.3, 2.8, ??, 0, 1>
<??, 5.6, 0, 10, ??>
…
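The single-feature relevance formula from the slide "Feature relevance modelling (2)" can be checked numerically. The Python sketch below is only an illustration under assumed inputs: the function name marginal_with_relevance and all probability values are hypothetical. It evaluates P(Y) = (1/nx) Σ_X P(Y|X)·[nx·ψ_X·P(X) + (1 − ψ_X)] for three relevance values.

```python
def marginal_with_relevance(p_y_given_x, p_x, psi):
    """P(Y) = (1/nx) * sum_X P(Y|X) * [nx * psi * P(X) + (1 - psi)]."""
    nx = len(p_x)  # number of values of the predictive attribute X
    result = {}
    for x, px in p_x.items():
        # Per-value weight: relevance-scaled prior plus the "not relevant" share.
        weight = (nx * psi * px + (1.0 - psi)) / nx
        for y, pyx in p_y_given_x[x].items():
            result[y] = result.get(y, 0.0) + pyx * weight
    return result


# Hypothetical distributions, invented for this example only.
P_X = {"x1": 0.8, "x2": 0.2}
P_Y_GIVEN_X = {
    "x1": {"y1": 0.9, "y2": 0.1},
    "x2": {"y1": 0.3, "y2": 0.7},
}

fully_relevant = marginal_with_relevance(P_Y_GIVEN_X, P_X, psi=1.0)
irrelevant = marginal_with_relevance(P_Y_GIVEN_X, P_X, psi=0.0)
partly = marginal_with_relevance(P_Y_GIVEN_X, P_X, psi=0.5)

print(fully_relevant, irrelevant, partly)
```

Expanding the bracket shows why the boundary cases behave as expected: the formula equals ψ_X·Σ_X P(Y|X)·P(X) + (1 − ψ_X)·(1/nx)·Σ_X P(Y|X). With ψ_X = 1 it reduces to the ordinary marginal over X, and with ψ_X = 0 it reduces to the uniform average of P(Y|X) over the values of X.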
Learning Bayesian Metanetwork
• Use well-known learning methods for learning the component Bayesian networks at each level of the Metanetwork.
• Add procedures for learning the interlevel relationships for the case of multilevel probabilistic Metanetworks.

Learning Process
Stage 1. Division of attributes among the levels
Stage 2. Learning the network structure
Stage 3. Learning the interlevel links to the subsequent level
Stage 4. Learning the network parameters over all levels of the Metanetwork

Stage 1. Division of attributes among the levels
The task of this stage is to divide the input vector of attributes <X1, X2, …, Xn> into predictive, contextual and perhaps metacontextual attributes.
[Figure: attribute vector X = <x1, x2, x3, x4, x5, x6, x7> partitioned into predictive attributes and contextual attributes.]

Stage 2. Learning the network structure
Learning the network structure at the current level of the Metanetwork can be done with well-known methods that perform well (Cheng-Greiner method, KA2 algorithm, etc.).
[Figure: example network over nodes A, B, C, D, E.]

Stage 3. Learning the interlevel links between the current and subsequent levels
This is a new stage, added specifically for learning a Bayesian Metanetwork. It differs for the C-Metanetwork and for the R-Metanetwork.

Learning Interlevel Links in C-Metanetwork
Context 1: <A, B, X, Y>1 → P1(B|A), P1(Y|X)
Context 2: <A, B, X, Y>2 → P2(B|A), P2(Y|X)
...
Context n: <A, B, X, Y>n → Pn(B|A), Pn(Y|X)
Different probability tables corresponding to different contexts are associated with the vertices of the second-level Bayesian network.

Context variables in C-Metanetwork
• context random variable U with values {P_t(B|A)}
• context random variable W with values {P_j(Y|X)}

$$P(W \mid U) = \big\{ P\big(W = P_j(Y|X) \mid U = P_t(B|A)\big) \big\}$$

Learning Interlevel Links in R-Metanetwork
Context 1: <A, B, X, Y>1 → ψ1(A), ψ1(B), ψ1(X)
Context 2: <A, B, X, Y>2 → ψ2(A), ψ2(B), ψ2(X)
...
Context n: <A, B, X, Y>n → ψn(A), ψn(B), ψn(X)
Different relevancies corresponding to different contexts are associated with the vertices of the second-level Bayesian network.

Context variables in R-Metanetwork
• context random variable U with values {ψ_t(A)}
• context random variable W with values {ψ_j(X)}

$$P(W \mid U) = \big\{ P\big(W = \psi_j(X) \mid U = \psi_t(A)\big) = \psi_{jt}(X|A) \big\}$$

Stage 4. Learning the parameters in the network at the current level
Learning the parameters is done by the standard procedure, taking into account the dynamics of the parameter values in different contexts.
[Figure: network over nodes A, B, C, D, E annotated with P(A), P(B), P(C), P(D), P(E), P(C|B), P(D|A,B), P(E|D,C).]

When Bayesian Metanetworks?
1. A Bayesian Metanetwork can be considered a very powerful tool in cases where the structure (or strengths) of the causal relationships between the observed parameters of an object essentially depends on context (e.g. external environment parameters).
2. It can also be considered a useful model for an object whose diagnosis depends on a different set of observed parameters depending on the context.

Conclusions
• The main challenge of this work is the extension of the standard Bayesian learning procedures with an algorithm for learning the interlevel links.
• Experiments on data from a highly contextual domain have shown the effectiveness of the proposed models and learning procedures.

Read more about Bayesian Metanetworks in:
• Terziyan V., A Bayesian Metanetwork, In: International Journal on Artificial Intelligence Tools, Vol. 14, Ns. 3-4, World Scientific (to appear). http://www.cs.jyu.fi/ai/IJAIT-2003.doc
• Terziyan V., Vitko O., Bayesian Metanetwork for Modelling User Preferences in Mobile Environment, In: German Conference on Artificial Intelligence (KI-2003), Hamburg, Germany, September 15-18, 2003. http://www.cs.jyu.fi/ai/papers/KI-2003.pdf
• Vitko O., The Multilevel Probabilistic Networks for Modelling Complex Information Systems under Uncertainty. Ph.D. Thesis, Kharkov National University of Radioelectronics, 2003.