Learning Linear Causal Models
Oksana Kohutyuk
ComS 673, Spring 2005
Department of Computer Science, Iowa State University

Outline
- Motivation
- Linear causal models: introduction, properties
- Structure learning:
  - greedy hillclimbing search, and results
  - PC algorithm, and results
  - EGS/EGS2 algorithm, and results

Motivation
- Given an amount of data, what can we deduce about the unknown underlying regulatory network?
- Observe the state of a cell and how it changes under different conditions, and from this derive a model of how these state changes are generated.

Linear Causal Models
- A special case of Bayesian networks.
- Assumes linear interactions between a child and its parents in the graph; the influences of the parents are additive, so combinatorial effects cannot be modeled.
- Parameters specify the type of influence (positive/negative) and the amount of influence.
- Allows continuous data.

Linear Causal Models: Example
X2 = c1 + b12·X1 + e1
X3 = c2 + b13·X1 + b23·X2 + e2
...
- Nodes = variables (e.g., gene expression levels).
- Arcs = causes (e.g., biological processes).
(Figure: a DAG over X1, ..., X6.)

Properties
- Conditional independence I(x, z, y):
  - each node is asserted to be conditionally independent of its non-descendants, given its parents;
  - each node is conditionally independent of all other nodes, given its Markov blanket.
- The joint probability then factors as
  P(X1, ..., Xn) = ∏ i=1..n P(Xi | parents(Xi))

Independence Structure and Correlation Constraints
- Acyclic graphs. Example:
  X2 = c1 + b12·X1 + e1
  X3 = c2 + b23·X2 + e2
- Calculating the correlation of X1 and X3 from the model's equations, we find that ρ13 = ρ12·ρ23: the structure of the graph constrains which correlations can occur.

Conditional Independence
In linear causal models, conditional independence is equivalent to zero partial correlation, also known as a vanishing partial correlation.
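To make the vanishing-partial-correlation test concrete, here is a minimal sketch (assuming NumPy; the function names partial_corr and vanishes are illustrative, not from the slides). It estimates the first-order partial correlation from data and applies the t-test described later in the talk:

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation:
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def vanishes(x, y, z, crit=1.96):
    """t-test of H0: rho_xy.z = 0, using t = r * sqrt(N-3) / sqrt(1 - r^2).
    1.96 approximates the two-sided 95% critical value for large N."""
    r = partial_corr(x, y, z)
    t = r * np.sqrt(len(x) - 3) / np.sqrt(1 - r**2)
    return abs(t) <= crit  # accept H0 (independence) if |t| is below the c.v.

# Toy chain x -> z -> y: x and y are correlated, but not given z.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
z = 0.8 * x + rng.normal(size=2000)
y = 0.7 * z + rng.normal(size=2000)
print(partial_corr(x, y, z))  # expected to be near zero
```

On the chain above, partial_corr(x, y, z) comes out near zero while the marginal correlation of x and y does not, matching ρxy.z = 0 ⇒ I(x, z, y).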
ρxy.z = (ρxy − ρxz·ρyz) / √((1 − ρ²xz)(1 − ρ²yz))

ρxy.z = 0 ⇒ I(x, z, y): there is no linear correlation between x and y given the value of z.

(Example figure: a linear correlation between x and y.)

Conditional Independence (higher order)
When conditioning on a set of variables, the partial correlation coefficient can be computed recursively:
ρxy.qZ = (ρxy.Z − ρxq.Z·ρyq.Z) / √((1 − ρ²xq.Z)(1 − ρ²yq.Z))
The base case is the first-order partial correlation ρxy.z above, where ρ is the correlation coefficient, defined in turn from the covariance, the variance, the standard deviation, and the expected value (mean).

D-separation
Two variables I and J are independent (have zero partial correlation) if all paths between them are blocked by the evidence K. A path is blocked in three cases:
- head-to-head node (I → K ← J), when K is not in the evidence;
- intermediate node (I → K → J), when K is in the evidence;
- tail-to-tail node (I ← K → J), when K is in the evidence.

Inferring Independence in Graphs
Algorithm to determine whether d-sep(X, Z, Y) holds:
1. Recursively prune every leaf node W as long as W does not belong to X ∪ Y ∪ Z.
2. Delete all edges leaving any node in Z.
3. Test whether X and Y are disconnected in the resulting graph.

Examples (on a DAG over X1, ..., X7):
- I(X2, {}, X4)?
- I(X2, X5, X4)?
- I(X3, X2, X5)?

Paper Review
"Revising Regulatory Networks: From Expression Data to Linear Causal Models", S. D. Bay, J. Shrager, A. Pohorille, and P. Langley. http://newatlantis.isle.org/~sbay/papers/jbi-BSPL.pdf

Greedy Hillclimbing Search
- A partial model already exists.
- Modify/improve the initial model using microarray gene expression data.

Revising the Model
1. Start with the initial model.
2. Find causal relationships.
3. Use greedy hillclimbing to search through the space of candidate models.
4. Use the bootstrap method to determine the stability of the suggested revisions.
5. Determine the type of regulation between variables (positive/negative regulation).

Hillclimbing Search
Greedy search algorithm:
1. Start with a proposed model M (a DAG).
2. Until no improvements in the score can be made:
   a. randomly pick two nodes x and y;
   b. add, delete, or reverse an edge between x and y, depending on which operation causes the best improvement in the score of the model.

Revising Model Structure
Determine the partial correlation ρxy.z for every combination and ordering of the variables x, y, z.
- In the model: determine whether ρxy.z = 0 (equivalently, I(x, z, y)) using the d-separation rules.
- In the data: test whether rxy.z = 0 using the first-order partial correlation coefficient.

Determining Partial Correlations
In the data, test the significance of rxy.z, the observed value of ρxy.z:
- null hypothesis H0: ρxy.z = 0
- alternative hypothesis Ha: ρxy.z ≠ 0
Perform a t-test with a 95% two-sided confidence interval:
t(N−3) = rxy.z·√(N−3) / √(1 − r²xy.z)
Accept H0 only if |t(N−3)| ≤ c.v. (the critical value).

Determining Partial Correlations (cont.)
If the null hypothesis is clearly rejected or accepted, there are four possible outcomes:
- the model entails a correlation and the data imply one (true positive);
- the model entails a correlation and the data imply none (false positive);
- the model entails no correlation and the data imply one (false negative);
- the model entails no correlation and the data imply none (true negative).

Scoring Function
Use greedy hillclimbing with a scoring function based on these four counts.

Parameter Signs and Correlations
- If I and J are directly connected, assume sign(ρij) = sign(rij).
- Otherwise, multiply the signs of the links along the path.
- If I and J are connected by multiple paths, pick the sign implied by the dominant path.

Bootstrapping
- Goal: learn stable changes from an initial model.
- Method: sample with replacement from a data set of size n to create k new data sets of size n (in this case, k = 20); record the suggested changes; accept only changes with repeatability greater than a threshold (in this case, 75%).

Results
(Result figures not reproduced.)

Issues
- Independence tests are performed conditioning on only one variable.
- The search can easily get stuck in local maxima.

PC Algorithm
1. Start with a complete undirected graph G.
2. For all adjacent nodes x and y, try to separate the nodes by checking for conditional independencies between x and y.
Check a conditional independence relation I(x, y | S) only if all variables in S are adjacent to x or y. If I(x, y | S) holds for some set S, remove the edge between x and y.
3. For each triple of nodes (x, y, z) such that x is adjacent to y and y is adjacent to z, but x is not adjacent to z, orient x – y – z as x → y ← z if and only if y was not in S.
4. Until no more edges can be directed:
   a. direct all arcs necessary to avoid new v-structures;
   b. direct all arcs necessary to avoid cycles, using rules R1–R4.

PC Algorithm: Problems
- order(V), that is, the order in which the independencies are checked, makes a difference in the orientation of the edges.
- The algorithm relies heavily on independence tests.
- Result: the PC algorithm is unstable.

"A Hybrid Anytime Algorithm for the Construction of Causal Models From Sparse Data", Denver Dash and Marek J. Druzdzel. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), San Francisco, CA, 1999.

PC Algorithm: Problems
(Example figure.)

EGS Algorithm
Repeat:
1. Draw an α from P(α).
2. Call PerformPC(α, order(V), D, I) to generate an essential graph G0.
3. Randomly convert G0 to a DAG S0.
4. Calculate P(D, S0) and record the structure with the maximum value.
5. Randomly generate a different configuration of order(V).

EGS Algorithm: scoring metric (BSM).

EGS2 Algorithm
Input: a set of variables V, a distribution P(α) over the significance level α, a test of independence I(x, y | S), an initial configuration for order(V), and an integer n.
Repeat:
1. Draw an α from P(α).
2. Call PerformPC(α, order(V), D, I) to generate an essential graph G0.
3. Randomly convert G0 to a DAG S0.
4. Calculate P(D, S0) and record the structure with the maximum value.
5. Randomly generate a different configuration of order(V).
Until: n structures are generated without updating the maximum value of P(D, S0).
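The restart loop shared by EGS and EGS2 can be sketched as follows; perform_pc, to_dag, and score are hypothetical callables standing in for the paper's PerformPC, the random DAG conversion, and the Bayesian scoring metric (a sketch under those assumptions, not the paper's implementation):

```python
import random

def egs2(data, variables, alpha_dist, perform_pc, to_dag, score, n):
    """EGS2-style anytime search: keep restarting PC with fresh
    significance levels and node orders; stop after n restarts in a row
    without improving the best score (EGS omits the stopping rule)."""
    best_dag, best_score = None, float("-inf")
    order = list(variables)
    stale = 0
    while stale < n:
        alpha = alpha_dist()                        # draw alpha from P(alpha)
        essential = perform_pc(alpha, order, data)  # essential graph G0
        dag = to_dag(essential)                     # random consistent DAG S0
        s = score(data, dag)                        # P(D, S0)
        if s > best_score:
            best_dag, best_score = dag, s
            stale = 0
        else:
            stale += 1
        random.shuffle(order)                       # new configuration of order(V)
    return best_dag, best_score

# Toy run with stub components (illustration only):
best, best_s = egs2(
    data=None,
    variables=["x1", "x2", "x3"],
    alpha_dist=lambda: 0.05,
    perform_pc=lambda alpha, order, data: tuple(order),
    to_dag=lambda g: g,
    score=lambda data, dag: 1.0,
    n=5,
)
```

With the constant score used in the toy run, the loop records the first structure and terminates after n stale restarts, which is exactly the EGS2 stopping condition.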
Results
(Result figures not reproduced.)

Discussion
Limitations of linear causal models:
- cannot model combinatorial effects;
- do not model time-dependent effects.

References
1. S. D. Bay, J. Shrager, A. Pohorille, and P. Langley. "Revising Regulatory Networks: From Expression Data to Linear Causal Models". http://newatlantis.isle.org/~sbay/papers/jbi-BSPL.pdf
2. D. Dash and M. J. Druzdzel. "A Hybrid Anytime Algorithm for the Construction of Causal Models From Sparse Data". In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), San Francisco, CA, 1999.
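Finally, the graph-pruning test for d-sep(X, Z, Y) described under "Inferring independence in graphs" can be sketched as follows (a sketch; the function name and the edge-list representation are choices made here, not from the slides):

```python
from collections import defaultdict

def d_separated(edges, xs, zs, ys):
    """Decide d-sep(X, Z, Y) in a DAG given as directed (parent, child)
    pairs: (1) repeatedly remove leaf nodes (no children) outside
    X ∪ Y ∪ Z, (2) delete edges leaving Z, (3) report True iff X and Y
    are disconnected, ignoring edge direction."""
    edges = set(edges)
    keep = set(xs) | set(ys) | set(zs)
    nodes = {n for e in edges for n in e}
    # (1) prune leaves not in X ∪ Y ∪ Z until a fixpoint is reached
    changed = True
    while changed:
        changed = False
        for n in list(nodes):
            if n in keep:
                continue
            if not any(a == n for a, _ in edges):  # no outgoing edge: leaf
                edges = {(a, b) for a, b in edges if b != n}
                nodes.discard(n)
                changed = True
    # (2) delete all edges leaving any node in Z
    edges = {(a, b) for a, b in edges if a not in zs}
    # (3) undirected reachability from X
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    frontier, seen = list(xs), set(xs)
    while frontier:
        cur = frontier.pop()
        for nb in adj[cur]:
            if nb not in seen:
                seen.add(nb)
                frontier.append(nb)
    return not (seen & set(ys))

# Toy graphs: chain x1 -> x2 -> x3 and collider x1 -> x3 <- x2
chain = [("x1", "x2"), ("x2", "x3")]
collider = [("x1", "x3"), ("x2", "x3")]
```

On the chain, conditioning on the intermediate node blocks the path; on the collider, conditioning on the head-to-head node unblocks it, matching the three blocking cases listed earlier.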