IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

APPENDIX

Proof of Theorem 2.1

Proof: Since Vars(ϕ_{i1}) ∩ Vars(ϕ_{i2}) = ∅, ϕ_{i1} and ϕ_{i2} are pairwise independent. Since Φ_i = ϕ_{i1}ϕ_{i2}, answer tuple r is made up of elements that come from both Vars(ϕ_{i1}) and Vars(ϕ_{i2}). If we remove all variables in MCS_r(Φ_i) from the database, then Φ_i becomes a counterfactual cause for answer tuple r. If we further remove Vars(ϕ_{i1}) from the database, Φ_i becomes unsatisfiable and r is accordingly removed from the answer. Thus MCS_r(ϕ_{i1}) = MCS_r(Φ_i). By the same argument, we can derive MCS_r(ϕ_{i2}) = MCS_r(Φ_i). This proves the theorem.

Proof of Corollary 2.1

Proof: Since Φ_i = ϕ_{i1}ϕ_{i2} and Vars(ϕ_{i1}) ∩ Vars(ϕ_{i2}) = ∅, we can derive the following results:

1. MCS_{ϕ_{i1}}(x_j) is independent of ϕ_{i2}, and MCS_{ϕ_{i2}}(y_j) is independent of ϕ_{i1}. Thus we can compute MCS_{ϕ_{i1}}(x_j) and MCS_{ϕ_{i2}}(y_j) independently.

2. Since MCS_Φ(ϕ_{i1}) = MCS_Φ(ϕ_{i2}) = MCS_Φ(Φ_i) by Theorem 2.1, MCS_{Φ_i}(ϕ_{i1}) = MCS_{Φ_i}(ϕ_{i2}) = 0.

To compute MCS_Φ(x_j), we first compute MCS_Φ(Φ_i), then MCS_{Φ_i}(ϕ_{i1}), and finally MCS_{ϕ_{i1}}(x_j). Thus we can compute MCS_Φ(x_j) as follows:

MCS_Φ(x_j) = MCS_Φ(Φ_i) + MCS_{Φ_i}(ϕ_{i1}) + MCS_{ϕ_{i1}}(x_j) = MCS_Φ(Φ_i) + MCS_{ϕ_{i1}}(x_j) = MCS_Φ(Φ_i) + α.

By the same method, we can derive MCS_Φ(y_i) = MCS_Φ(Φ_i) + β. This proves the corollary.

Proof of Proposition 3.1

Proof: From the lineage matrix, we observe that Φ_0 = LM[0][0] LM[0][1] ··· LM[0][n−1] denotes a set of clauses. Thus any clause c_i ∈ Φ is a conjunction of exactly one variable from every column of the first row. By Theorem 2.1, we can derive MCS_r(Φ_0) = MCS_r(LM[0][0]) = ··· = MCS_r(LM[0][n−1]). Thus no elements along the path from LM[0][0] to LM[0][j] need to be removed in order to compute MCS(LM[0][j]); hence SPM[0][j] = 0. So the distance of the shortest path from PM[0][0] to PM[0][j] (j > 0) is 0. This proves the proposition.
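The equalities above can be sanity-checked by brute force on a tiny monotone DNF lineage. The sketch below is illustrative only and makes simplifying assumptions: a lineage is a disjunction of clauses (sets of Boolean variables), and a variable is a counterfactual cause once flipping it flips the lineage. The names `eval_dnf` and `mcs_size`, and the two-clause example, are hypothetical; this is not the paper's matrix-based algorithm.

```python
from itertools import combinations

def eval_dnf(clauses, assignment):
    """Evaluate a monotone DNF lineage: a disjunction of clauses,
    each clause a conjunction (set) of Boolean variables."""
    return any(all(assignment[v] for v in clause) for clause in clauses)

def mcs_size(clauses, x):
    """Size of a minimal contingency set for variable x: the smallest
    set of other variables to set false so that x becomes a
    counterfactual cause (flipping x flips the lineage)."""
    others = sorted({v for c in clauses for v in c} - {x})
    for k in range(len(others) + 1):
        for gamma in combinations(others, k):
            assignment = {v: v not in gamma for v in others}
            assignment[x] = True
            holds_with_x = eval_dnf(clauses, assignment)
            assignment[x] = False
            holds_without_x = eval_dnf(clauses, assignment)
            if holds_with_x and not holds_without_x:
                return k
    return None  # x is never counterfactual

# One clause {a1, b1} split into disjoint parts {a1} and {b1},
# plus a competing clause {a2, b2}.
lineage = [{"a1", "b1"}, {"a2", "b2"}]
print(mcs_size(lineage, "a1"), mcs_size(lineage, "b1"))  # → 1 1
```

In this toy case the variables of the two disjoint parts of the first clause indeed get equal contingency-set sizes, matching the spirit of Theorem 2.1.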
Proof of Proposition 3.2

Proof: From the lineage matrix, we know that Φ = Σ_{i=0}^{l} X_i f_{x_i}, where X_i denotes LM[i][0] and f_{x_0} ⊃ ··· ⊃ f_{x_l}. We can transform Φ as follows: Φ = Σ_{i=0}^{l} X_i f_{x_i} = Σ_{k=0}^{i−1} X_k f_{x_k} + Σ_{k=i}^{l} X_k f_{x_k}. Let Φ_1 = Σ_{k=i}^{l} X_k f_{x_k}. Since f_{x_0} ⊃ ··· ⊃ f_{x_l}, MCS(Φ_1) = Σ_{k=0}^{i−1} |X_k| = Σ_{k=0}^{i−1} PM[k][0]. Hence SPM[i][0] = Σ_{k=0}^{i−1} PM[k][0]. So the distance of the shortest path from PM[0][0] to PM[i][0] (i > 0) is Σ_{k=0}^{i−1} PM[k][0]. This proves the proposition.

Proof of Proposition 3.3

Proof: From the path matrix, we observe that if PM[i][j−1] is not a node, then PM[0][0] cannot reach PM[i][j] by way of PM[i][j−1]. So the shortest path from PM[0][0] to PM[i][j] does not include cell PM[i][j−1]; thus SPM[i][j] is independent of PM[i][j−1]. For the same reason, SPM[i][j] is independent of PM[i−1][j] if PM[i−1][j] is not a node. When computing SPM[i][j], we have already computed SPM[i−1][j] and SPM[i][j−1], because we compute SPM[m][n] from the first row to the last and, within every row, from the first column to the last. Since the path matrix is directed, PM[0][0] can reach cell PM[i][j] by way of cells PM[i−1][j] and PM[i][j−1] if PM[i−1][j] and PM[i][j−1] are both nodes. Thus SPM[i][j] depends on SPM[i−1][j] and SPM[i][j−1] if PM[i−1][j] and PM[i][j−1] are both nodes. Let ϕ_j = LM[0][j] for 0 ≤ j ≤ n−1 and Φ_0 = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. From the second row of the lineage matrix LM[m][n], if LM[i][j] is not equal to 0, then ϕ_j = LM[i][j]. After computing all ϕ_j's, Φ_i = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. By Theorem 2.1, we can derive MCS_r(Φ_i) = MCS_r(ϕ_0) = MCS_r(ϕ_1) = ··· = MCS_r(ϕ_{n−1}). Thus, if PM[i][j−1] is a node, SPM[i][j] = SPM[i][j−1] from direction PM[i][j−1] → PM[i][j]. However, if PM[i−1][j] is a node, Vars(LM[i−1][j]) and Vars(LM[i][j]) first appear in different clauses; thus SPM[i][j] ≠ SPM[i−1][j].
Since PM[0][0] must pass through one more node, PM[i−1][j], to reach PM[i][j] from direction PM[i−1][j] → PM[i][j], SPM[i][j] = SPM[i−1][j] + PM[i−1][j]. Let x = SPM[i][j−1] and y = SPM[i−1][j] + PM[i−1][j]. Combining the two directions PM[i−1][j] → PM[i][j] and PM[i][j−1] → PM[i][j], we obtain SPM[i][j] = min(x, y) for i > 0. If either x or y equals 0, its corresponding cell is not a node, so we delete it from the function min(). This proves the proposition.

Proof of Lemma 3.1

Proof: Let ϕ_j = LM[0][j] for 0 ≤ j ≤ n−1 and Φ_0 = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. From the second row of the lineage matrix LM[m][n], if LM[i][j] is not equal to 0, then ϕ_j = LM[i][j]. After computing all ϕ_j's, Φ_i = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. Then:

1. If a directed edge from LM[i][j] passes through k rows, then LM[i][j] influences k sets of clauses Φ_i's. If all variables in LM[i][j] are set to false, then those k sets of clauses become false. The mapping path from LM[0][0] to LM[i][j] must include directed edges that together pass through the first i rows. Thus, if all variables in the cells in the first i rows along the mapping path from LM[0][0] to LM[i][j] are set to false, all clauses including any variable in col_j(Φ) − Vars(LM[i][j]) − col[j] become false.

2. If all variables in any col[k] (j ≤ k < n) are set to false, all variables in the cells LM[l][k] (i < l < n) become false. Since a clause is a conjunction of variables from all columns, all clauses including any variable in col[j] become false.

This proves the lemma.

Proof of Theorem 3.1

Proof: By Lemma 3.1, Item 1, all clauses including any variable in col_j(Φ) − Vars(LM[i][j]) − col[j] become false if all variables in the cells in the first i rows along the mapping path from LM[0][0] to LM[i][j] are set to false. Let col[k] have the minimal number of elements among the sets col[j], . . . , col[n−1].
By Lemma 3.1, Item 2, all clauses including any variable in col[j] become false if all variables in col[k] are set to false. After these two sets of variables are set to false, only clauses including variables in LM[i][j] remain; thus those clauses become an actual cause with respect to Φ. Let minValue = |col[k]| = min(|col[j]|, . . . , |col[n−1]|). So MCS_Φ(LM[i][j]) = SPM[i][j] + minValue. Since PM[i][j] = |LM[i][j]|, resp(x_k) = 1/(SPM[i][j] + PM[i][j] + minValue) for any x_k ∈ LM[i][j]. This proves the theorem.

Proof of Lemma 3.2

Proof: Let Φ_i = X_i f_{x_i} = X_i f_{y_i} f_{z_i}. Since X_i is conjoined with f_{y_i} and f_{z_i} to obtain Φ_i, we cannot compute the responsibility of any variable in X_i by performing responsibility analysis for Φ_{i1} = X_i alone. For the formula Φ′ = f_{y_i} f_{z_i}, we can independently compute the responsibility of any variable in f_{y_i} and f_{z_i} by Corollary 2.1, because Vars(f_{y_i}) ∩ Vars(f_{z_i}) = ∅. Since X_i is conjoined with f_{y_i} and f_{z_i} to obtain Φ_i, we can compute the responsibility of any variable in f_{y_i} by performing responsibility analysis for Φ_{i2} = X_i f_{y_i} alone. For the same reason, we can compute the responsibility of any variable in f_{z_i} by performing responsibility analysis for Φ_{i3} = X_i f_{z_i} alone. Let Φ_j = X_j f_{x_j} = X_j f_{y_j} f_{z_j} (j ≠ i). Computing the responsibility of any variable in f_{y_j} is independent of f_{z_j} when performing responsibility analysis for Φ_{j2} = X_j f_{y_j} alone. Since Vars(f_{y_j}) ∩ Vars(f_{z_i}) = ∅, computing the responsibility of any variable in f_{y_j} is also independent of f_{z_i}. So computing the responsibility of any variable in Σ_{i=1}^{l} f_{y_i} is independent of Σ_{j=1}^{l} f_{z_j} when performing responsibility analysis for Φ_1 alone. Thus resp_Φ(y_i) = resp_{Φ_1}(y_i) for any y_i ∈ Φ_1. By the same argument, computing the responsibility of any variable in Σ_{j=1}^{l} f_{z_j} is independent of Σ_{i=1}^{l} f_{y_i} when performing responsibility analysis for Φ_2 alone. Thus resp_Φ(z_i) = resp_{Φ_2}(z_i) for any z_i ∈ Φ_2.
Finally, for any x_i ∈ Φ, we get MCS_{Φ_1}(x_i) and MCS_{Φ_2}(x_i) by performing responsibility analysis for Φ_1 and Φ_2, respectively. Thus MCS_Φ(x_i) = min(MCS_{Φ_1}(x_i), MCS_{Φ_2}(x_i)) and resp_Φ(x_i) = max(resp_{Φ_1}(x_i), resp_{Φ_2}(x_i)). This proves the lemma.

Proof of Lemma 3.3

Proof: For any two path lineages Φ_i and Φ_j (1 ≤ i, j ≤ k), Vars(Φ_i) ∩ Vars(Φ_j) = {X_1, . . . , X_l}. By Lemma 3.2, we can compute the responsibilities of all variables in Φ_i ∪ Φ_j by independently performing responsibility analysis for Φ_i and Φ_j, and resp_{Φ_i ∪ Φ_j}(x_l) = max(resp_{Φ_i}(x_l), resp_{Φ_j}(x_l)) for any x_l ∈ Φ. Since Φ is decomposed into Φ_1, . . . , Φ_k, resp_Φ(x_i) = max(resp_{Φ_1}(x_i), . . . , resp_{Φ_k}(x_i)) for any x_i ∈ Φ. This proves the lemma.

Proof of Theorem 3.3

Proof: Assume that the first split node is Y. Then Φ is decomposed into a set of path lineages Φ_1, . . . , Φ_{k−1} and a smaller composite lineage Φ_k. By Lemma 3.3, we can use Algorithm 1 to perform responsibility analysis for Φ_1, . . . , Φ_{k−1}, and resp_Φ(x_i) = max(resp_{Φ_1}(x_i), . . . , resp_{Φ_{k−1}}(x_i), resp_{Φ_k}(x_i)) for any x_i ∈ Φ. Then we split the inequality graph of Φ_k recursively until it is split into a set of inequality paths. In each step, we employ Algorithm 1 to perform responsibility analysis for the corresponding path lineages. Thus we can decompose Φ into path lineages for responsibility analysis. This proves the theorem.
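The combination step used in Lemma 3.3 and Theorem 3.3 can be sketched as a small helper: each decomposed path lineage is analyzed independently, and the responsibility of a shared variable with respect to the whole lineage is the maximum over the per-path results (equivalently, the minimum MCS size). The function name and the per-path responsibility values below are illustrative assumptions, not output of the paper's Algorithm 1.

```python
def combine_responsibilities(per_lineage_resp):
    """Take a list of {variable: responsibility} dicts, one per path
    lineage, and return responsibilities w.r.t. the whole lineage by
    taking the maximum per variable (Lemma 3.3's combination rule)."""
    combined = {}
    for resp in per_lineage_resp:
        for var, r in resp.items():
            combined[var] = max(combined.get(var, 0.0), r)
    return combined

# Hypothetical per-path results for two path lineages Phi_1 and Phi_2
# that share the split-node variable x1.
phi1 = {"x1": 1 / 3, "y1": 1 / 2}
phi2 = {"x1": 1 / 2, "z1": 1 / 4}
print(combine_responsibilities([phi1, phi2]))
# → {'x1': 0.5, 'y1': 0.5, 'z1': 0.25}
```

Taking the maximum responsibility per variable is the same as taking the minimum contingency-set size, since resp = 1/(MCS + 1) is monotonically decreasing in the MCS size.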