APPENDIX
Proof of Theorem 2.1
Proof: Since Vars(ϕ_{i1}) ∩ Vars(ϕ_{i2}) = ∅, ϕ_{i1} and ϕ_{i2} are independent. Since Φ_i = ϕ_{i1} ϕ_{i2}, answer tuple r is made up of elements that come from both Vars(ϕ_{i1}) and Vars(ϕ_{i2}). If we remove all variables in MCS_r(Φ_i) from the database, Φ_i becomes a counterfactual cause for answer tuple r. If we further remove Vars(ϕ_{i1}) from the database, Φ_i becomes unsatisfiable and r is accordingly removed from the answer. Thus MCS_r(ϕ_{i1}) = MCS_r(Φ_i). By the same argument, MCS_r(ϕ_{i2}) = MCS_r(Φ_i). This proves the theorem.
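To make the equality concrete, here is a brute-force sketch in Python under the standard contingency-set reading of MCS_r (the smallest set Γ, disjoint from the subformula's variables, whose removal keeps r in the answer while additionally removing the subformula's variables drops r). The names mcs_size and phi and the toy lineage are invented for illustration, not the paper's API.

```python
from itertools import combinations

# Brute-force MCS_r of a subformula: smallest contingency set G such that
# the lineage still holds after removing G, but fails once the
# subformula's variables are removed as well. Standard definition assumed.
def mcs_size(phi, all_vars, sub_vars):
    for k in range(len(all_vars) + 1):
        for G in combinations(sorted(all_vars - sub_vars), k):
            present = all_vars - set(G)
            if phi(present) and not phi(present - sub_vars):
                return k
    return None  # the subformula is not a cause of r

# Toy lineage Phi = phi_i1 * phi_i2 + e, with Vars(phi_i1) = {a, b}
# disjoint from Vars(phi_i2) = {c, d}, as a function of present tuples.
phi = lambda p: ((("a" in p) or ("b" in p)) and (("c" in p) or ("d" in p))) or ("e" in p)
V = {"a", "b", "c", "d", "e"}
# Theorem 2.1: the conjunct phi_i1 and the whole Phi_i share one MCS size.
assert mcs_size(phi, V, {"a", "b"}) == mcs_size(phi, V, {"a", "b", "c", "d"}) == 1
```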
Proof of Corollary 2.1
Proof: Since Φ_i = ϕ_{i1} ϕ_{i2} and Vars(ϕ_{i1}) ∩ Vars(ϕ_{i2}) = ∅, we can derive the following results:
1. MCS_{ϕ_{i1}}(x_j) is independent of ϕ_{i2}, and MCS_{ϕ_{i2}}(y_j) is independent of ϕ_{i1}. Thus, MCS_{ϕ_{i1}}(x_j) and MCS_{ϕ_{i2}}(y_j) can be computed independently.
2. Since MCS_Φ(ϕ_{i1}) = MCS_Φ(ϕ_{i2}) = MCS_Φ(Φ_i) by Theorem 2.1, MCS_{Φ_i}(ϕ_{i1}) = MCS_{Φ_i}(ϕ_{i2}) = 0.
To compute MCS_Φ(x_j), we first compute MCS_Φ(Φ_i), then MCS_{Φ_i}(ϕ_{i1}), and finally MCS_{ϕ_{i1}}(x_j). Thus, MCS_Φ(x_j) can be computed as follows:
MCS_Φ(x_j) = MCS_Φ(Φ_i) + MCS_{Φ_i}(ϕ_{i1}) + MCS_{ϕ_{i1}}(x_j)
= MCS_Φ(Φ_i) + MCS_{ϕ_{i1}}(x_j)
= MCS_Φ(Φ_i) + α.
By the same argument, MCS_Φ(y_j) = MCS_Φ(Φ_i) + β. This proves the corollary.
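As a concrete (invented) instance of the derivation: if MCS_Φ(Φ_i) = 2 (two variables outside Φ_i must be removed before Φ_i becomes counterfactual) and α = MCS_{ϕ_{i1}}(x_j) = 1, then MCS_Φ(x_j) = 2 + 0 + 1 = 3, the middle term vanishing because MCS_{Φ_i}(ϕ_{i1}) = 0.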
Proof of Proposition 3.1
Proof: From the lineage matrix, we observe that Φ_0 = LM[0][0] LM[0][1] ··· LM[0][n−1] denotes a set of clauses. Thus any clause c_i ∈ Φ_0 is a conjunction of one and only one variable from each first-row cell, one per column. By Theorem 2.1, we can derive MCS_r(Φ_0) = MCS_r(LM[0][0]) = ··· = MCS_r(LM[0][n−1]). Thus we need not remove any elements along the path from LM[0][0] to LM[0][j] in order to compute MCS(LM[0][j]). Hence SPM[0][j] = 0, so the distance of the shortest path from PM[0][0] to PM[0][j] (j > 0) is 0. This proves the proposition.
Proof of Proposition 3.2
Proof: From the lineage matrix, we know that Φ = ∑_{i=0}^{l} X_i f_{x_i}, where X_i denotes LM[i][0] and f_{x_0} ⊃ ··· ⊃ f_{x_l}. We can transform Φ as follows: Φ = ∑_{k=0}^{l} X_k f_{x_k} = ∑_{k=0}^{i−1} X_k f_{x_k} + ∑_{k=i}^{l} X_k f_{x_k}. Let Φ_1 = ∑_{k=i}^{l} X_k f_{x_k}. Since f_{x_0} ⊃ ··· ⊃ f_{x_l}, MCS(Φ_1) = ∑_{k=0}^{i−1} |X_k| = ∑_{k=0}^{i−1} PM[k][0]. Hence SPM[i][0] = ∑_{k=0}^{i−1} PM[k][0], so the distance of the shortest path from PM[0][0] to PM[i][0] (i > 0) is ∑_{k=0}^{i−1} PM[k][0]. This proves the proposition.
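A tiny numeric illustration of this base case (the PM column values are invented): the first-column entries of SPM are running sums of the node sizes PM[k][0] above them.

```python
from itertools import accumulate

# First-column base case of Proposition 3.2: SPM[i][0] is the sum of
# PM[0][0..i-1]. The PM values below are invented for illustration.
pm_col0 = [2, 1, 3]                  # PM[0][0], PM[1][0], PM[2][0]
spm_col0 = [0] + list(accumulate(pm_col0))[:-1]
assert spm_col0 == [0, 2, 3]
```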
Proof of Proposition 3.3
Proof: From the path matrix, we observe that if PM[i][j−1] is not a node, then PM[0][0] cannot reach PM[i][j] by way of PM[i][j−1], so the shortest path from PM[0][0] to PM[i][j] does not include cell PM[i][j−1]. Thus SPM[i][j] does not depend on PM[i][j−1]. For the same reason, SPM[i][j] does not depend on PM[i−1][j] if PM[i−1][j] is not a node.
When computing SPM[i][j], we have already computed SPM[i−1][j] and SPM[i][j−1], because we compute SPM[m][n] from the first row to the last row and, within every row, from the first column to the last column. Since the path matrix is directed, PM[0][0] can reach cell PM[i][j] by way of cells PM[i−1][j] and PM[i][j−1] if both are nodes. Thus SPM[i][j] is related to SPM[i−1][j] and SPM[i][j−1] if PM[i−1][j] and PM[i][j−1] are both nodes.
Let ϕ_j = LM[0][j] for 0 ≤ j ≤ n−1 and Φ_0 = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. From the second row of the lineage matrix LM[m][n] onward, if LM[i][j] is not equal to 0, then ϕ_j = LM[i][j]. After computing all ϕ_j's, Φ_i = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. By Theorem 2.1, we can derive MCS_r(Φ_i) = MCS_r(ϕ_0) = MCS_r(ϕ_1) = ··· = MCS_r(ϕ_{n−1}). Thus, if PM[i][j−1] is a node, SPM[i][j] = SPM[i][j−1] from direction PM[i][j−1] → PM[i][j].
However, if PM[i−1][j] is a node, Vars(LM[i−1][j]) and Vars(LM[i][j]) first appear in different clauses, so SPM[i][j] ≠ SPM[i−1][j]. Since PM[0][0] must pass through one more node, PM[i−1][j], to reach PM[i][j] from direction PM[i−1][j] → PM[i][j], we have SPM[i][j] = SPM[i−1][j] + PM[i−1][j].
Let x = SPM[i][j−1] and y = SPM[i−1][j] + PM[i−1][j]. Combining the two directions PM[i−1][j] → PM[i][j] and PM[i][j−1] → PM[i][j], we obtain SPM[i][j] = min(x, y) for i > 0. If PM[i][j−1] or PM[i−1][j] is not a node, the corresponding term is dropped from min(). This proves the proposition.
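Propositions 3.1–3.3 together determine SPM by a row-major dynamic program. The sketch below is one way to realize it, assuming PM[i][j] = 0 marks a cell that is not a node and PM[i][j] = |LM[i][j]| otherwise; fill_spm is an illustrative name, not the paper's algorithm verbatim.

```python
# Dynamic-programming fill of the shortest-path matrix SPM implied by
# Propositions 3.1-3.3. Assumes PM[i][j] == 0 means "not a node".
def fill_spm(PM):
    m, n = len(PM), len(PM[0])
    INF = float("inf")
    SPM = [[INF] * n for _ in range(m)]
    for j in range(n):                     # Proposition 3.1: first row is 0
        SPM[0][j] = 0
    for i in range(1, m):
        # Proposition 3.2: first column accumulates |X_k| = PM[k][0].
        SPM[i][0] = SPM[i - 1][0] + PM[i - 1][0]
        for j in range(1, n):
            # Proposition 3.3: cheaper of the two incoming directions,
            # dropping any predecessor that is not a node.
            best = INF
            if PM[i][j - 1] != 0:          # from PM[i][j-1], no extra cost
                best = min(best, SPM[i][j - 1])
            if PM[i - 1][j] != 0:          # from PM[i-1][j], pay its size
                best = min(best, SPM[i - 1][j] + PM[i - 1][j])
            SPM[i][j] = best
    return SPM
```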
Proof of Lemma 3.1
Proof: Let ϕ_j = LM[0][j] for 0 ≤ j ≤ n−1 and Φ_0 = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. From the second row of the lineage matrix LM[m][n] onward, if LM[i][j] is not equal to 0, then ϕ_j = LM[i][j]. After computing all ϕ_j's, Φ_i = ϕ_0 ∧ ϕ_1 ∧ ··· ∧ ϕ_{n−1}. Then:
1. If a directed edge from LM[i][j] passes through k rows, then LM[i][j] influences k sets of clauses Φ_i. If all variables in LM[i][j] are set to false, those k sets of clauses become false. The mapping path from LM[0][0] to LM[i][j] must include directed edges that together pass through the first i rows. Thus, if all variables in the cells in the first i rows along the mapping path from LM[0][0] to LM[i][j] are set to false, all clauses including any variable in col_j(Φ) − Vars(LM[i][j]) − col[j] become false.
2. If all variables in any col[k] (j ≤ k < n) are set to false, all variables in cells LM[l][k] (i < l < m) become false. Since a clause is a conjunction of variables from all columns, all clauses including any variable in col[j] become false.
This proves the lemma.
Proof of Theorem 3.1
Proof: By Item 1 of Lemma 3.1, all clauses including any variable in col_j(Φ) − Vars(LM[i][j]) − col[j] become false if all variables in the cells in the first i rows along the mapping path from LM[0][0] to LM[i][j] are set to false. Let col[k] have the minimal number of elements among the sets col[j], …, col[n−1]. By Item 2 of Lemma 3.1, all clauses including any variable in col[j] become false if all variables in col[k] are set to false.
After these two sets of variables are set to false, only clauses including variables in LM[i][j] remain, so those clauses become actual causes with respect to Φ. Let minValue = |col[k]| = min(|col[j]|, …, |col[n−1]|). Then MCS_Φ(LM[i][j]) = SPM[i][j] + minValue. Since PM[i][j] = |LM[i][j]|, resp(x_k) = 1 / (SPM[i][j] + PM[i][j] + minValue) for any x_k ∈ LM[i][j]. This proves the theorem.
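Continuing the fill_spm sketch above, Theorem 3.1's formula reads directly as code; col_size[k] = |col[k]| is an assumed precomputed list, and the names are again illustrative.

```python
# Responsibility of any variable x_k in cell LM[i][j], per Theorem 3.1:
# resp(x_k) = 1 / (SPM[i][j] + PM[i][j] + minValue), where minValue is
# the smallest column size among col[j..n-1]. col_size is assumed given.
def responsibility(SPM, PM, col_size, i, j):
    min_value = min(col_size[j:])
    return 1.0 / (SPM[i][j] + PM[i][j] + min_value)
```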
Proof of Lemma 3.2
Proof: Let Φ_i = X_i f_{x_i} = X_i f_{y_i} f_{z_i}. Since X_i is conjoined with f_{y_i} and f_{z_i} to form Φ_i, we cannot compute the responsibility of any variable in X_i by performing responsibility analysis for Φ_{i1} = X_i alone. For the formula Φ′ = f_{y_i} f_{z_i}, we can independently compute the responsibility of any variable in f_{y_i} and f_{z_i} by Corollary 2.1, because Vars(f_{y_i}) ∩ Vars(f_{z_i}) = ∅. Since X_i is conjoined with f_{y_i} and f_{z_i} to form Φ_i, we can compute the responsibility of any variable in f_{y_i} by performing responsibility analysis for Φ_{i2} = X_i f_{y_i} alone. For the same reason, we can compute the responsibility of any variable in f_{z_i} by performing responsibility analysis for Φ_{i3} = X_i f_{z_i} alone.
Let Φ_j = X_j f_{x_j} = X_j f_{y_j} f_{z_j} (j ≠ i). Computing the responsibility of any variable in f_{y_j} is independent of f_{z_j} when performing responsibility analysis for Φ_{j2} = X_j f_{y_j} alone. Since Vars(f_{y_j}) ∩ Vars(f_{z_i}) = ∅, computing the responsibility of any variable in f_{y_j} is also independent of f_{z_i}. So, computing the responsibility of any variable in ∑_{i=1}^{l} f_{y_i} is independent of ∑_{j=1}^{l} f_{z_j} when performing responsibility analysis for Φ_1 alone. Thus, resp_Φ(y_i) = resp_{Φ_1}(y_i) for any y_i ∈ Φ_1. Similarly, computing the responsibility of any variable in ∑_{j=1}^{l} f_{z_j} is independent of ∑_{i=1}^{l} f_{y_i} when performing responsibility analysis for Φ_2 alone. Thus, resp_Φ(z_i) = resp_{Φ_2}(z_i) for any z_i ∈ Φ_2.
Finally, for any x_i ∈ Φ, we get MCS_{Φ_1}(x_i) and MCS_{Φ_2}(x_i) by performing responsibility analysis for Φ_1 and Φ_2, respectively. Thus, MCS_Φ(x_i) = min(MCS_{Φ_1}(x_i), MCS_{Φ_2}(x_i)) and resp_Φ(x_i) = max(resp_{Φ_1}(x_i), resp_{Φ_2}(x_i)). This proves the lemma.
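The final combination step admits a one-line reading: a variable shared by both sub-lineages keeps the smaller MCS and therefore the larger responsibility. A sketch, with mcs1 and mcs2 as hypothetical dicts from variable to MCS size within Φ_1 and Φ_2, assuming the standard reading resp(x) = 1/(1 + MCS(x)):

```python
# Lemma 3.2's combination: minimum MCS across the two analyses, maximum
# responsibility. mcs1 and mcs2 are hypothetical per-sub-lineage results.
def combine_shared(mcs1, mcs2, x):
    mcs = min(mcs1[x], mcs2[x])
    resp = max(1.0 / (1 + mcs1[x]), 1.0 / (1 + mcs2[x]))
    return mcs, resp
```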
Proof of Lemma 3.3
Proof: For any two path lineages Φ_i and Φ_j (1 ≤ i, j ≤ k), Vars(Φ_i) ∩ Vars(Φ_j) = {X_1, …, X_l}. By Lemma 3.2, we can compute the responsibilities of all variables in Φ_i ∪ Φ_j by independently performing responsibility analysis for Φ_i and Φ_j, with resp_{Φ_i ∪ Φ_j}(x_l) = max(resp_{Φ_i}(x_l), resp_{Φ_j}(x_l)) for any x_l ∈ Φ. Since Φ is decomposed into Φ_1, …, Φ_k, resp_Φ(x_i) = max(resp_{Φ_1}(x_i), …, resp_{Φ_k}(x_i)) for any x_i ∈ Φ. This proves the lemma.
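Lemma 3.3 generalizes the previous combination to k pieces: run responsibility analysis per path lineage and take the per-variable maximum. A sketch, with resp_by_piece as a hypothetical list of dicts mapping each variable to its responsibility within one Φ_i:

```python
# Per-variable maximum over the decomposed path lineages (Lemma 3.3).
def combine(resp_by_piece):
    resp = {}
    for piece in resp_by_piece:
        for x, r in piece.items():
            resp[x] = max(resp.get(x, 0.0), r)
    return resp
```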
Proof of Theorem 3.3
Proof: Assume that the first split node is Y. Then Φ is decomposed into a set of path lineages Φ_1, …, Φ_{k−1} and a smaller composite lineage Φ_k. By Lemma 3.3, we can use Algorithm 1 to perform responsibility analysis for Φ_1, …, Φ_{k−1}, with resp_Φ(x_i) = max(resp_{Φ_1}(x_i), …, resp_{Φ_{k−1}}(x_i), resp_{Φ_k}(x_i)) for any x_i ∈ Φ.
We then split the inequality graph of Φ_k recursively until it is reduced to a set of inequality paths. At each step, we employ Algorithm 1 to perform responsibility analysis for the corresponding path lineages. Thus Φ can be fully decomposed into path lineages for responsibility analysis. This proves the theorem.