Finite Model Theory
Lecture 18
Extended 0/1 Laws
Or “Getting Real”
1
Outline
• A better probabilistic model
• Probabilities of conjunctive queries
• Probabilities for FO
• Based on work done with N. Dalvi and
G. Miklau, and on papers by Lynch, Shelah
and Spencer
2
Anomalies of 0/1 Laws
Database schema:
Employee(name, city, occupation)
We are not given the instance.
• Any person belongs to Employee with $\mu = 1/2$!
• The expected size $E[\mathrm{Employee}] = n^3/2 \to \infty$!!
• In practice we need conditional probabilities
$\mu(\varphi \mid \psi)$, but they often do not exist [why?]
3
A Better Model
• Postulate that for each $R \in \sigma$:
$E[R] = c_R$ (a constant)
• This leads to: for each tuple $t$:
$\Pr[t \in R] = c_R / n^a$, where $a = \mathrm{arity}(R)$
4
A Better Model
No more anomalies:
• For a given person, the probability of it
belonging to Employee tends to 0
• The expected size is $E[R] = c_R$
• Asymptotic conditional probabilities always
exist for conjunctive queries
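The contrast between the two models can be checked with exact arithmetic. The sketch below is only illustrative: the function names and the choice $c_R = 5$ are assumptions, not part of the lecture.

```python
from fractions import Fraction

def uniform_model(n, arity):
    """Old model: every tuple is present with probability 1/2."""
    p = Fraction(1, 2)
    return p, p * n**arity  # (tuple probability, expected relation size)

def constant_size_model(n, arity, c):
    """Better model: every tuple is present with probability c / n^arity."""
    p = Fraction(c) / n**arity
    return p, p * n**arity  # expected size is always c

# Employee has arity 3; c = 5 is a hypothetical constant chosen for illustration.
for n in (10, 100, 1000):
    _, size_uniform = uniform_model(n, 3)
    p_new, size_new = constant_size_model(n, 3, 5)
    print(n, size_uniform, p_new, size_new)
```

Under the old model the expected size grows as $n^3/2$; under the better model the tuple probability shrinks to 0 while the expected size stays at the constant.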
5
Conjunctive Queries
• Have the form:
$\exists x_1 \ldots \exists x_k.\,(C_1 \wedge \ldots \wedge C_m)$
• Where each $C_i$ is $R(\ldots)$ or $x_i = x_j$ or $x_i \neq x_j$
Example: Employee(x,Seattle,-), Employee(x,y,Clerk), Employee(-,y,Lawyer)
6
Conjunctive Queries
Theorem
For every Q there are numbers E, C s.t.:
$\Pr[Q] = C / n^E + O(1/n^{E+1})$
Corollary: $\Pr[Q_1 \mid Q_2]$ always has a limit
• Will show next how to compute C, E
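One way to see why the corollary follows from the theorem is to divide two expansions. This is a sketch under stated assumptions: $Q_1 \wedge Q_2$ is itself treated as a conjunctive query with constants $(C_{12}, E_{12})$, $Q_2$ has constants $(C_2, E_2)$, and both coefficients are positive. The subscripted names are introduced here for illustration only.

```latex
\Pr[Q_1 \mid Q_2]
  = \frac{\Pr[Q_1 \wedge Q_2]}{\Pr[Q_2]}
  = \frac{C_{12}/n^{E_{12}} + O(1/n^{E_{12}+1})}{C_2/n^{E_2} + O(1/n^{E_2+1})}
  = \frac{C_{12}}{C_2}\, n^{E_2 - E_{12}} \bigl(1 + O(1/n)\bigr)
```

Since $\Pr[Q_1 \wedge Q_2] \le \Pr[Q_2]$, we must have $E_{12} \ge E_2$, so the limit is $C_{12}/C_2$ when the exponents agree and 0 otherwise.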
7
Subgraph Properties
• Consider R(x,y);
• For every edge, $\Pr(R(u,v)) = c/n^2$
• Given Q, let H be the query obtained from Q by adding all
predicates of the form $x_i \neq x_j$
• H checks for the presence of a subgraph
8
Subgraph Properties
Example 1:
• Q = R(x,y),R(y,z),R(z,x)
H=Q = R(x,y),R(y,z),R(z,x),x y,y z,z x
H=
9
Subgraph Properties
$\Pr(H) = \Pr(\bigvee_{u,v,w} H(u,v,w))$
$\le \sum_{u,v,w} \Pr(H(u,v,w))$
$= n(n-1)(n-2) \cdot \frac{1}{3} \cdot c^3 / n^6$
$= \frac{1}{3} c^3 / n^3 + O(1/n^4)$
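The last step of this bound can be verified numerically. The sketch below evaluates the union bound exactly and compares it with the leading term; the function names and the value c = 2 are assumptions made for illustration.

```python
from fractions import Fraction

def triangle_union_bound(n, c):
    # n(n-1)(n-2) ordered triples of distinct nodes; the 3 rotations of a
    # triple describe the same directed triangle; each of the 3 edges is
    # present independently with probability c / n^2.
    return Fraction(n * (n - 1) * (n - 2), 3) * (Fraction(c) / n**2) ** 3

def leading_term(n, c):
    # (1/3) c^3 / n^3
    return Fraction(c) ** 3 / (3 * n**3)

for n in (10, 100, 1000):
    exact, lead = triangle_union_bound(n, 2), leading_term(n, 2)
    # The gap, rescaled by n^4, stays bounded, so the gap is O(1/n^4).
    print(n, float(exact), float(lead), float((lead - exact) * n**4))
```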
10
Subgraph Properties
Example 2:
Q = R(x,y),R(y,a),R(b,x)
H=Q =R(x,y),R(y,z),R(z,x),x y,ya,ax,xb,
bx
b
a
11
Subgraph Properties
$\Pr(H) = \Pr(\bigvee_{u,v} H(u,v))$
$\le \sum_{u,v} \Pr(H(u,v))$
$= n(n-1) \cdot \frac{1}{1} \cdot c^3 / n^6$
$= c^3 / n^4 + O(1/n^5)$
12
Subgraph Properties
Let $Q = G_1, G_2, \ldots, G_m$
V = number of variables in Q
A = arity(Q) = arity($G_1$) + … + arity($G_m$)
E = A - V = “the exponent of Q”
H = number of automorphisms $Q \to Q$
C = $c_1 \cdot c_2 \cdots c_m$ = “the coefficient of Q”
Lemma: $\Pr(Q) \le \frac{C}{H} \cdot \frac{1}{n^E}$
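The quantities in the lemma are easy to compute by brute force for small queries. The following is a minimal sketch, assuming a query is given as a list of (relation, arguments) atoms with an explicit set of variable names (every other argument is a constant), and that all inequalities between variables are implied. The representation and function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def exponent(atoms, variables):
    """E = A - V: total arity minus the number of distinct variables."""
    total_arity = sum(len(args) for _, args in atoms)
    used = {a for _, args in atoms for a in args if a in variables}
    return total_arity - len(used)

def automorphisms(atoms, variables):
    """Count variable permutations that map the set of atoms onto itself."""
    vs = sorted(variables)
    atom_set = set(atoms)
    count = 0
    for image in permutations(vs):
        rename = dict(zip(vs, image))
        mapped = {(r, tuple(rename.get(a, a) for a in args)) for r, args in atoms}
        count += mapped == atom_set
    return count

def coefficient(atoms, c):
    """C = product of the constants c_R, one factor per atom."""
    result = Fraction(1)
    for r, _ in atoms:
        result *= c[r]
    return result

def lemma_bound(atoms, variables, c, n):
    """Upper bound (C / H) * 1 / n^E from the lemma."""
    return coefficient(atoms, c) / automorphisms(atoms, variables) / n ** exponent(atoms, variables)

triangle = [("R", ("x", "y")), ("R", ("y", "z")), ("R", ("z", "x"))]
path = [("R", ("x", "y")), ("R", ("y", "a")), ("R", ("b", "x"))]  # a, b are constants
print(exponent(triangle, {"x", "y", "z"}), automorphisms(triangle, {"x", "y", "z"}))
print(exponent(path, {"x", "y"}), automorphisms(path, {"x", "y"}))
```

For the triangle this gives E = 3 and H = 3, and for the path with constants E = 4 and H = 1, matching the two worked examples.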
13
Subgraph Properties
Lower bound, for the triangle:
$\Pr(H) = \Pr(\bigvee_{u,v,w} H(u,v,w))$
$\ge \sum \Pr(H(u,v,w)) - \sum \Pr(H(u,v,w) \wedge H(u',v',w'))$
$= \frac{1}{3} c^3/n^3 + O(1/n^4) - \sum \Pr(HH)$
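The inequality used in this step is the second-order Bonferroni bound. Stated for generic events $A_i$ (a name introduced here; in the slide each $A_i$ is one event $H(u,v,w)$ and the second sum ranges over pairs of distinct triangles), it reads:

```latex
\Pr\Bigl(\bigvee_i A_i\Bigr) \;\ge\; \sum_i \Pr(A_i) \;-\; \sum_{i<j} \Pr(A_i \wedge A_j)
```

Together with the union bound it sandwiches $\Pr(H)$, so the leading term is exact whenever the pair sum is of lower order.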
14
Subgraph Properties
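• What is $\sum \Pr(HH)$? Each term belongs to one
of the following cases:
E = 12 - 6 = 6 (two disjoint triangles)
E = 12 - 5 = 7 (two triangles sharing one node)
E = 10 - 4 = 6 (two triangles sharing one edge)
A few others…
But all have E > 3! Hence $\sum \Pr(HH)$ is negligible
15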
Subgraph Properties
• Hence, for the triangle:
$\Pr(H) \approx \frac{1}{3} c^3/n^3$
• This generalizes easily to any subgraph
property
16
Subgraphs with E = 0
H = R(x,y)
E = 2 - 2 = 0; what is Pr(H)?
H = R(x,y)R(u,v)
E = 4 - 4 = 0; what is Pr(H)?
H = R(x,y)R(y,z)R(z,x), R(u,v)
E(H) = E(triangle)
The exponent in the theorem is always correct, but we need
to adjust the coefficient
17
Conjunctive Queries
• Consider the query:
R(x,y),R(y,z),R(z,x)
• Any of the variables x,y,z may be equal:
results in the following subgraphs:
H1 = R(x,y)R(y,z)R(z,x)  E = 6 - 3 = 3
H2 = R(x,x)R(x,z)R(z,x)  E = 6 - 2 = 4
H3 = R(x,x)R(x,x)R(x,x) = R(x,x)  E = 2 - 1 = 1
• Hence $\Pr(Q) \approx \Pr(H_3) \approx c_R / n$
18
Conjunctive Queries
• Now consider
Q = R(a,x),R(y,b)
• Two graphs:
H1 = R(a,x)R(y,b)  E = 4 - 2 = 2
H2 = R(a,b)  E = 2 - 0 = 2
• One can prove:
$\Pr(Q) \approx \Pr(H_1) + \Pr(H_2) \approx (c + c^2)/n^2$
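The case analysis above can be mechanized: substitute each variable by a variable or a constant, compute the exponent of every resulting query, and keep those with the smallest exponent, since they dominate the probability. The sketch below is a brute-force illustration under that reading; the atom representation and function names are assumptions, and it does not merge queries that differ only by renaming variables.

```python
from itertools import product

def exponent(atoms, variables):
    """E = A - V for a collection of (relation, arguments) atoms."""
    total_arity = sum(len(args) for _, args in atoms)
    used = {a for _, args in atoms for a in args if a in variables}
    return total_arity - len(used)

def unifications(atoms, variables, constants):
    """Yield every query obtained by mapping each variable to a variable or constant."""
    vs = sorted(variables)
    targets = vs + sorted(constants)
    for image in product(targets, repeat=len(vs)):
        sub = dict(zip(vs, image))
        # A frozenset drops atoms that become identical after substitution.
        yield frozenset((r, tuple(sub.get(a, a) for a in args)) for r, args in atoms)

def minimal_exponent_queries(atoms, variables, constants):
    candidates = set(unifications(atoms, variables, constants))
    best = min(exponent(h, variables) for h in candidates)
    return best, {h for h in candidates if exponent(h, variables) == best}

Q = [("R", ("a", "x")), ("R", ("y", "b"))]
best, graphs = minimal_exponent_queries(Q, {"x", "y"}, {"a", "b"})
print(best)
for h in graphs:
    print(sorted(h))
```

For Q = R(a,x),R(y,b) the minimum exponent is 2, reached by the original query (and its renaming) and by the single atom R(a,b), which are the two graphs listed on the slide.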
19
More General Distributions
[Shelah&Spencer, Lynch]
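• $\Pr(\text{tuple}) = \beta / n^\alpha$
• Example: H = triangle
• $\Pr(H) \approx n^3 \cdot \frac{1}{3} \cdot \beta^3 / n^{3\alpha} = C / n^E$
• Simply redefine E(H) to use $\alpha$ (for the triangle, $E = 3\alpha - 3$)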
20
More General Distributions
• But there is a problem here; let $\alpha = 3/2$:
E(triangle) = $3\alpha - 3 = 3/2$
E(triangle plus one edge on two new nodes) = $3\alpha - 3 + \alpha - 2 = 1$
Hence the more complex graph is more likely!
Solution: adjust E(H) to be the max of $E(H_0)$ for $H_0 \subseteq H$
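The adjustment can be illustrated by enumerating subgraphs. This is a sketch, assuming binary relations only, so that a subgraph's raw exponent is $\alpha \cdot |\text{edges}| - |\text{nodes}|$, and assuming the second graph in the example is a triangle plus a disjoint edge (the original figure is not available). Function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def raw_exponent(edges, alpha):
    """alpha * |edges| - |nodes| for the subgraph formed by `edges`."""
    nodes = {v for e in edges for v in e}
    return alpha * len(edges) - len(nodes)

def adjusted_exponent(edges, alpha):
    """Maximum raw exponent over all non-empty subsets of edges."""
    return max(
        raw_exponent(sub, alpha)
        for k in range(1, len(edges) + 1)
        for sub in combinations(edges, k)
    )

alpha = Fraction(3, 2)
triangle = [("x", "y"), ("y", "z"), ("z", "x")]
with_extra_edge = triangle + [("u", "v")]  # assumed shape of the larger graph

print(raw_exponent(triangle, alpha))              # 3/2
print(raw_exponent(with_extra_edge, alpha))       # 1
print(adjusted_exponent(with_extra_edge, alpha))  # 3/2
```

After the adjustment the larger graph is no more likely than the triangle it contains, which removes the anomaly.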
21
Threshold Functions for
Subgraphs
[Erdős and Rényi]
Edge probability Pr(t) = p(n) = some function
Main theorem of random graphs:
For any monotone property C there exists a
threshold function t(n) s.t.
– If $p(n) \ll t(n)$ then $\lim_n \Pr(C) = 0$
– If $p(n) \gg t(n)$ then $\lim_n \Pr(C) = 1$
22
Threshold Functions
[Erdős and Rényi]
The threshold function for subgraph property H is
the following:
Let $\alpha = \min_{H_0 \subseteq H} |nodes(H_0)| / |edges(H_0)|$ (over subgraphs with at least one edge)
Then $t(n) = 1/n^\alpha$
Can derive it from the exponent [ show in class ]
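The threshold exponent can be computed by brute force for small graphs. The sketch below assumes the standard form of the result, in which the exponent is the smallest nodes-to-edges ratio over subgraphs with at least one edge (equivalently, the densest subgraph decides). The function name and the example graphs are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def threshold_exponent(edges):
    """Smallest |nodes| / |edges| over all non-empty subsets of edges."""
    best = None
    for k in range(1, len(edges) + 1):
        for sub in combinations(edges, k):
            nodes = {v for e in sub for v in e}
            ratio = Fraction(len(nodes), k)
            best = ratio if best is None else min(best, ratio)
    return best

triangle = [(1, 2), (2, 3), (3, 1)]
k4 = [(i, j) for i, j in combinations(range(1, 5), 2)]
k4_with_tail = k4 + [(4, 5)]  # K4 plus one pendant edge

print(threshold_exponent(triangle))      # 1, so t(n) = 1/n
print(threshold_exponent(k4_with_tail))  # 2/3, decided by the K4 inside
```

The second example shows why subgraphs matter: the whole graph has ratio 5/7, but its K4 subgraph has ratio 4/6 and sets the threshold.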
23
Extended 0/1 Laws
• Shelah and Spencer, and Lynch consider the
following general case:
• $\Pr(t) = \beta / n^\alpha$, for $\alpha > 0$
• Lynch: a logic admits an extended 0/1 law if
for each $\varphi$ one of the following holds:
$\Pr(\varphi) \approx C/n^E$, or
$\Pr(\varphi) < 1/n^E$ for every $E > 0$
24