Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Constraint Satisfaction
Complexity
and
Logic
Phokion G. Kolaitis
IBM Almaden Research Center
&
UC Santa Cruz
1
Goals
Goals:
• Show that Constraint Satisfaction Problems (CSP)
form a broad class of algorithmic problems that
arise naturally in several different areas.
• Present an overview of CSP with emphasis on
the computational complexity aspects of CSP
and on the connections with database theory,
univeral algebra, and logic
2
Course Outline
Constraint Satisfaction Problems (CSP)
• Definition & Examples of CSP
• CSP & the Homomorphism Problem
• CSP & Database Theory
• Computational Complexity of CSP
– Uniform & Non-Uniform CSP
– Schaefer’s Dichotomy Theorem for
Boolean CSP
– The Feder-Vardi Dichotomy Conjecture for
general CSP
3
Course Outline
• The Pursuit of Tractable Cases of CSP
– Universal Algebra and Closure Properties
of Constraints
– Datalog, Finite-variable Logics, and Pebble
Games
– Bounded Treewidth and Finite-variable Logics
4
Algorithmic Problems
• k-SAT: (k ≥ 2 is fixed)
Given a k-CNF formula, is it satisfiable?
• k-Colorability: (k ≥ 2 is fixed)
Given a graph H = (V, E), is it k-colorable?
• Clique:
Given a graph H = (V, E) and a positive integer k, does H contain a k-clique?
• Hamiltonian Cycle:
Given a graph H = (V, E), does it contain a
Hamiltonian cycle?
5
Constraint Satisfaction Problems
Problems:
• k-SAT (k ≥ 2)
• k-Colorability (k ≥ 2)
• Clique
• Hamiltonian Cycle
• Latin Square Completion
Question: What do these have in common?
Answer: Each of them is a
Constraint Satisfaction Problem.
6
Constraint Satisfaction Problem
U. Montanari – 1974
CSP – Informal Definition
Input: A set V of variables, a domain D of values
for the variables, and a set C of constraints.
Question: Is there an assignment of values to
variables such that all constraints in C are satisfied?
Example: k-SAT as a CSP
k-CNF ϕ: variables x1 , . . . , xn ; clauses c1 , . . . , cm
• Variables: x1 , . . . , xn
• Values: 0, 1
• Constraints: Given by the clauses of ϕ.
7
Constraint Satisfaction Problem
Input: (V, D, C)
• A finite set V of variables
• A finite domain D of values for the variables
• A set C of constraints (t, R) restricting the
values that tuples of variables can take.
– t:
a tuple t = (x1 , . . . , xm ) of variables
– R: a relation on D of arity |t| = m
Question: Does (V, D, C) have a solution?
Solution: A mapping h : V → D such that
h(t) = (h(x1 ), . . . , h(xm )) ∈ R,
for every constraint (t, R) ∈ C.
An assignment of values to the variables such that
all constraints are satisfied.
8
Example
• 3-Sat:
Given a 3CNF-formula ϕ with variables x1 , . . . , xn
and clauses c1 , . . . , cm , is ϕ satisfiable?
• 3-Sat as a CSP:
– Variables x1 , . . . , xn
– Domain D = {0, 1}
– Constraints ((x, y, z), Ri ), i = 0, 1, 2, 3
Clause
Relation
(x ∨ y ∨ z)
R0 = {0, 1}3 − {(0, 0, 0)}
(¬x ∨ y ∨ z)
R1 = {0, 1}3 − {(1, 0, 0)}
(¬x ∨ ¬y ∨ z)
R2 = {0, 1}3 − {(1, 1, 0)}
(¬x ∨ ¬y ∨ ¬z)
R3 = {0, 1}3 − {(1, 1, 1)}
9
Example
• 3-Colorability:
Given a graph H = (V, E), is it 3-colorable?
• 3-Colorability as a CSP:
– The variables are the nodes in V
– The domain is the set D = {R, G, B} of
three colors.
– The constraints are the pairs ((u, v), Q), where
(u, v) ∈ E and Q is the relation
{(R, G), (G, R), (R, B), (B, R), (B, G), (G, B)}.
10
Constraint Satisfaction Problems
Fundamental & ubiquitous problems in AI and CS:
• Boolean Satisfiability, Graph Colorability, ...
• Database Query Processing
• Planning and Scheduling
• Belief Maintenance
• Machine Vision
• ···
• Linguistics
11
Homomorphisms
Definition: Consider two relational structures
A
B
A = (A, R1A , . . . , Rm
) and B = (B, R1B , . . . , Rm
).
h : A → B is a homomorphism if for every i ≤ m
and every tuple (a1 , . . . , an ) ∈ An ,
RiA (a1 , . . . , an ) =⇒ RiB (h(a1 ), . . . , h(an )).
Example: Clique K4
v
@
v
vs.
v
LQ
QQ
L
Q
Q
L
v
Qv
L
"
Bbb "
L "
B bb
"L
B
b
"
L b
B ""
b L b
"
BB"
bLv
v
v
@
@
@
@
Clique K5
@
@
@v
• There is a homomorphism from K4 to K5 ;
• There is no homomorphism from K5 to K4 .
12
The Homomorphism Problem
Definition: The Homomorphism Problem
Given two relational structures A and B, is there
a homomorphism h : A → B?
Fact: Feder & Vardi – 1993
CSP can be identified with the Homomorphism
Problem.
13
Example
3-Sat as a Homomorphism Problem:
Given 3-CNF formula ϕ with variables x1 , . . . , xn :
• Aϕ = ({x1 , . . . , xn }, R0ϕ , R1ϕ , R2ϕ , R3ϕ ), where
R0ϕ
= {(x, y, z) : (x ∨ y ∨ z) is a clause of ϕ}
R1ϕ
= {(x, y, z) : (¬x ∨ y ∨ z) is a clause of ϕ}
R2ϕ
= {(x, y, z) : (¬x ∨ ¬y ∨ z) is a clause of ϕ}
R3ϕ
= {(x, y, z) : (¬x ∨ ¬y ∨ ¬z) is a clause of ϕ}
• B = ({0, 1}, R0 , R1 , R2 , R3 ), where
R0
= {0, 1}3 − {(0, 0, 0)}
R1
= {0, 1}3 − {(1, 0, 0)}
R2
= {0, 1}3 − {(1, 1, 0)}
R3
= {0, 1}3 − {(1, 1, 1)}
Fact: ϕ is satisfiable iff there is hom. h : Aϕ → B.
14
Examples
3-Colorability as a Homomorphism Problem:
Fact:
A graph H = (V, E) is 3-colorable
⇐⇒
there is a homomorphism h : H → K3 , where K3
is the 3-clique, i.e., K3 = ({R, G, B}, E3 ), where
E3 = {(R, G), (G, R), (R, B), (B, R), (B, G), (G, B)}.
k-Colorability as a Homomorphism Problem:
Fact:
A graph H = (V, E) is k-colorable
⇐⇒
there is a homomorphism h : H → Kk , where Kk
is the k-clique.
15
From CSP to the Homomorphism Problem
Given a CSP-instance (V, D, C), let R1 , . . . , Rm be
the distinct relations on D occurring in C.
Define the instance (A, B) of the Homomorphism
Problem, where
A = (V, {t : (t, R1 ) ∈ C}, . . . , {t : (t, Rm ) ∈ C})
and
B = (D, R1 , . . . , Rm ).
Fact: The CSP instance (V, D, C) has a solution
⇐⇒
there is a homomorphism from A to B.
16
From the Homomorphism Problem to CSP
A
Given an instance A = (A, R1A , . . . , Rm
) and B =
B
(B, R1B , . . . , Rm
) of the Homomorphism Problem, define the instance (V, D, C) of CSP, where
• V = A (i.e., the elements of A are variables);
• D = B (i.e., the elements of B are values);
• C = {(t, RiB ) : t ∈ RiA , 1 ≤ i ≤ m}.
Fact: There is a homomorphism from A to B
⇐⇒
the CSP-instance (V, D, C) has a solution.
17
CSP vs. the Homomorphism Problem
Conclusion: Feder & Vardi – 1993
CSP
= Homomorphism Problem
Some Advantages:
• Clean, algebraic formulation of CSP.
• Makes it possible to apply methods from
universal algebra and logics to the study of
CSP.
• Makes the connections between CSP and database
theory transparent.
18
Course Outline
Constraint Satisfaction Problems
• Definition & Examples of CSP
• CSP & the Homomorphism Problem
• CSP & Database Theory
• Computational Complexity of CSP
• The Pursuit of Tractable Cases of CSP
19
Relational Databases
Definition:
• Relation Schema: R(A1 , . . . , Ak )
R: relation name; A1 , . . . , Ak : attribute names.
• Relation conforming with R:
A set R′ of k-tuples
• Database Schema: A set S of relation schemas
• Database over S: A set S of relations conforming with the relation schemas in S.
Database =
Relational Structure
Example:
• Relation Schemas:
ENROLLS(Student, Course)
TEACHES(Faculty, Course)
• Database Schema:
{ENROLLS, TEACHES}.
20
Relational Databases
Example: (continued)
• Relation Schemas:
ENROLLS(Student, Course)
TEACHES(Faculty, Course)
• Relation ENROLLS
Student
Course
Adams
CS101
Bing
Math10
···
···
• Relation TEACHES
Faculty
Course
Euler
Math10
Turing
CS130
Codd
CS180
···
···
21
Relational Join
Notation: Given relations R1 , . . . , Rm ,
⋉ R2 ⋊
⋉ ··· ⋊
⋉ Rm
R1 ⋊
is the relational join or, simply, the join of R1 , . . . , Rm .
Definition: Given
R1 (A, B, C), R2 (B, C, D), R3 (A, D, E),
R1 ⋊
⋉ R2 ⋊
⋉ R3
= {(a, b, c, d, e) :
R1 (a, b, c) ∧ R2 (b, c, d) ∧ R3 (a, d, e)}
Example: TAUGHT-BY(Student,Course,Faculty)
TAUGHT-BY = ENROLLS ⋊
⋉ TEACHES
Fact: Computing joins is the most frequent and
most costly operation in database systems. (Why?)
22
Relational Joins
• Relation ENROLLS
Student
Course
Adams
CS101
Bing
Math10
···
···
• Relation TEACHES
Faculty
Course
Euler
Math10
Turing
CS130
Codd
CS180
···
···
• Relation TAUGHT-BY
Student
Course
Faculty
Bing
Math10
Euler
···
···
···
23
CSP and Joins
• Consider a CSP instance (V, D, C), where
V
= {X1 , . . . , Xn }
C
= {C1 , . . . , Cm }
Ci
=
(ti , Ri )
ti
=
(Xj1 , . . . , Xji ).
• For each (ti , Ri ), view the occurrence of Ri as a
relation over the relation schema Ri (Xj1 , . . . , Xji ).
Fact: Bibel, Gyssens, Jeavons, ...
R1 ⋊
⋉ ··· ⋊
⋉ Rm = Solutions((V, D, C)).
In particular,
(V, D, C) has a solution ⇐⇒ R1 ⋊
⋉ ··· ⋊
⋉ Rm 6= ∅.
Conclusion: CSP is a database problem.
24
CSP and Databases
Fact: There are strong links between CSP and
key problems in database systems:
• Relational Join Evaluation
• Conjunctive Query Evaluation (CQE)
• Conjunctive Query Containment (CQC)
25
Queries
Definitions:
• Class C of structures: a collection of relational
structures closed under isomorphisms.
• k-ary Query Q on C:
a mapping Q with domain C and such that
– Q(A) is a k-ary relation on A, for A ∈ C;
– Q is preserved under isomorphisms, i.e.,
if h : A → B is an isomorphism, then
Q(B) = h(Q(A)).
• Boolean Query Q on C:
a mapping Q : C → {0, 1} preserved under
isomorphisms. Thus, Q can be identified with
the subclass C ′ of C, where
C ′ = {A ∈ C : Q(A) = 1}.
26
Examples of Queries
• Path of Length 2 – P 2:
Binary query on graphs H = (V, E) such that
P 2(H) = {(a, b) ∈ V 2 : (∃c)(E(a, c) ∧ E(c, b))}.
• Articulation Point – AP :
Unary query on graphs H = (V, E) such that
AP (H) = {a ∈ V : a is an articulation point of H}.
• Connectivity – CN :
Boolean query on graphs H = (V, E) such that
1 if H is connected
CN (H) =
0 otherwise.
• k-Colorability, k ≥ 2,
Hamiltonian Cycle, ...
27
Definability of Queries
Let L be a logic and C a class of structures
• A k-ary query Q on C is L-definable if there is
an L-formula ϕ(x1 , . . . , xk ) with x1 , . . . , xk as
free variables and such that for every A ∈ C
Q(A) = {(a1 , . . . , ak ) ∈ Ak : A |= ϕ(a1 , . . . , ak )}.
• A Boolean query Q on C is L-definable if there
is an L-sentence ψ such that for every A ∈ C
Q(A) = 1 ⇐⇒ A |= ψ.
28
Conjunctive Queries
Definition: A conjunctive query is a query definable by a FO-formula in prenex normal form built
from atomic formulas, ∧, and ∃ only.
(∃z1 . . . ∃zm )ψ(x1 , . . . , xk , z1 , . . . , zm ),
where ψ is a conjunction of atomic formulas.
Also, written as a rule:
Q(x1 , . . . , xk ) : − R(y2 , x3 , x1 ), S(x1 , y3 ), . . . , S(y7 , x2 )
Examples:
• Path of Length 2 (Binary query)
(∃z)(E(x1 , z) ∧ E(z, x2 )
P 2(x1 , x2 ) : − E(x1 , z), E(z, x2 )
• Cycle of Length 3 (Boolean query)
(∃x1 ∃x2 ∃x3 )(E(x1 , x2 )∧E(x2 , x3 )∧E(x3 , x1 ))
Q : − E(x1 , x2 ), E(x2 , x3 ), E(x3 , x1 )
29
Conjunctive Queries
• Every relational join is a conjunctive query
R1 (A, B, C), R2 (B, C, D),
R1 ⋊
⋉ R2 (x, y, z, w) : − R1 (x, y, z), R2 (y, z, w)
• Conjunctive Queries are the most frequently
asked queries in databases (a.k.a. SPJ queries)
• The main construct of SQL expresses conjunctive queries
SELECT R1 .A, R2 .D
FROM R1 , R2
WHERE R1 .B = R2 .B AND R1 .C = R2 .C
30
Conjunctive Query Evaluation
A fundamental problem about conjunctive queries
Definition: Conjunctive Query Evaluation CQE
• Given a CQ Q and a structure A, find
Q(A) = {(a1 , . . . ak ) : A |= Q(a1 , . . . , ak )}
• For Boolean queries Q, this becomes:
Given Q and A, does A |= Q? (is Q(A) = 1?)
Examples:
• Given a graph H, find all pairs of nodes connected by a path of length 4.
• Given a graph H, does it contain a triangle?
31
Conjunctive Query Containment
A fundamental problem about conjunctive queries
Definition: Conjunctive Query Containment CQC
• Given two k-ary CQs Q1 and Q2 , is it true that
for every structure A,
Q1 (A) ⊆ Q2 (A)?
• For Boolean queries, this becomes:
Given two Boolean queries Q1 and Q2 , does
Q1 |= Q2 ? (does Q1 logically imply Q2 ?)
Examples:
• Is it true that if two nodes of a graph H are
connected by a path of length 4, then they are
also connected by a path of length 3?
• It is true that if a graph H contains a K4 , then
it also contains a K3 ?
32
CQE, CQC, CSP
Theorem: Chandra & Merlin – 1977
CQE and CQC are the same problem.
Fact: K . . . & Vardi – 1998
CSP and CQC are the same problem
Question: What is the common link?
33
CQE, CQC, CSP
Theorem: Chandra & Merlin – 1977
CQE and CQC are the same problem.
Fact: K . . . & Vardi – 1998
CSP and CQC are the same problem
Question: What is the common link?
Answer: The Homomorphism Problem
34
Canonical CQs and Canonical Structures
Definition: Canonical Conjunctive Query
A
), the canonical CQ of
Given A = (A, R1A , . . . , Rm
A is the Boolean CQ QA with the elements of A
as variables and the “facts” of A as conjuncts:
QA : −
m ^
^
RiA (t)
i=1 t
Definition: Canonical Structure
Given a Boolean conjunctive query Q, let AQ be
the structure with the variables of Q as elements
and the conjuncts of Q as “facts”.
Example:
• A = ({a, b, c}, {(a, b), (b, c), (c, a)}
QA : − E(x, y) ∧ E(y, z) ∧ E(z, x)
• Q : − E(x, y) ∧ E(x, z)
AQ = ({a, b, c), {(a, b), (a, c)}
35
Homomorphisms, CQC and CQE
Theorem: Chandra & Merlin – 1977
For relational structures A and B, TFAE
• There is a homomorphism h : A → B
• B |= QA
(i.e., QA (B) = 1)
• QB ⊆ QA
Theorem: Chandra & Merlin – 1977
For conjunctive queries Q1 and Q2 , TFAE
• Q1 ⊆ Q2
• There is a homomorphism h : AQ2 → AQ1
• AQ1 |= Q2
(i.e., Q2 (AQ1 ) = 1)
36
Homomorphisms, CQC, and CQE
Example: 3-Colorability
For a graph H, the following are equivalent:
1. H is 3-colorable
2. There is a homomorphism h : H → K3
3. K3 |= QH
4. QK3 ⊆ QH
Proof:
(2) =⇒ (3): A homomorphism h : H → K3 provides witnesses in K3 for the existential quantifiers
in QH .
(3) =⇒ (4): If K3 |= QH and A |= QK3 , then
there is a homomorphism h : H → K3 and a homomorphism and h∗ : K3 → A.
The composition h∗ ◦h : H → A provides witnesses
in A for the existential quantifiers in QH .
37
CSP, CQE, and CQC
Theorem: Chandra & Merlin – 1977
For relational structures A and B, TFAE
• There is a homomorphism h : A → B
• B |= QA
(i.e., QA (B) = 1)
• QB ⊆ QA
Conclusion:
CSP, CQE, and CQC are the same problem.
38
Course Outline
Constraint Satisfaction Problems
• Definition & Examples of CSP
• CSP & the Homomorphism Problem
• CSP & Database Theory
• Computational Complexity of CSP
– Uniform & Non-Uniform CSP
– Schaefer’s Dichotomy Theorem for
Boolean CSP
– The Feder-Vardi Dichotomy Conjecture for
general CSP
• The Pursuit of Tractable Cases of CSP
– Datalog & Pebble Games
– Bounded Treewidth and Finite-Variable Logics
39
Computational Complexity Primer
Computational Complexity:
• The quantitative study of solvability
Y. Hartmanis – 1993
• Solvable decision problems are placed into classes
according to the time/space resources required
to solve them.
Some Major Complexity Classes:
P
polynomial time
NP
nondeterministic polynomial time
EXPTIME
exponential time
40
Complexity Classes
Fact: The following containments hold:
P ⊆ NP ⊆ EXPTIME
Theorem: Hartmanis & Stearns – 1965
P ⊂ EXPTIME
Open Problems:
• Is P 6= NP?
• Is NP 6= EXPTIME?
Note: It is generally believed that P 6= NP.
41
NP-Complete Problems
Definition: Let Q be a decision problem (a Boolean
query). We say that Q is NP-complete if
1. Q ∈ NP.
2. Q is NP-hard , i.e., for every Q′ ∈ NP, there
a polynomial-time many-one reduction f of Q′
to Q:
x ∈ Q′ ⇐⇒ f (x) ∈ Q.
Theorem: S. Cook – 1971, R. Karp 1972, ...
The following problems are NP-complete:
• k-Sat, k ≥ 3
• k-Colorability, k ≥ 3
• Clique
• Hamiltonian Cycle
• Latin Square Completion
42
NP-Complete Problems
Fact: Let Q be an NP-complete problem.
The following are equivalent:
• P = NP
• Q ∈ P.
Remark:
• NP-complete problems hold the secret of whether
or not P = NP.
• Establishing that Q is NP-complete is viewed
as evidence that Q 6∈ P and, hence, it is an
intractable algorithmic problem.
43
Complexity of CSP
Definition: Uniform CSP
{(A, B) : ∃ homomorphism h : A → B}
(Homomorphism Problem)
Theorem: Uniform CSP is NP-complete
Proof:
• Uniform CSP is in NP:
Given A, B, “guess” h : A → B and verify it
is a homomorphism.
• Uniform CSP is NP-hard:
It contains NP-hard problems as special cases
(e.g., 3-SAT)
44
Complexity of CQC
Definition: CQC Problem
Given Boolean queries Q1 , Q2 , does Q1 |= Q2 ?
Theorem: CQC is NP-complete
Proof: Same as Uniform CSP
Remark:
• Recall that Boolean CQs are FO-sentences built
from atomic formulas, ∧, and ∃ only.
So, for such sentences, logical implication is
decidable.
• In contrast, for arbitrary FO-sentences, logical
implication is undecidable.
45
Coping with NP-Completeness
Question: How to cope with NP-completeness?
Answer: Garey & Johnson – 1979
• Design heuristic algorithms that may work well
in practice.
• Identify polynomial-time solvable cases of the
problem, the so-called islands of tractability,
by imposing restrictions on the inputs.
Example 1:
– 3-Sat is NP-complete.
– Horn 3-Sat is in P.
Example 2:
– 3-Colorability is NP-complete.
– 3-Colorability on graphs of bounded treewidth
is in P.
46
Restricting Uniform CSP
Research Program: Identify all islands of tractability of Uniform CSP.
Definition: Let A, B be two classes of structures.
CSP(A, B) = {(A, B) : A ∈ A, B ∈ B and
∃ homomorphism h : A → B}
Research Program: Identify all A, B such that
CSP(A, B) is in P.
47
Non-Uniform CSP
Definition: Fix a structure A.
CSP({A}, All) = {B : ∃ homomorphism h : A → B}
Fact: For every fixed A,
CSP({A}, All) ∈ P
Proof: Exercise.
Definition: Fix a structure B:
CSP(All, {B}) = {A : ∃ homomorphism h : A → B}
Fact: The complexity of CSP(All, {B}) depends
on B.
Proof:
• CSP(All, {K3 }) is NP-complete
(3-Colorability)
• CSP(All, {K2 }) is in P.
(2-Colorability)
48
Complexity of Non-Uniform CSP
Notation: For simplicity, for every fixed structure B, we put
CSP(B) = CSP(All, {B})
Research Program:
• Identify the islands of tractability of CSP(B).
• In other words, characterize the structures B
for which CSP(B) is solvable in polynomial
time.
49
Boolean Non-Uniform CSP
Definition: Boolean Non-Uniform CSP
CSP(B) with B = ({0, 1}, R1B , . . . , R1B )
Fact: Boolean Non-Uniform CSP problems are
Generalized Satisfiability Problems, i.e.,
variants of Boolean satisfiability in CNF.
Examples: Each of the following problems is a
CSP(B) problem for a suitable Boolean B.
• 3-SAT: B = ({0, 1}, R0 , R1 , R2 , R3 )
• 2-SAT: B = ({0, 1}, R0′ , R1′ , R2′ )
• POSITIVE 1-IN-3-SAT:
3CNF-formulas with clauses (x ∨ y ∨ z);
exactly one variable is true in each clause.
B = ({0, 1}, R1/3 ),
where
R1/3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1).
50
Complexity of Boolean Non-Uniform CSP
Schaefer’s Dichotomy Theorem – 1978
• If B is Boolean structure, then CSP(B) is in
P or it is NP-complete.
• Moreover, there is a polynomial-time algorithm
to decide, given a Boolean structure B, whether
CSP(B) is in P or it is NP-complete.
Question: Why is this a dichotomy theorem?
51
The Fine Structure of NP
Theorem: Ladner - 1975
If P 6= NP, then there is a problem Q such that
• Q ∈ NP − P;
• Q is not NP-complete.
NP-complete
NP − P
not NP-complete
P
Remark: Consequently, it is conceivable that a
family of NP-problems contains problems of such
intermediate complexity problems.
This possibility cannot be ruled out a priori.
52
Dichotomy of Boolean Non-Uniform CSP
Schaefer’s Dichotomy Theorem – 1978
• If B is Boolean structure, then CSP(B) is in
P or it is NP-complete.
ր
NP-complete
NP − P, not NP-complete
CSP(B)
ց
P
• Moreover, there is a polynomial-time algorithm
to decide, given a Boolean structure B, whether
CSP(B) is in P or it is NP-complete.
Question: What is the boundary in this dichotomy?
53
Dichotomy of Boolean Non-Uniform CSP
Proof Architecture:
B
B = ({0, 1}, R1B , . . . , Rm
)
• Identify a number of cases for which CSP(B)
is in P.
Each such case will be described in terms of
B
conditions on the relations R1B , . . . , Rm
of B.
• Show that CSP(B) is NP-complete in all other
cases.
• Show that there is a polynomial-time algorithm
for checking whether the conditions describing
the tractable cases hold.
54
Tractable Cases of Boolean CSP(B)
Definition: R ⊆ {0, 1}k , for some k ≥ 1.
• R is 0-valid if it contains the k-tuple (0, . . . , 0).
• R is 1-valid if it contains the k-tuple (1, . . . , 1).
B
Fact: B = ({0, 1}, R1B , . . . , Rm
)
• If every RiB is 0-valid, then CSP(B) is in P.
• If every RiB is 1-valid, then CSP(B) is in P.
Proof: Trivial
• The constant function h(x) = 0 is a homomorphism.
• The constant function h(x) = 1 is a homomorphism.
55
Relations and Solutions
Notation: Let ϕ be a Boolean formula with k
variables x1 , . . . , xk .
Sol(ϕ) is the set of all k-tuples (a1 , . . . , ak ) such
that if s is a truth assignment with s(xi ) = ai ,
1 ≤ i ≤ k, then s(ϕ) = 1.
Example: If ϕ is (¬x ∨ y), then
Sol(ϕ) = {(0, 0), (0, 1), (1, 1)}.
Remark: For all remaining tractable cases, the
condition on the relations of B will assert that each
relation RiB is equal to Sol(ϕ), where ϕ belongs to
a certain class of “well-behaved” Boolean formulas
(the class of formulas will change from tractable
case to tractable case).
56
Tractable Cases of Boolean CSP(B)
Definition: R ⊆ {0, 1}k , for some k ≥ 1.
R is bijunctive if there is a 2-CNF formula ϕ such
that R = Sol(ϕ).
B
Theorem: B = ({0, 1}, R1B , . . . , Rm
)
If every RiB is bijunctive, then CSP(B) is in P.
Proof: 2-SAT is in P.
The resolution algorithm runs in polynomial time
on 2-CNF formulas, because
• Every resolvent of two 2-clauses is a 2-clause:
{l1 , l2 }
{l1 , l3 }
ց
ւ
{l2 , l3 }
• Given n variables, there is a polynomial number of 2-clauses involving these variables.
57
Tractable Cases of Boolean CSP(B)
Definition: R ⊆ {0, 1}k , k ≥ 1.
• R is Horn if R = Sol(ϕ) for some Horn formula
ϕ, i.e., a CNF-formula in which each conjunct
has at most one positive literal.
x ∧ (¬x ∨ z) ∧ (¬y ∨ ¬z ∨ w)
• R is dual Horn R = Sol(ϕ) for some dual Horn
formula ϕ, i.e., a CNF-formula in which each
conjunct has at most one negative literal.
¬x ∧ (x ∨ ¬z) ∧ (y ∨ z ∨ ¬w)
B
Theorem: B = ({0, 1}, R1B , . . . , Rm
)
• If every RiB is Horn, then CSP(B) is in P.
• If every RiB is dual Horn, then CSP(B) is in P.
Proof:
Unit Propagation algorithm.
58
Tractable Cases of Boolean CSP(B)
Definition: R ⊆ {0, 1}k , k ≥ 1.
R is affine if it is the set of solutions of a system
of linear equations over the 2-element field.
Example: R = {(0, 1), (1, 0)} is affine, because
R = Sol(x ⊕ y = 1).
B
Theorem: B = ({0, 1}, R1B , . . . , Rm
)
If every RiB is affine, then CSP(B) is in P.
Proof: Use Gaussian Elimination algorithm.
59
Schaefer Structures
B
Definition: B = ({0, 1}, R1B , . . . , Rm
)
B is Schaefer if at least one of the following six
conditions holds:
• Every relation RiB is 0-valid.
• Every relation RiB is 1-valid.
• Every relation RiB is bijunctive.
• Every relation RiB is Horn.
• Every relation RiB is dual Horn.
• Every relation RiB is affine.
Otherwise, we say that B is non-Schaefer.
60
Schaefer’s Dichotomy Theorem
B
Theorem: Let B = ({0, 1}, R1B , . . . , Rm
).
• If B is Schaefer, then CSP(B) is in P.
• If B is non-Schaefer, then CSP(B) is NP-complete.
Proof:
• The first part has already been established.
• The second part requires much more work.
The key ingredient is a theorem about the expressive power of conjunctive queries on nonSchaefer structures.
61
Schaefer’s Expressibility Theorem
Note: Recall that a conjunctive query is a FOformula built from atomic formulas, ∧, and ∃ only.
B
Theorem: Let B = ({0, 1}, R1B , . . . , Rm
).
If B is non-Schaefer, then every Boolean relation
R ⊂ {0, 1}k , k ≥ 1, is definable on B by some
conjunctive query over B with constants 0 and 1.
This means that for every k ≥ 1 and for every
R ⊂ {0, 1}k , there is a formula ψ(x1 , . . . , xk , y, z)
B
that is built from R1B , . . . , Rm
, ∧, and ∃, and has
the property that for every (a1 , . . . , ak ) ∈ {0, 1}k
(a1 , . . . , ak ) ∈ R ⇐⇒ B |= ψ(a1 , . . . , ak , 0, 1).
Remark: Intutitively, this result asserts that on
non-Schaefer structures, conjunctive queries have
maximum expressive power (they can define all
Boolean relations).
62
Schaefer’s Dichotomy Theorem
B
Theorem: Let B = ({0, 1}, R1B , . . . , Rm
).
• If B is Schaefer, then CSP(B) is in P.
• If B is non-Schaefer, then CSP(B) is NP-complete.
• Moreover, there are simple polynomial-time criteria to test if B is Schaefer.
Remark: These criteria involve closure properties
of the relations of B.
63
Closure Properties
Theorem: Let R ⊆ {0, 1}k , k ≥ 1.
• R is bijunctive iff it is closed under the 3-ary
majority function maj(x, y, z).
• S is Horn iff it is closed under g(x, y) = x ∧ y.
• S is dual Horn iff it is is closed under g ′ (x, y) =
x ∨ y.
• S is affine iff it is closed under h(x, y, z) =
x ⊕ y ⊕ z.
64
Closure Properties
Example: Let R = {(0, 1), (1, 0), (1, 1)}
• R is not Horn, because
(0, 1) ∧ (1, 0) = (0, 0) 6∈ R.
• R is not affine, because
(0, 1) ⊕ (1, 0) ⊕ (1, 1) = (0, 0) 6∈ R.
Example: Let R = {(0, 0), (0, 1), (1, 0)}
• R is not dual Horn, because
(0, 1) ∨ (1, 0) = (1, 1) 6∈ R.
• R is not affine, because
(0, 0) ⊕ (0, 1) ⊕ (1, 0) = (1, 1) 6∈ R.
65
Positive 1-in-3 SAT
• 3-CNF formula with clauses (x ∨ y ∨ z);
exactly one variable/clause is true.
• Positive-1-In-3-SAT = CSP(({0, 1}, R1/3 ))
R1/3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1).
Fact: Positive 1-In-3-SAT is NP-complete
Proof: Apply Schaefer’s Dichotomy Theorem
• R1/3 is not bijunctive, because
maj((1, 0, 0), (0, 1, 0), (0, 0, 1)) = (0, 0, 0) 6∈ R.
• R1/3 is neither Horn nor dual Horn, since
(1, 0, 0) ∧ (0, 1, 0) = (0, 0, 0) 6∈ R1/3 .
(1, 0, 0) ∨ (0, 1, 0) = (1, 1, 0) 6∈ R1/3 .
• R1/3 is not affine, since
(1, 0, 0) ⊕ (0, 1, 0) ⊕ (0, 0, 1) = (1, 1, 1) 6∈ R1/3 .
66
Schaefer’s Dichotomy Theorem
B
Theorem: Let B = ({0, 1}, R1B , . . . , Rm
).
• If B is Schaefer, then CSP(B) is in P.
• If B is non-Schaefer, then CSP(B) is NP-complete.
• Moreover, there are simple polynomial-time criteria to test if B is Schaefer.
Remark: Schaefer’s Dichotomy Theorem is the
mother of many other dichotomy results for families of decision problems.
67
Complexity of CSP(B)
Conclusion: Schaefer’s Dichotomy Theorem yields
a complete classification of the complexity of CSP(B)
for Boolean structures B.
Question: What about arbitrary structures B?
68
Complexity of CSP(B)
Conclusion: Schaefer’s Dichotomy Theorem yields
a complete classification of the complexity of CSP(B)
for Boolean structures B.
Question: What about arbitrary structures B?
Dichotomy Conjecture: Feder & Vardi –1993
For every structure B, one of the following holds:
• CSP(B) is in P.
• CSP(B) is NP-complete.
69
Towards the Dichotomy Conjecture
Theorem: Hell & Nešetril – 1990
Let B be an undirected graph.
• If B is 2-Colorable, then CSP(B) is in P.
• If B not 2-Colorable, then CSP(B) is NP-complete.
Theorem: A. Bulatov – 2002
The Dichotomy Conjecture is true for CSP(B),
where B = {0, 1, 2}.
Note: The Dichotomy Conjecture remains open
for CSP(B) with |B| ≥ 4.
70
Islands of Tractability of CSP(B)
Fact: An extensive pursuit of tractable cases of
CSP(B) has taken place over the years.
First Stage: Numerous particular results.
Second Stage: Broad sufficient conditions for
tractability of CSP(B).
Fact: Unifying explanations for many tractability
results have been discovered:
• Expressibility in Datalog
Feder & Vardi – 1993
• Closure Properties of Constraints
Connections with universal algebra
Feder & Vardi – 1993
Jeavons & collaborators – 1995 to present
71
Course Outline
Constraint Satisfaction Problems (CSP)
• Definition & Examples of CSP
• CSP & the Homomorphism Problem
• CSP & Database Theory
• Computational Complexity of CSP
– Uniform & Non-Uniform CSP
– Schaefer’s Dichotomy Theorem for
Boolean CSP
– The Feder-Vardi Dichotomy Conjecture for
general CSP
• The Pursuit of Tractable Cases of CSP
– Datalog & Pebble Games
– Bounded Treewidth & Finite-Variable Logics
72
Datalog
Note: Recall that CQs can be written as rules:
Path of Length 2 - P 2: (∃z)(E(x1 , z) ∧ E(z, x2 )
P 2(x1 , x2 ) : − E(x1 , z), E(z, x2 )
Definition:
• Datalog = Conjunctive Queries + Recursion
Logic Programming without function symbols
• A Datalog program is a finite set of rules given
by conjunctive queries
T (x) : − S1 (y 1 ), . . . , Sr (y r ).
– Some relation symbols may occur both in
the heads and the bodies of rules.
These are the recursive relation symbols or
intensional database predicates (IDBs).
– The remaining relation symbols are the
extensional database predicates (EDBs).
73
Datalog Examples
Definition: Transitive Closure Query T C
Given graph H = (V, E),
T C(H) = {(a, b) ∈ V 2 : there is a path from a to b}.
Example 1: Datalog program for T C
S(x, y) : − E(x, y)
S(x, y) : − E(x, z) ∧ S(z, y)
Example 2: Another Datalog program for T C
S(x, y) : − E(x, y)
S(x, y) : − S(x, z) ∧ S(z, y)
Note:
• E is the EDB.
• S is the IDB; it defines T C.
74
Datalog Examples
Definition: S. Cook – 1974
Path Systems S = (F, A, R)
Given a finite set of formulas F , a set of axioms
A ⊆ F , and a rule of inference R ⊆ F 3 , compute
the theorems of this system.
Example: Datalog program for Path Systems:
T (x) : − A(x)
T (x) : − T (y), T (z), R(x, y, z)
Note:
• A and R are the EDBs.
• T is the IDB; it defines the theorems of S.
75
Tractability of Datalog
Fact:
• If a query Q is defined by a Datalog program,
then Q is in P.
Proof:
• Datalog programs can be evaluated “bottomup” in a polynomial number of iterations.
76
Evaluation of Datalog Programs
Example : Datalog program for T C
S(x, y) : − E(x, y)
S(x, y) : − E(x, z) ∧ S(z, y)
Bottom-up Evaluation
1
S
= ∅
m+1
= {(a, b)) : ∃z(E(a, z) ∧ S m (z, b)}
S
Fact:
S
m
S
m
TC
=
TC
= S |V | .
77
Preservation Properties
Fact: Preservation Properties of Datalog.
• Datalog queries are preserved under
homomorphisms.
• Datalog queries are monotone, i.e., they are
preserved if new tuples are added to the EDBs.
Proof: Exercise - use preservation properties of
positive existential formulas.
Fact: On the class G of all finite graphs:
Datalog[G] $ PTIME[G].
78
Datalog and CSP
B
Fact: Let B = (B, R1B , . . . , Rm
).
• In general, CSP(B) is not monotone. (Why?)
• Hence, CSP(B) is not expressible in Datalog.
However,
• CSP(B) is monotone, where
CSP(B) = {A : 6 ∃ homomorphism h : A → B}.
• Hence, it is conceivable that CSP(B) is
expressible in Datalog (and, thus, it is in P).
79
Datalog and CSP
Fact: Feder & Vardi – 1993
Expressibility of CSP(B) in Datalog is a unifying explanation for many tractability results about
CSP(B).
Example: 2-Colorability = CSP(K2 )
Datalog program for Non 2-Colorability
O(X, Y ) : − E(X, Y )
O(X, Y ) : − O(X, Z), E(Z, W ), E(W, Y )
Q
: − O(X, X)
Example: Let B be Boolean structure.
If B is bijunctive, Horn, or dual Horn, then CSP(B)
is expressible in Datalog.
80
Datalog and CSP
Theorem: Feder & Vardi – 1993
• If B = (B, R1 , . . . , Rk ) is such that
Pol({R1 , . . . , Rk }) contains a near-unanimity
function, then CSP(B) is expressible in Datalog.
• If B = (B, R1 , . . . , Rk ) is such that
Pol({R1 , . . . , Rk }) contains an ACI function,
then CSP(B) is expressible in Datalog.
81
Datalog and CSP
Open Problem: Is there an algorithm to decide
whether, given B, we have that CSP(B) is expressible in Datalog?
B
Question: Fix B = (B, R1B , . . . , Rm
).
When is CSP(B) expressible in Datalog?
82
Datalog and CSP
B
Question: Fix B = (B, R1B , . . . , Rm
).
When is CSP(B) expressible in Datalog?
Answer: K . . . & Vardi – 1998, 2000
Expressibility of CSP(B) in Datalog can be
characterized in terms of
• Finite-Variable Logics and Pebble Games;
• Consistency Properties.
83
The Infinitary Logic Lω
∞ω
Definition: Barwise (1977)
• Infinitary logic with k variables, k ≥ 1.
Lk∞ω is the collection of L∞ω -formulas with at
most k distinct variables.
• Infinitary logic with finitely many variables
[
ω
Lk∞ω .
L∞ω =
k≥1
Fact: Lω
∞ω is a useful tool in analyzing the expressive power of least fixed-point logic.
Theorem: Barwise – 1977, Immerman – 1981
Lω
∞ω -definability can be characterized in terms of
pebble games.
84
Existential Positive Fragments of Lω
∞ω
Definition: K . . . & Vardi – 1995
• ∃Lk∞ω is the fragment of Lk∞ω that contains all
W V
atomic formulas and is closed under ∃, , .
• Existential infinitary logic with finitely many
variables
[
ω
∃Lk∞ω .
∃L∞ω =
k≥1
Fact:
• Datalog ⊆ ∃Lω
∞ω
• ∃Lω
∞ω is a useful tool for analyzing the expressive power of Datalog.
• ∃Lω
∞ω -definability can be characterized in terms
of existential pebble games.
85
Existential k-Pebble Games
Spoiler and Duplicator play on two structures A
and B. Each player uses k pebbles. In each move,
• Spoiler places a pebble on or removes a pebble
from an element of A.
• Duplicator tries to duplicate the move on B.
A:
B:
a1
a2
. . . al
↓
↓
···
↓
b1
b2
...
bl
l≤k
• Spoiler wins the (∃, k)-pebble game if at some
point the mapping ai 7→ bi , 1 ≤ i ≤ l,
is not a partial homomorphism.
• Duplicator wins the (∃, k)-pebble game if the
above never happens.
86
Example
Cliques of Different Size
v
@
v
v
LQ
QQ
L
Q
Q
L
v
Qv
L
b
"
B b " L
"
B bb
"L
B
b
"
L b
B ""
b L b
"
BB"
bLv
v
v
@
@
@
@
@
@
@v
K5
K4
Fact: Let Kk be the k-clique
• Duplicator wins the (∃, k)-pebble game on Kk
and Kk+1 .
• Spoiler wins the (∃, k)-pebble game on Kk and
Kk−1 .
87
Linear Orders of Different Size
w
w
w
w
w
w
w
w
w
w
w
w
w
Lm
Ln
• Spoiler wins the (∃, 2)-pebble game
on Lm and Ln , where m > n.
• Duplicator wins the (∃, 2)-pebble game
on Ln and Lm , where m > n.
88
Winning Strategies in the (∃, k)-Pebble Game
Definition: A winning strategy for the Duplicator
in the (∃, k)-pebble game is a non-empty family I
of partial homomorphisms from A to B such that
• If f ∈ I and h ⊆ f , then h ∈ I
(I is closed under subfunctions).
• If f ∈ I and |f | < k, then for every a ∈ A,
there is g ∈ I so that f ⊆ g and a ∈ dom(g).
(I has the forth property up to k)
89
Uses of Existential k-Pebble Games
• Existential k-pebble games can be used to study
the expressive power of ∃Lk∞ω .
• In particular, existential k-pebble games can
be used to obtain lower bounds for definability
in Datalog.
90
(∃, k)-Pebble Games and ∃Lk∞ω -Definability
Definition: Let A, B be two σ-structures.
A k∞ω B, if every ∃Lk∞ω -sentence that is true on
A is also true on B.
Theorem: K . . . & Vardi – 1995
The following statements are equivalent.
• A k∞ω B.
• Duplicator wins the (∃, k)-pebble game
on A and B.
91
Lower Bounds for ∃Lk∞ω -Definability
Corollary: Let Q be a Boolean query on C. The
following statements are equivalent:
• Q is ∃Lω
∞ω -definable on C.
• There is a k such that for every A, B in C, if
A |= Q and B 6|= Q, then the Spoiler wins the
(∃, k)-pebble game on A, B.
A Methodology for ∃Lω
∞ω -Definability
To show that a Boolean query Q is not expressible
in ∃Lω
∞ω , suffices to show that for every k there are
structures Ak and Bk such that
• Ak |= Q and Bk 6|= Q.
• Duplicator wins the (∃, k)-pebble game
on A, B.
Fact: This method is also complete.
92
k-Datalog
Definition: A k-Datalog program is a Datalog
program in which each rule
t0 : − t1 , . . . , tm
has at most k distinct variables in the head t0 and
at most k distinct variables in the body t1 , . . . , tm .
Example: Non
O(X, Y ) : −
O(X, Y ) : −
Q
:−
2-Colorability revisited
E(X, Y )
O(X, Z), E(Z, W ), E(W, Y )
O(X, X)
Therefore,
Non 2-Colorability is expressible in 4-Datalog.
93
k-Datalog and (∃, k)-Pebble Games
Theorem: K . . . & Vardi
• On the class of all finite structures,
k-Datalog ⊂ ∃Lk∞ω .
• There is a polynomial-time algorithm to decide
whether, given two finite structures A and B,
whether the Spoiler or the Duplicator wins the
(∃, k)-pebble game on A and B.
• For each finite structure B, there is a k-Datalog
program πB that expresses the query: given a
finite σ-structure A, does the Spoiler win the
(∃, k)-pebble game on A and B?
94
Datalog and CSP
Theorem: K . . . & Vardi – 1998
Let k be a positive integer and B a finite structure.
Then the following are equivalent:
• CSP(B) is expressible in k-Datalog
• CSP(B) is expressible in ∃Lk∞ω
• CSP(B) = {A : Duplicator wins the
(∃, k)-pebble game on A and B}.
Note: It is always the case that
CSP(B) ⊆ {A : Duplicator wins the
(∃, k)-pebble game on A and B}.
95
Islands of Tractability of CSP
Non-Uniform CSP(B) Problems
• Definability of CSP(B) in Datalog
• Definability of CSP(B) in ∃Lω
∞ω .
• Single canonical polynomial-time algorithm:
determine who wins the (∃, k)-pebble game.
96
From Non-Uniform CSP to Uniform CSP
Note: So far, we have focused on tractable cases
of non-uniform constraint satisfaction problems.
CSP(B) = CSP(All, {B})
Fact: Many applications in AI and in database
systems involve uniform CSP(A, B), where A and
B are classes of structures.
For example, usually, conjunctive query containment and conjunctive query evaluation appear as
uniform CSPs.
Question:
Can we obtain tractability results for uniform CSP
from tractability results for non-uniform CSP?
97
From Non-Uniform CSP to Uniform CSP
Note: Let A and B be two classes of structures.
It is possible that CSP(A, {B}) is in P, for every
B in B, but CSP(A, B) is NP-complete.
Example:
K = all cliques and G = all graphs:
• CSP(K, {H}) is in P, for every graph H;
• CSP(K, G) in NP-complete.
(it is the Clique problem)
Conclusion: Tractability results for non-uniform
CSP do not necessarily yield tractability results for
uniform CSP.
98
k-Datalog and CSP
However, certain non-uniform tractable cases do
uniformize!
Definition: For every positive integer k, let
Dk
= {B : CSP(B) is expressible in k-Datalog}.
Remark: Recall that if B ∈ Dk , then CSP(B) is
in P.
Theorem:
CSP(All, Dk ) is in P, for every positive integer k.
Proof: An easy consequence of the following:
• Characterization of when CSP(B) is expressible in k-Datalog (using (∃, k)-pebble games).
• Polynomial-time algorithm for determining the
winner in the (∃, k)-pebble game.
99
Datalog and CSP
Conclusions:
• Expressibility in Datalog provides a unifying
explanation for many tractability results about
non-uniform CSP.
• Expressibility of non-uniform CSPs in k-Datalog
uniformizes, which implies that CSP(All, Dk )
is a large island of tractability for uniform CSP.
However,
Open Problem: Let k be a positive integer.
Is there an algorithm to test, given a structure B,
whether or not CSP(B) is expressible in k-Datalog?
100
Bounded Treewidth: Motivation
Observation: Many algorithmic problems that
are “hard” on arbitrary graphs become “easy” on
trees.
Question: Can the concept of tree be weakened
while maintaining “good” computational properties?
101
Bounded Treewidth: Motivation
Observation: Many algorithmic problems that
are “hard” on arbitrary graphs become “easy” on
trees.
Question: Can the concept of tree be weakened
while maintaining “good” computational properties?
Answer: Robertson & Seymour – 1985–1995
Yes, the concept of bounded treewidth achieves this.
Note: Part of their work on graph minors.
(Downey & Fellows: “Parametrized Complexity”)
102
k-Trees, Partial k-Trees, Treewidth
Definition: Let k be a positive integer.
• A graph H is a k-tree if
1. H is a Kk+1 -clique, or
2. There are a k-tree G, nodes a1 , . . . , ak in
G forming a Kk -clique, and a node b in
H − G such that H is obtained from G by
connecting b to each of the nodes a1 , . . . , ak
(thus, forming a Kk+1 -clique).
• A graph is a partial k-tree if it is a subgraph
(not necessarily induced) of a k-tree.
• The treewidth of a graph H, denoted by tw(H),
is the smallest k such that H is a partial k-tree.
103
Treewidth: Examples
Fact:
• If T is a tree, then tw(T) = 1.
• If C is a cycle, then tw(C) = 2.
⋊
⋉) = 2.
• tw(⊟) = 2.
• tw(
• ...
• tw(Kk ) = k − 1, for every k ≥ 2.
104
Treewidth: Characterizations
Fact: The concept of treewidth can be characterized in two different ways:
• Using tree decompositions of graphs.
• Using induced widths of orderings of the nodes.
Note: All three descriptions of treewidth are useful, and are needed in different applications.
105
Treewidth: Complexity
Fact: The following problem is NP-complete:
Given graph H and integer k ≥ 1, is tw(H) ≤ k?
However,
Theorem: Bodlaender – 1993
For every integer k ≥ 1, there is a linear-time algorithm such that, given a graph H, it determines
whether or not tw(H) ≤ k.
106
Bounded Treewidth
Definition: For every integer k ≥ 1,
T (k) = {H : tw(H) < k}.
Fact: Fix an integer k ≥ 2.
Many problems that are NP-complete on arbitrary
graphs become tractable on T (k).
3-Colorability, Monochromatic Triangle,
...
107
Treewidth & Monadic Second-Order Logic
Definition: Monadic Second-Order Logic
(∃X1 ∀X2 · · · ∃Xm )ψ,
where X1 , . . . , Xm are set variables and ψ is firstorder.
Fact: For every m ≥ 1, monadic second-order
logic can express problems that are complete for
the m-th level ΣP
m of the polynomial hierarchy PH.
Theorem: Courcelle – 1990
Let Φ be a sentence of monadic second-order logic.
For every integer k ≥ 2, there is a polynomial-time
algorithm such that, given a graph H ∈ T (k), it
determines whether or not H |= Φ.
108
CSP & Monadic Second-Order Logic
B
Fact: If B = (B, R1B , . . . , Rm
) is a finite structure, then CSP(B) is definable by a sentence of
existential monadic second-order logic with a universal first-order part, i.e., by a sentece of the form
(∃X1 · · · ∃Xn )(∀y1 · · · ∀ys )θ,
where θ is quantifier-free.
Proof: Use one Xi for each element of B.
Example: 3-Colorability
(∃R∃G∃B)(∀y1 ∀y2 )θ.
Corollary: (to Courcelle’s Theorem)
If k ≥ 2, then CSP(T (k), {B}) is in P.
109
Uniform CSP and Bounded Treewidth
Note: The notion of treewidth can be defined for
A
arbitrary finite structures A = (A, R1A , . . . , Rm
).
tw(A) = tw(G(A)),
where G(A) = (A, E) is the Gaifman graph of A,
i.e.,
E(a, a′ ) ⇐⇒ a, a′ occur in a tuple of RiA , for some i.
Theorem: Dechter & Pearl - 1989, Freuder - 1990
If k ≥ 2, then CSP(T (k), All) is in P.
Note: Thus, CSP(T (k), All) is a large island of
tractability for uniform CSP.
Moreover, Bodlaender’s Theorem implies that it is
an easily “accessible” island.
110
Logical Aspects of Bounded Treewidth
Fact:
Fix an integer k ≥ 2.
• The tractability of CSP(T (k), All) can be explained in terms of expressibility in k-Datalog.
• There are tight connections between having
tw(A) < k and the canonical query QA being definable in a fragment of first-order logic
with k variables.
K . . . & Vardi – 1998
Dalmau, K . . ., Vardi – 2002
111
Finite-Variable Logics
Definition: Fix an integer k ≥ 2.
• FOk is the collection of all first-order formulas
with k distinct variables.
• Lk is the collection of all FOk -formulas built
using atomic formulas, ∧, and ∃ only.
Example: Let Cn be the n-element cycle, n ≥ 3.
The canonical conjunctive query QCn is expressible
in L3 .
For instance, QC4 is logically equivalent to
(∃x∃y∃z)(E(x, y)∧E(y, z)∧(∃y)(E(z, y)∧E(y, x))).
It is no accident that tw(Cn ) = 2 ...
112
Bounded Treewidth & Finite-Variable Logics
Theorem: Fix an integer k ≥ 2.
• If A ∈ T (k), then the canonical conjunctive
query QA is expressible in Lk .
• If B is a fixed structure, then CSP(T (k), {B})
is expressible in k-Datalog.
Proof:
• Use the construction of A as a partial k-tree.
• Combine the above with the characterization
of expressibility of non-uniform CSP in k-Datalog
in terms of the (∃, k)-pebble game.
Corollary: CSP(T (k), All) is in P.
Algorithm: Determine whether the Spoiler or
the Duplicator wins the (∃, k)-pebble game.
113
Bounded Treewidth & Finite-Variable Logics
Question: When is QA definable in Lk ?
Definition: A and B are homomorphically equivalent, denoted A ∼h B, if there are homomorphisms h : A → B and h′ : B → A.
Theorem: Fix a k and a finite structure A.
Then the following are equivalent:
• QA is definable in Lk .
• There is some B ∈ T (k) such that A ∼h B.
• core(A) ∈ T (k).
114
Cores
Definition: We say that a structure B is the core
of a structure A if
• B is a submodel of A
• There is no homomorphism h : B → B′ from
B to a proper submodel B′ of B.
Examples:
• core(Kk ) = Kk
• core(Cn ) = Cn
• core(
⋊
⋉) = K
3
• If H is 2-colorable, then core(H) = K2 .
Note: Cores play an important role in database
query processing and optimization.
115
Bounded Treewidth & Finite-Variable Logics
Definition: Fix an integer k ≥ 2.
H(T (k)) is the class of all structures that are
homomorphically equivalent to a structure in T (k).
Remark: T (k) is properly contained in H(T (k)):
There are 2-colorable graphs of arbitrarily large
treewidth (for instance, k-grids)
Theorem : Let k ≥ 2 be an integer.
• If B is a fixed structure, then CSP(H(T (k)), {B})
is definable in k-Datalog.
• CSP(H(T (k)), All) is definable in LFP2k ,
Least Fixed-Point Logic with 2k variables.
Algorithm: Determine whether the Spoiler
or the Duplicator wins the (∃, k)-pebble game.
116
New Islands of Tractability
Theorem: Fix an integer k ≥ 2.
• CSP(H(T (k)), All) is in P.
Good News!
• Determining membership in H(T (k)) is an NPcomplete problem.
Bad News!
Conclusion:
• We have discovered new islands of tractability
that are larger than T (k).
• Unfortunately, these islands are “inaccessible”.
117
Classification Theorem
Question: Are there islands of tractability of the
form CSP(A, All) larger than H(T (k))?
Answer: No!
(modulo a complexity-theoretic assumption)
Theorem: Grohe – 2003
Assume that FPT 6= W [1].
If A is a r.e. class of finite structures over some vocabulary such that CSP(A, All) is in P, then there
is a k ≥ 2 such that A ⊆ H(T (k)).
Note: FPT 6= W [1] is the analog of P 6= NP for
parametrized complexity.
Conclusion: The classes H(T (k)) constitute the
largest islands A of tractability for CSP(A, All).
118
The Archipelago of Tractability for CSP
Classification Program for CSP:
Discover all islands of tractability of CSP:
Identify all classes A and B of finite structures such
that CSP(A, B) is in P.
Current Status:
• All islands of tractability of the form
CSP(A, All)
have been discovered.
• All islands of tractability of the form
CSP(All, {B})
with |B| ≤ 3 have been discovered.
• Much more remains to be done. So, the quest
goes on ...
119
Bounded Treewidth & Finite-Variable Logics
Question: Is there a more precise relationship
between T (k) and Lk ?
Theorem: Fix a k and a finite structure A.
Then the following are equivalent:
• A ∈ T (k)
• QA can be transformed to an Lk -sentence by
repeatedly applying the following rewrite rules
to subformulas of QA :
– Associativity and Commutativity of ∧
– (∃x)(ϕ ∧ ψ) 7→ (∃x)ϕ ∧ ψ, provided x is
not free in ψ.
– (∃x)ϕ 7→ (∃y)ϕ[x/y], provided y is not
free in θ.
Note: The above rewrite rules are routinely used
in database query processing and optimization.
120
Concluding Remarks
Constraint Satisfaction is a meeting point of
• Computational Complexity
• Database Theory
• Logic
• Universal Algebra.
121