Approximate Counting
(the number of satisfying assignments in a DNF formula)
Up till now we have mostly talked about optimization problems, where the goal of the algorithm
was to find a (legal) solution with maximum/minimum value/cost. We described approximation
algorithms that found solutions whose value was relatively close to optimal. Today we are going
to talk about a different type of problem: counting problems. Namely, we would like to find the
number of solutions. For example, we may want to know the number of perfect matchings
in a graph. The class #P contains all counting problems where the corresponding decision problem
is in NP. There are very few such problems that can be solved exactly in polynomial time: one is
counting the number of spanning trees in a graph and the other is counting the number of perfect
matchings in a planar graph. In other cases we can find approximate solutions, in particular, using
randomized algorithms.
Specifically, for a counting problem D, if we denote the number of solutions for an input I by
#I, then we are interested in a randomized algorithm such that for every given ε, δ, it outputs a
value Î such that with probability at least 1 − δ,

$$(1 - \epsilon)\,\#I \;\leq\; \hat{I} \;\leq\; (1 + \epsilon)\,\#I .$$
Comment: It suffices to ensure that the above holds with probability, say, at least 2/3. In order
to get confidence 1 − δ we simply run the algorithm Θ(log(1/δ)) times and take the median value
among the outputs (verifying this is left as an exercise).
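To make the amplification concrete, here is a minimal Python sketch, assuming a hypothetical helper run_once() that returns an estimate lying in the desired range with probability at least 2/3; the constant in the repetition count is one valid choice, not the optimal one:

```python
import math
import statistics

def amplify(run_once, delta):
    """Median trick: boost a 2/3-confidence estimator to confidence 1 - delta.

    run_once is any zero-argument function whose output is a good estimate
    with probability >= 2/3; we repeat it Theta(log(1/delta)) times and
    return the median of the outputs.
    """
    reps = 2 * math.ceil(24 * math.log(1.0 / delta)) + 1  # odd, Theta(log(1/delta))
    return statistics.median(run_once() for _ in range(reps))
```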
In this lecture we’ll be interested in the problem of finding the number of satisfying assignments
of a given DNF (Disjunctive Normal Form) formula (this is an “Or” of terms, where each term is
an “And” of literals, e.g. (x1 ∧ x̄2) ∨ (x̄1 ∧ x3)). Note that as opposed to a CNF formula, it is easy
to find some satisfying assignment α = α1, …, αn ∈ {0, 1}^n if one exists: consider any term
Tj. For each variable xi that appears un-negated in Tj, set αi = 1, and for each variable xi that
appears negated, set αi = 0. Set all other variables arbitrarily. However, it is NP-hard to compute
the number of satisfying assignments (exactly).
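For concreteness, a small Python sketch of this construction; the encoding of a term as a list of signed variable indices (e.g. [1, -2] for x1 ∧ x̄2) is our own convention, not something fixed by the notes:

```python
def satisfying_assignment(term, n):
    """Return an assignment alpha in {0,1}^n satisfying the given DNF term.

    A term is a list of signed indices: +i stands for x_i, -i for its negation.
    Variables appearing in the term get their forced value; the rest are arbitrary.
    """
    alpha = [0] * n  # "set all other variables arbitrarily": here, to 0
    for lit in term:
        alpha[abs(lit) - 1] = 1 if lit > 0 else 0
    return tuple(alpha)

# Example: the term (x1 AND NOT x2) over n = 3 variables.
print(satisfying_assignment([1, -2], 3))  # (1, 0, 0)
```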
To see why exact counting is NP-hard, assume we had an algorithm that computes the number of
satisfying assignments (exactly) for any given DNF formula. Then, given a CNF formula φ, consider
the negation of this formula, ¬φ. By pushing the negation inward (De Morgan’s laws) we get

$$\varphi' = \neg\varphi = \neg(C_1 \wedge C_2 \wedge \ldots \wedge C_m) = \neg C_1 \vee \neg C_2 \vee \ldots \vee \neg C_m = T_1 \vee T_2 \vee \ldots \vee T_m \qquad (1)$$

where if $C_j = \ell_1 \vee \ell_2 \vee \ldots \vee \ell_k$, then $T_j = \neg C_j = \neg\ell_1 \wedge \neg\ell_2 \wedge \ldots \wedge \neg\ell_k$; that is, we get a DNF
formula of exactly the same size as the original CNF formula. Since φ′(x) = ¬φ(x) for every x, the
formula φ is not satisfiable, that is, φ(x) = 0 for every x, if and only if φ′(x) = 1 for every x, that
is, if and only if the number of satisfying assignments for φ′ is 2^n. Therefore, if we had an exact
counting algorithm for DNF formulas, then we could solve SAT, which is NP-complete.
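The transformation in Equation (1) is purely syntactic; a sketch in the same signed-index encoding as above, with a CNF formula given as a list of clauses:

```python
def negate_cnf_to_dnf(cnf):
    """De Morgan: NOT(C1 AND ... AND Cm) = NOT(C1) OR ... OR NOT(Cm).

    Negating a clause (an OR of literals) flips every literal and turns the
    clause into a term (an AND of literals), so the output is a DNF formula
    of exactly the same size.
    """
    return [[-lit for lit in clause] for clause in cnf]

# NOT((x1 OR x2) AND (NOT x1 OR x3)) = (NOT x1 AND NOT x2) OR (x1 AND NOT x3)
print(negate_cnf_to_dnf([[1, 2], [-1, 3]]))  # [[-1, -2], [1, -3]]
```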
1 A First Attempt (Simple Sampling Algorithm)
For a given DNF formula φ, let p_sat(φ) be the probability that a uniformly selected random
assignment is a satisfying assignment. By definition, #φ = 2^n · p_sat(φ), so that if we can
approximate p_sat(φ), then we can approximate #φ.
Suppose we simply select, independently and uniformly at random, s assignments, and set p̂ to
be the fraction of assignments (among the selected assignments) that are satisfying assignments.
How big should we set s so that we get a good estimate with high probability (that is, an estimate
that is within (1 ± ǫ) from the correct value with probability at least 1 − δ)?
To determine a sufficient value of s, we use the multiplicative version of Chernoff’s
inequality (for the special case of independent, identically distributed Bernoulli random variables).
Let χ1, …, χs be s independent 0/1 random variables, where Pr[χi = 1] = p (so that Exp[χi] = p
as well) for every i (i.e., the random variables are identically distributed). Then, for every γ ∈ (0, 1],
the following bounds hold:
$$\Pr\left[\frac{1}{s}\sum_{i=1}^{s}\chi_i > (1+\gamma)p\right] < \exp\left(-\gamma^2 p s/3\right)$$

and

$$\Pr\left[\frac{1}{s}\sum_{i=1}^{s}\chi_i < (1-\gamma)p\right] < \exp\left(-\gamma^2 p s/2\right) .$$
Therefore, setting γ = ε and p = p_sat(φ), if we take $s \geq \frac{3}{\epsilon^2 p}\ln(2/\delta)$, then with probability at least
1 − δ we have that (1 − ε)p_sat(φ) ≤ p̂ ≤ (1 + ε)p_sat(φ).
There are two problems with this approach. The first is that it seems we need to know
p_sat(φ) in order to determine the sample size, but this is just what we don’t know... This can be
overcome if we know some lower bound on p_sat(φ), since we can use the lower bound instead. That
is, suppose we know in advance that p_sat(φ) ≥ p0 for some p0 (but we don’t know any more than
that). Then we can set $s \geq \frac{3}{\epsilon^2 p_0}\ln(2/\delta)$. This is fine as long as p0 is relatively large, that is, at least
1/n^c for some constant c. The second problem is that if the fraction of satisfying assignments is
small, and in particular exponentially small, then the required sample size is huge and we don’t
get an efficient algorithm.
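A minimal sketch of this simple sampler, using the encoding from before; the lower bound p0 is an input, as just discussed, and the sample size follows the Chernoff-based bound:

```python
import math
import random

def satisfies(dnf, alpha):
    """Check whether assignment alpha (tuple of 0/1 values) satisfies the DNF."""
    return any(all((alpha[abs(lit) - 1] == 1) == (lit > 0) for lit in term)
               for term in dnf)

def naive_count(dnf, n, eps, delta, p0):
    """Estimate #phi = 2^n * p_sat by uniform sampling.

    Needs s >= 3/(eps^2 * p0) * ln(2/delta) samples, which is prohibitive
    when p0 (the lower bound on p_sat) is exponentially small.
    """
    s = math.ceil(3.0 / (eps * eps * p0) * math.log(2.0 / delta))
    hits = sum(satisfies(dnf, tuple(random.randint(0, 1) for _ in range(n)))
               for _ in range(s))
    return (hits / s) * (2 ** n)
```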
2 A (Fully) Polynomial-Time Approximation Algorithm
We are going to define another problem, and then see how to use the algorithm we design for this
problem for our purposes.
Let S1, …, Sm be subsets of some domain U (where U and these sets may be very large). We
would like to compute (approximate) the size of the union $\bigcup_{j=1}^{m} S_j$. In principle it is possible to
compute this value by applying the Inclusion-Exclusion formula, but this takes time 2^m since we
need to compute 2^m terms.¹ Our algorithm works under the following assumptions, which hold for
every j = 1, …, m:
¹ Recall that the Inclusion-Exclusion principle says that
$$\Bigl|\bigcup_{j=1}^{m} S_j\Bigr| = \sum_{j=1}^{m}|S_j| - \sum_{1\le j_1<j_2\le m}|S_{j_1}\cap S_{j_2}| + \sum_{1\le j_1<j_2<j_3\le m}|S_{j_1}\cap S_{j_2}\cap S_{j_3}| - \cdots + (-1)^{m-1}\,|S_1\cap S_2\cap\cdots\cap S_m| .$$
1. It is easy to compute |Sj |;
2. We can select a uniform element in Sj ;
3. For every u ∈ U it is easy to check whether u belongs to Sj .
Before describing and analyzing an algorithm, let’s see why this would be helpful for us. In our
case we shall define Sj to be the set of satisfying assignments for each term Tj. Then the set
of satisfying assignments of the formula is exactly the union of these sets, and we want to know
the size of this union. We next check the assumptions:
1. If a term Tj contains k variables, then |Sj| = 2^{n−k} (since each of the k variables appearing in
Tj must be set in a unique way, and all other variables can be set arbitrarily).
2. In order to sample uniformly from Sj, we simply set the variables in Tj in the unique manner
that satisfies Tj and set each other variable with equal probability to 1 or 0.
3. Finally, given an assignment we can easily check whether it satisfies Tj and is hence in Sj.
Note that each of these operations takes time linear in n.
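A sketch of these three primitives in Python (same encoding as before); as noted, each runs in time linear in n, and term_size assumes the term contains no contradictory pair of literals:

```python
import random

def term_size(term, n):
    """|S_j| = 2^(n-k) for a term on k distinct variables (assumed consistent)."""
    return 2 ** (n - len({abs(lit) for lit in term}))

def sample_from_term(term, n):
    """Uniform element of S_j: fix the term's variables, flip coins for the rest."""
    alpha = [random.randint(0, 1) for _ in range(n)]
    for lit in term:
        alpha[abs(lit) - 1] = 1 if lit > 0 else 0
    return tuple(alpha)

def in_term(term, alpha):
    """Membership test: does assignment alpha satisfy T_j (i.e., lie in S_j)?"""
    return all((alpha[abs(lit) - 1] == 1) == (lit > 0) for lit in term)
```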
Algorithm II
For i = 1, …, s (where s will be set in the analysis), do:
1. Select a set St (an index t) with probability $\frac{|S_t|}{\sum_{j=1}^{m}|S_j|}$.
2. Uniformly select an element u ∈ St (for the selected t).
3. Find the minimal j such that u ∈ Sj.
4. If t = j then set χi = 1, otherwise, χi = 0.
Set $\chi = \sum_{i=1}^{s}\chi_i$, and output
$$F = \frac{\chi}{s}\cdot\sum_{j=1}^{m}|S_j| .$$
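A direct Python sketch of Algorithm II, specialized to the DNF sets S_j via the three primitives sketched earlier (term_size, sample_from_term, in_term); the standard library’s random.choices implements the weighted selection of Step 1:

```python
import random

def algorithm_ii(dnf, n, s):
    """Estimate |S_1 ∪ ... ∪ S_m| for the satisfying sets of a DNF's m terms."""
    sizes = [term_size(term, n) for term in dnf]
    total = sum(sizes)
    chi = 0
    for _ in range(s):
        # Step 1: select index t with probability |S_t| / sum_j |S_j|.
        t = random.choices(range(len(dnf)), weights=sizes, k=1)[0]
        # Step 2: select u uniformly in S_t.
        u = sample_from_term(dnf[t], n)
        # Step 3: find the minimal j with u in S_j (u is in S_t, so one exists).
        j = next(j for j, term in enumerate(dnf) if in_term(term, u))
        # Step 4: chi_i = 1 iff t is that minimal index.
        chi += (t == j)
    return (chi / s) * total
```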
As a sanity check consider two extreme cases: (1) The sets are disjoint. In this case $\bigl|\bigcup_{j=1}^{m} S_j\bigr| = \sum_{j=1}^{m}|S_j|$,
and indeed, we get that χi = 1 for every i, so that χ = s and F is exactly correct. (2)
The sets are all exactly the same. In this case $\bigl|\bigcup_{j=1}^{m} S_j\bigr| = \frac{1}{m}\sum_{j=1}^{m}|S_j|$, and indeed, each χi is set
to 1 if and only if t = 1, and this occurs with probability 1/m, so that we expect χ/s to be close
to 1/m.
Claim 1 Define
$$\eta = \frac{\bigl|\bigcup_{j=1}^{m} S_j\bigr|}{\sum_{j=1}^{m}|S_j|} .$$
Then, for each i = 1, …, s, Exp[χi] = Pr[χi = 1] = η.
[Figure 1: An illustration for the proof of Claim 1. For each set St and element u ∈ U, the entry in the
matrix is non-empty (i.e., is either + or ∗) if and only if u ∈ St. The number of non-empty entries is therefore
$\sum_{j=1}^{m}|S_j|$. For each u ∈ U that belongs to some set Sj, we mark by ∗ the entry corresponding to the first
such set. Therefore, the number of ∗ entries in the matrix is $\bigl|\bigcup_{j=1}^{m} S_j\bigr|$. The algorithm samples s non-empty
entries independently, uniformly at random, and counts how many among them are ∗ entries.]
Proof: Consider all pairs (t, u) such that t ∈ {1, …, m} and u ∈ St. The number of such pairs is
just $\sum_{j=1}^{m}|S_j|$ (since for each t ∈ {1, …, m}, the number of u’s that can be in a pair (t, u) is |St|).
Note that the first two steps in the algorithm select a pair (t, u). What is the probability that any
particular pair (t, u) is selected? It is
$$\Pr[t \text{ is selected}]\cdot\Pr[u \text{ is selected}\mid t \text{ is selected}] = \frac{|S_t|}{\sum_{j=1}^{m}|S_j|}\cdot\frac{1}{|S_t|} = \frac{1}{\sum_{j=1}^{m}|S_j|} . \qquad (2)$$
That is, every pair (t, u) (such that t ∈ {1, …, m} and u ∈ St) is selected with equal probability
$1/\sum_{j=1}^{m}|S_j|$.
Now, define for each $u \in \bigcup_{j=1}^{m} S_j$ the index $j(u) = \min_j\{u \in S_j\}$. By definition of the algorithm,
the random variable χi gets value 1 if and only if t = j(u). Since for every $u \in \bigcup_{j=1}^{m} S_j$ the index
j(u) is uniquely defined, the number of pairs (t, u) such that t = j(u) is exactly $\bigl|\bigcup_{j=1}^{m} S_j\bigr|$. Therefore,
the probability that χi = 1 is $\bigl|\bigcup_{j=1}^{m} S_j\bigr| / \sum_{j=1}^{m}|S_j| = \eta$, as claimed.
Recall that the output of the algorithm is $F = \frac{\chi}{s}\cdot\sum_{j=1}^{m}|S_j|$. Therefore,
$$\mathrm{Exp}[F] = \mathrm{Exp}\Bigl[\frac{1}{s}\sum_{i=1}^{s}\chi_i\Bigr]\cdot\sum_{j=1}^{m}|S_j| = \eta\cdot\sum_{j=1}^{m}|S_j| = \Bigl|\bigcup_{j=1}^{m} S_j\Bigr| . \qquad (3)$$
Since the χi’s are independent random variables, we can apply the multiplicative version of Chernoff’s inequality that we already used before to get the following:
Theorem 2 If $s \geq \frac{3m}{\epsilon^2}\ln(2/\delta)$, then with probability at least 1 − δ,
$$(1-\epsilon)\cdot\Bigl|\bigcup_{j=1}^{m} S_j\Bigr| \;\leq\; F \;\leq\; (1+\epsilon)\cdot\Bigl|\bigcup_{j=1}^{m} S_j\Bigr| .$$
Proof: By Claim 1 we have that Pr[χi = 1] = η. Note that since each u can belong to at most m
sets, η ≥ 1/m. Thus,
$$\Pr\Bigl[\frac{1}{s}\sum_{i=1}^{s}\chi_i > (1+\epsilon)\eta\Bigr] < \exp(-\epsilon^2\eta s/3) \leq \exp(-\ln(2/\delta)) = \delta/2 \qquad (4)$$
and similarly
$$\Pr\Bigl[\frac{1}{s}\sum_{i=1}^{s}\chi_i < (1-\epsilon)\eta\Bigr] < \exp(-\epsilon^2\eta s/2) \leq \exp(-\ln(2/\delta)) = \delta/2 . \qquad (5)$$
Therefore, with probability at least 1 − δ we have that
$$(1-\epsilon)\eta \;\leq\; \frac{\chi}{s} \;\leq\; (1+\epsilon)\eta \qquad (6)$$
and the theorem follows by definition of F and η.
Returning to our original problem of approximating the number of satisfying assignments of a
DNF formula, recall that for each term Tj, the set Sj was the set of assignments that satisfy Tj.
Each iteration of the algorithm takes time O(nm), where the main cost is finding j(u) given u (we
have to go over at most all m sets, and for each we need to check if u belongs to it). Multiplying
this by the number of iterations, we get a total running time of $O\Bigl(\frac{nm^2}{\epsilon^2}\ln(1/\delta)\Bigr)$.
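Putting the sketches together on a toy instance (all function names are from the sketches above; the sample size follows Theorem 2):

```python
import math

dnf = [[1, -2], [-1, 3]]  # (x1 ∧ x̄2) ∨ (x̄1 ∧ x3) over n = 3 variables
n, eps, delta = 3, 0.1, 0.01
s = math.ceil(3 * len(dnf) / eps**2 * math.log(2 / delta))
print(algorithm_ii(dnf, n, s))  # with high probability, within 10% of 4
```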