Approximate Counting (the Number of Satisfying Assignments of a DNF Formula)

Up till now we have mostly talked about optimization problems, where the goal of the algorithm was to find a (legal) solution with maximum/minimum value/cost. We described approximation algorithms that find solutions whose value is relatively close to optimal. Today we are going to talk about a different type of problem: counting problems. Namely, we would like to find the number of solutions. For example, we may want to know the number of perfect matchings in a graph. The class #P contains all counting problems where the corresponding decision problem is in NP. There are very few such problems that can be solved exactly in polynomial time: one is counting the number of spanning trees in a graph, and another is counting the number of perfect matchings in a planar graph. In other cases we can find approximate solutions, in particular using randomized algorithms. Specifically, for a counting problem $D$, if we denote the number of solutions for an input $I$ by $\#I$, then we are interested in a randomized algorithm that, for every given $\epsilon, \delta$, outputs a value $\hat{I}$ such that with probability at least $1 - \delta$,
$$(1 - \epsilon)\,\#I \;\le\; \hat{I} \;\le\; (1 + \epsilon)\,\#I.$$

Comment: It suffices to ensure that the above holds with probability, say, at least $2/3$. In order to get confidence $1 - \delta$ we simply run the algorithm $\Theta(\log(1/\delta))$ times and take the median value among the outputs (verifying this is left as an exercise).

In this lecture we will be interested in the problem of finding the number of satisfying assignments of a given DNF (Disjunctive Normal Form) formula. This is an "or" of terms, where each term is an "and" of literals, e.g., $(x_1 \wedge \bar{x}_2) \vee (\bar{x}_1 \wedge x_3)$. Note that, as opposed to a CNF formula, it is easy to find some satisfying assignment $\alpha = \alpha_1, \ldots, \alpha_n \in \{0,1\}^n$ if one exists: Consider any term $T_j$.
For each variable $x_i$ that appears un-negated in $T_j$, set $\alpha_i = 1$, and for each variable $x_i$ that appears negated, set $\alpha_i = 0$. Set all other variables arbitrarily. However, it is NP-hard to compute the number of satisfying assignments exactly. To see why this is true, assume we had an algorithm that computes the number of satisfying assignments exactly for any given DNF formula. Then, given a CNF formula $\phi$, consider its negation $\neg\phi$:
$$\phi' = \neg\phi = \neg(C_1 \wedge C_2 \wedge \cdots \wedge C_m) = \neg C_1 \vee \neg C_2 \vee \cdots \vee \neg C_m = T_1 \vee T_2 \vee \cdots \vee T_m, \qquad (1)$$
where if $C_j = \ell_1 \vee \ell_2 \vee \cdots \vee \ell_k$, then $T_j = \neg C_j = \neg\ell_1 \wedge \neg\ell_2 \wedge \cdots \wedge \neg\ell_k$. That is, we get a DNF formula of exactly the same size as the original CNF formula. Since $\phi'(x) = \neg\phi(x)$ for every $x$, the formula $\phi$ is unsatisfiable (i.e., $\phi(x) = 0$ for every $x$) if and only if $\phi'(x) = 1$ for every $x$, that is, if and only if the number of satisfying assignments of $\phi'$ is $2^n$. Therefore, if we had an exact counting algorithm for DNF formulas, then we could solve SAT, which is NP-complete.

1 A First Attempt (Simple Sampling Algorithm)

For a given DNF formula $\phi$, let $p_{\mathrm{sat}}(\phi)$ be the probability that a uniformly selected random assignment is a satisfying assignment. By definition, $\#\phi = 2^n \cdot p_{\mathrm{sat}}(\phi)$, so if we can approximate $p_{\mathrm{sat}}(\phi)$, then we can approximate $\#\phi$. Suppose we simply select, independently and uniformly at random, $s$ assignments, and set $\hat{p}$ to be the fraction of satisfying assignments among the selected ones. How big should we set $s$ so that we get a good estimate with high probability (that is, an estimate within $(1 \pm \epsilon)$ of the correct value with probability at least $1 - \delta$)?
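Before analyzing how large $s$ must be, the simple sampling scheme itself can be sketched in code. This is a minimal illustrative sketch, not part of the lecture: in particular, representing a DNF formula as a list of terms, each a list of signed variable indices (e.g. `[1, -2]` for $x_1 \wedge \bar{x}_2$), is an assumption made here for concreteness.

```python
import random

def eval_dnf(terms, assignment):
    """Evaluate a DNF formula: an OR of terms, each an AND of literals.

    terms: list of terms; each term is a list of signed variable indices,
    e.g. [1, -2] means (x1 AND NOT x2).
    assignment: dict mapping variable index (1..n) -> 0/1.
    """
    return any(all(assignment[abs(l)] == (1 if l > 0 else 0) for l in term)
               for term in terms)

def estimate_psat(terms, n, s, rng=random.Random(0)):
    """Estimate p_sat(phi) as the fraction of s uniformly random
    assignments (over n variables) that satisfy the formula."""
    hits = 0
    for _ in range(s):
        a = {i: rng.randint(0, 1) for i in range(1, n + 1)}
        if eval_dnf(terms, a):
            hits += 1
    return hits / s
```

The estimate of $\#\phi$ is then $2^n$ times the returned value; as discussed next, the number of samples needed for a good estimate grows like $1/p_{\mathrm{sat}}(\phi)$, which is the weakness of this approach.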
To determine a sufficient value of $s$, we use the multiplicative version of Chernoff's inequality (for the special case of independent, identically distributed Bernoulli random variables). Let $\chi_1, \ldots, \chi_s$ be independent 0-1 random variables, where $\Pr[\chi_i = 1] = p$ (so that $\mathrm{Exp}[\chi_i] = p$ as well) for every $i$. Then, for every $\gamma \in (0,1]$, the following bounds hold:
$$\Pr\left[\frac{1}{s}\sum_{i=1}^{s} \chi_i > (1+\gamma)p\right] < \exp(-\gamma^2 p s / 3)$$
and
$$\Pr\left[\frac{1}{s}\sum_{i=1}^{s} \chi_i < (1-\gamma)p\right] < \exp(-\gamma^2 p s / 2).$$
Therefore, setting $\gamma = \epsilon$ and $p = p_{\mathrm{sat}}(\phi)$, if we take $s \ge \frac{3}{\epsilon^2 p}\ln(2/\delta)$, then with probability at least $1 - \delta$ we have that $(1-\epsilon)p_{\mathrm{sat}}(\phi) \le \hat{p} \le (1+\epsilon)p_{\mathrm{sat}}(\phi)$.

There are two problems with this approach. The first is that it seems we need to know $p_{\mathrm{sat}}(\phi)$ in order to determine the sample size, but this is just what we don't know... This can be overcome if we know some lower bound on $p_{\mathrm{sat}}(\phi)$, since we can use the lower bound instead. That is, suppose we know in advance that $p_{\mathrm{sat}}(\phi) \ge p_0$ for some $p_0$ (but we don't know any more than that). Then we can set $s \ge \frac{3}{\epsilon^2 p_0}\ln(2/\delta)$. This is OK as long as $p_0$ is relatively large, that is, at least $1/n^c$ for some constant $c$. But if the fraction of satisfying assignments is small, and in particular exponentially small, then we don't get an efficient algorithm.

2 A (Fully) Polynomial-Time Approximation Algorithm

We are going to define another problem, and then see how to use the algorithm we design for it for our purposes. Let $S_1, \ldots, S_m$ be subsets of some domain $U$ (where $U$ and these sets may be very large). We would like to compute (approximate) the size of the union $\bigcup_{j=1}^{m} S_j$. In principle it is possible to compute this value by applying the inclusion-exclusion formula, but this takes time $2^m$ since we need to compute $2^m$ terms.¹

¹Recall that the inclusion-exclusion principle says that
$$\left|\bigcup_{j=1}^{m} S_j\right| = \sum_{j=1}^{m} |S_j| - \sum_{1 \le j_1 < j_2 \le m} |S_{j_1} \cap S_{j_2}| + \sum_{1 \le j_1 < j_2 < j_3 \le m} |S_{j_1} \cap S_{j_2} \cap S_{j_3}| - \cdots + (-1)^{m-1} |S_1 \cap S_2 \cap \cdots \cap S_m|.$$

Our algorithm works under the following assumptions, which hold for every $j = 1, \ldots, m$: 1. It is easy to compute $|S_j|$; 2.
We can select an element of $S_j$ uniformly at random; 3. For every $u \in U$ it is easy to check whether $u$ belongs to $S_j$.

Before describing and analyzing an algorithm, let us see why this would be helpful for us. In our case we define $S_j$ to be the set of satisfying assignments of the term $T_j$. Then the set of satisfying assignments of the whole formula is exactly the union of these sets, and we want to know the size of this union. We next check the assumptions:

1. If a term $T_j$ contains $k$ variables, then $|S_j| = 2^{n-k}$ (since each of the $k$ variables must be set in a unique way, and all other variables can be set arbitrarily).
2. In order to sample uniformly from $S_j$, we simply set the variables in $T_j$ in the unique manner that satisfies $T_j$, and set each other variable to 0 or 1 with equal probability.
3. Finally, given an assignment, we can easily check whether it satisfies $T_j$ and is hence in $S_j$.

Note that each of these operations takes time linear in $n$.

Algorithm II
For $i = 1, \ldots, s$ (where $s$ will be set in the analysis), do:
1. Select a set $S_t$ (an index $t$) with probability $\frac{|S_t|}{\sum_{j=1}^{m} |S_j|}$.
2. Uniformly select an element $u \in S_t$ (for the selected $t$).
3. Find the minimal $j$ such that $u \in S_j$.
4. If $t = j$ then set $\chi_i = 1$; otherwise, set $\chi_i = 0$.
Set $\chi = \sum_{i=1}^{s} \chi_i$, and output $F = \frac{\chi}{s} \cdot \sum_{j=1}^{m} |S_j|$.

As a sanity check, consider two extreme cases: (1) The sets are disjoint. In this case $\left|\bigcup_{j=1}^{m} S_j\right| = \sum_{j=1}^{m} |S_j|$, and indeed we get that $\chi_i = 1$ for every $i$, so that $\chi = s$ and $F$ is exactly correct. (2) The sets are all exactly the same. In this case $\left|\bigcup_{j=1}^{m} S_j\right| = \frac{1}{m}\sum_{j=1}^{m} |S_j|$, and indeed each $\chi_i$ is set to 1 if and only if $t = 1$, which occurs with probability $1/m$, so we expect $\chi/s$ to be close to $1/m$.

Claim 1 Define
$$\eta = \frac{\left|\bigcup_{j=1}^{m} S_j\right|}{\sum_{j=1}^{m} |S_j|}.$$
Then, for each $i = 1, \ldots, s$, $\mathrm{Exp}[\chi_i] = \Pr[\chi_i = 1] = \eta$.

Figure 1: An illustration for the proof of Claim 1.
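Algorithm II, instantiated for the DNF sets $S_j$, can be sketched as follows. This is an illustrative sketch rather than the lecture's own code; it reuses the signed-literal term representation from before and assumes each term mentions each variable at most once (so that $|S_j| = 2^{n-k}$ for a term with $k$ literals).

```python
import random

def algorithm_II(terms, n, s, rng=random.Random(0)):
    """Estimate |S_1 u ... u S_m|, where S_j is the set of assignments
    (over n variables) satisfying term j.  Returns F = (chi/s) * sum_j |S_j|.
    """
    sizes = [2 ** (n - len(t)) for t in terms]   # |S_j| = 2^{n-k}
    total = sum(sizes)                           # sum_j |S_j|
    chi = 0
    for _ in range(s):
        # Step 1: select an index t with probability |S_t| / total.
        t = rng.choices(range(len(terms)), weights=sizes)[0]
        # Step 2: uniform element u of S_t -- fix the term's variables,
        # set all other variables uniformly at random.
        a = {i: rng.randint(0, 1) for i in range(1, n + 1)}
        for lit in terms[t]:
            a[abs(lit)] = 1 if lit > 0 else 0
        # Step 3: minimal j such that u is in S_j (u satisfies term j).
        j = next(k for k, term in enumerate(terms)
                 if all(a[abs(l)] == (1 if l > 0 else 0) for l in term))
        # Step 4: success iff the selected set is the minimal one.
        if j == t:
            chi += 1
    return chi / s * total
```

For the disjoint extreme case above (e.g. the two terms $x_1$ and $\bar{x}_1$), every iteration yields $\chi_i = 1$ and the output is exactly $\sum_j |S_j|$.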
In the figure, for each set $S_t$ and element $u \in U$, the corresponding entry of the matrix is non-empty (i.e., is either + or ∗) if and only if $u \in S_t$. The number of non-empty entries is therefore $\sum_{j=1}^{m} |S_j|$. For each $u \in U$ that belongs to some set $S_j$, we mark by ∗ the entry corresponding to the first such set. Therefore, the number of ∗ entries in the matrix is $\left|\bigcup_{j=1}^{m} S_j\right|$. The algorithm samples $s$ non-empty entries independently, uniformly at random, and counts how many of them are ∗ entries.

Proof: Consider all pairs $(t, u)$ such that $t \in \{1, \ldots, m\}$ and $u \in S_t$. The number of such pairs is just $\sum_{j=1}^{m} |S_j|$ (since for each $t \in \{1, \ldots, m\}$, the number of $u$'s that can appear in a pair $(t, u)$ is $|S_t|$). Note that the first two steps of the algorithm select such a pair $(t, u)$. What is the probability that any particular pair $(t, u)$ is selected? It is
$$\Pr[t \text{ is selected}] \cdot \Pr[u \text{ is selected} \mid t \text{ is selected}] = \frac{|S_t|}{\sum_{j=1}^{m} |S_j|} \cdot \frac{1}{|S_t|} = \frac{1}{\sum_{j=1}^{m} |S_j|}. \qquad (2)$$
That is, every pair $(t, u)$ (such that $t \in \{1, \ldots, m\}$ and $u \in S_t$) is selected with equal probability $1/\sum_{j=1}^{m} |S_j|$. Now, define for each $u \in \bigcup_{j=1}^{m} S_j$ the index $j(u) = \min\{j : u \in S_j\}$. By definition of the algorithm, the random variable $\chi_i$ gets value 1 if and only if $t = j(u)$. Since for every $u \in \bigcup_{j=1}^{m} S_j$ the index $j(u)$ is uniquely defined, the number of pairs $(t, u)$ such that $t = j(u)$ is exactly $\left|\bigcup_{j=1}^{m} S_j\right|$. Therefore, the probability that $\chi_i = 1$ is $\left|\bigcup_{j=1}^{m} S_j\right| \big/ \sum_{j=1}^{m} |S_j| = \eta$, as claimed.

Recall that the output of the algorithm is $F = \frac{\chi}{s} \cdot \sum_{j=1}^{m} |S_j|$. Therefore,
$$\mathrm{Exp}[F] = \mathrm{Exp}\left[\frac{1}{s}\sum_{i=1}^{s} \chi_i\right] \cdot \sum_{j=1}^{m} |S_j| = \eta \cdot \sum_{j=1}^{m} |S_j| = \left|\bigcup_{j=1}^{m} S_j\right|. \qquad (3)$$

Since the $\chi_i$'s are independent random variables, we can apply the multiplicative version of Chernoff's inequality that we already used before to get the following:

Theorem 2 If $s \ge \frac{3m}{\epsilon^2}\ln(2/\delta)$, then with probability at least $1 - \delta$,
$$(1-\epsilon)\cdot\left|\bigcup_{j=1}^{m} S_j\right| \;\le\; F \;\le\; (1+\epsilon)\cdot\left|\bigcup_{j=1}^{m} S_j\right|.$$

Proof: By Claim 1 we have that $\Pr[\chi_i = 1] = \eta$.
Note that since each $u$ can belong to at most $m$ sets, $\eta \ge 1/m$. Thus,
$$\Pr\left[\frac{1}{s}\sum_{i=1}^{s} \chi_i > (1+\epsilon)\eta\right] < \exp(-\epsilon^2 \eta s / 3) \le \exp(-\ln(2/\delta)) = \delta/2 \qquad (4)$$
and similarly
$$\Pr\left[\frac{1}{s}\sum_{i=1}^{s} \chi_i < (1-\epsilon)\eta\right] < \exp(-\epsilon^2 \eta s / 2) \le \exp(-\ln(2/\delta)) = \delta/2. \qquad (5)$$
Therefore, with probability at least $1 - \delta$ we have that
$$(1-\epsilon)\eta \le \frac{\chi}{s} \le (1+\epsilon)\eta, \qquad (6)$$
and the theorem follows from the definitions of $F$ and $\eta$.

Returning to our original problem of approximating the number of satisfying assignments of a DNF formula, recall that for each term $T_j$, the set $S_j$ is the set of assignments that satisfy $T_j$. Each iteration of the algorithm takes time $O(nm)$, where the main cost is finding $j(u)$ given $u$ (we may have to go over all $m$ sets, and for each we need to check whether $u$ belongs to it). Multiplying this by the number of iterations, we get a total running time of $O\!\left(\frac{nm^2}{\epsilon^2}\ln(1/\delta)\right)$.
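Putting Theorem 2 together with the DNF instantiation gives the full approximation scheme. The sketch below is illustrative rather than the lecture's own code (the function name `count_dnf` and the term representation, lists of signed variable indices, are our conventions); it sets $s = \lceil \frac{3m}{\epsilon^2}\ln(2/\delta) \rceil$ and runs Algorithm II for that many iterations.

```python
import math
import random

def count_dnf(terms, n, eps, delta, rng=random.Random(0)):
    """Output F with (1-eps)#phi <= F <= (1+eps)#phi, with probability
    at least 1 - delta (Theorem 2).  terms: list of terms, each a list
    of signed variable indices; n: number of variables."""
    m = len(terms)
    s = math.ceil(3 * m / eps ** 2 * math.log(2 / delta))  # Theorem 2
    sizes = [2 ** (n - len(t)) for t in terms]             # |S_j| = 2^{n-k}
    total = sum(sizes)                                     # sum_j |S_j|
    chi = 0
    for _ in range(s):
        t = rng.choices(range(m), weights=sizes)[0]        # pick S_t w.p. |S_t|/total
        a = {i: rng.randint(0, 1) for i in range(1, n + 1)}
        for lit in terms[t]:                               # uniform element u of S_t
            a[abs(lit)] = 1 if lit > 0 else 0
        j = next(k for k, term in enumerate(terms)         # minimal j with u in S_j
                 if all(a[abs(l)] == (1 if l > 0 else 0) for l in term))
        chi += (j == t)
    return chi / s * total
```

For example, on the formula $(x_1 \wedge \bar{x}_2) \vee (\bar{x}_1 \wedge x_3)$ from the introduction, written as `[[1, -2], [-1, 3]]` with $n = 3$, the true count is 4, and the estimate is within $(1 \pm \epsilon)$ of it with the stated probability.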