The Monte Carlo Method

• Let $(X,Y)$ be a point chosen uniformly at random in the $2 \times 2$ square centered at the origin $(0,0)$, with corners $(1,1)$, $(-1,1)$, $(-1,-1)$, and $(1,-1)$.
• Define $Z = 1$ if $X^2 + Y^2 \le 1$ and $Z = 0$ otherwise, so $Z$ indicates that the point lands in the unit circle inscribed in the square.
• $\Pr[Z = 1] = \pi/4$, the ratio of the circle's area to the square's area.

Assume we run this experiment $m$ times, with $Z_i$ being the value of $Z$ at the $i$th run. If $W = \sum_{i=1}^m Z_i$, then
$$E[W] = \sum_{i=1}^m E[Z_i] = \frac{m\pi}{4},$$
so $W' = 4W/m$ is an estimate of $\pi$.

By the Chernoff bound,
$$\Pr[|W' - \pi| \ge \epsilon\pi] = \Pr\!\left[\left|W - \frac{m\pi}{4}\right| \ge \frac{\epsilon m \pi}{4}\right] = \Pr[|W - E[W]| \ge \epsilon E[W]] \le 2e^{-m\pi\epsilon^2/12}.$$

Def: A randomized algorithm gives an $(\epsilon,\delta)$-approximation for the value $V$ if the output $X$ of the algorithm satisfies $\Pr[|X - V| \le \epsilon V] \ge 1 - \delta$.

The above method for estimating $\pi$ gives an $(\epsilon,\delta)$-approximation, as long as $\epsilon < 1$ and $m$ is large enough; taking
$$m \ge \frac{12\ln(2/\delta)}{\pi\epsilon^2} \quad\text{gives}\quad 2e^{-m\pi\epsilon^2/12} \le \delta.$$

Thm 1: Let $X_1,\dots,X_m$ be independent and identically distributed indicator random variables with $\mu = E[X_i]$. If $m \ge \frac{3\ln(2/\delta)}{\epsilon^2\mu}$, then
$$\Pr\!\left[\left|\frac{1}{m}\sum_{i=1}^m X_i - \mu\right| \ge \epsilon\mu\right] \le \delta.$$
I.e., $m$ samples provide an $(\epsilon,\delta)$-approximation for $\mu$.
Pf: Exercise!

Def: FPRAS: fully polynomial randomized approximation scheme. An FPRAS for a problem is a randomized algorithm which, given an input $x$ and any parameters $\epsilon$ and $\delta$ with $0 < \epsilon, \delta < 1$, outputs an $(\epsilon,\delta)$-approximation to $V(x)$ in time $\operatorname{poly}(1/\epsilon, \ln(1/\delta), |x|)$.

Def: DNF counting problem: counting the number of satisfying assignments of a Boolean formula in disjunctive normal form (DNF).
Def: A DNF formula is a disjunction of clauses $C_1 \vee C_2 \vee \cdots \vee C_t$, where each clause is a conjunction of literals.
Eg. $(x_1 \wedge \overline{x_2} \wedge x_3) \vee (x_2 \wedge x_4) \vee (\overline{x_1} \wedge x_3 \wedge x_4)$.

Counting the number of satisfying assignments of a DNF formula is actually #P-complete. Counting the number of Hamiltonian cycles in a graph and counting the number of perfect matchings in a bipartite graph are other examples of #P-complete problems.

A naïve algorithm for the DNF counting problem:
Input: A DNF formula $F$ with $n$ variables.
Output: $Y$ = an approximation of $c(F)$, the number of satisfying assignments of $F$.
1. $X \leftarrow 0$.
2. For $k = 1$ to $m$, do:
   (a) Generate an assignment for the $n$ variables, chosen uniformly at random.
   (b) If the assignment satisfies $F$, then $X \leftarrow X + 1$.
3. Return $Y \leftarrow (X/m)\,2^n$.

Analysis: Let $X_k = 1$ if the $k$th iteration in the algorithm generated a satisfying assignment, and $X_k = 0$ otherwise. Then $\Pr[X_k = 1] = c(F)/2^n$. Let $X = \sum_{k=1}^m X_k$, so $E[X] = m\,c(F)/2^n$ and
$$E[Y] = \frac{E[X]\,2^n}{m} = c(F).$$
By Theorem 1, $X/m$ gives an $(\epsilon,\delta)$-approximation of $c(F)/2^n$, and hence $Y$ gives an $(\epsilon,\delta)$-approximation of $c(F)$, when $m \ge 3 \cdot 2^n \ln(2/\delta) / (\epsilon^2 c(F))$. If $c(F) \ge 2^n/\operatorname{poly}(n)$, this is not too bad: $m$ is polynomial. But if $c(F) = \operatorname{poly}(n)$, then $m = \Omega(2^n/c(F))$ is exponential in $n$!

Note that if clause $C_i$ has $l_i$ literals, then there are exactly $2^{n-l_i}$ satisfying assignments for $C_i$. Let $SC_i$ denote the set of assignments that satisfy clause $i$, and let
$$U = \{(i,a) : 1 \le i \le t \text{ and } a \in SC_i\}, \qquad |U| = \sum_{i=1}^t |SC_i| = \sum_{i=1}^t 2^{n-l_i}.$$
We want to estimate $c(F) = \left|\bigcup_{i=1}^t SC_i\right|$. Define
$$S = \{(i,a) : 1 \le i \le t,\ a \in SC_i,\ \text{and } a \notin SC_j \text{ for } j < i\},$$
which pairs each satisfying assignment with the first clause it satisfies, so $|S| = c(F)$.

DNF counting algorithm II:
Input: A DNF formula $F$ with $n$ variables.
Output: $Y$ = an approximation of $c(F)$.
1. $X \leftarrow 0$.
2. For $k = 1$ to $m$, do:
   (a) Choose clause $i$ with probability $|SC_i| / \sum_{i=1}^t |SC_i|$, then choose an assignment $a \in SC_i$ uniformly at random.
   (b) If $a$ is not in any $SC_j$ with $j < i$, then $X \leftarrow X + 1$.
3. Return $Y \leftarrow (X/m) \sum_{i=1}^t |SC_i|$.

Note that $|U| \le t|S|$. Why? Each satisfying assignment of $F$ appears in at most $t$ pairs of $U$ and in exactly one pair of $S$.
Since $\Pr[i \text{ is chosen}] = |SC_i| / \sum_{i=1}^t |SC_i| = |SC_i|/|U|$, we have
$$\Pr[(i,a) \text{ is chosen}] = \Pr[i \text{ is chosen}] \cdot \Pr[a \text{ is chosen} \mid i \text{ is chosen}] = \frac{|SC_i|}{|U|} \cdot \frac{1}{|SC_i|} = \frac{1}{|U|}.$$

Thm: DNF counting algorithm II is an FPRAS for the DNF counting problem when $m = \lceil (3t/\epsilon^2)\ln(2/\delta) \rceil$.
Pf: Step 2(a) chooses an element of $U$ uniformly at random, and the probability that this element belongs to $S$ is $|S|/|U| \ge 1/t$. Fix any $\epsilon, \delta > 0$ and let $m = \lceil (3t/\epsilon^2)\ln(2/\delta) \rceil = \operatorname{poly}(t, 1/\epsilon, \ln(1/\delta))$; the processing time of each sample is $\operatorname{poly}(t)$. By Thm 1, with $m$ samples, $X/m$ gives an $(\epsilon,\delta)$-approximation of $c(F)/|U|$, and hence $Y$ gives an $(\epsilon,\delta)$-approximation of $c(F)$.
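As a concrete illustration, here is a minimal sketch in Python of the $\pi$ estimator from the start of this section; the function name estimate_pi is ours, but the loop is exactly the $m$-trial experiment with $W' = 4W/m$.

```python
import random

def estimate_pi(m: int) -> float:
    """Monte Carlo estimate of pi: W' = 4W/m, where W counts uniform
    points from the 2x2 square that land inside the unit circle."""
    w = 0
    for _ in range(m):
        x = random.uniform(-1.0, 1.0)
        y = random.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # Z = 1 iff the point is inside the circle
            w += 1
    return 4.0 * w / m

# With m >= 12*ln(2/delta)/(pi*eps^2), this is an (eps, delta)-approximation.
print(estimate_pi(1_000_000))      # typically prints a value close to 3.14
```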
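And a sketch of DNF counting algorithm II. The clause representation (a dict mapping a variable index to its required truth value) and the function name dnf_count are our own choices for illustration; the sampling steps follow 2(a) and 2(b) above.

```python
import random

def dnf_count(clauses, n, m):
    """Sketch of DNF counting algorithm II.

    clauses: list of clauses, each a dict {var_index: bool} giving the
             literal's required value (True = positive, False = negated).
    n: number of variables; m: number of samples.
    Returns Y = (X/m) * sum_i |SC_i|, an estimate of c(F).
    """
    # |SC_i| = 2^(n - l_i): the n - l_i free variables may take any value.
    sizes = [2 ** (n - len(c)) for c in clauses]
    total = sum(sizes)                      # |U|
    x = 0
    for _ in range(m):
        # (a) Choose clause i with probability |SC_i| / |U| ...
        i = random.choices(range(len(clauses)), weights=sizes)[0]
        # ... then a uniform assignment a in SC_i: fix the clause's
        # literals, assign the remaining variables by fair coin flips.
        a = {v: random.random() < 0.5 for v in range(n)}
        a.update(clauses[i])
        # (b) Count (i, a) only if a satisfies no earlier clause, i.e. (i, a) in S.
        if not any(all(a[v] == val for v, val in clauses[j].items())
                   for j in range(i)):
            x += 1
    return (x / m) * total                  # Y = (X/m) * |U|

# Example: F = (x0 AND NOT x1) OR (x1 AND x2) over n = 3 variables; c(F) = 4.
f = [{0: True, 1: False}, {1: True, 2: True}]
print(dnf_count(f, n=3, m=100_000))         # should be close to 4
```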
Counting with Approximate Sampling

Def: Let $w$ be the output of a sampling algorithm for a finite sample space $\Omega$. The sampling algorithm generates an $\epsilon$-uniform sample of $\Omega$ if, for any subset $S$ of $\Omega$,
$$\left|\Pr[w \in S] - \frac{|S|}{|\Omega|}\right| \le \epsilon.$$

Def: A sampling algorithm is a fully polynomial almost uniform sampler (FPAUS) for a problem if, given an input $x$ and $\epsilon > 0$, it generates an $\epsilon$-uniform sample of $\Omega(x)$ in time $\operatorname{poly}(|x|, \ln(1/\epsilon))$.
An FPAUS for independent sets would take as input a graph $G = (V,E)$ and a parameter $\epsilon$; the sample space is the set of all independent sets in $G$.

Goal: Given an FPAUS for independent sets, construct an FPRAS for counting the number of independent sets.
Assume $G$ has $m$ edges, and let $e_1, \dots, e_m$ be an arbitrary ordering of the edges. Let $E_i$ be the set of the first $i$ edges in $E$, let $G_i = (V, E_i)$, and let $\Omega(G_i)$ denote the set of independent sets in $G_i$. Then
$$|\Omega(G)| = \frac{|\Omega(G_m)|}{|\Omega(G_{m-1})|} \times \frac{|\Omega(G_{m-1})|}{|\Omega(G_{m-2})|} \times \cdots \times \frac{|\Omega(G_1)|}{|\Omega(G_0)|} \times |\Omega(G_0)|.$$
$|\Omega(G_0)| = 2^n$. Why? $G_0$ has no edges, so every one of the $2^n$ subsets of $V$ is an independent set.
To estimate $|\Omega(G)|$, we need good estimates for the ratios
$$r_i = \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}, \quad i = 1, \dots, m.$$

Let $\tilde{r}_i$ be an estimate for $r_i$; then $\widetilde{|\Omega(G)|} = 2^n \prod_{i=1}^m \tilde{r}_i$. To evaluate the error, we need to bound the ratio
$$R = \prod_{i=1}^m \frac{\tilde{r}_i}{r_i}.$$
To have an $(\epsilon,\delta)$-approximation, we want $\Pr[|R - 1| \le \epsilon] \ge 1 - \delta$.

Lemma: Suppose that for all $i$, $1 \le i \le m$, $\tilde{r}_i$ is an $(\epsilon/2m, \delta/m)$-approximation for $r_i$. Then $\Pr[|R - 1| \le \epsilon] \ge 1 - \delta$.
Pf: For each $1 \le i \le m$, we have
$$\Pr\!\left[|\tilde{r}_i - r_i| > \frac{\epsilon}{2m} r_i\right] \le \frac{\delta}{m}.$$
By a union bound, $\Pr[\exists i,\ |\tilde{r}_i - r_i| > \frac{\epsilon}{2m} r_i] \le \delta$, and equivalently
$$\Pr\!\left[\forall i,\ |\tilde{r}_i - r_i| \le \frac{\epsilon}{2m} r_i\right] \ge 1 - \delta.$$
So with probability at least $1 - \delta$, $1 - \frac{\epsilon}{2m} \le \frac{\tilde{r}_i}{r_i} \le 1 + \frac{\epsilon}{2m}$ for all $i$, and hence
$$\left(1 - \frac{\epsilon}{2m}\right)^m \le \prod_{i=1}^m \frac{\tilde{r}_i}{r_i} \le \left(1 + \frac{\epsilon}{2m}\right)^m.$$
Since $(1 - \epsilon/2m)^m \ge 1 - \epsilon$ and $(1 + \epsilon/2m)^m \le e^{\epsilon/2} \le 1 + \epsilon$ for $\epsilon \le 1$, this gives $\Pr[1 - \epsilon \le R \le 1 + \epsilon] \ge 1 - \delta$.

Estimating $r_i$:
Input: Graphs $G_{i-1} = (V, E_{i-1})$ and $G_i = (V, E_i)$.
Output: $\tilde{r}_i$ = an approximation of $r_i$.
1. $X \leftarrow 0$.
2. Repeat for $M = \lceil (1296 m^2/\epsilon^2) \ln(2m/\delta) \rceil$ independent trials:
   (a) Generate an $(\epsilon/6m)$-uniform sample from $\Omega(G_{i-1})$.
   (b) If the sample is an independent set in $G_i$, then $X \leftarrow X + 1$.
3. Return $\tilde{r}_i \leftarrow X/M$.

Lemma: When $m \ge 1$ and $0 < \epsilon \le 1$, the procedure for estimating $r_i$ yields an $(\epsilon/2m, \delta/m)$-approximation for $r_i$.
Pf: Suppose $G_{i-1}$ and $G_i$ differ in that edge $\{u,v\}$ is in $G_i$ but not in $G_{i-1}$. Then $\Omega(G_i) \subseteq \Omega(G_{i-1})$, and an independent set in $\Omega(G_{i-1}) \setminus \Omega(G_i)$ contains both $u$ and $v$.
Associate each $I \in \Omega(G_{i-1}) \setminus \Omega(G_i)$ with the independent set $I \setminus \{v\} \in \Omega(G_i)$. Each $I' \in \Omega(G_i)$ is associated with no more than one independent set $I' \cup \{v\} \in \Omega(G_{i-1}) \setminus \Omega(G_i)$; thus $|\Omega(G_{i-1}) \setminus \Omega(G_i)| \le |\Omega(G_i)|$. It follows that
$$r_i = \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|} = \frac{|\Omega(G_i)|}{|\Omega(G_i)| + |\Omega(G_{i-1}) \setminus \Omega(G_i)|} \ge \frac{1}{2}.$$
Let $X_k = 1$ if the $k$th sample is in $\Omega(G_i)$, and $0$ otherwise. Because our samples are generated by an $(\epsilon/6m)$-uniform sampler, by definition,
$$\left|\Pr[X_k = 1] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| \le \frac{\epsilon}{6m}, \quad\text{i.e.,}\quad \left|E[X_k] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| \le \frac{\epsilon}{6m}.$$
By linearity of expectation,
$$\left|E\!\left[\frac{\sum_{k=1}^M X_k}{M}\right] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| = |E[\tilde{r}_i] - r_i| \le \frac{\epsilon}{6m}.$$
Since $r_i \ge 1/2$, we have
$$E[\tilde{r}_i] \ge r_i - \frac{\epsilon}{6m} \ge \frac{1}{2} - \frac{\epsilon}{6m} \ge \frac{1}{3}.$$
If $M \ge 3\ln(2m/\delta) \big/ \left((\epsilon/12m)^2 \cdot \tfrac{1}{3}\right) = 1296\, m^2 \epsilon^{-2} \ln(2m/\delta)$, then by Thm 1,
$$\Pr\!\left[|\tilde{r}_i - E[\tilde{r}_i]| \ge \frac{\epsilon}{12m} E[\tilde{r}_i]\right] \le \frac{\delta}{m}.$$
Equivalently, with probability at least $1 - \delta/m$,
$$1 - \frac{\epsilon}{12m} \le \frac{\tilde{r}_i}{E[\tilde{r}_i]} \le 1 + \frac{\epsilon}{12m}. \tag{1}$$
As $|E[\tilde{r}_i] - r_i| \le \epsilon/6m$, we have
$$1 - \frac{\epsilon}{6m r_i} \le \frac{E[\tilde{r}_i]}{r_i} \le 1 + \frac{\epsilon}{6m r_i},$$
and using $r_i \ge 1/2$,
$$1 - \frac{\epsilon}{3m} \le \frac{E[\tilde{r}_i]}{r_i} \le 1 + \frac{\epsilon}{3m}. \tag{2}$$
Combining (1) and (2), with probability at least $1 - \delta/m$,
$$1 - \frac{\epsilon}{2m} \le \left(1 - \frac{\epsilon}{3m}\right)\!\left(1 - \frac{\epsilon}{12m}\right) \le \frac{\tilde{r}_i}{r_i} \le \left(1 + \frac{\epsilon}{3m}\right)\!\left(1 + \frac{\epsilon}{12m}\right) \le 1 + \frac{\epsilon}{2m}.$$
This gives the desired $(\epsilon/2m, \delta/m)$-approximation.

Thm: Given an FPAUS for independent sets in any graph, we can construct an FPRAS for counting the number of independent sets in a graph $G$.
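The reduction above can be sketched in code. The sampler interface sampler(n, edges, eps) is our own assumption standing in for the FPAUS, and naive_sampler is only a stand-in for testing on tiny graphs: it samples exactly uniformly by rejection, but takes exponential time in the worst case.

```python
import math
import random

def count_independent_sets(n, edges, sampler, eps, delta):
    """Sketch of the FPRAS built from an assumed FPAUS `sampler`.

    n: number of vertices; edges: list of edges (u, v).
    sampler(n, edges, eps) must return an eps-uniform random
    independent set (a Python set of vertices) of the graph (V, edges).
    """
    m = len(edges)
    if m == 0:
        return 2.0 ** n                     # every subset of V is independent
    # M = ceil((1296 m^2 / eps^2) ln(2m/delta)) trials per ratio r_i.
    M = math.ceil((1296 * m**2 / eps**2) * math.log(2 * m / delta))
    estimate = 2.0 ** n                     # |Omega(G_0)| = 2^n
    for i in range(1, m + 1):
        u, v = edges[i - 1]                 # the edge added to form G_i
        x = 0
        for _ in range(M):
            # (eps/6m)-uniform sample from Omega(G_{i-1})
            s = sampler(n, edges[: i - 1], eps / (6 * m))
            # The sample stays independent in G_i iff it avoids {u, v}.
            if not (u in s and v in s):
                x += 1
        estimate *= x / M                   # multiply in the estimate of r_i
    return estimate

def naive_sampler(n, edges, eps):
    """Placeholder for the FPAUS: exact uniform sampling by rejection
    (ignores eps; exponential time for graphs with few independent sets)."""
    while True:
        s = {v for v in range(n) if random.random() < 0.5}
        if all(not (a in s and b in s) for a, b in edges):
            return s
```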
The Markov Chain Monte Carlo Method

The Markov chain Monte Carlo method provides a very general approach to sampling from a desired probability distribution.
Basic idea: Define an ergodic Markov chain whose set of states is the sample space and whose stationary distribution is the required sampling distribution.

Lemma: For a finite state space $\Omega$ and neighborhood structure $\{N(x) \mid x \in \Omega\}$, let $N = \max_{x \in \Omega} |N(x)|$. Let $M$ be any number such that $M \ge N$. Consider a Markov chain where
$$P_{x,y} = \begin{cases} 1/M & \text{if } x \ne y \text{ and } y \in N(x), \\ 0 & \text{if } x \ne y \text{ and } y \notin N(x), \\ 1 - |N(x)|/M & \text{if } x = y. \end{cases}$$
If this chain is irreducible and aperiodic, then the stationary distribution is the uniform distribution.
Pf: For any $x \ne y$ with $y \in N(x)$, we have $P_{x,y} = P_{y,x} = 1/M$, so the uniform distribution $\pi_x = 1/|\Omega|$ satisfies $\pi_x P_{x,y} = \pi_y P_{y,x}$. It follows from the following theorem that the uniform distribution is the stationary distribution.

Thm: Let $P$ be the transition matrix of a finite irreducible and ergodic Markov chain. If there are nonnegative numbers $\bar{\pi} = (\pi_0, \dots, \pi_n)$ such that $\sum_{i=0}^n \pi_i = 1$ and if, for any pair of states $i, j$, $\pi_i P_{i,j} = \pi_j P_{j,i}$, then $\bar{\pi}$ is the stationary distribution corresponding to $P$.
Pf: Consider the $j$th entry of $\bar{\pi} P$:
$$(\bar{\pi} P)_j = \sum_{i=0}^n \pi_i P_{i,j} = \sum_{i=0}^n \pi_j P_{j,i} = \pi_j,$$
so $\bar{\pi} P = \bar{\pi}$. Since $\sum_{i=0}^n \pi_i = 1$, it follows that $\bar{\pi}$ is the unique stationary distribution of the Markov chain.

Eg. A Markov chain whose states are the independent sets in $G = (V,E)$:
1. $X_0$ is an arbitrary independent set in $G$.
2. To compute $X_{i+1}$:
   (a) choose a vertex $v$ uniformly at random from $V$;
   (b) if $v \in X_i$, then $X_{i+1} = X_i \setminus \{v\}$;
   (c) if $v \notin X_i$ and adding $v$ to $X_i$ still gives an independent set, then $X_{i+1} = X_i \cup \{v\}$;
   (d) otherwise, $X_{i+1} = X_i$.

The neighbors of a state $X_i$ are the independent sets that differ from $X_i$ in just one vertex. Since every state is reachable from the empty set, the chain is irreducible. Assume $G$ has at least one edge $(u,v)$; then the state $\{v\}$ has a self-loop ($P_{\{v\},\{v\}} > 0$, since choosing $u$ leaves the state unchanged), so the chain is aperiodic. When $x \ne y$, $P_{x,y} = 1/|V|$ or $0$, so by the previous lemma, the stationary distribution is the uniform distribution.

The Metropolis Algorithm (when the stationary distribution is nonuniform)

Lemma: For a finite state space $\Omega$ and neighborhood structure $\{N(x) \mid x \in \Omega\}$, let $N = \max_{x \in \Omega} |N(x)|$. Let $M$ be any number such that $M \ge N$. For all $x \in \Omega$, let $\pi_x > 0$ be the desired probability of state $x$ in the stationary distribution. Consider a Markov chain where
$$P_{x,y} = \begin{cases} (1/M)\min(1, \pi_y/\pi_x) & \text{if } x \ne y \text{ and } y \in N(x), \\ 0 & \text{if } x \ne y \text{ and } y \notin N(x), \\ 1 - \sum_{y \ne x} P_{x,y} & \text{if } x = y. \end{cases}$$
Then, if this chain is irreducible and aperiodic, the stationary distribution is given by the probabilities $\pi_x$.
Pf: For any $x \ne y$ with $y \in N(x)$, if $\pi_x \le \pi_y$, then $P_{x,y} = 1/M$ and $P_{y,x} = (1/M)(\pi_x/\pi_y)$. It follows that $\pi_x P_{x,y} = \pi_x/M = \pi_y P_{y,x}$. The case $\pi_x > \pi_y$ is symmetric. Again, by the previous theorem, the $\pi_x$ form the stationary distribution.

Eg. Create a Markov chain on the independent sets of $G$ whose stationary distribution gives each independent set $I$ probability proportional to $\lambda^{|I|}$, for some $\lambda > 0$. I.e., $\pi_x = \lambda^{|I_x|}/B$, where $I_x$ is the independent set corresponding to state $x$ and $B = \sum_x \lambda^{|I_x|}$. Note that when $\lambda = 1$, this is the uniform distribution.
1. $X_0$ is an arbitrary independent set in $G$.
2. To compute $X_{i+1}$:
   (a) choose a vertex $v$ uniformly at random from $V$;
   (b) if $v \in X_i$, then $X_{i+1} = X_i \setminus \{v\}$ with probability $\min(1, 1/\lambda)$;
   (c) if $v \notin X_i$ and $X_i \cup \{v\}$ still gives an independent set, then $X_{i+1} = X_i \cup \{v\}$ with probability $\min(1, \lambda)$;
   (d) otherwise, set $X_{i+1} = X_i$.
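A minimal sketch of this $\lambda$-weighted Metropolis chain; the function name and graph representation are our own choices. Setting lam=1 recovers the uniform chain from the earlier example.

```python
import random

def metropolis_independent_set(n, edges, lam, steps):
    """Metropolis chain on independent sets of G = (V, E): in the
    stationary distribution, Pr[I] is proportional to lam ** len(I)."""
    neighbors = {v: set() for v in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    x = set()                               # X_0: the empty independent set
    for _ in range(steps):
        v = random.randrange(n)             # (a) uniform random vertex
        if v in x:
            # (b) remove v with probability min(1, 1/lam):
            # the move shrinks |I| by one, so pi_y / pi_x = 1/lam
            if random.random() < min(1.0, 1.0 / lam):
                x.discard(v)
        elif not (neighbors[v] & x):
            # (c) add v with probability min(1, lam) if the result
            # is still an independent set; here pi_y / pi_x = lam
            if random.random() < min(1.0, lam):
                x.add(v)
        # (d) otherwise X_{i+1} = X_i
    return x

# Example: a triangle; with lam = 2, larger independent sets are favored.
print(metropolis_independent_set(3, [(0, 1), (1, 2), (0, 2)], lam=2.0, steps=10_000))
```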