The Monte Carlo Method

• Let $(X,Y)$ be a point chosen uniformly at random in the $2 \times 2$ square centered at the origin $(0,0)$, with corners $(1,1)$, $(-1,1)$, $(-1,-1)$, and $(1,-1)$.
• Define $Z = 1$ if $X^2 + Y^2 \le 1$ and $Z = 0$ otherwise, so $Z$ indicates that the point lands in the unit circle inscribed in the square.
• $\Pr[Z = 1] = \pi/4$, the ratio of the circle's area to the square's area.

Assume we run this experiment $m$ times, with $Z_i$ being the value of $Z$ at the $i$th run. If $W = \sum_{i=1}^m Z_i$, then
$$E[W] = \sum_{i=1}^m E[Z_i] = \frac{m\pi}{4},$$
so $W' = 4W/m$ is an estimate of $\pi$.

By the Chernoff bound,
$$\Pr[|W' - \pi| \ge \epsilon\pi] = \Pr\!\left[\left|W - \frac{m\pi}{4}\right| \ge \frac{\epsilon m \pi}{4}\right] = \Pr[|W - E[W]| \ge \epsilon E[W]] \le 2e^{-m\pi\epsilon^2/12}.$$

Def: A randomized algorithm gives an $(\epsilon,\delta)$-approximation for the value $V$ if the output $X$ of the algorithm satisfies $\Pr[|X - V| \le \epsilon V] \ge 1 - \delta$.

The above method for estimating $\pi$ gives an $(\epsilon,\delta)$-approximation, as long as $\epsilon < 1$ and $m$ is large enough; taking
$$m \ge \frac{12\ln(2/\delta)}{\pi\epsilon^2} \quad\text{gives}\quad 2e^{-m\pi\epsilon^2/12} \le \delta.$$

Thm 1: Let $X_1,\dots,X_m$ be independent and identically distributed indicator random variables with $\mu = E[X_i]$. If $m \ge \frac{3\ln(2/\delta)}{\epsilon^2\mu}$, then
$$\Pr\!\left[\left|\frac{1}{m}\sum_{i=1}^m X_i - \mu\right| \ge \epsilon\mu\right] \le \delta.$$
I.e., $m$ samples provide an $(\epsilon,\delta)$-approximation for $\mu$.
Pf: Exercise!

Def: FPRAS: fully polynomial randomized approximation scheme. An FPRAS for a problem is a randomized algorithm which, given an input $x$ and any parameters $\epsilon$ and $\delta$ with $0 < \epsilon, \delta < 1$, outputs an $(\epsilon,\delta)$-approximation to $V(x)$ in time $\operatorname{poly}(1/\epsilon, \ln(1/\delta), |x|)$.

Def: DNF counting problem: counting the number of satisfying assignments of a Boolean formula in disjunctive normal form (DNF).
Def: A DNF formula is a disjunction of clauses $C_1 \vee C_2 \vee \cdots \vee C_t$, where each clause is a conjunction of literals.
Eg. $(x_1 \wedge \overline{x_2} \wedge x_3) \vee (x_2 \wedge x_4) \vee (\overline{x_1} \wedge x_3 \wedge x_4)$.

Counting the number of satisfying assignments of a DNF formula is actually #P-complete. Counting the number of Hamiltonian cycles in a graph and counting the number of perfect matchings in a bipartite graph are other examples of #P-complete problems.

A naïve algorithm for the DNF counting problem:
Input: A DNF formula $F$ with $n$ variables.
Output: $Y$ = an approximation of $c(F)$, the number of satisfying assignments of $F$.
1. $X \leftarrow 0$.
2. For $k = 1$ to $m$, do:
   (a) Generate an assignment for the $n$ variables, chosen uniformly at random.
   (b) If the assignment satisfies $F$, then $X \leftarrow X + 1$.
3. Return $Y \leftarrow (X/m)\,2^n$.

Analysis: Let $X_k = 1$ if the $k$th iteration in the algorithm generated a satisfying assignment, and $X_k = 0$ otherwise. Then $\Pr[X_k = 1] = c(F)/2^n$. Let $X = \sum_{k=1}^m X_k$, so $E[X] = m\,c(F)/2^n$ and
$$E[Y] = \frac{E[X]\,2^n}{m} = c(F).$$
By Theorem 1, $X/m$ gives an $(\epsilon,\delta)$-approximation of $c(F)/2^n$, and hence $Y$ gives an $(\epsilon,\delta)$-approximation of $c(F)$, when $m \ge 3 \cdot 2^n \ln(2/\delta) / (\epsilon^2 c(F))$. If $c(F) \ge 2^n/\operatorname{poly}(n)$, this is not too bad: $m$ is polynomial. But if $c(F) = \operatorname{poly}(n)$, then $m = \Omega(2^n/c(F))$ is exponential in $n$!

Note that if clause $C_i$ has $l_i$ literals, then there are exactly $2^{n-l_i}$ satisfying assignments for $C_i$. Let $SC_i$ denote the set of assignments that satisfy clause $i$, and let
$$U = \{(i,a) : 1 \le i \le t \text{ and } a \in SC_i\}, \qquad |U| = \sum_{i=1}^t |SC_i| = \sum_{i=1}^t 2^{n-l_i}.$$
We want to estimate $c(F) = \left|\bigcup_{i=1}^t SC_i\right|$. Define
$$S = \{(i,a) : 1 \le i \le t,\ a \in SC_i,\ \text{and } a \notin SC_j \text{ for } j < i\},$$
which pairs each satisfying assignment with the first clause it satisfies, so $|S| = c(F)$.

DNF counting algorithm II:
Input: A DNF formula $F$ with $n$ variables.
Output: $Y$ = an approximation of $c(F)$.
1. $X \leftarrow 0$.
2. For $k = 1$ to $m$, do:
   (a) Choose clause $i$ with probability $|SC_i| / \sum_{i=1}^t |SC_i|$, then choose an assignment $a \in SC_i$ uniformly at random.
   (b) If $a$ is not in any $SC_j$ with $j < i$, then $X \leftarrow X + 1$.
3. Return $Y \leftarrow (X/m) \sum_{i=1}^t |SC_i|$.

Note that $|U| \le t|S|$. Why? Each satisfying assignment of $F$ appears in at most $t$ pairs of $U$ and in exactly one pair of $S$.
Since $\Pr[i \text{ is chosen}] = |SC_i| / \sum_{i=1}^t |SC_i| = |SC_i|/|U|$, we have
$$\Pr[(i,a) \text{ is chosen}] = \Pr[i \text{ is chosen}] \cdot \Pr[a \text{ is chosen} \mid i \text{ is chosen}] = \frac{|SC_i|}{|U|} \cdot \frac{1}{|SC_i|} = \frac{1}{|U|}.$$

Thm: DNF counting algorithm II is an FPRAS for the DNF counting problem when $m = \lceil (3t/\epsilon^2)\ln(2/\delta) \rceil$.
Pf: Step 2(a) chooses an element of $U$ uniformly at random, and the probability that this element belongs to $S$ is $|S|/|U| \ge 1/t$. Fix any $\epsilon, \delta > 0$ and let $m = \lceil (3t/\epsilon^2)\ln(2/\delta) \rceil = \operatorname{poly}(t, 1/\epsilon, \ln(1/\delta))$; the processing time of each sample is $\operatorname{poly}(t)$. By Thm 1, with $m$ samples, $X/m$ gives an $(\epsilon,\delta)$-approximation of $c(F)/|U|$, and hence $Y$ gives an $(\epsilon,\delta)$-approximation of $c(F)$.
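As a concrete illustration, here is a minimal sketch in Python of the $\pi$ estimator from the start of this section; the function name estimate_pi is ours, but the loop is exactly the $m$-trial experiment with $W' = 4W/m$.

```python
import random

def estimate_pi(m: int) -> float:
    """Monte Carlo estimate of pi: W' = 4W/m, where W counts uniform
    points from the 2x2 square that land inside the unit circle."""
    w = 0
    for _ in range(m):
        x = random.uniform(-1.0, 1.0)
        y = random.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # Z = 1 iff the point is inside the circle
            w += 1
    return 4.0 * w / m

# With m >= 12*ln(2/delta)/(pi*eps^2), this is an (eps, delta)-approximation.
print(estimate_pi(1_000_000))      # typically prints a value close to 3.14
```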
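And a sketch of DNF counting algorithm II. The clause representation (a dict mapping a variable index to its required truth value) and the function name dnf_count are our own choices for illustration; the sampling steps follow 2(a) and 2(b) above.

```python
import random

def dnf_count(clauses, n, m):
    """Sketch of DNF counting algorithm II.

    clauses: list of clauses, each a dict {var_index: bool} giving the
             literal's required value (True = positive, False = negated).
    n: number of variables; m: number of samples.
    Returns Y = (X/m) * sum_i |SC_i|, an estimate of c(F).
    """
    # |SC_i| = 2^(n - l_i): the n - l_i free variables may take any value.
    sizes = [2 ** (n - len(c)) for c in clauses]
    total = sum(sizes)                      # |U|
    x = 0
    for _ in range(m):
        # (a) Choose clause i with probability |SC_i| / |U| ...
        i = random.choices(range(len(clauses)), weights=sizes)[0]
        # ... then a uniform assignment a in SC_i: fix the clause's
        # literals, assign the remaining variables by fair coin flips.
        a = {v: random.random() < 0.5 for v in range(n)}
        a.update(clauses[i])
        # (b) Count (i, a) only if a satisfies no earlier clause, i.e. (i, a) in S.
        if not any(all(a[v] == val for v, val in clauses[j].items())
                   for j in range(i)):
            x += 1
    return (x / m) * total                  # Y = (X/m) * |U|

# Example: F = (x0 AND NOT x1) OR (x1 AND x2) over n = 3 variables; c(F) = 4.
f = [{0: True, 1: False}, {1: True, 2: True}]
print(dnf_count(f, n=3, m=100_000))         # should be close to 4
```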
Counting with Approximate Sampling

Def: Let $w$ be the output of a sampling algorithm for a finite sample space $\Omega$. The sampling algorithm generates an $\epsilon$-uniform sample of $\Omega$ if, for any subset $S$ of $\Omega$,
$$\left|\Pr[w \in S] - \frac{|S|}{|\Omega|}\right| \le \epsilon.$$

Def: A sampling algorithm is a fully polynomial almost uniform sampler (FPAUS) for a problem if, given an input $x$ and $\epsilon > 0$, it generates an $\epsilon$-uniform sample of $\Omega(x)$ in time $\operatorname{poly}(|x|, \ln(1/\epsilon))$.
An FPAUS for independent sets would take as input a graph $G = (V,E)$ and a parameter $\epsilon$; the sample space is the set of all independent sets in $G$.

Goal: Given an FPAUS for independent sets, construct an FPRAS for counting the number of independent sets.
Assume $G$ has $m$ edges, and let $e_1, \dots, e_m$ be an arbitrary ordering of the edges. Let $E_i$ be the set of the first $i$ edges in $E$, let $G_i = (V, E_i)$, and let $\Omega(G_i)$ denote the set of independent sets in $G_i$. Then
$$|\Omega(G)| = \frac{|\Omega(G_m)|}{|\Omega(G_{m-1})|} \times \frac{|\Omega(G_{m-1})|}{|\Omega(G_{m-2})|} \times \cdots \times \frac{|\Omega(G_1)|}{|\Omega(G_0)|} \times |\Omega(G_0)|.$$
$|\Omega(G_0)| = 2^n$. Why? $G_0$ has no edges, so every one of the $2^n$ subsets of $V$ is an independent set.
To estimate $|\Omega(G)|$, we need good estimates for the ratios
$$r_i = \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}, \quad i = 1, \dots, m.$$

Let $\tilde{r}_i$ be an estimate for $r_i$; then $\widetilde{|\Omega(G)|} = 2^n \prod_{i=1}^m \tilde{r}_i$. To evaluate the error, we need to bound the ratio
$$R = \prod_{i=1}^m \frac{\tilde{r}_i}{r_i}.$$
To have an $(\epsilon,\delta)$-approximation, we want $\Pr[|R - 1| \le \epsilon] \ge 1 - \delta$.

Lemma: Suppose that for all $i$, $1 \le i \le m$, $\tilde{r}_i$ is an $(\epsilon/2m, \delta/m)$-approximation for $r_i$. Then $\Pr[|R - 1| \le \epsilon] \ge 1 - \delta$.
Pf: For each $1 \le i \le m$, we have
$$\Pr\!\left[|\tilde{r}_i - r_i| > \frac{\epsilon}{2m} r_i\right] \le \frac{\delta}{m}.$$
By a union bound, $\Pr[\exists i,\ |\tilde{r}_i - r_i| > \frac{\epsilon}{2m} r_i] \le \delta$, and equivalently
$$\Pr\!\left[\forall i,\ |\tilde{r}_i - r_i| \le \frac{\epsilon}{2m} r_i\right] \ge 1 - \delta.$$
So with probability at least $1 - \delta$, $1 - \frac{\epsilon}{2m} \le \frac{\tilde{r}_i}{r_i} \le 1 + \frac{\epsilon}{2m}$ for all $i$, and hence
$$\left(1 - \frac{\epsilon}{2m}\right)^m \le \prod_{i=1}^m \frac{\tilde{r}_i}{r_i} \le \left(1 + \frac{\epsilon}{2m}\right)^m.$$
Since $(1 - \epsilon/2m)^m \ge 1 - \epsilon$ and $(1 + \epsilon/2m)^m \le e^{\epsilon/2} \le 1 + \epsilon$ for $\epsilon \le 1$, this gives $\Pr[1 - \epsilon \le R \le 1 + \epsilon] \ge 1 - \delta$.

Estimating $r_i$:
Input: Graphs $G_{i-1} = (V, E_{i-1})$ and $G_i = (V, E_i)$.
Output: $\tilde{r}_i$ = an approximation of $r_i$.
1. $X \leftarrow 0$.
2. Repeat for $M = \lceil (1296 m^2/\epsilon^2) \ln(2m/\delta) \rceil$ independent trials:
   (a) Generate an $(\epsilon/6m)$-uniform sample from $\Omega(G_{i-1})$.
   (b) If the sample is an independent set in $G_i$, then $X \leftarrow X + 1$.
3. Return $\tilde{r}_i \leftarrow X/M$.

Lemma: When $m \ge 1$ and $0 < \epsilon \le 1$, the procedure for estimating $r_i$ yields an $(\epsilon/2m, \delta/m)$-approximation for $r_i$.
Pf: Suppose $G_{i-1}$ and $G_i$ differ in that edge $\{u,v\}$ is in $G_i$ but not in $G_{i-1}$. Then $\Omega(G_i) \subseteq \Omega(G_{i-1})$, and an independent set in $\Omega(G_{i-1}) \setminus \Omega(G_i)$ contains both $u$ and $v$.
Associate each $I \in \Omega(G_{i-1}) \setminus \Omega(G_i)$ with the independent set $I \setminus \{v\} \in \Omega(G_i)$. Each $I' \in \Omega(G_i)$ is associated with no more than one independent set $I' \cup \{v\} \in \Omega(G_{i-1}) \setminus \Omega(G_i)$; thus $|\Omega(G_{i-1}) \setminus \Omega(G_i)| \le |\Omega(G_i)|$. It follows that
$$r_i = \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|} = \frac{|\Omega(G_i)|}{|\Omega(G_i)| + |\Omega(G_{i-1}) \setminus \Omega(G_i)|} \ge \frac{1}{2}.$$
Let $X_k = 1$ if the $k$th sample is in $\Omega(G_i)$, and $0$ otherwise. Because our samples are generated by an $(\epsilon/6m)$-uniform sampler, by definition,
$$\left|\Pr[X_k = 1] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| \le \frac{\epsilon}{6m}, \quad\text{i.e.,}\quad \left|E[X_k] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| \le \frac{\epsilon}{6m}.$$
By linearity of expectation,
$$\left|E\!\left[\frac{\sum_{k=1}^M X_k}{M}\right] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| = |E[\tilde{r}_i] - r_i| \le \frac{\epsilon}{6m}.$$
Since $r_i \ge 1/2$, we have
$$E[\tilde{r}_i] \ge r_i - \frac{\epsilon}{6m} \ge \frac{1}{2} - \frac{\epsilon}{6m} \ge \frac{1}{3}.$$
If $M \ge 3\ln(2m/\delta) \big/ \left((\epsilon/12m)^2 \cdot \tfrac{1}{3}\right) = 1296\, m^2 \epsilon^{-2} \ln(2m/\delta)$, then by Thm 1,
$$\Pr\!\left[|\tilde{r}_i - E[\tilde{r}_i]| \ge \frac{\epsilon}{12m} E[\tilde{r}_i]\right] \le \frac{\delta}{m}.$$
Equivalently, with probability at least $1 - \delta/m$,
$$1 - \frac{\epsilon}{12m} \le \frac{\tilde{r}_i}{E[\tilde{r}_i]} \le 1 + \frac{\epsilon}{12m}. \tag{1}$$
As $|E[\tilde{r}_i] - r_i| \le \epsilon/6m$, we have
$$1 - \frac{\epsilon}{6m r_i} \le \frac{E[\tilde{r}_i]}{r_i} \le 1 + \frac{\epsilon}{6m r_i},$$
and using $r_i \ge 1/2$,
$$1 - \frac{\epsilon}{3m} \le \frac{E[\tilde{r}_i]}{r_i} \le 1 + \frac{\epsilon}{3m}. \tag{2}$$
Combining (1) and (2), with probability at least $1 - \delta/m$,
$$1 - \frac{\epsilon}{2m} \le \left(1 - \frac{\epsilon}{3m}\right)\!\left(1 - \frac{\epsilon}{12m}\right) \le \frac{\tilde{r}_i}{r_i} \le \left(1 + \frac{\epsilon}{3m}\right)\!\left(1 + \frac{\epsilon}{12m}\right) \le 1 + \frac{\epsilon}{2m}.$$
This gives the desired $(\epsilon/2m, \delta/m)$-approximation.

Thm: Given an FPAUS for independent sets in any graph, we can construct an FPRAS for counting the number of independent sets in a graph $G$.
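The reduction above can be sketched in code. The sampler interface sampler(n, edges, eps) is our own assumption standing in for the FPAUS, and naive_sampler is only a stand-in for testing on tiny graphs: it samples exactly uniformly by rejection, but takes exponential time in the worst case.

```python
import math
import random

def count_independent_sets(n, edges, sampler, eps, delta):
    """Sketch of the FPRAS built from an assumed FPAUS `sampler`.

    n: number of vertices; edges: list of edges (u, v).
    sampler(n, edges, eps) must return an eps-uniform random
    independent set (a Python set of vertices) of the graph (V, edges).
    """
    m = len(edges)
    if m == 0:
        return 2.0 ** n                     # every subset of V is independent
    # M = ceil((1296 m^2 / eps^2) ln(2m/delta)) trials per ratio r_i.
    M = math.ceil((1296 * m**2 / eps**2) * math.log(2 * m / delta))
    estimate = 2.0 ** n                     # |Omega(G_0)| = 2^n
    for i in range(1, m + 1):
        u, v = edges[i - 1]                 # the edge added to form G_i
        x = 0
        for _ in range(M):
            # (eps/6m)-uniform sample from Omega(G_{i-1})
            s = sampler(n, edges[: i - 1], eps / (6 * m))
            # The sample stays independent in G_i iff it avoids {u, v}.
            if not (u in s and v in s):
                x += 1
        estimate *= x / M                   # multiply in the estimate of r_i
    return estimate

def naive_sampler(n, edges, eps):
    """Placeholder for the FPAUS: exact uniform sampling by rejection
    (ignores eps; exponential time for graphs with few independent sets)."""
    while True:
        s = {v for v in range(n) if random.random() < 0.5}
        if all(not (a in s and b in s) for a, b in edges):
            return s
```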
The Markov Chain Monte Carlo Method

The Markov chain Monte Carlo method provides a very general approach to sampling from a desired probability distribution.
Basic idea: Define an ergodic Markov chain whose set of states is the sample space and whose stationary distribution is the required sampling distribution.

Lemma: For a finite state space $\Omega$ and neighborhood structure $\{N(x) \mid x \in \Omega\}$, let $N = \max_{x \in \Omega} |N(x)|$. Let $M$ be any number such that $M \ge N$. Consider a Markov chain where
$$P_{x,y} = \begin{cases} 1/M & \text{if } x \ne y \text{ and } y \in N(x), \\ 0 & \text{if } x \ne y \text{ and } y \notin N(x), \\ 1 - |N(x)|/M & \text{if } x = y. \end{cases}$$
If this chain is irreducible and aperiodic, then the stationary distribution is the uniform distribution.
Pf: For any $x \ne y$ with $y \in N(x)$, we have $P_{x,y} = P_{y,x} = 1/M$, so the uniform distribution $\pi_x = 1/|\Omega|$ satisfies $\pi_x P_{x,y} = \pi_y P_{y,x}$. It follows from the following theorem that the uniform distribution is the stationary distribution.

Thm: Let $P$ be the transition matrix of a finite irreducible and ergodic Markov chain. If there are nonnegative numbers $\bar{\pi} = (\pi_0, \dots, \pi_n)$ such that $\sum_{i=0}^n \pi_i = 1$ and if, for any pair of states $i, j$, $\pi_i P_{i,j} = \pi_j P_{j,i}$, then $\bar{\pi}$ is the stationary distribution corresponding to $P$.
Pf: Consider the $j$th entry of $\bar{\pi} P$:
$$(\bar{\pi} P)_j = \sum_{i=0}^n \pi_i P_{i,j} = \sum_{i=0}^n \pi_j P_{j,i} = \pi_j,$$
so $\bar{\pi} P = \bar{\pi}$. Since $\sum_{i=0}^n \pi_i = 1$, it follows that $\bar{\pi}$ is the unique stationary distribution of the Markov chain.

Eg. A Markov chain whose states are the independent sets in $G = (V,E)$:
1. $X_0$ is an arbitrary independent set in $G$.
2. To compute $X_{i+1}$:
   (a) choose a vertex $v$ uniformly at random from $V$;
   (b) if $v \in X_i$, then $X_{i+1} = X_i \setminus \{v\}$;
   (c) if $v \notin X_i$ and adding $v$ to $X_i$ still gives an independent set, then $X_{i+1} = X_i \cup \{v\}$;
   (d) otherwise, $X_{i+1} = X_i$.

The neighbors of a state $X_i$ are the independent sets that differ from $X_i$ in just one vertex. Since every state is reachable from the empty set, the chain is irreducible. Assume $G$ has at least one edge $(u,v)$; then the state $\{v\}$ has a self-loop ($P_{\{v\},\{v\}} > 0$, since choosing $u$ leaves the state unchanged), so the chain is aperiodic. When $x \ne y$, $P_{x,y} = 1/|V|$ or $0$, so by the previous lemma, the stationary distribution is the uniform distribution.

The Metropolis Algorithm (when the stationary distribution is nonuniform)

Lemma: For a finite state space $\Omega$ and neighborhood structure $\{N(x) \mid x \in \Omega\}$, let $N = \max_{x \in \Omega} |N(x)|$. Let $M$ be any number such that $M \ge N$. For all $x \in \Omega$, let $\pi_x > 0$ be the desired probability of state $x$ in the stationary distribution. Consider a Markov chain where
$$P_{x,y} = \begin{cases} (1/M)\min(1, \pi_y/\pi_x) & \text{if } x \ne y \text{ and } y \in N(x), \\ 0 & \text{if } x \ne y \text{ and } y \notin N(x), \\ 1 - \sum_{y \ne x} P_{x,y} & \text{if } x = y. \end{cases}$$
Then, if this chain is irreducible and aperiodic, the stationary distribution is given by the probabilities $\pi_x$.
Pf: For any $x \ne y$ with $y \in N(x)$, if $\pi_x \le \pi_y$, then $P_{x,y} = 1/M$ and $P_{y,x} = (1/M)(\pi_x/\pi_y)$. It follows that $\pi_x P_{x,y} = \pi_x/M = \pi_y P_{y,x}$. The case $\pi_x > \pi_y$ is symmetric. Again, by the previous theorem, the $\pi_x$ form the stationary distribution.

Eg. Create a Markov chain on the independent sets of $G$ whose stationary distribution gives each independent set $I$ probability proportional to $\lambda^{|I|}$, for some $\lambda > 0$. I.e., $\pi_x = \lambda^{|I_x|}/B$, where $I_x$ is the independent set corresponding to state $x$ and $B = \sum_x \lambda^{|I_x|}$. Note that when $\lambda = 1$, this is the uniform distribution.
1. $X_0$ is an arbitrary independent set in $G$.
2. To compute $X_{i+1}$:
   (a) choose a vertex $v$ uniformly at random from $V$;
   (b) if $v \in X_i$, then $X_{i+1} = X_i \setminus \{v\}$ with probability $\min(1, 1/\lambda)$;
   (c) if $v \notin X_i$ and $X_i \cup \{v\}$ still gives an independent set, then $X_{i+1} = X_i \cup \{v\}$ with probability $\min(1, \lambda)$;
   (d) otherwise, set $X_{i+1} = X_i$.
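A minimal sketch of this $\lambda$-weighted Metropolis chain; the function name and graph representation are our own choices. Setting lam=1 recovers the uniform chain from the earlier example.

```python
import random

def metropolis_independent_set(n, edges, lam, steps):
    """Metropolis chain on independent sets of G = (V, E): in the
    stationary distribution, Pr[I] is proportional to lam ** len(I)."""
    neighbors = {v: set() for v in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    x = set()                               # X_0: the empty independent set
    for _ in range(steps):
        v = random.randrange(n)             # (a) uniform random vertex
        if v in x:
            # (b) remove v with probability min(1, 1/lam):
            # the move shrinks |I| by one, so pi_y / pi_x = 1/lam
            if random.random() < min(1.0, 1.0 / lam):
                x.discard(v)
        elif not (neighbors[v] & x):
            # (c) add v with probability min(1, lam) if the result
            # is still an independent set; here pi_y / pi_x = lam
            if random.random() < min(1.0, lam):
                x.add(v)
        # (d) otherwise X_{i+1} = X_i
    return x

# Example: a triangle; with lam = 2, larger independent sets are favored.
print(metropolis_independent_set(3, [(0, 1), (1, 2), (0, 2)], lam=2.0, steps=10_000))
```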