Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
2WO08: Graphs and Algorithms Instructor: Nikhil Bansal 1 Lecture 14 Date: 14/5/2013 Scribe: Bouke Cloostermans Chernoff bound The Chernoff bound gives exponentially decreasing bounds on tail distributions of sums of independent random variables. This means the bound is asymptotically stronger than Markov’s or Chebyshev’s inequality. However, independence of the random variables is required. Theorem 1 (Chernoff bound) Let X = x1 + x2 + · · · + xn where X1 , X2 , . . . , Xn are n independent 0/1 variables, each having probability pi . Then µ e P r [X ≥ (1 + )µ] ≤ (1) (1 + )1+ where µ = E[X]. Proof: Since ex is a convex function, for all t > 0, X ≥ (1 + )µ is equivalent with etX ≥ et(1+)µ . Here, t is a variable that we will optimize later. We can then apply Markov’s inequality which gives E[etX ] . et(1+)µ All Xi are independent so we can split the expectation into n parts. P r[X ≥ (1 + )µ] = P r[etX ≥ et(1+)µ ] ≤ (2) " n # n n i h Pn i h Pn Y Y Y tX tX t i=1 Xi tX (1 − pi ) + pi et E etXi = = E e i=1 i = E e i = E e =E e i=1 i=1 i=1 (3) Picking t = ln(1 + ) gives n n n Y Y Y E etX = (1 − pi ) + pi eln(1+) = (1 − pi ) + pi (1 + ) = 1 + pi . i=1 i=1 Now we use the bound 1 + x ≤ n Y ex (4) i=1 to get 1 + pi ≤ i=1 n Y epi = e Pn i=1 pi = eµ . (5) i=1 Combining this with equation 2 gives P r[X ≥ (1 + )µ] ≤ eµ eln(1+)(1+)µ = e (1 + )1+ µ . (6) 2 1 Corollary 2 We can simplify the Chernoff bound by distinguishing two cases. If ≤ 2e − 1 P r[X ≥ (1 + )µ] ≤ e (1 + )1+ µ ≤ e−µ/4 (7) and if > 2e − 1 P r[X ≥ (1 + )µ] ≤ 2 e (1 + )1+ µ ≤ e µ . (8) Some remarks • In our proof, the Xi were 0/1 random variables but this is not strictly necessary. The bound also holds if Xi ∈ [0, 1]. P • Independence of the Xi is crucial. Consider X = ni=1 X + i where X1 = 1 with probability 0 with probability 1 2 1 2 (9) n with probability 0 with probability 1 2 1 2 (10) and X1 = X2 = · · · = Xn . Then X= However, the Chernoff bound (equation 7) would imply n 1 1 n = P r[X = n] = P r X ≥ 1 + ≤ e− 16 2 2 2 (11) • Boundedness is also crucial. Consider the case where X1 , X2 , . . . , Xn are independent random variables with E[Xi ] = n1 and a random variable X0 with X0 = n with probability n1 0 with probability 1 − 1 n (12) Then h ni 1 Pr X ≥ ≥ P r [X0 = n] = 2 n (13) but the Chernoff bound (equation 8) would imply ni Pr X ≥ ≤ 2 h 2 e n/4 n/4 n/4 1 ≤ 2 (14) 3 Edge-disjoint paths We can find an application of Chernoff bounds in the problem of finding edge-disjoint paths in a graph. Problem 1 (Edge-disjoint paths) Given a directed graph G = (V, E), and k pairs (s1 , t1 ), (s2 , t2 ), . . . , (sk , tk ), find edge disjoint paths between (si , ti ) for all i ∈ {1, 2, . . . , k}. This problem is NP-hard. One might think this problem could be modeled into a max flow problem as follows. Create two auxiliary vertices s and t and connect all si to s with capacity 1, and connect all ti to t with capacity 1. Then find a flow of k units from s to t. However, this model can return a solution with a path from si to tj , i 6= j. For an example, we refer to figure 1 t2 s1 t s t1 s2 Figure 1: In this graph, a flow will be found from s1 to t2 and from s2 to t1 . Edge-disjoint flows from s1 to t1 and from s2 to t2 are not possible here. Even though this problem is NP-hard, we can still say something useful about it. For this reason we introduce the term Congestion. Definition 3 (Congestion) The congestion Ce of an edge e in a set of paths P is the amount of paths in P that use edge e. Theorem 4 (Congestion bound) There exists a polynomial-time algorithm that finds paths from si to ti for i ∈ {1, 2, . . . , k} such that for all e log n Ce = O (15) log log n provided edge-disjoint paths exist. We will prove this theorem, but first we need to do some work. We model this problem into a flow LP defined by equations 16 through 20. 3 k X fe(i) ≤ 1 ∀e ∈ E (congestion constraint) (16) i=1 fe(i) ≥ 0 X fe(i) = e∈N + (v) X e∈N + (s X fe(i) ∀e ∈ E, i = 1, 2, . . . , k (17) ∀v ∈ V, i = 1, 2, . . . , k (18) e∈N − (v) fe(i) = 1 (19) fe(i) = 1 (20) i) X e∈N − (ti ) (i) In this LP, i = 1, 2, . . . , k are commodities, fe represents how much commodity i is routed on edge i and N − (v) and N + (v) are the amount of incoming and outgoing edges of vertex v. Any flow f (i) from si to ti can be decomposed into a convex combination of paths. We formulate this more formally in the following lemma. Lemma 5 (Path decomposition) Let P (i) be the set of paths from si to ti . For any flow f (i) (i) from si to ti there exist αj , i = 1, 2, . . . , |P (i) | such that f (i) = X (i) (i) X αj Pj , j (i) αj = 1. (21) j (i) Proof: Let e be the edge with minimal flow in f (i) and let fe = δ. We can then extend this edge to a path from si to ti through e with flow δ and remove this from the flow. We can keep repeating this process until no flow remains which gives our convex combination of paths. 2 We can now prove the theorem. Proof: (Congestion bound) Let Pi be the collection of si -ti paths. For all commodities i = 1, 2, . . . , k we pick a path p ∈ Pi with probability αp . Now the expected congestion of an edge e is given by E[Ce ] = X X i p∈Pi st e∈p αp(i) = X fe(i) ≤ 1. (22) i Now the congestion of an edge can be written as X Xe(i) (23) 1 if path for i uses edge e 0 else. (24) Ce = i where Xe(i) = 4 (i) (i) Now P r[Xe = 1] = fe . The Chernoff bound gives c log n 1 P r Ce ≥ ≤ c log log n n (25) for all edges e and c > 0. As there are always less than n2 edges, the maximal congestion is bounded by 1 c log n 1 ≤ n2 c = c−2 P r max Ce ≥ (26) e∈E log log n n n which proves the theorem. 2 Now let G = (V, E) be a graph where the optimal maximum congestion is C where C is a large constant. An example of this problem could be a network where all paths have to cross a certain service point. √ Claim 6 The same algorithm as before now returns a congestion of C + O( C log n). (i) Proof: We again define random variables Xe such that 1 if path for i uses edge e (i) (27) Xe = 0 else. hP i (i) We find E [Ce ] = E X ≤ C. i e " # " # √ p X X C log n Pr (28) Xe(i) ≥ C + O C log n = P r C Xe(i) ≥ 1 + O C i i √ Since C is large, we can assume C > n log n and thus C C log n. Therefore we can apply equation 7 of the Chernoff bound. " Pr X i Xe(i) √ # C log n C log n C 1 ≥ 1+O C ≤ exp −C = e−O(log n) = (29) 2 C C 4 poly(n) 2 5