Download 1 Chernoff bound

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Computational fluid dynamics wikipedia , lookup

Randomness wikipedia , lookup

Path integral formulation wikipedia , lookup

Dijkstra's algorithm wikipedia , lookup

Birthday problem wikipedia , lookup

Travelling salesman problem wikipedia , lookup

Signal-flow graph wikipedia , lookup

Transcript
2WO08: Graphs and Algorithms
Instructor: Nikhil Bansal
1
Lecture 14
Date: 14/5/2013
Scribe: Bouke Cloostermans
Chernoff bound
The Chernoff bound gives exponentially decreasing bounds on tail distributions of sums of independent random variables. This means the bound is asymptotically stronger than Markov’s or
Chebyshev’s inequality. However, independence of the random variables is required.
Theorem 1 (Chernoff bound) Let X = x1 + x2 + · · · + xn where X1 , X2 , . . . , Xn are n independent 0/1 variables, each having probability pi . Then
µ
e
P r [X ≥ (1 + )µ] ≤
(1)
(1 + )1+
where µ = E[X].
Proof: Since ex is a convex function, for all t > 0, X ≥ (1 + )µ is equivalent with etX ≥ et(1+)µ .
Here, t is a variable that we will optimize later. We can then apply Markov’s inequality which gives
E[etX ]
.
et(1+)µ
All Xi are independent so we can split the expectation into n parts.
P r[X ≥ (1 + )µ] = P r[etX ≥ et(1+)µ ] ≤
(2)
" n
#
n
n
i
h Pn
i
h Pn
Y
Y
Y
tX tX
t i=1 Xi
tX
(1 − pi ) + pi et
E etXi =
= E e i=1 i = E
e i =
E e
=E e
i=1
i=1
i=1
(3)
Picking t = ln(1 + ) gives
n
n
n
Y
Y
Y
E etX =
(1 − pi ) + pi eln(1+) =
(1 − pi ) + pi (1 + ) =
1 + pi .
i=1
i=1
Now we use the bound 1 + x ≤
n
Y
ex
(4)
i=1
to get
1 + pi ≤
i=1
n
Y
epi = e
Pn
i=1
pi
= eµ .
(5)
i=1
Combining this with equation 2 gives
P r[X ≥ (1 + )µ] ≤
eµ
eln(1+)(1+)µ
=
e
(1 + )1+
µ
.
(6)
2
1
Corollary 2 We can simplify the Chernoff bound by distinguishing two cases.
If ≤ 2e − 1
P r[X ≥ (1 + )µ] ≤
e
(1 + )1+
µ
≤ e−µ/4
(7)
and if > 2e − 1
P r[X ≥ (1 + )µ] ≤
2
e
(1 + )1+
µ
≤
e µ
.
(8)
Some remarks
• In our proof, the Xi were 0/1 random variables but this is not strictly necessary. The bound
also holds if Xi ∈ [0, 1].
P
• Independence of the Xi is crucial. Consider X = ni=1 X + i where
X1 =
1 with probability
0 with probability
1
2
1
2
(9)
n with probability
0 with probability
1
2
1
2
(10)
and X1 = X2 = · · · = Xn . Then
X=
However, the Chernoff bound (equation 7) would imply
n
1
1 n
= P r[X = n] = P r X ≥ 1 +
≤ e− 16
2
2 2
(11)
• Boundedness is also crucial. Consider the case where X1 , X2 , . . . , Xn are independent random
variables with E[Xi ] = n1 and a random variable X0 with
X0 =
n with probability n1
0 with probability 1 −
1
n
(12)
Then
h
ni
1
Pr X ≥
≥ P r [X0 = n] =
2
n
(13)
but the Chernoff bound (equation 8) would imply
ni
Pr X ≥
≤
2
h
2
e
n/4
n/4
n/4
1
≤
2
(14)
3
Edge-disjoint paths
We can find an application of Chernoff bounds in the problem of finding edge-disjoint paths in a
graph.
Problem 1 (Edge-disjoint paths) Given a directed graph G = (V, E), and k pairs (s1 , t1 ),
(s2 , t2 ), . . . , (sk , tk ), find edge disjoint paths between (si , ti ) for all i ∈ {1, 2, . . . , k}.
This problem is NP-hard. One might think this problem could be modeled into a max flow
problem as follows. Create two auxiliary vertices s and t and connect all si to s with capacity 1,
and connect all ti to t with capacity 1. Then find a flow of k units from s to t. However, this model
can return a solution with a path from si to tj , i 6= j. For an example, we refer to figure 1
t2
s1
t
s
t1
s2
Figure 1: In this graph, a flow will be found from s1 to t2 and from s2 to t1 . Edge-disjoint flows
from s1 to t1 and from s2 to t2 are not possible here.
Even though this problem is NP-hard, we can still say something useful about it. For this
reason we introduce the term Congestion.
Definition 3 (Congestion) The congestion Ce of an edge e in a set of paths P is the amount of
paths in P that use edge e.
Theorem 4 (Congestion bound) There exists a polynomial-time algorithm that finds paths from
si to ti for i ∈ {1, 2, . . . , k} such that for all e
log n
Ce = O
(15)
log log n
provided edge-disjoint paths exist.
We will prove this theorem, but first we need to do some work. We model this problem into a
flow LP defined by equations 16 through 20.
3
k
X
fe(i) ≤ 1
∀e ∈ E
(congestion constraint)
(16)
i=1
fe(i) ≥ 0
X
fe(i) =
e∈N + (v)
X
e∈N + (s
X
fe(i)
∀e ∈ E, i = 1, 2, . . . , k
(17)
∀v ∈ V, i = 1, 2, . . . , k
(18)
e∈N − (v)
fe(i) = 1
(19)
fe(i) = 1
(20)
i)
X
e∈N − (ti )
(i)
In this LP, i = 1, 2, . . . , k are commodities, fe represents how much commodity i is routed on
edge i and N − (v) and N + (v) are the amount of incoming and outgoing edges of vertex v.
Any flow f (i) from si to ti can be decomposed into a convex combination of paths. We formulate
this more formally in the following lemma.
Lemma 5 (Path decomposition) Let P (i) be the set of paths from si to ti . For any flow f (i)
(i)
from si to ti there exist αj , i = 1, 2, . . . , |P (i) | such that
f (i) =
X
(i)
(i)
X
αj Pj ,
j
(i)
αj = 1.
(21)
j
(i)
Proof: Let e be the edge with minimal flow in f (i) and let fe = δ. We can then extend this edge
to a path from si to ti through e with flow δ and remove this from the flow. We can keep repeating
this process until no flow remains which gives our convex combination of paths.
2
We can now prove the theorem.
Proof: (Congestion bound) Let Pi be the collection of si -ti paths. For all commodities i =
1, 2, . . . , k we pick a path p ∈ Pi with probability αp . Now the expected congestion of an edge e is
given by
E[Ce ] =
X
X
i
p∈Pi st e∈p
αp(i) =
X
fe(i) ≤ 1.
(22)
i
Now the congestion of an edge can be written as
X
Xe(i)
(23)
1 if path for i uses edge e
0 else.
(24)
Ce =
i
where
Xe(i)
=
4
(i)
(i)
Now P r[Xe = 1] = fe . The Chernoff bound gives
c log n
1
P r Ce ≥
≤ c
log log n
n
(25)
for all edges e and c > 0. As there are always less than n2 edges, the maximal congestion is
bounded by
1
c log n
1
≤ n2 c = c−2
P r max Ce ≥
(26)
e∈E
log log n
n
n
which proves the theorem.
2
Now let G = (V, E) be a graph where the optimal maximum congestion is C where C is a large
constant. An example of this problem could be a network where all paths have to cross a certain
service point.
√
Claim 6 The same algorithm as before now returns a congestion of C + O( C log n).
(i)
Proof: We again define random variables Xe such that
1 if path for i uses edge e
(i)
(27)
Xe =
0 else.
hP
i
(i)
We find E [Ce ] = E
X
≤ C.
i e
"
#
"
#
√
p
X
X
C
log
n
Pr
(28)
Xe(i) ≥ C + O
C log n = P r
C
Xe(i) ≥ 1 + O
C
i
i
√
Since C is large, we can assume C > n log n and thus C C log n. Therefore we can apply
equation 7 of the Chernoff bound.
"
Pr
X
i
Xe(i)
√
#
C log n
C log n C
1
≥ 1+O
C ≤ exp −C
= e−O(log n) =
(29)
2
C
C
4
poly(n)
2
5