Randomized Algorithms
CS648
Lecture 4
• Linearity of Expectation with applications
(Most important tool for analyzing randomized algorithms)
RECAP FROM THE LAST LECTURE
Random variable
Definition: A random variable defined over a probability space (Ω,P) is a
mapping Ω → ℝ.
Examples:
o The number of HEADS when a coin is tossed 5 times.
o The sum of numbers seen when a die is thrown 3 times.
o The number of comparisons during Randomized Quick Sort on an array of
size n.
Notation for random variables:
• X, Y, U, … (capital letters)
• X(ω) denotes the value of X on elementary event ω.
Expected Value of a random variable
(average value)
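Definition: The expected value of a random variable X defined over a probability
space (Ω,P) is
$E[X] = \sum_{\omega \in \Omega} X(\omega) \cdot P(\omega)$
[Figure: Ω partitioned into the events X = a, X = b, and X = c]
Grouping the elementary events by the value that X takes gives the equivalent form
$E[X] = \sum_{a \in X} a \cdot P(X = a)$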
Examples
Random experiment 1: A fair coin is tossed n times
Random Variable X: The number of HEADS
$E[X] = \sum_{i} i \cdot P(X = i)$
$= \sum_{i} i \cdot \binom{n}{i} \left(\frac{1}{2}\right)^{i} \left(\frac{1}{2}\right)^{n-i}$
$= \frac{n}{2}$
Random Experiment 2: 4 balls into 3 bins
Random Variable X: The number of empty bins
$E[X] = \sum_{i} i \cdot P(X = i)$
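For an experiment this small, the sum in the definition can be evaluated by listing every outcome. The following is a minimal sketch, assuming each ball lands in a uniformly random bin independently of the others; the names `outcomes`, `dist`, and `expected` are illustrative and not from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

m, n = 4, 3  # 4 balls, 3 bins

# Each outcome is a tuple: outcome[k] is the bin chosen by ball k.
# Assumption: all n**m outcomes are equally likely.
outcomes = list(product(range(n), repeat=m))

# Distribution of X = number of empty bins.
counts = Counter(n - len(set(w)) for w in outcomes)
dist = {i: Fraction(c, len(outcomes)) for i, c in counts.items()}

# E[X] = sum over i of i * P(X = i)
expected = sum(i * p for i, p in dist.items())

print(dict(sorted(dist.items())))
print(expected)  # 16/27
```

Enumeration works here only because there are just 3^4 = 81 outcomes; it does not scale to general 𝑚 and 𝑛.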
Can we solve these problems?
Random Experiment 1: 𝑚 balls into 𝑛 bins
Random Variable X: The number of empty bins
E[X] = ??
Random Experiment 2: Randomized Quick Sort on 𝑛 elements
Random Variable Y: The number of comparisons
E[Y] = ??
Balls into Bins
(number of empty bins)
[Figure: 𝑚 balls, numbered 1 to m, thrown into 𝑛 bins, numbered 1 to n; a subset of 𝑖 bins is marked]
Question: X is a random variable denoting the number of empty bins.
E[X] = ??
Attempt 1: (based on the definition of expectation)
$E[X] = \sum_{i} i \cdot P(X = i)$
$= \sum_{i} i \cdot \binom{n}{i} \cdot P(\text{a specific subset of } i \text{ bins is empty and the rest are nonempty})$
$= \sum_{i} i \cdot \binom{n}{i} \cdot \left(1 - \frac{i}{n}\right)^{m} \cdot (1 - p(n-i, m))$
This is a correct but useless answer!
Randomized Quick Sort
(number of comparisons)
Question: Y is a random variable denoting the number of comparisons.
E[Y] = ??
Attempt 1: (based on the definition of expectation)
$E[Y] = \sum_{i} i \cdot P(Y = i)$
[Figure: a recursion tree associated with Randomized Quick Sort]
We cannot proceed from this point …
Balls into Bins
(number of empty bins)
[Figure: 𝑚 balls thrown into 𝑛 bins, with the 𝑖th bin marked]
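Question: Let $X_i$ be a random variable defined as follows.
$X_i = 1$ if the $i$th bin is empty, and $X_i = 0$ otherwise.
What is $E[X_i]$?
Answer: $E[X_i] = 1 \cdot P(i\text{th bin is empty}) + 0 \cdot P(i\text{th bin is not empty})$
$= P(i\text{th bin is empty})$
$= \left(1 - \frac{1}{n}\right)^{m}$
(each of the $m$ balls independently lands outside the $i$th bin with probability $1 - \frac{1}{n}$)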
Balls into Bins
(any relation between 𝑿 and 𝑿𝒊 ’s ?)
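Consider any elementary event ω.
[Figure: an elementary event ω in which 6 balls are thrown into 5 bins and exactly bins 2 and 4 remain empty]
$X(\omega) = 2$
$X_1(\omega) = 0$, $X_2(\omega) = 1$, $X_3(\omega) = 0$, $X_4(\omega) = 1$, $X_5(\omega) = 0$
$X(\omega) = X_1(\omega) + X_2(\omega) + X_3(\omega) + X_4(\omega) + X_5(\omega)$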
Sum of Random Variables
Definition: Let $U, V, W$ be random variables defined over a probability space (Ω,P)
such that
$U(\omega) = V(\omega) + W(\omega)$ for each $\omega \in \Omega$.
Then $U$ is said to be the sum of random variables $V$ and $W$.
A compact notation: $U = V + W$
Definition: Let $U$ and $V_1, V_2, \dots, V_n$ be random variables defined over a probability
space (Ω,P) such that
$U(\omega) = V_1(\omega) + V_2(\omega) + \dots + V_n(\omega)$ for each $\omega \in \Omega$.
Then $U$ is said to be the sum of random variables $V_1, V_2, \dots, V_n$.
A compact notation: $U = V_1 + V_2 + \dots + V_n$
Randomized Quick Sort
(number of comparisons)
[Figure: elements of A arranged in increasing order of value; $e_i$ denotes the $i$th smallest element]
Question: Let $Y_{ij}$, for any $1 \le i < j \le n$, be a random variable defined as follows.
$Y_{ij} = 1$ if $e_i$ is compared with $e_j$ during Randomized Quick Sort of A, and $Y_{ij} = 0$ otherwise.
What is $E[Y_{ij}]$?
Answer: $E[Y_{ij}] = 1 \cdot P(e_i \text{ is compared with } e_j) + 0 \cdot P(e_i \text{ is not compared with } e_j)$
$= P(e_i \text{ is compared with } e_j)$
$= \frac{2}{j-i+1}$
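The probability 2/(j−i+1) reflects that two elements are compared exactly when the first pivot chosen from among the j−i+1 elements between them (inclusive) is one of the two. The sketch below estimates this probability by simulation. It assumes a version of Randomized Quick Sort that picks the pivot uniformly at random and compares it once with every other element of the subarray; the function names are illustrative, not from the slides.

```python
import random

def compared_pairs(values, rng):
    """Run randomized quicksort on distinct values; return the set of compared pairs."""
    pairs = set()

    def sort(items):
        if len(items) <= 1:
            return
        pivot = rng.choice(items)  # uniformly random pivot
        smaller, larger = [], []
        for x in items:
            if x == pivot:
                continue
            # x is compared with the pivot exactly once
            pairs.add((min(x, pivot), max(x, pivot)))
            (smaller if x < pivot else larger).append(x)
        sort(smaller)
        sort(larger)

    sort(list(values))
    return pairs

def estimate_comparison_probability(n, i, j, trials=20000, seed=0):
    """Estimate P(e_i is compared with e_j) when the input is 1..n, so e_i = i."""
    rng = random.Random(seed)
    hits = sum((i, j) in compared_pairs(range(1, n + 1), rng) for _ in range(trials))
    return hits / trials

# Estimate for n = 10, i = 3, j = 7, next to the predicted value 2/(j-i+1) = 0.4
print(estimate_comparison_probability(10, 3, 7), 2 / (7 - 3 + 1))
```

The estimate should land close to 0.4, and it should not change noticeably if `n` is increased while `i` and `j` stay fixed.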
Randomized Quick Sort
(any relation between 𝒀 and 𝒀𝒊𝒋 ’s ?)
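Consider any elementary event ω. Each $Y_{ij}(\omega)$ is 0 or 1, for example:
$Y_{12}(\omega) = 1$, $Y_{13}(\omega) = 0$, …, $Y_{1n}(\omega) = 0$
$Y_{23}(\omega) = 1$, $Y_{24}(\omega) = 1$, …
…
$Y_{n-1\,n}(\omega) = 0$
Question: What is the relation between $Y(\omega)$ and the $Y_{ij}(\omega)$?
Answer: $Y(\omega) = \sum_{i<j} Y_{ij}(\omega)$
Hence $Y = \sum_{i<j} Y_{ij}$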
What have we learnt till now?
Balls into Bins experiment
X: random variable denoting the number of empty bins
Aim: E[X] = ??
$X = \sum_{i \le n} X_i$
$E[X_i] = \left(1 - \frac{1}{n}\right)^{m}$
$E[X] \stackrel{?}{=} \sum_{i \le n} E[X_i]$
Randomized Quick Sort
Y: random variable for the number of comparisons
Aim: E[Y] = ??
$Y = \sum_{i<j} Y_{ij}$
$E[Y_{ij}] = \frac{2}{j-i+1}$
$E[Y] \stackrel{?}{=} \sum_{i<j} E[Y_{ij}]$
The main question ?
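Let $U, V, W$ be random variables defined over a probability space (Ω,P) such
that $U = V + W$. Is
$E[U] \stackrel{?}{=} E[V] + E[W]$
$E[U] = \sum_{\omega \in \Omega} U(\omega) \cdot P(\omega)$
$= \sum_{\omega \in \Omega} (V(\omega) + W(\omega)) \cdot P(\omega)$
$= \sum_{\omega \in \Omega} \left( V(\omega) \cdot P(\omega) + W(\omega) \cdot P(\omega) \right)$
$= \sum_{\omega \in \Omega} V(\omega) \cdot P(\omega) + \sum_{\omega \in \Omega} W(\omega) \cdot P(\omega)$
$= E[V] + E[W]$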
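The derivation never uses any relationship between V and W, so it can be checked on a pair of variables that clearly depend on each other. The following is a small sketch on a hypothetical probability space of two fair dice; the choice of V and W is illustrative, not from the slides.

```python
from fractions import Fraction
from itertools import product

# Probability space: two fair dice, 36 equally likely elementary events.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}

def expectation(X):
    """E[X] = sum over elementary events w of X(w) * P(w)."""
    return sum(X(w) * P[w] for w in omega)

V = lambda w: w[0]          # value of the first die
W = lambda w: max(w)        # larger of the two values; depends on V
U = lambda w: V(w) + W(w)   # U = V + W, defined pointwise

assert expectation(U) == expectation(V) + expectation(W)
print(expectation(V), expectation(W), expectation(U))  # 7/2 161/36 287/36
```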
Balls into Bins
(number of empty bins)
[Figure: 𝑚 balls thrown into 𝑛 bins]
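$X$: random variable denoting the number of empty bins.
$X = \sum_{i \le n} X_i$
Using Linearity of Expectation
$E[X] = \sum_{i \le n} E[X_i]$
$= \sum_{i \le n} \left(1 - \frac{1}{n}\right)^{m}$
$= n \left(1 - \frac{1}{n}\right)^{m}$
$\approx n/e$ for $m = n$ (for large $n$)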
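The closed form can be cross-checked against the definition of expectation on small instances. This is a minimal sketch, assuming each ball picks a bin uniformly and independently; both function names are illustrative.

```python
from fractions import Fraction
from itertools import product
import math

def expected_empty_bins_formula(m, n):
    """n * (1 - 1/n)^m, the value obtained via linearity of expectation."""
    return n * (1 - Fraction(1, n)) ** m

def expected_empty_bins_bruteforce(m, n):
    """Average number of empty bins over all n**m equally likely outcomes."""
    total = sum(n - len(set(w)) for w in product(range(n), repeat=m))
    return Fraction(total, n ** m)

# The two agree exactly on instances small enough to enumerate.
for m, n in [(4, 3), (5, 4), (6, 6)]:
    assert expected_empty_bins_formula(m, n) == expected_empty_bins_bruteforce(m, n)

# For m = n the formula is close to n/e once n is large.
n = 1000
print(float(expected_empty_bins_formula(n, n)), n / math.e)
```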
Randomized Quick Sort
(number of comparisons)
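$Y$: random variable for the number of comparisons during Randomized Quick Sort on $n$ elements.
$Y = \sum_{i<j} Y_{ij}$
Using Linearity of Expectation:
$E[Y] = \sum_{i<j} E[Y_{ij}]$
$= \sum_{i<j} \frac{2}{j-i+1}$
$= \sum_{i=1}^{n} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$
$= 2 \sum_{i=1}^{n} \left[ \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n-i+1} \right]$
$< 2 \sum_{i=1}^{n} \left[ 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} \right] - 2n$
$= 2n H_n - 2n$
$= 2n \log_e n - O(n)$
Here $H_n = 1 + \frac{1}{2} + \dots + \frac{1}{n}$ is the $n$th harmonic number, with $H_n \le \log_e n + 1$ for all $n \ge 1$ and $H_n \approx \log_e n + 0.58$ for large $n$.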
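The sum over pairs can be evaluated exactly for small n and compared with an independent calculation. The sketch below assumes the usual model in which a uniformly random pivot is compared with each of the other elements of its subarray; the recurrence and the function names are illustrative additions, not taken from the slides.

```python
from fractions import Fraction
import math

def expected_comparisons_by_linearity(n):
    """Sum of E[Y_ij] = 2/(j-i+1) over all pairs 1 <= i < j <= n."""
    return sum(
        Fraction(2, j - i + 1)
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
    )

def expected_comparisons_by_recurrence(n):
    """Expected comparisons from the recursive structure of the algorithm.

    Assumed model: on a subarray of a given size, the pivot makes size-1
    comparisons, and its rank is uniform, leaving parts of sizes k and size-1-k.
    """
    c = [Fraction(0)] * (n + 1)
    for size in range(2, n + 1):
        c[size] = (size - 1) + sum(c[k] + c[size - 1 - k] for k in range(size)) / size
    return c[n]

for n in (2, 3, 10, 50):
    assert expected_comparisons_by_linearity(n) == expected_comparisons_by_recurrence(n)

# Compare the exact expectation with the leading term 2 n ln n.
n = 200
print(float(expected_comparisons_by_linearity(n)), 2 * n * math.log(n))
```

The exact value stays below 2n ln n, consistent with the bound 2n log_e n − O(n).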
Linearity of Expectation
Theorem:
• (For sum of 2 random variables)
If $U, V, W$ are random variables defined over a probability space (Ω,P) such
that $U = V + W$, then
$E[U] = E[V] + E[W]$
• (For sum of more than 2 random variables)
If $U = \sum_{i} V_i$, then $E[U] = \sum_{i} E[V_i]$
Where to use Linearity of expectation ?
Whenever we need to find $E[U]$ but neither of the following works:
• $E[U] = \sum_{\omega \in \Omega} U(\omega) \cdot P(\omega)$
• $E[U] = \sum_{a \in U} a \cdot P(U = a)$
In such a situation,
try to express $U$ as $\sum_{i} U_i$ such that it is “easy” to calculate $E[U_i]$.
Then calculate $E[U]$ using $E[U] = \sum_{i} E[U_i]$
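As a small illustration of this recipe, the coin-toss example from the earlier slide can be redone with indicator variables. The names $X_k$ are introduced here for illustration and do not appear in the slides.

```latex
% X   = number of HEADS in n tosses of a fair coin.
% X_k = 1 if the k-th toss is HEADS, 0 otherwise.
\begin{align*}
X &= \sum_{k=1}^{n} X_k, \qquad
E[X_k] = 1 \cdot \tfrac{1}{2} + 0 \cdot \tfrac{1}{2} = \tfrac{1}{2}, \\
E[X] &= \sum_{k=1}^{n} E[X_k] = \frac{n}{2}.
\end{align*}
```

This agrees with the value n/2 obtained from the binomial sum, without having to evaluate that sum.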
Think over the following questions
• Let $U, V, W$ be random variables defined over a probability space (Ω,P)
such that $U = aV + bW$ for some real numbers $a, b$. Is
$E[U] \stackrel{?}{=} a E[V] + b E[W]$
Answer: Yes (prove it as homework)
• Why does linearity of expectation always hold
(even when $V$ and $W$ are not independent)?
Answer: (If you have internalized the proof of linearity of expectation,
this question should appear meaningless.)
Think over the following questions
Definition: (Product of random variables)
Let $U, V, W$ be random variables defined over a probability space (Ω,P) such that
$U(\omega) = V(\omega) \cdot W(\omega)$ for each $\omega \in \Omega$.
Then $U$ is said to be the product of random variables $V$ and $W$.
A compact notation is $U = V \cdot W$
• If $U = V \cdot W$, is
$E[U] \stackrel{?}{=} E[V] \cdot E[W]$
Answer: No (give a counterexample to establish it.)
• If $U = V \cdot W$ and $V$ and $W$ are independent, then
$E[U] = E[V] \cdot E[W]$
Answer: Yes (prove it rigorously and find out the step which requires independence)
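Both answers can be seen on a very small probability space. The sketch below is one possible illustration using two fair coin tosses; the variables `first` and `second` are hypothetical names chosen here and are not from the slides.

```python
from fractions import Fraction
from itertools import product

# Probability space: two fair coin tosses, 1 = HEADS, 0 = TAILS.
omega = list(product((0, 1), repeat=2))
P = {w: Fraction(1, 4) for w in omega}

def expectation(X):
    return sum(X(w) * P[w] for w in omega)

first = lambda w: w[0]    # indicator that the first toss is HEADS
second = lambda w: w[1]   # indicator that the second toss is HEADS

# Dependent case: V = W = first, so V * W = first.
dependent_product = expectation(lambda w: first(w) * first(w))
# Independent case: V = first, W = second.
independent_product = expectation(lambda w: first(w) * second(w))

print(dependent_product, expectation(first) * expectation(first))      # 1/2 1/4
print(independent_product, expectation(first) * expectation(second))   # 1/4 1/4
```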
Independent random variables
In the previous slides, we used the notion of independence of random
variables. This notion is analogous to the notion of independence of events:
Two random variables are said to be independent if knowing the value of one
random variable does not influence the probability distribution of the other.
In other words,
$P(X = a \mid Y = b) = P(X = a)$
for all $a \in X$ and $b \in Y$.
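On a finite probability space this condition can be checked mechanically. The sketch below tests the equivalent product form P(X = a and Y = b) = P(X = a) · P(Y = b), which matches the conditional form whenever P(Y = b) > 0. The two-dice space and all names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Probability space: two fair dice, 36 equally likely elementary events.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}

def prob(event):
    """Probability of the set of elementary events satisfying `event`."""
    return sum(P[w] for w in omega if event(w))

def are_independent(X, Y):
    """True if P(X=a and Y=b) == P(X=a) * P(Y=b) for all values a of X and b of Y."""
    xs = {X(w) for w in omega}
    ys = {Y(w) for w in omega}
    return all(
        prob(lambda w: X(w) == a and Y(w) == b)
        == prob(lambda w: X(w) == a) * prob(lambda w: Y(w) == b)
        for a in xs
        for b in ys
    )

first = lambda w: w[0]         # value of the first die
second = lambda w: w[1]        # value of the second die
total = lambda w: w[0] + w[1]  # sum of both dice

print(are_independent(first, second))  # True
print(are_independent(first, total))   # False
```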
Some Practice problems
as homework
• Balls into bins problem:
• What is the expected number of bins having exactly 2 balls?
• We toss a coin n times. What is the expected number of times the pattern HHT
appears?
• A stick has n joints. The stick is dropped on the floor, and in this process each
joint may break with probability p, independently of the others. As a result, the
stick will break into many substicks.
– What is the expected number of substicks of length 3?
– What is the expected total number of substicks?
PROBLEMS OF THE NEXT LECTURE
Fingerprinting Techniques
Problem 1:
Given three $n \times n$ matrices $A$, $B$, and $C$, determine if $C = A \cdot B$.
Best deterministic algorithm:
• $D \leftarrow A \cdot B$;
• Verify if $C = D$
Time complexity: $O(n^{2.37})$
Randomized Monte Carlo algorithm:
Time complexity: $O(n^{2} \log n)$
Error probability: $< n^{-k}$ for any $k$.
Fingerprinting Techniques
Problem 2:
Given two large files A and B of $n$ bits each, located at two computers that are
connected by a network, we want to determine if A is identical to B. The aim
is to transmit the least number of bits to achieve this.
Randomized Monte Carlo algorithm:
Bits transmitted: $O(\log n)$
Error probability: $< n^{-k}$ for any $k$.