Randomized Algorithms (CS648), Lecture 4
Linearity of Expectation with applications
(the most important tool for analyzing randomized algorithms)

RECAP FROM THE LAST LECTURE

Random variable
Definition: A random variable defined over a probability space (Ω, P) is a mapping X : Ω → R.
Examples:
o The number of HEADS when a coin is tossed 5 times.
o The sum of the numbers seen when a die is thrown 3 times.
o The number of comparisons during Randomized Quick Sort on an array of size n.
Notation for random variables:
• X, Y, U, … (capital letters)
• X(ω) denotes the value of X on elementary event ω.

Expected value of a random variable (average value)
Definition: The expected value of a random variable X defined over a probability space (Ω, P) is
  E[X] = Σ_{ω∈Ω} X(ω)·P(ω).
(Figure: Ω partitioned into the events X = a, X = b, X = c, …)
Equivalently, grouping elementary events by the value X takes,
  E[X] = Σ_a a·P(X = a).

Examples
Random experiment 1: a fair coin is tossed n times.
Random variable X: the number of HEADS.
  E[X] = Σ_i i·P(X = i) = Σ_i i·C(n, i)·(1/2)^i·(1/2)^(n−i) = n/2.
Random experiment 2: 4 balls into 3 bins.
Random variable X: the number of empty bins.
  E[X] = Σ_i i·P(X = i).

Can we solve these problems?
Random experiment 1: m balls into n bins.
Random variable X: the number of empty bins. E[X] = ??
Random experiment 2: Randomized Quick Sort on n elements.
Random variable Y: the number of comparisons. E[Y] = ??

Balls into bins (number of empty bins)
(Figure: m balls thrown independently and uniformly at random into n bins; a subset of i bins highlighted.)
Question: X is the random variable denoting the number of empty bins. E[X] = ??
Attempt 1 (based on the definition of expectation):
  E[X] = Σ_i i·P(X = i)
       = Σ_i i·C(n, i)·P(a specific subset of i bins is empty and the rest are nonempty)
       = Σ_i i·C(n, i)·(1 − i/n)^m·(1 − p(n − i, m)),
where p(n − i, m) denotes the probability that some bin remains empty when m balls are thrown into n − i bins.
This is a correct but useless answer!

Randomized Quick Sort (number of comparisons)
Question: Y is the random variable denoting the number of comparisons. E[Y] = ??
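Before attempting a derivation, E[Y] can be estimated empirically. The sketch below (function names are my own, not from the lecture) counts pivot comparisons over many runs of Randomized Quick Sort:

```python
import random

def quicksort_comparisons(arr):
    """Run Randomized Quick Sort on arr (distinct elements) and return
    the number of element-to-element comparisons performed."""
    if len(arr) <= 1:
        return 0
    pivot = random.choice(arr)  # pivot chosen uniformly at random
    left = [x for x in arr if x < pivot]
    right = [x for x in arr if x > pivot]
    # Partitioning compares each of the other len(arr) - 1 elements with the pivot.
    return len(arr) - 1 + quicksort_comparisons(left) + quicksort_comparisons(right)

def estimate_comparisons(n, trials=300):
    """Average comparison count over many runs: an empirical estimate of E[Y]."""
    return sum(quicksort_comparisons(list(range(n))) for _ in range(trials)) / trials

# For n = 100 the estimate comes out near 650, consistent with 2n*ln(n) - O(n).
```

The input order is irrelevant here because the pivot is chosen uniformly at random at every level of the recursion.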
Attempt 1 (based on the definition of expectation):
  E[Y] = Σ_i i·P(Y = i).
(Figure: a recursion tree associated with Randomized Quick Sort.)
There is no usable expression for P(Y = i), so we cannot proceed from this point…

Balls into bins (number of empty bins)
Question: Let X_i be a random variable defined as follows:
  X_i = 1 if the i-th bin is empty, and 0 otherwise.
What is E[X_i]?
Answer:
  E[X_i] = 1·P(i-th bin is empty) + 0·P(i-th bin is not empty)
         = P(i-th bin is empty)
         = (1 − 1/n)^m.

Balls into bins (any relation between X and the X_i's?)
Consider any elementary event ω. For example, with 6 balls thrown into 5 bins so that exactly bins 2 and 4 are empty:
  X(ω) = 2, and X_1(ω) = 0, X_2(ω) = 1, X_3(ω) = 0, X_4(ω) = 1, X_5(ω) = 0.
In every case,
  X(ω) = X_1(ω) + X_2(ω) + X_3(ω) + X_4(ω) + X_5(ω).

Sum of random variables
Definition: Let U, V, W be random variables defined over a probability space (Ω, P) such that U(ω) = V(ω) + W(ω) for each ω ∈ Ω. Then U is said to be the sum of the random variables V and W. A compact notation: U = V + W.
Definition: Let U and V_1, V_2, …, V_n be random variables defined over a probability space (Ω, P) such that U(ω) = V_1(ω) + V_2(ω) + … + V_n(ω) for each ω ∈ Ω. Then U is said to be the sum of the random variables V_1, V_2, …, V_n. A compact notation: U = V_1 + V_2 + ⋯ + V_n.

Randomized Quick Sort (number of comparisons)
Let e_1, …, e_n be the elements of A arranged in increasing order of value.
Question: Let Y_ij, for any 1 ≤ i < j ≤ n, be a random variable defined as follows:
  Y_ij = 1 if e_i is compared with e_j during Randomized Quick Sort of A, and 0 otherwise.
What is E[Y_ij]?
Answer:
  E[Y_ij] = 1·P(e_i is compared with e_j) + 0·P(e_i is not compared with e_j)
          = P(e_i is compared with e_j)
          = 2/(j − i + 1).
(Recall: e_i and e_j are compared if and only if the first pivot selected from {e_i, …, e_j} is either e_i or e_j, and each of these j − i + 1 elements is equally likely to be selected first.)

Randomized Quick Sort (any relation between Y and the Y_ij's?)
Consider any elementary event ω. Each Y_ij(ω) is 0 or 1; for example, Y_12(ω) = 1, Y_13(ω) = 0, …, Y_23(ω) = 1, Y_24(ω) = 1, …, Y_(n−1)n(ω) = 0.
Question: What is the relation between Y(ω) and the Y_ij(ω)'s?
Answer: Y(ω) = Σ_{i<j} Y_ij(ω). Hence Y = Σ_{i<j} Y_ij.

What have we learnt till now?
Balls-into-bins experiment:
  X: random variable denoting the number of empty bins. Aim: E[X] = ??
  X = Σ_{i≤n} X_i and E[X_i] = (1 − 1/n)^m. Is E[X] = Σ_{i≤n} E[X_i] ?
Randomized Quick Sort:
  Y: random variable for the number of comparisons. Aim: E[Y] = ??
  Y = Σ_{i<j} Y_ij and E[Y_ij] = 2/(j − i + 1). Is E[Y] = Σ_{i<j} E[Y_ij] ?

The main question
Let U, V, W be random variables defined over a probability space (Ω, P) such that U = V + W. Is E[U] = E[V] + E[W]?
  E[U] = Σ_{ω∈Ω} U(ω)·P(ω)
       = Σ_{ω∈Ω} (V(ω) + W(ω))·P(ω)
       = Σ_{ω∈Ω} (V(ω)·P(ω) + W(ω)·P(ω))
       = Σ_{ω∈Ω} V(ω)·P(ω) + Σ_{ω∈Ω} W(ω)·P(ω)
       = E[V] + E[W].

Balls into bins (number of empty bins)
X: random variable denoting the number of empty bins, X = Σ_{i≤n} X_i.
Using linearity of expectation,
  E[X] = Σ_{i≤n} E[X_i] = Σ_{i≤n} (1 − 1/n)^m = n(1 − 1/n)^m ≈ n/e for m = n.

Randomized Quick Sort (number of comparisons)
Y: random variable for the number of comparisons during Randomized Quick Sort on n elements, Y = Σ_{i<j} Y_ij.
Using linearity of expectation,
  E[Y] = Σ_{i<j} E[Y_ij]
       = Σ_{i<j} 2/(j − i + 1)
       = Σ_{i=1}^{n} Σ_{j=i+1}^{n} 2/(j − i + 1)
       = 2 Σ_{i=1}^{n} [1/2 + 1/3 + … + 1/(n − i + 1)]
       < 2 Σ_{i=1}^{n} [1 + 1/2 + 1/3 + … + 1/n] − 2n
       = 2n·H_n − 2n
       = 2n·log_e n − O(n),   since H_n = log_e n + O(1) (H_n ≈ log_e n + 0.58 for large n).

Linearity of Expectation
Theorem:
• (Sum of 2 random variables) If U, V, W are random variables defined over a probability space (Ω, P) such that U = V + W, then E[U] = E[V] + E[W].
• (Sum of more than 2 random variables) If U = Σ_i V_i, then E[U] = Σ_i E[V_i].

Where to use linearity of expectation?
Whenever we need to find E[U] but neither of the following works:
• E[U] = Σ_{ω∈Ω} U(ω)·P(ω)
• E[U] = Σ_a a·P(U = a)
In such a situation, try to express U as Σ_i U_i such that each E[U_i] is "easy" to calculate. Then calculate E[U] using E[U] = Σ_i E[U_i].

Think over the following questions
• Let U, V, W be random variables defined over a probability space (Ω, P) such that U = aV + bW for some real numbers
a, b. Is E[U] = aE[V] + bE[W]?
Answer: yes (prove it as homework).
• Why does linearity of expectation always hold (even when V and W are not independent)?
Answer: if you have internalized the proof of linearity of expectation, this question should appear meaningless; the proof nowhere uses independence.

Think over the following questions
Definition (product of random variables): Let U, V, W be random variables defined over a probability space (Ω, P) such that U(ω) = V(ω)·W(ω) for each ω ∈ Ω. Then U is said to be the product of the random variables V and W. A compact notation: U = V·W.
• If U = V·W, is E[U] = E[V]·E[W]?
Answer: no (give a counterexample to establish it).
• If U = V·W and V and W are independent, then E[U] = E[V]·E[W].
Answer: yes (prove it rigorously and find the step which requires independence).

Independent random variables
In the previous slides we used the notion of independence of random variables. This notion is identical to the notion of independence of events: two random variables are said to be independent if knowing the value of one does not influence the probability distribution of the other. In other words,
  P(X = a | Y = b) = P(X = a) for every value a of X and every value b of Y.

Some practice problems as homework
• Balls-into-bins problem: what is the expected number of bins having exactly 2 balls?
• We toss a coin n times; what is the expected number of times the pattern HHT appears?
• A stick has n joints. The stick is dropped on the floor, and in this process each joint breaks with probability p, independently of the others. As a result the stick breaks into many substicks.
  – What is the expected number of substicks of length 3?
  – What is the expected number of substicks in all?

PROBLEMS OF THE NEXT LECTURE

Fingerprinting techniques
Problem 1: Given three n × n matrices A, B, and C, determine whether C = A·B.
Best deterministic algorithm:
• D ← A·B;
• verify whether C = D.
Time complexity: O(n^2.37), the cost of the best known matrix-multiplication algorithm.
Randomized Monte Carlo algorithm:
Time complexity: O(n^2 log n)
Error probability: < n^(−k) for any k.

Fingerprinting techniques
Problem 2: Given two large files A and B of n bits each, located at two computers connected by a network, determine whether A is identical to B. The aim is to transmit the fewest possible bits.
Randomized Monte Carlo algorithm:
Bits transmitted: O(log n)
Error probability: < n^(−k) for any k.
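The lecture defers the algorithm for Problem 1, but the idea can be previewed with random-vector verification in the style of Freivalds' algorithm: instead of computing A·B, multiply both sides by a random vector r, which costs only O(n^2) per round. Note this simple 0/1-vector sketch gives error < 2^(−k) after k rounds, not the n^(−k) bound quoted above, which needs a different choice of fingerprint.

```python
import random

def verify_product(A, B, C, k=20):
    """Check whether C == A*B (n x n integer matrices, as lists of lists)
    by testing A*(B*r) == C*r for k random 0/1 vectors r.
    If C == A*B the test always passes; otherwise each round catches the
    discrepancy with probability >= 1/2, so a wrong 'True' has
    probability < 2**-k. Each round costs O(n^2)."""
    n = len(A)
    for _ in range(k):
        r = [random.randint(0, 1) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # certain: C != A*B
    return True  # probably C == A*B
```

A "False" answer is always correct; only "True" carries (exponentially small) error, which is the defining behavior of a Monte Carlo verifier with one-sided error.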