Introduction to Discrete Probability
CSC-2259 Discrete Structures
Konstantin Busch - LSU

Discrete Probability

Unbiased die

Sample Space: all possible outcomes
S = {1, 2, 3, 4, 5, 6}
Experiment: a procedure that yields outcomes (e.g. throw a die)

Event: any subset of the sample space
E1 = {3}    E2 = {2, 5}

Probability of event E:
p(E) = |E| / |S|
(size of the event set divided by the size of the sample space)

Note that 0 ≤ p(E) ≤ 1, since 0 ≤ |E| ≤ |S|.
What is the probability that a die brings 3?

Sample Space: S = {1, 2, 3, 4, 5, 6}
Event Space:  E = {3}
Probability:  p(E) = |E| / |S| = 1/6

What is the probability that a die brings 2 or 5?

Sample Space: S = {1, 2, 3, 4, 5, 6}
Event Space:  E = {2, 5}
Probability:  p(E) = |E| / |S| = 2/6 = 1/3
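The uniform formula p(E) = |E| / |S| can be checked by direct enumeration. A minimal sketch (names are illustrative, not from the slides):

```python
from fractions import Fraction

# Uniform sample space for one die; p(E) = |E| / |S|.
S = {1, 2, 3, 4, 5, 6}

def prob(E, S):
    # Intersect with S so stray outcomes outside the sample space are ignored.
    return Fraction(len(E & S), len(S))

print(prob({3}, S))      # -> 1/6  (die brings 3)
print(prob({2, 5}, S))   # -> 1/3  (die brings 2 or 5)
```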
Two unbiased dice

What is the probability that two dice bring (1,1)?

Sample Space: 36 possible outcomes, each an ordered pair (first die, second die)
S = {(1,1), (1,2), (1,3), ..., (6,6)}

Event Space: E = {(1,1)}

Probability: p(E) = |E| / |S| = 1/36
What is the probability that two dice bring the same numbers?

Sample Space: S = {(1,1), (1,2), (1,3), ..., (6,6)}
Event Space:  E = {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)}
Probability:  p(E) = |E| / |S| = 6/36 = 1/6

Game with unordered numbers

Game authority selects a set of 6 winning numbers out of 40
(number choices: 1, 2, 3, ..., 40; e.g. winning numbers: 4, 7, 16, 25, 33, 39)

Player picks a set of 6 numbers, order is irrelevant
(e.g. player numbers: 8, 13, 16, 23, 33, 40)

What is the probability that a player wins?
Winning event: a single set with the 6 winning numbers
E = {{4, 7, 16, 25, 33, 39}}    |E| = 1

Sample space: S = {all subsets of 6 numbers out of 40}
  = {{1,2,3,4,5,6}, {1,2,3,4,5,7}, {1,2,3,4,5,8}, ...}

|S| = C(40, 6) = 3,838,380

Probability that the player wins:
p(E) = |E| / |S| = 1 / 3,838,380
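The count C(40, 6) can be verified with the standard library's binomial coefficient:

```python
from math import comb

# Number of 6-element subsets of {1, ..., 40}; the winning event is one such subset.
total = comb(40, 6)
print(total)            # -> 3838380

p_win = 1 / total       # probability that the player's set matches the winning set
```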
A card game

Deck has 52 cards: 13 kinds (2, 3, 4, 5, 6, 7, 8, 9, 10, a, k, q, j),
each kind in 4 suits (h, d, c, s)

Player is given a hand with 4 cards.

What is the probability that the cards of the player are all of the same kind?

Event: each set of 4 cards of the same kind
E = {{2h, 2d, 2c, 2s}, {3h, 3d, 3c, 3s}, ..., {jh, jd, jc, js}}
|E| = 13

Sample Space: S = {all possible sets of 4 cards out of 52}
  = {{2h, 2d, 2c, 2s}, {2h, 2d, 2c, 3h}, {2h, 2d, 2c, 3d}, ...}

|S| = C(52, 4) = 52! / (4! 48!) = (52 * 51 * 50 * 49) / (4 * 3 * 2) = 270,725
Probability that the hand has 4 cards of the same kind:
p(E) = |E| / |S| = 13 / 270,725

Game with ordered numbers

Game authority selects from a bin 5 balls, in some order,
labeled with numbers 1...50
(number choices: 1, 2, 3, ..., 50; e.g. winning numbers: 37, 4, 16, 33, 9)

Player picks a sequence of 5 numbers, order is important
(e.g. player numbers: 40, 16, 13, 25, 33)

What is the probability that a player wins?
Sampling without replacement: after a ball is selected,
it is not returned to the bin.

Sample space size: 5-permutations of 50 balls
|S| = P(50, 5) = 50! / (50 - 5)! = 50! / 45!
    = 50 * 49 * 48 * 47 * 46 = 254,251,200

Probability of success: p(E) = |E| / |S| = 1 / 254,251,200

Sampling with replacement: after a ball is selected,
it is returned to the bin.

Sample space size: 5-permutations of 50 balls with repetition
|S| = 50^5 = 312,500,000

Probability of success: p(E) = |E| / |S| = 1 / 312,500,000
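Both sample-space sizes follow from elementary counting; a quick check with the standard library (math.perm computes the falling product P(n, k)):

```python
from math import perm

# Sampling without replacement: ordered 5-permutations of 50 balls.
no_repl = perm(50, 5)        # 50 * 49 * 48 * 47 * 46
# Sampling with replacement: each of the 5 draws has 50 choices.
with_repl = 50 ** 5

print(no_repl)               # -> 254251200
print(with_repl)             # -> 312500000
```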
Probability of Complement:

p(Ē) = 1 - p(E)

Proof:
Ē = S - E
|S - E| = |S| - |E|

p(Ē) = |Ē| / |S| = (|S| - |E|) / |S| = 1 - |E| / |S| = 1 - p(E)
End of Proof

Example: What is the probability that a binary string of 8 bits
contains at least one 0?

E = {01111111, 10111111, ..., 00111111, ..., 00000000}
Ē = {11111111}

p(E) = 1 - p(Ē) = 1 - 1/2^8 = 255/256
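The complement trick avoids counting the large event directly; a brute-force check over all 256 strings confirms it (sketch, names illustrative):

```python
from itertools import product

# Enumerate all 8-bit strings and count those with at least one 0,
# then compare with the complement formula 1 - 1/2^8.
strings = [''.join(bits) for bits in product('01', repeat=8)]
at_least_one_zero = sum('0' in s for s in strings)

assert at_least_one_zero / len(strings) == 1 - 1 / 2**8   # 255/256
```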
Probability of Union:

For events E1, E2 ⊆ S:
p(E1 ∪ E2) = p(E1) + p(E2) - p(E1 ∩ E2)

Proof:
|E1 ∪ E2| = |E1| + |E2| - |E1 ∩ E2|

p(E1 ∪ E2) = |E1 ∪ E2| / |S|
           = (|E1| + |E2| - |E1 ∩ E2|) / |S|
           = |E1| / |S| + |E2| / |S| - |E1 ∩ E2| / |S|
           = p(E1) + p(E2) - p(E1 ∩ E2)
End of Proof

Example: What is the probability that a binary string of 8 bits
starts with 0 or ends with 11?

Strings that start with 0:
E1 = {00000000, 00000001, ..., 01111111}
|E1| = 2^7 (all binary strings 0xxxxxxx, with 7 free bits)

Strings that end with 11:
E2 = {00000011, 00000111, ..., 11111111}
|E2| = 2^6 (all binary strings xxxxxx11, with 6 free bits)

Strings that start with 0 and end with 11:
E1 ∩ E2 = {00000011, 00000111, ..., 01111111}
|E1 ∩ E2| = 2^5 (all binary strings 0xxxxx11, with 5 free bits)

Strings that start with 0 or end with 11:
p(E1 ∪ E2) = p(E1) + p(E2) - p(E1 ∩ E2)
           = 2^7/2^8 + 2^6/2^8 - 2^5/2^8
           = 1/2 + 1/4 - 1/8 = 5/8

Probability Theory

Sample space: S = {x1, x2, ..., xn}

Probability distribution function p:
0 ≤ p(xi) ≤ 1
Σ_{i=1}^{n} p(xi) = 1
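Inclusion-exclusion for the 8-bit example can be verified by enumeration (sketch, names illustrative):

```python
from itertools import product

# 8-bit strings: E1 = starts with 0, E2 = ends with 11.
S = [''.join(b) for b in product('01', repeat=8)]
E1 = [s for s in S if s.startswith('0')]
E2 = [s for s in S if s.endswith('11')]
union = [s for s in S if s.startswith('0') or s.endswith('11')]
both = [s for s in E1 if s.endswith('11')]

lhs = len(union) / len(S)
rhs = len(E1) / len(S) + len(E2) / len(S) - len(both) / len(S)
assert lhs == rhs == 5 / 8   # inclusion-exclusion agrees with direct count
```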
Uniform probability distribution:
p(xi) = 1/n    (every outcome equally likely: p(xi) = p(xj))

Example: Unbiased Coin
S = {H, T}
Heads (H) or Tails (T) with probability 1/2:
p(H) = p(T) = 1/2

Notice that it can also be p(xi) ≠ p(xj):

Example: Biased Coin
S = {H, T}
Heads (H) with probability 2/3, Tails (T) with probability 1/3:
p(H) = 2/3    p(T) = 1/3    (2/3 + 1/3 = 1)

Probability of event E = {x1, x2, ..., xk} ⊆ S:
p(E) = Σ_{i=1}^{k} p(xi)

For a uniform probability distribution: p(E) = |E| / |S|

Example: Biased die
S = {1, 2, 3, 4, 5, 6}
p(1) = p(2) = p(3) = p(4) = p(5) = 1/7,  p(6) = 2/7

What is the probability that the die outcome is 2 or 6?
E = {2, 6}
p(E) = p(2) + p(6) = 1/7 + 2/7 = 3/7
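The non-uniform event formula p(E) = Σ p(xi) for the biased die can be sketched as follows (names illustrative):

```python
from fractions import Fraction

# Biased die: p(1) = ... = p(5) = 1/7, p(6) = 2/7.
p = {x: Fraction(1, 7) for x in range(1, 6)}
p[6] = Fraction(2, 7)
assert sum(p.values()) == 1          # valid probability distribution

def prob(E):
    # p(E) is the sum of the outcome probabilities, not |E|/|S|.
    return sum(p[x] for x in E)

print(prob({2, 6}))                  # -> 3/7
```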
Combinations of Events:

Complement:  p(Ē) = 1 - p(E)
Union:       p(E1 ∪ E2) = p(E1) + p(E2) - p(E1 ∩ E2)
Union of disjoint events:  p(∪_i Ei) = Σ_i p(Ei)

Conditional Probability

Three tosses of an unbiased coin.

Condition: the first coin is Tails.

Question: What is the probability that there is an odd number of Tails,
given that the first coin is Tails?
Sample space: S = {TTT, TTH, THT, THH, HTT, HTH, HHT, HHH}

Event without condition (odd number of Tails):
E = {TTT, THH, HTH, HHT}

Restricted sample space given the condition (first coin is Tails):
F = {TTT, TTH, THT, THH}

Event with condition:
E_F = E ∩ F = {TTT, THH}

Notation of event with condition: E_F = E | F   (event E given F)

Given the condition, the sample space changes to F:

p(E_F) = |E ∩ F| / |F| = (|E ∩ F| / |S|) / (|F| / |S|) = p(E ∩ F) / p(F)

p(E_F) = p(E | F) = (2/8) / (4/8) = 0.5    (the coin is unbiased)

Conditional probability definition (for an arbitrary probability distribution):
Given sample space S with events E and F (where p(F) > 0),
the conditional probability of E given F is:

p(E | F) = p(E ∩ F) / p(F)

Example: What is the probability that a family of two children has two boys,
given that one child is a boy?

Assume equal probability to have a boy or a girl.

Sample space: S = {BB, BG, GB, GG}
Condition: one child is a boy
F = {BB, BG, GB}
Event: both children are boys
E = {BB}

Conditional probability of the event:
p(E | F) = p(E ∩ F) / p(F) = p({BB}) / p({BB, BG, GB}) = (1/4) / (3/4) = 1/3

Independent Events

Events E1 and E2 are independent iff:
p(E1 ∩ E2) = p(E1) p(E2)

Equivalent definition (if p(E2) ≠ 0):
p(E1 | E2) = p(E1)
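The two-children result is easy to confirm by enumerating the restricted sample space (sketch, names illustrative):

```python
from itertools import product

# Two children, each equally likely B or G; condition: at least one boy.
S = list(product('BG', repeat=2))
F = [s for s in S if 'B' in s]                 # one child is a boy
E_and_F = [s for s in F if s == ('B', 'B')]    # both boys, within the condition

p_cond = len(E_and_F) / len(F)                 # p(E | F) = |E ∩ F| / |F|
print(p_cond)
```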
Example: 4-bit uniformly random strings

E1: a string begins with 1
E1 = {1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111}

E2: a string contains an even number of 1s
E2 = {0000, 0011, 0101, 0110, 1001, 1010, 1100, 1111}

|E1| = |E2| = 8
p(E1) = p(E2) = 8/16 = 1/2

E1 ∩ E2 = {1111, 1100, 1010, 1001}
|E1 ∩ E2| = 4

p(E1 ∩ E2) = 4/16 = 1/4 = (1/2)(1/2) = p(E1) p(E2)

Events E1 and E2 are independent.

Bernoulli trial: an experiment with two outcomes, success or failure.

Success probability: p
Failure probability: q = 1 - p

Example: Biased Coin
Success = Heads: p = p(H) = 2/3
Failure = Tails: q = p(T) = 1/3

Independent Bernoulli trials: the outcomes of successive Bernoulli trials
do not depend on each other.
(Example: successive coin tosses)

Throw the biased coin 5 times.
What is the probability to have 3 heads?

Heads probability: p = 2/3 (success)
Tails probability: q = 1/3 (failure)
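The independence of the two 4-bit-string events can be checked numerically (sketch, names illustrative):

```python
from itertools import product

# 4-bit strings: E1 = starts with 1, E2 = even number of 1s.
S = [''.join(b) for b in product('01', repeat=4)]

def p(ev):
    return len(ev) / len(S)

E1 = [s for s in S if s[0] == '1']
E2 = [s for s in S if s.count('1') % 2 == 0]
both = [s for s in E1 if s.count('1') % 2 == 0]

assert p(both) == p(E1) * p(E2) == 1 / 4   # independent events
```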
Probability that any particular sequence has 3 heads and 2 tails
in specified positions: p^3 q^2

For example:
HHHTT: p·p·p·q·q = p^3 q^2
HTHHT: p·q·p·p·q = p^3 q^2
HTHTH: p·q·p·q·p = p^3 q^2

Total number of ways to arrange in sequence 5 coins with 3 heads: C(5, 3)
Probability of having 3 heads:

C(5, 3) p^3 q^2 = p^3 q^2 + p^3 q^2 + ... + p^3 q^2
(one term p^3 q^2 for each successful sequence with 3 heads:
1st sequence, 2nd sequence, ..., over all C(5, 3) possible ways
to arrange in sequence 5 coins with 3 heads)

Probability to have exactly 3 heads:
C(5, 3) (2/3)^3 (1/3)^2 = 10 * (8/27) * (1/9) = 80/243 ≈ 0.33
Theorem: The probability to have k successes in n independent Bernoulli
trials is

b(k; n, p) = C(n, k) p^k q^(n-k)

Also known as the binomial probability distribution.

Proof:
p^k q^(n-k) is the probability that a sequence has k successes and
n - k failures in specified positions (example: SFSFFS...SSF).

C(n, k) is the total number of sequences with k successes and
n - k failures.

Therefore the probability of exactly k successes is C(n, k) p^k q^(n-k).
End of Proof

(In the coin example: C(5, 3) p^3 q^2 = (5! / (3! 2!)) (2/3)^3 (1/3)^2,
the probability to have 3 heads and 2 tails in any sequence positions.)
Example: Random binary strings where the probability of a 0 bit is 0.9
and the probability of a 1 bit is 0.1.

What is the probability of exactly 8 zero bits out of 10 bits?
(e.g. 0100001000)

p = 0.9,  q = 0.1,  k = 8,  n = 10

b(k; n, p) = C(n, k) p^k q^(n-k) = C(10, 8) (0.9)^8 (0.1)^2 = 0.1937102445

Birthday Problem

Birthday collision: two people have their birthday on the same day.

Problem: How many people should be in a room so that the probability
of a birthday collision is at least 1/2?

Assumption: equal probability to be born on any day.
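The binomial distribution of the theorem is one line of code; the sketch below (names illustrative) reproduces both worked examples:

```python
from math import comb

# Binomial probability b(k; n, p) = C(n, k) p^k q^(n-k), q = 1 - p.
def b(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# 3 heads in 5 tosses of the biased coin (p = 2/3):
print(b(3, 5, 2 / 3))      # 80/243, about 0.329
# 8 zero bits out of 10 with p(0) = 0.9:
print(b(8, 10, 0.9))       # about 0.1937
```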
366 days in a year.

If the number of people is 367 or more, then a birthday collision is
guaranteed by the pigeonhole principle.

Assume that we have n ≤ 366 people.

We will compute
p_n: the probability that n people have all different birthdays.
It will help us get
1 - p_n: the probability that there is a birthday collision among n people.

Sample space: Cartesian product of the birthday choices of each person
(1st person's choices × 2nd person's choices × ... × nth person's choices):

S = {1, 2, ..., 366} × {1, 2, ..., 366} × ... × {1, 2, ..., 366}
  = {(1, 1, ..., 1), (2, 1, ..., 1), ..., (366, 366, ..., 366)}

Sample space size: |S| = 366 * 366 * ... * 366 = 366^n

Event set: each person's birthday is different
E = {(1, 2, ..., 366), (366, 1, ..., 365), ..., (366, 365, ..., 1)}
|E| = P(366, n)
Probability of no birthday collision:

p_n = |E| / |S| = P(366, n) / 366^n
    = 366 * 365 * 364 * ... * (366 - n + 1) / 366^n
    = (366! / (366 - n)!) / 366^n

Probability of a birthday collision: 1 - p_n

n = 22:  1 - p_n ≈ 0.475
n = 23:  1 - p_n ≈ 0.506

Therefore: n = 23 people have probability at least 1/2 of a birthday collision.
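The threshold n = 23 is easy to recompute directly from the product formula for p_n (sketch, names illustrative):

```python
# Probability that n people all have different birthdays, out of 366 days:
# p_n = (366/366) * (365/366) * ... * ((366 - n + 1)/366)
def p_all_different(n):
    p = 1.0
    for i in range(n):
        p *= (366 - i) / 366
    return p

print(round(1 - p_all_different(22), 3))   # -> 0.475
print(round(1 - p_all_different(23), 3))   # -> 0.506
```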
Randomized algorithms: algorithms with randomized choices
(example: quicksort)

Las Vegas algorithms: randomized algorithms whose output is always correct
(e.g. quicksort)

Monte Carlo algorithms: randomized algorithms whose output is correct with
some probability (they may produce wrong output)

The birthday problem analysis can be used to determine appropriate hash
table sizes that minimize collisions.

Hash function collision: h(k1) = h(k2)
A Monte Carlo algorithm for primality testing:

Primality_Test( n, k ) {
    for ( i = 1 to k ) {
        b = random_number(1, ..., n)
        if ( Miller_Test( n, b ) == failure )
            return(false)    // n is not prime
    }
    return(true)             // most likely n is prime
}

Miller_Test( n, b ) {
    // write n - 1 = 2^s * t,  where s, t ≥ 0,  s ≤ log n,  t is odd
    for ( j = 0 to s-1 ) {
        if ( b^t ≡ 1 (mod n)  or  b^(2^j * t) ≡ -1 (mod n) )
            return(success)
    }
    return(failure)
}
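A runnable sketch of the same Monte Carlo test, under the usual conventions (this is a standard Miller-Rabin implementation, not taken verbatim from the slides; bases are drawn from 2..n-2 and pow does the modular exponentiation):

```python
import random

def miller_test(n, b):
    # Write n - 1 = 2^s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    x = pow(b, t, n)
    if x == 1 or x == n - 1:      # b^t = +-1 (mod n): n passes for base b
        return True
    for _ in range(s - 1):        # check b^(2^j * t) = -1 (mod n)
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False                  # n fails: certainly composite

def primality_test(n, k=20):
    if n < 4:
        return n in (2, 3)
    for _ in range(k):
        b = random.randrange(2, n - 1)
        if not miller_test(n, b):
            return False          # n is not prime, for sure
    return True                   # most likely n is prime
```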
If the primality test algorithm returns false,
then the number is not prime for sure.

A prime number n passes the Miller test for every b, 1 ≤ b < n.

A composite number n passes the Miller test, for b in the range 1 ≤ b < n,
for fewer than n/4 values of b.

If the algorithm returns true, then the answer is correct (the number
is prime) with high probability: each iteration gives a false positive
with probability less than 1/4, so k iterations err with probability
less than (1/4)^k ≤ 1/n for k = log_4 n,
i.e. the answer is correct with probability at least 1 - 1/n.
Bayes' Theorem

For events E, F with p(E) ≠ 0 and p(F) ≠ 0:

p(F | E) = p(E | F) p(F) / ( p(E | F) p(F) + p(E | F̄) p(F̄) )

Applications: machine learning, spam filters.

Proof:
p(F | E) = p(E ∩ F) / p(E)   so   p(E ∩ F) = p(F | E) p(E)
p(E | F) = p(E ∩ F) / p(F)   so   p(E ∩ F) = p(E | F) p(F)

Therefore p(F | E) p(E) = p(E | F) p(F), which gives:

p(F | E) = p(E | F) p(F) / p(E)

Since E = (E ∩ F) ∪ (E ∩ F̄) and (E ∩ F) ∩ (E ∩ F̄) = ∅:
p(E) = p(E ∩ F) + p(E ∩ F̄)

With p(E ∩ F) = p(E | F) p(F) and p(E ∩ F̄) = p(E | F̄) p(F̄):
p(E) = p(E | F) p(F) + p(E | F̄) p(F̄)

Substituting into p(F | E) = p(E | F) p(F) / p(E):

p(F | E) = p(E | F) p(F) / ( p(E | F) p(F) + p(E | F̄) p(F̄) )
End of Proof
Example: Select a random box, then select a random ball in the box.

E: select a red ball      Ē: select a green ball
F: select box 1           F̄: select box 2

Question: If a red ball is selected, what is the probability
it was taken from box 1?  That is: p(F | E) = ?

Bayes' Theorem:
p(F | E) = p(E | F) p(F) / ( p(E | F) p(F) + p(E | F̄) p(F̄) )

We only need to compute p(F), p(F̄), p(E | F), p(E | F̄):

p(F)  = 1/2 = 0.5           probability to select box 1
p(F̄)  = 1/2 = 0.5           probability to select box 2
p(E | F)  = 7/9 ≈ 0.777     probability to select a red ball from box 1
p(E | F̄)  = 3/7 ≈ 0.428     probability to select a red ball from box 2

Final result:
p(F | E) = (0.777 * 0.5) / (0.777 * 0.5 + 0.428 * 0.5)
         = 0.777 / (0.777 + 0.428) ≈ 0.645
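A sketch of the posterior computation (the box contents are assumed from the slide's figure, consistent with p(E|F) = 7/9 and p(E|F̄) = 3/7; names illustrative):

```python
# Posterior probability that the red ball came from box 1 (Bayes' theorem).
# Assumed from the slide's figure: box 1 has 7 red of 9 balls,
# box 2 has 3 red of 7 balls; each box is chosen with probability 1/2.
pF, pF_bar = 0.5, 0.5
pE_given_F, pE_given_Fbar = 7 / 9, 3 / 7

posterior = (pE_given_F * pF) / (pE_given_F * pF + pE_given_Fbar * pF_bar)
print(round(posterior, 3))    # -> 0.645
```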
What if we had more boxes?

Generalized Bayes' Theorem (for a sample space S = F1 ∪ F2 ∪ ... ∪ Fn
partitioned into mutually exclusive events):

p(Fj | E) = p(E | Fj) p(Fj) / Σ_{i=1}^{n} p(E | Fi) p(Fi)

Spam Filters

Training set:
B: spam (bad) emails
G: good emails

A user classifies each email in the training set as good or bad.

Find the words that occur in B and G:
n_B(w): number of spam emails that contain word w
n_G(w): number of good emails that contain word w

p(w) = n_B(w) / |B|    probability that a spam email contains w
q(w) = n_G(w) / |G|    probability that a good email contains w

A new email X arrives.

S: event that X is spam
E: event that X contains word w

What is the probability that X is spam, given that it contains word w?
p(S | E) = ?

Reject the email if this probability is at least 0.9.
p(S | E) = p(E | S) p(S) / ( p(E | S) p(S) + p(E | S̄) p(S̄) )

Simplifying assumption: p(S) = p(S̄) = 0.5

Then we only need to compute, from the training set:
p(E | S)  = p(w) = n_B(w) / |B|
p(E | S̄)  = q(w) = n_G(w) / |G|

Example: Training set for the word "Rolex":
"Rolex" occurs in 250 of 2000 spam emails
"Rolex" occurs in 5 of 1000 good emails

p(Rolex) = n_B(Rolex) / |B| = 250 / 2000 = 0.125
q(Rolex) = n_G(Rolex) / |G| = 5 / 1000 = 0.005

If a new email X contains the word "Rolex",
what is the probability that it is spam?

S: event that X is spam
E: event that X contains the word "Rolex"
p(S | E) = ?
p(S | E) = p(E | S) p(S) / ( p(E | S) p(S) + p(E | S̄) p(S̄) )
         = (0.125 * 0.5) / (0.125 * 0.5 + 0.005 * 0.5)
         = 0.125 / 0.13 ≈ 0.9615

The new email is considered to be spam because:
p(S | E) ≈ 0.9615 ≥ 0.9 (the spam threshold)
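With equal priors p(S) = p(S̄) = 0.5, the prior terms cancel and the score reduces to p(w) / (p(w) + q(w)); a sketch (names illustrative):

```python
# Single-word Bayesian spam score with priors p(S) = p(not S) = 0.5,
# so the 0.5 factors cancel out of Bayes' theorem.
def spam_score(p_w, q_w):
    return p_w / (p_w + q_w)

p_rolex = 250 / 2000    # fraction of spam emails containing "Rolex"
q_rolex = 5 / 1000      # fraction of good emails containing "Rolex"

score = spam_score(p_rolex, q_rolex)
print(round(score, 3))  # -> 0.962, above the 0.9 spam threshold
```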
Better spam filters use two words:

p(S | E1 ∩ E2) = p(E1 | S) p(E2 | S) /
                 ( p(E1 | S) p(E2 | S) + p(E1 | S̄) p(E2 | S̄) )

Assumption: E1 and E2 are independent, i.e. the two words appear
independently of each other.