Probability Distributions

Random Variables

EXAMPLE: Consider rolling a fair die twice. The sample space is
$$S = \{(i, j) : i, j \in \{1, \dots, 6\}\}.$$
Suppose we are interested in the sum of the two rolls, for instance because we have placed a bet at a craps table. Let $X$ be the sum. Then $X \in \{2, 3, \dots, 12\}$ is random, as it depends upon the outcome of the experiment. It is a random variable. We can compute probabilities associated with $X$:
$$P(X = 2) = P\{(1, 1)\} = 1/36$$
$$P(X = 3) = P\{(1, 2), (2, 1)\} = 2/36$$
$$P(X = 4) = P\{(1, 3), (2, 2), (3, 1)\} = 3/36$$
We can write this succinctly as a table:
$$\begin{array}{c|ccccccc}
\text{Sum, } i & 2 & 3 & 4 & 5 & 6 & \dots & 12 \\ \hline
P(X = i) & 1/36 & 2/36 & 3/36 & 4/36 & 5/36 & \dots & 1/36
\end{array}$$

DEFINITION: Let $S$ be a sample space. Then a function $X : S \to \mathbb{R}$ is a random variable.

EXAMPLE: Consider a bin with 5 white and 4 red chips. Let $S$ be the sample space associated with selecting three chips, with replacement, from the bin. Let $X$ be the number of white chips chosen. The sample space is
$$S = \{(a, b, c) : a, b, c \in \{W, R\}\}.$$
We have
$$X(R, R, R) = 0$$
$$X(W, R, R) = X(R, W, R) = X(R, R, W) = 1$$
$$X(W, W, R) = X(W, R, W) = X(R, W, W) = 2$$
$$X(W, W, W) = 3$$
We now find the probabilities. Each draw is white with probability 5/9 and red with probability 4/9, so
$$P(X = 0) = P(R, R, R) = \left(\frac{4}{9}\right)^3$$
$$P(X = 1) = P((W, R, R) \cup (R, W, R) \cup (R, R, W)) = P(W, R, R) + P(R, W, R) + P(R, R, W) = 3 \cdot \frac{5}{9} \cdot \left(\frac{4}{9}\right)^2$$
$$P(X = 2) = 3 \cdot \left(\frac{5}{9}\right)^2 \cdot \frac{4}{9}$$
$$P(X = 3) = \left(\frac{5}{9}\right)^3$$
In table form:
$$\begin{array}{c|cccc}
i & 0 & 1 & 2 & 3 \\ \hline
P(X = i) & \left(\frac{4}{9}\right)^3 & 3 \cdot \frac{5}{9} \cdot \left(\frac{4}{9}\right)^2 & 3 \cdot \left(\frac{5}{9}\right)^2 \cdot \frac{4}{9} & \left(\frac{5}{9}\right)^3
\end{array}$$
Note that
$$\left(\frac{4}{9}\right)^3 + 3 \cdot \frac{5}{9} \cdot \left(\frac{4}{9}\right)^2 + 3 \cdot \left(\frac{5}{9}\right)^2 \cdot \frac{4}{9} + \left(\frac{5}{9}\right)^3 = 1.$$
The two tables above illustrate what we mean by a probability distribution.

The Binomial Distribution

Consider $n$ independent repeated trials of an experiment, each of which results in either a success or a failure. Let the probability of a success, $p$, be the same for each trial. Let $X$ be the number of successes in the $n$ trials. Then
$$f(k) = P\{X = k\} = \binom{n}{k} p^k (1 - p)^{n-k} \quad \text{if } k \in \{0, 1, 2, \dots, n\}.$$

EXAMPLE: If the probability is 0.70 that any one registered voter (randomly selected from official rolls) will vote in a given election, what is the probability that two of five registered voters will vote in the election?
Solution: We have
$$f(2) = \binom{5}{2} (0.70)^2 (1 - 0.70)^{5-2} = 10 (0.70)^2 (0.30)^3 = 0.132$$

EXAMPLE: A product is defective with probability 0.01, and it is sold in packages of 10. The company offers money back if two or more items in a package are defective. What is the probability that a given package will be returned for cash?

Solution: Let $X$ denote the number of defective products in a given package. We want to know $P\{X \geq 2\}$. Clearly, $X$ is a binomial random variable with $n = 10$ and $p = 0.01$. It is easier to compute $P\{X \geq 2\} = 1 - P\{X \leq 1\}$. Thus, we need
$$P\{X = 0\} = \binom{10}{0} \cdot 0.01^0 \cdot 0.99^{10} = 0.90438, \qquad P\{X = 1\} = \binom{10}{1} \cdot 0.01^1 \cdot 0.99^9 = 0.09135$$
Therefore, the desired probability is
$$P\{X \geq 2\} = 1 - P\{X = 0\} - P\{X = 1\} \approx 0.0043$$
Thus, about 0.4 percent of the packages are eligible for a refund.

EXAMPLE: The probability is 0.30 that a person shopping at a certain supermarket will take advantage of its special promotion of ice cream. Find the probabilities that among six persons shopping at this market there will be 0, 1, 2, 3, 4, 5, or 6 who will take advantage of this promotion.

Solution: We have
$$f(0) = \binom{6}{0} (0.30)^0 (0.70)^6 = 0.118$$
$$f(1) = \binom{6}{1} (0.30)^1 (0.70)^5 = 0.303$$
$$f(2) = \binom{6}{2} (0.30)^2 (0.70)^4 = 0.324$$
$$f(3) = \binom{6}{3} (0.30)^3 (0.70)^3 = 0.185$$
$$f(4) = \binom{6}{4} (0.30)^4 (0.70)^2 = 0.060$$
$$f(5) = \binom{6}{5} (0.30)^5 (0.70)^1 = 0.010$$
$$f(6) = \binom{6}{6} (0.30)^6 (0.70)^0 = 0.001$$
all rounded to three decimals.

Hypergeometric Distribution

Suppose that $n$ objects are to be chosen from a set of $a$ objects of one kind (successes) and $b$ objects of another kind (failures), the selection is without replacement, and we are interested in the probability of getting $x$ successes and $n - x$ failures.
Then for sampling without replacement the probability of "$x$ successes in $n$ trials" is
$$f(x) = \frac{\binom{a}{x} \cdot \binom{b}{n-x}}{\binom{a+b}{n}} \quad \text{for } x = 0, 1, 2, \dots, n$$
where $x$ cannot exceed $a$ and $n - x$ cannot exceed $b$. This is the formula for the hypergeometric distribution.

EXAMPLE: A mailroom clerk is supposed to send 6 of 15 packages to Europe by airmail, but he gets them all mixed up and randomly puts airmail postage on 6 of the packages. What is the probability that only 3 of the packages that are supposed to be sent by air get airmail postage?

Solution: Here $a = 6$, $b = 9$, $n = 6$, and $x = 3$. We have
$$f(3) = \frac{\binom{6}{3} \cdot \binom{9}{6-3}}{\binom{6+9}{6}} = \frac{20 \cdot 84}{5{,}005} \approx 0.336$$

EXAMPLE: Among an ambulance service's 16 ambulances, five emit excessive amounts of pollutants. If eight of the ambulances are randomly picked for inspection, what is the probability that this sample will include at least three of the ambulances that emit excessive amounts of pollutants?

Solution: Here $a = 5$, $b = 11$, and $n = 8$. We have
$$f(3) = \frac{\binom{5}{3} \cdot \binom{11}{8-3}}{\binom{5+11}{8}} = \frac{10 \cdot 462}{12{,}870} \approx 0.359$$
$$f(4) = \frac{\binom{5}{4} \cdot \binom{11}{8-4}}{\binom{5+11}{8}} = \frac{5 \cdot 330}{12{,}870} \approx 0.128$$
$$f(5) = \frac{\binom{5}{5} \cdot \binom{11}{8-5}}{\binom{5+11}{8}} = \frac{1 \cdot 165}{12{,}870} \approx 0.013$$
Then
$$f(3) + f(4) + f(5) = 0.359 + 0.128 + 0.013 = 0.500$$
This result suggests that the inspection should, perhaps, have included more than eight of the ambulances.

The Poisson Distribution

When $n$ is large and $p$ is small, binomial probabilities are often approximated by means of the formula
$$f(x) = \frac{(np)^x \cdot e^{-np}}{x!} \quad \text{for } x = 0, 1, 2, 3, \dots$$
which is a special form of the Poisson distribution.

REMARK: It is safe to use the Poisson approximation to the binomial distribution only when $n \geq 100$ and $np \leq 10$.

EXAMPLE: A community of 100,000 people is struck with a fatal disease in which the chance of death is 0.0001. What is the probability that at most 8 people will die?

Solution: This is binomial with parameters $n = 10^5$ and $p = 10^{-4}$.
So, if $X$ is the random variable giving the number of deaths, then
$$\begin{aligned}
P\{X \leq 8\} &= \sum_{k=0}^{8} \binom{10^5}{k} (10^{-4})^k (1 - 10^{-4})^{10^5 - k} = \sum_{k=0}^{8} \frac{(10^5)!}{k!\,(10^5 - k)!} (10^{-4})^k (1 - 10^{-4})^{10^5 - k} \\
&= \frac{(10^5)!}{0!\,(10^5 - 0)!} (10^{-4})^0 (1 - 10^{-4})^{10^5 - 0} + \frac{(10^5)!}{1!\,(10^5 - 1)!} (10^{-4})^1 (1 - 10^{-4})^{10^5 - 1} + \dots \\
&\quad + \frac{(10^5)!}{7!\,(10^5 - 7)!} (10^{-4})^7 (1 - 10^{-4})^{10^5 - 7} + \frac{(10^5)!}{8!\,(10^5 - 8)!} (10^{-4})^8 (1 - 10^{-4})^{10^5 - 8}
\end{aligned}$$
Computationally, this formula is too complicated to evaluate directly. Another option is to use the Poisson approximation with $\lambda = np = 10$:
$$P\{X \leq 8\} \approx \sum_{k=0}^{8} e^{-10} \frac{10^k}{k!} = e^{-10} \frac{10^0}{0!} + e^{-10} \frac{10^1}{1!} + e^{-10} \frac{10^2}{2!} + \dots + e^{-10} \frac{10^8}{8!} = 0.3328197$$

EXAMPLE: Disney World averages 5 injuries per day. What is the probability that there will be more than 6 injuries tomorrow?

Solution: At first it seems that there is not enough information to answer this question. However, the mean of a binomial distribution is $np$, so $np = 5$. We also know that $n$ is large and $p$ is small. Therefore,
$$P\{X \geq 7\} = 1 - P\{X < 7\} \approx 1 - \sum_{k=0}^{6} e^{-5} \frac{5^k}{k!} = 1 - \left( e^{-5} \frac{5^0}{0!} + e^{-5} \frac{5^1}{1!} + \dots + e^{-5} \frac{5^6}{6!} \right) = 0.2378$$
Note that it would be impossible to solve this using the binomial view because we don't know $p$ or $n$.

EXAMPLE: Suppose that there are, on average, 1.2 typos on a page of a given book. What is the probability that there are more than two errors on a given page?

Solution: We model the number of errors on a given page, $X$, as a Poisson random variable with $\lambda = np = 1.2$. Therefore
$$P\{X \geq 3\} = 1 - P\{X \leq 2\} = 1 - \sum_{k=0}^{2} e^{-1.2} \frac{1.2^k}{k!} = 1 - \left( e^{-1.2} \frac{1.2^0}{0!} + e^{-1.2} \frac{1.2^1}{1!} + e^{-1.2} \frac{1.2^2}{2!} \right) \approx 0.120513$$
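
The three formulas in these notes are easy to check numerically. The following is a minimal Python sketch (standard library only, Python 3.8 or later for `math.comb`) that evaluates several of the worked examples. The function names `binomial_pmf`, `hypergeometric_pmf`, and `poisson_pmf` and the variable names are illustrative choices, not part of the original notes.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P{X = k}: k successes in n independent trials, success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(x, n, a, b):
    """P{x successes} when n objects are drawn without replacement
    from a successes and b failures."""
    return comb(a, x) * comb(b, n - x) / comb(a + b, n)


def poisson_pmf(x, lam):
    """P{X = x} for a Poisson random variable with mean lam
    (lam = n*p when approximating a binomial)."""
    return lam**x * exp(-lam) / factorial(x)


# Voter example: two of five voters, p = 0.70.
voters = binomial_pmf(2, 5, 0.70)

# Defective-product example: P{X >= 2} with n = 10, p = 0.01.
returned = 1 - binomial_pmf(0, 10, 0.01) - binomial_pmf(1, 10, 0.01)

# Mailroom example: a = 6, b = 9, n = 6, x = 3.
mailroom = hypergeometric_pmf(3, 6, 6, 9)

# Ambulance example: at least three of the five polluting ambulances.
ambulances = sum(hypergeometric_pmf(x, 8, 5, 11) for x in (3, 4, 5))

# Disease example: Poisson approximation with lambda = 10, P{X <= 8}.
at_most_8 = sum(poisson_pmf(k, 10) for k in range(9))

# Disney World example: lambda = 5, P{X >= 7}.
more_than_6 = 1 - sum(poisson_pmf(k, 5) for k in range(7))

# Typo example: lambda = 1.2, P{X >= 3}.
more_than_2 = 1 - sum(poisson_pmf(k, 1.2) for k in range(3))

for name, value in [
    ("voters", voters),
    ("returned", returned),
    ("mailroom", mailroom),
    ("ambulances", ambulances),
    ("at_most_8", at_most_8),
    ("more_than_6", more_than_6),
    ("more_than_2", more_than_2),
]:
    print(f"{name}: {value:.6f}")
```

The sums use `range(9)`, `range(7)`, and `range(3)` because the upper limits 8, 6, and 2 are inclusive in the formulas, while Python ranges exclude their endpoint.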