2. Averages and Expected Values of Random Variables

In the next section we will be interested in computing the average cost of testing diodes when we test them in groups of n. This is a special case of the mean or expected value of a random variable. The mean or expected value of a random variable is related to computing the average of a sequence of related measurements, but is not quite the same. So let's look at averages of a sequence of numbers first.

Suppose we have a sequence of observations x1, x2, …, xn of something. So x1, x2, …, xn is just a sequence of numbers, which may be observations of something that have already been made, so there is nothing probabilistic about them. The average x̄ of these observations is their sum divided by the number of observations, i.e.

(1)    x̄ = (x1 + x2 + ⋯ + xn)/n

Example 1. You are a wholesaler for gasoline and each week you buy and sell gasoline. Naturally you are interested in how the price you pay for gasoline (the wholesale price) varies from week to week. Suppose the wholesale prices of gasoline for five weeks were

q1 = $2.70,  q2 = $2.60,  q3 = $2.80,  q4 = $2.70,  q5 = $2.80

The average of these five prices is

q̄ = (q1 + q2 + q3 + q4 + q5)/5 = (2.70 + 2.60 + 2.80 + 2.70 + 2.80)/5 = 13.60/5 = $2.72

Why are we interested in averages? One reason is that the average falls somewhere in the "middle" of the values, so it is often used to summarize a group of numbers by a single number. Another reason is the following. Suppose you were to sell the gasoline over the five week period at a single price s. What price s should you have sold the gasoline for in order to come out even for the five week period, assuming you buy and sell the same amount each week? It is not hard to see that s = the average = $2.72, since

amount received for selling a gallon each week = 5s = (5)(2.72) = 13.60 = amount paid for buying a gallon each week

Problem 1. Each day a newsstand buys and sells The Wall Street Journal.
Suppose the number they have sold on each of the past ten days is 1, 3, 0, 1, 2, 0, 2, 1, 3, 1. Find the average number of copies they have sold per day during the past ten days. Answer: 1.4

Problem 2. A company manufactures diodes. 100 diodes are taken from the production line and tested. 98 of these are good and 2 are bad. Suppose we assign 1 to a diode if it is bad and 0 if it is good, so that the result of testing these 100 diodes is a sequence of numbers x1, …, x100 where xj is 1 if the jth diode is bad and 0 if it is good. What is the average of this sequence of values? (This average is the proportion of diodes that are bad.) Answer: 0.02

Problem 3. A company manufactures diodes. 100 diodes are taken from the production line and tested. However, instead of testing them individually, they are tested in groups of four, so there are 25 groups of four. The cost of testing a group of four is 2 cents if they are all good and 7 cents if one or more are bad. So the costs of testing the 25 groups can be represented by c1, …, c25. Suppose the 23rd and the 52nd diodes are bad and the rest are good. Thus c6 = 7 and c13 = 7 and cj = 2 if j ≠ 6 and j ≠ 13. Find the average of c1, …, c25. Answer: 2.4

Now let's connect the average of a sequence of observations with random variables. Suppose we are modeling a situation where we are going to make a sequence of related observations by a sequence X1, X2, …, Xn of random variables, where Xj is the result of the jth observation. Suppose each of the random variables Xj takes on the values x1, …, xm and all the random variables have the same probability mass function f(x), where f(xk) = Pr{Xj = xk} for each j and k. Suppose q1, q2, …, qn are the values we actually observe for the random variables X1, X2, …, Xn. In our computation of q̄ let's group all the values of qj that equal x1 together, all the values of qj that equal x2 together, etc.
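This grouping can also be carried out numerically. The following Python sketch (purely illustrative, using only the test costs from Problem 3 above) computes the average of the 25 costs both directly, as in (1), and by grouping equal values, where gk/n is the fraction of observations equal to xk.

```python
from collections import Counter

# Test costs from Problem 3: 25 groups of four diodes.  The bad 23rd and
# 52nd diodes fall in groups 6 and 13, which therefore cost 7 cents each;
# the other 23 groups cost 2 cents each.
costs = [7 if j in (6, 13) else 2 for j in range(1, 26)]

# Direct average: sum divided by the number of observations, as in (1).
n = len(costs)
direct = sum(costs) / n

# Grouped average: sum of (g_k / n) * x_k over the distinct values x_k,
# where g_k counts how many of the observations equal x_k.
grouped = sum((g / n) * x for x, g in Counter(costs).items())

print(direct, round(grouped, 10))  # 2.4 2.4, the answer to Problem 3
```

Both computations give 2.4, which is exactly the identity the derivation below establishes in general.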
Then we have

q̄ = (q1 + q2 + ⋯ + qn)/n
  = [(x1 + x1 + ⋯ + x1) + (x2 + x2 + ⋯ + x2) + ⋯ + (xm + xm + ⋯ + xm)]/n
  = (g1x1 + g2x2 + ⋯ + gmxm)/n
  = (g1/n)x1 + (g2/n)x2 + ⋯ + (gm/n)xm

where gj is the number of times that xj appears in q1, q2, …, qn. As n → ∞ we expect gk/n → Pr{X = xk} = f(xk), where X denotes any of the Xj. So as n → ∞ we expect

(2)    q̄ → f(x1)x1 + f(x2)x2 + ⋯ + f(xm)xm

The sum f(x1)x1 + f(x2)x2 + ⋯ + f(xm)xm is called the mean or expected value of each of the random variables Xj. We summarize this by means of the following definition.

Definition 1. Suppose X is a random variable that takes on the values x1, …, xm. Let f(x) be the probability mass function, i.e. f(xk) = Pr{X = xk} for each k. Then

(3)    X̄ = E(X) = mean of X = expected value of X
         = Pr{X = x1}·x1 + Pr{X = x2}·x2 + ⋯ + Pr{X = xm}·xm = Σ_{k=1}^m Pr{X = xk}·xk
         = f(x1)x1 + f(x2)x2 + ⋯ + f(xm)xm = Σ_{k=1}^m f(xk)xk

So (2) can be restated as

q̄ → X̄

where X̄ is the common mean of X1, X2, …, Xn. The fact that (2) holds if the Xj are independent is actually an important theorem in probability theory called the Law of Large Numbers. A precise statement is in Theorem 7 below.

Example 2. Suppose in Example 1 the set of possible values for the wholesale gasoline price in any particular week is S = {2.60, 2.70, 2.80, 2.90, 3.00}. Let Xj be the wholesale price of gasoline in week j, where week one is the first full week of May of this year. The Xj can be regarded as random variables. Assume each of the Xj has the same probability distribution and the probabilities that the gasoline price Xj takes on the values in S for the jth week are as follows:

Pr{Xj = 2.60} = 0.25   Pr{Xj = 2.70} = 0.4   Pr{Xj = 2.80} = 0.2   Pr{Xj = 2.90} = 0.1   Pr{Xj = 3.00} = 0.05

Then

X̄ = (0.25)(2.60) + (0.4)(2.70) + (0.2)(2.80) + (0.1)(2.90) + (0.05)(3.00) = 0.65 + 1.08 + 0.56 + 0.29 + 0.15 = 2.73

If the Xj are all independent, then we would expect the average q̄n of the actual prices over n weeks to approach $2.73 as n → ∞.

Problem 4.
Each day a newsstand buys and sells The Wall Street Journal. Based on records for the past month they feel that they would never sell more than 4 copies in any day. Suppose the probabilities of selling a certain number of copies on a given day are

The probability, Pr{0}, of selling zero copies in a given day = 0.21
The probability, Pr{1}, of selling one copy in a given day = 0.26
Pr{2} = 0.32,  Pr{3} = 0.16,  Pr{4} = 0.05

Let X be the number of copies the newsstand sells tomorrow. Find X̄. Answer: 1.58

Problem 5. A company manufactures diodes. Suppose the probability that a diode is defective is 0.3%, i.e.

The probability that a diode is defective = Pr{d} = 0.003
The probability that a diode is good = Pr{g} = 0.997

Suppose the random variable X is defined by X(d) = 1 and X(g) = 0. Find X̄. How is X̄ related to the other parameters in the situation? Ans: X̄ = 0.003 = Pr{d}

Problem 6. A company manufactures diodes. They are tested in groups of four. The cost C of testing a group of four is 2 cents if they are all good and 7 cents if one or more are bad. Suppose, as in Problem 5, the probability of one diode being defective is 0.003 and whether one diode is defective is independent of whether any other diode is defective. We saw in Example 15 in section 1.3 that the probability that all four diodes in a group of four are good is (0.997)^4 and the probability that one or more is defective is 1 − (0.997)^4. Find E(C). Ans: 7 − 5(0.997)^4 ≈ 2.05973

Means of Special Types of Distributions. For certain special types of random variables there are formulas for their mean. The following propositions give the mean for uniform, Bernoulli, geometric and Poisson distributions.

Proposition 1. Suppose X has a uniform distribution on equally spaced outcomes, i.e. the set of possible values for X is S = {a, a + h, a + 2h, …, a + mh = b} and Pr{X = a + kh} = 1/(m + 1) for k = 0, 1, …, m. Then X̄ = (a + b)/2.

Proof.
X̄ = Σ_{k=0}^m [1/(m + 1)](a + kh) = [1/(m + 1)] Σ_{k=0}^m (a + kh) = [1/(m + 1)][(m + 1)a + h Σ_{k=0}^m k]
  = a + [1/(m + 1)]·h·[m(m + 1)/2] = a + mh/2 = a + (b − a)/2 = (a + b)/2

since mh = b − a. Here we have used the fact that the sum of the integers from 1 to m is m(m + 1)/2. //

Example 3. Let X be the outcome of a single roll of a fair die. Then X has outcomes 1, 2, 3, 4, 5, 6. Since the die is assumed fair, X has a uniform distribution with a = 1, b = 6, h = 1 and m = 5. By Proposition 1, X̄ = (a + b)/2 = 3.5.

Proposition 2. Suppose X has a Bernoulli distribution, i.e. Pr{X = 0} = 1 − p and Pr{X = 1} = p where p lies between 0 and 1. Then X̄ = p.

Proof. X̄ = (1 − p)(0) + (p)(1) = p. //

Proposition 3. Suppose X has a geometric distribution, i.e. Pr{X = k} = p(1 − p)^(k−1) for k = 1, 2, 3, … where p lies between 0 and 1. Then X̄ = 1/p.

Proof. X̄ = Σ_{k=1}^∞ kp(1 − p)^(k−1). In order to do this sum we start with the fact that Σ_{m=1}^∞ (1 − p)^(m−1) = 1/p and take the derivative of both sides with respect to p. This gives Σ_{m=1}^∞ (m − 1)(1 − p)^(m−2) = 1/p^2 (the minus signs produced by differentiating the two sides cancel). If we replace m − 1 by k and multiply both sides by p we get Σ_{k=1}^∞ kp(1 − p)^(k−1) = 1/p. //

Example 4. A store sells two types of tables: plain and deluxe. When a customer buys a table, there is an 80% chance that it will be a plain table. Assume that each day five tables are sold. Let N be the number of days until a deluxe table is sold, starting with today, which corresponds to N = 0. What is the expected number of days until a deluxe table is sold?

Solution. The probability that the five tables sold on a given day are all plain is (0.8)^5 ≈ 0.3277. The probability of selling at least one deluxe table on a given day is p = 1 − (0.8)^5 ≈ 0.6723. The probability of first selling a deluxe table on day n is p(1 − p)^n, i.e. Pr{N = n} = p(1 − p)^n. If we let M = N + 1, then Pr{M = m} = Pr{N + 1 = m} = Pr{N = m − 1} = p(1 − p)^(m−1). So M is geometric. By Proposition 3, E{M} = 1/p. So E{N} = E{M − 1} = E{M} − 1 = 1/p − 1 ≈ 1/0.6723 − 1 ≈ 1.487 − 1 = 0.487.

Proposition 4.
Suppose N has a Poisson distribution, i.e. Pr{N = n} = λ^n e^(−λ)/n! for n = 0, 1, 2, 3, … where λ is a positive parameter. Then E{N} = λ.

Proof. E{N} = Σ_{n=0}^∞ n λ^n e^(−λ)/n! = Σ_{n=1}^∞ λ^n e^(−λ)/(n − 1)!. We replace n − 1 by k and factor out λ, giving E{N} = λ Σ_{k=0}^∞ λ^k e^(−λ)/k! = λ. Here we have used the fact that Σ_{k=0}^∞ λ^k e^(−λ)/k! = 1. //

Example 5. A hospital observes that the number N of heart attack cases that arrive in the Emergency Room is a Poisson random variable with mean 3 per hour. Find the probability that no more than two heart attack cases arrive in the Emergency Room during the next hour.

Solution. By Proposition 4, one has λ = 3. Then

Pr{N ≤ 2} = Pr{N = 0} + Pr{N = 1} + Pr{N = 2} = (λ^0/0!)e^(−λ) + (λ^1/1!)e^(−λ) + (λ^2/2!)e^(−λ) = (1 + λ + λ^2/2)e^(−λ) = (1 + 3 + 4.5)e^(−3) = 8.5e^(−3) ≈ 0.423

Properties of means. The operation of finding the mean of a random variable has a number of useful properties.

Theorem 5. Let S be a sample space and X be a random variable with domain S. Let x1, …, xm be the values X assumes and let E1, …, Eq be disjoint events whose union is S such that for each r the random variable X assumes the same value on Er, i.e. there is a k = k(r) such that X(a) = x_k(r) for a ∈ Er. Then

(4)    E(X) = Σ_{r=1}^q x_k(r) Pr{Er}

(5)    E(X) = Σ_{a∈S} X(a) Pr{a}

Proof. By (3) one has E(X) = Σ_{k=1}^m Pr{X = xk} xk. Let E_k1, …, E_k,rk be those Er on which X assumes the value xk. Then {X = xk} = E_k1 ∪ ⋯ ∪ E_k,rk and Pr{X = xk} = Σ_{r=1}^{rk} Pr{E_kr}. So E(X) = Σ_{k=1}^m Σ_{r=1}^{rk} Pr{E_kr} xk = Σ_{r=1}^q x_k(r) Pr{Er}, which proves (4). (5) is the special case of (4) in which each Er consists of a single outcome. //

Example 6. You have two friends, Alice and Bob. You invite both of them over to help you clean house. If either or both of them come you get $100 worth of help. If neither comes you get nothing. Suppose the probability of either coming is ¼ and whether one comes is independent of whether the other comes. The four outcomes, their probabilities and the amount W you get (in hundreds of dollars, so W = 1 means $100) in each case are as follows.
AB = both come          Pr{AB} = 1/16    W = 1
Ab = only Alice comes   Pr{Ab} = 3/16    W = 1
aB = only Bob comes     Pr{aB} = 3/16    W = 1
ab = neither comes      Pr{ab} = 9/16    W = 0

One has Pr{W = 0} = 9/16 and Pr{W = 1} = 7/16, so E(W) = (0)(9/16) + (1)(7/16) = 7/16. To illustrate formula (4) in Theorem 5, consider the following three events with their probabilities and the value of W for the outcomes in each event:

E1 = {AB} = both come           Pr{E1} = 1/16    W = 1
E2 = {Ab, aB} = only one comes  Pr{E2} = 6/16    W = 1
E3 = {ab} = neither comes       Pr{E3} = 9/16    W = 0

Then according to formula (4) one has E(W) = (1) Pr{E1} + (1) Pr{E2} + (0) Pr{E3} = (1)(1/16) + (1)(6/16) + (0)(9/16) = 7/16. To illustrate formula (5) in Theorem 5, one has E(W) = (1) Pr{AB} + (1) Pr{Ab} + (1) Pr{aB} + (0) Pr{ab} = (1)(1/16) + (1)(3/16) + (1)(3/16) + (0)(9/16) = 7/16.

Theorem 6. Let S be a sample space and X and Y be random variables with domain S. Let c be a real number and y = g(x) be a real valued function defined for real numbers x. Let x1, …, xm be the values X assumes. In (10) on the left, E(c) denotes the expected value of the random variable which is c for every outcome. Then

(6)    E(X + Y) = E(X) + E(Y)
(7)    E(cX) = cE(X)
(8)    E(XY) = E(X)E(Y) if X and Y are independent
(9)    E(g(X)) = Σ_{k=1}^m g(xk) f(xk)
(10)   E(c) = c

Proof. Using (5) one has E(X + Y) = Σ_{a∈S} (X(a) + Y(a)) Pr{a} = Σ_{a∈S} X(a) Pr{a} + Σ_{a∈S} Y(a) Pr{a} = E(X) + E(Y), which proves (6). The proof of (7) is similar. Let y1, …, yr be the values Y takes on. By the definition of expectation one has E(X) = Σ_{j=1}^m Pr{X = xj} xj and E(Y) = Σ_{k=1}^r Pr{Y = yk} yk. So E(X)E(Y) = Σ_{j=1}^m Σ_{k=1}^r Pr{X = xj} Pr{Y = yk} xj yk. Since X and Y are independent one has Pr{X = xj} Pr{Y = yk} = Pr{X = xj, Y = yk}. So E(X)E(Y) = Σ_{j=1}^m Σ_{k=1}^r Pr{X = xj, Y = yk} xj yk. However, by (4) this last sum equals E(XY), which proves (8). Note that g(X) is constant on the sets {X = xj}, so (9) follows from (4). The proof of (10) is easy. //

Example 7. A company produces transistors.
They estimate that the probability that any one of the transistors is defective is 0.1. Suppose a box contains 20 transistors. What is the expected number of defective transistors in a box?

Solution. Let Xj = 1 if the jth transistor is defective and Xj = 0 if it is good. The number N of defective transistors is N = X1 + ⋯ + X20. By (6) in Theorem 6 one has E(N) = E(X1) + ⋯ + E(X20). By Proposition 2 one has E(Xj) = 0.1 for each j. So E(N) = (0.1)(20) = 2. This example illustrates a general fact: if N is the number of successes in n independent trials, each with success probability p, then E(N) = np.

Example 8. Consider a random walk where the probability of a step to the right is ½ and the probability of a step to the left is ½. After 2 steps your position Z could be either −2, 0 or 2 with probabilities ¼, ½ and ¼ respectively. Compute E(Z^2).

Solution. By (9) one has E(Z^2) = (−2)^2 Pr{Z = −2} + (0)^2 Pr{Z = 0} + (2)^2 Pr{Z = 2} = (4)(1/4) + (0)(1/2) + (4)(1/4) = 2. If we were to compute E(Z^2) directly from the definition (3), then E(Z^2) = (4) Pr{Z^2 = 4} + (0) Pr{Z^2 = 0} = (4)(1/2) + (0)(1/2) = 2.

Example 9. An unfair coin is tossed twice with the two tosses independent of each other. Suppose on each toss Pr{H} = ¼ and Pr{T} = ¾. For each toss we win $1 if it comes up heads and lose $1 if it comes up tails. Let Wj be the amount we win on the jth toss. Compute E(W1W2).

Solution. By (8) one has E(W1W2) = E(W1) E(W2). One has E(W1) = E(W2) = (1)(1/4) + (−1)(3/4) = −1/2. So E(W1W2) = (−½)(−½) = ¼.

The Law of Large Numbers. As mentioned earlier, the fact that (2) holds is called the Law of Large Numbers. A precise statement is as follows.

Theorem 7. Let X1, X2, …, Xn, … be a sequence of independent random variables all taking on the same set of values x1, …, xm and having the same probability mass function f(x). Let

X̄n = (1/n)(X1 + X2 + ⋯ + Xn)

Then Pr{a : X̄n(a) → X̄ as n → ∞} = 1.
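Theorem 7 can be seen in action with a short simulation. The sketch below is illustrative only: it draws independent weekly prices from the distribution of Example 2 (the sample size and the random seed are arbitrary choices, not part of the text) and checks that the sample mean X̄n comes out close to the true mean 2.73.

```python
import random

random.seed(1)  # fixed seed so the experiment is reproducible

# The price distribution from Example 2.
values = [2.60, 2.70, 2.80, 2.90, 3.00]
probs = [0.25, 0.40, 0.20, 0.10, 0.05]

# Draw n independent prices X_1, ..., X_n with this pmf.
n = 100_000
samples = random.choices(values, weights=probs, k=n)

# By the Law of Large Numbers the sample mean should be near 2.73.
xbar_n = sum(samples) / n
print(round(xbar_n, 3))  # a value close to 2.73
```

Rerunning with larger n (or a different seed) moves the sample mean even closer to X̄, which is exactly what the theorem asserts happens with probability 1.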
Note that X̄n is again a random variable, so that X̄1, X̄2, …, X̄n, … is a new sequence of random variables, and for each outcome a one has a sequence of numbers X̄1(a), X̄2(a), …, X̄n(a), …. The Law of Large Numbers says that X̄n(a) → X̄ except for a set of outcomes that has probability zero. For the proof of the Law of Large Numbers see a more advanced book on probability.
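As a closing check, the numerical answers in this section follow directly from Definition 1. The sketch below recomputes three of them (Problem 4, Problem 6 and Example 5) from their probability mass functions; all the numbers come from the text above, and the helper function `mean` is just (3) written in code.

```python
import math

def mean(pmf):
    """Expected value of a pmf given as {value: probability}, as in Definition 1."""
    return sum(p * x for x, p in pmf.items())

# Problem 4: number of copies of The Wall Street Journal sold tomorrow.
papers = {0: 0.21, 1: 0.26, 2: 0.32, 3: 0.16, 4: 0.05}
print(round(mean(papers), 2))  # 1.58

# Problem 6: cost in cents of testing a group of four diodes.
p_all_good = 0.997 ** 4
cost = {2: p_all_good, 7: 1 - p_all_good}
print(round(mean(cost), 5))  # 2.05973, i.e. 7 - 5(0.997)^4

# Example 5: Poisson with mean 3; Pr{N <= 2} = 8.5 e^{-3}.
lam = 3
pr_at_most_2 = sum(lam**k * math.exp(-lam) / math.factorial(k) for k in range(3))
print(round(pr_at_most_2, 3))  # 0.423
```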