1.7 Averages and Expected Values of Random Variables

One thing that we do frequently is compute the average of a series of related measurements.

Example 1. You are a wholesaler for gasoline and each week you buy and sell gasoline. Naturally you are interested in how the price you pay for gasoline (the wholesale price) varies from week to week. Let $X_j$ be the wholesale price of gasoline on week $j$, where week one is the first full week of February of this year. The $X_j$ can be regarded as random variables. Now suppose we are in mid-March and the actual wholesale prices of gasoline in the five weeks beginning with the first full week of February were

$$q_1 = \$2.70 \qquad q_2 = \$2.60 \qquad q_3 = \$2.80 \qquad q_4 = \$2.70 \qquad q_5 = \$2.80$$

The average of these five prices is

$$\bar q = \frac{q_1 + q_2 + q_3 + q_4 + q_5}{5} = \frac{2.70 + 2.60 + 2.80 + 2.70 + 2.80}{5} = \frac{13.60}{5} = \$2.72$$

In general, the average $\bar q$ of a sequence of values $q_1, q_2, \ldots, q_n$ is the sum divided by the number $n$ of values, i.e.

$$\bar q = \frac{q_1 + q_2 + \cdots + q_n}{n}$$

Why are we interested in averages? One reason is that the average falls somewhat in the "middle" of the values, so it is often used to summarize a group of numbers by a single number. Another reason is the following. Suppose you were to sell the gasoline over the five week period at a single price $s$. What price $s$ should you sell the gasoline for in order to come out even, assuming you buy and sell the same amount each week? It is not hard to see that $s$ = the average = $2.72, since

$$\text{amount received for selling a gallon each week} = 5s = (5)(2.72) = 13.60 = \text{amount paid for buying a gallon each week}$$

Suppose we have a sequence $X_1, X_2, \ldots, X_n$ of repeated independent trials where each of the random variables $X_j$ takes on the values $x_1, \ldots, x_m$ and they all have the same probability mass function $f(x)$, where $f(x_k) = \Pr\{X_j = x_k\}$ for each $j$ and $k$. Suppose $q_1, q_2, \ldots, q_n$ are the values we actually observe for the random variables $X_1, X_2, \ldots, X_n$. In our computation of $\bar q$ let's group all the values of $q_j$ that equal $x_1$ together and all the values of $q_j$ that equal $x_2$ together, etc. Then we have

$$\bar q = \frac{q_1 + q_2 + \cdots + q_n}{n} = \frac{(x_1 + x_1 + \cdots + x_1) + (x_2 + x_2 + \cdots + x_2) + \cdots + (x_m + x_m + \cdots + x_m)}{n}$$
$$= \frac{g_1 x_1 + g_2 x_2 + \cdots + g_m x_m}{n} = \frac{g_1}{n} x_1 + \frac{g_2}{n} x_2 + \cdots + \frac{g_m}{n} x_m$$

where $g_k$ is the number of times that $x_k$ appears in $q_1, q_2, \ldots, q_n$. As $n \to \infty$ one has

$$\frac{g_k}{n} \to \Pr\{X = x_k\} = f(x_k)$$

where $X$ denotes any of the $X_j$. So as $n \to \infty$ one has

(1)  $\bar q \to \mu_X$

where

(2)  $\mu_X = E(X) = \text{mean of } X = \text{expected value of } X$
     $= \Pr\{X = x_1\}\, x_1 + \Pr\{X = x_2\}\, x_2 + \cdots + \Pr\{X = x_m\}\, x_m = \sum_{k=1}^m \Pr\{X = x_k\}\, x_k$
     $= f(x_1)\, x_1 + f(x_2)\, x_2 + \cdots + f(x_m)\, x_m = \sum_{k=1}^m f(x_k)\, x_k$

The fact that (1) holds is actually an important theorem in probability theory called the Law of Large Numbers. A precise statement is in Theorem 8 below.

Example 2. Suppose in Example 1 the set of possible values for the wholesale gasoline price in any particular week is $\Omega = \{2.60, 2.70, 2.80, 2.90, 3.00\}$ and the probabilities that the price $X_j$ takes on these values in the $j$th week are as follows.

$$\Pr\{X = 2.60\} = 0.25 \qquad \Pr\{X = 2.70\} = 0.4 \qquad \Pr\{X = 2.80\} = 0.2$$
$$\Pr\{X = 2.90\} = 0.1 \qquad \Pr\{X = 3.00\} = 0.05$$

Then

$$\mu_X = (0.25)(2.60) + (0.4)(2.70) + (0.2)(2.80) + (0.1)(2.90) + (0.05)(3.00) = 0.65 + 1.08 + 0.56 + 0.29 + 0.15 = 2.73$$

If the $X_j$ are all independent, then we would expect the average $\bar q_n$ of the actual prices over $n$ weeks to approach $2.73 as $n \to \infty$.
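The convergence in (1) is easy to watch numerically. The following is a minimal sketch (in Python; the code and variable names are ours, not part of these notes) that draws $n$ independent weekly prices from the distribution in Example 2 and compares the average $\bar q$ with $\mu_X = 2.73$.

```python
import random

# Distribution of weekly wholesale prices from Example 2.
prices = [2.60, 2.70, 2.80, 2.90, 3.00]
probs  = [0.25, 0.40, 0.20, 0.10, 0.05]

# Exact mean: sum of f(x_k) * x_k, as in formula (2).
mu = sum(p * x for p, x in zip(probs, prices))
print(f"exact mean mu_X = {mu:.4f}")          # 2.7300

# Simulate n independent weeks and compute the average q-bar.
random.seed(1)
for n in [5, 100, 10_000, 1_000_000]:
    qs = random.choices(prices, weights=probs, k=n)
    q_bar = sum(qs) / n
    print(f"n = {n:>9}: q_bar = {q_bar:.4f}")  # approaches 2.7300
```

As $n$ grows the printed averages settle down near 2.73, which is exactly what (1) asserts.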
For certain special types of random variables there are formulas for their mean. The following propositions give the mean for the uniform, Bernoulli, geometric and Poisson distributions.

Proposition 1. Suppose $X$ has a uniform distribution on equally spaced outcomes, i.e. the set of possible values for $X$ is $\Omega = \{a, a+h, a+2h, \ldots, a+mh = b\}$ and $\Pr\{X = a + kh\} = \frac{1}{m+1}$ for $k = 0, 1, \ldots, m$. Then $\mu_X = \frac{a+b}{2}$.

Proof.
$$\mu_X = \sum_{k=0}^m \frac{1}{m+1}\,(a + kh) = \frac{1}{m+1} \sum_{k=0}^m (a + kh) = \frac{1}{m+1}\Big[(m+1)a + h \sum_{k=0}^m k\Big]$$
$$= a + \frac{h}{m+1} \cdot \frac{m(m+1)}{2} = a + \frac{mh}{2} = a + \frac{b-a}{2} = \frac{a+b}{2}$$

Here we have used the fact that the sum of the integers from 1 to $m$ is $\frac{m(m+1)}{2}$, and that $mh = b - a$. //

Example 3. Let $X$ be the outcome of a single roll of a fair die. Then $X$ has outcomes 1, 2, 3, 4, 5, 6. Since the die is assumed fair, $X$ has a uniform distribution with $a = 1$, $b = 6$, $h = 1$ and $m = 5$. By Proposition 1, $\mu_X = \frac{a+b}{2} = 3.5$.

Proposition 2. Suppose $X$ has a Bernoulli distribution, i.e. $\Pr\{X = 0\} = 1 - p$ and $\Pr\{X = 1\} = p$ where $p$ lies between 0 and 1. Then $\mu_X = p$.

Proof. $\mu_X = (1 - p)(0) + (p)(1) = p$. //

Proposition 3. Suppose $X$ has a geometric distribution, i.e. $\Pr\{X = k\} = p(1-p)^{k-1}$ for $k = 1, 2, 3, \ldots$ where $p$ lies between 0 and 1. Then $\mu_X = \frac{1}{p}$.

Proof. $\mu_X = \sum_{k=1}^\infty k\,p(1-p)^{k-1}$. In order to do this sum we start with the fact that $\sum_{m=1}^\infty (1-p)^{m-1} = 1/p$ and take the derivative of both sides with respect to $p$. After multiplying both sides by $-1$ this gives $\sum_{m=1}^\infty (m-1)(1-p)^{m-2} = 1/p^2$. If we replace $m - 1$ by $k$ and multiply both sides by $p$ we get $\sum_{k=1}^\infty k\,p(1-p)^{k-1} = \frac{1}{p}$. //

Example 4. A store sells two types of tables: plain and deluxe. When a customer buys a table, there is an 80% chance that it will be a plain table. Assume that each day five tables are sold. Let $N$ be the number of days until a deluxe table is sold, starting with today which corresponds to $N = 0$. What is the expected number of days until a deluxe table is sold? (A simulation check of the answer appears after Example 5 below.)

Solution. The probability that the five tables sold on a given day are all plain is $(0.8)^5 \approx 0.3277$. The probability of selling at least one deluxe table on a given day is $p = 1 - (0.8)^5 \approx 0.6723$. The probability of first selling a deluxe table on day $n$ is $p(1-p)^n$, i.e. $\Pr\{N = n\} = p(1-p)^n$. If we let $M = N + 1$, then $\Pr\{M = m\} = \Pr\{N + 1 = m\} = \Pr\{N = m - 1\} = p(1-p)^{m-1}$. So $M$ is geometric. By Proposition 3, $E\{M\} = \frac{1}{p}$. So $E\{N\} = E\{M - 1\} = E\{M\} - 1 = \frac{1}{p} - 1 \approx \frac{1}{0.6723} - 1 \approx 1.487 - 1 = 0.487$.

Proposition 4. Suppose $N$ has a Poisson distribution, i.e. $\Pr\{N = n\} = \frac{\lambda^n e^{-\lambda}}{n!}$ for $n = 0, 1, 2, 3, \ldots$ where $\lambda$ is a positive parameter. Then $E\{N\} = \lambda$.

Proof.
$$E\{N\} = \sum_{n=0}^\infty n\,\frac{\lambda^n e^{-\lambda}}{n!} = \sum_{n=1}^\infty \frac{\lambda^n e^{-\lambda}}{(n-1)!}$$
We replace $n - 1$ by $k$ and factor out $\lambda$, giving
$$E\{N\} = \lambda \sum_{k=0}^\infty \frac{\lambda^k e^{-\lambda}}{k!} = \lambda$$
Here we have used the fact that $\sum_{k=0}^\infty \frac{\lambda^k e^{-\lambda}}{k!} = 1$. //

Example 5. A hospital observes that the number of heart attack cases that arrive in the Emergency Room is a Poisson random variable with mean 3 per hour. Find the probability that no more than two heart attack cases arrive in the Emergency Room during the next hour.

Solution. By Proposition 4, one has $\lambda = 3$. Then
$$\Pr\{N \le 2\} = \Pr\{N = 0\} + \Pr\{N = 1\} + \Pr\{N = 2\} = \frac{\lambda^0 e^{-\lambda}}{0!} + \frac{\lambda^1 e^{-\lambda}}{1!} + \frac{\lambda^2 e^{-\lambda}}{2!} = \Big(1 + \lambda + \frac{\lambda^2}{2}\Big) e^{-\lambda} = (1 + 3 + 4.5)\,e^{-3} = 8.5\,e^{-3} \approx 0.423$$
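Here is the simulation check promised in Example 4: a minimal Python sketch (ours, not part of these notes) that plays out the table sales day by day and averages the observed value of $N$ over many runs. The result should be close to $1/p - 1 \approx 0.487$.

```python
import random

random.seed(2)
p_plain = 0.8          # chance that a sold table is plain
trials = 200_000

total_days = 0
for _ in range(trials):
    n = 0              # N = 0 corresponds to a deluxe sale today
    # Keep going while all five tables sold in a day are plain.
    while all(random.random() < p_plain for _ in range(5)):
        n += 1
    total_days += n

p = 1 - p_plain ** 5   # chance of at least one deluxe sale in a day
print(f"theory    E(N) = 1/p - 1 = {1/p - 1:.4f}")   # about 0.4874
print(f"simulated E(N)           = {total_days / trials:.4f}")
```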
The operation of finding the mean of a random variable has a number of useful properties.

Theorem 5. Let $S$ be a sample space and $X$ be a random variable with domain $S$. Let $x_1, \ldots, x_m$ be the values $X$ assumes and let $E_1, \ldots, E_q$ be disjoint sets whose union is $S$ such that for each $r$ the random variable $X$ assumes the same value on $E_r$, i.e. there is $k = k(r)$ such that $X(a) = x_{k(r)}$ for $a \in E_r$. Then

(3)  $E(X) = \sum_{r=1}^q x_{k(r)} \Pr\{E_r\}$

(4)  $E(X) = \sum_{a \in S} X(a) \Pr\{a\}$

Proof. By (2) one has $E(X) = \sum_{k=1}^m \Pr\{X = x_k\}\, x_k$. Let $E_{k1}, \ldots, E_{k,r_k}$ be those $E_r$ such that $X(a) = x_k$ for $a \in E_{kr}$. Then $\{X = x_k\} = E_{k1} \cup \cdots \cup E_{k,r_k}$ and $\Pr\{X = x_k\} = \sum_{r=1}^{r_k} \Pr\{E_{kr}\}$. So $E(X) = \sum_{k=1}^m \sum_{r=1}^{r_k} \Pr\{E_{kr}\}\, x_k = \sum_{r=1}^q x_{k(r)} \Pr\{E_r\}$, which proves (3). (4) is a special case of (3). //

Example 6. You have two friends, Alice and Bob. You invite both of them over to help you clean house. If either or both of them come you get $100. If neither comes you get nothing. Suppose the probability of either coming is 1/4 and whether one comes is independent of whether the other comes. Let $W$ be the amount you get, in hundreds of dollars. The four outcomes, the probability of the outcomes and the value of $W$ in each case are as follows.

AB = both come           Pr{AB} = 1/16    W = 1
Ab = only Alice comes    Pr{Ab} = 3/16    W = 1
aB = only Bob comes      Pr{aB} = 3/16    W = 1
ab = neither comes       Pr{ab} = 9/16    W = 0

One has $\Pr\{W = 0\} = 9/16$ and $\Pr\{W = 1\} = 7/16$, so $E(W) = (0)(9/16) + (1)(7/16) = 7/16$.

To illustrate formula (3) in Theorem 5, consider the following three events with their probabilities and the value of $W$ for the outcomes in that event.

E1 = {AB} = both come             Pr{E1} = 1/16    W = 1
E2 = {Ab, aB} = only one comes    Pr{E2} = 6/16    W = 1
E3 = {ab} = neither comes         Pr{E3} = 9/16    W = 0

Then according to formula (3) one has $E(W) = (1)\Pr\{E_1\} + (1)\Pr\{E_2\} + (0)\Pr\{E_3\} = (1)(1/16) + (1)(6/16) + (0)(9/16) = 7/16$.

To illustrate formula (4) in Theorem 5, one has $E(W) = (1)\Pr\{AB\} + (1)\Pr\{Ab\} + (1)\Pr\{aB\} + (0)\Pr\{ab\} = (1)(1/16) + (1)(3/16) + (1)(3/16) + (0)(9/16) = 7/16$.

Theorem 6. Let $S$ be a sample space and $X$ and $Y$ be random variables with domain $S$. Let $c$ be a real number and $y = g(x)$ be a real valued function defined for real numbers $x$. Let $x_1, \ldots, x_m$ be the values $X$ assumes. In (9), on the left $E(c)$ denotes the expected value of the random variable which is $c$ for every outcome. Then

(5)  $E(X + Y) = E(X) + E(Y)$

(6)  $E(cX) = cE(X)$

(7)  $E(XY) = E(X)E(Y)$  if $X$ and $Y$ are independent

(8)  $E(g(X)) = \sum_{k=1}^m g(x_k) f(x_k)$

(9)  $E(c) = c$

Proof. Using (4) one has $E(X + Y) = \sum_{a \in S} (X(a) + Y(a)) \Pr\{a\} = \sum_{a \in S} X(a) \Pr\{a\} + \sum_{a \in S} Y(a) \Pr\{a\} = E(X) + E(Y)$, which proves (5). The proof of (6) is similar. Let $y_1, \ldots, y_r$ be the values $Y$ takes on. By the definition of expectation one has $E(X) = \sum_{j=1}^m \Pr\{X = x_j\}\, x_j$ and $E(Y) = \sum_{k=1}^r \Pr\{Y = y_k\}\, y_k$. So $E(X)E(Y) = \sum_{j=1}^m \sum_{k=1}^r \Pr\{X = x_j\}\Pr\{Y = y_k\}\, x_j y_k$. Since $X$ and $Y$ are independent one has $\Pr\{X = x_j\}\Pr\{Y = y_k\} = \Pr\{X = x_j, Y = y_k\}$. So $E(X)E(Y) = \sum_{j=1}^m \sum_{k=1}^r \Pr\{X = x_j, Y = y_k\}\, x_j y_k$. However, by (3) this last sum equals $E(XY)$, which proves (7). Note that $g(X)$ is constant on the sets $\{X = x_j\}$, so (8) follows from (3). The proof of (9) is easy. //

Example 7. A company produces transistors. They estimate that the probability that any one of the transistors is defective is 0.1. Suppose a box contains 20 transistors. What is the expected number of defective transistors in a box?

Solution. Let $X_j = 1$ if the $j$th transistor is defective and $X_j = 0$ if it is good. The number $N$ of defective transistors is $N = X_1 + \cdots + X_{20}$. By (5) in Theorem 6 one has $E(N) = E(X_1) + \cdots + E(X_{20})$. By Proposition 2 one has $E(X_j) = 0.1$ for each $j$. So $E(N) = (0.1)(20) = 2$.

This example illustrates the following general proposition.

Proposition 7. If $N$ has a binomial distribution with $\Pr\{N = k\} = \binom{n}{k} p^k q^{n-k}$ for $k = 0, 1, \ldots, n$, then $E(N) = np$.

Proof. $N = X_1 + \cdots + X_n$ where $\Pr\{X_j = 1\} = p$ and $\Pr\{X_j = 0\} = 1 - p$. By (5) in Theorem 6 one has $E(N) = E(X_1) + \cdots + E(X_n)$. By Proposition 2 one has $E(X_j) = p$ for each $j$. So $E(N) = np$. //
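Proposition 7 can be checked numerically with the setup of Example 7. The sketch below (Python; ours, not from these notes) simulates many boxes of $n = 20$ transistors, each independently defective with probability $p = 0.1$, and compares the average defect count with $np = 2$.

```python
import random

random.seed(3)
n, p = 20, 0.1         # 20 transistors, each defective with probability 0.1
boxes = 100_000

# N = X_1 + ... + X_20 with X_j Bernoulli(p), so E(N) = np by (5).
total = sum(sum(random.random() < p for _ in range(n)) for _ in range(boxes))
print(f"theory    E(N) = np = {n * p}")              # 2.0
print(f"simulated E(N)      = {total / boxes:.4f}")  # about 2.0
```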
Example 8. Consider a random walk where the probability of a step to the right is 1/2 and the probability of a step to the left is 1/2. After 2 steps your position $Z$ could be either $-2$, $0$ or $2$ with probabilities 1/4, 1/2 and 1/4 respectively. Compute $E(Z^2)$.

Solution. By (8) one has $E(Z^2) = (-2)^2 \Pr\{Z = -2\} + (0)^2 \Pr\{Z = 0\} + (2)^2 \Pr\{Z = 2\} = (4)(1/4) + (0)(1/2) + (4)(1/4) = 2$. If we were to compute $E(Z^2)$ from the definition (2), then $E(Z^2) = (4) \Pr\{Z^2 = 4\} + (0) \Pr\{Z^2 = 0\} = (4)(1/2) + (0)(1/2) = 2$.

Example 9. An unfair coin is tossed twice with the two tosses independent of each other. Suppose on each toss $\Pr\{H\} = 1/4$ and $\Pr\{T\} = 3/4$. For each toss we win $1 if it comes up heads and lose $1 if it comes up tails. Let $W_j$ be the amount we win on the $j$th toss. Compute $E(W_1 W_2)$.

Solution. By (7) one has $E(W_1 W_2) = E(W_1) E(W_2)$. One has $E(W_1) = E(W_2) = (1)(1/4) + (-1)(3/4) = -1/2$. So $E(W_1 W_2) = (-1/2)(-1/2) = 1/4$.

As mentioned earlier, the fact that (1) holds is called the Law of Large Numbers. A precise statement is as follows.

Theorem 8. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent random variables all taking on the same set of values $x_1, \ldots, x_m$ and having the same probability mass function $f(x)$. Let
$$\bar X_n = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$
Then $\Pr\{a : \bar X_n(a) \to \mu_X \text{ as } n \to \infty\} = 1$.

Note that $\bar X_n$ is again a random variable, so that $\bar X_1, \bar X_2, \ldots, \bar X_n, \ldots$ is a new sequence of random variables, and for each outcome $a$ one has a sequence of numbers $\bar X_1(a), \bar X_2(a), \ldots, \bar X_n(a), \ldots$ The Law of Large Numbers says that $\bar X_n(a) \to \mu_X$ except for a set of outcomes that has probability zero. For the proof of the Law of Large Numbers see a more advanced book on probability.
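To see Theorem 8 at work, one can simulate many copies of $\bar X_n$ for the winnings variable of Example 9 (where $\mu_W = -1/2$) and check that the fraction of copies landing within a small $\varepsilon$ of $\mu_W$ tends to 1 as $n$ grows. This is a rough sketch (Python; ours, not part of these notes), with $\varepsilon = 0.05$ chosen arbitrarily.

```python
import random

random.seed(4)
mu, eps, copies = -0.5, 0.05, 500

def toss():
    # One toss from Example 9: win $1 with probability 1/4, lose $1 otherwise.
    return 1 if random.random() < 0.25 else -1

for n in [10, 100, 1_000, 10_000]:
    # Count how many of the simulated sample means X-bar_n fall within eps of mu.
    within = sum(
        abs(sum(toss() for _ in range(n)) / n - mu) < eps
        for _ in range(copies)
    )
    print(f"n = {n:>6}: fraction of sample means within {eps} of mu = {within / copies:.2f}")
```

The printed fractions increase toward 1, which is the convergence with probability one that Theorem 8 asserts.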