HARRY CRANE

8. Expected value of random variables

Let $X$ be a random variable taking three possible values, $x_1, x_2, x_3$. Suppose we repeatedly draw copies of $X$ independently to obtain $X_1, X_2, \ldots$. (Such a sequence is called independent and identically distributed, abbreviated i.i.d.) Given $X_1, \ldots, X_n$, we count
\[ n_i := \#\{1 \leq j \leq n : X_j = x_i\}, \quad \text{for each } i = 1, 2, 3. \]
Then the average of the observed values is
\[ \frac{n_1 x_1 + n_2 x_2 + n_3 x_3}{n} = \sum_{i=1}^{3} x_i \frac{n_i}{n}. \]
Recall that this average converges to $\sum_i x_i P\{X = x_i\}$ as $n \to \infty$, which prompts the following definition.

Definition 8.1 (Expectation). Let $X$ be a discrete random variable with probability mass function $p_X$. The expected value of $X$, denoted $EX = E(X) = \mu_X$, is defined by
\[ (14) \qquad EX := \sum_x x P\{X = x\} = \sum_x x p_X(x). \]
We stress that if $X$ takes infinitely many values, then the sum in (14) might not be defined. We say that $X$ has an expectation if the right-hand side of (14) is defined; $X$ has finite expectation, or is integrable, if the right-hand side of (14) is finite.

As an aside, suppose $X$ takes infinitely many values $\{a_i\}_{i \in I}$, where $I$ is some infinite indexing set. Then how might we define $\sum_{i \in I} a_i = S$? For example, suppose $a_i = i$ for $i = 1, 2, \ldots$. Then $\sum_{i \in I} a_i = \sum_{i=1}^{\infty} i = \infty$. However, if $a_i := (-1)^i$ for $i = 1, 2, \ldots$, then $\sum_{i=1}^{\infty} a_i = -1 + 1 - 1 + 1 - \cdots$, which is not defined. Therefore, we define the infinite sum $S := \sum_{i \in I} a_i$ by $S := S_+ - S_-$, where
\[ S_+ := \sum_{i : a_i > 0} a_i \quad \text{and} \quad S_- := \sum_{i : a_i < 0} (-a_i). \]
If at least one of $S_+, S_-$ is finite, then $S = S_+ - S_-$ is well-defined. In particular, if $S_+ = \infty$ and $S_- < \infty$, we have $S = \infty - S_- = \infty$; if $S_- = \infty$ and $S_+ < \infty$, then $S = S_+ - \infty = -\infty$. However, if $S_+ = S_- = \infty$, then $S = S_+ - S_- = \infty - \infty$ is not defined.

Mathematically, there is no problem in defining the expected value of a random variable to be infinite in magnitude. However, some practical considerations arise, as the next example illustrates.

Problem 8.2 (St. Petersburg paradox).
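Definition 8.1 and the sample-average motivation above can be checked numerically. The sketch below computes $EX$ from a probability mass function and shows the average of many i.i.d. draws settling near $EX$; the three-point distribution used here is an illustrative choice, not one from the text.

```python
import random

def expectation(pmf):
    """Expected value of a discrete random variable, computed from its
    probability mass function given as a dict {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# A random variable taking three possible values, as in the setup above
# (the values and probabilities here are illustrative).
pmf = {1: 0.2, 2: 0.5, 3: 0.3}
mu = expectation(pmf)  # 1*0.2 + 2*0.5 + 3*0.3 = 2.1

# The average of n i.i.d. draws converges to EX as n grows.
random.seed(0)
values, weights = zip(*pmf.items())
draws = random.choices(values, weights=weights, k=100_000)
sample_mean = sum(draws) / len(draws)  # close to mu
```

Increasing `k` tightens the agreement between `sample_mean` and `mu`, which is exactly the convergence that motivates the definition.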
You pay \$B to play the following game. Toss a fair coin repeatedly until the first head is flipped, and let $W$ denote the total number of flips (including the final head). You are paid $\$2^{W-1}$. So, for example, suppose we flip TTTTH. Then $W = 5$ and you win $2^{5-1} = 16$. How much should you be willing to pay to play this game?

Solution. Let $X$ be the amount you win playing this game. Then $X = 2^{W-1} - B$, where $W$ follows the Geometric distribution with parameter $1/2$, so that $P\{W = n\} = 2^{-n}$ for $n = 1, 2, \ldots$. Furthermore, we have
\[ E2^{W-1} = \sum_{n=1}^{\infty} 2^{n-1} \cdot 2^{-n} = \frac{1}{2} + \frac{1}{2} + \cdots = \infty, \]
so $EX = E2^{W-1} - B = \infty$ for any finite entry fee $B$. You are expected to win an unlimited amount of money. So (in theory), you should be willing to risk all of your wealth to play this game. Right? Be careful.

8.1. Examples.

(1) Let $U$ be the number of pips on a fair six-sided die. Then $p_U(u) = 1/6$, $u = 1, \ldots, 6$, is the probability mass function of the discrete uniform distribution on $\{1, \ldots, 6\}$ and
\[ EU = \sum_{u=1}^{6} u\, p_U(u) = \frac{1+2+3+4+5+6}{6} = 7/2. \]
In general, if $U$ is a discrete uniform random variable on $[1, n] := \{1, \ldots, n\}$, then
\[ EU = \sum_{i=1}^{n} \frac{i}{n} = \frac{1}{n} \times \frac{n(n+1)}{2} = \frac{n+1}{2}, \]
the midpoint of the interval. If $U_1, U_2$ are the numbers of pips on two fair dice, then $S = U_1 + U_2$ has $ES = 7/2 + 7/2 = 7$ by symmetry. Note that, in this case, $E(U_1 + U_2) = EU_1 + EU_2$. (More on this later.)

(2) Let $N \sim \mathrm{Bin}(n, p)$. Then, using the identity $k \binom{n}{k} = n \binom{n-1}{k-1}$,
\[ EN = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} = n \sum_{k=1}^{n} \binom{n-1}{k-1} p^k (1-p)^{n-k} = np \sum_{k=0}^{n-1} \binom{n-1}{k} p^k (1-p)^{n-1-k} = np. \]

8.2. Properties of expectation.

(E1) Expectation of a function of $X$: Suppose $f : \mathbb{R} \to \mathbb{R}$ and $X$ is a random variable. Then $Y := f(X) := f \circ X : \Omega \to \mathbb{R}$ is a random variable and $EY = \sum_x f(x) P\{X = x\}$.

Proof. Note that
\[ P\{Y = y\} := P\Big( \bigcup_{x : f(x) = y} \{X = x\} \Big) = \sum_{x : f(x) = y} P\{X = x\}. \]
Thus,
\[ EY := \sum_y y P\{Y = y\} = \sum_y y \sum_{x : f(x) = y} P\{X = x\} = \sum_y \sum_{x : f(x) = y} y P\{X = x\} = \sum_y \sum_{x : f(x) = y} f(x) P\{X = x\} = \sum_x f(x) P\{X = x\}. \]

(E2) Scaling rule: For $c \in \mathbb{R}$, $E(cX) = c\,EX$.

(E3) Addition rule: For any random variables $X, Y$, $E(X + Y) = EX + EY$.

Proof.
To use (E1), we put $Z = (X, Y)$ and $f(Z) = X + Y$. Also, write $p(x, y) = P\{X = x, Y = y\}$ and define $g(Z) = X$ and $h(Z) = Y$ so that $f = g + h$. Then
\[ E f(Z) = \sum_{x,y} f(x, y) p(x, y) = \sum_{x,y} \big( g(x, y) + h(x, y) \big) p(x, y) = \sum_{x,y} g(x, y) p(x, y) + \sum_{x,y} h(x, y) p(x, y) = Eg(Z) + Eh(Z) = EX + EY. \]

(E4) Multiplication rule: Let $X$ and $Y$ be independent random variables. Then $E(XY) = EX \cdot EY$.

Proof. Using property (E1), we take $Z = XY = f(X, Y)$. Then
\[ EZ = \sum_{x,y} f(x, y) P\{X = x, Y = y\} = \sum_{x,y} x y\, p_X(x)\, p_Y(y) = \Big( \sum_x x\, p_X(x) \Big) \Big( \sum_y y\, p_Y(y) \Big) = EX \cdot EY. \]

(E5) Conditioning rule: Let $C_1, C_2, \ldots$ be a collection of exhaustive, mutually exclusive cases and define
\[ E(X \mid C_i) = \sum_x x P(X = x \mid C_i). \]
Then $EX = \sum_i P(C_i) E(X \mid C_i)$.

Proof. By the law of cases from Section 4.1, $P\{X = x\} = \sum_i P(C_i) P(X = x \mid C_i)$ and, thus,
\[ EX = \sum_x x P\{X = x\} = \sum_x x \sum_i P(C_i) P(X = x \mid C_i) = \sum_i P(C_i) \sum_x x P(X = x \mid C_i) = \sum_i P(C_i) E(X \mid C_i). \]

The above properties are extremely useful. For example, the calculation of the expectation of a Binomial random variable in Section 8.1(2) required some (but not much) ingenuity to figure out the sum. However, by the addition rule (E3), we could instead notice that $N \sim \mathrm{Bin}(n, p)$ admits the expression $N = N_1 + \cdots + N_n$, where $N_1, \ldots, N_n$ are i.i.d. Bernoulli($p$) random variables. For each $i = 1, \ldots, n$, $EN_i = p$; whence,
\[ EN = E(N_1 + \cdots + N_n) = \sum_{i=1}^{n} EN_i = np. \]
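As a quick numerical check of the results above, the sketch below computes $EN$ for a Binomial directly from its probability mass function, recovering $np$, and verifies the multiplication rule (E4) on two independent fair dice. The parameter choices $n = 10$, $p = 0.3$ are illustrative, not from the text.

```python
from itertools import product
from math import comb

def expectation(pmf):
    """Expected value of a discrete random variable from its pmf dict."""
    return sum(x * q for x, q in pmf.items())

# Binomial pmf as in Section 8.1(2); by the addition-rule argument,
# its expectation should come out to n * p.
def binom_pmf(n, p):
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

n, p = 10, 0.3                       # illustrative parameters
EN = expectation(binom_pmf(n, p))    # matches n * p

# Multiplication rule (E4): for independent fair dice U1, U2,
# E[U1 * U2] computed from the joint pmf equals EU1 * EU2.
die = range(1, 7)
E_product = sum(x * y / 36 for x, y in product(die, die))
EU = sum(x / 6 for x in die)         # 7/2, as in Section 8.1(1)
```

Enumerating the joint pmf works here because both variables take finitely many values; for the dice, independence makes each joint probability the product $p_{U_1}(x)\, p_{U_2}(y) = 1/36$.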