MTHE/STAT 351: 4 – Discrete Random Variables
T. Linder, Queen's University, Fall 2016

If an experiment is performed, very often we are not interested in the actual outcome, but in some (numerical) function of the outcome. For example, if a fair coin is tossed $n$ times, we might be interested in the number of heads instead of the actual sequence of heads and tails.

Let $X$ denote the number of heads in $n$ tosses. Then $S = \{H, T\}^n$ (head-tail sequences of length $n$) and for each $s \in S$,

  $X(s) = \#\text{ of heads in } s.$

For example, $X(H\underbrace{T \cdots T}_{n-1\ \text{tails}}) = 1$ and $X(\underbrace{H \cdots H}_{n-1\ \text{heads}}T) = n - 1$.

We can ask: what is $P(X = k)$ for $k = 0, 1, \ldots, n$?

Example: Let $X$ be the number of heads appearing in three tosses of a fair coin. Then

  $P(X = 0) = P(\{TTT\}) = \frac{1}{8}$,
  $P(X = 1) = P(\{HTT, THT, TTH\}) = \frac{3}{8}$,
  $P(X = 2) = P(\{HHT, HTH, THH\}) = \frac{3}{8}$,
  $P(X = 3) = P(\{HHH\}) = \frac{1}{8}$.

Letting $\{X = k\}$ denote the event that $X$ is equal to $k$, we have $\bigcup_{k=0}^{n} \{X = k\} = S$ and

  $P\Big(\bigcup_{k=0}^{3} \{X = k\}\Big) = \sum_{k=0}^{3} P(X = k) = 1.$

Example: Let the experiment be tossing two fair dice and define $X$ as the sum of the dice. Then the sample space is

  $S = \{1, 2, 3, 4, 5, 6\}^2 = \{(i, j) : i, j = 1, \ldots, 6\}$

and $X((i, j)) = i + j$. We can easily calculate the probabilities $P(X = k)$:

  k        2     3     4     5     6     7     8     9     10    11    12
  P(X=k)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

For example, $P(X = 4) = P(\{(1,3), (2,2), (3,1)\}) = \frac{3}{36}$.

Definition. If $S$ is a sample space of a random experiment, a real-valued function $X : S \to \mathbb{R}$ is called a random variable if for any $t \in \mathbb{R}$ the set $\{s \in S : X(s) \le t\}$ is an event.

Remarks: We'll use the shorthand "r.v." for "random variable." Random variables will usually be denoted by capital letters such as $X$, $Y$, $T$, $Z$, etc. Their specific (non-random) values are denoted by the corresponding lower case letters $x$, $y$, $t$, $z$.

Remarks on technicalities: Functions that assign numerical values to outcomes in the sample space are called random variables. For technical reasons we cannot allow arbitrary functions. The condition that $\{s \in S : X(s) \le t\}$ is an event for all $t \in \mathbb{R}$ is necessary for building up a rich but consistent theory. This condition need not deeply concern us in this course.

Since $\{s \in S : X(s) \le t\}$ is an event, its complement $\{s \in S : X(s) \le t\}^c = \{s \in S : X(s) > t\}$ is also an event. Since the intersection of two events is an event,

  $\{s \in S : t < X(s) \le \tau\} = \{s \in S : X(s) > t\} \cap \{s \in S : X(s) \le \tau\}$

is also an event. In general, if $X$ is a random variable, $\{s \in S : X(s) \in B\}$ is an event for most sets $B \subset \mathbb{R}$ we can envision. Most importantly, $\{s \in S : X(s) \in I\}$ is an event for any interval $I \subset \mathbb{R}$, so the probability $P(X \in I)$ is well defined.

Example: Kenny is picked up after school by one of his parents at a random time between 2:00 pm and 2:30 pm. Let $X$ be the time (in hours) he has to wait after his last class ends at 2:00 pm. We can take $S = \{t : 2 \le t \le 2.5\} = [2, 2.5]$ and define $X : S \to \mathbb{R}$ by $X(t) = t - 2$. Recall that $P(X = x) = 0$ for all $x$, and that

  $P(X \in [\alpha, \beta]) = \frac{\beta - \alpha}{0.5} = 2(\beta - \alpha)$ if $[\alpha, \beta] \subset [0, 0.5]$.
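The "random variable as a function on $S$" viewpoint is easy to check by brute force. The following minimal Python sketch (an illustration added to these notes, not part of the original slides) enumerates the sample space of the two-dice example above and recovers the table of $P(X = k)$ by counting outcomes; the function name `X` simply mirrors the notation of the notes.

```python
from itertools import product
from fractions import Fraction

# Sample space for two fair dice: S = {(i, j) : i, j = 1, ..., 6}
S = list(product(range(1, 7), repeat=2))

# A random variable is literally a function on S; here X((i, j)) = i + j.
def X(s):
    i, j = s
    return i + j

# P(X = k) = (# of outcomes s with X(s) = k) / |S|, since outcomes are equally likely
pmf = {k: Fraction(sum(1 for s in S if X(s) == k), len(S)) for k in range(2, 13)}

for k, p_k in pmf.items():
    print(k, p_k)                 # reproduces the table: 1/36, 2/36, ..., 6/36, ..., 1/36

assert sum(pmf.values()) == 1     # the events {X = k}, k = 2, ..., 12, partition S
```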
Remark: From a given r.v. $X$ we can construct other r.v.'s. Assume that $f : \mathbb{R} \to \mathbb{R}$ is a reasonably well-behaved function. (We are deliberately being vague here, but practically all functions we can think of will be "well-behaved.") By letting $Y = f(X)$ we obtain another random variable. For example, $\sin X$, $\cos X$, $X^2$, $2X$, $e^X$ are all random variables. Questions of the form "what is $P(\sin X > \frac{1}{2})$?" will often be encountered.

Example: The radius of a metal ball manufactured in a factory is a random number between 2 and 3 centimeters. What is the probability that the volume of a manufactured ball is more than twice the volume of a ball with radius 2 cm?

Solution: Let $X$ be the radius of the ball in centimeters. Then $X$ is a random point in the interval $[2, 3]$. The volume (in cm³) of a sphere of radius $r$ is $\frac{4}{3}\pi r^3$. Thus we want to know the probability $P\big(\frac{4}{3}\pi X^3 > 2 \cdot \frac{4}{3}\pi 2^3\big)$. Since $X > 0$,

  $\frac{4}{3}\pi X^3 > 2 \cdot \frac{4}{3}\pi 2^3 \iff X^3 > 2 \cdot 2^3 \iff X > 2 \cdot 2^{1/3}.$

Thus

  $P\Big(\frac{4}{3}\pi X^3 > 2 \cdot \frac{4}{3}\pi 2^3\Big) = P(X > 2 \cdot 2^{1/3}) = \frac{3 - 2 \cdot 2^{1/3}}{3 - 2} \approx 0.48.$

Distribution functions

We would like to be able to calculate all probabilities of the form $P(X \in B)$, where $B$ is a reasonable subset of the reals, usually an interval. It turns out that all we need to know is the probability $P(X \le x)$ for all $x \in \mathbb{R}$.

Definition. If $X$ is a random variable, then the function $F : \mathbb{R} \to [0, 1]$, defined by $F(x) = P(X \le x)$, is called the distribution function of $X$.

Example: Let $X$ be a random number selected from the interval $[a, b]$. If $x < a$, we have $P(X \le x) = 0$. If $x \in [a, b]$, then $F(x) = P(X \le x) = P(X \in [a, x]) = \frac{x - a}{b - a}$. If $x > b$, then $P(X \le x) = 1$. Hence

  $F(x) = 0$ if $x < a$;  $F(x) = \frac{x - a}{b - a}$ if $a \le x \le b$;  $F(x) = 1$ if $x > b$.

Example: Let $X$ denote the number of heads in 3 flips of a fair coin. We recall that $P(X = 0) = \frac{1}{8}$, $P(X = 1) = \frac{3}{8}$, $P(X = 2) = \frac{3}{8}$, $P(X = 3) = \frac{1}{8}$. Thus

  $F(x) = 0$ if $x < 0$;
  $F(x) = \frac{1}{8}$ if $0 \le x < 1$;
  $F(x) = \frac{1}{8} + \frac{3}{8} = \frac{1}{2}$ if $1 \le x < 2$;
  $F(x) = \frac{1}{2} + \frac{3}{8} = \frac{7}{8}$ if $2 \le x < 3$;
  $F(x) = 1$ if $x \ge 3$.

Properties of distribution functions

1. $F$ is nondecreasing, i.e., $F(x) \le F(y)$ if $x < y$.
Proof: If $x < y$, then $\{X \le x\} \subset \{X \le y\}$. Recalling that $P(B) \le P(A)$ if $B \subset A$, we obtain $P(X \le x) \le P(X \le y)$. ∎

2. $\lim_{x \to \infty} F(x) = 1$.
Proof: It is enough to show that for any increasing sequence $\{x_n\}$ such that $\lim_n x_n = \infty$ we have $\lim_n F(x_n) = 1$. The events $\{X \le x_n\}$ form an increasing sequence since $\{X \le x_n\} \subset \{X \le x_{n+1}\}$ for all $n$. Thus

  $\lim_{n \to \infty} \{X \le x_n\} = \bigcup_{n=1}^{\infty} \{X \le x_n\} = \{X < \infty\}.$

Using the continuity of the probability function, $\lim_{n \to \infty} F(x_n) = \lim_{n \to \infty} P(X \le x_n) = P(X < \infty) = 1$. ∎

3. $\lim_{x \to -\infty} F(x) = 0$.
Proof: Similar to the proof of 2. ∎

4. $F$ is right continuous, i.e., if $\{x_n\}$ is a decreasing sequence such that $\lim_n x_n = x$, then $\lim_{n \to \infty} F(x_n) = F(x)$.
Proof: The events $\{X \le x_n\}$ form a decreasing sequence, i.e., $\{X \le x_{n+1}\} \subset \{X \le x_n\}$ for all $n$. Thus

  $\lim_{n \to \infty} \{X \le x_n\} = \bigcap_{n=1}^{\infty} \{X \le x_n\} = \{X \le x\}.$

Using the continuity of the probability function, $\lim_{n \to \infty} F(x_n) = \lim_{n \to \infty} P(X \le x_n) = P(X \le x) = F(x)$. ∎
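As an added illustration (not from the original notes), the staircase distribution function of the three-flip example can be evaluated directly from the pmf; the spot checks below also exhibit right continuity at a jump point, since $F(1) = P(X \le 1)$ includes the jump at 1. The function names `p` and `F` just mirror the notation above.

```python
from fractions import Fraction
from math import comb

# pmf of X = number of heads in 3 fair coin flips: p(k) = C(3, k) / 8
p = {k: Fraction(comb(3, k), 8) for k in range(4)}

def F(x):
    """Distribution function F(x) = P(X <= x): sum of p(k) over all k <= x."""
    return sum((p_k for k, p_k in p.items() if k <= x), Fraction(0))

# Spot checks against the staircase form derived above
assert F(-0.1) == 0                  # below the smallest possible value
assert F(0.5) == Fraction(1, 8)      # constant between the jumps
assert F(1)   == Fraction(1, 2)      # right continuity: the jump at x = 1 is included
assert F(2.7) == Fraction(7, 8)
assert F(5)   == 1
```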
Detour: left and right limits of a function

Recall that the right limit of a function $f : \mathbb{R} \to \mathbb{R}$ at $x$ is

  $f(x^+) \triangleq \lim_{\epsilon \downarrow 0} f(x + \epsilon)$

if the limit on the right hand side (as $\epsilon \to 0$ in such a way that $\epsilon > 0$) exists. The left limit at $x$ is

  $f(x^-) \triangleq \lim_{\epsilon \downarrow 0} f(x - \epsilon)$

if the limit exists. $f$ is said to be right continuous at $x$ if $f(x^+) = f(x)$, and left continuous if $f(x^-) = f(x)$. If $f$ is both left and right continuous at $x$, then it is continuous at $x$.

Since a distribution function $F(x)$ is nondecreasing, the left and right limits $F(x^-)$ and $F(x^+)$ always exist at each $x$. Property 4, the right continuity of $F$, shows that $F(x^+) = F(x)$.

In what follows we will use $F(x)$ to express various probabilities of the form $P(X \in I)$, where $I \subset \mathbb{R}$ is an interval.

$P(X > a)$: Since $\{X > a\} = \{X \le a\}^c$, we have $P(X > a) = 1 - P(X \le a) = 1 - F(a)$.

$P(a < X \le b)$ for $b > a$: Since $\{a < X \le b\} = \{X \le b\} - \{X \le a\}$ and $\{X \le a\} \subset \{X \le b\}$, we get

  $P(a < X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a).$

$P(X < a)$: The events $\{X \le a - 1/n\}$ for $n = 1, 2, \ldots$ form an increasing sequence. Thus

  $\lim_{n \to \infty} \{X \le a - 1/n\} = \bigcup_{n=1}^{\infty} \{X \le a - 1/n\} = \{X < a\}.$

Using the continuity of the probability function,

  $F(a^-) = \lim_{n \to \infty} F(a - 1/n) = \lim_{n \to \infty} P(X \le a - 1/n) = P(X < a),$

so $P(X < a) = F(a^-)$.

$P(X \ge a)$: Since $\{X \ge a\} = \{X < a\}^c$, we have $P(X \ge a) = 1 - P(X < a) = 1 - F(a^-)$.

$P(X = a)$: Since $\{X = a\} = \{X \le a\} - \{X < a\}$ and $\{X < a\} \subset \{X \le a\}$,

  $P(X = a) = P(X \le a) - P(X < a) = F(a) - F(a^-).$

The expression $P(X = a) = F(a) - F(a^-)$ implies the following: If $F$ is continuous at $a$ (i.e., $F(a^-) = F(a)$), then $P(X = a) = 0$. If $F$ is not continuous at $a$, then it has a jump of magnitude $F(a) - F(a^-)$, which is the probability of $X = a$.

Note: If $F$ is continuous at $a$ and $b$, then

  $P(a \le X \le b) = P(a \le X < b) = P(a < X < b) = P(a < X \le b) = F(b) - F(a).$

Example: The distribution function of the random variable $X$ is given by

  $F(x) = 0$ if $x < 0$;  $F(x) = \frac{1 + 4x}{4(1 + x)}$ if $x \ge 0$.

Calculate $P(X = 0)$, $P(0 < X \le 1)$, and $P(X > 10)$.

Solution: $P(X = 0)$: $F$ has a jump at $x = 0$. Thus

  $P(X = 0) = F(0) - F(0^-) = \frac{1 + 4 \cdot 0}{4(1 + 0)} - 0 = \frac{1}{4}.$

  $P(0 < X \le 1) = F(1) - F(0) = \frac{5}{8} - \frac{1}{4} = \frac{3}{8}.$

  $P(X > 10) = 1 - P(X \le 10) = 1 - F(10) = 1 - \frac{41}{44} = \frac{3}{44}.$
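A quick sanity check of this example, added here as an illustration: evaluating the given $F$ with exact rational arithmetic reproduces all three probabilities. (Using `fractions.Fraction` from the standard library is just one convenient choice.)

```python
from fractions import Fraction

def F(x):
    """The distribution function from the example: 0 for x < 0, (1+4x)/(4(1+x)) for x >= 0."""
    if x < 0:
        return Fraction(0)
    return Fraction(1 + 4 * x, 4 * (1 + x))

# F is identically 0 on x < 0, so F(0-) = 0 and the jump at 0 is the first probability.
print(F(0) - Fraction(0))     # P(X = 0)      = F(0) - F(0-) = 1/4
print(F(1) - F(0))            # P(0 < X <= 1) = F(1) - F(0)  = 3/8
print(1 - F(10))              # P(X > 10)     = 1 - F(10)    = 3/44
```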
Discrete random variables

Recall that by definition a countably infinite set has as many elements as the set of positive integers $\mathbb{N}$. More formally, the elements of a countably infinite set $C$ can be listed as $C = \{x_1, x_2, \ldots, x_n, \ldots\}$. Examples:

- The set of positive integers $\mathbb{N} = \{1, 2, 3, \ldots\}$.
- The set of all integers $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$.
- The set of all rational numbers $\mathbb{Q}$.

Definition. A discrete set is a set with a finite or countably infinite number of elements. $X$ is a discrete random variable if the set of possible values of $X$ is a discrete set.

We have already seen many examples of discrete r.v.'s. Let $\mathcal{X}$ denote the range of $X$, i.e., the set of its possible values.

- Sum of two fair dice: $\mathcal{X} = \{2, \ldots, 12\}$.
- Number of heads in $n$ tosses of a fair coin: $\mathcal{X} = \{0, 1, 2, \ldots, n\}$.
- Number of times a fair coin is tossed until a head shows up: $\mathcal{X} = \{1, 2, 3, \ldots\} = \mathbb{N}$.

Definition. Let $X$ be a discrete random variable with range $\mathcal{X} = \{x_1, x_2, x_3, \ldots\}$. The probability mass function (pmf) of $X$ is the function $p : \mathbb{R} \to \mathbb{R}$ defined by

1. $p(x_i) = P(X = x_i)$ if $x_i \in \mathcal{X}$,
2. $p(x) = 0$ if $x \notin \mathcal{X}$.

Properties of a pmf

(a) $p(x) \ge 0$ for all $x \in \mathbb{R}$.
(b) $\sum_{x_i \in \mathcal{X}} p(x_i) = 1$.

Proof: (a) is obvious since $P(X = x_i) \ge 0$. To prove (b), observe that $\{X = x_1\}, \{X = x_2\}, \{X = x_3\}, \ldots$ are mutually exclusive events such that $\bigcup_{x_i \in \mathcal{X}} \{X = x_i\} = S$. Thus

  $\sum_{x_i \in \mathcal{X}} p(x_i) = \sum_{x_i \in \mathcal{X}} P(X = x_i) = P\Big(\bigcup_{x_i \in \mathcal{X}} \{X = x_i\}\Big) = P(S) = 1.$ ∎

Suppose $X$ is a discrete random variable with range $\mathcal{X}$ and pmf $p$. How can we use $p$ to calculate $P(X \in B)$ for $B \subset \mathcal{X}$? As before, the events $\{X = x\}$, $x \in B$, are mutually exclusive and $\bigcup_{x_i \in B} \{X = x_i\} = \{X \in B\}$. Since $B$ is either finite or countably infinite,

  $P(X \in B) = P\Big(\bigcup_{x_i \in B} \{X = x_i\}\Big) = \sum_{x_i \in B} P(X = x_i) = \sum_{x_i \in B} p(x_i).$

Example: Determine the constant $c$ so that the function

  $p(i) = c\Big(\frac{1}{3}\Big)^i, \quad i = 1, 2, 3, \ldots$

is a pmf for a random variable $X$ with range $\mathcal{X} = \mathbb{N}$. Calculate the probabilities $P(X \ge 10)$ and $P(X \text{ is an even number})$.

Solution: We need to find $c$ such that $\sum_{i \in \mathcal{X}} p(i) = 1$. We have

  $\sum_{i=1}^{\infty} p(i) = c \sum_{i=1}^{\infty} \Big(\frac{1}{3}\Big)^i = c\Big(\frac{1}{3}\Big) \sum_{k=0}^{\infty} \Big(\frac{1}{3}\Big)^k = c\Big(\frac{1}{3}\Big)\Big(\frac{1}{1 - 1/3}\Big) = \frac{c}{2},$

so $c = 2$.

  $P(X \ge 10) = \sum_{i \in \mathcal{X}:\, i \ge 10} p(i) = 2 \sum_{i=10}^{\infty} \Big(\frac{1}{3}\Big)^i = 2\Big(\frac{1}{3}\Big)^{10} \sum_{k=0}^{\infty} \Big(\frac{1}{3}\Big)^k = 2\Big(\frac{1}{3}\Big)^{10}\Big(\frac{1}{1 - 1/3}\Big) = \Big(\frac{1}{3}\Big)^9.$

  $P(X \text{ is even}) = \sum_{i \in \mathcal{X}:\, i \text{ even}} p(i) = 2 \sum_{j=1}^{\infty} \Big(\frac{1}{3}\Big)^{2j} = 2\Big(\frac{1}{3}\Big)^2 \sum_{k=0}^{\infty} \Big(\Big(\frac{1}{3}\Big)^2\Big)^k = 2 \cdot \frac{1}{9} \cdot \frac{9}{8} = \frac{1}{4}.$
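As a small numerical cross-check (an addition to these notes, not part of the original), truncating the infinite sums at a large index reproduces all three answers to floating-point accuracy; the cutoff of 200 terms is an arbitrary choice large enough that the geometric tail is negligible.

```python
# Truncated-sum check of the pmf p(i) = 2 * (1/3)**i on {1, 2, 3, ...};
# the tail beyond i = 200 is far below floating-point precision.
p = lambda i: 2 * (1 / 3) ** i

print(sum(p(i) for i in range(1, 201)))        # ~1.0, so c = 2 normalizes p
print(sum(p(i) for i in range(10, 201)))       # ~ (1/3)**9 ...
print((1 / 3) ** 9)                            # ... the closed-form answer, about 5.08e-05
print(sum(p(i) for i in range(2, 201, 2)))     # ~0.25 = P(X is even)
```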
The formula $P(X \in B) = \sum_{x_i \in B} p(x_i)$ and the known properties of distribution functions imply how the pmf and the distribution function of a discrete r.v. $X$ determine each other. If a discrete r.v. $X$ has pmf $p(x)$ and range $\mathcal{X}$, then its distribution function is

  $F(x) = P(X \le x) = \sum_{x_i \in \mathcal{X}:\, x_i \le x} p(x_i)$ for all $x \in \mathbb{R}$.

If $x_1 < x_2 < x_3 < \cdots$, then $F(x)$ is constant over each interval $[x_{i-1}, x_i)$ and has a jump of magnitude $p(x_i)$ at $x = x_i$.

Example: Let $X$ be a discrete r.v. with pmf

  $p(0) = \frac{1}{2}, \quad p(1) = \frac{1}{8}, \quad p(2) = \frac{1}{4}, \quad p(3) = \frac{1}{8}.$

Determine the distribution function of $X$.

Solution: Check: $\frac{1}{2} + \frac{1}{8} + \frac{1}{4} + \frac{1}{8} = 1$, so $p$ is a valid pmf. Using $F(x) = \sum_{x_i \in \mathcal{X}:\, x_i \le x} p(x_i)$ we obtain

  $F(x) = 0$ if $x < 0$;
  $F(x) = \frac{1}{2}$ if $0 \le x < 1$;
  $F(x) = \frac{1}{2} + \frac{1}{8} = \frac{5}{8}$ if $1 \le x < 2$;
  $F(x) = \frac{1}{2} + \frac{1}{8} + \frac{1}{4} = \frac{7}{8}$ if $2 \le x < 3$;
  $F(x) = 1$ if $x \ge 3$.

Conversely, if the distribution function $F$ of a r.v. $X$ is piecewise constant (a staircase function) with jumps $F(x_i) - F(x_i^-)$ at the points $x_1, x_2, x_3, \ldots$, then $X$ is a discrete r.v. with values $\mathcal{X} = \{x_1, x_2, x_3, \ldots\}$ and pmf given by

  $p(x_i) = P(X = x_i) = F(x_i) - F(x_i^-)$ for all $x_i \in \mathcal{X}$.

Example: Given a coin that comes up heads with probability $p$ ($0 < p < 1$) and tails with probability $q = 1 - p$, let $X$ be the number of flips until the first head. Find the distribution function of $X$.

Solution: $\mathcal{X} = \{1, 2, 3, \ldots\}$. For $n \in \mathcal{X}$,

  $P(X = n) = P(\underbrace{T \cdots T}_{n-1\ \text{tails}} H) = q^{n-1} p,$

so $p(n) = p q^{n-1}$. Let $n \le x < n + 1$. Then

  $F(x) = \sum_{i \in \mathcal{X}:\, i \le x} p(i) = \sum_{i=1}^{n} p q^{i-1} = p \sum_{j=0}^{n-1} q^j = p \cdot \frac{1 - q^n}{1 - q} = 1 - q^n.$

Thus

  $F(x) = 0$ if $x < 1$;  $F(x) = 1 - q^n$ if $n \le x < n + 1$, $n = 1, 2, 3, \ldots$

$F$ can be defined more elegantly by using the "floor function." For any $x \in \mathbb{R}$ let $\lfloor x \rfloor$ denote the largest integer that is less than or equal to $x$. Then

  $F(x) = 0$ if $x < 1$;  $F(x) = 1 - q^{\lfloor x \rfloor}$ if $x \ge 1$.

Expected Value of a Discrete R.V.

Suppose you play a game such that in each round you can win $1 with probability $p(1) = 1/2$, win $3 with probability $p(3) = 1/8$, and lose $4 with probability $p(-4) = 3/8$. If you play $n$ rounds of the game, how much will your average (per round) winnings be?

If $n$ is large, you can expect to win $1 about $(1/2)n$ times, win $3 about $(1/8)n$ times, and lose $4 about $(3/8)n$ times. Thus your average winnings are approximately

  $\frac{1 \cdot (1/2)n + 3 \cdot (1/8)n - 4 \cdot (3/8)n}{n} = 1 \cdot \frac{1}{2} + 3 \cdot \frac{1}{8} - 4 \cdot \frac{3}{8} = -\frac{5}{8}.$

If we let $x_1 = 1$, $x_2 = 3$, and $x_3 = -4$, the average can be written as

  $x_1 p(x_1) + x_2 p(x_2) + x_3 p(x_3) = \sum_{i=1}^{3} x_i p(x_i).$

The previous simple example motivates a general definition of average value for a discrete r.v.

Definition. The expected value of a discrete random variable with range $\mathcal{X}$ and pmf $p(x)$ is

  $E(X) = \sum_{x \in \mathcal{X}} x p(x).$

Remarks: The expected value of $X$ is often called the expectation or the mean of $X$. The expected value can also be expressed as $E(X) = \sum_{x \in \mathcal{X}} x P(X = x)$. If $\mathcal{X}$ is not a finite set, $E(X)$ is defined by an infinite sum. In this case we say that $E(X)$ exists iff $\sum_{x \in \mathcal{X}} |x| p(x) < \infty$.
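The long-run-average argument motivating this definition can be illustrated by simulation. The sketch below is an addition for illustration only (the seed and sample size are arbitrary choices); it samples a million rounds of the game described above and prints an average close to $-5/8 = -0.625$.

```python
import random

# Monte Carlo illustration: per-round average winnings of the game above
# (win $1 w.p. 1/2, win $3 w.p. 1/8, lose $4 w.p. 3/8); E(X) = -5/8 = -0.625.
random.seed(0)
n = 10**6
winnings = random.choices([1, 3, -4], weights=[1/2, 1/8, 3/8], k=n)
print(sum(winnings) / n)   # close to -0.625, as the long-run-average argument predicts
```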
Example: What is the expected value of the number of heads in 3 flips of a fair coin?

Solution: $\mathcal{X} = \{0, 1, 2, 3\}$ and

  $E(X) = 0 \cdot P(X=0) + 1 \cdot P(X=1) + 2 \cdot P(X=2) + 3 \cdot P(X=3) = 0 \cdot \frac{1}{8} + 1 \cdot \frac{3}{8} + 2 \cdot \frac{3}{8} + 3 \cdot \frac{1}{8} = 1.5.$

Constant random variable: $X$ is called a constant random variable if it has only one possible value. Thus $P(X = c) = 1$ for some constant $c \in \mathbb{R}$. Not surprisingly,

  $E(X) = c P(X = c) = c.$

Indicator random variable: Let $A$ be an event. The indicator r.v. $X$ for $A$ is defined as $X = 1$ if $A$ occurs and $X = 0$ if $A$ does not occur. We can find $E(X)$ as follows:

  $E(X) = 0 \cdot P(X = 0) + 1 \cdot P(X = 1) = 0 \cdot P(A^c) + 1 \cdot P(A) = P(A).$

Thus $E(X) = P(A)$ for the indicator of the event $A$.

Nonnegative random variables: $X$ is a nonnegative r.v. if all possible values of $X$ are nonnegative. The expectation of a nonnegative r.v. is nonnegative.

Proof: If $x \ge 0$ for all $x \in \mathcal{X}$, then $E(X) = \sum_{x \in \mathcal{X}} x p(x) \ge 0$ since the sum only contains nonnegative terms. ∎

Example: An urn contains $n$ balls such that the $i$th ball has the number $a_i$ written on it. Let $X$ be the number written on a randomly selected ball. Find $E(X)$.

Solution: (a) Assume the numbers $a_1, a_2, \ldots, a_n$ are all distinct. Then $X$ has $n$ possible values $\mathcal{X} = \{a_1, a_2, \ldots, a_n\}$ and pmf $p(a_1) = p(a_2) = \cdots = p(a_n) = \frac{1}{n}$. Thus

  $E(X) = \sum_{i=1}^{n} a_i p(a_i) = \frac{a_1 + a_2 + \cdots + a_n}{n}.$

(b) If the $a_i$ are not all distinct, the solution is a bit more complicated. Assume the possible distinct values are $b_1, b_2, \ldots, b_m$ (where $m < n$). Then $\mathcal{X} = \{b_1, b_2, \ldots, b_m\}$ and we need to determine the pmf of $X$. Let $n(b_j)$ be the number of balls with the number $b_j$. Then, since we pick each ball with probability $1/n$, $P(X = b_j) = \frac{n(b_j)}{n}$, and so

  $E(X) = \sum_{j=1}^{m} b_j P(X = b_j) = \frac{b_1 n(b_1) + b_2 n(b_2) + \cdots + b_m n(b_m)}{n}.$

But $b_j n(b_j)$ is just the sum of the $a_i$'s on the $n(b_j)$ balls. Thus we again obtain $E(X) = \frac{a_1 + a_2 + \cdots + a_n}{n}$.

Example: Your friend challenges you to a game. You have to pay $1 to play. He rolls a pair of fair dice, and if the sum is 7, you win $4. If he rolls a pair of 6's, you get $10. For any other outcome you get nothing and he gets to keep your $1. Would you like to play?

Solution: Let $X$ be your net winnings (winnings minus the $1 you pay to play). Then $X$ takes the values 3, 9, and $-1$ with probabilities

  $P(X = 3) = P(\text{sum is } 7) = \frac{6}{36}, \quad P(X = 9) = P(\text{double } 6) = \frac{1}{36}, \quad P(X = -1) = 1 - \frac{7}{36} = \frac{29}{36}.$

Thus

  $E(X) = 3 \cdot \frac{6}{36} + 9 \cdot \frac{1}{36} - 1 \cdot \frac{29}{36} = -\frac{2}{36} = -\frac{1}{18}.$

Thus you lose $\frac{1}{18}$ of a dollar, about 5.5 cents, per game on the average. The game would only be fair if we had $E(X) = 0$.

Remark: Expectation as Center of Gravity. Let $X$ be a discrete r.v. with range $\{x_1, x_2, \ldots, x_n\}$ and pmf $p(x)$. If we view the real line as a weightless rod on which a point mass $p(x_i)$ is located at the point $x_i$ for $i = 1, 2, \ldots, n$, then

  $E(X) = \sum_{i=1}^{n} x_i p(x_i)$

is the center of gravity (center of mass) of the rod. That is, the sum of torques turning the rod around the point $E(X)$ is zero, as can be seen from

  $\sum_{i=1}^{n} (x_i - E(X)) p(x_i) = \sum_{i=1}^{n} x_i p(x_i) - E(X) \underbrace{\sum_{i=1}^{n} p(x_i)}_{=1} = E(X) - E(X) = 0.$
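For an exact check of the dice game example above (added as an illustration, not part of the original notes), the expectation can be computed by summing the net payoff over all 36 equally likely outcomes; the helper name `net` is our choice and encodes the game's payoff rule.

```python
from fractions import Fraction
from itertools import product

# Exact expectation of the net winnings X in the dice game above.
def net(i, j):
    if (i, j) == (6, 6):
        return 9       # $10 payout minus the $1 stake
    if i + j == 7:
        return 3       # $4 payout minus the $1 stake
    return -1          # stake lost

E = sum(Fraction(1, 36) * net(i, j) for i, j in product(range(1, 7), repeat=2))
print(E)   # -1/18: an expected loss of about 5.5 cents per game
```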
Often we know the pmf of $X$ but need to calculate the expectation of a function $g(X)$ of $X$.

Theorem 1. If $X$ is a discrete r.v. with range $\mathcal{X}$ and pmf $p(x)$, and $g : \mathbb{R} \to \mathbb{R}$ is a function, then

  $E[g(X)] = \sum_{x \in \mathcal{X}} g(x) p(x).$

Proof: If $\mathcal{X}$ is the range of $X$, then the set $\mathcal{Y} \triangleq g(\mathcal{X}) = \{g(x) : x \in \mathcal{X}\}$ is the range of $Y = g(X)$. The following is a key observation: for any $y \in \mathcal{Y}$,

  $P(Y = y) = P(g(X) = y) = \sum_{x \in \mathcal{X}:\, g(x) = y} P(X = x).$

Using this,

  $E(Y) = \sum_{y \in \mathcal{Y}} y P(Y = y) = \sum_{y \in \mathcal{Y}} y \sum_{x \in \mathcal{X}:\, g(x) = y} P(X = x) = \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}:\, g(x) = y} g(x) P(X = x) = \sum_{x \in \mathcal{X}} g(x) P(X = x) = \sum_{x \in \mathcal{X}} g(x) p(x).$ ∎

Example: A discrete random variable takes the values $-1$, 0, and 1 with respective probabilities 0.1, 0.4, 0.5. Calculate $E(X^2)$.

Solution: In this simple example we can easily determine $E(X^2)$ by (a) calculating directly the pmf of $Y = X^2$, or (b) using the theorem.

(a) $Y$ takes the values 0 and 1. $P(Y = 0) = P(X = 0) = 0.4$, while $P(Y = 1) = P(X = -1) + P(X = 1) = 0.1 + 0.5 = 0.6$. Thus

  $E(Y) = \sum_{y \in \mathcal{Y}} y P(Y = y) = 0 \cdot 0.4 + 1 \cdot 0.6 = 0.6.$

(b) Applying the theorem with $g(x) = x^2$,

  $E(X^2) = \sum_{x \in \mathcal{X}} x^2 P(X = x) = (-1)^2 \cdot 0.1 + 0^2 \cdot 0.4 + 1^2 \cdot 0.5 = 0.6.$

The following is a very often used corollary of the theorem:

Corollary 2 (Linearity of expectation). If $X$ is a discrete r.v. and $a$ and $b$ are real constants, then $E(aX + b) = aE(X) + b$.

Proof: Applying the theorem with $g(x) = ax + b$ we get

  $E(aX + b) = \sum_{x \in \mathcal{X}} (ax + b) p(x) = a \underbrace{\sum_{x \in \mathcal{X}} x p(x)}_{E(X)} + b \underbrace{\sum_{x \in \mathcal{X}} p(x)}_{1} = aE(X) + b.$ ∎

Remark: The property described by the corollary is called the linearity of expectation and is very often used in calculations.

The following corollary can be proved in a similar way:

Corollary 3. If $X$ is a discrete r.v., $g_1, g_2, \ldots, g_n$ are real-valued functions on the range of $X$, and $a_1, a_2, \ldots, a_n$ are real constants, then

  $E\Big[\sum_{i=1}^{n} a_i g_i(X)\Big] = \sum_{i=1}^{n} a_i E[g_i(X)].$

Example: $E[X^2 + 3X + 6] = E[X^2] + 3E[X] + 6$.

Example: 3 disks of radius 1, 2, and 3 inches are in a box. What is the expected value of the area of a randomly selected disk?

Solution: Let $R$ be the radius of the selected disk. Then $P(R = r) = 1/3$ for $r = 1, 2, 3$. We need $E[g(R)] = E[\pi R^2]$:

  $E[\pi R^2] = \pi E[R^2] = \pi \sum_{r=1}^{3} r^2 P(R = r) = \pi \cdot \frac{1^2 + 2^2 + 3^2}{3} = \frac{14\pi}{3}.$

Variance

Expectation measures the "average value" of a r.v., but gives no information regarding the fluctuation of the values about their average.

Example: Suppose you are told that you can choose to play one of two games:

(a) You pay $5 to play. You win $15 with probability 1/2. You lose with probability 1/2, in which case the bank keeps your $5. Your expected winnings are $\$10 \cdot (1/2) + (-\$5) \cdot (1/2) = \$2.5$.

(b) You pay $1,000 to play. You win $2,005 with probability 1/2. You lose with probability 1/2, in which case the bank keeps your $1,000. Your expected winnings are $\$1{,}005 \cdot (1/2) + (-\$1{,}000) \cdot (1/2) = \$2.5$.

Which game would you like to play?

Definition. Let $X$ be a discrete random variable with mean $E(X) = \mu$. Then the variance of $X$ is

  $\mathrm{Var}(X) = E[(X - \mu)^2].$

The standard deviation of $X$, denoted by $\sigma_X$, is $\sigma_X = \sqrt{\mathrm{Var}(X)} = \sqrt{E[(X - \mu)^2]}$. If $X$ has range $\mathcal{X}$ and pmf $p(x)$, the variance can be expressed as

  $\mathrm{Var}(X) = \sum_{x \in \mathcal{X}} (x - \mu)^2 p(x).$

Remarks: The variance of $X$ is sometimes denoted by $\sigma_X^2$ instead of $\mathrm{Var}(X)$. The variance is a measure of the spread of the distribution of a r.v. about its expected value.

Theorem 4. $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$.

Proof:

  $\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] = E(X^2) - 2\mu E(X) + \mu^2$ (by the linearity of expectation)
  $= E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - [E(X)]^2.$ ∎

Since $\mathrm{Var}(X)$ is the expectation of the nonnegative r.v. $(X - \mu)^2$, we have $\mathrm{Var}(X) \ge 0$. Thus the theorem implies:

Corollary 5. $E(X^2) \ge [E(X)]^2$.
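A small added check of Theorem 4 on the example above (an illustration, not part of the original notes): computing $E[(X - \mu)^2]$ directly from the pmf and computing $E(X^2) - [E(X)]^2$ give the same value, 0.44, up to floating-point rounding.

```python
# Check of Theorem 4 on the example pmf P(X=-1)=0.1, P(X=0)=0.4, P(X=1)=0.5.
pmf = {-1: 0.1, 0: 0.4, 1: 0.5}

mu      = sum(x * p for x, p in pmf.items())              # E(X)   = 0.4
EX2     = sum(x**2 * p for x, p in pmf.items())           # E(X^2) = 0.6, as computed above
var_def = sum((x - mu)**2 * p for x, p in pmf.items())    # E[(X - mu)^2], the definition

print(mu, EX2, var_def, EX2 - mu**2)   # the last two agree (up to rounding): Var(X) = 0.44
```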
Example: Find the standard deviation of your winnings in games (a) and (b) of the previous example.

Solution: Let $X$ denote your net winnings. In both games $E(X) = 2.5$. In game (a) we have

  $E(X^2) = 10^2 \cdot \frac{1}{2} + (-5)^2 \cdot \frac{1}{2} = 50 + 12.5 = 62.5,$

so $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 56.25$ and $\sigma_X = \sqrt{\mathrm{Var}(X)} = 7.5$.

In game (b) we have

  $E(X^2) = 1005^2 \cdot \frac{1}{2} + (-1000)^2 \cdot \frac{1}{2} = 1{,}005{,}012.5,$

so $\mathrm{Var}(X) = 1{,}005{,}006.25$ and $\sigma_X = \sqrt{\mathrm{Var}(X)} = 1002.5$.

Recall that a constant r.v. takes only one value with positive probability, i.e., $P(X = c) = 1$ for some $c \in \mathbb{R}$ and $P(X = x) = 0$ if $x \ne c$. Sometimes we say "$X$ is constant with probability 1" or "$X$ is constant" instead of "$X$ is a constant random variable."

Theorem 6. Let $X$ be a discrete r.v. Then $\mathrm{Var}(X) = 0$ if and only if $X$ is a constant random variable.

Proof: Let $\mu = E(X)$ and assume $P(X = a) > 0$ for some $a \ne \mu$. Then $(a - \mu)^2 > 0$, and so

  $\mathrm{Var}(X) = \sum_{x \in \mathcal{X}} (x - \mu)^2 P(X = x) \ge (a - \mu)^2 P(X = a) > 0.$

Thus if $\mathrm{Var}(X) = 0$, then $P(X = \mu) = 1$ and $P(X = x) = 0$ for any $x \ne \mu$. This means that $X$ is constant with probability 1.

Conversely, if $X$ is a constant r.v. with $P(X = c) = 1$, then $E(X) = c$ and $E(X^2) = c^2 P(X = c) = c^2$. This implies

  $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = c^2 - c^2 = 0.$ ∎

Note: $\mathrm{Var}(X) = 0$ is equivalent to $E(X^2) = [E(X)]^2$. Thus $E(X^2) > [E(X)]^2$ unless $X$ is a constant r.v.

Recall that for any constants $a$ and $b$, $E(aX + b) = aE(X) + b$. Although the variance is not linear, the following useful formula holds.

Theorem 7. Let $X$ be a discrete r.v. with variance $\mathrm{Var}(X)$. Then $\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)$.

Proof: Let $\mu = E(X)$. From the linearity of expectation, $E[aX + b] = a\mu + b$. Thus

  $\mathrm{Var}(aX + b) = E[(aX + b - (a\mu + b))^2] = E[(aX - a\mu)^2] = E[a^2 (X - \mu)^2] = a^2 E[(X - \mu)^2] = a^2 \mathrm{Var}(X).$ ∎

Recall the interpretation of expectation as the center of gravity of a distribution of mass. Analogously, we can interpret the variance

  $\mathrm{Var}(X) = \sum_{x \in \mathcal{X}} (x - \mu)^2 p(x)$

as the moment of inertia about the center of gravity. Steiner's theorem (the parallel axis theorem) in mechanics states that the moment of inertia of an object about an axis through its center of gravity is the minimum moment of inertia for any axis in that direction in space. The analogous statement in probability theory is the following.

Theorem 8. Let $X$ be a discrete r.v. with mean $\mu$. Then

  $\mathrm{Var}(X) = E[(X - \mu)^2] = \min_{c \in \mathbb{R}} E[(X - c)^2].$

Proof: For any $c \ne \mu$,

  $E[(X - c)^2] = E[((X - \mu) + (\mu - c))^2] = E[(X - \mu)^2 + 2(X - \mu)(\mu - c) + (\mu - c)^2]$
  $= E[(X - \mu)^2] + 2(\mu - c) \underbrace{E(X - \mu)}_{=0} + \underbrace{(\mu - c)^2}_{>0} > \mathrm{Var}(X).$

Thus $\mathrm{Var}(X) = E[(X - \mu)^2] = \min_{c \in \mathbb{R}} E[(X - c)^2]$, and $c = \mu$ is the unique minimizer of $E[(X - c)^2]$. ∎

Note: The preceding proof shows that for an arbitrary $c$,

  $E[(X - c)^2] = E[(X - \mu)^2] + (\mu - c)^2.$
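To close, a short numerical illustration of Theorem 8 (an addition to these notes, reusing the example pmf from before): $E[(X - c)^2]$ evaluated at several values of $c$ always equals $\mathrm{Var}(X) + (\mu - c)^2$, the identity from the proof, and is smallest at $c = \mu$. The helper name `mse` is our choice.

```python
# Numerical illustration of Theorem 8: c = mu minimizes E[(X - c)^2].
pmf = {-1: 0.1, 0: 0.4, 1: 0.5}                  # the example pmf used earlier
mu  = sum(x * p for x, p in pmf.items())
var = sum((x - mu)**2 * p for x, p in pmf.items())

def mse(c):
    """E[(X - c)^2] computed directly from the pmf."""
    return sum((x - c)**2 * p for x, p in pmf.items())

# The proof's identity E[(X - c)^2] = Var(X) + (mu - c)^2 holds for every c,
# so the minimum over c is attained exactly at c = mu.
for c in [-1.0, 0.0, mu, 1.0]:
    print(c, mse(c), var + (mu - c)**2)          # last two columns agree; smallest at c = mu
```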