HARRY CRANE
8. Expected value of random variables
Let X be a random variable taking three possible values, $x_1, x_2, x_3$. Suppose we repeatedly draw copies of X independently to obtain $X_1, X_2, \ldots$. (Such a sequence is called independent and identically distributed and is abbreviated i.i.d.) Given $X_1, X_2, \ldots$, we count $n_i := \#\{1 \le j \le n : X_j = x_i\}$, for each $i = 1, 2, 3$. Then the average of the observed values is
$$\frac{n_1 x_1 + n_2 x_2 + n_3 x_3}{n} = \sum_{i=1}^{3} x_i n_i / n.$$
Recall that this average converges to $\sum_i x_i P\{X = x_i\}$ as $n \to \infty$, which prompts the following definition.
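The convergence of the empirical average to $\sum_i x_i P\{X = x_i\}$ can be seen in a quick simulation. The values and probabilities below are hypothetical, chosen only for illustration; any probability mass function behaves the same way.

```python
import random

random.seed(1)

# Hypothetical values x1, x2, x3 and their probabilities (any pmf works).
values = [1.0, 2.0, 5.0]
probs = [0.5, 0.3, 0.2]

# The limiting value: sum_i x_i * P{X = x_i}.
limit = sum(x * p for x, p in zip(values, probs))

# Draw n i.i.d. copies of X and form the average of the observed values.
n = 200_000
draws = random.choices(values, weights=probs, k=n)
average = sum(draws) / n

print(round(limit, 3))    # 2.1
print(round(average, 1))  # 2.1
```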
Definition 8.1 (Expectation). Let X be a discrete random variable with probability mass function $p_X$. The expected value of X, denoted $EX = E(X) = \mu_X$, is defined by
$$EX := \sum_x x P\{X = x\} = \sum_x x p_X(x). \tag{14}$$
We stress that if X takes infinitely many values, then the sum in (14) might not be defined. We say that X has an expectation if the right-hand side of (14) is defined; X has finite expectation, or is integrable, if the right-hand side of (14) is finite.
As an aside, suppose X takes infinitely many values $\{a_i\}_{i \in I}$, where I is some infinite indexing set. Then how might we define $\sum_{i \in I} a_i = S$? For example, suppose $a_i = i$ for $i = 1, 2, \ldots$. Then $\sum_{i \in I} a_i = \sum_{i=1}^{\infty} i = \infty$. However, if $a_i := (-1)^i$ for $i = 1, 2, \ldots$, then $\sum_{i=1}^{\infty} a_i = -1 + 1 - 1 + 1 - \cdots$, which is not defined. Therefore, we define the infinite sum $S := \sum_{i \in I} a_i$ by $S := S_+ - S_-$, where
$$S_+ := \sum_{i : a_i > 0} a_i \quad \text{and} \quad S_- := \sum_{i : a_i < 0} |a_i|.$$
If one of S+ , S− is finite, then S = S+ − S− is well-defined. In particular, if S+ = ∞ and
S− < ∞, we have S = ∞ − S− = ∞; if S− = ∞ and S+ < ∞, then S = S+ − ∞ = −∞. However,
if S+ = S− = ∞, then S = S+ − S− = ∞ − ∞ is not defined.
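The $S = S_+ - S_-$ convention can be mimicked in code. In the sketch below (a hypothetical helper, passing each half as a nonnegative number or `math.inf`), floating-point arithmetic reproduces the three cases exactly: `inf - finite = inf`, and `inf - inf` is `nan`, the undefined case.

```python
import math

def signed_sum(s_plus, s_minus):
    """Combine the positive-part and negative-part sums.

    s_plus and s_minus are nonnegative, possibly math.inf. The result
    S = S+ - S- is nan (undefined) exactly when both halves are infinite.
    """
    return s_plus - s_minus

# a_i = i for i = 1, 2, ...: S+ = inf, S- = 0, so S = inf.
print(signed_sum(math.inf, 0.0))  # inf

# a_i = (-1)^i: S+ = inf and S- = inf, so S is undefined.
print(math.isnan(signed_sum(math.inf, math.inf)))  # True
```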
Mathematically, there is no problem in defining the expected value of a random variable
to be infinite in magnitude. However, there are some practical considerations that arise, as
the next example illustrates.
Problem 8.2 (St. Petersburg paradox). You pay \$B to play the following game. Toss a fair coin repeatedly until the first head is flipped. Let W denote the total number of flips, including the final head. You are paid $\$2^{W-1}$. So, for example, suppose we flip TTTTH. Then W = 5 and you win $2^{5-1} = \$16$. How much should you be willing to pay to play this game?
Solution. Let X be the amount you win playing this game. Then $X = 2^{W-1} - B$, where W follows the Geometric distribution with parameter 1/2. Furthermore, since B is finite, we have
$$EX = E2^{W-1} - B = \sum_{n=0}^{\infty} \frac{2^n}{2^{n+1}} - B = \frac{1}{2} + \frac{1}{2} + \cdots - B = \infty.$$
You are expected to win an unlimited amount of money. So (in theory), you should be willing to risk all of your wealth to play this game. Right? Be careful.
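A simulation makes the warning concrete. The sketch below (illustrative, truncated at finitely many games) shows why the infinite expectation is misleading in practice: most payouts are small, and the sample mean is dominated by rare huge payouts, so it never settles down.

```python
import random

random.seed(7)

def st_petersburg_payout():
    """Flip a fair coin until the first head; the payout is 2^(W-1),
    where W is the total number of flips including the head."""
    w = 1
    while random.random() < 0.5:  # tails with probability 1/2
        w += 1
    return 2 ** (w - 1)

# Sample means over more and more games: they drift upward rather than
# converging, reflecting EX = infinity.
payouts = [st_petersburg_payout() for _ in range(100_000)]
for n in (100, 10_000, 100_000):
    print(n, sum(payouts[:n]) / n)
```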
PROBABILITY MODELS
8.1. Examples.
(1) Let U be the number of pips on a fair six-sided die. Then $p_U(u) = 1/6$, $u = 1, \ldots, 6$, is the probability mass function of the discrete uniform distribution on $\{1, \ldots, 6\}$ and
$$EU = \sum_{u=1}^{6} u\, p_U(u) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 7/2.$$
In general, if U is a discrete uniform random variable on $[1, n] := \{1, \ldots, n\}$, then
$$EU = \sum_{i=1}^{n} i/n = \frac{1}{n} \times \frac{n(n+1)}{2} = \frac{n+1}{2},$$
the midpoint of the interval.
If $U_1, U_2$ are the number of pips on two fair dice, then $S = U_1 + U_2$ has $ES = 7/2 + 7/2 = 7$ by symmetry. Note that, in this case,
$$E(U_1 + U_2) = EU_1 + EU_2.$$
(More on this later.)
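Both computations are easy to verify exactly, here by a brute-force enumeration over the 36 equally likely outcomes for the pair of dice (an illustrative check, using exact rational arithmetic):

```python
from fractions import Fraction
from itertools import product

# E(U) for one fair die, directly from the definition (14).
EU = sum(u * Fraction(1, 6) for u in range(1, 7))
print(EU)  # 7/2

# E(U1 + U2) by enumerating all 36 equally likely outcomes.
ES = sum(Fraction(u1 + u2, 36) for u1, u2 in product(range(1, 7), repeat=2))
print(ES)  # 7
```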
(2) Let $N \sim \text{Bin}(n, p)$. Then, using the identity $k\binom{n}{k} = n\binom{n-1}{k-1}$ and the binomial theorem,
\begin{align*}
EN &= \sum_{k=0}^{n} k \binom{n}{k} p^k (1 - p)^{n-k} \\
&= \sum_{k=1}^{n} n \binom{n-1}{k-1} p^k (1 - p)^{n-k} \\
&= np \sum_{k=0}^{n-1} \binom{n-1}{k} p^k (1 - p)^{n-1-k} \\
&= np.
\end{align*}
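The collapse of this sum to $np$ can be checked numerically, summing the Binomial pmf term by term for a few illustrative parameter choices:

```python
from math import comb

def binomial_mean(n, p):
    """EN computed term by term from the Binomial(n, p) pmf."""
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

# The sum agrees with np, matching the derivation above.
for n, p in [(10, 0.3), (25, 0.5), (40, 0.9)]:
    print(n, p, binomial_mean(n, p), n * p)
```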
8.2. Properties of expectation.
(E1) Expectation of a function of X: Suppose $f : \mathbb{R} \to \mathbb{R}$ and X is a random variable. Then $Y := f(X) := f \circ X : \Omega \to \mathbb{R}$ is a random variable and $EY = \sum_x f(x) P\{X = x\}$.
Proof. Note that
$$P\{Y = y\} := P\left( \bigcup_{x : f(x) = y} \{X = x\} \right) = \sum_{x : f(x) = y} P\{X = x\}.$$
Thus,
\begin{align*}
EY &:= \sum_y y P\{Y = y\} \\
&= \sum_y y \sum_{x : f(x) = y} P\{X = x\} \\
&= \sum_y \sum_{x : f(x) = y} y P\{X = x\} \\
&= \sum_y \sum_{x : f(x) = y} f(x) P\{X = x\} \\
&= \sum_x f(x) P\{X = x\}.
\end{align*}
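(E1) says that grouping by the values of Y and summing $f(x)P\{X = x\}$ directly must agree. A small exact check, with a hypothetical pmf and a deliberately non-injective f so that the grouping actually merges values:

```python
from collections import defaultdict
from fractions import Fraction

# A hypothetical pmf for X, and f(x) = x^2 so that f(-2) = f(2), f(-1) = f(1).
pmf = {-2: Fraction(1, 4), -1: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4)}

def f(x):
    return x * x

# Direct route: E f(X) = sum_x f(x) P{X = x}.
direct = sum(f(x) * p for x, p in pmf.items())

# Grouped route: build the pmf of Y = f(X), then sum y P{Y = y}.
pmf_Y = defaultdict(Fraction)
for x, p in pmf.items():
    pmf_Y[f(x)] += p
grouped = sum(y * p for y, p in pmf_Y.items())

print(direct, grouped)  # 5/2 5/2
```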
(E2) Scaling rule: For c ∈ R, EcX = cEX.
(E3) Addition rule: For any random variables X, Y, E(X + Y) = EX + EY.
Proof. To use (E1), we put Z = (X, Y) and f (Z) = X + Y. Also, write p(x, y) = P{X =
x, Y = y} and define g(Z) = X and h(Z) = Y so that f = g + h. Then
\begin{align*}
E f(Z) &= \sum_{x,y} f(x, y) p(x, y) \\
&= \sum_{x,y} (g(x, y) + h(x, y)) p(x, y) \\
&= \sum_{x,y} g(x, y) p(x, y) + \sum_{x,y} h(x, y) p(x, y) \\
&= Eg(Z) + Eh(Z) \\
&= EX + EY.
\end{align*}
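Note that (E3) requires no independence. The check below uses a hypothetical joint pmf in which Y equals X with probability one, about as dependent as a pair can be, and additivity still holds:

```python
from fractions import Fraction

# A dependent pair: the joint pmf puts all mass on the diagonal, so Y = X.
joint = {(0, 0): Fraction(1, 2), (3, 3): Fraction(1, 2)}

EX = sum(x * p for (x, _), p in joint.items())
EY = sum(y * p for (_, y), p in joint.items())
E_sum = sum((x + y) * p for (x, y), p in joint.items())

print(EX, EY, E_sum)  # 3/2 3/2 3
```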
(E4) Multiplication rule: Let X and Y be independent random variables. Then EXY =
EXEY.
Proof. Using property (E1), we take $Z = XY = f(X, Y)$. Then, since independence factors the joint pmf as $P\{X = x, Y = y\} = p_X(x) p_Y(y)$,
\begin{align*}
EZ &= \sum_{x,y} f(x, y) P\{X = x, Y = y\} \\
&= \sum_{x,y} x y\, p_X(x) p_Y(y) \\
&= \sum_x x p_X(x) \sum_y y p_Y(y) \\
&= EXEY.
\end{align*}
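Unlike (E3), the multiplication rule does need independence. The sketch below builds a joint pmf from two hypothetical marginals, so independence holds by construction, and confirms $EXY = EX \cdot EY$ exactly:

```python
from fractions import Fraction
from itertools import product

# Independent X and Y, specified by their marginal pmfs (illustrative).
pX = {1: Fraction(1, 3), 2: Fraction(2, 3)}
pY = {0: Fraction(1, 2), 4: Fraction(1, 2)}

EX = sum(x * p for x, p in pX.items())
EY = sum(y * p for y, p in pY.items())

# Under independence the joint pmf factors: P{X=x, Y=y} = pX(x) pY(y).
EXY = sum(x * y * pX[x] * pY[y] for x, y in product(pX, pY))

print(EX, EY, EXY)  # 5/3 2 10/3
```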
(E5) Conditioning rule: Let $C_1, C_2, \ldots$ be a collection of exhaustive, mutually exclusive cases and define
$$E(X \mid C_i) = \sum_x x P(X = x \mid C_i).$$
Then $EX = \sum_i P(C_i) E(X \mid C_i)$.
Proof. By the law of cases from Section 4.1, $P\{X = x\} = \sum_i P(C_i) P(X = x \mid C_i)$ and, thus,
\begin{align*}
EX &= \sum_x x P\{X = x\} \\
&= \sum_x x \sum_i P(C_i) P(X = x \mid C_i) \\
&= \sum_i P(C_i) \sum_x x P(X = x \mid C_i) \\
&= \sum_i P(C_i) E(X \mid C_i).
\end{align*}
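The conditioning rule can be checked exactly on a small hypothetical example with two cases. The same answer comes out whether we average the conditional expectations or first assemble the marginal pmf of X:

```python
from fractions import Fraction

# Two exhaustive, mutually exclusive cases with P(C1) = P(C2) = 1/2, each
# paired with a conditional pmf of X (illustrative numbers).
cases = [
    (Fraction(1, 2), {0: Fraction(1, 2), 2: Fraction(1, 2)}),  # E(X | C1) = 1
    (Fraction(1, 2), {4: Fraction(1, 1)}),                     # E(X | C2) = 4
]

# Conditioning rule: EX = sum_i P(Ci) E(X | Ci).
EX = sum(p_c * sum(x * p for x, p in cond.items()) for p_c, cond in cases)

# Cross-check: build P{X = x} by the law of cases, then apply (14) directly.
marginal = {}
for p_c, cond in cases:
    for x, p in cond.items():
        marginal[x] = marginal.get(x, Fraction(0)) + p_c * p
EX_direct = sum(x * p for x, p in marginal.items())

print(EX, EX_direct)  # 5/2 5/2
```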
The above properties are extremely useful. For example, the calculation of the expectation of a Binomial random variable in Section 8.1(2) required some (but not much) ingenuity to figure out the sum. However, by property (E3), we could notice that $N \sim \text{Bin}(n, p)$ admits the expression $N = N_1 + \cdots + N_n$, where $N_1, \ldots, N_n$ are i.i.d. Bernoulli(p) random variables. For each $i = 1, \ldots, n$, $EN_i = p$; whence,
$$EN = E(N_1 + \cdots + N_n) = \sum_{i=1}^{n} EN_i = np.$$
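The indicator decomposition is also easy to simulate (illustrative n, p, and trial count): summing n independent Bernoulli(p) draws produces a Binomial sample whose empirical mean hovers near np.

```python
import random

random.seed(3)

def binomial_draw(n, p):
    """N = N1 + ... + Nn, a sum of n independent Bernoulli(p) indicators."""
    return sum(1 if random.random() < p else 0 for _ in range(n))

n, p = 20, 0.25
trials = 50_000
sample_mean = sum(binomial_draw(n, p) for _ in range(trials)) / trials
print(sample_mean)  # close to np = 5.0
```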