MTHE/STAT 351: 4 – Discrete Random Variables
T. Linder
Queen's University
Fall 2016

If an experiment is performed, very often we are not interested in the
actual outcome, but in some (numerical) function of the outcome.

For example, if a fair coin is tossed n times, we might be interested in
the number of heads instead of the actual sequence of heads and tails.

Let X denote the number of heads in n tosses. Then S = {H, T}^n
(head–tail sequences of length n) and for each s ∈ S

    X(s) = # of heads in s

For example

    X(HTT···TT) = 1        (n − 1 tails)

and

    X(HH···HHT) = n − 1    (n − 1 heads)

We can ask: What is P(X = k) for k = 0, 1, . . . , n?
Example: Let X be the number of heads appearing in three tosses of a
fair coin. Then

    P(X = 0) = P({TTT}) = 1/8
    P(X = 1) = P({HTT, THT, TTH}) = 3/8
    P(X = 2) = P({HHT, HTH, THH}) = 3/8
    P(X = 3) = P({HHH}) = 1/8

Letting {X = k} denote the event that X is equal to k, we have

    ∪_{k=0}^{3} {X = k} = S   and   P(∪_{k=0}^{3} {X = k}) = Σ_{k=0}^{3} P(X = k) = 1

Example: Let the experiment be tossing two fair dice and define X as
the sum of the dice. Then the sample space is

    S = {1, 2, 3, 4, 5, 6}^2 = {(i, j) : i, j = 1, . . . , 6}

and

    X((i, j)) = i + j

We can easily calculate the probabilities P(X = k):

    k        |  2     3     4     5     6     7     8     9    10    11    12
    P(X = k) | 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

For example

    P(X = 4) = P({(1, 3), (2, 2), (3, 1)}) = 3/36
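As a quick sanity check, the table can be recomputed with a few lines of code. This is a sketch added to these notes (Python chosen arbitrarily), not part of the original slides:

    from fractions import Fraction
    from collections import Counter

    # Count how many of the 36 equally likely outcomes (i, j) give each sum k.
    counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))
    pmf = {k: Fraction(c, 36) for k, c in sorted(counts.items())}
    print(pmf)  # {2: 1/36, 3: 1/18, 4: 1/12, ..., 7: 1/6, ..., 12: 1/36} (fractions reduced)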
Functions that assign numerical values to outcomes in the sample space
are called random variables. For technical reasons we cannot allow
arbitrary functions.

Definition If S is a sample space of a random experiment, a real-valued
function

    X : S → R

is called a random variable if for any t ∈ R the set

    {s ∈ S : X(s) ≤ t}

is an event.

Remarks:

We'll use the shorthand "r.v." for "random variable."

Random variables will usually be denoted by capital letters such as
X, Y, T, Z, etc. Their specific (non-random) values are denoted by
the corresponding lower case letters x, y, t, z.

Remarks on technicalities:

The condition that {s ∈ S : X(s) ≤ t} is an event for all t ∈ R is
necessary for building up a rich but consistent theory. This condition
need not deeply concern us in this course.

Since {s ∈ S : X(s) ≤ t} is an event, its complement

    {s ∈ S : X(s) ≤ t}^c = {s ∈ S : X(s) > t}

is also an event. Since the intersection of two events is an event,

    {s ∈ S : t < X(s) ≤ τ} = {s ∈ S : X(s) > t} ∩ {s ∈ S : X(s) ≤ τ}

is also an event. In general, if X is a random variable,

    {s ∈ S : X(s) ∈ B}

is an event for most sets B ⊂ R we can envision.

Most importantly, {s ∈ S : X(s) ∈ I} is an event for any interval
I ⊂ R, so the probability P(X ∈ I) is well defined.
Example: Kenny is picked up after school by one of his parents at a
random time between 2:00 pm and 2:30 pm. Let X be the time (in
hours) he has to wait after his last class ends at 2:00 pm. Here

    S = {t : 2 ≤ t ≤ 2.5} = [2, 2.5]

and X(t) = t − 2.

Recall that P(X = x) = 0 for all x, and that

    P(X ∈ [α, β]) = (β − α)/0.5 = 2(β − α)   if [α, β] ⊂ [0, 0.5]

Remark: From a given r.v. X we can construct other r.v.'s. Assume that
f : R → R is a reasonably well-behaved function. (We are deliberately
being vague here, but practically all functions we can think of will be
"well-behaved.")

By letting Y = f(X) we obtain another random variable. For example

    sin X,   cos X,   X^2,   2^X,   e^X

are all random variables. Questions of the form "what is P(sin X > 1/2)?"
will often be encountered.
Example: The radius of a metal ball manufactured in a factory is a
random number between 2 and 3 centimeters. What is the probability
that the volume of a manufactured ball is more than twice the volume of
a ball with radius 2 cm?

Solution: Let X be the radius of the ball in centimeters. Then X is a
random point in the interval [2, 3].

The volume (in cm^3) of a sphere of radius r is (4/3)πr^3. Thus we want to
know the probability

    P((4/3)πX^3 > 2 · (4/3)π2^3)

Since X > 0,

    (4/3)πX^3 > 2 · (4/3)π2^3  ⟺  X^3 > 2 · 2^3  ⟺  X > 2 · 2^(1/3)

Thus

    P((4/3)πX^3 > 2 · (4/3)π2^3) = P(X > 2 · 2^(1/3)) = (3 − 2 · 2^(1/3))/(3 − 2) ≈ 0.48

Distribution functions

We would like to be able to calculate all probabilities of the form
P(X ∈ B), where B is a reasonable subset of the reals, usually an
interval. It turns out that all we need to know is the probability
P(X ≤ x) for all x ∈ R.

Definition If X is a random variable, then the function F : R → [0, 1],
defined by

    F(x) = P(X ≤ x)

is called the distribution function of X.
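To connect the ball example with this definition, here is a small sketch (an addition to the notes; it assumes, as the example does, that "random" means uniform on [2, 3]) that evaluates the same probability through the distribution function F(x) = (x − 2)/(3 − 2):

    # F(x) = P(X <= x) for X uniform on [2, 3]; illustrative helper, names ours.
    def F(x: float) -> float:
        if x < 2:
            return 0.0
        if x > 3:
            return 1.0
        return (x - 2.0) / (3.0 - 2.0)

    threshold = 2 * 2 ** (1 / 3)   # the ball is "too big" when X > 2 * 2^(1/3)
    print(1 - F(threshold))        # about 0.4802, matching the 0.48 above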
Example: Let X be a random number selected from the interval [a, b].

If x < a, we have P(X ≤ x) = 0.

If x ∈ [a, b],

    P(X ≤ x) = P(X ∈ [a, x]) = (x − a)/(b − a)

If x > b, then P(X ≤ x) = 1.

Hence

    F(x) = 0                    if x < a
         = (x − a)/(b − a)      if a ≤ x ≤ b
         = 1                    if x > b

Example: Let X denote the number of heads in 3 flips of a fair coin. We
recall that

    P(X = 0) = 1/8,   P(X = 1) = 3/8,   P(X = 2) = 3/8,   P(X = 3) = 1/8

Thus

    F(x) = P(X ≤ x) = 0                    if x < 0
                    = 1/8                  if 0 ≤ x < 1
                    = 1/8 + 3/8 = 1/2      if 1 ≤ x < 2
                    = 1/2 + 3/8 = 7/8      if 2 ≤ x < 3
                    = 1                    if x ≥ 3
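This staircase can be built mechanically from the pmf by accumulating p(x_i) over x_i ≤ x. A sketch (helper names ours, not from the original notes):

    from fractions import Fraction

    pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

    def F(x):
        # Distribution function of a discrete r.v.: sum of p(x_i) over x_i <= x.
        return sum(p for xi, p in pmf.items() if xi <= x)

    for x in [-1, 0, 0.5, 1, 2, 2.9, 3, 10]:
        print(x, F(x))   # 0, 1/8, 1/8, 1/2, 7/8, 7/8, 1, 1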
Properties of distribution functions

1. F is nondecreasing, i.e., F(x) ≤ F(y) if x < y.

Proof: If x < y, then {X ≤ x} ⊂ {X ≤ y}. Recalling that
P(B) ≤ P(A) if B ⊂ A, we obtain P(X ≤ x) ≤ P(X ≤ y). ∎

2. lim_{x→∞} F(x) = 1.

Proof: It is enough to show that for any increasing sequence {x_n}
such that lim_n x_n = ∞ we have lim_n F(x_n) = 1.

The events {X ≤ x_n} form an increasing sequence since
{X ≤ x_n} ⊂ {X ≤ x_{n+1}} for all n. Thus

    lim_{n→∞} {X ≤ x_n} = ∪_{n=1}^{∞} {X ≤ x_n} = {X < ∞}

Using the continuity of the probability function,

    lim_{n→∞} F(x_n) = lim_{n→∞} P(X ≤ x_n) = P(X < ∞) = 1   ∎

3. lim_{x→−∞} F(x) = 0.

Proof: Similar to the proof of 2. ∎

4. F is right continuous, i.e., if {x_n} is a decreasing sequence such
that lim_n x_n = x, then

    lim_{n→∞} F(x_n) = F(x)

Proof: The events {X ≤ x_n} form a decreasing sequence, i.e.,
{X ≤ x_{n+1}} ⊂ {X ≤ x_n} for all n. Thus

    lim_{n→∞} {X ≤ x_n} = ∩_{n=1}^{∞} {X ≤ x_n} = {X ≤ x}

Using the continuity of the probability function,

    lim_{n→∞} F(x_n) = lim_{n→∞} P(X ≤ x_n) = P(X ≤ x) = F(x)   ∎
Detour: left and right limits of a function

Recall that the right limit of a function f : R → R at x is

    f(x+) = lim_{ε↓0} f(x + ε)

if the limit on the right hand side (as ε → 0 in such a way that
ε > 0) exists. The left limit at x is

    f(x−) = lim_{ε↓0} f(x − ε)

if the limit exists.

f is said to be right continuous at x if f(x+) = f(x), and left
continuous if f(x−) = f(x). If f is both left and right continuous at
x, then it is continuous at x.

Since a distribution function F(x) is nondecreasing, the left and
right limits F(x−) and F(x+) always exist at each x. Property 4,
the right continuity of F, shows that F(x+) = F(x).

In what follows we will use F(x) to express various probabilities of the
form P(X ∈ I), where I ⊂ R is an interval.

P(X > a): Since {X > a} = {X ≤ a}^c, we have

    P(X > a) = 1 − P(X ≤ a) = 1 − F(a)

P(a < X ≤ b) for b > a: Since {a < X ≤ b} = {X ≤ b} − {X ≤ a} and
{X ≤ a} ⊂ {X ≤ b}, we get

    P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)
P(X < a): The events {X ≤ a − 1/n} for n = 1, 2, . . . form an
increasing sequence. Thus

    lim_{n→∞} {X ≤ a − 1/n} = ∪_{n=1}^{∞} {X ≤ a − 1/n} = {X < a}

Using the continuity of the probability function,

    F(a−) = lim_{n→∞} F(a − 1/n) = lim_{n→∞} P(X ≤ a − 1/n) = P(X < a)

P(X ≥ a): Since {X ≥ a} = {X < a}^c, we have

    P(X ≥ a) = 1 − P(X < a) = 1 − F(a−)

P(X = a): Since {X = a} = {X ≤ a} − {X < a} and {X < a} ⊂ {X ≤ a},

    P(X = a) = P(X ≤ a) − P(X < a) = F(a) − F(a−)

The expression P(X = a) = F(a) − F(a−) implies the following:

If F is continuous at a (i.e., F(a−) = F(a)), then

    P(X = a) = 0

If F is not continuous at a, then it has a jump of magnitude
F(a) − F(a−), which is the probability of X = a.

Note: If F is continuous at a and b, then

    P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b) = F(b) − F(a)

Example: The distribution function of the random variable X is given by

    F(x) = 0                      if x < 0
         = (1 + 4x)/(4(1 + x))    if x ≥ 0

Calculate P(X = 0), P(0 < X ≤ 1), and P(X > 10).

Solution: P(X = 0): F has a jump at x = 0. Thus

    P(X = 0) = F(0) − F(0−) = (1 + 4·0)/(4·(1 + 0)) − 0 = 1/4

    P(0 < X ≤ 1) = F(1) − F(0) = 5/8 − 1/4 = 3/8

    P(X > 10) = 1 − F(10) = 1 − 41/44 = 3/44

Discrete random variables

Recall that by definition a countably infinite set has as many elements as
the set of positive integers N. More formally, the elements of a countably
infinite set C can be listed as C = {x_1, x_2, . . . , x_n, . . .}.

Examples:

    Set of positive integers N = {1, 2, 3, . . .}
    Set of all integers Z = {0, ±1, ±2, . . .}
    Set of all rational numbers Q

Definition A discrete set is a set with a finite or countably infinite
number of elements. X is a discrete random variable if the set of possible
values of X is a discrete set.
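The three probabilities in the example above are easy to mirror numerically. A sketch (our helper names; F_minus approximates the left limit F(a−) by approaching a from below):

    def F(x: float) -> float:
        return 0.0 if x < 0 else (1 + 4 * x) / (4 * (1 + x))

    def F_minus(a: float, eps: float = 1e-9) -> float:
        # Numerical stand-in for the left limit F(a-).
        return F(a - eps)

    print(F(0) - F_minus(0))   # P(X = 0)      -> 0.25
    print(F(1) - F(0))         # P(0 < X <= 1) -> 0.375
    print(1 - F(10))           # P(X > 10)     -> 0.0681... = 3/44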
We have already seen many examples of discrete r.v.'s. Let 𝒳 denote
the range of X, i.e., the set of its possible values.

    Sum of two fair dice: 𝒳 = {2, . . . , 12}

    Number of heads in n tosses of a fair coin: 𝒳 = {0, 1, 2, . . . , n}

    Number of times a fair coin is tossed until a head shows up:
    𝒳 = {1, 2, 3, . . .} = N

Definition Let X be a discrete random variable with range
𝒳 = {x_1, x_2, x_3, . . .}. The probability mass function (pmf) of X is the
function p : R → R defined by

    1. p(x_i) = P(X = x_i) if x_i ∈ 𝒳,
    2. p(x) = 0 if x ∉ 𝒳.

Properties of a pmf

(a) p(x) ≥ 0 for all x ∈ R.

(b) Σ_{x_i ∈ 𝒳} p(x_i) = 1.

Proof: (a) is obvious since P(X = x_i) ≥ 0. To prove (b), observe that
{X = x_1}, {X = x_2}, {X = x_3}, . . . are mutually exclusive events such
that

    ∪_{x_i ∈ 𝒳} {X = x_i} = S

Thus

    Σ_{x_i ∈ 𝒳} p(x_i) = Σ_{x_i ∈ 𝒳} P(X = x_i) = P(∪_{x_i ∈ 𝒳} {X = x_i}) = P(S) = 1   ∎
Suppose X is a discrete random variable with range 𝒳 and pmf p. How
can we use p to calculate P(X ∈ B) for B ⊂ 𝒳?

As before, the events {X = x}, x ∈ B, are mutually exclusive and

    ∪_{x_i ∈ B} {X = x_i} = {X ∈ B}

Since B is either finite or countably infinite,

    P(X ∈ B) = P(∪_{x_i ∈ B} {X = x_i}) = Σ_{x_i ∈ B} P(X = x_i) = Σ_{x_i ∈ B} p(x_i)

Example: Determine the constant c so that the function

    p(i) = c (1/3)^i,   i = 1, 2, 3, . . .

is a pmf for a random variable X with range 𝒳 = N. Calculate the
probabilities P(X ≥ 10) and P(X is an even number).

Solution: We need to find c such that Σ_{i ∈ 𝒳} p(i) = 1. We have

    Σ_{i=1}^{∞} p(i) = Σ_{i=1}^{∞} c (1/3)^i = c (1/3) Σ_{k=0}^{∞} (1/3)^k
                     = c (1/3) · 1/(1 − 1/3) = c/2

so c = 2.
Thus

    P(X ≥ 10) = Σ_{i ∈ 𝒳: i ≥ 10} p(i) = Σ_{i=10}^{∞} 2 (1/3)^i = 2 (1/3)^{10} Σ_{k=0}^{∞} (1/3)^k
              = 2 (1/3)^{10} · 1/(1 − 1/3) = (1/3)^9

    P(X is an even number) = Σ_{i ∈ 𝒳: i even} p(i) = Σ_{j=1}^{∞} 2 (1/3)^{2j}
                           = 2 (1/3)^2 Σ_{k=0}^{∞} ((1/3)^2)^k
                           = 2 (1/3)^2 · 1/(1 − (1/3)^2) = (2/9) · (9/8) = 1/4

The formula P(X ∈ B) = Σ_{x_i ∈ B} p(x_i) and the known properties of
distribution functions imply how the pmf and the distribution function of
a discrete r.v. X determine each other:

If a discrete r.v. X has pmf p(x) and range 𝒳, then its distribution
function is

    F(x) = P(X ≤ x) = Σ_{x_i ∈ 𝒳: x_i ≤ x} p(x_i)   for all x ∈ R

If x_1 < x_2 < x_3 < · · ·, then F(x) is constant over each interval
[x_{i−1}, x_i) and has a jump of magnitude p(x_i) at x = x_i.
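A quick numerical cross-check of c = 2, P(X ≥ 10) = (1/3)^9, and P(X even) = 1/4, truncating the infinite sums at a large index (a sketch added here, not course material):

    N = 200  # truncation point; (1/3)**i is negligible beyond this

    total = sum(2 * (1 / 3) ** i for i in range(1, N))
    p_ge_10 = sum(2 * (1 / 3) ** i for i in range(10, N))
    p_even = sum(2 * (1 / 3) ** i for i in range(2, N, 2))

    print(total)                  # 1.0 (so c = 2 normalizes the pmf)
    print(p_ge_10, (1 / 3) ** 9)  # both about 5.08e-05
    print(p_even)                 # 0.25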
Conversely, if the distribution function F of a r.v. X is piecewise
constant (a staircase function) with jumps F(x_i) − F(x_i−) at points
x_1, x_2, x_3, . . ., then X is a discrete r.v. with values
𝒳 = {x_1, x_2, x_3, . . .} and pmf given by

    p(x_i) = P(X = x_i) = F(x_i) − F(x_i−)   for all x_i ∈ 𝒳

Example: Let X be a discrete r.v. with pmf

    p(0) = 1/2,   p(1) = 1/8,   p(2) = 1/4,   p(3) = 1/8.

Determine the distribution function of X.

Solution: Check: 1/2 + 1/8 + 1/4 + 1/8 = 1, so p is a valid pmf. Using

    F(x) = Σ_{x_i ∈ 𝒳: x_i ≤ x} p(x_i)

we obtain

    F(x) = 0                          if x < 0
         = 1/2                        if 0 ≤ x < 1
         = 1/2 + 1/8 = 5/8            if 1 ≤ x < 2
         = 1/2 + 1/8 + 1/4 = 7/8      if 2 ≤ x < 3
         = 1                          if x ≥ 3
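The converse direction can be illustrated by reading the pmf back off as the jumps p(x_i) = F(x_i) − F(x_i−) of this staircase. A sketch with a numerical left limit (again an illustration, not from the slides):

    def F(x: float) -> float:
        if x < 0:  return 0.0
        if x < 1:  return 1 / 2
        if x < 2:  return 5 / 8
        if x < 3:  return 7 / 8
        return 1.0

    eps = 1e-9
    for xi in [0, 1, 2, 3]:
        # Jump of F at x_i recovers p(x_i): 0.5, 0.125, 0.25, 0.125.
        print(xi, F(xi) - F(xi - eps))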
Example: Given a coin that comes up heads with probability p
(0 < p < 1) and tails with probability q = 1 − p, let X be the number of
flips until the first head. Find the distribution function of X.

Solution: 𝒳 = {1, 2, 3, . . .}. For n ∈ 𝒳,

    P(X = n) = P(TT···TH) = q^{n−1} p   (n − 1 tails before the head)

so

    p(n) = p q^{n−1}

Let n ≤ x < n + 1. Then

    F(x) = Σ_{i ∈ 𝒳: i ≤ x} p(i) = Σ_{i=1}^{n} p q^{i−1} = p Σ_{j=0}^{n−1} q^j
         = p · (1 − q^n)/(1 − q) = 1 − q^n

Thus

    F(x) = 0            if x < 1
         = 1 − q^n      if n ≤ x < n + 1,  n = 1, 2, 3, . . .

F can be defined more elegantly by using the "floor function." For any
x ∈ R let

    ⌊x⌋ = the largest integer that is less than or equal to x

Then

    F(x) = 0              if x < 1
         = 1 − q^{⌊x⌋}    if x ≥ 1

Expected Value of a Discrete R.V.

Suppose you play a game such that in each round you can win $1 with
probability p(1) = 1/2, win $3 with probability p(3) = 1/8, and lose $4 with
probability p(−4) = 3/8. If you play n rounds of the game, how much
will your average (per round) winnings be?

If n is large, you can expect to win $1 about (1/2)n times, win $3 about
(1/8)n times, and lose $4 about (3/8)n times. Thus your average
winnings are approximately

    [1 · (1/2)n + 3 · (1/8)n − 4 · (3/8)n]/n = 1 · (1/2) + 3 · (1/8) − 4 · (3/8) = −5/8

If we let x_1 = 1, x_2 = 3, and x_3 = −4, the average can be written as

    x_1 p(x_1) + x_2 p(x_2) + x_3 p(x_3) = Σ_{i=1}^{3} x_i p(x_i)

The previous simple example motivates a general definition of the average
value of a discrete r.v.

Definition The expected value of a discrete random variable with range
𝒳 and pmf p(x) is

    E(X) = Σ_{x ∈ 𝒳} x p(x)

Remarks:

The expected value of X is often called the expectation or the mean
of X.

The expected value can also be expressed as

    E(X) = Σ_{x ∈ 𝒳} x P(X = x)

If 𝒳 is not a finite set, E(X) is defined by an infinite sum. In this
case we say that E(X) exists iff Σ_{x ∈ 𝒳} |x| p(x) < ∞.
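The "average per round" heuristic can be simulated directly. A sketch (the round count and seed are arbitrary choices of ours) showing the running average approaching −5/8 = −0.625:

    import random

    random.seed(0)
    n = 200_000
    # Win $1 w.p. 1/2, win $3 w.p. 1/8, lose $4 w.p. 3/8.
    outcomes, weights = [1, 3, -4], [1 / 2, 1 / 8, 3 / 8]
    total = sum(random.choices(outcomes, weights)[0] for _ in range(n))
    print(total / n)   # close to the expected value -0.625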
Example: What is the expected value of the number of heads in 3 flips
of a fair coin?

Solution: 𝒳 = {0, 1, 2, 3} and

    E(X) = 0 · P(X = 0) + 1 · P(X = 1) + 2 · P(X = 2) + 3 · P(X = 3)
         = 0 · 1/8 + 1 · 3/8 + 2 · 3/8 + 3 · 1/8
         = 1.5

Constant random variable: X is called a constant random variable if
it has only one possible value. Thus P(X = c) = 1 for some constant
c ∈ R. Not surprisingly,

    E(X) = c P(X = c) = c

Indicator random variable: Let A be an event. The indicator r.v. X
for A is defined as

    X = 1 if A occurs
      = 0 if A does not occur

We can find E(X) as follows:

    E(X) = 0 · P(X = 0) + 1 · P(X = 1) = 0 · P(A^c) + 1 · P(A) = P(A)

Thus E(X) = P(A) for the indicator of the event A.
Nonnegative random variables: X is a nonnegative r.v. if all its
possible values are nonnegative. The expectation of a nonnegative
r.v. is nonnegative.

Proof: If x ≥ 0 for all x ∈ 𝒳, then

    E(X) = Σ_{x ∈ 𝒳} x p(x) ≥ 0

since the sum only contains nonnegative terms. ∎

Example: An urn contains n balls such that the ith ball has the number a_i
written on it. Let X be the number written on a randomly selected ball.
Find E(X).

Solution: (a) Assume the numbers a_1, a_2, . . . , a_n are all distinct. Then
X has n possible values 𝒳 = {a_1, a_2, . . . , a_n} and pmf

    p(a_1) = p(a_2) = · · · = p(a_n) = 1/n

Thus

    E(X) = Σ_{i=1}^{n} a_i p(a_i) = (a_1 + a_2 + · · · + a_n)/n

(b) If the a_i are not all distinct, the solution is a bit more complicated.
Assume the possible distinct values are b_1, b_2, . . . , b_m (where m < n).
Then 𝒳 = {b_1, b_2, . . . , b_m} and we need to determine the pmf of X. Let

    n(b_j) = # of balls with the number b_j

Then, since we pick each ball with probability 1/n,

    P(X = b_j) = n(b_j)/n

and so

    E(X) = Σ_{j=1}^{m} b_j P(X = b_j) = (b_1 n(b_1) + b_2 n(b_2) + · · · + b_m n(b_m))/n

But b_j n(b_j) is just the sum of the a_i's on the n(b_j) balls carrying the
number b_j. Thus we again obtain

    E(X) = (a_1 + a_2 + · · · + a_n)/n

Example: Your friend challenges you to a game. You have to pay $1 to
play. He rolls a pair of fair dice, and if the sum is 7, you win $4. If he
rolls a pair of 6's, you get $10. For any other outcome you get nothing
and he gets to keep your $1. Would you like to play?

Solution: Let X be your net winnings (winnings minus the $1 you pay to
play). Then X takes the values 3, 9, and −1 with probabilities

    P(X = 3) = P(sum is 7) = 6/36
    P(X = 9) = P(double 6) = 1/36
    P(X = −1) = 1 − 6/36 − 1/36 = 29/36

Thus

    E(X) = 3 · (6/36) + 9 · (1/36) − 1 · (29/36) = −2/36 = −1/18

Thus you lose on average $1/18 ≈ 5.5 cents per game. The game
would only be fair if we had E(X) = 0.
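A Monte Carlo sanity check of E(X) = −1/18 ≈ −0.0556 (a sketch of ours; it rolls two dice per round and applies the payoffs above):

    import random

    random.seed(1)
    n = 500_000
    total = 0
    for _ in range(n):
        d1, d2 = random.randint(1, 6), random.randint(1, 6)
        if d1 + d2 == 7:
            total += 3        # win $4 minus the $1 stake
        elif d1 == d2 == 6:
            total += 9        # win $10 minus the $1 stake
        else:
            total -= 1        # lose the $1 stake
    print(total / n, -1 / 18) # both approximately -0.0556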
Remark: Expectation as Center of Gravity. Let X be a discrete r.v. with
range {x_1, x_2, . . . , x_n} and pmf p(x). If we view the real line as a
weightless rod on which a point mass p(x_i) is located at the points x_i for
i = 1, 2, . . . , n, then

    E(X) = Σ_{i=1}^{n} x_i p(x_i)

is the center of gravity (center of mass) of the rod. That is, the sum of
torques turning the rod around the point E(X) is zero, as can be seen
from

    Σ_{i=1}^{n} (x_i − E(X)) p(x_i) = Σ_{i=1}^{n} x_i p(x_i) − E(X) Σ_{i=1}^{n} p(x_i)
                                    = E(X) − E(X) · 1 = 0

Often we know the pmf of X, but need to calculate the expectation of a
function g(X) of X.

Theorem 1

If X is a discrete r.v. with range 𝒳 and pmf p(x), and g : R → R is a
function, then

    E[g(X)] = Σ_{x ∈ 𝒳} g(x) p(x)

Proof: If 𝒳 is the range of X, then the set

    𝒴 = g(𝒳) = {g(x) : x ∈ 𝒳}

is the range of Y = g(X).

The following is a key observation: for any y ∈ 𝒴,

    P(Y = y) = P(g(X) = y) = Σ_{x ∈ 𝒳: g(x) = y} P(X = x)
Using this,

    E(Y) = Σ_{y ∈ 𝒴} y P(Y = y) = Σ_{y ∈ 𝒴} y Σ_{x ∈ 𝒳: g(x) = y} P(X = x)
         = Σ_{y ∈ 𝒴} Σ_{x ∈ 𝒳: g(x) = y} g(x) P(X = x)
         = Σ_{x ∈ 𝒳} g(x) P(X = x)
         = Σ_{x ∈ 𝒳} g(x) p(x)   ∎

Example: A discrete random variable takes the values −1, 0, and 1 with
respective probabilities 0.1, 0.4, and 0.5. Calculate E(X^2).

Solution: In this simple example we can easily determine E(X^2) by (a)
calculating directly the pmf of Y = X^2, or (b) by using the theorem.

(a) Y takes the values 0 and 1. P(Y = 0) = P(X = 0) = 0.4, while

    P(Y = 1) = P(X = −1) + P(X = 1) = 0.1 + 0.5 = 0.6

Thus

    E(Y) = Σ_{y ∈ 𝒴} y P(Y = y) = 0 · 0.4 + 1 · 0.6 = 0.6

(b) Applying the theorem with g(x) = x^2,

    E(X^2) = Σ_{x ∈ 𝒳} x^2 P(X = x) = (−1)^2 · 0.1 + 0^2 · 0.4 + 1^2 · 0.5 = 0.6
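Both routes are easy to mirror in code. The sketch below (illustrative only) computes (a) by building the pmf of Y = X^2 and (b) by applying Theorem 1 directly:

    from collections import defaultdict

    pmf_x = {-1: 0.1, 0: 0.4, 1: 0.5}

    # (a) Build the pmf of Y = g(X) by summing P(X = x) over x with g(x) = y.
    pmf_y = defaultdict(float)
    for x, p in pmf_x.items():
        pmf_y[x ** 2] += p
    e_a = sum(y * p for y, p in pmf_y.items())

    # (b) Theorem 1: E[g(X)] = sum of g(x) p(x); no pmf of Y needed.
    e_b = sum((x ** 2) * p for x, p in pmf_x.items())

    print(e_a, e_b)   # 0.6 0.6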
The following is a very often used corollary of the theorem:

Corollary 2 (Linearity of expectation)

If X is a discrete r.v. and a and b are real constants, then

    E(aX + b) = aE(X) + b

Proof: Applying the theorem with g(x) = ax + b we get

    E(aX + b) = Σ_{x ∈ 𝒳} (ax + b) p(x) = a Σ_{x ∈ 𝒳} x p(x) + b Σ_{x ∈ 𝒳} p(x)
              = aE(X) + b   ∎

(since Σ_{x ∈ 𝒳} x p(x) = E(X) and Σ_{x ∈ 𝒳} p(x) = 1)

Remark: The property described by the corollary is called the linearity of
expectation and is very often used in calculations.

The following corollary can be proved in a similar way:

Corollary 3

If X is a discrete r.v. and g_1, g_2, . . . , g_n are real-valued functions on the
range of X, and a_1, a_2, . . . , a_n are real constants, then

    E[Σ_{i=1}^{n} a_i g_i(X)] = Σ_{i=1}^{n} a_i E[g_i(X)]

Example:

    E[X^2 + 3X + 6] = E[X^2] + 3E[X] + 6
Example: 3 disks of radius 1, 2, and 3 inches are in a box. What is the
expected value of the area of a randomly selected disk?

Solution: Let R be the radius of the selected disk. Then
P(R = r) = 1/3 for r = 1, 2, 3. We need E[g(R)] = E[πR^2]:

    E[πR^2] = πE[R^2] = π Σ_{r=1}^{3} r^2 P(R = r) = π · (1^2 + 2^2 + 3^2)/3 = 14π/3

Variance

Expectation measures the "average value" of a r.v., but gives no
information regarding the fluctuation of values about their average.

Example: Suppose you are told that you can choose to play one of two
games:

(a) You pay $5 to play. You win $15 with probability 1/2. You lose with
probability 1/2, in which case the bank keeps your $5.
Your expected winnings are: $10 · (1/2) + (−$5) · (1/2) = $2.5

(b) You pay $1,000 to play. You win $2,005 with probability 1/2. You
lose with probability 1/2, in which case the bank keeps your $1,000.
Your expected winnings are: $1,005 · (1/2) + (−$1,000) · (1/2) = $2.5

Which game would you like to play?
Definition Let X be a discrete random variable with mean E(X) = µ.
Then the variance of X is

    Var(X) = E[(X − µ)^2]

The standard deviation of X, denoted by σ_X, is

    σ_X = √(Var(X)) = √(E[(X − µ)^2])

Remarks:

If X has range 𝒳 and pmf p(x), the variance can be expressed as

    Var(X) = Σ_{x ∈ 𝒳} (x − µ)^2 p(x)

The variance of X is sometimes denoted by σ_X^2 instead of Var(X).

The variance is a measure of the spread of the distribution of a r.v.
about its expected value.

Theorem 4

    Var(X) = E(X^2) − [E(X)]^2

Proof:

    Var(X) = E[(X − µ)^2] = E[X^2 − 2µX + µ^2]
           = E(X^2) − 2µE(X) + µ^2    (by the linearity of expectation)
           = E(X^2) − 2µ^2 + µ^2
           = E(X^2) − [E(X)]^2   ∎

Since Var(X) is the expectation of the nonnegative r.v. (X − µ)^2, we
have Var(X) ≥ 0. Thus the theorem implies:

Corollary 5

    E(X^2) ≥ [E(X)]^2
Example: Find the standard deviation of your winnings in games (a) and
(b) in the previous example.

Solution: Let X denote your net winnings. In both games E(X) = 2.5.

In game (a) we have

    E(X^2) = 10^2 · (1/2) + (−5)^2 · (1/2) = 50 + 12.5 = 62.5

so

    Var(X) = E(X^2) − [E(X)]^2 = 56.25,   σ_X = √(Var(X)) = 7.5

In game (b) we have

    E(X^2) = 1005^2 · (1/2) + (−1000)^2 · (1/2) = 1,005,012.5

so

    Var(X) = 1,005,006.25,   σ_X = √(Var(X)) = 1002.5

Recall that a constant r.v. takes only one value with positive probability,
i.e., P(X = c) = 1 for some c ∈ R and P(X = x) = 0 if x ≠ c.
Sometimes we say "X is constant with probability 1" or "X is constant"
instead of "X is a constant random variable."

Theorem 6

Let X be a discrete r.v. Then Var(X) = 0 if and only if X is a constant
random variable.

Proof: Let µ = E(X) and assume P(X = a) > 0 for some a ≠ µ. Then
(a − µ)^2 > 0, and so

    Var(X) = Σ_{x ∈ 𝒳} (x − µ)^2 P(X = x) ≥ (a − µ)^2 P(X = a) > 0

Thus if Var(X) = 0, then P(X = µ) = 1 and P(X = x) = 0 for any
x ≠ µ. This means that X is constant with probability 1.
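The same numbers fall out of a small helper that evaluates E(X), E(X^2), and Var(X) = E(X^2) − [E(X)]^2 from a finite pmf (a sketch; the function name is ours):

    from math import sqrt

    def mean_var(pmf):
        # E(X) and Var(X) = E(X^2) - [E(X)]^2 for a finite pmf {x: P(X = x)}.
        m = sum(x * p for x, p in pmf.items())
        m2 = sum(x * x * p for x, p in pmf.items())
        return m, m2 - m * m

    game_a = {10: 0.5, -5: 0.5}
    game_b = {1005: 0.5, -1000: 0.5}
    for g in (game_a, game_b):
        m, v = mean_var(g)
        print(m, v, sqrt(v))   # (a): 2.5 56.25 7.5   (b): 2.5 1005006.25 1002.5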
Conversely, if X is a constant r.v. with P(X = c) = 1, then

    E(X) = c   and   E(X^2) = c^2 P(X = c) = c^2

This implies

    Var(X) = E(X^2) − [E(X)]^2 = c^2 − c^2 = 0   ∎

Note: Var(X) = 0 is equivalent to E(X^2) = [E(X)]^2. Thus

    E(X^2) > [E(X)]^2

unless X is a constant r.v.

Recall that for any constants a and b, E(aX + b) = aE(X) + b.
Although the variance is not linear, the following useful formula holds.

Theorem 7

Let X be a discrete r.v. with variance Var(X). Then

    Var(aX + b) = a^2 Var(X)

Proof: Let µ = E(X). From the linearity of expectation,

    E[aX + b] = aµ + b

Thus

    Var(aX + b) = E[(aX + b − (aµ + b))^2] = E[(aX − aµ)^2]
                = E[a^2 (X − µ)^2]
                = a^2 E[(X − µ)^2]
                = a^2 Var(X)   ∎
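Theorem 7 is easy to sanity-check numerically; the sketch below (constants a, b and the pmf are our arbitrary choices, reusing the r.v. from the E(X^2) example) compares Var(aX + b) with a^2 Var(X):

    def mean_var(pmf):
        m = sum(x * p for x, p in pmf.items())
        return m, sum(x * x * p for x, p in pmf.items()) - m * m

    pmf_x = {-1: 0.1, 0: 0.4, 1: 0.5}
    a, b = 3.0, 7.0
    pmf_y = {a * x + b: p for x, p in pmf_x.items()}  # pmf of Y = aX + b

    _, vx = mean_var(pmf_x)
    _, vy = mean_var(pmf_y)
    print(vy, a * a * vx)   # equal (3.96) up to floating-point rounding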
Recall the interpretation of expectation as the center of gravity of a
distribution of mass. Analogously, we can interpret the variance

    Var(X) = Σ_{x ∈ 𝒳} (x − µ)^2 p(x)

as the moment of inertia about the center of gravity.

Steiner's theorem (the parallel axis theorem) in mechanics states that the
moment of inertia of an object about an axis through its center of gravity
is the minimum moment of inertia for any axis in that direction in space.
The analogous statement in probability theory is the following.

Theorem 8

Let X be a discrete r.v. with mean µ. Then

    Var(X) = E[(X − µ)^2] = min_{c ∈ R} E[(X − c)^2]

and c = µ is the unique minimizer of E[(X − c)^2].

Proof: For any c ≠ µ,

    E[(X − c)^2] = E[((X − µ) + (µ − c))^2]
                 = E[(X − µ)^2 + 2(X − µ)(µ − c) + (µ − c)^2]
                 = E[(X − µ)^2] + 2(µ − c) E(X − µ) + (µ − c)^2
                 = E[(X − µ)^2] + (µ − c)^2    (since E(X − µ) = 0)
                 > Var(X)   ∎

Note: The preceding proof shows that for an arbitrary c,

    E[(X − c)^2] = E[(X − µ)^2] + (µ − c)^2
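The minimization in Theorem 8 can be visualized by evaluating E[(X − c)^2] on a grid of c values (a sketch; the small pmf below is just an arbitrary example of ours):

    pmf = {-1: 0.1, 0: 0.4, 1: 0.5}
    mu = sum(x * p for x, p in pmf.items())   # 0.4

    def mse(c):
        # E[(X - c)^2], the "moment of inertia" about the point c.
        return sum((x - c) ** 2 * p for x, p in pmf.items())

    cs = [i / 10 for i in range(-10, 11)]
    best = min(cs, key=mse)
    print(mu, best, mse(best), mse(mu))   # minimizer on the grid is c = mu = 0.4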