4.4 Moments
Let $X$ be a random variable and $k$ denote a positive integer. The $k$th moment of $X$ is defined to be $E(X^k)$. The $k$th moment need not exist, though the $k$th and all lower moments exist if $E(|X|^k)$ exists.
Theorem 4.4.1. If $E(|X|^k) < \infty$ for some positive $k$, then $E(|X|^j) < \infty$ for all positive integers $j < k$.
A sketch of the proof for the case of $X$ with a continuous distribution follows. Suppose that $E(|X|^k) < \infty$ for some positive $k$ and $j$ is a positive integer less than $k$. Then,
\[
\begin{aligned}
E(|X|^j) &= \int_{-\infty}^{\infty} |x|^j f(x)\,dx \\
&= \int_{-1}^{1} |x|^j f(x)\,dx + \int_{\{x \,:\, |x| > 1\}} |x|^j f(x)\,dx \\
&\le \int_{-1}^{1} f(x)\,dx + \int_{\{x \,:\, |x| > 1\}} |x|^k f(x)\,dx \\
&\le \Pr(|X| \le 1) + E(|X|^k) \\
&< \infty.
\end{aligned}
\]
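As a quick numerical illustration of the bound used in the proof sketch, the following Python snippet (a minimal sketch, assuming numpy and scipy are available; the standard normal distribution and the orders $j = 2$, $k = 4$ are chosen purely for illustration) computes $E(|X|^j)$ by numerical integration and compares it with $\Pr(|X| \le 1) + E(|X|^k)$.

```python
# Numerical illustration of the bound E(|X|^j) <= Pr(|X| <= 1) + E(|X|^k), j < k,
# for a standard normal X.  The distribution and moment orders are illustrative only.
import numpy as np
from scipy import stats, integrate

dist = stats.norm()          # standard normal, for illustration
j, k = 2, 4                  # moment orders with j < k

def abs_moment(r):
    # E(|X|^r), computed by numerically integrating |x|^r f(x)
    val, _ = integrate.quad(lambda x: abs(x) ** r * dist.pdf(x), -np.inf, np.inf)
    return val

lhs = abs_moment(j)
bound = (dist.cdf(1) - dist.cdf(-1)) + abs_moment(k)   # Pr(|X| <= 1) + E(|X|^k)
print(f"E(|X|^{j}) = {lhs:.4f} <= {bound:.4f}")        # 1.0000 <= 3.6827
```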
Central Moments. Suppose that $X$ is a random variable with $E(X) = \mu$. For every positive integer $k$, the $k$th central moment is defined to be
\[
E[(X - \mu)^k].
\]
For example, the second central moment is the variance, $E[(X - \mu)^2]$.
Suppose that the distribution of $X$ is symmetric about $\mu$ and $E[(X - \mu)^k]$ exists. Then, if $k$ is odd, $E[(X - \mu)^k] = 0$ because
\[
\begin{aligned}
E[(X - \mu)^k] &= \int_{-\infty}^{\mu} (x - \mu)^k f(x)\,dx + \int_{\mu}^{\infty} (x - \mu)^k f(x)\,dx \\
&= -\int_{\mu}^{\infty} (x - \mu)^k f(x)\,dx + \int_{\mu}^{\infty} (x - \mu)^k f(x)\,dx = 0,
\end{aligned}
\]
where the second equality follows by substituting $x \mapsto 2\mu - x$ in the first integral and using the symmetry $f(2\mu - x) = f(x)$ together with the fact that $k$ is odd.
Example 4.4.1. Suppose that $X$ is continuous with p.d.f.
\[
f(x) = c e^{-(x-3)^2/2}, \quad x \in \mathbb{R}.
\]
The p.d.f. is symmetric about 3, and so the median and mean of $X$ are both $\mu = 3$. It can be shown that for every positive $k$,
\[
E(|X|^k) = \int_{-\infty}^{\infty} |x|^k c e^{-(x-3)^2/2}\,dx < \infty.
\]
By symmetry of $f$, $E[(X - \mu)^k] = 0$ when $k$ is an odd positive integer. For the even central moments, a recursive formula can be developed. Let $k = 2n$ and $y = x - \mu$. Then,
\[
E[(X - \mu)^k] = \int_{-\infty}^{\infty} y^{2n} c e^{-y^2/2}\,dy.
\]
Using integration by parts, set
\[
u = y^{2n-1} \;\Rightarrow\; du = (2n-1)y^{2n-2}\,dy, \qquad
dv = c y e^{-y^2/2}\,dy \;\Rightarrow\; v = -c e^{-y^2/2}.
\]
Then,
\[
\begin{aligned}
E[(X - \mu)^k] &= \left. -y^{2n-1} c e^{-y^2/2} \right|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} (2n-1) y^{2n-2} c e^{-y^2/2}\,dy \\
&= 0 + (2n-1)E[(X - \mu)^{2n-2}] \\
&= (2n-1)E[(X - \mu)^{2n-2}].
\end{aligned}
\]
Now, $E[(X - \mu)^0] = E(1) = 1$, $E[(X - \mu)^2] = (2 - 1) \times 1 = 1$, $E[(X - \mu)^4] = (4 - 1) \times 1 = 3$, $E[(X - \mu)^6] = (6 - 1) \times 3 = 15$, and so on.
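The recursion can be checked numerically. A minimal sketch, assuming numpy and scipy are available; for $f$ to integrate to 1, the normalizing constant is $c = 1/\sqrt{2\pi}$, which the snippet uses explicitly.

```python
# Numerical check of the recursion E[(X - mu)^{2n}] = (2n - 1) E[(X - mu)^{2n-2}]
# for the density f(x) = c exp(-(x - 3)^2 / 2) of Example 4.4.1, with mu = 3 and
# c = 1/sqrt(2*pi) so that f integrates to 1.
import numpy as np
from scipy import integrate

mu = 3.0
c = 1.0 / np.sqrt(2.0 * np.pi)
f = lambda x: c * np.exp(-(x - mu) ** 2 / 2)

def central_moment(k):
    # E[(X - mu)^k], computed by numerical integration
    val, _ = integrate.quad(lambda x: (x - mu) ** k * f(x), -np.inf, np.inf)
    return val

for n in (1, 2, 3):
    lhs = central_moment(2 * n)
    rhs = (2 * n - 1) * central_moment(2 * n - 2)
    print(f"2n = {2 * n}: {lhs:.4f} vs {rhs:.4f}")   # 1, 3, 15 on both sides
```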
Skewness is a measure of lack of symmetry based on the third central moment. Since the third central moment is 0 if the distribution is symmetric, the difference between $E[(X - \mu)^3]$ and 0 reflects lack of symmetry. For interpretability, $E[(X - \mu)^3]$ is scaled by the $3/2$ power of the second central moment (that is, by $\sigma^3$), and the measure of skewness is
\[
\frac{E[(X - \mu)^3]}{\{E[(X - \mu)^2]\}^{3/2}} = \frac{E[(X - \mu)^3]}{\sigma^3}.
\]
The figure below and left shows the probability function for the binomial distribution as a
function of p. Values of p are drawn from the set {.1, .15, .2, . . . , .85, .9}. Skewness is plotted
against p in the figure below and to the right.
[Figure: left panel, the binomial probability function $\Pr(X = x)$ plotted against $x$ for the listed values of $p$; right panel, skewness plotted against $p$.]
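The right-hand panel can be reproduced with a short computation of the binomial central moments. The sketch below is illustrative only: $n = 20$ is an assumption about the plotted example, while the grid of $p$ values follows the text; numpy and scipy are assumed available.

```python
# Skewness of Binom(n, p) as a function of p, computed from the central moments.
import numpy as np
from scipy import stats

n = 20                                            # assumed value of n for the figure
for p in np.arange(0.1, 0.95, 0.05):              # p in {.1, .15, ..., .9}
    X = stats.binom(n, p)
    mu, var = X.mean(), X.var()
    xs = np.arange(0, n + 1)
    third = np.sum((xs - mu) ** 3 * X.pmf(xs))    # E[(X - mu)^3]
    skew = third / var ** 1.5                     # E[(X - mu)^3] / sigma^3
    print(f"p = {p:.2f}: skewness = {skew:+.3f}")
```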
Moment Generating Functions
Definition 4.4.2. Let $X$ be a random variable. The moment generating function (m.g.f.) of $X$ is defined to be
\[
\psi(t) = E(e^{tX}), \quad t \in \mathbb{R}.
\]
Any two random variables with the same distribution have the same moment generating function. The m.g.f. need not be finite for all $t$ (though if $X$ is bounded, then $\psi(t)$ is finite for every $t$). Note that $\psi(0) = E(1) = 1$ always, but this value is uninteresting.
Theorem 4.4.2. Let $X$ be a random variable whose m.g.f. $\psi(t)$ is finite for all values of $t$ in an open interval containing $t = 0$. Then, for each positive integer $n$, the $n$th moment of $X$, $E(X^n)$, is finite and equals the $n$th derivative of $\psi(t)$ evaluated at $t = 0$. The $n$th derivative evaluated at 0 is denoted by $\psi^{(n)}(0)$.
The proof of the theorem depends strongly on the following result:
\[
\psi^{(n)}(t) = \frac{d^n E(e^{tX})}{dt^n} = E\!\left(\frac{d^n e^{tX}}{dt^n}\right).
\]
Accepting the truth of this statement leads to
\[
\psi^{(1)}(0) = E\!\left(\left.\frac{d\, e^{tX}}{dt}\right|_{t=0}\right) = E\!\left(X e^{0 \cdot X}\right) = E(X).
\]
Example 4.4.3. Suppose that $X$ has the following p.d.f.
\[
f(x) = e^{-x} I_{\{r \,:\, r > 0\}}(x).
\]
The m.g.f. of $X$ is
\[
\begin{aligned}
\psi(t) = E\!\left(e^{tX}\right) &= \int_0^{\infty} e^{tx} e^{-x}\,dx \\
&= \int_0^{\infty} e^{(t-1)x}\,dx.
\end{aligned}
\]
This integral is finite provided that $t \in (-\infty, 1)$, and for $t \in (-\infty, 1)$,
\[
\psi(t) = (1 - t)^{-1}.
\]
To determine the first and second moments, and the variance of $X$, we compute
\[
\begin{aligned}
\psi^{(1)}(t) &= (1 - t)^{-2} \;\Rightarrow\; \psi^{(1)}(0) = E(X) = 1, \\
\psi^{(2)}(t) &= 2(1 - t)^{-3} \;\Rightarrow\; \psi^{(2)}(0) = E(X^2) = 2, \\
\mathrm{Var}(X) &= \psi^{(2)}(0) - [\psi^{(1)}(0)]^2 = 1.
\end{aligned}
\]
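The differentiation in Example 4.4.3 can be verified symbolically. A minimal sketch, assuming sympy is available:

```python
# Symbolic check of Example 4.4.3: differentiate psi(t) = (1 - t)^(-1) and
# evaluate at t = 0 to recover E(X), E(X^2), and Var(X).
import sympy as sp

t = sp.symbols('t')
psi = 1 / (1 - t)

EX  = sp.diff(psi, t, 1).subs(t, 0)    # psi'(0)  = 1
EX2 = sp.diff(psi, t, 2).subs(t, 0)    # psi''(0) = 2
var = EX2 - EX ** 2                    # Var(X)   = 1
print(EX, EX2, var)                    # 1 2 1
```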
Question 1: Compute $E(X)$ where $X$ has the following p.f.:
\[
f(x) = \frac{\lambda^x e^{-\lambda}}{x!}\, I_{\{0,1,2,\ldots\}}(x).
\]
Properties of moment generating functions. Let $X$ be a random variable with m.g.f. $\psi_1(t)$ and $Y = aX + b$, where $a$ and $b$ are constants. Let $\psi_2(t)$ denote the m.g.f. of $Y$. Then, for every value of $t$ for which $\psi_1(at)$ is finite,
\[
\psi_2(t) = e^{bt}\psi_1(at).
\]
The proof proceeds as follows:
\[
\begin{aligned}
\psi_2(t) &= E\!\left[e^{(aX + b)t}\right] \\
&= E\!\left[e^{bt} e^{aXt}\right] \\
&= e^{bt} E\!\left[e^{atX}\right] \\
&= e^{bt}\psi_1(at).
\end{aligned}
\]
Example 4.4.4. Suppose that $X$ has the following p.d.f.
\[
f(x) = e^{-x} I_{\{r \,:\, r > 0\}}(x),
\]
and so the m.g.f. of $X$ is $\psi_1(t) = (1 - t)^{-1}$. If $Y = 3 - 2X$, then the m.g.f. of $Y$ will be finite for $t > -1/2$ and will have the value
\[
\psi_2(t) = e^{3t}\psi_1(-2t) = \frac{e^{3t}}{1 + 2t}.
\]
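A Monte Carlo check of Example 4.4.4 is sketched below; the sample size, seed, and the evaluation point $t = 0.25$ are arbitrary illustrative choices, and numpy is assumed available. The sample average of $e^{tY}$ should be close to $e^{3t}/(1 + 2t)$.

```python
# Monte Carlo check: for X ~ Exponential(1) and Y = 3 - 2X, the sample mean of
# exp(t*Y) approximates e^{3t} / (1 + 2t) for t > -1/2.
import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(scale=1.0, size=1_000_000)
Y = 3 - 2 * X

t = 0.25
estimate = np.mean(np.exp(t * Y))
exact = np.exp(3 * t) / (1 + 2 * t)
print(f"Monte Carlo: {estimate:.4f}, exact: {exact:.4f}")   # both near 1.41
```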
Sums of independent (but not necessarily identically distributed) random variables are important statistics, and computing the m.g.f. of the sum is a convenient method for determining the distribution, or at least the moments, of the sum. The following theorem is key:
Theorem 4.4.4. Suppose that $X_1, \ldots, X_n$ are independent random variables with m.g.f.s $\psi_1, \ldots, \psi_n$. Let $Y = \sum_{i=1}^{n} X_i$ and let $\psi$ denote the m.g.f. of $Y$. Then, for every value of $t$ for which $\psi_1(t), \ldots, \psi_n(t)$ are finite,
\[
\psi(t) = \prod_{i=1}^{n} \psi_i(t).
\]
The proof proceeds as follows:
\[
\begin{aligned}
\psi(t) &= E\!\left(e^{t \sum_{i=1}^{n} X_i}\right) \\
&= E\!\left(\prod_{i=1}^{n} e^{tX_i}\right) \\
&= \prod_{i=1}^{n} E\!\left(e^{tX_i}\right) \quad \text{(by independence of } X_1, \ldots, X_n\text{)} \\
&= \prod_{i=1}^{n} \psi_i(t).
\end{aligned}
\]
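Theorem 4.4.4 can be illustrated by simulation. The sketch below uses illustrative assumptions only: three independent Exponential(1) variables, whose common m.g.f. $(1 - t)^{-1}$ was derived in Example 4.4.3, with an arbitrary sample size and evaluation point.

```python
# Monte Carlo illustration of Theorem 4.4.4: the m.g.f. of Y = X1 + X2 + X3,
# with the Xi independent Exponential(1), should match the product (1 - t)^(-3).
import numpy as np

rng = np.random.default_rng(1)
n_vars, n_samples, t = 3, 1_000_000, 0.3
X = rng.exponential(scale=1.0, size=(n_samples, n_vars))
Y = X.sum(axis=1)

estimate = np.mean(np.exp(t * Y))
exact = (1 - t) ** (-n_vars)
print(f"Monte Carlo: {estimate:.3f}, exact: {exact:.3f}")   # both approximately 2.92
```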
The moment generating function for the binomial random variable
Suppose that $X \sim \mathrm{Binom}(n, p)$ and that $X_i \sim \mathrm{Binom}(1, p)$, $i = 1, \ldots, n$, are independent. Then, $X = \sum X_i$. Furthermore, the m.g.f. of $X_i$ is
\[
\begin{aligned}
\psi_i(t) &= \sum_{x \in \{0,1\}} e^{tx} p^x (1 - p)^{1 - x} \\
&= 1 - p + p e^{t}.
\end{aligned}
\]
By Theorem 4.4.4, the m.g.f. of $X$ is
\[
\psi(t) = \prod_{i=1}^{n} \psi_i(t) = (1 - p + p e^{t})^{n}.
\]
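The binomial m.g.f. can also be checked directly by summing $e^{tx}\Pr(X = x)$ over the support. A minimal sketch, with arbitrary illustrative values of $n$, $p$, and $t$, assuming numpy and scipy are available:

```python
# Direct check of the Binom(n, p) m.g.f.: sum_{x=0}^{n} e^{tx} Pr(X = x)
# should equal (1 - p + p e^t)^n.
import numpy as np
from scipy import stats

n, p, t = 10, 0.3, 0.7
xs = np.arange(0, n + 1)
lhs = np.sum(np.exp(t * xs) * stats.binom.pmf(xs, n, p))
rhs = (1 - p + p * np.exp(t)) ** n
print(f"{lhs:.6f}  {rhs:.6f}")   # the two values agree
```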
Theorem 4.4.5. If the m.g.f.s of random variables X and Y are finite and identical in an
open interval containing t = 0, then the probability distributions of X and Y are identical.
The proof is beyond the scope of the book.
Theorem 4.4.6. Suppose that $X \sim \mathrm{Binom}(n_1, p)$ and $Y \sim \mathrm{Binom}(n_2, p)$ are independent (note that $p$ is the same for both random variables). Then, $X + Y \sim \mathrm{Binom}(n_1 + n_2, p)$. To prove the claim, let $\psi_1(t)$ and $\psi_2(t)$ denote the m.g.f.s of $X$ and $Y$, and let $\psi(t)$ denote the m.g.f. of $X + Y$. By Theorem 4.4.4,
\[
\begin{aligned}
\psi(t) &= \psi_1(t)\psi_2(t) \\
&= (1 - p + p e^{t})^{n_1} (1 - p + p e^{t})^{n_2} \\
&= (1 - p + p e^{t})^{n_1 + n_2}.
\end{aligned}
\]
The m.g.f. of $X + Y$, $\psi(t)$, is the m.g.f. of a $\mathrm{Binom}(n_1 + n_2, p)$ random variable. By Theorem 4.4.5, the distribution of $X + Y$ is that of a $\mathrm{Binom}(n_1 + n_2, p)$ random variable.
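Theorem 4.4.6 can also be illustrated by simulation: the empirical distribution of $X + Y$ should match the $\mathrm{Binom}(n_1 + n_2, p)$ p.f. The parameters, seed, and sample size below are arbitrary illustrative choices, and numpy and scipy are assumed available.

```python
# Simulation sketch of Theorem 4.4.6: for independent X ~ Binom(n1, p) and
# Y ~ Binom(n2, p), compare the empirical p.f. of X + Y with Binom(n1 + n2, p).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n1, n2, p, reps = 6, 9, 0.4, 200_000
total = rng.binomial(n1, p, size=reps) + rng.binomial(n2, p, size=reps)

xs = np.arange(0, n1 + n2 + 1)
empirical = np.bincount(total, minlength=n1 + n2 + 1) / reps
exact = stats.binom.pmf(xs, n1 + n2, p)
print(np.max(np.abs(empirical - exact)))   # small, e.g. on the order of 1e-3
```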