STAT 516 Answers
Homework 5
March 3, 2008
Solutions by Mark Daniel Ward
PROBLEMS
We always let “Z” denote a standard normal random variable in the computations below. We
simply use the chart from page 222 of the Ross book when working with a standard normal
random variable; of course, a more accurate computation is possible using the computer.
2. We first compute
1 = ∫_0^∞ Cxe^{−x/2} dx = C[ x · e^{−x/2}/(−1/2) ]_{x=0}^{∞} − C∫_0^∞ e^{−x/2}/(−1/2) dx = 2C∫_0^∞ e^{−x/2} dx = [ −4Ce^{−x/2} ]_{x=0}^{∞} = 4C
Thus C = 1/4. Now we compute the probability that the system functions for at least 5
months:
P(X ≥ 5) = ∫_5^∞ (1/4)xe^{−x/2} dx = (1/4)[ x · e^{−x/2}/(−1/2) ]_{x=5}^{∞} − (1/4)∫_5^∞ e^{−x/2}/(−1/2) dx
= (1/4)(10e^{−5/2}) + (1/2)∫_5^∞ e^{−x/2} dx = (5/2)e^{−5/2} + (1/2)[ e^{−x/2}/(−1/2) ]_{x=5}^{∞} = (5/2)e^{−5/2} + e^{−5/2} = (7/2)e^{−5/2}
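As remarked at the start, a more accurate computation is possible using the computer; a minimal Python check of C and of P(X ≥ 5) (a sketch assuming numpy and scipy are available) is:

import numpy as np
from scipy.integrate import quad

# density derived above: f(x) = (1/4) x e^{-x/2} for x > 0
f = lambda x: 0.25 * x * np.exp(-x / 2)

total, _ = quad(f, 0, np.inf)   # should be 1, confirming C = 1/4
tail, _ = quad(f, 5, np.inf)    # P(X >= 5)
print(total, tail, 3.5 * np.exp(-2.5))   # tail and (7/2)e^{-5/2} ~ .2873 agree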
3a. The function
f(x) = C(2x − x³) if 0 < x < 5/2,  and  f(x) = 0 otherwise,
cannot be a probability density function. To see this, note 2x − x³ = −x(x − √2)(x + √2).
If C = 0, then f(x) = 0 for all x, so ∫_{−∞}^{∞} f(x) dx = 0, but every probability density function
has ∫_{−∞}^{∞} f(x) dx = 1. If C > 0, then f(x) < 0 on the range x ∈ (√2, 5/2), but every probability
density function has f(x) ≥ 0 for all x. If C < 0, then f(x) < 0 on the range x ∈ (0, √2),
but every probability density function has f(x) ≥ 0 for all x. So no value of C will satisfy
all of the properties needed for a probability density function.
3b. The function
f(x) = C(2x − x²) if 0 < x < 5/2,  and  f(x) = 0 otherwise,
cannot be a probability density function. To see this, note 2x − x² = −x(x − 2). If C = 0, then
f(x) = 0 for all x, so ∫_{−∞}^{∞} f(x) dx = 0, but every probability density function has ∫_{−∞}^{∞} f(x) dx = 1.
If C > 0, then f(x) < 0 on the range x ∈ (2, 5/2), but every probability density function has
f(x) ≥ 0 for all x. If C < 0, then f(x) < 0 on the range x ∈ (0, 2), but every probability
density function has f(x) ≥ 0 for all x. So no value of C will satisfy all of the properties
needed for a probability density function.
5. Let X denote the volume of sales in a week, given in thousands of gallons. We have the
density of X, and we want a with P (X > a) = .01. Note that we need 0 < a < 1. We have
.01 = P(X > a) = ∫_a^∞ f(x) dx = ∫_a^1 5(1 − x)^4 dx + ∫_1^∞ 0 dx = 5[ −(1 − x)^5/5 ]_{x=a}^{1} = (1 − a)^5
So .01 = (1 − a)^5, and thus (.01)^{1/5} = 1 − a, so a = 1 − (.01)^{1/5} ≈ .6019.
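A quick numerical check of this value of a (a small Python sketch using scipy) is:

from scipy.integrate import quad

# density from the problem: f(x) = 5(1 - x)^4 on (0, 1)
f = lambda x: 5 * (1 - x) ** 4

a = 1 - 0.01 ** (1 / 5)     # ~ 0.6019
tail, _ = quad(f, a, 1)
print(a, tail)              # tail should be ~ 0.01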
6a. We compute
E[X] = ∫_{−∞}^{∞} xf(x) dx = ∫_{−∞}^{0} (x)(0) dx + ∫_0^∞ (x)(1/4)(xe^{−x/2}) dx = (1/4)∫_0^∞ x²e^{−x/2} dx
Integrating by parts twice (differentiating the power of x each time) gives
(1/4)∫_0^∞ x²e^{−x/2} dx = (1/4)( [ −2x²e^{−x/2} ]_{x=0}^{∞} + [ −8xe^{−x/2} ]_{x=0}^{∞} + [ −16e^{−x/2} ]_{x=0}^{∞} ) = (1/4)(16) = 4
6b. We compute
1 = ∫_{−∞}^{∞} f(x) dx = ∫_{−1}^{1} c(1 − x²) dx = c[ x − x³/3 ]_{x=−1}^{1} = c(4/3)
and thus c = 3/4. Now we compute
E[X] = ∫_{−∞}^{∞} xf(x) dx = ∫_{−∞}^{−1} (x)(0) dx + ∫_{−1}^{1} x(3/4)(1 − x²) dx + ∫_1^∞ (x)(0) dx
= (3/4)[ x²/2 − x⁴/4 ]_{x=−1}^{1} = (3/4)( (1/2 − 1/4) − (1/2 − 1/4) ) = (3/4)(1/4 − 1/4) = 0
6c. We compute
Z
∞
E[X] =
Z
5
(x)(0) dx +
xf (x) dx =
−∞
Z
−∞
∞
x
5
5
dx = 5 ln x|∞
x=5 = +∞
x2
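The finite expectations in 6a and 6b can be checked numerically (a small scipy sketch; 6c is omitted since its integral diverges):

import numpy as np
from scipy.integrate import quad

# 6a: f(x) = (1/4) x e^{-x/2} on (0, infinity); E[X] should be 4
mean_a, _ = quad(lambda x: x * 0.25 * x * np.exp(-x / 2), 0, np.inf)

# 6b: f(x) = (3/4)(1 - x^2) on (-1, 1); E[X] should be 0
mean_b, _ = quad(lambda x: x * 0.75 * (1 - x ** 2), -1, 1)

print(mean_a, mean_b)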
10a. The times in which the passenger would go to destination A are 7:05–7:15, 7:20–7:30,
7:35–7:45, 7:50–8:00, i.e., a total of 40 out of 60 minutes. So 2/3 of the time, the passenger
goes to destination A.
10b. The times in which the passenger would go to destination A are 7:10–7:15, 7:20–7:30,
7:35–7:45, 7:50–8:00, 8:05–8:10, i.e., a total of 40 out of 60 minutes. So 2/3 of the time, the
passenger goes to destination A.
11. To interpret this statement, we say that the location of the point is X inches from the
left-hand edge of the line, with 0 ≤ X ≤ L. Since the point is chosen at random on the line,
with no further clarification, it is fair to assume that the distribution is uniform, i.e., X has
density function f(x) = 1/L for 0 ≤ x ≤ L, and f(x) = 0 otherwise.
The ratio of the shorter to the longer segment is less than 1/4 if X ≤ L/5 or X ≥ 4L/5. So
the ratio of the shorter to the longer segment is less than 1/4 with probability
∫_0^{L/5} (1/L) dx + ∫_{4L/5}^{L} (1/L) dx = 1/5 + 1/5 = 2/5
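A simulation of the randomly broken line (a minimal numpy sketch, taking L = 1 for convenience) agrees with the answer 2/5:

import numpy as np

rng = np.random.default_rng(0)
L = 1.0
x = rng.uniform(0, L, 1_000_000)          # uniformly chosen break point
ratio = np.minimum(x, L - x) / np.maximum(x, L - x)
print((ratio < 0.25).mean())              # should be close to 2/5 = 0.4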
14. Using proposition 2.1, we compute
E[X^n] = ∫_{−∞}^{∞} x^n fX(x) dx = ∫_{−∞}^{0} (x^n)(0) dx + ∫_0^1 (x^n)(1) dx + ∫_1^∞ (x^n)(0) dx = ∫_0^1 x^n dx = [ x^{n+1}/(n + 1) ]_{x=0}^{1} = 1/(n + 1)
To use the definition of expectation, we need to find the density of the random variable Y =
X^n. We first find the cumulative distribution function of Y. We know that P(Y ≤ a) = 0
for a ≤ 0, and P(Y ≤ a) = 1 for a ≥ 1. For 0 < a < 1, we have P(Y ≤ a) = P(X^n ≤ a) =
P(X ≤ a^{1/n}) = ∫_0^{a^{1/n}} 1 dx = a^{1/n}. Therefore, the cumulative distribution function of Y is
FY(a) = 0 for a ≤ 0,  FY(a) = a^{1/n} for 0 < a < 1,  and  FY(a) = 1 for a ≥ 1
So the density of Y is
fY(x) = (1/n)x^{1/n − 1} if 0 < x < 1,  and  fY(x) = 0 otherwise
So the expected value of Y is
E[Y] = ∫_{−∞}^{∞} xfY(x) dx = ∫_{−∞}^{0} (x)(0) dx + ∫_0^1 (x)(1/n)x^{1/n − 1} dx + ∫_1^∞ (x)(0) dx = (1/n)∫_0^1 x^{1/n} dx
= (1/n)[ x^{1/n + 1}/(1/n + 1) ]_{x=0}^{1} = (1/n) · 1/(1/n + 1) = 1/(n + 1)
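Either route gives 1/(n + 1); a quick Monte Carlo check (a numpy sketch, shown here with n = 3) supports this:

import numpy as np

rng = np.random.default_rng(0)
n = 3
x = rng.uniform(0.0, 1.0, 1_000_000)      # X uniform on (0, 1)
print(np.mean(x ** n), 1 / (n + 1))       # both ~ 0.25 for n = 3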
15a. We compute
P(X > 5) = P( (X − 10)/6 > (5 − 10)/6 ) = P(Z > −5/6) = P(Z < 5/6) ≈ Φ(.83) ≈ .7967
15b. We compute
P(4 < X < 16) = P( (4 − 10)/6 < (X − 10)/6 < (16 − 10)/6 )
= P(−1 < Z < 1) = P(Z < 1) − P(Z < −1) = Φ(1) − (1 − Φ(1)) ≈ .6826
15c. We compute
P(X < 8) = P( (X − 10)/6 < (8 − 10)/6 ) = P(Z < −1/3) = 1 − P(Z < 1/3) ≈ 1 − Φ(.33) ≈ .3707
15d. We compute
P(X < 20) = P( (X − 10)/6 < (20 − 10)/6 ) = P(Z < 5/3) ≈ Φ(1.67) ≈ .9525
15e. We compute
P(X > 16) = P( (X − 10)/6 > (16 − 10)/6 ) = P(Z > 1) = 1 − P(Z < 1) = 1 − Φ(1) ≈ .1587
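As mentioned at the start, a more accurate computation is possible using the computer; a scipy.stats sketch for 15a–15e (with X normal, mean 10, standard deviation 6) is:

from scipy.stats import norm

X = norm(loc=10, scale=6)
print(X.sf(5))               # 15a: P(X > 5)   ~ .798
print(X.cdf(16) - X.cdf(4))  # 15b             ~ .683
print(X.cdf(8))              # 15c: P(X < 8)   ~ .369
print(X.cdf(20))             # 15d: P(X < 20)  ~ .952
print(X.sf(16))              # 15e: P(X > 16)  ~ .159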
16. Let X denote the rainfall in a given year. Then X has a normal (µ = 40, σ = 4)
distribution, so P(X ≤ 50) = P( (X − 40)/4 ≤ (50 − 40)/4 ) = P(Z ≤ 2.5) = Φ(2.5) ≈ .9938, where Z
has a standard normal distribution. If the rainfall in each year is independent of all other
years, then it follows that the desired probability is (P (X ≤ 50))10 ≈ (.9938)10 ≈ .9397.
19. We compute
.10 = P(X > c) = P( (X − 12)/2 > (c − 12)/2 ) = P( Z > (c − 12)/2 ) = 1 − Φ( (c − 12)/2 )
Thus Φ( (c − 12)/2 ) = .90. So (c − 12)/2 ≈ 1.28. So c ≈ 14.56.
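Equivalently, c is the 90th percentile of the normal (µ = 12, σ = 2) distribution, which can be read off directly; a scipy.stats sketch is:

from scipy.stats import norm

c = norm(loc=12, scale=2).ppf(0.90)   # 90th percentile, since P(X > c) = .10
print(c)                              # ~ 14.56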
20a. We have n = 100 people, each of which is in favor of a proposed rise in school taxes
with probability p = .65. The number of the 100 people who are in favor of the rise in taxes
is a Binomial (n = 100, p = .65) random variable, which is well-approximated by a normal
random variable X with mean np = 65 and variance np(1 − p) = 22.75. So the probability
that at least 50 are in favor of the proposition is approximately
P(X > 49.5) = P( (X − 65)/√22.75 > (49.5 − 65)/√22.75 ) ≈ P(Z > −3.25) = Φ(3.25) ≈ .9994
20b. The desired probability is approximately
P(59.5 < X < 70.5) = P( (59.5 − 65)/√22.75 < (X − 65)/√22.75 < (70.5 − 65)/√22.75 ) ≈ P(−1.15 < Z < 1.15)
= P(Z < 1.15) − P(Z < −1.15) = Φ(1.15) − (1 − Φ(1.15)) ≈ .7498
20c. The desired probability is approximately
P(X < 74.5) = P( (X − 65)/√22.75 < (74.5 − 65)/√22.75 ) ≈ P(Z < 1.99) = Φ(1.99) ≈ .9767
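The normal approximation in 20a can also be compared against the exact binomial probability; a small scipy.stats sketch is:

import numpy as np
from scipy.stats import binom, norm

n, p = 100, 0.65
mu, sigma = n * p, np.sqrt(n * p * (1 - p))

approx = norm.sf((49.5 - mu) / sigma)   # normal approximation with continuity correction
exact = binom.sf(49, n, p)              # exact P(at least 50 in favor)
print(approx, exact)                    # both ~ .999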
23. Let X denote the number of times “6” shows during 1000 independent rolls of a fair die.
Then X is a Binomial random variable with n = 1000 and p = 1/6. So X is approximately
normal with mean np = 1000/6 and variance np(1 − p) = 1000(1/6)(5/6). Thus
P(150 ≤ X ≤ 200) = P(149.5 ≤ X ≤ 200.5)
= P( (149.5 − 1000/6)/√(1000(1/6)(5/6)) ≤ (X − 1000/6)/√(1000(1/6)(5/6)) ≤ (200.5 − 1000/6)/√(1000(1/6)(5/6)) )
≈ P(−1.46 ≤ Z ≤ 2.87) = P(Z ≤ 2.87) − P(Z ≤ −1.46)
= P(Z ≤ 2.87) − P(Z ≥ 1.46) = P(Z ≤ 2.87) − (1 − P(Z ≤ 1.46))
= Φ(2.87) − (1 − Φ(1.46)) ≈ .9979 − (1 − .9279) = .9258
Given that “6” shows exactly 200 times, the remaining 800 rolls are all independent,
with possible outcomes 1, 2, 3, 4, 5, each appearing with probability 1/5 on each roll. Let
Y denote the number of times “5” shows during the 800 rolls. Then Y is a Binomial
random variable with n = 800 and p = 1/5. So Y is approximately normal with mean
np = 800/5 = 160 and variance np(1 − p) = 800(1/5)(4/5) = 128. Thus
P(Y < 150) = P(Y ≤ 149.5) = P( (Y − 160)/√128 ≤ (149.5 − 160)/√128 )
≈ P(Z ≤ −.93) = P(Z ≥ .93) = 1 − P(Z ≤ .93) = 1 − Φ(.93) ≈ 1 − .8238 = .1762
25. Let X denote the number of acceptable items. Then X is a Binomial random variable
with n = 150 and p = .95. So X is approximately normal with mean np = (150)(.95) = 142.5
and variance np(1 − p) = (150)(.95)(.05) = 7.125. Thus
P(150 − X ≤ 10) = P(140 ≤ X) = P(139.5 ≤ X) = P( (139.5 − 142.5)/√7.125 ≤ (X − 142.5)/√7.125 )
≈ P(−1.12 ≤ Z) = P(Z ≤ 1.12) = Φ(1.12) ≈ .8686
26. With a fair coin, let X denote the number of heads. Then X is a Binomial random
variable with n = 1000 and p = 1/2. So X is approximately normal with mean np = 500
and variance np(1 − p) = 250. Thus the probability we reach a false conclusion is
P(525 ≤ X) = P(524.5 ≤ X) = P( (524.5 − 500)/√250 ≤ (X − 500)/√250 ) ≈ P(1.55 ≤ Z)
= 1 − P(Z ≤ 1.55) = 1 − Φ(1.55) ≈ 1 − .9394 = .0606
With a biased coin, let Y denote the number of heads. Then Y is a Binomial random variable
with n = 1000 and p = .55. So Y is approximately normal with mean np = 550 and variance
np(1 − p) = 247.5. Thus the probability we reach a false conclusion is
P(Y < 525) = P(Y ≤ 524.5) = P( (Y − 550)/√247.5 ≤ (524.5 − 550)/√247.5 ) ≈ P(Z ≤ −1.62)
= P(Z ≥ 1.62) = 1 − P(Z ≤ 1.62) = 1 − Φ(1.62) ≈ 1 − .9474 = .0526
28. We assume that each person is left-handed, independent of all the other people. Let
X denote the number of the 200 people who are lefthanded. Then X is a Binomial random
variable with n = 200 and p = .12. So X is approximately normal with mean np = 200(.12) =
24 and variance np(1 − p) = 200(.12)(.88) = 21.12. Thus the probability that at least 20 of
the 200 are lefthanded is
P(20 ≤ X) = P(19.5 ≤ X) = P( (19.5 − 24)/√21.12 < (X − 24)/√21.12 ) ≈ P(−.98 < Z)
= P(Z < .98) = Φ(.98) ≈ .8365
32a. Let X denote the time (in hours) needed to repair a machine. Then X is exponentially
distributed with λ = 1/2. Then P(X > 2) = ∫_2^∞ (1/2)e^{−(1/2)x} dx = e^{−1} ≈ .368.
32b. The conditional probability is
P(X > 10 | X > 9) = P(X > 10 & X > 9)/P(X > 9) = P(X > 10)/P(X > 9) = ( ∫_{10}^{∞} (1/2)e^{−(1/2)x} dx ) / ( ∫_9^∞ (1/2)e^{−(1/2)x} dx ) = e^{−5}/e^{−9/2} = e^{−1/2}
We could also have simply computed the line above by writing
P(X > 10)/P(X > 9) = (1 − P(X ≤ 10))/(1 − P(X ≤ 9)) = (1 − F(10))/(1 − F(9)) = (1 − (1 − e^{−(1/2)(10)}))/(1 − (1 − e^{−(1/2)(9)})) = e^{−5}/e^{−9/2} = e^{−1/2}
An alternative method is to simply compute the probability that the waiting time would
be at least one additional hour, which is P(X > 1) = ∫_1^∞ (1/2)e^{−(1/2)x} dx = e^{−1/2} (or equivalently,
P(X > 1) = 1 − P(X ≤ 1) = 1 − F(1) = 1 − (1 − e^{−(1/2)(1)}) = e^{−1/2}). Either way, the answer
is e^{−1/2} ≈ .6065.
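The memorylessness used here is easy to check numerically; a scipy.stats sketch with λ = 1/2 (i.e., scale 2) is:

from scipy.stats import expon

X = expon(scale=2)                 # exponential with lambda = 1/2
print(X.sf(10) / X.sf(9))          # P(X > 10 | X > 9)
print(X.sf(1))                     # P(X > 1); equal by memorylessness, ~ .6065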
33. Let X denote the time in years that the radio continues to function for Jones. The
desired probability is
P(X > 8) = ∫_8^∞ (1/8)e^{−(1/8)x} dx = [ −e^{−(1/8)x} ]_{x=8}^{∞} = e^{−1} ≈ .368
We could also have written P(X > 8) = 1 − P(X ≤ 8) = 1 − (1 − e^{−(1/8)(8)}) = e^{−1} ≈ .368.
37a. We compute P(|X| > 1/2) = P(X > 1/2 or X < −1/2) = P(X > 1/2) + P(X < −1/2) = (1/2)/2 + (1/2)/2 = 1/2.
37b. We first compute the cumulative distribution function of Y = |X|. For a ≤ 0, we have
P (Y ≤ a) = 0, since |X| is never less than a in this case. For a ≥ 1, we have P (Y ≤ a) = 1,
since |X| is always less than a in this case. For 0 < a < 1, we compute
P(Y ≤ a) = P(|X| ≤ a) = P(−a ≤ X ≤ a) = 2a/2 = a
or equivalently, P(Y ≤ x) = x for 0 < x < 1. In summary, the cumulative distribution
function of Y = |X| is
FY(x) = 0 for x ≤ 0,  FY(x) = x for 0 < x < 1,  and  FY(x) = 1 for x ≥ 1
Differentiating with respect to x, the density of Y = |X| is
fY(x) = 1 if 0 < x < 1,  and  fY(x) = 0 otherwise
Intuitively, if X is uniformly distributed on (−1, 1), then Y = |X| is uniformly distributed
on [0, 1).
39. Since X is exponentially distributed with λ = 1, then X has density
fX(x) = e^{−x} for x ≥ 0,  and  fX(x) = 0 otherwise,
and cumulative distribution function
FX(x) = 1 − e^{−x} for x ≥ 0,  and  FX(x) = 0 otherwise.
Next, we compute the cumulative distribution function of Y = ln X. We have
P(Y ≤ a) = P(ln X ≤ a) = P(X ≤ e^a) = FX(e^a) = 1 − e^{−e^a}
or equivalently, P(Y ≤ x) = 1 − e^{−e^x}. Differentiating with respect to x, the density of Y = ln X is
fY(x) = e^x e^{−e^x}
40. Since X is uniformly distributed on (0, 1), then X has density
fX(x) = 1 if 0 < x < 1,  and  fX(x) = 0 otherwise,
and cumulative distribution function
FX(x) = 0 for x ≤ 0,  FX(x) = x for 0 < x < 1,  and  FX(x) = 1 for x ≥ 1.
Next, we compute the cumulative distribution function of Y = e^X. For a ≤ 1, we have
P(Y ≤ a) = 0, since Y = e^X is never less than a in this case (i.e., X is never less than 0
in this case). For a ≥ e, we have P(Y ≤ a) = 1, since Y = e^X is always less than a in this
case (i.e., X is always less than 1 in this case). For 1 < a < e (and thus 0 < ln a < 1), we
compute
P(Y ≤ a) = P(e^X ≤ a) = P(X ≤ ln a) = FX(ln a) = ln a
or equivalently, P(Y ≤ x) = ln x for 1 < x < e. In summary, the cumulative distribution
function of Y = e^X is
FY(x) = 0 for x ≤ 1,  FY(x) = ln x for 1 < x < e,  and  FY(x) = 1 for x ≥ e
Differentiating with respect to x, the density of Y = e^X is
fY(x) = 1/x if 1 < x < e,  and  fY(x) = 0 otherwise
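A simulation check of this distribution function, comparing the empirical cumulative distribution function of e^X with ln x (a small numpy sketch), is:

import numpy as np

rng = np.random.default_rng(0)
y = np.exp(rng.uniform(0.0, 1.0, 1_000_000))   # Y = e^X with X uniform on (0, 1)
for a in (1.5, 2.0, 2.5):
    print(a, (y <= a).mean(), np.log(a))       # empirical CDF vs ln(a)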
THEORETICAL EXERCISES
2. We observe that
E[Y] = ∫_{−∞}^{∞} xfY(x) dx = ∫_{−∞}^{0} xfY(x) dx + ∫_0^∞ xfY(x) dx
The second integral can be simplified by observing that x = ∫_0^x 1 dy, so
∫_0^∞ xfY(x) dx = ∫_0^∞ ∫_0^x fY(x) dy dx = ∫_0^∞ ∫_y^∞ fY(x) dx dy = ∫_0^∞ P(Y > y) dy
where the second equality in the line above follows by interchanging the order of integration
over all x and y with 0 ≤ y ≤ x.
Similarly, x = −∫_0^{−x} 1 dy = ∫_0^{−x} (−1) dy for x ≤ 0, so
∫_{−∞}^{0} xfY(x) dx = −∫_{−∞}^{0} ∫_0^{−x} fY(x) dy dx = −∫_0^∞ ∫_{−∞}^{−y} fY(x) dx dy = −∫_0^∞ P(Y < −y) dy
where the second equality in the line above follows by interchanging the order of integration
over all x and y with 0 ≤ y ≤ −x or equivalently x ≤ −y ≤ 0.
Altogether, this yields
∞
Z
Z
∞
P (Y > y) dy −
E[Y ] =
0
P (Y < −y) dy
0
as desired.
6. Let X be uniform on the interval (0, 1). For each a in the range 0 < a < 1, define the
event Ea = {X ≠ a}. So P(Ea) = 1 for each a. Also, ∩_{0<a<1} Ea = ∅, so P( ∩_{0<a<1} Ea ) = P(∅) = 0.
8. We see that
E[X²] = ∫_0^c x²f(x) dx ≤ ∫_0^c cxf(x) dx = cE[X],
since x² ≤ cx for 0 ≤ x ≤ c.
Thus
Var(X) = E[X²] − (E[X])² ≤ cE[X] − (E[X])² = E[X](c − E[X]) = c² (E[X]/c)(1 − E[X]/c)
or equivalently, writing α = E[X]/c, we have
Var(X) ≤ c²α(1 − α)
We notice that 0 ≤ E[X] = ∫_0^c xf(x) dx ≤ ∫_0^c cf(x) dx = c, so 0 ≤ E[X] ≤ c, so 0 ≤ E[X]/c ≤ 1,
or equivalently, 0 ≤ α ≤ 1.
The largest value that α(1 − α) can achieve for 0 ≤ α ≤ 1 is 1/4, which happens when
α = 1/2 (to see this, just differentiate α(1 − α) with respect to α, set the result equal to
0, and solve for α, which yields α = 1/2; don’t forget to also check the endpoints, namely
α = 0 and α = 1).
Since Var(X) ≤ c²α(1 − α) and α(1 − α) ≤ 1/4, it follows that Var(X) ≤ c²/4, as desired.
12a. If X is uniform on the interval (a, b), then the median is (a + b)/2. To see this, write
m for the median, and solve 1/2 = F(m) = ∫_a^m 1/(b − a) dx = (m − a)/(b − a), which immediately yields
m = (a + b)/2.
12b. If X is normal with mean µ and variance σ², then the median is µ. To see this, write
m for the median, and solve 1/2 = F(m) = P(X ≤ m) = P( (X − µ)/σ ≤ (m − µ)/σ ) = P( Z ≤ (m − µ)/σ ) =
Φ( (m − µ)/σ ), where Z is standard normal, and thus (m − µ)/σ = 0, so m = µ.
12c. If X is exponential with parameter λ, then the median is (ln 2)/λ. To see this, write m
for the median, and solve 1/2 = F(m) = 1 − e^{−λm}, so e^{−λm} = 1/2, so −λm = ln(1/2), so
λm = ln 2, and thus m = (ln 2)/λ.
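These three medians can be confirmed with the computer; a scipy.stats sketch with arbitrarily chosen parameters is:

import math
from scipy.stats import expon, norm, uniform

a, b = 2.0, 6.0
mu, sigma, lam = 1.0, 3.0, 0.5
print(uniform(loc=a, scale=b - a).median(), (a + b) / 2)   # both 4.0
print(norm(loc=mu, scale=sigma).median(), mu)              # both 1.0
print(expon(scale=1 / lam).median(), math.log(2) / lam)    # both ~ 1.386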
13a. If X is uniform on the interval (a, b), then any value of m between a and b is equally
valid to be used as the mode, since the density is constant on the interval (a, b).
13b. If X is normal with mean µ and variance σ², then the mode is µ. To see this, write
m for the mode, and solve 0 = d/dx [ (1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)} ], i.e.,
0 = −(1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)} · (1/(2σ²))(2)(x − µ),
so x = µ is the location of the mode.
13c. If X is exponential with parameter λ, then the mode is 0. To see this, note that
d/dx [ λe^{−λx} ] = −λ²e^{−λx} is the slope of the density for all x > 0, so the density is always
decreasing for x > 0. So the mode must occur on the boundary of the positive portion of
the density, i.e., at x = 0.
14. Consider an exponential random variable X with parameter λ. To show that Y = cX is
exponential with parameter λ/c, it suffices to show that Y has cumulative distribution function
FY(a) = 1 − e^{−(λ/c)a} for a > 0,  and  FY(a) = 0 otherwise.
To see this, first we note that X is always nonnegative, so Y = cX is always nonnegative,
so FY (a) = 0 for a ≤ 0.
For a > 0, we check
FY(a) = P(Y ≤ a) = P(cX ≤ a) = P(X ≤ a/c) = 1 − e^{−λ(a/c)} = 1 − e^{−(λ/c)a}
as desired.
Thus, Y = cX has the cumulative distribution function of an exponential random variable
with parameter λ/c, so Y = cX must indeed be exponential with parameter λ/c.
18. We prove that, for an exponential random variable X with parameter λ,
E[X^k] = k!/λ^k    for k ≥ 1
To see this, we use proof by induction on k.
For k = 1, we use integration by parts with u = x and dv = λe^{−λx} dx, and thus du = dx
and v = λe^{−λx}/(−λ), to see that
E[X] = ∫_0^∞ xλe^{−λx} dx = [ x · λe^{−λx}/(−λ) ]_{x=0}^{∞} − ∫_0^∞ λe^{−λx}/(−λ) dx
or, more simply,
E[X] = [ −x/e^{λx} ]_{x=0}^{∞} + ∫_0^∞ e^{−λx} dx
We see that the integral evaluates to [ e^{−λx}/(−λ) ]_{x=0}^{∞} = 1/λ. We also see that, in the first part of the
expression, plugging in x = 0 yields 0; using L’Hospital’s rule as x → ∞ yields
lim_{x→∞} x/e^{λx} = lim_{x→∞} 1/(λe^{λx}) = 0
So we conclude that
E[X] = 1/λ
This completes the base case, i.e., the case k = 1.
Now we do the inductive step of the proof. For k ≥ 2, we assume that E[X^{k−1}] = (k − 1)!/λ^{k−1}
has already been proved, and we prove that E[X^k] = k!/λ^k.
To do this, we use integration by parts with u = x^k and dv = λe^{−λx} dx, and thus du =
kx^{k−1} dx and v = λe^{−λx}/(−λ), to see that
E[X^k] = ∫_0^∞ x^k λe^{−λx} dx = [ x^k · λe^{−λx}/(−λ) ]_{x=0}^{∞} − ∫_0^∞ (kx^{k−1}) · λe^{−λx}/(−λ) dx
or, more simply,
E[X^k] = [ −x^k/e^{λx} ]_{x=0}^{∞} + (k/λ) ∫_0^∞ x^{k−1} λe^{−λx} dx
We see that the integral term is (k/λ)E[X^{k−1}], which is (by the inductive assumption) equal to
(k/λ) · (k − 1)!/λ^{k−1} = k!/λ^k. We also see that, in the first part of the expression, plugging in x = 0 yields
0; using L’Hospital’s rule k times as x → ∞ yields
lim_{x→∞} x^k/e^{λx} = lim_{x→∞} kx^{k−1}/(λe^{λx}) = lim_{x→∞} k(k − 1)x^{k−2}/(λ²e^{λx}) = ··· = lim_{x→∞} k!/(λ^k e^{λx}) = 0
So we conclude that
E[X^k] = k!/λ^k
This completes the inductive case, which completes the proof by induction.
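The formula E[X^k] = k!/λ^k can also be checked numerically; a scipy.stats sketch with λ = 1.5 and k = 3 is:

from math import factorial
from scipy.stats import expon

lam, k = 1.5, 3
print(expon(scale=1 / lam).moment(k), factorial(k) / lam ** k)   # both ~ 1.78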