STAT 516 Answers
Homework 6
April 2, 2008
Solutions by Mark Daniel Ward
PROBLEMS
Chapter 6 Problems
2a. The mass p(0, 0) corresponds to neither of the first two balls being white, so p(0, 0) = (8/13)(7/12) = 14/39. The mass p(0, 1) corresponds to the first ball being red and the second ball being white, so p(0, 1) = (8/13)(5/12) = 10/39. The mass p(1, 0) corresponds to the first ball being white and the second ball being red, so p(1, 0) = (5/13)(8/12) = 10/39. The mass p(1, 1) corresponds to both of the first two balls being white, so p(1, 1) = (5/13)(4/12) = 5/39.
2b. Similarly, we have p(0, 0, 0) = (8/13)(7/12)(6/11) = 28/143, and p(0, 0, 1) = p(0, 1, 0) = p(1, 0, 0) = (8/13)(7/12)(5/11) = 70/429, and p(0, 1, 1) = p(1, 0, 1) = p(1, 1, 0) = (8/13)(5/12)(4/11) = 40/429, and finally p(1, 1, 1) = (5/13)(4/12)(3/11) = 5/143.
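These masses can be double-checked exactly with a short script. The sketch below assumes the urn holds 5 white and 8 red balls (the counts implied by the fractions above) and simply multiplies the conditional probabilities of the successive draws:

```python
from fractions import Fraction as F

def seq_prob(colors, white=5, red=8):
    """Probability of an ordered color sequence drawn without replacement.
    colors: tuple with 1 for a white draw and 0 for a red draw."""
    w, r, p = white, red, F(1)
    for c in colors:
        total = w + r
        if c == 1:
            p *= F(w, total); w -= 1
        else:
            p *= F(r, total); r -= 1
    return p

# Problem 2a: indicators for the first two draws being white.
assert seq_prob((0, 0)) == F(14, 39) and seq_prob((0, 1)) == F(10, 39)
assert seq_prob((1, 0)) == F(10, 39) and seq_prob((1, 1)) == F(5, 39)
# Problem 2b: three draws.
assert seq_prob((0, 0, 0)) == F(28, 143) and seq_prob((0, 0, 1)) == F(70, 429)
assert seq_prob((0, 1, 1)) == F(40, 429) and seq_prob((1, 1, 1)) == F(5, 143)
print("problem 2 masses check out")
```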
3a. The mass p(0, 0) corresponds to the white balls numbered “1” and “2” not appearing within the three choices, so p(0, 0) = \binom{11}{3}/\binom{13}{3} = 15/26. The mass p(0, 1) corresponds to the white ball numbered “1” not being chosen, and the white ball numbered “2” getting chosen, within the three choices, so p(0, 1) = \binom{1}{1}\binom{11}{2}/\binom{13}{3} = 5/26. The mass p(1, 0) corresponds to the white ball numbered “2” not being chosen, and the white ball numbered “1” getting chosen, within the three choices, so p(1, 0) = \binom{1}{1}\binom{11}{2}/\binom{13}{3} = 5/26. The mass p(1, 1) corresponds to both of the white balls numbered “1” and “2” appearing within the three choices, so p(1, 1) = \binom{2}{2}\binom{11}{1}/\binom{13}{3} = 1/26.
3b. Similarly, we have p(0, 0, 0) = \binom{10}{3}/\binom{13}{3} = 60/143, and p(0, 0, 1) = p(0, 1, 0) = p(1, 0, 0) = \binom{1}{1}\binom{10}{2}/\binom{13}{3} = 45/286, and p(0, 1, 1) = p(1, 0, 1) = p(1, 1, 0) = \binom{1}{1}\binom{1}{1}\binom{10}{1}/\binom{13}{3} = 5/143, and finally p(1, 1, 1) = \binom{1}{1}\binom{1}{1}\binom{1}{1}/\binom{13}{3} = 1/286.
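Again, these can be verified exactly; the sketch below assumes 13 balls, of which the first few carry the indicators (white balls “1”, “2” in 3a and “1”, “2”, “3” in 3b), with three balls drawn at once:

```python
from fractions import Fraction as F
from math import comb

def joint_mass(*chosen):
    """Probability that exactly the indicated numbered white balls appear
    among 3 balls drawn without replacement from 13."""
    k = sum(chosen)              # how many numbered balls must appear
    others = 13 - len(chosen)    # balls carrying no indicator
    return F(comb(others, 3 - k), comb(13, 3))

assert joint_mass(0, 0) == F(15, 26) and joint_mass(0, 1) == F(5, 26)
assert joint_mass(1, 1) == F(1, 26)
assert joint_mass(0, 0, 0) == F(60, 143) and joint_mass(0, 0, 1) == F(45, 286)
assert joint_mass(0, 1, 1) == F(5, 143) and joint_mass(1, 1, 1) == F(1, 286)
print("problem 3 masses check out")
```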
5. The only modification from problem 3a above is that the balls are replaced after each draw. Thus p(0, 0) = (11/13)^3 = 1331/2197; also, p(0, 1) = 3(1/13)(11/13)^2 + 3(1/13)^2(11/13) + (1/13)^3 = 397/2197; similarly, p(1, 0) = 397/2197; and finally, p(1, 1) = 6(1/13)^2(11/13) + 3(1/13)^3 + 3(1/13)^3 = 72/2197.
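Since there are only 13^3 = 2197 equally likely ordered outcomes, these masses can also be confirmed by brute force. A sketch, assuming the 13 balls are labeled 0–12 with balls 0 and 1 playing the roles of white “1” and white “2”:

```python
from fractions import Fraction as F
from itertools import product

counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
for draw in product(range(13), repeat=3):   # all 13**3 equally likely outcomes
    y1 = int(0 in draw)                     # did white ball "1" ever appear?
    y2 = int(1 in draw)                     # did white ball "2" ever appear?
    counts[(y1, y2)] += 1

total = 13 ** 3
assert F(counts[(0, 0)], total) == F(1331, 2197)
assert F(counts[(0, 1)], total) == F(397, 2197)
assert F(counts[(1, 0)], total) == F(397, 2197)
assert F(counts[(1, 1)], total) == F(72, 2197)
print("problem 5 masses check out")
```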
6. We first note that 1 ≤ N1, N2 ≤ 4 and N1 + N2 ≤ 5. There are \binom{5}{2} = 10 ways that the two defective items can be chosen, each equally likely, and each way corresponds to exactly one pair n1, n2 satisfying the bounds mentioned. Thus
p(n1, n2) = 1/10
if 1 ≤ n1, n2 ≤ 4 and n1 + n2 ≤ 5. Otherwise p(n1, n2) = 0.
10a. We first compute
P(X < Y) = \int_0^\infty \int_0^y e^{−(x+y)} \, dx \, dy = 1/2
Another method is to compute
P(X < Y) = \int_0^\infty \int_x^\infty e^{−(x+y)} \, dy \, dx = 1/2
Finally, a third method is to just notice that X and Y are independent and have the same
distribution, so half the time we have X < Y and the other half of the time we have Y < X.
Thus P (X < Y ) = 1/2.
10b. For a < 0, we have P (X < a) = 0. For a > 0, we compute
P(X < a) = \int_{−\infty}^{\infty} \int_{−\infty}^{a} f(x, y) \, dx \, dy = \int_0^\infty \int_0^a e^{−(x+y)} \, dx \, dy = 1 − e^{−a}
Another method is to simply note that the joint density of X and Y shows us that, in this
case, X and Y must be independent exponential random variables, each with λ = 1. So
P (X < a) = 1 − e−a for a > 0, and P (X < a) = 0 otherwise, since this is the cumulative
distribution function of an exponential random variable.
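Both parts can also be sanity-checked by simulation; a minimal sketch, assuming (as the joint density e^{−(x+y)} indicates) that X and Y are independent Exponential(1) variables:

```python
import math
import random

random.seed(0)
n = 200_000
xs = [random.expovariate(1.0) for _ in range(n)]
ys = [random.expovariate(1.0) for _ in range(n)]
print(sum(x < y for x, y in zip(xs, ys)) / n)        # ~ 0.5, matching part (a)
a = 1.3                                              # an arbitrary test value
print(sum(x < a for x in xs) / n, 1 - math.exp(-a))  # both ~ 0.727, matching part (b)
```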
11. There are \binom{5}{2,1,2} = 30 ways that the customers can be split into a pair who will buy ordinary sets, a single buyer who will buy plasma, and a pair who will buy nothing. The probability of each such choice is (.45)^2(.15)^1(.40)^2 = .00486. So the desired probability is (30)(.00486) = .1458.
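A one-line recomputation of this multinomial probability, using the same category probabilities .45, .15, .40:

```python
from math import factorial

ways = factorial(5) // (factorial(2) * factorial(1) * factorial(2))   # = 30 splits
print(ways, ways * (0.45 ** 2) * (0.15 ** 1) * (0.40 ** 2))           # 30  ~0.1458
```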
14. We write X for the location of the ambulance, and Y for the location of the accident, both in the interval [0, L]. The distance between them is D = |X − Y|. We know that P(D < a) = 0 for a ≤ 0 and P(D < a) = 1 for a ≥ L, since the minimum and maximum possible distances are 0 and L, respectively. So D must be between 0 and L.
Perhaps the easiest method for computing P (D < a) with 0 ≤ a ≤ L is to draw a picture
of the sample space and then divide the desired area over the entire area of the sample space;
this method works since the joint distribution of X, Y is uniform. So the desired probability
is 1 − (1/2)(L − a)^2/L^2 − (1/2)(L − a)^2/L^2 = a(2L − a)/L^2.
Another possibility is to integrate, and we need to break the desired integral into three
regions:
P(D < a) = \int_0^a \int_0^{x+a} (1/L^2) \, dy \, dx + \int_a^{L−a} \int_{x−a}^{x+a} (1/L^2) \, dy \, dx + \int_{L−a}^{L} \int_{x−a}^{L} (1/L^2) \, dy \, dx = a(2L − a)/L^2
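A quick Monte Carlo check of the formula a(2L − a)/L^2, assuming (as in the solution) that X and Y are independent and uniform on [0, L]; the values L = 10 and a = 3 are arbitrary test choices:

```python
import random

random.seed(1)
L_len, a, n = 10.0, 3.0, 200_000
hits = sum(abs(random.uniform(0, L_len) - random.uniform(0, L_len)) < a
           for _ in range(n))
print(hits / n, a * (2 * L_len - a) / L_len ** 2)   # both ~ 0.51
```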
15a. We note that 1 = \iint_R c \, dx \, dy, so 1/c = \iint_R dx \, dy = area of the region R.
15b. We see that, for −1 < a, b < 1, we have FX,Y(a, b) = P(X ≤ a, Y ≤ b) = \int_{−1}^{a} \int_{−1}^{b} (1/4) \, dy \, dx = ((a − (−1))/2)((b − (−1))/2). Thus FX,Y(x, y) = ((x − (−1))/2)((y − (−1))/2). So we have successfully factored the cumulative distribution function into two parts: one part is a function of x and the other part is a function of y, so X and Y are independent, and their cumulative distribution functions have the required forms for uniform random variables on the interval (−1, 1), i.e., FX(a) = FY(a) = (a − (−1))/2 for −1 < a < 1, with FX(a) = FY(a) = 0 for a ≤ −1 and FX(a) = FY(a) = 1 for a ≥ 1.
15c. We integrate \iint_{x^2+y^2≤1} (1/4) \, dx \, dy = (area of the circle x^2 + y^2 ≤ 1)/(area of the entire square) = π(1)^2/4 = π/4.
16a. We see that A happens if and only if at least one of the Ai happen. So A = \bigcup_{i=1}^{n} Ai.
16b. Yes, the Ai are mutually exclusive. We cannot have more than one Ai occur at the
same time.
16c. Since the Ai are mutually exclusive, then P(A) = P(\bigcup_{i=1}^{n} Ai) = \sum_{i=1}^{n} P(Ai). We see that P(Ai) = 1/2^{n−1}, since we require each of the other n − 1 points (besides the ith point itself, of course) to be in the semicircle clockwise of the ith point. So P(A) = \sum_{i=1}^{n} 1/2^{n−1} = n/2^{n−1}.
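The answer n/2^{n−1} can be checked by simulation. A sketch, assuming the n points are independent and uniform on the circle (represented by angles uniform on [0, 2π)); n = 4 is an arbitrary test value:

```python
import math
import random

def all_in_some_semicircle(angles):
    # A_i: every other point lies within half a revolution, going one way, of point i.
    return any(all((b - a) % (2 * math.pi) <= math.pi for b in angles)
               for a in angles)

random.seed(2)
n_pts, trials = 4, 100_000
hits = sum(all_in_some_semicircle([random.uniform(0, 2 * math.pi) for _ in range(n_pts)])
           for _ in range(trials))
print(hits / trials, n_pts / 2 ** (n_pts - 1))   # both ~ 0.5
```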
19a. The marginal density of X is fX (x) = 0 for x ≤ 0 and also for x ≥ 1. For 0 < x < 1,
the marginal density of X is
fX(x) = \int_0^x (1/x) \, dy = 1
19b. The marginal density of Y is fY (y) = 0 for y ≤ 0 and also for y ≥ 1. For 0 < y < 1,
the marginal density of Y is
fY(y) = \int_y^1 (1/x) \, dx = ln(1/y)
19c. The expected value of X is
E[X] = \int_{−\infty}^{\infty} x fX(x) \, dx = \int_0^1 (x)(1) \, dx = 1/2
19d. The expected value of Y is
E[Y] = \int_{−\infty}^{\infty} y fY(y) \, dy = \int_0^1 y ln(1/y) \, dy = 1/4
To see this, use integration by parts, with u = ln(1/y) and dv = y dy.
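The two expected values can be confirmed numerically from the marginals above (a midpoint-rule sketch, assuming the joint density f(x, y) = 1/x on 0 < y < x < 1, which is the density these marginals came from):

```python
import math

N = 4000
mid = [(i + 0.5) / N for i in range(N)]            # midpoints of a grid on (0, 1)
ex = sum(x * 1.0 / N for x in mid)                 # E[X] with f_X(x) = 1
ey = sum(y * math.log(1.0 / y) / N for y in mid)   # E[Y] with f_Y(y) = ln(1/y)
print(ex, ey)                                      # ~ 0.5 and ~ 0.25
```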
20a. Yes, X and Y are independent, because we can factor the joint density as follows:
f (x, y) = fX (x)fY (y), where
fX(x) = xe^{−x} for x > 0, with fX(x) = 0 otherwise, and fY(y) = e^{−y} for y > 0, with fY(y) = 0 otherwise.
20b. No; in this case, X and Y are not independent. To see this, we note that the density is nonzero when 0 < x < y < 1. So the domain does not allow us to factor the joint density into two separate regions. For instance, P(1/4 < X < 1) > 0, since X can be in the range between 1/4 and 1. On the other hand, P(1/4 < X < 1 | Y = 1/8) = 0, since X cannot be in the range between 1/4 and 1 when Y = 1/8; instead, X must always be smaller than Y.
23a. Yes, X and Y are independent, because we can factor the joint density as follows:
f (x, y) = fX (x)fY (y), where
fX(x) = 6x(1 − x) for 0 < x < 1, with fX(x) = 0 otherwise, and fY(y) = 2y for 0 < y < 1, with fY(y) = 0 otherwise.
23b. We compute E[X] = \int_{−\infty}^{\infty} x fX(x) \, dx = \int_0^1 x · 6x(1 − x) \, dx = 1/2.
23c. We compute E[Y] = \int_{−\infty}^{\infty} y fY(y) \, dy = \int_0^1 y · 2y \, dy = 2/3.
23d. We compute E[X^2] = \int_{−\infty}^{\infty} x^2 fX(x) \, dx = \int_0^1 x^2 · 6x(1 − x) \, dx = 3/10. Thus Var(X) = 3/10 − (1/2)^2 = 1/20.
23e. We compute E[Y^2] = \int_{−\infty}^{\infty} y^2 fY(y) \, dy = \int_0^1 y^2 · 2y \, dy = 1/2. Thus Var(Y) = 1/2 − (2/3)^2 = 1/18.
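A quick numerical check of these moments, using the factored marginals fX(x) = 6x(1 − x) and fY(y) = 2y from part 23a:

```python
N = 4000
mid = [(i + 0.5) / N for i in range(N)]                 # midpoint rule on (0, 1)
ex  = sum(x * 6 * x * (1 - x) / N for x in mid)         # ~ 1/2
ex2 = sum(x * x * 6 * x * (1 - x) / N for x in mid)     # ~ 3/10
ey  = sum(y * 2 * y / N for y in mid)                   # ~ 2/3
ey2 = sum(y * y * 2 * y / N for y in mid)               # ~ 1/2
print(ex2 - ex ** 2, ey2 - ey ** 2)                     # ~ 1/20 and ~ 1/18
```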
25. Since N is Binomial with n = 10^6 and p = 1/10^6, then n is large and p is small, so N is well approximated by a Poisson random variable with λ = np = 1. So P(N = i) ≈ e^{−λ} λ^i / i! = e^{−1}/i!.
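The quality of the Poisson approximation is easy to see numerically; the sketch below compares the exact binomial mass with e^{−1}/i! for the first few values of i, assuming n = 10^6 and p = 10^{−6}:

```python
import math

n, p, lam = 10 ** 6, 1e-6, 1.0
for i in range(4):
    binom = math.comb(n, i) * p ** i * (1 - p) ** (n - i)
    poisson = math.exp(-lam) * lam ** i / math.factorial(i)
    print(i, binom, poisson)   # the two columns agree to about six decimal places
```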
26a. Since A, B, C are independent, we multiply their marginal densities to get the joint density. Each of these variables has density 1 on the interval (0, 1) and density 0 otherwise. So the joint density is f(a, b, c) = 1 for 0 < a, b, c < 1 and f(a, b, c) = 0 otherwise. So the joint distribution is F(a, b, c) = FA(a)FB(b)FC(c), where FA(a), FB(b), and FC(c) are each the cumulative distribution function of a uniform (0, 1) random variable, i.e., each of these functions has the form F(x) = 0 if x ≤ 0, F(x) = x if 0 < x < 1, and F(x) = 1 if x ≥ 1.
26b. The roots of the equation Ax^2 + Bx + C = 0 are given by x = (−b ± \sqrt{b^2 − 4ac})/(2a); these roots are real if and only if b^2 − 4ac ≥ 0, which happens with probability \iiint_{b^2−4ac≥0} f(a, b, c) \, da \, db \, dc, which is exactly the volume of the region {(a, b, c) | b^2 − 4ac ≥ 0, 0 < a, b, c < 1} divided by the volume of the entire region {(a, b, c) | 0 < a, b, c < 1}. The second region has volume 1. The first region has volume \int_0^1 \int_0^1 \int_0^{\min\{1, b^2/(4c)\}} 1 \, da \, db \, dc ≈ .2544. So the desired probability is approximately .2544.
To see how to do the integral above, we compute
\int_0^1 \int_0^1 \int_0^{\min\{1, b^2/(4c)\}} 1 \, da \, db \, dc = \int_0^1 \int_0^1 \min\{1, b^2/(4c)\} \, db \, dc
which simplifies to
\int_0^{1/4} \left( \int_0^{\sqrt{4c}} (b^2/(4c)) \, db + \int_{\sqrt{4c}}^{1} 1 \, db \right) dc + \int_{1/4}^{1} \int_0^1 (b^2/(4c)) \, db \, dc = 5/36 + (1/6) ln(2) ≈ .2544
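A Monte Carlo check that P(B^2 ≥ 4AC) is indeed about 5/36 + ln(2)/6 ≈ .2544, assuming A, B, C independent and uniform on (0, 1):

```python
import random

random.seed(3)
trials = 400_000
hits = sum(random.random() ** 2 >= 4 * random.random() * random.random()
           for _ in range(trials))
print(hits / trials)   # ~ 0.254
```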
28. The cumulative distribution function of Z = X1 /X2 is P (Z ≤ a) = 0 for a ≤ 0, since Z
is never negative in this problem. For 0 < a, we compute
P(Z ≤ a) = P(X1/X2 ≤ a) = P(X1 ≤ aX2) = \int_0^\infty \int_0^{a x_2} λ1 λ2 e^{−(λ1 x1 + λ2 x2)} \, dx1 \, dx2 = λ1 a/(λ1 a + λ2)
An alternative method of computing is to write
P(Z ≤ a) = P(X1/X2 ≤ a) = P((1/a)X1 ≤ X2) = \int_0^\infty \int_{x1/a}^{\infty} λ1 λ2 e^{−(λ1 x1 + λ2 x2)} \, dx2 \, dx1 = λ1 a/(λ1 a + λ2)
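A simulation check of the formula λ1 a/(λ1 a + λ2); the rates λ1 = 2, λ2 = 3 and the threshold a = 1.5 are arbitrary test values:

```python
import random

random.seed(4)
lam1, lam2, a, n = 2.0, 3.0, 1.5, 200_000
hits = sum(random.expovariate(lam1) / random.expovariate(lam2) <= a
           for _ in range(n))
print(hits / n, lam1 * a / (lam1 * a + lam2))   # both ~ 0.5
```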
32a. We assume that the weekly sales in separate weeks are independent. Thus, the mean of the sales in two weeks is simply (2)(2200) = 4400. The variance of sales in one week is 230^2, so the variance of sales in two weeks is (by independence) simply (2)(230^2) = 105,800. So the sales in two weeks, denoted by X, has normal distribution with mean 4400 and variance 105,800. So
P(X > 5000) = P((X − 4400)/\sqrt{105,800} > (5000 − 4400)/\sqrt{105,800})
≈ P(Z > 1.84)
= 1 − P(Z ≤ 1.84)
= 1 − Φ(1.84)
≈ 1 − .9671
= .0329
32b. The weekly sales Y has normal distribution with mean 2200 and variance 230^2 = 52,900. So, in a given week, the probability p that the weekly sales Y exceeds 2000 is
p = P(Y > 2000)
= P((Y − 2200)/\sqrt{52,900} > (2000 − 2200)/\sqrt{52,900})
≈ P(Z > −.87)
= P(Z < .87)
= Φ(.87)
≈ .8078
The probability that weekly sales exceed 2000 in at least 2 out of 3 weeks is (approximately) \binom{3}{2} p^2 (1 − p) + \binom{3}{3} p^3 = .9034.
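Both parts can be recomputed directly with the normal cumulative distribution function Φ(z) = (1 + erf(z/√2))/2, without rounding z to two decimal places:

```python
import math

def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(1 - phi((5000 - 4400) / math.sqrt(105_800)))   # ~ 0.033  (part a)
p = 1 - phi((2000 - 2200) / 230)                     # ~ 0.808
print(3 * p ** 2 * (1 - p) + p ** 3)                 # ~ 0.903  (part b)
```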
33a. Write X for Jill's bowling score, so X is normal with mean 170 and variance 20^2 = 400. Write Y for Jack's bowling score, so Y is normal with mean 160 and variance 15^2 = 225. So −X is normal with mean −170 and variance 20^2 = 400. Thus, Y − X is normal with mean 160 − 170 = −10 and variance 225 + 400 = 625. So the desired probability is approximately
P(Y − X > 0) = P((Y − X − (−10))/\sqrt{625} > (0 − (−10))/\sqrt{625})
= P(Z > 2/5)
= 1 − P(Z ≤ 2/5)
= 1 − Φ(.4)
≈ 1 − .6554
= .3446
Since the bowling scores are actually discrete integer values, we get an even better approximation by using a continuity correction:
P(Y − X > 0) = P(Y − X ≥ .5)
= P((Y − X − (−10))/\sqrt{625} > (.5 − (−10))/\sqrt{625})
= P(Z > .42)
= 1 − P(Z ≤ .42)
= 1 − Φ(.42)
≈ 1 − .6628
= .3372
33b. The total of their scores, X + Y, is normal with mean 160 + 170 = 330 and variance 225 + 400 = 625. So the desired probability is approximately
P(X + Y > 350) = P((X + Y − 330)/\sqrt{625} > (350 − 330)/\sqrt{625})
= P(Z > 4/5)
= 1 − P(Z ≤ .8)
= 1 − Φ(.8)
≈ 1 − .7881
= .2119
Since the bowling scores are actually discrete integer values, we get an even better approximation by using a continuity correction:
P(X + Y > 350) = P(X + Y ≥ 350.5)
= P((X + Y − 330)/\sqrt{625} > (350.5 − 330)/\sqrt{625})
= P(Z > .82)
= 1 − P(Z ≤ .82)
= 1 − Φ(.82)
≈ 1 − .7939
= .2061
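The same Φ-based recomputation for the continuity-corrected answers of parts (a) and (b), assuming as above that Y − X is normal with mean −10 and X + Y is normal with mean 330, each with variance 625:

```python
import math

phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
print(1 - phi((0.5 - (-10)) / 25))    # ~ 0.337: Jack's score beats Jill's (part a)
print(1 - phi((350.5 - 330) / 25))    # ~ 0.206: total exceeds 350 (part b)
```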
37a. We recall from problem 3 that Y1 and Y2 have joint mass p(0, 0) = 15/26, p(0, 1) = 5/26,
p(1, 0) = 5/26, and p(1, 1) = 1/26.
So the conditional mass of Y1, given that Y2 = 1, is pY1|Y2(0|1) = (5/26)/(5/26 + 1/26) = 5/6 and pY1|Y2(1|1) = (1/26)/(5/26 + 1/26) = 1/6.
37b. The conditional mass of Y1, given that Y2 = 0, is pY1|Y2(0|0) = (15/26)/(15/26 + 5/26) = 3/4 and pY1|Y2(1|0) = (5/26)/(15/26 + 5/26) = 1/4.
39a. For 1 ≤ y ≤ x ≤ 5, the joint mass is p(x, y) = pX(x) pY|X(y|x) = (1/5)(1/x) = 1/(5x).
39b. The conditional mass of X, given Y = i, is
pX|Y(x|i) = p(x, i)/pY(i), where pY(i) = \sum_{k=i}^{5} p(k, i) = \sum_{k=i}^{5} 1/(5k).
In particular, for i = 1 we have pY(1) = \sum_{k=1}^{5} 1/(5k) = 137/300, so
pX|Y(x|1) = (1/(5x))/(137/300) = 60/(137x).
40. First, note that there is only one way for the two dice to give X and Y a common value (both dice show that value), but for X > Y there are two ways to obtain a given pair of values, since the two dice can appear in either order. So, for y < i, the conditional mass of Y, given X = i, is
pY|X(y|i) = p(i, y)/pX(i) = p(i, y) / \sum_{y=1}^{i} p(i, y) = (2)(1/6)(1/6) / ((i − 1)(2)(1/6)(1/6) + (1/6)(1/6)) = 2/(2i − 1)
For y = i, the conditional mass of Y, given X = i, is
pY|X(y|i) = p(i, y)/pX(i) = p(i, y) / \sum_{y=1}^{i} p(i, y) = (1/6)(1/6) / ((i − 1)(2)(1/6)(1/6) + (1/6)(1/6)) = 1/(2i − 1)
For y > i, the conditional mass of Y, given X = i, is pY|X(y|i) = 0.
Also note that X and Y are dependent. For instance, P(Y > 3) ≠ 0, because Y can take values larger than 3. On the other hand, P(Y > 3 | X = 2) = 0. So X and Y are dependent. Once X is given, for instance, then Y can be no larger than X.
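Because the whole sample space has only 36 outcomes, the conditional masses 2/(2i − 1), 1/(2i − 1), and 0 can be confirmed by enumeration; the sketch below conditions on X = 4 as an example:

```python
from fractions import Fraction as F
from itertools import product

i = 4                                         # condition on X = max = 4
outcomes = [(max(d), min(d)) for d in product(range(1, 7), repeat=2)]
px = sum(x == i for x, _ in outcomes)         # 2i - 1 = 7 outcomes have max = 4
for y in range(1, 7):
    pyx = F(sum((x, yy) == (i, y) for x, yy in outcomes), px)
    print(y, pyx)   # 2/7 for y < 4, 1/7 for y = 4, 0 for y > 4
```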
41a. The conditional mass function of X given Y = 1 is
pX|Y(1|1) = p(1, 1)/pY(1) = (1/8)/(1/8 + 1/8) = 1/2
and
pX|Y(2|1) = p(2, 1)/pY(1) = (1/8)/(1/8 + 1/8) = 1/2
The conditional mass function of X given Y = 2 is
pX|Y(1|2) = p(1, 2)/pY(2) = (1/4)/(1/4 + 1/2) = 1/3
and
pX|Y(2|2) = p(2, 2)/pY(2) = (1/2)/(1/4 + 1/2) = 2/3
41b. Since the conditional mass of X changes depending on the value of Y , then the value
of Y affects the various probabilities for X, so X and Y are not independent.
41c. We compute
P(XY ≤ 3) = p(1, 1) + p(2, 1) + p(1, 2) = 1/8 + 1/8 + 1/4 = 1/2
and
P(X + Y > 2) = p(2, 1) + p(1, 2) + p(2, 2) = 1/8 + 1/4 + 1/2 = 7/8
and
P(X/Y > 1) = p(2, 1) = 1/8
54a. We see that u = g1(x, y) = xy and v = g2(x, y) = x/y. Thus x = h1(u, v) = \sqrt{uv} and y = h2(u, v) = \sqrt{u/v}. The Jacobian is
J(x, y) = \begin{vmatrix} y & x \\ 1/y & −x/y^2 \end{vmatrix} = −x/y − x/y = −2x/y
so |J(x, y)|^{−1} = y/(2x). Therefore the joint density of U, V is
fU,V(u, v) = fX,Y(x, y) |J(x, y)|^{−1} = (1/(x^2 y^2))(y/(2x)) = 1/(2x^3 y) = 1/(2(\sqrt{uv})^3 \sqrt{u/v}) = 1/(2u^2 v)
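The Jacobian algebra can be spot-checked numerically: the determinant of the inverse map (u, v) → (x, y), estimated by finite differences, should equal 1/|J(x, y)| = y/(2x). The point (x, y) = (1.7, 2.3) is an arbitrary test point:

```python
import math

x, y, h = 1.7, 2.3, 1e-6
u, v = x * y, x / y

def back(u, v):                      # inverse map h(u, v) = (sqrt(uv), sqrt(u/v))
    return math.sqrt(u * v), math.sqrt(u / v)

dxu = (back(u + h, v)[0] - back(u - h, v)[0]) / (2 * h)
dxv = (back(u, v + h)[0] - back(u, v - h)[0]) / (2 * h)
dyu = (back(u + h, v)[1] - back(u - h, v)[1]) / (2 * h)
dyv = (back(u, v + h)[1] - back(u, v - h)[1]) / (2 * h)
print(abs(dxu * dyv - dxv * dyu), y / (2 * x))   # both ~ 0.676
```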
57. We see that y1 = g1(x1, x2) = x1 + x2 and y2 = g2(x1, x2) = e^{x1}. Thus x1 = h1(y1, y2) = ln(y2) and x2 = h2(y1, y2) = y1 − ln(y2). The Jacobian is
J(x1, x2) = \begin{vmatrix} 1 & 1 \\ e^{x1} & 0 \end{vmatrix} = −e^{x1}
so |J(x1, x2)|^{−1} = e^{−x1}. Therefore the joint density of Y1, Y2 is
fY1,Y2(y1, y2) = fX1,X2(x1, x2) |J(x1, x2)|^{−1}
= λ1 λ2 e^{−λ1 x1 − λ2 x2} e^{−x1}
= λ1 λ2 e^{−((λ1 + 1)x1 + λ2 x2)}
= λ1 λ2 e^{−((λ1 + 1) ln(y2) + λ2 (y1 − ln(y2)))}
= λ1 λ2 y2^{−λ1 + λ2 − 1} e^{−y1 λ2}
THEORETICAL EXERCISES
2. Generalizing Proposition 2.1 (there is nothing special about two variables), we note that discrete random variables X1, . . . , Xn are independent if and only if their joint mass f(x1, . . . , xn) can be factored as f1(x1) · · · fn(xn); in this case, once each fi is normalized so that \sum_{xi} fi(xi) = 1 for each i, then the fi's are the marginal mass functions of the Xi's.
Write X for the total number of events in the given time period, and write Xi for the number of events of type i. Then we can factor the joint mass f(x1, . . . , xn) of X1, . . . , Xn by writing
f(x1, . . . , xn) = P(X = x1 + · · · + xn) P(X1 = x1, . . . , Xn = xn | X = x1 + · · · + xn)
= (e^{−λ} λ^{x1+···+xn} / (x1 + · · · + xn)!) \binom{x1 + · · · + xn}{x1, . . . , xn} p1^{x1} · · · pn^{xn}
= (e^{−λ} λ^{x1+···+xn} / (x1 + · · · + xn)!) ((x1 + · · · + xn)! / (x1! · · · xn!)) p1^{x1} · · · pn^{xn}
= (e^{−λ} λ^{x1} · · · λ^{xn} / (x1! · · · xn!)) p1^{x1} · · · pn^{xn}
= (e^{−λp1}(λp1)^{x1} / x1!) · · · (e^{−λpn}(λpn)^{xn} / xn!)
where the last step uses p1 + · · · + pn = 1. So
e^{−λpi}(λpi)^{xi} / xi!
is the mass of Xi, and also X1, . . . , Xn are independent.
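A short simulation of this thinning result, assuming (hypothetically) λ = 4 and two event types with p1 = 0.3 and p2 = 0.7: the type counts should have means λp1 and λp2 and be uncorrelated, consistent with the independence just derived:

```python
import random

random.seed(5)
lam, p1, trials = 4.0, 0.3, 50_000

def poisson(rate):
    # count exponential interarrivals that fit inside one unit of time
    t, k = 0.0, 0
    while True:
        t += random.expovariate(rate)
        if t > 1.0:
            return k
        k += 1

x1s, x2s = [], []
for _ in range(trials):
    n = poisson(lam)
    x1 = sum(random.random() < p1 for _ in range(n))   # classify each event
    x1s.append(x1)
    x2s.append(n - x1)

m1, m2 = sum(x1s) / trials, sum(x2s) / trials
cov = sum(a * b for a, b in zip(x1s, x2s)) / trials - m1 * m2
print(m1, m2, cov)   # ~ 1.2, ~ 2.8, ~ 0
```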
5a. For a > 0, the cumulative distribution function of Z is FZ(a) = P(Z ≤ a) = P(X/Y ≤ a) = P(X ≤ aY) = \int_0^\infty \int_0^{ay} fX(x) fY(y) \, dx \, dy = \int_0^\infty fY(y) \int_0^{ay} fX(x) \, dx \, dy = \int_0^\infty fY(y) FX(ay) \, dy, or equivalently, for z > 0, we have
FZ(z) = \int_0^\infty fY(y) FX(zy) \, dy
Differentiating throughout with respect to z yields
fZ(z) = \int_0^\infty fY(y) fX(zy) y \, dy
Of course, for z ≤ 0, we have fZ(z) = 0. When X, Y are independent exponential random variables with parameters λ1, λ2, this yields
fZ(z) = \int_0^\infty λ2 e^{−λ2 y} λ1 e^{−λ1 zy} y \, dy = λ1 λ2 / (λ2 + zλ1)^2
for z > 0; as before, fZ(z) = 0 for z ≤ 0.
5b. For a > 0, the cumulative distribution function of Z is FZ(a) = P(Z ≤ a) = P(XY ≤ a) = P(X ≤ a/Y) = \int_0^\infty \int_0^{a/y} fX(x) fY(y) \, dx \, dy = \int_0^\infty fY(y) \int_0^{a/y} fX(x) \, dx \, dy = \int_0^\infty fY(y) FX(a/y) \, dy, or equivalently, for z > 0, we have
FZ(z) = \int_0^\infty fY(y) FX(z/y) \, dy
Differentiating throughout with respect to z yields
fZ(z) = \int_0^\infty fY(y) fX(z/y) (1/y) \, dy
Of course, for z ≤ 0, we have fZ(z) = 0. When X, Y are independent exponential random variables with parameters λ1, λ2, this yields
fZ(z) = \int_0^\infty (1/y) λ2 e^{−λ2 y} λ1 e^{−λ1 z/y} \, dy
for z > 0, but I do not see an easy way to simplify this expression; as before, fZ(z) = 0 for z ≤ 0.
6. Method 1. My solution does NOT use induction. If the Xi are independent and identically distributed geometrics, each with probability p of success, then
P(X1 + · · · + Xn = i) = \sum_{x1+···+xn=i} ((1 − p)^{x1−1} p) · · · ((1 − p)^{xn−1} p)
= (p^n / (1 − p)^n) \sum_{x1+···+xn=i} (1 − p)^{x1+···+xn}
= (p^n / (1 − p)^n) \sum_{x1+···+xn=i} (1 − p)^i
= (p^n (1 − p)^i / (1 − p)^n) \sum_{x1+···+xn=i} 1
= (p^n (1 − p)^i / (1 − p)^n) \binom{i − 1}{n − 1}    using Proposition 6.1 of Chapter 1
= \binom{i − 1}{n − 1} p^n (1 − p)^{i−n}
and thus X1 + · · · + Xn has a negative binomial distribution with parameters p and n.
Method 2. Here is a solution with no derivation needed. Do n consecutive experiments. In each experiment, repeatedly flip a coin with probability p of landing heads until heads appears. Then X1 + · · · + Xn is the total number of flips required. Of course, the total number of flips is a negative binomial random variable, because the experiment ends when we see the nth success, or in other words, when we see the appearance of the nth head.
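Method 1 can also be checked exactly for a small case by convolving geometric masses; the sketch below uses the (hypothetical) values p = 1/3 and n = 3:

```python
from fractions import Fraction as F
from math import comb

p, n = F(1, 3), 3
geo = {k: (1 - p) ** (k - 1) * p for k in range(1, 40)}   # truncated geometric pmf
conv = {0: F(1)}                                          # pmf of an empty sum
for _ in range(n):                                        # convolve n times
    new = {}
    for s, ps in conv.items():
        for k, pk in geo.items():
            new[s + k] = new.get(s + k, F(0)) + ps * pk
    conv = new

for i in range(n, n + 5):   # exact for these i, since the truncation is far away
    assert conv[i] == comb(i - 1, n - 1) * p ** n * (1 - p) ** (i - n)
print("negative binomial check passes")
```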
9. For convenience, write Y = min(X1, . . . , Xn). For a ≤ 0, we know that FY(a) = 0, since Y is never negative. For a > 0, we have FY(a) = P(Y ≤ a) = 1 − P(Y > a) = 1 − P(min(X1, . . . , Xn) > a) = 1 − P(X1 > a, X2 > a, . . . , Xn > a) = 1 − P(X1 > a)P(X2 > a) · · · P(Xn > a) = 1 − (e^{−λa})(e^{−λa}) · · · (e^{−λa}) = 1 − e^{−λna}. Thus Y is exponentially distributed with parameter nλ.
10. The flashlight begins with 2 batteries installed and n−2 replacement batteries available.
When one battery dies, the battery is immediately replaced, and because of the memoryless
property of exponential random variables, both batteries installed (the old and new) are just
as good as new, and the waiting begins again. A dead battery is replaced a total of n − 2
times. Upon the (n − 1)st battery dying, we do not have enough batteries left to run the
flashlight anymore. So the length of time that the flashlight can operate is X1 + · · · + Xn−1 ,
where the Xi's are independent, identically distributed exponential random variables, each
with parameter λ. [So the length of time that the flashlight can operate is a gamma random
variable with parameters (n − 1, λ); see, for instance, Example 3b on page 282. If you did
not get this last sentence, don’t worry, because you are not required to understand gamma
random variables in my course.]
14a. We think of X as the number of flips of a coin until the first appearance of heads,
and Y as the number of additional flips of a coin until the second appearance of heads; here,
“heads” appears on each toss with probability p. Given that X + Y = n, we know that n
flips were required to reach the second head. So any of the first n − 1 flips could be the first head, and all such n − 1 possibilities are equally likely, so P(X = i | X + Y = n) = 1/(n − 1) for 1 ≤ i ≤ n − 1 and P(X = i | X + Y = n) = 0 otherwise.
14b. To verify this, we note that if i ≤ 0 or i ≥ n, we must have P (X = i | X + Y = n) = 0,
because X, Y are positive random variables. For 1 ≤ i ≤ n − 1, we compute
P(X = i | X + Y = n) = P(X = i and X + Y = n) / P(X + Y = n)
= P(X = i, Y = n − i) / P(X + Y = n)
= P(X = i)P(Y = n − i) / \sum_{j=1}^{n−1} P(X = j)P(Y = n − j)
= (1 − p)^{i−1} p (1 − p)^{n−i−1} p / \sum_{j=1}^{n−1} (1 − p)^{j−1} p (1 − p)^{n−j−1} p
= (1 − p)^{n−2} / ((n − 1)(1 − p)^{n−2})
= 1/(n − 1)
15. Method 1. We compute
P(X = i | X + Y = m) = P(X = i and X + Y = m) / P(X + Y = m)
= P(X = i, Y = m − i) / P(X + Y = m)
= P(X = i)P(Y = m − i) / \sum_{j=0}^{m} P(X = j)P(Y = m − j)
= \binom{n}{i} p^i (1 − p)^{n−i} \binom{n}{m−i} p^{m−i} (1 − p)^{n−m+i} / \sum_{j=0}^{m} \binom{n}{j} p^j (1 − p)^{n−j} \binom{n}{m−j} p^{m−j} (1 − p)^{n−m+j}
= \binom{n}{i} \binom{n}{m−i} / \sum_{j=0}^{m} \binom{n}{j} \binom{n}{m−j}
= \binom{n}{i} \binom{n}{m−i} / \binom{2n}{m}
Method 2. As in Ross's hint, flip 2n coins, let X denote the number of heads in the first sequence of n flips and let Y denote the number of heads in the second sequence of n flips. If m flips are heads altogether, there are \binom{2n}{m} equally likely possibilities, exactly \binom{n}{i}\binom{n}{m−i} of which have i heads in the first sequence of n flips and the other m − i heads in the second sequence of n flips. So P(X = i | X + Y = m) = \binom{n}{i}\binom{n}{m−i} / \binom{2n}{m}, and thus, given X + Y = m, we see that the conditional distribution of X is hypergeometric.
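An exact check of the hypergeometric conclusion for one small case, using the (hypothetical) values n = 6, p = 1/4, m = 5; note that p cancels out, as the final expression says it must:

```python
from fractions import Fraction as F
from math import comb

n, p, m = 6, F(1, 4), 5

def binom_pmf(k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

denom = sum(binom_pmf(j) * binom_pmf(m - j) for j in range(m + 1))
for i in range(m + 1):
    cond = binom_pmf(i) * binom_pmf(m - i) / denom
    assert cond == F(comb(n, i) * comb(n, m - i), comb(2 * n, m))
print("hypergeometric check passes")
```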
18a. Given the condition U > a, then U is uniformly distributed on the interval (a, 1). To see this, just consider any b with a < b < 1. Then P(U < b | U > a) = P(a < U < b)/P(U > a) = (b − a)/(1 − a). Differentiating with respect to b yields the conditional density of U, namely 1/(1 − a), which is constant (since a is fixed in this problem). So the conditional distribution of U is uniform on the interval (a, 1), as we stated at the start.
18b. Given the condition U < a, then U is uniformly distributed on the interval (0, a). To see this, just consider any b with 0 < b < a. Then P(U < b | U < a) = P(U < b)/P(U < a) = b/a. Differentiating with respect to b yields the conditional density of U, namely 1/a, which is constant (since a is fixed in this problem). So the conditional distribution of U is uniform on the interval (0, a), as we stated at the start.