STAT 516 Answers: Homework 6
April 2, 2008
Solutions by Mark Daniel Ward

PROBLEMS: Chapter 6

2a. The mass $p(0,0)$ corresponds to neither of the first two balls being white, so $p(0,0) = \frac{8}{13}\cdot\frac{7}{12} = 14/39$. The mass $p(0,1)$ corresponds to the first ball being red and the second ball being white, so $p(0,1) = \frac{8}{13}\cdot\frac{5}{12} = 10/39$. The mass $p(1,0)$ corresponds to the first ball being white and the second ball being red, so $p(1,0) = \frac{5}{13}\cdot\frac{8}{12} = 10/39$. The mass $p(1,1)$ corresponds to both of the first two balls being white, so $p(1,1) = \frac{5}{13}\cdot\frac{4}{12} = 5/39$.

2b. Similarly, we have $p(0,0,0) = \frac{8}{13}\cdot\frac{7}{12}\cdot\frac{6}{11} = \frac{28}{143}$, and $p(0,0,1) = p(0,1,0) = p(1,0,0) = \frac{8}{13}\cdot\frac{7}{12}\cdot\frac{5}{11} = \frac{70}{429}$, and $p(0,1,1) = p(1,0,1) = p(1,1,0) = \frac{8}{13}\cdot\frac{5}{12}\cdot\frac{4}{11} = \frac{40}{429}$, and finally $p(1,1,1) = \frac{5}{13}\cdot\frac{4}{12}\cdot\frac{3}{11} = \frac{5}{143}$.

3a. The mass $p(0,0)$ corresponds to neither of the white balls numbered "1" and "2" appearing within the three choices, so $p(0,0) = \binom{11}{3}/\binom{13}{3} = 15/26$. The mass $p(0,1)$ corresponds to the white ball numbered "1" not being chosen and the white ball numbered "2" being chosen within the three choices, so $p(0,1) = \binom{1}{1}\binom{11}{2}/\binom{13}{3} = 5/26$. The mass $p(1,0)$ corresponds to the white ball numbered "2" not being chosen and the white ball numbered "1" being chosen within the three choices, so $p(1,0) = \binom{1}{1}\binom{11}{2}/\binom{13}{3} = 5/26$. The mass $p(1,1)$ corresponds to both of the white balls numbered "1" and "2" appearing within the three choices, so $p(1,1) = \binom{2}{2}\binom{11}{1}/\binom{13}{3} = 1/26$.

3b. Similarly, we have $p(0,0,0) = \binom{10}{3}/\binom{13}{3} = \frac{60}{143}$, and $p(0,0,1) = p(0,1,0) = p(1,0,0) = \binom{1}{1}\binom{10}{2}/\binom{13}{3} = \frac{45}{286}$, and $p(0,1,1) = p(1,0,1) = p(1,1,0) = \binom{1}{1}\binom{1}{1}\binom{10}{1}/\binom{13}{3} = \frac{5}{143}$, and finally $p(1,1,1) = \binom{1}{1}\binom{1}{1}\binom{1}{1}/\binom{13}{3} = \frac{1}{286}$.

5. The only modification from problem 3a above is that the balls are replaced after each draw. Thus $p(0,0) = \left(\frac{11}{13}\right)^3 = 1331/2197$; also, $p(0,1) = 3\cdot\frac{1}{13}\left(\frac{11}{13}\right)^2 + 3\left(\frac{1}{13}\right)^2\frac{11}{13} + \left(\frac{1}{13}\right)^3 = 397/2197$; similarly, $p(1,0) = 397/2197$; and finally, $p(1,1) = 6\cdot\frac{1}{13}\cdot\frac{1}{13}\cdot\frac{11}{13} + 3\left(\frac{1}{13}\right)^3 + 3\left(\frac{1}{13}\right)^3 = 72/2197$.

6. We first note that $1 \le N_1, N_2 \le 4$ and $N_1 + N_2 \le 5$. There are $\binom{5}{2} = 10$ ways that the two defective items can be chosen, each equally likely, and each way corresponds to exactly one pair $(n_1, n_2)$ satisfying the bounds mentioned. Thus $p(n_1, n_2) = 1/10$ if $1 \le n_1, n_2 \le 4$ and $n_1 + n_2 \le 5$; otherwise $p(n_1, n_2) = 0$.

10a. We first compute
$$P(X < Y) = \int_0^\infty \int_0^y e^{-(x+y)}\,dx\,dy = 1/2.$$
Another method is to compute
$$P(X < Y) = \int_0^\infty \int_x^\infty e^{-(x+y)}\,dy\,dx = 1/2.$$
Finally, a third method is to just notice that $X$ and $Y$ are independent and have the same distribution, so half the time we have $X < Y$ and the other half of the time we have $Y < X$. Thus $P(X < Y) = 1/2$.

10b. For $a < 0$, we have $P(X < a) = 0$. For $a > 0$, we compute
$$P(X < a) = \int_{-\infty}^\infty \int_{-\infty}^a f(x,y)\,dx\,dy = \int_0^\infty \int_0^a e^{-(x+y)}\,dx\,dy = 1 - e^{-a}.$$
Another method is to simply note that the joint density of $X$ and $Y$ shows us that, in this case, $X$ and $Y$ must be independent exponential random variables, each with $\lambda = 1$. So $P(X < a) = 1 - e^{-a}$ for $a > 0$, and $P(X < a) = 0$ otherwise, since this is the cumulative distribution function of an exponential random variable.

11. There are $\binom{5}{2,1,2} = 30$ ways that the customers can be split into a pair who will buy ordinary sets, a single buyer who will buy plasma, and a pair who will buy nothing. The probability of each such choice is $(.45)^2(.15)^1(.40)^2 = .00486$. So the desired probability is $(30)(.00486) = .1458$.
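As a quick cross-check of the combinatorial answers above, the short Python sketch below (an added illustration, not part of the original solutions) verifies the joint masses in Problem 3a and the count and probability in Problem 11 using only standard-library arithmetic; all numbers are taken directly from the solutions.

```python
# Sanity checks (added illustration) of Problems 3a and 11 with exact arithmetic.
from fractions import Fraction as F
from math import comb, factorial

# Problem 3a: 3 of 13 balls are chosen; Y_i indicates whether white ball i is chosen.
total = comb(13, 3)
assert F(comb(11, 3), total) == F(15, 26)                # p(0, 0)
assert F(comb(1, 1) * comb(11, 2), total) == F(5, 26)    # p(0, 1) and p(1, 0)
assert F(comb(2, 2) * comb(11, 1), total) == F(1, 26)    # p(1, 1)

# Problem 11: multinomial count of ways to split 5 customers as 2 / 1 / 2.
ways = factorial(5) // (factorial(2) * factorial(1) * factorial(2))   # 30
prob = ways * 0.45**2 * 0.15 * 0.40**2
print(ways, round(prob, 4))   # 30 and about 0.1458
```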
14. We write $X$ for the location of the ambulance, and $Y$ for the location of the accident, both in the interval $[0, L]$. The distance between them is $D = |X - Y|$. We know that $P(D < a) = 0$ for $a < 0$ and $P(D < a) = 1$ for $a \ge L$, since the minimum and maximum possible values of $D$ are $0$ and $L$, respectively. So $D$ must be between $0$ and $L$. Perhaps the easiest method for computing $P(D < a)$ with $0 \le a \le L$ is to draw a picture of the sample space and then divide the desired area by the entire area of the sample space; this method works since the joint distribution of $X, Y$ is uniform. So the desired probability is
$$1 - \frac{1}{2}\frac{(L-a)^2}{L^2} - \frac{1}{2}\frac{(L-a)^2}{L^2} = a(2L-a)/L^2.$$
Another possibility is to integrate, and we need to break the desired integral into three regions:
$$P(D < a) = \int_0^a \int_0^{x+a} \frac{1}{L^2}\,dy\,dx + \int_a^{L-a} \int_{x-a}^{x+a} \frac{1}{L^2}\,dy\,dx + \int_{L-a}^L \int_{x-a}^L \frac{1}{L^2}\,dy\,dx = a(2L-a)/L^2.$$

15a. We note that $1 = \iint_R c\,dx\,dy$, so $1/c = \iint_R dx\,dy = $ area of the region $R$.

15b. We see that, for $-1 < a, b < 1$, we have $F_{X,Y}(a,b) = P(X \le a, Y \le b) = \int_{-1}^a \int_{-1}^b \frac{1}{4}\,dy\,dx = \frac{a-(-1)}{2}\cdot\frac{b-(-1)}{2}$. Thus $F_{X,Y}(x,y) = \frac{x-(-1)}{2}\cdot\frac{y-(-1)}{2}$. So we have successfully factored the cumulative distribution function into two parts: one part is a function of $x$ and the other part is a function of $y$, so $X$ and $Y$ are independent, and their cumulative distribution functions have the required forms for uniform random variables on the interval $(-1,1)$, i.e., $F_X(a) = F_Y(a) = \frac{a-(-1)}{2}$ for $-1 < a < 1$, with $F_X(a) = F_Y(a) = 0$ for $a \le -1$ and $F_X(a) = F_Y(a) = 1$ for $a \ge 1$.

15c. We integrate: $\iint_{x^2+y^2 \le 1} \frac{1}{4}\,dx\,dy = \dfrac{\text{area of the circle } x^2 + y^2 \le 1}{\text{area of the entire square } S} = \dfrac{\pi \cdot 1^2}{4} = \pi/4$.

16a. We see that $A$ happens if and only if at least one of the $A_i$ happens. So $A = \bigcup_{i=1}^n A_i$.

16b. Yes, the $A_i$ are mutually exclusive. We cannot have more than one $A_i$ occur at the same time.

16c. Since the $A_i$ are mutually exclusive, then $P(A) = P\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n P(A_i)$. We see that $P(A_i) = 1/2^{n-1}$, since we require each of the other $n-1$ points (besides the $i$th point itself, of course) to be in the semicircle clockwise of the $i$th point. So $P(A) = \sum_{i=1}^n 1/2^{n-1} = \frac{n}{2^{n-1}}$.

19a. The marginal density of $X$ is $f_X(x) = 0$ for $x \le 0$ and also for $x \ge 1$. For $0 < x < 1$, the marginal density of $X$ is
$$f_X(x) = \int_0^x \frac{1}{x}\,dy = 1.$$

19b. The marginal density of $Y$ is $f_Y(y) = 0$ for $y \le 0$ and also for $y \ge 1$. For $0 < y < 1$, the marginal density of $Y$ is
$$f_Y(y) = \int_y^1 \frac{1}{x}\,dx = \ln(1/y).$$

19c. The expected value of $X$ is
$$E[X] = \int_{-\infty}^\infty x f_X(x)\,dx = \int_0^1 (x)(1)\,dx = 1/2.$$

19d. The expected value of $Y$ is
$$E[Y] = \int_{-\infty}^\infty y f_Y(y)\,dy = \int_0^1 y\ln(1/y)\,dy = 1/4.$$
To see this, use integration by parts, with $u = \ln(1/y)$ and $dv = y\,dy$.

20a. Yes, $X$ and $Y$ are independent, because we can factor the joint density as $f(x,y) = f_X(x)f_Y(y)$, where $f_X(x) = xe^{-x}$ for $x > 0$ and $f_X(x) = 0$ otherwise, and $f_Y(y) = e^{-y}$ for $y > 0$ and $f_Y(y) = 0$ otherwise.

20b. No; in this case, $X$ and $Y$ are not independent. To see this, we note that the density is nonzero only when $0 < x < y < 1$, so the domain does not allow us to factor the joint density into two separate regions. For instance, $P\left(\frac{1}{4} < X < 1\right) > 0$, since $X$ can be in the range between $1/4$ and $1$. On the other hand, $P\left(\frac{1}{4} < X < 1 \mid Y = \frac{1}{8}\right) = 0$, since $X$ cannot be in the range between $1/4$ and $1$ when $Y = 1/8$; instead, $X$ must always be smaller than $Y$.
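The marginals and expectations in Problem 19 are easy to confirm numerically. The Monte Carlo sketch below is an added check, not part of the original solution; it assumes the representation implied by the joint density $f(x,y) = 1/x$ on $0 < y < x < 1$ used above, namely $X$ uniform on $(0,1)$ and, given $X = x$, $Y$ uniform on $(0, x)$.

```python
# Monte Carlo check (added illustration) of Problem 19: E[X] = 1/2 and E[Y] = 1/4.
import random

trials = 200_000
xs = [random.random() for _ in range(trials)]       # X ~ Uniform(0, 1)
ys = [random.uniform(0, x) for x in xs]             # Y | X = x ~ Uniform(0, x)
print("E[X] estimate:", round(sum(xs) / trials, 3))  # about 0.5
print("E[Y] estimate:", round(sum(ys) / trials, 3))  # about 0.25
```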
23a. Yes, $X$ and $Y$ are independent, because we can factor the joint density as $f(x,y) = f_X(x)f_Y(y)$, where $f_X(x) = 6x(1-x)$ for $0 < x < 1$ and $f_X(x) = 0$ otherwise, and $f_Y(y) = 2y$ for $0 < y < 1$ and $f_Y(y) = 0$ otherwise.

23b. We compute $E[X] = \int_{-\infty}^\infty x f_X(x)\,dx = \int_0^1 x\cdot 6x(1-x)\,dx = 1/2$.

23c. We compute $E[Y] = \int_{-\infty}^\infty y f_Y(y)\,dy = \int_0^1 y\cdot 2y\,dy = 2/3$.

23d. We compute $E[X^2] = \int_{-\infty}^\infty x^2 f_X(x)\,dx = \int_0^1 x^2\cdot 6x(1-x)\,dx = 3/10$. Thus $\mathrm{Var}(X) = \frac{3}{10} - \left(\frac{1}{2}\right)^2 = 1/20$.

23e. We compute $E[Y^2] = \int_{-\infty}^\infty y^2 f_Y(y)\,dy = \int_0^1 y^2\cdot 2y\,dy = 1/2$. Thus $\mathrm{Var}(Y) = \frac{1}{2} - \left(\frac{2}{3}\right)^2 = 1/18$.

25. Since $N$ is Binomial with $n = 10^6$ and $p = 1/10^6$, then $n$ is large and $p$ is small, so $N$ is well approximated by a Poisson random variable with $\lambda = np = 1$. So $P(N = i) \approx \frac{e^{-\lambda}\lambda^i}{i!} = \frac{e^{-1}}{i!}$.

26a. Since $A, B, C$ are independent, we multiply their marginal densities to get the joint density. Each of these variables has density $1$ on the interval $(0,1)$ and density $0$ otherwise. So the joint density is $f(a,b,c) = 1$ for $0 < a, b, c < 1$ and $f(a,b,c) = 0$ otherwise. So the joint distribution is $F(a,b,c) = F_A(a)F_B(b)F_C(c)$, where $F_A(a)$, $F_B(b)$, and $F_C(c)$ are each the cumulative distribution function of a uniform $(0,1)$ random variable, i.e., each of these functions has the form $F(x) = 0$ if $x \le 0$, or $F(x) = x$ if $0 < x < 1$, or $F(x) = 1$ if $x \ge 1$.

26b. The roots of the equation $Ax^2 + Bx + C = 0$ are given by $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$; these roots are real if and only if $b^2 - 4ac \ge 0$, which happens with probability $\iiint_{b^2 - 4ac \ge 0} f(a,b,c)\,da\,db\,dc$, which is exactly the volume of the region $\{(a,b,c) \mid b^2 - 4ac \ge 0,\ 0 < a,b,c < 1\}$ divided by the volume of the entire region $\{(a,b,c) \mid 0 < a,b,c < 1\}$. The second region has volume $1$. The first region has volume $\int_0^1 \int_0^1 \int_0^{\min\{1,\, b^2/(4c)\}} 1\,da\,db\,dc \approx .2544$. So the desired probability is approximately $.2544$. To see how to do the integral above, we compute
$$\int_0^1 \int_0^1 \int_0^{\min\{1,\, b^2/(4c)\}} 1\,da\,db\,dc = \int_0^1 \int_0^1 \min\left\{1, \frac{b^2}{4c}\right\} db\,dc,$$
which simplifies to
$$\int_0^{1/4} \left( \int_0^{\sqrt{4c}} \frac{b^2}{4c}\,db + \int_{\sqrt{4c}}^1 1\,db \right) dc + \int_{1/4}^1 \int_0^1 \frac{b^2}{4c}\,db\,dc = \frac{5}{36} + \frac{1}{6}\ln(2) \approx .2544.$$

28. The cumulative distribution function of $Z = X_1/X_2$ is $P(Z \le a) = 0$ for $a \le 0$, since $Z$ is never negative in this problem. For $a > 0$, we compute
$$P(Z \le a) = P\left(\frac{X_1}{X_2} \le a\right) = P(X_1 \le aX_2) = \int_0^\infty \int_0^{ax_2} \lambda_1\lambda_2 e^{-(\lambda_1 x_1 + \lambda_2 x_2)}\,dx_1\,dx_2 = \frac{\lambda_1 a}{\lambda_1 a + \lambda_2}.$$
An alternative method of computing is to write
$$P(Z \le a) = P\left(\frac{X_1}{X_2} \le a\right) = P\left(X_2 \ge \frac{1}{a}X_1\right) = \int_0^\infty \int_{x_1/a}^\infty \lambda_1\lambda_2 e^{-(\lambda_1 x_1 + \lambda_2 x_2)}\,dx_2\,dx_1 = \frac{\lambda_1 a}{\lambda_1 a + \lambda_2}.$$

32a. We assume that the weekly sales in separate weeks are independent. Thus the mean of the total sales over two weeks is simply $(2)(2200) = 4400$. The variance of sales in one week is $230^2$, so the variance of sales over two weeks is (by independence) simply $(2)(230^2) = 105{,}800$. So the sales over two weeks, denoted by $X$, has a normal distribution with mean $4400$ and variance $105{,}800$. So
$$P(X > 5000) = P\left(\frac{X - 4400}{\sqrt{105{,}800}} > \frac{5000 - 4400}{\sqrt{105{,}800}}\right) \approx P(Z > 1.84) = 1 - P(Z \le 1.84) = 1 - \Phi(1.84) \approx 1 - .9671 = .0329.$$

32b. The weekly sales $Y$ has a normal distribution with mean $2200$ and variance $230^2 = 52{,}900$. So, in a given week, the probability $p$ that the weekly sales $Y$ exceeds $2000$ is
$$p = P(Y > 2000) = P\left(\frac{Y - 2200}{\sqrt{52{,}900}} > \frac{2000 - 2200}{\sqrt{52{,}900}}\right) \approx P(Z > -.87) = P(Z < .87) = \Phi(.87) \approx .8078.$$
The probability that weekly sales exceed $2000$ in at least $2$ out of $3$ weeks is (approximately) $\binom{3}{2}p^2(1-p) + p^3 = .9034$.
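For Problem 32b, the normal-approximation numbers can be reproduced with Python's standard library. This is an added check rather than part of the original solution; the small differences from $.8078$ and $.9034$ come from the solution's rounding of the $z$-score to $.87$.

```python
# Numerical check (added illustration) of Problem 32b.
from statistics import NormalDist
from math import comb

weekly = NormalDist(mu=2200, sigma=230)
p = 1 - weekly.cdf(2000)                       # about 0.8077 (0.8078 if z is rounded to -0.87)
at_least_two = comb(3, 2) * p**2 * (1 - p) + p**3
print(round(p, 4), round(at_least_two, 4))     # approximately 0.8077 and 0.9033
```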
33a. Write $X$ for Jill's bowling score, so $X$ is normal with mean $170$ and variance $20^2 = 400$. Write $Y$ for Jack's bowling score, so $Y$ is normal with mean $160$ and variance $15^2 = 225$. So $-X$ is normal with mean $-170$ and variance $20^2 = 400$. Thus, $Y - X$ is normal with mean $160 - 170 = -10$ and variance $225 + 400 = 625$. So the desired probability is approximately
$$P(Y - X > 0) = P\left(\frac{Y - X - (-10)}{\sqrt{625}} > \frac{0 - (-10)}{\sqrt{625}}\right) = P\left(Z > \frac{2}{5}\right) = 1 - P\left(Z \le \frac{2}{5}\right) = 1 - \Phi(.4) \approx 1 - .6554 = .3446.$$
Since the bowling scores are actually discrete integer values, we get an even better approximation by using a continuity correction:
$$P(Y - X > 0) \approx P(Y - X \ge .5) = P\left(\frac{Y - X - (-10)}{\sqrt{625}} \ge \frac{.5 - (-10)}{\sqrt{625}}\right) = P(Z > .42) = 1 - P(Z \le .42) = 1 - \Phi(.42) \approx 1 - .6628 = .3372.$$

33b. The total of their scores, $X + Y$, is normal with mean $160 + 170 = 330$ and variance $225 + 400 = 625$. So the desired probability is approximately
$$P(X + Y > 350) = P\left(\frac{X + Y - 330}{\sqrt{625}} > \frac{350 - 330}{\sqrt{625}}\right) = P\left(Z > \frac{4}{5}\right) = 1 - P(Z \le .8) = 1 - \Phi(.8) \approx 1 - .7881 = .2119.$$
Since the bowling scores are actually discrete integer values, we get an even better approximation by using a continuity correction:
$$P(X + Y > 350) \approx P(X + Y \ge 350.5) = P\left(\frac{X + Y - 330}{\sqrt{625}} \ge \frac{350.5 - 330}{\sqrt{625}}\right) = P(Z > .82) = 1 - P(Z \le .82) = 1 - \Phi(.82) \approx 1 - .7939 = .2061.$$

37a. We recall from problem 3 that $Y_1$ and $Y_2$ have joint mass $p(0,0) = 15/26$, $p(0,1) = 5/26$, $p(1,0) = 5/26$, and $p(1,1) = 1/26$. So the conditional mass of $Y_1$, given that $Y_2 = 1$, is $p_{Y_1|Y_2}(0|1) = \frac{5/26}{5/26 + 1/26} = \frac{5}{6}$ and $p_{Y_1|Y_2}(1|1) = \frac{1/26}{5/26 + 1/26} = \frac{1}{6}$.

37b. The conditional mass of $Y_1$, given that $Y_2 = 0$, is $p_{Y_1|Y_2}(0|0) = \frac{15/26}{15/26 + 5/26} = \frac{3}{4}$ and $p_{Y_1|Y_2}(1|0) = \frac{5/26}{15/26 + 5/26} = \frac{1}{4}$.

39a. For $1 \le y \le x \le 5$, the joint mass is $p(x,y) = p(x)p(y|x) = \frac{1}{5}\cdot\frac{1}{x} = \frac{1}{5x}$.

39b. The conditional mass of $X$, given $Y = i$, is
$$p_{X|Y}(x|i) = \frac{p(x,i)}{p_Y(i)} = \frac{p(x,i)}{\sum_{x=1}^5 p(x,i)} = \frac{\frac{1}{5x}}{\sum_{k=i}^5 \frac{1}{5k}}, \qquad i \le x \le 5.$$
For example, when $i = 1$ the denominator is $\frac{137}{300}$, so $p_{X|Y}(x|1) = \frac{1/(5x)}{137/300} = \frac{60}{137x}$.

40. First, note that for each value $i$ there is only one outcome of the two dice giving $X = Y = i$, but for each $y < i$ there are two outcomes giving $X = i$ and $Y = y$. So, for $y < i$, the conditional mass of $Y$, given $X = i$, is
$$p_{Y|X}(y|i) = \frac{p(i,y)}{p_X(i)} = \frac{p(i,y)}{\sum_{y=1}^i p(i,y)} = \frac{(2)(1/6)(1/6)}{(i-1)(2)(1/6)(1/6) + (1/6)(1/6)} = \frac{2}{2i-1}.$$
For $y = i$, the conditional mass of $Y$, given $X = i$, is
$$p_{Y|X}(i|i) = \frac{p(i,i)}{p_X(i)} = \frac{(1/6)(1/6)}{(i-1)(2)(1/6)(1/6) + (1/6)(1/6)} = \frac{1}{2i-1}.$$
For $y > i$, the conditional mass of $Y$, given $X = i$, is $p_{Y|X}(y|i) = 0$. Also note that $X$ and $Y$ are dependent. For instance, $P(Y > 3) \ne 0$, because $Y$ can take values larger than $3$. On the other hand, $P(Y > 3 \mid X = 2) = 0$. So $X$ and $Y$ are dependent: once $X$ is given, for instance, then $Y$ can be no larger than $X$.

41a. The conditional mass function of $X$ given $Y = 1$ is
$$p_{X|Y}(1|1) = \frac{p(1,1)}{p_Y(1)} = \frac{1/8}{\frac{1}{8} + \frac{1}{8}} = 1/2 \qquad \text{and} \qquad p_{X|Y}(2|1) = \frac{p(2,1)}{p_Y(1)} = \frac{1/8}{\frac{1}{8} + \frac{1}{8}} = 1/2.$$
The conditional mass function of $X$ given $Y = 2$ is
$$p_{X|Y}(1|2) = \frac{p(1,2)}{p_Y(2)} = \frac{1/4}{\frac{1}{4} + \frac{1}{2}} = 1/3 \qquad \text{and} \qquad p_{X|Y}(2|2) = \frac{p(2,2)}{p_Y(2)} = \frac{1/2}{\frac{1}{4} + \frac{1}{2}} = 2/3.$$

41b. Since the conditional mass of $X$ changes depending on the value of $Y$, the value of $Y$ affects the various probabilities for $X$, so $X$ and $Y$ are not independent.

41c. We compute
$$P(XY \le 3) = p(1,1) + p(2,1) + p(1,2) = \frac{1}{8} + \frac{1}{8} + \frac{1}{4} = \frac{1}{2},$$
and
$$P(X + Y > 2) = p(2,1) + p(1,2) + p(2,2) = \frac{1}{8} + \frac{1}{4} + \frac{1}{2} = \frac{7}{8},$$
and
$$P(X/Y > 1) = p(2,1) = 1/8.$$

54a. We see that $u = g_1(x,y) = xy$ and $v = g_2(x,y) = x/y$. Thus $x = h_1(u,v) = \sqrt{uv}$ and $y = h_2(u,v) = \sqrt{u/v}$. The Jacobian is
$$J(x,y) = \begin{vmatrix} y & x \\ \frac{1}{y} & -\frac{x}{y^2} \end{vmatrix} = -\frac{x}{y} - \frac{x}{y} = -\frac{2x}{y},$$
so $|J(x,y)|^{-1} = \frac{y}{2x}$. Therefore the joint density of $U, V$ is
$$f_{U,V}(u,v) = f_{X,Y}(x,y)\,|J(x,y)|^{-1} = \frac{1}{x^2y^2}\cdot\frac{y}{2x} = \frac{1}{2x^3y} = \frac{1}{2(\sqrt{uv})^3\sqrt{u/v}} = \frac{1}{2u^2v}.$$
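The conditional masses in Problem 40 above can also be confirmed by brute-force enumeration of the $36$ equally likely outcomes of the two dice, with $X$ the larger and $Y$ the smaller value. The following sketch is an added illustration (the choice $i = 4$ is arbitrary), not part of the original solution.

```python
# Enumeration check (added illustration) of Problem 40's conditional mass of Y given X = i.
from fractions import Fraction as F
from itertools import product

counts = {}
for d1, d2 in product(range(1, 7), repeat=2):
    x, y = max(d1, d2), min(d1, d2)
    counts[(x, y)] = counts.get((x, y), 0) + 1

i = 4
px = sum(c for (x, _), c in counts.items() if x == i)   # 2i - 1 = 7 outcomes with X = i
for y in range(1, i + 1):
    cond = F(counts.get((i, y), 0), px)
    expected = F(2, 2 * i - 1) if y < i else F(1, 2 * i - 1)
    assert cond == expected
print("Problem 40 conditional masses confirmed for i =", i)
```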
57. We see that $y_1 = g_1(x_1, x_2) = x_1 + x_2$ and $y_2 = g_2(x_1, x_2) = e^{x_1}$. Thus $x_1 = h_1(y_1, y_2) = \ln(y_2)$ and $x_2 = h_2(y_1, y_2) = y_1 - \ln(y_2)$. The Jacobian is
$$J(x_1, x_2) = \begin{vmatrix} 1 & 1 \\ e^{x_1} & 0 \end{vmatrix} = -e^{x_1},$$
so $|J(x_1, x_2)|^{-1} = e^{-x_1}$. Therefore the joint density of $Y_1, Y_2$ is
$$f_{Y_1,Y_2}(y_1,y_2) = f_{X_1,X_2}(x_1,x_2)\,|J(x_1,x_2)|^{-1} = \lambda_1\lambda_2 e^{-\lambda_1 x_1 - \lambda_2 x_2}\, e^{-x_1} = \lambda_1\lambda_2 e^{-((\lambda_1+1)x_1 + \lambda_2 x_2)} = \lambda_1\lambda_2 e^{-((\lambda_1+1)\ln(y_2) + \lambda_2(y_1 - \ln(y_2)))} = \lambda_1\lambda_2\, y_2^{-\lambda_1 + \lambda_2 - 1}\, e^{-\lambda_2 y_1}.$$

THEORETICAL EXERCISES

2. Generalizing Proposition 2.1 (there is nothing special about two variables), we note that discrete random variables $X_1, \dots, X_n$ are independent if and only if their joint mass $f(x_1, \dots, x_n)$ can be factored as $f_1(x_1)\cdots f_n(x_n)$; in this case, once each $f_i$ is normalized so that $\sum_{x_i} f_i(x_i) = 1$ for each $i$, the $f_i$'s are the marginal mass functions of the $X_i$'s. Write $X$ for the total number of events in the given time period, and write $X_i$ for the number of events of type $i$. Then we can factor the joint mass $f(x_1, \dots, x_n)$ of $X_1, \dots, X_n$ by writing
$$\begin{aligned}
f(x_1, \dots, x_n) &= P(X = x_1 + \cdots + x_n)\,P(X_1 = x_1, \dots, X_n = x_n \mid X = x_1 + \cdots + x_n) \\
&= \frac{e^{-\lambda}\lambda^{x_1+\cdots+x_n}}{(x_1+\cdots+x_n)!}\binom{x_1+\cdots+x_n}{x_1, \dots, x_n} p_1^{x_1}\cdots p_n^{x_n} \\
&= \frac{e^{-\lambda}\lambda^{x_1+\cdots+x_n}}{(x_1+\cdots+x_n)!}\cdot\frac{(x_1+\cdots+x_n)!}{x_1!\cdots x_n!}\, p_1^{x_1}\cdots p_n^{x_n} \\
&= \frac{e^{-\lambda}\lambda^{x_1}\cdots\lambda^{x_n}}{x_1!\cdots x_n!}\, p_1^{x_1}\cdots p_n^{x_n} \\
&= \frac{e^{-\lambda p_1}(\lambda p_1)^{x_1}}{x_1!}\cdots\frac{e^{-\lambda p_n}(\lambda p_n)^{x_n}}{x_n!}.
\end{aligned}$$
So $\frac{e^{-\lambda p_i}(\lambda p_i)^{x_i}}{x_i!}$ is the mass of $X_i$, and also $X_1, \dots, X_n$ are independent.

5a. For $a > 0$, the cumulative distribution function of $Z$ is $F_Z(a) = P(Z \le a) = P(X/Y \le a) = P(X \le aY) = \int_0^\infty \int_0^{ay} f_X(x)f_Y(y)\,dx\,dy = \int_0^\infty f_Y(y)\int_0^{ay} f_X(x)\,dx\,dy = \int_0^\infty f_Y(y)F_X(ay)\,dy$, or equivalently, for $z > 0$, we have
$$F_Z(z) = \int_0^\infty f_Y(y)F_X(zy)\,dy.$$
Differentiating throughout with respect to $z$ yields
$$f_Z(z) = \int_0^\infty f_Y(y)f_X(zy)\,y\,dy.$$
Of course, for $z \le 0$, we have $f_Z(z) = 0$. When $X, Y$ are independent exponential random variables with parameters $\lambda_1, \lambda_2$, this yields
$$f_Z(z) = \int_0^\infty \lambda_2 e^{-\lambda_2 y}\,\lambda_1 e^{-\lambda_1 z y}\,y\,dy = \frac{\lambda_1\lambda_2}{(\lambda_2 + z\lambda_1)^2}$$
for $z > 0$; as before, $f_Z(z) = 0$ for $z \le 0$.

5b. For $a > 0$, the cumulative distribution function of $Z$ is $F_Z(a) = P(Z \le a) = P(XY \le a) = P(X \le a/Y) = \int_0^\infty \int_0^{a/y} f_X(x)f_Y(y)\,dx\,dy = \int_0^\infty f_Y(y)\int_0^{a/y} f_X(x)\,dx\,dy = \int_0^\infty f_Y(y)F_X(a/y)\,dy$, or equivalently, for $z > 0$, we have
$$F_Z(z) = \int_0^\infty f_Y(y)F_X(z/y)\,dy.$$
Differentiating throughout with respect to $z$ yields
$$f_Z(z) = \int_0^\infty f_Y(y)f_X(z/y)\,\frac{1}{y}\,dy.$$
Of course, for $z \le 0$, we have $f_Z(z) = 0$. When $X, Y$ are independent exponential random variables with parameters $\lambda_1, \lambda_2$, this yields
$$f_Z(z) = \int_0^\infty \lambda_2 e^{-\lambda_2 y}\,\lambda_1 e^{-\lambda_1 z/y}\,\frac{1}{y}\,dy$$
for $z > 0$, but I do not see an easy way to simplify this expression; as before, $f_Z(z) = 0$ for $z \le 0$.

6. Method 1. My solution does NOT use induction. If the $X_i$ are independent and identically distributed geometric random variables, each with probability $p$ of success, then
$$\begin{aligned}
P(X_1 + \cdots + X_n = i) &= \sum_{x_1+\cdots+x_n = i} \left((1-p)^{x_1-1}p\right)\cdots\left((1-p)^{x_n-1}p\right) \\
&= p^n(1-p)^{-n}\sum_{x_1+\cdots+x_n = i} (1-p)^{x_1+\cdots+x_n} \\
&= p^n(1-p)^{-n}\sum_{x_1+\cdots+x_n = i} (1-p)^{i} \\
&= \frac{p^n}{(1-p)^n}(1-p)^i\binom{i-1}{n-1} \qquad \text{using Proposition 6.1 of Chapter 1} \\
&= \binom{i-1}{n-1} p^n (1-p)^{i-n},
\end{aligned}$$
and thus $X_1 + \cdots + X_n$ has a negative binomial distribution with parameters $p$ and $n$.

Method 2. Here is a solution with no derivation needed. Do $n$ consecutive experiments; in each experiment, flip a coin with probability $p$ of landing heads until the first head appears, so that $X_i$ is the number of flips in the $i$th experiment. Then $X_1 + \cdots + X_n$ is the total number of flips required. Of course, the total number of flips is a negative binomial random variable, because the experiment ends when we see the $n$th success, or in other words, when we see the appearance of the $n$th head.
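The conclusion of Theoretical Exercise 6 can also be checked by exact convolution. The sketch below is an added illustration (the choices $p = 1/3$, $n = 3$, and the truncation point are arbitrary), not part of the original solution; it compares the $n$-fold convolution of the geometric mass with the negative binomial mass derived above.

```python
# Exact convolution check (added illustration) of Theoretical Exercise 6:
# the sum of n i.i.d. Geometric(p) variables has pmf C(i-1, n-1) p^n (1-p)^(i-n).
from fractions import Fraction as F
from math import comb

p, n, max_i = F(1, 3), 3, 15
geom = {k: (1 - p) ** (k - 1) * p for k in range(1, max_i + 1)}

# Convolve the geometric pmf with itself n times (exact for sums up to max_i).
dist = {0: F(1)}
for _ in range(n):
    new = {}
    for s, ps in dist.items():
        for k, pk in geom.items():
            if s + k <= max_i:
                new[s + k] = new.get(s + k, F(0)) + ps * pk
    dist = new

for i in range(n, max_i + 1):
    assert dist[i] == comb(i - 1, n - 1) * p**n * (1 - p) ** (i - n)
print("Sum of geometrics matches the negative binomial pmf.")
```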
9. For convenience, write $Y = \min(X_1, \dots, X_n)$. For $a \le 0$, we know that $F_Y(a) = 0$, since $Y$ is never negative. For $a > 0$, we have $F_Y(a) = P(Y \le a) = 1 - P(Y > a) = 1 - P(\min(X_1, \dots, X_n) > a) = 1 - P(X_1 > a, X_2 > a, \dots, X_n > a) = 1 - P(X_1 > a)P(X_2 > a)\cdots P(X_n > a) = 1 - (e^{-\lambda a})(e^{-\lambda a})\cdots(e^{-\lambda a}) = 1 - e^{-\lambda n a}$. Thus $Y$ is exponentially distributed with parameter $n\lambda$.

10. The flashlight begins with $2$ batteries installed and $n - 2$ replacement batteries available. When one battery dies, it is immediately replaced, and because of the memoryless property of exponential random variables, both installed batteries (the old and the new) are just as good as new, and the waiting begins again. A dead battery is replaced a total of $n - 2$ times. Upon the $(n-1)$st battery dying, we do not have enough batteries left to run the flashlight anymore. So the length of time that the flashlight can operate is $X_1 + \cdots + X_{n-1}$, where the $X_i$'s are independent, identically distributed exponential random variables, each with parameter $\lambda$. [So the length of time that the flashlight can operate is a gamma random variable with parameters $(n-1, \lambda)$; see, for instance, Example 3b on page 282. If you did not get this last sentence, don't worry, because you are not required to understand gamma random variables in my course.]

14a. We think of $X$ as the number of flips of a coin until the first appearance of heads, and $Y$ as the number of additional flips until the second appearance of heads; here, "heads" appears on each toss with probability $p$. Given that $X + Y = n$, we know that $n$ flips were required to reach the second head. So any of the first $n - 1$ flips could be the first head, and all such $n - 1$ possibilities are equally likely, so $P(X = i \mid X + Y = n) = \frac{1}{n-1}$ for $1 \le i \le n - 1$ and $P(X = i \mid X + Y = n) = 0$ otherwise.

14b. To verify this, we note that if $i \le 0$ or $i \ge n$, we must have $P(X = i \mid X + Y = n) = 0$, because $X, Y$ are positive random variables. For $1 \le i \le n - 1$, we compute
$$\begin{aligned}
P(X = i \mid X + Y = n) &= \frac{P(X = i \text{ and } X + Y = n)}{P(X + Y = n)} = \frac{P(X = i, Y = n - i)}{P(X + Y = n)} = \frac{P(X = i)P(Y = n - i)}{\sum_{j=1}^{n-1} P(X = j)P(Y = n - j)} \\
&= \frac{(1-p)^{i-1}p\,(1-p)^{n-i-1}p}{\sum_{j=1}^{n-1}(1-p)^{j-1}p\,(1-p)^{n-j-1}p} = \frac{(1-p)^{n-2}}{(n-1)(1-p)^{n-2}} = \frac{1}{n-1}.
\end{aligned}$$

15. Method 1. We compute
$$\begin{aligned}
P(X = i \mid X + Y = m) &= \frac{P(X = i \text{ and } X + Y = m)}{P(X + Y = m)} = \frac{P(X = i, Y = m - i)}{P(X + Y = m)} = \frac{P(X = i)P(Y = m - i)}{\sum_{j=0}^m P(X = j)P(Y = m - j)} \\
&= \frac{\binom{n}{i}p^i(1-p)^{n-i}\binom{n}{m-i}p^{m-i}(1-p)^{n-m+i}}{\sum_{j=0}^m \binom{n}{j}p^j(1-p)^{n-j}\binom{n}{m-j}p^{m-j}(1-p)^{n-m+j}} = \frac{\binom{n}{i}\binom{n}{m-i}}{\sum_{j=0}^m \binom{n}{j}\binom{n}{m-j}} = \frac{\binom{n}{i}\binom{n}{m-i}}{\binom{2n}{m}}.
\end{aligned}$$
Method 2. As in Ross's hint, flip $2n$ coins, let $X$ denote the number of heads in the first sequence of $n$ flips, and let $Y$ denote the number of heads in the second sequence of $n$ flips. If $m$ flips are heads altogether, there are $\binom{2n}{m}$ equally likely possibilities, exactly $\binom{n}{i}\binom{n}{m-i}$ of which have $i$ heads in the first sequence of $n$ flips and the other $m - i$ heads in the second sequence of $n$ flips. So $P(X = i \mid X + Y = m) = \binom{n}{i}\binom{n}{m-i}/\binom{2n}{m}$, and thus, given $X + Y = m$, we see that the conditional distribution of $X$ is hypergeometric.
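The hypergeometric conditional law in Theoretical Exercise 15 can be verified exactly for small cases. The following sketch is an added check with arbitrary choices of $n$, $m$, and $p$; as the derivation above shows, the conditional mass should not depend on $p$.

```python
# Exact check (added illustration) of Theoretical Exercise 15: if X, Y are
# independent Binomial(n, p), then X | X + Y = m is hypergeometric.
from fractions import Fraction as F
from math import comb

def binom_pmf(n, p, k):
    return F(comb(n, k)) * p**k * (1 - p) ** (n - k)

n, m = 6, 5
p = F(2, 5)                                   # any p in (0, 1) gives the same conditional law
denom = sum(binom_pmf(n, p, j) * binom_pmf(n, p, m - j) for j in range(0, m + 1))
for i in range(0, m + 1):
    cond = binom_pmf(n, p, i) * binom_pmf(n, p, m - i) / denom
    assert cond == F(comb(n, i) * comb(n, m - i), comb(2 * n, m))
print("Conditional law matches the hypergeometric pmf.")
```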
18a. Given the condition $U > a$, then $U$ is uniformly distributed on the interval $(a, 1)$. To see this, just consider any $b$ with $a < b < 1$. Then
$$P(U < b \mid U > a) = \frac{P(a < U < b)}{P(U > a)} = \frac{b - a}{1 - a}.$$
Differentiating with respect to $b$ yields the conditional density of $U$, namely $\frac{1}{1-a}$, which is constant (since $a$ is fixed in this problem). So the conditional distribution of $U$ is uniform on the interval $(a, 1)$, as we stated at the start.

18b. Given the condition $U < a$, then $U$ is uniformly distributed on the interval $(0, a)$. To see this, just consider any $b$ with $0 < b < a$. Then
$$P(U < b \mid U < a) = \frac{P(U < b)}{P(U < a)} = \frac{b}{a}.$$
Differentiating with respect to $b$ yields the conditional density of $U$, namely $\frac{1}{a}$, which is constant (since $a$ is fixed in this problem). So the conditional distribution of $U$ is uniform on the interval $(0, a)$, as we stated at the start.
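Finally, Theoretical Exercise 18a lends itself to a short simulation: conditioning a Uniform$(0,1)$ sample on $U > a$ should reproduce the Uniform$(a,1)$ distribution. The values of $a$ and $b$ below are arbitrary; this is an added illustration, not part of the original solution.

```python
# Monte Carlo check (added illustration) of Theoretical Exercise 18a.
import random

a, b = 0.3, 0.6
samples = [u for u in (random.random() for _ in range(500_000)) if u > a]
frac = sum(u < b for u in samples) / len(samples)
print("simulated P(U < b | U > a):", round(frac, 3))
print("(b - a)/(1 - a):           ", round((b - a) / (1 - a), 3))   # about 0.429
```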