Stat 61 - Homework #5 - SOLUTIONS (10/29/07)
1. Let X be a random variable with E(X) = 100 and Var(X) = 15. What are…
a. E ( X2 ) (Not 10000)
Var(X) = E(X2) – E(X)2, so 15 = E(X2) – 10000, so E(X2) = 10015.
b. E ( 3X + 10 )
=3 E(X) + 10 = 310
c. E (-X)
= –100
d. Standard deviation of –X ?
= √15 ≈ 3.9, NOT –3.9
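(Here's a quick numerical sanity check of these rules in Python with numpy — not part of the assigned solution. The normal shape is an arbitrary assumption; any distribution with mean 100 and variance 15 would do, since only the mean and variance matter for these rules.)

    # Sanity check of the expectation/variance rules in problem 1 (sketch only).
    import numpy as np

    rng = np.random.default_rng(0)
    # Arbitrary choice: a normal distribution with mean 100 and variance 15.
    x = rng.normal(loc=100, scale=np.sqrt(15), size=1_000_000)

    print(np.mean(x**2))      # ~ 10015  (E[X^2] = Var(X) + E[X]^2)
    print(np.mean(3*x + 10))  # ~ 310    (E[3X + 10] = 3 E[X] + 10)
    print(np.mean(-x))        # ~ -100   (E[-X] = -E[X])
    print(np.std(-x))         # ~ 3.9    (SD(-X) = SD(X) = sqrt(15))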
2. Let X be the random variable with pdf
fX(x) = 2x if 0 ≤ x ≤ 1, and 0 otherwise.
a. What is the cdf, FX(a) ?
FX(a) = ∫_0^a fX(x) dx = a² for 0 ≤ a ≤ 1; in full,

FX(a) =  0    if a ≤ 0
         a²   if 0 ≤ a ≤ 1
         1    if a ≥ 1
b. Let Y be another random variable defined by Y = X2. What is its cdf,
FY (b) ?
( This is possible because if you know P( X ≤ a ) for every a, you can get
P( Y ≤ b ) for any b.)
FY(b) = P( Y ≤ b )         (by definition of FY)
      = P( X² ≤ b )        (same event)
(This is clearly zero if b < 0, so from here on assume b ≥ 0.)
      = P( X ≤ √b )        (again, it's the same event, at least when b is nonnegative)
      = FX( √b )           (by definition of FX)
      = (√b)² = b          (or: 1 if b ≥ 1, 0 if b ≤ 0)
c. What is the pdf, fY ?
It’s the derivative of FY…
fY(b) =  0    if b < 0
         1    if 0 ≤ b ≤ 1
         0    if b > 1
d. Compute E(Y) directly from FY. (Use the secret formula.) (See website if
necessary)
E(Y) = ∫_0^∞ [1 − FY(y)] dy − ∫_−∞^0 FY(y) dy
     = ∫_0^1 (1 − y) dy − 0
     = 1/2.
e. Compute E(Y) from fY.
E(Y) = ∫_−∞^∞ y fY(y) dy = ∫_0^1 y · 1 dy = 1/2
f. Compute E(Y) = E(X2) directly from fX. (If your answers to d-e-f don’t
agree, panic.)
E(Y) = E(X²) = ∫_−∞^∞ x² fX(x) dx = ∫_0^1 x² (2x) dx = 1/2
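(If you want to check parts a–f numerically, here's a sketch in Python with numpy — not part of the assigned solution. It samples X by inverse-CDF sampling, using the fact from part a that FX(a) = a², so X = √U for U uniform on [0, 1].)

    # Numerical check of problem 2 (sketch only).
    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.uniform(size=1_000_000)
    x = np.sqrt(u)    # X has cdf a^2 on [0, 1], i.e. pdf 2x
    y = x**2          # Y = X^2; parts b-c say Y is uniform on [0, 1]

    print(np.mean(y <= 0.3))  # ~ 0.3, matching F_Y(b) = b
    print(np.mean(y))         # ~ 0.5, matching E(Y) = 1/2 from parts d and e
    print(np.mean(x**2))      # same number, computed as E(X^2) as in part f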
3. (Sum of random variables) Let X and Y be independent random variables, with
these pmf’s:
pX(n) = (1/3) for n = 1, 2, 3; else 0
pY(n) = (1/2) for n = 1, 2; else 0
Let Z = X + Y. Using the formula
pZ(n) = Σ_all m  pX(m) pY(n − m),
[ = P(X=1) P(Y=n−1) + P(X=2) P(Y=n−2) + P(X=3) P(Y=n−3) etc. ]
construct the pmf for Z.
P ( Z = 2 ) = P( X=1)P(Y=1) = (1/3)(1/2) = 1/6
P ( Z = 3 ) = P(X=1)P(Y=2) + P(X=2)P(Y=1) = (1/3)(1/2) + (1/3)(1/2) = 2/6
P ( Z = 4 ) = P(X=2)P(Y=2) + P(X=3)P(Y=1) = etc. = 2/6
P ( Z = 5 ) = P(X=3)P(Y=2) = etc. = 1/6
(I guess there are infinitely many other terms in each of these sums,
but they are all zero)
so the pmf for Z is
z        2     3     4     5
pZ(z)   1/6   2/6   2/6   1/6
(total: 1)
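(The same convolution can be done mechanically; here's a sketch in Python with numpy, not part of the assigned solution. np.convolve of the two pmf vectors carries out exactly the sum over m in the formula above; the support of Z starts at 1 + 1 = 2.)

    # Convolution check for problem 3 (sketch only).
    import numpy as np

    p_x = np.array([1/3, 1/3, 1/3])  # P(X = 1), P(X = 2), P(X = 3)
    p_y = np.array([1/2, 1/2])       # P(Y = 1), P(Y = 2)

    p_z = np.convolve(p_x, p_y)      # pmf of Z = X + Y on 2, 3, 4, 5
    for z, p in zip(range(2, 2 + len(p_z)), p_z):
        print(z, p)                  # 1/6, 2/6, 2/6, 1/6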
4. Now let X and Y be independent Poisson random variables, with means λX and
λY respectively. Let Z = X + Y. Using the formula from problem 3, derive the
pmf for Z.
(Hint: The binomial formula is (a + b)^n = Σ_{m=0}^{n} [ n! / (m! (n − m)!) ] a^m b^{n−m}.)
In what follows, x, y, and z are integers (being typical values of X, Y,
Z resp.).
pZ(z) = Σ_all x  pX(x) pY(z − x)

      = Σ_{x=0}^{z} [ (λX^x / x!) e^{−λX} ] [ (λY^{z−x} / (z − x)!) e^{−λY} ]

      = e^{−(λX + λY)}  Σ_{x=0}^{z} λX^x λY^{z−x} / ( x! (z − x)! )

      = e^{−(λX + λY)} (1/z!)  Σ_{x=0}^{z} [ z! / (x! (z − x)!) ] λX^x λY^{z−x}

      = e^{−(λX + λY)} (1/z!) (λX + λY)^z.
This is the Poisson pmf with mean λX + λY. So: the sum of two
independent Poisson random variables is itself a Poisson
random variable (with mean equal to the sum of the means).
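(Here's a numerical check of this result, a sketch in Python with scipy — not part of the assigned solution. The particular values λX = 2 and λY = 3.5 are arbitrary choices; convolving the two Poisson pmfs should reproduce the Poisson(λX + λY) pmf.)

    # Check that the convolution of two Poisson pmfs is the Poisson(sum) pmf.
    import numpy as np
    from scipy.stats import poisson

    lam_x, lam_y = 2.0, 3.5          # arbitrary means for the check
    z = np.arange(30)

    # Direct convolution:  p_Z(z) = sum over x of p_X(x) p_Y(z - x)
    conv = np.array([sum(poisson.pmf(x, lam_x) * poisson.pmf(n - x, lam_y)
                         for x in range(n + 1)) for n in z])

    print(np.max(np.abs(conv - poisson.pmf(z, lam_x + lam_y))))  # ~ 0 (round-off only)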
5. (The mother of all Poisson examples) Do problem 4.2.10, page 287. Either actually
read the problem and do what it says, or do parts a and b below, which amount to
the same thing. Specifically, given the following distribution…
k       Observed number of corps-years
        in which k fatalities occurred
0            109
1             65
2             22
3              3
4              1
≥5             0
total        200
[This is a frequency table describing 200 annual reports from pre-WWI Prussian
cavalry corps, each giving a number of soldiers killed by being kicked by horses.]
It is suggested that these values represent 200 independent draws from a Poisson
distribution.
a. What is a reasonable estimate for the mean of the distribution?
One reasonable estimate is the mean of the 200 observed values,
which is
 = [ 109(0)+65(1)+22(2)+3(3)+1(4) ] / 200 = 0.61.
(As we’ll see soon, this is the method-of-moments estimator for the
mean, which happens to agree in this situation with the maximum
likelihood estimator.)
b. What would the entries in the right-hand column be if the 200 draws were
exactly distributed according to this Poisson distribution? (The entries would not
be integers.)
Using λ = 0.61:
k       observed    theoretical
0         109       108.67  = (200)(0.61^0 / 0!)(e^−0.61)
1          65        66.29
2          22        20.22
3           3         4.11
4           1         0.63
≥5          0         0.09
total     200       200
(In November we’ll address this question: Comparing the original data to the
theoretical values in part b, is it plausible that these really are draws from a
Poisson distribution? Or should we discard that theory?)
This looks to me like a very good fit --- even suspiciously good --- but
we may try a Chi-Square test later.
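(Here's a sketch in Python with scipy that reproduces the "theoretical" column of part b from λ = 0.61 — not part of the assigned solution.)

    # Expected counts under Poisson(0.61) for 200 corps-years (sketch only).
    from scipy.stats import poisson

    lam = 0.61
    for k in range(5):
        print(k, 200 * poisson.pmf(k, lam))  # 108.67, 66.29, 20.22, 4.11, 0.63
    print(">=5", 200 * poisson.sf(4, lam))   # remaining tail mass, ~ 0.09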
6. (Editing a Poisson distribution) At a certain boardwalk attraction, customers appear
according to a Poisson process at a rate of λ = 15 customers/hour. So, if X is
the number of customers appearing between noon and 1pm, X has a Poisson
distribution with mean 15.
Assume that each customer wins a prize with (independent) probability 1/5.
Let Y be the number of customers winning prizes between noon and 1pm.
One way to understand Y is that if the value of X is given, then Y is binomial,
with parameters n = [value of X] and p = 1/5. This reasoning gives us:
P(Y = k) = Σ_{n≥k} P(X = n) · (n choose k) (1/5)^k (4/5)^{n−k}
(The second part of the summand is the probability that Y=k given that X=n.)
a. Simplify this formula, to show that Y itself has a Poisson distribution with
mean 3.
P(Y = k) = Σ_{n≥k} P(X = n) · (n choose k) (1/5)^k (4/5)^{n−k}

         = Σ_{n≥k} [ (15^n / n!) e^{−15} ] [ n! / (k! (n − k)!) ] (1/5)^k (4/5)^{n−k}

         = e^{−15} (1/k!) Σ_{n≥k} [ 1 / (n − k)! ] (15^n / 5^n) 4^{n−k}

         = e^{−15} (3^k / k!) Σ_{n≥k} (3^{n−k} · 4^{n−k}) / (n − k)!

(...using (1/5)^k (4/5)^{n−k} = 4^{n−k} / 5^n and 15^n / 5^n = 3^n = (3^k)(3^{n−k}).)

When we substitute m for n − k in the last sum, we get

Σ_{m=0}^{∞} 12^m / m!,

which is the Taylor series for e^{+12}; so

P(Y = k) = (3^k / k!) e^{−3}

as desired.
(Or, if after getting off to a good start you find this problem too annoying, do
part b instead.)
b. Find a much simpler explanation of why Y should be Poisson with mean 3.
If the probability of a customer arriving in a small interval is 15
times the length of the interval in hours, independently of what
happens in other intervals, then the probability of a winner appearing
during that small interval is 3 times the length of the interval, and is
still independent of what happens in other intervals.
That means that the “winners” process is a Poisson process with rate 3 per
hour, so the number of winners in one hour is Poisson with mean 3, and the
result follows.
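(A simulation sketch of this “thinning” argument, in Python with numpy — not part of the assigned solution. Draw X ~ Poisson(15) customers for each of many hours, flip a 1/5 coin for each customer, and compare the winner counts to what a Poisson with mean 3 would give.)

    # Simulation check of problem 6 (sketch only).
    import numpy as np

    rng = np.random.default_rng(0)
    n_hours = 200_000

    x = rng.poisson(15, size=n_hours)   # customers per hour, X ~ Poisson(15)
    y = rng.binomial(x, 1/5)            # winners per hour: Binomial(X, 1/5) given X

    print(y.mean(), y.var())            # both ~ 3, as a Poisson(3) count requires
    print(np.mean(y == 0), np.exp(-3))  # P(Y = 0) ~ e^{-3} ~ 0.0498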
7. (Waiting times) Buses arrive according to a Poisson process with rate (1/10) min⁻¹.
Let W be the waiting time from time t = 0 till the arrival of the second bus.
a. What is E(W) ?
If X is the time we have to wait for the first bus, then X is
exponentially distributed with mean 10. After the first bus arrives,
we can start waiting for the second bus, and if Y is the time that
takes, then Y also has mean 10. Now W = X + Y. It doesn’t matter
whether X and Y are independent (they are) --- in any case,
E(W) = E(X) + E(Y) = 20 minutes.
Part a is of no use in solving part b.
b. Can you construct a pdf or cdf for W ?
( Good start: P(W ≤ t) = 1 – P(exactly 0 buses or exactly 1 bus between
times 0 and t). Another approach: W is the sum of two independent one-bus
waiting times.)
P ( W ≤ t ) = 1 – exp(-(1/10) t) – (1/10) t exp(-(1/10) t).
That’s the same as FW(t).
If you want the density function fW(t), just differentiate FW(t);
things cancel and you get
fW(t) = (1/100) t exp(-(1/10)t ).
This happens to be the density for (a special case of) the gamma
distribution.
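(Here's a sketch checking part b, in Python with numpy and scipy — not part of the assigned solution. It simulates W as the sum of two independent exponential waiting times with mean 10 and compares against the cdf written above, which is the Gamma cdf with shape 2 and scale 10.)

    # Check of the waiting-time cdf in problem 7b (sketch only).
    import numpy as np
    from scipy.stats import gamma

    rng = np.random.default_rng(0)
    w = rng.exponential(10, size=1_000_000) + rng.exponential(10, size=1_000_000)

    t = 25.0
    print(np.mean(w <= t))                             # simulated P(W <= t)
    print(1 - np.exp(-t/10) - (t/10) * np.exp(-t/10))  # formula from part b
    print(gamma.cdf(t, a=2, scale=10))                 # same thing via scipy
    print(w.mean())                                    # ~ 20 minutes, as in part a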
8. (The first boring normal tables problem) If Z is a standard normal variable, what
are…
a. P ( -1.0 ≤ Z ≤ +1.0 ) = 0.68
b. P ( -2.0 ≤ Z ≤ +2.0 ) = 0.95
c. P ( -3.0 ≤ Z ≤ +3.0 ) = 0.997
( by the “Rule of 68 – 95 – 99.7” )
( or, use Excel’s NORMSDIST function to get 0.682689, 0.9545,
0.9973 )
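(The same three probabilities from scipy instead of Excel — a sketch, not part of the assigned solution.)

    # 68-95-99.7 from the standard normal cdf (sketch only).
    from scipy.stats import norm

    for a in (1.0, 2.0, 3.0):
        print(a, norm.cdf(a) - norm.cdf(-a))  # 0.6827, 0.9545, 0.9973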
9. (The second boring normal tables problem) Let Z be a standard normal variable.
a. If P ( -a ≤ Z ≤ +a ) = 0.95, what is a ?
Find z for which Phi(z) = 0.975 --- namely, z = 1.96.
(Or: =NORMSINV(0.975) = 1.9600 )
b. If P ( -b ≤ Z ≤ +b ) = 0.99, what is b ?
Find z for which Phi(z) = 0.995 --- namely, z = 2.58
(Or: =NORMSINV(0.995) = 2.575829 )
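(The scipy equivalent of NORMSINV, for parts a and b — a sketch, not part of the assigned solution.)

    # Inverse standard normal cdf (sketch only).
    from scipy.stats import norm

    print(norm.ppf(0.975))  # 1.9600  (part a)
    print(norm.ppf(0.995))  # 2.5758  (part b)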
10. (The third boring normal tables problem)
a. If X is normal with mean 500 and standard deviation 110,
what is P ( X ≥ 800 ) ?
If Z = (X-500)/110 then Z is standard normal.
P ( X ≥ 800 ) = P ( Z ≥ (800−500)/110 ) = P ( Z ≥ 2.7272 ) =
1 – Φ(2.7272) ≈ 0.0032
b. If X is normal with mean 500 and standard deviation 120,
what is P ( X ≥ 800 ) ?
If Z = (X-500)/120 then Z is standard normal.
P ( X ≥ 800 ) = P ( Z ≥ (800−500)/120 ) = P ( Z ≥ 2.5 ) =
1 – Φ(2.5) = 0.0062
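(Parts a and b without tables — a sketch in Python with scipy, not part of the assigned solution. You can standardize by hand as above, or hand the mean and standard deviation to scipy directly.)

    # Problem 10 via scipy (sketch only).
    from scipy.stats import norm

    print(1 - norm.cdf((800 - 500) / 110))   # ~ 0.0032, standardizing by hand (a)
    print(norm.sf(800, loc=500, scale=110))  # same answer via the survival function
    print(norm.sf(800, loc=500, scale=120))  # ~ 0.0062  (b)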
(end)