Your first and last name: ___________________________________ SSN: ________________________
CODE: ______________________ (if you want me to post your grade on the web; write 6 numeric or alphanumeric characters)

STAT 211 SUMMER 2002

You have 90 minutes to complete this exam. You may use only your own calculator and the t, z, and chi-square tables. There is a 5-point penalty if you separate your exam for any reason. There are 40 questions (100 points total), and there is no partial credit on this exam. If you do not mark your final answer on your scantron, your answer will be counted as incorrect. If you do not mark your exam form on the scantron, the lower grade of the two forms will be assigned by the computer and your grade will not be corrected later. If you are caught cheating, you will get a grade of zero. Good luck.

EXAM 2 - FORM A

1. How would you determine whether the given data come from an exponential distribution?
(a) I would graph the data, x versus any f(x), to see if they make a symmetric graph.
(b) I would graph the data, x versus any f(x), to see if they make a 45° line.
(c) I would graph the ordered data against their expected normality values to see if they make a symmetric graph.
(d) I would graph the ordered data against their expected normality values to see if they make a 45° line.
(e) I would graph the ordered data against exponential data values computed using cumulative percentiles to see if they make a 45° line.

2. If the time between failures is exponentially distributed with $\lambda = 2$, what is the median time between failures?
(a) -0.6932 (b) -0.3466 (c) 0.5 (d) 0.3466 (e) 0.6932

3. If the time between failures is exponentially distributed with $\lambda = 2$, what is the probability that the next failure occurs in less than 1 minute?
(a) 0.0183 (b) 0.1353 (c) 0.5144 (d) 0.8647 (e) 0.9817

4. If E(X)=5 and Var(X)=10, find the variance of Y=5X-2.
(a) 25 (b) 48 (c) 125 (d) 248 (e) 250

5. If $X_1$ and $X_2$ are independent, normally distributed random variables, each with mean 3 and variance 9, what is the probability that $(X_1 - X_2)/3$ is between $-\sqrt{2}$ and $\sqrt{2}$?
(a) 0.1587 (b) 0.6826 (c) 0.8413 (d) 0.9544 (e) 0.9772

Let X be a continuous random variable with the legitimate probability density function (pdf)
$f(x) = \begin{cases} kx, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
Answer questions 6 to 8 using the information above.

6. Which of the following makes f(x) a legitimate pdf?
(a) k=0.25 (b) k=0.5 (c) k=1 (d) k=1.41 (e) k=2

7. Which of the following is F(x) (the cumulative distribution function) when $0 \le x \le 1$, using the legitimate pdf?
(a) $k$ (b) $kx$ (c) $kx^2$ (d) $kx^2/2$ (e) 1

8. Which of the following is the expected value of Y=X+2, using the legitimate pdf?
(a) $kx$ (b) $kx + 2$ (c) $(k+4)/2$ (d) $(k+3)/3$ (e) $(k+6)/3$

9. Suppose that you are interested in monitoring air pollution in Los Angeles, California, over a one-week period. Let X be a random variable that represents the number of days out of seven on which the concentration of carbon monoxide surpasses a specified level. Which of the following distributions best describes this random variable?
(a) Binomial distribution (b) Hypergeometric distribution (c) Poisson distribution (d) Exponential distribution (e) Geometric distribution

10. The number of cases of tetanus reported in the United States during a single month in 1989 has an expected value of 4.5. Let X be a random variable that represents the number of cases of tetanus that will be reported in 3 months. Which of the following distributions best describes this random variable?
(a) Binomial distribution (b) Hypergeometric distribution (c) Poisson distribution (d) Exponential distribution (e) Geometric distribution

11. Which of the following is correct?
(a) Prediction intervals are narrower than confidence intervals.
(b) Prediction intervals demonstrate the confidence interval for the true mean.
(c) Prediction intervals are wider than confidence intervals.
(d) Prediction intervals demonstrate the confidence interval for the true proportion.
(e) The tolerance interval and the prediction interval are the same.

Among females in the United States between 18 and 74 years of age, diastolic blood pressure is normally distributed with mean $\mu = 77$ mm Hg and standard deviation $\sigma = 11.6$ mm Hg. Use this information to answer questions 12 to 17.

12. What is the probability that a randomly selected woman has a diastolic blood pressure less than 60 mm Hg?
(a) 0.0708 (b) 0.1470 (c) 0.5241 (d) 0.8530 (e) 0.9292

13. What is the probability that a randomly selected woman has a diastolic blood pressure greater than the mean?
(a) 0 (b) 0.25 (c) 0.50 (d) 0.75 (e) 1

14. What is the probability that a randomly selected woman has a diastolic blood pressure between 60 and 90 mm Hg?
(a) 0.1314 (b) 0.1470 (c) 0.7216 (d) 0.7978 (e) 0.8686

15. What value cuts off the lowest 2.5% of diastolic blood pressures?
(a) 54.264 (b) 57.918 (c) 96.082 (d) 99.736 (e) 106

16. If you choose a random sample of 100 women, what is the probability that the total diastolic blood pressure is less than 7500 mm Hg?
(a) 0.0172 (b) 0.0427 (c) 0.0953 (d) 0.9528 (e) 0.9828

17. If you choose a random sample of 100 women, what is the probability that the average diastolic blood pressure is less than 75 mm Hg?
(a) 0.0172 (b) 0.0427 (c) 0.0953 (d) 0.9528 (e) 0.9828

18. Which of the following is incorrect?
(a) The chi-square distribution is used to construct the confidence interval for the variance or the standard deviation.
(b) The t and z distributions are used to construct the confidence interval for mean values.
(c) No assumptions are necessary to use the chi-square distribution.
(d) Large-sample rules should be checked before using the z distribution to construct the confidence interval for the population proportion.
(e) Exactly one of the above is incorrect.

19. What is the confidence level for the interval $\bar{x} \pm 2.33\,\sigma/\sqrt{n}$?
(a) 0.90 (b) 0.95 (c) 0.98 (d) 0.99 (e) 0.999

20. Find the value of c for which $P(Z \ge c) = 0.3669$, where Z is the standard normal random variable.
(a) -0.34 (b) -0.36 (c) 0.34 (d) 0.36 (e) 0.37

21. Find the value of c for which $P(-c \le Z \le c) = 0.7498$, where Z is the standard normal random variable.
(a) -1.15 (b) -0.67 (c) 0.49 (d) 0.67 (e) 1.15

22. Which of the following can be the critical values for constructing the confidence interval for the true variance with confidence level 95% and sample size 19?
(a) 8.231 and 31.526 (b) 8.906 and 32.852 (c) 9.39 and 28.869 (d) 9.591 and 34.170 (e) 10.117 and 30.143

23. If the lower confidence bound for the true mean is $\bar{x} - 2.65\,s/\sqrt{14}$, which of the following is the confidence level?
(a) More than 0.005 and less than 0.01 (b) 0.01 (c) 0.99 (d) More than 0.99 and less than 0.995 (e) 0.995

24. If the 95% confidence interval for the true mean is (128, 150), which of the following sample means might have been used to construct this interval?
(a) 11 (b) 22 (c) 130 (d) 139 (e) 278

25. If the 95% confidence interval for the true mean is (128, 150), which of the following values might be the bound on the error of estimation used to construct this interval?
(a) 11 (b) 22 (c) 130 (d) 139 (e) 278

A joint pdf for x and y is defined by
$f(x, y) = \begin{cases} \frac{1}{3}\,x(1 + y), & 0 \le x \le 2 \text{ and } 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$
Answer questions 26 to 29 using the information above.

26. Which of the following is the marginal pdf for X?
(a) x/2 for all x (b) x/2 when 0<x<1 (c) x/2 when 0<x<2 (d) 2x when 0<x<2 (e) y+x when 0<y<1

27. Which of the following is f(y | x)?
(a) 2(1+y)/3 when 0<y<1 (b) 2(1+y)/3 when 0<x<2 (c) x(1+y)/3 when 0<y<1 (d) x(1+y)/3 when 0<x<2 (e) none of the above

28. Are X and Y independent?
(a) Yes, because $f(x,y) = f(x)f(y)$ for all x and y
(b) No, because $f(x,y) \ne f(x)f(y)$ for all x and y
(c) Yes, because $f(x,y) \ne f(x)f(y)$ for all x and y
(d) No, because $f(x,y) = f(x)f(y)$ for all x and y
(e) It is not possible to determine

29. Which of the following is the covariance between X and Y?
(a) -0.504 (b) 0 (c) 0.504 (d) 0.910 (e) 1

30. Let $X_1$ and $X_2$ be a random sample of size 2 with mean 5 and variance 4. Which of the following is the variance of $Y = 2X_1 - X_2$, where Corr($X_1$, $X_2$) = 0.5?
(a) 2 (b) 4 (c) 12 (d) 16 (e) 20

31. In estimating the mean time taken by a sheetrocker to nail one 8-foot sheet already in place, a sample of 25 observations was taken at random times. The worker was found to be nailing during just 5 of these observations. During the 200-minute span of time covering the observations, the sheetrocker hung 20 sheets. Which of the following is the point estimate of the mean nailing time per sheet?
(a) 2 (b) 4 (c) 8 (d) 20 (e) 40

32. In estimating the mean time taken by a sheetrocker to nail one 8-foot sheet already in place, a sample of 25 observations was taken at random times. The worker was found to be nailing during just 5 of these observations. Which of the following is the point estimate of the proportion of all working time spent nailing?
(a) 0.05 (b) 0.20 (c) 0.25 (d) 5 (e) 25

33. We would like to estimate the true proportion using confidence intervals. In which case is the procedure based on the normal distribution valid?
(a) n=200, $\hat{p}$=0.03 (b) n=200, $\hat{p}$=0.97 (c) n=1000, $\hat{p}$=0.02 (d) n=1000, $\hat{p}$=0.0025 (e) n=100, $\hat{p}$=0.012

34. Which of the following changes the width of a large-sample confidence interval for $\mu$?
(a) $\bar{x}$
(b) The standard deviation of the population
(c) The confidence level
(d) The sample size
(e) All of the above except (a)

35. You wish to estimate the mean contents in a shipment of 1000 cans of asparagus. After weighing a sample of 200 cans, you compute a sample mean of 15.9 ounces and a standard deviation of 0.3 ounces. Which of the following is the corresponding 95% confidence interval for the mean contents in a shipment of 1000 cans of asparagus?
(a) (15.8584, 15.9416) (b) (15.8651, 15.9349) (c) (15.8814, 15.9186) (d) (15.8844, 15.9156) (e) need more information to answer this question

36. Which of the following is the conservative sample size needed to estimate the mean contents in a shipment of cans of asparagus to within 0.1 ounces with 95% confidence, where the population standard deviation is 0.4?
(a) 43 (b) 44 (c) 60 (d) 61 (e) 62

37. The proportion of voters in a large state approving a referendum is to be estimated. On July 1, 75 out of 130 persons sampled approved of the referendum. Which of the following is the point estimate for the true proportion of all voters not approving on July 1?
(a) 0.25 (b) 0.42 (c) 0.58 (d) 0.75 (e) 0.95

38. The proportion of voters in a large state approving a referendum is to be estimated. On July 1, 75 out of 130 persons sampled approved of the referendum. Which of the following is the 95% confidence interval for the true proportion of all voters approving?
(a) (0.352, 0.494) (b) (0.492, 0.662) (c) (0.506, 0.648) (d) (0.576, 0.703) (e) (0.665, 0.814)

39. If the 95% confidence interval for the true average change of temperature is computed as (79°F, 99°F) and I claim that the true average change of temperature is 85°F, am I right based on the given confidence interval?
(a) Yes (b) No

40. Suppose I am interested in the number of children who go to school (X) in families of size 20 and the number of children who go to work in the same families (Y). Which of the following may be the values that max(X,Y) can take?
(a) 0 to 20 (integers) (b) 1 to 20 (integers) (c) 0 to infinity (integers) (d) Any number from 0 to 20 (e) Any number from 1 to 20

Answer Key:
1.e 2.d 3.d 4.e 5.b 6.e 7.d 8.e 9.a 10.c
11.c 12.a 13.c 14.d 15.a 16.b 17.b 18.c 19.c 20.c
21.e 22.a 23.c 24.d 25.a 26.c 27.a 28.a 29.b 30.c
31.a 32.b 33.c 34.e 35.a 36.e 37.b 38.b 39.a 40.a

FORMULAS

Binomial Distribution: Approximate probability model for sampling without replacement from a finite dichotomous population.
X ~ Binomial(n, p).
(i) n fixed trials
(ii) each trial is identical and results in success or failure
(iii) independent trials
(iv) the probability of success (p) is constant from trial to trial
X is the number of successes among n trials.
$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$
E(X) = np and Var(X) = np(1-p)

Hypergeometric Distribution: Exact probability model for the number of successes in the sample. X ~ Hyper(M, N, n).
$P(X = x) = \dfrac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}, \quad \max(0,\, n - N + M) \le x \le \min(n, M)$
Let X be the number of successes in the sample, n be the sample size, N be the population size, and M be the number of successes in the population.
$E(X) = n\,\dfrac{M}{N}$, where $\dfrac{M}{N}$ is the proportion of successes in the population.
$Var(X) = \dfrac{N-n}{N-1}\; n\,\dfrac{M}{N}\left(1 - \dfrac{M}{N}\right)$, where $\dfrac{N-n}{N-1}$ is the finite population correction factor.

Poisson Distribution: The probability of an arrival is proportional to the length of waiting time.
$P(X = x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, 3, \ldots$
$\lambda > 0$: intensity parameter (mean rate, expected number of occurrences).
X: number of occurrences per given period.
Note that $e^{\lambda} = \sum_{i=0}^{\infty} \dfrac{\lambda^{i}}{i!}$ and $E(X) = Var(X) = \lambda$.

Continuous probability distribution: f(x) is legitimate if
(i) $f(x) \ge 0$ for all x
(ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$ = area under the entire graph of f(x).

Cumulative Distribution Function for continuous X (cdf): $F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$.
$F(-\infty) = 0$, $F(\infty) = 1$, $P(a \le X \le b) = P(a < X < b) = F(b) - F(a)$, $P(X > a) = P(X \ge a) = 1 - F(a)$

Obtaining f(x) from F(x): If X is a continuous r.v. with pdf f(x) and cdf F(x), then at every x at which the derivative exists, $F'(x) = f(x)$.

Percentile of a continuous distribution: Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous r.v. X, denoted by r(p), is defined by
$p = F(r(p)) = P(X \le r(p)) = \int_{-\infty}^{r(p)} f(y)\,dy$,
where $P(X \le \text{median}) = P(X > \text{median}) = 0.50$.

Expected value for the continuous random variable X: $E(X) = \mu = \int_{-\infty}^{\infty} x\, f(x)\,dx$.

Variance for the continuous random variable X: $\sigma^2 = Var(X) = E(X^2) - \mu^2$, where $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.
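The legitimacy, cdf, percentile, expected value, and variance formulas above can be checked numerically. The following is a minimal sketch, not part of the original formula sheet, that uses the pdf f(x) = kx on [0, 1] from questions 6 to 8 with k = 2. The helper `integrate` (a plain midpoint rule) and the names `f`, `cdf`, and `K` are illustrative assumptions.

```python
from math import sqrt

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

K = 2.0  # assumed value of k, chosen so that f integrates to 1

def f(x):
    """pdf f(x) = kx on [0, 1], zero elsewhere."""
    return K * x if 0.0 <= x <= 1.0 else 0.0

def cdf(x):
    """F(x) = integral of f from 0 to x (f is zero below 0)."""
    return integrate(f, 0.0, x)

# (ii) total area under f should be 1
total_area = integrate(f, 0.0, 1.0)

# E(X) = integral of x f(x) dx, E(X^2) = integral of x^2 f(x) dx
mean = integrate(lambda x: x * f(x), 0.0, 1.0)
second_moment = integrate(lambda x: x * x * f(x), 0.0, 1.0)
variance = second_moment - mean ** 2  # Var(X) = E(X^2) - mu^2

# Median r(0.5) solves F(r) = k r^2 / 2 = r^2 = 0.5 when k = 2
median = sqrt(0.5)

print(total_area, mean, variance, cdf(median))
```

With k = 2 the closed forms are E(X) = 2/3 and Var(X) = 1/18, and E(X + 2) = (k + 6)/3 = 8/3, so the numerical values can be compared against those directly.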
Uniform Distribution: X ~ U[a, b], then $f(x) = \dfrac{1}{b-a}, \quad a \le x \le b$.

Normal Distribution: $X \sim N(\mu, \sigma^2)$ and $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$.

Normal Approximation to the Binomial Distribution: Let X be a binomial r.v. based on n trials with success probability p. Then if the binomial probability histogram is not too skewed, X has approximately a normal distribution with $\mu = np$ and $\sigma = \sqrt{np(1-p)}$, and
$P(X \le x) \approx P\!\left(Z \le \dfrac{x + 0.5 - np}{\sqrt{np(1-p)}}\right)$
(check that $np \ge 10$ and $n(1-p) \ge 10$ before using the formula).

The Gamma Distribution: X ~ Gamma($\alpha$, $\beta$), then
$\Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha-1} e^{-x}\,dx, \quad \alpha > 0$
$E(X) = \alpha\beta$, $Var(X) = \alpha\beta^2$
$\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$ for $\alpha > 1$; $\Gamma(n) = (n-1)!$ when n is any positive integer; $\Gamma(1/2) = \sqrt{\pi}$
If $\beta = 1$, it is called the standard gamma distribution. When the random variable is a standard gamma r.v., the cdf is called the incomplete gamma function (Appendix Table A.4).
$F(x; \alpha, \beta) = F(x/\beta; \alpha)$
If $\alpha = 1$, it is Exponential($\lambda = 1/\beta$): $f(x) = \lambda e^{-\lambda x}$, $x > 0$, and $P(X \le x^*) = 1 - e^{-\lambda x^*}$.

Continuous Data: If f(x,y) is the joint pdf for x and y, the marginal pdf for x can be computed as $\int_{y} f(x,y)\,dy$ and the marginal pdf for y can be computed as $\int_{x} f(x,y)\,dx$.
$E(X) = \int_{x}\int_{y} x\, f(x,y)\,dy\,dx = \int_{x} x\, f(x)\,dx$
$E(XY) = \int_{x}\int_{y} x\,y\, f(x,y)\,dy\,dx$
$E(X^2) = \int_{x}\int_{y} x^2 f(x,y)\,dy\,dx = \int_{x} x^2 f(x)\,dx$
$E(h(X,Y)) = \int_{x}\int_{y} h(x,y)\, f(x,y)\,dy\,dx$
$f(x \mid y) = f(x,y)/f(y)$ where $f(y) > 0$, and $E(X \mid Y) = \int_{x} x\, f(x \mid y)\,dx$
$Cov(X,Y) = E(XY) - E(X)E(Y)$ and $Corr(X,Y) = \dfrac{Cov(X,Y)}{\sqrt{Var(X)\,Var(Y)}}$
$Var(aX \pm bY) = a^2 Var(X) + b^2 Var(Y) \pm 2ab\,Cov(X,Y)$

Random Sample: The random variables $X_1, X_2, \ldots, X_n$ are said to form a random sample of size n if
(i) The $X_i$'s are independent random variables.
(ii) Every $X_i$ has the same probability distribution.
If $X_1, X_2, \ldots, X_n$ form a random sample of size n with mean $\mu$ and variance $\sigma^2$, the sampling distribution of $\bar{x}$ has mean $\mu$ and variance $\sigma^2/n$, the sampling distribution of $\sum_{i=1}^{n} x_i$ has mean $n\mu$ and variance $n\sigma^2$, and so on.

Point estimate of a parameter $\theta$: a single number that can be regarded as the most plausible value of $\theta$. A point estimator: $\hat{\theta} = \theta$ + error of estimation.
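The exponential, normal, linear-combination, and sampling-distribution formulas above can be tried on the numbers used in the exam questions. This is an illustrative sketch only: the function names (`exp_cdf`, `exp_percentile`, `cov_from_corr`, `var_linear`) are assumptions introduced here, and `statistics.NormalDist` from the Python standard library stands in for the z table. Exact normal probabilities can differ slightly in the third or fourth decimal from table lookups that round z to two places.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def exp_cdf(x, lam):
    """P(X <= x) = 1 - exp(-lambda * x) for an Exponential(lambda) r.v."""
    return 1.0 - exp(-lam * x)

def exp_percentile(p, lam):
    """Solve p = 1 - exp(-lambda * r) for r, the (100p)th percentile."""
    return -log(1.0 - p) / lam

def cov_from_corr(corr, var_x, var_y):
    """Cov(X, Y) = Corr(X, Y) * sqrt(Var(X) * Var(Y))."""
    return corr * sqrt(var_x * var_y)

def var_linear(a, b, var_x, var_y, cov_xy):
    """Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)."""
    return a * a * var_x + b * b * var_y + 2 * a * b * cov_xy

# Exponential with lambda = 2: median and P(X < 1)
median_time = exp_percentile(0.5, 2.0)
p_fail_within_1 = exp_cdf(1.0, 2.0)

# Normal: diastolic blood pressure with mu = 77, sigma = 11.6
bp = NormalDist(mu=77, sigma=11.6)
p_below_60 = bp.cdf(60)

# Sampling distributions for n = 100: total has mean n*mu and sd sqrt(n)*sigma,
# sample mean has mean mu and sd sigma/sqrt(n)
n = 100
total = NormalDist(mu=n * 77, sigma=sqrt(n) * 11.6)
xbar = NormalDist(mu=77, sigma=11.6 / sqrt(n))
p_total_below_7500 = total.cdf(7500)
p_mean_below_75 = xbar.cdf(75)

# Var(2*X1 - X2) with Var = 4 each and Corr = 0.5
var_y = var_linear(2, -1, 4, 4, cov_from_corr(0.5, 4, 4))

print(median_time, p_fail_within_1, p_below_60,
      p_total_below_7500, p_mean_below_75, var_y)
```

The total and the sample mean give the same probability because "total < 7500" and "mean < 75" describe the same event when n = 100.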
$\hat{\theta}$ is an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for every possible value of $\theta$. Otherwise, it is biased and Bias $= E(\hat{\theta}) - \theta$.

Minimum Variance Unbiased Estimator (MVUE): Among all estimators of $\theta$ that are unbiased, choose the one that has minimum variance. The resulting $\hat{\theta}$ is the MVUE.

The Invariance Principle: Let $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m$ be the MLE's of the parameters $\theta_1, \theta_2, \ldots, \theta_m$. Then the MLE of any function $h(\theta_1, \theta_2, \ldots, \theta_m)$ of these parameters is the function $h(\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m)$ of the MLE's.

Confidence Interval for a Population Mean, $\mu$: Suppose that the parameter of interest is the population mean, $\mu$, and that
a. the population distribution is normal
b. the value of the population standard deviation $\sigma$ is known
Let $X_1, X_2, \ldots, X_n$ be a random sample. Then the 100(1-$\alpha$)% confidence interval for $\mu$ is
$\left(\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right)$, where $P\!\left(-z_{\alpha/2} \le \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$.

Choosing the sample size: The bound on the error of estimation is $z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$; that is, $\bar{x}$ will be within $z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ of $\mu$. The sample size required to estimate a population mean $\mu$ to within an amount $B = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ with 100(1-$\alpha$)% confidence is $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{B}\right)^2$. The same formula can be written using the interval width, $w = 2 z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$; then $n = \left(\dfrac{2 z_{\alpha/2}\,\sigma}{w}\right)^2$.

Large Sample Confidence Interval for $\mu$: Suppose that the parameter of interest is the population mean, $\mu$, and that
a. $X_1, X_2, \ldots, X_n$ is a random sample from a population distribution with mean $\mu$ and standard deviation $\sigma$.
b. For a large sample size n, the CLT implies that $\bar{x}$ has approximately a normal distribution for any population distribution.
c. The value of the population standard deviation $\sigma$ may not be known. Instead, the value of the sample standard deviation s may be known.
If n is sufficiently large (n>40), the 100(1-$\alpha$)% large sample confidence interval for $\mu$ is
$\left(\bar{x} - z_{\alpha/2}\dfrac{s}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\dfrac{s}{\sqrt{n}}\right)$, where $P\!\left(\bar{x} - z_{\alpha/2}\dfrac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\dfrac{s}{\sqrt{n}}\right) \approx 1 - \alpha$.

A General Large Sample Confidence Interval for $\theta$: When the estimator $\hat{\theta}$ satisfies the following properties, the confidence interval can be constructed.
a. The estimator has approximately a normal distribution
b. It is at least approximately unbiased
c. The standard deviation of the estimator is known

Large Sample Confidence Interval for a population proportion, p: If n is sufficiently large ($n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$), the 100(1-$\alpha$)% large sample confidence interval for p is
$\left(\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}},\; \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\right)$

Choosing the sample size: The bound on the error of estimation is $z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$; that is, $\hat{p}$ will be within $z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ of p. The sample size required to estimate a population proportion p to within an amount $B = z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ with 100(1-$\alpha$)% confidence is $n = \dfrac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{B^2}$. The same formula can be written using the interval width, $w = 2 z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$; then $n = \dfrac{4 z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{w^2}$. The conservative sample size can be found when $\hat{p} = 1 - \hat{p} = 0.5$.

Intervals based on a Normal Population Distribution: When the sample size is small and the population of interest is normal, so that $X_1, X_2, \ldots, X_n$ constitutes a random sample from a normal distribution with both $\mu$ and $\sigma$ unknown, the 100(1-$\alpha$)% confidence interval for $\mu$ is
$\left(\bar{x} - t_{\alpha/2;\,n-1}\dfrac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2;\,n-1}\dfrac{s}{\sqrt{n}}\right)$

Prediction Interval for a Single Future Value: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population distribution, and suppose we wish to predict the value of $X_{n+1}$, a single future observation. The 100(1-$\alpha$)% prediction interval for $X_{n+1}$ is
$\left(\bar{x} - t_{\alpha/2;\,n-1}\, s\sqrt{1 + \dfrac{1}{n}},\; \bar{x} + t_{\alpha/2;\,n-1}\, s\sqrt{1 + \dfrac{1}{n}}\right)$

Confidence Intervals for the Variance, $\sigma^2$, and Standard Deviation, $\sigma$, of a Normal Population: The population of interest is normal, so that $X_1, X_2, \ldots, X_n$ constitutes a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$. Then the 100(1-$\alpha$)% confidence interval for $\sigma^2$ is
$\left(\dfrac{(n-1)s^2}{\chi^2_{\alpha/2;\,n-1}},\; \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2;\,n-1}}\right)$
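As a worked illustration of the large-sample interval and sample-size formulas above, the sketch below applies them to the asparagus and referendum numbers from questions 35, 36, and 38. It is not part of the original formula sheet: the function names are hypothetical, and `statistics.NormalDist().inv_cdf` is used as an assumption in place of the z table, so the critical value is about 1.95996 rather than the rounded 1.96. The t and chi-square intervals are omitted because the Python standard library has no quantile functions for those distributions.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_crit(conf):
    """z_{alpha/2} for a two-sided interval with confidence level conf."""
    alpha = 1.0 - conf
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def mean_ci(xbar, s, n, conf=0.95):
    """Large-sample CI for mu: xbar +/- z_{alpha/2} * s / sqrt(n)."""
    half_width = z_crit(conf) * s / sqrt(n)
    return xbar - half_width, xbar + half_width

def proportion_ci(successes, n, conf=0.95):
    """Large-sample CI for p: p_hat +/- z_{alpha/2} * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    half_width = z_crit(conf) * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def sample_size_for_mean(sigma, bound, conf=0.95):
    """Smallest integer n with z_{alpha/2} * sigma / sqrt(n) <= bound."""
    return ceil((z_crit(conf) * sigma / bound) ** 2)

# 200 cans, sample mean 15.9 oz, sample standard deviation 0.3 oz
can_lo, can_hi = mean_ci(15.9, 0.3, 200)

# 75 of 130 sampled voters approve
vote_lo, vote_hi = proportion_ci(75, 130)

# Estimate the mean to within 0.1 oz when sigma = 0.4
n_needed = sample_size_for_mean(0.4, 0.1)

print(round(can_lo, 4), round(can_hi, 4))
print(round(vote_lo, 3), round(vote_hi, 3))
print(n_needed)
```

Rounding the required sample size up rather than to the nearest integer keeps the bound on the error of estimation at or below the target.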