Name: _______________  Student Number: _______________

STAT 285: Fall Semester 2014
Final Examination — Solutions
5 December 2014
Instructor: Richard Lockhart

Instructions: This is an open book test. As such you may use formulas such as those for means and variances of standard distributions from the book without deriving them. You may use notes, text, other books and a calculator; you may not use a computer. Your work will be marked for clarity of explanation. I expect you to explain what assumptions you are making and to comment if those assumptions seem unreasonable. In general you need not finish doing the arithmetic in confidence interval problems; I will be satisfied if your answers contain things like 27 ± 1.96√(247.5/11), but I have to be absolutely convinced you know what arithmetic to do! In hypothesis testing problems you will have to finish the arithmetic enough to reach a real world conclusion. I want the answers written on the paper. The exam is out of 70.

1. We generally assume that when a coin is tossed it lands heads up with probability 0.5. It has been suggested, however, that if you do the experiment in a different way the chance might change. One such way is to stand the coin on edge on a hard flat surface, hold it upright with a finger and then flick the edge with a finger to send the coin spinning away. A group of 107 statistics students actually did this, 40 times each for a total of 4280 spins.

(a) [5 marks] Suppose they got a total of 2376 heads. Give a 99% confidence interval for the probability that spinning produces heads.

We are going to assume that all 107 students had the same probability p of spinning heads; it is not obvious that this is realistic. If so then the number, X, of heads has a Binomial(4280, p) distribution. The estimate is

p̂ = 2376/4280 = 0.5551402 ≈ 0.555.

The estimated standard error is

σ̂ = √(p̂(1 − p̂)/4280) = √(0.555 · 0.445/4280) = 0.007596106 ≈ 0.0076.

For a 99% confidence interval the relevant critical point from the normal curve is z0.005 = 2.58 and the interval is

0.5551 ± 2.58 · 0.0076, that is, 0.5355 to 0.5747.

It is not necessary to work out all the numbers for full marks.

(b) [5 marks] Is it reasonable to believe that this method produces heads with probability 1/2?

This is a typical hypothesis testing question. The way it is asked, the null hypothesis is H0 : p = 0.5 and the alternative is Ha : p ≠ 0.5. The relevant test statistic is

T = (0.5551 − 0.5)/0.0076 = 7.25

and the P-value is obtained from a normal curve. The tables used on the test end at 3.49, so from them you can only say that P is less than 2 × 0.0002 = 0.0004. I can use R to check that P ≈ 4.2 × 10⁻¹³. In any case P is so ridiculously small that the null hypothesis is untenable. The probability of heads is more than 0.5.

2. [5 marks] It has also been suggested that if you flip a coin in the usual way and catch it in your hand it is slightly more likely than 50% to land the same way up as it started. If the chance of landing the same side up as it started were really 0.51, how many tosses would I need to make to have at least a 90% chance that a level 5% test would detect a significant difference between the probability of heads and 0.5?

This is an application of the sample size determination problem. We are given α = 0.05, β = 0.1, p0 = 0.5, and p′ = 0.51. The alternative suggested calls for a one-sided test, so

n = [(z0.05 √(p0(1 − p0)) + z0.1 √(p′(1 − p′)))/(p′ − p0)]² = [(1.645 · 0.5 + 1.28 · √(0.51 · 0.49))/0.01]² ≈ 21386

is the required sample size.

3. A large population of new mothers is divided into two groups: smokers and non-smokers. Independent samples of 40 mothers are drawn from each group in order to make comparisons. One comparison made is birth weight. The babies of the 40 smokers average 112.1 ounces with a standard deviation of 16.1 ounces while those of the 40 non-smokers average 123.1 ounces with an SD of 15.8 ounces.
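The binomial interval and test in Question 1 and the sample-size formula in Question 2 can be checked numerically. A minimal sketch using only Python's standard library; the critical values 2.576, 1.645 and 1.28 are hard-coded (exact arithmetic gives a slightly smaller n than rounding every factor does):

```python
import math

# Question 1(a): 99% CI for the probability of heads when spinning.
n, x = 4280, 2376
p_hat = x / n                                # about 0.5551
se = math.sqrt(p_hat * (1 - p_hat) / n)      # about 0.0076
z = 2.576                                    # z_{0.005} for a 99% interval
lo, hi = p_hat - z * se, p_hat + z * se      # about (0.536, 0.575)

# Question 1(b): z test of H0: p = 0.5 against a two-sided alternative.
t_stat = (p_hat - 0.5) / se                  # about 7.26

# Question 2: one-sided sample size for p0 = 0.5 vs p' = 0.51,
# alpha = 0.05 (z = 1.645), beta = 0.10 (z = 1.28 as in the solution).
p0, p1 = 0.5, 0.51
n_req = math.ceil(((1.645 * math.sqrt(p0 * (1 - p0))
                    + 1.28 * math.sqrt(p1 * (1 - p1)))
                   / (p1 - p0)) ** 2)        # 21386
```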
(a) [5 marks] Is it clear that smokers have lower birth weight babies?

We have two samples: X1, . . . , Xn with n = 40 from the population of smokers, and Y1, . . . , Ym with m = 40 from the population of non-smokers. If μ1 is the population mean birth weight for smokers' babies and μ2 is the population mean birth weight for non-smokers' babies then our null hypothesis is H0 : μ2 = μ1 (or H0 : μ2 ≤ μ1) and the alternative is Ha : μ1 < μ2. (There may be students who try to make the case for a two-sided test; I would dock 0.5 marks only.) The relevant test statistic is

T = (Ȳ − X̄)/√(16.1²/40 + 15.8²/40) = 3.08.

The degrees of freedom, no matter how you calculate them, will be close to 78; they cannot be lower than 39, and since the two SDs are close together they will be close to 78. The P-value for 3.08, one-sided, with any degrees of freedom from 40 to 120 is in the range 0.001 to 0.002, so in any case we conclude that there is very strong evidence against the idea that smokers have babies as heavy as those of non-smokers. Clearly they have lower birth weight babies. From R the P-value is between 0.00141 and 0.00187, and likely closer to the smaller number, but the conclusion is not affected by this difference.

(b) [5 marks] Give a 90 percent confidence interval for the difference in mean birth weights between smoking and non-smoking mothers.

In the previous part you computed both a degrees of freedom and an estimated standard error. The df should be used to find a t multiplier; with df = 78 the multiplier is t0.05,78 ≈ 1.665 and the interval is (123.1 − 112.1) ± 1.665 · 3.57, or roughly 5.1 to 16.9 ounces.

4. The following sequence of questions concerns the following model. Imagine we have 3 dice which have been carefully manufactured so that they all have exactly the same weight, θ. We begin by weighing 1 of the dice and recording Y1, which you may assume has mean θ. The error in the measurement, namely Y1 − θ, has a normal distribution with mean 0 and standard deviation σ. In order to keep this problem simple you may assume that σ = 1 and that you know this somehow.
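The two-sample calculations in Question 3 can be checked the same way. A sketch of the test statistic, the Welch-Satterthwaite degrees of freedom, and the interval from part (b); the critical value t0.05,78 ≈ 1.665 is hard-coded:

```python
import math

# Summary statistics from Question 3.
n1, mean1, sd1 = 40, 112.1, 16.1   # smokers
n2, mean2, sd2 = 40, 123.1, 15.8   # non-smokers

# Estimated standard error of the difference in means.
v1, v2 = sd1**2 / n1, sd2**2 / n2
se = math.sqrt(v1 + v2)                                       # about 3.57

# Part (a): two-sample t statistic.
t_stat = (mean2 - mean1) / se                                 # about 3.08

# Welch-Satterthwaite degrees of freedom.
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))   # about 78

# Part (b): 90% CI for the difference, using t_{0.05,78} ≈ 1.665.
margin = 1.665 * se
lo, hi = (mean2 - mean1) - margin, (mean2 - mean1) + margin   # about (5.1, 16.9)
```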
Then you weigh 2 of the dice together and record Y2, whose mean is 2θ. Assume that the error Y2 − 2θ has a normal distribution with mean 0 and standard deviation σ. In this problem you are to compare two estimators for θ.

(a) [5 marks] The first estimator is based on the idea that Y2/2 has mean θ – the same as Y1. This estimator is the average of these two:

θ̂1 = (Y1 + Y2/2)/2.

Find the bias, standard error and mean square error of θ̂1.

The mean of θ̂1 is

E(θ̂1) = E[(Y1 + Y2/2)/2] = (E(Y1) + E(Y2)/2)/2 = (θ + 2θ/2)/2 = θ.

Thus θ̂1 is unbiased (the bias is 0). The variance is

Var(θ̂1) = Var[(Y1 + Y2/2)/2] = (Var(Y1) + Var(Y2)/4)/4 = (σ² + σ²/4)/4 = 5σ²/16.

The mean squared error is also the variance. The standard error is the square root of the variance, √5 σ/4.

(b) [5 marks] Another estimator is obtained by least squares. Derive the formula for the least squares estimate of θ; call this estimator θ̂2.

The error sum of squares is

(Y1 − θ)² + (Y2 − 2θ)².

To minimize this take the derivative with respect to θ and get

−2(Y1 − θ) − 4(Y2 − 2θ) = −2(Y1 + 2Y2 − 5θ).

This is 0 when Y1 + 2Y2 − 5θ = 0, so

θ̂2 = (Y1 + 2Y2)/5.

(c) [5 marks] Find the bias, standard error and mean squared error of θ̂2.

The mean of θ̂2 is

E[(Y1 + 2Y2)/5] = (E(Y1) + 2E(Y2))/5 = (θ + 4θ)/5 = θ.

Thus θ̂2 is unbiased. Its variance is

Var[(Y1 + 2Y2)/5] = (Var(Y1) + 4Var(Y2))/25 = (σ² + 4σ²)/25 = σ²/5.

This is also the MSE. The standard error is σ/√5.

(d) [2 marks] Based on these calculations, which is the better estimator of θ, θ̂1 or θ̂2?

Both are unbiased and θ̂2 has the smaller variance because 1/5 < 5/16. Thus θ̂2 is better.

5. Suppose X has density

f(x, θ) = θ/x² for x > θ, and 0 for x ≤ θ.

(a) [2 marks] Find the cumulative distribution function of X.

Let F(x) = P(X ≤ x) be the cdf. For x < θ we have F(x) = 0. For x ≥ θ we have

F(x) = P(X ≤ x) = ∫[θ, x] (θ/u²) du = 1 − θ/x.

(b) [2 marks] For any b such that 1 ≤ b, find P(1 ≤ X/θ ≤ b) and then find the values of b for which this probability is 0.025 and 0.975.

This is just

P(θ ≤ X ≤ bθ) = F(bθ) − F(θ) = 1 − θ/(bθ) = 1 − 1/b.

To make this be α we need α = 1 − 1/b, or 1/b = 1 − α, or b = 1/(1 − α). Thus for 0.025 we have b = 1/0.975 and for 0.975 we need b = 1/0.025.

(c) [5 marks] Use the results of the previous problem to find a 95% confidence interval for θ.

We now know that

P(1/0.975 ≤ X/θ ≤ 1/0.025) = 0.975 − 0.025 = 0.95.

Solving the inequalities we find

P(0.025X ≤ θ ≤ 0.975X) = 0.95,

so the interval [0.025X, 0.975X] is a 95% confidence interval for θ.

(d) [1 mark] Evaluate your interval if we observe X = 40.

The interval runs from 0.025 × 40 = 1 to 0.975 × 40 = 39.

6. In the October 7, 2014 issue of the Canadian Medical Association Journal a randomized controlled double blind study of 378 patients studied the effect of melatonin on delirium. The treatment group had 186 patients and observed 55 cases of delirium while the control group had 192 patients and 49 cases of delirium.

(a) [1 mark] If the treatment is completely ineffective what is the estimated probability of delirium in this group?

The pooled estimate of the binomial probability is

p̂ = (55 + 49)/(186 + 192) = 104/378 = 0.27513.

(b) [2 marks] What is the estimated standard error for the estimate in the previous part?

If the treatment is completely ineffective we have n = 378 trials and p̂ is a sample proportion, so its standard error is √(p(1 − p)/378), which we estimate by

√(p̂(1 − p̂)/378) = √(0.275 × 0.725/378) = 0.02297.

(c) [5 marks] Is there clear evidence of a difference in either direction in delirium rates between treatment and control?

This is a test for the equality of two proportions: p1, the probability of delirium in the treatment group, and p2, the probability of delirium in the control group. The null hypothesis is H0 : p1 = p2. The alternative is two-sided: Ha : p1 ≠ p2.
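The variance comparison in Question 4 can also be checked by simulation. A sketch with σ = 1 as in the problem and an arbitrary true weight θ = 10, estimating the variances of θ̂1 and θ̂2 from repeated draws:

```python
import random

random.seed(1)
theta, sigma = 10.0, 1.0   # arbitrary true weight; sigma = 1 as in the problem
reps = 200_000

est1, est2 = [], []
for _ in range(reps):
    y1 = random.gauss(theta, sigma)        # one die weighed alone
    y2 = random.gauss(2 * theta, sigma)    # two dice weighed together
    est1.append((y1 + y2 / 2) / 2)         # theta-hat-1, the naive average
    est2.append((y1 + 2 * y2) / 5)         # theta-hat-2, least squares

def var(xs):
    # Sample variance of a list of simulated estimates.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

v1, v2 = var(est1), var(est2)   # should be near 5/16 = 0.3125 and 1/5 = 0.2
```

With this many replications the simulated variances land close to the exact values 5σ²/16 and σ²/5, confirming that the least squares estimator is the better one.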
The test statistic is

T = (p̂1 − p̂2)/√(p̂(1 − p̂)(1/186 + 1/192)) = (0.2957 − 0.2552)/0.0459 = 0.88.

This is nowhere near significant; there is little evidence of a difference in the probability of delirium between treatment and control. The actual P-value is about 0.38.

7. I have regularly used the heights of 1078 father / adult son pairs gathered in Victorian England. We are interested in predicting the heights of sons from the heights of fathers, supposing that the relationship is described by a straight line. Fathers average 67.69 inches in height with a standard deviation of 2.74 inches. Sons average 68.68 inches in height with a standard deviation of 2.81 inches. The average of the products (son's height times father's height) is 4652.895 square inches. Students writing in Burnaby or at the CSD got to see the following sentence, which was added after the exams were printed; students in Surrey never found this out! I adjusted the marking to compensate. The Error Sum of Squares Σi (Yi − β̂0 − β̂1 xi)² is 6390.331 square inches.

(a) [2 marks] Show that

(1/n) Σi=1..n (xi − x̄)(yi − ȳ) = (1/n) Σ xi yi − x̄ȳ.

Expanding the left hand side:

(1/n) Σ (xi − x̄)(yi − ȳ) = (1/n) Σ (xi yi − xi ȳ − yi x̄ + x̄ȳ)
= (1/n) Σ xi yi − ȳ · (1/n) Σ xi − x̄ · (1/n) Σ yi + (1/n) · n · x̄ȳ
= (1/n) Σ xi yi − ȳx̄ − x̄ȳ + x̄ȳ
= (1/n) Σ xi yi − x̄ȳ

as desired.

(b) [3 marks] Estimate the slope and intercept of the least squares line for predicting sons' heights from fathers' heights.

This just requires you to plug in to the various formulas. The numerator of β̂1 is

(1/n) Σ xi yi − x̄ȳ = 4652.895 − 67.69 × 68.68 = 3.946.

The denominator is

(1/n) Σ xi² − x̄² = ((n − 1)/n) sx² = (1077/1078) × 2.74² = 7.501.

Thus

β̂1 = 3.946/7.501 = 0.526.

Then

β̂0 = ȳ − β̂1 x̄ = 68.68 − 0.526 × 67.69 = 33.08.

The units of β̂0 are inches while β̂1 is unitless (inches per inch).

(c) [5 marks] Give a 95% confidence interval for the true slope for the population these families are drawn from.
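The two-proportion statistic in 6(c) and the least squares estimates in 7(b) are easy to verify from the numbers given in the questions; a sketch:

```python
import math

# Question 6(c): pooled two-proportion z test.
x1, n1 = 55, 186   # delirium cases, treatment group
x2, n2 = 49, 192   # delirium cases, control group
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                       # about 0.2751
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
t_stat = (p1 - p2) / se                              # about 0.88: not significant

# Question 7(b): least squares slope and intercept for the heights data.
n = 1078
xbar, ybar = 67.69, 68.68      # father and son mean heights (inches)
sx = 2.74                      # father SD (inches)
mean_xy = 4652.895             # average of the products (square inches)
num = mean_xy - xbar * ybar                          # about 3.946
den = (n - 1) / n * sx**2                            # about 7.501
slope = num / den                                    # about 0.526
intercept = ybar - slope * xbar                      # about 33.07
```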
The interval is

β̂1 ± t0.025,n−2 · S/√((n − 1) sx²)

where n − 2 = 1076 is so large that we must use the normal critical value, 1.96. The quantity S is the estimated standard deviation of the errors, which I had to add to the data given to you (it can be computed from the information given if you know how, but I did not teach how). Using the Error Sum of Squares given above we find

S = √(6390.331/1076) = 2.437 inches.

Our interval is

0.526 ± 1.96 × 2.437/√(1077 × 2.74²), that is, 0.526 ± 0.053, or about 0.473 to 0.579.

Mark allocation:

1a: 5   1b: 5   2: 5
3a: 5   3b: 5
4a: 5   4b: 5   4c: 5   4d: 2
5a: 2   5b: 2   5c: 5   5d: 1
6a: 1   6b: 2   6c: 5
7a: 2   7b: 3   7c: 5
Total: 70
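As a final check, the slope interval in 7(c) can be computed from the same summary numbers; a sketch with the large-sample critical value 1.96 hard-coded:

```python
import math

n = 1078
sx = 2.74                 # father SD (inches)
slope = 0.526             # slope estimate from 7(b)
ess = 6390.331            # error sum of squares (square inches)

s = math.sqrt(ess / (n - 2))                 # about 2.437 inches
se_slope = s / math.sqrt((n - 1) * sx**2)    # about 0.0271
lo = slope - 1.96 * se_slope                 # about 0.473
hi = slope + 1.96 * se_slope                 # about 0.579
```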