Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Psych 315, Winter 2017, Homework 7 Answer Key Due Friday, February 17th either in section or in your TA’s mailbox by 4pm. ID Name Section [AA Adriana] [AB Adriana] [AC Kelly] [AD Kelly] Problem 1 According to the CDC, the average height of US women increased from 62 to 63 inches from 1964 to 2016. Let’s assume that heights are normally distributed with a standard deviation of 3 inches. This first problem is about the power associated with detecting this 1 inch increase in mean height. We’ll be filling in the probabilities in the following table: In order to determine if women are significantly taller than they were in 1964 we will test the null hypothesis that the mean in 2016 is the same as the mean in 1964, which was 62. We will then compare this number to an observed mean drawn from the 2016 distribution. We’ll use an alpha value of α =0.05. Probabilities (α = 0.05) H0 True H0 False Fail to reject H0 0.95 0.6205 Reject H0 0.05 0.3795 a) We can already fill in half of the table above. If H0 is true, then the probability of rejecting H0 when it is true is defined to be α. Since we will either reject or fail to reject H0 , the probability of failing to reject H0 when it’s actually true is 1 − α. Fill in those numbers in the table above. b) Suppose we were to draw 16 samples from the 1964 population. What is the standard error of the mean? The standard error of the mean is √3 = 0.75 inches 16 1 Here’s a bell curve that represents the distribution of means from the null hypothesis (1964) population. It has a mean of 62 and the standard deviation you calculated in problem b: c) Find the range of heights for the upper tail of this distribution which has an area of α = 0.05. The z-score is the area for which 5 The range is 62 + (1.64)(0.75) = 63.23 inches and above d) Shade this region in the curve above. e) Suppose we were to draw a mean from the 1964 population that fell in this shaded region. If we were testing the null hypothesis that the population mean was equal to 62, what would our decision be? Would this be a correct decision? If not, what type of error would it be? We would reject H0 . Since H0 is true, this would be a type I error. f ) Let’s assume that H0 is false and that the ’true’ distribution of heights is that measured in 2016. Now draw a normal distribution on the graph above that represents this ’true’ distribution of mean heights drawn from the 2016 population. This should be a normal distribution with mean 63 but with the same standard deviation as for H0 . This is a bell-curve with the same standard standard deviation as before (0.75) but shifted to have a mean of 63 inches 2 g) Take the rejection region that you found from problem c and lightly shade the area under this ’true’ distribution (on top of the previously shaded region). h) If we were to draw a mean of 16 samples from the true (2016) population that landed in the this new region, what would our decision be? Would this be a correct decision? If not, what type of error would we be making? We would reject H0 . This would be a correct decision since the true mean is 63 inches, not 62 inches i) What is the area of this newly shaded region? This is the area under the distribution with mean 63 above 63.23 = 0.31 z = 63.23−63 0.75 The area above z = 0.31 is 0.3795 j) The area you just calculated in part i is the power of our test. It’s the probability of rejecting H0 when H0 is false. That is, it is the probability of obtaining a mean from the 2016 distribution that is significantly greater than the null hypothesis mean of 62 inches. Now fill in the other two numbers in the table from part a above. 3 k) Now, redo the math and calculate the power for the same mean but using an α value of 0.01. Did power go up or down? Explain why. Here’s a new table and graph for you to fill out. Probabilities (alpha = 0.01) H0 True H0 False Fail to reject H0 0.99 0.8381 Reject H0 0.01 0.1619 The standard error of the mean is (again) √3 = 0.75 inches. 16 The z-score is the area for which 1 percent falls above is z = 2.3263. The range is 62 + (2.3263)(0.75) = 63.74 inches and above. = 0.99. For the true distribution, z = 63.74−63 0.75 The area above z = 0.99 is the power, which is 0.1619. As alpha went from 0.05 to 0.01 the power went from 0.3795 to 0.1619 This shows the trade off between type I and type II errors. As alpha goes down we become less willing to reject H0 , so power goes down too. l) Let’s compare our power calculations to estimates from the power curves. First, calculate the effect size, g = The effect size is |63−62| = 0.33 (small) 3 4 |x̄−µhyp | sx m) Finally, use the two sets of power curves for a 1-tailed test for 1 mean, sample size 16 and for α = 0.05 and 0.01 to see your two estimates of power agree with your calculations. they agree α = 0.05, 1 tail, 1 mean 1 1000 500 250 150 100 75 50 40 30 25 20 15 12 10 n=8 n = 16 0.38 0.9 0.8 0.7 Power 0.6 0.5 0.4 0.3 0.2 0.1 0.33 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.1 1.2 1.3 1.4 1 1.1 1.2 1.3 1.4 Effect size α = 0.01, 1 tail, 1 mean 1 0.9 1000 500 250 150 100 75 0.8 0.7 50 40 30 25 20 0.6 Power 1 0.5 15 12 10 n=8 0.4 0.3 0.2 0.16 n = 16 0.1 0.33 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Effect size 5 Problem 2 Suppose you think there might be a difference in self confidence between genders. In our class, we asked you what you thought your score would be on Exam 1. Splitting the class by gender, the estimated Exam 1 score for the 78 female students had a mean of 84.12 and a standard deviation of 9.8376. Exam 1 score predictions for the 23 male students had a mean of 88.7 and a standard deviation of 7.5705. Let’s see if these two means are significantly different from each other. We’ll use an α value of 0.05. a) Calculate the pooled standard deviation: s 2 (nx −1)s2 x +(ny −1)sy sp = (nx −1)+(ny −1) r sp = (78−1)9.83762 +(23−1)7.57052 = 9.3813 (78−1)+(23−1) b) Calculate the pooled standard error of the mean: r sx̄−ȳ = sp n1x + n1y q 1 + 1 = 2.2259 sx̄−ȳ = 9.3813 78 23 c) Calculate the t-statistic: t= x̄−ȳ sx̄−ȳ x̄−ȳ tobs = s = 84.12−88.7 2.2259 = −2.06 x̄−ȳ d) Find the critical value of t For df = 78 + 23 - 2 = 99, two tailed and α = 0.05, tcrit = ±1.984 6 e) State your decision as a sentence in APA format. The predicted Exam 1 score of female gender (M = 84.12, SD = 9.8376) is significantly different than the predicted Exam 1 score of male gender (M = 88.7, SD = 7.5705) t(99) = -2.06, p = 0.042. |84.12−88.7| |x̄−ȳ| = 0.49 self-confidence does appear to be associated with gender. The effect size is g = sp = 9.3813 f ) What is the effect size, g = |x̄−ȳ| sp The pooled standard deviation is sp = 9.3813 |84.12−88.7| The effect size is = 0.49 9.3813 This is a medium effect size g) Suppose this were the true effect size. What would be the power of this test? (Use the power curves for α = 0.05, two tailed, two means). Remember, each power curve corresponds to the average sample size for each mean (about 50 for this example). The power for an effect size of 0.49 is about 0.7. α = 0.05, 2 tails, 2 means 1 0.9 1000 500 250 0.70 150 100 75 0.8 0.7 Power 0.6 n = 51 50 40 30 25 20 15 12 10 n=8 0.5 0.4 0.3 0.2 0.1 0.49 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Effect size 7 1 1.1 1.2 1.3 1.4 h) Use the power curve below to determine, for this effect size, how large the sample size for each group would need to be to get a power of 0.8 α = 0.05, 2 tails, 2 means 1 0.9 1000 0.80 500 250 150 100 75 0.8 0.7 Power 0.6 n = 65 50 40 30 25 20 15 12 10 n=8 0.5 0.4 0.3 0.2 0.1 0.49 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Effect size We’d need a sample size of about 65 per group to get a power of 0.8 8 1 1.1 1.2 1.3 1.4 i) Draw a bar graph showing the two means with error bars representing ± one standard error of the mean. You’ll have to calculate the standard errors of the mean by dividing each standard deviation by the square root of each sample size. 91 predicted Exam 1 score 90 89 88 87 86 85 84 83 82 female male gender The standard errors of the mean are: √ = 1.11, for a mean of 84.12, the bar ranges from 83.0 to 85.2 female: 9.8376 78 √ male: 7.5705 = 1.58, for a mean of 88.7, the bar ranges from 87.1 to 90.3 23 9