Psych 315, Winter 2017, Homework 7 Answer Key
Due Friday, February 17th either in section or in your TA’s mailbox by 4pm.
ID
Name
Section [AA Adriana] [AB Adriana] [AC Kelly] [AD Kelly]
Problem 1
According to the CDC, the average height of US women increased from 62 to 63 inches from 1964 to 2016. Let’s assume that
heights are normally distributed with a standard deviation of 3 inches.
This first problem is about the power associated with detecting this 1 inch increase in mean height. We’ll be filling in the
probabilities in the following table:
In order to determine if women are significantly taller than they were in 1964 we will test the null hypothesis that the mean
in 2016 is the same as the mean in 1964, which was 62. We will then compare this number to an observed mean drawn from
the 2016 distribution. We’ll use an alpha value of α = 0.05.
Probabilities (α = 0.05)
                     H0 True    H0 False
Fail to reject H0    0.95       0.6205
Reject H0            0.05       0.3795
a) We can already fill in half of the table above. If H0 is true, then the probability of rejecting H0 when it is true is defined
to be α. Since we will either reject or fail to reject H0 , the probability of failing to reject H0 when it’s actually true is 1 − α.
Fill in those numbers in the table above.
b) Suppose we were to draw 16 samples from the 1964 population. What is the standard error of the mean?
The standard error of the mean is $3/\sqrt{16} = 0.75$ inches.
Here’s a bell curve that represents the distribution of means from the null hypothesis (1964) population. It has a mean of
62 and the standard deviation you calculated in problem b:
c) Find the range of heights for the upper tail of this distribution which has an area of α = 0.05.
The z-score above which 5 percent of the area falls is z = 1.64.
The range is 62 + (1.64)(0.75) = 63.23 inches and above.
d) Shade this region in the curve above.
e) Suppose we were to draw a mean from the 1964 population that fell in this shaded region. If we were testing the null
hypothesis that the population mean was equal to 62, what would our decision be? Would this be a correct decision? If not,
what type of error would it be?
We would reject H0 . Since H0 is true, this would be a type I error.
f ) Let’s assume that H0 is false and that the ’true’ distribution of heights is that measured in 2016. Now draw a normal
distribution on the graph above that represents this ’true’ distribution of mean heights drawn from the 2016 population.
This should be a normal distribution with mean 63 but with the same standard deviation as for H0 .
This is a bell curve with the same standard deviation as before (0.75)
but shifted to have a mean of 63 inches.
g) Take the rejection region that you found from problem c and lightly shade the area under this ’true’ distribution (on top
of the previously shaded region).
h) If we were to draw a mean of 16 samples from the true (2016) population that landed in this new region, what would
our decision be? Would this be a correct decision? If not, what type of error would we be making?
We would reject H0. This would be a correct decision since the true
mean is 63 inches, not 62 inches.
i) What is the area of this newly shaded region?
This is the area under the distribution with mean 63 above 63.23.
$z = \frac{63.23 - 63}{0.75} = 0.31$
The area above z = 0.31 is 0.3795.
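Parts b, c, and i can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the cutoff is rounded to two decimals only to mirror the hand calculation in this key.

```python
from math import sqrt
from statistics import NormalDist

mu_null, mu_true = 62.0, 63.0   # 1964 and 2016 population means (inches)
sigma, n, alpha = 3.0, 16, 0.05

std_normal = NormalDist()       # standard normal: mean 0, standard deviation 1

# Part b: standard error of the mean
sem = sigma / sqrt(n)

# Part c: cutoff for the upper tail with area alpha under the null distribution
z_crit = std_normal.inv_cdf(1 - alpha)
cutoff = round(mu_null + z_crit * sem, 2)   # rounded as in the hand calculation

# Part i: area of the 'true' (2016) distribution above the cutoff
z_true = (cutoff - mu_true) / sem
power = 1 - std_normal.cdf(z_true)

print(f"SEM = {sem:.2f}, cutoff = {cutoff:.2f}, z = {z_true:.2f}, power = {power:.3f}")
```

The power comes out close to the 0.3795 reported above. Small differences in the last digit depend on how much the cutoff and z-score are rounded along the way.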
j) The area you just calculated in part i is the power of our test. It’s the probability of rejecting H0 when H0 is false. That
is, it is the probability of obtaining a mean from the 2016 distribution that is significantly greater than the null hypothesis
mean of 62 inches.
Now fill in the other two numbers in the table from part a above.
k) Now, redo the math and calculate the power for the same mean but using an α value of 0.01. Did power go up or down?
Explain why. Here’s a new table and graph for you to fill out.
Probabilities (α = 0.01)
                     H0 True    H0 False
Fail to reject H0    0.99       0.8381
Reject H0            0.01       0.1619
The standard error of the mean is (again) $3/\sqrt{16} = 0.75$ inches.
The z-score above which 1 percent of the area falls is z = 2.3263.
The range is 62 + (2.3263)(0.75) = 63.74 inches and above.
For the true distribution, $z = \frac{63.74 - 63}{0.75} = 0.99$.
The area above z = 0.99 is the power, which is 0.1619.
As alpha went from 0.05 to 0.01, the power went from 0.3795 to 0.1619.
This shows the trade-off between type I and type II errors. As alpha goes down we become less willing to reject H0, so
power goes down too.
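The trade-off can be made concrete by computing power for several values of alpha at once. This is a sketch only: the helper `one_tailed_power` is a hypothetical name, and it does no intermediate rounding, so it gives roughly 0.378 and 0.160 rather than the 0.3795 and 0.1619 obtained above with a cutoff rounded to two decimals.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_power(alpha, mu_null=62.0, mu_true=63.0, sigma=3.0, n=16):
    """Power of an upper-tailed z-test for one mean (no intermediate rounding)."""
    sem = sigma / sqrt(n)
    # Smallest sample mean that leads to rejecting H0 at this alpha
    cutoff = mu_null + NormalDist().inv_cdf(1 - alpha) * sem
    # Probability that a mean drawn from the true distribution exceeds the cutoff
    return 1 - NormalDist(mu_true, sem).cdf(cutoff)


powers = {alpha: one_tailed_power(alpha) for alpha in (0.10, 0.05, 0.01)}
for alpha, power in powers.items():
    print(f"alpha = {alpha:.2f}: power = {power:.3f}, P(type II error) = {1 - power:.3f}")
```

Each reduction in alpha moves the cutoff further from 62, so less of the 2016 distribution lies above it and the probability of a type II error grows.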
l) Let’s compare our power calculations to estimates from the power curves. First, calculate the effect size, $g = \frac{|\bar{x} - \mu_{hyp}|}{s_x}$.
The effect size is $\frac{|63 - 62|}{3} = 0.33$ (small).
m) Finally, use the two sets of power curves for a 1-tailed test for 1 mean, sample size 16 and for α = 0.05 and 0.01 to see
whether your two estimates of power agree with your calculations.
They agree. The curves give a power of about 0.38 for α = 0.05 and about 0.16 for α = 0.01.
[Figure: power curves for α = 0.05, 1 tail, 1 mean. Power is plotted against effect size, with one curve per sample size from n = 8 to n = 1000. On the n = 16 curve, an effect size of 0.33 gives a power of about 0.38.]
[Figure: power curves for α = 0.01, 1 tail, 1 mean, with the same axes. On the n = 16 curve, an effect size of 0.33 gives a power of about 0.16.]
Problem 2
Suppose you think there might be a difference in self confidence between genders. In our class, we asked you
what you thought your score would be on Exam 1. Splitting the class by gender, the estimated Exam 1 score for the 78
female students had a mean of 84.12 and a standard deviation of 9.8376. Exam 1 score predictions for the 23 male students
had a mean of 88.7 and a standard deviation of 7.5705. Let’s see if these two means are significantly different from each
other. We’ll use an α value of 0.05.
a) Calculate the pooled standard deviation:
$s_p = \sqrt{\frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{(n_x - 1) + (n_y - 1)}}$
$s_p = \sqrt{\frac{(78 - 1)9.8376^2 + (23 - 1)7.5705^2}{(78 - 1) + (23 - 1)}} = 9.3813$
b) Calculate the pooled standard error of the mean:
$s_{\bar{x}-\bar{y}} = s_p\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}$
$s_{\bar{x}-\bar{y}} = 9.3813\sqrt{\frac{1}{78} + \frac{1}{23}} = 2.2259$
c) Calculate the t-statistic:
$t = \frac{\bar{x} - \bar{y}}{s_{\bar{x}-\bar{y}}}$
$t_{obs} = \frac{\bar{x} - \bar{y}}{s_{\bar{x}-\bar{y}}} = \frac{84.12 - 88.7}{2.2259} = -2.06$
d) Find the critical value of t
For df = 78 + 23 - 2 = 99, two tailed and α = 0.05, tcrit = ±1.984
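Parts a through d can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library, which has no t-distribution, so the critical value 1.984 is copied from part d rather than computed; the variable names are illustrative.

```python
from math import sqrt

n_f, mean_f, sd_f = 78, 84.12, 9.8376   # female students
n_m, mean_m, sd_m = 23, 88.7, 7.5705    # male students

# Part a: pooled standard deviation
df = (n_f - 1) + (n_m - 1)
s_pooled = sqrt(((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / df)

# Part b: pooled standard error of the difference between the means
se_diff = s_pooled * sqrt(1 / n_f + 1 / n_m)

# Part c: observed t-statistic
t_obs = (mean_f - mean_m) / se_diff

# Part d: critical value copied from the answer above (df = 99, two-tailed, alpha = 0.05)
t_crit = 1.984
reject_h0 = abs(t_obs) > t_crit

print(f"s_p = {s_pooled:.4f}, SE = {se_diff:.4f}, t({df}) = {t_obs:.2f}, reject H0: {reject_h0}")
```

Because |t| exceeds the critical value, the two means differ significantly at α = 0.05.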
e) State your decision as a sentence in APA format.
The predicted Exam 1 score of female students (M = 84.12, SD = 9.8376) is significantly different from the predicted Exam 1
score of male students (M = 88.7, SD = 7.5705), t(99) = -2.06, p = 0.042.
The effect size is $g = \frac{|\bar{x} - \bar{y}|}{s_p} = \frac{|84.12 - 88.7|}{9.3813} = 0.49$. Self-confidence does appear to be associated with gender.
f) What is the effect size, $g = \frac{|\bar{x} - \bar{y}|}{s_p}$?
The pooled standard deviation is $s_p = 9.3813$.
The effect size is $\frac{|84.12 - 88.7|}{9.3813} = 0.49$.
This is a medium effect size.
g) Suppose this were the true effect size. What would be the power of this test? (Use the power curves for α = 0.05, two
tailed, two means). Remember, each power curve corresponds to the average sample size for each mean (about 50 for this
example).
The power for an effect size of 0.49 is about 0.7.
[Figure: power curves for α = 0.05, 2 tails, 2 means. Power is plotted against effect size, with one curve per sample size from n = 8 to n = 1000. On the n = 51 curve, an effect size of 0.49 gives a power of about 0.70.]
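The power read from the curves in part g can be cross-checked with a normal approximation. This is not the method the key uses (the key reads the power curves); it is a rough sketch that assumes two equal groups of 51 and replaces the t-distribution with the standard normal.

```python
from math import sqrt
from statistics import NormalDist

g = abs(84.12 - 88.7) / 9.3813   # effect size from part f
n_per_group = 51                 # assumed equal group size (average of 78 and 23, rounded up)
alpha = 0.05

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # two-tailed critical z
delta = g * sqrt(n_per_group / 2)            # expected test statistic if g is the true effect

# Probability that the test statistic lands in either rejection tail
approx_power = (1 - std_normal.cdf(z_crit - delta)) + std_normal.cdf(-z_crit - delta)
print(f"g = {g:.2f}, approximate power = {approx_power:.2f}")
```

The approximation lands near the value of about 0.7 read from the power curve.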
h) Use the power curve below to determine, for this effect size, how large the sample size for each group would need to be
to get a power of 0.8.
[Figure: power curves for α = 0.05, 2 tails, 2 means. For an effect size of 0.49, a power of 0.80 corresponds to a curve at about n = 65.]
We’d need a sample size of about 65 per group to get a power of 0.8.
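The required sample size can also be estimated with the usual normal-approximation formula for two equal groups. As a hedged sketch (again not the power-curve method the key uses), it solves g·√(n/2) = z_α + z_power for n, ignoring the negligible far tail.

```python
from math import ceil
from statistics import NormalDist

g = 0.49                          # effect size from part f
alpha, target_power = 0.05, 0.80

std_normal = NormalDist()
z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # two-tailed critical z
z_power = std_normal.inv_cdf(target_power)    # z-score with 80% of the area below it

# Solve g * sqrt(n / 2) = z_alpha + z_power for n
n_exact = 2 * ((z_alpha + z_power) / g) ** 2
print(f"n per group = {n_exact:.1f} (round up to {ceil(n_exact)})")
```

The result falls between 65 and 66 per group, in line with the estimate of about 65 read from the power curve.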
i) Draw a bar graph showing the two means with error bars representing ± one standard error of the mean. You’ll have to
calculate the standard errors of the mean by dividing each standard deviation by the square root of each sample size.
[Figure: bar graph of predicted Exam 1 score (y-axis, 82 to 91) by gender (x-axis: female, male), with error bars of ± one standard error of the mean.]
The standard errors of the mean are:
female: $9.8376/\sqrt{78} = 1.11$, for a mean of 84.12, the bar ranges from 83.0 to 85.2
male: $7.5705/\sqrt{23} = 1.58$, for a mean of 88.7, the bar ranges from 87.1 to 90.3
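The error-bar limits follow directly from the summary statistics. A minimal sketch, with illustrative dictionary and variable names:

```python
from math import sqrt

groups = {
    "female": {"n": 78, "mean": 84.12, "sd": 9.8376},
    "male": {"n": 23, "mean": 88.7, "sd": 7.5705},
}

error_bars = {}
for name, stats in groups.items():
    sem = stats["sd"] / sqrt(stats["n"])   # standard error of the mean
    low, high = stats["mean"] - sem, stats["mean"] + sem
    error_bars[name] = (sem, low, high)
    print(f"{name}: SEM = {sem:.2f}, bar from {low:.1f} to {high:.1f}")
```

The male bar is wider because the male group is much smaller, even though its standard deviation is lower.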