Download Feedback Lab 4 - Trinity College Dublin

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Foundations of statistics wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

Confidence interval wikipedia , lookup

History of statistics wikipedia , lookup

Misuse of statistics wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
Trinity College, Dublin
Generic Skills Programme
Dotplot
of IQ
Statistics
for Research
Students
Laboratory 4:
Feedback
Group
Initial data analysis
Boys
Girls
72
81
90
99
108
117
126
135
IQ
Boxplot of IQ
Group
Boys
Girls
70
80
90
100
110
120
130
IQ
Descriptive Statistics: IQ
Variable
IQ
Group
Boys
Girls
N
47
31
Mean
110.96
105.84
StDev
12.12
14.27
Compare Boys' and Girls' IQ with regard to
(a)
spread
(b)
level
Comment on patterns and exceptions.
Minimum
77.00
72.00
Maximum
136.00
132.00
140
Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4, Feedback
Both samples have similar spread, with girls slightly larger. This is clearer in the boxplot than in
the dotplot. sBoys is slightly smaller than sGirls. The ranges are almost identical.
The Boys IQ are slightly higher than Girls, around 5 on average.
Both samples appear to have exceptional cases at the lower end and, possibly, at the higher
end. Otherwise, the dotplot suggest a typical Normal pattern (more frequent towards the
middle).
Boxplot of IQ
140
130
120
IQ
110
100
90
80
70
Boys
Girls
Group
Which do you prefer? Why?
Both convey their message equally well. For consistency with the dotplot (and histograms), it
may be preferable to show the measurement scale on a horizontal axis. However, dotplots with
vertical scales are also popular, although not available in Minitab.
2-sample t
Two-Sample T-Test and CI: IQ, Group
Two-sample T for IQ
Group
Boys
Girls
N
47
31
Mean
111.0
105.8
StDev
12.1
14.3
SE Mean
1.8
2.6
Difference = mu (Boys) - mu (Girls)
Estimate for difference: 5.12
95% CI for difference: (-0.88, 11.12)
T-Test of difference = 0 (vs not =): T-Value = 1.70
P-Value = 0.093 DF = 76
Both use Pooled StDev = 13.0122
page 2
Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4, Feedback
Report on the result of the t-test.
The estimated difference of 5.12 is not statistically significant at the 5% level; t = 1.7, p = 0.9.
Explain the make up of the Pooled Standard Deviation.
The sum of squared deviations from the mean is calculated for each sample separately and
then pooled, that is, added, and the average squared deviation across both samples is
calculated by dividing by the combined degrees of freedom. The Pooled Standard Deviation is
the square root of this.
Why are the degrees of freedom = 76?
The degrees of freedom associated with the deviations from mean in the Boys sample, 47 – 1 =
46, are combined with those for the Girls sample, 31 – 1 = 30, to give a total of 46 + 30 = 76.
Report on the confidence interval estimate.
We are 95% confident that the difference between average IQ for seventh grade boys and girls
in the School district (from which the sample were drawn) lies between – 0.88 and 11.12.
Two-Sample T-Test and CI: IQ, Group
Two-sample T for IQ
Group
Boys
Girls
N
47
31
Mean
111.0
105.8
StDev
12.1
14.3
SE Mean
1.8
2.6
Difference = mu (Boys) - mu (Girls)
Estimate for difference: 5.12
95% CI for difference: (-1.12, 11.36)
T-Test of difference = 0 (vs not =): T-Value = 1.64
P-Value = 0.106 DF = 56
The t-value is slightly smaller (and the confidence interval slightly wider) because the standard
error is slightly bigger, reflecting the greater uncertainty due to not being able to use the
"knowledge" that the variances were equal.
The degrees of freedom are smaller and the p-value is correspondingly larger, for the same
reason.
Tradition favours the "equal variance" assumption unless three is clear evidence to the contrary.
It may be argued that not making the assumption is safer.
Checking Normality
Calculation and analysis of residuals is recommended practice when fitting sophisticate models,
for use in diagnostic checking of the models and the assumptions underlying them. Here, the
model is very simple; IQ values vary randomly around a common mean, with (possibly)
separate means for boys and girls.
page 3
Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4, Feedback
Probability Plot of Residuals
Normal
Mean
-2.18629E-15
StDev
12.93
N
78
AD
0.610
P-Value
0.109
40
30
Residuals
20
10
0
-10
-20
-30
-40
-50
-3
-2
-1
0
Score
1
2
3
Discuss the result.
The plot appears linear for the most part, supporting the Normal model, apart from a slight bend
in the middle and 2 or 3 possible exceptional cases at the lower end.
Note the apparent exceptional values in the plot and also the p-value recorded for
the Anderson-Darling (AD) test.
The p-value suggests that the Normal model is acceptable.
Probability Plot of Residuals
Normal - 95% CI
50
Mean
-2.18629E-15
StDev
12.93
N
78
AD
0.610
P-Value
0.109
Residuals
25
0
-25
-50
-3
-2
-1
0
Score
1
2
3
Discuss the result.
Three cases appear exceptional by reference to the interval.
page 4
Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4, Feedback
Can you reconcile the values outside the confidence limits with the result of the
Anderson-Darling (AD) test?
A single point lying outside the 95% confidence interval is statistically significant at the 5% level,
when taken individually. This appears to contradict the AD test, with p exceeding 0.05.
However, the AD test treats all point simultaneously. A confidence interval that applied to all
point simultaneously would need to be wider than a confidence interval for an individual point, to
allow for the greater uncertainty involved in dealing with all the points simultaneously.
Reference plots
Below are 3 of 19 plots generated as reference plots.
Probability Plot of C17
Normal
3
Mean
-0.1324
StDev
0.9908
N
78
AD
0.878
P-Value 0.023
2
C17
1
0
-1
-2
-3
-4
-3
-2
-1
0
Score
1
2
3
Probability Plot of C12
Normal
4
Mean
-0.1153
StDev
1.056
N
78
AD
0.521
P-Value 0.180
3
2
C12
1
0
-1
-2
-3
-4
-3
-2
-1
0
Score
1
2
page 5
3
Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4, Feedback
Probability Plot of C11
Normal
4
Mean
0.1115
StDev
1.026
N
78
AD
0.840
P-Value 0.029
3
2
C11
1
0
-1
-2
-3
-3
-2
-1
0
Score
1
2
3
Do you think that the Residuals data follow a Normal model? Explain.
By reference to these, the plot of residuals is not exceptional. The Normal model appears
reasonable.
Checking Equal Standard Deviations
Test for Equal Variances for IQ
F-Test
Test Statistic
P-Value
Group
Boys
0.72
0.312
Levene's Test
Girls
Test Statistic
P-Value
10
12
14
16
18
95% Bonferroni Confidence Intervals for StDevs
0.86
0.356
20
Comment on the results.
Group
The Boys
results do not contradict the equal standard deviations assumption.
Note the range of possible values of  included in the confidence interval for Girls
deviation. Discuss:
 what implications does the upper end have for possible spread of Girls IQ?
Girlsstandard
70
80
90
100
110
120
130
IQ a spread of  2 × 20 = 80, or more.
A standard deviation of 20 suggests

how does this compare to the spread evident from the boxplots?
This is considerably wider than the observed spread (60), even allowing for the possible
exceptional cases.
page 6
Trinity College, Dublin
Generic Skills Programme

Statistics for Research Students
Laboratory 4, Feedback
what is needed to reduce the confidence interval width?
Increase the sample size.
Reanalysis
Two-Sample T-Test and CI: IQ, Group
Two-sample T for IQ
Group
Boys
Girls
N
45
29
Mean
112.4
108.1
StDev
10.1
11.7
SE Mean
1.5
2.2
Difference = mu (Boys) - mu (Girls)
Estimate for difference: 4.32
95% CI for difference: (-0.77, 9.41)
T-Test of difference = 0 (vs not =): T-Value = 1.69
P-Value = 0.095 DF = 72
Both use Pooled StDev = 10.7301
Report on the results of the t-test.
The estimated difference of 4.32 is not statistically significant at the 5% level; t = 1.7, p = 0.95.
Compare with the results of the earlier application.
The conclusion is the same. The standard deviation is smaller, as expected, 10.7 instead of 13.
This means a smaller standard error for the mean difference. However, the mean difference is
also smaller and so the t value is almost identical.
Comment
Since removing the "exceptional" cases makes no great difference, it is probably safer to not
remove them.
2.
When are perceived sample differences statistically significant?
How do Boys+ and Girls compare in the initial dotplot?
There is no perceivable difference between them.
What is the value of t?
t = 0.
How big is delta when the sample means are significantly different according to t?
Between 6 and 7
How big is delta when the samples appear "different" according to the dotplot?
This is subjective. Seeing the change in the difference between the dotplots when delta
changes from 6 to 7 informs ones perception of significant change.
page 7
Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4, Feedback
Simulation
Discuss the effect of increasing sample size on
(i)
apparent differences between the samples,
(ii)
significant differences between the sample means
of Method
A, Method
B t statistically significant. This
With smaller sample sizes, a Dotplot
bigger delta
is needed
to make
means a bigger difference between the dotplots is needed to judge statistical; significance with
smaller sample sizes. This means that using dotplots to assess statistical significance is
somewhat dubious.
3
A comprehensive exercise
Initial data analysis
Method A
Method B
90
91
92
93
94
95
Data
96
97
98
99
100
Descriptive Statistics: Method A, Method B
Variable
Method A
Method B
N
12
12
Mean
95.392
96.817
StDev
1.106
1.247
Minimum
93.000
95.200
Maximum
97.300
99.200
The results using method B appear substantially higher than those using Method A. The mean
difference is around 1.4. The spread appears similar in both sets of measurements.
Two-Sample T-Test and CI: Method B, Method A
Difference = mu (Method B) - mu (Method A)
Estimate for difference: 1.425
95% CI for difference: (0.427, 2.423)
T-Test of difference = 0 (vs not =): T-Value = 2.96
P-Value = 0.007
The mean difference is highly statistically significant.
A 95% confidence interval for the mean difference is
(0.427, 2.423)
Checking assumptions
There is no evidence of unequal standard deviations (no test needed). There is no evidence of
non-Normality in the Normal plot of residuals that follows. (Reference plots not needed).
page 8
Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4, Feedback
Probability Plot of Residuals
Normal
3
Mean
4.144833E-15
StDev
1.153
N
24
AD
0.215
P-Value
0.830
2
Residuals
1
0
-1
-2
-3
-2
-1
0
Score
1
2
Dotplot of Method A, Method B
Management Report
Two methods of assessing the percentage recovery of the nominal level of the active
ingredients of pharmaceutical products were used in one phase of the method. The results
showed that the results using Method B were statistically significantly higher than those using
Method A. The following dotplot shows a graphical comparison.
Method A
Method B
90
91
92
93
94
95
Data
96
97
98
99
100
The observed mean difference was 1.425 although the actual difference could be as low as
0.427 or as high as 2.423. We can be 95% confident that the actual difference lies between
those values.
Appendix: Numerical summaries
Two-Sample T-Test and CI: Method B, Method A
Method B
Method A
N
12
12
Mean
96.82
95.39
StDev
1.25
1.11
SE Mean
0.36
0.32
Difference = mu (Method B) - mu (Method A)
Estimate for difference: 1.425
95% CI for difference: (0.427, 2.423)
T-Test of difference = 0 (vs not =): T-Value = 2.96
P-Value = 0.007 DF = 22
page 9