Center for Biofilm Engineering
Statistical design & analysis for assessing the efficacy of instructional modules
Marty Hamilton
Professor Emeritus of Statistics
Montana State University
CS 580 April 24, 2006
Why Statistics?
Provide convincing results
Improve communication
“...I do not mean to suggest that computers eliminate stupidity---they may in fact encourage it.”
Robert P. Abelson, in Statistics as Principled Argument
(cited on Rocky Ross’s CS 580 home page)
What is Statistics?
Data
Design
Uncertainty assessment
Statistical Thinking
Data
Design
Uncertainty assessment
Data: Choosing the quantity to measure
Reliable test of knowledge
Quantitative response
Statistical thinking
Data
Design
Uncertainty assessment
After-treatment score
A student used the modules, then scored 80% on the test
Conclusion: modules have high efficacy
Data: Choosing the quantity to measure
Reliable tests of knowledge: before-treatment test, after-treatment test
Quantitative response: difference in test scores, after-treatment minus before-treatment
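[Figure: After-treatment score. Vertical axis: test score (low to high); horizontal axis: After.]
[Figure: Before- and after-treatment scores. Vertical axis: test score (low to high); horizontal axis: Before, After. The response is the difference between the two scores.]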
Difference between before- and after-treatment scores
A student used the modules, then scored 50 points higher on the after-treatment test than on the before-treatment test (Response = 50).
Conclusion: modules have high efficacy
Anticipating criticism: “natural” improvement
[Figure: test score (low to high) plotted Before and After, showing the improvement expected without the treatment alongside the observed response.]
Anticipating criticism
Before/after observations for just the “treated” student may not accurately represent the treatment effect
May need treated and untreated students (i.e., a control)
Control or comparison
The control can be either a negative control (placebo) or a positive control (best conventional)
A student taking a conventional classroom lecture/recitation course would provide a positive control or comparison
Difference scores for each of 12 students, 6 per group
[Figure: Difference (after – before), on a scale from 0 to 100, for the control group and the treated group. Annotation: Of practical importance?]
Study design
Before and after test scores for each student in both the treated and control groups
Good study design
Control or comparison
Replication
Randomization
Anticipate criticism
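Randomization is easy to carry out in software. The following is a minimal sketch, not the procedure used for the study in these slides: the roster of 40 student labels, the equal split, and the seed are assumptions chosen only for illustration.

```python
import random

def randomize(students, seed=None):
    """Randomly split a roster into equal-sized control (C) and treated (T) groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(students)
    rng.shuffle(shuffled)          # random permutation of the roster
    half = len(shuffled) // 2
    return {"C": shuffled[:half], "T": shuffled[half:]}

# Hypothetical roster: 40 students labelled 1..40.
groups = randomize(range(1, 41), seed=2006)
print(len(groups["C"]), "control;", len(groups["T"]), "treated")
```

Because the shuffle decides who gets the treatment, neither the instructor nor the students can, even unintentionally, steer stronger students into one group.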
Data: 20 students per group (randomly assigned?)
Response, by treatment group (C = control, T = treated)
       C          T
-28.5096    53.4115
 34.7186    75.9697
 -3.3184     8.3348
-13.9297    33.3584
 -5.7949    42.5355
 29.0260    58.2345
 15.4682    47.9143
 29.1025    58.6826
-10.8522    48.3604
-18.7876    68.2412
 -3.1457    91.1052
  5.4531    42.8328
 -9.3185    48.9096
  1.2575    67.1174
-11.5470    39.2733
-17.6932    68.9961
  5.5314    52.2039
  6.7628    39.2210
-10.8001    31.1658
 18.3930    36.4764
Analysis via Minitab 14
Minitab: FirstStudy_CS580.MTW
Show data layout ... matrix
Stat > Basic Statistics > Display Descriptive Statistics ... Ask for individual value plot
Stat > Basic Statistics > 2 Sample t ...
Minitab output
Two-Sample T-Test and CI: Response, Treatment
Two-sample T for Response
Treatment   N  Mean  StDev  SE Mean
C          20   0.6   17.4      3.9
T          20  50.6   18.4      4.1
Difference = mu (C) - mu (T)
Estimate for difference: -50.0164
95% CI for difference: (-61.4656, -38.5672)
T-Test of difference = 0 (vs not =): T-Value = -8.84 P-Value = 0.000 DF = 38
Both use Pooled StDev = 17.8846
Null hypothesis: true mean response for Treatment = true mean response for Control
Conclusions:
1. Reject the null hypothesis because it is discredited by the data (p-value < 0.001)
2. We are 95% confident that the true treatment mean response is between 38.6 and 61.5 larger than the true control mean response
3. Is this efficacy repeatable?
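The pooled two-sample t-test reported above can be reproduced outside Minitab. The sketch below uses only the Python standard library and the 40 responses from the first study. The critical value 2.0244 (the 97.5th percentile of the t distribution with 38 degrees of freedom) is taken from a standard table and hard-coded as an assumption. The function and variable names are illustrative.

```python
import math
import statistics

control = [-28.5096, 34.7186, -3.3184, -13.9297, -5.7949, 29.0260, 15.4682,
           29.1025, -10.8522, -18.7876, -3.1457, 5.4531, -9.3185, 1.2575,
           -11.5470, -17.6932, 5.5314, 6.7628, -10.8001, 18.3930]
treated = [53.4115, 75.9697, 8.3348, 33.3584, 42.5355, 58.2345, 47.9143,
           58.6826, 48.3604, 68.2412, 91.1052, 42.8328, 48.9096, 67.1174,
           39.2733, 68.9961, 52.2039, 39.2210, 31.1658, 36.4764]

def pooled_t(x, y):
    """Two-sample t-test with a pooled standard deviation (equal variances assumed)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    diff = statistics.mean(x) - statistics.mean(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / df
    sp = math.sqrt(pooled_var)
    se = sp * math.sqrt(1 / nx + 1 / ny)   # standard error of the difference
    return diff, sp, se, diff / se, df

diff, sp, se, t_value, df = pooled_t(control, treated)

T_CRIT_38 = 2.0244  # assumption: tabulated t(0.975, df=38)
ci = (diff - T_CRIT_38 * se, diff + T_CRIT_38 * se)

print(f"Estimate for difference (C - T): {diff:.4f}")
print(f"Pooled StDev: {sp:.4f}   T-Value: {t_value:.2f}   DF: {df}")
print(f"95% CI for difference: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The difference is computed as control minus treated, matching the Minitab convention "mu (C) - mu (T)", which is why the estimate and interval are negative while the conclusions speak of the treatment mean being larger.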
Analysis via Minitab 14 (more)
[Figure: FirstStudy_CS580. Individual value plot of Response (axis from -40 to 100) by Treatment (C, T).]
Analysis via Minitab 14 (more)
Minitab: SixStudies_CS580.MTW
Show data layout ... matrix
Stat > Tables > Descriptive Statistics
Minitab output
Tabulated statistics: Replicate, Treatment
Rows: Replicate Columns: Treatment
Replicate      C      T    All
1           0.60  50.62  25.61
              20     20     40
2           0.07  62.94  31.50
              20     20     40
3           5.09  51.46  28.27
              20     20     40
4          13.29  58.99  36.14
              20     20     40
5           6.85  41.45  24.15
              20     20     40
6          16.05  51.59  33.82
              20     20     40
All         6.99  52.84  29.92
             120    120    240
Cell Contents: Response: Mean
               Count
Analysis via Minitab 14 (more)
[Figure: SixStudies_CS580. Individual value plots of Response (axis from -50 to 100) by Treatment (C, T), one panel per replicate study 1 to 6. Panel variable: Replicate.]
Analysis via Minitab 14 (more)
Stat > ANOVA > General Linear Model ...
Minitab output
General Linear Model: Response versus Treatment, Replicate
Factor                Type    Levels  Values
Treatment             fixed        2  C, T
Replicate(Treatment)  random      12  1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6

Analysis of Variance for Response, using Adjusted SS for Tests
Source                 DF  Seq SS  Adj SS  Adj MS       F      P
Treatment               1  126120  126120  126120  128.16  0.000
Replicate(Treatment)   10    9840    9840     984    2.53  0.007
Error                 228   88786   88786     389
Total                 239  224746

S = 19.7335
Variance Components, using Adjusted SS
Source                Estimated Value
Replicate(Treatment)            29.73   Variance among replicate studies
Error                          389.41   Variance among students in same study and treatment
--------- added by Marty ---------
Total variance                 419.14
Repeatability Standard Deviation = 20.5 (single student)
Repeatability Standard Deviation = 9.9 (mean of 20 treated students minus mean of 20 control students)
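The variance components and repeatability standard deviations follow from the ANOVA mean squares by simple arithmetic. The sketch below applies the usual method-of-moments formulas for a balanced nested design with 20 students per study-by-treatment cell. That these are the exact formulas behind the slide is an assumption, although they do reproduce the reported values. Variable names are illustrative.

```python
import math

# Inputs taken from the Minitab output above.
ms_replicate = 9840 / 10   # Adj MS for Replicate(Treatment): Adj SS / DF = 984
ms_error = 389.41          # error variance (Adj MS for Error, unrounded)
n_per_cell = 20            # students per replicate study and treatment

# Variance among replicate studies: (MS_replicate - MS_error) / n
var_replicate = (ms_replicate - ms_error) / n_per_cell
var_total = var_replicate + ms_error

# Repeatability SD of a single student's response.
sd_single = math.sqrt(var_total)

# Repeatability SD of (mean of 20 treated) - (mean of 20 control):
# each group mean has variance var_replicate + ms_error / n, and there are two means.
sd_mean_diff = math.sqrt(2 * (var_replicate + ms_error / n_per_cell))

print(f"Variance among replicate studies: {var_replicate:.2f}")
print(f"Total variance: {var_total:.2f}")
print(f"Repeatability SD, single student: {sd_single:.1f}")
print(f"Repeatability SD, difference of means: {sd_mean_diff:.1f}")
```

Most of the total variance comes from student-to-student differences, but averaging over 20 students shrinks that part while leaving the study-to-study part untouched, so study-to-study variation accounts for a large share of the uncertainty in the difference of means.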
Stat > Basic Statistics > Normality Test ... applied to the residuals provides an evaluation of a key statistical assumption underlying the ANOVA
Analysis via Minitab 14 (more)
Data copied from Tables output and pasted into the worksheet:
Rep  CntrlMean  TrtMean  Mean (Treatment minus Control)
1         0.60    50.62  50.02
2         0.07    62.94  62.87
3         5.09    51.46  46.37
4        13.29    58.99  45.70
5         6.85    41.45  34.60
6        16.05    51.59  35.54
Stat > Basic Statistics > 1 sample t ... analysis of 6 Means
Conclusions:
1. Reject the null hypothesis because it is discredited by the data (p-value < 0.001)
2. Estimated difference in mean responses = 45.9
3. 95% confident that the treatment mean response is between 36.9 and 54.9 larger than the true control mean response
4. 95% confident that the treatment mean response is at least 38.6 larger than the true control mean response
5. The efficacy measure is repeatable
Note: this straightforward analysis of the six means, one mean for each of the 6 repeated studies, using a 1-sample t-test provides nearly the same results as does the ANOVA variance component analysis approach.
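The one-sample t-test on the six study-level differences can be sketched with the standard library alone. The six values are the treatment-minus-control means tabulated above. The critical value 6.869 (two-sided 0.001 level, 5 degrees of freedom) comes from a standard t table and is hard-coded as an assumption. This sketch computes only the estimate and test statistic, so its output need not match every figure in the conclusions, some of which may derive from the ANOVA-based approach.

```python
import math
import statistics

# Treatment-minus-control mean for each of the 6 replicate studies.
diffs = [50.02, 62.87, 46.37, 45.70, 34.60, 35.54]

n = len(diffs)
mean_diff = statistics.mean(diffs)               # estimated difference in mean responses
se = statistics.stdev(diffs) / math.sqrt(n)      # standard error of that estimate
t_value = mean_diff / se                         # tests H0: true mean difference = 0
df = n - 1

T_CRIT_0001_DF5 = 6.869  # assumption: tabulated two-sided 0.001 critical value, df = 5

print(f"Estimated difference: {mean_diff:.2f}  (SE {se:.2f}, t = {t_value:.2f}, df = {df})")
print("p-value < 0.001" if abs(t_value) > T_CRIT_0001_DF5 else "p-value >= 0.001")
```

Treating each study as a single observation is what makes this analysis straightforward: the study-to-study variation is captured directly by the spread of the six differences.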
Trade-offs: What is the main source of variability?
It is often more important to repeat the study than to expend time and materials finding a precise efficacy estimate for a single study.
Fin