Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 3: Feedback
1. Control charts
Discuss the stability of the days to payment process with regard to
(a) level
(b) spread
There appears to be excessive variation in the sample mean during the first 8 weeks, with 3
exceptionally low values, 2 below the LCL, and 2 exceptionally high values, 1 above the UCL.
The standard deviation appears stable over the whole period.
What is your next step in the analysis?
As the process appears stable during recent weeks, no action on the process is suggested.
However, as there is evidence of instability of the process mean in the first 8 weeks, an
investigation of the cause of this is suggested.
Identify centre lines and control limits to use for further process monitoring. Use
sensible rounding.
From the charts based on the recent data,
Xbar chart:
CL = 34.98, LCL = 30.56, UCL = 39.4
s chart:
CL = 3.295, LCL = 0, UCL = 6.894
With "sensible rounding",
Xbar chart:
CL = 35, LCL = 30.6, UCL = 39.4
s chart:
CL = 3.3, LCL = 0, UCL = 6.9.
2. Simulating sampling distributions
Compare the results of the simulations.
What did you expect?
Expect no out-of-control points with 30 repetitions, around 1 with 300 repetitions, around 8 with
3,000 repetitions. This follows from multiplying the number of repetitions by 0.0027, the chance of
getting an out-of-control point in one repetition.
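The arithmetic behind these expectations can be sketched as follows. This is a minimal illustration, not part of the original lab; the function name is hypothetical, and 0.0027 is the chance quoted above.

```python
# Expected number of out-of-control signals from an in-control process.
# 0.0027 is the chance, quoted in the feedback, of one point falling
# outside the control limits; the repetition counts are those of the lab.
FALSE_ALARM_PROBABILITY = 0.0027


def expected_false_alarms(repetitions, p=FALSE_ALARM_PROBABILITY):
    """Expected count of points outside the limits by chance alone."""
    return repetitions * p


expected = {reps: expected_false_alarms(reps) for reps in (30, 300, 3000)}
for reps, count in expected.items():
    print(f"{reps:>5} repetitions: expect about {count:.2f} false alarms")
```

The products are 0.081, 0.81 and 8.1, which round to the "none, around 1, around 8" of the answer.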
How do the histograms compare?
Expect more Normal looking histograms with increasing repetitions.
False alarm data
[Figure: three histograms of the number of False Alarms, showing Class Frequency, Lecturer Frequency and Theoretical Frequency.]
Simulating the effect of increasing sample size
[Figure: histograms of 1000 simulated sample means for each of n = 1, 5, 10, 20 and 40, on a common scale from 24 to 48. Summary statistics shown with the histograms:]

n     Mean    StDev    N
1     35.02   3.382    1000
5     35.07   1.467    1000
10    35.01   1.029    1000
20    35.01   0.7356   1000
40    35.01   0.5147   1000
Comment on the effect of increasing sample size, referring to
- histogram spread: reduces with increasing sample size
- values of StDev: reduce with increasing sample size
- values of Mean: stay around 35 (should get closer with increasing sample size)
Compare
- the values of StDev for n = 5 and n = 20: latter is roughly ½ former
- the values of StDev for n = 10 and n = 40: latter is roughly ½ former
noting that the larger sample size is 4 times the smaller in both cases.
Explain your comparisons in terms of the formula σ/√n for the standard error of the
sample mean.
Increasing sample size by a factor of 4 reduces σ/√n by a factor of 1/√4 = ½.
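As a rough check, the StDev values reported with the histograms can be set beside σ/√n. The sketch below assumes, for illustration only, that the n = 1 value (3.382) may stand in for the process σ; the function name is hypothetical.

```python
import math

# StDev values reported with the five histograms (1000 simulated means each).
observed_sd = {1: 3.382, 5: 1.467, 10: 1.029, 20: 0.7356, 40: 0.5147}

# Assumption: the n = 1 value is used as a stand-in for the process sigma.
sigma_estimate = observed_sd[1]


def standard_error(sigma, n):
    """Standard error of the mean of n values: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


for n, sd in observed_sd.items():
    print(f"n = {n:>2}: observed {sd:.4f}, "
          f"sigma/sqrt(n) = {standard_error(sigma_estimate, n):.4f}")

# Quadrupling the sample size should roughly halve the StDev.
ratio_5_to_20 = observed_sd[20] / observed_sd[5]
ratio_10_to_40 = observed_sd[40] / observed_sd[10]
print(f"ratios: {ratio_5_to_20:.3f}, {ratio_10_to_40:.3f}")
```

Both ratios come out close to 0.5, as the formula suggests.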
Note: The summary data with the histograms refers to "Standard Deviation". These
are the standard deviations of the Xbar values and so approximate the standard
error of Xbar.
Recall that the standard error is defined to be the standard deviation of the sampling
distribution of Xbar. There are two ways of getting at the standard error of Xbar. One is
to use the formula s/n, where s is an estimate of the process standard deviation, ,
that may be calculated from single values sampled from the process. The second
way is to calculate the standard deviation of Xbar values sampled from the process.
The second is what is done in our simulation exercise and is recorded in the
histogram summaries.
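The two routes to the standard error can be compared in a small simulation. This is a sketch under stated assumptions, not the Minitab exercise itself: the process is taken to be Normal with mean 35 and standard deviation 3.3 (the rounded centre lines from the control charts), with samples of size 5 and 1000 repetitions, and the seed is arbitrary.

```python
import math
import random
import statistics

# Assumptions for illustration only: Normal process, mean 35, sd 3.3,
# samples of size 5, 1000 repetitions.
PROCESS_MEAN, PROCESS_SD = 35.0, 3.3
SAMPLE_SIZE, REPETITIONS = 5, 1000

rng = random.Random(2003)  # fixed seed so that the run is repeatable

# Way 1: estimate sigma from single values, then apply s / sqrt(n).
single_values = [rng.gauss(PROCESS_MEAN, PROCESS_SD) for _ in range(REPETITIONS)]
se_from_formula = statistics.stdev(single_values) / math.sqrt(SAMPLE_SIZE)

# Way 2: calculate the standard deviation of many simulated Xbar values.
xbar_values = [
    statistics.fmean([rng.gauss(PROCESS_MEAN, PROCESS_SD) for _ in range(SAMPLE_SIZE)])
    for _ in range(REPETITIONS)
]
se_from_simulation = statistics.stdev(xbar_values)

theoretical_se = PROCESS_SD / math.sqrt(SAMPLE_SIZE)
print(f"s/sqrt(n) from single values: {se_from_formula:.3f}")
print(f"StDev of simulated Xbar:      {se_from_simulation:.3f}")
print(f"sigma/sqrt(n):                {theoretical_se:.3f}")
```

Both estimates should land near σ/√n, about 1.48 under these assumptions, differing from it only by simulation error.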
3. One-sample significance tests
Each reference of a point in an Xbar chart to the control limits amounts to a significance
test based on the values from which the corresponding Xbar value was calculated.
[Figure: Xbar-S Chart of Clip gap, Samples 1 to 25. Sample Mean chart: centre line = 70, LCL = 59.27, UCL = 80.73. Sample StDev chart: centre line = 8, LCL = 0, UCL = 16.71.]
Testing the hypothesis that the process is on target using Sample 5 gives the result:
One-Sample Z: Clip gap_5
Test of mu = 70 vs not = 70
The assumed standard deviation = 8

Variable    N  Mean   StDev  SE Mean  95% CI          Z     P
Clip gap_5  5  75.00  7.91   3.58     (67.99, 82.01)  1.40  0.162
What is the value of Z?
1.40
What is the value of p?
0.162
What conclusion do you draw?
The hypothesis μ = 70 is accepted.
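The Z and p values in the Minitab output can be reproduced from the summary figures. The sketch below uses `NormalDist` from Python's standard `statistics` module; the function name `one_sample_z` is hypothetical, and the inputs are those shown in the output for Sample 5.

```python
import math
from statistics import NormalDist


def one_sample_z(xbar, mu0, sigma, n):
    """Return (Z, two-sided p-value) for a one-sample Z test with known sigma."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Values taken from the Minitab output for Sample 5.
z5, p5 = one_sample_z(xbar=75.0, mu0=70.0, sigma=8.0, n=5)
print(f"Z = {z5:.2f}, p = {p5:.3f}")
```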
Note: The Laboratory exercise asked for a test of the statistical significance of the mean of
Sample 2. This led to a Z value of 1.96 and a p-value of 0.05. These calculated values are,
entirely coincidentally, the same as the conventional critical value and significance level for the
Z-test. This coincidence is likely to lead to confusion, so another sample was chosen instead of
Sample 2.
Correspondence between control chart test and significance test
Verify the correspondence between the control chart test and the Z test.
The following extended answer to this question is taken from Stuart (2003), pp. 161-162.
The control chart test is conventionally formalised as follows. A point on the chart is outside the
control limits if its X̄ value deviates from the centre line by more than 3 standard errors. If μ₀ is
the process mean value corresponding to the centre line, this is equivalent to saying that the
value of X̄ satisfies
    X̄ – μ₀ > 3 × σ/√n   or   X̄ – μ₀ < –3 × σ/√n,
that is
    (X̄ – μ₀)/(σ/√n) > 3   or   (X̄ – μ₀)/(σ/√n) < –3,
that is, the calculated value of (X̄ – μ₀)/(σ/√n) exceeds 3 in magnitude.
Here, (X̄ – μ₀)/(σ/√n), usually denoted by Z, is referred to as the test statistic and the number 3
is called the critical value for the test statistic. It is critical in the sense that the deviation of
X̄ from μ₀ is statistically significant or not according as the calculated value of the test statistic
exceeds the critical value or not.
The null hypothesis is rejected if the test statistic exceeds the critical value in magnitude,
corresponding to an "out-of-control" signal from the control chart. Otherwise the null hypothesis is
accepted, at least provisionally.
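The equivalence of the two rules can be checked numerically. The following is a minimal sketch using the Clip gap settings from this handout (target 70, assumed standard deviation 8, samples of 5); the function names and the grid of candidate means are illustrative assumptions.

```python
import math

# Target, assumed sigma and sample size from the Clip gap charts.
MU0, SIGMA, N = 70.0, 8.0, 5
CRITICAL = 3.0

se = SIGMA / math.sqrt(N)
lcl, ucl = MU0 - CRITICAL * se, MU0 + CRITICAL * se


def outside_limits(xbar):
    """Control chart test: is the sample mean outside the control limits?"""
    return xbar < lcl or xbar > ucl


def z_exceeds_critical(xbar):
    """Z test: does the test statistic exceed the critical value in magnitude?"""
    return abs((xbar - MU0) / se) > CRITICAL


# Hypothetical grid of sample means from 55 to 85 in steps of 0.5.
candidate_means = [55 + 0.5 * k for k in range(61)]
agree = all(outside_limits(x) == z_exceeds_critical(x) for x in candidate_means)

print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}, rules agree: {agree}")
```

The limits reproduce the 59.27 and 80.73 shown on the chart. A mean of 75 (Sample 5) passes both tests, while a mean of 82 (Sample 15) fails both.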
The correspondence between the control chart test and the Z test is illustrated in Figure 1.
[Figure 1: Normal curve for control chart test and Z test. Z scale: –3 to 3. Corresponding X̄ scale: μ₀ – 3σ/√n, μ₀, μ₀ + 3σ/√n.]
This shows the sampling distribution of X̄ or Z, as appropriate, assuming the null hypothesis
holds. For Z, this is the standard Normal distribution with mean 0 and standard deviation 1, for
which appropriate tables are available.
Recall that the sampling distribution of a sample statistic¹ is the frequency distribution of values
of the statistic that arise from repeated sampling of the process, calculating a new value of the
statistic from each sample. Assuming that the process is in control, a value of X̄ outside the
control limits is improbable; finding such a value makes an in-control process implausible.
Equivalently, assuming the null hypothesis, a value of Z exceeding ±3 is improbable; finding
such a value makes the null hypothesis implausible.
What critical value for Z is needed to ensure the correspondence?
Control chart critical value = 3.
What is the significance level corresponding to this critical value?² Use the Normal
table and / or the Minitab Normal cumulative distribution function (Calc menu).
Minitab calculation gives the following result:
Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1

 x   P( X <= x )
-3   0.0013499
0.0013499 is the area of the left tail. Two tails area = 0.0027.
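The same tail areas can be obtained without Minitab. The sketch below uses `NormalDist` from Python's standard `statistics` module, and also recovers the conventional critical value for a 0.05 significance level for comparison.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Area to the left of -3, then doubled for the two tails.
left_tail = standard_normal.cdf(-3)
two_tail = 2 * left_tail

# For comparison: the critical value matching the conventional 0.05 level.
critical_05 = standard_normal.inv_cdf(1 - 0.05 / 2)

print(f"left tail = {left_tail:.7f}, two tails = {two_tail:.4f}")
print(f"critical value for 0.05 = {critical_05:.2f}")
```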
Check the comparison of the p-value shown in the Session window with the
significance level of the Z test and verify the conclusion of the Z test.
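0.162 > 0.0027, so the deviation is not statistically significant and the hypothesis μ = 70 is
accepted, as before.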
¹ In statistical language, any quantity whose value is calculated from sample data is called a
statistic.
² The conventional significance level for a Z test is 0.05, corresponding to a critical value of 2 (or 1.96, to
be spuriously accurate). Shewhart chose 3 as the critical value for control charts to avoid too many false
alarms that might arise with frequent sampling in an industrial setting. See Stuart (2003, pages 167-8) for
further discussion of the choice of significance levels and critical values.
Repeat the Z test for Sample 15; repeat the verification exercise.
One-Sample Z: Clip gap_15
Test of mu = 70 vs not = 70
The assumed standard deviation = 8

Variable     N  Mean   StDev  SE Mean  95% CI          Z     P
Clip gap_15  5  82.00  5.70   3.58     (74.99, 89.01)  3.35  0.001
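Z = 3.35 > 3.
P = 0.001 < 0.0027.
Both comparisons lead to rejection of the hypothesis μ = 70, in agreement with the out-of-control
signal for Sample 15 on the control chart.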
An extension of the control chart Z test
Re-analysing the data in separate subsets, up to and after Sample 17, results in:
[Figure: Xbar-S Chart of Clip gap by Sample, Samples 1 to 25, with limits calculated separately for the data up to and after Sample 17. The limits labelled on the chart, apparently those of the later subset, are: Sample Mean chart, centre line = 66.75, LCL = 56.44, UCL = 77.06; Sample StDev chart, centre line = 7.69, LCL = 0, UCL = 16.05.]
Interpret the revised charts.
The process appears to operate at different levels before and after the change in raw material.
Comment on the non-significance of Sample 15.
The (historical) centre line and control limits calculated from the data up to Sample 17 are
higher than in the original chart, based on the target centre line of 70, so that Sample 15 is not
out of control by reference to the new limits. The original control limits were inappropriate.
What is the result of the t-test applied to the "Before" data?
Null hypothesis:    μ = 70
Test statistic:     t = (X̄ – μ₀)/(s/√n)
Calculated value:   4.44
Critical value:     1.99 (read approximately from the t-table)
Comparison:         4.44 > 1.99
Conclusion:         Reject null hypothesis
How many degrees of freedom did you associate with t?
84 = 85 – 1
Repeat the above analysis for the "After" data.
Null hypothesis:    μ = 70
Test statistic:     t = (X̄ – μ₀)/(s/√n)
Calculated value:   –2.75
Critical value:     2.02 (read approximately from the t-table)
Comparison:         2.75 > 2.02
Conclusion:         Reject null hypothesis
Explain why the deviation of the "Before" data from 70 appears considerably more
significant than that of the "After" data. What factors influence this difference
between the two tests?
The "Before" mean is further from 70.
The "Before" sample size is bigger, giving a smaller standard error and therefore bigger t value.
The "Before" degrees of freedom are bigger, giving a smaller critical value, easier to be
exceeded.
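The effect of sample size alone on the t value can be illustrated with made-up inputs. In the sketch below, the deviation of 3 units from 70 and the value s = 8 are hypothetical; 85 is the "Before" sample size implied by the 84 degrees of freedom, and 40 for "After" is an assumption (8 samples of 5).

```python
import math


def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Hypothetical inputs: the same deviation (3 units) and the same s (8) for
# both calculations, so that only the sample size differs.
t_large_sample = t_statistic(73.0, 70.0, 8.0, 85)  # "Before" size: 84 + 1
t_small_sample = t_statistic(73.0, 70.0, 8.0, 40)  # assumed "After" size

print(f"n = 85: t = {t_large_sample:.2f}")
print(f"n = 40: t = {t_small_sample:.2f}")
print(f"ratio = {t_large_sample / t_small_sample:.3f}")
```

With everything else equal, the larger sample gives a t value bigger by a factor of √(85/40), about 1.46, before the larger "Before" deviation from 70 is taken into account.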
Diagnostic analysis
Review the s charts you constructed earlier and comment on the "constant standard
deviation" assumption.
The "constant standard deviation" assumption is supported by the s chart; differences between
successive s values may be ascribed to chance variation.
Diagnostic analysis using Residuals
[Figure: Time Series Plot of Residuals (Residuals against Index, 1 to 120).]
[Figure: Normal plot of Residuals (Residuals against Normal Score).]
Describe the variation pattern(s) you see in these data.
The time series plot shows random variation, with no particular pattern evident.
The Normal plot shows approximate linearity. The horizontal strips are due to the rounded
character of the data, rounded to the nearest 5 units.
Are there any patterns that would undermine an assumption of pure chance variation
or of Normality?
No.
Data correlated over time tend to show "tracking" patterns, whereby successive values tend to
be close together, thus causing apparent waves in the data.
Rounding does not detract from the Normality assumption in any serious way, although tests of
Normality such as the Anderson-Darling test will react to it, being sensitive to any type of
departure from strict Normality.
4. Application to Paired Comparisons
[Figure: Profile Plot of Before, After (Platelets, per cent, against Subject, 1 to 11).]
Comment on the variation pattern in the graph, with particular attention to
correspondences between pairs of measurements on subjects, or lack thereof.
There is considerable variation between subjects, varying from around 25 to around 80. The
variation pattern across subjects is similar for Before and After, with a couple of exceptions.
Within subject differences between After and Before are consistently positive, apart from
Subject 10, where the difference is –1.
[Figure: Profile Plot of Before, After, Difference (Platelets, per cent, against Subject, 1 to 11).]
Comment on the variation pattern.
How does the range of variation of the differences relate to the ranges of variation of
the measurements?
How does the size of the differences relate to the size of the measurements?
The range of variation of the differences is considerably smaller than that of the measurements.
The size of the differences appears to be positively related to the size of the measurements.
A scatterplot of Differences versus Means provides further insight.
[Figure: Scatterplot of Difference vs Means.]
Comment on the variation pattern.
How does the size of the differences relate to the size of the measurements?
How do you regard the largest difference?
The relationship between Difference and size is less clear here and depends on how one or
more cases are treated as exceptional or not. If the largest difference is regarded as
exceptional, there appears to be less of a relationship. If the largest difference and the largest
mean are regarded as exceptional, there appears to be a quadratic relationship. In these
circumstances, speculating on the nature of relationships is risky.
[Figure: Probability Plot of Difference, Normal (Difference against Score). Mean = 10.27, StDev = 7.976, N = 11, AD = 0.265, P-Value = 0.618.]
[Figure: Probability Plot of Means, Normal (Means against Score). Mean = 47.32, StDev = 16.54, N = 11, AD = 0.292, P-Value = 0.538.]
Comment on the Normality of the differences and on the exceptional (or otherwise)
status of the largest difference.
Both differences and Means appear Normal, with no exceptional cases.
Note the similarity of the next three largest differences.
Patterns such as this, particularly in such a small data set, are not remarkable.
Testing the significance of the differences.
One-Sample T: Difference
Test of mu = 0 vs not = 0

Variable    N   Mean   StDev  SE Mean  95% CI         T     P
Difference  11  10.27  7.98   2.40     (4.91, 15.63)  4.27  0.002
Paired T-Test and CI: After, Before
Paired T for After - Before

            N   Mean   StDev  SE Mean
After       11  52.45  18.30  5.52
Before      11  42.18  15.61  4.71
Difference  11  10.27  7.98   2.40

95% CI for mean difference: (4.91, 15.63)
T-Test of mean difference = 0 (vs not = 0): T-Value = 4.27  P-Value = 0.002
Null hypothesis:    μ = 0
Test statistic:     t = (D̄ – 0)/(sD/√11)
Calculated value:   4.27
Critical value:     2.23
Comparison:         4.27 > 2.23
Conclusion:         Reject null hypothesis.
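The calculated value and the confidence interval can be reproduced from the summary figures for the differences. In this sketch, the standard deviation 7.976 is the value shown with the probability plot of the differences, and 2.23 is the critical value used above (the t-table value for 10 degrees of freedom); the variable names are illustrative.

```python
import math

# Summary figures for the differences (After - Before).
n = 11
mean_diff = 10.27
sd_diff = 7.976
t_critical = 2.23  # critical value quoted in the feedback (10 degrees of freedom)

se_diff = sd_diff / math.sqrt(n)
t_value = (mean_diff - 0) / se_diff

# 95% confidence interval: mean difference +/- critical value x standard error.
ci_low = mean_diff - t_critical * se_diff
ci_high = mean_diff + t_critical * se_diff

reject_null = abs(t_value) > t_critical

print(f"SE = {se_diff:.2f}, t = {t_value:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f}), reject null: {reject_null}")
```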
Note: Formal reports such as this are usually confined to text books. In practice, more
informal reports are made, such as:
    The observed mean difference was 10.27, with standard error 2.40. This was
    statistically significant at the 5% level of significance (p = 0.002). A 95% confidence
    interval for the true mean difference is (4.91, 15.63).
The formal reports are sought in this exercise to help ensure that students understand the basis
for the statistical significance test.
Note also that the reporting of confidence intervals is important when statistically significant
results are found. Reporting merely on the statistical significance of a result is of little
assistance to a client who wants to know something about the substantive significance of the
results. Confidence intervals allow the client to make judgements on substantive significance.