Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 3: Feedback

1. Control charts

Discuss the stability of the days to payment process with regard to (a) level, (b) spread.

(a) There appears to be excessive variation in the sample mean during the first 8 weeks, with 3 exceptionally low values, 2 below the LCL, and 2 exceptionally high values, 1 above the UCL.
(b) The standard deviation appears stable over the whole period.

What is your next step in the analysis?

As the process appears stable during recent weeks, no action on the process is suggested. However, as there is evidence of instability of the process mean in the first 8 weeks, an investigation of its cause is suggested.

Identify centre lines and control limits to use for further process monitoring. Use sensible rounding.

From the charts based on the recent data,
Xbar chart: CL = 34.98, LCL = 30.56, UCL = 39.4
s chart: CL = 3.295, LCL = 0, UCL = 6.894

With "sensible rounding",
Xbar chart: CL = 35, LCL = 30.6, UCL = 39.4
s chart: CL = 3.3, LCL = 0, UCL = 6.9

2. Simulating sampling distributions

Compare the results of the simulations. What did you expect?

Expect no out-of-control points with 30 repetitions, around 1 with 300 repetitions, around 8 with 3,000 repetitions. This follows from multiplying the number of repetitions by 0.0027, the chance of getting an out-of-control point in one repetition.

How do the histograms compare?

Expect more Normal-looking histograms with increasing repetitions.
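The expected counts quoted above can be checked with a few lines of Python. This is a minimal sketch rather than part of the laboratory exercise. It assumes Normally distributed sample means and 3-standard-error control limits, and the names `p_alarm` and `expected_false_alarms` are illustrative only.

```python
from statistics import NormalDist

# Chance that one sample mean falls outside 3-standard-error limits
# when the process is in control: both tails of the standard Normal curve.
p_alarm = 2 * NormalDist().cdf(-3)

def expected_false_alarms(repetitions: int) -> float:
    """Expected number of out-of-control points in the given number of repetitions."""
    return repetitions * p_alarm

print(f"chance of a false alarm in one repetition: {p_alarm:.4f}")
for reps in (30, 300, 3000):
    print(f"{reps:5d} repetitions: expect about {expected_false_alarms(reps):.2f} false alarms")
```

Rounded to whole numbers, the three expectations are 0, 1 and 8, in line with the answer above.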
False alarm data

[Figures: frequency charts of the number of false alarms, showing Class frequency, Lecturer frequency and Theoretical frequency.]

Simulating the effect of increasing sample size

[Figure: histograms of 1000 simulated values of Xbar for n = 1, 5, 10, 20 and 40, on a common scale from 24 to 48. The summary statistics shown with the histograms are tabulated below.]

  n     Mean    StDev      N
  1    35.02    3.382   1000
  5    35.07    1.467   1000
 10    35.01    1.029   1000
 20    35.01    0.7356  1000
 40    35.01    0.5147  1000

Comment on the effect of increasing sample size, referring to

histogram spread: reduces with increasing sample size
values of StDev: reduce with increasing sample size
values of Mean: stay around 35 (should get closer with increasing sample size)

Compare

the values of StDev for n = 5 and n = 20: latter is roughly ½ former
the values of StDev for n = 10 and n = 40: latter is roughly ½ former

noting that the larger sample size is 4 times the smaller in both cases.

Explain your comparisons in terms of the formula $\sigma/\sqrt{n}$ for the standard error of the sample mean.

Increasing the sample size by a factor of 4 reduces $\sigma/\sqrt{n}$ by a factor of $1/\sqrt{4} = 1/2$.

Note: The summary data with the histograms refers to "Standard Deviation". These are the standard deviations of the Xbar values and so approximate the standard error of Xbar. Recall that the standard error is defined to be the standard deviation of the sampling distribution of Xbar. There are two ways of getting at the standard error of Xbar.
One is to use the formula $s/\sqrt{n}$, where s is an estimate of the process standard deviation, $\sigma$, that may be calculated from single values sampled from the process. The second way is to calculate the standard deviation of Xbar values sampled from the process. The second is what is done in our simulation exercise and is recorded in the histogram summaries.

3. One-sample significance tests

Each reference of a point in an Xbar chart to the control limits amounts to a significance test based on the values from which the corresponding Xbar value was calculated.

[Figure: Xbar-S Chart of Clip gap, Samples 1 to 25. Sample Mean chart: centre line = 70, LCL = 59.27, UCL = 80.73. Sample StDev chart: centre line = 8, LCL = 0, UCL = 16.71.]

Testing the hypothesis that the process is on target using Sample 5 gives the result:

One-Sample Z: Clip gap_5

Test of mu = 70 vs not = 70
The assumed standard deviation = 8

Variable     N   Mean  StDev  SE Mean  95% CI             Z      P
Clip gap_5   5  75.00   7.91     3.58  (67.99, 82.01)  1.40  0.162

What is the value of Z? 1.40
What is the value of p? 0.162
What conclusion do you draw? The hypothesis $\mu = 70$ is accepted.

Note: The Laboratory exercise asked for a test of the statistical significance of the mean of Sample 2. This led to a Z value of 1.96 and a p-value of 0.05. These calculated values are, entirely coincidentally, the same as the conventional critical value and significance level for the Z-test. This coincidence is likely to lead to confusion, so another sample was chosen instead of Sample 2.

Correspondence between control chart test and significance test

Verify the correspondence between the control chart test and the Z test.

The following extended answer to this question is taken from Stuart (2003), pp.
161-162.

The control chart test is conventionally formalised as follows. A point on the chart is outside the control limits if its $\bar{X}$ value deviates from the centre line by more than 3 standard errors. If $\mu_0$ is the process mean value corresponding to the centre line, this is equivalent to saying that the value of $\bar{X}$ satisfies

$\bar{X} - \mu_0 > 3\sigma/\sqrt{n}$ or $\bar{X} - \mu_0 < -3\sigma/\sqrt{n}$,

that is,

$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > 3$ or $\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -3$,

that is, the calculated value of $\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ exceeds 3 in magnitude.

Here, $\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$, usually denoted by Z, is referred to as the test statistic and the number 3 is called the critical value for the test statistic. It is critical in the sense that the deviation of $\bar{X}$ from $\mu_0$ is statistically significant or not according as the calculated value of the test statistic exceeds the critical value or not. The null hypothesis is rejected if the test statistic exceeds the critical value in magnitude, corresponding to an "out-of-control" signal from the control chart. Otherwise the null hypothesis is accepted, at least provisionally.

The correspondence between the control chart test and the Z test is illustrated in Figure 1.

[Figure 1: Normal curve for control chart test and Z test. Z scale: -3 to 3. $\bar{X}$ scale: $\mu_0 - 3\sigma/\sqrt{n}$, $\mu_0$, $\mu_0 + 3\sigma/\sqrt{n}$.]

This shows the sampling distribution of $\bar{X}$ or Z, as appropriate, assuming the null hypothesis holds. For Z, this is the standard Normal distribution with mean 0 and standard deviation 1, for which appropriate tables are available. Recall that the sampling distribution of a sample statistic (see footnote 1) is the frequency distribution of values of the statistic that arise from repeated sampling of the process, calculating a new value of the statistic from each sample.

Assuming that the process is in control, a value of $\bar{X}$ outside the control limits is improbable; finding such a value makes an in-control process implausible.
Equivalently, assuming the null hypothesis, a value of Z exceeding 3 in magnitude is improbable; finding such a value makes the null hypothesis implausible.

What critical value for Z is needed to ensure the correspondence?

Control chart critical value = 3.

What is the significance level corresponding to this critical value? (See footnote 2.) Use the Normal table and/or the Minitab Normal cumulative distribution function (Calc menu).

Minitab calculation gives the following result:

Cumulative Distribution Function

Normal with mean = 0 and standard deviation = 1

 x  P( X <= x )
-3    0.0013499

0.0013499 is the area of the left tail. Two tails area = 0.0027.

Check the comparison of the p-value shown in the Session window with the significance level of the Z test and verify the conclusion of the Z test.

0.162 > 0.0027, so the null hypothesis is accepted, in agreement with the control chart, where the mean of Sample 5 lies within the control limits.

Footnotes:
1. In statistical language, any quantity whose value is calculated from sample data is called a statistic.
2. The conventional significance level for a Z test is 0.05, corresponding to a critical value of 2 (or 1.96, to be spuriously accurate). Shewhart chose 3 as the critical value for control charts to avoid too many false alarms that might arise with frequent sampling in an industrial setting. See Stuart (2003, pages 167-8) for further discussion of the choice of significance levels and critical values.

Repeat the Z test for Sample 15; repeat the verification exercise.

One-Sample Z: Clip gap_15

Test of mu = 70 vs not = 70
The assumed standard deviation = 8

Variable      N   Mean  StDev  SE Mean  95% CI             Z      P
Clip gap_15   5  82.00   5.70     3.58  (74.99, 89.01)  3.35  0.001

Z = 3.35, > 3. P = 0.001, < 0.0027.
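The correspondence can also be checked numerically from the summary values in the Minitab output. The following is a sketch only, using the target mean 70, the assumed standard deviation 8 and the sample size 5 from the output above. The function names `control_limits` and `z_test` are illustrative and are not Minitab commands.

```python
from math import sqrt
from statistics import NormalDist

def control_limits(mu0: float, sigma: float, n: int) -> tuple[float, float]:
    """Lower and upper 3-standard-error control limits for the sample mean."""
    se = sigma / sqrt(n)
    return mu0 - 3 * se, mu0 + 3 * se

def z_test(sample_mean: float, mu0: float, sigma: float, n: int) -> tuple[float, float]:
    """Z statistic and two-sided p-value for a one-sample Z test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

lcl, ucl = control_limits(mu0=70.0, sigma=8.0, n=5)
alpha_chart = 2 * NormalDist().cdf(-3)  # significance level for critical value 3

results = {}
for label, mean in (("Sample 5", 75.0), ("Sample 15", 82.0)):
    z, p = z_test(mean, mu0=70.0, sigma=8.0, n=5)
    outside_limits = mean < lcl or mean > ucl   # control chart test
    significant = p < alpha_chart               # Z test at the chart's significance level
    results[label] = (z, p, outside_limits, significant)
    print(f"{label}: Z = {z:.2f}, p = {p:.4f}, "
          f"outside limits: {outside_limits}, significant: {significant}")
```

For both samples the two tests give the same verdict: Sample 5 is inside the limits and not significant, Sample 15 is outside the limits and significant.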
An extension of the control chart Z test

Re-analysing the data in separate subsets, up to and after Sample 17, results in:

[Figure: Xbar-S Chart of Clip gap by Sample, Samples 1 to 25, with separate stages starting at Sample 1 and Sample 18. Values labelled on the Sample Mean chart: centre line = 66.75, LCL = 56.44, UCL = 77.06. Values labelled on the Sample StDev chart: centre line = 7.69, LCL = 0, UCL = 16.05.]

Interpret the revised charts.

The process appears to operate at different levels before and after the change in raw material.

Comment on the non-significance of Sample 15.

The (historical) centre line and control limits calculated from the data up to Sample 17 are higher than in the original chart, based on the target centre line of 70, so that Sample 15 is not out of control by reference to the new limits. The original control limits were inappropriate.

What is the result of the t-test applied to the "Before" data?

Null hypothesis: $\mu = 70$
Test statistic: $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Calculated value: 4.44
Critical value: 1.99 (read approximately from the t-table)
Comparison: 4.44 > 1.99
Conclusion: Reject null hypothesis

How many degrees of freedom did you associate with t?

84 = 85 - 1

Repeat the above analysis for the "After" data.

Null hypothesis: $\mu = 70$
Test statistic: $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Calculated value: -2.75
Critical value: 2.02 (read approximately from the t-table)
Comparison: 2.75 > 2.02
Conclusion: Reject null hypothesis

Explain why the deviation of the "Before" data from 70 appears considerably more significant than that of the "After" data. What factors influence this difference between the two tests?

The "Before" mean is further from 70.
The "Before" sample size is bigger, giving a smaller standard error and therefore bigger t value.
The "Before" degrees of freedom are bigger, giving a smaller critical value, easier to be exceeded.
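The sample size effect can be made concrete with a short sketch. The t values and critical values are those reported above. The "After" sample size of 40 is an assumption, inferred from 25 samples of 5 values less the 85 "Before" values. The names `tests`, `reject_null` and `se_factor` are illustrative.

```python
from math import sqrt

tests = {
    "Before": {"t": 4.44, "critical": 1.99, "n": 85},
    # n = 40 is an assumption: 25 samples of 5 values, less the 85 "Before" values
    "After": {"t": -2.75, "critical": 2.02, "n": 40},
}

def reject_null(t: float, critical: float) -> bool:
    """Two-sided test: reject when t exceeds the critical value in magnitude."""
    return abs(t) > critical

def se_factor(n: int) -> float:
    """Standard error of the mean expressed as a multiple of s, that is 1/sqrt(n)."""
    return 1 / sqrt(n)

for label, v in tests.items():
    print(f"{label}: |t| = {abs(v['t']):.2f}, critical = {v['critical']:.2f}, "
          f"reject: {reject_null(v['t'], v['critical'])}, "
          f"SE = {se_factor(v['n']):.3f} x s")

# For the same s, the "Before" standard error is this fraction of the "After" one.
se_ratio = se_factor(tests["Before"]["n"]) / se_factor(tests["After"]["n"])
print(f"ratio of standard errors (Before / After): {se_ratio:.2f}")
```

Under that assumption, the same deviation from 70 and the same s would give a "Before" t value roughly 1.5 times the "After" one, before allowing for the larger deviation of the "Before" mean.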
Diagnostic analysis

Review the s charts you constructed earlier and comment on the "constant standard deviation" assumption.

The "constant standard deviation" assumption is supported by the s chart; differences between successive s values may be ascribed to chance variation.

Diagnostic analysis using Residuals

[Figure: Time Series Plot of Residuals. Residuals against Index.]

[Figure: Normal plot of Residuals. Residuals against Normal Score.]

Describe the variation pattern(s) you see in these data.

The time series plot shows random variation, with no particular pattern evident.
The Normal plot shows approximate linearity. The horizontal strips are due to the rounded character of the data, rounded to the nearest 5 units.

Are there any patterns that would undermine an assumption of pure chance variation or of Normality?

No.

Data correlated over time tend to show "tracking" patterns, whereby successive values tend to be close together, thus causing apparent waves in the data.

Rounding does not detract from the Normality assumption in any serious way, although tests of Normality such as the Anderson-Darling test will react to it, being sensitive to any type of departure from strict Normality.

4. Application to Paired Comparisons

[Figure: Profile Plot of Before, After. Platelets, per cent, against Subject (1 to 11).]

Comment on the variation pattern in the graph, with particular attention to correspondences between pairs of measurements on subjects, or lack thereof.

There is considerable variation between subjects, varying from around 25 to around 80. The variation pattern across subjects is similar for Before and After, with a couple of exceptions.
Within-subject differences between After and Before are consistently positive, apart from Subject 10, where the difference is -1.

[Figure: Profile Plot of Before, After, Difference. Platelets, per cent, against Subject (1 to 11).]

Comment on the variation pattern. How does the range of variation of the differences relate to the ranges of variation of the measurements? How does the size of the differences relate to the size of the measurements?

The range of variation of the differences is considerably smaller than that of the measurements. The size of the differences appears to be positively related to the size of the measurements. A scatterplot of Differences versus Means provides further insight.

[Figure: Scatterplot of Difference vs Means.]

Comment on the variation pattern. How does the size of the differences relate to the size of the measurements? How do you regard the largest difference?

The relationship between Difference and size is less clear here and depends on how one or more cases are treated as exceptional or not. If the largest difference is regarded as exceptional, there appears to be less of a relationship. If the largest difference and the largest mean are regarded as exceptional, there appears to be a quadratic relationship. In these circumstances, speculating on the nature of relationships is risky.
[Figure: Probability Plot of Difference (Normal). Difference against Score. Mean = 10.27, StDev = 7.976, N = 11, AD = 0.265, P-Value = 0.618.]

[Figure: Probability Plot of Means (Normal). Means against Score. Mean = 47.32, StDev = 16.54, N = 11, AD = 0.292, P-Value = 0.538.]

Comment on the Normality of the differences and on the exceptional (or otherwise) status of the largest difference.

Both Differences and Means appear Normal, with no exceptional cases. Note the similarity of the next three largest differences. Patterns such as this, particularly in such a small data set, are not remarkable.

Testing the significance of the differences.

One-Sample T: Difference

Test of mu = 0 vs not = 0

Variable     N   Mean  StDev  SE Mean  95% CI            T      P
Difference  11  10.27   7.98     2.40  (4.91, 15.63)  4.27  0.002

Paired T-Test and CI: After, Before

Paired T for After - Before

             N   Mean  StDev  SE Mean
After       11  52.45  18.30     5.52
Before      11  42.18  15.61     4.71
Difference  11  10.27   7.98     2.40

95% CI for mean difference: (4.91, 15.63)
T-Test of mean difference = 0 (vs not = 0): T-Value = 4.27  P-Value = 0.002

Null hypothesis: $\mu = 0$
Test statistic: $t = \frac{\bar{D} - 0}{s_D/\sqrt{11}}$
Calculated value: 4.27
Critical value: 2.23
Comparison: 4.27 > 2.23
Conclusion: Reject null hypothesis.

Note: Formal reports such as this are usually confined to text books. In practice, more informal reports are made, such as:

The observed mean difference was 10.27, with standard error 2.40. This was statistically significant at the 5% level of significance (p = 0.002). A 95% confidence interval for the true mean difference is (4.91, 15.63).

The formal reports are sought in this exercise to help ensure that students understand the basis for the statistical significance test.
Note also that the reporting of confidence intervals is important when statistically significant results are found. Reporting merely on the statistical significance of a result is of little assistance to a client who wants to know something about the substantive significance of the results. Confidence intervals allow the client to make judgements on substantive significance.
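As a check on the paired comparison output, the standard error, t value and 95% confidence interval can be recomputed from the summary statistics of the differences. This is a minimal sketch. It uses the mean 10.27, standard deviation 7.976 and N = 11 reported with the probability plot, and the critical value 2.23 quoted in the formal test. The variable names are illustrative.

```python
from math import sqrt

n = 11              # number of subjects
mean_diff = 10.27   # mean of the After - Before differences
sd_diff = 7.976     # standard deviation of the differences
t_critical = 2.23   # critical value quoted in the formal test (10 degrees of freedom)

se = sd_diff / sqrt(n)                  # standard error of the mean difference
t_value = (mean_diff - 0) / se          # test statistic for a true mean difference of 0
ci_low = mean_diff - t_critical * se    # lower 95% confidence limit
ci_high = mean_diff + t_critical * se   # upper 95% confidence limit

print(f"SE = {se:.2f}, t = {t_value:.2f}")
print(f"95% CI for the mean difference: ({ci_low:.2f}, {ci_high:.2f})")
print(f"reject null hypothesis: {abs(t_value) > t_critical}")
```

The interval excludes 0, which is the confidence interval counterpart of the significant t-test, and its width indicates how precisely the mean difference has been estimated.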