
Stat Exposé

The d-two-star tables uncovered

Gregory F. Gruska, The Third Generation Inc
with input from the MSA workgroup:
David Benham, DaimlerChrysler Corporation
Peter Cvetkovski, Ford Motors Corporation
Michael Down, General Motors Corporation

Abstract

From the beginning of the development and implementation of Statistical Quality Control (SQC), now known as Statistical Process Control (SPC), the range has been used to develop an estimate of the process standard deviation. Since the range is a biased estimate of the standard deviation, “correction” factors have to be used to transform the average range to an estimate of the process standard deviation. This paper will discuss the development of the $d_2^*$ tables which contain the necessary “correction” factors.

ξξξξξ Warning
This paper is rated ξ (xi) since it contains Greek letters, mathematical symbols, and statistical terminology. Individuals with no statistical background or training should proceed with caution. Professional statistical guidance is recommended.
Warning ξξξξξ

This paper is intended to serve as additional guidelines for the analysis of measurement systems.
www.aiag.org/publications/quality/msa3.html

The d-two-star tables uncovered

From the beginning of the development and implementation of Statistical Quality Control (SQC), now known as Statistical Process Control (SPC), the range has been used to develop an estimate of the process standard deviation. Although the range provides a less efficient estimate of the population standard deviation, it was widely used due to the ease of calculation and the lack of inexpensive computers and calculators capable of calculating the standard deviation during the first five decades of SPC. Because of its wide use, the distribution of the range in random samples from a normal distribution has been studied by noted statisticians such as David, Grubbs, Weaver, Patnaik, Hartley, Pearson, and Duncan.
The difficulty with reading their papers is that there is no consistent notation used. This paper will use the notation contained in Quality Control and Industrial Statistics¹ by Acheson Duncan because of its prominence in the quality field.

The above authors have shown that the distribution of the range in random samples from a normal distribution:
• Depends on the sample size
• Is independent of the population mean
• Is dependent on the population standard deviation

Further, the relative efficiency of the range as an estimator of the standard deviation decreases as the sample size increases².

Unfortunately, a simple form of the exact distribution of the (mean) range cannot be developed except for the trivial case of two samples of two observations each. However, Patnaik (1950) did develop a useful approximation to this distribution, which is utilized here.

¹ Quality Control and Industrial Statistics, 5th edition, by Acheson Duncan, McGraw-Hill, 1986.
² A generally acceptable rule is that the range should not be used when the sample size exceeds 20. In these cases it is preferable to divide the sample into a number of groups and consider the average range over all the groups. A subgroup size of seven or eight provides the most efficient estimation.

Approximation to the Distribution of the Mean Range

Let $x_1, x_2, \ldots, x_m$ denote a random sample of size $m$ from a normal population having mean $\mu$ and standard deviation $\sigma$. The range of this sample is

$$R_m = \max(x_1, x_2, \ldots, x_m) - \min(x_1, x_2, \ldots, x_m)$$

If there are $g$ such independent samples, each with a sample size of $m$, the mean of the $g$ ranges is denoted in this paper by $\bar{R}_{g,m}$. Let $W_m$ denote the range of the standardized (z) values. That is,
$$W_m = \max(z_1, z_2, \ldots, z_m) - \min(z_1, z_2, \ldots, z_m) \quad \text{where} \quad z_i = \frac{x_i - \mu}{\sigma}$$

Then the probability integral of $W_m$ can be expressed as

$$P(W_m) = m \int_{-\infty}^{\infty} f(z) \left\{ \int_{z}^{z + W_m} f(u)\,du \right\}^{m-1} dz$$

where $f(x)$ is the normal frequency function:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$$

The moments for the probability integral of $W_m$ have been calculated to 5 decimal places using numerical quadrature by Hartley and Pearson (1951). The following table has been extracted from their work.

Sample Size (m)    Mu = d2    Var = Vm
       2           1.12838    0.72676
       3           1.69257    0.78922
       4           2.05875    0.77407
       5           2.32593    0.74661
       6           2.53441    0.71916
       7           2.70436    0.69424
       8           2.84720    0.67213
       9           2.97003    0.65262
      10           3.07751    0.63531
      11           3.17287    0.61984
      12           3.25846    0.60601
      13           3.33598    0.59353
      14           3.40676    0.58217
      15           3.47193    0.57186
      16           3.53198    0.56237
      17           3.58788    0.55363
      18           3.64006    0.54554
      19           3.68896    0.53802
      20           3.73495    0.53097

Table 1: Mean and Variance for the distribution of ranges in normal samples

The range is a biased estimate of the standard deviation, and a “correction” factor ($d_2$) has to be used to transform the average range to an estimate of the process standard deviation. I.e.,

$$E(W_m) = \frac{E(R_m)}{\sigma} = d_2 \quad \text{so} \quad \hat{\sigma} = \frac{\bar{R}_m}{d_2}$$

For the distribution of $\bar{R}_{g,m}$, things are not so simple. However, based on the work by Pearson (1926), Patnaik selected the $\chi$-distribution as a reasonably accurate representation of the distribution of $\bar{R}_{g,m}$. The first two moments of $\bar{R}_{g,m}$ are related to those of $R_m$ by

$$E\left(\frac{\bar{R}_{g,m}}{\sigma}\right) = d_2$$

$$\operatorname{var}\left(\frac{\bar{R}_{g,m}}{\sigma}\right) = \frac{1}{g\sigma^2}\operatorname{var}(R_m) = \frac{1}{g}V_m = V_{g,m}$$

Relating these two moments with those of $c\chi/\sqrt{\nu}$, where $\chi$ has $\nu$ degrees of freedom, yields:

$$d_2 = c\,\sqrt{\frac{2}{\nu}}\;\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)}$$

$$V_{g,m} = \frac{c^2}{\nu}\left[\nu - 2\left\{\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)}\right\}^2\right]$$

where $c = d_2^*$.

The $\Gamma$-functions can be expanded by Stirling’s formula and the resulting equations simplified and used to solve for $d_2^*$ and $\nu$.
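As a cross-check on the two moment equations above, they can also be solved numerically without any Stirling expansion. The sketch below is illustrative only and is not the procedure used for the published table: the function names are hypothetical, and bisection is a choice made here for simplicity. It relies on the fact that adding the square of the first equation to the second gives $c^2 = d_2^2 + V_{g,m}$ exactly, after which $\nu$ is the root of the first equation.

```python
import math

def chi_mean_ratio_sq(nu):
    """Return (2/nu) * (Gamma((nu+1)/2) / Gamma(nu/2))**2, i.e. (E[chi]/sqrt(nu))**2."""
    log_ratio = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
    return (2.0 / nu) * math.exp(2.0 * log_ratio)

def match_moments(d2, Vm, g, lo=1e-3, hi=1e6, iters=200):
    """Solve the two moment equations for c (= d2*) and nu.

    d2 and Vm are the Table 1 values for the subgroup size m;
    g is the number of ranges averaged. Hypothetical helper for illustration.
    """
    # Squaring the first equation and adding the second gives c^2 = d2^2 + Vm/g.
    c = math.sqrt(d2 ** 2 + Vm / g)
    # The first equation then reads chi_mean_ratio_sq(nu) = (d2/c)^2.
    target = (d2 / c) ** 2
    # chi_mean_ratio_sq increases with nu toward 1, so bisection applies.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if chi_mean_ratio_sq(mid) < target:
            lo = mid
        else:
            hi = mid
    return c, 0.5 * (lo + hi)

# Example: m = 2 (d2 = 1.12838, Vm = 0.72676) with g = 5 ranges.
c, nu = match_moments(1.12838, 0.72676, 5)
print(round(c, 5), round(nu, 1))
```

For small $g$ the numerically solved $\nu$ and a truncated series for $\nu$ may differ slightly in the last digit, since a series expansion is itself an approximation.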
$$(d_2^*)^2 = d_2^2 + \frac{V_m}{g}$$

$$\nu = A^{-1} + \frac{1}{4} - \frac{3}{16}A + \frac{3}{64}A^2 \quad \text{where} \quad A = \frac{2V_{g,m}}{d_2^2} = \frac{2V_m}{g\,d_2^2}$$

These are the formulae used to generate the $d_2^*$ table in Appendix C of the MSA Manual, 3rd edition. The constant difference (C.D.) given in the table is calculated by

$$\text{C.D.} = \frac{d_2^2}{2V_m}$$

Using the $d_2^*$ Table

Whither go g and m?

The thing that tends to be most confusing to people first using the $d_2^*$ table is what the values for g and m should be. The best thing is to bring it down to basics:

How many ranges are used to calculate the average range? = g
How many pieces were used to determine each range? = m

In the MSA 3rd edition example for the Range Method there are five ranges used to calculate the average range, hence g = 5. And each range was the difference of samples of size two, so m = 2. From the table the value of $d_2^*$ for g = 5 (= number of parts) and m = 2 (= number of appraisers) is 1.19105 or simply 1.19 (unless you are enamored with decimals).

In the GRR example with a 3, 10, 3 setup, the average range used for repeatability calculations has g = 30 (= number of parts * number of appraisers) range values used in the calculations. Each range is based on a sample of m = 3 (= number of trials). Going to the table we cannot find g = 30; the largest g is 20. We make the assumption that 30 is sufficiently large and use the $d_2$ value of 1.69257 for $d_2^*$ in the calculation of

$$K_1 = \frac{1}{d_2^*} = \frac{1}{1.69257} = 0.59081751419439077852023845394873 \approx 0.5908$$

C.D.s and dfs

The constant difference term is used to determine the degrees of freedom value ($\nu$) when the number of samples (g) exceeds the tabled values (i.e. g > 20).

Example: Find $d_2^*$ and $\nu$ for g = 22 and m = 8.

From the table we have $d_2^*$ = 2.85310 and $\nu$ = 120.9 for g = 20 and m = 8, with $d_2$ = 2.8472 and C.D. = 6.0305.

For g = 22 take $d_2^*$ = 2.853 since 22 is closer to 20 than infinity.

$\nu$ = 120.9 + 2*(6.0305) = 132.961 or 133.0

Yes, there is some “fudging”, but remember these are only approximations.

Gregory F. Gruska, a Fellow of the American Society for Quality (ASQ), is the principal consultant in performance excellence for Omnex, LLC, an engineering and management services firm. Greg has been involved in the development of theory and software and has co-authored over 60 books and papers in statistical theory and applications and quality management.

References

Duncan, A. (1986). Quality Control and Industrial Statistics, 5th edition, McGraw-Hill, New York.
David, H. A. (1951). “Further Applications of Range to the Analysis of Variance”, Biometrika, 38, 393.
Florin, H. (1950). Communications of the Royal Finnish Academy (Science Series), 12, 6.
Grubbs, F. E. and Weaver, C. L. (1947). “The Best Estimate of Population Standard Deviation Based on Group Ranges”, JASA, 42, 224.
Hartley, H. O. and Pearson, E. S. (1951). “Moment constants for the distribution of Range in Normal Samples”, Biometrika, 38, 463.
Patnaik, P. B. (1950). “The Use of Mean Range as an Estimator of Variance in Statistical Tests”, Biometrika, 37, 78.
Pearson, E. S. (1951). “Some Notes on the Use of Range”, Biometrika, 38, 88.
Pearson, E. S. and Hartley, H. O. (eds.) (1976). Biometrika Tables for Statisticians, Griffin and Co., London.
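Supplementary sketch: the worked examples in “Using the d2* Table” can be reproduced directly from the closed-form formulae for $d_2^*$, $\nu$ and C.D. and the Table 1 values. The code below is a minimal illustration, not the software used to build the published table. The function names and the `TABLE_1` dictionary are hypothetical, only three rows of Table 1 are copied in, and $A$ is read as $2V_m/(g\,d_2^2)$, which is the reading that reproduces the $\nu$ = 120.9 and C.D. = 6.0305 quoted for g = 20, m = 8.

```python
import math

# (d2, Vm) for a few subgroup sizes m, copied from Table 1.
TABLE_1 = {2: (1.12838, 0.72676), 3: (1.69257, 0.78922), 8: (2.84720, 0.67213)}

def d2_star(m, g):
    """(d2*)^2 = d2^2 + Vm/g."""
    d2, Vm = TABLE_1[m]
    return math.sqrt(d2 ** 2 + Vm / g)

def dof(m, g):
    """nu = 1/A + 1/4 - (3/16)A + (3/64)A^2, with A = 2*Vm / (g * d2^2) (assumed reading)."""
    d2, Vm = TABLE_1[m]
    A = 2 * Vm / (g * d2 ** 2)
    return 1 / A + 0.25 - 3 * A / 16 + 3 * A ** 2 / 64

def constant_difference(m):
    """C.D. = d2^2 / (2*Vm): the approximate gain in nu per additional range."""
    d2, Vm = TABLE_1[m]
    return d2 ** 2 / (2 * Vm)

# Range Method example: g = 5, m = 2 (the text quotes 1.19105).
print(round(d2_star(2, 5), 5))

# g = 20, m = 8 (the text quotes d2* = 2.85310, nu = 120.9, C.D. = 6.0305).
print(round(d2_star(8, 20), 5), round(dof(8, 20), 1), round(constant_difference(8), 4))

# g = 22, m = 8: table value plus two constant differences, next to the formula itself.
print(round(dof(8, 20) + 2 * constant_difference(8), 1), round(dof(8, 22), 1))

# GRR example: g = 30 treated as large, so d2 stands in for d2* when computing K1.
print(round(1 / TABLE_1[3][0], 4))
```

Because $\nu$ is dominated by the $A^{-1}$ term, which is proportional to g, stepping from g = 20 to g = 22 by adding two constant differences lands very close to evaluating the formula at g = 22 directly.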
