Dimensional Engineering News
A monthly newsletter for Dimensional Engineers
Knowledge Improves Quality -- News Mission Statement
November 2010, Issue 91
Copyright 2010 by Dimensional Control Systems, Inc. www.3dcs.com

How Many Simulation Runs Are Required

Contributing editor: Brenda Quinlan, Dimensional Engineering Specialist, Dimensional Control Systems

3DCS® is a variation analysis tool that uses Monte Carlo simulation to predict the results of a set of measurements [Dimensional Engineering News, June 2009]. After a variation model is built in 3DCS®, a Monte Carlo simulation can be run to provide the following statistics:

- Descriptive statistics: values calculated directly from the sample data, such as the mean, minimum, maximum, standard deviation, percentage out-of-spec, and confidence intervals.
- Inferential statistics: estimates based on a curve-fitting algorithm, such as the estimated low, estimated high, and estimated percentage out-of-spec.

See the statistics definitions and equations in [Dimensional Engineering News, October 2003 and August 2004].

A statistic is calculated from a certain number of samples randomly drawn from a population. If all the members of the population are used in the calculation, the result is a parameter, often referred to as the "true" value. To estimate how well a statistic predicts the parameter of the entire population, its confidence interval may also be calculated. A confidence level, frequently 90%, 95%, or 99%, is chosen first, and then the upper and lower limits of the confidence interval are calculated. The confidence level is then the probability that the parameter lies within the confidence interval.

A common and critical question is: "What is the proper number of simulation runs when performing a Monte Carlo simulation?" In the following, the recommended number of runs is calculated from the confidence interval of the standard deviation.

The confidence interval for the standard deviation σ at confidence level CL is calculated as:

$$CI_L < \sigma < CI_U \tag{1}$$

$$CI_U = s\,\sqrt{\frac{N-1}{\chi^2_{\alpha/2,\,\nu}}} \tag{2}$$

$$CI_L = s\,\sqrt{\frac{N-1}{\chi^2_{1-\alpha/2,\,\nu}}} \tag{3}$$

where α = 1 − CL, $\chi^2_{\alpha/2,\,\nu}$ is the quantile of the chi-squared distribution with ν = N − 1 degrees of freedom at probability α/2 (the value below which a fraction α/2 of the distribution lies), and s is the sample standard deviation, which is the estimate of σ.

Chi-squared (χ²) Distribution

This is the distribution of the sum of the squares of ν independent standard normal random variables. It has one parameter, ν, the degrees of freedom. Its probability density function is

$$\mathrm{Pdf}(x) = \frac{x^{(\nu-2)/2}\,e^{-x/2}}{2^{\nu/2}\,\Gamma(\nu/2)} \quad \text{for } x \ge 0$$

where Γ(ν/2) is the gamma function evaluated at ν/2. When n is a positive integer, Γ(n) = (n − 1)!.

Figure 1. Chi-squared Distribution

Let

$$\chi_U = \sqrt{\frac{N-1}{\chi^2_{\alpha/2,\,\nu}}}, \qquad \chi_L = \sqrt{\frac{N-1}{\chi^2_{1-\alpha/2,\,\nu}}}$$

Define the Confidence Interval Factor for the standard deviation σ as

$$CI_{\sigma f} = (CI_U - CI_L)/s \tag{4}$$

From EQs (1-4), the Confidence Interval Factor is obtained as

$$CI_{\sigma f} = \chi_U - \chi_L \tag{5}$$

So CIσf is the ratio of the width of the confidence interval of the standard deviation to the sample standard deviation. A smaller CIσf corresponds to more accurate simulation statistics. Table 1 gives the Confidence Interval Factors from EQ (5) for three confidence levels and five sample sizes.

Table 1. Confidence Interval Factor CIσf

CL      α       N = 1000   N = 2000   N = 5000   N = 10000   N = 20000
0.990   0.010   0.1156     0.0816     0.0515     0.0364      0.0258
0.950   0.050   0.0879     0.0621     0.0392     0.0277      0.0196
0.900   0.100   0.0737     0.0521     0.0329     0.0233      0.0165

Note: the estimate is best when CIσf is close to 0.0.

Table 1 can be used to select the number of Monte Carlo simulation runs. For example, for a simulation with 5000 runs, CIσf is 0.0392 at a confidence level of 95%.
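The Confidence Interval Factor of EQ (5) can be checked numerically. What follows is a minimal Python sketch under stated assumptions, not 3DCS code. It assumes the square-root form χU = sqrt((N − 1)/χ²(α/2, ν)), which is the form that reproduces the Table 1 values. Because the Python standard library has no chi-squared quantile function, it swaps in the Wilson-Hilferty normal approximation, which is reasonable only for large ν such as the sample sizes in Table 1. The function names `chi2_quantile` and `ci_sigma_factor` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile(p: float, nu: int) -> float:
    """Approximate chi-squared quantile at cumulative probability p.

    Assumption: uses the Wilson-Hilferty approximation (good for large nu),
    not an exact inverse chi-squared CDF.
    """
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    c = 2.0 / (9.0 * nu)
    return nu * (1.0 - c + z * sqrt(c)) ** 3


def ci_sigma_factor(n: int, cl: float) -> float:
    """Confidence Interval Factor for sigma, EQ (5): chi_U - chi_L."""
    alpha = 1.0 - cl
    nu = n - 1  # degrees of freedom
    chi_u = sqrt(nu / chi2_quantile(alpha / 2.0, nu))
    chi_l = sqrt(nu / chi2_quantile(1.0 - alpha / 2.0, nu))
    return chi_u - chi_l


# Print one row per confidence level, in the same layout as Table 1.
for cl in (0.99, 0.95, 0.90):
    row = [ci_sigma_factor(n, cl) for n in (1000, 2000, 5000, 10000, 20000)]
    print(f"{cl:.3f}", " ".join(f"{v:.4f}" for v in row))
```

Under these assumptions, `ci_sigma_factor(5000, 0.95)` comes out near the 0.0392 listed in Table 1. An exact chi-squared quantile routine from a statistics library could replace `chi2_quantile` if small sample sizes are of interest.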
With 5000 runs at the 95% confidence level, the "true" value of the standard deviation has a 95% probability of lying within a confidence interval whose width is 3.92% of the sample standard deviation. The confidence limits are not perfectly centered about the sample standard deviation, but they are close enough to centered that the sample standard deviation can be said to be within 2% of the "true" standard deviation. If 20,000 samples are run, the standard deviation is estimated to be within 1% of the population standard deviation.

Note that for a variation model, the population is the infinite set of simulations that could be run. The CIσf can therefore help determine whether running more samples might significantly change the results. Because the accuracy of the model depends on many factors beyond the number of samples run, more samples do not necessarily improve the predictive value of the results.

Although the standard deviation was chosen here to determine the number of samples needed, confidence intervals are more commonly calculated for the mean, and a factor could also have been based on it.

The confidence interval for the mean µ at confidence level CL is calculated as:

$$CI_L < \mu < CI_U \tag{6}$$

$$CI_U = \bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{N}} \tag{7}$$

$$CI_L = \bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{N}} \tag{8}$$

where $t_{\alpha/2}$ is the quantile from the Student's t distribution at α/2, $\bar{x}$ is the sample mean, and s is the sample standard deviation.

Student's t Distribution

The distribution has one parameter, ν = N − 1, the degrees of freedom.

$$\mathrm{Pdf}(x) = \frac{\Gamma[(\nu+1)/2]}{(\pi\nu)^{1/2}\,\Gamma(\nu/2)\,\left[1 + x^2/\nu\right]^{(\nu+1)/2}} \quad \text{for } -\infty < x < +\infty$$

Figure 2. Student's t Distribution

The required sample size N is approximated as:

$$N = \left(t_{\alpha/2}\,s/d\right)^2 \tag{9}$$

where d is the allowable error for this estimate: $|\mu - \bar{x}| < d$.

Editors: Ying Qing Zhou, Earl Morgan, Thagu Vivek, Victor Monteverde
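The sample-size approximation in EQ (9) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a 3DCS feature. The Student's t quantile depends on N itself, so the sketch substitutes the standard normal quantile, which is close to the t quantile when N is large. The function name `runs_for_mean` and the example values s = 1.0 and d = 0.02 are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def runs_for_mean(s: float, d: float, cl: float = 0.95) -> int:
    """Approximate runs needed so that |mu - x_bar| < d, per EQ (9).

    Assumption: the normal quantile stands in for Student's t,
    which is reasonable only when the resulting N is large.
    """
    if d <= 0 or not 0.0 < cl < 1.0:
        raise ValueError("d must be positive and cl must lie in (0, 1)")
    alpha = 1.0 - cl
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    return ceil((z * s / d) ** 2)  # round up to a whole number of runs


# Hypothetical inputs: sample standard deviation 1.0, allowable error 0.02.
print(runs_for_mean(s=1.0, d=0.02, cl=0.95))
```

With these hypothetical inputs the sketch returns 9604 runs. Halving the allowable error d roughly quadruples the number of runs, since N scales with 1/d².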