SMU EMIS 7364
NTU TO-570-N
Statistical Quality Control
Dr. Jerrell T. Stracener, SAE Fellow
Inferences About Process Quality
Updated: 2/3/04

Inferences about Process Quality
• Sampling & Sampling Distributions
• Inferences Based on Single Random Sample
• Inferences Based on Two Random Samples
• Inferences Based on More than Two Random Samples

Sampling & Sampling Distributions

Population vs. Sample
• Population: the total of all possible values (measurements, counts, etc.) of a particular characteristic for a specific group of objects.
• Sample: a part of a population selected according to some rule or plan.
Why sample?

Sampling
Characteristics that distinguish one type of sample from another:
• the manner in which the sample was obtained
• the purpose for which the sample was obtained

Simple Random Sample
The sample $X_1, X_2, \ldots, X_n$ is a random sample if $X_1, X_2, \ldots, X_n$ are independent, identically distributed random variables.
Remark: Each value in the population has an equal and independent chance of being included in the sample.

Generating Random Samples using Monte Carlo Simulation

Generating Random Numbers
[Figure: the density $f(y)$ and the cumulative distribution $F(y)$, with the $F(y)$ axis running from 0 to 1.0; a value $r_i$ on the $F(y)$ axis is mapped to the value $y_i$ on the $y$ axis.]

Generating Random Numbers
Generating values of a random variable using the probability integral transformation. To generate a random value $y$ from a given probability density function $f(y)$:
1. Generate a random value $r_U$ from a uniform distribution over (0, 1).
2. Set $r_U = F(y)$.
3. Solve the resulting expression for $y$.

Generating Random Numbers with Excel
From the Tools menu, look for Data Analysis.

Generating Random Numbers with Excel
If it is not there, you must install it.

Generating Random Numbers with Excel
Once you select Data Analysis, a dialog window will appear. Scroll down to “Random Number Generation” and select it, then press “OK”.

Generating Random Numbers with Excel
Choose which distribution you would like.
Use Uniform for an exponential or Weibull distribution, or Normal for a normal or lognormal distribution.

Generating Random Numbers with Excel
Uniform Distribution, U(0, 1). Select “Uniform” under the “Distribution” menu. Type in “1” for number of variables and “10” for number of random numbers. Then press OK. Ten random numbers from the uniform distribution will now appear on a new worksheet.

Generating Random Numbers with Excel
Normal Distribution, $N(\mu, \sigma)$. Select “Normal” under the “Distribution” menu. Type in “1” for number of variables and “10” for number of random numbers. Enter the values for the mean ($\mu$) and standard deviation ($\sigma$), then press OK. Ten random numbers from the normal distribution will now appear on a new worksheet.

Generating Random Values from an Exponential Distribution $E(\theta)$ with Excel
First generate $n$ random values, $r_1, r_2, \ldots, r_n$, from U(0, 1). Select “Uniform” under the “Distribution” menu. Type in “1” for number of variables and “10” for number of random numbers. Then press OK. Ten random numbers from the uniform distribution will now appear on a new worksheet.

Generating Random Values from an Exponential Distribution $E(\theta)$ with Excel
Select a $\theta$ that you would like to use; we will use $\theta = 5$. Type in the equation $x_i = -\theta \ln(1 - r_i)$, filling in $\theta$ as 5 and $r_i$ as cell A1 (=-5*LN(1-A1)). Now, with that cell selected, place the cursor over the bottom right-hand corner of the cell. A cross will appear; drag this cross down to B10. This will copy the equation to the cells below. Now we have $n$ random values from the exponential distribution with parameter $\theta = 5$ in cells B1 - B10.

Generating Random Values from a Weibull Distribution $W(\theta, \beta)$ with Excel
First generate $n$ random values, $r_1, r_2, \ldots, r_n$, from U(0, 1). Select “Uniform” under the “Distribution” menu. Type in “1” for number of variables and “10” for number of random numbers. Then press OK. Ten random numbers from the uniform distribution will now appear on a new worksheet.
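The exponential procedure described above (generate a uniform value, set it equal to F(x), solve for x) can also be sketched outside Excel. The Python below is only an illustration and is not part of the original slides; the function names `exponential_from_uniform` and `sample_exponential` are hypothetical, and the exponential CDF is assumed to be $F(x) = 1 - e^{-x/\theta}$, which is the form implied by the worksheet formula =-5*LN(1-A1).

```python
import math
import random


def exponential_from_uniform(r, theta):
    """Solve r = F(x) = 1 - exp(-x / theta) for x, giving x = -theta * ln(1 - r)."""
    return -theta * math.log(1.0 - r)


def sample_exponential(n, theta, seed=None):
    """Draw n exponential values by transforming U(0, 1) values one at a time."""
    rng = random.Random(seed)  # seed is optional and only makes runs repeatable
    # rng.random() lies in [0, 1), so 1 - r is never zero and the log is defined.
    return [exponential_from_uniform(rng.random(), theta) for _ in range(n)]


if __name__ == "__main__":
    # Mirrors the worksheet example: 10 values with theta = 5.
    for x in sample_exponential(10, theta=5.0, seed=1):
        print(round(x, 3))
```

The same pattern should carry over to any distribution whose CDF can be inverted in closed form; only the body of the first function would change.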
Generating Random Values from a Weibull Distribution $W(\theta, \beta)$ with Excel
Select a $\theta$ and $\beta$ that you would like to use; we will use $\theta = 100$, $\beta = 20$. Type in the equation $x_i = \theta\,[-\ln(1 - r_i)]^{1/\beta}$, filling in $\theta$ as 100, $\beta$ as 20, and $r_i$ as cell A1 (=100*(-LN(1-A1))^(1/20)). Now copy that equation to the cells below. Now we have $n$ random values from the Weibull distribution with parameters $\theta = 100$ and $\beta = 20$ in cells B1 - B10.

Generating Random Values from a Lognormal Distribution $LN(\mu, \sigma)$ with Excel
First generate $n$ random values, $r_1, r_2, \ldots, r_n$, from N(0, 1). Select “Normal” under the “Distribution” menu. Type in “1” for number of variables and “10” for number of random numbers. Enter 0 for the mean and 1 for the standard deviation, then press OK. Ten random numbers from the standard normal distribution will now appear on a new worksheet.

Generating Random Values from a Lognormal Distribution $LN(\mu, \sigma)$ with Excel
Select a $\mu$ and $\sigma$ that you would like to use; we will use $\mu = 2$, $\sigma = 1$. Type in the equation $x_i = e^{\mu + r_i \sigma}$, filling in $\mu$ as 2, $\sigma$ as 1, and $r_i$ as cell A1 (=EXP(2+A1*1)). Now copy that equation to the cells below. Now we have $n$ random values from the lognormal distribution in cells B1 - B10.

Flow Chart of Monte Carlo Simulation Method
Input 1: Statistical distribution for each component variable.
Select a random value from each of these distributions.
Input 2: Relationship between component variables and system performance.
Calculate the value of system performance for a system composed of components with the values obtained in the previous step.
Repeat many times.
Output: Summarize and plot resulting values of system performance. This provides an approximation of the distribution of system performance.

Distribution of Sample Mean

Sampling Distribution of $\bar{X}$ with Known $\sigma$
If $X_1, X_2, \ldots$
, $X_n$ is a random sample of size $n$ from a normal distribution with mean $\mu$ and known standard deviation $\sigma$, and if
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,$$
then
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{and} \quad Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

Central Limit Theorem
If $\bar{X}$ is the mean of a random sample of size $n$, $X_1, X_2, \ldots, X_n$, from a population with mean $\mu$ and finite standard deviation $\sigma$, then as $n \to \infty$ the limiting distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
is the standard normal distribution.

Central Limit Theorem
Remark: The Central Limit Theorem provides the basis for approximating the distribution of $\bar{X}$ with a normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. The approximation gets better as $n$ gets larger.

Sampling Distribution of $\bar{X}$ with Unknown $\sigma$
Let $X_1, X_2, \ldots, X_n$ be independent random variables that have a normal distribution with mean $\mu$ and unknown standard deviation $\sigma$. Let
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2.$$
Then the random variable
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has a t-distribution with $\nu = n - 1$ degrees of freedom.

Distribution of Sample Standard Deviation

Sampling Distribution of $S^2$
If $S^2$ is the variance of a random sample of size $n$ taken from a normal population having the variance $\sigma^2$, then the statistic
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{\left(X_i - \bar{X}\right)^2}{\sigma^2}$$
has a chi-squared distribution with $\nu = n - 1$ degrees of freedom.

Inferences Based on a Single Random Sample

Estimation - Binomial Distribution
Estimation of a Proportion, $p$
• $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from $B(n, p)$
• Point estimate of $p$: $\hat{p} = f_s / n$, where $f_s$ = # of successes

Estimation - Binomial Distribution
• Approximate $(1 - \alpha) \cdot 100\%$ confidence interval for $p$: $(p'_L, p'_U)$, where
$$p'_L = \hat{p} - \Delta p \quad \text{and} \quad p'_U = \hat{p} + \Delta p, \qquad \Delta p = Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},$$
with $\hat{q} = 1 - \hat{p}$, and $Z_{\alpha/2}$ is the value of the standard normal random variable $Z$ such that $P(Z > z_{\alpha/2}) = \alpha/2$.

Estimation of the Mean - Normal Distribution
• $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from $N(\mu, \sigma)$, where both $\mu$ and $\sigma$ are unknown.
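• Point estimate of $\mu$: $\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
• $(1 - \alpha) \cdot 100\%$ confidence interval for the mean: $(\mu_L, \mu_U)$, where
$$\Delta\mu = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}, \qquad \mu_L = \bar{X} - \Delta\mu \quad \text{and} \quad \mu_U = \bar{X} + \Delta\mu$$

Estimation of the Mean - Infinite Population - Type Unknown
• $X_1, X_2, \ldots, X_n$ is a random sample of size $n$
• Point estimate of $\mu$: $\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
• An approximate $(1 - \alpha) \cdot 100\%$ confidence interval for the mean, based on the Central Limit Theorem: $(\mu_L, \mu_U)$, where
$$\Delta\mu = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}, \qquad \mu_L = \bar{X} - \Delta\mu \quad \text{and} \quad \mu_U = \bar{X} + \Delta\mu$$

Estimation of Means - Finite Populations
• $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population of size $N$ with unknown parameters $\mu$ and $\sigma$
• Point estimate of $\mu$: $\hat{\mu} = \bar{X}$
• An approximate $(1 - \alpha) \cdot 100\%$ confidence interval for $\mu$ is $(\mu'_L, \mu'_U)$, where
$$\mu'_L = \bar{X} - \Delta\mu \quad \text{and} \quad \mu'_U = \bar{X} + \Delta\mu, \qquad \Delta\mu = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$

Estimation of Means - Finite Populations
where
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2$$
• $t_{\alpha/2,\,n-1}$ is the value of $T \sim t_{n-1}$ for which $P(T > t_{\alpha/2,\,n-1}) = \alpha/2$
• $\sqrt{\frac{N-n}{N-1}}$ is the finite population correction factor

Estimation of Lognormal Distribution
• Random sample of size $n$, $X_1, X_2, \ldots, X_n$, from $LN(\mu, \sigma)$
• Let $Y_i = \ln X_i$ for $i = 1, 2, \ldots, n$
• Treat $Y_1, Y_2, \ldots, Y_n$ as a random sample from $N(\mu, \sigma)$
• Estimate $\mu$ and $\sigma$ using the Normal Distribution methods

Estimation of Weibull Distribution
• Random sample of size $n$, $T_1, T_2, \ldots, T_n$, from $W(\beta, \theta)$, where both $\beta$ and $\theta$ are unknown.
• Point estimates
  • $\hat{\beta}$ is the solution of $g(\beta) = 0$, where
$$g(\beta) = \frac{\sum_{i=1}^{n} T_i^{\beta} \ln T_i}{\sum_{i=1}^{n} T_i^{\beta}} - \frac{1}{\beta} - \frac{1}{n}\sum_{i=1}^{n} \ln T_i$$
  • $\hat{\theta} = \left(\frac{1}{n}\sum_{i=1}^{n} T_i^{\hat{\beta}}\right)^{1/\hat{\beta}}$

Estimation of Standard Deviation - Normal Distribution
• Point estimate of $\sigma$:
$$\hat{\sigma} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2} = \sqrt{\frac{n-1}{n}}\; s$$
• $(1 - \alpha) \cdot 100\%$ confidence interval for $\sigma$ is $(\sigma_L, \sigma_U)$, where
$$\sigma_L = s\sqrt{\frac{n-1}{\chi^2_{\alpha/2,\,n-1}}} \quad \text{and} \quad \sigma_U = s\sqrt{\frac{n-1}{\chi^2_{1-\alpha/2,\,n-1}}}$$

Testing Hypotheses
There are two possible decision errors associated with testing a statistical hypothesis:
A Type I error is made when a true hypothesis is rejected.
A Type II error is made when a false hypothesis is accepted.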
True Situation     Decision: Accept H0     Decision: Reject H0 (Accept H1)
H0 true            correct                 Type I error
H0 false           Type II error           correct

Testing Hypotheses
The decision risks are measured in terms of probability.
$\alpha$ = P(Type I error) = P(reject H0 | H0 is true) = Producer's risk
$\beta$ = P(Type II error) = P(accept H0 | H1 is true) = Consumer's risk
Remark: $100\% \cdot \alpha$ is commonly referred to as the significance level of a test.
Note: For fixed $n$, $\alpha$ increases as $\beta$ decreases, and vice versa; as $n$ increases, both $\alpha$ and $\beta$ decrease.

Power Function
Before applying a test procedure, i.e., a decision rule, we need to analyze its discriminating power, i.e., how good the test is. A function called the power function enables us to make this analysis.
Power Function = P(rejecting H0 | true parameter value)
OC Function = P(accepting H0 | true parameter value) = 1 - Power Function
where OC is Operating Characteristic.

Power Function
A plot of the power function vs. the test parameter value is called the power curve, and 1 - power curve is the OC curve.
[Figure: ideal power curve; vertical axis $PR(\mu)$ from 0 to 1, horizontal axis $\mu$ with the H0 and H1 regions marked.]

Power Function
The power function of a statistical test of hypothesis is the probability of rejecting H0 as a function of the true value of the parameter being tested, say $\theta$, i.e.,
$PF(\theta) = PR(\theta) = P(\text{reject } H_0 \mid \theta) = P(\text{test statistic falls in CA} \mid \theta)$

Operating Characteristic Function
The operating characteristic function of a statistical test of hypothesis is the probability of accepting H0 as a function of the true value of the parameter being tested, say $\theta$, i.e.,
$OC(\theta) = PA(\theta) = P(\text{accept } H_0 \mid \theta) = P(\text{test statistic falls in CR} \mid \theta)$

Tests of Proportions
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from $B(n, p)$.
Case 1: small sample sizes
To test the Null Hypothesis H0: $p = p_0$, a specified value, against the appropriate Alternative Hypothesis
1. HA: $p < p_0$, or
2. HA: $p > p_0$, or
3. HA: $p \ne p_0$,

Tests of Proportions
at the $100 \cdot \alpha\%$ Level of Significance, calculate the value of the test statistic using $X \sim B(n, p = p_0)$.
Find the number of successes $x$ and compute the appropriate P-value, depending upon the alternative hypothesis, and reject H0 if $P \le \alpha$, where
1. $P = P(X \le x \mid p = p_0)$, or
2. $P = P(X \ge x \mid p = p_0)$, or
3. $P = 2P(X \le x \mid p = p_0)$ if $x < np_0$, or $P = 2P(X \ge x \mid p = p_0)$ if $x > np_0$.

Tests of Proportions
Case 2: large sample sizes with $p$ not extremely close to 0 or 1.
To test the Null Hypothesis H0: $p = p_0$, a specified value, against the appropriate Alternative Hypothesis
1. HA: $p < p_0$, or
2. HA: $p > p_0$, or
3. HA: $p \ne p_0$,

Tests of Proportions
Calculate the value of the test statistic
$$Z = \frac{x - np_0}{\sqrt{np_0 q_0}}$$
and reject H0 if
1. $z < -z_{\alpha}$, or
2. $z > z_{\alpha}$, or
3. $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$,
depending on the alternative hypothesis.

Test of Means
Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a normal distribution with mean $\mu$ and standard deviation $\sigma$, both unknown.
To test the Null Hypothesis H0: $\mu = \mu_0$, a given or specified value, against the appropriate Alternative Hypothesis
1. HA: $\mu < \mu_0$, or
2. HA: $\mu > \mu_0$, or
3. HA: $\mu \ne \mu_0$,

Test of Means
at the $100 \cdot \alpha\%$ level of significance, calculate the value of the test statistic
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Reject H0 if
1. $t < -t_{\alpha,\,n-1}$,
2. $t > t_{\alpha,\,n-1}$,
3. $t < -t_{\alpha/2,\,n-1}$ or $t > t_{\alpha/2,\,n-1}$,
depending on the Alternative Hypothesis.

Test of Variances
Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a normal distribution with mean $\mu$ and standard deviation $\sigma$, both unknown.
To test the Null Hypothesis H0: $\sigma^2 = \sigma_0^2$, a specified value, against the appropriate Alternative Hypothesis
1. HA: $\sigma^2 < \sigma_0^2$, or
2. HA: $\sigma^2 > \sigma_0^2$, or
3. HA: $\sigma^2 \ne \sigma_0^2$,

Test of Variances
at the $100 \cdot \alpha\%$ level of significance, calculate the value of the test statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
Reject H0 if
1. $\chi^2 < \chi^2_{1-\alpha,\,n-1}$,
2. $\chi^2 > \chi^2_{\alpha,\,n-1}$,
3. $\chi^2 < \chi^2_{1-\alpha/2,\,n-1}$ or $\chi^2 > \chi^2_{\alpha/2,\,n-1}$,
depending on the Alternative Hypothesis.
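The two procedures for tests of proportions given above (Case 1 with an exact binomial P-value, Case 2 with the normal approximation) might be coded roughly as follows. This is a sketch for illustration, not part of the original slides: the function names and the `alternative` labels are hypothetical, the two-sided P-value is capped at 1, and the case $x = np_0$, which the slides leave unspecified, is assigned to the upper tail as an assumption.

```python
import math
from statistics import NormalDist


def exact_binomial_p_value(x, n, p0, alternative):
    """Case 1: P-value for H0: p = p0 using X ~ B(n, p0).

    alternative is 'less' (p < p0), 'greater' (p > p0), or 'two-sided'.
    """
    pmf = [math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    lower = sum(pmf[: x + 1])  # P(X <= x | p = p0)
    upper = sum(pmf[x:])       # P(X >= x | p = p0)
    if alternative == "less":
        return lower
    if alternative == "greater":
        return upper
    # Two-sided: double the tail on the side of n*p0 where x falls.
    # Assumption: x == n*p0 uses the upper tail, and the result is capped at 1.
    tail = lower if x < n * p0 else upper
    return min(1.0, 2.0 * tail)


def large_sample_z_test(x, n, p0, alpha, alternative):
    """Case 2: z = (x - n*p0) / sqrt(n*p0*q0); returns (z, reject_H0)."""
    q0 = 1.0 - p0
    z = (x - n * p0) / math.sqrt(n * p0 * q0)
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    if alternative == "less":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "greater":
        reject = z > std_normal.inv_cdf(1 - alpha)
    else:
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    return z, reject


if __name__ == "__main__":
    # Made-up numbers purely to exercise the functions.
    print(exact_binomial_p_value(1, 10, 0.5, "less"))
    print(large_sample_z_test(60, 100, 0.5, 0.05, "greater"))
```

Only the Python standard library is used here; `NormalDist.inv_cdf` supplies the $z_{\alpha}$ values that would otherwise be read from a table.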
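Inferences Based on Two Random Samples

Estimation - Binomial Populations
Estimation of the difference between two proportions
• Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be random samples from $B(n_1, p_1)$ and $B(n_2, p_2)$, respectively
• Point estimate of $\Delta p = p_1 - p_2$:
$$\Delta\hat{p} = \hat{p}_1 - \hat{p}_2 = \frac{f_1}{n_1} - \frac{f_2}{n_2}$$

Estimation - Binomial Populations
• Approximate $(1 - \alpha) \cdot 100\%$ confidence interval for $\Delta p = p_1 - p_2$: $(\Delta p_L, \Delta p_U)$, where
$$\Delta p_L = \Delta\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} \quad \text{and} \quad \Delta p_U = \Delta\hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

Estimation of Difference Between Two Means - Normal Distribution
• Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be random samples from $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$, respectively, where $\mu_1$, $\sigma_1$, $\mu_2$ and $\sigma_2$ are all unknown
• Point estimate of $\Delta\mu = \mu_1 - \mu_2$:
$$\Delta\hat{\mu} = \hat{\mu}_1 - \hat{\mu}_2 = \bar{X}_1 - \bar{X}_2$$

Estimation of Difference Between Two Means - Normal Distribution
• An approximate $(1 - \alpha) \cdot 100\%$ confidence interval for $\Delta\mu = \mu_1 - \mu_2$: $(\Delta\mu'_L, \Delta\mu'_U)$, where
$$\Delta\mu'_L = \Delta\hat{\mu} - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad \text{and} \quad \Delta\mu'_U = \Delta\hat{\mu} + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Estimation of Difference Between Two Means - Normal Distribution
where $\nu$ = degrees of freedom
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

Estimation of Ratio of Two Standard Deviations - Normal Distribution
• Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be random samples from $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$, respectively
• Point estimate of $r_\sigma = \sigma_1/\sigma_2$:
$$\hat{r}_\sigma = \frac{s_1}{s_2}, \quad \text{where} \quad s_i = \sqrt{\frac{1}{n_i - 1}\sum_{j=1}^{n_i} \left(X_{ij} - \bar{X}_i\right)^2} \quad \text{for } i = 1, 2$$

Estimation of Ratio of Two Standard Deviations - Normal Distribution
• $(1 - \alpha) \cdot 100\%$ confidence interval for $r_\sigma$: $(r_{\sigma L}, r_{\sigma U})$, where
$$r_{\sigma L} = \frac{s_1}{s_2}\sqrt{\frac{1}{F_{\alpha/2,\,\nu_1,\,\nu_2}}} \quad \text{and} \quad r_{\sigma U} = \frac{s_1}{s_2}\sqrt{F_{\alpha/2,\,\nu_2,\,\nu_1}}$$

Estimation of Ratio of Two Standard Deviations - Normal Distribution
where $F_{\alpha/2,\,\nu_1,\,\nu_2}$ is the value of the F-distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees of freedom for which $P(F > F_{\alpha/2,\,\nu_1,\,\nu_2}) = \alpha/2$

Test on Two Means
Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ be a random sample of size $n_1$ from $N(\mu_1, \sigma_1)$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be a random sample of size $n_2$ from $N(\mu_2, \sigma_2)$, where $\mu_1$, $\sigma_1$, $\mu_2$ and $\sigma_2$ are all unknown.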
To test H0: $\mu_1 - \mu_2 = d_0$, where $d_0 \ge 0$, against the appropriate alternative hypothesis

Test on Two Means
1. H1: $\mu_1 - \mu_2 < d_0$, where $d_0 \ge 0$, or
2. H1: $\mu_1 - \mu_2 > d_0$, where $d_0 \ge 0$, or
3. H1: $\mu_1 - \mu_2 \ne d_0$, where $d_0 \ge 0$,
at the $100 \cdot \alpha\%$ level of significance, calculate the value of the test statistic
$$t' = \frac{\left(\bar{X}_1 - \bar{X}_2\right) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Test on Two Means
Reject H0 if
1. $t' < -t_{\alpha,\,\nu}$, or
2. $t' > t_{\alpha,\,\nu}$, or
3. $t' < -t_{\alpha/2,\,\nu}$ or $t' > t_{\alpha/2,\,\nu}$,
depending on the alternative hypothesis, where
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

Test on Two Variances
Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ be a random sample of size $n_1$ from $N(\mu_1, \sigma_1)$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be a random sample of size $n_2$ from $N(\mu_2, \sigma_2)$, where $\mu_1$, $\sigma_1$, $\mu_2$ and $\sigma_2$ are all unknown.
To test H0: $\sigma_1^2 = \sigma_2^2$ against the appropriate alternative hypothesis

Test on Two Variances
1. H1: $\sigma_1^2 < \sigma_2^2$, or
2. H1: $\sigma_1^2 > \sigma_2^2$, or
3. H1: $\sigma_1^2 \ne \sigma_2^2$,
at the $100 \cdot \alpha\%$ level of significance, calculate the value of the test statistic
$$F = \frac{S_1^2}{S_2^2}$$

Test on Two Variances
Reject H0 if
1. $F < F_{1-\alpha}(\nu_1, \nu_2)$, or
2. $F > F_{\alpha}(\nu_1, \nu_2)$, or
3. $F < F_{1-\alpha/2}(\nu_1, \nu_2)$ or $F > F_{\alpha/2}(\nu_1, \nu_2)$,
depending on the alternative hypothesis.

Inferences Based on More than Two Random Samples

Normal Distribution - Estimation of $\mu$
$X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from $N(\mu, \sigma)$, where both $\mu$ and $\sigma$ are unknown.
• Point estimate of $\mu$: $\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
• $(1 - \alpha) \cdot 100\%$ confidence interval for $\mu$ is $(\mu_L, \mu_U)$, where $\mu_L = \bar{X} - \Delta\mu$ and $\mu_U = \bar{X} + \Delta\mu$

Normal Distribution - Estimation of $\mu$
$$\Delta\mu = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
where $t_{\alpha/2,\,n-1}$ is the value of the t-distribution with parameter $\nu = n - 1$ for which $P(T > t_{\alpha/2,\,n-1}) = \alpha/2$; it may be obtained from the table of the t-distribution (located in the resource section on the website).

Estimation of Lognormal Distribution
• Random sample of size $n$, $X_1, X_2, \ldots, X_n$, from $LN(\mu, \sigma)$
• Let $Y_i = \ln X_i$ for $i = 1, 2, \ldots, n$
• Treat $Y_1, Y_2, \ldots$
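, $Y_n$ as a random sample from $N(\mu, \sigma)$
• Estimate $\mu$ and $\sigma$ using the Normal Distribution methods

Estimation of Weibull Distribution
• Random sample of size $n$, $T_1, T_2, \ldots, T_n$, from $W(\beta, \theta)$, where both $\beta$ and $\theta$ are unknown.
• Point estimates
  • $\hat{\beta}$ is the solution of $g(\beta) = 0$, where
$$g(\beta) = \frac{\sum_{i=1}^{n} T_i^{\beta} \ln T_i}{\sum_{i=1}^{n} T_i^{\beta}} - \frac{1}{\beta} - \frac{1}{n}\sum_{i=1}^{n} \ln T_i$$
  • $\hat{\theta} = \left(\frac{1}{n}\sum_{i=1}^{n} T_i^{\hat{\beta}}\right)^{1/\hat{\beta}}$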
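
The Weibull point estimates above require a numerical root of $g(\beta) = 0$, which the slides do not spell out. One possible approach, sketched here in Python under stated assumptions, is plain bisection: $g$ is increasing in $\beta$ and tends to $-\infty$ as $\beta \to 0^+$, so a sign change can be bracketed. The function names, the bracket [0.001, 100], and the tolerance are all hypothetical choices, and the data are rescaled by their maximum only to avoid overflow (this leaves $g$ unchanged and rescales $\hat{\theta}$).

```python
import math


def weibull_g(beta, times):
    """g(beta) = sum(T^b ln T) / sum(T^b) - 1/b - mean(ln T)."""
    logs = [math.log(t) for t in times]
    powers = [t**beta for t in times]
    weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
    return weighted - 1.0 / beta - sum(logs) / len(times)


def weibull_mle(times, lo=1e-3, hi=100.0, tol=1e-10):
    """Return (beta_hat, theta_hat) for positive failure times.

    beta_hat solves g(beta) = 0 by bisection on [lo, hi] (an assumed bracket);
    theta_hat = (mean(T^beta_hat))^(1/beta_hat).
    """
    scale = max(times)
    scaled = [t / scale for t in times]  # values in (0, 1], so powers cannot overflow
    if weibull_g(lo, scaled) > 0 or weibull_g(hi, scaled) < 0:
        raise ValueError("root of g not bracketed; widen [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_g(mid, scaled) < 0:
            lo = mid  # g is increasing, so the root lies to the right
        else:
            hi = mid
    beta_hat = 0.5 * (lo + hi)
    mean_power = sum(s**beta_hat for s in scaled) / len(scaled)
    theta_hat = scale * mean_power ** (1.0 / beta_hat)
    return beta_hat, theta_hat


if __name__ == "__main__":
    # Made-up failure times, for illustration only.
    print(weibull_mle([10.0, 20.0, 30.0, 40.0, 50.0]))
```

Bisection is slow but hard to break; a Newton-type iteration on $g$ would converge faster if a derivative is available.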