ENGM 720 - Lecture 04
Comparison of Means, Confidence Intervals (CIs), & Operating Characteristic (OC) Curves
5/24/2017, ENGM 720: Statistical Process Control

Assignment

Reading:
• Chapter 4
  • Finish reading through 4.3.4
  • Begin reading 4.4 through 4.4.3
• Chapter 8
  • Begin reading 8 through 8.3

Assignments:
• Obtain the Hypothesis Test (Chart &) Tables – Materials Page
• Obtain the Exam Tables DRAFT – Materials Page
  • Verify accuracy as you work assignments
• Access New Assignment and Previous Assignment Solutions:
  • Download Assignment 2 Solutions
  • Download Assignment 3 Instructions

Hypothesis Tests

A hypothesis is a guess about a situation that can be tested, and the test outcome can be either true or false.
• The Null Hypothesis has the symbol H0 and is always the default situation that must be proven unlikely beyond a reasonable doubt.
• The Alternative Hypothesis is denoted by the symbol HA and can be thought of as the opposite of the Null Hypothesis: it can also be either true or false, but it is always false when H0 is true and vice versa.

Hypothesis Testing Errors

• Type I Errors occur when a test statistic leads us to reject the Null Hypothesis when the Null Hypothesis is true in reality.
• The chance of making a Type I Error is denoted by the parameter α (the level of significance), which quantifies the reasonable doubt.
• Type II Errors occur when a test statistic leads us to fail to reject the Null Hypothesis when the Null Hypothesis is actually false in reality.
• The probability of making a Type II Error is denoted by the parameter β.
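
To make the two error probabilities concrete, the sketch below computes β for an upper-tailed z-test once α has been chosen. It is only an illustration under stated assumptions: the standard deviation is taken as known, the "true" shift in the mean is hypothetical, and the function name `type_ii_error` and all numbers are made up for this example rather than taken from the lecture.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def type_ii_error(alpha, shift, sigma, n):
    """Beta for an upper-tailed z-test of H0: mu <= mu0 (sigma assumed known).

    shift is the hypothetical true value of (mu - mu0).
    """
    # Critical value: reject H0 when z0 > z_alpha.
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    # With a shifted mean, z0 ~ N(shift * sqrt(n) / sigma, 1), so the chance
    # of failing to reject H0 is the probability that z0 <= z_alpha.
    return STD_NORMAL.cdf(z_alpha - shift * sqrt(n) / sigma)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    for n in (5, 10, 20):
        beta = type_ii_error(alpha=0.05, shift=2.0, sigma=5.0, n=n)
        print(f"n = {n:2d}: beta = {beta:.3f}")
```

With no shift at all, H0 is true and the function returns 1 − α, which is the probability of correctly failing to reject. For a fixed shift, β falls as n grows.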

Types of Hypothesis Tests

Hypothesis Tests & Rejection Criteria

One-Sided Test (lower):
• H0: MA is not better than M0, i.e. $\theta_A \ge \theta_0$
• HA: MA is lower than M0, i.e. $\theta_A < \theta_0$
• Reject H0 if Test Statistic < Rejection Criterion

Two-Sided Test:
• H0: MA is not different than M0, i.e. $-\theta_0 \le \theta_A \le +\theta_0$
• HA: MA is different than M0, i.e. $\theta_A < -\theta_0$ or $+\theta_0 < \theta_A$
• Reject H0 if Test Statistic < -½ Rejection Criterion or Test Statistic > +½ Rejection Criterion

One-Sided Test (upper):
• H0: MA is not better than M0, i.e. $\theta_A \le \theta_0$
• HA: MA is higher than M0, i.e. $\theta_A > \theta_0$
• Reject H0 if Test Statistic > Rejection Criterion

Hypothesis Testing Steps

1. State the null hypothesis (H0) from one of the alternatives: MA = M0, MA ≥ M0, or MA ≤ M0.
2. Choose the alternative hypothesis (HA) from the alternatives: MA ≠ M0, MA < M0, or MA > M0. (Respective to above!)
3. Choose a significance level for the test (α).
4. Select the appropriate test statistic and establish a critical region. (If the decision is to be based on a P-value, it is not necessary to have a critical region.)
5. Compute the value of the test statistic from the sample data.
6. Decision: Reject H0 if the test statistic has a value in the critical region (or if the computed P-value is less than or equal to the desired significance level α); otherwise, do not reject H0.
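
The six steps above can be sketched in code. The example below is a minimal illustration for the simplest case, a one-sample z-test with the standard deviation assumed known. The function name `z_test`, its argument names, and the sample numbers are assumptions made for this sketch and do not come from the course materials.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """One-sample z-test (sigma assumed known), following steps 1 to 6.

    Steps 1-3 are the caller's choices: mu0 fixes H0, `alternative` fixes HA
    ("two-sided": mu != mu0, "greater": mu > mu0, "less": mu < mu0), and
    alpha is the significance level.
    """
    std_normal = NormalDist()
    # Step 5: compute the test statistic from the sample data.
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    # Step 4: critical region (and the P-value as an alternative criterion).
    if alternative == "two-sided":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        in_region = abs(z0) > critical
        p_value = 2 * (1 - std_normal.cdf(abs(z0)))
    elif alternative == "greater":
        critical = std_normal.inv_cdf(1 - alpha)
        in_region = z0 > critical
        p_value = 1 - std_normal.cdf(z0)
    elif alternative == "less":
        critical = std_normal.inv_cdf(alpha)  # negative critical value
        in_region = z0 < critical
        p_value = std_normal.cdf(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    # Step 6: decision. Reject H0 when z0 falls in the critical region.
    return z0, critical, p_value, in_region


if __name__ == "__main__":
    # Hypothetical sample summary, for illustration only.
    z0, critical, p_value, reject = z_test(xbar=10.4, mu0=10.0, sigma=1.0, n=25)
    print(f"z0 = {z0:.2f}, critical = {critical:.3f}, "
          f"P-value = {p_value:.4f}, reject H0: {reject}")
```

The critical-region rule and the P-value rule give the same decision here. Returning both simply mirrors the two options named in steps 4 and 6.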

Testing Example

Single Sample, Two-Sided t-Test:
• $H_0: \mu = \mu_0$ versus $H_A: \mu \ne \mu_0$
• Test Statistic: $t = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{s}$
• Critical Region: reject H0 if $|t| > t_{\alpha/2,\,n-1}$
• P-Value: $2 \times P(X \ge |t|)$, where the random variable X has a t-distribution with n − 1 degrees of freedom

Hypothesis Testing

$H_0: \mu = \mu_0$ versus $H_A: \mu \ne \mu_0$
[Figure: $t_{n-1}$ distribution, horizontal axis marked at −|t|, 0, and |t|]
P-value = $P(X \le -|t|) + P(X \ge |t|)$

Hypothesis Testing

Significance Level of a Hypothesis Test: A hypothesis test with a significance level or size α rejects the null hypothesis H0 if a p-value smaller than α is obtained, and accepts the null hypothesis H0 if a p-value larger than α is obtained. In this case, the probability of a Type I error (the probability of rejecting the null hypothesis when it is true) is equal to α.

                     True Situation
Test Conclusion      H0 is True           H0 is False
H0 is True           CORRECT              Type II Error (β)
H0 is False          Type I Error (α)     CORRECT

Hypothesis Testing

P-Value: One way to think of the P-value for a particular H0 is: given the observed data set, what is the probability of obtaining this data set or worse when the null hypothesis is true? A "worse" data set is one which is less similar to the distribution for the null hypothesis.
• P-value between 0 and 0.01: H0 not plausible
• P-value between 0.01 and 0.10: intermediate area
• P-value between 0.10 and 1: H0 plausible

Statistics and Sampling

Objective of statistical inference:
• Draw conclusions/make decisions about a population based on a sample selected from the population
Random sample – a sample, x1, x2, …, xn, selected so that observations are independently and identically distributed (iid).
Statistic – function of the sample data
• Quantities computed from observations in sample and used to make statistical inferences
• e.g. $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ measures central tendency

Sampling Distribution

Sampling Distribution – Probability distribution of a statistic
If we know the distribution of the population from which the sample was taken, we can often determine the distribution of various statistics computed from a sample, e.g.:
• When the CLT applies, the distribution is Normal
• When sampling for defective units in a large population, use the Binomial distribution
• When working with the sum of squared Normal random variables, use the $\chi^2$-distribution

e.g. Sampling Distribution of the Mean from the Normal Distribution

Take a random sample, x1, x2, …, xn, from a normal population with mean μ and standard deviation σ, i.e., $x \sim N(\mu, \sigma)$
Compute the sample average $\bar{x}$
Then $\bar{x}$ will be normally distributed with mean μ and standard deviation $\sigma/\sqrt{n}$, that is:
$\bar{x} \sim N(\mu, \sigma_{\bar{x}}) = N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$

Ex. Sampling Distribution of $\bar{x}$

When a process is operating properly, the mean density of a liquid is 10 with standard deviation 5. Five observations are taken and the average density is 15. What is the distribution of the sample average?
• r.v. x = density of liquid
Ans: since the samples come from a normal distribution, and are added together in the process of computing the mean:
$\bar{x} \sim N\left(\mu = 10,\ \sigma_{\bar{x}} = \frac{5}{\sqrt{5}}\right)$

Ex. Sampling Distribution of $\bar{x}$ (cont'd)

What is the probability the sample average is greater than 15?
$z_0 = \frac{\bar{x} - \mu_0}{\sigma_0/\sqrt{n}} = \frac{15 - 10}{5/\sqrt{5}} = \frac{5}{2.24} = 2.23$
$\Phi(z_0) = \Phi(2.23) = ?$
Would you conclude the process is operating properly?

Ex. Sampling Distribution of $\bar{x}$ (cont'd)

What is the probability the sample average is greater than 15?
$z_0 = \frac{\bar{x} - \mu_0}{\sigma_0/\sqrt{n}} = \frac{15 - 10}{5/\sqrt{5}} = \frac{5}{2.24} = 2.23$
$\Phi(z_0) = \Phi(2.23) = 0.98713$
$1 - 0.98713 = 0.01287$, or 1.3%
Would you conclude the process is operating properly?

Comparison of Means

• The first types of comparison are those that compare the location of two distributions. To do this:
• Compare the difference in the mean values for the two distributions, and check to see if the magnitude of their difference is sufficiently large relative to the amount of variation in the distributions
[Figure labels: Definitely Different, Probably Different, Probably NOT Different, Definitely NOT Different]
• Which type of test statistic we use depends on what is known about the process(es), and how efficient we can be with our collected data

Situation I: Means Test, Both σ0 and μ0 Known

Used with:
• an existing process with a good deal of data showing the variation and location are stable
Procedure:
• use the z-statistic to compare the sample mean with the population mean μ0
$z_0 = \frac{\bar{x} - \mu_0}{\sigma_0/\sqrt{n}}$

Situation II: Means Test, σ(s) Known and μ(s) Unknown

Used when:
• the means from two existing processes may differ, but the variation of the two processes is stable, so we can estimate the population variances pretty closely.
Procedure:
• use the z-statistic to compare both sample means
$z_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

Situation III: Means Test, Unknown σ(s) and Known μ0

Used when:
• we have good control over the center of the distribution, but the variation changes from time to time
Procedure:
• use the t-statistic to compare the sample mean with μ0, with v = n – 1 degrees of freedom
$t_0 = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$

Situation IV: Means Test, Unknown σ(s) and μ(s), Similar s²

Used when:
• there is a logical case for similar variances, but no real "history" with either process distribution (means & variances)
Procedure:
• use the t-statistic to compare using pooled S, with v = n1 + n2 – 2 degrees of freedom
$t_0 = \frac{\bar{x}_1 - \bar{x}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$

Situation V: Means Test, Unknown σ(s) and μ(s), Dissimilar s²

Used when:
• worst case data efficiency: no real "history" with either process distribution (means & variances)
Procedure:
• use the t-statistic to compare, with degrees of freedom given by v:
$t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
$v = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$

Situation VI: Means Test, Paired but Unknown σ(s)

Used when:
• the exact same sample work piece could be run through both processes, eliminating material variation
Procedure:
• define a variable (d) for the difference in test value pairs ($d_i = x_{1i} - x_{2i}$) observed on the ith sample, with v = n – 1 degrees of freedom
$t_0 = \frac{\bar{d}}{S_d/\sqrt{n}}$, where $S_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n - 1}}$

Table for Means Comparisons

Decision on which test to use is based on answering (at least some of) the following:
• Do we know the population variance (σ²) or should we estimate it by the sample variance (s²)?
• Do we know the theoretical mean (μ), or should we estimate it by the sample mean ($\bar{y}$)?
• Do we know if the samples have equal variance ($\sigma_1^2 = \sigma_2^2$)?
• Have we conducted a paired comparison?
• What are we trying to decide (alternate hypothesis)?

Table for Means Comparisons

These questions tell us:
• What sampling distribution to use
• What test statistic(s) to use
• What criteria to use
• How to construct the confidence interval
Six major test statistics for mean comparisons
• Two sampling distributions
• Six confidence intervals
• Twelve alternate hypotheses

Ex. Surface Roughness

Surface roughness is normally distributed with mean 125 and std dev of 5. The specification is 125 ± 11.65 and we have calculated that 98% of parts are within specs during usual production. This has been the case for a long time.
My supplier of these parts has sent me a large shipment. I take a random sample of 10 parts. The sample average roughness is 134, which is within specifications.
Test the hypothesis that the lot roughness is higher than specifications at α = 0.05.

e.g. Surface Roughness Cont'd

Check the hypothesis that the sample of size 10, with an average of 134, comes from a population with mean 125 and standard deviation of 5.
One-Sided Test
• $H_0: \mu \le \mu_0$
• $H_A: \mu > \mu_0$
Test Statistic:
$z_0 = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} = \frac{134 - 125}{5/\sqrt{10}} = \frac{9}{1.58} = 5.69$
Critical Value:
• $Z_\alpha = 1.645$

Alpha Level (α)   One-sided z   Two-sided z
0.1               1.28155       1.64485
0.05              1.64485       1.95996

Should I reject H0?
• Yes! Since 5.69 > 1.645, it is likely that the lot mean roughness exceeds 125.

ex. cont'd

Draw the distributions for the surface roughness and sample average
[Figure: individual values, r.v. $x \sim N(\mu = 125, \sigma = 5)$, with 113.35 and 136.65 marked and the sample average 134 shown]
[Figure: sample average, r.v. $\bar{x} \sim N(\mu = 125, \sigma_{\bar{x}} = 5/\sqrt{10} = 1.58)$, with 120.27 and 129.74 marked and the sample average 134 shown]

e.g. Surface Roughness Cont'd

Find the P-value: the probability of obtaining a sample average of 134 or more from a sample of size 10 if the population has mean 125 and standard deviation of 5.
$z_0 = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} = \frac{134 - 125}{5/\sqrt{10}} = \frac{9}{1.58} = 5.69$
$P\text{-value} = 1 - \Phi(z_0) = 1 - \Phi(5.69) \approx 1 - 1 = 0$
Should I accept this shipment?

e.g. Surface Roughness Cont'd

For future shipments, suggest good cutoff values for the sample average
• (i.e., accept shipment if average of 10 observations is between what and what)?
We know that $\pm 3\sigma_{\bar{x}}$ encompasses over 99% of the probability mass of the distribution for $\bar{x}$

Operating Characteristic (OC) Curve

Relates the size of the test difference to Type II Error (β) for a given risk of Type I Error (α)
Designing a test involves a trade-off in sample size versus the power of the test to detect a difference
• The greater the difference in means (δ), the smaller the chance of Type II Error (β) for a given sample size and α.
• As the sample size increases, the chance of Type II Error (β) decreases for a specified α and given difference in means (δ).

Operating Characteristic Curve
[Figure: operating characteristic curve]

O.C. Curve Use

Agree on acceptable α
• Need to have an OC curve for the correct hypothesis test and the correct α level
Estimate anticipated δ and σ to compute d:
• $d = \frac{|\mu_1 - \mu_2|}{\sigma} = \frac{|\delta|}{\sigma}$
Look for where d intersects with desired β (Probability of accepting H0) to estimate the required sample size (n)

OC Curve Example

Assume our previous problem had a process std. dev. of 18 (instead of 5), and the same means (125 population & spec, 134 supplier sample).
Assume the boss wants α = 0.05 of exceeding either the high or low spec. for such a sample.
• Probability of what (in English)?
• Contracting an incapable supplier, based on a bad-luck test outcome
Assume supplier needs β = 0.2
• Probability of what (in English)?
• Unfairly being the incapable supplier, based on a bad-luck test outcome
What sample size is needed to fit these constraints? (uses Fig 3-7, p.111)

Two-Sided Operating Characteristic Curve, α = 0.05
[Figure: two-sided OC curve for α = 0.05, annotated with β, d = 0.5, and n = 30]

Estimation of Process Parameters

In SPC:
• the probability distribution is used to model a quality characteristic (e.g. dimension of a part, viscosity of a fluid)
Therefore:
• we are interested in making inferences about the parameters of the probability distribution (e.g. mean μ and variance σ²)
Since:
• the values of these parameters are generally not known, we need to estimate them from sample data

Point Estimate

Numerical value, computed from a sample of data, used to estimate a parameter of a distribution
Example:
• Say we take n = 50 measurements of a quality characteristic
• Sample mean is point estimate of μ, i.e.
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
• Sample variance is point estimate of σ², i.e.
$s^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1} = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1} = \frac{\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}{n - 1}$

Confidence Intervals

A confidence interval for an unknown parameter θ is an interval that contains a set of likely values of the parameter. It is associated with a confidence level 1 − α, which measures the probability that the confidence interval actually contains the unknown parameter.

Confidence Interval (C.I.) (Interval Estimate)

A C.I. is an interval that, with some probability, includes the true value of the parameter
Ex. C.I. of mean μ is $P\{L \le \mu \le U\} = 1 - \alpha$
• L - lower confidence limit
• U - upper confidence limit
• (1 − α) - probability that true value of parameter lies in interval (we pick α)
The interval $L \le \mu \le U$ is called a 100(1 − α)% C.I. for the mean

C.I. on the Mean of Normal Distribution with Variance Unknown

Suppose $x \sim N(\mu, \sigma)$, and we don't know the true mean μ or true variance σ²
A 100(1 − α)% C.I. for the unknown (true) mean μ is:
$\bar{x} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$
• $\bar{x}$ - sample mean
• S - sample standard deviation
• n - number of observations in sample
• $t_{\alpha/2,\,n-1}$ - value of t distribution

Ex. C.I. on the Mean of Normal Distribution with Variance Unknown

Automatic filler deposits liquid in a container.
WANT: 95% C.I. on the mean amount (ounces) per container
• Collect random sample: x1, x2, …, xn, say n = 10
• Compute sample average:
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = 1.6$
• Compute sample variance:
$S^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} = 0.1$

Ex. C.I. on Mean cont'd

Find the t-distribution value:
• Look in Table (Appendix IV)
• Want a 95% C.I., so 100(1 − α)% = 95%, α = 0.05, and ν = degrees of freedom = (n − 1) = 9, so
$t_{\alpha/2,\,n-1} = t_{0.05/2,\,10-1} = t_{0.025,\,9} = 2.262$
Substitute into C.I.
$\bar{x} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$
$1.6 - 2.262\frac{\sqrt{0.1}}{\sqrt{10}} \le \mu \le 1.6 + 2.262\frac{\sqrt{0.1}}{\sqrt{10}}$
- or -
$1.37 \le \mu \le 1.83$

Interpretation of a 95% C.I.

Repeat sampling 10,000 (or many, many) times & obtain C.I.s
Each C.I. will have a (slightly) different center point and width
On average, 95% of the C.I.s will include the true mean

C.I.s on Other Parameters and Quantities

Same procedure, different formulas
For example, C.I. on
• Mean (of any distribution) when variance is known
• Variance of a normal distribution
• Difference in two means (of any distribution) when variances are known
• Difference in two means from normal distribution when variances are unknown
• Ratio of variances of two normal distributions
• etc. ...
(See textbook Sections 4.3.1, 4.3.4 to review derivations)

Questions & Issues
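
As a closing illustration, the C.I. on the mean and its repeated-sampling interpretation can be checked numerically. The sketch below recomputes the filler example (sample average 1.6, sample variance 0.1, n = 10, table value t = 2.262) and then simulates many samples to see how often the interval contains the true mean. The function names `t_interval` and `coverage`, the random seed, and the "true" mean and standard deviation used in the simulation are assumptions made for this sketch only.

```python
import random
from math import sqrt
from statistics import mean, variance

T_025_9 = 2.262  # t value for alpha/2 = 0.025 and 9 degrees of freedom (table value)


def t_interval(xbar, s2, n, t_crit):
    """Two-sided C.I. on the mean: xbar -/+ t_crit * S / sqrt(n), with S^2 = s2."""
    half_width = t_crit * sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


def coverage(mu, sigma, n, t_crit, repeats, seed=0):
    """Fraction of simulated C.I.s that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # statistics.variance uses the n - 1 divisor (sample variance).
        low, high = t_interval(mean(sample), variance(sample), n, t_crit)
        hits += low <= mu <= high
    return hits / repeats


if __name__ == "__main__":
    low, high = t_interval(xbar=1.6, s2=0.1, n=10, t_crit=T_025_9)
    print(f"{low:.2f} <= mu <= {high:.2f}")
    # mu and sigma below are hypothetical "true" values for the simulation.
    frac = coverage(mu=1.6, sigma=sqrt(0.1), n=10, t_crit=T_025_9, repeats=10_000)
    print(f"fraction of intervals containing mu: {frac:.3f}")
```

Each simulated interval has its own center and width, and the reported fraction should land close to 0.95 when the normality assumption holds.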