Chapter 3 Inferences About Process Quality
Introduction to Statistical Quality Control, 4th Edition

3-1. Statistics and Sampling Distributions
• Statistical methods are used to make decisions about a process
– Is the process out of control?
– Is the process average you were given the true value?
– What is the true process variability?
• Statistics are quantities calculated from a random sample taken from a population of interest.
• The probability distribution of a statistic is called a sampling distribution.

3-1.1 Sampling from a Normal Distribution
• Let X represent measurements taken from a normal distribution: $X \sim N(\mu, \sigma^2)$
• Select a sample of size n, at random, and calculate the sample mean, $\bar{x}$
• Then $\bar{x} \sim N\left(\mu, \sigma^2/n\right)$

• Probability Example
– The life of an automotive battery is normally distributed with mean 900 days and standard deviation 35 days. What is the probability that a random sample of 25 batteries will have an average life of more than 1000 days?

• Chi-square ($\chi^2$) Distribution
– If $x_1, x_2, \ldots, x_n$ are normally and independently distributed random variables with mean zero and variance one, then the random variable
$$y = x_1^2 + x_2^2 + \cdots + x_n^2$$
is distributed as chi-square with n degrees of freedom.
– Furthermore, the sampling distribution of
$$y = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2}$$
is chi-square with n – 1 degrees of freedom when sampling from a normal population.
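As a rough numerical sketch of the two results above, the first part below evaluates the battery example using $\bar{x} \sim N(\mu, \sigma^2/n)$, and the second checks by simulation that $(n-1)S^2/\sigma^2$ has the mean of a chi-square variable with n – 1 degrees of freedom. The function names, the sample size n = 5, the value sigma = 2.0, the seed, and the number of replications are illustrative assumptions, not values from the text.

```python
import math
import random
import statistics


def upper_tail_prob_of_mean(mu, sigma, n, threshold):
    """P(sample mean > threshold) when X ~ N(mu, sigma^2) and the sample size is n."""
    z = (threshold - mu) / (sigma / math.sqrt(n))
    # erfc keeps precision in the far tail, where 1 - cdf(z) would round to 0.
    return z, 0.5 * math.erfc(z / math.sqrt(2))


def simulate_scaled_variance(n, sigma, reps, seed=0):
    """Draw `reps` normal samples of size n; return the values of (n-1)S^2/sigma^2."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        values.append((n - 1) * statistics.variance(sample) / sigma**2)
    return values


# Battery example: mean 900 days, standard deviation 35 days, n = 25.
z, p = upper_tail_prob_of_mean(900, 35, 25, 1000)
print(f"z = {z:.2f}, P(mean life > 1000 days) = {p:.3g}")

# Assumed illustration (n = 5, sigma = 2.0): a chi-square variable with
# n - 1 = 4 degrees of freedom has mean 4.
y = simulate_scaled_variance(n=5, sigma=2.0, reps=20000)
print(f"simulated mean of (n-1)S^2/sigma^2 = {statistics.fmean(y):.2f} (theory: 4)")
```

Because the standard deviation of the sample mean is 35/5 = 7 days, a mean of 1000 days lies more than 14 standard deviations above 900, so the probability is effectively zero.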
[Figure: chi-square ($\chi^2$) distribution for various degrees of freedom]

• t-distribution
– If x is a standard normal random variable, y is a chi-square random variable with k degrees of freedom, and x and y are independent, then
$$t = \frac{x}{\sqrt{y/k}}$$
is distributed as t with k degrees of freedom.

• F-distribution
– If w and y are two independent chi-square random variables with u and v degrees of freedom, respectively, then
$$F = \frac{w/u}{y/v}$$
is distributed as F with u numerator degrees of freedom and v denominator degrees of freedom.

3-1.2 Sampling from a Bernoulli Distribution
• A random variable, x, with probability function
$$p(x) = \begin{cases} p, & x = 1 \\ 1 - p = q, & x = 0 \end{cases}$$
is called a Bernoulli random variable.
• The sum of a sample from a Bernoulli process has a binomial distribution with parameters n and p.
• Let $x_1, x_2, \ldots, x_n$ be taken from a Bernoulli process.
• The sample mean is a discrete random variable given by
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
• The mean and variance of $\bar{x}$ are
$$\mu_{\bar{x}} = p, \qquad \sigma_{\bar{x}}^2 = \frac{p(1-p)}{n}$$

3-1.3 Sampling from a Poisson Distribution
• Consider a random sample of size n, $x_1, x_2, \ldots, x_n$, taken from a Poisson process with parameter $\lambda$.
• The sum, $x = x_1 + x_2 + \cdots + x_n$, is also Poisson with parameter $n\lambda$.
• The sample mean is a discrete random variable given by
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
• The mean and variance of $\bar{x}$ are
$$\mu_{\bar{x}} = \lambda, \qquad \sigma_{\bar{x}}^2 = \frac{\lambda}{n}$$

3-2. Point Estimation of Process Parameters
• Parameters are values representing the population. Ex) $\mu$, $\sigma^2$: the population mean and variance, respectively.
• Parameters in reality are often unknown and must be estimated.
• Statistics are estimates of parameters. Ex) $\bar{x}$, $S^2$: the sample mean and sample variance, respectively.

Two properties of good point estimators
1. The point estimator should be unbiased.
2. The point estimator should have minimum variance.

3-3. Statistical Inference for a Single Sample
Two categories of statistical inference:
1. Parameter Estimation
2. Hypothesis Testing

• A statistical hypothesis is a statement about the values of the parameters of a probability distribution.
$$H_0: \mu = \mu_0 \qquad H_1: \mu \neq \mu_0$$

• Steps in Hypothesis Testing
– Identify the parameter of interest.
– State the null hypothesis, $H_0$, and the alternative hypothesis, $H_1$.
– Choose a significance level.
– State the appropriate test statistic.
– State the rejection region.
– Compare the value of the test statistic to the rejection region. Can the null hypothesis be rejected?

• Example: An automobile manufacturer claims a particular automobile can average 35 mpg (highway).
– Suppose we are interested in testing this claim. We will sample 25 of these particular autos and, under identical conditions, calculate the average mpg for this sample.
– Before actually collecting the data, we decide that if we get a sample average less than 33 mpg or more than 37 mpg, we will reject the maker's claim. (Critical Values)

• Example (continued)
– $H_0: \mu = 35$, $H_1: \mu \neq 35$
– Rejection regions: reject $H_0$ if $\bar{x} < 33$ or $\bar{x} > 37$; do not reject $H_0$ if $33 \le \bar{x} \le 37$.
• From the sample of 25 cars, the average mpg was found to be 31.5. What is your conclusion?

Choice of Critical Values
• How are the critical values chosen?
• Wouldn't it be easier to decide "how much room for error you will allow" instead of finding the exact critical values for every problem you encounter?
OR
• Wouldn't it be easier to set the size of the rejection region, rather than setting the critical values for every problem?

Significance Level
• The level of significance, $\alpha$, determines the size of the rejection region.
• The level of significance is a probability. It is also known as the probability of a "Type I error" (want this to be small).
• Type I error: rejecting the null hypothesis when it is true.
• How small? Usually want $\alpha \le 0.10$.

Types of Error
• Type I error: rejecting the null hypothesis when it is true.
• Pr(Type I error) = $\alpha$. Sometimes called the producer's risk.
• Type II error: not rejecting the null hypothesis when it is false.
• Pr(Type II error) = $\beta$. Sometimes called the consumer's risk.

An Engine Explodes
$H_0$: An automobile engine explodes when started.
$H_1$: An automobile engine does not explode when started.
Which error would you take action to avoid? Whose risk is higher, the producer's or the consumer's?

Power of a Test
• The power of a test of hypothesis is given by $1 - \beta$.
• That is, $1 - \beta$ is the probability of correctly rejecting the null hypothesis, or the probability of rejecting the null hypothesis when the alternative is true.

3-3.1 Inference on the Mean of a Population, Variance Known
Hypothesis Testing
• Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
• Test Statistic:
$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
• Significance Level, $\alpha$
• Rejection Region: $Z_0 > Z_{\alpha/2}$ or $Z_0 < -Z_{\alpha/2}$
• If $Z_0$ falls into either of the two regions above, reject $H_0$.

Example 3-1
• Hypotheses: $H_0: \mu = 175$, $H_1: \mu > 175$
• Test Statistic:
$$Z_0 = \frac{182 - 175}{10/\sqrt{25}} = 3.50$$
• Significance Level, $\alpha = 0.05$
• Rejection Region: $Z_0 > Z_{\alpha} = 1.645$
• Since 3.50 > 1.645, reject $H_0$ and conclude that the lot mean pressure strength exceeds 175 psi.

Confidence Intervals
• A general $100(1-\alpha)\%$ two-sided confidence interval on the true population mean, $\mu$, is
$$P[L \le \mu \le U] = 1 - \alpha$$
• $100(1-\alpha)\%$ one-sided confidence intervals are:
$$P[\mu \le U] = 1 - \alpha \ \text{(upper)}, \qquad P[L \le \mu] = 1 - \alpha \ \text{(lower)}$$

Confidence Interval on the Mean with Variance Known
• Two-Sided:
$$P\left[\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$$
• See the text for one-sided confidence intervals.

Example 3-2
• Reconsider Example 3-1. Suppose a 95% two-sided confidence interval is specified.
Using Equation (3-28) we compute
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$$182 - 1.96\frac{10}{\sqrt{25}} \le \mu \le 182 + 1.96\frac{10}{\sqrt{25}}$$
$$178.08 \le \mu \le 185.92$$
• Our estimate of the mean bursting strength is 182 psi ± 3.92 psi with 95% confidence.

3-3.2 The Use of P-Values in Hypothesis Testing
• If it is not enough to know whether your test statistic, $Z_0$, falls into a rejection region, then a measure of just how significant your test statistic is can be computed: the P-value.
• P-values are probabilities associated with the test statistic, $Z_0$.

Definition
• The P-value is the smallest level of significance that would lead to rejection of the null hypothesis $H_0$.

Example
• Reconsider Example 3-1. The test statistic was calculated to be $Z_0 = 3.50$ for a right-tailed hypothesis test. The P-value for this problem is then
$$P = 1 - \Phi(3.50) = 0.00023$$
• Thus, $H_0: \mu = 175$ would be rejected at any level of significance $\alpha \ge P = 0.00023$.

3-3.3 Inference on the Mean of a Population, Variance Unknown
Hypothesis Testing
• Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
• Test Statistic:
$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
• Significance Level, $\alpha$
• Rejection Region: reject $H_0$ if $|t_0| > t_{\alpha/2,\,n-1}$

Confidence Interval on the Mean with Variance Unknown
• Two-Sided:
$$P\left[\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right] = 1 - \alpha$$
• See the text for the one-sided confidence intervals.

Computer Output
Table 3-2. Minitab Output for Example 3-3

Welcome to Minitab, press F1 for help.
One-Sample T: Strength
Test of mu = 50 vs mu not = 50

Variable    N     Mean      StDev    SE Mean
Strength    16    49.864    1.661    0.415

Variable    95.0% CI              T        P
Strength    (48.979, 50.750)      -0.33    0.749

3-3.4 Inference on the Variance of a Normal Distribution
Hypothesis Testing
• Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \neq \sigma_0^2$
• Test Statistic:
$$\chi_0^2 = \frac{(n-1)S^2}{\sigma_0^2}$$
• Significance Level, $\alpha$
• Rejection Region: $\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ or $\chi_0^2 < \chi^2_{1-\alpha/2,\,n-1}$

Confidence Interval on the Variance
• Two-Sided:
$$P\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right] = 1 - \alpha$$
• See the text for the one-sided confidence intervals.

3-3.5 Inference on a Population Proportion
Hypothesis Testing
• Hypotheses: $H_0: p = p_0$, $H_1: p \neq p_0$
• Test Statistic:
$$Z_0 = \begin{cases} \dfrac{(x + 0.5) - np_0}{\sqrt{np_0(1-p_0)}}, & x < np_0 \\[2ex] \dfrac{(x - 0.5) - np_0}{\sqrt{np_0(1-p_0)}}, & x > np_0 \end{cases}$$
• Significance Level, $\alpha$
• Rejection Region: $|Z_0| > Z_{\alpha/2}$

Confidence Interval on the Population Proportion
• Two-Sided:
$$P\left[\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right] = 1 - \alpha$$
• See the text for the one-sided confidence intervals.

3-3.6 The Probability of Type II Error
Calculation of P(Type II Error)
• Assume the test of interest is $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
• P(Type II Error) is found to be
$$\beta = \Phi\left(Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) - \Phi\left(-Z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)$$
where $\delta$ is the difference between the true mean and $\mu_0$, and $\Phi$ is the standard normal cumulative distribution function.
• The power of the test is then $1 - \beta$.

Operating Characteristic (OC) Curves
• An operating characteristic (OC) curve is a graph representing the relationship between $\beta$, $\delta$, and n.
• OC curves are useful in determining how large a sample is required to detect a specified difference with a particular probability.
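To illustrate the Type II error calculation for the two-sided test with known variance, here is a minimal sketch using only the Python standard library. The function names and the numerical inputs (a shift of 0.1, sigma = 0.1, n = 9, alpha = 0.05, target power 0.90) are illustrative assumptions rather than values from these slides; `delta` denotes the shift of the true mean away from the hypothesized mean.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def type_ii_error(delta, sigma, n, alpha=0.05):
    """beta for the two-sided z-test of H0: mu = mu0 when the true mean is mu0 + delta."""
    z_half = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    shift = delta * math.sqrt(n) / sigma
    return _STD_NORMAL.cdf(z_half - shift) - _STD_NORMAL.cdf(-z_half - shift)


def smallest_n_for_power(delta, sigma, target_power, alpha=0.05, n_max=10_000):
    """Smallest n whose power 1 - beta reaches target_power (simple linear search)."""
    for n in range(1, n_max + 1):
        if 1 - type_ii_error(delta, sigma, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within n_max")


# Assumed inputs, for illustration only.
beta = type_ii_error(delta=0.1, sigma=0.1, n=9)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")

n_needed = smallest_n_for_power(delta=0.1, sigma=0.1, target_power=0.90)
print(f"smallest n for power >= 0.90: {n_needed}")
```

Evaluating `type_ii_error` over a grid of `delta` values for several choices of `n` traces out curves of the same kind as an OC chart, which is one way to read off a sample size without printed charts.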
[Figure: operating characteristic (OC) curves]

3-3.7 Probability Plotting
• Probability plotting is a graphical method for determining whether sample data conform to a hypothesized distribution based on a subjective visual examination of the data.
• Probability plotting uses special graph paper known as probability paper. Probability paper is available for the normal, lognormal, and Weibull distributions, among others.
• Can also use the computer.

Example 3-8

j     x(j)    (j – 0.5)/10
1     1176    0.05
2     1183    0.15
3     1185    0.25
4     1190    0.35
5     1191    0.45
6     1192    0.55
7     1201    0.65
8     1205    0.75
9     1214    0.85
10    1220    0.95

[Figure: Normal Probability Plot for Life; vertical axis Percent, horizontal axis Data; ML Estimates: Mean 1195.7, StDev 13.3120]

3-4. Statistical Inference for Two Samples
• Previous section presented hypothesis testing and confidence intervals for a single population parameter.
• Results are extended to the case of two independent populations.
• Statistical inference on the difference in population means, $\mu_1 - \mu_2$

3-4.1 Inference For a Difference in Means, Variances Known
Assumptions
1. $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample from population 1.
2. $X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample from population 2.
3. The two populations represented by $X_1$ and $X_2$ are independent.
4. Both populations are normal, or if they are not normal, the conditions of the central limit theorem apply.

• Point estimator for $\mu_1 - \mu_2$ is $\bar{X}_1 - \bar{X}_2$, where
$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$
$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Hypothesis Tests for a Difference in Means, Variances Known
Null Hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$
Test Statistic:
$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Alternative Hypotheses                     Rejection Criterion
$H_1: \mu_1 - \mu_2 \neq \Delta_0$         $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$
$H_1: \mu_1 - \mu_2 > \Delta_0$            $z_0 > z_{\alpha}$
$H_1: \mu_1 - \mu_2 < \Delta_0$            $z_0 < -z_{\alpha}$

Confidence Interval on a Difference in Means, Variances Known
The $100(1-\alpha)\%$ confidence interval on the difference in means is given by
$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

3-4.2 Inference For a Difference in Means, Variances Unknown
Hypothesis Tests for a Difference in Means, Case I: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
• Point estimator for $\mu_1 - \mu_2$ is $\bar{X}_1 - \bar{X}_2$, where
$$V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$
The pooled estimate of $\sigma^2$, denoted by $S_p^2$, is defined by
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
Null Hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$
Test Statistic:
$$t_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

Alternative Hypotheses                     Rejection Criterion
$H_1: \mu_1 - \mu_2 \neq \Delta_0$         $t_0 > t_{\alpha/2,\,n_1+n_2-2}$ or $t_0 < -t_{\alpha/2,\,n_1+n_2-2}$
$H_1: \mu_1 - \mu_2 > \Delta_0$            $t_0 > t_{\alpha,\,n_1+n_2-2}$
$H_1: \mu_1 - \mu_2 < \Delta_0$            $t_0 < -t_{\alpha,\,n_1+n_2-2}$

Hypothesis Tests for a Difference in Means, Case II: $\sigma_1^2 \neq \sigma_2^2$
Null Hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$
Test Statistic:
$$t_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
• The degrees of freedom for $t_0^*$ are given by
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 + 1} + \dfrac{(S_2^2/n_2)^2}{n_2 + 1}} - 2$$

Confidence Intervals on a Difference in Means, Case I: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
The $100(1-\alpha)\%$ confidence interval on the difference in means is given by
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Confidence Intervals on a Difference in Means, Case II: $\sigma_1^2 \neq \sigma_2^2$
The $100(1-\alpha)\%$ confidence interval on the difference in means is given by
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

3-4.2 Paired Data
• Observations in an experiment are often paired to prevent extraneous factors from inflating the estimate of the variance.
• The difference is obtained on each pair of observations, $d_j = x_{1j} - x_{2j}$, where j = 1, 2, …, n.
• Test the hypothesis that the mean of the difference, $\mu_d$, is zero.
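The two test statistics for a difference in means with unknown variances can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions: the function names and the two small samples are made up, and the Case II degrees of freedom use the common Welch-Satterthwaite form with n_i - 1 in the denominators. Some texts use a variant with n_i + 1 and a final subtraction of 2, which gives similar but not identical values.

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2, delta0=0.0):
    """Case I (equal variances): return (t0, degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t0 = (mean(sample1) - mean(sample2) - delta0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t0, n1 + n2 - 2


def welch_t(sample1, sample2, delta0=0.0):
    """Case II (unequal variances): return (t0*, approximate degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1) / n1, variance(sample2) / n2
    t0 = (mean(sample1) - mean(sample2) - delta0) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed form, with n_i - 1 denominators).
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, nu


# Made-up measurements, for illustration only.
x1 = [5.0, 7.0, 9.0]
x2 = [1.0, 3.0, 5.0, 7.0]

t_pooled, df_pooled = pooled_t(x1, x2)
t_welch, df_welch = welch_t(x1, x2)
print(f"pooled: t0  = {t_pooled:.4f}, df = {df_pooled}")
print(f"Welch:  t0* = {t_welch:.4f}, df = {df_welch:.2f}")
```

Each statistic is then compared with the appropriate t critical value for its degrees of freedom; the critical values themselves are not computed here because the standard library has no t-distribution quantile function.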
3-4.2 Paired Data
• The differences, $d_j$, represent the "new" set of data with the summary statistics:
$$\bar{d} = \frac{1}{n}\sum_{j=1}^{n} d_j, \qquad S_d^2 = \frac{\sum_{j=1}^{n} (d_j - \bar{d})^2}{n - 1}$$

Hypothesis Testing
• Hypotheses: $H_0: \mu_d = 0$, $H_1: \mu_d \neq 0$
• Test Statistic:
$$t_0 = \frac{\bar{d}}{S_d/\sqrt{n}}$$
• Significance Level, $\alpha$
• Rejection Region: $|t_0| > t_{\alpha/2,\,n-1}$

3-4.3 Inferences on the Variances of Two Normal Distributions
Hypothesis Testing
• Consider testing the hypothesis that the variances of two independent normal distributions are equal.
$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \neq \sigma_2^2$$
• Assume random samples of sizes $n_1$ and $n_2$ are taken from populations 1 and 2, respectively.
• Test Statistic:
$$F_0 = \frac{S_1^2}{S_2^2}$$
• Significance Level, $\alpha$
• Rejection Region: $F_0 > F_{\alpha/2,\,n_1-1,\,n_2-1}$ or $F_0 < F_{1-\alpha/2,\,n_1-1,\,n_2-1}$

Alternative Hypothesis              Test Statistic            Rejection Region
$H_1: \sigma_1^2 < \sigma_2^2$      $F_0 = S_2^2/S_1^2$       $F_0 > F_{\alpha,\,n_2-1,\,n_1-1}$
$H_1: \sigma_1^2 > \sigma_2^2$      $F_0 = S_1^2/S_2^2$       $F_0 > F_{\alpha,\,n_1-1,\,n_2-1}$

Confidence Intervals on Ratio of the Variances of Two Normal Distributions
The $100(1-\alpha)\%$ two-sided confidence interval on the ratio of variances is given by
$$\frac{S_1^2}{S_2^2} F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2} F_{\alpha/2,\,n_2-1,\,n_1-1}$$

3-4.4 Inference on Two Population Proportions
Large-Sample Hypothesis Testing
• Hypotheses: $H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$
• Test Statistic:
$$Z_0 = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}$$
Under $H_0$, $p_1 - p_2 = 0$, and the unknown proportions in the denominator are replaced by the pooled estimate $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$, where $x_1$ and $x_2$ are the numbers of observations of interest in the two samples.
• Significance Level, $\alpha$
• Rejection Region: $|Z_0| > Z_{\alpha/2}$

Alternative Hypothesis      Rejection Region
$H_1: p_1 > p_2$            $z_0 > z_{\alpha}$
$H_1: p_1 < p_2$            $z_0 < -z_{\alpha}$

Confidence Interval on the Difference in Two Population Proportions
• Two-Sided:
$$\hat{P}_1 - \hat{P}_2 - Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{P}_1 - \hat{P}_2 + Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
• See the text for the one-sided confidence intervals.

3-5. What If We Have More Than Two Populations?
Example
Investigating the effect of one factor (with several levels) on some response. See Table 3-5.

Hardwood                    Observations
Concentration     1     2     3     4     5     6     Totals    Avg
5%                7     8     15    11    9     10    60        10.00
10%               12    17    13    18    19    15    94        15.67
15%               14    18    19    17    16    18    102       17.00
20%               19    25    22    23    18    20    127       21.17
Overall                                               383       15.96

Analysis of Variance
• It is always good practice to compare the levels of the factor using graphical methods such as boxplots.
• Comparative boxplots show the variability of the observations within a factor level and the variability between factor levels.

[Figure 3-14 (a): boxplots of tensile strength (psi) versus hardwood concentration (%)]

• The observations $y_{ij}$ can be modeled by
$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, n$$
a = number of factor levels
n = number of replicates (number of observations per treatment (factor) level)

• The hypotheses being tested are
$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0 \qquad H_1: \tau_i \neq 0 \ \text{for at least one } i$$
• Total variability can be measured by the "total corrected sum of squares":
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$
• The sum of squares identity is
$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$
• Notationally, this is often written as
$$SS_T = SS_{Treatments} + SS_E$$
• The expected value of the treatment sum of squares is
$$E(SS_{Treatments}) = (a - 1)\sigma^2 + n\sum_{i=1}^{a} \tau_i^2$$
• If the null hypothesis is true, then
$$E\left(\frac{SS_{Treatments}}{a - 1}\right) = \sigma^2$$
• The error mean square is
$$MS_E = \frac{SS_E}{a(n - 1)}$$
• If the null hypothesis is true, the ratio
$$F_0 = \frac{SS_{Treatments}/(a - 1)}{SS_E/[a(n - 1)]} = \frac{MS_{Treatments}}{MS_E}$$
has an F-distribution with a – 1 and a(n – 1) degrees of freedom.

The following formulas can be used to calculate the sums of squares.
• Total Sum of Squares ($SS_T$):
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{an}$$
• Sum of Squares for the Treatments ($SS_{Treatments}$):
$$SS_{Treatments} = \sum_{i=1}^{a} \frac{y_{i.}^2}{n} - \frac{y_{..}^2}{an}$$
• Sum of Squares for Error ($SS_E$):
$$SS_E = SS_T - SS_{Treatments}$$

• Analysis of Variance Table 3-7

Source of Variation    Sum of Squares       Degrees of Freedom    Mean Square          F0
Treatments             $SS_{Treatments}$    a – 1                 $MS_{Treatments}$    $MS_{Treatments}/MS_E$
Error                  $SS_E$               a(n – 1)              $MS_E$
Total                  $SS_T$               an – 1
• Analysis of Variance Table 3-8

Analysis of Variance
Source    DF    SS        MS        F        P
Factor    3     382.79    127.60    19.61    0.000
Error     20    130.17    6.51
Total     23    512.96

Level    N    Mean      StDev
5        6    10.000    2.828
10       6    15.667    2.805
15       6    17.000    1.789
20       6    21.167    2.639

Pooled StDev = 2.551

[Character plot: individual 95% CIs for mean, based on pooled StDev, one interval per level on an axis running from 10.0 to 25.0]

Residual Analysis
• Assumptions: model errors are normally and independently distributed with equal variance.
• Check the assumptions by looking at residual plots.
• Plot of residuals versus factor levels
[Figure: residuals versus percent hardwood]
• Normal probability plot of residuals
[Figure: normal probability plot of the residuals]
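The sums of squares and the F ratio reported in Table 3-8 can be reproduced from the hardwood-concentration observations of Table 3-5 using the computing formulas for the sums of squares. The sketch below is a minimal standard-library illustration for a balanced layout; the function names and dictionary keys are illustrative choices, and no P-value is computed because the standard library has no F-distribution function.

```python
from statistics import mean


def one_way_anova(groups):
    """Sums of squares, degrees of freedom, mean squares and F0 (balanced design)."""
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same number of replicates per level")
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total**2 / (a * n)  # y..^2 / (an)
    ss_total = sum(y * y for g in groups for y in g) - correction
    ss_treat = sum(sum(g) ** 2 for g in groups) / n - correction
    ss_error = ss_total - ss_treat
    ms_treat = ss_treat / (a - 1)
    ms_error = ss_error / (a * (n - 1))
    return {
        "ss_treat": ss_treat, "ss_error": ss_error, "ss_total": ss_total,
        "df_treat": a - 1, "df_error": a * (n - 1),
        "ms_treat": ms_treat, "ms_error": ms_error, "f0": ms_treat / ms_error,
    }


def residuals(groups):
    """Residuals e_ij = y_ij - (mean of level i), used for the residual plots."""
    return [[y - mean(g) for y in g] for g in groups]


# Tensile strength observations from Table 3-5, one list per hardwood concentration.
hardwood = [
    [7, 8, 15, 11, 9, 10],     # 5%
    [12, 17, 13, 18, 19, 15],  # 10%
    [14, 18, 19, 17, 16, 18],  # 15%
    [19, 25, 22, 23, 18, 20],  # 20%
]

result = one_way_anova(hardwood)
for key, value in result.items():
    print(f"{key:>9}: {value:.2f}")

resid = residuals(hardwood)
print("largest absolute residual:", max(abs(e) for row in resid for e in row))
```

Plotting the values in `resid` against the factor level, or on a normal probability scale, gives the two residual checks listed above.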