Design and Analysis of Experiments
Dr. Tai-Yue Wang
Department of Industrial and Information Management
National Cheng Kung University, Tainan, TAIWAN, ROC

Analysis of Variance

Outline (1/2)
- Example
- The ANOVA
- Analysis of the Fixed Effects Model
- Model Adequacy Checking
- Practical Interpretation of Results
- Determining Sample Size

Outline (2/2)
- Discovering Dispersion Effects
- Nonparametric Methods in the ANOVA

What If There Are More Than Two Factor Levels?
- The t-test does not directly apply.
- There are many practical situations where there are either more than two levels of interest, or several factors of simultaneous interest.
- The ANalysis Of VAriance (ANOVA) is the appropriate analysis "engine" for these types of experiments.
- The ANOVA was developed by Fisher in the early 1920s and was initially applied to agricultural experiments; it is used extensively today for industrial experiments.

An Example (1/6)
(Figure not reproduced in the text.)

An Example (2/6)
An engineer is interested in investigating the relationship between the RF power setting and the etch rate for this tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate. The response variable is etch rate.

An Example (3/6)
She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160 W, 180 W, 200 W, and 220 W. She decides to test five wafers at each level of RF power.
- The experimenter chooses 4 levels of RF power: 160 W, 180 W, 200 W, and 220 W.
- The experiment is replicated 5 times, with the runs made in random order.

An Example -- Data
(Table of the 20 etch-rate observations; not reproduced in the text.)

An Example -- Data Plot
(Plot of etch rate versus RF power; not reproduced in the text.)

An Example -- Questions
- Does changing the power change the mean etch rate?
- Is there an optimum level for power?
- We would like an objective way to answer these questions.
- The t-test really doesn't apply here: there are more than two factor levels.

The Analysis of Variance
- In general there will be a levels of the factor, or a treatments, and n replicates of the experiment, run in random order: a completely randomized design (CRD).
- N = an total runs.
- We consider the fixed effects case; the random effects case will be discussed later.
- The objective is to test hypotheses about the equality of the a treatment means.
- The name "analysis of variance" stems from a partitioning of the total variability in the response variable into components that are consistent with a model for the experiment.

The basic single-factor ANOVA model is
$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n,$$
where $\mu$ is an overall mean, $\tau_i$ is the $i$th treatment effect, and $\varepsilon_{ij}$ is the experimental error, assumed NID$(0, \sigma^2)$.

Models for the Data
- There are several ways to write a model for the data. The means model is $y_{ij} = \mu_i + \varepsilon_{ij}$.
- This is also known as one-way or single-factor ANOVA.

Models for the Data -- Fixed or Random Factor?
- The a treatments could have been specifically chosen by the experimenter. In this case the results may apply only to the levels considered in the analysis: fixed effects models.
- The a treatments could be a random sample from a larger population of treatments. In this case we should be able to extend the conclusions to all treatments in the population: random effects models.

Analysis of the Fixed Effects Model
Recall the single-factor ANOVA model for the fixed effects case, $y_{ij} = \mu + \tau_i + \varepsilon_{ij} = \mu_i + \varepsilon_{ij}$. Define
$$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{i.} = y_{i.}/n, \qquad i = 1, 2, \ldots, a,$$
$$y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{..} = y_{..}/N, \qquad N = an.$$
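Since the slides carry out all computations in Minitab, a short Python sketch may help make the "dot" notation concrete. The 20 etch-rate observations below are assumed values, since the data slide itself is not reproduced in the text; they are consistent with the level means (551.2, 587.4, 625.4, 707.0) and pooled standard deviation reported in the Minitab output later in the deck.

```python
# A minimal sketch of the dot notation for the etch-rate example.
# NOTE: the observation values are assumed (data slide not shown in the text);
# they reproduce the level means reported in the Minitab output.
import numpy as np

power = [160, 180, 200, 220]                      # RF power levels (W)
y = np.array([
    [575, 542, 530, 539, 570],                    # 160 W
    [565, 593, 590, 579, 610],                    # 180 W
    [600, 651, 610, 637, 629],                    # 200 W
    [725, 700, 715, 685, 710],                    # 220 W
])                                                # shape (a, n) = (4, 5)

a, n = y.shape
N = a * n                                         # N = an total runs

y_i_dot = y.sum(axis=1)                           # treatment totals  y_i.
ybar_i_dot = y_i_dot / n                          # treatment means   ybar_i.
y_dotdot = y.sum()                                # grand total       y..
ybar_dotdot = y_dotdot / N                        # grand mean        ybar..

for p, t, m in zip(power, y_i_dot, ybar_i_dot):
    print(f"{p} W: y_i. = {t}, ybar_i. = {m:.1f}")
print(f"y.. = {y_dotdot}, ybar.. = {ybar_dotdot:.2f}")
```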
Analysis of the Fixed Effects Model -- Hypotheses
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$$
$$H_1: \mu_i \neq \mu_j \ \text{for at least one pair } (i, j)$$
Here $\mu_i = \mu + \tau_i$, with $\mu = \frac{1}{a}\sum_{i=1}^{a}\mu_i$ and $\sum_{i=1}^{a}\tau_i = 0$.

Thus, the equivalent hypotheses are
$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$$
$$H_1: \tau_i \neq 0 \ \text{for at least one } i$$

Analysis of the Fixed Effects Model -- Decomposition
Total variability is measured by the total sum of squares:
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{..})^2.$$
The basic ANOVA partitioning is
$$\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n}\left[(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})\right]^2 = n\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2,$$
that is, $SS_T = SS_{\text{Treatments}} + SS_E$.

In detail,
$$\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{..})^2 = n\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2 + 2\sum_{i=1}^{a}\sum_{j=1}^{n}(\bar{y}_{i.} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.}),$$
where the cross-product term equals zero.

- A large value of $SS_{\text{Treatments}}$ reflects large differences in treatment means.
- A small value of $SS_{\text{Treatments}}$ likely indicates no differences in treatment means.
- The formal statistical hypotheses are $H_0: \mu_1 = \mu_2 = \cdots = \mu_a$ versus $H_1$: at least one mean is different.

For $SS_E$:
$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2.$$
Recall the sample variance of the $i$th treatment,
$$S_i^2 = \frac{\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2}{n-1}, \qquad \text{so} \qquad \sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2 = (n-1)S_i^2.$$
Combining the a sample variances,
$$SS_E = \sum_{i=1}^{a}(n-1)S_i^2 = (n-1)S_1^2 + (n-1)S_2^2 + \cdots + (n-1)S_a^2,$$
and
$$\frac{SS_E}{N-a} = \frac{(n-1)S_1^2 + (n-1)S_2^2 + \cdots + (n-1)S_a^2}{(n-1) + (n-1) + \cdots + (n-1)}$$
is a pooled estimate of the common variance within each of the a treatments.

Analysis of the Fixed Effects Model -- Mean Squares
Define
$$MS_{\text{Treatments}} = \frac{SS_{\text{Treatments}}}{a-1} \qquad \text{and} \qquad MS_E = \frac{SS_E}{N-a},$$
where $a-1$ and $N-a$ are the corresponding degrees of freedom. The expected mean squares are
$$E(MS_E) = \sigma^2, \qquad E(MS_{\text{Treatments}}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a-1}.$$
That is, $MS_E$ estimates $\sigma^2$, and if there are no differences in treatment means, $MS_{\text{Treatments}}$ also estimates $\sigma^2$.

Analysis of the Fixed Effects Model -- Statistical Analysis
Cochran's Theorem: Let $Z_i$ be NID(0, 1) for $i = 1, 2, \ldots, \nu$ and
$$\sum_{i=1}^{\nu} Z_i^2 = Q_1 + Q_2 + \cdots + Q_s,$$
where $s \le \nu$ and $Q_i$ has $\nu_i$ degrees of freedom ($i = 1, 2, \ldots, s$). Then $Q_1, Q_2, \ldots, Q_s$ are independent chi-square random variables with $\nu_1, \nu_2, \ldots, \nu_s$ degrees of freedom, respectively, if and only if $\nu = \nu_1 + \nu_2 + \cdots + \nu_s$.

Cochran's Theorem implies that $SS_{\text{Treatments}}/\sigma^2$ and $SS_E/\sigma^2$ are independently distributed chi-square random variables. Thus, if the null hypothesis is true, the ratio
$$F_0 = \frac{SS_{\text{Treatments}}/(a-1)}{SS_E/(N-a)} = \frac{MS_{\text{Treatments}}}{MS_E}$$
is distributed as F with $a-1$ and $N-a$ degrees of freedom.
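As a check on the algebra, the sketch below computes the decomposition $SS_T = SS_{\text{Treatments}} + SS_E$ and the $F_0$ statistic directly from the definitions, using the same assumed etch-rate observations as before and SciPy in place of Minitab.

```python
# A minimal sketch of the ANOVA decomposition and F test for the etch-rate
# example (a = 4 levels, n = 5 replicates). The observations are the same
# assumed values used earlier; scipy.stats.f_oneway is used only as a check.
import numpy as np
from scipy import stats

y = np.array([
    [575, 542, 530, 539, 570],   # 160 W
    [565, 593, 590, 579, 610],   # 180 W
    [600, 651, 610, 637, 629],   # 200 W
    [725, 700, 715, 685, 710],   # 220 W
], dtype=float)

a, n = y.shape
N = a * n
ybar_i = y.mean(axis=1)
ybar = y.mean()

ss_total = ((y - ybar) ** 2).sum()               # SS_T
ss_treat = n * ((ybar_i - ybar) ** 2).sum()      # SS_Treatments
ss_error = ((y - ybar_i[:, None]) ** 2).sum()    # SS_E = SS_T - SS_Treatments

ms_treat = ss_treat / (a - 1)
ms_error = ss_error / (N - a)
f0 = ms_treat / ms_error
p_value = stats.f.sf(f0, a - 1, N - a)

print(f"SS_Treatments = {ss_treat:.1f}, SS_E = {ss_error:.1f}, SS_T = {ss_total:.1f}")
print(f"F0 = {f0:.2f}, p = {p_value:.4f}")       # should match the ANOVA table (F ~ 66.8)

print(stats.f_oneway(*y))                        # same F and p computed by SciPy
```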
Analysis of the Fixed Effects Model -- Summary Table
The reference distribution for $F_0$ is the $F_{a-1,\,a(n-1)}$ distribution. Reject the null hypothesis (equal treatment means) if
$$F_0 > F_{\alpha,\,a-1,\,a(n-1)}.$$

Analysis of the Fixed Effects Model -- Example
Recall the etch rate example. The hypotheses are
$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 \qquad H_1: \text{some means are different.}$$
(ANOVA table, rejection region, and P-value figures not reproduced in the text.)

Minitab output:

One-way ANOVA: Etch Rate versus Power

Source  DF     SS     MS      F      P
Power    3  66871  22290  66.80  0.000
Error   16   5339    334
Total   19  72210

S = 18.27   R-Sq = 92.61%   R-Sq(adj) = 91.22%

Level   N    Mean  StDev
160     5  551.20  20.02
180     5  587.40  16.74
200     5  625.40  20.53
220     5  707.00  15.25

Pooled StDev = 18.27
(Individual 95% CIs for the means, based on the pooled StDev, are shown as a character plot in the original output.)

Analysis of the Fixed Effects Model -- Statistical Analysis
- Coding the observations will not change the results.
- Without the assumption of randomization, the ANOVA F test can be viewed as an approximation to the randomization test.

Analysis of the Fixed Effects Model -- Estimation of the Model Parameters
Reasonable estimates of the overall mean and the treatment effects for the single-factor model $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ are
$$\hat{\mu} = \bar{y}_{..}, \qquad \hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}.$$
A $100(1-\alpha)\%$ confidence interval for $\mu_i$ is
$$\bar{y}_{i.} - t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}} \le \mu_i \le \bar{y}_{i.} + t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}},$$
and a confidence interval for $\mu_i - \mu_j$ is
$$\bar{y}_{i.} - \bar{y}_{j.} - t_{\alpha/2,\,N-a}\sqrt{\frac{2MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + t_{\alpha/2,\,N-a}\sqrt{\frac{2MS_E}{n}}.$$

Analysis of the Fixed Effects Model -- Unbalanced Data
For unbalanced data, with $n_i$ observations in treatment $i$ and $N = \sum_i n_i$,
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{..}^2}{N}, \qquad SS_{\text{Treatments}} = \sum_{i=1}^{a}\frac{y_{i.}^2}{n_i} - \frac{y_{..}^2}{N}.$$
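A small sketch of the interval formulas above, again in SciPy, using the $MS_E$ and level means from the ANOVA table; the choice of which means to compare is only illustrative.

```python
# A minimal sketch of the confidence-interval formulas, using MS_E and the
# level means from the etch-rate ANOVA (assumed data, as before).
import numpy as np
from scipy import stats

ms_error, n, N, a = 333.7, 5, 20, 4        # MS_E ~ 334 from the ANOVA table
ybar_i = np.array([551.2, 587.4, 625.4, 707.0])
t_crit = stats.t.ppf(1 - 0.05 / 2, N - a)  # t_{alpha/2, N-a}

# 95% CI for a single treatment mean, e.g. mu_4 (220 W)
half = t_crit * np.sqrt(ms_error / n)
print(f"mu_4: {ybar_i[3] - half:.1f} to {ybar_i[3] + half:.1f}")

# 95% CI for a difference of two means, e.g. mu_4 - mu_3
half_diff = t_crit * np.sqrt(2 * ms_error / n)
print(f"mu_4 - mu_3: {ybar_i[3] - ybar_i[2] - half_diff:.1f} to "
      f"{ybar_i[3] - ybar_i[2] + half_diff:.1f}")
```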
A little (very little) humor...
(Cartoon not reproduced.)

Model Adequacy Checking
Assumptions on the model $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$: the errors are normally and independently distributed with mean zero and constant but unknown variance $\sigma^2$. Define the residual
$$e_{ij} = y_{ij} - \hat{y}_{ij},$$
where $\hat{y}_{ij}$ is an estimate of $y_{ij}$ (for this model, the corresponding treatment average $\bar{y}_{i.}$).

Model Adequacy Checking -- Normality
- Normal probability plot of the residuals (figure not reproduced).
- Four-in-one residual plot (figure not reproduced).

Model Adequacy Checking -- Plot of Residuals in Time Sequence
- Residuals versus run order (figure not reproduced).

Model Adequacy Checking -- Residuals vs Fitted
- Residuals versus fitted values (figure not reproduced).
- Defect patterns to watch for: horn shape, moon type.

Model Adequacy Checking -- Test for Equal Variances
Bartlett's test:
$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_a^2 \qquad H_1: \text{the above is not true for at least one } \sigma_i^2.$$

Minitab output:

Test for Equal Variances: Etch Rate versus Power

95% Bonferroni confidence intervals for standard deviations
Power  N    Lower    StDev    Upper
160    5  10.5675  20.0175  83.0477
180    5   8.8384  16.7422  69.4591
200    5  10.8357  20.5256  85.1557
220    5   8.0496  15.2480  63.2600

Bartlett's Test (Normal Distribution): test statistic = 0.43, p-value = 0.933
Levene's Test (Any Continuous Distribution): test statistic = 0.20, p-value = 0.898

Model Adequacy Checking -- Variance-Stabilizing Transformations
To deal with nonconstant variance:
- If the observations follow a Poisson distribution, use the square root transformation $y_{ij}^* = \sqrt{y_{ij}}$ or $y_{ij}^* = \sqrt{1 + y_{ij}}$.
- If the observations follow a lognormal distribution, use the logarithmic transformation $y_{ij}^* = \log y_{ij}$.
- If the observations are binomial data, use the arcsine transformation $y_{ij}^* = \arcsin\sqrt{y_{ij}}$.
- For other transformations, check the relationship between the observations and the mean.

Practical Interpretation of Results -- Regression Model
The one-way ANOVA model $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ is a regression model and is similar to
$$y_{ij} = \beta_0 + \beta_1 x_i + \varepsilon_{ij}.$$

Computer results:

Regression Analysis: Etch Rate versus Power

The regression equation is Etch Rate = 138 + 2.53 Power

Predictor    Coef  SE Coef      T      P
Constant   137.62    41.21   3.34  0.004
Power      2.5270   0.2154  11.73  0.000

S = 21.5413   R-Sq = 88.4%   R-Sq(adj) = 87.8%

Analysis of Variance
Source          DF     SS     MS       F      P
Regression       1  63857  63857  137.62  0.000
Residual Error  18   8352    464
Total           19  72210

Unusual Observations
Obs  Power  Etch Rate     Fit  SE Fit  Residual  St Resid
 11    200     600.00  643.02    5.28    -43.02    -2.06R
R denotes an observation with a large standardized residual.
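The residual definition and the equal-variance tests can be reproduced along the following lines. This is a sketch using the same assumed etch-rate observations; SciPy's Levene test centers at the median by default, so its statistic need not match the Minitab value exactly, and the Shapiro-Wilk call is only a numerical stand-in for the normal probability plot.

```python
# A minimal sketch of the model-adequacy checks: residuals, a normality check,
# and the equal-variance tests (assumed etch-rate data, as before).
import numpy as np
from scipy import stats

groups = [
    [575, 542, 530, 539, 570],   # 160 W
    [565, 593, 590, 579, 610],   # 180 W
    [600, 651, 610, 637, 629],   # 200 W
    [725, 700, 715, 685, 710],   # 220 W
]

# Residuals e_ij = y_ij - ybar_i. (the fitted value is the treatment mean)
residuals = np.concatenate([np.asarray(g) - np.mean(g) for g in groups])
print("residuals:", np.round(residuals, 1))

# Numerical analogue of the normal probability plot check
print("Shapiro-Wilk on residuals:", stats.shapiro(residuals))

# Equal-variance tests
print("Bartlett:", stats.bartlett(*groups))   # should be close to the quoted 0.43, p ~ 0.93
print("Levene:  ", stats.levene(*groups))     # median-centered; may differ from Minitab slightly
```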
The Regression Model
(Fitted regression of etch rate on power; figure not reproduced.)

Practical Interpretation of Results -- Comparison of Means
- The analysis of variance tests the hypothesis of equal treatment means. Assume that the residual analysis is satisfactory.
- If that hypothesis is rejected, we don't know which specific means are different.
- Determining which specific means differ following an ANOVA is called the multiple comparisons problem.
- There are many ways to do this. We will use pairwise t-tests on means, sometimes called Fisher's Least Significant Difference (Fisher's LSD) method.

Fisher 95% Individual Confidence Intervals
All Pairwise Comparisons among Levels of Power
Simultaneous confidence level = 81.11%

Power = 160 subtracted from:
Power   Lower  Center   Upper
180     11.71   36.20   60.69
200     49.71   74.20   98.69
220    131.31  155.80  180.29

Power = 180 subtracted from:
Power   Lower  Center   Upper
200     13.51   38.00   62.49
220     95.11  119.60  144.09

Power = 200 subtracted from:
Power   Lower  Center   Upper
220     57.11   81.60  106.09

Practical Interpretation of Results -- Graphical Comparison of Means
(Figure not reproduced.)

Practical Interpretation of Results -- Contrasts
A contrast is a linear combination of parameters
$$\Gamma = \sum_{i=1}^{a} c_i \mu_i \qquad \text{with} \qquad \sum_{i=1}^{a} c_i = 0,$$
so the hypotheses become
$$H_0: \sum_{i=1}^{a} c_i \mu_i = 0 \qquad H_1: \sum_{i=1}^{a} c_i \mu_i \neq 0.$$

Examples:
$$H_0: \mu_3 - \mu_4 = 0 \quad \text{vs.} \quad H_1: \mu_3 - \mu_4 \neq 0$$
$$H_0: \mu_1 + \mu_2 - \mu_3 - \mu_4 = 0 \quad \text{vs.} \quad H_1: \mu_1 + \mu_2 - \mu_3 - \mu_4 \neq 0$$
$$H_0: 3\mu_1 - \mu_2 - \mu_3 - \mu_4 = 0 \quad \text{vs.} \quad H_1: 3\mu_1 - \mu_2 - \mu_3 - \mu_4 \neq 0$$

Testing a contrast with a t-test: the contrast average and its variance are
$$C = \sum_{i=1}^{a} c_i \bar{y}_{i.}, \qquad V(C) = \frac{\sigma^2}{n}\sum_{i=1}^{a} c_i^2,$$
and the test statistic is
$$t_0 = \frac{\sum_{i=1}^{a} c_i \bar{y}_{i.}}{\sqrt{\dfrac{MS_E}{n}\sum_{i=1}^{a} c_i^2}}.$$

Testing a contrast with an F-test: the test statistic is
$$F_0 = t_0^2 = \frac{SS_C/1}{MS_E} = \frac{MS_C}{MS_E}, \qquad \text{where} \quad SS_C = \frac{\left(\sum_{i=1}^{a} c_i \bar{y}_{i.}\right)^2}{\dfrac{1}{n}\sum_{i=1}^{a} c_i^2}.$$
Reject the hypothesis if $F_0 > F_{\alpha,\,1,\,N-a}$.

Confidence interval for a contrast:
$$\sum_{i=1}^{a} c_i \bar{y}_{i.} - t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}\sum_{i=1}^{a} c_i^2} \;\le\; \sum_{i=1}^{a} c_i \mu_i \;\le\; \sum_{i=1}^{a} c_i \bar{y}_{i.} + t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}\sum_{i=1}^{a} c_i^2}.$$

Standardized contrast: when more than one contrast is of interest, standardize the contrasts as
$$\sum_{i=1}^{a} c_i^* \bar{y}_{i.}, \qquad \text{where} \quad c_i^* = \frac{c_i}{\sqrt{\dfrac{1}{n}\sum_{i=1}^{a} c_i^2}}.$$

Unequal sample sizes: the contrast is $\sum_{i=1}^{a} c_i \mu_i$ with $\sum_{i=1}^{a} n_i c_i = 0$, and
$$t_0 = \frac{\sum_{i=1}^{a} c_i \bar{y}_{i.}}{\sqrt{MS_E \displaystyle\sum_{i=1}^{a} \frac{c_i^2}{n_i}}}, \qquad SS_C = \frac{\left(\sum_{i=1}^{a} c_i \bar{y}_{i.}\right)^2}{\displaystyle\sum_{i=1}^{a} \frac{c_i^2}{n_i}}.$$
A code sketch of the contrast test follows; orthogonal contrasts are discussed after it.
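As a worked illustration of the contrast t- and F-statistics above, the following sketch tests the contrast $\mu_1 + \mu_2 - \mu_3 - \mu_4$ for the etch-rate example, using the level means and $MS_E$ from the ANOVA table; the data values and the choice of contrast are illustrative.

```python
# A minimal sketch of a single-degree-of-freedom contrast test (balanced case)
# for the contrast c = (1, 1, -1, -1), i.e. H0: mu1 + mu2 = mu3 + mu4.
# Level means and MS_E are taken from the etch-rate ANOVA above.
import numpy as np
from scipy import stats

ybar_i = np.array([551.2, 587.4, 625.4, 707.0])
c = np.array([1, 1, -1, -1])          # contrast coefficients, sum(c) = 0
ms_error, n, N, a = 333.7, 5, 20, 4

C = c @ ybar_i                                     # contrast estimate
t0 = C / np.sqrt(ms_error / n * (c ** 2).sum())    # t statistic
ss_c = C ** 2 / ((c ** 2).sum() / n)               # SS_C (1 degree of freedom)
f0 = ss_c / ms_error                               # F0 = t0^2

p = stats.f.sf(f0, 1, N - a)
print(f"C = {C:.1f}, t0 = {t0:.2f}, F0 = {f0:.2f}, p = {p:.4f}")
```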
Practical Interpretation of Results -- Orthogonal Contrasts
Two contrasts with coefficients $\{c_i\}$ and $\{d_i\}$ are orthogonal if
$$\sum_{i=1}^{a} c_i d_i = 0,$$
or, for an unbalanced design,
$$\sum_{i=1}^{a} n_i c_i d_i = 0.$$
Why use orthogonal contrasts? For a treatments, a set of a-1 orthogonal contrasts partitions the sum of squares due to treatments into a-1 independent single-degree-of-freedom components, and tests performed on orthogonal contrasts are independent.

Practical Interpretation of Results -- Orthogonal Contrasts Example
(Worked example of orthogonal contrasts; tables not reproduced in the text.)

Practical Interpretation of Results -- Scheffé's Method for Comparing All Contrasts
Scheffé's method allows comparison of any and all possible contrasts between treatment means. Suppose that a set of m contrasts in the treatment means,
$$\Gamma_u = c_{1u}\mu_1 + c_{2u}\mu_2 + \cdots + c_{au}\mu_a, \qquad u = 1, 2, \ldots, m,$$
is of interest. The corresponding contrast in the treatment averages $\bar{y}_{i.}$ is
$$C_u = c_{1u}\bar{y}_{1.} + c_{2u}\bar{y}_{2.} + \cdots + c_{au}\bar{y}_{a.}, \qquad u = 1, 2, \ldots, m.$$
The standard error of this contrast is
$$S_{C_u} = \sqrt{MS_E \sum_{i=1}^{a} c_{iu}^2/n_i}.$$
The critical value against which $C_u$ should be compared is
$$S_{\alpha,u} = S_{C_u}\sqrt{(a-1)F_{\alpha,\,a-1,\,N-a}}.$$
If $|C_u| > S_{\alpha,u}$, the hypothesis that the contrast $\Gamma_u$ equals zero is rejected. The simultaneous confidence intervals with type I error $\alpha$ are
$$C_u - S_{\alpha,u} \le \Gamma_u \le C_u + S_{\alpha,u}.$$

Practical Interpretation of Results -- Example for Scheffé's Method
The contrasts of interest are
$$\Gamma_1 = \mu_1 + \mu_2 - \mu_3 - \mu_4, \qquad \Gamma_2 = \mu_1 - \mu_4.$$
The numerical values of these contrasts are
$$C_1 = \bar{y}_{1.} + \bar{y}_{2.} - \bar{y}_{3.} - \bar{y}_{4.} = 551.2 + 587.4 - 625.4 - 707.0 = -193.8,$$
$$C_2 = \bar{y}_{1.} - \bar{y}_{4.} = 551.2 - 707.0 = -155.8.$$
The standard errors are $S_{C_1} = 16.34$ and $S_{C_2} = 11.55$, and the one percent critical values are $S_{0.01,1} = 65.09$ and $S_{0.01,2} = 45.97$. Since $|C_1| > S_{0.01,1}$ and $|C_2| > S_{0.01,2}$, both contrast hypotheses should be rejected.

Practical Interpretation of Results -- Comparing Pairs of Treatment Means
- Tukey's test
- Fisher's Least Significant Difference (LSD) method
- Hsu's methods

Practical Interpretation of Results -- Comparing Pairs of Treatment Means: Computer Output

One-way ANOVA: Etch Rate versus Power

Source  DF     SS     MS      F      P
Power    3  66871  22290  66.80  0.000
Error   16   5339    334
Total   19  72210

S = 18.27   R-Sq = 92.61%   R-Sq(adj) = 91.22%

Level   N    Mean  StDev
160     5  551.20  20.02
180     5  587.40  16.74
200     5  625.40  20.53
220     5  707.00  15.25

Pooled StDev = 18.27

Grouping Information Using Tukey Method
Power  N    Mean  Grouping
220    5  707.00  A
200    5  625.40  B
180    5  587.40  C
160    5  551.20  D
Means that do not share a letter are significantly different.
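The Scheffé quantities in the example can be reproduced numerically. The sketch below, using SciPy with $MS_E \approx 333.7$ from the ANOVA table, should recover the standard errors and 1% critical values quoted above to within rounding.

```python
# A minimal sketch of Scheffe's method for the two example contrasts
# (MS_E ~ 333.7, n_i = 5, a = 4, N = 20 from the etch-rate ANOVA).
import numpy as np
from scipy import stats

ybar = np.array([551.2, 587.4, 625.4, 707.0])
ms_error, n, a, N = 333.7, 5, 4, 20
alpha = 0.01

contrasts = {
    "Gamma1: mu1+mu2-mu3-mu4": np.array([1, 1, -1, -1]),
    "Gamma2: mu1-mu4":         np.array([1, 0, 0, -1]),
}

f_crit = stats.f.ppf(1 - alpha, a - 1, N - a)        # F_{alpha, a-1, N-a}
for name, c in contrasts.items():
    C_u = c @ ybar                                   # contrast in the averages
    se_Cu = np.sqrt(ms_error * (c ** 2 / n).sum())   # S_{C_u}
    S_alpha_u = se_Cu * np.sqrt((a - 1) * f_crit)    # Scheffe critical value
    print(f"{name}: C_u = {C_u:.1f}, S_Cu = {se_Cu:.2f}, "
          f"S_alpha_u = {S_alpha_u:.2f}, reject = {abs(C_u) > S_alpha_u}")
```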
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Power
Individual confidence level = 98.87%

Power = 160 subtracted from:
Power   Lower  Center   Upper
180      3.11   36.20   69.29
200     41.11   74.20  107.29
220    122.71  155.80  188.89

Power = 180 subtracted from:
Power   Lower  Center   Upper
200      4.91   38.00   71.09
220     86.51  119.60  152.69

Power = 200 subtracted from:
Power   Lower  Center   Upper
220     48.51   81.60  114.69

Hsu's MCB (Multiple Comparisons with the Best)
Family error rate = 0.05
Critical value = 2.23
Intervals for level mean minus largest of other level means

Level    Lower   Center   Upper
160    -181.53  -155.80    0.00
180    -145.33  -119.60    0.00
200    -107.33   -81.60    0.00
220       0.00    81.60  107.33

Grouping Information Using Fisher Method
Power  N    Mean  Grouping
220    5  707.00  A
200    5  625.40  B
180    5  587.40  C
160    5  551.20  D
Means that do not share a letter are significantly different.

Fisher 95% Individual Confidence Intervals
All Pairwise Comparisons among Levels of Power
Simultaneous confidence level = 81.11%

Power = 160 subtracted from:
Power   Lower  Center   Upper
180     11.71   36.20   60.69
200     49.71   74.20   98.69
220    131.31  155.80  180.29

Power = 180 subtracted from:
Power   Lower  Center   Upper
200     13.51   38.00   62.49
220     95.11  119.60  144.09

Power = 200 subtracted from:
Power   Lower  Center   Upper
220     57.11   81.60  106.09

Practical Interpretation of Results -- Comparing Treatment Means with a Control
- Dunnett's method
- Control: the treatment against which the others are compared
- Totally a-1 comparisons

Dunnett's comparisons with a control
Family error rate = 0.05
Individual error rate = 0.0196
Control = level (160) of Power
Critical value = 2.59
Intervals for treatment mean minus control mean

Level   Lower  Center   Upper
180      6.25   36.20   66.15
200     44.25   74.20  104.15
220    125.85  155.80  185.75

Determining Sample Size -- Minitab
Stat -> Power and Sample Size -> One-way ANOVA
Dialog entries: Number of levels = 4, Sample size = (left blank, to be computed), Power value = 0.9, Maximum difference = 75, Standard deviation = 25.

One-way ANOVA
Alpha = 0.01   Assumed standard deviation = 25
Factors: 1   Number of levels: 4

Maximum Difference  Sample Size  Target Power  Actual Power
                75            6           0.9      0.915384

The sample size is for each level.
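The Minitab sample-size run can be mirrored with a noncentral-F power calculation. The sketch below assumes Minitab's convention for the "maximum difference": two means are placed 75 apart with the remaining means at the centre, so that $\sum_i \tau_i^2 = D^2/2$. Under that assumption it should reproduce n = 6 per level with power near 0.915.

```python
# A minimal sketch of the power / sample-size calculation behind the Minitab
# run above (alpha = 0.01, sigma = 25, a = 4 levels, maximum difference 75).
# ASSUMPTION: Minitab's "maximum difference" convention, giving sum(tau^2) = D^2/2.
import numpy as np
from scipy import stats

alpha, sigma, a, D = 0.01, 25.0, 4, 75.0

def anova_power(n):
    """Power of the one-way ANOVA F test with n replicates per level."""
    lam = n * (D ** 2 / 2) / sigma ** 2            # noncentrality parameter
    df1, df2 = a - 1, a * (n - 1)
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    return stats.ncf.sf(f_crit, df1, df2, lam)     # P(F' > f_crit)

n = 2
while anova_power(n) < 0.9:                        # target power 0.9
    n += 1
print(f"n = {n} per level, actual power = {anova_power(n):.4f}")
```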
Dispersion Effects
- The ANOVA above tests for location effects (differences in means).
- A factor level may instead affect variability: a dispersion effect.
- Example: the average and the standard deviation are measured for a response variable.

Dispersion Effects -- Example
ANOVA found no location effects. Transform the standard deviation to y = ln(s):

                              Observation
Ratio Control Algorithm      1         2         3         4         5         6
1                     -2.99573  -3.21888  -2.99573  -2.81341  -3.50656  -2.99573
2                     -3.21888  -3.91202  -3.50656  -2.99573  -3.50656  -2.40795
3                     -2.40795  -2.04022  -2.20727  -1.89712  -2.52573  -2.12026
4                     -3.50656  -3.21888  -2.99573  -2.99573  -3.50656  -3.91202

The ANOVA on y = ln(s) finds dispersion effects:

One-way ANOVA: y = ln(s) versus Algorithm

Source     DF      SS      MS      F      P
Algorithm   3  6.1661  2.0554  21.96  0.000
Error      20  1.8716  0.0936
Total      23  8.0377

S = 0.3059   R-Sq = 76.71%   R-Sq(adj) = 73.22%

Nonparametric Methods in the ANOVA
When the normality assumption is invalid, use the Kruskal-Wallis test:
- Rank the observations $y_{ij}$ in ascending order.
- Replace each observation by its rank, $R_{ij}$.
- In case of ties, assign the average rank to the tied observations.
The test statistic is
$$H = \frac{1}{S^2}\left[\sum_{i=1}^{a}\frac{R_{i.}^2}{n_i} - \frac{N(N+1)^2}{4}\right],$$
where
$$S^2 = \frac{1}{N-1}\left[\sum_{i=1}^{a}\sum_{j=1}^{n_i} R_{ij}^2 - \frac{N(N+1)^2}{4}\right].$$
If $H > \chi^2_{\alpha,\,a-1}$, the null hypothesis is rejected.
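For completeness, a sketch of the Kruskal-Wallis test applied to the same assumed etch-rate observations used earlier; scipy.stats.kruskal implements the tie-corrected form of the statistic defined above.

```python
# A minimal sketch of the Kruskal-Wallis test for the etch-rate example
# (assumed observations, as in the earlier sketches).
from scipy import stats

groups = [
    [575, 542, 530, 539, 570],   # 160 W
    [565, 593, 590, 579, 610],   # 180 W
    [600, 651, 610, 637, 629],   # 200 W
    [725, 700, 715, 685, 710],   # 220 W
]

H, p = stats.kruskal(*groups)
print(f"H = {H:.2f}, p = {p:.4g}")
# Reject H0 if H exceeds the chi-square critical value with a-1 df:
print("chi2 critical (alpha = 0.05, df = 3):", stats.chi2.ppf(0.95, df=3))
```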