Download Lecture 3

What If There Are More Than Two Factor Levels?
• The t-test does not directly apply.
• Many practical situations involve either more than two levels of interest or several factors of simultaneous interest.
• The analysis of variance (ANOVA) is the appropriate analysis "engine" for these types of experiments (Chapter 3, textbook).
• The ANOVA was developed by Fisher in the early 1920s and initially applied to agricultural experiments.
• It is used extensively today for industrial experiments.

An Example (see pg. 61)
• Consider an investigation into the relationship between the RF (radio-frequency) power setting and the etch rate for a wafer etching tool.
• The response variable is etch rate.
• The experimenter wants to determine the power setting that will give a desired target etch rate.
• Other variables are held fixed (gas, gap, etc.).
• RF power levels: 160, 180, 200, and 220 W.
• The experiment is replicated 5 times, with the runs made in random order.

An Example (cont'd)
The run order should be randomized. First, assign a number to each experimental run, e.g., as follows:

RF Power (W)    Experimental Run Number
160              1    2    3    4    5
180              6    7    8    9   10
200             11   12   13   14   15
220             16   17   18   19   20

An Example (cont'd)
Choose a random number between 1 and 20 and make that run the first test; then draw again for the next test, and continue until all 20 runs have been assigned a test number. E.g.:

Test Sequence   Run Number   Power Level (W)
1               14           200
2               17           220
3               19           220
4                1           160
5                4           160
…               …            …
20              15           200

What if the tests were run in the original, nonrandomized order?

An Example (cont'd)
[Slide figure not recovered.]

The Analysis of Variance (Sec. 3-2, pg. 63)
• In general, there will be $a$ levels of the factor (or $a$ treatments) and $n$ replicates of the experiment, run in random order: a completely randomized design (CRD).
• $N = an$ total runs.
• We consider the fixed effects case; the random effects case will be discussed later.
• The objective is to test hypotheses about the equality of the $a$ treatment means.

The Analysis of Variance
• The name "analysis of variance" stems from a partitioning of the total variability in the response variable into components that are consistent with a model for the experiment.
• The basic single-factor ANOVA model is
$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$$
where $\mu$ is an overall mean, $\tau_i$ is the $i$th treatment effect, and $\varepsilon_{ij}$ is the experimental error, NID$(0, \sigma^2)$.

Models for the Data
There are several ways to write a model for the data:
• $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ is called the effects model.
• Let $\mu_i = \mu + \tau_i$; then $y_{ij} = \mu_i + \varepsilon_{ij}$ is called the means model.
• Regression models can also be employed.

The Analysis of Variance
• Total variability is measured by the total sum of squares:
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$
• The basic ANOVA partitioning is:
$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} \left[(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})\right]^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$
$$SS_T = SS_{Treatments} + SS_E$$

The Analysis of Variance
$$SS_T = SS_{Treatments} + SS_E$$
• A large value of $SS_{Treatments}$ reflects large differences in treatment means.
• A small value of $SS_{Treatments}$ likely indicates no differences in treatment means.
• The formal statistical hypotheses are:
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$$
$$H_1: \text{at least one mean is different}$$

The Analysis of Variance
• While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared.
• A mean square is a sum of squares divided by its degrees of freedom:
$$df_{Total} = df_{Treatments} + df_{Error}$$
$$an - 1 = (a - 1) + a(n - 1)$$
$$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}, \qquad MS_E = \frac{SS_E}{a(n - 1)}$$
• If the treatment means are equal, both mean squares estimate $\sigma^2$, so the treatment and error mean squares will be (theoretically) equal.
• If the treatment means differ, the treatment mean square will tend to be larger than the error mean square.

The Analysis of Variance is Summarized in a Table
• For the computations, see text, pp. 70–74.
• The test statistic is $F_0 = MS_{Treatments} / MS_E$; its reference distribution is the $F_{a-1,\, a(n-1)}$ distribution.
• Reject the null hypothesis (equal treatment means) if $F_0 > F_{\alpha,\, a-1,\, a(n-1)}$: an upper-tail, one-tailed critical region.

Calculation of Sum of Squares
With $y_{i.}$ denoting the total of the observations under treatment $i$ and $y_{..}$ the grand total:
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Treatments} = \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N}$$
The error sum of squares is obtained by subtraction: $SS_E = SS_T - SS_{Treatments}$.
Usually these calculations are done using a computer program.

Confidence Intervals in ANOVA
• Can be established based on the t-distribution.
• A $100(1 - \alpha)\%$ confidence interval on the $i$th treatment mean $\mu_i$ is:
$$\bar{y}_{i.} - t_{\alpha/2,\, N-a}\sqrt{\frac{MS_E}{n}} \le \mu_i \le \bar{y}_{i.} + t_{\alpha/2,\, N-a}\sqrt{\frac{MS_E}{n}}$$
• A $100(1 - \alpha)\%$ confidence interval on the difference in any two treatment means is:
$$\bar{y}_{i.} - \bar{y}_{j.} - t_{\alpha/2,\, N-a}\sqrt{\frac{2MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + t_{\alpha/2,\, N-a}\sqrt{\frac{2MS_E}{n}}$$

Simultaneous Confidence Intervals
• Can be established based on one-at-a-time confidence intervals.
• If there are $r$ $100(1 - \alpha)\%$ confidence intervals of interest, the probability that the $r$ intervals will simultaneously be correct is at least $1 - r\alpha$. E.g.:
  – $r = 5$, $\alpha = 0.05$: $1 - r\alpha = 0.75$
  – $r = 10$, $\alpha = 0.05$: $1 - r\alpha = 0.50$
• Bonferroni method: replacing $\alpha$ in the equations with $\alpha / r$ gives simultaneous confidence intervals on treatment means/differences with a confidence level of at least $100(1 - \alpha)\%$.
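The Bonferroni arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative and do not come from the lecture or the textbook.

```python
def simultaneous_lower_bound(r, alpha):
    """Lower bound 1 - r*alpha on the probability that r intervals,
    each built at level 100(1 - alpha)%, are all correct at once.
    Clamped at 0 because a probability cannot be negative."""
    return max(0.0, 1.0 - r * alpha)


def bonferroni_alpha(r, alpha):
    """Per-interval alpha to use so that the r intervals jointly
    have confidence of at least 100(1 - alpha)%."""
    return alpha / r


if __name__ == "__main__":
    for r in (5, 10):
        unadjusted = simultaneous_lower_bound(r, 0.05)
        per_interval = bonferroni_alpha(r, 0.05)
        adjusted = simultaneous_lower_bound(r, per_interval)
        print(f"r={r}: unadjusted bound {unadjusted:.2f}, "
              f"per-interval alpha {per_interval:.4f}, "
              f"adjusted bound {adjusted:.2f}")
```

With the adjusted level, each interval uses $t_{\alpha/(2r),\, N-a}$ in place of $t_{\alpha/2,\, N-a}$, so the individual intervals get wider as $r$ grows.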
ANOVA Computer Output (Design-Expert)
Response: etch rate
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source       Sum of Squares   DF   Mean Square   F Value   Prob > F
Model        66870.55          3   22290.18      66.80     < 0.0001
A            66870.55          3   22290.18      66.80     < 0.0001
Pure Error    5339.20         16     333.70
Cor Total    72209.75         19

Std. Dev.    18.27       R-Squared         0.9261
Mean         617.75      Adj R-Squared     0.9122
C.V.         2.96        Pred R-Squared    0.8845
PRESS        8342.50     Adeq Precision    19.071
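Most entries in the output above follow from the sums of squares and degrees of freedom alone. The sketch below recomputes them in Python from the printed values for the etch rate experiment ($a = 4$ power levels, $n = 5$ replicates). The function name and dictionary keys are illustrative. The formulas for adjusted and predicted R-squared are the usual textbook definitions, assumed here to be what Design-Expert reports.

```python
import math


def anova_summary(ss_treatments, ss_total, a, n, grand_mean, press):
    """Recompute single-factor ANOVA table entries from sums of squares."""
    N = a * n
    ss_error = ss_total - ss_treatments      # SS_E = SS_T - SS_Treatments
    df_treatments = a - 1
    df_error = a * (n - 1)
    ms_treatments = ss_treatments / df_treatments
    ms_error = ss_error / df_error
    std_dev = math.sqrt(ms_error)            # estimate of sigma
    return {
        "ss_error": ss_error,
        "df_treatments": df_treatments,
        "df_error": df_error,
        "ms_treatments": ms_treatments,
        "ms_error": ms_error,
        "F0": ms_treatments / ms_error,
        "std_dev": std_dev,
        "cv_percent": 100.0 * std_dev / grand_mean,
        "r_squared": ss_treatments / ss_total,
        # Assumed definitions: 1 - MS_E / (SS_T / (N - 1)) and 1 - PRESS / SS_T
        "adj_r_squared": 1.0 - ms_error / (ss_total / (N - 1)),
        "pred_r_squared": 1.0 - press / ss_total,
    }


# Inputs copied from the printed output
summary = anova_summary(
    ss_treatments=66870.55, ss_total=72209.75,
    a=4, n=5, grand_mean=617.75, press=8342.50,
)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name:>15}: {value:.4f}")
```

The p-value in the "Prob > F" column is the upper-tail area of the $F_{3,16}$ distribution beyond $F_0$. It needs an F-distribution routine (for instance `scipy.stats.f.sf`, if SciPy is available) and is left out here to keep the sketch dependency-free.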
