Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Statistics for Managers Using Microsoft® Excel 5th Edition Chapter 11 Analysis of Variance Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-1 Learning Objectives In this chapter, you learn: The basic concepts of experimental design How to use the one-way analysis of variance to test for differences among the means of several groups How to use the two-way analysis of variance and interpret the interaction Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-2 Chapter Overview Analysis of Variance (ANOVA) One-Way ANOVA F-test Two-Way ANOVA Interaction Effects TukeyKramer test Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-3 General ANOVA Setting Investigator controls one or more factors of interest Each factor contains two or more levels Levels can be numerical or categorical Different levels produce different groups Think of the groups as populations Observe effects on the dependent variable Are the groups the same? Experimental design: the plan used to collect the data Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-4 Completely Randomized Design Experimental units (subjects) are assigned randomly to the different levels (groups) Subjects are assumed homogeneous Only one factor or independent variable With two or more levels (groups) Analyzed by one-factor analysis of variance (one-way ANOVA) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-5 One-Way Analysis of Variance Evaluate the difference among the means of three or more groups Examples: Accident rates for 1st, 2nd, and 3rd shift Expected mileage for five brands of tires Assumptions Populations are normally distributed Populations have equal variances Samples are randomly and independently drawn Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-6 Hypotheses: One-Way ANOVA H0 : μ1 μ2 μ3 μc All population means are equal i.e., no treatment effect (no variation in means among groups) H1 : Not all of the population means are the same At least one population mean is different i.e., there is a treatment (groups) effect Does not mean that all population means are different (at least one of the means is different from the others) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-7 Hypotheses: One-Way ANOVA H0 : μ1 μ2 μ3 μc H1 : Not all μ j are the same All Means are the same: The Null Hypothesis is True (No Group Effect) μ1 μ 2 μ 3 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-8 Hypotheses: One-Way ANOVA At least one mean is different: The Null Hypothesis is NOT true (Treatment Effect is present) H0 : μ1 μ 2 μ 3 μ c H1 : Not all μj are the same or μ1 μ2 μ3 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. μ1 μ2 μ3 Chap 11-9 Partitioning the Variation Total variation can be split into two parts: SST = SSA + SSW SST = Total Sum of Squares (Total variation) SSA = Sum of Squares Among Groups (Among-group variation) SSW = Sum of Squares Within Groups (Within-group variation) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-10 Partitioning the Variation SST = SSA + SSW Total Variation = the aggregate dispersion of the individual data values around the overall (grand) mean of all factor levels (SST) Among-Group Variation = dispersion between the factor sample means (SSA) Within-Group Variation = dispersion that exists among the data values within the particular factor levels (SSW) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-11 Partitioning the Variation Total Variation (SST) = Among Group Variation (SSA) + Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Within Group Variation (SSW) Chap 11-12 The Total Sum of Squares SST = SSA + SSW c nj SST ( X ij X ) Where: 2 j 1 i 1 SST = Total sum of squares c = number of groups nj = number of values in group j Xij = ith value from group j X = grand mean (mean of all data values) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-13 The Total Sum of Squares c nj SST ( X ij X ) 2 j 1 i 1 SST ( X 11 X ) ( X 12 X ) ... ( X nc X ) 2 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. 2 2 Chap 11-14 Among-Group Variation SST = SSA + SSW c SSA n j ( X j X ) 2 Where: j 1 SSA = Sum of squares among groups c = number of groups nj = sample size from group j Xj = sample mean from group j X = grand mean (mean of all data values) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-15 Among-Group Variation SSA n1 (X1 X) 2 n 2 (X 2 X) 2 ... n c (X c X) 2 c SSA n j (X j X) 2 j1 µ1 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. µ2 µc Chap 11-16 Within-Group Variation SST = SSA + SSW c Where: SSW j 1 nj i 1 ( X ij X j ) 2 SSW = Sum of squares within groups c = number of groups nj = sample size from group j Xj = sample mean from group j Xij = ith value in group j Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-17 Within-Group Variation c SSW j 1 nj i 1 ( X ij X j ) 2 SSW ( X11 X 1 )2 ( X 21 X 1 )2 ... ( X nc X c )2 j Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-18 Obtaining the Mean Squares SSA MSA c 1 Mean Squares Among SSW MSW nc Mean Squares Within SST MST n 1 Mean Squares Within Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-19 One-Way ANOVA Table Source of Variation df SS MS (Variance) Among Groups c-1 SSA MSA Within Groups n-c SSW MSW Total n-1 SST = SSA+SSW F-Ratio F MSA MSW c = number of groups n = sum of the sample sizes from all groups df = degrees of freedom Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-20 One-Way ANOVA Test Statistic H0: μ1= μ2 = … = μc H1: At least two population means are different Test statistic MSA F MSW MSA is mean squares among variances MSW is mean squares within variances Degrees of freedom df1 = c – 1 df2 = n – c (c = number of groups) (n = sum of all sample sizes) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-21 One-Way ANOVA Test Statistic The F statistic is the ratio of the among variance to the within variance The ratio must always be positive df1 = c -1 will typically be small df2 = n - c will typically be large Decision Rule: Reject H0 if F > FU, otherwise do not reject H0 = .05 0 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Do not reject H0 Reject H0 FU Chap 11-22 One-Way ANOVA Example You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance? Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Club 1 254 263 241 237 251 Club 2 234 218 235 227 216 Club 3 200 222 197 206 204 Chap 11-23 One-Way ANOVA Example Club 1 254 263 241 237 251 Club 2 234 218 235 227 216 Club 3 200 222 197 206 204 Distance 270 260 250 240 230 220 x1 249.2 x 2 226.0 x 3 205.8 x 227.0 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. • •• X 1 • • •• • •• X2 210 •• •• 200 190 • 1 2 Club X X3 3 Chap 11-24 One-Way ANOVA Example X1 = 249.2 n1 = 5 X2 = 226.0 n2 = 5 X3 = 205.8 n3 = 5 X = 227.0 n = 15 c=3 SSA = 5 (249.2 – 227)2 + 5 (226 – 227)2 + 5 (205.8 – 227)2 = 4716.4 SSW = (254 – 249.2)2 + (263 – 249.2)2 +…+ (204 – 205.8)2 = 1119.6 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-25 One-Way ANOVA Example 2358.2 F 25.275 93.3 MSA = 4716.4 / (3-1) = 2358.2 MSW = 1119.6 / (15-3) = 93.3 Critical Value: FU = 3.89 = .05 0 Do not reject H0 Reject H0 FU = 3.89 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. F = 25.275 Chap 11-26 One-Way ANOVA Example H0: μ1 = μ2 = μ3 H1: μj not all equal = .05 df1= 2 Decision: Reject H0 at α = 0.05 df2 = 12 Conclusion: There is evidence that at least one μj differs from the rest Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-27 One-Way ANOVA in Excel EXCEL: Select Tools ____ Data Analysis _____ ANOVA: single factor SUMMARY Groups Count Sum Average Variance Club 1 5 1246 249.2 108.2 Club 2 5 1130 226 77.5 Club 3 5 1029 205.8 94.2 ANOVA Source of Variation SS df MS Between Groups 4716.4 2 2358.2 Within Groups 1119.6 12 93.3 Total 5836.0 14 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. F 25.275 P-value 4.99E-05 F crit 3.89 Chap 11-28 The Tukey-Kramer Procedure Tells which population means are significantly different e.g.: μ1 = μ2 ≠ μ3 Done after rejection of equal means in ANOVA Allows pair-wise comparisons Compare absolute mean differences with critical range μ1= μ2 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. μ3 x Chap 11-29 Tukey-Kramer Critical Range Critical Range QU MSW 1 1 2 n j n j' where: QU = Value from Studentized Range Distribution with c and n - c degrees of freedom for the desired level of (see appendix E.9 table) MSW = Mean Square Within nj and nj’ = Sample sizes from groups j and j’ Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-30 The Tukey-Kramer Procedure Club 1 254 263 241 237 251 Club 2 234 218 235 227 216 Club 3 200 222 197 206 204 1. Compute absolute mean differences: x1 x 2 249.2 226.0 23.2 x1 x 3 249.2 205.8 43.4 x 2 x 3 226.0 205.8 20.2 2. Find the QU value from the table in appendix E.9 with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom for the desired level of ( = .05 used here): QU 3.77 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-31 The Tukey-Kramer Procedure 3. Compute Critical Range: Critical Range Q U 4. Compare: MSW 1 1 93.3 1 1 3.77 16.285 2 n j n j' 2 5 5 x1 x 2 23.2 x1 x 3 43.4 x 2 x 3 20.2 5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance. Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-32 ANOVA Assumptions Randomness and Independence Select random samples from the c groups (or randomly assign the levels) Normality The sample values from each group are from a normal population Homogeneity of Variance Can be tested with Levene’s Test Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-33 ANOVA Assumptions Levene’s Test Tests the assumption that the variances of each group are equal. First, define the null and alternative hypotheses: H0: σ21 = σ22 = …=σ2c H1: Not all σ2j are equal Second, compute the absolute value of the difference between each value and the median of each group. Third, perform a one-way ANOVA on these absolute differences. Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-34 Two-Way ANOVA Examines the effect of Two factors of interest on the dependent variable e.g., Percent carbonation and line speed on soft drink bottling process Interaction between the different levels of these two factors e.g., Does the effect of one particular carbonation level depend on which level the line speed is set? Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-35 Two-Way ANOVA Assumptions Populations are normally distributed Populations have equal variances Independent random samples are selected Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-36 Two-Way ANOVA Sources of Variation Two Factors of interest: A and B r = number of levels of factor A c = number of levels of factor B n/ = number of replications for each cell n = total number of observations in all cells (n = rcn/) Xijk = value of the kth observation of level i of factor A and level j of factor B Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-37 Two-Way ANOVA Sources of Variation SST = SSA + SSB + SSAB + SSE SSA Factor A Variation SST Total Variation SSB Factor B Variation Degrees of Freedom: r–1 c–1 SSAB n-1 Variation due to interaction between A and B SSE Random variation (Error) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. (r – 1)(c – 1) rc(n/ – 1) Chap 11-38 Two-Way ANOVA Equations r Total Variation: n c SST ( Xijk X) 2 i1 j1 k 1 r Factor A Variation: 2 SSA cn ( Xi.. X) i1 c Factor B Variation: 2 SSB rn ( X. j. X) j1 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-39 Two-Way ANOVA Equations Interaction Variation: r c SSAB n ( X ij. X i.. X . j. X ) 2 i 1 j 1 Sum of Squares Error: r c n SSE ( Xijk Xij. ) 2 i1 j1 k 1 Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-40 Two-Way ANOVA Equations r X n c X i1 j1 k 1 rcn c Xi.. ijk Grand Mean n X j1 k 1 cn ijk Mean of ith level of factor A (i 1, 2, ..., r) r = number of levels of factor A c = number of levels of factor B n/ = number of replications in each cell Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-41 Two-Way ANOVA Equations r X.j. n X i 1 k 1 rn ijk Mean of jth level of factor B (j 1, 2, ..., c) n Xijk Xij. Mean of cell ij k 1 n r = number of levels of factor A c = number of levels of factor B n/ = number of replications in each cell Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-42 Two-Way ANOVA Equations SSA MSA Mean square factor A r 1 SSB MSB Mean square factor B c 1 SSAB MSAB Mean square interactio n (r 1)(c 1) SSE MSE Mean square error rc(n'1) Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-43 Two-Way ANOVA: The F Test Statistic F Test for Factor A Effect H0: μ1.. = μ2.. = • • • = μr.. MSA F MSE H1: Not all μi.. are equal H0: μ.1. = μ.2. = • • • = μ.c. F Test for Factor B Effect MSB F MSE H1: Not all μ.j. are equal H0: the interaction of A and B is equal to zero Reject H0 if F > FU Reject H0 if F > FU F Test for Interaction Effect H1: interaction of A and B isn’t zero Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. MSAB F MSE Reject H0 if F > FU Chap 11-44 Two-Way ANOVA: Summary Table Source of Variation Degrees of Freedom Sum of Squares Factor A r–1 SSA Factor B c–1 SSB AB (r – 1)(c – 1) (Interaction) SSAB Error rc(n’ – 1) SSE Total n–1 SST Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Mean Squares F Statistic MSA MSA/ MSE = SSA /(r – 1) MSB = SSB /(c – 1) MSAB = SSAB / (r – 1)(c – 1) MSB/ MSE MSAB/ MSE MSE = SSE/rc(n’ – 1) Chap 11-45 Two-Way ANOVA: Features Degrees of freedom always add up n-1 = rc(n/-1) + (r-1) + (c-1) + (r-1)(c-1) Total = error + factor A + factor B + interaction The denominator of the F Test is always the same but the numerator is different The sums of squares always add up SST = SSE + SSA + SSB + SSAB Total = error + factor A + factor B + interaction Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-46 Two-Way ANOVA: Interaction No Significant Interaction: Interaction is present: Factor B Level 3 Factor B Level 2 Mean Response Mean Response Factor B Level 1 Factor A Levels Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Factor B Level 1 Factor B Level 2 Factor B Level 3 Factor A Levels Chap 11-47 Chapter Summary In this chapter, we have Described one-way analysis of variance The logic of ANOVA ANOVA assumptions F test for difference in c means The Tukey-Kramer procedure for multiple comparisons Described two-way analysis of variance Examined effects of multiple factors Examined interaction between factors Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-48