Six Sigma Greenbelt Training
ANOVA
Dave Merritt
12/7/16

Learning Objectives
• Sums of Squares in relation to the ANOVA
• Theoretical Development of the ANOVA Table
• Statistical Assumptions about an ANOVA
• Statistical, Graphical and Diagnostic techniques

What is ANOVA?
• One-way analysis of variance (ANOVA) is used to test the null hypothesis that multiple population means are all equal.

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$$
$$H_a: \text{at least one } \mu_k \text{ is different}$$

Simply speaking, an ANOVA tests whether any population means differ from each other. The ANOVA will not tell you which population means differ.

What is ANOVA?
[Figure: Response (55 to 70) plotted against Factor levels 1 to 4]
ANOVA determines the variation between subgroup means and the variation within subgroups.

Understanding the Fundamentals - Sums of Squares
[Figure: Response (55 to 70) plotted against Factor levels 1 to 4, annotated with the three quantities below]
• $\bar{x}_j$ = mean of group j
• $\bar{\bar{x}}$ = grand mean of the experiment
• $x_{ij}$ = individual measurement

$$SS(\text{Tot}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{\bar{x}})^2$$
$$SS(\text{Factor}) = n \sum_{j=1}^{k} (\bar{x}_j - \bar{\bar{x}})^2$$
$$SS(\text{Error}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$$

i = index of a data point within a group (i = 1, ..., n)
j = index of a group (j = 1, ..., k)
k = total # of groups
n = # of individuals in a group

SS(Tot) = Total Sum of Squares of the Experiment (individuals minus Grand Mean)
SS(Factor) = Sum of Squares of the Factor (Group Mean minus Grand Mean)
SS(Error) = Sum of Squares within the Group (individuals minus Group Mean)

Understanding the Fundamentals - Sums of Squares
SS(Factor): Group Mean minus Grand Mean

$$SS(\text{Factor}) = n \sum_{j=1}^{k} (\bar{x}_j - \bar{\bar{x}})^2$$

Reading the formula:
1) The sum of
2) (the average of the subgroup minus the grand average) squared
3) from subgroup 1
4) to the kth subgroup
5) multiplied by the # of individuals in the subgroup

Determines the variation between the subgroup means.
Each subgroup represents a different population or level of the factor.
[Figure: Response (55 to 70) plotted against Factor levels 1 to 4, with the subgroup means compared against one another]

Understanding the Fundamentals - Sums of Squares
SS(Error): Individuals minus Group Mean

$$SS(\text{Error}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$$

Reading the formula:
1) The sum of
2) (the individual value minus the average of its subgroup) squared
3) from individual value 1
4) to the nth individual value
5) from subgroup 1
6) to the kth subgroup

Determines the variation within the subgroups: the variation not attributed to the factor.
[Figure: Response (55 to 70) plotted against Factor levels 1 to 4]

Understanding the Fundamentals - Sums of Squares
SS(Total): Individuals minus Grand Mean

$$SS(\text{Total}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{\bar{x}})^2$$

• Equals the sum of SS(Factor) and SS(Error)
• Represents the total variation in the experiment

Developing the "ANOVA" Table using Sums of Squares
Hypothesis Test

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$$
$$H_a: \text{at least one } \mu_k \text{ is different}$$

To determine whether we reject or fail to reject the null hypothesis, we must calculate the test statistic (F-ratio) using the Analysis of Variance as shown in the table below.

SOURCE    SS          df        MS (= SS/df)              F
BETWEEN   SS(Factor)  k - 1     SS(Factor) / (k - 1)      MS(Factor) / MS(Error)
WITHIN    SS(Error)   k(n - 1)  SS(Error) / (k(n - 1))
TOTAL     SS(Total)   kn - 1

F drives the p-value (p < .05 is significant).

We Need to Ensure Certain Statistical Assumptions
• Population variances of the output are equal across all levels of the given factor (homogeneity of variance). We can test this assumption in Minitab using the following keystrokes: Stat>ANOVA>Test for Equal Variances.
• Response means are independently and normally distributed. If randomization and adequate sample sizes are used, this assumption is usually valid.
• Warning: In chemical processes, the risk of dependent means is high and randomization should always be considered.

Let's do an Example!
We will use the data supplied below: twenty-four golf balls with four dimple patterns. Dimple pattern is the input variable; distance traveled is the output variable.
Golf balls were assigned randomly to Iron Byron, who was using the USGA-approved test driver. The golf balls were tested in random order. Why?

Enter the data into Minitab.

Dimple 1: 277 268 281 263
Dimple 2: 281 299 317 286 290 295
Dimple 3: 304 295 317 299 304 304
Dimple 4: 250 277 268 272 281 286 281 263

There are several ways to enter the data into Minitab to perform the analysis. We will enter the data unstacked: enter the data in four separate columns as listed.

Analyzing the Data in Minitab
The statistical, graphical and diagnostic techniques listed below will be used to analyze our results:
• Tests of Equal Variance
• Statistical: Analysis of Variance Table
• Graphical: Main Effects Plots, Interval Plots

Test of Equal Variance
Stat>ANOVA>Test for Equal Variances

For the driving data, the p-value for the multiple comparisons test is much larger than the significance level of 0.05. There are no significant differences between the group standard deviations, and all of the comparison intervals overlap.

[Figure: Test for Equal Variances: Dimple 1, Dimple 2, Dimple 3, Dimple 4. Multiple comparison intervals for the standard deviation, α = 0.05. Multiple Comparisons P-Value 0.715; Levene's Test P-Value 0.789. If intervals do not overlap, the corresponding stdevs are significantly different.]
Analysis of Variance

Method
Null hypothesis         All variances are equal
Alternative hypothesis  At least one variance is different
Significance level      α = 0.05

95% Bonferroni Confidence Intervals for Standard Deviations
Sample    N  StDev    CI
Dimple 1  4   8.2209  (1.76831, 101.762)
Dimple 2  6  12.6596  (3.10415, 88.450)
Dimple 3  6   4.5683  (2.98666, 685.438)
Dimple 4  8  11.7321  (4.46076, 44.863)
Individual confidence level = 98.75%

Tests
Method                Test Statistic  P-Value
Multiple comparisons  -               0.715
Levene                0.35            0.789

Graphical Analysis - Main Effects Plots
To analyze the main effects plot we will need to stack the data. Use the following keystrokes:
Manip>Stack/Unstack>Stack Columns
Create two columns titled "Golf Ball" and "Distance".
Use the following keystrokes to prepare the Main Effects Plots:
Stat>ANOVA>Main Effects Plots
What does the main effects plot tell us?

Summary
• We can test the null hypothesis on multiple populations using an ANOVA.
• The ANOVA will not tell you which population is different.
• The ANOVA table is generated from the various components of the Sums of Squares.
• Statistical and graphical techniques must be evaluated to correctly analyze your data.
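Worked check of the Sums of Squares
As a cross-check outside Minitab, the ANOVA table entries can be computed by hand from the sums of squares. The sketch below is illustrative only and is not part of the original training material. It uses the distances exactly as listed in the golf ball example, and the names `groups` and `one_way_anova` are made up for this illustration. Because the four dimple groups have different sizes, each group mean is weighted by its own group size and the error degrees of freedom are taken as N - k (total observations minus number of groups), which reduces to k(n - 1) when every group has n observations.

```python
from statistics import mean

# Distances as listed in the golf ball example, one list per dimple pattern.
groups = {
    "Dimple 1": [277, 268, 281, 263],
    "Dimple 2": [281, 299, 317, 286, 290, 295],
    "Dimple 3": [304, 295, 317, 299, 304, 304],
    "Dimple 4": [250, 277, 268, 272, 281, 286, 281, 263],
}


def one_way_anova(samples):
    """Build the one-way ANOVA table entries from a dict of samples."""
    all_values = [x for xs in samples.values() for x in xs]
    grand_mean = mean(all_values)

    # Between groups: squared distance of each group mean from the grand
    # mean, weighted by that group's size (group sizes differ here).
    ss_factor = sum(
        len(xs) * (mean(xs) - grand_mean) ** 2 for xs in samples.values()
    )
    # Within groups: squared distance of each value from its own group mean.
    ss_error = sum(
        (x - mean(xs)) ** 2 for xs in samples.values() for x in xs
    )
    # Total: squared distance of each value from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    df_factor = len(samples) - 1               # k - 1
    df_error = len(all_values) - len(samples)  # N - k
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error

    return {
        "SS(Factor)": ss_factor,
        "SS(Error)": ss_error,
        "SS(Total)": ss_total,
        "df(Factor)": df_factor,
        "df(Error)": df_error,
        "MS(Factor)": ms_factor,
        "MS(Error)": ms_error,
        "F": ms_factor / ms_error,
    }


table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name:>10}: {value:.3f}")
```

For the listed distances the F-ratio comes out at about 13.75, and SS(Total) equals SS(Factor) plus SS(Error) as the Sums of Squares slides state. Turning F into a p-value requires the F distribution, which the Python standard library does not provide, so that step is left to Minitab or a statistics package.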