Six Sigma Greenbelt Training
ANOVA
Dave Merritt
12/7/16
Learning Objectives
• Sums of Squares in relation to the ANOVA
• Theoretical Development of the ANOVA Table
• Statistical Assumptions about an ANOVA
• Statistical, Graphical and Diagnostic techniques
What is ANOVA?
• One-way analysis of variance (ANOVA) is used to test the null
hypothesis that multiple population means are all equal
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_a$: At least one $\mu_k$ is different
Simply speaking, an ANOVA tests whether any population
means differ from each other. The ANOVA will not tell you
which population means differ.
[Figure: Response (axis ticks 55 to 70) plotted against Factor levels 1 to 4]
ANOVA determines the variation between subgroup means
and the variation within subgroups
Understanding the Fundamentals - Sums of Squares
[Figure: Response plotted against Factor levels 1 to 4, annotated with the quantities below]
$\bar{x}_j$ - mean of group $j$
$\bar{x}$ - grand mean of the experiment
$x_{ij}$ - individual measurement
$$SS(\text{Tot}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x})^2$$
$$SS(\text{Factor}) = n \sum_{j=1}^{k} (\bar{x}_j - \bar{x})^2$$
$$SS(\text{Error}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$$
i = index of a data point within a group (i = 1, ..., n)
j = index of a group (j = 1, ..., k)
k = total # of groups
n = # of individuals in a group
SS(Tot) = Total Sum of Squares of the Experiment (individuals - Grand Mean)
SS(Factor) = Sum of Squares of the Factor (Group Mean - Grand Mean)
SS(Error) = Sum of Squares within the Group (individuals - Group Mean)
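The three sums of squares defined above can be checked numerically. The sketch below is a minimal illustration in plain Python, not part of the original training material: the three groups of three measurements are made-up values chosen for easy arithmetic, and the function name `sums_of_squares` is hypothetical.

```python
def sums_of_squares(groups):
    """Return (ss_total, ss_factor, ss_error) for k equal-sized groups."""
    k = len(groups)
    n = len(groups[0])
    # The formulas in this section assume every group has n individuals.
    assert all(len(g) == n for g in groups), "groups must be the same size"

    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(sum(g) for g in groups) / (k * n)

    # Individuals - Grand Mean
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    # Group Mean - Grand Mean, weighted by the group size n
    ss_factor = n * sum((m - grand_mean) ** 2 for m in group_means)
    # Individuals - Group Mean
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return ss_total, ss_factor, ss_error


# Made-up illustration data: k = 3 groups, n = 3 individuals each.
toy_groups = [[4, 5, 6], [7, 8, 9], [1, 2, 3]]
ss_total, ss_factor, ss_error = sums_of_squares(toy_groups)
print(ss_total, ss_factor, ss_error)  # 60.0 54.0 6.0
```

For this toy data the total (60) splits into a between-group part (54) and a within-group part (6).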
Understanding the Fundamentals - Sums of Squares
SS(Factor): Group Mean – Grand Mean
$$SS(\text{Factor}) = n \sum_{j=1}^{k} (\bar{x}_j - \bar{x})^2$$
Reading the formula:
1) The Sum of
2) (The Average of the Subgroup minus The Grand Average) Squared
3) From Subgroup 1
4) To the kth Subgroup
5) Multiplied by the # of Individuals in the Subgroup
Determines the variation between the subgroup means.
Each subgroup represents a different population or factor level.
[Figure: Response plotted against Factor levels 1 to 4, with the subgroup means compared against one another ("vs")]
Understanding the Fundamentals - Sums of Squares
SS(Error): Individuals – Group Mean
$$SS(\text{Error}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$$
Reading the formula:
1) The Sum of
2) (The Individual value of the Subgroups minus the Average of their Subgroup) Squared
3) From Individual Value 1
4) To the nth Individual Value
5) From Subgroup 1
6) To the kth Subgroup
Determines the variation within the subgroups.
The variation not attributed to the factor.
[Figure: Response plotted against Factor levels 1 to 4]
Understanding the Fundamentals - Sums of Squares
SS(Total): Individuals – Grand Mean
$$SS(\text{Total}) = \sum_{j=1}^{k} \sum_{i=1}^{n} (x_{ij} - \bar{x})^2$$
Equals the Sum of the SS(Factor) and SS(Error)
Represents the Total Variation in the Experiment
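The identity SS(Total) = SS(Factor) + SS(Error) follows from splitting each deviation from the grand mean into a within-group part and a between-group part. A short derivation, assuming n individuals in each of the k groups and using the symbols defined earlier:

```latex
\begin{aligned}
x_{ij} - \bar{x} &= (x_{ij} - \bar{x}_j) + (\bar{x}_j - \bar{x}) \\
\sum_{j=1}^{k}\sum_{i=1}^{n} (x_{ij} - \bar{x})^2
  &= \sum_{j=1}^{k}\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
   + n\sum_{j=1}^{k} (\bar{x}_j - \bar{x})^2
   + 2\sum_{j=1}^{k} (\bar{x}_j - \bar{x}) \sum_{i=1}^{n} (x_{ij} - \bar{x}_j) \\
  &= SS(\text{Error}) + SS(\text{Factor}) + 0
\end{aligned}
```

The cross term vanishes because the deviations from a group's own mean sum to zero within each group.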
Developing the “ANOVA” Table using Sums of Squares
Hypothesis Test
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_a$: At least one $\mu_k$ is different
To determine whether we reject or fail to reject the null hypothesis,
we must calculate the test statistic (F-ratio) using the Analysis of Variance
as shown in the table below.
SOURCE    SS           df          MS (= SS/df)            F (= MS(Factor)/MS(Error))
BETWEEN   SS(Factor)   k - 1       SS(Factor)/(k - 1)      MS(Factor)/MS(Error)
WITHIN    SS(Error)    k(n - 1)    SS(Error)/(k(n - 1))
TOTAL     SS(Total)    kn - 1
F drives the p value (p < .05 is significant)
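The table arithmetic can be sketched in a few lines of Python. This is an illustrative sketch only: `anova_table` is a hypothetical helper, the input sums of squares (54 and 6, for k = 3 groups of n = 3) are made-up numbers, and the p value is not computed because it requires the F distribution with (k - 1, k(n - 1)) degrees of freedom.

```python
def anova_table(ss_factor, ss_error, k, n):
    """Build the one-way ANOVA table for k groups of n individuals each."""
    df_factor = k - 1          # BETWEEN degrees of freedom
    df_error = k * (n - 1)     # WITHIN degrees of freedom
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "BETWEEN": {"SS": ss_factor, "df": df_factor, "MS": ms_factor,
                    "F": ms_factor / ms_error},
        "WITHIN": {"SS": ss_error, "df": df_error, "MS": ms_error},
        "TOTAL": {"SS": ss_factor + ss_error, "df": k * n - 1},
    }


# Made-up inputs for illustration only.
table = anova_table(ss_factor=54.0, ss_error=6.0, k=3, n=3)
for source, row in table.items():
    print(source, row)
```

Here MS(Factor) = 54/2 = 27 and MS(Error) = 6/6 = 1, so the F-ratio is 27.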
We Need to Ensure Certain Statistical Assumptions
Population variances of the output are equal across all levels of the
given factor (homogeneity of variance). We can test this assumption
in Minitab using the following keystrokes: Stat>ANOVA>Test for Equal
Variances.
Response Means are independently and normally distributed. If
randomization and adequate sample sizes are used, this assumption
is usually valid.
Warning: In chemical processes, the risk of dependent Means is high
and randomization should always be considered.
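One simple way to randomize the run order of an experiment is to shuffle the list of experimental units before testing. The sketch below assumes 24 hypothetical unit labels; the function name `random_run_order` and the seed value are illustrative choices, not part of the training material.

```python
import random


def random_run_order(units, seed=None):
    """Return the units in a shuffled order without modifying the input."""
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    order = list(units)
    rng.shuffle(order)
    return order


# Hypothetical labels for 24 experimental units.
units = [f"unit-{i:02d}" for i in range(1, 25)]
run_order = random_run_order(units, seed=2016)
print(run_order[:5])
```

Recording the seed alongside the results lets the same run order be regenerated later.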
Let’s do an Example!
We will use the data supplied below.
Twenty-four golf balls with four dimple patterns.
Dimple pattern is the Input variable; Distance traveled is the output variable.
Golf balls were assigned randomly to Iron Byron who was using the USGA
approved test driver. The golf balls were tested in random order. Why?
Enter the data into Minitab.
Dimple 1: 277, 268, 281, 263
Dimple 2: 281, 299, 317, 286, 290, 295
Dimple 3: 304, 295, 317, 299, 304, 304
Dimple 4: 250, 277, 268, 272, 281, 286, 281, 263
There are several ways to enter the data into
Minitab to perform the analysis. We will enter
the data unstacked. Enter the data in four
separate columns as listed.
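Outside Minitab, the same unstacked layout maps naturally onto one list per dimple pattern. The sketch below computes the one-way ANOVA quantities for the data as listed in this transcript. Because the four groups have different sizes, it uses the standard unequal-sample-size form: each group's squared deviation is weighted by its own size, and the within-group degrees of freedom are N - k. The function name `one_way_anova` is hypothetical, and the p value is left out because it needs the F distribution.

```python
golf = {
    "Dimple 1": [277, 268, 281, 263],
    "Dimple 2": [281, 299, 317, 286, 290, 295],
    "Dimple 3": [304, 295, 317, 299, 304, 304],
    "Dimple 4": [250, 277, 268, 272, 281, 286, 281, 263],
}


def one_way_anova(groups):
    """One-way ANOVA quantities for groups that may differ in size."""
    k = len(groups)
    total_n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / total_n
    means = [sum(g) / len(g) for g in groups]

    # Between groups: each group mean is weighted by its own size.
    ss_factor = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: individuals minus their own group mean.
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_factor, df_error = k - 1, total_n - k
    f_ratio = (ss_factor / df_factor) / (ss_error / df_error)
    return {"ss_factor": ss_factor, "ss_error": ss_error,
            "df_factor": df_factor, "df_error": df_error, "F": f_ratio}


result = one_way_anova(list(golf.values()))
for name, value in result.items():
    print(name, round(value, 3))
```

For the values as listed, the F-ratio comes out at roughly 13.75 on 3 and 20 degrees of freedom.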
Analyzing the Data in Minitab

The Statistical, Graphical & Diagnostic techniques listed below will
be used to analyze our results:
Statistical
• Tests of Equal Variance
• Analysis of Variance Table
Graphical
• Main Effects Plots
• Interval Plots
Test of Equal Variance
Stat>ANOVA>Test for Equal Variances
For the driving data, the p-values for the multiple comparisons test (0.715)
and Levene's test (0.789) are much larger than the significance level of 0.05.
There are no significant differences between the group standard deviations,
and all of the comparison intervals overlap.
[Figure: Test for Equal Variances: Dimple 1, Dimple 2, Dimple 3, Dimple 4. Multiple comparison intervals for the standard deviation, α = 0.05. Multiple Comparisons P-Value 0.715; Levene's Test P-Value 0.789.]
If intervals do not overlap, the corresponding stdevs are significantly different.
Analysis of Variance
Method
Null hypothesis          All variances are equal
Alternative hypothesis   At least one variance is different
Significance level       α = 0.05

95% Bonferroni Confidence Intervals for Standard Deviations
Sample     N   StDev     CI
Dimple 1   4   8.2209    (1.76831, 101.762)
Dimple 2   6   12.6596   (3.10415, 88.450)
Dimple 3   6   4.5683    (2.98666, 685.438)
Dimple 4   8   11.7321   (4.46076, 44.863)
Individual confidence level = 98.75%

Tests
Method                 Test Statistic   P-Value
Multiple comparisons   -                0.715
Levene                 0.35             0.789
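For readers who want to see what a Levene-type statistic measures, the sketch below computes the median-centered form (often called Brown-Forsythe): a one-way ANOVA F-ratio applied to the absolute deviations of each value from its group median. That Minitab uses this exact form is an assumption here, the function name `levene_median` is hypothetical, and the input is the data as listed in this transcript, so the result is not expected to reproduce the session output digit for digit.

```python
from statistics import mean, median


def levene_median(groups):
    """Levene-type statistic using absolute deviations from group medians."""
    # Replace each value by its absolute distance from its group median.
    abs_dev = [[abs(x - median(g)) for x in g] for g in groups]

    k = len(abs_dev)
    total_n = sum(len(z) for z in abs_dev)
    grand = sum(sum(z) for z in abs_dev) / total_n
    z_means = [mean(z) for z in abs_dev]

    # One-way ANOVA F-ratio on the absolute deviations.
    between = sum(len(z) * (m - grand) ** 2 for z, m in zip(abs_dev, z_means))
    within = sum((v - m) ** 2 for z, m in zip(abs_dev, z_means) for v in z)
    return (between / (k - 1)) / (within / (total_n - k))


golf_groups = [
    [277, 268, 281, 263],                      # Dimple 1
    [281, 299, 317, 286, 290, 295],            # Dimple 2
    [304, 295, 317, 299, 304, 304],            # Dimple 3
    [250, 277, 268, 272, 281, 286, 281, 263],  # Dimple 4
]
w_stat = levene_median(golf_groups)
print(round(w_stat, 3))
```

A small statistic means the spread around the median is similar from group to group, which supports the equal-variance assumption.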
Graphical Analysis - Main Effects Plots



To analyze the main effects plot we first need to stack the data. Use the
following keystrokes:
Manip>Stack/Unstack>Stack Columns
Create two columns titled “Golf Ball” and “Distance”.
Use the following keystrokes to prepare the Main Effects Plot:
Stat>ANOVA>Main Effects Plots
What does the main effects
plot tell us?
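Stacking turns the four unstacked columns into one row per measurement, and a main effects plot then shows the mean response at each factor level against the grand mean. A minimal Python sketch of both steps, using the data as listed in this transcript (the variable names are illustrative, and no plot is drawn):

```python
from collections import defaultdict

unstacked = {
    "Dimple 1": [277, 268, 281, 263],
    "Dimple 2": [281, 299, 317, 286, 290, 295],
    "Dimple 3": [304, 295, 317, 299, 304, 304],
    "Dimple 4": [250, 277, 268, 272, 281, 286, 281, 263],
}

# Stack: one (Golf Ball, Distance) row per measurement.
stacked = [(ball, distance)
           for ball, distances in unstacked.items()
           for distance in distances]

# Mean distance per factor level, computed from the stacked rows.
by_level = defaultdict(list)
for ball, distance in stacked:
    by_level[ball].append(distance)
level_means = {ball: sum(v) / len(v) for ball, v in by_level.items()}

# Grand mean: the reference line on a main effects plot.
grand_mean = sum(distance for _, distance in stacked) / len(stacked)

for ball, m in level_means.items():
    print(f"{ball}: {m:.2f}")
print(f"Grand mean: {grand_mean:.2f}")
```

Comparing each level mean with the grand mean shows which dimple patterns sit above or below the overall average distance.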
Summary

We can test the null hypothesis that multiple population means are equal
using an ANOVA. The ANOVA will not tell you which population is different.

The ANOVA table is generated from the various components of the
Sums of Squares

Statistical and graphical techniques must be evaluated to correctly
analyze your data