Chapter Twelve
Analysis of Variance
McGraw-Hill/Irwin
© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.
GOALS
When you have completed this chapter, you will be able to:
ONE: List the characteristics of the F distribution.
TWO: Conduct a test of hypothesis to determine whether the variances of two populations are equal.
THREE: Discuss the general idea of analysis of variance.
FOUR: Organize data into a one-way and a two-way ANOVA table.
FIVE: Define and understand the terms treatments and blocks.
SIX: Conduct a test of hypothesis among three or more treatment means.
SEVEN: Develop confidence intervals for the difference between treatment means.
EIGHT: Conduct a test of hypothesis to determine if there is a difference among block means.
Characteristics of the F Distribution
There is a "family" of F distributions. Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.
F cannot be negative, and it is a continuous distribution.
The F distribution is positively skewed.
Its values range from 0 to $\infty$. As $F \to \infty$ the curve approaches the X-axis but never touches it.
Test for Equal Variances of Two Populations
For the two-tailed test, the test statistic is given by
$F = \dfrac{s_1^2}{s_2^2}$
$s_1^2$ and $s_2^2$ are the sample variances for the two samples. The larger sample variance is placed in the numerator.
The degrees of freedom are $n_1 - 1$ for the numerator and $n_2 - 1$ for the denominator.
The null hypothesis is rejected if the computed value of the test statistic is greater than the critical value.
Example 1
Colin, a stockbroker at Critical Securities, reported that the mean rate of return on a sample of 10 internet stocks was 12.6 percent with a standard deviation of 3.9 percent.
The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent. At the .05 significance level, can Colin conclude that there is more variation in the internet stocks?
Step 1: The hypotheses are
$H_0: \sigma_I^2 \le \sigma_U^2$
$H_1: \sigma_I^2 > \sigma_U^2$
Step 2: The significance level is .05.
Step 3: The test statistic follows the F distribution.
Step 4: H0 is rejected if F > 3.68 or if p < .05. The degrees of freedom are $n_1 - 1 = 9$ in the numerator and $n_2 - 1 = 7$ in the denominator.
Step 5: The value of F is computed as follows.
$F = \dfrac{(3.9)^2}{(3.5)^2} = 1.2416$
P(F > 1.2416) is .3965.
H0 is not rejected. There is insufficient evidence to show more variation in the internet stocks.
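The arithmetic in Example 1 can be checked with a few lines of Python. This is only a sketch: the function name `variance_ratio` and the variable names are illustrative, and the critical value 3.68 is copied from the slide rather than computed.

```python
def variance_ratio(s_numerator, s_denominator):
    """Return F = s_num^2 / s_den^2 from two sample standard deviations."""
    return s_numerator ** 2 / s_denominator ** 2


# Sample figures from Example 1 (standard deviations in percent).
s_internet, n_internet = 3.9, 10
s_utility, n_utility = 3.5, 8

f_stat = variance_ratio(s_internet, s_utility)
df_num, df_den = n_internet - 1, n_utility - 1

f_critical = 3.68  # critical F for 9 and 7 df at .05, as quoted on the slide
reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.4f} with ({df_num}, {df_den}) df; reject H0: {reject_h0}")
```

The larger standard deviation is passed first so that it lands in the numerator, which keeps the ratio at or above 1.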
The ANOVA Test of Means
The F distribution is also used for testing whether two or more sample means came from the same or equal populations.
This technique is called analysis of variance or ANOVA.
The null and alternate hypotheses for four sample means are given as:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: Not all of the means are equal.
Underlying Assumptions for ANOVA
ANOVA requires the following conditions:
The sampled populations follow the normal distribution.
The populations have equal standard deviations.
The samples are independent.
$F = \dfrac{\text{Estimate of the population variance based on the differences among the sample means}}{\text{Estimate of the population variance based on the variation within the samples}}$
Degrees of freedom for the F statistic in ANOVA:
If there are k populations being sampled, the numerator degrees of freedom is k - 1.
If there are a total of n observations, the denominator degrees of freedom is n - k.
ANOVA divides the Total Variation into the variation due to the treatment, Treatment Variation, and the variation due to the error component, Random Variation.
In the following table,
i stands for the ith observation,
$\bar{X}_G$ is the overall or grand mean, and
k is the number of treatment groups.
ANOVA Table
Source of Variation   Sum of Squares                                   Degrees of Freedom   Mean Square         F
Treatments (k)        SST = $\sum_k n_k (\bar{X}_k - \bar{X}_G)^2$     k - 1                MST = SST/(k - 1)   MST/MSE
Error                 SSE = $\sum_i \sum_k (X_{i,k} - \bar{X}_k)^2$    n - k                MSE = SSE/(n - k)
Total                 TSS = $\sum_i (X_i - \bar{X}_G)^2$               n - 1
SST measures the treatment variation, SSE the random variation, and TSS the total variation.
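The three sums of squares are linked by the standard partition of the total variation. It is restated here in the table's notation as a reminder, not quoted from the slides:

```latex
\[
\underbrace{\sum_{k}\sum_{i}\bigl(X_{i,k}-\bar{X}_G\bigr)^2}_{TSS}
\;=\;
\underbrace{\sum_{k} n_k\bigl(\bar{X}_k-\bar{X}_G\bigr)^2}_{SST}
\;+\;
\underbrace{\sum_{k}\sum_{i}\bigl(X_{i,k}-\bar{X}_k\bigr)^2}_{SSE}
\]
```

Because of this identity, any one of the three quantities can be obtained from the other two.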
Example 2
Rosenbaum Restaurants specialize in meals for families. Katy Polsby, President, recently developed a new meat loaf dinner. Before making it a part of the regular menu she decides to test it in several of her restaurants.
She would like to know if there is a difference in the mean number of dinners sold per day at the Aynor, Loris, and Lander restaurants. Use the .05 significance level.
Number of Dinners Sold by Restaurant
Day      Aynor   Loris   Lander
Day 1    13      10      18
Day 2    12      12      16
Day 3    14      13      17
Day 4    12      11      17
Day 5                    17
Step One: State the null hypothesis and the alternate hypothesis.
$H_0: \mu_{Aynor} = \mu_{Loris} = \mu_{Lander}$
$H_1$: Not all of the means are equal.
Step Two: Select the level of significance. This is given in the problem statement as .05.
Step Three: Determine the test statistic. The test statistic follows the F distribution.
Step Four: Formulate the decision rule.
The numerator degrees of freedom, k - 1, equal 3 - 1 or 2. The denominator degrees of freedom, n - k, equal 13 - 3 or 10. The value of F at 2 and 10 degrees of freedom is 4.10. Thus, H0 is rejected if F > 4.10 or p < $\alpha$ = .05.
Step Five: Select the sample, perform the calculations, and make a decision.
Using the data provided, the ANOVA calculations follow.
Computation of SSE
$SSE = \sum_i \sum_k (X_{i,k} - \bar{X}_k)^2$
# sold   SS(Aynor)       # sold   SS(Loris)      # sold   SS(Lander)
13       (13-12.75)^2    10       (10-11.5)^2    18       (18-17)^2
12       (12-12.75)^2    12       (12-11.5)^2    16       (16-17)^2
14       (14-12.75)^2    13       (13-11.5)^2    17       (17-17)^2
12       (12-12.75)^2    11       (11-11.5)^2    17       (17-17)^2
                                                 17       (17-17)^2
Sum      2.75                     5                       2
$\bar{X}_k$   12.75                    11.5                    17
SSE: 2.75 + 5 + 2 = 9.75
$\bar{X}_G$: 14.00
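The within-restaurant sums of squares can be reproduced directly from the sales data. A minimal sketch, assuming the data are held in a dictionary named `dinners` (a name chosen here for illustration):

```python
from statistics import mean

# Dinners sold per day, from the Example 2 table.
dinners = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}


def within_group_ss(values):
    """Sum of squared deviations of each value from its own group mean."""
    group_mean = mean(values)
    return sum((x - group_mean) ** 2 for x in values)


ss_by_group = {name: within_group_ss(v) for name, v in dinners.items()}
sse = sum(ss_by_group.values())

print(ss_by_group)
print("SSE =", sse)
```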
Computation of TSS
$TSS = \sum_i (X_i - \bar{X}_G)^2$
# sold   TSS(Aynor)   # sold   TSS(Loris)   # sold   TSS(Lander)
13       (13-14)^2    10       (10-14)^2    18       (18-14)^2
12       (12-14)^2    12       (12-14)^2    16       (16-14)^2
14       (14-14)^2    13       (13-14)^2    17       (17-14)^2
12       (12-14)^2    11       (11-14)^2    17       (17-14)^2
                                            17       (17-14)^2
Sum      9.00                  30                    47
TSS: 9.00 + 30 + 47 = 86.00
SSE: 9.75
$\bar{X}_G$: 14.00
Computation of SST
$SST = \sum_k n_k (\bar{X}_k - \bar{X}_G)^2$
Restaurant   $\bar{X}_k$   SST term
Aynor        12.75         4(12.75-14)^2
Loris        11.50         4(11.50-14)^2
Lander       17.00         5(17.00-14)^2
SST                        76.25
Shortcut: SST = TSS - SSE = 86 - 9.75 = 76.25
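Both routes to SST, the direct formula and the shortcut, can be compared numerically. The sketch below recomputes everything from the raw data; the variable names are illustrative.

```python
from statistics import mean

dinners = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}

all_values = [x for values in dinners.values() for x in values]
grand_mean = mean(all_values)

# Direct formula: group size times squared distance of group mean from grand mean.
sst_direct = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in dinners.values())

# Shortcut: total variation minus within-group variation.
tss = sum((x - grand_mean) ** 2 for x in all_values)
sse = sum((x - mean(v)) ** 2 for v in dinners.values() for x in v)
sst_shortcut = tss - sse

print("grand mean =", grand_mean)
print("SST (direct) =", sst_direct, " SST (shortcut) =", sst_shortcut)
```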
ANOVA Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square         F
Treatments            76.25            3 - 1 = 2            76.25/2 = 38.125    38.125/.975 = 39.103
Error                 9.75             13 - 3 = 10          9.75/10 = .975
Total                 86.00            13 - 1 = 12
P(F > 39.103) is .000018.
Since the computed F of 39.103 is greater than the critical F of 4.10, and the p-value of .000018 is less than $\alpha$ = .05, the decision is to reject the null hypothesis and conclude that at least two of the treatment means are not the same.
The mean number of meals sold at the three locations is not the same.
The ANOVA tables on the next two slides are from the Minitab and EXCEL systems.
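The whole one-way procedure for Example 2 can be collected into one small function. This is a sketch under stated assumptions: `one_way_anova` is a hypothetical helper written for this note, not a library routine, and the critical value 4.10 is taken from the slides rather than computed.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a dict of samples."""
    all_values = [x for values in groups.values() for x in values]
    n, k = len(all_values), len(groups)
    grand_mean = mean(all_values)
    sst = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    sse = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    mst = sst / (k - 1)  # treatment mean square
    mse = sse / (n - k)  # error mean square
    return {"SST": sst, "SSE": sse, "TSS": sst + sse,
            "df": (k - 1, n - k), "MST": mst, "MSE": mse, "F": mst / mse}


dinners = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}

table = one_way_anova(dinners)
f_critical = 4.10  # critical F for 2 and 10 df at .05, as quoted in the slides
reject_h0 = table["F"] > f_critical

print(f"F = {table['F']:.3f}, df = {table['df']}, reject H0: {reject_h0}")
```

The function returns TSS as SST + SSE, relying on the partition of the total variation rather than summing squared deviations from the grand mean a second time.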
Minitab output
Analysis of Variance
Source   DF   SS       MS       F       P
Factor    2   76.250   38.125   39.10   0.000
Error    10    9.750    0.975
Total    12   86.000

Level    N   Mean     StDev
Aynor    4   12.750   0.957
Loris    4   11.500   1.291
Lander   5   17.000   0.707
Pooled StDev = 0.987
[Plot: Individual 95% CIs For Mean, Based on Pooled StDev; horizontal axis marked at 12.5, 15.0, 17.5]
EXCEL output
Anova: Single Factor
SUMMARY
Groups   Count   Sum   Average   Variance
Aynor    4       51    12.75     0.92
Loris    4       46    11.50     1.67
Lander   5       85    17.00     0.50

ANOVA
Source of Variation   SS      df   MS      F       P-value   F crit
Between Groups        76.25   2    38.13   39.10   2E-05     4.10
Within Groups         9.75    10   0.98
Total                 86.00   12
Inferences About Treatment Means
When I reject the null hypothesis that the means are equal, I want to know which treatment means differ.
One of the simplest procedures is through the use of confidence intervals around the difference in treatment means.
Confidence Interval for the Difference Between Two Means
$(\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
t is obtained from the t table with (n - k) degrees of freedom.
MSE = SSE/(n - k)
If the confidence interval around the difference in treatment means includes zero, there is no significant difference between the treatment means.
EXAMPLE 3
Develop a 95% confidence interval for the difference in the mean number of meat loaf dinners sold in Lander and Aynor. Can Katy conclude that there is a difference between the two restaurants?
$(17 - 12.75) \pm 2.228 \sqrt{.975\left(\dfrac{1}{4} + \dfrac{1}{5}\right)} = 4.25 \pm 1.48 = (2.77, 5.73)$
Because zero is not in the interval, we conclude that this pair of means differs.
The mean number of meals sold in Aynor is different from the mean number sold in Lander.
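The interval in Example 3 follows mechanically from the formula. A short sketch, where `mean_difference_ci` is an illustrative helper name and the t value 2.228 (10 degrees of freedom, 95% confidence) is taken from the slide instead of being looked up in code:

```python
from math import sqrt


def mean_difference_ci(mean1, mean2, n1, n2, mse, t_value):
    """Confidence interval for mean1 - mean2 using the pooled MSE from ANOVA."""
    margin = t_value * sqrt(mse * (1 / n1 + 1 / n2))
    difference = mean1 - mean2
    return difference - margin, difference + margin


# Lander (n = 5, mean 17.00) versus Aynor (n = 4, mean 12.75); MSE = .975.
low, high = mean_difference_ci(17.0, 12.75, 5, 4, 0.975, 2.228)
includes_zero = low <= 0 <= high

print(f"95% CI: ({low:.2f}, {high:.2f}); includes zero: {includes_zero}")
```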
Two-Factor ANOVA
Sometimes there are other causes of variation. For the two-factor ANOVA we test whether there is a significant difference among the treatment means and whether there is a significant difference among the block means (the blocking variable acts as a second treatment variable).
$SSB = r \sum_b (\bar{X}_b - \bar{X}_G)^2$
where
r is the number of observations in each block (one per treatment),
$\bar{X}_b$ is the sample mean of block b, and
$\bar{X}_G$ is the overall or grand mean.
In the following ANOVA table, all sums of squares are computed as before, with the addition of the SSB.
Two-Factor ANOVA Table
Source of Variation   Sum of Squares          Degrees of Freedom   Mean Square                  F
Treatments (k)        SST                     k - 1                MST = SST/(k - 1)            MST/MSE
Blocks (b)            SSB                     b - 1                MSB = SSB/(b - 1)            MSB/MSE
Error                 SSE = TSS - SST - SSB   (k - 1)(b - 1)       MSE = SSE/[(k - 1)(b - 1)]
Total                 TSS                     n - 1
Example 4
The Bieber Manufacturing Co. operates 24 hours a day, five days a week. The workers rotate shifts each week. Todd Bieber, the owner, is interested in whether there is a difference in the number of units produced when the employees work on various shifts. A sample of five workers is selected and their output recorded on each shift.
At the .05 significance level, can we conclude there is a difference in the mean production by shift and in the mean production by employee?
Employee     Day Output   Evening Output   Night Output
McCartney    31           25               35
Neary        33           26               33
Schoen       28           24               30
Thompson     30           29               28
Wagner       28           26               27
Treatment Effect
Step 1: State the null hypothesis and the alternate hypothesis.
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not all means are equal.
Step 2: Select the level of significance. Given as .05.
Step 3: Determine the test statistic. The test statistic follows the F distribution.
Step 4: Formulate the decision rule. H0 is rejected if F > 4.46 (the degrees of freedom are 2 and 8) or if p < .05.
Step 5: Perform the calculations and make a decision.
Block Effect
Step 1: State the null hypothesis and the alternate hypothesis.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
$H_1$: Not all means are equal.
Step 2: Select the level of significance. Given as $\alpha$ = .05.
Step 3: Determine the test statistic. The test statistic follows the F distribution.
Step 4: Formulate the decision rule. H0 is rejected if F > 3.84, df = (4, 8), or if p < .05.
Step 5: Perform the calculations and make a decision.
Block Sums of Squares
Effects of time of day and worker on productivity
Note: $\bar{X}_G$ = 28.87
Employee     Day   Evening   Night   Employee mean   SSB term
McCartney    31    25        35      30.33           3(30.33-28.87)^2 = 6.42
Neary        33    26        33      30.67           3(30.67-28.87)^2 = 9.68
Schoen       28    24        30      27.33           3(27.33-28.87)^2 = 7.08
Thompson     30    29        28      29.00           3(29.00-28.87)^2 = .05
Wagner       28    26        27      27.00           3(27.00-28.87)^2 = 10.49
SSB = 6.42 + 9.68 + 7.08 + .05 + 10.49 = 33.73
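The block sum of squares can be recomputed from the raw output figures without intermediate rounding. A minimal sketch; the dictionary name `output` is an assumption made for this illustration.

```python
from statistics import mean

# Units produced by each employee on the Day, Evening and Night shifts.
output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}

all_values = [x for row in output.values() for x in row]
grand_mean = mean(all_values)
observations_per_block = 3  # one observation per shift for each employee

block_terms = {
    name: observations_per_block * (mean(row) - grand_mean) ** 2
    for name, row in output.items()
}
ssb = sum(block_terms.values())

print(f"grand mean = {grand_mean:.2f}, SSB = {ssb:.2f}")
```

Individual terms computed this way can differ in the second decimal from the hand-rounded figures on the slide, but the total agrees to two decimals.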
Compute the remaining sums of squares as before:
TSS = 139.73
SST = 62.53
SSE = 139.73 - 62.53 - 33.73 = 43.47
df(block) = b - 1 = 4
df(treatment) = k - 1 = 2
df(error) = (k - 1)(b - 1) = 8
ANOVA Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square        F
Treatments (k)        62.53            2                    62.53/2 = 31.27    31.27/5.43 = 5.75
Blocks (b)            33.73            4                    33.73/4 = 8.43     8.43/5.43 = 1.55
Error                 43.47            8                    43.47/8 = 5.43
Total                 139.73           14
Treatment Effect
Since the computed F of 5.75 is greater than the critical F of 4.46, and the p-value of .03 is less than $\alpha$ = .05, H0 is rejected. There is a difference in the mean number of units produced for the different time periods.
Block Effect
Since the computed F of 1.55 is less than the critical F of 3.84, and the p-value of .28 is greater than $\alpha$ = .05, H0 is not rejected. There is no significant difference in the average number of units produced for the different employees.
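The two-factor calculation for Example 4 can be sketched end to end as follows. The function `two_way_anova` is a hypothetical helper written for this note (two-factor ANOVA without replication), and the critical values 4.46 and 3.84 are copied from the slides rather than computed.

```python
from statistics import mean


def two_way_anova(rows):
    """Two-factor ANOVA without replication.

    rows maps each block (employee) to its observations, one per treatment
    (shift), listed in the same treatment order for every block.
    """
    blocks = list(rows.values())
    b, k = len(blocks), len(blocks[0])
    all_values = [x for block in blocks for x in block]
    grand_mean = mean(all_values)
    treatment_means = [mean(column) for column in zip(*blocks)]
    block_means = [mean(block) for block in blocks]
    tss = sum((x - grand_mean) ** 2 for x in all_values)
    sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = tss - sst - ssb
    mse = sse / ((k - 1) * (b - 1))
    return {
        "TSS": tss, "SST": sst, "SSB": ssb, "SSE": sse,
        "F_treatment": (sst / (k - 1)) / mse,
        "F_block": (ssb / (b - 1)) / mse,
    }


output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}

result = two_way_anova(output)
reject_treatment = result["F_treatment"] > 4.46  # critical F for 2 and 8 df
reject_block = result["F_block"] > 3.84          # critical F for 4 and 8 df

print({name: round(value, 2) for name, value in result.items()})
print("reject H0 (shifts):", reject_treatment, " reject H0 (employees):", reject_block)
```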
Minitab output
Two-way ANOVA: Units versus Worker, Shift
Analysis of Variance for Units
Source   DF   SS       MS      F      P
Worker    4   33.73    8.43    1.55   0.276
Shift     2   62.53    31.27   5.75   0.028
Error     8   43.47    5.43
Total    14   139.73
ANOVA Output Using EXCEL
Anova: Two-Factor Without Replication
SUMMARY      Count   Sum   Average   Variance
Day          5       150   30.0      4.5
Evening      5       130   26.0      3.5
Night        5       153   30.6      11.3
McCartney    3       91    30.33     25.33
Neary        3       92    30.67     16.33
Schoen       3       82    27.33     9.33
Thompson     3       87    29.00     1
Wagner       3       81    27.00     1

ANOVA
Source of Variation   SS       df   MS      F      P-value   F crit
Rows                  62.53    2    31.27   5.75   0.03      4.46
Columns               33.73    4    8.43    1.55   0.28      3.84
Error                 43.47    8    5.43
Total                 139.73   14