Lecture Slides
Elementary Statistics
Tenth Edition
and the Triola Statistics Series
by Mario F. Triola
Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Chapter 12
Analysis of Variance
12-1 Overview
12-2 One-Way ANOVA
12-3 Two-Way ANOVA
Section 12-1 & 12-2
Overview and One-Way
ANOVA
Created by Erin Hodgess, Houston, Texas
Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State,
Chattanooga, TN
Key Concept
This section introduces the method
of one-way analysis of variance,
which is used for tests of
hypotheses that three or more
population means are all equal.
Overview
Analysis of variance (ANOVA) is a
method for testing the hypothesis
that three or more population means
are equal.
For example:
H0: µ1 = µ2 = µ3 = . . . = µk
H1: At least one mean is different
ANOVA Methods
Require the F-Distribution
1. The F-distribution is not symmetric; it
is skewed to the right.
2. The values of F can be 0 or positive;
they cannot be negative.
3. There is a different F-distribution for
each pair of degrees of freedom for the
numerator and denominator.
Critical values of F are given in Table A-5
F-Distribution
Figure 12-1
One-Way ANOVA
An Approach to Understanding ANOVA
1. Understand that a small P-value (such as
0.05 or less) leads to rejection of the null
hypothesis of equal means.
With a large P-value (such as greater than
0.05), fail to reject the null hypothesis of
equal means.
2. Develop an understanding of the underlying
rationale by studying the examples in this
section.
One-Way ANOVA
An Approach to Understanding ANOVA
3. Become acquainted with the nature of the SS
(sum of squares) and MS (mean square) values
and their role in determining the F test
statistic, but use statistical software packages
or a calculator for finding those values.
Definition
Treatment (or factor)
A treatment (or factor) is a property or
characteristic that allows us to
distinguish the different populations
from one another.
Use computer software or TI-83/84 for
ANOVA calculations if possible.
One-Way ANOVA
Requirements
1. The populations have approximately normal
distributions.
2. The populations have the same variance σ² (or standard deviation σ).
3. The samples are simple random samples.
4. The samples are independent of each other.
5. The different samples are from populations
that are categorized in only one way.
Procedure for testing
H0: µ1 = µ2 = µ3 = . . .
1. Use STATDISK, Minitab, Excel, or a
TI-83/84 Calculator to obtain results.
2. Identify the P-value from the display.
3. Form a conclusion based on these
criteria:
If P-value ≤ α, reject the null hypothesis of equal means.
If P-value > α, fail to reject the null hypothesis of equal means.
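A minimal sketch of steps 1 through 3 in Python (the scipy library and the three small samples below are assumptions for illustration, not data from the text):

```python
# One-way ANOVA on three hypothetical samples (made-up values).
from scipy import stats

sample1 = [5.2, 4.8, 6.1, 5.5, 5.0]
sample2 = [6.3, 7.0, 6.5, 6.8, 7.2]
sample3 = [5.9, 6.1, 5.7, 6.4, 6.0]

alpha = 0.05
f_stat, p_value = stats.f_oneway(sample1, sample2, sample3)
print(f"F = {f_stat:.3f}, P-value = {p_value:.4f}")

if p_value <= alpha:
    print("Reject the null hypothesis of equal means.")
else:
    print("Fail to reject the null hypothesis of equal means.")
```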
Procedure for testing
H0: µ1 = µ2 = µ3 = . . .
Caution when interpreting ANOVA results:
When we conclude that there is sufficient
evidence to reject the claim of equal population
means, we cannot conclude from ANOVA that
any particular mean is different from the others.
Several other tests, called multiple comparison procedures, can be used to identify the specific means that are different; they are discussed later in this section.
Example: Weights of Poplar Trees
Do the samples come from
populations with different means?
Example: Weights of Poplar Trees
Do the samples come from
populations with different means?
H0: µ1 = µ2 = µ3 = µ4
H1: At least one of the means is different from the others.
For a significance level of α = 0.05, use STATDISK,
Minitab, Excel, or a TI-83/84 calculator to test the claim
that the four samples come from populations with
means that are not all the same.
Example: Weights of Poplar Trees
Do the samples come from
populations with different means?
STATDISK
Minitab
Example: Weights of Poplar Trees
Do the samples come from
populations with different means?
Excel
TI-83/84 PLUS
Example: Weights of Poplar Trees
Do the samples come from
populations with different means?
H0: µ1 = µ2 = µ3 = µ4
H1: At least one of the means is different from the others.
The displays all show a P-value of approximately 0.007.
Because the P-value is less than the significance level of α = 0.05, we reject the null hypothesis of equal means.
There is sufficient evidence to support the claim that the
four population means are not all the same. We conclude
that those weights come from populations having means
that are not all the same.
ANOVA
Fundamental Concepts
Estimate the common value of σ²:
1. The variance between samples (also called variation due to treatment) is an estimate of the common population variance σ² that is based on the variability among the sample means.
2. The variance within samples (also called variation due to error) is an estimate of the common population variance σ² based on the sample variances.
ANOVA
Fundamental Concepts
Test Statistic for One-Way ANOVA
F = (variance between samples) / (variance within samples)
An excessively large F test statistic is
evidence against equal population means.
Relationships Between the
F Test Statistic and P-Value
Figure 12-2
Calculations with
Equal Sample Sizes
Variance between samples = n·sx̄²
where sx̄² = variance of the sample means
Variance within samples = sp²
where sp² = pooled variance (or the mean of the sample variances)
Example: Sample Calculations
[Table 12-2 appears on this slide: three samples of n = 4 values each, with sample means 5.5, 6.0, and 6.0 and sample variances 3.0, 2.0, and 2.0]
Example: Sample Calculations
Use Table 12-2 to calculate the variance between samples,
variance within samples, and the F test statistic.
1. Find the variance between samples = n·sx̄².
For the means 5.5, 6.0, and 6.0, the sample variance is
sx̄² = 0.0833
n·sx̄² = 4 × 0.0833 = 0.3332
2. Estimate the variance within samples by calculating the mean of the sample variances:
sp² = (3.0 + 2.0 + 2.0) / 3 = 2.3333
Example: Sample Calculations
Use Table 12-2 to calculate the variance between samples,
variance within samples, and the F test statistic.
3. Evaluate the F test statistic:
F = (variance between samples) / (variance within samples)
  = 0.3332 / 2.3333 = 0.1428
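The same arithmetic can be checked in Python; this sketch uses only the summary values quoted above (the choice of Python and its statistics module is an assumption, not part of the text):

```python
# Reproduce the worked calculation from the quoted summary values.
import statistics

sample_means = [5.5, 6.0, 6.0]   # means of the three samples
sample_vars = [3.0, 2.0, 2.0]    # variances of the three samples
n = 4                            # number of values in each sample

var_between = n * statistics.variance(sample_means)  # n * s_xbar^2
var_within = statistics.mean(sample_vars)            # pooled variance s_p^2
F = var_between / var_within

print(var_between)  # about 0.3333 (the slide rounds to 0.3332)
print(var_within)   # about 2.3333
print(F)            # about 0.1429 (the slide rounds to 0.1428)
```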
Critical Value of F
Right-tailed test
Degrees of freedom with k samples of the same size n:
numerator df = k – 1
denominator df = k(n – 1)
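If Table A-5 is not at hand, the right-tailed critical value can also be found with software. A sketch using scipy (assumed available), with k = 3 samples of size n = 4 as in the worked example above:

```python
# Right-tailed critical value of F for k samples of equal size n.
from scipy.stats import f

k, n, alpha = 3, 4, 0.05
df_num = k - 1         # numerator degrees of freedom
df_den = k * (n - 1)   # denominator degrees of freedom

critical_value = f.ppf(1 - alpha, df_num, df_den)
print(df_num, df_den, round(critical_value, 4))  # 2 9 4.2565
```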
Calculations with
Unequal Sample Sizes
F = (variance between samples) / (variance within samples)
  = [ Σ ni(x̄i – x̄)² / (k – 1) ] / [ Σ (ni – 1)si² / Σ (ni – 1) ]
where x̄ = mean of all the sample scores combined
k = number of population means being compared
ni = number of values in the ith sample
x̄i = mean of the values in the ith sample
si² = variance of the values in the ith sample
Key Components of
the ANOVA Method
SS(total), or total sum of squares, is a measure of the total variation (around the overall mean x̄) in all the sample data combined.
Formula 12-1
SS(total) = Σ(x – x̄)²
Key Components of
the ANOVA Method
SS(treatment), also referred to as SS(factor) or
SS(between groups) or SS(between samples),
is a measure of the variation between the
sample means.
Formula 12-2
SS(treatment) = n1(x̄1 – x̄)² + n2(x̄2 – x̄)² + . . . + nk(x̄k – x̄)²
             = Σ ni(x̄i – x̄)²
Key Components of
the ANOVA Method
SS(error), also referred to as SS(within groups) or SS(within samples), is a sum of squares representing the variability that is assumed to be common to all the populations being considered.
Formula 12-3
SS(error) = (n1 – 1)s1² + (n2 – 1)s2² + (n3 – 1)s3² + . . . + (nk – 1)sk²
         = Σ(ni – 1)si²
Key Components of
the ANOVA Method
Given the previous expressions for SS(total),
SS(treatment), and SS(error), the following
relationship will always hold.
Formula 12-4
SS(total) = SS(treatment) + SS(error)
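A quick numerical check of Formula 12-4, sketched in Python with made-up samples (any samples will do, since the identity always holds):

```python
# Verify SS(total) = SS(treatment) + SS(error) on arbitrary samples.
samples = [[5.2, 4.8, 6.1, 5.5], [6.3, 7.0, 6.5], [5.9, 6.1, 5.7, 6.4, 6.0]]

all_values = [x for s in samples for x in s]
grand_mean = sum(all_values) / len(all_values)

def mean(s):
    return sum(s) / len(s)

def var(s):
    return sum((x - mean(s)) ** 2 for x in s) / (len(s) - 1)

ss_total = sum((x - grand_mean) ** 2 for x in all_values)                  # Formula 12-1
ss_treatment = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)  # Formula 12-2
ss_error = sum((len(s) - 1) * var(s) for s in samples)                     # Formula 12-3

print(round(ss_total, 6), round(ss_treatment + ss_error, 6))  # the two values match
```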
Mean Squares (MS)
MS(treatment) is a mean square for
treatment, obtained as follows:
Formula 12-5
MS(treatment) = SS(treatment) / (k – 1)
MS(error) is a mean square for error,
obtained as follows:
Formula 12-6
MS(error) = SS(error) / (N – k)
Mean Squares (MS)
MS(total) is a mean square for the total
variation, obtained as follows:
Formula 12-7
MS(total) = SS(total) / (N – 1)
Test Statistic for ANOVA
with Unequal Sample Sizes
Formula 12-8
F = MS(treatment) / MS(error)
Numerator df = k – 1
Denominator df = N – k
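Putting Formulas 12-2, 12-3, 12-5, 12-6, and 12-8 together, a sketch in Python for samples of unequal sizes (scipy is assumed available for the P-value; the samples are made up):

```python
# Mean squares, F test statistic, and P-value with unequal sample sizes.
from scipy.stats import f

samples = [[5.2, 4.8, 6.1, 5.5], [6.3, 7.0, 6.5], [5.9, 6.1, 5.7, 6.4, 6.0]]
k = len(samples)                   # number of samples
N = sum(len(s) for s in samples)   # total number of values

def mean(s):
    return sum(s) / len(s)

grand_mean = sum(sum(s) for s in samples) / N
ss_treatment = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
ss_error = sum(sum((x - mean(s)) ** 2 for x in s) for s in samples)

ms_treatment = ss_treatment / (k - 1)  # Formula 12-5
ms_error = ss_error / (N - k)          # Formula 12-6
f_stat = ms_treatment / ms_error       # Formula 12-8

p_value = f.sf(f_stat, k - 1, N - k)   # right-tail area of the F distribution
print(round(f_stat, 3), round(p_value, 4))
```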
Example: Weights of Poplar Trees
Table 12-3 has a format often used
in computer displays.
Identifying Means That Are Different
After conducting an analysis of
variance test, we might conclude that
there is sufficient evidence to reject a
claim of equal population means, but
we cannot conclude from ANOVA that
any particular mean is different from
the others.
Identifying Means That Are Different
Informal procedures to identify the
specific means that are different
1. Use the same scale for constructing boxplots
of the data sets to see if one or more of the
data sets are very different from the others.
2. Construct confidence interval estimates of
the means from the data sets, then compare
those confidence intervals to see if one or
more of them do not overlap with the others.
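A sketch of the second informal approach in Python (scipy is assumed available; the samples are made up). Intervals that do not overlap suggest which means may differ:

```python
# A 95% confidence interval for each sample mean.
from math import sqrt
from statistics import mean, stdev
from scipy.stats import t

samples = {"A": [5.2, 4.8, 6.1, 5.5, 5.0],
           "B": [6.3, 7.0, 6.5, 6.8, 7.2],
           "C": [5.9, 6.1, 5.7, 6.4, 6.0]}

for name, s in samples.items():
    n = len(s)
    margin = t.ppf(0.975, n - 1) * stdev(s) / sqrt(n)   # t-based margin of error
    print(f"{name}: {mean(s) - margin:.2f} to {mean(s) + margin:.2f}")
```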
Bonferroni Multiple
Comparison Test
Step 1. Do a separate t test for each pair of
samples, but make the adjustments described in
the following steps.
Step 2. For an estimate of the variance σ2 that is
common to all of the involved populations, use
the value of MS(error).
Bonferroni Multiple
Comparison Test
Step 2 (cont.) Using the value of MS(error),
calculate the value of the test statistic, as
shown below. (This example shows the
comparison for Sample 1 and Sample 4.)
t = (x̄1 – x̄4) / √[ MS(error) · (1/n1 + 1/n4) ]
Bonferroni Multiple
Comparison Test
Step 2 (cont.) Change the subscripts and use
another pair of samples until all of the different
possible pairs of samples have been tested.
Step 3. After calculating the value of the test
statistic t for a particular pair of samples, find
either the critical t value or the P-value, but
make the following adjustment:
Bonferroni Multiple
Comparison Test
Step 3 (cont.) P-value: Use the test statistic t
with df = N-k, where N is the total number of
sample values and k is the number of samples.
Find the P-value the usual way, but adjust the
P-value by multiplying it by the number of
different possible pairings of two samples.
(For example, with four samples, there are six
different possible pairings, so adjust the P-value
by multiplying it by 6.)
Bonferroni Multiple
Comparison Test
Step 3 (cont.) Critical value: When finding the
critical value, adjust the significance level α by
dividing it by the number of different possible
pairings of two samples.
(For example, with four samples, there are six different possible pairings, so adjust the significance level by dividing it by 6.)
Example: Weights of Poplar Trees
Using the Bonferroni Test
Using the data in Table 12-1, we concluded that
there is sufficient evidence to warrant rejection
of the claim of equal means.
The Bonferroni test requires a separate t test for
each different possible pair of samples.
H0: µ1 = µ2
H0: µ1 = µ3
H0: µ1 = µ4
H0: µ2 = µ3
H0: µ2 = µ4
H0: µ3 = µ4
Example: Weights of Poplar Trees
Using the Bonferroni Test
We begin with testing H0: μ1 = μ4 .
From Table 12-1:
x̄1 = 0.184, n1 = 5
x̄4 = 1.334, n4 = 5
t = (x̄1 – x̄4) / √[ MS(error) · (1/n1 + 1/n4) ]
  = (0.184 – 1.334) / √[ 0.2723275 · (1/5 + 1/5) ]
  = –3.484
Example: Weights of Poplar Trees
Using the Bonferroni Test
Test statistic = – 3.484
df = N – k = 20 – 4 = 16
Two-tailed P-value is 0.003065, but adjust it by
multiplying by 6 (the number of different possible
pairs of samples) to get P-value = 0.01839.
Because the adjusted P-value is less than α = 0.05,
reject the null hypothesis.
It appears that Samples 1 and 4 have significantly
different means.
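The same comparison can be reproduced in Python from the summary statistics quoted above (scipy is assumed available):

```python
# Bonferroni comparison of Sample 1 and Sample 4 from the quoted summaries.
from math import sqrt, comb
from scipy.stats import t

x1, n1 = 0.184, 5      # mean and size of Sample 1
x4, n4 = 1.334, 5      # mean and size of Sample 4
ms_error = 0.2723275   # MS(error) from the one-way ANOVA
N, k = 20, 4           # total number of values and number of samples

t_stat = (x1 - x4) / sqrt(ms_error * (1 / n1 + 1 / n4))
df = N - k
p_two_tailed = 2 * t.sf(abs(t_stat), df)
p_adjusted = min(1.0, p_two_tailed * comb(k, 2))  # six possible pairs

print(round(t_stat, 3))        # -3.484
print(round(p_two_tailed, 6))  # about 0.003065
print(round(p_adjusted, 5))    # about 0.01839
```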
Example: Weights of Poplar Trees
SPSS Bonferroni Results
Recap
In this section we have discussed:
- One-way analysis of variance (ANOVA)
  - Calculations with Equal Sample Sizes
  - Calculations with Unequal Sample Sizes
- Identifying Means That Are Different
  - Bonferroni Multiple Comparison Test
Section 12-3
Two-Way ANOVA
Created by Erin Hodgess, Houston, Texas
Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State,
Chattanooga, TN
Key Concepts
The analysis of variance procedure
introduced in Section 12-2 is referred to as
one-way analysis of variance because the
data are categorized into groups according
to a single factor (or treatment).
In this section we introduce the method of
two-way analysis of variance, which is used
with data partitioned into categories
according to two factors.
Two-Way
Analysis of Variance
Two-Way ANOVA involves two factors.
The data are partitioned into
subcategories called cells.
Example: Poplar Tree Weights
Definition
There is an interaction between
two factors if the effect of one of
the factors changes for different
categories of the other factor.
Example: Poplar Tree Weights
Exploring Data
Calculate the mean for each cell.
Example: Poplar Tree Weights
Exploring Data
Display the means on a graph. If a graph such as
Figure 12-3 results in line segments that are
approximately parallel, we have evidence that there
is not an interaction between the row and column
variables.
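A sketch of such a graph in Python with matplotlib (assumed available). The category labels follow the poplar-tree example, but the cell means below are made up for illustration:

```python
# Plot cell means: one line per site, one point per treatment category.
import matplotlib.pyplot as plt

treatments = ["None", "Fertilizer", "Irrigation", "Fert. & Irrig."]
site1_means = [0.2, 0.6, 0.5, 1.3]  # made-up cell means for Site 1
site2_means = [0.3, 0.7, 0.6, 1.4]  # made-up cell means for Site 2

plt.plot(treatments, site1_means, marker="o", label="Site 1")
plt.plot(treatments, site2_means, marker="s", label="Site 2")
plt.ylabel("Mean weight")
plt.title("Cell means by treatment and site")
plt.legend()
plt.show()
```

Roughly parallel line segments, as described for Figure 12-3, suggest that there is no interaction between the two factors.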
Example: Poplar Tree Weights
Minitab
Requirements
1. For each cell, the sample values come
from a population with a distribution that
is approximately normal.
2. The populations have the same variance σ².
3. The samples are simple random samples.
4. The samples are independent of each other.
5. The sample values are categorized two
ways.
6. All of the cells have the same number of
sample values.
Two-Way ANOVA calculations are quite
involved, so we will assume that a
software package is being used.
Minitab, Excel, a TI-83/84 calculator, or STATDISK can be used.
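For readers working in Python instead, a two-way ANOVA with interaction can be fit as sketched below (pandas and statsmodels are assumed available; the weights are made up, not the textbook data):

```python
# Two-way ANOVA with interaction on made-up poplar-style data.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({
    "site":      ["1"] * 8 + ["2"] * 8,
    "treatment": ["None", "Fert", "Irrig", "Both"] * 4,
    "weight":    [0.2, 0.9, 0.5, 1.3, 0.1, 0.8, 0.6, 1.2,
                  0.3, 1.0, 0.7, 1.5, 0.4, 0.9, 0.6, 1.4],
})

model = smf.ols("weight ~ C(site) * C(treatment)", data=df).fit()
print(anova_lm(model, typ=2))  # rows for site, treatment, interaction, residual
```

The resulting table lists an F statistic and P-value for the site factor, the treatment factor, and their interaction, in the same spirit as the Minitab display discussed below.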
Procedure for Two-Way ANOVA
Figure 12-4
Example: Poplar Tree Weights
Using the Minitab display:
Step 1: Test for interaction between the effects:
H0: There are no interaction effects.
H1: There are interaction effects.
F = MS(interaction) / MS(error) = 0.05721 / 0.33521 = 0.17
The P-value is shown as 0.915, so we fail to reject
the null hypothesis of no interaction between the
two factors.
Example: Poplar Tree Weights
Using the Minitab display:
Step 2: Check for Row/Column Effects:
H0: There are no effects from the row factors.
H1: There are row effects.
H0: There are no effects from the column factors.
H1: There are column effects.
F = MS(site) / MS(error) = 0.27225 / 0.33521 = 0.81
The P-value is shown as 0.374, so we fail to reject
the null hypothesis of no effects from site.
Special Case: One Observation
per Cell and No Interaction
If our sample data consist of only one observation
per cell, we lose MS(interaction), SS(interaction), and
df(interaction).
If it seems reasonable to assume that there is no
interaction between the two factors, make that
assumption and then proceed as before to test the
following two hypotheses separately:
H0: There are no effects from the row factors.
H0: There are no effects from the column factors.
Special Case: One Observation
per Cell and No Interaction
Minitab
Special Case: One Observation
per Cell and No Interaction
Row factor:
F = MS(site) / MS(error) = 0.156800 / 0.562300 = 0.28
The P-value in the Minitab display is 0.634.
Fail to reject the null hypothesis.
Special Case: One Observation
per Cell and No Interaction
Column factor:
F = MS(treatment) / MS(error) = 0.412217 / 0.562300 = 0.73
The P-value in the Minitab display is 0.598.
Fail to reject the null hypothesis.
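A sketch of the corresponding additive (no-interaction) model in Python, again with made-up values and assuming pandas and statsmodels are available:

```python
# One observation per cell: fit row and column effects only, no interaction.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({
    "site":      ["1", "1", "1", "1", "2", "2", "2", "2"],
    "treatment": ["None", "Fert", "Irrig", "Both"] * 2,
    "weight":    [0.2, 0.9, 0.5, 1.3, 0.3, 1.0, 0.7, 1.5],
})

model = smf.ols("weight ~ C(site) + C(treatment)", data=df).fit()
print(anova_lm(model, typ=2))  # separate F and P-value for each factor
```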
Recap
In this section we have discussed:
- Two-way analysis of variance (ANOVA)
- Special case: one observation per cell and no interaction