Chapter 14
Analysis of Variance
Introduction
Analysis of variance helps compare two or more
populations of quantitative data.
Specifically, we are interested in the relationships
among the population means (are they equal or
not).
The procedure works by analyzing the sample
variances.
14.1 One-Way Analysis of Variance
The analysis of variance is a procedure that
tests to determine whether differences exist
among two or more population means.
To do this, the technique analyzes the sample
variances.
One-Way Analysis of Variance:
Example 1
– An apple juice manufacturer is planning to develop a new
product – a liquid concentrate.
– The marketing manager has to decide how to market the
new product.
– Three strategies are considered:
Emphasize the convenience of using the product.
Emphasize the quality of the product.
Emphasize the product's low price.
One-Way Analysis of Variance:
Example 1 – continued
– An experiment was conducted as follows:
In three cities an advertising campaign was launched.
In each city only one of the three characteristics
(convenience, quality, and price) was emphasized.
The weekly sales were recorded for twenty weeks
following the beginning of the campaigns.
One-Way Analysis of Variance:
The data
Weekly sales (20 weeks in each city):

Convenience   Quality   Price
    529         804      672
    658         630      531
    793         774      443
    514         717      596
    663         679      602
    719         604      502
    711         620      659
    606         697      689
    461         706      675
    529         615      512
    498         492      691
    663         719      733
    604         787      698
    495         699      776
    485         572      561
    557         523      572
    353         584      469
    557         634      581
    542         580      679
    614         624      532

See file (Xm1.xls)
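For readers who want to reproduce the results outside Excel, here is a minimal sketch in Python (not part of the original slides; numpy and scipy are assumed to be available):

import numpy as np
from scipy import stats

convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality     = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
               492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price       = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
               691, 733, 698, 776, 561, 572, 469, 581, 679, 532]

# One-way ANOVA on the three samples
F, p = stats.f_oneway(convenience, quality, price)
print(F, p)  # ≈3.233, ≈0.0468

# Sample means match the slides: 577.55, 653.00, 608.65
print([float(np.mean(c)) for c in (convenience, quality, price)])

The same F statistic and p-value appear in the Excel printout later in this chapter.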
One-Way Analysis of Variance:
Solution
– The data are quantitative.
– Our problem objective is to compare sales in three
cities.
– We hypothesize about the relationship among the
three mean weekly sales:
Defining the Hypotheses
• Solution
H0: μ1 = μ2 = μ3
H1: At least two means differ
To build the statistic needed to test the
hypotheses we use the following notation:
Notation
Independent samples are drawn from k populations (treatments).

Sample:        1        2        ...   k
Observations:  x11      x12            x1k
               x21      x22            x2k
               ...      ...            ...
               xn1,1    xn2,2          xnk,k
Sample size:   n1       n2             nk
Sample mean:   x̄1       x̄2             x̄k

Here xij is the i-th observation in sample j; for example, x11 is the
first observation in the first sample and x22 is the second
observation in the second sample.
X is the "response variable".
The variable's values are called "responses".
Terminology
In the context of this problem...
Response variable – weekly sales
Responses – actual sales values
Experimental unit – weeks in the three cities when we record
sales figures.
Factor – the criterion by which we classify the populations (the
treatments). In this problem the factor is the marketing strategy.
Factor levels – the population (treatment) names. In this problem the
factor levels are the marketing strategies.
The rationale of the test statistic
Two types of variability are employed
when testing for the equality of the
population means.
Graphical demonstration:
Employing two types of variability

[Two panels of dot plots for Treatments 1–3, both with sample means
x̄1 = 10, x̄2 = 15, x̄3 = 20. In the first panel, a small variability
within the samples makes it easier to draw a conclusion about the
population means. In the second panel, the sample means are the same
as before, but the larger within-sample variability makes it harder
to draw a conclusion about the population means.]
The rationale behind the test statistic – I
If the null hypothesis is true, we would expect all the
sample means to be close to one another (and, as a
result, close to the grand mean).
If the alternative hypothesis is true, at least some of
the sample means would reside away from one
another.
Thus, we measure the variability among the sample means.
Variability among sample means
The variability among the sample means is
measured as the sum of squared distances
between each mean and the grand mean.
This sum is called the
Sum of Squares for Treatments
SST
In our example treatments are
represented by the different
advertising strategies.
Sum of squares for treatments (SST)

SST = Σj=1..k nj (x̄j − x̄)²

where there are k treatments, nj is the size of sample j, and x̄j is
the mean of sample j.
Note: When the sample means are close to
one another, their distance from the grand
mean is small, leading to a small SST. Thus, a
large SST indicates large variation among the
sample means, which supports H1.
Sum of squares for treatments (SST)
Solution – continued
Calculate SST:
x̄1 = 577.55, x̄2 = 653.00, x̄3 = 608.65

The grand mean is calculated by
x̄ = (n1 x̄1 + n2 x̄2 + ... + nk x̄k) / (n1 + n2 + ... + nk) = 613.07

SST = Σj=1..k nj (x̄j − x̄)²
    = 20(577.55 − 613.07)² + 20(653.00 − 613.07)² + 20(608.65 − 613.07)²
    = 57,512.23
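The grand mean and SST can be verified with a few lines of Python (a sketch; numpy assumed):

import numpy as np

n = np.array([20, 20, 20])                 # sample sizes
xbar = np.array([577.55, 653.00, 608.65])  # sample means

grand_mean = np.sum(n * xbar) / np.sum(n)   # ≈613.07
SST = np.sum(n * (xbar - grand_mean) ** 2)  # ≈57,512.23
print(grand_mean, SST)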
Sum of squares for treatments (SST)
Is SST = 57,512.23 large enough to favor H1?
See next.
The rationale behind the test statistic – II
Large variability within the samples weakens the
"ability" of the sample means to represent their
corresponding population means.
Therefore, even though sample means may
markedly differ from one another, a large SST
must be judged relative to the "within samples
variability".
Within samples variability
The variability within samples is measured by
adding all the squared distances between
observations and their sample means.
This sum is called the
Sum of Squares for Error SSE.
In our example this is the
sum of all squared differences
between sales in city j and the
sample mean of city j (over all
the three cities).
Sum of squares for errors (SSE)
Solution – continued
Calculate SSE:
s1² = 10,775.00, s2² = 7,238.11, s3² = 8,670.24

SSE = Σj=1..k Σi=1..nj (xij − x̄j)² = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3²
    = (20 − 1)(10,775.00) + (20 − 1)(7,238.11) + (20 − 1)(8,670.24)
    ≈ 506,983.5
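As a quick check (a sketch outside the original slides; numpy assumed), the same SSE can be computed from the unrounded sample variances shown in the Excel printout below:

import numpy as np

# Unrounded sample variances (from the Excel single-factor printout)
s2 = np.array([10774.997, 7238.1053, 8670.2395])
n = np.array([20, 20, 20])
SSE = np.sum((n - 1) * s2)  # ≈506,983.5
print(SSE)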
Sum of squares for errors (SSE)
• Note: If SST is small relative to SSE, we
can't infer that the treatments are the cause
of different average performance.
• Is SST = 57,512.23 large enough relative
to SSE = 506,983.5 to argue that the
means ARE different?
The mean sum of squares
To perform the test we need to calculate
the mean sums of squares as follows:

Mean Square for Treatments:
MST = SST / (k − 1) = 57,512.23 / (3 − 1) = 28,756.12

Mean Square for Error:
MSE = SSE / (n − k) = 506,983.5 / (60 − 3) = 8,894.45
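A quick check of the mean squares, in plain Python (a sketch; no libraries needed):

SST, SSE = 57512.23, 506983.5
k, n = 3, 60
MST = SST / (k - 1)  # 28,756.12
MSE = SSE / (n - k)  # ≈8,894.45
F = MST / MSE        # ≈3.23
print(MST, MSE, F)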
Calculation of the test statistic
We assume:
1. The populations tested
are normally distributed.
2. The variances of all the
populations tested are
equal.
(For honors class: testing normality and testing equal variances.)

F = MST / MSE = 28,756.12 / 8,894.45 = 3.23

with the following degrees of freedom:
v1 = k − 1 and v2 = n − k
The F test rejection region
And finally
the hypothesis test:
H0: μ1 = μ2 = ... = μk
H1: At least two means differ
Test statistic: F = MST / MSE
R.R.: F > Fα,k−1,n−k
The F test
H0: μ1 = μ2 = μ3
H1: At least two means differ
Test statistic: F = MST / MSE = 28,756.12 / 8,894.45 = 3.23
R.R.: F > Fα,k−1,n−k = F0.05,3−1,60−3 ≈ 3.16
Since 3.23 > 3.16, there is sufficient evidence
to reject H0 in favor of H1 and argue that at least one
of the mean sales is different from the others.
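The critical value used above can be obtained without a table (a sketch; scipy assumed):

from scipy import stats

F_crit = stats.f.ppf(1 - 0.05, 2, 57)  # ≈3.159, the "F crit" in the Excel printout
print(F_crit)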
The F test p-value
Use Excel to find the p-value:
=FDIST(3.23,2,57) = .0467

p-value = P(F > 3.23) = .0467

[Chart: the F density with (2, 57) degrees of freedom; the shaded
right-tail area beyond 3.23 equals .0467.]
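The same p-value can be reproduced with scipy (a sketch; Excel's FDIST right-tail convention corresponds to scipy's survival function):

from scipy import stats

p_value = stats.f.sf(3.2330, 2, 57)  # right-tail area, ≈0.0468
print(p_value)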
Excel single factor printout
See file (Xm1.xls)

Anova: Single Factor

SUMMARY
Groups    Count   Sum     Average   Variance
Convnce   20      11551   577.55    10774.997
Quality   20      13060   653       7238.1053
Price     20      12173   608.65    8670.2395

ANOVA
Source of Variation   SS          df   MS          F           P-value    F crit
Between Groups        57512.233   2    28756.117   3.2330414   0.046773   3.1588456
Within Groups         506983.5    57   8894.4474
Total                 564495.73   59

SS(Total) = SST + SSE
14.2 Multiple Comparisons
If the single-factor ANOVA leads us to conclude that at least
two means differ, we often want to know which ones.
Two means are considered different if the difference
between the corresponding sample means is larger
than a critical number.
The larger sample mean is believed to be associated
with the larger population mean.
Fisher's Least Significant Difference
Fisher's Least Significant Difference (LSD) method is one
procedure designed to determine which mean differences
are significant.
The hypotheses are:
H0: μi − μj = 0
H1: μi − μj ≠ 0
The statistic:
t = (x̄i − x̄j) / sqrt(MSE(1/ni + 1/nj))
Fisher's Least Significant Difference
This method builds on the equal-variances t-test of the
difference between two means.
The test statistic is improved by using MSE rather than sp².
We can conclude that μi and μj differ (at the α significance
level) if |x̄i − x̄j| > LSD, where

LSD = tα/2 sqrt(MSE(1/ni + 1/nj)),   d.f. = n − k
Experimentwise type I error rate (αE)
(the effective type I error)
Fisher's method may result in an increased probability of
committing a type I error.
The probability of committing at least one type I error in a series
of C hypothesis tests, each at significance level α, is called the
experimentwise type I error rate (αE). It is calculated by
αE = 1 − (1 − α)^C
where C is the number of pairwise comparisons (C = k(k − 1)/2, and k
is the number of treatments).
The Bonferroni adjustment determines the required type I error
probability per pairwise comparison (α) to secure a pre-determined
overall αE.
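As a quick numeric check (a sketch, not from the slides), the experimentwise rate for this chapter's three pairwise comparisons at α = .05:

alpha, k = 0.05, 3
C = k * (k - 1) // 2            # 3 pairwise comparisons
alpha_E = 1 - (1 - alpha) ** C  # ≈0.1426
print(C, alpha_E)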
The Bonferroni Adjustment
The procedure:
– Compute the number of pairwise comparisons (C)
[C = k(k − 1)/2], where k is the number of
populations/treatments.
– Set α = αE/C, where the value of αE is predetermined.
– We can conclude that μi and μj differ (at the αE/C significance
level) if

|x̄i − x̄j| > tαE/(2C) sqrt(MSE(1/ni + 1/nj)),   d.f. = n − k
The Fisher and Bonferroni methods
Example 1 – continued
– Rank the effectiveness of the marketing strategies
(based on mean weekly sales).
– Use Fisher's method and the Bonferroni adjustment method.
Solution (Fisher's method)
– The sample mean sales were 577.55, 653.0, 608.65.
– Then,
|x̄1 − x̄2| = |577.55 − 653.0| = 75.45
|x̄1 − x̄3| = |577.55 − 608.65| = 31.10
|x̄2 − x̄3| = |653.0 − 608.65| = 44.35

LSD = tα/2 sqrt(MSE(1/ni + 1/nj)) = t.05/2,57 sqrt(8894(1/20 + 1/20)) = 59.72

The significant difference is between μ1 and μ2.
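The LSD value can be reproduced in Python (a sketch; scipy assumed, MSE taken from the ANOVA table):

import numpy as np
from scipy import stats

MSE, df = 8894.447, 57
t_crit = stats.t.ppf(1 - 0.05 / 2, df)       # ≈2.002
LSD = t_crit * np.sqrt(MSE * (1/20 + 1/20))  # ≈59.72
print(t_crit, LSD)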
The Fisher and Bonferroni methods
Solution (the Bonferroni adjustment)
– We calculate C = k(k − 1)/2 = 3(2)/2 = 3.
– We set α = .05/3 = .0167, thus t.0167/2,60−3 = 2.467 (Excel).
|x̄1 − x̄2| = 75.45
|x̄1 − x̄3| = 31.10
|x̄2 − x̄3| = 44.35

Critical value: 2.467 sqrt(8894(1/20 + 1/20)) = 73.54

Again, the significant difference is
between μ1 and μ2.
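The Bonferroni critical value can be checked the same way (a sketch; scipy assumed):

import numpy as np
from scipy import stats

alpha_E, C, df = 0.05, 3, 57
t_crit = stats.t.ppf(1 - alpha_E / (2 * C), df)        # ≈2.466
critical = t_crit * np.sqrt(8894.447 * (1/20 + 1/20))  # ≈73.54
print(t_crit, critical)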
The Tukey Multiple Comparisons
The test procedure:
– Find a critical number ω as follows:

ω = qα(k, ν) sqrt(MSE / ng)

k = the number of samples
ν = degrees of freedom = n − k
ng = number of observations per sample
(recall, all the sample sizes are the same)
α = significance level
qα(k, ν) = a critical value obtained from the
studentized range table

If the sample sizes are not extremely different, we can use
the above procedure with ng calculated as the harmonic mean
of the sample sizes.
The Tukey Multiple Comparisons
Recall, all the sample sizes are the same.
If the sample sizes are not the same, but don't differ
much from one another, we can use the harmonic mean
of the sample sizes for ng:

ng = k / (1/n1 + 1/n2 + ... + 1/nk)
The Tukey Multiple Comparisons
Select a pair of means. Calculate the difference between the
larger and the smaller mean, x̄max − x̄min.
• If x̄max − x̄min > ω, there is sufficient evidence
to conclude that μmax > μmin.
• Repeat this procedure for each pair of
samples. Rank the means if possible.
The Tukey Multiple Comparisons
Example 1 – continued. We had three populations
(three marketing strategies):
k = 3,
sample sizes were equal: n1 = n2 = n3 = 20,
ν = n − k = 60 − 3 = 57,
MSE = 8894.
Take q.05(3, ν) from the studentized range table (for ν = 57, use the
ν = 60 row).

ω = qα(k, ν) sqrt(MSE / ng) = q.05(3,57) sqrt(8894 / 20) = 71.70

Population       Mean
Sales - City 1   577.55
Sales - City 2   653
Sales - City 3   608.65

Compare x̄max − x̄min with ω:
City 1 vs. City 2: 653 − 577.55 = 75.45
City 1 vs. City 3: 608.65 − 577.55 = 31.10
City 2 vs. City 3: 653 − 608.65 = 44.35
Only City 1 vs. City 2 exceeds ω = 71.70, so only μ1 and μ2 differ.
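scipy's studentized_range distribution (available in scipy 1.7+) can replace the table lookup; a sketch:

import numpy as np
from scipy.stats import studentized_range

k, df, MSE, ng = 3, 57, 8894.447, 20
q_crit = studentized_range.ppf(1 - 0.05, k, df)  # ≈3.40
omega = q_crit * np.sqrt(MSE / ng)               # ≈71.7
print(q_crit, omega)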
Excel – Tukey and Fisher LSD method
Xm15 -1.xls

Fisher's LSD (α = .05)
Multiple Comparisons
Omega = 71.7007033950796
Variable   Variable   Difference   LSD
1          2          -75.45       59.72067
1          3          -31.1        59.72067
2          3          44.35        59.72067

Bonferroni adjustments (type α = .05/3 = .0167)
Multiple Comparisons
Omega = 71.7007033950796
Variable   Variable   Difference   LSD
1          2          -75.45       73.54176
1          3          -31.1        73.54176
2          3          44.35        73.54176
14.3 Randomized Blocks Design
The purpose of designing a randomized block
experiment is to reduce the within-treatments
variation, thus increasing the relative amount of
among-treatment variation.
This helps in detecting differences among the
treatment means more easily.
Randomized Blocks

[Diagram: experimental units arranged in blocks (a block of greyish
pinks, a block of bluish purples, a block of dark blues), with each of
Treatments 1–4 applied once within every block.]
Partitioning the total variability
Recall: for the independent samples design we have:
SS(Total) = SST + SSE
For the randomized blocks design, the total sum of squares is
partitioned into three sources of variation:
– Treatments
– Blocks
– Within samples (Error)
SS(Total) = SST + SSB + SSE
where SST = sum of squares for treatments, SSB = sum of squares for
blocks, and SSE = sum of squares for error.
The mean sum of squares
To perform hypothesis tests for treatments and blocks we
need:
• Mean square for treatments: MST = SST / (k − 1)
• Mean square for blocks:     MSB = SSB / (b − 1)
• Mean square for error:      MSE = SSE / ((k − 1)(b − 1))
The test statistic for the randomized block
design ANOVA
Test statistic for treatments: F = MST / MSE
Test statistic for blocks:     F = MSB / MSE
The F test rejection region
Testing the mean responses for treatments:
F > Fα,k−1,(k−1)(b−1)
Testing the mean responses for blocks:
F > Fα,b−1,(k−1)(b−1)
Additional example
Randomized Blocks ANOVA – Example
Example 2
– Are there differences in the effectiveness of cholesterol
reduction drugs?
– To answer this question the following experiment was
organized:
25 groups of men with high cholesterol were matched by age
and weight. Each group consisted of 4 men.
Each person in a group received a different drug.
The cholesterol level reduction over two months was recorded.
– Can we infer from the data in Xm2.xls that there are
differences in mean cholesterol reduction among the four
drugs?
Randomized Blocks ANOVA – Example
Solution
– Each drug can be considered a treatment.
– The four records in each group can be treated as a block,
because they are matched by age and weight.
– This procedure eliminates the variability in
cholesterol reduction related to different
combinations of age and weight.
– This helps detect differences in the mean cholesterol
reduction attributed to the different drugs.
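A sketch (not from the original slides) of how this randomized-block ANOVA could be run in Python; pandas and statsmodels are assumed available, and the column names below are hypothetical placeholders for whatever layout Xm2.xls actually uses:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Assumed long format: one row per man, with columns
# 'reduction' (response), 'drug' (treatment), 'group' (block)
df = pd.read_excel("Xm2.xls")

model = ols("reduction ~ C(drug) + C(group)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # F tests for drugs and blocks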
Randomized Blocks ANOVA – Example

ANOVA
Source of Variation    SS         df   MS         F          P-value    F crit
Rows (blocks)          3848.657   24   160.3607   10.10537   9.7E-15    1.669456
Columns (treatments)   195.9547   3    65.31823   4.116127   0.009418   2.731809
Error                  1142.558   72   15.86886
Total                  5187.169   99

(Blocks have b − 1 degrees of freedom and treatments k − 1;
F = MSBL/MSE for blocks and MSTR/MSE for treatments.)

Conclusion: At the 5% significance level there is sufficient evidence
to infer that the mean cholesterol reduction gained by at least
two drugs differs.
14.2 Multiple Comparisons
The rejection region:
|x̄i − x̄j| / sqrt(MSE(1/ni + 1/nj)) > tα/2,n−k

Example – continued
Calculating LSD:
MSE = 8894.44; n1 = n2 = n3 = 20.
t.05/2,60−3 = TINV(.05,57) = 2.002
LSD = (2.002)[8894.44(1/20 + 1/20)]^.5 = 59.72

Testing the differences:
|x̄1 − x̄2| = 75.45 > 59.72
|x̄1 − x̄3| = 31.10 < 59.72
|x̄2 − x̄3| = 44.35 < 59.72