Dr. Ka-fu Wong
ECON1003
Analysis of Economic Data
Ka-fu Wong © 2003
Chap 12- 1
Chapter Twelve
Analysis of Variance
GOALS
1. Discuss the general idea of analysis of variance.
2. List the characteristics of the F distribution.
3. Conduct a test of hypothesis to determine whether the variances of two populations are equal.
4. Organize data into a one-way and a two-way ANOVA table.
5. Define and understand the terms treatments and blocks.
6. Conduct a test of hypothesis among three or more treatment means.
7. Develop confidence intervals for the difference between treatment means.
8. Conduct a test of hypothesis to determine if there is a difference among block means.
Two Sample Tests
[Figure: two panels, TEST FOR EQUAL VARIANCES and TEST FOR EQUAL MEANS, each showing Population 1 and Population 2 under H0 and under H1.]
Characteristics of F-Distribution
• There is a “family” of F distributions.
• Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.
• F cannot be negative, and it is a continuous distribution.
• The F distribution is positively skewed.
• Its values range from 0 to $\infty$. As $F \to \infty$, the curve approaches the X-axis.
The F-Distribution, F(m,n)
• Not symmetric (skewed to the right).
• Each member of the family is determined by two parameters: the numerator degrees of freedom (m) and the denominator degrees of freedom (n).
• Nonnegative values only.
[Figure: density curve of F(m,n), with 0 and 1.0 marked on the F axis and a tail area $\alpha$.]
Test for Equal Variances
• For the two-tailed test, the test statistic is given by:
\[ F = \frac{\text{larger of } (s_1^2, s_2^2)}{\text{smaller of } (s_1^2, s_2^2)} \]
where $s_1^2$ and $s_2^2$ are the sample variances for the two samples.
• The null hypothesis is rejected at significance level $\alpha$ if the computed value of the test statistic is greater than the critical value with upper-tail area $\alpha/2$ and the corresponding numerator and denominator degrees of freedom.
Test for Equal Variances
• For the one-tailed test, the test statistic is given by:
\[ F = \frac{s_1^2}{s_2^2} \quad \text{if } H_1: \sigma_1^2 > \sigma_2^2 \]
where $s_1^2$ and $s_2^2$ are the sample variances for the two samples.
• The null hypothesis is rejected at significance level $\alpha$ if the computed value of the test statistic is greater than the critical value with upper-tail area $\alpha$ and the corresponding numerator and denominator degrees of freedom.
EXAMPLE 1
• Colin, a stockbroker at Critical Securities, reported that the mean rate of return on a sample of 10 internet stocks was 12.6 percent with a standard deviation of 3.9 percent. The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent. At the .05 significance level, can Colin conclude that there is more variation in the internet stocks?
EXAMPLE 1
continued
• Step 1: The hypotheses are:
\[ H_0: \sigma_I^2 \le \sigma_U^2 \qquad H_1: \sigma_I^2 > \sigma_U^2 \]
• Step 2: The significance level is .05.
• Step 3: The test statistic follows the F distribution.
• Step 4: $H_0$ is rejected if F > 3.68. The degrees of freedom are 9 in the numerator and 7 in the denominator.
• Step 5: The value of F is
\[ F = \frac{(3.9)^2}{(3.5)^2} = 1.2416 \]
$H_0$ is not rejected. There is insufficient evidence to show more variation in the internet stocks.
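The one-tailed variance test in Example 1 can be retraced with a few lines of Python. This is only a sketch: the function name `variance_ratio` is an illustrative choice, and the critical value 3.68 is copied from Step 4 rather than computed.

```python
def variance_ratio(s_num: float, s_den: float) -> float:
    """F statistic s_num^2 / s_den^2 built from two sample standard deviations."""
    return s_num ** 2 / s_den ** 2


# Figures quoted in Example 1.
s_internet, n_internet = 3.9, 10
s_utility, n_utility = 3.5, 8

f_stat = variance_ratio(s_internet, s_utility)
df_num, df_den = n_internet - 1, n_utility - 1

# Critical value for 9 and 7 degrees of freedom, taken from Step 4 (not computed here).
f_crit = 3.68
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.4f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

Putting the larger hypothesised variance in the numerator matches the one-tailed form of the test, so only the upper tail of the F distribution needs to be consulted.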
Analysis of Variance
(ANOVA)
Underlying Assumptions for ANOVA
• The F distribution is also used for testing whether two or more sample means came from the same or equal populations.
  - The test asks whether any group mean differs from the mean of all groups combined.
  - It answers: “Are all groups equal or not?”
• This technique is called analysis of variance or ANOVA.
• ANOVA requires the following conditions:
  - The sampled populations follow the normal distribution.
  - The populations have equal standard deviations.
  - The samples are randomly selected and are independent.
The hypothesis
• Suppose that we have independent samples of $n_1, n_2, \ldots, n_K$ observations from K populations. If the population means are denoted by $\mu_1, \mu_2, \ldots, \mu_K$, the one-way analysis of variance framework is designed to test the null hypothesis
\[ H_0: \mu_1 = \mu_2 = \cdots = \mu_K \]
\[ H_1: \mu_i \ne \mu_j \ \text{for at least one pair } \mu_i, \mu_j \]
Sample Observations from Independent
Random Samples of K Populations
Callouts on the slide: “Same !!” and “unequal !!”

Population            1                                     2                                     ...   K
Mean                  $\mu_1$                               $\mu_2$                               ...   $\mu_K$
Variance              $\sigma^2$                            $\sigma^2$                            ...   $\sigma^2$
Sample observations   $x_{11}, x_{12}, \ldots, x_{1n_1}$    $x_{21}, x_{22}, \ldots, x_{2n_2}$    ...   $x_{K1}, x_{K2}, \ldots, x_{Kn_K}$
Sample size           $n_1$                                 $n_2$                                 ...   $n_K$

Unequal number of observations in the K samples in general.
$n_T = n_1 + \cdots + n_K$
Sum of Squares Decomposition for One-Way Analysis of Variance
• Suppose that we have independent samples of $n_1, n_2, \ldots, n_K$ observations from K populations.
• Denote by $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K$ the K group sample means and by $\bar{x}$ the overall sample mean. We define the following sums of squares:
Sum of Squares Error (Within-Groups):
\[ SSE = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \]
Sum of Squares Treatment (Between-Groups):
\[ SST = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 \]
Sum of Squares Total:
\[ SSTotal = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 \]
where $x_{ij}$ denotes the jth sample observation in the ith group.
A Numerical Example of Sum of Squares Decomposition

Population                  1            2             3
Mean                        $\mu_1$      $\mu_2$       $\mu_3$
Variance                    $\sigma^2$   $\sigma^2$    $\sigma^2$
Sample obs ($x_{ij}$)       1, 2, 3      2, 3, 4, 5    1, 3, 5
Sample size ($n_i$)         3            4             3
Sample mean                 2            3.5           3
Grand mean: 2.9

\[ SSTotal = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = [(1-2.9)^2 + \cdots + (3-2.9)^2] + \cdots + [(1-2.9)^2 + \cdots + (5-2.9)^2] = 18.9 \]
\[ SST = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 = 3(2-2.9)^2 + 4(3.5-2.9)^2 + 3(3-2.9)^2 = 3.9 \]
\[ SSE = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = [(1-2)^2 + \cdots + (3-2)^2] + \cdots + [(1-3)^2 + \cdots + (5-3)^2] = 15 \]
SSTotal = SST + SSE
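The numerical example can be checked directly from the ten observations. The sketch below uses only the Python standard library; the function name `sum_of_squares` is an illustrative choice, not part of the course material. Up to floating-point rounding it should reproduce SST = 3.9, SSE = 15 and SSTotal = 18.9.

```python
from statistics import fmean


def sum_of_squares(groups):
    """Return (SST, SSE, SSTotal) for a list of samples, one list per group."""
    all_obs = [x for g in groups for x in g]
    grand_mean = fmean(all_obs)
    # Between-groups: squared deviation of each group mean, weighted by group size.
    sst = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: squared deviation of each observation from its own group mean.
    sse = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    # Total: squared deviation of each observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
    return sst, sse, ss_total


groups = [[1, 2, 3], [2, 3, 4, 5], [1, 3, 5]]  # data from the numerical example
sst, sse, ss_total = sum_of_squares(groups)
print(round(sst, 2), round(sse, 2), round(ss_total, 2))
```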
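A proof of SSTotal = SST + SSE
(Data layout as before: population i contributes the observations $x_{i1}, x_{i2}, \ldots, x_{in_i}$, with sample size $n_i$, for i = 1, 2, ..., K.)
\begin{align*}
SSTotal &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i + \bar{x}_i - \bar{x})^2 \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} [(x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{x})]^2 \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} [(x_{ij} - \bar{x}_i)^2 + (\bar{x}_i - \bar{x})^2 + 2(x_{ij} - \bar{x}_i)(\bar{x}_i - \bar{x})] \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 + \sum_{i=1}^{K} \sum_{j=1}^{n_i} (\bar{x}_i - \bar{x})^2 + \sum_{i=1}^{K} \sum_{j=1}^{n_i} 2(x_{ij} - \bar{x}_i)(\bar{x}_i - \bar{x}) \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 + \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 + 2 \sum_{i=1}^{K} (\bar{x}_i - \bar{x}) \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i) \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 + \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 \\
&= SSE + SST
\end{align*}
The cross term vanishes because, within each group,
\[ \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i) = \sum_{j=1}^{n_i} x_{ij} - \sum_{j=1}^{n_i} \bar{x}_i = n_i \bar{x}_i - n_i \bar{x}_i = 0 \]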
Two Ways to estimate the population variance
• Note that the variance is assumed to be identical across populations.
• If the population means are identical, we have two ways to estimate the population variance:
  - Based on the K sample variances.
  - Based on the deviation of the K sample means from the grand mean.
An estimate of the population variance based on sample variances
• Any one of the K sample variances can be used to estimate the population variance.
\[ \hat{\sigma}^2 = s_i^2 = \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 / (n_i - 1) \]
• We can get a more precise estimate if we use all the information from the K samples.
\begin{align*}
\hat{\sigma}^2 &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \Big/ \sum_{i=1}^{K} (n_i - 1) \\
&= \sum_{i=1}^{K} (n_i - 1) s_i^2 \Big/ \Big( \sum_{i=1}^{K} n_i - K \Big) \\
&= SSE \Big/ \Big( \sum_{i=1}^{K} n_i - K \Big)
\end{align*}
An estimate of the population variance based on deviation of the K sample means from the grand sample mean
• If the sample sizes are the same for all samples (each equal to n), the Central Limit Theorem suggests that each sample mean will be approximately normally distributed, with mean equal to the population mean and variance equal to the population variance divided by the sample size, $\sigma^2/n$. Hence n times the sample variance of the K sample means estimates $\sigma^2$:
\[ \hat{\sigma}^2 = n \sum_{i=1}^{K} (\bar{x}_i - \bar{x})^2 / (K - 1) \]
• When sample sizes are different across samples, we have to weight each squared deviation by its sample size:
\[ \hat{\sigma}^2 = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 / (K - 1) = SST / (K - 1) \]
Comparing the Variance Estimates: The F Test
• If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of the ratio of the two variance estimates follows the F distribution with $K - 1$ numerator and $n_T - K$ denominator degrees of freedom.
\[ F\text{-stat} = \frac{SST/(K-1)}{SSE/(\sum_{i=1}^{K} n_i - K)} = \frac{SST/(K-1)}{SSE/(n_T - K)} \]
• If the means of the K populations are not equal, the value of F-stat will be inflated because SST/(K-1) will overestimate $\sigma^2$.
• Hence, we will reject $H_0$ if the resulting value of F-stat appears to be too large to have been selected at random from the appropriate F distribution.
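To make the two variance estimates concrete, the sketch below computes both of them, and their ratio, for the three small samples used in the numerical sum-of-squares example. It relies only on the standard library; the variable names `within` and `between` are illustrative labels for SSE/(nT-K) and SST/(K-1).

```python
from statistics import fmean, variance

groups = [[1, 2, 3], [2, 3, 4, 5], [1, 3, 5]]  # data from the numerical example
K = len(groups)
n_T = sum(len(g) for g in groups)
grand_mean = fmean([x for g in groups for x in g])

# Within-groups estimate: pooled sample variances, which equals SSE / (n_T - K).
within = sum((len(g) - 1) * variance(g) for g in groups) / (n_T - K)

# Between-groups estimate: size-weighted deviations of group means, SST / (K - 1).
between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups) / (K - 1)

f_stat = between / within
print(f"within = {within:.4f}, between = {between:.4f}, F = {f_stat:.4f}")
```

For these data the between-groups estimate is smaller than the within-groups estimate, so the ratio falls below 1 and gives no indication that the group means differ.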
Test for the Equality of K Population Means
• Hypotheses
  $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_K$
  $H_1$: Not all population means are equal
• Test Statistic
  $F = [SST/(K-1)] / [SSE/(n_T-K)]$
• Rejection Rule
  Reject $H_0$ if $F > F_\alpha$, where the value of $F_\alpha$ is based on an F distribution with $K - 1$ numerator degrees of freedom and $n_T - K$ denominator degrees of freedom.
Sampling Distribution of MST/MSE
The figure below shows the rejection region associated with a level of significance equal to $\alpha$, where $F_\alpha$ denotes the critical value.
[Figure: sampling distribution of MST/MSE. Values below the critical value $F_\alpha$ lie in the “Do Not Reject $H_0$” region; values above it lie in the “Reject $H_0$” region.]
The ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F
Treatment             SST              K-1                  MST            MST/MSE
Error                 SSE              nT-K                 MSE
Total                 SSTotal          nT-1
Does learning method affect students' exam scores?
• Consider 3 methods:
  - standard
  - osmosis
  - shock therapy
• Convince 15 students to take part. Assign 5 students randomly to each method.
• Wait eight weeks. Then, test students to get exam scores.
• Are the three learning methods equally effective?
  - i.e., are the population mean exam scores the same?
“Analysis of Variance” (Study #1)
The variation between the group means and the
grand mean is larger than the variation within
each of the groups.
ANOVA Table for Study #1
One-way Analysis of Variance

Source   DF   SS       MS       F       P
Factor    2   2510.5   1255.3   93.44   0.000
Error    12    161.2     13.4
Total    14   2671.7

“Source” means “find the components of variation in this column”
“DF” means “degrees of freedom”
“SS” means “sums of squares”
“MS” means “mean squares”
“F” means “F test statistic”
“P” means “P-value”
ANOVA Table for Study #1
One-way Analysis of Variance

Source   DF   SS       MS       F       P
Factor    2   2510.5   1255.3   93.44   0.000
Error    12    161.2     13.4
Total    14   2671.7

“Factor” means “Variability between groups” or “Variability due to the factor of interest”
“Error” means “Variability within groups” or “unexplained random variation”
“Total” means “Total variation from the grand mean”
ANOVA Table for Study #1
One-way Analysis of Variance

Source   DF   SS       MS       F       P
Factor    2   2510.5   1255.3   93.44   0.000
Error    12    161.2     13.4
Total    14   2671.7

DF: 14 = 2 + 12
SS: 2671.7 = 2510.5 + 161.2
MS: 1255.3 ≈ 2510.5/2 and 13.4 ≈ 161.2/12 (rounded to one decimal place)
F: 93.44 = MS(Factor)/MS(Error), shown on the slide as 1255.3/13.4 and computed from the unrounded mean squares
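The arithmetic that links the columns of a one-way ANOVA table can be written as a small helper. This is a sketch with a hypothetical function name, `complete_anova_table`; it takes the two sums of squares and their degrees of freedom from the Study #1 table and derives the remaining entries.

```python
def complete_anova_table(ss_factor, ss_error, df_factor, df_error):
    """Derive totals, mean squares and F from the SS and DF columns."""
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "df_total": df_factor + df_error,
        "ss_total": ss_factor + ss_error,
        "ms_factor": ms_factor,
        "ms_error": ms_error,
        "f": ms_factor / ms_error,  # uses the unrounded mean squares
    }


# SS and DF values from the Study #1 table.
study1 = complete_anova_table(ss_factor=2510.5, ss_error=161.2, df_factor=2, df_error=12)
for name, value in study1.items():
    print(f"{name}: {value:.2f}")
```

Dividing the rounded mean squares printed in the table (1255.3/13.4) gives a slightly different ratio, which is why the helper keeps full precision until the end.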
“Analysis of Variance” (Study #2)
The variation between the group means and the
grand mean is smaller than the variation within
each of the groups.
ANOVA Table for Study #2
One-way Analysis of Variance

Source   DF   SS       MS     F      P
Factor    2     80.1   40.1   0.46   0.643
Error    12   1050.8   87.6
Total    14   1130.9

The P-value is large, so we cannot reject the null hypothesis. There is insufficient evidence to conclude that the average exam scores differ for the three learning methods.
Do Holocaust survivors have more sleep
problems than others?
ANOVA Table for Sleep Study
One-way Analysis of Variance

Source   DF    SS       MS      F       P
Factor     2   1723.8   861.9   61.69   0.000
Error    117   1634.8    14.0
Total    119   3358.6

The P-value is so small that we reject the null hypothesis of equal population means and favor the alternative hypothesis that at least one pair of population means differs.
Potential problem with the analysis
• What is driving the rejection of the null of equal population means?
  - From the plot, the Healthy and Depress groups seem to have different mean sleep quality. It appears that the rejection is due to the difference between these two groups.
  - If we pooled Healthy and Depress, the distribution would look more like Survivor. That is, a failure to reject the null would be more likely.
• This example illustrates that we have to be careful about our analysis and interpretation of the result when we conduct a test of equal population means.
EXAMPLE 2
• Rosenbaum Restaurants specialize in meals for senior citizens. Katy Polsby, President, recently developed a new meat loaf dinner. Before making it a part of the regular menu, she decides to test it in several of her restaurants. She would like to know if there is a difference in the mean number of dinners sold per day at the Aynor, Loris, and Lander restaurants. Use the .05 significance level.
Example 2
continued
# of dinners sold per day

Obs   Aynor   Loris   Lander
1     13      10      18
2     12      12      16
3     14      13      17
4     12      11      17
5                     17
EXAMPLE 2
continued
• Step 1: $H_0: \mu_1 = \mu_2 = \mu_3$
  $H_1$: Treatment means are not all the same
• Step 2: $H_0$ is rejected if F > 4.10. There are 2 df in the numerator and 10 df in the denominator.
Example 2
continued
• To find the value of F:

Source      SS      df   MS       F       p-value
Treatment   76.25    2   38.125   39.10   1.87E-05
Error        9.75   10    0.975
Total       86.00   12

• The decision is to reject the null hypothesis.
  - The treatment means are not the same.
  - The mean number of meals sold at the three locations is not the same.
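The ANOVA table for Example 2 can be rebuilt from the raw sales figures. The following is a minimal sketch using only the standard library; `one_way_anova` is an illustrative name, and the critical value 4.10 is copied from Step 2 rather than computed.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return SST, SSE, their degrees of freedom, and F = MST / MSE."""
    all_obs = [x for g in groups for x in g]
    grand_mean = fmean(all_obs)
    sst = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_treat = len(groups) - 1
    df_error = len(all_obs) - len(groups)
    f_stat = (sst / df_treat) / (sse / df_error)
    return sst, sse, df_treat, df_error, f_stat


# Dinners sold per day, from the Example 2 data table.
dinners = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}
sst, sse, df_treat, df_error, f_stat = one_way_anova(list(dinners.values()))

f_crit = 4.10  # critical value for 2 and 10 df, quoted in Step 2
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, F = {f_stat:.2f}, reject H0: {f_stat > f_crit}")
```

The function accepts groups of unequal size, which matters here because Lander has five observations while the other two restaurants have four.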
Inferences About Treatment Means
• When we reject the null hypothesis that the means are equal, we may want to know which treatment means differ.
• One of the simplest procedures is through the use of confidence intervals.
Confidence Interval for the Difference Between Two Means
\[ (\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]
• where t is obtained from the t table with degrees of freedom $(n_T - K)$, because
• $MSE = SSE/(n_T - K)$
EXAMPLE 3
• From EXAMPLE 2, develop a 95% confidence interval for the difference in the mean number of meat loaf dinners sold in Lander and Aynor. Can Katy conclude that there is a difference between the two restaurants?
\[ (17 - 12.75) \pm 2.228 \sqrt{0.975 \left( \frac{1}{4} + \frac{1}{5} \right)} \]
\[ 4.25 \pm 1.48 \Rightarrow (2.77, 5.73) \]
• Because zero is not in the interval, we conclude that this pair of means differs.
• The mean number of meals sold in Aynor is different from that in Lander.
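The interval in Example 3 follows from the formula on the preceding slide. In the sketch below, the helper name `mean_difference_ci` is hypothetical, and the t value 2.228 (10 degrees of freedom) is taken from the example rather than computed.

```python
from math import sqrt


def mean_difference_ci(mean_1, mean_2, n_1, n_2, mse, t_crit):
    """Confidence interval for mean_1 - mean_2 using the ANOVA mean square error."""
    half_width = t_crit * sqrt(mse * (1 / n_1 + 1 / n_2))
    diff = mean_1 - mean_2
    return diff - half_width, diff + half_width


# Lander (mean 17, n = 5) versus Aynor (mean 12.75, n = 4); MSE = 0.975 from Example 2.
lower, upper = mean_difference_ci(17, 12.75, 5, 4, mse=0.975, t_crit=2.228)
print(f"({lower:.2f}, {upper:.2f})")
```

Using the pooled MSE rather than the two restaurants' own sample variances is what ties the interval to the ANOVA table and to its $n_T - K$ degrees of freedom.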
Two-Factor ANOVA
• For the two-factor ANOVA we test whether there is a significant difference among the treatment effects and whether there is a difference among the block effects.
Sample Observations from Independent
Random Samples of K Populations
                 TREATMENT
BLOCK   1          2          ...   K
1       $x_{11}$   $x_{21}$   ...   $x_{K1}$
2       $x_{12}$   $x_{22}$   ...   $x_{K2}$
...     ...        ...        ...   ...
B       $x_{1B}$   $x_{2B}$   ...   $x_{KB}$
Sum of Squares Decomposition for Two-Way Analysis of Variance
• Suppose that we have a sample of observations with $x_{ij}$ denoting the observation in the ith group and jth block. Suppose that there are K groups and B blocks, for a total of n = KB observations. Denote the group sample means by $\bar{x}_{i\cdot}$ (i = 1, 2, ..., K), the block sample means by $\bar{x}_{\cdot j}$ (j = 1, 2, ..., B), and the overall sample mean by $\bar{x}$.
\[ SSTotal = \sum_{i=1}^{K} \sum_{j=1}^{B} (x_{ij} - \bar{x})^2 \]
\[ SST = B \sum_{i=1}^{K} (\bar{x}_{i\cdot} - \bar{x})^2 \]
\[ SSB = K \sum_{j=1}^{B} (\bar{x}_{\cdot j} - \bar{x})^2 \]
\[ SSE = \sum_{i=1}^{K} \sum_{j=1}^{B} (x_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x})^2 \]
SSTotal = SSE + SST + SSB
General Format of Two-Way Analysis of Variance Table

Source of Variation   Sums of Squares   Degrees of Freedom   Mean Squares               F Ratios
Treatments            SST               K-1                  MST = SST/(K-1)            MST/MSE
Blocks                SSB               B-1                  MSB = SSB/(B-1)            MSB/MSE
Error                 SSE               (K-1)(B-1)           MSE = SSE/[(K-1)(B-1)]
Total                 SSTotal           nT-1
EXAMPLE 4
• The Bieber Manufacturing Co. operates 24 hours a day, five days a week. The workers rotate shifts each week. Todd Bieber, the owner, is interested in whether there is a difference in the number of units produced when the employees work on various shifts. A sample of five workers is selected and their output recorded on each shift. At the .05 significance level, can we conclude there is a difference in the mean production by shift and in the mean production by employee?
EXAMPLE 4
continued

Employee    Day Output   Evening Output   Night Output
McCartney   31           25               35
Neary       33           26               33
Schoen      28           24               30
Thompson    30           29               28
Wagner      28           26               27
EXAMPLE 4
continued
• TREATMENT EFFECT
  - Step 1: $H_0: \mu_1 = \mu_2 = \mu_3$ versus $H_1$: Not all means are equal.
  - Step 2: $H_0$ is rejected if F > 4.46; the degrees of freedom are 2 and 8.
Example 4
continued
• Step 3: Compute the various sums of squares:

Source       SS       df   MS       F      p-value
Treatments    62.53    2   31.267   5.75   .0283
Blocks        33.73    4    8.433   1.55   .2762
Error         43.47    8    5.433
Total        139.73   14

• Step 4: $H_0$ is rejected. There is a difference in the mean number of units produced for the different time periods.
EXAMPLE 4
continued
• Block Effect:
  - Step 1: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ versus $H_1$: Not all means are equal.
  - Step 2: $H_0$ is rejected if F > 3.84; the degrees of freedom are 4 and 8.
  - Step 3: F = [33.73/4]/[43.47/8] = 1.55
  - Step 4: $H_0$ is not rejected. There is no significant difference in the average number of units produced for the different employees.
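Both F ratios in Example 4 can be reproduced from the output table with the two-way sums of squares defined earlier. This is a sketch under the assumption of one observation per employee and shift; the function name `two_way_anova` and the dictionary keys are illustrative choices.

```python
from statistics import fmean


def two_way_anova(table):
    """table[j][i] holds the observation for block j (row) and treatment i (column)."""
    B, K = len(table), len(table[0])
    grand = fmean([x for row in table for x in row])
    treat_means = [fmean([row[i] for row in table]) for i in range(K)]
    block_means = [fmean(row) for row in table]
    sst = B * sum((m - grand) ** 2 for m in treat_means)
    ssb = K * sum((m - grand) ** 2 for m in block_means)
    sse = sum(
        (table[j][i] - treat_means[i] - block_means[j] + grand) ** 2
        for j in range(B)
        for i in range(K)
    )
    mse = sse / ((K - 1) * (B - 1))
    return {
        "SST": sst,
        "SSB": ssb,
        "SSE": sse,
        "F_treatment": (sst / (K - 1)) / mse,
        "F_block": (ssb / (B - 1)) / mse,
    }


# Rows: McCartney, Neary, Schoen, Thompson, Wagner; columns: day, evening, night output.
output = [
    [31, 25, 35],
    [33, 26, 33],
    [28, 24, 30],
    [30, 29, 28],
    [28, 26, 27],
]
result = two_way_anova(output)
print({name: round(value, 2) for name, value in result.items()})
```

Comparing `F_treatment` with the quoted critical value 4.46 and `F_block` with 3.84 leads to the same two decisions as in the example: the shift effect is significant and the employee effect is not.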
Chapter Twelve
Analysis of Variance
- END -