Quantitative Methods II
WELCOME!
Lecture 13
Thommy Perlinger
Parametric tests (tests for the mean)
Nature and number of variables
One-way vs. two-way ANOVA
One-way ANOVA
Y1 ← X1
One dependent variable (metric); one independent/explanatory variable (categorical, e.g. different groups)
Two-way ANOVA
Y1 ← X1, X2
One dependent variable (metric); two independent/explanatory variables = factors (categorical)
One-way Analysis of Variance (ANOVA)
H0: µ1 = µ2 = … = µk (all k population means are equal)
Ha: The population means are not all equal
Assumptions:
• Independent random samples
• The variable is Normally distributed in each group
• Normally distributed residuals
• Equal variance in all groups (homogeneity)
Advantages:
• Robust to slight deviations from Normality of the variable when the groups are of equal size
• Robust to slight deviations from equality of variances
Disadvantage:
• Not appropriate if the variability is large (use a nonparametric test instead)
The basic principles of ANOVA
If we have three fertilisers, and we wish to compare their
efficacy, this could be done by a field experiment.
The three fertilisers are
applied to 10 plots each.
The 30 plots are later
harvested, with the crop yield
being calculated for each plot.
The basic principles of ANOVA
We now have three groups, with ten values (crop yield)
in each group, and we wish to know if there are any
differences between these groups.
The fertilisers do
differ in the amount
of yield produced.
But there is also a
lot of variation
between plots given
the same fertiliser.
The basic principles of ANOVA
The variability quantifies the spread of the data points
around the mean.
Variability – sum of squares
To measure the variability, first the mean is calculated,
then the deviation of each point from the mean.
The deviations are squared and then summed; this sum is a useful measure of variability. The sum increases with the scatter of the data points around the mean.
This quantity is referred to as a sum of squares (SS),
and is central to our analysis.
Variance
The SS however cannot be used as a comparative
measure between groups, since it clearly will be influenced
by the number of data points in the group. Instead, this
quantity is converted to a variance by dividing by n − 1
The variance is therefore a measure of variability, taking
account of the size of the dataset.
But why don’t we divide by the actual size, n?
We actually do not have n independent pieces of
information about the variance.
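To make the calculation concrete, here is a minimal sketch in Python (not part of the original slides; the yield values are hypothetical) that computes the sum of squares and the sample variance directly from their definitions:

import numpy as np

# Hypothetical crop yields from one group of plots
yields = np.array([4.2, 5.1, 3.8, 4.9, 4.4, 5.0, 4.6, 4.1, 4.8, 4.5])

mean = yields.mean()
deviations = yields - mean            # deviation of each point from the mean
ss = np.sum(deviations ** 2)          # sum of squares (SS)
variance = ss / (len(yields) - 1)     # divide by the degrees of freedom, n - 1
print(ss, variance)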
Degrees of freedom – number of
independent pieces of information
The first step was to calculate a mean (from the
n independent pieces of data collected).
The second step is to calculate a variance with reference
to that mean. If n − 1 deviations are calculated, it is known
what the final deviation must be, for they must all add up
to zero by definition.
So we have only n − 1 independent pieces of information
on the variability about the mean.
The number of independent pieces of information contributing
to a statistic is referred to as the degrees of freedom.
Partitioning the variance
In ANOVA, it is useful to keep the measure of variability in its two components: a sum of squares, and the degrees of freedom associated with that sum of squares.
The variance is also partitioned, i.e. divided into two parts:
• the variance due to the effect we are interested in (e.g. different fertilisers), and
• the variance due to any other factors.
To illustrate the principle behind partitioning the variability,
first consider two extreme datasets.
Grand mean
If there was almost no variation between the plots due to
any of the other factors, and nearly all variation was due
to the application of the three fertilisers, then the data
would follow the pattern shown on this slide.
Grand mean vs. group means
The first step would be to calculate a grand mean (as on
the previous slide), and there is considerable variation
around this mean.
The second step is to calculate the three group means
that we wish to compare: that is, the means for the plots
given fertilisers A, B and C.
Group means
It can be seen that once these means are fitted, little
variation is left around the group means.
Amount of variability explained
In other words, fitting the group means has removed or
explained nearly all the variability in the data.
This has happened because the three means are distinct.
Now consider the other extreme, in which the three
fertilisers are, in fact, identical.
Grand mean
Once again, the first step is to fit a grand mean and
calculate the sum of squares.
Group means
Second, three group means are fitted, only to find that
there is almost as much variability as before.
Amount of variability explained
Little variability has been explained.
This has happened because the three means are relatively
close to each other (compared to the scatter of the data).
Amount of variability explained
The amount of variability that has been explained can be
quantified directly by measuring the scatter of the group
means around the grand mean.
In the first example, the deviations
of the group means around the
grand mean are considerable.
In the second example, these deviations are relatively small.
Group means
Now consider a third example, an intermediate situation.
In this situation it is not immediately obvious if the
fertilisers have had an influence on yield.
Significant amount of variability
explained?
There is an obvious reduction in variability around the
three means (compared to the one mean).
But at what point do we decide that the amount of variation
explained by fitting the three means is significant?
The word significant, in this context, actually has a
technical meaning.
It means ‘When is the variability between the group means
greater than what we would expect by chance alone?’
Three measures of variability
SSB = Sum of Squares between groups
Sum of squared deviations of the group means from the grand mean. A measure of the variation between the different groups (e.g. the variation between plots given different fertilisers).
SSW = Sum of Squares within groups
Sum of squared deviations of the data around the separate group means. A measure of the variation within each group (e.g. the variation between the different plots that are given the same fertiliser).
Three measures of variability
SST = Total Sum of Squares
Sum of squared deviations of the data around the grand
mean. A measure of the total variability in the dataset.
SST = SSB + SSW
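As an illustration (a sketch with hypothetical data, not from the original slides), the three sums of squares can be computed directly and the identity SST = SSB + SSW verified:

import numpy as np

# Hypothetical yields for three fertiliser groups (10 plots each)
a = np.array([5.1, 4.8, 5.3, 4.9, 5.0, 5.2, 4.7, 5.1, 4.9, 5.0])
b = np.array([4.2, 4.0, 4.4, 4.1, 4.3, 3.9, 4.2, 4.0, 4.1, 4.3])
c = np.array([3.1, 3.3, 2.9, 3.2, 3.0, 3.1, 3.4, 2.8, 3.0, 3.2])
groups = [a, b, c]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()

# SST: deviations of all data around the grand mean
sst = np.sum((all_data - grand_mean) ** 2)
# SSB: deviations of the group means around the grand mean (weighted by group size)
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SSW: deviations of the data around their own group means
ssw = sum(np.sum((g - g.mean()) ** 2) for g in groups)

print(sst, ssb + ssw)   # the two numbers agree: SST = SSB + SSW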
Partitioning the variability
In the first example, SSW was small (small variation within
the groups) and SSB was large (large variation between
the groups)
Small variation within
the groups (SSW)
Large variation between
the groups (SSB)
Partitioning the variability
In the second example, SSW was large (large variation
within the groups) and SSB was small (small variation
between the groups)
Large variation within
the groups (SSW)
Small variation between
the groups (SSB)
Significant amount of variability
explained?
So, if the variability between the group means is
greater than that we would expect by chance alone, there
is a significant difference between the group means.
For a valid comparison between the two sources of variability, we of course need to compare the variability adjusted for the degrees of freedom, i.e. the variances.
Partitioning the degrees of freedom
The first step in any analysis of variance is to calculate
SST. This is done with n-1 degrees of freedom.
The second step is to calculate the three group means.
When the deviations of two of the three treatment means
from the grand mean have been calculated, the third is
predetermined. Therefore, calculating SSB (the deviation
of the group means from the grand mean) has 2 df
associated with it, or more formally k-1 df (k = number of
groups).
Partitioning the degrees of freedom
Finally, SSW measures variation around the different
group means.
Within each of these groups, the deviations sum to zero.
For any number of deviations within the group, the last is
always predetermined. Thus SSW has n - k df associated
with it (k=number of groups).
Mean sum of squares
Combining the information from the sum of squares and the degrees of freedom, we get the mean sum of squares (mean square).
MST = Total Mean Square
The total variance in the dataset.
Three measures of variability
MSB = Mean Square between groups
The variance between the different groups (e.g. the variation between plots given different fertilisers, adjusted for the sample size).
MSW = Mean Square within groups
The variance within each group (e.g. the variation between the different plots that are given the same fertiliser, adjusted for the sample size).
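In formula form (standard definitions, with k groups and n observations in total):
MST = SST / (n − 1)
MSB = SSB / (k − 1)
MSW = SSW / (n − k)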
F-ratio
If none of the fertilisers influenced yield, then the variation
between plots treated with the same fertiliser would be
much the same as the variation between plots given
different fertilisers.
This can be expressed in terms of mean squares: the mean square within the groups would be about the same as the mean square between the groups.

F = MSB / MSW

The F-ratio would be approximately 1 if the group means are the same.
F-ratio
An F-ratio > 1 means that the between-group variance is larger than the within-group variance, suggesting that the group means differ.
An F-ratio < 1 means that the within-group variance is larger than the between-group variance; the relatively large spread within the groups makes it difficult to say that the group means are different.
The F-ratio is compared to the F-distribution to calculate
an appropriate P-value.
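As a sketch (continuing the hypothetical fertiliser example; the sums of squares below are roughly the values produced by the earlier sketch), the F-ratio and its P-value can be computed with scipy:

from scipy import stats

ssb, ssw = 18.1, 0.83     # hypothetical between- and within-group sums of squares
k, n = 3, 30              # 3 groups, 30 observations in total

msb = ssb / (k - 1)       # between-group mean square
msw = ssw / (n - k)       # within-group mean square
f_ratio = msb / msw

# P-value: probability of an F-ratio at least this large if H0 is true
p_value = stats.f.sf(f_ratio, k - 1, n - k)
print(f_ratio, p_value)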
F-distribution
The shape of the F-distribution depends on the df (both
within-group and between-group)
[Figure: F-distribution curves with 2 and 27 df, and with 10 and 57 df]
Example: Eyes and ad response
Research from a variety of fields has found significant
effects of eye gaze and eye color on emotions and
perceptions such as arousal, attractiveness, and honesty.
These findings suggest that a model’s eyes may play a
role in a viewer’s response to an ad.
Example: Eyes and ad response
In a recent study, 222 randomly chosen students at a certain university were each presented with one of four portfolios.
Each portfolio contained a target ad for a fictional product,
Sparkle Toothpaste.
The students were asked to view the ad and then respond
to questions concerning their attitudes and emotions
about the ad and product.
The variable of main interest is the viewer’s “attitudes
towards the brand”, an average of 10 survey questions,
on a 7-point scale.
Example: Eyes and ad response
The only difference in three of the ads was the model’s
eyes, which were made to be either brown, blue, or green.
In the fourth ad, the model is in the same pose but looking
downward so the eyes are not visible.
Group   n    Mean   Std.dev
Blue    67   3.19   1.75
Brown   37   3.72   1.73
Green   77   3.86   1.67
Down    41   3.11   1.53
Example: Eyes and ad response
In SPSS: Analyze >> Graphs >> Legacy Dialogs >> Boxplot. Choose ”Simple”.
Example: Eyes and ad response
H0: µblue = µbrown = µgreen = µdown (all 4 mean attitudes are equal in the population)
Ha: The population mean attitudes are not all equal
Assumptions:
• Independent random samples
• The attitude score is Normally distributed in all groups
  - To be checked by Normality plots and tests
• Normally distributed residuals
  - Save residuals from the analysis and investigate Normality plots and tests
• Equal variance in all groups
  - ?
Equal variances in ANOVA
Using formal tests for the equality of variances in several
groups is not recommended (they are largely affected e.g.
by deviations from Normality).
Since ANOVA is robust to slight deviations from equality of
variances, the following rule of thumb can be used:
If the largest standard deviation is less than twice the
smallest standard deviation, the results from the ANOVA
will be approximately correct.
OK to use ANOVA if s_largest < 2 × s_smallest
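The rule of thumb is easy to check programmatically; a small sketch using the standard deviations from the table above:

# Standard deviations of the attitude score in the four ad groups
std_devs = {"Blue": 1.75, "Brown": 1.73, "Green": 1.67, "Down": 1.53}

largest, smallest = max(std_devs.values()), min(std_devs.values())
print(largest < 2 * smallest)   # True -> ANOVA results will be approximately correct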
Example: Eyes and ad response
Group   n    Mean   Std.dev
Blue    67   3.19   1.75
Brown   37   3.72   1.73
Green   77   3.86   1.67
Down    41   3.11   1.53

None of these standard deviations is twice as large as any other.
Example: Eyes and ad response
H0: µblue = µbrown = µgreen = µdown (all 4 mean attitudes are equal in the population)
Ha: The population mean attitudes are not all equal
Assumptions:
• Independent random samples
• The attitude score is Normally distributed in all groups
  - To be checked by Normality plots and tests
• Normally distributed residuals
  - Save residuals from the analysis and investigate Normality plots and tests
• Equal variance in all groups
Example: Eyes and ad response
Significance level?
Wrongly rejecting the null hypothesis would mean that we
claim that the mean attitude towards the brand is different
depending on the eyes on the ad, when the attitudes in fact
are the same (on average).
Not a serious consequence, so the standard 5% significance level is fine to use.
Example: Eyes and ad response
ANOVA (dependent variable: Score)

Source           Sum of Squares   df    Mean Square   F       Sig.
Between Groups   24.420           3     8.140         2.894   .036
Within Groups    613.139          218   2.813
Total            637.558          221

F-ratio: > 1 if the between-group variance is larger than the within-group variance (which means that the group means are quite different).
P-value (Sig.): tells us if the group means are significantly different.
In SPSS: Analyze >> Compare Means >> One-way ANOVA.
Choose your variable of interest under ”Dependent List”, and the grouping variable
under ”Factor”.
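Outside SPSS, the same one-way ANOVA could be run in Python, for example with scipy. This is a sketch only; it assumes the data sit in a pandas DataFrame with the hypothetical column names score and group:

from scipy import stats

def one_way_anova(df):
    # Split the attitude scores by ad group and run the one-way ANOVA
    samples = [g["score"].values for _, g in df.groupby("group")]
    f_ratio, p_value = stats.f_oneway(*samples)
    return f_ratio, p_value

stats.f_oneway performs the same one-way ANOVA, returning the F-ratio and P-value reported in the SPSS table.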
Example: Eyes and ad response
Tests of Between-Subjects Effects
Dependent Variable: Score

Source            Type III Sum of Squares   df    Mean Square   F         Sig.
Corrected Model   24.420 a                  3     8.140         2.894     .036
Intercept         2430.423                  1     2430.423      864.131   .000
group             24.420                    3     8.140         2.894     .036
Error             613.139                   218   2.813
Total             3352.860                  222
Corrected Total   637.558                   221
a. R Squared = .038 (Adjusted R Squared = .025)

The ”group” row corresponds to the between-group variation; the ”Error” row corresponds to the within-group variation.
R² = the fraction of the overall variance (pooling all the groups) attributable to differences among the group means.
In SPSS: Analyze >> General Linear Models >> Univariate
Choose your explanatory variable as ”Fixed factors”
P-value (Sig.): tells us if the group means are significantly different.
Example: Eyes and ad response
The P-value is to be compared to the significance level
0.036 < 0.05
H0 rejected (significant result)
Conclusion:
The test result indicates that the mean attitude towards the
brand is different depending on the eyes on the ad (eye color,
or not seeing the eyes).
But which eye colors?
Again, pairwise tests can be used to find which groups differ.
Recap: Pairwise tests can be
performed after a multigroup test
If you find a significant difference using a multigroup test,
you can perform pairwise tests to find which
groups/occasions differ.
It is however very important not to start with the
pairwise testing, due to multiplicity issues.
First use e.g. ANOVA to find out if there are any
significant differences at all, then perform pairwise tests
to find where the differences are located.
This way you don’t have to adjust the significance level
for multiplicity, and can use e.g. 5% in every pairwise
comparison.
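A sketch of the pairwise step in Python (unadjusted two-sample t-tests at 5% each, as described above; the DataFrame and the column names score and group are the same hypothetical ones as before):

from itertools import combinations
from scipy import stats

def pairwise_t_tests(df, alpha=0.05):
    # Unadjusted pairwise t-tests, run only after a significant overall ANOVA
    results = {}
    for g1, g2 in combinations(df["group"].unique(), 2):
        a = df.loc[df["group"] == g1, "score"]
        b = df.loc[df["group"] == g2, "score"]
        t_stat, p_value = stats.ttest_ind(a, b)
        results[(g1, g2)] = (t_stat, p_value, p_value < alpha)
    return results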
Two-way Analysis of Variance (ANOVA)
There are three sets of hypothesis tests with the two-way
ANOVA
H01: µ1_F1 = µ2_F1 = … = µk_F1 (all k population means of the first factor are equal)
H02: µ1_F2 = µ2_F2 = … = µk_F2 (all k population means of the second factor are equal)
H03: There is no interaction between the two factors
The two explanatory variables in a two-way ANOVA are
called factors (categorical variables).
Two-way Analysis of Variance (ANOVA)
Assumptions:
• Independent random samples
• The variable is Normally distributed in each group
• Normally distributed residuals
• Equal variance in all groups (homogeneity)
Advantages:
• Robust to slight deviations from Normality when the groups are of equal size
• Robust to slight deviations from equality of variances
Disadvantage:
• Not appropriate if the variability is large (use a nonparametric test instead)
Example: Cardiovascular risk factors
A study of cardiovascular (heart disease) risk factors
compared runners who averaged at least 15 miles/week
with a control group (non-exercising).
Both men and women were included in the study.
Y = heart rate after 6 min of exercise
Factor 1 = exercise group (runners/control)
Factor 2 = gender
Example: Cardiovascular risk factors
           Women                          Men
Runners    Mean: 116 b/m, Std: 16.0 b/m   Mean: 104 b/m, Std: 12.5 b/m
Control    Mean: 148 b/m, Std: 16.3 b/m   Mean: 130 b/m, Std: 17.1 b/m

None of the standard deviations are twice as large as any other.
In SPSS: Analyze >> Graphs >> Legacy Dialogs >> Boxplot. Choose ”Clustered”.
Example: Cardiovascular risk factors
[Figure: Normality plots of heart rate for each group (female/male runners and controls)]
Heart rate seems to be Normally distributed in all groups.
In SPSS:
Data >> Split File. Mark ”Organize output by groups” and add the two factor variables.
Analyze >> Descriptive Statistics >> Explore. Add the dependent variable to ”Dependent List” and the two factor variables. Click ”Plots”, mark ”Normality plots with tests”.
Example: Cardiovascular risk factors
Residuals seem to be Normally distributed.
In SPSS: Analyze >> General Linear Models >> Univariate. Click ”Save”, mark
Standardized residuals. Then check the normality of the saved residuals.
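The same check could be scripted outside SPSS; a sketch using statsmodels and a Shapiro–Wilk test (the column names heart_rate, group, and gender are assumptions for illustration, not the dataset's actual names):

from scipy import stats
import statsmodels.formula.api as smf

def check_residual_normality(df):
    # Fit the two-way model and test its residuals for Normality
    model = smf.ols("heart_rate ~ C(group) * C(gender)", data=df).fit()
    # A large Shapiro-Wilk p-value is consistent with Normally distributed residuals
    return stats.shapiro(model.resid)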
Example: Cardiovascular risk factors
H01: µrunners = µcontrol
H02: µfemale = µmale
H03: There is no interaction effect on heart rate between exercise (running) and gender.
Assumptions:
• Independent random samples
• Heart rate is Normally distributed in all groups
• Normally distributed residuals
• Equal variance in all groups
Two-way Analysis of Variance (ANOVA)
There are two different effects measured with two-way
ANOVA:
• Main effect
• Interaction effect
Main effect
The main effect describes the effect of each explanatory variable on its own; the interaction is ignored for this part. This is the part that is similar to a one-way analysis of variance: each of the variances calculated to analyse the main effects is like a between-group variance.
So for two variables, there will be two main effects.
Interaction effect
The interaction effect describes how the effect of one factor
depends on the level of the other factor.
For two variables, there will be one interaction effect.
Example: Cardiovascular risk factors
Tests of Between-Subjects Effects
Dependent Variable: Heart rate

Source            Type III Sum of Squares   df    Mean Square     F           Sig.
Corrected Model   215256.090 a              3     71752.030       296.345     .000
Intercept         12398208.080              1     12398208.080    51206.259   .000
group             168432.080                1     168432.080      695.647     .000
gender            45030.005                 1     45030.005       185.980     .000
group * gender    1794.005                  1     1794.005        7.409       .007
Error             192729.830                796   242.123
Total             12806194.000              800
Corrected Total   407985.920                799
a. R Squared = .528 (Adjusted R Squared = .526)

The ”group” and ”gender” rows are the main effects; the ”group * gender” row is the interaction effect.
R² = the fraction of the overall variance (pooling all the groups) attributable to differences among the group means.
In SPSS: Analyze >> General Linear Models >> Univariate
Choose your explanatory variables as ”Fixed factors”
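The corresponding two-way ANOVA can be fitted in Python with statsmodels (a sketch; the column names heart_rate, group, and gender are assumed as before):

import statsmodels.api as sm
import statsmodels.formula.api as smf

def two_way_anova(df):
    # Model with both main effects and the group x gender interaction
    model = smf.ols("heart_rate ~ C(group) * C(gender)", data=df).fit()
    # typ=3 requests Type III sums of squares, as in the SPSS output
    # (exact agreement also depends on the contrast coding used)
    return sm.stats.anova_lm(model, typ=3)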
Example: Cardiovascular risk factors
If the lines are
parallel, there is no
interaction effect
In SPSS: Analyze >> General Linear Models >> Univariate
Click ”Plots”.
Add one of the factors to ”Horizontal axis”, the other to ”Separate Lines”, click ”Add”
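An equivalent interaction plot can be produced in Python (a sketch; same hypothetical column names as above):

import matplotlib.pyplot as plt
from statsmodels.graphics.factorplots import interaction_plot

def plot_interaction(df):
    # Mean heart rate by gender, one line per exercise group;
    # clearly non-parallel lines suggest an interaction effect
    fig = interaction_plot(x=df["gender"], trace=df["group"],
                           response=df["heart_rate"])
    plt.show()
    return fig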
Example: Cardiovascular risk factors
Conclusion:
There is a significant interaction effect between exercise
(running) and gender on heart rate (P-value 0.007).
Exercise has a larger effect on heart rate for females than
for males, on average.
Main effects analysis shows that both exercise and gender, separately, also have significant effects on average heart rate (both P-values < 0.001).
Men have on average lower heart rate than women, and
the running group has on average lower heart rate than the
control group.
Recap: Nature and number of variables
Analysis of variance (ANOVA/MANOVA)
ANOVA
Y1 ← X1 + X2 + X3 + … + Xn
One dependent variable (metric); several independent/explanatory variables (categorical)
MANOVA
Y1 + Y2 + Y3 + … + Yn ← X1 + X2 + X3 + … + Xn
Several dependent variables (metric); several independent/explanatory variables (categorical)
Recap: Dependence techniques
Several dependent variables in a single relationship

Measurement scale of the dependent variables:
• Metric
  - Predictor variable metric → Canonical correlation analysis (not included in this course)
  - Predictor variable nonmetric → Multivariate analysis of variance (MANOVA)
• Nonmetric
  - Canonical correlation analysis with dummy variables (not included in this course)