Download 1-Way Analysis of Variance

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Regression analysis wikipedia , lookup

Linear regression wikipedia , lookup

Transcript
Regression Approach To ANOVA
• Dummy (Indicator) Variables: Variables that take on the
value 1 if observation comes from a particular group, 0
if not.
• If there are I groups, we create I-1 dummy variables.
• Individuals in the “baseline” group receive 0 for all
dummy variables.
• Statistical software packages typically assign the “last”
(Ith) category as the baseline group
• Statistical Model: E(Y) = b0 + b1Z1+ ... + bI-1ZI-1
• Zi =1 if observation is from group i, 0 otherwise
• Mean for group i (i=1,...,I-1): mi = b0 + bi
• Mean for group I: mI = b0
2-Way ANOVA
• 2 nominal or ordinal factors are believed to
be related to a quantitative response
• Additive Effects: The effects of the levels of
each factor do not depend on the levels of
the other factor.
• Interaction: The effects of levels of each
factor depend on the levels of the other
factor
• Notation: mij is the mean response when
factor A is at level i and Factor B at j
Example - Thalidomide for AIDS
•
•
•
•
Response: 28-day weight gain in AIDS patients
Factor A: Drug: Thalidomide/Placebo
Factor B: TB Status of Patient: TB+/TBSubjects: 32 patients (16 TB+ and 16 TB-).
Random assignment of 8 from each group to
each drug). Data:
–
–
–
–
Thalidomide/TB+: 9,6,4.5,2,2.5,3,1,1.5
Thalidomide/TB-: 2.5,3.5,4,1,0.5,4,1.5,2
Placebo/TB+: 0,1,-1,-2,-3,-3,0.5,-2.5
Placebo/TB-: -0.5,0,2.5,0.5,-1.5,0,1,3.5
ANOVA Approach
• Total Variation (SST) is partitioned into 4
components:
– Factor A: Variation in means among levels of A
– Factor B: Variation in means among levels of B
– Interaction: Variation in means among combinations
of levels of A and B that are not due to A or B alone
– Error: Variation among subjects within the same
combinations of levels of A and B (Within SS)
ANOVA Calculations
• Balanced Data (n observations per treatment)
I


DFA  I  1


DFB  J  1
Factor A : SSA  nJ  x i..  x
2
i 1
J
Factor B : SSB  nJ  x. j .  x
2
j 1
I

J
AB Interactio n : SSAB   x ij.  x i..  x. j .  x

2
DFAB  ( I  1)( J  1)
i 1 j 1
I
J
n

Error : SSE   xijk  x ij

2
DFE  IJ (n  1)  N  IJ
i 1 j 1 k 1
I
J
n

Total : SST   xijk  x

2
DFT  N  1
i 1 j 1 k 1
SST  SSA  SSB  SSAB  SSE
DFT  DFA  DFB  DFAB  DFE
ANOVA Approach
General Notation: Factor A has I levels, B has J levels
Source
Factor A
Factor B
Interaction
Error
Total
df
I-1
J-1
(I-1)(J-1)
N-IJ
N-1
SS
SSA
SSB
SSAB
SSE
TSS
MS
MSA=SSA/(I-1)
MSB=SSB/(J-1)
MSAB=SSAB/[(I-1)(J-1)]
MSE=SSE/(N-IJ)
F
FA=MSA/MSE
FB=MSB/MSE
FAB=MSAB/MSE
• Procedure:
• Test H0: No interaction based on the FAB statistic
• If the interaction test is not significant, test for Factor A
and B effects based on the FA and FB statistics
Example - Thalidomide for AIDS
Individual Patients

7.5
Group Means

tb

Negative

Positive
3.000











2.5
0.0
-2.5











meanwg
wtgain
5.0
2.000
1.000

0.000
-1.000
Placebo

Thalidomide
Placebo
drug
Thalidomide
drug
p
W
e
N
e
G
a
8
8
4T
5
8
2T
0
8
6T
8
8
3T
5
2
7T
Example - Thalidomide for AIDS
n
-
D
I
I
I
S
d
F
S
S
u
i
f
g
a
C
8
3
3
6
0
I
n
0
1
0
7
0
D
1
1
1
2
0
T
1
1
1
8
4
D
5
1
5
7
2
E
3
8
3
T
0
2
C
0
1
a
R
• There is a significant Drug*TB interaction (FDT=5.897, P=.022)
• The Drug effect depends on TB status (and vice versa)
Regression Approach
• General Procedure:
– Generate I-1 dummy variables for factor A (A1,...,AI-1)
– Generate J-1 dummy variables for factor B (B1,...,BJ-1)
• Additive (No interaction) model:
E (Y )  b 0  b1 A1    b I 1 AI 1  b Ia B1    b I  J 2 BJ 1
Test for difference s among levels of factor A : H 0 : b1    b I 1  0
Test for difference s among levels of factor B : H 0 : b Ia    b I  J  2  0
Tests based on fitting full and reduced models.
Example - Thalidomide for AIDS
• Factor A: Drug with I=2 levels:
– D=1 if Thalidomide, 0 if Placebo
• Factor B: TB with J=2 levels:
•
•
•
•
– T=1 if Positive, 0 if Negative
Additive Model:
E(Y )    b1D  b2T
Population Means:
– Thalidomide/TB+: b0+b1+b2
– Thalidomide/TB-: b0+b1
– Placebo/TB+: b0+b2
– Placebo/TB-: b0
Thalidomide (vs Placebo Effect) Among TB+/TB- Patients:
TB+: (b0+b1+b2)-(b0+b2) = b1 TB-: (b0+b1)- b0 = b1
Example - Thalidomide for AIDS
• Testing for a Thalidomide effect on weight gain:
– H0: b1 = 0 vs HA: b1  0 (t-test, since I-1=1)
• Testing for a TB+ effect on weight gain:
– H0: b2 = 0 vs HA: b2  0 (t-test, since J-1=1)
• SPSS Output: (Thalidomide has positive effect, TB None)
i
a
c
d
a
a
i
i
c
c
B
e
M
i
t
E
g
t
1
(
5
7
0
3
D
3
3
7
9
0
T
3
3
1
2
9
a
D
Regression with Interaction
• Model with interaction (A has I levels, B has J):
– Includes I-1 dummy variables for factor A main effects
– Includes J-1 dummy variables for factor B main effects
– Includes (I-1)(J-1) cross-products
of factor A and B
m
dummy variables
• Model:
E(Y )  b 0  b1 A1    b I 1 AI 1  b I B1    b I  J 2 Bb1  b I  J 1 ( A1B1 )    b IJ 1 ( AI 1BJ 1 )
As with the ANOVA approach, we can partition the variation
to that attributable to Factor A, Factor B, and their interaction
Example - Thalidomide for AIDS
• Model with interaction:E(Y)=b0+b1D+b2T+b3(DT)
• Means by Group:
– Thalidomide/TB+: b0+b1+b2+b3
– Thalidomide/TB-: b0+b1
– Placebo/TB+: b0+b2
– Placebo/TB-: b0
• Thalidomide (vs Placebo Effect) Among TB+ Patients:
• (b0+b1+b2+b3)-(b0+b2) = b1+b3
• Thalidomide (vs Placebo Effect) Among TB- Patients:
• (b0+b1)-b0= b1
• Thalidomide effect is same in both TB groups if b3=0
Example - Thalidomide for AIDS
• SPSS Output from Multiple Regression:
i
a
c
d
a
a
i
i
c
c
S
B
e
E
M
i
t
g
t
1
(
7
9
7
3
D
8
6
9
3
5
T
7
6
8
7
0
D
0
8
9
8
2
a
D
We conclude there is a Drug*TB interaction (t=2.428, p=.022).
Compare this with the results from the two factor ANOVA table
Repeated Measures ANOVA
• Goal: compare g treatments over t time periods
• Randomly assign subjects to treatments
(Between Subjects factor)
• Observe each subject at each time period
(Within Subjects factor)
• Observe whether treatment effects differ over
time (interaction, Within Subjects)
Repeated Measures ANOVA
• Suppose there are N subjects, with ni in the
ith treatment group.
• Sources of variation:
–
–
–
–
–
Treatments (g-1 df)
Subjects within treatments aka Error1 (N-g df)
Time Periods (t-1 df)
Time x Trt Interaction ((g-1)(t-1) df)
Error2 ((N-g)(t-1) df)
Repeated Measures ANOVA
Source
Between Subjects
Treatment
Subj(Trt) = Error1
Within Subjects
Time
TimexTrt
Time*Subj(Trt)=Error2
df
SS
MS
F
g-1
N-g
SSTrt
SSE1
MSTrt=SSTrt/(g-1)
MSE1=SSE1/(N-g)
MSTrt/MSE1
t-1
(t-1)(g-1)
(N-g)(t-1)
SSTi
SSTiTrt
SSE2
MSTi=SSTi/(t-1)
MSTiTrt=SSTiTrt/((t-1)(g-1))
MSE2=SSE2/((N-g)(t-1))
MSTi/MSE2
MSTiTrt/MSE2
To Compare pairs of treatment means (assuming no time by
treatment interaction, otherwise they must be done within time
periods and replace tn with just n):
T
i

 T j  t / 2, N  g
 1

1

MSE1 
 tn tn 
j 
 i