Supplement: Splus Programs
Logistic Regression:
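#treatment contrasts for unordered factors, polynomial contrasts for ordered factors
>options(contrasts=c("contr.treatment", "contr.poly"))
>import.data(DataFrame="smoke.data",FileName="c://smoke.data.txt",
FileType="ASCII")
#columns 1-2 form the two-column binomial response, column 3 is the explanatory factor
>student.smoke=as.matrix(smoke.data[,1:2])
>parents.smoke=factor(smoke.data[,3])
>smoke.d=data.frame(student.smoke,parents.smoke)
#fit logistic regression model
>smoke.glm=glm(student.smoke~parents.smoke,data=smoke.d,
family=binomial(link=logit),na.action=na.fail)
>coefficient<-summary(smoke.glm)$coefficients
#analysis of deviance table with chi-square tests
>anova(smoke.glm,test="Chisq")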
Log-linear Model:
#40 cells: 5 ship types x 4 construction periods x 2 operation periods
>ship.type=factor(rep(c("A","B","C","D","E"),each=8))
>year.constr=factor(rep(rep(c("1960-64","1965-69","1970-74",
"1975-79"),each=2),5))
>period.oper=factor(rep(rep(c("1960-74","1975-79"),4),5))
>months.serv=c(127,63,1095,1095,1512,3353,0,2244,44882,17176,28609,20370,
7064,13099,0,7117,1179,552,781,676,783,1948,0,274,251,105,
288,192,349,1208,0,2051,45,0,789,437,1157,2161,0,542)
>damage.number=c(0,0,3,4,6,18,0,11,39,29,58,53,12,44,0,18,1,1,0,1,6,2,0,1,
0,0,0,0,2,11,0,4,0,0,7,7,5,12,0,1)
>ship.dataframe=data.frame(damage.number,months.serv,ship.type,
year.constr,period.oper)
#log(0) is -Inf, so cells with zero months of service get offset 0
#(this treats them as one month of exposure; dropping these cells is an alternative)
>log.months.serv=log(months.serv)
>log.months.serv[months.serv==0]=0
#fit log-linear model
>log.ship.dataframe=data.frame(damage.number,log.months.serv,
ship.type,year.constr,period.oper)
>ship.glm=glm(damage.number~offset(log.months.serv)+ship.type+
year.constr+period.oper,data=log.ship.dataframe,
family=poisson(link=log),na.action=na.fail)
#estimates and standard errors with the dispersion parameter set to 1.69
>table.6.3=round(summary(ship.glm,dispersion=1.69)$coefficients[,1:2],2)
Supplement: Exponential Family
Definition (canonical exponential family):
The exponential family has a density of the canonical form
\[
f(x_1, x_2, \ldots, x_p;\, \eta_1, \eta_2, \ldots, \eta_s) = f(x, \eta)
= \exp\!\left[\sum_{j=1}^{s} \eta_j T_j(x) - A(\eta)\right] h(x).
\]
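Example 1:
Let $X \sim N(\mu, \sigma^2)$. Then,
\[
f(x; \mu, \sigma^2)
= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]
= \exp\!\left[\frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2 - \frac{\mu^2}{2\sigma^2} - \frac{1}{2}\log\!\left(2\pi\sigma^2\right)\right],
\]
where
\[
\eta_1 = \frac{\mu}{\sigma^2},\quad T_1(x) = x,\quad
\eta_2 = -\frac{1}{2\sigma^2},\quad T_2(x) = x^2,\quad
A(\eta) = \frac{\mu^2}{2\sigma^2} + \frac{1}{2}\log\!\left(2\pi\sigma^2\right).
\]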
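Important Result 1:
\[
E\!\left[T_j(X)\right] = \frac{\partial A(\eta)}{\partial \eta_j}
\]
and
\[
\operatorname{cov}\!\left(T_j(X), T_k(X)\right) = \frac{\partial^2 A(\eta)}{\partial \eta_j\, \partial \eta_k}.
\]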
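As a check of Important Result 1, the normal family $N(\mu, \sigma^2)$ with $\eta_1 = \mu/\sigma^2$ and $\eta_2 = -1/(2\sigma^2)$ can be worked through by hand. The step of rewriting $A$ in terms of $\eta$ is not in the original notes; it follows from $\mu = -\eta_1/(2\eta_2)$ and $\sigma^2 = -1/(2\eta_2)$, and is offered only as a sketch.

```latex
\begin{aligned}
A(\eta) &= \frac{\mu^2}{2\sigma^2} + \frac{1}{2}\log\!\left(2\pi\sigma^2\right)
         = -\frac{\eta_1^2}{4\eta_2} + \frac{1}{2}\log\!\left(-\frac{\pi}{\eta_2}\right),\\
\frac{\partial A}{\partial \eta_1} &= -\frac{\eta_1}{2\eta_2} = \mu = E(X),\\
\frac{\partial A}{\partial \eta_2} &= \frac{\eta_1^2}{4\eta_2^2} - \frac{1}{2\eta_2} = \mu^2 + \sigma^2 = E\!\left(X^2\right),\\
\frac{\partial^2 A}{\partial \eta_1^2} &= -\frac{1}{2\eta_2} = \sigma^2 = \operatorname{Var}(X).
\end{aligned}
```

The first two derivatives recover the means of $T_1(X) = X$ and $T_2(X) = X^2$, and the second derivative in $\eta_1$ recovers the variance of $X$.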
Example 2:
Let $X \sim \mathrm{Binomial}(n, p)$. Then,
\[
f(x; p) = \binom{n}{x} p^x (1-p)^{n-x}
= \binom{n}{x} \exp\!\left[x \log p + (n-x)\log(1-p)\right]
= \exp\!\left[x \log\!\left(\frac{p}{1-p}\right) + n \log(1-p)\right] \binom{n}{x},
\]
where
\[
\eta_1 = \log\!\left(\frac{p}{1-p}\right),\quad T_1(x) = x,\quad h(x) = \binom{n}{x},
\]
\[
A(\eta_1) = -n\log(1-p) = n \log\!\left(1 + \exp(\eta_1)\right),
\]
since
\[
1 + \exp(\eta_1) = 1 + \frac{p}{1-p} = \frac{1}{1-p}
\;\Longrightarrow\; \log\!\left(1 + \exp(\eta_1)\right) = -\log(1-p).
\]
Thus,
\[
E\!\left[T_1(X)\right] = E(X) = \frac{\partial A(\eta_1)}{\partial \eta_1}
= \frac{n \exp(\eta_1)}{1 + \exp(\eta_1)}
= n\,\frac{p/(1-p)}{1/(1-p)} = np,
\]
\[
\operatorname{Var}\!\left[T_1(X)\right] = \operatorname{Cov}\!\left(T_1(X), T_1(X)\right)
= \frac{\partial^2 A(\eta_1)}{\partial \eta_1^2}
= \frac{n \exp(\eta_1)}{1 + \exp(\eta_1)} - \frac{n \exp(\eta_1)\exp(\eta_1)}{\left(1 + \exp(\eta_1)\right)^2}
= np - np^2 = np(1-p).
\]
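The binomial calculation can also be checked numerically. The sketch below is written in Python rather than S-PLUS so that it is easy to run. It differentiates $A(\eta_1) = n\log(1 + \exp(\eta_1))$ by finite differences and compares the result with the mean and variance obtained by summing the binomial probabilities directly. The function names, the step size `h`, and the values `n = 10`, `p = 0.3` are illustrative assumptions, not part of the original notes.

```python
import math


def log_partition(eta, n):
    """A(eta) = n * log(1 + exp(eta)) for the binomial family."""
    return n * math.log1p(math.exp(eta))


def binomial_moments(n, p):
    """Mean and variance of X ~ Binomial(n, p) by direct summation of the pmf."""
    pmf = [math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean = sum(x * w for x, w in enumerate(pmf))
    var = sum((x - mean) ** 2 * w for x, w in enumerate(pmf))
    return mean, var


n, p = 10, 0.3                      # hypothetical values
eta = math.log(p / (1 - p))         # canonical parameter eta_1
h = 1e-4                            # finite-difference step (assumption)

# Central differences approximate the first and second derivatives of A.
d1 = (log_partition(eta + h, n) - log_partition(eta - h, n)) / (2 * h)
d2 = (log_partition(eta + h, n) - 2 * log_partition(eta, n)
      + log_partition(eta - h, n)) / h**2

mean, var = binomial_moments(n, p)
print("dA/deta   =", d1, " E(X)   =", mean)
print("d2A/deta2 =", d2, " Var(X) =", var)
```

Up to finite-difference error, the two derivatives agree with $np$ and $np(1-p)$.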
Important Result 2:
The moment generating function for $T(X) = \left(T_1(X), T_2(X), \ldots, T_s(X)\right)$ is
\[
M_T(t) = M_T(t_1, t_2, \ldots, t_s)
= E\!\left[\exp\!\left(t_1 T_1(X) + t_2 T_2(X) + \cdots + t_s T_s(X)\right)\right]
= \exp\!\left[A(\eta + t) - A(\eta)\right]
\]
and the cumulant generating function is
\[
K_T(t) = \log M_T(t) = A(\eta + t) - A(\eta).
\]
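Example 3:
Let $X \sim \mathrm{Poisson}(\lambda)$. Then,
\[
f(x; \lambda) = \frac{\lambda^x \exp(-\lambda)}{x!}
= \exp\!\left[x \log\lambda - \lambda\right] \frac{1}{x!},
\]
where
\[
\eta_1 = \log\lambda,\quad T_1(x) = x,\quad A(\eta_1) = \lambda = \exp(\eta_1).
\]
The moment generating function of $T_1(X) = X$ is
\[
M_T(t) = \exp\!\left[A(\eta_1 + t) - A(\eta_1)\right]
= \exp\!\left[\exp(\eta_1 + t) - \exp(\eta_1)\right]
= \exp\!\left[\exp(\eta_1)\left(\exp(t) - 1\right)\right]
= \exp\!\left[\lambda\left(\exp(t) - 1\right)\right]
\]
and the cumulant generating function is
\[
K_T(t) = \lambda\left(\exp(t) - 1\right).
\]
Note that
\[
K_T'(0) = \lambda \exp(t)\big|_{t=0} = \lambda = E(X)
\]
and
\[
K_T''(0) = \lambda \exp(t)\big|_{t=0} = \lambda = \operatorname{Var}(X).
\]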
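A quick numerical check of the Poisson moment generating function is sketched below in Python (not S-PLUS). It sums $E[\exp(tX)]$ term by term over the Poisson probabilities and compares the result with $\exp[\lambda(\exp(t) - 1)]$. It then differentiates the log of that sum at $t = 0$ by finite differences to recover the mean and variance. The function names, the truncation at 200 terms, the step `h`, and the values `lam = 2.5`, `t = 0.4` are assumptions made for illustration.

```python
import math


def poisson_mgf_by_sum(lam, t, terms=200):
    """E[exp(tX)] for X ~ Poisson(lam), summing the pmf term by term."""
    total = 0.0
    prob = math.exp(-lam)            # P(X = 0)
    for x in range(terms):
        total += math.exp(t * x) * prob
        prob *= lam / (x + 1)        # P(X = x + 1) from P(X = x)
    return total


def poisson_mgf_closed(lam, t):
    """Closed form exp(lam * (exp(t) - 1)) from the exponential-family result."""
    return math.exp(lam * (math.exp(t) - 1.0))


def cgf(lam, t):
    """Cumulant generating function computed from the summed MGF."""
    return math.log(poisson_mgf_by_sum(lam, t))


lam, t = 2.5, 0.4                    # hypothetical values
by_sum = poisson_mgf_by_sum(lam, t)
closed = poisson_mgf_closed(lam, t)

h = 1e-3                             # finite-difference step (assumption)
k1 = (cgf(lam, h) - cgf(lam, -h)) / (2 * h)                    # approximates K'(0)
k2 = (cgf(lam, h) - 2 * cgf(lam, 0.0) + cgf(lam, -h)) / h**2   # approximates K''(0)

print("MGF by summation:", by_sum, " closed form:", closed)
print("K'(0)  ~", k1, " K''(0) ~", k2, " lambda =", lam)
```

Both derivative estimates should be close to $\lambda$, matching $E(X) = \operatorname{Var}(X) = \lambda$.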
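Important Result 3:
$T(X) = \left(T_1(X), T_2(X), \ldots, T_s(X)\right)$ is distributed according to an exponential
family with density
\[
f(t_1, t_2, \ldots, t_s;\, \eta_1, \eta_2, \ldots, \eta_s) = f(t, \eta)
= \exp\!\left[\sum_{j=1}^{s} \eta_j t_j - A(\eta)\right] h^{*}(t),
\]
where the function $h^{*}$ generally differs from $h$.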
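Important Result 4:
$T(X) = \left(T_1(X), T_2(X), \ldots, T_s(X)\right)$ is sufficient and complete (completeness requires the family to be of full rank, that is, the natural parameter space must contain an $s$-dimensional open set).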
Supplement: Analysis of Variance
Regression Treatment of One-Way Classification:
Model I:
Let
\[
Y_{ij} = u + t_i + \epsilon_{ij},\quad i = 1, 2, \ldots, I;\ j = 1, 2, \ldots, J_i,\quad \sum_{i=1}^{I} J_i t_i = 0.
\]
Let $\mu_i = u + t_i$. Then, $Y_{ij} = \mu_i + \epsilon_{ij}$.
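The regression model in matrix form is
\[
Y =
\begin{pmatrix}
Y_{11} \\ Y_{12} \\ \vdots \\ Y_{1J_1} \\ Y_{21} \\ Y_{22} \\ \vdots \\ Y_{2J_2} \\ \vdots \\ Y_{I1} \\ Y_{I2} \\ \vdots \\ Y_{IJ_I}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
1 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 \\
0 & 0 & \cdots & 1 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
\begin{pmatrix}
\mu_1 \\ \mu_2 \\ \vdots \\ \mu_I
\end{pmatrix}
+
\begin{pmatrix}
\epsilon_{11} \\ \epsilon_{12} \\ \vdots \\ \epsilon_{1J_1} \\ \epsilon_{21} \\ \epsilon_{22} \\ \vdots \\ \epsilon_{2J_2} \\ \vdots \\ \epsilon_{I1} \\ \epsilon_{I2} \\ \vdots \\ \epsilon_{IJ_I}
\end{pmatrix}
= X\mu + \epsilon.
\]
Then,
\[
X^t X =
\begin{pmatrix}
J_1 & 0 & \cdots & 0 \\
0 & J_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & J_I
\end{pmatrix},
\qquad
X^t Y =
\begin{pmatrix}
\sum_{j=1}^{J_1} Y_{1j} \\ \sum_{j=1}^{J_2} Y_{2j} \\ \vdots \\ \sum_{j=1}^{J_I} Y_{Ij}
\end{pmatrix}
=
\begin{pmatrix}
J_1 \bar{Y}_1 \\ J_2 \bar{Y}_2 \\ \vdots \\ J_I \bar{Y}_I
\end{pmatrix},
\]
where $\bar{Y}_i = \dfrac{\sum_{j=1}^{J_i} Y_{ij}}{J_i}$. Then,
\[
b = \left(X^t X\right)^{-1} X^t Y =
\begin{pmatrix}
\bar{Y}_1 \\ \bar{Y}_2 \\ \vdots \\ \bar{Y}_I
\end{pmatrix}.
\]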
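The matrix algebra above can be traced on a small data set. The sketch below uses plain Python rather than S-PLUS. It builds the indicator design matrix $X$ for a one-way layout, forms $X^t X$ and $X^t Y$, and recovers $b$ as the vector of group means. The helper name `design_matrix` and the three groups of observations are made up for illustration.

```python
def design_matrix(group_sizes):
    """n x I indicator matrix: the row for Y_ij has a 1 in column i, 0 elsewhere."""
    n_groups = len(group_sizes)
    rows = []
    for i, size in enumerate(group_sizes):
        for _ in range(size):
            rows.append([1 if k == i else 0 for k in range(n_groups)])
    return rows


# Hypothetical data: I = 3 groups with J = (2, 3, 4) observations.
groups = [[4.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0, 6.0]]
y = [value for group in groups for value in group]
X = design_matrix([len(group) for group in groups])
n_groups = len(groups)

# X^t X is diagonal with the group sizes J_i on the diagonal.
XtX = [[sum(row[a] * row[c] for row in X) for c in range(n_groups)]
       for a in range(n_groups)]

# X^t Y holds the group totals J_i * Ybar_i.
XtY = [sum(row[a] * value for row, value in zip(X, y)) for a in range(n_groups)]

# Because X^t X is diagonal, solving the normal equations is a division.
b = [XtY[a] / XtX[a][a] for a in range(n_groups)]

print("X'X =", XtX)
print("X'Y =", XtY)
print("b   =", b)
```

Because the columns of $X$ are orthogonal indicators, no general matrix inversion is needed in this special case.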


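Thus,
\[
SS(b) = b^t X^t Y =
\begin{pmatrix} \bar{Y}_1 & \bar{Y}_2 & \cdots & \bar{Y}_I \end{pmatrix}
\begin{pmatrix} J_1 \bar{Y}_1 \\ J_2 \bar{Y}_2 \\ \vdots \\ J_I \bar{Y}_I \end{pmatrix}
= \sum_{i=1}^{I} J_i \bar{Y}_i^2
\]
and
\[
\begin{aligned}
RSS(\text{model I}) &= \text{residual sum of squares} = Y^t Y - SS(b) \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J_i} Y_{ij}^2 - \sum_{i=1}^{I} J_i \bar{Y}_i^2
 = \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij}^2 - \bar{Y}_i^2\right) \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij} - \bar{Y}_i\right)^2
\end{aligned}
\]
since
\[
\begin{aligned}
\sum_{j=1}^{J_i} \left(Y_{ij} - \bar{Y}_i\right)^2
&= \sum_{j=1}^{J_i} \left(Y_{ij}^2 - 2 Y_{ij}\bar{Y}_i + \bar{Y}_i^2\right)
 = \sum_{j=1}^{J_i} \left(Y_{ij}^2 + \bar{Y}_i^2\right) - 2\bar{Y}_i \sum_{j=1}^{J_i} Y_{ij} \\
&= \sum_{j=1}^{J_i} \left(Y_{ij}^2 + \bar{Y}_i^2\right) - 2\bar{Y}_i J_i \bar{Y}_i
 = \sum_{j=1}^{J_i} \left(Y_{ij}^2 + \bar{Y}_i^2\right) - \sum_{j=1}^{J_i} 2\bar{Y}_i^2 \\
&= \sum_{j=1}^{J_i} \left(Y_{ij}^2 - \bar{Y}_i^2\right).
\end{aligned}
\]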
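To test
\[
H_0: \mu_1 = \mu_2 = \cdots = \mu_I = u \iff H_0: t_1 = t_2 = \cdots = t_I = 0,
\]
the reduced model under $H_0$ is
\[
Y_{ij} = u + \epsilon_{ij} \quad (\text{model 1}).
\]
Thus, with $n = \sum_{i=1}^{I} J_i$,
\[
\hat{Y}_{ij} = \bar{Y} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J_i} Y_{ij}}{\sum_{i=1}^{I} J_i},
\]
\[
SS(u) = \text{regression sum of squares under } H_0 = n\bar{Y}^2,
\]
\[
RSS(\text{model 1}) = \text{residual sum of squares under } H_0
= \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij} - \hat{Y}_{ij}\right)^2
= \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij} - \bar{Y}\right)^2
\]
\[
\begin{aligned}
\Longrightarrow\ RSS(\text{model 1}) - RSS(\text{model I})
&= \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij} - \bar{Y}\right)^2 - \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij} - \bar{Y}_i\right)^2 \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij}^2 - 2Y_{ij}\bar{Y} + \bar{Y}^2 - Y_{ij}^2 + 2Y_{ij}\bar{Y}_i - \bar{Y}_i^2\right) \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J_i} \left[2Y_{ij}\left(\bar{Y}_i - \bar{Y}\right) - \bar{Y}_i^2 + \bar{Y}^2\right] \\
&= \sum_{i=1}^{I} \left[2J_i\bar{Y}_i\left(\bar{Y}_i - \bar{Y}\right) - J_i\bar{Y}_i^2 + J_i\bar{Y}^2\right] \\
&= \sum_{i=1}^{I} J_i \left(2\bar{Y}_i^2 - 2\bar{Y}_i\bar{Y} - \bar{Y}_i^2 + \bar{Y}^2\right) \\
&= \sum_{i=1}^{I} J_i \left(\bar{Y}_i^2 - 2\bar{Y}_i\bar{Y} + \bar{Y}^2\right)
 = \sum_{i=1}^{I} J_i \left(\bar{Y}_i - \bar{Y}\right)^2.
\end{aligned}
\]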
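Then, the F statistic is
\[
F = \frac{\left[RSS(\text{model 1}) - RSS(\text{model I})\right] / (I-1)}{RSS(\text{model I}) / (n-I)}
= \frac{\sum_{i=1}^{I} J_i \left(\bar{Y}_i - \bar{Y}\right)^2 / (I-1)}{\sum_{i=1}^{I}\sum_{j=1}^{J_i} \left(Y_{ij} - \bar{Y}_i\right)^2 / (n-I)}.
\]
When $H_0$ is true, $F \sim F_{I-1,\,n-I}$.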
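The one-way F statistic is simple enough to compute from scratch, which also lets the identity $RSS(\text{model 1}) - RSS(\text{model I}) = \sum_i J_i(\bar{Y}_i - \bar{Y})^2$ be checked numerically. The following is a minimal Python sketch (not S-PLUS). The function name `one_way_anova`, the returned dictionary keys, and the three small groups of data are hypothetical.

```python
def one_way_anova(groups):
    """Residual sums of squares and F statistic for a one-way layout.

    groups: list of lists, groups[i][j] = Y_ij.
    """
    n = sum(len(group) for group in groups)        # total sample size
    n_groups = len(groups)                         # I
    grand_mean = sum(sum(group) for group in groups) / n
    means = [sum(group) / len(group) for group in groups]

    # Full model: each observation is fitted by its own group mean.
    rss_full = sum((value - mean) ** 2
                   for group, mean in zip(groups, means) for value in group)
    # Reduced model under H0: every observation is fitted by the grand mean.
    rss_reduced = sum((value - grand_mean) ** 2
                      for group in groups for value in group)
    # Between-group sum of squares, sum_i J_i (Ybar_i - Ybar)^2.
    between = sum(len(group) * (mean - grand_mean) ** 2
                  for group, mean in zip(groups, means))

    f_stat = (between / (n_groups - 1)) / (rss_full / (n - n_groups))
    return {
        "rss_full": rss_full,
        "rss_reduced": rss_reduced,
        "between": between,
        "df": (n_groups - 1, n - n_groups),
        "F": f_stat,
    }


# Hypothetical data: I = 3 groups with J = (2, 3, 4) observations.
result = one_way_anova([[4.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0, 6.0]])
print(result)
```

The F value would then be compared with the upper quantiles of the $F_{I-1,\,n-I}$ distribution. That step is left out here because it needs a statistics library.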

Homework
1. The following table refers to 661 children with birth weights between 650 g
and 1749 g, all of whom survived for at least one year. The variables of interest are:
Cardiac: mild heart problems of the mother during pregnancy
Comps: gynaecological problems during pregnancy
Smoking: mother smoked at least one cigarette per day during the first
months of pregnancy
BW: was the birth weight less than 1250 g?
Cardiac              Yes                     No
Comps            Yes       No           Yes       No
Smoking        Yes   No  Yes   No     Yes   No  Yes   No
BW   Yes        10   25   12   15      18   12   42   45
     No          7    5   22   19      10   12  202  205
Analyze the data and interpret the relationship between the children's birth
weights and the mothers' habits and health conditions.
2. The data in the Splus built-in data frame Insurance (in the library MASS)
consist of the numbers of policyholders, Holders, and the numbers of car insurance
claims made by those policyholders, Claims. There are three explanatory
variables: District (four levels), Group (of car, four levels), and Age (four
ordered levels). Analyze the data, including terms up to the three-way interaction,
with offset log(Holders). Which factors determine the number of claims?