HW3-Solutions
Ozan Sonmez
October 19, 2016
Problem 2.1
For the Johnson & Johnson data, say $y_t$, shown in Figure 1.1, let $x_t = \log(y_t)$. In this problem, we are going to fit a special type of structural model, $x_t = T_t + S_t + N_t$, where $T_t$ is a trend component, $S_t$ is a seasonal component, and $N_t$ is noise. In our case, time $t$ is in quarters (1960.00, 1960.25, . . . ) so one unit of time is a year.
part a
Fit the regression model
$$x_t = \underbrace{\beta t}_{\text{trend}} + \underbrace{\alpha_1 Q_1(t) + \alpha_2 Q_2(t) + \alpha_3 Q_3(t) + \alpha_4 Q_4(t)}_{\text{seasonal}} + \underbrace{w_t}_{\text{noise}},$$
where $Q_i(t) = 1$ if time $t$ corresponds to quarter $i = 1, 2, 3, 4$, and zero otherwise. The $Q_i(t)$'s are called indicator variables. We will assume for now that $w_t$ is a Gaussian white noise sequence.
library(astsa)
trend = time(jj) - 1970 # helps to `center' time
Q = factor(cycle(jj) ) # make (Q)uarter factors
reg1 = lm(log(jj)~0 + trend + Q, na.action=NULL) # no intercept
summary(reg1)
##
## Call:
## lm(formula = log(jj) ~ 0 + trend + Q, na.action = NULL)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.29318 -0.09062 -0.01180  0.08460  0.27644
##
## Coefficients:
##       Estimate Std. Error t value Pr(>|t|)
## trend 0.167172   0.002259   74.00   <2e-16 ***
## Q1    1.052793   0.027359   38.48   <2e-16 ***
## Q2    1.080916   0.027365   39.50   <2e-16 ***
## Q3    1.151024   0.027383   42.03   <2e-16 ***
## Q4    0.882266   0.027412   32.19   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1254 on 79 degrees of freedom
## Multiple R-squared:  0.9935, Adjusted R-squared:  0.9931
## F-statistic:  2407 on 5 and 79 DF,  p-value: < 2.2e-16
part b
Since trend is measured in years, the estimated average annual increase in the logged earnings per share is the trend coefficient, $\hat\beta = 0.167172$, which can be read directly from the summary table above.
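As a quick check, the same number can be pulled straight from the fitted model object (a minimal sketch, assuming reg1 from part a is still in the workspace):
coef(reg1)["trend"] # estimated average annual increase in logged earnings, about 0.167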
part c
If the model is correct, the change in average logged earnings from the third quarter to the fourth quarter is $\hat\alpha_4 - \hat\alpha_3 = -0.268758$, i.e., a decrease. Relative to the third-quarter level, this is a drop of about $(0.269/1.151024) \times 100 \approx 23.4\%$.
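Both quantities can also be computed from the coefficient vector; a small sketch, again assuming reg1 is available:
alpha = coef(reg1)
alpha["Q4"] - alpha["Q3"] # change from Q3 to Q4, about -0.269
100*(alpha["Q4"] - alpha["Q3"])/alpha["Q3"] # percent change relative to the Q3 level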
part d
What happens if you include an intercept term in the model in (a)? Explain why there was a problem.
reg2 = lm(log(jj)~ trend + Q, na.action=NULL) # with intercept
summary(reg2)
##
## Call:
## lm(formula = log(jj) ~ trend + Q, na.action = NULL)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.29318 -0.09062 -0.01180  0.08460  0.27644
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.052793   0.027359  38.480  < 2e-16 ***
## trend        0.167172   0.002259  73.999  < 2e-16 ***
## Q2           0.028123   0.038696   0.727   0.4695
## Q3           0.098231   0.038708   2.538   0.0131 *
## Q4          -0.170527   0.038729  -4.403 3.31e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1254 on 79 degrees of freedom
## Multiple R-squared:  0.9859, Adjusted R-squared:  0.9852
## F-statistic:  1379 on 4 and 79 DF,  p-value: < 2.2e-16
Having an intercept takes away the first-quarter effect: the intercept absorbs Q1 and appears in every quarter, and the remaining Q coefficients become contrasts with the first quarter rather than separate quarterly effects. This is not what we want, since we wish to study the effect of each quarter separately.
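One way to see the problem concretely is to compare the design matrices of the two fits: with an intercept, keeping all four quarterly indicators would make the design rank deficient (the column of ones equals the sum of the four indicators), so R's default treatment contrasts drop Q1 and make it the baseline. A minimal sketch:
head(model.matrix(reg1)) # no intercept: columns trend, Q1, Q2, Q3, Q4
head(model.matrix(reg2)) # with intercept: the Q1 column is gone, absorbed into the intercept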
part e
Graph the data, xt , and superimpose the fitted values, say x̂t , on the graph. Examine the residuals, xt − x̂t ,
and state your conclusions. Does it appear that the model fits the data well (do the residuals look white)?
par(mfrow=c(1,2))
plot(log(jj), main="plot of data and fitted values") # data
lines(time(jj), fitted(reg1), col="red") # fitted values, plotted against time
plot(log(jj)-fitted(reg1), main="plot of residuals") # residuals
[Figure: left panel, "plot of data and fitted values" (log(jj), 1960-1981, with the fitted values in red); right panel, "plot of residuals" (log(jj) - fitted(reg1) over the same period).]
The residuals do not appear to follow any systematic pattern, so they look fairly white, and the fit seems quite good.
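A sharper check of whiteness is the sample ACF of the residuals; a quick sketch, again assuming reg1 from part a:
acf(resid(reg1), lag.max = 20, main = "ACF of residuals")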
Problem 2.2
For the mortality data examined in Example 2.2:
part a
Add another component to the regression in (2.21) that accounts for the particulate count four weeks prior;
that is, add Pt−4 to the regression in (2.21). State your conclusion.
n = length(tempr)
temp = tempr - mean(tempr) # center temperature
temp2 = temp^2
trend = time(cmort) # time
fit1 = lm(cmort~ trend + temp + temp2 + part, na.action=NULL)
fit2 = lm(cmort[5:n]~ trend[5:n] + temp[5:n] + temp2[5:n] + part[5:n]
+ part[1:(n-4)], na.action=NULL)
summary(fit2)
##
## Call:
## lm(formula = cmort[5:n] ~ trend[5:n] + temp[5:n] + temp2[5:n] +
##     part[5:n] + part[1:(n - 4)], na.action = NULL)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -18.228  -4.314  -0.614   3.713  27.800
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)      2.808e+03  1.989e+02  14.123  < 2e-16 ***
## trend[5:n]      -1.385e+00  1.006e-01 -13.765  < 2e-16 ***
## temp[5:n]       -4.058e-01  3.528e-02 -11.503  < 2e-16 ***
## temp2[5:n]       2.155e-02  2.803e-03   7.688 8.02e-14 ***
## part[5:n]        2.029e-01  2.266e-02   8.954  < 2e-16 ***
## part[1:(n - 4)]  1.030e-01  2.485e-02   4.147 3.96e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.287 on 498 degrees of freedom
## Multiple R-squared:  0.608,  Adjusted R-squared:  0.6041
## F-statistic: 154.5 on 5 and 498 DF,  p-value: < 2.2e-16
As the summary of the fit suggests, all of the predictors are statistically significant, yielding the fitted model
$$\hat M_t = \hat\beta_0 + \hat\beta_1 t + \hat\beta_2 (T_t - T_\cdot) + \hat\beta_3 (T_t - T_\cdot)^2 + \hat\beta_4 P_t + \hat\beta_5 P_{t-4},$$
where the estimated parameters $(\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_5)$ are given in the table above.
part b
Using AIC and BIC, is the model in (a) an improvement over the final model in Example 2.2?
aic1 = AIC(fit1)/n - log(2*pi)
aic2 = AIC(fit2)/(n-4) - log(2*pi)
aic = data.frame(model1 = aic1, model2=aic2)
aic
##     model1   model2
## 1 4.721732 4.692916
There is a small improvement in the fit, but it is not a dramatic improvement over the model without $P_{t-4}$ (model1).
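The problem also asks about BIC, which can be compared the same way; a minimal sketch, applying the same scaling used for the AIC comparison above:
bic1 = BIC(fit1)/n - log(2*pi)
bic2 = BIC(fit2)/(n-4) - log(2*pi)
data.frame(model1 = bic1, model2 = bic2)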
Problem 2.3
In this problem, we explore the difference between a random walk and a trend stationary process.
part a
Note from (1.4) that a random walk with drift can be expressed as $x_t = \delta t + \sum_{k=1}^{t} w_k$, where $w_k$ is white noise with variance $\sigma_w^2$. Here we generate four random walk with drift series of length $n = 100$ with $\delta = .01$ and $\sigma_w^2 = 1$. Call the data $x_t$ for $t = 1, \ldots, 100$. Fit the regression $x_t = \beta t + w_t$ using least squares. Plot the data, the true mean function (i.e., $\mu_t = .01t$) and the fitted line.
#Part a
set.seed(2)
n = 100
delta = 0.01
time = 1:n # time
par(mfrow = c(2,2))
for (k in 1:4){
  # generate the white noise
  w = rnorm(n, 0, 1)
  # generate the random walk based on the above equation
  x = c()
  for (t in 1:n){
    x[t] = delta*t + sum(w[1:t])
  }
  # true mean function
  mu = delta*time
  # fit a regression without intercept
  fit = lm(x ~ 0 + time)
  plot(time, x, type="l", main="random walk")
  lines(time, fitted(fit), col="red")
  lines(time, mu, col="blue")
}
[Figure: four panels titled "random walk": the simulated random walks with drift, each with the fitted regression line (red) and the true mean function mu_t = 0.01t (blue), time 0-100.]
part b
For part b, we do the same thing for the trend stationary process $y_t = 0.01t + w_t$.
#Part b
n = 100
time = 1:n # time
par(mfrow = c(2,2))
for (k in 1:4){
  # generate the white noise
  w = rnorm(n, 0, 1)
  y = 0.01 * time + w
  # true mean function
  mu = 0.01*time
  # fit a regression without intercept
  fit = lm(y ~ 0 + time)
  plot(time, y, type="l", main="y_t") # plot the simulated series y
  lines(time, fitted(fit), col="red")
  lines(time, mu, col="blue")
}
[Figure: four panels titled "y_t": the simulated trend stationary series, each with the fitted regression line (red) and the true mean function (blue), time 0-100.]
part c
Note that the fitted line is much closer to the true mean function in part b. The errors in $y_t$ are independent, which is one of the main assumptions of linear regression, whereas in $x_t$ the errors are correlated because the white noise terms accumulate. It is also instructive to compare the slope estimates from the linear fits with the true mean functions by looking at the summary() output of the regression fits, as sketched below.
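A minimal sketch of that comparison, generating one realization of each process and printing the estimated slopes against the true value 0.01 (variable names here are illustrative):
set.seed(2)
n = 100; time = 1:n
x = 0.01*time + cumsum(rnorm(n)) # random walk with drift
y = 0.01*time + rnorm(n)         # trend stationary process
summary(lm(x ~ 0 + time))$coefficients # slope estimate can be far from 0.01
summary(lm(y ~ 0 + time))$coefficients # slope estimate is typically close to 0.01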
Problem 2.4
Consider a process consisting of a linear trend with an additive noise term consisting of independent random variables $w_t$ with zero means and variances $\sigma_w^2$, that is,
$$x_t = \beta_0 + \beta_1 t + w_t,$$
where $\beta_0$, $\beta_1$ are fixed constants.
part a
Looking at the mean function,
$$E[x_t] = E[\beta_0 + \beta_1 t + w_t] = \beta_0 + \beta_1 t,$$
which depends on time $t$; hence the series $x_t$ is not stationary.
part b
Note that the first order difference of $x_t$ simplifies to
$$\nabla x_t = x_t - x_{t-1} = \beta_0 + \beta_1 t + w_t - [\beta_0 + \beta_1 (t-1) + w_{t-1}] = \beta_1 + w_t - w_{t-1},$$
and the mean function is
$$E[\nabla x_t] = E[\beta_1 + w_t - w_{t-1}] = \beta_1 + E[w_t] - E[w_{t-1}] = \beta_1,$$
which is independent of time $t$. The autocovariance function is
$$\gamma_{\nabla x}(t+h, t) = \mathrm{Cov}(\nabla x_{t+h}, \nabla x_t) = \mathrm{Cov}(w_{t+h} - w_{t+h-1},\, w_t - w_{t-1}) =
\begin{cases} 2\sigma_w^2, & h = 0, \\ -\sigma_w^2, & |h| = 1, \\ 0, & |h| > 1, \end{cases}$$
which is also free of time $t$; hence $\nabla x_t$ is stationary.
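This autocovariance structure can be sanity-checked by simulation; a small sketch with illustrative values $\beta_0 = 1$, $\beta_1 = 0.5$, and $\sigma_w^2 = 1$:
set.seed(1)
n = 10000
x = 1 + 0.5*(1:n) + rnorm(n) # x_t = beta0 + beta1*t + w_t
dx = diff(x)                 # first difference
acf(dx, lag.max = 5, type = "covariance", plot = FALSE)
# the sample autocovariances should be near 2*sigma_w^2 = 2 at lag 0, -sigma_w^2 = -1 at lag 1, and 0 beyond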
part c
Repeat part (b) if wt is replaced by a general stationary process, say yt , with mean function µy and
autocovariance function γy (h).
The mean function is $E[\nabla x_t] = E[\beta_1 + y_t - y_{t-1}] = \beta_1 + \mu_y - \mu_y = \beta_1$, which is free of $t$, and the autocovariance function is
$$\begin{aligned}
\gamma_{\nabla x}(t+h, t) &= \mathrm{Cov}(\nabla x_{t+h}, \nabla x_t) \\
&= \mathrm{Cov}(\beta_1 + y_{t+h} - y_{t+h-1},\, \beta_1 + y_t - y_{t-1}) \\
&= \mathrm{Cov}(y_{t+h} - y_{t+h-1},\, y_t - y_{t-1}) \\
&= \underbrace{\mathrm{Cov}(y_{t+h}, y_t)}_{\gamma_y(h)} - \underbrace{\mathrm{Cov}(y_{t+h}, y_{t-1})}_{\gamma_y(h+1)} - \underbrace{\mathrm{Cov}(y_{t+h-1}, y_t)}_{\gamma_y(h-1)} + \underbrace{\mathrm{Cov}(y_{t+h-1}, y_{t-1})}_{\gamma_y(h)} \\
&= \gamma_y(h+1) - 2\gamma_y(h) + \gamma_y(h-1),
\end{aligned}$$
which does not depend on time $t$, since $y_t$ is stationary and $\gamma_y(\cdot)$ is free of $t$; hence $\nabla x_t$ is again stationary.
Problem 2.6
The glacial varve record plotted in Figure 2.6 exhibits some nonstationarity that can be improved by
transforming to logarithms and some additional nonstationarity that can be corrected by differencing the
logarithms.
part a
n = length(varve)
varve1 = varve[1:(n/2)]   # first half of the series
varve2 = varve[(n/2+1):n] # second half of the series
data.frame(FirstHalf = var(varve1), SecondHalf = var(varve2))
##   FirstHalf SecondHalf
## 1   132.501   594.4904
As we can see, the sample variance in the second half of the data is significantly larger than in the first half, so the variance is not homogeneous over time and we need to stabilize it with a transformation such as the log. The log transformation also brings the data much closer to normality, as the histograms below show.
par(mfrow = c(1,2))
hist(varve, main="raw data")
hist(log(varve), main="log-transformed data")
[Figure: histograms of the raw varve data (left, "raw data") and of log(varve) (right, "log-transformed data").]
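To confirm that the log transform actually stabilizes the variance, the split-half comparison above can be repeated on the logged series; a minimal sketch reusing n from above, where the two half-sample variances should be far closer than on the raw scale:
data.frame(FirstHalf = var(log(varve[1:(n/2)])), SecondHalf = var(log(varve[(n/2+1):n])))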
part b
Plot the series $y_t = \log(x_t)$. Do any time intervals, of the order 100 years, exist where one can observe behavior comparable to that observed in the global temperature records in Figure 1.3?
plot(log(varve), main="log-transformed data")
[Figure: time plot of log(varve), titled "log-transformed data".]
It doesn’t seem that there is an increasing/decreasing trend over time as we observed in Figure 1.3.
part c
The ACF of yt is:
acf(log(varve), lag.max = 20)
[Figure: sample ACF of log(varve) ("Series log(varve)"), lags 0 to 20.]
It seems that the dependence in the data is very strong at small lags and dies out very slowly.
part d
Compute the difference ut = yt − yt−1 , examine its time plot and sample ACF, and argue that differencing
the logged varve data produces a reasonably stationary series.
u = diff(log(varve), 1) # take the first order difference
plot(u)
[Figure: time plot of u = diff(log(varve)).]
Differencing produces a reasonably stationary-looking series. Practically, $u_t$ can be interpreted as the year-to-year change in the logged varve thicknesses; more explicitly,
$$\begin{aligned}
u_t = \nabla y_t &= y_t - y_{t-1} = \log(x_t) - \log(x_{t-1}) \\
&= \log\left(\frac{x_t}{x_{t-1}}\right) = \log\left(1 + \frac{x_t - x_{t-1}}{x_{t-1}}\right) \approx \frac{x_t - x_{t-1}}{x_{t-1}},
\end{aligned}$$
which is approximately the relative (fractional) change in varve thickness from year $t-1$ to year $t$.
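The problem also asks for the sample ACF of $u_t$; a quick sketch, using u from above:
acf(u, lag.max = 20) # a prominent negative value at lag 1 would suggest MA(1)-like behavior in u_t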
Problem 2.7
# MA Smoothing
wgts = c(.5, rep(1,11), .5)/12
smooth1 = filter(gtemp, sides=2, filter=wgts)
# Kernel Smoothing
smooth2 = ksmooth(time(gtemp), gtemp, "normal", bandwidth=10)
# Lowess smoothing
smooth3 = lowess(gtemp)
par(mfrow=c(3,1))
plot(gtemp, type="o", ylab="MA Smoothing")
lines(smooth1, col="red")
plot(gtemp, type="o", ylab="Kernel Smoothing")
lines(smooth2, col="red")
plot(gtemp, type="o", ylab="Lowess Smoothing")
lines(smooth3, col="red")
[Figure: gtemp (1880-2000) plotted three times with, from top to bottom, the moving average, kernel, and lowess smoothers superimposed in red.]
All methods seem to capture the trend reasonably well, but one needs to be careful when picking the tuning parameters, such as the weights of the moving average smoother or the bandwidth of the kernel smoother, as illustrated below.
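As an illustration of that sensitivity, two kernel bandwidths can be compared side by side; a minimal sketch with illustrative bandwidth values:
par(mfrow=c(2,1))
plot(gtemp, type="o", ylab="bandwidth = 2")
lines(ksmooth(time(gtemp), gtemp, "normal", bandwidth=2), col="red")  # small bandwidth: a wiggly, undersmoothed fit
plot(gtemp, type="o", ylab="bandwidth = 25")
lines(ksmooth(time(gtemp), gtemp, "normal", bandwidth=25), col="red") # large bandwidth: a much smoother fit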