STAT 512
Midterm Exam, March 1, 2007
Name:
Show all your work (you can use this page).
Each question is worth 5 points (90 total).
Good luck!
1. You use 27 observations to estimate the parameters of the simple linear regression
model. The averages of your X and Y values are $\bar{X} = 2$ and $\bar{Y} = 10$. The
estimate of the intercept, b0, is equal to 4, and its estimated standard deviation, s(b0), is
equal to 1. The estimate of the standard deviation of the error term is equal to 3.
a) Construct a 95% confidence interval for β0.
$b_0 \pm t(0.975, 25)\, s(b_0) = 4 \pm 2.06 = [1.94, 6.06]$
b) Calculate b1. (Hint: see the formula for b0.)
$b_1 = \frac{\bar{Y} - b_0}{\bar{X}} = \frac{10 - 4}{2} = 3$
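c) Calculate $SSX = \sum (X_i - \bar{X})^2$. (Hint: see the formula for $s^2(b_0)$.)
$s^2(b_0) = 9 \left( \frac{1}{27} + \frac{4}{SSX} \right) = 1$
$\frac{4}{SSX} = \frac{1}{9} - \frac{1}{27} = \frac{2}{27}$
$SSX = 54$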
d) Find s(b1).
$s^2(b_1) = \frac{9}{54} = \frac{1}{6}$
$s(b_1) \approx 0.41$
e) Construct confidence intervals for β0 and β1 so as to have at least 95% confidence
that both these parameters are included in the respective intervals.
$t\left(1 - \frac{0.05}{4}, 25\right) = t(0.9875, 25) = 2.393$
β0: $4 \pm 2.393 = [1.607, 6.393]$
β1: $3 \pm 0.41 \times 2.393 = [2.02, 3.98]$
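The arithmetic in parts (a) to (e) can be checked with a short script. This is only a sketch: the variable names are made up for illustration, and the two t critical values are copied from the solution above rather than recomputed.

```python
import math

# Quantities stated in the problem
n, x_bar, y_bar = 27, 2.0, 10.0
b0, s_b0, s = 4.0, 1.0, 3.0

# Critical values copied from the solution (assumed, not recomputed here)
t_single = 2.06    # t(0.975, 25)
t_bonf = 2.393     # t(0.9875, 25), Bonferroni for two parameters

mse = s ** 2

# (a) single 95% interval for beta0
ci_b0 = (b0 - t_single * s_b0, b0 + t_single * s_b0)

# (b) b0 = y_bar - b1 * x_bar, solved for b1
b1 = (y_bar - b0) / x_bar

# (c) s^2(b0) = MSE * (1/n + x_bar^2 / SSX), solved for SSX
ssx = x_bar ** 2 / (s_b0 ** 2 / mse - 1 / n)

# (d) s^2(b1) = MSE / SSX
s_b1 = math.sqrt(mse / ssx)

# (e) joint (Bonferroni) intervals
joint_b0 = (b0 - t_bonf * s_b0, b0 + t_bonf * s_b0)
joint_b1 = (b1 - t_bonf * s_b1, b1 + t_bonf * s_b1)

print(ci_b0, b1, ssx, s_b1, joint_b0, joint_b1)
```

The script keeps s(b1) unrounded, so the joint interval for β1 differs from the hand calculation in the third decimal place but agrees after rounding to two decimals.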
2. You use 23 observations to build a multiple regression model
$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \xi_i$, where $\xi_i \sim N(0, \sigma^2)$. Your estimates are equal to:
b0=3, b1=4, b2=3, MSE=5.
a) Estimate the mean value of Y for X1=2 and X2=1 (give a point estimate).
$\hat{\mu} = 3 + 4 \cdot 2 + 3 \cdot 1 = 14$
b) The estimated standard deviation of the estimate of the mean you obtained in part
(a), $s(\hat{\mu})$, is equal to 2. Find the corresponding standard error of the prediction
of Y.
$s(\text{pred}) = \sqrt{MSE + s^2(\hat{\mu})} = \sqrt{5 + 4} = 3$
c) Find the prediction interval for Y at X1=2 and X2=1.
t(0.975,20)=2.086
PI : 14±2.086∙3=[7.742, 20.258]
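As a quick check of Problem 2, the sketch below recomputes the point estimate, the prediction standard error and the prediction interval. The variable names are illustrative, and the critical value t(0.975, 20) = 2.086 is taken from the solution, not recomputed.

```python
import math

# Estimates and inputs stated in the problem
coef = [3.0, 4.0, 3.0]        # b0, b1, b2
x_new = [1.0, 2.0, 1.0]       # 1 for the intercept, then X1 = 2, X2 = 1
mse = 5.0
s_mean = 2.0                  # s(mu_hat), given in part (b)
t_crit = 2.086                # t(0.975, 20), copied from the solution

# (a) point estimate of the mean response
mu_hat = sum(b * x for b, x in zip(coef, x_new))

# (b) prediction variance = MSE + variance of the estimated mean
s_pred = math.sqrt(mse + s_mean ** 2)

# (c) 95% prediction interval
pred_int = (mu_hat - t_crit * s_pred, mu_hat + t_crit * s_pred)

print(mu_hat, s_pred, pred_int)
```

The error degrees of freedom are 23 − 3 = 20 because three regression coefficients are estimated.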
3. Here is the table of type I and type II sums of squares for three explanatory variables
used for the multiple regression model.
variable    Type I SS    Type II SS
X1          200          60
X2          50           20
X3          20           20

n = 24 and SST = 570.
a) Give the estimate of the variance of the error term.
SSM=200+50+20=270 (the sum of the Type I SS), SSE=570-270=300, MSE=300/20=15
b) Test the hypothesis that the response variable is not associated with any
of the explanatory variables.
F=MSM/MSE, MSM=270/3=90, F=90/15=6
F(0.95,3,20)=3.1
6>3.1 – reject H0
Conclusion – at least one of my X’s is a useful predictor
c) Give the value of R2 for the full model.
R2=270/570≈0.474
d) Test the hypothesis that β1=0 in the full model.
F=60/15=4,
F(0.95,1,20)=4.35
Do not reject H0.
X1 is not a useful predictor in the full model.
e) Test the hypothesis that β1=0 in the simple linear regression of Y on X1.
SSM=MSM=200, SSE=370, MSE=370/22, F=MSM/MSE≈11.89
F(0.95,1,22)<4.35
Reject H0
X1 is significantly correlated with Y.
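The sums-of-squares bookkeeping in Problem 3 can be reproduced as follows. This is a sketch with illustrative names; it assumes, as the solution does, that X1 was entered first, so its Type I SS equals the model sum of squares of the simple regression of Y on X1. Critical values are not computed here and should still be read from an F table.

```python
# Sums of squares given in the problem
type1 = {"X1": 200.0, "X2": 50.0, "X3": 20.0}
type2 = {"X1": 60.0, "X2": 20.0, "X3": 20.0}
n, sst = 24, 570.0

k = len(type1)                 # number of explanatory variables
df_error = n - (k + 1)         # 24 - 4 = 20

# (a) Type I SS add up to the model sum of squares
ssm = sum(type1.values())
sse = sst - ssm
mse = sse / df_error

# (b) overall F test
f_overall = (ssm / k) / mse

# (c) coefficient of determination
r2 = ssm / sst

# (d) test of beta1 = 0 in the full model uses the Type II SS of X1
f_x1_full = type2["X1"] / mse

# (e) simple regression of Y on X1 (assumes X1 was entered first)
sse_slr = sst - type1["X1"]
f_x1_slr = type1["X1"] / (sse_slr / (n - 2))

print(mse, f_overall, round(r2, 3), f_x1_full, round(f_x1_slr, 2))
```

Comparing parts (d) and (e) shows how the same variable can be significant on its own yet not significant once X2 and X3 are in the model.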
4. We study the relation between degree of brand liking (Y) and moisture component
(X1) and sweetness (X2) of the product. Below you can find the results of the SAS analysis.
Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2        1872.70000      936.35000     129.08    <.0001
Error              13          94.30000        7.25385
Corrected Total    15        1967.00000

Root MSE           2.69330    R-Square    0.9521
Dependent Mean    81.75000    Adj R-Sq    0.9447
Coeff Var          3.29455

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              37.65000           2.99610      12.57      <.0001
x1            1               4.42500           0.30112      14.70      <.0001
x2            1               4.37500           0.67332       6.50      <.0001
a) Give the result of the overall ANOVA test verifying if the response variable is
related to any of the explanatory variables (give the value of the corresponding
test statistic, the p-value and the conclusion).
F=129.08, p-value<0.0001,
Reject H0, at least one of our X’s is a useful predictor
b) Write the fitted regression equation.
$\hat{Y} = 37.65 + 4.425 x_1 + 4.375 x_2$
c) Give the estimate of the standard deviation of the error term.
s = Root MSE = 2.6933
d) Give the result of the test for significance of X2 in the full model (give the value
of the corresponding test statistic, the p-value and the conclusion).
t=6.50, p-value<0.0001, reject H0, X2 is a significant predictor in the full model
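The derived entries of the SAS output in Problem 4 follow from the sums of squares, degrees of freedom, estimates and standard errors. The sketch below recomputes them with illustrative variable names; the p-values are not recomputed.

```python
import math

# Values read from the ANOVA table
ss_model, ss_error = 1872.7, 94.3
df_model, df_error = 2, 13
dep_mean = 81.75

ms_model = ss_model / df_model
mse = ss_error / df_error
f_value = ms_model / mse
root_mse = math.sqrt(mse)          # estimate of the error standard deviation

ss_total = ss_model + ss_error
r2 = ss_model / ss_total
adj_r2 = 1 - mse / (ss_total / (df_model + df_error))
coeff_var = 100 * root_mse / dep_mean

# t value = parameter estimate / standard error
params = {
    "Intercept": (37.65, 2.99610),
    "x1": (4.425, 0.30112),
    "x2": (4.375, 0.67332),
}
t_values = {name: est / se for name, (est, se) in params.items()}

print(round(f_value, 2), round(root_mse, 4), round(r2, 4), round(adj_r2, 4))
print({name: round(t, 2) for name, t in t_values.items()})
```

Each t value is compared with a t distribution on 13 degrees of freedom, the error degrees of freedom from the ANOVA table.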
5. What can you say about the distribution of the data shown in the attached
Q-Q plot?
The distribution has (much) heavier (longer) tails than the normal distribution.
[Figure: normal Q-Q plot of x against Normal Quantiles; vertical axis from -300 to 500, horizontal axis from -4 to 4.]