Assignment 3
NAME
1. The fungus Neurospora crassa was grown at 30 degrees Celsius on an agar
medium in tubes filled with an inert gas containing approximately 5% oxygen. The
following growth rates in millimeters per hour were recorded for each inert gas
used in the experiment. The molecular weight of each gas was also recorded as a
predictor variable. The data are given in the following table.
OBSERVATION   INERT GAS   MOLECULAR WEIGHT (X)   GROWTH RATE (Y)
     1           He               4.0                  3.51
     2           Ne              20.2                  3.14
     3           N2              28.2                  3.03
     4           N2              28.2                  2.83
     5           Ar              39.9                  2.71
     6           Ar              39.9                  2.76
     7           Kr              83.8                  2.27
     8           Kr              83.8                  2.17
     9           Xe             131.3                  1.88
    10           Xe             131.3                  1.85
In this experiment the exact value of the predictor variable, molecular weight, is
essentially known. However, it is obvious that some experimental errors are
associated with the measurement of the response variable, growth rate, since
different values were observed for the growth rate when the same inert gas was
used. If additional information could be obtained about the manner in which this
experiment was performed, a list of the possible sources of error, or variation, in
the observed responses could be constructed. It is a good idea to make such a list,
but we will not do so for this assignment.
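The point about replicate measurements can be seen directly from the table. The following is a minimal sketch, using only the Python standard library, that enters the data and lists the gases measured more than once together with the range of their observed growth rates. The variable names (`gas`, `x`, `y`, `replicates`) are illustrative choices, not part of the assignment.

```python
from collections import defaultdict

# Data transcribed from the table in Problem 1.
gas = ["He", "Ne", "N2", "N2", "Ar", "Ar", "Kr", "Kr", "Xe", "Xe"]
x = [4.0, 20.2, 28.2, 28.2, 39.9, 39.9, 83.8, 83.8, 131.3, 131.3]  # molecular weight
y = [3.51, 3.14, 3.03, 2.83, 2.71, 2.76, 2.27, 2.17, 1.88, 1.85]   # growth rate, mm/hr

# Group the growth rates by gas so that replicate measurements sit together.
replicates = defaultdict(list)
for g, rate in zip(gas, y):
    replicates[g].append(rate)

# For gases measured more than once, the spread reflects experimental error,
# because X (molecular weight) is identical within each group.
for g, rates in replicates.items():
    if len(rates) > 1:
        spread = max(rates) - min(rates)
        print(f"{g}: rates={rates}, range={spread:.2f}")
```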
(a) Construct a graph of the data with the growth rates on the vertical axis.
Does a straight-line model appear to be appropriate for the relationship
between the molecular weight of the inert gas and the growth rate of the
fungus?
Before you proceed with this problem, use a straightedge to draw a line on
the graph that appears to you to best fit the data. Use the line you have
drawn on the graph to obtain "visual" estimates for the intercept and slope.
intercept =
slope =
(b) Consider the simple linear regression model discussed in class in which the
errors are assumed to be distributed as NID(0, $\sigma^2$) random variables. Use this
model in answering the rest of the questions. Find the least squares estimates for
the parameters of the regression line.
intercept =
slope =
What physical quantity does the slope represent in this case?
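As a hedged illustration of the computation in part (b), the sketch below implements the usual closed-form least squares estimates, slope = Sxy / Sxx and intercept = ybar - slope * xbar. The function name `fit_line` and the toy data are assumptions made for the example; the toy points lie exactly on y = 2 + 3x so the result is easy to check by hand. The same function can be applied to the X and Y columns of the table.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of squares of X and corrected sum of cross products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Toy data (not the assignment data) lying exactly on y = 2 + 3x.
toy_x = [0.0, 1.0, 2.0, 3.0]
toy_y = [2.0, 5.0, 8.0, 11.0]
b0, b1 = fit_line(toy_x, toy_y)
print(b0, b1)
```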
(c) Test the null hypothesis that the slope is equal to zero against the alternative that
it is not zero. Report the values for
t-test =
d.f.=
p-value =
State your conclusion.
(d) Construct a 95% confidence interval for the intercept.
lower limit=
upper limit=
Explain why this interval may be an unrealistic interval estimate for the growth rate
when X is zero.
(e) Estimate the average growth rate for a gas with a molecular weight of 50.
Estimated average growth rate =
Construct a 95% confidence interval for the average growth rate.
lower limit=
upper limit=
(f) If the experiment was repeated using Helium (He) as the inert gas, estimate the
growth rate that would be observed.
Estimated growth rate =
Construct a 95% interval for the potential value for the observed growth rate.
lower limit =
upper limit =
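Parts (e) and (f) differ only in the standard error used: the interval for a mean response uses s^2 (1/n + (x0 - xbar)^2 / Sxx), while the interval for a new observation adds an extra s^2 for the new error term. The sketch below shows that difference under the NID model. The function name `intervals_at` is hypothetical, the toy data are invented for illustration, and the critical value `t_crit` is assumed to be looked up by the reader in a t table with n - 2 degrees of freedom (3.182 is the 0.975 quantile for 3 degrees of freedom, matching the five toy points).

```python
import math

def intervals_at(x0, x, y, t_crit):
    """Return (fit, mean_interval, prediction_interval) at X = x0.

    t_crit is the caller-supplied t quantile with n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual mean square estimates the error variance.
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    fit = b0 + b1 * x0
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = math.sqrt(s2 * leverage)          # for the average response
    se_pred = math.sqrt(s2 * (1.0 + leverage))  # for a single new observation
    mean_ci = (fit - t_crit * se_mean, fit + t_crit * se_mean)
    pred_int = (fit - t_crit * se_pred, fit + t_crit * se_pred)
    return fit, mean_ci, pred_int

# Toy data (not the assignment data); n = 5, so the t quantile has 3 d.f.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit, mean_ci, pred_int = intervals_at(2.5, toy_x, toy_y, t_crit=3.182)
print(fit, mean_ci, pred_int)
```

The prediction interval is always wider than the confidence interval for the mean at the same x0, because it accounts for the variability of the new observation as well as the uncertainty in the fitted line.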
(g) Complete the following ANOVA table:
SOURCE OF VARIATION    DF    SUM OF SQUARES    MEAN SQUARE    F
REGRESSION ON X
EXPERIMENTAL ERROR
CORRECTED TOTAL
(h) Determine the proportion of variation in the observed growth rate that is
explained by the simple linear regression model.
R-squared =
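The entries of the ANOVA table in part (g) and the R-squared value in part (h) come from one decomposition: corrected total SS = regression SS + error SS. The following sketch, with the hypothetical function name `anova_table` and invented toy data, shows how the pieces fit together; it is an illustration of the standard formulas, not a prescribed solution method.

```python
def anova_table(x, y):
    """Return a dict with the ANOVA quantities for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)  # corrected total, n - 1 d.f.
    ss_reg = sxy ** 2 / sxx                         # regression on X, 1 d.f.
    ss_err = ss_total - ss_reg                      # experimental error, n - 2 d.f.
    ms_reg = ss_reg / 1
    ms_err = ss_err / (n - 2)
    return {
        "df": (1, n - 2, n - 1),
        "ss": (ss_reg, ss_err, ss_total),
        "ms": (ms_reg, ms_err),
        "F": ms_reg / ms_err,
        "R2": ss_reg / ss_total,
    }

# Toy data (not the assignment data).
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.1, 3.9, 6.2, 7.8, 10.1]
table = anova_table(toy_x, toy_y)
print(table)
```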
2. Occasionally a linear regression model is considered in which the intercept is
specified to be zero. This model is represented as
Yi= Xi+  i , i = 1,2,…,n ,
where the  i are NID(0, 2). Note that this regression line must pass through the
origin, that is, the mean of the response is zero when X is zero.
(a) Write down the quantity that must be minimized to find the least squares
estimate of $\beta$.
(b) Write out the normal equation.
(c) Find the solution to the normal equation.
b=
(d) Give a formula for the true variance of b.
(e) How would you estimate $\sigma^2 = \mathrm{Var}(\varepsilon_i)$? Give a formula.
(h) Give a formula for a 95% confidence interval for $\beta$.
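For readers who want to check their algebra numerically, the sketch below fits the no-intercept model using the standard textbook results for regression through the origin: b = sum(X_i Y_i) / sum(X_i^2), with the error variance estimated by SSE / (n - 1), since only one parameter is fitted. The function name `fit_through_origin` and the toy data are assumptions made for this example.

```python
def fit_through_origin(x, y):
    """Return (b, s2) for the model Y_i = beta * X_i + error_i."""
    n = len(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi ** 2 for xi in x)
    b = sum_xy / sum_xx
    # One parameter is estimated, so the residual d.f. is n - 1.
    sse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 1)
    return b, s2

# Toy data (not from the assignment) lying exactly on y = 3x.
toy_x = [1.0, 2.0, 4.0]
toy_y = [3.0, 6.0, 12.0]
b, s2 = fit_through_origin(toy_x, toy_y)
print(b, s2)
```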
3. Consider the simple linear regression model
Yi= 0 +1Xi+  i , i = 1,2,…,n ,
for which the random errors are independent and identically distributed with a
normal distribution with mean zero and variance 2. Show that the covariance
_
between the sample mean of the responses Y and the least squares estimate for the
slope of the regression line is zero. Which properties of the model are used in your
proof?
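A numerical sanity check can accompany the proof. Both statistics are linear in the responses: the sample mean uses weights 1/n, and the slope estimate uses weights c_i = (X_i - Xbar) / Sxx. When the Y_i are uncorrelated with common variance, the covariance equals sigma^2 times the sum of c_i / n. The sketch below evaluates that sum for the molecular weights in Problem 1; the function name `slope_weights` is an illustrative choice, and the check does not replace the algebraic argument.

```python
def slope_weights(x):
    """Weights c_i such that the least squares slope equals sum(c_i * Y_i)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [(xi - x_bar) / sxx for xi in x]

# Molecular weights from the table in Problem 1.
x = [4.0, 20.2, 28.2, 28.2, 39.9, 39.9, 83.8, 83.8, 131.3, 131.3]
n = len(x)
c = slope_weights(x)

# Cov(Ybar, b1) = sigma^2 * sum(c_i * (1/n)) for uncorrelated, equal-variance Y_i.
cov_factor = sum(ci / n for ci in c)
print(cov_factor)  # zero up to floating-point rounding
```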