Chapter 3 Regression and Correlation
Simple least squares regression (SLR): a method to find the "best-fitting line" to a set of n points (x, y). SLR minimizes the sum of squared vertical distances from the points to the fitted line, which is why the procedure is also called least squares regression.
Correlation coefficient (r):
A number that measures the strength and direction of the linear association between X and Y (both quantitative variables).
• $-1 \le r \le +1$ always.
• r = +1 when there is a perfectly linear increasing relationship between X and Y.
• r = −1 when there is a perfectly linear decreasing relationship between X and Y.
• Correlation has no units; it is a unit-less quantity.
$R^2 = r^2$ is called the coefficient of determination. R² measures the percent of variability in the response (Y) explained by the changes in X [or by the regression on X]. What does R² = 0.81 (= 81%) mean?
How do you find r when you are given R²? For example, what is r if R² = 0.81 = 81%?
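As a quick illustration in code, here is a minimal Python sketch (numpy assumed available; the small data set is made up purely for illustration) computing r and R²:

```python
import numpy as np

# Illustrative (made-up) data; any two quantitative variables work.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson correlation coefficient r
r = np.corrcoef(x, y)[0, 1]

# Coefficient of determination
r_squared = r ** 2

print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
# Note r is the signed square root of R^2: if R^2 = 0.81 and the
# scatterplot shows a decreasing trend, then r = -0.9.
```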
Example: Suppose your friend claims that she can guess a person's age correctly (well, almost). To see whether this claim is justifiable, you select a random sample of 10 people, ask your friend to guess their ages, and then ask each person his/her true age. The following are observed:

ID            1   2   3   4   5   6   7   8   9  10
Guessed age  18  52  65  90  28  58  13  66  44  35
True age     20  45  70  85  25  50  15  60  40  35
The very first step in regression analysis is to identify the independent (explanatory) and the dependent (response) variables. Since the true age determines your friend's guesses (and the guess has no effect on a person's true age), we have
X = Independent Variable = True Age
Y = Response = Dependent Variable = Guessed Age.
The next step is to draw a scatter diagram of the data and interpret what you see (to get some idea about the relation between the two variables).
[Scatterplot of Guessed Age (y-axis, 10–100) vs. True Age (x-axis, 10–90)]
1. What do you see?
2. Verify the following summary statistics using your calculator:
$\bar{x} = 44.5$,  $s_X = 22.42$
$\bar{y} = 46.9$,  $s_Y = 24.02$
3. Compute the slope and intercept of the least squares regression line, given that r = 0.9844.
Slope: $b = r \cdot \dfrac{s_Y}{s_X} = 0.9844 \times \dfrac{24.02}{22.42} = 1.054651561$
Intercept: $a = \bar{y} - b\,\bar{x} = 46.9 - 1.054651561 \times 44.5 = -0.0319944692$
Hence the prediction equation is $\hat{y} = -0.03 + 1.05x$.
Are these results consistent with what you have
observed in the scatter plot?
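To double-check these hand calculations, here is a minimal Python sketch (numpy assumed; the variable names are our own) reproducing the slope and intercept from the summary-statistic formulas:

```python
import numpy as np

true_age    = np.array([20, 45, 70, 85, 25, 50, 15, 60, 40, 35])  # X
guessed_age = np.array([18, 52, 65, 90, 28, 58, 13, 66, 44, 35])  # Y

r = np.corrcoef(true_age, guessed_age)[0, 1]               # correlation
b = r * guessed_age.std(ddof=1) / true_age.std(ddof=1)     # slope
a = guessed_age.mean() - b * true_age.mean()               # intercept

print(f"r = {r:.4f}")   # about 0.9844
print(f"b = {b:.4f}")   # about 1.0547
print(f"a = {a:.4f}")   # about -0.0320
```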
4. Interpret the numerical results:
• Correlation: r = 0.9844, so there is a strong, increasing, linear relationship between the true and guessed ages.
• Slope = 1.05: for every one-year increase in the true age, the guessed age increases by 1.05 years on average.
• Intercept = −0.03. DO NOT INTERPRET the intercept in this case because
a) zero is not within the range of observed values of the independent variable (X), and
b) zero and −0.03 are not meaningful in this context.
5. Compute R² (coefficient of determination) and interpret it.
$R^2 = r^2 = (0.9844)^2 = 0.969 = 96.9\%$
Interpretation:
• 96.9% of the variation in guessed ages (Y) is explained by the true age (X).
• 96.9% of the variability in guessed ages is explained by the linear regression on true ages.
6. Plot the estimated regression line on the scatter diagram.
For this we choose two values of X (as far apart as is meaningful) and predict the value of Y at those two values using the prediction equation $\hat{y} = -0.03 + 1.05x$.
For x = 15: $\hat{y} = -0.03 + 1.05 \times 15 = 15.72$.
For x = 90: $\hat{y} = -0.03 + 1.05 \times 90 = 94.47$.
This gives two points, (15, 15.72) and (90, 94.47). Mark these points on the scatter diagram and join them with a ruler.
[Scatterplot of Guessed Age vs. True Age with the fitted line drawn through (15, 15.72) and (90, 94.47)]
Chapter 11 Inferences for SLR
In Chapter 3, a linear relation between two quantitative variables (denoted by X and Y) was shown by $\hat{y} = a + bX$.
In this equation (called the prediction equation)
Y is called the response (dependent) variable,
X is called the explanatory variable,
$\hat{y}$ is called the predicted value of Y,
$a = \hat{\alpha}$ = estimate of the intercept (α), and
$b = \hat{\beta}$ = estimate of the slope (β).
Hence, $\hat{y} = a + bX$ is an estimate for a simple linear regression model.
Also, $\hat{y}$ is called the estimate of the true (but unknown) regression line $\mu_Y = \alpha + \beta X$, so we also write $\hat{y} = \hat{\mu}_Y$.
Regression Model: A mathematical (or theoretical) equation that shows the linear relation between the explanatory variable and the response. The simple linear regression model we will use is
$Y = \alpha + \beta X + \varepsilon$, where $\varepsilon$ is called the error term.
Let's see the error terms graphically:
• Total error $= y - \bar{y} = (y - \hat{y}) + (\hat{y} - \bar{y})$ = Random error + Regression error.
• In this relation, the total error is divided into two parts: the random error $y - \hat{y}$, and the regression error $\hat{y} - \bar{y}$ = the error due to using the regression model instead of the sample mean, $\bar{y} = 46.9$.
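A quick numerical check of this decomposition, as a sketch in Python (same illustrative arrays as before):

```python
import numpy as np

true_age    = np.array([20, 45, 70, 85, 25, 50, 15, 60, 40, 35])
guessed_age = np.array([18, 52, 65, 90, 28, 58, 13, 66, 44, 35])

# Fitted values from the least squares line (full-precision coefficients)
b = np.corrcoef(true_age, guessed_age)[0, 1] * guessed_age.std(ddof=1) / true_age.std(ddof=1)
a = guessed_age.mean() - b * true_age.mean()
y_hat = a + b * true_age

y_bar  = guessed_age.mean()
sst    = np.sum((guessed_age - y_bar) ** 2)   # total variation
sse    = np.sum((guessed_age - y_hat) ** 2)   # random (residual) error
ss_reg = np.sum((y_hat - y_bar) ** 2)         # regression error

print(sst, sse + ss_reg)   # both about 5190.9: SST = SSE + SSReg
```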
[Scatterplot of Guessed Age vs. True Age illustrating the error decomposition]
The unbiased estimators of the parameters (α and β) of the regression line (those given in Chapter 3) are found by the method of least squares estimation (LSE) as
$\hat{\beta} = b = r \cdot \dfrac{S_Y}{S_X}$ and $\hat{\alpha} = a = \bar{Y} - b\bar{X}$.
Hence,
• β is the (true and unknown) slope of the regression line,
• β is estimated by $\hat{\beta} = b = r \cdot \dfrac{S_Y}{S_X}$, and
• α is the (true and unknown) y-intercept (or simply intercept) of the regression line,
• α is estimated by $\hat{\alpha} = a = \bar{Y} - b\bar{X}$.
What do the slope and intercept of the regression line tell us?
• Slope is the average amount of change in Y for one unit of increase in X.
Note: slope ≠ rise / run. Why?
• Intercept is the value of Y when X = 0.
Important Note: We DO NOT use the above interpretation when
a) X = 0 is not meaningful, or
b) zero is not within the range of, or near, the observed values of X.
Assumptions of Simple Linear Regression:
1. A random sample of n pairs of observations, (X₁, Y₁), (X₂, Y₂), …, (Xₙ, Yₙ).
2. The population of Y's has a normal distribution with mean $\mu_Y = \alpha + \beta X$, which changes with each value of X, and standard deviation σ, which is the same at every value of the independent variable X. The relation between X and Y may also be formulated as $Y = \alpha + \beta X + \varepsilon$.
3. As a result of the above assumptions, the error terms ε are iid (independent and identically distributed) random variables that have a normal distribution with mean zero and standard deviation σ, i.e., $\varepsilon \sim N(0, \sigma)$.
4. These mean that both Y and ε are random variables [we may choose any value for X, hence X is assumed to be a non-random variable (even when it is random)].
Are these assumptions satisfied in Example 1?
Assumption 1 (random sample) is satisfied.
To check assumptions 2 and 3 we look at the residuals, where
Residual = Observed value of Y − Predicted value of Y = $y - \hat{y}$.
If these residuals do not have any extreme values, we say assumptions 2 and 3 are justifiable, since we do not have any reason to suspect otherwise (more later).
So, let's calculate the residuals using the prediction equation $\hat{y} = -0.03 + 1.05x$ found in Chapter 3 (carrying full precision in the coefficients). Then we will plot the residuals.
True age (x)   Observed y   Predicted (ŷ)   Residual (y − ŷ)
20             18           21.06           −3.06
45             52           47.43            4.57
70             65           73.79           −8.79
85             90           89.61            0.39
25             28           26.33            1.67
50             58           52.70            5.30
15             13           15.79           −2.79
60             66           63.25            2.75
40             44           42.15            1.85
35             35           36.88           −1.88
[Histogram of residuals (response is Guessed Age): frequencies over bins from −5.0 to 10.0 in steps of 2.5]
Do you think the assumption of normality is
satisfied? Why or why not?
Inferences about parameters of SLR
The parameters of the regression model are α, β, and σ.
These parameters are estimated by a, b, and S, respectively.
Chapter 11 deals with inferences about the true Simple Linear Regression (SLR) model¹, i.e., a regression model with one explanatory variable (X).
When making inferences about the parameters of the regression model, we will determine
• whether X is a "good predictor" of Y,
• whether the regression line is useful for making predictions about Y,
• whether the slope is different from zero.
In this chapter we will also see how to find
• a prediction interval for an individual response Y at X = x,
• a confidence interval for the mean of Y, that is, $\mu_Y$ = mean response, at X = x.
We carry out these using ANOVA.
¹ In Chapter 12 we will see how to make inferences about the parameters of a multiple regression model, i.e., a regression model with several (k ≥ 2) explanatory variables, X₁, X₂, …, Xₖ.
ANOVA FOR SLR
Is X a good predictor of Y? This is equivalent to asking: is the slope of the line significantly different from zero? [If not, we might as well use $\bar{Y}$ as a predictor.] We can answer these questions using an ANOVA table:

ANOVA for SLR
Source               df       SS       MS                     F
Regression (Model)   1        SSReg    MSReg = SSReg / 1      F = MSReg / MSE
Residuals (Error)    n − 2    SSE      MSE = SSE / (n − 2)
Total                n − 1    SST

Total SS = Model SS + Error SS, that is, SST = SSReg + SSE:
$$\sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
and df: (n − 1) = 1 + (n − 2).
The df for regression = 1 because there is only 1 independent variable.
The df for residuals = n − 2 because we estimate 2 parameters (α and β).
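The table entries can be computed directly from the data; here is a minimal sketch (numpy assumed, arrays as in the earlier snippets) that should reproduce the Minitab ANOVA output shown below:

```python
import numpy as np

true_age    = np.array([20, 45, 70, 85, 25, 50, 15, 60, 40, 35])
guessed_age = np.array([18, 52, 65, 90, 28, 58, 13, 66, 44, 35])
n = len(true_age)

b = np.corrcoef(true_age, guessed_age)[0, 1] * guessed_age.std(ddof=1) / true_age.std(ddof=1)
a = guessed_age.mean() - b * true_age.mean()
y_hat = a + b * true_age

ss_reg = np.sum((y_hat - guessed_age.mean()) ** 2)   # df = 1
sse    = np.sum((guessed_age - y_hat) ** 2)          # df = n - 2
ms_reg = ss_reg / 1
mse    = sse / (n - 2)
f_stat = ms_reg / mse

print(f"SSReg = {ss_reg:.1f}, SSE = {sse:.1f}, MSE = {mse:.1f}, F = {f_stat:.2f}")
# Expect roughly SSReg = 5030.0, SSE = 160.9, MSE = 20.1, F = 250.08
```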
Assumptions for ANOVA:
• Random sample
• Normal distribution (of ε and hence Y)
• Constant variance (of ε and Y)
The hypothesis of interest is H₀: β = 0 vs. Hₐ: β ≠ 0.
Test statistic = F = MSReg / MSE.
To find the p-value, first find the tabulated F-value from the F-tables with df₁ = 1 and df₂ = n − 2; then compare that value with the F in the ANOVA table. The following is the output obtained from Minitab:

Regression Analysis: Guessed Age vs. True Age

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       1  5030.0  5030.0  250.08  0.000
Residual Error   8   160.9    20.1
Total            9  5190.9

To test H₀: β = 0 vs. Hₐ: β ≠ 0, the test statistic is F = 250.08 from the ANOVA table.
The F-value is extremely large!!! What does it mean?
The p-value = 0.000. What does it mean?
Decision?
Conclusion?
Decision: Reject H₀, since the p-value (< 0.0005) is less than any reasonable level of significance.
Conclusion: The observed data indicate that the slope is significantly different from zero.
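For reference, the exact upper-tail probability can be checked in code (a sketch assuming scipy is available; the notes themselves use F-tables):

```python
from scipy import stats

# Upper-tail probability of F(1, 8) beyond the observed statistic
p_value = stats.f.sf(250.08, dfn=1, dfd=8)
print(p_value)   # a very small number (on the order of 1e-07), which Minitab rounds to 0.000
```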
Using the t-test
We may also use the t-test for testing the above hypotheses, as explained in Chapter 8. For this we use the first block of the Minitab output:

The regression equation is
Guessed Age = − 0.03 + 1.05 True Age

Predictor   Coef      SE Coef   T       P
Constant    −0.030    3.289     −0.01   0.993
True Age    1.05462   0.06669   15.81   0.000

S = 4.48483  R-Sq = 96.9%  R-Sq(adj) = 96.5%
In this case the parameter is β.
Estimate = b = 1.05462, SE(Estimate) = 0.06669.
Significance test: H₀: β = 0 vs. Hₐ: β ≠ 0.
$$T = \frac{\text{Estimator} - \text{Value of parameter in } H_0}{SE(\text{Estimator})}$$
Calculated value of the test statistic:
$$T_{cal} = \frac{1.05462 - 0}{0.06669} = 15.81$$
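The same statistic can be computed from scratch using the standard-error formula $SE(b) = S / \sqrt{\sum_i (X_i - \bar{X})^2}$; a sketch assuming the arrays defined earlier and scipy for the p-value:

```python
import numpy as np
from scipy import stats

true_age    = np.array([20, 45, 70, 85, 25, 50, 15, 60, 40, 35])
guessed_age = np.array([18, 52, 65, 90, 28, 58, 13, 66, 44, 35])
n = len(true_age)

b = np.corrcoef(true_age, guessed_age)[0, 1] * guessed_age.std(ddof=1) / true_age.std(ddof=1)
a = guessed_age.mean() - b * true_age.mean()
residuals = guessed_age - (a + b * true_age)

s    = np.sqrt(np.sum(residuals**2) / (n - 2))                    # S = sqrt(MSE), about 4.48483
se_b = s / np.sqrt(np.sum((true_age - true_age.mean())**2))       # SE(b), about 0.06669

t_cal   = (b - 0) / se_b                                          # about 15.81
p_value = 2 * stats.t.sf(abs(t_cal), df=n - 2)                    # two-sided p-value
print(f"T = {t_cal:.2f}, p = {p_value:.2e}")
```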
To find the p-value, go back and look at Hₐ. We have a two-sided alternative and hence
p-value = 2 × P(T ≥ |T_cal|) = 2 × P(T ≥ 15.81) ≈ 0.
This p-value gives us the same decision and conclusion as the one we got from the ANOVA table.
In general, to find the p-value we would look up T_cal in the t-table on the line with df = n − 2. [This is the df for error in the ANOVA table.]
Compare the T_cal above with the F_cal in the ANOVA table. We have the following general relation between T_cal and F_cal in SLR: $T_{cal}^2 = F_{cal}$ and, equivalently, $T_{cal} = \pm\sqrt{F_{cal}}$.
So the p-value for the t-test is the same as the p-value for the F-test. Hence in SLR the two significance tests for the slope, the F-test (using the ANOVA table) and the t-test, give the same results.
Observe that the above conclusion does not tell us in what way β is different from zero. We could use the t-test for testing one-sided alternatives about β. However, these should be decided before looking at the data.
Confidence Interval for the Slope:
Remember the general formula for the confidence interval:
CI = (Estimate ± ME) = (Estimate ± t* × SE(Estimate))
This is used in finding a CI for β, where the estimate is b and SE(estimate) is given in the Minitab output. All we need to do is find t* from the t-tables with df = (n − 2) = df_error in the ANOVA table.
For the above example we had the following results from Minitab:

Predictor   Coef      SE Coef   T       P
Constant    −0.030    3.289     −0.01   0.993
True Age    1.05462   0.06669   15.81   0.000

That is, b = slope = 1.05462 and SE(slope) = 0.06669. Also, since df_error = 8 in the ANOVA table, we use the table of the t-distribution and read the t-value for a 95% CI on the row with df = 8 as t* = 2.306, which gives
ME = t* × SE(Estimate) = 2.306 × 0.06669 = 0.153787.
Hence a 95% CI for β is
CI = (1.05462 ± 0.15379) = (0.90083, 1.20841) ≈ (0.9, 1.2)
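The same interval in a few lines of Python (values taken from the Minitab output above; scipy assumed):

```python
from scipy import stats

b, se_b, df = 1.05462, 0.06669, 8

t_star = stats.t.ppf(0.975, df)   # two-sided 95% uses the 0.975 quantile, about 2.306
me     = t_star * se_b
print(f"95% CI for the slope: ({b - me:.5f}, {b + me:.5f})")
# about (0.90083, 1.20841)
```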
As in Chapters 7–9, we can use the CI to make a decision for the significance test: when zero is not in the CI, we reject H₀ and conclude that
the observed data give strong evidence that the slope of the regression line is different from zero.
Actually, we can say more: since the CI for β in this example is (0.9, 1.2), both ends of the CI are positive, so we can conclude with 95% confidence that the slope of the true regression line is some number between 0.9 and 1.2.
Alternatively, we interpret the CI as follows:
We are 95% confident that, on average, as the true age increases by one year, the guessed age increases by somewhere between 0.9 and 1.2 years.
Confidence Interval for Mean Response and Prediction Interval
General formula for CIs:
CI = (Estimator ± t* × SE(Estimator))
Additional symbols:
• $\mu_{Y|x^*} = \alpha + \beta x^*$
= mean response for the population of ALL Y's that have X = x*
= the point on the true regression line that corresponds to X = x*.
• $\hat{\mu}_{Y|x^*} = \hat{y}\,|\,x^* = a + bx^*$
= estimator of the mean response at X = x*.
• SE($\hat{y}\,|\,x^*$) = SE(Estimator of Mean Response)
$$= S\sqrt{\frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}}$$
• Hence the CI for the mean response is
$$CI(\text{Mean Response}) = \hat{y} \pm t^* \cdot S\sqrt{\frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}}$$
• $\hat{y}\,|\,x^* = a + bx^*$ = predicted value for one new response at X = x*.
• SE($\hat{y}\,|\,x^*$) = SE(One new response)
$$= S\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}}$$
• Hence the prediction interval (PI) for one new response is
$$PI(\text{One New Response}) = \hat{y} \pm t^* \cdot S\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}}$$
A computational sketch of both intervals is given below.
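Here is that sketch: a Python translation of the two formulas (arrays as before; x* = 65 is chosen to match the worked example that follows). The results should agree with the Minitab intervals quoted later.

```python
import numpy as np
from scipy import stats

true_age    = np.array([20, 45, 70, 85, 25, 50, 15, 60, 40, 35])
guessed_age = np.array([18, 52, 65, 90, 28, 58, 13, 66, 44, 35])
n = len(true_age)

b = np.corrcoef(true_age, guessed_age)[0, 1] * guessed_age.std(ddof=1) / true_age.std(ddof=1)
a = guessed_age.mean() - b * true_age.mean()
s = np.sqrt(np.sum((guessed_age - (a + b * true_age))**2) / (n - 2))  # S = sqrt(MSE)

x_star = 65
y_hat  = a + b * x_star                                 # center of both intervals
sxx    = np.sum((true_age - true_age.mean())**2)
t_star = stats.t.ppf(0.975, n - 2)                      # 95%, df = n - 2

se_mean = s * np.sqrt(1/n + (x_star - true_age.mean())**2 / sxx)      # for the CI
se_new  = s * np.sqrt(1 + 1/n + (x_star - true_age.mean())**2 / sxx)  # for the PI

print(f"95% CI: ({y_hat - t_star*se_mean:.2f}, {y_hat + t_star*se_mean:.2f})")
print(f"95% PI: ({y_hat - t_star*se_new:.2f}, {y_hat + t_star*se_new:.2f})")
# Expect roughly (63.98, 73.06) and (57.22, 79.82), matching the Minitab output.
```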
• Compare the formulas for the CI and the PI to see the difference between them:
$$CI(\text{Mean Response}) = \hat{y} \pm t^* \cdot S\sqrt{\frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}}$$
$$PI(\text{One New Response}) = \hat{y} \pm t^* \cdot S\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}}$$
In both of the above formulas
• S = standard deviation of the points around the regression line = $\sqrt{MSE}$,
• df = df_error,
• x* = a particular value of X for which we are making the prediction.
• Both the CI and the PI are centered around $\hat{y}\,|\,x^* = a + bx^*$ = the prediction at X = x*.
• The PI for a new response is always wider than the CI for the mean response at the same value of X = x*. (Why?)
• SE's, and hence the intervals, are narrower when x* is close to $\bar{X}$ = the mean of the sample of X's, and wider when x* is far from $\bar{X}$. (Why?)
CI and PI for the Age Prediction problem
[Fitted-line plot: Guessed Age = −0.030 + 1.055 True Age, with 95% CI and 95% PI bands; S = 4.48483, R-Sq = 96.9%, R-Sq(adj) = 96.5%]
Age prediction example (continued):
a) Suppose you want to know, with 95% confidence, the range of your friend's guesses for a 65-year-old person.
Here we have one value of X, x* = 65, hence you want a 95% prediction interval at this value. Using the prediction equation we have found, we get the predicted value of Y at X = 65 as
$\hat{y} = -0.03 + 1.05x^* = -0.03 + 1.05 \times 65 = 68.22$.
Calculations for the SE's are long and tedious. However, we can use any statistical software to get what we want easily. For example, using Minitab we got the prediction interval at X = 65 as PI = (57.22, 79.82).
Observe that the center of the above interval is 68.52. This is the predicted value of Y (ŷ) that Minitab calculated using X = 65. [This is slightly different from what we found because Minitab carries more digits after the decimal point in its calculations.]
b) You want to know, with 95% confidence, the average of your friend's guesses for all people aged 65.
Since we are now looking for the mean of all guessed ages at X = 65, this is a problem of a CI for the mean response.
Minitab gives this as CI = (63.98, 73.06).
Observe that both the CI and the PI are centered at the same point, i.e., around ŷ = 68.52.
Finally, observe the difference in the lengths of the intervals we got from Minitab:
95% CI at X = 65 is (63.98, 73.06); length of CI = 73.06 − 63.98 = 9.08.
95% PI at X = 65 is (57.22, 79.82); length of PI = 79.82 − 57.22 = 22.60.
As mentioned before, the PI is ALWAYS wider than the CI at the same level of confidence and the same value of X.
More on R²:
We have seen that R² = r². R² can also be defined and calculated from the following relation:
$$R^2 = \frac{SSReg}{SST} = \frac{\text{Variation in } Y \text{ explained by the regression}}{\text{Total variation in } Y}$$
This leads to alternative interpretations of R²:
• R² is the proportion of variability in Y that is explained by the regression on X, or equivalently,
• R² is the proportional reduction in the prediction error, that is,
• R² is the percentage reduction in prediction error we will see when the prediction equation is used, instead of $\bar{y}$ = the sample mean of Y, as the predicted value of Y.
Example: In the ANOVA table for the analysis of guessed ages we had the following output:

S = 4.48483  R-Sq = 96.9%  R-Sq(adj) = 96.5%

Source          DF      SS      MS       F      P
Regression       1  5030.0  5030.0  250.08  0.000
Residual Error   8   160.9    20.1
Total            9  5190.9

Then $R^2 = \dfrac{SSReg}{SST} = \dfrac{5030.0}{5190.9} = 0.969$ = 96.9%.
This is the same result we had from Minitab, as it should be. We may now interpret this as follows:
The regression model yields a predicted value for Y that has 96.9% less error than we would have if we used the sample mean of the Y's as the predicted value.
More on Residuals:
Residual = vertical distance from an observed point to the predicted value at the same X
= observed y − predicted y = $y - \hat{y}$,
where $\hat{y} = -0.03 + 1.05x$ (full precision carried in the calculations below).

True age (x)   Observed y   Predicted (ŷ)   Residual (y − ŷ)
20             18           21.06           −3.06
45             52           47.43            4.57
70             65           73.79           −8.79
85             90           89.61            0.39
25             28           26.33            1.67
50             58           52.70            5.30
15             13           15.79           −2.79
60             66           63.25            2.75
40             44           42.15            1.85
35             35           36.88           −1.88

Hence, for someone whose actual age is 35, the predicted value of his/her guessed age is 36.88. This means the prediction was 1.88 years higher than the true age.
• Positive residuals: observations above the regression line.
• Negative residuals: observations below the regression line.
• Sum of residuals = 0, ALWAYS.
• We (or computers) can make residual plots to see if there are any problems with the assumptions.
• Computers find "standardized residuals" = a z-score for each observation. Any point that has a z-score bigger than 3 in absolute value, i.e., |z| > 3, is called an outlier.
More on Correlation:
If the distance between a given value of X, say x*, and $\bar{X}$ (in absolute value) is k standard deviations, i.e., $|x^* - \bar{X}| = k \cdot S_X$, then the distance (in absolute value) between the predicted value of y ($\hat{y}$) at x* and $\bar{Y}$ is r × k standard deviations, i.e., $|\hat{y} - \bar{Y}| = r \cdot k \cdot S_Y$.
Example: Suppose Y = heights of children, X = heights of their fathers, and the correlation between the two variables is r = 0.5.
Then,
• If a father's height is 2 standard deviations above the mean height of all fathers, then the predicted height of his child will be 0.5 × 2 = 1 standard deviation above the mean height of all children.
• If a father's height is 1.5 standard deviations below the mean height of all fathers, then his child's predicted height will be 0.5 × 1.5 = 0.75 standard deviations below the mean height of all children.
Some more on correlation
• Correlation is very much affected by outliers and influential points.
• Outliers weaken the correlation.
• Influential points (points far from the rest of the observations in the x-direction that do not follow the trend) may change the sign and value of the slope.
Residual Plots
Residuals are the estimators of the error term (ε) in the regression model. Thus, the assumption of normality of ε can be checked by looking at the histogram of the residuals.
• A histogram of residuals that is (almost) bell-shaped (symmetric) supports the assumption of normality of the residuals.
• A histogram or a dot plot that shows outliers is indicative of a violation of the assumption of normality.
• A normal probability plot or normal quantile plot can also be used to check the normality assumption. Points in a normal probability or quantile plot falling around a straight line support the assumption of normality.
Plot of residuals against the explanatory variable (X)
Such plots magnify any problems with the assumptions; a code sketch for producing them follows this list.
• If the residuals are randomly scattered around the line residuals = 0, this is good: it means nothing else is left after using X to predict Y.
• If the residual plot shows a curved pattern, this indicates that a curvilinear fit (quadratic?) would give better results.
• If the residual plot is funnel-shaped, the assumption of constant variance is violated.
• If the residual plot shows an outlier, this may indicate a violation of normality and/or constant variance, or point to an influential observation.
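A minimal matplotlib sketch (our own illustration, arrays as in the age example) that draws the residual histogram and the residuals-versus-X plot discussed above:

```python
import numpy as np
import matplotlib.pyplot as plt

true_age    = np.array([20, 45, 70, 85, 25, 50, 15, 60, 40, 35])
guessed_age = np.array([18, 52, 65, 90, 28, 58, 13, 66, 44, 35])

b = np.corrcoef(true_age, guessed_age)[0, 1] * guessed_age.std(ddof=1) / true_age.std(ddof=1)
a = guessed_age.mean() - b * true_age.mean()
residuals = guessed_age - (a + b * true_age)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.hist(residuals, bins=6)        # check normality: roughly bell-shaped?
ax1.set(title="Histogram of residuals", xlabel="Residual", ylabel="Frequency")

ax2.scatter(true_age, residuals)   # check random scatter / constant variance
ax2.axhline(0, linestyle="--")
ax2.set(title="Residuals vs X", xlabel="True Age", ylabel="Residual")
plt.show()
```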
11.5 Exponential regression
This is one of the nonlinear regression models, of the following form: $Y = \alpha\beta^X\varepsilon$ or, equivalently, $\mu_Y = \alpha\beta^X$. The model is called "exponential" because the independent variable X appears as the exponent of the coefficient β.
Observe that when we take the logarithm of the model we obtain $\log(\mu_Y) = \log\alpha + (\log\beta)X$; hence the logarithm of the mean of Y is a linear function of X with coefficients log(α) and log(β).
Note that when X = 0, $\beta^X = \beta^0 = 1$. Thus α gives us the mean of Y at X = 0, since $\mu_Y = \alpha\beta^0 = \alpha(1) = \alpha$.
The parameter β represents the multiplicative effect of X on Y (as opposed to the additive effect in the simple linear regression we have seen so far). So if, for example, β = 1.5, increasing X by one unit will increase Y by 50% from its previous value, i.e., we need to multiply the value of Y at the previous value of X by 1.5 to obtain the current value.
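A sketch of fitting such a model via the log transformation just described (the data here are made up for illustration):

```python
import numpy as np

# Made-up data following rough exponential growth, y ~ alpha * beta**x
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([2.1, 3.0, 4.4, 6.8, 10.1, 15.2])

# log(y) = log(alpha) + log(beta) * x is linear in x, so fit a line to log(y)
log_beta, log_alpha = np.polyfit(x, np.log(y), 1)   # returns (slope, intercept)

alpha = np.exp(log_alpha)   # estimated mean of Y at X = 0
beta  = np.exp(log_beta)    # multiplicative effect of one unit of X

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
# beta ~ 1.5 would mean Y grows about 50% per unit increase in X.
```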
Summary of SLR
Model: $y = \alpha + \beta x + \varepsilon$
Assumptions:
a) Random sample
b) Normal distribution
c) Constant variance
d) $\varepsilon \sim N(0, \sigma)$
Parameters and Estimators:
• Intercept = α, estimated by $a = \bar{Y} - b\bar{X}$
• Slope = β, estimated by $b = r \cdot \dfrac{S_Y}{S_X}$
• Standard deviation = σ, estimated by $S = \sqrt{MSE}$
Interpretation of
• Slope
• Intercept
• R²
• r
Testing if the model is good:
• ANOVA
• The t-test for the slope
• CI for the slope
PI and CI for a response
Residual plots and interpretations.