Download 1 - RIT - People

Simple Linear
Regression
Estimates for single and mean
responses
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.
Properties of the Sampling Distribution
of a + bx for a Fixed x Value
Let x* denote a particular value of the
independent variable x. When the four basic
assumptions of the simple linear regression
model are satisfied, the sampling
distribution of the statistic a + bx* has the
following properties:
1. The mean value of a + bx* is α + βx*,
so a + bx* is an unbiased statistic for
estimating the average y value when
x = x*.
2. The standard deviation of the statistic
a + bx*, denoted by $\sigma_{a+bx^*}$, is given by
$$\sigma_{a+bx^*} = \sigma\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
3. The distribution of the statistic a + bx* is
normal.
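The three properties can be checked empirically. The following is a minimal simulation sketch, not part of the original slides: the x values, the parameters alpha, beta and sigma, the value x* = 7, and the function name `simulate_fitted_values` are all hypothetical choices made for illustration.

```python
import random
import statistics

def simulate_fitted_values(x, alpha, beta, sigma, x_star, reps=5000, seed=1):
    """Simulate the sampling distribution of a + b*x_star for fixed x values."""
    rng = random.Random(seed)
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    fitted = []
    for _ in range(reps):
        # Generate y from the model y = alpha + beta*x + e, with e ~ N(0, sigma^2).
        y = [alpha + beta * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = sum(y) / n
        s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        b = s_xy / s_xx           # least-squares slope
        a = y_bar - b * x_bar     # least-squares intercept
        fitted.append(a + b * x_star)
    # Standard deviation given by property 2.
    theoretical_sd = sigma * (1 / n + (x_star - x_bar) ** 2 / s_xx) ** 0.5
    return statistics.mean(fitted), statistics.stdev(fitted), theoretical_sd

# Hypothetical design and parameters, chosen only for illustration.
x = list(range(1, 11))
sim_mean, sim_sd, theo_sd = simulate_fitted_values(
    x, alpha=2.0, beta=0.5, sigma=1.0, x_star=7.0
)
print(f"simulated mean {sim_mean:.3f} vs alpha + beta*x* = {2.0 + 0.5 * 7.0:.3f}")
print(f"simulated sd   {sim_sd:.3f} vs formula sd       = {theo_sd:.3f}")
```

With enough replications the simulated mean should sit close to α + βx* and the simulated standard deviation close to the formula value; a histogram of the simulated values would be expected to look approximately normal.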
Additional Information about the Sampling
Distribution of a + bx for a Fixed x Value
The estimated standard deviation of
the statistic a + bx*, denoted by
$s_{a+bx^*}$, is given by
$$s_{a+bx^*} = s_e\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
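As a rough illustration of how the estimated standard deviation could be computed from raw data, here is a small sketch. The function name `fit_and_se` and the five data points are hypothetical and are not taken from the slides.

```python
import math

def fit_and_se(x, y, x_star):
    """Return (a + b*x_star, s_e, s_{a+bx*}) for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # s_e is the residual standard deviation with n - 2 degrees of freedom.
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(ss_resid / (n - 2))
    se_fit = s_e * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
    return a + b * x_star, s_e, se_fit

# Hypothetical data, for illustration only.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]

fit_mid, s_e_demo, se_mid = fit_and_se(x_demo, y_demo, x_star=3.0)  # x* equal to the mean of x
fit_end, _, se_end = fit_and_se(x_demo, y_demo, x_star=5.0)         # x* at the edge of the data
print(f"x* = 3: fit {fit_mid:.3f}, estimated sd {se_mid:.4f}")
print(f"x* = 5: fit {fit_end:.3f}, estimated sd {se_end:.4f}")
```

The estimated standard deviation is smallest when x* equals the mean of the x values, where it reduces to s_e divided by the square root of n, and it grows as x* moves away from that mean.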
When the four basic assumptions of the
simple linear regression model are satisfied,
the probability distribution of the standardized
variable
$$t = \frac{a + bx^* - (\alpha + \beta x^*)}{s_{a+bx^*}}$$
is the t distribution with df = n - 2.
Confidence Interval for a Mean y Value
When the four basic assumptions of the
simple linear regression model are met, a
confidence interval for α + βx*, the
average y value when x has the value x*, is
$$a + bx^* \pm (t\ \text{critical value})\, s_{a+bx^*}$$
where the t critical value is based on
df = n - 2.
Many authors give the following equivalent form
for the confidence interval:
$$a + bx^* \pm (t\ \text{critical value})\, s_e\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
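A minimal sketch of the interval calculation follows. The inputs (Fit = 72.51 and SE Fit = 2.48 at x* = 40) are the rounded values reported in the Minitab output of the temperature and mortality example later in this transcript. The function name `mean_response_ci` is hypothetical, and the critical value 2.145 is the standard two-sided 95% t table value for df = 14, entered by hand rather than computed.

```python
def mean_response_ci(fit, se_fit, t_crit):
    """Confidence interval for the mean y value: fit +/- t_crit * se_fit."""
    margin = t_crit * se_fit
    return fit - margin, fit + margin

# Two-sided 95% t critical value for df = n - 2 = 14 (table value, assumed here).
T_CRIT_95_DF14 = 2.145

# Fit and SE Fit at x* = 40, copied from the example's Minitab output.
lo, hi = mean_response_ci(fit=72.51, se_fit=2.48, t_crit=T_CRIT_95_DF14)
print(f"95% CI for the mean mortality index at x* = 40: ({lo:.2f}, {hi:.2f})")
```

Because the inputs are rounded to two decimals, the result should agree with the reported interval (67.20, 77.82) only to within about 0.01.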
Prediction Interval for a Single y Value
When the four basic assumptions of the simple
linear regression model are met, a prediction
interval for y*, a single y observation made
when x has the value x*, has the form
$$a + bx^* \pm (t\ \text{critical value})\sqrt{s_e^2 + s_{a+bx^*}^2}$$
where the t critical value is based on df = n - 2.
Many authors give the following equivalent form
for the prediction interval:
$$a + bx^* \pm (t\ \text{critical value})\, s_e\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
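The equivalence of the two forms can be seen by substituting the estimated standard deviation of a + bx* into the first one. A short worked derivation, using only the symbols already defined in these slides:

```latex
\begin{aligned}
s_e^2 + s_{a+bx^*}^2
  &= s_e^2 + s_e^2\left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}\right] \\
  &= s_e^2\left[1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}\right],
\end{aligned}
\qquad\text{so}\qquad
\sqrt{s_e^2 + s_{a+bx^*}^2}
  = s_e\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}.
```

The extra term s_e^2 (the leading 1 inside the bracket) reflects the variability of a single observation about the regression line, which is why a prediction interval is always wider than the confidence interval for the mean at the same x*.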
Example - Mean Annual Temperature vs. Mortality
Data were collected in certain regions of
Great Britain, Norway and Sweden to study
the relationship between the mean annual
temperature and the mortality rate for a
specific type of breast cancer in women.
Mean Annual Temperature (°F)    Mortality Index
            51                       103
            50                       105
            50                       100
            49                        96
            49                        87
            48                        95
            47                        89
            45                        89
            46                        79
            42                        85
            44                        82
* Lea, A.J. (1965) New Observations on distribution of neoplasms of female breast in
certain European countries. British Medical Journal, 1, 488-490
Example - Mean Annual Temperature vs. Mortality
Regression Analysis: Mortality index versus Mean annual temperature

The regression equation is
Mortality index = - 21.8 + 2.36 Mean annual temperature

Predictor       Coef   SE Coef       T       P
Constant      -21.79     15.67   -1.39   0.186
Mean ann      2.3577    0.3489    6.76   0.000

S = 7.545    R-Sq = 76.5%    R-Sq(adj) = 74.9%

Analysis of Variance

Source            DF       SS       MS       F       P
Regression         1   2599.5   2599.5   45.67   0.000
Residual Error    14    796.9     56.9
Total             15   3396.4

Unusual Observations

Obs   Mean ann   Mortalit     Fit   SE Fit   Residual   St Resid
 15       31.8      67.30   53.18     4.85      14.12     2.44RX

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.
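The quantities in this output are tied together by standard identities, which can be verified with a few lines of arithmetic. The sketch below only re-derives T, S, F, R-Sq and R-Sq(adj) from the printed coefficients and sums of squares; the variable names are illustrative.

```python
import math

# Values copied from the Minitab output above.
coef_const, se_const = -21.79, 15.67
coef_slope, se_slope = 2.3577, 0.3489
ss_reg, ss_err, ss_tot = 2599.5, 796.9, 3396.4
df_reg, df_err, df_tot = 1, 14, 15

t_const = coef_const / se_const            # T = Coef / SE Coef
t_slope = coef_slope / se_slope
ms_err = ss_err / df_err                   # MS for Residual Error
s = math.sqrt(ms_err)                      # S, the estimate s_e
f_stat = (ss_reg / df_reg) / ms_err        # F = MS Regression / MS Error
r_sq = ss_reg / ss_tot                     # R-Sq
r_sq_adj = 1 - ms_err / (ss_tot / df_tot)  # R-Sq(adj)

print(f"T (constant) = {t_const:.2f}, T (slope) = {t_slope:.2f}")
print(f"S = {s:.3f}, F = {f_stat:.2f}")
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```

Each recomputed value should match the printed output up to rounding of the inputs.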
Example - Mean Annual Temperature vs. Mortality
[Figure: Regression plot of Mortality index (vertical axis) versus Mean annual temperature (horizontal axis).
Fitted line: Mortality in = -21.7947 + 2.35769 Mean annual; S = 7.54466, R-Sq = 76.5%, R-Sq(adj) = 74.9%]
The point at x = 31.8 (observation 15) has a large standardized residual and is
influential because of its low Mean Annual Temperature.
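Both flags on observation 15 can be traced back to numbers in the Unusual Observations output. The sketch below uses two standard simple-regression identities that are not stated on the slides: the leverage of an observation equals (SE Fit / S)^2, and the standardized residual equals the residual divided by S times the square root of one minus the leverage. The variable names are illustrative.

```python
import math

s = 7.545          # S from the regression output
se_fit = 4.85      # SE Fit for observation 15
residual = 14.12   # Residual for observation 15
n = 16             # total DF = 15 implies n = 16 observations

# Leverage h = 1/n + (x - x_bar)^2 / S_xx, recovered here as (SE Fit / S)^2.
leverage = (se_fit / s) ** 2
# Standardized residual = residual / (s * sqrt(1 - h)).
std_resid = residual / (s * math.sqrt(1 - leverage))
# In simple linear regression the leverages average 2/n.
average_leverage = 2 / n

print(f"leverage = {leverage:.3f} (average is {average_leverage:.3f})")
print(f"standardized residual = {std_resid:.2f}")
```

The leverage comes out at roughly three times the average, consistent with the X flag, and the standardized residual reproduces the reported 2.44 up to rounding.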
Example - Mean Annual Temperature vs. Mortality
Predicted Values for New Observations

New Obs     Fit   SE Fit         95.0% CI              95.0% PI
      1   53.18     4.85   ( 42.79,  63.57)    ( 33.95,  72.41) X
      2   60.72     3.84   ( 52.48,  68.96)    ( 42.57,  78.88)
      3   72.51     2.48   ( 67.20,  77.82)    ( 55.48,  89.54)
      4   83.34     1.89   ( 79.30,  87.39)    ( 66.66, 100.02)
      5   96.09     2.67   ( 90.37, 101.81)    ( 78.93, 113.25)
      6   99.16     3.01   ( 92.71, 105.60)    ( 81.74, 116.57)

X denotes a row with X values away from the center

Values of Predictors for New Observations

New Obs   Mean ann
      1       31.8
      2       35.0
      3       40.0
      4       44.6
      5       50.0
      6       51.3
These are the x* values for which the
fits, standard errors of the fits,
95% confidence intervals for mean y
values and 95% prediction intervals for
single y values are given above.
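The SE Fit column can be approximately reproduced from the formula for the estimated standard deviation of a + bx*. The output does not report S_xx or the mean of x, so the sketch below back-calculates both, and these are assumptions: S_xx is taken as (S / SE Coef of the slope)^2, using the standard relation that the slope's standard error is s_e divided by the square root of S_xx, and the mean of x is taken to be about 44.6 because the SE Fit there (1.89) is essentially s_e divided by the square root of n, the smallest value the formula allows.

```python
import math

S_E = 7.545        # S from the regression output
SE_SLOPE = 0.3489  # SE Coef for the slope
N = 16             # total DF = 15, so n = 16

# Assumptions, back-calculated rather than reported in the output:
S_XX = (S_E / SE_SLOPE) ** 2   # from SE(b) = s_e / sqrt(S_xx)
X_BAR = 44.6                   # x* at which SE Fit is (nearly) s_e / sqrt(n)

def se_fit(x_star):
    """Estimated standard deviation of a + b*x_star."""
    return S_E * math.sqrt(1 / N + (x_star - X_BAR) ** 2 / S_XX)

# x* values and the SE Fit values reported by Minitab.
reported = {31.8: 4.85, 35.0: 3.84, 40.0: 2.48, 44.6: 1.89, 50.0: 2.67, 51.3: 3.01}
for x_star, se_reported in reported.items():
    print(f"x* = {x_star:5.1f}: computed {se_fit(x_star):.3f}, reported {se_reported:.2f}")
```

Agreement to within about 0.01 is the most that can be expected, since every input is a rounded value from the printed output.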
Example - Mean Annual Temperature vs. Mortality
[Figure: Regression plot of Mortality index (vertical axis) versus Mean annual temperature (horizontal axis), showing the fitted regression line with 95% CI and 95% PI bands.
Fitted line: Mortality in = -21.7947 + 2.35769 Mean annual; S = 7.54466, R-Sq = 76.5%, R-Sq(adj) = 74.9%]
95% confidence interval for mean y value at x = 40: (67.20, 77.82)
95% prediction interval for single y value at x = 45: (67.62, 100.98)
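The prediction interval at x = 45 does not appear in the earlier table of new observations, but it can be approximated from the fitted equation. This sketch rests on assumptions: S_xx and the mean of x are back-calculated from the regression output (they are not reported directly), the critical value 2.145 is the two-sided 95% t table value for df = 14, and the function name `prediction_interval` is hypothetical.

```python
import math

A, B = -21.7947, 2.35769   # fitted coefficients from the regression plot
S_E = 7.54466              # S from the regression plot
N = 16                     # total DF = 15, so n = 16
T_CRIT = 2.145             # 95% two-sided t critical value, df = 14 (table value)

# Assumptions, back-calculated from the regression output:
S_XX = (7.545 / 0.3489) ** 2   # (S / SE Coef of slope)^2
X_BAR = 44.6                   # x* with the smallest reported SE Fit

def prediction_interval(x_star):
    """Prediction interval for a single y observation at x_star."""
    fit = A + B * x_star
    half_width = T_CRIT * S_E * math.sqrt(1 + 1 / N + (x_star - X_BAR) ** 2 / S_XX)
    return fit - half_width, fit + half_width

lo, hi = prediction_interval(45.0)
print(f"95% PI for a single y value at x = 45: ({lo:.2f}, {hi:.2f})")
```

Under these assumptions the result should land within about 0.01 of the interval (67.62, 100.98) quoted above.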