Chapter 12
Simple Linear Regression and Correlation
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.
12.1 The Simple Linear Regression Model
Linear Relationship
The simplest deterministic mathematical relationship between two variables x and y is a linear relationship y = β₀ + β₁x. The set of pairs (x, y) for which y = β₀ + β₁x determines a straight line.
Terminology
The variable whose value is fixed by the experimenter, denoted x, is the independent (predictor, explanatory) variable. For a fixed x, the second variable will be a random variable Y with observed value y, referred to as the dependent (response) variable.
The Simple Linear Regression Model
There exist parameters β₀, β₁, and σ² such that for any fixed value of x, the dependent variable is related to x through the model equation
y = β₀ + β₁x + ε
ε is a random variable (called the random deviation) with E(ε) = 0 and V(ε) = σ².
Linear Regression Model
[Figure: an observed point (x₁, y₁) above x₁, with its deviation ε₁ from the true regression line y = β₀ + β₁x]
Distribution of ε
[Figure: density curve of ε, normal with mean 0 and standard deviation σ]
Distribution of Y for Different Values of x
[Figure: normal distributions of Y centered at β₀ + β₁x₁, β₀ + β₁x₂, β₀ + β₁x₃ along the true regression line y = β₀ + β₁x]
12.2 Estimating Model Parameters
Principle of Least Squares
The vertical deviation of the point (xᵢ, yᵢ) from the line y = b₀ + b₁x is
yᵢ − (b₀ + b₁xᵢ)
The sum of squared vertical deviations from the points (x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ) to the line is:
f(b₀, b₁) = Σᵢ₌₁ⁿ [yᵢ − (b₀ + b₁xᵢ)]²
Principle of Least Squares
The least-squares (regression) line for the data is given by ŷ = β̂₀ + β̂₁x, where
b₁ = β̂₁ = [Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n] / [Σxᵢ² − (Σxᵢ)²/n]
and
b₀ = β̂₀ = (Σyᵢ − β̂₁Σxᵢ)/n = ȳ − β̂₁x̄
Ex. Find the equation of the least-squares line for the data (1, 2), (2, 3), (3, 7).

       x    y    xy   x²
       1    2     2    1
       2    3     6    4
       3    7    21    9
Sum:   6   12    29   14

β̂₁ = [3(29) − (6)(12)] / [3(14) − (6)²] = 2.5
β̂₀ = [12 − 2.5(6)] / 3 = −1
ŷ = −1 + 2.5x
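The calculation above can be reproduced in a few lines. A minimal sketch in plain Python, using the same column sums as the table:

```python
# Least-squares estimates for the worked example, computed from the
# same column sums as in the table above.
data = [(1, 2), (2, 3), (3, 7)]
n = len(data)
sum_x = sum(x for x, _ in data)        # 6
sum_y = sum(y for _, y in data)        # 12
sum_xy = sum(x * y for x, y in data)   # 29
sum_x2 = sum(x * x for x, _ in data)   # 14

b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
b0 = (sum_y - b1 * sum_x) / n                                  # intercept
print(b1, b0)  # 2.5 -1.0
```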
Fitted Values and Residuals
The fitted (predicted) values ŷ₁, ..., ŷₙ are obtained by substituting x₁, ..., xₙ into the equation of the estimated regression line: ŷ₁ = β̂₀ + β̂₁x₁, ..., ŷₙ = β̂₀ + β̂₁xₙ. The residuals are the vertical deviations y₁ − ŷ₁, ..., yₙ − ŷₙ from the estimated line.
Error Sum of Squares
The error sum of squares, denoted SSE, is
SSE = Σ(yᵢ − ŷᵢ)² = Σ[yᵢ − (β̂₀ + β̂₁xᵢ)]²
and the estimate of σ² is
σ̂² = s² = SSE/(n − 2) = Σ(yᵢ − ŷᵢ)²/(n − 2)
Computational Formula
A computational formula for SSE is
SSE = Σyᵢ² − β̂₀Σyᵢ − β̂₁Σxᵢyᵢ
Total Sum of Squares
The total sum of squares, denoted SST, is
SST = Syy = Σ(yᵢ − ȳ)² = Σyᵢ² − (Σyᵢ)²/n
Coefficient of Determination
The coefficient of determination, denoted by r², is given by
r² = 1 − SSE/SST
It is interpreted as the proportion of observed y variation that can be explained by the simple linear regression model.
Regression Sum of Squares
SSR = SST − SSE
The regression sum of squares is interpreted as the amount of variation that is explained by the model. We have
r² = SSR/SST
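For the worked example (fitted line ŷ = −1 + 2.5x), the sums of squares can be checked directly. A short sketch:

```python
# SSE, SST, r², and SSR for the worked example with yhat = -1 + 2.5x.
data = [(1, 2), (2, 3), (3, 7)]
n = len(data)
b0, b1 = -1.0, 2.5                                   # least-squares estimates

SSE = sum((y - (b0 + b1 * x)) ** 2 for x, y in data) # error sum of squares
sum_y = sum(y for _, y in data)
SST = sum(y * y for _, y in data) - sum_y ** 2 / n   # total sum of squares
r2 = 1 - SSE / SST                                   # coefficient of determination
SSR = SST - SSE                                      # regression sum of squares
print(SSE, SST, SSR)  # 1.5 14.0 12.5
```

So about 89% of the observed y variation (r² = 12.5/14 ≈ 0.893) is explained by the model.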
12.3 Inferences About the Slope Parameter β₁
β̂₁
1. The mean of β̂₁ is E(β̂₁) = μ_β̂₁ = β₁.
2. The variance and standard deviation are
V(β̂₁) = σ²_β̂₁ = σ²/Sxx    σ_β̂₁ = σ/√Sxx    s_β̂₁ = s/√Sxx
where Sxx = Σ(xᵢ − x̄)².
3. β̂₁ has a normal distribution.
T Variable
The assumptions of the simple linear regression model imply that the standardized variable
T = (β̂₁ − β₁)/(S/√Sxx) = (β̂₁ − β₁)/S_β̂₁
has a t distribution with n − 2 df.
Confidence Interval
A 100(1 − α)% CI for the slope β₁ of the true regression line is
β̂₁ ± t_{α/2, n−2} · s_β̂₁
Hypothesis-Testing Procedures
Null hypothesis: H₀: β₁ = β₁₀
Test statistic value: t = (β̂₁ − β₁₀)/s_β̂₁
Hypothesis-Testing Procedures
Alternative Hypothesis    Rejection Region for Approx. Level α Test
Hₐ: β₁ > β₁₀              t ≥ t_{α, n−2}
Hₐ: β₁ < β₁₀              t ≤ −t_{α, n−2}
Hₐ: β₁ ≠ β₁₀              t ≥ t_{α/2, n−2} or t ≤ −t_{α/2, n−2}
A P-value based on n − 2 df can be calculated as in Chapters 8 and 9.
Hypothesis-Testing
The model utility test is the test of H₀: β₁ = 0 versus Hₐ: β₁ ≠ 0, in which case the test statistic value is the ratio t = β̂₁/s_β̂₁.
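Continuing the worked example, the model utility test can be carried out by hand. A sketch in plain Python; note that with n − 2 = 1 df the t distribution happens to be the standard Cauchy, so the two-sided P-value is available from atan (a convenience of this tiny example, not the general method):

```python
import math

# Model utility test (H0: beta1 = 0) for the worked example.
data = [(1, 2), (2, 3), (3, 7)]
n = len(data)
b1, SSE = 2.5, 1.5                            # slope and error SS from above
xbar = sum(x for x, _ in data) / n
Sxx = sum((x - xbar) ** 2 for x, _ in data)   # = 2.0
s = math.sqrt(SSE / (n - 2))                  # estimate of sigma
s_b1 = s / math.sqrt(Sxx)                     # estimated sd of the slope
t = b1 / s_b1                                 # test statistic
# df = n - 2 = 1: T is standard Cauchy, so a two-sided P-value via atan.
p = 1 - 2 * math.atan(abs(t)) / math.pi
```

With only n = 3 points the test has almost no power: t ≈ 2.89 but P ≈ 0.21, so H₀: β₁ = 0 is not rejected at α = .05 despite the strong-looking fit.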
ANOVA Table
Source of Variation   df      Sum of Squares   Mean Square          f
Regression            1       SSR              SSR                  SSR/[SSE/(n − 2)]
Error                 n − 2   SSE              s² = SSE/(n − 2)
Total                 n − 1   SST
12.4 Inferences Concerning μ_{Y·x*} and the Prediction of Future Y Values
Ŷ
Let Ŷ = β̂₀ + β̂₁x*, where x* is some fixed value of x.
1. The mean of Ŷ is
E(Ŷ) = μ_{β̂₀+β̂₁x*} = β₀ + β₁x*
2. Variance and standard deviation:
V(Ŷ) = σ²_Ŷ = σ²[1/n + (x* − x̄)²/Sxx]
Ŷ
2. (continued)
s_Ŷ = s√(1/n + (x* − x̄)²/Sxx)
3. Ŷ has a normal distribution.
T Variable
The variable
T = [β̂₀ + β̂₁x* − (β₀ + β₁x*)]/S_{β̂₀+β̂₁x*} = [Ŷ − (β₀ + β₁x*)]/S_Ŷ
has a t distribution with n − 2 df.
Confidence Interval
A 100(1 − α)% CI for μ_{Y·x*}, the expected value of Y when x = x*, is
β̂₀ + β̂₁x* ± t_{α/2, n−2} · s_{β̂₀+β̂₁x*} = ŷ ± t_{α/2, n−2} · s_Ŷ
Prediction Interval
A future value of Y is not a parameter but
instead a random variable; its interval of
plausible values is referred to as a
prediction interval.
Prediction Interval
A 100(1 − α)% PI for a future Y observation to be made when x = x* is
β̂₀ + β̂₁x* ± t_{α/2, n−2} · s√(1 + 1/n + (x* − x̄)²/Sxx)
= ŷ ± t_{α/2, n−2} · √(s² + s²_Ŷ)
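For the worked example, both intervals can be computed at, say, x* = 2. A sketch in plain Python; since df = n − 2 = 1 here, the t critical value equals the Cauchy quantile tan(0.475π) ≈ 12.706, which is again special to this tiny example:

```python
import math

# 95% CI for the mean response and 95% PI for a future Y at x* = 2.
data = [(1, 2), (2, 3), (3, 7)]
n = len(data)
b0, b1, s = -1.0, 2.5, math.sqrt(1.5)          # estimates from the example
xbar = sum(x for x, _ in data) / n             # 2.0
Sxx = sum((x - xbar) ** 2 for x, _ in data)    # 2.0
x_star = 2.0
y_hat = b0 + b1 * x_star                       # point estimate/prediction: 4.0
s_yhat = s * math.sqrt(1 / n + (x_star - xbar) ** 2 / Sxx)
t_crit = math.tan(math.pi * 0.475)             # t_{.025,1}, about 12.706
ci_half = t_crit * s_yhat                               # CI half-width
pi_half = t_crit * math.sqrt(s ** 2 + s_yhat ** 2)      # PI half-width
```

The PI half-width is necessarily larger than the CI half-width, since predicting a single future Y must also absorb the deviation ε; with n = 3 both intervals are enormous.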
12.5 Correlation
Sample Correlation Coefficient
The sample correlation coefficient, denoted r, of n pairs (x₁, y₁), ..., (xₙ, yₙ) is
r = Sxy / [√Σ(xᵢ − x̄)² · √Σ(yᵢ − ȳ)²] = Sxy/√(Sxx Syy)
Ex. Find the correlation coefficient for the least-squares line from the points (1, 2), (2, 3), (3, 7).
r = [nΣxy − (Σx)(Σy)] / {√[nΣx² − (Σx)²] · √[nΣy² − (Σy)²]}
  = [3(29) − (6)(12)] / {√[3(14) − (6)²] · √[3(62) − (12)²]}
  = 0.9449
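The same arithmetic in a short plain-Python sketch, which also confirms that r² for these data matches the coefficient of determination 12.5/14 found earlier:

```python
import math

# Sample correlation coefficient for the example data.
data = [(1, 2), (2, 3), (3, 7)]
n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xy = sum(x * y for x, y in data)
sum_x2 = sum(x * x for x, _ in data)
sum_y2 = sum(y * y for _, y in data)

num = n * sum_xy - sum_x * sum_y
den = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = num / den
print(round(r, 4))  # 0.9449
```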
Properties of r
Important properties of r
1. The value of r does not depend on which of the two variables under study is labeled x and which is labeled y.
2. The value of r is independent of the units in which x and y are measured.
3. −1 ≤ r ≤ 1
Properties of r
4. r = 1 iff all (xᵢ, yᵢ) pairs lie on a straight line with positive slope, and r = −1 iff all (xᵢ, yᵢ) pairs lie on a straight line with negative slope.
5. The square of the sample correlation coefficient gives the value of the coefficient of determination that would result from fitting the simple linear regression model.
Different Values of r
[Figure: four scatter plots — r near 1; r near −1; r near 0, no relationship; r near 0, nonlinear relationship]
The Population Correlation Coefficient
ρ = ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y)
where
Cov(X, Y) = Σₓ Σᵧ (x − μ_X)(y − μ_Y) p(x, y)
or
Cov(X, Y) = ∫∫ (x − μ_X)(y − μ_Y) f(x, y) dx dy   (over −∞ < x < ∞, −∞ < y < ∞)
depending on whether (X, Y) is discrete or continuous.
Estimator
ρ̂ = R = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / √[Σ(Xᵢ − X̄)² · Σ(Yᵢ − Ȳ)²]
Assumption
The joint probability distribution of (X, Y) is specified by
f(x, y) = [1/(2πσ₁σ₂√(1 − ρ²))] · exp{−[((x − μ₁)/σ₁)² − 2ρ(x − μ₁)(y − μ₂)/(σ₁σ₂) + ((y − μ₂)/σ₂)²] / [2(1 − ρ²)]}
for −∞ < x < ∞ and −∞ < y < ∞. f(x, y) is called the bivariate normal probability distribution.
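The density can be evaluated straight from this formula. A sketch with illustrative parameter values (μ₁ = μ₂ = 0, σ₁ = σ₂ = 1, ρ = 0.5 are arbitrary choices, not from the text); when ρ = 0 the density factors into a product of two univariate normal densities, which gives a quick sanity check:

```python
import math

# Bivariate normal density, evaluated directly from the formula above.
def bvn_pdf(x, y, m1=0.0, m2=0.0, s1=1.0, s2=1.0, rho=0.5):
    zx, zy = (x - m1) / s1, (y - m2) / s2
    q = (zx ** 2 - 2 * rho * zx * zy + zy ** 2) / (2 * (1 - rho ** 2))
    return math.exp(-q) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2))

# Sanity check: with rho = 0 the density factors into two N(0,1) pdfs.
phi = lambda t: math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
assert abs(bvn_pdf(1.0, 2.0, rho=0.0) - phi(1.0) * phi(2.0)) < 1e-12
```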
Testing for the Absence of Correlation
When H₀: ρ = 0 is true, the test statistic
T = R√(n − 2)/√(1 − R²)
has a t distribution with n − 2 df.
Hypothesis-Testing
Alternative Hypothesis    Rejection Region for Approx. Level α Test
Hₐ: ρ > 0                 t ≥ t_{α, n−2}
Hₐ: ρ < 0                 t ≤ −t_{α, n−2}
Hₐ: ρ ≠ 0                 t ≥ t_{α/2, n−2} or t ≤ −t_{α/2, n−2}
A P-value based on n − 2 df can be calculated as described previously.
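For the example data this test works out as follows; note that the resulting t reproduces the model utility statistic for the slope, since the test of ρ = 0 and the test of β₁ = 0 are equivalent. (As before, the atan-based P-value works only because n − 2 = 1 df makes T standard Cauchy.)

```python
import math

# t test of H0: rho = 0 for the example data (r ≈ 0.9449, n = 3).
r, n = 15 / math.sqrt(252), 3                  # exact r for the example data
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
# df = n - 2 = 1: T is standard Cauchy, so a two-sided P-value via atan.
p = 1 - 2 * math.atan(abs(t)) / math.pi
print(round(t, 4))  # 2.8868
```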
Other Inferences Concerning ρ
When (X₁, Y₁), ..., (Xₙ, Yₙ) is a sample from a bivariate normal distribution, the rv
V = (1/2) ln[(1 + R)/(1 − R)]
has approximately a normal distribution with mean and variance
μ_V = (1/2) ln[(1 + ρ)/(1 − ρ)]    σ²_V = 1/(n − 3)
The test statistic for testing H₀: ρ = ρ₀ is
z = {v − (1/2) ln[(1 + ρ₀)/(1 − ρ₀)]} / √(1/(n − 3))
Alternative Hypothesis    Rejection Region for Level α Test
Hₐ: ρ > ρ₀                z ≥ z_α
Hₐ: ρ < ρ₀                z ≤ −z_α
Hₐ: ρ ≠ ρ₀                z ≥ z_{α/2} or z ≤ −z_{α/2}
CI for ρ
A 100(1 − α)% CI for ρ is
( (e^{2c₁} − 1)/(e^{2c₁} + 1) ,  (e^{2c₂} − 1)/(e^{2c₂} + 1) )
where c₁ and c₂ are the left and right endpoints of the CI for μ_V:
( v − z_{α/2}/√(n − 3) ,  v + z_{α/2}/√(n − 3) )
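This interval can be sketched in plain Python. The chapter's 3-point example cannot be used here, since the approximation requires n > 3 (1/(n − 3) is undefined at n = 3), so the values r = 0.80, n = 28, and z_{.025} = 1.96 below are hypothetical:

```python
import math

# Fisher-transform 95% CI for rho. r and n are hypothetical values,
# not taken from the chapter's example (which has only n = 3).
r, n, z = 0.80, 28, 1.96
v = 0.5 * math.log((1 + r) / (1 - r))   # Fisher transform of r
m = z / math.sqrt(n - 3)                # half-width of the CI for mu_V
c1, c2 = v - m, v + m                   # endpoints of the CI for mu_V
# Transform the endpoints back to the rho scale; note that
# (e^{2c} - 1)/(e^{2c} + 1) is just tanh(c).
lo = (math.exp(2 * c1) - 1) / (math.exp(2 * c1) + 1)
hi = (math.exp(2 * c2) - 1) / (math.exp(2 * c2) + 1)
```

The back-transformed interval is not symmetric about r, reflecting the skewness of the sampling distribution of R for ρ away from 0.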