Download Regression

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Interaction (statistics) wikipedia , lookup

Choice modelling wikipedia , lookup

Regression toward the mean wikipedia , lookup

Instrumental variables estimation wikipedia , lookup

Least squares wikipedia , lookup

Regression analysis wikipedia , lookup

Linear regression wikipedia , lookup

Coefficient of determination wikipedia , lookup

Transcript
Regression + Structural Equation Modeling
PSY 211
2-5-08
A. Regression (R)
 Definition: Statistical technique for finding the
best fitting straight line for a set of data
 Advanced version of correlation
 Predict specific scores on Y from X
 Called “Multiple Regression” or “Multiple R”
when several predictors are used (e.g.X1, X2, X3)
B. Predicting Scores
 Put the predictor variable (independent variable)
on the X axis and outcome variable (dependent
variable) on the Y axis
 Correlation Example. Predict Happiness (Y)
from Parental Support (X): r = 0.38
10
93. Happiness
8
6
4
2
0
0
2
4
6
88. Parental Support
8
10
 Regression Example: Predict Happiness (Y)
from Parental Support (X): R = 0.38
 Scatterplot shows a best fit line used for
prediction
 The best fit line always gives you the best
estimate of Y, based on X
 Any other line would not be as useful for
prediction
10
93. Happiness
8
6
4
2
R Sq Linear = 0.145
0
0
2
4
6
88. Parental Support
8
10
 If we know a person’s score on X, we can
predict their score on Y by visually inspecting
the best fit line that SPSS generates on the
scatterplot
10
93. Happiness
8
6
4
2
R Sq Linear = 0.145
0
0
2
4
6
8
10
88. Parental Support
 Tough to be accurate inspecting visually
 From algebra, we could determine an equation
for the line to predict more accurately
^
o
Y  bX  a
^
o the Y is pronounced “Y hat” and means
the predicted Y score
 Because this is an introductory course, we will
let SPSS give us the equation rather than
calculate it ourselves
 This yields a lot of Output, including the
following:
Model Summary
R
Model
1
R
.381a
Adjusted
R Square
.142
R Square
.145
Std. Error of
the Estimate
1.442
a. Predictors: (Constant), 88. Parental Support
Coeffi cientsa
a
Model
1
(Const ant)
88. Parental Support
Unstandardized
Coeffic ients
B
St d. Error
4.004
.369
.327
.048
St andardiz ed
Coeffic ients
Beta
.381
t
10.849
6.849
Sig.
.000
.000
a. Dependent Variable: 93. Happiness
X
b
Y
^
 Y  bX  a = .327X + 4.004
 Put in a value (e.g. of 5) for X to predict Y:
.327(5) + 4.004
= 5.639
Perhaps predicting someone’s level of happiness
from their parental support is not your cup of tea
 But you could…
o Predict GPA from ACT scores
o Predict odds of a sex offender’s reoffense
given a survey score
o Predict duration of rehabilitation given selfefficacy scores
o Predict odds of divorce based on
personality scores
C. Predicting Y from multiple Xs
 Behavior is multidetermined
 Usually hard to predict Y from a single X
 We will use multiple X’s to make better
predictions
o This is why you may hear regression
referred to as “multiple regression” or
“multiple R”
 Let’s predict Happiness (Y) from
Parental Support (X1) and Physical Health (X2)
o Give numbered subscripts to multiple X
variables
10
93. Happiness
8
6
4
2
0
0
0
2
2
4
88. P
6
aren
8
tal S
uppo
rt
th
eal
H
l
8
ica
hys
P
86.
4
6
10 10
Physical Health
correlates, r = .30
with Happiness
 This yields lots of Output again:
Model Summary
R
Model
1
R
.449a
Adjusted
R Square
.196
R Square
.202
Std. Error of
the Estimate
1.396
a. Predictors: (Constant), 86. Physical Health, 88.
Parental Support
Coeffi cientsa
a
XModel
1
1
(Const ant)
88. Parental Support
86. Physic al Health
Unstandardized
Coeffic ients
B
St d. Error
3.135
.408
.288
.047
.185
.042
St andardiz ed
Coeffic ients
Beta
.336
.243
t
7.692
6.139
4.432
Sig.
.000
.000
.000
a. Dependent Variable: 93. Happiness
X2
b2
b1
Y
Betas
^

Y  b1 X 1  b2 X 2  a
= .288X1 + .185X2 + 3.135
 What if someone reported a 9 on Parental
Support and a 8 on Physical Health?
^

Y  .288(9) + .185(8) + 3.135 = 7.207
 What if someone reported a 1 on Parental
Support and a 2 on Physical Health?
^

Y  .288(1) + .185(2) + 3.135 = 3.793
 What are those Betas?
o Betas are NOT on the exam
o β1, β2
o Describe each variable’s unique
contribution
o Like r, except controlling for all other
predictors
o β1 = .272, which means that the correlation
between Agreeableness (X1) and
Helpfulness (Y) after controlling for Selfesteem (X2) is .272
o β2 = .133, which means that the correlation
between Self-esteem (X2) and Helpfulness
(Y) after controlling for Agreeableness (X1)
o Could use this for an impressive term
paper of PSY 385 project
D. Path Diagrams
Simple correlation:
Parental
Support
r = .38
Happiness
R = .38
Multiple Regression:
r = .38
Parental
Support
Happiness
r = .30
R = .45
Self-esteem
Some researchers put the Betas in the diagram instead of the r
values if they want to show off each variable’s unique effect.
E. More Complex Multiple Regressions
 You could add as many predictors (X’s) as you
want
Touchdowns
Yards
Quarterback
Rating
Interceptions
Studying
Reading
Going to
Class
PSY 211
Success
F. Reading Complicated Path Diagrams
 Often in journals, we see very complex path
diagrams
 Variables are multidetermined, so complexity is
a must for many important correlational studies
Crying
Stress
Poor Object
Relations
Depression
Sleep
Problems
Some tips (DON’T need to know for the exam)
 If circles are used instead of rectangles, it
means that the constructs were measured in
multiple ways (e.g. maybe several different
surveys were used to measure depression)
 Dashed lines mean a correlation is close enough
to zero, we might as well ignore it
 Because these “path diagrams” or “models”
show us a structure and are based on equations
(like regression), you may hear them referred to
as “structural equation models” …these are just
synonyms
Fast Food
Eating
Tanning
G. Examples from Journals