Multiple Regression Analysis: Part 1
Correlation, Simple Regression, Introduction to Multiple Regression and Matrix Algebra

Background: 3 Aims of Research
1.
2.
3.

Regression Defined:
Numerical Example
• 25 CDs
• X = Marketing $'s
• Y = Sales Index
• Question: Can we predict sales by knowing marketing expenditures?

CD   Marketing (x $1000)   SalesIndx
 1          87               33.7
 2          69               35.1
 3          70               36.4
 4          73               37.8
 5         129               39.1
 6         189               40.5
 7          88               41.8
 8          93               43.2
 9         111               44.6
10         123               45.9
11         255               47.3
12         113               48.6
13         201               50.0
14         189               51.4
15          99               52.7
16         125               54.1
17         222               55.4
18         198               56.8
19         236               58.2
20         172               59.5
21         144               60.9
22         139               62.2
23          92               63.6
24         189               64.9
25         200               66.3
Correlation
The relationship between x and y…

r_xy = Σ(z_x · z_y) / (N − 1)

Or,

r_xy = [N·ΣXY − (ΣX)(ΣY)] / √{[N·ΣX² − (ΣX)²][N·ΣY² − (ΣY)²]}

r_xy = [25(187,253.80) − (3606)(1250)] / √{[25(596,116) − (3606)²][25(64,899.87) − (1250)²]}

r_xy = 173,845 / √(1,899,664 × 59,996.75) = .515
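The raw-score formula above can be checked directly; here is a minimal Python sketch using the data from the table (sales values rounded to one decimal, so the result agrees with the slide to about two decimals):

```python
# Pearson correlation via the raw-score formula:
# r = [N*ΣXY - ΣX*ΣY] / sqrt([N*ΣX² - (ΣX)²] * [N*ΣY² - (ΣY)²])
from math import sqrt

x = [87, 69, 70, 73, 129, 189, 88, 93, 111, 123, 255, 113, 201,
     189, 99, 125, 222, 198, 236, 172, 144, 139, 92, 189, 200]
y = [33.7, 35.1, 36.4, 37.8, 39.1, 40.5, 41.8, 43.2, 44.6, 45.9,
     47.3, 48.6, 50.0, 51.4, 52.7, 54.1, 55.4, 56.8, 58.2, 59.5,
     60.9, 62.2, 63.6, 64.9, 66.3]

n = len(x)
sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)
syy = sum(yi * yi for yi in y)

r = (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
print(round(r, 3))  # ≈ 0.515
```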
Or visually…

r² = .515² = .265

[Figure: scatter plot "CD Sales & Marketing" — Marketing Costs (x-axis, 25 to 300) vs. CD Sales Index (y-axis, 20.000 to 80.000), with fitted line, R² = 0.2652]
Given the relationship, we can predict y by developing the simple regression equation

Predicted Score:  y' = a + bx
Actual Score:     y = a + bx + e

y' = predicted score on y
a  = intercept (value of y' when x = 0)
b  = slope (regression weight)
x  = score on the predictor
e  = error of prediction (residual, y − y')
Calculating parameter estimates
If you have the correlation and standard deviations…

b = r · (s_y / s_x)
b = .515 · (10 / 56.27) = .092

If you do not…

b = [N(ΣXY) − (ΣX)(ΣY)] / [N(ΣX²) − (ΣX)²]
b = [25(187,253.8) − (3606)(1250)] / [25(596,116) − (3606)²] = .092

Once you have b, a is easy…

a = Ȳ − b·X̄
a = 50 − .092(144.24) = 36.73
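The summary-statistic route is short enough to sketch in Python directly from the numbers on this slide (note the slide's a = 36.73 uses b rounded to .092; carrying full precision gives a ≈ 36.8):

```python
# Slope and intercept from summary statistics:
# b = r * (s_y / s_x), a = mean(y) - b * mean(x)
r, s_y, s_x = 0.515, 10.0, 56.27
mean_y, mean_x = 50.0, 144.24

b = r * (s_y / s_x)
a = mean_y - b * mean_x
print(round(b, 3), round(a, 1))  # ≈ 0.092 and ≈ 36.8
```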
Numerical Example with more stuff

CD      x       y      x*y      y'    y - y'  (y - y')²  y' - M(y)  (y' - M(y))²
 1     87    33.7    2931.5   44.8   -11.1     122.5      -5.24        27.4
 2     69    35.1    2418.8   43.1    -8.1      65.0      -6.89        47.4
 3     70    36.4    2548.9   43.2    -6.8      46.1      -6.79        46.2
 4     73    37.8    2757.3   43.5    -5.7      32.6      -6.52        42.5
 5    129    39.1    5047.8   48.6    -9.5      89.8      -1.39         1.9
 6    189    40.5    7652.4   54.1   -13.6     185.2       4.10        16.8
 7     88    41.8    3682.6   44.9    -3.0       9.0      -5.15        26.5
 8     93    43.2    4018.2   45.3    -2.1       4.4      -4.69        22.0
 9    111    44.6    4946.7   47.0    -2.4       5.7      -3.04         9.3
10    123    45.9    5648.6   48.1    -2.1       4.5      -1.94         3.8
11    255    47.3   12057.1   60.1   -12.9     165.2      10.14       102.7
12    113    48.6    5496.5   47.1     1.5       2.3      -2.86         8.2
13    201    50.0   10050.0   55.2    -5.2      27.0       5.19        27.0
14    189    51.4    9706.8   54.1    -2.7       7.5       4.10        16.8
15     99    52.7    5219.0   45.9     6.9      47.0      -4.14        17.1
16    125    54.1    6759.5   48.2     5.8      34.1      -1.76         3.1
17    222    55.4   12306.5   57.1    -1.7       2.8       7.12        50.6
18    198    56.8   11245.1   54.9     1.9       3.5       4.92        24.2
19    236    58.2   13723.9   58.4    -0.2       0.1       8.40        70.5
20    172    59.5   10235.9   52.5     7.0      48.6       2.54         6.5
21    144    60.9    8765.2   50.0    10.9     118.6      -0.02         0.0
22    139    62.2    8649.7   49.5    12.7     161.5      -0.48         0.2
23     92    63.6    5850.0   45.2    18.4     337.4      -4.78        22.9
24    189    64.9   12274.7   54.1    10.8     117.7       4.10        16.8
25    200    66.3   13260.9   55.1    11.2     125.5       5.10        26.0
Sum  3606  1250.0  187253.8 1250.0     0.0    1763.5       0.00       636.4
Partitioning Variance – What else?
• Total Variation = SSy or SSTOT
• What we cannot account for…
  - Actual y-scores minus predicted y-scores: y – y'
  - Can square and sum to get SSRES
• What we can account for
  - SSTOT – SSRES (a.k.a. SSREG)
  - Or… predicted y-scores minus mean of y (squared & summed)
• Why?
Calculating F, because we can

MS_RES = SS_RES / df_RES = 1763.5 / 23 = 76.67

MS_REG = SS_REG / df_REG = 636.4 / 1 = 636.4

F = MS_REG / MS_RES = 636.4 / 76.67 = 8.301
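The whole partition can be reproduced in a short Python sketch. Data are retyped from the 25-CD table (sales rounded to one decimal, so the sums of squares match the slides to within about a unit):

```python
# Fit y' = a + b*x, then partition SS_TOT into SS_REG + SS_RES and form F.
x = [87, 69, 70, 73, 129, 189, 88, 93, 111, 123, 255, 113, 201,
     189, 99, 125, 222, 198, 236, 172, 144, 139, 92, 189, 200]
y = [33.7, 35.1, 36.4, 37.8, 39.1, 40.5, 41.8, 43.2, 44.6, 45.9,
     47.3, 48.6, 50.0, 51.4, 52.7, 54.1, 55.4, 56.8, 58.2, 59.5,
     60.9, 62.2, 63.6, 64.9, 66.3]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))
a = my - b * mx

pred = [a + b * xi for xi in x]
ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))  # what we cannot account for
ss_reg = sum((p - my) ** 2 for p in pred)              # what we can account for
ss_tot = sum((yi - my) ** 2 for yi in y)               # = ss_reg + ss_res

F = (ss_reg / 1) / (ss_res / (n - 2))  # df_REG = 1, df_RES = n - 2 = 23
print(round(ss_reg, 1), round(ss_res, 1), round(F, 2))  # ≈ 636.4, 1763.5, 8.30
```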
Effect Size / Fit…

r² = 1 − SS_RES/SS_TOT, or SS_REG/SS_TOT

r² = 1 − 1763.5/2399.9 = 0.265

Take our previously calculated F, 8.301.
We can evaluate it at k, N – k – 1 degrees of freedom.
The null hypothesis of this test is that the predictor accounts for no variance in y in the population (R² = 0; equivalently, b = 0).
Multiple Regression
• Multiple Independent (predictor) variables
• One Dependent (criterion) variable

Predicted Score:
  y' = a + b1x1 + b2x2 + … + bkxk
Actual Score:
  yi = a + b1x1 + b2x2 + … + bkxk + ei
Numerical Example
• N = 25 Participants (CDs)
• X1: Marketing Expenditures
• X2: Airplay/Day
• Y: Sales Index
• Question: Can the two pieces of information, Marketing Expenditures and Airplay, be used in combination to predict CD Sales?

CD   Marketing (x $1000)   Airplay/day   SalesIndx
 1          87                12.49        33.696
 2          69                 8.65        35.054
 3          70                14.41        36.413
 4          73                13.73        37.772
 5         129                19.73        39.130
 6         189                21.65        40.489
 7          88                16.63        41.848
 8          93                17.90        43.207
 9         111                15.95        44.565
10         123                18.76        45.924
11         255                28.74        47.283
12         113                18.62        48.641
13         201                26.49        50.000
14         189                21.37        51.359
15          99                16.78        52.717
16         125                19.23        54.076
17         222                24.76        55.435
18         198                25.83        56.793
19         236                23.73        58.152
20         172                21.99        59.511
21         144                21.61        60.870
22         139                25.45        62.228
23          92                15.05        63.587
24         189                28.98        64.946
25         200                25.15        66.304
Selected SPSS Output (1)

Model Summary(b)
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .661a   .437       .386                7.83594                      1.010
a. Predictors: (Constant), Number of plays per day, Marketing in thousands $'s
b. Dependent Variable: Sales Index

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F       Sig.
Regression       1049.020       2     524.510     8.542   .002a
Residual         1350.844      22      61.402
Total            2399.864      24
a. Predictors: (Constant), Number of plays per day, Marketing in thousands $'s
b. Dependent Variable: Sales Index
Selected SPSS Output (2)

Coefficients(a)
Model 1                      B        Std. Error   Beta    t       Sig.   95% CI for B (Lower, Upper)
(Constant)                   22.883   6.934                3.300   .003   (8.502, 37.265)
Marketing in thousands $'s   -.048    .061         -.273   -.794   .436   (-.175, .078)
Number of plays per day      1.693    .653          .890   2.592   .017   (.339, 3.047)
a. Dependent Variable: Sales Index

Notice the change in b for Marketing!
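The SPSS table can be reproduced (to rounding) with an ordinary least-squares fit; a sketch assuming NumPy is available, with the data retyped from the table above:

```python
import numpy as np

# Two-predictor OLS for the CD data: y' = a + b1*marketing + b2*airplay.
mkt = np.array([87, 69, 70, 73, 129, 189, 88, 93, 111, 123, 255, 113, 201,
                189, 99, 125, 222, 198, 236, 172, 144, 139, 92, 189, 200.0])
air = np.array([12.49, 8.65, 14.41, 13.73, 19.73, 21.65, 16.63, 17.90, 15.95,
                18.76, 28.74, 18.62, 26.49, 21.37, 16.78, 19.23, 24.76, 25.83,
                23.73, 21.99, 21.61, 25.45, 15.05, 28.98, 25.15])
y = np.array([33.696, 35.054, 36.413, 37.772, 39.130, 40.489, 41.848, 43.207,
              44.565, 45.924, 47.283, 48.641, 50.000, 51.359, 52.717, 54.076,
              55.435, 56.793, 58.152, 59.511, 60.870, 62.228, 63.587, 64.946,
              66.304])

X = np.column_stack([np.ones_like(mkt), mkt, air])  # design matrix with intercept
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef

resid = y - X @ coef
r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
print(a, b1, b2, r2)  # ≈ 22.88, -0.048, 1.693, 0.437
```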
The equations introduced previously can be extended to the two-IV case
• Involves finding six SS terms
  - SSX1, SSX2, SSX1X2, SSY, SSX1Y, SSX2Y
• Must also calculate
  - Two b-weights
  - Two beta weights
  - Correlation between X1 and X2
  - Then SS for Regression, Residual and Total
  - Then significance tests for each b-weight
• In general, it is a pain in the backside.
For Instance, to obtain b1 & b2…

SSX1   = ΣX1² − (ΣX1)²/N    = 135.15 − 41.9²/16         = 135.15 − 109.73       = 25.42
SSX2   = ΣX2² − (ΣX2)²/N    = 217,576 − 1852²/16        = 217,576 − 214,369     = 3207
SSY    = ΣY² − (ΣY)²/N      = 115,149 − 1347²/16        = 115,149 − 113,400.56  = 1748.44
SSX1Y  = ΣX1Y − (ΣX1)(ΣY)/N = 3704.5 − (41.9)(1347)/16  = 3704.5 − 3527.46      = 177.04
SSX2Y  = ΣX2Y − (ΣX2)(ΣY)/N = 158,003 − (1852)(1347)/16 = 158,003 − 155,915.25  = 2087.75
SSX1X2 = ΣX1X2 − (ΣX1)(ΣX2)/N = 5101 − (41.9)(1852)/16  = 5101 − 4849.93        = 251.07

b1 = [(SSX2)(SSX1Y) − (SSX1X2)(SSX2Y)] / [(SSX1)(SSX2) − (SSX1X2)²]
   = [(3207)(177.04) − (251.07)(2087.75)] / [(25.42)(3207) − 251.07²]
   = (567,767.28 − 524,171.39) / (81,521.94 − 63,036.14)
   = 43,595.89 / 18,485.80 = 2.358

b2 = [(SSX1)(SSX2Y) − (SSX1X2)(SSX1Y)] / [(SSX1)(SSX2) − (SSX1X2)²]
   = [(25.42)(2087.75) − (251.07)(177.04)] / [(25.42)(3207) − 251.07²]
   = (53,070.61 − 44,449.43) / 18,485.80
   = 8621.18 / 18,485.80 = 0.466

Note: this is from a different example…, mileage may vary for the current example.
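The b-weight formulas translate to a few lines of Python. This sketch plugs in the SS values worked above (which, per the slide's note, come from a different example with N = 16):

```python
# Two-predictor b-weights from sums of squares and cross-products:
# b1 = (SSX2*SSX1Y - SSX1X2*SSX2Y) / (SSX1*SSX2 - SSX1X2²), and symmetrically for b2.
ss_x1, ss_x2 = 25.42, 3207.0
ss_x1y, ss_x2y, ss_x1x2 = 177.04, 2087.75, 251.07

denom = ss_x1 * ss_x2 - ss_x1x2 ** 2
b1 = (ss_x2 * ss_x1y - ss_x1x2 * ss_x2y) / denom
b2 = (ss_x1 * ss_x2y - ss_x1x2 * ss_x1y) / denom
print(round(b1, 3), round(b2, 3))  # ≈ 2.358, 0.466
```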
Which is why matrix algebra is our friend
• There's only one equation to get the Standardized Regression Weights
  - Bi = Rij⁻¹ Riy
• Then another one to get R²
  - R² = Ryi Bi
• And so on.

So, let's take a joyride through the wonderful world of Matrix Algebra
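For the CD data, those two matrix equations look like this in Python (a sketch assuming NumPy; the correlations are computed from the raw data tabled earlier, and the resulting betas should match the SPSS Beta column):

```python
import numpy as np

# Standardized weights: B = Rxx^{-1} @ rxy, then R² = rxy' @ B.
mkt = np.array([87, 69, 70, 73, 129, 189, 88, 93, 111, 123, 255, 113, 201,
                189, 99, 125, 222, 198, 236, 172, 144, 139, 92, 189, 200.0])
air = np.array([12.49, 8.65, 14.41, 13.73, 19.73, 21.65, 16.63, 17.90, 15.95,
                18.76, 28.74, 18.62, 26.49, 21.37, 16.78, 19.23, 24.76, 25.83,
                23.73, 21.99, 21.61, 25.45, 15.05, 28.98, 25.15])
y = np.array([33.696, 35.054, 36.413, 37.772, 39.130, 40.489, 41.848, 43.207,
              44.565, 45.924, 47.283, 48.641, 50.000, 51.359, 52.717, 54.076,
              55.435, 56.793, 58.152, 59.511, 60.870, 62.228, 63.587, 64.946,
              66.304])

R = np.corrcoef(np.vstack([mkt, air, y]))  # 3x3 correlation matrix
Rxx = R[:2, :2]   # predictor intercorrelations
rxy = R[:2, 2]    # predictor-criterion correlations

beta = np.linalg.inv(Rxx) @ rxy
r2 = rxy @ beta
print(beta.round(3), round(float(r2), 3))  # ≈ [-0.273, 0.890], 0.437
```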
First, some definitions
• For us, matrix algebra is a set of operations that can be carried out on a group of numbers (a matrix) as a whole.
• A Matrix is denoted by a bold capital letter
  - Has R rows and C columns (thus has dimension of RxC)
  - R and/or C can be 1.
  - When R = 1, the matrix is a row vector.
  - When C = 1, it is a column vector.
  - When both R and C are 1, it is a scalar (usually denoted by a lower-case bold letter).
• Xij – X is a matrix and i represents the row and j the column. Thus, x31 refers to the element in the third row and first column.
Example
• The order of X is 5x2
• X31 = 3

X =
  5  5
  4  6
  3  2
  4  4
  4  3
Some other important concepts
• A is a diagonal matrix
• I is an Identity Matrix

A =
  2.40  0.00  0.00
  0.00  1.76  0.00
  0.00  0.00  3.94

I =
  1.00  0.00  0.00
  0.00  1.00  0.00
  0.00  0.00  1.00
Matrix Transpose
• X is our 5x2 matrix previously introduced.
• X' is the transpose of X.

X =
  5  5
  4  6
  3  2
  4  4
  4  3

X' =
  5  4  3  4  4
  5  6  2  4  3
Matrix Addition
Given two matrices, X and Y

X =          Y =
  5  5         7  7
  4  6         6  6
  3  2         5  2
  4  4         4  4
  4  3         4  7

Then we can add the individual elements of X and Y to get T

T = X + Y =
  12  12
  10  12
   8   4
   8   8
   8  10
Similarly, Matrix Subtraction…
Given the same two matrices, X and Y

X =          Y =
  5  5         7  7
  4  6         6  6
  3  2         5  2
  4  4         4  4
  4  3         4  7

Then we can subtract the individual elements of X and Y to get D

D = X − Y =
  -2  -2
  -2   0
  -2   0
   0   0
   0  -4
We can also use scalars w/matrices

C = T − 9.2

T =            C =
  12  12         2.8   2.8
  10  12         0.8   2.8
   8   4        -1.2  -5.2
   8   8        -1.2  -1.2
   8  10        -1.2   0.8

Here, I've subtracted a scalar, 9.2, from T. I could have also multiplied T by 0.5 to get a matrix of means. The value 9.2 happens to be the mean for each column, meaning we have centered the data within each column.
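These elementwise operations map directly onto NumPy arrays; a quick sketch with the same X and Y:

```python
import numpy as np

X = np.array([[5, 5], [4, 6], [3, 2], [4, 4], [4, 3]])
Y = np.array([[7, 7], [6, 6], [5, 2], [4, 4], [4, 7]])

T = X + Y    # elementwise addition
D = X - Y    # elementwise subtraction
C = T - 9.2  # subtract a scalar; 9.2 is each column's mean, so C is centered

print(T)
print(D)
print(C.mean(axis=0))  # column means of C are zero after centering
```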
Matrix Multiplication: As seen on T.V.!
• Matrices must be conformable for multiplication
  - The first matrix must have the same number of columns as the second matrix has rows.
  - The resulting matrix will be of order R1 x C2
• We then multiply away…
  - We multiply each element from the first row of the first matrix by the corresponding element of the first column of the second matrix.
  - Then we multiply each element from the first row of the first matrix by the corresponding element of the second column of the second matrix.
  - We continue until we run out of columns in the second matrix, and do it over again for the second row of the first matrix.
Example
If we take the transpose of C (C') and post-multiply it by C, we get a new matrix called SSCP. It would go like this.

C' =
  2.8   0.8  -1.2  -1.2  -1.2
  2.8   2.8  -5.2  -1.2   0.8

C =
  2.8   2.8
  0.8   2.8
 -1.2  -5.2
 -1.2  -1.2
 -1.2   0.8

SSCP11 = (2.8 × 2.8) + (0.8 × 0.8) + (-1.2 × -1.2) + (-1.2 × -1.2) + (-1.2 × -1.2) = 12.8
SSCP12 = (2.8 × 2.8) + (0.8 × 2.8) + (-1.2 × -5.2) + (-1.2 × -1.2) + (-1.2 × 0.8) = 16.8
SSCP21 = (2.8 × 2.8) + (2.8 × 0.8) + (-5.2 × -1.2) + (-1.2 × -1.2) + (0.8 × -1.2) = 16.8
SSCP22 = (2.8 × 2.8) + (2.8 × 2.8) + (-5.2 × -5.2) + (-1.2 × -1.2) + (0.8 × 0.8) = 44.8
SSCP, V-C & R

Rearranging the elements into a matrix:
SSCP =
  12.8  16.8
  16.8  44.8

Multiplying by a scalar, 1/(n−1):
V-C =
  3.2   4.2
  4.2  11.2

The above matrix is closely related to the familiar R:
R =
  1.000  0.702
  0.702  1.000
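The chain C → SSCP → V-C → R is one line per step in NumPy; a sketch using the centered matrix from the previous slides:

```python
import numpy as np

C = np.array([[2.8, 2.8], [0.8, 2.8], [-1.2, -5.2], [-1.2, -1.2], [-1.2, 0.8]])

sscp = C.T @ C            # sums of squares and cross-products
vc = sscp / (len(C) - 1)  # variance-covariance matrix, n - 1 = 4
sd = np.sqrt(np.diag(vc))
R = vc / np.outer(sd, sd) # rescale covariances to correlations

print(sscp)
print(vc)
print(R.round(3))
```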
Matrix Division: It just keeps getting better!
• Matrix Division is even stranger than matrix multiplication.
• You know most of what you need to know though, since it is accomplished through multiplying by an inverted matrix.
• Finding the inverse is the tricky part.
• We will do a very simple example.
Inverses
• Not all matrices have an inverse.
• A matrix inverse is defined such that
  - XX⁻¹ = I
• We need two things in order to find the inverse
  - 1. the determinant of the matrix we wish to take the inverse of, V-C in this case, which is written as |V-C|
  - 2. the adjoint of the same matrix, i.e. V-C, written adj(V-C)
Determinant and Adjoint
For a 2x2 matrix, V, the determinant is V11·V22 − V12·V21

|V-C| = (3.2)(11.2) − (4.2)(4.2) = 18.2

The adjoint is formed in the following way:

adj(V) =
   V22  -V12
  -V21   V11

Adj(V-C) =
  11.2  -4.2
  -4.2   3.2
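The determinant/adjoint recipe for a 2x2 matrix fits in a few lines of Python (the function name is mine):

```python
def inverse_2x2(m):
    """Invert a 2x2 matrix [[a, b], [c, d]] via adjoint / determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c  # |m| = ad - bc
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    # adjoint is [[d, -b], [-c, a]]; divide each element by the determinant
    return [[d / det, -b / det],
            [-c / det, a / det]]

vc = [[3.2, 4.2], [4.2, 11.2]]
print(inverse_2x2(vc))  # ≈ [[0.615, -0.231], [-0.231, 0.176]]
```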
Almost there…
We then divide each element of the adjoint matrix by the determinant →

V-C⁻¹ =
  11.2 / 18.2   -4.2 / 18.2
  -4.2 / 18.2    3.2 / 18.2

Or,

V-C⁻¹ =
   0.615  -0.231
  -0.231   0.176
Checking our work…
V-C · V-C⁻¹ = I

V-C =            V-C⁻¹ =
  3.2   4.2        0.615  -0.231
  4.2  11.2       -0.231   0.176

(V-C · V-C⁻¹)11 = 3.2(0.615) + 4.2(-0.231) = 1.968 − 0.970 ≈ 1.0
(V-C · V-C⁻¹)12 = 3.2(-0.231) + 4.2(0.176) = -0.739 + 0.739 = 0
(V-C · V-C⁻¹)21 = 4.2(0.615) + 11.2(-0.231) = 2.583 − 2.587 ≈ 0
(V-C · V-C⁻¹)22 = 4.2(-0.231) + 11.2(0.176) = -0.970 + 1.971 ≈ 1.0

V-C · V-C⁻¹ =
  1.0  0.0
  0.0  1.0
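NumPy performs the same check without the hand-rounding error; a quick sketch:

```python
import numpy as np

vc = np.array([[3.2, 4.2], [4.2, 11.2]])
vc_inv = np.linalg.inv(vc)

print(vc_inv.round(3))                      # ≈ [[0.615, -0.231], [-0.231, 0.176]]
print(np.allclose(vc @ vc_inv, np.eye(2)))  # the product is the identity matrix
```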
Why we leave matrix operations to computers
Finding the determinant of a 3x3 matrix:

  a  b  c
  d  e  f
  g  h  i

D = a(ei – fh) + b(fg – di) + c(dh – eg)

Inverting the 3x3 matrix after solving for the determinant:

1/D ×
  ei - fh   ch - bi   bf - ce
  fg - di   ai - cg   cd - af
  dh - eg   bg - ah   ae - bd
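The 3x3 formula above, transcribed into Python and checked against NumPy (the test matrix is an arbitrary invertible one chosen for illustration):

```python
import numpy as np

def inverse_3x3(m):
    """Invert a 3x3 matrix via the determinant/adjoint formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) + b * (f * g - d * i) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

M = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
print(np.allclose(inverse_3x3(M), np.linalg.inv(M)))  # hand formula agrees with NumPy
```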
So, why did I drag you through this?