Download Solutions - OCCC.edu

MULTIPLE CHOICE Solutions
1. ANS: D
2. ANS: C
3. ANS: A
4. ANS: D
5. ANS: B
6. ANS: D
7. ANS: B
8. ANS: D
9. ANS: C
10. ANS: A
11. ANS: B
12. ANS: B
13. ANS: D
14. ANS: A
15. ANS: D
16. ANS: A
17. ANS: B
18. ANS: D
19. ANS: A
20. ANS: B
21. ANS: B
22. ANS: B
23. ANS: C
24. ANS: B
25. ANS: D
26. ANS: A
27. ANS: C
28. ANS: C
29. ANS: D
30. ANS: C
31. ANS: A
32. ANS: D
33. ANS: D
34. ANS: B
35. ANS: C

Notes on selected questions (in the order they appeared):
Note: This is a Chi-Square Goodness of Fit Test
Note: This is a Chi-Square Goodness of Fit Test
Note: This is a Chi-Square Test for Independence
Recall: DF = (r-1)(c-1)
Goodness of Fit Chi-Square test
Interpret the slope (note: y is in 1000s)
Test we must use in Multiple Regression (use the ANOVA)
Hint: no error
A deterministic model is a math model with no error
Hint: no error once again.
Note: MSE = SSE/(n-2)
Note: If you look at x = 2, 4, and 6 you can see that y goes down (weakly)
Recall: t = b1 / Sb1
Looking for b0
R2 is always SSR/SST for both the simple and multiple regressions
Interpret slope of X1
Note: look at the coefficient values to get the equation
Note: SST = SSE + SSR
Same as #35
Short Answer Problems Solutions
1.
a. Recall that we go to the coefficient values and find that the estimated equation is:
ŷ = 13.251 + 0.803x
b. Since we want to run a hypothesis test on b1 we run the following test:
(i) Hypotheses
Ho : B1 = 0
Ha : B1 ≠ 0
(ii) Critical/Rejection Rule: with df = n - 2 = 8 - 2 = 6, the critical values are -tα/2 = -2.447 and tα/2 = 2.447; reject Ho if t < -2.447 or t > 2.447.
(iii) Test Stat: t = (0.803 - 0) / 0.385 ≈ 2.085
(iv) Conclusion: Since 2.085 < 2.447, we fail to reject Ho and conclude that b1 is not significantly different from 0; so the value we have for b1 is not to be used.
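As a quick numerical check of steps (ii)-(iv), here is a short Python sketch using the values given in the problem (the variable names are ours):

```python
# Slope t-test sketch: b1 and its standard error come from the printout.
b1 = 0.803      # estimated slope b1
se_b1 = 0.385   # standard error of b1 (Sb1)
t_crit = 2.447  # t table value for alpha/2 = 0.025, df = n - 2 = 6

t_stat = (b1 - 0) / se_b1
print(round(t_stat, 3))      # 2.086 (the solution rounds this to 2.085)
print(abs(t_stat) > t_crit)  # False, so fail to reject Ho
```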
c. First we want to fill in the ANOVA Table completely:

Source of Variation   Degrees of Freedom   Sum of Squares              Mean Square               F-Stat
Regression            1                    71.875 - 41.674 = 30.201    MSR = 30.201/1 = 30.201   MSR/MSE ≈ 4.35
Residual Error        6                    41.674                      MSE = 41.674/6 ≈ 6.95
Total                 7                    71.875

(Note: total degrees of freedom = n - 1 = 8 - 1 = 7.)
(1) Identify Hypotheses
Ho: B1 = 0
Ha: B1 ≠ 0
(2) Rejection Rule/Critical value
Note: we have an F-stat with numerator df = k = 1 (one predictor) and
denominator df = n - (k+1) = 8 - 2 = 6.
So the critical value for F = 5.99
(3) Calculate the test statistic: F = MSR/MSE = 30.201/6.95 ≈ 4.35
(4) Conclusion: Our test stat does not lie in the tail, so we fail to reject Ho and conclude that the slope (b1) is not significantly different from 0. So our estimated regression equation should not be used as a predictor of y.
d. R2 = SSR/SST = 30.201/71.875 ≈ 0.42 ; So this tells us the regression equation can only explain 42% of the total
variation in y. So it does not do a good job.
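A small Python sketch of parts (c) and (d), filling in the ANOVA table from SST and SSE (values from the problem; the variable names are ours):

```python
# ANOVA completion for the simple regression (n = 8, k = 1 predictor).
sst, sse = 71.875, 41.674
n, k = 8, 1

ssr = sst - sse              # 30.201
msr = ssr / k                # 30.201
mse = sse / (n - k - 1)      # about 6.95
f_stat = msr / mse           # about 4.35
r2 = ssr / sst               # about 0.42

print(round(f_stat, 2), round(r2, 2))  # 4.35 0.42
```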
2. The first step is to finish putting in all the values for the ANOVA table:

Source of Variation   Degrees of Freedom   Sum of Squares               Mean Square                    F
Regression            2                    822.088                      SSR/df = 822.088/2 = 411.044   MSR/MSE = 411.044/105.14 ≈ 3.91
Error                 7                    736.012                      SSE/df = 736.012/7 ≈ 105.14
Total                 2 + 7 = 9            822.088 + 736.012 = 1558.1
a. Now we want to run the following Hypothesis Test:
(1) Identify Hypotheses
Ho: B1 = B2 = 0 (all slope coefficients are zero)
Ha: Bi ≠ 0 for at least one of the coefficients
(2) Rejection Rule/Critical value
Note: we have an F-stat with df = 2 for the numerator and
df = 7 denominator degrees of freedom.
So the critical value for F = 4.74
(3) Calculate the test statistic  F = MSR/MSE = 3.91
(4) Conclusion: Our test stat does not lie in the tail, so we fail to reject Ho and conclude that none of the slope coefficients (B1, B2) is significantly different from 0. So our estimated regression equation should not be used as a predictor of y.
b. R2 = SSR/SST = 822.088/1558.1 ≈ 0.528
So this tells us the regression equation can only explain around 53% of the total variation in y.
So it does not do a terribly good job.
c. Adj-R2 = 1 - (1 - R2)(n-1)/{n-(k+1)} = 1 - (1 - 0.528)(10-1)/{10-(2+1)} ≈ 0.393
d. The entire sample is n. Since the total degrees of freedom is n - 1 = 9, we get n = 10.
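Parts (a)-(d) can be checked numerically with a short Python sketch (values taken from the ANOVA table; the variable names are ours):

```python
# Multiple regression ANOVA check (k = 2 predictors).
ssr, sse = 822.088, 736.012
df_reg, df_err = 2, 7
k = 2
n = df_reg + df_err + 1      # total df = n - 1 = 9, so n = 10

sst = ssr + sse                                  # 1558.1
f_stat = (ssr / df_reg) / (sse / df_err)         # about 3.91
r2 = ssr / sst                                   # about 0.528
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)    # about 0.393

print(round(f_stat, 2), round(r2, 3), round(adj_r2, 3))  # 3.91 0.528 0.393
```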
3.
a. When advertising was used, sales were higher by $18,000. Note: X3 only takes values of 0 or 1.
b. Recall that ŷ = 10 - 4X1 + 7X2 + 18X3. So if we have X1 = 3 because of $3,000 in previous sales, X2 = 10 because of 10,000 miles, and X3 = 1 since they used advertising, we get Sales (ŷ) = 10 - 4(3) + 7(10) + 18(1) = 86, or $86,000 in sales.
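The prediction in part (b) is just a plug-in calculation; a minimal Python sketch (the function name is ours):

```python
# Estimated equation: y-hat = 10 - 4*X1 + 7*X2 + 18*X3,
# where X3 is the 0/1 advertising dummy and y is in $1000s.
def predict_sales(x1, x2, x3):
    return 10 - 4 * x1 + 7 * x2 + 18 * x3

y_hat = predict_sales(3, 10, 1)
print(y_hat)  # 86, i.e. $86,000 in sales
```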
4. Has there been any significant change in the number of students in each major between the last school year and this school year? Use α = 0.05. Recall 30% were Accounting majors, 24% Management majors, 26% Marketing majors, and 20% Economics majors, with n = 300 students.
Major         Expected          Actual   (Actual - Expected)2/Expected
Accounting    0.30(300) = 90    83       49/90 ≈ 0.54
Management    0.24(300) = 72    68       16/72 ≈ 0.22
Marketing     0.26(300) = 78    85       49/78 ≈ 0.63
Economics     0.20(300) = 60    64       16/60 ≈ 0.27
Total         300               300      X2 ≈ 1.66
So now given the table we created above we can run the goodness of fit test:
(1) Hypotheses:
Ho: PAccounting = 0.30, PMGT = 0.24, PMKT = 0.26, PEconomics = 0.20
Ha: At least one of the proportions is different from its hypothesized value
(2) Critical Value: If we note that df = k - 1 = 4 - 1 = 3 and go to table A.10, we find that the critical value is 7.81.
(3) Test Stat: X2=1.66
(4) Conclusion: Since 1.66 is NOT in the tail we fail to reject Ho and conclude that the proportions are not
significantly different from the expected values
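The goodness-of-fit statistic above can be reproduced with a few lines of Python (the dictionary layout and names are ours):

```python
# Chi-square goodness-of-fit for the majors example (n = 300).
hypothesized = {"Accounting": 0.30, "Management": 0.24,
                "Marketing": 0.26, "Economics": 0.20}
observed = {"Accounting": 83, "Management": 68,
            "Marketing": 85, "Economics": 64}
n = 300

chi2 = sum((observed[m] - p * n) ** 2 / (p * n)
           for m, p in hypothesized.items())
print(round(chi2, 2))  # 1.66, below the critical value 7.81, so fail to reject Ho
```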
5. Recall the 3 x 2 contingency table with observed values from a sample of 1,500. At 95% confidence, test for independence of the row and column factors.
Table 1: Original Observed Frequencies

Row Factor    Column Factor X   Column Factor Y   Total
A             450               300               750
B             300               300               600
C             150               0                 150
Total         900               600               1,500

Table 2: Expected Frequencies

Row\Column    X                        Y                        Total
A             (750)(900)/1500 = 450    (750)(600)/1500 = 300    750
B             (600)(900)/1500 = 360    (600)(600)/1500 = 240    600
C             (150)(900)/1500 = 90     (150)(600)/1500 = 60     150
Total         900                      600                      1,500
To get the Chi-Square test stat we take X2 = Σall cells (Observed - Expected)2/Expected
(1) Hypotheses:
Ho: Column Factor and Row Factor are independent from one another
Ha: The variables are statistically dependent
(2) Critical value: it is chi-square distributed with (r-1)(c-1) = 2*1 = 2 degrees of freedom, with α = 0.05.
Critical value is 5.99147.
(3) Test Statistic: X2 = Σall cells (O - E)2/E = 125
(4) Conclusion: Reject Ho because 125 >> 5.99147, which means the test stat > critical value (i.e., it is in the tail). So we conclude with 95% confidence that the Column and Row factors are statistically dependent (i.e., they are not independent).
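Here is a short Python sketch that rebuilds the expected counts and the X2 = 125 statistic (the list layout and names are ours):

```python
# Chi-square test for independence on the 3 x 2 table (n = 1,500).
observed = [[450, 300],   # row A: columns X, Y
            [300, 300],   # row B
            [150,   0]]   # row C

row_tot = [sum(r) for r in observed]        # 750, 600, 150
col_tot = [sum(c) for c in zip(*observed)]  # 900, 600
n = sum(row_tot)                            # 1500

chi2 = sum((observed[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(3) for j in range(2))
print(chi2)  # 125.0, far beyond 5.99147, so reject Ho
```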
6. Recall the 2 x 3 contingency table with observed values from a sample of 500. At 95% confidence, test for independence of the row and column factors.
Table 1: Original Observed Values

Row Factor\Column Factor    X     Y     Z     Total
A                           40    50    110   200
B                           60    100   140   300
Total                       100   150   250   500

Table 2: Expected Frequencies

Row\Column    X                       Y                       Z                        Total
A             (200)(100)/500 = 40     (200)(150)/500 = 60     (200)(250)/500 = 100     200
B             (300)(100)/500 = 60     (300)(150)/500 = 90     (300)(250)/500 = 150     300
Total         100                     150                     250                      500

To get the Chi-Square test stat we take X2 = Σall cells (Observed - Expected)2/Expected

(1) Hypotheses:
Ho: Column Factor and Row Factor are independent from one another
Ha: The variables are statistically dependent
(2) Critical value: it is chi-square distributed with (r-1)(c-1) = 1*2 = 2 degrees of freedom, with α = 0.05.
Critical value is 5.99147.
(3) Test Statistic: X2 = Σall cells (O - E)2/E ≈ 4.45
(4) Conclusion: Fail to reject Ho because 4.45 < 5.99147, which means the test stat < critical value (i.e., it is not in the tail). So at 95% confidence we cannot conclude that the Column and Row factors are dependent; they appear statistically independent.
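The same calculation for this 2 x 3 table, sketched in Python (layout and names are ours; computing without intermediate rounding gives about 4.44 rather than the 4.45 obtained by rounding each cell first):

```python
# Chi-square test for independence on the 2 x 3 table (n = 500).
observed = [[40, 50, 110],    # row A: columns X, Y, Z
            [60, 100, 140]]   # row B

row_tot = [sum(r) for r in observed]        # 200, 300
col_tot = [sum(c) for c in zip(*observed)]  # 100, 150, 250
n = sum(row_tot)                            # 500

chi2 = sum((observed[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(2) for j in range(3))
print(round(chi2, 2))  # 4.44, below 5.99147, so fail to reject Ho
```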