Module 4 Homework Answers UTA Summer 2005
Linear Regression 1 Pet Food
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.833982217
R Square            0.695526338
Adjusted R Square   0.667846914
Standard Error      0.305103313
Observations        13
69.55% of the variation in sales is explained by pet
food shelf space allocation.
ANOVA
             df   SS         MS         F          Significance F
Regression   1    2.339109   2.339109   25.12792   0.000395
Residual     11   1.023968   0.093088
Total        12   3.363077

             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept    1.432883939    0.214896         6.667805   3.53E-05   0.959901    1.905867
Space-X      0.077080891    0.015377         5.012776   0.000395   0.043237    0.110925
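The regression statistics can be cross-checked against the ANOVA table. The following is a minimal sketch that assumes only the sums of squares, coefficients, and standard errors printed above; the variable names are illustrative and not part of the original spreadsheet.

```python
import math

# Values copied from the ANOVA table above.
ss_regression = 2.339109
ss_residual = 1.023968
ss_total = 3.363077
n = 13                    # observations
df_residual = n - 2       # simple regression: one intercept, one slope

# R Square is the share of total variation explained by the regression.
r_square = ss_regression / ss_total

# Adjusted R Square penalizes for the number of estimated parameters.
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual

# Standard Error is the square root of the residual mean square.
ms_residual = ss_residual / df_residual
standard_error = math.sqrt(ms_residual)

# F is the regression mean square (df = 1) over the residual mean square.
f_stat = (ss_regression / 1) / ms_residual

# Each t Stat is a coefficient divided by its standard error.
t_intercept = 1.432883939 / 0.214896
t_slope = 0.077080891 / 0.015377

print(f"R Square          {r_square:.6f}")
print(f"Adjusted R Square {adj_r_square:.6f}")
print(f"Standard Error    {standard_error:.6f}")
print(f"F                 {f_stat:.4f}")
print(f"t (Intercept)     {t_intercept:.3f}")
print(f"t (Space-X)       {t_slope:.3f}")
```

Small differences in the last digits are expected because the table values are already rounded.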
b0 = 1.43
b1 = .077
Y-hat = 1.43 + .077(Space-X)
b1 interpretation: For each 1 foot increase in shelf space, the expected increase
in weekly sales is .077 (hundred dollars) or $7.70.
predicted weekly sales for 17 feet of shelf space: 2.739 hundred dollars or $273.90.
actual weekly sales for 8 feet of shelf space: 3 hundred dollars or $300.
error: $26.10
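The prediction and error can be reproduced with a few lines of arithmetic. This is a sketch using the rounded coefficients b0 = 1.43 and b1 = .077; the function name is hypothetical. It pairs the $300 observation with the 17-foot prediction because that pairing reproduces the stated $26.10 error. That pairing is an inference from the arithmetic, since the text labels the observation as 8 feet.

```python
b0 = 1.43     # intercept, in hundreds of dollars
b1 = 0.077    # slope, in hundreds of dollars per foot of shelf space


def predict_sales(space_feet):
    """Predicted weekly sales (hundreds of dollars) for a given shelf space."""
    return b0 + b1 * space_feet


predicted = predict_sales(17)     # 1.43 + 0.077 * 17
actual = 3.0                      # observed weekly sales, hundreds of dollars
error = actual - predicted        # residual = actual minus predicted

print(f"predicted: {predicted:.3f} hundred dollars (${predicted * 100:.2f})")
print(f"error:     {error:.3f} hundred dollars (${error * 100:.2f})")
```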
Linear Regression Homework
1. Moving Spreadsheet
[Figure: Scatter Diagram for Moving. Horizontal axis: Cubic Feet (0 to 1600). Vertical axis: Hours (0 to 90).]
b. b0=-2.37, b1=.05
c. for every 1 cubic foot increase in the size of a move, the hours required for the move
go up by .05 on average
d. Y-est=-2.37+.05X, so Y-est = 22.63 hours
e. r^2 = .889 (6910.7 / 7771.4). 88.9% of the variation in the hours required to do a
move can be explained by size (cubic feet). The remaining 11.1% is explained by factors
other than house size.
f. Se = 5.03 = sqrt (860.7 / 34). This is the standard deviation of the errors; our “typical”
error in estimating a move time would be about 5 hours
g. very useful, as evidenced by a large r-squared and a relatively small standard error. A
visual inspection of the graph also shows a good linear trend with relatively small
deviations of the observations from the line estimates
h. t-test statistic is a large 16.5, t-test > t-crit, therefore reject H0 and conclude there is a
linear association between cubic feet and hours
i. CI: b1+/- t Sb1 = (.044, .056)
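Parts d, e, f, and i can be checked numerically. The sketch below makes three assumptions that are not stated in the answers: the move size of 500 cubic feet is inferred because it is the value that reproduces the 22.63-hour estimate, the critical value 2.032 is the standard two-tailed 95% t-table entry for 34 degrees of freedom, and Sb1 is backed out from the rounded b1 and t statistic.

```python
import math

b0, b1 = -2.37, 0.05          # coefficients from part b
ssr, sse, sst = 6910.7, 860.7, 7771.4
df_error = 34                 # divisor used in part f
t_stat = 16.5                 # test statistic from part h
t_crit = 2.032                # ASSUMPTION: t-table value, 95%, 34 df

# Part e: share of variation in hours explained by size.
r_squared = ssr / sst

# Part f: standard error of the estimate.
s_e = math.sqrt(sse / df_error)

# Part d: ASSUMPTION, a 500 cubic foot move reproduces the 22.63 hours.
size = 500
y_est = b0 + b1 * size

# Part i: since t = b1 / Sb1, the slope's standard error is b1 / t.
s_b1 = b1 / t_stat
ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)

print(f"r^2   = {r_squared:.3f}")
print(f"Se    = {s_e:.2f}")
print(f"Y-est = {y_est:.2f} hours")
print(f"CI    = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because b1 and the t statistic are rounded, the interval endpoints agree with part i only to about three decimal places.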
2. DJIA
a.
[Figure: Rates & DJIA. Horizontal axis: Rate (0.00% to 10.00%). Vertical axis: DJIA (0 to 14000).]
[Figure: Chart Title. Horizontal axis ticks: 0 to 140. Vertical axis ticks: 0 to 14000.]
b & c.
d. possibly, for Time & DJIA (not completely linear though)
e. Model: y-hat = .0724 + .000000085X, rates are a very poor predictor, very small
r-squared of .000178
f. Model: y-hat = 3596.6 + 59.77X, time is a good predictor of DJIA, fairly large
r-squared value of .735 meaning that 73.5% of the variation in the DJIA is explained by
time period, the remaining 26.5% is explained by other factors. The b1 value of 59.77
means that for one additional time period (each new year), DJIA tends to increase on
average by 59.77 (from 1993 to present).
g. Time is a much better predictor of where the DJIA is heading than interest rates.
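The comparison between the two models can be made concrete with a short sketch. It uses the time model and the two r-squared values from parts e and f. The time periods 0, 60, and 120 are hypothetical inputs chosen only because they fall inside the 0 to 140 range on the chart axis; the function name is also illustrative.

```python
def djia_from_time(period):
    """Fitted DJIA for a time period, using the model from part f."""
    return 3596.6 + 59.77 * period


r2_time = 0.735        # part f
r2_rates = 0.000178    # part e

# Hypothetical time periods within the charted range.
for period in (0, 60, 120):
    print(f"period {period:>3}: fitted DJIA = {djia_from_time(period):.1f}")

# Share of variation each model leaves unexplained.
print(f"unexplained by time:  {1 - r2_time:.3f}")
print(f"unexplained by rates: {1 - r2_rates:.6f}")
```

The rate model leaves nearly all of the variation unexplained, which is the basis for the conclusion in part g.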
3. Invoice Data
b. b0 = 40.237, b1 = 1.26
c. b0: the time to process 0 invoices would be 40.237 seconds. 0 invoices is outside the
data range, therefore the intercept should not be interpreted
b1: It takes an additional 1.26 seconds on average to process an additional invoice.
d. 229 seconds
e. 33.4 seconds. This is the standard deviation of the errors. Pick a given number of
invoices to process and find the point estimate using the model. The standard error is
used to develop an interval within which most observations (processing times) will fall.
f. 89% of the variation in processing time can be explained by number of invoices.
Number of invoices does a good job in explaining amount of processing time.
g. yes, the t-test statistic of 15.2 is greater than the t-crit value of 2.0484, therefore we
reject H0: beta-1 = 0, and conclude there is a linear association between time and
invoices processed.
h. CI: [216, 242]
i. PI: [159, 299]
j. the CI is for the average processing time over many days on which 150 invoices are
processed. This interval will be narrower than the 95% PI for a single day’s processing
time for 150 invoices.
k. Actual Y = 250, Predicted Y = 191.4, error 58.6
l. This plot shows observations randomly scattered about 0, therefore no assumption
violation for independence or equal variance.
m. Yes, there is no discernable pattern, the error variances appear to be fairly small with
no pattern.
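The point estimate, the residual in part k, and the relationship between the CI and PI can be checked together. The sketch below rests on stated assumptions: 120 invoices is inferred for part k because it reproduces the predicted value of about 191.4, and the leverage term h (the quantity 1/n + (x0 - x-bar)^2 / SSx in the textbook interval formulas) is not given, so it is backed out from the reported CI width. The function name is hypothetical.

```python
import math

b0, b1 = 40.237, 1.26     # coefficients from part b
s_e = 33.4                # standard error from part e
t_crit = 2.0484           # critical value from part g


def predict_seconds(invoices):
    """Predicted processing time in seconds for a number of invoices."""
    return b0 + b1 * invoices


# Part d: point estimate for 150 invoices.
point = predict_seconds(150)

# Part k: ASSUMPTION, 120 invoices reproduces the predicted value.
predicted_k = predict_seconds(120)
error_k = 250 - predicted_k

# Parts h and i: CI half-width = t * Se * sqrt(h),
#                PI half-width = t * Se * sqrt(1 + h).
ci_low, ci_high = 216, 242
center = (ci_low + ci_high) / 2
ci_half = (ci_high - ci_low) / 2
h = (ci_half / (t_crit * s_e)) ** 2       # implied by the reported CI
pi_half = t_crit * s_e * math.sqrt(1 + h)
pi = (center - pi_half, center + pi_half)

print(f"point estimate: {point:.1f} seconds")
print(f"part k: predicted {predicted_k:.1f}, error {error_k:.1f}")
print(f"implied PI: [{pi[0]:.0f}, {pi[1]:.0f}]")
```

The extra 1 under the square root is why the PI for a single day is always wider than the CI for the average, as part j states.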
4. Ice Cream 2
a. y-hat = -180.76 + 4.6X
b. b0: for a temperature of 0, expected sales is -$180.76. These values are outside
our data range and do not make sense.
b1: for each 1 degree increase in temperature, expected sales increase
by 4.6 dollars.
c. t* = 3.28 > t-crit of 2.0167, therefore reject H0 and conclude there is a linear
association between sales and temperature. The slope is not 0.
d. Equal variance (in Y-values) assumption violated.
Independent error terms assumption violated.
At large X-values, the errors are getting larger.
e. No, the scatter plot is not linear and the 2 above assumptions are violated.
f. 20% of the variation in sales can be explained by temperature. The rest of the
variation is attributed to other things. Sales is not explained very well by temp.
g. It is very difficult to accept a hypothesis of the form H0: beta1 = 0. It will almost
always be rejected (a parameter is almost never exactly equal to some value). The
other indicators (R-squared, plots, assumption violations) clearly indicate simple
linear regression is not applicable to this data set.
h. Obtain and analyze a residual plot.
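Parts c, f, and g fit together numerically: in simple linear regression, r-squared can be recovered from the slope's t statistic as t^2 / (t^2 + df). The sketch below assumes 43 error degrees of freedom, which is not stated in the answers but is the t-table row matching the critical value of 2.0167. It also computes where the fitted line crosses zero sales, to illustrate why the intercept in part b is not meaningful.

```python
t_stat = 3.28     # test statistic from part c
df_error = 43     # ASSUMPTION: implied by the t-crit of 2.0167

# In simple linear regression, r^2 = t^2 / (t^2 + error df).
r_squared = t_stat ** 2 / (t_stat ** 2 + df_error)

b0, b1 = -180.76, 4.6     # model from part a

# Temperature at which the fitted line predicts zero sales.
zero_sales_temp = -b0 / b1

print(f"r^2 implied by t: {r_squared:.3f}")
print(f"fitted sales reach 0 at about {zero_sales_temp:.1f} degrees")
```

A slope can be statistically significant while explaining only about a fifth of the variation, which is the point made in part g.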