Module 4 Homework Answers
UTA Summer 2005

Linear Regression 1 Pet Food

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.833982217
R Square             0.695526338
Adjusted R Square    0.667846914
Standard Error       0.305103313
Observations         13

69.55% of the variation in sales is explained by pet food shelf space allocation.

ANOVA
              df    SS          MS          F           Significance F
Regression     1    2.339109    2.339109    25.12792    0.000395
Residual      11    1.023968    0.093088
Total         12    3.363077

              Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept     1.432883939     0.214896          6.667805    3.53E-05    0.959901     1.905867
Space-X       0.077080891     0.015377          5.012776    0.000395    0.043237     0.110925

b0 = 1.43
b1 = .077
Y-hat = 1.43 + .077(Space-X)

b1 interpretation: For each 1-foot increase in shelf space, the expected increase in weekly sales is .077 (hundred dollars), or $7.70.

Predicted weekly sales for 17 feet of shelf space: 2.739 hundred dollars, or $273.90.
Actual weekly sales for 8 feet of shelf space: 3 hundred dollars, or $300.
Error: $26.10

Linear Regression Homework

1. Moving Spreadsheet

[Figure: Scatter Diagram for Moving; x-axis: Cubic Feet, y-axis: Hours]

b. b0 = -2.37, b1 = .05
c. For every cubic foot increase in the size of a move, the hours required for the move go up by .05 on average.
d. Y-est = -2.37 + .05X, so Y-est = 22.63 hours.
e. r^2 = .889, from r^2 = (6910.7 / 7771.4). That is, .889 or 88.9% of the variation in the hours required to do a move can be explained by size (cubic feet). The remaining 11.1% is left to be explained by factors other than house size.
f. Se = 5.03, from sqrt(860.7 / 34). This is the standard deviation of the errors; our “typical” error in estimating a move time would be about 5 hours.
g. Very useful, as evidenced by a large r-squared and a relatively small standard error. A visual inspection of the graph also shows a good linear trend with relatively small deviation of the observations from the line estimates.
h. The t-test statistic is a large 16.5. Since t-test > t-crit, reject H0 and conclude there is a linear association between cubic feet and hours.
i. CI: b1 +/- t Sb1 = (.044, .056)

2. DJIA

a.
[Figure: Rates & DJIA; x-axis: Rate, y-axis: DJIA]
[Figure: Chart Title; axis labels not recoverable]

b & c.
d. Possibly, for Time & DJIA (not completely linear, though).
e. Model: y-hat = .0724 + .000000085X. Rates are a very poor predictor, with a very small r-squared of .000178.
f. Model: y-hat = 3596.6 + 59.77X. Time is a good predictor of DJIA, with a fairly large r-squared value of .735, meaning that 73.5% of the variation in the DJIA is explained by time period; the remaining 26.5% is explained by other factors. The b1 value of 59.77 means that for one additional time period (each new year), the DJIA tends to increase on average by 59.77 (from 1993 to present).
g. Time is a much better predictor of where the DJIA is heading than interest rates.

3. Invoice Data

b. b0 = 40.237, b1 = 1.26
c. b0: the time to process 0 invoices would be 40.23 seconds. 0 invoices is outside the data range, therefore the intercept should not be interpreted.
   b1: It takes an additional 1.26 seconds on average to process an additional invoice.
d. 229 seconds
e. 33.4 seconds. This is the standard deviation of the errors. Pick a given number of invoices to process and find the point estimate using the model. The standard error is used to develop an interval within which most observations (processing times) will fall.
f. 89% of the variation in processing time can be explained by number of invoices. Number of invoices does a good job of explaining the amount of processing time.
g. Yes, the t-test statistic of 15.2 is greater than the t-crit value of 2.0484, therefore we reject H0: beta-1 = 0 and conclude there is a linear association between time and invoices processed.
h. CI: [216, 242]
i. PI: [159, 299]
j. The CI is for the average processing time over many days on which 150 invoices are processed. This interval will be narrower than the 95% PI for a single day’s processing time for 150 invoices.
k. Actual Y = 250, Predicted Y = 191.4, error 58.6
l. This plot shows observations randomly scattered about 0, therefore no assumption violation for independence or equal variance.
m. Yes, there is no discernible pattern; the error variances appear to be fairly small.

4. Ice Cream 2

a. y-hat = -180.76 + 4.6X
b. b0: for a temperature of 0, expected sales is -$180.76. These values are outside our data range and do not make sense.
   b1: for each 1 degree increase in temperature, expected sales increase by 4.6 dollars.
c. t* = 3.28 > t-crit of 2.0167, therefore reject H0 and conclude there is a linear association between sales and temperature. The slope is not 0.
d. Equal variance (in Y-values) assumption violated. Independent error terms assumption violated. At large X-values, the errors are getting larger.
e. No, the scatter plot is not linear and the 2 assumptions above are violated.
f. 20% of the variation in sales can be explained by temperature. The rest of the variation is attributed to other things. Sales is not explained very well by temperature.
g. It is very difficult to accept a hypothesis of the form H0: beta1 = 0. It will almost always be rejected (a parameter is almost never exactly equal to some value). The other indicators (R-squared, plots, assumption violations) clearly indicate simple linear regression is not applicable to this data set.
h. Obtain and analyze a residual plot.
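
Checking the arithmetic: several of the answers above are short hand calculations (a point prediction, r^2 from sums of squares, the standard error, a t statistic, and a slope confidence interval). The sketch below re-derives a few of them in Python from the figures quoted in the answers. The function names are illustrative only and are not part of the homework. The critical value 2.201 (two-sided 95%, 11 degrees of freedom) is a t-table lookup, and n = 36 for the moving data is inferred from the 34 residual degrees of freedom; both are assumptions, not values stated in the answers.

```python
import math


def predict(b0, b1, x):
    """Point estimate from a fitted simple linear regression line."""
    return b0 + b1 * x


def r_squared(ss_regression, ss_total):
    """Share of total variation explained by the regression."""
    return ss_regression / ss_total


def standard_error(ss_residual, n):
    """Standard error of the estimate: sqrt(SSE / (n - 2))."""
    return math.sqrt(ss_residual / (n - 2))


def slope_interval(b1, se_b1, t_crit):
    """Confidence interval for the slope: b1 +/- t_crit * Sb1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin


# Pet food: prediction at 17 feet and the error against the reported actual
# sales of 3 (hundred dollars).
pet_pred = predict(1.43, 0.077, 17)
pet_error = 3.00 - pet_pred

# Moving: r^2 and Se from the sums of squares quoted in answers e and f.
# n = 36 is inferred from the 34 residual degrees of freedom (assumption).
moving_r2 = r_squared(6910.7, 7771.4)
moving_se = standard_error(860.7, 36)

# Pet food slope: t statistic and 95% CI from the coefficient table.
# 2.201 is a t-table value for 11 df (assumption, not from the answers).
pet_t = 0.077080891 / 0.015377
pet_ci = slope_interval(0.077080891, 0.015377, 2.201)

# Invoice data: point estimate for 150 invoices (answer d).
invoice_pred = predict(40.237, 1.26, 150)

print(f"Pet food prediction: {pet_pred:.3f} hundred dollars, error {pet_error:.3f}")
print(f"Moving r^2: {moving_r2:.3f}, Se: {moving_se:.2f}")
print(f"Pet food slope t: {pet_t:.2f}, CI: ({pet_ci[0]:.4f}, {pet_ci[1]:.4f})")
print(f"Invoice prediction for 150 invoices: {invoice_pred:.0f} seconds")
```

Small differences in the last decimal place against the spreadsheet output are expected, because the inputs here are the rounded values printed in the answers.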