Survey

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Survey

Document related concepts

no text concepts found

Transcript

Chapter 9 Regression Analysis Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 1 Icebreaker: The best tips to study for an exam • The class will be broken into pairs or groups of students. • Answer the following question: • What is the best way to do well on an exam? • Think about • Reading or re-reading chapters in the book • Taking and reviewing notes • Flashcards • Class attendance • Etc. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 2 Chapter Objectives (1 of 2) By the end of this chapter, you should be able to: 09.01 Explain the purpose of regression analysis. 09.02 Distinguish between functional and statistical relationships. 09.03 Create and interpret scatter plots. 09.04 Explain the method of least squares for estimating model parameters. 09.05 Create regression models using Excel’s Data Analysis tool. 09.06 Create predictions using regression models. 09.07 Create prediction intervals. 09.08 Explain and use the R2 and adjusted-R2 statistics. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 3 Chapter Objectives (2 of 2) 09.09 Explain and use forward stepwise regression. 09.10 Explain the meaning and dangers of extrapolation. 09.11 Explain the concept of overfitting. 09.12 Conduct statistical tests for population parameters. 09.13 Describe techniques for and the importance of variable selection in multiple regression. 09.14 Define multicollinearity and its implications in regression analysis. 09.15 Described the use of polynomial functions in linear regression models. 09.16 Explain and use the following functions: AVERAGE( ), TREND( ), SQRT( ), SUM( ),TINV( ), VAR.P( ), VAR.S( ). Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 4 Introduction Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 5 Introduction to Regression Analysis • Regression Analysis is used to estimate a function f( ) that describes the relationship between a continuous dependent variable and one or more independent variables. Y f (X1, X1, X1, Xn ) Note: • f( ) describes systematic variation in the relationship. • ε represents the unsystematic variation (or random error) in the relationship. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 6 An Example • Consider the relationship between advertising (X1) and sales (Y) for a company. • There probably is a relationship... ...as advertising increases, sales should increase. • But how would we measure and quantify this relationship? See file Fig9-1.xlsm Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 7 A Scatter Plot of the Data Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 8 The Nature of a Statistical Relationship Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 9 A Simple Linear Regression Model • The scatter plot shows a linear relation between advertising and sales. • So the following regression model is suggested by the data, Yi 0 0 X1 i • This refers to the true relationship between the entire population of advertising and sales values. • The estimated regression function (based on our sample) will be represented as, Ŷi b 0 b1X1i • Ŷi is the estimated (of fitted) value of Y at a given level of X Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 10 Defining “Best Fit” • Numerical values must be assigned to b0 and b1 • The method of “least squares” selects the values that minimize: n n ˆ ) (Y (b b X )) 2 ESS (Yi Y i i 0 1 1i 2 i =1 i =1 • If ESS=0 our estimated function fits the data perfectly. • We could solve this problem using Solver... Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 11 Using Solver… Fig9-4.xlsm Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 12 The Estimated Regression Function • The estimated regression function is: Ŷi 36.342 5.550X1i Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 13 Using the Regression Tool • Excel also has a built-in tool for performing regression that: • is easier to use • provides a lot more information about the problem See file Fig9-1.xlsm Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 14 The TREND() Function TREND(Y-range, X-range, X-value for prediction) where: Y-range is the spreadsheet range containing the dependent Y variable, X-range is the spreadsheet range containing the independent X variable(s), X-value for prediction is a cell (or cells) containing the values for the independent X variable(s) for which we want an estimated value of Y. Note: The TREND( ) function is dynamically updated whenever any inputs to the function change. However, it does not provide the statistical information provided by the regression tool. It is best two use these two different approaches to doing regression in conjunction with one another. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 15 Evaluating the “Fit” Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 16 The R2 Statistic • The R2 statistic indicates how well an estimated regression function fits the data. • 0 < R2 < 1 • It measures the proportion of the total variation in Y around its mean that is accounted for by the estimated regression equation. • To understand this better, consider the following graph... Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 17 Error Decomposition Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 18 Partition of the Total Sum of Squares n n n i 1 i 1 i 1 2 2 2 ˆ ˆ (Y Y) (Y Y ) (Y Y) i i i i or, TSS ESS RSS RSS ESS R 1 TSS TSS 2 Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 19 Making Predictions (1 of 2) • Suppose we want to estimate the average levels of sales expected if $65,000 is spent on advertising. Ŷi 36.342 5.550X1i • Estimated Sales = 36.342 + 5.550 * 65 = 397.092 • So when $65,000 is spent on advertising, we expect the average sales level to be $397,092. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 20 The Standard Error • The standard error measures the scatter in the actual data around the estimate regression line. 2 ˆ (Y Y ) i1 i i n Se n k 1 • where k = the number of independent variables • For our example, Se = 20.421 • This is helpful in making predictions... Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 21 An Approximate Prediction Interval • An approximate 95% prediction interval for a new value of Y when X1=X1h is given by Ŷh 2 Se where Ŷh b0 b1X1h • Example: If $65,000 is spent on advertising: • 95% lower prediction interval = 397.092 − 2*20.421 = 356.250 • 95% upper prediction interval = 397.092 + 2*20.421 = 437.934 • If we spend $65,000 on advertising we are approximately 95% confident actual sales will be between $356,250 and $437,934. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 22 An Exact Prediction Interval • A (1−α)% prediction interval for a new value of Y when X1=X1h is given by Ŷh t 1 ,n 2 2 Sp where: Ŷh b0 b1X1h (X1h X) 2 1 S p Se 1 n n (X1 X) 2 i i 1 Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 23 Example • If $65,000 is spent on advertising: 95% lower prediction interval = 397.092 − 2.306*21.489 = 347.556 95% upper prediction interval = 397.092 + 2.306*21.489 = 446.666 • If we spend $65,000 on advertising we are 95% confident actual sales will be between $347,556 and $446,666. • This interval is only about $20,000 wider than the approximate one calculated earlier but was much more difficult to create. • The greater accuracy is not always worth the trouble. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 24 Comparison of Prediction Interval Techniques Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 25 Confidence Intervals for the Mean • A (1−α)% confidence interval for the true mean value of Y when X1=X1h is given by Ŷh t 1 ,n 2 2 Sa where: Ŷh b0 b1X1h S a Se (X1h X) 2 1 n n (X1 X) 2 i i 1 Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 26 A Note About Extrapolation • Predictions made using an estimated regression function may have little or no validity for values of the independent variables that are substantially different from those represented in the sample. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 27 Discussion Activity (1 of 2) You would like to create a regression model to help you invest in the stock market. There is a tremendous amount of historical data on markets, stocks, and other related areas. What are some of the variables you think would be most important? Do you think you could create a model that would give you confidence in predictions? Why or why not? Is it possible to create a general model that could be applied across different sectors? What are the potential dangers? Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 28 Multiple Regression Analysis • Most regression problems involve more than one independent variable. • If each independent variables varies in a linear manner with Y, the estimated regression function in this case is: Ŷi b0 b1X1i b2 X 2i bk X ki • The optimal values for the bi can again be found by minimizing the ESS. • The resulting function fits a hyperplane to our sample data. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 29 Example Regression Surface for Two Independent Variables Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 30 Multiple Regression Example: Real Estate Appraisal • A real estate appraiser wants to develop a model to help predict the fair market values of residential properties. • Three independent variables will be used to estimate the selling price of a house: • total square footage • number of bedrooms • size of the garage • See data in file Fig9-17.xlsm Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 31 Selecting the Model • We want to identify the simplest model that adequately accounts for the systematic variation in the Y variable. • Arbitrarily using all the independent variables may result in overfitting. • A sample reflects characteristics: • representative of the population • specific to the sample • We want to avoid fitting sample specific characteristics -- or overfitting the data. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 32 Model with One Independent Variable • With simplicity in mind, suppose we fit three simple linear regression functions: Ŷi b0 b1X1i Ŷi b0 b2 X 2i Ŷi b0 b3 X 3i • Key regression results are: Variables In the Model R2 Adjusted R2 Se Parameter Estimates X1 0.870 0.855 10.299 b0 = 109.503, b1 = 56.394 X2 0.759 0.731 14.030 b0 = 178.290, b2 = 28.382 X3 0.793 0.770 12.982 b0 = 116.250, b3 = 27.607 • The model using X1 accounts for 87% of the variation in Y, leaving 13% unaccounted for. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 33 Important Software Note • When using more than one independent variable, all variables for the X-range must be in one contiguous block of cells (that is, in adjacent columns). Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 34 Model with Two Independent Variable • Now suppose we fit the following models with two independent variables: Ŷi b0 b1X1i b2 X 2i Ŷi b0 b1X1i +b3 X 3i • Key regression results are: Variables In the Model R2 Adjusted R2 Se X1 0.870 0.855 10.299 b0 = 109.503, b1 = 56.394 X1 & X2 0.939 0.924 7.471 b0 = 127.684, b1 = 38.58, b2 = 12.875 X1 & X3 0.877 0.847 10.609 b0 = 108.311, b1 = 44.31, b3 = 6.743 Parameter Estimates • The model using X1 and X2 accounts for 93.9% of the variation in Y, leaving 6.1% unaccounted for. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 35 The Adjusted R2 Statistic • As additional independent variables are added to a model: • The R2 statistic can only increase. • The Adjusted-R2 statistic can increase or decrease. ESS n 1 R 2a 1 TSS n k 1 • The R2 statistic can be artificially inflated by adding any independent variable to the model. • We can compare adjusted-R2 values as a heuristic to tell if adding an additional independent variable really helps. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 36 A Comment on Multicollinearity • It should not be surprising that adding X3 (# of bedrooms) to the model with X1 (total square footage) did not significantly improve the model. • Both variables represent the same (or very similar) things -- the size of the house. • These variables are highly correlated (or collinear). • Multicollinearity should be avoided. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 37 Model with Three Independent Variable • Now suppose we fit the following models with two independent variables: Ŷi b0 b1X1i b2 X 2i +b3 X 3i • Key regression results are: Variables In the Model R2 Adjusted R2 Se X1 0.870 0.855 10.299 b0 = 109.503, b1 = 56.394 X1 & X2 0.939 0.924 7.471 b0 = 127.684, b1 = 38.58, b2 = 12.875 X1, X2 & X3 0.943 0.918 7.762 b0 = 126.440, b1 = 30.803, b2 = 12.567, b3 = 4.576 • Parameter Estimates The model using X1 and X2 appears to be best: • Highest adjusted-R2 • Lowest Se (most precise prediction intervals) Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 38 Making Predictions (2 of 2) • Let’s estimate the avg selling price of a house with 2,100 square feet and a 2-car garage: Ŷi b0 b1X1i b2 X 2i ˆ = 127.64 + 38.576X + 12.875X Y i 1i 2i • The estimated average selling price is $234,444 • A 95% prediction interval for the actual selling price is approximately: Ŷh 2 Se • 95% lower prediction interval = 234.444 − 2*7.471 = $219,502 • 95% lower prediction interval = 234.444 + 2*7.471 = $249,386 Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 39 The Adjusted R2 Statistic and the Standard Error • Note that the model with the highest Adjusted-R2 will also have the lowest standard error Se (and vice-versa). ESS n 1 2 2 R 2a 1 1 S S e Y n k 1 TSS where S 2Y is the sample variance of Y. • Thus, the model with the highest Adjusted-R2 has the smallest (most precise) prediction intervals too. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 40 Binary Independent Variables • Other types of non-quantitative factors could independent variables could be included in the analysis using binary variables. • Example: The presence (or absence) of a swimming pool, 1, if house i has a pool X pi 0, otherwise • Example: Whether the roof is in good, average or poor condition, 1, if the roof of house i is in good condition X ri 0, otherwise X r +1i 1, if the roof of house i is in average condition 0, otherwise Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 41 Polynomial Regression • Sometimes the relationship between a dependent and independent variable is not linear. • This graph suggests a quadratic relationship between square footage (X) and selling price (Y). Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 42 The Regression Model • An appropriate regression function in this case might be, Ŷi b0 b1X1i b2 X12i or equivalently, Ŷi b0 b1X1i b2 X 2i where, X 2i X12i Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 43 Implementing the Model Fig9-25.xlsm Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 44 Graph of Estimated Quadratic Regression Function Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 45 Fitting a Third Order Polynomial Model • We could also fit a third order polynomial model, Ŷi b0 b1X1i b2 X12i b3 X13i or equivalently, Ŷi b0 b1X1i b2 X 2i b3 X 3i where, X 2i X12i X 3i X13i Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 46 Graph of Estimated Third Order Polynomial Regression Function Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 47 Overfitting • When fitting polynomial models, care must be taken to avoid overfitting. • The adjusted-R2 statistic can be used for this purpose here also. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 48 Discussion Activity (2 of 2) • In the last few elections there appears to have been some terrible predictions on who would win which states. In light of this chapter and the power and use (along with weaknesses and abuse) of regression analysis, why do you think these poor predictions have occurred? Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 49 Self-Assessment 1. What does R2 mean? How is it different than adjusted-R2? 2. When data are overfit in a sample, what does that mean, and what are the implications? 3. What is multicollinearity, and how would you address that in your regression model? Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 50 Summary (1 of 2) Now that the lesson has ended, you should have learned how to: • Explain the purpose of regression analysis. • Distinguish between functional and statistical relationships. • Create and interpret scatter plots. • Explain the method of least squares for estimating model parameters. • Create regression models using Excel’s Data Analysis tool. • Create predictions using regression models. • Create prediction intervals. • Explain and use the R2 and adjusted-R2 statistics. Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 51 Summary (2 of 2) • • • • • Explain and use forward stepwise regression. Explain the meaning and dangers of extrapolation. Explain the concept of overfitting. Conduct statistical tests for population parameters. Describe techniques for and the importance of variable selection in multiple regression. • Define multicollinearity and its implications in regression analysis. • Described the use of polynomial functions in linear regression models. • Explain and use the following functions: AVERAGE( ), TREND( ), SQRT( ), SUM( ),TINV( ), VAR.P( ), VAR.S( ). Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, Ninth Edition. © 2022 Cengage. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 52