Lecture 2: Simple Linear Regression
STAT 512, Spring 2011
Background reading: KNNL, Chapter 1

Topic Overview

This topic covers:
- regression terminology
- simple linear regression with a single predictor variable

Relationships Among Variables

Functional relationships: the value of the dependent variable Y can be computed exactly if we know the value of the independent variable X (e.g., Y = 2X).

Statistical relationships: not a perfect or exact relationship. The expected value of the response variable Y is a function of the explanatory (or predictor) variable X; the observed value of Y is that expected value plus a random deviation.

Simple Linear Regression

[figure]

Uses of SLR

Why use simple linear regression?
- Descriptive/exploratory purposes (explore the strength of known cause/effect relationships).
- Administrative control (often the response variable is $$$).
- Prediction of outcomes (predict future needs; often overlaps with cost control).

Statistical Relationships vs. Causality

Statistical relationships do not imply causality! Example: a Lafayette ice cream shop does more business on days when attendance at an Indianapolis swimming pool is high. (Presumably hot weather drives both, rather than one driving the other.)

Data for Simple Linear Regression

We observe pairs of variables; each pair is called a case or a data point.
- $Y_i$ is the $i$th value of the response variable.
- $X_i$ is the $i$th value of the explanatory (or predictor) variable; in practice the value of $X_i$ is treated as a known constant.

Simple Linear Regression Model

Statement of the model:

$$ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad i = 1, 2, \ldots, n, \qquad \varepsilon_i \sim N(0, \sigma^2) $$

Model parameters (unknown):
- $\beta_0$ = intercept; may not have a meaningful interpretation.
- $\beta_1$ = slope; $\beta_1 = 0$ if there is no relationship between X and Y.
- $\sigma^2$ = error variance.

The mean response is the deterministic part of the model:

$$ Y_i = \overbrace{\beta_0 + \beta_1 X_i}^{E(Y_i)} + \varepsilon_i $$

Interpretation of the Regression Coefficients

$\beta_0$ is the expected value of the response variable when $X = 0$. $\beta_1$ represents the increase (or decrease, if negative) in the mean response for a 1-unit increase in the value of X.

Features of the SLR Model

The errors are independent, identically distributed normal random variables:

$$ \varepsilon_i \overset{iid}{\sim} N(0, \sigma^2) $$

This implies that the responses are independent normal random variables:

$$ Y_i \sim N(\beta_0 + \beta_1 X_i, \sigma^2) $$

(See KNNL, A.36, p. 1303, for the proof.)

Fitted Regression Equation

The parameters $\beta_0$, $\beta_1$, $\sigma^2$ must be estimated from the data; the estimates are denoted $b_0$, $b_1$, $s^2$. The fitted (or estimated) regression line is

$$ \hat{Y}_i = b_0 + b_1 X_i $$

The "hat" symbol differentiates the fitted value $\hat{Y}_i$ from the actual observed value $Y_i$.

Residuals

The deviations (or errors) from the true regression line, $\varepsilon_i = Y_i - (\beta_0 + \beta_1 X_i)$, cannot be known, since the regression parameters $\beta_0$ and $\beta_1$ are unknown. We estimate them by the residuals:

$$ e_i = \text{observed} - \text{predicted} = Y_i - \hat{Y}_i = Y_i - (b_0 + b_1 X_i) $$

Error Terms vs. Residuals

[figure]

Assumptions

The model assumes that the error terms are independent, normal, and have constant variance. Residuals may be used to explore the legitimacy of these assumptions. More on this topic later.

Least Squares Estimation

We want to find the "best" estimates $(b_0, b_1)$ for $(\beta_0, \beta_1)$. The best estimates minimize the sum of the squared residuals:

$$ SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_i) \right]^2 $$

To do this, use calculus (see pages 17-18 of KNNL).

Least Squares Solution

The LS estimate for $\beta_1$ can be written in terms of "sums of squares":

$$ b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{SS_{XY}}{SS_X} $$

The LS estimate for $\beta_0$ is

$$ b_0 = \bar{Y} - b_1 \bar{X} $$

About the LS Estimates

They are also the maximum likelihood estimates (see KNNL, pages 27-32). These are the best estimates because they are unbiased (their expectation is the parameter they are estimating) and they have minimum variance among all unbiased estimators. Big picture: we wouldn't want to use any other estimates, because we can do no better.
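To make the least-squares formulas concrete, here is a minimal SAS sketch (an illustration added here, not part of the original slides) that computes $b_1 = SS_{XY}/SS_X$ and $b_0 = \bar{Y} - b_1 \bar{X}$ "by hand". The dataset name demo and its five (x, y) points are invented for the example; running PROC REG on the same data should reproduce the estimates.

   /* Toy data: any (x, y) pairs would do here. */
   data demo;
      input x y;
      datalines;
   1 2
   2 3
   3 5
   4 4
   5 6
   ;
   run;

   /* Step 1: sample means of x and y. */
   proc means data=demo noprint;
      var x y;
      output out=mstats mean=xbar ybar;
   run;

   /* Step 2: per-observation pieces of SSXY and SSX. */
   data parts;
      if _n_ = 1 then set mstats(keep=xbar ybar);
      set demo;
      dxy = (x - xbar) * (y - ybar);   /* cross-product term */
      dxx = (x - xbar) ** 2;           /* squared deviation  */
   run;

   /* Step 3: sum the pieces over all observations. */
   proc means data=parts noprint;
      var dxy dxx;
      output out=ssums sum=ssxy ssx;
   run;

   /* Step 4: the least-squares estimates. */
   data estimates;
      merge ssums(keep=ssxy ssx) mstats(keep=xbar ybar);
      b1 = ssxy / ssx;        /* slope:     SSXY / SSX       */
      b0 = ybar - b1 * xbar;  /* intercept: ybar - b1 * xbar */
      put b1= b0=;            /* write the estimates to the log */
   run;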
Mean Square Error

We also need to estimate $\sigma^2$. This estimate is based on the sum of the squared residuals (SSE) and the available degrees of freedom:

$$ s^2 = MSE = \frac{SSE}{df_E} = \frac{\sum e_i^2}{n - 2} $$

The error degrees of freedom reflect the fact that we have $n$ observations and have already estimated 2 parameters, $(b_0, b_1)$.

Variance Notation

$s^2 = MSE$ will always be the estimate for $\sigma^2$. This can be confusing, because there will be estimated variances for other quantities as well, denoted e.g. $s^2\{b_1\}$, $s^2\{b_0\}$, etc. These are not products but single variance quantities. To avoid confusion, I will generally write MSE whenever referring to the estimate for $\sigma^2$.

Example: Diamond Rings

Variables:
- Response variable: price in Singapore dollars (Y)
- Explanatory variable: weight of the diamond in carats (X)

Associated SAS file: diamonds.sas

SAS Regression Procedure

   proc reg data=diamonds;
      model price = weight;
   run;

Output (1)

   Source    DF    Sum of Squares    Mean Square
   Model      1         2098596         2098596
   Error     46           46636      1013.81886
   Total     47         2145232

   Root MSE = 31.84052

Output (2)

   Variable     DF    Parameter Estimate    Standard Error
   Intercept     1           -259.62591          17.31886
   weight        1           3721.02485          81.78588

Output Summary

From the output we see that

- $b_0 = -259.6$
- $b_1 = 3721.0$
- $MSE = 1014$
- $\sqrt{MSE} = 31.8$

Note that the root MSE has a direct interpretation as the estimated standard deviation of the errors (in dollars).

Interpretations

It doesn't really make sense to talk about a 1-carat increase, but we can rescale to a 0.01-carat increase by dividing by 100. From $b_1$ we see that a 0.01-carat increase in the weight of a diamond leads to a $37.21 increase in the mean price. The interpretation of $b_0$ would be that one would actually be paid $260 simply to take a 0-carat diamond ring. Why doesn't this make sense?

Scope of Model

The scope of a regression model is the range of X-values over which we actually have data. Using the model at X-values outside its scope (extrapolation) is quite dangerous.

[figure]

Prediction for 0.43 Carats

Does this make sense in light of the previous discussion? Suppose we assume that it does. Then the mean price for a 0.43-carat ring is estimated as

$$ \hat{Y} = -260 + 3721(0.43) = 1340 $$

How confident would you be in this estimate?

Upcoming in Lecture 3

We will discuss more about inference concerning the regression coefficients. Background reading: KNNL 2.1-2.6.
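Closing sketch (an addition for illustration, not one of the original slides): the diamonds analysis can be rerun end to end in SAS, letting PROC REG produce the 0.43-carat prediction itself. This assumes diamonds.sas has already created a dataset named diamonds with variables price and weight, as used on the slides; the clb option and the scoring step are extras beyond the code shown above.

   /* A ring to "score": weight 0.43, price unknown.  PROC REG
      excludes rows with a missing response from the fit but still
      computes their predicted values. */
   data toscore;
      weight = 0.43;
      price  = .;
   run;

   data diamonds2;
      set diamonds toscore;
   run;

   proc reg data=diamonds2;
      model price = weight / clb;       /* clb: confidence limits for b0, b1 */
      output out=preds p=yhat r=resid;  /* fitted values and residuals */
   run;
   quit;

   /* Predicted mean price for the 0.43-carat ring. */
   proc print data=preds(where=(price = .));
      var weight yhat;
   run;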