Chapter 13: Simple Linear Regression Analysis
McGraw-Hill/Irwin. Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved.

Chapter outline:
13.1 The Simple Linear Regression Model and the Least Squares Point Estimates
13.2 Model Assumptions and the Standard Error
13.3 Testing the Significance of the Slope and y-Intercept
13.4 Confidence and Prediction Intervals
13.5 Simple Coefficients of Determination and Correlation
13.6 Testing the Significance of the Population Correlation Coefficient (Optional)
13.7 An F Test for the Model
13.8 The QHIC Case
13.9 Residual Analysis (Optional)
13.10 Some Shortcut Formulas (Optional)

LO 1: Explain the simple linear regression model.

13.1 The Simple Linear Regression Model and the Least Squares Point Estimates
The dependent (or response) variable is the variable we wish to understand or predict.
The independent (or predictor) variable is the variable we will use to understand or predict the dependent variable.
Regression analysis is a statistical technique that uses observed data to relate the dependent variable to one or more independent variables.
The objective is to build a regression model that can describe, predict, and control the dependent variable based on the independent variable.

LO 2: Find the least squares point estimates of the slope and y-intercept.

The Least Squares Point Estimates
Estimation/prediction equation: \hat{y} = b_0 + b_1 x
Least squares point estimate of the slope \beta_1:
  b_1 = \frac{SS_{xy}}{SS_{xx}}
where
  SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}
  SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}
Least squares point estimate of the y-intercept \beta_0:
  b_0 = \bar{y} - b_1 \bar{x}, where \bar{y} = \frac{\sum y_i}{n} and \bar{x} = \frac{\sum x_i}{n}

LO 3: Describe the assumptions behind simple linear regression and calculate the standard error.
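The least squares formulas above can be checked with a short Python sketch. The data here are illustrative, not from the chapter; the function names are my own:

```python
def least_squares(x, y):
    """Least squares point estimates b0 (y-intercept) and b1 (slope)
    for the estimation equation y-hat = b0 + b1*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Shortcut forms: SS_xy = sum(x*y) - sum(x)*sum(y)/n,
    #                 SS_xx = sum(x^2) - (sum(x))^2/n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    b1 = ss_xy / ss_xx        # slope estimate
    b0 = ybar - b1 * xbar     # y-intercept estimate
    return b0, b1

# Hypothetical data generated from y = 2 + 3x with no noise,
# so the fit recovers the line exactly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 8.0, 11.0, 14.0, 17.0]
b0, b1 = least_squares(x, y)
print(b0, b1)  # → 2.0 3.0
```

With noisy data the estimates would not match the generating line exactly, but they remain the unique minimizers of SSE.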
13.2 Model Assumptions and the Standard Error
1. Mean of Zero: At any given value of x, the population of potential error term values has a mean equal to zero.
2. Constant Variance: At any given value of x, the population of potential error term values has a variance that does not depend on the value of x.
3. Normality: At any given value of x, the population of potential error term values has a normal distribution.
4. Independence: Any one value of the error term ε is statistically independent of any other value of ε.

Sum of Squares
Sum of squared errors: SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2
Mean square error (point estimate of the residual variance \sigma^2): s^2 = MSE = \frac{SSE}{n-2}
Standard error (point estimate of the residual standard deviation \sigma): s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-2}}

LO 4: Test the significance of the slope and y-intercept.

13.3 Testing the Significance of the Slope and y-Intercept
A regression model is not likely to be useful unless there is a significant relationship between x and y. To test significance, we use the null hypothesis H0: β1 = 0 versus the alternative hypothesis Ha: β1 ≠ 0.

Testing the Significance of the Slope #2

Alternative    Reject H0 if                                   p-value
Ha: β1 > 0     t > tα                                         area under the t distribution to the right of t
Ha: β1 < 0     t < −tα                                        area under the t distribution to the left of t
Ha: β1 ≠ 0     |t| > tα/2 (that is, t > tα/2 or t < −tα/2)    twice the area under the t distribution to the right of |t|

LO 5: Calculate and interpret a confidence interval for a mean value and a prediction interval for an individual value.
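This excerpt gives the rejection rules but not the statistic itself; the standard test statistic for the slope is t = b1 / s_b1 with s_b1 = s / sqrt(SS_xx), which has n − 2 degrees of freedom under H0. A minimal sketch computing SSE, the standard error s, and this t statistic (data and function names are illustrative):

```python
import math

def slope_t_test(x, y):
    """Fit the least squares line, then return the standard error
    s = sqrt(SSE/(n-2)) and the t statistic t = b1 / (s / sqrt(SS_xx))
    for testing H0: beta1 = 0."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - xbar) ** 2 for xi in x)
    ss_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = ybar - b1 * xbar
    # SSE = sum of squared residuals (observed minus predicted)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))      # point estimate of sigma
    s_b1 = s / math.sqrt(ss_xx)       # standard error of the slope
    return s, b1 / s_b1

# Illustrative data with some scatter around a rising line
s, t = slope_t_test([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
print(round(s, 4), round(t, 4))
```

The computed t is then compared to tα or tα/2 with n − 2 degrees of freedom from a t table, following the rejection rules in the table above.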
13.4 Confidence and Prediction Intervals
The point on the regression line corresponding to a particular value x0 of the independent variable x is \hat{y} = b_0 + b_1 x_0. It is unlikely that this value will equal the mean value of y when x equals x0. Therefore, we need to place bounds on how far the predicted value might be from the actual value. We can do this by calculating a confidence interval for the mean value of y and a prediction interval for an individual value of y.

LO 6: Calculate and interpret the simple coefficients of determination and correlation.

13.5 Simple Coefficients of Determination and Correlation
How useful is a particular regression model? One measure of usefulness is the simple coefficient of determination, represented by the symbol r². This section may be read anytime after reading Section 13.1.

LO 7: Test hypotheses about the population correlation coefficient (optional).

13.6 Testing the Significance of the Population Correlation Coefficient (Optional)
The simple correlation coefficient (r) measures the linear relationship between the observed values of x and y from the sample. The population correlation coefficient (ρ) measures the linear relationship between all possible combinations of observed values of x and y. r is an estimate of ρ.

LO 8: Test the significance of a simple linear regression model by using an F test.

13.7 An F Test for the Model
For simple regression, this is another way to test the null hypothesis H0: β1 = 0. It is the only test we will use for multiple regression. The F test tests the significance of the overall regression relationship between x and y.

LO 9: Use residual analysis to check the assumptions of simple linear regression (optional).
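The interval formulas themselves are not shown in this excerpt; the usual forms are ŷ ± t_{α/2} · s · sqrt(d) for the mean of y and ŷ ± t_{α/2} · s · sqrt(1 + d) for an individual y, where d = 1/n + (x0 − x̄)²/SS_xx is the "distance value." A sketch under that assumption, with the t critical value passed in from a t table (data, names, and the hardcoded t_{0.025,2} = 4.303 are illustrative):

```python
import math

def intervals(x, y, x0, t_crit):
    """Confidence interval for the mean of y at x0 and prediction interval
    for an individual y at x0; t_crit is t_{alpha/2, n-2} from a t table."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ss_xx
    b0 = ybar - b1 * xbar
    y_hat = b0 + b1 * x0
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    d = 1 / n + (x0 - xbar) ** 2 / ss_xx          # distance value
    ci_half = t_crit * s * math.sqrt(d)           # bound for the mean of y
    pi_half = t_crit * s * math.sqrt(1 + d)       # bound for an individual y
    return (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)

# n = 4 so df = 2; 4.303 is the two-sided 95% critical value for df = 2
ci, pi = intervals([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0], 3.0, 4.303)
print(ci, pi)
```

Note that the prediction interval is always wider than the confidence interval at the same x0, because an individual value carries the error term's variability on top of the uncertainty in the estimated mean.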
13.9 Residual Analysis (Optional)
Checks of the regression assumptions are performed by analyzing the regression residuals. Residuals (e) are defined as the difference between the observed value of y and the predicted value of y: e = y − ŷ. Note that e is the point estimate of ε. If the regression assumptions are valid, the population of potential error terms will be normally distributed with mean zero and variance σ², and different error terms will be statistically independent.

13.10 Some Shortcut Formulas (Optional)
Total variation: SSTO = SS_{yy}
Explained variation: SSR = \frac{SS_{xy}^2}{SS_{xx}}
Unexplained variation: SSE = SS_{yy} - \frac{SS_{xy}^2}{SS_{xx}}
where
SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}
SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}
SS_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{(\sum y_i)^2}{n}
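The shortcut formulas decompose total variation into explained plus unexplained parts, which also yields r² (Section 13.5) and the F statistic (Section 13.7) as by-products. A minimal check, with illustrative data and my own function name:

```python
def variation_decomposition(x, y):
    """Shortcut formulas: SSTO = SS_yy, SSR = SS_xy^2 / SS_xx,
    SSE = SS_yy - SS_xy^2 / SS_xx, so SSTO = SSR + SSE."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    ssto = ss_yy                 # total variation
    ssr = ss_xy ** 2 / ss_xx     # explained variation
    sse = ss_yy - ssr            # unexplained variation
    return ssto, ssr, sse

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
ssto, ssr, sse = variation_decomposition(x, y)
r2 = ssr / ssto                       # simple coefficient of determination
f_stat = ssr / (sse / (len(x) - 2))   # F statistic for H0: beta1 = 0
print(ssto, ssr, sse, round(r2, 4), round(f_stat, 4))
```

For simple regression this F statistic equals the square of the slope's t statistic, so the F test and the two-sided t test of H0: β1 = 0 give identical conclusions.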