Chapter 14 Inference on the Least-Squares Regression Model and Multiple Regression
Copyright © 2013, 2010 and 2007 Pearson Education, Inc.

Requirement 1 for Inference on the Least-Squares Regression Model
For any particular value of the explanatory variable x, the mean of the corresponding responses in the population depends linearly on x. That is,
$$\mu_{y|x} = \beta_1 x + \beta_0$$
for some numbers $\beta_0$ and $\beta_1$, where $\mu_{y|x}$ represents the population mean response when the value of the explanatory variable is x.

Requirement 2 for Inference on the Least-Squares Regression Model
The response variables are normally distributed with mean $\mu_{y|x} = \beta_1 x + \beta_0$ and standard deviation $\sigma$.

"In Other Words"
When doing inference on the least-squares regression model, we require (1) for any value of the explanatory variable, x, the mean of the response variable, y, depends on the value of x through a linear equation, and (2) the response variable, y, is normally distributed with a constant standard deviation, $\sigma$. The mean increases or decreases at a constant rate determined by the slope, while the standard deviation remains constant.

A large value of $\sigma$, the population standard deviation, indicates that the data are widely dispersed about the regression line, and a small value of $\sigma$ indicates that the data lie fairly close to the regression line.

The least-squares regression model is given by
$$y_i = \beta_1 x_i + \beta_0 + \varepsilon_i$$
where
• $y_i$ is the value of the response variable for the ith individual
• $\beta_0$ and $\beta_1$ are the parameters to be estimated based on sample data
• $x_i$ is the value of the explanatory variable for the ith individual
• $\varepsilon_i$ is a random error term with mean 0 and variance $\sigma_{\varepsilon_i}^2 = \sigma^2$; the error terms are independent.
Here i = 1, …, n, where n is the sample size (the number of ordered pairs in the data set).

The standard error of the estimate, $s_e$, is found using the formula
$$s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{\sum \text{residuals}^2}{n - 2}}$$

CAUTION! Be sure to divide by $n - 2$ when computing the standard error of the estimate.

Hypothesis Test Regarding the Slope Coefficient, $\beta_1$
To test whether two quantitative variables are linearly related, we use the following steps provided that
1. the sample is obtained using random sampling.
2. the residuals are normally distributed with constant error variance.

Step 1: Determine the null and alternative hypotheses. The hypotheses can be structured in one of three ways:

Two-tailed       Left-tailed      Right-tailed
H0: β1 = 0       H0: β1 = 0       H0: β1 = 0
H1: β1 ≠ 0       H1: β1 < 0       H1: β1 > 0

Step 2: Select a level of significance, $\alpha$, depending on the seriousness of making a Type I error.

Classical Approach
Step 3: Compute the test statistic
$$t_0 = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{b_1}{s_{b_1}}$$
which follows Student's t-distribution with $n - 2$ degrees of freedom. Remember, when computing the test statistic, we assume the null hypothesis to be true. So, we assume that $\beta_1 = 0$. Use Table VI to determine the critical value using $n - 2$ degrees of freedom.

Step 4: Compare the critical value with the test statistic.

P-Value Approach By Hand
Step 3: Compute the test statistic
$$t_0 = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{b_1}{s_{b_1}}$$
which follows Student's t-distribution with $n - 2$ degrees of freedom.
Use Table VI to approximate the P-value.

Step 4: If the P-value < $\alpha$, reject the null hypothesis.

CAUTION! Before testing H0: β1 = 0, be sure to draw a residual plot to verify that a linear model is appropriate.

Parallel Example 5: Testing for a Linear Relation
Test the claim that there is a linear relation between drill depth and drill time at the $\alpha = 0.05$ level of significance using the drilling data.

Solution
Verify the requirements:
• We assume that the experiment was randomized so that the data can be assumed to represent a random sample.
• In Parallel Example 4 we confirmed that the residuals were normally distributed by constructing a normal probability plot.
• To verify the requirement of constant error variance, we plot the residuals against the explanatory variable, drill depth.

There is no discernible pattern in the residual plot.

Solution
Step 1: We want to determine whether a linear relation exists between drill depth and drill time without regard to the sign of the slope. This is a two-tailed test with
H0: β1 = 0 versus H1: β1 ≠ 0
Step 2: The level of significance is $\alpha = 0.05$.
Step 3: Using technology, we obtained an estimate of $\beta_1$ in Parallel Example 2, $b_1 = 0.0116$. To determine the standard deviation of $b_1$, we compute $\sum (x_i - \bar{x})^2$. The calculations are on the next slide.

Solution
Step 3, cont'd: We have
$$s_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{0.5197}{\sqrt{30006.25}} \approx 0.0030$$
The test statistic is
$$t_0 = \frac{b_1}{s_{b_1}} = \frac{0.0116}{0.003} \approx 3.867$$
Solution: Classical Approach
Step 3, cont'd: Since this is a two-tailed test, we determine the critical t-values at the $\alpha = 0.05$ level of significance with $n - 2 = 12 - 2 = 10$ degrees of freedom to be $-t_{0.025} = -2.228$ and $t_{0.025} = 2.228$.
Step 4: Since the value of the test statistic, 3.867, is greater than 2.228, we reject the null hypothesis.

Solution: P-Value Approach
Step 3: Since this is a two-tailed test, the P-value is the sum of the areas under the t-distribution with $12 - 2 = 10$ degrees of freedom to the left of $-t_0 = -3.867$ and to the right of $t_0 = 3.867$. Using Table VI, we find that with 10 degrees of freedom the value 3.867 lies between 3.581 and 4.144, which correspond to right-tail areas of 0.0025 and 0.001, respectively. Doubling these areas for the two tails, the P-value is between 0.002 and 0.005.
Step 4: Since the P-value is less than the level of significance, 0.05, we reject the null hypothesis.

Solution
Step 5: There is sufficient evidence at the $\alpha = 0.05$ level of significance to conclude that a linear relation exists between drill depth and drill time.

Confidence Intervals for the Slope of the Regression Line
A $(1 - \alpha) \cdot 100\%$ confidence interval for the slope of the true regression line, $\beta_1$, is given by the following formulas:
Lower bound: $b_1 - t_{\alpha/2} \cdot \dfrac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}}$
Upper bound: $b_1 + t_{\alpha/2} \cdot \dfrac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}}$
Here, $t_{\alpha/2}$ is computed using $n - 2$ degrees of freedom.

Note: The confidence interval formula for $\beta_1$ can be used only if the data are randomly obtained, the residuals are normally distributed, and there is constant error variance.
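As a rough check of the arithmetic, the slope test and the slope confidence interval can be sketched in Python from the summary statistics of the drilling example. The function name `slope_inference` and its parameter names are illustrative assumptions, not part of the slides, and the critical value 2.228 is taken from the table lookup quoted in the example rather than computed by the code.

```python
import math

def slope_inference(b1, s_e, sxx, t_crit):
    """Two-tailed test of H0: beta1 = 0 plus a confidence interval for beta1.

    b1     : estimated slope
    s_e    : standard error of the estimate
    sxx    : sum of squared deviations, sum((x_i - x_bar)**2)
    t_crit : t critical value for n - 2 degrees of freedom
             (assumption: looked up in a t-table, not computed here)
    """
    s_b1 = s_e / math.sqrt(sxx)      # standard error of the slope
    t0 = b1 / s_b1                   # test statistic under H0: beta1 = 0
    reject = abs(t0) > t_crit        # classical two-tailed decision
    margin = t_crit * s_b1           # half-width of the confidence interval
    return s_b1, t0, reject, b1 - margin, b1 + margin

# Summary statistics from the drilling example (n = 12, so 10 degrees of freedom).
s_b1, t0, reject, lower, upper = slope_inference(
    b1=0.0116, s_e=0.5197, sxx=30006.25, t_crit=2.228
)
print(f"s_b1 = {s_b1:.4f}, t0 = {t0:.3f}, reject H0: {reject}")
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

Because the sketch keeps the unrounded standard error of the slope, its test statistic can differ in the third decimal place from a hand calculation that rounds $s_{b_1}$ to 0.003 first.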
Parallel Example 7: Constructing a Confidence Interval for the Slope of the True Regression Line
Construct a 95% confidence interval for the slope of the least-squares regression line for the drilling example.

Solution
The requirements for the use of the confidence interval formula were verified in previous examples. We also determined
• $b_1 = 0.0116$
• $s_{b_1} = 0.0030$
in previous examples.

Solution
Since $t_{0.025} = 2.228$ for 10 degrees of freedom, we have
Lower bound = 0.0116 - 2.228 • 0.003 = 0.0049
Upper bound = 0.0116 + 2.228 • 0.003 = 0.0183
We are 95% confident that the mean increase in the time it takes to drill 5 feet for each additional foot of depth at which the drilling begins is between 0.005 and 0.018 minutes.

Section 14.2 Confidence and Prediction Intervals

Confidence intervals for a mean response are intervals constructed about the predicted value of y, at a given level of x, that are used to measure the accuracy of the mean response of all the individuals in the population.
Prediction intervals for an individual response are intervals constructed about the predicted value of y that are used to measure the accuracy of a single individual's predicted value.

Confidence Interval for the Mean Response of y, $\hat{y}$
A $(1 - \alpha) \cdot 100\%$ confidence interval for $\hat{y}$, the mean response of y for a specified value of x, is given by
Lower bound: $\hat{y} - t_{\alpha/2} \cdot s_e \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
Upper bound: $\hat{y} + t_{\alpha/2} \cdot s_e \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
where $x^*$ is the given value of the explanatory variable, n is the number of observations, and $t_{\alpha/2}$ is the critical value with $n - 2$ degrees of freedom.
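The mean-response formula can be sketched as a small Python function. The name `mean_response_interval` and its parameter names are hypothetical choices for illustration. The numeric inputs are the drilling-example values reported in these slides, and the critical value is assumed to come from a t-table rather than being computed.

```python
import math

def mean_response_interval(y_hat, x_star, x_bar, sxx, s_e, n, t_crit):
    """Confidence interval for the mean response at x = x_star.

    sxx is sum((x_i - x_bar)**2); t_crit is the t critical value with
    n - 2 degrees of freedom (assumption: supplied from a table).
    """
    margin = t_crit * s_e * math.sqrt(1.0 / n + (x_star - x_bar) ** 2 / sxx)
    return y_hat - margin, y_hat + margin

# Drilling example: predicted mean time at a starting depth of 110 feet.
y_hat = 0.0116 * 110 + 5.5273   # fitted regression line evaluated at x* = 110
lower, upper = mean_response_interval(
    y_hat=y_hat, x_star=110, x_bar=126.25, sxx=30006.25,
    s_e=0.5197, n=12, t_crit=2.228,
)
print(f"y_hat = {y_hat:.4f}")
print(f"95% CI for the mean response: ({lower:.2f}, {upper:.2f})")
```

The interval is narrowest when $x^*$ equals $\bar{x}$, because the second term under the square root vanishes there.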
Parallel Example 1: Constructing a Confidence Interval for a Mean Response
Construct a 95% confidence interval about the predicted mean time to drill 5 feet for all drillings started at a depth of 110 feet.

Solution
The least-squares regression line is $\hat{y} = 0.0116x + 5.5273$. To find the predicted mean time to drill 5 feet for all drillings started at 110 feet, let $x^* = 110$ in the regression equation and obtain $\hat{y} = 6.8033$.
Recall:
• $s_e = 0.5197$
• $\bar{x} = 126.25$
• $\sum (x_i - \bar{x})^2 = 30006.25$
• $t_{0.025} = 2.228$ for 10 degrees of freedom

Solution
Therefore,
Lower bound: $6.8033 - 2.228 \cdot 0.5197 \sqrt{\dfrac{1}{12} + \dfrac{(110 - 126.25)^2}{30006.25}} = 6.45$
Upper bound: $6.8033 + 2.228 \cdot 0.5197 \sqrt{\dfrac{1}{12} + \dfrac{(110 - 126.25)^2}{30006.25}} = 7.15$

Solution
We are 95% confident that the mean time to drill 5 feet for all drillings started at a depth of 110 feet is between 6.45 and 7.15 minutes.

Prediction Interval for an Individual Response about $\hat{y}$
A $(1 - \alpha) \cdot 100\%$ prediction interval for $\hat{y}$, the individual response of y, is given by
Lower bound: $\hat{y} - t_{\alpha/2} \cdot s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
Upper bound: $\hat{y} + t_{\alpha/2} \cdot s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
where $x^*$ is the given value of the explanatory variable, n is the number of observations, and $t_{\alpha/2}$ is the critical value with $n - 2$ degrees of freedom.

Parallel Example 2: Constructing a Prediction Interval for an Individual Response
Construct a 95% prediction interval about the predicted time to drill 5 feet for a single drilling started at a depth of 110 feet.

Solution
The least-squares regression line is $\hat{y} = 0.0116x + 5.5273$. To find the predicted time to drill 5 feet for a single drilling started at 110 feet, let $x^* = 110$ in the regression equation and obtain $\hat{y} = 6.8033$.
Recall:
• $s_e = 0.5197$
• $\bar{x} = 126.25$
• $\sum (x_i - \bar{x})^2 = 30006.25$
• $t_{0.025} = 2.228$ for 10 degrees of freedom

Solution
Therefore,
Lower bound: $6.8033 - 2.228 \cdot 0.5197 \sqrt{1 + \dfrac{1}{12} + \dfrac{(110 - 126.25)^2}{30006.25}} = 5.59$
Upper bound: $6.8033 + 2.228 \cdot 0.5197 \sqrt{1 + \dfrac{1}{12} + \dfrac{(110 - 126.25)^2}{30006.25}} = 8.01$

Solution
We are 95% confident that the time to drill 5 feet for a random drilling started at a depth of 110 feet is between 5.59 and 8.01 minutes.
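To see why the prediction interval is wider than the confidence interval for the mean response, the two half-widths can be computed side by side. This is a minimal Python sketch under stated assumptions: the helper name `interval_half_width` and its `individual` flag are hypothetical, the inputs are the drilling-example summary values, and the critical value 2.228 is taken from the table lookup rather than computed.

```python
import math

def interval_half_width(x_star, x_bar, sxx, s_e, n, t_crit, individual):
    """Half-width of the interval about y_hat at x = x_star.

    individual=False gives the confidence interval for the mean response;
    individual=True adds the extra 1 under the square root, which accounts
    for the variability of a single response (prediction interval).
    """
    extra = 1.0 if individual else 0.0
    spread = extra + 1.0 / n + (x_star - x_bar) ** 2 / sxx
    return t_crit * s_e * math.sqrt(spread)

# Shared inputs from the drilling example (assumption: t_crit from a t-table).
inputs = dict(x_star=110, x_bar=126.25, sxx=30006.25, s_e=0.5197, n=12, t_crit=2.228)
y_hat = 0.0116 * 110 + 5.5273

mean_half = interval_half_width(individual=False, **inputs)
pred_half = interval_half_width(individual=True, **inputs)

pred_lower, pred_upper = y_hat - pred_half, y_hat + pred_half
print(f"mean-response half-width: {mean_half:.3f}")
print(f"prediction half-width:    {pred_half:.3f}")
print(f"95% prediction interval:  ({pred_lower:.2f}, {pred_upper:.2f})")
```

The only difference between the two intervals is the added 1 under the square root, so the prediction interval is always the wider of the two at the same $x^*$.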