Author(s): Brenda Gunderson, Ph.D., 2011

License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution–Non-commercial–Share Alike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/

We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material.

Copyright holders of content included in this material should contact [email protected] with any questions, corrections, or clarification regarding the use of content.

For more information about how to cite these materials visit http://open.umich.edu/education/about/terms-of-use.

Any medical information in this material is intended to inform and educate and is not a tool for self-diagnosis or a replacement for medical evaluation, advice, diagnosis or treatment by a healthcare professional. Please speak to your physician if you have questions about your medical condition.

Viewer discretion is advised: Some medical content is graphic and may not be suitable for all viewers.

Attribution Key
For more information see: http://open.umich.edu/wiki/AttributionPolicy

Use + Share + Adapt
{ Content the copyright holder, author, or law permits you to use, share and adapt. }
Public Domain – Government: Works that are produced by the U.S. Government. (17 USC § 105)
Public Domain – Expired: Works that are no longer protected due to an expired copyright term.
Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
Creative Commons – Zero Waiver
Creative Commons – Attribution License
Creative Commons – Attribution Share Alike License
Creative Commons – Attribution Noncommercial License
Creative Commons – Attribution Noncommercial Share Alike License
GNU – Free Documentation License

Make Your Own Assessment
{ Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. }
Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (17 USC § 102(b)) *laws in your jurisdiction may differ
{ Content Open.Michigan has used under a Fair Use determination. }
Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act. (17 USC § 107) *laws in your jurisdiction may differ
Our determination DOES NOT mean that all uses of this 3rd-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair. To use this content you should do your own independent analysis to determine whether or not your use will be Fair.

Recall our Regression Example: Exam 2 vs Final

Exam 2:   33   65   44   64   60   40
Final:    53   80   78   93   88   58

Least Squares Regression Line: $\hat{y} = 21.67 + 1.046(x)$
$r^2 = 0.791$, $r = 0.889$

On to Inference: Sample reg line vs Population reg line

On to Inference: Sample versus Population (pg 192)
[Figure: Regression Line for the Sample. From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012. Used with permission.]

On to Inference: Sample versus Population
[Figure: Regression Line for the Population. From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012. Used with permission.]

Inference in Linear Regression
Linear Model: Response $y = [\beta_0 + \beta_1(x)] + \varepsilon$ = [Population relationship] + Randomness
For each x, the population of y values are normally distributed with some mean (may depend on x in a linear way) and a std deviation $\sigma$ that does not depend on x.
From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012.
Used with permission.

Inference in Linear Regression
The $\varepsilon$'s are the true error terms (not observed); they have a normal distribution with mean 0 and std deviation $\sigma$.
We cannot see the $\varepsilon$'s, but we can see the residuals (observed errors), so we use the residuals to assess whether the assumptions about the true errors are reasonable.

Goals in Regression: (pg 194)
1. Estimate regression line based on data.
2. Measure strength of the linear relationship with the correlation.
3. Use estimated equation for predictions.
4. Assess if the linear relationship is statistically significant.
5. Provide interval estimates (CIs) for our predictions.
6. Understand and check the assumptions of our model.

Estimating Std Dev for Regression
Measuring the average size of the residuals:
$$s = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}$$
Note: Why n – 2?

Estimating the Standard Deviation: Exam 2 and Final Exam Scores

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .889 (a)   .791       .738                8.24671
a. Predictors: (Constant), exam 2 scores (out of 75)
b. Dependent Variable: final exam scores (out of 100)

ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   1027.967         1    1027.967      15.115   .018 (a)
Residual     272.033          4    68.008
Total        1300.000         5
a. Predictors: (Constant), exam 2 scores (out of 75)
b. Dependent Variable: final exam scores (out of 100)

Significant Linear Relationship? (pg 195)
$H_0: \beta_1 = 0$ versus $H_a: \beta_1 \neq 0$
What happens if the null hypothesis is true?
$$t = \frac{\text{sample statistic} - \text{null value}}{\text{standard error of the sample statistic}}$$

t-test for the population slope $\beta_1$
To test $H_0: \beta_1 = 0$ we would use
$$t = \frac{b_1 - 0}{\text{s.e.}(b_1)} \quad \text{where} \quad \text{s.e.}(b_1) = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$
and the degrees of freedom for the t-distribution are n – 2.
Could be modified to test a variety of hypotheses.

Try It! Significant Linear Relationship between Exam 2 and Final Scores?
Is there a significant (non-zero) linear relationship between exam 2 score and final exam score? Is exam 2 a useful linear predictor for final score?
Test $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \neq 0$ at the 5% level.

A = Yes or B = No
(pg 196) Based on the previous t-test at the 5% significance level, do you think a 95% confidence interval for the true slope would contain the value of 0?

Exam 2 and Final Exam Scores
Confidence Interval for the population slope $\beta_1$:
$$b_1 \pm t^* \, \text{s.e.}(b_1) \quad \text{where df} = n - 2 \text{ for the } t^* \text{ value}$$
Compute the 95% CI for the population slope.
Could you interpret the 95% confidence level?

Inference about the population slope using SPSS

Coefficients (a)
                             Unstandardized Coefficients   Standardized Coefficients
Model 1                      B        Std. Error           Beta                        t       Sig.
(Constant)                   21.667   14.125                                           1.534   .200
exam 2 scores (out of 75)    1.046    .269                 .889                        3.888   .018
a. Dependent Variable: final exam scores (out of 100)

SPSS ANOVA F-test for Regression
Note: Third way to test $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \neq 0$

ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   1027.967         1    1027.967      15.115   .018 (a)
Residual     272.033          4    68.008
Total        1300.000         5
a. Predictors: (Constant), exam 2 scores (out of 75)
b. Dependent Variable: final exam scores (out of 100)

Recap (pages 195-196): Learning about the population slope $\beta_1$
1. t-test for $\beta_1$: $t = \dfrac{b_1 - \text{null}}{\text{s.e.}(b_1)}$, df = n – 2
2. CI for $\beta_1$: $b_1 \pm t^* \, \text{s.e.}(b_1)$, df = n – 2
3. F-test for $\beta_1$: $F = \dfrac{MS_{\text{Regr}}}{MS_{\text{Error}}}$, df = 1, n – 2

Which of the following could be used to test $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$? Select all that apply.
A) t-test
B) CI
C) F-test

Which of the following could be used to test $H_0: \beta_1 = 2$ vs $H_a: \beta_1 \neq 2$? Select all that apply.
A) t-test
B) CI
C) F-test

Which of the following could be used to test $H_0: \beta_1 = 0$ vs $H_a: \beta_1 > 0$? Select all that apply.
A) t-test
B) CI
C) F-test

Predicting for Individuals versus Estimating the Mean
$\hat{y} = 21.67 + 1.046(x)$
How would you predict the final exam score for Barb, who scored 60 points on exam 2?
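As a numerical check on the slope inference recapped above, and as a starting point for Barb's question, here is a minimal Python sketch that refits the line from the six exam pairs using only the standard library. The function name `fit_line` and the dictionary keys are illustrative choices, not part of the slides, and the multiplier `T_STAR = 2.776` is an assumed table value for 95% confidence with df = 4.

```python
import math

# Exam scores from the regression example (exam 2 out of 75, final out of 100).
exam2 = [33, 65, 44, 64, 60, 40]
final = [53, 80, 78, 93, 88, 58]


def fit_line(x, y):
    """Least squares fit plus the pieces needed for inference on the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx                      # sample slope
    b0 = y_bar - b1 * x_bar               # sample intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssto = sum((yi - y_bar) ** 2 for yi in y)
    s = math.sqrt(sse / (n - 2))          # std error of the estimate
    se_b1 = s / math.sqrt(s_xx)           # standard error of the slope
    return {"n": n, "b0": b0, "b1": b1, "s": s, "se_b1": se_b1,
            "sse": sse, "ssto": ssto}


fit = fit_line(exam2, final)

# t statistic for H0: beta1 = 0, with df = n - 2.
t_stat = (fit["b1"] - 0) / fit["se_b1"]

# F statistic from the ANOVA decomposition: MS(Regression) / MS(Error).
ms_regr = (fit["ssto"] - fit["sse"]) / 1
ms_error = fit["sse"] / (fit["n"] - 2)
f_stat = ms_regr / ms_error

# Assumed t table value for 95% confidence with df = 4.
T_STAR = 2.776
ci_slope = (fit["b1"] - T_STAR * fit["se_b1"],
            fit["b1"] + T_STAR * fit["se_b1"])

# Point prediction for a student with 60 points on exam 2.
y_hat_60 = fit["b0"] + fit["b1"] * 60

print("slope:", round(fit["b1"], 3), "intercept:", round(fit["b0"], 3))
print("s:", round(fit["s"], 5), "s.e.(b1):", round(fit["se_b1"], 3))
print("t:", round(t_stat, 3), "F:", round(f_stat, 3))
print("95% CI for slope:", tuple(round(v, 3) for v in ci_slope))
print("predicted final at x = 60:", round(y_hat_60, 2))
```

In simple linear regression the F statistic equals the square of the t statistic for the slope, which is why the t-test and the F-test report the same two-sided p-value in the SPSS output.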
How would you estimate the mean final exam score for all students who scored 60 points on exam 2?
The estimate for predicting a future observation and the estimate for the mean response are the same. What about their standard errors?

Predicting for Individuals versus Estimating the Mean
A population of individuals and a population of means…
Std dev for a population of individuals?
Std dev for a population of means?
Which standard deviation is larger?
So a prediction interval for an individual response will be (wider or narrower) than a confidence interval for a mean response.

Predicting for Individuals versus Estimating the Mean
Confidence interval for a mean response:
$$\hat{y} \pm t^* \, \text{s.e.(fit)} \quad \text{where} \quad \text{s.e.(fit)} = s \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}, \quad \text{df} = n - 2$$
Prediction interval for an individual response:
$$\hat{y} \pm t^* \, \text{s.e.(pred)} \quad \text{where} \quad \text{s.e.(pred)} = \sqrt{s^2 + (\text{s.e.(fit)})^2}, \quad \text{df} = n - 2$$

Try It! Exam 2 versus Final Exam
Construct a 95% CI for the mean final exam score for all students who scored x = 60 points on exam 2.
Recall: n = 6, $\bar{x} = 51$, $\sum (x_i - \bar{x})^2 = S_{XX} = 940$, $\hat{y} = 21.67 + 1.046(x)$, and s = 8.24671.

Try It! Exam 2 versus Final Exam
Construct a 95% PI for the final exam score for a student who scored x = 60 points on exam 2.
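The two Try It! intervals can be sketched in Python from the summary values recalled above. This is only an illustration under stated assumptions: the helper names `se_fit`, `se_pred`, and `interval` are hypothetical, and `T_STAR = 2.776` is an assumed table value for 95% confidence with df = n – 2 = 4.

```python
import math

# Summary values recalled on the Try It! slide.
n = 6
x_bar = 51.0
s_xx = 940.0            # sum of squared deviations of x from its mean
b0, b1 = 21.67, 1.046   # rounded least squares estimates
s = 8.24671             # std error of the estimate

# Assumed t table value for 95% confidence with df = 4.
T_STAR = 2.776


def se_fit(x):
    """Standard error for estimating the mean response at x."""
    return s * math.sqrt(1 / n + (x - x_bar) ** 2 / s_xx)


def se_pred(x):
    """Standard error for predicting one individual response at x."""
    return math.sqrt(s ** 2 + se_fit(x) ** 2)


def interval(x, se):
    """Return (lower, upper) for y-hat plus or minus T_STAR times se."""
    y_hat = b0 + b1 * x
    margin = T_STAR * se
    return (y_hat - margin, y_hat + margin)


x_new = 60
ci_mean = interval(x_new, se_fit(x_new))     # CI for the mean response
pi_indiv = interval(x_new, se_pred(x_new))   # PI for an individual response

print("s.e.(fit): ", round(se_fit(x_new), 3))
print("s.e.(pred):", round(se_pred(x_new), 3))
print("95% CI for mean final score:", tuple(round(v, 2) for v in ci_mean))
print("95% PI for one final score: ", tuple(round(v, 2) for v in pi_indiv))
```

Both intervals share the center $\hat{y}$, and because s.e.(pred) adds $s^2$ under the square root, the prediction interval is always the wider of the two. With these inputs its upper limit comes out slightly above 100, the maximum possible final score, which is a reminder that the normal model is only an approximation.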