Week 4: Bivariate Regression, Least Squares and Hypothesis Testing
IS 620, Spring 2006

Lecture Outline
• Method of Least Squares
• Assumptions
• Normality assumption
• Goodness of fit
• Confidence Intervals
• Tests of Significance
• Alpha versus p

Recall . . .
• Regression curve as the “line connecting the mean values” of y for a given x
– No necessary reason for such a construction to be a line
– Need more information to define a function

Method of Least Squares
• Goal: describe the functional relationship between y and x
– Assume linearity (in the parameters)
• What is the best line to explain the relationship?
• Intuition: the line that is “closest to,” or best fits, the data

[Figures: “Best” line, n = 2 and “Best” line, n > 2 — scatterplots of y against x with candidate lines]

Least squares: intuition
• Goal: minimize the residuals u₁, u₂, u₃
[Figure: residuals u₁, u₂, u₃ drawn as vertical deviations from the line]

Least squares, n > 2
• min Σ ûᵢ²
[Figure: fitted least-squares line through a scatter of points]

Why sum of squares?
• Sum of residuals may be zero
• Emphasizes residuals that are far from the regression line
• Better describes the spread of residuals

Least-squares estimates
ŷᵢ = β̂₁ + β̂₂xᵢ
yᵢ = β̂₁ + β̂₂xᵢ + ûᵢ
• β̂₁: intercept
• β̂₂: effect of x on y (slope)
• ûᵢ: residuals

Gauss-Markov Theorem
• Least-squares method produces best linear unbiased estimators (BLUE)
• Also most efficient (minimum variance)
• Provided the classical assumptions obtain

Classical Assumptions
• Focus on #3, #4, and #5 in Gujarati
– Implications of violations for the estimators
• Skim over #1, #2, #6 through #10

#3: Zero mean value of uᵢ
• Residuals are randomly distributed around the regression line
• Expected value is zero for any given observation of x
• NOTE: equivalent to assuming the model is fully specified
[Figures: scatterplots of y against x when E(u|X) = 0 and when E(u|X) ≠ 0]

Violation of #3
• Estimated betas will be
– Unbiased but
– Inconsistent
– Inefficient
• May arise from
– Systematic measurement error
– Nonlinear relationships (Phillips curve)

#4: Homoscedasticity
• The variance of the residuals is the same for all observations, irrespective of the value of x
• “Equal variance”
• NOTE: #3 and #4 imply (see “Normality Assumption”)
û ~ N(0, σ²)
[Figures: scatterplots of y against x with constant and with non-constant residual variance]

Violation of #4
• Estimated betas will be
– Unbiased
– Consistent but
– Inefficient
• May arise from
– Cross-sectional data

#5: No autocorrelation
• The correlation between any two residuals is zero
• The residual for xᵢ is unrelated to that for xⱼ
[Figures: scatterplots of y against x with and without correlated residuals]

Violations of #5
• Estimated betas will be
– Unbiased
– Consistent
– Inefficient
• May arise from
– Time-series data
– Spatial correlation

Other Assumptions (1)
• Assumption 6: zero covariance between xᵢ and uᵢ
– Violations are a cause of heteroscedasticity
– Hence violate #4
• Assumption 9: model correctly specified
– Violations may violate #1 (linearity)
– May also violate #3: omitted variables?
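The least-squares estimates and the zero-mean property of the residuals can be sketched in plain Python. This is an illustrative sketch, not the course's own code: the data points and variable names are made up.

```python
# Bivariate least squares "by hand": beta2_hat = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
# Data points are invented for illustration.
x = [5, 10, 15, 20, 25, 30]
y = [4.1, 8.9, 10.2, 16.5, 19.3, 21.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Slope: ratio of the covariation of x and y to the variation of x
beta2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
        sum((xi - x_bar) ** 2 for xi in x)
# Intercept: the fitted line passes through the point of means (x_bar, y_bar)
beta1 = y_bar - beta2 * x_bar

# Residuals u_i = y_i - (beta1 + beta2 * x_i)
residuals = [yi - (beta1 + beta2 * xi) for xi, yi in zip(x, y)]

# With an intercept in the model the residuals sum to (numerically) zero,
# which is why least squares minimizes the sum of SQUARED residuals.
print(round(beta1, 3), round(beta2, 3))
print(abs(sum(residuals)) < 1e-9)
```

In practice one would use a library routine (e.g. statsmodels' OLS or numpy.polyfit) rather than hand-rolled sums, but the closed-form bivariate formulas above are what those routines reduce to in this case.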
Other Assumptions (2)
• #7: n must be greater than the number of parameters to be estimated
– Key in multivariate regression
– King, Keohane and Verba’s (1996) critique of small-n designs

Normality Assumption
• Distribution of the disturbance is unknown
• Necessary for hypothesis testing of the independent variables
– Estimates are a function of uᵢ
• Assumption of normality is necessary for inference
• Equivalent to assuming the model is completely specified

Normality Assumption (cont.)
• Central Limit Theorem: M&Ms
• A linear transformation of a normal variable is itself normal
• Simple distribution (μ, σ)
• Small samples

Assumptions, Distilled
1. Linearity
2. DV is continuous, interval-level
3. Non-stochastic: no correlation between independent variables
4. Residuals are independently and identically distributed (iid)
   a) Mean of zero
   b) Constant variance

If so, . . .
• Least-squares method produces BLUE estimators

Goodness of Fit
• How “well” the least-squares regression line fits the observed data
• Alternatively: how well the function describes the effect of x on y
• How much of the observed variation in y have we explained?

Coefficient of determination
• Commonly referred to as “r²”
• Simply, the ratio of explained variation in y to the total variation in y
[Figure: components of variation — total and residual deviations around the regression line]

Components of variation
• TSS: total sum of squares
• ESS: explained sum of squares
• RSS: residual sum of squares
r² = ESS/TSS = 1 − RSS/TSS

Hypothesis Testing
• Confidence Intervals
• Tests of significance
• ANOVA
• Alpha versus p-value

Confidence Intervals
• Two components
– Estimate
– Expression of uncertainty
• Interpretation:
– Gujarati, p.
121: “The probability of constructing an interval that contains β is 1 − α”
– NOT: “The probability that β is in the interval is 1 − α”

C.I.s for regression
• Depend upon our knowledge or assumption about the sampling distribution
• Width of interval proportional to the standard error of the estimators
• Typically we assume
– The t distribution for the βs
– The chi-square distribution for variances
– Due to the unknown true standard error

Confidence Intervals in IR
• Examples?

The worst weatherman in the world
• “Three-degree guarantee”
• If his forecast high is off by more than three degrees, someone wins an umbrella
• Woo hoo

How Many Umbrellas?
• Data: mean daily temperature in February for Washington, DC
– Daily observations from 1995 to 2005 (n = 311)
– Mean: 47.91 degrees F
– Standard deviation: 10.58
• The interval: +/− 3.5 degrees F
– Due to rounding
– Note: spread of seven (eight?) degrees

The t value
• We don’t know alpha: level of confidence
• Assume the t distribution
Pr(x̄ − t(α/2)·σ/√n ≤ x ≤ x̄ + t(α/2)·σ/√n) = 1 − α
Pr(47.9 − 3.5 ≤ x ≤ 47.9 + 3.5) = 1 − α
t · 10.58/√311 = 3.5
0.60016 · t = 3.5
t = 5.83

The answer
• From the t table: Pr(t ≥ 5.83) = 3.746 × 10⁻⁸ for df = 311
• 0.00000003746: Tom will give away an umbrella on average about once every 26,695,141 days. Thanks, Tom.

Tests of Significance
• A hypothesis about a point value rather than an interval
– Does the observed sample value differ from the hypothesized value?
• Null hypothesis (H₀): no difference
• Alternative hypothesis (Hₐ): significant difference

Regression Interpretation
• Is the hypothesized causal effect (beta) significantly different from zero?
– H₀: no effect (β = 0)
– Hₐ: effect (β ≠ 0)
• The “zero” null hypothesis

Two-tail v. One-tail tests
Two-tail
• Hₐ is not concerned with direction of difference
– Exploratory
• Theory in disagreement
• Critical regions on both ends
One-tail
• Hₐ specifies a direction of effect
• Theory well developed
• Critical region on one end only

The 2-t rule
• Gujarati, p. 134: the zero null hypothesis can be rejected if |t| > 2
– d.f. > 20
– Level of significance = 0.05
– Recall Weatherman Tom: t = 5.83!

Alpha versus p-values
Alpha
• Conventional
• Findings reported at 0.05, 0.1, 0.01
• Accessible, intuitive
• Arbitrary
• Makes assumptions about Type I, II errors
P-value
• “The lowest significance level at which a null hypothesis can be rejected”
• Widely accepted today
• Know your readers!

ANOVA
• Intuitively similar to r²
– Identical output for bivariate regression
• A good test of the zero null hypothesis
• In multivariate regression, tests the null hypotheses for all betas
– Check the F statistic before checking betas!

Limits of ANOVA
• Harder to interpret
• Does not provide information on direction or magnitude of effect for independent variables

[Figure: ANOVA output from SPSS]
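The umbrella arithmetic can be checked directly. A sketch using only the Python standard library, with the n, mean, and standard deviation taken from the slides; the normal approximation to the t tail is my substitution for the slide's t table (with df = 311 the two distributions are very close, so the tail probability is approximate):

```python
import math

# From the slides: n = 311 February days in Washington, DC,
# s = 10.58 degrees F, and a guaranteed interval of +/- 3.5 degrees.
n = 311
s = 10.58
half_width = 3.5

# Standard error, as computed on the slide: s / sqrt(n)
se = s / math.sqrt(n)          # about 0.600

# The t value at which t * se equals the 3.5-degree half-width
t = half_width / se            # about 5.83

# Normal approximation to the upper-tail probability Pr(T >= t).
# The slide's t table gives 3.746e-8 for df = 311; the normal tail is
# somewhat smaller because the t distribution has slightly fatter tails.
p_upper = 0.5 * math.erfc(t / math.sqrt(2))

print(round(se, 3), round(t, 2))
# By the "2-t rule," t = 5.83 with df > 20 rejects the zero null at the
# 0.05 level by a wide margin.
```

For the exact t-distribution tail one would use scipy.stats.t.sf(t, df=311) instead of the normal approximation.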